Sacrifice There is no scholarly consensus regarding the meaning of the term ‘sacrifice.’ In its widest usage the term, which derives from the Latin sacrificium, is applied to virtually any form of gift-giving to a deity. A more restricted application limits its use to situations in which the gift is mutilated or destroyed. An even more restrictive usage is suggested by the Oxford English Dictionary (Simpson and Weiner 1933/1989), whose primary definition is ‘the slaughter of an animal (often including the subsequent consumption of it by fire) as an offering to God or a deity.’ The term sacrifice is, therefore, best regarded as a polythetic category which includes heterogeneous modes of behavior motivated by intentions of variable kinds. These include commemoration, initiation, expiation, propitiation, establishing a covenant, or effecting a transition in social status. Sacrifice can thus be regarded not as a class of phenomena having a fixed set of features in common but as a multivocal term applying to phenomena that merely have ‘family resemblances’ in common. Doing so has the advantage of mitigating somewhat the Eurocentric bias inherent in the work of many of the most influential thinkers on the subject, among them Robertson Smith (1894/1927), Hubert and Mauss (1898), and Durkheim (1912). ‘The Sacrifice’ for this generation of scholars was the central feature of religion, a Judaic–Christian bias still evident as late as 1956, the year in which E. E. Evans-Pritchard published his classic, Nuer Religion. Tellingly, Evans-Pritchard did not identify any word in the Nuer language that unambiguously corresponded to the English ‘sacrifice.’ This lack of correspondence appears widespread. Benveniste (1969, p. 223), for example, could discover no common word for sacrifice in the Indo-European languages.
1. The Terminology of Sacrifice In more elaborate forms than straightforward ritualized gift-giving, sacrifice typically involves a set of four terms. First, there is the sacrificer, the executor of the ritual. Second, there is the sacrifier, the beneficiary of the sacrifice. This may be the executor of the ritual, individual members of the community, or the community in toto. Third, there is the object sacrificed. Although this gift may be anything, ethnographic descriptions usually emphasize the bloody sacrifice of a living being, in which case blood typically is described as connoting such ideas as fertility, renewal, replenishment, vitality, life force, and empowerment. Access by the sacrifier to these desirables is acquired when this substance is shed in ritual by the act of forcing life out of the victim. Substitution, or the use of a surrogate in cases when the offering is that of a human being, is a common occurrence. It is so frequently resorted to that some writers, including two who have made important contributions to the topic, de Heusch (1986) and Bloch (1992, p. 32), consider substitution to be a key feature of sacrificial rituals. Fourth, there is the deity to whom the sacrifice is made.
2. Approaches to the Sacrifice A number of distinguished thinkers have advanced what might be termed ‘approaches’ to the study of sacrifice, which afford insightful, though overlapping, perspectives. Among the most useful are the following five.
2.1 Sacrifice as Gift For Tylor (1871/1913), following Plato, the sacrifice was a gift made to an anthropomorphic divinity, much as a gift might be given to a chief, whose acceptance made the deity indebted to humanity. This approach was popular with nineteenth-century scholars, who attributed to the act of sacrificing such motives as expiating sins, obtaining benefits, nourishing the deity, and establishing good relations between human beings and spirits. This theory lays stress on the beliefs of the givers rather than on the ritual performance itself, which in this interpretation is relegated to logically subservient status. This approach is at the same time too broad and too narrow. It is overly broad in that these motives undoubtedly are widespread in many rituals of sacrifice all over the world, yet excessively narrow since sacrificial rituals are employed to do much more than simply offer gifts.
2.2 Sacrifice as Communion Robertson Smith (1894/1927), in contrast to Tylor, granted priority to ritual, a ranking modern ethnographic fieldwork tends to validate. Robertson Smith’s main point was that early Semitic sacrifice, and by implication sacrifice among nonliterate peoples, was a feast human beings communally shared with their deity. He understood the immolated victim to be the clan totem, an animal that shared in the blood of its human kin. It was also the clan ancestor, which was the clan deity, so in consuming their animal victim members of the community were eating their deity. This act of consumption conferred spiritual power. Later research in the field undermined much of Robertson Smith’s argument, but unquestionably the sacrifice does offer a means by which human beings may conjoin with their deity and establish a relationship, and here Robertson Smith did have insight. He strongly influenced Hubert and Mauss, whose Essai sur la nature et la fonction du sacrifice, which appeared in 1898, remains the basic work on the topic. Like all the studies of the time, it depended entirely upon secondary and tertiary sources rather than original ethnography, and the authors further restricted their sources to Sanskrit and Hebrew texts. To them the most significant feature about the sacrificial feast was that it provided the mechanism by which communion, in the sense of communication between human beings and deity, could be brought about. It was made possible by the fact that the victim represented both sacrifier and divinity, thus breaking down the barrier between the sacred and the profane (a moral contrast that played a major role in their approach). Yet, although the sacrificial act conjoined the sacred and profane, it could also disjoin them. While there were sacrifices that brought the spirit into the human world and made the profane sacred, there were also sacrifices that removed the spirit from the human world and made what had become sacred, profane.
The former Hubert and Mauss called rituals of ‘sacralization;’ the latter, rituals of ‘desacralization.’ Even though Evans-Pritchard, who was very much influenced by their ideas, found this distinction among the Nuer (1956, pp. 197–230), it is by no means as universal as they supposed. Hubert and Mauss were intrigued by those sacrifices in which it is the deity that offers his life for the benefit of his worshippers. As they saw it, the exemplary sacrifice was that of a god, who, through unqualified self-abnegation, offers himself to humankind, in this way bestowing life on lesser beings. The most celebrated example of this is the central ritual of the Catholic religion, the Mass, where Christ’s sacrifice on the Cross is re-enacted. In a sacrifice of this nature the relationship between the deity and worshippers raises the question of relative status. From Hubert and Mauss’ perspective the relationship is one of grave imbalance since the deity gives more than it gets. This may appear so, but Valeri (1985, pp. 66–7) has advanced a sophisticated argument designed to prove that sacrifice actually creates a bond of ‘mutual’ indebtedness between the human and the divine, which makes each party dependent upon the other. This proposition may not be true universally, but support for Valeri’s argument comes from the bear sacrifices carried out by the Ainu of Japan. In these, the immolation of bears brings fertility to the community and nurture to the spirits (Irimoto 1996). Durkheim’s (1912) distinctive contribution, which otherwise relied heavily upon Hubert and Mauss, and Robertson Smith, was to promote the importance of sacrificial activities in bringing about social cohesion and the values that contribute to this cohesion. In Timor, a South Pacific island, a ritual is performed in which a local king is figuratively slain by members of two ethnically different communities (Hicks 1996). The king, who comes from one of these groups, symbolizes their unity as a single social entity at the same time that he symbolizes their local god. In addition, by periodically immolating their king the two groups regenerate their notions of ‘society’ and ‘god’ as epistemological categories.
2.3 Sacrifice as Causality For Hocart (1970, p. 217) the most compelling reason for performing collective rituals was the ‘communal pursuit of fertility.’ A sacrifice is offered to the gods in order that the community may acquire life and the fertility that engenders and sustains it, a view that focuses upon sacrifice as a practical device to effect empirical transformation. Catholic rituals before the Reformation were thought of in this way; afterwards they were reinterpreted as a symbolic process (Muir 1997). For Catholics the Mass materially transformed the bread into Christ’s body; for Protestants the bread merely represented Christ’s sacrifice.
2.4 Sacrifice as Symbol System More contemporary scholars, such as Valeri (1985), have emphasized the symbolic character of sacrificial behavior, analyzing its structure and isolating its motifs within the wider context of a society’s ideology. In this approach, the array of symbols isolated is then apprehended as a semantic system whose meaning requires decoding. Valeri’s interpretation of Hawaiian sacrifice, for example, requires that sacrificial rituals be understood as a symbolic action ‘that effects transformations of the relationships of sacrifier, god, and group by representing them in public’ (1985, pp. 70–1). Although vulnerable to the criticism that it relies too heavily on the subjective bias of the analyst, this approach has the advantage of enabling sacrificial practices to be understood within a wider social context, at the same time as it allows an analysis of sacrifice to open up a distinctive perspective on society.
2.5 Sacrifice as Catharsis Another approach to sacrifice is to focus on violence. Girard (1972), most notably, has suggested that sacrifice offers a socially controlled, and therefore socially acceptable, outlet for the aggressive urges of human beings. The typical manner by which this is accomplished is to use a sacrificial victim as a scapegoat. A more recent scholar who has stressed the importance of violence in sacrifice is Bloch (1992), but his approach differs from that of Girard. For Bloch, sacrifice requires violence because human beings need ‘to create the transcendental in religion and politics,’ and this is attainable through violence done to the victim (Bloch 1992, p. 7). One difficulty with this approach is that behavior that one might arguably class as sacrificial may involve no violence at all.
3. Origins of Sacrifice Sacrificial rituals have an extensive pedigree. One of the earliest for which a reasonably detailed record is available dates from about 3,000 years ago. The Vedic Aryans held sacrifice to be their central religious activity (Flood 1996, pp. 40–1), and it consisted of putting milk, grains of rice and barley, and domesticated animals into a sacred fire that would transport the offerings to supernatural entities. As the Agnicayana, this ritual is still carried out in Kerala, southwest India. Evidence for sacrificial behavior extends further back than Vedic times, to the civilizations of Sumer 5,000 years ago and China 4,000 years ago, and its presence seems accounted for even further in the past. This long history, taken with the widespread practice of sacrifice all over the world, suggests that a case could be made that sacrificial behavior is one of the natural dispositions common to the species Homo sapiens. As Needham (1985, p. 177) has averred for ritual in general, it may be that sacrifice, ‘Considered in its most characteristic features … is a kind of activity—like speech or dancing—that man as a ceremonial animal happens naturally to perform.’ See also: Commemorative Objects; Exchange in Anthropology; Honor and Shame; Religion: Morality and Social Control; Ritual; Trade and Exchange, Archaeology of
Bibliography
Benveniste E 1969 Le Vocabulaire des institutions indo-européennes, I. Economie, parenté, société; II. Pouvoir, droit, religion. Les Editions de Minuit, Paris
Bloch M 1992 Prey into Hunter: The Politics of Religious Experience. Cambridge University Press, Cambridge, UK
Durkheim E 1912/1968 Les Formes élémentaires de la vie religieuse: le système totémique en Australie. Presses Universitaires de France, Paris
Evans-Pritchard E E 1956 Nuer Religion. Clarendon Press, Oxford, UK
Flood G O 1996 An Introduction to Hinduism. Cambridge University Press, Cambridge, UK
Frazer J G 1922 The Golden Bough: A Study in Magic and Religion, Vol. 1. Macmillan, London
Girard R 1972 La Violence et le sacré. Bernard Grasset, Paris
de Heusch L 1986 Le Sacrifice dans les religions africaines. Editions Gallimard, Paris
Hicks D 1996 Making the king divine: A case study in ritual regicide from Timor. Journal of the Royal Anthropological Institute 2: 611–24
Hocart A M 1970 Kings and Councillors: An Essay in the Comparative Anatomy of Human Society. University of Chicago Press, Chicago
Hubert H, Mauss M 1898 Essai sur la nature et la fonction du sacrifice. Année sociologique 2: 29–138
Irimoto T 1996 Ainu worldview and bear hunting strategies. In: Pentikäinen J (ed.) Shamanism and Northern Ecology (Religion and Society 36). Mouton de Gruyter, Berlin, pp. 293–303
Muir E 1997 Ritual in Early Modern Europe. Cambridge University Press, Cambridge, UK
Needham R 1985 Exemplars. University of California Press, Berkeley, CA
Simpson J A, Weiner E S C 1933/1989 Oxford English Dictionary, 2nd edn. Clarendon Press, Oxford, UK
Smith W R 1894/1927 Lectures on the Religion of the Semites, 3rd edn. Macmillan, New York
Tylor E B 1871/1913 Primitive Culture, 4th edn. Murray, London, Vol. 2
Valeri V 1985 Kinship and Sacrifice: Ritual and Society in Ancient Hawaii (trans. Wissing P). University of Chicago Press, Chicago
D. Hicks
Safety, Economics of The economics of safety is concerned with the causes and prevention of accidents resulting in economic losses. The primary focus is on accidents causing workplace injuries and diseases. The topic is examined using several variants of economic theory. One set of theories applies when there is an ongoing relationship between the party who is generally the source of the accidents and the party who generally bears the costs of the accidents. Negotiations (explicit or implicit) between the parties can provide ex ante economic incentives to reduce accidents. Another set of theories applies when there is no ongoing relationship between the party who causes the accident and the party who suffers the consequences. Ex post financial penalties for the party who caused the harm can deter others from causing accidents.
1. ‘Pure’ Neoclassical Economics The neoclassical theory of work injuries is examined in Thomason and Burton (1993). Workplace accidents are undesirable by-products of processes that produce goods for consumption (Oi 1974). Several assumptions are used in the neoclassical analysis of these processes. Employers and workers have an ongoing relationship. Employers maximize profits. Workers maximize utility, not just pecuniary income. Workers at the margin are mobile and possess accurate information concerning the risks and consequences of work injuries. One implication of these assumptions is that employers pay higher wages for hazardous work than for nonhazardous work. On the assumption that accident insurance is not provided by employers or the government, the equilibrium wage for the hazardous work will include a risk premium equal to the expected costs of the injuries, which include lost wages, medical care, and the disutility caused by the injury. If workers are risk averse, the premium will also include a payment for the uncertainty about who will be injured. The employer has an incentive to invest in safety in order to reduce accidents and thus the risk premium. The firm will make safety investments until the marginal expenditure on safety is equal to the marginal reduction in the risk premium. Since there is a rising marginal cost to investments in safety, equilibrium will occur with a positive value for the risk premium, which means that in equilibrium there will be some work injuries. Variants on neoclassical economics theory can be generated by changing some of the previous assumptions. For example, if all workers purchase actuarially fair insurance covering the full costs of work injuries, they will be indifferent about being injured. Furthermore, if injured, workers will have no incentive to return to work since the insurance fully compensates them for their economic losses.
The change in worker behavior resulting from the availability of insurance is an example of the moral hazard problem, in which the insurance increases the quantity of the events being insured against (i.e., the occurrence of injuries and the durations of the resulting disabilities). These variants on neoclassical economics with their implications for employee behavior are discussed in Burton and Chelius (1997).
1.1 Evidence Consistent with the Neoclassical Model There is a burgeoning literature on compensating wage differentials for the risks of workplace death and injury. Ehrenberg and Smith (1996) report that studies generally find that industries with the average risk of job fatalities (about 1 per 10,000 workers per year) pay wages that are 0.5 percent to 2 percent higher than the wages for comparable workers in industries with half that level of risk. Viscusi (1993) reviewed 17 studies estimating wage premiums for the risks of job injuries, and 24 studies estimating premiums for the risks of job fatalities. The total amount of risk premiums implied by these various estimates is substantial: probably the upper bound is the Kniesner and Leeth (1995) figure of $200 billion in risk premiums in the US in 1993.
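The wage differentials reported above can be turned into an implied amount of pay per statistical fatality. The following is a minimal sketch of that arithmetic only. The risk level and the 0.5 to 2 percent range come from the text, while the annual wage of 30,000 and the function name are hypothetical values chosen for illustration.

```python
def implied_value_per_statistical_fatality(annual_wage, premium_share, extra_risk):
    """Extra pay per worker divided by the extra annual fatality risk accepted."""
    return annual_wage * premium_share / extra_risk


# From the text: average risk is about 1 fatality per 10,000 workers per year,
# and the comparison industry has half that level of risk.
AVERAGE_RISK = 1 / 10_000
EXTRA_RISK = AVERAGE_RISK - AVERAGE_RISK / 2

# ASSUMPTION for illustration only: the text does not state a wage level.
ASSUMED_ANNUAL_WAGE = 30_000.0

low = implied_value_per_statistical_fatality(ASSUMED_ANNUAL_WAGE, 0.005, EXTRA_RISK)
high = implied_value_per_statistical_fatality(ASSUMED_ANNUAL_WAGE, 0.02, EXTRA_RISK)

print(f"Implied pay per statistical fatality: {low:,.0f} to {high:,.0f}")
```

Under the assumed wage, the range works out to roughly 3 million to 12 million per statistical fatality. The result scales in direct proportion to whatever wage is assumed, so the figures illustrate the calculation rather than an estimate from the literature.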
The Viscusi (1993) survey of studies of risk premiums for workplace fatalities included evidence from the United Kingdom (UK), Quebec, Japan, and Australia. More recently, Siebert and Wie (1994) found evidence for risk premiums for work injuries in the UK, and Miller et al. (1997) provided additional evidence on compensating differentials for risk of death in Australia.
1.2 Qualifications Concerning the Neoclassical Theory The model postulating that workplace safety results from risk premiums, which give employers financial incentives to invest in safety, has been challenged. Some critics assert that risk premiums are inadequate because workers lack sufficient information about the risks of work injuries and/or have limited mobility to move to less hazardous jobs. Ehrenberg and Smith (1996) conclude that workers have enough knowledge to form accurate judgments about the relative risks of various jobs. However, Viscusi (1993) refers to a sizable literature in psychology and economics documenting that individuals tend to overestimate low probability events, such as workplace injuries, which could result in inappropriately large risk premiums. Both Ehrenberg and Smith (1996) and Viscusi (1993) argue there is sufficient worker mobility to generate risk premiums. Another challenger to the neoclassical model is Ackerman (1988), who argues that the labor market does not generate proper risk premiums because certain costs resulting from work injuries are not borne by workers but are externalized. The empirical evidence on risk premiums has also been challenged or qualified. Ehrenberg and Smith (1996), for example, conclude that the studies of compensating wage differentials for the risks of injury or death on the job ‘are generally, but not completely, supportive of the theory.’ Viscusi (1993) concludes: ‘The wage-risk relationship is not as robust as is, for example, the effect of education on wages.’ The most telling attack on the compensating wage differential evidence is by Dorman and Hagstrom (1998), who argue that most studies have been specified improperly. They find that, after controlling for industry-level factors, the only evidence for positive compensating wage differentials pertains to unionized workers. For nonunion workers, they argue that properly specified regressions suggest that workers in dangerous jobs are likely to be paid less than equivalent workers in safer jobs. The Dorman and Hagstrom challenge to the previously generally accepted view, that dangerous work results in risk premiums for most workers, is likely to result in new empirical studies attempting to resuscitate the notion of risk premiums. However, even if every empirical study finds a risk premium for workplace fatalities and injuries, the evidence would not conclusively validate the neoclassical economics approach. Ehrenberg (1988) provides the necessary qualification: if there are any market imperfections (such as lack of information or mobility), the ‘mere existence of some wage differential does not imply that it is a fully compensating one.’ And if the risk premium is not fully compensating (or is more than fully compensating), then inter alia, the market does not provide the proper incentive to employers to invest in safety. Another qualification concerning the use of risk premiums as a stimulus to safety is that institutional features in a country’s labor market may aid or impede the generation of risk premiums. In the US, the relative lack of government regulation of the labor market and the relative weakness of unions probably facilitate the generation of risk premiums. In countries like Germany, where unions are relatively strong and apparently opposed to wage premiums for risks, and where labor markets appear to be less flexible than those in the US because of government regulation of the labor market, wages are less likely to reflect underlying market forces that produce risk premiums.
2. Modified Neoclassical Economics and the Old Institutional Economics Some economists rely on a ‘modified’ version of neoclassical economics theory, which recognizes that limited types of government regulation of safety and health are appropriate in order to overcome some of the attributes of the labor market that do not correspond to the assumptions of pure neoclassical economics. These attributes include the lack of sufficient knowledge and mobility by employees, and the possible lack of sufficient knowledge or motivation of employers about the relationship between expenditures on safety and the reduction in risk premiums. The ‘old institutional economists’ (OIE) would agree with these critiques of lack of knowledge and mobility, and would also emphasize factors such as the unequal bargaining power of individual workers as another limitation of the labor market. An example of government promulgation of information in order to overcome the lack of knowledge in the labor market is the Occupational Safety and Health Act (OSHAct) Hazard Communication standard, which requires labeling of hazardous substances and notification to workers and customers and which is estimated to save 200 lives per year (Viscusi 1996). (Other aspects of the OSHAct are discussed below.)
2.1 Experience Rating
An example of a labor market intervention that could improve incentives for employers to invest in safety is the use of experience rating in workers’ compensation programs, which provide medical and cash benefits to workers injured at work. Firm-level experience rating is used extensively in the workers’ compensation programs in the US, and is used to some degree in many other countries, including Canada, France, and Germany. Firm-level experience rating determines the workers’ compensation premium for each firm above a minimum size by comparing its prior benefit payments to those of other firms in the industry. In the pure neoclassical economics model, the introduction of workers’ compensation with experience rating should make no difference in the safety incentives for employers compared to the incentives provided by the labor market without workers’ compensation. Under assumptions such as perfect experience rating, risk neutrality by workers, and actuarially fair workers’ compensation premiums, which are explicated by Burton and Chelius (1997), the risk premium portion of the wage paid by the employer will be reduced by an amount exactly equal to the amount of the workers’ compensation premium. Also, under these assumptions, the employer has the same economic incentives to invest in safety both before and after the introduction of the workers’ compensation program. Under an alternative variant of the pure neoclassical economics approach, in which the assumption of perfect experience rating is dropped, the introduction of workers’ compensation will result in reduced incentives for employers to reduce accidents. In contrast, the OIE approach argues that the introduction of workers’ compensation with experience rating should improve safety because the limitations of knowledge and mobility and the unequal bargaining power for employees mean that the risk premiums generated in the labor market are inadequate to provide employers the safety incentives postulated by the pure neoclassical economics approach.
Commons (1934), a leading figure in the OIE approach, claimed that unemployment is the leading cause of labor market problems, including injuries and fatalities, because slack labor markets undercut the mechanism that generates compensating wage differentials. Commons asserted that experience rating provides employers economic incentives to get the ‘safety spirit’ that would otherwise be lacking. The modified neoclassical economics approach also accepts the idea that experience rating should help improve safety by providing stronger incentives to employers to avoid accidents. Its proponents, however, place less emphasis on the role of unemployment in undercutting compensating wage differentials and more emphasis on the failure of employers to recognize the cost savings possible from improved safety without the clear signals provided by experience-rated premiums. A number of recent studies of the workers’ compensation program provide evidence that should help assess the virtues of the various economic theories. However, the evidence from US studies is inconclusive. One survey of studies of experience rating by Boden (1995) concluded that ‘research on the safety impacts has not provided a clear answer to whether workers’ compensation improves workplace safety.’ In contrast, a recent survey by Butler (1994) found that most recent studies provide statistically significant evidence that experience rating ‘has had at least some role in improving workplace safety for large firms.’ Burton and Chelius (1997) sided with Butler. The beneficial effect of experience rating on safety has been found in Canada by Bruce and Atkins (1993) and in France by Aiuppa and Trieschmann (1998). The mildly persuasive evidence that experience rating improves safety is consistent with the positive impact on safety postulated by the OIE approach and the modified neoclassical economists, and is inconsistent with the pure neoclassical view that the use of experience rating should be irrelevant or may even lead to reduced incentives for employers to improve workplace safety.
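The pure neoclassical claim about experience rating can be illustrated with a stylized calculation. This is a sketch under strong assumptions and not a model taken from the literature cited here: injury risk is assumed to fall exponentially with safety spending, workers are assumed to be fully insured and risk neutral, and a hypothetical `rating_weight` between 0 and 1 measures how much of the firm’s own expected injury cost is reflected in its premium (1 is perfect experience rating, 0 is a flat industry-wide premium). All parameter values are invented.

```python
import math


def optimal_safety_spending(rating_weight, injury_loss, base_risk, effectiveness):
    """Safety spending s that minimizes  s + rating_weight * risk(s) * injury_loss,
    where risk(s) = base_risk * exp(-effectiveness * s)  (ASSUMED functional form).

    The first-order condition equates the marginal dollar of safety spending
    with the marginal reduction in the firm's injury-related cost.
    """
    marginal_saving_at_zero = rating_weight * injury_loss * base_risk * effectiveness
    if marginal_saving_at_zero <= 1.0:
        # The first dollar of safety spending saves less than a dollar: spend nothing.
        return 0.0
    return math.log(marginal_saving_at_zero) / effectiveness


# Hypothetical parameters, chosen only to make the comparison visible.
params = dict(injury_loss=100_000.0, base_risk=0.01, effectiveness=0.01)

perfect = optimal_safety_spending(1.0, **params)   # same incentive as a full risk premium
partial = optimal_safety_spending(0.5, **params)   # imperfect experience rating
flat = optimal_safety_spending(0.0, **params)      # premium unrelated to own record

print(f"perfect: {perfect:.2f}, partial: {partial:.2f}, flat: {flat:.2f}")
```

With a weight of 1 the firm faces the same first-order condition as it would when paying the full risk premium in wages, so the incentive is unchanged. As the weight falls, the chosen safety spending falls with it, which is the reduced-incentive variant described above. The OIE argument, by contrast, concerns whether the market risk premium was adequate in the first place, which this sketch does not model.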
2.2 Collective Action While the modified neoclassical economists and the OIE largely would agree on the desirability of several prevention approaches, such as the use of experience rating in workers’ compensation, the OIE endorsed another approach that many modified neoclassical economists would not support, namely the use of collective bargaining and other policies that empower workers. Collective bargaining agreements can minimize unsafe activities or at least explicitly require employers to pay a wage premium for unsafe work. Many agreements also establish safety committees that assume responsibility for certain activities, such as participation in inspections conducted by government officials. If workers are injured, unions can help them obtain workers’ compensation benefits, thereby increasing the financial incentives for employers to improve workplace safety. The beneficial effects predicted by the OIE approach for these collective efforts appear to be achieved. Several studies, including Weil (1999), concluded that OSHA enforcement activity was greater in unionized firms than in nonunionized firms. Moore and Viscusi (1990) and other researchers found that unionized workers receive larger compensating wage differentials for job risks than unorganized workers. Hirsch et al. (1997) found that ‘unionized workers were substantially more likely to receive workers’ compensation benefits than were similar nonunion workers.’ Experience rating means that these higher benefit payments will provide greater economic incentives for these unionized employers to reduce injuries. There are also safety committees in some jurisdictions that are utilized in nonunionized firms. Burton and Chelius (1997) reviewed several studies involving such committees in the US and Canada, and found limited empirical support for their beneficial effects. Reilly et al. (1995) provided a more positive assessment for their accomplishments in the UK. These studies provide some additional support for the beneficial role of collective action postulated by the OIE.
3. The New Institutional Economics New institutional economics (NIE) authors generally argue that market forces encourage efficient forms of economic organization without government assistance and that opportunities for efficiency-improving public interventions are rare. This section considers only the Coase theory/transaction costs economics strain of NIE.
3.1 The Coase Theory and Transaction Costs Economics In the absence of costs involved in carrying out market transactions, changing the legal rules about who is liable for damages resulting from an accident will not affect decisions involving expenditures of resources that increase the combined wealth of the parties. The classic example offered by Coase (1988) involves the case of straying cattle that can destroy crops growing on neighboring land. The parties will negotiate the best solution to the size of the herd, the construction of a fence, and the amount of crop loss due to the cattle whether or not the cattle-rancher is assigned liability for the crop damages. Coase recognized that the assumption that there were no costs involved in carrying out transactions was ‘very unrealistic.’ According to Coase (1988):
‘In order to carry out a market transaction, it is necessary to discover who it is that one wishes to deal with … and on what terms, to conduct negotiations …, to draw up the contract, to undertake the inspection needed to make sure that the terms of the contract are being observed, and so on. These operations are often … sufficiently costly … to prevent many transactions that would be carried out in a world in which the pricing system worked without cost.’
The goal of transactions costs economics is to examine these costs and to determine their effect on the operation of the economy. As scholars of transaction costs have demonstrated, when transactions costs are significant, changing the legal rules about initial liability can affect the allocation of resources.
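The role of transaction costs in the straying-cattle case can be shown with a stylized numerical sketch. It assumes, for simplicity, that only the rancher can prevent the damage (for example by fencing), that the parties are risk neutral, and that a bargain is struck only when the gain from prevention exceeds the cost of transacting. The function name and all amounts are hypothetical.

```python
def damage_is_prevented(rancher_liable, prevention_cost, crop_damage, transaction_cost):
    """Return True if the rancher ends up preventing the crop damage.

    ASSUMPTIONS (illustrative only): the rancher is the only party able to
    prevent the damage, and bargaining costs `transaction_cost` in total.
    """
    if rancher_liable:
        # The rancher compares paying damages with paying for prevention;
        # no bargain with the farmer is needed.
        return prevention_cost < crop_damage
    # Not liable: the farmer must pay the rancher to prevent the damage, which
    # is worthwhile only if the joint gain covers the cost of transacting.
    return (crop_damage - prevention_cost) > transaction_cost


# Hypothetical amounts: prevention costs 60 and avoids crop damage of 100.
for cost in (0, 50):
    liable = damage_is_prevented(True, 60, 100, cost)
    not_liable = damage_is_prevented(False, 60, 100, cost)
    print(f"transaction cost {cost}: liable -> {liable}, not liable -> {not_liable}")
```

With a transaction cost of zero the damage is prevented under either liability rule, so only the distribution of wealth differs. With a transaction cost of 50, which exceeds the joint gain of 40, prevention occurs only when the rancher is liable, so the legal rule now changes the allocation of resources.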
3.2 Evidence Concerning Changes in Liability Rules
Workplace safety regulation provides a good example of a change in liability rules, since in a relatively short period (1910–20), most US states replaced tort suits (the employer was only responsible for damages if negligent) with workers’ compensation (the employer is required to provide benefits under a no-fault rule) as the basic remedy for workplace injuries. Evidence from Chelius (1977) ‘clearly indicated that the death rate declined after workers’ compensation was instituted as the remedy for accident costs.’ This result suggests that high transaction costs associated with the determination of fault in negligence suits were an obstacle to achieving the proper incentives for workplace safety, and that the institutional features of workers’ compensation, including the no-fault principle and experience rating, provided a relatively more efficient approach to the prevention of work injuries.
An interesting study that also suggests that institutional features can play a major role in determining the effects of changing liability rules is Fishback (1987). He found that fatality rates in coal mining were generally higher after states replaced negligence law with workers’ compensation. Fishback suggested that the difference between his general results and those of Chelius might be due to high supervision costs in coal mining in the early 1900s.
The transaction costs economics component of the NIE theory thus appears to provide a useful supplement to neoclassical economics, since institutional features, such as liability rules, can have a major impact on the economic incentives for accident prevention.
4. Law and Economics
Law and economics (L&E) theory draws on neoclassical economics and transaction cost economics, but is distinctive in the extent to which it examines legal institutions and legal problems. This section examines the tort law branch of L&E theory, while the next section examines the employment law branch.
4.1 Theoretical Stimulus of Tort Law to Safety
Tort law typically is used when one party harms another party, and the parties do not have an ongoing relationship. However, even though workers and employers have a continuing relationship, tort suits were generally used as a remedy for workplace accidents in the US until workers’ compensation programs were established. When negligence is the legal standard used for tort suits, if the employer has not taken proper measures to prevent accidents and thus is at fault, the employer will be liable for all of the consequences of the injury. The standard for the proper prevention measure was developed by Judge Learned Hand and restated by Posner (1972) as:
The judge (or jury) should attempt to measure three things in ascertaining negligence: the magnitude of the loss if an accident occurs; the probability of the accident’s occurring; and the burden (cost) of taking precautions to prevent it. If the product of the first two terms, the expected benefits, exceeds the burden of precautions, the failure to take those precautions is negligence.
Posner argued that proper application of this standard would result in economically efficient incentives to avoid accidents. Burton and Chelius (1997) examine some qualifications to this conclusion.
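The negligence standard restated by Posner reduces to a single comparison: a precaution should have been taken when the expected loss (probability times magnitude) exceeds the cost of the precaution. The following is a minimal sketch of that comparison in Python. The function name and the dollar figures are hypothetical illustrations and do not come from Posner (1972).

```python
def is_negligent(loss: float, probability: float, burden: float) -> bool:
    """Apply the Hand-formula comparison described in the text.

    loss        -- magnitude of the loss if the accident occurs
    probability -- probability that the accident occurs (0 to 1)
    burden      -- cost of the precaution that would prevent it
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    expected_benefit = probability * loss  # expected loss avoided by the precaution
    return expected_benefit > burden


# Hypothetical case: a 1 percent chance of a 100,000 loss (expected loss 1,000).
print(is_negligent(loss=100_000, probability=0.01, burden=500))    # precaution is cheap
print(is_negligent(loss=100_000, probability=0.01, burden=5_000))  # precaution is costly
```

In the first call the precaution costs less than the expected loss, so omitting it counts as negligence under the standard; in the second it does not.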
4.2 Evidence on the Tort Law Stimulus to Safety
There are two types of empirical evidence that suggest skepticism is warranted about the stimulus to workplace safety from tort suits. First, tort suits were used as the remedy for workplace injuries in the late 1800s and early 1900s. As previously noted, Chelius (1977) found that the replacement of the negligence remedy with workers’ compensation led to a general reduction in workplace fatalities. The contrary result of Fishback (1987) for a specific industry, coal mining, qualifies this general result.
Second, in other areas of tort law, there is a major controversy among legal scholars about whether the theoretical incentives for safety resulting from tort suits actually work. One school of thought is exemplified by Landes and Posner (1987), who state that ‘although there has been little systematic study of the deterrent effect of tort law, what empirical evidence there is indicates that tort law … deters.’ An opposing view of the deterrent effects of tort law is provided by Priest (1991), who finds almost no relationship between actual liability payouts and the accident rate for general aviation and states that ‘this relationship between liability payouts and accidents appears typical of other areas of modern tort law as well, such as medical malpractice and product liability.’
A study of tort law by Schwartz (1994) distinguished a strong form of deterrence (as postulated by Landes and Posner) from a moderate form of deterrence, in which ‘tort law provides a significant amount of deterrence, yet considerably less than the economists’ formulae tend to predict.’ Schwartz surveyed a variety of areas where tort law is used, including motorist liability, medical malpractice, and product liability, and concluded that the evidence undermines the strong form of deterrence but provides adequate support for the moderate form of deterrence.
As to workers’ injuries, Schwartz cited the Chelius and Fishback studies and concluded ‘it is unclear whether a tort system or workers’ compensation provides better incentives for workplace safety.’ Burton and Chelius (1997) concluded, based on both the ambiguous historical experience of the impact of workers’ compensation on workplace safety and the current controversy over the deterrence effect in other areas of tort law, that ‘the law and economics theory concerning tort law does not provide much assistance in designing an optimal policy for workplace safety and health.’ Further examinations of the relative merits of workers’ compensation and the tort system in dealing with workplace accidents are provided by Dewees et al. (1996) and Thomason et al. (1998).
5. Government Mandate Theory vs. Law and Economics Theory
5.1 The Government Mandate Theory
The government mandate theory argues that the government promulgation of health and safety standards and enforcement of the standards by inspections and fines will improve workplace safety and health. This is basically a legal theory, although many of the supporting arguments involve reinterpretation or rejection of studies conducted by economists. For example, proponents of the theory object to the evidence on compensating wage differentials and on the deterrent effect of experience rating and object in principle to economists’ reliance on cost-benefit analysis. Supporters of the theory, such as McGarity and Shapiro (1996), also provide a positive case why OSHA is necessary:
OSHA’s capacity to write safety and health regulations is not bounded by any individual worker’s limited financial resources. Likewise, OSHA’s capacity to stimulate an employer to action does not depend upon the employees’ knowledge of occupation risks or bargaining power.
The government mandate theory would not be endorsed by many economists, including the OIE. Commons and Andrews (1936), for example, criticized at length the punitive approaches that used factory inspectors in the form of policemen, since this turned employers into adversaries with the law. The minimum standards supported by the OIE were those developed by a tripartite commission, involving employers, employees, and the public, rather than standards promulgated by the government. While the OIE theory is, thus, unsympathetic to the government mandate theory, the sharpest attack is derived from the L&E theory.
5.2 The L&E Theory Concerning Government Regulations
L&E scholars make a distinction between mandatory, minimum terms (standards) and those terms that are merely default provisions (or guidelines) that employers and employees can agree to override. Most employment laws, including workplace safety laws, create standards and are thus objectionable to the L&E scholars.
Willborn (1988) articulated the standard economic objections to mandatory terms. Employers will treat newly imposed standards like exogenous wage increases and in the short run will respond by laying off workers. In the long run, employers will try to respond to mandates by lowering the wage. The final wage-benefits-standards employment package will make workers worse off than they were before the imposition of standards; otherwise the employers and workers would have bargained for the package without legal compulsion.
5.3 Evidence on the Effects of OSHA Standards
The evidence, as reviewed by Burton and Chelius (1997), suggests that the OSHAct has done little to improve workplace safety, thus lending more support to the L&E theory than to the government mandate theory. OSHA’s ineffectiveness in part may be due to the lack of inspection activity, since the average establishment is only inspected once every 84 years. But the evidence also suggests that allocating additional resources to plant inspections may be imprudent. Smith (1992) concluded, after an exhaustive review of the studies of OSHA inspections, that the evidence ‘suggests that inspections reduce injuries by 2 percent to 15 percent,’ although the estimates often are not statistically significant (and thus cannot be distinguished confidently from zero effect). Several studies, including Scholz and Gray (1990) and Weil (1996), provide a more favorable assessment of the OSHA inspection process. However, even Dorman (1996), who supports an aggressive public policy to reduce workplace injuries, provided a qualified interpretation of such evidence: ‘even the most optimistic reading indicates that … more vigorous enforcement alone cannot close the gap between U.S. safety conditions and those in other OECD countries.’
In addition to the questionable effectiveness of OSHA inspections, some of the standards promulgated by OSHA have been criticized as excessively stringent. Viscusi (1996) examined OSHA standards using an implicit value of life of $5 million (derived from the compensating wage differential studies) as the standard for an efficient regulation. Four of the five OSHA safety regulations, but only one of the five OSHA health regulations adopted as final rules, had costs per life saved of less than $5 million.
This evidence caused Burton and Chelius (1997) to provide a strong critique of the government mandate theory:
To be sure, cost-benefit analysis of health standards issued under the OSHAct is not legal, and so those standards that fail the cost-benefit test (considering both the lives saved plus injuries and illnesses avoided) do not violate the letter and presumably the purpose of the law. But to the extent that the rationale offered by the government mandate theorists for regulation of health is that workers lack enough information to make correct decisions and therefore the government is in a better position to make decisions about how to improve workplace health, the evidence on the variability of the cost/benefit ratios for OSHA health standards is disquieting. Rather than OSHA standards reflecting interventions in the marketplace that overcome deficiencies of the marketplace, the explanation of why the stringency of regulation varies so much among industries would appear at best to be a result of technology-based decisions that could well aggravate the alleged misallocation of resources resulting from operation of the market and at worst could reflect relative political power of the workers and employers in various industries.
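The efficiency benchmark attributed to Viscusi (1996) in this section can be read as a simple screening rule: a regulation passes if its cost per life saved is below the implicit value of life of $5 million. The sketch below illustrates that arithmetic only. The function names and the two example regulations are hypothetical and are not taken from Viscusi's data.

```python
VALUE_OF_LIFE = 5_000_000  # implicit value of life cited in the text, in dollars


def cost_per_life_saved(total_cost: float, lives_saved: float) -> float:
    """Return the regulation's cost divided by the lives it is expected to save."""
    if lives_saved <= 0:
        raise ValueError("lives_saved must be positive")
    return total_cost / lives_saved


def passes_benchmark(total_cost: float, lives_saved: float,
                     benchmark: float = VALUE_OF_LIFE) -> bool:
    """True when cost per life saved is less than the benchmark value of life."""
    return cost_per_life_saved(total_cost, lives_saved) < benchmark


# Hypothetical regulations (illustrative figures only).
print(passes_benchmark(total_cost=40_000_000, lives_saved=20))  # 2 million per life
print(passes_benchmark(total_cost=90_000_000, lives_saved=3))   # 30 million per life
```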
6. Conclusions
This review suggests that understanding the economics of workplace safety involves a rather eclectic mix of theories. Burton and Chelius (1997) were least impressed with the arguments and evidence pertaining to the pure neoclassical economics and the government mandate theories. They concluded that among the other economic theories pertaining to safety, no single theory provides an adequate understanding of the causes and prevention of workplace accidents. Rather, a combination of the theories, though untidy, is needed.
Bibliography
Ackerman S 1988 Progressive law and economics – and the new administrative law. Yale Law Journal 98: 341–68
Aiuppa T, Trieschmann J 1998 Moral hazard in the French workers’ compensation system. Journal of Risk and Insurance 65: 125–33
Boden L 1995 Creating economic incentives: lessons from workers’ compensation systems. In: Voos P (ed.) Proceedings of the 47th Annual Meeting. Industrial Relations Research Association, Madison, WI
Bruce C J, Atkins F J 1993 Efficiency effects of premium-setting regimes under workers’ compensation: Canada and the United States. Journal of Labor Economics 11: Part 2: S38–S69
Burton J, Chelius J 1997 Workplace safety and health regulations: Rationale and results. In: Kaufman B (ed.) Government Regulation of the Employment Relationship, 1st edn. Industrial Relations Research Association, Madison, WI
Butler R 1994 Safety incentives in workers’ compensation. In: Burton J, Schmidle T (eds.) 1995 Workers’ Compensation Year Book. LRP Publications, Horsham, PA
Chelius J R 1977 Workplace Safety and Health. American Enterprise Institute for Public Policy Research, Washington, DC
Coase R H 1988 The Firm, the Market, and the Law. University of Chicago Press, Chicago
Commons J R 1934 Institutional Economics: Its Place in Political Economy. Macmillan, New York
Commons J R, Andrews J B 1936 Principles of Labor Legislation, 4th edn. Macmillan, New York
Dewees D, Duff D, Trebilcock M 1996 Exploring the Domain of Accident Law: Taking the Facts Seriously. Oxford University Press, Oxford
Dorman P 1996 Markets and Mortality: Economics, Dangerous Work, and the Value of Human Life. Cambridge University Press, Cambridge
Dorman P, Hagstrom P 1998 Wage compensation for dangerous work revisited. Industrial and Labor Relations Review 52: 116–35
Ehrenberg R G 1988 Workers’ compensation, wages, and the risk of injury. In: Burton J (ed.) New Perspectives on Workers’ Compensation. ILR Press, Ithaca, NY
Ehrenberg R G, Smith R S 1996 Modern Labor Economics: Theory and Public Policy, 6th edn. Addison-Wesley, Reading, MA
Fishback P V 1987 Liability rules and accident prevention in the workplace: Empirical evidence from the early twentieth century. Journal of Legal Studies 16: 305–28
Hirsch B T, Macpherson D A, Dumond J M 1997 Workers’ compensation recipiency in union and nonunion workplaces. Industrial and Labor Relations Review 50: 213–36
Kniesner T, Leeth J 1995 Abolishing OSHA. Regulation 4: 46–56
Landes W M, Posner R A 1987 The Economic Structure of Tort Law. Harvard University Press, Cambridge, MA
McGarity T, Shapiro S 1996 OSHA’s critics and regulatory reform. Wake Forest Legal Review 31: 587–646
Miller P, Mulvey C, Norris K 1997 Compensating differentials for risk of death in Australia. Economic Record 73: 363–72
Moore M J, Viscusi W K 1990 Compensation Mechanisms for Job Risks: Wages, Workers’ Compensation, and Product Liability. Princeton University Press, Princeton, NJ
Oi W Y 1974 On the economics of industrial safety. Law and Contemporary Problems 38: 669–99
Posner R 1972 A theory of negligence. Journal of Legal Studies 1: 29–66
Priest G L 1991 The modern expansion of tort liability: its sources, its effects, and its reform. Journal of Economic Perspectives 5: 31–50
Reilly B, Paci P, Holl P 1995 Unions, safety committees and workplace injuries. British Journal of Industrial Relations 33: 275–88
Schwartz G T 1994 Reality in the economic analysis of tort law: does tort law really deter? UCLA Law Review 42: 377–444
Scholz J T, Gray W B 1990 OSHA enforcement and workplace injuries: a behavioral approach to risk assessment. Journal of Risk and Uncertainty 3: 283–305
Siebert S, Wei X 1994 Compensating wage differentials for workplace accidents: evidence for union and nonunion workers in the UK. Journal of Risk and Uncertainty 9: 61–76
Smith R 1992 Have OSHA and workers’ compensation made the workplace safer? In: Lewin D, Mitchell O, Sherer P (eds.) Research Frontiers in Industrial Relations and Human Resources. Industrial Relations Research Association, Madison, WI
Thomason T, Burton J F 1993 Economic effects of workers’ compensation in the United States: Private insurance and the administration of compensation claims. Journal of Labor Economics 11: Part 2: S1–S37
Thomason T, Hyatt D, Roberts K 1998 Disputes and dispute resolution. In: Thomason T, Burton J, Hyatt D (eds.) New Approaches to Disability in the Workplace. Industrial Relations Research Association, Madison, WI
Viscusi W K 1993 The value of risks to life and health. Journal of Economic Literature 31: 1912–46
Viscusi W K 1996 Economic foundations of the current regulatory reform efforts. Journal of Economic Perspectives 10: 119–34
Weil D 1996 If OSHA is so bad, why is compliance so good? Rand Journal of Economics 27: 618–40
Weil D 1999 Are mandated health and safety committees substitutes for or supplements to labor unions? Industrial and Labor Relations Review 52: 339–60
Willborn S 1988 Individual employment rights and the standard economic objections: theory and empiricism. Nebraska Legal Review 67: 101–39
J. F. Burton, Jr.
Sample Surveys: Cognitive Aspects of Survey Design
Cognitive aspects of survey design refers to a research orientation that began in the early 1980s to integrate methods, techniques, and insights from the cognitive sciences into the continuing effort to render data from sample surveys of human populations more valid and reliable, and to understand systematically the threats to such reliability and validity.
1. Background and History
While a sample survey of human beings can be of good and measurable accuracy only if it is accomplished through probability sampling (Sample Surveys: The Field and Sample Surveys: Methods), probability sampling is only the first step in assuring that a survey is successful in producing results accurate enough for use in predicting elections, for gauging public opinion, for measuring the incidence of a disease or the amount of acreage planted in a crop, or for any of the other myriad of important uses that depend on survey data. Questions must be worded not only in ways that are unbiased, but that are understandable to respondents and convey the same meaning to respondents as was intended by the survey’s author. If the questions refer to the past, they must be presented in ways that help respondents remember the facts accurately and report truthfully. And interviewers must be able to understand respondents’ answers correctly to record and categorize them appropriately.
These and other nonsampling issues have been of concern to survey researchers for many decades (e.g., Payne 1951). Researchers were puzzled by the fact that changing the wording slightly in a question sometimes resulted in a different distribution of answers and sometimes did not; worried that sometimes the context of other questions affected answers to a particular question, and sometimes did not; concerned that sometimes respondents were able to remember things accurately and sometimes were not. In the late 1970s and early 1980s, as survey data came to be used ever more extensively for public policy purposes, concerns for the validity of those data became especially widespread.
These concerns led to a cross-disciplinary research endeavor that would bring the insights and methods of the cognitive sciences, especially cognitive psychology, to bear on the problems raised by surveys, and that would encourage researchers in the cognitive sciences to use surveys as a means of testing, broadening, and generalizing their laboratory-based conclusions. Recent work in the field has incorporated theories and viewpoints from fields not usually classified as the cognitive sciences, notably linguistics, conversational analysis, anthropology, and ethnography.
Described below are some of the practical and theoretical achievements of the movement to date. These include methodological transfers from the cognitive sciences to survey research, the broad establishment of cognitive laboratories in governmental and nongovernmental survey research centers, theoretical formulations that account for many of the mysteries generated by survey responses, and practical applications to the development and administration of on-going surveys. Some references appear at the close describing these achievements more fully and speculating on the future course of the field.
2. Methodological Transfers
Much work in the field is rooted in a classic information processing model applied to the cognitive tasks in a survey interview. The model suggests that the tasks the respondent must accomplish, in the rough order in which they must be tackled (though respondents go back and forth among them), are comprehension/interpretation, retrieval, judgment, and communication. These models have been intertwined with the earlier models of the survey interview as a social interaction (Sudman and Bradburn 1974) through the efforts of such researchers as Clark and Schober (1992) and Sudman et al. (1996) to take into account both the social communication and individual cognitive processes that play a part in the survey interview.
Guided by this model, one of the earliest achievements of the movement to study cognitive aspects of survey design was the adoption of some of the methods of the cognitive psychology laboratory for the pretesting of survey questionnaires. In particular, these methods are aimed at ensuring that questions are comprehensible to respondents and that the meanings transmitted are those intended by the investigator. They also can point to problems in retrieval and communication. Such cognitive pretesting includes ‘think-aloud protocols,’ in which specially recruited respondents answer the proposed questionnaire aloud to a researcher, and describe, either concurrently with each question or retrospectively at the end of the questionnaire, their thought processes as they attempt to understand the question and to retrieve or construct an answer. Also used is behavioral coding, based on procedures originally developed by Cannell and coworkers (e.g., Cannell and Oksenberg 1988), which note questions for which respondents ask for clarification or for which their responses are inadequate. Other methods include reviews of questionnaires by cognitively trained experts, the use of focus groups, and, for questionnaires that are automated, measures of response latency. Detailed descriptions of these methods and their applicability to detect problems at various stages of the question answering process appear in Sudman et al. (1996), Schwarz and Sudman (1996), and Forsyth and Lessler (1991).
3. Interviewer Behavior
Although survey interviewing began with investigators themselves merely going out and having conversations with respondents, as the survey enterprise aspired to greater scientific respectability and as it became sufficiently large-scale so that many interviewers were needed for any project, standardized interviewing became the norm. Printed questionnaires were to be followed to the letter; interviewers were instructed to answer respondents’ requests for clarification by merely re-reading the question or by saying that the meaning of any term is ‘whatever it means to you.’ While it is probably true that skilled interviewers have always deviated from these rigid standards in the interest of maintaining respondent cooperation and getting valid data, the robot-like interviewer delivering the same ‘stimulus’ to every respondent was considered the ideal.
Working with videotapes of interviews for the National Health Interview Survey and for the General Social Survey and using techniques of conversational analysis, Suchman and Jordan (1990) showed that misunderstandings frequently arose because of the exclusion of normal conversational resources from the survey interview, and that these misunderstandings resulted not only in discomfort on the part of respondents, but in data that did not meet the needs of the survey designer. Suchman and Jordan recommended a more collaborative interviewing style, in which interviewer and respondent work together to elicit the respondent’s story and then fill in the questionnaire. Note that this recommendation applies almost exclusively when the information sought is about the respondent’s activities or experiences rather than about his/her attitudes or opinions. Both theoretical and practical advances have followed on these insights.
On a theoretical level, several researchers have used variants of the ‘cooperativeness’ principle advanced by Grice (1975) to help explain some of the anomalies that have puzzled survey researchers over the decades. Many are understandable as respondents bringing to the survey interview the communicative assumptions and skills that serve well in everyday life. The maxims of the cooperativeness principle require a participant
in a conversation to be truthful, relevant, informative, and clear (Sudman et al. 1996, p. 63). Thus, for example, it has long been known that respondents are willing to report on their attitudes towards fictitious issues or nonexistent groups, and investigators have taken this willingness as evidence of the gullibility of respondents and perhaps of the futility of trying to measure attitudes at all. But to ‘catch on’ to the ‘trick’ nature of such a question, a respondent would have to assume that the interviewer violated all the maxims of the cooperativeness principle. Parsimony suggests, instead, that the respondent would assume that the interviewer knew what s/he was talking about, and answer accordingly. Several puzzles about context effects, discussed below, also seem understandable in light of Grice’s maxims.
On a practical level, freeing interviewers to be more conversational with respondents is not costless; when interviewers explain more, interviews take longer, and are thus more costly in interviewers’ wages and respondents’ patience. Schober and Conrad (e.g., 1997) have embarked on a program of research that has demonstrated that conversational interviewing generates better data than does standardized interviewing when the mappings of the respondent’s experiences onto the survey’s concepts are complicated, but indeed at the cost of longer interviews. These investigators are now experimenting with interviewing styles that fall at several points along the continuum between fully standardized and fully conversational, seeking a point that balances the benefits in accuracy to be gained by a more conversational style with the costs in interview length.
4. Factual Questions vs. Questions about Attitudes or Opinions
The distinction between questions that ask respondents to report on facts, usually autobiographical, and those that ask for an attitude or an opinion has been an important one in survey research. Indeed, the explicit origins of the movement to study cognitive aspects of survey design lie in a desire to improve the accuracy of factual reports from respondents; the extension of concern to attitudinal questions came somewhat later, although such a concern was certainly prefigured by the work reported in Turner and Martin (1984). The distinction remains important for interviewer behavior, in the sense that the kinds of conversational interviewing being investigated are designed to help respondents better comprehend the intent of the question and to recall more accurately their experiences. Interviewer interventions designed for these purposes, when the question is aimed at getting a factual report, might well bias the respondent’s answer when the aim of the question is to elicit an attitude or opinion. The distinction is also valid in the sense that the cognitive theories about autobiographical memory that have been used to understand problems of the recall of factual information are less applicable to attitudes and opinions, and the cognitive techniques that have been developed to aid respondent recall for autobiographical questions are not appropriate for the recall of attitudes or opinions. But in important ways the distinction obscures similarities between the two types of questions, as will be discussed in the section on context effects, below.
4.1 Improving the Accuracy of Factual Reports
Very often survey questions ask for the frequency with which a respondent has accomplished a behavior (e.g., visited a doctor) or experienced an event (e.g., had an illness) during a specified time period (usually starting sometime in the past and ending with the date of the interview) called the reference period. Thus the respondent, having understood which events or experiences should be included in the report, has a two-fold task: he or she must recall the events or experiences, and also determine whether those events or experiences fell inside the reference period. Theories from the cognitive sciences about how experiences are stored in autobiographical memory and retrieved therefrom have been marshaled to understand both phases of the respondent’s recall task and to improve their accuracy.
In a phenomenon called ‘telescoping,’ respondents are known often to move events or experiences forward in time, reporting them as falling within the reference period although in fact they occurred earlier. One of the earliest contributions to the literature on cognitive aspects of survey design was the proposal of a technique dubbed ‘landmarking’ to control telescoping (Loftus and Marburger 1983). Here, rather than inquiring about the last six months or the last year, a respondent is supplied with (or is asked to supply) a memorable date, and then is asked about events or experiences since that date. The related concept of bounding is also useful in controlling telescoping. In a panel survey, in which respondents are interviewed repeatedly at regular intervals, informing them of the events or experiences reported at the last interview will prevent such events or experiences from being placed in the time period between the previous interview and the current one.
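The logic of bounding can be illustrated with a small data-processing sketch: an event already reported at the previous interview is excluded from the current reference period even if the respondent now misdates it as more recent. The sketch below is only an analogy to the interviewing technique, which works by reminding the respondent rather than by filtering data afterwards. The record type, function name, event identifiers, and dates are all hypothetical.

```python
from datetime import date
from typing import Iterable, List, NamedTuple, Set


class Report(NamedTuple):
    event_id: str        # hypothetical identifier for the reported event
    reported_date: date  # date the respondent assigns to the event


def bounded_reports(reports: Iterable[Report], period_start: date,
                    period_end: date, previously_reported: Set[str]) -> List[Report]:
    """Keep reports dated inside the reference period, excluding events
    that were already reported at the previous interview (bounding)."""
    kept = []
    for report in reports:
        in_period = period_start <= report.reported_date <= period_end
        if in_period and report.event_id not in previously_reported:
            kept.append(report)
    return kept


# Hypothetical panel wave: visit A was reported at the last interview but is
# now telescoped forward into the current reference period.
current = [
    Report("clinic-visit-A", date(2024, 3, 2)),
    Report("clinic-visit-B", date(2024, 4, 10)),
]
kept = bounded_reports(current, date(2024, 3, 1), date(2024, 5, 31),
                       previously_reported={"clinic-visit-A"})
print([r.event_id for r in kept])
```

Matching on an event identifier rather than on the reported date is a deliberate choice here, since telescoping changes the date the respondent attaches to the event.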
An analogous technique useable in a single interview is the two-time-frame procedure in which the respondent is first asked to report events or experiences during a long reference period (e.g., the last 6 months) and then for a shorter one that is really of interest to the investigator (e.g., the last 2 months). This technique seems both to relieve the pressure on respondents to report socially desirable events or experiences and to stress the interviewer’s interest in correct dating, encouraging respondents to strive for greater accuracy. A theoretical explanation of telescoping, taking into account the workings of memory for the storage of timing and the effects of rounding to culturally stereotypical values (e.g., 7 days, 10 days, 30 days), is provided by Huttenlocher et al. (1990).
The task of counting or estimating the number of events or experiences in the reference period can also provide a challenge to the respondent. Issues of frequency, regularity, and similarity influence whether the respondent is likely to actually count the instances or to employ an estimation strategy, as well as influencing whether counting or estimation is likely to be more accurate. In general, respondents are more likely to retrieve from memory and count events or experiences if their number is few and to estimate if their number is large. But regular events (e.g., going to the dentist every 6 months) and similar events (e.g., trips to the grocery store) are more likely to be estimated. And investigators have recently found that estimation (perhaps by retrieving a rate and applying that rate to the length of the reference period) is usually more accurate than the attempt to retrieve and count regular, similar events or experiences. For a full discussion of these issues, see Sudman et al. (1996).
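The rate-based estimation strategy mentioned above amounts to scaling a remembered rate to the length of the reference period. A minimal sketch follows, assuming the respondent recalls a rate of the form "so many events per so many days." The function name and the grocery-trip figures are hypothetical.

```python
def estimate_by_rate(events_per_interval: float, interval_days: float,
                     reference_days: float) -> float:
    """Scale a remembered rate (events per interval) to the reference period."""
    if interval_days <= 0 or reference_days < 0:
        raise ValueError("interval_days must be positive and reference_days non-negative")
    return events_per_interval * reference_days / interval_days


# Hypothetical respondent: about two grocery trips a week, asked about the last 28 days.
print(estimate_by_rate(events_per_interval=2, interval_days=7, reference_days=28))
```

The appeal of the strategy for regular, similar events is that the respondent needs to retrieve only one quantity, the rate, instead of many near-identical episodes.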
4.2 Context Effects
Context effects occur when the content of the questions preceding a particular question (or even following that question in a self-administered questionnaire in which the respondent is free to return to earlier questions after reading later ones) influences how some respondents answer that question. Our understanding of the term has been broadened recently to include also the effects on the distribution of responses of the response categories offered in a closed-ended question. Investigators have used the maxims of Grice’s cooperativeness principle to approach a systematic understanding of these effects, which have seemed mysterious for many years. Full treatments of these issues may be found in Sudman et al. (1996) and Tourangeau (1999); of necessity only some selected findings can be presented here. A pair of puzzling context effects that have fascinated researchers for years are assimilation effects and contrast effects. An assimilation effect occurs when the content of the preceding items moves the responses to the target item in the direction of the preceding items; a contrast effect occurs when the movement is in the opposite direction. These effects can now be understood through the inclusion/exclusion model of the judgment process in survey responding, proposed by Schwarz and Bless (1992). The model holds that when a respondent has to make a judgment about a target stimulus, s/he must construct a cognitive representation, not only of the target stimulus, but also of a standard with which to compare the target
stimulus. To construct these representations respondents must retrieve information from memory; since not all information in memory can be retrieved, respondents retrieve only that which is most easily accessible. And information can be accessible either because it is ‘chronically’ accessible or because it is temporarily accessible; it is the temporary accessibility of information that is affected by the context of the question. Positive temporarily accessible information supplied by the context and included in the representation of the target will render the judgment of the target more positive; negative context information thus included will render the judgment more negative. Both these processes produce assimilation effects. Contrast effects are created when the context suggests that information be excluded from the representation of the target, or included in the representation of the standard to which the target is to be compared. For example, Schwarz and Bless (1992) were able to manipulate the frequency of positive evaluations of a political party by using a reference to a popular politician in a preceding question. They were able to create an assimilation effect, resulting in more positive evaluations, by encouraging respondents to include the politician in their representations of the party (by having them recall that the politician had been a member of the party). They were also able to create a contrast effect, resulting in less positive evaluations of the political party, by encouraging respondents to exclude the politician from their representation of the party (by having them recall that his present position took him out of party politics). The model predicts inclusion of more information when the target is a general question, thus encouraging assimilation effects; it also predicts contrast effects when the target is narrow.
Thus, a preceding question about marital happiness seems to result in higher reports of general happiness for those who are happily married, but lower reports of general happiness for those who are unhappy in their marriages. This influence of the preceding question can be eliminated by having respondents see the questions about marital happiness and general happiness as part of a single unit (either by an introduction that links the two questions or by a typographical link on a printed questionnaire); then, in obedience to the conversational maxim that speakers should present new information, respondents exclude their marital happiness from their evaluation of their general happiness. Similar recourse to conversational maxims helps explain the influence of the response alternatives presented in closed-ended questions on the distribution of responses. For example, Schwarz et al. (1985) found that 37.5 percent of respondents reported watching TV for more than two and a half hours a day when the response scale ranged from ‘up to 2½ hours’ to ‘more than 4½ hours,’ but only 16.2 percent reported watching more than two and a half hours a day when the response scale ranged from ‘up to ½ hour’ to ‘more than 2½ hours.’ Respondents take the response scales to be conversationally informative about what typical TV watching habits are and calibrate their reports accordingly.
5. Some Achievements of the Movement to Study Cognitive Aspects of Surveys
5.1 The Impact of Theories
Perhaps the most profound effect of the movement to study cognitive aspects of survey design is the introduction of theory into the research area. There has been space above to present only a few of the many such theoretical advances, but their utility should be clear. For example, consulting the maxims of cooperativeness in conversation gives us a framework for predicting and preventing response effects. In the same sense, the inclusion/exclusion model, much more fully developed than is presented here, makes predictions for what to expect from various manipulations of question ordering and content. These predictions offer guidance for the experimental advance of the research domain, and their testing systematically advances our knowledge, both of the phenomena and of the applicability of the theory.
5.2 The Cognitive Laboratories and Their Impact
Cognitive interviews in laboratory settings have become standard practice in several US government agencies (including the National Center for Health Statistics, the Bureau of Labor Statistics, and the Census Bureau), as well as in government agencies around the world, and at many academic and commercial survey organizations. Work in these laboratories supplements the usual large-scale field pretests of surveys. For illustrative purposes it is worth describing two major efforts at US government agencies. The Current Population Survey (CPS) is a large-scale government survey, sponsored by the Bureau of Labor Statistics and carried out by the Census Bureau. Major government statistical series, including the unemployment rate, are derived from its monthly interviews with some 50,000 households. In the early 1990s the CPS underwent its decennial revision, this time using the facilities of the government cognitive laboratories as well as traditional field testing of proposed innovations. In particular, this research found that the ordering and wording of the key questions regarding labor force participation caused some respondents to report themselves as out of the labor force when in fact they were either working part time or looking for work. It also found that the concept of layoff, which the CPS takes to mean
temporary furlough from work with a specific date set for return, was not interpreted that way by respondents. Respondents seemed to understand ‘on layoff’ as a polite term for ‘fired.’ Appropriate revision of these questions, followed by careful field testing and running the new and old versions in parallel for some months, resulted in estimates of the unemployment rate in which policy makers and administrators can have greater confidence. More details can be found in Norwood and Tanur (1994). The US Decennial Census has always asked for a racial characterization of each resident. Originally designed to distinguish whites (mostly free) from blacks (mostly slaves) for purposes of counting for apportionment (the US Constitution originally mandated that a slave be counted as 3/5 of a man), the race identification question has seen the uses to which its answers were put evolve over time. Civil Rights legislation starting in the 1960s designated protected groups, and it thus became important to know the population of such groups. At the same time, individuals were becoming both more conscious of their racial self-identification and less willing to be pigeonholed into a small set of discrete categories, especially as more individuals were self-identifying with more than one racial group. The 1990 Census asked two questions in this general area: one on racial identity followed by one on Hispanic ethnic identification. Experimental work (Martin et al. 1990) indicated that the ordering of these questions mattered to respondents. For example, when the Hispanic ethnicity question appeared first, fewer people chose ‘other’ as their racial category. This and other research made a convincing case that the complexity of Americans’ racial and ethnic self-identification needed a more complex set of choices on the Census form.
A 4-year program of research resulted in the conclusion that allowing respondents to choose as many racial categories as they believe apply to them is a solution superior to the provision of a multiracial category (see Tucker et al. 1996). (For more details on this and other census research see Censuses: History and Methods.) The 2000 Decennial Census allowed respondents to choose as many racial categories as they wished; at this writing it is not yet known what proportion of the population took advantage of that opportunity, or the effects those choices will have on the analyses of the data.
6. Conclusions
The movement to study cognitive aspects of surveys has thus given us some theoretical insights and practical approaches. Ideally, we could hope for a theory of the survey designing and responding process, but we are still far from that goal, and may perhaps never be able to attain it. But the frameworks so far provided offer clear guidance on how to proceed in an effort to reach systematic understanding. Far fuller accounts of the results of the movement to study
cognitive aspects of survey design and ideas of future directions can be found in Sirken et al. (1999a), Sirken et al. (1999b), Sudman et al. (1996), and Tanur (1992). See also: Probability: Formal; Probability: Interpretations; Sample Surveys, History of; Sample Surveys: Methods; Sample Surveys: Model-based Approaches; Sample Surveys: Nonprobability Sampling; Sample Surveys: Survey Design Issues and Strategies; Sample Surveys: The Field
Bibliography
Cannell C F, Oksenberg L 1988 Observation of behavior in telephone interviews. In: Groves R M, Biemer P B, Lyberg L E, Massey J T, Nichols II W L, Waksberg J (eds.) Telephone Survey Methodology. Wiley, New York
Clark H H, Schober M F 1992 Asking questions and influencing answers. In: Tanur J M (ed.) Questions about Questions: Inquiries into the Cognitive Bases of Surveys. Sage, New York
Forsyth B H, Lessler J 1991 Cognitive laboratory methods: A taxonomy. In: Biemer P, Groves R M, Lyberg L E, Mathiowetz N, Sudman S (eds.) Measurement Errors in Surveys. Wiley, New York
Grice H P 1975 Logic and conversation. In: Cole P, Morgan J L (eds.) Syntax and Semantics, Vol 3: Speech Acts. Academic Press, New York
Hippler H J, Schwarz N, Sudman S (eds.) 1987 Social Information Processing and Survey Methodology. Springer-Verlag, New York
Huttenlocher J, Hedges L V, Bradburn N M 1990 Reports of elapsed time: Bounding and rounding processes in estimation. Journal of Experimental Psychology: Learning, Memory, and Cognition 16: 196–213
Loftus E F, Marburger W 1983 Since the eruption of Mt. St. Helens, did anyone beat you up? Improving the accuracy of retrospective reports with landmark events. Memory and Cognition 11: 114–20
Martin E, Demaio T J, Campanelli P C 1990 Context effects for census measures of race and Hispanic origin. Public Opinion Quarterly 54: 551–66
Norwood J L, Tanur J M 1994 Measuring unemployment in the nineties. Public Opinion Quarterly 58: 277–94
Payne S L 1951 The Art of Asking Questions. Princeton University Press, Princeton, NJ
Schober M F, Conrad F G 1997 Does conversational interviewing reduce survey measurement error? Public Opinion Quarterly 61: 576–602
Schwarz N, Bless H 1992 Constructing reality and its alternatives: Assimilation and contrast effects in social judgment. In: Martin L L, Tesser A (eds.) The Construction of Social Judgment. Erlbaum, Hillsdale, NJ
Schwarz N, Hippler H J, Deutsch B, Strack F 1985 Response categories: Effects on behavioral reports and comparative judgments. Public Opinion Quarterly 49: 388–95
Schwarz N, Sudman S (eds.) 1996 Answering Questions: Methodology for Determining Cognitive and Communicative Processes in Survey Research. Jossey-Bass, San Francisco
Sinaiko H, Broedling L A (eds.) 1976 Perspectives on Attitude Assessment: Surveys and their Alternatives. Pendleton, Champaign, IL
Sirken M G, Herrmann D J, Schechter S, Schwarz N, Tanur J M, Tourangeau R (eds.) 1999a Cognition and Survey Research. Wiley, New York
Sirken M G, Jabine T, Willis G, Martin E, Tucker C (eds.) 1999b A New Agenda for Interdisciplinary Survey Research Methods: Proceedings of the CASM II Seminar. US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics, Hyattsville, MD
Suchman L, Jordan B 1990 Interactional troubles in face-to-face survey interviews. Journal of the American Statistical Association 85: 232–53
Sudman S, Bradburn N M 1974 Response Effects in Surveys: A Review and Synthesis. Aldine, Chicago
Sudman S, Bradburn N M, Schwarz N 1996 Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. Jossey-Bass, San Francisco
Tanur J M (ed.) 1992 Questions about Questions: Inquiries into the Cognitive Bases of Surveys. Sage, New York
Tourangeau R 1999 Context effects on answers to attitude questions. In: Sirken M G, Herrmann D J, Schwarz N, Tanur J M, Tourangeau R (eds.) Cognition and Survey Research. Wiley, New York
Tucker C, McKay R, Kojetin B, Harrison R, de la Puente M, Stinson L, Robison E 1996 Testing methods of collecting racial and ethnic information: Results of the Current Population Survey Supplement on Race and Ethnicity. Bureau of Labor Statistics, Statistical Notes, No. 40
Turner C F, Martin E (eds.) 1984 Surveying Subjective Phenomena. Sage, New York
J. M. Tanur
Sample Surveys, History of
Sampling methods have their roots in attempts to measure the characteristics of a nation’s population by using only a part instead of the whole population. The movement from the exclusive use of censuses to the at least occasional use of samples was slow and laborious. Two intertwined intellectual puzzles had to be solved before the move was complete. The first puzzle was whether it is possible to derive valid information about a population by examining only a portion thereof (the ‘representative’ method). The second puzzle concerned the method for choosing that portion. Issues of choice themselves fall into two categories: how to structure the population itself in order to improve the accuracy of the sample, and whether to select the units for inclusion by purposive or random methods. This article considers the development of methods for the sampling of human populations. We begin by describing two antecedents of modern survey methods: the development of government statistics and censuses, and the general development of statistical methodology, especially for ratio estimation. Then we deal with the movement from censuses to sample surveys, examining the debate over representative methods and the rise of probability sampling, including the seminal 1934 paper of Jerzy Neyman. We describe the impact in the United States of the results
obtained by Neyman on sample surveys and the contributions of statisticians working in US government statistical agencies. We end with a brief description of the subsequent establishment of random sampling and survey methodology more broadly and examine some history of the movement from the study of the design of sampling schemes to the study of other issues arising in the surveying of human populations.
1. Antecedents of Modern Sample Survey Methods
In order to determine facts about their subjects or citizens, and more often about their number, governments have long conducted censuses. Thus, one set of roots of sample survey methodology is intertwined with the history of methods for census-taking. The origins of the modern census are found in biblical censuses described in the Old Testament (e.g., see the discussions in Duncan 1984 and Madansky 1986) as well as in censuses carried out by the ancient Egyptians, Greeks, Japanese, Persians, and Romans (Taeuber 1978). For most practical purposes we can skip from biblical times to the end of the eighteenth century and the initiation of census activities in the United States of America, even though there is some debate as to whether Canada, Sweden, or the United States should be credited with originating the modern census (Willcox 1930) (see Censuses: History and Methods). Another antecedent of sampling lies in the problem of estimating the size of a population when it is difficult or impossible to conduct a census or complete enumeration. Attempts to solve this problem even in the absence of formal sampling methods were the inspiration of what we now know as ratio estimation. These ideas emerged as early as the seventeenth century. For example, John Graunt used the technique to estimate the population of England from the number of births and some assumptions about the birth rate and the size of families. Graunt’s research, later dubbed political arithmetic, was based on administrative records (especially parish records) and personal observation, as in his seminal work, Natural and Political Observations Made Upon the Bills of Mortality, published in 1662. To accomplish what Graunt did by assumption, others looked to hard data from subsets of a population. Thus, Sir Frederick Morton Eden, for example, found the average number of people per house in selected districts of Great Britain in 1800.
Using the total number of households in Great Britain from tax rolls (and allowing for those houses missing from the rolls), Eden estimated the population of Great Britain as nine million, a figure confirmed by the first British census of 1801 (Stephan 1948). Even earlier, in 1765 and 1778, civil servants published population estimates for France using an
enumeration of the population in selected districts and counts of births, deaths, and marriages for the country as a whole (see Stephan 1948). It was Pierre Simon de Laplace, however, who first formally described the method of ratio estimation during the 1780s, and he then employed it in connection with a sample survey he initiated to estimate the population of France as of September 22, 1802. He arranged for the government to take a sample of administrative units (communes), in which the total population, $y$, and the number of registered births in the preceding year, $x$, were measured. Laplace then estimated the total population of France by $\hat{Y} = X y / x$, where $X$ was the total number of registered births. Laplace was the first to estimate the asymptotic bias and variance of $\hat{Y}$, through the use of a superpopulation model for the ratio $p = y/x$ (see Laplace 1814). Cochran (1978) describes this derivation and links it to results which followed 150 years later. Quetelet later applied the ratio estimation approach in Belgium in the 1820s but abandoned it following an attack by others on some of the underlying assumptions (see Stigler 1986, pp. 162–6).
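The ratio estimate attributed to Laplace above amounts to scaling the known national total of births by the population-to-births ratio observed in the sampled communes. The following is a minimal sketch of that arithmetic; the function name `ratio_estimate` and every number in the example are invented for illustration and are not Laplace's actual figures.

```python
def ratio_estimate(sampled_units, total_births):
    """Estimate a national population as X * y / x.

    sampled_units: list of (population, registered_births) pairs, one per
                   sampled commune (hypothetical data structure).
    total_births:  X, the registered births for the whole country.
    """
    y = sum(pop for pop, _ in sampled_units)        # sampled population
    x = sum(births for _, births in sampled_units)  # sampled births
    if x <= 0:
        raise ValueError("sampled births must be positive")
    return total_births * y / x


# Invented example: three sampled communes and a national birth count.
communes = [(1_400_000, 50_000), (840_000, 30_000), (560_000, 20_000)]
estimate = ratio_estimate(communes, total_births=1_000_000)
print(estimate)
```

The estimate is only as good as the assumption that the ratio of population to births in the sampled units resembles that of the country as a whole, which is the kind of assumption that was later attacked in Quetelet's application.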
2. From Censuses to Surveys
The move from censuses to sample surveys to measure the characteristics of a nation’s population was slow and laborious. Bellhouse (1988), Kruskal and Mosteller (1980), and Seng (1951) trace some of this movement, especially as it was reflected in the discussions regarding surveys that took place at the congresses of the International Statistical Institute (ISI). The move succeeded when investigators combined random selection methods with the structuring of the population and developed a theory for relating estimates from samples, however structured, to the population itself. As early as the 1895 ISI meeting, Kiaer (1896) argued for a ‘representative method’ or ‘partial investigation,’ in which the investigator would first choose districts, cities, etc., and then units (individuals) within those primary choices. The choosing at each level was to be done purposively, with an eye to the inclusion of all types of units. That coverage tenet, together with the large sample sizes recommended at all levels of sampling, was what was judged to make the selection representative. Thus the sample was, approximately, a ‘miniature’ of the population. Kiaer used systematic sampling in his 1895 survey of workers as a means of facilitating special tabulations from census schedules for a study of family data in the 1900 Norwegian census. Kiaer actually introduced the notion of random selection in his description of this work by noting that at the lowest level of the sample structure ‘the sample should be selected in a haphazard and random way, so that a sample selected in this manner would turn out in the same way as would have been the case
had the sample been selected through the drawing of lots … ’ (Kiaer 1897, p. 39). The idea of less than a complete enumeration was opposed widely, and Kiaer presented arguments for sampling at ISI meetings in 1897, 1901, and 1903. Lucien March, in a discussion of Kiaer’s paper at the 1903 meeting, formally introduced the concepts of simple random sampling without replacement and simple cluster sampling, although not using these names (Bellhouse 1988). Arthur Lyon Bowley did a number of empirical studies on the validity of random sampling, motivated at least in part by a paper by Francis Ysidro Edgeworth (1912). Bowley used large sample normal approximations (but did not conceptualize what he was doing as sampling from a finite population) to actually test the notion of representativeness. Then, he carried out a 1912 sample survey on poverty in Reading, in which he drew the respondents at random (see Bowley 1913), although he appears to have equated simple random sampling with systematic sampling (Bellhouse 1988). At about the same time, in what appears to be an independent line of discovery, Tchouproff and others were overseeing the implementation of sample survey methods in Russia, especially during the First World War and the period immediately following it. This work was partially documented after the Russian Revolution in Vestnik Statistiki, the official publication of the Central Statistical Administration (Zarkovic 1956, 1962). Even though there is clear evidence that Tchouproff had developed a good deal of the statistical theory for random sampling from finite populations during this period (see Seneta 1985), whether he actually implemented forms of random selection in these early surveys is unclear. He later published the formulae for the behavior of sample estimates under simple random sampling and stratified random sampling from finite populations in Tchouproff (1918a, 1918b, 1923a, 1923b).
In a seemingly independent development of these basic ideas for sampling in the context of his work on agricultural experiments, Neyman in Splawa-Neyman (1923) described them in the form of the drawing of balls without replacement from an urn. In the resulting 1925 paper, Splawa-Neyman (1925) gave the basic elements of the theory for sampling from finite populations and its relationship with sampling from infinite populations. These results clearly overlapped those of Tchouproff, and two years later, following Tchouproff’s death, his partisans took Neyman to task for his lack of citation to the Russian work as well as to earlier work of others (see Fienberg and Tanur 1995, 1996 for a discussion of the controversy). By 1925, the record of the ISI suggests that the representative method was taken for granted, and the discussions centered around how to accomplish representativeness and how to measure the precision of sample-based estimates, with the key presentations
being made by Bowley (1926) and in the same year by the Danish statistician Adolph Jensen. Notions of clustering and stratification were put forward, and Bowley presented a theory of proportionate stratified sampling as well as the concept of a frame, but purposive sampling was still the method of choice. It was not until Gini and Galvani made a purposive choice of which returns of an Italian census to preserve, and found that districts chosen to represent the country’s average on seven variables were, in that sense, unrepresentative on other variables, that purposive sampling was definitively discredited (Gini 1928, Gini and Galvani 1929).
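The finite-population theory mentioned in this section, with its image of drawing balls from an urn without replacement, can be made concrete with a small sketch. The formula used below for the estimated variance of the sample mean, (1 - n/N) s²/n, is the standard textbook form with a finite population correction; it is supplied here for illustration and is not quoted from the works cited. The function name `srs_mean_estimate` and the toy population are hypothetical.

```python
import random
import statistics


def srs_mean_estimate(population, n, seed=0):
    """Draw a simple random sample without replacement and estimate the mean.

    Returns (sample_mean, estimated_variance_of_the_mean), where the variance
    includes the finite population correction (1 - n/N).
    """
    if not 2 <= n <= len(population):
        raise ValueError("n must be between 2 and the population size")
    rng = random.Random(seed)              # seeded only to make reruns repeatable
    sample = rng.sample(population, n)     # drawing without replacement
    big_n = len(population)
    s2 = statistics.variance(sample)       # sample variance, divisor n - 1
    var_of_mean = (1 - n / big_n) * s2 / n
    return statistics.fmean(sample), var_of_mean


# Toy population: the integers 1..100.
urn = list(range(1, 101))
mean_10, var_10 = srs_mean_estimate(urn, 10)
mean_all, var_all = srs_mean_estimate(urn, len(urn))
print(mean_10, var_10)
print(mean_all, var_all)
```

When the sample is the whole population the correction factor is zero, so the estimated variance vanishes; this is the sense in which sampling from a finite population differs from sampling from an infinite one.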
3. Neyman’s 1934 Paper on Sampling
In their work on the Italian census, Gini and Galvani seemed to call into question the accuracy of sampling. Neyman took up their challenge in his classic 1934 paper presented before the Royal Statistical Society, ‘On the two different aspects of the representative method.’ In it he compared purposive and random sampling, and concluded that it was not sampling that was problematic but rather Gini and Galvani’s purposive selection. Elements of synthesis were prominent in the paper as well. Neyman explicitly uncoupled clustering and purposive sampling, saying, ‘In fact the circumstance that the elements of sampling are not human individuals, but groups of these individuals, does not necessarily involve a negation of the randomness of the sampling’ (1952, p. 571). He calls this procedure ‘random sampling by groups’ and points out that, although Bowley did not consider it theoretically, he used it in practice in London, as did O. Anderson in Bulgaria. Neyman also combined stratification with clustering to form ‘random stratified sampling by groups,’ and he provided a method for deciding how best to allocate samples across strata (optimal allocation). The immediate effect of Neyman’s paper was to establish the primacy of the method of stratified random sampling over the method of purposive selection, something that was left in doubt by the 1925 ISI presentations by Jensen and Bowley. But the paper’s longer-term importance for sampling was the consequence of Neyman’s wisdom in rescuing clustering from the clutches of those who were the advocates of purposive sampling and integrating it with stratification in a synthesis that laid the groundwork for modern-day multistage probability sampling.
Surprisingly, for many statisticians the memorable parts of Neyman’s paper were not these innovations in sampling methodology but Neyman’s introduction of general statistical theory for point and interval estimation, especially the method of confidence intervals (see Estimation: Point and Interval and Frequentist Inference).
As pathbreaking as Neyman’s paper was, a number of its results had appeared in an earlier work by Tchouproff (1923a, 1923b), in particular the result on optimal allocation in stratified sampling (see Fienberg and Tanur 1995, 1996 for discussions of this point). The method was derived much earlier, by the Danish mathematician Gram (1883) in a paper dealing with calculations for the cover of a forest based on a sample of trees. Gram’s work has only recently been rediscovered and seems not to have been accessible to either Tchouproff or Neyman. Neyman provided the recipe for others to follow and he continued to explain its use in convincing detail to those who were eager to make random sampling a standard diet for practical consumption (e.g., see Neyman 1952 for a description based on his 1937 lectures on the topic at the US Department of Agriculture Graduate School, as discussed below).
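The contrast between proportionate stratified sampling and optimal allocation can be sketched numerically. The version below uses the usual textbook statement of optimal (Neyman) allocation, in which the sample size in stratum h is proportional to N_h S_h, the stratum size times the within-stratum standard deviation; that formula, the function names, and the stratum figures are supplied for illustration and are not taken from the papers discussed above.

```python
def proportional_allocation(n, stratum_sizes):
    """Allocate n in proportion to stratum size: n_h = n * N_h / sum(N_h)."""
    total = sum(stratum_sizes)
    return [n * size / total for size in stratum_sizes]


def neyman_allocation(n, stratum_sizes, stratum_sds):
    """Allocate n in proportion to N_h * S_h (textbook optimal allocation)."""
    if len(stratum_sizes) != len(stratum_sds):
        raise ValueError("one standard deviation is needed per stratum")
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Invented strata: sizes N_h and within-stratum standard deviations S_h.
sizes = [500, 300, 200]
sds = [10.0, 20.0, 40.0]

prop = proportional_allocation(100, sizes)
opt = neyman_allocation(100, sizes, sds)
print(prop)
print([round(v, 1) for v in opt])
```

The allocations are returned as fractions and would need rounding in practice. With these invented figures the smallest stratum receives the largest share under optimal allocation because it is the most variable; when all strata are equally variable the two rules coincide.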
4. The Development of Random Sampling in the United States
One might have thought that the resolution of the controversy over the representative method and the articulation of the basic elements of a theory of sample surveys by Neyman would have triggered extensive application of random sampling throughout the world. Surprisingly, this was not the case. With some notable exceptions, for example, in England (see Cochran 1939 and Yates 1946) and India, the primary application occurred in the United States and this led to a spate of new and important methodological developments. As late as 1932, however, there were few examples of probability sampling anywhere in the US federal government (Duncan and Shelton 1978), and the federal statistical agencies had difficulty responding to the demand for statistics to monitor the effects of the programs of President Franklin Roosevelt’s New Deal. In 1933, the American Statistical Association (ASA) set up an advisory committee that grew into the Committee on Government Statistics and Information Services (COGSIS), sponsored jointly by ASA and the Social Science Research Council. COGSIS helped to stimulate the use of probability sampling methods in various parts of the Federal government, and it encouraged employees of statistical agencies to carry out research on sampling theory. For example, to establish a technical basis for unemployment estimates, COGSIS, and the Central Statistical Board which it helped to establish, organized an experimental Trial Census of Unemployment as a Civil Works Administration project in three cities, using probability sampling, and carried out in late 1933 and early 1934. The positive results from this study led in 1940 to the establishment of the first large-scale, ongoing sample survey on employment and unemployment using probability sampling methods. This survey later
became known as the Current Population Survey and it continues to the present day. Another somewhat indirect outcome of the COGSIS emphasis on probability sampling took place at the Department of Agriculture Graduate School where Deming, who recognized the importance of Neyman’s 1934 paper, invited Neyman to present a series of lectures in 1937 on sampling and other statistical methods (Neyman 1952). These lectures had a profound impact on the further development of sampling theory not simply in agriculture, but across the government as well as in universities. Among those who worked on the probability-sampling-based trial Census of Unemployment at the Bureau of the Census was Hansen, who was then assigned with a few others to explore the field of sampling for other possible uses at the Bureau, and went on to work on the 1937 sample Unemployment Census. After working on the sample component of the 1940 decennial census (under the direction of Deming), Hansen worked with others to redesign the unemployment survey based on new ideas on multistage probability samples and cluster sampling (Hansen and Hurwitz 1942, 1943). They expanded and applied their approach in various Bureau surveys, often in collaboration and interaction with others, and this effort culminated in 1953 with the publication of a two-volume compendium of theory and methodology (Hansen et al. 1953a, 1953b). Cochran’s (1939) paper, written in England and independently of the US developments, is especially notable because of its use of the analysis of variance in sampling settings and the introduction of superpopulation and modeling approaches to the analysis of survey data. In the 1940s, as results from these two separate schools appeared in various statistical journals, we see some convergence of ideas and results. The theory of estimation in samples with unequal probabilities of selection also emerged around this time (see Horvitz and Thompson 1952, Hansen et al.
1985) (see Sample Surveys: Methods). Statisticians have continued to develop the theoretical basis of alternative methods of probability sampling and statistical inference from sampling data over the past fifty years (see, e.g., Rao and Bellhouse 1990, Särndal et al. 1992). The issue of statistical inference for models from survey data remains controversial, at least for those trained from a traditional finite sampling perspective.
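Estimation under unequal selection probabilities, cited above to Horvitz and Thompson (1952), is commonly summarized by weighting each sampled value by the inverse of its probability of inclusion. The sketch below shows that weighting for a population total; the function name `horvitz_thompson_total` and the data are hypothetical, and the code illustrates the general idea only, not the derivations in the cited papers.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_i / pi_i over sampled units.

    values:          observed y_i for each sampled unit
    inclusion_probs: pi_i, the probability that unit i entered the sample
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("each sampled value needs an inclusion probability")
    if any(not 0 < p <= 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Invented sample: three units drawn with different inclusion probabilities.
observed = [10.0, 20.0, 30.0]
probs = [0.5, 0.25, 1.0]
total_hat = horvitz_thompson_total(observed, probs)
print(total_hat)
```

A unit sampled with probability 0.25 stands in for four units like itself, which is why its value is multiplied by four; a unit included with certainty represents only itself.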
5. Market Research and Polling
The 1930s, and especially the period after World War II, however, saw a flowering of survey methodology in market research and polling, as well as in the social sciences more broadly. Initial stimulation came from a number of committees at the Social Science Research Council and social scientists such as Hadley Cantril, Paul Lazarsfeld, Rensis Likert, William Ogburn, and Samuel Stouffer (e.g., see Stephan 1948, Converse 1987 for a discussion). Stouffer actually spent several months in England in 1931–32, and learned about sampling and other statistical ideas from Bowley, Karl and Egon Pearson, and R. A. Fisher. He then participated in the Census of Unemployment at the Bureau of the Census.
Market research and polling trace their own prehistory to election straw votes collected by newspapers, dating back at least to the beginning of the nineteenth century. Converse (1987) points out, however, a more serious journalistic base; election polls were taken and published by such reputable magazines as the Literary Digest (which had gained a reputation for accuracy before the 1936 fiasco). Then, as now, election forecasting was taken as the acid test of survey validity. A reputation for accuracy in ‘calling’ elections was thought to spill over to a presumption of accuracy in other, less verifiable areas. There was a parallel tradition in market research, dating back to just before the turn of the twentieth century, attempting to measure consumers’ product preferences and the effectiveness of advertising. It was seen as only a short step from measuring the opinions of potential consumers about products to measuring the opinions of the general public about other objects, either material or conceptual.
By the mid-1930s there were several well-established market research firms. Many of them conducted election polls in 1936 and achieved much greater accuracy than did the Literary Digest. It was the principals of these firms (e.g., Archibald Crossley, George Gallup, and Elmo Roper) who put polling (election, public opinion, and consumer) on the map in the immediate pre-World War II period. The polling and market research surveys of Crossley, Gallup, Roper, and others were based on a sampling method involving ‘quota controls,’ but did not involve random sampling. Stephan (1948) observed the close link between their work and the method of purposive sampling that had gained currency much earlier in government and academic research circles, but which by the 1930s had been supplanted by random sampling techniques.
The 1940s saw a rapid spread of probability sampling methods to a broad array of government agencies. It was, however, only after the fiasco of the 1948 presidential pre-election poll predictions (Mosteller et al. 1949) that market research firms and others shifted towards probability sampling. Even today many organizations use a version of probability sampling with quotas (Sudman 1967).
6. From Sampling Theory to the Study of Nonsampling Error
Amidst the flurry of activity on the theory and practice of probability sampling during the 1940s, attention was being focused on issues of nonresponse and other forms of nonsampling error, such as difficulty in understanding questions or remembering answers (e.g., see Deming 1944). A milestone in this effort to understand and model nonresponse errors was the development of an integrated model for sampling and nonsampling error in censuses and surveys, in connection with planning for and evaluation of the 1950 census (Hansen et al. 1951). This analysis-of-variance-like model, or variants of it, has served as the basis of much of the work on nonsampling error since (see Linear Hypothesis; Nonsampling Errors).
New developments in sampling since the mid-twentieth century have to do less with the design of samples and more with the structuring of the survey instrument and interview. Thus, there has been a move from face-to-face interviewing to telephone interviewing (with the attendant problems connected with random-digit dialing), and then to the use of computers to assist in interviewing. Nonsampling errors continue to be studied broadly, often under the rubric of cognitive aspects of survey design (see Nonsampling Errors; Sample Surveys: Cognitive Aspects of Survey Design).
See also: Censuses: History and Methods; Government Statistics; Sample Surveys: Methods; Sample Surveys: Survey Design Issues and Strategies; Sample Surveys: The Field; Social Survey, History of; Statistical Methods, History of: Post-1900; Statistical Systems: Censuses of Population
Bibliography
Bellhouse D 1988 A brief history of random sampling methods. In: Krishnaiah P, Rao C (eds.) Handbook of Statistics, Vol. 6. North Holland, Amsterdam, pp. 1–14
Bowley A 1913 Working-class households in Reading. Journal of the Royal Statistical Society 76: 672–701
Bowley A 1926 Measurement of the precision attained in sampling. Bulletin of the International Statistical Institute 22(1): 6–62
Cochran W 1939 The use of the analysis of variance in enumeration by sampling. Journal of the American Statistical Association 128: 124–35
Cochran W G 1978 Laplace’s ratio estimator. In: David H (ed.) Contributions to Survey Sampling and Applied Statistics: Papers in Honor of H. O. Hartley. Academic Press, New York, pp. 3–10
Converse J 1987 Survey Research in the United States: Roots and Emergence 1890–1960. University of California Press, Berkeley, CA
Deming W 1944 On errors in surveys. American Sociological Review 19: 359–69
Duncan J, Shelton W 1978 Revolution in United States Government Statistics, 1926–1976. US Government Printing Office, Washington, DC
Duncan O 1984 Notes on Social Measurement. Sage, New York
Edgeworth F Y 1912 On the use of the theory of probabilities in statistics relating to society. Journal of the Royal Statistical Society 76: 165–93
Fienberg S, Tanur J 1995 Reconsidering Neyman on experimentation and sampling: Controversies and fundamental contributions. Probability and Mathematical Statistics 15: 47–60
Fienberg S, Tanur J 1996 Reconsidering the fundamental contributions of Fisher and Neyman on experimentation and sampling. International Statistical Review 64: 237–53
Gini C 1928 Une application de la méthode représentative aux matériaux du dernier recensement de la population italienne (1er décembre 1921). Bulletin of the International Statistical Institute 23(2): 198–215
Gini C, Galvani L 1929 Di una applicazione del metodo rappresentative all’ultimo censimento italiano della popolazione (1m Decembri 1921). Annali di Statistica 6(4): 1–107
Gram J 1883 About calculation of the mass of a forest cover by means of test trees (in Danish). Tidsskrift for Skovbrug 6: 137–98
Hansen M, Dalenius T, Tepping B 1985 The development of sample surveys of finite populations. In: Atkinson A, Fienberg S (eds.) A Celebration of Statistics: The ISI Centenary Volume. Springer Verlag, New York, pp. 327–54
Hansen M, Hurwitz W 1942 Relative efficiencies of various sampling units in population inquiries. Journal of the American Statistical Association 37: 89–94
Hansen M, Hurwitz W 1943 On the theory of sampling from finite populations. Annals of Mathematical Statistics 14: 333–62
Hansen M, Hurwitz W, Madow W 1953a Sample Survey Methods and Theory, Vol. 1. Wiley, New York
Hansen M, Hurwitz W, Madow W 1953b Sample Survey Methods and Theory, Vol. 2. Wiley, New York
Hansen M, Hurwitz W, Marks E, Mauldin W 1951 Response errors in surveys. Journal of the American Statistical Association 46: 147–90
Horvitz D, Thompson D 1952 A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association 47: 663–85
Jenson A 1926 Report on the representative method in statistics. Bulletin of the International Statistical Institute 22: 359–80
Kiaer A 1895/1896 Observation et expériences concernant des dénombrements représentatifs. Bulletin of the International Statistical Institute 9(2): 176–83
Kiaer A 1897 The representative method of statistical surveys (in Norwegian). Christiana Videnskabsselskabets Skrifter. II Historisk-filosofiske 4 [reprinted with an English trans. by Central Bureau of Statistics of Norway, Oslo, 1976]
Kruskal W, Mosteller F 1980 Representative sampling, iv: The history of the concept in statistics, 1895–1939. International Statistical Review 48: 169–95
Laplace P 1814 Essai philosophique sur les probabilités. Dover, New York [trans. of 1840 6th edn. appeared in 1951 as A Philosophical Essay on Probabilities, trans. Truscott F W, Emory F L]
Madansky A 1986 On biblical censuses. Journal of Official Statistics 2: 561–69
Mahalanobis P 1944 On large-scale sample surveys. Philosophical Transactions of the Royal Society of London 231(B): 329–451
Mahalanobis P 1946 Recent experiments in statistical sampling in the Indian Statistical Institute. Journal of the Royal Statistical Society 109: 325–78
Mosteller F, Hyman H, McCarthy P J, Marks E S, Truman D B 1949 The Pre-election Polls of 1948. Social Science Research Council, New York
Neyman J 1934 On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97: 558–625
Neyman J 1952 Lectures and Conferences on Mathematical Statistics and Probability. US Department of Agriculture, Washington, DC [The 1952 edition is an expanded and revised version of the original 1938 mimeographed edition.]
Rao J, Bellhouse D 1990 The history and development of the theoretical foundations of survey based estimation and statistical analysis. Survey Methodology 16: 3–29
Särndal C, Swensson B, Wretman J 1992 Model Assisted Survey Sampling. Springer Verlag, New York
Seneta E 1985 A sketch of the history of survey sampling in Russia. Journal of the Royal Statistical Society, Series A 148: 118–25
Seng Y 1951 Historical survey of the development of sampling theories and practice. Journal of the Royal Statistical Society, Series A 114: 214–31
Splawa-Neyman J 1923 On the application of probability theory to agricultural experiments. Essay on principles (in Polish). Roczniki Nauk Rolniczych Tom (Annals of Agricultural Sciences) X: 1–51 [Sect. 9, translated and edited by D. M. Dabrowska and T. P. Speed, appeared with discussion in Statistical Science (1990) 5: 463–80]
Splawa-Neyman J 1925 Contributions of the theory of small samples drawn from a finite population. Biometrika 17: 472–9 [The note on this republication reads ‘These results with others were originally published in La Revue Mensuelle de Statistique, publ. Par l’Office Central de Statistique de la République Polonaise 1923 tom. vi, 1–29’]
Stephan F 1948 History of the use of modern sampling procedures. Journal of the American Statistical Association 43: 12–39
Stigler S M 1986 The History of Statistics: The Measurement of Uncertainty before 1900. Harvard University Press, Cambridge, MA
Sudman S 1967 Reducing the Costs of Surveys. Aldine, Chicago
Taeuber C 1978 Census. In: Kruskal W H, Tanur J M (eds.) International Encyclopedia of Statistics. Macmillan and the Free Press, New York, pp. 42–6
Tchouproff A A 1918a On the mathematical expectation of the moments of frequency distributions (Chap. 1 and 2). Biometrika 12: 140–69
Tchouproff A A 1918b On the mathematical expectation of the moments of frequency distributions (Chap. 3 and 4). Biometrika 12: 185–210
Tschuprow A A 1923a On the mathematical expectation of the moments of frequency distributions in the case of correlated observations (Chaps. i–iii). Metron 2: 461–93
Tschuprow A A 1923b On the mathematical expectation of the moments of frequency distributions in the case of correlated observations (Chaps. iv–vi). Metron 2: 646–80
Willcox W 1930 Census. In: Seligman E R A, Johnson A (eds.) Encyclopedia of Social Sciences, Vol. 3. Macmillan, New York, pp. 295–300
Yates F 1946 A review of recent developments in sampling and sample surveys (with discussion). Journal of the Royal Statistical Society 109: 12–42
Zarkovic S S 1956 Note on the history of sampling methods in Russia. Journal of the Royal Statistical Society, Series A 119: 336–8
Zarkovich S S 1962 A supplement to ‘note on the history of sampling methods in Russia.’ Journal of the Royal Statistical Society, Series A 125: 580–2
S. E. Fienberg and J. M. Tanur
Sample Surveys: Methods
A survey consists of a number of operations or steps of survey design. The ultimate goal of the survey is to generate estimates of population parameters, based on observations of (a sample of) the units that comprise the population. The design steps can be viewed as links in a chain, and the chain is no stronger than its weakest link (see Survey Sampling: The Field). The steps can be labeled in the following way:
(a) Research objectives are defined, i.e., the subject-matter problem is translated into a statistical problem. Researchers must define the target population they want to study and the concepts they wish to measure. Indicators (variables) of the concepts are chosen and eventually, questions are formulated. Problems in this first step usually result in relevance errors, see Hox (1997).
(b) A frame of the population is developed. The frame could be a list of population units, a map, or even a set of random numbers that could be used to access a population of individuals with telephones. Coverage errors result when frame units do not completely correspond with population units (see Groves 1989).
(c) The mode of administering the survey is chosen. The suitable survey mode depends on budget constraints, topic, and the type of measurement that is being considered. Common modes include face-to-face interviews, telephone interviews, diaries, and administrative records. New modes related to the Internet have recently entered the scene, see Lyberg and Kasprzyk (1991) and Dillman (2000).
(d) The questionnaire is developed. For each survey construct, one or more survey questions are developed. Questionnaire development is complicated since the perception of questions varies among respondents and is sensitive to a number of cognitive phenomena. Effects that are treated in the literature include: question wording, question order, order between response alternatives, context, navigational principles, and the influence of interviewers. Many large survey organizations emphasize this step and have created cognitive laboratories to improve their questionnaire work, see Sudman et al. (1996) and Forsyth and Lessler (1991).
(e) The sampling design specifies the sampling unit, the method for randomly selecting the sample from the frame, and the sample size. The choice depends on the assumed population variability and the costs of sampling different units, see Särndal et al. (1992).
(f) Data are collected, i.e., observations or measurements are made of the sampled units. Typically both random and systematic errors occur at this stage, see de Leeuw and Collins (1997).
(g) Data are processed. This step is needed to make estimation and tabulation possible, given that only raw data exist. The processing includes editing activities where data consistency and completeness are controlled, entry of data (data capture), and coding of variables, see Lyberg and Kasprzyk (1997).
All design steps are interdependent. A decision regarding one step has an impact on other steps. Thus, several iterations of the above steps of the design process may be necessary before the final design is determined. Details of the survey process follow.
1. Populations and Population Parameters
A population is a set of units. For example, a population might be the inhabitants of a country, the households in a city, or the companies in a sector of industry. The ‘units of study’ are the units used as a basis of the analysis, for example individuals, households, or companies. The ‘target population’ is that particular part of the population about which inference is desired. The ‘study variables’ are the characteristics observed on the units of study, for example the age of an individual, the disposable income of a household, or the number of employees in a company. The ‘population parameters’ are characteristics of the population that summarize its features, for example the average disposable income for households in a population. Most often interest centers on finding the population parameters for specific subsets of the population such as geographical regions or age groups. The subsets are called ‘domains of study.’
2. Sampling Units, Frames, and Probability Samples
Frequently, statements about some population parameters are problematic because of a lack of the time and funding necessary for surveying all units in the population. Instead we have to draw a sample, a subset of the population, and base conclusions on that sample. High quality conclusions concerning the parameter values require care when designing the sample. Assessing the quality of the conclusions made can be accomplished by using a probability sampling design, which is a design such that every unit in the population is given a known (nonzero) probability of being included in the sample. Given that knowledge, it is possible to produce estimates of the parameters that are unbiased and of high precision. Moreover it is also possible to estimate the precision of the estimates based on the sample. This is usually in contrast to nonprobability sampling designs where, strictly speaking, the statistical properties of the estimators are unknown unless one is willing to accept distribution assumptions for the variables under study in combination with the sampling design.
Probability sampling requires a frame of sampling units. The units of study in the population can be of different types, such as individuals, households, or companies. Even if the analysis is intended to be based on these units it is often practical or even necessary to group the population units into units that are better suited for sampling. The reason for this can be that, due to imperfect frames, there is no register of the units of study and therefore the units cannot be selected directly. Instead, groups of units can be selected, for example, villages or city blocks, and interviews are conducted with the people in the villages or blocks. Here the sampling unit consists of a cluster of units of study. In many countries, the demarcation of the clusters is based on geographical units identified through areas on maps. Often the clusters are formed to be a suitable workload for an interviewer or enumerator and are called enumeration areas. The enumeration areas are often formed and updated in population censuses. In countries where the telecommunication system is well developed, interviews are often done by telephone. The sampling unit in this case is the cluster of individuals, i.e., the household members that have access to the same telephone.
The frame can take different shapes. For example, a list of N students at a university complete with names and addresses is a frame of students. In this case the sampling units and the units of study are the same. This makes it easy to select a sample of n students by numbering the students from 1 to N and generating n random numbers in a computer. Then we select the students that have the same number as the realized random numbers. Note that in doing so we deliberately introduce randomization through the random number generation.
This is in contrast to methods that do not use randomization techniques but act as if data are independent identically distributed random variables regardless of how the data were obtained.
Often, the sampling units are not the same as the study units. If, for example, the units of study are individuals in a country and the target population consists of all individuals living in the country on a specific date, there is seldom a list or data file that contains the individuals. Also the cost of making such a list would be prohibitive. Using a sequence of frames containing a sequence of sampling units of decreasing sizes can improve this situation. For example, if there exists a list of villages in the country, then this is a frame of primary sampling units. From a sample of villages, a list of all households within the villages selected can be made. This is a frame of secondary sampling units. Note that there is a need for making the list of households only for the villages selected. Using this list, some of the households may be selected for interviews. Thus, a sequence of frames containing a sequence of sampling units has been constructed in a way that permits selection of a probability sample of units of study without enumerating all the individuals in the country.
The existence of, or the possibility of creating, reliable frames is a prerequisite for making efficient sampling designs. Often the frame situation will govern the sampling procedure. Ideally the frame should cover the target population and only that population. It should provide the information necessary for contacting the units of study such as names, addresses, telephone numbers, or the like. If there is auxiliary information about the units, it is very practical if it is included in the frame. Auxiliary information is useful for constructing efficient sampling designs.
A specific type of frame is used in ‘Random Digit Dialing’ (RDD). When telephone numbers are selected, a common technique is to select numbers by generating them at random in a computer. The frame is the set of numbers that could be generated in the computer. The selection probability of the cluster is proportional to the number of telephones to which the household members have access.
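The two selection procedures described in this section, drawing n units from a numbered list frame and selecting households through a sequence of frames, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the seed parameter, and the list_households callback (which stands in for the field operation of listing households in a selected village) are hypothetical and do not come from the article.

```python
import random


def srs_from_list_frame(frame, n, seed=None):
    """Select n units from a list frame by simple random sampling without replacement."""
    rng = random.Random(seed)
    N = len(frame)
    # Number the units 1..N and generate n distinct random numbers.
    drawn = rng.sample(range(1, N + 1), n)
    # Keep the units whose number matches a realized random number.
    return [frame[k - 1] for k in sorted(drawn)]


def two_stage_sample(villages, n_villages, n_households, list_households, seed=None):
    """Two-stage selection: villages first, then households within selected villages.

    list_households is a hypothetical callback that builds the household
    frame for one village; it is called only for the villages selected.
    """
    rng = random.Random(seed)
    selected_villages = rng.sample(villages, n_villages)  # primary sampling units
    sample = {}
    for village in selected_villages:
        households = list_households(village)             # secondary frame
        m = min(n_households, len(households))
        sample[village] = rng.sample(households, m)       # secondary sampling units
    return sample
```

The point of the second function is that the household frame is constructed only where it is needed, which is what makes the sequence of frames cheaper than enumerating the whole country.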
3. Sampling Methods and Estimators
The overall goal is to be able to make (high) quality estimates of the parameters as cheaply as possible. This is implemented using a ‘sampling strategy,’ i.e., a combination of sampling design (a certain combination of sampling methods) and an ‘estimator,’ which is as efficient as possible, i.e., gives the highest precision for a given cost. There exists a variety of sampling methods that can be applied in different situations depending on the circumstances. The most frequently employed sampling methods are:
(a) Simple Random Sampling (SRS). Every unit in the population is given an equal chance of being included in the sample.
(b) Systematic Sampling. Every kth unit in the frame is included in the sample, starting from a randomly selected starting point. The sampling step k is chosen so that the sample has a predetermined size.
(c) Stratified Sampling. The units are grouped together in homogeneous groups (strata) and then sampled, for example, using SRS within each group. Stratified sampling is used either for efficiency reasons or to ensure a certain sample size in some domains of study, such as geographical regions or specific age groups. It is necessary to know the values of a stratification variable in advance to be able to create the strata. Stratification is efficient if the stratification variable is related to the variables under study. If the number of units that are being selected within each stratum is proportional to the number of units in the stratum, then this is called proportional allocation. Usually this will give fairly efficient estimates. It can be improved upon using optimal allocation, which requires knowledge of the variability of the study variables or correlated variables within each stratum. In many cases the improvement is small compared to proportional allocation.
(d) Unequal Probability Sampling. This method can be employed either for efficiency reasons or out of sheer necessity. Often the probability of selecting a cluster of units is (or is made to be) proportional to the number of units in the cluster. This is called selecting with probability proportional to size (PPS), with size in this case equal to the number of units. The measure of size can vary. The efficiency of the estimator will increase if the measure of size is related to the study variables.
(e) Multistage Sampling. This technique is typically used in connection with frame problems.
The above-mentioned sampling methods can be combined to produce good sampling designs. For example, the following is an example of a ‘master sample design’ that is used in some countries. Suppose that the object is to conduct a series of sample surveys concerning employment status, living conditions, and household expenditures. Also suppose that based on a previous census there exist enumeration areas (EAs) for the country, which are reasonably well updated so that approximations of the population sizes are available for each EA. Suppose that it is deemed appropriate to stratify the EAs according to geographical regions both because the government wants these regions as domains of study and because there is reason to believe that the consumption pattern is different within each region. Thus, the stratification of the EAs would serve both purposes, namely controlling the sample sizes for domains of study and presumably being more efficient than SRS. A number of EAs are selected within each stratum. The number of EAs to be selected is chosen to be proportional to the aggregated population figures in the regions according to the most recent census. The EAs are the primary sampling units.
If the EAs vary a lot in size it could be efficient to select the EAs with PPS; otherwise SRS could be used. The EAs so selected constitute the master sample of primary sampling units that will be kept the same for a number of years. The actual number of years will depend on the migration rate within the regions. From this master sample a number of households are selected each time a new survey is to be conducted. When selecting the households, it is possible to update the EAs again, i.e., to make a new list of the households living in the area and then select, with systematic sampling or SRS, a number of households to be interviewed. The households are the secondary sampling units. Often the sampling fractions in the second stage are determined in such a way that the resulting inflation factor is the same for all households. This is called a ‘self-weighting design.’ In principle, a self-weighting design makes it possible to calculate the estimates and the precision of the estimates without a computer.
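Three of the sampling methods listed in this section can be sketched in a few lines of Python. The sketch is illustrative only and rests on assumptions that the article does not state: the function names are hypothetical, the systematic sampler uses a fractional step so that the sample size is exactly n, the proportional allocation resolves rounding with a largest-remainder rule, and the PPS draw is made with replacement for simplicity.

```python
import random


def systematic_sample(frame, n, rng):
    """Systematic sampling: every k-th unit after a random start."""
    N = len(frame)
    k = N / n                 # sampling step; a fractional step keeps the size at exactly n
    start = rng.random() * k  # random starting point in [0, k)
    picks = [min(int(start + j * k), N - 1) for j in range(n)]
    return [frame[p] for p in picks]


def proportional_allocation(stratum_sizes, n):
    """Allocate n sample units to strata in proportion to the stratum sizes."""
    N = sum(stratum_sizes.values())
    exact = {h: n * N_h / N for h, N_h in stratum_sizes.items()}
    alloc = {h: int(v) for h, v in exact.items()}
    # Hand out the units lost by rounding down, largest remainder first.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc


def pps_with_replacement(clusters, sizes, m, rng):
    """Draw m clusters with probability proportional to size, with replacement."""
    return rng.choices(clusters, weights=sizes, k=m)
```

For instance, with hypothetical regional stratum sizes of 500, 300, and 200 enumeration areas and a sample of 50, proportional_allocation assigns 25, 15, and 10 areas respectively.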
4. Estimators, Auxiliary Information, and Sample Weights
To each sampling design there corresponds at least one estimator, which is a function of the sample data used for making statements about the parameters. The form of the estimator depends on the sampling design used and also on whether auxiliary information is included in the function. In survey sampling, some auxiliary information is almost always available, i.e., values of concomitant variables known for all units in the population. If this information is correlated with the study variables, it can be used to increase the efficiency of the sampling strategy either by incorporating the information in the sampling method, as in stratification, or in the probability of including the unit in the sample, as in PPS, or in adjusting the estimator. A very general form of an estimator is the so-called generalized regression estimator (Cassel et al. 1976), which for estimating the population total takes the form
$$t_{GR} = t_y + \hat{\beta}\,(T_x - t_x)$$
where $x$ denotes the values of the auxiliary variable, $T_x$ the known value of the population total of $x$, $\hat{\beta}$ is the regression coefficient estimated from the sample between $x$ and the study variable $y$, and
$$t_x = \sum_{i=1}^{n} \frac{x_i}{\alpha_i}$$
is the so-called Horvitz–Thompson estimator of $T_x$, with $\alpha_i$ the inclusion probability of unit $i$. The function $t_y$ is the corresponding estimator of the unknown value of the parameter, here the population total of $y$. The motivation for using this estimator is based on the conviction that the study variable and the auxiliary variable are linearly related. Thus, this can be seen as one example of the use of models in survey sampling. If the inclusion probabilities are the same for all units, i.e., if $\alpha_i = n/N$, then the Horvitz–Thompson estimator becomes the expanded sample mean, $N\bar{y}_s$, which is an unbiased estimator of the population total under the SRS scheme if the expectation is taken with respect to the design properties.
The sample weights are the inflation factors used in the estimator for inflating the sample data. They are usually a function of the inverted values of the inclusion probabilities. In the Horvitz–Thompson estimator, the sample weights are $\alpha_i^{-1}$. For the expanded sample mean, the sample weights become $N/n$. As was noted earlier, for each sampling design there exists an estimator that is natural, i.e., it is unbiased in the design sense. For example, in the case of multistage probability sampling, the sampling weights are functions of the different selection probabilities in each selection step. If the selection probabilities are carefully chosen they may form a weight that is constant for all units in the sample, and the value taken by the estimator can be calculated simply by summing the values of the units in the sample and multiplying the sum by the constant. This is called a ‘self-weighted estimator’ (and design). This technique was important before the breakthrough of the personal computer, because it saved a lot of manual labor, but with today’s powerful PCs it has lost much of its advantage.
The generalized regression estimator can be rewritten in the form:
$$t_{GR} = \sum_{i=1}^{n} w_i y_i$$
where $w_i$ are the sample weights pertinent to unit $i$ for this estimator. The ‘calibration estimator’ is the form of the regression estimator where the sample weights have been adjusted so that the weighted sample total of $x$ reproduces the known population total, $\sum_{i=1}^{n} w_i x_i = T_x$. This adjustment causes a small systematic error, which decreases when the sample size increases.
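As a rough illustration of the two estimators in this section, the Python sketch below computes a Horvitz–Thompson total and a generalized regression total with the correction term written as the slope times (T_x - t_x). Several choices are assumptions made here rather than taken from the article: the function names are hypothetical, and the slope is fitted by least squares weighted with the inverse inclusion probabilities in a model with an intercept, which is one common choice among several.

```python
def horvitz_thompson_total(values, incl_probs):
    """Horvitz-Thompson estimator of a population total: sum of value / inclusion probability."""
    return sum(v / a for v, a in zip(values, incl_probs))


def greg_total(y, x, incl_probs, T_x):
    """Generalized regression estimator t_y + beta_hat * (T_x - t_x).

    Assumption: beta_hat is a least-squares slope weighted by 1 / alpha_i.
    """
    t_y = horvitz_thompson_total(y, incl_probs)
    t_x = horvitz_thompson_total(x, incl_probs)
    w = [1.0 / a for a in incl_probs]           # design weights
    n_hat = sum(w)                              # estimated population size
    x_bar, y_bar = t_x / n_hat, t_y / n_hat     # weighted means
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    beta_hat = s_xy / s_xx
    # Shift the HT estimate of y by the known error in the HT estimate of x.
    return t_y + beta_hat * (T_x - t_x)
```

If y is exactly proportional to x in the sample, the regression estimator returns the same multiple of the known total T_x whatever sample was drawn, which is the sense in which a strong linear relation between the study variable and the auxiliary variable improves precision.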
5. Assessing the Quality of the Estimates: Precision
The quality of the statements depends, among other things, on the precision of the estimates. Probability sampling supports measurement of the sampling error, since a randomly selected subset of the population is the basis for the estimates. The calculation takes into consideration the randomization induced by the sampling design. Because the design might be complex, the calculation of the precision might be different from what is traditionally used in statistics. For example, the variance of the sample mean $\bar{y}_s$ for estimating the population mean $\bar{y}$ under SRS with replacement is $S^2/n$ (up to the factor $(N-1)/N$, which is close to 1 for large populations), where $S^2$ is the population variance,
$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{y})^2$$
and $n$ is the sample size. This formula is similar to that used in traditional statistical analysis. However, for the Horvitz–Thompson estimator under a design with fixed sample size, the variance is:
$$V(t_y) = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\left(\alpha_i\alpha_j - \alpha_{ij}\right)\left(\frac{y_i}{\alpha_i} - \frac{y_j}{\alpha_j}\right)^2$$
where $\alpha_{ij}$ is the joint inclusion probability of units $i$ and $j$. An unbiased estimator of this variance sums the same terms over the pairs of units in the sample, each divided by $\alpha_{ij}$. As can be seen, the calculation of the variance becomes complicated for complex sampling designs that have a large number of selection stages and unequal selection probabilities in each step. This complexity becomes even more pronounced when estimating population parameters in subgroups where the means are ratios of random variables. However, some shortcut methods have been developed: ultimate cluster techniques, Taylor series, jack-knifing, and replications (see Wolter 1985).
It is also evident from the formula for the variance that the variance of the Horvitz–Thompson estimator depends on the relation between the values of the study variable $y$ and the inclusion probability $\alpha$. If the inclusion probability can be made to be approximately proportional to the values of the study variable, then the variance becomes small. That is one reason why much emphasis is put on the existence of auxiliary information in survey sampling. However, as was shown by Godambe (1955), there does not exist a uniformly minimum variance unbiased estimator in survey sampling when inference is restricted to design-based inference.
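A small numerical check can make the variance of the Horvitz–Thompson estimator concrete. The Python sketch below assumes the pairwise (Sen–Yates–Grundy) form of the variance for fixed-size designs, that is, half the double sum over the population of (alpha_i alpha_j - alpha_ij) times the squared difference of y_i/alpha_i and y_j/alpha_j, and compares it with the variance obtained by enumerating every possible sample under SRS without replacement. The function names and the toy population are hypothetical; exact fractions are used so that the comparison does not depend on rounding.

```python
from fractions import Fraction
from itertools import combinations


def pairwise_ht_variance(y, alpha, alpha_joint):
    """Variance of the HT total in its pairwise form (fixed-size design assumed)."""
    N = len(y)
    total = Fraction(0)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue  # the i == j terms vanish because the squared difference is zero
            diff = Fraction(y[i]) / alpha[i] - Fraction(y[j]) / alpha[j]
            total += (alpha[i] * alpha[j] - alpha_joint(i, j)) * diff ** 2
    return total / 2


def srs_variance_by_enumeration(y, n):
    """Design variance of the expanded sample mean over all SRS samples of size n."""
    N = len(y)
    estimates = [Fraction(N, n) * sum(s) for s in combinations(y, n)]
    mean = sum(estimates) / len(estimates)
    return sum((e - mean) ** 2 for e in estimates) / len(estimates)


# Toy population (hypothetical values) and SRS without replacement, n = 2 of N = 5.
y = [1, 2, 4, 7, 11]
N, n = len(y), 2
alpha = [Fraction(n, N)] * N               # first-order inclusion probabilities
joint = Fraction(n * (n - 1), N * (N - 1))  # second-order inclusion probabilities
v_formula = pairwise_ht_variance(y, alpha, lambda i, j: joint)
v_enumerated = srs_variance_by_enumeration(y, n)
```

Under these assumptions the two quantities agree, and both equal the familiar closed form for SRS without replacement, N squared times (1 - n/N) times S squared over n.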
6. Assessing the Quality of the Estimates: Total Survey Error

The total survey error can be measured by the mean squared error (MSE), which is the sum of the variance and the squared bias of the estimate. Regular formulas for error estimation do not take all MSE components into account but rather the precision components mentioned above. Basically, a variance formula includes random variation induced by the sampling procedure and the random response variation. Components such as correlated variances induced by interviewers, coders, editors, and others have to be estimated separately and added to the sampling and response variance. The same goes for systematic errors that contribute to the bias. They have to be estimated separately and added, so that a proper estimate of MSE can be obtained. Often the survey quality is visualized by a confidence interval, based on the estimated precision. Obviously such an interval might be too short because it has not taken the total error into account.

Some of the sources of correlated variances and biases can be traced to: (a) respondents who might have a tendency to, for instance, under-report socially undesirable behaviors; (b) interviewers who might systematically reformulate certain questions and do so in an interviewer-specific fashion; (c) respondents who do not participate because they cannot be contacted or they refuse; (d) incomplete frames resulting in, e.g., undercoverage; and (e) coders who might introduce biased measurements if they tend to erroneously code certain variable descriptions. There are of course many other possibilities for error to occur in the estimates.

There are basically two ways of assessing the quality of estimates: (a) the components of MSE can be estimated, which is a costly and time-consuming operation; and (b) a modeling approach can be used, in which it might be possible to include, e.g., nonresponse errors and coverage errors in the assessment formulas. The success of the modeling approach depends on the realism of the modeling of the error mechanisms involved.

One should bear in mind, however, that for each survey step there are methods designed to keep the errors small. Systematic use of such known dependable methods decreases the need for evaluation studies or heavy reliance on modeling.

See also: Sample Surveys: Nonprobability Sampling; Sample Surveys: Survey Design Issues and Strategies

Bibliography
Cassel C M, Särndal C E, Wretman J H 1976 Some results on generalized difference and generalized regression estimation for finite populations. Biometrika 63: 615–20
De Leeuw E, Collins M 1997 Data collection methods and survey quality: An overview. In: Lyberg L et al. (eds.) Survey Measurement and Process Quality. Wiley, New York
Dillman D 2000 Mail and Internet Surveys. Wiley, New York
Forsyth B, Lessler J 1991 Cognitive laboratory methods: A taxonomy. In: Biemer P et al. (eds.) Measurement Errors in Surveys. Wiley, New York
Godambe V P 1955 A unified theory of sampling from finite populations. Journal of the Royal Statistical Society, Series B 17: 269–78
Groves R 1989 Survey Errors and Survey Costs. Wiley
Hox J 1997 From theoretical concept to survey question. In: Lyberg L et al. (eds.) Survey Measurement and Process Quality. Wiley, New York, pp. 47–70
Lyberg L, Kasprzyk D 1991 Data collection methods and measurement error: An overview. In: Biemer P et al. (eds.) Measurement Errors in Surveys. Wiley
Lyberg L, Kasprzyk D 1997 Some aspects of post-survey processing. In: Lyberg L et al. (eds.) Survey Measurement and Process Quality. Wiley
Särndal C E, Swensson B, Wretman J 1992 Model Assisted Survey Sampling. Springer-Verlag, New York
Sudman S, Bradburn N, Schwarz N 1996 Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. Jossey-Bass, San Francisco
Wolter K 1985 Introduction to Variance Estimation. Springer-Verlag
C. M. Cassel and L. Lyberg
Sample Surveys: Model-based Approaches

The theory for sample surveys has been developed using two theoretical frameworks: design-based and model-based. The design-based approach uses the probabilities with which units are selected for the sample for inference; in the model-based approach, the investigator hypothesizes a joint probability distribution for elements in the finite population, and uses that probability distribution for inference. In this article, the two approaches are described and compared, and guidelines are given for when, and how, one should perform a model-based analysis of survey data. The use of models for survey design is discussed briefly.
1. Inference in Sample Surveys

How does one generalize from individuals in a sample to those not observed? The problem of induction from a sample was much debated by philosophers, social scientists, and mathematicians of the eighteenth and nineteenth centuries, including Immanuel Kant, Charles Peirce, John Venn, and Adolphe Quetelet. In the early years of the twentieth century, many investigators resisted the idea of using survey samples rather than censuses for ‘serious statistics’ because of inference issues (see Sample Surveys, History of). The debates involving official uses of sample surveys in these years resulted in the development of two philosophical frameworks for inference from a sample: design-based inference and model-based inference.

In design-based inference, first expounded systematically by Neyman (1934), the sample design provides the mechanism for inferences about the population. Suppose that a without-replacement probability sample (see Sample Surveys: Methods) of $n$ units is to be taken from a population of $N$ units. A random variable $Z_i$ is associated with the $i$th unit in the population; $Z_i = 1$ if the unit is selected for inclusion in the sample, and $Z_i = 0$ if the unit is not selected. The joint probability distribution of $\{Z_1, \ldots, Z_N\}$ is used for inference statements such as confidence intervals (see Estimation: Point and Interval). The quantity being measured on unit $i$, $y_i$, is irrelevant for inference in the design-based approach. Whether $y_i$ is household income, years of piano lessons, or number of cockroaches in the kitchen, properties of estimators depend exclusively on properties of the random variables $\{Z_1, \ldots, Z_N\}$ that describe the probability sampling design. The Horvitz–Thompson (1952) estimator,

$$\sum_{i=1}^{N} Z_i y_i / \pi_i \qquad (1)$$

where $\pi_i = P(Z_i = 1)$, is an unbiased estimator of the population total $\sum_{i=1}^{N} y_i$; the variance of the Horvitz–Thompson estimator,

$$\sum_{i=1}^{N} \sum_{j=1}^{N} y_i y_j (\pi_i \pi_j)^{-1} \operatorname{Cov}(Z_i, Z_j),$$

depends on the covariance structure of $\{Z_1, \ldots, Z_N\}$.

The design-based approach differs from the inferential framework used in most other areas of statistics. There, $y_i$ is the observed value of a random variable $Y_i$; the joint probability distribution of the $Y_i$’s and proposed stochastic models allow inferential statements to be made. Following questions raised by Godambe (1955) about optimal estimation and survey inference, Brewer (1963) and Royall (1970) suggested that the same model-based frameworks used in other areas of statistics also be used in finite population sampling. Thompson (1997) summarized various approaches to prediction using models.

In the model-based approach to inference, the finite population values are assumed to be generated from a stochastic model. Regression models (see Linear Hypothesis: Regression (Basics)) are often adopted for this purpose: if covariates $x_{i1}, x_{i2}, \ldots, x_{ip}$ are known for every unit in the population, a possible model is

$$Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i \qquad (2)$$
where the $\varepsilon_i$’s are random variables with mean zero and specified covariance structure. The parameters $\beta_j$ for $j = 0, \ldots, p$ may be estimated by standard techniques such as generalized least squares, and the regression equation used to predict the value of $y$ for units not in the sample. Then the finite population total is estimated by summing the observed values of the $y_i$’s for units in the sample and the predicted values of the response for units not in the sample.

To illustrate the two approaches, consider a hypothetical survey taken to study mental health status in a population of 10,000 women over age 65. A stratified random sample of 100 urban women and 100 rural women is drawn from a population of 6,000 urban women and 4,000 rural women over age 65. One response of interest is the score on a depression inventory. Let $\bar{y}_U$ denote the sample mean score for the urban women and let $\bar{y}_R$ denote the sample mean score for the rural women.

Under the design-based approach as exposited in Cochran (1977), $\pi_i = 1/60$ if person $i$ is in an urban area and $\pi_i = 1/40$ if person $i$ is in a rural area. Every urban woman in the sample represents herself and 59 other urban women who are in the population but not in the sample; every sampled rural woman represents herself plus 39 other rural women. The Horvitz–Thompson estimator for the mean depression score for the population is $N^{-1} \sum_{i=1}^{N} Z_i y_i / \pi_i$; here, the estimated population mean score under the stratified random sampling design is $0.6\bar{y}_U + 0.4\bar{y}_R$. A 95 percent confidence interval (CI) for the mean refers to the finite population: if a CI were calculated for each possible sample that could be generated using the sampling design, the selection probabilities of the samples whose CIs include the true value of the population mean depression score sum to 0.95. Inference refers to repeated sampling from the population, not to the particular sample drawn.

Any number of stochastic models might be considered in a model-based approach.
Consider first a special case of the regression model in Eqn. (2), and assume that the $\varepsilon_i$’s are independent and normally distributed with constant variance. Model 1 has the form $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $x_i = 1$ if person $i$ is an urban resident and $x_i = 0$ if person $i$ is a rural resident. The stochastic model provides a link between sampled and unsampled units: women who are not in the sample are assumed to have the same mean depression score as women with the same urban/rural status who are in the sample. Under Model 1, $\hat{\beta}_0 = \bar{y}_R$ and $\hat{\beta}_1 = \bar{y}_U - \bar{y}_R$ are the least squares estimates of $\beta_0$ and $\beta_1$. Thus the predicted value of $y$ for each urban woman is $\bar{y}_U$, and the predicted value of $y$ for each rural woman is $\bar{y}_R$. The model-based estimate of mean depression in this population is thus

$$\frac{1}{10{,}000} \left[ 100\bar{y}_U + 100\bar{y}_R + 5{,}900\bar{y}_U + 3{,}900\bar{y}_R \right] = 0.6\bar{y}_U + 0.4\bar{y}_R .$$

With Model 1, the point and interval estimates of the finite population mean are the same as for the design-based approach. The 95 percent confidence interval is interpreted differently, however; 95 percent of CIs from samples with the same values of $x_i$ that could be generated from the model are expected to contain the true value of $\beta_0 + 0.6\beta_1$.

In this example, model-based inference with Model 1 accords with the results from design-based inference. But suppose that the model adopted is Model 2: $Y_i = \mu + \varepsilon_i$. Then the predicted value of the response for all women in the finite population is $\bar{y} = (\bar{y}_U + \bar{y}_R)/2$, and the mean depression score for the finite population is also estimated by $\bar{y}$. If depression is higher among rural women than among urban women, the estimate $\bar{y}$ from Model 2 will likely overestimate the true mean depression score in the population of 10,000 women because rural women are not proportionately represented in the sample.

In the model-based approach, inference is not limited to the 10,000 persons from whom the sample is drawn but applies to any set of persons for whom the model is appropriate.
Random selection of units is not required for inference as long as the model assumptions hold. If the model is comprehensive enough to include all important information, as in Model 1 above, then the sampling design is irrelevant and can be disregarded for inference.

The two types of inference differ in the conception of randomness. Design-based inference relies on the randomness involved in sample selection; it relates to the actual finite population, and additional assumptions are needed to extend the inferences to other possible populations. Model-based inference is conditional on the selected sample; the randomness is built into the model, and the model assumptions are used to make inferences about units not in the sample. Design-based inference depends on other possible samples that could have been selected from the finite population but were not, while model-based inference depends on other possible populations that could have been generated under the model but were not.

The approaches are not completely separated in practice. Rao (1997) summarized a conditional design-based approach to inference, in which inference is restricted to a subset of possible samples. Särndal et al. (1992) advocated a model-assisted approach, in which a population model inspires the choice of estimator, but inference is based on the sampling design. In the depression example, a model-assisted estimator could incorporate auxiliary information such as race, ethnicity, and marital status through a model of the form in Eqn. (2); however, the stratified random sampling design is used to calculate estimates and standard errors.
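The three point estimates in the depression example (design-based, Model 1, and Model 2) can be reproduced with a few lines of arithmetic. The following is a minimal sketch: the stratum and sample sizes are those of the example, but the two sample means and the function names are hypothetical values chosen only for illustration.

```python
# Stratum sizes and sample sizes from the depression example.
N_URBAN, N_RURAL = 6000, 4000
n_urban, n_rural = 100, 100
N = N_URBAN + N_RURAL

# Hypothetical sample means (not from the article); rural scores are higher.
ybar_urban, ybar_rural = 10.0, 14.0


def design_based_mean():
    """Horvitz-Thompson mean: each sampled unit is weighted by 1 / pi_i."""
    weight_urban = N_URBAN / n_urban   # 60, i.e. pi_i = 1/60
    weight_rural = N_RURAL / n_rural   # 40, i.e. pi_i = 1/40
    total = (weight_urban * n_urban * ybar_urban
             + weight_rural * n_rural * ybar_rural)
    return total / N


def model1_mean():
    """Model 1: observed values plus stratum-specific predictions for unsampled units."""
    observed = n_urban * ybar_urban + n_rural * ybar_rural
    predicted = ((N_URBAN - n_urban) * ybar_urban
                 + (N_RURAL - n_rural) * ybar_rural)
    return (observed + predicted) / N


def model2_mean():
    """Model 2: a single common mean, so every unit is predicted by the pooled sample mean."""
    return (n_urban * ybar_urban + n_rural * ybar_rural) / (n_urban + n_rural)


print(design_based_mean(), model1_mean(), model2_mean())
```

With these illustrative means, the design-based and Model 1 estimates agree, while Model 2 gives a larger value because rural women make up half of the sample but only 40 percent of the population.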
2. Models in Descriptive and Analytic Uses of Surveys

Estimating a finite population mean or total is an example of a descriptive use of a survey: the characteristics of a particular finite population are of interest. In much social science research, survey data are used for analytic purposes: investigating relationships between factors and testing sociological theories. Data from the US National Crime Victimization Survey may be used to estimate the robbery rate in 1999 (descriptive), or they may be used to investigate a hypothesized relationship between routine activities and likelihood of victimization (analytic). In the former case, the population of inference is definite and conceivably measurable through a census. In the latter, the population of inference is conceptual; the investigator may well be interested in predicting the likelihood of victimization of a future person with given demographic and routine activity variables.

Smith (1994) argued that design-based inference is the appropriate paradigm for official descriptive statistics based on probability samples. Part of his justification for this position was the work of Hansen et al. (1983), who provided an example in which small deviations from an assumed model led to large biases in inference. Brewer (1999) summarized work on design-based and model-based estimation for estimating population totals and concluded that a model-assisted generalized regression estimator (see Särndal et al. 1992), used with design-based inference, captures the best features of both approaches. Models must of course always be used for inference in nonprobability samples (see Sample Surveys: Nonprobability Sampling); they may also be desirable in probability samples that are too small to allow the central limit theorem to be applied for inference.

Lohr (1999, Chap. 11) distinguished between obtaining descriptive official statistics and uncovering a ‘universal truth’ in an analytic use of a survey.
Returning to the depression example, the investigator might be interested in the relationship between depression score ($y$) and variables such as marital status, financial resources, number of chronic health problems, and ability to care for oneself. In this case, the investigator would be interested in testing a theory that would be assumed to hold not just for the particular population of 10,000 women but for other populations as well, and should be making inferential statements about the $\beta$s in model (2). The quantity for inference in the design-based setting is $b_p$, the least squares estimate of $\beta$ that would be obtained if the $x_i$s and $y_i$s were known for all 10,000 persons in the finite population. The quantity $b_p$ would rarely be of primary interest to the investigator, though, since it is merely a summary statistic for this particular finite population. In social research, models are generally motivated by theories, and a model-based analysis allows these theories to be tested empirically.

The generalized least squares estimator of $\beta$, $\hat{\beta}_{LS}$, would be the estimator of choice under a pure model-based approach because of its optimality properties under the proposed model. This estimator is, however, sensitive to model misspecification. An alternative, which achieves a degree of robustness to the model at the expense of a possibly higher variance, is to use the design-based estimator of $b_p$. If the proposed stochastic model is indeed generating the finite population and if certain regularity conditions are met, an estimator that is consistent for estimating $b_p$ will also be consistent for estimating $\beta$. Under this scenario, a design-based estimate for $b_p$ also estimates the quantity of primary interest $\beta$ and has the advantage of being less sensitive to model misspecification.

Regardless of philosophical differences on other matters of inference, it is generally agreed that two aspects of descriptive statistics require the use of models.
All methods currently used to adjust for nonresponse (see Nonsampling Errors) employ models to relate nonrespondents to respondents, although the models are not necessarily testable. In small area estimation, sample sizes in some subpopulations of interest are too small to allow estimates of sufficient precision; models are used to relate such subpopulations to similar subpopulations and to useful covariates.
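The contrast drawn in this section between the model-based least squares estimator and a design-based estimator of $b_p$ can be made concrete with a small numerical sketch. It assumes a single covariate and independent errors with constant variance, so that generalized least squares reduces to ordinary least squares, and it takes the design-based fit to be least squares weighted by $1/\pi_i$. The data values, the inclusion probabilities, and the function name `weighted_line_fit` are hypothetical and not from the article.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sum_w = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    s_xx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    return y_mean - slope * x_mean, slope


# Hypothetical sample: covariate, response, and inclusion probabilities pi_i.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
pi = [0.5, 0.5, 0.25, 0.1, 0.1]

# Model-based fit: every sampled unit counts equally.
model_fit = weighted_line_fit(x, y, [1.0] * len(x))

# Design-based fit: each sampled unit is weighted by 1 / pi_i, so that the
# weighted sums estimate the corresponding finite population sums behind b_p.
design_fit = weighted_line_fit(x, y, [1.0 / p for p in pi])

print("model-based  (b0, b1):", model_fit)
print("design-based (b0, b1):", design_fit)
```

When the data lie exactly on a line, the two fits coincide whatever the weights; a large gap between them in real data is the kind of discrepancy that may point to information in the design that the model does not capture.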
3. Models for Small Area Estimation

In small area estimation, a model is used to estimate the response in subpopulations with few or no sample observations. As an example, the US Current Population Survey (CPS) provides accurate statistics about income and poverty for the nation as a whole. It was not designed, though, to provide accurate estimates in domains such as states, counties, or school districts; the sample would have to be prohibitively large in order to provide precise estimates of poverty for every county in the USA. These domains are called small areas; the term ‘small’ does not refer to the size of the area or the population, but to the fact that the sample size in the domain is small or may even be zero.

Consider the states to be the small areas, and let $y_k$ be the proportion of school-age children who are poor in state $k$. The direct estimate $\bar{y}_k$ of $y_k$ is calculated using data exclusively from the CPS, and $\hat{V}(\bar{y}_k)$ is an estimate of the variance of $\bar{y}_k$. Since in some states $\hat{V}(\bar{y}_k)$ is unacceptably large, the current practice for estimating poverty at the state level (see National Research Council, 2000, p. 49) uses auxiliary information from tax returns, food stamp programs, and the decennial census to supplement the data from the CPS. A regression model for predicting $y_k$ using auxiliary information gives predicted values

$$\hat{y}_k = \hat{\beta}_0 + \sum_{j=1}^{4} \hat{\beta}_j x_{jk}$$

where the $x_{jk}$s represent covariates for state $k$ (e.g., $x_{1k}$ is the proportion of child exemptions reported by families in poverty in state $k$, and $x_{2k}$ is the proportion of people receiving food stamps in state $k$). The predicted value $\hat{y}_k$ from the regression equation is combined with the direct estimate $\bar{y}_k$ from the CPS according to the relative amounts of information present in each: the small area estimate for state $k$ is

$$\tilde{y}_k = \gamma_k \bar{y}_k + (1 - \gamma_k) \hat{y}_k$$

where $\gamma_k$ is determined by the relative precision of $\bar{y}_k$ and $\hat{y}_k$. If the direct estimate is precise for a state, i.e., $\hat{V}(\bar{y}_k)$ is small, then $\gamma_k$ is close to one and the small area estimate $\tilde{y}_k$ relies mostly on the direct estimate. Conversely, if the CPS contains little information about state $k$’s poverty rate, then $\gamma_k$ is close to zero and $\tilde{y}_k$ relies mostly on the predicted value from the regression model.

The small area model allows the estimator for area $k$ to ‘borrow strength’ from other areas and incorporate auxiliary information from administrative data or other sources. Ghosh and Rao (1994) and Rao (1999) review properties of this model and other models used in small area estimation.

4. Performing a Model-based Analysis

The first step in a model-based analysis for either descriptive or analytical use is to propose and fit a model to the data. Dependence among units, such as dependence among children in the same school, can be treated using hierarchical linear models or other methods discussed in Skinner et al. (1989). The biggest concern in a model-based analysis, as pointed out by Hansen et al. (1983), is that the model may be misspecified. Many, but not all, of the assumptions implicit in a model can be checked using the sample data. Appropriate plots of the data provide some graphical checks of model adequacy and correctness of the assumed variance structure, as described in Lohr (1999). These assumptions can also be partially checked by performing hypothesis tests of nested models, and by fitting alternative models to the data. In the depression example, plotting the data separately for rural and urban residents would reveal inadequacy of Model 2 relative to Model 1.

Another method that can sometimes detect model inadequacy is comparison of design-based and model-based estimates of model parameters. As mentioned in Sect. 2, if the model is correct for units in the finite population, then the design-based estimates and the model-based estimates should both be consistent for the model parameters. A substantial difference in the estimates could indicate that the sample design contains information not captured in the model, and that perhaps more covariates are needed in the model.

One crucial assumption that cannot be checked using sample data is that the model describes units not in the sample. This assumption is especially important in nonprobability samples and in use of models for nonresponse adjustment or small area estimation.
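The composite small area estimate of Sect. 3, $\tilde{y}_k = \gamma_k \bar{y}_k + (1-\gamma_k)\hat{y}_k$, can be sketched numerically. The article says only that $\gamma_k$ is determined by the relative precision of the two components; the sketch below assumes one common choice, $\gamma_k = A / (A + \hat{V}(\bar{y}_k))$, where $A$ stands for the variance of the regression model error. That form of $\gamma_k$, the function name, and all numerical values are assumptions made for illustration.

```python
def composite_estimate(direct, direct_var, predicted, model_var):
    """Return (small area estimate, gamma) for one area.

    gamma = model_var / (model_var + direct_var) is an ASSUMED weighting rule:
    it approaches 1 when the direct estimate is precise and 0 when it is noisy.
    """
    gamma = model_var / (model_var + direct_var)
    return gamma * direct + (1.0 - gamma) * predicted, gamma


# Hypothetical state with a precise direct estimate: the estimate stays near 0.20.
precise_state = composite_estimate(direct=0.20, direct_var=0.0001,
                                   predicted=0.15, model_var=0.0009)

# Hypothetical state with a noisy direct estimate: the estimate moves toward 0.15.
noisy_state = composite_estimate(direct=0.20, direct_var=0.0081,
                                 predicted=0.15, model_var=0.0009)

print(precise_state)
print(noisy_state)
```

In both cases the composite estimate lies between the direct estimate and the regression prediction, which is the sense in which a small area ‘borrows strength’ from the model.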
5. Models in Survey Design

Kalton (1983) distinguished between the use of models in survey analysis and in survey design, stating that ‘the use of models to guide the choice of sample design is well-established and noncontroversial.’ In good survey practice, a stratified sampling design is often chosen because it is thought that there are differences among stratum means. An unequal probability design may be employed because of prior belief that large counties have more variability in total number of crime victimizations than small counties; models provide a mechanism for formalizing some of the knowledge about population structure and exploring results of alternative assumptions. Cochran (1977) illustrated the use of models for designing systematic samples. Särndal et al. (1992, Chap. 12) summarized research on optimal survey design, in which auxiliary information about the population is used to select a design that minimizes the anticipated variance of an estimator under the model and design.

Models used for design purposes do not affect the validity of estimates in design-based inference. A poor model adopted while designing a probability sample may lead to larger variance of design-based estimates, but the estimates will retain properties such as unbiasedness under repeated sampling. A good model at the design stage often leads to a design with greatly increased efficiency.

A model-based analysis can be conducted on data from any sample, probability or non-probability. Probability sampling is in theory unnecessary from a pure model-based perspective, and Thompson (1997) and Brewer (1999) concluded that certain forms of purposive non-probability sampling can be superior to probability sampling when a model-based analysis is to be conducted and the model is correct. In practice, however, there is always concern that the assumed model may miss salient features of the population, and probability sampling provides some protection against this concern. For a ratio model, with $Y_i = \beta x_i + \varepsilon_i$ and $V[\varepsilon_i] = \sigma^2 x_i$, the model-based optimal design specifies a purposive sample of the population units with the largest $x$ values. Such a design does not allow an investigator to check whether the model is appropriate for small $x$s; an unequal probability sample with $\pi_i$ proportional to $x_i$ does allow such model checking, and allows inferences under either design- or model-based frameworks. As Brewer (1999) pointed out, there is widespread public perception that ‘randomized sampling is fair’ and that perception provides a powerful argument for using probability sampling for official statistics.

The following sources are useful for further exploration of modes of inference in sample surveys. Lohr (1999, Chap. 11) provides a more detailed heuristic discussion of the role of models in survey sampling; Thompson (1997) gives a more mathematical treatment. The articles by Smith (1994), Rao (1997), and Brewer (1999) contrast inferential philosophies, discuss appropriate use of models in analysis of survey data, and provide additional references.

See also: Sample Surveys, History of; Sample Surveys: Methods; Sample Surveys: Survey Design Issues and Strategies; Sample Surveys: The Field
Bibliography

Brewer K R W 1963 Ratio estimation and finite populations: some results deducible from the assumption of an underlying stochastic process. Australian Journal of Statistics 5: 93–105
Brewer K R W 1999 Design-based or prediction-based inference? Stratified random vs. stratified balanced sampling. International Statistical Review 67: 35–47
Cochran W G 1977 Sampling Techniques, 3rd edn. Wiley, New York
Ghosh M, Rao J N K 1994 Small area estimation: An appraisal. Statistical Science 9: 55–76
Godambe V P 1955 A unified theory of sampling from finite populations. Journal of the Royal Statistical Society B 17: 269–78
Hansen M H, Madow W G, Tepping B J 1983 An evaluation of model-dependent and probability-sampling inferences in sample surveys. Journal of the American Statistical Association 78: 776–93
Horvitz D G, Thompson D J 1952 A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association 47: 663–85
Kalton G 1983 Models in the practice of survey sampling. International Statistical Review 51: 175–88
Lohr S L 1999 Sampling: Design and Analysis. Duxbury Press, Pacific Grove, CA
National Research Council 2000 Small-area Income and Poverty Estimates: Priorities for 2000 and Beyond. Panel on Estimates of Poverty for Small Geographic Areas, Committee on National Statistics. National Academy Press, Washington, DC
Neyman J 1934 On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97: 558–606
Rao J N K 1997 Developments in sample survey theory: An appraisal. Canadian Journal of Statistics 25: 1–21
Rao J N K 1999 Some recent advances in model-based small area estimation. Survey Methodology 25: 175–86
Royall R M 1970 On finite population sampling theory under certain linear regression models. Biometrika 57: 377–87
Särndal C E, Swensson B, Wretman J 1992 Model Assisted Survey Sampling. Springer-Verlag, New York
Skinner C J, Holt D, Smith T M F 1989 Analysis of Complex Surveys. Wiley, New York
Smith T M F 1994 Sample surveys 1975–1990; an age of reconciliation? International Statistical Review 62: 5–34
Thompson M E 1997 Theory of Sample Surveys. Chapman & Hall, London
S. L. Lohr
Sample Surveys: Nonprobability Sampling

A sample collected from a finite population is said to be a probability sample if each unit of the population has nonzero probability of being selected into the sample, and that probability is known. Traditional methods of probability sampling include simple and stratified random sampling, and cluster sampling. Conclusions concerning the population may be obtained by design-based, or randomization, inference. See Sample Surveys: The Field and Sample Surveys: Methods. The values of variables of interest in the population are considered as fixed quantities, unknown except for those units selected into the sample. Inference proceeds by considering the behavior of estimators of quantities of interest under the randomization distribution, based on the known selection probabilities. For example, if the $N$ population values of variable $Y$ are denoted $Y_1, \ldots, Y_N$ and the $n$ sample values by $y_1, \ldots, y_n$, then $\bar{y}$, the sample mean, is a possible estimator for $\bar{Y}$, the population mean. If the sample is obtained by simple random sampling, then, with respect to this randomization distribution, $\bar{y}$ is unbiased for $\bar{Y}$ and has sampling variance

$$\frac{N-n}{Nn(N-1)} \sum_{i=1}^{N} (Y_i - \bar{Y})^2 .$$

Nonprobability sampling refers to any method of obtaining a sample from a population which does not satisfy the criteria for probability sampling. Nonprobability samples are usually easier and cheaper to collect than probability samples, as the data collector is allowed to exercise some choice as to which units to include in the sample. For a probability sample, this choice is made entirely by the random sampling mechanism. However, methods of design-based inference cannot be applied to assess the bias or variability of estimators based on nonprobability samples, as such methods do not allow for unknown or zero selection probabilities.

Surveys carried out by national statistical agencies invariably use probability sampling. Marsh and Scarborough (1990) also noted ‘the preponderance of probability sampling in university social science.’ Nonprobability sampling is much more common in market and opinion research. However, Taylor (1995) observed large national differences in the extent to which nonprobability sampling, particularly quota sampling, is viewed as an acceptable tool for market research. In Canada and the USA, probability sampling using telephone polling and random-digit dialing is the norm for public opinion surveys. In Australia and South Africa probability sampling is also prevalent, but with face-to-face interviews. On the other hand, in many European countries such as France and the UK, quota sampling is much more common.

1. Convenience Sampling
The easiest and cheapest way to collect sample data is to collect information on those population units which are most readily accessible. A university researcher may collect data on students. Surveys carried out through newspapers, television broadcasts or Internet sites (as described, for example, by Bradley, 1999) are necessarily restricted to those individuals who have access to the medium in question. Sometimes only a small fraction of the population is accessible, in which case the sample may consist of exactly those units which are available for observation. Some surveys involve an element of self-selection where individuals decide whether to include themselves in the sample or not. If participation is timeconsuming, or financial cost is involved, then the sample is more likely to include individuals with an interest in the subject of the survey. This may not be important. For example, an interest in participating in an experimental study of behavior might be considered to be unlikely to be associated with the outcome of the experiment. However, where the variable of interest relates to opinion on a question of interest, as is often the case in newspaper, television or Internet polls, it is likely that interest in participation is related to opinion, and it is much harder to justify using the sample data to make conclusions about a wider population. A famous example of the failure of such a nonprobability sample to provide accurate inferences about a wider population is the Literary Digest poll of 1936. Ten million US citizens were sent postcard ballots concerning the forthcoming presidential election. Around 2 million of these were returned, a sample size which, if associated with a simple random sample, would be expected to predict the population with negligible error. However, when calibrated against the 13467
election results, the Literary Digest poll was in error by 19 percentage points in predicting Roosevelt’s share of the vote. On the other hand, useful inferences can be made using convenience samples. Smith and Sugden (1988) considered statistical experiments, where the allocation of a particular treatment to the units under investigation is controlled, usually by randomization. In such experiments, the selection of units is not usually controlled and is often a convenience sample. For example, individuals might be volunteers. Nevertheless, inferences are often successfully extended to a wider population. Similarly, observational studies where neither treatment allocation nor sample selection is controlled, usually because it is impossible to do so, can be thought of as arising from convenience samples. Smith (1983) noted that Doll and Hill (1964), in their landmark study of smoking and health, used a sample entirely made up of medical practitioners. However, the validity of extending conclusions based on their data to the general population is now widely recognized. Studies based on convenience samples can be an extremely effective way of conducting preliminary investigations, but it is desirable that any important conclusions drawn about a wider population are further investigated, preferably using probability samples. Where some kind of explanatory, rather than simply descriptive, inference is desired, Smith and Sugden (1988) argued that ‘the ideal studies are experiments within surveys in which the scientist has control over both the selection of units and the allocation of treatments.’ This approach was considered in detail by Fienberg and Tanur (1989).
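The contrast drawn above between a very large self-selected sample and a modest simple random sample can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population size, the true share of ‘yes’ opinions, and the response probabilities are all hypothetical and are not the Literary Digest figures. The only feature that matters is that the chance of responding depends on the opinion itself.

```python
import random

def simulate_poll(pop_size=200_000, share_yes=0.6,
                  p_respond_yes=0.10, p_respond_no=0.30,
                  srs_size=1_000, seed=1936):
    """Compare a large self-selected sample with a small simple random sample.

    Every number here is a hypothetical assumption chosen for illustration.
    """
    rng = random.Random(seed)
    # Fixed finite population of 0/1 opinions (1 = "yes").
    population = [1 if rng.random() < share_yes else 0 for _ in range(pop_size)]
    true_share = sum(population) / pop_size

    # Self-selection: the probability of responding depends on the opinion.
    volunteers = [y for y in population
                  if rng.random() < (p_respond_yes if y else p_respond_no)]
    self_selected_share = sum(volunteers) / len(volunteers)

    # Simple random sample without replacement, far smaller than the above.
    srs = rng.sample(population, srs_size)
    srs_share = sum(srs) / srs_size

    return true_share, self_selected_share, len(volunteers), srs_share

true_share, self_share, n_volunteers, srs_share = simulate_poll()
print(f"true share:                 {true_share:.3f}")
print(f"self-selected (n={n_volunteers}): {self_share:.3f}")
print(f"simple random (n=1000):     {srs_share:.3f}")
```

Under these assumptions the self-selected sample is tens of times larger than the random sample yet lands much further from the population value, because its size does nothing to remove the dependence between responding and opinion.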
2. Quota Sampling

When using survey data to draw an inference about a population of interest, the hope of the analyst is that sample estimators of quantities of interest are close to the corresponding population values. If a nonprobability sample has been collected, then it is instructive to observe the precision of sample estimators of known population quantities. For example, how do the sample proportions of males and females compare to known population values? If they differ substantially, then the sample is ‘unrepresentative’ of the population and one might have legitimate cause for concern about the reliability of estimates of unknown quantities of interest. Purposive sampling is a term used for methods of choosing a nonprobability sample in a way that makes it ‘representative’ of the population, although there is no generally agreed definition of a representative sample, and purposive sampling is often based on subjective considerations. In quota sampling, the sample selection is constrained to ensure that the sample proportions of certain control variables approximately match the
known population proportions. For example, if the population proportions of males and females are equal, then equal numbers of male and female units are selected into the sample. Age groups are also commonly used in designing quota samples. Sample totals for each cell of a cross-classification of two or more control variables (for example, age by sex) may also be fixed by the design. Examples are given by Moser and Kalton (1971). Quota sampling is most commonly used in market and opinion research, where control variables usually include age, sex, and socioeconomic class. Other variables such as employment status and housing tenure are also used. The known population proportions for the control variables are calculated from census data, or from surveys based on large probability samples. Variables with known population totals which are not used in setting quotas may be used for weighting in any subsequent analyses. Where data collection involves visiting households, further constraints beyond the quotas may be applied to sample selection. For example, data collectors may be assigned a prespecified travel plan. However, where the mode of data collection involves intercepting individuals on the street for interview, then the only constraint on the data collector may be to satisfy the quotas. It is this freedom given to the data collector that provides both the biggest advantage and biggest disadvantage of quota sampling. The advantage is that with only the quota constraints to satisfy, data collection is relatively easy. Such surveys can be carried out rapidly by an individual data collector performing interviews on a busy street corner. As with any nonprobability sampling scheme, however, there is no way of assessing the bias associated with quota sampling. The sample units are necessarily selected from those which are available to the data collector, given their mode of interviewing. 
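How quota targets for a cross-classification might be derived from known population counts can be sketched as follows. This is a minimal illustration, not a description of any agency's procedure: the function names `quota_targets` and `still_needed`, the largest-remainder rounding rule, and the census counts in the usage note are all assumptions introduced here.

```python
def quota_targets(census_counts, sample_size):
    """Allocate a quota sample across cells in proportion to known counts.

    census_counts maps a cell label, e.g. ("female", "16-34"), to its known
    population count. Largest-remainder rounding makes the targets sum to
    sample_size; integer arithmetic avoids floating-point rounding issues.
    """
    total = sum(census_counts.values())
    # Whole-number part of each cell's proportional share.
    targets = {cell: sample_size * count // total
               for cell, count in census_counts.items()}
    # Give the leftover interviews to the cells with the largest remainders
    # (ties broken by cell label so the result is deterministic).
    leftover = sample_size - sum(targets.values())
    by_remainder = sorted(
        census_counts,
        key=lambda cell: (-(sample_size * census_counts[cell] % total), cell),
    )
    for cell in by_remainder[:leftover]:
        targets[cell] += 1
    return targets

def still_needed(cell, targets, achieved):
    """True if one more interview in `cell` would help fill its quota."""
    return achieved.get(cell, 0) < targets.get(cell, 0)

# Hypothetical census counts for an age-by-sex cross-classification.
counts = {
    ("female", "16-34"): 3000, ("female", "35-54"): 3500, ("female", "55+"): 3700,
    ("male", "16-34"): 3100, ("male", "35-54"): 3400, ("male", "55+"): 3300,
}
targets = quota_targets(counts, 150)
```

A data collector working to these targets would interview an intercepted person only while `still_needed` returns true for that person's cell. Which individuals within a cell are approached remains entirely at the collector's discretion, which is exactly the freedom discussed above.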
If availability is associated with any of the survey variables, then significant bias may occur. Advocates of quota sampling argue that the quotas control for this, but there is no way of guaranteeing that they do. Neither can design-based inference be used to assess the variability of estimates based on quota samples. Sometimes, a simple model is used to assess this variability. If one assumes that the data collectors used are drawn from a population of possible data collectors, then the ‘between collector’ variance combines both sampling variability and interviewer variability. Deville (1991) modeled the quota sampling process and provided some alternative measures of variability. Studies comparing quota and probability sampling have been carried out. Moser and Stuart (1953) discovered apparent availability biases in the quota samples they investigated, with respect to the variables occupation and education. In particular, they noticed that the quota samples underestimated the proportion of population with lower levels of education. Marsh and Scarborough (1990) investigated nine possible sources of availability bias in quota samples. They
found that, amongst women, their quota sample overestimated the proportion from households with children. Both studies found that the quota samples tended to underestimate the proportion of individuals in the extreme (high and low) income groups. Quota samples are often used for political opinion polls preceding elections. In such examples they can be externally validated against the election results and historically quota samples have often been shown to be quite accurate. Indeed, Worcester (1996) argued that election forecasts using quota samples for UK elections in the 1970s were more accurate than those using probability samples. Smith (1996) presented similar evidence. However, it is also election forecasting which has led to quota sampling coming under closest scrutiny. In the US presidential election of 1948, the Crossley, Gallup, and Roper polls all underestimated Truman’s share of the vote by at least five percentage points, and as a consequence, predicted the wrong election winner. Mosteller et al. (1949), in their report on the failure of the polls, found one of the two main causes of error to be errors of sampling and interviewing, and concluded (p. 304) that ‘it is likely that the principal weakness of the quota control method occurred at the local level at which respondents are selected by interviewers.’ The UK general election of 1992 saw a similar catastrophic failure of the pre-election opinion polls, which gave Labour an average lead of around 1.5 percentage points. In the election, the Conservative lead over Labour was 7 percentage points. A report by the Market Research Society Working Party (1994) into the failure of the polls identified inaccuracies in setting the quota controls as one of a number of possible sources of error. As a result, the sample proportions of the key variables did not accurately reflect the proportions in the population.
Lynn and Jowell (1996) attributed much of the error to the selection bias inherent in quota sampling, and argued for increased use of probability sampling methods for future election forecasts.
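The simple between-collector model mentioned earlier in this section can be sketched in a few lines. The sketch assumes, as a modeling choice that the data cannot confirm, that data collectors are exchangeable draws from a population of possible collectors and that each carries the same workload; the function name and the example figures are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def between_collector_se(collector_estimates):
    """Overall estimate and its standard error from collector-level estimates.

    Each entry is one collector's estimate (for example, the share of that
    collector's respondents giving a particular answer). The spread between
    collectors reflects sampling and interviewer variability together.
    """
    k = len(collector_estimates)
    if k < 2:
        raise ValueError("need at least two collectors")
    overall = mean(collector_estimates)
    # Sample variance between collectors (divisor k - 1), divided by k,
    # estimates the variance of the average over collectors.
    s2_between = variance(collector_estimates)
    return overall, sqrt(s2_between / k)

# Hypothetical shares reported by five collectors.
overall, se = between_collector_se([0.50, 0.54, 0.46, 0.58, 0.42])
```

The resulting standard error describes variability only; it says nothing about availability bias shared by all collectors.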
3. A Formal Framework

As methods of design-based inference cannot be applied to data obtained by nonprobability sampling, any kind of formal assessment of bias and variability associated with nonprobability samples requires a model-based approach (see Sample Surveys: Model-based Approaches). Smith (1983) considered the following framework, which can be used to assess the validity of inferences from various kinds of nonprobability samples. Let $i = 1, \ldots, N$ denote the population units, vector $Y_i$ the values of the unknown survey variables, and vector $Z_i$ the values of variables which are known prior to the survey. Let $A$ be a binary variable indicating whether a unit is selected into the sample ($A_i = 1$) or not ($A_i = 0$), and let $A_s$ be the values of $A$ for the observed sample. Smith (1983)
modeled the population values of $Y$ and the selection process jointly through

$$f(Y, A_s \mid Z; \theta, \phi) = f(Y \mid Z; \theta)\, f(A_s \mid Y, Z; \phi) \tag{1}$$
where $\theta$ and $\phi$ are distinct model parameters for the population model and selection model respectively. Given $A_s$, $Y$ can be partitioned as $(Y_s, Y_{\bar{s}})$ into observed and unobserved values. Inferences based on the observed data model $f(Y_s \mid Z; \theta)$ and extended to the population are said to ignore the selection mechanism, and in situations where this is valid, the selection is said to be ignorable (Rubin, 1976); see Statistical Data, Missing. Selection is ignorable when

$$f(A_s \mid Y, Z; \phi) = f(A_s \mid Z; \phi) \tag{2}$$
so that the probability of making the observed selection, for given $Z$, is the same for all $Y$. A sufficient condition for this is that $A$ and $Y$ are conditionally independent given $Z$. A probability sampling scheme, perhaps using some stratification or clustering based on $Z$, is clearly ignorable. Nonprobability sampling schemes based on $Z$ (for example selecting exactly those units corresponding to a particular set of values of $Z$) are also ignorable. However, whether or not inferences are immediately available for values of $Z$ not contained in the sample depends on the form of the population model $f(Y \mid Z; \theta)$ and, in particular, whether the entire $\theta$ is estimable using $Y_s$. If $Y$ is independent of $Z$ then there is no problem, but this is an assumption which cannot be verified by sample data based on a restricted sample of values of $Z$. If this assumption seems implausible, then post-stratification may help. Smith (1993) considered partitioning the variables comprising $Y$ into measurement variables $Y^m$ and stratification variables $Y^q$, and post-stratifying. If

$$f(Y^m_s \mid Y^q_s, Z; \xi) = f(Y^m_s \mid Y^q_s; \xi) \tag{3}$$

where $\xi$ are parameters for the post-stratification model, then inference for any $Z$ is available. This condition implies that, given the observed values $Y^q_s$ of the stratification variables, $Z$ gives no further information concerning the measurement variables. This approach provides a way of validating certain inferences based on a convenience sample, where $Z$ is an indicator variable defining the sample. Smith (1983) also considered ignorability for quota sampling schemes. He proposed modeling selection into a quota sample in two stages: selection into a larger sample for whom quota variables $Y^q$ are recorded, followed by selection into the final sample, based on a unit’s quota variables and the requirements to fill the quota. For the final sample, the variables of interest $Y^m$ are recorded. Two ignorability conditions result, requiring that at neither stage does probability of selection, given $Y^q$ and $Z$, depend on $Y^m$.
This formal framework makes clear, through expressions such as (2) and (3), when model-based inferences from nonprobability samples can and cannot be used to provide justifiable population inferences. However, it is important to realize that the assumptions required to ensure ignorability cannot be verified using the sample data alone. They remain assumptions which need to be subjectively justified before extending any inferences to a wider population. These formal concepts of ignorability confirm more heuristic notions of what is likely to comprise a good nonprobability sampling scheme. For example, opinion polls with a large element of self-selection are highly unlikely to result in an ignorable selection. On the other hand, one might have much more faith in a carefully constructed quota sampling scheme, where data collectors are assigned to narrowly defined geographical areas, chosen using a probability sampling scheme, and given restrictive guidelines on choosing the units to satisfy their quota.
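The ignorability condition in expression (2) can be made concrete with a small simulation. The sketch below assumes a hypothetical population in which a known binary variable Z influences both the survey variable Y and the chance of selection, while selection does not depend on Y once Z is given. All probabilities and names are illustrative assumptions. The unadjusted sample mean is then misleading, whereas an estimate that models Y within each value of Z and weights by the known population distribution of Z is not.

```python
import random

def simulate_selection_on_z(pop_size=100_000, seed=1983):
    """Selection depends on Z only, so A and Y are independent given Z."""
    rng = random.Random(seed)
    # Hypothetical model: P(Y = 1 | Z) and P(selected | Z).
    p_y_given_z = {0: 0.3, 1: 0.7}
    p_select_given_z = {0: 0.02, 1: 0.10}

    z = [rng.randint(0, 1) for _ in range(pop_size)]
    y = [1 if rng.random() < p_y_given_z[zi] else 0 for zi in z]
    selected = [rng.random() < p_select_given_z[zi] for zi in z]

    true_mean = sum(y) / pop_size
    sample = [(zi, yi) for zi, yi, a in zip(z, y, selected) if a]

    # Naive estimate: ignores that units with Z = 1 are over-represented.
    naive_mean = sum(yi for _, yi in sample) / len(sample)

    # Model-based estimate: mean of Y within each Z group of the sample,
    # weighted by the known population share of that Z value.
    adjusted_mean = 0.0
    for value in (0, 1):
        group = [yi for zi, yi in sample if zi == value]
        pop_share = z.count(value) / pop_size
        adjusted_mean += pop_share * sum(group) / len(group)

    return true_mean, naive_mean, adjusted_mean

true_mean, naive_mean, adjusted_mean = simulate_selection_on_z()
print(f"true {true_mean:.3f}  naive {naive_mean:.3f}  adjusted {adjusted_mean:.3f}")
```

If the selection probabilities were made to depend on Y as well, the adjusted estimate would no longer be protected, and nothing in the sample alone would reveal the problem. That is the sense in which ignorability has to be argued for rather than checked.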
4. Discussion

The distinction between probability sampling and nonprobability sampling is necessarily coarse. At one extreme is a carefully constructed probability survey with no nonresponse; at the other extreme is a sample chosen entirely for the investigator’s convenience. However, most surveys fall between these two extremes, and therefore strictly should be considered as nonprobability samples. Examples include quota surveys of households where the geographical areas for investigation are chosen using a probability sample, or statistical experiments where a convenience sample of units is assigned treatments using a randomization scheme. The validity of any inferences extended to a wider population depends on the extent to which the selection of units is ignorable for the inference required. This applies equally to any survey with nonresponse. The presence of nonresponders in a probability survey introduces a nonprobability element into the selection mechanism, so the ignorability of the nonresponse must also be considered. However, surveys with probability sampling usually make a greater effort to minimize nonresponse than nonprobability surveys, where there is little incentive to do so. Furthermore, even with nonresponse, it is easier to justify ignorability of a probability sampling mechanism. Further details concerning specific issues may be obtained from the sources referenced above. Alternative perspectives on nonprobability sampling are provided by general texts on sampling such as Hansen et al. (1953), Stephan and McCarthy (1958) and Moser and Kalton (1971).

See also: Sample Surveys, History of; Sample Surveys: Survey Design Issues and Strategies
Bibliography

Bradley N 1999 Sampling for internet surveys. An examination of respondent selection for internet research. Journal of the Market Research Society 41: 387–95
Deville J-C 1991 A theory of quota surveys. Survey Methodology 17: 163–81
Doll R, Hill A B 1964 Mortality in relation to smoking: ten years’ observations of British doctors. British Medical Journal 1: 1399–410
Fienberg S E, Tanur J M 1989 Combining cognitive and statistical approaches to survey design. Science 243: 1017–22
Hansen M H, Hurwitz W N, Madow W G 1953 Sample Survey Methods and Theory. Volume 1: Methods and Applications. Wiley, New York
Lynn P, Jowell R 1996 How might opinion polls be improved? The case for probability sampling. Journal of the Royal Statistical Society A 159: 21–8
Market Research Society Working Party 1994 The Opinion Polls and the 1992 General Election. Market Research Society, London
Marsh C, Scarborough E 1990 Testing nine hypotheses about quota sampling. Journal of the Market Research Society 32: 485–506
Moser C A, Kalton G 1971 Survey Methods in Social Investigation. Heinemann, London
Moser C A, Stuart A 1953 An experimental study of quota sampling (with discussion). Journal of the Royal Statistical Society A 116: 349–405
Mosteller F, Hyman H, McCarthy P J, Marks E S, Truman D B 1949 The Pre-election Polls of 1948: Report to the Committee on Analysis of Pre-election Polls and Forecasts. Social Science Research Council, New York
Rubin D B 1976 Inference and missing data. Biometrika 63: 581–92
Smith T M F 1983 On the validity of inferences from nonrandom samples. Journal of the Royal Statistical Society A 146: 394–403
Smith T M F 1996 Public opinion polls: the UK general election, 1992. Journal of the Royal Statistical Society A 159: 535–45
Smith T M F, Sugden R A 1988 Sampling and assignment mechanisms in experiments, surveys and observational studies. International Statistical Review 56: 165–80
Stephan F F, McCarthy P J 1958 Sampling Opinions. An Analysis of Survey Procedure. Wiley, New York
Taylor H 1995 Horses for courses: how survey firms in different countries measure public opinion with very different methods. Journal of the Market Research Society 37: 211–19
Worcester R 1996 Political polling: 95% expertise and 5% luck. Journal of the Royal Statistical Society A 159: 5–20
J. J. Forster
Sample Surveys: Survey Design Issues and Strategies

A treatment of survey questions intended to be useful to those wishing to carry out or interpret actual surveys should consider several issues: the basic difference between questions asked in surveys and
questions asked in ordinary social interaction; the problems of interpreting tabulations based on single questions; the different types of survey questions that can be asked; the possibility of bias in questioning; and the insights that can be gained by combining standard survey questions with randomized experiments that vary the form, wording, and context of the questions themselves. Each of these issues is treated in this article.
1. The Unique Nature of Survey Questioning

A fundamental paradox of survey research is that we start from the purpose of ordinary questioning as employed in daily life, yet our results are less satisfactory for that purpose than for almost any other. In daily life a question is usually asked because one person wishes information from another. You might ask an acquaintance how many rooms there are in her house, or whether she favors the legalization of abortion in America. The assumption on both sides of the interaction is that you are interested in her answers in and of themselves. We can call such inquiries ordinary questions. In surveys we use similar inquiries; that is, their form, their wording, and the manner of their asking are seldom sharply distinguishable from ordinary questions. At times we may devise special formats, with names like Likert-type or forced-choice, but survey questions cannot depart too much from ordinary questioning because the essential nature of the survey is communication with people who expect to hear and respond to ordinary questions. Not surprisingly, respondents believe that the interviewer or questionnaire is directly interested in the facts and opinions they give, just as would an acquaintance who asked the same questions. They may not assume a personal interest in their answers, but what they do assume is that their answers will be combined with the answers of all others to give totals that are directly interpretable. Thus, if attitudes or opinions are inquired into, the survey is viewed as a kind of referendum and the investigator is thought to be interested in how many favor and how many oppose legalized abortion or whatever else is at issue. If facts are being asked about, the respondent expects a report telling how many people have what size homes, or whatever the inquiry is about.
(By factual data we mean responses that correspond to a physical reality and could, in principle, be provided by an observer as well as by a respondent, for example, when counting rooms. By attitudinal data we mean responses that concern subjective phenomena and therefore depend on self-reports by respondents. The distinction is not airtight: for example, the designation of a respondent’s ‘race’ can be based on self-report but also on the observations of others, and the two may differ without either being
clearly ‘wrong.’) In this article the focus will be on attitudes, including opinions, beliefs, and values, though much of the discussion can be applied to factual data as well. Experienced survey researchers know that a simple tally of responses to a question—what survey researchers refer to as the ‘marginals’—are usually too much a function of the way the question was asked to allow for any simple interpretation. The results of questions on legalized abortion depend heavily on the conditions, definitions, and other subtleties presupposed by the question wording, and the same is true to an extent even for a question on how many rooms there are in a house. Either we must keep a question quite general in phrasing and leave the definitions, qualifications, and conditions up to each respondent—which invites unseen variations in interpretation—or we must try to make the question much more limited in focus than was usually our goal in the first place. Faced with these difficulties in interpreting univariate results from separate questions, survey investigators can proceed in one or both of two directions. One approach is to ask a wide range of questions on an issue and hope that the results can be synthesized into a general conclusion, even though this necessarily involves a fair amount of judgment on the part of the researcher. The other direction—the one that leads to standard survey analysis—is to hold constant the question (or the index, if more than a single item is being considered) and make comparisons across time or other variables. We may not be sure of exactly what 65 percent means in terms of general support for legalized abortion, but we act on the assumption that if the question wording and survey conditions have been kept constant, we can say, within the limits of sampling error, that it represents such and such an increase or decrease from an earlier survey that asked the same question to a sample from the same population. 
Or if 65 percent is the figure for men and 50 percent is the figure for women, a sex difference of approximately 15 percentage points exists. Moreover, research indicates that in most cases relationships are less affected by variations in the form of question than are univariate distributions, a finding generalized as the rule of ‘form-resistant correlations’ (Schuman and Presser 1981). The analytic approach, together with use of multiple questions (possibly further combined on the basis of a factor analytic approach), can provide a great deal of understanding and insight into an attitude, though it militates against a single summary statement of the kind that the respondents expect to hear. (Deming’s (1968, p. 601) distinction between enumerative and analytic studies is similar, but he treats the results from enumerative studies as unproblematic, using a simple factual example of counting number of children. In this article, univariate results based on attitude questions are regarded as questionable attempts to simulate actual
referenda. Thus the change in terminology is important.) This difference between what respondents expect (the referendum point of view) and what the sophisticated survey researcher expects (the analytic point of view) is often very great. The respondent in a national survey believes that the investigator will add up all the results, item by item, and tell the nation what Americans think. But the survey investigator knows that such a presentation is usually problematic at best and can be dangerously misleading at worst. Moreover, to make matters even more awkward, political leaders often have the same point of view as respondents: they want to know how many people favor and how many oppose an issue that they see themselves as confronting. Yet it may be neither possible nor desirable for the survey to pose exactly the question the policy maker has in mind, and in any case such a question is likely to be only one of a number of possible questions that might be asked on the issue.
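The phrase ‘within the limits of sampling error,’ applied earlier in this section to comparisons such as 65 percent for men against 50 percent for women, can be given a rough numerical meaning. The worked example below assumes independent simple random samples of 500 men and 500 women. These sample sizes are hypothetical and are not taken from any survey discussed here, and a clustered or weighted design would call for a different variance formula.

```latex
\begin{align*}
\hat{d} &= \hat{p}_{\text{men}} - \hat{p}_{\text{women}} = 0.65 - 0.50 = 0.15 \\
\mathrm{SE}(\hat{d}) &= \sqrt{\frac{0.65 \times 0.35}{500} + \frac{0.50 \times 0.50}{500}}
  = \sqrt{0.000455 + 0.000500} \approx 0.031 \\
\hat{d} \pm 1.96\,\mathrm{SE}(\hat{d}) &\approx 0.15 \pm 0.061
\end{align*}
```

Under these assumptions the difference is estimated at 15 percentage points, with an approximate 95 percent interval of roughly 9 to 21 points. The interval speaks only to sampling error; it does not address how far either percentage depends on the wording of the question.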
2. Problems with the Referendum Point of View

There are several reasons why answers obtained from isolated questions are usually uncertain in meaning. First, many public issues are discussed at a general level as though there is a single way of framing them and as though there are just two sides. But what is called the abortion issue, to follow our previous example, consists of a large number of different issues having to do with the reasons for abortion, the trimester involved, and so forth. Likewise, what is called ‘gun control’ can involve different types of guns and different kinds of controls. Except at the extremes, exactly which of these particular issues is posed and with what alternatives makes a considerable difference in the univariate results. Indeed, often what is reported as a conflict in findings between two surveys is due to their having asked about different aspects of the same general issue. A second problem is that answers to survey questions always depend on the form in which the question is asked, because most respondents treat that form as a constraint on their answers. If two alternatives are given by the interviewer, most respondents will choose one, rather than offering a substitute of their own that they might prefer. For example, in one survey-based experiment the authors identified the problems spontaneously mentioned when a national sample of Americans was asked to name the most important problem facing the country. Then a parallel question was formulated for a comparable sample that included none of the four problems mentioned most often spontaneously, but instead four problems that had been mentioned by less than three percent of the population in toto, though with an invitation to respondents to substitute a different problem if they wished. Despite the invitation, the majority of respondents (60 percent) chose one of the rare problems offered explicitly, which reflected their unwillingness to go outside the frame of reference provided by the question (Schuman and Scott 1987). Evidently, the form of a question is treated by most people as setting the ‘rules of the game,’ and these rules are seldom challenged even when encouragement is offered. It might seem as though the solution to the rules-of-the-game constraint is to keep questions ‘open,’ that is, not to provide specific alternatives. This is often a good idea, but not one that is failsafe. In a related experiment on important events and changes from the recent past, ‘the development of computers’ was not mentioned spontaneously nearly as often as economic problems, but when it was included in a list of past events along with economic problems, the development of computers turned out to be the most frequent response (Schuman and Scott 1987). Apparently people asked to name an important recent event or change thought that the question referred only to political events or changes, but when the legitimacy of a different kind of response was made explicit, it was heavily selected. Thus a question can be constraining even when it is entirely open and even when the investigator is unaware of how it affects the answers respondents give. A third reason for the limitations of univariate results is the need for comparative data to make sense in interpretation. Suppose that a sample of readers of this article is asked to answer a simple yes/no question as to its value, and that 60 percent reply positively and 40 percent negatively. Leaving aside all the problems of question wording discussed thus far, such percentages can be interpreted only against the backdrop of other articles. If the average yes percentage for all articles is 40 percent, the author might feel proud of his success. If the average is 80 percent, the author might well hang his head in shame.
We are all aware of the fundamental need for this type of comparison, yet it is easy to forget about the difficulty of interpreting absolute percentages when we feel the urge to speak definitively about public reactions to a unique event. Finally, in addition to all of the above reasons, there are sometimes subtle features of wording that can affect answers. A classic example of a wording effect is the difference between ‘forbidding’ something and ‘not allowing’ the same thing (Rugg 1941). A number of survey experiments have shown that people are more willing to ‘not allow’ a behavior than they are to ‘forbid’ the same behavior, even though the practical effects of the distinction in wording are nil (Holleman 2000). Another subtle feature is context: for example, a question about abortion in the case of a married woman who does not want any more children is answered differently depending on whether or not it is preceded by a question about abortion in the case of a defective fetus (Schuman and Presser 1996 [1981]). The problems of wording and context appear equally when an actual referendum is to be carried out
by a government: considerable effort is made by politicians on all sides of the issue to control the wording of the question to be voted on, as well as its placement on the ballot, with the battle over these decisions sometimes becoming quite fierce. This shows that there is never a single way to phrase a referendum and that even small variations in final wording or context can influence the outcome of the voting. The same is true for survey questions, but with the crucial difference that they are meant to provide information, not to determine policy in a definitive legal sense. The analytic approach, when combined with use of multiple questions to tap different aspects of an issue, provides the most useful perspective on survey data. Rather than focusing on the responses to individual items as such, the analysis of change over time and of variations across demographic and social background variables provides the surest route to understanding both attitudinal and factual data. Almost all important scholarly work based on surveys follows this path, giving attention to individual percentages only in passing. In addition, in recent years, classic between-subjects experiments have been built into surveys, with different ways of asking a question administered to random subsamples of a larger probability sample in order to learn about the effects of question wording (Schuman and Presser 1996 [1981]). These survey-based experiments, traditionally called ‘split-ballots,’ combine the advantage of a probability sample survey to generalize to a much larger population with the advantage of randomized treatments to test causal hypotheses. Survey-based experiments have been used to investigate a variety of methodological uncertainties about question formulations, as we will see below, and are also employed increasingly to test hypotheses about substantive political and social issues.
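The mechanics of a split-ballot experiment, as described above, amount to two steps: random assignment of respondents to question forms, and a comparison of the answers across forms. The sketch below is one simple way to write this down, not the procedure of any particular study. The function names, the two-form design, and the counts in the usage note are hypothetical, and the z statistic treats the respondents as a simple random sample, which a complex survey design would not justify without adjustment.

```python
import random
from math import sqrt

def split_ballot(respondent_ids, seed=0):
    """Randomly assign each respondent to question form "A" or "B"."""
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"A": ids[:half], "B": ids[half:]}

def two_proportion_z(yes_a, n_a, yes_b, n_b):
    """z statistic for the difference in 'yes' shares between two forms."""
    p_a, p_b = yes_a / n_a, yes_b / n_b
    # Pooled share under the hypothesis that wording makes no difference.
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical result: 270 of 500 say yes to form A, 220 of 500 to form B.
forms = split_ballot(range(1000))
z = two_proportion_z(270, 500, 220, 500)
```

Because assignment to forms is random, a difference of this size can be attributed to the wording itself rather than to differences between the people who happened to receive each form.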
3. Types of Survey Questions

When investigators construct a questionnaire, they face a number of decisions about the form in which their questions are to be asked, though the decisions are often not made on the basis of much reflection. The effects of such decisions were first explored in survey-based experiments conducted in the mid-twentieth century and reported in books by Cantril (1944) and Payne (1951). In 1981 (1996), Schuman and Presser provided a systematic review of variations due to question form, along with much new experimental data. Recent books by Sudman et al. (1996), Tanur (1992), Tourangeau et al. (2000), and Krosnick and Fabrigar (forthcoming) consider many of these same issues, as well as a number of additional ones, drawing especially on ideas and research from cognitive psychology (see also Questionnaires: Cognitive Approaches; Sample Surveys: Cognitive Aspects of Survey Design).
An initial important decision is whether to ask a question in open or closed form. Open questions, where respondents answer in their own words and these are then coded into categories, are more expensive in terms of both time and money than closed questions that present two or more alternatives that respondents choose from. Hence, open questions are not common in present-day surveys, typically being restricted to questions that attempt to capture rapid change and that are easy to code, as in standard inquiries about ‘the most important problem facing the country today.’ In this case, immediate salience is at issue and responses can usually be summarized in keyword codes such as ‘unemployment,’ ‘terrorism,’ or ‘race relations.’ An open-ended approach is also preferable when numerical answers are wanted, for example, how many hours of television a person watches a week. Schwarz (1996) has shown that offering a specific set of alternatives provides reference points that can shape answers, and thus it is probably better to leave such questions open, as recommended by Bradburn et al. (1979). More generally, open and closed versions of a question often do lead to different univariate response distributions and to different multivariate relations as well (Schuman and Presser 1996, [1981]). Partly this is due to the tendency of survey investigators to write closed questions on the assumption that they themselves know how to frame the main choices, which can lead to their overlooking alternatives or wording especially meaningful to respondents. Many years ago Lazarsfeld (1944) proposed as a practical compromise the use of open questions in the early development of a questionnaire, with the results then drawn on to frame closed alternatives that would be more efficient for use in an actual survey. 
What has come to be called ‘cognitive interviewing’ takes this same notion into the laboratory by studying carefully how a small number of individuals think about the questions they are asked and about the answers they give (see several chapters in Schwarz and Sudman 1996). This may not eliminate all open/closed differences, but it helps investigators learn what is most salient and meaningful to respondents. Even after closed questions are developed, it is often instructive to include follow-up ‘why’ probes of answers in order to gain insight into how respondents perceived the questions and what they see their choices as meaning. Since it is not practical to ask such follow-ups of all respondents about all questions, Schuman (1966) recommended the use of a ‘random probe’ technique to obtain answers from a subsample of the larger sample of questions and respondents.
When the focus is on closed questions, as it often is, a number of further decisions must be made. A frequently used format is to state a series of propositions, to each of which the respondent is asked to indicate agreement or disagreement. Although this is an efficient way to proceed, there is considerable
evidence that a substantial number of people, especially those with less education, show an ‘acquiescence bias’ when confronted with such statements (Krosnick and Fabrigar forthcoming). The main alternative to the agree/disagree format is to require respondents to make a choice between two or more statements. Such a balanced format encourages respondents to think about the opposing alternatives, though it also requires investigators to reduce each issue to clearly opposing positions.
Another decision faced by question writers is how to handle DK (don’t know) responses. The proportion of DK answers varies not only by the type of issue (there are likely to be more for a remote foreign policy issue than for a widely discussed issue like legalization of abortion; Converse 1976–77) but also by how much DK responses are encouraged or discouraged. At one extreme, the question may offer a DK alternative as one of the explicit choices for respondents to consider, even emphasizing the desirability of giving it if the respondent lacks adequate information on the matter. At the other extreme, interviewers may be instructed to urge those who give DK responses to think further in order to provide a more substantive answer. In between, a DK response may not be mentioned by the interviewer but can be accepted when volunteered. Which approach is chosen depends on one’s beliefs about the meaning of a DK response. Those who follow Converse’s (1964) emphasis on the lack of knowledge that the majority of people possess about many public issues tend to encourage respondents to consider a DK response as legitimate. Those who argue, like Krosnick and Fabrigar (forthcoming), that giving a DK response is mainly due to ‘satisficing’ prefer to press respondents to come up with a substantive choice.
Still other possibilities are that DKs can involve evasion in the case of a sensitive issue (e.g., racial attitudes), and in such cases it is unclear what respondents will do if prevented from giving a DK response.
A more general methodological issue is the desirability of measuring attitude strength. Attitudes are typically defined as favorable or unfavorable evaluations of objects, but the evaluations can also be seen as varying in strength. One can strongly favor or oppose the legalization of abortion, for example, but hold an attitude toward gun registration that is weaker, or vice versa. Further, there is more than one way to measure the dimension of strength, as words like ‘extremity,’ ‘importance,’ ‘certainty,’ and ‘strength’ itself suggest, and thus far it appears that the different methods of measurement are far from perfectly correlated (Petty and Krosnick 1995). Moreover, although there is evidence that several of these strength measures are related to actual behavior, for example, donating money to one side of the dispute about the legalization of abortion, in the case of gun registration the relation has been much weaker,
apparently because other social factors (i.e., the effectiveness of gun lobbying organizations) play a large role independent of attitude strength (Schuman and Presser 1996 [1981]).
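The ‘random probe’ idea mentioned in this section, following up only a random subsample of question and respondent pairs with a ‘why’ probe, can be sketched in a few lines. This is a minimal illustration, not Schuman's published procedure: the function name `assign_random_probes`, the respondent and question identifiers, and the choice of three probes per respondent are all hypothetical.

```python
import random


def assign_random_probes(respondent_ids, question_ids, probes_per_respondent, seed=None):
    """For each respondent, draw a random subset of closed questions
    that will receive a follow-up 'why' probe.

    Hypothetical helper: the source only says that probes go to a
    subsample of questions and respondents, not how it is drawn.
    """
    rng = random.Random(seed)  # seeded so the plan can be reproduced
    k = min(probes_per_respondent, len(question_ids))
    # random.sample draws without replacement, so no question repeats
    return {rid: sorted(rng.sample(question_ids, k)) for rid in respondent_ids}


# Illustrative identifiers only (assumptions, not survey data).
respondents = [f"R{i:03d}" for i in range(1, 6)]
questions = [f"Q{j}" for j in range(1, 21)]

plan = assign_random_probes(respondents, questions, probes_per_respondent=3, seed=42)
for rid, probed in plan.items():
    print(rid, probed)
```

Because every question has the same chance of being probed for every respondent, the probe answers collected this way cover the whole questionnaire at a fraction of the cost of probing everything.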
4. Question Wording and Bias Every question in a survey must be conveyed in words and words are never wholly neutral and unproblematic. Words have tone, connotation, implication— which is why both a referendum and a survey are similar in always coming down to a specific way of describing an issue. Of course, sometimes a question seems to have been deliberately biased, as when a ‘survey’ mailed out by a conservative organization included the following question: Do you believe that smut peddlers should be protected by the courts and the Congress, so they can openly sell pornographic materials to your children?
But more typical are two versions of a question that was asked during the Vietnam War: If a situation like Vietnam were to develop in another part of the world, do you think the United States should or should not send troops [to stop a communist takeover]?
Mueller (1973) found that some 15 percent more Americans favored military action when the bracketed words were included than when they were omitted. Yet it is not entirely clear how one should regard the phrase ‘to stop a communist takeover’ during that period. Was it ‘biasing’ responses to include the phrase, or was it simply informing respondents of something they might like to have in mind as they answered? Furthermore, political leaders wishing to encourage or discourage an action can choose how to phrase policy issues, and surveys cannot ignore the force of such framing if they wish to be relevant to important political outcomes. Another instructive example was the attempt to study attitudes during the 1982 war between Argentina and Britain over ownership of a small group of islands in the South Atlantic. It was virtually impossible to phrase a question that did not include the name of the islands, but for the Argentines they were named the Malvinas and for the British the Falkland Islands. Whichever name was used in a question could be seen as prejudicing the issue of ownership. This is an unusual example, but it shows that bias in survey questions is not a simple matter.
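A wording effect such as the one Mueller reports is typically assessed with a split-ballot experiment, in which random halves of the sample receive the two versions of the question. The sketch below shows one common way to compare the two halves, a pooled two-proportion z-test. The counts are invented to reproduce a 15-point gap and are not Mueller's data; the function name is likewise an assumption.

```python
import math


def two_proportion_z(favor_a, n_a, favor_b, n_b):
    """Pooled two-proportion z-test for a split-ballot wording experiment.

    Returns the difference in proportions, the z statistic, and the
    two-sided p-value under the normal approximation.
    """
    p_a = favor_a / n_a
    p_b = favor_b / n_b
    pooled = (favor_a + favor_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area
    return p_a - p_b, z, p_value


# Hypothetical counts: form A includes the bracketed phrase, form B omits it.
diff, z, p_value = two_proportion_z(favor_a=275, n_a=500, favor_b=200, n_b=500)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p_value:.2g}")
```

A test like this only shows that the two wordings produce different distributions. It does not settle whether the added phrase biases or informs respondents, which is the interpretive question raised in the text.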
5. Conclusion In this sense, we return to the fundamental paradox of survey research—the referendum point of view vs. the analytic point of view. We all wish at times to know
what the public as a whole feels about an important issue, whether it involves military intervention in a distant country or a domestic issue like government support for health care. Therefore, we need to remind both ourselves and the public of the limitations of univariate survey results, while at the same time taking whatever steps we can to reduce those limitations. Above all, this means avoiding the tendency to reduce a complex issue to one or two simple closed questions, because every survey question imposes a unique perspective on responses, whether we think of this as ‘bias’ or not. Moreover, survey data are most meaningful when they involve comparisons, especially comparisons over time and across important social groups, provided that the questions have been kept as constant in wording and meaning as possible. From a practical standpoint, probably the most useful way to see how the kinds of problems discussed here can be addressed is to read significant substantive analyses of survey data, for example, classics like Stouffer (1955) and Campbell et al. (1960), and more recent works that grapple with variability and bias, for example, Page and Shapiro (1992) and Schuman et al. (1997).
Bibliography
Bradburn N M, Sudman S, with assistance of Blair E, Locander W, Miles C, Singer E, Stocking C 1979 Improving Interview Method and Questionnaire Design. Jossey-Bass, San Francisco
Campbell A, Converse P E, Miller W E, Stokes D E 1960 The American Voter. Wiley, New York
Cantril H 1944 Gauging Public Opinion. Princeton University Press, Princeton, NJ
Converse J M 1976–7 Predicting no opinion in the polls. Public Opinion Quarterly 40: 515–30
Converse J M 1987 Survey Research in the United States: Roots & Emergence 1890–1960. University of California Press, Berkeley, CA
Converse P 1964 The nature of belief systems in mass publics. In: Apter D E (ed.) Ideology and Discontent. Free Press, New York
Deming W E 1968 Sample surveys: the field. In: Sills D (ed.) International Encyclopedia of the Social Sciences. Macmillan, New York, Vol. 13, pp. 594–612
Holleman B H 2000 The Forbid/Allow Asymmetry: On the Cognitive Mechanisms Underlying Wording Effects in Surveys. Rodopi, Amsterdam
Krosnick J A, Fabrigar L A forthcoming Designing Great Questionnaires: Insights from Social and Cognitive Psychology. Oxford University Press, Oxford, UK
Lazarsfeld P F 1944 The controversy over detailed interviews—an offer to negotiate. Public Opinion Quarterly 8: 38–60
Mueller J E 1973 War, Presidents, and Public Opinion. Wiley, New York
Page B I, Shapiro R Y 1992 The Rational Public: Fifty Years of Trends in Americans’ Policy Preferences. University of Chicago Press, Chicago
Payne S L 1951 The Art of Asking Questions. Princeton University Press, Princeton, NJ
Petty R E, Krosnick J A (eds.) 1995 Attitude Strength: Antecedents and Consequences. Erlbaum, Mahwah, NJ
Rugg D 1941 Experiments in wording questions: II. Public Opinion Quarterly 5: 91–2
Schuman H 1966 The random probe: a technique for evaluating the validity of closed questions. American Sociological Review 21: 218–22
Schuman H, Presser S 1981 Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context. Academic Press, New York [reprinted 1996, Sage Publications, Thousand Oaks, CA]
Schuman H, Scott J 1987 Problems in the use of survey questions to measure public opinion. Science 236: 957–9
Schuman H, Steeh C, Bobo L, Krysan M 1997 Racial Attitudes in America: Trends and Interpretations. Harvard University Press, Cambridge, MA
Schwarz N 1996 Cognition and Communication: Judgmental Biases, Research Methods, and the Logic of Conversation. Erlbaum, Mahwah, NJ
Schwarz N, Sudman S (eds.) 1996 Answering Questions. Jossey-Bass, San Francisco
Stouffer S A 1955 Communism, Conformity, and Civil Liberties. Doubleday, New York
Sudman S, Bradburn N M, Schwarz N 1996 Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. Jossey-Bass, San Francisco
Tanur J M (ed.) 1992 Questions About Questions: Inquiries Into the Cognitive Bases of Surveys. Russell Sage Foundation, New York
Tourangeau R, Rips L J, Rasinski K A 2000 The Psychology of Survey Response. Cambridge University Press, Cambridge, UK
H. Schuman
Sample Surveys: The Field
1. Definition of Survey Sampling
Survey sampling can be defined as the art of selecting a sample of units from a population of units, creating measurement tools for measuring the units with respect to the survey variables, and drawing precise conclusions about the characteristics of the population or of the process that generated the values of the units. A more specific definition of a survey is the following (Dalenius 1985):
(a) A survey concerns a set of objects comprising a population. One class of population concerns a finite set of objects such as individuals, businesses, and farms. Another concerns events during a specific time period, such as crime rates and sales. A third class concerns plain processes, such as land use or the occurrence of certain minerals in an area. More specifically, one might want to define a population as, for example, all noninstitutionalized individuals 15–74 years of age living in Sweden on May 1, 2000.
(b) This population has one or more measurable
properties. Examples of such properties are individuals’ occupations, businesses’ revenues, and the number of elks in an area.
(c) There is a desire to describe the population by one or more parameters defined in terms of these properties. This calls for observing (a sample of) the population. Examples of parameters are the proportion of unemployed individuals in the population, the total revenue of businesses in a certain industry sector during a given time period, and the average number of elks per square mile.
(d) In order to get observational access to the population, a frame is needed, i.e., an operational representation, such as a list of the population objects or a map of the population. Examples of frames are business and population registers; maps where the land has been divided into areas with strictly defined boundaries; or all n-digit numbers, which can be used to link telephone numbers to individuals. Sometimes the frame has to be developed for the occasion because there are no registers available and the elements have to be listed. For general populations this is done by combining multistage sampling with a listing procedure in which the survey field staff list all elements in sampled areas only. Other alternatives would be too costly. For special populations, for example, the population of professional baseball players in the USA, one would have to combine all club rosters into one frame. In some surveys there might exist a number of frames covering the population to varying extents. For this situation a multiple frame theory has been developed (see Hartley 1974).
(e) A sample of sampling units is selected from the frame in accordance with a sampling design, which specifies a probability mechanism and a sample size. There are numerous sample designs (see Sample Surveys: Methods) developed for different survey situations.
The situation may be such that the design chosen solves a problem (using multistage sampling when not all population elements can be listed, or when interviewer and travel costs prevent the use of simple random sampling of elements) or takes advantage of the circumstances (using systematic sampling if the population is approximately ordered, or using stratified sampling if the population is skewed). Every sample design specifies selection probabilities and a sample size. It is imperative that selection probabilities are known, or else the design is nonmeasurable.
(f) Observations are made on the sample in accordance with a measurement design, i.e., a measurement method and a prescription as to its use. This phase is called data collection. There are at least five different main modes of data collection: face-to-face interviewing, telephone interviewing, self-administered questionnaires and diaries, administrative records, and direct observation. Each of these modes can be conducted using different levels of technology. Early attempts using the computer took place in the
1970s, in telephone interviewing. The questionnaire was stored in a computer and a computer program guided the interviewer throughout the interview by automatically presenting questions on the screen and taking care of some interviewer tasks such as keeping track of skip patterns and personalizing the interview. This technology is called CATI (Computer Assisted Telephone Interviewing). Current levels of technology for the other modes include the use of portable computers for face-to-face interviewing, touch-tone data entry using the telephone keypad, automatic speech recognition, satellite images of land use and crop yields, ‘people meters’ for TV viewing behaviors, barcode scanning in diary surveys of purchases, electronic exchange of administrative records, and the Internet. Summaries of these developments are provided in Lyberg and Kasprzyk (1991), DeLeeuw and Collins (1997), Couper et al. (1998), and Dillman (2000). Associated with each mode is the survey measurement instrument or questionnaire. The questionnaire is the result of a conceptualization of research objectives, i.e., a set of properly worded and properly ordered questions. The design of the questionnaire is a science of its own. See, for example, Tanur (1992) and Sudman et al. (1996).
(g) Based on the measurements, an estimation design is applied to compute estimates of the parameters when making inference from the sample to the population. Associated with each sampling design are one or more estimators that are functions of the data that have been collected to make statements about the population parameters. Sometimes estimators rely solely on sample data, but on other occasions auxiliary information is part of the function. All estimators include sample weights that are used to inflate the sample data. To calculate the error of an estimate, variance estimators are formed, which makes it possible to calculate standard errors and eventually confidence intervals. See Cochran (1977) and Särndal et al. (1992) for comprehensive reviews of the sampling theory.
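Steps (e) and (g) of the definition above can be made concrete with the simplest case, simple random sampling without replacement, where every unit has selection probability n/N and therefore sample weight N/n. The sketch below estimates a population total, its standard error, and an approximate 95 percent confidence interval using the standard textbook formulas for this design. The function name and the toy values are assumptions for illustration, and other designs require other weights and variance estimators.

```python
import math


def srs_total_estimate(sample_values, population_size, z=1.96):
    """Expansion estimate of a population total under simple random
    sampling without replacement, with standard error and a normal
    approximation confidence interval.
    """
    n = len(sample_values)
    N = population_size
    weight = N / n  # inverse of the selection probability n/N
    total_hat = weight * sum(sample_values)

    sample_mean = sum(sample_values) / n
    s2 = sum((y - sample_mean) ** 2 for y in sample_values) / (n - 1)
    # Variance estimator includes the finite population correction (1 - n/N)
    var_hat = N ** 2 * (1 - n / N) * s2 / n
    se = math.sqrt(var_hat)
    return total_hat, se, (total_hat - z * se, total_hat + z * se)


# Toy data: revenues observed for 4 sampled businesses out of N = 100.
total_hat, se, ci = srs_total_estimate([2, 4, 6, 8], population_size=100)
print(f"estimated total = {total_hat:.1f}, SE = {se:.1f}, "
      f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

The weight N/n is what the text calls inflating the sample data: each sampled unit stands in for N/n population units. If the selection probabilities were unknown, neither the weight nor the variance estimator could be written down, which is why the design would be nonmeasurable.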
2. The Status of Survey Research
There are many types of surveys and survey populations that fit this definition. A large number of surveys are one-time surveys aiming at measuring attitudes or other population behaviors. Some surveys are continuing, thereby allowing the estimation of change over time. An example of this is a monthly labor force survey. Typically such a survey uses a rotating design where a sampled person is interviewed a number of times. For example, a person may participate 4 months in a row, be rotated out of the sample for the next 4 months, and then rotate back for a final 4 months. Other surveys aim at comparing different populations regarding a certain characteristic, such as the literacy level in different countries. Business
surveys often study populations where there are a small number of large businesses and many smaller ones. In the case where the survey goal is to estimate a total, it might be worthwhile to deliberately cut off the smallest businesses from the frame or select all large businesses with a probability of one and the smaller ones with other probabilities.
Surveys are conducted by many different organizations. There are national statistical offices producing official statistics, there are university-based organizations conducting surveys as part of education, and there are private organizations conducting surveys on anything ranging from official statistics to marketing. The survey industry employs more than 130,000 people in the USA alone, and the world figure is of course much larger. Survey results are very important to society. Governments get continuing information on parameters like unemployment, national accounts, education, environment, consumer price indexes, etc. Other sponsors get information on, e.g., political party preferences, consumer satisfaction, child day-care needs, time use, and consumer product preferences.
As pointed out by Groves (1989), the field of survey sampling has evolved through somewhat independent and uncoordinated contributions from many disciplines including statistics, sociology, psychology, communication, education, and marketing research. Representatives of these disciplines have varying backgrounds and as a consequence tend to emphasize different design aspects. However, during the last couple of decades, survey research groups have come to collaborate more, as manifested by, for instance, edited volumes such as Groves et al. (1988), Biemer et al. (1991), Lyberg et al. (1997), and Couper et al. (1998). This teamwork development will most likely continue.
Many of the error structures resulting from specific sources must be dealt with by multidisciplinary teams since the errors stem from problems concerning sampling, recall, survey participation, interviewer practices, question comprehension, and conceptualization. (See Sample Surveys: Cognitive Aspects of Survey Design.)
The justification for sampling (rather than surveying the entire population, a total enumeration) is lower costs but also greater efficiency. Sampling is faster and less expensive compared to total enumeration. Perhaps more surprisingly, sampling often allows a more precise measurement of each sampled unit than that possible in a total enumeration. This often leads to sample surveys having quality features that are superior to those of total enumerations.
Sampling as an intuitive tool has probably been used for centuries, but the development of a theory of survey sampling did not start until the late 1800s. Main contributors to this early development, frequently referred to as ‘the representative method,’ were Kiaer (1897), Bowley (1913, 1926), and Tschuprow (1923). Apart from various inferential aspects they discussed issues such as stratified sampling, optimum allocation to strata, multistage sampling, and frame construction. In the 1930s and the 1940s most of the basic methods that are used today were developed. Fisher’s randomization principle was applied to sample surveys and Neyman (1934, 1938) introduced the theory of confidence intervals, cluster sampling, ratio estimation, and two-phase sampling.
The US Bureau of the Census was perhaps the first national statistical office to embrace and further develop the theoretical ideas suggested. For example, Morris Hansen and William Hurwitz (1943, 1949) and Hansen et al. (1953) helped place the US Labor Force Survey on a full probability-sampling basis and they also led innovative work on variance estimation and the development of a survey model decomposing the total survey mean squared error into various sampling and bias components. Other important contributions during that era include systematic sampling (Madow and Madow 1944), regression estimation (Cochran 1942), interpenetrating samples (Mahalanobis 1946), and master samples (Dalenius 1957). More recent efforts have concentrated on allocating resources to the control of various sources of error, i.e., methods for total survey design, taking not only sampling but also nonsampling errors into account. A more comprehensive review of historical aspects is provided in Sample Surveys, History of.
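The rotating design described earlier in this section for a monthly labor force survey (4 months in the sample, 4 months out, 4 months back in) determines how much of the sample is shared between any two months, which is what makes estimates of change precise. The sketch below works that out for integer month indices. The function names are hypothetical, and the assumption that one new rotation group enters every month is added here for illustration.

```python
def interview_months(entry_month, months_in=4, months_out=4, months_back=4):
    """Months in which a rotation group entering at `entry_month` is interviewed."""
    first_spell = list(range(entry_month, entry_month + months_in))
    restart = entry_month + months_in + months_out
    second_spell = list(range(restart, restart + months_back))
    return first_spell + second_spell


def groups_in_sample(month, months_in=4, months_out=4, months_back=4):
    """Entry months of all rotation groups interviewed in `month`,
    assuming one new group enters every month."""
    span = months_in + months_out + months_back
    return [
        entry
        for entry in range(month - span + 1, month + 1)
        if month in interview_months(entry, months_in, months_out, months_back)
    ]


def overlap(month_a, month_b):
    """Share of the groups interviewed in month_a that are also interviewed in month_b."""
    a = set(groups_in_sample(month_a))
    b = set(groups_in_sample(month_b))
    return len(a & b) / len(a)


print(interview_months(1))   # months of interview for a group entering in month 1
print(groups_in_sample(20))  # groups contributing to month 20
print(overlap(20, 21))       # overlap between consecutive months
print(overlap(20, 28))       # overlap between months 8 apart
```

Under these assumptions eight groups are interviewed each month, three quarters of them are shared between consecutive months, and half are shared between months eight apart.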
3. The Use of Models
While early developments focused on methods for sample selection in different situations and proper estimation methods, later developments have to a large extent focused on theoretical foundations and the use of probability models for increasing the efficiency of the estimators. There has been a development from implicit modeling to explicit modeling. The model traditionally used in the early theory is based on the view that what is observed for a unit in the population is basically a fixed value. This approach may be called the ‘fixed population approach.’ The stochastic nature of the estimators is a consequence of the deliberately introduced randomization among the population units. A specific feature of survey sampling is the existence of auxiliary information, i.e., known values of a concomitant variable, which is in some sense related to the variable under study, so that it can be used to improve the precision of the estimators. The relationship between the variable under study and the auxiliary variables is often expressed as a linear regression model, which can often be interpreted as expressing the belief (common or the sampler’s own) concerning the structure of the relationship between the variables. Such modeling is used extensively in early textbooks (see Cochran 1953). A somewhat different approach is to view the values of the variables
as realizations of random variables using probability models. In combination with the randomization of the units this constitutes what is called the superpopulation approach. Model-based inference is used to draw conclusions based solely on properties of the probability models, ignoring the randomization of the units. Design-based inference, on the other hand, ignores the mechanism that generated the data and concentrates on the randomization of the units. In general, model-based inference for estimating population parameters like means of subgroups can be very precise if the model is true but may introduce biased estimates if the model is false, while design-based inference leads to unbiased, but possibly less precise, estimates of the population parameters. Model-assisted inference is a compromise that aims at utilizing models in such a way that, if the model is true the precision is high, but if the model is false the precision will be no worse than if no model had been used. (See Sample Surveys: Model-based Approaches.)
As an example, suppose we want to study a population of families in a country. We want to analyze the structure of disposable income for the households and find out the relation between factors like age, sex, education, the number of household members, and the disposable income for a family. A possible model of the data generating process could be that the disposable income is a linear function of these background variables. There is also an element of unexplained variation between families having the same values of the background variables. Also, the income will fluctuate from year to year depending on external variation in society. All this shows that the data generating process could be represented by a probability model where the disposable income is a linear function of background variables and random errors over time and between families.
The superpopulation model would be the set of models describing how the disposable income is generated for the families. For inferential purposes, a sample of families is selected. Different types of inference can be considered. For instance, we might be interested in giving a picture of the actual distribution of the disposable income in the population at the specific time when we selected the sample, or we might be interested in estimating the coefficients of the relational model, either because we are genuinely interested in the model itself, e.g., for prediction of a future total disposable income for the population, which would be of interest for sociologists, economists, and decision makers, or for using the model as a tool for creating more efficient estimators of the fixed distribution, given for example that the distribution of sex and age is known with reasonable accuracy in the population and can be used as auxiliary information. Evidently, the results would depend on the constellation of families comprising our sample. If we use a sample design that over-represents the proportion of large households or young households with small
children, compared to the population, the inference based on the sample can be misleading. Model-based inference ignores the sample selection procedure and assumes that the inference conditional on the sample is a good representation of what would have been the case if all families had been surveyed. Design-based inference ignores the data generation process and concentrates on the artificial randomization induced by the sampling procedure. Model-assisted inference uses models as tools for creating more precise estimates.
Broadly speaking, model-based inference is mostly used in the case when the relational model is of primary interest. This is the traditional way of analyzing sample data as given in textbooks in statistical theory. Design-based inference, on the other hand, is the traditional way of treating sample data in survey sampling. It is mainly focused on giving a picture of the present state of the population. Model-assisted inference uses models as tools for selecting estimators, but relies on design properties. It too is mainly focused on picturing the present state in the population. Modern textbooks such as Cassel et al. (1977) and Särndal et al. (1992) discuss the foundations of survey sampling and make extensive use of auxiliary information in the survey design.
The different approaches mentioned above have their advocates, but most of the surveys conducted around the world still rely heavily on design-based approaches with implicit modeling. But models are needed to take nonsampling errors into account since we do not know exactly how such errors are generated. To make measurement errors part of the inference procedure, one has to make assumptions about the error structures. Such error structures concern cognitive issues, question wording and perception, interviewer effects, recall errors, untruthful answers, coding, editing, and so on.
Similarly, to make errors of nonobservation (frame coverage and nonresponse errors) part of the inference procedure, one needs to model the mechanisms that generate these errors. The compromise called model-assisted inference takes advantage of both design-based and model-based features.
Analysis of data from complex surveys denotes the situation that occurs when the survey statistician is trying to estimate the parameters of a model used for description of a random phenomenon, for example econometric or sociological models such as time series models, regression models, or structural equation models. It is assumed that the data available are sample survey data that have been generated by some sampling mechanism that does not support the assumption of independent identically distributed (IID) observations on a random variable. The traditional inference developed for the estimation of parameters of the model (and not for estimating the population parameters) presupposes that IID is at hand. In some cases, traditional inference based on, e.g., maximum likelihood gives misleading results. Comprehensive reviews of analysis of data from complex surveys are
provided by Skinner et al. (1989) and Lehtonen and Pahkinen (1995).
The present state of affairs is that there is a relatively well-developed sampling theory. The theory of nonsampling errors is still in its infancy, however. A typical scenario is that survey methodologists try to reduce potential errors by using, for example, cognitively tested questionnaires and various means to stimulate survey participation, and these things are done to the extent available resources permit. However, not all nonsampling error sources are known and some that are known defy expression. The error reduction strategy can be complemented by sophisticated modeling of error structures. Unfortunately, a rather common implicit modeling seems to be that nonsampling errors have no serious effect on estimates. In some applications, attempts are made to estimate the total error or error components by evaluation techniques, i.e., for a subsample of the units, the survey is replicated using expensive ‘gold standard’ methods and the differences between the preferred measurements and the regular ones are used as estimates of the total errors. This is an expensive and time-consuming procedure that is not very suitable for long-range improvements. A more modern and realistic approach is to develop reliable and predictable (stable) survey processes that can be continuously improved (Morganstein and Marker 1997).
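The contrast drawn in this section between a purely design-based estimate and one that uses a model and auxiliary information can be illustrated with the family income example. The sketch below compares the plain sample mean with the classical regression estimator, which adjusts the sample mean using a known population mean of an auxiliary variable (here household size). All names and the simulated population are assumptions for illustration, and the regression estimator shown is only one simple member of the model-assisted family.

```python
import random
from statistics import mean


def expansion_mean(sample_y):
    """Design-based estimate of the population mean under simple random sampling."""
    return mean(sample_y)


def regression_mean(sample_x, sample_y, population_x_mean):
    """Classical regression estimator: the sample mean of y adjusted by the
    fitted slope times the gap between the known population mean of x and
    the sample mean of x."""
    x_bar, y_bar = mean(sample_x), mean(sample_y)
    sxx = sum((x - x_bar) ** 2 for x in sample_x)
    if sxx == 0:  # no variation in x, so no adjustment is possible
        return y_bar
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sample_x, sample_y))
    slope = sxy / sxx
    return y_bar + slope * (population_x_mean - x_bar)


# Simulated population (hypothetical): income is linear in household size plus noise.
rng = random.Random(0)
population = []
for _ in range(2000):
    size = rng.randint(1, 6)
    income = 20.0 + 10.0 * size + rng.gauss(0.0, 5.0)
    population.append((size, income))

pop_x_mean = mean([x for x, _ in population])
true_mean = mean([y for _, y in population])

sample = rng.sample(population, 50)  # simple random sample without replacement
xs = [x for x, _ in sample]
ys = [y for _, y in sample]

print("true mean:        ", round(true_mean, 2))
print("sample mean:      ", round(expansion_mean(ys), 2))
print("regression est.:  ", round(regression_mean(xs, ys, pop_x_mean), 2))
```

When income really is close to linear in household size, the adjustment removes most of the error caused by a sample that happens to contain too many large or small households. When the linear model is poor, the fitted slope is near zero and the estimator falls back toward the plain sample mean, which is the behavior the text attributes to model-assisted inference.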
4. Conclusions Obviously there are a number of future challenges in the field of survey sampling. We will provide just a few examples:
(a) Many surveys are conducted in a primitive way because of limited funding and know-how. The development of more efficient designs taking nonsampling errors into account at the estimation stage is needed. There is also a need for strategies that can help allocate resources to various design stages so that total errors are minimized. Sometimes those in charge of surveys concentrate their efforts on the most visible error sources or where there is a tool available. For instance, most survey sponsors know that nonresponse might be harmful. The indicator of nonresponse error, the nonresponse rate, is both simple and visible. Therefore it might be tempting to put most resources into this error source. On the other hand, not many users are aware of the cognitive phenomena that affect the response delivery mechanism. Perhaps from a total error point of view more resources should be spent on questionnaire design.
(b) Modern technology permits simultaneous use of multiple data collection modes within a survey. Multiple modes are used to accommodate respondents, to increase response rates, and to allow inexpensive data collection when possible. There are, however, mode effects, and there is a need for calibration techniques that can adjust the measurements or the collection instruments so that the mode effect vanishes.
(c) International surveys are becoming increasingly important. Most of the methodological problems mentioned are magnified under such circumstances. Especially interesting is the concept of cultural bias. Cultural bias means that concepts and procedures are not uniformly understood, interpreted, and applied across geographical regions or ethnic subpopulations. To define and measure the impact of such bias is an important challenge.
See also: Databases, Core: Demography and Registers; Databases, Core: Political Science and Political Behavior; Databases, Core: Sociology; Microdatabases: Economic; Survey Research: National Centers
Bibliography
Biemer P, Groves R, Lyberg L, Mathiowetz N, Sudman S 1991 Measurement Errors in Surveys. Wiley, New York
Bowley A L 1913 Working-class households in Reading. Journal of the Royal Statistical Society 76: 672–701
Bowley A L 1926 Measurement of the precision attained in sampling. Proceedings of the International Statistical Institute XII: 6–62
Cassel C-M, Särndal C-E, Wretman J 1977 Foundations of Inference in Survey Sampling. Wiley, New York
Cochran W G 1942 Sampling theory when the sampling-units are of unequal sizes. Journal of the American Statistical Association 37: 199–212
Cochran W G 1953 Sampling Techniques, 1st edn. Wiley, New York
Cochran W G 1977 Sampling Techniques, 3rd edn. Wiley, New York
Couper M, Baker R, Bethlehem J, Clark C, Martin J, Nicholls W, O’Reilly J 1998 Computer Assisted Survey Information Collection. Wiley, New York
Dalenius T 1957 Sampling in Sweden. Almqvist and Wiksell, Stockholm, Sweden
Dalenius T 1985 Elements of Survey Sampling. Notes prepared for the Swedish Agency for Research Cooperation with Developing Countries (SAREC)
DeLeeuw E, Collins M 1997 Data collection methods and survey quality: An overview. In: Lyberg L, Biemer P, Collins M, DeLeeuw E, Dippo C, Schwarz N, Trewin D (eds.) Survey Measurement and Process Quality. Wiley, New York
Dillman D 2000 Mail and Internet Surveys: The Tailored Design Method, 2nd edn. Wiley, New York
Groves R 1989 Survey Errors and Survey Costs. Wiley, New York
Groves R, Biemer P, Lyberg L, Massey J, Waksberg J (eds.) 1988 Telephone Survey Methodology. Wiley, New York
Hansen M H, Hurwitz W N 1943 On the theory of sampling from finite populations. Annals of Mathematical Statistics 14: 333–62
Hansen M H, Hurwitz W N 1949 On the determination of optimum probabilities in sampling. Annals of Mathematical Statistics 20: 426–32
Hansen M H, Hurwitz W N, Madow W G 1953 Sample Survey Methods and Theory (I and II). Wiley, New York
Hartley H O 1974 Multiple methodology and selected applications. Sankhya, Series C 36: 99–118
Kiaer A N 1897 The representative method for statistical surveys (original in Norwegian). Kristiania Videnskabsselskabets Skrifter. Historisk-filosofiske klasse 4: 37–56
Lehtonen R, Pahkinen E J 1995 Practical Methods for Design and Analysis of Complex Surveys. Wiley, New York
Lyberg L, Biemer P, Collins M, DeLeeuw E, Dippo C, Schwarz N, Trewin D (eds.) 1997 Survey Measurement and Process Quality. Wiley, New York
Lyberg L, Kasprzyk D 1991 Data collection methods and measurement error: An overview. In: Biemer P, Groves R, Lyberg L, Mathiowetz N, Sudman S (eds.) Measurement Errors in Surveys. Wiley, New York
Madow W G, Madow L H 1944 On the theory of systematic sampling, I. Annals of Mathematical Statistics 15: 1–24
Mahalanobis P C 1946 On large-scale sample surveys. Philosophical Transactions of the Royal Society London, Series B 231: 329–451
Morganstein D, Marker D 1997 Continuous Quality Improvement in Statistical Agencies. In: Lyberg L, Biemer P, Collins M, DeLeeuw E, Dippo C, Schwarz N, Trewin D (eds.) Survey Measurement and Process Quality. Wiley, New York
Neyman J 1934 On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97: 558–625
Neyman J 1938 Contribution to the theory of sampling human populations. Journal of the American Statistical Association 33: 101–16
Särndal C-E, Swensson B, Wretman J 1992 Model Assisted Survey Sampling. Springer-Verlag, New York
Skinner C, Holt D, Smith T M F (eds.) 1989 Analysis of Complex Surveys. Wiley, New York
Sudman S, Bradburn N, Schwarz N 1996 Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. Jossey-Bass, San Francisco, CA
Tanur J (ed.) 1992 Questions About Questions. Russell Sage, New York
Tschuprow A A 1923 On the mathematical expectation of the moments of frequency distributions in the case of correlated observation. Metron 2: 461–93; 646–80
L. Lyberg and C. M. Cassel
Sanctions in Political Science
A sanction is an action by one actor (A) intended to affect the behavior of another actor (B) by enhancing or reducing the values available to B. Influence attempts by A using actual or threatened punishments of B are instances of negative sanctions. Influence attempts by A using actual or promised rewards to B are instances of positive sanctions. Not all influence attempts involve sanctions. Actor A may influence actor B by reason, example, or the provision of information without the use of sanctions.
1. Concepts Although the definitions of positive and negative sanctions may appear simple, there are both conceptual and empirical difficulties in distinguishing between the two. Some things take the form of positive sanctions, but actually are not; e.g., giving a bonus of $100 to an employee who expected a bonus of $500, or promising not to kill a person who never expected to be killed in the first place. Likewise, some things take the form of negative sanctions, but actually are not; e.g., a threat to reduce the salary of a person who expected to be fired or the beating of a masochist. Is withholding a reward ever a punishment? Always a punishment? Is withholding a punishment ever a reward? Always a reward? The answers depend on Actor B’s perception of the situation. In order to distinguish rewards from punishments one must establish B’s baseline of expectations at the moment A’s influence attempt begins (Blau 1986). This baseline is defined in terms of B’s expected future value position, i.e., expectations about B’s future position relative to the things B values. Positive sanctions, then, are actual or promised improvements in B’s value position relative to B’s baseline of expectations, and negative sanctions are actual or threatened deprivations relative to the same baseline.
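The baseline rule can be made concrete with a small sketch. It assumes, purely for illustration, that B's value position can be expressed as a single number; the class name, the function name, and the dollar figures are hypothetical and are not drawn from the cited works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfluenceAttempt:
    """A's actual or conditional action, seen from B's point of view."""
    baseline: float  # B's expected future value position when the attempt begins
    outcome: float   # B's value position if A carries out the action


def classify_sanction(attempt: InfluenceAttempt) -> str:
    """Classify relative to B's baseline of expectations, not by outward form."""
    change = attempt.outcome - attempt.baseline
    if change > 0:
        return "positive"  # improvement relative to what B expected
    if change < 0:
        return "negative"  # deprivation relative to what B expected
    return "none"          # no change in B's expected value position


# A $100 bonus for an employee who expected $500 looks like a reward but is not one.
small_bonus = InfluenceAttempt(baseline=500.0, outcome=100.0)
# A reduced salary (hypothetically $30,000) for someone who expected to be fired.
salary_cut = InfluenceAttempt(baseline=0.0, outcome=30000.0)

print(classify_sanction(small_bonus))
print(classify_sanction(salary_cut))
```

The sketch classifies the same outward act differently depending on the baseline, which is why the answers to the questions above turn on B's perception of the situation and not on A's intentions.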
2. Historical Context Although references to both positive and negative sanctions can be traced to the seventeenth century, the most common usage of the term ‘sanctions’ refers to negative sanctions. Until the mid-twentieth century, sanctions were viewed primarily as mechanisms for enforcing societal norms, including those embodied in laws. Although this normative view of sanctions continues in legal theory, ethics, political theory, and sociology (Barry 1995, Coleman 1990, Waldron 1994), a broader usage has emerged in political science during the latter part of the twentieth century. This broader usage depicts sanctions as potentially relevant to any type of influence attempt, regardless of whether it is aimed at enforcing social norms or not. This usage is typically found in discussions of influence and power (Baldwin 1971, 1989, Blau 1986, Oppenheim 1981, Lasswell and Kaplan 1950).
3. Kinds of Sanctions Since sanctions are defined in terms of the values of the target of an influence attempt (actor B), they may take a variety of forms. Lasswell and Kaplan (1950) identified eight exemplary values that could serve as the basis for positive or negative sanctions: power, wealth, respect, rectitude, physical well-being, affection, skill, and enlightenment.
Although it is sometimes suggested that positive and negative sanctions are opposites in the sense that generalizations about one type are equally applicable to the other, mutatis mutandis, this is often not true. Listed below are some of the hypothesized differences between positive and negative sanctions.
3.1 A’s Burden of Response When A’s influence attempt is based on a promise, B’s compliance obligates A to respond with a reward; whereas B’s failure to comply calls for no further response from A.
3.2 Role of Costs One important consequence of the asymmetry between positive and negative sanctions is that promises tend to cost more when they succeed, while threats tend to cost more when they fail (Boulding 1989, Parsons 1963, Schelling 1960). The difference can be summarized as follows: The bigger the threat, the higher the probability of success; the higher the probability of success, the less the probability of having to implement the threat; the less the probability of having to implement the threat, the cheaper it is to make big threats. The bigger the promise, the higher the probability of success; the higher the probability of success, the higher the probability of having to implement the promise; the higher the probability of having to implement the promise, the more expensive it is to make big promises (ceteris paribus).
3.3 Indicators of Success Whereas a successful threat requires no action by A, a successful promise obligates A to implement that sanction. In a well-integrated social system, where the probability that B will comply with A’s wishes is relatively high, promises will be more visible than threats. Indeed, it is precisely because threats are so successful in domestic politics that they are so difficult to detect. Since most citizens obey most laws most of the time, threats to punish lawbreakers have to be carried out with respect to only a small minority of citizens in most polities.
3.4 Efficacy The question of the relative efficacy of positive versus negative sanctions in exercising influence has been much debated. Parsons (1963) has argued that negative sanctions have more intrinsic effectiveness than positive sanctions in deterrence situations. And ever since Machiavelli and Hobbes, there have been those who argue that force is the ultimate influence technique. Identifying the conditions under which one type of sanction is more effective than another is likely to continue as a focus of research in political science as well as in other social sciences.
3.5 Legitimation It is usually easier to legitimize demands based on positive sanctions than demands based on negative ones. An example is provided by one of the most important social institutions in the world—the institution of private property. Most societies have one set of rules specifying the conditions under which a person may be deprived of property and quite a different set of rules specifying conditions under which one person may augment another person’s property.
4. Foreign Policy Sanctions Political scientists interested in international relations and foreign policy have devoted a great deal of research to sanctions during the last two decades of the twentieth century. Most of this research has focused on economic sanctions, with comparatively little attention devoted to military, diplomatic, or other noneconomic forms of sanctions. Although progress has occurred with respect to systematic gathering of empirical data (Hufbauer et al. 1990), refinement of concepts (Baldwin 1985), and development of theories, social scientific research on the role of sanctions in international relations is still in its infancy.
4.1 Methodological Problems Five methodological problems provide useful foci for future research:
(a) Lack of agreement on terms and concepts. While some scholars conceive of economic sanctions in terms of mechanisms for making any type of influence attempt (Baldwin 1985), others define economic sanctions in terms of particular policy goals (Pape 1997). This lack of agreement on basic concepts not only impedes debate within the field of foreign policy studies, it is an obstacle to cross-fertilization between the study of economic sanctions in foreign policy and discussions of other kinds of sanctions in other areas of social science.
(b) Preoccupation with economic sanctions. Research on economic sanctions has been largely insulated from research on other techniques of statecraft. This insulation sometimes gives the impression that economic sanctions are sui generis, rather than treating them as a particular type of statecraft. Although actual or threatened military force is a military sanction, it too is usually viewed as a unique focus of
inquiry. In order to understand when and why policy makers use economic sanctions, it is necessary to understand the alternatives among which policy makers choose. These alternatives include diplomatic, military, and/or verbal sanctions (propaganda). Future research on sanctions might usefully focus on comparative studies of alternative types of sanctions and the conditions under which each type is most likely to be cost-effective.
(c) Neglect of positive sanctions. Research on sanctions is heavily skewed toward negative sanctions. This is true not only of research by international relations scholars, but also of political science in general. During the last two decades, research on positive sanctions has increased (Davis 2000, Newnham 2000), but the emphasis is still disproportionately on negative sanctions.
(d) Lack of agreement on criteria of success. Scholars disagree as to which criteria should be used in measuring the success of economic sanctions. Some consider only the effectiveness of the influence attempt in achieving its goals (Morgan and Schwebach 1997, Pape 1997). Others argue that estimates of success should take into consideration the costs of the sanctions to the user, the costs for noncompliance inflicted on the target, the difficulty of the undertaking, and the comparative utility of alternative policy options (Baldwin 1985, 2000). This lack of agreement on how to conceive of success is not peculiar to the international relations literature. Unlike economics, political science lacks a standardized measure of value in terms of which success can be measured. There is much less agreement among political scientists as to what constitutes political success than there is among economists as to what constitutes economic success (Baldwin 1989). Developing a set of agreed-upon criteria of success applicable to noneconomic, as well as economic, sanctions would be a valuable step in sanctions research.
(e) Premature generalization. Research on economic sanctions by international relations scholars frequently leads to premature attempts to specify policy implications. Some scholars, for example, imply that only foolish policy makers would use such policy instruments (Morgan and Schwebach 1997, Tsebelis 1990). In order to make judgements as to the wisdom of using economic sanctions in a given situation, however, one would need to know how effective such sanctions were likely to be, with respect to which goals and targets, at what cost, and in comparison with which policy alternatives (Baldwin 2000). Current research on economic sanctions rarely asks such questions.
See also: Control: Social; Deterrence; Diplomacy; Efficacy: Political; Foreign Policy Analysis; Game Theory; Law: Economics of its Public Enforcement; Legal Culture and Legal Consciousness; Legitimacy: Political; National Security Studies and War Potential of Nations; Norms; Power: Political; Punishment, Comparative Politics of; Punishment: Social and Legal Aspects; Utilitarianism: Contemporary Applications
Bibliography
Baldwin D A 1971 The power of positive sanctions. World Politics 24: 19–38
Baldwin D A 1985 Economic Statecraft. Princeton University Press, Princeton, NJ
Baldwin D A 1989 Paradoxes of Power. Basil Blackwell, New York
Baldwin D A 2000 The sanctions debate and the logic of choice. International Security 24: 80–107
Barry B 1995 Justice as Impartiality. Clarendon Press, Oxford, UK
Blau P M 1986 Exchange and Power in Social Life. Transaction, New Brunswick, NJ
Boulding K E 1989 Three Faces of Power. Sage, Newbury, CA
Coleman J S 1990 Foundations of Social Theory. Belknap, Cambridge, MA
Davis J W 2000 Threats and Promises: The Pursuit of International Influence. The Johns Hopkins University Press, Baltimore, MD
Hufbauer G C, Schott J J, Elliott K A 1990 Economic Sanctions Reconsidered, 2nd edn. Institute for International Economics, Washington, DC
Lasswell H D, Kaplan A 1950 Power and Society: A Framework for Political Inquiry. Yale University Press, New Haven, CT
Morgan T C, Schwebach V L 1997 Fools suffer gladly: The use of economic sanctions in international crises. International Studies Quarterly 41: 27–50
Newnham R E 2000 More flies with honey: Positive economic linkage in German Ostpolitik from Bismarck to Kohl. International Studies Quarterly 44: 73–96
Oppenheim F E 1981 Political Concepts: A Reconstruction. University of Chicago Press, Chicago
Pape R A 1997 Why economic sanctions do not work. International Security 22: 90–136
Parsons T 1963 On the concept of political power. Proceedings of the American Philosophical Society 107: 232–62
Schelling T C 1960 The Strategy of Conflict. Harvard University Press, Cambridge, MA
Tsebelis G 1990 Are sanctions effective? A game-theoretic analysis. Journal of Conflict Resolution 34: 3–28
Waldron J 1994 Kagan on requirements: Mill on sanctions. Ethics 104: 310–24
D. A. Baldwin
Sapir, Edward (1884–1939)
1. General Career Edward Sapir was born in Lauenburg, Germany (now Lębork, Poland) on January 26, 1884, but his family emigrated to the United States when he was five years old and eventually settled in New York City. His
brilliance earned him a full scholarship to the prestigious Horace Mann School and subsequently a Pulitzer fellowship to Columbia College, where he received his BA in 1904. He continued with graduate work in Germanic philology, but was soon drawn into Franz Boas’s orbit and took up an anthropological career, for which he proved extraordinarily well fitted. In 1905 he began a series of field visits to the West that resulted in the detailed documentation of several American Indian languages and cultures, including Wishram Chinook, Takelma, Yana, Southern Paiute, and Nootka. The speed and accuracy with which Sapir collected linguistic and ethnographic data have probably never been surpassed, and his notes, even from his earliest field trips, are among the most valuable manuscripts in American Indian studies. After receiving his doctorate from Columbia in 1909 with a dissertation on the Takelma language, Sapir taught briefly at the University of Pennsylvania, and in 1910 became Chief of the Division of Anthropology in the Geological Survey of Canada. He remained in this position until 1925, making the Nootka (Nuu-chah-nulth) language the focus of his research from 1910 to 1914. Between 1915 and 1920, when World War I and its aftermath brought field work to a halt, he devoted a considerable amount of time to the comparative linguistics of North American languages, establishing the Na-Dene relationship between Athabaskan, Tlingit, and Haida, and proposing expansions of the Penutian and Hokan stocks to include a large number of languages in North and Central America. During this period he also began making a name for himself as a literary and social commentator, and began publishing his experimental poetry. Sapir married Florence Delson in 1911, and they had three children. Shortly after the birth of their third child in 1918, Florence Sapir began to manifest signs of serious illness, partly psychological in character. 
Although his wife’s deteriorating health became a great concern to Sapir, and her death in 1924 was emotionally devastating, he remained committed to the intensive field documentation of American Indian languages, and in 1922 commenced an Athabaskan research program that eventually encompassed full-scale descriptive studies of Sarsi, Kutchin (Gwich’in), Hupa, and Navajo. This work was motivated partly by Sapir’s conviction that a historical relationship could be demonstrated between the Na-Dene languages of North America and the Sino-Tibetan languages of East Asia. After the publication of his highly influential book Language (1921) Sapir was regarded widely as one of the leading linguists of his generation. In 1925 he accepted Fay-Cooper Cole’s offer of an academic position in the newly reorganized Department of Anthropology and Sociology at the University of Chicago. The move from Ottawa, and his second marriage to Jean McClenaghan, with whom he was to
have two more children, was an emotional and intellectual watershed. He found the interdisciplinary atmosphere at Chicago stimulating and increasingly he addressed general and theoretical topics in his professional writing. He also began contributing sprightly and provocative essays on general social and cultural topics to such publications as Mencken’s American Mercury. While much of his teaching was in linguistics and he took a leading role in establishing that discipline as an autonomous field of study, Sapir also became involved in developing a general model for social science. After he moved to Yale in 1931 as Sterling Professor of Anthropology and Linguistics he became interested particularly in a psychologically realistic paradigm for social research (see Irvine 1999) and led a seminar on personality and culture. Meanwhile linguistic work on Athabaskan, specifically Navajo, continued to absorb him, and the prospect of finding remote connections of the Na-Dene languages in Asia had led him to the study of Tibetan and Chinese. This extraordinarily diverse agenda was abruptly suspended in the summer of 1937 when Sapir suffered a serious heart attack, from which he never fully recovered. After a year and a half of precarious health and restricted activity he died in New Haven, Connecticut on February 4, 1939, a few days after his 55th birthday.
2. Scientific Contribution Sapir’s scientific work can be divided into three distinct parts. First, there is his substantive work in descriptive and comparative linguistics, almost entirely devoted to North American Indian languages. Second, there is his role in establishing the paradigm for twentieth century linguistic research. Finally, there are the flashes of insight—seldom elaborated into formal hypotheses—with which Sapir from time to time illuminated the landscape of linguistic and social theory.
2.1 Substantive Work on Languages The face that American Indian linguistics presents to the world at the beginning of the twenty-first century probably owes more to Sapir than to any other scholar. When Sapir took up the anthropological study of American Indian languages in 1905, the field was dominated by the classificatory concerns of John Wesley Powell’s Bureau of American Ethnology, which saw as its principal task the identification of language and dialect boundaries, and the grouping of languages into families whose historical relationship was undoubted. Only a few earlier scholars like Pickering and Humboldt had been interested in the
general philological study of these languages. It was Franz Boas, Sapir’s mentor at Columbia, who first proposed making comparative investigation of the grammatical structures of American Indian languages (and of non-European languages generally) a topic for sustained scientific research. Drawing his model from a German tradition of associating language and social behavior that led back through Steinthal to Humboldt, Boas impressed upon his anthropological students the necessity of understanding how linguistic ‘morphology’ (by which he meant grammatical structure) channeled ideas into expressive forms. Unfortunately for Boas, few of his students were equipped either by training or by intellectual inclination to carry out linguistic research of a more than superficial kind. The only significant exception was Sapir, who, one may imagine, was attracted to Boas’s anthropology for precisely this reason. From his earliest fieldwork on Wishram Chinook in 1905, Sapir made grammatical analysis the centerpiece of his research, and from his first publications portrayed American Indian languages with a descriptive clarity they had seldom before enjoyed. His accomplishments are legendary to the scholars who today study these languages, and must rank in the first tier of the grammarian’s art. The most renowned of his published studies include a full grammar of Takelma, a now-extinct language of southern Oregon (1922); a full grammar and dictionary of Southern Paiute (1930–31/1992); and an outline grammar of Nootka, with texts and vocabulary, prepared with the assistance of Swadesh (1939). Sapir’s grammatical descriptions are couched in a metalanguage derived from European comparative philology, tempered by Boas’s insistence that the structure of every language is sui generis. 
In discussions of the evolution of grammatical theory, Sapir’s grammars are sometimes portrayed as ‘processual’ or as early examples of ‘generative’ descriptions, but such labels imply a theoretical deliberation that was uncharacteristic of Sapir’s work. His concern was to explicate the patterns of the language under consideration as lucidly and unambiguously as possible, not to test a general theory of linguistic structure. The aptness with which Sapir’s descriptions captured the spirit of the languages he analyzed is no better illustrated than by his model of Athabaskan grammar. Although in this case it was not embodied in a full grammatical treatment, he laid out the basic features of his descriptive system for Athabaskan in a number of shorter works, and passed it on to his students and successors through his teaching and in his files. Sixty years after his death it remains the standard descriptive model for all work in Athabaskan linguistics, regardless of the theoretical stance of the analyst. Throughout his career Sapir maintained a deep interest in historical relationships among languages. He took pride in extending the rigor of the reconstructive method to American Indian language
families, and laid the foundations for the comparative study of both Uto-Aztecan and Athabaskan. In the 1930s he returned to Indo-European linguistics and made major contributions to the Laryngeal Hypothesis (the proposal, originating with Ferdinand de Saussure, that the phonology of Proto-Indo-European included one or more laryngeal or pharyngeal consonants not attested in the extant Indo-European languages).
2.2 The Professionalization of American Linguistics
By 1921, Sapir was able to draw on the analytic details of the American Indian languages on which he had worked to illustrate, in his book Language, the wide variety of grammatical structures represented in human speech. One of the most significant impacts of this highly successful book was to provide a model for the professionalization of academic linguistics in the US during the 1920s and early 1930s. Under Sapir’s guidance, a distinctive American School of linguistics arose, focused on the empirical documentation of language, primarily in field situations. Although most of the students Sapir himself trained at Chicago and Yale worked largely if not exclusively on American Indian languages, the methods that were developed were transferable to other languages. In the mid-1930s Sapir directed a project to analyze English, and during World War II many of Sapir’s former students were recruited to use linguistic methods to develop teaching materials for such strategically important languages as Thai, Burmese, Mandarin, and Russian. A distinction is sometimes drawn between the prewar generation of American linguists—dominated by Sapir and his students and emphasizing holistic descriptions of American Indian languages—and the immediate postwar generation, whose more rigid and focused formal methods were codified by Leonard Bloomfield and others. Since many of the major figures of ‘Bloomfieldian’ linguistics were Sapir’s students, this distinction is somewhat artificial, and a single American Structuralist tradition can be identified extending from the late 1920s through 1960 (Hymes and Fought 1981). There is little doubt that Sapir’s influence on this tradition was decisive.
3. Theoretical Insights From his earliest work under Boas, Sapir’s independent intellectual style often carried him well beyond the bounds of the academic paradigm he was ostensibly working within. He was not, however, a programmatic thinker, and his groundbreaking work, while deeply admired by his students and colleagues, seldom resulted in significant institutional changes, at least in the short term. The cliché ‘ahead of his time’ is
especially apt in Sapir’s case, and the influence of some of his views continues to be felt. This is most striking in structural linguistics, where Sapir must certainly be accounted one of the most influential figures of the twentieth century. As early as 1910 Sapir was commenting on the importance of formal patterning in phonology, and in the 1920s he was among the first to enunciate the ‘phonemic principle’ which later figured importantly in his teaching at Chicago and Yale. Characteristically, he left it to his students and such colleagues as Leonard Bloomfield to formalize an analytic methodology, while he himself pursued the psychological implications of formal patterning. Sapir’s (1917) trenchant critique of the culture concept, particularly as defined by A. L. Kroeber, was largely ignored at the time. Sapir argued that the attribution of cultural patterning to an emergent ‘superorganic’ collective consciousness was an intellectual dead-end, and that research would be better directed at understanding the individual psychology of collective patterned behavior. While Kroeber’s view was undoubtedly the dominant one in anthropology and general social science for much of the twentieth century, Sapir’s analysis is much more consistent with recent models of human sociocultural behavior that have been developed by evolutionary psychologists. The fact that Sapir’s views came from someone with extraordinary insight into the elaborate self-referential patterns of language—traditionally, the most formalized and objectified of social behaviors—can hardly be accidental, and again is consistent with recent developments in cognitive science. In the late 1920s Sapir found an important intellectual ally in Harry Stack Sullivan, a psychiatrist whose interpersonal theory of the genesis of schizophrenia resonated with Sapir’s views (Perry 1982). 
The two became close friends, and together with Harold Lasswell they planned and organized the William Alanson White Psychiatric Foundation, a research and teaching institution that ultimately was located in Washington, DC. In the year before he died Sapir gave serious consideration to leaving Yale and working with Sullivan in a research position at the Foundation. Sapir’s views on the history of language were linked to his view of the abstraction of patterns, and were equally controversial or misunderstood. A distinction must be drawn between Sapir’s work, noted earlier, as a historical linguist within a family of languages whose relationship was secure (Athabaskan, Uto-Aztecan, Indo-European), and his explorations of much less certain relationships (Hokan, Penutian, Na-Dene) and possible interhemispheric connections (Sino-Dene). In the former, Sapir worked—with characteristic creativity and insight—with tools and models derived from a long tradition of comparative linguistics. In the latter, it was (or seemed to his contemporaries) often a matter of brilliant intuition. In fact, in this work he
usually relied on an assessment of similarities in structural pattern that distinguished features susceptible to unconscious change from generation to generation (e.g., regular inflectional patterns, words) from those that are largely inaccessible to individual cognition. Although he never offered a theoretical explication, his idea of what constituted ‘deep’ linguistic patterns was exemplified in his classification of North and Central American Indian languages (1929). This classification, which remains influential, is still untested in its own terms.
4. Impact and Current Importance
A charismatic teacher, Sapir had a succession of highly motivated students both at Chicago and at Yale, the most prominent among them Morris Swadesh (who collaborated with him on Nootka research), Harry Hoijer (who codified Sapir’s analysis of Athabaskan grammar), Mary R. Haas, Stanley Newman, C. F. Voegelin, George L. Trager, Zellig Harris, David G. Mandelbaum, and Benjamin L. Whorf. Through these students Sapir exercised a considerable posthumous influence on intellectual and institutional developments in both linguistics and anthropology through the 1960s. A postwar collection of Sapir’s most important general papers, edited by Mandelbaum (1949), was widely read and is still consulted. Harris’ (1951) extended review of this book provides a comprehensive summary of Sapir’s oeuvre as it was understood by his immediate circle. Sapir is cited most frequently today for the ‘Sapir–Whorf Hypothesis’ of linguistic relativity, a name that inaccurately implies an intellectual collaboration between Sapir and his Yale student, Benjamin Whorf, who himself died in 1941. The more correctly designated ‘Whorf theory complex’ (Lee 1996) was a retrospective construct that derived largely from writings of Whorf’s that were unpublished at the time of Sapir’s death. Although undoubtedly stimulated by Sapir’s writing and teaching, Whorf’s proposal that the structure of a language to some extent determines the cognitive and behavioral habits of its speakers cannot be connected directly with Sapir’s mature thought on the psychology of language and culture. Sapir’s most enduring achievement is his own descriptive linguistic work. Long after the writings of most of his contemporaries have been forgotten, Sapir’s grammatical studies continue to be held in the highest esteem.
In recent years his holistic analytic technique has been emulated by a number of linguists seeking an alternative to narrow formalism, particularly when working with American Indian or other indigenous languages. Sapir’s life and work were the subject of a 1984 conference (Cowan et al. 1986), from which emerged a plan to publish a standard edition of all of Sapir’s work, including edited versions of unfinished manuscripts; by the year 2000 seven volumes had appeared. A biography by Darnell (1990) is useful for the externals of Sapir’s career, but her reluctance to give an intellectual account of Sapir’s work, particularly in linguistics, leaves some important issues still to be addressed (Silverstein 1991).
See also: Archaeology and the History of Languages; Historical Linguistics: Overview; Language and Ethnicity; Language and Thought: The Modern Whorfian Hypothesis; Linguistic Anthropology; Linguistic Fieldwork; Linguistic Typology; Linguistics: Comparative Method; North America and Native Americans: Sociocultural Aspects; North America, Archaeology of; North America: Sociocultural Aspects; Phonology; Population Composition by Race and Ethnicity: North America; Psycholinguistics: Overview; Sapir–Whorf Hypothesis; Sociolinguistics
Bibliography
Cowan W, Foster M K, Koerner K (eds.) 1986 New Perspectives in Language, Culture and Personality. Benjamins, Amsterdam and Philadelphia
Darnell R 1990 Edward Sapir: Linguist, Anthropologist, Humanist. University of California Press, Berkeley and Los Angeles, CA
Harris Z S 1951 Review of D G Mandelbaum (ed.). Selected writings of Edward Sapir in language, culture, and personality. Language 27: 288–333
Hymes D, Fought J 1981 American Structuralism. Mouton, The Hague, The Netherlands
Irvine J T (ed.) 1999 The psychology of culture: A course of lectures by Edward Sapir, 1927–1937. In: The Collected Works of Edward Sapir, Vol. 3, Culture. Mouton de Gruyter, Berlin and New York, pp. 385–686 [also published as a separate volume, The Psychology of Culture, Mouton de Gruyter 1993]
Lee P 1996 The Whorf Theory Complex: A Critical Reconstruction. Benjamins, Amsterdam and Philadelphia
Mandelbaum D G (ed.) 1949 Selected Writings of Edward Sapir in Language, Culture, and Personality. University of California Press, Berkeley and Los Angeles, CA
Perry H S 1982 Psychiatrist of America: The Life of Harry Stack Sullivan. Belknap Press, Cambridge, MA and London
Sapir E 1917 Do we need a ‘superorganic’? American Anthropologist 19: 441–47
Sapir E 1921 Language. Harcourt Brace, New York
Sapir E 1922 The Takelma language of Southwestern Oregon. In: Boas F (ed.) Handbook of American Indian Languages, Part 2. Bureau of American Ethnology, Washington, DC
Sapir E 1929 Central and North American Indian languages. Encyclopaedia Britannica, 14th edn. Vol. 5, pp. 138–41
Sapir E 1930–31/1992 The Southern Paiute Language. The Collected Works of Edward Sapir, Vol. 10. Mouton de Gruyter, Berlin and New York
Sapir E, Swadesh M 1939 Nootka Texts: Tales and Ethnological Narratives with Grammatical Notes and Lexical Materials. Linguistic Society of America, Philadelphia
Silverstein M 1991 Problems of Sapir historiography. Historiographia Linguistica 18: 181–204
V. Golla
Sapir–Whorf Hypothesis
1. Nature and Scope of the Hypothesis
The Sapir–Whorf hypothesis, also known as the linguistic relativity hypothesis, refers to the proposal that the particular language one speaks influences the way one thinks about reality. Although proposals concerning linguistic relativity have long been debated, American linguists Edward Sapir (1884–1939) and Benjamin Lee Whorf (1897–1941) advanced particularly influential formulations during the second quarter of the twentieth century, and the topic has since become associated with their names. The linguistic relativity hypothesis focuses on structural differences among natural languages such as Hopi, Chinese, and English, and asks whether the classifications of reality implicit in such structures affect our thinking about reality more generally. Analytically, linguistic relativity as an issue stands between two others: a semiotic-level concern with how speaking any natural language whatsoever might influence the general potential for human thinking (i.e., the general role of natural language in the evolution or development of human intellectual functioning), and a functional- or discourse-level concern with how using any given language code in a particular way might influence thinking (i.e., the impact of special discursive practices such as schooling and literacy on formal thought). Although analytically distinct, the three issues are intimately related in both theory and practice. For example, claims about linguistic relativity depend on understanding the general psychological mechanisms linking language to thinking, and on understanding the diverse uses of speech in discourse to accomplish acts of descriptive reference. Hence, the relation of particular linguistic structures to patterns of thinking forms only one part of the broader array of questions about the significance of language for thought.
Proposals of linguistic relativity necessarily develop two linked claims among the key terms of the hypothesis (i.e., language, thought, and reality). First, languages differ significantly in their interpretations of experienced reality—both what they select for representation and how they arrange it. Second, language interpretations have influences on thought about reality more generally—whether at the individual or cultural level. Claims for linguistic relativity thus require both articulating the contrasting interpretations of reality latent in the structures of different languages, and assessing their broader influences on, or relationships to, the cognitive interpretation of reality. Simple demonstrations of linguistic diversity are sometimes mistakenly regarded as sufficient in themselves to prove linguistic relativity, but they cannot in themselves show that the language differences affect thought more generally. (Much confusion arises in this
regard because of the practice in linguistics of describing the meaningful significance of individual elements in a language as ‘relative to’ the grammatical system as a whole. But this latter relativity of the meaning of linguistic elements to the encompassing linguistic structure should be distinguished from broader claims for a relativity of thought more generally to the form of the speaker’s language.) A variety of other arguments to the effect that distinctive perceptual or cognitive skills are required to produce and comprehend different languages likewise usually fail to establish any general effects on thought (see Niemeier and Dirven 2000). Linguistic relativity proposals are sometimes characterized as equivalent to linguistic determinism, that is, the view that all thought is strictly determined by language. Such characterizations of the language–thought linkage bear little resemblance to the proposals of Sapir or Whorf, who spoke in more general terms about language influencing habitual patterns of thought, especially at the conceptual level. Indeed, no serious scholar working on the linguistic relativity problem as such has subscribed to a strict determinism. (There are, of course, some who simply equate language and thought, but under this assumption of identity, the question of influence or determinism is no longer relevant.) Between the patent linguistic diversity that nearly everyone agrees exists and a claim of linguistic determinism that no one actually espouses lies the proposal of linguistic relativity, that is, the proposal that our thought may in some way be taken as relative to the language spoken.
2. Historical Development of the Hypothesis
Interest in the intellectual significance of the diversity of language categories has deep roots in the European tradition (Aarsleff 1988, Werlen 1989, Koerner 1992). Formulations related to contemporary ones appear during the Enlightenment period in the UK (Locke), France (Condillac, Diderot), and Germany (Hamann, Herder). They are stimulated variously by opposition to the universal grammarians, by concerns about the reliability of language-based knowledge, and by practical efforts to consolidate national identities and cope with colonial expansion. Most of this work construes the differences among languages in terms of a hierarchical scheme of adequacy with respect to reality, to reason, or to both. Later, nineteenth-century work in Germany by Humboldt and in France/Switzerland by Saussure drew heavily on this earlier tradition and set the stage for the approaches of Sapir and Whorf. Humboldt’s arguments, in particular, are often regarded as anticipating the Sapir–Whorf approach. He argued for a linguistic relativity according to the formal processes used by a language (e.g., inflection, agglutination, etc.). Ultimately this remains a hierarchical relativity in which certain language types (i.e.,
European inflectional ones) are viewed as more adequate vehicles of thought and civilization—a view distinctly at odds with what is to follow. Working within the US anthropological tradition of Franz Boas and stimulated by the diversity and complexity of Native American languages, Edward Sapir (1949) and Benjamin Lee Whorf (1956) reinvigorated and reoriented investigation of linguistic relativity in several ways (Lucy 1992a, Lee 1997). First, they advocated intensive first-hand scientific investigation of exotic languages; second, they focused on structures of meaning, rather than on formal grammatical process such as inflection; and third, they approached these languages within a framework of egalitarian regard. Although not always well understood, theirs is the tradition of linguistic relativity most widely known and debated today. Whorf’s writings, in particular, form the canonical starting point for all subsequent discussion. Whorf proposed a specific mechanism for how language influences thought, sought empirical evidence for language effects, and articulated the reflexive implications of linguistic relativity for scholarly thought itself. In his view, each language refers to an infinite variety of experiences with a finite array of formal categories (both lexical and grammatical) by grouping experiences together as analogically ‘the same’ for the purposes of speech. The categories in a language also interrelate in a coherent way, reinforcing and complementing one another, so as to constitute an overall interpretation of experience. These linguistic classifications vary considerably across languages not only in the basic distinctions they recognize but also in the assemblage of these categories into a coherent system of reference. 
Thus the system of categories which each language provides to its speakers is not a common, universal system, but a particular ‘fashion of speaking.’ Whorf argued that these linguistic structures influence habitual thought by serving as a guide to the interpretation of experience. Speakers tend to assume that the categories and distinctions of their language are entirely natural and given by external reality, and thus can be used as a guide to it. When speakers attempt to interpret an experience in terms of a category available in their language, they unwittingly involve other language-specific meanings implicit in that particular category and in the overall configuration of categories in which it is embedded. In Whorf’s view language does not blind speakers to some obvious reality, but rather it suggests associations which are not necessarily entailed by experience. Because language is such a pervasive and transparent aspect of behavior, speakers do not understand that the associations they ‘see’ are from language, but rather assume that they are ‘in’ the external situation and patently obvious to all. In the absence of another language (natural or artificial) with which to talk about experience, speakers will not be able to recognize
the conventional nature of their linguistically based understandings. Whorf argues that by influencing everyday habitual thought in this way, language can come to influence cultural institutions generally, including philosophical and scientific activity. In his empirical research Whorf showed that the Hopi and English languages treat ‘time’ differently, and that this difference corresponds to distinct cultural orientations toward temporal notions. Specifically, Whorf argued that speakers of English treat cyclic experiences of various sorts (e.g., the passage of a day or a year) in the same grammatical frame used for ordinary object nouns. Thus, English speakers are led to treat these cycles as object-like in that they can be measured and counted just like tangible objects. English also treats objects as if they each have a form and a substance. Since the cyclic words get put into this object frame, English speakers are led to ask what is the substance associated with the forms a day, a year, and so forth. Whorf argues that our global, abstract notion of ‘time’ as a continuous, homogeneous, formless something can be seen to arise to fill in the blank in this linguistic analogy. The Hopi, by contrast, do not treat these cycles as objects but as recurrent events. Thus, although they have, as Whorf acknowledged, words for what English speakers would recognize as temporal cycles (e.g., days, years, etc.), the formal analogical structuration of these terms in their grammar does not give rise to the abstract notion of ‘time’ that English speakers have. (Ironically, critics of Whorf’s Hopi data often miss his point about structural analogy and focus narrowly on individual lexical items.)
Finally, grouping referents and concepts as formally ‘the same’ for the purposes of speech has led speakers to group those referents and concepts as substantively ‘the same’ for action generally, as evidenced by related cultural patterns of belief and behavior he describes.
3. Empirical Research on the Hypothesis
Although the Sapir–Whorf proposal has had wide impact on thinking in the humanities and social sciences, it has not been extensively investigated empirically. Indeed, some believe it is too difficult, if not impossible in principle, to investigate. Further, a good deal of the empirical work that was first developed was quite narrowly confined to attacking Whorf’s analyses, documenting particular cases of language diversity, or exploring the implications in domains such as color terms that represent somewhat marginal aspects of language structure. In large part, therefore, acceptance or rejection of the proposal for many years depended more on personal and professional predilections than on solid evidence. Nonetheless, a variety of modern initiatives have stimulated renewed interest in mounting empirical assessments of the hypothesis.
Contemporary empirical efforts can be classed into three broad types, depending on which of the three key terms in the hypothesis they take as their point of departure: language, reality, or thought (Lucy 1997). A structure-centered approach begins with an observed difference between languages, elaborates the interpretations of reality implicit in them, and then seeks evidence for their influence on thought. The approach remains open to unexpected interpretations of reality but often has difficulty establishing a neutral basis for comparison. The classic example of a language-centered approach is Whorf’s pioneering comparison of Hopi and English described above. The most extensive contemporary effort to extend and improve the comparative fundamentals in a structure-centered approach has sought to establish a relation between variations in grammatical number marking and attentiveness to number and shape (Lucy 1992b). This research remedies some of the traditional difficulties of structure-centered approaches by framing the linguistic analysis typologically so as to enhance comparison, and by supplementing ethnographic observation with a rigorous assessment of individual thought. This then makes possible the realization of the benefits of the structure-centered approach: placing the languages at issue on an equal footing, exploring semantically significant lexical and grammatical patterns, and developing connections to interrelated semantic patterns in each language. A domain-centered approach begins with a domain of experienced reality, typically characterized independently of language(s), and asks how various languages select from, encode, and organize it. Typically, speakers of different languages are asked to refer to ‘the same’ materials or situations so the different linguistic construals become clear. The approach facilitates controlled comparison, but often at the expense of regimenting the linguistic data rather narrowly.
The classic example of this approach, developed by Roger Brown and Eric Lenneberg in the 1950s, showed that some colors are more lexically encodable than others, and that more codable colors are remembered better. This line of research was later extended by Brent Berlin, Paul Kay, and their colleagues, but to argue instead that there are crosslinguistic universals in the encoding of the color domain such that a small number of ‘basic’ color terms emerge in languages as a function of biological constraints. Although this research has been widely accepted as evidence against the validity of the linguistic relativity hypothesis, it actually deals largely with constraints on linguistic diversity rather than with relativity as such. Subsequent research has challenged Berlin and Kay’s universal semantic claim, and shown that different color-term systems do in fact influence color categorization and memory. (For discussions and references, see Lucy 1992a, Hardin and Maffi 1997, Roberson et al. 2000.) The most successful effort to improve the quality of the linguistic comparison in
a domain-centered approach has sought to show cognitive differences in the spatial domain between languages favoring the use of body coordinates to describe arrangements of objects (e.g., ‘the man is left of the tree’) and those favoring systems anchored in cardinal direction terms or topographic features (e.g., ‘the man is east/uphill of the tree’) (Pederson et al. 1998, Levinson in press). This research on space remedies some of the traditional difficulties of domain-centered approaches by developing a more rigorous and substantive linguistic analysis to complement the ready comparisons facilitated by this approach. A behavior-centered approach begins with a marked difference in behavior which the researcher comes to believe has its roots in and provides evidence for a pattern of thought arising from language practices. The behavior at issue typically has clear practical consequences (either for theory or for native speakers), but since the research does not begin with an intent to address the linguistic relativity question, the theoretical and empirical analyses of language and reality are often weakly developed. The most famous example of a behavior-centered approach is the effort to account for differences in Chinese and English speakers’ facility with counterfactual or hypothetical reasoning by reference to the marking of counterfactuals in the two languages (Bloom 1981). The interpretation of these results remains controversial (Lucy 1992a).
4. Future Prospects
Two research trends are unfolding at the present time. First, in the cognitive and psychological sciences awareness is increasing of the nature and scope of language differences. This has led to a greater number of studies focused on the possible cognitive consequences of such differences (e.g., Levinson in press, Niemeier and Dirven 2000). Second, an increasing integration is emerging among the three levels of the language and thought problem (i.e., the semiotic, structural, and functional levels). On the semiotic side, for example, research on the relationship between language and thought in development is increasingly informing and informed by work on linguistic relativity (Bowerman and Levinson 2001). On the functional side, research on the relationship of cultural and discursive patterns of use is increasingly being brought into dialogue with Whorfian issues (Silverstein 1979, Friedrich 1986, Wierzbicka 1992, Hill and Mannheim 1992, Gumperz and Levinson 1996). The continued relevance of the linguistic relativity issue seems assured by the same impulses found historically: the patent relevance of language to human sociality and intellect, the reflexive concern with the role of language in scholarly practice, and the practical encounter with linguistic diversity. To this we must
add the increasing concern with the unknown implications for human thought of the impending loss of many if not most of the world’s languages (Fishman 1982). See also: Cognitive Psychology: History; Human Cognition, Evolution of; Language and Philosophy; Language and Thought: The Modern Whorfian Hypothesis; Linguistic Anthropology; Linguistics: Overview; Sapir, Edward (1884–1939); Semiotics; Wittgenstein, Ludwig (1889–1951)
Bibliography
Aarsleff H 1988 Introduction. In: Von Humboldt W On Language: The Diversity of Human Language-structure and its Influence on the Mental Development of Mankind (Heath P, trans.). Cambridge University Press, Cambridge, UK, pp. vii–xv
Bloom A H 1981 The Linguistic Shaping of Thought. Lawrence Erlbaum, Hillsdale, NJ
Bowerman M, Levinson S C 2001 Language Acquisition and Conceptual Development. Cambridge University Press, Cambridge, UK
Fishman J 1982 Whorfianism of the third kind: Ethnolinguistic diversity as a worldwide societal asset (The Whorfian Hypothesis: Varieties of validation, confirmation, and disconfirmation II). Language in Society 11: 1–14
Friedrich P 1986 The Language Parallax: Linguistic Relativism and Poetic Indeterminacy. University of Texas, Austin, TX
Gumperz J J, Levinson S C (eds.) 1996 Rethinking Linguistic Relativity. Cambridge University Press, Cambridge, UK
Hardin C L, Maffi L (eds.) 1997 Color Categories in Thought and Language. Cambridge University Press, Cambridge, UK
Hill J H, Mannheim B 1992 Language and world view. Annual Review of Anthropology 21: 381–406
Koerner E F K 1992 The Sapir–Whorf hypothesis: A preliminary history and a bibliographic essay. Journal of Linguistic Anthropology 2: 173–8
Lee P 1997 The Whorf Theory Complex: A Critical Reconstruction. John Benjamins, Amsterdam, The Netherlands
Levinson S C in press Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge University Press, Cambridge, UK
Lucy J A 1992a Language Diversity and Thought: A Reformulation of the Linguistic Relativity Hypothesis. Cambridge University Press, Cambridge, UK
Lucy J A 1992b Grammatical Categories and Cognition: A Case Study of the Linguistic Relativity Hypothesis. Cambridge University Press, Cambridge, UK
Lucy J A 1997 Linguistic relativity. Annual Review of Anthropology 26: 291–312
Niemeier S, Dirven R (eds.) 2000 Evidence for Linguistic Relativity. John Benjamins, Amsterdam, The Netherlands
Pederson E, Danziger E, Wilkins D, Levinson S, Kita S, Senft G 1998 Semantic typology and spatial conceptualization. Language 74: 508–56
Roberson D, Davies I, Davidoff J 2000 Color categories are not universal: Replications and new evidence from a stone-age culture. Journal of Experimental Psychology—General 129: 369–98
Sapir E 1949 The Selected Writings of Edward Sapir in Language, Culture, and Personality (Mandelbaum D G, ed.). University of California Press, Berkeley, CA
Silverstein M 1979 Language structure and linguistic ideology. In: Clyne P, Hanks W, Hofbauer C (eds.) The Elements: A Parasession on Linguistic Units and Levels. Chicago Linguistic Society, Chicago, pp. 193–247
Werlen I 1989 Sprache, Mensch und Welt: Geschichte und Bedeutung des Prinzips der sprachlichen Relativität. Wissenschaftliche Buchgesellschaft, Darmstadt, Germany
Whorf B L 1956 Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf (Carroll J B, ed.). MIT Press, Cambridge, MA
Wierzbicka A 1992 Semantics, Culture, and Cognition. Universal Human Concepts in Culture-specific Configurations. Oxford University Press, Oxford, UK
J. A. Lucy
Sauer, Carl Ortwin (1889–1975)
Carl Sauer was one of the towering intellectual figures of the twentieth century, not only in geography but also in a wider intellectual sphere. However, because of his wide-ranging thought, speculative sweep, and world perspective over a number of subjects, it is difficult to define his contribution to the social and behavioral sciences precisely and easily. But one can say that some of his underlying concerns were about scholarship, independence of thought, opposition to academic bureaucracy, a sympathy and identification with rural folk, concern for cultural diversity and environmental quality, and a distaste for the technological and scientific ‘fix,’ particularly the solutions offered by the emerging social sciences after 1945. He was born in Warrenton, Missouri on December 24, 1889 of German parents, and because of his background and three years of schooling in Calw (near Stuttgart) he was influenced by German culture and literature. He completed his graduate studies at Chicago University in 1915 under the geographer Ellen Churchill Semple, the geologist Rollin D. Salisbury, and plant ecologist Henry C. Cowles, the latter two of whom made a lasting intellectual impression on him. He then taught at the University of Michigan until 1923, when he moved to geography at Berkeley where he taught for 34 years (32 as Chair) and established one of the most distinctive graduate schools of American Geography that would always be associated with ‘cultural geography.’ After he retired in 1955 he enjoyed 20 remarkably productive years that saw the publication of four books and a score of influential papers, all distinguished by big and speculative ideas, that were the fruits of a lifetime’s reflection and unhurried reading. His reputation soared so that an ‘aura of sage, philosopher-king, and even oracle surrounded him’ (Hooson 1981, p. 166). He died in Berkeley on July 18, 1975.
1. Cultural Geography
Sauer rebelled against the sterile environmental determinism of contemporary geography with its emphasis on humans as response mechanisms to physical factors. If nothing else, his experience in the Economic Land Survey in the Michigan Cutovers had shown him that humans radically transformed the earth, often for the worse, and in the process created cultural landscapes. In his search for a new, humane geography, cultural anthropology seemed to offer a means of dealing with the diversity of humankind and its cultural landscapes through time. On arriving in Berkeley he found natural soul-mates in the anthropologists Alfred L. Kroeber and Robert H. Lowie. The concept of ‘culture’ subsequently pervaded all his teaching and writing. In The Morphology of Landscape he distilled an almost wholly German geographical literature, established the primacy of human agency in the formation of cultural landscapes ‘fashioned out of the natural landscape by a cultural group,’ and the importance of a time-based approach. In addition, he placed great importance on observation and contemplation in the field—a Verstehen or empathetic understanding and intuitive insight into behavior or object in order to achieve ‘a quality of reasoning at a higher plane’ than the tangible facts (see Sauer 1925, Williams 1983). Sauer wrote ‘Morphology’ in order to ‘emancipate’ himself from determinist thinking and, with a few more papers on the cultural landscape of the midwest frontier, he put the epistemological game behind him and started substantive field work on early settlement and society in Lower California, Arizona and Mexico. During the 1930s he produced many works, including Aztatlan and The Road to Cibola, both published in a new monograph series, Ibero-Americana, that he founded in the same year with Alfred Kroeber and the historian H. E. Bolton. (For these and many other of Sauer’s publications see Leighly 1963.)
These investigations drew him into the controversy about New World plant domestication and plant origins, and into collaboration with botanists, archeologists and ethnologists, whom he found congenial intellectual company.
2. Widening Horizons
All the time Sauer’s horizons were getting wider, his ideas more speculative, and his ethical values more refined. Toward the end of the 1930s he wrote a slight, but ultimately influential, paper which was a sustained and biting critique of the destructive social and environmental impact that resulted from the predatory outreach of Europe, which had few counterparts at that time except, perhaps, in the writing of Karl Marx (Sauer 1938). He drew inspiration from the work of George Perkins Marsh on the human transformation of the earth and Ernst Friedrich’s concept of Raubwirtschaft,
or destructive exploitation (see Marsh [1865] 1965, Friedrich 1904). His experience and knowledge of Latin and Central America and their history suggested to him that the Spanish conquest had led to a devastating and permanent impoverishment of the land and of its cultures and societies. Disease, warfare and enslavement had disrupted traditional value systems. Thus, the diffusion of technologically superior societies could affect humans and their culture just as much as it could physical resources. But two works more than any others established his world reputation and heralded a remarkable decade of multifaceted yet interrelated speculative understanding of the place of humans on earth. First was Agricultural Origins and Dispersals (Sauer 1952a) that flowered later into a string of publications into the human uses of the organic world, and early humans in the Americas from the Ice Age onward. Unfortunately, radiocarbon dating came too late to inform Sauer’s writing, but although he may not have provided the answers, he defined the questions brilliantly. Second, in 1956, with the collaboration of Marston Bates and Lewis Mumford, he masterminded the Princeton symposium on ‘Man’s Role in Changing the Face of the Earth,’ the theme of which thereafter became his overriding interest. (See Mumford, Lewis (1895–1990).) All his learning and concerns culminated in this volume, and in his chapter ‘The Agency of Man on Earth’ (Sauer 1956). The capacity of humans to alter the natural environment—the ‘deformation of the pristine’—the cult of progress and waste that stemmed from mass production (‘commodity fetishism’), and the alien intrusion of humans into world ecology, were included. In contemporary terms, the theme was the degradation of the environment, and it was an early and influential statement.
It also had another dimension: globally, the ‘imperialism of production’ was as bad as the old, colonial imperialism, and might ultimately be no better than Marxist totalitarianism; mass culture was eliminating not only biological diversity but also cultural diversity, and older and less robust societies. Somehow, humans had to rise above this mindless, short-term exploitative mode. ‘The high moments of history have come not when man was concerned with the comforts and displays of the flesh but when his spirit was moved to grow in grace.’ Therefore, what was needed was ‘an ethic and aesthetic under which man, practising the qualities of prudence and moderation, may indeed pass on to posterity a good Earth’ (Sauer 1956, p. 68).
His simply articulated ideas had a resonance with many activists and intellectuals, as well as Californian avant-garde poets and literati, who extolled his work as an example of cultural and ecological sensitivity and respect, tinged with deep historical insight and scholarship, and made attractive by his simple and pithy language. He also tapped a deep spring of feeling during the 1960s and 1970s, at the time of Vietnam and student unrest, with their concerns about the limits of the earth and of technological/political power. Yet Sauer was a complex mix. He was congenitally nonconformist but deeply conservative, and although profoundly concerned with conservation he was never formally an ‘environmentalist’; indeed, he thought the movement was little more than an ‘ecological binge.’
3. Distrust of the Social Sciences
In many ways it is ironic that Sauer has a place in this Encyclopedia because he had such a deep distrust and distaste for the behavioral and social sciences. Although he eschewed methodological and epistemological discussion, his writing, and particularly his personal correspondence, reveal that he was ‘a philosopher in spite of himself’ (Entrikin 1984). Culture history became his model of social science, and it was drawn from the natural sciences, not the social sciences. He was basically a pragmatist who was influenced by the writings of the German cultural geographers Friedrich Ratzel and Eduard Hahn, and by the methods of geology and anthropology and their emphasis on the provisional character of working hypotheses which were no more than a means to an end. His later association in Berkeley with people who worked on ‘tangible things,’ like plant ecologists, agricultural scientists, botanists (e.g., Ernest Babcock), experts in the evolution of population genetics (see Wright, Sewall (1889–1988)), and geneticists reinforced this.
He argued for a theoretical and methodological pluralism that stemmed naturally from the inherent diversity of nature and culture. The natural science idea of a dynamic balance and diversity that arose from organic evolution was embedded deeply in his thought, and he felt that the modern world had disrupted these processes so that it was out of balance. Hence his deep distrust of US capitalism and all bureaucratic systems which would destroy diversity and local community. Liberal social scientists who designed, planned, and directed community life were more likely to destroy than enhance it by imposing universalizing concepts of social organization, and by ignoring the inherent pluralism and diversity of nature and culture that stimulated the naïve curiosity about the world which was the essence of geography.
Theoretical and normative social science as practiced by economists, sociologists, and political scientists grated on him with its exaggerated confidence in the statistical and the inductive, and its ‘dialectic atmosphere.’ Their focus on the present precluded a better insight into the origins and evolution of any topic, and gave ‘an exaggerated accent on contemporaneity.’ His social science was based firmly in history and geography: this was culture history.
During the late 1930s he talked jokingly of the two patron saints of social scientists: St. Bureaucraticus, who represented rationalism and professionalism in US academic life, and St. Scholasticus, who represented social theorists who sought normative generalizations about humankind and society. Consistently, these ‘Sons of Daedalus’ were the target of his criticism because their well-funded procedures and programs emasculated the independence of the impressionable younger scholars (Williams 1983). His address, ‘Folkways of the Social Sciences,’ was an eloquent, if not audacious, plea to social scientists to ‘give back the search for truth and beauty to the individual scholar to grow in grace as best he can’ and to reinstate time and place into the study of the USA (Sauer 1952b). Sauer’s other bête noire, mass production, promoted the homogenization of society, not only in the USA but globally, as the USA firmly assumed the position of superpower during the 1950s, with what he saw as a disdain for the ‘lesser breeds.’
Sauer was aware that he was ‘out of step’ with his colleagues and intellectuals, and spoke of himself as an ‘unofficial outsider’ and even as a ‘peasant.’ This has been typified as part of the antimodernism that characterized early twentieth-century US intellectual life. But it went much deeper than that; his ideas anticipated society’s fears and disenchantment with progress, science, the elimination of diversity, and the degradation of the environment. Perhaps the ultimate relevance of Sauer’s work was in his groping towards a rapprochement between the social sciences, the humanities, and the biological life sciences. By being behind he was far ahead.
See also: Agricultural Sciences and Technology; Environmental Determinism; Geography; Place in Geography
Bibliography
Entrikin J N 1984 Carl O. Sauer: Philosopher in spite of himself. Geographical Review 74: 387–408
Friedrich E 1904 Wesen und geographische Verbreitung der ‘Raubwirtschaft.’ Petermanns Mitteilungen 50: 68–79, 92–5
Hooson D 1981 Carl Ortwin Sauer. In: Blouet B W (ed.) The Origins of Academic Geography in the United States. Archon, Hamden, CT, pp. 165–74
Leighly J (ed.) 1963 Land and Life: A Selection from the Writings of Carl Ortwin Sauer. University of California Press, Berkeley, CA
Marsh G P 1965 Man and Nature: Physical Geography as Modified by Human Action, Lowenthal D (ed.). Harvard University Press, Cambridge, MA
Sauer C O 1925 The Morphology of Landscape. University of California Publications in Geography, Berkeley, CA, Vol. 2, pp. 19–53
Sauer C O 1938 Destructive exploitation in modern colonial expansion. Comptes Rendus du Congrès International de Géographie, Amsterdam 2 (Sect. 3c): 494–9
Sauer C O 1952a Agricultural Origins and Dispersals. Bowman Memorial Lectures, Series 2. American Geographical Society, New York
Sauer C O 1952b Folkways of social science. In: The Social Sciences at Mid-Century: Papers Delivered at the Dedication of Ford Hall, April 19–21, 1951. University of Minnesota Press, Minneapolis, MN
Sauer C O 1956 The agency of man on earth. In: Thomas W L (ed.) Man’s Role in Changing the Face of the Earth. Chicago University Press, Chicago, IL, pp. 49–69
Williams M 1983 The apple of my eye: Carl Sauer and historical geography. Journal of Historical Geography 9: 1–28
Williams M 1987 Carl Sauer and man’s role in changing the face of the earth. Geographical Review 77: 218–31
M. Williams
Saussure, Ferdinand de (1857–1913)
1. Saussure’s Status in Twentieth-century Linguistics
Saussure is best known for the posthumous compilation of lecture notes on general linguistics taken down assiduously by students attending his courses during 1907–1911, the Cours de linguistique générale, edited by his former students and junior colleagues and first published in 1916 (and since 1928 translated into more than a dozen languages). During his lifetime, Saussure was most widely known for his masterly Mémoire of 1878 devoted to an audacious reconstruction of the Proto-Indo-European vowel system. However, it is generally agreed that his Cours ushered in a revolution in linguistic thinking during the 1920s and 1930s which still, at the beginning of the twenty-first century, is felt in many quarters, even beyond linguistics proper. He is widely regarded as ‘the father of structuralism’; to many his work produced a veritable ‘Copernican revolution’ (Holdcroft 1991, p. 134). Indeed, essential ingredients and terms of his theory have become points of reference for any serious discussion about the nature of language, its functioning, development, and uses.
2. Formative Years and Career
Saussure was born on November 26, 1857 in Geneva, Switzerland. Although from a distinguished Geneva family which, beginning with Horace Bénédict de Saussure (1740–1799), can boast of several generations of natural scientists, F. de Saussure was drawn early to language study, producing an ‘Essai pour réduire les mots du grec, du latin et de l’allemand à un petit nombre de racines’ at age 14 or 15 (published in Cahiers Ferdinand de Saussure 32: 77–101 [1978]). Following his parents’ wishes, Saussure attended classes in chemistry, physics, and mathematics at the University of Geneva during 1875–1876, before being allowed to join his slightly older classmates who had left for Leipzig the year before. So in the fall of 1876 Saussure arrived at the university where a number of important works in the field of Indo-European phonology and morphology, including Karl Verner’s (1846–1896) epoch-making paper on the last remaining series of exceptions to ‘Grimm’s Law,’ had just been published. Saussure took courses with Georg Curtius (1820–1885), the mentor of the Junggrammatiker, and a number of the younger professors, such as August Leskien (1840–1916), Ernst Windisch (1844–1918), Heinrich Hübschmann (1848–1908), Hermann Osthoff (1847–1909), and others in the field of Indic studies, Slavic, Baltic, Celtic, and Germanic. During 1878–1879 Saussure spent two semesters at the University of Berlin, enrolling in courses in Indic philology with Heinrich Zimmer (1851–1910) and Hermann Oldenberg (1854–1920).
After barely six semesters of formal study of comparative-historical Indo-European linguistics Saussure, then just 21, published his major lifetime work. In this 300-page Mémoire sur le système primitif des voyelles dans les langues indo-européennes (1879) Saussure assumed, on purely theoretical grounds, the existence of an early Proto-Indo-European sound of unknown phonetic value (designated *A) which would develop into various phonemes of the Indo-European vocalic system depending on its combination with those ‘sonantal coefficients.’ Saussure was thus able to explain a number of puzzling questions of Indo-European ablaut. However, the real proof of Saussure’s hypotheses came only many years later, after his death, following the decipherment of Hittite and its identification as an Indo-European language.
In 1927 the Polish scholar Jerzy Kuryłowicz (1895–1978) pointed to Hittite cognates, i.e., related words corresponding to forms found in other Indo-European languages, that contained a laryngeal (not present in any of the other attested Indo-European languages) corresponding to Saussure’s ‘phonème’ *A (Szemerényi 1973). What is significant in Saussure’s approach is his insistence on, and rigorous use of, the idea that the original Proto-Indo-European vowels form a coherent system of interrelated terms. Indeed, it is this emphasis on the systematic character of language which informs all of Saussure’s linguistic thinking to the extent that there are not, contrary to received opinion, two Saussures, the author of the Mémoire and the originator of the theories laid down in the Cours (cf. Koerner 1998).
Having returned to Leipzig, Saussure defended his dissertation on the use of the genitive absolute in Sanskrit in February 1880, leaving for Geneva soon thereafter. Before he arrived in Paris in September 1880, he appears to have conducted fieldwork on Lithuanian, an Indo-European language of which documents reach back to the sixteenth century only, but which exhibits a rather conservative vowel system comparable with that of Ancient Greek. First-hand exposure to this language was instrumental in his explanation of the Lithuanian system of accentuation (Saussure 1896) for which he is justly famous.
In 1881, Michel Bréal (1832–1915), the doyen of French linguistics, secured him a position as Maître de Conférences at the École des Hautes Études, a post he held until his departure for Geneva 10 years later. In Paris, Saussure found a number of receptive students, among them Antoine Meillet (1866–1936), Maurice Grammont (1866–1946), and Paul Passy (1859–1940), but also congenial colleagues such as Gaston Paris (1839–1903), Louis Havet (1849–1925), who had previously written the most detailed review of his Mémoire, and Arsène Darmesteter (1848–1888). Still, Saussure did not write any major work subsequent to his doctoral dissertation, but he wrote a series of frequently etymological papers, which illustrate his acumen in historical linguistics. It was through the posthumous publication of lectures on (in fact historical and) general linguistics that Saussure became known for his theoretical and nonhistorical views.
In 1891, the University of Geneva offered Saussure a professorship of Sanskrit and Comparative Grammar, which was made into a regular chair of Comparative Philology in 1896. It was only late in 1906 that the Faculty added the subject of General Linguistics to his teaching load. It was this decision and Saussure’s three ensuing lecture series (1907, 1908–9, and 1910–11), in which he developed his thoughts about the nature of language and the manner in which it was to be studied, that eventually led to the epoch-making book he did not write, the Cours de linguistique générale. Saussure died on February 22, 1913 at Château Vufflens, Switzerland.
3. The Cours de linguistique générale
The Cours appeared in 1916. By the 1920s Saussure’s name began to be almost exclusively connected with this posthumous work which was based largely on extensive lecture notes carefully taken down by a number of his students. One of them was Albert Riedlinger (1883–1978), whose name appears on the title page of the Cours as a collaborator. It was, however, put together by Saussure’s successors in Geneva, Charles Bally (1865–1947) and Albert Sechehaye (1870–1946), neither of whom had attended these lectures themselves as is frequently, but erroneously, stated in the literature. Indeed, their own focus of attention was nonhistorical linguistics, stylistics and syntax, respectively, and this had a considerable bearing on the manner in which Saussure’s ideas were presented (Amacker 2000), with Historical Linguistics, the subject Saussure was most interested in, being relegated to the end of the book. (See Godel 1957, for an analysis of the editors’ work; also Strozier 1988, for a close analysis of the texts.) It was the long general introduction of the Cours and the part dealing with nonhistorical (‘synchronic’) linguistics, which made history.
3.1 Saussure’s/the Cours’s Legacy
The ideas advanced in the Cours produced something of a revolution in linguistic science; historical-comparative grammar, which had dominated linguistic research since the early nineteenth century, soon became a mere province of the field. At least in the manner the Cours had been presented by the editors, Saussure’s general theory of language was seen as assigning pride of place to the nonhistorical, descriptive, and ‘structural’ approach. (Saussure himself did not use the last-mentioned term in a technical sense.) This emphasis on the investigation of the current state of a language or languages led to a tremendous body of work concerned with the analysis of the linguistic system or systems of language and its function(s), and a concomitant neglect of questions of language change and the field of Historical Linguistics in general, a situation still very much characteristic of the current linguistic scene. However, the field has become stronger since the mid-1980s, as sociolinguistic and typological aspects took hold in the investigation of language change.
From the 1920s onwards, notably outside of the traditional centers of Indo-European comparative linguistics, a variety of important schools of linguistic thought developed in Europe that can be traced back to proposals made in the Cours. These are usually identified with the respective centers from which they emanated, such as Geneva, Prague, Copenhagen, even London; more precisely these developments are to be associated with the names of Bally and Sechehaye, Roman Jakobson (1896–1982) and Nikolaj S. Trubetzkoy (1890–1938), Louis Hjelmslev (1899–1965), and John Rupert Firth (1890–1960), respectively. In North America too, through the work of Leonard Bloomfield (1887–1949), Saussure’s ideas became stock-in-trade among linguists, descriptivists, structuralists, and generativists (cf. Joseph 1990, for Saussure’s influence on Bloomfield as well as Chomsky).
In each ‘school,’ it is safe to say, essential ingredients of the Cours were interpreted differently, at times in opposition to some of Saussure’s tenets as found in the book, which Saussure specialists now refer to as the ‘vulgata’ text, given that a number of points made in the Cours go back to its editors, not Saussure himself. However, it is this text that has made the impact on modern linguistics.
3.2 The Main Tenets of the Cours
At the core of Saussure’s linguistic theory is the assumption that language is a system of interrelated terms, which he called ‘langue’ (in contradistinction to ‘parole,’ the individual speech act or speaking in general). This supra-individual ‘langue’ is the underlying code ensuring that people can speak and understand each other; the language-system thus has a social underpinning. At the same time, ‘langue’ is an operative system embedded in the brain of everyone who has learned a given language. The analysis of this system and its functioning, Saussure maintains, is the central object of linguistics. His characterization of ‘langue’ as a ‘fait social’ has often led to the belief that Saussure’s thinking is indebted to Émile Durkheim’s (1858–1917) sociological framework. While the two were close contemporaries and shared much of the same intellectual climate, no direct influence of the latter on the former can be demonstrated; Meillet, Saussure’s former student and a collaborator of Durkheim’s since the late 1890s, publicly denied it when it was first proposed in 1931. For Saussure the social bond between speakers sharing the same language (‘langue’) was constitutive for the operation of this unique semiological system (see below).
The language system is a network of relationships which Saussure characterized as being of two kinds: ‘syntagmatic’ (i.e., items are arranged in a consecutive, linear order) and ‘associative,’ later on termed (by Firth and by Hjelmslev) ‘paradigmatic’ (i.e., pertaining to the organization of units in a deeper, not directly observable fashion dealing with grammatical and semantic relations). Since it is only in a state (‘état de langue’) that this system can be revealed, the nonhistorical, ‘synchronic’ approach to language must take pride of place. Only after two such language states of different periods in the development of a given language have been properly described can the effects of language change be calculated, i.e., ‘diachronic,’ historical linguistics be conducted.
Hence the methodological, if not epistemological, primacy of synchrony over diachrony. Apart from syntagmatic vs. paradigmatic relations, several trichotomies can be found in the Cours which, however, are usually reduced to dichotomies. Many of them have become current in twentieth-century thought, far beyond their original application, i.e., langage–langue–parole (i.e., language in all its manifestations or ‘speech’; language as the underlying system, and ‘speaking,’ with terms such as ‘tongue’ and ‘discourse’ or ‘competence’ and ‘performance’ being proposed to replace the langue/parole couple), signe–signifié–signifiant (sign, signified, and signifier), synchrony vs. diachrony (Saussure’s ‘panchrony’ would be an overarching of these two perspectives). Saussure’s definition of language as ‘a system of (arbitrary) signs’ and his proposal of linguistics as the central part of an overall science of sign relations or ‘sémiologie’ have led to the development of a field of inquiry more frequently called (following Charles Sanders Peirce’s [1839–1914] terminology) ‘semiotics,’ which more often than not deals with sign systems other than those pertaining to language, such as literary texts, visual art, music, and architecture. (For the wider implications of Saussure’s socio-semiotic ideas and a critique of the various uses of Saussure’s concepts in literary theory, see Thibault 1997.)
As is common with influential works (cf., e.g., Freud’s) many ingredients of Saussure’s general theory of language have often been taken out of their original context and incorporated into theories outside their intended application, usually selectively and quite arbitrarily, especially in works by French writers engaged in ‘structural’ anthropology (e.g., Claude Lévi-Strauss) and Marxist philosophy (e.g., Louis Althusser), literary theory (e.g., Jacques Derrida), psychoanalysis (e.g., Jacques Lacan), and semiotics (e.g., Roland Barthes), and their various associates and followers. (For a judicious critique of these extralinguistic exploitations, see Tallis 1988.) However, these uses, and abuses, demonstrate the endurance and originality of Saussure’s ideas. He has achieved in linguistics a status comparable to that of Immanuel Kant in philosophy: just as the history of thought divides into a period before Kant and a period after him, we can distinguish between a linguistics before Saussure and a linguistics after Saussure.
Bibliography
Amacker R 2000 Le développement des idées saussuriennes chez Charles Bally et Albert Sechehaye. Historiographia Linguistica 27: 205–64
Bouquet S 1997 Introduction à la lecture de Saussure. Payot, Paris
Engler R 1976 Bibliographie saussurienne [1970–]. Cahiers Ferdinand de Saussure 30: 99–138, 31: 279–306, 33: 79–145, 40: 131–200, 43: 149–275, 50: 247–95 (1976, 1977, 1979, 1986, 1989, 1997)
Godel R 1957 Les Sources manuscrites du Cours de linguistique générale de F. de Saussure. Droz, Geneva, Switzerland
Harris R 1987 Reading Saussure: A Critical Commentary on the Cours de linguistique générale. Duckworth, London
Holdcroft D 1991 Saussure: Signs, System, and Arbitrariness. Cambridge University Press, Cambridge, UK
Joseph J E 1990 Ideologizing Saussure: Bloomfield’s and Chomsky’s readings of the Cours de linguistique générale. In: Joseph J E, Taylor T E (eds.) Ideologies of Language. Routledge, London and New York, pp. 51–93
Koerner E F K 1972 Bibliographia Saussureana, 1870–1970: An Annotated, Classified Bibliography on the Background, Development and Actual Relevance of Ferdinand De Saussure’s General Theory of Language. Scarecrow Press, Metuchen, NJ
Koerner E F K 1973 Ferdinand de Saussure: Origin and Development of his Linguistic Thought in Western Studies of Language. A Contribution to the History and Theory of Linguistics. F. Vieweg and Sohn, Braunschweig, Germany
Koerner E F K 1988 Saussurean Studies/Études saussuriennes. Avant-propos de R. Engler. Slatkine, Geneva, Switzerland
Koerner E F K 1998 Noch einmal on the History of the Concept of Language as a ‘système où tout se tient’. Cahiers Ferdinand de Saussure 51: 21–40
Saussure F de 1879 Mémoire sur le système primitif des voyelles dans les langues indo-européennes. B. G. Teubner, Leipzig, Germany (Repr. G. Olms, Hildesheim, 1968.). In: Lehmann W P (ed.) A Reader in Nineteenth-Century Historical Indo-European Linguistics. Indiana University Press, Bloomington and London, 1967, pp. 218–24
Saussure F de 1896 Accentuation lituanienne. Indogermanische Forschungen. Anzeiger 6: 157–66
Saussure F de 1916 Cours de linguistique générale ed. by Charles Bally and Albert Sechehaye, with the collaboration of Albert Riedlinger. Payot, Lausanne and Paris. (2nd ed., Paris: Payot, 1922; 3rd and last corrected ed., 1931; 4th ed., 1949; 5th ed., 1960, etc. English transl.: (1) By Wade Baskin, Course in General Linguistics, Philosophical Library, London and New York, 1959 (repr., New York: McGraw-Hill, 1966; rev. ed., Collins/Fontana, London, 1974), and (2) by Roy Harris, Course in General Linguistics, Duckworth, London, 1983.)
Saussure F de 1922 Recueil des publications scientifiques ed. by Charles Bally and Léopold Gautier. Payot, Lausanne; C. Winter, Heidelberg. (Repr. Slatkine, Geneva, 1970; it includes a reprint of the Mémoire, his 1880 dissertation (pp. 269–338), and all of Saussure’s papers published during his lifetime.)
Saussure F de 1957 Cours de linguistique générale (1908–1909): Introduction. Cahiers Ferdinand de Saussure 15: 6–103
Saussure F de 1967–1968, 1974 Cours de linguistique générale. Édition critique par Rudolf Engler, 4 fasc. Otto Harrassowitz, Wiesbaden, Germany
Saussure F de 1972 Cours de linguistique générale. Payot, Paris
Saussure F de 1978 (1872) Essai pour réduire les mots du grec, du latin et de l’allemand à un petit nombre de racines. Cahiers Ferdinand de Saussure 32: 77–101
Strozier R M 1988 Saussure, Derrida, and the Metaphysics of Subjectivity. Mouton de Gruyter, Berlin and New York
Szemerényi O 1973 La théorie des laryngales de Saussure à Kuryłowicz et à Benveniste: Essai de réévaluation. Bulletin de la Société de Linguistique de Paris 68: 1–25
Tallis R 1988 Not Saussure. Macmillan, London
Thibault P J 1997 Re-Reading Saussure: The Dynamics of Signs in Social Life. Routledge, London
E. F. K. Koerner
Savage, Leonard J (1917–71)
L. J. Savage, always addressed by friends as Jimmie, was born in Detroit, Michigan on 20 November 1917. His father was in the real-estate business, his mother a nurse. Throughout his life he suffered from poor eyesight, a combination of nystagmus and extreme myopia, which was to affect the development of his career in many ways. Most of us accept reading as an easy, even casual, activity, whereas to him it was a serious business with the text held close to the eye. The result was that he appeared to regard the material as important and would absorb it with an intensity that often escapes those with normal vision. To have him read something you had written was, at first, an uncomfortable business, for he would politely question much of the material, but later you would realize that he was being constructive, and the final result benefited enormously from his study of it. He was a superb lecturer, once remarking that he could hold himself spell-bound for an hour. When, towards the end of his life, he gave the Fisher lecture, he held the audience spell-bound for considerably more than the allotted hour.
The poor eyesight at first affected him adversely, his high-school teacher not recommending him for further education. His parents were insistent and he went initially to Wayne University and then to the University of Michigan. His attempts to do biology failed because he could not draw, and chemistry was ruled out when he dropped the beaker in the laboratory. Eventually he met a fine mathematics teacher who introduced him to a field in which the reading had to be intense, yet minimal, where his brilliance was recognized and he gained a Ph.D in the application of vectorial methods to metric geometry. After a year each at Princeton, Cornell, and Brown, 1944 found him a member of the Statistical Research Group at Columbia University, a group he described as ‘one of the greatest hotbeds statistics has ever had,’ a judgement that is surely sound, as almost all those who were to become the leading statisticians in the US of the 1950s were in the group or associated with it. So he became a statistician, and although his greatest work was in the theory, as such he was able to indulge his interest in science, so that throughout his career he enjoyed, and was extremely good at, consulting with his scientific colleagues. It was not just science that interested him, for he had inherited from his father a respect for business, and had acquired from Milton Friedman, a member of the group, an appreciation of economics, that made his advice useful and generously available to all serious thinkers. In 1946, Savage went to the University of Chicago where, from 1949, he was in the Statistics department founded by Allen Wallis, becoming chairman from 1957 until he left the university in 1960.
His departure was a turning point in his life. A reason for leaving was personal difficulties in his marriage to Jane Kretschmer that they hoped could be mended by a move to Michigan with their two sons, Sam and Frank. It did not work out and they were divorced in 1964. That year he married Jean Pearce and moved to Yale. It was in New Haven on 1 November 1971 that he tragically died at the early age of 53, and statistics lost a great scholar and many of us a dear friend.
The move from Chicago was partly influenced by his perception of his relations with colleagues there. He felt, as we will see below, that he had produced a sound axiomatization of statistics showing that many standard statistical procedures were unsatisfactory, so that his colleagues should either explain where the axioms were wrong, or else abandon the faulty procedures. They did neither. They were not alone in this; most statisticians acted as the Chicago faculty, refusing to contest the axioms, yet refusing to use the results (many still do so today) and Savage found it difficult to get a suitable post until Yale obliged. The monograph by Savage with others (1962), although now rather dated, indicates the sort of antagonism that existed, though, in that book, all the protagonists are on their best behavior. Savage (1981) contains most of his more important papers, tributes from others and a technical summary of his work. Many of his papers, contained therein, deal, not with technicalities, but with his growing appreciation of statistical ideas as they developed throughout his life.
Savage is justifiably famous for one tremendous contribution to statistics and the scientific method, contained in his 1954 book The Foundations of Statistics. Or, to be more correct, the first seven chapters of that book. Anyone contemplating study of this glorious achievement should first read his preface to the second edition in 1972, for this is one of the most honest pieces of scientific writing that I know. To appreciate the importance of the book, it is necessary to cast one’s mind back to the state of statistics around 1950. Fisher made enormous advances from 1920 onwards. Neyman and Pearson had introduced new ideas and Wald was developing decision analysis. But the statistical procedures available, though undoubtedly valuable and much appreciated by scientists, especially biologists, were a disparate collection of ideas lacking coherence. Savage had the idea that statistics should be like other branches of mathematics and be based on a set of concepts, termed axioms, from which the procedures would follow as theorems. He wanted to provide a firm basis for Fisher’s results. Another way of understanding is to recognize that Fisher had suggested procedures whose properties had been studied by him and others, whereas Savage turned the method around and asked what properties we wanted and discovered what procedures provided them. His expectation was that they would prove to be just what statisticians were using.
To his great surprise, they were not. Interestingly, his last paper, Savage et al. (1976), is a brilliant, sympathetic review of Fisher’s work. Savage had worked with von Neumann in Princeton and had much appreciated his work on games with Morgenstern (1944), that introduced axioms leading to the concept of a utility function for the outcomes of a game. However, their treatment had used probability without justification, so Savage conceived the idea of developing an axiomatic system for decision-making under uncertainty which would not use probability in the axioms but derive it in theorems, so uniting the ideas of probability and utility. Also influential was the development by Wald (1950) of an attempt to unite statistical ideas under the concept of decision-making, using von Neumann’s idea of minimax, but lacking any axiomatic foundation. Savage’s 1954 book presents axioms for a single decision-maker, now often referred to as ‘you,’ faced with selecting a course of action in the face of uncertainty. He then proves three
theorems: first, that your various uncertainties present must combine according to the rules of probability; second, that your perceptions of the merits of the possible outcomes that result from your actions must be described by a real-valued utility function (which is itself based on probability); third, that your optimum decision is that which maximizes your expected utility (MEU), the expectation being evaluated according to the probabilities developed in the first part of this trilogy. One of the axioms he used encapsulated a simple notion that has since been seen to be of some importance. If you prefer A to B when C is true and, at the same time, prefer A to B when C is false; then you prefer A to B when you do not know whether or not C is true. It is called the ‘sure-thing’ principle because the preference of A over B is sure, in American parlance, at least as far as C is concerned. The introduction of MEU was not new, it having been used by Daniel Bernoulli in the eighteenth century, but the development extended its use considerably, in particular by showing that it was the only sensible method for a single decision-maker to use. Utility had recently been explained, as we have seen, by von Neumann. The original and dramatically important result was the first, saying that probability was the only sensible description for uncertainty, so that the rules of probability were not arbitrary but dictated by the sensible and modest requirements expressed in the axioms. Others, from Laplace onwards, had used probability without justification: Savage showed that it was inevitable. So it was the extensive and unique use of probability that was Savage’s main contribution in the book, and there are two ways in which his analysis was influential. First was the fact that the probability was personal, expressing the uncertainty of the decision-maker, ‘you.’ It was not the probability, but rather your probability. 
An immediate reaction is that this contradicts the very nature of the scientific method which is supposed to yield objective results. This apparent contradiction was resolved in a later paper with Edwards and Lindman (Savage et al. 1963), which showed, in the principle of stable estimation, that under general conditions, people with different uncertainties would reach agreement on receipt of enough data. This result agrees with the observation that scientists typically disagree with one another in the early stages of an investigation but are eventually brought together as the evidence accumulates. Many now think that the reduction of uncertainty, through data, incorporated in the laws of probability, expresses the nature of induction and is part of the scientific method. A second, influential factor was the result that probability was the only sensible description of uncertainty, for scientists, encouraged by statisticians, had been using other methods, for example in the use of tail-area significance tests of a null hypothesis, H, where the relevant quantity is the probability in the tail
of a distribution, given that H is true. Savage’s argument was that the proper evaluation is the probability of H, given the data, not of an aspect of the data, given H. Similar contradictions arise with confidence intervals for a parameter which involve probability statements about the interval, not about the parameter. These are instances where his conclusions clashed with standard practice, leading to disagreements like that at Chicago mentioned above. There was a third way in which Savage’s ideas contradicted the current practice, though this was not fully realized until Birnbaum’s (1962) paper. The axiomatic development showed that if you had a statistical model with probability p(x|θ) for data x given parameter θ, then the only way the data could affect your uncertainty of θ, given x, was through the likelihood function p(x|θ) expressing how that probability varied with θ for the fixed, observed x. This is the likelihood principle. Significance tests clearly violate the principle since they use a tail-area which involves integration of p(x|θ) over x-values, that is, over data that you do not have. Indeed, many statistical procedures, even today, violate the principle. A related example of conflict with current practice is found in optional stopping. It had long been known that, under commonly occurring circumstances, it is possible to continue sampling until a null hypothesis is rejected at any preassigned significance level, so that you can opt to stop at your convenience. Savage showed that it was the significance test that was at fault, not the stopping procedure. It is an astonishing fact, that he explains in the preface to the second edition, that he and others who worked with him, including myself, in the mid-1950s, did not appreciate these ideas and did not comprehend that, rather than his work justifying statistical practice, it altered it. 
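The conflict between the likelihood principle and tail-area tests can be made concrete with a small sketch. The data here are hypothetical and not taken from the article: 9 heads and 3 tails from a coin with unknown probability θ of heads, recorded under two assumed sampling plans, either a fixed 12 tosses (binomial) or tossing until the third tail (negative binomial). The function names are illustrative only. The two likelihoods differ by a constant factor, so under the likelihood principle they carry the same information about θ, yet the tail areas for the null hypothesis θ = 0.5 differ.

```python
from math import comb

HEADS, TAILS = 9, 3  # hypothetical observed data


def binomial_likelihood(theta):
    """p(x|theta) when the number of tosses (12) was fixed in advance."""
    return comb(HEADS + TAILS, HEADS) * theta**HEADS * (1 - theta)**TAILS


def negative_binomial_likelihood(theta):
    """p(x|theta) when tossing continued until the third tail appeared."""
    # The final toss must be a tail, so only the first 11 tosses are free.
    return comb(HEADS + TAILS - 1, HEADS) * theta**HEADS * (1 - theta)**TAILS


def likelihood_ratio(theta):
    """Ratio of the two likelihoods; it does not depend on theta."""
    return binomial_likelihood(theta) / negative_binomial_likelihood(theta)


def binomial_tail(theta=0.5):
    """P(at least 9 heads in 12 tosses | theta): sums over unobserved data."""
    n = HEADS + TAILS
    return sum(comb(n, k) * theta**k * (1 - theta)**(n - k)
               for k in range(HEADS, n + 1))


def negative_binomial_tail(theta=0.5):
    """P(at least 9 heads before the third tail | theta)."""
    # Equivalent event: at most 2 tails among the first 11 tosses.
    m = HEADS + TAILS - 1
    return sum(comb(m, j) * (1 - theta)**j * theta**(m - j)
               for j in range(TAILS))


if __name__ == "__main__":
    print(likelihood_ratio(0.3), likelihood_ratio(0.8))
    print(binomial_tail(), negative_binomial_tail())
```

With these numbers the two tail areas fall on opposite sides of the conventional 0.05 level, although the observed tosses are identical; only the stopping intention differs.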
The latter part of his book in which the justification is attempted, largely using the tool of minimax that von Neumann had employed with great effect in the theory of games, is today of little interest, in contrast to the brilliance of the early chapters. Savage was a true scholar, a man who studied the work of others and accepted any valid criticism of his work, to incorporate their ideas with due acknowledgment. And so he understood the ideas of Ramsey (1926) who had, in a less rigorous presentation, developed similar ideas, ideas which no one had understood until Savage. But more importantly, he appreciated the work of the Italian, de Finetti, who, from an entirely different standpoint, and one which is perhaps more forceful than Savage’s, developed the concept and uniqueness of personal probability. (For English readers, the best references are de Finetti (1972, 1974, 1975), though his ideas originate in the 1930s.) Savage acquired skill in Italian and worked with him. One of the problems de Finetti had studied was your assessment of probability. You are uncertain whether it will rain tomorrow in your city; so according to their thesis, you have a probability for rain.
How are you to assess its value? De Finetti suggested a scoring rule: if you choose p as your value, you will receive a penalty score (1 − p)² if it rains and p² if not. Scores for different events will be added. To minimize your total score the values that you give must be probabilities. The basic paper was written by Savage (1971) with due acknowledgment to the originator. With Hewitt (1955), Savage also extended significantly another result of de Finetti on exchangeability. Savage conducted a major study with Dubins that resulted in their book (Dubins and Savage 1965). The problem here is that you are in an unfavorable gaming situation at a casino and your object is to maximize your chance of winning a fixed sum of money; how should you play? They showed that boldness pays; bet all you have or the maximum amount the casino will allow. This led to a thorough study of betting strategies and many interesting and useful results were obtained. For reasons that are not entirely clear, the school of statistics that rests on Savage’s results is called Bayesian, after the eighteenth century cleric, Thomas Bayes, partly because much use is made of Bayes’s theorem. Today it is a flourishing school of statistics with a large and increasing number of adherents, that co-exists alongside the earlier school, often called frequentist because probability therein refers to frequency, whereas to a Bayesian it expresses your uncertainty of an event. Bayesians believe that their approach to the handling of data, and the making of decisions that use the data, is important for social and behavioral sciences for the following reasons. If these subjects are to be truly scientific, their arguments must depend heavily on data which, properly interpreted, can be used to defend theories in the interplay of theory and practice that is the scientific method. 
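Returning briefly to de Finetti's quadratic scoring rule described earlier, a short sketch may help show why it rewards honest assessment. It does not reproduce de Finetti's coherence argument; it illustrates the related property that, if your probability of rain is q, the announced value p that minimizes your expected penalty is p = q. The function names and the grid search are assumptions made for illustration.

```python
def expected_penalty(p, q):
    """Expected quadratic penalty for announcing p when your probability is q.

    The penalty is (1 - p)**2 if it rains and p**2 if it does not.
    """
    return q * (1 - p) ** 2 + (1 - q) * p ** 2


def best_announcement(q, steps=1000):
    """Search a grid on [0, 1] for the announcement with the least expected penalty."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda p: expected_penalty(p, q))


if __name__ == "__main__":
    for q in (0.1, 0.3, 0.8):
        print(q, best_announcement(q))
```

Algebraically, the expected penalty equals q(1 − q) + (p − q)², which is smallest when p = q, so shading the announced value in either direction only increases the expected score.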
In the physical sciences, the data ordinarily take the form of results of planned experiments in the field or laboratory, experiments which can, and should be, repeated by others, in accord with stable estimation, their repetition connecting naturally with frequency. In the social sciences, such planning and repetition is rarely possible, so that reliance has to be placed on observational data, and the behavioral sciences are scarcely better off. It is now widely appreciated that observational data have to be analyzed with considerable care, so that any conclusions are soundly based and not confounded with other factors that the lack of planning has been unable to control. Statistics therefore becomes a more important ingredient in the social, than the physical, sciences. Furthermore, social decisions must often be made in the face of uncertainty, whereas more time and effort by an experimental scientist can substantially reduce, if not eliminate, the uncertainty. The Bayesian approach recognizes and incorporates all of these factors. There is another sense in which the Bayesian approach may be valuable and that lies in its ability to express uncertainty about any aspect of a study, not confining probability to the repetitious, frequency
aspects. Often the conclusions in the social and behavioral sciences are tentative and the personalistic perception of probability, as a measure of your belief in a hypothesis, can provide an expression of uncertainty that can be conveyed to others. We still have a long way to go in employing these ideas, when even a weather-forecaster is reluctant to say there is a probability of 0.8 that it will rain tomorrow, or, if he does, sometimes does not appreciate the meaning of what is being said. Savage had been described as ‘the Euclid of statistics’ in that he did for statistics what Euclid did for geometry, laying down a set of axioms from which, as theorems, the only sensible statistical procedures emerge. As with Euclid, other geometries will emerge, so other statistical systems may arise, but, for the moment, Savage’s Bayesian approach is the only systematic one available and appears to work. He was, in the sense of Kuhn, a true revolutionary, who overturned one paradigm, replacing it by another without, at first, realizing what he had done. See also: Bayesian Statistics; Bayesian Theory: History of Applications; Decision Theory: Bayesian; Fisher, Ronald A (1890–1962); Hotelling, Harold (1895–1973); Probability: Interpretations; Risk: Theories of Decision and Choice; Statistical Methods, History of: Pre-1900; Statistics, History of; Tversky, Amos (1937–96)
Bibliography
Birnbaum A 1962 On the foundations of statistical inference. Journal of the American Statistical Association 57: 269–306
de Finetti B 1972 Probability, Induction and Statistics: The Art of Guessing. Wiley, London
de Finetti B 1974/5 Theory of Probability, 2 Vols. Wiley, London
Dubins L E, Savage L J 1965 How to Gamble if You Must: Inequalities for Stochastic Processes. McGraw-Hill, New York
Ramsey F P 1926 Truth and probability. In: Braithwaite R B (ed.) The Foundations of Mathematics and Other Logical Essays. Routledge and Kegan Paul, London
Savage L J 1954 The Foundations of Statistics. Wiley, New York
Savage L J 1971 Elicitation of personal probabilities and expectations. Journal of the American Statistical Association 66: 783–801
Savage L J 1972 The Foundations of Statistics, 2nd edn. Dover, New York
Savage L J et al. 1976 On re-reading R A Fisher. Annals of Statistics 4: 441–500
Savage L J 1981 The Writings of Leonard Jimmie Savage: A Memorial Selection. American Statistical Association and Institute of Mathematical Statistics, Washington
Savage L J, Edwards W, Lindman H 1963 Bayesian statistical inference for psychological research. Psychological Review 70: 193–242
Savage L J, Hewitt E 1955 Symmetric measures on Cartesian products. Transactions of the American Mathematical Society 80: 470–501
Savage L J et al. 1962 The Foundations of Statistical Inference: A Discussion. Methuen, London
von Neumann J, Morgenstern O 1944 Theory of Games and Economic Behavior. Princeton University Press, Princeton, NJ
Wald A 1950 Statistical Decision Functions. Wiley, New York
D. V. Lindley
Savings Behavior: Demographic Influences In any period all economic output is either consumed or saved. By consuming, individuals satisfy their current material needs. By saving, individuals accumulate wealth that serves a multitude of purposes. When invested or loaned, wealth yields a stream of income to the holder. It is a means of providing for future material needs, for example, during retirement. Wealth provides partial protection against many of life’s uncertainties, including the loss of a job or unexpected medical needs. Wealth can be passed on to future generations out of feelings of altruism, or used as a carrot to encourage desired behavior among those who hope for a bequest. Wealth provides status to the holder. Because of the varied purposes served by saving, many behavioral models have been proposed and a variety of influencing factors have been identified. Demographic factors, including age structure, fertility, and mortality, have been found to influence saving in many studies. Additional factors that have been identified include the level of per capita income, economic growth rates, interest rates, characteristics of the financial system, fiscal policy, uncertainty, and public pension programs.
1. Saving Trends and their Importance In the 1960s, low saving rates in the developing world were a serious impediment to economic development. The industrialized countries had much higher rates of saving, but international capital flows from the industrialized to the developing world were insufficient to finance needed investment in infrastructure and industrial enterprise. Since that time, the developing countries have followed divergent paths. Saving rates in sub-Saharan Africa have remained low. After increasing somewhat in the 1970s, saving rates in Latin America declined during the late 1970s and early 1980s. In contrast, South Asian and especially East Asian saving rates have increased substantially and currently are well above saving rates found in the industrialized countries. The emergence of high saving rates is widely believed to bear major responsibility for Asia’s rapid economic growth up until the mid-1990s.
The industrialized countries face their own saving issue. Since the mid-1970s, they have experienced a gradual, but steady, decline in saving rates. US saving rates have reached especially low levels. Current US household saving rates are near zero. The low rates of saving raise two concerns: first, that economic growth will be unsustainable and, second, that current generations of workers will face substantially reduced standards of living when they retire.
2. General Saving Concepts The national saving rate is the outcome of decisions by three sets of actors (governments, firms, and households), but analyses of saving rates are overwhelmingly based on household behavioral models. The reliance on household models is justified on two grounds. First, decisions by households may fully incorporate the decisions made by firms and governments. Firms are owned by households. When firms accumulate wealth, the wealth of households increases as well. Thus, from the perspective of the household, saving by firms and saving by households are close substitutes. Consequently, household behavior determines the private saving rate, the combined saving of firms and households. Government saving also affects the wealth of the household sector by affecting current and future taxes. By issuing debt governments can, in principle, increase consumption and reduce national saving at the expense of future generations. However, households may choose to compensate future generations (their children) by increasing their saving and planned bequests. If they do so, national saving is determined entirely by the household sector and is independent of government saving (Barro 1974). Second, firms and governments may act as agents for households. Their saving on behalf of households may be influenced by the same factors that influence household behavior. The clearest example of this type of behavior is when firms or governments accumulate pension funds on behalf of their workers or citizens. Households are motivated to save for a number of reasons, but current research emphasizes three motives: insurance, bequests, and lifecycle motives. Households may accumulate wealth to insure themselves against uncertain events, e.g., the loss of a crop or a job, the death of a spouse, or an unanticipated medical expense. Households may save in order to accumulate an estate intended for their descendants. 
Household saving may be motivated by lifecycle concerns, the divergence between the household’s earnings profile and its preferred consumption profile. The importance of these and other motives in determining national saving rates is a matter of vigorous debate among economists. Proponents of the bequest motive have estimated that as much as 80 percent of US national saving is accounted for by the bequest motive (Kotlikoff 1988). Proponents of the
lifecycle motive have estimated that an equally large portion of US national saving is accounted for by the lifecycle motive (Modigliani 1988). Clearly, the way in which changing demographic conditions impinge on national saving rates will depend on the importance of these different motives.
3. The Lifecycle Model Both the larger debate on the determinants of saving and explorations of the impact on saving of demographic factors have been framed by the lifecycle model (Modigliani and Brumberg 1954). The key idea that motivates the model is the observation that for extended portions of our lives we are incapable of providing for our own material needs. Thus, economic resources must be reallocated from economically productive individuals concentrated at the working ages to dependents concentrated at young or old ages. Several mechanisms exist for achieving this reallocation. In traditional societies, the family plays a dominant role. Typically, the young, the old, and those of working age live in extended families supported by productive family members. In modern societies, solving the lifecycle problem is a shared responsibility. Although the young continue to rely primarily on the family, the elderly rely on the family, on transfers from workers effected by the government, and on personal wealth they have accumulated during their working years (Lee 2000). The lifecycle saving model is most clearly relevant to higher income settings where capital markets have developed, family support systems have eroded, and workers anticipate extended periods of retirement. Under these conditions, the lifecycle model implies that households consisting of young, working age adults will save, while households consisting of old, retired adults will dis-save. The national saving rate is determined, in part, by the relative size of these demographic groups. A young age structure yields high saving rates; an old age structure yields low saving rates. Likewise, a rise in the rate of population growth leads to higher saving rates because of the shift to a younger age structure. In the standard formulation of the lifecycle model, changes in the growth rate of per capita income operate in exactly the same way as changes in the population growth rate. 
Given higher long-run rates of economic growth, young adults have greater lifetime earnings than older adults. They have a correspondingly greater impact on the aggregate saving rate because of their control of a larger share of economic resources. Consequently, an increase in either the population growth rate or the per capita income growth rate leads to higher saving. The rate of growth effect is one of the most important and widely tested implications of the lifecycle saving model. With a great deal of consistency,
empirical studies have found that an increase in the rate of per capita income growth leads to an increase in the national saving rate. Empirical research does not, however, support the existence of a positive population rate of growth effect (Deaton 1989). The standard lifecycle model provides no basis for reconciling the divergent rate of growth effects. The impact of child dependency on household saving provides one possible explanation of why population growth and economic growth need not have the same effect on aggregate saving. Coale and Hoover (1958) were the first to point out the potentially important impact of child dependency. They hypothesized that ‘A family with the same total income but with a larger number of children would surely tend to consume more and save less, other things being equal’ (p. 25). This raised the possibility that saving follows an inverted-U shaped curve over the demographic transition. In low-income countries, with rapid population growth rates and young age structure, slower population growth would lead to higher saving. But in higher income countries that were further along in their demographic transitions, slower population growth would lead to a population concentrated at older, low saving ages as hypothesized in the lifecycle model. These contrasting demographic effects have been modeled in the empirical literature using the youth and old-age dependency ratios. The variable lifecycle model incorporates the role of child dependency by allowing demographic factors to influence both the age structure of the population and the age profiles of consumption, earning, and, hence, saving. In this version of the lifecycle model the size of the rate of growth effect varies depending among other things on the number or cost of children. As with the dependency ratio model, saving follows an inverted-U shaped path over the demographic transition. However, the impact of demographic factors varies with the rate of economic growth. 
The model implies that in countries with rapid economic growth, e.g., East Asia, demographic factors will have a large impact on saving, but in countries with slow economic growth, e.g., Africa, demographic factors would have a more modest effect (Mason 1987). Empirical studies of population and saving come in two forms. Most frequently, researchers have based their analyses on aggregate time series data now available for many countries. Recent estimates support the existence of large demographic effects on saving. Analysis by Kelley and Schmidt (1996) supports the variable lifecycle model. Higgins and Williamson (1997) find that demographic factors influence saving independently of the rate of economic growth. An alternative approach relies on microeconomic data to construct an age-saving profile. The impact of age structure is then assessed assuming that the age profile does not change. In the few applications undertaken, changes in age structure had an impact on saving that is more modest than found in analyses of
aggregate saving data. (See Deaton and Paxson (1997) for an example.) Until these approaches are reconciled, a firm consensus about the magnitude of demographic effects is unlikely to emerge.
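The rate of growth effect in the standard lifecycle formulation can be illustrated with a deliberately stylized two-generation sketch. Everything in it is an assumption made for illustration rather than a model from the literature cited here: one generation of workers and one of retirees, a fixed share s of earnings saved while working and spent in retirement, no interest, and growth rates n (population) and g (per capita income) measured per generation.

```python
def aggregate_saving_rate(s, n, g):
    """National saving rate in a stylized two-generation lifecycle economy.

    s: share of earnings that workers save (and later spend in retirement)
    n: population growth per generation
    g: per capita income growth per generation
    Retirees are normalized to 1 person who earned 1 unit while working.
    """
    workers = 1 + n                      # workers per retiree
    earnings_per_worker = 1 + g          # relative to retirees' past earnings
    national_income = workers * earnings_per_worker
    saving_by_workers = s * national_income
    dissaving_by_retirees = s * 1.0      # retirees spend what they saved
    return (saving_by_workers - dissaving_by_retirees) / national_income


if __name__ == "__main__":
    print(aggregate_saving_rate(0.2, 0.0, 0.0))   # stationary economy
    print(aggregate_saving_rate(0.2, 0.5, 0.0))   # population growth only
    print(aggregate_saving_rate(0.2, 0.0, 0.5))   # income growth only
```

In this sketch the saving rate reduces to s[1 − 1/((1 + n)(1 + g))], so n and g enter symmetrically and aggregate saving is zero without growth. That symmetry is exactly what the empirical findings discussed above call into question for population growth.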
4. Unresolved Issues There are a number of important issues that have not been resolved and require additional work. First, the saving literature does not yet adequately incorporate the impact of changing institutional arrangements. Studies of saving in the industrialized countries and recent work on developing countries consider the impact of state-sponsored pension programs on saving, but the impact of family support systems has received far too little emphasis. The erosion of the extended family in developing countries is surely one of the factors that has contributed to the rise of saving rates observed in many developing countries. Second, the role of mortality has received inadequate attention. The importance of lifecycle saving depends on the expected duration of retirement. In high mortality societies, few reach old age and many who do continue to work. Only late in the mortality transition, when there are substantial gains in the years lived late in life, does an important pension motive emerge. Third, the saving models currently in use are static models and do not capture important dynamics. Recent simulation work that combines realistic demographics with lifecycle saving behavior shows that during the demographic transition countries may experience saving rates that substantially exceed equilibrium values for sustained periods of time (Lee 2000).
Bibliography
Barro R J 1974 Are government bonds net wealth? Journal of Political Economy 6 (December): 1095–117
Coale C A, Hoover E M 1958 Population Growth and Economic Development in Low-income Countries: A Case Study of India’s Prospects. Princeton University Press, Princeton, NJ
Deaton A 1989 Saving in developing countries: Theory and review. Proceedings of the World Bank Annual Conference on Development Economics, Supplement to the World Bank Economic Review and the World Bank Research Observer, pp. 61–96
Deaton A, Paxson C 1997 The effects of economic and population growth on national saving and inequality. Demography 34(1): 97–114
Higgins M, Williamson J G 1997 Age structure dynamics in Asia and dependence on foreign capital. Population and Development Review 23(2): 261–94
Kelley A C, Schmidt R M 1996 Saving, dependency and development. Journal of Population Economics 9(4): 365–86
Kotlikoff L J 1988 Intergenerational transfers and savings. Journal of Economic Perspectives 2(2): 41–58
Lee R D 2000 Intergenerational transfers and the economic life cycle: A cross-cultural perspective. In: Mason A, Tapinos G (eds.) Sharing the Wealth: Demographic Change and Economic Transfers Between Generations. Oxford University Press, Oxford, UK
Lee R D, Mason A, Miller T 2000 Life cycle saving and the demographic transition: The case of Taiwan. In: Chu C Y, Lee R D (eds.) Population and Economic Change in East Asia, Population and Development Review. 26: 194–219
Mason A 1987 National saving rates and population growth: A new model and new evidence. In: Johnson D G, Lee R D (eds.) Population Growth and Economic Development: Issues and Evidence. University of Wisconsin Press, Madison, WI, pp. 523–60
Modigliani F 1988 The role of intergenerational transfers and life cycle saving in the accumulation of wealth. Journal of Economic Perspectives 2(2): 15–40
Modigliani F, Brumberg R 1954 Utility analysis and the consumption function: An interpretation of cross-section data. In: Kurihara K (ed.) Post-Keynesian Economics. Rutgers University Press, New Brunswick, NJ
A. Mason
Scale in Geography Scale is about size, either relative or absolute, and involves a fundamental set of issues in geography. Scale primarily concerns space in geography, and this article will focus on spatial scale. However, the domains of temporal and thematic scale are also important to geographers. Temporal scale deals with the size of time units, thematic scale with the grouping of entities or attributes such as people or weather variables. Whether spatial, temporal, or thematic, scale in fact has several meanings in geography.
1. Three Meanings of Scale The concept of scale can be confusing, insofar as it has multiple referents. Cartographic scale refers to the depicted size of a feature on a map relative to its actual size in the world. Analysis scale refers to the size of the unit at which some problem is analyzed, such as at the county or state level. Phenomenon scale refers to the size at which human or physical earth structures or processes exist, regardless of how they are studied or represented. Although the three referents of scale frequently are treated independently, they are in fact interrelated in important ways that are relevant to all geographers, and the focus of research for some. For example, choices concerning the scale at which a map
should be made depend in part on the scale at which measurements of earth features are made and the scale at which a phenomenon of interest actually exists.
1.1 Cartographic Scale Maps are smaller than the part of the earth’s surface they depict. Cartographic scale expresses this relationship, traditionally in one of three ways. A verbal scale statement expresses the amount of distance on the map that represents a particular distance on the earth’s surface in words, e.g., ‘one inch equals a mile.’ The representative fraction (RF) expresses scale as a numerical ratio of map distance to earth distance, e.g., ‘1:63,360.’ The RF has the advantage of being a unitless measure. Finally, a graphic scale bar uses a line of particular length drawn on the map and annotated to show how much earth distance it represents. A graphic scale bar has the advantage that it changes size appropriately when the map is enlarged or reduced. Alternatively, all three expressions of scale may refer to areal measurements rather than linear measurements, e.g., a 1-inch square may represent 1 square mile on the earth. Given a map of fixed size, as the size of the represented earth surface gets larger, the RF gets smaller (i.e., the denominator of the RF becomes a larger number). Hence, a ‘large-scale map’ shows a relatively small area of the earth, such as a county or city, and a ‘small-scale map’ shows a relatively large area, such as a continent or a hemisphere of the earth. This cartographic scale terminology is frequently felt to be counterintuitive when applied to analysis or phenomenon scale, where small-scale and large-scale usually refer to small and large entities, respectively. An important complexity about cartographic scale is that flat maps invariably distort spatial relations on the earth’s surface: distance, direction, shape, and/or area. How they distort these relations is part of the topic of map projections. In many projections, especially small-scale maps that show large parts of the earth, this distortion is extreme so that linear or areal scale on one part of the map is very different than on other parts. 
Even so-called equal area projections maintain equivalent areal scale only for particular global features, and not for all features at all places on the map. Variable scale is sometimes shown on a map by the use of a special symbol or multiple symbols at different locations.
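The relation between a verbal scale statement and the representative fraction is simple arithmetic, sketched below for the ‘one inch equals a mile’ example. The function names are illustrative assumptions, and the sketch ignores projection distortion, treating scale as constant across the map.

```python
INCHES_PER_MILE = 5280 * 12  # 63,360 inches in a statute mile


def rf_denominator(map_inches, ground_miles):
    """Denominator of the representative fraction 1:N for a verbal scale."""
    return ground_miles * INCHES_PER_MILE / map_inches


def ground_miles(map_inches, denominator):
    """Earth distance in miles represented by a map distance at scale 1:N."""
    return map_inches * denominator / INCHES_PER_MILE


def is_larger_scale(denominator_a, denominator_b):
    """True if a map at 1:denominator_a is larger-scale than one at 1:denominator_b."""
    # A larger fraction (smaller denominator) means a larger cartographic scale.
    return 1 / denominator_a > 1 / denominator_b


if __name__ == "__main__":
    print(rf_denominator(1, 1))            # 'one inch equals a mile'
    print(ground_miles(2.5, 63_360))       # 2.5 inches on that map
    print(is_larger_scale(24_000, 63_360)) # city-level map vs one-inch-to-a-mile map
```

Because the RF is unitless, the same denominator applies whether distances are measured in inches, centimeters, or any other unit, provided map and ground distances use the same one.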
1.2 Analysis Scale Analysis scale includes the size of the units in which phenomena are measured and the size of the units into which measurements are aggregated for data analysis and mapping. It is essentially the scale of understanding of geographic phenomena. Terms such as
‘resolution’ or ‘granularity’ are often used as synonyms for the scale of analysis, particularly when geographers work with digital representations of the earth’s surface in a computer by means of a regular grid of small cells in a satellite image (rasters) or on a computer screen (pixels). Analysis scale here refers to the area of earth surface represented by a single cell. It has long been recognized that in order to observe and study a phenomenon most accurately, the scale of analysis must match the actual scale of the phenomenon. This is true for all three domains of scale: spatial, temporal, and thematic. Identifying the correct scale of phenomena is, thus, a central problem for geographers. Particularly when talking about thematic scale, using data at one scale to make inferences about phenomena at other scales is known as the cross-level fallacy (the narrower case of using aggregated data to make inferences about disaggregated data is well known as the ecological fallacy). Geographers often analyze phenomena at what might be called ‘available scale,’ the units that are present in available data. Many problems of analysis scale arise from this practice, but it is unavoidable given the difficulty and expense involved in collecting many types of data over large parts of the earth’s surface. Geographers have little choice in some cases but to analyze phenomena with secondary data, data collected by others not specifically for the purposes of a particular analysis. For example, census bureaus in many countries provide a wealth of data on many social, demographic, and economic characteristics of their populace. Frequently, the phenomenon of interest does not operate according to the boundaries of existing administrative or political units in the data, which after all were not created to serve the needs of geographic analysis. The resolution of image scanners on remote-sensing satellites provides another important example. 
Landsat imagery is derived from thematic mapper sensors, producing earth measurements at a resolution of about 30 by 30 meters. However, many phenomena occur at finer resolutions than these data can provide. Most useful is theory about the scale of a phenomenon’s existence. Frequently lacking this, but realizing that the available scale may not be suitable, geographers use empirical ‘trial-and-error’ approaches to try to identify the appropriate scale at which a phenomenon should be analyzed. Given spatial units of a particular size, one can readily aggregate or combine them into larger units; it is not possible without additional information or theory to disaggregate them into smaller units. Even given observations measured at very small units, however, there is still the problem of deciding in what way the units should be aggregated. This is known as the modifiable areal unit problem (MAUP, or MTUP in the case of temporal scale). Various techniques have been developed to study the implications of MAUP (Openshaw 1983).
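The dependence of results on the choice of aggregation units can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the data are synthetic (a broad gradient plus local noise), the zones are square blocks of cells, and the function names `aggregate` and `correlation_by_zone_size` are hypothetical helpers, not part of any technique cited in the text.

```python
import numpy as np

def aggregate(grid, block):
    """Average a square grid over non-overlapping block x block zones."""
    n = grid.shape[0] // block
    trimmed = grid[:n * block, :n * block]
    return trimmed.reshape(n, block, n, block).mean(axis=(1, 3))

def correlation_by_zone_size(x, y, blocks):
    """Pearson correlation between x and y after aggregating to each zone size."""
    out = {}
    for b in blocks:
        xa, ya = aggregate(x, b).ravel(), aggregate(y, b).ravel()
        out[b] = float(np.corrcoef(xa, ya)[0, 1])
    return out

rng = np.random.default_rng(0)   # synthetic data, not real observations
size = 64
# Assumed structure: a broad west-east gradient shared by both variables.
trend = np.linspace(0.0, 1.0, size)[None, :] * np.ones((size, 1))
x = trend + rng.normal(0.0, 1.0, (size, size))   # shared trend + local noise
y = trend + rng.normal(0.0, 1.0, (size, size))

results = correlation_by_zone_size(x, y, blocks=[1, 4, 16])
for b, r in results.items():
    print(f"zone size {b:>2} x {b:<2} cells: r = {r:.2f}")
```

In this setup the correlation between the two variables grows as cells are merged into larger zones, because averaging suppresses the local noise while the shared gradient remains. The same cell-level data would therefore support different conclusions at different analysis scales.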
1.3 Phenomenon Scale
Phenomenon scale refers to the size at which geographic structures exist and over which geographic processes operate in the world. It is the ‘true’ scale of geographic phenomena. Determining the scale of phenomena is clearly a major research goal in geography. It is a common geographic dictum that scale matters. Numerous concepts in geography reflect the idea that phenomena are scale-dependent or are defined in part by their scale. Vegetation stands are smaller than vegetation regions, and linguistic dialects are distributed over smaller areas than languages. The possibility that some geographic phenomena are scale independent is important, however. Patterns seen at one scale may often be observed at other scales; whether this is a matter of analogy or of the same processes operating at multiple scales is theoretically important. The mathematics of fractals has been applied in geography as a way of understanding and formalizing phenomena such as coastlines that are self-similar at different scales (Lam and Quattrochi 1992).
The belief has often been expressed that the discipline of geography, as the study of the earth as the home of humanity, can be defined partially by its focus on phenomena at certain scales, such as cities or continents, and not other scales. The range of scales of interest to geographers is often summarized by the use of terminological continua such as ‘local-global’ or ‘micro-, meso-, macroscale.’ The view that geographers must restrict their focus to particular ranges of scales is not shared universally, however, and advances have occurred, and will continue to occur, when geographers stretch the boundaries of their subject matter. Nonetheless, few would argue that subatomic or interplanetary scales are properly of concern for geography.
It is widely recognized that various scales of geographic phenomena interact, or that phenomena at one scale emerge from smaller or larger scale phenomena.
This is captured by the notion of a ‘hierarchy of scales,’ in which smaller phenomena are nested within larger phenomena. Local economies are nested within regional economies, and rivers are nested within larger hydrologic systems. Conceptualizing and modeling such scale hierarchies can be quite difficult, and the traditional practice within geography of focusing on a single scale largely continues.
2. Generalization
The world can never be studied, modeled, or represented in all of its full detail and complexity. Scale is important in part because of its consequences for the degree to which geographic information is generalized. Generalization refers to the amount of detail included in information; it is essentially an issue of simplification, but also includes aspects of selection and enhancement of features of particular interest. As one studies or represents smaller pieces of the earth, one tends strongly to deal with more detailed or more fine-grained aspects of geographic features. For example, large-scale maps almost always show features on the earth’s surface in greater detail than do small-scale maps; rivers appear to meander more when they are shown on large-scale maps, for instance. Studied most extensively by cartographers, generalization is in fact relevant to all three meanings of scale, and to all three domains of spatial, temporal, and thematic scale.
3. Conclusion
Issues of scale have always been central to geographic theory and research. Advances in the understanding of scale and the ability to investigate scale-related problems will continue, particularly with the increasingly common representation of geographic phenomena through the medium of digital geographic information (Goodchild and Proctor 1997). Cartographic scale is becoming ‘visualization’ scale. How is scale, spatial and temporal, communicated in dynamic, multidimensional, and multimodal representations, including visualization in virtual environments? Progress continues on the problem of automated generalization, programming intelligent machines to make generalization changes in geographic data as scale changes. The ability to perform multiscale and hierarchical analysis will be developed further. More profound than these advances, however, the widespread emergence of the ‘digital world’ will foster new conceptions of scale in geography.
Bibliography
Buttenfield B P, McMaster R B (eds.) 1991 Map Generalization: Making Rules for Knowledge Representation. Wiley, New York
Goodchild M F, Proctor J 1997 Scale in a digital geographic world. Geographical and Environmental Modeling 1: 5–23
Hudson J C 1992 Scale in space and time. In: Abler R F, Marcus M G, Olson J M (eds.) Geography’s Inner Worlds: Pervasive Themes in Contemporary American Geography. Rutgers University Press, New Brunswick, NJ
Lam N S-N, Quattrochi D A 1992 On the issues of scale, resolution, and fractal analysis in the mapping sciences. The Professional Geographer 44: 88–98
MacEachren A M 1995 How Maps Work: Representation, Visualization, and Design. Guilford Press, New York
Meyer W B, Gregory D, Turner B L, McDowell P F 1992 The local-global continuum. In: Abler R F, Marcus M G, Olson J M (eds.) Geography’s Inner Worlds: Pervasive Themes in Contemporary American Geography. Rutgers University Press, New Brunswick, NJ
Muehrcke P C, Muehrcke J O 1992 Map Use: Reading, Analysis, Interpretation, 3rd edn. JP Publications, Madison, WI
Openshaw S 1983 The Modifiable Areal Unit Problem. Geo Books, Norfolk, UK
D. R. Montello
Scaling and Classification in Social Measurement
Social measurements translate observed characteristics of individuals, events, relationships, organizations, societies, etc. into symbolic classifications that enable reasoning of a verbal, logical, or mathematical nature. Qualitative research and censuses together define one realm of measurement, concerned with assignment of entities to classification categories embedded within taxonomies and typologies. Another topic in measurement involves scaling discrete items of information such as answers to questions so as to produce quantitative measurements for mathematical analyses. A third issue is the linkage between social measurements and social theories.
1. Classifications
Classification assimilates perceived phenomena into symbolically labeled categories. Anthropological studies of folk classification systems (D’Andrade 1995) have advanced understanding of scientific classification systems, though scientific usages involve criteria that folk systems may not meet entirely. Two areas of social science employ classification systems centrally. Qualitative analyses such as ethnographies, histories, case studies, etc. offer classifications, sometimes newly invented, for translating experiences in unfamiliar cultures or minds into familiar terms. Censuses of individuals, of occurrences, or of aggregate social units apply classifications, usually traditional, in order to count entities and their variations. Both types of work depend on theoretical constructions that link classification categories.
1.1 Taxonomies
Every classification category is located within a taxonomy. Some more general categorization, Y, determines which entities are in the domain for the focal categorization, X; so an X always must be a kind of Y. ‘X is a kind of Y’ is the linguistic frame for specifying taxonomies. Concepts constituting a taxonomy form a logic tree, with subordinate elements implying superordinate items. Taxonomic enclosure of a classification category is a social construction that may have both theoretical and practical consequences. For example, if only violent crimes are subject to classification as homicides, then ‘homicide’ is a kind of ‘violent crime,’ and deaths caused by executive directives to release deadly pollution could not be homicides.
1.2 Typologies
A typology differentiates entities at a particular level of a taxonomy in terms of one or more of their properties. The differentiating property (sometimes called a feature or attribute) essentially acts as a modifier of entities at that taxonomic level. For example, in the USA kinship system siblings are distinguished in terms of whether they are male or female; in Japan by comparison, siblings are schematized in terms of whether they are older as well as whether they are male or female. A scientific typology differentiates entities into types that are exclusive and exhaustive: every entity at the relevant taxonomic level is of one defined type only, and every entity is of some defined type. A division into two types is a dichotomy, into three types a trichotomy, and into more than three types a polytomy. Polytomous typologies are often constructed by crossing multiple properties, forming a table in which each cell is a theoretical type. (The crossed properties might be referred to as variables, dimensions, or factors in the typology.) For example, members of a multiplex society have been characterized according to whether they do or do not accept the society’s goals on the one hand, and whether they do or do not accept the society’s means of achieving goals on the other hand; then crossing acceptance of goals and means produces a fourfold table defining conformists and three types of deviants. Etic–emic analysis involves defining a typology with properties of scientific interest (the etic system) and then discovering ethnographically which types and combinations of types are recognized in folk meanings (the emic system). Latent structure analysis statistically processes observed properties of a sample of entities in order to confirm the existence of hypothesized types and to define the types operationally.
1.3 Aggregate Entities
Aggregate social entities such as organizations, communities, and cultures may be studied as unique cases, where measurements identify and order internal characteristics of the entity rather than relate one aggregate entity to another. A seeming enigma in social measurement is how aggregate social entities can be described satisfactorily on the basis of the reports of relatively few informants, even though statistical theory calls for substantial samples of respondents to survey populations. The
key is that informants all report on the same thing (a single culture, community, or organization), whereas respondents in a social survey typically report on diverse things (their own personal characteristics, beliefs, or experiences). Thus, reports from informants serve as multiple indicators of a single state, and the number needed depends on how many observations are needed to define a point reliably, rather than how many respondents are needed to describe a population’s diversity reliably. As few as seven expert informants can yield reliable descriptions of aggregate social entities, though more are needed as informants’ expertise declines (Romney et al. 1986). Informant expertise correlates with greater intelligence and experience (D’Andrade 1995) and with having a high level of social integration (Thomas and Heise 1995).
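The construction of a polytomous typology by crossing properties, described above under Typologies, can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure. The property names and the helper functions `build_typology` and `classify` are hypothetical, and the labels for the three deviant types are an assumption borrowed from Merton's well-known typology of adaptation; the text itself names only the conformists.

```python
from itertools import product

# Two dichotomous properties crossed to form a fourfold typology.
# Labels other than "conformist" are an assumption (Merton's terms),
# not something stated in the text.
LABELS = {
    (True, True): "conformist",
    (True, False): "innovator",
    (False, True): "ritualist",
    (False, False): "retreatist",
}

def build_typology(properties):
    """Cross dichotomous properties; each combination is one theoretical type."""
    names = list(properties)
    combos = product(*(properties[n] for n in names))
    return [dict(zip(names, combo)) for combo in combos]

def classify(accepts_goals, accepts_means):
    """Assign an entity to exactly one type (exclusive and exhaustive)."""
    return LABELS[(accepts_goals, accepts_means)]

cells = build_typology({
    "accepts_goals": [True, False],
    "accepts_means": [True, False],
})
for cell in cells:
    print(cell, "->", classify(cell["accepts_goals"], cell["accepts_means"]))
```

Because every combination of property values maps to exactly one label, the lookup table enforces the exclusive-and-exhaustive requirement directly. Crossing a third dichotomous property would double the number of cells.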
2. Relations
Case grammar in linguistics defines events and relationships in terms of an actor, action, object, and perhaps instrumentation, setting, products, and other factors as well. Mapping sentences (Shye et al. 1994) apply the case grammar idea with relatively small lists of entities in order to classify relational phenomena within social aggregates. For example, interpersonal relations in a group can be specified by pairing group members with actions such as loves, admires, annoys, befriends, and angers. Mapping sentences defining the relations between individuals or among social organizations constitute the measurement model for social network research.
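A mapping sentence of this kind can be represented as the set of all actor, action, object combinations, from which a network matrix is filled in by observation. The sketch below assumes a three-person group with invented names and invented observations; the functions `mapping_sentence_slots` and `adjacency` are hypothetical helpers used only for illustration.

```python
from itertools import permutations

members = ["Ana", "Ben", "Cal"]               # hypothetical group members
actions = ["admires", "annoys", "befriends"]  # a subset of the actions named in the text

def mapping_sentence_slots(members, actions):
    """Enumerate every (actor, action, object) slot the mapping sentence allows."""
    return [(a, act, o) for a, o in permutations(members, 2) for act in actions]

def adjacency(observed, members, action):
    """Build a 0/1 adjacency matrix for one action from observed triples."""
    index = {m: i for i, m in enumerate(members)}
    matrix = [[0] * len(members) for _ in members]
    for actor, act, obj in observed:
        if act == action:
            matrix[index[actor]][index[obj]] = 1
    return matrix

slots = mapping_sentence_slots(members, actions)
# Invented observations, for illustration only.
observed = [
    ("Ana", "admires", "Ben"),
    ("Ben", "annoys", "Cal"),
    ("Cal", "admires", "Ana"),
]
print(len(slots))
print(adjacency(observed, members, "admires"))
```

Each action yields its own directed matrix over the same members, which is the form of data that social network analyses typically take as input.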
3. Scaling
Quantitative measurements differentiate entities at a given taxonomic level, serving like typological classifications but obtaining greater logical and mathematical power by ordering the classification categories. An influential conceptualization (Stevens 1951) posited four levels of quantification in terms of how numbers relate to classification categories. Nominal numeration involves assigning numbers arbitrarily simply to give categories unique names, such as ‘batch 243.’ An ordinal scale’s categories are ordered monotonically in terms of greater-than and less-than, and numbering corresponds to the rank of each category. Numerical ranking of an individual’s preferences for different foods is an example of ordinal measurement. Differences between categories can be compared in an interval scale, and numbers applied to categories reflect degrees of differences. Calendar dates are an example of interval measurements: we know from their birth years that William Shakespeare was closer in time to Geoffrey Chaucer than Albert Einstein was to Isaac Newton. In a ratio scale categories have magnitudes that are whole or fractional multiples of one another,
and numbers assigned to the categories represent these magnitudes. Population sizes are an example of ratio measurements: knowing the populations of both nations, we can say that Japan is at least 35 times bigger than Jamaica. A key methodological concern in psychometrics (Hopkins 1998) has been: how do you measure entities on an interval scale given merely nominal or ordinal information?
3.1 Scaling Dichotomous Items
Nominal data are often dichotomous yes–no answers to questions, a judge’s presence–absence judgments about the features of entities, an expert’s claims about truth–falsity of propositions, etc. Answers of yes, present, true, etc. typically are coded as ‘one’ and no, absent, false, etc. as ‘zero.’ The goal, then, is to translate zero-one answers for each case in a sample of entities into a number representing the case’s position on an interval scale of measurement.
The first step requires identifying how items relate to the interval scale of measurement in terms of a graph of the items’ characteristic curves. The horizontal axis of such a graph is the interval scale of measurement, confined to the practical range of variation of entities actually being observed. The vertical axis indicates probability that a specific dichotomous item has the value one for an entity with a given position on the interval scale of measurement. An item’s characteristic curve traces the changing probability of the item having the value one as an entity moves from having a minimal value on the interval scale to having the maximal value on the interval scale. Item characteristic curves have essentially three different shapes, corresponding to three different formulations about how items combine into a scale.
Spanning items have characteristic curves that essentially are straight lines stretching across the range of entity variation. A spanning item’s line may start as a low probability value and rise to a high probability value, or fall from a high value to a low value.
A rising line means that the item is unlikely to have a value of one with entities having a low score on the interval scale; the item is likely to have a score of one for entities having a high score on the interval scale; and the probability of the item being valued at one increases regularly for entities between the low and high positions on the scale. Knowing an entity’s value on any one spanning item does not permit assessing the entity’s position along the interval scale. However, knowing the entity’s values on multiple spanning items does allow an estimate of positioning to be made. Suppose heuristically that we are working with a large number of equivalent spanning items, each having an item characteristic curve that starts at probability 0.00 at the minimal point of the interval scale, and rises in a
straight line to probability 1.00 at the maximal point of the interval scale. The probability of an item being valued at one can be estimated from the observed proportion of all these items that are valued at one, which is simply the mean item score when items are scored zero-one. Then we can use the characteristic curve for the items to find the point on the interval scale where the entity must be positioned in order to have the estimated item probability. This is the basic scheme involved in construction of composite scales, where an averaged or summated score on multiple items is used to estimate an entity’s interval-scale value on a dimension of interest (Lord and Novick 1968). The more items that are averaged, the better the estimate of an entity’s position on the interval scale. The upper bound on number of items is pragmatic, determined by how much precision is needed and how much it costs to collect data with more items. The quality of the estimate also depends on how ideal the items are in terms of having straight-line characteristic curves terminating at the extremes of probability. Irrelevant items with a flat characteristic curve would not yield an estimate of scale position no matter how many of them are averaged, because a flat curve means that the probability of the item having a value of one is uncorrelated with the entity’s position on the interval scale. Inferences are possible with scales that include relevant but imperfect items, but more items are required to achieve a given level of precision, and greater weight needs to be given to the more perfect items.
Declivitous items have characteristic curves that rise sharply at a particular point on the horizontal axis.
Idealized, the probability of the item having a value of one increases from 0.00 to the left of the inflection point to 1.00 to the right of the inflection point; or alternatively the probability declines from 1.00 to 0.00 in passing the inflection point. Realistically, the characteristic curve of a declivitous item is S-shaped with a steep rise in the middle and graduated approaches to 0.00 at the bottom and to 1.00 at the top. The value of a single declivitous item tells little about an entity’s position along the interval scale. However, an inference about an entity’s scale position can be made from a set of declivitous items with different inflection points, or difficulties, that form a cumulative scale. Suppose heuristically that each item increases stepwise at its inflection point. Then for an entity midway along the horizontal axis, items at the left end of the scale will all have the value of one, items at the right end of the scale will all have the value zero, and the entity’s value on the interval scale is between the items with a score of one and the items with a score of zero. If the items’ inflection points are evenly distributed along the interval scale, then the sum of items’ zero-one scores for an entity constitutes an estimate of where the entity is positioned along the interval scale.
That is, few of the items have a value of one if the entity is on the lower end of the interval scale, and many of the items are valued at one if the entity is at the upper end of the interval scale. This is the basic scheme involved in Guttman scalogram analysis (e.g., see Shye 1978, Part 5). On the other hand, we might use empirical data to estimate the position of each item’s inflection point on the interval scale, while simultaneously estimating entity scores that take account of the item difficulties. This is the basic scheme involved in scaling with Rasch models (e.g., Andrich 1988). Entities’ positions on the interval scale can be pinned down as closely as desired through the use of more declivitous items with inflection points spaced closer and closer together. However, adding items to achieve more measurement precision at the low end of the interval scale does not help at the middle or the high end of the interval scale. Thus, obtaining high precision over the entire range of variation requires a large number of items, and it could be costly to obtain so much data. Alternatively, one can seek items whose characteristic curves rise gradually over a range of the interval scale such that sequential items on the scale have overlapping characteristic curves, whereby an entity’s position along the interval scale is indicated by several items. Regional items have characteristic curves that rise and fall within a limited range of the interval scale. That is, moving an entity up the interval scale increases the probability of a particular item having a value of one for a while, and then decreases the probability after the entity passes the characteristic curve’s maximum value. 
For example, in a scale measuring prejudice toward a particular ethnic group, the probability of agreeing with the item ‘they require equal but separate facilities’ increases as a person moves away from an apartheid position, and then decreases as the person moves further up the scale toward a nondiscriminatory position. A regional item’s characteristic curve is approximately bell-shaped if its maximum is at the middle of the interval scale, but characteristic curves at the ends of the scale are subject to floor and ceiling clipping, making them look like declivitous items. If an entity has a value of one on a regional item, then the entity’s position along the interval scale is known approximately, since the entity must be positioned in the part of the scale where that item has a nonzero probability of being valued at one. However, a value of zero on the same item can result from a variety of positions along the interval scale and reveals little about the entity’s position. Thus, multiple regional items have to be used to assess positions along the whole range of the scale. The items have to be sequenced relatively closely along the scale with overlapping characteristic curves so that no entity will end up in the noninformative state of having a zero value on all items.
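The need for closely spaced, overlapping regional items can be checked numerically. The sketch below rests on several assumptions that are not in the text: the characteristic curves are given a Gaussian shape with a peak probability of 0.9, the scale runs from 0 to 10, and item responses are treated as independent once the entity's position is fixed. The function names are hypothetical.

```python
import math

def regional_curve(position, center, width=1.0, peak=0.9):
    """Bell-shaped characteristic curve: P(item = 1) for an entity at `position`."""
    return peak * math.exp(-0.5 * ((position - center) / width) ** 2)

def prob_all_zero(position, centers, width=1.0):
    """Probability that an entity scores zero on every item (the noninformative state).

    Assumes item responses are independent given the entity's position.
    """
    p = 1.0
    for c in centers:
        p *= 1.0 - regional_curve(position, c, width)
    return p

def worst_case(centers, lo=0.0, hi=10.0, steps=101):
    """Largest all-zero probability over a grid of positions on the scale."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(prob_all_zero(x, centers) for x in grid)

sparse = [0.0, 5.0, 10.0]                 # item maxima far apart
dense = [float(c) for c in range(0, 11)]  # item maxima one unit apart
print(f"sparse items: worst all-zero probability = {worst_case(sparse):.2f}")
print(f"dense items:  worst all-zero probability = {worst_case(dense):.2f}")
```

With widely spaced items there are stretches of the scale where an entity is very likely to score zero on everything, whereas closely spaced items with overlapping curves keep that probability low across the whole range.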
One could ask judges to rate the scale position of each item, and average across judges to get an item score; then, later, respondents can be scored with the average of the items they accept. This simplistic approach to regional items was employed in some early attempts to measure social attitudes. Another approach is statistical unfolding of respondents’ choices of items on either side of their own positions on the interval scale in order to estimate scale values for items and respondents simultaneously (Coombs 1964).
Item analysis is a routine aspect of scale construction with spanning items, declivitous items, or regional items. One typically starts with a notion of what one wants to measure, assembles items that should relate to the dimension, and tests the items in order to select the items that work best. Since a criterion measurement that can be used for assessing item quality typically is lacking, the items as a group are assumed to measure what they are supposed to measure, and scores based on this assumption are used to evaluate individual items. Items in a scale are presumed to measure a single dimension rather than multiple dimensions. Examining the dimensionality assumption brings in additional technology, such as component analysis or factor analysis in the case of spanning items, multidimensional scalogram analysis in the case of declivitous items, and nonmetric multidimensional scaling in the case of regional items. These statistical methods help in refining the conception of the focal dimension and in selecting the best items for measuring that dimension.
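The composite-scale scheme for spanning items, and the use of group scores to evaluate individual items, can be sketched with simulated data. Everything here is an assumption made for illustration: positions are drawn on a 0 to 1 scale, the spanning items are idealized with linear curves from 0 to 1, one deliberately irrelevant item is added, and the item-rest correlation is used as the item-quality statistic (the text does not name a specific statistic).

```python
import numpy as np

rng = np.random.default_rng(1)                 # simulated data only
n_entities, n_items = 200, 40
true_pos = rng.uniform(0.0, 1.0, n_entities)   # positions on an assumed 0-1 interval scale

# Idealized spanning items: P(item = 1) rises linearly from 0 to 1 across the scale.
spanning = (rng.random((n_entities, n_items)) < true_pos[:, None]).astype(int)
# One irrelevant item with a flat characteristic curve (P = 0.5 everywhere).
flat = (rng.random((n_entities, 1)) < 0.5).astype(int)
items = np.hstack([spanning, flat])

def composite_score(responses):
    """Mean zero-one item score; with linear 0-to-1 curves this estimates position."""
    return responses.mean(axis=1)

def item_rest_correlation(responses):
    """Correlate each item with the mean of the remaining items."""
    out = []
    for j in range(responses.shape[1]):
        rest = np.delete(responses, j, axis=1).mean(axis=1)
        out.append(float(np.corrcoef(responses[:, j], rest)[0, 1]))
    return np.array(out)

estimate = composite_score(spanning)
fit = float(np.corrcoef(estimate, true_pos)[0, 1])
r = item_rest_correlation(items)
print("correlation of composite with true position:", round(fit, 2))
print("mean item-rest correlation, spanning items:", round(float(r[:-1].mean()), 2))
print("item-rest correlation, flat item:", round(float(r[-1]), 2))
```

In a run like this the composite tracks the simulated positions closely, while the flat item stands out with an item-rest correlation near zero, which is the kind of evidence used to drop items during scale construction.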
3.2 Ordered Assessments
Ordinal data, the starting point for a three-volume mathematical treatise on measurement theory (Krantz et al. 1971, Suppes et al. 1989, Luce et al. 1990), may arise from individuals’ preferences, gradings of agreement with statements, reckonings of similarity between stimuli, etc. Conjoint analysis (Luce and Tukey 1964, Michell 1990) offers a general mathematical model for analyzing such data. According to the conjoint theory of measurement, positions on any viable quantitative dimension are predictable from positions on two other quantitative dimensions, and this assumption leads to tests of a dimension’s usefulness given just information of an ordinal nature. For example, societies might be ranked in terms of their socioeconomic development and also arrayed in terms of the extents of their patrifocal technologies (like herding) and matrifocal technologies (like horticulture), each of which contributes additively to socioeconomic development. Conjoint analyses could be conducted to test the meaningfulness of these dimensions, preliminarily to developing interval scales of socioeconomic development and patrifocal and matrifocal technologies. Specific scaling methodologies, like Rasch scaling and nonmetric multidimensional scaling, can be interpreted within the conjoint analysis framework.
Magnitude estimations involve comparisons to an anchor, for example: ‘Here is a reference sound. … How loud is this next sound relative to the reference sound?’ Trained judges using such a procedure can assess intensities of sensations and of a variety of social opinions on ratio scales (Stevens 1951, Lodge 1981). Comparing magnitude estimation in social surveys to the more common procedure of obtaining ratings on category scales with a fixed number of options, Lodge (1981) found that magnitude estimations are more costly but more accurate, especially in registering extreme positions.
Rating scales with bipolar adjective anchors like good–bad are often used to assess affective meanings of perceptions, individuals, events, etc. Such scales traditionally provided seven or nine answer positions between the opposing poles with the middle position defined as neutral. Computerized presentations of such scales with hundreds of rating positions along a graphic line yield greater precision by incorporating some aspects of magnitude estimation. Cross-cultural and cross-linguistic research in dozens of societies has demonstrated that bipolar rating scales align with three dimensions: evaluation, potency, and activity (Osgood et al. 1975). An implication is that research employing bipolar rating scales should include scales measuring the standard three dimensions in order to identify contributions of these dimensions to rating variance on other bipolar scales.
4. Measurements and Theory
Theories and measurements are bound inextricably. In the first place, taxonomies and typologies, which are theoretical constructions even when rooted in folk classification systems, are entailed in defining which entities are to be measured, so are part of all measurements. Second, scientists routinely assume that any particular measurement is wrong to some degree, even a measurement based on a scale, and that combining multiple measurements improves measurement precision. The underlying premise is that a true value exists for that which is being measured, as opposed to observed values, and that theories apply to true values, not ephemeral observed values. In essence, theories set expectations about what should be observed, in contrast to what is observed, and deviation from theoretical expectations is interpreted as measurement error (Kyburg 1992). A notion of measurement error as deviation from theoretical expectations is widely applicable, even in qualitative research (McCullagh and Behan 1984, Heise 1989). Third, the conjoint theory of measurement underlying many current measurement technologies requires theoretical specification of relations between variables
before the variables’ viability can be tested or meaningful scales constructed. This approach is in creative tension with traditional deductive science, wherein variable measurements are gathered in order to determine if relations among variables exist and theories are correct. In fact, a frequent theme in social science during the late twentieth century was that measurement technology had to improve in order to foster the growth of more powerful theories. It is clear now that the dependence between measurements and theories is more bidirectional than was supposed before the development of conjoint theory.
See also: Classification: Conceptions in the Social Sciences; Dimensionality of Tests: Methodology; Factor Analysis and Latent Structure: IRT and Rasch Models; Test Theory: Applied Probabilistic Measurement Structures
Bibliography
Andrich D 1988 Rasch Models for Measurement. Sage, Newbury Park, CA
Coombs C H 1964 A Theory of Data. John Wiley & Sons, New York
D’Andrade R 1995 The Development of Cognitive Anthropology. Cambridge University Press, New York
Heise D R 1989 Modeling event structures. Journal of Mathematical Sociology 14: 139–69
Hopkins K D 1998 Educational and Psychological Measurement and Evaluation, 8th edn. Allyn and Bacon, Boston
Krantz D H, Luce R D, Suppes P, Tversky A 1971 Foundations of Measurement. Volume 1: Additive and Polynomial Representations. Academic Press, New York
Kyburg Jr. H E 1992 Measuring errors of measurement. In: Savage C W, Ehrlich P (eds.) Philosophical and Foundational Issues in Measurement. Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 75–91
Lodge M 1981 Magnitude Scaling: Quantitative Measurement of Opinions. Sage, Beverly Hills, CA
Lord F M, Novick M R 1968 Statistical Theories of Mental Test Scores. Addison-Wesley, Reading, MA
Luce R D, Krantz D H, Suppes P, Tversky A 1990 Foundations of Measurement. Volume 3: Representation, Axiomatization, and Invariance. Academic Press, New York
Luce R D, Tukey J W 1964 Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology 1: 1–27
McCullagh B C, Behan C 1984 Justifying Historical Descriptions. Cambridge University Press, New York
Michell J 1990 An Introduction to the Logic of Psychological Measurement. Lawrence Erlbaum Associates, Hillsdale, NJ
Osgood C, May W H, Miron M S 1975 Cross-cultural Universals of Affective Meaning. University of Illinois Press, Urbana, IL
Romney A K, Weller S C, Batchelder W H 1986 Culture as consensus: A theory of culture and informant accuracy. American Anthropologist 88: 313–38
Shye S (ed.) 1978 Theory Construction and Data Analysis in the Behavioral Sciences, 1st edn. Jossey-Bass, San Francisco
Shye S, Elizur D, Hoffman M 1994 Introduction to Facet Theory: Content Design and Intrinsic Data Analysis in Behavioral Research. Sage, Thousand Oaks, CA
Stevens S S 1951 Mathematics, measurement, and psychophysics. In: Stevens S S (ed.) Handbook of Experimental Psychology. John Wiley and Sons, New York, pp. 1–49
Suppes P, Krantz D H, Luce R D, Tversky A 1989 Foundations of Measurement. Volume 2: Geometrical, Threshold, and Probabilistic Representations. Academic Press, New York
Thomas L, Heise D R 1995 Mining error variance and hitting pay-dirt: Discovering systematic variation in social sentiments. The Sociological Quarterly 36: 425–39
D. R. Heise
Scaling: Correspondence Analysis
In the early 1960s a dedicated group of French social scientists, led by the extraordinary scientist and philosopher Jean-Paul Benzécri, developed methods for structuring and interpreting large sets of complex data. This group’s method of choice was correspondence analysis, a method for transforming a rectangular table of data, usually counts, into a visual map which displays rows and columns of the table with respect to continuous underlying dimensions. This article introduces this approach to scaling, gives an illustration and indicates its wide applicability. Attention is limited here to the descriptive and exploratory uses of correspondence analysis methodology. More formal statistical tools have recently been developed and are described in Multivariate Analysis: Discrete Variables (Correspondence Models).
1. Historical Background
Benzécri’s contribution to data analysis in general and to correspondence analysis in particular was not so much in the mathematical theory underlying the methodology as in the strong attention paid to the graphical interpretation of the results and in the broad applicability of the methods to problems in many contexts. His initial interest was in analyzing large sparse matrices of word counts in linguistics, but he soon realized the power of the method in fields as diverse as biology, archeology, physics, and music. The fact that his approach paid so much attention to the visualization of data, to be interpreted with a degree of ingenuity and insight into the substantive problem, fitted perfectly the esprit géométrique of the French and their tradition of visual abstraction and creativity. Originally working in Rennes in western France, this group consolidated in Paris in the 1970s to become an influential and controversial movement in post-1968 France. In 1973 they published the two fundamental volumes of L’Analyse des Données (Data Analysis), the first on La Classification, that is, unsupervised classification or cluster analysis, and the second on L’Analyse des Correspondances, or correspondence analysis (Benzécri 1973), as well as, from 1977, the journal Les Cahiers de l’Analyse des Données, all of which reflect the depth and diversity of Benzécri’s work. For a more complete historical account of the origins of correspondence analysis, see Nishisato (1980), Greenacre (1984), and Gifi (1990).
2. Correspondence Analysis
Correspondence analysis (CA) is a variant of principal components analysis (PCA) applicable to categorical data rather than interval-level measurement data (see Factor Analysis and Latent Structure: Overview). For example, Table 1 is a contingency table obtained from the 1995 International Social Survey Program (ISSP) survey on national identity, tabulating responses from 23 countries on the question: ‘How much do you agree/disagree with the statement: Generally (respondent’s country) is a country better than most other countries?’ (For example, Austrians are asked to evaluate the statement: Generally Austria is a country better than most other countries.) The object of CA is to obtain a graphical display in the form of a spatial map of the rows (countries) and columns (question responses), where the dimensions of the map as well as the specific positions of the row and column points can be interpreted.
The theory of CA can be summarized by the following steps:
(a) Let $N$ be the $I \times J$ table with grand total $n$ and let $P = (1/n)N$ be the correspondence matrix, with grand total equal to 1. CA actually analyzes the correspondence matrix, which is free of the sample size. If $N$ is a contingency table, then $P$ is an observed bivariate discrete distribution.
(b) Let $r$ and $c$ be the vectors of row and column sums of $P$ respectively, and $D_r$ and $D_c$ the diagonal matrices with $r$ and $c$ on the diagonal.
(c) Compute the singular value decomposition of the centred and standardized matrix with general element $(p_{ij} - r_i c_j)/\sqrt{r_i c_j}$:
$$D_r^{-1/2}\,(P - r c^{\mathsf T})\,D_c^{-1/2} = U D_\alpha V^{\mathsf T} \tag{1}$$
where the singular values are in descending order, $\alpha_1 \ge \alpha_2 \ge \cdots$, and $U^{\mathsf T} U = V^{\mathsf T} V = I$.
(d) Compute the standard coordinates $X$ and $Y$:
$$X = D_r^{-1/2} U, \qquad Y = D_c^{-1/2} V \tag{2}$$
and the principal coordinates $F$ and $G$:
$$F = X D_\alpha, \qquad G = Y D_\alpha \tag{3}$$
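Steps (a) to (d) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the author's software: the function name `correspondence_analysis` and the returned dictionary keys are assumptions made here, and the input is the first four countries of Table 1 (Austria, Bulgaria, Canada, Czech Republic) rather than the full table.

```python
import numpy as np

def correspondence_analysis(N):
    """Simple CA of a two-way table N, following steps (a) to (d)."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                          # (a) correspondence matrix
    r = P.sum(axis=1)                        # (b) row masses
    c = P.sum(axis=0)                        #     column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # (c) standardized residuals, eq. (1)
    U, alpha, Vt = np.linalg.svd(S, full_matrices=False)
    X = U / np.sqrt(r)[:, None]              # (d) standard coordinates, eq. (2)
    Y = Vt.T / np.sqrt(c)[:, None]
    F = X * alpha                            #     principal coordinates, eq. (3)
    G = Y * alpha
    return {"r": r, "c": c, "alpha": alpha, "X": X, "Y": Y, "F": F, "G": G}

# First four rows of Table 1 (illustrative subset only).
N = np.array([
    [272, 387, 184,  79,  33,  52],   # Austria
    [208, 338, 163, 112, 129, 155],   # Bulgaria
    [538, 620, 223,  99,  33,  30],   # Canada
    [ 73, 156, 372, 289, 158,  63],   # Czech Republic
])
ca = correspondence_analysis(N)
row_map = ca["F"][:, :2]   # row points of a symmetric map
col_map = ca["G"][:, :2]   # column points of a symmetric map
```

Because the matrix in (1) is centred, its smallest singular value is numerically zero, and the principal coordinates have weighted mean zero with respect to the masses.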
Table 1 Responses in 23 countries to question on national pride. Source: ISSP 1995

| Country | Agree strongly | Agree | Can’t decide | Disagree | Disagree strongly | Missing | TOTAL RESPONSE |
|---|---|---|---|---|---|---|---|
| Austria | 272 | 387 | 184 | 79 | 33 | 52 | 1007 |
| Bulgaria | 208 | 338 | 163 | 112 | 129 | 155 | 1105 |
| Canada | 538 | 620 | 223 | 99 | 33 | 30 | 1543 |
| Czech Republic | 73 | 156 | 372 | 289 | 158 | 63 | 1111 |
| Former E. Germany | 59 | 138 | 138 | 142 | 71 | 64 | 612 |
| Former W. Germany | 121 | 321 | 369 | 232 | 139 | 100 | 1282 |
| Great Britain | 151 | 408 | 282 | 139 | 25 | 53 | 1058 |
| Hungary | 66 | 175 | 268 | 258 | 143 | 90 | 1000 |
| Ireland | 167 | 529 | 149 | 120 | 14 | 15 | 994 |
| Italy | 70 | 324 | 298 | 281 | 98 | 23 | 1094 |
| Japan | 641 | 398 | 139 | 36 | 24 | 18 | 1256 |
| Latvia | 92 | 190 | 215 | 225 | 153 | 169 | 1044 |
| Netherlands | 158 | 758 | 545 | 417 | 129 | 82 | 2089 |
| Norway | 258 | 723 | 334 | 114 | 30 | 68 | 1527 |
| New Zealand | 283 | 499 | 170 | 41 | 8 | 42 | 1043 |
| Poland | 153 | 378 | 396 | 365 | 80 | 226 | 1598 |
| Philippines | 152 | 556 | 260 | 214 | 10 | 8 | 1200 |
| Russia | 272 | 297 | 352 | 307 | 134 | 223 | 1585 |
| Slovakia | 100 | 199 | 384 | 338 | 267 | 100 | 1388 |
| Slovenia | 61 | 204 | 258 | 366 | 64 | 83 | 1036 |
| Spain | 71 | 343 | 320 | 372 | 42 | 73 | 1221 |
| Sweden | 140 | 426 | 375 | 167 | 80 | 108 | 1296 |
| USA | 525 | 554 | 168 | 61 | 21 | 38 | 1367 |

Notice the following: The results of CA are in the form of a map of points representing the rows and columns with respect to a selected pair of principal axes, corresponding to pairs of columns of the coordinate matrices (usually the first two columns for the first two principal axes). The choice between principal and standard coordinates is described below. The total variance, called inertia, is equal to the sum of squares of the matrix decomposed in (1):
$$\sum_i \sum_j \frac{(p_{ij} - r_i c_j)^2}{r_i c_j} \tag{4}$$
which is the Pearson chi-squared statistic calculated on the original table divided by $n$ (see Multivariate Analysis: Discrete Variables (Overview)). The squared singular values $\alpha_1^2, \alpha_2^2, \ldots$, called the principal inertias, decompose the inertia into parts attributable to the respective principal axes, just as in PCA the total variance is decomposed along principal axes.

The most popular type of map, called the symmetric map, uses the first two columns of $F$ for the row coordinates and the first two columns of $G$ for the column coordinates, that is, both in principal coordinates as given by (3). An alternative scaling, which has a more coherent geometric interpretation but a less aesthetic appearance, is the asymmetric map, for example, rows in principal coordinates $F$ and columns in standard coordinates $Y$ in (2) (or vice versa). The choice between a row-principal or column-principal asymmetric map is governed by whether the original table is considered as a set of rows or a set of columns, respectively, when expressed in percentage form.

The positions of the rows and the columns in a map are projections of points, called profiles, from their true positions in high-dimensional space onto a best-fitting lower-dimensional space. A row or column profile is the corresponding row or column of the table divided by its respective total; in the case of a contingency table the profile is a conditional frequency distribution. Each profile is weighted by a mass equal to the value of the corresponding row or column margin, $r_i$ or $c_j$. The space of the profiles is structured by a weighted Euclidean distance function called the chi-squared distance, and the optimal map is obtained by fitting a lower-dimensional space which fits the profiles by weighted least-squares. Equivalent forms of (4) which show the use of profile, mass, and chi-squared distance are:
$$\sum_i r_i \sum_j \left(\frac{p_{ij}}{r_i} - c_j\right)^2 \Big/ c_j \;=\; \sum_j c_j \sum_i \left(\frac{p_{ij}}{c_j} - r_i\right)^2 \Big/ r_i \tag{5}$$
Thus the inertia is a weighted average squared distance between the profile vectors (e.g., $p_{ij}/r_i$, $j = 1, \ldots$ for a row profile $i$, weighted by the mass $r_i$) and their respective average (e.g., $c_j$, $j = 1, \ldots$, the average row profile), where the distance is of a weighted Euclidean form (e.g., with inverse weighting of the $j$-th term by $c_j$).
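The equivalences stated for the inertia can be checked numerically. The following is a minimal sketch, assuming the same four-country subset of Table 1 as illustrative input; all variable names are chosen here for illustration. It computes the inertia from (4), from the Pearson chi-squared statistic divided by n, from both profile forms in (5), and as the sum of the squared singular values.

```python
import numpy as np

# Illustrative subset of Table 1 (Austria, Bulgaria, Canada, Czech Republic).
N = np.array([
    [272, 387, 184,  79,  33,  52],
    [208, 338, 163, 112, 129, 155],
    [538, 620, 223,  99,  33,  30],
    [ 73, 156, 372, 289, 158,  63],
], dtype=float)

n = N.sum()
P = N / n
r, c = P.sum(axis=1), P.sum(axis=0)
E = np.outer(r, c)                      # r_i * c_j

# Eq. (4): sum of squared standardized residuals.
inertia_eq4 = ((P - E) ** 2 / E).sum()

# Pearson chi-squared on the original counts, divided by n.
expected_counts = n * E
chi2 = ((N - expected_counts) ** 2 / expected_counts).sum()

# Eq. (5), row form: masses times chi-squared distances to the average row profile.
row_profiles = P / r[:, None]
inertia_rows = (r * ((row_profiles - c) ** 2 / c).sum(axis=1)).sum()

# Eq. (5), column form: the same with rows and columns interchanged.
col_profiles = P / c
inertia_cols = (c * ((col_profiles - r[:, None]) ** 2 / r[:, None]).sum(axis=0)).sum()

# Sum of the principal inertias (squared singular values).
alpha = np.linalg.svd((P - E) / np.sqrt(E), compute_uv=False)
inertia_svd = (alpha ** 2).sum()
```

All five quantities agree up to floating-point rounding.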
An equivalent definition of CA is as a pair of classical scaling problems, one for the rows and one for the columns. For example, a square symmetric matrix of chi-squared distances can be calculated between the row profiles, with each point weighted by its respective row mass. Applying classical scaling (also known as principal coordinate analysis, see Scaling: Multidimensional) to this distance matrix, and taking the row masses into account, leads to the row principal coordinates in CA.

The singular value decomposition (SVD) may be written in terms of the standard coordinates in the following equivalent form, for the $(i, j)$-th element:
$$p_{ij} = r_i c_j \Big(1 + \sum_k \alpha_k x_{ik} y_{jk}\Big) \tag{6}$$
which shows that CA can be considered as a bilinear model (see Multivariate Analysis: Discrete Variables (Correspondence Models)). For any particular solution, for example in two dimensions where the first two terms of this decomposition are retained, the residual elements have been minimized by weighted least-squares.
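The bilinear form (6) can be illustrated with a short sketch. It assumes the four-country subset of Table 1 as input; the helper name `reconstitute` is hypothetical. Retaining all terms reproduces P exactly, while retaining two terms leaves a weighted residual sum of squares equal to the sum of the discarded principal inertias.

```python
import numpy as np

# Illustrative subset of Table 1 (Austria, Bulgaria, Canada, Czech Republic).
N = np.array([
    [272, 387, 184,  79,  33,  52],
    [208, 338, 163, 112, 129, 155],
    [538, 620, 223,  99,  33,  30],
    [ 73, 156, 372, 289, 158,  63],
], dtype=float)

P = N / N.sum()
r, c = P.sum(axis=1), P.sum(axis=0)
E = np.outer(r, c)
U, alpha, Vt = np.linalg.svd((P - E) / np.sqrt(E), full_matrices=False)
X = U / np.sqrt(r)[:, None]      # standard row coordinates
Y = Vt.T / np.sqrt(c)[:, None]   # standard column coordinates

def reconstitute(K):
    """Right-hand side of (6) using only the first K terms of the sum."""
    return E * (1.0 + (X[:, :K] * alpha[:K]) @ Y[:, :K].T)

P_full = reconstitute(len(alpha))   # all terms: recovers P
P_2d = reconstitute(2)              # two-dimensional solution

# Weighted least-squares residual of the two-dimensional solution.
residual = ((P - P_2d) ** 2 / E).sum()
discarded_inertia = (alpha[2:] ** 2).sum()
```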
3. Application
The symmetric map of Table 1, with rows and columns in principal coordinates, is given in Fig. 1. Looking at the positions of the question responses first with respect to the first (horizontal) principal axis, they are seen to lie in their substantive order from ‘strongly disagree’ on the left to ‘strongly agree’ on the right, with ‘missing’ on the disagreement side. The scale values of these categories constitute an optimal scale, by which is meant a standardized interval scale for the categorical variable of agreement–disagreement, including the missing value category, which optimally discriminates between the 23 countries, that is, which gives maximum between-country variance. In the two-dimensional map the response category points form a curve known as the ‘horseshoe’ or ‘arch,’ which is fairly common for data on an ordinal scale. The second dimension then separates out polarized groups towards the top, or inside the arch, where both extremes of the response scale lie, as well as the missing response in this case.

Turning attention to the countries now, they will line up from left to right in an ordination of agreement induced by the scale values. The five Eastern Bloc countries lie on the left, unfavorable, extreme of the map, with Japan, Canada, USA, and New Zealand at the other, favorable side. The countries generally follow the curved shape as well, but a country such as Bulgaria, which lies in a unique position inside the curve, is characterized by a relatively high polarization of responses as well as high missing values. Bulgaria’s position in the middle of the first axis contrasts with the position of Great Britain, for example, which is also in the middle but due to responses more in the intermediate categories of the scale rather than a mixture of extreme responses.

The principal inertias are indicated in Fig. 1 at the positive end of each axis and the quality of the display is measured by adding together the percentages of inertia, 68.6 percent + 17.8 percent = 86.4 percent. This means that there is a ‘residual’ of 13.6 percent not depicted in the map, which can only be interpreted by investigating the next dimensions from the third onwards. This 13.6 percent is the percentage of inertia minimized in the weighted least-squares solution of CA in two dimensions.

Figure 1 Symmetric correspondence analysis map of Table 1
4. Contributions to Inertia
Apart from assessing the quality of the map by the percentages of inertia, other more detailed diagnostics in CA are the so-called contributions to inertia, based on the two decompositions of the total inertia, first by rows and second by columns:
$$\sum_i \sum_j \frac{(p_{ij} - r_i c_j)^2}{r_i c_j} = \sum_k \alpha_k^2 = \sum_k \sum_i r_i f_{ik}^2 = \sum_k \sum_j c_j g_{jk}^2$$
Every row component $r_i f_{ik}^2$ (respectively, column component $c_j g_{jk}^2$) can be expressed relative to the principal inertia $\alpha_k^2$ of the corresponding dimension $k$, where $\alpha_k^2$ is the sum of these components for all the rows (respectively, columns). These relative values provide a diagnostic for deciding which rows (respectively, columns) are important in the determination of the $k$-th principal axis.

In a similar way, for a fixed row each row component $r_i f_{ik}^2$ (respectively, column component $c_j g_{jk}^2$ for a fixed column) can be expressed relative to the total $\sum_k r_i f_{ik}^2$ (respectively, $\sum_k c_j g_{jk}^2$) across all principal axes. These relative values provide a diagnostic for deciding which axes are important in explaining each row or column. These values are analogous to the squared factor loadings in factor analysis, that is, squared correlations between the row or column and the corresponding principal axis or factor (see Factor Analysis and Latent Structure: Overview).
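Both diagnostics for the rows can be computed directly from the principal coordinates. The sketch below again assumes the four-country subset of Table 1 as illustrative input; the names `ctr` (contribution of a row to an axis) and `cor` (share of a row's inertia carried by an axis) are labels chosen here, not notation from the text. Axes with a numerically zero singular value are dropped before dividing.

```python
import numpy as np

# Illustrative subset of Table 1 (Austria, Bulgaria, Canada, Czech Republic).
N = np.array([
    [272, 387, 184,  79,  33,  52],
    [208, 338, 163, 112, 129, 155],
    [538, 620, 223,  99,  33,  30],
    [ 73, 156, 372, 289, 158,  63],
], dtype=float)

P = N / N.sum()
r, c = P.sum(axis=1), P.sum(axis=0)
E = np.outer(r, c)
U, alpha, Vt = np.linalg.svd((P - E) / np.sqrt(E), full_matrices=False)

keep = alpha > 1e-12                       # drop the trivial (zero) axis
alpha = alpha[keep]
F = (U[:, keep] / np.sqrt(r)[:, None]) * alpha   # row principal coordinates

row_parts = r[:, None] * F ** 2            # components r_i * f_ik^2

# Relative to the principal inertia of axis k: which rows determine the axis?
ctr = row_parts / alpha ** 2               # each column sums to 1

# Relative to the row's total over all axes: which axes explain the row?
cor = row_parts / row_parts.sum(axis=1, keepdims=True)   # each row sums to 1
```

The column diagnostics follow in the same way from G and the column masses c.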
5. Extensions
Although the primary application of CA is to a two-way contingency table, the method is regularly applied to analyze multiway tables, tables of preferences, ratings, as well as measurement data on ratio- or interval-level scales. For multiway tables there are two approaches. The first approach is to convert the table to a flat two-way table which is appropriate to the problem at hand. Thus, if a third variable is introduced into the example above, say ‘sex of respondent,’ then an appropriate way to flatten the three-way table would be to interactively code ‘country’ and ‘sex’ as a new row variable, with 23 × 2 = 46 categories, cross-tabulated against the question responses. For each country there would now be a male and a female point and one could compare sexes and countries in this richer map. This process of interactive coding of the variables can continue as long as the data do not become too fragmented into interactive categories of very low frequency.

Another approach to multiway data, called multiple correspondence analysis (MCA), applies when there are several categorical variables skirting the same issue, often called ‘items.’ MCA is usually defined as the CA algorithm applied to an indicator matrix $Z$ with the rows being the respondents or other sampling units, and the columns being dummy variables for each of the categories of all the variables. The data are zeros and ones, with the ones indicating the chosen categories for each respondent. The resultant map shows each category as a point and, in principle, the position of each respondent as well. Alternatively, one can set up what is called the Burt matrix, $B = Z^{\mathsf T} Z$, the square symmetric table of all two-way cross-tabulations of the variables, including the cross-tabulations of each variable with itself (named after the psychologist Sir Cyril Burt). The Burt matrix is reminiscent of a covariance matrix and the CA of the Burt matrix can be likened to a PCA of a covariance matrix. The analyses of the indicator matrix $Z$ and the Burt matrix $B$ give equivalent standard coordinates of the category points, but slightly different scalings in the principal coordinates since the principal inertias of $B$ are the squares of those of $Z$. A variant of MCA called joint correspondence analysis (JCA) avoids the fitting of the tables on the diagonal of the Burt matrix, which is analogous to least-squares factor analysis.
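The relation between the indicator matrix and the Burt matrix can be illustrated with a small sketch. The response data below are entirely hypothetical (eight respondents, two items with two and three categories), and the helper `ca_singular_values` is a name introduced here. The check at the end reflects the statement that the principal inertias of B are the squares of those of Z.

```python
import numpy as np

def ca_singular_values(N):
    """Singular values of the standardized residual matrix of a table N."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    E = np.outer(r, c)
    return np.linalg.svd((P - E) / np.sqrt(E), compute_uv=False)

# Hypothetical answers of 8 respondents to 2 items (category codes start at 0).
responses = np.array([[0, 0], [0, 1], [1, 2], [1, 1],
                      [0, 2], [1, 0], [0, 0], [1, 2]])
n_categories = [2, 3]

# Indicator matrix Z: one dummy column per category of each item.
Z = np.hstack([np.eye(k)[responses[:, q]] for q, k in enumerate(n_categories)])

# Burt matrix: all two-way cross-tabulations, including each item with itself.
B = Z.T @ Z

inertias_Z = ca_singular_values(Z) ** 2   # principal inertias of Z
inertias_B = ca_singular_values(B) ** 2   # principal inertias of B
```

The diagonal blocks of B are diagonal matrices holding the category frequencies; these are the sub-tables that JCA leaves unfitted.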
As far as other types of data are concerned, namely rankings, ratings, paired comparisons, ratio-scale, and interval-scale measurements, the key idea is to recode the data in a form which justifies the basic constructs of CA, namely profile, mass, and chi-squared distance. For example, in the analysis of rankings, or preferences, applying the CA algorithm to the original rankings of a set of objects by a sample of subjects is difficult to justify, because there is no reason why weight should be accorded to an object in proportion to its average ranking. A practice called doubling resolves the issue by adding either an ‘anti-object’ for each ranked object or an ‘anti-subject’ for each responding subject, in both cases with rankings in the reverse order. This addition of apparently redundant data leads to CA effectively performing different variants of principal components analysis on the original rankings.

A recent finding by Carroll et al. (1997) is that CA can be applied to a square symmetric matrix of squared distances, transformed by subtracting each squared distance from a constant which is substantially larger than the largest squared distance in the table. This yields a solution which approximates the classical scaling solution of the distance matrix.

All these extensions of CA conform closely to Benzécri’s original conception of CA as a universal technique for exploring many different types of data through operations such as doubling or other judicious transformations of the data. The latest developments on the subject, including discussions of sampling properties of CA solutions and a comprehensive reference list, may be found in the volumes edited by Greenacre and Blasius (1994) and Blasius and Greenacre (1998).

See also: Factor Analysis and Latent Structure: Overview; Multivariate Analysis: Discrete Variables (Correspondence Models); Multivariate Analysis: Discrete Variables (Overview); Scaling: Multidimensional
Bibliography
Benzécri J-P 1973 L’Analyse des Données. Vol. I: La Classification, Vol. II: L’Analyse des Correspondances. Dunod, Paris
Blasius J, Greenacre M J 1998 Visualization of Categorical Data. Academic Press, San Diego, CA
Carroll J D, Kumbasar E, Romney A K 1997 An equivalence relation between correspondence analysis and classical metric multidimensional scaling for the recovery of Euclidean distances. British Journal of Mathematical and Statistical Psychology 50: 81–92
Gifi A 1990 Nonlinear Multivariate Analysis. Wiley, Chichester, UK
Greenacre M J 1984 Theory and Applications of Correspondence Analysis. Academic Press, London
Greenacre M J 1993 Correspondence Analysis in Practice. Academic Press, London
Greenacre M J, Blasius J 1994 Correspondence Analysis in the Social Sciences. Academic Press, London
International Social Survey Program (ISSP) 1995 Survey on National Identity. Data set ZA 2880, Zentralarchiv für Empirische Sozialforschung. University of Cologne
Lebart L, Morineau A, Warwick K 1984 Multivariate Descriptive Statistical Analysis. Wiley, Chichester, UK
Nishisato S 1980 Analysis of Categorical Data: Dual Scaling and its Applications. University of Toronto Press, Toronto, Canada
M. Greenacre
Scaling: Multidimensional
The term ‘Multidimensional Scaling’ or MDS is used in two essentially different ways in statistics (de Leeuw and Heiser 1980a). MDS in the wide sense refers to any technique that produces a multidimensional geometric representation of data, where quantitative or qualitative relationships in the data are made to correspond with geometric relationships in the representation. MDS in the narrow sense starts with information about some form of dissimilarity between the elements of a set of objects, and it constructs its geometric representation from this information. Thus the data are ‘dissimilarities,’ which are distance-like quantities (or similarities, which are inversely related to distances). This entry concentrates only on narrow-sense MDS, because otherwise the definition of the technique is so diluted as to include almost all of multivariate analysis.

MDS is a descriptive technique, in which the notion of statistical inference is almost completely absent. There have been some attempts to introduce statistical models and corresponding estimating and testing methods, but they have been largely unsuccessful. I introduce some quick notation. Dissimilarities are written as $\delta_{ij}$, and distances are $d_{ij}(X)$. Here $i$ and $j$ are the objects of interest. The $n \times p$ matrix $X$ is the configuration, with coordinates of the objects in $\mathbb{R}^p$. Often, data weights $w_{ij}$ are also available, reflecting the importance or precision of dissimilarity $\delta_{ij}$.
1. Sources of Distance Data
Dissimilarity information about a set of objects can arise in many different ways. This article reviews some of the more important ones, organized by scientific discipline.

1.1 Geodesy
The most obvious application, perhaps, is in sciences in which distance is measured directly, although generally with error. This happens, for instance, in triangulation in geodesy, in which measurements are made which are approximately equal to distances, either Euclidean or spherical, depending on the scale of the experiment. In other examples, measured distances are less directly related to physical distances. For example, one could measure airplane, road, or train travel distances between different cities. Physical distance is usually not the only factor determining these types of dissimilarities.

1.2 Geography/Economics
In economic geography, or spatial economics, there are many examples of input–output tables, where the table indicates some type of interaction between a number of regions or countries. For instance, the data may have $n$ countries, where entry $f_{ij}$ indicates the number of tourists traveling, or the amount of grain exported, from $i$ to $j$. It is not difficult to think of many other examples of these square (but generally asymmetric) tables. Again, physical distance may be a contributing factor to these dissimilarities, but certainly not the only one.

1.3 Genetics/Systematics
A very early application of a scaling technique was Fisher (1922). He used crossing-over frequencies from a number of loci to construct a (one-dimensional) map of part of the chromosome. Another early application of MDS ideas is in Boyden (1931), where reactions to sera are used to give similarities between common mammals, and these similarities are then mapped into three-dimensional space. In much of systematic zoology, distances between species or individuals are actually computed from a matrix of measurements on a number of variables describing the individuals. There are many measures of similarity or distance which have been used, not all of them having the usual metric properties. The derived dissimilarity or similarity matrix is analyzed by MDS, or by cluster analysis, because systematic zoologists show an obvious preference for tree representations over continuous representations in $\mathbb{R}^p$.

1.4 Psychology/Phonetics
MDS, as a set of data analysis techniques, clearly originates in psychology. There is a review of the early history, which starts with Carl Stumpf around 1880, in de Leeuw and Heiser (1980a). Developments in psychophysics concentrated on specifying the shape of the function relating dissimilarities and distances, until Shepard (1962) made the radical proposal to let the data determine this shape, requiring this function only to be increasing. In psychophysics, one of the basic forms in which data are gathered is the ‘confusion matrix.’ Such a matrix records how many times row-stimulus $i$ was identified as column-stimulus $j$. A classical example is the Morse code signals studied by Rothkopf (1957). Confusion matrices are not unlike the input–output matrices of economics. In psychology (and marketing) researchers also collect direct similarity judgments in various forms to map cognitive domains. Ekman’s color similarity data is one of the prime examples (Ekman 1963), but many measures of similarity (rankings, ratings, ratio estimates) have been used.
1.5 Psychology/Political Science/Choice Theory
Another source of distance information is ‘preference data.’ If a number of individuals indicate their preferences for a number of objects, then many choice models use geometrical representations in which an individual prefers the object she is closer to. This leads to ordinal information about the distances between the individuals and the objects, e.g., between the politicians and the issues they vote for, or between the customers and the products they buy.

1.6 Biochemistry
Fairly recently, MDS has been applied in the conformation of molecular structures from nuclear resonance data. The pioneering work is Crippen (1977), and a more recent monograph is Crippen and Havel (1988). Recently, this work has become more important because MDS techniques are used to determine protein structure. Numerical analysts and mathematical programmers have been involved, and as a consequence there have been many new and exciting developments in MDS.

2. An Example
Section 1 shows that it will be difficult to find an example that illustrates all aspects of MDS. We select one that can be used in quite a few of the techniques discussed. It is taken from Coombs (1964, p. 464). The data are cross-references between ten psychological journals. The journals are given in Table 1. The actual data are in Table 2. The basic idea, of course, is that journals with many cross-references are similar.

Table 1 Ten psychology journals

| | Journal | Label |
|---|---|---|
| A | American Journal of Psychology | AJP |
| B | Journal of Abnormal and Social Psychology | JASP |
| C | Journal of Applied Psychology | JAP |
| D | Journal of Comparative and Physiological Psychology | JCPP |
| E | Journal of Consulting Psychology | JCP |
| F | Journal of Educational Psychology | JEP |
| G | Journal of Experimental Psychology | JExP |
| H | Psychological Bulletin | PB |
| I | Psychological Review | PR |
| J | Psychometrika | Pka |

Table 2 References in row-journal to column-journal

| | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| A | 122 | 4 | 1 | 23 | 4 | 2 | 135 | 17 | 39 | 1 |
| B | 23 | 303 | 9 | 11 | 49 | 4 | 55 | 50 | 48 | 7 |
| C | 0 | 28 | 84 | 2 | 11 | 6 | 15 | 23 | 8 | 13 |
| D | 36 | 10 | 4 | 304 | 0 | 0 | 98 | 21 | 65 | 4 |
| E | 6 | 93 | 11 | 1 | 186 | 6 | 7 | 30 | 10 | 14 |
| F | 6 | 12 | 11 | 1 | 7 | 34 | 24 | 16 | 7 | 14 |
| G | 65 | 15 | 3 | 33 | 3 | 3 | 337 | 40 | 59 | 14 |
| H | 47 | 108 | 16 | 81 | 130 | 14 | 193 | 52 | 31 | 12 |
| I | 22 | 40 | 2 | 29 | 8 | 1 | 97 | 39 | 107 | 13 |
| J | 2 | 0 | 2 | 0 | 0 | 1 | 6 | 14 | 5 | 59 |

3. Types of MDS
There are two different forms of MDS, depending on how much information is available about the distances. In some of the applications reviewed in Sect. 1 the dissimilarities are known numbers, equal to distances, except perhaps for measurement error. In other cases only the rank order of the dissimilarities is known, or only a subset of them is known.

3.1 Metric Scaling
In metric scaling the dissimilarities between all objects are known numbers, and they are approximated by distances. Thus objects are mapped into a metric space, distances are computed, and compared with the dissimilarities. Then objects are moved in such a way that the fit becomes better, until some loss function is minimized.

In geodesy and molecular genetics this is a reasonable procedure because dissimilarities correspond rather directly with distances. In analyzing input–output tables, however, or confusion matrices, such tables are often clearly asymmetric and not likely to be directly translatable into distances. Such cases often require a model to correct for asymmetry and scale. The most common class of models (for counts in a square table) is $E(f_{ij}) = \alpha_i \beta_j \exp\{-\phi(d_{ij}(X))\}$, where $\phi$ is some monotone transformation through the origin. For $\phi$ equal to the identity this is known as the choice model for recognition experiments in mathematical psychology (Luce 1963), and as a variation of the quasi-symmetry model in statistics (Haberman 1974). The negative exponential of the distance function was also used by Shepard (1957) in his early theory of recognition experiments.

As noted in Sect. 1.3, in systematic zoology and ecology, the basic data matrix is often a matrix in which $n$ objects are measured on $p$ variables. The first step in the analysis is to convert this into an $n \times n$ matrix of similarities or dissimilarities. Which measure of (dis)similarity is chosen depends on the types of variables in the problem. If they are numerical, Euclidean distances or Mahalanobis distances can be used, but if they are binary other dissimilarity measures come to mind (Gower and Legendre 1986). In any case, the result is a matrix which can be used as input in a metric MDS procedure.

3.2 Nonmetric Scaling
In various situations, in particular in psychology, only the rank order of the dissimilarities is known. This is either because only ordinal information is collected (for instance by using paired or triadic comparisons) or because, while the assumption is natural that the function relating dissimilarities and distances is monotonic, the choice of a specific functional form is not.

There are other cases in which there is incomplete information. For example, observations may only be available on a subset of the distances, either by design or by certain natural restrictions on what is observable. Such cases lead to a distance completion problem, where the configuration is constructed from a subset of the distances, and at the same time the other (missing) distances are estimated. Such distance completion problems (assuming that the observed distances are measured without error) are currently solved with mathematical programming methods (Alfakih et al. 1998).
3.3 Three-way Scaling
In ‘three-way scaling’ information is available on dissimilarities between $n$ objects on $m$ occasions, or for $m$ subjects. Two easy ways of dealing with the occasions are either to perform a separate MDS for each subject or to perform a single MDS for the average occasion. Three-way MDS constitutes a strategy between these two extremes. This technique requires computation of $m$ MDS solutions, but they are required to be related to each other. For instance, one can impose the restriction that the configurations are the same, but the transformations relating dissimilarities and distances are different. Or one could require that the projections on the dimensions are linearly related to each other in the sense that $d_{ij}(X_k) = d_{ij}(XW_k)$, where $W_k$ is a diagonal matrix characterizing occasion $k$. A very readable introduction to three-way scaling is Arabie et al. (1987).

3.4 Unfolding
In ‘multidimensional unfolding,’ information is only available about off-diagonal dissimilarities, either metric or nonmetric. This means dealing with two different sets of objects, for instance individuals and stimuli or members of congress and political issues, and with dissimilarities between members of the first set and members of the second set, but not with the within-set dissimilarities. This typically happens with preference and choice data, in which it is known how individuals like candies, or how candidates like issues, but not how the individuals like other individuals, and so on.

In many cases, the information in unfolding is also only ordinal. Moreover, it is ‘conditional,’ which means that while it is known that a politician prefers one issue over another, it is not known if a politician’s preference for an issue is stronger than another politician’s preference for another issue. Thus the ordinal information is only within rows of the off-diagonal matrix. This makes unfolding data, especially nonmetric unfolding data, extremely sparse.

3.5 Restricted MDS
In many cases it makes sense to impose restrictions on the representation of the objects in MDS. The design of a study may be such that the objects are naturally on a rectangular grid, for instance, or on a circle or ellipse. Often, incorporating such prior information leads to a more readily interpretable and more stable MDS solution. As noted in Sect. 3.3, some of the more common applications of restricted MDS are to three-way scaling.

4. Existence Theorem
The basic existence theorem in Euclidean MDS, in matrix form, is due to Schoenberg (1935). A more modern version was presented in the book by Torgerson (1958). I give a simple version here. Suppose $E$ is a nonnegative, hollow, symmetric matrix of order $n$, and suppose $J_n = I_n - n^{-1} e_n e_n'$ is the ‘centering’ operator. Here $I_n$ is the identity, and $e_n$ is a vector with all elements equal to one. Then $E$ is a matrix of squared Euclidean distances between $n$ points in $\mathbb{R}^p$ if and only if $-\tfrac{1}{2} J_n E J_n$ is positive semi-definite of rank less than or equal to $p$.

This theorem has been extended to the classical non-Euclidean geometries, for instance by Blumenthal (1953). It can also be used to show that any nonnegative, hollow, symmetric $E$ can be embedded ‘nonmetrically’ in $n - 2$ dimensions.
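The condition in the existence theorem is easy to check numerically. The sketch below is illustrative only: the function name `is_squared_euclidean`, the tolerance handling, and the two small test matrices are assumptions made here. It double-centers a candidate matrix of squared distances and inspects the eigenvalues of the result.

```python
import numpy as np

def is_squared_euclidean(E, p, tol=1e-9):
    """Check whether -1/2 * J E J is positive semi-definite of rank <= p."""
    E = np.asarray(E, dtype=float)
    n = E.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering operator
    B = -0.5 * J @ E @ J
    eigvals = np.linalg.eigvalsh((B + B.T) / 2)  # symmetrize against rounding
    scale = max(1.0, np.abs(eigvals).max())
    is_psd = eigvals.min() >= -tol * scale
    rank = int((eigvals > tol * scale).sum())
    return bool(is_psd and rank <= p)

# Squared distances between four points in the plane (not collinear).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
diff = X[:, None, :] - X[None, :, :]
E_ok = (diff ** 2).sum(axis=2)

# Distances 1, 1, 3 violate the triangle inequality (squared: 1, 1, 9).
E_bad = np.array([[0.0, 1.0, 9.0],
                  [1.0, 0.0, 1.0],
                  [9.0, 1.0, 0.0]])

embeds_in_plane = is_squared_euclidean(E_ok, p=2)   # points came from the plane
embeds_on_line = is_squared_euclidean(E_ok, p=1)    # rank is 2, so not on a line
bad_embeds = is_squared_euclidean(E_bad, p=2)       # centered matrix is not PSD
```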
5. Loss Functions

5.1 Least Squares on the Distances
The most straightforward loss function to measure fit between dissimilarities and distances is STRESS, defined by

$$\mathrm{STRESS}(X) \stackrel{\Delta}{=} \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(\delta_{ij} - d_{ij}(X)\right)^2. \tag{1}$$

Obviously this formulation applies to metric scaling only. In the case of nonmetric scaling, the major breakthrough in a proper mathematical formulation of the problem was Kruskal (1964). For this case, STRESS is defined as

$$\mathrm{STRESS}(X, \hat{D}) \stackrel{\Delta}{=} \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(\hat{d}_{ij} - d_{ij}(X)\right)^2}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(d_{ij}(X) - \bar{d}_{ij}(X)\right)^2}, \tag{2}$$

and this function is minimized over both $X$ and $\hat{D}$, where $\hat{D}$ satisfies the constraints imposed by the data. In nonmetric MDS the $\hat{D}$ are called disparities, and are required to be monotonic with the dissimilarities. Finding the optimal $\hat{D}$ is an ‘isotonic regression problem.’ In the case of distance completion problems (with or without measurement error), the $\hat{d}_{ij}$ must be equal to the observed distances if these are observed, and they are otherwise free.

One particular property of the STRESS loss function is that it is not differentiable for configurations in which two points coincide (and a distance is zero). It is shown by de Leeuw (1984) that at a local minimum of STRESS, pairs of points with positive dissimilarities cannot coincide.

5.2 Least Squares on the Squared Distances
A second loss function, which has been used a great deal, is SSTRESS, defined by

$$\mathrm{SSTRESS}(X) \stackrel{\Delta}{=} \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(\delta_{ij}^2 - d_{ij}^2(X)\right)^2. \tag{3}$$

Clearly, this loss function is a (fourth-order) multivariate polynomial in the coordinates. There are no problems with smoothness, but often a large number of local optima results. Of course a nonmetric version of the SSTRESS problem can be confronted, using the same type of approach used for STRESS.
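The two metric loss functions can be written down directly from their definitions. The following is a minimal sketch in plain Python; the function names and the default of unit weights when `w` is omitted are assumptions made for illustration.

```python
import math

def distances(X):
    """Euclidean distance matrix of a configuration given as a list of points."""
    n = len(X)
    return [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]

def stress(X, delta, w=None):
    """Metric STRESS: weighted squared residuals between dissimilarities and distances."""
    D = distances(X)
    n = len(X)
    return sum(
        (1.0 if w is None else w[i][j]) * (delta[i][j] - D[i][j]) ** 2
        for i in range(n) for j in range(n)
    )

def sstress(X, delta, w=None):
    """SSTRESS: the same residuals, computed on squared dissimilarities and distances."""
    D = distances(X)
    n = len(X)
    return sum(
        (1.0 if w is None else w[i][j]) * (delta[i][j] ** 2 - D[i][j] ** 2) ** 2
        for i in range(n) for j in range(n)
    )

# Two points at distance 5, compared with a dissimilarity of 4.
X = [(0.0, 0.0), (3.0, 4.0)]
delta = [[0.0, 4.0], [4.0, 0.0]]
print(stress(X, delta), sstress(X, delta))
```

Both sums run over all ordered pairs (i, j), as in the double summation of the definitions, so each unordered pair is counted twice.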
5.3 Least Squares on the Inner Products
The existence theorem discussed above suggests a third way to measure loss. Now the function is known as STRAIN, and it is defined, in matrix notation, as

$$\mathrm{STRAIN}(X) \stackrel{\Delta}{=} \mathrm{tr}\left\{ J\left(\Delta^{(2)} - D^{(2)}(X)\right) J \left(\Delta^{(2)} - D^{(2)}(X)\right) \right\}, \tag{4}$$

where $D^{(2)}(X)$ and $\Delta^{(2)}$ are the matrices of squared distances and dissimilarities, and where $J$ is the centering operator. Since $J D^{(2)}(X) J = -2XX'$ (for a configuration $X$ with centered columns), this means that $-\frac{1}{2} J \Delta^{(2)} J$ is approximated by a positive semi-definite matrix of rank $r$, which is a standard eigenvalue–eigenvector computation. Again, nonmetric versions of minimizing STRAIN are straightforward to formulate (although less straightforward to implement).
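The eigenvalue–eigenvector computation that minimizes STRAIN can be sketched as follows, assuming NumPy. The names `classical_mds` and `strain` are illustrative, and clipping negative eigenvalues to zero is the usual way of enforcing positive semi-definiteness rather than something the text prescribes.

```python
import numpy as np

def classical_mds(delta, r=2):
    """Rank-r positive semi-definite approximation of -1/2 J Delta^(2) J."""
    delta = np.asarray(delta, dtype=float)
    n = delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (delta ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh((B + B.T) / 2)  # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:r]               # the r dominant ones
    lam = np.clip(eigvals[idx], 0.0, None)            # discard negative parts
    return eigvecs[:, idx] * np.sqrt(lam)

def strain(X, delta):
    """STRAIN as tr{ J (Delta^(2) - D^(2)(X)) J (Delta^(2) - D^(2)(X)) }."""
    delta = np.asarray(delta, dtype=float)
    n = delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    R = delta ** 2 - D2
    return float(np.trace(J @ R @ J @ R))

# Exact planar distances are recovered up to rotation and translation.
rng = np.random.default_rng(1)
X_true = rng.normal(size=(5, 2))
delta = np.sqrt(((X_true[:, None, :] - X_true[None, :, :]) ** 2).sum(axis=2))
X_hat = classical_mds(delta, r=2)
d_hat = np.sqrt(((X_hat[:, None, :] - X_hat[None, :, :]) ** 2).sum(axis=2))
print(np.allclose(d_hat, delta), strain(X_hat, delta))
```

For error-free Euclidean dissimilarities the recovered configuration reproduces the distances and STRAIN is zero up to round-off; with noisy data the same code returns the best rank-r fit in the STRAIN sense.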
Table 3 Transformed journal reference data

0.00 2.93 4.77 1.89 3.33 2.78 0.77 1.02 1.35 3.79
2.93 0.00 2.28 3.32 1.25 2.61 2.39 0.53 1.41 4.24
4.77 2.28 0.00 3.87 2.39 1.83 3.13 1.22 3.03 2.50
1.89 3.32 3.87 0.00 5.62 4.77 1.72 1.11 1.41 4.50
3.33 1.25 2.39 5.62 0.00 2.44 3.89 0.45 2.71 3.67
2.78 2.61 1.83 4.77 2.44 0.00 2.46 1.01 2.90 2.27
0.77 2.39 3.13 1.72 3.89 2.46 0.00 0.41 0.92 2.68
1.02 0.53 1.22 1.11 0.45 1.01 0.41 0.00 0.76 1.42
1.35 1.41 3.03 1.41 2.71 2.90 0.92 0.76 0.00 2.23
3.79 4.24 2.50 4.50 3.67 2.27 2.68 1.42 2.23 0.00
Figure 1 Metric analysis (STRAIN left, STRESS right)

Figure 2 Nonmetric analysis (transformation left, solution right)

6. Algorithms

6.1 STRESS
The original algorithms (Kruskal 1964) for minimizing STRESS use gradient methods with elaborate step-size procedures. In de Leeuw (1977) the ‘majorization method’ was introduced. It leads to a globally convergent algorithm with a linear convergence rate, which is not bothered by the nonexistence of derivatives at places where points coincide. The majorization method can be seen as a gradient method with a constant step-size, which uses convex analysis methods to prove convergence. More recently, faster linearly or superlinearly convergent methods have been tried successfully (Glunt et al. 1993, Kearsley et al. 1998). One of the key advantages of the majorization method is that it extends easily to restricted MDS problems (de Leeuw and Heiser 1980b). Each subproblem in the sequence is a least squares projection problem on the set of configurations satisfying the constraints, which is usually easy to solve.

6.2 SSTRESS
Algorithms for minimizing SSTRESS were developed initially by Takane et al. (1977). They applied cyclic coordinate descent, i.e., one coordinate was changed at a time, and cycles through the coordinates were alternated with isotonic regressions in the nonmetric case. More efficient alternating least squares algorithms were developed later by de Leeuw, Takane, and Browne (cf. Browne (1987)), and superlinear and quadratic methods were proposed by Glunt and Liu (1991) and Kearsley et al. (1998).

6.3 STRAIN
Minimizing STRAIN was, and is, the preferred algorithm in metric MDS. It is also used as the starting point in iterative nonmetric algorithms. Recently, more general algorithms for minimizing STRAIN in nonmetric and distance completion scaling have been proposed by Trosset (1998a, 1998b).
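The majorization method for metric STRESS described in Section 6.1 can be sketched with the update commonly known as the Guttman transform. This is a minimal sketch assuming NumPy and unit weights (all $w_{ij} = 1$); the function names, the random starting configuration, and the fixed iteration count are illustrative assumptions, not details taken from the article.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix of the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def raw_stress(X, delta):
    """Metric STRESS with unit weights, summed over all ordered pairs."""
    return float(((delta - pairwise_distances(X)) ** 2).sum())

def smacof(delta, X0, n_iter=100):
    """Iterate the Guttman transform X <- (1/n) B(X) X for unit weights."""
    delta = np.asarray(delta, dtype=float)
    X = np.asarray(X0, dtype=float).copy()
    n = delta.shape[0]
    history = [raw_stress(X, delta)]
    for _ in range(n_iter):
        D = pairwise_distances(X)
        with np.errstate(divide="ignore", invalid="ignore"):
            # Coinciding points (zero distance) contribute zero, so no
            # derivative is needed where STRESS is not differentiable.
            ratio = np.where(D > 0, delta / D, 0.0)
        B = -ratio
        B[np.diag_indices(n)] = -B.sum(axis=1)  # rows of B sum to zero
        X = B @ X / n
        history.append(raw_stress(X, delta))
    return X, history

rng = np.random.default_rng(0)
X_true = rng.normal(size=(8, 2))
delta = pairwise_distances(X_true)
X0 = rng.normal(size=(8, 2))
X_fit, history = smacof(delta, X0, n_iter=50)
print(history[0], history[-1])
```

The recorded STRESS values never increase from one iteration to the next, which is the defining property of a majorization algorithm; the constant step makes each iteration a single matrix product.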
7. Analysis of the Example

7.1 Initial Transformation
In the journal reference example, suppose $E(f_{ij}) = \alpha_i \beta_j \exp\{-\phi(d_{ij}(X))\}$. In principle this model can be tested by contingency table techniques. Instead the model is used to transform the frequencies to estimated distances, yielding

$$-\log \sqrt{\frac{\tilde{f}_{ij}\,\tilde{f}_{ji}}{\tilde{f}_{ii}\,\tilde{f}_{jj}}} \approx \phi(d_{ij}(X)),$$

where $\tilde{f}_{ij} = f_{ij} + \frac{1}{2}$. This transformed matrix is given in Table 3.
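The initial transformation can be sketched in a few lines of plain Python. The function name is illustrative and the frequency table `F` below is hypothetical, invented only to exercise the formula; it is not the journal citation data analyzed in the article.

```python
import math

def frequencies_to_dissimilarities(F):
    """Apply -log sqrt(f~_ij f~_ji / (f~_ii f~_jj)) with f~_ij = f_ij + 1/2."""
    n = len(F)
    Ft = [[F[i][j] + 0.5 for j in range(n)] for i in range(n)]
    return [
        [
            -math.log(math.sqrt(Ft[i][j] * Ft[j][i] / (Ft[i][i] * Ft[j][j])))
            for j in range(n)
        ]
        for i in range(n)
    ]

# Hypothetical 3 x 3 table of citation counts (rows cite columns).
F = [[50, 4, 1],
     [6, 40, 3],
     [2, 5, 30]]
T = frequencies_to_dissimilarities(F)
for row in T:
    print(["%.2f" % abs(v) for v in row])
```

The row and column effects $\alpha_i$ and $\beta_j$ cancel in the ratio, so the result is symmetric with a zero diagonal even when the frequency table itself is asymmetric.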
7.2 Metric Analysis
In the first analysis, suppose the numbers in Table 3 are approximate distances, i.e., suppose that φ is the identity. Then STRAIN is minimized, using metric MDS by calculating the dominant eigenvalues and corresponding eigenvectors of the doubly-centered squared distance matrix. This results in the following two-dimensional configurations. The second analysis iteratively minimizes metric STRESS, using the majorization algorithm. The solutions are given in Fig. 1. Both figures show the same grouping of journals, with Pka as an outlier, the journals central to the discipline, such as AJP, JExP, PB, and PR, in the middle, and more specialized journals generally in the periphery. For comparison purposes the STRESS of the first solution is 0.0687, that of the second solution is 0.0539. Finding the second solution takes about 30 iterations.
7.3 Nonmetric STRESS Analysis
Next, nonmetric STRESS is minimized on the same data (using only their rank order). The solution is in Fig. 2. The left panel displays the transformation relating the data in Table 3 to the optimally transformed data, a monotone step function. Again, basically the same configuration of journals, with the same groupings, emerges. The nonmetric solution has a (normalized) STRESS of 0.0195, and again finding it takes about 30 iterations of the majorization method. The optimal transformation does not seem to deviate systematically from linearity.
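The monotone step function arises because the disparities are obtained by isotonic regression of the current distances on the rank order of the dissimilarities. The following is a minimal sketch using the pool-adjacent-violators algorithm in plain Python; the function names are illustrative, and ties among dissimilarities are simply processed in sorted order, which is one of several possible conventions.

```python
def pava(y, w=None):
    """Weighted least squares nondecreasing fit to the sequence y."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block holds [mean, total weight, number of items]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool adjacent blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit

def disparities(delta_pairs, dist_pairs):
    """Disparities: monotone regression of distances on dissimilarity order."""
    order = sorted(range(len(delta_pairs)), key=lambda k: delta_pairs[k])
    fitted = pava([dist_pairs[k] for k in order])
    dhat = [0.0] * len(order)
    for pos, k in enumerate(order):
        dhat[k] = fitted[pos]
    return dhat

print(pava([1.0, 3.0, 2.0, 4.0]))
print(disparities([3.0, 1.0, 2.0], [2.0, 1.0, 3.0]))
```

In a full nonmetric algorithm this step would alternate with an update of the configuration, and each pooled block corresponds to one flat step of the fitted transformation.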
8. Further Reading
Until recently, the classical MDS reference was the little book by Kruskal and Wish (1978). It is clearly written, but very elementary. A more elaborate practical introduction is by Coxon (1982), which has a useful companion volume (Davies and Coxon 1982) with many of the classical MDS papers. Some additional early intermediate-level books, written from the psychometric point of view, are Davison (1983) and Young (1987). More recently, more modern and advanced books have appeared. The most complete treatment is no doubt Borg and Groenen (1997), while Cox (1994) is another good introduction especially aimed at statisticians.
Bibliography
Alfakih A Y, Khandani A, Wolkowicz H 1998 Solving Euclidean distance matrix completion problems via semidefinite programming. Computational Optimization and Applications 12: 13–30
Arabie P, Carroll J D, DeSarbo W S 1987 Three-Way Scaling and Clustering. Sage, Newbury Park
Blumenthal L M 1953 Distance Geometry. Oxford University Press, Oxford, UK
Borg I, Groenen P 1997 Modern Multidimensional Scaling. Springer, New York
Boyden A 1931 Precipitin tests as a basis for a comparative phylogeny. Proceedings of the Society for Experimental Biology and Medicine 29: 955–7
Browne M 1987 The Young–Householder algorithm and the least squares multidimensional scaling of squared distances. Journal of Classification 4: 175–90
Coombs C H 1964 A Theory of Data. Wiley, New York
Cox T F 1994 Multidimensional Scaling. Chapman and Hall, New York
Coxon A P M 1982 The User’s Guide to Multidimensional Scaling: With Special Reference to the MDS(X) Library of Computer Programs. Heinemann, Exeter, NH
Crippen G M 1977 A novel approach to calculation of conformation: Distance geometry. Journal of Computational Physics 24: 96–107
Crippen G M, Havel T F 1988 Distance Geometry and Molecular Conformation. Wiley, New York
Davies P M, Coxon A P M 1982 Key Texts in Multidimensional Scaling. Heinemann, Exeter, NH
Davison M L 1983 Multidimensional Scaling. Wiley, New York
de Leeuw J 1977 Applications of convex analysis to multidimensional scaling. In: Barra J R, Brodeau F, Romier G, van Cutsem B (eds.) Recent Developments in Statistics: Proceedings of the European Meeting of Statisticians, Grenoble, 6–11 September, 1976. North Holland, Amsterdam, The Netherlands, pp. 133–45
de Leeuw J 1984 Differentiability of Kruskal’s stress at a local minimum. Psychometrika 49: 111–3
de Leeuw J, Heiser W 1980a Theory of multidimensional scaling. In: Krishnaiah P (ed.) Handbook of Statistics. North Holland, Amsterdam, The Netherlands, Vol. II
de Leeuw J, Heiser W J 1980b Multidimensional scaling with restrictions on the configuration. In: Krishnaiah P (ed.) Multivariate Analysis. North Holland, Amsterdam, The Netherlands, Vol. V, pp. 501–22
Ekman G 1963 Direct method for multidimensional ratio scaling. Psychometrika 23: 33–41
Fisher R A 1922 The systematic location of genes by means of cross-over ratios. American Naturalist 56: 406–11
Glunt W, Hayden T L, Liu W M 1991 The embedding problem for predistance matrices. Bulletin of Mathematical Biology 53: 769–96
Glunt W, Hayden T, Raydan M 1993 Molecular conformations from distance matrices. Journal of Computational Chemistry 14: 114–20
Gower J C, Legendre P 1986 Metric and Euclidean properties of dissimilarity coefficients. Journal of Classification 3: 5–48
Haberman S J 1974 The Analysis of Frequency Data. University of Chicago Press, Chicago
Kearsley A J, Tapia R A, Trosset M W 1998 The solution of the metric STRESS and SSTRESS problems in multidimensional scaling using Newton’s method. Computational Statistics 13: 369–96
Kruskal J B 1964 Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29: 1–27
Kruskal J B, Wish M 1978 Multidimensional Scaling. Sage, Beverly Hills, CA
Luce R 1963 Detection and recognition. In: Luce R D, Bush R R, Galanter E (eds.) Handbook of Mathematical Psychology. Wiley, New York, Vol. 1
Rothkopf E Z 1957 A measure of stimulus similarity and errors in some paired-associate learning tasks. Journal of Experimental Psychology 53: 94–101
Schoenberg I J 1935 Remarks on Maurice Fréchet’s article Sur la définition axiomatique d’une classe d’espaces distanciés vectoriellement applicable sur l’espace de Hilbert. Annals of Mathematics 724–32
Shepard R N 1957 Stimulus and response generalization: A stochastic model relating generalization to distance in psychological space. Psychometrika 22: 325–45
Shepard R N 1962 The analysis of proximities: Multidimensional scaling with an unknown distance function. Psychometrika 27: 125–40, 219–46
Takane Y, Young F W, de Leeuw J 1977 Nonmetric individual differences in multidimensional scaling: An alternating least squares method with optimal scaling features. Psychometrika 42: 7–67
Torgerson W S 1958 Theory and Methods of Scaling. Wiley, New York
Trosset M W 1998a Applications of multidimensional scaling to molecular conformation. Computing Science and Statistics 29: 148–52
Trosset M W 1998b A new formulation of the nonmetric strain problem in multidimensional scaling. Journal of Classification 15: 15–35
Young F W 1987 Multidimensional Scaling: History, Theory, and Applications. Erlbaum, Hillsdale, NJ
J. de Leeuw
Scandal: Political The word ‘scandal’ is used primarily to describe a sequence of actions and events which involve certain kinds of transgressions and which, when they become
known to others, are regarded as sufficiently serious to elicit a response of disapproval or condemnation. A scandal is necessarily a public event in the sense that, while the actions which lie at the heart of the scandal may have been carried out secretly or covertly, a scandal can arise only if these actions become known to others, or are strongly believed by others to have occurred. This is one respect in which scandal differs from related phenomena such as corruption and bribery; a scandal can be based on the disclosure of corruption or bribery, but corruption and bribery can exist (and often do exist) without being known about by others, and hence without becoming a scandal.
1. The Concept of Scandal
The concept of scandal is very old and the meaning has changed over time. In terms of its etymological origins, the word probably derives from the Indogermanic root skand-, meaning to spring or leap. Early Greek derivatives, such as the word skandalon, were used in a figurative way to signify a trap, an obstacle or a ‘cause of moral stumbling.’ The idea of a trap or an obstacle was an integral feature of the theological vision of the Old Testament. In the Septuagint (the Greek version of the Old Testament), the word skandalon was used to describe an obstacle, a stumbling block placed along the path of the believer, which could explain how a people linked to God might nevertheless begin to doubt Him and lose their way. The notion of a trap or obstacle became part of Judaism and early Christian thought, although it was gradually prised apart from the idea of a test of faith. With the development of the Latin word scandalum and its diffusion into Romance languages, the religious connotation was gradually attenuated and supplemented by other senses. The word ‘scandal’ first appeared in English in the sixteenth century; similar words appeared in Romance languages at roughly the same time. The early uses of ‘scandal’ in the sixteenth and seventeenth centuries were, broadly speaking, of two main types. First, ‘scandal’ was used in a religious context to refer to the conduct of a person which brought discredit to religion, or to something which hindered religious faith or belief. Second, ‘scandal’ and its cognates were also used in more secular contexts to describe actions or utterances which were regarded as scurrilous or abusive, which damaged an individual’s reputation, which were grossly discreditable, and/or which offended moral sentiments or the sense of decency.
It is these later, more secular senses which underlie the most common modern uses of the word ‘scandal.’ Although the word continues to have some use as a specialized religious term, ‘scandal’ is used mainly to refer to a broader form of moral transgression which is no longer linked specifically to religious codes. More precisely, ‘scandal’ could be defined as actions or events which have the following characteristics: their occurrence involves the transgression of certain values, norms or moral codes; their occurrence involves a degree of secrecy or concealment, but they are known or strongly believed to exist by individuals other than those directly involved; some individuals disapprove of the actions or events and may be offended by the transgression; some express their disapproval by publicly denouncing the actions or events; and the disclosure and condemnation of the actions or events may damage the reputation of the individuals responsible for them. While scandal necessarily involves some form of transgression, there is a great deal of cultural and historical variability in the kinds of values, norms, and moral codes which are relevant here. What counts as scandalous activity in one context, e.g., extramarital affairs among members of the political elite, may be regarded as acceptable (even normal) elsewhere. A particular scandal may also involve different types of transgression. A scandal may initially be based on the transgression of a moral code (e.g., concerning sexual relations), but as the scandal develops, the focus of attention may shift to a series of ‘second-order transgressions’ which stem from actions aimed at concealing the original offence. The attempt to cover up a transgression (a process that may involve deception, obstruction, false denials, and straightforward lies) may become more important than the original transgression itself, giving rise to an intensifying cycle of claim and counterclaim that dwarfs the original offence. Scandals can occur in different settings and milieux, from local communities to the arenas of national and even international politics. When scandals occur in settings which are more extended than local communities, they generally involve mediated forms of communication such as newspapers, magazines, and, increasingly, television.
The media play a crucial role in making public the actions or events which lie at the heart of scandal, either by reporting allegations or information obtained by others (such as the police or the courts) or by carrying out their own investigations. The media also become a principal forum in which disapproval of the actions or events is expressed. Mediated scandals are not simply scandals which are reported by the media; rather, they are ‘mediated events’ which are partly constituted by the activities of media organizations.
2. The Nature of Political Scandal
Scandals are common in many spheres of social life; not all scandals are political scandals. So what are the distinctive features of political scandals? One seemingly straightforward way of answering this question is to say that a political scandal is any scandal that involves a political leader or figure. But this is not a particularly helpful or illuminating answer. For an individual is a political leader or figure by virtue of a broader set of social relations and institutions which endow him or her with power. So if we wish to understand the nature of political scandal, we cannot focus on the individual alone. Another way of answering this question would be to focus not on the status of the individuals involved, but on the nature of the transgression. This is the approach taken by Markovits and Silverstein, two political scientists who have written perceptively about political scandal. According to Markovits and Silverstein (1988), the defining feature of political scandal is that it involves a ‘violation of due process.’ By ‘due process’ they mean the legally binding rules and procedures which govern the exercise of political power. Political scandals are scandals in which these rules and procedures are violated by those who exercise political power and who seek to increase their power at the expense of due process. Since due process is fully institutionalized only in the liberal democratic state, it follows, argue Markovits and Silverstein, that political scandals can occur only in liberal democracies. One strength of Markovits and Silverstein’s account is that it analyzes political scandal in relation to some of the most important institutional features of modern states. But the main shortcoming of this account is that it provides a rather narrow view of political scandal. It treats one dynamic (the pursuit of power at the expense of process) as the defining feature of political scandal. Hence any scandal that does not involve this particular dynamic is ipso facto nonpolitical. This means that a whole range of scandals, such as those based on sexual transgressions, would be ruled out as nonpolitical, even though they may involve senior political figures and may have far-reaching political consequences.
Markovits and Silverstein’s claim that political scandal can occur only in liberal democracies should also be viewed with some caution. It is undoubtedly the case that liberal democracies are particularly prone to political scandal, but this is due to a number of specific factors (such as the highly competitive nature of liberal democratic politics and the relative autonomy of the press) and it does not imply that political scandal is unique to this type of political organization. Political scandals can occur (and have occurred) in other types of political system, from the absolutist and constitutional monarchies of early modern Europe to the various forms of authoritarian regime which have existed in the twentieth century. But political scandals in these other types of political system are more likely to remain localized scandals and are less likely to spread beyond the relatively closed worlds of the political elite. An alternative way of conceptualizing political scandals is to regard them as scandals involving individuals or actions which are situated within a political field (Thompson 2000). It is the political field that constitutes the scandal as political; it provides the
context for the scandal and shapes its pattern of development. A field is a structured space of social positions whose properties are defined primarily by the relations between these positions and the resources attached to them. The political field can be defined as the field of action and interaction which bears on the acquisition and exercise of political power. Political scandals are scandals which occur within the political field and which have an impact on relations within this field. They may involve the violation of rules and procedures governing the exercise of political power, but they do not have to involve this; other kinds of transgression can also constitute political scandals. We can distinguish between three main types of political scandal, depending on the kinds of norms or codes which are transgressed. Sex scandals involve the transgression of norms or codes governing the conduct of sexual relations. In some contexts, sexual transgressions carry a significant social stigma and their disclosure may elicit varying degrees of disapproval by others. Financial scandals involve the infringement of rules governing the acquisition and allocation of economic resources; these include scandals involving bribery, kickbacks and other forms of corruption as well as scandals stemming from irregularities in the raising and deployment of campaign funds. Power scandals are based on the disclosure of activities which infringe the rules governing the acquisition or exercise of political power. They involve the unveiling of hidden forms of power, and actual or alleged abuses of power, which had hitherto been concealed beneath the public settings in which power is displayed and the publicly recognized procedures through which it is exercised.
3. The Rise of Political Scandal The origins of political scandal as a mediated event can be traced back to the pamphlet culture of the seventeenth and eighteenth centuries. During the period of the English Civil War, for instance, there was a proliferation of anti-Royalist pamphlets and newsbooks which were condemned as heretical, blasphemous, scurrilous, and ‘scandalous’ in character. Similarly, in France, a distinctive genre of subversive political literature had emerged by the early eighteenth century, comprising the libelles and the chroniques scandaleuses, which purported to recount the private lives of kings and courtiers and presented them in an unflattering light. However, in the late eighteenth and nineteenth centuries, the use of ‘scandal’ in relation to mediated forms of communication began to change, as the term was gradually prised apart from its close association with blasphemy and sedition and increasingly applied to a range of phenomena which displayed the characteristics we now associate with scandal. By the late nineteenth century, mediated scandal had become a relatively common feature of the
political landscape in countries such as Britain and the USA. In Britain there were a number of major scandals, many involving sexual transgressions of various kinds, which destroyed (or threatened to destroy) the careers of key political figures such as Sir Charles Dilke (a rising star of the Liberal Party whose career was irrevocably damaged by the events surrounding a divorce action in which he was named as co-respondent) and Charles Parnell (the charismatic leader of the Irish parliamentary party whose career was destroyed by revelations concerning his affair with Mrs. Katharine O’Shea). There were numerous political scandals in nineteenth-century America too, some involving actual or alleged sexual transgressions (such as the scandal surrounding Grover Cleveland who, it was said, had fathered an illegitimate child) and many involving corruption at municipal, state and federal levels of government. America in the Gilded Age witnessed a flourishing of financial scandals in the political field, and the period of Grant’s Presidency (1869–77) is regarded by many as one of the most corrupt in American history. While the nineteenth century was the birthplace of political scandal as a mediated event, the twentieth century was to become its true home. Once this distinctive type of event had been invented, it would become a recognizable genre that some would seek actively to produce while others would strive, with varying degrees of success, to avoid. The character and frequency of political scandals varied considerably from one national context to another, and depended on a range of specific social and political circumstances. In Britain and the USA, there were significant political scandals throughout the early decades of the twentieth century, but political scandals have become particularly prevalent in the period since the early 1960s. In Britain, the Profumo scandal of 1963 was a watershed. 
This was a classic sex scandal involving a senior government minister (John Profumo, Secretary of State for War) and an attractive young woman (Christine Keeler), but it also involved issues of national security and a series of second-order transgressions which proved fatal for Profumo’s career. In the USA, the decisive political scandal of the twentieth century was undoubtedly Watergate, a power scandal in which Nixon was eventually forced to resign as President in the face of his imminent impeachment. Many countries have developed their own distinctive political cultures of scandal which have been shaped by, among other things, their respective traditions of scandal, the activities of journalists, media organizations, and other agents in the political field, the deployment of new technologies of communication, and the changing political climate of the time. Political scandal has become a potent weapon in struggles between rival candidates and parties in the political field. As fundamental disagreements over matters of principle have become less pronounced, questions of character and trust have become increasingly central to political debate and scandal has assumed increasing significance as a ‘credibility test.’ In this context, the occurrence of scandal tends to have a cumulative effect: scandal breeds scandal, because each scandal exposes character failings and further sharpens the focus on the credibility and trustworthiness of political leaders. This is the context in which President Clinton found that his political career was nearly destroyed by scandal on more than one occasion. Like many presidential hopefuls in the past, Clinton campaigned on the promise to clean up politics after the sleaze of the Reagan administration. But he soon found that members of his own administration, and indeed he and his wife, were being investigated on grounds of possible financial wrongdoing. He also found that allegations and revelations concerning his private life would become highly public issues, threatening to derail his campaign in 1992 (with the Gennifer Flowers affair) and culminating in his impeachment and trial by the Senate following the disclosure of his affair with Monica Lewinsky. What led to Clinton’s impeachment was not the disclosure of the affair as such, but rather a series of second-order transgressions committed in relation to a sexual harassment case instituted by Paula Jones, in the context of which Clinton gave testimony under oath denying that he had had sexual relations with Monica Lewinsky, thereby laying himself open to the charge of perjury, among other things. Clinton’s trial in the Senate resulted in his acquittal, but his reputation was undoubtedly damaged by the scandal which overshadowed the second term of his presidency.

See also: Elites: Sociological Aspects; Mass Media: Introduction and Schools of Thought; Political Culture; Political Discourse; Political Trials; Public Opinion: Political Aspects
Bibliography
Allen L et al. 1990 Political Scandals and Causes Célèbres Since 1945: An International Reference Compendium n.d. Longman, Harlow, Essex, UK, Chicago, IL, USA: Published in the USA and Canada by St. James Press
Garment S 1992 Scandal: The Culture of Mistrust in American Politics. Doubleday, New York
King A 1986 Sex, money, and power. In: Hodder-Williams R, Ceaser J (eds.) Politics in Britain and the United States: Comparative Perspectives. Duke University Press, Durham, NC, pp. 173–222
Markovits A S, Silverstein M (eds.) 1988 The Politics of Scandal: Power and Process in Liberal Democracies. Holmes and Meier, New York
Schudson M 1992 Watergate in American Memory: How We Remember, Forget, and Reconstruct the Past. Basic Books, New York
Thompson J B 2000 Political Scandal: Power and Visibility in the Media Age. Polity, Cambridge, UK
J. B. Thompson
Schemas, Frames, and Scripts in Cognitive Psychology
The terms ‘schema,’ ‘frame,’ and ‘script’ are all used to refer to a generic mental representation of a concept, event, or activity. For example, people have a generic representation of a visit to the dentist’s office that includes a waiting room, an examining room, dental equipment, and a standard sequence of events in one’s experience at the dentist. Differences in how each of the three terms, schema, frame, and script, are used reflect the various disciplines that have contributed to an understanding of how such generic representations are used to facilitate ongoing information processing. This article will provide a brief family history of the intellectual ancestry of modern views of generic knowledge structures as well as an explanation of how such knowledge influences mental processes.
1. Early Development of the ‘Schema’ Concept
The term ‘schema’ was already in prominent use in the writings of psychologists and neurologists by the early part of the twentieth century (e.g., Bartlett 1932, Head 1920, see also Piaget’s Theory of Child Development). Though the idea that our behavior is guided by schemata (the traditional plural form of schema) was fairly widespread, the use of the term ‘schema’ was seldom well defined. Nevertheless, the various uses of the term were reminiscent of its use by the eighteenth century philosopher Immanuel Kant. In his influential treatise Critique of Pure Reason, first published in 1781, Kant wrestled with the question of whether all knowledge is dependent on sensory experience or whether there exists a priori knowledge, which presumably would consist of elementary concepts, such as space and time, necessary for making sense of the world. Kant believed that we depend both on our knowledge gained from sensory experience and a priori knowledge, and that we use a generic form of knowledge, schemata, to mediate between sensory-based and a priori knowledge. Unlike today, when the idea of a schema is most often invoked for large-scale structured knowledge, Kant used the term in reference to more basic concepts. In Critique of Pure Reason he wrote: ‘In truth, it is not images of objects, but schemata, which lie at the foundation of our pure sensuous conceptions. No image could ever be adequate to our conception of triangles in general … The schema of a triangle can exist nowhere else than in thought and it indicates a rule of the synthesis of the imagination in regard to pure figures in space.’ The key idea captured by Kant was that people need mental representations that are typical of a class of objects or events so that we can respond to the core similarities across different stimuli of the same class (see Natural Concepts, Psychology of; Mental Representations, Psychology of). The basic notion that our experience of the world is filtered through generic representations was also at the heart of Sir Frederic Bartlett’s use of the term ‘schema.’ Bartlett’s studies of memory, described in his classic book published in 1932, presaged the rise of schema-theoretic views of memory and text comprehension in the 1970s and 1980s. In his most influential series of studies, Bartlett asked people to read and retell folktales from unfamiliar cultures. Because they lacked the background knowledge that rendered the folktales coherent from the perspective of the culture of origin, the participants in Bartlett’s studies tended to conventionalize the stories by adding connectives and explanatory concepts that fit the reader’s own world view. The changes were often quite radical, particularly when the story was recalled after a substantial delay. For example, in some cases the reader gave the story a rather conventional moral consistent with more familiar folktales. Bartlett interpreted these data as indicating that: ‘Remembering is not the re-excitation of innumerable fixed, lifeless, and fragmentary traces’ (p. 213). Instead, he argued, memory for a particular stimulus is reconstructed based on both the stimulus itself and on a relevant schema.
By schema, Bartlett meant a ‘whole active mass of organised past reactions or experience.’ Thus, if a person’s general knowledge of folktales includes the idea that such tales end with an explicit moral, then in reconstructing an unfamiliar folktale the person is likely to elaborate the original stimulus by adding a moral that appears relevant (see Reconstructive Memory, Psychology of). Although Bartlett made a promising start on understanding memory for complex naturalistic stimuli, his work was largely ignored for the next 40 years, particularly in the United States where researchers focused on much simpler learning paradigms from a behavioral perspective. Then, interest in schemata returned as researchers in the newly developed fields of cognitive psychology and artificial intelligence attempted to study and to simulate discourse comprehension and memory (see Knowledge Activation in Text Comprehension and Problem Solving, Psychology of).
2. The Rediscovery of Schemata Interest in schemata as explanatory constructs returned to psychology as a number of memory researchers demonstrated that background knowledge can strongly influence the recall and recognition of text. For example, consider the following passage: ‘The procedure is actually quite simple. First, you arrange items into different groups. Of course, one pile may be sufficient, depending on how much there is to do. If you have to go somewhere else due to lack of facilities that is the next step; otherwise, you are pretty well set.’ This excerpt, from a passage used by Bransford and Johnson (1972), is very difficult to understand and recall unless the reader has an organizing schema for the information. So, readers who are supplied with the appropriate title, Washing Clothes, are able to recall the passage much more effectively than readers who are not given the title.
Although the early 1970s witnessed a great number of clear demonstrations that previous knowledge shapes comprehension and memory, progress on specifying exactly how background knowledge becomes integrated with new information continued to be hampered by the vagueness of the schema concept. Fortunately, at this same time, researchers in artificial intelligence (AI) identified the issue of background knowledge as central to simulations of text processing. Owing to the notorious inability of computers to deal with vague notions, the AI researchers had to be explicit in their accounts of how background knowledge is employed in comprehension. A remarkable cross-fertilization occurred between psychology and AI that encouraged the development and testing of hypotheses about the representation and use of schematic knowledge.
2.1 Schema Theory in AI Research: Frames and Scripts Although many researchers in the nascent field of AI research in the 1970s were unconcerned with potential connections to psychology, several pioneers in AI research noted the mutual relevance of the two fields early on.
In particular, Marvin Minsky and Roger Schank developed formalizations of schema-theoretic approaches that were both affected by, and greatly affected, research in cognitive psychology. One of the core concepts from AI that was quickly adopted by cognitive psychologists was the notion of a frame. Minsky (1975) proposed that our experiences of familiar events and situations are represented in a generic way as frames that contain slots. Each slot corresponds to some expected element of the situation. One of the ways that Minsky illustrated what he meant by a frame was in reference to a job application form. A frame, like an application, has certain general domains of information being sought and leaves blanks for specific information to be filled in. As another example, someone who walks into the office of a university professor could activate a frame for a generic professor’s office that would help orient the person to the situation and allow for easy identification
Figure 1. A child’s birthday party script. The script consists of a set of distinct scenes. The slots under each scene can be filled with default information, such as ‘hot dogs’ for party food. Note that the cake ceremony could be broken down further into scenes such as candle lighting, blowing out candles, etc. Thus, one script or schema can be embedded inside another.
of its elements. So, a person would expect to see a desk and chair, a bookshelf with lots of books, and, these days at least, a computer. An important part of Minsky’s frame idea is that the slots of a frame are filled with default values if no specific information from the context is provided (or noticed). Thus, even if the person didn’t actually see a computer in the professor’s office, he or she might later believe one was there because computer would be a default value that filled an equipment slot in the office frame (see Brewer and Treyens 1981 for relevant empirical evidence).
The idea that generic knowledge structures contain slots with default values was also a central part of Roger Schank’s work on scripts. Scripts are generic representations of common social situations such as going to the doctor, having dinner at a restaurant, or attending a party (Fig. 1). Teamed with a social psychologist, Robert Abelson, Schank developed computer simulations of understanding stories involving such social situations. The motivation for the development of the script concept was Schank’s observation that people must be able to make a large number of inferences quite automatically in order to comprehend discourse. This can be true even when the discourse consists of a relatively simple pair of sentences:
Q: So, would you like to get some dinner?
A: Oh, I just had some pizza a bit ago.
Even though the question is not explicitly answered, a reader of these two sentences readily understands that the dinner invitation was declined. Note that the interpretation of this exchange is more elaborate if the reader knows that it occurred between a young man and a young woman who met at some social event and
had a lengthy conversation. In this context, a script for asking someone on a date may be invoked and the answer is interpreted as a clear ‘brush off.’ Thus, a common feature of the AI work on frames and scripts in simulations of natural language understanding was that inferences play a key role. Furthermore, these researchers provided formal descriptions of generic knowledge structures that could be used to generate the inferences necessary to comprehension. Both AI researchers and cognitive psychologists of the time recognized that such knowledge structures captured the ideas implicit in Bartlett’s more vague notion of a schema. The stage was set to determine whether frames and scripts were useful descriptors for the knowledge that people actually use when processing information (see Inferences in Discourse, Psychology of; Memory for Text).
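The frame idea described above (slots that fall back on default values when nothing specific is noticed) maps naturally onto a small data structure. The following is only an illustrative sketch, not Minsky's formalism: the class name `Frame`, the slot names, and the default fillers are assumptions chosen to mirror the professor's office example.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A generic knowledge structure with named slots and default fillers."""
    name: str
    defaults: dict                                  # slot name -> default filler
    observed: dict = field(default_factory=dict)    # slot name -> filler actually noticed

    def observe(self, slot, value):
        """Record a filler that was actually noticed in the situation."""
        if slot not in self.defaults:
            raise KeyError(f"{self.name} frame has no slot {slot!r}")
        self.observed[slot] = value

    def recall(self, slot):
        """Return the noticed filler if there is one, otherwise the default."""
        return self.observed.get(slot, self.defaults[slot])


# Hypothetical slots and defaults for a generic professor's office.
office = Frame(
    name="professor's office",
    defaults={
        "seating": "desk and chair",
        "storage": "bookshelf",
        "equipment": "computer",
    },
)
office.observe("storage", "filing cabinet")

print(office.recall("storage"))     # filler that was actually noticed
print(office.recall("equipment"))   # default filler; nothing was ever observed
```

In this sketch, recalling the equipment slot yields "computer" even though no computer was observed, which is the kind of default-based memory intrusion the text attributes to frames.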
2.2 Empirical Support for Schema Theory If people’s comprehension and memory processes are based on retrieving generic knowledge structures and, as necessary, filling slots of those knowledge structures with default values, then several predictions follow. First, people with similar experiences of common situations and events will have similar schemata represented in memory. Second, as people retrieve such schemata during processing, they will make inferences that go beyond directly stated information. Third, in many cases, particularly when processing is guided by a script-like structure, people will have expectations about not only the content of a situation but also the sequence of events. Cognitive psychologists obtained empirical support for each of these predictions. For example, Bower et al. (1979) conducted a series of experiments on people’s generation of, and use of, scripts for common events like going to a nice restaurant, attending a lecture, or visiting the doctor. They found high agreement on the actions that take place in each situation and on the ordering. Moreover, there was good agreement on the major clusters of actions, or scenes, within each script. Consistent with the predictions of schema theory, when people read stories based on a script, their recall of the story tended to include actions from the script that were not actually mentioned in the story. Bower et al. (1979) also used some passages in which an action from a script was explicitly mentioned but out of its usual order in the script. In their recall of such a passage, people tended to recall the action of interest in its canonical position rather than in its actual position in the passage. Cognitive psychologists interested in discourse processing further extended the idea of a schema capturing a sequence of events by proposing that during story comprehension people make use of a story grammar. 
A story grammar is a schema that specifies the abstract organization of the parts of a story. So, at the highest level a story consists of a setting and one or more episodes. An episode consists of an event that presents a protagonist with an obstacle to be overcome. The protagonist then acts on a goal formulated to overcome the obstacle. That action then elicits reactions that may or may not accomplish the goal. Using such a scheme, each idea in a story can be mapped onto a hierarchical structure that represents the relationship among the ideas (e.g., Mandler 1984). We can use story grammars to predict people’s sentence reading times and recall because the higher an idea is in the hierarchy, the longer the reading time and the better the memory for that idea (see Narrative Comprehension, Psychology of).
Thus, the body of research that cognitive psychologists generated in the 1970s and early 1980s supported the notion that data structures like frames and scripts held promise for capturing interesting aspects of human information processing. Nevertheless, it soon became clear that modifications of the whole notion of a schema were necessary in order to account for people’s flexibility in comprehension and memory processing.
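The two script-based recall effects reported by Bower et al. (1979), unmentioned script actions intruding into recall and out-of-order actions drifting back to their canonical position, can be illustrated with a toy reconstruction routine. This is a deliberately simplified, deterministic sketch: the restaurant actions, the function name `reconstruct`, and the gap-filling rule are assumptions made for illustration, and human recall is of course probabilistic rather than all-or-none.

```python
# Hypothetical restaurant script: actions listed in their canonical order.
RESTAURANT_SCRIPT = [
    "enter", "be seated", "read menu", "order", "eat", "pay", "leave",
]


def reconstruct(mentioned, script, fill_gaps=True):
    """Toy model of script-guided recall of a story.

    Script actions mentioned in the story are returned in canonical order.
    With fill_gaps=True, unmentioned script actions lying between the first
    and last mentioned script action are added as intrusions.
    Story actions that are not part of the script are appended unchanged.
    """
    position = {action: i for i, action in enumerate(script)}
    in_script = sorted((a for a in mentioned if a in position), key=position.get)
    off_script = [a for a in mentioned if a not in position]
    if fill_gaps and in_script:
        first, last = position[in_script[0]], position[in_script[-1]]
        in_script = script[first:last + 1]
    return in_script + off_script


# A story that skips two script actions and mentions "pay" out of order.
story = ["enter", "pay", "order", "eat", "leave"]

print(reconstruct(story, RESTAURANT_SCRIPT, fill_gaps=False))  # reordering only
print(reconstruct(story, RESTAURANT_SCRIPT))                   # reordering plus intrusions
```

With gap filling switched off, the sketch only moves "pay" back to its canonical position; with it switched on, "be seated" and "read menu" appear in the output although the story never stated them.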
3. Schema Theory Revised: The New Synthesis Certainly, information in a text or the context of an actual event could serve as a cue to trigger a relevant script. But what about more novel situations in which no existing script is triggered? At best, human information processing can only sometimes be guided by large-scale schematic structures. Traditionally, schema theorists have tended to view schema application as distinct from the processes involved in building up schemata in the first place. So, people might use a relevant schema whenever possible, but when no relevant schema is triggered another set of processes is initiated that is responsible for constructing new schemata. More recently, however, an alternative view of schemata has been developed, again with contributions both from AI and cognitive psychology. This newer approach places schematic processing at one end of a continuum in which processing varies from data driven to conceptually driven. Data-driven (also known as bottom-up) processing means that the analysis of a stimulus is not appreciably influenced by prior knowledge. In the case of text comprehension, this would mean that the mental representation of a text would stick closely to information that was directly stated. Conceptually driven (also known as top-down) processing means that the analysis of a stimulus is guided by pre-existing knowledge, as, for example, described by traditional schema theory. In the case of text comprehension, this would mean that the reader makes many elaborative inferences that go beyond the information explicitly stated in the text. In many cases, processing will fall at an intermediate point
on the continuum so that there is a modest conceptually driven influence by pre-existing knowledge structures. This more recent view of schemata has been called schema assembly theory. The central idea of this view is that a schema, in the sense of a framework guiding our interpretation of a situation or event, is always constructed on the spot based on the current context. Situations vary in how much background knowledge is readily available, so sometimes a great deal of related information is accessed that can guide processing and at other times very little of such background knowledge is available. In the latter cases we process new information in a data-driven fashion.
This new view of schemata was synthesized from work in AI and cognitive psychology as researchers realized the need to incorporate a more dynamic view of the memory system into models of knowledge-based processing. For example, AI programs based on scripts performed poorly when confronted with the kinds of script variations and exceptions seen in the ‘real world.’ Accordingly, Schank (1982) began to develop systems in which appropriate script-like structures were built to fit the specific context rather than just retrieved as a precompiled structure. Therefore, different possible restaurant subscenes could be pulled together in an ad hoc fashion based on both background knowledge and current input. In short, Schank’s simulations became less conceptually driven in their operation. Likewise, cognitive psychologists, confronted with empirical data that were problematic for traditional schema theories, have moved to a more dynamic view. There were two main sets of empirical findings that motivated cognitive psychologists to make these changes. First, it became clear that people’s processing of discourse was not as conceptually driven as classic schema theories proposed.
In particular, making elaborative inferences that go beyond information directly stated in the text is more restricted than originally believed. People may be quite elaborative when reconstructing a memory that is filled with gaps, but when tested for inference making concurrent with sentence processing, people’s inference generation is modest. For example, according to the classic view of schemata, a passage about someone digging a hole to bury a box of jewels should elicit an inference about the instrument used in the digging. The default value for such an instrument would probably be a shovel. Yet, when people are tested for whether the concept of shovel is primed in memory by such a passage (for example, by examining whether the passage speeds up naming time for the word ‘shovel’), the data suggest that people do not spontaneously infer instruments for actions. Some kinds of inferences are made, such as connections between goals and actions or filling in some general terms with more concrete instances, but inference making is clearly more restricted than researchers believed in the early days after the rediscovery of schemata (e.g., Graesser and Bower 1990, Graesser et al. 1997).
Second, it is not always the case that what we remember from some episode or from a text is based largely on pre-existing knowledge. As Walter Kintsch has shown, people appear to form not one, but three memories as they read a text: a short-lived representation of the exact wording, a more durable representation that captures a paraphrase of the text, and a situation model that represents what was described by the text, rather than the text itself (e.g., Kintsch 1988). It is this third type of representation that is strongly influenced by available schemata. However, whether or not a situation model is formed during comprehension depends on not only the availability of relevant chunks of knowledge, but also the goals of the reader (see Memory for Meaning and Surface Memory; Situation Model: Psychological). Thus, background knowledge can affect processing of new information in the way described by schema theories, but our processing represents a more careful balance of conceptually driven and data-driven processes than claimed by traditional schema theories. Moreover, the relative contribution of background knowledge to understanding varies with the context, especially the type of text and the goal of the reader.
To accommodate more flexibility in the use of the knowledge base, one important trend in research on schemata has been to incorporate schemata into connectionist models. In connectionist models the representation of a concept is distributed over a network of units. The core idea is that concepts, which are really repeatable patterns of activity among a set of units, are associated with varying strengths. Some concepts are so often associated with other particular concepts that activating one automatically entails activating another.
However, in most cases the associations are somewhat weaker, so that activating one concept may make another more accessible if needed. Such a system can function in a more top-down manner or a more bottom-up manner depending on the circumstances. To rephrase this idea in terms of traditional schema-theoretic concepts, some slots are available but they need not be filled with particular default values if the context does not require it. Thus, the particular configuration of the schema that is used depends to a large extent on the processing situation (see Cognition, Distributed).
In closing, it is important to note that the success that researchers in cognitive psychology and AI have had in delineating the content of schemata and in determining when such background knowledge influences ongoing processing has had an important influence on many other ideas in psychology. The schema concept has become a critical one in clinical psychology, for example in cognitive theories of depression, and in social psychology, for example in understanding the influence of stereotypes on person perception (e.g., Hilton and von Hippel 1996). It
remains to be seen whether the concept of a schema becomes as dynamic in these areas of study as it has in cognitive psychology. See also: Connectionist Models of Concept Learning; Knowledge Activation in Text Comprehension and Problem Solving, Psychology of; Knowledge (Explicit and Implicit): Philosophical Aspects; Knowledge Representation; Knowledge Spaces; Mental Representations, Psychology of; Metaphor and its Role in Social Thought: History of the Concept; Reference and Representation: Philosophical Aspects; Schemas, Social Psychology of
Bibliography
Bartlett F C 1932 Remembering. Cambridge University Press, Cambridge, UK
Bower G H, Black J B, Turner T J 1979 Scripts in memory for text. Cognitive Psychology 11: 177–220
Bransford J D, Johnson M K 1972 Contextual prerequisites for understanding: A constructive versus interpretive approach. Journal of Verbal Learning and Verbal Behavior 11: 717–26
Brewer W F, Treyens J C 1981 Role of schemata in memory for places. Cognitive Psychology 13: 207–30
Graesser A C, Bower G H (eds.) 1990 Inferences and Text Comprehension: The Psychology of Learning and Motivation. Academic Press, San Diego, CA, Vol. 25
Graesser A C, Millis K K, Zwaan R A 1997 Discourse comprehension. Annual Review of Psychology 48: 163–89
Head H 1920 Studies in Neurology. Oxford University Press, London
Hilton J L, von Hippel W 1996 Stereotypes. Annual Review of Psychology 47: 237–71
Kant I 1958 Critique of Pure Reason, trans. by N. K. Smith. Modern Library, New York. Originally published in 1781
Kintsch W 1988 The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review 95: 163–82
Mandler J M 1984 Stories, Scripts, and Scenes: Aspects of Schema Theory. Erlbaum Associates, Hillsdale, NJ
Minsky M 1975 A framework for representing knowledge. In: Winston P H (ed.) The Psychology of Computer Vision. McGraw-Hill, New York
Schank R C 1982 Dynamic Memory: A Theory of Reminding and Learning in Computers and People. Cambridge University Press, New York
Schank R C, Abelson R P 1977 Scripts, Plans, Goals, and Understanding. Erlbaum Associates, Hillsdale, NJ
Whitney P, Budd D, Bramucci R S, Crane R S 1995 On babies, bath water, and schemata: A reconsideration of top-down processes in comprehension. Discourse Processes 20: 135–66
P. Whitney
Schemas, Social Psychology of
Schemas (or schemata) are generic knowledge structures that summarize past experiences and provide a framework for the acquisition, interpretation, and retrieval of new information. For example, one might have a clown schema based on past encounters with, and previously learned knowledge about, clowns. When one encounters a new clown, one’s clown schema may cause one to notice its painted face, expect polka-dotted pantaloons, interpret its actions as goofy, and recall the presence of a unicycle. Usually, such operations are functional, leading to more rapid, accurate, and detailed information processing. However, such operations can lead to biases or errors when they produce inaccurate information (e.g., when one mistakenly recalls a clown as having a bulbous red nose).
1. History Many of the fundamental precepts underlying the schema concept were anticipated by the Gestalt movement in European psychology. Bartlett (1932) incorporated these ideas into his concept of schema. He posited that past memories are stored as larger, organized structures rather than as individual elements, and that newly perceived information is accommodated into these ‘masses’ of old information. He suggested that this accommodation process is accomplished primarily through unconscious inference and faulty memory. Consequently, his theory focused more on the dysfunctional, reconstructive, and inaccurate nature of schemas. Because of his emphasis on subjective experience and the unconscious, his views and research methods departed from the behaviorist leanings of American psychology, and were largely ignored until the cognitive revolution in the 1960s and 1970s. In the mid-1970s, schema theory was refined by cognitive psychologists (see Brewer and Nakamura 1984 and Hastie 1981). While incorporating Bartlett’s basic premises, these modern approaches also acknowledged the functional benefits of schemas in terms of increased cognitive efficiency and accuracy. Furthermore, these modern approaches extended the concept of schema by incorporating it within contemporary cognitive theory. For example, schemas are now construed as (a) selectively activated, just like any other concept in memory, (b) nested within each other, with each defined partly through reference to related schemas, (c) providing ‘default values’ for embedded informational features, and (d) having active, testable memory functions. These ideas stimulated considerable work on schematic processing within cognitive psychology.
2. Schematic Processing For a schema to affect processing it must first be activated. Because of the unitary and organized nature of schemas, this is generally assumed to be an all-or-nothing process, in that the entire schema (as opposed to bits and pieces) either comes to mind completely or not at all. For example, one could not activate a chair schema without becoming aware of both the typical appearance of chairs and their normal seating function. Research has identified a number of variables that influence when particular schemas are activated by new information, including fit, context, and priming. In terms of fit, for example, a chair with typical features like four legs, armrests, and a back is more likely to activate the chair schema than is a three-legged stool. Context can also influence activation, so that an ice chest is more likely to activate a chair schema at a campsite than on a store shelf. Finally, schemas can be primed in memory through frequent or recent experiences that bring them to mind. For example, reading about chairs in this paragraph could increase the likelihood that you would activate a chair schema in response to a tree stump.
Once activated, schemas appear to have complex effects on attention. Overall, research suggests that schemas direct attention toward schema-relevant, rather than schema-irrelevant, information. However, it makes some difference whether the schema-relevant information is schema congruent or schema incongruent. Schema-incongruent information captures the most attention, presumably because it violates expectations. Schema-congruent information is attended to while an appropriate schema is in the process of being activated or instantiated, but ignored once this activation is accomplished. For example, early in the process a seat might be noticed because it helps to identify an object as a chair, but once a chair schema is activated, the seat would be ignored as obvious.
Schemas have also been shown to have powerful effects on the interpretation (or encoding) of new information, providing a framework that shapes expectancies and judgments.
Research suggests that this influence is especially pronounced when the new information is ambiguous, vague, or incomplete. For example, the ambiguous word ‘chair’ would be interpreted differently if you had a department head schema activated than if you had a furniture schema activated.
Finally, research indicates that schemas affect the retrieval of information from memory. Memory effects are often viewed as reflecting the organizational properties of schemas. Because schemas are large, organized clusters of linked information, they provide multiple routes of access for retrieving individual items of information. Such retrieval routes have been shown to facilitate memory in a number of ways. For example, experts have particularly complex and well-organized schemas and consequently exhibit better recall for domain-relevant information. Thus, as chess expertise increases, so does memory for board arrangements from actual games (though not for random arrangements). Enhanced recall for schema-related information can reflect either retrieval or guessing mechanisms. For
example, if a newspaper article activates a prisoner execution schema, the link between such executions and electric chairs could help one to recall that an electric chair was mentioned. On the other hand, the schema may instead facilitate an educated guess. As noted previously, schemas have been found to provide defaults for filling in missing information. Thus an electric chair might be ‘remembered’ even if it were not mentioned in the story. Such guesses produce biases and errors when the inferred information is incorrect.
3. Criticisms of the Schema Concept
A number of criticisms were leveled at the schema concept in the 1980s (e.g., Alba and Hasher 1983, Fiske and Linville 1980). First, schemas were criticized as definitionally vague, with little consensus regarding concept boundaries. In other words, it is unclear what differentiates schemas from other kinds of knowledge, attitudes, or attributions. Second, research on schemas has been attacked as operationally vague, especially because of unvalidated manipulations. For example, researchers sometimes assume that schemas are activated by cues or expectancies in the absence of manipulation checks evidencing such activation. Third, work on schemas has been denounced as nonfalsifiable, in that virtually any result can be interpreted as reflecting schema use. Fourth, it has been argued that schema theory has difficulty accounting for the complexity and accuracy of memory. For instance, there is evidence that people encode and remember more than simply the gist of events, and that schematic distortions are the exception rather than the rule. Finally, some critics have complained that schema theory and research amounts to little more than the reframing of previously known phenomena in terms of a new, schematic vocabulary. For example, biases in person perception documented as early as the 1940s have since been reframed as schema effects.
On the other hand, schema theory survived such criticisms because its strengths clearly outweigh its weaknesses. The breadth of the ‘vague’ schema concept allows it to encompass a wide variety of phenomena, ranging from autobiographical memory to visual perception. In many such contexts, the schema concept has been embedded in elaborate theory, allowing for more precise definition, manipulation, and measurement. With such refinements, schema theory often makes specific, falsifiable predictions that lead to clearer, if not totally unambiguous, interpretations. Additionally, modern schema research has increasingly focused on the ways that schemas facilitate, rather than interfere with, memory, accounting better for the frequent accuracy of memory. Overall, the schema concept has proven to be tremendously heuristic, leading to novel research that has documented important new phenomena.
4. Social Psychological Research
Social psychologists have applied the schema concept to complex interpersonal phenomena, often using cognitive methods and theory. Given this approach, research on schemas has become an integral part of the subfield known as social cognition. Such research has not only elaborated the kinds of information processing effects described in the cognitive literature, but has also led to the identification of a wide variety of new phenomena.
4.1 Schematic Processing of Social Information
Social psychologists have examined the effects of trait schemas on the processing of schema-related information (see Taylor and Crocker 1981). This research found patterns of recall for congruent, incongruent, and irrelevant information paralleling the patterns described earlier for cognitive stimuli. Specifically, people best recall trait-incongruent information, followed by trait-congruent information, with trait-irrelevant information last. For example, if people have an impression (schema) of Jamie as honest, they would best recall his dishonest behaviors (e.g., embezzling money from the honor society), because this information is surprising and needs to be reconciled with Jamie’s other behaviors. They would next best recall Jamie’s honest behaviors (e.g., returning a lost wallet), because these are expected and, thus, readily fit into the existing schema. Jamie’s honesty-irrelevant behaviors (e.g., consulting on statistical issues) would be worst recalled because they do not relate in any way to the honesty schema. Perhaps one of the most significant contributions of this social psychological work has been to move beyond the differential attention explanation for these findings. Such effects are now understood to also reflect behavior-to-behavior or behavior-to-schema interconnections that are created as people think about how this information can be reconciled. Incongruent information actually becomes most interconnected as people mull it over while trying to make sense of it, and irrelevant information remains least interconnected, because it warrants little reflection.
4.2 Self-schemas Social psychologists have also used the concept of schema to understand the nature of the self, which had been debated by psychologists for a century (see Linville and Carlston 1994). Most early empirical work focused on the content of the self-concept, though such work was hindered by vague conceptions of self and by the idiosyncratic nature of self-knowledge. The conception of self as a schema provided a more coherent definition, and shifted the empirical emphasis from the content of the self to the functions of self-schemas. The consequence was a proliferation of social psychological research in this area.
Considerable empirical effort went into assessing whether the self is a unique kind of schema, distinct from other forms of social knowledge. Although there are exceptions, most studies have failed to find convincing evidence that self-schemas are unique or distinct from other forms of knowledge. Consequently, the seemingly ‘unique’ effects of self-schemas may simply reflect the greater complexity and personal relevance of self-knowledge.
Researchers have also explored whether people have just one, unified self-schema, or whether people have multiple self-schemas. Most contemporary theories depict people as having multiple self-schemas, with the one that is currently activated (in use) being referred to as the working or phenomenal self. Several approaches to the idea of multiple selves have been highly influential. One approach (self-complexity theory) suggests that people have different self-schemas representing their varied social roles and relationships such as student, parent, athlete, lover, and so on. When many such self-schemas implicate different attributes, then failures in one realm may be offset by successes in another, leading to emotional buffering and enhanced mental health. On the other hand, when one has few self-schemas, and these overlap considerably, then failure in any realm can have a devastating effect on one’s self-esteem. Another approach (self-discrepancy theory) identifies three kinds of self-schema: the actual self that we each believe describes us as we are, the ideal self that we aspire to become, and the ought self that others expect us to live up to. According to this theory, increased (rather than decreased) overlap among self-schemas leads to enhanced mental health. For example, discrepancies between the actual and ideal selves lead to dysphoria, and discrepancies between the actual and ought selves lead to anxiety.
Theorists now construe multiple self-schemas in terms of the different kinds of information that are activated or accessible at any given time. A number of factors affect which self-schema is currently activated. The contexts or situations we find ourselves in can cue particular self-schemas. For instance, a work setting generally will cue our professional selves whereas a home setting will generally cue our familial selves. The self-aspects that are most salient to an individual are also influenced by individual differences. For example, members of Mensa may chronically categorize themselves and others in terms of intelligence. Such ‘schematics’ tend to view the chronic trait as important and to rate themselves highly on that dimension compared with ‘aschematics.’
A related issue involves whether self-schemas are stable or change over time. In general, schemas are viewed as fairly stable representations, and this is even more pronounced for self-schemas. However, in recent years, research has focused on factors underlying the occasional malleability of the self-concept. Researchers have shown that major life events, such as losing a job or experiencing the death of a loved one, can affect the self-schema substantially. Additionally, our self-schemas can change when we obtain new self-knowledge (e.g., learning our IQ). Relatedly, our self-schemas can be influenced by how others think about and treat us. For example, research suggests that if teachers treat students as incompetent, the students tend to incorporate this view into their self-schema and behave accordingly. Finally, our self-schemas change over time, reflecting normal age-related changes in maturity, social and cognitive development, and complexity.
4.3 Biases in Person Perception As noted earlier, schemas serve a number of cognitive functions, such as providing a framework for interpreting information and furnishing default values for unobserved or unrecalled details. When these inferences are incorrect, they produce biases and errors. For example, impressions of others can be influenced by context schemas, as when a person is viewed as more athletic when encountered at a gym than when encountered at a funeral. Similarly, impressions of a new acquaintance can be influenced by others present in the environment, even when there is no logical reason for this. For instance, relatives of physically disabled people may be viewed as having some of the same limitations simply because the disability schema has been activated. Impressions can also be influenced by déjà vu effects. If you meet someone new who resembles someone you already know, activation of the resembled person’s schema may cause you to assume that the new person has similar traits and characteristics. Role schemas can also bias impressions. If you know two nuns, you may confuse them because they activate similar role schemas. These are just a few examples of the many ways in which schemas can alter impressions.
4.4 Types of Social Schema As the preceding review suggests, social psychologists have identified many different types of social schema. These include previously mentioned role schemas (e.g., occupations), relationship schemas (e.g., parent), and trait schemas (e.g., honesty). Additionally, impressions of individuals have been construed as person schemas. Event schemas represent common ‘scripts’ as ordered sequences of actions that comprise a social event, such as attending a wedding or dining at a restaurant. Such event schemas facilitate memory for social events, as evidenced by research showing that people have difficulty remembering events that are presented to them in an illogical order. There are also nonverbal schemas composed of series of physical acts (sometimes called procedural schemas), such as a schema for riding a bike. Similarly, simple judgment rules or heuristics (e.g., smooth talkers are more believable) are sometimes viewed as nonverbal, procedural schemas. Finally, a wide variety of stereotype schemas have been identified in the social literature. Research suggests that race, age, and gender schemas, in particular, are used automatically to categorize others, presumably because these cues are visually salient and our culture defines them as important. Some research has suggested that these automatic schema effects can be overridden when people have sufficient awareness, motivation, and cognitive resources to do so (e.g., Devine 1989). However, this issue remains controversial.
5. Future Directions The term schema is no longer in vogue, although the essential features of the concept of schemas have been incorporated within broader theories of knowledge structure and mental representation. Work in this area promises to become increasingly sophisticated in several respects. Sophisticated new cognitive theories (e.g., connectionism) provide a better understanding of how schemas form and evolve as new information is acquired. Theorists are better defining and differentiating different types of cognitive structures and mechanisms. Research is providing better evidence for the kinds of representations assumed to underlie different phenomena. Such developments are refining our understanding of the schema concept and how the mind works in general. Ultimately, this work will prove most interesting and impactful as it demonstrates that schematic representations have behavioral as well as cognitive consequences. See also: Piaget’s Theory of Child Development; Schemas, Frames, and Scripts in Cognitive Psychology; Social Psychology, Theories of
Bibliography
Alba J W, Hasher L 1983 Is memory schematic? Psychological Bulletin 93: 203–31
Bartlett F C 1932 Remembering. Cambridge University Press, Cambridge, UK
Bartlett F C 1967 Remembering. A Study in Experimental and Social Psychology. Cambridge University Press, London
Brewer W F, Nakamura G V 1984 The nature and functions of schemas. In: Wyer Jr. R S, Srull T K (eds.) Handbook of Social Cognition, Vol. 1. Erlbaum, Hillsdale, NJ
Devine P G 1989 Stereotypes and prejudice: Their automatic and controlled components. Journal of Personality and Social Psychology 56: 5–18
Fiske S T, Linville P W 1980 What does the schema concept buy us? Personality and Social Psychology Bulletin 6: 543–57
Hastie R 1981 Schematic principles in human memory. In: Higgins E T, Herman C P, Zanna M P (eds.) Social Cognition: The Ontario Symposium, Vol. 1. Erlbaum, Hillsdale, NJ
Higgins E T 1987 Self discrepancy: A theory relating self and affect. Psychological Review 94: 319–40
Linville P W 1985 Self complexity and affective extremity: Don’t put all of your eggs in one cognitive basket. Social Cognition 3: 94–120
Linville P W, Carlston D E 1994 Social cognition of the self. In: Devine P G, Hamilton D L, Ostrom T M (eds.) Social Cognition: Impact on Social Psychology. Academic Press, San Diego
Smith E R 1998 Mental representation and memory. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. Oxford University Press, New York
Taylor S E, Crocker J 1981 Schematic bases of social information processing. In: Higgins E T, Herman C P, Zanna M P (eds.) Social Cognition: The Ontario Symposium, Vol. 1. Erlbaum, Hillsdale, NJ
D. E. Carlston and L. Mae
Schizophrenia
Schizophrenia is a psychosis, a severe mental disorder in which the person’s emotions, thinking, judgment, and grasp of reality are so disturbed that his or her functioning is seriously impaired. The symptoms of schizophrenia are often divided into ‘positive’ and ‘negative.’ Positive symptoms are abnormal experiences and perceptions like delusions, hallucinations, illogical and disorganized thinking, and inappropriate behavior. Negative symptoms are the absence of normal thoughts, emotions, and behavior, such as blunted emotions, loss of drive, poverty of thought, and social withdrawal. Despite common features, different forms of schizophrenia are quite dissimilar. One person, for example, may be paranoid, constantly bothered by voices warning him or her about plots or threats, but able to show good judgment and high functioning in many areas of life. Another may be bizarre in manner and appearance, preoccupied with delusions of bodily disorder, passive and withdrawn. So marked are the differences that many researchers believe that the illness will eventually prove to be a set of different conditions which lead to somewhat similar consequences.
1. Diagnostic Difficulties It is difficult to define schizophrenia precisely. The two most common functional psychoses are schizophrenia and bipolar disorder (also known as manic-depressive illness). The distinction between the two is not easy to make, and psychiatrists in different parts of the world at different times have drawn the boundaries in different ways. Bipolar disorder is an episodic disorder in which psychotic symptoms are associated with severe alterations in mood: at times elated, agitated episodes of mania, at other times depression, with physical and mental slowing, despair, guilt, and low self-esteem. The course of schizophrenia, by way of contrast, though fluctuating, tends to be more continuous, and the person’s display of emotion is likely to be incongruous or lacking in spontaneity. Markedly illogical thinking is common in schizophrenia. Auditory hallucinations may occur in either bipolar disorder or schizophrenia, but in schizophrenia they are more likely to be commenting on the person’s actions or to be conversing one with another. Delusions, also, can occur in both conditions; in schizophrenia they may give the individual the sense that he or she is being controlled by outside forces or that his or her thoughts are being broadcast or interfered with.
2. Schizophrenia is Universal Schizophrenia is a universal condition and an ancient one. Typical cases are evident in the medical writings of ancient Greece and Rome, and the condition occurs today in every human society. While the content of delusions and hallucinations varies from culture to culture, the form of the illness is similar everywhere. Two World Health Organization studies, applying a standardized diagnostic approach, have identified characteristic cases of schizophrenia in developed and developing countries around the globe (World Health Organization 1979, Jablensky et al. 1992). One of these studies (Jablensky et al. 1992) demonstrated that the rate of occurrence of new cases (the incidence) is similar in every country studied, from India to Ireland, at around one per 10,000 adults each year. However, since both death and recovery rates for people with psychosis are higher in the Third World, the point prevalence of schizophrenia (the number of cases existing at any point in time) is lower in the Third World, at around 3 per 1,000 of the adult population compared to 6 per 1,000 in the developed world (Warner and de Girolamo 1995). The risk of developing the illness at some time in one’s life (the lifetime prevalence) is a little higher, at around one percent of the developed world population. See Mental Illness, Epidemiology of.
3. Conceptual Framework The bio-psychosocial model of illness clarifies how different factors shape schizophrenia. The model posits that the predisposition to developing an illness, its onset, and its course are each influenced by biological, psychological, and sociocultural factors. A variety of factors can affect the different phases of schizophrenia, many being environmental. Some, such as genetics, gender, and synaptic pruning are innate. Biological, psychological and social factors are involved to some extent in most phases of schizophrenia. In general, however, the research suggests that the factors responsible for the predisposition to developing an illness are more likely to be biological, that
psychological factors are often important in triggering the onset, and that the course and outcome are particularly likely to be influenced by sociocultural factors.
4. The Course and Outcome of Schizophrenia Wide variation occurs in the course of schizophrenia. In some cases the onset of illness is gradual, extending over months or years; in others it begins suddenly. Some have episodes of illness lasting weeks or months with full remission of symptoms between each episode; others have a fluctuating course in which symptoms are continuous; others again have little variation over the course of years. The Swiss psychiatrist Luc Ciompi studied the onset, course, and outcome of illness in people with schizophrenia, following them into old age (Ciompi 1980). He found that the onset of the illness was either acute (less than six months from first symptoms to full-blown psychosis) or insidious in roughly equal numbers of cases; the course was episodic or continuous in approximately equal numbers of patients; and the outcome was moderate to severe disability in half the cases and mild disability or full recovery in the other half. Despite popular and professional belief, schizophrenia does not have a progressive, downhill course with universally poor outcome. In fact, schizophrenia usually becomes less severe as the sufferer grows older. A review of outcome studies conducted in Europe and North America throughout the twentieth century reveals that, over the course of months or years, 20 to 25 percent of people with schizophrenia recover completely from the illness: all their psychotic symptoms disappear and they return to their previous level of functioning. Another 20 percent continue to have some symptoms, but are able to lead satisfying and productive lives (Warner 1994). In the developing countries, recovery rates are even better. The two World Health Organization studies mentioned above (World Health Organization 1979, Jablensky et al. 1992) have shown that good outcome occurs in about twice as many patients diagnosed with schizophrenia in the developing world as in the developed world.
The reason for the better outcome in the Third World is not completely understood, but it may be that many people with mental illness in developing world villages are better accepted, less stigmatized, and more likely to find work in a subsistence agricultural economy (Warner 1994).
5. Factors Affecting the Course of Schizophrenia Age of onset: The later in life the illness begins, the milder it proves to be. Onset of schizophrenia before the age of 14 is rare, but when it does begin this early it is associated with a severe course of illness. Onset after the age of 40 is also rare, and is associated with a milder course.
Gender: Women usually develop their first symptoms of schizophrenia later than men and the course of their illness tends to be less severe. The reason for these differences is not clear, but may be related to the protective effect of female hormones on brain development and function.
Stressful life events: Stress can trigger episodes of schizophrenia. People with schizophrenia are more likely to report a stressful life event preceding an episode of illness than during a period of remission. Similarly, stressful events are more likely to occur prior to an episode of schizophrenia than in the same time period for people drawn from the general population (Rabkin 1982). Life stress is also more common before the first episode of schizophrenia and so, although stress does not cause the illness, it may well influence the timing of onset. The research indicates that the life events occurring before episodes of schizophrenia are milder and less objectively troublesome than those before episodes of other disorders such as depression (Beck and Worthen 1972), suggesting that people with schizophrenia are exquisitely sensitive to stress.
Domestic stress: The robust results of the ‘expressed emotion’ (EE) research, conducted in several countries in the developed and developing worlds, reveal that people with schizophrenia living with relatives who are critical or over-involved (referred to in the research as high EE) have a higher relapse rate than those living with relatives who are less critical or intrusive (low EE) (Leff and Vaughn 1985, Parker and Hadzi-Pavlovic 1990). A meta-analysis of 26 EE studies of schizophrenia conducted in 11 countries indicates that the relapse rate over a two-year follow-up period was more than twice as high, at 66 percent, for patients in families which included a high EE relative than in low EE households (29 percent) (Kavanagh 1992).
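The size of the high EE effect can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the cited meta-analysis, that takes the two relapse rates quoted above (66 and 29 percent) and computes their ratio and their difference; the function and constant names are illustrative assumptions.

```python
def relative_risk(rate_exposed: float, rate_comparison: float) -> float:
    """Ratio of the relapse rate in one group to the rate in a comparison group."""
    if rate_comparison <= 0:
        raise ValueError("comparison rate must be positive")
    return rate_exposed / rate_comparison


def risk_difference(rate_exposed: float, rate_comparison: float) -> float:
    """Absolute difference between the two relapse rates."""
    return rate_exposed - rate_comparison


# Two-year relapse rates as quoted in the text (Kavanagh 1992).
HIGH_EE_RELAPSE = 0.66
LOW_EE_RELAPSE = 0.29

rr = relative_risk(HIGH_EE_RELAPSE, LOW_EE_RELAPSE)
rd = risk_difference(HIGH_EE_RELAPSE, LOW_EE_RELAPSE)

print(f"relative risk: {rr:.2f}")
print(f"absolute difference: {rd * 100:.0f} percentage points")
```

The ratio comes out a little above 2, which is what ‘more than twice as high’ expresses, while the difference shows how many additional relapses per 100 patients separate the two kinds of household.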
Other studies have shown that relatives who are less critical and over-involved exert a positive therapeutic effect on the person with schizophrenia, their presence leading to a reduction in the patient’s level of arousal (Sturgeon et al. 1981). There is no indication that the more critical and over-involved relatives are abnormal by everyday Western standards; low EE family members may be unusually low-key and permissive (Angermeyer 1983). Several studies have shown that family psychoeducational interventions can lead to a change in the level of criticism and over-involvement among relatives of people with schizophrenia and so reduce the relapse rate (Falloon et al. 1982, Berkowitz et al. 1981). Effective family interventions provide three basic ingredients: (a) education about the illness; (b) help in developing problem-solving mechanisms; and (c) practical and emotional support (McFarlane 1983, Falloon et al. 1982, Leff and Vaughn 1985).
Substance use: Drug and alcohol abuse is more common among people with serious mental illness. Unemployment, social isolation, and alienation may contribute to these high rates. Several studies have shown that people with serious mental illness who abuse substances have a worse course of illness (Drake and Wallach 1989), but other researchers have found psychopathology to be no worse or, sometimes, lower among people with mental illness who use substances (Zisook et al. 1992, Buckley et al. 1994). One reason for this discrepancy may lie in the common finding that substance users are also more likely to be noncompliant with treatment (Drake and Wallach 1989); the poor course of illness, when it is observed, may be a result of this noncompliance rather than a direct consequence of substance use.
6. Causes of Schizophrenia There is no single organic defect or infectious agent which causes schizophrenia, but a variety of factors increase the risk of developing the illness, including genetic factors and obstetric complications.
Genetic factors. Relatives of people with schizophrenia have a greater risk of developing the illness, the risk being progressively higher among those who are more genetically similar to the person with schizophrenia (see Fig. 1). For a second-degree relative the lifetime risk is about two percent (twice the risk for someone in the general population); for a first-degree relative it is about ten percent; and for an identical twin (genetically identical to the person with schizophrenia) the risk is close to 50 percent (Gottesman 1991). Studies of people adopted in infancy reveal that the increased risk of schizophrenia among the relatives of people with the illness is due to inheritance rather than environment. The children of people with schizophrenia have the same increased prevalence of the illness whether they are raised by their biological parent with schizophrenia or by adoptive parents (Gottesman 1991, Warner and de Girolamo 1995). There is evidence implicating several genes in causing schizophrenia (Wang et al. 1995, Freedman et al. 1997), and it is likely that more than one is responsible, either through an interactive effect or by producing different variants of the disorder. See Mental Illness, Genetics of.
Obstetric complications. A review and meta-analysis of studies conducted prior to mid-1994 on the effect of obstetric complications in schizophrenia reveals that complications before and around the time of birth appear to double the risk of developing the illness (Geddes and Lawrie 1995). In a more recent meta-analysis of a different sample of studies, the same researchers found the risk to be increased by a factor of 1.4. Since these analyses were published, more recent studies have shown variable results.
Figure 1 The average risk of developing schizophrenia for relatives of a person with the illness, compiled from family and twin studies conducted in Europe between 1970 and 1987. The figure plots relationship to the person with schizophrenia, grouped by proportion of genes shared, against lifetime risk of developing schizophrenia (percent). Values shown: general population 1%; first cousins (third-degree relatives, 12.5% of genes shared) 2%; uncles/aunts 2%, nephews/nieces 4%, grandchildren 5%, and half siblings 6% (second-degree relatives, 25%); parents 6%, siblings 9%, children 13%, and fraternal twins 17% (first-degree relatives, 50%); identical twins (100%) 48%. Reprinted by permission of the author from Gottesman (1991, p. 96) © 1991 Irving I. Gottesman
Data gathered at the time of birth from very large cohorts of children born in Finland and Sweden in the 1960s and 1970s indicate that various obstetric complications double
or triple the risk of developing schizophrenia (Hultman et al. 1999, Dalman et al. 1999, Jones et al. 1998). An American study shows that the risk of schizophrenia is more than four times greater in those who experience oxygen deprivation before or at the time of birth, and that such complications increase the risk of schizophrenia much more than other psychoses like bipolar disorder (Zornberg et al. 2000). Recent Scottish research, on the other hand, found the effect of obstetric complications to be less than in prior studies (Kendell et al. 2000). Other recent research suggests that only early-onset (before age 30) cases of schizophrenia are associated with obstetric complications (Byrne et al. 2000, Cannon et al. 2000). Obstetric complications are a statistically important risk because they are so common. In the general population, they occur in up to 40 percent of births (the precise rate of occurrence depending on how they are defined) (McNeil 1988, Geddes and Lawrie 1995). The authors of the meta-analyses cited above estimate that complications of pregnancy and delivery may increase the prevalence of schizophrenia by 20 percent (Geddes and Lawrie 1995). The obstetric complications most closely associated with the increased risk of developing schizophrenia are those that induce fetal oxygen deprivation, particularly prolonged labor (McNeil 1988), and placental complications (Jones et al. 1998, Hultman et al. 1999, Dalman et al. 1999). Early delivery is also more
common for those who go on to develop schizophrenia, and infants who suffer perinatal brain damage are at a much-increased risk of subsequent schizophrenia (Jones et al. 1998). Trauma at the time of labor and delivery, and especially prolonged labor, is associated with an increase in structural brain abnormalities (cerebral atrophy and small hippocampi) which occur frequently in schizophrenia (McNeil et al. 2000).
Viruses. The risk of intrauterine brain damage is increased if a pregnant woman contracts a viral illness. We know that more people with schizophrenia are born in the late winter or spring than at other times of year, and that this birth bulge sometimes increases after epidemics of viral illnesses like influenza, measles, and chickenpox. Maternal viral infections, however, probably account for only a small part of the increased risk for schizophrenia (Warner and de Girolamo 1995).
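The point made earlier in this section, that a common exposure with a modest relative risk can matter at the population level, may be sketched numerically. The Python example below is an illustrative assumption and not the calculation used by Geddes and Lawrie (1995): it supposes that a fraction p of births is exposed to obstetric complications and that exposure multiplies the risk by RR, so that overall prevalence is raised by p × (RR − 1) relative to a population with no exposure. The 40 percent frequency and the relative risks of 2 and 1.4 are taken from the text; the 20 percent frequency is a hypothetical value added for comparison, and the function name is invented for this sketch.

```python
def proportional_increase(exposed_fraction: float, relative_risk: float) -> float:
    """Proportional rise in prevalence over the unexposed baseline.

    With baseline risk r0, overall risk is (1 - p) * r0 + p * RR * r0,
    which equals r0 * (1 + p * (RR - 1)); the excess is p * (RR - 1).
    """
    if not 0.0 <= exposed_fraction <= 1.0:
        raise ValueError("exposed_fraction must lie between 0 and 1")
    if relative_risk <= 0:
        raise ValueError("relative_risk must be positive")
    return exposed_fraction * (relative_risk - 1.0)


# (fraction of births exposed, relative risk); the last row is hypothetical.
scenarios = [
    (0.40, 2.0),
    (0.40, 1.4),
    (0.20, 2.0),
]

for p, rr in scenarios:
    increase = proportional_increase(p, rr)
    print(f"p = {p:.2f}, RR = {rr:.1f}: prevalence raised by {increase * 100:.0f} percent")
```

Under these simplified assumptions the estimated rise in prevalence depends strongly on how broadly complications are defined and on which relative risk is used, which is consistent with the variable results reported across studies.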
7. Myths about the Causes of Schizophrenia Parenting: Contrary to the beliefs of professionals prior to the 1970s and to the impression still promoted by the popular media, there is no evidence, even after decades of research, that family or parenting problems cause schizophrenia. As early as 1948, psychoanalysts proposed that mothers fostered schizophrenia in their offspring through cold and distant parenting. Other theorists blamed parental schisms and confusing patterns of communication within the family (Lidz et al. 1965, Laing and Esterson 1970). The double-bind theory, put forward by anthropologist Gregory Bateson, argued that schizophrenia is promoted by contradictory parental messages from which the child is unable to escape (Bateson et al. 1956). While enjoying broad public recognition, such theories have seldom been adequately tested, and none of the research satisfactorily resolves the question of whether differences found in the families of people with schizophrenia are the cause or the effect of psychological abnormalities in the disturbed family member (Hirsch and Leff 1975). Millions of family members of people with schizophrenia have suffered needless shame, guilt, and stigma because of this widespread misconception.
Drug abuse: Drug abuse does not cause schizophrenia, though it is possible, but by no means certain, that it can trigger the onset of the illness. Hallucinogenic drugs like LSD can induce short episodes of psychosis, and heavy use of marijuana and stimulant drugs like cocaine and amphetamines may precipitate brief, toxic psychoses with features similar to schizophrenia (Bowers 1987), but there is no evidence that these drugs cause a long-lasting illness like schizophrenia. In the 1950s and 1960s, LSD was used as an experimental drug in psychiatry in Britain and America, but the proportion of the volunteers and patients involved who developed a long-lasting psychosis was scarcely greater than in the general population (Cohen 1960, Malleson 1971). A Swedish study found that army conscripts who used marijuana heavily were six times more likely to develop schizophrenia later in life (Andreasson et al. 1987), but this may well have been because those people who were destined to develop schizophrenia were more likely to use marijuana as a way to cope with the premorbid symptoms of the illness.
Schizophrenia is preceded by a long period of prodromal symptoms, and a German study has demonstrated that the onset of drug and alcohol abuse in people with schizophrenia usually follows the very first negative symptom of schizophrenia (such as social withdrawal) but precedes the first positive symptom (such as hallucinations). The authors conclude that substance use is an avenue to the relief of the earliest symptoms of the illness, but is not a cause (Hambrecht and Hafner 1995).
8. The Brain in Schizophrenia Physical changes in the brain have been identified in some people with schizophrenia. The analysis of brain tissue after death has revealed a number of structural abnormalities, and new brain-imaging techniques have revealed changes in both the structure and function of the brain during life. Techniques such as magnetic resonance imaging (MRI) reveal changes in the size of
different parts of the brain, especially the temporal lobes. The fluid-filled spaces (the ventricles) in the interior of the temporal lobes are often enlarged and the temporal lobe tissue diminished. The greater the observed changes, the greater the severity of the person’s thought disorder and auditory hallucinations (Suddath et al. 1990). Some imaging techniques, such as positron emission tomography (PET), measure the actual functioning of the brain and provide a similar picture of abnormality. PET scanning reveals hyperactivity in the temporal lobes, particularly in the hippocampus, a part of the temporal lobe concerned with episodic memory (Tamminga et al. 1992). Another type of functional imaging, electrophysiological brain recording using EEG tracings, shows that most people with schizophrenia seem to be excessively responsive to repeated environmental stimuli and more limited in their ability to blot out irrelevant information (Freedman et al. 1997). In line with this finding, those parts of the brain that are supposed to screen out irrelevant stimuli, such as the frontal lobe, show decreased activity on PET scan (Tamminga et al. 1992). Tying in with this sensory screening difficulty, postmortem brain tissue examination has revealed problems in a certain type of brain cell, the inhibitory interneuron. These neurons damp down the action of the principal nerve cells, preventing them from responding to too many inputs. Thus, they prevent the brain from being overwhelmed by too much sensory information from the environment. The chemical messengers or neurotransmitters (primarily gamma-aminobutyric acid or GABA) released by these interneurons are diminished in the brains of people with schizophrenia (Benes et al. 1991, Akbarian et al. 1993), suggesting that there is less inhibition of brain overload. Abnormality in the functioning of these interneurons appears to produce changes in the brain cells that release the neurotransmitter dopamine.
The role of dopamine has long been of interest to schizophrenia researchers, because drugs like amphetamines that increase dopamine’s effects can cause psychoses that resemble schizophrenia, and drugs that block or decrease dopamine’s effect are useful for the treatment of psychoses (Meltzer and Stahl 1976). Dopamine increases the sensitivity of brain cells to stimuli. Ordinarily, this heightened sensitivity is useful in increasing a person’s awareness during times of stress or danger, but, for a person with schizophrenia, the addition of the effect of dopamine to an already hyperactive brain state may tip the person into psychosis. These findings suggest that in schizophrenia there is a deficit in the regulation of brain activity by interneurons, so that the brain over-responds to environmental signals and lacks the ability to screen out unwanted stimuli. This problem is made worse by a decrease in the size of the temporal lobes, which ordinarily process sensory inputs, making it more difficult for the person to respond appropriately to new stimuli. (See MRI (Magnetic Resonance Imaging) in Psychiatry.)
9. Why Does Schizophrenia Begin after Puberty? Schizophrenia researchers have long been puzzled about why the illness normally begins in adolescence when important risk factors, such as genetic loading and neonatal brain damage, are present from birth or sooner. Recent research attempts to address the question. Normal brain development leads to the loss of 30–40 percent of the connections (synapses) between brain cells during the developmental period from early life to adolescence (Huttenlocher 1979). Brain cells themselves do not diminish in number during this period, only their connectivity. It appears that we may need a high degree of connectivity between brain cells in infancy to enhance our ability to learn language rapidly (toddlers learn as many as twelve new words a day). The loss of synapses during later childhood and adolescence, however, improves our ‘working memory’ and our efficiency in processing complex linguistic information (McGlashan and Hoffman 2000). For people with schizophrenia, this normally useful process of synaptic pruning is excessive, leaving fewer synapses in the frontal lobes and medial temporal cortex (Feinberg 1983). In consequence, there are deficits in the interaction between these two areas of the brain in schizophrenia, which reduce the adequacy of working memory (Weinberger et al. 1992). One intriguing computer modeling exercise suggests that decreasing synaptic connections and eroding working memory in this way leads not only to abnormalities in the ability to recognize meaning when stimuli are ambiguous but also to the development of auditory hallucinations (Hoffman and McGlashan 1997). It is possible, therefore, that this natural and adaptive process of synaptic elimination in childhood, if carried too far, could lead to the development of schizophrenia (Feinberg 1983). If true, this would help explain why schizophrenia persists among humans despite its obvious functional disadvantages and its association with reduced fertility.
The genes for synaptic pruning may help us refine our capacity to comprehend speech and other complex stimuli, but, when complicated by environmental assaults resulting in brain injury, the result could be symptoms of psychosis. As yet, this formulation is speculative.
10. Effective Interventions in Schizophrenia
There is more agreement now about what is important in the treatment of schizophrenia than ever before. In establishing the World Psychiatric Association global project designed to combat the stigma and discrimination resulting from schizophrenia (Warner 2000), prominent psychiatrists from around the world recently agreed on the following principles:
People with schizophrenia can be treated effectively in a variety of settings. These days the use of hospitals is mainly reserved for those in an acute relapse. Outside of the hospital, a range of alternative treatment settings have been devised which provide supervision and support and are less alienating and coercive than the hospital (Warner 1995).
Family involvement can improve the effectiveness of treatment. A solid body of research has demonstrated that relapse in schizophrenia is much less frequent when families are provided with support and education about schizophrenia.
Medications are an important part of treatment but they are only part of the answer. They can reduce or eliminate positive symptoms but they have a negligible effect on negative symptoms. Fortunately, modern, novel antipsychotic medications, introduced in the past few years, can provide benefits while causing less severe side effects than the standard antipsychotic drugs which were introduced in the mid-1950s.
Treatment should include social rehabilitation. People with schizophrenia usually need help to improve their functioning in the community. This can include training in basic living skills; assistance with a host of day-to-day tasks; and job training, job placement, and work support. The psychosocial clubhouse is one effective model for providing many of these forms of assistance (Mosher and Burti 1989). The assertive community treatment model has proven effective in preventing relapse and hospital admission (Stein and Test 1980).
Work helps people recover from schizophrenia. Productive activity is basic to a person’s sense of identity and worth. The availability of work in a subsistence economy may be one of the main reasons that outcome from schizophrenia is so much better in Third World villages (Warner 1994).
Given training and support, most people with schizophrenia can work, as has been demonstrated by several north Italian employment programs (Warner 1994). However, due to problems such as work disincentives in disability pension schemes, high general unemployment and inadequate vocational rehabilitation services, the employment of people with schizophrenia in Britain and the United States has routinely been as low as 15 percent in recent years (Warner 2000).
People with schizophrenia can get worse if treated punitively or confined unnecessarily. Extended hospital stays are rarely necessary if good community treatment is available. Jails or prisons are not appropriate places of care. Yet, around the world, large numbers of people with schizophrenia are housed in prison cells, usually charged with minor crimes, largely because of the lack of adequate community treatment.
People with schizophrenia and their family members should help plan and even deliver treatment.
Consumers of mental health services can be successfully employed in treatment programs, and when they help train treatment staff, professional attitudes and patient outcome both improve (Sherman and Porter 1991, Warner 2000).
People’s responses towards someone with schizophrenia influence the person’s course of illness and quality of life. Negative attitudes can push people with schizophrenia and their families into hiding the illness and drive them away from help. If people with schizophrenia are shunned and feared they cannot be genuine members of their own community. They become isolated and victims of discrimination in employment, accommodation, and education (Warner 2000). The recent US Surgeon General’s report on mental illness cited stigma as one of the most important obstacles to effective treatment (US Department of Health and Human Services 1999).
See also: Depression, Clinical Psychology of; Developmental Psychopathology: Child Psychology Aspects; Differential Diagnosis in Psychiatry; Mental and Behavioral Disorders, Diagnosis and Classification of; Mental Illness, Epidemiology of; Mental Illness, Genetics of; Psychiatric Assessment: Negative Symptoms; Schizophrenia and Bipolar Disorder: Genetic Aspects; Schizophrenia: Neuroscience Perspective; Schizophrenia, Treatment of
Bibliography
Akbarian S, Vinuela A, Kim J J, Potkin S G, Bunney W E J, Jones E G 1993 Distorted distribution of nicotinamide-adenine dinucleotide phosphate-diaphorase neurons in temporal lobe of schizophrenics implies anomalous cortical development. Archives of General Psychiatry 50: 178–87
Andreasson S, Allebeck P, Engstrom A, Rydberg U 1987 Cannabis and schizophrenia: A longitudinal study of Swedish conscripts. Lancet 1987(2): 1483–86
Angermeyer M C 1983 ‘Normal deviance’: Changing norms under abnormal circumstances. Presented at the Seventh World Congress of Psychiatry, Vienna
Bateson G, Jackson D, Haley J 1956 Towards a theory of schizophrenia. Behavioral Science 1: 251–64
Beck J, Worthen K 1972 Precipitating stress, crisis theory, and hospitalization in schizophrenia and depression. Archives of General Psychiatry 26: 123–9
Benes F M, McSparran I, Bird E D, San Giovani J P, Vincent S L 1991 Deficits in small interneurons in prefrontal and cingulate cortices of schizophrenic and schizoaffective patients. Archives of General Psychiatry 48: 996–1001
Berkowitz R, Kuipers L, Eberlein-Fries R, Leff J D 1981 Lowering expressed emotion in relatives of schizophrenics. New Directions in Mental Health Services 12: 27–48
Bowers M B 1987 The role of drugs in the production of schizophreniform psychoses and related disorders. In: Meltzer H Y (ed.) Psychopharmacology: The Third Generation of Progress. Raven Press, New York
Buckley P, Thompson P, Way L, Meltzer H Y 1994 Substance abuse among patients with treatment-resistant schizophrenia: Characteristics and implications for clozapine therapy. American Journal of Psychiatry 151: 385–9
Byrne M, Browne R, Mulryan N, Scully A, Morris M, Kinsella A, McNeil T, Walsh D, O’Callaghan E 2000 Labour and delivery complications and schizophrenia: Case-control study using contemporaneous labour ward records. British Journal of Psychiatry 176: 531–6
Cannon T D, Rosso I M, Hollister J M, Bearden C E, Sanchez L E, Hadley T 2000 A prospective cohort study of genetic and perinatal influences in the etiology of schizophrenia. Schizophrenia Bulletin 26: 351–66
Ciompi L 1980 Catamnestic long-term study on the course of life and aging of schizophrenics. Schizophrenia Bulletin 6: 606–18
Cohen S 1960 Lysergic acid diethylamide: Side effects and complications. Journal of Nervous and Mental Disease 130: 30–40
Dalman C, Allebeck P, Cullberg J, Grunewald C, Koster M 1999 Obstetric complications and the risk of schizophrenia. Archives of General Psychiatry 56: 234–40
Drake R E, Wallach M A 1989 Substance abuse among the chronically mentally ill. Hospital and Community Psychiatry 40: 1041–6
Falloon I R H, Boyd J L, McGill C W, Razani J, Moss H B, Gilderman A M 1982 Family management in the prevention of exacerbations of schizophrenia: A controlled study. New England Journal of Medicine 306: 1437–40
Feinberg I 1983 Schizophrenia: Caused by a fault in programmed synaptic elimination during adolescence? Journal of Psychiatric Research 17: 319–34
Freedman R, Coon H, Myles-Worsley M, Orr-Urtreger A, Olincy A, Davis A, Polymeropoulos M, Holik J, Hopkins J, Hoff M, Rosenthal J, Waldo M C, Reimherr F, Wender P, Yaw J, Young D A, Breese C R, Adams C, Patterson D, Adler L E, Kruglyak L, Leonard S, Byerley W 1997 Linkage of a neurophysiological deficit in schizophrenia to a chromosome 15 locus. Proceedings of the National Academy of Sciences of the USA 94: 587–92
Geddes J R, Lawrie S M 1995 Obstetric complications and schizophrenia. British Journal of Psychiatry 167: 786–93
Gottesman I 1991 Schizophrenia Genesis: The Origins of Madness. Freeman, New York
Hambrecht M, Hafner H 1995 Substance abuse or schizophrenia: Which comes first? Presented at the World Psychiatric Association Section of Epidemiology and Community Psychiatry Symposium, New York
Hirsch S, Leff J 1975 Abnormality in Parents of Schizophrenics. Oxford University Press, London
Hoffman R E, McGlashan T H 1997 Synaptic elimination, neurodevelopment, and the mechanism of hallucinated ‘voices’ in schizophrenia. American Journal of Psychiatry 154: 1683–9
Hultman C M, Sparen P, Takei N, Murray R M, Cnattingius S 1999 Prenatal and perinatal risk factors for schizophrenia, affective psychosis, and reactive psychosis of early onset: Case control study. British Medical Journal 318: 421–6
Huttenlocher P R 1979 Synaptic density in the human frontal cortex–developmental changes and effects of aging. Brain Research 163: 195–205
Jablensky A, Sartorius N, Ernberg G, Anker M, Korten A, Cooper J E, Day R, Bertelsen A 1992 Schizophrenia: Manifestations, incidence and course in different cultures: A World Health Organization ten-country study. Psychological Medicine Monograph Supplement 20
Jones P B, Rantakallio P, Hartikainen A-L, Isohanni M, Sipila P 1998 Schizophrenia as a long-term outcome of pregnancy, delivery, and perinatal complications: A 28-year follow-up of the 1966 north Finland general population birth cohort. American Journal of Psychiatry 155: 355–64
Kavanagh D J 1992 Recent developments in expressed emotion and schizophrenia. British Journal of Psychiatry 160: 601–20
Kendell R E, McInneny K, Juszczak E, Bain M 2000 Obstetric complications and schizophrenia: Two case-control studies based on structured obstetric records. British Journal of Psychiatry 176: 516–22
Laing R D, Esterson A 1970 Sanity, Madness and the Family: Families of Schizophrenics. Penguin Books, Baltimore
Leff J, Vaughn C 1985 Expressed Emotion in Families. Guilford Press, New York
Lidz T, Fleck S, Cornelison A 1965 Schizophrenia and the Family. International Universities Press, New York
Malleson N 1971 Acute adverse reactions to LSD in clinical and experimental use in the United Kingdom. British Journal of Psychiatry 118: 229–30
McFarlane W R (ed.) 1983 Family Therapy in Schizophrenia. Guilford Press, New York
McGlashan T H, Hoffman R E 2000 Schizophrenia as a disorder of developmentally reduced synaptic connectivity. Archives of General Psychiatry 57: 637–48
McNeil T F 1988 Obstetric factors and perinatal injuries. In: Tsuang M T, Simpson J C (eds.) Handbook of Schizophrenia: Nosology, Epidemiology and Genetics. Elsevier Science, New York
Meltzer H Y, Stahl S M 1976 The dopamine hypothesis of schizophrenia: A review. Schizophrenia Bulletin 2: 19–76
Mosher L, Burti L 1989 Community Mental Health: Principles and Practice. Norton, New York
Parker G, Hadzi-Pavlovic D 1990 Expressed emotion as a predictor of schizophrenic relapse: An analysis of aggregated data. Psychological Medicine 20: 961–5
Rabkin J G 1982 Stress and psychiatric disorders. In: Goldberger L, Breznitz S (eds.) Handbook of Stress: Theoretical and Clinical Aspects. Free Press, New York, pp. 566–84
Sherman P S, Porter M A 1991 Mental health consumers as case management aides. Hospital and Community Psychiatry 42: 494–8
Stein L I, Test M A 1980 Alternative to mental hospital treatment: I. Conceptual model, treatment program, and clinical evaluation. Archives of General Psychiatry 37: 392–7
Sturgeon D, Kuipers L, Berkowitz R, Turpin G, Leff J 1981 Psychophysiological responses of schizophrenic patients to high and low expressed emotion relatives. British Journal of Psychiatry 138: 40–5
Suddath R L, Christison G W, Torrey E F, Casanova M F, Weinberger D R 1990 Anatomical abnormalities in the brains of monozygotic twins discordant for schizophrenia. New England Journal of Medicine 322: 789–94
Tamminga C A, Thaker G K, Buchanan R, Kirkpatrick B, Alphs L D, Chase T N, Carpenter W T 1992 Limbic system abnormalities identified in schizophrenia using positron emission tomography with fluorodeoxyglucose and neocortical alterations with deficit syndrome. Archives of General Psychiatry 49: 522–30
US Department of Health and Human Services 1999 Mental Health: A Report of the Surgeon General. US Department of Health and Human Services, Substance Abuse and Mental Health Services Administration, Center for Mental Health Services, National Institutes of Health, National Institute of Mental Health, Rockville, MD
Wang S, Sun C, Walczak C A, Ziegle J S, Kipps B R, Goldin L R, Diehl S R 1995 Evidence for a susceptibility locus for schizophrenia on chromosome 6pter-p22. Nature Genetics 10: 41–6
Warner R 1994 Recovery from Schizophrenia: Psychiatry and Political Economy. Routledge, New York
Warner R (ed.) 1995 Alternatives to the Hospital for Acute Psychiatric Care. American Psychiatric Press, Washington, DC
Warner R 2000 The Environment of Schizophrenia: Innovations in Practice, Policy and Communications. Routledge, London
Warner R, de Girolamo G 1995 Epidemiology of Mental Problems and Psychosocial Problems: Schizophrenia. World Health Organization, Geneva, Switzerland
Weinberger D R, Berman K F, Suddath R, Torrey E F 1992 Evidence of a dysfunction of a prefrontal-limbic network in schizophrenia: A magnetic resonance imaging and regional cerebral blood flow study of discordant monozygotic twins. American Journal of Psychiatry 149: 890–7
World Health Organization 1979 Schizophrenia: An International Follow-up Study. Wiley, Chichester, UK
Zisook S, Heaton R, Moranville J, Kuck J, Jernigan T, Braff D 1992 Past substance abuse and clinical course of schizophrenia. American Journal of Psychiatry 149: 552–3
Zornberg G L, Buka S L, Tsuang M T 2000 Hypoxia-ischemia-related fetal/neonatal complications and risk of schizophrenia and other nonaffective psychoses: A 19-year longitudinal study. American Journal of Psychiatry 157: 196–202
R. Warner
Schizophrenia and Bipolar Disorder: Genetic Aspects
Schizophrenia and bipolar mood disorder (the latter sometimes called ‘manic-depressive illness’) are among the most serious of all psychiatric disorders, indeed of all medical disorders. Both of these psychiatric illnesses tend to have a rather early age of onset, with most patients first becoming ill in their teens or twenties, and the symptoms are often chronic, particularly in the case of schizophrenia. Moreover, these illnesses are often severely disabling, and are associated with increased rates of educational problems, unemployment, marital difficulties, alcohol or substance abuse, and suicide. Schizophrenia and bipolar disorder each affect approximately one percent of the population in the USA and Western Europe. If one takes into account not only the numbers of people affected by schizophrenia or bipolar disorder but also the fact that many patients are seriously disabled for much of their adult lives, then the cost of these two disorders, in both economic and human terms, rivals that of diseases such as heart disease and stroke, which affect more people but tend not to strike until much later in life.
This article first considers diagnostic issues, including the most widely used diagnostic criteria for schizophrenia, bipolar disorder, and related disorders. Evidence for genetic factors in schizophrenia and bipolar disorder is reviewed next, including some complicating factors, such as the likely presence of etiologic heterogeneity and the interaction of genes with environmental stressors. Issues of genetic counseling are then considered. The article concludes with a discussion of evidence that genes for schizophrenia and bipolar disorder may have a ‘silver lining,’ in terms of increased creative potential.
1. Diagnostic Issues
1.1 Schizophrenia and Related Disorders
The criteria currently most widely used in diagnosing schizophrenia, bipolar disorder, and related disorders are those described in the most recent (4th) edition of the Diagnostic and Statistical Manual (DSM-IV) of the American Psychiatric Association (1994). Briefly summarized, the diagnostic criteria for schizophrenia listed in DSM-IV require two or more of the following five characteristic symptoms: hallucinations; delusions; disorganized speech; grossly disorganized behavior; and negative symptoms, such as flattened affect. In addition, the patient must have shown significant deterioration of functioning in areas such as occupational or interpersonal relations, and have had continuous signs of illness for at least six months, including at least one month with the characteristic symptoms.
DSM-IV also recognizes several related disorders that manifest milder forms of certain symptoms that are often seen in schizophrenia. Schizotypal personality disorder, for example, is used for persons who have several characteristic features (e.g., magical thinking, recurrent illusions, and peculiar behavior or speech). If a person does not have such schizotypal eccentricities but has shown several signs of marked disinterest in interpersonal relationships (e.g., extreme aloofness, absence of any close friends), then a diagnosis of schizoid personality disorder is given. In paranoid personality disorder, there is a broad pattern of extreme suspiciousness, as shown by several signs (e.g., irrational fears that others wish to harm one). Family and adoption studies suggest that these personality disorders are part of a ‘spectrum’ of disorders that are genetically related to schizophrenia proper.
1.2 Bipolar Disorder and Related Disorders
In DSM-IV, the diagnosis of ‘bipolar I’ disorder requires a history of at least one episode of mania, which is defined as a period in which a person’s mood is ‘abnormally and persistently’ expansive, elevated, or irritable. In addition, diagnosis of bipolar disorder requires that the mood change either involve psychotic features which last at least a week, or lead to hospitalization. The disturbance in mood must be severe enough to seriously disrupt social or occupational functioning, and/or to require hospitalization. The manic episode must also involve at least three (four, if only an irritable mood is present) of seven symptoms: (a) greatly inflated self-esteem, (b) increased activity or restlessness, (c) unusual talkativeness, (d) racing thoughts or flight of ideas, (e) decreased need for sleep, (f) distractibility, and (g) extremely risky actions (such as buying sprees or reckless driving) whose potentially dangerous consequences are not appreciated. The criteria for a hypomanic episode are essentially the same as those for a manic one, except that the symptoms are neither psychotic nor so severe that they markedly impair social functioning or require hospitalization.
The term bipolar is potentially confusing, because it does not require a history of major depression, even though most persons with bipolar disorder will also have experienced episodes of major depression. (By contrast, in major depressive disorder, a person has experienced an episode of major depression, but not an episode of mania.) As in the case of schizophrenia, there appears to be a ‘spectrum’ of affective disorders that are genetically related to bipolar disorder but that have symptoms which are milder than those found in frank bipolar disorder. Thus DSM-IV also includes (a) bipolar II disorder (in which a patient has experienced a hypomanic, rather than a manic, episode, as well as an episode of major depression) and (b) cyclothymic disorder, which involves a history of multiple hypomanic episodes as well as multiple periods of depressive symptoms (but not major depression). Bipolar disorder is also often accompanied by other concurrent, or ‘co-morbid,’ disorders, such as anxiety disorders, or alcohol and substance abuse.
Moreover, even when symptoms of mania and depression are in remission, patients with a history of bipolar disorder often still meet criteria for personality disorders, particularly those with narcissistic or histrionic features.
1.3 Differential Diagnosis
A diagnosis of schizophrenia requires that diagnoses of mood disorders and schizoaffective disorders be excluded. Diagnoses of schizophrenia, mood disorders, and schizoaffective disorders all require the exclusion of neurologic syndromes caused by factors such as substance abuse, medications, or general medical condition. Until rather recently there has been a tendency, particularly in the USA, for mania with psychotic features to be misdiagnosed as schizophrenia. Accurate differential diagnosis is crucial, because misdiagnosis (and resultant inappropriate treatment) may result, on the one hand, in patients being needlessly exposed to harmful side effects of medication or, on the other hand, being deprived of appropriate medications that may alleviate unnecessary suffering and save patients’ jobs, marriages, even their lives. Mania, for example, often responds well to medications (particularly lithium and certain anticonvulsants such as carbamazepine and valproate) that are less effective in treating schizophrenia. It is important, moreover, to diagnose and treat schizophrenia and bipolar disorder as early as possible in the course of the illness because there is increasing evidence that these illnesses (and their underlying brain pathologies) tend to worsen if left untreated (e.g., Wyatt 1991).
2. Evidence for Genetic Factors in Schizophrenia
There are several complementary lines of evidence for an important role of genetic factors in the etiology of schizophrenia. First, there are converging lines of evidence from family, twin, and adoption studies that the risk for schizophrenia is greatly increased among schizophrenics’ biological relatives. Second, investigators have, in recent years, increasingly marshalled techniques from molecular genetics to look for more direct evidence of genetic factors in schizophrenia.
2.1 Family, Twin, and Adoption Studies
A person’s risk of developing schizophrenia increases, on average, with his or her increasing degree of genetic relatedness to a schizophrenic patient (e.g., Matthysse and Kidd 1976, Holzman and Matthysse 1990, Torrey et al. 1994). The risk of developing schizophrenia over a person’s lifetime is about 0.8 percent for people in the general population, though there is a several-fold variation in the prevalence of schizophrenia across different populations that have been studied around the world (e.g., see Torrey 1987). By contrast, a person’s lifetime risk is 5–10 percent if the person has a first-degree relative with schizophrenia, and is much higher (nearly 50 percent in some studies) if a person is the monozygotic (genetically identical) twin of a schizophrenic patient. While these risk figures are consistent with genetic transmission, they are not conclusive, because the degree of genetic resemblance among relatives tends to parallel their level of exposure to similar environments. Twin studies have rather consistently reported concordance rates for schizophrenia in monozygotic (MZ) twins that are several times higher than those for dizygotic (DZ) twins (e.g., Gottesman et al. 1987, Torrey et al. 1994). It is also noteworthy that the schizophrenia concordance rate for MZ twins reared apart is quite close to that for MZ twins reared together.
On the other hand, the number of such MZ twins reared apart is rather small. Moreover, it is unclear whether these twins who were reared in different settings may still have had significant contact with each other after they were separated, so that the
separation of shared genetic and environmental factors may not have been complete. The most conclusive available evidence for genetic factors in schizophrenia, therefore, comes from adoption studies. For example, Heston (1966) studied 47 adult adoptees who had been born in the USA to a schizophrenic mother, but separated shortly after birth. Five of these ‘index’ adoptees were found to have subsequently developed a diagnosis of schizophrenia, vs. none of 50 matched control adoptees who had been born to demographically matched, but psychiatrically healthy, mothers. More systematic adoption studies of schizophrenia have been carried out in Scandinavia. In Denmark, for example, Kety et al. (1994) were able to identify all individuals in the entire country who had been adopted away from their biological parents at an early age and subsequently were hospitalized with a diagnosis of schizophrenia. For each of these 74 schizophrenic adoptees, a ‘control’ adoptee was identified who was closely matched for age, gender, and the socioeconomic status of the adoptive home. The control adoptees’ biological parents had not been hospitalized for mental illness. Psychiatric diagnoses of over 1100 of the adoptees’ respective biological and adoptive relatives were made after careful review of psychiatric interviews and records. Significantly higher rates of schizophrenia were found in the biological (but not in the adoptive) relatives of schizophrenic adoptees than in the biological relatives of control adoptees (5.0 percent vs. 0.4 percent). The prevalence of schizotypal personality disorder was also significantly elevated among the schizophrenic adoptees’ biological relatives (Kendler et al. 1994). Moreover, rates of schizophrenia and related ‘spectrum’ disorders were significantly elevated even in the schizophrenic adoptees’ biological paternal half-siblings, who had not even shared the same womb as the schizophrenic adoptees. 
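The adoption-study comparisons above can be made concrete with a short calculation. The following is a minimal Python sketch, not the original authors' analysis: it computes the ratio of the two prevalence figures quoted for the Danish study (5.0 percent vs. 0.4 percent) and a one-sided Fisher exact probability for Heston's counts (5 of 47 index adoptees vs. 0 of 50 controls). The function names are illustrative, the Fisher exact test is an assumed choice of test, and treating the individuals as independent observations is a simplification.

```python
from math import comb


def risk_ratio(rate_index, rate_control):
    """Ratio of two prevalence rates (both in the same units, e.g. percent)."""
    return rate_index / rate_control


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact probability for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing `a` or more
    affected individuals in the first row, given the table margins.
    """
    row1 = a + b          # size of the index group
    row2 = c + d          # size of the control group
    affected = a + c      # total affected across both groups
    denom = comb(row1 + row2, affected)
    p = 0.0
    for k in range(a, min(row1, affected) + 1):
        p += comb(row1, k) * comb(row2, affected - k) / denom
    return p


# Prevalence in biological relatives: 5.0 percent vs. 0.4 percent (figures from the text).
kety_ratio = risk_ratio(5.0, 0.4)

# Heston (1966): 5 affected of 47 index adoptees vs. 0 affected of 50 controls.
heston_p = fisher_one_sided(5, 42, 0, 50)

print(f"Prevalence ratio (Danish study figures): {kety_ratio:.1f}")
print(f"One-sided Fisher exact p (Heston counts): {heston_p:.4f}")
```

Under these assumptions the prevalence ratio is about 12.5, and the probability of all five affected adoptees falling in the index group by chance is roughly 0.02.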
2.2 Association and Linkage Studies
In recent decades, advances in molecular genetics have enabled researchers to look more directly for genetic factors in schizophrenia. One strategy is to investigate ‘candidate genes’ for which there is a theoretical reason to suspect a role in schizophrenia. Thus several groups of investigators have looked for genes that influence susceptibility to certain infectious agents, since exposure to these agents, particularly during pre- or perinatal development, appears to increase risk for schizophrenia. For example, in a number of epidemiological studies, increased risk of exposure to influenza during the middle trimester of gestation has been found to be associated with increased risk of schizophrenic outcome. Individuals’ genotypes can powerfully affect their immunologic response to infections (and their mothers’ ability to combat infections while they are still in utero). Several studies have found that certain alleles of genes that play an important role in immune function, such as those in the HLA complex, are more prevalent in schizophrenia. McGuffin (1989) reviewed nine such studies and concluded that there was a highly significant association between the HLA A9 allele and risk for the paranoid subtype of schizophrenia. Murray et al. (1993) suggested that prenatal exposure to influenza may increase risk of schizophrenia in the offspring because, in genetically susceptible mothers, the flu virus stimulates production of maternal antibodies that cross the placenta and disrupt fetal brain development.
Even if one does not have a good ‘candidate’ gene for a disorder such as schizophrenia, however, it is still possible to apply the more indirect strategy of genetic ‘linkage.’ This strategy makes use of the fact that genes that are located very near one another on the same chromosome (and are therefore said to be closely ‘linked’) will tend to be inherited together. As the result of recent advances in molecular genetics, there are now thousands of identified genes that can be used to mark various regions of different chromosomes. One can then study large numbers of ‘multiplex’ families in which there are at least two family members with schizophrenia, in order to examine whether there is a significant tendency for schizophrenia and alleles for particular marker genes to be transmitted together within the same family. In principle, genetic linkage studies provide an elegant approach that makes it possible to identify disease genes whose role in a disease such as schizophrenia is a complete surprise to investigators (and thus would never have been chosen as ‘candidate’ genes for association studies). There are, however, some practical difficulties with linkage studies. One difficulty is that, because hundreds of different marker genes can be examined, there is a rather high probability of obtaining a spurious, or ‘false positive,’ linkage finding by chance.
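The multiple-testing problem just described can be illustrated with a short calculation. The sketch below assumes, for simplicity, that every marker is tested independently at the same nominal significance level; markers on the same chromosome are in fact correlated, and linkage studies usually report LOD scores with their own thresholds, so the figures are only indicative. The function names and the marker count of 300 are hypothetical.

```python
def prob_any_false_positive(n_markers, alpha):
    """Chance of at least one spurious 'hit' across n independent tests.

    Assumes no true linkage anywhere and independent tests, each with
    false-positive rate `alpha`.
    """
    return 1.0 - (1.0 - alpha) ** n_markers


def bonferroni_threshold(n_markers, family_alpha=0.05):
    """Per-marker significance level that keeps the overall
    false-positive rate at or below `family_alpha` (Bonferroni correction)."""
    return family_alpha / n_markers


n_markers = 300  # hypothetical number of marker genes examined
print(f"P(at least one false positive) at alpha = 0.05: "
      f"{prob_any_false_positive(n_markers, 0.05):.4f}")
print(f"Bonferroni per-marker threshold: {bonferroni_threshold(n_markers):.6f}")
```

With a few hundred markers tested at the conventional 0.05 level, a chance finding is close to certain under these assumptions, which is why a much stricter per-marker threshold is needed.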
In order to screen out such false positive findings, it is important to determine whether interesting linkage results can be confirmed in new, independent studies involving large numbers of multiplex families. In fact, linkage findings implicating genes on a particular region of chromosome 6 as risk factors for schizophrenia have recently been reported by several different research groups (e.g., see review by Gershon et al. 1998).
2.3 Etiologic Heterogeneity and Genotype–Environment Interactions
The search for genetic factors is complicated by several factors. It is likely, for example, that disorders such as schizophrenia are etiologically heterogeneous. That is, it is probable that the syndrome of schizophrenia can be produced by a number of different combinations of genes and/or environmental factors. Thus, while each of a number of different genes may well increase risk for schizophrenia, it is likely that no single gene is necessary for the production of most cases. One approach to the problem of heterogeneity is to identify characteristics that distinguish specific, genetically more homogeneous, subtypes of schizophrenia or bipolar disorder. Maziade et al. (1994), for example, found evidence for linkage at a locus on chromosome 11 with schizophrenia in one, but not the others, of several large pedigrees that they examined. The schizophrenics in the extended family that did show linkage were distinguished from those in the families that did not by having a particularly severe and unremitting form of schizophrenia. If this finding can be confirmed in other pedigrees, it would provide a valuable example of the subtyping strategy.
A related problem is that, while a particular gene or genes may significantly increase one’s risk for developing schizophrenia or bipolar disorder, it will usually not be sufficient to produce the disorder; that is, most individuals who carry a susceptibility gene will not themselves become ill. (In other words, there is ‘incomplete penetrance’ of the gene’s effects.) One strategy for dealing with this latter problem is to identify more sensitive, subclinical, phenotypes that indicate the presence of the gene even in people who do not develop the illness. For example, studies of schizophrenics’ families suggest that most schizophrenics carry a gene which leads to schizophrenia only 5–10 percent of the time, but causes abnormal smooth pursuit eye movements over 70 percent of the time. These eye movement dysfunctions should thus provide a better target for genetic linkage studies than schizophrenia itself (Holzman and Matthysse 1990).
A further complication is the likelihood of genotype–environment interactions. For example, there is evidence from dozens of studies that pre- and perinatal complications are significant risk factors for schizophrenia (e.g., Torrey et al. 1987).
Such complications are, of course, hardly unique to schizophrenia; this suggests that pre- or perinatal insults to the developing brain may interact with genetic liability factors to produce schizophrenia. Kinney et al. (1999), for example, found that schizophrenics were much more likely than were either control subjects or the schizophrenics’ own non-schizophrenic siblings to have both a major perinatal complication and eye tracking dysfunction. Moreover, these non-schizophrenic siblings tended to have either a history of perinatal complications or eye tracking dysfunction, but not both in the same sibling. This pattern of findings was consistent with a two-factor model in which perinatal brain injury and specific susceptibility genes often interact to produce schizophrenia.
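The incomplete-penetrance argument earlier in this section can be put in numerical terms. The sketch below is illustrative only: it assumes a single hypothetical susceptibility gene with the penetrance figures quoted in the text (5–10 percent for schizophrenia, and 70 percent taken as a lower bound for abnormal smooth pursuit eye movements) and computes the fraction of gene carriers who would be scored as unaffected under each phenotype definition. The function name is hypothetical.

```python
def missed_carrier_fraction(penetrance):
    """Fraction of gene carriers who do not show the phenotype.

    `penetrance` is the probability (0 to 1) that a carrier expresses
    the phenotype; carriers who do not express it look 'unaffected'.
    """
    if not 0.0 <= penetrance <= 1.0:
        raise ValueError("penetrance must lie between 0 and 1")
    return 1.0 - penetrance


# Penetrance values taken from the figures quoted in the text.
phenotypes = {
    "schizophrenia (5 percent penetrance)": 0.05,
    "schizophrenia (10 percent penetrance)": 0.10,
    "eye tracking dysfunction (70 percent penetrance)": 0.70,
}

for label, penetrance in phenotypes.items():
    missed = missed_carrier_fraction(penetrance)
    print(f"{label}: {missed:.0%} of carriers scored as unaffected")
```

On these figures, defining the phenotype as schizophrenia itself would leave at least nine in ten carriers looking unaffected, whereas the eye tracking phenotype would miss about three in ten, which is the sense in which it offers a more sensitive target for linkage analysis.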
3. Evidence for Genetic Factors in Bipolar Disorder As in the case of schizophrenia, several converging lines of evidence strongly implicate genetic factors in the etiology of bipolar disorder.
3.1 Family, Twin, and Adoption Studies There is a strong tendency for bipolar disorder to run in families, and the risk of bipolar disorder in a first-degree relative of a manic-depressive is about 8 percent (vs. about 1 percent in the general population). Further evidence for a high heritability of bipolar disorder is provided by twin studies, particularly three twin studies conducted in Scandinavia in recent decades. The average concordance rate for bipolar disorder in these latter studies was 55 percent in MZ vs. only 5 percent in DZ twin pairs (see review by Vehmanen et al. 1995). Complementary evidence for genetic factors in bipolar disorder is provided by adoption studies. For example, Mendlewicz and Ranier (1977) identified 29 adoptees with bipolar disorder, along with demographically matched controls who were either (a) psychiatrically normal or (b) had had polio during childhood. When the biological and adoptive parents of these adoptees were interviewed, significantly more cases of bipolar disorder, major depression, and schizoaffective disorder were found in the biological parents of bipolar adoptees than in the biological parents of either of the two control groups. The respective groups of adoptive parents, by contrast, did not differ significantly in the prevalence of these disorders.
3.2 Association and Linkage Studies Although there have been dozens of reports of linkage between bipolar disorder and genetic loci on various chromosomes, few of these reports have subsequently been confirmed. Among these few are linkages to particular regions of chromosomes X, 18 and 21; each of these linkages has been confirmed by several independent groups. While this suggests that genes in these regions significantly influence susceptibility for bipolar disorder, the effects of these genes on susceptibility may be modest in size (for reviews, see Gershon et al. 1998 and Pekkarinen 1998).
A key difficulty in identifying genes in bipolar disorder is the likely presence of etiologic heterogeneity. One approach to overcoming this challenge is to search for markers of genetic subtypes of bipolar disorder. For example, MacKinnon et al. (1998), after identifying a strong tendency for a subtype of bipolar disorder with concomitant panic disorder to run in families, found strong evidence for linkage of the panic-disorder subtype to marker genes on a region of chromosome 18. There was no evidence of such linkage for bipolar disorder without panic disorder. Other research has found that manic-depressive patients who respond well to lithium treatment represent a subtype that is etiologically more homogeneous, and has a stronger familial tendency, than non-lithium responders. Both linkage and association studies suggest that susceptibility to this lithium-responsive subtype is increased by certain alleles of the gene for phospholipase C, an enzyme important in the phosphoinositol cycle that is thought to be a therapeutic target of lithium (Alda 1999).
4. Genetic Counseling A recent study (Trippitelli et al. 1998) found that most patients with bipolar disorder and their unaffected spouses would be interested in receiving counseling about their own genetic risk, and that of their children. The majority of patients and spouses, for example, would take advantage of a test for susceptibility genes. Even when precise gene(s) involved in a particular case of schizophrenia or bipolar disorder are unknown, genetic counselors can provide a patient’s relatives with an estimated risk of developing the disorder, based on average figures from many different family studies. These risk figures, it should be noted, refer to a person’s lifetime risk—an important distinction (e.g., for schizophrenia, one’s risk has been cut roughly in half by age 30, and by age 50 it is extremely small). It is crucial that counseling be based on accurate differential diagnosis. Problems often encountered in counseling, such as counselees being confused by genetic information, or feeling fearful and embarrassed, may be heightened in families with bipolar disorder and schizophrenia, because these psychiatric disorders often carry a social stigma, and because many parents have (unfairly) been blamed for their child’s illness. For these reasons, Kessler (1980) emphasized the importance of (a) careful follow-up, to make sure that counseling was understood, and (b) the inclusion of professionals with good psychotherapeutic skills as part of the counseling team.
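The distinction between lifetime risk and the risk that remains at a given age can be made concrete with a small calculation. The sketch below is illustrative only: the function name residual_risk and the 10 percent lifetime risk are hypothetical and are not taken from any of the family studies cited here, while the value 0.5 stands for the statement above that roughly half of the risk for schizophrenia has passed by age 30. It also assumes that the only information available about the relative is that he or she has remained unaffected so far.

```python
def residual_risk(lifetime_risk: float, fraction_remaining: float) -> float:
    """Risk of later onset for a relative who is still unaffected.

    lifetime_risk: average lifetime risk for this class of relative (0..1).
    fraction_remaining: share of that lifetime risk that lies beyond the
        relative's current age (0..1).
    """
    if not 0.0 <= lifetime_risk <= 1.0 or not 0.0 <= fraction_remaining <= 1.0:
        raise ValueError("both arguments must lie between 0 and 1")
    already_passed = lifetime_risk * (1.0 - fraction_remaining)
    still_ahead = lifetime_risk * fraction_remaining
    if already_passed >= 1.0:
        raise ValueError("no unaffected relatives are possible at this age")
    # Condition on the relative not having become ill so far.
    return still_ahead / (1.0 - already_passed)


# Hypothetical lifetime risk of 10 percent (illustrative, not from a study);
# 0.5 reflects the statement that risk is roughly halved by age 30.
risk_at_30 = residual_risk(0.10, 0.5)
print(f"{risk_at_30:.3f}")  # prints 0.053
```

Because the lifetime risk is small, conditioning on the relative still being well changes the figure only slightly (about 5.3 percent rather than exactly 5 percent), which is why the remaining risk can be described informally as about half of the lifetime figure.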
5. Creativity and Liability for Schizophrenia and Bipolar Disorder There is increasing evidence, from converging lines of research, for the idea that genetic liability for schizophrenia and bipolar disorder is associated with unusual creative potential. This idea, which has long been the subject of theoretical speculation, has received empirical support from several complementary types of studies. For example, a number of studies involving non-clinical samples have reported that more creative subjects tend to score higher on personality test variables that are associated with liability for schizophrenia or bipolar disorder.
5.1 Creativity in Schizophrenics’ Biological Relatives Of even greater interest are studies that have found unusual creativity in samples of the healthier biological relatives of diagnosed schizophrenics. In an Icelandic sample, for example, Karlsson (1970) found that the biological relatives of schizophrenics were significantly more likely than people in the general population to be recognized in Who’s Who for their work in creative professions. In the adoption study noted earlier, Heston (1966) serendipitously discovered that, among the psychologically healthier ‘index’ adoptees (i.e., those who had a biological mother with schizophrenia), there was a subgroup of psychologically healthy individuals who had more creative jobs and hobbies than the control adoptees. Using a similar research design, Kinney et al. (2000) studied the adopted-away offspring of schizophrenic and control biological parents. The creativity of these adoptees’ actual vocational and avocational activities was rated by investigators who were blind to the adoptees’ personal and family histories of psychopathology. Real-life creativity was rated as significantly higher, on average, for subjects who, while not schizophrenic, did have signs of magical thinking, recurrent illusions or odd speech.
5.2 Creativity in Patients with Bipolar Disorder and Their Relatives There is also evidence for increased creativity among subjects with major mood disorders and their biological relatives. Studies of eminent writers, artists, and composers carried out in the USA, the UK, and France all found significantly higher rates of major mood disorders among these creators than among the general population (e.g., see Jamison 1990). Richards et al. (1988) extended this link by showing that measures of ‘everyday,’ or non-eminent, creativity were significantly higher, on average, in manic-depressive and cyclothymic subjects and their normal relatives than in control subjects who did not have a personal or family history of mood disorders.
Moreover, both creative artists and patients with mood disorders report that their creativity is significantly enhanced during periods of moderately elevated mood (e.g., Richards and Kinney 1990). These complementary findings suggest that the association between increased creative potential and genetic liability for bipolar disorder may extend not only to the millions of people with bipolar disorder, but also to tens of millions of others who, while not ill themselves, may carry genes for the disorder.
5.3 Implications of Link between Creativity and Genes for Schizophrenia and Bipolar Disorder It is important to determine what maintains the high prevalence of genes for bipolar disorder in the population, despite the high rates of illness and death that are associated with this disorder. One interesting possibility is that genes which increase liability for bipolar disorder may also be associated with personally and socially beneficial effects, such as increased drive and creativity. The research findings suggesting an association between genes for bipolar disorder and increased creativity are also potentially of great significance in terms of how patients and their families view greater liability for bipolar disorder, as well as for combatting the social stigma that is still often attached to the disorder. Parallel considerations apply in the case of schizophrenia, but perhaps with even greater force, because schizophrenia tends to be an even more chronic and disabling disease, to be associated with even lower fertility, and to carry an even greater social stigma. As rapid advances in molecular biology and discovery of genetic markers make it possible to detect major genes for schizophrenia (and to identify individuals who carry these genes), it will become increasingly important to know whether such genes are associated with positive, as well as negative, behavioral phenotypes or outcomes, and to understand what genetic and/or environmental modifiers affect how these genes are expressed. See also: Behavioral Genetics: Psychological Perspectives; Bipolar Disorder (Including Hypomania and Mania); Depression; Genetic Screening for Disease-related Characteristics; Genetics and Development; Genetics of Complex Traits Through the Life Cycle; Intelligence, Genetics of: Cognitive Abilities; Intelligence, Genetics of: Heritability and Causation; Mental Illness, Etiology of; Mental Illness, Genetics of; Personality Disorders; Schizophrenia, Treatment of
Bibliography
Alda A 1999 Pharmacogenetics of lithium response in bipolar disorder. J. Psychiatry Neurosci. 24(2): 154–8
American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders (DSM-IV), 4th edn. American Psychiatric Association, Washington, DC
Gershon E S, Badner J A, Goldin L R, Sanders A R, Cravchik A, Detera-Wadleigh S D 1998 Closing in on genes for manic-depressive illness and schizophrenia. Neuropsychopharmacology 18: 233–42
Gottesman I I, McGuffin P, Farmer A E 1987 Clinical genetics as clues to the ‘real’ genetics of schizophrenia. Schizophrenia Bulletin 13: 23–48
Heston L L 1966 Psychiatric disorders in foster home reared children of schizophrenic mothers. British Journal of Psychiatry 112: 819–25
Holzman P S, Matthysse S 1990 The genetics of schizophrenia: A review. Psychological Science 1(5): 279–86
Jamison K R 1990 Manic-depressive illness and accomplishment: Creativity, leadership, and social class. In: Goodwin F, Jamison K R (eds.) Manic-depressive Illness. Oxford University Press, Oxford, UK
Karlsson J L 1970 Genetic association of giftedness and creativity with schizophrenia. Hereditas 66: 177–81
Kendler K S, Gruenberg A M, Kinney D K 1994 Independent diagnoses of adoptees and relatives, using DSM-III criteria, in the provincial and national samples of the Danish adoption study of schizophrenia. Archives of General Psychiatry 51: 456–68
Kessler S 1980 The genetics of schizophrenia: a review. Schizophrenia Bulletin 6: 404–16
Kety S S, Wender P H, Jacobsen B, Ingraham L J, Jansson L, Faber B, Kinney D K 1994 Mental illness in the biological and adoptive relatives of schizophrenic adoptees. Replication of the Copenhagen study in the rest of Denmark. Archives of General Psychiatry 51: 442–55
Kinney D K, Richards R L, Lowing P A, LeBlanc D, Zimbalist M A, Harlan P 2000 Creativity in offspring of schizophrenics and controls: An adoption study. Journal of Creativity Research 13(1): 17–25
Matthysse S, Kidd K K 1976 Estimating the genetic contribution to schizophrenia. American Journal of Psychiatry 133: 185–91
Maziade M, Martinez M, Cliche D, Fournier J P, Garneau Y, Merette C 1994 Linkage on the 11q21–22 region in a severe form of schizophrenia. American Psychiatric Association Annual Meeting, New Research Programs and Abstracts, p. 97
McGuffin P 1989 Genetic markers: an overview and future perspectives. In: Smeraldi E, Belloni L (eds.) A Genetic Perspective for Schizophrenic and Related Disorders. EdiErmes, Milan
Mendlewicz J, Ranier J D 1977 Adoption study supporting genetic transmission of manic-depressive illness. Journal of the American Medical Association 222: 1624–7
Murray R M, Takei N, Sham P, O’Callaghan E, Wright P 1993 Prenatal influenza, genetic susceptibility and schizophrenia. Schizophrenia Research 9(2,3): 137
Pekkarinen P 1998 Genetics of bipolar disorder. Psychiatria Fennica 29: 89–109
Richards R L, Kinney D K 1990 Mood swings and everyday creativity. Creativity Research Journal 3: 202–17
Torrey E F 1987 Prevalence studies in schizophrenia. British Journal of Psychiatry 150: 598–608
Torrey E F, Bowler A E, Taylor E H, Gottesman I I 1994 Schizophrenia and Manic-depressive Disorder. Basic Books, New York
Trippitelli C L, Jamison K R, Folstein M R, Bartko J J, DePaulo J P 1998 Pilot study on patients’ and spouses’ attitudes toward potential genetic testing for bipolar disorder. American Journal of Psychiatry 155(7): 899–904
Vehmanen L, Katrio J, Lonnqvist J 1995 Twin studies on concordance for bipolar disorder. Clinical Psychiatry and Psychopathology 26: 107–16
Wyatt R J 1991 Neuroleptics and the natural course of schizophrenia. Schizophrenia Bulletin 17: 325–51
Mackinnon D F, Xu J, McMahon F J, Simpson S G, Stine O C, McInnis M G, De Paulo J R 1998 Bipolar disorder and panic disorder in families: an analysis of chromosome 18 data. American Journal of Psychiatry 155(6): 829–31
Kinney D K, Yurgelun-Todd D A, Tramer S J, Holzman P S 1998 Inverse relationship of perinatal complications with eye-tracking dysfunction in relatives of patients with schizophrenia: evidence for a two-factor model. American Journal of Psychiatry 155(7): 976–8
Richards R, Kinney D K, Lunde I, Benet M, Merzel A P C 1988 Creativity in manic-depressives, cyclothymes, their normal relatives and control subjects. Journal of Abnormal Psychology 97(3): 1–8
D. K. Kinney
Schizophrenia: Neuroscience Perspective Schizophrenia is a complex mental illness characterized by acute phases of delusions, hallucinations, and thought disorder, and chronically by apathy, flat affect, and social withdrawal. Schizophrenia affects 1 percent of the world’s population, independent of country or culture, and constitutes a severe public health issue (WHO 1975). Schizophrenic patients normally start to display symptoms in their late teens to early twenties, but the time of onset and the course of the illness are very variable (American Psychiatric Association 1987). Approximately a third of patients experience one acute episode after which they make a more or less full recovery. Another third are affected by the illness throughout their lives, but their symptoms are to some extent alleviated by anti-psychotic drugs. The remaining third are so chronically ill that they show little or no improvement, even with medication (Johnstone 1991).
1. Intellectual Function There is a striking deterioration in intellectual and cognitive function in schizophrenia (e.g., Johnstone 1991). Patients have difficulty in initiating and completing everyday tasks, being distracted easily and tending to give up when confronted by any obstacles. These deficits are similar to the problems in initiation and planning associated with frontal-lobe lesions (Shallice and Burgess 1991). This has led many researchers to suggest that the core deficit of schizophrenia is a failure to activate frontal cortex appropriately during cognitive tasks involving planning and decision making (‘task-related hypofrontality’), a notion supported in part by many functional imaging studies (see below). Schizophrenic patients also show deficits in attention and memory tasks that engage prefrontal, hippocampal, and medial temporal systems, even in drug-free populations (Saykin et al. 1994). In the rest of this entry we shall consider the direct evidence for brain abnormalities in schizophrenia.
2. Post-mortem Neuropathology Studies of post-mortem brains (Harrison 1999) have observed a decrease in overall brain size and have linked schizophrenia to structural abnormalities in the prefrontal cortex and temporal lobe, especially the hippocampus and amygdala. Several studies have found that schizophrenic brains tend to have enlarged lateral ventricles compared to nonschizophrenic brains. Histological studies have shown evidence for abnormal synaptic appearance in the cingulate and hippocampal pyramidal cells in schizophrenia. The most reproducible positive anatomical finding in postmortem hippocampal formation has been the reduced size of neuronal cell bodies in schizophrenia. However, these changes have not been found in all studies.
3. In Vivo Studies of Brain Structure 3.1 Computerized Tomography (CT) Studies The main finding from CT scan studies is that the lateral ventricles are enlarged in schizophrenic patients compared to normal controls (Van Horn and McManus 1992). Although the finding of increased ventricular volume in schizophrenic patients is widespread and replicated, the difference between schizophrenic patients and normal controls is small. Overlap with the normal population is appreciable. Correlation between CT findings and symptoms has been investigated (Lewis 1990). Enlarged ventricles have been associated with chronicity of illness, poor treatment response, and neuropsychological impairment in many, but not all, of these studies.
3.2 Magnetic Resonance Imaging (MRI) Studies MRI studies (see Harrison 1999) have tended to confirm the finding of enlarged ventricles in schizophrenic patients, but also permit a more detailed analysis of brain structure. Temporal lobe reductions have been reported, and are especially prominent in the hippocampus, parahippocampal gyrus, and the amygdala, but have not been observed in all studies. Robust relationships between temporal lobe reductions and clinical features have not yet been found. Some studies have found frontal lobe reductions in schizophrenic patients. Reversal or reduction of normal structural cerebral asymmetries may be related to the pathogenesis of schizophrenia (Crow 1995). Various unusual symmetries have been observed consistent with the hypothesis that failure to develop normal asymmetry is an important component of the pathology underlying some forms of schizophrenia. There are some MRI data that provide support for a hypothesis of disconnection between brain areas in schizophrenia. These results support the existence of a relative ‘fronto-temporal dissociation’ in schizophrenia. Evidence for such dissociation has also been obtained in functional imaging studies (see below).
All these studies used a ‘regions of interest’ approach in which measurements were restricted to prespecified brain regions. More recently, techniques have been developed in which differences can be detected automatically throughout the brain. Using such techniques, Andreasen et al. (1994) observed decreased thalamus size in schizophrenic patients, consistent with observations in post-mortem brains.
4. Functional Imaging 4.1 Resting Studies Using Positron Emission Tomography (PET) More sensitive measures of brain integrity can be obtained by measuring cerebral blood flow in a patient at rest. However, these results are difficult to interpret because the pattern of blood flow may be altered by the current mental state of the patient and by medication. Early studies of this sort observed a relative reduction of blood flow in the frontal lobes of patients with schizophrenia (Ingvar and Franzen 1974). This pattern of activity became known as hypofrontality. However, subsequent studies have not always replicated this observation. Several studies have looked at clinical correlates associated with hypofrontality, but the results are inconsistent. Among features showing a positive relationship with hypofrontality are chronicity, negative symptoms, and neuropsychological task impairment. Relations have also been observed between the pattern of blood flow and the symptomatology of patients at the time of scanning. For example, patients manifesting ‘psychomotor poverty’ showed reduced blood flow in dorsolateral prefrontal cortex (Liddle et al. 1992).
4.2 The Dopamine Hypothesis One of the most robust findings in schizophrenia research has been the observation that drugs which block dopamine receptors are effective in reducing the severity of symptoms such as hallucinations and delusions (Seeman 1986). This led to the dopamine (DA) hypothesis of schizophrenia, which posits that schizophrenia is caused by an overactivity of dopamine receptors (Van Rossum 1966). The best way to investigate the dopamine hypothesis is the in vivo visualization of radioactive ligand binding to quantify dopamine receptor densities in drug-naive patients using PET. These studies suggest that compared to healthy controls, patients with schizophrenia show a significant but mild increase in, and a larger variability of, D2 receptor density (Laruelle 1998).
There is also evidence that D1 receptor density is reduced in the prefrontal cortex of schizophrenic patients (Okubo et al. 1997).
5. Cognitive Activation Studies Functional neuroimaging experiments generally evaluate brain activity associated with performance of cognitive or sensori-motor tasks. Cognitive activation studies have provided further evidence for decreased frontal activity (‘hypofrontality’) in schizophrenia. In addition, there is increasing evidence that schizophrenics show abnormal integration between the frontal cortex and other brain regions, including the temporal lobes, the parietal lobes, and hippocampus, during cognitive tasks.
5.1 Task-based Studies of Executive Function Schizophrenia is characterized largely by impairments in planning and execution, and therefore tasks that involve this kind of planning and modification of behavior have been exploited in the scanner. Several studies have found that schizophrenic patients show reduced activity in the dorsolateral prefrontal cortex (DLPFC) while performing the Wisconsin card sorting task, a popular test of planning. In addition to decreased DLPFC activity, schizophrenic patients also showed abnormal responses in the temporal lobes and parahippocampal gyrus (Ragland et al. 1998). The results suggest that schizophrenia may involve a breakdown in the integration between the frontal and temporal cortex, which is necessary for executive and planning demands in healthy individuals. This interpretation moves away from the simple notion that dysfunction in isolated brain regions explains the cognitive deficits in schizophrenia, and towards the idea that neural abnormality in schizophrenia reflects a disruption of integration between brain areas.
5.2 Willed Action Willed actions are self-generated in the sense that the subject makes a deliberate and free choice to perform one action rather than another. Willed actions are a fundamental component of executive tasks. In normal subjects, willed actions are associated with increased blood flow in the DLPFC. Schizophrenic patients, especially those with negative signs, have difficulty with tasks involving free choices, and show an associated lack of activity in DLPFC. Activity in this region normalizes as the symptoms decrease (Spence et al. 1998). This suggests that hypofrontality depends on current symptoms.
Studies of willed action also suggest that the underactivity in DLPFC observed in some schizophrenic patients is accompanied by overactivity in posterior brain regions. There is evidence of a lack of the normal reciprocal interaction between the frontal and the superior temporal cortex in schizophrenia, which supports the notion of impaired functional integration (McGuire and Frith 1996). 5.3 Memory Tasks Memory impairments are an especially enduring feature of schizophrenia. Functional neuroimaging studies of memory have demonstrated hypofrontality, abnormal interaction between temporal and frontal cortex, and a dysfunctional cortico-cerebellar circuit in schizophrenic patients compared to control subjects.
The hippocampus is a brain structure that is well known to be involved in memory. Evidence for impaired hippocampal function in schizophrenia was found in a well-controlled functional imaging study (Heckers et al. 1998). In this study, schizophrenic patients failed to recruit the hippocampus during successful retrieval, unlike normal control subjects. The schizophrenic patients also showed a more widespread activation of prefrontal areas and parietal cortex during recollection than did controls. The authors propose that this overactivation represents an ‘effort to compensate for the failed recruitment of the hippocampus.’ This result supports the idea that neural abnormality in schizophrenia reflects a disruption of integration between brain areas. Fletcher et al. (1999) also found evidence for abnormal integration between brain areas in schizophrenia during the performance of a memory task. They demonstrated an abnormality in the way in which left prefrontal cortex influenced activity in left superior temporal cortex, and suggested that this abnormality was due to a failure of the anterior cingulate cortex to modulate the prefronto-temporal relationship.
6. Imaging Symptoms Functional neuroimaging is also useful for evaluating neural activity in patients experiencing specific psychotic symptoms, such as hallucinations and passivity.
6.1 Hallucinations Hallucinations, perceptions in the absence of external stimuli, are prominent among symptoms of schizophrenia. Functional neuroimaging studies of auditory hallucinations suggest that they involve neural systems dedicated to auditory speech processing as well as a distributed network of other cortical and subcortical areas (Dierks et al. 1999). There is also evidence that the activity associated with auditory hallucinations resembles that seen when normal subjects are using inner speech (McGuire et al. 1996).
6.2 Passivity Passivity symptoms, or delusions of control, in which patients claim that their actions and speech are being controlled by an external agent, are common in schizophrenia. Schizophrenic patients with passivity showed hyperactivation of inferior parietal lobe (BA 40), the cerebellum, and the cingulate cortex relative to schizophrenic patients without passivity and to normal controls. When patients no longer experienced passivity symptoms, a reversal of the hyperactivation of parietal lobe and cingulate was seen (Spence et al. 1997). Hyperactivity in parietal cortex may reflect the ‘unexpected’ nature of the movement experienced by patients. The movement feels as though it is being caused by an external force (Frith et al. 2000).
7. Conclusions There is considerable evidence for structural abnormalities in the brains of patients with schizophrenia, but the abnormalities identified so far are not specific to this disorder, are very variable and cannot easily be related to the symptoms. The neurotransmitter dopamine is clearly important in schizophrenia, but its precise role remains unclear. Studies of brain function are still at an early stage, but suggest that schizophrenia may be characterized by disorders of connectivity between cortical and subcortical regions. Some of the symptoms of schizophrenia can be understood in terms of these disconnections. Current advances in imaging techniques aimed at measuring connectivity in the brain are likely to have a major impact on our understanding of schizophrenia. See also: Mental Illness, Epidemiology of; Mental Illness, Etiology of; Mental Illness, Genetics of; Psychiatric Assessment: Negative Symptoms; Schizophrenia; Schizophrenia and Bipolar Disorder: Genetic Aspects; Schizophrenia, Treatment of
Bibliography
American Psychiatric Association 1987 Diagnostic and Statistical Manual of Mental Disorders (DSM-III-R), 3rd edn. American Psychiatric Association, Washington, DC
Andreasen N C, Arndt S, Swayze V II, Cizadlo T, Flaum M, O’Leary D S, Ehrhardt J C, Yuh W T 1994 Thalamic abnormalities in schizophrenia visualised through magnetic resonance image averaging. Science 266: 294–8
Crow T J 1995 Aetiology of schizophrenia: An evolutionary theory. International Clinical Psychopharmacology 10 (Suppl. 3): 49–56
Dierks T, Linden D E J, Jandi M, Formisano E, Goebel R, Lanfermann H, Singer W 1999 Activation of Heschl’s gyrus during auditory hallucinations. Neuron 22: 615–21
Fletcher P C, McKenna P J, Friston K J, Frith C D, Dolan R J 1999 Abnormal cingulate modulation of fronto-temporal connectivity in schizophrenia. NeuroImage 9: 337–42
Frith C D, Blakemore S-J, Wolpert D M 2000 Explaining the symptoms of schizophrenia: Abnormalities in the awareness of action. Brain Research Reviews 31: 357–63
Harrison P J 1999 The neuropathology of schizophrenia. Brain 122: 593–624
Heckers S, Rauch S L, Goff D, Savage C R, Schacter D R, Fischmaer A J, Alpert N A 1998 Impaired recruitment of the hippocampus during conscious recollection in schizophrenia. Nature Neuroscience 4: 318–23
Ingvar D H, Franzen G 1974 Distribution of cerebral activity in chronic schizophrenia. Lancet 2: 1484–86
Johnstone E C 1991 Defining characteristics of schizophrenia. British Journal of Psychiatry Supplement 13: 5–6
Laruelle M 1998 Imaging dopamine transmission in schizophrenia. A review and meta-analysis. Quarterly Journal of Nuclear Medicine 42: 211–21
Lewis S W 1990 Computerised tomography in schizophrenia 15 years on. British Journal of Psychiatry Supplement 9: 16–24
Liddle P F, Friston K J, Frith C D, Frackowiak R S 1992 Cerebral blood flow and mental processes in schizophrenia. Journal of the Royal Society of Medicine 85: 224–7
McGuire P K, Frith C D 1996 Disordered functional connectivity in schizophrenia. Psychological Medicine 26: 663–7
McGuire P K, Silbersweig D A, Wright I, Murray R M, Frackowiak R S, Frith C D 1996 The neural correlates of inner speech and auditory verbal imagery in schizophrenia: Relationship to auditory verbal hallucinations. British Journal of Psychiatry 169: 148–59
Okubo Y, Suhara T, Suzuki K, Kobayashi K, Inoue O, Terasaki O, Someya Y, Sassa T, Sudo Y, Matsushima E, Iyo M, Tateno Y, Toru M 1997 Decreased prefrontal dopamine D1 receptors in schizophrenia revealed by PET. Nature 385: 634–6
Ragland J D, Gur R C, Glahn D C, Censits D M, Smith R J, Lazarev M G, Alavi A, Gur R E 1998 Frontotemporal cerebral blood flow change during executive and declarative memory tasks in schizophrenia: A positron emission tomography study. Neuropsychology 12: 399–413
Saykin A J, Shtasel D L, Gur R E, Kester D B, Mozley L H, Stafiniak P, Gur R C 1994 Neuropsychological deficits in neuroleptic naive patients with first-episode schizophrenia. Archives of General Psychiatry 51: 124–31
Seeman P 1986 Dopamine/neuroleptic receptors in schizophrenia. In: Burrows G D, Norman T R, Rubenstein G (eds.) Handbook on Studies of Schizophrenia, Part 2. Elsevier, Amsterdam
Shallice T, Burgess P W 1991 Deficits in strategy application following frontal lobe damage in man. Brain 114: 727–41
Spence S A, Brooks D J, Hirsch S R, Liddle P F, Meehan J, Grasby P M 1997 A PET study of voluntary movement in schizophrenic patients experiencing passivity phenomena (delusions of alien control). Brain 120: 1997–2011
Spence S A, Hirsch S R, Brooks D J, Grasby P M 1998 Prefrontal cortex activity in people with schizophrenia and control subjects. Evidence from positron emission tomography for remission of ‘hypofrontality’ with recovery from acute schizophrenia. British Journal of Psychiatry 172: 316–23
Van Horn J D, McManus I C 1992 Ventricular enlargement in schizophrenia. A meta-analysis of studies of the ventricle:brain ratio. British Journal of Psychiatry 160: 687–97
Van Rossum J M 1966 The significance of dopamine receptor blockade for the mechanism of action of neuroleptic drugs. Archives Internationales de Pharmacodynamie et de Therapie 160: 492–94
World Health Organization 1975 Schizophrenia: A Multinational Study. WHO, Geneva, Switzerland
C. D. Frith and S-J Blakemore
Schizophrenia, Treatment of
Schizophrenia is a brain disorder of unknown origin which severely impairs numerous complex functions of the central nervous system (CNS) including thought, emotion, perception, cognition, and behavior. Schizophrenia is one of the most debilitating psychiatric disorders. Due to the high lifetime prevalence (about 1.5 percent), the typical onset in early adulthood, and the strong tendency towards a chronic course, the disorder requires a very high degree of health care provision. About one-quarter of hospital beds are occupied by schizophrenic patients, and the total costs of treatment are enormous (e.g., US$50 billion per year in the United States). Although there is no cure for schizophrenia, the combined administration of pharmacological and psychosocial interventions considerably improves outcome, enhances quality of life in the patients affected, and enables social integration to a large extent.
1. Causes and Pathophysiology of Schizophrenia
The causes of schizophrenia are unknown so far. Therefore, the term ‘schizophrenia’ refers to an empirically defined syndrome characterized by a combination of certain symptoms which occur in a particular temporal pattern. The idea that schizophrenia is a distinct brain disorder is rooted in Emil Kraepelin’s concept of dementia praecox (Kraepelin 1893). This concept emphasized one particular aspect of the disorder: the onset of persistent cognitive disturbances early in life. The term ‘schizophrenia’ was coined by Eugen Bleuler (1916), who wanted to emphasize the loss of coherence between thought, emotion, and behavior which represents another important feature of the disorder. Bleuler actually spoke about ‘schizophrenias,’ implying a group of diseases rather than one distinct disease entity. The present diagnostic systems compiled by the World Health Organization (ICD-10 1992) and the American Psychiatric Association (DSM-IV 1994) distinguish various subtypes of schizophrenia (see Table 1) which are classified according to particular symptom combinations, and according to certain aspects of the course and prognosis of the disease. However, these subtypes are defined in a phenomenological manner which implies neither distinct causes for any of the subtypes nor any particular treatment.
Although we do not know individual causes of schizophrenia, we know that the basis for the disorder is an interaction between genetic susceptibility factors and environmental components. According to numerous genetic studies the estimates of heritability converge on about 80 percent (Owen and Cardno 1999). However, although linkage studies have yielded evidence for a number of susceptibility loci on various chromosomes, neither a single gene nor a combination of genes has been discovered so far which is definitively involved. Similarly, a number of environmental factors (e.g., birth and pregnancy complications, viral infections) are likely to play a role, but details of this role remain to be established.
Although the symptoms of schizophrenia clearly demonstrate a severe disturbance of brain functions, only subtle alterations of brain morphology have been detected. The most stable finding is a slight enlargement of brain ventricles. Recent sophisticated studies combining morphological and functional brain imaging have been interpreted as indicating that subtle structural or functional lesions, particularly in prefrontal and limbic neural circuits, impair the integrity of these circuits (Andreasen et al. 1998). Although it is still a matter of debate whether these lesions are due to a neurodevelopmental or a neurodegenerative process (Lieberman 1999), it is clear that disturbed functioning of neuronal circuits goes along with altered neurotransmission. Dopamine, serotonin, and glutamate are the neurotransmitters most frequently implicated in the pathophysiology of schizophrenia. However, it is still not clear whether disturbed neurotransmission is the cause or the consequence of the major disease process.
Due to this scantiness of etiological and pathophysiological knowledge, the treatment of patients suffering from schizophrenia is based exclusively on empirical clinical knowledge. As outlined in detail below, the treatment approach is multimodal and based on the symptom patterns present in any individual patient (see Mental Illness, Etiology of).

Table 1 Subtypes of schizophrenia according to the major classification systems
ICD-10                            DSM-IV
Paranoid schizophrenia            Schizophrenia, paranoid type
Catatonic schizophrenia           Schizophrenia, catatonic type
Undifferentiated schizophrenia    Schizophrenia, undifferentiated type
Residual schizophrenia            Schizophrenia, residual type
Hebephrenic schizophrenia         -
-                                 Schizophrenia, disorganized type
Simple schizophrenia              -

2. Symptoms of Schizophrenia: the Targets for Treatment
Schizophrenia affects almost all areas of complex brain functions, and thought, perception, emotion, cognition, and behavior in particular. There have been numerous approaches to systematize the plethora of symptoms according to a variety of theoretical concepts. One widely accepted concept is the distinction between positive and negative symptoms (Andreasen et al. 1995, Table 2) (see Psychiatric Assessment: Negative Symptoms). This approach groups symptoms according to whether they represent a loss of or a deficiency in a normal brain function (negative symptoms) or the appearance of abnormal phenomena (positive symptoms). From the perspective of treatment the positive–negative dichotomy is of importance because these two domains react somewhat differently to treatment. In general, positive symptoms respond fairly well to treatment with antipsychotic drugs, whereas negative symptoms are rather difficult to influence. The latter is particularly problematic, because negative symptoms are usually the limiting factor for personal and social rehabilitation.

Table 2 Positive and negative symptoms of schizophrenia. Areas of symptoms are listed and selected examples appear in parentheses
Positive symptoms
  Hallucinations (auditory, somato-tactile, visual, olfactory)
  Delusions (persecutory, religious, thought broadcasting, thought insertion)
  Bizarre behavior (unusual clothing or appearance, repetitive, stereotyped behavior)
  Positive formal thought disorders (incoherence, illogicality, tangentiality)
Negative symptoms
  Affective flattening (unchanging facial expression, paucity of expressive gestures)
  Alogia (poverty of speech, increased response latency)
  Avolition-apathy (physical anergia, problems with personal hygiene)
  Anhedonia-asociality (reduction in social activities, sexual interest, closeness)
  Attention (social inattentiveness, inattentiveness during testing)

In addition to the typical symptoms, patients suffering from schizophrenia often present numerous other psychiatric problems. Among these, anxiety, sleep disturbances, obsessions and compulsions, depression, and drug or substance abuse are particularly frequent. Although these additional symptoms and problems may in many cases be the consequence of the typical symptoms rather than signs of independent additional psychiatric disorders, they complicate the course of the disease and often necessitate specific treatment approaches.
The course of schizophrenia varies considerably between patients. Typically, the disease begins during early adulthood with the appearance of rather unspecific negative symptoms resulting in social withdrawal and impaired social and scholastic adjustment. Months to years later positive symptoms such as delusions and hallucinations appear either gradually or abruptly. Across time, positive symptoms tend to show an episodic course, whereas negative symptoms either remain quite stable or even progress. The associated psychiatric problems mentioned above do not show any systematic temporal relationship to the course of schizophrenia itself.
Ever since the syndrome of schizophrenia was defined, data on the final outcome have varied considerably, depending mainly on the diagnostic concepts applied. There is no doubt that full remission occurs, but probably in less than 20 percent of patients. In the majority of patients schizophrenia takes a chronic course (Schultz and Andreasen 1999). The major benefit from modern treatment strategies is probably not a substantial increase in the rate of full remissions, but a significant reduction in the number of extremely ill patients, and in the severity and number of episodes characterized by prominent positive symptoms.
3. The Principles of Treatment
Treatment of schizophrenia is usually multimodal and comprises approaches from two major areas: drug treatment and psychosocial interventions. In general, appropriate drug treatment is a prerequisite for the ability of the patients to comply with and actively take part in psychosocial treatments. The more effective drug treatment is, the more specialized and sophisticated are the psychosocial interventions that can be successfully applied. Conversely, appropriate psychosocial treatment considerably improves compliance with drug treatment, because it enhances insight into the disease process, which initially is poor in many patients suffering from schizophrenia.
Antipsychotic drugs (also called neuroleptics) are the most important and most effective therapeutic weapon. The major targets of these drugs are positive symptoms, although newer substances might also reduce negative symptoms to some extent (see below). Treatment with antipsychotic drugs is usually a long-term treatment, whereas drugs to control accessory symptoms (anxiolytics, hypnotics, and antidepressants) are prescribed intermittently when needed. The use of electroconvulsive therapy was widespread before antipsychotic drugs were available, but today is very limited, although this treatment is effective in certain conditions (Fink and Sackeim 1996).
Among the psychosocial interventions, supportive and psychoeducative approaches are feasible for the majority of patients, whereas structured social skill training, family therapy, or complex programs including cognitive-behavioral therapies require a considerable degree of insight and compliance. Although recent psychosocial treatment approaches also incorporate psychodynamic aspects, classical psychodynamic psychotherapy is, in general, not effective and sometimes even counterproductive.
Prior to the discovery of antipsychotic drugs, most schizophrenic patients had been hospitalized for decades or even for their entire life. Today, highly differentiated treatment facilities are available, which to a large extent are community based. These include, besides classical hospitals and outpatient departments, short-term crisis intervention facilities, day-time clinics, and supported living facilities. This network of facilities considerably increases the chance for social integration and reduces the time spent overall in hospitals. However, recent aggressive approaches to shorten the duration of hospital stays dramatically might lead to a high frequency of rehospitalization, poor long-term outcome, and an overall increase in the costs of treatment.
An important limiting factor for all treatment approaches is the patients’ compliance. Compliance is often poor, particularly in severely ill patients. In schizophrenia noncompliance is particularly due to a lack of insight into the fact of being ill, reduced ability to actively take part in the treatment process due to negative symptoms, or active avoidance of treatment due to positive symptoms such as delusions or imperative voices. To enable stable treatment compliance in patients suffering from schizophrenia, an attitude of trust and empathy on the part of the people involved is an important prerequisite.
4. Particular Aspects of Treatment
4.1 Pharmacotherapy
The introduction of chlorpromazine into clinical practice in 1952 is the landmark in the pharmacological treatment of schizophrenia (Delay and Deniker 1952). Chemically, chlorpromazine is a phenothiazine and other substances of this group also possess antipsychotic properties. In the 1950s and 1960s, numerous antipsychotic drugs with different structures (in particular butyrophenones [e.g., haloperidol] and thioxanthenes [e.g., flupentixol]) were developed. Despite this diversity in chemical structure, these first-generation drugs do not differ qualitatively with respect to their profile of beneficial and undesired effects. They target mainly the positive symptoms of schizophrenia mentioned in Table 2 without preferentially affecting one or the other of these symptoms. Their effectiveness does not depend on the subtype of schizophrenia (see Table 1).
The common mode of action of the first-generation antipsychotics is believed to be blockade of the D2 subtype of dopamine receptors (Pickar 1995). Blockade of these receptors in the mesolimbic dopamine system is thought to cause a reduction in positive symptoms, and blockade in the nigrostriatal dopamine system is believed to be responsible for the typical side effects of these drugs. These are severe disturbances of motor behavior caused by a drug-induced dysfunction of the dopaminergic extrapyramidal system, which plays a pivotal role in the control of movements. Table 3 summarizes these extrapyramidal syndromes. Those which occur acutely are often very disturbing, but respond fairly well to treatment (e.g., acute dystonia to anticholinergic drugs), or at least cease upon drug discontinuation or dose reduction. Tardive extrapyramidal syndromes, however, are frequent (5–10 percent on long-term treatment with first-generation antipsychotics), often resistant to treatment approaches, and they usually persist if the antipsychotic drug is stopped. Another severe side effect of unknown cause, which might be related to the extrapyramidal system, is the neuroleptic malignant syndrome (Pelonero et al. 1998; see also Sect. 4.3). Extrapyramidal side effects are common to all first-generation drugs, whereas other, less specific side effects (e.g., sedation, weight gain, postural hypotension) occur with some but not all substances (see Psychopharmacotherapy: Side Effects).

Table 3 Extrapyramidal syndromes induced by first-generation antipsychotic drugs
Acute (within days)
  Drug-induced Parkinsonism (bradykinesia, increased muscular tone, tremor)
  Dystonia (sudden-onset, sustained muscular contractions preferably in the cephalic musculature)
  Akathisia (internal restlessness and urge to move, often coupled with pacing behavior)
Late or tardive (within months to years)
  Tardive dyskinesia or dystonia (irregular choreiform or dystonic movements in any voluntary muscle group)
  Tardive tics (repetitive, short-lasting, stereotyped movements or other kind of behavior [e.g., vocalization])
  Tardive myoclonus (brief asynchronous movements which are not stereotyped)

Due to the central role attributed to D2 dopamine receptor blockade, it was thought for a long time that antipsychotic effectiveness and extrapyramidal syndromes are inevitably linked to each other. However, in the late 1960s it turned out that the dibenzodiazepine clozapine is a potent antipsychotic drug that almost never induces extrapyramidal symptoms and is effective in a proportion of otherwise treatment-resistant patients (Angst et al. 1971). Moreover, it was shown that clozapine improves not only positive but to some extent also negative symptoms. Therefore, the introduction of clozapine into clinical practice was the first qualitative advance after the discovery of chlorpromazine, and it is justified to call this substance a second-generation antipsychotic drug. Unfortunately it turned out quickly that clozapine has a very particular and important drawback: it induces life-threatening agranulocytosis in about one percent of patients treated (Baldessarini and Frankenburg 1991). Therefore, the use of this drug is only possible when white blood cell counts are monitored at short intervals.
It is still not known why clozapine is a potent antipsychotic drug, but lacks extrapyramidal side effects. Among the possible reasons are that clozapine affects mainly the mesolimbic dopamine system, that it blocks D1 receptors to the same extent as D2 receptors, that it has a high affinity for D4 receptors, and that it also has high affinity for serotonin receptors. This theoretical framework was the basis for the development of various second-generation antipsychotics (see Table 4). Most of them have high affinities for serotonin receptors, although only one substance (olanzapine) is similar to clozapine in the equivalent binding to D1 and D2 receptors. So far there is no indication that any of these newer drugs carries a substantial risk of agranulocytosis, the most dangerous side effect of clozapine. Moreover, these substances have been shown to be effective when compared to first-generation antipsychotic drugs and indeed to induce fewer extrapyramidal side effects. For two of them (olanzapine and risperidone) there are data suggesting that they also might ameliorate negative symptoms. However, it still remains to be seen how the effectiveness of these newer drugs compares to clozapine.
Treatment with antipsychotic drugs in schizophrenic patients is usually a long-term treatment continuing for many years.
Intermittently, patients may need other psychotropic medications in addition. Most frequently anxiolytics, hypnotics, and antidepressants are administered. The recognition and treatment of intervening depressive episodes is particularly important because depressive and negative symptoms are sometimes difficult to differentiate, but need different treatment. When administering antidepressant drugs to schizophrenic patients, one should be aware that positive symptoms sometimes exacerbate during treatment with antidepressants.
4.2 Psychosocial Treatment
It is now generally accepted that psychosocial treatment of patients with schizophrenia is not an alternative to pharmacological treatment, but rather an important complementary approach (Bustillo et al. 2000). Psychosocial measures yield little benefit for positive symptoms; indeed, intensive psychodynamic psychotherapy might induce exacerbations. The most important and successful interventions are psychoeducative teaching and training in social skills and problem solving. Although it is still debated whether these treatments are directly effective against negative symptoms, there is no doubt that they considerably improve coping with these symptoms.
The major goal of educational approaches is to increase the knowledge about and the insight into the disease process and, as a consequence, to improve treatment compliance. Educational programs are focused on knowledge about the disease process and its course, the mode of action and side effects of medication, early signs of exacerbation such as restlessness and insomnia, and detailed information about the various community-based or hospital facilities available. These programs are particularly effective when important family members are educated as well. Moreover, approaches to improve the communication styles within the patient’s network of social relationships are often useful, although the concept that particular forms of communication (‘high expressed emotions’) play a causative role in schizophrenia has been seriously challenged.
Schizophrenic patients’ social skills are often poor. This is one major reason for the often dramatically reduced number of social contacts. Training of social skills in patients suffering from schizophrenia is performed in a group setting. Such groups are focused on the repetitive training of behavior in various social situations of importance for everyday life. In addition, some rather basic teaching of the theoretical background of successful social interaction might be included.
Impaired problem solving in schizophrenic patients is due to a combination of disturbances in divided attention, planning, and motivation.
Problem-solving training includes the analysis of problems relevant to daily life and, particularly, dividing a problem into subproblems that can be successively solved. Similar to the training of social skills, the steps of the solution are practiced repeatedly. In recent years, more complex psychotherapeutic approaches have been developed, combining cognitive, behavioral, or psychodynamic approaches, which, however, are suitable for only a minority of patients.
In particular for chronic patients who in earlier times often were hospitalized for decades, providing supervised residential living arrangements has proven to be of considerable benefit because in this setting the degree of self-sustainment can be maximized in parallel with the permanent provision of professional help.
4.3 Emergency Treatment
Schizophrenia is not only a debilitating but also a life-threatening disorder. If severe positive symptoms are present, such as verbal imperative hallucinations, delusions of persecution, or severe formal thought disturbances, the patient’s life is threatened by a misinterpretation of his or her environment (e.g., of dangerous objects or situations) and by suicidal intentions. Suicidality can also emerge when negative symptoms are prominent. Acute suicidality and an acute confusional psychotic state both necessitate immediate treatment, including the administration of anxiolytics, antipsychotics, or both, preferably in a hospital setting. Although every effort should be undertaken to get the informed consent of the patient, it is often necessary to initiate treatment against his or her explicit will. This has to be done in accordance with the legal regulations in the respective country.
Less frequently occurring life-threatening situations in patients suffering from schizophrenia are febrile catatonia and the neuroleptic malignant syndrome, a rare side effect of antipsychotic drugs (Pelonero et al. 1998). Both conditions show similarities, including fever, disturbances of motor behavior, and reduced responsiveness to external stimuli. The differential diagnosis is difficult, but very important because in febrile catatonia treatment with antipsychotic drugs is necessary, whereas such treatment has to be immediately stopped when a neuroleptic malignant syndrome is suspected. In both of these conditions admission to an intensive care unit is necessary.

Table 4 Relative receptor affinities of selected newer antipsychotic compounds compared to the first-generation drug haloperidol
                 Dopamine        Serotonin       Noradrenaline   Histamine  Acetylcholine
Compound         D1      D2      5HT1    5HT2    α1      α2      H1         M1
Haloperidol (a)  +++     ++++    ..      +       ++      ..      ..         ..
Clozapine        ++      ++      +       +++     +++     +++     ++++       +++++
Risperidone      ++      ++++    ++      +++++   +++     +++     ++         ..
Olanzapine       +++     +++     ..      +++     +++     ..      ++++       +++++
Seroquel         +       ++      ..      +       ++++    +       ++++       +++
Sertindole       +++     +++++   ++      ++++    +++     ..      +          +
Ziprasidone      +       ++++    *       ++++    ++      ..      +          ..
Adapted from Pickar, 1995
(a) Prototype first-generation antipsychotic drug

5. Future Perspectives on the Treatment of Schizophrenia
Despite the rapid development of the knowledge about human genetics, it is unlikely that we will see a major advance in our understanding of the genetic causes of schizophrenia in the near future. The reason is that the available epidemiological data render a major contribution of single genes quite unlikely and suggest that multiple genes contribute to the susceptibility to schizophrenia. However, it is likely that genetic investigations will improve the management of schizophrenia by enabling prediction of the treatment response to antipsychotic drugs. Very recently, for example, it has been shown that a combination of six polymorphisms in neurotransmitter-receptor-related genes enables a significant prediction of the treatment response to clozapine (Arranz et al. 2000). Although these preliminary studies must be replicated and extended to other drugs, it is probable that this approach will significantly enhance the success of drug treatment in schizophrenia.
It is also likely that the near future will bring new drugs which are the result of a ‘fine-tuning’ of the chemical structure and the binding profile to dopamine and serotonin receptors of those substances which are presently available. Another approach which is currently being pursued is to target glutamate receptors, although the first clinical trials are only in part encouraging (Farber et al. 1999). Based on a theory implicating the immune system in the pathophysiology of schizophrenia, studies are being started that use immunomodulation as a treatment. This is an interesting and promising approach, which also might help to understand the mode of action of presently available antipsychotic drugs because some of them influence the production of cytokines, which are important immune mediators (Pollmächer et al. in press). Another approach based on pathophysiological theories is to target phospholipids. These are an important cell-membrane component and there is some evidence suggesting that either their uptake is deficient or that they are excessively broken down in schizophrenic patients (Horrobin 1999).
See also: Nosology in Psychiatry
Bibliography
American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders. American Psychiatric Association, Washington, DC
Andreasen N C, Arndt S, Alliger R, Miller D, Flaum M 1995 Symptoms of schizophrenia. Methods, meanings, and mechanisms. Archives of General Psychiatry 52: 341–51
Andreasen N C, Paradiso S, O’Leary D S 1998 ‘Cognitive dysmetria’ as an integrative theory of schizophrenia: a dysfunction in cortical-subcortical-cerebellar circuitry? Schizophrenia Bulletin 24: 203–18
Angst J, Bente D, Berner P, Heimann H, Helmchen H, Hippius H 1971 Das klinische Wirkungsbild von Clozapin. Pharmakopsychiatrie 4: 201–11
Arranz M J, Munro J, Birkett J, Bolonna A, Mancama D, Sodhi M, Lesch K P, Meyer J F, Sham P, Collier D A, Murray R M, Kerwin R W 2000 Pharmacogenetic prediction of clozapine response. Lancet 355: 1615–16
Baldessarini R J, Frankenburg F R 1991 Drug therapy – Clozapine. A novel antipsychotic agent. New England Journal of Medicine 324: 746–54
Bustillo J, Keith S J, Lauriello J 2000 Schizophrenia: psychosocial treatment. In: Sadock B J, Sadock V A (eds.) Kaplan & Sadock’s Comprehensive Textbook of Psychiatry. 7th edn. Lippincott Williams and Wilkins, Baltimore, MD, pp. 1210–7
Delay J, Deniker P 1952 38 cas de psychoses traitées par la cure prolongée et continue de 4560 R.P. Annales médico-psychologiques 110: 364–741
Farber N B, Newcomer J W, Olney J W 1999 Glycine agonists: what can they teach us about schizophrenia? Archives of General Psychiatry 56: 13–17
Fink M, Sackeim H A 1996 Convulsive therapy in schizophrenia? Schizophrenia Bulletin 22: 27–39
Horrobin D F 1999 Lipid metabolism, human evolution and schizophrenia. Prostaglandins Leukotrienes and Essential Fatty Acids 60: 431–7
Kraepelin E 1910 Psychiatrie. Ein Lehrbuch für Studierende und Ärzte. 8. Auflage. Barth, Leipzig
Lieberman J A 1999 Is schizophrenia a neurodegenerative disorder? A clinical and neurobiological perspective. Biological Psychiatry 46: 729–39
Owen M J, Cardno A G 1999 Psychiatric genetics: progress, problems, and potential. Lancet 354 (Suppl. 1): SI11–14
Pelonero A L, Levenson J L, Pandurangi A K 1998 Neuroleptic malignant syndrome: a review. Psychiatric Services 49: 1163–72
Pickar D 1995 Prospects for pharmacotherapy of schizophrenia. Lancet 345: 557–62
Pollmächer T, Haack M, Schuld A, Kraus T, Hinze-Selch D in press Effects of antipsychotic drugs on cytokine networks. Journal of Psychiatric Research
Schultz S K, Andreasen N C 1999 Schizophrenia. Lancet 353: 1425–30
World Health Organization 1996 International Statistical Classification of Diseases. World Health Organization, Geneva, Switzerland
T. Pollmächer
School Achievement: Cognitive and Motivational Determinants
Although the analysis of school achievement (SA), including its structure, determinants, correlates, and consequences, has always been a major issue of educational psychology, interest in achievement as a central outcome of schooling has considerably increased during the last decade (see also School Outcomes: Cognitive Function, Achievements, Social Skills, and Values). In particular, large-scale projects like TIMSS (Third International Mathematics and Science Study) (Beaton et al. 1996) have had a strong and enduring impact on educational policy as well as on educational research and have also raised the public interest in scholastic achievement.
1. School Achievement and its Determinants: An Overview
SA can be characterized as cognitive learning outcomes which are products of instruction or aimed at by instruction within a school context. Cognitive outcomes mainly comprise procedural and declarative knowledge but also problem-solving skills and strategies. The following facets, features, and dimensions of SA, most of which are self-explanatory, can be distinguished: (a) episodic vs. cumulative; (b) general/global vs. domain-specific; (c) referring to different parts of a given subject (in a foreign language e.g., spelling, reading, writing, communicating, or from the competency perspective, grammatical, lexical, phonological, and orthographical); (d) test-based vs. teacher-rated; (e) performance-related vs. competence-related; (f) actual vs. potential (what one could achieve, given optimal support); (g) different levels of aggregation (school, classroom, individual level); (h) curriculum-based vs. cross-curricular or extracurricular.
Figure 1 shows a theoretical model that represents central determinants of SA. Cognitive and motivational determinants are embedded in a complex system of individual, parental, and school-related determinants and depend on the given social, classroom, and cultural context. According to this model, cognitive and motivational aptitudes have a direct impact on learning and SA, whereas the impact of the other determinants in the model on SA is only indirect.
Figure 1 Interplay of individual cognitive and motivational and other determinants of SA
2. Cognitive Determinants
2.1 Intelligence
Intelligence (see Intelligence, Prior Knowledge, and Learning) is certainly one of the most important determinants of SA. Most definitions of intelligence refer to abstract thinking, ability to learn, and problem solving, and emphasize the ability to adapt to novel situations and tasks. There are different theoretical perspectives, for example Piagetian and neo-Piagetian approaches, information-processing approaches, component models, contextual approaches, and newer integrative conceptions such as Sternberg’s triarchic theory of intelligence, or Gardner’s model of multiple intelligences. Among them, the traditional psychometric approach is the basis for the study of individual differences. In this approach quantitative test scores are analyzed by statistical methods such as factor analysis to identify dimensions or factors underlying test performance (Sternberg and Kaufman 1998). Several factor theories have been proposed, ranging from a general-factor model to various multiple-factor models. One of the most significant contributions has been a hierarchical model based on second-order factors that distinguishes between fluid and crystallized abilities. Fluid intelligence refers to basic knowledge-free information-processing capacity such as detecting relations within figural material, whereas crystallized intelligence reflects influences of acculturation such as verbal knowledge and learned strategies. Fluid and crystallized abilities can both be seen as representing general intelligence, as some versions of a complete hierarchical model would suggest (see Gustafsson and Undheim 1996). Substantial correlations (ranging in size from r = 0.50 to r = 0.60) between general intelligence and SA have been reported in many studies.
That is, more than 25 percent of the variance in post-test scores is accounted for by intelligence, with the exact share depending on student age, achievement criteria, or the time interval between the measurement of intelligence and of SA. General intelligence, especially fluid reasoning abilities, can be seen as a measure of flexible adaptation of strategies in novel situations and complex tasks which put heavy demands and processing loads on problem solvers. The close correlation between intelligence and SA has formed the basis for the popular paradigm of underachievement and overachievement: Students whose academic achievement is lower than predicted on the basis of their level of intelligence are characterized as underachievers; in the reverse case as overachievers. This concept, however, has been criticized as it focuses on intelligence as the sole predictor of SA although SA is obviously determined by many other individual as well as instructional variables. Intelligence is related to quality of instruction. Low
instructional quality forces students to fill in gaps for themselves, detect relations, infer key concepts, and develop their own strategies. On the whole, more intelligent students are able to recognize solution-relevant rules and to solve problems more quickly and more efficiently. Additionally, it is just that ability that helps them to acquire a rich knowledge base which is ‘more intelligently’ organized and more flexibly utilizable and so has an important impact on subsequent learning processes. Accordingly, intelligence is often more or less equated with learning ability (see Gustafsson and Undheim 1996). Beyond general intelligence, specific cognitive abilities do not seem to have much differential predictive power, although there is occasional evidence for a separate contribution of specific abilities (Gustafsson and Undheim 1996). In Gardner’s theory of multiple intelligences there are seven abilities (logical-mathematical, linguistic, spatial, musical, bodily-kinesthetic, interpersonal, and intrapersonal intelligence). Among domain-specific abilities, reading ability, which is related to linguistic intelligence in Gardner’s model, has some special significance. Specific abilities such as phonological awareness have turned out to be early predictors of reading achievement.
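The arithmetic behind the 'more than 25 percent' figure, and the logic of the underachievement/overachievement paradigm described above, can be illustrated with a small sketch. It is not taken from the studies cited: the function names, the sample scores, and the cutoff of one standard deviation are all assumptions chosen for illustration. The share of variance accounted for is the squared correlation, so r = 0.50 to r = 0.60 corresponds to 25 to 36 percent. Under- and overachievers are identified here, as one plausible operationalization, by the residual of achievement regressed on intelligence.

```python
from statistics import mean, stdev


def variance_explained(r):
    """Share of variance accounted for by a predictor correlating r with the outcome."""
    return r ** 2


def classify_achievers(intelligence, achievement, cutoff=1.0):
    """Label each student by the standardized residual of achievement
    regressed on intelligence (simple least-squares line).

    cutoff is a hypothetical threshold in residual standard deviations.
    """
    mx, my = mean(intelligence), mean(achievement)
    sxx = sum((x - mx) ** 2 for x in intelligence)
    sxy = sum((x - mx) * (y - my) for x, y in zip(intelligence, achievement))
    slope = sxy / sxx
    intercept = my - slope * mx

    # Residual = observed achievement minus achievement predicted from intelligence.
    residuals = [y - (intercept + slope * x) for x, y in zip(intelligence, achievement)]
    spread = stdev(residuals)

    labels = []
    for res in residuals:
        z = res / spread
        if z <= -cutoff:
            labels.append("underachiever")
        elif z >= cutoff:
            labels.append("overachiever")
        else:
            labels.append("as predicted")
    return labels


if __name__ == "__main__":
    # Invented scores: five students on the regression line, two far off it.
    iq = [90, 100, 110, 120, 130, 110, 110]
    sa = [45, 50, 55, 60, 65, 75, 35]
    print(variance_explained(0.50), variance_explained(0.60))
    print(classify_achievers(iq, sa))
```

The sketch also makes the criticism noted above concrete: every student's label depends on a single predictor, so any influence of prior knowledge, motivation, or instruction ends up in the residual and is read as under- or overachievement.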
2.2 Learning Styles and Learning Strategies
Styles refer to characteristic modes of thinking that develop in combination with personality factors (see Cognitive Styles and Learning Styles). While general cognitive or information-processing styles are important constructs within differential psychology, learning styles are considered as determinants of SA. Learning styles are general approaches toward learning; they represent combinations of broad motivational orientations and preferences for strategic processing in learning. Prominent learning style conceptions mainly differ with respect to intrinsic and extrinsic goal orientations and surface processing vs. deep processing. Intrinsic vs. extrinsic motivation deals with whether task fulfillment is a goal of its own or a means for superordinate goals. In deep processing, learners try to reach a comprehension of tasks as complete as possible to incorporate their representations into their own knowledge structures and give them personal significance. In surface processing, learners confine processing to the point where they can reproduce the material in an examination or achievement situation as well as possible (Snow et al. 1996). Compared to general learning styles, learning strategies are more specific: They represent goal-oriented endeavors to influence one’s own learning behavior. While former research mostly dealt with simple study skills that were seen as observable behavioral techniques, now cognitive and metacognitive strategies are included. Learning strategies therefore comprise cognitive strategies (rehearsal, organization, elaboration)
and metacognitive strategies (planning, monitoring, regulation) as well as resource-oriented strategies (creating a favorable learning environment, controlling attention, and sustaining concentration) (Snow and Swanson 1992). Frequently, learning strategies are assessed by means of questionnaires. In spite of the great importance learning strategies have for understanding knowledge acquisition and learning from instruction, only a few empirical data concerning the relation between learning strategies and SA are available. Most studies done in the context of schools and universities have shown positive but small correlations between learning strategies and SA. One reason may be that assessment by questionnaires is too general. On the whole, the exact conditions under which learning strategies are predictive of achievement have still to be worked out. Besides motivational and metacognitive factors, epistemological beliefs are important determinants for learning strategies and achievement. They refer to beliefs about the nature of knowledge and learning such as: knowledge acquisition depends on innate abilities, knowledge is simple and unambiguous, and learning is quick. Epistemological beliefs can influence quality of achievement and persistence on difficult tasks (Schommer 1994).
2.3 Prior Knowledge
Although there have been early attempts to identify knowledge prerequisites for learning, the role of prior knowledge (see Intelligence, Prior Knowledge, and Learning) as a determinant of SA has long been neglected. Since the early 1980s, research on expertise has convincingly demonstrated that superior performance of experts is mainly caused by the greater quantity and quality of their knowledge bases (see Expertise, Acquisition of). Prior knowledge comprises not only domain-specific content knowledge, that is, declarative, procedural, and strategic knowledge, but also metacognitive knowledge, and it refers to explicit knowledge as well as to tacit knowledge (Dochy 1992). Recent research has shown that task-specific and domain-specific prior knowledge often has a higher predictive power for SA than intelligence: 30 to 60 percent of variance in post-test scores is accounted for by prior knowledge (Dochy 1992, Weinert and Helmke 1998). Further results concern the joint effects of intelligence and prior knowledge on SA. First, there is a considerable degree of overlap in the predictive value of intelligence and prior knowledge for SA. Second, lack of domain-specific knowledge cannot be compensated for by intelligence (Helmke and Weinert 1999). But learning is hindered not only by a lack of prior knowledge, but also by misconceptions many learners have acquired in their interactions with everyday problems. For example, students often have naive
convictions about physical phenomena that contradict fundamental principles like conservation of motion. These misconceptions, which are deeply rooted in people’s naive views of the world, often are in conflict with new knowledge to be acquired by instruction. To prevent and overcome these conflicts and to initiate processes of conceptual change is an important challenge for classroom instruction.
3. Motivational Determinants
3.1 Self-concept of Ability
The self-concept of ability (largely equivalent to subjective competence, achievement-related self-confidence, expectation of success, self-efficacy; see Self-concepts: Educational Aspects; Self-efficacy: Educational Aspects) represents the expectancy component within the framework of an expectancy × value approach, according to which subjective competence (expectancy aspect) and subjective importance (value aspect) are central components of motivation. Substantial correlations between self-concept of ability and SA have been found. Correlations are higher the more domain-specifically the self-concept of ability is conceptualized, the higher it is, and the older the pupils are. Self-concept of ability is negatively related to test anxiety (see Test Anxiety and Academic Achievement) and influences scholastic performance by means of various mechanisms: Students with a high self-concept of ability (a) initiate learning activities more easily and quickly and are less prone to procrastination, (b) are more apt to continue learning and achievement activities in difficult situations (e.g., when a task is unexpectedly difficult), (c) show more persistence, (d) are better protected against interfering cognitions such as self-doubt and other worries (Helmke and Weinert 1997).
3.2 Attitude towards Learning, Motivation to Learn, and Learning Interest
There are several strongly interrelated concepts that are associated with the subjective value of the domain or subject under consideration, of the respective activities, or, on a more generalized level, of teachers and school. The value can refer to the affective aspect of an attitude, to subjective utility, relevance, or salience. In particular, attitude toward learning means the affective (negative or positive) aspect of the orientation towards learning; interest is a central element of self-determined action and a component of intrinsic motivation, and motivation to learn comprises subjective expectations as well as incentive values (besides anticipated consequences like pride, sorrow, shame, or reactions of significant others, the
incentive of learning action itself, i.e., interest in activity). Correlations between SA and those constructs are found to be positive but not very strong (the most powerful is interest with correlations in the range of r = 0.40). These modest relations indicate that the causal path from interest, etc. to SA is long and complex and that various mediation processes and context variables must be taken into account (Helmke and Weinert 1997).
3.3 Volitional Determinants
Motivation is a necessary but often not sufficient condition for the initiation of learning and for SA. To understand why some people, in spite of sufficient motivation, fail to transform their learning intentions into corresponding learning behavior, volitional concepts have proven to be helpful. Recent research on volition has focused on forms of action control, especially the ability to protect learning intentions against competitive tendencies, using concepts like ‘action vs. state orientation’ (Kuhl 1992). The few empirical studies that have correlated volitional factors with SA demonstrate a nonuniform picture with predominantly low correlations. This might be due to the fact that these variables are primarily significant for self-regulated learning (see Self-regulated Learning) and less important in the typical school setting, where learning activities and goals are (at least for younger pupils) strongly prestructured and controlled by the teacher (Corno and Snow 1986). Apart from that, one certainly cannot expect simple, direct, linear correlations between volitional characteristics and SA, but rather complex interactions and manifold possibilities of mutual compensation, e.g., of inefficient learning strategies by increased effort.
4. Further Perspectives
In Fig. 1 only single determinants and ‘main’ effects of cognitive and motivational determinants on SA were considered. Actually, complex interactions and context specificity as well as the dynamic interplay between variables have to be taken into account. The following points appear important:
(a) Interactions among various individual determinants. For example, maximum performance necessarily requires high degrees of both intelligence and effort, whereas in the zone of normal/average achievement, lack of intelligence can be compensated (as long as it does not drop below a critical threshold value) by increased effort (and vice versa).
(b) Aptitude × treatment interactions and classroom context. Whether test anxiety exerts a strong negative or a low negative impact on SA depends on aspects of the classroom context and on the characteristics of instruction. For example, high test-anxious pupils appear to profit from a high degree of structuring and suffer from a too open, unstructured learning atmosphere, whereas the reverse is true for self-confident pupils with a solid base of prior knowledge (Corno and Snow 1986). Research has shown a large variation between classrooms, concerning the relation between anxiety and achievement as well as between intelligence and achievement (Helmke and Weinert 1999). A similar process of functional compensation has been demonstrated for self-concept (Weinert and Helmke 1987).
(c) Culture specificity. Whereas the patterns and mechanisms of basic cognitive processes that are crucial for learning and achievement are probably universal, many relations shown in Fig. 1 depend on cultural background. For example, the ‘Chinese learner’ not only shows (on average) a higher level of effort, but the functional role of cognitive processes such as rehearsal is different from the equivalent processes of Western students (Watkins and Biggs 1996).
(d) Dynamic interplay. There is a dynamic interplay between SA and its individual motivational determinants: SA is affected by motivation, e.g., self-concept, and affects motivation itself. From this perspective, SA and its determinants change their states as independent and dependent variables. The degree to which the mutual impact (reciprocity) of academic self-concept and SA is balanced has been a matter of controversy (skill development vs. self-enhancement approach).
See also: Academic Achievement: Cultural and Social Influences; Academic Achievement Motivation, Development of; Cognitive Development: Learning and Instruction; Intelligence, Prior Knowledge, and Learning; Learning to Learn; Motivation, Learning, and Instruction; Test Anxiety and Academic Achievement
Bibliography
Beaton A E, Mullis I V S, Martin M O, Gonzales D L, Smith T A 1996 Mathematics Achievement in the Middle School Years. TIMSS International Study Center, Boston
Corno L, Snow R E 1986 Adapting teaching to individual differences among learners. In: Wittrock M C (ed.) Handbook of Research on Teaching, 3rd edn. Macmillan, New York, pp. 255–96
Dochy F J R C 1992 Assessment of Prior Knowledge as a Determinant for Future Learning: The Use of Prior Knowledge State Tests and Knowledge Profiles. Kingsley, London
Gustafsson J-E, Undheim J O 1996 Individual differences in cognitive functions. In: Berliner D C, Calfee R C (eds.) Handbook of Educational Psychology. Simon & Schuster Macmillan, New York, pp. 186–242
Helmke A, Weinert F E 1997 Bedingungsfaktoren schulischer Leistungen. In: Weinert F E (ed.) Enzyklopädie der Psychologie. Pädagogische Psychologie. Psychologie des Unterrichts und der Schule. Hogrefe, Göttingen, Germany, pp. 71–176
Helmke A, Weinert F E 1999 Schooling and the development of achievement differences. In: Weinert F E, Schneider W (eds.) Individual Development from 3 to 12. Cambridge University Press, Cambridge, UK, pp. 176–92
Kuhl J 1992 A theory of self-regulation: Action versus state orientation, self-discrimination, and some applications. Applied Psychology: An International Review 41: 97–129
Schommer M 1994 An emerging conceptualization of epistemological beliefs and their role in learning. In: Garner R, Alexander P A (eds.) Beliefs about Text and Instruction with Text. Erlbaum, Hillsdale, NJ, pp. 25–40
Snow R E, Corno L, Jackson D 1996 Individual differences in affective and conative functions. In: Berliner D C, Calfee R C (eds.) Handbook of Educational Psychology. Simon & Schuster Macmillan, New York, pp. 243–310
Snow R E, Swanson J 1992 Instructional psychology: Aptitude, adaptation, and assessment. Annual Review of Psychology 43: 583–626
Sternberg R J, Kaufman J C 1998 Human abilities. Annual Review of Psychology 49: 479–502
Watkins D A, Biggs J B 1996 The Chinese Learner: Cultural, Psychological, and Contextual Influences. CERC and ACER, Hong Kong
Weinert F E, Helmke A 1987 Compensatory effects of student self-concept and instructional quality on academic achievement. In: Halisch F, Kuhl J (eds.) Motivation, Intention, and Volition. Springer, Berlin, pp. 233–47
Weinert F E, Helmke A 1998 The neglected role of individual differences in theoretical models of cognitive development. Learning and Instruction 8(4): 309–23
A. Helmke and F.-W. Schrader
School Administration as a Field of Inquiry
Inquiry on the administration of education reflects a variety of intellectual and social influences. Those influences and the main trends in the field’s scholarship are examined.
1. The Ecology of an Applied Field of Inquiry
The ecology of an applied field of inquiry reflects its indigenous theory and research and that of related fields, demands from the worlds of policy, practice, and professional preparation, and the spirit of the times, which usually mirrors societal and global trends. School administration (often educational administration) is a field of many specializations in such areas as politics, organizational studies, fiscal affairs, law, and philosophical issues or, in less discipline-oriented areas, school effectiveness, leadership and supervision, human resource management and labor relations, and equity issues.
Explanations about subject matter are developed in areas of this sort. As in other fields, theoretical plausibility is judged using various logical and evidentiary criteria. However, applied fields also attend to relevance for practice and implications for assorted conceptions of organizational improvement.
2. A Brief History
Educational administration became a recognizable academic field in the early twentieth century in North America and later in other parts of the world (Campbell et al. 1987, Culbertson 1988). Initially, the field was oriented to practical matters and broad issues reflecting pedagogical and societal values. An example was democratic administration, especially popular before and after World War II. In the 1950s, the field began to adopt social science theories and methods. In the mid-1970s, the subjectivistic, neo-Marxist (largely as critical theory), and identity politics perspectives, already popular in the social sciences and humanities, reached the field, with postmodernism a later addition. These perspectives typically were critical of science, which generated extensive debate. The social science emphasis led to specialization along disciplinary lines that resulted in considerable research, much of it reported in the Encyclopedias of Educational Research issued in roughly 10-year cycles, with 1992 the most recent, in two International Encyclopedias of Education issued in 1975 and 1994, and in the first Handbook of Research on Educational Administration (Boyan 1988). The second Handbook (Murphy and Louis 1999) was more fragmented and factious, often mixing research with philosophical commentary.
3. Current Trends
Current trends in inquiry into school administration include conflicting views of knowledge and ethics, efforts to bridge the gaps between theory on the one hand and policy and practice on the other, more attention to making sense of complexity, and an emerging literature on comparative international aspects of school administration.
3.1 Conflicting Views
The controversies about knowledge and appropriate methods of seeking it, and about ethics, are ultimately philosophical. The contending positions are similar across the social sciences and humanities, although each field’s literature reflects its peculiarities. The main strands of contending thought are grounded in different general philosophies. Subjectivism, a version of idealism, stresses mind, reason,
and intuition. Science is criticized for devaluing humanistic concerns in favor of objectivity. In ethics, there is usually a hierarchy of values with the highest being absolute. In educational administration, Greenfield was this movement’s leading early figure, while Hodgkinson remains its best known scholar, with special contributions to ethical theory (Willower and Forsyth 1999; see also relevant entries in the 1992 Encyclopedia of Educational Research and the 1994 International Encyclopedia of Education). Critical theory, with its historical roots in the Frankfurt School’s search for a more contemporary Marxism, is devoted to a reformist agenda (revolution is out of fashion) and an ethic of emancipation. Its ideology now goes beyond class to stress race and gender, although advocates of identity politics may not endorse critical theory. Suspicious of science, described as serving the purposes of the ruling class, critical theory’s literature attends mainly to unjust social arrangements. There are many adherents across subfields of education, with William Foster noteworthy in educational administration. Generally, they contend that schools serve the powerful, giving short shrift to the disadvantaged. The antidote is political action, including radicalized educators. Postmodernists and poststructuralists reject metanarratives or broad theories of every kind, including scientific and ethical ones, believing such theories totalize and ignore difference. This leads to an emphasis on text or words; all that is left after efforts to understand and improve the world are disallowed. Texts are deconstructed, or examined for their assumptions, including the ways they oppress and ‘trivialize’ otherness, by what is said and omitted.
This view has only recently received much attention in educational administration: the August 1998 issue of Educational Administration Quarterly was on postmodernism and the September 1998 Journal of School Leadership featured a debate on that perspective. In applied fields such as administration, postmodernism may be selectively combined with critical theory. This was discussed by Alvesson and Deetz (1996) and illustrated in the June 1998 issue of Administrative Science Quarterly on critical perspectives, where the writings of Foucault on power and control were often cited. It remains to add that Derrida, perhaps the leading scholar in the postmodern-poststructuralist camp, recently (1994) appeared to argue for a relaxing of strictures against metanarratives to allow for radicalized critique, which is what deconstruction has been all along. This suggests recognition that postmodern relativism and nihilism have made it irrelevant to the world of human activity and practice. Adherents of the views sketched often associate science with positivism, a long-dead perspective that flourished in the Vienna Circle (1924–1936). Using stringent standards of verification, positivists saw metaphysics as pointless, and values as mere preferences. Although it held on longer in fields such as psychology, positivism lost favor and was folded into the more trenchant analytical philosophy. Nevertheless, positivism has been disinterred to serve as a target in philosophical disputes. Naturalistic and pragmatist philosophies have been the main sources of support for proponents of scientific inquiry in educational administration. Science is seen as a human activity that seeks explanations of how things work that can be subjected to public assessment. It is an open, growing enterprise that is fallible but self-corrective, as better methods and theories displace older ones. Based on logic and evidence, its results have been highly successful from the standpoints of the development of plausible theories and of benefiting humankind. Claims about uncovering an ultimate reality or final truths are not made; they are inconsistent with a self-rectifying conception of inquiry. In ethics, absolute principles and pregiven ideologies are rejected in favor of an inquiry-based process that examines moral choices in concrete situations. Competing alternatives are appraised in terms of likely consequences using relevant concepts and theories, and by clarifying applicable principles. Such principles are derived from cumulative moral experience and are guides, not absolutes (see Willower and Forsyth 1999). These contrasting philosophical approaches were sketched to suggest the substance of contemporary debate, because of their implications for scholarship in school administration. Qualitative research, for instance, is a mainstay of subjectivists and those critical theorists who do ‘critical ethnography.’ Postmodernists are less explicit about research, but several articles in the Educational Administration Quarterly issue on postmodernism reported field research.
Qualitative studies were done in educational administration long before these views gained popularity, but have become more widespread, and the literature now includes work aimed at demonstrating injustices, rather than building theory. Changing philosophical emphases can legitimate new subject matter and methods, as shown by comparing the second Handbook of Research on Educational Administration (Murphy and Louis 1999) with the first (Boyan 1988). However, philosophical influences are filtered by variations of interest and practice. Work may borrow selectively, sometimes from conflicting philosophies, or tailor philosophical ideas to special purposes, or ignore them entirely. For instance, in ethics, single concepts such as equity, caring, or community often are stressed, sometimes along with visions of the ideal school. More philosophical explorations of moral principles and their applications are scarcer. In epistemology, many writers in antiscience camps argue against positivistic ‘hegemony,’ while ignoring formidable views of inquiry. Hence, many disputes are about politics rather than theories of knowledge. In educational administration, examples of broad philosophical efforts are Evers and Lakomski’s joint work on epistemology and Hodgkinson’s and Willower’s respective approaches to ethics and to inquiry (see Boyan 1988 and Willower and Forsyth 1999). The substitution of politics for philosophical substance can be found in most social sciences and humanities, reflecting the influences of critical theory, postmodernism, and identity politics. Various commentators see these influences as declining because of their one-sidedness and practical irrelevance. However, they remain visible, claiming a share of the literatures of assorted specializations, including educational administration.
3.2 Theory and Practice
In school administration, current issues command much attention; theoretical discussions have a more specialized appeal. Devolution, school site based management, and restructured decision processes are examples of a type of contemporary reform that is the object of wide attention. Such reform results from recent trends in government and politics, but also is the current incarnation of a long line of participatory-type schemes, especially seen in Western countries. They range across democratic administration, human relations, organizational development, open climates and participatory styles of leadership, organizational health, and staff empowerment. This sort of reform is oriented to practice, but has theoretical grounding because openness and participation are often held to be paths to organizational improvements, such as increased staff motivation and commitment. However, most school improvement efforts make demands on time, erode teacher autonomy, and disturb orderly routines, adding to staff overload. Counter forces include pressures to display legitimating progress and educator hopes for positive student outcomes. Such contradictory pressures are characteristic of attempts to improve school practices, but how they play out depends on contexts and contingencies.
Another contemporary reform is privatization. It stresses consumer choice, efficiency, and economy. Often linked to efforts to change governmental policies on education, it is squarely in the political arena. While reforms of policy and organization aim at changing practice, students of cognition and learning have sought to understand how practitioners solve problems and make decisions. Influenced by Simon and the ‘Carnegie School,’ this work emphasizes domain-general problem-solving processes and domain-specific knowledge. Research comparing expert and novice decision makers shows the importance of both. In educational administration, the investigations of Leithwood and his associates (Leithwood and Steinbach 1995) have been noteworthy (for other sources, see Willower and Forsyth 1999). Professional preparation programs for school administrators use many approaches to transfer learning
to practice. They include case studies, simulations (sometimes computer based), films, approximations of virtual reality, internships, and other school site based activities. Program improvement has benefited from the cooperative efforts of professorial and administrator associations (Willower and Forsyth 1999). Improving the practice of school administrators and educational reform is, like inquiry, ever unfinished. Current successes do not guarantee future ones, despite progress made in developing explanative theories, and continuing efforts to improve schools and the preparation of administrators. A barrier to the implementation of theory-based reforms in practice is the multiplicity and complexity of the influences that impinge on the schools, considered next.
3.3 Making Sense of Complexity
School administration appears to be more complex than many forms of management. Schools tend to be vulnerable to community and societal forces, are usually subject to substantial regulation (no surprise since they serve society’s children), and have difficulty demonstrating effectiveness. The latter occurs because of the complexity of showing the school’s part in student learning vs. that of family, variations in student motivation and ability, and other factors. Further, schools are commonly expected to foster subject matter achievement and acceptable social skills and conduct. Beyond that, school administrators oversee personnel who ordinarily define themselves as professionals and value latitude in their work. Thus, school organizations are characterized by external pressures and internal forces that heighten complexity. In addition, social changes have placed new demands on schools as societal arrangements that provided regulatory settings for children and youth have begun to erode.
As this occurred, schools increasingly have had to cope with an array of social ills, for instance, substance abuse, vandalism, and violence, against a backdrop that too often includes dysfunctional families and subcultures with norms that devalue learning, along with the miseducative effects on young people of ubiquitous mass media. As a result, many schools face expanded responsibilities, adding new complexities to the administration of schools. Such complexity can be daunting. One response in education, as in business, has been a susceptibility to fads, with total quality management as a recent example. These fads may include some reasonable ideas, commonly packaged as part of a larger program. Although often treated as panaceas, they are typically short-lived. Such fads can be explained as attempts to give perceived legitimacy to organizational efforts to confront what in reality are intractable problems. While practitioners adapted to changing social conditions and increasing complexity, those who study school administration sought to gain a better understanding of such phenomena. More sophisticated
quantitative analyses facilitated by computer technology provided ways of examining relationships among many variables in numerous combinations, clusters, and sequences. Although empirical investigations rarely employed such analyses, their availability provides possible handles on complexity. More popular recently have been field studies, done as participant observation, ethnography, or case studies. While qualitative methods are favored by those wishing to advance a particular view using illustrations from selected ‘narratives,’ more traditional field work has long sought to plumb complexity by attending to the details of daily activity. Disputes about qualitative vs. quantitative research are not new, mirroring the richness vs. rigor debates in sociology in the 1930s (see Waller 1934). In educational administration, the dominant view appears to favor the use of a variety of methods, recognizing that each has strengths and weaknesses. In any event, empirical studies in quantitative and qualitative modes have sought to make complexity more understandable, albeit in different ways. The increased recognition of complexity has rekindled interest in chaos theory in educational administration and the social sciences. Chaos theory, it is well to recall, is not a denial of the possibility of orderliness. Rather it is a search for odd, intricate, or many-faceted patterns. Reviewing chaos scholarship, Griffiths et al. (1991) concluded that enthusiasm for the theory had not resulted in meaningful social research. They suggested that chaos theory has limited applications in educational administration because of the theory’s dependency on precise measures. Conceptual efforts to accommodate fluidity and fluctuation are not new.
To cite others, they range from Hegelian dialectical analysis with its clash of opposites, followed by a new synthesis, to threshold effects noted only when a variable reaches a certain level, with little or no gradation, to equifinality or functional equivalents where similar results stem from different sources. Ethical theory has also been concerned with complexity. Perspectives that recognize how moral choices are embedded in elaborate contexts, and attempt to include potential consequences of action as part of choice, are illustrative. Related are cognitive-type studies of problem solving and decision making and work on types of intelligence and their relationship to problem solving. Clearly, human experience is fluid and complex, and human behavior is often irrational and contradictory. Yet, in inquiry, the irrational is apprehended through rational processes, and experience is known through the use of concepts and explanations. Obviously, concepts and theories simplify, but so do intuition, lore, pregiven ideologies, aphorisms, and other ways of confronting experience. Nothing comprehends everything. The trick, in science and administration, is to include the relevant elements and get to the heart of the problem. The record on understanding complexity
in school administration, as in the social sciences, is one of incremental gain. The methods of scientific inquiry can make complexity more comprehensible and more manageable, but there are no panaceas.
3.4 Internationalization Educational administration is becoming internationalized. New preparation programs and journals are cropping up around the world. Established outlets such as the Journal of Educational Administration and Educational Administration Quarterly regularly feature comparative-international pieces, and research done cooperatively by investigators in different countries is increasingly appearing, as for example, in the July 1998 issue of the Journal of School Leadership. In addition to the International Encyclopedias of Education, World Yearbooks of Education, and others, new works are being published, for example the 1996 International Handbook of Educational Leadership and Administration. Scholarly associations such as the University Council for Educational Administration (in North America) and the Commonwealth Council for Educational Administration and Management continue to develop cooperative activities. Internationalization presents opportunities for inquiry that take advantage of different societal-cultural settings to examine the impact of contingencies on behavior and outcomes. Such research can show how the relationships of the variables under study are mediated by setting variables. This, in turn, can lead to new theoretical explanations, not to mention better international understanding. See also: Administration in Organizations; Educational Research and School Reform; School (Alternative Models): Ideas and Institutions; School as a Social System; School Effectiveness Research; School Management
Bibliography
Alvesson M, Deetz S 1996 Critical theory and postmodernism approaches to organizational studies. In: Clegg S R, Hardy C, Nord W R (eds.) Handbook of Organizational Studies. Sage, London
Boyan N J (ed.) 1988 Handbook of Research on Educational Administration. Longman, New York
Campbell R F, Fleming T, Newell T J, Bennion J W 1987 A History of Thought and Practice in Educational Administration. Teachers College Press, New York
Culbertson J A 1988 A century’s quest for a knowledge base. In: Boyan N J (ed.) Handbook of Research on Educational Administration. Longman, New York
Derrida J 1994 Specters of Marx: The State of the Debt, the Work of Mourning, and the New International. Routledge, New York
Griffiths D E, Hart A W, Blair B G 1991 Still another approach to administration: Chaos theory. Educational Administration Quarterly 27: 430–51
Leithwood K, Steinbach R 1995 Expert Problem Solving. State University of New York Press, Albany, NY
Murphy J, Louis K S (eds.) 1999 Handbook of Research on Educational Administration, 2nd edn. Jossey-Bass, San Francisco
Waller W 1934 Insight and scientific method. American Journal of Sociology 40: 285–97
Willower D J, Forsyth P B 1999 A brief history of scholarship on educational administration. In: Murphy J, Louis K S (eds.) Handbook of Research on Educational Administration, 2nd edn. Jossey-Bass, San Francisco, pp. 1–24
D. J. Willower
School (Alternative Models): Ideas and Institutions Ever since its beginnings at the turn of the eighteenth to nineteenth century, the modern educational system has been accompanied by alternative schools, the numbers and broad effects of which have varied over the course of the years. This coexistence is by no means based on independence, but is the manifestation of complex reciprocal relations which exist despite fundamental differences in the ideologies of the two approaches. Phases of intensive development in the domain of traditional schools have been mirrored by similar phases of intense activity in the alternative school movement; alternative schools fulfill important functions within the general spectrum of educational possibilities.
1. Background and Conception Because proponents of alternative schooling are critical of contemporary manifestations of the traditional school system, the focus of their critique has shifted over time. In general, however, their criticism is based on objections to the structures and functions of the modern school. They do not restrict their demands to a return to earlier historical epochs and structures in which schools still played a minor role. Their objectives are for duress (compulsory education and instruction) to be replaced by freedom; teacher-directed schooling and instruction by self-directed action on the part of the student; the curriculum by the individual needs of the student; competition and rivalry by community; compartmentalization by wholeness and integral methods in instruction; the separation of life and school by an interweaving of the two spheres (e.g., Holzman 1997). The fundamental criticism of traditional schooling is based on the conviction that this form of education is burdened by a basic contradiction: Although adults, and particularly adults living in modern democracies, are granted dignity, majority, and self-determination, children and adolescents at school are denied these basic rights, albeit on the pretext that they first have to learn how to exercise them. Because it is not possible to recreate the historical conditions which preceded institutionalized education and its inherent ambivalences, however, the critics of traditional schooling have to resort to so-called ‘counterschool’ initiatives (German: Gegenschulen). Although some of these are embedded in a variety of alternative organizations (e.g., flat-sharing communes, communal child care, and even dietetic measures), in general it is not society that is deschooled, but school itself. The extent to which this radical rearrangement leads to the complete disintegration of institutions is dependent on its (partly unintended) consequences.
2. Structural and Organizational Characteristics Pedagogical aims are reflected in the specific organizational characteristics of any given school (e.g., the general structure of the school, the relationships between teachers, parents, and students, the organization of classes, the methods implemented). In turn, these characteristics constitute the decisive prerequisites allowing many alternative institutions to exceed the norm with regard to content and in both the pedagogical and the social fields. Indeed, the typical approach of the alternative schools can be reduced to a simple formula: in contrast to the universal institutions of the traditional system, which are characterized by high levels of internal differentiation and complicated organizational structures, alternative institutions are distinguished by social clarity and immediacy owing to the fact that they are, from the outset, geared to the particularities of their respective clients. Accordingly, the student body in an alternative school on average numbers less than 200 (cf. Mintz et al. 1994). Thus, irrespective of content-related differences, the alternative school represents a special type of school or, more specifically, a type of school offering a special kind of education. While this offer may formally be an open one, it in fact only reaches the few who know how to take it up. Due to these invisible processes of selection, alternative schools can be expected to be more or less immune to many of the problems arising in schools which really are open to all (as is made clear by the reports and evaluations of alternative schools; cf. as a very vivid example Duke and Perry 1978). In accordance with the very definition of the alternative school, however, these assets cannot readily be transferred to the mass school system as a whole. Instead of having to simultaneously meet the needs of students from different backgrounds and with various levels of qualification, alternative schools are able to respond
to individual differences and developments emanating from a more homogeneous context of attitudes and lifestyles. The additional potential for motivation and identification (in both students and teachers) which is inherent in a free choice of school or in the decision to embark upon a rather unusual experiment constitutes another factor which cannot easily be transferred to the traditional school. As a rule, ‘counterschools’ are spared the arduous coordination of pedagogical subfunctions on the basis of a division of labor: because of the small size of these institutions, instruction and education, counseling and care all rest in the hands of a few. Instead of having to strictly adhere to general and at least partly formulated rules in order to ensure the smooth running of the school and its classes, confidence in the closer personal ties and connections of the school community makes it possible to act more effortlessly, easily, and flexibly. Finally, the danger of merely running out of inspiration within the confines of internal school processes is reduced, as the high levels of input from the local environment mean that the system’s contours remain rather diffuse. The close links between the home and the school, which are not limited to financial support, but also involve the parents’ practical cooperation, can be seen as another important asset.
3. Developments Since the 1960s The concepts behind alternative schooling and its implementation reach far back in time, and in some respects they refer to the founders of modern educational theory (e.g., Rousseau or Herbart). Progressive education in the United States and so-called Reformpädagogik in Germany, both of which emerged after the end of the nineteenth century, may certainly be considered to be phases of activity to which a number of existing institutions can be traced back (e.g., Semel and Sadovnik 1999). However, the main wave of alternative school foundation resulted from political and social conditions including the failure or petering out of large-scale attempts at educational reform (e.g., Carnie et al. 1996); this holds especially for the Anglo-American countries since the 1960s. Most of the alternative institutions founded during this phase were set up in the United States, and the majority are still to be found there (cf. Mintz et al. 1994). Although less than 5 percent of students at elementary and secondary school actually attend alternative schools, the institutions now number several thousand in the United States, most of them elementary schools. The institutions are highly divergent in terms of their targets, methods, integration of parents and students, links with the traditional system, locus of control, and circles of addressees. It would thus seem justified to focus this overview of alternative schooling since the 1960s on the situation in the United States, all the more so because both the
fundamental principle of alternative schools (‘as many institutions as there are needs and groups’) and current trends in development (adopting the tasks of the traditional system) are to be observed there. One important form of alternative education is home schooling (cf. Evans 1973), where parents instruct their children at home with the permission of the state. If—as is quite often the case—these children meet up with other home students in (elementary) schools on single days of the week, there is a sliding transition to school cooperatives aiming at unconstrained, playful, spontaneous learning in open instruction. Other institutions specifically target problematic cases, so-called at-risk students, including not only chronic truants and low-performing students, but also talented nonconformers (in the broadest sense). A well-balanced program of drill and freedom is intended to render school attractive again, thus helping these students to enhance their performance. However, this development has also led to a reverse trend, a sort of back-to-basics shift to formal ‘three Rs’ schools, which specifically emphasize the basic skills, discipline, and respect for the authority of parents and teachers (cf., e.g., Deal and Nolan 1978). Similar trends are to be observed at the secondary level. The large number of schools-within-schools— smaller, less complex units running different experimental programs, and affiliated to larger state schools—is particularly notable. The spectrum also includes schools attempting to set themselves up as ‘just community schools’ based on the principles of Kohlberg’s developmental theory (cf. Kohlberg 1985). The so-called ‘schools without walls’ represent another form of alternative education. They not only offer a ‘normal’ curriculum (differing from that of the traditional school in terms of content and time), but give students the opportunity to learn in the adult world. 
This is intended to provide students with extrascholastic experience and to enable them to prove themselves in the world of work. Although this approach initially seemed very promising, experience has shown that it is in fact very hard to put into practice.
4. Recent Trends and Perspectives Over the past few decades, there have been signs of a kind of division of function and labor between the traditional and the alternative schools in the United States. This has been expressed in an increase in the state-controlled administration of alternative institutions. The fact that such large numbers of US alternative schools were founded at the beginning of the 1980s is probably related to this development. The radical free schools (e.g., Dennison 1969), which signaled the upheavals of the 1960s and certainly provided considerable impetus for subsequent developments, play a lesser role in today’s broader stream
of initiatives. The basic conceptions and expectations of alternative schooling have also been affected, resulting in the extremely short lifespan of some of these institutions. There is still a great deal of fluctuation, making it hard to make valid evaluations of these institutions (cf. the data in Mintz et al. 1994, p. 10ff.). Even the early approaches and experiments demonstrated the great weaknesses and dangers of such institutions, which are the cost of their structure-related advantages (cf. also Swidler 1979). Critics have reduced this to the polemic formula that precisely those principles and premises which allowed the pure ‘counterschool’ initiatives of the early period to emerge later threatened their ability to survive. Wherever the concept of freedom as a counterpole to the compulsory character of the traditional institutions went hand in hand with systematic shortcomings in the specification of learning targets, there was the risk that later conflicts would be all the more severe due to the differing hopes and expectations of those involved (cf. Barth 1972, especially p. 108ff). When every school-related action was dependent on the consensus of every single member of the school community, the school’s fate was put in the hands of changing minorities with the power of veto, all the more so when the small size of these private institutions kindled fears of sudden financial ruin. Allowing almost all of those concerned to take an active role in the decision-making process did not in fact ensure the equality of all participants; instead of increasing the capacity to solve problems, there was a growing risk of disappointment, and the necessary basic consensus was threatened by frequent conflicts and failures in minor matters. The firm and wide-ranging commitment of parents, extending well into issues of classroom practice, often resulted not so much in the enhancement as in the disturbance of school activities.
Teachers were consequently made to feel insecure and were put on the defensive instead of being provided with support and control. Attempts to simultaneously cope with the factual demands, organizational affairs, and personal standards inherent in these institutions (preferably without referring to preset guidelines) led to the systematic overexertion of those concerned, and the end of many ‘counterschools’ was marked by the burnout of the teaching staff (Dennison described this phenomenon as early as 1969). The unconditional openness with respect to the other members of the school community and the intimacy of the small group responsible for the school meant that factual disputes often escalated into personal trials. Moreover, school affairs were often overshadowed by secondary motives such as parental self-realization or the search for security. By integrating and promoting alternative schools, the state aims to increase the diversification of the educational spectrum, to enhance the overall efficiency of schools (which has been subject to criticism for decades; cf., e.g., Goodlad 1984), and to better meet
the specific needs of certain problem groups in both school and society. The US programs of community schools, magnet schools and, more recently, charter schools provide evidence for this trend, and are looked upon with growing interest by other countries as possible models for the resolution of system-related problems. However, more detailed analyses of these models point to significant limitations in their problem-solving capacity, limitations which are ultimately rooted in the basic concepts of market and choice (cf. Lauder et al. 1999). The community schools program has already been thwarted by the US authorities’ desegregation measures: although these schools were directed at the advancement of members of the underprivileged minorities, most of whom are colored, they undermined the desired objective of racial integration. Magnet schools were supposed to be a free choice of school with a special educational program which acted like a magnet, attracting students from beyond the racial boundaries marked out in residential areas and school districts. They were intended to promote racial integration, which can scarcely be achieved through the compulsory measure of busing. A number of objections have been voiced against this school form. For example, it is argued that the enlargement of catchment areas weakens the link between the school and the parental home. Furthermore, although magnet schools may well establish a balanced relation between whites and blacks in their own student bodies, there are too few magnet schools to ensure social compensation on a large scale. Moreover, there is an imperceptible shift in their circle of addressees: instead of reaching the underprivileged, out-of-school colored population with their specific educational provision, magnet schools in fact seem to appeal to white parents and their children who would otherwise switch to private schools or the more privileged suburbs (cf. Smrekar and Goldring 1999). 
The charter schools, on the other hand, which in a way transfer the principles of the private school (i.e., the autonomy of schools and a free choice of school) to the state school system, are faced with comparable problems on a different level inasmuch as unforeseen side effects have been revealed in the social process. Despite the fact that the numbers of charter schools have increased considerably over the past few years, they are still far from ever being able to constitute the majority of state schools, belonging by definition to the optional domain of educational provision. The half-hearted legislative reservations perceptible in most of the federal states against according too many privileges to these schools threaten to render the entire charter program futile (cf. Hassel 1999).
Bibliography
Barth R S 1972 Open Education and the American School. Agathon Press, New York
Carnie F, Tasker M, Large M (eds.) 1996 Freeing Education. Steps Towards Real Choice and Diversity in Schools. Hawthorn Press, Stroud, UK
Deal T E, Nolan R R (eds.) 1978 Alternative Schools. Ideologies, Realities, Guidelines. Nelson-Hall, Chicago, IL
Dennison G 1969 The Lives of Children. The Story of the First Street School. Random House, New York
Duke D L, Perry C 1978 Can alternative schools succeed where Benjamin Spock, Spiro Agnew, and B. F. Skinner have failed? Adolescence 13: 375–92
Evans T 1973 The School in the Home. Harper & Row, New York
Goodlad J I 1984 A Place Called School: Prospects for the Future. McGraw Hill, New York
Hassel B C 1999 The Charter School Challenge: Avoiding the Pitfalls, Fulfilling the Promise. Brookings Institution Press, Washington, DC
Holzman L 1997 Schools for Growth. Radical Alternatives to Current Educational Models. Lawrence Erlbaum Associates, Mahwah, NJ
Kohlberg L 1985 The just community approach to moral education in theory and practice. In: Berkowitz M W, Oser F (eds.) Moral Education: Theory and Application. Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 27–87
Lauder H, Hughes D, Watson S, Waslander S, Thrupp M, Strathdee R, Simiyu I, Dupuis A, McGlinn J, Hamlin J 1999 Trading in Futures. Why Markets in Education Don’t Work. Open University Press, Buckingham, UK
Mintz J, Solomon R, Solomon S, Muscat A (eds.) 1994 The Handbook of Alternative Education. A Solomon Press Book. Macmillan, New York
Ramseger J 1975 Gegenschulen. Radikale Reformschulen in der Praxis [Radical Reform Schools in Practice]. Klinkhardt, Bad Heilbrunn, Germany
Semel S F, Sadovnik A R (eds.) 1999 ‘Schools of Tomorrow,’ Schools of Today. What Happened to Progressive Education. History of School and Schooling. Peter Lang Verlag, New York, Vol. 8
Smrekar C, Goldring E 1999 School Choice in Urban America. Magnet Schools and the Pursuit of Equity. Teachers’ College, Columbia University, New York
Swidler A 1979 Organization Without Authority. Dilemmas of Social Control in Free Schools. Harvard University Press, Cambridge, MA
A. Leschinsky
School as a Social System A ‘social system’ is a set of related elements that work together to attain a goal. Social scientists have frequently employed the concept of a social system to study the school. This has been done in three ways. First, the class has been portrayed as a social system; its elements include the teacher, the students, and formal and informal groups within the class. Second, the school itself has been viewed as a social system; its components are the administration, faculty, counselors, students, academic departments, the curriculum, the extra curriculum, and social networks and other subunits within the school. Third, the school has been
seen as one of a number of subunits in the larger social system of society (Gordon 1957, Loomis and Dyer 1976, Parsons 1961, Smelser 1988).
1. Viewing the School Class as a Social System The most familiar application of the social system model to schools is found in Parsons’ (1959) classic essay on the school class as a social system. Parsons focuses on the class, rather than the larger unit of the school, because he sees the class as the primary place where students learn and mature. According to Parsons, the two most important functions of the class are ‘socialization’ and ‘allocation.’ These functions define and motivate the academic and social processes that occur in the classroom and the interactions that occur among its various components.
1.1 Socialization Socialization is the process through which students internalize the kinds of commitments that they need to play a useful role in adult society. Although the family and community also socialize students, the length of time students spend in school makes the school class a major influence in the socialization process. Time spent in school during the students’ formative years provides ongoing opportunities for systematic, intensive socialization. During this time, students are expected to accept societal values and use them to guide their behavior. Parsons claims that several structural characteristics of the class facilitate the socialization of students. First, students assigned to the same class are fairly homogeneous with respect to age and social development. They are also somewhat homogeneous with respect to social class, since class composition is constrained by the socioeconomic characteristics of the neighborhood in which the school is located. The developmental and socioeconomic homogeneity of the students assists the teacher in transmitting values to the students who become, in turn, models of appropriate behavior for each other. Second, students in the same class are under the tutelage of one or a small number of adult teachers. The age difference between the teacher and students supports the teacher’s authority. The singular role of the teacher as the representative of adult society lends legitimacy to the teacher’s values. A third structural feature of the school class is the curriculum. Pupils are exposed to the same curriculum and assigned the same tasks. A shared curriculum and similar activities allow the teacher to reinforce cultural values and attitudes in multiple ways over the school year. A fourth structural feature of the class is its reward structure. Teachers generally establish a set of rewards
and punishments governing student behavior in a classroom. Good behavior, defined by the teacher’s values, is rewarded, while disruptive actions are sanctioned. Students learn the rules of conduct that apply to social and work situations, and are motivated to obey these rules by the rewards or punishments their behaviors incur.
1.2 Allocation Allocation is the process of sorting individuals and assigning them to groups based on their ability or skills. In a school, allocation refers to the assigning of students to instructional units according to their abilities or achievement. The purpose of this sorting process is to prepare students to attain an occupational position in society commensurate with their capabilities (Gamoran et al. 2000, Hallinan 1994, Lynch 2000, Oakes 1994). Parsons defines achievement as excellence, relative to one’s peers, in meeting the expectations of the teacher. Achievement has two components: cognitive and moral. Teachers attempt to motivate students to achieve academically and to learn the skills needed to perform in adult society. They also teach students a moral code and a set of behaviors consistent with the values of adult society. Students are evaluated on each of these two components. Society may assign greater weight to one component than the other or may treat them as equally important. As with socialization, structural differentiation occurs through a process of rewarding students for excellence and punishing them for failure to meet teacher expectations. In the elementary school classroom, the criteria for excellence are not clearly differentiated across cognitive and moral dimensions. Teachers aim to develop both good citizenship and an achievement motivation in children. They also begin the process of classifying youth on the basis of their ability to achieve. While assigning students to classes by ability typically does not occur in elementary school, many elementary school teachers instruct their students in small, ability-based groups within the classroom. Their purpose is to facilitate instruction and to prepare students for later channeling into ability-based classes. This initial sorting sensitizes students to the differential achievement that characterizes a class and to the way rewards are allocated for performance.
At the secondary level, the cognitive and moral components of achievement are separable, and greater emphasis is placed on cognitive achievement. Parsons claims that students who achieve academic excellence in high school are better suited for technical roles in society, while those who excel in the moral or social sphere are more fit for socially oriented roles. The actual sorting of students in high school typically occurs by assigning students to classes for instruction based on their ability level. Since ability-grouped
classes vary by student and teacher characteristics and by the curriculum, they represent different social systems within the school. The conceptual attraction of the social system model of the classroom has stimulated a body of research over the past few decades. The model depicts how the teacher, formal and informal groups, and individual students interact in the classroom and perform or cooperate with the functions of socialization and allocation. Examples of research based on this model include studies of teacher expectations, the quantity and quality of teacher–student interactions, gender and race effects on task-related and social interactions, peer influences and student friendship groups.
2. Viewing the School as a Social System Researchers are more likely to focus on the school, rather than the classroom, when utilizing the social system model to analyze education. Like the class, the school must perform the functions of socialization and allocation in order to play its assigned role in society. Viewing the school as a social system directs attention to how the parts of a school interact to carry out these functions.
2.1 Socialization For students to be socialized successfully in school, they must cooperate in their education. They are less likely to resist if they accept the authority of their teachers and believe that the school’s policies and practices are fair. When students agree with school policies and practices, they are unlikely to resist socialization. With student cooperation, a school is expected to attain its goal of graduating students who have internalized societal values and norms beyond those held by their families.
2.2 Allocation To attain its goal of allocation, a school must locate students on an achievement hierarchy, defining each person’s capability and aspirations. A critical factor in the school’s ability to allocate students is that both the family and school attach high value to achievement (Parsons 1959). When adults view student achievement as a high priority, students are more likely to internalize the motivation to achieve and to cooperate with adults as they orient students toward specific adult roles. When schools succeed in this effort, they match society’s human resources with its adult occupational structure. The primary mechanism for the allocation process is the organizational differentiation of students for
instruction by ability. Middle and high schools in the United States typically assign students to Academic, Vocational, and General tracks. Students assigned to the Academic track take college preparatory courses, those assigned to the Vocational track take skills and job-related courses, and students assigned to the General track are offered both low-level academic courses and skills courses. Track assignment is a major determinant of whether a student advances to college or enters the job market after graduation. Theoretical and empirical studies demonstrate the way various parts of a school interact to socialize students and prepare them for the labor market. Research on the school as a bureaucracy, on the formal or informal organization of the school, and on social networks within the school, illustrates this approach. These and other studies demonstrate the heuristic value of conceptualizing the school as a social system and show how this model has integrated a wide variety of studies about schooling.
3. Viewing the School as a Subsystem in Society Many social scientists have used the social system model to analyze the role of various institutions in society. Education is seen as one of society’s primary institutions, along with religion, the economy, and the judicial system. The aim of a social system approach to the study of schools in society is to ascertain how schooling enables society to achieve its goals. 3.1 Structural Functionalist Perspective on Schools in Society A theoretical perspective that dominated early twentieth century societal analysis was structural functionalism (Durkheim 1956, Parsons 1951). A major premise of structural functionalism is that a society must perform a set of functions in order to survive. According to Parsons (1956), these functions are: obtaining and utilizing resources from the system’s external environment; setting goals for the system and generating the motivation and effort to attain these goals; regulating the units of the system to ensure maintenance and to avoid destructive conflicts; and storing and distributing cultural symbols, ideas, and values. Schools perform these functions by socializing students to societal values, by providing a common culture and language, by enabling and encouraging competition, and by preparing students for the labor market. Structural functionalism holds that a social system must exist in a state of equilibrium or, if disrupted, must make adjustments to restore balance. If one social institution in society undergoes major change, interrelated social institutions are expected to accommodate this change and to bring society back to a stable state. For example, if the economy of a society
were to change, students would need to be socialized and allocated in different ways to support the new economic structure. The social system would then be restored to balance, though it would differ structurally from its original state. 3.1.1 Limitations of the structural functionalist model. The structural functionalist perspective has been challenged on many grounds. The most common criticism questions its assumption of systemic equilibrium. Critics claim that structural functionalism ignores the processes of social change internal to a social system. They argue that the relationship among the parts of a social system tends to change over time and that the social system that emerges as a result of these changes may differ fundamentally from the original system. Another criticism of structural functionalism regards its claim that a social system continues to operate as intended if all the parts of the system perform their functions. This assumption fails to take into account the possibility of external shocks to the system. An external shock could alter the pattern of interactions among the system’s parts in such a way as to disrupt its stability, leading to a basic restructuring of the system or, possibly, to its disintegration. Finally, the structural functionalist perspective has been criticized on ideological or political grounds. Critics claim that a structural functionalist perspective portrays the school as an institution that supports and perpetuates the existing social order and its stratification system. They argue that while schools reward students for ability and achievement, they also maintain the influence of ascribed characteristics on adult success. Structural functionalists fail to analyze the extent to which schools preserve a class-based society. Not only does structural functionalism ignore the way schools perpetuate the status quo; it also fails to explain how ascriptive characteristics mediate the effects of achievement after graduation. 
By linking occupational success to organizational characteristics of schools and academic achievement, structural functionalism fails to explain the poor fit that often exists between a student’s skills and abilities and the student’s future place in the labor market. Critics of structural functionalism argue that a job seeker can often negotiate with a prospective employer, and that this process allows an individual’s ascribed characteristics and social status to influence job placement. In short, structural functionalism is generally viewed as a static theory, which only partially describes the interactions in a school system, or between schools and the rest of society. Even when social change is incorporated into the model, the change is not seen as leading to a radical transformation of the school. As a result, the theory fails to depict the more dynamic and controversial dimensions of schooling. Nevertheless, structural functionalism has been a useful theoretical perspective to explain how schools and classrooms function as social systems under certain conditions in society.
3.2 Conflict Perspective on the School as a Social System Conflict theory (Bowles and Gintis 1976, Collins 1971) is an alternative theoretical model for the analysis of the school as a social system. Conflict theory posits competition as the major force driving societal development. While structural functionalism views technological needs and economic growth as the major influences on society, conflict theory argues that competition for wealth and power is the primary state of society and social change is its inevitable result. According to conflict theory, ascribed characteristics are the basis of elite status. Collins (1971) argues that the continuing effort of the elite to exert control over lower status groups creates an ongoing struggle for power and prestige. Conflict theory specifies conditions under which subordinates are likely to resist the domination of superiors through noncompliance and resistance. Under conditions of economic hardship, political turmoil, or cultural conflict, nonelites are more likely to resist domination and to challenge the relationship that exists between education and occupation. Their discontent typically precipitates social change. Conflict theory not only explains the relationship between education and occupation; it also yields insights into the power struggles that occur between schools and other social groups in society. For example, during the student movement in the USA in the 1960s and 1970s, students resisted authority and the status quo. Tensions and disruptions continued until students were granted a greater voice in university affairs and in political life. Current controversies about school prayer, sex education, curriculum content, racial integration, and school vouchers are led by competing interest groups. These controversies are likely to lead to compromises that involve some redistribution of authority and power. The power struggle that occurs in society may also be observed in the school and in the classroom.
In schools, students create their own value system that may be inconsistent with the school’s academic goals. The tests and grades administered by schools as a sorting mechanism may be viewed as a way to maintain the status quo, to enforce discipline and order, or to co-opt the most intelligent of the lower classes. In the classroom, students may challenge teacher authority and negotiate teacher power through resistance. By directly addressing the relationship between conflict and social change, conflict theory supplements structural functionalism in explaining the behavior of schools in society and in predicting social change. Both theories point to important aspects of
the dynamics of social systems. As stressed by structural functionalists, schools do socialize and allocate students, mostly by meritocratic criteria. But conflict theorists are correct in maintaining that nonmeritocratic factors also influence the allocation process. Structural functionalists are accurate in stating that students typically cooperate with teachers in the learning process, but conflict theorists recognize that some students resist authority and rebel. Further, conflict occurs in communities, schools, and classrooms, but it is not always class-based. Relying on the insights of both theories increases our understanding of schools in society.
4. Coleman’s Analysis of the School as a Social System In his last major work on social theory, Coleman (1990) argued explicitly that the aim of sociology is to explain the behavior of social systems. He pointed out that social systems might be studied in two ways. In the first approach, the system is the unit of analysis and either the behavior of a sample of social systems is studied or the behavior of a single social system is observed over time. Coleman’s (1961) study of the adolescent subculture follows this strategy. The second approach involves examining internal processes in a social system, including the relationships that exist among parts of the system. Research on the association between track level in high school and a student’s growth in academic achievement is an example of linking a subunit of a school to an individual. Studies of the transition from school to work illustrate the relationship between two subunits of society, education and the labor market. Social system analysis involves explaining three transitions: macro to micro, micro to micro, and micro to macro level transitions (Alexander et al. 1987, Collins 1981, Knorr-Cetina and Cicourel 1981). The macro to micro transition involves the effects of the social system itself on subunits (typically individuals) in the system. An analysis of the effects of ability group level on students’ educational aspirations is an example. The micro to micro transition links characteristics of subunits in the social system to the behavior of those subunits. An example is an analysis of the effects of gender on achievement. Finally, the micro to macro transition pertains to how the behavior of subunits in a social system influences the social system as a whole. Research on the effects of student achievement on the way the school organizes students for instruction is illustrative.
Coleman argued that the micro to macro transition is the most difficult to analyze, because it requires specifying the interdependency among subunits or individuals in a social system. His theory of purposive action provides a way to model this transition. In general, Coleman’s insistence that the study of social systems include an explicit focus on the transitions that exist between macro level and micro level processes in the social system promises to draw greater attention to the social system approach to the study of schooling in society.
5. Conclusions Conceptualizing the school as a social system is a useful approach to the study of schools. The social system model has led to new theoretical insights about how education, as an institution, affects other societal institutions. It has also generated a significant body of empirical research that demonstrates the interdependence of sub-units in a school and of schools within larger organizational units and their effects on social outcomes. These studies have yielded a better understanding of the role schools play and the contribution they make to contemporary social life. See also: Coleman, James Samuel (1926–95); Educational Institutions and Society; Socialization in Adolescence; Socialization in Infancy and Childhood; Socialization, Sociology of; System: Social
Bibliography
Alexander J, Giesen B, Munch R, Smelser N 1987 The Micro-Macro Link. University of California Press, Berkeley, CA
Bowles S, Gintis H 1976 Schooling in Capitalist America: Educational Reform and the Contradictions of Economic Life. Basic Books, New York
Coleman J S 1961 The Adolescent Society. Free Press, New York
Coleman J S 1990 The Foundations of Social Theory. Belknap Press of Harvard University Press, Cambridge, MA
Collins R 1971 Functional and conflict theories of educational stratification. American Sociological Review 36: 1002–19
Collins R 1981 On the micro-foundations of macro-sociology. American Journal of Sociology 86: 984–1015
Durkheim E 1956 Education and Sociology. Free Press, New York
Gamoran A, Secada W, Marrett C 2000 The organizational context of teaching and learning: Changing theoretical perspectives. In: Hallinan M (ed.) Handbook of the Sociology of Education. Kluwer Academic/Plenum, New York, Chap. 2, pp. 37–64
Gordon C 1957 The Social System of the High School. Free Press, Glencoe, IL
Hallinan M 1994 Tracking: From theory to practice. Sociology of Education 67(2): 79–84, 89–91
Knorr-Cetina K, Cicourel A (eds.) 1981 Advances in Social Theory and Methodology. Routledge and Kegan Paul, London
Loomis C P, Dyer E 1976 Educational social systems. In: Loomis C P (ed.) Social Systems: The Study of Sociology. Schenkman, Cambridge, MA, Chap. 7
Lynch K 2000 Research and theory on equality and education. In: Hallinan M (ed.) Handbook of the Sociology of Education. Kluwer Academic/Plenum, New York, Chap. 4, pp. 85–106
Oakes J 1994 More than misapplied technology: A normative and political response to Hallinan on tracking. Sociology of Education 67(2): 84–9, 91
Parsons T 1951 The Social System. Free Press, New York
Parsons T 1956 Economy and Society: A Study in the Integration of Economic and Social Theory. Free Press, Glencoe, IL
Parsons T 1959 The school class as a social system: Some of its functions in American society. Harvard Educational Review 29: 297–318
Parsons T 1961 An outline of the social system. In: Parsons T, Shils E, Naegele K, Pitts R (eds.) Theories of Society. Free Press, New York, Vol. 1
Smelser N 1998 Social structure. In: Smelser N (ed.) Handbook of Sociology. Sage, Newbury Park, CA, pp. 103–29
M. Hallinan
School Effectiveness Research
1. School Effectiveness and School Effectiveness Research In the most general sense ‘school effectiveness’ refers to the level of goal attainment of a school. Although average achievement scores in core subjects, established at the end of a fixed program, are the most probable ‘school effects,’ alternative criteria like the responsiveness of the school to the community and the satisfaction of the teachers may also be considered. Assessment of school effects occurs in various types of applied contexts, like the evaluation of school improvement programs or comparing schools for accountability purposes, by governments, municipalities, or individual schools. School effectiveness research attempts to deal with the causal aspects inherent in the effectiveness concept by means of scientific methods. Not only is the assessment of school effects considered, but particularly the attribution of differences in school effects to malleable conditions. Usually, school effects are assessed in a comparative way, e.g., by comparing average achievement scores between schools. In order to determine the ‘net’ effect of malleable conditions, like the use of different teaching methods or a particular form of school management, achievement measures have to be adjusted for intake differences between schools. For this purpose student background characteristics like socioeconomic status, general scholastic aptitudes, or initial achievement in a subject are used as control variables. This type of statistical adjustment in research studies has an applied parallel in the striving for ‘fair comparisons’ between schools, known under the label of ‘value-added.’
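The adjustment for intake differences described above can be illustrated with a minimal sketch. It assumes a single control variable (initial achievement), a pooled simple linear regression, and invented data; the function name value_added and the record layout are illustrative choices, not taken from the studies discussed here. Actual school effectiveness studies typically use several background controls and multilevel models rather than this simplified form.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Return a {school: adjusted score} mapping.

    records: iterable of (school, prior_score, final_score) tuples.
    The adjusted score is the school's mean residual from a pooled
    least-squares regression of final score on prior score, so it
    expresses how far a school's students end up above or below the
    level predicted from their intake alone.
    """
    records = list(records)
    xs = [prior for _, prior, _ in records]
    ys = [final for _, _, final in records]
    x_bar, y_bar = mean(xs), mean(ys)

    # Ordinary least squares with one predictor (hypothetical simplification).
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Collect each student's residual under the school attended.
    residuals = defaultdict(list)
    for school, prior, final in records:
        residuals[school].append(final - (intercept + slope * prior))

    return {school: mean(vals) for school, vals in residuals.items()}


# Invented example: school B has the stronger intake and the higher raw mean,
# but its students gain less than those of school A.
example = [
    ("A", 40, 50), ("A", 50, 60), ("A", 60, 70),
    ("B", 60, 64), ("B", 70, 74), ("B", 80, 84),
]

if __name__ == "__main__":
    for school, score in sorted(value_added(example).items()):
        print(school, round(score, 2))
```

In this invented example the raw final means are 60 for school A and 74 for school B, yet the adjusted scores reverse the ordering (about +1.2 for A and -1.2 for B), which is the kind of difference a ‘fair comparison’ is meant to expose.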
2. Strands of Educational Effectiveness Research School effectiveness research has developed as a gradual integration of several research traditions. The roots of current ‘state-of-the-art’ school effectiveness research are sketched by briefly referring to each of these research traditions. The elementary design of school effectiveness research is the association of hypothetical effectiveness-enhancing conditions of schooling and output measures, mostly student achievement. The underlying model is a basic one from systems theory, in which the school is seen as a black box within which processes or ‘throughput’ take place to transform inputs into outputs. The inclusion of an environmental or context dimension completes this model (see Fig. 1).
Figure 1 A basic systems model of school functioning
The major task of school effectiveness research is to reveal the impact of relevant input characteristics on output and to ‘break open’ the black box in order to show which process or throughput factors ‘work,’ next to the impact of contextual conditions. Within the school it is helpful to distinguish a school and a classroom level and, accordingly, school organizational and instructional processes. Research traditions in educational effectiveness vary according to the emphasis that is put on the various antecedent conditions of educational outputs. These traditions also have a disciplinary basis. The
common denominator of the five areas of effectiveness research that will be distinguished is that in each case the elementary design of associating outputs or outcomes of schooling with antecedent conditions (inputs, processes, or contextual) applies. The following research areas or research traditions can be distinguished: (a) Research on equality of opportunities in education and the significance of the school in this. (b) Economic studies on education production functions. (c) The evaluation of compensatory programs. (d) Studies of unusually effective schools. (e) Studies on the effectiveness of teachers, classes, and instructional procedures. For a further discussion of each of these research traditions the reader is referred to Scheerens (1999). A schematic characterization of research orientation and disciplinary background is given in Table 1.
3. Integrated School Effectiveness Research In recent school effectiveness studies these various approaches to educational effectiveness have become integrated. Integration was manifested in the conceptual modeling and the choice of variables. At the technical level multilevel analysis has contributed significantly to this development. In contributions to the conceptual modeling of school effectiveness, schools became depicted as a set of ‘nested layers’ (Purkey and Smith 1983), where the central assumption was that higher organizational levels facilitated effectiveness-enhancing conditions at lower levels (Scheerens and Creemers 1989). In this way, a synthesis between production functions, instructional
Table 1 General characteristics of types of school effectiveness research
Type | Independent variable type | Dependent variable type | Discipline | Main study type
(a) (un)equal opportunities | socioeconomic status and IQ of pupils, material school characteristics | attainment | sociology | survey
(b) production functions | material school characteristics | achievement level | economics | survey
(c) evaluation compensatory programs | specific curricula | achievement level | interdisciplinary pedagogy | quasi-experiment
(d) effective schools | ‘process’ characteristics of schools | achievement level | interdisciplinary pedagogy | case-study
(e) effective instruction | characteristics of teachers, instruction, class organization | achievement level | educational psychology | experiment, observation
Figure 2 An integrated model of school effectiveness (from Scheerens 1990)
effectiveness, and school effectiveness became possible by including the key variables from each tradition, each at the appropriate ‘layer’ or level of school functioning (the school environment, the level of school organization and management, the classroom level, and the level of the individual student). Conceptual models that were developed according to this integrative perspective are those by Scheerens (1990), Creemers (1994), and Stringfield and Slavin (1992). The Scheerens model is shown in Fig. 2. Exemplary cases of integrative, multilevel school effectiveness studies are those by Brandsma (1993), Sammons et al. (1995), and Grisay (1996). In Table 2 (cited from Scheerens and Bosker 1997) the results of three meta-analyses and a re-analysis of an international data set have been summarized and compared to the results of a more ‘qualitative’ review of the research evidence. The qualitative review was based on studies by Purkey and Smith (1983), Levine and Lezotte (1990), Scheerens (1992), and Sammons et al. (1995). The results concerning resource input variables are based on the re-analysis of Hanushek’s (1979) summary of results of production function studies that was carried out by Hedges et al. (1994). As stated before, this re-analysis was criticized, particularly the unexpectedly large effect of per pupil expenditure.
The results on ‘aspects of structured teaching’ are taken from meta-analyses conducted by Fraser et al. (1987). The international analysis was based on the IEA Reading Literacy Study and carried out by Bosker (Scheerens and Bosker 1997, Chap. 7). The meta-analyses on school organizational factors, as well as on the instructional conditions ‘opportunity to learn,’ ‘time on task,’ ‘homework,’ and ‘monitoring at classroom level,’ were carried out by Witziers and Bosker and published in Scheerens and Bosker (1997, Chap. 6). The number of studies that were used for these meta-analyses varied per variable, ranging from 14 to 38 studies in primary and lower secondary schools. The results in this summary of reviews and meta-analyses indicate that resource-input factors on average have a negligible effect, school factors have a small effect, while instructional variables have an average to large effect. The conclusion concerning resource-input factors should probably be modified and somewhat ‘nuanced,’ given the results of more recent studies referred to in the above, e.g., the results of recent studies concerning class-size reduction. There is an interesting difference between the relatively small effect size for the school-level variables reported in the meta-analysis and the degree of certainty and consensus on the relevance of these factors in the more qualitative research reviews. It should be noted that the three blocks of variables depend on types of studies using different research methods. Education production function studies depend on statistics and administrative data from schools or higher administrative units, such as districts or states. School effectiveness studies focusing on school-level factors are generally carried out as field studies and surveys, whereas studies on instructional effectiveness are generally based on experimental designs.
Table 2 Reviews of the evidence from qualitative reviews, an international study, and research syntheses; coefficients are correlations with student achievement (the plus signs in the table refer to a positive assessment of the variable in question in the reviews).
Variable | Qualitative reviews | International analyses | Research syntheses
Resource input variables: | | |
Pupil–teacher ratio | | -0.03 | 0.02
Teacher training | | 0.00 | -0.03
Teacher experience | | | 0.04
Teachers’ salaries | | | -0.07
Expenditure per pupil | | | 0.20
School organizational factors: | | |
Productive climate culture | + | |
Achievement pressure for basic subjects | + | 0.02 | 0.14
Educational leadership | + | 0.04 | 0.05
Monitoring/evaluation | + | 0.00 | 0.15
Cooperation/consensus | + | -0.02 | 0.03
Parental involvement | + | 0.08 | 0.13
Staff development | + | |
High expectations | + | 0.20 |
Orderly climate | + | 0.04 | 0.11
Instructional conditions: | | |
Opportunity to learn | + | 0.15 | 0.09
Time on task/homework | + | 0.00/-0.01 (n.s.) | 0.19/0.06
Monitoring at classroom level | + | -0.01 (n.s.) | 0.11 (n.s.)
Aspects of structured teaching: | | |
cooperative learning | | | 0.27
feedback | | | 0.48
reinforcement | | | 0.58
Differentiation/adaptive instruction | | | 0.22

4. Foundational and Fundamental School Effectiveness Studies Foundational school effectiveness studies refer to basic questions about the scope of the concept of school effectiveness. Can a school be called effective on the basis of achievement results measured only at the end of a period of schooling, or should such a school be expected to have high performance at all grade levels? Can school effectiveness be assessed by examining results in just one or two school subjects, or should all subject matter areas of the curriculum be taken into account? Also, should the qualification of a school as effective not be restricted to consistently high performance over a longer period of time, rather than a ‘one shot’ assessment at just one point in time? Fortunately all of these questions are amenable to empirical research. The types of studies that are associated with the consistency of school effects over grade levels, teachers, subject-matter areas, and time have been referred to as ‘foundational studies’ (Scheerens 1993) because they are aimed at resolving issues that bear upon the scope and ‘integrity’ of the concept of school effectiveness. A recent review of such foundational studies is given in Scheerens and Bosker (1997, Chap. 3). Their results concerning primary schools are presented in Table 3. Consistency is expressed in terms of the correlation between two different rank orderings of schools. Results are based on arithmetic and language achievement.
Table 3 Consistency of school effectiveness (primary level)
Type of consistency | Average correlation
across time (stability) (1 or 2 years) | r = 0.70 (0.34–0.87)
across grades | r = 0.50 (0.20–0.69)
across subjects | r = 0.70 (0.59–0.83)
Source: Scheerens and Bosker (1997, Chap. 3)
The results summarized in Table 3 indicate that there is a reasonable consistency across cohorts and subjects, while the consistency across grades is only average. Results measured at the secondary level likewise show reasonably high stability coefficients (consistency across cohorts), and somewhat lower coefficients for stability across subjects (e.g., in a French study (Grisay 1996), coefficients based on value-added results were 0.42 for French language and 0.27 for mathematics). The average consistency between subjects at the secondary level was somewhat lower than in the case of primary schools (r about 0.50). This phenomenon can be explained by the fact that, at the secondary level, different teachers usually teach different subjects, so that inconsistency is partly due to variation between teachers. The few studies in which factor analysis was used to examine the size of a stable school factor relative to year-specific and subject-specific effects have shown results varying from a school factor explaining 70 percent of the subject- and cohort-specific (gross) school effects, to 25 percent. The picture that emerges from these studies on the stability and consistency of school effects is far from
being uniformly favorable with respect to the unidimensionality of the school effects concept. Consistency is fair when effects at the end of a period of schooling are examined over a relatively short time interval. When grade level and subject matter area are brought into the picture, consistency coefficients tend to be lower, particularly when different teachers teach different grades or subjects. School effects are generally seen as teacher effects, especially at the secondary school level. The message from these ‘foundational studies’ is that one should be careful not to overgeneralize the results of school effectiveness studies when only results in one or two subject matter areas at one point in time are measured. Another implication is that hypothetical antecedent conditions of effects are not only to be sought at the school organizational level, but also at the level of the teaching and learning process. Fundamental school effectiveness studies are theory- and model-driven studies. Bosker and Scheerens (1994) presented alternative causal specifications of the conceptual multilevel models referred to in an earlier section. These models attempt to grasp the nature of the relationships between, e.g., school and classroom conditions: for example, whether such relationships are additive, interactive, reciprocal, or form a causal chain. Other studies that have attempted to formalize these types of relationships are by Hofman (1995) and Creemers and Reezigt (1996). In general, it appeared to be difficult to establish the better ‘fit’ of one of the alternative model specifications. More complex models, based on the axioms of microeconomic theory, have been tested by de Vos (1998), making use of simulation techniques. So far, these studies are too few to support general conclusions about the substantive outcomes; continuing this line of study is nevertheless of interest, also from a methodological point of view.
From a substantive point of view educational effectiveness studies have indicated the relatively small effects of these conditions in developed countries, where provisions are at a uniformly relatively high level. At the same time the estimates of the impact of innate abilities and socioeconomic background characteristics—also when evaluated as contextual effects—seem to grow, as studies become more methodologically refined. Given the generally larger variation in both conditions and outcomes of schooling in developing countries—and the sometimes appallingly low levels of both—there is both societal and scientific relevance in studying school effectiveness in these settings.
5. The Future of School Effectiveness Research From this article it could be concluded that school effectiveness research can be defined in a broad and a narrow sense. In the broadest sense one could refer to all types of studies which relate school and classroom conditions to achievement outcomes, preferably after taking into account the effects of relevant student background variables. In a narrower sense, state-of-the-art integrative school effectiveness studies and foundational and fundamental studies could be seen as the core. Following the broader definition the future of school effectiveness studies is guaranteed, particularly in the sense of ‘applied’ studies, like cohort studies, large-scale effect studies carried out for accountability purposes, monitoring studies, and assessment studies. State-of-the-art, fundamental, and foundational school effectiveness studies are a much more vulnerable species. One of the major difficulties is the organizational complexity and cost of the ‘state-of-the-art’ types of study. Given the shortage of these kinds of studies, the more fundamental and foundational types of studies are likely to be dependent on the quality of data sets that have been acquired for ‘applied purposes.’ Therefore the best guarantee for continued fundamental school effectiveness research lies in the enhanced research-technical quality of applied studies. One example consists of quasi-experimental evaluation of carefully designed school improvement projects. Another important development is the use of IRT (Item Response Theory) modeling and ‘absolute’ standards in assessment programs. If school effects can be defined in terms of distance or closeness in average achievement to a national or even international standard, some of the interpretation weaknesses of comparative standards would belong to the past. See also: Educational Assessment: Major Developments; Program Evaluation; School Improvement; School Management; School Outcomes: Cognitive Function, Achievements, Social Skills, and Values; Teacher Behavior and Student Outcomes
Bibliography
Bosker R J, Scheerens J 1994 Alternative models of school effectiveness put to the test. In: Bosker R J, Creemers B P M, Scheerens J (eds.) Conceptual and Methodological Advances in Educational Effectiveness Research, special issue of the International Journal of Educational Research 21: 159–80
Brandsma H P 1993 Basisschoolkenmerken en de Kwaliteit van het Onderwijs [Characteristics of primary schools and the quality of education]. RION, Groningen, The Netherlands
Creemers B P M 1994 The Effective Classroom. Cassell, London
Creemers B P M, Reezigt G J 1996 School level conditions affecting the effectiveness of instruction. School Effectiveness and School Improvement 7: 197–228
Fraser B L, Walberg H J, Welch W W, Hattie J A 1987 Syntheses of educational productivity research. Special Issue of the International Journal of Educational Research 11
Grisay A 1996 Evolution des acquis cognitifs et socio-affectifs des élèves au cours des années de collège [Evolution of cognitive and socio-affective outcomes during the years of secondary education]. Université de Liège, Liège, Belgium
Hanushek E A 1979 Conceptual and empirical issues in the estimation of educational production functions. Journal of Human Resources 14: 351–88
Hedges L V, Laine R D, Greenwald R 1994 Does money matter? A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher 23: 5–14
Hofman W H A 1995 Cross-level relationships within effective schools. School Effectiveness and School Improvement 6: 146–74
Levine D U, Lezotte L W 1990 Unusually Effective Schools: A Review and Analysis of Research and Practice. National Center for Effective Schools Research and Development, Madison, WI
Purkey S C, Smith M S 1983 Effective schools: A review. The Elementary School Journal 83: 427–52
Sammons P, Hillman J, Mortimore P 1995 Key Characteristics of Effective Schools: A Review of School Effectiveness Research. OFSTED, London
Scheerens J 1990 School effectiveness and the development of process indicators of school functioning. School Effectiveness and School Improvement 1: 61–80
Scheerens J 1992 Effective Schooling. Research, Theory and Practice. Cassell, London
Scheerens J 1993 Basic school effectiveness research: Items for a research agenda. School Effectiveness and School Improvement 4: 17–36
Scheerens J 1999 School Effectiveness in Developed and Developing Countries: A Review of the Research Evidence. World Bank paper, Washington, DC
Scheerens J, Bosker R J 1997 The Foundations of Educational Effectiveness. Elsevier Science Ltd., Oxford, UK
Scheerens J, Creemers B P M (eds.) 1989 Developments in school effectiveness research. Special themed issue of the International Journal of Educational Research 13
Stringfield S C, Slavin R E 1992 A hierarchical longitudinal model for elementary school effects. In: Creemers B P M, Reezigt G J (eds.) Evaluation of Effectiveness. ICO-Publication 2. GION, Groningen, The Netherlands
de Vos H 1998 Educational Effects: A Simulation-Based Analysis. University of Twente, Enschede, The Netherlands
J. Scheerens
School Improvement
1. School Improvement Different terms are used to describe change processes in schools and educational systems as a whole. They include: ‘educational change,’ ‘innovation,’ ‘reform,’ and ‘development.’ These terms are quite often used in an interchangeable manner, and their meaning in the literature is frequently ambiguous. Many authors do not define them, but instead tend to trust everyday usage to convey the meaning of the words. Other authors define the concept by including in the definition important conditions and crucial characteristics of the processes. In the definition of these different concepts generally, and to distinguish school improvement from other concepts, at least three points should be taken into consideration.
(a) The level in the educational system where the change takes place: the school, the district, or the state. Ultimately, every change process has to reach the classroom and the level of the individual student. In order to make a distinction, educational change at national level is often called ‘systemic reform.’ The term ‘school improvement’ makes clear that the change takes place at school level.
(b) In all definitions, distinctions should be made between, on the one hand, the process of educational change, the strategies used, and the characteristics of the change process, and, on the other, the outcomes of the change process. In everyday use, the term can refer to one of these aspects or even to a combination of them: for example, to strategy and the product of change. Fullan clarifies what he calls ‘the confusion between the terms change and progress’ (Fullan 1991, p. 4). Not every change is progress: the actual outcome of the change processes in terms of progress is not as important as the intended outcomes in the desired direction.
(c) When it comes to the outcomes of change, there is often no well-defined concept of what the change is aimed at: the context of education, the inputs and processes, or, ultimately, the outcomes in terms of student achievement.
The International School Improvement Project (ISIP) defines school improvement as ‘a systematic sustained effort aimed at change in learning conditions and other related internal conditions in one or more schools, with the ultimate aim of accomplishing educational goals more effectively’ (Van Velzen et al. 1985). This definition specifies improvement as an innovation or planned change with specific means (change in learning and other internal conditions of the school) and specific goals (ultimately to accomplish educational goals more effectively).
School improvement as a process has been demarcated into three global phases: the initiation, the implementation, and the continuation processes. Stakeholders of school improvement are more narrowly circumscribed and consist of students, teachers, principals, and parents at school level, and of local educational authorities, consultants, and the local community at local level. In exceptional cases, where it is a regional or national strategy to improve all schools, the stakeholders at these levels are also relevant.
2. Practice and Theory in School Improvement
School improvement is a very widespread phenomenon and a wide variety of improvement efforts can be found. For example, there is a lot of improvement going on which does not aim at enhancing student outcomes at all. These types of improvement focus, for instance, on the career development of teachers, on restructuring the organization of the school, on the way decisions are made, or on the relationships between schools and their clients. Sometimes restructuring takes place at classroom level. Changes in the writing practices of elementary schoolteachers, for example, have been described in great detail. Nevertheless, the actual impact of the changes on students and on student achievement is usually left out. Some schools practice improvement on their own and try to find their own solutions to their problems. Other schools have chosen to implement as accurately as possible improvement programs that have been developed elsewhere. This is called fidelity implementation. Often they do not consider alternative educational options. For example, they may opt for a specific program primarily because another school was satisfied with it (Stringfield 1995). Some schools are involved in improvement only because their government expects them to be (Hopkins et al. 1994). Depending on the extent to which educational policies are translated into clear outlines and contents, fidelity implementation is a more or less appropriate concept. When the educational policy is rather prescriptive with respect to curricular content and outcome levels, as it is in the UK (Hopkins et al. 1994), it is theoretically possible to check whether or not schools are implementing this policy. When the educational policy is rather open-ended, as is the Dutch policy on educational priorities (Van der Werf 1995) or the Dutch inclusion policy, it is virtually impossible to apply the notion of fidelity implementation.
Only a small part of school improvement is based on research (Stringfield 1995). Innovations are hardly ever tested before they are implemented in educational practice, and an adequate evaluation of their impact is rare. The same applies to the occurrence of experimental or quasi-experimental designs in improvement projects. Some school improvers have preferred forms of action research instead of research-based experiments. Sometimes projects based on changing only a couple of factors report great successes. It may be that changes in just a few important areas can alter a whole system, but the question is whether these changes (often cases of educational mass hysteria) will stand the test of time. In education, the margins for change are generally small, because of the impact of noneducational factors such as the individual characteristics of students. Evaluations are often not carried out satisfactorily or are performed after a timespan that is too short. School improvement projects have shown that it takes time to design, develop, implement, and evaluate changes in schools, more time than is often available.
Because of inadequate evaluations, questions about the causation of effects, extremely important for the school-effectiveness knowledge base, cannot be answered. School improvement, therefore, should consistently pursue assessment of its results, pay attention to its failures and try to learn from these, and limit its goals to prevent confusion of cause and effect (Hopkins et al. 1999). In school improvement, the orientation towards educational practice and policy-making is emphasized. Although, in educational practice schools and classrooms can be found that succeed much better than others, theories cannot be based on exemplary practice only. Insofar as theoretical notions have been developed, they have not yet been empirically and systematically tested. The typologies of school cultures (Hargreaves 1995), for example, have not been studied in educational practice, and their effects on the success of school improvement are as yet unknown. Moreover, their relationships to the first criterion for school improvement (enhancing student outcomes) are not always very clear. The same holds for the factors that are supposed to be important in different stages of educational change, outlined by Fullan (1991), and for his ideas about the essential elements in educational change at classroom level (beliefs, curriculum materials, and teaching approaches). Even though these ideas are derived from school improvement practice, their importance and their potential effects are not accounted for in detail, and have not yet been studied in research. A contribution to theory development is the generic framework for school improvement provided by Hopkins (1996). In this framework, three major components are depicted: educational givens, a strategic dimension, and a capacity-building dimension. Educational givens cannot be changed easily. 
Givens can be external to the school (such as an external impetus for change) and internal (such as the school’s background, organization, and values). The strategic dimension refers to the competency of a school to set its priorities for further development, to define student learning goals and teacher development, and to choose a strategy to achieve these goals successfully. The capacity-building dimension refers to the need to focus on conditions for classroom practice and for school development during the various stages of improvement. Finally, the school culture has a central place in the framework. Changes in the school culture will support teaching–learning processes which will, in turn, improve student outcomes (Hopkins 1996). Despite the obvious gaps in theory development and testing in the field of school improvement, there are already some elements of a knowledge base. By trying to improve schools, knowledge on the implementation of classroom and school effectiveness factors in educational practice has become available. This has provided the possibility of studying, to different degrees, the influence of factors and variables on educational outcomes (Stoll and Fink 1994).
3. The Link Between School Improvement and School Effectiveness School effectiveness and school improvement conjoin because of their mutual interest, although their actual relationship may be very complicated (Reynolds et al. 1993, Creemers and Reezigt 1997). The major aim in the field of school effectiveness was always to link theory and research on the one hand, and practice and policy-making on the other. School improvement can be fostered by a knowledge base covering what works in education that can be applied in educational practice. The combination of theory, research, and development is not new in education. Almost all movements start out to make knowledge useful for educational practice and policy-making, or state their goal in terms of supplementing policy practice with a knowledge base supplied by theory and research from a cyclical point of view. The next step is to use practical knowledge for further advances in theory and research. In this way, research and improvement can have a mutually beneficial relationship. School effectiveness has led to major shifts in educational policy in many countries by emphasizing the accountability of schools and the responsibility of educators to provide all children with opportunities for high achievement, thereby enhancing the need for school improvement (Mortimore 1991). School effectiveness has pointed to the need for school improvement, in particular by focusing on alterable school factors. School improvement projects were necessary to find out how schools could become more effective. These projects were often supposed to implement effective school factors in educational practice (Scheerens 1992) and, in doing so, could yield useful feedback for school effectiveness. School improvement might point to inaccurate conceptions of effectiveness, such as the notion of linearity or one-dimensionality (Hargreaves 1995).
In addition, school improvement might give more insight into the strategies for changing schools successfully in the direction of effectiveness. The relatively short history of school effectiveness and improvement shows some successes of this linkage. Research results are being used in educational practice, sometimes with good results. School improvement findings are sometimes used as input for new research. Renihan and Renihan state that ‘the effective schools research has paid off, if for no other reason than that it has been the catalyst for school improvement efforts’ (Renihan and Renihan 1989, p. 365). Most authors, however, are more skeptical (Reynolds et al. 1993). Fullan states that school effectiveness ‘has mostly focused on narrow educational goals, and
the research itself tells us almost nothing about how an effective school got that way and if it stayed effective’ (Fullan 1991, p. 22). Stoll and Fink (1992) think that school effectiveness should have done more to make clear how schools can become effective. According to Mortimore (1991), a lot of improvement efforts have failed because research results were not translated adequately into guidelines for educational practice. Changes were sometimes forced upon a school, and when the results were disappointing the principals and teachers were blamed. Teddlie and Roberts (1993) suggest that effectiveness and improvement representatives do not cooperate automatically, but tend to see each other as competitors. Links between school effectiveness and school improvement were stronger in some countries than in others (Reynolds 1996). In the early years of school effectiveness, links were strong in the USA and never quite disappeared there. Many districts have implemented effective schools programs in recent years, but research in the field has decreased at the same time, and, because of this, school improvement is sometimes considered ‘a remarkable example of (…) over-use of a limited research base’ (Stringfield 1995, p. 70). Reynolds et al. (2000) came to the conclusion based on the analyses of Dutch, British, and North American initiatives that the following principles are fundamental to a successful merger of school effectiveness and school improvement: (a) a focus on teaching, learning, and the classroom level; (b) use of data for decision making; (c) a focus on pupil outcomes; (d) addressing schools’ internal conditions; (e) enhanced consistency (through implementation of ‘reliable’ programs); and (f) pulling levers to affect all levels, both within and beyond the school. Recently, several projects (Stoll et al. 1996, Hill 1998) have started to integrate school effectiveness and school improvement. 
They form successful examples of the concept of sustained interactivity. These projects all share a clear definition of the problem that should be overcome, in terms of student outcomes and classroom strategies, to enhance these outcomes within the context of the school. Often, the outcomes are clearly specified for one school subject or elements of a school subject, such as comprehensive reading. The content of the projects is a balanced mix of the effectiveness knowledge base and the concepts from school improvement. The projects have detailed designs, both for the implementation of school improvement and for evaluation in terms of empirical research. By means of a research component integrated into the projects right from the start, it is possible to test effectiveness hypotheses, and to evaluate improvement outcomes at the same time. The use of control groups is essential in this respect, and various projects now incorporate control groups or choose to compare
their results to norm groups on the basis of nationwide tests. Also, many projects are longitudinal in their designs. Although most integrated projects have started recently, some of them have been running for almost a decade now, and they have been disseminated to various educational contexts. Therefore, long-term effects and context-specific effects can easily be tracked by means of follow-up measurement. An additional feature of projects which last for several subsequent years is the possibility of testing the effectiveness of school improvement strategies and changing strategies whenever necessary. The Halton Effective Schools Project illustrates this feature clearly (Stoll and Fink 1996, Stoll et al. 1996). The project started in Canada in 1986 with the intention of implementing British effectiveness knowledge in Halton district schools. It soon became clear that the effectiveness knowledge base in itself would not automatically lead to changes in educational practice. Over the years, the project paid a lot of attention to questions on the processes of change in schools. It focused on the planning process in schools, the teaching and learning processes in classrooms, and staff development. Successful changes turned out to be enhanced by a collaborative school culture, a shared vision of what the school will stand for in the future, and a climate in which decisions are made in a transparent way. Based on their Halton experiences, Stoll and Fink (1996) have developed a conceptual model which links school effectiveness and school improvement through the school development planning process. The model blends the school effectiveness knowledge base with knowledge about change processes in schools. The school development planning process is at the center of the model. The process is considered to be multilayered. Two outer layers comprise invitational leadership, and continuing conditions and cultural norms.
The inner cycle layer is formed by the ongoing planning cycle of assessment, planning, implementation, and evaluation of educational processes. The two central core layers refer to a strong focus on the teaching–learning processes and the curriculum, and the students in the school. The school development planning process is influenced by the context of the school and foundations such as research findings, and, in turn, influences intermediate outcomes at teacher and school levels, as well as student outcomes. Finally, the planning process is influenced by several partners (external agencies, educational networks). In the UK, 66 percent of improvement programs now pursue goals which fit into the school effectiveness tradition of student outcomes (Reynolds et al. 1996). However, it is still not clear whether real improvements will always occur. The IQEA project (Improving the Quality of Education for All, Hopkins et al. 1994) started in 1991 as a staff development project. Gradually, a focus on classroom improvement and its effects on student achievement took over. The project does
not stop with the implementation of priorities for development, but also pays explicit attention to conditions that will sustain the changes. These are staff development, involvement, leadership, coordination, inquiry and reflection, and collaborative planning (Stoll et al. 1996). In addition, when the classroom became the center of attention, the project specified the classroom-level conditions that are necessary for effective teaching and student achievement. These are authentic relationships, rules and boundaries, teacher’s repertoire, reflection on teaching, resources and preparation, and pedagogic partnerships. Other promising projects in the UK are the Lewisham School Improvement Project and the Hammersmith and Fulham LEA Project. Both projects actively try to enhance student achievement by means of school effectiveness knowledge, and both projects are cooperating with a research institute (Reynolds et al. 1996). Recent theories about school improvement stress the self-regulation of schools. Self-regulation assumes target setting and embodies the behavioral concept of mechanisms, feedback, and reinforcement. This self-regulatory approach to school improvement can be combined with the analogous self-regulatory feedback loops in educational effectiveness, which are again further elaborated in the upward spiraling school development planning process (Stoll and Fink 1996, Hopkins 1995). The analysis of school improvement efforts over a period of time resulted in a distinction of several types of schools (effective vs. ineffective and improving vs. declining). Stoll and Fink (1998) and Hopkins et al. (1999) give descriptions of those schools and the features of the change processes going on. The declining effective school (the ‘cruising school’) has received particular attention because it points at the difficulties of keeping educational quality at the same level (Fink 1999).
See also: Educational Evaluation: Overview; Educational Innovation, Management of; Educational Institutions and Society; Educational Policy: Comparative Perspective; Educational Policy, Mechanisms for the Development of; Educational Research and School Reform; School Administration as a Field of Inquiry; School Effectiveness Research; School Management
Bibliography
Creemers B P M, Reezigt G J 1997 School effectiveness and school improvement: Sustaining links. School Effectiveness and School Improvement 8: 396–429
Fink D 1999 The attrition of change: A study of change and continuity. School Effectiveness and School Improvement 10: 269–95
Fullan M G 1991 The New Meaning of Educational Change. Teachers College Press, New York
Hargreaves D H 1995 School culture, school effectiveness and school improvement. School Effectiveness and School Improvement 6: 23–46
Hill P W 1998 Shaking the foundations: Research driven school reform. School Effectiveness and School Improvement 9: 419–36
Hopkins D 1995 Towards effective school improvement. School Effectiveness and School Improvement 6: 265–74
Hopkins D 1996 Towards a theory for school improvement. In: Gray J, Reynolds D, Fitz-Gibbon C, Jesson D (eds.) Merging Traditions: The Future of Research on School Effectiveness and School Improvement. Cassell, London, pp. 30–51
Hopkins D, Ainscow M, West M 1994 School Improvement in an Era of Change. Teachers College Press, New York
Hopkins D, Reynolds D, Gray J 1999 Moving on and moving up: Confronting the complexities of school improvement in the improving schools project. Educational Research and Evaluation 5: 22–40
Mortimore P 1991 School effectiveness research: Which way at the crossroads? School Effectiveness and School Improvement 2: 213–29
Renihan F I, Renihan P J 1989 School improvement: Second generation issues and strategies. In: Creemers B, Peters T, Reynolds D (eds.) School Effectiveness and School Improvement. Swets and Zeitlinger, Amsterdam/Lisse, pp. 365–77
Reynolds D 1996 Country reports from Australia, the United Kingdom, and the United States of America. Introduction and overview. School Effectiveness and School Improvement 7: 111–13
Reynolds D, Hopkins D, Stoll L 1993 Linking school effectiveness knowledge and school improvement practice: Towards a synergy. School Effectiveness and School Improvement 4: 37–58
Reynolds D, Sammons P, Stoll L, Barber M, Hillman J 1996 School effectiveness and school improvement in the United Kingdom. School Effectiveness and School Improvement 7: 133–58
Reynolds D, Teddlie C, Hopkins D, Stringfield S 2000 School effectiveness and school improvement. In: Teddlie C, Reynolds D (eds.) The International Handbook of School Effectiveness Research. Falmer Press, London, pp. 206–31
Scheerens J 1992 Effective Schooling: Research, Theory and Practice. Cassell, London
Stoll L, Fink D 1992 Effecting school change: The Halton approach. School Effectiveness and School Improvement 3: 19–42
Stoll L, Fink D 1994 School effectiveness and school improvement: Voices from the field. School Effectiveness and School Improvement 5: 149–78
Stoll L, Fink D 1996 Changing our Schools: Linking School Effectiveness and School Improvement. Open University Press, Buckingham, UK
Stoll L, Fink D 1998 The cruising school: The unidentified ineffective school. In: Stoll L, Myers K (eds.) No Quick Fixes: Perspectives on Schools in Difficulty. Falmer Press, London, pp. 189–206
Stoll L, Reynolds D, Creemers B, Hopkins D 1996 Merging school effectiveness and school improvement: Practical examples. In: Reynolds D, Creemers B, Bollen R, Hopkins D, Stoll L, Lagerweij N (eds.) Making Good Schools, Linking School Effectiveness and School Improvement. Routledge, London, pp. 113–47
Stringfield S 1995 Attempting to enhance students’ learning through innovative programs: The case for schools evolving into High Reliability Organizations. School Effectiveness and School Improvement 6: 67–96
Teddlie C, Roberts S P 1993 More Clearly Defining the Field: A Survey of Subtopics in School Effectiveness Research. Paper presented at the Annual Meeting of the American Educational Research Association, Atlanta, GA
Van der Werf M P C 1995 The Educational Priority Policy in The Netherlands: Content, Implementation and Outcomes. SVO, Den Haag, The Netherlands
Van Velzen W G, Miles M B, Ekholm M, Hameyer U, Robin D 1985 Making School Improvement Work: A Conceptual Guide to Practice. ACCO, Leuven, Belgium
B. P. M. Creemers
School Learning for Transfer Most of what is taught in school is expected to affect learning and performance that transcend the mastery of those subjects: one learns addition to prepare the grounds for the study of multiplication, science to understand the surrounding physical and natural world, and history to gain a sense of identity and a deeper understanding of current events. Such expectations concern the transfer of what is learned in school in one domain to other realms of learning and activity, a spillover of learning that functions like a ripple effect. Thus, the study of topic A is supposed to facilitate the study of B, make it easier, faster, or better understood relative to when B is learned without the learning of A preceding it. Two approaches to the study of transfer have dominated the field since the beginning of the twentieth century: transfer as a function of similar elements between A and B, and as a function of the understanding of general principles that are transferable to a variety of situations. The history of research on transfer shows how these two major principles underlie the development of the field, even when other basic assumptions about learning and thinking have recently been challenged.
1. Two Basic Approaches The expectation for transfer has as long a history as institutional learning itself. Plato argued that the study of abstract reasoning assists the solution of daily problems. Similarly, the debates about Talmudic and Biblical texts in ancient times were argued to ‘sharpen minds.’ The scientific study of transfer has a long history as well; it dates back to the beginning of the twentieth century with the studies by Thorndike and Woodworth (1901) in which the idea that Latin and other taxing school subjects developed one’s general ‘faculties of mind’ was challenged. The expected transfer was not found; mastery of Latin, Greek, or geometry did not facilitate either performance or efficiency in learning other subjects. Thorndike’s
findings led to his formulation of the theory of ‘identical elements’: transfer from A to B is likely to occur the more elements are common to both. It followed that transfer can take place only in the relatively rare case when there is clear and apparent similarity between the constituent elements of the learned topics. Thus, one would have to learn a host of independent issues and procedures as no transfer in the absence of identical elements should be expected. Thorndike’s theory was countered by a more cognitively-oriented and Gestalt-flavored alternative that emphasized the meaningful conceptual understanding of underlying principles which then could be thoughtfully applied to new situations. The more general the learned principle, the farther it could be transferred to new instances. Judd (1908) had his subjects learn to shoot at targets submerged in water after having learned principles of light refraction in water. Having learned these principles, subjects were much better at hitting the targets than those who had only practiced the activity. As we shall see below, the two approaches (transfer as a matter of common elements and transfer as a matter of higher order cognitive processes of understanding and abstraction), although having undergone profound modifications and developments over the years, continue to this day to dominate the field.
2. Paucity of Findings Research on transfer as well as practitioners’ expectations for transfer are characterized by an ongoing pendulum-like oscillation between the two approaches. Such oscillation is not so much the result of evidence that supports both approaches as it is a function of the paucity of findings clearly supporting either one of them. Indeed, one of the hallmarks of the field is the great discrepancy between the expectation for transfer from school learning to the acquisition of new subjects or to the handling of daily events and the findings from controlled transfer studies. The latter often do not yield the transfer findings one would have expected. A typical case is the study by Pea and Kurland (1984), who found that children learning LOGO programming showed no advantage in their ability to plan over children who did not master LOGO. Transfer, when found at all in carefully carried out studies, appears to be highly specific, as when a particular procedure or principle is applied in a new situation which bears great and apparent (often analogical) similarity to the learning material, and when the process is accompanied by such facilitation as coaching, specific guidance, and cueing (see Bransford and Schwartz 1999 and Cox 1997 for detailed reviews). Such findings would seem to support Thorndike’s theory and pessimism about transfer, as summarized by Detterman (1993): ‘In short, from
studies that claim to show transfer and that don’t show transfer, there is no evidence to contradict Thorndike’s general conclusions: Transfer is rare, and its likelihood of occurrence is directly related to the similarity between the two situations’ (p. 15).
3. Transfer on the Rebound Whereas Thorndike’s pessimism seemed to have won the day for a while, the research tradition of Judd (1908), with its emphasis on comprehension and general (nonspecific) transfer, gradually showed its strength. As research of this tradition tended to show, when relevant principles or strategies are mindfully attended to, or better yet, either abstracted by learners and/or metacognitively monitored, transfer to a variety of instances can be obtained even in the absence of apparent common elements. At the same time, other research on the transfer of skills, ostensibly still in a tradition more similar to that originated by Thorndike (Thorndike and Woodworth 1901), has also yielded positive results. Thus, for example, Singley and Anderson (1989) have shown that when training to near automaticity of discrete skill components is carried out, transfer from one task (e.g., a line editor) to another (text editor) can be virtually perfect. More field-based research has also found support for impressive transfer from the study of university-taught disciplines: Nisbett et al. (1993) have shown that the study of psychology, and also to a lesser extent medicine (but not law or chemistry), improves students’ abilities to apply statistical and methodological rules of inference to both scientific and daily problems.
4. The High and the Low Roads to Transfer
The renewed success of the two lines of research, often addressing higher order cognitions and information processing exercised by multiple activities, suggests that transfer may not be a unitary process, as the two approaches differ in important ways. Observing these differences, Salomon and Perkins (1989) have developed a theory to account for the possibility that transfer takes either one of two routes (or a combination thereof), described as the high road and the low road of transfer. The low road reflects to an extent the Thorndikian line, and more recently that of Anderson’s skill-acquisition and transfer theory. It is taken when skills, behaviors, or action tendencies are repeatedly practiced in a variety of situations until they are mastered to near-automaticity and are quite effortlessly applied to situations whose resemblance to the learning situations is apparent and easily perceived. Learning to drive and learning to read are two cases in point, as is transfer from one text editor to another as studied by Singley and Anderson (1989). Other candidates for low road transfer are attitudes,
cognitive styles, dispositions, and belief systems the application of which to new situations is rarely a mindful process. In contrast, the high road to transfer is characterized by the process of mindful abstraction of knowledge elements that afford logical abstraction: principles, rules, concepts, procedures, and the like. It is this mindfully abstracted, decontextualized idea (‘ethnic oppression may lead to revolt’) that becomes a candidate for transfer from one particular instance (the Austro-Hungarian Empire) to another (Russia and Chechnya). Simon (1980) commented in this respect that ‘To secure substantial transfer of skill acquired in the environment of one task, learners need to be made explicitly aware of these skills, abstracted from their specific task content’ (p. 82, emphasis added). There is ample evidence to support the role of the high road in obtaining transfer. One of the studies by Gick and Holyoak (1983) illustrates this point. Participants who were given two stories and were asked to write a summary of how the two stories resembled each other (that is, their common moral) showed a 91 percent transfer to the solution of another story, relative to 30 percent transfer of a non-summary group. Bassok (1990) showed that mastering algebraic abstractions (plus examples) allowed students to view physics problems as particular instances to which the more abstract algebraic operations could be applied. Physics, on the other hand, is too particular and thus students do not expect and do not recognize any possible relationship between it and algebraic operations. Research also directs attention to the role played by self-regulation and metacognitions in the process of mindful abstraction and transfer via the high road. Following the studies and review of Campione et al. (1982), Salomon et al. 
(1989) have shown that students interacting with a semi-intelligent computerized Reading Partner that provides metacognitive-like guidance tend to internalize that guidance and transfer it to new reading as well as to writing situations. The high road/low road theory sheds light on the many failures to obtain transfer in controlled studies. Close examination of such studies suggests that in many cases neither the low road of repeated practice nor the high road of mindful abstraction was taken. Not enough time is allocated to practice for the former, and not enough attention is given to mindful abstraction for the latter. As a consequence, neither near-automatic transfer on the basis of easily recognized common elements nor farther transfer on the basis of metacognitively guided mindful abstraction can be attained.
5. New Approaches
Research on transfer, until recently, did not challenge the basic paradigm and conception of transfer as a process involving change of one’s performance on a new task as a result of his or her prior performance on a preceding and different task. Such a conception of transfer has recently been challenged on the basis of new theories of learning, the role of activity, and the place of cognitions therein. These challenges can be arranged along a dimension ranging from the least to the most radical, relative to traditional notions of transfer. Bransford and Schwartz (1999), coming from a mainstream tradition of cognitive science applied to instructional issues, argue that the current ways of demonstrating transfer are more appropriate for studying only full-blown expertise. According to them, ‘There are no opportunities for … [students] to demonstrate their abilities to learn to solve new problems by seeking help from other resources such as text or colleagues or by trying things out, receiving feedback, and getting opportunities to revise’ (p. 68). They recommend replacing the typical transfer task, which they call ‘sequestered problem solving’ (SPS), by a conception of transfer as ‘preparation for future learning’ (PFL). Thus, one would measure transfer not by having students apply previously acquired knowledge or skill to new situations, demonstrating either knowledge how or knowledge that something, but rather by demonstrating knowledge with: thinking, perceiving, and judging with whatever knowledge and tools are available even if that knowledge is not consciously recalled. In other words, transfer would be expected to be demonstrated by way of ‘people’s abilities to learn new information and relate their learning to previous experiences’ (Bransford and Schwartz 1999, p. 69). An illustration of the above can be seen in a study in which college and fifth grade students did not differ in their utilization of previous knowledge and educational experiences for the creation of a statewide plan to recover the bald eagle. 
This is a typical SPS way of measuring transfer, suggesting in this case that previous educational experiences did not transfer to the handling of the new problem. However, when the students were asked to generate questions that would need to be studied in preparation for the creation of a recovery plan (applying a PFL approach), striking differences were found between the two age groups in favor of the college students. Focusing on different issues, the two groups showed how, knowingly or not, previous learning has facilitated the generation of the list of questions to be asked. The use of tools and information resources would also show how previous learning of skills and knowledge becomes actively involved in the process of solving a new problem. While Bransford and Schwartz (1999) continue to adhere to the traditional conception of learning as knowledge acquisition and to cognitions as general and transferable tools of the mind, others have developed an alternative, socio-cultural approach according to which neither the mind and its content
(e.g., arithmetic algorithms) should be treated as a neutral toolbox ready for application wherever suitable, nor should knowledge-in-use and social context of activity be taken as two independent entities. Learning is a matter of participation in a community of learners, thus knowledge is not a noun denoting possession but rather a verb denoting the active process of knowing-as-construction within a social context of activity. One underlying assumption is that learner, activity, and content become united through culturally constructed tools. A second assumption is that learning is highly situated in a particular activity context of participation, and its outcomes are ‘becoming better attuned to constraints and affordances of activity systems so that the learner’s contribution to the interaction is more successful’ (Greeno 1997, p. 12). Seen in this light, the traditional cognitivist concern with the transferability of acquired knowledge across tasks becomes, from the newer situative perspective, an issue concerned with the task and the ‘consistency or inconsistency of patterns of participatory processes across situations’ (Greeno 1997, p. 12). An important practical implication of this approach is that one would need to take into consideration the kinds of participatory activities to which school-based learning might be expected to transfer, rather than expect decontextualized school materials to transfer all on their own to other tasks, situations, and activities. Students are likely to restructure new situations to fit their previous practices and thus make the old available for transfer to the new. Illustrating this approach to learning and cognition, Saxe (1991) showed that Brazilian children’s mathematical experience as street vendors affected their math learning when later on they became enrolled in school, and vice versa. 
‘[I]n recurring activities like practices of candy selling and schooling—contexts in which knowledge is well learned and in which similar problems emerge on a repeated basis—transfer is a protracted process. In their repeated efforts to solve practice-linked problems, individuals attempt to structure problem contexts in coherent and more adequate ways’ (p. 179). The new approaches to transfer clearly deviate from the traditional views. They suggest new ways of measuring transfer, and more radically—challenge traditional assumptions by treating knowledge, skill, task, activity, and context as one participatory activity. By so doing, they emphasize the uniqueness of each activity setting. An important aspect of this approach is its ability to incorporate within the idea of participatory activity motivational and dispositional elements also. These elements have so far been neglected by much research on transfer, although they may be crucial for students’ choice to treat each situation as different, or as allowing students to exercise familiar patterns of participation. Still, it appears that despite these important novelties, the principles of transfer as a function of similarity between situations (or activities) and of depth of
understanding general principles (e.g., of useful participation in a community of learners) have not yet been replaced but only redefined.
See also: Competencies and Key Competencies: Educational Perspective; Learning Theories and Educational Paradigms; Learning to Learn; Metacognitive Development: Educational Implications; School Effectiveness Research; Situated Learning: Out of School and in the Classroom; Thorndike, Edward Lee (1874–1949); Transfer of Learning, Cognitive Psychology of
Bibliography
Bassok M 1990 Transfer to domain-specific problem solving procedures. Journal of Experimental Psychology: Learning Memory and Cognition 6: 522–33
Bransford J, Schwartz D 1999 Rethinking transfer: A simple proposal with multiple implications. Review of Educational Research 24: 61–100
Campione J C, Brown A L, Ferrara R A 1982 Mental retardation and intelligence: Contributions from research with retarded children. Intelligence 2: 279–304
Cox B D 1997 A rediscovery of the active learner in adaptive contexts: A developmental-historical analysis of transfer of training. Educational Psychologist 32: 41–55
Detterman D K 1993 The case for prosecution: Transfer as an epiphenomenon. In: Detterman D K, Sternberg R J (eds.) Transfer on Trial: Intelligence, Cognition and Instruction. Ablex, Norwood, NJ
Gick M L, Holyoak K J 1983 Schema induction and analogical transfer. Cognitive Psychology 15: 1–38
Greeno J G 1997 Response: On claims that answer the wrong questions. Educational Researcher 26: 5–17
Judd C H 1908 The relation of special training to general intelligence. Educational Review 36: 28–42
Nisbett R E, Fong G T, Lehman D R, Cheng P W 1993 Teaching reasoning. In: Nisbett R E (ed.) Rules for Reasoning. Erlbaum, Hillsdale, NJ
Pea R D, Kurland D M 1984 On the cognitive effects of learning computer programming. New Ideas in Psychology 2: 37–68
Salomon G, Globerson T, Guterman E 1989 The computer as a zone of proximal development: Internalizing reading-related metacognitions from a Reading Partner. Journal of Educational Psychology 81: 620–7
Salomon G, Perkins D N 1989 Rocky roads to transfer: Rethinking mechanisms of a neglected phenomenon. Educational Psychology 24: 113–42
Saxe G B 1991 Culture and Cognitive Development: Studies in Mathematical Understanding. Erlbaum, Hillsdale, NJ
Simon H A 1980 Problem solving and education. In: Tuma D T, Reif R (eds.) Problem Solving and Education: Issues in Teaching and Research. Erlbaum, Hillsdale, NJ
Singley M K, Anderson J R 1989 The Transfer of Cognitive Skill. Harvard University Press, Cambridge, MA
Thorndike E L, Woodworth R S 1901 The influence of improvement in one mental function upon the efficacy of other functions. Psychological Review 8: 247–61
G. Salomon
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
School Management
School management is in considerable disarray, if not turmoil, and is likely to remain so in the first decade of the twenty-first century. This is partly a reflection of sweeping transformations in society, mirrored in developments in education generally and in schools in particular. It also reflects failure to effect a powerful link between school management as a field of study and school management as a field of practice, so that each informs and influences the other. More fundamental is the concern that, in each instance, management is not connected to learning as well as it should be.
1. Definition of School Management
Disarray is evident in the matter of definition. In some nations, notably the USA, administration is used to describe a range of functions performed by the most senior people in an organization, with management a more limited, sometimes routine set of tasks, often performed by those in supporting roles. A conceptual difficulty relates to the distinction between leadership and management. The work of Kotter (1990) is helpful in resolving the issue. Leadership is concerned with change and entails the three broad processes of establishing direction, aligning people, and motivating and inspiring. Management is concerned with the achievement of outcomes, and the three broad processes of planning and budgeting, organizing and staffing, and controlling and problem-solving. These difficulties are reflected in the role ambiguity of those who hold senior positions in schools, notably principals or head teachers. There is a generally held view that their work entails both leadership and management. They should certainly be leaders. In some instances managers will support them. There is concern when they act solely as managers. Kotter (1990) offers a contingent view that is as helpful for schools as it is for organizations in general. He contends that the emphasis and balance in leadership and management are contingent on two variables: the amount of change that is required and the complexity of the operation. Complex schools in a turbulent environment require high levels of leadership and management. Schools that are relatively simple but are faced with major change require leadership more than management. Schools that are complex in stable circumstances require management more than leadership. Simple schools in stable settings may require little of each.
2. School Management as a Field of Practice
There are three important developments in school management that are evident in almost every nation (Caldwell and Spinks 1998). Schools and systems of schools vary in the extent to which each has unfolded. The first is the shift of authority and responsibility to schools in systems of public education. Centrally determined frameworks of curriculum, standards, and accountabilities are still in force but schools have a considerable degree of freedom in how they will meet expectations. This development is known as school-based management or local management or self-management, and is frequently accompanied by the creation of school councils, school boards, or other structures for school-based decision-making. Comprehensive reform along these lines is most evident in Australia, the UK, Canada, New Zealand, and the USA, but most other nations have begun the process or are planning it. Reasons for decentralization vary but are generally consistent with the changing view of the role of government that gathered momentum in the final decade of the twentieth century. As applied to education, this view holds that governments should be mainly concerned with setting direction, providing resources, and holding schools to account for outcomes. They should also provide schools with the authority and responsibility to determine the particular ways they will meet the needs of their students, within a centrally determined framework. The second development arises from higher expectations for schools as governments realise that the knowledge and skill of their people are now the chief resource if nations are to succeed in a global economy. There is recognition of the high social cost of failure and the high economic cost of providing a world-class system of education. As a result, there is unprecedented concern for outcomes, and governments have worked individually and in concert to introduce systems of testing at various stages of primary and secondary schooling to monitor outcomes and provide a basis for target setting in bringing about improvement. 
In several countries, notably the UK and some parts of the USA, rankings of schools are published in newspapers. Comparisons of national performance are now possible through projects such as the Third International Mathematics and Science Study (Martin and Kelly 1996). The third development is change in the nature of schooling itself. Driven to a large extent by advances in information and communications technology, much of the learning that in the past could only occur in the classroom can now occur at anytime and anywhere there is access to a networked computer. Students and their teachers have access to a vast amount of information. Interactive multi-media have enriched and individualized learning on a remarkable scale. This development is not uniform across all schools or across all nations, and disparities are a matter of concern. There is growing anxiety on the part of government that structures for the delivery of public education that
have survived for a century or more may no longer be adequate for the task. Charter schools made their appearance in Canada and the USA in the 1990s, and the number increased very rapidly, even though they are still a small fraction of the total number of schools. Charter schools receive funding from the public purse but are otherwise independent, operating outside the constraints of a school district. A schools-for-profit movement gathered momentum around the same time, with public funding enhanced by private investment through the offering of shares on the stock exchange, as instanced by the pioneering Edison Schools. In the UK, the Blair Government privatized the support services of several local education authorities and established a framework for public–private partnerships in the management of publicly owned schools. The number of private schools is increasing in most countries, reflecting growing affluence and loss of faith in the public sector. A contentious policy, gaining ground in a number of countries, is the provision of public funds (often described as ‘vouchers’) allowing parents to access private schools when public schools do not meet expectations. Taken together, these developments have created conditions that call for high levels of leadership and management at the school level. Schools are more complex and are experiencing greater change than ever before. A particular challenge is to prepare, select, place, appraise, and reward those who serve, and there is growing concern in some countries about the quality and quantity of those who seek appointment. Some governments have created or are planning new institutions to address this concern, illustrated by the establishment in England in 2000 of the National College for School Leadership. 
There is evidence of a loss of faith in the university as an institution to shape and inform approaches to school management, and to prepare those who will serve as leaders and managers in the future.
3. School Management as a Field of Study
School management has emerged relatively recently as a sub-field of study within the broader field of educational management (administration). There were few formal programs and little research on the phenomenon until the 1950s. Prior to that, approaches to the management of schools paralleled those in industry. Indeed, it was generally recognized that management in education was largely a mirror image of industry, with the creation of large centralized and bureaucratized systems of public education. The knowledge base was drawn from the industrial sector. Departments of educational administration made their appearance in universities in the US in the mid-twentieth century and soon proliferated. They were
established in the UK soon after, with many other nations following suit, drawing on the US and, to a lesser extent, the UK for their intellectual underpinnings. The first quarter-century, from the early 1950s to about the mid-1970s, was characterized by efforts to build theory in educational management, modeling to a large degree the discipline approach of the behavioral sciences. In the USA, Educational Administration Quarterly was an attempt to provide for education what Administrative Science Quarterly has done for administration (management). In the UK, a seminal work edited by two pioneers in the field had the title Educational Administration and the Social Sciences (Baron and Taylor 1969). Separate sub-disciplines were created in fields such as economics, finance, governance, human relations, industrial relations, law, planning, policy, and politics. Large numbers of students enrolled, and departments of educational administration became a major source of revenue for schools of education. The theory movement in educational management was the subject of powerful attack in the mid-1970s, notably by Canadian scholar T. B. Greenfield (1975). However, it was not until the late 1980s that a systematic response was mounted by the University Council for Educational Administration in the 1987 report of the National Commission on Excellence in Educational Administration (NCEE 1987). The critique was largely directed at efforts to build a scientific theory of educational management; the separation of management from learning and teaching; and fundamental considerations of context, culture, ethics, meaning, and values. Blueprints for reform were drawn up but the field was still in ferment at the turn of the twenty-first century (see Murphy and Forsyth (1999) for a review of efforts to reform the field). 
Evers and Lakomski have developed a new conceptual framework (‘naturalistic coherentism’) in an effort to bring coherence to the fields of educational administration and management. Lakomski (2001) argues that leadership ought to be replaced by better conceptions of organizational learning. On a more pragmatic level, the critique of educational management as a field of study lies in its failure to impact practice on a large scale, and to assist in a timely manner the implementation of reform and the resolution of concerns. A promising approach to unifying the field is to place learning at the heart of the effort. This means that policy interest in the improvement of student outcomes should be the driver of this ‘quest for a center,’ as US scholar Joseph Murphy (1999) has described it. Murphy classified the different stages of development in the ‘profession’ of educational administration, as summarized in Table 1.

Table 1
Rethinking the center for the profession of educational administration (Murphy 1999)

Time frame                     Center of gravity    Foundation                Engine
1820–1900 Ideology             Philosophy           Religion                  Values
1901–1945 Prescription         Management           Administrative functions  Practical knowledge
1946–1985 Behavioral science   Social sciences      Academic disciplines      Theoretical knowledge
1986– Dialectic                School improvement   Education                 Applied knowledge

Much momentum was gained in the 1990s with the establishment of the International Congress for School Effectiveness and Improvement (ICSEI), which made its mission the bringing together of policy-makers, practitioners, and researchers. The periodical School Effectiveness and School Improvement quickly established itself as a leading international journal. The integration of policy, practice, and research to achieve school improvement is becoming increasingly evident, illustrated in approaches to literacy in the early elementary (primary) years. The examples of Australia and the UK are interesting in this regard. Levels of literacy were considered by governments in both nations to be unacceptably low. Implementation of approaches in the Early Literacy Research Project led to dramatic improvement in the state of Victoria, Australia. Concepts such as ‘whole school design’ emerged from this integration. Hill and Crévola (1999) formulated a ‘general design for improving learning outcomes’ that proved helpful in parts of Australia and the United States. It has nine elements: beliefs and understanding; leadership and coordination; standards and targets; monitoring and assessment; classroom teaching strategies; professional learning teams; school and classroom organization; intervention and special assistance; and home, school, and community partnerships. The implications are clear as far as educational management is concerned: there must be high levels of knowledge about learning and teaching and how to create designs and formulate strategies that bring about improvement. This contributes to a particularly demanding role where decentralization has occurred, as is increasingly the case in most nations.
4. The Future of School Management
The picture that emerged at the dawn of the twenty-first century is one in which school management does not stand in isolation. Leadership and management, as conceptualized by Kotter (1990), are both important, and their focus must be on learning, integrated in the notion of creating a coherent and comprehensive whole-school design. Those who serve in positions of leadership are operating in a milieu of high community expectations in which knowledge and skill are the chief determinants of a nation’s success in a global economy. Rapidly increasing costs of achieving expectations mean a capacity to work in new arrangements, including innovative public–private partnerships. High expectations mean a commitment to core values such as access, equity, and choice. All of this unfolds in an environment in which the nature of schooling is undergoing fundamental change. There are major implications for universities and other providers of programs for the preparation and professional development of school leaders. An interesting development in Victoria (Australia) and the UK is that governments are turning to the private sector to specify requirements and offer a major part of training programs. Governments in both places commissioned the Hay Group to assist in this regard. The target population in the UK is serving principals, numbering about 25,000. The program has been very highly rated by both principals and employers. The most recent contribution of the Hay Group has been the specification for Victoria of 13 competencies and capabilities for school leaders. The focus is school improvement and there are four components: (a) driving school improvement (passion for teaching and learning, taking initiative, achieving focus); (b) delivering through people (leading the school community, holding people accountable, supporting others, maximizing school capability); (c) building commitment (contextual know-how, management of self, influencing others); and (d) creating an educational vision (analytic thinking, big picture thinking, gathering information). These developments suggest that training will be a shared responsibility in the future, with a limited role for universities unless they can form effective partnerships with the profession itself and work in association with private providers. Universities will have a crucial role to play, especially in research, but again that is likely to be in a range of strategic alliances if it is to be valued. 
The role of the university in philosophy and critique of school management will be more highly valued to the extent that the disjunction is minimized (between study and practice and between management and learning). A fundamental issue that awaits resolution is the extent to which those who work in school management are not weighed down by the high expectations. It was noted at the outset that applications for appointment are declining in quality and quantity. There is even a
concern in well-funded schools in the private sector. Part of the solution is to infuse leadership and management throughout the school and generally build the capacity of all to work in the new environment. The concept of ‘knowledge management’ is likely to become more important, so that accounting for the resources of a school will extend beyond financial and social capital to include intellectual capital in the form of the knowledge and skill of staff. This may go only part of the way, given that schools are in many respects still operating with a design for an earlier era. Innovation in design must extend to the creation of a work place that is engaging for all who work within it. What should be retained and what should be abandoned in current designs is the challenge for the first decade of the twenty-first century. Drucker’s concept of ‘organized abandonment’ (Drucker 1999) is as important for schools as it is for other organizations.
See also: Educational Innovation, Management of; Management: General
Bibliography
Baron G, Taylor W (eds.) 1969 Educational Administration and the Social Sciences. Athlone, London
Caldwell B J, Spinks J M 1998 Beyond the Self-managing School. Falmer, London
Drucker P F 1999 Management Challenges for the 21st Century. Butterworth Heinemann, Oxford
Greenfield T B 1975 Theory about organizations: A new perspective and implications for schools. In: Hughes M G (ed.) Administering Education: International Challenge. Athlone, London, pp. 71–99
Hill P W, Crévola C A 1999 The role of standards in educational reform for the 21st century. In: Marsh D D (ed.) Preparing our Schools for the 21st Century. ASCD Yearbook 1999. Association for Supervision and Curriculum Development, Alexandria, VA
Kotter J P 1990 A Force for Change: How Leadership Differs from Management. Free Press, New York
Lakomski G 2001 Management Without Leadership: A Naturalistic Approach Towards Improving Organizational Practice. Pergamon/Elsevier, Oxford, UK
Martin M O, Kelly D L (eds.) 1996 Third International Mathematics and Science Study. Technical Report. Vol. 1: Design and Development. Boston College, Chestnut Hill, MA
Murphy J 1999 The quest for a center: Notes on the state of the profession of educational leadership. Invited paper at the Annual Meeting of the American Educational Research Association, Montreal, April
Murphy J, Forsyth P (eds.) 1999 Educational Administration: A Decade of Reform. Corwin Press, Newbury Park, CA
National Commission on Excellence in Educational Administration 1987 Leaders for America’s Schools. University Council for Educational Administration, Tempe, AZ
B. J. Caldwell
School Outcomes: Cognitive Function, Achievements, Social Skills, and Values
Achievement in US schools is as high as it was in the mid-1970s, despite the fact that increasingly more poor and minority students are remaining in schools longer than ever. Unlike in the early 1960s, when researchers debated whether there was any impact of schooling on students’ learning, it is now accepted that schools (and especially teachers) impact student achievement. Classroom learning and school achievement broadly defined continue to grow in important ways (see, for example, Biddle et al. 1997). However, much of the recent scholarship on classroom processes has been conceptual in nature, and the associated empirical research (if any) typically involves only a few classrooms. There is considerable advocacy for certain types of classroom environments (environments that are notably different from traditional instruction), but solid empirical support to show the effects of new instructional models is largely absent. Here we discuss the need for new school-based research that addresses curriculum and instructional issues in order to advance theoretical understanding of student learning in school settings. We argue that it is no longer possible to discuss ‘successful schooling’ on the basis of subject-matter outcomes alone. Effective schools research and policy discussions must also include the measurement of non-subject-matter outcomes of schooling.
1. Status of Achievement in Schools The state of achievement in US schools is widely debated and incorporates at least three major camps. The first contends that current school performance is lower than that of US youth from the mid-1970s and that of international contemporaries. Indeed, several publications supported by the federal government have corroborated the assertion that US schools are failing, including Nation at Risk (National Commission for Excellence in Education 1983) and Prisoners of Time (National Education Commission on Time and Learning 1994). A second camp argues that students’ performance today is as good as it was in the mid-1970s and that international comparisons are uninterpretable (Berliner and Biddle 1995). These and other researchers note that the stability of achievement has been maintained at a time when US schools are increasingly diverse and serving more minority students and those from poor homes than ever before. This camp also: (a) notes that reports of average school achievement are often highly misleading, because students’ performance varies markedly between schools; and (b) has concerns about the performance of minority students.
Importantly, there is documentable evidence not only of the stability of student achievement over time, but also of real progress. For example, in less than a decade, the number of students, including minority students, taking advanced placement tests in high schools for college credit doubled such that one-fifth of US students in 1996 entered college with credits earned from testing (Good and Braden 2000). A third camp agrees with the second position, but asserts that traditional achievement standards are no longer applicable to a changing society—schools must address different, higher subject-matter standards. Although we accept the premise that some curriculum changes are probably necessary to accommodate the knowledge needs of a changing society, we have two major problems with advocates for new, higher standards. First, there has been no specification of what these new skills are and, typically, if examples are even provided, the rhetoric of higher standards essentially translates into moving content normally taught later (college calculus) to appear sooner (in high school). This position is specious, because it lacks sound theoretical and empirical support. Second, extant data suggest that achievement of students in states with ‘higher standards’ is lower than that of students in states with purported lower standards (Camilli and Firestone 1999)! We align our arguments with advocates of the second camp, and we note that there is extensive evidence to support this position—student achievement scores are stable in the face of growing student diversity (Berliner and Biddle 1995, Good and Braden 2000). However, despite strong empirical data to show that student performance has not declined, we believe there is much room for improvement. 
For example, schools in urban settings need to improve student subject-matter performance, and all schools need to further enhance student non-subject-matter achievement. The academic accomplishments of students in today’s schools are difficult to understand for various reasons. Here, we review several reasons why the achievement of US students is problematic.
2. Which Test is Appropriate? For one example of the difficulty of using extant assessment instruments to compare students within the USA or across cultures, one only need note the discrepancy between the measures. For example, as Bracey and Resnick (1998) report, in 1996 the National Assessment of Educational Progress (NAEP) mathematics performance indicated that only 18 percent of US fourth graders were proficient, only 2 percent were advanced, and 32 percent were below basic. In contrast, US fourth graders performed well above the average scores of the 26 countries that participated in the international TIMSS study at the fourth-grade
level. Which test is the best descriptor? Are US students above average in fourth-grade mathematics, or not? This issue of ‘which test’ is embedded in US testing, as well. For example, in New York state there has been a raging debate about the appropriateness of a state-mandated test (i.e., in relation to other measures, such as advanced placement tests).
3. Problems with Curriculum, Theory, and Test Alignment Other factors also limit efforts to relate the implemented curriculum with student achievement. For example, in mathematics, a current ‘theoretical’ recommendation is to avoid instruction that is a mile wide and an inch deep (i.e., in terms of the breadth and depth of curriculum concepts covered). Those who advise teachers about this problem apparently do not worry about the opposite problem—creating a curriculum that is four inches wide and a mile deep! Aligning standardized achievement tests with curriculum implementation is vital for meaningful interpretation of student scores. Appropriately, achievement tests should assess the philosophical orientation of the curriculum. Then, other tests and research can be used to assess the utility of a curriculum for various purposes (i.e., do various curricula have different effects on long-term memory of the curriculum taught or on students’ ability to transfer ideas learned in a curriculum to solve new problems?). Finally, curriculum writers often do not understand the learning theory that undergirds particular instructional practices or goals. Some curriculum theorists do not appear to understand that behavioral models are powerful when dealing with factual and conceptual knowledge (and constructivist models are exceedingly weak in this area). Also, they often fail to understand that information-processing models (cognitive science) are more powerful than behavioral or constructivist principles at addressing wide horizontal transfer of academic concepts and authentic problem solving. In addition, in applying constructivist principles, many curriculum experts do not even recognize the differences among constructivist theories. Despite the difficulties associated with assessment, instruments, and the curricula, and instructional alignment with achievement outcomes, there are data to suggest that teachers and schools can make a difference in student achievement.
4. Effective Teachers The question that was problematic in the mid-1970s—‘Do teachers make a difference in student learning?’—can now be answered with a definite ‘yes’ (Weinert and Helmke 1995). There are clear data to
illustrate that some teachers notably outperform other teachers in helping similar students to learn the type of material historically measured on standardized achievement tests. It is not only possible to identify effective teachers on this useful but limited measure of student learning (e.g., see Good et al. 1983), but also research has illustrated that the practices of ‘successful’ teachers could be taught to other teachers. As one instance of this research base, the Missouri Mathematics Program was successful in improving students’ performance in mathematics. However, a series of research studies in this program found that teachers who were involved in the training program implemented the intended program unevenly. Also, not surprisingly, higher levels of program implementation were associated with higher levels of student mathematical performance. Importantly, some of the ‘resistance’ to implementing the Missouri Mathematics Program was due to teachers’ beliefs about mathematics and how it should be taught (Good et al. 1983). Teachers’ beliefs are a key, but often overlooked, point in school reform. And, as Randi and Corno (1997) found, failure to take teachers’ ideas into account lowers the impact of many reform interventions. Many aspects of teaching and learning are still problematic. Although there are numerous plausible case examples of teachers who impact students’ ability to think and to apply knowledge, there is still debate about how teachers obtain these ‘higher order’ outcomes and whether other teachers can be educated in ways that allow them to achieve comparable effects on their students. Unfortunately, research on teachers’ roles in helping students to develop thinking and problem-solving skills has in recent times been limited to case-study research. 
Further, research that helps teachers to expand—and assess—their capacity for impacting students’ thinking abilities has been inert.
5. Effective Schools There is some evidence that some schools have more effect on student achievement (as measured by conventional standardized achievement tests) than do other schools serving similar populations of students (e.g., Good and Weinstein 1986). However, unlike the correlational and experimental research on teacher effects, the early research base describing ‘effective schools’ was sparse and questionable. Teddlie and Stringfield’s (1993) longitudinal study in Louisiana has provided strong evidence that schools make a difference. They found that schools serving similar populations of students had different ‘climates’ and that these school differences were associated with differential student achievement. In general, the study confirmed many of the arguments that school effects researchers had made previously. Interestingly, in their
eight-year study, about one-half of the schools retained their relative effectiveness during the time period (stability rates were similar for schools that had initially been defined as effective or ineffective). More work on this issue is warranted. There still have not been successful and replicated studies to show that factors associated with ‘effective’ schools can be implemented in other schools in ways that enhance student achievement. Although there are notable examples that schools can be transformed, there is comparatively little research to see if this knowledge can be used to improve other schools.
6. Comprehensive School Programs In recent years, components of previous teacher and school effects research and emerging ideas (e.g., cooperative student groups) have been combined in ‘comprehensive school programs’ designed to transform all aspects of schooling at the same time (including governance, structure, instruction, home schooling, communication, curriculum, and evaluation!). Although these broad interventions offer potential, it should be noted that the theoretical assumptions and congruence of these programs have not been assessed. Some programs appear to have a serious misalignment between program components. For example, as McCaslin and Good (1992) noted, some schools emphasize a behavioral approach to classroom management (‘do as you are told’) and a constructivist approach to the curriculum (‘think and argue your conception’). Recent school intervention efforts have focused upon the impact of broad programs on student achievement. Typically, research includes no observational data, and thus there is no evidence of what parts of the program are implemented and/or whether the program parts that are presented represent a theoretically integrated program or a ‘patchwork quilt’ of separate and conflicting program parts. Further, the more complex and comprehensive the intervention model, the more likely teachers will alter program parts (often permanently). These ‘teachable’ moments go unnoticed by reformers. Perhaps unsurprisingly, there is growing consensus that such programs have not had consistent, demonstrable effects on students’ achievement (Slavin 1999).
7. Little Support for New Research In the past decade, policy leaders have tended to agree with advocates of camp one about the effectiveness of schools. Those in camp one (and who believe that schools are failing radically) have been able to convince policymakers to invest widely in charter and voucher plans, even in the absence of any convincing
pilot data. To these advocates, both the problem and solution are evident. Those in the second camp (stable student achievement over time) have been ‘represented’ by researchers attempting to implement comprehensive school reform. Unfortunately, as noted above, such research attempts fail to show the effects of program implementation. This group of researchers does not question the knowledge base sufficiently and hence fails to develop research designs that could improve its conceptions of ‘good practice.’ Those who fall into the third camp (new societal requirements mandating new processes and outcomes of schooling) are prepared to reject extant theory and research on the basis that it is outdated and irrelevant. Indeed, it is not only that monolithic state-mandated standardized tests have gotten in the way of scientific research, but also that there is a growing willingness of professional groups to advocate desirable teaching processes on theoretical grounds alone. Prestigious groups, such as the National Council of Teachers of Mathematics (NCTM), have encouraged curriculum development without advocacy for research on instruction and have encouraged teachers to teach less and to use more student group work. Although it is perhaps the case that at an aggregated level, teachers as a group should teach less, the real issue is about when teachers should teach and when students should explore alone or with other students in small or large groups.
8. Controllable and Uncontrollable Influences on Student Learning It has become better understood in the last 25 years of the twentieth century that schools can influence students’ learning independent of home background. However, it is also better understood that the effects of schools are mediated by other factors. Bracey and Resnick (1998) argue that many factors impacting students are within the control of the school district, including: aligning textbooks with standards; aligning the broader curriculum with standards; employing high-quality teachers and providing them with appropriate professional development; and having modern textbooks and laboratory equipment. Although these factors are controllable in theory, it should be noted that some school districts have considerably fewer resources to work with than do other districts. It seems unconscionable that all schools do not receive these controllable resources, which are known to impact achievement. Bracey and Resnick (1998) have argued that there are various out-of-school factors that can influence student achievement, including: teenage pregnancy rates; percentage of female-headed households; poverty rate; parental and maternal educational levels; the percentage of students for whom English is not their native language; number of violent incidents per
year; and number of annual policy visits/disciplinary actions that occur at particular schools. These opportunity-to-learn variables are seen as conditions that impact student achievement but are largely beyond the direct influence of the school system. Accordingly, various groups have noted that achievement problems of students from low-income families need to be addressed by other social agencies. Clearly, poor achievement in schools is often outside the control of individual schools, but not outside the influence of the broader society. If a society chooses to hold individual schools ‘accountable’ for achievement, it is incumbent on that society to spread its wealth in ways that address factors that individual schools cannot address.
9. Improving Performance: Aligning Theory and Research Massive investments of public funds in student testing have occurred over the last two decades of the twentieth century and, as noted, some policymakers have called for even more testing and for raising extant standards. The research for higher standards has led to the development of tests in some states that are designed, absurdly, so that most students fail them. Oddly, despite this massive investment in testing, no information has been obtained to describe adequately what students learn in schools and how conditions can be changed to further enhance student learning. We believe that school reform efforts need to bring theory and empirical research together. It is time to stop advocacy from prestigious groups that do not support their reform calls in the form of theory and data. There is a growing sense of the importance of instructional balance in order to scaffold the learning needs of students in complex classroom settings (Good and Brophy 2000, Grouws and Cebulla 2000). However, these assertions demand more research examination in various contexts. In contrast, this advice is currently not echoed by professional groups. These groups continue to overemphasize (in our opinion) problem solving and application without concomitant attention to the development of conceptual knowledge. The ability to solve problems also requires the ability to recognize and find problems as well as the computational and conceptual skills necessary for addressing problems. Simply put, there can be too little or too much attention to concepts, facts, or problem solving.
10. Non-subject-matter Outcomes of Schooling Americans have always argued that their students should not simply be ‘academic nerds,’ but that students should also be participants in broader society (have jobs, participate in drama or sports, etc.).
Increasingly, parents and citizens have expressed interest in schools being more responsive to the non-subject-matter needs of youth. Often, this argument is expressed only in terms of improving achievement—if children feel secure, they will achieve better. But increasingly, the assertion is made that some outcomes of schooling are important whether or not they impact subject-matter learning. Despite the considerable advocacy of policy leaders for higher standards, many citizens are pushing hard for students to learn responsibility. For example, when asked what first comes to their minds when Americans think about today’s teenagers, 67 percent described them as ‘rude, irresponsible, and wild,’ and 41 percent reported that they did not think teens had enough to do and that they were not being taught the value of hard work (schools are not doing their ‘job’) (Farkas et al. 1997). This idea is echoed in the 1999 Phi Delta Kappan/Gallup poll, where 43 percent of parents reported that the main emphasis of schools should be on students’ academic skills, and 47 percent of parents said they believe that the main emphasis of schools should be helping students to learn to take responsibility. And earlier in a 1996 Phi Delta Kappan/Gallup poll, parents indicated they would prefer their children to receive bad grades and be active in school activities rather than making an excellent grade and not participating in extracurricular activities (60 percent to 20 percent). It is clear that parents and citizens, reacting to a perception of adolescents’ moral decline, believe that schools should play a greater role in teaching non-subject-matter content, such as responsibility and civic behavior. Although citizens want schools to provide students with more non-subject-matter knowledge, there are problems with the systematic addition of this ‘content’ to the curriculum. 
One problematic issue is the difficulty in finding consensus on what a moral curriculum would look like. Subsequently, what assessment criteria should be used to measure the curriculum’s effectiveness? Is the curriculum successful if more students vote? Or is it effective if students are able to negotiate conflicts calmly? As we have argued, achievement tests do not always represent adequately the curricula they are supposed to measure. Therefore, it is reasonable to think that appropriate accountability and assessment tools would be difficult to implement for non-subject-matter content—at least without adequate empirical research. Policymakers, however, are not only supporting past emphases of the impact of schools on achievement, but are pressuring for higher and higher standards. In some instances, the standards have been pushed so high (and artificially high) that in Virginia and in Arizona over 90 percent of students failed exams that presumably represented efforts by state government to improve standards. At some level, increasing standards may be a laudable goal; however, developing standards that only 10 percent of our
students pass seems to be a ridiculous and self-defeating effort. Given the exaggerated, negative views of youth, and testing policies designed to guarantee youth’s failure, it seems important to explore motivation. Are such practices designed to suggest that youth, especially minority youth, do not deserve resources in their schools?
11. Educators on Noncognitive Factors Rothstein (2000) has noted that, in part, achievement measures are reified in making decisions about the effectiveness of schooling, because there are many of these measures, but few ways to measure noncognitive outcomes. He noted that there are 10 core outcomes that motivate citizens to invest in schools so that students can contribute to society and lead productive adult lives. The dimensions he identifies are literacy, competencies in mathematics, science, and technology, problem solving, foreign languages, history knowledge, responsible democratic citizenship, interest in creative arts, sound wellness practices, teamwork and social ethics, and equity (narrowing the gap between minority and white students in the other nine areas). Wanlass (2000) has drawn attention to the numerous problems in contemporary society, including the effects of a global economy, increased ethnic diversity, advanced technology, and increasing risk for various problems such as intolerance (hate crimes), overpopulation, and a dramatic depletion of natural resources. In this context, she argued a need to help youth develop the affective dispositions required for the modern world (leadership, tolerance, philanthropic concern). She suggested several strategies, capitalizing on the unique strengths and talents of more students and the need to better develop the unique abilities of nonacademic areas as well as academic concerns (performing arts, social interaction, leadership, civic-mindedness, social advocacy, etc.). Goodenow (1992) has argued that students who feel more connected to their schools are more likely to be academically and socially successful. Students who felt more personally valued and invested in were more likely to place higher value and have higher expectations for classroom success than were students who did not feel valued. 
This finding was replicated in a large-scale study (Blum and Rinehart 1996), which surveyed more than 90,000 students nationwide and found that those who felt more connected to their schools were less likely to engage in risky behavior, including: smoking, doing drugs, engaging in sexual activity, and repeated truancy, and they were less likely to drop out of school. In this study, school connectedness was defined as students who: (a) felt they were treated fairly, (b) reported they got along with other teachers and students, and (c) felt close to people at school. Connectedness was also explored in
terms of how schools were perceived by their teachers and administrators. A school that encouraged more connectedness exhibited less student prejudice and had higher daily attendance rates, lower dropout rates, and more teachers employed with master’s degrees. Although these data are correlational in nature, they suggest that some factors thought to be uncontrollable may in fact be controllable. Researchers have also investigated the role of non-subject-matter variables as they influence students’ motivation to achieve. For example, Nichols (1997) examined the nature of the relationship between students’ perceptions of fairness and their reported levels of motivation to achieve. Of the four dimensions of fairness identified through factor analytic techniques (personal, rules, punishment, teacher), a personal sense of fairness more strongly mediated motivational variables than did other dimensions. Similar to Goodenow (1992), this data set indicated that students who felt more personally valued were more likely to be motivated to achieve in school settings. However, even if students’ personal sense of fairness was not linked directly to an achievement motivation, it would seem that this is a desirable outcome of schooling. Students should feel respected and safe, whether or not these variables influence achievement.
12. Conclusion Clearly, more public debate is called for, if more focused judgments are to be reached concerning those aspects of schools that are defined as most critical. However, given the growing evidence that citizens are concerned about non-subject-matter outcomes of schooling, it seems incumbent on policymakers to ‘proxy’ these concerns by collecting data on some of the many outcomes that could be measured. There is a need for research and arguments focused on non-subject-matter variables, as well. For example, it seems that high subject-matter specialization and excessively high levels of student obedience (as opposed to student initiatives) are not a prescription for forging a professional community well prepared for engaging in creative work and problem solving. A recent poll of teenagers found that the three greatest pressures they experience are to get good grades (44 percent), to get into college (32 percent), and to fit in socially. In contrast, fewer teens report feeling pressure to use drugs or alcohol (19 percent) or to be sexually active (13 percent). What modern youth are concerned about varies markedly from the view presented by the media. Unfortunately, policymakers at present seem more intent on blaming youth and holding them accountable than on understanding them. The collection of reliable data on various non-subject-matter outcomes of schooling might help to assure that youth are developing the proactive
skills necessary for life in a democracy (e.g., civility, a willingness to examine ideas critically, and ambition). The historical concern for evidence about subject-matter growth should continue—particularly if these measures are collected in ways that provide a basis for curriculum improvement. However, it is important to recognize that US public schools are about much more than subject-matter acquisition. More research on students and how they mediate social, emotional, and academic contexts might enable educators to design better school programs that recognize students both as academic learners and as social beings (McCaslin and Good 1992). See also: Curriculum as a Field of Educational Study; Educational Assessment: Major Developments; Educational Evaluation: Overview; Educational Research for Educational Practice; School Effectiveness Research; Teacher Behavior and Student Outcomes
Bibliography Berliner D, Biddle B 1995 The Manufactured Crisis: Myth, Fraud and the Attack on America’s Public Schools. Addison-Wesley, New York Biddle B, Good T, Goodson I 1997 Teachers: The International Handbook of Teachers and Teaching. Kluwer, Dordrecht, The Netherlands, Vols. 1 and 2 Blum R W, Rinehart P M 1996 Reducing the Risk: Connections That Make a Difference in the Lives of Youth. University of Minnesota, Division of General Pediatrics and Adolescent Health, Minneapolis, MN Bracey G W, Resnick M A 1998 Raising the Bar: A School Board Primer on Student Achievement. National School Boards Association, Alexandria, VA Camilli G, Firestone W 1999 Values and state ratings: An examination of the state-by-state indicators in quality counts. Educational Measurement: Issues and Practice 35: 17–25 Farkas S, Johnson J, Duffett A, Bers A 1997 Kids These Days: What Americans Really Think About The Next Generation. Public Agenda, New York Good T L, Braden J 2000 The Great School Debate: Choice, Vouchers, and Charters. Erlbaum, Mahwah, NJ Good T L, Brophy J 2000 Looking in Classrooms, 8th edn. Longman, New York Good T L, Weinstein R 1986 Schools make a difference: Evidence, criticisms, and new directions. American Psychologist 41: 1090–7 Good T L, Grouws D A, Ebmeier H 1983 Active Mathematics Teaching. Longman, New York Goodenow C 1992 School motivation, engagement, and sense of belonging among urban adolescent students. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA Grouws D, Cebulla K 2000 Elementary and middle school mathematics at the crossroads. In: Good T (ed.) American Education: Yesterday, Today, and Tomorrow. Ninety-Ninth Yearbook of the National Society for the Study of Education. University of Chicago Press, Chicago, Chapter 5, pp. 209–55
McCaslin M, Good T 1992 Compliant cognition: The misalliance of management and instructional goals and current school reform. Educational Researcher 21: 4–17 National Commission for Excellence in Education 1983 A Nation At Risk: The Imperatives for Education Reform. US Department of Education, National Commission for Excellence in Education, Washington, DC National Education Commission on Time and Learning 1994 Prisoners of Time. US Government Printing Office, Washington, DC Nichols S 1997 Students in the Classroom: Engagement and Perceptions of Fairness. Master’s thesis, University of Arizona Randi J, Corno L 1997 Teachers as innovators. In: Biddle B, Good T, Goodson I (eds.) International Handbook of Teachers and Teaching. Kluwer, Dordrecht, The Netherlands, Vol. 2, Chap. 12, pp. 1163–221 Rothstein R 2000 Toward a composite index of school performance. Elementary School Journal 100(5): 409–42 Slavin R 1999 How Title I can become the engine of reform in America’s schools. In: Orfield G, Debray E (eds.) Hard Work for Good Schools: Facts Not Fads In Title I Reform. The Civil Rights Project, Harvard University, Cambridge, MA, Chap. 7, pp. 86–101 Teddlie C, Stringfield S 1993 Schools Make a Difference: Lessons Learned From a 10-Year Study of School Effects. Teachers College Press, New York Wanlass Y 2000 Broadening the concept of learning and school competence. Elementary School Journal 100(5): 513–28 Weinert L, Helmke A 1995 Interclassroom differences in instructional quality and interindividual differences in cognitive development. Educational Psychologist 30: 15–20
T. L. Good and S. L. Nichols
Schooling: Impact on Cognitive and Motivational Development Up until the mid-1970s, neither the general public nor the social sciences could agree on whether schooling has a significant impact on cognitive development, how large this impact is or may be, and which variables are responsible for any potential effects it may have. For example, even in 1975, Good et al. asked skeptically, ‘Do schools or teachers make a difference? No definite answer exists because little research has been directed on the question in a comprehensive way’ (Good et al. 1975, p. 3). Things have changed since then, because the findings from numerous empirical studies have banished any serious doubts concerning whether schooling impacts on both the cognitive and motivational development of students. This article will report the arguments questioning the importance of school for the mental development of children and adolescents and assess their scientific validity; present the empirical evidence on the strong impact of school on cognitive development, while simultaneously showing the limits to its generalizability; and sketch the findings on the role of school in motivational development. The final section presents some conclusions on the relations between cognitive and motivational development under the influence of school.
1. Scientific Doubts Regarding the Impact of Schooling on Cognitive Development

Compared with the impact of social origins and/or stable individual differences in intellectual abilities on cognitive development, that of schooling was long considered marginal. Nonetheless, the Coleman Report (Coleman et al. 1966) was still a great shock to many politicians, educators, and teachers when it concluded ‘that school brings little influence to bear on a child’s achievement that is independent of his background and general social context’ (p. 325). Just as radically, Jencks et al. (1972) confirmed ‘that the character of schools’ output depends largely on a single input, namely, the characteristics of the entering children. Everything else—the schools’ budget, its policies, the characteristics of the teachers—is either secondary or completely irrelevant’ (p. 256). According to their findings, elementary school contributed 3 percent or less to the variance in cognitive development, and secondary school 1 percent or less. Even though the general statements of both Coleman et al. (1966) and Jencks et al. (1972) were modified and differentiated slightly in comparisons between private and public schools, between middle-class and lower-class children, and between the effects of different school variables, the final conclusion remained unchanged: ‘Nevertheless, the overall effect of elementary school quality on test scores appears rather modest’ (Jencks et al. 1972, p. 91). Viewed from the perspective of the beginning of the twenty-first century, such statements and conclusions are, without exception, underestimations of the impact of schooling on cognitive development in childhood and adolescence. There are various reasons for these underestimations.
First, two aspects of cognitive development that should be kept strictly separate are frequently confounded: the growth and acquisition of competencies, knowledge, and skills on the one side compared with the change in interindividual differences in cognitive abilities on the other. We now know that schools are necessary and very influential institutions for the acquisition of that declarative and procedural knowledge that cannot be learned in the child’s environment outside school, but that the quantity and quality of schooling has only a limited power to modify individual differences in abilities. Second, Coleman et al. (1966), Jencks et al. (1972), and many other studies compared very similar schools in the USA or other industrialized countries. Naturally, such schools reveal more commonalities than
differences due to the similarities in teacher training, curricular goals, educational traditions, the budget of the educational institutions, the state-regulated quantity of teaching, and standardized achievement tests. The methodological problems in such research did not become evident until the publication of findings from cross-cultural studies. These showed that when schools in the Third World (or even achievement measured in ethnic groups with no or very low school education) are taken into account alongside schools in industrialized countries, massive effects of the quantity and quality of instruction can be ascertained (Anderson et al. 1977). The third reason for underestimating the impact of schools was that scientific interest in educational sociology focused on the role of social class and the family in mental development, whereas developmental psychology was dominated by Jean Piaget’s constructivist model. Whereas theories of cognitive development deal with general mechanisms of thought, action, and experience, models of school learning in the strict sense are concerned with the acquisition of specific skills and small bits of knowledge. Piaget and his co-workers emphasized ‘that this form of learning is subordinate to the laws of development and development does not consist in the successive accumulation of bits of learning since development follows structuration laws that are both logical and biological’ (Inhelder and Sinclair 1969, p. 21).
2. Empirical Confirmation of the Impact of Schooling on Cognitive Development

Among many social scientists, the skeptical attitude toward the role of schools has been overcome in recent decades to a large extent (although not completely) by the results of numerous empirical studies. This applies not only to the acquisition of specific cognitive competencies but also to the promotion of intelligence. After reviewing the relevant literature, Rutter (1983, p. 20) came to the conclusion that ‘the crucial component of effective teaching includes a clear focus on academic goals, an appropriate degree of structure, an emphasis on active instruction, a task-focused approach, and high achievement expectations.’ Although recent years have seen increasingly strong criticism of Rutter’s description of the successful classroom, it is broadly confirmed empirically that active teachers, direct instruction, and effective classroom management on the one side accompanied by active, constructive, and purposeful learners on the other side are necessary conditions for the effective acquisition of competencies, knowledge, and skills. Naturally, a major role can also be assigned to intrinsically motivated, self-organized, and cooperative learning (Weinert and Helmke 1995). Nonetheless, the impact of schools on cognitive development ranges far beyond teaching declarative
and procedural knowledge and also includes the promotion of intellectual abilities. For example, Ceci (1991) summarized his state-of-the-art review as follows: Of course, schooling is not the complete story in the formation and maintenance of IQ-scores and IQ-related cognitive processes … Children differ in IQ and cognitive processes prior to entering school, and within a given classroom there are sizeable individual differences despite an equivalence of schooling. … Thus, the conclusion seems fairly clear: Even though many factors are responsible for individual and group differences in the intellectual development of children, schooling emerges as an extremely important source of variance, notwithstanding historical and contemporary claims for the contrary. (p. 719)
Looking at the available data from industrialized countries, the Third World, and, above all, cross-cultural studies on the impact of schooling on cognitive development in general and the effectiveness of specific features of schools in particular, substantial effects can be ascertained on intellectual abilities; on metacognitive strategies; on the acquisition of verbal, mathematical, scientific, logical, and technological competencies; and on various forms of domain-specific and cross-domain knowledge. These findings provide unequivocal support for Geary’s (1995) hypothesis that schools are necessary cultural conditions for the acquisition of those abilities and skills that cannot be learned in the child’s immediate environment: this means almost all the cognitive competencies necessary for a good life and a successful career in our modern scientifically and technologically shaped world. The level to which cognitive development is enhanced, and which competencies are acquired, depend strongly on the quality and quantity of schooling. This is confirmed by findings from a number of large-scale international studies that not only compared school achievement but also assessed important features and variables in the individual national school systems. A current example is the Third International Mathematics and Science Study (TIMSS) carried out by the International Association for the Evaluation of Educational Achievement (IEA; see Baumert et al. 1997). In association with numerous national and regional studies on educational productivity, school effectiveness, or school improvement, it has been possible to identify a variety of combinations and configurations of variables in the school system, the individual school, the classroom, the teacher, instruction, and the school context that impact on various dimensions and aspects of the cognitive development of students. At the same time, these findings have also led to scientific suggestions for improving school systems.
Alongside mean effect sizes, the long-term effects of exceptional schools and teachers are of particular interest. For example, Pederson et al. (1978) reported
the strong long-term effects of one excellent teacher in a single-case study. Compared with the students of other teachers, the children she had taught in their first two years at school were significantly more successful in their later academic careers as well as in their later jobs. More detailed analyses of this effect revealed that this teacher attained above-average effects on school performance in the first two school grades as well as on her students’ attitudes toward work and their initiative, but no great increase in intellectual abilities. Even more important than the direct teacher effects were the indirect effects of a good, successful, and positively experienced start to school on all the children’s further cognitive and motivational development. Although it has to be said that this is only one single-case study, its findings could be replicated as a trend, though not so emphatically, in several representative longitudinal studies (Weinert and Helmke 1997). However, great methodological care is necessary when gathering, analyzing, and interpreting longitudinal data. Only multilevel analyses (that view teacher or instruction variables as being moderated by classroom contexts, and classroom effects as being influenced by features of the school, the school system, and the sociocultural environment) permit valid estimations of the effects of certain clusters of variables on students’ cognitive development and growth in knowledge. Such studies have shown that individual variables make only a very limited contribution to explaining the variance in student outcomes. There are many reasons for this: one of the most important is the complex ‘multideterminedness’ of students’ academic achievement. Haertel et al.
(1983) characterize the ensuing theoretical and methodological problems as follows: … classroom learning is a multiplicative diminishing-returns function of four essential factors—student ability and motivation, and quality and quantity of instruction … Each of these essential factors appear to be necessary but insufficient by itself to classroom learning; that is, all four of these factors appear to be required at least at minimum levels for the classroom learning to take place. It also appears that the essential factors may substitute, compensate, or trade-off for one another in diminishing rates of return. (p. 75)
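The phrase ‘multiplicative diminishing-returns function’ can be made concrete with a small worked form. The sketch below uses a Cobb-Douglas-style product, which is one conventional way of writing such a function; the specific form, the symbols L, A, M, Q, and T, and the exponent ranges are assumptions introduced here for illustration and are not given in the passage quoted from Haertel et al. (1983).

```latex
% Illustrative sketch only: the functional form and all symbols are assumptions.
% L = classroom learning, A = student ability, M = student motivation,
% Q = quality of instruction, T = quantity (time) of instruction.
\[
  L = c \, A^{\alpha} M^{\beta} Q^{\gamma} T^{\delta},
  \qquad c > 0, \quad 0 < \alpha, \beta, \gamma, \delta < 1 .
\]
% Necessity: the product is zero as soon as any one factor is zero.
% Diminishing returns (shown for T, with all factors positive):
\[
  \frac{\partial L}{\partial T} = \delta \, \frac{L}{T} > 0,
  \qquad
  \frac{\partial^{2} L}{\partial T^{2}} = \delta(\delta - 1)\,\frac{L}{T^{2}} < 0 .
\]
```

Under these assumptions, each factor is necessary but not sufficient, more of any one factor always helps but helps less and less, and a shortfall in one factor can be offset by another only at an increasing cost, which matches the three properties named in the quotation.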
3. The Limited Impact of School on the Modification of Individual Differences in Cognitive Abilities and Competencies

Of the four essential conditions for classroom learning identified by Haertel et al. (1983), only two (the quantity and quality of instruction) are under the control of the school and the teacher. The other two (ability and motivation) are features of students.
Already when entering school, children exhibit large interindividual differences in both their cognitive competencies and their motivational tendencies. Through genetic factors and through covariations between genotype and environmental influences, interindividual differences in intellectual abilities already become moderately stable early in childhood. Round about the seventh year of life, astonishingly high stability coefficients of 0.5 to 0.7 are already found, despite the relatively low reliability of many tests at this age level. Assuming that more intelligent students learn, on average, more quickly, more easily, and better than less intelligent students under the same conditions of instruction, it can be anticipated that interindividual differences in mental abilities and cognitive development will not diminish under the influence of school but become increasingly more stable. This is also supported by the results of many empirical studies: For example, during the course of elementary schooling, stability coefficients in verbal intelligence tests increase to about 0.8, nonverbal intelligence to 0.7, mathematical competencies to 0.7, reading and writing to 0.7, and the development of positive attitudes toward learning, self-confidence, and test anxiety to 0.6 in each case (Weinert and Helmke 1997). This recognizable trend in elementary school continues in secondary school, with stabilities in IQ scores already attaining values between 0.8 and 0.9 from the age of 12 years onward, thus permitting almost perfect long-term prediction. These and other empirical findings can be used to draw some theoretical conclusions on classroom learning that may be modified, but not falsified, by variations in school contexts: (a) Cognitive abilities and competencies develop dramatically in childhood and adolescence—not least under the influence of the school. 
At the same time, interindividual differences already stabilize at a relatively early age and then remain more or less invariant. (b) This trend applies not only for intellectual abilities but also for the acquisition of cognitive competencies and domain-specific knowledge. This is particularly apparent when the knowledge acquired is cognitively demanding. In this case, it is necessary to assume a relationship between the level of intelligence and the quality of the knowledge (‘intelligent knowledge’). (c) Naturally, the stabilities in individual differences are in no way perfect, so that changes can be observed in students’ relative positions in the ranking of abilities and academic achievements. However, very strong changes are often an exception due to idiosyncrasies rather than the rule. (d) All attempts to level out these individual differences in cognitive abilities and competencies that stabilize during the course of schooling and to make achievements and achievement dispositions equal for all students have generally been unsuccessful. The same applies to the concept of mastery learning when
practiced over a longer period of time and on demanding tasks. (e) Of course, very significant changes in individual differences result when students are taught for different lengths of time and/or with different achievement aspirations in various subjects or the same subject. Whereas, for example, some students broaden their basic knowledge of the world only slightly through the physics they learn at school, others acquire a high degree of expertise in the subject. Despite the doubtful validity of the claim put forward in the novice–expert research paradigm that intellectual abilities and talents are irrelevant for the acquisition of expertise, the extent of deliberate practice seems to be decisive for the achievement of excellence. (f) Regardless of the level of individual abilities and regardless of the stability of interindividual differences in ability, it is still necessary for all students to acquire all their cognitive competencies through active learning. Many studies have shown that the quantity and quality of instruction are useful and, in part, even necessary for this.
4. The Impact of Schooling on Motivational Development

The available statistical meta-analyses confirm the belief shared by most laypersons that student motivation is an essential personal factor for successful learning at school. However, motivational tendencies and preferences are not just conditions of learning but are also influenced by learning and teaching in the classroom. Their systematic promotion is simultaneously an important goal of school education. To analyze motivational conditions and consequences of school learning, it is necessary to distinguish strictly between dispositional motives and current motivation. Whereas motives (e.g., achievement motive, affiliation motive, attribution style, test anxiety, self-confidence) are understood as personal traits that are relatively stable and change only slowly, motivation concerns current approach or avoidance tendencies that not only depend on dispositional motives but are also influenced strongly by situational conditions (stimuli, incentives, rewards, threats, etc.). Naturally, the long-term modification of motivational development through the school predominantly involves the students’ dispositional motives, attitudes, and value orientations. School enrollment between the fifth and seventh year of life seems to be a sensitive period of development for this. The cognitive competencies acquired during preschool years enable first graders to compare themselves with others, to make causal interpretations of the differences and changes in achievement they register, and to form anticipations regarding future achievements. At the same time, the school class offers a variety of opportunities for
comparisons on the levels of behavior, achievement, evaluation, and ability. Teachers play an important role in this. Almost all preschool-age children are optimists: They look forward to attending school, most anticipate that they will do well, they have only a low fear of failure, and they believe that they will be able to cope with all the demands of the school. This more-than-optimistic attitude changes during the first two years at school. In general, children become realists, although most of them continue to hold positive expectations regarding their own success. This basic attitude stabilizes during the further course of schooling, naturally with large interindividual differences due in part to personal academic successes and failures and in part to a stable anticipation bias (Weinert and Helmke 1997). This general developmental trend is modified more or less strongly by specific features of the school, the classroom, the teacher, and the methods of instruction (for an overview, see Pintrich and Schunk 1996). A few examples of this will be given below. These come either from studies in which the motivational effects of teaching and learning were observed in the classroom, or from studies assessing the effects of planned interventions. Personal causation (the feeling of being an origin and not a pawn) is an important motivational condition for successful classroom learning. If students are to experience personal causation, they must have opportunities to (a) set themselves demanding but realistic goals, (b) recognize their own strengths and weaknesses, (c) be self-confident about the efficacy of their own actions, (d) evaluate whether they have attained the goals they have set themselves, and (e) assume responsibility not only for their own behavior but also, for example, that of their classroom peers.
Psychological interventions designed to enhance the ‘origin atmosphere’ in the classroom and increase the ‘origin experiences’ of the students have led not only to the intended motivational changes but also to broad improvements in average academic achievement. Similarly positive outcomes are reported from studies designed to improve the goal orientation of learning, to strengthen achievement motivation, and to modify the individual patterns of attributions used in intuitive explanations of success and failure. For example, efforts have been made in both classrooms and individual students to modify the typical attribution style for learned helplessness (attributing failure to stable deficits in ability and success to luck or low task difficulty) so that given abilities and varying effort as a function of task difficulty become the dominant personal attribution patterns. Positive modifications to student motivational development depend decisively on the teacher’s attitudes, instructional strategies, and feedback behavior. As a result, many intervention programs for enhancing motivation focus on both students and teachers
simultaneously. For example, a large study of ‘self-efficacious schools’ is being carried out at the present time in Germany. It is testing the assumption that enhanced self-efficacy beliefs may affect not only teacher behavior (by reducing burnout syndromes and enhancing a proactive attitude) but also student learning (by increasing motivation and achievement). For several decades, theories of learning motivation have been dominated by the assumption that basic needs are satisfied through achievement, social support, rewards, self-evaluation, and so on. Frequently the motivational mechanisms were those underlying expectancy × value models, cost-benefit calculations, or instrumental behavior models. In comparison, intrinsic motives for learning have played a much smaller role. Intrinsic motivation addresses needs and goals that are satisfied by the process of learning itself and by the possibility of experiencing and enjoying one’s own expression of competence and increasing knowledge in a preferred content domain. In this theoretical framework, working and learning are not a means to an end, but are more or less an end in themselves; learning activities and learning goals are not clearly distinguished (e.g., exploratory drive, experience of flow, effectance motivation, etc.). Underlying these concepts is the assumption ‘… that intrinsic motivation is based in the innate, organismic needs for competence and self-determination. It energizes a high variety of behaviors and psychological processes for which the primary rewards are the experience of effectance and autonomy’ (Deci and Ryan 1985, p. 32). The reason for a careful consideration of intrinsic motivation is that in educational philosophy intrinsically motivated learning and behavior are regarded as the most important goals of schooling.
The worry is that strengthening extrinsic motivation by making external incentives and rewards available is likely to be detrimental to achieving this goal. The ideas underlying such fears come from the results of some studies investigating the ‘overjustification effect,’ which show that the use of rewards can undermine intrinsic motivation. However, the results of several recent studies indicate that this fear is unjustified. Furthermore, the relation between dispositional motives, actual motivation, and learning activities is not simple and linear but is influenced strongly by interconnections with cognitive processes, metacognitive strategies, and volitional skills.
5. Conclusions
Schools are cultural and educational institutions for promoting cognitive development and imparting those competencies that the child cannot acquire within the immediate environment. Throughout the world, schools fulfil this evolutionary, psychological, and cultural function more or less well. The quantity and quality of instruction are a major source of the
variance in the genesis of interindividual differences in cognitive competencies. Moreover, the development and promotion of competent, self-regulated learning is an equally important goal of schools in modern societies. The development of intrinsic and extrinsic motives becomes particularly important here in association with metacognitive and volitional competencies. This is why recent decades have seen not only studies examining which relations can be found between the classroom atmosphere, the teacher’s behavior, and the subjective experiences of the students on the one side and their motivational and cognitive development on the other side, but also numerous programs designed to promote motivation systematically that have been tried out with, in part, notable success. The relation between cognitive and motivational development and their interaction are of particular interest in both theory and practice. This is just as true for the influence of academic success on the development of a positive self-concept as for the importance of the self-concept for individual progress in learning (Weinert and Helmke 1997).

See also: School Effectiveness Research; School Outcomes: Cognitive Function, Achievements, Social Skills, and Values
Bibliography

Anderson R C, Spiro R J, Montague W E (eds.) 1977 Schooling and the Acquisition of Knowledge. Erlbaum, Hillsdale, NJ
Baumert J, Lehmann R et al. 1997 TIMSS – Mathematisch-naturwissenschaftlicher Unterricht im internationalen Vergleich: Deskriptive Befunde. Leske and Budrich, Opladen, Germany
Ceci S J 1991 How much does schooling influence general intelligence and its cognitive components? A reassessment of evidence. Developmental Psychology 27: 703–22
Coleman J S, Campbell E Q, Hobson C J, McPartland J, Mood A, Weinfeld F D, York R L 1966 Equality of Educational Opportunity. Government Printing Office, Washington, DC
Deci E, Ryan R M 1985 Intrinsic Motivation and Self-determination in Human Behavior. Plenum Press, New York
Geary D C 1995 Reflections on evolution and culture in children’s cognition. American Psychologist 50: 24–36
Good T L, Biddle B J, Brophy J E 1975 Teachers Make a Difference. Holt, Rinehart, & Winston, New York
Haertel G D, Walberg H J, Weinstein T 1983 Psychological models of educational performance. A theoretical synthesis of constructs. Review of Educational Research 53: 75–91
Inhelder B, Sinclair H 1969 Learning cognitive structures. In: Mussen P H, Langer J, Covington M (eds.) Trends and Issues in Developmental Psychology. Holt, Rinehart, & Winston, New York, pp. 2–21
Jencks C, Smith M, Acland H, Bane M J, Cohen D, Gintis H, Heyns B, Michelson S 1972 Inequality. Basic Books, New York
Pederson E, Faucher T A, Eaton W W 1978 A new perspective on the effects of first grade teachers on children’s subsequent adult status. Harvard Educational Review 48: 1–31
Pintrich P R, Schunk D 1996 Motivation in Education: Theory, Research, and Applications. Prentice Hall, Englewood Cliffs, NJ
Rutter M 1983 School effects on pupil progress: Research findings and policy implications. Child Development 54: 1–29
Weinert F E, Helmke A 1995 Learning from wise mother nature or big brother instructor: The wrong choice as seen from an educational perspective. Educational Psychologist 30: 135–42
Weinert F E, Helmke A (eds.) 1997 Entwicklung im Grundschulalter. Beltz, Weinheim, Germany
F. E. Weinert
Schools, Micropolitics of

‘Micropolitics’ is a specific perspective in organization theory. It focuses on ‘those activities taken within organizations to acquire, develop, and use power and other resources to obtain one’s preferred outcomes in a situation in which there is uncertainty or dissent’ (Pfeffer 1981, p. 7) because it considers them as most important for the constitution and the workings of organizations. This article will explore the uses and usefulness of this concept in educational research. It starts by outlining the micropolitical view on educational organizations. Then, it gives some examples of representative micropolitical research in order to illustrate typical topics and research strategies. Finally, it discusses criticisms of micropolitics and indicates areas for development.
1. The Concept of Micropolitics

1.1 Main Elements

Traditional organizational theories of different origins seem to converge in describing organizations as goal oriented and rationally planned, characterized by stable objective structures, a high degree of integration, and a common purpose among their members. Conflicts between the members of an organization are considered costly and irrational ‘pathologies’ to be eradicated as soon as possible. However, conflicts are endemic in organizational life (Ball 1987). Different stakeholders pursue their own interests, which are not necessarily identical with the formulated goals. Coalitions are formed, meetings are boycotted, and formal structures are ignored. As such phenomena frequently occur in organizations, it is not helpful to exclude them from organizational theorizing: models which can cope with this ‘dark side of organizational life’ (Hoyle 1982, p. 87) are needed. The micropolitical perspective was originally developed in the realm of profit organizations (e.g.,
Bacharach and Lawler 1980, Pfeffer 1981). In a 1982 seminal paper, Eric Hoyle (1982) put forward arguments for exploring the concept also for education. In 1987 Stephen Ball (1987) published the book The Micro-Politics of the School in which he used empirical material to push the development of the concept. In its wake the term ‘micropolitics’ appeared more frequently in various research papers (see Blase 1991 for a representative collection). What are the main elements of a micropolitical view of schools as presented in this early research? (a) The micropolitical perspective is based on a specific view of organizations. They are seen as characterized by diverse goals, by interaction and relationships rather than structures, by diffuse borders and unclear areas of influence rather than clear-cut conditions of superordination and delegation, by continuous, unsystematic, reactive change rather than by longer phases of stable performance and limited projects of development. The main focus of micropolitical approaches is not the organization as it aspires to be in mission statements or organizational charts, but the organization-in-action and, in particular, the ‘space between structures’ (Hoyle 1982, p. 88) which produces enough ambiguity to allow political activities to flourish. (b) A micropolitical approach is based on a specific image of actors. The members of organizations are seen as pursuing their own interests in their daily work. The actors may do this as individuals, they may coalesce in loosely associated interest sets, or use subdivisions of the organization (e.g., departments, professionals versus administration) as power bases. In order to protect or enhance their organizational room for maneuver, they aim to retain or obtain control of resources, such as the following (see Kelchtermans and Vandenberghe 1996, p. 7, Ball 1987, p. 16): (i) material resources, such as time, funds, teaching materials, infrastructure, time tabling, etc. 
(ii) organizational resources such as procedures, roles, and positions which enable and legitimate specific actions and decisions. They define ‘prohibited territories’ and relationships of subordination or autonomy. Thereby, they have their bearing on the actors’ chances to obtain other resources (e.g., through career, participation, etc.). (iii) normative or ideological resources, such as values, philosophical commitments, and educational preferences. A particularly important ideological resource is the ‘definition of the organization’ which indicates what is legitimate and important in the organizational arena. (iv) informational resources: the members of an organization are interested in obtaining organizationally relevant information and to have their expertise, knowledge, and experience acknowledged. (v) social resources: affiliation to, support from, and reputation with influential groups within and outside
the school are assets which may be mobilized for one’s organizational standing. (vi) personal resources: personal characteristics (such as being respected as a person, being grounded in an unchallenged identity as a teacher, etc.) are also resources in organizational interactions. (c) A micropolitical perspective pays special attention to interaction processes in organizations. These are interpreted as a strategic and conflictual struggle about the shape of the organization which, in consequence, defines the members’ room for maneuver. This has the following implications: (i) Organizational interaction is power-impregnated. It draws on the resources mentioned above for influencing the workings of an organization. However, these resources have to be mobilized in interaction in order to become ‘power.’ (ii) Power is in need of relationship. In order to satisfy his or her interests an actor needs some interaction from other actors. Power builds on dependency, which makes the relationship reciprocal, but, as a rule, asymmetric. (iii) Power is employed from both ends of the organizational hierarchy. There are advantages of position; however, ‘micropolitical skills may be well distributed throughout the organization and position is not always an accurate indicator of influence’ (Ball 1994, p. 3824). (iv) Actors use a multitude of strategies and tactics to interactively mobilize their resources and to prevent others from doing so, such as, for example, setting the agenda of meetings, boycotting and escalating, avoiding visibility, confronting openly or complying, carefully orchestrating scenes behind closed doors or in public arenas, etc. (v) Organizational interaction is seen as conflictual and competitive. ‘Confronted by competition for scarce resources and with ideologies, interests and personalities at variance, bargaining becomes crucial’ (Gronn 1986, p. 45).
The relationship between control and conflict (or domination and resistance) is conceived ‘as the fundamental and contradictory base of organizational life’ (Ball 1994, p. 3822). (vi) Organizational change is nonteleological, pervasive, and value-laden since it advances the position of certain groups at the expense of the relative status of others (see Ball 1987, p. 32). Schools are not organizations easily changed; a ‘quick-fix orientation’ in innovation will not be appropriate to their cultural complexities (see Corbett et al. 1987, p. 57).
2. Research in the Micropolitics of Education

2.1 Methods and Strategies

If part of the micropolitical activities which are interesting for the researcher take place in privacy, if
these processes are often overlain by routine activities (see Ball 1994, p. 3822), and if experienced actors regularly use strategies such as covering up actions, deceiving about real intentions, etc., then researching the micropolitics of schools will have to cope with some methodological problems. To come to grips with their object of study, micropolitical studies focus on critical incidents, persons, and phases that disrupt routine, for example, the appointment of a new principal, the introduction of new tasks or programs, instances of structural reorganization (e.g., a school merger), etc. They pay special attention to the processes of (re-)distribution of resources, rewards, and benefits (see Ball 1994, p. 3823) or analyze instances of organizational conflict to find out how new organizational order has been established. These strategies provide quicker access to indicative processes, but are certainly in danger of inducing an overestimation of the proportion of conflict in organizations. Micropolitical studies try to enhance their credibility through triangulation of methods, collection of multiple perspectives, long-term engagement in the field, critical discourse in a team of researchers, and feedback from participants and external researchers. Although there have been some attempts to tackle organizational micropolitics with quantitative means (see Blickle 1995), most studies are based on qualitative methods. There have been interview and observation studies; however, the typical strategy is a case study approach mainly based on interviews, field notes from participant or nonparticipant observation, transcripts of meetings, analyses of documents, and cross-case comparison of different sites.
A good example of such a case study is Sparkes’ (1990) three-year study of a newly appointed departmental head’s attempts at innovation: the head seeks to orientate Physical Education teaching towards a child-centered approach emphasizing mixed-ability grouping and individual self-paced activities which run counter to the then prevalent ‘sporting ideology.’ In the beginning the new head believes in rational change and democratic participation. The head tries to make other department members understand his own particular teaching approach before introducing changes to the curriculum. However, not all staff can be convinced through dialogue. In his management position the departmental head can control the agenda of department meetings, time, and procedure of arriving at decisions, etc. He secures the agreement of crucial staff members before the meetings and networks with other influential persons outside his department. His proposals for innovation are nevertheless met with resistance by some staff. This resistance cannot prevent the structural changes being introduced, but neither can these structural changes guarantee teachers’ conformity to the head’s educational goals within the confines of individual classrooms. Thus, he goes on to consolidate his domination of the department by careful selection of new staff and exclusion of the unwilling. Although he was successful in that his proposals for curriculum change were implemented within one year, the departmental head ended up in a situation that was characterized by conflict and uncertainty and, thus, in many respects, counteracted his own aspirations of democratic leadership. Sparkes suggests that his findings have some bearing beyond the specific case. He warns that the ‘ideology of school-centered innovation … fails to acknowledge the presence and importance of conflict and struggle both within, and between, departments … the commonly held view of teacher participation in the innovative process as “good” in itself is highly questionable. The findings presented indicate that qualitative differences exist in the forms of participation available to those involved and that, to a large extent, these are intimately linked to the differential access that individuals have to a range of power resources’ (Sparkes 1990, p. 177).
2.2 Research Topics

Typical topics of micropolitical studies are the following:

Processes of change: when organizational ‘routine games’ are disrupted, e.g., through educational, curricular, or organizational innovations, through the recruitment of new principals or teachers, or through the introduction of quality assurance systems, then intensified processes of micropolitical relevance are to be expected. Some studies investigate how externally mandated reform is dealt with in schools: when Ball and Bowe (1991, p. 23) studied the impact of the 1988 British Educational Reform Act, they found that ‘the implementation of externally initiated changes is mediated by the established culture and history of the institution, and that such changes, “their acceptance and their implementation become sites as well as the stake of internal dispute.”’ The reform puts some possible alternative definitions about the meaning and the workings of schools on the agenda which have to be ‘processed’ in the local environments through disputes over topics such as, for example, ‘serving a community versus marketing a product’ or ‘priority to managerial or professional views of schooling.’

Leadership and the relationships in the organization: traditionally, educational ethnography had a strong classroom orientation. Micropolitics may be understood as a move to apply ethnographic methods to study the interaction of the members of the school outside the classroom. Although parents’, pupils’, and ancillary staff’s voices are included in some studies, most research focuses on the interaction of teachers. Since steering and regulation of schoolwork are at the heart of the micropolitical approach, it is no wonder that issues of leadership have been frequently investigated (see Ball 1987, p. 80). Holders of senior positions
must acquire micropolitical skills for their career advancement and they ‘have much to lose if organizational control is wrested from them’ (Ball 1994, p. 3824), which makes them perfect foci for study. While many of these studies concentrated on sketching vivid images of a ‘politics of subordination,’ Blase and Anderson (1995) developed a framework for a ‘micropolitics of empowerment’ through facilitative and democratic leadership.

Socialization and professional development of teachers: the politically impregnated organizational climate in schools is the context in which individual teachers develop their professional identity (Kelchtermans and Vandenberghe 1996). During the first years of their career teachers concentrate on acquiring knowledge and competencies with respect to academic and control aspects of classroom instruction. Later on they build up a ‘diplomatic political perspective,’ because they feel under ‘constant scrutiny’ by pupils, parents, fellow teachers, and administrators (see Blase 1991, p. 189, Kelchtermans and Vandenberghe 1996). A pervasive feeling of ‘vulnerability’ induces many teachers to develop a range of protective strategies such as low-risk grading, documentation of assessment and instruction, avoidance of risky topics and extracurricular activities, etc. (see Blase 1991, p. 193). Thus, it is no wonder that some authors claim that the acquisition of political competence is fundamental to teachers’ career satisfaction (see Ball 1994, p. 3824). Since micropolitical perceptions and competencies are almost absent in beginning teachers’ interpretive frameworks, attention to the induction phase is important (Kelchtermans and Vandenberghe 1996, p. 12).

Understanding schools as organizations: since micropolitics is an approach to organization theory, all these studies aim to contribute to our understanding of schools as organizations.
There are many studies which prove the intensity of power-impregnated activity in schools and analyze typical causes and process forms. It has been argued that political control strategies are a characteristic part of school life ‘precisely because the professional norms of teacher autonomy limit the use of the effectiveness of more overt forms of control’ (Ball 1994, p. 3824). In Iannaconne’s (1991) view, the unique contribution a micropolitical approach can offer lies in exploiting the metaphor of ‘society’ for schools: a typical school is organized like a ‘caste society’ since a majority of its members are ‘subject to its laws without the right to share in making them’ (Iannaconne 1991, p. 469). From the characteristics of a ‘caste society’ some hypotheses about schools may be derived, e.g.: (a) Such societies are characterized by great tensions between the castes and a fragile balance of power. (b) They tend to conceal the internal differences of a caste from others. Consequently, critical self-examination of castes and open debate over policy issues is difficult.
(c) They tend to avoid fundamental conflicts and to displace them into petty value conflicts. ‘In schools, as in small rural towns, the etiquette of gossip characterizes teacher talk. Faculty meetings wrangle over trivial matters, avoiding philosophic and ethical issues like the plague’ (Iannaconne 1991, p. 469).
3. Criticism and Areas for Development

The first wave of research and conceptual work in micropolitics at the beginning of the 1990s has triggered some criticism which will be discussed in order to identify possible limitations, but also areas for development. (a) Many authors of the micropolitical approach revel in examples of illegitimate strategies: secrecy, deception, and crime add much to the thrill of micropolitical studies. Some critics argue that talking about the pervasiveness of ‘illegitimate power’ itself produces a ‘micropoliticization’ of practice and will undermine confidence in organizations. This claim implies that a second meaning of ‘micropolitics’ is introduced: it does not (only) refer to an approach in organization theory (which analyzes both legitimate and illegitimate interactions and their effects for the structuration of an organization), but refers to those specific activities and attitudes in organizations which are based on power politics with illegitimate means. This conceptual move, however, produces problems: how should it be possible to draw the line between ‘good’ and ‘bad organizational politics’ since it was precisely situations characterized by diverse goals and contested expertise which made a concept like ‘micropolitics’ necessary? (b) Although the proponents of micropolitics are right to criticize a presupposition of organizational consensus, it is unsatisfactory to interpret any consensus as a form of domination (see Ball 1987, p. 278, Ball 1994, p. 3822). Many early studies did not include cooperative or consensual interactions (see Blase 1991, p. 9). It will be necessary to develop conceptual devices to account for relationships of co-agency which neither dissolve any cooperation into subtle forms of domination nor gloss over the traces of power which are at the core of collaboration and support (see Altrichter and Salzgeber 2000, p. 104).
(c) In focusing on strategic aspects of action micropolitical approaches are again in danger of overemphasizing the ‘rationality’ of organizational action. To understand the workings of organizations it may be necessary to consider actions which are mainly influenced by emotions and affect, which have not been carefully planned according to some interests, as well as unintended consequences of otherwise motivated actions. (d) Although being right to put the spotlight on the
potential instability of organizational processes, early micropolitical studies had some difficulty in conceptualizing the relative stability and durability of organizations, which we can observe every day in many organizations. For this purpose, it will be necessary to bring agency and structure into a balanced conceptual relationship: micropolitics will not deal with ‘relationships rather than structures’ (Ball 1994, p. 3822) but precisely with the interplay between interaction and structure. (e) The understanding of politics displayed in many early micropolitical studies was quite unsatisfactory, not to say unpolitical. Politics was often seen as the opposite of truth and reason; it was associated with treason, conspiracy, and accumulation of influence, as if fair and democratic ways of negotiation in organizations were inconceivable. More recent studies, for example, Blase and Anderson (1995), seem to overcome this bias. Some micropolitical studies were also criticized for dissolving the external relationships of the organization into the interactions between individual actors. However, it must be said that some studies took care to account for the societal embedding of organizational interactions (see Ball and Bowe 1991). In the meantime there is general agreement that this must be a feature of sound micropolitical studies (see Blase 1991, p. 237). Problems (c) and (e) may be overcome by reference to Giddens’ (1992) theory of structuration, to Crozier and Friedberg’s (1993) concept of organizational games, and to Ortmann et al.’s (1990) studies of power in organizational innovation. On this conceptual basis, organizational structure may be seen as an outcome of and resource for social actions which enables and restricts actions, but never fully determines them. Organizations are webs woven from concrete interactions of (self-)interested actors.
They acquire some spatial-temporal extension through the actors’ use and reproduction of ‘games.’ The ‘game’ is the point of conceptual intersection in which the analytically divided elements of agency and structure, of individual and organization meet. The game works as an ‘indirect integration mechanism which relates the diverging and/or contradictory action of relatively autonomous actors’ (Crozier and Friedberg 1993, p. 4). In order to take action and to pursue their own interests, members of organizations must at least partly use structural resources of the organization, i.e., they are forced to play within the organizational rules, and they thereby reproduce these rules. The functioning of an organization may be explained as ‘result of a series of games articulated among themselves, whose formal and informal rules indirectly integrate the organization members’ contradictory power strategies’ (Ortmann et al. 1990, p. 56). The interlocked character of the games and the high amount of routine elements in social practices are major reasons for the relative stability of organizations (see Altrichter and Salzgeber 2000, p. 106).
See also: Educational Innovation, Management of; Educational Leadership; Group Processes in Organizations; Leadership in Organizations, Psychology of; Organization: Overview; Organizational Decision Making; School as a Social System; School Improvement; School Management
Bibliography

Altrichter H, Salzgeber S 2000 Some elements of a micropolitical theory of school development. In: Altrichter H, Elliott J (eds.) Images of Educational Change. Open University Press, Buckingham, UK, pp. 99–110
Bacharach S, Lawler E 1980 Power and Politics in Organizations. Jossey-Bass, San Francisco, CA
Ball S J 1987 The Micro-Politics of the School. Routledge, London
Ball S J 1994 Micropolitics of schools. In: Husen T, Postlethwaite T N (eds.) The International Encyclopedia of Education, 2nd edn. Pergamon, Oxford, UK, pp. 3821–6
Ball S J, Bowe R 1991 Micropolitics of radical change. Budgets, management, and control in British schools. In: Blase J (ed.) The Politics of Life in Schools. Sage, Newbury Park, CA, pp. 19–45
Blase J (ed.) 1991 The Politics of Life in Schools. Sage, Newbury Park, CA
Blase J, Anderson G L 1995 The Micropolitics of Educational Leadership. Cassell, London
Blickle G 1995 Wie beeinflussen Personen erfolgreich Vorgesetzte, KollegInnen und Untergebene? [How do persons successfully influence superiors, colleagues, and subordinates?] Diagnostica 41: 245–60
Corbett H D, Firestone W A, Rossman G B 1987 Resistance to planned change and the sacred in school cultures. Educational Administration Quarterly 23: 36–59
Crozier M, Friedberg E 1993 Die Zwaenge kollektiven Handelns. Hain, Frankfurt/M., Germany [1980 Actors and Systems. University of Chicago Press, Chicago, IL]
Giddens A 1992 Die Konstitution der Gesellschaft. Campus, Frankfurt/M., Germany [1984 The Constitution of Society. Polity Press, Cambridge, UK]
Gronn P 1986 Politics, power and the management of schools. In: Hoyle E, McMahon A (eds.) World Yearbook of Education. Kogan Page, London
Hoyle E 1982 Micropolitics of educational organizations. Educational Management and Administration 10: 87–98
Iannaconne L 1991 Micropolitics of education–what and why. Education and Urban Society 23: 465–71
Kelchtermans G, Vandenberghe R 1996 Becoming political: a dimension in teachers’ professional development. Paper presented at the AERA conference, New York (ERIC document: ED 395-921)
Ortmann G, Windeler A, Becker A, Schulz H-J 1990 Computer und Macht in Organisationen [Computers and power in organizations]. Westdeutscher Verlag, Opladen, Germany
Pfeffer J 1981 Power in Organizations. Pitman, Boston, MA
Sparkes A C 1990 Power, domination and resistance in the process of teacher-initiated innovation. Research Papers in Education 5: 153–78
H. Altrichter
Schumpeter, Joseph A (1883–1950)

Joseph A. Schumpeter is one of the best known and most flamboyant economists of the twentieth century. His main accomplishments are to have introduced the entrepreneur into economic theory through Theorie der wirtschaftlichen Entwicklung (1912) and to have produced a magnificent history of economic thought, History of Economic Analysis (posthumously published in 1954). Schumpeter’s interests were wide ranging, and his writings also include contributions to sociology and political science. His analysis of democracy, which can be found in his most popular work, Capitalism, Socialism, and Democracy (1942), is generally regarded as an important addition to the theory of democracy.
1. Early Life and Career

Joseph Alois Schumpeter was born on February 8, 1883 in the small town of Triesch in the Austro-Hungarian Empire (today Třešť in the Czech Republic). His father was a cloth manufacturer and belonged to a family which had been prominent in the town for many generations; and his mother, who was the daughter of a medical doctor, came from a nearby village. Schumpeter’s father died when he was four years old, an event that was to have great consequences for the future course of his life. Some time later his mother moved to Graz and then to Vienna, where she married a retired officer from a well-known aristocratic family, Sigismund von Kéler. Schumpeter was consequently born into a small-town bourgeois family but raised in an aristocratic one; and his personality as well as his work were to contain a curious mixture of values from both of these worlds. After having graduated in 1901 from the Theresianum, the preparatory school for the children of the elite in the Empire, Schumpeter enrolled at the University of Vienna. His goal from the very start was to become an economist. Around the turn of the century the University of Vienna offered one of the best educations in economics in the world, thanks to Carl Menger and his two disciples Eugen von Böhm-Bawerk and Friedrich von Wieser. Schumpeter had the latter two as his teachers, and he also attended lectures in mathematics on his own. When Schumpeter received his doctorate in 1906, he was 23 years old and the youngest person to have earned this degree in the Empire. After having traveled around for a few years and added to his life experience as well as to his knowledge of economics, Schumpeter returned in 1908 to Vienna to present his Habilitationsschrift. Its title was Das Wesen und der Hauptinhalt der theoretischen Nationalökonomie (1908; The Nature and Essence of Theoretical Economics), and it can be characterized as a work in economic methodology from an analytical perspective.
In 1909 Schumpeter was appointed assistant professor at the University of Czernowitz and thereby became the youngest professor in German-speaking academia. It was also at Czernowitz that Schumpeter produced his second work, the famous Theorie der wirtschaftlichen Entwicklung (1912; 2nd edn. from 1926 trans. as The Theory of Economic Development). At the University of Graz, to which he moved as a full professor in 1911, Schumpeter produced a third major study, Epochen der Dogmen- und Methodengeschichte (1914; trans. in 1954 as Economic Doctrine and Method). Over a period of less than ten years Schumpeter had produced three important works, and one understands why he referred to the third decade in a scholar’s life as ‘the decade of sacred fertility.’
1.1 The Works from the Early Years

Das Wesen is today best known for being the place where the term ‘methodological individualism’ was coined. In its time, however, this work played an important role in introducing German students to analytical economics, which Gustav von Schmoller and other members of the Historical School had tried to ban from the universities in Germany. Das Wesen is well written and well argued, but Schumpeter would later regard it as a youthful error and did not allow a second edition to appear. The reason for Schumpeter’s verdict is not known, but it is probably connected to his vigorous argument in the book that economics must sever all links with the other social sciences. While the emphasis in Das Wesen had been exclusively on statics, in his second book, Theorie der wirtschaftlichen Entwicklung, it was on dynamics. Schumpeter very much admired Walras’ theory of general equilibrium, which he had tried to explicate in Das Wesen, but he was also disturbed by Walras’ failure to handle economic change. What was needed in economics, Schumpeter argued, was a theory according to which economic change grew out of the economic system itself and not, as in Walras, a theoretical scheme in which economic change was simply explained as a reaction to a disturbance from outside the economic system. Schumpeter starts the argument in Theorie by presenting an economy where nothing new ever happens and which therefore can be analyzed with the help of static theory: all goods are promptly sold, no new goods are produced or wanted, and profits are zero. Schumpeter then contrasts this situation of a ‘circular flow’ with a situation in which the activities of the entrepreneur set the whole economic system in motion, and which can only be explained with the help of a dynamic theory.
The model that Schumpeter now presented in Theorie would later be repeated and embellished upon in many of his writings: the entrepreneur makes an innovation, which earns him a huge profit; this profit attracts a swarm of other, less innovative entrepreneurs; and as a result of all these activities a wave of economic change begins to work its way through the economic system. A business cycle, in brief, has been set in motion. The theory of business cycles that one can find in Theorie is fairly rudimentary, and Schumpeter would later spend much time improving upon it. Much of Schumpeter’s analysis in Theorie is today forgotten, and few economists are now interested in the way that Schumpeter explains interest, capital, and profit by relating these to the activities of the entrepreneur. Some parts of Theorie, however, are still very much alive and often referred to in studies of entrepreneurship. In one of these Schumpeter discusses the nature of an innovation, and explains how it differs from an invention; while an invention is something radically new, an innovation consists of an attempt to turn something (for example, an invention) into a money-making enterprise. In another famous passage Schumpeter discusses the five main forms of innovations:

(1) The introduction of a new good … or of a new quality of a good.
(2) The introduction of a new method of production …
(3) The opening of a new market …
(4) The conquest of a new source of supply of raw materials or half-manufactured goods …
(5) The carrying out of the new organization of any industry, like the creation of a monopoly position … or the breaking up of a monopoly position.
Schumpeter’s description of what drives an entrepreneur is also often referred to, as is his general definition of entrepreneurship: ‘the carrying out of new combinations.’ Schumpeter’s third work from these early years is a brief history of economics, commissioned by Max Weber (1914–20) for a handbook in political economy (Grundriss der Sozialökonomik). Its main thesis is that economics came into being when analytical precision, of the type that one can find in philosophy, came into contact with an interest in practical affairs, of the type that is common among businessmen; and to Schumpeter this meeting took place in the works of the Physiocrats. Epochen der Dogmen- und Methodengeschichte was later overshadowed by the much longer History of Economic Analysis, but while the latter work tends to be used exclusively as a reference work, Schumpeter’s study from 1914 can be read straight through and is also easier to appreciate.
1.2 Difficult Mid-Years

By the mid-1910s Schumpeter had established himself as one of the most brilliant economists in his generation, and like other well-known economists in the Austro-Hungarian Empire he hoped for a prominent political position, such as finance minister or economic adviser to the Emperor. During World War I Schumpeter wrote a series of memoranda, which were
circulated among the elite of the Empire, and through which he hoped to establish himself as a candidate for political office. The various measures that Schumpeter suggested in these memoranda, for example, that the Empire must not enter into a customs union with its powerful neighbor, Germany, are of little interest today. What is of considerably more interest, however, is that they provide an insight into Schumpeter’s political views when he was in his early 30s. Schumpeter’s political profile at this time can be described in the following way: he was a royalist, deeply conservative, and an admirer of Tory democracy of the British kind. Schumpeter’s attempt during World War I to gain a political position failed, but his fortune changed after the Empire had fallen apart and the state of Austria came into being. In early 1919 Schumpeter was appointed finance minister in a joint Social Democratic and Catholic Conservative government in Austria, led by Karl Renner. Already by the fall of 1919, however, Schumpeter was forced to resign, mainly because the Social Democrats felt that he had betrayed them and could not be trusted. The Social Democrats in particular thought that Schumpeter had gone behind their backs and stopped their attempts to nationalize parts of Austria’s industry. Schumpeter always denied this, and in retrospect it is difficult to establish the truth. What is clear, however, is that the Austrian Social Democrats advocated a democratic and peaceful form of socialism, while Schumpeter detested all forms of socialism. Loath to go back to his position at the University of Graz, Schumpeter spent the early part of the 1920s working for a small bank in Vienna, the Biedermann Bank. Schumpeter did not take part in the everyday activities of the bank but mainly worked as an independent investor.
These activities ended badly, however, and by the mid-1920s Schumpeter was forced to leave the Biedermann Bank, having by this time accumulated a considerable personal debt. For many years to come, Schumpeter would give speeches and write articles in order to pay back his debts. Adding to these economic misfortunes was the death in 1926 of his beloved wife Anna (‘Annie’) Reisinger, a working-class girl 20 years his junior to whom Schumpeter had only been married for a year. From this time on, Schumpeter’s friends would later say, there was a streak of pessimism and resignation in his personality. In 1925 Schumpeter was appointed professor of public finance at the University of Bonn, and it is clear that he was both pleased and relieved to be back in academia. While his ambition, no doubt, extended to politics and business, he always felt an academic at heart. The great creativity that Schumpeter had shown in the 1910s was, however, not to return in the 1920s, and Annie’s death was no doubt an important reason for this. Schumpeter’s major project during the 1920s was a book in the theory of money, but this work was never completed to his satisfaction (Schumpeter 1970).
Schumpeter did, however, write several interesting articles during these years, some of which are still worth reading. One of these is ‘The Instability of Capitalism’ (1928), in which Schumpeter suggests that capitalism is undermining itself and will eventually turn into socialism. Another worthwhile article is ‘Gustav v. Schmoller und die Probleme von heute’ (1926), in which Schumpeter argues that economics has to be a broad science, encompassing not only economic theory but also economic history, economic sociology, and statistics. Schumpeter’s term for this type of general economics is ‘Sozialökonomik.’ To illustrate what economists can accomplish with the help of sociology, Schumpeter wrote an article on social classes (1927), which is still often referred to. Schumpeter starts out by drawing a sharp line between the way that the concept of class is used in economics and in sociology. In economics, Schumpeter says, class is basically used as a category (e.g., wage-earner/nonwage-earner), while in sociology class is seen as a piece of living reality. Schumpeter also attempted to link his theory of the entrepreneur to the idea of class. In mentioning Schumpeter’s article on social classes, something must also be said about two of his other famous studies in sociology: ‘The Fiscal Crisis of the Tax State’ (1918) and ‘The Sociology of Imperialisms’ (1918–19). The former study can be described as a pioneer effort in the sociology of taxation, which focuses on the situation in Austria just after World War I but which also attempts to formulate some general propositions about the relationship between taxation and society. Schumpeter, for example, suggests that if the citizens demand more and more subventions from the state, but are unwilling to pay for these through ever higher taxes, the state will collapse.
In the article on imperialism, Schumpeter argues that imperialism grows out of a social situation which is not to be found in capitalism; and whatever imperialist forms that still exist represent an atavism and a leftover from feudal society. Schumpeter’s famous definition of imperialism is as follows: ‘imperialism is the objectless disposition on the part of the state to unlimited forcible expansion.’
2. The Years in the United States By 1930 Schumpeter was still depressed about his life and quite unhappy about his career; he had found no one to replace his wife on an emotional level and he felt that he deserved a better professional position than Bonn. When Werner Sombart’s chair at the University of Berlin became vacant, Schumpeter had high hopes that he would get it. When this did not happen, however, he instead accepted an offer from Harvard University; and in 1932 he moved to Cambridge, Massachusetts for good. Schumpeter was slowly to regain his emotional composure, and also his capacity
to work, in the United States. In 1937 he married the economist Elizabeth Boody, who regarded it as her duty in life to take care of Schumpeter and manage his worldly affairs. With the support of his wife, Schumpeter produced three major works: Business Cycles (1939), Capitalism, Socialism, and Democracy (1942), and History of Economic Analysis (published posthumously in 1954).
2.1 The Works from the American Period

Of these three works, the one for which Schumpeter had the highest hopes was definitely Business Cycles (1939), which appeared exactly 25 years after his last book in Europe. This work is more than 1100 pages long and it traces business cycles in Great Britain, the United States, and Germany during the years 1787–1938. Schumpeter had hoped that Business Cycles would get the kind of reception that Keynes’ General Theory got, and he was deeply disappointed by the lack of interest that his colleagues and other economists showed. The impression one gets from histories of economic thought is that Business Cycles is rarely read today, mainly because few people are persuaded by Schumpeter’s theory that business cycles are set off by innovations and that their motion can best be described as a giant Kondratieff cycle, modified by Juglar and Kitchin cycles; Schumpeter’s handling of statistics is also considered poor. To a large extent this critique is justified, although it should also be noted that a number of themes and issues in this and other works by Schumpeter are increasingly being referred to in evolutionary economics.

After many years of excruciatingly hard work on Business Cycles, Schumpeter decided that he needed an easier task than the recasting of economic theory that had been on his agenda since his time in Bonn. In the late 1930s Schumpeter therefore began working on what he saw as a minor study of socialism. This minor study, however, quickly grew in size, and when it finally appeared in 1942, under the title Capitalism, Socialism, and Democracy, it was nearly 400 pages long. Of all of Schumpeter’s works, this book (1942) was to become the most popular one, especially in the United States, where it is still considered a standard work in political science. Capitalism, Socialism, and Democracy has also been translated into more languages than any other work by Schumpeter, and it is still in print.
Capitalism, Socialism, and Democracy contains an interesting history of socialist parties (part V), a fine discussion of the nature of socialism (part III), and an excellent discussion of the work of Karl Marx (part I). What has made this work into a social science classic, however, are its two remaining parts with their discussion of capitalism (part II) and the nature of democracy (part IV). Capitalism, Schumpeter says, can be characterized as a process of continuous
economic change, through which new enterprises and industries are continuously being created and old ones continuously being destroyed. Schumpeter’s famous term for this process is ‘creative destruction.’ Perfect competition of the type that economists talk about, he also says, is a myth; and monopolies, contrary to common belief and economists’ dogma, are often good for the economy. Without monopolies, for example, very expensive forms of research and investment would seldom be undertaken.

Schumpeter’s theory of monopolies has led to a large number of studies of innovation. By ‘the Schumpeterian hypothesis’ two propositions are meant: that large firms are more innovative than small firms, and that innovation tends to be more frequent in monopolistic industries than in competitive ones. Each of these hypotheses can in its turn be broken down into further hypotheses. Many decades later the results of all of these studies are still inconclusive, and there is no consensus on whether monopolistic forms of enterprise further or obstruct innovation (for an overview of research on the Schumpeterian hypothesis, see Kamien and Schwartz 1982).

Schumpeter’s overall theory in Capitalism, Socialism, and Democracy, the idea that capitalism is on its way to disappearing and being replaced by socialism, has, on the other hand, led to little concrete research. One reason for this is probably that many readers of Schumpeter’s book have felt that his argument is very general in nature and at times also quite idiosyncratic. Schumpeter argues, for example, that capitalism tends to breed intellectuals, and that these are hostile to capitalism; that the bourgeois class is becoming unable to appreciate property; and that the general will of the bourgeoisie is rapidly eroding. ‘Can capitalism survive?’ Schumpeter asks rhetorically, and then gives the following answer, ‘No.
I do not think it can.’ Schumpeter’s conclusion about the inevitable fall of capitalism has often been sharply criticized (e.g., Heertje 1981). His more general attempt to tackle the question of which social and economic institutions are needed for an economic system to work properly has, however, remained unexplored. There also exists a certain tendency to cite catchy phrases from Schumpeter’s work without paying attention to their organic place in his arguments.

What has always been much appreciated in Capitalism, Socialism, and Democracy is Schumpeter’s theory of democracy. Schumpeter starts out by criticizing what he calls ‘the classical doctrine of democracy,’ which he characterizes in the following way: ‘the democratic method is that institutional arrangement for arriving at political decisions which realizes the common good by making the people itself decide issues through the election of individuals who are to assemble in order to carry out its will.’ What is wrong with this theory, he argues, is that there is no way of establishing what the common good is. There is also the additional fact that politicians do not have interests
of their own, according to this theory; their function is exclusively to translate the desires of the people into reality. To the classical type of democracy Schumpeter counterposes his own theory of democracy, which he defines in the following manner: ‘the democratic method is that institutional arrangement for arriving at political decisions in which individuals acquire the power to decide by means of a competitive struggle for the people’s vote.’ Some critics have argued that Schumpeter’s theory is elitist in nature, since democracy is reduced to voting for a political leader at regular intervals. This may well be true, and Schumpeter’s theory of democracy no doubt coexists in his work with contempt for the masses. Regardless of this critique, however, Schumpeter’s theory is generally regarded as being considerably more realistic than the classical doctrine of democracy.

World War II was a very trying time for Schumpeter since he felt emotionally tied to German-speaking Europe, even though he was by now a US citizen. Schumpeter did not support the Nazis but, like many conservatives, felt that Communism constituted a more dangerous enemy than Hitler. Schumpeter’s intense hatred of Roosevelt, whom he suspected of being a socialist, also put him on a collision course with the academic community in Cambridge during World War II. In his diary Schumpeter expressed his ambivalence towards Hitler, Jews, Roosevelt, and many other things; and in his everyday life he withdrew more and more into solitude and scholarship. The War, in brief, was a very difficult and demoralizing time for Schumpeter.

From the early 1940s Schumpeter started to work feverishly on a giant history of economic thought that was to occupy him until his death some ten years later. He did not succeed in completing History of Economic Analysis, which was instead put together by Elizabeth Boody Schumpeter after several years of work.
History of Economic Analysis is many times longer than Schumpeter’s history of economics from 1914, and it is also considerably more sophisticated in its approach. Schumpeter’s ambition with this work, it should first of all be noted, was not only to write the history of economic theory, but also to say something about economic sociology, economic history, and economic statistics, which together with economic theory make up the four ‘fundamental fields’ of economics or Sozialökonomik. It should also be mentioned that History of Economic Analysis begins with a very interesting methodological discussion of the nature of economics. This is then followed by a brilliant, though occasionally willful, discussion of economics from classical Greece to the years after World War I. The section on twentieth century economics was unfortunately never completed, but it is clear that Schumpeter was very discontented with the failure of his contemporaries to create a truly dynamic economic theory.

Schumpeter died in his sleep on January 8, 1950, and the cause was put down as a cerebral hemorrhage. Some
of his friends thought that the real reason was overwork, while his diary shows that he had been tired of life for a long time. Schumpeter’s private notes also show that behind the image of the brilliant lecturer and iconoclastic author, which Schumpeter liked to project, there was a person who felt deeply unhappy with the way that his life had evolved and with his failure to produce a theory that would revolutionize economic thought. ‘With a fraction of my ideas,’ he wrote in dismay a few years before his death, ‘a new economics could have been founded.’

See also: Business History; Capitalism; Democracy; Democracy, History of; Democracy: Normative Theory; Democratic Theory; Determinism: Social and Economic; Development, Economics of; Development: Social; Economic Growth: Theory; Economic History; Economic Sociology; Economic Transformation: From Central Planning to Market Economy; Economics, History of; Economics: Overview; Economics, Philosophy of; Entrepreneurship; Entrepreneurship, Psychology of; Imperialism, History of; Imperialism: Political Aspects; Innovation and Technological Change, Economics of; Innovation, Theory of; Marxian Economic Thought; Policy History: State and Economy; Political Economy, History of; Social Democracy; Socialism; Statistical Systems: Economic
Bibliography

Allen R L 1991 Opening Doors: The Life and Work of Joseph Schumpeter. Transaction Publishers, New Brunswick, NJ
Augello M 1990 Joseph Alois Schumpeter: A Reference Guide. Springer-Verlag, New York
Harris S E (ed.) 1951 Schumpeter: Social Scientist. Books for Libraries Press, Freeport, New York
Heertje A (ed.) 1981 Schumpeter’s Vision: Capitalism, Socialism and Democracy After 40 Years. Praeger, New York
Kamien M I, Schwartz N L 1982 Market Structure and Innovation. Cambridge University Press, Cambridge, UK
Schumpeter J A 1908 Das Wesen und der Hauptinhalt der theoretischen Nationalökonomie. Duncker & Humblot, Leipzig
Schumpeter J A 1912 Theorie der wirtschaftlichen Entwicklung. Duncker & Humblot, Leipzig (actually appeared in 1911, 2nd edn. 1926). [trans. into English in 1934 as The Theory of Economic Development, Harvard University Press, Cambridge, MA]
Schumpeter J A 1914 Epochen der Dogmen- und Methodengeschichte. In: Bücher et al. (eds.) Grundriss der Sozialökonomik. I. Abteilung, Wirtschaft und Wirtschaftswissenschaft. J.C.B. Mohr, Tübingen, Germany, pp. 19–124
Schumpeter J A 1926 Gustav v. Schmoller und die Probleme von heute. Schmollers Jahrbuch 50: 337–88
Schumpeter J A 1939 Business Cycles: A Theoretical, Historical and Statistical Analysis of the Capitalist Process. McGraw-Hill, New York
Schumpeter J A 1942 Capitalism, Socialism, and Democracy. Harper and Brothers, New York
Schumpeter J A 1954 History of Economic Analysis. Oxford University Press, New York
Schumpeter J A 1954 Economic Doctrine and Method. Oxford University Press, New York
Schumpeter J A 1970 Das Wesen des Geldes. Vandenhoeck and Ruprecht, Göttingen, Germany
Schumpeter J A 1989 The instability of capitalism. In: Schumpeter J A (ed.) Essays. Transaction Publishers, New Brunswick, NJ
Schumpeter J A 1991 The crisis of the tax state. In: Schumpeter J A (ed.) The Economics and Sociology of Capitalism. Princeton University Press, Princeton, NJ
Schumpeter J A 1991 The sociology of imperialisms. In: Schumpeter J A (ed.) The Economics and Sociology of Capitalism. Princeton University Press, Princeton, NJ
Schumpeter J A 1991 Social classes in an ethnically homogeneous milieu. In: Schumpeter J A (ed.) The Economics and Sociology of Capitalism. Princeton University Press, Princeton, NJ
Schumpeter J A 1991 The Economics and Sociology of Capitalism. Princeton University Press, Princeton, NJ
Schumpeter J A 2000 Briefe/Letters. J.C.B. Mohr, Tübingen, Germany
Stolper W F 1994 Joseph Alois Schumpeter: The Public Life of a Private Man. Princeton University Press, Princeton, NJ
Swedberg R 1991 Schumpeter: A Biography. Princeton University Press, Princeton, NJ
Weber M (ed.) 1914–20 Grundriss der Sozialökonomik. J.C.B. Mohr, Tübingen, Germany
R. Swedberg
Schütz, Alfred (1899–1959)

Alfred Schütz was born in Vienna on April 13, 1899. There he studied law and social sciences from 1918 to 1921. After receiving his Doctorate of Law he continued his studies in the social sciences until 1923. His teachers included Hans Kelsen, an eminent lawyer, as well as Ludwig von Mises and Friedrich von Wieser, prominent representatives of the Austrian school of economics. As important as his university studies was his participation in the informal academic life of Vienna, e.g., in the private seminars of von Mises, where Schütz also cultivated his philosophical interests. In addition to his scholarly activities, Schütz held a full-time job as a bank attorney, a dual life which lasted until 1956. After the annexation of Austria by the Third Reich, Schütz and his family escaped to New York. On his arrival, Schütz got in touch with Talcott Parsons, whose Structure of Social Action he intended to review. But despite their intense correspondence (Schütz and Parsons 1978), the contact between Schütz and Parsons broke down. Instead, Schütz became affiliated with the Graduate Faculty at the New School for Social Research in New York, where he started teaching as Lecturer in 1943, advancing to full Professor of Sociology in 1952 and also of Philosophy in 1956. He died on May 20, 1959 in New York.
1. Major Aims of Schütz’s Theory

Alfred Schütz’s social theory focuses on the concept of the life world, understood as a social-cultural world as it is perceived and generated by the humans living in it. Schütz identifies everyday interaction and communication as the main processes in which the life world’s social reality is constituted. His approach accordingly pursues two major goals: the development of a theory of action which reveals the constitution of social reality with its meaning structure given in the commonly shared typical patterns of knowledge, and a description of the life world with its multiple strata of reality emerging from these constituting processes. These leitmotifs merge in a theory of life world constitution, a project which occupied Schütz during the final years of his life.
2. Main Intellectual Sources of the Schützian Approach

Schütz’s conception originated in the intellectual discourse in the social sciences and philosophy of the early decades of the twentieth century. Schütz developed his position from within a triangular interdisciplinary field marked by Max Weber’s historising interpretative sociology (verstehende Soziologie), the Austrian economic approach represented by Ludwig von Mises, and the philosophical theories of the stream of consciousness formulated by H. Bergson and E. Husserl. From the very outset, he adopted Max Weber’s view of social reality as a meaningful social-cultural world, as well as his concept of social action as meaning-oriented behavior, and shared Weber’s search for an interpretative method in sociology. He also shared the methodological individualism advocated by Weber and by the Austrian economists, who designated individual action as the starting point of social research. At the same time, he accepted the need for a generalizing theory of action, as stressed by his teacher Ludwig von Mises, and criticized Weber for neglecting to develop the basic framework of such a theory and especially for not inquiring into the constitution of the meaning attached to social action. But Schütz’s criticism also addressed the ‘general theory of action’ based on an ‘a priori’ postulate of rational choice put forth by Ludwig von Mises. Schütz rejected this conception since it imposed an abstract framework on actors’ orientation of action and ignored their actual stock of knowledge. In order to analyze how the meaning attached to action is revealed, Schütz referred to the philosophical concepts of Henri Bergson and Edmund Husserl. Both offer insights into the stream of lived experience and into the acts of consciousness in which our world is meaningfully constituted.
Schütz adopted the Bergsonian idea of the stream of consciousness in his first scholarly attempts in 1924–28 (Schütz 1982), only to later recognize the difficulties in Bergson’s intuitivism
and to turn to Husserl’s phenomenology, which ultimately gave the name to his approach. Husserl was concerned with the acts of consciousness which establish the meaningful taken-for-grantedness of the world in humans’ natural attitude. His term ‘life world,’ which Schütz later adopted, is aimed at the reality constituted in those acts. Based primarily on this philosophical approach, Schütz wrote his first monograph, Der sinnhafte Aufbau der sozialen Welt, in 1932, in which he developed the basic concept of his theory (Schütz 1932). Starting with the criticism of Max Weber mentioned above, he devoted himself to the process in which humans create the social world as a reality that is meaningful and understandable to them.
3. Theory of the Life World with its Everyday Core

In order to grasp the constitution of the social world, Schütz (1932) had to transcend the realm of consciousness and perception analyzed by Husserl. He included both acts of consciousness and human action and interaction in the constitutive process under scrutiny. Departing from the transcendental philosophical approach, he developed his own mundane phenomenology, which analyzes the constitution of the meaningful reality within social relationships in the everyday world. He proceeded in three steps dealing with three problems constitutive for any theory of action which is searching for the construction of social reality. (a) How does a meaningful orientation of human action emerge? (b) How can we understand the Other? (c) How is socially shared, intersubjectively valid knowledge generated?

Following the principle of methodological individualism, Schütz started with the analysis of single individual action. Relying on Bergson’s concept of the inner stream of lived experience and on Husserl’s investigations into the intentionality and temporality of consciousness, Schütz first explored the constitution of meaning in subjects as a temporal process. The meaning attached to a lived experience emerges from successive acts of selective reflection aimed towards it and framing it into a context of other experiences. The schemes of experience on which the primary meaning of action, i.e., its project, is based arise from this basic temporal dynamics and plasticity of consciousness. The temporality of the meaning attached to an action manifests itself in two kinds of motivation attached to its different temporal dimensions: ‘in-order-to-motives,’ which direct action toward the future, and ‘because-motives,’ representing the roots of the action in the past. However, the meaning of the project of action changes during the time when the projected action is
going on. The final meaning of an action can be found in the experience perceived by the subject when looking at the changes that have emerged in the meaning structure of the action when the action is completed. The meaning attached to an action and consequently the schemes of experience are thus affected not only by the acts of consciousness but also by the action itself.

In his second step, Schütz proceeded to show how the schemes of experience are shaped by interaction and communication. He perceived communication as a process in which two subjective streams of consciousness are coordinated within a social interaction (Wirkensbeziehung). Thus, communication signifies an interaction in which the meaning of ego’s action consists in the intention to evoke a reaction of the alter. Actions in this sense have the function of signs which are mutually indicated and interpreted. The ultimate meaning of my action is revealed in the reaction of the Other and vice versa. Communication therefore generates a chain of motives in which my in-order-to-motives become the because-motives of the Other, and it provides a common stock of shared patterns of interpretation which allows for mutual understanding even if each of the actors is always referring to his or her own schemes of experience. In this concept of understanding, based on interaction, Schütz offers his own solution to the problem of intersubjectivity posed by Husserl.

The constitution and appropriation of shared knowledge primarily takes place in long-lasting face-to-face interaction (we-relations), where the mutual expectations are learned, verified, and sedimented into typical patterns that can be applied to more remote and anonymous strata of the social world. As a result, the meaning structure of the social world is characterized by typifications of actions, situations, and persons generated in interaction and communication. In these three steps, Schütz laid down the main features of his theory of the life world.
In his later work, Schütz (1962, 1964, 1966) characterized this communicatively created social reality as the world of everyday life, in which typical patterns are taken for granted, and which represents the intersubjective common core of the reality we live in. Later on (Schütz 1962, 1970), he disclosed further structural characteristics of this everyday core of the life world. Since its typical structure greatly depends on action, it is the pragmatic orientation of action that selects the areas where typification processes take place. Typicality and this kind of selection, which Schütz designated as pragmatic relevance, represent two generative principles of order in the everyday world. They determine its formal structure and at the same time, when realized in action, they shape this structure into a concrete social–cultural world, which is thus characterized by social distribution and differentiation of knowledge. A third moment structuring the everyday world can be found in the rootedness of its constitution in individual action. Here actors and their bodies represent the
central point of the everyday world and its temporal, spatial, and social dimensions, which are arranged into spheres of past and future, of within-reach and distant, and of intimacy and anonymity, respectively, relative to the actors’ own position (Schütz and Luckmann 1973).

Everyday reality is nevertheless not identical with the life world as a whole. By suspending their pragmatic interests, actors are able to modify their everyday experiences and perceive them as objects of a game, fantasy, art, science, or, if conscious attention is completely absent, as a dream. There are areas, even in the realm of everyday action directed by the principle of pragmatic relevance, which are beyond the sphere of actors’ everyday practice and therefore transcend it. All these modifications represent different provinces of meaning transcending the everyday world and constituting multiple realities (Schütz 1962) which make up the life world. Nevertheless, among the different provinces of the life world, the everyday core represents a paramount reality since it is only here that actors, driven by the fundamental anxiety when facing the finitude of their lives, have to master their living conditions and are engaged in communicative processes producing common knowledge which makes mutual understanding possible.

Schütz (1962) viewed communication as a substantial constitutive mechanism of social reality and stressed the role of language in this process. He considered language, including its syntax and semantics, to be an objectivation of a sedimented, socially provided stock of knowledge that preserves the relevances and typifications inherent to cultures and social groups, and thus as crucial for the constitution of the life world (Schütz and Luckmann 1989). Schütz saw language as an essential case of the socially objectified systems of appresentation which bridge the transcendence between different areas and meaning provinces of the life world.
Symbolic structures, which are often based on language, are another important integrative mechanism in the life world: they mediate between the everyday world and noneveryday realities such as religion, arts, or politics (Schütz 1962). Schütz did not restrict his study of the structure of the life world solely to theoretical research. He also applied his theory as a general interpretative scheme to several social fields and problems. He explored courses of intercultural communication and the social distribution of knowledge, the social conditions of equality in modern societies, as well as the fields of music and literature (Schütz 1964).
4. Consequences of the Theory of Life World for the Methodology of the Social Sciences

Schütz considered the construction of ideal types to be the general methodological device in the social sciences since sociology, economics, political science, etc., explain social phenomena by modeling ideal actors and courses of action which they endow with special features as explanatory variables. However, he did not see ideal types in the Weberian sense, i.e., as a scientific model which does not need to correspond to reality in all respects. For Schütz, the ideal-typical method is legitimized by his findings on the typicality of the everyday world, which is the object of social research. Social reality can be approached by type-construction on the scientific level because its immanent structure is itself typified. Thus, the Schützian theory of the life world and its structures represents a methodological tool to bridge the gap between sociological theoretical reasoning and its object. The methodological rule which Schütz derived from his theoretical approach correspondingly consists in the postulate of adequacy (Schütz 1962, 1964) between everyday and scientific typifications. This postulate holds that ideal types featured by the social sciences have to be constructed in correspondence to the structure of the everyday typifications, so that everyday actors can take the model for granted when they act under the conditions stated in the ideal type. In this sense, social scientists as well as everyday actors have to follow the same frame of reference given by the structure of the life world, but their cognitive attitudes differ in several respects: scientists neither share the pragmatic interest of everyday actors nor do they share their everyday rationality restricted by their beliefs in the grantedness of typical knowledge. Opposing T. Parsons’s functionalism and C. G. Hempel’s methodological positivism, Schütz stressed that scientists must not impose their own theoretical concepts and rationality on their object of study, the life world. First, they must discover and then respect its immanent meaning structure.
5. The Significance of the Schützian Approach for the Social Sciences

Since the 1960s, Schütz’s theory has drawn sociologists’ attention to inquiries into everyday interaction and communication, as well as to the insight that social reality has to be considered as a construction produced within these processes. Sociologists started to examine the practices of everyday action, communication, and interpretation from which social reality emerges. H. Garfinkel’s (1967) Ethnomethodology, which was aimed at finding the formal properties of everyday practices, led to a series of case studies covering a wide range of everyday life in society and its institutions. Continuing this line of research, examinations of everyday communication were provided by E. A. Schegloff and H. Sacks (1995), whose Conversational Analysis became a widespread method in qualitative sociological research. A. Cicourel’s (1964) Cognitive Sociology revealed the constructed character of data in social institutions as well as in science and thus initiated a
series of studies in the sociology of organization and in the sociology of science. Milieus as formations of everyday interaction are the subject of the Social Phenomenology of R. Grathoff (1986). P. L. Berger and Th. F. Luckmann’s (1966) idea of the Social Construction of Reality (as being a general process in which cultural worlds emerge) triggered new impulses in both the sociology of knowledge and the sociology of culture, reconceived now in Schützian terms. In this context, Schütz’s impact can also be seen in the sociology of language and the sociology of religion. Contemporary Marxian theory saw the way in which Schütz focused on everyday practice as a possible mediation between social structure and individual consciousness. The term phenomenological sociology was coined in the 1970s (G. Psathas 1973) to address the spectrum of approaches oriented and inspired by Schütz’s theory. Under this label, Schütz’s approach became one of the general paradigms in interpretative social science and the theory of action.

The diffusion and empirical application of the Schützian approach reinforced the search for qualitative research methods which would reveal data pertaining to the social construction of social reality in everyday life. Aside from the ethnomethodology and conversational analysis mentioned above, this quest especially led to a refinement in the techniques of narrative interviews and in biographical research. Once established in the 1970s, the Schützian paradigm influenced the mainstream of sociological theory, which became sensitive not only to the social construction of the life world but also to the phenomenological background of the Schützian theory. The ‘life world,’ in the sense of a basic social reality provided by humans in their ‘natural’ intercourse, became one of the central terms in social theory (J. Habermas 1981).
The everyday construction of social reality was recognized as a crucial mechanism in which society emerges (P. Bourdieu 1972, A. Giddens 1976, Z. Bauman 1991). The phenomenological conception of meaning constitution, reformulated as an autopoiesis (self-creation) of social and psychic systems, influenced the development of contemporary sociological systems theory (N. Luhmann 1996).

Beyond the scope of sociology, other humanities also gained innovative impulses from the Schützian approach. In philosophy, Schütz’s theory led to a critical assessment of the Husserlian view of intersubjectivity and to the conceptions of a worldly phenomenology (L. Embree 1988) or of a philosophy of modern anonymity (M. Natanson 1986). In literary studies, Schütz’s constructionism inspired the aesthetics of reception, which pointed out the beholder’s participation in co-creating the autonomous reality of literary works (H. R. Jauss 1982). Schütz’s concept of the structure of the life world also affected theorizing in social geography (B. Werlen 1993), educational theory (K. Meyer-Drawe 1984), and political science (E. Voegelin 1966).
See also: Constructivism/Constructionism: Methodology; Culture, Sociology of; Ethnomethodology: General; Everyday Life, Anthropology of; Husserl, Edmund (1859–1938); Interactionism: Symbolic; Interpretive Methods: Micromethods; Knowledge, Sociology of; Methodological Individualism: Philosophical Aspects; Phenomenology in Sociology; Phenomenology: Philosophical Aspects; Verstehen und Erklären, Philosophy of; Weber, Max (1864–1920)
Bibliography

Bauman Z 1991 Modernity and Ambivalence. Polity Press, Oxford, UK
Berger P L, Luckmann Th 1966 The Social Construction of Reality. Doubleday, Garden City, New York
Bourdieu P 1972 Esquisse d’une Théorie de la Pratique, précédé de trois études d’ethnologie kabyle. Droz S.A., Geneva, Switzerland
Cefai D 1998 Phénoménologie et les Sciences Sociales: la Naissance d’une Philosophie Anthropologique. Librairie Droz, Geneva, Switzerland
Cicourel A V 1964 Method and Measurement in Sociology. The Free Press of Glencoe, New York
Embree L 1988 Worldly Phenomenology: The Continuing Influence of Alfred Schutz on North American Human Sciences. University Press of America, Washington DC
Garfinkel H 1967 Studies in Ethnomethodology. Prentice-Hall, Englewood Cliffs, NJ
Giddens A 1976 New Rules of Sociological Method. Hutchinson & Co., London, UK
Grathoff R 1986 Milieu und Lebenswelt. Suhrkamp, Frankfurt/M, Germany
Habermas J 1981 Theorie des kommunikativen Handelns. Suhrkamp, Frankfurt/M, Germany
Jauss H R 1982 Ästhetische Erfahrung und literarische Hermeneutik. Suhrkamp, Frankfurt/M, Germany
Luhmann N 1996 Die neuzeitlichen Wissenschaften und die Phänomenologie. Picus, Vienna, Austria
Meyer-Drawe K 1984 Leiblichkeit und Sozialität. Fink, Munich, Germany
Natanson M 1986 Anonymity: A Study in The Philosophy of Alfred Schutz. Indiana University Press, Bloomington, IN
Psathas G (ed.) 1973 Phenomenological Sociology. Issues and Applications. John Wiley & Sons, New York
Sacks H 1995 Lectures on Conversation. Blackwell, Oxford, UK
Schütz A 1932 Der sinnhafte Aufbau der sozialen Welt. Springer, Vienna
Schütz A 1962 Collected Papers I. Nijhoff, The Hague, The Netherlands
Schütz A 1964 Collected Papers II. Nijhoff, The Hague, The Netherlands
Schütz A 1966 Collected Papers III. Nijhoff, The Hague, The Netherlands
Schütz A 1970 Reflections on the Problem of Relevance. Yale University Press, New Haven, CT
Schütz A 1982 Life Forms and Meaning Structure. Routledge and Kegan Paul, London
Schütz A 1995 Collected Papers IV. Nijhoff, The Hague, The Netherlands
Schütz A, Luckmann T 1973 Structures of Life World I. Northwestern University Press, Evanston, IL
Schütz A, Luckmann T 1989 Structures of Life World II. Northwestern University Press, Evanston, IL
Schütz A, Parsons T 1978 Theory of Social Action: The Correspondence of Alfred Schütz and Talcott Parsons. Indiana University Press, Bloomington, IN
Srubar I 1988 Kosmion: Die Genese der pragmatischen Lebenswelttheorie von Alfred Schütz und ihr anthropologischer Hintergrund. Suhrkamp, Frankfurt am Main, Germany
Wagner H R 1983 Alfred Schütz. An Intellectual Biography. University of Chicago Press, Chicago
Werlen B 1993 Society, Action and Space. An Alternative Human Geography. Routledge, London
Voegelin E 1966 Anamnesis. Zur Theorie der Geschichte und Politik. Piper, Munich
I. Srubar
Science and Development

During the past half-century, conventional understandings of science, development, and their relationship have changed radically. Formerly, science was thought to refer to a clear and specific variety of Western knowledge with uniformly positive effects on society. Formerly, development was viewed as a unidirectional process of social change along Western lines. Formerly, science was viewed as a powerful contributor to the developmental process. Each of these ideas has been subjected to insightful criticism. This article will examine science and development, concluding that the relationship between the two is problematic, partly because of the complexity of the concepts themselves. Three major theories of development are considered, together with the main types of research institutions in developing areas.
1. Science

Much of what is termed science in developing areas is far from what would be considered ‘pure science’ in the developed world. The ‘root concept’ of science involves research, the systematic attempt to acquire new knowledge. In its modern form, this involves experimentation or systematic observation by highly trained specialists in research careers, typically university professors with state-of-the-art laboratory equipment. These scientists seek to contribute to a cumulative body of factual and theoretical knowledge, testing hypotheses by means of experiments and reporting their results to colleagues through publication in peer-reviewed journals. Yet when a new variety of seed is tested by a national research institute and distributed to farmers in Africa, this is described as the result of ‘science.’ When the curator of a botanical exhibit has a college
degree, he or she may be described as ‘the scientist.’ When a newspaper column discusses malaria or AIDS, ‘scientific treatments’ are recommended. Seeds, educated people, and advice are not science in the abstract and lofty sense of the pursuit of knowledge for its own sake or systematically verified facts about the world. But they are science from the standpoint of those who matter—local people who spend scarce resources on their children’s education, development experts who determine how and where to spend funds, politicians who decide whether to open a new university, corporate personnel who open a new factory in a developing region. Perhaps the most important shift in recent thinking about science is a broadening of the scholarly view to include the ideas of science found among ordinary people. These are often more extended in developing areas, because of the association of science with ‘modern’ things and ideas. ‘Science’ in its extended sense includes technological artifacts, trained expertise, and knowledge of the way the world works. The importance of this point will be clear in the conclusion. Given the fuzziness of the boundaries that separate science from other institutions, and the dependence of modern research on sophisticated technical equipment, the term ‘technoscience’ is often used to denote the entire complex of processes, products, and knowledge that flows from modern research activities. Even if we recognize that the term ‘science’ has extended meanings, it is useful to draw a distinction between (a) the institutions that produce knowledge and artifacts and (b) the knowledge that is produced. That is, on the one hand, there are organizations, people, and activities that are devoted to the acquisition of knowledge and things that can be produced with knowledge. These constitute the modern organization of research. 
On the other hand, there are claims involving knowledge and artifacts, often significantly transformed as they leave the confines of the research laboratory. What makes claims and practices ‘scientific’ is their association with scientific institutions. Modern research capacity is concentrated in industrialized countries. Indeed, with respect to the global distribution of scientific and technical personnel, scientific organizations, publications, citations to scientific work, patents, equipment, and resources, scientific institutions display extremely high degrees of inequality. The most common indicator of scientific output is publications. In 1995, Western Europe, North America, Japan, and the newly industrialized countries accounted for about 85 percent of the world total. Leaving aside countries and allies of the former Soviet Union, developing areas contributed less than 9 percent of the world total. Much the same applies to technological output measured in patents and expenditures on research and development (UNESCO 1998). Yet if we shift our focus from the question of inequality to the question of diffusion, an entirely different picture arises. To what extent have the idea and practice of research spread throughout the world? The main issues here involve who conducts research, on what subjects, and what happens to the results. Each of these topics is the subject of analysis and controversy. Scientific research in developing countries began during the pre-independence era with the establishment of universities and research institutes. Research was conducted on crops and commodities for export as well as conditions (e.g., disease) that affected the profits sought by external agents from their control over the land, labor, and property of colonized peoples. Methodologies and organizational models for research were brought by European colonists to Asia, Africa, and Latin America. During the era of independence and throughout the 1970s, the number of types of entities engaged in the generation of knowledge multiplied. Scientific organizations now fall into five main types, or sectors: academic departments, state research institutes, international agencies, private firms and, to a lesser degree, nongovernmental organizations.
2. Development

The concept of development involves several dimensions of transformation, including the creation of wealth (that is, rapid and sustained economic growth) and its distribution in a fashion that benefits a broad spectrum of people rather than a small elite (that is, a reduction in social inequality). Cultural transformation (recognition of and attendant value placed on local traditions and heritage) has also been viewed as an important aspect of the process since the early 1980s. There is general agreement that development in the second half of the twentieth century is not a mere recapitulation of the process of industrialization that characterized Europe and North America in the eighteenth and nineteenth centuries. Three theoretical perspectives, with many variations, have dominated development studies: modernization, dependency, and institutional. One way of distinguishing these theories is by their position on the ways in which relationships external to a country affect the process of change. Since scientific institutions and knowledge claims are of external origin, each of these perspectives views science and technology as important in the development process, with very different assessments of the costs and benefits.
2.1 Modernization

The oldest approach, sometimes called modernization theory, focused on the shift from a traditional, rural, agricultural society to a modern, urban, industrial form. Transformations internal to a country (such as formal education, a market economy, and democratic
political structures) are emphasized, while external relationships are de-emphasized. Science, however, was the exception: it was seen as available to benefit developing nations through technology transfer from Western sources. This idea relied on two assumptions. One was the ‘hardness’ of technological artifacts: their alleged independence from people and culture, their seeming ability to produce certain effects ‘no matter what.’ The second was the ‘linear model’ of technology development, in which the (a) discoveries of basic science lead to the (b) practical knowledge of applied science and finally to (c) technological applications such as new products. In retrospect, both of these assumptions were simplistic in any context, but in the developing world they were especially problematic. The assumption of ‘hardness’ has been replaced by the generalization that the uses, effects, and even the meanings of technological artifacts are affected by the context of use. First, effective technologies, from automobiles to indoor plumbing, do not typically stand alone, but are embedded in systems that provide infrastructure (roads, sewage treatment), which is often lacking. Second, the provision of artifacts such as buildings and computers is much easier than their maintenance, which requires both resources and knowledge. Third, the introduction of new technology involves a multiplicity of consequences: positive and negative, short term and long term, economic and ecological. Many of these consequences are unpredictable, even in those rare cases where such foresight is attempted. The case of the Green Revolution is illustrative. In the 1960s, widespread food shortages, population growth, and predicted famine in India prompted major international foundations to invest research and technology transfer efforts towards the goals of increasing agricultural productivity and the modernization of technology. What resulted were new kinds of maize, wheat, and rice.
These modern varieties promised higher yields and rapid maturity, but not without other inputs and conditions. They were, rather, part of a ‘package’ that required fertilizers as well as crop protection inputs such as pesticides, herbicides, and fungicides—sometimes even irrigation and mechanization. Moreover, seed for these varieties had to be purchased anew each year. The consequences of the Green Revolution are still debated, and there is little doubt that many of them were positive. Famine in India was averted through increased yields, but the benefits of the technology required capital investments that were only possible for wealthier farmers. Not only did the adoption of new technology increase dependence on the suppliers of inputs, but it was claimed to increase inequality by hurting the small farmer—one intended beneficiary of the technology. The actual complexity of the outcomes is revealed by one of the most sophisticated assessments—modern seed varieties do reach small farmers, increase employment, and decrease food prices, but the benefits are less than expected because the poor are
increasingly landless workers or near-landless farm laborers (Lipton and Longhurst 1989). What is important for the question of the relationship between science and development is that the products and practices of the Green Revolution were research-based technology. This technology was often developed in international research institutes funded by multilateral agencies such as the World Bank and bilateral donors such as the US Agency for International Development. Since the combined resources of these donors dwarf those of many poor countries, their developmental and research priorities constitute a broad global influence on the nature of science for development. The largest and most visible of these organizations form a global research network, the Consultative Group for International Agricultural Research (CGIAR), which grew from 4 to 13 centers during the 1970s as support by donors quadrupled. The influence of this network of donors and international agencies was clearly evident in the early 1990s when environmental concerns led to an emphasis on ‘sustainability’ issues. This led to a change in CGIAR priorities, as the older emphasis on agricultural productivity shifted to the relatively more complex issue of natural resource management.

2.2 Dependency

Modernization theory emphasized internal factors while making an exception of science. Dependency theory and its close relative, world system theory, emphasized the role of external relationships in the developmental process. Relationships with developed countries and particularly with multinational corporations were viewed as barriers. Economic growth was controlled by forces outside the national economy. Dependency theory focused on individual nations, their role as suppliers of raw materials, cheap labor, and markets for expensive manufactured goods from industrialized countries.
The unequal exchange relationship between developed and developing countries was viewed as contributing to poor economic growth. World system theory took a larger perspective, examining the wider network of relationships between the industrialized ‘core’ countries, impoverished ‘peripheral’ countries, and a group of ‘semiperipheral’ countries in order to show how some are disadvantaged by their position in the global system. Because of their overspecialization in a small number of commodities for export, the unchecked economic influence of external organizations, and political power wielded by local agents of capital, countries on the periphery of the global capitalist system continue to be characterized by high levels of economic inequality, low levels of democracy, and stunted economic growth. What is important about the dependency account is that science is not viewed in benign terms, but rather as one of a group of institutional processes that contribute to underdevelopment. As indicated above, research is highly concentrated in industrialized countries. Dependency theory adds to this the notion that most research is also conducted for their benefit, with problems and technological applications selected to advance the interests of the core. The literature on technology transfer is also viewed in a different light. The development of new technology for profit is associated with the introduction and diffusion of manufactured products that are often unsuited to local needs and conditions, serving to draw scarce resources away from more important developmental projects. The condition of dependency renders technological choice moot. This concern with choice, associated with the argument that technology from abroad is often imposed on developing countries rather than selected by them, has resurfaced in many forms. In the 1970s it was behind the movement known as ‘intermediate’ technology, based on the work of E. F. Schumacher, which promoted the use of small-scale, labor-intensive technologies that were produced locally rather than of complex, imported, manufactured goods. These ‘appropriate’ technologies might be imported from abroad, but would be older, simpler, less mechanized, and designed with local needs in mind. What these viewpoints had in common was a critical approach to the adoption of technology from abroad. By the late 1980s and 1990s even more radical positions began to surface, viewing Western science as a mechanism of domination. These arguments were more closely related to ecological and feminist thought than to the Marxist orientation of dependency theory. Writers such as Vandana Shiva proposed that Western science was reductionist and patriarchal in orientation, leading to ‘epistemic violence’ through the separation of subject and object in the process of observation and experimentation (1991).
‘Indigenous knowledge’ and ‘non-Western science’ were proposed as holistic and sustainable alternatives to scientific institutions and knowledge claims. Such views had an organizational base in nongovernmental organizations (NGOs), which received an increasing share of development aid during this period, owing to donor distrust of repressive and authoritarian governments in developing areas. NGOs have been active supporters of local communities in health, community development, and women’s employment, even engaging in research in alternative agriculture (Farrington and Bebbington 1993).

2.3 Institutional Theory

Institutional theory seeks to explain why nations are committed to scientific institutions as well as what forms these take. The central theme is that organizational structures developed in industrialized countries are viewed by policy makers, donors, and other states as signals of progress towards modern institutional development and hence worthy of financial support. Regardless of the positive or negative consequences of their activities, the introduction and maintenance of certain forms in tertiary education and government serves to communicate this commitment. Institutional theory provides an account of the growth and structure of the academic and state research sectors, as successful organizations in industrialized nations operate as models far from their original contexts. Academic departments consist of researchers grouped by subject, each of whom is relatively free to select research projects. They bear the closest resemblance to the root concept of science introduced at the beginning of this article. But research requires time and resources. In areas such as sub-Saharan Africa, laboratories and fieldwork are poorly funded, if at all, since many institutions can barely afford to pay salaries. Professors teach, consult, and often maintain other jobs. Research is conducted as a secondary activity and professional contacts with other scientists in Europe and the US are few. Equally important to the scientific establishment are state research institutes. As agencies of the state, these organizations are charged with performing research with relevance to development, with health and agriculture the two most important content areas. They are linked to ministries, councils, and international agencies as well as systems (such as Extension Services in agriculture) that deliver technology to users, again based on a model from the developed world.
3. Relationships Between Science and Development

The popularity of dependency arguments and the resurgence of interest in indigenous forms of knowledge imply continued competition for older views of the uniformly positive effects of science. Institutional theory provides an alternative account of the spread of science and its organizational forms. But two features of current scholarship may prove more significant in the long run. First, extreme diversity exists among developing areas in terms of their economic, social, and cultural patterns. It makes decreasing sense to speak of ‘development’ as an area of study. Latin American nations, for example, are generally far better positioned than the nations of sub-Saharan Africa. There is even wide variation within countries, as the case of India makes clear. While much of India qualifies as a developing area, it is among the world’s top producers of scientific work, has a technically skilled, English-speaking labor force second only to the US, and is a leading exporter of computer software for corporations. Second, ‘science’ is viewed as having many dimensions, many effects, and fuzzy institutional boundaries,
but it is always a feature of the modern, industrial, interconnected world. Science cannot be the cause of modernization because, in its diverse institutional articulations and its evolving fit with society, science exemplifies the meaning of modernization itself.

See also: Biomedical Sciences and Technology: History and Sociology; Development: Social-anthropological Aspects; Development Theory in Geography; Innovation and Technological Change, Economics of; Research and Development in Organizations; Science and Technology, Anthropology of; Science and Technology: Internationalization; Science and Technology, Social Study of: Computers and Information Technology; Technology, Anthropology of
Bibliography

Baber Z 1996 The Science of Empire: Scientific Knowledge, Civilization, and Colonial Rule in India. State University of New York Press, Albany, NY
Farrington J, Bebbington A 1993 Reluctant Partners: Nongovernmental Organisations, the State, and Sustainable Agricultural Development. Routledge, London
Gaillard J 1991 Scientists in the Third World. University of Kentucky Press, Lexington, KY
Kloppenburg J 1988 First the Seed: The Political Economy of Plant Biotechnology. Cambridge University Press, Cambridge, UK
Lipton M, Longhurst R 1989 New Seeds and Poor People. Johns Hopkins University Press, Baltimore, MD
Pearse A 1980 Seeds of Plenty, Seeds of Want: Social and Economic Implications of the Green Revolution. Clarendon Press, Oxford, UK
Schumacher E F 1973 Small is Beautiful: Economics as if People Mattered. Harper and Row, New York
Shahidullah S 1991 Capacity Building in Science and Technology in the Third World. Westview Press, Boulder, CO
Shiva V 1991 The Violence of the Green Revolution. Zed Books, London
Shrum W, Shenhav Y 1995 Science and technology in less developed countries. In: Jasanoff S, Markle G, Peterson J, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Stewart F 1977 Technology and Underdevelopment. Westview Press, Boulder, CO
UNESCO 1998 World Science Report 1998. Elsevier, Paris
Yearley S 1988 Science, Technology, and Social Change. Unwin Hyman, London
W. Shrum
Science and Industry

The premodern industrial craft economy provided the initial intersection of industry with science through scientific instrument making. The development of scientific inquiry through craft-based production, and its effects, can be seen in Galileo and the telescope, changing the world-picture and the place of humans within it. From as early as Leeuwenhoek’s microscope to as late as the making of the Hubble telescope in the 1980s, lens and mirror construction was an uncertain art, not fully amenable to scientific analysis and control. The invention of moveable type transformed the transfer of knowledge, through storage and retrieval devices, and facilitated the development of modern scientific as well as popular literature. However, it was the development of the steam engine by a scientifically informed inventor, James Watt, which led to a systematization of practice that could be analyzed scientifically, becoming the basis for the next advance. This process gave rise to a new term, technology, to denote an innovation process as well as its results. Technology is the feedback link between science and industry. Invented in the eighteenth century, this new concept represented, on the one hand, the systematization of craft, through the creation of engineering disciplines, originally to quantify and systematize the construction of military fortifications (Calvert 1967). On the other hand, technology was also derived from the extension of science. This occurred through the creation of applied sciences such as solid state physics and the invention of semiconductor devices such as the transistor. This article surveys the relationship between science and industry since the 1700s.
1. Early Science and Industrial Development

Science originated in the seventeenth century as organized investigation of the natural world according to relatively secure methodological principles. In this era, practical and theoretical concerns each provided between 40 and 60 percent of the impetus to research, with some overlap (Merton 1938). Well before science was institutionalized in universities and research institutes, individual scientists, loosely connected through scientific societies, provided an occasional basis for industrial development. For example, in the absence of reliable navigational techniques, commerce was impeded by the need for ships to stay close to shorelines. In response to a prize offered by the British government, astronomers used their observational techniques and knowledge base to develop star charts useful to navigators. Their involvement in solving a commercial problem was secured without incurring explicit costs. These were considered to be assumable by the astronomers themselves, with navigational research considered as an offshoot of their government or academic responsibilities that could be carried out at marginal cost. Clockmakers approached the problem from the opposite stance by adapting a mechanical device to the solution of the navigational problem. Of lower status and with lesser financial resources and institutional backing than the astronomers, a clockmaker who
arrived at a mechanical solution to the problem had great difficulty in being taken seriously by the judges of the competition. Moreover, as an independent craftsman he was dependent upon receiving intermediate financial dispensations from government to improve his device. Nevertheless, science and craft intersected to overcome blockages to the flow of trade by providing reliable navigational methods (Sobel 1996). Science became more directly involved in industrial production in seventeenth century Germany when professors of pharmacy invented medical preparations in the course of their experimentation. With support from some German princely states, and in collaboration with entrepreneurs, firms were formed to commercialize these pharmaceutical discoveries. Thus, the academic spin-off process, with governmental and industrial links, was adumbrated at this time (Gustin 1975).
2. Incremental Innovation: Learning by Doing

In an era when most industry was craft based, incremental improvements arose primarily from workers’ experience with the process of production. For example, in the course of firing bricks a worker might notice that a brick had attained exceptional strength under high heat and then attempt, through trial and error, to duplicate what the observation of a chance event had brought to light. Eventually, the conditions that produced the original improved brick might be approximately reproduced and a sufficiently high rate of production achieved, with the ‘experimenter’ knowing more or less how but not really why the useful result had been achieved (Landes 1969). Scientific advance was also built upon what is now called ‘learning by doing’ (Lundvall and Borras 1997). By interviewing various practitioners, researchers began to collate and systematize their local knowledge into broader syntheses. Thus, advances in understanding of stratigraphy derived from miners’ practical experience in eighteenth-century Italy (Vaccari 2000). Much, if not most, innovation still takes place through craft-based experience. Indeed, scientific principles and methods such as those developed in operations research have recently been applied to systematize incremental innovation. Incremental innovation itself has been scientized through the application of statistical techniques, pioneered in Japan during the 1930s by the disciples of W. Edwards Deming, the US researcher, who only later gained consulting opportunities and renown in his own country.
3. The Industrialization of Science

The connection between academe, science, and industry strengthened with the invention of the chemistry laboratory as a joint teaching and research format
by Justus Liebig at the University of Giessen in the mid-nineteenth century (Brock 1997). Having achieved reliable analytical methods, Liebig could assign an unsolved problem to a student and expect a solution to arise in due course, with a minimum of supervision from the master or his assistants. The teaching laboratory model then spread to other experimental fields. As an organizational innovation combined with replicable methods of investigation, the laboratory allowed training and original investigation to be coordinated and expanded, creating a larger market for research equipment. Wilhelm von Humboldt theorized the unity of teaching and research as an academic model as well as a practical tenet (Oleson and Voss 1979). The incorporation of science into the university along with methods to revive classical knowledge led to the development of research as an academic practice. The research university model was transferred from Germany to the US in the mid-nineteenth century and eventually became the base for technology transfer and firm formation (Jencks and Riesman 1968). Liebig himself attempted to take the development of technology from academic research a step further by starting businesses based upon his scientific discoveries. His mix of successes and failures foreshadowed the contemporary science-based firm. Nevertheless, combining the roles of researcher and entrepreneur in one person was unusual at the time (Jones 1993). More typically, the professor’s students made technology arising from scientific research into companies. Thus, the emblem of the Zeiss optical firm incorporates a portrait of the original founders, including professor and student.
4. The Foundation of Firms Based on Scientific Research

With the invention of the laboratory, instrument making was internalized within science. As science and the need for research equipment grew, instrument production began to be externalized and made into an industry. Scientific instrument making is an early source of firm formation linked to academic research. For example, scientist-initiated instrument firms grew up around MIT and Harvard in the late nineteenth century, along with consulting firms such as A. D. Little providing a formal overlay of relationships between university and industry, beyond personal ties between teacher and former student. Until quite recently, scientific instrument making was a specialized niche industry, having more to do with academia than industry. A shift toward dual use industrial and scientific technologies began to occur with the development of electronic instrumentation. Oscilloscopes served as research tools, for example, to record nerve impulses in physiology, but were also utilized to provide quality assurance in industrial
production. The formation of the Hewlett Packard and Varian Corporations, based upon innovations in electronics in the physics department at Stanford University in the 1930s, exemplifies this transitional phase in the relationship between scientific instrumentation and industry (Lenoir 1997). Industrialized science is exemplified today by mass-produced scientific equipment such as the sequencing machines crucial to the human genome project. What is new is the breakdown of the distinction between scientific instrumentation and the means of industrial production. This has occurred through the emergence of technologies such as computers that have broad application to science, business, art, and other techniques (Ellul 1964). Indeed, the computer is a machine with such protean implications that it became the basis of a science itself and contributed to the creation of a new class of sciences of the artificial.
5. The Rise of Science-based Industry
Although it had its precursors in China, with gunpowder, paper making, and the organizational technology of bureaucracy, the science–industry interface has been rationalized in the West and North and transferred to the East and South in the modern era. Karl Marx, the initiator of the theory of science-based industry, was far-seeing, even though he lost confidence and retreated to a labor theory of value. In the late nineteenth century, he had one example on which to base his thesis, the British chemist Perkin’s research on aniline-based dyes that was translated into an industry in Germany. According to a close observer, ‘Science and technology may have converged circa 1900’ (Wise 1983). The rise of the chemical and electrical industries in the late nineteenth century fulfilled some of the promise of science for industrial development. Nevertheless, even in these industries the translation of science into useful products often had an intervening phase based on ‘cut and try’ methods that only later became rationalized, as in the unit method of scaling up chemical production (Servos 1980). Thomas Alva Edison, the inventor of the electric light, was also the inventor of the systematic production of inventions. His ‘idea factory,’ staffed by formally trained scientists and craftspersons, provided a support structure for the application of Edison’s basket of techniques to a series of technical problems whose solution was the basis for the creation of new industries and firms (Israel 1998). One of these companies, the General Electric Corporation (GE), took the connection between science and industry one step further in the US by hiring an academic scientist to organize a research laboratory for the firm. GE’s hiring of Willis Whitney brought the consulting function of academics to industrial problems within the firm and also created a new source of
product development, the corporate R&D laboratory (Wise 1983). A reverse relationship between science and engineering exists when corporations have ‘first mover’ rather than ‘follower’ business strategies. Siemens, for example, developed new business on the basis of advanced research in semiconductors during the early postwar period while its competitor, AEG, waited to see if a sufficient market developed before entering a field. Indeed, Siemens’s confidence in science was so great that it did not employ sufficient engineers to refine and lower the cost of its production (Serchinger 2000). The function of the corporate R&D lab was at least threefold: (a) maintain contact with the academic world and other external sources of information useful to the firm; (b) assist in solving problems in production processes and product development originating within the firm; and (c) originate new products within and even beyond the existing areas of activity of the firm. A few corporate labs, typically in ‘public’ industries such as telecommunications, took on a fourth function as quasi-universities, contributing to the advance of science itself (Nelson 1962). Smaller firms typically focused on the first two functions, close to production, while larger firms more often spanned the entire range of activities (Reich 1987).
6. University–Industry–Government Relations
Models for close public–private cooperation that were invented before World War II have recently been revived and expanded (Owens 1990). During the early postwar period, an institutional division of labor arose in R&D, with industry supporting applied research; government funding basic research in universities and research institutes; and universities providing knowledge and trained persons to industry (Reingold 1987). This ‘virtuous circle’ still accurately describes much of the university–industry–government relationship (Kevles 1977). Socialist countries developed a variant of this format, attempting to create closer connections between research and production through central planning mechanisms. However, by locating the R&D function in so-called branch institutes, socialist practice actually created a distance between research and production that was even greater than in capitalist firms that located their R&D units geographically apart from production while retaining organizational ties. Recently, socialist and, to some extent, large corporate formats for R&D have been decomposed and recombined in new formats. In the US, where these developments have taken their most advanced form to date, several innovations can be identified, including:
(a) Universities extending their functions from research and training into technology development and firm formation through establishment of new organizational mechanisms such as centers, technology transfer offices, and incubator facilities.
(b) The rise of start-up, high-tech firms, whether from universities, large corporate laboratories, or previous failed spin-offs, providing a new dynamic element in the science–industry relationship, both as specialized research organizations and as production units rooted in scientific advance.
(c) A new role for government in encouraging collaboration among various academic and industrial units, large and small, in industrial innovation going beyond the provision of traditional R&D funding, including laws changing ‘the rules of the game’ and programs to encourage collaboration among firms and between universities and firms.
An industrial penumbra appears around scientific institutions such as universities and research laboratories, creating feedback loops, as well as conflicts of interest and commitment between people involved in the two spheres. Over time, as conflicts are resolved, new hybrid forms of science-based industry and research, as well as new roles such as the industrial and entrepreneurial scientist, are institutionalized (Etzkowitz 2001).
7. Conclusion
Economic development is increasingly based on science and technology at the local, regional, national, and multinational levels. The science–industry connection, formerly an ancillary and subsidiary aspect of both science and industry, now moves to the center of the stage of economic development strategy as the university and other knowledge-creating organizations become a source of new industry. Political entities at all of these levels develop policies and programs to enhance the science–industry interface, and especially to encourage high-tech firm formation (Etzkowitz et al. 2000). The ‘endless frontier’ model of ‘knowledge flows,’ transferred to industry through publications and graduates, was insufficient to induce industrial innovation in most cases (Brown 1999). Closer connections were required. A series of organizational innovations to enhance technology transfer took place along a continuum: providing training in entrepreneurship, translation of research findings into intellectual property rights, and encouraging the early stages of firm formation. The potential for participation in knowledge-based economic development revivifies the discredited nineteenth-century notion of progress. During the colonial era, technology transfer typically took place in a format that helped maintain political control, and the higher forms of tacit knowledge were kept secret by employing expatriate engineers. Nevertheless, as in India, a steel industry, created by local entrepreneurs, provided an economic base for a political independence movement. Moreover, a seemingly overexpanded higher educational system, originally put in place to train lower level bureaucrats and technicians, has become an engine of growth for India’s software industry. The synthesis of science and industry is universalized, as both industrial and natural heritages, such as the machine-tool industry in Germany and biodiversity in Brazil, are integrated into new global configurations, often through local firms in strategic alliance with multinationals. Given the decreasing scale of scientific equipment and the increasing availability of higher education, countries and regions in virtually every part of the world can take advantage of opportunities to develop niches. A trend toward worldwide technological innovation has transformed the previous designations of third and second worlds into the ‘emerging world.’ New candidates for science-based economic growth bridge the expected gaps between ‘long waves’ of economic growth. Heretofore, the technology-based long waves identified by Freeman and his co-workers were made possible by a few key technologies; that is, information technology and biotechnology (Freeman and Soete 1997). The potential sources of growth from technologies such as solar photovoltaics, multimedia, and the Internet, and from new materials arising from nanotechnology, are so numerous that there is little or no technical reason, only political, for business cycles with declines as well as rises.
See also: History of Science; History of Technology; Industrial Society/Post-industrial Society: History of the Concept; Industrial Sociology; Innovation: Organizational; National Innovation Systems; Reproductive Rights in Developing Nations; Science, Technology, and the Military; Scientific Revolution: History and Sociology
Bibliography
Brock W 1997 Justus von Liebig: The Chemical Gatekeeper. Cambridge University Press, Cambridge, UK
Brown C G 1999 Patent policy to fine tune commercialization of government sponsored university research. Science and Public Policy December: 403–14
Calvert M 1967 The Mechanical Engineer in America, 1830–1910: Professional Cultures in Conflict. Johns Hopkins University Press, Baltimore
Carnavale A 1991 America and the New Economy. Jossey Bass, San Francisco
Clow A 1960 The industrial background to John Dalton. In: Cardwell D S L (ed.) John Dalton and the Progress of Science. Manchester University Press, Manchester, UK
Davis L 1984 The Corporate Alchemist: Profit Makers and Problem Makers in the Chemical Industry. William Morrow, New York
Dyson F 1999 The Sun, the Genome and the Internet. Oxford University Press, New York
Ellul J 1964 The Technological Society. Knopf, New York
Etzkowitz H 2001 The Second Academic Revolution: MIT and the Rise of Entrepreneurial Science. Gordon and Breach, London
Etzkowitz H, Gulbrandsen M, Levitt J 2000 Public Venture Capital. Harcourt, New York
Freeman C, Soete L 1997 The Economics of Industrial Innovation. MIT Press, Cambridge, MA
Gustin B 1975 The emergence of the German chemical profession. Ph.D. dissertation, University of Chicago
Israel P 1998 Edison: A Life of Invention. Wiley, New York
Jencks C, Riesman D 1968 The Academic Revolution. Doubleday, New York
Johnson J 1990 The Kaiser’s Chemists: Science and Modernization in Imperial Germany. University of North Carolina Press, Chapel Hill, NC
Jones P 1993 Justus Von Liebig, Eben Horsford and the development of the baking powder industry. Ambix 40: 65–74
Kevles D 1977 The NSF and the debate over postwar research policy, 1942–45. ISIS 68
Landes D 1969 The Unbound Prometheus. Cambridge University Press, Cambridge, UK
Lenoir T 1997 Instituting Science: The Cultural Production of Scientific Disciplines. Stanford University Press, Stanford, CA
Lundvall B-A, Borras S 1997 The Globalising Learning Economy. The European Commission, Brussels
Merton R K 1938 Science, Technology and Society in Seventeenth Century England. St. Catherines Press, Bruges
Nelson R 1962 The link between science and invention: The case of the transistor. In: The Rate and Direction of Inventive Activity. Princeton University Press, Princeton, NJ, pp. 549–83
Oleson A, Voss J 1979 The Organization of Knowledge in Modern America. Johns Hopkins University Press, Baltimore
Owens L 1990 MIT and the federal ‘Angel’: Academic R&D and federal–private cooperation before World War II. ISIS 81: 181–213
Reich L 1987 Edison, Coolidge and Langmuir: Evolving approaches to American industrial research. Journal of Economic History 47: 341–51
Reingold N 1987 Vannevar Bush’s new deal for research; or the triumph of the old order. Historical Studies in the Physical and Biological Sciences 17: 299–344
Serchinger R 2000 Wirtschaftswunder in Pretzfeld, Upper Franconia: Interactions between science, technology and corporate strategies in Siemens semiconductor rectifier research & development, 1945–1956. History and Technology 335: 82
Servos J 1980 The industrial relations of science: Chemical engineering at MIT, 1900–1939. ISIS 71: 531–49
Sobel D 1996 Longitude. Penguin, Baltimore
Vaccari E 2000 Mining and knowledge of the Earth in eighteenth century Italy. Annals of Science 57(2): 163–80
Wise G 1983 Ionists in industry: Physical chemistry at General Electric, 1900–1915. ISIS 74: 7–21
H. Etzkowitz
Science and Law
The relationship between law and science has occupied scientists, philosophers, policymakers, and social analysts since the early modern period. Nature, according
to science’s early modern practitioners, was governed by law-like regularities (God’s laws), and the work of science lay in discerning and revealing these laws of nature. In time, however, human law came to be seen as a fallible institution in need of rationalization, to be made more like a science in order to avoid arbitrariness. For early twentieth-century legal reformers, science provided a model of regularity for law to aspire to. At the same time, law and science were viewed by their practitioners as independent institutions, each with its own organization, practices, objectives, and ethos. Similarities and differences between the two fields were noted by many observers, with frequent commentary on conflicts between the law’s desire for justice and science’s commitment to the truth. By the end of the twentieth century, a new preoccupation with the law’s instrumental uses of science emerged, which led to divergent schools of thought about the epistemological status of scientific evidence and the ways in which legal and policy institutions should interact with scientific experts. The growth of the state’s regulatory powers increased governments’ dependence on scientific methods and information, and disputes developed about the extent to which science could reliably answer the questions put to it by legislators, regulators, and litigants. An influx of technically complex disputes caused judicial systems to reassess their handling of scientific and technical evidence. Analysts of these processes grappled with questions about the law’s capacity to render justice under conditions of scientific and social uncertainty, as well as the continued relevance of the lay jury and the role of legal proceedings in the production of scientific knowledge. This article reviews the resulting scholarship on the nature and consequences of the law–science relationship under four major headings.
One concerns the role of these two institutions in shaping the authority structures of modernity, particularly in legitimating the exercise of power in democratic societies. A second relates to the law’s impact on the objectives, status, and professional practices of scientific disciplines relevant to the resolution of social problems. A third focuses on responses by courts and their critics to the challenges posed by experts in the legal system, including shifts in the rules governing expert testimony. Fourth and finally, an emerging line of research looks at the law as a site for generating scientific and technical knowledge and inquires into the implications of collaboration between law and science for individual and social justice.
1. Law and Science in the Transition to Modernity
Since the origins of modern scientific thought, the term ‘law’ has been used to describe both the regularities discernible in nature and the rules laid down by
religious or secular authorities to guide human conduct. Both kinds of laws, it was once popularly assumed, embodied principles that should work always and everywhere in the same fashion, even though Robert Boyle, an early modern founder of experimental science, cautioned against too literal an assimilation of the behavior of matter under natural law to the behavior of human agents under civil law (Shapin 1994, pp. 330–1). As certain facts about human nature and behavior (such as the approximate equality of humans, their vulnerability, and the limits of their resources and altruism) came to be accepted as given, modern legal theorists adopted more synthetic views regarding the relationship between science and law. According to the renowned British scholar of jurisprudence H. L. A. Hart, for example, these natural facts underwrite a ‘natural law’ which human societies require in order to maintain their salient moral characteristics (Hart 1961). The American legal theorist Lon Fuller took a more sociological tack in his account of the ‘morality of law.’ Echoing Robert Merton’s well-known essay on the norms of science, Fuller argued that law and science resemble each other because each institution possesses an internal morality resulting from its distinctive arrangements, practices, and fiduciary demands (Fuller 1964). Later legal scholarship compared the ‘cultures’ of law and science and likewise called attention to their normative and procedural particularities (Goldberg 1994, Schuck 1993), but these institutional characteristics were now seen as a source of conflict, or culture clash. As elite institutions, science and law historically have drawn upon each other’s practices to build their authority. Thus, early modern experimentalists in the natural sciences enlisted the support of witnesses, both real and ‘virtual’ (Shapin and Schaffer 1985), as if they were proving a case in a court of law. 
The Enlightenment of the eighteenth century and the rise of liberal democracies brought science, now prized for its practical utility (Price 1985), into a more openly instrumental partnership with the law. In the USA, Thomas Jefferson drafted the first Patent Act of the fledgling American republic in 1793, thereby implementing the constitutional grant of power to Congress ‘to promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.’ Where Jefferson saw the state as sponsoring science, some modern scholars have seen science’s open, self-regulating, nonhierarchical modes of governance as offering a model or prototype for democracy (Polanyi 1962). Others, however, regard the relationship of scientific and political authority as one of mutual dependency. Democratically constituted governments, they argue, need science and technology in order to answer constant public criticism and make positive demonstrations of their worth to citizens. States, therefore, have harnessed science and technology to instrumental projects in both war and peace.
In these legitimation exercises, the public attests to the state’s technical performance, much as scientists bore witness to each other’s observations in early modern experimental laboratories (Ezrahi 1990). The law’s procedural forms in this way have underwritten the credibility of both modern science and the state. The law’s own dependence on science has grown by distinctive pathways in the USA, as could be expected in a society where public decisions are exceptionally transparent and significant policy change commonly proceeds through litigation. By the beginning of the twentieth century, ideological and political shifts fostered the idea that social policies enacted by the state could and should be responsive to findings in the sciences. An icon of the Progressive era, US Supreme Court Justice Louis D. Brandeis, spurred this development through his historic ‘Brandeis brief’ in Muller v. Oregon (1908). Written to defend the state’s 10-hour working-day restriction on women’s employment, the 100-page brief is widely regarded as the first attempt to bring social science insights into the courts. Brandeis, then a public interest lawyer in Boston, argued successfully that women were sufficiently different from men to justify special legislative protection, thereby distinguishing the Oregon case from the Court’s earlier, infamous decision in Lochner v. New York (1905), which struck down a similar state law curtailing working hours for (male) bakers. Social science evidence received a further boost when, in overturning the ‘separate but equal’ doctrine that was the basis for racial segregation in public schools, the Supreme Court in Brown v. Board of Education (1954) relied on psychological studies of segregation’s ‘detrimental effect upon the colored children.’ In later years, students with disabilities would benefit from similar sorts of evidence in claiming rights to special education (Minow 1990).
Employment discrimination, racial segregation, antitrust, capital punishment, and environmental liability were among the many litigated subjects that brought social science testimony into the US courts in the latter decades of the twentieth century. These cases kept alive the view that objective scientific information could be invoked in legal proceedings to combat various forms of oppression and social injustice. If science was allied with the search for justice in US courts, its role in policymaking was primarily as the voice of reason. Government’s increasing involvement in welfare policy in the New Deal, intensified by the rise of social regulation in the 1970s, created new demands for rational decisions concerning a host of costly and potentially explosive problems, such as race and the environment. As can be observed in the context of environmental regulation (Jasanoff 1990), federal agencies proved less and less able to take refuge in generalized claims of expertise. Much work went into separating science from politics and creating new institutions to deliver putatively independent advice to government, although episodes like the 1995
dissolution of the Office of Technology Assessment, an advisory body to the US Congress, called attention to the fragility of such efforts. New types of scientific expertise also evolved to support governmental decisions with substantial distributive impacts, as discussed in the next section. The relationship between law and science in modern European states was in some ways more covert than in the USA, but nonetheless pervasive. The crises of industrialization in the nineteenth century and the ensuing rise of the welfare state ushered in a demand for credible public action that states could not satisfy without rapid increases in their capacity to order society through newly accredited social science disciplines (Wagner et al. 1991). In turn, the classifications offered by the human sciences in particular supplied the basis for new forms of self-understanding, normalization, and social control. Whether in medicine, psychology, criminology or studies of intelligence, science offered the means for dividing people into categories of normal and pathological, mainstream and deviant. The scientific study of human characteristics thus enabled members of society to see and discipline each other without the need for constant, formal regulatory oversight by the state (Foucault 1990). Science, on this account, not only provided supports for legal authority but actually replaced some of the ordering functions of positive law. European institutions at first resisted the pressures that fed attempts to separate science from politics in US legal contexts. Expert advisory bodies were routinely constituted to include political as well as technical expertise, although judgments about the need for lay or nonprofessional expertise varied from country to country. European courts, for their part, dealt with more focused and on the whole less politically salient disputes than in the USA, so that there were fewer reasons to contest expert judgments. 
By the 1990s, however, a number of developments tried public confidence in European governments’ uses of science, chief among them the growing political influence of the European Union’s technical bureaucracies and the fallout from the UK government’s botched response to the bovine spongiform encephalopathy (BSE) crisis, which cast doubt on the integrity and impartiality of experts. European lawmakers, regulators, and courts began reconsidering the adequacy of their institutional arrangements for science-based decision-making, as the issue of science and governance rose to new prominence on political agendas.
2. Disciplines, Professions, and the Law
The interactivity of law and science has spurred new disciplinary formations and helped to change professional norms and practices, especially in the human sciences, which serve both as allies in regulation and
law enforcement and as uneasy subjects of the law’s skeptical gaze. This dual relationship with the law has prompted novel lines of scientific research and theorizing, accompanied by periodic legal encroachment upon science’s professional autonomy. Developments in the fields of risk analysis and psychology clearly illustrate these patterns. The scientific study of risk is an outgrowth of state efforts to manage the hazards of industrial production and an extension of modern governments’ growing reliance on formal decision-making tools to rationalize difficult political decisions. Uncertainty about technology’s risks and benefits rose through the twentieth century along with the size and complexity of systems of production. From village-based economies, in which injured parties presumed they knew the agents as well as the causes of harms inflicted on them, the global economy pulled consumers into webs of relationships that diffused both the knowledge of risks and the responsibility for regulation and compensation. Epitomizing the plight of industrialization’s innocent victims, millions of asbestos workers in mines, shipyards, and other industries were unknowingly exposed to severe health hazards, and hundreds of thousands died of consequent illnesses. Resulting lawsuits financially crippled asbestos manufacturers without adequately recompensing injured workers. A 1984 accident at a US subsidiary’s chemical plant in Bhopal, India, again produced hundreds of thousands of victims, with massive scientific and legal uncertainties clouding the settlement of claims. As the pervasiveness of industrial hazards became apparent, legislatures sought to ensure that regulation would not wait for proof in the form of ‘body counts’: risk, not harm, became the basis for action under new laws safeguarding health, safety, and the environment, and state agencies struggled to develop credible processes for assessing and mitigating risks before anyone was injured.
The sciences, for their part, responded with attempts to mobilize existing knowledge and methods to provide more useful tools for analyzing risks and crafting preventive strategies. From relatively modest beginnings, rooted largely in the engineering and physical sciences (as in assessing the risks of meltdown in nuclear power plants), risk assessment in support of regulatory standards grew into an important, and increasingly contested, branch of the social sciences. Risk studies acquired the indicia of professionalization, with dedicated journals, societies, curricula, and centers at major universities. One focus of research and development in this field, particularly in the USA, was the construction of reliable models for measuring risk, from the use of animal species as surrogates for humans to sophisticated methods of representing uncertainty in risk estimates (National Research Council 1994). Risk assessment presented not only methodological difficulties for scientists and engineers, but also institutional challenges for governments needing to persuade industry
and the public that their decisions were scientifically sound and politically balanced (Jasanoff 1990). Lawsuits questioning the risk assessment practices of federal agencies were not uncommon and led to several significant judicial decisions, including Supreme Court rulings on the validity of the occupational standard for benzene in 1980 and the air quality standard for ozone and particulates in 2001. Other researchers meanwhile approached risk from the standpoint of social psychology, asking why lay observers frequently diverged from experts in their judgments concerning the relative significance of different types of risk. This question engendered a sizeable body of experimental and survey work on risk perception (Slovic 2000), as well as theoretical critiques of such work, showing problems in both the framing and characterization of lay attitudes (Irwin and Wynne 1994). While some social scientists occupied themselves with identifying and measuring ‘real’ risks, or perceptions of them, others treated risk itself as a new organizing principle in social life, noting that vulnerability to risk did not map neatly onto earlier divisions of society according to race, class or economic status (Beck 1992). By contrast, the US environmental justice movement generated data suggesting that social inequality still mattered in the distribution of industrial risks, which fell disproportionately on poor and minority populations (Bullard 1990). This strategic, though controversial, research led in 1994 to a presidential executive order demanding that US federal agencies should consider the equity implications of their actions; equity analysis emerged in this way as an offshoot of existing methodologies for assessing risk and offered formal justification for legal claims of environmental injustice. 
Unlike risk analysis, which is a product of twentieth-century concerns, law and the mental health disciplines have existed in a relationship of close reciprocity at least since 1843, when the British House of Lords laid down the so-called M’Naghten rule for determining whether a criminal defendant is entitled to relief on grounds of insanity. The rule holds that a person may be released from criminal responsibility for actions taken while he or she was unable to distinguish between right and wrong as a result of mental disease or defect. More recently, the human sciences’ ability to sort people’s behavior and capacities into seemingly objective, ‘natural’ categories has served the needs of both democratic and totalitarian states in a growing variety of legal settings. The need to determine people’s mental states in civil as well as criminal proceedings was an important driver of the legal system’s enrolment of professionals from these fields. In US law, for example, psychiatric evidence was used to determine whether a criminal accused could stand trial, whether conditions for sentencing leniency or parole were met, and whether a convicted criminal posed a sufficient threat of long-term dangerousness to
deserve the death penalty. While these intersections provided new professional opportunities for psychiatry and psychology, spawning a thriving pedagogic specialty of law and mental health (Gutheil and Appelbaum 2000), entanglements with the law also disclosed frequent divisions among experts and undermined disciplinary authority in other ways. The most visible challenge to professional autonomy occurred in the 1974 US case of Tarasoff v. Regents of the University of California (redecided in 1976), in which the California Supreme Court ruled that psychiatrists were required to warn (later changed to a duty to protect) third parties threatened by their patients. Although the decision did not open the floodgates to liability as many therapists had feared, Tarasoff was widely seen as influencing therapeutic practice, often in counterproductive ways. In another notable decision affecting the mental health field, the US Supreme Court in Barefoot v. Estelle (1983) essentially ignored an amicus brief by the American Psychiatric Association stating that psychiatric predictions of dangerousness are wrong in as many as two out of three cases. The Court expressed confidence in juries’ capacity to evaluate expert evidence even if it is weak or deficient. By the end of the twentieth century, the alliance between law and the human sciences, joined eventually by the nascent science of genomics, helped to define a host of novel disorders and syndromes that were invoked to extenuate or penalize various types of conduct. Post-traumatic stress disorder (PTSD), for example, along with its specific manifestations such as rape trauma syndrome, entered the DSM-IV, the official diagnostic handbook for psychologists.
So codified, PTSD became not only an identifiable and presumably treatable medical condition, but also a basis for asserting new legal claims (as in lawsuits for stress-related psychological injury in the workplace) or defending oneself against them (as in murder trials of sexually abused defendants). Possibly the most notorious example of synergy between law and psychology occurred in a rash of so-called recovered memory and child abuse cases, which became especially prevalent in the US in the 1980s (Hacking 1995). Initially gaining credibility from psychiatric experts, who testified to the reliability of recovered childhood memories, these cases caused scandal as witnesses recanted and it became apparent that some experts had more likely instilled than elicited the shocking ‘memories.’ The profession eventually organized to deny the scientific validity of the claims and repudiate the expertise of the claimants (Loftus and Ketcham 1994). Research on eyewitness identification, another branch of research dealing with memory and recall, also received its chief impetus from the legal process. Prompted by insupportably high error rates on the part of eyewitnesses, psychologists tested people’s ability to recognize persons they had encountered in
stressful situations and discovered that many factors, such as race, inhibit accurate identification under these circumstances (Wells 1988). These findings began to have an impact on police procedure by the late 1990s as awareness dawned that relatively simple changes, such as sequential rather than simultaneous presentation of suspects in line-ups, could substantially improve the reliability of eyewitness testimony.
3. Scientific Evidence and Expert Witnesses
Whereas civil law systems relegated the production of scientific and technical evidence largely to national forensics labs and court-appointed experts, common law proceedings traditionally left it to the parties themselves to produce evidentiary support for their claims. The adversary process, with its right of cross-examination, was assumed to be equal to the task of testing the evidence, no matter how arcane or technical, thereby permitting the fact-finder (the judge or jury) to ascertain the truth. Although commentators sometimes deplored the practice of treating experts as hired guns, the parties’ basic right to present their evidence was not in doubt, and courts rarely used their legally recognized power to appoint independent experts who might offer a more disinterested view (Jasanoff 1995). By the turn of the century, several factors combined to challenge this hands-off attitude. The sheer volume of cases demanding some degree of technical analysis was one contributing cause, especially in the US, where low entry barriers to litigation and the inadequacy of social safety nets routinely brought into the courts controversies that were resolved by other means in most common law jurisdictions. Highly publicized instances of expertise gone awry, as in the recovered memory cases, shook people’s confidence in the power of cross-examination to keep out charlatans presenting themselves as scientists. The economic consequences of successful products liability lawsuits strained the system from yet another direction, especially in a growing number of mass tort actions that threatened industries with bankruptcy. It was tempting to blame many of these developments on the perceived weaknesses of the adversary system: the passivity and low scientific acumen of judges, the technical illiteracy of juries, the gamesmanship of lawyers, the bias or incompetence of experts selected by parties.
These deficits led to calls for reforming the process by which expert testimony was admitted into the courtroom. The initial round of critique marked a turn away from the liberal ideals of the 1970s and proved to be politically powerful, although it was neither methodologically rigorous nor of lasting scholarly value. An early broadside accused courts of deciding cases on the basis of ‘junk science,’ a rhetorically useful concept that took hold with opponents of the burgeoning tort
system but proved difficult to characterize systematically (Huber 1991). The barely concealed program of this and related writings was to remove as much as possible of the testing of scientific evidence from the adversary system, particularly from the jury’s purview, and to decide these issues either in judicially managed pretrial proceedings or with the aid of scientists appointed by the courts. Undergirding this literature was an almost unshakable faith in science’s self-corrective ability and a technocratic conviction that truth would eventually win out if only the legal system left scientific fact-finding to scientists (Foster and Huber 1997). These attacks also tacitly presumed that mainstream expert opinion could be identified on any issue relevant to the resolution of legal disputes. These positions would later be thrown into doubt, but not until the spate of polemical work had left its mark on American scientific and social thought. Studies by practicing scientists, physicians and some legal academics reiterated and amplified the theme of the law’s misuse of science. Critics cited as evidence a number of tort cases in which multimillion dollar jury verdicts favorable to plaintiffs conflicted with the opinions of respected scientists who denied any causal connection between the alleged harmful agent, such as a drug or workplace toxin, and the harm suffered. Particularly noteworthy was the litigation involving Bendectin, a medication prescribed to pregnant women for morning sickness and later suspected of causing birth defects in their children. Juries frequently awarded damages despite assertions by epidemiologists that there was no statistically significant evidence linking Bendectin to the claimed injuries.
Research on the role of experts in these cases showed suggestive differences in behavior (such as higher rates of repeat witnessing) between plaintiffs’ and defendants’ experts (Sanders 1998), leading some to question whether judges were adequately screening offers of scientific testimony. Another episode that drew considerable critical commentary was litigation involving silicone gel breast implants. Close to a half-million women surgically implanted with these devices sued the manufacturer, Dow Corning, claiming injuries that ranged from minor discomfort to permanent immune system damage. The company’s initial settlement offer broke down under the onslaught of lawsuits, but epidemiological studies, undertaken only after the commencement of legal action, indicated no causal connection between the implants and immune system disorders. Publication of these results in the prominent New England Journal of Medicine led its executive editor to join the chorus of accusation against the legal system’s apparent misuse of science (Angell 1996). Rising discontent about the quality of scientific evidence led the US Supreme Court to address the issue for the first time in 1993. In Daubert v. Merrell Dow Pharmaceuticals, the Court set aside the so-called Frye rule that had governed the admissibility of expert testimony for the preceding 70 years. Instead of simply
demanding that scientific evidence should be ‘generally accepted’ within the relevant peer community, the Court urged judges to screen science in pretrial proceedings, in accordance with standards that scientists themselves would use. The Court offered four sample criteria: (a) was the evidence based on a testable theory or technique, and had it been tested; (b) had it been peer reviewed; (c) did it have a known error rate; and (d) was the underlying science generally accepted? Two more major evidence decisions of the 1990s cemented the high court’s message that trial judges should play a vastly more proactive gate-keeping role when confronted by scientific and technical evidence. As judicial involvement on this front increased, new complaints appeared on the horizon: that judges were using the Daubert criteria as an inappropriately inflexible checklist rather than as the guidelines they were meant to be; that mechanical application of Daubert and its progeny was trumping deserving claims; that judges were usurping the jury’s constitutional role; and that misinterpretation of Daubert was introducing unscientific biases and impermissibly raising the burden of proof in civil cases. In most civil law jurisdictions, by contrast, the inquisitorial approach to testing evidence, coupled with the state’s near monopoly in generating forensic science, precluded much controversy about the legitimacy of experts or the quality of their testimony. Tremors could be detected, however, arising from such episodes as the discovery of substandard or fabricated evidence in British trials of Irish terrorists and child abuse suspects and the French government’s cover-up of the contamination of the blood supply with the HIV-AIDS virus. 
Ironically, as US courts moved to consolidate their gate-keeping function in the 1990s, constricting the entry routes for experts, a countermove could be discerned in some European legal systems to open the courts, and policy processes more broadly, to a wider diversity of expert opinion (Van Kampen 1998).
4. Sites of Knowledge-making
Investigation of the law-science relationship took a considerably more sophisticated turn in the 1990s as it attracted the attention of a new generation of researchers in science and technology studies. From this disciplinary vantage point, the use–misuse framing that dominated popular writing appeared superficial in the extreme when set against the observed richness of interactions between the two institutions. Rather, as this body of work began to document, it was the hybridization of law and science that held greatest interest and significance for science and society. At this active frontier of social problem solving, one could observe in fine-grained detail how important normative and epistemological commitments either reinforced or contradicted each other in contemporary
societies. For example, even within the common law’s adversarial environment, the ritualistic character of legal proceedings successfully held the line against certain forms of radical skepticism, leaving untouched traditional hierarchies of expert authority (Wynne 1982), a belief in the ultimate accessibility of the truth (Smith and Wynne 1989), and an abiding faith in technological progress (Jasanoff 1995). At the same time, the legal process increasingly came to be recognized as a distinctive site for the production of otherwise unavailable knowledge and expertise. Even in cases where litigation arguably produced anomalous results (such as Bendectin and breast implants), resort to the courts demonstrably added to society’s aggregate store of information. Other cases were unambiguously progressive. A particularly striking example was the importation of molecular biological techniques into the arenas of criminal identification and paternity testing in the form of ‘DNA typing.’ Not only did the legal system’s needs provide the initial context for the technique’s development, but subsequent contestation over the reliability of DNA-based identification had impacts on the field of population genetics and spurred the formation of new testing methods and agencies, as well as the standardization of commonly used tests (Lynch and Jasanoff 1998). As the technique became black-boxed, new market niches for it emerged within the legal process, most spectacularly as a means of exonerating mistakenly convicted persons. The success of DNA as a forensic tool thus helped to destabilize the legal system’s longstanding commitment to eyewitness evidence, and even its reliance on the seemingly unassailable authority of fingerprints (Cole 2001), thereby facilitating critiques of a possibly antiquated privileging of visual memory (Wells 1988).
But the massive uptake of forensic DNA analysis by the criminal justice system also opened up new areas of social concern, such as privacy and database protection, which called for continued vigilance and ingenuity on the part of the legal system. The rise of DNA typing, like that of risk analysis and post-traumatic stress disorder, captured the dynamics of an epoch in which the establishment of any form of social order became virtually unthinkable without the mutual engagement of law and science. In such a time, commentary deploring the law’s alleged misuse of science appeared increasingly reductionist and out of touch with everyday reality. The emergence of science and law as a special topic within science and technology studies responded to the inadequacies of earlier analyses and showed greater intellectual promise. A major contribution of this work was to abandon once and for all the pared-down ‘clashing cultures’ model of the law–science relationship and to put in its place a more nuanced picture of the positive interplay between normative and cognitive authority. By exploring in detail how the law not only ‘uses’ science, but also questions it and often compensates for its deficiencies, the new scholarship on science and law invited reflection on the intricate balancing of truth and justice in technologically advanced societies.

See also: Biotechnology; Disciplines, History of, in the Social Sciences; Expert Testimony; Expert Witness and the Legal System: Psychological Aspects; Intellectual Property, Concepts of; Intellectual Property: Legal Aspects; Law: History of its Relation to the Social Sciences; Legal Process and Social Science: United States; Norms in Science; Power in Society; Professions, Sociology of; Science and Technology Studies: Experts and Expertise; Science and the State; Science, Economics of; Truth and Credibility: Science and the Social Study of Science
Bibliography
Angell M 1996 Science on Trial: The Clash of Medical Evidence and the Law in the Breast Implant Case. Norton, New York
Beck U 1992 Risk Society: Towards a New Modernity. Sage, London
Bullard R D 1990 Dumping in Dixie: Race, Class, and Environmental Quality. Westview, Boulder, CO
Cole S 2001 Manufacturing Identity: A History of Criminal Identification Techniques from Photography Through Fingerprinting. Harvard University Press, Cambridge, MA
Ezrahi Y 1990 The Descent of Icarus: Science and the Transformation of Contemporary Democracy. Harvard University Press, Cambridge, MA
Foster K R, Huber P W 1997 Judging Science: Scientific Knowledge and the Federal Courts. MIT Press, Cambridge, MA
Foucault M 1990 The History of Sexuality. Vintage, New York
Fuller L 1964 The Morality of Law. Yale University Press, New Haven, CT
Goldberg S 1994 Culture Clash. New York University Press, New York
Gutheil T G, Appelbaum P S 2000 Clinical Handbook of Psychiatry and the Law, 3rd edn. Lippincott Williams & Wilkins, Philadelphia, PA
Hacking I 1995 Rewriting the Soul. Princeton University Press, Princeton, NJ
Hart H L A 1961 The Concept of Law. Oxford University Press, Oxford, UK
Huber P 1991 Galileo’s Revenge: Junk Science in the Courtroom. Basic Books, New York
Irwin A, Wynne B (eds.) 1994 Misunderstanding Science. Cambridge University Press, Cambridge, UK
Jasanoff S 1990 The Fifth Branch: Science Advisers as Policymakers. Harvard University Press, Cambridge, MA
Jasanoff S 1995 Science at the Bar: Law, Science and Technology in America. Harvard University Press, Cambridge, MA
Loftus E F, Ketcham K 1994 The Myth of Repressed Memory: False Memories and Allegations of Sexual Abuse. St. Martin’s Press, New York
Lynch M, Jasanoff S (eds.) 1998 Contested identities: Science, law and forensic practice. Social Studies of Science 28: 5–6
Minow M 1990 Making all the Difference: Inclusion, Exclusion, and American Law. Cornell University Press, Ithaca, NY
National Research Council (NRC) 1994 Science and Judgment in Risk Assessment. National Academy Press, Washington, DC
Polanyi M 1962 The republic of science. Minerva 1: 54–73
Price D K 1985 America’s Unwritten Constitution: Science, Religion, and Political Responsibility. Harvard University Press, Cambridge, MA
Sanders J 1998 Bendectin on Trial: A Study of Mass Tort Litigation. University of Michigan Press, Ann Arbor, MI
Schuck P 1993 Multi-culturalism redux: Science, law, and politics. Yale Law and Policy Review 11: 1–46
Shapin S 1994 A Social History of Truth. University of Chicago Press, Chicago
Shapin S, Schaffer S 1985 Leviathan and the Air-pump: Hobbes, Boyle, and the Experimental Life. Princeton University Press, Princeton, NJ
Slovic P 2000 The Perception of Risk. Earthscan, London
Smith R, Wynne B (eds.) 1989 Expert Evidence: Interpreting Science in the Law. Routledge, London
Van Kampen P T C 1998 Expert Evidence Compared: Rules and Practices in the Dutch and American Criminal Justice System. Intersentia, Antwerp
Wagner P, Wittrock B, Whitley R (eds.) 1991 Discourses on Society: The Shaping of the Social Science Disciplines. Kluwer, Dordrecht, The Netherlands
Wells G L 1988 Eyewitness Identification: A System Handbook. Carswell, Toronto
Wynne B 1982 Rationality and Ritual: The Windscale Inquiry and Nuclear Decisions in Britain. British Society for the History of Science, Chalfont St. Giles, UK
S. Jasanoff
Science and Religion
Understanding the present relationship between science and religion requires a recognition of broad historical trends. The subject has been most commonly treated in depth by historians of science, though scholars in both pure science and theology have also been active. Several posts in the UK, including a professorship at Oxford, and at least three major journals are devoted to the subject of science and religion (Zygon, Perspectives on Science and Christian Faith, and Science and Christian Belief), and numerous organizations have been formed to promote its study both in the US and UK. Scholarship has become increasingly sophisticated, and frequent attacks on earlier ‘simplistic’ views have had the effect of excluding from the debate a great many ordinary people. This is quite unnecessary, though it is true that both ‘science’ and ‘religion’ are not eternally unchanging entities and do vary over time and space. In what follows, science is simply the study and systematic observation of the natural world, while religion may be seen generally as humanity’s quest after God, a quest taking a variety of different
forms. The subject is, of course, an emotive issue for some, from vociferous antagonists of religion (like the geneticist Richard Dawkins) to equally uncompromising supporters (such as the American ‘creationists’). In the following account the main emphasis will be on Christianity, not for partisan reasons but because, historically, that religion more than any other has related closely to the emerging sciences.
1. The Social Relations of Science and Christianity
This topic has become the subject of a number of well-known statements or theses, five of which will now be considered. Some are about specific periods in history (such as the seventeenth century) while others relate to allegedly more timeless issues. Some bear the well-known names of their chief proponents.
1.1 The Merton Thesis
In 1938, the American sociologist Robert Merton argued that the emergence of science in the seventeenth century was promoted by the ascendancy of Protestant religion. Since then this contentious position has been revisited again and again, and it is hard to resist the conclusion that, in modified form, the thesis has more than an element of truth. Certainly, there is a correlation between visibility in science and religious allegiance. In the French Académie des Sciences from 1666 to 1885, the ratio of Protestants to Catholics was 80:18, despite the preponderance of Catholics in the country. Others have shown a similarly high proportion of Puritans (not merely Protestants) in the membership of the early Royal Society in England, and Merton argued that Puritan attitudes did much to encourage its growth. There have been objections to such ‘counting of heads,’ not least because of the difficulty of defining a Puritan. If such a person is considered theologically rather than politically some of the difficulties disappear. On this basis a Puritan is one who holds strongly to the teaching of the Bible as opposed to the church or tradition, not necessarily a supporter of the Parliamentary cause in revolutionary England. Yet it may still be argued that such correlations do not prove that Puritan theology encouraged science, for may not both have emerged from a common cause? Could not both have been an expression of new movements of social and economic change and of a libertarian philosophy? This may well be true, but a general correlation in the 1600s between promotion of science and a generally Protestant loyalty to the Bible seems inescapable.
1.2 The Hooykaas Thesis
This goes further than the Merton thesis might suggest, and argues that the origins of modern science do not merely correlate neatly with Protestant beliefs but are directly derived from them. It has been implied by many authors since the 1970s but is particularly associated with the Dutch historian of science Reijer Hooykaas (1906–94) in his epochal Religion and the Rise of Modern Science. To make such an assertion is to stand traditional views on their head. It was once customary to see science as a truly Greek inheritance, at last freed from religious shackles at the Renaissance. Hooykaas invites us to view it rather as a product of Biblical theology, freshly released at the Reformation, and subverting much of Greek philosophy which had for 1500 years inhibited any real rise of experimental science. Serious evidence exists in support of this thesis. There are explicit declarations by many well-known scientific figures from Francis Bacon to Isaac Newton and beyond that their science was inspired theologically. Then there is a remarkable congruity between Biblical and scientific credos. At least five points of intersection can be identified. (a) Most profound of all, perhaps, was what Hooykaas called the ‘demythologization of nature,’ so that nature was no longer to be seen as divine or even animate, but more like a machine than anything else. It is not God but a creation by him, and therefore a proper object of study and experiment. Such was the theology of the Old and New Testaments, and it was eloquently proclaimed by statements from ‘the father of chemistry’ Robert Boyle and many others. (b) Moreover, the idea that nature worked by laws is at the very foundation of science, and is also Biblical. Amongst those who wrote of laws impressed by God on nature were Descartes, Boyle, and Newton. Men looked for laws when they recognized the law-giver. (c) But science can only discover these laws by experimentation.
Manipulation of nature had been regarded by many Greeks as socially unacceptable (except for slaves) or even impious. Among the most urgent advocates of an experimental method, based on widespread Biblical approval of manual techniques for testing, was Francis Bacon who urged ‘men to sell their books, and to build furnaces.’ (d) Then again a further religious impetus to science came from the Biblical exhortations to see the heavens and earth as manifesting the glory of their Creator. The astronomer Kepler, it is said, asserted that in his researches he ‘was thinking God’s thoughts after him.’ This religious motivation became a strong emphasis in Calvinistic theology. (e) Finally, the Biblical mandate for humanity to exert ‘dominion’ over nature opened up possibilities of scientific work ‘for the glory of God and the relief of man’s estate’ (Bacon). As Hooykaas wrote: The Biblical conception of nature liberated man from the naturalistic bonds of Greek religiosity and philosophy and
gave a religious sanction to the development of technology [Hooykaas, Religion and the Rise of Modern Science, 1973, p. 67].
One cannot dismiss the burgeoning literature linking science to Protestantism in the seventeenth century as mere rhetoric. But it can be argued that Hooykaas underestimates the numbers of Roman Catholics who were eminent in science, though even here caution is needed. The oft-quoted counter example of the Catholic Copernicus must not obscure the astronomer’s indebtedness to both his Lutheran assistant Rheticus, and the liberal legacies of Erasmus within his own church. Attitudes towards nature varied widely within Catholicism, and there is always the danger of generalising too widely. But in its essence the Hooykaas thesis appears to be substantially correct.
1.3 The Lynn White Thesis
If Christian theology has been one of the formative influences on modern science it does not follow that this has always been in the best interests of the world and its inhabitants. That opinion has been forcibly expressed in a further thesis, first proposed by the American historian Lynn White in 1966/7. He argued that much of the damage to our environment springs from a misuse of science and technology for which ‘Christianity bears a huge burden of guilt.’ More specifically he locates the problem in the ‘realisation of the Christian dogma of man’s transcendence of, and rightful mastery over, nature’ (White 1967). He urges a return not to the primitive Christianity of the New Testament but to the animistic world of St Francis of Assisi which saw the earth and its inhabitants as brothers rather than instruments. In White’s view the Biblical call to ‘dominion’ must have been understood in terms of exploitation. However, careful historical examination of the evidence suggests that few, if any, pioneers in science or theology held this particular view. John Calvin explicitly repudiated it, as did the noted eighteenth century writer William Derham. They, and many others, urged an interpretation of dominion as ‘responsible stewardship.’ In more recent times ideologically driven human conquest of nature has been on the Marxist rather than the Christian agendas. It is now widely acknowledged that, as a historical generalization, the Lynn White thesis stands largely discredited.
1.4 The Conflict Thesis
This thesis is much older than the others we have examined, and much better known. The thesis states that science and religion have been in perpetual conflict and that, eventually, science will vanquish
religion. It was most notoriously promulgated by the Victorian naturalist T. H. Huxley, but has been advocated by many others in the late nineteenth century and is probably best not attributed to any one individual. Two books arguing the point are History of the Conflict between Religion and Science by J. W. Draper (first published in 1875), and A History of the Warfare of Science with Theology in Christendom by A. D. White (1895). They achieved enormous circulation and are still in print. The essence of the argument is that where scientific conclusions have been challenged by the church there has usually been an eventual retreat by the ecclesiastical authorities. Two classic cases were those of Galileo in the seventeenth century and of Darwin nearly 250 years later. It was in the aftermath of the latter imbroglio that the conflict thesis was formulated. Many more examples were unearthed by Draper and White and an impressive case assembled (especially by the latter). Manifestly, it is hard to reconcile this thesis with the three previously considered. If science owed so much to religion how could they possibly be in conflict? The thesis flies in the face of massive evidence that points to a powerful alliance between science and Christianity since at least the seventeenth century. A noteworthy feature of the conflict thesis is that it was first proposed at a time when history of science hardly existed and historical scholarship had certainly not explored the nuances perceived by Merton, Hooykaas, and their colleagues. Detailed examination of the relatively minor cases urged by Draper and White reveals that many were badly documented, some plainly apocryphal and others greatly exaggerated. It seems that the authors sometimes saw what they were looking for and built their generalizations on a slender foundation.
To be sure, Galileo and Darwin were assailed by organized religion but they seem to have been relatively isolated exceptions to the general rule that Christianity and science coexisted in harmony. Partial explanations for the persecution of Galileo lay in power struggles within the church, while Darwin’s problems arose in part from a deeply divided society in industrial Britain. Today the books of Draper and White are rated not as serious works of historical scholarship but as highly polemical tracts reflecting the tensions existing in the social and cultural environment in which the authors lived. Draper wrote as a man deeply disenchanted with the Roman Catholic church, not least for its recent declaration of papal infallibility. White, on the other hand, was President of one of the first explicitly nonsectarian colleges in the US (Cornell) and had suffered badly at the hands of the religious establishment. Each had his own axe to grind in taunting the organized church. The result was not history but myth. Yet a conflict thesis is commonly held in the world today, so one needs to ask how such a tendentious
myth could have become so entrenched in Western culture. To understand the reason it is necessary to recall the plight of English science in the last 40 years of Victoria’s reign. Underfunded by government, inadequately supported by secondary education, unpopular with the general public for a variety of reasons and a poor relation in those citadels of the establishment, the universities of Oxford and Cambridge, English science was crying out for recognition and support. It was falling seriously behind in the race with Continental science (especially German) and had no chance of occupying a leading place in English culture. In search of a weapon with which to acquire some kind of hegemony Thomas Henry Huxley and some close allies determined on a course of action. They would seek by all means within their power to undermine the privileged position of the Anglican church and put science in its place. A detailed strategy was worked out, an important element being the propagation of a conflict thesis in which the church was always portrayed as the vanquished party in its perennial fights with science. The books of Draper and White were at their service, and so was evolution which, as Huxley said, was a good stick with which to beat the church. With all the discipline of a military campaign the attack was launched and the conflict thesis enshrined as a self-evident truth in English culture. It was demonstrably wrong but it carried the day, at least with the general public (see Russell 1989).
1.5 The Complementarity Thesis
In the 1960s and 1970s the scientist/philosopher Donald MacKay made much of the paradigm of complementarity. Derived from modern physics, where, for instance, wave and particle descriptions of light are complementary, the notion was applied to explanations from science and theology. It was the antithesis of a reductionist view where objects or events were nothing but phenomena of one kind (usually materialist). This view did not preclude traffic from one area to another, and did not deny their mutual influence. Indeed MacKay was committed strongly to the Hooykaas thesis. But it did mean that science and theology could co-exist as complementary enterprises. We may call this the complementarity thesis. Its merits include the ability to account for several phenomena of modern and recent science. One of these is the large number of scientists who hold to Christian beliefs. In Huxley’s day they included many of the most eminent physical scientists, such as Faraday, Joule, Stokes, and Lord Kelvin. Statistical surveys undertaken since that day have confirmed the same general trend, which is quite the opposite of common expectation. In the early years of the twenty-first century the numbers appear to be increasing. The only reasonable explanation is that such people regard their
scientific and religious activities as complementary and nonthreatening. Membership data and corporate activities of organisations like the American Scientific Affiliation and Christians in Science appear to bear this out. In modern times an interesting tendency strengthens still further the case for a complementary approach. That is the rebirth of natural theology, in which the natural universe is seen to testify to the power and goodness of God. Such arguments go back to Biblical times, but were advocated with fresh vigour in the late eighteenth and early nineteenth centuries, reaching their peak with Natural Theology by William Paley, first appearing in 1802. The argument for a Designer from the apparent design of the natural world received hard blows from the philosophical critique of Hume and the evolutionary theories of Darwin. The proponents of the conflict thesis added their opposition. Yet over 100 years later the disclosures of modern science have caused some cynics to have second thoughts, and have encouraged scientist/theologians like John Polkinghorne to revive a form of natural theology that does justice to a natural world infinitely more complex than that perceived by Paley or even Darwin.
2. Beyond Traditional Christianity
It remains to note that orthodox Christianity has no monopoly in engagement with science, though historically it happens to have been the most prominent faith to do so. Much literature on science and religion has focused, therefore, on the Christian faith. A modern variant of Christianity is sufficiently different to justify the term heterodox; that is the so-called process theology that denies omniscience to God and regards natural processes as contributing to his self-fulfillment. It has been extensively used in discussions about indeterminacy in nature, about human free will and about the phenomenon of evil. Though favoured by some philosophers and theologians it cuts little ice with the majority of scientists. The other great monotheistic religions of Judaism and, to a smaller extent, Islam have encouraged their followers in the study of nature though, operating in very different cultures and with different presuppositions, their effects have been rather unlike those of Christianity. The eastern mystical religions have arguably been less important for the growth of science, though some parallels between atomic physics and Buddhism have been emphasized by Fritjof Capra. The early promise of Chinese science was not fulfilled because the concept of scientific law was hardly present, reflecting, according to the historian Joseph Needham, the absence of any monotheistic law-giver in ancient Chinese culture. Finally, the growth of environmental awareness has led some to espouse an almost pantheistic view of the universe as an alternative to the view of it as a machine which can be used or abused at whim. They have been encouraged by the Gaia hypothesis of James Lovelock which emphasizes the complexity of earth and its biosphere and its capacity for self-regulation. This has led several (including Lovelock) to posit the earth as in some sense ‘alive.’ From this it has been but a step to invoke the earth (Gaia) as a mother and even a goddess. If taken to extremes this view may lead to a greater sensitivity to the needs for sustainable development and other desirable attitudes to the earth. But it may also lead to a reversion to a prescientific animistic or even divine universe in which the benefits of a thriving science are rejected as minimal. If, however, the values of Christian monotheism are integrated with these views it may be possible for science to prosper, but always in a spirit of responsible stewardship. At present, we seem a long way from either of these positions.
See also: Creationism, Evolutionism, and Antievolutionism; Religion and Politics: United States; Religion and Science; Religion: Evolution and Development; Religion, History of; Religion, Psychology of; Religion, Sociology of
Bibliography
Barbour I 1990 Religion in an Age of Science. Harper, San Francisco
Brooke J H 1991 Science and Religion: Some Historical Perspectives. Cambridge University Press, Cambridge, UK
Finocchiaro M A 1989 The Galileo Affair: A Documentary History. University of California Press, Berkeley, CA
Harrison P 1998 The Bible, Protestantism and the Rise of Natural Science. Cambridge University Press, Cambridge, UK
Hooykaas R 1972 Religion and the Rise of Modern Science. Scottish Academic Press, Edinburgh, UK
Jeeves M A, Berry R J 1998 Science, Life and Christian Belief. Apollos, Leicester, UK
Kaiser C 1991 Creation and the History of Science. Marshall Pickering, London
Lindberg D C, Numbers R (eds.) 1998 God and Nature: Historical Essays on the Encounter Between Christianity and Science. University of California Press, Berkeley, CA
Livingstone D N, Hart D G, Noll M A (eds.) Evangelicals and Science in Historical Perspective. Oxford University Press, New York
Moore J M 1979 The Post-Darwinian Controversies: A Study of the Protestant Struggle to Come to Terms with Darwin in Great Britain and America 1870–1900. Cambridge University Press, Cambridge, UK
Nebelsick H P 1992 The Renaissance, the Reformation and the Rise of Science. T. & T. Clark, Edinburgh, UK
Polkinghorne J 1990 One World: The Interaction of Science and Theology. SPCK, London
Polkinghorne J 1991 Reason and Reality: The Relationship Between Science and Theology. SPCK, London
Russell C A 1985 Cross-currents: Interactions Between Science and Faith. Inter-Varsity Press, Leicester, UK (corrected and reprinted, Christian Impact, London, 1996)
Russell C A 1989 The conflict metaphor and its social origins. Science and Christian Belief 1: 3–26
Russell C A 1994 The Earth, Humanity and God. University College Press, London
White L 1967 Science 155: 1203
C. A. Russell
Science and Social Movements
The social movements that are discussed in this article refer to major political processes of popular mobilization that have broad societal significance, and include organized forms of protest as well as broader and more diffuse manifestations of public opinion and debate. In relation to science, social movements conceived in this way can be seen to have served as seedbeds for new modes of practicing science and organizing knowledge more generally, and also as sites for critically challenging and reconstituting the established forms of scientific activity.
1. Introduction
Since they fall between different academic specializations and disciplinary concerns, the relations between science and social movements have tended to be a neglected subject in the social sciences. For students of social movements in sociology and political science, the relations to science have been generally of marginal interest and received little systematic attention (Dellaporta and Diani 1999), while in science and technology studies, social movements usually have been regarded more as contextual background than as topics of investigation in their own right. Considering their potential significance as ‘missing links,’ both in the social production of knowledge and in broader processes of political and social change, however, the relations between science and social movements deserve closer scrutiny. For one thing, new ideas about nature and society have often first emerged outside the world of formal scientific activity, within broader social and political movements. Social movements have also often provided audiences, or new publics, for the spreading and popularization of scientific findings and results. Movements of religious dissent were among the most significant disseminators of the new experimental philosophy in the mid-seventeenth century, for example, while, in the nineteenth and twentieth centuries, new approaches to medicine, social relations, gender roles, and the environment found receptive audiences within social and political movements. Most visibly, social movements and their spokespersons have offered critical perspectives on both the form and content of science, as well as on the uses to which science has been put. In recent decades, a range of so-called new social movements have criticized the dominant biases and perspectives of many scientific and technological fields. This article first presents these different forms of interaction between science and social movements in general, historical terms. It then discusses contemporary relations between science and social movements, and, in particular, the environmental movement.
2. Historical Perspectives
A fundamental insight of the sociology of knowledge, as it developed in the 1920s and 1930s, was to suggest that modern science emerged in the seventeenth century out of a more all-encompassing struggle for political freedom and religious reform (Scheler 1980/1924; Merton 1970/1938). What eventually came to be characterized as modern science represented an institutionalized form of knowledge production that, at the outset, had been inspired by more sweeping social and political transformations. The historical project of modernity did not begin as a new scientific method, or a new mechanical worldview, or, for that matter, as a new kind of state support for experimental philosophy in the form of scientific academies. As for the Reformation, it arose as a more deep-seated challenge to traditional ways of thought in social and religious life. It was a protest against the Church (which is why the members of the movement were called Protestants), and it was an encompassing social and cultural movement that articulated and practiced alternative, or oppositional, forms of religion, politics, and learning as part of its political struggle (Mandrou 1978). The teachings of Paracelsus, Giordano Bruno, and Tommaso Campanella, to name some of the better known examples, combined a questioning of established religious and political authority with an interest in scientific observation, mathematics, mechanics, and technical improvements (Merchant 1980). But as the broader movements were replaced by more formalized institutions in the course of the seventeenth century, the political and social experiments came to be transformed into scientific experiments; the political and religious reformation, tinged with mysticism and filled with mistrust of authority, was redefined and reconstituted, at least in part, as a scientific revolution (see Table 1).

Table 1 A schematic representation of the scientific revolution from movement to institution
From movement … | To institution
Social reform | Reform of philosophy
Millenarian vision | Experimental program
Connection to radical politics | Rejection of politics
Decentralized structure | Central academy
Democratic/open to all | Elitist/professional
Spiritual (absolute) knowledge | Instrumental (probabilistic) knowledge
Technical-economic improvement | Scientific development
Informal communication: pamphlets, correspondence | Formal communication: journals, papers

In the later seventeenth and early eighteenth centuries, the scientific ‘aristocracy’ that had emerged in London and Paris at the Royal Society and the Académie des Sciences was challenged by dissenting groups and representatives of the emerging middle classes, some of whom fled from Europe to the colonies in North America, and some of whom established scientific societies, often in provincial areas in opposition to the established science of the capital cities. Many of them shared with the academicians and their royal patrons a belief in what Max Weber termed the Protestant ethic, that is, an interest in the value of hard work and the virtue of making money, and most had an interest in what Francis Bacon had termed ‘useful knowledge.’ The social movements of the Enlightenment objected, however, to the limited ways in which the Royal Society and the Parisian Academy had organized the scientific spirit and institutionalized the new methods and theories of the experimental philosophy.
The various attempts to democratize scientific education in the wake of the French Revolution and to apply the mechanical philosophy to social processes (that is, to view society itself as a topic for scientific research and analysis) indicate how critique and opposition helped bring about new forms of scientific practice. Adam Smith’s new science of political economy developed in the Scottish hinterland, and many of the first industrial applications of experimentation and mechanical philosophy took place in the provinces, rather than in the capital cities, where the scientific academies were located (Russell 1983). Many of the political and cultural trends of the nineteenth century, from romanticism and utopianism to socialism and populism, also began as critical movements that at a later stage, and in different empirical manifestations, came to participate in reconstituting the scientific enterprise. Romanticism, for example, first emerged as a challenge to the Promethean ambitions of science, in the guise of Mary Shelley’s mad Doctor Frankenstein, while socialism, in the utopian form promulgated by Robert Owen, was a reaction, among other things, to the problematic ways in which the technological applications of scientific activity were being spread into
modern societies. Romantic writers and artists, like William Blake, Johann Goethe, and later Henry David Thoreau, were critical of the reductionism and cultural blindness of science and of the ‘dark satanic mills’ in which the mechanical worldview was being applied. They sought to mobilize the senses and the resources of myth and history in order to envision and create other ways of knowing (Roszak 1973). Some of their protest was transformed into constructive artistic creation, and later in the century, the romantic revolt of the senses inspired both the first waves of environmentalism in the form of conservation societies and the emergence of a new kind of holistic scientific discipline, to which was given the name ecology: household knowledge. In the late nineteenth century, the labor movement also sought a deep-going, fundamental political and cultural transformation of society, but it would see its critical message translated, in the twentieth century, into packages of reforms and a more welfare-oriented capitalism, on the one hand, and into the ‘scientific socialism’ of Lenin and Stalin on the other (Gouldner 1980). Once again, however, science benefited from this institutionalization of the challenge of social movements; the knowledge interests of the labor movement, for example, entered into the new social science disciplines of economics and sociology. A political challenge was once again transformed into programs of scientific research and state policy; but while new forms of scientific expertise were developed, little remained of the broader democratization of knowledge production that the labor movement, in its more radical days, had articulated.
In the early twentieth century, the challenge to established science was mobilized most actively in the colonies of European imperialism, as well as in the defeated imperialist powers of Germany and Italy; the critique was primarily of scientific civilization writ large, and of what Mahatma Gandhi in India called its ‘propagation of immorality.’ In the name of modernism, science stood for the future and legitimated, in the colonies as well as in Europe, a wholesale destruction of the past and of the ‘traditional’ knowledges—the other ways of knowing—that had been developed in other civilizations (Tambiah 1990). These social movements had an impact on the development of political ideology on both the right and left, but also inspired new sciences of ethnography and anthropology, as well as the sociology of knowledge. Even more important perhaps were the various attempts to combine the artisanal knowledges of the past and of other peoples with the modern science and technology of the present in new forms of architecture, design, and industrial production. Many of the regional development programs of the 1930s and 1940s, in Europe and North America, can trace their roots back to the cultural critique of modernism that was inspired by the Indian independence movement and by such ‘movement intellectuals’ as William Morris and
Patrick Geddes in Britain. Both Roosevelt’s New Deal and the Swedish model welfare state can be said to have mobilized civilizationally critical perspectives in their projects of social reconstruction.
3. Science and Contemporary Social Movements
In recent decades, social movements have also served to challenge and reorient the scientific enterprise. Out of the anti-imperialist and student movements of the 1960s and the environmentalist, feminist, and identity movements of the 1970s and 1980s have emerged a range of alternative ideas about science, in form, content, and meaning, that have given rise to new scientific theories, academic fields, and technological programs (Schiebinger 1993, Harding 1998). Out of critique have grown the seeds of new, and often more participatory, ways of sciencing, from technology and environmental impact assessment to women’s studies, queer theory, and postcolonial discourses. What were in the 1960s and 1970s protest movements of radical opposition largely have been emptied of their political content, but they have given rise to new branches of, and approaches to, science and technology. While the more radical, or oppositional, voices have lost much of their influence, the more pragmatic and scientific voices have been given a range of new opportunities. This is not to say that there is no longer a radical environmental opposition or a radical women’s liberation movement, but radicals and reformists increasingly have drifted apart from one another, and in most countries now work in different organizations, with little sense of a common oppositional movement identity. There has been a fragmentation of what was a coherent movement into a number of disparate bits and pieces (Melucci 1996). The new social movements rose to prominence in the downturn of a period of institutional expansion and economic growth. They emerged in opposition to the dominant social order and to its hegemonic scientific-technological regime, which had been largely established during and immediately after World War II (Touraine 1988).
The war led to a fundamental transformation in the world of science and technology, and to the emergence of a new relation, or contract, between science and politics. Unlike previous phases of industrialization, in which science and engineering had lived parallel but separate identities, World War II ushered in the era of technoscience. The war effort had been based on an unprecedented mobilization of scientists to create new weapons, from radar to the atomic bomb, and to gather and conduct intelligence operations. In the process, ‘little’ science was transformed into big science, or industrialized science.

Especially important for the social movements that were to develop in the 1960s and beyond was the fact that scientific research was placed at the center of postwar economic development. Many of the economically significant new products (nylon and other synthetic textiles, plastics, home chemicals and appliances, television) were based directly on scientific research, and the new techniques of production were also of a different type: it was the era of chemical fertilizers and insecticides, of artificial petrochemical-based process industries, and food additives. The forms of big science also differed from the ways in which science had been organized in the past. The big science laboratories, in both the public and private sectors, were like industrial factories, and scientific-technical innovation came to be seen as an important concern for business managers and industrial organizers. The use of science in society had become systematized, and, as the consequences of the new order became more visible, new forms of mistrust and criticism developed (Jamison and Eyerman 1994).

One wing of the public reacted to the destruction of the natural environment, what an early postwar writer, Fairfield Osborn, termed ‘man’s war with nature.’ The exploitation of natural resources was increasing and, in the 1940s and 1950s, it began to be recognized that the new science-based products were more dangerous for the natural environment than those that had come before. But it would only be with Rachel Carson, and her book Silent Spring in 1962, that an environmental movement began to find its voice and its characteristic style of expression. It was by critically evaluating specific instances of scientific technology, particular cases of what Osborn had called the ‘flattery of science,’ that the environmentalist critique would reach a broader public. Carson singled out the chemical insecticides for detailed scrutiny and assessment, but her point was more general. Carson’s achievement was to direct the methods of science against science itself, but also to point to another way of doing things, the biological or ecological way, what she called in her book the road not taken.
Another source of inspiration for the new social movements came from philosophers and social historians who questioned the more general impact of technoscience on the human spirit. It was one-dimensional thinking which critical theorists like Herbert Marcuse reacted against, the dominance of an instrumental rationality over all other forms of knowing. For Lewis Mumford, another major source of inspiration, it was the homogenization of the landscape that was most serious, the destruction of the organic rhythms and flows of life that had followed in the wake of postwar economic growth, as well as the dominance of what he termed the ‘megamachine,’ the use of technology for authoritarian purposes. In the 1970s, a range of new social movements, building on these and other sources of inspiration, came to articulate an oppositional, or alternative, approach to science and technology (Dickson 1974). The so-called new social movements represented an integrated set of knowledge interests, which combined the critique of Rachel Carson with the liberation
dialectics of Marcuse and the direct democracy of the student movement. The new movements involved a fundamental critique of modern science’s exploitative attitude to nature, as well as an alternative organizational ideal (a democratic, or participatory, ideal) for the development of knowledge. There were also distinct forms of collective learning in the new social movements of environmentalism and feminism, as well as grass-roots engineering activities that went under the name of appropriate or alternative technology.
4. Social Movements as Cognitive Praxis
On the basis of these historical and contemporary relations between science and social movements, many social movements can be characterized as producers of science (Eyerman and Jamison 1991). The critical ideas and new public arenas that are mobilized by social movements often provide the setting for innovative forms of cognitive praxis, combining alternative worldviews, or cosmological assumptions, with alternative organizational and practical–technical criteria. In the case of environmentalism, the cosmology was, to a large extent, the translation of a scientific paradigm into a socioeconomic paradigm; in the 1970s, the holistic concepts of systems ecology were transformed into political programs of social ecology, and an ecological worldview was to govern social and political interactions. Technology was to be developed under the general perspective that ‘small is beautiful’ (in the influential phrase of E. F. Schumacher), and according to the assumption that large-scale, environmentally destructive projects should be opposed and stopped. At the same time, new contexts for education and experimentation and the diffusion of research were created in the form of movement workshops and, in the Netherlands, for example, in the form of science shops, which allowed activist groups to gain access to the scientific expertise at the universities (Irwin 1995). In the 1980s, this cognitive praxis was largely decomposed into a disparate cluster of organizations and individuals, through processes of professionalization and fragmentation. The knowledge interests of the environmental movement were transformed into various kinds of professional expertise, which made it possible to incorporate parts of the movement into the established political culture, and to shift at least some of the members of the movement from outsider to insider status.
Some of the alternative technical projects proved commercially viable (biological agriculture, wind energy plants, waste recycling) and gave rise to a more institutionalized form of environmental politics, science, and technology (Hajer 1995). Similar processes can be identified in relation to other social movements of recent decades (Rose 1994). The political struggles for civil rights, women’s and sexual liberation, and ethnic and national identity
have inspired new approaches to knowledge that have since been institutionalized and transformed into established scientific fields, such as women’s studies, gay and lesbian studies, African-American studies, as well as in new areas of medicine and technology. At the same time, science itself has been reconstituted, partly as a result of the critical perspectives and cognitive challenges posed by social movements. Many of the postmodern theories of the cultural and human sciences have been inspired by the experiences of social and political movements. Out of the alternative public spaces that have been created by social and political movements has emerged a new kind of scientific pluralism, in terms of organization, worldview assumptions, and technical application. In the transformations of movements into institutions, a significant channel of cognitive and cultural change can thus be identified. It may be hoped, in conclusion, that these interactions between science and social movements will receive more substantial academic attention in the future. See also: History of Science; Innovation, Theory of; Kuhn, Thomas S (1922–96); Marx, Karl (1818–89); Social Movements, History of: General; Social Movements: Psychological Perspectives; Social Movements, Sociology of
Bibliography
Dellaporta D, Diani M 1999 Social Movements. An Introduction. Routledge, London/Blackwell, Oxford, UK
Dickson D 1974 Alternative Technology and the Politics of Technical Change. Fontana, London
Eyerman R, Jamison A 1991 Social Movements. A Cognitive Approach. Penn State University Press, University Park, PA
Gouldner A 1980 The Two Marxisms. Contradictions and Anomalies in the Development of Theory. Seabury University Press, New York
Hajer M 1995 The Politics of Environmental Discourse. Ecological Modernization and the Policy Process. Oxford University Press, Oxford, UK
Harding S 1998 Is Science Multicultural? Postcolonialisms, Feminisms, and Epistemologies. Indiana University Press, Bloomington, IN
Irwin A 1995 Citizen Science. A Study of People, Expertise and Sustainable Development. Routledge, London
Jamison A, Eyerman R 1994 Seeds of the Sixties. University of California Press, Berkeley, CA
Mandrou R 1978 From Humanism to Science 1480–1700. Penguin Books, Harmondsworth, UK
Melucci A 1996 Challenging Codes: Collective Action in the Information Age. Cambridge University Press, Cambridge, UK
Merchant C 1980 The Death of Nature. Women, Ecology, and the Scientific Revolution. Harper and Row, San Francisco
Merton R 1970 (1938) Science, Technology, and Society in Seventeenth-century England. Harper and Row, New York
Rose H 1994 Love, Power and Knowledge. Toward a Feminist Transformation of the Sciences. Polity Press, Cambridge, UK
Roszak T 1973 Where the Wasteland Ends: Politics and Transcendence in Post-industrial Society. Anchor, New York
Russell C 1983 Science and Social Change 1700–1900. Macmillan, London
Scheler M 1980 (1924) Problems of a Sociology of Knowledge. Routledge, London
Schiebinger L 1993 Nature’s Body: Gender in the Making of Modern Science. Beacon, Boston
Tambiah S 1990 Magic, Science, Religion and the Scope of Rationality. Cambridge University Press, Cambridge, UK
Touraine A 1988 The Return of the Actor. Social Theory in Postindustrial Society. University of Minnesota Press, Minneapolis, MN
A. Jamison
Science and Technology, Anthropology of The anthropology of science and technology is an expanding arena of inquiry and intervention that critically examines the cultural boundary that sets science and technology off from the lives and experiences of people. Most research in this arena draws on ethnographic fieldwork to make visible the significance and force of this boundary, as well as meanings and experiences that cut across it or otherwise get hidden. Key categories of projects include juxtaposing shared systems of cultural meaning, following flows of metaphors back and forth across the boundary, and retheorizing or relocating the boundary itself in ways that reconnect science and technology to people. While advancing understanding of how people position science and technology in both specialized and everyday worlds, the anthropology of science and technology also calls attention to cultural possibilities whose realization might help expand participation in decision-making about science and technology.
1. Helping STS Fulfill its Dual Objectives
The unique contribution of this field to science and technology studies (STS) is that it helps revive and refocus questions regarding relations among science, technology, and people. STS research and researchers are held together by their diverse, yet collective, efforts to trouble and transform the dominant, but simplistic, model or image of science and technology in society. According to the dominant model, researchers live in specialized technical communities whose deliberations are essentially opaque and presumably free of cultural content. Knowledge, in the singular, is created by these bright, well-trained people located inside the academy and then diffuses outside into the public arena through mechanisms of education, popularization, policy, and the benefits of new technologies. Its social significance is evaluated exclusively in the public arena, where knowledge is used, abused, or ignored. The outward travel of knowledge preserves the autonomy of creation and establishes a sharp
boundary between science and technology on the one side and people on the other, including the actual lives of scientists and technologists. STS has sought to engage this model in two ways. First, it brings together researchers who analyze the conceptual and the social dimensions of science and technology simultaneously and in historical perspective. Second, by offering new ways of thinking, STS promises to afford society new pathways for confronting and resolving problems that involve science and technology. STS thus offers the dual trajectories of theory and intervention, of proposing new frameworks of interpretation, and participating critically in societal problem solving. These activities complement one another. Good theory defines pathways that make a difference, and successful acts of critical participation depend upon novel theoretical insights. During the 1980s and early 1990s, a major focus within STS was on a philosophical debate between ‘objectivism’ and ‘(social) constructivism.’ Constructivism provided a major theoretical advance over 1970s research on the ‘public understanding of science’ and the ‘impacts’ of science and technology, which tended to take for granted the internal contents of science and technology (see Technology Assessment). One important body of constructivist work was often labeled ‘anthropology of science’ because it relied on the direct observation of scientists in laboratories to link scientific practices to knowledge development. For more information on this formative strand in the anthropology of science, see Laboratory Studies: Historical Perspectives; Actor Network Theory. By questioning how science and technology gain internal contents, the philosophical debate between objectivism and constructivism concentrated attention on the science/technology side of the cultural boundary with people.
Also, a focus on the emergence and stabilization of either new scientific knowledge or new technological artifacts tended to maintain the more general cultural separation between science and technology (see Technology, Social Construction of).
The newer strands in the anthropology of science and technology explicitly foreground the interventionist potential of STS by moving back and forth across the boundary between science/technology and people, investigating its force and making visible meanings and experiences that get hidden when it is taken for granted. In this work, the commitment to ethnographic practices forces attention to alternative pathways for critical participation as an integral part of theoretical innovation (see also Ethnography).
2. Helping Anthropology Rethink Culture
As recently as 1988, the American Anthropological Association (AAA) rejected proposed panels on the anthropology of science and technology on the grounds that such work did not fit under the AAA umbrella. Things had changed by 1992 as a series of
panels on ‘cyborg anthropology’ and on the work of Donna Haraway attracted standing-room-only audiences in a ballroom setting. Key to this shift was a growing recognition that anthropological debates over the status of cultural analysis and cultural critique were similar to developments and deliberations in STS. Anthropologists grew hopeful that analyzing and critiquing the dominant model of science and technology in cultural terms might further people’s understanding of how sciences and technologies, including anthropology, live in society. Emerging questions included: what are the implications of accounting for science, technology, and people in cultural terms? Might the finding of wider cultural meanings in science and technology improve searches for alternative configurations? To what extent has anthropology itself depended on the dominant model of science and technology in society? How can one participate critically in science and technology through ethnographic work?
As it emerged from symbolic anthropology during the 1970s, the concept of culture drew from the predominant model of language as an underlying grammar, structure, or system of symbols and meanings (see Culture: Contemporary Views). Beneath surface differences in speech and action among competent participants in a given community lies a more fundamental sharedness, a bounded ‘culture.’ This concept of cultures as sets of shared assumptions or presuppositions depends on a contrast with the concept of ‘nature.’ Where nature provides people with base needs and desires, culture provides content and meaning.
Until the 1980s, it appeared difficult to apply the concept of culture to science and technology. While theorizing culture as a bounded system helped in accounting for differences across cultures, this approach offered no means of accounting for differences within cultures. Yet research questions in STS typically focused on the latter.
Also, much research in anthropology itself, for example, kinship theory, depended on the nature/culture distinction for its legitimacy. Treating anthropological work itself as a cultural enterprise could have threatened to undermine the discipline.
As Franklin (1995) put it, by the 1990s ‘several trajectories coalesce[d] to produce … momentum’ for an emerging anthropology of science and technology. Feminist anthropology had made visible the role of biological assumptions in gender and kinship studies (see Kinship in Anthropology). Poststructuralism made it possible to think about power–knowledge relationships and to make the conceptual limitations of the ‘human’ a focal point for social theory (see Knowledge, Sociology of; Postmodernism in Sociology). Postmodernist critiques of science and technology called attention to the processes of their production, making ‘progress’ a contingent effect (see Cultural Studies of Science). In revealing how anthropologists ‘write culture’ by turning people into ‘others,’ postmodernism in anthropology introduced ‘cultural critique’ as an anthropological practice (see Cultural Critique: Anthropological). Cross-cultural comparisons of knowledge systems began to reposition Western science as ethnoscience (see Knowledge, Anthropology of; Indigenous Knowledge: Science and Technology Studies; Postcoloniality). The rise of interdisciplinary cultural studies juxtaposed ‘popular’ with ‘high’ cultural forms, revealing power relations between the two and retheorizing culture as a site of active work (see Cultural Studies of Science). Feminist critiques of science demonstrated its saturation with gender metaphors, and feminist critiques of reproductive technologies provided compelling accounts of women’s experiences that could not be counted unproblematically as the benefits of innovation (see Feminist Epistemology; Gender and Technology). Cross-disciplinary interests in emerging transnational forms demanded simultaneous attention to technology, knowledge, and capital (see Globalization, Anthropology of; Postcoloniality; Capitalism: Global). Growing demands for accountability across the academy in general fueled interest in how ethnographic research can make a difference in the arena under study (see Ethnography; Advocacy in Anthropology).
During the early 1990s, cultural anthropologists studying science and technology actively resisted the label ‘anthropology of science and technology’ on the grounds that it masked these trajectories, confining diverse work to a bounded subdiscipline. The label works only to mark a collection of intellectual activities within which competition exists not to achieve domination but to make a difference and where border crossing is accepted to enhance the life chances of what the dominant model submerges. Ongoing projects fall into roughly three categories, with individual researchers and studies often contributing to more than one.
3. Juxtaposing Cultural Systems of Meaning
The publication of Sharon Traweek’s Beamtimes and Lifetimes in 1988 marked the emergence of projects that identify and juxtapose cultural systems of meaning in science and technology. Rather than focusing solely on theory change in science, Traweek provides ‘an account of how high energy physicists see their own world; how they have forged a research community for themselves, how they turn novices into physicists, and how their community works to produce knowledge’ (1988, p. 1). Studying the culture of high-energy physicists extends and transforms the anthropological project of cross-cultural comparison by studying people who live in more than one culture at the same time, in this case those of the international physics community, Japan, and the United States. Demonstrating sharedness among high-energy physicists and then setting physicists off as a community
achieves two key contributions. It repositions the dominant model of science and technology, which physicists both embrace and embody, into one among many possible cultural perspectives without suggesting that all perspectives on physical knowledge are equal. The ethnographic approach also locates the researcher within the power relations that constitute the field of study and makes visible ways in which scientists function as people who live social lives.
The juxtaposition of shared systems of meaning has proven fruitful in analyzing public controversies over science and technology. Articulating subordinate perspectives and locating these alongside dominant ones demonstrates that factual claims gain meaning in terms of more general frameworks of interpretation. Such work also intervenes in power relations in ways that highlight mediation and collaboration as possible pathways for resolution (see also Scientific Controversies).
Representative contributions in this category juxtapose Brazilian spiritists, parapsychologists, and proponents of bacterial theories of cancer with ‘orthodox’ perspectives (David Hess); antinuclear weapons groups with nuclear weapons scientists (Hugh Gusterson); creation science with evolutionary science (Christopher Toumey); antinuclear power with pronuclear power groups (Gary Downey); artificial intelligence researchers with expert system users (Diana Forsythe); advocates of midwifery with obstetrics (Robbie Davis-Floyd); Marfan scientists with technicians and activists (Deborah Heath); a unified Europe with separate nation-states seeking space travel (Stacia Zabusky); nuclear power plant operators with plant designers (Constance Perin); and competing workgroups in industry (Frank Dubinskas). Recent work questions the culture/people relationship by understanding cultures as, for example, shifting discourses (Gusterson) or recursive, cross-cutting perspectives in a social arena (Hess).
Future work will likely review the notion of sharedness, no longer asserting it as a condition of anthropological analysis but exploring what gets accomplished when it can be demonstrated empirically.
4. Following Flows of Metaphors Across the Boundary
The publication of Emily Martin’s The Woman in the Body in 1987 marked the emergence of projects that follow flows of metaphors from general cultural life into science and technology and back again into people’s lives and experiences. Writing a cultural history of the present, Martin demonstrates how medical conceptions of menstruation and menopause draw upon metaphors of production in portraying these as breakdown, the failure to produce. In addition to contrasting descriptions from medical textbooks with an ethnographic account of the actual experiences
of women, Martin also experiments with re-imagining menstruation and menopause as positive processes of production. Following flows of metaphors brings a new approach to the study of public understanding and of what the dominant model of science and technology characterizes as impacts. It calls direct attention to the existence and life of the cultural boundary that separates science and technology from people. How do people import scientific facts and technological artifacts into their lives and worlds? How does the dominant model of science and technology live alongside and inform other cultural meanings? Like the strategy of juxtaposition, following metaphors focuses attention on how people involve themselves with science and technology, making visible those meanings and experiences that do not have a place within the dominant model alongside those that do. What sorts of effects do scientific facts have in people’s bodies and lives? How do experiences with technologies go beyond the simply positive, negative, or neutral? How do emerging sciences and technologies contribute to the fashioning of selves? Such work makes visible patterns and forms of difference that may not correlate with demographic categories of race, gender, and class. Finally, following metaphors introduces the possibility of speculating on alternative cultural possibilities, including narrative experiments with counterfactual scenarios. While sharing with readers an anthropological method of critical thinking, such speculation also forces more explicit attention to how anthropological accounts participate critically within their fields of study. How, for example, might choices of theory and method combine with aspects of researchers’ identities to shape pathways of intervention? 
Work along these lines follows how sonography and amniocentesis accelerate the introduction of medical expertise into pregnancies (Rayna Rapp); the cultural origins of Western science and its shifts over time (David Hess); how people calculate in diverse settings (Jean Lave); the cultural meanings of medicine in everyday life (Margaret Lock); the narrative construction of scientists’ autobiographies (Michael Fischer); creative adjustments when medical expertise fails to account for pregnancy loss (Linda Layne); the flows of meanings that constituted the reproductive sciences as sciences (Adele Clarke); what living with gender assumptions means for women scientists (Margaret Eisenhart); how Bhopal lives as fact and image in different enunciatory communities (Kim Fortun); and how patents travel around the world (Marianne De Laet).
Future work is likely to elaborate questions of scale and identity. How do metaphors gain effects at different scales and how do meanings that live at different scales combine in people’s lives and selves? How do people respond to meanings that summon
and challenge them, positioning themselves in searches for identities that work (see also Identity in Anthropology)?
5. Retheorizing the Boundary between Science/Technology and People
The publication of Donna Haraway’s Simians, Cyborgs, and Women in 1991 marked the emergence of projects that retheorize the boundary itself. Moving through the project of following flows of metaphors to reimagine categories that ‘implode’ on one another, Haraway calls for ‘pleasure in the confusion of boundaries and … responsibility in their construction’ (1991, p. 150). Claiming the cyborg (see Cyborg) as a feminist icon and key marker of what she calls the ‘New World Order,’ Haraway forces attention to the contemporary dissolution of boundaries between human and animal, human and machine, and physical and nonphysical. Challenging the ‘god-trick’ of universalism, she poses ‘situated knowledges’ (see Situated Knowledge: Feminist and Science and Technology Studies Perspectives) as a means of holding simultaneously to a radical historical contingency and a no-nonsense commitment to faithful accounts of a ‘real’ world.
This project calls attention to burgeoning collections of activities involving science and technology that live across the purported separation between them or across their boundary with people. It includes the exploration of emerging fields that do not fit conventional disciplinary categories, such as biotechnology, biomedicine, and bioengineering, and documents the decline and loss of a distinction between basic and applied science. Following novel activities in research and production motivates the invention of new labels for the anthropological object of study, including, for example, ‘technoscience’ and ‘technoculture.’
The project of retheorizing the boundary between science/technology and people highlights the presence of the nature/culture distinction in anthropology of science and technology as well as in other areas of STS inquiry and intervention.
Pressing questions include: through what sorts of processes do analytic findings and interpretations become naturalized as facts in everyday, or popular, modes of theorizing? In what ways might analytic accounts, including claims about culture, depend upon facts from popular theorizing? How might new modes of theorizing about analysis contribute to rethinking the nature/culture distinction itself? Finally, by calling attention to the difficulty of living within existing categories while attempting to theorize and embody new ones, retheorizing the boundary sharpens the question of intervention. Locating experiences that live across categories invites researchers
to examine how emergent categories impact and inflect old ones. Through what sorts of pathways might reformulations actually intervene and achieve change in specific cases? How might it be possible to assess the extent to which reformulations and refigurations prove, in fact, to be helpful, and to whom?
Contributions to this diverse project explore how ideas of the natural help constitute cultural ways of knowing (Marilyn Strathern); the situatedness of practices in machine design and machine use (Lucy Suchman); opportunities for a ‘cyborg anthropology’ that studies people without starting with the ‘human’ (Gary Downey, Joseph Dumit, Sarah Williams); how reproductive technologies blur the facts of life (Sarah Franklin); emerging refigurations to constitute and categorize artificial life (Stefan Helmreich); the activities of biotechnology scientists who move outside the academy to gain academic freedom (Paul Rabinow); the importance of ‘good hands’ and ‘mindful bodies’ in laboratory work (Deborah Heath); how practices of tissue engineering materialize new life forms (Linda Hogle); experiences with PET scanning that escape the nature/culture distinction (Joseph Dumit); the contemporary production of ‘cyborg babies’ (Robbie Davis-Floyd and Joseph Dumit); experiences of computer engineers that belie the separation of human and machine (Gary Downey); possibilities for reinvigorating general anthropology by locating technology and humans in the unified frame of cyborg (David Hakken); and how African fractal geometry escapes classification in either cultural or natural terms alone (Ron Eglash). (See also Actor Network Theory for an approach that defines both humans and nonhumans as ‘actants.’)
Future work is likely to forge novel alliances among theoretical perspectives previously separated by the nature/culture distinction. Actively locating academic theorizing in the midst of popular theorizing will likely force modalities of intervention into focus.
Finally, to the extent that the anthropology of science and technology succeeds in challenging and replacing the simplistic dominant model by blurring and refiguring the boundary between science/technology and people, wholly new projects of inquiry and intervention will have to emerge to take account of the changing context. See bibliography for additional reviews.
See also: Actor Network Theory; Common Sense, Anthropology of; Cultural Studies of Science; Culture: Contemporary Views; Ethnography; Gender and Technology; Globalization, Anthropology of; History of Science; History of Technology; Identity in Anthropology; Indigenous Knowledge: Science and Technology Studies; Interpretation in Anthropology; Knowledge, Anthropology of; Science, New Forms of; Science, Sociology of; Scientific Culture; Scientific Disciplines, History of; Symbolism in Anthropology; Technology, Anthropology of
Bibliography
Davis-Floyd R, Dumit J (eds.) 1998 Cyborg Babies: From Techno-Sex to Techno-Tots. Routledge, New York
Downey G L, Dumit J (eds.) 1998 Cyborgs & Citadels: Anthropological Interventions in Emerging Sciences and Technologies. SAR Press, Santa Fe, NM
Franklin S 1995 Science as culture, cultures of science. Annual Review of Anthropology 24: 163–84
Haraway D 1991 Simians, Cyborgs, and Women: The Reinvention of Nature. Routledge, New York
Heath D, Rabinow P (eds.) 1993 Bio-politics: The anthropology of the new genetics and immunology. Culture, Medicine, Psychiatry 17(special issue)
Hess D J 1995 Science and Technology in a Multicultural World: The Cultural Politics of Facts and Artifacts. Columbia University Press, New York
Hess D J, Layne L L (eds.) 1992 The anthropology of science and technology. Knowledge and Society 9
Layne L L (ed.) Anthropological approaches in science and technology studies. Science, Technology, & Human Values 23(1): Winter (special issue)
Martin E 1987 The Woman in the Body. Beacon Press, Boston
Nader L (ed.) 1996 Naked Science: Anthropological Inquiry into Boundaries, Power, and Knowledge. Routledge, New York
Traweek S 1988 Beamtimes and Lifetimes: The World of High Energy Physicists. Harvard University Press, Cambridge, MA
G. L. Downey
Science and Technology: Internationalization
Science is predominantly done in national scientific communities, but scientific information regularly crosses boundaries. Various conditions influence this flow, which means that it is not symmetric in all directions. Internationalization is the uneven process wherein cross-border linkages of communication and collaboration in science and technology among countries multiply and expand. Linkages involve individual scientists and their institutions, but also, increasingly, governments through treaties and conventions that include strong science and technology components. The growing density of interconnections between large industrial corporations regarding research in the precompetitive phase as well as technological alliances is also relevant.
1. Some Distinctions
1.1 Scientific Internationalism
Scientific internationalism emerged with the development of national academies of science (Crawford et al. 1993). Later, correspondence and interchange between individual scientists were channeled through international scientific unions and congresses appearing
in the nineteenth century (Elzinga and Landström 1996). The flare-up of nationalism and World War I caused a temporary rupture in this internationalism, both in mode of organization and spirit. After the war, transnational scientific relations were gradually repaired, even with Germany, but at the cost of promoting a strict neutralist ideology, nurturing an image of science disembodied from and standing above society.
1.2 Scientific Internationalization and Economic Globalization
The multiplier effect of interconnections between transnational corporations (TNCs) and financial institutions that shape trade-related agreements and intellectual property regimes influences, but must not be confused with, scientific internationalization. It is a ‘globalization’ process in the economic sphere, driven by for-profit motives, facilitated by a combination of neoliberal politics, privatization and information-technological developments. Internationalization of science, by contrast, is ultimately predicated on trust and solidarity among intellectual peers, even though it is interpenetrating increasingly with economic globalization.
A further distinction is between quantitative and qualitative aspects of internationalization. Quantitative studies deal with numbers and patterns of cross-border linkages, and multinational interconnectivity or collaboration and cooperation. They provide indications of changing trends, but for deeper insights they need to be complemented by studies of qualitative changes, e.g., emergence of new institutional arrangements and incentive systems that facilitate international exchange, reorientation of local research agendas, or harmonization of approaches to policy and priorities, like foresight methodologies, at national and regional levels.
Internationalization of industrial research and development (R&D) is now recognized as an important research topic.
Statistical surveys and empirical case studies confirm increases in the numbers of multicountry patents held by individual firms, as well as a proliferation of technological alliances within and between a triad of trading blocs (NAFTA, the North American Free Trade Agreement; the EU, the European Union; and the Developing Asian Economies plus Japan). In this respect it is actually more appropriate to speak of a ‘triadization’ than of economic globalization. Pertinent literature on internationalization of technology also refers to other types of interaction between TNCs, and it deals with various R&D management strategies in this context. But in this respect also, extant overviews are limited largely to providing taxonomies and typologies of discernible patterns (Research Policy 1999; for further details, see National Innovation Systems and Technological Innovation).
Qualitative aspects of internationalization also include epistemic change: i.e., in the intellectual contents of scientific fields, namely, dominant perspectives, methodologies and theories. Thematic integration appears along research fronts, as well as sectoral lines where problems transcend national boundaries (acid rain, global climate change, AIDS), or are too costly for a single nation to handle alone (e.g., CERN). In these instances, and more generally, when international research programs foster coordination of national contributions they also force standardization of data formats, preferred instrumentation and experimental practices. In what follows, transformations of interconnectivity consonant with internationalism are probed in cartographic, institutional and epistemic dimensions, with political aspects also thrown into relief.
2. Tracing the Span and Patterns of International Networks
At the time of writing, the volume of science, national outputs per monies allocated per gross national product (GNP), and the distribution of research efforts and patenting across the globe are mapped regularly. This is done with the help of publication counts (of papers) and reviews of who cites whom (Price 1986) to trace citation patterns and coauthorship linkages (see Scientometrics). Visibility of scientists from different countries or in regions is compared. Evaluations of research performance use science and technology indicators as proxies for effectiveness, quality and international standing of research groups and their institutions.
The span of global networks increased during the last decades of the twentieth century. Coauthorship patterns reveal growing contacts of scientists across national borders during the 1970s and 1980s (Hicks and Katz 1996), with a certain slackening in the 1990s (UNESCO 1998). Leading scientific producers such as the USA engage less in international coauthorships than do smaller countries: the larger the national or regional scientific community, the greater its ‘self-reliance.’ Generally, internationalism is only played up in large countries when scientific leadership is in decline, or other countries possess desired specialty knowledge. An interesting anomaly is that India and China also figure lower on polls of cross-border coauthorships than might be expected. Latin America has remained constant, while Africa, especially its sub-Saharan part, stands out as most disadvantaged.
Concentration of resources, prestige, authority and recognition thus displays regional variations, with densities following continental contours dominated by an Anglophone region. The predominance of English as the main language of scientific communication, as evidenced in the Science Citation Index (SCI), is also
increasing. International coauthorship furthermore reveals subclusters (e.g., the Scandinavian countries) influenced by geographical vicinity, historical traditions, and linguistic as well as cultural affinities.
The number of Third World countries now participating in world science has increased, but the vast majority still belong to the scientific periphery (Schott 1998). The share of liberal democracies in world science in 1986 was nearly five times their share of world population, while the poorer countries’ share in scientific production accounted for only one-tenth of their share of the world population. With a few remarkable exceptions (India, Brazil, China) this global gap became even more exaggerated during the 1990s (Schott 1993). Increased connectivity, scope and participation in scientific communication across national borders, in other words, cannot be equated with decreasing hierarchization. Rather, ‘globalization’ of institutional models and participation in science is accompanied by a deglobalization in dispersion of science. This contrasts sharply with the notion of science as a public good. Scientific knowledge remains highly concentrated where it is first created, namely among OECD countries.
The scientific centers in advanced industrial nations, by virtue of prestige and scientific achievement (often measured by relative density of Nobel laureates in specific fields) also exercise influence over work done in peripheral countries. A Eurocentric skew persists in power, resources, problem selection, and overriding perspectives on the content and function of science and technology. The end of the Cold War opened a new era of pluricentrism, primarily around three large regional blocs: North America, the European Union, plus Japan and developing economies in Asia.
These cleavages are reflected in transnational citation impact and coauthorship clusters, and they coincide with current density patterns in international trade and technology alliances (EC 1997, pp. 10–11, 93).
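The two indicators discussed in this section can be illustrated with a minimal sketch in Python. It is not the method of any study cited here: the paper records, the function names, and the 0.75 and 0.15 shares are invented for illustration, and a paper is simply counted as international whenever its authors come from more than one country.

```python
from collections import Counter

# Hypothetical toy records: each paper is the list of its authors' countries.
PAPERS = [
    ["USA", "USA"],
    ["USA", "UK"],
    ["Sweden", "Norway", "Denmark"],
    ["Sweden"],
    ["India", "India"],
    ["USA"],
    ["Norway", "USA"],
]


def international_coauthorship_share(papers):
    """Return, per country, the fraction of its papers with a foreign coauthor."""
    total = Counter()          # papers with at least one author from the country
    international = Counter()  # of those, papers that span several countries
    for author_countries in papers:
        countries = set(author_countries)
        for country in countries:
            total[country] += 1
            if len(countries) > 1:
                international[country] += 1
    return {country: international[country] / total[country] for country in total}


def representation_ratio(science_share, population_share):
    """Share of world science divided by share of world population."""
    return science_share / population_share


shares = international_coauthorship_share(PAPERS)
for country in sorted(shares):
    print(f"{country}: {shares[country]:.2f}")

# Assumed illustrative shares only: 75% of output from 15% of world population.
print(round(representation_ratio(0.75, 0.15), 2))
```

In this toy data the country with the most papers has a lower international share than the small Scandinavian cluster, which mirrors the ‘self-reliance’ pattern described above. Real scientometric work would also have to settle how to count multi-author papers (whole versus fractional counting), a choice this sketch ignores.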
3. Driving Factors
Public spending on R&D is more intense in the USA than in the EU, while in terms of patenting, Japan has caught up with the USA, and the EU rates third (EC 1997a, pp. 53, 93). Economic globalization is a second driving force to reckon with, also uneven. With R&D investments concentrated in a few high-technology industries in the world, these lead strategic reconfigurations in S&T landscapes. Rapid advances in information and communication technologies interlock with economic development, providing new vehicles for rapid interaction (e.g., the Internet), and spurring further alliances and integration of knowledge in pre-competitive phases. Simultaneously, science is being drawn more deeply into the economic globalization process by policy responses, as countries
and whole regions (e.g., the EC through its Framework Programs) facilitate technological development for highly competitive world markets. The EC’s mobility schemes for graduate students and postdocs must be seen in this light, and so also, by extension, events like the turn of the millennium announcement of a cooperative bilateral university-level agreement between two of the world’s most prestigious institutions, MIT (Massachusetts Institute of Technology, USA) and Cambridge University (UK), and the EC’s new integrative policy for ‘the European Research Area.’ Such events herald a new phase in the intermeshing of the two processes, scientific internationalism and economic globalization.
An ideological driving factor is the traditional cosmopolitan ethos inherent to academe. Associated with modernity, it incorporates the idea of progress, with a history parallel to the emergence of the nation state, democracy and secularization. This means it is in fact culturally bound, as are its CUDOS components enunciated by Robert Merton as the social glue of modern scientific institutions: intellectual Communism, Universalism, Disinterestedness, and Organized Skepticism (see Norms in Science). Recent deconstruction of such norms, standards and models of science by scholars in the newer sociology of science highlights the role of particularism, drawing attention to social mechanisms of negotiation between various actors, and how these shape or ‘stabilize’ scientific consensus that can never be final. This captures the cultural diversity of scientific practices, but not the basic ideological import of universalism as an ideal in scientific internationalism.
A recent factor of internationalization resides in the global nature of environmental problems and the demand for concerted political action on a global scale to address them.
Increasing numbers of treaties and conventions with strong science and technology components have been drawn up in this vein. On a converging track one finds pressures of escalating costs of large-scale facilities and calls to share budgetary burdens in newer as well as more traditional fields. The end of the Cold War was also a triggering factor. In its wake appear programs to aid Eastern and Central European scientists in their transition to capitalism and market-driven incentives, and new forms of international science and technology cooperation.
Finally, there are the local pressures in smaller nations that push scientists in settings of relative isolation in all parts of the world to integrate their work more closely with research fronts. Here the benefits of internationalization and the added intellectual stimulation it entails sometimes have the makings of a self-fulfilling prophecy. Pressures to increase local visibility and gain international recognition by exposure to wider peer control of scientific quality get entrenched in procedures and new funding opportunities at national research councils and universities, affecting the behavior of individuals and groups of researchers, and success in getting grants.
4. Explanatory Models
Traditional literature on the history of international scientific organizations usually distinguishes two types of fora, scientific nongovernmental and intergovernmental organizations (scientific NGOs and scientific IGOs). These are taken to represent two different institutional mechanisms for fostering internationalization. In the innovation literature, on the other hand, the focus has mostly been on firms and their role in international diffusion of technologies. In general, there are two broad strands of theorizing about international organizations. One is rooted in the assumption that networks and more stable forms of organization arise in response to considerations of efficiency and rational goal-oriented behavior, in the course of which less viable alternative forms of interaction are circumvented. An economistic variant of this approach is implicit in an evolutionary theory of technological advance and diffusion. Here the ‘market’ is taken as the selection mechanism that filters out a set of particular technologies from many potentially possible ones, and these get diffused across the globe (Nelson and Winter 1982). Explicitly or implicitly, the market is regarded as being determined by the tastes, preferences, and purchasing power of potential technology users, who are treated as exogenous. Internationalization, in turn, becomes largely a unidirectional product of technological regimes or trajectories which cross and are taken up in socially constituted selective environments. Depending on similarities in the selection mechanisms, or ‘learning’ between national innovation systems, the transfer of ideas and technologies may be harmonized successively at a global level. Theoretical assumptions in the historiography of scientific organizations, or alternatively in regime theory within the study of international relations, run somewhat parallel to this.
Instead of the prominence attributed to an economic imperative, in these literatures ideological or political imperatives, and even communities of experts, may be foregrounded. One therefore gets the picture of internationalization primarily as the product of ideologically driven self-direction by the scientific community (in the case of scientific NGOs) or of governments’ guiding hands, plus rules and experts (in the case of scientific IGOs). In all these accounts, the organizations in question are regarded more or less as mechanisms through which other agencies act. They are not depicted as purposive actors with an autonomy, power, or culture of their own, even if regime theory has been criticized for giving too much prominence to experts while obfuscating the role of the nation states that invest them with authority. In contrast to this, a sociological strand of theorizing takes its point of departure in Weber’s view of bureaucracies and focuses squarely on issues of legitimacy and power. Its advocates seek to explain a much broader range of impacts organizations can have, among other things by virtue of their role in constructing actors, interests, and social purpose. Complexity, multiplicity, and flexibility are emphasized, with actors incorporated as endogenous to the processes of change. In this perspective it becomes interesting to consider how fora for international interaction, once they are created, take on a life of their own, exercising power autonomously in ways unintended and unanticipated by scientific associations or governments at the outset. The same can be said about conventions or regimes introduced to regulate and standardize intercourse between firms internationally in the fields of trade or intellectual property rights; norms are taken to have significant repercussions on the character of the interface between science and industry, and on the use of expertise in other realms. The constructivist approach associated with sociological institutionalism thus explains the emergence of the relatively autonomous powers of new international fora in terms of the rational-legal authority they embody, emphasizing the new interests for the parties involved, and the concomitant learning process in which certain organizational models become diffused across the globe. New bureaucracies as they develop are seen to provide a generic cultural form that shapes the various forums in specific ways in their respective domains (firms, scientific NGOs, and scientific IGOs). New actors are seen to be created, responsibilities specified, and authority delineated, defining and binding the roles of both old and new actors, giving them meaning and normative values.
In this model, culture, imagery, and rhetoric are held to be forceful ingredients in the life of international organizations, especially in the way these play out their roles in constructing social worlds with a global reach (Finnemore 1993, Barnett and Finnemore 1999). With the foregoing in mind, the next section highlights some empirical facets, with particular regard to the two most obvious institutional mechanisms (and, associated with them, a few typically prominent actors and programs) pertinent to internationalization of science, mostly academic but also governmentally directed (as distinct from internationalization of R&D in industrial enterprise; for this see National Innovation Systems).
5. Two Institutional Mechanisms: Scientific NGOs and IGOs
The numbers and influence of scientific NGOs and IGOs have grown tremendously since the 1980s. Now they not only find themselves interacting, but are also pulled in different directions by lobbies of both transnational corporations (TNCs) driven by the profit motive, and nongovernmental civic society organizations (social movement NGOs). The latter are frequently fired by an ethic of equality and justice. Truth, politics, money, and human equality or justice, then, are the four ‘logics’ that meet in international forums when strategies to tackle global problems are negotiated, e.g., Rio 1992; in the process new international groups of experts join the scene. This is an important field for future studies (Rayner and Malone 1998). Analytically, it is useful to draw a distinction between autoletic and heteroletic organizations, especially between ones meant to serve science as an end in itself, and ones that are created and sustained by governmental action (Elzinga and Landström 1996). This is parallel to the distinction between policy for science and science for policy, serving to mark institutional separation of science from politics, an aspect central to arguments regarding the integrity and objectivity of expert knowledge under strain (see Scientific Controversies).
5.1 Scientific NGOs
In general, nongovernmental mechanisms operate directly between research communities of different countries, without the intervening medium of governments. They are autoletic, the premise being that communities of scientists are best left to themselves to organize their transnational contacts for common goals. A unique example is the International Council of Scientific Unions (ICSU), the umbrella organization that in 1931 sprang from an older cosmopolitan ideal (Greenaway 1991). It coordinated the Second International Polar Year (1932–3), the forerunner of a series of programs of global, often multidisciplinary studies that began in 1952 with the plan for the International Geophysical Year (1957). Present-day successors are the International Geosphere-Biosphere Program (IGBP) and the World Climate Research Program (WCRP). These cut across disciplines pertinent to research into global environmental problems. Each has several subprograms dealing with specific themes and aspects of global climate change. Other major programs cover biodiversity (DIVERSITAS) and the International Human Dimensions of Global Environmental Change program (IHDP), cosponsored with the International Social Science Council (ISSC), ICSU’s smaller sister organization for the social sciences, created in 1952. A milestone event is the World Conference on Science in Budapest (ICSU 1999), sponsored jointly with the United Nations Educational, Scientific and Cultural Organization (UNESCO), one of the world’s most wide-ranging intergovernmental organizations. Since its foundation in 1945 UNESCO has worked constantly to link the peripheries to the centers of science.
5.2 Scientific IGOs
Scientific IGOs typify the second (heteroletic) type of mechanism for internationalization. They promote scientific interchange via governmental channels. The point of departure is not scientific knowledge production as such, but the use of science for a particular purpose that forms the basis for concerted action. In such contexts science is made a vehicle for the promotion of cultural, economic, political, and other goals at regional and global levels. UNESCO has already been mentioned. The World Meteorological Organization (WMO) and the World Health Organization (WHO) are two of many others within the UN family; in the domain of technology too there are many IGOs, some of which set and regulate standards. The World Bank is another significant actor. A recent addition is the Intergovernmental Panel on Climate Change (IPCC), which follows up on the science produced under the auspices of IGBP and WCRP, harmonizing research-based statements in advice to governments. This has important repercussions for scientific and technological pursuits. Some IGOs serve stakeholder interests in specific regions of the world, OECD being an example. Since 1963 it has been a pacesetter in developing R&D statistics and science and technology policy doctrines, contributing to some harmonization between countries. A recent innovation was the creation of the Forum for Megascience (1992), since renamed the Global Science Forum. It is responsible for periodic reviews of very large-scale projects so costly (in the order of billions of US dollars per year) and complex that they require multinational collaboration. Hence science and international diplomacy must meet to produce unique management cultures (Watkins 1997). Megaprojects can be concentrated at one site (e.g., CERN) or distributed (e.g., the Human Genome Project). Deep ocean drilling, climate change, thermonuclear fusion experimentation, and Antarctic research are further examples.
Antarctica is a continent shaped by science as a key vehicle in an international regime outside the UN (Elzinga, in Crawford et al. 1993).
6. Intermesh and Thematic Integration
Proliferation and interpenetration of scientific NGOs and IGOs over past decades have gone hand-in-hand with a corresponding growth in numbers of civic NGOs and corporate lobbies. These also interact with scientific bodies, adding to the hybrid criss-cross of connections in which scientific and political agendas converge and blend. Leading scientists respond to the strain by constantly reemphasizing the need to protect the objectivity and integrity of scientific knowledge claims (Greenaway 1991). Conflicts have emerged around attempts to commercialize and privatize national databases, with both ICSU and IGOs like the WMO coming out strongly in favor of a policy for open access to information that is invaluable for world research on global problems. Similar tensions have evolved over intellectual property in biotechnology, where scientists are more apt to be of two minds (see Intellectual Property, Concepts of). In the wake of globalization, in Third World countries in particular, both scientists and politicians have reacted against the design of Trade Related Intellectual Property Rights (TRIPS) as negotiated within the World Trade Organization (WTO). Around these and other issues, new tensions arise to pull scientists in several directions along the four different ‘logics’ (delineated at the beginning of Sect. 5). Thematic integration of research agendas is a second qualitative dimension needing further study. The framing of problems, the data, and concept formation all involve interpretation and epistemological imperatives (see Situated Knowledge: Feminist and Science and Technology Studies Perspectives). This is apparent in research on global climate change, where large international scientific programs (IGBP and WCRP) work hand-in-hand with IPCC to orchestrate problem sets, preferred methodologies, and modeling criteria. Core sets of concepts spun in the world’s leading scientific countries, with epistemic communities of climatologists around General Circulation Models (GCMs), have a bearing on the type of data field workers should look for, and the format in which the data are cast; special funding programs exist that enroll scientists from the scientific peripheries. Creating global consensus around a core of scientifically accepted knowledge and streamlining homogeneous accounts to spur concerted political action have epistemological implications beyond science.
As activities, they contribute to the formation of common world views. Concepts such as ‘global warming potential’ (of greenhouse gases) help to build bridges between science and political decision-making. The very notions of ‘global climate’ and ‘Earth system science’ are other examples where conceptual work facilitates cognitive integration, both over disciplinary boundaries in science and in interfaces with citizens at large (Jasanoff and Wynne, in Rayner and Malone 1998). They change our world picture in a reimagining of humankind in its encounters with nature (e.g., ideas like anthropogenic ‘fingerprints’). Representations of local climates, the atmosphere, and circulation systems in the oceans are linked up with representations of human ecology (e.g., land use), which in turn are linked to conceptions of risk and truth (see Risk, Sociology and Politics of). Internationalization of science resonates with gradual reorientation of perspectives and scientific practices in unitary fashion. To what extent alternatives to expertise as avenues of knowledge production are foreclosed remains a contentious issue.
See also: History of Science; History of Technology; International Organization; International Science: Organizations and Associations; Science and Technology, Social Study of: Computers and Information Technology; Scientific Academies, History of; Universities and Science and Technology: Europe; Universities and Science and Technology: United States
Bibliography
Barnett M, Finnemore M 1999 The politics, power, and pathologies of international organizations. International Organization 53: 699–732
Crawford C, Shinn T, Sörlin S (eds.) 1993 Denationalizing Science. Kluwer, Dordrecht, The Netherlands
European Commission 1997 Second European Report on S&T Indicators. Office for Official Publications of the European Communities, Luxembourg
Elzinga A, Landström C (eds.) 1996 Internationalism and Science. Taylor Graham, London
Finnemore M 1993 International organizations as teachers of norms: the United Nations Educational and Cultural Organization and science policy. International Organization 47: 565–97
Greenaway F 1991 Science International: A History of the International Council of Scientific Unions. Cambridge University Press, Cambridge, UK
Hicks D, Katz J S 1996 Where is science going? Science, Technology and Human Values 21: 379–406
ICSU 1999 Science International [Sept: Special Issue]. ICSU Secretariat, Paris
Nelson R, Winter S 1982 An Evolutionary Theory of Economic Change. Harvard University Press, Cambridge, MA
Price D de S 1986 Little Science, Big Science and Beyond. Columbia University Press, New York
Rayner S, Malone E (eds.) 1998 Human Choice and Climate Change. Battelle Press, Columbus, OH, vol. 1
Research Policy 1999 Special Issue 28, 107–36 (The internationalization of Industrial R&D)
Schott S 1993 World science: Globalization of institutions and participation. Science, Technology and Human Values 18: 196–208
Schott T 1998 Between center and periphery in the scientific world system: Accumulation of rewards, dominance and self-reliance in the center. Journal of World-systems Research 4: 112–44
UNESCO 1998 World Science Report. Elsevier, Paris
Watkins J D 1997 Science and technology in foreign affairs. Science 277: 650–1
A. Elzinga
Science and Technology, Social Study of: Computers and Information Technology
Roots and origins being elusive and often illusory, I will date this article from about 1980, when some key collaborations between sociotechnical systems (STS) scholars and maverick computer scientists began. This passes lightly over some early important work critiquing automation, such as that of J. D. Bernal, the early days of the STS movement in Europe, particularly in England, Germany, and Scandinavia, and the neo-Marxian analyses of labor process, many of which came to inform later STS work described below. The nexus of work that currently links STS with computer and information science is a very complex one, with roots in all the areas described below, and held together by a strong invisible college. This group shares a common concern with how computers shape, and are shaped by, human action, at varying levels of scale. The links with STS include concerns about computer design within the social construction of technology; computers as an agent of social or organizational change; ethics and/or computing (Introna and Nissenbaum 2000); critical studies of computers and computer/information science; and applied, activist, and policy research on issues such as the ‘digital divide,’ or the unequal distribution of computing and information technology across socioeconomic strata and regions of the world.
Some contributions from this part of STS that have been used by scholars in many other parts of the field are Suchman’s ‘situated action’ perspective; Star’s ‘boundary objects’ (Star and Griesemer 1989); Forsythe’s methodological questions about ‘studying up’ and the politics of the anthropology of computing (1993); Henderson’s work on engineering drawings as ‘conscription devices’ (1999); Edwards’ work on computing and the Cold War, and its model of ‘closed and green’ worlds (1996); Berg’s critique of rationalization in medicine via computing (1997); Heath and Luff’s study of ‘centers of control’ via the technologies and interactions of the London Underground work force (2000); Woolgar’s program building and analytic work, much of it concerning the World Wide Web, from the Virtual Society? Program based at Brunel University (Grint and Woolgar 1997); Bowker’s concept of ‘infrastructural inversion,’ taken from information management at Schlumberger Corporation (1994); Yates’ historical examination of information control techniques in American business (1989, 1996, 1999); Hanseth et al.’s work on standards and ethics in information technology (1996); and Bowker and Star’s work on large-scale classification schemes (1999).
1. Automation and the Impact of Computerization
Many early studies of computers and society, or computers and organizations, concerned computing as an automation of human work. This included two major analytical concerns: the replacement of humans by machines in the workplace, and the ‘deskilling’ of existing functional jobs (Braverman 1998). Later, the analytic picture became much richer, as researchers realized that automation and deskilling were just two dimensions of the picture. An early breakthrough article by two social-computer scientists defined this territory as ‘the web of computing’ (Kling and Scacchi 1982), a term which became a touchstone for the next generation. It refers to the co-construction of human work and machine work, in the context of complex organizations and their politics. Much of this research took place in business schools; some in departments of sociology or industrial psychology. In Scandinavia and other regions with strong labor unions, partnerships between the unions and researchers were important, at first focused on job replacement and deskilling, and later on design created through partnerships with users, social scientists, and computer designers (discussed below). The social impact of computing remains a strong research strain worldwide, despite the anthropological fact that finding ‘untouched tribes’ is increasingly difficult. Thus the problematics of interest to STS have shifted from initial impact on a group or institution to understanding ongoing dynamics such as usage, local tailoring of systems, shifts in design approaches, the role of infrastructure in helping to form impacts, and the ecology of computers, paper, telephones, fax machines, face-to-face communication, and so forth commonly found in most offices, and increasingly in homes and schools.
2. The Emulation of Human Cognition and Action (Artificial Intelligence and Robotics)
The early program of artificial intelligence (AI) work, beginning in an organized way during World War II, was to create a machine that would emulate human thinking (classic AI) and action (robotics). Some of the emulation work came in the form of intelligent tutoring systems and expert systems (Wenger 1987) meant to replace or supplement human decision making. The decision-making side of the research continues to be strong in management schools and in government agencies; it is often critiqued by STS researchers, although some do use these tools (Gilbert and Heath 1985, Byrd et al. 1992). The discussion about whether emulation of human thinking would be possible dates far back; it came to the attention of many STS researchers with the work of philosopher Hubert Dreyfus (1979, 1992), who argued for the irreducibility of human thought and therefore its impossibility in computers. Later STS researchers (including Forsythe, Star, and Suchman) formed direct collaborations with AI researchers. These partnerships took both critical and system-building forms; in both cases, the social science researchers acted as informants about the nature of the ‘real world’ as opposed to the emulated world of AI. AI changed during the late 1980s and early 1990s; some computer scientists spoke of the ‘AI winter,’ referring to funding problems unknown in the 1970–88 period. As personal computing and e-mail spread, followed later by the Web (1994), and the early promises of AI seemed not to be paying off, AI began to lose its prestige within computer science. Many AI researchers changed to problems in information science, software engineering, or cognitive science. A branch of AI, distributed artificial intelligence, continued to interact with the STS community (Huhns 1987, Huhns and Gasser 1989). Their interests were in modeling and supporting spatially and temporally distributed work and decision practices, often in applied settings. This reflected and bridged to STS concerns with community problem-solving, communication and translation issues, and the division of labor in large scientific projects.
3. The Enterprise of Computing, Its Military Roots and the Role of Activism
In 1983 US President Ronald Reagan introduced his (in)famous ‘Star Wars’ (officially known as the Strategic Defense Initiative, or SDI) military defense program, a massively expensive attempt to create a distributed laser and missile-based system in space that would protect the USA from foreign attack. The proposal put computer scientists, and especially software engineers, at the center of the design effort. An immediate outcry was raised by the computing community as well as by alarmed STS scholars close to it. There were concerns that a system of that magnitude was untestable; tools were lacking even to emulate the testing. There were concerns about its viability on other grounds. There were also concerns about its ecological impact on the field of computer science (akin to those raised by organismal biologists about the Human Genome Project): that the lion’s share of research funds would be funneled into this project, orphaning others. A new group, Computer Professionals for Social Responsibility (CPSR) (www.cpsr.org), was formed in Silicon Valley as a focus for these concerns. The group quickly found common ground in many areas, including activist, ethical, policy, and intellectual questions. It has flourished and now sponsors or cosponsors three conferences a year in which STS scholars often participate: Computers, Freedom and Privacy (policy); Directions in Advanced Computing (DIAC) (intellectual and design directions, and their political and ethical bases and outcomes); and the Participatory Design Conference (PDC) (brings together users, community organizers, and computer and social scientists to work on issues of co-design and appropriate technology) (Kyng and Mathiassen 1997). It, like its counterparts in the UK (Computers and Social Responsibility) and elsewhere, became an important meeting ground between STS and computer/information science. This was especially true of those critical of military sponsorship of computing agendas (at one point in the US some 98 percent of all funding of computer science came from the military, especially ARPA (formerly DARPA), the Advanced Research Projects Agency, arguably the critical actor in the creation of the internet; Abbate 1999). Beginning in the 1980s, a grassroots movement sometimes called ‘community computing’ arose, and linked with CPSR and other similar organizations. It attempted to increase access to computing for poor people and those disenfranchised from computing for a variety of reasons (Bishop 2000). This often involved the establishment of ‘freenets,’ or publicly available free computing. Terminals were placed in venues such as public libraries, churches, and after-school clubs; some were distributed directly to people in need. This general problematic has come to be called the ‘digital divide,’ and forms an active nexus of research across many areas of interest to STS. An important part of the attempt to enfranchise everyone arose in the world of feminist activism and scholarship, as early computer hackers were nearly all male, and there were sexist and structural barriers to women as both users and professionals. Another meeting place between STS and computer/information science was formed by numerous conferences on women in computer science, gender and computing, and feminist analyses of the problem choice and ethics of computer science.
An excellent series of conferences, whose proceedings are published in book form every four years, is sponsored by IFIPS (the International Federation of Information Processing Societies, Working Group 9.1, entitled Women, Work and Computing). See, for example, Grundy et al. (1997). There is a resurgence of interest in this topic with the advent of the Web (see, e.g., Wakeford 1999).
4. Design
Beginning in the early 1980s, but reaching full strength in the late 1980s, a number of STS and STS-linked scholars began studying information technology design. This took two forms: studying the work of doing design, and doing design as part of a multidisciplinary team. Henderson’s work (cited above) on the visual design practices of engineers is a good example of the former; other important works in the area include Kunda (1992) and a special issue on Design of the journal Computer Supported Collaborative Work (Volume 5:4, 1996). STS scholars participating directly in design did so in a number of ways. Anthropologists (such as Nardi 1993, Nardi and O’Day 1999, and Orr 1996) looked at issues such as the culture of programming, its practices, and the role of technicians (Barley and Orr 1997). Some, such as Nardi, conducted usability tests for prototypes of systems. Sociologists examined work practices and organizational processes, and informed the computer scientists about the ways in which these issues could shape or block design (Star and Ruhleder 1996).
5. From ‘The Unnamable’ to Social Informatics: Shaping an Invisible College Linked to STS
An invisible college of social scientists, including STS researchers, who do–study–critique computer and information science has steadily grown since the early 1980s. A brief overview of some of the ‘sister travelers’ now growing in strength in STS itself includes the following.
5.1 Organization Theory and Analysis
People who study organizations have for some time been concerned with the use and impact of computing within them. During the 1990s, STS has become an increasingly important theoretical resource for this field. This has resulted in a number of organization science conferences inviting keynote addresses by STS scholars, and in organization researchers reading STS work and attempting to apply it in organizational design and studies. The work of Latour and actor network theory has been especially important (Orlikowski et al. 1995). The policy area of STS has also become increasingly important where issues of privacy, employee rights, intellectual property, and risk are related to computer and information systems (regularly updated reports and references can be found on Phil Agre’s ‘Red Rock Eater’ electronic news service: http://dlis.gseis.ucla.edu/people/pagre/rre.html).
5.2 Computer-supported Cooperative Work (CSCW) and the Participatory Design Movement (PD)
Cognitive and experimental psychologists have been involved with the design of computer and information systems since at least the 1950s. They formed a field that came to be known as Human–Computer Interaction (HCI). This field focused originally on very small-scale actions, such as measuring keyboard strokes, attention and interface design, ergonomics of workstations, and usability at the level of the individual user. In the late 1980s, part of this field began to stretch out to more organizational and cultural issues (Grudin 1990, Bannon 1990). The impact of the
personal computer and the consequent decentralization of computing, and the rapid spread of networked computing to non-computer specialists were important factors. This wing of HCI began to draw together computer scientists, sociologists, anthropologists, and systems analysts of like mind. In 1986 there was a search to put a name to this. Two of the major competitors were ‘office information systems’ (to replace the old ‘office automation’), and ‘computer-supported cooperative work (CSCW),’ which won out. CSCW is an interdisciplinary and international group that studies a range of issues including the nature of cooperation, critiques of computer science, and building systems to support cooperative work (both local and highly distributed). STS scholars have been part of this field since the beginning; Star was a founding co-editor of the journal. Her work and that of Suchman have been widely used in CSCW. Closely linked with CSCW is PD, the practice of designing computer systems with user communities and social informaticians. Annual PD conferences are held in tandem with CSCW. They have picked up and actively use STS approaches. The roots of PD have been reviewed recently by Asarco (2000); they include corporate initiatives, community development, social movements, and the Scandinavian School, discussed in the next section.
5.3 The ‘Scandinavian School’
In the 1950s the (powerful) trade unions in Scandinavia helped to pass the ‘codetermination legislation.’ This law stated that unions must be involved in technological design, a requirement originally motivated by concerns about deskilling and job loss through automation. A form of sociotechnical systems analysis evolved into a set of techniques for studying work places and processes. As computers arrived in the workplace, this came to include progressive computer scientists and social scientists, many of whom now participate in STS publishing and conferences, as well as CSCW and PD (Greenbaum and Kyng 1991, Bjerknes et al. 1987, Bødker 1991, Neumann and Star 1996).
5.4 Computers and Education
The use of computers for science education began in the 1960s. The study of this, based in schools of education and science departments, has had both critical components and basic research. In recent years, distance education via computers, internet classes, and the ubiquity of computers in college classes have become critical topics for this community. One branch is called Computer-Supported Cooperative Learning (CSCL), and has a lively relationship with both STS and CSCW (Koschmann 1996). STS concepts are beginning to be used across all areas of science education, and recent STS conferences reflect the links strengthening between science education and STS.
5.5 Former Library Schools as an Emergent Nexus for STS Work
Since the 1980s, schools of library science have experienced massive closures, due to a complex of corporatization, declining public sphere funding for public libraries, and a general move to close professional schools whose faculty and students are mostly women (social work and physical therapy schools have met similar fates). In the USA, a few of the surviving library schools have reinvented themselves as new ‘Schools of Information,’ most dropping the word ‘library’ from the title. Sites include the University of Michigan, University of Illinois at Urbana-Champaign, University of North Carolina, and Indiana University. Their faculty now includes social scientists, computer and information scientists, and library scientists. STS work is central to many of the programs, and several faculty members from the STS world who work with information technology or information itself have found positions there. The structural shape of the programs differs in countries with different forms of funding, and different configurations of the public sphere. Kling’s ‘Social Informatics’ page is an excellent resource (http://www.slis.indiana.edu/SI/), as is Myers’ ‘Qualitative Research in Information Science’ (http://www.auckland.ac.nz/msis/isworld/) (see also Kling 2000). However, STS work is still very influential, for example, at the Royal School of Librarianship in Copenhagen.
6. The Web: Cultural Studies, Economic Impact, Social Practices, and Ethics
The public availability of the World Wide Web from 1994 and the commercialization of the Internet–Web, combined with falling prices for computers with good graphics and cheap memory, have changed the face of computing dramatically. Many STS scholars have been involved in different facets of studying the Web. Cultural studies of chat rooms, home pages, MOOs and MUDs, and social inequities in distribution and use have exploded (see e.g., Turkle 1995). Some of this is adapted by STS scholars as studying the use and development of technology; some as a lens through which technology mediates culture. E-commerce, including scientific publishing, has come to overlap with STS work (e.g., Knorr-Cetina and Preda 1998). Ethical areas, such as privacy, electronic stalking and harassment, hate speech sites, and identification of minors on the Web (who are protected by human subjects regulations, but impossible to identify in many e-venues) are among these issues (Friedman and Nissenbaum 1996).
7. Challenges and Future Directions
One of the difficult things for STS scholars who work in this area is the process of tacking back and forth between sympathetic communities housed in information and computer science, STS itself, and one’s home discipline. This appears at many levels: job openings, publications, choosing conferences to attend, and the growing call from within industry and computer science for ethnographers, social science evaluators, and co-designers. Sometimes this latter call takes the form of guest talks to technical groups, keynote addresses at technical meetings, consulting, or being asked for free advice about complex social problems within an organization. As with STS itself, there are a growing number of PhD programs within social informatics, broadly speaking, and the field is both spreading and converging. This may, in the long run, ease the juggling problem. At present, most researchers in the area pursue a double career, publishing and participating in both computer/IS and STS communities, with some bridge-building back to the home disciplines. Another challenge lies in the area of methods. Many social informatics/STS researchers inherited methodological practices from their home disciplines, and some of these make an uneasy fit with current technological directions. For example, there is now a great deal of survey research and of ‘ethnography’ being done on the Web, including intense study of chat rooms, e-discussion lists, and other forms of online behavior. On the survey side, sampling and validity problems loom large, although they are old problems with venerable methods literatures addressing their solution. The question of sampling only from those with e-mail hookups is similar to the old question of sampling only those who have telephones. Every sample has limits.
However, validity issues, such as the practices of filling out e-mail forms, how the survey appears in the context of many other e-mail messages, and the impact of the genre itself on the content, are only now beginning to be explored. Similarly, for ethnographers, it is clear that e-mail messages are not the ready-made fieldnotes that many early studies (i.e., from the early 1990s) delighted in. Forms of triangulation between online and offline research are now coming to the fore in sophisticated methods discussions; another contribution lies in the socially little-known interaction between the built information environment and the phenomenology of users (Wakeford and Lyman 1999). In conclusion, growing overlaps between complex computer/information science groups, STS, and the
other social worlds listed above mean new opportunities for STS. STS students are finding employment in corporate research and development (R&D) settings, new information schools, government policy employers concerned about information technology, and in the non-profit sector concerned with issues such as the digital divide. A note on access: much of this sort of material exists in the proceedings of conferences and is indexed by corporate or organizational authors. It is, in libraries, usually held in special Computer Science libraries on campus, and is, by social science standards, badly indexed. Seek help from the reference librarian to find this material. Working groups or special interest groups (SIGs in computer science terminology) can be powerful loci of change, with thousands of members, both professional and academic. Some are of direct interest to STS scholars, such as SIGSOC, the Special Interest Group on Computers and Society run by the ACM (Association for Computing Machinery, the dominant professional organization for computer scientists in the USA) or ACM-SIGCHI (SIG on Computers and Human Interaction), which has grown so large it is now functionally its own professional organization. Although all professions have jargon and acronyms, computer and information science is highly dense with them. Scholars seeking to build bridges to the computer/information science community can consult many online ‘acronym servers’ such as the Free On-Line Dictionary of Computing at http://foldoc.doc.ic.ac.uk/foldoc/index.html or Acronym Finder at http://www.acronymfinder.com/.
8. Selected Journal Resources Publishing Some STS Work
Accounting, Management and Information Technology (which, in 2000, was due to change its name to Information and Organization)
Organization Science
Computer Supported Cooperative Work (CSCW): The Journal of Collaborative Computing
Information Systems Research
The Scandinavian Journal of Information Science
CSCL
Learning Sciences
Journal of the American Society of Information Sciences
Human–Computer Interaction
The Information Society
Information Technology and People
CMC Magazine
Electronic Journal on Virtual Culture: http://www.monash.edu.au/journals/ejvc/
See also: Artificial Intelligence in Cognitive Science; Communication: Electronic Networks and Publications; Communication: Philosophical Aspects; Computers and Society; Digital Computer: Impact on the Social Sciences; Human–Computer Interaction; Human–Computer Interface; Information and Knowledge: Organizational; Information Society; Information Technology; Information Theory; Mass Communication: Technology; Science and Technology: Internationalization; Telecommunications and Information Policy
Bibliography
Abbate J 1999 Inventing the Internet. MIT Press, Cambridge, MA
Asarco P M 2000 Transforming society by transforming technology: The science and politics of participatory design. Accounting, Management and Information Technologies 10: 257–90
Bannon L 1990 A pilgrim’s progress: From cognitive science to cooperative design. AI and Society 4(2): 59–75
Barley S, Orr J (eds.) 1997 Between Craft and Science: Technical Work in US Settings. ILR Press, Ithaca, NY
Berg M 1997 Rationalizing Medical Work: Decision-support Techniques and Medical Practices. MIT Press, Cambridge, MA
Bishop A P 2000 Technology literacy in low-income communities. (Re/mediating adolescent literacies). Journal of Adolescent and Adult Literacy 43: 473–76
Bjerknes G, Ehn P, Kyng M (eds.) 1987 Computers and Democracy: A Scandinavian Challenge. Avebury, Aldershot, UK
Bødker S 1991 Through the Interface: A Human Activity Approach to User Interface Design. Erlbaum, Hillsdale, NJ
Bowker G 1994 Information mythology and infrastructure. In: Bud-Frierman L (ed.) Information Acumen: The Understanding and Use of Knowledge in Modern Business. Routledge, London, pp. 231–47
Bowker G, Star S L 1999 Sorting Things Out: Classification and its Consequences. MIT Press, Cambridge, MA
Bowker G, Star S L, Turner W, Gasser L (eds.) 1997 Social Science, Information Systems and Cooperative Work: Beyond the Great Divide. Erlbaum, Mahwah, NJ
Braverman H 1998 Labor and monopoly capital: The degradation of work in the twentieth century. 25th Anniversary edn. Monthly Review Press, New York
Byrd T A, Cossick K L, Zmud R W 1992 A synthesis of research on requirements analysis and knowledge acquisition techniques. MIS Quarterly 16: 117–39
Dreyfus H L 1979 What Computers Can’t Do: The Limits of Artificial Intelligence, Revd. edn. Harper & Row, New York
Dreyfus H L 1992 What Computers Still Can’t Do: A Critique of Artificial Reason. MIT Press, Cambridge, MA
Edwards P N 1996 The Closed World: Computers and the Politics of Discourse in Cold War America. MIT Press, Cambridge, MA
Forsythe D E 1992 Blaming the user in medical informatics: The cultural nature of scientific practice. Knowledge and Society: The Anthropology of Science and Technology 9: 95–111
Forsythe D E 1993 The construction of work in artificial intelligence. Science, Technology, and Human Values 18: 460–80
Friedman B, Nissenbaum H 1996 Bias in computer systems. ACM Transactions on Information Systems 14: 330–48
Gilbert N, Heath C (eds.) 1985 Social Action and Artificial Intelligence. Gower, Aldershot, UK
Greenbaum J, Kyng M 1991 Design at Work: Cooperative Design of Computer Systems. Erlbaum, Hillsdale, NJ
Grint K, Woolgar S 1997 The Machine at Work: Technology, Work, and Organization. Polity Press, Cambridge, UK
Grudin J 1990 The computer reaches out: The historical continuity of interface design. In: Chew J, Whiteside J (eds.) Proceedings of the CHI ’90 Conference on Human Factors in Computing Systems, April 1–5; ACM Press, Seattle, WA, pp. 261–68
Grundy A F et al. (eds.) 1997 Spinning a web from past to future. Proceedings of the 6th International IFIP Conference. Bonn, Germany, May 24–27, Springer, Berlin
Heath C, Luff P 2000 Technology in Action. Cambridge University Press, New York
Hanseth O, Monteiro E, Hatling M 1996 Developing information infrastructure: The tension between standardization and flexibility. Science, Technology, and Human Values 21: 407–27
Henderson K 1999 On Line and on Paper: Visual Representations, Visual Culture, and Computer Graphics in Design Engineering. MIT Press, Cambridge, MA
Huhns M (ed.) 1987 Distributed Artificial Intelligence. Morgan Kaufmann, Los Altos, CA
Introna L, Nissenbaum H 2000 The politics of search engines. IEEE Spectrum 37: 26–28
Kling R 2000 Learning about information technologies and social change: The contribution of social informatics. The Information Society 16: 21–232
Kling R, Scacchi W 1982 The web of computing: Computing technology as social organization. Advances in Computers 21: 3–78
Knorr-Cetina K D, Preda A 1998 The epistemization of economic transactions. In: Sales A, Adikhari K (eds.) Knowledge, Economy and Society. Sage, New York
Koschmann T (ed.) 1996 CSCL, Theory and Practice of an Emerging Paradigm. Erlbaum, Mahwah, NJ
Kunda G 1992 Engineering Culture: Control and Commitment in a High-tech Corporation. Temple University Press, Philadelphia, PA
Kyng M, Mathiassen L (eds.) 1997 Computers and Design in Context. MIT Press, Cambridge, MA
Luff P, Hindmarsh J, Heath C (eds.) 2000 Workplace Studies: Recovering Work Practice and Informing System Design. Cambridge University Press, New York
Nardi B A 1993 A Small Matter of Programming: Perspectives on End User Computing. MIT Press, Cambridge, MA
Nardi B A, O’Day V 1999 Information Ecologies: Using Technology with Heart. MIT Press, Cambridge, MA
Neumann L, Star S L 1996 Making infrastructure: The dream of a common language. In: Blomberg J, Kensing F, Dykstra-Erickson E (eds.) Proceedings of PDC ’96 (Participatory Design Conference). Computer Professionals for Social Responsibility, Palo Alto, CA, pp. 231–40
Orlikowski W, Walsham G, Jones M, DeGross J (eds.) 1995 Information Technology and Changes in Organizational Work. Proceedings of IFIP WG8.2 Conference, Cambridge, UK. Chapman and Hall, London
Orr J E 1996 Talking About Machines: An Ethnography of a Modern Job. ILR Press, New York
Star S L 1988 The structure of ill-structured solutions: Heterogeneous problem-solving, boundary objects and distributed artificial intelligence. In: Huhns M, Gasser L (eds.) Distributed Artificial Intelligence 2. Morgan Kaufmann, Menlo Park, CA, pp. 37–54
Star S L, Griesemer J 1989 Institutional ecology, ‘translations,’ and boundary objects: Amateurs and professionals in Berkeley’s Museum of Vertebrate Zoology, 1907–1939. Social Studies of Science 19: 387–420. (Reprinted in Mario Biagioli (ed.) The Science Studies Reader. Routledge, London, pp. 505–24)
Star S L, Ruhleder K 1996 Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research 7: 111–34
Turkle S 1995 Life on the Screen: Identity in the Age of the Internet. Simon & Schuster, New York
Wakeford N, Lyman P 1999 Going into the (virtual) field. American Behavioral Scientist 43:
Wakeford N 1999 Gender and the landscapes of computing at an internet café. In: Crang P, Dey J (eds.) Virtual Geographies. Routledge, London
Wenger E 1987 Artificial Intelligence and Tutoring Systems: Computational and Cognitive Approaches to the Communication of Knowledge. Morgan Kaufmann, Los Altos, CA
Yates J 1989 Control Through Communication: The Rise of System in American Management. Johns Hopkins University Press, Baltimore, MD
Yates J 1996 Exploring the black box: Technology, economics, and history. Technology and Culture 37: 61–620
Yates J 1999 Accounting for growth: Information systems and the creation of the large corporation. Journal of Economic History 59: 540–2
S. L. Star
Science and Technology Studies: Ethnomethodology
Ethnomethodology is a sociological approach to the study of practical actions which has influenced the development of constructionist, discourse analytic, and related approaches in science and technology studies (S&TS). Early ethnomethodological studies of ordinary activities and social scientific research practices developed an orientation to local practices, situated knowledge, and concrete discourse which later became prominent in science and technology studies. In addition to being a precursor to S&TS, ethnomethodology continues to offer a distinctive approach to practical actions in science and mathematics which rivals more familiar versions of social constructionism.
1. Ethnomethodological Research Policies
In the 1960s, Harold Garfinkel coined the term ethnomethodology as a name for a unique sociological approach to practical actions and practical reasoning (Garfinkel 1974, Heritage 1984, p. 45, Lynch 1993, pp. 3–10). Garfinkel was influenced by existential phenomenology (especially Schutz 1962) and sociological theories of action (especially Parsons 1937). As usually defined, ethnomethodology is the investigation of ‘folk methods’ for producing the innumerable practical and communicative actions which constitute a society’s form of life.
1.1 Ethnomethodological Indifference
Unlike methodologists in the philosophy of science who accord special status to scientific methods, ethnomethodologists examine methods for composing and coordinating ordinary as well as scientific activities. Garfinkel deemed all methods to be worthy of detailed study. This research policy is known as ethnomethodological indifference (Garfinkel and Sacks 1970, pp. 345–6, Lynch 1993, pp. 141–2). According to this policy, any method is worthy of study, regardless of its professional status, relative importance, adequacy, credibility, value, and necessity. This does not mean that ethnomethodologists treat all methods as equally ‘good’; instead, it means that they do not believe that preconceptions about the importance and validity of particular methods should determine a choice of research topic. Ethnomethodological indifference is similar in some respects to the more familiar policies of symmetry and impartiality in the Strong Programme in the sociology of scientific knowledge (Bloor 1976, pp. 4–5). However, there is a significant difference between the two, which has to do with a less obvious implication of the term ‘ethnomethodology.’ In addition to being a name for an academic field that studies ‘folk methods,’ the term refers to ‘methodologies’—systematic inquiries and knowledge about methods—which are internal to the practices studied. Many social scientists (including many in S&TS) assume that the persons they study perform their activities unreflexively or even unconsciously, and that the intervention of an outside analyst is necessary for explicating, explaining, and criticizing the tacit epistemologies that underlie such activities. 
In contrast, ethnomethodological indifference extends to the privileges ascribed to social scientific analysis and criticism: Persons doing ethnomethodological studies can ‘care’ no more or less about professional sociological reasoning than they can ‘care’ about the practices of legal reasoning, conversational reasoning, divinational reasoning, psychiatric reasoning, and the rest. (Garfinkel and Sacks 1970, p. 142)
Perhaps more obviously than other domains of practice, scientific research involves extensive methodological inquiry and debate. Science is not alone in this respect. In modern (and also many ancient) societies, large bodies of literature articulate, discuss, and debate the practical, epistemic, and ethical character of a broad array of activities, including legal procedures,
food preparation, dining, sexuality, child rearing, gardening, and fly fishing. Any effort to explain such practices sociologically must first come to terms with the fact that explanations of many kinds (including social explanations) are embedded reflexively in the history, production, and teaching of the activities themselves.
1.2 Ethnomethodological Reflexivity
Garfinkel (1967, p. 1) spoke of the ‘reflexive’ or ‘incarnate’ character of accounting practices and accounts. By this he meant that social activities include endogenous practices for displaying, observing, recording, and certifying their regular, normative, and ‘rational’ properties. In other words, social agents do not just act in orderly ways, they examine, record, and reflexively monitor their practices. This particular sense of ‘reflexivity’ goes beyond the humanistic idea that individuals ‘reflect’ on their own situations when they act, because it emphasizes collective, and often highly organized, accounting practices. The fact that the persons and organized groups that social scientists study already observe, describe, and interpret their own methodic practices can provoke challenges to the authority of social science methods when the latter conflict with native accounts. Whether or not a social scientist agrees with, for example, official accounts of scientific method given by the subjects of an ethnographic or historical investigation, it is necessary to pay attention to the more pervasive way in which methodic understandings, and understandings of method, play a constitutive role in the practices under study. For ethnomethodologists, the research question is not just ‘How do a society’s (or scientific field’s) members formulate accounts of their social world?,’ but ‘How do members perform actions so as to make them account-able; that is, observable and reportable in a public domain of practice?’ Accordingly, accounts are not simply interpretations made after actions take place; instead, actions display their accountability for others who are in a position to witness, interpret, and record them.
1.3 Topic and Resource
The idea that explanations, explanatory concepts, and, more generally, natural language resources are common to professional sociology and the activities sociologists study is a source of long-standing consternation and confusion in the social sciences (Winch 1990 [1958]). As Garfinkel and Sacks (1970, p. 337) noted: The fact that natural language serves persons doing sociology, laymen or professionals, as circumstances, as topics, and as resources of their inquiries furnishes to the technology of their inquiries and to their practical sociological reasoning its circumstances, its topics, and its resources.
An injunction developed by ethnomethodologists (Zimmerman and Pollner 1970), and adopted by proponents of discourse analysis (Gilbert and Mulkay 1984) is that social analysts should not confuse topic and resource. That is, they should not confuse the common sense explanations given by participants in the social activities studied with the analytic resources for studying those same activities. A variant of this injunction in social studies of science warns the analyst not to adopt, for polemical purposes, the very vocabulary of scientific authority that social studies of science make problematic (Woolgar 1981, Ashmore 1989). While this may be good advice in particular cases, when taken as a general policy the injunction not to confuse topic and resource may seem to encourage a retreat from the very possibility of describing or explaining, let alone criticizing, the practices in question. If every possible analytic resource is already found in the social world as a problematic social phenomenon, then what would a sociologist have to say that is not already incorporated into the practices and disputes studied? While acknowledging that this is a problem for any sociological explanation of science, Latour (1986) recommends a ‘semiotic turn’ which would examine the terms of the (scientific) tribe from the vantage point provided by an abstract theoretical vocabulary. But, unless one grants special epistemic status to semiotics, this ‘turn’ simply reiterates the problem.
2. Ethnomethodology and the Problem of Description
Ethnomethodologists do not agree upon a single solution to the problem of reflexivity, and contradictory ways of handling it are evident in the field, but some ethnomethodologists believe that sociological descriptions do not require special theoretical or analytical auspices in order to overcome reflexivity. As Sharrock and Anderson (1991, p. 51) argue, under most circumstances in which descriptions are made and accepted as adequate, there is no need to overcome epistemological scepticism. Ethnomethodologists treat reflexivity as ubiquitous, unavoidable, and thus ‘irremediable.’ Consequently, reflexivity is not a ‘methodological horror’ (Woolgar 1988, p. 32) that makes description or explanation impossible or essentially problematic, because an ethnomethodologist is in no worse (or better) shape than anyone else who aims to write intelligible, cogent, and insightful descriptions of particular states of affairs. Even so, there still remains the question of the scientific, or other, grounds of ethnomethodological descriptions.
2.1 The Possibility of ‘Scientific’ Descriptions of Human Behavior
A possible basis for ethnomethodological descriptions is presented in an early argument written by Sacks (1992, pp. 802–5), the founder of a field that later came to be called conversation analysis. This was a brief, but intriguing, argument about the possibility of developing a science that produces stable, reproducible, naturalistic accounts of human behavior. Sacks observed that natural scientists already produce accounts of human behavior when they write natural language descriptions of methods for reproducing observations and experiments. Social scientists who attempt to describe ‘methods’ in a broad range of professional and non-professional activities are in no worse position than natural scientists who attempt to describe their own particular methods. In other words, Sacks conceived of the replication of experiments as a particular (but not necessarily special) case of the reproduction of social structures. While social scientists do not aim to produce ‘how to’ manuals, they can write accounts of social actions which are ‘true’ in a praxiological sense: adequate to the production and reproduction of the relevant practices.
2.2 Replication and the Reproduction of Social Structure
It is well established in the sociology of science that replication is problematic. Instead of being a methodological foundation for establishing natural facts, particular instances of experimental replication often beg further questions about the detailed conditions and competencies they involve (Collins 1985). To say that replication is problematic does not imply that it is impossible, but it does raise the question of how scientists manage to secure assent to the adequacy of their experiments. The phenomenon of just how scientists conduct experiments, and how they establish particular results in the relevant communities, has become a major topic for socio-historians and ethnographers of science (Gooding et al. 1989). Consequently, in the more general case of the reproduction of social structure, it seems reasonable to conclude that Sacks identifies a topic for ethnomethodological research, but not a grounding for a possible sociological research program (Lynch and Bogen 1994). Ethnomethodological studies of the reproduction of order in science, other professions, and daily life, address the topic of the production of instructed actions. Instructed actions include an open-ended variety of actions performed in accordance with rules, plans, recipes, methods, programs, guidelines, maps, models, sets of instructions, and other formal structures. In addition to examining efforts to reproduce initial observations, ethnomethodologists have studied a series of topics: deriving mathematical proofs, following instructions for photocopying, and reproducing standard laboratory protocols (Garfinkel 1986, 1991, Garfinkel et al. 1981, Livingston 1986, Suchman 1987).
2.3 Ethnomethodology and Social Constructionism
Ethnomethodology has an ambivalent relation to social constructionism. Early laboratory studies (Latour and Woolgar 1979, Knorr 1981) integrated selected themes from ethnomethodology into constructionist arguments, but many ethnomethodologists prefer to speak of the ‘production’ rather than the ‘construction’ of social orders (Lynch 1993, Button and Sharrock 1993). While it may be so that no fact is ever free of ‘construction’ in the broadest sense of the word, local uses of the term in empirical science (though not in mathematics and certain branches of theory) refer to research artifacts which are distinguished from a field of natural objects (Lynch 1985). Ethnomethodologists prefer not to speak indiscriminately of the construction or manufacture of knowledge, in order to preserve an ‘indifferent’ orientation to the way distinctions between constructed and unconstructed realities are employed in practical action and situated argument.
2.4 Normative and Ethical Considerations
Ethnomethodology has been criticized for treating normative aspects of the activities studied as ‘mere phenomena’ (Habermas 1984, p. 106). For example, when studying a contentious court case (Goodwin 1994) or jury deliberation (Maynard and Manzo 1993), an ethnomethodologist does not assess the discursive practices by reference to ideal standards of validity, rationality, or justice. This does not suppose that normative considerations are ‘mere’ phenomena. Instead, it supposes that people’s methods (the phenomena studied by ethnomethodologists) are just that: situated actions that, for better or worse, incorporate normative judgments and ethical claims. Instead of recasting such judgments and claims in terms of one or another transcendent framework, ethnomethodologists attempt to explicate the way normative judgments are produced, addressed, and fought over in specific circumstances. When ethnomethodologists investigate highly charged uses of, and appeals to, normative judgment, they enable readers to examine specific configurations of action and reasoning that do not neatly fall under recipe versions of norms and values. Consequently, rather than invoking or developing a normative theory, ethnomethodologists invite their readers to consider intricate situations of practice and contestation that no single general framework can possibly forecast or resolve. Practical and ethical dilemmas are no less salient for ethnomethodologists than for cultural anthropologists, lawyers, and other participant-investigators. The policy of ethnomethodological indifference does not relieve an investigator of ethical choices and responsibilities. Like other investigators, ethnomethodologists may in some circumstances find it advisable to respect norms of privacy and propriety, while in others they may feel compelled to expose wrongdoing. However, as a body of doctrines, research policies, and exemplary studies, ethnomethodology does not supply a set of rules or ethical guidelines for making such difficult choices. This does not mean that ethnomethodologists proceed without ethics, but that their ethical judgments, like many of their other judgments, have a basis in communal life that is not encapsulated by any academic school, theory, or method.
See also: Ethnology; Ethnomethodology: General; Parsons, Talcott (1902–79); Reflexivity in Anthropology; Reflexivity: Method and Evidence
Bibliography
Ashmore M 1989 The Reflexive Thesis: Wrighting the Sociology of Scientific Knowledge. University of Chicago Press, Chicago
Bloor D 1976 Knowledge and Social Imagery. Routledge and Kegan Paul, London
Button G (ed.) 1991 Ethnomethodology and the Human Sciences. Cambridge University Press, Cambridge, UK
Button G, Sharrock W 1993 A disagreement over agreement and consensus in constructionist sociology. Journal for the Theory of Social Behaviour 23: 1–25
Collins H M 1985 Changing Order: Replication and Induction in Scientific Practice. Sage, London
Garfinkel H 1967 Studies in Ethnomethodology. Prentice Hall, Englewood Cliffs, NJ
Garfinkel H 1974 On the origins of the term ‘ethnomethodology.’ In: Turner R (ed.) Ethnomethodology. Penguin, Harmondsworth, UK
Garfinkel H (ed.) 1986 Ethnomethodological Studies of Work. Routledge and Kegan Paul, London
Garfinkel H 1991 Respecification: Evidence for locally produced, naturally accountable phenomena of order, logic, reason, meaning, method, etc. in and as of the essential haecceity of immortal ordinary society (I)—an announcement of studies. In: Button G (ed.) Ethnomethodology and the Human Sciences. Cambridge University Press, Cambridge, UK
Garfinkel H, Lynch M, Livingston E 1981 The work of a discovering science construed with materials from the optically discovered pulsar. Philosophy of the Social Sciences 11: 131–58
Garfinkel H, Sacks H 1970 On formal structures of practical actions. In: McKinney J C, Tiryakian E A (eds.) Theoretical Sociology: Perspectives and Development. Appleton-Century-Crofts, New York
Gilbert G N, Mulkay M 1984 Opening Pandora’s Box: An Analysis of Scientists’ Discourse. Cambridge University Press, Cambridge, UK
Gooding D, Pinch T, Schaffer S (eds.) 1989 The Uses of Experiment. Cambridge University Press, Cambridge, UK
Goodwin C 1994 Professional vision. American Anthropologist 96: 606–33
Habermas J 1984 The Theory of Communicative Action. Volume I: Reason and the Rationalization of Society. Beacon Press, Boston
Heritage J 1984 Garfinkel and Ethnomethodology. Polity Press, Oxford, UK
Knorr K 1981 The Manufacture of Knowledge. Pergamon, Oxford, UK
Latour B 1986 Will the last person to leave the social studies of science please turn on the tape recorder. Social Studies of Science 16: 541–8
Latour B, Woolgar S 1979 Laboratory Life: The Social Construction of Facts. Sage, London
Livingston E 1986 The Ethnomethodological Foundations of Mathematics. Routledge and Kegan Paul, London
Lynch M 1985 Art and Artifact in Laboratory Science. Routledge and Kegan Paul, London
Lynch M 1993 Scientific Practice and Ordinary Action: Ethnomethodology and Social Studies of Science. Cambridge University Press, New York
Lynch M, Bogen D 1994 Harvey Sacks’s primitive natural science. Theory, Culture and Society 11: 65–104
Maynard D, Manzo J 1993 On the sociology of justice: Theoretical notes from an actual jury deliberation. Sociological Theory 11: 171–93
Parsons T 1937 The Structure of Social Action. Free Press, New York, Vols. I & II
Sacks H 1992 Lectures on Conversation. Blackwell, Oxford, UK, Vol. I
Schutz A 1962 Collected Papers. Martinus Nijhoff, The Hague, Vol. I
Sharrock W, Anderson B 1991 Epistemology: Professional scepticism. In: Button G (ed.) Ethnomethodology and the Human Sciences. Cambridge University Press, Cambridge, UK
Suchman L 1987 Plans and Situated Actions. Cambridge University Press, Cambridge, UK
Winch P 1990[1958] The Idea of a Social Science and its Relation to Philosophy, 2nd edn. Humanities Press, Atlantic Highlands, NJ
Woolgar S 1981 Interests and explanations in the social study of science. Social Studies of Science 11: 365–94
Woolgar S 1998 Science: The Very Idea. Tavistock, London
Zimmerman D H, Pollner M 1970 The everyday world as a phenomenon. In: Douglas J D (ed.) Understanding Everyday Life: Toward the Reconstruction of Sociological Knowledge. Aldine, Chicago
M. Lynch
Science and Technology Studies: Experts and Expertise
1. Introduction
Science is the social activity of constructing, justifying, and critiquing cognitive claims based on widely accepted methodologies and theories within relevant communities. The term science is used here to include claims of true statements about any phenomenon, be it natural, anthropological, social, or even metaphysical. Methods or procedures are clearly distinct among the various domains of science, but the main assertion here is that the overall goal of claiming truthful statements remains the essence of scientific inquiry throughout all disciplines (similar attempts in Campbell 1921, pp. 27–30). Using scientific expertise, however, is not identical with generating scientific statements (Lindblom and Cohen 1979, p. 7ff.). In a policy arena, scientific experts are expected to use their skills and knowledge as a means of producing arguments and insights for identifying, selecting, and evaluating different courses of collective action. Since such advice includes the prediction of likely consequences of political actions in the future, experts are also in demand to give advice on how to cope with uncertain events and how to make a prudent selection among policy options, even if the policy-maker faces uncertain outcomes and heterogeneous preferences (Cadiou 2001, p. 27). Many policy-makers expect scientific experts to help construct strategies that promise to prevent or mitigate the negative and promote the positive impacts of collective actions. In addition, scientific expertise is demanded as an important input to design and facilitate communication among the different stakeholders in debates about technology and risk.
Based on these expectations, scientific expertise can serve five major functions for policy-makers (similar in Renn 1995):
(a) providing factual insights that help policy-makers to identify and frame problems and to understand the situation (enlightenment function);
(b) providing instrumental knowledge that allows policy-makers to assess and evaluate the likely consequences of each policy option (pragmatic or instrumental function);
(c) providing arguments, associations, and contextual knowledge that help policy-makers to reflect on their situation and to improve and sharpen their judgment (reflexive function);
(d) providing procedural knowledge that helps policy-makers to design and implement procedures for conflict resolution and rational decision making (catalytic function); and
(e) providing guidelines or designing policy options that assist decision-makers in their effort to communicate with the various target audiences (communicative function).
These five functions touch on crucial aspects of policy-makers’ needs. First, insights offered by experts help policy-makers to understand the issues and constraints of different policy options when designing and articulating policies. Policy-makers need background information to develop standards, to ground economic or environmental policies on factual knowledge, and to provide information about the success or failure of policies. Second, scientific methods and their applications are needed to construct instrumental knowledge in the form of ‘if–then’ statements and empirically tested theories; this knowledge leads to the articulation of means-ends oriented policies and problem solving activities. Third, scientific reasoning and understanding help policy-makers to reflect on their activities and to acknowledge social, cultural, institutional, and psychological constraints as well as opportunities that are not easily grasped by common sense or instrumental reasoning. However, scientific statements may also restrict policy-makers as they are directed towards adopting a single perspective in analyzing and framing a problem. Fourth, policy-makers may use scientists to design procedures of policy formulation and decision making in accordance with normative rules of reasoning and fairness. These procedures should not interfere with the preferences of those who are involved in the decision-making process, but provide tools for making these preferences the guiding principle of policy selection. To meet this function, scientists need to play a role similar to a chemical catalyst by speeding up (or if necessary slowing down) a process of building consensus among those who are entitled to participate in the policy-making process (Fishkin 1991). Lastly, scientific experts can help to design appropriate communication programs for the purpose of legitimizing public policies as well as preparing target audiences for their specific function or role in the task of risk management.
This article focuses predominantly on the influence of scientific and technical expertise, in particular the results of technology assessments, on public policy-making. The second section deals with the risks and challenges of technical experts providing input to policy design and implementation. The third section addresses the influence of systematic knowledge for policy-making. 
The fourth section provides some theoretical background for the role of expertise in deliberative processes. The fifth section focuses on cultural differences in the use of expertise for policymaking. The last section summarizes the main points of this article.
2. Using Scientific Expertise for Policy-making: Risks and Challenges
The interaction between experts and policy-makers is a major issue in technology management today and is likely to become even more important in the future. This is due in the first instance to the increased interactions between human interventions and natural responses and, secondarily, to the increased complexity of the necessary knowledge for coping with economic, social, and environmental problems. Population growth and migration, global trade, international market structures, transboundary pollution, and many other conditions of modern life have increased the sensitivity to external disturbances and diminished the capability of social and natural systems to tolerate even small interventions. Although contested by some (Simon 1992), most analysts agree that ecological systems have become more vulnerable as the impact of human intervention has reached and exceeded thresholds of self-repair (Vitousek et al. 1986).
Given this critical situation, what are the potential contributions of expertise to the policy process? In principle, experts can provide knowledge that can help to meet the five functions mentioned above and to anticipate potential risks before they materialize. But they can do this only to the degree that the state of the art in the respective field of knowledge can provide reliable information pertaining to the policy options. Many policy-makers share assumptions about expertise that turn out to be wishful thinking or illusions (Funtowicz and Ravetz 1990, Jasanoff 1990, 1991, Rip 1992, Beck 1992). Most prominent among these are:
(a) illusion of certainty: making policy-makers more confident about knowing the future than is justified;
(b) illusion of transferability: making policy-makers overconfident that certainty in one aspect of the problem applies to all other aspects as well;
(c) illusion of ‘absolute’ truth: making policy-makers overconfident with respect to the truthfulness of evidence;
(d) illusion of ubiquitous applicability: making policy-makers overconfident in generalizing results from one context to another.
These illusions are often reinforced by the experts themselves. Many experts feel honored to be asked by powerful agents of society for advice. 
Acting under the expectation of providing unbiased, comprehensive, and unambiguous advice, they often fall prey to the temptation to oversell their expertise and provide recommendations far beyond their realm of knowledge. This overconfidence in one’s own expertise gains further momentum if policy-maker and advisor share similar values or political orientations. As a result policy-makers and consultants are prone to cultivate these illusions and act upon them.
In addition to these four types of illusions, experts and policy-makers tend to overemphasize the role of systematic knowledge in making decisions. As much as political instinct and common sense are poor guides for decision making without scientific expertise, the belief that scientific knowledge is sufficient to select the correct option is just as short-sighted. Most policy questions involve systematic as well as anecdotal and idiosyncratic knowledge (Wynne 1989). Systematic knowledge often provides little insight into designing policies for concrete issues. For example, planning highways, supporting special industries, promoting health care for a community, and many other issues demand local knowledge on the social context and the specific history of the issue within this context (Wynne 1992, Jasanoff 1991). Knowledge based on local perspectives can be provided only by those actors who share common experiences of the issue in question. The role of systematic versus particularistic knowledge is discussed in more detail in the next section.
3. The Relevance of Systematic Expertise for Policy-making
There is little debate in the literature that the inclusion of expertise is essential as a major resource for designing and legitimizing technological policies (Jasanoff 1990). A major debate has evolved, however, on the status of scientific and technical expertise for representing all or most of the knowledge that is relevant to these policies. This debate includes two related controversies: the first deals with the problem of objectivity and realism; the second one with the role of subjective and experiential knowledge that nonexperts have accumulated over time. This is not the place to review these two controversies in detail (see Bradbury 1989, Shrader-Frechette 1991). Depending on which side one stands on in this debate, scientific evidence is either regarded as one input to fact-finding among others or as the central or even only legitimate input for providing and resolving knowledge claims. There is agreement, however, among all camps in this debate that systematic knowledge is instrumental for understanding phenomena and resolving problems. Most analysts also agree that systematic knowledge should be generated and evaluated according to the established rules or conventions of the respective discipline (Jaeger 1998, p. 145). Methodological rigor aiming to accomplish a high degree of validity, reliability and relevance remains the most important yardstick for judging the quality of scientific insights. Constructivist scholars in science and technology studies do not question the importance of methodological rules in securing credible knowledge but are skeptical whether the results of scientific inquiries represent objective or unambiguous descriptions of reality (Latour and Woolgar 1979, Knorr-Cetina 1981). 
Rather, they see scientific results as products of specific processes or routines that an elite group of knowledge producers has framed as ‘objective’ and ‘real.’ The ‘reality’ of these products is determined by the availability of research routines and instruments, prior knowledge and judgments, and social interests (see also Beck 1992, although he regards himself as a moderate realist). For the analysis of scientific input to policy-making, the divide between the constructivists and the realists matters only in the degree to which scientific input is used as a genuine knowledge base or as a final arbiter for reconciling knowledge conflicts. A knowledge discourse deals with different, sometimes competing claims that obtain validity only through compatibility checks with acknowledged procedures of data collection and interpretation, proof of theoretical compatibility and conclusiveness, and the provision of intersubjective opportunities for reproduction (Shrader-Frechette 1991, pp. 46ff.). Obviously many research results do not reach the maturity of proven facts, but even intermediary products of knowledge, ranging from plain hypotheses, via plausible deductions to empirically proven relationships, strive for further perfection (cf. the pedigree scheme of Funtowicz and Ravetz 1990). On the other hand, even the most ardent proponent of a realist perspective will admit that only intermediary types of knowledge are often available when it comes to assessing and evaluating risks.
What does this mean for the status and function of scientific expertise in policy contexts? First, scientific input has become a major element of technological decision-making in all technologically developed countries. The degree to which the results of scientific inquiry are taken as ultimate evidence to judge the appropriateness and validity of competing knowledge claims is contested in the literature and also contested among policy-makers and different social groups. Frequently, the status of scientific evidence becomes one of the discussion points during social or political deliberation depending on the context and the maturity of scientific knowledge in the technological policy arena under question. For example, if the issue is the effect of a specific toxic substance on human health, subjective experience may serve as a heuristic tool for further inquiry and may call attention to deficits in existing knowledge, although toxicological and epidemiological investigations are unlikely to be replaced with intuitions from the general public. 
If the issue, by contrast, is the siting of an incinerator, local knowledge about sensitive ecosystems or traffic flows may be more relevant than systematic knowledge about these impacts in general (a good example for the relevance of such particularistic knowledge can be found in Wynne 1989). Second, the resolution of competing claims of scientific knowledge is usually governed by the established rules within the relevant disciplines. These rules may not be perfect and may even be contested within the community. Yet they are regarded as superior to any other alternative (Shrader-Frechette 1991, pp. 190ff.). Third, many technological decision options require systematic knowledge that is either not available or still in its infancy or in an intermediary status. Analytic procedures are then demanded by policy-makers as a means to assess the relative validity of each of the intermediary knowledge claims, to display their underlying assumptions and problems, and to demarcate the limits of ‘reasonable’ claims, that is, to identify the range of those claims that are still compatible with the state of the art in this knowledge domain (Shrader-Frechette 1991). Fourth, knowledge claims can be systematic and scientific as well as idiosyncratic and anecdotal. Both forms of knowledge have a legitimate place in technological decision-making. How they are used depends on the context and the type of knowledge required for the issue in question (Wynne 1992). All four points show the importance of knowledge for technology policy and decision making, but also make clear that choosing the right management options requires more than looking at the scientific evidence alone.
4. Scientific Evidence in Deliberative Processes
Given the delicate balance between anecdotal, experiential, and systematic knowledge claims, policy-making depends on deliberative processes in which competing claims are generated or ignored, sorted, selected, and highlighted. The first objective is to define the relevance of different knowledge claims for making legitimate and defendable choices. The second objective is to cope with issues of uncertainty and to assign trade-offs between those who will benefit and those who will suffer from different policy options. The third objective is to take into account the wider concerns of the affected groups and the public at large. These three elements of deliberation are related to coping with the problems of complexity, uncertainty, and ambiguity. How can deliberative processes deal with the problems of complexity, uncertainty, and ambiguity? To respond to this question, it is necessary to introduce the different theoretical concepts underlying deliberative processes. The potential of deliberation has been discussed primarily in three schools of thought:
(a) The utility-based theory of rational action (basics in Fisher and Ury 1981, Raiffa 1994; review of pros and cons in Friedman 1995): in this concept, deliberation is framed as a process of finding one or more option(s) that optimize the payoffs to each participating stakeholder. The objective is to convert positions into statements of underlying interests. If all participants articulate their interests, it is possible either to find a new win-win option that is in the interest of all or at least does not violate anybody’s interest (Pareto optimal solution), or to find a compensation that the winners pay to the losers such that both sides are at least indifferent between the situation without the preferred policy option and without compensation and the implementation of this option plus compensation (Kaldor–Hicks solution). In this context systematic knowledge is required to inform all participants of the likely consequences of each decision option. The evaluation of desirability of each option is a matter of individual preferences and in this perspective outside of the deliberation process.
(b) Theory of communicative action (Habermas 1987, Webler 1995): this concept focuses on the communicative process of generating preferences, values, and normative standards. Normative standards are those prescriptions that do not apply only to the participants of the discourse but also to society as a whole or at least a large segment of the external population. Normative standards in technological arenas include, for example, exposure limits or performance standards for technologies. They apply to all potential emitters or users regardless of whether they were represented at the discourse table or not. The objective here is to find consensus among moral agents (not just utility maximizers) about shared meaning of actions based on the knowledge about consequences and an agreement on basic human values and moral standards. Systematic knowledge in this context provides the participants with insights into the potential effects of collective decision options and helps them to reorganize their preferences according to mutually desirable outcomes.
(c) Theory of social systems (Luhmann 1986, Eder 1992): the (neo)functional school of sociology pursues a different approach to deliberation. It is based on the assumption that each stakeholder group has a separate reservoir of knowledge claims, values, and interpretative frames. Each group-specific reservoir is incompatible with the reservoir of the other groups. This implies that deliberative actions do not resolve anything. They represent autistic self-expressions of stakeholders. In its cynical and deconstructivist version, deliberation serves as an empty but important ritual to give all actors the illusion of taking part in the decision process. In its constructive version deliberation leads to the enlightenment of decision-makers and participants. 
Far from resolving or even reconciling conflicts, deliberation in this viewpoint has the potential to decrease the pressure of conflict, provide a platform for making and challenging claims, and make policy-makers cognizant of different interpretative frames (Luhmann 1993). Deliberations help to reframe the decision context, to make policy-makers aware of public demands, and to enhance the legitimacy of collective decisions through reliance on formal procedures (Skillington 1997). In this understanding of deliberation, reaching a consensual conclusion is neither necessary nor desirable. The process of talking to each other, exchanging arguments, and widening one’s horizon is all that deliberation is able to accomplish. It is an experience of mutual learning without a substantive message. Systematic knowledge in this context is never free of context and prescriptive assumptions. Hence, each group will make knowledge claims according to its interests and strategic goals. Integration of knowledge is based on rhetoric, persuasion skills, and power rather than established rules of ‘discovering the truth.’
These three understandings of deliberation are not mutually exclusive although many proponents of each school argue otherwise. Based on the previous arguments, it is quite obvious that the rational actor approach provides a theoretical framework to understand how actors in deliberative processes deal with complexity and, in part, with uncertainty. The communicative action approach provides a theoretical structure for understanding and organizing discourses on ambiguities and moral positions. In particular, this concept can highlight those elements of deliberation that help participants to deal competently with moral and normative issues beyond personal interests. The system–analytic school introduces some skepticism towards the claims of the other schools with respect to the outcomes of deliberation. Instead it emphasizes the importance of procedures, routines, and learning experiences for creating links or networks between the major systems of society. Deliberation is the lubricant that helps each of the collective social actors to move mostly independently in society without bumping into the domains of the other actors.
Deliberative processes aimed at integrating experts, stakeholders, policy-makers, and the public at large can be organized in many different forms. Practical experience has been gained with advisory committees, citizen panels, public forums, consensus conferences, formal hearings, and others (see Rowe and Frewer 2000). A hybrid model of citizen participation (Renn et al. 1993) has been applied to studies on energy policies and waste disposal issues in West Germany, for waste-disposal facilities in Switzerland, and to sludge-disposal strategies in the United States (Renn 1999).
5. Cultural Styles in Using Scientific Expertise
The way that knowledge and expertise are included in policy processes depends on many factors. Comparative research on the influence of systematic knowledge in policy processes emphasizes the importance of cultural context and historic developments (Solingen 1993). In addition, state structures and institutional arrangements significantly influence the type of inclusion of expertise into the decision-making processes. There has been a major shift in modern states to an organized and institutionalized exchange of science organizations with policy-making bodies (Mukerji 1989, Jasanoff 1990). Although science has become a universal enterprise, the specific meaning of what science can offer to policy-makers differs among cultures and nations (Solingen 1993). The situation is even more diverse when one investigates the use of science in different countries for advising policy-makers. Scientific and political organizations partially determine which aspects of life are framed as questions of knowledge and which as questions of ‘subjective’ values. In addition, national culture, political traditions, and social norms influence the mechanisms and institutions for integrating expertise in the policy arenas (Wynne 1992).
In one line of work, policy scholars have developed a classification of governmental styles that highlights four different approaches to integrating expert knowledge into public decisions (Brickman et al. 1985, Jasanoff 1986, O’Riordan and Wynne 1987, Renn 1995). These styles have been labeled inconsistently in the literature, but they refer to common procedures in different nations. The ‘adversarial’ approach is characterized by an open forum in which different actors compete for social and political influence in the respective policy arena. The actors in such an arena need and use scientific evidence to support their position. Policy-makers pay specific attention to formal proofs of evidence because policy decisions can be challenged on the basis of insufficient use or neglect of scientific knowledge. Scientific advisory boards play an important role as they help policy-makers to evaluate competing claims of evidence and to justify the final policy selection (Jasanoff 1990).
A sharp contrast to the adversarial approach is provided by the fiduciary style (Renn 1995). The decision-making process is confined to a group of patrons who are obliged to make the ‘common good’ the guiding principle of their action. Public scrutiny or involvement is alien to this approach. The public can provide input to and arguments for the patrons but is not allowed to be part of the negotiation or policy formulation process. Scientists outside the policy-making circles are used as consultants at the discretion of the patrons and are selected according to prestige or personal affiliations. Their role is to provide enlightenment and information. Patrons’ staff generate instrumental knowledge. This system relies on producing faith in the competence and the fairness of the patrons involved in the decision-making process.
Two additional styles are similar in their structure but not identical. The consensual approach is based on a closed circle of influential actors who negotiate behind closed doors. 
Representatives of important social organizations or groups and scientists work together to reach a predefined goal. Controversy is not visible and conflicts are often reconciled before formal negotiations take place. The goal of the negotiation is to combine the best available evidence with the various social interests that the different actors represent. The corporatist style is similar to the consensual approach, but is far more formalized. Well-known experts are invited to join a group of carefully selected policy-makers representing major forces in society (such as employers, unions, churches, professional associations, and environmentalists). Invited experts are asked to offer their professional judgment, but they often do not need to present formal evidence for their claims. This approach is based on trust in the expertise of scientists.
These four styles are helpful in characterizing and analyzing different national approaches to policy-making. The American system is oriented toward the adversarial style and the Japanese system toward the consensual. The policy style of northern Europe comes closest to the corporatist approach, whereas most southern European countries display a fiduciary approach. All these systems, however, are in transition. Interestingly, the United States has tried to incorporate more consensual policies into its adversarial system, while Japan is faced with increasing demands for more public involvement in the policy process. These movements towards hybrid systems have contributed to the genesis of a new regulatory style, which may be called ‘mediative.’ There has been a trend in all technologically developed societies to experiment with opening expert deliberations to more varied forms of stakeholder or public participation. In the United States, it has taken the form of negotiated or mediated rule making; in Europe, it has evolved as an opening of corporatist clubs to new groups such as the environmental movement. It is too early to say whether this new style will lead to more convergence among the countries or to a new set of cultural differentiations (Renn 1995).
6. Conclusions
The economic and political structures of modern societies underwent rapid transitions in the late twentieth century. This transition was accompanied by globalization of information, trade, and cultural lifestyles, an increased pluralism of positions, values, and claims, the erosion of trust and confidence in governing bodies, an increased public pressure for participation, and growing polarization between fundamentalist groups and agents of progressive change. The resulting conflicts put pressure on political systems to integrate different outlooks and visions of the future and to provide justifications of governmental decisions on the basis of both facts and values. In this situation, policy-making institutions discovered an urgent need for policy advice, as well as new modes of integrating expertise with values and preferences. Research on advisory processes indicates that the following points will need to be addressed.
(a) Using scientific expertise in the policy arena is one element in the quest of modern societies to replace or amend the collective learning process of trial and error by more humane methods of anticipation, in which the possibility of errors is reduced. This process is socially desired, though it cannot reduce the uncertainties of change to zero. Anticipation both necessitates and places new demands on expertise in the service of policy-making.
(b) Scientific expertise can serve five functions: enlightenment; pragmatic or instrumental; reflexive; catalytic; and communicative. All five are in demand by policy-makers, but expertise may also distort their perspective on the issue or prescribe a specific framing of the problem. That is why many policy analysts demand that scientific input be controlled by democratic institutions and be open to public scrutiny.
(c) Scientific expertise is also used for legitimizing decisions and justifying policies that may face resistance or opposition. Expertise can, therefore, conflict with public preferences or interests. In addition, policy-makers and experts pursue different goals and priorities. Expertise should be regarded as one crucial element of policy-making among others. Scientific advice is often mandated by law, but its potential contributions may vary from one policy arena to another. In particular, scientific expertise cannot replace public input in the form of locally relevant knowledge, historical insights, and social values.
(d) The influence of expertise depends on the cultural meaning of expertise in different social and political arenas. If systematic expertise is regarded as the outcome of a socially constructed knowledge system, its authority can be trumped by processes that are seen to be more democratic. If, however, systematic expertise is seen as an approximation of reality or truth, it gains a privileged status among different sources of knowledge and inputs, even if it is the product of a democratically imperfect process. The degree to which these different understandings of expertise are accepted or acknowledged within a political arena will affect the practical influence and power of experts in collective decision-making.
(e) Scientific expertise is absorbed and utilized by the various policy systems in different styles. One can distinguish four styles: adversarial, fiduciary, consensual, and corporatist. A new mediative style seems to be evolving from the transitions toward more open procedures in decision-making. This style seems to be specifically adjusted to postmodern societies. Scientific expertise in this style pursues a ‘system and problem oriented’ approach to policy-making, in which science, politics, and economics are linked by strategic networks.
(f) Organizing and structuring discourses on the selection of policy options is essential for the democratic, fair, and competent management of public affairs. The mere desire to initiate a two-way communication process and the willingness to listen to public concerns are not sufficient. Deliberative processes are based on a structure that assures the integration of technical expertise, regulatory requirements, and public values. Co-operative discourse is one model among others that has been designed to meet that challenge. No one questions the need to initiate a common discourse or dialogue among experts, policy-makers, stakeholders, and representatives of affected publics. This is particularly necessary if highly controversial subjects are at stake. The main challenge of deliberative processes will continue to be how to integrate scientific expertise, rational decision-making, and public values in a coherent form. See also: Expert Systems in Cognitive Science; Expert Testimony; Expert Witness and the Legal System: Psychological Aspects; Expertise, Acquisition of;
Medical Expertise, Cognitive Psychology of; Policy History: Origins; Policy Knowledge: Universities; Policy Networks
O. Renn
Science and the Media The phrase ‘science and the media’ as a single concept encompasses two major social institutions. Science is both a system of reliable knowledge about the natural world and a complex social system for developing and maintaining that knowledge. Similarly, the media comprise a variegated system for collecting and presenting information as well as being an institution with significant economic, political, and social impact. Though long perceived as separate realms, recent scholarship has highlighted their mutual dependence. Increasing episodes of tension and conflict in the late twentieth century led scholars, scientists, media producers, and social critics to explore the interactions of science and the media. Working scientists and media producers sought guidance on how to use each other’s resources more effectively, while analysts asked how such issues as political and economic interests, rhetorical conventions, and audience responses shed light on the roles of science and media in shaping each other. One effect of recent scholarship has been to elide the difference between ‘media’ (all forms of media) and ‘the media’ (particularly the major institutions of newspapers, magazines, television, and radio), an elision that will be evident throughout this article.
1. Science and the Media: A Brief History The history of science and the media is the history of a growing mesh: increasing use of media by science, increasing attention to scientific ideas by media institutions, and increasing tensions caused by the rising interaction. Through the middle of the twentieth century, science and media were largely seen as separate realms. Science had emerged in the nineteenth century from natural philosophy as a systematic approach to understanding the natural world. With the advent of research-based industries such as chemical dyes and electronic communication, science began to attract the attention of capitalists and government leaders who understood the value of controlling scientific knowledge. By the end of World War II, the emergence of ‘big science’ had made science a powerful institutional force as well as a system of accredited knowledge with increasingly far-reaching applications. The media had also evolved in the nineteenth century from small, local organizations into nationally and internationally-circulated publications that served as tools of merchant capitalism and political control, full of advertising, business and political news, and the ideology of the industrial revolution. By the early twentieth century, new electronic media such as movies and radio provided the means for newly-developed forms of ‘publicity’ and propaganda to help produce modern mass society, as well as to increase public access to new forms of cultural entertainment. Since the late seventeenth century, natural philosophers had relied on print media, particularly professional journals such as the Philosophical Transactions and books such as Isaac Newton’s Principia, as tools for disseminating the results of their investigations. 
These publications were not only records of experiments or philosophical investigations, but also served as active carriers of the rhetorical structures that enabled natural philosophers to convince people outside their immediate vicinity of the truth of their claims (Shapin and Schaffer 1985). Early in the nineteenth century, new forms of science media emerged, particularly a genre known variously as ‘popularization,’ vulgarisation [French], or divulgación [Spanish]. These books and magazines provided entrée into the knowledge of the natural world for non-scientist readers. They ranged from textbooks for young women through to new mass-circulation magazines filled with instruction on the latest achievements
of a rapidly industrializing society. The so-called ‘great men’ of late nineteenth century England (most notably Thomas Huxley and John Tyndall) became evangelists for science, lecturing widely to enlist public support for the enterprise of rational understanding of the natural world; they converted their lectures into articles and books to spread their reach even further. At the same time, scientists’ use of media for disseminating knowledge among the scientific community grew dramatically, with the growth of new journals and the founding of publishing houses committed to scientific information. The growth of both science and media in the twentieth century forced specialization into the fields, leading to a distinction between what some scientists have called ‘communication in science’ and ‘communication about science.’ For scientists in their daily work, broad synthetic journals such as Science and Nature (published respectively in the USA and the UK) were joined by topical journals in chemistry, physics, biology, and their many sub-disciplines. For non-scientists, magazines and other media (such as radio and new ‘wire services’ for newspapers) increasingly focused on particular audiences, such as teachers, inventors, ‘high culture’ readers, general media users, and so on. Although scientific ideas appeared in some entertainment media, science in most non-technical media became a component of news coverage for different audiences. In the 1920s and 1930s, a small group of professional journalists began to write almost exclusively about science. (These developments were clearest in the highly developed countries of Europe and North America. The patterns in the rest of the world have been little studied and so cannot be described.)
By the second half of the twentieth century, the interaction of science and media had become a complex web, in which new scientific research appeared in abstracts, journals, press reports, and online, while other forms of science–not necessarily derivative of the research reports–appeared regularly in media outlets such as newspapers, televisions, websites, radio programs, puppet shows, traveling circuses, and museums. Moreover, as science became central to modern culture, the presence of science in media such as poetry, sculpture, fine art, entertainment films, and so on, became more evident. In the journalistic media, many reports circulated internationally, through cooperative agreements between public broadcasting systems and worldwide news services. The rise of the World Wide Web in the 1990s provided further opportunities for information about science to circulate more fully between the developed and developing world. However, from about the mid-century onward, no matter what media were involved, many working scientists perceived presentations in non-technical media to be incorrect, oversimplified, or sensationalized. Thus, tensions developed between the institutions of science and those of the media. Those tensions led eventually to the development of an analytic field of science and media, populated largely by media sociologists, communication content scholars, and sociologists of science.
2. Understanding Science and Media Analysis of science and media has led to a deeper understanding of the essential role of media in creating scientific knowledge, the ways that media presentations shape understandings of nature, and the role of public debate in constituting scientific issues. The earliest attempts to understand the relationship of science and media came from two directions: the overwhelming growth of scientific information, leading to attempts by scientists and scientific institutions to learn how to manage scientific publications, and emerging tensions between scientists and journalists, as specialized reporting developed in the 1930s. Research and publications before and after World War II led by the 1970s to two common understandings. First, sociologists of science had come to understand the fundamental role of communication in the production of reliable knowledge about the natural world. Scientists do not work in isolation, but must constantly present their ideas to colleagues for acknowledgement, testing, modification, and approbation. The process of communication, occurring in various media, is the ‘essence of science’ (Garvey 1979), leading to science becoming ‘public knowledge’ (Ziman 1968). At the same time, the apparent differences between the goals of science (methodical, tentative, but nonetheless relatively certain statements about the natural world) and the needs of public media (rapid, attention-getting narratives about issues directly related to readers and viewers, regardless of certainty) had also been explicated (Krieghbaum 1967), leading to practical attempts to ‘bridge the gap’ between science and the media but also to recurring fears that the gap was unbridgeable. Beginning in the 1970s, new developments in the sociology of scientific knowledge opened up new ways of conceiving of the relationship between science and media (Barnes 1974, Latour and Woolgar 1979). 
In particular, attention to the rhetorical goals of scientists in their use of media suggested that a distinction between communication in science and communication about science could not be maintained. Instead, some researchers began to discuss the ‘expository’ nature of science, showing how scientists tailored their communication to meet the needs of specific media and specific communication contexts (Bazerman 1988, Shinn and Whitley 1985). These researchers highlighted the interaction of the idealized intellectual goals of science (for the production of reliable knowledge about the natural world) with the social and institutional goals of scientists and their employers or
patrons (for priority, status in the community, ownership of ideas and thus patents, and so on). In this conception of science and media, scientific knowledge itself might differ in different presentations: in some cases, appearing as a ‘narrative of science’ (focusing on the methodological and theoretical structures leading to particular knowledge of the natural world), in other cases as a ‘narrative of nature’ (focusing on the relationships between organisms or entities, with emphasis on the ‘story’ linking different aspects of those organisms and entities) (Myers 1990). The very boundary between ‘science’ and ‘non-science’ (or ‘mere’ popularization), between communication in science and communication about science, was shown to be highly mutable, itself an object of rhetorical construction used for political purposes by participants in scientific controversies to establish control over elements of a debate (Hilgartner 1990). A second new line of inquiry highlighted the institutional interdependence of science and the media. Pointing to the growing post-World War II need for scientific institutions to generate public and political support, and to the media’s need to draw on scientific developments for constant infusions of drama and ‘newness,’ the new research identified ‘selling science’ as a major element of the political economy linking science and the media (Nelkin 1987). Finally, work on the images of science that appeared over time helped show the ways that social concerns interacted with scientific developments to create cultural representations of science (Weart 1988, LaFollette 1990, Nelkin and Lindee 1995). At the same time that these developments were occurring in the understanding of science communication, new approaches were developing in understanding the relationship between science and the public.
Spurred by concerns in the scientific community about lack of ‘scientific literacy,’ the studies ultimately questioned the idea that the public has a ‘deficit’ of knowledge that needs to be ‘improved’ (Irwin and Wynne 1996). Instead, the new approach focused on the possibility of engaging the public in discussions about scientific issues, recognizing the contingent nature of understanding of scientific issues and thus the alternative meanings that might emerge in democratic discussions instead of authoritarian pronouncements (Sclove 1995). Much of the new work looked at issues of uncertainty and risk, highlighting the social negotiations required to make personal and policy decisions in a context of incomplete information (Friedman et al. 1999). The new work clearly implied that science and the media cannot be perceived as separate institutions with separate goals. Instead, better understanding comes from analyzing science and media across audiences, looking at the political and economic interests of media producers, at the rhetorical meanings of media conventions, and at the audience responses to media content (especially in terms of meaning-making).
3. Approaches to Studying Science and Media 3.1 Political and Economic Interests of Media Producers The most detailed and well-developed understandings of science and the media point to the political and economic interests that motivate the producers of media representations of science. In many countries, the media depend on government subsidies or at least on political tolerance. Moreover, media are expensive to produce. While the presence of the media in democratic societies is often seen as fundamental to political liberty, the economic need to appeal to broad audiences (even for government-controlled media) often leads to editorial decisions that emphasize broad coverage and sensationalism over depth and sobriety. Science media are no different, and even the most prestigious scientific journals such as Science and Nature have been criticized for highlighting the most ‘newsworthy’ research, for hyping scientific reports beyond their value, and for engaging in ‘selling science.’ Newspaper and television coverage of science is routinely criticized for its sensationalism, particularly in the coverage of controversial issues such as food contamination (e.g., the BSE or ‘mad cow’ epidemic in Britain in the 1980s and 1990s), genetic modification of food, and global climate change. Journalists (and their employers) routinely defend their practices on the basis of ‘what readers/viewers want,’ which is often judged by sales success. Entertainment media are criticized even more heavily for creating movie plots that involve impossible science or focus on highly unlikely scenarios; producers disclaim any responsibility for accuracy, claiming only to be providing (profitable) escapism. These recurring disputes between the scientific community and media producers highlight the irreconcilable tension between the goals of science and the goals of the media.
3.2 Rhetorical Meanings of Media Conventions An emerging area of understanding is the way that media conventions shape the rhetorical meaning of science communication. Scientific papers reporting original experimental research, for example, have developed a standardized format (called ‘IMRAD,’ for introduction, methods, results, and discussion). The IMRAD format, conventionally written in an impersonal tone, with passive grammatical constructions, hides the presence of the researcher in the research, highlighting the objective character of scientific knowledge that is claimed to exist independently of its observer. In journalistic reports of scientific controversies, the conventions of objective reporting require ‘balance,’ that is, the presentation of both sides of a controversy (Bazerman 1988). Though 99.9 percent of all researchers may hold that a new claim
(such as the eventually discredited 1989 announcement of a new form of ‘cold nuclear fusion’) is scientifically untenable, the journalistic norm of balance may lead to approximately equal attention to all claims. Similarly, journalistic goals of storytelling often lead to emphasis on the human, narrative dimensions of an issue over the theoretical or mathematical components of the science. New research is exploring the meaning of visual representations of science, at every level from the technical outputs of genome analyzers to mass media manipulations of false-color images produced by orbiting astronomical telescopes (Lynch and Woolgar 1990).
3.3 Audience Responses to Media Content One of the most difficult areas to investigate is the audience response to particular media presentations. Virtually no work, for example, has been done on how scientists respond to specialized science media, despite anecdotal stories suggesting that changes in information presentation (such as World Wide Web access to scientific databases) may have dramatic impacts on how scientists conceive of the natural world and frame their hypotheses and theories. Some work has been done at broader public levels, such as responses to media presentations of risk information. But, although the psychology of risk perceptions, for example, has been extensively investigated, the factors affecting individual responses to media coverage of specific risky incidents are so varied as to defy measurement. Though the complexity leads some researchers to reject the possibility of quantitative analysis, others insist that social scientists can learn to isolate appropriate variables. Certainly some areas of audience response could be investigated more carefully; audience reaction to images of science in entertainment films, for example, has never been systematically tested, despite frequent statements that ‘the image of scientists in the movies is bad.’ The shift away from separate analysis of ‘science’ and ‘media’ to an integrated understanding of ‘science and media’ has led in recent years to the possibility of cross-cutting analyses, such as those of producers, rhetorical structures, and audience reception. These analyses will ultimately yield a much richer understanding of the interaction of science and media. Less clear is how such improved understanding will contribute to the practical concerns of those in the scientific community and elsewhere, who worry about possible linkages among science literacy, the image of science, and public support for science. 
See also: Educational Media; Genetics and the Media; Media and Child Development; Media and History: Cultural Concerns; Media Ethics; Media Events; Media, Uses of; Public Broadcasting; Public Relations in Media; Research Publication: Ethical Aspects
Bibliography
Barnes B 1974 Scientific Knowledge and Sociological Theory. Routledge and Kegan Paul, London
Bazerman C 1988 Shaping Written Knowledge: The Genre and Activity of the Experimental Article in Science. University of Wisconsin Press, Madison, WI
Friedman S M, Dunwoody S, Rogers C L (eds.) 1999 Communicating Uncertainty: Media Coverage of New and Controversial Science. Erlbaum Associates, Mahwah, NJ
Garvey W D 1979 Communication: The Essence of Science–Facilitating Information Exchange Among Librarians, Scientists, Engineers and Students. Pergamon Press, New York
Hilgartner S 1990 The dominant view of popularization: conceptual problems, political uses. Social Studies of Science 20(3): 519–39
Irwin A, Wynne B (eds.) 1996 Misunderstanding Science? The Public Reconstruction of Science and Technology. Cambridge University Press, Cambridge, UK
Krieghbaum H 1967 Science and the Mass Media. New York University Press, New York
LaFollette M 1990 Making Science Our Own: Public Images of Science, 1910–1955. University of Chicago Press, Chicago
Latour B, Woolgar S 1979 Laboratory Life. Sage, Beverly Hills, CA
Lynch M, Woolgar S 1990 Representation in Scientific Practice. MIT Press, Cambridge, MA
Myers G 1990 Writing Biology: Texts in the Social Construction of Scientific Knowledge. University of Wisconsin Press, Madison, WI
Nelkin D 1987 Selling Science: How the Press Covers Science and Technology. Freeman, New York
Nelkin D, Lindee M S 1995 The DNA Mystique: The Gene as a Cultural Icon. Freeman, New York
Sclove R 1995 Democracy and Technology. Guilford, New York
Shapin S, Schaffer S 1985 Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life. Princeton University Press, Princeton, NJ
Shinn T, Whitley R (eds.) 1985 Expository Science: Forms and Functions of Popularisation. D. Reidel, Dordrecht, The Netherlands, Vol. 9
Weart S 1988 Nuclear Fear: A History of Images. Harvard University Press, Cambridge, MA
Ziman J M 1968 Public Knowledge: An Essay Concerning the Social Dimension of Science. Cambridge University Press, Cambridge, UK
B. V. Lewenstein
Science and the State 1. Introduction Since the sixteenth century, the ‘scientific revolution’ has generated a host of new intellectual, rhetorical, and institutional strategies for undermining traditional political authorities and constructing alternative ones. Science became a vital resource in the modern attempt to discredit traditional political hierarchies and spiritual transcendental sources of authority in
the context of public affairs. By providing new rationales for order compatible with the novel modern commitments to the values of individualism, voluntarism, and egalitarianism, science and technology paradoxically also provided new grounds for novel modern forms of hierarchy and authority committed to the use of knowledge in the reconstruction of society. The widely shared notion that science goes along with progress appeared to suggest that knowledge, when applied to public affairs, can in fact depoliticize public discourse and action. This faith produced wide mandates for large-scale social and political engineering and monumental state-sponsored technological projects in both authoritarian and democratic states. While such ideals and projects were from the very beginning criticized by observers such as Edmund Burke (1729–97), throughout the twentieth century the record of the relations between science and politics turned out to be sufficiently mixed to discard earlier hopes. The controversial role of scientists in the production and military deployment of means of mass destruction, the role of scientists in the Nazi experiments on human beings and the endorsement of racism, and their involvement in the industrial pollution of the environment, diminished the public trust in science fostered by such developments as medical breakthroughs and spectacular space flights. Since the closing decades of the twentieth century, an increasing number of social scientists and historians have been pointing out that new uncertainties in the sphere of politics have been converging with newly recognized uncertainties of science and risks of technology to form novel postmodern configurations of state, science, and society (Eisenstadt 1999, Beck 1992).
It has become increasingly recognized that, in ethnically and religiously heterogeneous and multicultural societies, science and technology can no longer be attached to uncontroversial comprehensive values that privilege claims of neutrality, objectivity, and rationality; nor can the earlier belief that knowledge can depoliticize public discourse and action be sustained any longer.
2. The Authorities of Science and the State The belief that science can provide a secular version of the synoptic divine view of human affairs was politically most significant. Until its erosion towards the end of the twentieth century, the modernist configuration of science and the state was formed over the course of at least four centuries. The secularization of political power in the modern state has put the dilemma of arbitrary rule at the center of modern political theory and practice. Whereas legal and constitutional constraints appeared to be the most appropriate remedy to this problem, the rise of modern
science and technology encouraged faith in the possibility of restraint and moderation through public enlightenment and by means of scientific and technical advice. Not surprisingly, the redemptive drive behind the integration of science and politics had its origins in premodern religious visions and values (Manuel 1974). In both its authoritarian and democratic versions, this modernist program rested on the belief that the modern state can remedy the defects of the present sociopolitical order and realize an ideal order. Science contributed to this modernist outlook the faith in the power of secular knowledge to control nature and reconstruct society. In his dedication to Lorenzo de’ Medici of Florence in The Prince (1513), Niccolò Machiavelli implies that the holder of the God’s-eye view of the entire state is neither the King, who looks down at the people from the top of the hierarchy, nor the people, who see the King at the top from their lower state. It is a person like himself, a man of knowledge. The King, to be sure, has knowledge of the people, and the people have knowledge of the King but, because of his intellectual perspective, ‘a man of humble and obscure condition’ like Machiavelli can claim to see both the King and his subjects and understand the nature of their interactions. Political theory and political science thus claimed to have inherited the God’s-eye view of the whole polity and evolved a secular vision of the state as an object of knowledge. In turn, the nation-state has often found it useful to adopt the frame, if not always the content, of the inclusive scientific outlook in order to legitimate its interventions in the name of the general common good and its expressions in public goals such as security, health, and economic welfare.
If, from the inclusive perspective of the state, all citizens, social groups, or institutions appeared as parts which the supreme power had the capacity to fit together or harmonize, science in its various forms often provided the technical means to rationalize, and the rhetorical strategies to depoliticize, such applications of power. The potential of science not only as a source of knowledge but also as a politically useable authority was already recognized by Thomas Hobbes. This founder of modern political theory was in contact with Francis Bacon, admired William Harvey’s findings about the circulation of the blood, and criticized Robert Boyle’s position on the vacuum (Shapin and Schaffer 1985). In his influential works on the state, Hobbes tried to buttress his claims by appealing to the authority of scientific, especially mathematical, certainties, hoping to enlist the special force of a language, which claims to generate proofs as distinct from mere opinions (Skinner 1996). Although he was suspicious of the experimental science of Boyle, Hobbes followed Machiavelli in describing politics in the language of causes rather than motives. Thomas Sprat, the historian of the Royal Society (1667), claimed that when compared to the contentious languages of religions, the temperate discourse of science can advance consensus across diverse social groups. In both the experimental and mathematical scientific traditions, however, knowledge, whether based on inferences or observations (or on their combination), was expected to end disputes. The association between science and consensus was regarded as deeply significant in a society which had begun to view authority and order as constructed from the bottom up rather than coming from above (Dumont 1986). Reinforced by the growing image of science as a cooperative, nonhierarchical, and international enterprise in which free individuals come to uncoerced agreements on the nature of the universe, scientific modes of reasoning and thinking appeared to suggest a model for evolving discipline and order among equals (Polanyi 1958, 1962). Deeply compatible with the notion that legitimate power and authority are constituted by social contracts, the rise of modern scientific knowledge appeared to accompany and reinforce the rise of the modern individual and his or her ability to challenge hierarchical forms of knowledge and politics. From Hobbes and Descartes, through Kant, J. S. Mill, and late-modern democratic thinkers such as Popper, Dewey, and Habermas, the commitment to the centrality of individual agency becomes inseparable from a growing faith in the possibility of expanding public enlightenment backed up by the demonstrated success of scientific modes of reasoning and acting.
3. Power and the Social Sciences In succeeding centuries, one of the most persistent ambitions of Western society has been to make conclusions touching the ‘things’ of society, law, and politics appear as compelling and impervious to the fluctuations of mere opinion as conclusions touching the operation of heaven and earth. The faith that natural scientists are bound by the ‘plain’ language of numbers to speak with an authority which cannot be corrupted by fragile human judgment was gradually extended to the fields of engineering and the social sciences (Porter 1986). The notion that experts who are disciplined by respect for objective facts are a symbol of integrity and can therefore serve as guardians of public virtues against the villains of politics and business was widely interpreted to include, beyond natural scientists, other categories of experts who speak the language of numbers. Social science disciplines like economics, social statistics, sociology, political science, and psychology adopted modes of observing, inferring, and arguing which appeared to deploy in the sphere of social experience notions of objective facts and discernible laws similar to those that the natural sciences had developed in relation to physical nature. The social sciences discovered that the language of quantification is not only a powerful tool in the production of knowledge of society but also a
valuable political and bureaucratic resource for depersonalizing, and thus legitimizing, the exercise of power (Porter 1995). In the context of social and political life the separation, facilitated by expert languages, between facts and values, causal chains and motives, appeared profoundly consequential. Until effectively challenged, mostly since the 1960s, such scientific and technical orientations to human affairs appeared to produce the belief that economic, social, political, and even moral ‘facts’ can be used to distinguish subjective or partisan from professional and apolitical arguments or actions. The apparent authority of science in the social and political context motivated the principal founders of modern ideologies like socialism, fascism, and liberalism to enlist science in support of their views of history, society, and the future. While theoretical and mathematized scientific knowledge remained esoteric, as religious knowledge was for centuries in the premodern state, the ethos of general enlightenment and the conception of scientific knowledge as essentially public made scientific claims appear agreeable to democratic publics even when these claims remained, in fact, elusive and removed from their understanding. In fields such as physics, chemistry, and medicine, machines, instruments, and drugs could often validate in the eyes of the lay public claims of knowledge which theories alone could not substantiate. The belief that science depoliticizes the grounds of state policies and actions, and subordinates them to objective and, at least in the eyes of some, transparent, professional standards, has been highly consequential for the uses of science in the modern state. Political leaders and public servants discovered that the appearance of transparency and objectivity can make even centralized political power seem publicly accountable (Price 1965, Ezrahi 1990).
Moreover, the realization that the authority of science could in various contexts be detached from substantive scientific knowledge and deployed without the latter’s constraint often enhanced the political usefulness of both natural and social scientists independently of the value actually accorded to their professional judgments. Still, the use of expert scientific authority in the legitimization of state politics and programs has often empowered scientists to actually exercise influence on the substance of government actions or to effectively criticize them from institutional bases outside the government (Jasanoff 1990). The actual uses of scientific expertise could often be just the consequence of a political demand for expert authority.
4. Science in Democratic and Authoritarian States In democratic societies, the integration of scientific authority, and sometimes also of scientific knowledge and technology, into the operations of the modern
state encouraged the development of a scientifically informed public criticism of the government. This process was greatly augmented by the rise of the modern mass media and the ability of scientists to call public attention to governmental failures traceable to nonuse or misuse of scientific expertise. The role of scientists in empowering public criticism of the uses of nuclear weapons and nuclear energy (Balough 1991), in grounding public criticism of the operation of such agencies as the National Aeronautics and Space Administration (NASA) and the Food and Drug Administration (FDA), and in the criticism of public and private agencies for polluting the environment, illustrates the point. In authoritarian states like the Soviet Union, such cooperation between scientists, the mass media, and the public in criticizing government policies and actually holding the government accountable was, of course, usually repressed. In the authoritarian modernist version of science and the state, science was used mostly to justify, and partly to direct, comprehensive planning and control. Here, the force of scientific knowledge and authority rarely constrained the rationalization of centralization or warranted public skepticism towards the government or criticism of its actions (Scott 1998). Even in democratic states, however, science and technology were massively used to promote the goals of reconstruction, coordination, mass production, and uniformity. The redemptive role assumed by the political leadership in the name of equality, public welfare, and uniformity was no less effective in rationalizing state interventions than the totalistic utopianism enlisted to justify the direction and manipulation of society in authoritarian regimes.
In all the variants of the modern state, bodies of expert knowledge such as statistics, demography, geography, macroeconomics, city planning, public medicine, mental health, applied physics, psychology, geology, and others were used to rationalize the creation of government organizations and semipublic regulatory bodies which had the mandate to shift at least part of the discretion held formerly by political, legal, and bureaucratic agents to experts (Price 1965, Wagner et al. 1991). In many democratic states, the receptiveness to experts within the state bureaucracy was manifest in the introduction of merit-based, selective, personnel recruitment procedures and a widening use of examinations. The cases of the former USSR and Communist China suggest, by contrast, the ease with which experts and their expertise could be controlled by political loyalists. Even where it was facilitated, however, the welding of bureaucrats and professionals was bound to be conflict-ridden. The hierarchical authority structure of bureaucratic organizations has inevitably come to both challenge and be challenged by the more horizontal structure of professional peer controls (Wilson 1989, Larson 1977). Nevertheless, the professionalization of fields of public policy and
regulation boosted the deployment of more strictly instrumental frames of public actions in large areas. The synoptic gazes of power and science could thus converge in supporting such projects as holistic social engineering, city planning, and industrial scientific agriculture (Scott 1998).
5. Science, War, and Politics The most dramatic and consequential collaboration between science and the modern state took place during wars. Such collaboration was usually facilitated by the atmosphere of general mobilization, defining the war effort as the top goal of the nation. In a state of emergency, even democratic states have suspended, or at least limited, the competitive political process. Under such conditions, democracies have temporarily, albeit voluntarily, experienced the usual conditions of the authoritarian state: centralized command and control and a nationally coordinated effort. In the aftermath of such wars, politicians and experts often tried to persist in perpetuating the optimal conditions of their collaboration, regarding the resumption of usual political processes as disruptive of orderly rational procedures. In authoritarian nationalist or socialist states, open competitive politics was repressed as a matter of routine along with the freedom of scientific research, and the liberty of scientists to diffuse their findings and interpret them to students and the public at large (Mosse 1966). In such countries, the political elite usually faced the dilemma of how to exploit the resources of science for the war effort and for advancing its domestic goals without becoming vulnerable to the revolutionary potential of science as the expression of free reason, criticism, and what Robert K. Merton called ‘organized skepticism’ (1973). In the democratic state, the special affinities between the participatory ethos of citizens, free to judge and evaluate their government, and the values of scientific knowledge and criticism did not usually allow the coherences and the clarities of the war period to survive for long, and the relations between science and the state had to readjust to conditions of open political contests and the tensions between scientific advice to, or scientific criticism of, the state. 
Limitations imposed by open political contests were mitigated, however, in political cultures which encouraged citizens to evaluate and judge the government with reference to the adequacy of its performance in promoting public goals. The links between instrumental success and political legitimation preserved in such contexts the value of the interplay between expert advice to the government and to its public critics in the dynamic making of public policy and the construction of political authority. Political motives for appealing to the authority of science could of course be compatible with the readiness to ignore knowledge and
adequate performance, although it did not need to contradict the willingness to actually use scientific knowledge to improve performance. Discrepancies between the uses of scientific authority and scientific knowledge became widespread, however, because of the growing gaps between the timeframes of science and politics in democratic societies. In fields such as public health, economic development, security, and general welfare, instrumentally effective policies and programs must usually be measured over a period of years, and sometimes even decades. But the politicians whose status has come to be mediated by the modern mass media need to demonstrate their achievements within a timeframe of months, weeks, or even days. In contemporary politics, one month is a long time, and the life expectancy of issues on the public agenda is usually much shorter. Such a state of affairs means that political actors often lack the political incentives to invest in the costly resources that are required for instrumental success whose payoffs are likely to appear only during the incumbency of their political successors.
6. Science and the Transformation of the Modern Democratic State These conditions have encouraged leaders in democratic states increasingly to replace policy decisions and elaborate programs with grand political gestures (Ezrahi 1990). Where instant rewards can be obtained by well-advertised commitment to a particular goal or policy, the political motive to invest in substantive moves to change reality and improve governmental performance over the longer run may easily decline. Besides the tendency of democratic politics to privilege the present over the future in the allocation of public resources, the position of science in substantiating a long-term instrumental approach to public policy was further destabilized by the fragmentation of the normative mandates of state policies. Roughly since the 1960s, the publics of most Western democracies have become increasingly aware of the inherent constraints on the determination and ordering of values for the purpose of guiding public choices. Besides the influence of ethnic, religious, linguistic, or cultural differences on this process, such developments as the feminist movement reflect even deeper pressures to reorder basic values. Feminist spokespersons demanded, for example, that financial and scientific resources directed to control diseases be redistributed to redress gender discrimination against the treatment of specifically female diseases. In part, the feminist critique has been but one aspect of wider processes of individuation, which, towards the later part of the twentieth century, have increased the diversity of identities, tastes, lifestyles, and patterns of association in modern societies. This new pluralism implied the necessity of
continually renegotiating the uses of science and technology in the social context and locally adapting it to the diverse balances of values and interests in different communities. In the course of the twentieth century, leading scientific voices like J. B. S. Haldane, J. D. Bernal, and Jacques Monod tended to treat almost any resistance to the application of advanced scientific knowledge in human affairs as an ‘abuse of science’ resulting from irrationalism, ignorance, or prejudice. These and many other scientists failed to anticipate the changes in social and political values which complicated the relations of science and politics during the twentieth century and the constraints imposed by the inherently competitive value environment of science and policy making in all modern states. The constant, often unpredictable, changes in the value orientations of democratic publics and other related developments have clearly undermined confidence in the earlier separation between facts and values, science and politics, technological and political structures of action. Such a central element of policy making as risk assessment, for example, which for a long time was regarded as a matter best left to scientists, was gradually understood as actually a hybrid process combining science and policy judgments (Jasanoff 1990, Beck 1992). In contemporary society, in which the respective authorities of science and the state were demystified, knowledge has come to be regarded as too complex to directly check power, and power as too diffused to direct or repress the production and diffusion of knowledge. Historically, one of the most important latent functions of science in the sociocultural construction of the democratic political universe was to warrant the faith in an objective world of publicly certifiable facts which can function as standards for the resolution or assessment of conflicts of opinions.
To the extent that scientists and technologists could be regarded as having the authority to state the valid relations between causes and effects, they were believed to be vital to the attribution of responsibility for the desirable or undesirable consequences of public actions. Policies were regarded within this model of science and politics as hypotheses, which are subject to tests of experience before a witnessing public (Ezrahi 1990). Elements of this conception have corresponded to the modern political experience. Although the public gaze has always been mediated, and to some extent manipulated, by hegemonic elites, publics could usually see when a war effort failed to achieve the stated objectives, when a major technology like a nuclear reactor failed and endangered millions of citizens, or when economic policy succeeded in producing affluence and stability. The expansion of this conception of government responsibility and accountability to regions outside the West even made it possible to redescribe fatal hunger in wide areas in India, for instance, as the consequence of policy rather
than natural disaster (Sen 1981). Still, in large areas, the cultural and normative presuppositions of the neutral public realm as a shared frame for the nonpartisan or nonideological description and evaluation of reality have not survived the proliferation of the modern mass media and the spread of political and cultural pluralism. In contemporary democracies, the pervasive mediating role of largely commercialized electronic communication systems has diminished the power of the state to influence public perceptions of politics and accentuated the weight of emotional and esthetic factors relative to knowledge or information in the representation and construction of political reality. Influenced by diverse specialized and individualized electronic media, the proliferation of partly insulated micro-sociocultural universes within the larger society has generated a corresponding proliferation of incommensurable notions of reality, causality, and factuality.
7. Transformations of the Institutional Structure and Intellectual Orientations of Science Changes in the political and institutional environment of science in the modern state have been accompanied by changes in the internal institutional and intellectual life of science itself. In the modern democratic state, the status of science as a distinct source of certified knowledge and authority was related to its institutional autonomy vis-à-vis the structure of the state and its independence from private economic institutions. Freedom of research and academic freedom were regarded as necessary conditions for the production and diffusion of scientific knowledge and, therefore, also for its powers to ground apolitical advice and criticism in the context of public affairs. Such institutional autonomy was, of course, never complete. The very conditions of independence and freedom had to be negotiated in each society and reflected at least a tacit balance between the needs of science and the state as perceived by the government. But the international universalistic ethos of science tended to ignore or underplay such local variations (Solingen 1994). The institutional arrangements that secured a degree of autonomy and freedom to scientists seemed to warrant the distinction between basic and applied research, between pure research directed to the advancement of knowledge and research and development directed to advance specific industrial, medical, or other practical goals. An important function of this distinction was to balance the adaptation of science to the needs of the modern state and the preservation of the internal intellectual traditions and practices of scientific research and academic institutions. In practice, the relations between science and the state were more symbiotic than the public ethos of science would have suggested. Scientific institutions,
in order to function, almost always required public and political support and often also financial assistance. On the other hand, the state could not remain indifferent to the potential uses of scientific knowledge and authority to both secure adequate responses to problems and facilitate the legitimation of its actions by reference to apolitical expert authority. Thus, while the separation between truth and power and the respect for the boundaries between pure and applied sciences, as well as between politics and administration, were considered for a long time as the appropriate way to think about the relation between science and politics (Price 1965), the relations between science and the state, as well as between scientists and politicians, turned out to be much more interactive and conflict-ridden. Since scientists and politicians respectively controlled assets useful for each other’s work, they were bound to be more active in trying to pressure each other to cooperate and engage in mutually useful exchanges (Gibbons et al. 1994). Not all these exchanges were useful to either science or the state in the long run. The mobilization of scientists to the war efforts of the modern nation-state, while it boosted the status of scientists domestically in the short run, had deleterious effects on the international network of scientific cooperation, as well as on the independence of science in the long run. Especially in cases of internally controversial wars, like the American involvement in Vietnam, the mobilization of science inevitably split the scientific community and politicized the status of science. Nevertheless, the unprecedented outlays of public money made available to science in the name of national defense allowed many research institutions to acquire expansive advanced facilities and instruments that permitted the boosting of pure science in addition to militarily related research and development.
But the gains in size and in potential scientific advance were obtained at the cost of eroding the glorious insulation of scientific research and exposing its delicate internal navigational mechanisms to the impact of external political values and institutions (Leslie 1993). Yielding to the pressures of the modern nation-state to make science more relevant and useful to more immediate social goals inevitably reduced the influence of internal scientific considerations and the priorities of scientific research (Guston 2000). Following such developments, the university, the laboratory, and the scientific community at large could often appear less elitist and more patriotic, but also more detached from their earlier affinities to humanistic culture and liberal values. Like the partnership between science and the state, the partnership between science and the market, often facilitated by their collaboration during the war effort and by the postwar privatization of projects and services, has undermined the autonomy of science and the ethos of basic research in many late modern states. But, whereas the interpenetration between science and the
state subjected the internal intellectual values of science to the pressures of public goals and pressing political needs, the links between science and the private sector subordinated scientific norms to the private values of profit making. The conversion of scientific expertise into capital opened the way for substantial private support for the intellectual pursuits of science. But the linkages between science, industry, and capital only reinforced the decline of science and its institutions as a bastion of enlightenment culture, progress, and universal intellectual values. Thus, while scientific and technical knowledge and skills were increasingly integrated into a wider spectrum of industrial productions and commercial services, science as a whole became more amorphous, less recognizable or representable as a set of distinct institutions, and a community with a shared ethos (Gibbons et al. 1994). An instructive illustration of these processes is the pressure exerted by both the state and private business firms on scientists to compromise the cherished professional norm of publicity. In the context of science, the transparency of methodology and the publicity of research results have for a long time been reinforced by both ethical commitment and practical needs. The methodologies and findings of any research effort are the building blocks, the raw materials, with which scientists in discrete places produce more knowledge. Transparency and publicity are necessary for the orderly flow of the research process and the operation of science as a cooperative enterprise (Merton 1973). In addition, publicity has been a necessary element of the special status of science as public knowledge. From the very beginning, pioneering scientists such as Bacon, Boyle, and Lavoisier distinguished scientific claims from the claims of magic, alchemy, cabala, and other esoteric practices by the commitment to transparency.
Transparency was also what made claims of scientific knowledge appear publicly acceptable to democratic citizens (Tocqueville 1945). Turning scientific research into a state secret in order to gain an advantage in war or into an economic (commercial) secret in order to gain an advantage in the market was not congenial to sustaining the cooperative system of science or the early luster of science as an embodiment of noble knowledge and virtues which transcended national loyalties and sectoral interests. In the aftermath of such developments, the virtues of objectivity, disinterestedness, universality, and rationality which were associated with earlier configurations of science (Polanyi 1958, 1962, Merton 1973) appeared unsustainable. Moreover, the fact that the resources of science and technology could be enlisted not only by liberal and democratic but also by fascist, communist, and other authoritarian regimes accentuated the image of science as an instrument insufficiently constrained by internal norms and flexible enough to serve contradictory, unprogressive, and extremely controversial causes.
8. The Postmodern Condition and the Reconfiguration of Science and the State These developments had a profoundly paradoxical impact on the status of science in the late-modern state. Especially since the closing decades of the twentieth century, while such projects as the decoding of the human genome indicate that scientific knowledge has advanced beyond even the most optimistic predictions, the authority of science in society and in the context of public affairs has suffered a sharp decline. In order to consider this state of affairs as reversible, one needs to believe that the insular autonomy of science can be restored and that the political and economic environments of science can become more congenial for sustaining this condition. Such expectations seem unwarranted. On the other hand, an apparent decline in the distinct social and political value of scientific authority may not necessarily undermine the impact of scientific knowledge on the ways states and governments act. What appears against the past as the decline of scientific authority in the larger social and political context may even be redescribed as a reconfiguration of the authority of expertise in the postmodern state. The earlier distinctions between science, politics, law, ethics, economy, and the like may have become irrelevant in complex contemporary societies, in which the use of scientific knowledge requires finer, more intricate, and continual adjustments and readjustments between science and other expressions of truth or power. In the absence of hegemonic ideological or national frames, the macropolitical sphere of the state divides into a multitude of—often just temporary and amorphous—subgroups, each organized around a particular order of values and interests. In each of these particular universes scientific knowledge and expertise can occupy a privileged authority as a means to advance shared goals. 
When these groups clash or compete in the wider political arena, their experts tend to take sides and function as advocates. The state and its regulatory institutions enlist their own experts to back up their particular perspectives. Within this new, more open-ended and dynamic system, science is more explicitly and reflexively integrated into social, economic, and political values. In the context of the new pluralist polity, the authority and knowledge of science may therefore be less distinctly visible and less relevant to the symbolic, ideological, or cultural aspects of politics. At the same time, it is no less present in the organization, management, and the normative constitution of social order.
See also: Academy and Society in the United States: Cultural Concerns; Development and the State; Higher Education; History of Science: Constructivist Perspectives; Science and Industry; Science, Economics of; Science Funding: Asia; Science Funding: Europe; Science Funding: United States; Science, Social Organization of; Science, Sociology of; Science, Technology, and the Military; Scientific Academies, History of; Scientific Academies in Asia; State and Society; Universities, in the History of the Social Sciences
Bibliography Balough B 1991 Chain Reaction: Expert Debate and Public Participation in American Commercial Nuclear Power 1945– 1975. Cambridge University Press, New York Beck U 1992 Risk Society: Towards a New Modernity. Sage Publications, London Dumont L 1986 Essays on Indiidualism, Modern Ideology in Anthropological Perspectie. University of Chicago Press, Chicago Eisenstadt S N 1999 Paradoxes of Democracy: Fragility, Continuity and Change. The Woodrow Wilson Center Press, Washington, DC Ezrahi Y 1990 The Descent of Icarus: Science and the Transformation of Contemporary Democracy. Harvard University Press, Cambridge, MA Ezrahi Y 1996 Modes of reasoning and the politics of authority in the modern state. In: Olson D R, Torrance N (eds.) Modes of Thought: Explorations in Culture and Cognition. Cambridge University Press, Cambridge, UK Gibbons M, Nowotny H, Limoges C, Schwartzman S, Scott P, Trow M (eds.) 1994 The New Production of Knowledge: The Dynamics of Science and Research in Contemporary Societies. Sage, London Guston D H 2000 Between Politics and Science: Assuring the Integrity of Research. Cambridge University Press, New York Jasanoff S 1990 The Fifth Branch: Science Adisers as Policymakers. Harvard University Press, Cambridge, MA Larson M S 1977 The Rise of Professionalism. University of California Press, Berkeley, CA Leslie S W 1993 The Cold War and American Science. Columbia University Press, Cambridge, MA Manuel F E 1974 The Religion of Isaac Newton. Clarendon Press, Oxford, UK Merton R K 1973 The normative structure of science. In: Storer N W (ed.) The Sociology of Science. University of Chicago Press, Chicago Mosse G L 1966 Nazi Culture. Schoken Bookes, New York Polanyi M 1958 Personal Knowledge: Towards a Post Critical Philosophy. University of Chicago Press, Chicago Polanyi M 1962 The Republic of Science: Its Political and Economic Theory. Minerva Press, pp. 54–73 Porter T M 1986 The Rise of Statistical Thinking 1820–1900. 
Princeton University Press, Princeton, NJ Porter T M 1995 Trust In Numbers, The Pursuit of Objectiity in Science and Public Life. Princeton University Press, Princeton, NJ Price D K 1965 The Scientific Estate. Belknap, Cambridge, MA Scott J C 1998 Seeing Like a State. Yale University Press, New Haven, CT Sen A K 1981 Poerty and Famines: An Essay on Entitlements and Depriation. Clarendon Press, Oxford, UK Shapin S, Schaffer S 1985 Leiathan and the Air Pump: Hobbes, Boyle and the Experimental Life. Princeton University Press, Princeton, NJ Skinner Q 1996 Reason and Rhetoric in the Philosophy of Hobbes. Cambridge University Press, Cambridge, UK
Solingen E (ed.) 1994 Scientists and The State: Domestic Structures and the International Context. University of Michigan Press, Ann Arbor, MI
Sprat Th [1667] 1958 History of the Royal Society. Washington University Press, St Louis, MO
Tocqueville A de 1945 Democracy in America. In: Bradley Ph (ed.) Vintage Books, New York
Wagner P, Weiss C H, Wittrock B, Wollman H 1991 Social Sciences and Modern States. Cambridge University Press, Cambridge, UK
Wilson J Q 1989 Bureaucracy: What Government Agencies Do and Why They Do It. Basic Books, New York
Y. Ezrahi
Science, Economics of
The central issues of the economics of science are the principles governing the allocation of resources to science and the management and consequences of the use of these resources. Studies in this field began with the assumption that science was a distinct category of public spending that required rationalization. They have moved towards the view that science is a social system with distinct rules and norms. As the systems view has developed, the focus of the economics of science has moved from the effects of science on the economy to the influence of incentives and opportunities on scientists and research organizations.
There is a productive tension between viewing science as a social instrument and as a social purpose. In the first view, science is a social investment in the production and dissemination of knowledge that is expected to generate economic returns as this knowledge is commercially developed and exploited. This approach has the apparent advantage that the standard tools of economic analysis might be directly employed in choosing how to allocate resources to science and manage their use. In the second approach, science is assumed to be a social institution whose norms and practices are distinct from, and only partially reconcilable with, the institutions of markets. While this second approach greatly complicates the analysis of resource allocation and management, it may better represent the actual social organization of science and the behavior of scientists, and it may therefore ultimately produce more effective rules for resource allocation and better principles for management. Both approaches are examined in this article, although it is the first that accounts for the majority of the economics of science literature (Stephan 1996).
1. The Economic Analysis of Science as a Social Instrument
In arguing for a continuing high rate of public funding of science following World War II, US Presidential Science Advisor Vannevar Bush (1945) crafted the view that science is linked intrinsically to technological and economic progress as well as being essential to national defense. The aim of ‘directing’ science to social purposes was already well recognized, and had been most clearly articulated before the war by John D. Bernal (1939). What distinguished Bush’s argument was the claim that science had to be curiosity-driven and that, in companies, such research would be displaced by the commercial priorities of more applied research. The view that science is the wellspring of economic growth became well established within the following generation, giving rise to statements like ‘Basic research provides most of the original discoveries from which all other progress flows’ (United Kingdom Council for Scientific Policy 1967).
The concept of science as a source of knowledge that would be progressively developed and eventually commercialized became known as the ‘linear model.’ In the linear model, technology is science reduced to practical application. The ‘linear model’ is an oversimplified representation that ignores the evidence that technological change is often built upon experience and ingenuity divorced from scientific theory or method, the role of technological developments in motivating scientific explanation, and the technological sources of instruments for scientific investigation (Rosenberg 1982). Nonetheless, it provides a pragmatic scheme for distinguishing the role of science in commercial society.
If science is instrumental in technological progress and ultimately economic growth and prosperity, it follows that the economic theory of resource allocation should be applicable to science. Nelson (1959) and Arrow (1962) demonstrated why market forces could not be expected to generate the appropriate amount of such investment from a social perspective. 
Both Arrow and Nelson noted that in making investments in scientific knowledge, private investors would be unable to capture all of the returns to their investment because they could not charge others for the use of new scientific discoveries, particularly when those discoveries involved fundamental understanding of the natural world. Investment in scientific knowledge therefore had the characteristics of a ‘public good’, like publicly accessible roads. This approach established a basis for justifying science as a public investment. It did not, however, provide a means for determining what the level of that investment should be.
Investments in public goods are undertaken, in principle, subject to the criterion that benefits exceed costs by an amount that is attractive relative to other investments of public funds. To employ this criterion, a method for determining the prospective returns or benefits from scientific knowledge is required. The uncertainty of scientific outcomes is not, in principle, a fundamental barrier to employing this method. In practice, it is often true that the returns from investments in public good projects are uncertain, and prospective returns often involve attributing to new projects the returns from historical projects.
Griliches (1958) pioneered a methodology for retrospectively assessing the economic returns on research investment, estimating that social returns of 700 percent had been realized in the period 1933–55 from the $2 million of public and private investments in the development of hybrid corn between 1910 and 1955. Other studies of agricultural innovation as well as a limited number of studies of industrial innovation replicated Griliches’ findings of a high social rate of return (see Steinmueller 1994 for references). Mansfield (1991) provides a fruitful method for extending this approach. Mansfield asked R&D executives to estimate the proportion of their company’s products and processes commercialized in 1975–85 that could not have been developed, or would have been substantially delayed, without academic research carried out in the preceding 15 years. He also asked them to estimate the 1985 sales of the new products and cost savings from the new processes. Extrapolating the results from this survey to the total investment in academic research and the total returns from new products and processes, Mansfield concluded that this investment had produced the (substantial) social rate of return of 28 percent.
The preceding discussion could lead one to conclude that the development of a comprehensive methodology for assessing the rate of return based on scientific research was only a matter of greater expenditure on economic research. This conclusion would be unwarranted. Efforts to trace the returns from specific government research efforts (other than in medicine and agriculture) have been less successful. 
The effort by the US Department of Defense Project Hindsight to compute the returns from defense research expenditures not only failed to reveal a positive rate of return, but also rejected the view that ‘any simple or linear relationship exists between cost of research and value received’ (Office of the Director of Defense Research and Engineering 1969). Similar problems were experienced when the US National Science Foundation sought to trace the basic research contributions underlying several major industrial innovations (National Science Foundation 1969). In sum, retrospective studies based on the very specific circumstances of ‘science enabled’ innovation or upon much broader claims that science as a whole contributes a resource for commercial innovation seem to be sustainable. When these conditions do not apply, as in the cases of specific research programs with uncertain application or efforts to direct basic research to industrial needs, the applicability of retrospective assessment, and therefore its value for resource allocation policy, is less clear.
More fundamentally, imputing a return to investments in scientific research requires assumptions about the ‘counter-factual’ course of developments that would have transpired in the absence of specific and identified contributions of science. In examples like hybrid corn or the poliomyelitis vaccine, a reasonable assumption about the ‘counter-factual’ state of the world is a continuation of historical experience. Such assumptions are less reasonable in cases where scientific contributions enable a particular line of development but compete with alternative possibilities or where scientific research results are ‘enabling’ but are accompanied by substantial development expenditures (David et al. 1992, Mowery and Rosenberg 1989, Pavitt 1993).
For science to be analyzed as a social instrument, scientific activities must be interpreted as the production of information and knowledge. As the results of this production are taken up and used, they are combined with other types of knowledge in complex ways for which the ‘linear model’ is only a crude approximation. The result is arguably, and in some cases measurably, an improvement in economic output and productivity. The robustness and reliability of efforts to assess the returns to science fall short of standards that are employed in allocating public investment resources. Nonetheless, virtually every systematic study of the contribution of science to the economy has found appreciable returns to this social investment. The goals of improving standards for resource allocation and management may be better served, however, by analyzing science as a social institution.
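The benefit-cost criterion and the rate-of-return language used in this section can be made concrete with a small calculation. The sketch below is illustrative only: it computes an internal rate of return, the discount rate at which discounted benefits equal discounted costs, for a hypothetical stream of research costs and later benefits. The function names and the cash-flow figures are assumptions introduced here, and this is not the estimation procedure used by Griliches or Mansfield, whose studies rest on detailed empirical data and counter-factual assumptions.

```python
def net_present_value(rate, cash_flows):
    """Discount yearly net flows (benefits minus costs) back to year 0."""
    return sum(flow / (1.0 + rate) ** year for year, flow in enumerate(cash_flows))


def internal_rate_of_return(cash_flows, low=0.0, high=10.0, tol=1e-9):
    """Find the discount rate at which the net present value is zero.

    Uses bisection and assumes the net present value is positive at `low`
    and negative at `high`, as it is for early costs followed by benefits.
    """
    if net_present_value(low, cash_flows) <= 0 or net_present_value(high, cash_flows) >= 0:
        raise ValueError("rate of return is not bracketed by [low, high]")
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_present_value(mid, cash_flows) > 0:
            low = mid   # still profitable at this rate, so the root is higher
        else:
            high = mid  # unprofitable at this rate, so the root is lower
    return (low + high) / 2.0


# Hypothetical project (not data from any study): research costs of 100 per
# year for three years, followed by benefits of 80 per year for ten years.
flows = [-100.0] * 3 + [80.0] * 10
rate = internal_rate_of_return(flows)
```

Under the criterion described above, such a project would be attractive only if `rate` compared favorably with the return available from other uses of public funds. The difficulty stressed in the text lies less in this arithmetic than in obtaining credible benefit figures to put into `flows`.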
2. Science as a Social Institution
The economic analysis of science as a social system begins by identifying the incentives and constraints that govern the individual choices of scientists; these may reflect persistent historical features of science or contemporaneous policies. Incentives may include tangible rewards such as monetary awards, intangible, but observable, rewards such as status, and less observable rewards such as personal satisfaction. Similarly, constraints should be interpreted broadly, including not only financial limitations but also constraints stemming from institutional rules, norms, and standards of practice. The following simplified account suggests one of several ways of assembling these elements into a useful analytical framework.
Becoming a scientist requires substantial discipline and persistence in educational preparation as well as skills and talents that are very difficult to assess. Scientific training may be seen as a filter for selecting from prospective scientists those who have the ability and drive to engage in a scientific career. In addition, the original work produced during research training demonstrates the capacity of the researcher and provides a means for employers to assess the talents of the researcher (David 1994). Analyzing science education as an employment filter is a complement to more traditional studies of the scientific labor market such as those reviewed by Stephan (1996). The employment filter approach may also waste human resources by making schooling success the only indicator of potential for scientific contribution. If, for example, the social environment of the school discourages the participation or devalues the achievement of women or individuals from particular ethnic groups, the filter system will not perform as a meritocracy.
The distinctive features of science as a social system emerge when considering the incentives and constraints facing employed scientists. Although there is a real prospect of monetary reward for outstanding scientific work (Zuckerman 1992), many of the incentives governing scientific careers are related to the accumulation of professional reputation (Merton 1973). While Merton represented science as ‘universalist’ (open to claims from any quarter), the ability to make meaningful claims requires participation in scientific research networks, participation that is constrained by all of the social processes that exclude individuals from such social networks or fail to recognize their contribution. The incentive structure of seeking the rewards from professional recognition, and the social organization arising from it, is central to the ‘new economics of science’ (Dasgupta and David 1994).
The new economics of science builds upon sociological analyses (Cole and Cole 1973, Merton 1973, Price 1963) of the mechanisms of cumulative reinforcement and social reward within science. From an economic perspective, the incentive structure governing science is the result of the interactions between the requirement of public disclosure and the quest for recognition of scientific ‘priority’, the first discovery of a scientific result. Priority assures the alignment of individual incentives with the social goal of maximizing the scientific knowledge base (Dasgupta and David 1987). 
Without the link between public disclosure and the reward of priority, it seems likely that scientists would have an incentive to withhold key information necessary for the further application of their discoveries (David et al. 1999).
As Stephan (1996) observes, the specific contribution of the new economics of science is in linking this incentive and reward system to resource allocation issues. Priority not only brings a specific reward of scientific prestige and status but also increases the likelihood of greater research support. Cumulative advantage therefore not only carries the consequence of attracting attention, it also enables the recruitment of able associates and students and provides the means to support their research. These effects are described by both sociologists of science and economists as the Matthew effect after Matthew 25:29, ‘For to every one who has will more be given, and he will have abundance; but from him who has not, even what he has will be taken away.’ As in the original parable, it may be argued that this allocation is appropriate since it concentrates resources in the hands of those who have demonstrated the capacity to produce results.
The race to achieve priority and hence to collect the rewards offered by priority may, however, lead to inappropriate social outcomes because priority is a ‘winner take all’ contest. Too many resources may be applied to specific races to achieve priority and too few resources may be devoted to disseminating and adapting scientific research results (Dasgupta and David 1987, David and Foray 1995), a result that mirrors earlier literature on patent and technology discovery races (Kamien and Schwartz 1975). Moreover, the mechanisms of cumulative advantage resulting from achieving priority may reduce diversity in the conduct of scientific research. This system has the peculiarity that the researchers who have the greatest resources and freedom to depart from existing research approaches are the same ones who are responsible for creating the status quo.
The principal challenges to the view that science is a distinct social system are the growing number of scientific publications by scientists employed in private industry (Katz and Hicks 1996) and the argument that scientific knowledge is tightly bound to social networks (Callon 1994). Private investments in scientific research would appear to question the continuing validity of the ‘public good’ argument. For example, Callon (1994) contends that scientific results are, and have always been, strongly ‘embedded’ within networks of researchers and that ‘public disclosure’ is therefore relatively useless as a means of transfer for scientific knowledge. Gibbons et al. (1994) argue that research techniques of modern science have become so well distributed that public scientific institutions are no longer central to scientific activity. While the arguments of both Callon and Gibbons et al. 
suggest that private scientific research is a direct substitute for publicly funded research, other motives for funding and publication such as gaining access to scientific networks suggest that public and private research are complementary (David et al. 1999). The growing reliance of industry on science provides a justification for investing in science to improve the ‘absorption’ of scientific results (Cohen and Levinthal 1989, Rosenberg 1990). Employed scientists need to be connected with other scientific research colleagues who identify ‘membership’ in the scientific community with publication, and labor force mobility for employed scientists requires scientific publication. Thus, it is premature to conclude that the growing performance of scientific research in industry or publication of scientific results by industrial authors heralds the end of the need for public support of science.
The growing significance of private funding of scientific research does, however, indicate the need to improve the socioeconomic analysis of the incentive and governance structures of science. Empirical work on the strategic and tactical behavior of individual scientists, research groups, and organizations is urgently needed to trace the implications of the changing environment in which the social institutions of science are evolving. Ultimately, these studies should be able to meet the goal of developing better rules for allocating and managing the resources devoted to science.
See also: Innovation, Theory of; Research and Development in Organizations; Research Funding: Ethical Aspects; Science and the State; Science Funding: Asia; Science Funding: Europe; Science Funding: United States; Science, Technology, and the Military
Bibliography
Arrow K J 1962 Economic Welfare and the Allocation of Resources for Invention. The Rate and Direction of Inventive Activity. National Bureau of Economic Research, Princeton University Press, Princeton, NJ
Bernal J D 1939 The Social Function of Science. MIT Press, Cambridge MA. MIT Press Paperback edition 1967
Bush V 1945 Science: The Endless Frontier: A Report to the President on a Program for Postwar Scientific Research. United States Office of Scientific Research and Development, Washington DC. National Science Foundation reprint 1960
Callon M 1994 Is Science a Public Good? Fifth Mullins Lecture, Virginia Polytechnic Institute, 23 March 1993. Science, Technology and Human Values 19: 395–424
Cohen W M, Levinthal D A 1989 Innovation and learning: The two faces of R&D. Economic Journal 99(397): 569–96
Cole J R, Cole S 1973 Social Stratification in Science. University of Chicago Press, Chicago, IL
Dasgupta P, David P A 1987 Information disclosure and the economics of science and technology. In: Feiwel G R (ed.) Arrow and the Ascent of Modern Economic Theory. New York University Press, New York pp. 519–40
Dasgupta P, David P A 1994 Toward a new economics of science. Research Policy 23(5): 487–521
David P A 1994 Positive feedbacks and research productivity in science: reopening another black box. In: Granstrand O (ed.) Economics of Technology. North Holland, Amsterdam and London, pp. 65–89
David P A, Foray D 1995 Accessing and expanding the science and technology knowledge base. Science and Technology Industry Review 16: 13–68
David P A, Foray D, Steinmueller W E 1999 The research network and the new economics of science: From metaphors to organizational behaviours. In: Gambardella A, Malerba F (eds.) The Organization of Economic Innovation in Europe. Cambridge University Press, Cambridge, UK pp. 303–42
David P A, Mowery D, Steinmueller W E 1992 Analysing the economic payoffs from basic research. Economics of Innovation and New Technology 2: 73–90
Gibbons M, Limoges C, Nowotny H, Schwartzman S, Scott P, Trow M 1994 The New Production of Knowledge: The Dynamics of Science and Research in Contemporary Societies. Sage, London
Griliches Z 1958 Research costs and social returns: hybrid corn and related innovations. Journal of Political Economy (October): 419–31
Kamien M I, Schwartz N L 1975 Market structure and innovation: a survey. Journal of Economic Literature 13(1): 1–37
Katz J S, Hicks D M 1996 A systemic view of British science. Scientometrics 35(1): 133–54
Mansfield E 1991 Academic research and industrial innovation. Research Policy 20(1): 1–12
Merton R 1973 The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press, Chicago, IL
Mowery D C, Rosenberg N 1989 Technology and the Pursuit of Economic Growth. Cambridge University Press, Cambridge, UK
National Science Foundation 1969 Technology in Retrospect and Critical Events in Science (TRACES). National Science Foundation, Washington, DC
Nelson R R 1959 The simple economics of basic scientific research. Journal of Political Economy 67(June): 297–306
Office of the Director of Defense Research and Engineering 1969 Project Hindsight: Final Report. Washington, DC
Pavitt K 1993 What do firms learn from basic research. In: Foray D, Freeman C (eds.) Technology and the Wealth of Nations: The Dynamics of Constructed Advantage. Pinter Publishers, London
Price D J de Solla 1963 Little Science, Big Science. Columbia University Press, New York
Rosenberg N 1982 Inside the Black Box: Technology and Economics. Cambridge University Press, Cambridge, UK
Rosenberg N 1990 Why do firms do basic research (with their own money). Research Policy 19(2): 165–74
Steinmueller W E 1994 Basic science and industrial innovation. In: Dodgson M, Rothwell R (eds.) Handbook of Industrial Innovation. Edward Elgar, London pp. 54–66
Stephan P E 1996 The economics of science. Journal of Economic Literature XXXIV(September): 1199–235
United Kingdom Council for Scientific Policy (Great Britain) 1967 Second Report on Science Policy. HMSO, London
Zuckerman H 1992 The proliferation of prizes: Nobel complements and Nobel surrogates in the reward system of science. Theoretical Medicine 13: 217–31
W. E. Steinmueller
Science Education
The launch of Sputnik in 1957 almost single-handedly conferred on science education the status of an Olympic sport. A series of international comparison studies along with the increasing importance of science and technology in the global economy have nurtured this image (e.g., Schmidt et al. 1997). Stakeholders, including policy makers, natural scientists, textbook publishers, test designers, classroom teachers, business leaders, and pedagogical researchers, have responded with competing strategies for improving science education. In some countries, most notably the United States, powerful groups have mandated untested policies and unachievable educational goals, often driven by a single aspect of the problem such as textbooks, assessments, curriculum frameworks, peer learning, or technological innovation.
Experience with failed policy initiatives and ambiguous research findings has highlighted the systemic, intricate, complex, interconnected nature of science education. Innovations succeed under some circumstances, but not others. Students regularly fail to learn what they are taught, prefer ideas developed from personal experiences, and make inferences based on incomplete information. Curriculum designers often neglect these aspects of learners and expect students to absorb all the information in a textbook or learn to criticize research on ecology by studying logic puzzles (Linn and Hsi 2000, Pfundt and Duit 1991). International studies of science learning in over 50 countries raise questions about the complex relationships between achievement and curriculum, or between interest in science and science learning, or between teachers’ science knowledge and student success that many have previously taken for granted. This systemic character of science education demands a more nuanced and contextual understanding of the development of scientific understanding and the design of science instruction, as well as new research methods, referred to as design studies, to investigate the impact of innovations.
1. Developing Scientific Understanding
1.1 Knowledge Integration
Most researchers would agree that science learners engage in a process of knowledge integration, making sense of diverse information by looking for patterns and building on their own ideas (Bransford et al. 1999, Bruer 1993). Knowledge integration involves linking and connecting information, seeking and evaluating new ideas, as well as revising and reorganizing scientific ideas to make them more comprehensive and cohesive.
Designing curriculum materials to support and guide the process of knowledge integration has proven difficult. Most textbooks and even hands-on activities reflect a view of learners as absorbing information rather than attempting to integrate new ideas with their existing knowledge. Recent research provides guidance to curriculum designers by describing the interpretive, cultural, and deliberate dimensions of knowledge integration.
Learners interpret new material in light of their own ideas and experiences, frequently relying on personal perspectives rather than instructed ideas. For example, science learners often believe that objects in motion come to rest, based on their extensive observations of the natural world. Learning happens in a cultural context where group norms, expectations, and supports shape learner activity. For example, when confronted with the Newtonian view that objects in motion remain in motion, many learners conclude that objects come to rest on the playground but remain in motion in science class, invoking separate norms for these distinct contexts. Individuals make deliberate decisions about their science learning, develop commitments about reusing what they learn, pay attention to some science debates but not others, and select or avoid science as a career. For example, some students desire a cohesive account of topics like motion and seek to explain new phenomena such as the role of friction in nanotechnology, while others report with pride that they have ‘forgotten everything taught in science class.’
The interpretive, cultural, and deliberate dimensions of science learning apply equally to pre-college students, preservice and in-service science teachers, research partnerships, and lifelong science learners. These perspectives help clarify the nature of knowledge integration and suggest directions for design of science instruction.
1.2 Interpretive Nature of Science Learning
Learners develop scientific expertise by interpreting the facts, processes, and inquiry skills they encounter in terms of their own experiences and ideas. Experts in science develop richly connected ideas, patterns, and representations over the years and regularly test their views by interpreting complex situations, looking for anomalies, and incorporating new findings. For example, expert physicists use free body diagrams to represent mechanics problems while novices rely on the formulas they learn in class.
Piaget (1971) drew attention to the ideas that students bring to science class such as the notion that the earth is round like a pancake, or that heavier objects displace more volume. Piaget offered assimilation and accommodation as mechanisms to account for the process of knowledge integration. Vygotsky (1962) distinguished spontaneous ideas developed from personal experience such as the view that heat and temperature are the same from the instructed distinction between heat and temperature. Recent research calls for a more nuanced view of knowledge integration by showing that few learners develop a coherent perspective on scientific phenomena: most students develop ‘knowledge in pieces’ and retain incohesive ideas in their repertoire (diSessa 2000). 
Even experts may give incohesive views of phenomena when asked to explain at varied levels of granularity. For example, scientists may have difficulty designing a picnic container to keep food safe even when they have expert knowledge of molecular kinetic theory (Linn and Hsi 2000). Science textbooks, lectures, films, and laboratory experiments often reinforce students’ incoherent views of science; they offer disconnected, inaccessible ideas, and avoid complex personally-relevant problems. For example, many texts expect students to gain understanding of friction by analyzing driving on icy roads. But non-drivers and those living in warm climates find this example unfamiliar and inaccessible. Similarly, research on heat and temperature reveals that difficulties in understanding the particulate nature of matter stand in the way of interpreting the molecular-kinetic model. The engineering-based heat flow model
offers a more descriptive and accessible account of many important aspects of heat and temperature such as wilderness survival or home insulation or thermal equilibrium. Designing instruction to stimulate the interpretive process means carefully selecting new, compelling ideas to add to the views held by students and supporting students as they organize, prioritize, and compare these various ideas. This process of weighing alternative accounts of scientific phenomena can clash with student ideas about the nature of science and of science learning. Many students view science knowledge as established and science learning as memorization. To promote knowledge integration, students need to interpret dynamic examples of science in the making and they need to develop norms for their own scientific reasoning. Students need a nuanced view of scientific investigation that contrasts methodologies and issues in each discipline. General reasoning skills and critical thinking are not sufficient. Instead, students need to distinguish the epistemological underpinnings of methodologies for exploring the fossil record, for example, from those required for study of genetic engineering. They need to recognize pertinent questions for research on earthquake-resistant housing, DNA replication, and molecular modeling. Effective knowledge integration must include an understanding of the ethical and moral dilemmas involved in diverse scientific areas, as well as the nature of scientific advances. Research on how students make sense of science suggests some mechanisms to promote the interpretive process of knowledge integration. Clement (1991) calls for designing bridging analogies to help students make effective connections. For example, to help students understand the forces between a book and a table, Clement recommends comparing the experience of placing the book on a spring, a sponge, a mattress, and possibly, releasing it on air. 
Linn and Hsi (2000) call on designers to create pivotal cases that enable learners to make subtle connections and reconsider their views. For example, a pivotal scientific visualization of the relative rates of heat flow in different materials helped students interpret their personal observations that, at room temperature, metals feel colder than wood. To help students sort out alternative experiences, observations, school ideas, and intuitions, research shows the benefit of encouraging students to organize their knowledge into larger patterns, motivating students to critique alternative ideas, and establishing a taste for cohesive ideas.
1.3 Cultural Context of Science Learning
All learning occurs in a cultural context where communities respond to competing perspectives, confer status on research methods, and establish group norms and expectations. Students benefit from the cultural context of science, for example, when they find explanations provided by peers more comprehensible than those in textbooks or when peers debate alternative views. Expert scientists have well established mechanisms for peer review, norms for publications, and standards for inquiry practices. Group norms are often institutionalized in grant guidelines, promotion policies, and journal publication standards.
The cultural context of the classroom has its own characteristics that may or may not support and encourage the development of cohesive, sustained inquiry about science. Students may enact cultural norms that exclude those from groups underrepresented in science from the discourse (Wellesley College Center for Research on Women 1992, Keller 1983) or limit opportunities for students to learn from each other. Textbooks and standardized tests may privilege recall of information over sustained investigation. Images of science in curriculum materials and professional development programs may neglect societal or policy issues.
Contemporary scientific controversies, such as the current international debate about genetically modified foods, rarely become topics for science classes. As a result, students may develop flawed images of science in the making and lack ability to make a critical evaluation of news accounts of personally-relevant issues. For example, to make decisions about the cultivation and consumption of genetically-modified foods, students would ideally compare the risks from traditional agricultural practices such as hybridization to the risks from genetic modification. They would also weigh issues of economics, world hunger, and individual health. In addition, students studying this controversy would learn to distinguish comments from scientists supported by the agricultural industry, environmental protection groups, and government grants. 
Examining knowledge integration from a cultural perspective helps to clarify the universality of controversy in science and the advantages of creating classroom learning communities that illustrate the role of values and beliefs in scientific work (Brown 1992, Bransford and Brown 1999).
1.4 Deliberative Perspective on Science Learning
Students make deliberate decisions about science learning, their own progress, and their careers. Lifelong science learners deliberately reflect on their views, consider new accounts of scientific problems, and continuously improve their scientific understanding; they seek a robust and cohesive account of scientific phenomena. Designing instruction that develops student responsibility for science learning creates a paradox. In schools, science curriculum frameworks, standards, texts, and even recipe-driven hands-on experiences leave little opportunity for independence. Yet, the curriculum cannot possibly provide all the necessary information about science; instead, students need supervised practice in analyzing their own progress in order to guide their own learning. Many instructional frameworks offer mechanisms leading to self-guided, intentional learning (Linn and Hsi 2000, White and Frederiksen 1998). Vygotsky (1962) drew attention to creating a ‘zone of proximal development’ by designing accessible challenges so that students, supported by instruction and peers, could continue to engage in knowledge integration. Vygotsky argued that, when students encounter new ideas within their zone of proximal development and have appropriate supports, they can compare and analyze spontaneous and instructed ideas, achieve more cohesive understandings, and expand their zone of proximal development. With proper support students can even invent and refine representations of their experimental findings (diSessa 2000). Others have shown that engaging students in guided reflection, analysis of their own progress, and critical review of their own or others’ arguments establishes a more deliberative stance towards science learning (Linn and Hsi 2000, White and Frederiksen 1998).
2. Designing Science Instruction
Design of science instruction occurs at the level of state and national standards, curriculum frameworks for science courses, materials such as textbooks or software, and activities carried out by both students and teachers. In many countries, a tension has emerged between standards that mandate fleeting coverage of a list of important scientific topics and concerns of classroom teachers that students lack opportunity to develop a disposition towards knowledge integration and lifelong science learning. As science knowledge explodes, citizens face increasingly complex science-related decisions, and individuals need to regularly update their workplace skills. To make decisions about personal health, environmental stewardship, or career advancement, students need a firm foundation in scientific understanding, as well as experience interpreting complex problems. To lead satisfying lives, students need to develop lifelong learning skills that enable them to revisit and refine their ideas and to guide their own science learning in new topic areas. Recent research demonstrates the need to design and test materials to be sure they are promoting knowledge integration and setting learners on a path towards lifelong learning. Frameworks for design of science instruction for knowledge integration call for materials and activities that feature accessible ideas, make thinking visible, help students learn from others, and encourage self-monitoring (Bransford et al. 2000, Linn and Hsi 2000, White and Frederiksen 1998).
2.1 Designing Accessible Ideas
To promote knowledge integration, students need a designed curriculum that includes pivotal cases and bridging analogies to help them learn. Rather than asking experts to identify the most sophisticated ideas, designers need to select the most accessible and generative ideas to add to the mix of student views. College physics courses generally start with Newton rather than Einstein; in pre-college courses one might start with everyday examples from the playground rather than the more elegant but less understandable frictionless problems. To make the process of lifelong knowledge integration accessible, students need some experience with sustained, complex inquiry. Carrying out projects such as developing a recycling plan for a school or researching possible remedies for the worldwide threat of malaria engages students in the process of scientific inquiry and can establish lifelong learning skills. Often, however, science courses neglect projects or provide experiences that demand less knowledge integration, such as a general introduction to critical thinking or hands-on recipes for solving unambiguous problems. By using computer learning environments to help guide students as they carry out complex projects, curriculum designers can foster a robust understanding of inquiry (Driver et al. 1996). Projects take instructional time, require guidance for individual students, bring the complexities of science to life, and depend on well-designed questions. Students often confuse variables such as food and appetite, rely on flawed arguments from advertisements or other sources, and flounder because they lack criteria for critiquing their own progress. For example, when students critique projects, they may comment on neatness and spelling, rather than looking for flaws in an argument. Many teachers avoid projects because they have not developed the pedagogical skills necessary to mentor students, deal with the uncertainties of contemporary science dilemmas, or design researchable questions.
Research shows that computer learning environments can make complex projects more successful by scaffolding inquiry, providing help and hints, and freeing teachers to interact with students about complex science issues (Feurzeig and Roberts 1999, Linn and Hsi 2000, White and Frederiksen 1998).
2.2 Making Thinking Visible
Students, teachers, and technological tools can make thinking visible to model the process of knowledge integration, illustrate complex ideas, and motivate critical analysis of complex situations. Learning environments, such as WorldWatcher (http://www.worldwatcher.nwu.edu/), Scientists in Action (http://peabody.Vanderbilt.edu), and the Web-Based Integrated Science Environment (WISE, http://wise.berkeley.edu) guide students in complex inquiry and make thinking visible with scientific visualizations. In these projects, students successfully debate the causes of frog deformities, evaluate the water quality in local streams, and design houses for desert climates. Research demonstrates benefits of asking students to make their thinking visible in predictions, reflections, assessments of their progress, and collaborative debate. Technological learning environments also make student ideas visible with embedded performance assessments that capture ability to critique arguments, make predictions, and reach conclusions. Such assessments help teachers and researchers identify how best to improve innovations and provide an alternative to high stakes assessments (Heubert and Hauser 1998, Bransford et al. 1999).
2.3 Helping Students Learn from Others
When students collaboratively investigate science problems, they can prompt each other to reflect, provide explanations in the language of their peers, negotiate norms for critiques of arguments, and specialize in specific aspects of the problem. Several programs, including Kids as Global Scientists (http://www.onesky.umich.edu/) and Project Globe (http://www.globe.gov/), orchestrate contributions from students around the world to track and compare extreme weather. Technological learning environments support global collaborations as well as collaborative debate and equitable discussion. In collaborative debate (see Science Controversies On-Line: Partnerships in Education, SCOPE, http://scope.educ.washington.edu/), students research contemporary controversies on the Internet with guidance from a learning environment, prepare their arguments often using visual argument representations, and participate in a classroom debate where every student composes a question for each presenter. Teachers have a rich sample of student work to use for assessment. Online scientific discussions engage many more students than do class discussions and also elicit more thoughtful contributions (Linn and Hsi 2000).
2.4 Promoting Autonomy and Lifelong Learning
New pedagogical practices often implemented in computer learning environments can nudge students towards deliberate, self-guided learning. Projects with personally relevant themes capitalize on the intentions of students and motivate students to revisit ideas after science class is over. For example, students who studied deformed frogs brought news articles to their teacher years after the unit was completed. Projects can offer students a rich context, such as the rescue of an endangered animal or the analysis of the earthquake safety of their school, that raises investment in science learning. Students need to monitor their own learning to deal with new sources of science information, such as the Internet, where persuasive messages regularly appear along with public service announcements. Helping students jointly form partnerships where multiple forms of expertise are represented, develop common norms and criteria for evaluating arguments, and deliberately review their progress prepares them for situations likely to occur later in life.
3. Research Methods
Researchers have responded to the intricate complexities in science education with new research methods informed by practices in other design sciences, including medicine and engineering. In the design sciences, researchers create innovations like science curricula, drugs, or machines and study these innovations in complex settings. In education, these innovations, such as technology-enhanced science projects, build on increased understanding of science learning. Research in a design science is typically directed by a multi-disciplinary partnership. In education, partners bring expertise in a broad range of aspects of learning and instruction, including technology, pedagogy, the science disciplines, professional development, classroom activity structures, and educational policy; collaborators often have to overcome perceptions of status differences among the fields represented. By working in partnership, individuals with diverse forms of expertise can jointly contribute to each other’s professional development. Practices for design studies come from design experiments and Japanese lesson study (Brown 1992, diSessa 2000, Lewis 1995). Design studies typically start when the partnership creates an innovation, such as a new learning environment, curriculum, or assessment, and have as their goal the continuous improvement of the innovation. The inspiration for innovative designs can come from laboratory investigations, prior successes, spontaneous ideas, or new technologies. Many technology-enhanced innovations have incorporated elements that have been successful in laboratory studies, like one-on-one tutoring or collaborative learning. Other innovations start with scientific technologies such as real-time data collection (Bransford et al. 1999). In design studies, partners co-design innovations and evaluations following the same philosophy so that assessments are sensitive to the goals of instruction.
Partners often complain that standardized, multiple-choice tests fail to tap progress in knowledge integration and complex scientific understanding. Results from assessment allow the partnership to engage in principled refinement of science instruction and can also inform future designers. Often, innovations become more flexible and adaptive as they are refined, making it easier for new teachers to tailor instruction to their students and curricular goals. Design studies may also be conducted by small groups of teachers engaging in continuous improvement of their instruction, inspired by the Japanese lesson study model. When teachers collaborate to observe each other, provide feedback, and jointly improve their practice, they develop group norms for design reviews. Methodologies for testing innovations in complex settings and interpreting results include an eclectic mix of approaches from a broad range of fields, including classroom observations, video case studies, embedded assessments of student learning, student performance on design reviews, classroom tests, and standardized assessments, as well as longitudinal studies of students and teachers. When partnerships design rubrics for interpreting student work, they develop a common perspective. Design study research has just begun to address the complexities of science education. For example, current investigations reveal wide variation among teachers implementing science projects. In classrooms, some teachers spend up to five minutes with each group of students, while others spend less than one minute per group. In addition, some teachers, after speaking to a few small groups, recognize a common issue and communicate it to their whole class. This practice, while demanding for teachers, has substantial benefits. Developing sensitivity to student dilemmas when complex projects are underway requires the same process of knowledge integration described above and may only emerge in the second and subsequent uses of the project. The design study approach to science instruction succeeds when teachers and schools commit to multiple refinements of the same instructional activities, rather than reassigning teachers and selecting new curriculum materials annually.
4. Emerging Research Topics and Next Steps
Emerging areas for science education research include professional development and policy studies. The interpretive, cultural, and deliberate character of learning applies equally to these fields. Teachers may value lifelong science learning but have little experience connecting the science in the curriculum to their own spontaneous ideas and few insights into how to support this process in students. Teachers taking a knowledge integration approach to instruction face complex questions such as whether to introduce genetics by emphasizing genotypes, phenotypes, the human genome, or a treatment perspective. They may discover that some students come to class believing that two unrelated people who look alike could be twins. Effective professional development should help teachers with these sorts of specific questions rather than providing general grounding in the science discipline or glitzy experiments that confuse rather than inform students. Inspired by the Japanese lesson study approach, more and more research groups are convening and studying collaborative groups of science teachers who face similar instructional decisions. Like science students, science teachers need opportunities and encouragement to build on their spontaneous ideas about learning, instruction, and the nature of science to become proficient in guiding their own professional development. Science policy-makers may also be viewed through this lens of interpretive, cultural, and deliberate science learning. Policy-makers frequently hold ideas about science learning that might be at odds with those held by teachers. The status differences between policy-makers, natural scientists, and classroom teachers can interfere with open and effective communication. Building a cohesive perspective on pedagogy, science teaching, and science learning has proven very difficult in almost every country. The popularity of high-stakes assessments that might be insensitive to innovation underscores the dilemma. Recent news reports that high-stakes assessments are motivating absenteeism, cheating, and unpromising teaching practices compound the problem (Heubert and Hauser 1998). Science educators face many complex, pressing problems including the connection between science and technology literacy; the dual goals of excellence and equitable access to science careers; the tradeoffs between a focus on public understanding of science and career preparation; the role and interpretation of high-stakes tests; and the challenges of balancing the number of topics in the curriculum with the advantages of science project work.
If we form a global partnership for science education and jointly develop a cohesive research program on lifelong learning, we have an unprecedented opportunity to collaboratively design and continuously improve teaching, instruction, and learning. See also: Discovery Learning, Cognitive Psychology of; Gender and School Learning: Mathematics and Science; Scientific Concepts: Development in Children; Scientific Reasoning and Discovery, Cognitive Psychology of; Teaching and Learning in the Classroom; Teaching for Thinking
Bibliography
Bransford J, Brown A L, Cocking R R, National Research Council (US) 1999 How People Learn: Brain, Mind, Experience, and School. National Academy Press, Washington, DC
Brown A 1992 Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. Journal of the Learning Sciences 2(2): 141–78
Bruer J T 1993 Schools for Thought: A Science of Learning in the Classroom. MIT Press, Cambridge, MA
Clement J 1991 Non-formal Reasoning in Science: The Use of Analogies, Extreme Cases, and Physical Intuition. Lawrence Erlbaum Associates, Mahwah, NJ
diSessa A 2000 Changing Minds: Computers, Learning, and Literacy. MIT Press, Cambridge, MA
Driver R, Leach J, Millar R, Scott P 1996 Young People’s Images of Science. Open University Press, Buckingham, UK
Feurzeig W, Roberts N (eds.) 1999 Modeling and Simulation in Science and Mathematics Education. Springer, New York
Heubert J, Hauser R (eds.) 1998 High Stakes: Testing for Tracking, Promotion, and Graduation. National Academy Press, Washington, DC
Keller E F 1983 Gender and Science. W H Freeman, San Francisco, CA
Lewis C 1995 Educating Hearts and Minds: Reflections on Japanese Preschool and Elementary Education. Cambridge University Press, New York
Linn M C, Hsi S 2000 Computers, Teachers, Peers: Science Learning Partners. Lawrence Erlbaum Associates, Mahwah, NJ
Pfundt H, Duit R 1991 Students’ Alternative Frameworks, 3rd edn. Institute for Science Education at the University of Kiel/Institut für die Pädagogik der Naturwissenschaften, Kiel, Germany
Piaget J 1971 Structuralism. Harper & Row, New York
Schmidt W H, Raizen S A, Britton E D, Bianchi L J, Wolfe R G 1997 Many Visions, Many Aims: A Cross-national Investigation of Curricular Intentions in School Science. Kluwer, Dordrecht, The Netherlands
Vygotsky L S 1962 Thought and Language. MIT Press, Cambridge, MA
Wellesley College Center for Research on Women 1992 How Schools Short Change Girls. American Association of University Women Education Foundation, Washington, DC
White B Y, Frederiksen J R 1998 Inquiry, modeling, and metacognition: Making science accessible to all students. Cognition and Instruction 16(1): 3–118
M. C. Linn
Science Funding: Asia
Science funding refers to national expenditure, from both public and private sources, for the institutionalization and promotion of a variety of scientific activities conventionally termed research and development (R&D). These may take the form of basic, applied, and development research undertaken or sponsored across a range of science and technology (S&T) institutions in national S&T innovation systems. In the postwar era, the concept of science funding assumed considerable importance in the national policies of Asian countries. The government, both as a patron and as a powerful mediator, played a significant part in shaping the structure and direction of science funding. In the latter half of the twentieth century, the belief in science as a symbol of ‘progress’ was transformed into an established policy doctrine in the Asian region. Creating wealth from knowledge and achieving social, political, and military objectives came to be closely associated with deliberately fostering S&T activities through funding scientific research.
In the mid-1990s, the Asian region accounted for 26.9 percent of the world’s gross expenditure on research and development (GERD), which was US$470 billion in 1994. While Japan and newly industrializing countries (NICs) accounted for 18.6 percent, South East Asia, China, India, and other South Asian countries accounted for 0.9 percent, 4.9 percent, 2.2 percent, and 0.3 percent respectively. Within the Asian region, whereas Japan and NICs accounted for about 69 percent, other countries accounted for 31 percent of total science funding.
Three main interrelated approaches underlie the concept of science funding in Asian countries. The first approach underlines the importance and essential nature of public or government funding of R&D as a ‘public good’ that improves the general interest and welfare of the society or nation as a whole. The public good approach emphasizes the funding and generation of knowledge that is non-competitive, open for public access, and non-exclusive. As private firms and the market are unable to capture all of the benefits of scientific research, they often tend to under-invest in R&D, which is also a key component of the process of innovation. It is in this context that public funding becomes important by bearing the social costs of generating scientific knowledge or information. Further, numerous studies have shown that public funding of scientific research as a public good yields several social and economic benefits (see Arrow 1962, Pavitt 1991). In contrast to institutionalized and ‘codified’ forms of knowledge, scientific capacity in the form of ‘tacit knowledge’ is seen to be embodied in research personnel trained over a period of time.
Thus, publicly supported training in higher educational settings and networks of professional associations and academies form an important component of science funding.
A closely related second approach to science funding is state sponsorship of military and defense-related strategic R&D. Even though there is considerable spinoff from military R&D to the non-military sectors of the economy, the main rationale for the importance given to such research is national security. Most countries in the world, including those in the Asian region, spend more money on military and defense-related scientific research activities than in the civilian R&D domain.
The third approach essentially recognizes the importance of private sources of funding for scientific research. Although private patrons have always played an important part in supporting S&T-related research activities, the postwar decades, particularly the 1980s and 1990s, witnessed a remarkable shift from public to private sources of funding of science. The significance of private science funding has grown since the 1980s, as the ‘linear model,’ which was based on the perceived primacy of basic research, began to lose its credibility. Insights from studies on the economics of innovation came to substantiate the view that it is ‘downstream’ development spending in the spectrum of R&D ‘that plays a crucial role in determining who gets to capture the potential rents generated by scientific research’ (Rosenberg 1991, p. 345). Private industrial firms, rather than government funding, dominate this segment of R&D in Asia as in other regions. Second, the importance of private industry as a source of science funding attracted considerable attention from the mid-1980s, as the success of Japan and other East Asian NICs came to be analyzed from the perspective of how these countries transferred the burden of science funding from public to private sources.
As the expenditure on military R&D is largely met by governments, the pattern of GERD in the Asian region can be explored in terms of public and private sources. As Table 1 shows, three broad subregions are discernible in the Asian region insofar as the source of science funding is concerned. While private industry accounts for approximately two-thirds, government accounts for one-third of the total R&D expenditure in the case of Japan and NICs. Private funding of science has increased markedly in South Korea. Private sources accounted for about 2.3 percent of GDP, that is, 80 percent of total R&D funds, which is one of the highest levels in the world. In Japan, Singapore, and Taiwan, over 65 percent of total R&D spending is financed by private industry. Much of this transformation is the result of state mediation and institutional mechanisms which offer appropriate incentives for the private sector to invest in R&D.

Table 1
GERD by source of funding (percent of GERD) from the late 1970s (A) to the mid-1990s (B)

            Govt          Private industry    Other nat. sources
            A      B      A      B            A      B
Japan       30     22     70     67           -      10
NICs        43     33     52     63           3      4
SE Asia     86     58     14     27           -      15
India       89     82     12     16           -      1
China       100    82     -      15           -      3
OSA         85     78     6      8            7      14

Source: World Science Report, Unesco, Paris, 1998; CAST ASIA II, 22–30 March 1982. OSA: Other South Asian countries include Bangladesh, Nepal, Myanmar, Pakistan, Sri Lanka, and Mongolia. NICs include South Korea, Taiwan, Singapore, and Hong Kong. South East Asia includes Thailand, Philippines, Malaysia, Indonesia, and Vietnam.

In the second subregion of South East Asia, there has been a perceptible change in private industry’s share of total R&D, which increased from 14 percent to 27 percent, whereas the government’s share witnessed a decline from 86 percent to 58 percent between the late 1970s and the mid-1990s. In Southern Asia, which comprises China, India, and other South Asian countries, the government continues to be the main patron for science funding. The private sector accounts for less than 16.5 percent. In China and India, while the proportion of private funding of R&D increased to 15 percent and 4.4 percent respectively between the late 1970s and the mid-1990s, the private sector is likely to play a more dominant role in the 2000s. This is because these countries are witnessing considerable foreign direct investment and economic restructuring which are fostering liberalization and privatization. Except in the case of the Philippines and Sri Lanka, where about a quarter of total R&D funding is met by foreign sources, foreign funding hardly plays a significant role in the Asian region as a whole.

Table 2
GERD and military expenditure as a percentage of GDP

                         GERD    Military expenditure
                         1994    1985    1995
Japan                    3.02    1.00    1.00
NICs                     1.60    5.45    4.05
SE Asia                  0.50    2.95    2.20
India                    0.80    3.50    2.40
China                    0.50    4.90    2.30
Other Asian countries    0.38    4.60    3.50

Source: World Science Report, Unesco, Paris, 1998; World Development Reports 1998/1999 and 1999/2000.

Although there is no established correlation between GERD as a percentage of GDP and economic growth rates, this indicator has assumed great significance in S&T policy discourse in the 1990s. Whereas the industrially advanced countries spend between 2 and 3 percent of GDP on R&D activities, middle-income countries such as the NICs and some others in the South East Asian region spend over 1.5 percent. The poorer developing countries spend less than 1 percent and the countries falling at the bottom of the economic and human development rank index spend even less than 0.5 percent. This general trend is
also largely borne out in Asia, as shown in Table 2. However, the relation between different types or forms of R&D and economic growth has also come into sharper focus. Even though the share of private industry in overall R&D effort is emerging as one of the determinants of economic dynamism, some studies draw attention to various national technological activities (in relation to the national science base) which have a direct bearing on national economic performance measured in terms of productivity and export growth (see Pavitt 1998, p. 800). For instance, as shown in Table 3, the most economically dynamic NICs such as Taiwan and South Korea began with relatively weak scientific strength, as did India in the early 1980s. Nevertheless, they outperformed India by the early 1990s in terms of the change in their share of world publications, as well as in the share of registered patents in the USA. Taiwan and South Korea not only spent a much higher proportion on R&D as a percentage of GDP, but the privately dominated R&D structure was such that it gave high priority to patenting. In actual terms, while Japan and South Korea filed 335,061 and 68,446 patents in their respective countries, India filed only 1545 patents in 1995–6. Further, whereas high technology exports as a proportion of total manufacturing exports in Japan, South Korea, Hong Kong, and Singapore registered 39 percent, 39 percent, 27 percent, and 71 percent respectively, India registered hardly 10 percent in 1996, which is much lower than the figure for China (21 percent) for the same year.
As indicated in Table 2, the military burden is considerable in most of the Asian countries. Even though there was a drastic reduction in military expenditure as a percentage of GDP between 1985 and 1995 in the region, all the Asian countries and subregions still spend three to four times more on military activities than on civilian R&D.
While Pakistan maintained over 6 percent, Sri Lanka increased its military expenditure as a percentage of GDP from 2.9 percent to 4.6 percent between 1985 and 1995. Japan is the only country in the Asian region that limited its military expenditure to 1 percent of GDP while spending over three times this figure on civilian R&D (3.1 percent) during the same period.
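The comparison drawn from Table 2 can be checked with a few lines of arithmetic. What follows is a minimal sketch in Python, assuming the values are read from Table 2 as printed (GERD for 1994, military expenditure for 1985 and 1995, all as a percentage of GDP). The dictionary layout and the function name military_to_gerd_ratio are illustrative choices, not part of the source. Because the two series refer to different years, and GERD may itself include some defense-related R&D, the ratios are only rough indicators.

```python
# Values transcribed from Table 2 (percent of GDP). The data layout and the
# function name are hypothetical, chosen only for this illustration.
TABLE_2 = {
    # region: (GERD 1994, military 1985, military 1995)
    "Japan": (3.02, 1.00, 1.00),
    "NICs": (1.60, 5.45, 4.05),
    "SE Asia": (0.50, 2.95, 2.20),
    "India": (0.80, 3.50, 2.40),
    "China": (0.50, 4.90, 2.30),
    "Other Asian countries": (0.38, 4.60, 3.50),
}


def military_to_gerd_ratio(table):
    """Return {region: 1995 military share / 1994 GERD share}, rounded to 1 decimal."""
    return {
        region: round(mil_1995 / gerd_1994, 1)
        for region, (gerd_1994, _mil_1985, mil_1995) in table.items()
    }


ratios = military_to_gerd_ratio(TABLE_2)
for region, ratio in ratios.items():
    print(f"{region:<22} {ratio:>5.1f}")
```

On these figures Japan is the only entry whose ratio falls below 1, in line with the text; the ratios for the other entries all exceed 1 and vary considerably from one subregion to another.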
Table 3
Trends in scientific and technological performance in selected Asian countries

Country        Change in share of       Change in share of    Publications per million
               world publications,      US patents,           population, 1980–84
               1993/1982                1993/1983
Taiwan         5.97                     12.81                 23.3
South Korea    5.45                     29.79                 8.00
Singapore      3.53                     3.20                  71.6
Hong Kong      2.37                     2.42                  45.90
India          0.83                     2.45                  18.10

Source: As given in Pavitt (1998, p. 800).
In terms of the sectoral focus of R&D funding in the Asian region, a contrasting picture emerged. Japan and NICs directed their science funding to high technologies, capital, exports of engineering goods, advanced materials, modern biology, and basic research in information and communication technologies. In the 1990s, Japan, South Korea, and Singapore registered greater proportions of private sector funding than even Germany and the USA. Japan’s R&D expenditures have increased eightfold over the 20-year period from 1971 to 1993, which is the highest rate of increase among the industrially developed nations. At the other extreme are the Southern Asian countries, including China, India, Pakistan, Bangladesh, Nepal, Myanmar, and Sri Lanka, where R&D funding related to agriculture and to the manufacturing sector assumed equal importance, as over 60 percent of their populations were dependent on agriculture. Another revealing feature about countries such as China, India, and Pakistan was the importance given to military and strategic R&D, which consumed 45 to 55 percent of the total R&D budget. In most of the South East Asian countries, the contribution of agriculture to GDP witnessed a considerable decline in contrast to Southern Asian countries. In terms of sectoral contribution to GDP, agriculture accounted for no more than 26 percent (the figure for Vietnam) in any South East Asian country in 1998. The manufacturing and service sectors witnessed unprecedented growth rates between 1980 and 1998. Though these countries spend no more than 0.5 percent of GDP on R&D, the thrust of science funding is directed to manufacturing, industrial, and service-related activities. Agricultural research consumes only 5–10 percent of the total R&D funding, much less than in the Southern Asian countries.
See also: Infrastructure: Social/Behavioral Research (Japan and Korea); Research and Development in Organizations; Research Ethics: Research; Research Funding: Ethical Aspects; Science, Economics of; Science Funding: United States; Science, Social Organization of; Scientific Academies in Asia; Universities and Science and Technology: Europe; Universities and Science and Technology: United States
Bibliography

Arrow K 1962 Economic welfare and the allocation of resources for invention. In: Nelson R (ed.) The Rate and Direction of Inventive Activity. Princeton University Press, Princeton, NJ, pp. 609–25
Pavitt K 1991 What makes basic research economically useful. Research Policy 20, pp. 109–19
Pavitt K 1998 The social shaping of the national science base. Research Policy 27, pp. 793–805
Rosenberg N 1991 Critical issues in science policy research. Science and Public Policy 18, pp. 335–46
UNESCO 1998 World Science Report. Paris
World Development Report 1998/1999
World Development Report 1999/2000
V. V. Krishna
Science Funding: Europe

European governments invest considerable sums of money in science. This article examines the reasons why they do this, covering briefly the historical context of European science funding and highlighting current issues of concern. The focus is on government funding of science, rather than funding by industry or charities, since government has historically been the largest funder of ‘science’ as opposed to ‘technology.’ As an approximate starting point, ‘science’ refers to research that is undertaken to extend and deepen knowledge rather than to produce specific technological results, although the usefulness of this distinction will be questioned below. ‘Science policy’ means the set of objectives, institutions, and mechanisms for allocating funds to scientific research and for using the results of science for general social and political objectives (Salomon 1977). ‘Europe’ here refers only to Western Europe within the European Union, excluding the Eastern European countries.
1. Background

Government funding of science in Europe started in a form that would be recognizable today only after World War II, although relations between science and the state can be traced back at least as far as the Scientific Revolution in the seventeenth century (Elzinga and Jamison 1995). The history of science funding in Europe can be summarized broadly as a movement from a period of relative autonomy for scientists in the postwar period, through stages of increasing pressures for accountability and relevance, resulting in the situation today, where the majority of scientists are encouraged to direct their research towards areas that will have some socially or industrially relevant outcome.

However, this account is too simplistic. Many current concerns about science funding are based on the idea that ‘pure’ or ‘basic’ science (autonomous research concerned with questions internal to the discipline) is being sacrificed in favor of ‘applied’ research (directed research concerned with a practical outcome), incorrectly assuming that there is an unproblematic distinction between the two (see Stokes 1997). Looking back, it can be seen that even in the late 1950s there were expectations that science should provide practical outcomes in terms of economic and social benefits, and the work that scientists were doing
at this time was not completely ‘pure,’ because much of it was driven by Cold War objectives. This is a demonstration of the broader point that in science policy, the categories used to describe different types of research are problematic, and one must be careful when using the traditional terminology.

With these caveats in place, it is possible to trace the major influences on European science funding. In the 1950s and 1960s, much of the technologically oriented funding of research was driven by military objectives and attempts to develop nuclear energy. In terms of science funding, this was a period of institutional development and expansion in science policy (Salomon 1977). The autonomy that scientists enjoyed at this time was based on the assumption that good science would spontaneously generate benefits. Polanyi (1962) laid out the classic argument to support this position, describing a self-governing ‘Republic of Science.’ He argued that because of the essential unpredictability of scientific research, government attempts to direct science would be counterproductive because they would suppress the benefits that might otherwise arise from undirected research. This influential piece can be seen as a response to Bernal’s work (1939), which was partly influenced by the Soviet system and which argued that science should be centrally planned for the social good.

Another important concept of the time was the ‘linear model’ propounded by US science adviser Vannevar Bush (1945). In this model for justifying the funding of science, a one-way conceptual line was drawn leading from basic research to applied research to technological innovation, implying that the funding of basic research would ultimately result in benefits that would be useful to society. But pressures on science from the rest of society were increasing.
In the 1970s, there was a growing awareness of environmental problems (often themselves the results of scientific and technological developments), and European countries experienced the oil crises, with accompanying fiscal restrictions. There were increasing pressures on scientists to be accountable for the money they were spending on research. Also at this time the social sciences, especially economics, provided new methods for understanding the role of scientific research in industrial innovation and economic growth (see Freeman 1974). In the 1980s Europe realized it had to respond to the technological and economic challenges of Japan and the US, and because of the ending of the Cold War, military incentives for funding science were no longer so pressing. Technology, industrial innovation, and competitiveness were now the main reasons for governments to fund science. Academic studies of innovation also began to question the linear model of the relationship between science and technology, described above, arguing that the process was actually more complicated (e.g., Mowery and Rosenberg 1989). This led to pressures on the previous ‘contract’
between government and scientists (Guston and Keniston 1994), which had been based on the assumptions of the linear model. Rather than presuming that science would provide unspecified benefits at some unspecified future time, there were greater and more specific expectations of scientists in return for public funding. This was accompanied by reductions in the growth of science budgets, producing a ‘steady state’ climate for scientific research, where funding was not keeping up with the rapid pace at which research was growing (see Ziman 1994). Science policy work at this time produced tools and data for measuring and assessing science. Various techniques were developed, such as technology assessment, research evaluation, technology management, indicator-based analysis, and foresight (Irvine and Martin 1984). In the 1990s, there was greater recognition of the importance of scientific research for innovation, with the development of new hi-tech industries that relied on fundamental scientific developments (such as biotechnology), in conjunction with other advanced technologies. There were also growing pressures for research to be relevant to social needs. Gibbons et al. (1994) argued that the 1990s have witnessed an increasing emphasis on problem-oriented, multidisciplinary research, with knowledge production having spread out to many diverse locations, and that distinctions between basic and applied science, and between science and technology, are becoming much more difficult to make.
2. The Influence of the European Union

Moving from a general historical context to look more specifically at the European level shows that research funding from the European Union (EU), in the form that it currently takes, did not start until 1984 with the first ‘Framework Programme.’ This funded pre-competitive research (i.e., research that is still some way from market commercialization) following an agenda influenced by industrial needs (Sharp 1997). From the 1960s onward, the Organization for Economic Cooperation and Development (OECD) had been a more influential multinational organization than the EU in terms of national science policies (Salomon 1977). In particular, the OECD enabled countries to compare their research activities with those of other countries, and encouraged greater uniformity across nations.

Currently EU research funding comprises only a few percent of the total research funding of all the member states (European Commission 1994), although it has been more important in the ‘less favored’ regions of Europe (Peterson and Sharp 1998). Consequently, in terms of science funding, the national sources are more important than the EU. However, EU programs do have an influence on the funding priorities of national governments. In theory, the EU
does not fund research that is better funded by nation states (according to the ‘principle of subsidiarity,’ see Sharp 1997), so it does not fund much basic research, but is primarily involved in funding research that is directed towards social or industrial needs. The most important impact of the EU has been in stimulating international collaboration and helping to form new networks, encouraging the spread of skills. One of the requirements of EU funded projects is that they must involve researchers from at least two countries (Sharp 1997). This could be seen as part of a wider political project that is helping to bind Europe together. It is possible that many of these collaborations might have happened without European encouragement because of a steady rise in all international collaborations (Narin et al. 1991). However, it is likely that through its collaborative programs and their influence, the EU will play an increasingly important role in the future of research funding in the member countries (Senker 1999).
3. Individual Countries in Europe

Since it is the individual countries in Europe that are responsible for the majority of science funding, the organization of their research systems deserves attention. All the countries have shown the general trends outlined above, but the historical and cultural differences among the European nations lead to considerable diversity in science funding arrangements. It is possible to compare the different countries by looking at the reasons why they fund science and the ways in which research is organized.

European nations, like those elsewhere, have traditionally funded science to encourage economic development, although most countries also attach importance to advancing knowledge for its own sake. Some countries such as Sweden and Germany have emphasized the advancement of knowledge, and other countries, such as Ireland, have put more emphasis on economic development (Senker 1999). Since the 1980s, the economically important role of science has been emphasized in every country. This has often been reflected at an organizational level with the integration of ministerial responsibilities for science funding with those for technology and higher education.

We can compare individual countries in terms of differences in the motivations behind funding research. Governments in France and Italy have traditionally promoted ‘prestige’ research, and have funded large technology projects, such as nuclear energy. These reasons for funding research, even though they are less significant in the present climate, have had long-lasting effects on the organization of the national research systems. The UK is notable in that the importance of science for economic competitiveness is
emphasized more than in other European countries, and industrial concerns have played a larger role (Rip 1996).

Organizational differences between countries can tell us something about the way research funding is conceptualized and can reflect national attitudes toward the autonomy and accountability of researchers. In the different European countries, the locus of scientific research varies. In some countries, the universities are most important (e.g., Scandinavia, the Netherlands, the UK), and funds are competed for from research councils (institutions that mediate between scientists and the state, see Rip 1996). In this type of system there will usually be some additional university funding that provides the infrastructure, and some of the salaries. The level of this funding varies between countries, which results in differences in scientists’ dependence on securing research council funds and has implications for researcher autonomy. In other countries, a great deal of scientific research is carried out in institutions that are separate from the universities (e.g., France and Italy).

The situation is not static, and scientific research in the university sector has been growing in importance across the whole of Europe (Senker 1999). For example, in France, the science funding system has traditionally been centralized with most research carried out in the laboratories of the Centre National de la Recherche Scientifique (CNRS). Now the situation is changing and universities are becoming more involved in the running of CNRS labs, because universities are perceived to be more flexible and responsive to user needs (Senker 1999). Germany is an interesting case because there is a diversity of institutions involved in the funding of science. There is a division of responsibility between the federal state and the Länder, which are responsible for the universities.
There are also several other types of research institute, including the Max Planck institutes, which do basic research, and the more technologically-oriented Fraunhofer institutes. Resulting institutional distinctions between different types of research may lead to rigidities in the system (Rip 1996). In all countries in Europe, there is an attempt to increase coordination between different parts of the national research system (Senker 1999).
4. Current Trends

As has been emphasized throughout, European governments have demanded increasing relevance of scientific results and accountability from scientists in return for funding research. Although the situation is complex, it is clear that these pressures, and especially the rhetoric surrounding them, increased significantly during the 1990s. This has led to worries about the place for serendipitous research in a ‘utilitarian–instrumental’ climate (Nowotny 1997, p. 87).
These pressures on science to be useful are not the only notable feature of the current funding situation. The views of the public are also becoming more important in decisions concerning the funding of science. The risks and detrimental effects of science are of particular concern to the public, possibly because the legitimacy of the authority of politicians and scientists is being gradually eroded (Irwin and Wynne 1996). Throughout Europe, there has been a growth in public distrust in technological developments, which has led to pressures for wider participation in the scientific process. This is related to the current (and somewhat desperate) emphasis on the ‘public understanding of science,’ which is no longer simply about educating the public in scientific matters, but has moved towards increasing participation in the scientific process (see Gregory and Miller 1998).

Concerns about the environmental effects of scientific developments can be traced back to the 1960s, but recent incidents in the 1980s and 1990s have led to a more radical diminution of public faith in scientific experts (with issues such as climate change, Chernobyl, and genetically modified foods). The public distrust of science may also be due to the fact that scientists, by linking their work more closely either to industrial needs or to priorities set by government, are losing their previously autonomous and potentially critical vantage point in relation to both industry and government.

Certain European countries, especially the Netherlands and Scandinavia, which have a tradition of public participation, are involving the public more in debates and priority setting on scientific and technological issues. This has been described as a ‘postmodern’ research system (Rip 1996). As the distinction between science and technology becomes less clear, in this type of research system there is also a blurring of boundaries between science and society.
5. Implications

An implication of these changes in science funding is that the growing importance of accountability and of the role of the public in scientific decisions may have epistemological effects on the science itself, since scientific standards will be determined not only by the scientific community but by a wider body of actors often with divergent interests (Funtowicz and Ravetz 1993). If norms are linked to institutions, and if institutions are changing because of the greater involvement of external actors in science, and of science in other arenas, then the norms may be changing too (Elzinga 1997). This is an issue that was touched on in the 1970s and 1980s by the proponents of the ‘finalization thesis’ who argued that, as scientific disciplines become more mature, they become more amenable to external
steering (Böhme et al. 1983). The importance of external influences leads to worries about threats to traditional values of what constitutes ‘good’ science (Elzinga 1997, see also Guston and Keniston 1994 for US parallels). There may be an emergence of new standards of evaluation of scientific research.

European science funding has changed considerably since it was institutionalized, partly because of its success in generating new technologies and partly because of its failures and their social consequences. It is becoming more difficult to categorize science, technology, and society as separate entities (Jasanoff et al. 1995), or to think of pure scientists as different from those doing applied research. Wider society has become inextricably linked with the progress of science, and the demands placed on scientists and science-funding mechanisms are starting to reflect this restructuring. This tendency is likely to continue into the future.

See also: Infrastructure: Social/Behavioral Research (Western Europe); Kuhn, Thomas S (1922–96); Research and Development in Organizations; Research Funding: Ethical Aspects; Science, Economics of; Science Funding: Asia; Science Funding: United States; Science, Social Organization of; Universities and Science and Technology: Europe
Bibliography

Bernal J D 1939 The Social Function of Science. Routledge, London
Böhme G, Van Den Daele W, Hohlfeld R, Krohn W, Schäfer W 1983 Finalization in Science: The Social Orientation of Scientific Progress. Reidel, Dordrecht, The Netherlands
Bush V 1945 Science: The Endless Frontier. USGPO, Washington, DC
Elzinga A 1997 The science–society contract in historical transformation. Social Science Information 36: 411–45
Elzinga A, Jamison A 1995 Changing policy agendas. In: Jasanoff S, Markle G E, Petersen J, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
European Commission 1994 The European Report on Science and Technology Indicators. European Commission, Luxembourg
Freeman C 1974 The Economics of Industrial Innovation. Penguin, Harmondsworth, UK
Funtowicz S O, Ravetz J R 1993 Science for the post-normal age. Futures 25: 739–56
Gibbons M, Limoges C, Nowotny H, Schwartzman S, Scott P, Trow M 1994 The New Production of Knowledge. Sage, London
Gregory J, Miller S 1998 Science in Public: Communication, Culture and Credibility. Plenum, New York
Guston D, Keniston K 1994 The Fragile Contract: University Science and the Federal Government. MIT Press, London
Irvine J, Martin B 1984 Foresight in Science: Picking the Winners. Pinter, London
Irwin A, Wynne B (eds.) 1996 Misunderstanding Science? The Public Reconstruction of Science and Technology. Cambridge University Press, Cambridge, UK
Jasanoff S, Markle G E, Petersen J, Pinch T (eds.) 1995 Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Mowery D, Rosenberg N 1989 Technology and the Pursuit of Economic Growth. Cambridge University Press, Cambridge, UK
Narin F, Stevens K, Whitlow E 1991 Scientific co-operation in Europe and the citation of multinationally authored papers. Scientometrics 21: 313–23
Nowotny H 1997 New societal demands. In: Barre R, Gibbons M, Maddox J, Martin B, Papon P (eds.) Science in Tomorrow’s Europe. Economica International, Paris
Peterson J, Sharp M 1998 Technology Policy in the European Union. Macmillan, Basingstoke, UK
Polanyi M 1962 The republic of science: its political and economic theory. Minerva 1: 54–73
Rip A 1996 The post-modern research system. Science and Public Policy 23: 343–52
Salomon J J 1977 Science policy studies and the development of science policy. In: Spiegel-Rösing I, Solla Price D (eds.) Science, Technology and Society: A Cross-disciplinary Perspective. Sage, London
Senker J 1999 European Comparison of Public Research Systems. Report prepared for the European Commission. SPRU, Sussex, UK
Sharp M 1997 Towards a federal system of science in Europe. In: Barre R, Gibbons M, Maddox J, Martin B, Papon P (eds.) Science in Tomorrow’s Europe. Economica International, Paris
Stokes D E 1997 Pasteur’s Quadrant: Basic Science and Technological Innovation. Brookings Institution Press, Washington, DC
Ziman J 1994 Prometheus Bound. Cambridge University Press, Cambridge, UK
J. Calvert and B. R. Martin
Science Funding: United States

The pursuit of knowledge has historically been shaped by patronage relationships. The earliest scientists were astronomers and mathematicians supported at ancient courts in the Middle East and China. In Renaissance Europe, observers of and experimenters with nature had patrons in the aristocracy or royal families. In nineteenth-century America, industrial philanthropists took up their cause. Since World War II, research in the United States has received its support largely from government, with help from industry and private foundations.

This article focuses on the postwar history of government funding for research in the United States. That history has been characterized by a creative tension between autonomy and accountability, embodied in successive waves of invention of new institutional arrangements. Should science set its own agenda, or should it answer to the public? Could the two goals be reconciled? These issues have echoed through five decades of research policy.
1. Autonomy for Prosperity

In the eighteenth and nineteenth centuries, the US federal government funded research only if it was applied directly to practical goals. The earliest federal efforts were in surveying and geological exploration, activities that contributed both to nation-building and to the search for mineral wealth. A second major wave of federal effort in the latter half of the nineteenth century took up agricultural research in partnership with the States, through land grant colleges and agricultural experiment stations. Scientists were also brought in to solve the immediate problems of wartime. World War I was known as ‘the chemists’ war,’ because of the use of chemical warfare agents. World War II became ‘the physicists’ war,’ with the invention of the atomic bomb. And in the 1930s, a small federal laboratory for health research was established, which later became the National Institutes of Health, the nation’s research arm for biomedical sciences (Dupree 1964, Kevles 1978, Strickland 1989).

By the time of World War II, these efforts had grown into a set of research programs in mission agencies, focused on food, health, and defense. These activities were carried out largely in government laboratories, and were focused on immediate, practical goals. In later decades, two other agencies joined this group. The National Aeronautics and Space Administration (NASA) was formed in response to the launch of the Soviet satellite Sputnik in 1957 (Logsdon 2000). And the Department of Energy, when it was established in the 1970s, incorporated major research elements, including the high-energy physics laboratories. This array of government research efforts continues to anchor the mission-oriented portion of a pluralistic system of funding for US research.

The other dimension of that system is fundamental research. It is anchored in universities, and interwoven with graduate education and the preparation of new generations of researchers.
This second dimension has its origins in moments of national crisis, when government research efforts were expanded temporarily through the cooptation of university researchers. After such an interlude in World War I, influential US scientists tried to convince the federal government to keep the research relationship with universities going, but to no avail. It was only after the success of the crucial atomic bomb project that the achievements of science carried enough weight to make credible proposals for permanent, across-the-board government support for research (Kevles 1978). The spokesperson for this plan was Vannevar Bush, a university scientist who had been active in the government projects around the war. He called for the formation of a National Research Foundation, to provide basic support for research, without the specific targets that had been imposed during the war. He argued that unfettered research, carried out largely in
Figure 1 US basic research spending, 1953–1998, total and Federal
universities, would build a base of human and knowledge resources that would help solve problems across the range of challenges in health, defense, and the economy (Bush 1990 [1945]). But to make these contributions, researchers needed freedom—both freedom in the laboratory to choose and pursue research problems, and organizational freedom at the level of government agencies, to set the larger research agenda in directions that were free from political control. Bush’s model of the relationship between science and society has been called the ‘autonomy for prosperity’ model (Cozzens 2001). Most of the experimentation over the next few decades with ways to maintain autonomy while being useful took the form of add-ons to this model. Bush’s model built a protective shell of organizational autonomy around the agencies that provided funds for basic research. The National Science Foundation (NSF) and the National Institutes of Health (NIH), both of which grew very fast in the postwar period, developed strategies that insulated direct decisions around research from the political context. One funding mechanism that embodied the essence of the autonomy-for-prosperity model in its early days was the science development program. Under these
programs in the 1950s and 1960s, federal block grants allowed universities to build their educational and research capacities, with very few strings attached (Drew 1985). The second strategy was the project grant/peer review system of funding. Under this mechanism, the government hands over to researchers both the choice of research topics and the judgment of what is to count as quality. ‘Peer review’ for project selection became both the major form of quality control in the federal funding system and an important symbol of scientific freedom (Chubin and Hackett 1990).

Ironically, the autonomy-protecting institutional shell that Bush and his successors designed undermined the practical effectiveness that had won credibility for his plan to start with. As government funding for research grew rapidly in the 1950s, beyond the scale of Vannevar Bush’s wildest dreams, the institutional shell unintentionally created a protected space where new researchers could do their work without any grounding in the practical problems of the world. As soon as the practical grounding was lost, a gap was created between science and society that needed to be bridged if autonomy was really going to be turned into prosperity.
2. Knowledge Transfer and Knowledge Mandating

Among the first set of funding methods developed to bridge this gap were knowledge transfer mechanisms. Knowledge transfer mechanisms do not threaten either laboratory or organizational autonomy, because they leave the protective shell in place while focusing on diffusing or disseminating research-based knowledge. For example, in the 1960s and 1970s, many federal programs addressed the ‘information explosion’ with ‘science information services,’ under the rubric of ‘information policy.’ The government encouraged first journals, then abstracting and indexing services, to provide access to the exploding journal literature, and later added extension services. The emphasis was on providing infrastructure for communication, not shaping its content; science was to speak to the public, not with it. These mechanisms protect research autonomy by segmenting knowledge processes over time.

As the size of the research enterprise continued to grow, however, the pendulum swung back and the accountability issue inevitably arose. Elected representatives of the public asked ‘What are we getting for all this spending?’ In the absence of specific answers, a backlash eventually appeared. Policymakers began to demand that research solve societal problems directly, rather than through the diffuse claim of ‘autonomy for prosperity.’ In 1969, for example, Congress passed ‘the Mansfield amendment,’ limiting support from the Department of Defense to goal-directed research. The Nixon administration phased out science development programs. And in the 1970s, several ‘knowledge mandating’ programs appeared. At NSF, for example, the Program of Research Applied to National Needs (RANN) grew out of a Presidential concern about ‘too much research being done for the sake of research’ (Mogee 1973). At NIH, the ‘War on Cancer’ emerged during the same period (Strickland 1989, Rettig 1977).
In their more programmed forms, such programs threatened both individual and organizational autonomy. But even in softer forms, in which money was only redirected from one priority to another, they still threatened organizational autonomy, since external forces were dictating research directions. There is an important lesson in the history of these programs: over time, organizational autonomy won out over societal knowledge mandating. RANN was abolished, and over the years, both NSF and NIH have found ways to look programmatic without doing central planning.
3. Knowledge Sharing

By the mid-1970s, a third approach began to be developed in the United States to bridge the gap between the science produced inside the protective
institutional shell and the problems of the world surrounding it. This new form, which can be called ‘knowledge-sharing,’ threatened neither the individual autonomy of researchers nor the organizational autonomy of funding organizations. The approach first took the form of an emphasis on partnerships, and within that emphasis, an early focus on partnerships with industry. New centers involved industry partners in setting strategic plans. The interchange of people and information became the rule. The critical element in these was a two-way dialog, which replaced the one-way science-to-society diffusion of information of knowledge transfer and the one-way society-to-science mechanism of knowledge mandating. In the two-way dialog, scientists became strategic thinkers, able to formulate their own problems creatively, but steeped again, as in the 1950s, in the problems articulated by some external set of partners.

Another new element was the explicit link to education, at both graduate and undergraduate levels. The Engineering Research Centers of the NSF, for example, were intended to produce a ‘new breed’ of engineers, better prepared than previous generations to participate in R&D in industry because they understood the needs and culture of industry.

This new partnership model raised questions: Was knowledge mandating being changed from a public function to a private one under this scheme? A change in the law that allowed universities to hold intellectual property rights in results produced under government grants further heightened these concerns. Public knowledge seemed to be on a path to privatization. At the same time, several well-publicized instances of accusations of fraud in science seemed to undermine the trusting relationships among citizens, government, and science (Guston 1999).

In the late 1980s, however, a second crucial step in the development of the partnership model took place. This was a step back toward the public.
Centers were urged to form partnerships, not only with industry, but also with State and local governments, citizen groups, and schools. The benefits that industry gained from earlier collaborations were now available to other parts of society, in a new pattern that has been called ‘partner pluralism’ (Cozzens 2001). Strategic plans began to be shaped by two-way dialog across many sectors, and the education of a new breed of researcher able to bridge the cultures of university, school, and public service sector began. Another manifestation of knowledge-sharing and an attempt to restore public trust arrived with a round of attention among researchers to public awareness of science and science education. The 1990s heard a new note in the discussions of the public among science leaders. Research leaders stressed that scientists need to learn more about the public, as well as the public learning more about science. Even the venerable National Academy of Sciences recommended that the institutions of science open their doors to the public,
urging, for example, that NIH take more advice from the public in priority-setting. These were signs of the beginning of a two-way dialog.
4. The Knowledge Society
As US research entered the twenty-first century, many believed that knowledge is the key resource in the new economy. For a society to be innovative, however, its creative capacity must be widely shared. This goal will not be achieved unless its scientists become strategic thinkers steeped in society’s problems and issues, and government funding agencies remain public through partner pluralism. Accountability for research in the knowledge society is achieved through the engagement of many societal actors in the research enterprise. By placing many actors on an equal footing and encouraging two-way dialog, research policy in the twenty-first century will help stimulate shared creativity, and open a new path to prosperity. See also: Academy and Society in the United States: Cultural Concerns; Research Funding: Ethical Aspects; Science Funding: Asia; Science Funding: Europe; Science, Social Organization of
Bibliography
Bush V 1990 [1945] Science: The Endless Frontier. National Science Foundation, Washington, DC, reprinted from the original
Chubin D E, Hackett E J 1990 Peerless Science: Peer Review and US Science Policy. State University of New York Press, Albany, New York
Cozzens S E 2001 Autonomy and accountability for 21st century science. In: de la Mothe J (ed.) Science, Technology, and Governance. Pinter, London
Drew D E 1985 Strengthening Academic Science. Praeger, New York
DuPree A H 1964 Science in the Federal Government: A History of Policies and Activities to 1940. Harper & Row, New York
England J M 1983 A Patron for Pure Science: The National Science Foundation’s Formative Years. National Science Foundation, Washington, DC
Guston D H 2000 Between Politics and Science: Assuring the Productivity and Integrity of Research. Cambridge University Press, New York
Kevles D J 1978 The Physicists: The History of a Scientific Community in Modern America. Knopf, New York
Logsdon J M 1970 The Decision to Go to the Moon: Project Apollo and the National Interest. MIT Press, Cambridge, MA
Mogee M E 1973 Public Policy and Organizational Change: The Creation of the RANN Program in the National Science Foundation. MSc. Thesis, George Washington University, Washington, DC
Morin A J 1993 Science Policy and Politics. Prentice-Hall, Englewood Cliffs, NJ
Mukerji C 1989 A Fragile Power: Scientists and the State. Princeton University Press, Princeton, NJ
Rettig R A 1977 Cancer Crusade: The Story of the National Cancer Act of 1971. Princeton University Press, Princeton, NJ
Smith B L R 1989 American Science Policy since World War II. Brookings Institution, Washington, DC
Stokes D E 1997 Pasteur’s Quadrant: Basic Science and Technological Innovation. Brookings Institution, Washington, DC
Strickland S P 1988 The Story of the NIH Grants Program. University Press of America, Lanham, MD
S. E. Cozzens
Science, New Forms of
In recent times, there has been a growing sense that ‘science’ is not merely expanding in its grasp of the natural world, but that the activity itself is changing significantly. For this we use the term ‘new forms of science,’ understanding both the practice and the reflection on it. Since science is now a central cultural symbol, changes in the image of science rank in importance with those in the practice itself. Our situation now is one of the differentiation of a practice that hitherto seemed unified, and of conflict replacing consensus on many issues, some of which had been settled centuries ago. The prospect for the immediate future is for a social activity of science that no longer has a unified core for its self-consciousness and its public image.
The main baseline for this ‘novelty’ is roughly the century preceding World War II, when the ‘academic’ mode of scientific activity became dominant. This was characterized by the displacement of invention and diffusion by disciplinary research, and of amateurs (with private means or patronage) by research professionals (perhaps with a partial sinecure in advanced teaching). In the latter part of this period, the mathematical–experimental sciences took the highest prestige away from the descriptive field sciences. The social sciences were increasingly pressured to conform to the dominant image. And throughout, the consciousness of the activity, reflected in scholarly history and philosophy as well as in popularization, was triumphalist and complacent. There could be no way to doubt that the idealistic scientists, exploring nature for its own sake, inevitably make discoveries that eventually enrich all humanity both culturally and materially. Of course, any ‘period’ in a social activity is an intellectual construct. Closer analyses will always reveal change and diversity within it; and it contains both relics of the deeper past and germs of the future.
But occasionally there occurs a transforming event, so that with all the continuities in practice there is a discontinuity in consciousness. In the case of recent science, this was the Manhattan Project, in which the first atomic bombs were designed and constructed in a gigantic scientific engineering enterprise. It was of particular symbolic significance that the project was conceived and managed, not by humble ‘applied scientists’ or engineers, but by the elite theoretical physicists. It was science itself that was implicated in the moral ambiguities and political machinations that ensued in the new age of nuclear terror. After the Bomb, it was no longer possible to build scholarly analyses on assumptions of the essential innocence of science and the beneficence of scientists. Prewar studies in the sociology of science, such as those of Merton (with his ‘four norms’), which assumed a purity in the endeavor and its adherents, could later find echoes only in those of the amateur philosopher Polanyi (1951), himself embattled against J. D. Bernal and the Marxist ‘planners’ of science (1939). But it took about a decade for a critical consciousness to start to take shape. Those scientists who shared the guilt and fear about atomic weapons still saw the cure in educating society, as through the Bulletin of the Atomic Scientists, rather than wondering whether science itself had taken a wrong turning somewhere. Leo Szilard (1961) imagined an institute run by wise scientists that would solve the world’s problems; and Jacob Bronowski (1961) faced the problem of the responsibility of science for the Bomb, only to deny it absolutely. The vision of Vannevar Bush (1945), of science as a new and endless frontier for America, continued the prewar perspective into postwar funding. The first significant novelty in the study of science was accomplished in the 1950s by Derek J. de Solla Price (1963). Although his quantitative method seemed at times to be a caricature of science itself, he was submitting scientific production to an objective scrutiny that had hitherto seemed irrelevant or verging on irreverent. And he produced the first ‘limits to growth’ result; with a doubling time (steady over three centuries!)
of 15 years, compared to some 70 in the economy, sooner or later society would refuse to foot the bill. Since the prevailing attitude among scientists was that society’s sole contribution to science should be a blank check, Price’s bleak forecast, however vague in its timing, was not well received. He went on to give the first characterization of the new age of science; but he only called it ‘big,’ as opposed to the ‘little’ of bygone ages. A more serious disenchantment motivated Thomas S. Kuhn. He had imbibed the naive progressive image of science, whereby truths were piled on truths, and scientists erred only through bad methods or bad attitudes. Then it all cracked, and he wrote his seminal work on Scientific Revolutions (1962), all the more powerful because of its confusions and unconscious ironies. Whether he actually approved of routinized puzzle-solving ‘normal science’ was not clear for many years; to idealists like Popper, Kuhn’s flat vision of science represented a threat to science and to civilization (1970).
Through the turbulent 1960s, more systematically radical social critiques began to emerge. Of these, the most cautious was that of John Ziman’s Public Knowledge (1968), and the most ambiguous that of Jerome Ravetz (combining Gemeinschaft craftsmen’s science, Nuclear Disarmament, and the Counterculture, but ending his book with a prayer by Francis Bacon) (1996 [1971]). An attempt at a coherent Marxist perspective was mounted by Steven and Hilary Rose (1976). This was enlivened by their internal tension between Old and New Left, but with the complete discrediting of Soviet socialism, it was soon relegated to purely historical significance. A serious attempt at an objective social study of science was made by N. D. Ellis (The Scientific Worker, 1969), but there was then no audience for such an approach. Another premature social analysis of science came to light around then. This was the work of Ludwik Fleck (1979 [1935]) on the ‘genesis and development of a scientific fact’; at the time of its publication and for long after, the very title smacked of heresy. At the same time, the new dominant social practice of science, either ‘industrialized’ (Ravetz) or ‘incorporated’ (the Roses), took shape. The prophetic words of President Eisenhower on the corrupting influence of state sponsorship were little heeded; and an American science that was both ‘pure’ and ‘big’ produced its first muckraker, in Dan Greenberg (1967), a Washington correspondent for Science magazine. There was a short but sharp protest at the militarization of science in the Vietnam War (Allen 1970), out of which came various attempts at organizing for ‘social responsibility in science.’ On the ideological front, there was a collapse of attempts to preserve science as the embodiment of the Good and the True. Somewhat too late for the 1960s but revolutionary nonetheless, Paul Feyerabend published Against Method (1975), in which even the rearguard defenses of Popper (1963) and Lakatos (1970) were demolished.
Then came the tongue-in-cheek naive anthropology of Laboratory Life (Latour and Woolgar 1977), considering scientists as a special tribe producing ‘inscriptions’ which were intended to gain respect and to rebut hostile criticism. A radically skeptical theory of scientific knowledge was elaborated in sociological studies called ‘constructivism.’ As prewar science receded in time and relevance, scholars could settle down to the analysis of this new form of social practice of science in its own terms. Early in the postwar period there had been a conception that scientists in the public arena should be ‘on tap but not on top,’ a sort of ‘drip’ theory of their function. But more sophisticated analyses began to emerge. Among these was ‘mandated science’ (Salter 1988), in which the pitfalls of well-intentioned involvement were chronicled. The special uses and adaptations of science in the regulatory process and in the advisory function were analyzed (Jasanoff 1990). In such studies, it became clear that neither ‘science’
nor ‘scientist’ can be left as elements outside the process; indeed, the old Latin motto ‘who guards the guardians?’ reminds us of the essentially recursive nature of the regulatory process. For (as Jasanoff showed) those who regulate or advise act by certain criteria and are also selected for their position; and who sets the criteria, and who chooses the agents, can determine the whole process before the scientist-advisors even get to work. Meanwhile the existing trends in the social practice of science were intensified. Physics lost the preeminence it had gained through winning the war with the Atomic Bomb. Civil nuclear power (which was really engineering anyway) failed to live up to its early promise, producing not only disasters (near and actual) in practice but also intractable problems in waste management. The core of basic physics research, high-energy studies, was caught in a cul-de-sac of ever more expensive machines hunting ever more esoteric particles. It may also have suffered from its overwhelming reliance, not covert but not widely advertised either, on the US military for its funding. Then physics was caught up in the debate over ‘Star Wars,’ which was controversial not merely on its possible political consequences, but even on the question of whether it could ever work as claimed. In this case, it appeared that the technical criteria of quality were totally outweighed by the political and fiscal. The focus of leading-edge research shifted to biology, initiated by the epochal discovery of the structure of the genetic material DNA, the ‘double helix.’ There was a steady growth in the depth of knowledge and in the power of tools in molecular biology. This was punctuated in the 1970s by an episode of discovery of, and reaction to, possible hazards of research. In an unprecedented move, the ‘recombinant DNA’ research community declared a moratorium on certain experiments, until the hazards had been identified and regulated.
Critics of the research interpreted this as an admission that the hazards were real rather than merely hypothetical. There was a brief period (1976–7) of confrontation in the USA, complicated by some traditional town–gown antagonisms and by relics of Vietnam activism. But it soon died down, having served partly to stimulate interest in the research but also planting the seeds of future disputes. Another novelty in this period was the emergence of ethics as a systematic concern of science. Qualms about the uses of science in warfare go back a long way (Descartes 1638 has a significant statement), as well as disputes over priority and honesty. But with the Bomb, the possibility of science being turned to evil ends became very real; and with the loss of innocence in this respect, other sorts of complaints could get a hearing. Medical research is the most vulnerable to criticism, since the subjects are humans (or other sentient beings) and there must be a balance between their personal interests and those of the research (and hence of the
broader community). But some cases went too far, as that of the African-American men who (in the 1930s) were not given treatment for syphilis, so that their degeneration could be recorded for science. Also, there were revelations of some of the more bizarre episodes of military science at the height of the Cold War, such as subjecting unwitting human subjects either to nuclear radiation or to psychotropic drugs. Since some universities shared responsibility with the military for such outrages, the whole of science could not but be tarnished. Out of all this came a problem of the sort that could not have been imagined in the days of little science: activism, to the point of terror tactics, against scientists and labs accused of cruelty to animals. A further loss of innocence was incurred in the realization of the Marxist vision of science as ‘the second derivative of production,’ albeit under conditions that Marxists did not anticipate. Now ‘curiosity-driven,’ or even ‘investigator-initiated’ research, forms a dwindling minority within the whole enterprise. The days when the (British) Medical Research Council could simply ‘back promising chaps’ are long since gone. Increasingly, research is mission oriented, with the missions being defined by the large institutions, public and private, possessing the means and legitimacy to do so. Again, it is in biology, or rather bioengineering, where the change is most remarked. With genetic engineering, science is increasingly caught up in the globalization, or rather commodification, of everything that can be manipulated, be it life-forms or even human personality. When firms are accused of bio-piracy (appropriating plants or genetic materials from local people, and then patenting them), this is not exactly new, since it was also practiced by the illustrious Kew Gardens in Victorian times. But such practices are no longer acceptable in the world community (a useful survey of these issues is to be found in Proctor 1991, Chap. 
17). The social organization of science has changed in order to accommodate these new tasks and circumstances. The rather vague term ‘mode 2’ has been coined to describe this novel situation (Gibbons et al. 1994). In the terms of this analysis of a condition which is still developing, scientists are reduced to proletarians, neither having control over the intellectual property in the products of their work (since it is done largely on contracts with various restrictions on publication), nor even the definable skills of disciplinary science (since most projects fall between traditional boundaries). The discussion in the book provides a useful term for this sort of science: ‘fungible.’ For the research workers become a convertible resource, being shipped from project to project, to be reassigned or discarded as convenient. There are all sorts of tensions within such a new dispensation, perhaps most notably the contradictory position of the educational institutions, whose particular sort of excellence may not be compatible with the demands of the new labor market and research regime.
With the fragmentation and globalization of politics, other new sorts of scientific practice are emerging, not continuous with the evolving mainstream institutions but either independent of, or in opposition to, them. The leading environmental organizations have their own scientific staff who debate regularly with those promoting controversial projects. Increasingly, ‘the public’ (or representatives of the more aware and critical sections of it) is drawn into deliberative processes for the evaluation and planning of developments in technology and medicine. In some respects a ‘stakeholder society’ in science is emerging, engaged in debate on issues that had hitherto been left for the experts and politicians to decide. Movements for ‘community research’ are emerging in the USA (with a focus in the Loka Foundation) and elsewhere. This development can be seen as an appropriate response to an emerging new major task for science. In the process of succeeding so brilliantly in both discovery and invention, science has thrown up problems that can be loosely defined as risk. The tasks here have significant differences from those of the traditional scientific or technical problems. For one, problems of risk and safety are inherently complex, involving the natural world, technical systems, people, institutions, values, uncertainties, and ignorance. Isolated, reductionist studies of the ‘normal science’ sort may be an essential component, but cannot determine a policy. Further, ‘safety’ can never be conclusively established, not merely because it is impossible to prove impossibility, but more because the evidence of causation is either only suggestive (as from toxicological experiments with acute doses on non-human species) or indirect (as from epidemiological studies).
In the debates on managing risk, methodology becomes politicized, as it is realized that the assignment of burden of proof (reflected in the design of statistical studies and of experiments) can be critical in determining the result that goes forward into the policy process. Conflict of interest becomes difficult to avoid, since most opportunities for achieving expertise are gained through work with the interests promoting, rather than with those criticizing, new developments. Also, the debates take place in public forums, such as the media, activist campaigning, or the courts, each of which produces its characteristic distortions on the process. One response to this increased public involvement has been a variety of recommendations, from respected and even official sources, for greater openness and transparency in the science policy process. In some respects, this is only prudence. In the UK, the controversy over ‘mad cow disease’ (or the BSE catastrophe) could fester for a decade only because of official control of knowledge and ignorance. In its aftermath, the development of genetically engineered food crops was threatened by ‘consumer power’ in
Europe and beyond. Expressed negatively, we can be said to be living in a ‘risk society’ (Beck 1992), where new forms of political activity are fostered by the consequences of an inadequately controlled science– technology system (Sclove 1995, Raffensperger 1999). The assumption of beneficence of that system is now wearing thin; thus the mainstream British journal New Scientist made this comment on the mapping of the human genome: ‘For all our nascent genetic knowledge, if we end up with our genes owned by rich corporations and [with] a genetic underclass, will we really have advanced that much?’ (2000) There is no doubt that a significantly modified practice and image of science will be necessary for these new tasks of managing risks in the current social and political context. If nothing else, traditional science assumed that values were irrelevant to its work (this was, paradoxically, one of its great claims to value), and that uncertainties could be managed by technical routines (such as statistics). This was the methodological basis for the restriction of legitimacy to experts and researchers whose ‘normal science’ training had rigorously excluded such considerations. With uncertainty (extending even to salient ignorance) and value loading involved in any study of risk or safety, we are firmly in a period where a ‘post-normal’ conception of science is appropriate (Funtowicz and Ravetz 1993). The core of this new conception is ‘quality,’ since ‘truth’ is a luxury in contexts where, typically, facts are uncertain, values in dispute, stakes high, and decisions urgent. Under these circumstances, there is a need for an ‘extended peer community’ for the assurance of the quality of the scientific inputs, supplementing the existing communities of subject specialists and clients. 
And this new body of peers will have its own ‘extended facts.’ The materials for this new sort of science involve not only the operations of the natural world, but also the behavior of technical systems and the social systems of control. Therefore, information of new sorts becomes relevant, including local knowledge and commitments, investigative journalism, and published confidential reports. Although ‘post-normal science’ has its political aspects, this proposed extension of legitimacy of participation in a scientific process is not based on political objectives. It is a general conception of the appropriate methodological response to this new predicament of science, where the tasks of managing risks and ensuring safety cannot be left to the community of accredited scientific experts alone. This account of new forms of science would be incomplete without mention of a tendency which was thought to have been laid to rest long ago, as superstition unworthy of a civilized society. Coming in through East Asian practices of medicine and South Asian practices of consciousness, enriched cosmologies, involving ‘vibrations’ and ‘energies’ have become popular, even chic. The reduction of reality to its
mathematical dimensions, which was the metaphysical core of the Scientific Revolution, is now being eroded in the practice of unconventional medicine. Although this has not yet touched the core of either research or the policy sciences, it is a presence, alien to our inherited scientific culture, which cannot be ignored. We are coming to the end of a period lasting several centuries, when in spite of all the major developments and internal divisions and conflicts, it made sense to speak of ‘science’ as an activity on which there was little disagreement on fundamentals. The very successes of that science have led to new challenges, from which significantly novel forms of science are emerging, characterized by equally novel forms of engagement with society.
See also: Cultural Studies of Science; Foucault, Michel (1926–84); History of Science; History of Science: Constructivist Perspectives; Innovation, Theory of; Kuhn, Thomas S (1922–96); Polanyi, Karl (1886–1964); Popper, Karl Raimund (1902–94); Reflexivity, in Science and Technology Studies; Science, Sociology of; Scientific Knowledge, Sociology of; Social Science, the Idea of
Bibliography
Allen J (ed.) 1970 March 4: Scientists, Students, and Society. MIT Press, Cambridge, MA
Beck U 1992 Risk Society: Towards a New Modernity. Sage, London
Bernal J D 1939 The Social Function of Science. Routledge, London
Bronowski J 1961 Science and Human Values. Hutchinson, London
Bush V 1945 Science: The Endless Frontier. Government Printing Office, Washington, DC
Descartes R 1638 Discours de la Méthode (6th part)
Editorial 2000 New Scientist, July 1
Ellis N D 1969 The scientific worker. PhD thesis, University of Leeds
Feyerabend P 1975 Against Method. New Left Books, London
Fleck L 1979 Genesis and Development of a Scientific Fact. Translation from German edn., 1935. University of Chicago Press, Chicago
Funtowicz S, Ravetz J R 1993 Science for the post-normal age. Futures 25: 739–55
Gibbons M, Limoges C, Nowotny H, Schwartzman S, Scott P, Trow M 1994 The New Production of Knowledge. Sage, London
Greenberg D 1967 The Politics of Pure Science. New American Library, New York
Jasanoff S 1990 The Fifth Branch: Science Advisers as Policymakers. Harvard University Press, Cambridge, MA
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Lakatos I 1970 History of science and its rational reconstructions. In: Lakatos I, Musgrave A (eds.) Criticism and the Growth of Knowledge. Cambridge University Press, Cambridge, UK
Latour B, Woolgar S 1977 Laboratory Life: The Social Construction of Scientific Facts. Sage, Beverly Hills, CA
Merton R 1957 Social Theory and Social Structure. The Free Press, Glencoe, IL
Polanyi M 1951 The Logic of Liberty. Routledge, London
Popper K R 1963 Conjectures and Refutations. Routledge, London
Popper K R 1970 Normal science and its dangers. In: Lakatos I, Musgrave A (eds.) Criticism and the Growth of Knowledge. Cambridge University Press, Cambridge, UK
Price D J 1963 Little Science, Big Science. Cambridge University Press, Cambridge, UK
Proctor R N 1991 Value-free Science? Harvard University Press, Cambridge, MA
Raffensperger C 1999 Editorial: Scientists making a difference. The Networker: The Newsletter of the Science and Environmental Health Network 4
Ravetz J R 1996 [1971] Scientific Knowledge and its Social Problems, new edn. Transaction Publishers, New Brunswick, NJ
Rose S, Rose H 1976 The Political Economy of Science. Macmillan, London
Salter L 1988 Mandated Science: Science and Scientists in the Making of Standards. Kluwer Academic Publishers, Dordrecht, Holland
Sclove R E 1995 Democracy and Technology. The Guilford Press, New York
Szilard L 1961 The Voice of the Dolphins and Other Stories. Simon & Schuster, New York
Ziman J 1968 Public Knowledge. Cambridge University Press, Cambridge, UK
J. R. Ravetz
Science, Social Organization of
The sociology of science has been divided since about 1980 between those contending that science gains sociological significance because of its organizational location and forms, and those arguing that it should be understood for its knowledge-building practices. The two groups have tended to treat social organization in completely different ways, and have consciously developed their ideas in opposition to each other. However, both have used some notion of the social organization of science to explain the constitution of facts about nature and the development of ways of reworking nature for strategic advantage. The tensions between the schools have been productive of new approaches to organizations and the operation of power. A scholar like Jasanoff, interested in political process and expertise, has used constructivist theories of scientific knowledge to show how central science has become to the legal system and regulatory structures (Jasanoff 1990). In contrast, a researcher like Bruno Latour, interested in the social struggles involved in making science ‘real,’ has demonstrated how the laboratory has become productive of powers that shape contemporary life (Latour 1993). This work, and much more like it, has begun to suggest the fundamental ways that contemporary political systems and other major institutions depend on science and technology for their forms, operations, and legitimacy.
It would have been hard for sociologists to entirely avoid the correlation between the growth of modern States and the so-called scientific revolution, and the questions it raises about power and control of the natural world. Courts in the fifteenth and sixteenth centuries used scientists to help design military technology, do political astrology, and make the earth into a showplace of power (Grafton 1999, Masters 1998). Even members of the Royal Society in the seventeenth century (often depicted as a politically independent organization for science) dedicated much effort to addressing public problems with their research (Webster 1976). Practical as well as conceptual arts like cartography gained importance in Italy, France, and England as a tool for trade and territorial control. Double-entry bookkeeping provided a way of legitimating both commerce and governmental actions through the management of ‘facts’ (Poovey 1998). Botany and the plant trade, particularly in Spain, The Netherlands, and France, enriched the food supply and increased the stock of medicinal herbs. Italian engineers tried to tame rivers, the Dutch and French built canals (Masters 1998), the English deployed medical police to ensure public health (Carroll 1998), and the French, English, and Germans worked on forestry (Scott 1998). Military engineering throughout Europe was revolutionized with a combination of classical architectural principles, and new uses of cannon fire and other weapons (Mukerji 1997). As Patrick Carroll argues, technoscience was not a product of the twentieth century, but already part of the culture of science (the engine science) of the seventeenth century (Carroll 1998).
The point of science and engineering in this period was human efficacy, the demonstration of human capacities to know the world and transform it for effect. One manifestation of this was the cultivation of personal genius and individual curiosity, yielding a dispassionate science, but another was the constitution of political territories that were engineered for economic development and political legibility, and managed (in part) to maintain the strength and health of the population (Mukerji 1997, Scott 1998).
1. Functionalist Foundations of the Sociology of Science The early sociologists of science, such as Merton and Ben-David, took for granted both historical and contemporary links between science and state power. They were steeped in the sociological literature on organizations that defined technology, at least, as central to the organization of major institutions,
from the military to industry. Merton in his dissertation recognized the historical interest of courts and States in scientists and engineers, but traced the development of a disinterested culture of science, emanating from the places where thought was at least nominally insulated from political pollution: universities, scientific societies, and professional journals (Merton 1973, Crane 1972). Looking more directly at politics and science, Ben-David (1971), still interested in how social organization could promote excellence in science, considered differences in the organization of national science systems and their effects on thought. He assumed a kind of Mannheimian sociology of knowledge to argue that systems of research impact the progress of science. If location in the social system shaped what people could know, then the social organization of science was necessarily consequential for progress in science and engineering (Ben-David 1971). These approaches to the organization of science were grounded in their historical moment—the Cold War period. Science and engineering were essential to the power struggle of East and West. It was commonly held that WWII had been won through the successful effort to dominate science and technology. Policymakers in the US and Europe wanted to gain permanent political advantages for their countries by constructing a system of research in science and engineering that would continue to dominate both thought and uses of natural resources. Western ideology touted the freedom of thought allowed in the non-Communist world as the route to the future. According to the Mertonians, history seemed to support this posture. England with its Royal Society independent of the government was the one which produced Newton—not France and Italy with their systems of direct state patronage (Crane 1972, Merton 1973). 
The result was a clear victory for the open society, and reason to be confident about American science, which was being institutionalized inside universities rather than (in most cases) national laboratories.
2. The Sociology of Scientific Knowledge For all that the sociology of scientific knowledge (SSK) first presented itself as a radical subfield at odds with the functionalist tradition, it continued Merton’s impulse to associate science less with politics than with philosophy (Barnes et al. 1996). Taking the laboratory (rather than the Department of Defense) as the center of calculation for science (and science studies) was an effective way to imagine that power was not at stake in science, even in the US. Moreover, SSK presented good reasons to look more closely at scientific knowledge. In the Mertonian tradition, sociologists had discussed the relationship of organizational structures to scientific progress, as though sociologists could
and would know when science was flourishing. Sociologists of scientific knowledge were not prepared to make such judgments about scientific efficacy, nor were they willing to accept passively the word of scientists as informants on this. For SSK researchers, what was at stake was the philosophical problem of knowledge: how you know when an assertion about nature is a fact (Barnes et al. 1996, Collins 1985). This was precisely what Merton (1973) had declared outside the purview of sociology, but what SSK proponents now transformed into a research question. How did insiders to the world of science make these kinds of determinations for themselves? The implications for studying the organization of science were profound. Laboratories were the sites where facts were made, so they were the organizational focus of inquiry (Knorr-Cetina 1981, Latour and Woolgar 1979, Lynch 1985). Making a scientific fact was, by these accounts, a collective accomplishment, requiring the coordination of the cognitive practices both within and across laboratories. Laboratories had their own social structures, some temporary and some more stable. In most cases laboratory members distinguished between principal investigators and laboratory technicians, who had different roles in the constitution of knowledge. The significance of research results depended on the social credibility of the researchers. Science was work for gentlemen whose word could be trusted, not tradesmen or women, even if they did the work (Shapin and Schaffer 1985). There were alliances across laboratories forged with ideas and techniques that were shared by researchers and were dedicated to common problems or ways of solving them (Pickering 1984). To make scientific truths required more than just an experiment that would confirm or disconfirm a hypothesis. The results had to be witnessed and circulated within scientific communities to make the ‘facts’ known (Shapin and Schaffer 1985).
Scientific paradigms needed proponents to defend and promote them. Creating a scientific fact was much like a military campaign; it required a high degree of coordination of both people and things. It was a matter of gaining the moral stature in the scientific community to have a scientist’s ideas taken as truthful. The tools for accomplishing these ends were both social and cognitive (Shapin and Schaffer 1985). Members of these two schools (Mertonian and SSK) may have envisioned the task of articulating a sociology of science differently, but they shared a basic interest in socially patterned ways of thinking about, mobilizing, and describing nature (or the nature of things). Now that the dust has settled on their struggle for dominance in sociology, the epistemological break assumed to exist between them has come to seem less profound. Power circulates through laboratories, and scientific experts circulate through the halls of power (Haraway 1989, Jasanoff 1990, 1994, Mukerji 1989). Ways of organizing research affect both the constitution of knowledge and ways of life (Rabinow 1996). The world we know is defined and engineered through patterns of cognition that include manipulation of nature, in the laboratory and beyond.
3. The Politics of Knowledge-making and Knowledge Claims Contemporary research in the sociology of science has shed new light on organizations that sociologists thought they understood before, but never examined for their cognitive processes and relations to nature. Jasanoff’s work on the regulatory system in the US, for example, does not simply argue that concern about pollution and scientific regulation of other aspects of life has stimulated new research and yielded new information for policymakers—although that is true. She has shown how the legitimacy of the State has come to depend (to a surprising extent) on its claims to provide at least a minimal level of well-being for the population. Safe air, a healthy food supply, and good drinking water have become taken-for-granted aspects of political legitimacy that depend not only on manipulations of nature but also on the development of new strategies of reassurance. Regulators have not been able simply to ask scientists to use existing expertise to assess and ameliorate problems. They have had to cultivate sciences pertinent to the problems, and face the controversies about the results of new lines of research. The result is a new set of cognitive tools for science honed for policy purposes, and new pressures on political actors to understand and work with at least some of these measurements (Jasanoff 1990, 1994). Similarly, political legitimacy in rich countries also rests on the government’s ability to confront and address medical problems. As Steven Epstein pointed out in his study of AIDS research, the population now generally expects doctors, scientists, and policymakers to solve problems and keep the population healthy. Any fundamental disruption of this faith in expertise leads to anger and (in the US case he studied) political activism. 
Government officials (like public health workers) in these instances need not only to advocate and support research but also to make the government’s health system seem responsive to public needs. This means that the dispassionate pursuit of scientific truths cannot dictate the practice of research, or the dispensing of drugs. Research protocols cannot be entirely determined by experts, but require debate with the activists as well as professional politicians for whom the problem is an issue (Epstein 1996). Science is therefore used to manage public health (the body politic), and to create a healthful environment for the citizenry. It is also used to design and manage infrastructures for the society as a whole. Computers are used to design roads, and manage toll systems.
They are employed by hospitals to define illnesses, codify medical practices, and determine courses of treatment for individuals. The military has not only used scientists and engineers (in Cold War style) to develop new weapons and create new modes for their delivery, but has mobilized these groups to develop national communication infrastructures, from road systems to airports to the Internet (Edwards 1996). This engineering is oddly an outgrowth of territorial politics, in this period when States are supposed (by theories from post-modernist to Marxist) to be dying due to globalization (Castells 1996). However, responsibility for health and well-being is circumscribed by national boundaries, and legal responsibility for them is kept within territorial boundaries and remains largely a problem for States.
4. Post-colonial Studies of Science and Technology The territorial dimensions of legitimacy and public health are particularly apparent in the burgeoning post-colonial literature on science, technology, and medicine, showing the export to poor countries of research involving dangerous materials or medical risks (Rafael 1995). The export of risk was perhaps most dramatically illustrated by the US use of an atoll in the Pacific Ocean for bomb tests, but there have been numerous less dramatic instances of this. Manufacturers using dangerous materials have been encouraged to set up factories in Third World countries where economic growth has been considered a higher priority than environmental safety (Jasanoff 1994). Medical researchers in need of subjects have frequently turned to colonial populations for them (Rafael 1995). These practices make it obvious that dangerous research and manufacture are not necessarily abandoned because of the risks. They are simply done in those less powerful places within a state’s sphere of influence where legitimacy is not at stake. In these instances, political regulation at home has not necessarily created a moral revolution in science, but a physical dissociation of practices and responsibility. Outside the social world of Western gentlemen, there has often been little concern on the part of researchers about their moral stature. Governments have not systematically worried about the normative consequences of technological development. Instead, poor people have been treated (like prisoners at home) as a disposable population of test subjects, just what AIDS patients refused to be (Epstein 1996, Jasanoff 1994, Rafael 1995). This pattern is both paralleled by, and connected to, the export of high technology into post-colonial areas of the world. The growth of manufacturing in Third World countries, using computerized production systems, and the growth of computing itself in both
India and Africa for example, testify to another pattern of export of science and technology (Jasanoff 1994, Jules-Rosette 1990). These practices are usually presented as ways that corporations avoid local union rules, and deploy an educated workforce at low cost (Castells 1996). However, they are also ways large corporations have found to limit in-migration of labor from post-colonial regions, and to avoid political responsibility for the health of laboring people from poorer parts of the world. Less a case of exporting risk, it is a way of exporting responsibility for it (Jasanoff 1994).
5. Commercial Stakes in Scientific Thought It would be easy to think that while politics is driving some areas of research, there remains a core of pure science like the one desired by Merton. However, commercial as well as political forces work against this end (Martin 1991). Even the publishing system that was supposed to buffer science from the workings of power has turned out to be corruptible. It is not simply that scientific texts (the immutable mobiles of Latour and Woolgar 1979) have found political uses; texts have not been stabilized by being put in print. As Adrian Johns (1998) has shown, the fact that printing technology could fix ideas and stabilize authorship has not meant that the publishing industry would use it this way. In seventeenth-century England, the heyday of the scientific revolution, publications were often pirated, changed for commercial purposes, or reattributed. Scientific authorship never did unambiguously extend empiricism and accurate beliefs about nature, or give scientific researchers appropriate recognition for their work. Publishing in science was just another part of the book trade, and was managed for profit. To this day, commercial pressures as well as peer review shape the public record in science. The purpose of the science section in newspapers is to sell copies, not promote scientific truth. However, scientists still frequently use the popular press to promote their research and advance their careers in science (Epstein 1996). The practices of science and engineering themselves are not so clearly detached either. Commercial interests in biotechnology and computing have powerful effects on the organization of research, and relations between the university and industry (Rabinow 1996). Even though the Cold War is over, scientists still often cannot publish what they learn from DOD research (which includes work in computing). They rely on external funding, and so must study what is of interest to the government.
They are under careful supervision of administrators when they export engineering practices from the laboratory into the world (Mukerji 1989). Dreams of democracy served by science and technology seem hard to sustain.
6. Beyond Power/Knowledge The broad range of contemporary studies of the politics of science and engineering does not just manifest a revival of interest in the political dimensions/connections of science and engineering in the post-Foucaultian world of power/knowledge; it also manifests new understandings of how power operates through multiple organizational forms. Scientists have (both historically and in the present) aided in the articulation of a system of power based on the principles familiar to the Frankfurt School: the domination of nature for the domination of people. Researchers in science studies who focus on the political mobilization of research for organizational advantage are now making clearer how strategic scientific management of the natural world works and does not work. It seems not so much that governments and industry historically have attained the ideas about nature they needed or paid for, but that scientists, in pursuing knowledge, have also produced means of dominating nature that have been used (and to some extent contained) by those institutions (Mukerji 1989). Modern states, in funding the cultivation of cognitive systems for learning about and managing natural resources, nuclear power, chemical pollutants, and viruses, have generated new patterns of domination but have also opened themselves up to new questions of legitimacy and, in some cases (Epstein 1996), to a redistribution of expertise. The system of scientific and engineering research is not just productive of ideas, but also of transportation systems, research animals, laboratories themselves, and new technologies (like the Internet) (Edwards 1996, Kohler 1994, Rabinow 1996). The result is not just a brains trust of scientists, but an entire sociotechnical environment built for strategic effect (Cronon 1991).
The cognitive systems of science and engineering are not just ways of coordinating thought through language to reach the truth, but ways of making the world again to reflect and carry human intelligence (and stupidity) about nature.
See also: Academy and Society in the United States: Cultural Concerns; Disciplines, History of, in the Social Sciences; Human Sciences: History and Sociology; Kantian Ethics and Politics; Normative Aspects of Social and Behavioral Science; Paradigms in the Social Sciences; Research and Development in Organizations; Scientific Academies in Asia; Scientific Knowledge, Sociology of; Truth and Credibility: Science and the Social Study of Science; Universities, in the History of the Social Sciences
Bibliography
Barnes B, Bloor D, Henry J 1996 Scientific Knowledge. University of Chicago Press, Chicago
Ben-David J 1971 The Scientist’s Role in Society. Prentice-Hall, Englewood Cliffs, NJ
Bowker G, Star S L 1999 Sorting Things Out: Classification and its Consequences. MIT Press, Cambridge, MA
Carroll P 1998 Ireland: Material Construction of the Technoscientific State. Dissertation, University of California, San Diego, CA
Castells M 1996 The Rise of the Network Society. Blackwell, London
Collins H 1985 Changing Order. Sage, Beverly Hills, CA
Crane D 1972 Invisible Colleges. University of Chicago Press, Chicago
Cronon W 1991 Nature’s Metropolis. Norton, New York
Edwards P N 1996 The Closed World. MIT Press, Cambridge, MA
Epstein S 1996 Impure Science. University of California Press, Berkeley, CA
Grafton A 1999 Cardano’s Cosmos. Harvard University Press, Cambridge, MA
Habermas J 1975 Legitimation Crisis. Beacon Press, Boston
Haraway D 1989 Primate Visions. Routledge, New York
Jasanoff S 1990 The Fifth Branch. Harvard University Press, Cambridge, MA
Jasanoff S 1994 Learning from Disaster. University of Pennsylvania Press, Philadelphia, PA
Johns A 1998 The Nature of the Book. University of Chicago Press, Chicago
Jules-Rosette B 1990 Terminal Signs: Computers and Social Change in Africa. Mouton de Gruyter, New York
Knorr-Cetina K 1981 The Manufacture of Knowledge. Pergamon Press, New York
Kohler R 1994 Lords of the Fly. University of Chicago Press, Chicago
Latour B 1993 We Have Never Been Modern. Harvard University Press, Cambridge, MA
Latour B, Woolgar S 1979 Laboratory Life. Sage Publications, Beverly Hills, CA
Lynch M 1985 Art and Artifact in Laboratory Science. Routledge and Kegan Paul
Martin B 1991 Scientific Knowledge in Controversy. State University of New York Press, Albany, NY
Masters R 1998 Fortune is a River. Free Press, New York
Merton R K 1973 The Sociology of Science. University of Chicago Press, Chicago
Mukerji C 1989 A Fragile Power. Princeton University Press, Princeton, NJ
Mukerji C 1997 Territorial Ambitions and the Gardens of Versailles. Cambridge University Press, New York
Pickering A 1984 Constructing Quarks. University of Chicago Press, Chicago
Poovey M 1998 A History of the Modern Fact. University of Chicago Press, Chicago
Proctor R 1991 Value-free Science? Harvard University Press, Cambridge, MA
Rabinow P 1996 The Making of PCR. University of Chicago Press, Chicago
Rafael V 1995 Discrepant Histories. Temple University Press, Philadelphia, PA
Scott J C 1998 Seeing Like a State. Yale University Press, New Haven, CT
Shapin S, Schaffer S 1985 Leviathan and the Air-Pump. Princeton University Press, Princeton, NJ
Webster C 1976 The Great Instauration. Holmes and Meier, New York
C. Mukerji
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Science, Sociology of
To make science the object of sociological analysis directs attention to the production and consumption of scientific knowledge in diverse cultural contexts, institutional structures, local organizations, and immediate settings. The sociology of science divides into three broad lines of inquiry, each distinguished by a particular mix of theories and methods. The earliest systematic studies (mostly from the 1950s to the early 1970s) focus on the structural contexts of scientists’ behavior: what rules govern the pursuit of scientific knowledge, how are scientists judged and rewarded, how is scientific research broken up into dense networks of specialists? In the 1980s, sociologists shift their attention to the practices through which scientific knowledge is constructed, at the laboratory bench or in the rhetoric of professional papers. Starting in the 1990s, science is put in more encompassing societal contexts, as sociologists examine scientists as purveyors of cognitive authority, and explore their linkages to power, politics, and the economy.
1. Precursors It is remarkable how much the literature in sociology of science is bunched into the last third of the twentieth century. Perhaps only after the deployment of nuclear weapons, or only after genetic engineering raised eugenic nightmares, could sociologists begin to think about science as a social problem rather than as a consistent solution; or maybe earlier generations of sociologists were guided by epistemological assumptions that rendered true scientific knowledge immune from social causes, thus putting it outside the orbit of sociological explanation.
1.1 Classical Anticipations ‘Science’ is nowhere indexed in Max Weber’s encyclopedic Economy and Society, a measure of his unwillingness or inability to see it as a consequential factor in human behavior or social change. Weber’s interest in science was largely methodological and political. Could the causal models employed so effectively in the natural sciences be used as well to study social action? Does the objectivity and neutrality of the social scientist preclude involvement in political activity? Emile Durkheim also sought to institutionalize sociology by making its methods appear scientifically precise, but at the same time considered scientific knowledge as an object of sociological study. Durkheim suggested that basic categories of thought and logic (such as time and space) are social in origin, in that they correspond to fundamental social categories (such as the division of a society by family or gender). However, as human societies grew in size and as their institutions became functionally differentiated, a distinctively scientific pursuit of knowledge was gradually insulated from such social causes. The observable facts of modern science, Durkheim concluded, were in accord with the reality of the physical world, a position that forestalled examinations of how observable facts are also shaped by the culture and communities in which they arise.
Karl Marx’s materialism would seem to commit him to the idea that all beliefs arise amid historically specific conditions of production, as they are shaped by the goals and interests of a ruling class. The rise of science in seventeenth-century Europe is intimately bound with the rise of industrial capitalism and, for Marx, can be explained in terms of the utilities of science-based technologies for improving productivity and enlarging surplus value. But although the rate of scientific growth may be explained by its congruence with the interests of the bourgeoisie, Marx seems to suggest that the content of scientific claims inside professionalized research networks is nonideological: that is, an objective account of natural reality.
1.2 Science in the Sociology of Knowledge Even more surprising is the failure of systematic sociological studies of science to emerge from a blossoming sociology of knowledge in the 1920s and 1930s. Neither Max Scheler nor Karl Mannheim, authors of foundational treatises on the social determinants of knowledge, inspired sustained inquiry into the social determinants of science, probably because both distinguished scientific knowledge from other kinds in a way that truncated what sociology could say about it. Scheler isolated the content of scientific knowledge (and the criteria for ascertaining validity) by describing these as absolute and timeless essences, not shaped by social interests. The effects of social structure (specifically, the power of ruling elites) are limited to selections of problems and beliefs from that self-contained and essential realm of ideas. Mannheim sustained the neo-Kantian distinction between formal knowledge of the exact sciences and socio-historical knowledge of culture. Phenomena of the natural world are invariant, Mannheim suggests, and so therefore are criteria for deciding truth (i.e., impartial observations based on accurate measurements). In contrast, cultural phenomena become meaningful only as they are constructed through interest-laden judgments of significance, which are neither impartial nor invariant, and thus they are amenable to sociological explanation. Robert K. Merton’s 1938 classic Science, Technology and Society in Seventeenth-Century England (see Merton 1973) tackles a fundamental problem: why did modern science emerge with a flourish in seventeenth-century England? His answer has become known as the ‘Merton Thesis’: an ethos of Puritanism provided both the motivating force and legitimating authority for the pursuit of scientific inquiry. Certain religious values (e.g., God is glorified by an appreciation of his handiwork in Nature, or Blessed Reason separates human from beast) created a cultural context fertile for the rise of science. Merton also explains shifts in the foci of research attention among the early modern ‘natural philosophers’ by connecting empirical inquiry to the search for technological solutions to practical problems in mining, navigation, and ballistics.
2. Social Organization of the Scientific Community When concerted sociological studies of science began in the late 1950s and 1960s, research centered on the institutions or social structures of science—with relatively less attention given to the routine practices involved in making knowledge or to the wider settings in which science was conducted. This work was largely inspired by theories of structural-functional analysis, which ask how the community of scientists is organized in order to satisfy modern society’s need for certified, reliable knowledge. One distinctive feature of this first phase is a reliance on quantitative methods of analysis. With statistical data drawn from surveys of scientists and from the Science Citation Index (and other bibliometric sources), sociologists developed causal models to explain individual variations in research productivity and used topographical techniques such as multidimensional scaling to map the dense networks of scientists working at a research front.
2.1 Norms of Science
The shift from analyzing science in society to analyzing its internal social organization was effected in Merton’s 1942 paper on the normative structure of science (in Merton 1973). Writing under the shadow of Nazism, Merton argues that the success of scientists in extending certified knowledge depends, at once, on a salutary political context (namely democracy, which allows science a measure of autonomy from political intrusions and whose values are said to be congruent with those of science, quite unlike fascism) and on an internal institutionalized ethos of values held to be binding upon the behavior of scientists. This ethos comprised the famous norms of science: scientists should evaluate claims impersonally (universalism), share all findings (communism), never sacrifice truth for personal gain (disinterestedness), and always question authority (organized skepticism). Behavior consonant with these moral expectations is functional for the growth of reliable knowledge, and for that reason young scientists learn through precept how they are expected to behave, conformity is rewarded, and transgressions are met with outrage.
Subsequent work ignored Merton’s conjectures about science and democracy, as sociologists instead pursued implications of the four social norms. Studies of behavioral departures from these norms (ethnocentrism, secrecy, fraud, plagiarism, dogmatism) precipitated debates over whether such deviance is best explained by idiosyncratic characteristics of a few bad apples or by changing structural circumstances (such as commercialization of research) that might trigger increases in such behavior. Sociologists continue to debate the possibility that Merton’s norms are better explained as useful ideological justifications of scientists’ autonomy and cognitive authority. Other research suggests that the norms guiding scientific conduct vary historically, vary among disciplines, vary among organizational contexts (university research vs. military or corporate research), and vary even in their situational interpretation, negotiation, and deployment, raising questions about whether the norms identified by Merton are functionally necessary for enlarging scientific knowledge.
2.2 Stratification and Scientific Careers The norm of universalism in particular has elicited much empirical research, perhaps because it raises questions of generic sociological interest: how is scientific performance judged, and how are inequalities in the allocation of rewards and resources best described and explained? With effective quantitative measures of individual productivity (number of publications or citations to one’s work), resources (grant dollars), and rewards (prizes, like the Nobel), sociologists have examined with considerable precision the determinants of individual career success or failure. Competition among scientists is intense, and the extent of inequality high: the distribution of resources and rewards in science is highly skewed. A small proportion of scientists publish most research papers (and those papers collect most citations), compete successfully for research grants and prestigious teaching posts, achieve international visibility and recognition, and win cherished prizes. Debate centers on whether these observed inequalities in the reward system of science are compatible with the norm of universalism, which demands that contributions to knowledge be judged on their scientific merit, with resources and opportunities meted out in accordance with those judgments. The apparent elitism of science may result from an ‘accumulation of advantage’: work by relatively more eminent or well-positioned scientists is simply noticed more and thus tends to receive disproportional credit, which (over time) enlarges the gap between the few very successful scientists and everybody else. Such a process may still be universalistic because it is functional for the institutional goal of science: giving greater attention to research of those with accomplished track records may be an efficient triage of the overwhelming number of new candidate theories or findings. Others suggest that particularism contributes to the stratification of scientists: old-boy networks that protect turf and career reputations by rewarding sycophants. The underrepresentation of women in the higher echelons of science has called attention to sometimes subtle sexism that occurs early in the scientific career (restricted access to well-connected mentors, essential research equipment, or opportunities to collaborate, and assignment to trivial problems or mind-numbing tasks).
2.3 Institutionalization of the Scientific Role A separate line of sociological inquiry (exemplified in work by Joseph Ben-David (1991) and Edward Shils) seeks an explanation for how science first became a remunerable occupation and, later, a profession. How did the role of the scientist emerge from historically antecedent patterns of amateurs who explored nature part-time and generally at their own expense? The arrival of the ‘scientist’ as an occupational self-identification with distinctive obligations and prerogatives is inseparable from the institutionalization of the modern university (itself dependent upon government patronage). Universities provided the organizational form in which the practice of science could become a full-time career: by fusing research with teaching, by allowing (ironically) the historic prestige of universities as centers of theology and scholasticism to valorize the new science, and by providing a bureaucratic means of paying wages and advancing careers. The scientific role has also been institutionalized in corporate and government labs. The difficulties of transporting a ‘pure science’ ideal of university-based research into these very different organizational settings have been the object of considerable sociological attention. Scientists in industry or government face a variety of competing demands: their research is often directed to projects linked to potential profits or policy issues rather than steered by the agenda of their discipline or specialty; the need to maintain trade secrets or national security hampers the ability of scientists in these settings to publicize their work and receive recognition for it. And, as Jerome Ravetz (1971) suggests, the intrusion of ‘bureaucratic rationality’ into corporate and state science compromises the craft character of scientific work: largely implicit understandings and skills shared by the community of scientists and vital for the sustained accumulation of scientific wisdom have little place in accountabilities driven by the bottom line or policy relevance.
2.4 Disciplines and Specialties
Sociologists use a variety of empirical indicators to measure the social and cognitive connections among scientists: self-reports of those with whom a scientist exchanges ideas or preprints, subject-classifications of publications in topical indexes or abstract journals, lineages of mentor–student relationships or collaborations, patterns of who cites whom or is cited with whom (‘co-citation’). The networks formed by such linkages show occasional dense clusters of small numbers of scientists whose informal communications are frequent, who typically cite each other’s very recent papers, and whose research focusses on some new theory, innovative method, or breakthrough problem. Emergence of these clusters—for example, the birth of radio astronomy in England after WWII, as described by David Edge and Michael Mulkay (1976)—is a signal that science has changed, both cognitively and socially: new beliefs and practices are ensconced in new centers for training or research with different leaders and rafts of graduate students. Over time, these specialties evolve in a patterned way: the number of scientists in the network becomes much larger and connections among them more diffuse, the field gets institutionalized with the creation of its own journals and professional associations, shattering innovations become less common as scientists work more on filling in details or adding precision to the now-aging research framework. As one specialty matures, another dense cluster of scientists emerges elsewhere, as the research front or cutting edge moves on.
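The co-citation indicator mentioned above can be illustrated with a small sketch. The Python below is only an illustration under stated assumptions: the function name `cocitation_counts`, the paper labels, and the reference lists are hypothetical and are not taken from any study cited in this article. For each pair of works, it counts how many citing papers list both.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of works is cited together.

    reference_lists maps a citing paper (hypothetical label) to the
    collection of works that paper cites.
    """
    counts = Counter()
    for cited in reference_lists.values():
        # De-duplicate and sort so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: three citing papers and the works they reference.
papers = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}
counts = cocitation_counts(papers)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Pairs with high counts would be candidates for the dense clusters described above. Analyses of real citation indexes usually go further than raw counts, for example by normalizing them before looking for clusters.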
3. Sociology of Scientific Knowledge
A sea-change in sociological studies of science began in the 1970s with a growing awareness that studies of the institutional and organizational contexts shaping scientists’ behavior could not illuminate sufficiently the processes that make science science: experimental tinkering, sifting of evidence, negotiation of claims, replacement of old beliefs about nature with new ones, achievement of consensus over the truth. All of these processes—observation, getting instruments and research materials (e.g., mice) to work, logic, criteria for justifying a finding as worthy of assent, choices among theories, putting arguments into words or pictures, persuading other scientists that you are correct—are uncompromisingly social, cultural, and historical phenomena, and so sociologists set about to explain and interpret the content of scientific knowledge by studying the routine practices of scientific work.
This research is guided by constructivist theories (and, less often, ethnomethodology), which center attention on the practically accomplished character of social life. Rather than allow a priori nature or given social structures to explain behavior or belief, constructivist sociologists examine how actors incessantly make and remake the structural conditions in which they work. Such research relies methodologically on historical case studies of scientific debate, up-close ethnographic observations of scientific practices, and on interpretative analysis of scientific texts.
3.1 Sociology of Discovery
Diverse studies of scientific discovery illustrate the range of sociological perspectives brought to bear on these consequential events. An early line of inquiry focusses on the social and cognitive contexts that cause the timing and placing of discoveries: why did these scientists achieve a breakthrough then and there? Historical evidence points to a pattern of simultaneous, multiple and independent discoveries—that is, it is rare for a discovery to be made by a scientist (or a local team) who is the only one in the world doing research on that specific question. Because honor and recognition are greatest for solutions to the perceivedly ‘hottest’ problems in a discipline, the best scientists are encouraged by the reward system of science to tackle similar lines of research. But these same social structures can also forestall discovery or engender resistance to novel claims. Cognitive commitments to a long-established way of seeing the natural world (reinforced by reputations and resources dependent upon those traditional perspectives) can make it difficult for scientists to see the worthiness of a new paradigm. Resistance to new ideas seems to be greatest among older scientists, and in cases where the proposed discovery comes from scientists with little visibility or stature within the specialty or discipline that would be transformed. More recent sociological research considers the very idea of ‘discovery’ as a practical accomplishment of scientists. Studies inspired by ethnomethodology offer detailed descriptions of scientific work ‘first-time-through,’ taking note of how scientists at the lab bench decide whether a particular observation (among the myriad observations) constitutes a discovery.
Other sociologists locate the ‘moment’ of discovery in downstream interpretative work, as scientists narrate first-time-through research work with labels such as ‘breakthrough.’ Such discovery accounts are often sites of dissensus, as scientists dispute the timing or implications of an alleged discovery amid ongoing judgments of its significance for subsequent research initiatives or allocations of resources. These themes—interests, changing beliefs, ordinary scientific work, post-hoc accountings, dissent, persuasion—have become hallmarks of the sociology of scientific knowledge.
3.2 Interests and Knowledge-change
In the mid-1970s to the 1980s, sociology of science took root at the Science Studies Unit in Edinburgh, as philosopher David Bloor developed the ‘strong programme,’ while Barry Barnes, Steven Shapin, Donald MacKenzie, and Andrew Pickering developed its sociological analog—the ‘interest model.’ The goal is to provide causal explanations for changes in knowledge—say, the shift from one scientific understanding of nature to a different one. Scientists themselves might account for such changes in terms of greater evidence, coherence, robustness, promise, parsimony, predictive power, or utility of the new framework. Sociologists, in turn, account for those judgments in terms of social interests of scientists that are either extended or compromised by a decision to shift to the new perspective. What becomes knowledge is thus contingent upon the criteria used by a particular community of inquirers to judge competing understandings of nature, and also upon the goals and interests that shape their interpretation and deployment of those criteria. Several caveats are noted: interests are not connected to social positions (class, for example, or nationality, discipline, specialty) in a rigidly deterministic way; social interests may change along with changes in knowledge; choices among candidate knowledge-claims are not merely strategic—that is, calculations of material or symbolic gains are bounded by considerable uncertainty and by a shared culture of inquiry that provides standards for logical or evidential adequacy and for the proper use of an apparatus or concept. Drawing on historical case studies of theoretical disputes in science—nineteenth-century debates over phrenology and statistical theory, twentieth-century debates among high-energy physicists over quarks—two different kinds of interests are causally connected to knowledge-change.
Political or ideological commitments can shape scientists’ judgments about candidate knowledge claims: the development of statistical theories of correlation and regression by Francis Galton, Karl Pearson and R. A. Fisher depended vitally on the utility of such measures for eugenic objectives. Different social interests arise from the accumulated expertise in working with certain instruments or procedures, which incline scientists to prefer theories or models that allow them to capitalize on those skills.
3.3 Laboratory Practices and Scientific Discourse
As sociologists moved ever closer to the actual processes of ‘doing science,’ their research divided into two lines of inquiry: some went directly to the laboratory bench seeking ethnographic observations of scientists’ practices in situ; others examined scientists’ discourse in talk and texts—that is, their accounting practices. These studies together point to an inescapable conclusion: there is nothing not-social about science. From the step-by-step procedures of an experiment to writing up discovered facts for journal publication, what scientists do is describable and explicable only as social action—meaningful choices contingent on technical, cognitive, cultural, and material circumstances that are immediate, transient, and largely of the scientists’ own making. Laboratory ethnographies by Karin Knorr-Cetina (1999), Michael Lynch (1993), and Bruno Latour and Steve Woolgar (1986) reveal a science whose order is not to be found in transcendent timeless rules of ‘scientific method’ or ‘good lab procedures,’ but in the circumstantial, pragmatic, revisable, and iterative choices and projects that constitute scientific work. These naturalistic studies emphasize the local character of scientific practice, the idea that knowledge-making is a process situated in particular places with only these pieces of equipment or research materials or colleagues immediately available. Never sure about how things will turn out in the end, scientists incessantly revise the tasks at hand as they try to get machines to perform properly, control wild nature, interpret results, placate doubting collaborators, and rationalize failures. Even methodical procedures widely assumed to be responsible for the objective, definitive, and impersonal character of scientific claims—experimental replication, for instance—are found to be shot-through with negotiated, often implicit, and potentially endless judgments about the competence of other experimentalists and the fidelity of their replication-attempts to the original (as Harry Collins (1992) has suggested).
Ethnographic studies of how scientists construct knowledge in laboratories compelled sociologists then to figure out how the outcomes of those mundane contextual practices (hard facts, established theories) could paradoxically appear so unconstructed—as if they were given in nature all along, and now just found (not made). Attention turned to the succession of ‘inscriptions’ through which observations become knowledge—from machine-output to lab notebook to draft manuscript to published report. Scientists’ sequenced accounts of their fact-making rhetorically erase the messy indeterminacy and opportunism that sociologists have observed at the lab bench, and substitute a story of logic, method, and inevitability in which nature is externalized as waiting to be discovered. Such studies of scientific discourse have opened up an enduring debate among constructivist sociologists of science: those seeking causal explanations for scientists’ beliefs treat interests as definitively describable by the analyst, while others (Gilbert and Mulkay 1984, Woolgar 1988) suggest that sociologists must respect the diversity of participants’ discursive accounts of their interests, actions, or beliefs—and thus treat actors’ talk and text not as mediating the phenomena of study but as constituting them.
3.4 Actor-networks and Social Worlds
After sociologists worked to show how science is a thoroughly social thing, Bruno Latour (1988) and Michel Callon (1986) then retrieved and reinserted the material: science is not only about facts, theories, interests, rhetoric, and power but also about nature and machines. Scientists accomplish facts and theories by building ‘heterogeneous networks’ consisting of experimental devices, research materials, images and descriptive statistics, abstract concepts and theories, the findings of other scientists, persuasive texts—and, importantly, none of these are reducible to any one of them, nor to social interests. Things, machines, humans, and interests are, in the practices of scientists, unendingly interdefined in and through these networks. They take on meanings via their linkages to other ‘actants’ (a semiotic term for anything that has ‘force’ or consequence, regardless of substance or form). In reporting their results, scientists buttress claims by connecting them to as many different actants as they can, in hopes of defending the putative fact or theory against the assault of real or potential dissenters. From this perspective, length makes strength, that is, the more allies enrolled and aligned into a network—especially if that network is then stabilized or ‘black boxed’—the less likely it is that dissenters will succeed in disentangling the actants and thereby weaken or kill the claim. Importantly for this sociology of science, the human and the social are decentered, in an ontology that also ascribes agency to objects of nature or experimental apparatuses. Actor-network theory moved the sociological study of science back outside the laboratory and professional journal—or, rather, reframed the very idea of inside and outside. Scientists and their allies ‘change the world’ in the course of making secure their claims about nature, and in the same manner.
In Latourian vernacular, not only are other scientists, bits of nature or empirical data enlisted and regimented, but also political bodies, protest movements, the media, laws and hoi polloi. When Louis Pasteur transformed French society by linking together microbes, anthrax, microscopes, laboratories, sick livestock, angry farmers, nervous Parisian milk-drinkers, public health officials, lawmakers, and journalists into what becomes a ‘momentous discovery,’ the boundary between science and the rest of society is impossible to locate. Scientists are able to work autonomously at their benches precisely because so many others outside the lab are also ‘doing science,’ providing the life support (money, epistemic acquiescence) on which science depends. The boundaries of science also emerge as theoretically interesting in related studies derived from the brand of symbolic interactionism developed by Everett Hughes, Herbert Blumer, Anselm Strauss, and Howard Becker (and extended into research on science by Adele Clarke 1990, Joan Fujimura, and Susan Leigh Star). On this score, science is work—and, instructively, not unlike work of any other kind. Scientists (like plumbers) pursue doable problems, where ‘doability’ involves the articulation of tasks across various levels of work organization: the experiment (disciplining research subjects), the laboratory (dividing labor among lab technicians, grad students, and postdocs), and ‘social worlds’ (the wider discipline, funding agencies, or maybe animal-rights activists). Scientific problems become increasingly doable if ‘boundary objects’ allow for cooperative intersections of those working on discrete projects in different social worlds. For example, success in building California’s Museum of Vertebrate Zoology in the early twentieth century depended upon the standardization of collection and preparation practices (here, the specimens themselves become boundary objects) that enabled biologists to align their work with trappers, farmers, and amateur naturalists in different social worlds. As in actor-network theory, sociologists working in the ‘social worlds’ tradition make no assumption about where science leaves off and the rest of society begins—those boundaries get settled only provisionally, and remain open to challenge from those inside and out.
4. Science as Cultural Authority
It is less easy to discern exactly what the sociology of science is just now, and where it is headed. Much research centers on the position of science, scientists, and scientific knowledge in the wider society and culture. Science is often examined as a cognitive or epistemic authority; scientists are said to have the legitimate power to define facts and assess claims to truth. This authority is not treated as an inevitable result of the character or virtue of those who become scientists, the institutional organization of science (norms, for example) or of the ‘scientific method.’ It is, rather, an accomplished resource pursued strategically by a profession committed not only to extending knowledge but also to the preservation and expansion of its power, patronage, prestige, and autonomy. No single theoretical orientation or methodological program now prevails. Constructivism remains appealing as a means to render contingent and negotiable (rather than ‘essential’) those features of scientific practice said to justify its epistemic authority. But as the agenda in the sociology of science shifts from epistemological issues (how is knowledge made?) to political issues (whose knowledge counts, and for what purposes?), constructivism has yielded to a variety of critical theories (Marxism, feminism and postmodernism) that connect science to structures of domination, hierarchy, and hegemony. A popular research site among sociologists of science is the set of occasions where scientists find their authority challenged by those whose claims to knowledge lack institutional legitimacy.
4.1 Credibility and Expertise
Steven Shapin (1994) (among others) has identified credibility as a constitutive problem for the sociology of science. Whose knowledge-claims are accepted as believable, trustworthy, true or reliably useful—and on what grounds? Plainly, contingent judgments of the validity of claims depend upon judgments of the credibility of the claimants—which has focussed sociological attention on how people use (as they define) qualities such as objectivity, expertise, competence, personal familiarity, propriety, and sincerity to decide which candidate universe becomes provisionally ‘real.’ A long-established line of sociological research examines those public controversies that hinge, in part, on ‘technical’ issues. Case studies of disputes over environmental and health risks find a profound ambivalence: the desire for public policy to be decided by appropriate legislative or judicial bodies in a way that is both understandable and accountable to the populace runs up against the need for expert skills and knowledge monopolized by scientific, medical or engineering professionals. Especially when interested publics are mobilized, such disputes often become ‘credibility struggles’ (as Steven Epstein (1998) calls them). In his study of AIDS politics, Epstein traces out a shift from activists’ denunciation of scientists doing research on the disease to their gaining a ‘seat at the table’ by learning enough about clinical trials and drug development to participate alongside scientists in policy decisions. In this controversy, as in many others, the cultural boundaries of science are redrawn to assign (or, alternatively, to deny) epistemic authority to scientists, would-be scientists, citizens, legislators, jurists, and journalists.
4.2 Critique of Science
Recent sociological studies have themselves blurred the boundaries between social science and politics by examining the diverse costs and benefits of science. Whose agenda does science serve—its own? global capital? political and military elites? colonialism? patriarchy? the Earth’s? Studies of molecular biology and biotechnology show how the topics chosen for scientific research—and the pace at which they are pursued—are driven by corporate ambitions for patents, profits, and market-share. Related studies of the Green Revolution in agricultural research connect science to imperialist efforts to replace indigenous practices in less developed countries with ‘advanced technologies’ more consonant with the demands of global food markets. Feminist researchers are equally interested in the kinds of knowledge that science brings into being—and, even more, the potential knowledges not sought or valorized. In the nineteenth century, when social and natural science offered logic and evidence to legitimate patriarchal structures, other styles of inquiry and learning practiced among women (Parisian salons, home economics, midwifery, and cookery) were denounced as unscientific and, thus, suspect. Other feminists challenge the hegemony of scientific method, as a way of knowing incapable of seeing its own inevitable situatedness and partiality; some suggest that women’s position in a gender-stratified society offers distinctive epistemic resources that enable fuller and richer understandings of nature and culture. These critical studies share an interest in exposing another side of science: its historical complicity with projects judged to be inimical to goals of equality, human rights, participatory democracy, community and sustainable ecologies. They seek to fashion a restructured ‘science’ (or some successor knowledge-maker) that would be more inclusive in its practitioners, more diverse in its methods, and less tightly coupled to power.
See also: Actor Network Theory; Cultural Studies of Science; Laboratory Studies: Historical Perspectives; Norms in Science; Science and Technology Studies: Ethnomethodology; Science, Social Organization of; Scientific Controversies; Scientific Culture; Scientific Knowledge, Sociology of; Strong Program, in Sociology of Scientific Knowledge; Truth and Credibility: Science and the Social Study of Science
Bibliography
Barber B, Hirsch W (eds.) 1962 The Sociology of Science. Free Press, Glencoe, IL
Barnes B, Bloor D, Henry J 1996 Scientific Knowledge: A Sociological Analysis. University of Chicago Press, Chicago
Barnes B, Edge D (eds.) 1982 Science in Context. MIT Press, Cambridge, MA
Ben-David J 1991 Scientific Growth. University of California Press, Berkeley, CA
Callon M 1986 Some elements of a sociology of translation: Domestication of the scallops and the fishermen of St Brieuc Bay. In: Law J (ed.) Power, Action and Belief. Routledge & Kegan Paul, London
Clarke A E 1990 A social worlds research adventure. In: Cozzens S E, Gieryn T F (eds.) Theories of Science in Society. Indiana University Press, Bloomington, IN
Clarke A E, Fujimura J H (eds.) 1992 The Right Tools for the Job: At Work in Twentieth-century Life Sciences. Princeton University Press, Princeton, NJ
Collins H M 1992 Changing Order: Replication and Induction in Scientific Practice. University of Chicago Press, Chicago
Edge D O, Mulkay M J 1976 Astronomy Transformed. Wiley, New York
Epstein S 1998 Impure Science: AIDS, Activism, and the Politics of Knowledge. University of California Press, Berkeley, CA
Gieryn T F 1999 Cultural Boundaries of Science: Credibility on the Line. University of Chicago Press, Chicago
Gilbert G N, Mulkay M 1984 Opening Pandora’s Box: A Sociological Analysis of Scientists’ Discourse. Cambridge University Press, Cambridge, UK
Hagstrom W O 1965 The Scientific Community. Basic Books, New York
Harding S 1986 The Science Question in Feminism. Cornell University Press, Ithaca, NY
Jasanoff S, Markle G E, Petersen J C, Pinch T (eds.) 1995 Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Kloppenburg J R Jr. 1988 First the Seed: The Political Economy of Plant Biotechnology 1492–2000. Cambridge University Press, Cambridge, UK
Knorr-Cetina K 1999 Epistemic Cultures. Harvard University Press, Cambridge, MA
Knorr-Cetina K, Mulkay M J (eds.) 1983 Science Observed. Sage, Beverly Hills, CA
Latour B 1988 Science in Action. Harvard University Press, Cambridge, MA
Latour B, Woolgar S 1986 Laboratory Life: The Construction of Scientific Facts. Princeton University Press, Princeton, NJ
Long J S, Fox M F 1995 Scientific careers: Universalism and particularism. Annual Review of Sociology 21: 45–71
Lynch M 1993 Scientific Practice and Ordinary Action. Cambridge University Press, Cambridge, UK
Merton R K 1973 The Sociology of Science. University of Chicago Press, Chicago
Mulkay M J 1979 Science and the Sociology of Knowledge. George Allen & Unwin, London
Mulkay M J 1980 Sociology of science in the West. Current Sociology 28: 1–184
Nelkin D (ed.) 1984 Controversy: The Politics of Technical Decisions. Sage, Beverly Hills, CA
Ravetz J R 1971 Scientific Knowledge and its Social Problems. Oxford University Press, Oxford, UK
Shapin S 1994 A Social History of Truth. University of Chicago Press, Chicago
Shapin S 1995 Here and everywhere: Sociology of scientific knowledge. Annual Review of Sociology 21: 289–321
Woolgar S 1988 Science: The Very Idea. Tavistock, London
Zuckerman H 1988 The sociology of science. In: Smelser N (ed.) Handbook of Sociology. Sage, Newbury Park, CA
T. F. Gieryn
Science, Technology, and the Military
Technology has changed warfare since time immemorial. The invention of gunpowder, artillery, and rifles revolutionized warfare. Individual scientists, for centuries, have advised the military on specific problems: Archimedes reportedly helped the tyrant of Syracuse in devising new weaponry against the Romans in 212 BC; Leonardo da Vinci supplied us with a variety of drawings of new armaments; and, since the emergence of ‘modern’ science in the sixteenth and seventeenth centuries, many prominent scientists, including Tartaglia, Galileo, Newton, Descartes, Bernoulli, and Euler, have devoted some of their time and intellect to helping solve military problems. This article first argues that World War II and the subsequent Cold War produced a dramatic change in the way scientists became involved in the weapons innovation process. Next, it shows that concerns about the resulting ‘arms race’ brought about a new type of study—defense technology assessment studies—that dealt with the impact of new weapons systems on national and international security. Many of the newly developed weapons were perceived to have a negative impact, which raised the question of whether and how the weapons innovation process could be influenced. The article discusses a variety of analytical approaches aimed at understanding the dynamics of the weapons innovation process. It argues that a sociotechnical network approach is the most promising one to provide valuable insights for influencing this innovation process. This approach also provides a suitable framework for investigating the relationship between civil and military technological innovation, a subject of growing interest that is discussed in the final section.
1. Weapons Innovation Becomes Organized
World War II produced a dramatic change in the way scientists became involved in military matters. Science and scientists in great numbers were mobilized for weapons innovation in a highly organized and concentrated effort. In the USA these scientists contributed, mainly under the auspices of the newly established Office of Scientific Research and Development, to the development of a variety of new technologies, including the atomic bomb, radar, the proximity fuse, and also penicillin. The decisive contribution of scientists to these war efforts implied a fundamental shift in the role of science and technology in future military affairs. Immediately after the war, US science policy pioneer Vannevar Bush (1945), drawing on the war experience, advised the President:
[t]here must be more—and more adequate—military research in peacetime. It is essential that the civilian scientists continue in peacetime some portion of those contributions to national security which they have made so effectively during the war.
His advice stood in sharp contrast to Thomas Alva Edison’s suggestion, many years before, during World War I, to the Navy, that it should bring into the war effort at least one physicist in case it became necessary to ‘calculate something’ (Gilpin 1962, p. 10). For the first time in history military research and development (R&D) became a large-scale institutionalized process even in peacetime, indeed on a scale not seen before; it was legitimized as well as fueled by the climate of the Cold War. In the decades following the war, weapons were replaced in a rapid process of ‘planned obsolescence.’ The R&D was carried out in national laboratories, the defense industry, laboratories of the military services, and at universities to varying degrees in different countries. A United Nations study of 1981 estimated that annually some $100 billion, that is, some 20–25 percent of all R&D expenditures, were devoted to military R&D. The resulting ‘qualitative’ arms race in nuclear, conventional, and biological and chemical weapons between the NATO and Warsaw Pact countries during the Cold War raised the question of whether national and international security actually decreased, rather than increased, as a result of ‘destabilizing’ weapons innovations. A related question was whether military technological developments could be steered or directed so as not to undermine international arms control agreements. After the end of the Cold War in 1990, observers wondered why, particularly in the Western industrialized countries, military R&D efforts continued nearly unabated, while the original threat had disappeared. The systematic and organized involvement of science and technology in developing new armaments raised more questions of both societal and social-scientific interest. For instance, to what extent have military R&D and defense relations influenced academic research (mainly a concern in the USA) and the course—or even the content—of scientific and technological developments more generally?
What is the relationship between (developments in) civil and military technology: are these separate developments or do they, on the contrary, profit from each other through processes of diffusion and spin-off? Addressing these questions has become an interdisciplinary task, drawing on and integrating insights from many fields. This challenge has been taken up, though still on a limited scale, within the framework of the science and technology (S&T) studies that have emerged since the 1970s. This article focuses on the origin and nature of ‘defense technology assessment studies’ and the related research on possibilities for influencing the weapons innovation process.
2. Defense Technology Assessment Studies
The declared purpose of weapons innovation is enhancing national security (often broadly interpreted as including intervention and power projection). However, during the Cold War many argued that the ever-continuing weapons innovation process was actually counter-productive. The issue was not only that the huge amounts spent on armament might be a waste of resources, but also that the huge military R&D effort and the resulting new weapons caused a rapid decrease rather than an increase of national security (e.g., York 1971, p. 228). Since the 1960s, many studies, often carried out by scientists who had become concerned about the escalating arms race, have dealt with the impact of new weapons systems and new military technologies on national and international security. These studies pointed out, for instance, that the anti-ballistic missile (ABM) systems, consisting of many land-based antimissile missiles, as proposed by the USA in the 1960s, would actually stimulate the Soviet Union to deploy even more nuclear missiles. Also, a similar ABM system by the Soviet Union would trigger the USA to deploy multi-warhead missiles (MIRVs—Multiple Independently Targetable Re-entry Vehicles). These missiles, carrying up to twelve nuclear warheads, could thus saturate the capabilities of the Soviet ABM interception missiles. Actually, the development and deployment of MIRVed missiles by the USA even preceded a possible Soviet ABM system. As the then US Defense Secretary, Robert McNamara, wrote (quoted in Allison and Morris 1975, p. 118):
Because the Soviet Union might [emphasis in original] deploy extensive ABM defenses, we are making some very important changes in our strategic missile forces. Instead of a single large warhead our missiles are now being designed to carry several small warheads … . Deployment by the Soviets of a ballistic missile defense of their cities will not improve their situation. We have already [emphasis added] taken the necessary steps to guarantee that our strategic offensive forces will be able to overcome such a defense.
This weapons innovation ‘dynamics’ was aptly encapsulated by Jerome Wiesner, former science adviser to President John F. Kennedy, as ‘we are in an arms race with ourselves—and we are winning.’ In the 1990s, when the USA continued its efforts to develop anti-satellite (ASAT) technology capable of destroying an adversary’s satellites, Wiesner’s words might rightly have been paraphrased as ‘we are in an arms race with ourselves—and we are losing.’ For the irony here is that it is the US military system that, more than any other country’s defense system, is dependent on satellites (for communication, reconnaissance, eavesdropping, and so on), which would be highly vulnerable to a hostile ASAT system. The most likely route, however, for hostile countries to obtain the advanced ASAT technology would not be through their own R&D, but through the proliferation, that is, diffusion of US ASAT technology, once it had been developed. Again, the USA would very likely decrease
rather than increase its own security by developing ASAT technology. Many of the early and later ‘impact assessments’ of weapons innovations assessed their potential for circumventing and undermining existing international agreements that aimed to halt the arms race, like the Anti-Ballistic Missile (ABM) Treaty (1972) and the accompanying Strategic Arms Limitation Agreements (SALT, 1972), the SALT II Treaty (1979), and a Comprehensive Test Ban Treaty (CTBT, 1996). In addition, assessments were made, both by independent scientists and governmental agencies, of the potential of civil technologies to spill over into military applications: for in such cases, countries could, under the guise of developing civil technologies, make all preparations needed for developing nuclear, chemical, or biological weapons. Through these ‘dual-use’ technologies, a proliferation of weapons could occur, or at least the threshold for obtaining those weapons could be lowered, without formally violating the Non-Proliferation Treaty (1971), the Comprehensive Test Ban Treaty (1996), the Biological Weapons Convention (1972), or the Chemical Weapons Convention (1993). The Arms Control readings (York 1973) from the magazine Scientific American provide an instructive sample of early defense technology assessments. In the 1980s, both the USA and NATO emphasized the importance of a stronger conventional defense, which was then believed to be feasible because of new ‘emerging technologies,’ including sensor and guidance technologies, C³I (Command, Control, Communications and Intelligence) technologies (like real-time data processing), electronic warfare, and a variety of new missiles and munitions. The emphasis on high technology in the conventional weapons area, in its turn, triggered a great variety of studies (pro and con) by academic and other defense analysts, not only in the USA, but also in Europe.
These included analyses of the technical feasibility of the proposed systems, the affordability of acquiring sufficient numbers of ever more costly weapons, and the associated consequences for national defense and international security. The results of these defense technology assessments were fed into the public discussion, with the aim of containing the arms race and reaching international arms control agreements, or preventing their erosion. The impact of these assessments on the weapons innovation process and its accompanying military R&D was often limited. Many, therefore, considered the weapons innovation process to be ‘out of control,’ providing another topic of S&T studies on military technology.
3. Influencing the Weapons Innovation Process
What does it mean to ‘bring weapons innovation under political control’? The concept seems obvious at
one level but it is actually not trivial and needs elaboration. In national politics there is often no consensus on the kinds of armament that are desirable or necessary. Those who say that politics is not in control may actually mean that developments are not in accordance with their political preferences, whereas those who are quite content with current developments may be inclined to say that politics is in control. Neither position is analytically satisfactory. But neither is it satisfactory to say that politics is in control simply because actual weapons innovations are the outcome of the political process, which includes lobbying of defense contractors, interservice rivalry, bureaucratic politics, arguments over ideology and strategic concepts, and so forth, as Greenwood (1990) has suggested. Rather than speaking of control, one should ask whether it would be possible to influence the innovation process in a systematic way, or to steer it according to some guiding principle (Smit 1989). This implies that the basic issue of ‘control,’ even for those who are content with current developments, concerns whether it would be possible to change their course if this was desired. A plethora of studies have appeared on what has been called the technological arms race (see e.g. Gleditsch and Njolstad 1990). Many of them deal with what President Eisenhower in his much-cited farewell address called the military-industrial complex, later extended to include the bureaucracy as well. These studies belong to what has been called the bureaucratic-politics school or domestic structure model (Buzan 1987, Chap. 7), in contrast to the action-reaction models (Buzan 1987, Chap. 6), which focus on interstate interactions as an explanation of the dynamics of the arms race. A third approach, the technological imperative model (Buzan 1987, Chap.
8) sees technological change as an independent factor in the arms race, causing an unavoidable advance in military technology, if only for its links with civil technological progress—though a link whose importance is under debate (see also the last section of this article). By contrast, Ellis (1987), in his social history of the machine gun, has shown the intricate interweaving of weapons innovation with social, military, cultural, and political factors. To some extent, these studies might be considered complementary, focusing on different elements in a complex pattern of weapons development and procurement. For instance, the ‘reaction’ behavior in the interstate model might be translated into the ‘legitimation’ process of domestically driven weapons developments. Many of these studies are of a descriptive nature. Some (Allison and Morris 1975, Brooks 1975, Kaldor 1983) are more analytical and try to identify important determinants in the arms race. These factors are predominantly internal to each nation and partly linked up with the lengthy process—10 to 15 years—of the development of new weapons systems. Other
studies (Rosen 1991, Demchak 1991) relate military innovation, including military technology, to institutional and organizational military factors and to the role of civilians. There are quite a number of case studies on the development of specific weapons systems, and empirical studies on the structure of defense industries (Ball and Leitenberg 1983, Kolodziej 1987, Gansler 1980, 1987) or arms procurement processes (Cowen 1986, Long and Reppy 1980). However, hardly any of these studies focus on the question of how the (direction) of the weapons innovation process and the course of military R&D might be influenced. One task for future S&T studies, therefore, would be to combine insights from this great variety of studies for a better understanding of these innovation processes. Some steps have already been taken. The lengthy road of developing new weapons systems implies that it will be hard to halt or even redirect a system at the end when much investment has been made. Influencing weapons innovations, therefore, implies a continuous process of assessment, evaluation, and (re-)directing, starting at the early stages of the R&D process (see also Technology Assessment). Just striving for ‘technological superiority,’ one of the traditional guiding principles in weapons development, will lead to what is seemingly an autonomous process. Seemingly, because technology development is never truly autonomous. The appearance of autonomy results from the fact that many actors (i.e., organizations) are involved in developing technology, and no single actor on its own is able to steer the entire development. Rather, all actors involved are connected within a network—we may call it a sociotechnical network—working together and realizing collectively a certain direction in the weapons innovation process. 
Network approaches, appearing in several areas of science and technology studies since the mid 1980s, in which the positions, views, interests, and cultures of the actors involved are analyzed, as well as their mutual links and the institutional settings in which the actors operate, open up an interesting road for dealing with the question of influencing military technological developments (Elzen et al. 1996). Such approaches emphasize the interdependencies between the actors and focus on the nature of their mutual interactions (see also Actor Network Theory). From a network approach it is evident why one single organization by itself cannot determine technological developments. At the same time, network approaches have the power to analyze the way these developments may be influenced by actors in a network. Networks both enable and constrain the possibilities of influencing technology developments. Analyzing them may provide clues as to how to influence them. Weapons innovation and its associated military R&D differ in one respect from nearly all other technologies, in that there is virtually only one
customer of the end product—that is, the state. (Some civil industries, like nuclear power and telecommunications in the past, also show considerable similarities in market structure—monopolies or oligopolies coupled with one, or at most a few dominant purchasers. They are also highly regulated, and markedly different from the competitive consumer goods sectors (see Gummett 1990).) Moreover, only a specific set of actors comprises the sociotechnical networks of military technological developments, including the defense industry, the military, the defense ministry, and the government. The defense ministry, as the sole buyer on the monopsonistic armament market, has a crucial position. In addition, the defense ministry is heavily involved in the whole R&D process by providing, or refunding to industry, much of the necessary funds. Yet the defense ministry, in its turn, is dependent on the other actors, like the defense industry and military laboratories, which provide the technological options from which the defense ministry may choose. S&T studies of this interlocked behavior offer a promising approach for making progress on the issue of regulatory regimes. Not steering from a central position, but adopting instead an approach of ‘decentralized regulation’ in which most actors participate then seems the viable option for influencing military technological developments. In this connection various ‘guiding principles’ could play a role in a regulatory regime for military technological innovation (see also Enserink et al. 1992). Such guiding principles could include the ‘proportionality principle,’ ‘humanitarian principles,’ and ‘limiting weapons effects to the duration of conflict’ (contrary, for instance, to the use of current anti-personnel mines).
Additional phenomena that should in any event be taken into account in such network approaches are the increasing international cooperation and amalgamation of the defense industry, the possibly increasing integration of civil and military technology, and the constraining role of international arms control agreements. The intricate relation between civil and military technology will briefly be discussed in the final section.
4. Integration of Civil and Military Technology
Sociotechnical network approaches seem particularly useful for studying the relation between civil and military technology. This issue has assumed increasing interest because of the desire to integrate civil and military technological developments. Technologies that have both civil and military applications are called dual-use technologies. The desire for integration originates (a) from the need for lower priced defense products because of reductions in procurement budgets, and (b) from the new situation that in a number of technological sectors innovation in the civil sector has outstripped that in the military sector. Examples
of such sectors are computers and information and communication technology, where such integration is already emerging. Such integration, of course, could be at odds with a policy of preventing weapons proliferation as discussed before. Research on the transformations needed to apply civil technologies in the military sector and vice versa has only just begun. The extent to which civil and military technologies diverge depends not only on the diverging needs and requirements, but also on the different institutional, organizational, and cultural contexts in which these developments occur. Several case studies illustrate how intricately interwoven the characteristics of technology may be with the social context in which it is being developed or in which it functions. MacKenzie (1990) conducted a very detailed study of the development of missile guidance technologies in relation to their social context, and the technological choices made for improving accuracy, not only for missiles but also for aircraft navigation. He showed how different emphases in requirements for missile accuracy and for civil (and military) air navigation resulted in alternative forms of technological change: the former focusing on accuracy, the latter on reliability, producibility, and economy. A number of historical studies have addressed the question of the relation between civil and military technology. Some of them have shown that in a number of cases in the past, the military successfully guided technological developments in specific directions that also penetrated the civil sector. Smith (1985), for instance, showed that the manufacturing methods based on ‘uniformity’ and ‘standardization’ that emerged in the USA in the nineteenth century were more or less imposed (though not without difficulties and setbacks) by the Army Ordnance Department’s wish for interchangeable parts.
Noble (1985) investigated, from a more normative perspective, three technical changes in which the military have played a crucial role—namely, interchangeable parts manufacture, containerization, and numerical control— arguing that different or additional developments would have been preferable from a different value system. Studies going back to the nineteenth or early twentieth centuries, though interesting from a historical perspective, may not always be relevant for modern R&D and current technological innovation processes. Systematic studies of the interrelation between current military and civil technological developments are only of recent date (see Gummett and Reppy 1988). They point out that it may be useful to distinguish not only between different levels of technology, such as generic technologies and materials, components, and systems (Walker et al. 1988), but also between products and manufacturing or processing technologies (Alic et al. 1992). In certain technological areas the distinguishing features between civil and military technology may be found at the level
of system integration, rather than at the component level. In conclusion one may say that military technology, for many reasons, is a fascinating field for future science and technology studies: its links with a broad scope of societal institutions, its vital role in international security issues, the need for influencing its development, and its increasing integration with civil technology, particularly in the sector of information and communication technologies, which may revolutionize military affairs.
See also: Innovation: Organizational; Innovation, Theory of; Military and Disaster Psychiatry; Military Geography; Military History; Military Psychology: United States; Military Sociology; Research and Development in Organizations; Science and Technology, Social Study of: Computers and Information Technology
Bibliography
Alic J A, Branscomb L M, Brooks H, Carter A B, Epstein G L 1992 Beyond Spinoff: Military and Commercial Technologies in a Changing World. Harvard Business School Press, Boston
Allison G T, Morris F A 1975 Armaments and arms control: Exploring the determinants of military weapons. Daedalus 104(3): 99–129
Ball N, Leitenberg M 1983 The Structure of the Defense Industry. Croom Helm, London and Canberra
Brooks H 1975 The military innovation system and the qualitative arms race. Daedalus 104(3): 75–97
Buzan B 1987 An Introduction to Strategic Studies: Military Technology and International Relations. Macmillan, Basingstoke, UK
Cowen R 1986 Defense Procurement in the Federal Republic of Germany: Politics and Organization. Westview Press, Boulder, CO
Demchak C C 1991 Military Organizations, Complex Machines. Modernization in the US Armed Services. Cornell University Press, Ithaca, NY
Ellis J 1987 The Social History of the Machine Gun. The Cresset Library, London. (Reprint of 1975 ed.)
Elzen B, Enserink B, Smit W A 1996 Socio-technical networks: How a technology studies approach may help to solve problems related to technical change. Social Studies of Science 26(1): 95–141
Enserink B, Smit W A, Elzen B 1992 Directing a cacophony—weapon innovation and international security. In: Smit W A, Grin J, Voronkov L (eds.) Military Technological Innovation and Stability in a Changing World. Politically Assessing and Influencing Weapon Innovation and Military Research and Development. VU University Press, Amsterdam, The Netherlands, pp. 95–123
Gansler J S 1980 The Defense Industry. MIT Press, Cambridge, MA
Gansler J S 1989 Affording Defense. MIT Press, Cambridge, MA
Gilpin R 1962 American Scientists and Nuclear Weapons Policy. Princeton University Press, Princeton, NJ
Gleditsch N P, Njolstad O (eds.) 1990 Arms Races. Technological and Political Dynamics. Sage, London
Greenwood T 1990 Why military technology is difficult to constrain. Science Technology and Human Values 15(4): 412–29
Gummett P H 1990 Issues for STS raised by defense science and technology policy. Social Studies of Science 20: 541–58
Gummett P H, Reppy J (eds.) 1988 The Relations Between Defense and Civil Technologies. Kluwer, Dordrecht, The Netherlands
Hacker B C 1994 Military institutions, weapons, and social change: Towards a new history of military technology. Technology and Culture 35(4): 768–834
Kaldor M 1983 The Baroque Arsenal. Sphere, London
Kolodziej E A 1987 Making and Marketing Arms: The French Experience and its Implications for the International System. Princeton University Press, Princeton, NJ
Long F A, Reppy J (eds.) 1980 The Genesis of New Weapons. Pergamon, New York
MacKenzie D 1990 Inventing Accuracy: A Historical Sociology of Nuclear Missile Guidance. MIT Press, Cambridge, MA
Mendelsohn E, Smith M R, Weingart P (eds.) 1988 Science, Technology and the Military. Sociology of the Sciences Yearbook, Vol. XII/1. Kluwer, Boston
Molas-Gallart J 1997 Which way to go? Defense technology and the diversity of ‘dual-use’ technology transfer. Research Policy 26: 367–85
Noble D F 1985 Command performance: A perspective on the social and economic consequences of military enterprise. In: Smith M R (ed.) Military Enterprise and Technological Change—Perspectives on the American Experience. MIT Press, Cambridge, MA, pp. 329–46
Rosen S P 1991 Winning the Next War. Innovation and the Modern Military. Cornell University Press, Ithaca, NY
Sapolsky H M 1977 Science, technology and military policy. In: Spiegel-Rösing I M, de Solla Price D (eds.) Science, Technology and Society. A Cross-disciplinary Perspective. Sage, London, pp. 443–72
Smith M R 1985 Army ordnance and the ‘American system’ of manufacturing, 1815–1861. In: Smith M R (ed.) Military Enterprise and Technological Change—Perspectives on the American Experience. MIT Press, Cambridge, MA, pp. 39–86
Smit W A 1989 Defense technology assessment and the control of emerging technologies. In: Borg M ter, Smit W A (eds.) Non-provocative Defence as a Principle of Arms Reduction. Free University Press, Amsterdam, The Netherlands, pp. 61–76
Smit W A 1995 Science, technology, and the military: relations in transition. In: Jasanoff S, Markle G E, Peterson J C, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, London, pp. 598–626
Smit W A, Grin J, Voronkov L (eds.) 1992 Military Technological Innovation and Stability in a Changing World. Politically Assessing and Influencing Weapon Innovation and Military Research and Development. VU University Press, Amsterdam
The Annals of the American Academy of Political and Social Science 1989 Special Issue: Universities and the Military. 502 (March)
Walker W, Graham M, Harbor B 1988 From components to integrated systems: Technological diversity and integration between military and civilian sectors. In: Gummett P H, Reppy J (eds.) The Relations Between Defence and Civil Technologies. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 17–37
York H F 1971 Race to Oblivion: A Participant’s View of the Arms Race. Simon and Schuster, New York
York H F (comp.) 1973 Arms Control. Readings from Scientific American. Freeman, San Francisco
W. A. Smit
Scientific Academies, History of
Scientific academies are associations of scientific practitioners like the Academie Royale des Sciences of Paris, the Royal Society of London, the Berlin Academy, the Russian Academy of Sciences, or the US National Academy of Sciences. Since their inception in seventeenth-century Europe, the distinctive feature of scientific academies has been the regulation of their membership and activities according to corporate protocols, statutes, and bylaws (which, nevertheless, may be developed in accordance with the policies of the state authorities that sometimes sponsor them). Typical activities of scientific academies include: the publication of journals, monographs, and the collected work of famous scientists, the award of prizes and medals (the Nobel Prize being the most famous example), the organization of scientific meetings and, more rarely, of research projects (sometimes in collaboration with other academies). They may also take an advising role in matters of science policy, and may be called upon to advise governments on scientific and technological projects. The sociological and historical significance of scientific academies is tied to their crucial role in the development of science as a peer-based system, in constituting the ‘scientist’ as a distinct socio-professional role, in establishing networks of communication and collaboration, and, more generally, in fostering the so-called internationalism of science (Crawford 1992). Historically, scientific academies have functioned as institutions demarcating the elite of scientific practitioners from other socio-professional groups (which may have included other scientific practitioners pursuing different methodological programs or more practical subjects).
By doing so, they have created the conditions of possibility for corporate authority in science (though, historically, different academies have taken different stances with respect to their role as judges of scientific and technical claims). They have also constituted themselves as crucial ‘lobbying’ entities in negotiations between corporate science and the state. But while academies are seen as a cornerstone in the development of science as a peer-based and peer-regulated system, the very notion of peer has undergone important changes since the seventeenth century and remains contested even today. There is a circular relationship between the definition of peer and academy. Academies are constituted by peers, but they also constitute what peer means. A peer is not only someone who has specific technical competencies in a given scientific field, but also a person who, through a process of training and professional socialization, has developed (and has been given the opportunity to develop) a specific corporate sociability and set of values. The ingredients of such a sociability are rooted in the historical, disciplinary, and socio-political context in which a given academy developed, but there are some features that have cut across these various contexts. One of them is gender. A historically conspicuous feature of scientific academies is their distinctly male corporate culture which, until the beginning of the twentieth century, led to the exclusion of women from their membership—an exclusion that was less marked in other less corporate sites of science (Schiebinger 1989). Another theme that runs through the literature on academies is their relative ‘independence.’ Some have seen academies as providing an institutional boundary between science and society, thereby shielding the allegedly ever-threatened ‘purity’ of science (Ben-David 1971). Others, instead, have taken the empirically more defensible view that academies are institutions constituted within specific socio-political ecologies—ecologies to whose development they contribute from within. The former view sees science as in need of being or becoming as independent as possible from the social nomos, while the latter treats it as a set of practices that have articulated increasingly finer and more pervasive relationships with the sociopolitical order precisely by developing more specialized and professionalized institutional forms. In one case, institutionalization means separation; in the other it means integration within an increasingly articulated interplay of social institutions.
These considerations bear directly on the problem of defining ‘scientific academy.’ It is easy to compile a long and inclusive list of past or present academies, but it would be much harder, perhaps impossible, to categorize them as a long-standing institutional type like, say, the university. Depending on historical and national contexts, ‘scientific academy’ refers to different types of associations with different structures, functions, methodological orientations, membership requirements, notions of intellectual property and authorship, funding arrangements, and affiliations with private patrons or state governments and bureaucracies. Since their inception in seventeenth-century Europe, scientific academies have dealt with the production, reward, and dissemination of scientific knowledge, the training of practitioners, the evaluation of patent applications, and the advising of state and political authorities on scientific and technical matters. Some academies have been established by practitioners themselves and tend to be independently funded. Others are state-sponsored and tend to function as a body of scientific experts for the state that
supports them. Certain academies have a distinctly international membership, while others draw it from a specific country (though they may also extend limited forms of membership to a few foreign practitioners). Some select their members from one specific discipline, others gather practitioners from different fields. Some promote and fund the production of new knowledge, others concern themselves with its dissemination. Most modern academies do not offer stipends to researchers (and may even require their members to contribute annual fees), while some earlier academies (Paris, Berlin, and St Petersburg) did provide their members with salaries and research funds (McClellan 1985). Because scientific academies have assumed most of the functions now typical of other institutions, it is possible to find partial analogies between academies and these other components of the social system of science. For instance, it may be difficult in some cases to draw a sharp demarcation between scientific academies and other institutions such as discipline-based professional associations, national laboratories and research institutes, foundations (like Rockefeller in the US), national agencies for the development of science (CNRS in France, NSF or NIH in the US, CNR in Italy, etc.), or even university science departments. Similar problems emerge when we try to distinguish scientific academies from other less scientific-sounding associations which, nevertheless, promote or promoted the production of knowledge about nature.
In this category we find early modern philosophical salons (usually established and run by women (Lougee 1976)), religious orders that supported collective work in mathematics, astronomy, and natural history (like the Society of Jesus), provincial academies of natural philosophy, belles lettres, and history (common in Europe and America throughout the nineteenth century), nonprofessional associations for the popularization of science (which may be seen as the modern heirs to earlier provincial academies), or contemporary associations of amateur scientists. The same taxonomical difficulties emerge when one looks for some essential demarcation between academies and collection-based institutions like the Smithsonian Institution in Washington and other museums of science and natural history, and botanical gardens. In the premodern period, such institutions tended to be associated with academies and, as shown by the American Museum of Natural History in New York, may still perform some of the research functions of their historical ancestors. The same may be said about astronomical observatories. Many of them (Greenwich, Paris, Berlin, St Petersburg, etc.) were developed within scientific academies or soon became annexed to them, but later evolved into independent institutions or became linked to other state agencies (like the Bureau de Longitude in nineteenth-century France). In sum, one either runs the risk of treating ‘academy’ as a synonym for ‘scientific institution’ (in the same
way some apply the term ‘laboratory’ to anything from large-scale international organizations like CERN and private industrial entities like Bell Labs, to early modern alchemical workshops or to Uraniborg—Tycho Brahe’s sixteenth-century aristocratic observatory-in-a-castle (Christianson 2000)) or, in a reverse move, of identifying academies solely with institutions bearing that name (but likely to display remarkably different structures and functions), while leaving out a slew of other associations whose activities may in fact overlap with those pursued, at one point or another, by ‘proper’ academies. A possible solution to this definitional puzzle is to take a genealogical (rather than taxonomical) approach. In any given historical and national context there is a division of labor and functions between scientific academies and the surrounding institutions of science and of the state bureaucracy. It is this institutional ecology that we need to look at to understand which role an academy fulfills in any given context. If we take peer-based management to be the distinctive trait of academies, then a reasonable starting point of their genealogy is the development of legal statutes and bylaws; that is, of instruments for the establishment of their corporate form of life. This makes the Royal Society of London (chartered in 1662) the first scientific academy. This criterion excludes a number of well-known seventeenth-century academies (such as the Lincei (1603–1631), the Cimento (1657–1667), and even the early Academie Royale des Sciences (which began to gather around 1666 but obtained its first statute only in 1699)). Academies that did not have legal statutes or charters gathered at the will of a prince who sponsored them, and operated within the rules and system of exchange typical of the patronage system. They can be seen as the extension of the model of informal court gatherings and aristocratic literary salons to the domain of natural philosophy.
They were prince-dependent not only in the sense that they were funded by princes, but also in that their work, publications, membership, and sociability were structured by their direct, quasi-personal relationship to the patrons who, in several cases, participated in them. In these contexts, academicians were more princely subjects than collaborating scientific peers (Biagioli 1996). The next step, the development of charters, statutes, and bylaws, marked the transition from the patronage framework to institutions which, while still connected to royal power, had a more formal and bureaucratic relationship to the state. This institutional development followed two different models, one exemplified by the Royal Society of London, the other by the Academie Royale des Sciences. These two models matched quite closely the structure and discourse of royal power in those two countries: absolutism in France and constitutional monarchy in England.
Scientific Academies, History of In fact, despite its name, the Royal Society was a private institution (some may say a large gentlemen’s club) that, while chartered by the king, received negligible funding and virtually no supervision from the crown (Hunter 1989). Its members were nominated and approved by the Society itself. Similar protocols were followed for the selection of the president and other officials. There were no paid positions except that of the secretary and the curator of experiments, and the operating budget came mostly from annual membership fees which were typically hard to collect. The training of young practitioners was not one of the Society’s stated goals. The Academie des Sciences, instead, was much more of a state agency. The French crown funded it, provided stipends, and closely controlled its projects, membership, and publications. Its members were grouped according to disciplinary taxonomies, and organized hierarchically according to seniority (with students at the bottom) and, sometimes, social status. The king selected crown bureaucrats to run its operation (Hahn 1971). These two models and their various permutations were adopted during the remarkable development (almost an explosion) of scientific academies in the eighteenth century. Driven by a mix of national pride, emulation, and the state’s growing appetite for technical expertise, virtually all European states, from England to Russia, developed such academies (McClellan 1985). Academies proliferated not only across but within nations, moving from the capitals to the periphery (Roche 1978). Reflecting the logic of political absolutism, state academies tended to follow the French paradigm, while provincial and private academies, having little or no connections to central state power, looked to the Royal Society as their primary model. 
The center–periphery distinction maps quite well onto the disciplinary scope of early scientific academies: provincial institutions often included literature and history sections, while centrally located academies tended to specialize in natural philosophy. Until the French Revolution, academies constituted the primary site of scientific activity, epitomized the very notion of scientific institution, and provided science with a kind of social and professional legitimation it had struggled to achieve in previous periods. Some military schools (especially in France) shared the academies’ concern with research. But with the exception of Bologna, Pavia, and Göttingen, early modern universities usually limited themselves to the teaching of (not the research in) natural philosophy. If the eighteenth century was the golden age of scientific academies, the nineteenth century saw the beginning of their decline, or at least their slow but steady reframing from research institutions into prestige-bearing ones. New kinds of institutions emerged and older ones were reformed as the social system of science developed in scale and specialization. The
remarkable complexity of this scenario is such that only a few general trends can be discussed here, and only in reference to their impact on the role of academies. First, starting with Germany, the restructuring of the university led to an increased representation of the sciences in its curriculum and, more importantly, to a slow but steady trend toward research, not only teaching, in the natural sciences. In the second half of the nineteenth century, this trend was followed by other countries including the US, where new universities (first Johns Hopkins and then the so-called land-grant state universities) were developed with an eye on European research models, which were slowly adopted by Ivy League universities as well. This trend eroded much of the academies’ pedagogical and research role. Second, with the slow but progressive rise of science-based industry, the private sector (often in collaboration with universities) provided more niches for research and for the employment of fast-growing cadres of scientists. Widespread industrial development eroded another aspect of the academies’ traditional function: that of judge of technological innovation. The emergence of the patent system, first in England but, after 1800, in most other European countries and the US, placed the reward and protection of technological innovation in the domain of the law and of the market and disconnected it from its authoritative assessment by academies (as had been the case in early modern France and other countries). Third, the establishment or growing role of state or military technical agencies and schools (like, say, the Corps of Engineers and Geological Survey in the US, the Ecole Polytechnique and the Ecole des Ponts et Chaussees in France) greatly reduced the role of scientific academies as providers of technical and scientific expertise to the state (Shinn 1980).
Fourth, with the branching of science in increasingly numerous disciplines, a slew of professional associations emerged and, with them, increasingly specialized journals. Not only did these discipline-based associations erode the academies’ quasi-monopoly on scientific publishing and translations, but they also took up much of their function as international nodes of communication, collaboration, and standardization of methodologies, terminology, and units of measurement. The ‘internationalism’ of science which largely had been connected not only to networks of communication among academies but also to the export and import of academicians across national boundaries (most conspicuously in the eighteenth century) was replaced by more specific disciplinary and professional networks and by the migration of students (and later post doctorates) between universities and, later, institutes and laboratories. Similarly, the academies’ role in the diffusion of scientific knowledge was eroded by these trends, as well as by the development of a wider nonprofessional audience for science and of
the popular publications and magazines that catered to them. All these trends continued and were further accelerated in the twentieth century, when the additional creation of national agencies for the promotion and funding of science and the growing role of private foundations in the patronage of science took on yet another role that had been traditionally played by academies. Even large-scale, state-sponsored technical and military projects (which, historically, had seen the direct or indirect participation of scientific academies) have become commonly structured around collaborations between the state, the university, and industry. International scientific collaborations are also primarily managed by universities and national science agencies and, in the case of science-based industry, through joint ventures, licensing, patent sharing, or mergers and acquisitions. (These trends, however, are typical of Western contexts. In the former Soviet Union, the Soviet Academy of Sciences played a dominant role in research, training, and rewards at a time when its western counterparts had largely shed those functions (Vucinich 1984).) In sum, it is possible to identify at least three phases in the genealogy and role of academies within the broader ecology of the social system of science: (a) patronage-based princely academies without statutes and corporate protocols up to the end of the seventeenth century; (b) peer-based academies with statutes, publications, and, in many cases, formalized relations with the state since the late seventeenth century, and mostly during the eighteenth century; and (c) dramatic increase of the scale and complexity of the social system of science associated with the multiplication of more specialized scientific institutions taking up most of the traditional roles of scientific academies (since the early nineteenth century).
The net result of this historical trajectory is that academies have moved from being informal gatherings structured by the curiosity and philosophical interests of princes and gentlemen, to comprehensive and multifunctional corporate entities crucial to the production and legitimation of science and its social system, to prestigious institutions that are marginally involved in research but still maintain an important advising role to the state and other constituencies, and bestow professional recognition on leading scientists through membership and awards. In many ways, the changing role of prizes awarded by academies encapsulates this trajectory: In the early modern period, prizes were seen as means to promote new research (not unlike today’s grants) but now they function as rewards of important work that scientists have done in the past. The importance of the largely symbolic role of modern scientific academies, however, should not be underestimated. In an age when ‘lobbying’ has become the dominant mode of corporate participation in
political decisions (especially in the US), academies remain the authoritative agents of corporate science, no matter how accurately they may be able to represent the interests and concerns of such a large and diversified community.

See also: Centers for Advanced Study: International/Interdisciplinary; History of Science; History of Science: Constructivist Perspectives; Scientific Disciplines, History of; Universities, in the History of the Social Sciences
Bibliography

Ben-David J 1971 The Scientist’s Role in Society: A Comparative Study. Prentice-Hall, Englewood Cliffs, NJ
Biagioli M 1996 Etiquette, interdependence, and sociability in seventeenth-century science. Critical Inquiry 22: 193–238
Boehm L, Raimondi E 1981 Università, Accademie e Società Scientifiche in Italia e in Germania dal Cinquecento al Settecento. Il Mulino, Bologna, Italy
Cahan D 1984 The institutional revolution in German physics, 1865–1914. Historical Studies in the Physical and Biological Sciences 15: 1–65
Cavazza M 1990 Settecento inquieto: Alle origini dell’Istituto delle Scienze di Bologna. Il Mulino, Bologna, Italy
Christianson J R 2000 On Tycho’s Island. Cambridge University Press, Cambridge, UK
Crawford E 1992 Nationalism and Internationalism in Science, 1880–1930: Four Studies of the Nobel Population. Cambridge University Press, Cambridge, UK
Crosland M 1992 Science Under Control: The French Academy of Sciences, 1795–1914. Cambridge University Press, Cambridge, UK
Daniels G 1976 The process of professionalization of American science: The emergent period, 1820–1860. In: Reingold N (ed.) Science in America Since 1820. Science History Publications, Canton, MA, pp. 63–78
Fox R, Weisz G 1980 The Organization of Science and Technology in France, 1808–1914. Cambridge University Press, Cambridge, UK
Frängsmyr T (ed.) 1989 Science in Sweden: The Royal Swedish Academy of Sciences, 1739–1989. Science History Publications, Canton, MA
Frängsmyr T (ed.) 1990 Solomon’s House Revisited. Science History Publications, Canton, MA
Graham L 1967 The Soviet Academy of Sciences and the Communist Party, 1927–1932. Princeton University Press, Princeton, NJ
Hahn R 1971 The Anatomy of a Scientific Institution. University of California Press, Berkeley, CA
Heilbron J L 1983 Physics at the Royal Society During Newton’s Presidency. William Clark Memorial Library, Los Angeles
Hunter M 1989 Establishing the New Science. Boydell, Woodbridge, UK
Jasanoff S 1990 The Fifth Branch: Science Advisers as Policymakers. Harvard University Press, Cambridge, MA
Kevles D 1978 The Physicists: The History of a Scientific Community in Modern America. Knopf, New York
Knowles Middleton W E 1971 The Experimenters: A Study of the Accademia del Cimento. Johns Hopkins University Press, Baltimore, MD
Kohlstedt S 1976 The Formation of the American Scientific Community. University of Illinois Press, Urbana, IL
Lougee C 1976 Le Paradis des Femmes: Women, Salons, and Social Stratification in Seventeenth-century France. Princeton University Press, Princeton, NJ
MacLeod R, Collins P (eds.) 1981 The Parliament of Science: The British Association for the Advancement of Science, 1831–1981. Kluwer, Northwood, UK
McClellan J E 1985 Science Reorganized. Columbia University Press, New York
Moran B 1992 Patronage and Institutions. Boydell, Woodbridge, UK
Morrell J, Thackray A 1981 Gentlemen of Science: Early Years of the British Association for the Advancement of Science. Clarendon Press, New York
Oleson A, Brown S C (eds.) The Pursuit of Knowledge in the Early American Republic: American Scientific and Learned Societies from Colonial Times to the Civil War. Johns Hopkins University Press, Baltimore, MD
Pyenson L, Sheets-Pyenson S 1999 Servants of Nature. Norton, New York
Roche D 1978 Le Siècle des Lumières en Province. Mouton, Paris
Schiebinger L 1989 The Mind Has No Sex? Harvard University Press, Cambridge, MA
Shinn T 1980 Savoir scientifique et pouvoir social: L’Ecole Polytechnique, 1794–1914. Presses de la Fondation Nationale des Sciences Politiques, Paris
Vucinich A 1984 Empire of Knowledge: The Academy of Sciences of the USSR, 1917–1970. University of California Press, Berkeley, CA
M. Biagioli

Scientific Academies in Asia

The criterion for selection of the institutions described in this article is that academic research is their main function. Hence, academies that focus mainly on teaching, and policy-oriented research institutes, are not included. Taiwan’s Academia Sinica is examined in terms of organization and research accomplishment, as well as its role in society. Similar research academies in mainland China and in Japan are also briefly introduced.

1. The Academia Sinica, Taiwan

1.1 Organization

The Academia Sinica was founded in mainland China in 1928 with two major missions: to undertake research in science and the humanities and to instigate, coordinate and encourage academic research. In 1949, after the Chinese civil war, the Academia was moved to Taiwan and re-established at the present site in 1954. The Academia has been headed by seven presidents since its founding, and the current president, Dr. Lee Yuan-Tseh, a Nobel laureate in chemistry, took office in 1994.

The upper-level organization of the Academia Sinica comprises three parts: the convocation, the council, and the central advisory committee. The convocation is a biennial meeting attended by preeminent Chinese scholars from all over the world who are elected to be academicians of the Academia Sinica. The year 2000 saw the 26th convocation, and most of the 193 academicians (an honorary lifetime title) gathered to elect new academicians and council members. They also proposed research policies for the Academia Sinica. The council consists of 18 ex officio members, including all directors of research institutes, and 35 elected members. Specific functions of the council include review of research projects, evaluation of proposals related to institutional changes, promotion of academic cooperation within and outside Taiwan and, perhaps most importantly, presentation of a shortlist of three candidates for the presidency of the Academia Sinica to the President of the Republic of China, who makes the final decision. The central advisory committee, established in 1991, includes chairpersons of the advisory committees of individual institutes and three to nine specialists nominated by the president of the Academia Sinica. Its tasks are to recruit scholars of various disciplines as well as to suggest long-term and interdisciplinary research plans to the president. The committee is also responsible for evaluating large-scale cross-institutional research projects, applications for postdoctoral research posts, and the annual awards for junior researchers’ publications.

But the core of the Academia Sinica is made up of the 25 institutes/preparatory offices classified into three divisions: Mathematics and Physical Sciences, Life Sciences, and Humanities and Social Sciences (see Table 1). In 2000, 815 research staff (including 13 distinguished research fellows, 324 research fellows, 197 associate research fellows, 138 assistant research fellows, and 143 research assistants) conducted active research either individually or in groups within, as well as across, institutions. In addition to the research staff, postdoctoral researchers, contracted research assistants, and administrative officers add up to approximately 3,000 people working in the Academia Sinica.

1.2 Research Focus and Achievements

There are six fundamental principles or basic goals guiding academic research in Academia Sinica (Lee 1999). Examples of research accomplishments, landmark research projects and significant publications are discussed below.

1.2.1 A balance between scientific and humanitarian research. Critics have condemned the dominance of
Table 1 Number of research staff at Academia Sinica, Taiwan: 1994–2000

| Research institutes | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 |
|---|---|---|---|---|---|---|---|
| Division of Mathematics & Physical Sciences | | | | | | | |
| Institute of Mathematics | 30 | 33 | 32 | 34 | 32 | 31 | 31 |
| Institute of Physics | 41 | 41 | 39 | 39 | 42 | 43 | 42 |
| Institute of Chemistry | 29 | 28 | 27 | 27 | 28 | 30 | 27 |
| Institute of Earth Sciences | 37 | 36 | 35 | 34 | 37 | 37 | 34 |
| Institute of Information Sciences | 28 | 29 | 30 | 30 | 32 | 34 | 36 |
| Institute of Statistical Sciences | 29 | 31 | 31 | 32 | 30 | 32 | 32 |
| Institute of Atomic and Molecular Sciences | 21 | 22 | 22 | 22 | 22 | 24 | 24 |
| Institute of Astronomy and Astrophysics (a) | 1 | 2 | 4 | 7 | 10 | 10 | 13 |
| Institute of Applied Science & Engineering (a) | | | | | | | 2 |
| Subtotal | 216 | 222 | 220 | 225 | 233 | 241 | 241 |
| No. increase | | 6 | -2 | 5 | 8 | 8 | 0 |
| Rate of increase (percent) | | 2.78 | -0.90 | 2.27 | 3.56 | 3.43 | 0.00 |
| Division of Life Sciences | | | | | | | |
| Institute of Botany | 41 | 41 | 41 | 39 | 38 | 39 | 40 |
| Institute of Zoology | 28 | 30 | 30 | 29 | 30 | 29 | 27 |
| Institute of Biological Chemistry | 30 | 31 | 33 | 34 | 35 | 34 | 36 |
| Institute of Molecular Biology | 80 | 84 | 82 | 79 | 82 | 85 | 87 |
| Institute of Biomedical Sciences | 51 | 48 | 49 | 54 | 54 | 55 | 54 |
| Institute of BioAgricultural Sciences (a) | | | | | | 8 | 11 |
| Subtotal | 230 | 234 | 235 | 235 | 239 | 250 | 255 |
| No. increase | | 4 | 1 | 0 | 4 | 11 | 5 |
| Rate of increase (percent) | | 1.74 | 0.43 | 0.00 | 1.70 | 4.60 | 2.00 |
| Division of Humanities & Social Sciences | | | | | | | |
| Institute of History and Philology | 65 | 66 | 66 | 69 | 60 | 58 | 56 |
| Institute of Ethnology | 39 | 40 | 25 | 27 | 28 | 27 | 27 |
| Institute of Modern History | 53 | 51 | 51 | 49 | 48 | 49 | 46 |
| Institute of Economics | 48 | 48 | 49 | 49 | 49 | 46 | 46 |
| Institute of American and European Studies | 35 | 34 | 37 | 34 | 34 | 33 | 32 |
| Sun Yat-Sen Institute for Social Sciences and Philosophy | 48 | 48 | 44 | 47 | 45 | 47 | 46 |
| Institute of Sociology | | | 17 | 18 | 19 | 21 | 18 |
| Institute of Chinese Literature and Philosophy (a) | 14 | 16 | 17 | 19 | 17 | 22 | 21 |
| Institute of Taiwan History (a) | 1 | 6 | 10 | 13 | 14 | 15 | 16 |
| Institute of Linguistics (a) | | | | | 12 | 11 | 11 |
| Subtotal | 303 | 309 | 316 | 325 | 326 | 329 | 319 |
| No. increase | | 6 | 7 | 9 | 1 | 3 | -10 |
| Rate of increase (percent) | | 1.98 | 2.27 | 2.85 | 0.31 | 0.92 | -3.04 |
| Total | 749 | 765 | 771 | 785 | 798 | 820 | 815 |
| No. increase | | 16 | 6 | 14 | 13 | 22 | -5 |
| Rate of increase (percent) | | 2.14 | 0.78 | 1.82 | 1.66 | 2.76 | -0.61 |

(a) Preparatory Office
Net increase in total research staff, 1994–2000: 66
scientific over humanitarian research in Taiwan. This imbalance came about because of the general need to modernize the country’s economy in the postwar years. During this process, most resources were allocated to technology-oriented research. As a result, natural science and the life sciences enjoyed a major share of manpower as well as of budget. In later years, especially after the lifting of martial law on
Taiwan in the 1980s, the environment for social sciences research improved greatly. The humanities and social sciences (at least in the Academia Sinica) have received relatively fair treatment if adequacy of regular budget is used as an indicator. This is perhaps a rare phenomenon worldwide. The Academia Sinica claims to have achieved balanced development among its three divisions:
Mathematics and Physical Sciences, Life Sciences, and Humanities and Social Sciences. This has been achieved by setting different criteria for the various divisions with regard to both their research direction and evaluation, and by building up a lively academic community that allows researchers from various divisions to work on the same topic. A recently completed study on Taiwanese aboriginals is one example of such a project. Under the coordination of a researcher from the Institute of Ethnology, staff from Biomedical Sciences, along with scholars from universities and hospitals, jointly investigated the migratory history, the hereditary traits from blood samples, and the ethnic differences of aboriginals in order to establish possible genetic polymorphic markers. This type of joint endeavor by different disciplines continues and is given priority in the funding process. An ongoing study of cardiovascular-related disease, the second leading cause of death in Taiwan, is another example. The study intends to collect multifaceted community data to help combat this disease. Funded by the Academia Sinica, researchers in epidemiology, social demography, economics and statistics have formed an interdisciplinary team to tackle the issue from various angles.
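The ‘No. increase’ and ‘Rate of increase (percent)’ rows of Table 1 appear to be simple year-on-year figures: the change from the previous year, and that change as a percentage of the previous year’s count. The source does not state the formula, so the following is only a minimal sketch under that assumption, using the Total row of Table 1 (749 staff in 1994 to 815 in 2000). The function name `year_on_year` is illustrative and not part of the original article.

```python
def year_on_year(counts):
    """Return (increase, rate_percent) for each consecutive pair of yearly counts.

    Assumption: the rate is the increase divided by the PREVIOUS year's count,
    expressed as a percentage and rounded to two decimals.
    """
    rows = []
    for prev, curr in zip(counts, counts[1:]):
        increase = curr - prev
        rate = round(100 * increase / prev, 2)
        rows.append((increase, rate))
    return rows


# Total research staff, 1994-2000, as given in the Total row of Table 1.
total_staff = [749, 765, 771, 785, 798, 820, 815]

for year, (increase, rate) in zip(range(1995, 2001), year_on_year(total_staff)):
    print(f"{year}: increase {increase:+d}, rate {rate:+.2f}%")

# Net change over the whole period (last count minus first count).
print("net change 1994-2000:", total_staff[-1] - total_staff[0])
```

Under this reading, the recomputed figures can be compared directly with the printed rows; the same function can be applied to any division subtotal to check that row as well.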
1.2.2 A balance between indigenous and international research. In the late 1990s the question of the relative priority of indigenization versus internationalization of academic research was much debated among social scientists in Taiwan. Supporters of indigenization emphasized the particularistic or the unique aspect of social science studies in Taiwan and the importance of avoiding the influence of dominant Western models. To others, however, internationalization of social sciences is regarded as an inevitable global trend fitting into the theme of ‘knowledge without national boundaries.’ Regarding debates on the nature of the research, the Academia Sinica takes a balanced stand. On the one hand, it encourages active participation in the valued conventional research areas. On the other hand, focusing on Taiwan’s particular social issues and disseminating relevant research findings is considered to be important for the intellectual community and for mankind in general. Hence, the Academia Sinica has funded large-scale research projects that have both of the above purposes in view. A recent group project on the long-term development of capitalism in Taiwan has the potential to extend the Taiwanese experience to other Asian economies. This study encompasses the history of agricultural and industrial developments in Taiwan, trade and navigational expansion, macroeconomic performance, and the role the Taiwan Development Company played in the consolidation of capitalism in
the territory. From the colonial era to the period of Japanese rule and the postwar era, Taiwan has gone through significant social transitions in its capitalist development. Although each stage may be characterized by different sets of institutions, one common factor emerges from the historical process: the expansion of exports (from tea, sugar, rice, processed foods, to light manufactured goods and producer goods). The exploration of the origin and the evolution of capitalist development in Taiwan will not only benefit the local academies, but will enhance the comparative study of economic development in other countries.
1.2.3 A balance between basic and applied research. Academic research is the main function of the Academia Sinica. However, increasing demands are being made on it to provide research results for applied use. The Academia Sinica recognizes the necessity to respond to important social phenomena and the need for technological development in its research endeavors. Heavy emphasis has been placed on the implementation of findings from basic research. An equal amount of effort has also been allocated to promoting applied research that may shed light on subsequent academic research. Two illustrations from Information Science and Life Science will highlight this recent focus. A study on natural language understanding is directed towards the construction of a computer program with a knowledge system that is capable of understanding human perception of various recognition systems. The project has successfully developed a concept recognition mechanism called the ‘Information Map.’ This map arranges human knowledge in a hierarchical fashion with a cross-referencing capability. Using the information map, concept understanding can be reduced to intelligent semantic search in the knowledge system. This project has already produced a semantic ‘Yellow Page’ search for a telephone company and an automatic question and answer agent on the internet. A popular product of its Chinese phonetic input system is a typing software package widely used by the public. Another well-known research project with considerable applied value is the method for detecting differentially expressed genes. The research group has developed a DNA microarray with a colorimetric detection system to simultaneously monitor the expression of thousands of genes in a microarray format on a nylon membrane. Testing on filter membranes and quantifying the expression levels of the target genes in cells under different physiological or diseased states will reduce each screening process to a couple of days.
It is clear that the applied research is basically restricted to non-social sciences. In social science divisions, market-oriented or consumer-involved
studies remain quite rare. Researchers are mostly committed to basic research funded by the institute or by the National Science Council. Although policy study is ostensibly given a high priority, the fact that institutes to be established in the near future will still specialize in the basic disciplines reflects a general emphasis on this area. At present, few policy-oriented research projects are undertaken by individuals.

1.2.4 Coordination and promotion of academic research in Taiwan. Under the new organizational law, which allows more flexibility for research institutes to establish ad hoc research centers, the Academia Sinica will be given the responsibility of proposing a national academic research agenda, coordinating academic research in Taiwan, and training intellectual manpower. These tasks reflect the central role of the Academia Sinica in Taiwanese academies. In order to meet these requirements, the Academia has attempted to collaborate with various universities by exchanging teaching and research staffs. Ad hoc research committees and specific research programs including scholars from different institutes have also been established. The committees on Sinological Research, on Social Problems, on Taiwan Studies, and on Mainland China Studies, all established since 1997, exemplify such an effort. These committees are interdisciplinary in nature, and comprise scholars from within as well as outside the Academia Sinica. Each committee may focus on a few selected research issues and organize related seminars/conferences. It is also allowed to form various taskforces to plan future collaborative research topics. Besides the ad hoc research committees, the promotion of large-scale cross-institutional research projects has become important to the Academia Sinica.
A so-called ‘thematic project’ will share a common research framework and will include several individual research topics proposed by researchers from different institutes and universities. The study of the organization-centered society represents one of these group efforts. In the investigation of modern social structure, nine subprojects were proposed, all funded exclusively by the Academia Sinica. Their finding that impersonal trust, rather than the traditional interpersonal trust, is key to organizational success in Taiwan’s economic development has given rise to the future research perspective that a modern society such as Taiwan is organization-centered. Similar thematic projects aiming to promote collaborative work among various academic institutes have been encouraged by the granting of funds. However, it should be pointed out that playing a central role does not equate to having the central planning function. Taiwanese social scientists, in comparison with life scientists or Japanese colleagues, have a strong inclination to pursue individual research. Various researchers joining together in a large communal laboratory or generations of scholars working on the same topic are not common at all. The thematic project in the Academia Sinica, or the joint project promoted by the National Science Council, is more of an initiative to encourage collaborative teamwork than a reaction to the present research demand. Whether individual projects maintain their importance or are replaced by group efforts will not change the expectations placed on the Academia Sinica: to respect individual research freedom and to facilitate research needs in Taiwan.
1.2.5 Encouragement of international collaboration. Active participation in international research has always been a priority in Taiwan. Researchers are encouraged to present their findings at international meetings and to build collaborative relationships with foreign research groups. Renowned scholars abroad are also frequently invited to visit and to work with research staff at the Academia Sinica. In line with this trend, a proposal has been made to establish an international graduate school at Academia Sinica (Chang 2000). The aim of this program is to attract highly qualified young people to do their Ph.D. degree under the guidance of top researchers in Taiwan. The competitive program is intended to provide the opportunity for independent inquiry as well as dynamic peer interaction; it is assumed that the supportive research environment will facilitate the training of future intellectual leaders and creative scholars. Although this program is yet to be finalized, the Academia Sinica has clearly revealed its interest in taking a concrete step towards globalization by investment in brilliant young minds.
1.2.6 Feedback into society. It is the firm belief among academicians that any type of feedback to society from members of the Academia Sinica must be based on solid academic research. Several feedback patterns have been adopted. With regard to emergent social issues, researchers with related knowledge and research findings are encouraged to air their suggestions by organizing conferences or public hearings. The problem of the over-plantation of betel nuts on hills and mountains and its harmful effect on the environment is an example. Short-term research projects, such as the analysis of juvenile delinquency initiated by the committee on social problems, are another possible strategy. Furthermore, the Academia Sinica opens its campus annually and individual research institutes sponsor introductory lectures for interested students and visitors. Numerous data sets from social science or biological research, as well as valuable original historical files, are gradually being released for public use.
1.3 The Role of the Academia Sinica

Whether the Academia Sinica should play a role beyond pure academic research and beyond research-based feedback to various social problems has always been hotly debated. Some of the more articulate members have been quite vocal in insisting that it should have a stronger role in economic and social life. When it comes to how far the institute should involve itself in politics, however, the issue is more delicate. The subject is closely linked with the role of contemporary intellectuals in Taiwan (Chang 2000), where it is considered permissible for intellectuals to voice their moral conscience with regard to significant political issues. Nevertheless, it is precisely because most intellectuals are respected for their professional scholarship, not necessarily the correctness of their political views, that the appropriateness of actual political participation is seriously questioned. For most ordinary members of the Academia Sinica, the pressure to produce excellent research is the foremost stimulus. Academic ambition is factored into the evaluation of promotion, the review criteria for assessing institutes’ research accomplishment, the process of planning new research development, and the regular report to the Legislative Yuan. Nevertheless, considering that research staff are government employees, the public perhaps has a right to voice concerns about the public utility of the Academia Sinica, whatever its academic credentials and quality of research. When challenged about its value to the Taiwanese taxpayer, the Academia Sinica usually reminds the public of its past research accomplishments and its dynamic future role, concrete evidence of which can be found in several newly established research institutes. When it comes to past achievements, the most senior institute in the Humanities and Social Science division, the Institute of History and Philology, is often cited.
It was established in the same year as the Academia Sinica (1928). Early collective projects such as the An-Yang excavation, the study of Chinese dialects, and the reconstruction of ancient histories gained international fame for the institute. The institute is also engaged in the systematic compilation and organization of valuable Chinese historical documents, which contributes enormously to the field of Sinology and further enhances its academic status. Far from resting on the academic achievements of the past, the Academia Sinica is constantly trying to stay on the cutting edge of research, as can be seen by the recently established institutes. In the division of Mathematics and Physical Sciences, the Institute of Astronomy and Astrophysics (1993) and the Institute of Applied Science and Engineering Research (1999) are the two latest research institutes. The separation between pure and applied science, especially in the life sciences, is obviously not applicable any more. Within the division of Humanities, the Institute of Taiwan
History (1993) and the Institute of Linguistics (1997) were formed after drastic social changes had taken place in Taiwan. The Taiwan History institute is in the forefront of indigenous research in Taiwan and it has become the focal coordinating agency for Taiwan studies. In short, Taiwan’s Academia Sinica is a government-sponsored research institute. With funding available from the regular budget, academic research has been its main prescribed task. The highly trained research staff represents the research elite in Taiwan and has full liberty in deciding on individual projects. In recent years, the Academia Sinica has made a conscious effort to promote major interdisciplinary research programs both in fulfillment of its leadership role in the Taiwanese academies and as a response to changing societal expectations. A review of recent developments within the Academia Sinica reveals its intention to gain a greater global profile on the basis of its academic performance and generous research resources.
2. Research Academies in Mainland China

2.1 The Chinese Academy of Sciences

The Chinese Academy of Sciences was founded in 1949, the same year in which the Academia Sinica moved to Taiwan. With basic research as its main task, this academy has perhaps the largest organization of any institution of its type in the world. Besides the headquarters in Beijing, 22 branches made up of no fewer than 121 research institutes are scattered all over the country. Across its five academic divisions (Mathematics and Physics, Chemistry, Biological Science, Earth Science, Technological Science), more than 40,000 scientists and technical professionals work for the Academy. Among them, nearly 10,000 are regarded as basic research staff, and 230 members of the Academy (out of the current 584 members who are elected as the most preeminent scientists in mainland China) also actively engage in research at the Academy. Members of the Academy enjoy the highest academic prestige. They play a planning and consulting role in China’s scientific and technological development as well as providing reports and suggestions regarding important research issues. A review of the general research orientation of the Chinese Academy of Sciences reveals at least two important characteristics which may differentiate it from the Academia Sinica in Taiwan. First, basic research with highly applied values was a major priority of the Academy from the beginning. High-tech research and development centers are growing rapidly and the support staff composed of well-trained professionals has become a major facilitating force. Second, collaboration with industrial sectors and with foreign institutes has contributed to the Academy’s
research resources. The cooperative relationship has been substantial and extensive in that more than 3,000 enterprises have joined the industry–education–research development program. The international exchange program involves 7,000 personnel annually. This program has benefited both the research staff and the postgraduates of the Academy and fulfilled an important training function: more than 14,000 staff and graduate students have received advanced training abroad since 1978 and over 8,000 have completed their studies and returned to the Academy.
2.2 The Chinese Academy of Social Sciences

The Chinese Academy of Social Sciences was formally established in 1977 from the former Philosophy and Social Sciences division of the Chinese Academy of Sciences. The central headquarters in Beijing is made up of 14 research institutes employing 2,200 staff. Among these centralized institutes are Economics, Archeology, History, Law, and Ethnology. As is the case with the Academy of Sciences, there are many branch institutes throughout China so that the staff complement totals 4,200 in 31 research institutes. According to much of the publicity material on the Academy of Social Sciences, the needs of the country appear to be of the utmost importance in the selection of research projects. The material and spiritual development and the democratization of the nation are constantly cited as basic motives to conduct relevant studies. This bias probably owes more to the fact that funding is from central government than to any policy implications. For example, the national philosophy and social sciences committee has organized several selected research topics every five years, such as the study of changes among rural families under the economic reform coordinated by Beijing University during the 7th national Five-year Plan. A substantial proportion of research undertaken by the Academy of Social Sciences consists of special commissions of this sort. Because the availability of funding is the key to the commencement of any research, there is a substantial reliance on foreign funding. Funding from foreign foundations is usually generous enough to greatly enhance the possibility of conducting extensive studies across different regions of the nation. But collaboration with foreign institutes, especially in social science surveys, tends to be limited to data collection. High-quality academic publications arising from collaborative projects at the Academy of Social Sciences remain relatively scarce, and their promotion is a task for the future.
As in the Academy of Sciences, there is a longstanding international academic exchange program in the Academy of Social Sciences. More than 4,100 research staff and graduate students have participated in this program since 1978 and positive outcomes are
revealed in new research projects as well as publications. The 82 academic journals published by the Academy cover various disciplines such as sociology, law, history, literature, and world economics.
3. Research Academies in Japan

There are basically two lines of research institutions in Japan: one under the Ministry of Education and the other under the Science and Technology agencies. University-affiliated research institutes, as well as independent national research institutes with graduate schools, come under the jurisdiction of the Ministry of Education. As of 1997, among 587 Japanese universities, 62 had affiliated research institutes of which 20 were open to all researchers in Japan. There are also 14 independent research institutes in Japan, unaffiliated with any university, carrying out major academic research projects. These so-called interuniversity research institutes are set up because of a specific demand to undertake academic research that requires resources and manpower beyond the university boundary. The National Laboratory for High Energy Physics was the first of this kind to be established (1971). The famous National Museum of Ethnology (1974) and the National Institute of Multimedia Education (1997), which aim at scientific research, data collection, and curricula development, have a substantial research complement of their own and are staffed by visiting scholars from abroad as well as local ones. The actual contribution of the interuniversity research institutes lies in the nature of basic research. Large-scale facilities and data resources as well as human resources seconded from universities throughout Japan are considered important mechanisms in enhancing the progress of scientific research in Japan. Other national research institutes, mostly concerned with natural sciences but also including social sciences (such as the noted National Institute of Population and Social Security Research and Economic Research Institute), fall into the domain of Science and Technology agencies.
The population and social security research institute was founded in 1996 by combining two government research organizations: the Institute of Population Problems and the Social Development Research Institute. It is now affiliated with the Ministry of Health and Welfare. A research staff of 45 is located in seven research departments. Although policy-oriented research comprises a major proportion of the institute’s work and is self-defined by the staff, academic research is still encouraged through both institutional and individual efforts. Surveys concerning population and social security are carried out to produce primary and secondary data for policy formulation. At the same time, these data also give rise to future academic studies on social and economic issues. There is also a national advisory board on the scientific development of Japan. The Science Council
of Japan, attached to the Prime Minister’s office, was established in 1949. Unlike the Japan Academy or academicians and academy members in Taiwan and mainland China, the 210 distinguished scientists from all fields who sit on the board are not given honorary lifetime titles but serve three-year terms of office. The council enjoys great academic prestige and represents Japanese scientists domestically as well as internationally. The council has the right to advise the government to initiate important scientific research programs, and the government may seek professional recommendations from the council as well. The council is actively engaged in bilateral scientific exchanges and other forms of international participation. It is also expected to coordinate academic research in Japan and to facilitate the implementation of important decisions concerning academic development. With a few exceptions, such as the Nihon University Population Research Institute, most academies in Japan are national. But the restrictions stemming from their organizational structure (they come under the jurisdiction of various government ministries) may translate into less research freedom or a higher demand for policy-oriented studies. In addition, those academies or research institutes affiliated with universities are usually also expected to carry out teaching functions at the individual researcher’s level.
4. Conclusion

This article has briefly outlined the national character of research organizations in Taiwan, mainland China, and Japan. The importance of the government’s role in academic development in this region can be clearly observed. The private sector, in contrast, plays only a minor role, if any, in academic research. However, several differences may be distinguished among the three territories. Taiwan’s Academia Sinica is perhaps foremost in terms of both research autonomy and social services research. Benefiting from the cultural tradition and the expectations of the society in which they work, researchers there also appear to enjoy more resources in funding and in social prestige. The motivation for research can be stated in purely academic terms and no policy orientation is required in order to receive adequate funding. In addition, Taiwanese scholars have shown a stronger preference for individual research projects. Mainland China, in comparison, launched its modern social science sector in the late 1970s, a time when a few basic disciplines such as sociology were still under suspicion. That contextual factor has certainly introduced an added element to the Chinese academy: the importance of correct political attitudes. As a consequence, research aims are required to be framed within the mainstream ideology and group efforts are more likely to be observed. Japan has a different tradition regarding
academies. Although research autonomy is encouraged and political factors are not necessarily emphasized, research still tends to be applied in nature. This is largely because of structural factors, in that most academies are affiliated with various government ministries or with research institutes that are concerned with policy formulation and teaching as well as pure research. Also, Japanese social scientists are more inclined to participate in group projects headed by a leader in the field. Whether this is consistent with the national character remains to be explored. With the relatively positive outlook for economic growth in the near future, the academies in Taiwan, mainland China, and Japan may experience substantial concomitant development and may thus reach new horizons in certain research fields. Nevertheless, academic collaboration within the region itself is still comparatively rare and may be a focus in the future agenda. Hopefully, a unique Asian perspective may be developed from persistent and extensive social science studies in these areas.

See also: Centers for Advanced Study: International/Interdisciplinary; History of Science; Science Funding: Asia; Scientific Academies, History of; Universities, in the History of the Social Sciences
Bibliography

Academia Sinica 1998 Academia Sinica: A Brief Introduction. Academia Sinica, Taipei, Taiwan
Brief Description of The Chinese Academy of Social Sciences. The Chinese Academy of Social Sciences, Beijing
Brief Description of the Science Council of Japan. Science Council of Japan, Tokyo
Chang S. I 2000 Rationale for an International Graduate School at Academia Sinica. Report presented at the Preliminary Discussion of the Mentoring of Graduate Students to Become Future Intellectual Leaders. Academia Sinica, Taipei, Taiwan
Introduction to the National Institute of Population and Social Security Research in Japan. 2000 National Institute of Population and Social Security Research, Tokyo
Lee Y-T 1999 The Academic Development of Academia Sinica: Its Present Situation and its Future Vision. The Newsletter of the Legislative Yuan, The Legislative Yuan, Taipei, Taiwan
The Glorious Fifty Years of The Chinese Academy of Sciences: 1949–1999. 1999 The Chinese Academy of Sciences, Science Press
The Weekly Newsletter of Academia Sinica 2000 The Personnel Office of Academia Sinica
C.-C. Yi
Scientific Concepts: Development in Children

This article examines the development of children’s scientific understanding. It is organized into four
sections: initial understanding; development of physical concepts; development of biological concepts; and learning processes.
1. Initial Understanding of Scientific Concepts

A large amount of recently obtained evidence indicates that infants’ conceptual understanding is considerably more sophisticated than previously assumed. Traditionally, researchers relied on children’s verbal explanations and/or their actions as measures of their conceptual understanding. These methods often underestimated infants’ and very young children’s understanding, owing to their inarticulateness and poor motor coordination. However, a new method, the violation-of-expectation paradigm, has made it possible to assess infants’ physical knowledge by examining how long they look at ‘possible’ and ‘impossible’ events. A typical experiment involves habituating the child to a series of physically possible events and then presenting either a different possible event or an impossible event. The assumption is that children who understand why the impossible event cannot occur will look longer at it, because they are surprised to see the violation of the principle. Studies using this violation-of-expectation paradigm have revealed impressive initial understanding of physical concepts. Even infants possess certain core concepts and understand several principles that govern the mechanical movement of objects (Spelke and Newport 1998). For example, 4-month-olds have a notion that one solid object cannot move through the space occupied by another solid object. In the studies that demonstrated this phenomenon, infants were first habituated to a display in which a ball was dropped behind a screen, which was then removed to reveal the ball on the floor. They then saw two events. In the consistent event, the ball was again dropped, and when the screen was removed, infants saw the ball resting on a platform above the stage floor. In the inconsistent event, the ball was resting on the stage floor under the platform.
Infants looked longer at the inconsistent event than at the consistent one, as if they were surprised to see the violation of the object–solidity principle. Other studies with a similar approach have also revealed infants’ understanding of gravity and other physical regularities. Infants appear to understand that an unsupported object should move downward and that objects do not ordinarily move without any external force being applied. Infants’ knowledge of physical regularities is gradually refined over the first year, as demonstrated by their understanding of collisions. Most 3-month-olds appear surprised to see a stationary object move when not hit by another object. Six-month-olds can appreciate how the features of objects affect a collision. They appear surprised when an object moves farther following a collision with a small moving object than it does after colliding with a larger moving
object. Later during the first year, infants responded differently to events in which the object that was struck moved at an angle perpendicular to the motion of the object that struck it than when it moved in a more standard path. Other researchers, however, raise concerns about the use of such paradigms to draw inferences about infants’ conceptual knowledge (e.g., Haith and Benson 1998). They argue that differential looking only indicates that infants discriminate between two events and that perceptual rather than conceptual features might drive this discrimination. Thus, infants’ visual preference or looking time might have been incorrectly interpreted as evidence of an appreciation of physical principles. Early conceptual understanding is not limited to understanding of physics principles. By age 3 years, children can distinguish living from nonliving things. They recognize that self-produced movement is unique to animals. In one study that made this point, preschoolers were shown brief videotapes in which animals or inanimate artifacts moved across the screen. Then the children were asked to make judgments about internal causes (does something inside this object make it move) and external causes (did a person make this move). Children typically attributed the cause of the animate object’s motion to internal features (‘it moves itself;’ ‘something inside makes it move’). In contrast, they were more likely to attribute the motion of an artifact to an external agent (Gelman 1990). Preschoolers are also sensitive to differences in the ‘types of stuff’ inside animate and inanimate objects. They draw inferences about identity and capacity to function based on internal parts, associating, for example, internal organs, bones, and blood with animals and ‘hard stuff’ or ‘nothing’ with inanimate objects. 
After hearing a story about a skunk that was surgically altered so that it looked like a raccoon, young children reported that the animal was still a skunk, despite its altered appearance. However, children did not reason in the same way when they heard similar stories about artifacts; a key that was melted down and stamped into pennies was no longer a key (Keil 1989).
2. Developing Understanding of Physical Concepts

Rudimentary understanding of basic concepts does not imply full-blown appreciation of physical principles. Even older children’s concepts and theories often involve substantial misconceptions. Understanding of physical concepts undergoes substantial change with age and experience. One good example involves physical causality and mechanical movement. When Event A precedes Event B, many 3- and 4-year-olds fail to choose A consistently as the cause, whereas
5- and 6-year-olds are considerably more likely to choose A. Children also hold intuitive theories of motion that are inconsistent with fundamental mechanical principles. For example, when asked to predict how a ball would travel after rolling through a spiral tube, only one-fourth of 9-year-olds and less than half of 11-year-olds correctly predicted the ball’s trajectory. Misconceptions also occur with other concepts. For example, children’s conceptions of matter, weight, volume, and density undergo substantial change with age. Most 3-year-olds have undifferentiated notions of the roles of density, weight, and volume in producing buoyancy of objects placed in liquids. Most 4- and 5-year-olds have some conception of density, although their judgments are also affected by other features of the objects (i.e., weight and volume). Eight- and 9-year-olds, in contrast, rely consistently on density in judging whether objects will sink or float. On some physical tasks, children’s increasing understanding can be characterized as a series of increasingly adequate rules. One such task is Siegler’s (1976) balance scale task. On each side of the scale’s fulcrum were four pegs on which metal weights could be placed. In each trial, children were shown a configuration of weights on pegs and were asked to predict whether the scale would balance or whether one side would go down after release of a lever that held the scale motionless. Most children based their predictions on one of four rules. The large majority of 5-year-olds relied solely on weight (Rule I). This involved predicting that the scale would balance if both sides had the same amount of weight and that the side with more weight would go down if the two sides had different amounts of weight. Nine-year-olds often used Rule II.
This involved predicting that the side with more weight would go down when one side had more weight, but predicting that the side with its weight further from the fulcrum would go down when weights on the two sides were equal. Some 9-year-olds and most 13–17-year-olds used Rule III. They considered both weight and distance on all problems, and predicted correctly when weights, distances, or both were equal for the two sides. However, when one side had more weight and the other side’s weights were further from the fulcrum, children muddled through, not relying consistently on any identifiable approach. Rule IV allowed children to solve all balance scale problems. It involved choosing the side with greater torque ($W_L D_L$ vs. $W_R D_R$) when one side had more weight (W) and the other had its weight further from the fulcrum (D). Few children or adults used Rule IV. Similar sequences of rules have been shown to characterize the development of a variety of tasks, including projection of shadow, probability, water displacement, conservation of liquid and solid quantity, and time, speed, and distance. A related way of conceptualizing the development of scientific understanding is as a succession of
increasingly adequate mental models. One good example of such a succession involves understanding of the Earth as an astronomical object (Vosniadou and Brewer 1992). Some children, particularly young ones, conceive of the Earth as a flat, solid, rectangular shape. A slightly more sophisticated mental model is to think of the Earth as a disk. Three yet more advanced incorrect approaches are the ‘dual Earth model,’ which includes a flat, disk-like Earth where people live and a spherical Earth up in the sky; the ‘hollow sphere model,’ in which people live on flat ground inside a hollow sphere; and the ‘flat sphere model,’ in which people live on flat ground on top of a hollow sphere. All three of these models allow children to reconcile their perception that the Earth looks flat with their teachers’ and textbooks’ insistence that the Earth is round. The proportion of children who possess the correct ‘spherical model’ of the Earth increases from 15 percent in first grade to 40 percent in third grade and 60 percent in fifth grade.
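The four balance scale rules described earlier in this section can be made concrete with a short sketch. The Python code below is an illustration only, not Siegler’s own formalization: the function names and the label "muddle" are hypothetical, each side is assumed to hold a single stack of weights on one peg, and the handling of problems in which the same side has both more weight and greater distance under Rule III is an assumption, since the text does not describe that case.

```python
# Illustrative sketch of the four balance scale rules (after Siegler 1976).
# Assumption: each side is described by one weight count and one distance
# from the fulcrum (a single stack of weights on a single peg).

def compare(left: float, right: float) -> str:
    """Return the side with the larger value, or 'balance' if they are equal."""
    if left > right:
        return "left"
    if right > left:
        return "right"
    return "balance"


def predict(rule: int, w_left: int, d_left: int, w_right: int, d_right: int) -> str:
    """Predict which side goes down ('left', 'right', 'balance', or 'muddle')."""
    by_weight = compare(w_left, w_right)
    by_distance = compare(d_left, d_right)

    if rule == 1:
        # Rule I: weight is the only cue considered.
        return by_weight
    if rule == 2:
        # Rule II: distance is consulted only when the weights are equal.
        return by_distance if by_weight == "balance" else by_weight
    if rule == 3:
        # Rule III: both cues are always considered.
        if by_weight == "balance":
            return by_distance
        if by_distance == "balance" or by_distance == by_weight:
            # Assumed: when both cues favor the same side, that side is chosen.
            return by_weight
        # The cues conflict: no consistent approach ("muddling through").
        return "muddle"
    if rule == 4:
        # Rule IV: compare torques. For positive weights and distances this
        # agrees with Rule III on every problem without a conflict.
        return compare(w_left * d_left, w_right * d_right)
    raise ValueError("rule must be 1, 2, 3, or 4")
```

On a conflict problem with three weights at distance 1 on the left and one weight at distance 4 on the right, this sketch predicts the left side under Rules I and II, no consistent answer under Rule III, and the right side under Rule IV, because the right-hand torque (4) exceeds the left-hand torque (3).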
3. Developing Understanding of Biological Concepts

Young children have a concept of living things, but it does not perfectly match the concept of older children and adults. Until about age 7 years, most children do not view plants as living things. In addition, 3-year-olds fairly often attribute life to powerful, complex, or moving inanimate objects, such as robots and the Sun. A similar mix of understandings and misunderstandings is evident in preschoolers’ views regarding internal parts of living things. They know that animals have bones and hearts, but have little idea of their functions. Children’s understanding of other uniquely biological concepts, such as growth, inheritance, and illness, also undergoes substantial change with age and experience. Preschool children have some appreciation of biological growth. They expect animals to grow, appreciate that growth can only occur in living things, and understand that growth is directional (small to big). However, preschoolers also believe that living things may or may not grow, and have difficulty accepting that even small things, such as a worm or butterfly, grow. Not until age 5 or 6 years do children realize the inevitability of growth; one cannot keep a baby pet small and cute just because one wants to do so. Children’s understanding of inheritance, another uniquely biological process, also develops with age and experience. Preschoolers understand that like begets like: Dogs have baby dogs, rabbits have baby rabbits, and offspring generally share biological properties with parents (Wellman and Gelman 1998). They also believe that animals of the same family share physical features even when they are raised in different environments. For example, preschoolers believe that
a rabbit raised by monkeys would still prefer carrots to bananas. However, other studies suggest that not until 7 years of age do children understand birth as part of a process mediating the acquisition of physical traits and nurturance as mediating the acquisition of beliefs. Only older children clearly distinguish between properties likely to be affected by heredity and properties likely to be affected by environment. For example, not until school age do children expect a boy to resemble his biological father in appearance but to resemble his adoptive father in beliefs. Thus, younger children seem to have different intuitions about the mechanisms of inheritance than do older children and adults. Even preschoolers show some understanding of illness, yet another biological process. For example, preschoolers have a notion that an entity can induce illness or be contaminated and that contamination may occur through the workings of invisible, physical particles. Four- and 5-year-olds have some understanding of contagion; they believe that a child is more likely to get sick from exposure to another person who caught the symptom by playing with a sick friend than from another person who developed the symptom through other means. Although preschoolers may have a general idea that germs can cause symptoms, they do not differentiate the effects of symptoms caused by germs from those caused by poison, for example. The mature concept of illness, which is characterized as uniting various components such as its acquisition, symptoms, treatment, and transmission, is not mastered until much later (Solomon and Cassimatis 1999). As with physical concepts, children’s understanding of biological concepts involves substantial developmental change. For some concepts, changes involve enrichment, as children learn more and more details and phenomena relevant to the concepts.
For others, developmental changes involve radical conceptual reorganization. There are some interesting parallels between the redefining and restructuring involved in the history of scientific understanding and the changes that occur within an individual lifetime (Carey 1985).
4. Learning Processes

Acquisition of scientific understanding involves the discovery of new rules and concepts through direct experience, as well as through instruction. Children’s misconceptions can be overcome through experience that contradicts them. Only recently, however, have researchers directly examined the learning processes involved in the acquisition of scientific concepts. One approach that has proved particularly useful for learning about changing understanding of scientific concepts is the microgenetic method. This approach involves observing changing performance on a trial-by-trial basis, usually as children gain experience that
promotes rapid change. Thus, the approach yields the type of high-density data needed to understand change processes (Siegler and Crowley 1991). One example of the usefulness of the approach is provided by Siegler and Chen’s (1998) study of preschoolers’ learning about balance scales. The children were presented with problems in which the two sides of the scale had the same amount of weight, but one side’s weight was further from the fulcrum. The goal was to see if children acquired Rule II, which correctly solves such problems, as well as problems on which weight on the two sides varies but distance does not. Children’s rules were assessed in a pretest and posttest in which children were asked to predict which side of the scale would go down or whether it would remain balanced. In the feedback phase between the pretest and posttest, children were repeatedly asked to predict which side of the balance, if either, would go down if a lever that held the arm motionless was released; then the lever was released and the child observed the scale’s movement; and then the child was asked to explain the outcome they had observed. The trial-by-trial analysis of changes in children’s predictions and explanations allowed the examination of the learning processes involved in rule acquisition. Four learning processes were identified. The first component of learning involves noticing potential explanatory variables (e.g., the role of distance) which previously had been ignored. The second component involves formulating a rule that incorporates distance as well as weight. To be classified as formulating a rule, children needed to explain the scale’s action in one trial by stating that a given side went down because its disks were further from the fulcrum, and then in the next trial to predict that the side with its disks further from the fulcrum would go down. The third component involves generalizing the rule to novel problems by using it in most trials after it was formulated.
Finally, the last component involves maintaining the new rule under less facilitative circumstances, by using the new rule in the posttest, where no feedback was given. The componential analysis proved useful for understanding learning in general and also developmental differences in learning. The key variable for learning of both older and younger children, and the largest source of developmental differences in learning, involved the first component, noticing the potential explanatory role of distance from the fulcrum. Most 5-year-olds noticed the potential role of distance during learning, whereas most 4-year-olds did not. Children of both ages who noticed the role of distance showed high degrees of learning; those of both ages who did not, did not. The same componential analysis of children’s learning of scientific concepts has proved useful in examining children’s learning about water displacement and seems applicable to many other concepts also. Although children often discover new rules, modify their mental models, or acquire new concepts through
direct experience both in physical and biological domains, direct observation of the natural world is often inadequate for learning new concepts. Indeed, daily experience sometimes hinders children’s understanding. For example, children’s misconceptions about the shape and motion of the Earth might result from the fact that the world looks flat. Well-planned instruction is essential for helping children to overcome these misconceptions and to gain more advanced understanding. The effects of instruction on children’s scientific understanding can be illustrated by a study of the acquisition of the variable control principle (Chen and Klahr 1999). The variable control principle involves manipulating only one variable at a time so that an unconfounded experiment can be conducted and valid inferences can be made about the results. Most early elementary school children do not discover the variable control concept on their own. This observation led Chen and Klahr (1999) to test whether second, third, and fourth graders could learn the concept through carefully planned instruction. Children were asked to design experiments to test the possible effects of different variables (whether the diameter of a spring affects how far it stretches, whether the shape of an object affects the speed with which it sinks in water, etc.). An unconfounded design would contrast springs that differed only in diameter but not in length, for example. Direct instruction proved effective in helping children acquire the control of variables concept. Both older and younger children who received instruction in designing tests in a specific task were able to understand the rationale and to apply the principle to other tasks. However, the older children were more able to extend the principle to novel contexts.
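The variable control principle described above can be stated as a simple check: a comparison between two setups is unconfounded with respect to a target variable when the setups differ on that variable and on nothing else. The following is only an illustrative sketch under stated assumptions. The function name `is_unconfounded`, the variable names, and the spring values are hypothetical and are not taken from Chen and Klahr's materials.

```python
def is_unconfounded(setup_a, setup_b, target):
    """Return True if the two setups differ on `target` and on no other variable.

    Each setup is a dict mapping a variable name to its value.
    All names and values here are hypothetical illustrations.
    """
    if setup_a.keys() != setup_b.keys():
        raise ValueError("setups must describe the same variables")
    # Collect every variable on which the two setups disagree.
    differing = {name for name in setup_a if setup_a[name] != setup_b[name]}
    # Unconfounded: the target is the one and only difference.
    return differing == {target}


# Hypothetical springs, loosely modeled on the spring task described above.
spring_1 = {"diameter": "wide", "length": "long", "wire": "thick"}
spring_2 = {"diameter": "narrow", "length": "long", "wire": "thick"}
spring_3 = {"diameter": "narrow", "length": "short", "wire": "thick"}

print(is_unconfounded(spring_1, spring_2, "diameter"))  # only diameter differs
print(is_unconfounded(spring_1, spring_3, "diameter"))  # length differs as well
```

In this sketch the first comparison counts as a valid test of diameter, while the second is confounded because length varies along with diameter, so any difference in stretch could not be attributed to diameter alone.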
When receiving training in designing tests involving the diameter of a spring, for example, second graders were able to apply the concept only to testing other variables involving springs (e.g., wire size). Third graders used the principle in designing experiments involving other mechanical tasks, such as the speed with which objects sank in water. Only fourth graders, however, were able to apply the principle to remote contexts, such as experiments on the causes of plant growth. In summary, recent research has revealed that infants, toddlers, and preschoolers have considerably greater scientific knowledge than previously recognized. However, their knowledge is far from complete. Developmental changes in children’s scientific understanding involve both enrichment and structural reorganization. Older children possess more accurate and coherent rules and mental models. These understandings arise, at least in part, from their richer experience, their more advanced abilities to encode and interpret that experience, and their superior ability to separate their theories from the data (Kuhn et al. 1995). Older children also generalize the lessons from instruction more effectively than do younger children. Although there is general agreement that early
understanding of scientific concepts is surprisingly strong, there are disagreements about exactly what knowledge or concepts to attribute to infants, toddlers, and preschoolers. Even less is known about how children progress from their initial understanding of scientific concepts to more advanced understanding. Addressing the issue of how change occurs remains a major challenge, as well as a fruitful direction for future research. See also: Cognitive Development in Childhood and Adolescence; Cognitive Development in Infancy: Neural Mechanisms; Concept Learning and Representation: Models; Early Concept Learning in Children; Infant Development: Physical and Social Cognition; Piaget’s Theory of Child Development; Scientific Reasoning and Discovery, Cognitive Psychology of
Bibliography
Carey S 1985 Conceptual Change in Childhood. MIT Press, Cambridge, MA
Chen Z, Klahr D 1999 All other things being equal: acquisition and transfer of the control of variables strategy. Child Development 70: 1098–120
Gelman R 1990 First principles organize attention to and learning about relevant data; number and the animate–inanimate distinction as examples. Cognitive Science 14: 79–106
Haith M, Benson J 1998 Infant cognition. In: Damon W, Kuhn D, Siegler R S (eds.) Handbook of Child Psychology. Vol. 2. Cognition, Perception and Language, 5th edn. J. Wiley, New York, pp. 199–254
Keil F C 1989 Concepts, Kinds, and Cognitive Development. MIT Press, Cambridge, MA
Kuhn D, Garcia-Mila M, Zohar A, Andersen C 1995 Strategies of knowledge acquisition. Monographs of the Society for Research in Child Development 60: 1–127
Siegler R S 1976 Three aspects of cognitive development. Cognitive Psychology 8: 481–520
Siegler R S, Chen Z 1998 Developmental differences in rule learning: a microgenetic analysis. Cognitive Psychology 36: 273–310
Siegler R S, Crowley K 1991 The microgenetic method: a direct means for studying cognitive development. American Psychologist 46: 606–20
Solomon G E A, Cassimatis N L 1999 On facts and conceptual systems: young children’s integration of their understandings of germs and contagion. Developmental Psychology 35: 113–26
Spelke E S, Newport E L 1998 Nativism, empiricism, and the development of knowledge. In: Damon W, Lerner R M (eds.) Handbook of Child Psychology. Vol. 1. Theoretical Models of Human Development, 5th edn. J. Wiley, New York, pp. 275–340
Vosniadou S, Brewer W F 1992 Mental models of the earth: a study of conceptual change in childhood. Cognitive Psychology 24: 535–85
Wellman H M, Gelman S A 1998 Knowledge acquisition in foundational domains. In: Damon W, Kuhn D, Siegler R S (eds.) Handbook of Child Psychology. Vol. 2. Cognition, Perception and Language, 5th edn. J. Wiley, New York, pp. 575–630
Z. Chen and R. Siegler
Scientific Controversies
Science in general can be an object of controversy, such as in disputes between science and religion. Particular scientific findings can also generate controversies either within or outside science. The importance of scientific controversy has been recognized by scholarship within science and technology studies (S&TS) since the 1970s. Indeed the study of controversies has become an important methodological tool to gain insight into key processes that are not normally visible within the sciences. What makes something a scientific controversy? It is important to distinguish longstanding disputes, such as that between science and religion, or the merits of sociobiological explanation as applied to humans, or whether the fundamental constituents of matter are particles or waves, from more localized disputes such as over the existence of a new particle or a new disease-transmitting entity. The latter sorts of controversy are more like a ‘hot spot’ that erupts for a while on the surface of science than a deeply entrenched long-running battle. Also, controversies are not to be confused with the bigger sea changes which science sometimes undergoes during scientific revolutions. Although defining a scientific revolution is itself contested, the all-pervasive nature of the changes in physics brought about by quantum mechanics and relativity seems different from, for example, the controversy over the detection of large fluxes of gravitational radiation or over the warped zipper model of DNA. Similarly, long-running debates on the relative impacts of nature and nurture on human behavior have a different character from more episodic controversies, such as the possibility of interspecies transfer of prion disease.
Of course, intractable disputes and revolutions share some of the features associated with controversies, but it is the bounded nature of controversies which has led to their becoming an object of study in their own right, especially within the tradition of S&TS associated with the sociology of scientific knowledge (SSK). One metaphor for understanding why controversies have taken on such methodological importance is that of ‘punching’ a system. On occasions scientists gain insight into natural systems by punching, or destabilizing, them. For example, one may learn more about the laws of momentum by bouncing one billiard ball off another than by watching a stationary billiard ball.
Similarly Rutherford famously used scattering experiments in which gold foil was bombarded with alpha particles to uncover the structure of the atom and in particular the presence of the nucleus. The methodological assumption underpinning the study of controversies is similar. By studying a scientific controversy, one learns something about the underlying dynamics of science and technology and their relations with wider society. For instance, during a controversy the normally hidden social dimensions of science may become more explicit. Sites of contestation are places to facilitate the investigation of the metaphors, assumptions, and political struggles embedded within science and technology. We can note four different influential approaches towards the study of scientific controversies. The school of sociological research associated with Robert Merton (1957) first recognized the importance of controversies within science. Of particular interest to Merton was the existence of priority disputes. Many well-known controversies center on who is the first scientist to make a particular scientific discovery. A second approach toward the study of scientific controversy developed in the 1960s as concerned citizens increasingly protested what they took to be the negative effects of science and technology. Here the source of controversy is the perceived negative impact of science and technology on particular groups and it is the study of these political responses that forms the core of the analysis. The new SSK which emerged in the 1970s and which largely displaced the Mertonian School provides a third approach towards the study of controversies. Here the focus is on controversies at the research frontiers of science where typically some experimental or theoretical claim is disputed within an expert community. Modern S&TS owe a heavy debt to SSK but are less likely to make distinctions between the content of science and its impact. 
Within this fourth approach, controversies are seen as integral to many features of scientific and technological practice and dissemination. Their study forms a key area of the discipline today.
1. Merton and Priority Disputes
Merton’s interest in priority disputes stemmed from his claim that science has a particular normative structure or ‘institutional ethos’ with an accompanying set of rewards and sanctions. Because so much of the reward structure of science is built upon the recognition of new discoveries, scientists are particularly concerned to establish the priority of their findings. Such priority disputes are legion, such as the famous fight between Newton and Leibniz over who first discovered the calculus. It was Thomas Kuhn (1962) who first raised a fundamental problem for the analysis of priority
disputes. A priority dispute is predicated upon a model of science, later known as the ‘point model’ of scientific discovery, which can establish unambiguously who discovered what and when. Asking the question of who discovered oxygen, Kuhn showed that the crucial issue is what counts as oxygen. If it is the dephlogisticated air first analyzed by Priestley then the discovery goes to him, but if it is oxygen as understood within the modern meaning of atomic weights then the discovery must be granted to Lavoisier’s later identification. The ‘point model’ requires discovery to be instantaneous, and for discoveries to be recognized and dated. A rival ‘attributional model’ of discovery, first developed by Augustine Brannigan (1981), draws attention to the social processes by which scientific discoveries are recognized and ‘attributed.’ This approach seems to make better sense of the fact that what counts as a discovery can vary over time. In short, it questions the Eureka moment of the point model. For example, Woolgar (1976), in his analysis of the pulsar’s discovery, shows that the date of the discovery varies depending on what stage in the process is taken to be the defining point of the discovery. If the discovery is the first appearance of ‘scruff’ on Jocelyn Bell’s chart recording of signals from the radio telescope, then it will be dated earlier than when it was realized that the unambiguous source of this ‘scruff’ was a star. This case was particularly controversial because it was alleged by the dissonant Cambridge radio astronomer Fred Hoyle that the Nobel Prize winners for this discovery should have included Jocelyn Bell, who was then a graduate student. Priority disputes can touch in this way on the social fabric of science, such as its gender relationships and hierarchical structure. Despite the challenge posed by the attributional model, it is the point model of discovery that is embedded in the reward system of science.
As a result, priority disputes still abound. In modern technoscience, discovery can mean not only recognition, but also considerable financial reward, as for example with patents, licensing arrangements, or stock in a biotech company. In such circumstances, priority disputes have added salience. One has only to think of the unseemly battle between Robert Gallo and the Pasteur Institute over priority in the discovery that HIV is the cause of AIDS. In this case, there was not only scientific priority at stake, but also the licensing of the lucrative blood test for identifying AIDS. The controversy could only be settled by intervention at the highest political level. President Ronald Reagan of the USA and Prime Minister Jacques Chirac of France agreed to share the proceeds from the discovery. Again what was at stake scientifically was not simply who was first past the post; the protagonists initially claimed to have isolated different retroviruses and disagreed over the effectiveness of the various blood tests. This case was marked by additional controversy because of allegations of scientific misconduct raised against Gallo that led to Congressional and National Institutes of Health (NIH) investigations.
2. Controversy over the Impact of Science and Technology
That a priority dispute could require the intervention of national political leaders is an indication of just how important science and technology can become for the wider polity. In response to the AIDS crisis, activist groups have campaigned and pressured scientists and government officials to do more scientifically. They have also intervened in matters of research design, such as the best way to run clinically controlled trials. Such activist engagement dates back to the political protests that science and technology generated in the 1960s in the context of Vietnam-era issues such as war and environmentalism. There has been increasing recognition that science and technology are neither neutral nor necessarily beneficial and that many developments stemming from modern science and technology, such as nuclear power, petrochemical industries, and genetic engineering, raise profound and controversial issues for a concerned citizenry. Dorothy Nelkin, a pioneer in analyzing these types of disputes, identified four types of political, economic, and ethical controversies that engage the public in the US (Nelkin 1995). One set revolves around the social, moral, and religious impact of science. Issues such as the teaching of evolution in US schools, animal rights, and the use of fetal tissue fall into this first category. A second type of controversy concerns a clash between the commercial and economic values surrounding science and technology and that of the environmental movement. Ozone depletion, toxic waste dumps, and greenhouse gases are pertinent examples. A third set has been provoked by health hazards arising from the transformation of food and agricultural practices by the use of modern science and technology. Genetically modified foods, the carcinogenic risks posed by food additives, and the use of bovine growth hormones in the dairy industry all belong in this category.
A fourth group centers on conflicts between individual rights and group rights: a conflict that has been heightened by new developments in science and technology. For example, the mass fluoridation of water to improve dental health denies individuals the right to choose for themselves whether they want fluoride in their water supply. Research on these sorts of controversies has focused mainly on the interest politics of the groups involved. How and why do they get involved in political action over science and technology; what underlying political values do such groups exhibit; and how do they effectively intervene to protest some perceived deleterious development stemming from science, technology, or medicine? The positions taken by the
participants are consistent with their interests, although these interests may not enable the outcome or closure of a debate to be predicted. For instance, the demise of nuclear power had as much to do with economics as with political protest. Since scientists themselves often play an active part in these disputes, a full analysis will touch upon how scientists deploy their science for political aims. But, by and large, this research tradition has avoided using the entry of scientists into these disputes to examine the core processes by which scientific knowledge is developed and certified. In short, the attention was focused upon seeing how scientists became political rather than upon how politics might itself shape scientific knowledge. Political controversies were treated as analytically separable from epistemic controversies and as resolved by distinct processes of closure (Engelhardt and Caplan 1987). Typically, epistemic controversies were thought to be closed by application of epistemic and methodological standards, while political controversies were closed through the intervention of ‘non-scientific factors,’ such as economic and political interests.
3. Scientific Controversy and the Sociology of Scientific Knowledge
With the emergence of the SSK in the late 1970s, it was no longer possible to avoid examining how scientific knowledge was shaped and how this shaping contributed to the dynamics of controversies. A key tenet of this new sociology of science, as formulated by David Bloor (1991) in his ‘Strong Programme,’ was that of symmetry. This principle called upon sociologists to use the same explanatory resources to explain both successful and unsuccessful knowledge claims. It raised to methodological status the necessity of examining the processes by which science distinguishes the wheat of truth from the chaff of error. SSK soon turned its attention towards examining scientific controversies because it is during such controversies that this symmetry principle can be applied to good effect. With each side alleging that it has ‘truth’ on its side, and disparaging the theoretical and experimental efforts of the other, a symmetrical analysis can explain both sides of the controversy using the same sorts of sociological resources. This differs from the earlier interest approach to controversies in that it applies this symmetrical sociological analysis to the very scientific claims made by the participants. Bloor and his colleagues of the Edinburgh school pursued their program mainly through theoretical analysis supported by historical case studies. H. M. Collins and the ‘Bath School,’ by contrast, developed an empirical method for studying the SSK in contemporaneous cases: a method based primarily upon the study of scientific controversies. One early application of the method was to the study of parapsychology (Collins and Pinch 1982). Collins and Pinch suggested that controversies such as that provoked by parapsychology were resolved by boundary crossing between two different forums of scientific debate: the constitutive and the contingent. Generalizing from several case studies of controversies, Collins (1981) argued that during controversies scientific findings exhibited ‘interpretative flexibility,’ with the facts at stake being debated and interpreted in radically different ways by the parties in the controversy. This interpretative flexibility did not last forever: by following a controversy over time, researchers could delineate the process of ‘closure’ by which controversy vanished and consensus emerged. Collins defined the group of scientists involved in a controversy as the ‘core set.’ Only a very limited set of scientists actively partook in controversies; the rest of the scientific community depended upon the core set for their expert judgment as to what to believe. This was particularly well illustrated by Martin Rudwick (1985) in his study of the great Devonian Controversy in the history of geology. As researchers followed controversies from their inception to the point of closure, it became necessary to address matters of scientific method as they were faced in practice by the participants. Factors that had usually been seen as issues of method or epistemology thus became open to sociological investigation: for example, the replication of experiments, the role of crucial experiments, proofs, calibration, statistics, and theory. In addition, other factors such as reputation, rhetoric, and funding were shown to play a role in the dynamics of controversies. An important finding of this research was what Collins (1992) called the ‘experimenter’s regress.’ Controversies clearly were messy things and were very rarely resolved by experiments alone.
Collins argued that in more routine science experiments were definitive because there was an agreed-upon outcome which scientists could use as a way of judging which scientists were the competent practitioners. If one could get one’s experiment to work, one had the requisite skills and competence; if one failed, one lacked the skills and competence. The trouble was that when there was a dispute at the research frontiers there was no agreed-upon outcome by which to judge the competent practitioners. Experiments had to be built to investigate a claimed new phenomenon, but failure to find the new phenomenon might mean either there was no new phenomenon to be found or that the experimenter failing to find it was incompetent. This regress was only broken as a practical matter by the operation of a combination of factors such as rhetoric, funding, and prior theoretical dispositions. Often the losing side in a scientific controversy continues to fight for its position long after the majority consensus has turned against it. Those who continue will meet increasing disapprobation from
their colleagues and may be forced to leave science altogether. ‘Life after death’ goes on at the margins and often finally passes away only when the protagonists themselves die or retire (Simon 1999). The uncertain side of science is clearest during moments of controversy. Most scientists never experience controversies directly, and often it is only after exposure to a controversy that scientists become aware of the social side of science, start reading in science studies, and even employ ideas drawn from science studies to understand what has happened to them. This work on scientific controversy has been exemplified by a number of case studies of modern science such as memory transfer, solar neutrinos, gravity waves, high-energy physics, and famously cold fusion. Historians have also taken up the approach used by sociologists and the sociological methods have been extended to a number of historical case studies. Such studies pose particular methodological challenges because often the losing viewpoint has vanished from history. Shapin and Schaffer’s (1985) study of the dispute between Robert Boyle and Thomas Hobbes over Boyle’s air pump experiments was a landmark in research on scientific controversy, because it showed in a compelling way how the wider political climate, in this case that of Restoration Britain, could shape the outcome of a controversy. It showed how that climate could help institutionalize a new way of fact-making experiments in the Royal Society at the same time. In addition, it drew attention to the literary and technological dimensions of building factual assent in science. By documenting the witnesses to particular experimental performances, a culture of ‘virtual witnessing’ was born. The SSK approach to scientific controversy has also been influential in the study of technology.
The social construction of technology (SCOT) framework uses concepts imported from the study of scientific controversy such as ‘interpretative flexibility’ and ‘closure.’ A variety of competing meanings are found in technological artifacts and scholars study how ‘closure mechanisms’ such as advertising produce a stable meaning of a technology (Pinch and Bijker 1987, Bijker 1995). Another influential approach to the study of controversies in sciences and technology has been that developed by Bruno Latour and Michel Callon. Again, the initial impetus came from studies of scientists. Callon’s (1986) article on a controversy over a new method of harvesting scallops is one of the first articulations of what later became known as Actor Network Theory (ANT). Callon argues that the outcome of a controversy cannot be explained by reference to the social realm alone, but the analyst must also take account of the actions of non-human actors, such as scallops, which play a part in shaping the outcome. Subsequently Latour’s work on how ‘trials of strength’ are settled in science and technology has become especially influential within the new SSK.
Such struggles, according to Latour (1987), involve aligning material and cognitive resources with social ones into so-called ‘immutable mobiles’ or black boxes, objects which remained fixed when transported along scientific networks and which contain embedded within them sets of social, cognitive, and material relationships. Latour and Woolgar (1979) in their now classic study of a molecular biology laboratory showed that literary inscriptions play a special role in science. They indicated how controversies could be analyzed in terms of whether certain modalities are added to or subtracted from scientific statements making them more or less fact-like. The role of discourse in scientific controversies has been examined in great depth in a study of the oxidative phosphorylation controversy by Gilbert and Mulkay (1984). They showed how particular repertoires of discourse, such as the ‘empiricist repertoire’ and the ‘contingent repertoire,’ are used selectively by scientists in order to bolster their own claims or undermine those of their opponents. Subsequently there has been much work on how a variety of rhetorical and textual resources operate during controversies (e.g., Myers 1990). Sometimes the resolution of controversy is only possible by drawing boundaries around the relevant experts who can play a role in the controversy. Sometimes particular scientific objects cross such boundaries and form a nexus around which a controversy can be resolved. Such ‘boundary work’ (Gieryn 1983) and ‘boundary objects’ (Star and Griesemer 1989) form an important analytical resource for understanding how controversies end. In addition to analyzing scientific controversies, SSK has itself become a site of controversy. Most notably, lively controversies have occurred over the viability of interest explanations, over the extent to which the sociology of science should itself be reflexive about its methods and practices, and over the role of non-human actors. 
The ‘science war’ involving debates between natural scientists and people in science studies over the methods and assumptions of science studies and cultural studies of science is another area of controversy that is ripe for sociological investigation.
4. Scientific Controversy in Science and Technology Studies Today
In contemporary S&TS, the sites of contestation chosen for analysis have become more heterogeneous. One strength of the new discipline of S&TS is the wide terrain of activities involving science and technology that it examines. For example, similar methods can be used to examine controversies involving science and technology in the courtroom, the media, quasi-governmental policy organizations, and citizens’ action groups. Indeed, many of the most contentious political issues facing governments and citizens today involve
science and technology: issues such as genetically modified foods, gene therapy, and in vitro fertilization. The study of controversies in modern technoscience—with its porous boundaries between science, technology, politics, the media, and the citizenry—also calls for the analyst to broaden the array of analytical tools employed. Although the fundamental insights produced by SSK remain influential, such insights are supplemented by an increased understanding of how macro-political structures such as the state and the legal system enable and constrain the outcome of scientific controversies. Examples of this sort of work include: Jasanoff’s (1990) investigations of how technical controversies are dealt with by US agencies such as the Environmental Protection Agency (EPA) and the Food and Drug Administration (FDA), Lynch and Jasanoff’s (1998) work on science in legal settings such as the use of DNA evidence in courtrooms, and Epstein’s (1996) work on the AIDS controversy. In the latter case Epstein deals not only with the dispute about the science of AIDS causation, but also turns to social movements research to understand how AIDS activists outside of science got sufficient influence actually to affect the design of clinical trials by which new AIDS drugs are tested. Particularly interesting methodological issues have been raised by the study of controversies that overtly impinge upon politics. When studying controversies within science SSK researchers were largely able to adopt the neutral stance embodied in the symmetry principle of the Strong Programme (see Epigenetic Inheritance). However, some scholars have argued that, when dealing with cases where analysis could have a direct impact upon society, it is much harder to maintain neutrality. Researchers studying these sorts of disputes, such as whether Vitamin C is an effective cancer cure, find they can become ‘captured’ by the people they are studying.
This complicates the possibility of producing the sort of neutral analysis sought after in SSK. A number of solutions have been proposed to this dilemma (see Ashmore and Richards 1996). Several authors have attempted to produce typologies of scientific controversies (Engelhardt and Caplan 1987, Dascal 1998). Unfortunately, such typologies are not as useful as they could be because they are confounded by their underlying epistemological assumptions. For example, within an SSK approach it makes little sense to work with a category of, say, ‘sound argument,’ for closing a controversy because in SSK ‘sound argument’ is seen as part of the controversy. What counts as a ‘sound argument’ can be contested by both sides. Martin and Richards in their (1995) review adopt a fourfold typology. This review is particularly useful because it distinguishes between the different types of epistemological assumptions underlying the different analytical frameworks employed: namely, positivist, group politics, SSK, and social structural. Traditional history and the philosophy of science according to one account (Dascal 1998) are becoming more cognizant of the phenomenon of scientific controversy. But the call to examine scientific controversy by historians and philosophers makes strange reading to scholars immersed in S&TS. It is as if the historians and philosophers of controversy simply have ignored or failed to read the relevant literatures. Thus neither Nelkin’s (1992) influential volume, Controversies, nor a special edition of Social Studies of Science edited by H. M. Collins (1981), Knowledge and Controversy, which sets out the SSK approach towards scientific controversy, is referenced. That the best way to study scientific controversy is still controversial within the academy could scarcely be more obvious.
See also: Epigenetic Inheritance
Bibliography
Ashmore M, Richards E 1996 The politics of SSK: Neutrality, commitment and beyond. Special issue of Social Studies of Science 26: 219–445
Bijker W 1995 Of Bicycles, Bakelites, and Bulbs: Towards a Theory of Sociotechnical Change. MIT Press, Cambridge, MA
Bloor D 1991 Knowledge and Social Imagery, 2nd edn. University of Chicago Press, Chicago
Brannigan A 1981 The Social Basis of Scientific Discoveries. Cambridge University Press, Cambridge, UK
Callon M 1986 Some elements of a sociology of translation: Domestication of the scallops and the fishermen of St Brieuc Bay. In: Law J (ed.) Power, Action and Belief: A New Sociology of Knowledge? Sociological Review Monograph, Routledge, London, pp. 196–229
Collins H M 1981 Knowledge and controversy: Studies in modern natural science. Special issue of Social Studies of Science 11: 1
Collins H M 1992 Changing Order, 2nd edn. University of Chicago Press, Chicago
Collins H M, Pinch T J 1982 The construction of the paranormal, nothing unscientific is happening. In: Wallis R (ed.) On the Margins of Science: The Social Construction of Rejected Knowledge. Sociological Review Monograph, University of Keele, Keele, UK
Dascal M 1998 The study of controversies and the theory and history of science. Science in Context 11: 147–55
Engelhardt T H, Caplan A L 1987 Scientific Controversies: Case Studies in the Resolution and Closure of Disputes in Science and Technology. Cambridge University Press, Cambridge, UK
Epstein S 1996 Impure Science: AIDS, Activism and the Politics of Knowledge. University of California Press, Berkeley, CA
Gieryn T 1983 Boundary work and the demarcation of science from non-science: Strains and interests in professional ideologies of scientists. American Sociological Review 48: 781–95
Gilbert N G, Mulkay M K 1984 Opening Pandora’s Box. Cambridge University Press, Cambridge, UK
Jasanoff S 1990 The Fifth Branch: Science Advisors as Policy Makers. Harvard University Press, Cambridge, MA
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Latour B 1987 Science in Action. Harvard University Press, Cambridge, MA
Latour B, Woolgar S W 1979 Laboratory Life. Sage, London
Lynch M, Jasanoff S (eds.) 1998 Contested identities: Science, law and forensic practice. Special issue of Social Studies of Science 28: 5–6
Martin B, Richards E 1995 Scientific knowledge, controversy, and public decision making. In: Jasanoff S, Markle G E, Petersen J C, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Merton R K 1957 Priorities in scientific discoveries: A chapter in the sociology of science. American Sociological Review 22: 635–59
Myers G 1990 Writing Biology: Texts and the Social Construction of Scientific Knowledge. University of Wisconsin Press, Madison, WI
Nelkin D (ed.) 1992 Controversies: Politics of Technical Decisions, 3rd edn. Sage, Newbury Park, CA
Nelkin D 1995 Science controversies: The dynamics of public disputes in the United States. In: Jasanoff S, Markle G E, Petersen J C, Pinch T (eds.) Handbook of Science and Technology Studies. Sage, Thousand Oaks, CA
Pinch T J, Bijker W E 1987 The social construction of facts and artifacts. Or how the sociology of science and the sociology of technology might benefit each other. In: Bijker W E, Hughes T P, Pinch T J (eds.) The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology. MIT Press, Cambridge, MA
Rudwick M 1985 The Great Devonian Controversy: The Shaping of Scientific Knowledge Among Gentlemanly Specialists. University of Chicago Press, Chicago
Shapin S, Schaffer S 1985 Leviathan and the Air-pump: Hobbes, Boyle and the Experimental Life. Princeton University Press, Princeton, NJ
Simon B 1999 Undead science: Making sense of cold fusion after the (arti)fact. Social Studies of Science 29: 61–87
Star S L, Griesemer J 1989 Institutional ecology, ‘translations’ and boundary objects: Amateurs and professionals in Berkeley’s Museum of Vertebrate Zoology, 1907–1939. Social Studies of Science 19: 387–420
Woolgar S 1976 Writing an intellectual history of scientific development: The use of discovery accounts. Social Studies of Science 6: 395–422
T. Pinch
Scientific Culture
1. Modern Scientific Culture
All human societies develop knowledge about their natural and social worlds. Even hunter-gatherers possess remarkable local knowledge about plants, animals, and climate. The great world civilizations had a more complex division of labor that allowed priests, doctors, smiths, farmers, and other specialists to develop more elaborate and less local knowledge systems about astronomy, psychology, medicine, metallurgy, agriculture, and other fields. Researchers
tend to reserve the term ‘science’ for the elaborated, written knowledge systems of the great world civilizations, such as ancient Greek or Chinese science, and they tend to describe other, more local knowledge systems as ethnomedicine, ethnoastronomy, ethnobotany, and so on. Many attempts have been made to describe the distinctive features of the type of science that emerged in Western Europe around 1500–1700 (Cohen 1997). The term ‘modern science’ is somewhat clearer than ‘Western science,’ not only because Western science was built up in part from non-Western sources, but also because Western science rapidly became globalized. However, two clarifications are in order. First, the term ‘scientist’ did not emerge until the nineteenth century; in earlier centuries the term ‘natural philosopher’ was more common. Second, the term ‘modern’ is used here to refer to a society characterized by a family of institutions that includes modern science, constitutional democracy, capitalism, religious pluralism, social mobility, and a universalistic legal system. Although some of those institutions can be found in some of the world’s societies prior to 1500, their development as a system occurred in the West and accelerated after 1500. As an institution, science soon found a niche in modern societies by providing research and/or ideological support for the emerging capitalist industries, the state, and the churches (e.g., ballistics, mining, navigation, taxation, public health, critiques of magic). Modern societies provided more than financial resources that supported the growth of modern science as an institution; the culture of modernity provided intellectual resources that contributed to the emergence of modern scientific inquiry. Three of the central values of modern scientific culture are empiricism, formalism, and mechanism.
Although each of the distinctive features of modern scientific culture can be found in other scientific cultures (and may not be found across all modern scientific disciplines), as a family of features they have some value in characterizing modern scientific culture, and in showing its points of confluence with the cultures of modernity more generally. Empiricism refers to the value placed on observations as a means for resolving disputes about natural and social knowledge. In some sciences, empiricism was developed as experimentalism; in other words, observations could be brought into the laboratory and subjected to experiments that were, in principle, reproducible by competent members of the scientific community. However, other fields remained nonexperimental, such as the fieldwork-based sciences. Because observations are always subject to interpretation, their use as a resource for dispute resolution was in turn rooted in broader societal cultures. Scientists had to trust each other not to lie (Shapin 1994), and they required societies and journals in which they could debate and share results. Those
requirements were met in the European societies that fostered the emergence of a ‘relatively autonomous intellectual class’ (Ben-David 1991, p. 304) and a public sphere of open debate (Habermas 1989). The relationship between empiricism and the broader society went beyond institutional requirements; other sectors of society were also characterized by an empirical cultural style. For example, constitutional democracies and markets were based on the ‘empiricism’ of elections and consumer purchases. Likewise, some Protestant churches replaced church dogma with the empiricism of direct interpretation of Bibles, and they emphasized knowledge of God through study of his works (Hooykaas 1990, p. 191, Merton 1973, Chap. 11). In this sense, the empiricism of science emerged as part of a more general way of resolving conflicting opinions about the world through recourse to data gathering and evidence. Formalism refers to the value placed on increasingly higher levels of generalization. As generalization progresses in science, concepts and laws tend to become increasingly abstract and/or explicit. In some fields, the generalizations took the form of mathematical laws that encompassed a wide range of more specific observations. In physics, for example, it became possible to apply the same set of formal laws to the mechanics of both terrestrial and celestial objects, and space and time were analyzed in terms of geometry (Koyré 1965). In other fields, the generalizations took the form of increasingly formal and abstract taxonomies and systems of classification, as in early modern biology (Foucault 1970). Again, broader institutions and values contributed to the emergence of this specific form of inquiry. When researchers found support for societies that published journals where an archive of research could be located, they found the institutional resources that allowed them to conjugate their work with that of others.
Likewise, as Western colonial powers expanded, they sent out scientific expeditions that incorporated local knowledges and new observations into existing research fields. As ‘Western’ science became increasingly cosmopolitan, it also became more abstract. Again, however, the relationship with the broader society went beyond institutional influences to shared values. For example, scientific formalism developed in parallel with modernizing, Western political systems and social contract theories that emphasized the abstraction of general laws from particular interests, or of a general good from individual wills. A third common feature is the value of mechanism as a form of explanation. Over time, modern science tended to rule out occult astrological forces, vitalistic life forces, and so on. When the concept of force was retained, as in Newton’s gravitational force, it was subjected to the restraints of a formal law. Again, general social and cultural resources supported this development. The attack on occult forces was consistent with reformation campaigns against popular
magic (Jacob 1988), and the disenchantment of the world that mechanistic models depended on was both supported by Christianity and intensified by some forms of Protestantism. The development and spread of clocks and constitutions provided an early metaphor of mechanism, and as new technologies and social charters were developed, new metaphors of mechanism continued to emerge (Kammen 1987, pp. 17–8, Shapin 1996, pp. 30–7). Sciences that violate the cultural value of mechanism, such as parapsychology (the study of claimed paranormal phenomena), are rejected. Likewise, the incorporation of local sciences has tended to occur after filtering out occult or vital forces. For example, in response to demand from patients and cost efficiency concerns, acupuncture and Chinese medicine are being incorporated into cosmopolitan biomedicine. However, even as the practices are being incorporated, the vitalistic concept of ‘chi’ and Chinese humoral concepts are being translated into mechanistic concepts consistent with modern biology and physics. Just as values such as empiricism, formalism, and mechanism have been used to describe the intellectual culture of modern science, so a related family of concepts has been used to describe its institutional culture. Most influential was sociologist Robert Merton’s (1973, Chap. 13) list of four central norms. Although subsequent research suggested that norms were frequently violated and better conceptualized as an occupational ideology (Mulkay 1976), Merton’s analysis did have the advantage of pointing to the fundamental preconditions for the existence of modern scientific culture. Perhaps the basic underlying institutional value is autonomy. 
In other words, there is a value placed on leaving alone a certified community of qualified peers to review and adjudicate the credibility of various claims of evidence and consistency, rather than have the function ceded to the fiat of kings, dictators, church leaders, business executives, and others who do not understand the research field. The value of autonomy creates an interesting tension between science and another modern institution, constitutional democracy. Although scientific communities have suffered greatly under nondemocratic conditions, their demand for some degree of autonomy based on expertise also entails a defense of a type of elitism within a modern, democratic social order that values egalitarianism. The tension is reduced by valuing egalitarianism within the field of science, that is, by reserving it for those persons who have obtained the credentials to practice as a scientist.
2. Variations in Scientific Culture(s)
Going beyond the focus on modern science as a whole, much research on scientific culture has been devoted to its variations over time and across disciplines. Historians occasionally use the concept of periods as a way
of ordering cultural history, for example in the divisions of music history from baroque to classical to romantic. Although periodization is very approximate and relatively subjective, it nonetheless helps to point to some of the commonalities across scientific disciplines within the same time period, and some of the disjunctures in the history of science over time (Foucault 1970). Changes in conceptual frameworks and research programs across disciplines within a time period usually are part of broader cultural changes. For example, in the nineteenth century political values were frequently framed by grand narratives of progress (shared with the ‘white man’s burden’ of colonialism among other ideologies), and likewise new scientific fields such as cosmology, thermodynamics, and evolutionary biology evidenced a concern with temporal issues. In the globalized information society of the twenty-first century, concerns with complex systems have become more salient, and scientists in many disciplines are developing research programs based on ideas of information processing and systemic complexity. Science is to some extent universal in the sense that, for example, physicists throughout the world agree upon the basic laws and empirical findings of physics. However, there are also significant cultural variations in science across geographic regions. The variations are more salient in the humanities and social sciences, where distinctive theoretical frameworks often have a regional flavor. For example, one may speak of French psychoanalysis or Latin American dependency theory. The variations are also obvious for the institutional organization of scientific communities, laboratories, and university systems (Hess 1995). Furthermore, variations in the institutional organization are sometimes associated with differing research traditions in the content of a field.
For example, the salary and power structure of German universities in the first third of the twentieth century favored a more theoretical approach to genetics and a concern with evolution and development. In contrast, the relatively collegial governance structures of the American universities, together with their location in agricultural schools, contributed to the development of American genetics in a more specialized, empirical, and applied direction (Harwood 1993). Sometimes variations in institutional structures and conditions are also associated with differences in laboratory technologies, which in turn restrict the selection of research problems. For example, in Japan, funding conditions for physics have in part shaped the emergence of detector designs that differ substantially from those in the USA (Traweek 1988). In addition to the temporal and national cultures of science, a third area of variation in scientific cultures involves disciplinary cultures. For example, both high-energy physics and molecular biology are considered ‘big science,’ but their disciplinary cultures are substantially different (Knorr-Cetina 1999). Regarding
their intellectual cultures, in high-energy physics data are widely recognized as heavily interpretable, whereas in molecular biology the contact with the empirical world is much more hands-on and direct. Regarding the institutional cultures of the two fields, laboratories in high-energy physics are very large-scale, with the individual’s work subordinated to that of the cooperative team, even to the point of near anonymity in publication. In contrast, in molecular biology the laboratories are smaller, even for the genome project, and competition among individual researchers remains salient, with frequent disputes over priority in a publication byline.
3. Multiculturalism and Science
A third approach to the topic of scientific culture involves the ongoing modernization of scientific cultures. Although modern science is increasingly international and cosmopolitan, its biases rooted in particular social addresses have constantly been revealed. Science has been and remains largely restricted to white men of the middle and upper classes in the developed Western countries (Harding 1998, Rossiter 1995, Schiebinger 1989). As scientific knowledge and institutions have spread across the globe and into historically excluded groups, they have sometimes undergone changes in response to perspectives that new groups bring to science. For example, in the case of primatology, humans and monkeys have coexisted for centuries in India. Consequently, it is perhaps not surprising that Indian primatologists developed new research programs based on monkey–human interactions that challenged the romanticism of Western primatologists’ focus on natural habitats (Haraway 1989). More generally, as scientific disciplines become globalized, different national cultural traditions intersect with the globalized disciplinary cultures to reveal unseen biases and new possibilities for research. As women and under-represented ethnic groups have achieved a place in scientific fields, they have also tended to reveal hidden biases and develop new possibilities. Although the reform movements of multiculturalism in science can be restricted to the institutional culture of science (such as reducing racism and sexism in hiring practices), they sometimes also extend to the intellectual culture or content of science. Theories and methods, particularly in the biological and social sciences, that seem transparently cosmopolitan and truthful to white, male colleagues often appear less so to the new groups.
Whether it is the man-the-hunter theory of human evolutionary origins (Haraway 1989), the ‘master’ molecule theory of nucleus to cytoplasm relations (Keller 1985), or biological measures of racial inferiority (Harding 1993), scientific disciplines find themselves continually challenged to prove their universalism and to modify
concepts and theories in response to anomalies and new research. In some cases the members of excluded groups do more than replace old research methods and programs with new ones; they also create new research fields based on their identity concerns. A prime example is the work of the African American scientist George Washington Carver. Although best known in American popular culture for finding many new uses for the peanut, Carver’s research was embedded in a larger research program that was focused on developing agricultural alternatives to King Cotton for poor, rural, African American farmers (Hess 1995).
4. Conclusions
It is important not to think of the embeddedness of scientific cultures in broader cultural practices as a problem of contamination. The broader cultures of modern science provide a source of metaphors and institutional practices that both inspire new research and limit its possibilities. For example, if evolutionary theory could not be thought before the progressivist culture of the nineteenth century, it cannot help but be rethought today. Not only have new research findings challenged old models, but the broader cultural currents of complex systems and limits to growth have also inspired new models and empirical research (DePew and Weber 1995). In turn, today’s concepts and theories will be rethought tomorrow. The broader societal cultures are not weeds to be picked from the flower bed of scientific culture(s) but the soil that both nurtures and limits its growth, even as the soil itself is transformed by the growth that it supports.
See also: Academy and Society in the United States: Cultural Concerns; Cultural Psychology; Cultural Studies of Science; Culture in Development; Encyclopedias, Handbooks, and Dictionaries; Ethics Committees in Science: European Perspectives; History of Science; History of Science: Constructivist Perspectives; Scientific Academies in Asia
Bibliography
Ben-David J 1991 Scientific Growth. University of California Press, Berkeley, CA
Cohen F 1997 The Scientific Revolution. University of Chicago Press, Chicago
DePew D, Weber B 1995 Darwinism Evolving. MIT Press, Cambridge, MA
Foucault M 1970 The Order of Things. Vintage, New York
Habermas J 1989 The Structural Transformation of the Public Sphere. MIT Press, Cambridge, MA
Haraway D 1989 Primate Visions. Routledge, New York
Harding S (ed.) 1993 The Racial Economy of Science. Routledge, New York
Harding S (ed.) 1998 Is Science Multicultural? Indiana University Press, Bloomington, IN
Harwood J 1993 Styles of Scientific Thought. University of Chicago Press, Chicago
Hess D 1995 Science and Technology in a Multicultural World. Columbia University Press, New York
Hooykaas R 1990 Science and reformation. In: Cohen I (ed.) Puritanism and the Rise of Science. Rutgers University Press, New Brunswick, NJ
Jacob M 1988 The Cultural Meaning of the Scientific Revolution. Knopf, New York
Kammen M 1987 A Machine that Would Go of Itself. Knopf, New York
Keller E 1985 Reflections on Gender and Science. Yale University Press, New Haven, CT
Knorr-Cetina K 1999 Epistemic Cultures. Harvard University Press, Cambridge, MA
Koyré A 1965 Newtonian Studies. Chapman and Hall, London
Merton R 1973 Sociology of Science. University of Chicago Press, Chicago
Mulkay M 1976 Norms and ideology in science. Social Science Information 15: 637–56
Rossiter M 1995 Women Scientists in America. Johns Hopkins University Press, Baltimore, MD
Schiebinger L 1989 The Mind has no Sex? Harvard University Press, Cambridge, MA
Shapin S 1994 A Social History of Truth. University of Chicago Press, Chicago
Shapin S 1996 The Scientific Revolution. University of Chicago Press, Chicago
Traweek S 1988 Beamtimes and Lifetimes. Harvard University Press, Cambridge, MA
D. Hess
Scientific Disciplines, History of
The scientific discipline as the primary unit of internal differentiation of science is an invention of nineteenth-century society. There exists a long semantic prehistory of disciplina as a term for the ordering of knowledge for the purposes of instruction in schools and universities. But only the nineteenth century established real disciplinary communication systems. Since then the discipline has functioned as a unit of structure formation in the social system of science, in systems of higher education, as a subject domain for teaching and learning in schools, and finally as the designation of occupational and professional roles. Although the processes of differentiation in science are going on all the time, the scientific discipline as a basic unit of structure formation is stabilized by these plural roles in different functional contexts in modern society.
1. Unit Divisions of Knowledge
Disciplina is derived from the Latin discere (to learn), and it has often been used since late Antiquity and the early Middle Ages as one side of the distinction disciplina vs. doctrina (Marrou 1934). Both terms
meant ways of ordering knowledge for purposes of teaching and learning. Often they were used synonymously. In other usages doctrina is more intellectual and disciplina more pedagogical, more focused on methods of inculcating knowledge. A somewhat later development among the church fathers adds to disciplina implications such as admonition, correction, and even punishment for mistakes. This concurs with recent interpretations of discipline, especially in the wake of Michel Foucault, making use of the ambiguity of discipline as a term always pointing to knowledge and disciplinary power at the same time (cf. Hoskin in Messer-Davidow et al. 1993). Finally, there is the role differentiation of teaching and learning, and the distinction doctrina/disciplina is obviously correlated with it (Swoboda 1979). One can still find the same understandings of doctrina and disciplina in the literature of the eighteenth century. But what has changed since the Renaissance is that these two terms no longer refer to very small particles of knowledge. They point instead to entire systems of knowledge (Ong 1958). This goes along with the ever more extensive use by early modern Europe of classifications of knowledge and encyclopedic compilations of knowledge in which disciplines function as unit divisions of knowledge. The background to this is the growth of knowledge related to developments such as the invention of printing, the intensified contacts with other world regions, economic growth and its correlates such as mining and building activities. But in these early modern developments the archival function of disciplines still dominates. The discipline is a place where one deposits knowledge after having found it out, but it is not an active system for the production of knowledge.
2. Disciplines as Communication Systems
A first premise for the rise of disciplines as production and communication systems in science is the specialization of scientists and the role differentiation attendant on it (Stichweh 1984, 1992). Specialization is first of all an intellectual orientation. It depends on a decision to concentrate on a relatively small field of scientific activity, and, as is the case for any such decision, one needs a social context supporting it, that is, other persons taking the same decision. Such decisions are rare around 1750 when encyclopedic orientations dominated among professional and amateur scientists alike, but they gained in prominence in the last decades of the eighteenth century. Second, specialization as role differentiation points to the educational system, which is almost the only place in which such specialized roles can be institutionalized as occupational roles. From this results a close coupling of the emerging disciplinary structures in science and the role structures of institutions of higher education.
This coupling is realized for the first time in the reformed German universities of the first half of the nineteenth century and afterwards quickly spreads from there to other countries. Third, role differentiation in institutions of higher education depends on conditions of organizational growth and organizational pluralization. There has to be a sufficient number of organizations, each large enough to have differentiated roles, and these organizations must be interrelated in an ongoing continuity of interactions. The emergence of communities of specialists is a further relevant circumstance. In this respect the rise of disciplines is synonymous with the emergence of scientific communities theorized about since Thomas Kuhn (Kuhn 1970). Scientific communities rest on the intensification of interaction, shared expertise, a certain commonality of values, and the orientation of community members towards problem constellations constitutive of the respective discipline. Modern science is not based on the achievements of extraordinary individuals but on the epistemic force of disciplinary communities. Scientific communities are communication systems. In this respect the emergence of the scientific discipline is equivalent to the invention of new communication forms specific to disciplinary communities. First of all one may think here of new forms of scientific publications. In the eighteenth century a wide spectrum of publication forms existed; they were not, however, specialized in any way. There were instructional handbooks at the university level, journals of a general scientific nature for a regional public interested in utility, and academy journals aiming at an international public, each covering a wide subject area but with rather limited communicative effects.
It was only after 1780 that in France, in Germany, and finally, in England, nationwide journals with a specific orientation on such subjects as chemistry, physics, mineralogy, and philology appeared. In contrast to isolated precursors in previous decades, these journals were able to exist for longer periods exactly because they brought together a community of authors. These authors accepted the specialization chosen by the journal; but at the same time they continually modified this specialization by the cumulative effect of their published articles. Thus the status of the scientific publication changed. It now represented the only communicative form by which, at the macrolevel of the system of science—defined originally by national but later by supranational networks—communication complexes specialized along disciplinary lines could be bound together and persist in the long run (Stichweh 1984, Chap. 6, Bazerman 1988). At the same time the scientific publication became a formal principle interfering in every scientific production process. Increasingly restrictive conditions were defined regarding what type of communication was acceptable for publication. These conditions
included the requirement of identifying the problem tackled in the article, the sequential presentation of the argument, a description of the methods used, presentation of empirical evidence, restrictions on the complexity of the argument accepted within an individual publication, linkage with earlier communications by other scientists (using citations and other techniques), and the admissibility of presenting speculative thoughts. In a kind of feedback loop, publications, as the ultimate form of scientific communication, exercised pressure on the scientific production process (i.e., on research) and were thereby able to integrate disciplines as social systems. This reorganization of the scientific production process adheres to one new imperative: the search for novelties. The history of early modern Europe was already characterized by a slow shift in the accompanying semantics associated with scientific truth, from an imperative to preserve the truth to an interest in the novelty of an invention. The success achieved in organizing traditional knowledge, as well as tendencies towards empirical methods and increased use of scientific instruments, worked toward this end. In this dimension, a further discontinuity can be observed in the genesis of the term research in the years after 1790. In early modern times the transition from the preservation to the enlargement of knowledge could only be perceived as a continual process. In contrast, research from about 1800 refers to a fundamental, and at any time realizable, questioning of the entire body of knowledge until then considered as true. Competent scientific communication then had to be based on research in this sense. What was communicated might be a small particle of knowledge, as long as it was a new particle of knowledge. Scientific disciplines then became research disciplines based on the incessant production of novelties.
The link between scientific disciplines and organizations of higher education is mediated by two more organizational structures. The first of these is the disciplinary career. Specialized scientists, as members of disciplinary communities, need more than specialized occupational roles; they may also need careers defined in terms of these specialized roles. This again is a condition which sharply distinguishes eighteenth from nineteenth century universities. Around 1750 one still finds, even in German universities, hierarchical career patterns which implied that there was a hierarchical succession of chairs inside of faculties and a hierarchical sequence of faculties by which a university career was defined as a progression of steps through these hierarchized chairs. One could, for example, rise from a chair in the philosophical faculty to an (intellectually unrelated) chair in the medical faculty. The reorganization of universities since the early nineteenth century completely discontinued this pattern. Instead of a succession of chairs in one and the same university, a scientific career meant a progression through positions inside a discipline,
which normally demands a career migration between universities. This presupposes intensified interactions and competitive relations among universities which compete for qualified personnel and quickly take up new specializations introduced elsewhere. In Germany such regularized career paths through the national university system were especially to be observed from around 1850. This pattern is again closely related to disciplinary curricula, meaning that one follows one’s disciplinary agenda not only in one’s research practice and personal career, but furthermore that there exist institutional structures favoring teaching along lines close to current disciplinary core developments. The unity of teaching and research is one famous formula for this, but this formula does not yet prescribe disciplinary curricular structures which would demand that there should be a complete organization of academic studies close to the current intellectual problem situation and systematics of a scientific discipline. Only if this is the case does there arise a professionalization of a scientific discipline, which means that a systematic organization of academic studies prepares for a non-academic occupational role which is close to the knowledge system of the discipline. Besides professionalization there is then the effect that the discipline educates its own future research practitioners in terms of the methods and theories constitutive of the discipline. A discipline doing this is not only closed on the level of the disciplinary communication processes, it is also closed on the level of socialization practices and the attendant recruitment of future practitioners (on the operational closure of modern science see Luhmann 1990, Stichweh 1990).
3. The Modern System of Scientific Disciplines
It is not sufficient to analyze disciplines as individual knowledge producing systems. One has to take into account that the invention of the scientific discipline brings about first a limited number, then many scientific disciplines which interact with one another. Therefore it makes sense to speak of a modern system of scientific disciplines (Parsons 1977, p. 300ff., Stichweh 1984) which is one of the truly innovative social structures of the modern world. First of all, the modern system of scientific disciplines defines an internal environment (milieu interne in the sense of Claude Bernard) for any scientific activity whatsoever. Whatever goes on in fields such as physics, sociology, or neurophysiology, there exists an internal environment of other scientific disciplines which compete with that discipline, somehow comment on it, and offer ideas, methods, and concepts. There is normal science in a Kuhnian sense, always involved with problems to which solutions seem to be at hand in the disciplinary tradition itself; but normal science is
always commented upon by a parallel level of interdisciplinary science which arises from the conflicts, provocations and stimulations generated by other disciplines and their intellectual careers. In this first approximation it is already to be seen that the modern system of scientific disciplines is a very dynamic system in which the dynamism results from the intensification of the interactions between ever more disciplines. Dynamism implies, among other things, ever changing disciplinary boundaries. It is exactly the close coupling of a cognitively defined discipline and a disciplinary community which motivates this community to try an expansionary strategy in which the discipline attacks and takes over parts of the domain of other disciplines (Westman 1980, pp. 105–6). This was wholly different in the disciplinary order of early modern Europe, in which a classificatory generation of disciplinary boundaries meant that the attribution of problem domains to disciplines was invariable. If one decided to do some work in another domain, one had to accept that a change over to another discipline would be necessary to do this. Closely coupled to this internally generated and self-reinforcing dynamics of the modern system of scientific disciplines is the openness of this system to new disciplines. Here again arises a sharp difference from early modern circumstances. In early modern Europe there existed a closed and finite catalogue of scientific disciplines (Hoskin 1993, p. 274) which was related to a hierarchical order of these disciplines (for example philosophy was a higher form of knowledge than history, and philosophy was in its turn subordinated to faculty studies such as law and theology). In modern society no such limit to the number of disciplines can be valid. New disciplines incessantly arise, some old ones even disappear or become inactive as communication systems.
There is no center and no hierarchy to this system of the sciences. Nothing allows us to say that philosophy is more important than natural history or physics more scientific than geology. Of course, there are asymmetries in influence processes between disciplines, but no permanent or stable hierarchy can be derived from this. The modern system of scientific disciplines is a global system. This makes a relevant difference from the situation of the early nineteenth century, in which the rise of the scientific discipline seemed to go along with a strengthening of national communities of science (Crawford et al. 1993, Stichweh 1996). This nationalization effect, which may have had to do with a meaningful restriction of communicative space in newly constituted communities, has since proved to be only a temporary phenomenon, and the ongoing dynamics of (sub-)disciplinary differentiation in science seems to be the main reason why national communication contexts are no longer sufficient infrastructures for a rapidly growing number of disciplines and subdisciplines.
4. The Future of the Scientific Discipline
The preponderance of subdisciplinary differentiation in the late twentieth century is the reason most often cited for the presumed demise of the scientific discipline postulated by a number of authors. But one may object to this hypothesis on the ground that a change from disciplinary to subdisciplinary differentiation processes does not at all affect the drivers of internal differentiation in modern science: the relevance of an internal environment as decisive stimulus for scientific variations, the openness of the system to disciplinary innovations, and the nonhierarchical structure of the system. Even if one points to an increasing importance of interdisciplinary ventures (and to problem-driven interdisciplinary research), which one should expect as a consequence of the argument on the internal environment of science, this does not change the fact that disciplines and subdisciplines function as the form of consolidating interdisciplinary innovations. And, finally, there are the interrelations with the external environments of science (economic, political, etc.), which in twentieth- and twenty-first-century society are plural environments based on the principle of functional differentiation. Systems in the external environment of science are dependent on sufficiently stable addresses in science if they want to articulate their needs for inputs from science. This is true for the educational environment of science which has to organize school and higher education curricula in disciplinary or interdisciplinary terms, for role structures as occupational structures in the economic environment of science, and for many other demands for scientific expertise and research knowledge which always must be able to specify the subsystem in science from which the respective expertise may be legitimately expected.
These interrelations based on structures of internal differentiation in science which have to be identifiable for outside observers are one of the core components of modern society which, since the second half of the twentieth century, is often described as knowledge society. See also: Disciplines, History of, in the Social Sciences; History and the Social Sciences; History of Science: Constructivist Perspectives; Human Sciences: History and Sociology; Knowledge Societies; Scientific Academies, History of; Scientific Culture; Scientific Revolution: History and Sociology; Teaching Social Sciences: Cultural Concerns; Universities and Science and Technology: Europe; Universities and Science and Technology: United States; Universities, in the History of the Social Sciences
Bibliography
Bazerman C 1988 Shaping Written Knowledge: The Genre and Activity of the Experimental Article in Science. University of Wisconsin Press, Madison, WI
Crawford E, Shinn T, Sörlin S 1993 Denationalizing Science. The Contexts of International Scientific Practice. Kluwer, Dordrecht, The Netherlands
Hoskin K W 1993 Education and the genesis of disciplinarity: The unexpected reversal. In: Messer-Davidow E, Shumway D R, Sylvan D J (eds.) Knowledges: Historical and Critical Studies in Disciplinarity. University Press of Virginia, Charlottesville, VA, pp. 271–305
Kuhn T S 1970 The Structure of Scientific Revolutions, 2nd edn. University of Chicago Press, Chicago
Luhmann N 1990 Die Wissenschaft der Gesellschaft. Suhrkamp, Frankfurt am Main, Germany
Marrou H I 1934 ‘Doctrina’ et ‘Disciplina’ dans la langue des pères de l’église. Archivum Latinitatis Medii Aevi 9: 5–25
Messer-Davidow E, Shumway D R, Sylvan D J (eds.) 1993 Knowledges: Historical and Critical Studies in Disciplinarity. University Press of Virginia, Charlottesville, VA
Ong W J 1958 Ramus, Method, and the Decay of Dialogue: From the Art of Discourse to the Art of Reason. Harvard University Press, Cambridge, MA
Parsons T 1977 Social Systems and the Evolution of Action Theory. Free Press, New York
Stichweh R 1984 Zur Entstehung des Modernen Systems Wissenschaftlicher Disziplinen: Physik in Deutschland 1740–1890. Suhrkamp, Frankfurt am Main, Germany
Stichweh R 1990 Self-organization and autopoiesis in the development of modern science. In: Krohn W, Küppers G, Nowotny H (eds.) Selforganization: Portrait of a Scientific Revolution. Sociology of the Sciences, Vol. XIV. Kluwer Academic Publishers, Boston, pp. 195–207
Stichweh R 1992 The sociology of scientific disciplines: On the genesis and stability of the disciplinary structure of modern science. Science in Context 5: 3–15
Stichweh R 1996 Science in the system of world society. Social Science Information 35: 327–40
Swoboda W W 1979 Disciplines and interdisciplinarity: A historical perspective. In: Kockelmans J J (ed.) Interdisciplinarity and Higher Education. University Park, London, pp. 49–92
Westman R S 1980 The astronomer’s role in the sixteenth century: A preliminary study. History of Science 18: 105–47
R. Stichweh
Scientific Discovery, Computational Models of
Scientific discovery is the process by which novel, empirically valid, general, and rational knowledge about phenomena is created. It is, arguably, the pinnacle of human creative endeavors. Many academic and popular accounts of great discoveries surround the process with mystery, ascribing them to a combination of serendipity and the special talents of geniuses. Work in Artificial Intelligence on computational models of scientific reasoning since the 1970s shows that such accounts of the process of science are largely mythical. Computational models of scientific discovery are computer programs that make discoveries in particular scientific domains. Many of these systems model discoveries from the history of science or simulate the behavior of participants solving scientific problems in the psychology laboratory. Other systems attempt to make genuinely novel discoveries in particular scientific domains. Some have produced new findings of sufficient worth that the discoveries have been published in mainstream scientific journals. The success of these models provides some insights into the nature of human cognitive processes in scientific discovery and addresses some interesting issues about the nature of scientific discovery itself (see Scientific Reasoning and Discovery, Cognitive Psychology of).
1. Computational Models of Scientific Discovery
Most computational models of discovery can be conceptualized as performing a recursive search of a space of possible states, or expressions, defined by the representation of the problem. Procedures are used to search the space of legal states by manipulating the expressions and using tests of when the goal or subgoals have been met. To manage the search, which is typically subject to potential combinatorial explosion, heuristics are used to guide the selection of appropriate operators. This is essentially an application of the theory of human problem solving as heuristic search within a symbol processing system (Newell and Simon 1972). For example, consider BACON (Langley et al. 1987), an early discovery program which finds algebraic formulas as parsimonious descriptions of quantitative data. States in the problem search space of BACON include simple algebraic formulas, such as $D/P$ or $D^2/P$, where, for instance, $D$ is the distance of the planets from the sun and $P$ is their period of revolution around it. Tests in BACON attempt to find how closely potential expressions match the given quantitative data. Given quantitative data for the planets of the solar system, one step in BACON’s discovery path finds that neither $D/P$ nor $D^2/P$ is constant and that the first expression decreases monotonically as the second increases. Given this relation between the expressions, BACON applies its operator to give the product of the terms, i.e., $D^3/P^2$. This time the test of whether the expression is constant, within a given margin of error, is true. $D^3/P^2 = \text{constant}$ is Kepler’s third law of planetary motion. For more complex cases with larger numbers of variables, BACON uses discovery heuristics based on notions of symmetry and the conservation of higher order terms to pare down the search space.
The heuristics use the underlying regularities within the domain to obviate the need to explore parts of the search space that are structurally similar to previously explored states. Following such an approach, computational models have been developed to perform tasks spanning a full spectrum of theoretical activities including the formation of taxonomies, discovering qualitative and quantitative laws, creation of structural models and the development of process models (Langley et al. 1987, Shrager and Langley 1990, Cheng 1992). The range of scientific domains covered is also impressive, ranging from physics and astronomy, to chemistry and metallurgy, to biology, medicine, and genetics. Some systems have produced findings that are sufficiently novel to be worthy of publication in major journals of the relevant discipline (Valdes-Perez 1995).
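The heuristic search described for BACON can be illustrated with a small sketch. It is not BACON's actual implementation: the planetary values are approximate textbook figures, and the helper names (`evaluate`, `trend`, `canonical`, `describe`, `bacon_search`), the 2% tolerance, and the simple redundancy filter are all assumptions made for illustration. The two rules used (form a ratio when two terms rise together, form a product when one rises as the other falls) follow the usual description of BACON's heuristics, and the search ends at the standard form of Kepler's third law, in which the cube of the distance divided by the square of the period is constant.

```python
from itertools import combinations

# Approximate orbital data (illustrative values):
# (distance D in astronomical units, period P in years).
PLANETS = [
    (0.387, 0.241),    # Mercury
    (0.723, 0.615),    # Venus
    (1.000, 1.000),    # Earth
    (1.524, 1.881),    # Mars
    (5.203, 11.862),   # Jupiter
    (9.537, 29.457),   # Saturn
]


def evaluate(exps):
    """Values of the term D**a * P**b for every planet, where exps = (a, b)."""
    a, b = exps
    return [d ** a * p ** b for d, p in PLANETS]


def is_constant(values, tolerance=0.02):
    """Test: every value lies within `tolerance` (relative) of the mean."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tolerance * abs(mean) for v in values)


def trend(xs, ys):
    """+1 if ys rise as xs rise, -1 if ys fall as xs rise, 0 otherwise."""
    pairs = sorted(zip(xs, ys))
    steps = [later[1] - earlier[1] for earlier, later in zip(pairs, pairs[1:])]
    if all(s > 0 for s in steps):
        return 1
    if all(s < 0 for s in steps):
        return -1
    return 0


def canonical(exps):
    """Treat a term and its reciprocal as the same state in the search space."""
    first = next((e for e in exps if e != 0), 0)
    return tuple(-e for e in exps) if first < 0 else tuple(exps)


def describe(exps):
    """Render an exponent pair such as (3, -2) as the string 'D^3/P^2'."""
    num, den = [], []
    for symbol, e in zip("DP", exps):
        if e > 0:
            num.append(symbol if e == 1 else f"{symbol}^{e}")
        elif e < 0:
            den.append(symbol if e == -1 else f"{symbol}^{-e}")
    top = "*".join(num) or "1"
    return top if not den else f"{top}/{'*'.join(den)}"


def bacon_search(max_rounds=4):
    """Heuristic search for a constant term built from D and P."""
    terms = [(1, 0), (0, 1)]  # start from the raw variables D and P
    for _ in range(max_rounds):
        for x, y in combinations(list(terms), 2):
            direction = trend(evaluate(x), evaluate(y))
            if direction == 0:
                continue
            # Rising together -> ratio x/y; inversely related -> product x*y.
            sign = -1 if direction > 0 else 1
            candidate = canonical((x[0] + sign * y[0], x[1] + sign * y[1]))
            if candidate == (0, 0) or candidate in terms:
                continue  # skip trivial or already explored states
            if is_constant(evaluate(candidate)):
                return candidate
            terms.append(candidate)
    return None


if __name__ == "__main__":
    print(describe(bacon_search()))  # prints D^3/P^2
```

On this data the sketch generates D/P, then D^2/P, and finally their product D^3/P^2, which passes the constancy test. The redundancy filter plays the role the text assigns to heuristics: it keeps the search from revisiting states that are structurally equivalent to ones already explored.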
2. Scope of the Models
Computational models of scientific discovery have almost exclusively addressed theory formation tasks. However, the modeling of experiments has not been completely neglected as models have been built that design experiments, to a limited extent, by using procedures to specify what properties should be manipulated and measured in an experiment and the range of magnitudes over which the properties should be varied (Kulkarni and Simon 1988, Cheng 1992). For these systems, the actual experimental results are either provided by the user of the system or generated by a simulated experiment in the software. Discovery systems have also been directly connected to a robot which manipulates a simple experimental setup so that data collected from the instruments can be fed to the system directly, so eliminating any human intervention (Huang and Zytkow 1997). Nevertheless, few systems have simulated or supported substantial experimental activities, such as observing or creating new phenomena, designing experiments, inventing new experimental apparatus, developing new experimental paradigms, establishing the reliability of experiments, or turning raw data into evidence. This perhaps reflects a fundamental difference between the theoretical and experimental sides of science. While both clearly involve abstract conceptual entities, experimentation is also grounded in the construction and manipulation of physical apparatus, which involves a mixture of sophisticated perceptual abilities and motor skills. Developing models of discovery that include such capabilities would necessarily require other areas within AI beyond problem solving, such as image processing and robotics. The majority of discovery systems model a single theory formation task. The predominance of such systems might be taken as the basis for a general criticism of computational scientific discovery.
The models are typically poor imitations of the diversity of activities in which human scientists are engaged and, perhaps, it is from this variety that scientific creativity arises. Researchers in this area counter such arguments by claiming that the success of such single task systems is a manifestation of the underlying nature of the
process of discovery, that it is composed of subprocesses or tasks that are relatively autonomous. More complex activity can be modeled by assembling systems that perform one task into a larger system, with the inputs to a particular component subsystem being the outputs of other systems. The handful of models that do perform multiple tasks demonstrate the plausibility of this claim (e.g., Kulkarni and Simon 1988, Cheng 1992). The organization of knowledge structures and procedures in those systems exploits the hierarchical decomposition of the overall process into tasks and subtasks. This in turn raises the general question about the number and variety of different tasks that constitute scientific discovery and the nature of their interactions. What distinct search spaces are involved and how is information shared among them? Computational models of scientific discovery provide some insight into this issue. At a general level, many models can be characterized in terms of two spaces, one for potential hypotheses and the other a space of instances or sets of data (Simon and Lea 1974). Scientific discovery is then viewed as the search of each space mutually constrained by the search of the other. Inferring a hypothesis dictates the form of the data needed to test the hypothesis, while the data itself will determine whether the hypothesis is correct and suggest in what ways it should be amended (Kulkarni and Simon 1988). This is an image of scientific discovery that places equal importance on theory and experiment, portraying the overall process as a dynamic interaction between both components. This approach is applicable both to disciplines in which individual scientists do the theorizing and experimenting and to disciplines in which these activities are distributed among different individuals or research groups. 
The search of the theoretical and experimental spaces can be further decomposed into additional subspaces; for example, Cheng (1992) suggests three subspaces for hypotheses, models, and cases under the theory component, and spaces of experimental classes, setups, and tests under the experimental component.
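The two-space view can be made concrete with a toy sketch. Everything in it is hypothetical and chosen only for illustration: the three candidate laws, the list of possible experiments, the hidden law inside `run_experiment`, and the function names. It shows only the mutual constraint described above: the surviving hypotheses decide which experiment is worth running, and the experimental result prunes the hypotheses. It does not model the amendment of hypotheses or any of the finer subspaces.

```python
# Hypothesis space: candidate laws relating a manipulated x to a measured y.
HYPOTHESES = {
    "y = 2x": lambda x: 2 * x,
    "y = x^2": lambda x: x * x,
    "y = x + 2": lambda x: x + 2,
}

# Instance space: the settings of x at which an experiment can be run.
EXPERIMENTS = [2, 0, 1, 3]


def run_experiment(x):
    """Stand-in for nature; the hidden law in this toy example is y = x^2."""
    return x * x


def most_informative(live, untried):
    """Hypotheses constrain the instance search: prefer the experiment on
    which the surviving hypotheses disagree most in their predictions."""
    return max(untried, key=lambda x: len({HYPOTHESES[h](x) for h in live}))


def dual_space_search():
    live = set(HYPOTHESES)       # hypotheses not yet refuted
    untried = list(EXPERIMENTS)  # experiments not yet run
    log = []
    while len(live) > 1 and untried:
        x = most_informative(live, untried)
        untried.remove(x)
        y = run_experiment(x)
        # Data constrain the hypothesis search: drop refuted hypotheses.
        live = {h for h in live if HYPOTHESES[h](x) == y}
        log.append((x, y))
    return live, log


if __name__ == "__main__":
    survivors, experiments_run = dual_space_search()
    print(survivors, experiments_run)
```

In this toy setting all three hypotheses predict the same value at x = 2, so that experiment is never chosen, while a single measurement at x = 1 is enough to separate them.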
3. Developing Computational Models of Discovery
One major advantage of building computational models over other approaches to the study of scientific discovery is the precision that is imposed by writing a running computer program. Ambiguities and inconsistencies in the concepts used to describe discovery processes become apparent when attempting to encode them in a programming language. Another advantage of modeling is the ability to investigate alternative methods or hypothetical situations. Different versions of a system may be constructed embodying, say, competing representations to investigate the difficulty of making the discovery with the alternatives. The same system can be run with different sets of data, for example, to explore whether a discovery could have been made had there been less data, or had different data been available. Many stages are involved in the development of the models, including: formulation of the problem, engineering appropriate problem representations, selecting and organizing data, design and redesign of the algorithm, actual invocation of the algorithm, and filtering and interpretation of the findings of the system (Langley 1998). Considering the nature and relative importance of these activities in the development of systems provides further insight into the nature of scientific discovery. In particular, the design of the representation appears to be especially critical to the success of the systems. This implies that generally in scientific discovery finding an effective representation may be fundamental to the making of discoveries. This issue has been directly addressed by computational models that contrast the efficacy of different representations for modeling the same historical episode (Cheng 1996). Consistent with work in cognitive science, diagrammatic representations may in some cases be preferable to informationally equivalent propositional representations. Although computational models argue against any special abilities of great scientists beyond the scope of conventional theories of problem solving, the models suggest that the ability of some scientists to modify or create new representations may be an explanation, at least in part, of why they were the ones to succeed.
4. Conclusions and Future Directions
Given the extent of the development work necessary on a discovery system, it seems appropriate to attribute discoveries as much to the developer as to the system itself, although without the system many of the novel discoveries would not have been possible. This does not imply that machine discovery is impossible, but that care must be taken in delimiting the capabilities of discovery systems. Further, the ability of the KEKADA system (Kulkarni and Simon 1988) to change its goals to investigate any surprising phenomenon it discovers suggests that systems can be developed that would filter and interpret the output of existing systems, by constraining the search of the space defined by the outputs of those systems using metrics based on notions of novelty. Developing such a system, or other systems that find problems or that select appropriate representations, will require the system to possess a substantially more extensive knowledge of the target domain. Such knowledge-based systems are costly and time consuming to build, so it appears that the future of discovery systems will be more as collaborative support systems for domain
scientists rather than fully autonomous systems (Valdes-Perez 1995). Such systems will combine the respective strengths of domain experts and the computational power of the models so that each compensates for the other’s limitations. See also: Artificial Intelligence: Connectionist and Symbolic Approaches; Artificial Intelligence in Cognitive Science; Artificial Intelligence: Search; Deductive Reasoning Systems; Discovery Learning, Cognitive Psychology of; Intelligence: History of the Concept; Problem Solving and Reasoning, Psychology of; Problem Solving: Deduction, Induction, and Analogical Reasoning; Scientific Reasoning and Discovery, Cognitive Psychology of
Bibliography
Cheng P C-H 1992 Approaches, models and issues in computational scientific discovery. In: Keane M T, Gilhooly K (eds.) Advances in the Psychology of Thinking. Harvester Wheatsheaf, Hemel Hempstead, UK, pp. 203–236
Cheng P C-H 1996 Scientific discovery with law encoding diagrams. Creativity Research Journal 9(2&3): 145–162
Huang K-M, Zytkow J 1997 Discovering empirical equations from robot-collected data. In: Ras Z, Skowron A (eds.) Foundations of Intelligent Systems. Springer, Berlin
Kulkarni D, Simon H A 1988 The processes of scientific discovery: The strategy of experimentation. Cognitive Science 12: 139–75
Langley P 1998 The computer-aided discovery of scientific knowledge. In: Proceedings of the First International Conference on Discovery Science. Springer, Berlin
Langley P, Simon H A, Bradshaw G L, Zytkow J M 1987 Scientific Discovery: Computational Explorations of the Creative Processes. MIT Press, Cambridge, MA
Newell A, Simon H A 1972 Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ
Shrager J, Langley P (eds.) 1990 Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA
Simon H A, Lea G 1974 Problem solving and rule induction: a unified view. In: Gregg L W (ed.) Knowledge and Cognition. Lawrence Erlbaum, Potomac, MD, pp. 105–127
Valdes-Perez R E 1995 Some recent human/computer discoveries in science and what accounts for them. AI Magazine 16(3): 37–44
P. C.-H. Cheng
Scientific Evidence: Legal Aspects
Expertise, scientific and otherwise, has been part of the legal landscape for centuries (Hand 1901). Over the last decades of the twentieth century the role of scientific evidence in the law has expanded rapidly, both in regulatory settings and in litigation. Statutes and treaties routinely require agencies to provide scientific justifications for regulatory decisions within and among nations. National and international policy disputes are increasingly fought out within the risk assessment rhetoric of science (Craynor 1993). Courts are another large consumer of science. In both criminal and civil cases, parties believe scientific testimony will make their case stronger. As science’s role has grown, so has interest in the relationship between law and science. This essay reviews the present state of knowledge concerning the science–law relationship and mentions several areas ripe for investigation.
1. What Law Wants from Science
In both the administrative and the courtroom context, law is most frequently interested in acquiring scientific answers to practical questions of causation (e.g., does exposure to airborne asbestos cause lung cancer) and measurement (e.g., what is the alcohol level in a driver’s blood). Both of these questions entail questions as to whether specific techniques (e.g., a breathalyzer) are capable of producing reliable measurements. Theoretical questions per se are frequently of secondary interest, useful insofar as they help courts and regulators to choose among conflicting measurements, extrapolations, or causal assertions. Courts and agencies are both interested in questions of general and specific causation. Agencies are interested in the effects of nitrous oxide on the environment and the specific level of emissions from a particular plant. Courts may be interested in whether a given level of airborne asbestos causes lung cancer. And they may need to determine if a particular plaintiff’s lung cancer was caused by asbestos exposure. It is often much more difficult to achieve a scientifically rigorous answer to this latter type of question. An important difference between courts and agencies is the level of proof necessary to reach a conclusion. In administrative settings agencies may prevail if they can show that a substance poses a danger or a risk. In a courtroom, however, in order to prevail, the plaintiff will be required to show both general and specific causation. Scientific evidence that is sufficient in the regulatory context may be considered insufficient in the court context.
For example, animal studies showing a relationship between saccharin consumption and cancer may be sufficient to cause an agency to impose regulations on human exposure to the sweetener, but would not be sufficient to permit a group of plaintiffs to prevail on a claim that they were actually injured by consuming saccharin in diet soft drinks, or even on the general causation claim that saccharin at human dose levels causes cancer in any humans.
2. Legal Control of Scientific Evidence
One of the more interesting aspects of the use of science in courts is the legal effort to control the terms of the law–science interaction. Judicial control of scientific experts is shaped by whether the court is operating in an inquisitorial or an adversarial system (van Kampen 1998). In inquisitorial systems, e.g., Belgium, Germany, France, Japan, the judge plays a large role in the production of evidence. Experts are almost always court appointed and are asked to submit written reports. Parties may be given the opportunity to object to a particular expert, question the expert about the opinion rendered, or hire their own expert to rebut the court-appointed expert; but it is very difficult to attack the admissibility of an expert’s report successfully (Langbein 1985, Jones 1994). In adversarial systems the parties usually have far greater control over the production of evidence, including the selection and preparation of expert witnesses. Court-appointed experts are rare (Cecil and Willging 1994). A similar pattern can be observed in the regulatory arena. Jasanoff (1991) compares the relatively closed, consensual, and non-litigious British regulatory approach to the open, adversarial and adjudicatory style found in the United States. Clearly, legal organization and legal culture shape the way in which legal systems incorporate science. In turn, legal organization is related to larger cross-cultural differences. For example, open, adversarial approaches are more pronounced in societies that are more individualistic and lower on measures of power distance (Hofstede 1980). Judicial control of scientific expert testimony has become a high-profile issue in the United States. Critics complain that the combination of adversarial processes and the use of juries in both criminal and civil trials encourages the parties to introduce bad science into the trial process.
They argue, with some empirical support, that the use of experts chosen by the parties produces individuals who by their own account stray relatively further from Merton’s four norms of science—universalism, communism, disinterestedness, and organized skepticism (Merton 1973). They submit that jurors are too easily swayed by anyone who is called a scientist regardless of the quality of the science supporting the expert’s position. (See Expert Witness and the Legal System: Psychological Aspects.) Many now call upon judges to play a greater role in selecting and supervising experts, a role closer to that found in inquisitorial systems (Angell 1996). The preference for an inquisitorial style makes several assumptions about the nature of science and the proper relationship between law and science. Similar assumptions underlie the admissibility rules employed by courts. The history of admissibility decisions in American courts provides a useful way to explore these assumptions. Precisely because the knowledge possessed by the scientist is beyond the ken of the court, the judge (or
the jury) often is not in a good position to determine if what the expert is offering is helpful. In a 1923 decision a federal court offered one solution to this problem: acquiesce to the judgment of the community of elites who are the guardians of each specialized area of knowledge. Under the so-called Frye rule experts may testify only if the subject of their testimony has reached ‘general acceptance’ in the relevant scientific community. In 1993 in Daubert vs. Merrell Dow Pharmaceuticals, Inc., and subsequent opinions, the United States Supreme Court moved away from the Frye rule. In its place federal courts have developed a nonexclusive multifactor test designed to assess the validity of the science underlying the expert’s testimony. Factors the courts have mentioned include: whether the theory or technique underlying the expert’s testimony is falsifiable and has been tested; the error rate of any techniques employed; whether the expert’s research was developed independent of litigation; whether the testifying expert exercised the same level of intellectual rigor that is routinely practiced by experts in the relevant field; whether the subject matter of the testimony has been subjected to peer review and publication in refereed journals; and, in a partial retention of the Frye test, whether the theory or technique has achieved general acceptance in the relevant scientific community. The Daubert-inspired test offers an alternative to acquiescence, a do-it-yourself effort on the part of the courts. A number of the factors are consistent with an adversarial legal system that institutionalizes mistrust of all claims to superior authority. The adversary system itself is consistent with the values of a low-power-distance culture that affords relatively less legitimacy to elite authoritative opinion and, therefore, is skeptical that scientists are entitled to a privileged language of discourse from which others are excluded.
It is also consistent with a view of science strongly influenced by other forces in society. These are perspectives on the scientific enterprise that are associated with those who adopt a social constructionist view of science (Shapin 1996, Pickering 1992).
3. Legal Understanding of the Scientific Enterprise Daubert, however, offers anything but a social constructionist test if by this we mean that scientific conclusions are solely the result of social processes within the scientific community. At its core, the opinion requires of judges that they become sufficiently knowledgeable about scientific methods so that they can fairly assess the validity of evidence offered at trial. This requirement that scientific testimony must pass methodological muster reflects a positivist approach that is slanted toward a Baconian view of science. The opinion cites with favor a
Popperian view of how to distinguish the scientific enterprise from other forms of knowledge (Popper 1968). In this regard, the opinion is not unique. In their use of scientific evidence, both courts and administrative agencies seem to distinguish the process of science from its products. They accept the constructionist insight that the process of doing science is a social enterprise and is subject to the buffeting, often distorting winds of social, political, economic, and legal influences. At the same time, courts, agencies, and legislatures cling to a realist belief that the products of science may state a truth about the world, or at least something so similar to truth as it is commonly understood at a given point in history that the particular discipline of law does not need to concern itself with the difference. The legal system’s view of science adopts a strong version of what Cole (1992, p.x) calls a realist–constructivist position, i.e., science is socially constructed both in the laboratory and in the wider community, but the construction is constrained by input from the empirical world. It rejects what he calls a relativist–constructionist position that claims nature has little or no influence on the cognitive content of science. The focus on methods is a search for some assurance that the expert has given the empirical world a reasonable opportunity to influence and constrain the expert’s conclusions. Ultimately, the law’s epistemology with respect to science holds that there are a set of (social) practices often given the shorthand name ‘the scientific method’ which increase the likelihood that someone will make positive contributions to knowledge; a set of practices that scientists themselves frequently point to as the sources of past scientific success (Goldman 1999). 
There is a large dose of pragmatism in all of this, of course, and the Daubert rule itself has been cited as an example of ‘the common law’s genius for muddling through on the basis of experience rather than logic’ (Jasanoff 1995, p. 63). Not surprisingly, some have criticized the courts for failing to adopt a philosophically coherent admissibility rule (Schwartz 1997). The court’s admissibility rulings do seem to have proceeded in happy obliviousness to the ‘science wars’ that arguably began with Fleck (1979), flourished with Kuhn (1962) and raged for much of the last half of the twentieth century between the defenders of a more traditional, positivist view of science and those critics who emphasize its historical, political, social, and rhetorical aspects (Leplin 1997, Latour 1999). The same could be said of administrative use of science. The rejection of relativist views of science does not mean that all legal actors hold identical views. It would be valuable to map the beliefs of legal actors on central disputes in the science wars, and how these beliefs impact their use of science. For example, if, as seems likely, plaintiff personal injury lawyers in the United States hold a more relativist view of science, how, if at all, does this affect their selection and preparation of experts and, in turn, how are these experts received in courts?
4. Law–Science Interdependence Law’s approach to science should not be understood in terms of science alone but rather in terms of the law–science interaction. There are several dimensions to this relationship. First, although modern legal systems may recognize that scientists are influenced by the social world around them, permitting radical deconstruction that would undermine science’s claim to special status is difficult to imagine (Fuchs and Ward 1994). The modern state is increasingly dependent upon science as a source of legitimacy. By turning to science for solutions to complex environmental and safety issues, legislatures are able to avoid making difficult political choices while giving the appearance of placing decision-making in the hands of apparently neutral experts who are held in very high esteem relative to other elites (Lawler 1996). The advantages of this approach are so great that agencies frequently engage in a ‘science charade’ in which they wrap their decision in the language of science even when there is very little research supporting a regulation (Wagner 1995). Second, and related to the first observation, the law–science interaction is one of the important ways in which the state helps to produce and maintain science’s dominant position in modern Western society. In both its administrative and judicial actions, the law defines what is and what is not scientific knowledge, and thereby assists science in important boundary maintenance work of excluding many from the community of ‘scientists.’ For example, in the case of United States vs. Starzecpyzel, the court permitted the state’s handwriting experts to testify on the question of whether the defendants had forged documents, but only if they did not refer to themselves as ‘scientists.’ Moreover, legal decisions contain an implicit epistemology that reinstitutionalizes one view of the nature of scientific knowledge. 
When courts and other institutions of the state reject a relativist view of science that argues the empirical world has little, if any, influence on what is accepted as true by the scientific community, they help to marginalize anyone who adopts this position. There can be little doubt that the view is given little or no attention in law. A cursory search for the names of individuals most associated with these debates in American federal cases finds over 100 references to Popper alone, but no more than one or two passing references to the more prominent critics of traditional understandings of the scientific enterprise. Third, the exact contours of the law–science interaction are shaped by a society’s legal structure and its legal culture. For example, the American legal system’s realist–constructivist understanding of science fits neatly with its own normative commitment to
both ‘truth’ and ‘justice’ as legitimate dispute resolution goals. Ideally, cases should be correctly decided, should arrive at the truth. A realist view of science is consistent with the idea that a court may arrive at the correct outcome. But the truth is contested, and the courts should also give litigants the sense that they were listened to; that they received procedural justice (Tyler 1990). A constructivist understanding of the scientific enterprise legitimates the right of each party to find an expert who will present their view of the truth. We may hypothesize that all legal systems tend toward an epistemology of science and the scientific enterprise that fit comfortably within their dominant methods of law-making and dispute settlement. We might expect, therefore, that inquisitorial systems would be more skeptical of a constructivist view of the scientific enterprise than is the case in adversarial legal systems.
5. Effects of Growing Interdependence As this essay attests, science’s impact on legal processes grows apace. In many areas, an attorney unarmed with scientific expertise operates at a significant disadvantage. A growing number of treatises attempt to inform lawyers and judges of the state of scientific knowledge in various areas (Faigman et al. 1997). At the level of the individual case, there is some evidence that scientific norms are altering the way scientifically complex lawsuits are tried. Restrictive admissibility rules are one of several ways that courts in the United States may restrict traditional adversary processes when confronted with cases involving complex scientific questions. What of law’s effects on science? If the supply of science on some issue is affected by demand, legal controversy should draw scientists to research topics they might otherwise have eschewed. There is evidence that this does occur. The subject matter of scientific research is shaped by legal controversy in a wide number of areas, from medical devices (Sobol 1991) to herbicides (Schuck 1986). Some have argued that law’s interest has an additional effect of causing the production of ‘worse’ science (Huber 1991), but it might also be argued that in areas such as DNA testing, law’s interest has produced better science and better technology than might otherwise have existed. On the other hand, the threat of legal controversy may impede research into some areas, such as the development of new contraceptive devices. Even if we believe all science is produced through a process of social construction, it is also the case that most scientists believe that their work is constrained by the empirical world (Segerstråle 1993). Moreover, they often attempt to surround themselves with a social structure that keeps society at a distance. Scientific groups are increasingly attempting to propagate ethical standards for those who offer expert testimony in courts, reflecting the widespread belief that scientists who typically appear in legal arenas differ from their colleagues on these dimensions. Finally, legal use of science affects the closure of scientific debates (Sanders 1998). On the one hand, law may perpetuate controversy by supplying resources to both sides. On the other hand, it may help to bring a dispute to closure by authoritatively declaring one side to be correct.
See also: Expert Systems in Cognitive Science; Expert Testimony; Expert Witness and the Legal System: Psychological Aspects; Parties: Litigants and Claimants; Science and Law; Science and Technology Studies: Experts and Expertise; Statistics as Legal Evidence
Bibliography
Angell M 1996 Science on Trial: The Clash of Medical Evidence and the Law in the Breast Implant Case. Norton, New York
Cecil J S, Willging T E 1994 Court appointed experts. In: Federal Judicial Center Reference Manual on Scientific Evidence. West, St. Paul, MN
Cole S 1992 Making Science: Between Nature and Society. Harvard University Press, Cambridge, MA
Craynor C F 1993 Regulating Toxic Substances: A Philosophy of Science and the Law. Oxford University Press, New York
Daubert vs. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993)
Faigman D L, Kaye D, Saks M, Sanders J 1997 Modern Scientific Evidence: The Law and Science of Expert Testimony. West Group, St. Paul, MN
Fleck L 1979 Genesis and Development of a Scientific Fact. University of Chicago Press, Chicago, IL
Fuchs S, Ward S 1994 What is deconstruction, and where and when does it take place? Making facts in science, building cases in law. American Sociological Review 59: 481–500
Goldman A I 1999 Knowledge in a Social World. Clarendon Press, Oxford, UK
Hand L 1901 Historical and practical considerations regarding expert testimony. Harvard Law Review 15: 40
Hofstede G H 1980 Culture’s Consequences: International Differences in Work-related Values. Sage, Beverly Hills, CA
Huber P W 1991 Galileo’s Revenge: Junk Science in the Courtroom. Basic Books, New York
Jasanoff S 1991 Acceptable evidence in a pluralistic society. In: Mayo G D, Hollander R D (eds.) Acceptable Evidence: Science and Values in Risk Management. Oxford University Press, Oxford, UK
Jasanoff S 1995 Science at the Bar: Law, Science, and Technology in America. Harvard University Press, Cambridge, MA
Jones C A G 1994 Expert Witnesses: Science, Medicine, and the Practice of Law. Clarendon Press, Oxford
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago, IL
Langbein J H 1985 The German advantage in civil procedure. University of Chicago Law Review 52: 823–66
Latour B 1999 Pandora’s Hope: Essays on the Reality of Science Studies. Harvard University Press, Cambridge, MA
Lawler A 1996 Support for science stays strong. Science 272: 1256
Leplin J 1997 A Novel Defense of Scientific Realism. Oxford University Press, New York
Merton R K 1973 The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press, Chicago, IL
Pickering A (ed.) 1992 Science as Practice and Culture. University of Chicago Press, Chicago, IL
Popper K R 1968 The Logic of Scientific Discovery. Hutchinson, London
Sanders J 1998 Bendectin On Trial: A Study of Mass Tort Litigation. University of Michigan Press, Ann Arbor, MI
Schuck P H 1986 Agent Orange On Trial: Mass Toxic Disaster in the Courts. Belknap Press of Harvard University Press, Cambridge, MA
Schwartz A 1997 A ‘dogma of empiricism’ revisited: Daubert vs. Merrell Dow Pharmaceuticals, Inc. and the need to resurrect the philosophical insight of Frye vs. United States. Harvard Journal of Law and Technology 10: 149
Segerstråle U 1993 Bringing the scientist back in: The need for an alternative sociology of scientific knowledge. In: Brante T, Fuller S, Lynch W (eds.) Controversial Science: From Content to Contention. State University of New York Press, Albany, NY
Shapin S 1996 The Scientific Revolution. University of Chicago Press, Chicago, IL
Sobol R B 1991 Bending The Law: The Story of the Dalkon Shield Bankruptcy. University of Chicago Press, Chicago, IL
Tyler T R 1990 Why People Obey the Law. Yale University Press, New Haven, CT
United States v. Starzecpyzel, 880 F. Supp. 1027 (S.D.N.Y. 1995)
van Kampen P T C 1998 Expert Evidence Compared: Rules and Practices in the Dutch and American Criminal Justice System. Intersentia Rechtswetenschappen, Antwerp, Belgium
Wagner W E 1995 The science charade in toxic risk regulation. Columbia Law Review 95: 1613–723
J. Sanders
Scientific Instrumentation, History and Sociology of In the course of the last seventy years, historical and sociological analysis of instrumentation has changed considerably. World War II serves as an important watershed. Before the war, ‘instrumentation’ referred mainly to scientific instruments. They figured in experimentation, whose purpose was to demonstrate the truths of theory by making theoretical claims visible. After 1945 the ways in which instrumentation has been perceived and the functions attributed to it have multiplied and expanded, taking into account not only devices in the science laboratory, but also apparatus used in industry, government, health care, the military, and beyond. Instrumentation is now identified in many areas of science and technology
studies as central to research, engineering, industrial production, and to the processes of innovation. It is perceived as a mechanism that conditions the content of knowledge and affects the organization of work and even broader societal interactions. In the writings of early twentieth-century students of science, instrumentation was seldom discussed and never highlighted. Historian-philosophers of science like Gaston Bachelard (1933, 1951) saw science chiefly in terms of the development of new theory. In this idealist historiographical tradition, experimentation received little attention, and instrumentation was only treated as a prop for experiments whose function was to document the discoveries embodied in scientific theories. Instruments did not invite study. Questions of instrument design, construction, and use, and their limitations, went unattended. This is not to suggest, however, that there was no interest in scientific devices at the time. A few scholars were fascinated by them, and they strove to preserve apparatus and to unearth new devices. They were interested in the technical intricacies and novelty of instruments, for example those of Galileo Galilei. Devices were often viewed as antiquities, and due to this focus the distinction between the work of curators of science museums and instrument scholarship was sometimes a fine one. Here again, instrumentation was not treated as an active component in the knowledge production process, nor was it regarded as problematic in terms of its invention, use, or impact on the organization of research and science and technology communities. Stimulating new perspectives in scientific instrumentation arose in the 1970s and 1980s in connection with historical and sociological investigations of post-1945 big science.
For a long time most of the research that portrayed instrumentation as a central component of science and technology focused on devices in the physical sciences. The classic study by John Heilbron and Robert Seidel (1989) of the Berkeley cyclotron in the 1930s is emblematic of the new place of instrumentation in contemporary historiography. Issues of design, finance, engineering, and construction lay at the center of the cyclotron study. The cyclotron was portrayed as an instrument whose technical and social facets involved uncertainties. It was not a ‘pure’ instrument that reflected science’s drive to probe the physical world. While the cyclotron in part served this objective, the instrument also reflected the economic and institutional environment of the San Francisco region, the hope for better healthcare, financial concessions wrung from the government, and involvement by wealthy research foundations and industry. This history demonstrates that scientific instrumentation may be guided by the scientific community, but that it is sometimes spawned by circumstances and forces outside the pale of science. The idea that instruments are not neutral devices that serve science but elements that give structure to
the scientific community first took root in studies of radio astronomy. This provocative concept was quickly extended to the sphere of high-energy physics at large (Krige 1996), oceanography (Mukerji 1992), and space science (Krige 2000). David Edge and Michael Mulkay (1976) first demonstrated that a scientific discipline, radio astronomy, which emerged in the 1950s and 1960s, was directly linked to or even defined by the design, construction, and diffusion of an altogether novel device: the radio telescope (itself an outgrowth of microwave technical research). The radio telescope discovered astronomical bodies and events. It contributed importantly to the birth of a new speciality, with its own university departments, journals, and national and international congresses. A new scientific instrument transformed knowledge, and it also affected the very institution of science. In a more speculative, even iconoclastic representation of scientific instruments, they are depicted as the key to research career success, and yet more assertively, as decisive motors in determining what is true in science. In some areas of science, the equipment crucial to carrying out telling experiments is extremely scarce due to the expense of acquisition and operation. By virtue of possessing a monopoly in an area of instrumentation, a scientist or laboratory can exercise control over the production of the best experimental data. Studies in this vein have been done for the fields of biology (Latour and Woolgar 1979) and physics (Pickering 1984, Pinch 1986). In this perspective Bruno Latour (1987) has suggested that scientific instruments yield not merely professional and institutional advantage, but more important, what is true and false, valid and invalid in science. A researcher’s dealings with instruments empower him or her to be heard and to be ‘right’ during scientific controversies. 
By dint of possessing a strategic apparatus, a laboratory is well positioned to establish what is and is not a sound claim. Latour insists that arguments and findings based on ‘weaker’ instruments, on mathematics, and on rational evaluation are a poor match against a truly powerful scientific instrument. Analyses of this sort are diametrically opposite to those of pre-World War II idealist historiography. A balanced, subtle, intellectual, and social contribution of instrumentation in the work of scientific research is found in the writings of Peter Galison. In How Experiments End (1987), this author argues that in twentieth-century microscopic experimental physics, instrument-generated signals are often crucial to settling rival claims, and he suggests that the work of separating noise from signals constitutes a key component in this process. In Image and Logic, Galison (1997) states that, contrary to what is often argued, scientific findings are not the outcome of interactions between theory and experimentation. He factors in a third element, namely instrumentation. Science thus derives from a triangular exchange between theory,
experimentation, and instrumentation. Galison speaks of a ‘trading zone’: a language and realm where these three currents merge, and where intelligibility is established. The brand of philosophical realism espoused by Ian Hacking (1983, 1989) likewise accords a central position to instrumentation. He insists that physical entities exist to the extent that instruments generate unarguably measurable effects. The classical example given by Hacking is the production of positrons whose presence induces palpable technical effects. Today, other science and technology studies identify instrumentation as an element which influences and sometimes structures the organization of work. Instruments are depicted as rendering obsolete some activities (functions) and some groups, as stimulating fresh functions, and as helping create the backdrop for organizational transformations. In the case of early big science, high-energy physics, highly specialized physicists were replaced in the task of particle detection and tracking by armies of low-skilled women observers because of the emergence of alternative large-scale photographic technologies and protocols. In parallel, new technical roles, framed by new work arrangements, arose for engineering and technician cadres who were assigned to design or maintain novel devices or to assure effective interfaces between instrument packages. In a different sphere, the introduction of the first electronic calculators and computers near the end of World War II transformed occupations and work organization as they spelled the end of the human calculators (mostly women) who during the early years of the war contributed to the military’s high-technology programs. Due to the advent of the electronic computer in the late 1940s and 1950s, small horizontally organized work groups supplanted the former vastly bigger and vertically structured work system.
Outside science too, for example in the military and industry, instrumentation is also currently viewed as having a structuring impact on the organization of labor. Control engineering, connected as it is with chains of cybernetic-based instrument systems, is depicted as having profoundly modified the organization of certain activities in military combat operations, fire control, and guidance. Other studies highlight instrumentation as a force behind changes in the composition and organization of many industrial tasks and occupations, through automation and robotics (Noble 1984, Zuboff 1988). These alter the size of a work force and its requisite skills, and the internal and external chains of hierarchy and command. Throughout the 1980s and 1990s, students of science and technology tracked and analyzed the development, diffusion, and implantation of devices in spheres increasingly distant from science and the laboratory. The concept ‘instrumentation’ took on a broader and different meaning from the initial historical and sociological concept of ‘scientific instrumentation.’
Studies of medical instrumentation led the way in this important transformation. Stuart Blume’s (1992) investigations of the emergence and diffusion of CAT-scan devices, NMR imaging, and sonography illuminated the links between academic research, industrial research, the non-linear processes of industrial development of medical instrumentation, and the far-flung applications of such apparatus. Blume’s and similar analyses had the effect of introducing significant complexities into the earlier fairly clear-cut notion of instrumentation. They blur the former understanding of a ‘scientific instrument’ by showing the multiple sites of its origins and development. They further show that an instrument may possess multiple applications, some in science and others in engineering, industry, metrology, and the military. One thing stands out with clarity: a scientific instrument is frequently coupled to industry in powerful ways, through design, construction, diffusion, and maintenance. This link is historically situated in the late nineteenth and the twentieth centuries, and it appears to grow constantly in strength. This situation is pregnant with material and epistemological implications for science and beyond, as experiment design, laboratory work practices, reasoning processes, the organization of industry, and daily life are now all so interlocked with instrumentation. With only a few exceptions, however, little historical and sociological work has thus far concentrated on firms specifically engaged in the design, construction, and diffusion of instrumentation. It was not until the late nineteenth and early twentieth centuries that the military general staffs and politicians of certain countries (Germany, France, Great Britain, and somewhat later the US, the USSR, and Japan) began to perceive that their nation’s fate was tightly bound up with the quantity and quality of the instrumentation that endogenous industry could conceive and manufacture.
In part because of this new consciousness, as well as the growth of scientific research, the number of companies involved in instrument innovation and production increased impressively. Mari Williams (1994) indicates that instrument companies must be thought of as key components in national systems of education, industry, government policy, and the organization of science. During much of the nineteenth century, France enjoyed the lead. England challenged the French instrument industry at the turn of the century, but by this time leadership clearly belonged to recently united Germany. Success in the scientific instrument industry appears to have been associated with tight organic bonds between specific firms and specific research laboratories, which was the case in England for companies close to the Cavendish. Germany’s immense success was the product of an organic association between the military, government, industrial manufacturing, and instrument-making firms (Joerges and Shinn 2001). In the twentieth century, such proximity (not only geographic but perhaps more
particularly in terms of working collaborations and markets) similarly proved effective in the United States. Does there exist an instrument-making or instrument-maker culture? This question too has received little attention, and any answer would have to be historically limited. Nevertheless, one often-cited study does address this issue, albeit indirectly. For a sample of post-World War II US instrument specialists, the sociologist Daniel Shimshoni (1970) looked at instrument specialists’ job mobility. He discovered that when compared to other similarly trained personnel, instrument specialists changed jobs more frequently than other categories of employees. However, the reasons behind this high mobility received scanty attention. One interpretation (not raised by Shimshoni) is that instrument specialists change employers in order to carry their instruments into fresh environments. Alternatively, once instrument specialists have performed their assigned tasks, perhaps employers encourage their departure from the firm, and instrument men are consequently driven to seek work elsewhere. One theme that has gained considerable attention during the 1980s and 1990s is the relationship between innovation and instrumentation. In an influential study, the sociologist of industrial organization and innovation Eric Von Hippel (1988) has explored the sites in which instrument innovations arise, and has examined the connections between those sites and the processes of industrial innovation. He indicates that a sizable majority of industrially relevant instrument novelties comes from inside industry, and definitely not from academia or from firms specialized in instrumentation. The instruments are most frequently the immediate and direct consequence of locally experienced technical difficulties in the realms of product design, manufacture, or quality control.
Instrumentation often percolates laterally through a company, and is thus usually home used as well as home grown. Many devices do not move beyond the firm in which they originate. Hence, while in some instances dealings with academia and research may be part of industry practice, instrumentation is normally only loosely tied to science. When the connection between instrumentation development and science is strong, it is, moreover, often the case that industry-spawned instrumentation percolates down to the science laboratory rather than academia-based devices penetrating industry. Some sociologists of innovation suggest that instrumentation in industry is linked to research and development, and through it to in-house technology and to company performance. According to some studies, during the 1970s and 1980s instrumentation correlated positively with industrial performance and company survival for US plants in a range of industrial sectors (Hage et al. 1993). Firms that exhibited small concern for advanced technology tended to stumble or
close when compared with companies that actively sought new technologies. According to the Von Hippel hypothesis, an important fraction of innovative technology would take the form of in-house instrument-related innovation. The connection between academia, instrument innovation, and economic performance has been approached from a different perspective in a more narrowly researched and fascinatingly speculative study carried out by the economic historian Nathan Rosenberg (1994). Rosenberg suggests that many key instrument innovations having a great impact on industry spring from fundamental research conducted in universities. To illustrate this claim, he points to the university-based research on magnetic spin and resonance conducted by Felix Bloch at Stanford University in the late 1940s and 1950s. This basic physical research (theoretical and experimental) gave rise to nuclear magnetic resonance (NMR) and to NMR instrumentation; and in turn, NMR apparatus and related devices have spread outward in American industry, giving rise to new products and affecting industrial production operations. The best-known use of NMR instrumentation is in the area of medical imaging. Rosenberg suggests that the spillover effect of academic instrument research has been underestimated, and that the influence of academic research through instrumentation is characterized by a considerable multiplier effect. In the final analytical orientation indicated here, instrumentation is represented as a transverse epistemological, intellectual, technical, and social force that promotes a convergence of practices and knowledge among scientific and technological specialities. Instrumentation acts as a force that partly transcends the distinctions and differentiations tied to specific divisions of labor associated with particular fields of practice and learning. 
This transcending and transverse function is reserved to a particular category of instrumentation: ‘research-technology’ (Shinn 1993, 1997, 2000, Joerges and Shinn 2001). Research-technology is built around ‘generic’ devices that are open-ended general instruments. They result from basic instrumentation research and instrument theory. Examples include automatic control and registering devices, the ultracentrifuge, Fourier transform spectroscopy, the laser, and the microprocessor. Practitioners transfer their products into academia, industry, state technical services, metrology, and the military. By adapting their generic products to local uses, they participate in the development of narrow niche apparatus. The research-technologist operates in an ‘interstitial’ arena between established disciplines, professions, employers, and institutions. It is this in-between position that allows the research-technologist to design generalist non-specific generic devices, and then to circulate freely in and out of niches in order to diffuse them. Through this multi-audience diffusion, a form of practice-based universality arises, as a generic instrument provides a lingua franca and guarantees stable outcomes to multiple groups involved in a sweep of technical projects.
Throughout the 1980s and 1990s studies devoted to instrumentation, or studies in which instrumentation plays a leading role, grew appreciably. The impressive number of instrument-related articles appearing in many of the major journals of the social studies of science and technology testifies to the fact that the theme is now strong. Two general tendencies characterize today’s enquiries into instrumentation: first, a great diversity in the number of analytic currents and research schools which perceive instrumentation as basic to past and contemporary cognitive and organizational activities; and second, considerable variety in the spheres of activity that are putatively affected by instrumentation and in the mechanisms that allegedly underpin instrument influence.
See also: Experiment, in Science and Technology Studies; History of Technology; Innovation, Theory of; Technological Innovation; Technology, Anthropology of
Bibliography
Bachelard G 1933 Les intuitions atomistiques. Boivin, Paris
Bachelard G 1951 L’activité rationaliste de la physique contemporaine. Presses Universitaires de France, Paris
Blume S 1992 Insight and Industry: On the Dynamics of Technological Change in Medicine. MIT Press, Cambridge, MA
Edge D O, Mulkay M J 1976 Astronomy Transformed: The Emergence of Radio Astronomy in Britain. Wiley, New York
Galison P 1987 How Experiments End. University of Chicago Press, Chicago
Galison P 1997 Image and Logic: A Material Culture of Microphysics. University of Chicago Press, Chicago
Hacking I 1983 Representing and Intervening. Cambridge University Press, Cambridge, UK
Hacking I 1989 The divided circle: a history of instruments for astronomy, navigation and surveying. Studies in the History and Philosophy of Science 20: 265–370
Hage J, Collins P, Hull F, Teachman J 1993 The impact of knowledge on the survival of American manufacturing plants. Social Forces 72: 223–46
Heilbron J L, Seidel R W 1989 Lawrence and his Laboratory: A History of the Lawrence Berkeley Laboratory. University of California Press, Berkeley, CA
Joerges B, Shinn T 2001 Instrumentation Between Science, State and Industry. Kluwer, Dordrecht, The Netherlands
Krige J 1996 The ppbar project. In: Krige J (ed.) History of CERN. North-Holland, Amsterdam, Vol. 3
Krige J 2000 Crossing the interface from R&D to operational use: The case of the European meteorological satellite. Technology and Culture 41(1): 27–50
Latour B 1987 Science in Action: How to Follow Scientists and Engineers Through Society. Harvard University Press, Cambridge, MA
Latour B, Woolgar S 1979 Laboratory Life: The Social Construction of Scientific Facts. Sage, Beverly Hills, CA
Mukerji C 1992 Scientific techniques and learning: Laboratory ‘signatures’ and the practice of oceanography. In: Bud R, Cozzens S (eds.) Invisible Connections: Instruments, Institutions and Science. SPIE Optical Engineering Press, Bellingham, WA
Noble D F 1984 Forces of Production: A Social History of Industrial Automation, 1st edn. Knopf, New York
Pickering A 1984 Constructing Quarks: A Sociological History of Particle Physics. University of Chicago Press, Chicago
Pinch T J 1986 Confronting Nature: The Sociology of Solar-neutrino Detection. Reidel, Dordrecht, The Netherlands
Rosenberg N 1994 Exploring the Black Box: Technology, Economics, and History. Cambridge University Press, Cambridge, UK
Shimshoni D 1970 The mobile scientist in the American instrument industry. Minerva 8(1): 58–89
Shinn T 1993 The Bellevue grand electroaimant, 1900–1940: Birth of a research-technology community. Historical Studies in the Physical Sciences 24: 1, 157–87
Shinn T 1997 Crossing boundaries: The emergence of research technology communities. In: Leydesdorff L, Etzkowitz H (eds.) Universities and the Global Knowledge Economy: A Triple Helix of University–Industry–Government Relations. Cassell Academic Press, London
Shinn T 2000 Formes de division du travail scientifique et convergence intellectuelle (Forms of division of scientific labor and intellectual convergence). Revue Française de Sociologie 41: 447–73
Von Hippel E 1988 The Sources of Innovation. Oxford University Press, Oxford, UK
Williams M E 1994 The Precision Makers: A History of the Instrument Industry in Britain and France 1870–1939. Routledge, London
Zuboff S 1988 In the Age of the Smart Machine: The Future of Work and Power. Basic Books, New York
T. Shinn
Scientific Knowledge, Sociology of
1. Origins
The central concern of the sociology of knowledge is the idea that what people take to be certain is an accident of the society in which they are born and brought up. It is obvious that religious or political truths are largely affected by their social settings. If the truths are moral then such relativism is more troubling. But if the truths are scientific, then the idea of social determination is widely regarded as subversive, or self-defeating, since the science which purports to show that science is socially situated is itself socially situated. The sociology of knowledge, as formulated by Mannheim (1936), avoided these dangerous and dizzying consequences by putting scientific and mathematical knowledge in a special category; other kinds of knowledge had roots in society but the knowledge of the natural sciences was governed by nature or logic. Scientific method, then, when properly applied, would insulate scientists from social influences and their knowledge should be more true and more universal than other kinds of knowledge.
From the early 1970s, however, groups of sociologists, philosophers, historians, and other social scientists began programs of analysis and research which treated scientific knowledge as comparable with other kinds of knowledge. Their work broke down the barrier between ordinary knowledge and scientific knowledge. The post-1970s Sociology of Scientific Knowledge (SSK), which included major contributions from historians of science, shows us how to understand the impact of social influences on all knowledge.
We can use the field’s own approach to query its intellectual origins. A suitable model is a well-known paper by a historian, Paul Forman (1971), who argued that the rise of quantum theory owed much to the ‘Weltanschauung’ of Weimar Germany. In the same way it could be argued that the 1960s saw the growth of postwar prosperity in Europe which allowed the development of new markets with young people becoming powerful consumers of goods and producers of culture, much of it consisting of a rebellion against traditional forms of expression; it saw the development of the birth control pill, making possible a (temporary) breakdown of sexual mores; and it saw experiments with perception-affecting drugs, while protests against the Vietnam war threatened old lines of institutional authority. The new order had its counterpart in academe. Antipsychiatry questioned the barriers between the sane and the insane, while in Europe at least, romanticized versions of Marxism dominated debate. SSK’s questioning of the traditional authority of science can plausibly be seen as a product of this social ferment.
The emblematic book of the period as far as science was concerned was Thomas Kuhn’s (1962) The Structure of Scientific Revolutions. Kuhn’s book was influential in helping to reinforce the intellectual atmosphere in which radical ideas about science could flourish.
Kuhn’s notion of the incommensurability of paradigms, and paradigm revolution, provided a way of thinking about scientific knowledge as a product as much of culture as of nature. The book was not widely discussed until late in the 1960s, when the new social movements were at their height, and this adds force to the argument.
Philosophy of science in the early 1970s was also affected by this turn. The so-called ‘Popper–Kuhn’ debate (Lakatos and Musgrave 1970) pitted Karl Popper’s (e.g., 1959) notion that scientific theories could be falsified with near certainty, even if they could not be proved, against Thomas Kuhn’s (1961, 1962) claim that what was taken to be true or false varied in response to sudden revolutions in scientific world view. The ‘Duhem–Quine thesis’ (Losee 1980) showed that scientific theories were supported by networks of observations of ideas such that no one observation could threaten any other part of the network directly. Imre Lakatos’s ([1963]/1976) brilliant analysis of the history of Euler’s theorem showed that falsification of a theory was only one choice when faced with apparently recalcitrant observations. Lakatos’s work was particularly attractive to the social analyst of science because it dealt with the detailed history of a real case rather than arising from general principles. Lakatos was showing how mathematicians argued (or could have argued in principle). This was the world of philosophical activity into which the sociology of scientific knowledge was born.
Like any other group of knowers, sociologists of scientific knowledge prefer models of the genesis of their own ideas which reflect their self-image as ‘rational creatures’ or, in Mannheim’s phrase, ‘free floating intellectuals.’ In these models Kuhn plays a much smaller role. There are two intellectual stories. One turns on Ludwik Fleck’s ([1935]/1979) Genesis and Development of a Scientific Fact, which Kuhn cites in the preface of his 1962 book, and which already contained many of the key ideas found therein. Furthermore, Fleck’s book anticipated the developments in sociology of scientific knowledge which came in the 1970s in that, unlike The Structure of Scientific Revolutions, it included a case study of a contemporaneous scientific episode, Fleck’s own research on the Wassermann reaction for the diagnosis of syphilis. Indeed, Fleck’s book remains unique in sociology of science in that his social analysis was of a piece of scientific research which he himself was pioneering. Thus it is the most perfect piece of participant observation yet done in the social study of scientific knowledge. Fleck’s book, however, was not widely known until many years after Kuhn’s was published, and not until long after the sociology of scientific knowledge had established its own way of doing things.
Fleck, then, was the first to set out some of the crucial ideas for SSK, but he had little influence on the new movement except through his influence on Kuhn, while Kuhn helped open up the social and intellectual space for what came after rather than providing intellectual foundations or methodological principles; in other words, Kuhn provided the intellectual space but not, to use his own term, the paradigm.
The origin of SSK’s intellectual and methodological paradigm has to be understood by looking at more direct influences. In the 1970s a group of philosophers, concerned with defending science against what they perceived as its critics, put forward the argument that sociological analysis could be applied only to knowledge that was in error, whereas true knowledge remained insulated from social forces. The intellectual source for the sociologists who inspired these initial attacks and fought against them was yet another philosopher, Ludwig Wittgenstein, whose work was central to the anthropological debate about whether there were universal standards of rationality (Wilson 1970). Readings of Wittgenstein’s later philosophy (1953, 1956, Winch 1958) were the crucial theoretical resource for the work in sociology of scientific knowledge that had its origins in the mid-1970s.
The ‘Edinburgh School’ and its ‘strong program’ argued on philosophical grounds that ‘symmetrical analysis’ (analysis that treated true knowledge and false knowledge equally) was possible. Historical studies conducted in this framework showed how social influence affected scientific conclusions, in science in general, in early studies of the brain, in the development of statistics, and in high-energy physics (Bloor 1973, 1983, Barnes 1974, Shapin 1979, MacKenzie 1981, Pickering 1984). Independently, a group at the University of Bath applied Wittgensteinian ideas to contemporary episodes of science, particularly controversies in physics and the fringe sciences, and developed the methodology of interviewing at networks of competing scientific centers. Its program became known as the ‘Empirical Programme of Relativism’ (EPOR; Collins 1975, 1985, Pinch 1977, 1986, Travis 1980, 1981). These approaches were to become quite widely adopted in the early years of SSK.
As the 1970s turned into the 1980s other groups, more strongly influenced by anthropological practice, entered and influenced the field. The Edinburgh–Bath approach was influenced by anthropology in that much of the discussion of the meaning of Wittgensteinian philosophy for the social sciences turned on questions to do with the knowledge of isolated ‘tribes’ and its relationship to Western knowledge. The newer contributors, however, took their methodological approach as well as their philosophy from anthropology. Their work founded the tradition of ‘laboratory studies.’ In the Strong Program the source of material tended to be historical archives; in EPOR the method was interviews with members of the ‘core-set’ of scientific laboratories contributing to a scientific controversy; in laboratory studies the method was a long sojourn in, or deep study of, a single laboratory (Latour and Woolgar 1979, Knorr Cetina 1981).
The setting for the laboratory studies tended to be the life sciences rather than the hard sciences or fringe sciences.
Another important input was the interpretive tradition in sociology associated with phenomenology and ethnomethodology. Indeed the term ‘social construction’ is a widely used description of the approach: ‘social construction of scientific knowledge’ was taken from the title of a well-known book by Peter Berger and Thomas Luckmann (1967), though Berger’s Invitation to Sociology (1963) probably had more direct impact on the field. Ethnomethodology strongly influenced all those who practiced SSK and especially the detailed analyses of scientific practice such as those by Lynch (1985 [but written much earlier than its publication date]), Woolgar (1976), and especially the move of some authors into what became known as ‘discourse analysis’ (Knorr Cetina and Mulkay 1983).
As the 1980s turned into the 1990s the concentration on language that began with the move to discourse analysis helped SSK to come to seem a part of the much larger movement known as ‘postmodernism.’ The influence of French philosophers of literature and culture such as Derrida became marked. Since then ‘cultural studies of science,’ which share with SSK what could broadly be called a ‘social constructivist’ approach to science, have attracted large numbers of followers in the humanities as well as the social sciences. The respective methods and, more especially, the attitudes to methodology, make it possible to separate cultural studies from SSK. Crucially, SSK practitioners still take natural science as the touchstone where matters of method are concerned, though the model of science is very far from the narrow statistically-based notion of science that informed the ‘scientific sociology’ of the 1950s and early 1960s and continues to dominate much mainstream sociological practice in the USA. SSK stresses the more general values of careful empirical observation and repeatability. Cultural studies, on the other hand, takes philosophical literary criticism, or semiotics, as the model to be followed.
2. Concepts
Among sociology’s traditional topics was the study of the social factors which affected scientists’ choice of research topic. The sociology of scientific knowledge, however, took it that the outcomes of research projects were also affected by their social setting. SSK showed that theoretical and experimental procedures did not determine, or fully determine, the conclusions of scientific research. The philosophical ideas described above provided an important starting point, but sociology developed its own concepts set in the working world of the scientist rather than the abstract world of the philosopher.
One strand of research showed how difficult it was to transfer the skills of physics experimentation between settings, that it was even harder to know that such skills had been transferred, and that it was harder still to know when an experiment had been satisfactorily completed. This meant we did not know whether a negative experiment should be taken to contradict a positive one. This argument became known as ‘the experimenter’s regress’ (Collins 1975, 1985), and revealed the scope for ‘interpretive flexibility’ regarding the outcomes of passages of experimentation and theorization. It also means that even while SSK stresses the importance of empirical research and repeatability, it is aware that, by themselves, such procedures cannot turn disputed knowledge into certainty in either the natural or social sciences. Even if experimental science produced widely accepted data, as Pinch (1981) showed, their ‘evidential significance’ could vary and the same finding could count as revolutionary or insignificant.
Sociological studies also described the process between the disordered activities of the laboratory and the orderly ‘findings’ (Latour and Woolgar 1979, Knorr Cetina 1981) or ‘discoveries’ (Brannigan 1981) reported, or constructed, in the published literature. Latour and Woolgar demonstrated the way that ‘modalities’ (phrases that qualified a finding or referred to the particular time and place of its generation) were successively stripped from publications as the finding became established. What they called ‘inversion’ was the way the stripping of the modalities, and like processes, comprised the establishment of the finding, a reversal of the accepted direction of the causal arrow. It was also shown that, with some exceptions, ‘distance lends enchantment,’ that is, certainty about scientific outcomes tends to increase as the interpreter’s distance from the laboratory increases (Collins 1985, MacKenzie 1998). This explained why commentators’ accounts often seemed far more confident than the reports of the scientists themselves, and explained much about the way scientific knowledge was diffused beyond the specialists.
Mechanisms were described by which potentially open-ended disputes were brought to a close around a particular interpretation, with EPOR concentrating on fringe sciences, the ‘French School’ concentrating on the interplay of actors within networks (Latour 1987), and the ‘Edinburgh School’ concentrating on large-scale political influences on the content of ideas (Shapin 1979, MacKenzie 1981, Shapin and Schaffer 1985). More recently, work on closure has broadened to include studies in the social construction of technology, the public understanding of science, and law and science.
3. Significance
As the recent so-called ‘science wars’ have revealed, outsiders consider that the importance of SSK and related analyses of science lies in the impact they have on the perception of the meaning of science in the wider world. SSK is widely seen as a radical relativist attack on science. But both the philosophical and the political radicalism of the social analysis of science vary from program to program and study to study. For example, ‘epistemological relativism’ implies that one social group’s way of justifying its knowledge is as good as another’s and that there is no external vantage point from which to judge between them; all that can be known can be known only from the point of view of one social group or another. Ontological relativism is the view that, in social groups such as those described above, reality itself is different. A combination of epistemological and/or ontological relativism can be referred to as ‘philosophical relativism.’ This attitude is nicely captured by McHugh in the following quotation: ‘We must accept that there are no adequate grounds for establishing criteria of truth except the grounds that are employed to grant or concede it—truth is conceivable only as a socially organized upshot of contingent courses of linguistic, conceptual and social behaviour’ (1971, p. 329). Philosophical relativism is, then, a philosophically radical viewpoint.
A still more philosophically radical position is the ‘actor (or actant) network theory’ (ANT) most closely associated with Michel Callon and Bruno Latour. The approach was first signaled by Latour and Woolgar when, in 1986, they changed the subtitle of Laboratory Life (Latour and Woolgar 1979) from The Social Construction of Scientific Facts to The Construction of Scientific Facts so as to signify that the ‘social’ in their view no longer deserved special mention in the shaping of scientific knowledge. Subsequently, as Callon and Latour developed their ideas, scientific facts came to be seen as emerging from the interplay of ‘actants’ in the ‘text’ of life, suggesting that ANT should most properly be included under cultural studies of science rather than SSK. Within Callon and Latour’s ‘text,’ terms such as ‘social’ and ‘natural’ are themselves outcomes of the interplay of actants rather than original causes. Therefore, to draw attention to the social as a special factor is to begin the discussion at too low a level of generality. Under this approach ‘constructivism’ has ceased to be social and among the actants nonhumans are given as much reality-forming agency as humans (Callon 1986). The philosophical radicalness of ANT is clear in its refusal to accept even the notion of human and nonhuman as primary.
Methodological relativism, by contrast, says nothing direct about reality or the justification of knowledge. Methodological relativism is an attitude of mind recommended by some of those who practice SSK; it says that the sociologist or historian of science should act as though the beliefs about reality of any competing groups being investigated are not caused by the reality itself.
The intention is to limit analysis to the kind of causes of scientific beliefs that are located in the domain of the social. Methodological relativism is meant to push exploration of the social causes of belief to the limit without having it cut off by the argument that a belief was widely accepted because it was rational. Methodological relativism is, then, not at all radical as a philosophy; it is a (social) scientific method.
Relativism may be called time-bounded if it claims only that scientific procedures are often inconclusive in the short term whatever the deeper significance of the consensuses that are attained in the long term. This kind of relativism, which has no philosophical potency, is all that is needed to support much current social analysis of contemporary science, especially controversies that impact on the public domain (e.g., Collins and Pinch 1993, Richards 1991, Epstein 1996).
The relationship between philosophical radicalism and political radicalism is, however, perverse. Time-bounded relativism has consequences for contemporary scientific controversies such as engage environmentalists and the like: it has this power because it shows that, contrary to the way these things are treated in textbooks and the like, the ‘short-term’ period when scientific disputes are being resolved is often many decades long. The empirical and analytic tools of SSK explain why this is likely to be so and it means that we should not expect speedy science-based resolutions to our current technological dilemmas. From this it can be argued that the decision-making power once monopolized by scientific experts should be more widely shared (Wynne 1987). Methodological relativism, which is philosophically quietist, tends to be disturbing to the scientific community as it blatantly ignores scientific consensus however well established. To go to the other extreme, ANT, the most philosophically radical position of all, by removing humans from the central position they hold in social constructivism, recreates a relationship between scientists and their nonhuman objects of study which is similar to that which held before the revolutions of the 1960s and 1970s (Collins and Yearley 1992, Callon and Latour 1992, Fuller 2000).
What about philosophical relativism? Scientists and philosophers seem to believe that philosophical relativism can be used to justify nonscientific beliefs such as astrology or creationism, though its proponents deny it fervently, and at least one critic (Fuller 2000) has argued that it is politically quietist. SSK, of course, argues that as knowledge moves from its seat of creation to a wider audience, it tends to be stripped down and simplified, so it is not surprising that the subtleties of the various relativist positions have escaped the more outspoken critics of the social analysis of scientific knowledge.
Bibliography
Barnes B S 1974 Scientific Knowledge and Sociological Theory. Routledge and Kegan Paul, London
Barnes B S, Bloor D, Henry J 1996 Scientific Knowledge: A Sociological Analysis. Athlone Press, London
Berger P L 1963 Invitation to Sociology. Anchor Books, Garden City, NY
Berger P L, Luckmann T 1967 The Social Construction of Reality. Allen Lane, London
Bloor D 1973 Wittgenstein and Mannheim on the sociology of mathematics. Studies in the History and Philosophy of Science 4: 173–91
Bloor D 1983 Wittgenstein: A Social Theory of Knowledge. Macmillan, London
Brannigan G 1981 The Social Basis of Scientific Discoveries. Cambridge University Press, New York
Callon M 1986 Some elements of a sociology of translation: Domestication of the scallops and the fishermen of St Brieuc Bay. In: Law J (ed.) Power, Action and Belief: A New Sociology of Knowledge? Routledge & Kegan Paul, London, pp. 196–233
Callon M, Latour B 1992 Don’t throw the baby out with the bath school! In: Pickering A (ed.) Science as Practice and Culture. University of Chicago Press, Chicago, pp. 343–68
Collins H M 1975 The seven sexes: A study in the sociology of a phenomenon, or the replication of experiments in physics. Sociology 9(2): 205–24
Collins H M 1985 Changing Order: Replication and Induction in Scientific Practice. Sage, Beverly Hills & London [2nd edn., University of Chicago Press, 1992]
Collins H M, Pinch T J 1993 The Golem: What Everyone Should Know About Science. Cambridge University Press, Cambridge & New York [subsequent editions, 1994, 1998]
Collins H M, Yearley S 1992 Epistemological chicken. In: Pickering A (ed.) Science as Practice and Culture. University of Chicago Press, Chicago, pp. 301–26
Epstein S 1996 Impure Science: AIDS, Activism and the Politics of Knowledge. University of California Press, Berkeley, Los Angeles & London
Feyerabend P K 1975 Against Method. New Left Books, London
Fleck L 1979 Genesis and Development of a Scientific Fact. University of Chicago Press, Chicago [first published in German in 1935]
Forman P 1971 Weimar culture, causality and quantum theory, 1918–1927: Adaptation by German physicists and mathematicians to a hostile intellectual environment. In: McCormmach R (ed.) Historical Studies in the Physical Sciences, No 3. University of Pennsylvania Press, Philadelphia, PA, pp. 1–115
Fuller S 2000 Thomas Kuhn: A Philosophical History for our Times. University of Chicago Press, Chicago
Knorr Cetina K D 1981 The Manufacture of Knowledge. Pergamon, Oxford, UK
Knorr Cetina K D, Mulkay M (eds.) 1983 Science Observed: Perspectives on the Social Study of Science. Sage, London & Beverly Hills
Kuhn T S 1961 The function of measurement in modern physical science. ISIS 52: 162–76
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Lakatos I 1976 Proofs and Refutations. Cambridge University Press, Cambridge [originally published in British Journal for the Philosophy of Science 1963 XIV: 1–25, 120–39, 221–45, 296–342]
Lakatos I, Musgrave A (eds.) 1970 Criticism and the Growth of Knowledge. Cambridge University Press, Cambridge, UK
Latour B 1987 Science in Action. Open University Press, Milton Keynes, UK
Latour B, Woolgar S 1979 Laboratory Life: The Social Construction of Scientific Facts. Sage, London & Beverly Hills [2nd edn., 1986]
Losee J 1980 A Historical Introduction to the Philosophy of Science. Oxford University Press, Oxford, UK
Lynch M 1985 Art and Artifact in Laboratory Science: A Study of Shop Work and Shop Talk in a Research Laboratory. Routledge and Kegan Paul, London
McHugh P 1971 On the failure of positivism. In: Douglas J D (ed.) Understanding Everyday Life. Routledge and Kegan Paul, London, pp. 320–35
MacKenzie D 1981 Statistics in Britain 1865–1930. Edinburgh University Press, Edinburgh, UK
MacKenzie D 1998 The certainty trough. In: Williams R, Faulkner W, Fleck J (eds.) Exploring Expertise: Issues and Perspectives. Macmillan, Basingstoke, UK
Mannheim K 1936 Ideology and Utopia: An Introduction to the Sociology of Knowledge. University of Chicago Press, Chicago
Pickering A 1984 Constructing Quarks: A Sociological History of Particle Physics. Edinburgh University Press, Edinburgh, UK
Pinch T J 1977 What does a proof do if it does not prove? In: Mendelsohn E, Weingart P, Whitley R (eds.) The Social Production of Scientific Knowledge. Reidel, Dordrecht, The Netherlands
Pinch T J 1981 The sun-set: The presentation of certainty in scientific life. Social Studies of Science 11(1): 131–58
Pinch T J 1986 Confronting Nature: The Sociology of Solar-neutrino Detection. Reidel, Dordrecht, The Netherlands
Popper K R 1959 The Logic of Scientific Discovery. Harper & Row, New York
Richards E 1991 Vitamin C and Cancer: Medicine or Politics. Macmillan, Basingstoke, UK
Shapin S 1979 The politics of observation: Cerebral anatomy and social interests in the Edinburgh Phrenology Disputes. In: Wallis R (ed.) On the Margins of Science: The Social Construction of Rejected Knowledge, Sociological Review Monograph, 27. Keele University Press, Keele, UK, pp. 139–78
Shapin S, Schaffer S 1985 Leviathan and the Air Pump: Hobbes, Boyle and the Experimental Life. Princeton University Press, Princeton, NJ
Travis G D L 1980 On the importance of being earnest. In: Knorr K, Krohn R, Whitley R (eds.) The Social Process of Scientific Investigation: Sociology of the Sciences Yearbook, 4. Reidel, Dordrecht, The Netherlands, pp. 165–93
Travis G D L 1981 Replicating replication? Aspects of the social construction of learning in planarian worms. Social Studies of Science 11: 11–32
Wilson B (ed.) 1970 Rationality. Blackwell, Oxford, UK
Winch P G 1958 The Idea of a Social Science. Routledge and Kegan Paul, London
Woolgar S 1976 Writing an intellectual history of scientific developments: The use of discovery accounts. Social Studies of Science 6: 395–42
Wittgenstein L 1953 Philosophical Investigations. Blackwell, Oxford, UK
Wittgenstein L 1956 Remarks on the Foundations of Mathematics. Blackwell, Oxford, UK
Wynne B 1987 Risk Management and Hazardous Wastes: Implementation and the Dialectics of Credibility. Springer, Berlin
H. M. Collins
Scientific Reasoning and Discovery, Cognitive Psychology of
The cognitive psychology of scientific reasoning and discovery is the study of the cognitive processes that scientists use in all aspects of science. Researchers have used interviews and historical records, cognitive experiments on components of scientific thinking, computational models based on particular scientific discoveries, and investigations of scientists as they reason live, or ‘in vivo,’ in an effort to uncover the thinking and reasoning strategies that are important in science. In this article, six different approaches to scientific reasoning are discussed. One important point to note is that scientific thinking builds upon many different cognitive components, such as induction, deduction, analogy, problem solving, priming, and categorization, that are objects of study in their own right. Research specifically concerned with scientific thinking tends either to draw its content from an established domain of science (such as physics, biology, or chemistry) or to examine how different cognitive processes, such as concept formation and deduction, are used together in areas like experiment design. Useful books on the nature of scientific thinking are Tweney et al. (1981), Giere (1992), and Klahr et al. (2000).
1. Interviews and the Historical Record
Two frequently used and related approaches to investigating scientific thinking have been interviews with scientists and the analysis of historical records and documents such as notebooks. One of the earliest accounts of scientific thinking and reasoning was the interview of Albert Einstein conducted by the Gestalt psychologist Max Wertheimer (1959). Wertheimer argued that a key strategy used by Einstein was to search for invariants, and he saw the velocity of light as a key invariant around which Einstein built his theory. Wertheimer incorporated his analysis into a Gestalt theory of thought. More recently, researchers have conducted interviews in the context of principles from cognitive science. For example, Paul Thagard (1999) has conducted many interviews with the scientists who proposed that ulcers are caused by bacteria. Thagard has pointed to the important roles of serendipity, observation, and analogy in this discovery. A related line of inquiry is the use of historical documents. Using scientists’ lab books and biographical and autobiographical materials, researchers attempt to piece together the reasoning strategies that the scientists used in making a discovery. For example, Nersessian (1992) has conducted extensive analyses of the physicist Faraday’s notebooks and has argued that the key to understanding his discoveries lies in his use of mental models. By mapping out the types of mental models that Faraday used and showing how these models shaped the discoveries that he made, Nersessian offered a detailed account of the mental processes that led to a particular discovery.
Another cognitive approach using the historical record is to take a real scientific discovery, such as Monod and Jacob’s Nobel Prize-winning discovery of a mechanism of genetic control, give people the same problem in a simulated scientific laboratory, and determine whether they use the same discovery strategies that the scientists used, such as focusing on unexpected findings (Dunbar 1993).
2. Scientific Reasoning as Problem Solving and Concept Formation
Two common approaches to scientific thinking have been to see it as a way of discovering new concepts or as a form of problem solving. Beginning with Bruner et al.’s (1956) classic experiments, in which college students were asked to induce the rule that determines whether an item is, or is not, a member of a category, researchers have attempted to discover the types of inductive reasoning strategies used to acquire new concepts. Bruner et al. argued that much of science consists of inducing new concepts from data and that the memory loads that different strategies require will make certain types of inductive reasoning strategies more common than others. More recently, Holland et al. (1986) provided an account of the different inductive learning procedures that could be used to acquire new concepts in science. Herbert Simon (1977) argued that concept formation is a form of problem solving; thus the two approaches can be seen as complementary (see Klahr et al. 2000). Simon argued that scientific thinking consists of a search in a problem space, with the two main spaces being a hypothesis space and an experiment space. The hypothesis and experiment spaces comprise all possible hypotheses and experiments, together with the operators that can be used to get from one part of a space to the next, such as grouping common elements in sets of results to form a new hypothesis (the grouping operator). Simon has taken specific scientific discoveries and mapped out the types of heuristics (or strategies), such as heuristics for designing experiments, that a scientist used in searching the experiment space. Using the notion of searching in a problem space, other researchers have analyzed the types of search heuristics that are used in all aspects of scientific thinking and have conducted experiments on the problem-solving heuristics that people use in designing experiments, formulating hypotheses, and interpreting results (Dunbar 1993, Klahr et al. 2000).
These approaches specify the types of knowledge that an individual must possess and the heuristics that are used to formulate hypotheses, design experiments, and interpret data.
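The idea of searching a hypothesis space and an experiment space together can be illustrated with a small Python sketch. It is a toy under stated assumptions, not Simon's or Klahr et al.'s model: the three candidate hypotheses, the use of integers as 'experiments,' the hidden rule, and the function name dual_space_search are all hypothetical. The loop skips experiments on which all surviving hypotheses agree (a simple experiment-design heuristic) and discards hypotheses whose prediction fails (a step in the hypothesis space).

```python
def dual_space_search(hypotheses, experiments, run_experiment):
    """Alternate between an experiment space and a hypothesis space.

    hypotheses: dict mapping a name to a function that predicts an outcome.
    experiments: iterable of candidate experiments (hypothetical toy items).
    run_experiment: function returning the observed outcome of an experiment.
    """
    surviving = dict(hypotheses)
    log = []
    for experiment in experiments:
        if len(surviving) <= 1:
            break  # nothing left to discriminate between
        predictions = {name: rule(experiment) for name, rule in surviving.items()}
        if len(set(predictions.values())) < 2:
            continue  # every hypothesis agrees, so this experiment is uninformative
        outcome = run_experiment(experiment)
        # Keep only the hypotheses whose prediction matched the observation.
        surviving = {name: rule for name, rule in surviving.items()
                     if predictions[name] == outcome}
        log.append((experiment, outcome, sorted(surviving)))
    return sorted(surviving), log


# Hypothetical toy problem: which property decides membership in a category?
hypotheses = {
    "even": lambda n: n % 2 == 0,
    "divisible_by_3": lambda n: n % 3 == 0,
    "greater_than_5": lambda n: n > 5,
}


def hidden_rule(n):
    """Stands in for 'nature'; unknown to the searcher."""
    return n % 3 == 0


survivors, log = dual_space_search(hypotheses, range(1, 11), hidden_rule)
print(survivors)
print(log)
```

In this toy run the first experiment is skipped because no hypothesis distinguishes it, and two informative experiments suffice to leave a single hypothesis standing. Real scientific search is far less tidy, since neither space can be enumerated in advance.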
3. Errors in Scientific Thinking
One of the most frequently investigated aspects of scientific thinking and reasoning has been the finding that both scientists and participants in psychology experiments attempt to confirm their hypothesis when they conduct an experiment, a tendency sometimes called ‘confirmation bias’ (see Tweney et al. 1981). Following the writings of the philosopher Karl Popper, many psychologists have assumed that attempting to confirm a hypothesis is a faulty reasoning strategy. Numerous studies have revealed that, when given a hypothesis to test, people will design experiments that will confirm their hypothesis and will not conduct experiments that could falsify it. This is a pervasive phenomenon that is difficult to eradicate; even when given instructions to falsify hypotheses, people find it difficult to do. Thus, many researchers have concluded that both participants in psychology experiments and scientists at large make this reasoning error. However, Klayman and Ha (1987) argued that conducting experiments that confirm a hypothesis is not necessarily a scientific reasoning error. They argued that if the prior probability of confirming one’s hypothesis is low, then even if the scientist is attempting to confirm the hypothesis, it can still be disconfirmed. Another interpretation of confirmation bias is that early in developing a theory or a hypothesis people will attempt to confirm it; however, once the hypothesis is fleshed out and confirmed, people will attempt to conduct disconfirming experiments (see Tweney et al. 1981).
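Klayman and Ha's point, that tests chosen to confirm a hypothesis can still disconfirm it, can be made concrete with a small enumeration. The Python sketch below is an illustration under stated assumptions rather than their analysis: the number-triple setting, the three rules, and the function name positive_test_outcomes are hypothetical choices made for this example. It counts how many 'positive tests' (cases the hypothesis predicts will fit) actually fit the true rule.

```python
from itertools import product


def positive_test_outcomes(hypothesis, true_rule, universe):
    """Return (confirmed, disconfirmed) counts over all positive tests.

    A positive test is a case the tester expects to fit, i.e. one that
    satisfies the tester's own hypothesis.
    """
    positive_tests = [case for case in universe if hypothesis(case)]
    confirmed = sum(1 for case in positive_tests if true_rule(case))
    return confirmed, len(positive_tests) - confirmed


# Hypothetical universe: all triples of integers from 1 to 10.
universe = list(product(range(1, 11), repeat=3))


def steps_of_two(triple):
    """The tester's hypothesis: each number is 2 more than the previous one."""
    a, b, c = triple
    return b - a == 2 and c - b == 2


def any_ascending(triple):
    """A true rule broader than the hypothesis (it fits many triples)."""
    a, b, c = triple
    return a < b < c


def even_steps_of_two(triple):
    """A true rule narrower than the hypothesis (it fits few triples)."""
    return steps_of_two(triple) and triple[0] % 2 == 0


broad = positive_test_outcomes(steps_of_two, any_ascending, universe)
narrow = positive_test_outcomes(steps_of_two, even_steps_of_two, universe)
print(broad)   # (6, 0): positive tests can never expose the error
print(narrow)  # (3, 3): half of the positive tests disconfirm the hypothesis
```

When the true rule is broad, every positive test comes back as a confirmation and the mistaken hypothesis survives. When the true rule is rare, the same confirmation-seeking strategy produces disconfirmations, which is the situation Klayman and Ha describe.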
4. Science ‘In Vivo’: How Scientists Think in Naturalistic Contexts
One important issue in scientific reasoning and discovery is that most accounts have relied on indirect evidence such as lab notebooks, biographies, and interviews with scientists to determine the thinking and reasoning strategies that scientists use. Another approach is to conduct experiments on isolated aspects of scientific thinking. See Dunbar (1995) for an analysis of these standard approaches, which he has termed ‘in vitro.’ Both approaches, while very informative, do not look at scientists directly. Thus, a complementary approach has been to investigate real scientists’ thinking and reasoning strategies while they are conducting real research. Using this ‘in vivo’ approach, Dunbar (1999) has identified the specific ways that scientists use analogies, deal with unexpected findings, and use collaborative reasoning strategies in their research. He found that scientists use analogies to similar entities (‘local analogies’) when fixing experimental problems, analogies to entities from the same class of items (‘regional analogies’) when formulating new hypotheses, and analogies to very dissimilar domains (‘long-distance analogies’) when explaining scientific issues to others. Furthermore, Dunbar found that over half the findings that scientists obtain are unexpected, and that scientists have specific strategies for dealing with these unexpected findings. First, scientists provide methodological explanations using local analogies that suggest ways of changing their experiments to obtain the desired result. If the changes to the experiments do not provide the desired results, then the scientists switch from blaming the method to formulating hypotheses; this involves the use of regional analogies, as well as collaborative reasoning in which groups of scientists build models and theories together. Dunbar has further brought these ‘in vivo’ findings back into the cognitive laboratory to conduct controlled experiments, which, taken together, have been used to build new accounts of the ways that analogy, collaborative reasoning, and causal reasoning are used in scientific thinking (Dunbar 1999).
5. The Development of Scientific Thinking Skills
Beginning with the work of Piaget, many researchers have noted that children are similar to scientists. This ‘child-as-scientist’ metaphor has two main strands. First, children’s acquisition of new concepts and theories is said to be similar to the large conceptual changes that occur in scientific fields. Researchers investigating this view have pointed to parallels between changes in children’s concepts, such as their concepts of heat and temperature, and changes in the concepts of heat and temperature in the history of physics (see Chi 1992, Carey 1992). The second strand of the child-as-scientist metaphor is that children reason in identical ways to scientists, ranging from deduction and induction to experimental design. Some researchers have argued that there is little difference between a scientist and a three-year-old; while scientists clearly have considerably more knowledge of specific domains than children, their underlying competencies are viewed as the same. Other researchers have taken this ‘child-as-scientist’ view even further and argued that infants are basically scientists. Yet other researchers have argued that there are fundamental differences between children and scientists and that scientific thinking skills follow a developmental progression (see Klahr et al. 2000 for an overview of this debate).
6. Cognitively Driven Computational Discovery: Twenty-first Century Scientific Discovery
One development that has occurred in many sciences is the placing of vast amounts of information in computer databases. In the year 2000 a draft of the entire human genome was completed, and it now exists in databases. Similar developments have occurred in physics, where maps of the universe have been compiled and placed in databases. In the case of the human genome, the data consist of long sequences of nucleotides, each of which is represented by a letter, such as A for adenine. Databases consist of strings of letters such as ATGTC, with each letter representing a particular nucleotide. These strings or sequences can extend for hundreds of millions of nucleotides without interruption. Buried in these sequences are genes, and most of the genes are of unknown function. One goal of researchers is to search the database, find genes, and determine the function of the genes and how the genes interact with each other. This new wave of scientific investigation incorporates many of the principles of analogical reasoning, inductive reasoning, and problem-solving strategies discussed here, as well as many of the algorithms (neural nets, Markov models, and production systems) developed by cognitive psychologists in the 1980s and 1990s, in order to discover the functions of genetic sequences and the properties of certain types of matter in the universe. One interesting aspect of this development has been that, rather than computer programs being expected to make an entire discovery from beginning to end, they are tools that scientists can use to help make a discovery. The cognitive psychology of scientific reasoning has moved from being a description of the scientific mind to being an active participant in scientific practice.
See also: Discovery Learning, Cognitive Psychology of; History of Science; Informal Reasoning, Psychology of; Piaget’s Theory of Child Development; Problem Solving and Reasoning: Case-based; Problem Solving and Reasoning, Psychology of; Problem Solving: Deduction, Induction, and Analogical Reasoning; Reasoning with Mental Models; Scientific Concepts: Development in Children
Bibliography
Bruner J S, Goodnow J J, Austin G A 1956 A Study of Thinking. Wiley, New York
Carey S 1992 The origin and evolution of everyday concepts. In: Giere R N (ed.) Minnesota Studies in the Philosophy of Science. Vol. XV: Cognitive Models of Science. University of Minnesota Press, Minneapolis, MN, pp. 89–128
Chi M 1992 Conceptual change within and across ontological categories: Examples from learning and discovery in science. In: Giere R N (ed.) Minnesota Studies in the Philosophy of Science. Vol. XV: Cognitive Models of Science. University of Minnesota Press, Minneapolis, MN, pp. 129–86
Dunbar K 1993 Concept discovery in a scientific domain. Cognitive Science 17: 397–434
Dunbar K 1995 How scientists really reason: Scientific reasoning in real-world laboratories. In: Sternberg R J, Davidson J E (eds.) Mechanisms of Insight. MIT Press, Cambridge, MA, pp. 365–95
Dunbar K 1999 The scientist in vivo: How scientists think and reason in the laboratory. In: Magnani L, Nersessian N, Thagard P (eds.) Model-based Reasoning in Scientific Discovery. Kluwer Academic/Plenum Publishers, New York, pp. 89–98
Giere R N (ed.) 1992 Minnesota Studies in the Philosophy of Science. Vol. XV: Cognitive Models of Science. University of Minnesota Press, Minneapolis, MN
Holland J H, Holyoak K J, Nisbett R E, Thagard P R 1986 Induction: Processes of Inference, Learning, and Discovery. MIT Press, Cambridge, MA
Klahr D, Dunbar K, Fay A, Penner D, Schunn C 2000 Exploring Science: The Cognition and Development of Discovery Processes. MIT Press, Cambridge, MA
Klayman J, Ha Y W 1987 Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review 94: 211–28
Nersessian N J 1992 How do scientists think? Capturing the dynamics of conceptual change in science. In: Giere R N (ed.) Minnesota Studies in the Philosophy of Science. Vol. XV: Cognitive Models of Science. University of Minnesota Press, Minneapolis, MN
Simon H A 1977 Models of Discovery: and Other Topics in the Methods of Science. D Reidel Publishing, Dordrecht, The Netherlands
Thagard P 1999 How Scientists Explain Disease. Princeton University Press, Princeton, NJ
Tweney R D, Doherty M E, Mynatt C R (eds.) 1981 On Scientific Thinking. Columbia University Press, New York
Wertheimer M 1959 Productive Thinking. Harper and Row, New York
K. Dunbar
Scientific Revolution: History and Sociology of
1. The Term ‘Scientific Revolution’
The term ‘Scientific Revolution’ was introduced to denote what has been considered since the nineteenth century one of the most important discontinuities in the history of European science (H F Cohen 1994, I B Cohen 1985, Lindberg and Westman 1990). It covers roughly the period between Copernicus and Newton, i.e., between 1500 and 1700, and led from an Aristotelian natural philosophy (see Aristotle (384–322 BC)) that derived its dogmatic authority from the Church to the establishment of the classical sciences and their institutions (see Scientific Disciplines, History of; Scientific Academies, History of). This period can be characterized as one in which a new social group, the ‘engineer-scientists’ (consisting of engineers, scientists, inventors, artists, and explorers), emerged and became institutionalized. This group confronted traditional natural philosophy with the challenges of practice and experience, but also engaged in self-contained explanations of natural phenomena, in the expectation, as Francis Bacon put it, that science is a means to master nature. Among the lasting achievements of the Scientific Revolution were the establishment of heliocentric astronomy and classical mechanics, as well as numerous contributions to optics, chemistry, physiology, and other areas of modern science (for an overview see Butterfield 1965, Dijksterhuis 1986, Hall 1954). Many of these achievements became, in fact, the basis for technological breakthroughs. However, these breakthroughs occurred, as a rule, much later than anticipated by the protagonists of the Scientific Revolution. Further, the intellectual breakthroughs responsible for the revolution’s lasting impact on the development of science were mainly attained towards its end and only a generation or more after their proclamation by the early pioneers (Damerow et al. 1992).
The Scientific Revolution has become a paradigmatic reference for all approaches to the history and philosophy of science (see History of Science). At least since the Enlightenment it has been conceived as the triumph of the scientific method over the irrationalism of religious beliefs (see Science and Religion). Opposition to this view, together with a growing professionalization of historical studies, has opened up the space for other accounts, including attempts, such as that by Pierre Duhem, to deny that the Scientific Revolution actually represented a radical break, claiming instead that it merely constituted an episode within a continuous accumulation of scientific knowledge since the Middle Ages (Duhem 1996). Furthermore, the traditional emphasis on the role of outstanding protagonists of the Scientific Revolution, such as Bacon, Galileo, and Descartes, and their individual ‘discoveries’ (see e.g., Koyré 1978) has, in recent scholarship, increasingly receded in favor of an analysis of contexts that traces scientific achievements back to cultural, social, and economic conditions (Osler 2000, Porter and Teich 1992, Shapin 1996). The Scientific Revolution, on the other hand, itself provided a model for historical explanations when Thomas Kuhn (see Kuhn, Thomas S (1922–96)) radically challenged traditional historiography with his notion of ‘scientific revolution’ conceived of as a general structure of knowledge development (Kuhn 1962).
2. Historical and Sociological Context
From a sociological perspective, the Scientific Revolution appears as part of a wider social process in which technical knowledge assumed a new role in the organization of European societies (see Renaissance; Enlightenment). This process took off in the late Middle Ages and was primarily rooted in the larger cities, which saw an ever more diversified and developed artisanal culture and a growing accumulation of merchant capital. The cities of early modern Europe thus offered favorable conditions for the rapid growth of technical knowledge and the reflection of this growth in political, philosophical, and religious thinking. Large-scale ventures involving technical expertise, such as projects of military architecture, water regulation, military adventures, or seafaring expeditions (see History of Technology), involved types of resources, social mobility, and an outlook on the world available only in urban centers such as Florence, Venice, Paris, and London, which in fact became, long after they had attained an outstanding economic role, also the nuclei of the Scientific Revolution. Historians such as Edgar Zilsel (Zilsel 2000) have considered the early modern engineering projects as a decisive condition for the practical orientation and empirical knowledge base that distinguish the science of this period from its medieval antecedents.
Beginning in the fifteenth century, ambitious practical ventures (such as the building of the cupola of Florence Cathedral, the search for a sea route to India, or the development of a new military technology) increasingly relied on expert knowledge comprising both logistic and technological competencies exceeding those of traditional practitioners and artisans. Such competencies could only be gained on the basis of broader reflection on the relevant practical and theoretical knowledge resources available. This reflection became the specialty of the new group of engineer-scientists such as Filippo Brunelleschi, Christopher Columbus, Leonardo da Vinci, Niccolo Tartaglia, and Galileo Galilei. While the states of the time (if not actually at war) competed with each other in the pursuit of practical ventures, for example, building projects and seafaring expeditions, the knowledge acquired in such ventures was nevertheless constantly spread among the engineer-scientists employed by them. In the social fabric of the early modern period, engineer-scientists occupied a place similar to that earlier conquered by Renaissance humanists, administrators, and artists (see Art History). These practically oriented intellectuals were, as a rule, highly mobile, offering their services to whatever patronage was available. At the same time, they constituted, as a social group, a collective memory, accumulating and transmitting the new knowledge long before appropriate institutions of learning emerged and in spite of the frequent political and military turnovers of the time. The characteristic features of the engineer-scientists and their work become understandable against the background of their uncertain social status and their dependence on the patronage of courts and city governments (see e.g., Biagioli 1993) with rapidly changing power structures.
Examples are their incessant engagement with projects for potential future patrons; their usually unrealistic promises regarding the practical benefits of their theories, inventions, or projects; the secretiveness with which they treated their discoveries; their frequent involvement in priority struggles; as well as the striving to ennoble their practical knowledge by the claim of creating ‘new sciences.’ The social and political ambitions of the engineer-scientists were reflected in their pursuit of a literary culture of technical knowledge, largely emulating the humanist culture of the courts, including their reference to ancient Greek and Roman canons. Their contribution to the literary culture was in turn welcomed by those interested in challenging the transcendent, religious legitimization of the feudal order as an argument for the possibility of an immanent explanation of both the natural and the social world. This affinity, together with the fact that an all-encompassing explanation of the world on the basis of Aristotelian philosophy had been adopted as the official doctrine of the Catholic Church, brought the engineer-scientists almost unavoidably into conflict with its power structures (Feldhay 1995). Other developments contributed to turning the growth of technological and scientific knowledge into a force driving profound changes in the established structures of European society. The invention of printing offered a revolutionary new means of dissemination that challenged the exclusiveness of a literary culture based on manuscripts. The new dissemination technique contributed to overcoming the traditional separation between various branches of practical knowledge, confined by a transmission process relying exclusively on participation and oral communication as well as restricted by guild regulations. It also bridged the social gulf between such practical knowledge and the theoretical knowledge transmitted via scholarly texts at universities, monasteries, and courts. As a result, knowledge resources of different provenance were integrated and became widely available. Together with the accumulation of new knowledge in the context of the great practical ventures of the time, this process formed part of a veritable explosion of knowledge, both in the sense of its expansion and of its spreading across traditional social barriers. In reaction to this knowledge explosion and its growing significance for the functioning of the advanced European societies, new institutions of learning such as the Accademia del Cimento (1657), the Royal Society (1660), and the Académie Royale des Sciences (1666) emerged in the middle of the seventeenth century (see Knowledge Societies; Scientific Academies, History of). Traditional institutions, such as those of the Church, on the other hand, had to accommodate to the new situation, as may be illustrated by the prominent involvement of Jesuits in the scientific culture of the time (Wallace 1991).
Parallel to this process of institutionalization, science had, towards the end of the period under consideration here, gradually emancipated itself from the expectation of immediate practical benefits and could increasingly be pursued for its own sake. In summary, as a result of the Scientific Revolution, not only the production and transmission of technological knowledge but also its representation by scientific theories became an essential factor in the social, economic, and cultural development of European societies. This is the context in which scientists such as Huygens, Leibniz, Hooke, Newton, and Wallis, traditionally identified with the completion of the Scientific Revolution by the creation of classical terrestrial and celestial mechanics, achieved their celebrated results.
3. Structures and Achievements
The cognitive, social, and material structures and achievements of the Scientific Revolution have been extensively studied and discussed amid much controversy. Recent scholarship in the context of a cultural history of science has emphasized the possibility of an integrative treatment of these various dimensions. From the point of view of the traditional history of ideas, the Scientific Revolution appears primarily as the renewal of a knowledge development going back to antiquity, as a renaissance of Greek science. In fact, the ancient tradition of mathematical science, and in particular the ‘Elements’ of Euclid, provided the protagonists of the Scientific Revolution with the canonical model for a mathematical theory, a model which they systematically applied to new areas such as ballistics (Tartaglia) and even ethics (Spinoza). But the ancient tradition also had in stock designs for a theory of nature not based on Aristotelian views, in particular Platonism and atomism, which they could hence exploit in their struggle against scholasticism. The ancient tradition finally offered a substantial corpus of knowledge in such domains as geometry, mathematical astronomy, and mechanics, serving as a point of departure for new scientific endeavors. The perception of their work as a renewal of antique science was typical of the self-image of the protagonists of the Scientific Revolution, who honored each other by such titles as that of a ‘new Archimedes.’ In short, the characterization of the Scientific Revolution as a renaissance of antique science by historians of ideas such as Koyré is in agreement with the claims of contemporary scientists to have accomplished a radical break with Aristotelian scholasticism (Koyré 1978). This characterization, however, is in apparent conflict with the results of studies inaugurated by scholars in the tradition of Duhem, pointing to a conceptual continuity between early modern science and its medieval predecessors (and even with contemporary scholasticism), in spite of the anti-Aristotelian polemics (Duhem 1996).
Such studies have provided evidence that the intellectual means available to the engineer-scientists engaged in creating new sciences were essentially still rooted in traditional conceptual frameworks. For instance, Galileo Galilei and his contemporaries allowed their investigations of motion and mechanics to be shaped by such Aristotelian notions as the distinction between natural and violent motion and the assumption that violent motion is caused by a moving force (Damerow et al. 1992). Studies of medieval natural philosophy have revealed, on the other hand, that the advanced explanations of phenomena such as projectile motion, by which early modern scientists distinguished themselves from Aristotelian natural philosophy, make use of concepts of causation such as that of ‘impetus’ (or ‘impressed force’) and techniques for conceptualizing change such as Oresme’s doctrine of the ‘latitude of forms’ that had already been developed by late antique or medieval commentators on Aristotle (Clagett 1968).
Such contrasting accounts of early modern science appear less incompatible if other dimensions of the development of scientific knowledge are taken into account. For instance, epistemological considerations have suggested that one should differentiate between the claim of Renaissance engineer-scientists to have created a new science or a new scientific method, on the one hand, and the knowledge base they shared with their contemporaries, on the other hand. Using the terminology introduced by Elkana (1988) we can say that their ambitious claim resulted from their ‘image of knowledge,’ which was determined by their fragile social status. This image is to be distinguished from the shared ‘body of knowledge’ comprising the antique and medieval heritage which determined what challenges they were able to master. The role of this shared knowledge for the Scientific Revolution has been analyzed not only from the point of view of the history of ideas but also from the viewpoint of its actual function in the social and material contexts of this revolution. It has thus turned out that the material culture of the Scientific Revolution decisively shaped the way in which the knowledge of antique and medieval science was taken up or newly interpreted. It has become evident, for instance, that the knowledge resources available to the engineer-scientists, largely structured by traditional conceptual frameworks, were challenged by their application to the new objects of a rapidly expanding range of experience, acquired in the context of the practical ventures in which the engineer-scientists were engaged. While investigations of ‘challenging objects’ such as the pendulum, the trajectories of artillery, purified metals, the dissected human body, the terrestrial globe, and the planetary system often remained less successful than their protagonists hoped and claimed, they nevertheless triggered elaborations and modifications of these traditional frameworks, creating the foundations for the accomplishments of the age of classical science.
Columbus searched for the sea route to India, but discovered America. Kepler tried in the Pythagorean tradition to unravel the harmonies of the world but from the harmonies he found only three laws of planetary motion that became the starting point of Newtonian cosmology. Galileo explained ballistics and pendulum motion on the basis of the impetus concept, but in fact he contributed to a new mechanics, which expelled this concept once and for all from science. Harvey defended Aristotle’s claim of the primacy of the heart among the organs, but became the founder of an anti-Aristotelian theory, a mechanistic medicine. All these achievements resulted from coping with challenging new objects while relying on essentially traditional intellectual means. Classical science, often considered an accomplishment of the Scientific Revolution, was actually established for the most part only in the course of the eighteenth century. Even classical mechanics, the pilot and model science of the Scientific Revolution, assumed the formulation familiar from today’s physics textbooks only in the aftermath of Newton’s pioneering work. The characteristics of classical science, relatively stable theoretical frameworks and generally accepted standards for the production of knowledge serving as the canonical reference for a scientific community, were, in any case, not yet shared by the scientific endeavors of the Scientific Revolution. The conceptual framework of classical science comprising basic concepts such as inertia, a new methodical canon including the experimental method and mathematical techniques such as the differential calculus, new images of knowledge such as the mechanization of the world view, and the new institutions of classical science such as the academies were nevertheless consequences of the Scientific Revolution, as a belated result of reflection on its accumulated experience.

See also: Evolution: Diffusion of Innovations; History of Science; History of Technology; Innovation, Theory of; Kuhn, Thomas S (1922–96); Physical Sciences: History and Sociology; Technological Innovation
Bibliography
Biagioli M 1993 Galileo, Courtier: The Practice of Science in the Culture of Absolutism. University of Chicago Press, Chicago
Butterfield H 1965 The Origins of Modern Science 1300–1800. The Free Press, New York
Clagett M H (ed.) 1968 Nicole Oresme and the Medieval Geometry of Qualities and Motions. University of Wisconsin Press, Madison, WI
Cohen H F 1994 The Scientific Revolution: A Historiographical Inquiry. University of Chicago Press, Chicago
Cohen I B 1985 Revolution in Science. Belknap Press, Cambridge, MA
Damerow P, Freudenthal G, MacLaughlin P, Renn J 1992 Exploring the Limits of Preclassical Mechanics. Springer, New York
Dijksterhuis E J 1986 The Mechanization of the World Picture: Pythagoras to Newton. Princeton University Press, Princeton, NJ
Duhem P 1996 Essays in the History and Philosophy of Science. Hackett, Indianapolis, IN
Elkana Y 1988 Experiment as a second order concept. Science in Context 2: 177–96
Feldhay R 1995 Galileo and the Church: Political Inquisition or Critical Dialogue? Cambridge University Press, Cambridge, UK
Hall A R 1954 The Scientific Revolution 1500–1800: The Formation of the Modern Scientific Attitude. Longmans, Green, London
Koyré A 1978 Galileo Studies. Humanities Press, Atlantic Highlands, NJ
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Lindberg D C, Westman R S (eds.) 1990 Reappraisals of the Scientific Revolution. Cambridge University Press, Cambridge, UK
Osler M J (ed.) 2000 Rethinking the Scientific Revolution. Cambridge University Press, Cambridge, UK
Porter R, Teich M (eds.) 1992 The Scientific Revolution in National Context. Cambridge University Press, Cambridge, UK
Shapin S 1996 The Scientific Revolution. University of Chicago Press, Chicago
Wallace W A 1991 Galileo, the Jesuits, and the Medieval Aristotle. Variorum, Aldershot, UK
Zilsel E 2000 The Social Origins of Modern Science. Kluwer, Dordrecht, The Netherlands
P. Damerow and J. Renn
Scientometrics

1. Introduction

Scientometrics can be defined as the study of the quantitative aspects of scientific communication, R&D practices, and science and technology (S&T) policies. The objective is to develop indicators of the intellectual and social organization of the sciences using network relations between scientific authors and texts. The specialty has developed in relation to the increased capacities of computer storage and information retrieval of scientific communications (e.g., citation analysis). Archival records of scientific communications contain institutional address information, substantive messages (e.g., title words), and relational information from which one is able to reconstruct patterns and identify the latent characteristics of both authors and document sets. Using scientometric techniques, one is thus able to relate institutional characteristics at the level of research groups with developments at the level of scientific disciplines and specialties. Citations, for example, can be used for retrieving documents on the basis of author names, or vice versa. The scientometric representation is formal: it remains in need of an interpretation. The focus on uncertainty contained in the distribution relates scientometrics additionally to the (neo-evolutionary) study of complex and adaptive systems. Simulation models are increasingly used for the study of the role of science-based technologies in innovation processes. However, the specialty remains data driven because of its mission to provide indicators to S&T policy processes and R&D management.
2. A Metric of Science?

In 1978, the journal Scientometrics was launched as a new medium to stimulate the quantitative study of scientific communication. Derek de Solla Price, one of the founding fathers of the specialty, proclaimed in his introduction the development of scientometrics as the
emergence of a ‘relatively hard’ social science. This claim has generated discussion from the very beginning of the specialty. In that same year (1978), leading sociologists of science published an edited volume entitled Toward a Metric of Science: The Advent of Science Indicators, dedicated to ‘Paul F. Lazarsfeld (1901–76), Master of Quantitative and Qualitative Social Research’ (Elkana et al. 1978). The systematic comparison of science indicators across fields of science was made possible by the creation of the Science Citation Index by Eugene Garfield of the Institute of Scientific Information (Garfield 1979). A preliminary version of this index became available in 1962. The creation of the database has stimulated the development of new perspectives on studies in various traditions. For example, the growth of scientific disciplines and specialties can be discussed in quantitative terms using this database (e.g., Price 1963), ‘invisible colleges’ can be explained in terms of network structures (Crane 1969), and theories of citations can perhaps be developed and tested (cf. Braun 1998). The development of the specialty went hand in hand with the need for a means of legitimating science policies (Wouters 1999). Narin (1976) elaborated an instrumentarium for the systematic development of the biennial series of Science Indicators which the National Science Foundation of the USA began providing in 1972. (In 1987, the name of this series was changed to Science and Engineering Indicators.) With the further development of S&T policies in other nation-states and with the gradual emergence of such policies at the European level, scientometrics became a booming business during the 1980s. By comparing radio-astronomy facilities at the international level, Martin and Irvine (1983) showed the feasibility of comparing research groups in terms of quantitative performance indicators.
In a special issue of Social Studies of Science about performance indicators, Collins (1985) raised the question of the unit of analysis: what is being assessed in terms of what? The authors, the papers, or the cognitions shaped in terms of sociocognitive interactions among authors and discourses? In relation to French traditions of linguistic analysis, Callon et al. (1983) proposed using words and their co-occurrences instead of citations and co-citations (Small 1973) as units of analysis. Citations can be considered as subtextual codifications, while words indicate the variation at the level of texts. Words may change both in meaning and in terms of their observable frequency distributions.
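The idea of using words and their co-occurrences as units of analysis can be illustrated with a minimal sketch. The function name `coword_counts` and the three toy documents are invented for illustration only; this is not the procedure of Callon et al. (1983), merely the basic counting step that any co-word analysis starts from.

```python
from collections import Counter
from itertools import combinations


def coword_counts(documents):
    """Count, for every pair of words, the number of documents containing both."""
    counts = Counter()
    for words in documents:
        # Each document contributes at most once per pair, so duplicates are dropped.
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return counts


# Invented title words for three hypothetical documents.
toy_documents = [
    ["citation", "network", "science"],
    ["citation", "science", "policy"],
    ["network", "policy"],
]

if __name__ == "__main__":
    for pair, n in sorted(coword_counts(toy_documents).items()):
        print(pair, n)
```

The resulting pair counts form a symmetric co-occurrence matrix; replacing the word lists with lists of cited references would give co-citation counts in the same way.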
3. Methodologies

The availability of well-organized relational databases on an annual basis challenges the scientometrician to develop comparative statistics. How is the structure in each of these years to be depicted, and how may changes in structure be distinguished from and related to changes in the observable variation? Can the difference over time be identified as ‘growth of science,’ or is it mainly a difference between two measurement errors? How significant are differences when tested against simulation results? In principle, the idea of a dynamic mapping of science requires an independent operationalization of structural (that is, latent) dimensions of the maps and observable variation which is to be pencilled into these maps. Science, however, develops not only in terms of the variation, but also by changing its structural dimensions. Because of the prevailing reflexivity within the science system, previous structures can be felt as constraints and at the same time be used as resources. The construction of structure may historically be stabilized, but reflexive actors are able to deconstruct and to assess the previous constructions with hindsight. The methodological apparatus for the mapping of science in terms of multivariate statistics (multidimensional scaling, cluster analysis, etc.) has been a product of the 1980s (e.g., Van Raan 1988). The 1990s have provided the evolutionary turn: what does history mean in relation to (envisaged?) future options? How can the system itself be informed reflexively with respect to its self-organizing capacities (Leydesdorff 1995)? The relation to technometrics and the measurement of ‘systems of innovation’ has become central to a shifting research agenda. The development of the sciences has increasingly been contextualized in relation to science-based technologies and innovation systems (Gibbons et al. 1994).

3.1 Comparative Statistics of Science

Various methods for the mapping of science can be based on relational indicators such as citations, words, co-occurrences of each of these categories, etc. Clustering, however, requires the choice of a similarity criterion and a clustering algorithm.
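Why the choice of a similarity criterion matters can be seen in a small sketch. The two criteria shown (cosine similarity and the Jaccard index) are common choices but are assumptions here, not criteria named in the text, and the citation profiles for journals A, B, and C are invented numbers.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(u, v):
    """Jaccard index on the sets of positions that hold a non-zero count."""
    set_u = {i for i, a in enumerate(u) if a}
    set_v = {i for i, b in enumerate(v) if b}
    union = set_u | set_v
    return len(set_u & set_v) / len(union) if union else 0.0


# Invented profiles: citations from each hypothetical journal to four reference journals.
profiles = {"A": [10, 0, 2, 0], "B": [1, 0, 9, 0], "C": [9, 1, 2, 1]}

if __name__ == "__main__":
    for other in ("B", "C"):
        print("A vs", other,
              "cosine=%.3f" % cosine(profiles["A"], profiles[other]),
              "jaccard=%.3f" % jaccard(profiles["A"], profiles[other]))
```

With these toy numbers, A is nearest to C under the cosine (which weighs how strongly each journal is cited) but nearest to B under the Jaccard index (which only records which journals are cited at all), so a clustering algorithm fed with one matrix or the other could group the journals differently.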
The distinction between positional (or factor analytical) and relational (or graph analytical) approaches is also relevant: the network is expected to contain an architecture in which actors have a position. A nearby position does not necessarily imply a relation. The methodological reflection may thus help to clarify the theoretical analysis (Burt 1982). The mapping requires a specific perspective for the projection. Each perspective assumes a position or a window. If the multidimensional space is highly structured, the different positions may provide nearly incommensurable projections. In an article on the development of the relation between scientometrics and other subfields of S&T studies, Leydesdorff and Van den Besselaar (1997) showed that the mappings depicting science studies from the perspective of the journals Scientometrics, Social Studies of Science, and Research Policy, respectively, are increasingly different during the 1990s. Citation relations among these core journals tend to decrease. The authors characterize Social Studies of Science as the ‘codifier’ of the field (along the historical axis), Scientometrics as the ‘formalizer,’ and Research Policy as the ‘utilizer.’ Only a few scholars in ‘Science, Technology, and Innovation Studies’ have developed competences for communicating across these subdisciplinary boundaries.

3.2 Time-series Methodologies

During the 1980s a debate raged in the community concerning the scientometric indication of a ‘decline of British science.’ Eventually, a special issue of Scientometrics was devoted to the debate in 1991 (Martin 1991). Some agreement could be reached that the inclusion and exclusion of data types and the framework for the comparison can be crucial for the dynamic evaluation. Should one compare with reference to a previous stage (for example, in terms of an ex ante fixed journal set), or should one with hindsight reconstruct the relevant data? For example, not only has the number of biotechnology journals changed, but also our understanding of ‘biotechnology’ is changing continuously. The sociological understanding of scientific knowledge production and control seems to have eroded Price’s dream of developing scientometrics as a relatively ‘hard social science.’ Using time-series analysis, one is always able to increase the fit of the curve by allowing for higher-order polynomials. Here again, the theoretical appreciation has to guide the choices of the parameters. For example, if one wishes to measure growth, it may be useful to include a second- or third-order polynomial in addition to the linear fit. Figure 1, for example, depicts the emergence of the modern citation itself as a historical phenomenon, but using theoretically informed helplines. Scientific citation emerged around the turn of the twentieth century as a means to search in both the textual and the social dimensions of science. Thus, the citation can be considered as an indicator of the complexity of sociocognitive interaction in science after its institutionalization in the nineteenth century.

Figure 1 The emergence of the modern citation in the February issues of Journal of the American Chemical Society (1890–1910) (after Leydesdorff and Wouters 1999, p. 175).

3.3 Neo-evolutionary Methodologies
The networks of texts and the networks of authors operate upon each other in a selective mode. The distributions are expected to contain information, and this information may be increasingly codified by recurrent selections. As the systems ‘lock-in’ (in terms of their mutual information), the closure of the communication into a paradigm is one among various possibilities. The uncertainties which prevail in these networks can interactively generate codifications, which can be expected to perform a ‘life-cycle.’ Both participants and observers are able to hypothesize these structures reflexively, and new information can be expected to induce an update. Thus, the codification drives the knowledge production process. Codification also provides instruments for local control of otherwise global developments. The relational operation is recursive. For example, citations refer to other texts and/or to citations in other texts. The networks resulting from this operation are expected to have an architecture (which can be mapped at each moment in time). Operations are expected to be reproduced if they are able to further the production of new knowledge and the latter’s retention into cognitive structures. What is functional, however, is decided at a next moment in time, that is, at the level of a reflexive hypernetwork that overlays the historically generated ones. The distributions indicate the patterns of the expected operations. Thus, the process of new knowledge claims is propelled and made more precise and selective in tradeoffs of references in social and cognitive dimensions. Scientometric studies can be helpful in revealing the patterns of intellectual and social organization which may have remained (partially) latent to the knowledgeable actors involved. Simulation studies using scientometric mappings as input enable us to indicate the difference that the moves of the players can make.
The complexity of the scientists’ worlds is reflected in the scientometric reconstructions. The recognition of these objectified reconstructions recursively assumes and potentially refines the cognition within the discourses at both levels. Over time, the cognitive reconstruction becomes thoroughly selective: citations may be ‘obliterated by incorporation’ into the body of knowledge, and social factors may play a role in further selections, e.g., in
terms of reputations. In this co-evolution between communications and authors, distributions of citations function, among other things, as contested boundaries between specialties. Since the indicators are distributed, the boundaries remain to be validated. Functions are expected to change when the research front moves further. By using references, authors position their knowledge claims within one specialty area or another. Some selections are chosen for stabilization, for example, when codification into citation classics occurs. Some stabilizations are selected for globalization at a next-order level, for example, when the knowledge component is integrated into a technology.
4. Conclusion

The focus on evolutionary dynamics increasingly relates scientometrics to the further development of evolutionary economics (Leydesdorff and Van den Besselaar 1994). How can systems of innovation be delineated? How can the complex dynamics of such systems be understood? How is the (potentially random) variation guided by previously codified expectations? How can explorative variation be increased in otherwise ‘locked-in’ trajectories of technological regimes or paradigms? From this perspective, the indication of newness may become more important than the indication of codification. The Internet, of course, offers a research tool for what has also now been called ‘sitations’ (Rousseau 1997). ‘Webometrics’ may develop as a further extension of scientometrics, relating this field to other subspecialties of science and technology studies, such as the public understanding of science or the appropriation of technology and innovation using patent statistics.
See also: Communication: Electronic Networks and Publications; History of Science; Libraries; Science and Technology, Social Study of: Computers and Information Technology; Science and Technology Studies: Experts and Expertise

Bibliography
Braun T (ed.) 1998 Topical discussion issue on theories of citation. Scientometrics 43: 3–148
Burt R S 1982 Toward a Structural Theory of Action. Academic Press, New York
Callon M, Courtial J-P, Turner W A, Bauin S 1983 From translations to problematic networks: an introduction to coword analysis. Social Science Information 22: 191–235
Collins H M 1985 The possibilities of science policy. Social Studies of Science 15: 554–8
Crane D 1969 Social structure in a group of scientists. American Sociological Review 36: 335–352
Elkana Y, Lederberg J, Merton R K, Thackray A, Zuckerman H 1978 Toward a Metric of Science: The Advent of Science Indicators. Wiley, New York
Garfield E 1979 Citation Indexing. Wiley, New York
Gibbons M, Limoges C, Nowotny H, Schwartzman S, Scott P, Trow M 1994 The New Production of Knowledge: The Dynamics of Science and Research in Contemporary Societies. Sage, London
Leydesdorff L, Van den Besselaar P (eds.) 1994 Evolutionary Economics and Chaos Theory: New Directions in Technology Studies. Pinter, London
Leydesdorff L, Van den Besselaar P 1997 Scientometrics and communication theory: Towards theoretically informed indicators. Scientometrics 38: 155–74
Leydesdorff L, Wouters P 1999 Between texts and contexts: Advances in theories of citation? Scientometrics 44: 169–182
Leydesdorff L A 1995 The Challenge of Scientometrics: The Development, Measurement, and Self-organization of Scientific Communications. DSWO Press, Leiden University, Leiden, Netherlands
Martin B R 1991 The bibliometric assessment of UK scientific performance. A reply to Braun, Glänzel and Schubert. Scientometrics 20: 333–57
Martin B R, Irvine J 1983 Assessing basic research: Some partial indicators of scientific progress in radio astronomy. Research Policy 12: 61–90
Narin F 1976 Evaluative Bibliometrics. Computer Horizons Inc., Cherry Hill, NJ
Price D, de Solla 1963 Little Science, Big Science. Columbia University Press, New York
Raan A F J Van (ed.) 1988 Handbook of Quantitative Studies of Science and Technology. Elsevier, North-Holland, Amsterdam
Rousseau R 1997 Sitations: An exploratory study. Cybermetrics 1: 1 http://www.cindoc.csic.es/cybermetrics/articles/v1i1p1.html
Small H 1973 Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science 24: 265–9
Wouters P 1999 The citation culture. PhD Thesis, University of Amsterdam

L. Leydesdorff

Screening and Selection
1. Introduction

Screening tests are used as a first step in selecting individuals who possess either desirable traits, such as an aptitude for higher education or skills needed for a job, or undesirable ones, such as an infection, an illness, or a propensity to lie. A more rigorous selection device, e.g., a diagnostic test or detailed interview, is then used for the final classification.
For some purposes, such as screening blood donations for a rare infection, the units classified as positive are not donated but further tests on donors may not be given as most will not be infected. Similarly, for estimating the prevalence of a trait in a population, the screening data may suffice provided an appropriate estimator that incorporates the error rates of the test is used (Hilden 1979, Gastwirth 1987, Rahme and Joseph 1998). This article describes the measures of accuracy used to evaluate and compare screening tests and issues arising in the interpretation of the results. The importance of the prevalence of the trait in the population screened and the relative costs of the two types of misclassification are discussed. Methods for estimating the accuracy rates of screening tests are briefly described and the need to incorporate them in estimates of prevalence is illustrated.
2. Basic Concepts

The purpose of a screening test is to determine whether a person or object is a member of a particular class, $C$, or its complement, $\bar{C}$. The test result indicating that the person is in $C$ will be denoted by $S$ and a result indicating non-membership by $\bar{S}$. The accuracy of a test is described by two probabilities: $\eta = P[S \mid C]$, the probability that someone in $C$ is correctly classified, or the sensitivity of the test; and $\theta = P[\bar{S} \mid \bar{C}]$, the probability that someone not in $C$ is correctly classified, or the specificity of the test. Given the prevalence, $\pi = P(C)$, of the trait in the population screened, it follows from Bayes’ theorem that the predictive value of a positive test (PVP) is

$$P[C \mid S] = \frac{\pi\eta}{\pi\eta + (1-\pi)(1-\theta)}.$$

Similarly, the predictive value of a negative test is

$$P[\bar{C} \mid \bar{S}] = \frac{(1-\pi)\theta}{(1-\pi)\theta + \pi(1-\eta)}.$$

In the first two sections, we will assume that the accuracy rates and the prevalence are known. When they are estimated from data, appropriate sampling errors for them and the PVP are given in Gastwirth (1987). For illustration, consider an early test for HIV, having a sensitivity of 0.98 and a specificity of 0.93, applied in two populations. The first has a very low prevalence, $1.0 \times 10^{-3}$, of the infection while the second has a prevalence of 0.25. From the expression for the PVP above, the PVP in the first population equals 0.0138, i.e., only about one-and-one-half percent of individuals classified as infected
would actually be so. Nearly 99 percent would be false positives. Notice that if the fraction of positives, expected to be 0.0709, in the screened data were used to estimate prevalence, a severe overestimate would result. Adjusting for the error rates yields an accurate estimate. In the higher prevalence group, the PVP is 0.8235, indicating that the test could be useful in identifying individuals. Currently used tests have accuracy rates greater than 0.99, but even these still have a PVP less than 0.5 when applied to a low prevalence population. A comprehensive discussion is given in Brookmeyer and Gail (1994). The role of the prevalence, also called the base rate, of the trait in the screened population and how well people understand its effect has been the subject of substantial research, recently reviewed by Koehler (1996). An interesting consequence is that when steps are taken to reduce the prevalence of the trait prior to screening, the PVP declines and the fraction of false positives increases. Thus, when high-risk donors are encouraged to defer, or when background checks eliminate a sizeable fraction of unsuitable applicants prior to their being subjected to polygraph testing, the fraction of classified positives who are truly positive is small. In many applications screening tests yield data that are ordinal or essentially continuous, e.g., scores on psychological tests or the concentration of HIV antibodies. Any value, t, can be used as a cut-off point to delineate between individuals with the trait and ‘normal.’ Each t generates a corresponding sensitivity and specificity for the test, and the user must then incorporate the relative costs of the two different errors and the likely prevalence of the trait in the population being screened to select the cut-off.
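The predictive values in the HIV illustration earlier in this section can be reproduced with a short sketch. The function names are illustrative, and the prevalence correction shown is simply the algebraic inversion of the expected fraction of positives, $\pi\eta + (1-\pi)(1-\theta)$; it is offered as an assumption about what ‘adjusting for the error rates’ involves and is not necessarily the exact estimator of the cited references.

```python
def screen_summary(prevalence, sensitivity, specificity):
    """Return (PVP, PVN, expected fraction testing positive) via Bayes' theorem."""
    frac_positive = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    pvp = prevalence * sensitivity / frac_positive
    pvn = (1 - prevalence) * specificity / (1 - frac_positive)
    return pvp, pvn, frac_positive


def adjusted_prevalence(frac_positive, sensitivity, specificity):
    """Solve frac_positive = pi*eta + (1 - pi)*(1 - theta) for the prevalence pi."""
    return (frac_positive - (1 - specificity)) / (sensitivity + specificity - 1)


if __name__ == "__main__":
    # Sensitivity 0.98 and specificity 0.93, as in the illustration in the text.
    for prevalence in (0.001, 0.25):
        pvp, pvn, frac = screen_summary(prevalence, 0.98, 0.93)
        est = adjusted_prevalence(frac, 0.98, 0.93)
        print(f"prevalence={prevalence}: PVP={pvp:.4f}, PVN={pvn:.4f}, "
              f"fraction positive={frac:.4f}, adjusted estimate={est:.4f}")
```

In practice the fraction positive would be an observed proportion rather than its expectation, so the adjusted estimate carries sampling error and can even fall outside the interval from 0 to 1 in small samples.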
The receiver operating characteristic (ROC) curve displays the trade-off between the sensitivity and specificity defined by various choices of $t$ and also yields a method for comparing two or more screening tests. To define the ROC curve, assume that the distribution of the measured variable (test score or physical quantity) is $F(x)$ for the ‘normal’ members of the population but is $G(x)$ for those with the trait. The corresponding density functions are $f(x)$ and $g(x)$, respectively, and for a good screening test $g(x)$ will be shifted away from $f(x)$ (to the right if large scores indicate the trait, to the left if small scores do). Often one fixes the probability ($\alpha = 1 - \theta$, or one minus the specificity) of classifying a person without the characteristic as having it at a small value. When large scores indicate the trait, $t$ is then determined from the equation $F(t) = 1 - \alpha$ (that is, $F(t) = \theta$). The sensitivity, $\eta$, of the test is $1 - G(t)$. The ROC curve plots $\eta$ against $1 - \theta$. A perfect test would have $\eta = 1$, so the closer the ROC is to the upper left corner in Fig. 1, the better the screening test. In Fig. 1 we assume $f$ is a normal density with mean 0 and variance 1 while $g_1$ is normal with mean 1 and the same variance. For comparison we also graphed the ROC for a second test, which has a density $g_2$ with mean 2 and variance 1. Notice that the ROC curve for the second test is closer to the left corner (0,1) than that of the first test.

Figure 1 The ROC curves for screening tests 1 and 2. The solid line is the curve when the diseased group has mean 1 and the dashed curve is for the second group with mean 2

A summary measure (Campbell 1994), which is useful in comparing two screening tests, is the area, $A$, under the ROC. The closer $A$ is to its maximum value of 1.0, the better the test is. In Fig. 1, the areas under the ROC curves for the two tests are 0.761 and 0.922, respectively. Thus, the areas reflect the fact that the ROC curve for the second test is closer to what a ‘perfect’ test would be. This area equals the probability that a randomly chosen individual with the trait will have a higher score on the screening test than a randomly chosen normal one. This probability is the expected value of the Mann–Whitney form of the Wilcoxon test for comparing two distributions, and methods for estimating it are in standard non-parametric statistics texts. Non-parametric methods for estimating the entire ROC curve are given by Wieand et al. (1989), and Hilgers (1991) obtained distribution-free confidence bounds for it. Campbell (1994) uses the confidence bounds on $F$ and $G$ to construct a joint confidence interval for the sensitivity and one minus the specificity, in addition to proposing alternative confidence bounds for the ROC itself. Hsieh and Turnbull (1996) determine the value of $t$ that maximizes the sum of sensitivity and specificity. Their approach can be extended to maximizing a weighted average of the two accuracy rates, as suggested by Gail and Green (1976). Wieand et al. (1989) also developed related statistics focusing on the portion of the ROC lying above a
region from $\alpha_1$ to $\alpha_2$, so the analysis can be confined to values of specificity that are practically useful. Greenhouse and Mantel (1950) determine the sample sizes needed to test whether both the specificity and sensitivity of a test exceed pre-specified values. The area $A$ under the ROC can also be estimated using parametric distributions for the densities $f$ and $g$. References to this literature, and an alternative approach using smoothed histograms to estimate the densities, are given in Zou et al. (1997). They also consider estimating the partial area over the important region determined by two appropriate small values of $\alpha$.

The tests used to select employees need to be reliable and valid. Reliability means that replicate values are consistent while validity means that the test measures what it should, e.g., successful academic performance. Validity is often assessed by the correlation between the test score ($X$) and subsequent performance ($Y$). Often $X$ and $Y$ can be regarded as jointly normal random variables, especially as monotone transformations of the raw scores can be used in place of them. If a passing score on the screening or pre-employment test is defined as $X \ge t$ and successful performance is defined as $Y \ge d$, then the sensitivity of the test is $P[X \ge t \mid Y \ge d]$, the specificity is $P[X < t \mid Y < d]$, and the prevalence of the trait is $P[Y \ge d]$ in the population of potential applicants. Hence, the aptitude and related tests can be viewed from the general screening test paradigm. When the test and performance scores are scaled to have a standard bivariate normal distribution, both the sensitivity and specificity increase with the correlation, $\rho$. For example, suppose one desired to obtain employees in the upper half of the performance distribution and used a cut-off score, $t$, of one standard deviation above the mean on the test ($X$). When $\rho = 0.3$, the sensitivity is 0.217 while the specificity is 0.899. If $\rho = 0.5$, the sensitivity is 0.255 and the specificity is 0.937.
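The sensitivities and specificities quoted for the bivariate normal case can be checked numerically. The sketch below is one possible way to do so using only the Python standard library: it conditions on the test score and integrates with the trapezoidal rule. The function names, the upper integration limit of 8, and the number of steps are arbitrary choices made for this illustration.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def joint_upper(t, d, rho, upper=8.0, steps=4000):
    """P[X >= t, Y >= d] for a standard bivariate normal with correlation rho.

    Given X = x, Y is normal with mean rho*x and variance 1 - rho**2, so the
    joint probability is the integral of phi(x) * P[Y >= d | X = x] over x >= t.
    """
    sd = math.sqrt(1.0 - rho * rho)
    h = (upper - t) / steps
    total = 0.0
    for i in range(steps + 1):
        x = t + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end-point weights
        total += weight * phi(x) * (1.0 - Phi((d - rho * x) / sd))
    return total * h


def sensitivity_specificity(t, d, rho):
    """Sensitivity P[X >= t | Y >= d] and specificity P[X < t | Y < d]."""
    p_pass = 1.0 - Phi(t)
    prevalence = 1.0 - Phi(d)
    both = joint_upper(t, d, rho)
    sensitivity = both / prevalence
    # P[X < t, Y < d] = 1 - P[X >= t] - P[Y >= d] + P[X >= t, Y >= d]
    specificity = (1.0 - p_pass - prevalence + both) / (1.0 - prevalence)
    return sensitivity, specificity


if __name__ == "__main__":
    for rho in (0.3, 0.5):
        sens, spec = sensitivity_specificity(t=1.0, d=0.0, rho=rho)
        print(f"rho={rho}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Setting `d=0.0` corresponds to the upper half of the performance distribution and `t=1.0` to a cut-off one standard deviation above the mean; other cut-offs can be explored by changing `t`.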
The use of a high cut-off score eliminates the less able applicants but also disqualifies a majority of applicants who are in the upper half of the performance distribution. Reducing the cut-off score to one-half a standard deviation above the average raises the sensitivities to 0.394 and 0.454 for the two values of ρ but lowers the corresponding specificities to 0.777 and 0.837. This trade-off is a general phenomenon as seen in the ROC curves.
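The trade-off traced by the ROC curves of Fig. 1 can be sketched in the same spirit. The code below assumes, as in the figure, that normal individuals score as N(0, 1) and individuals with the trait as N(shift, 1); the function names, the range of cut-offs, and the step count are choices made for this illustration. The closed form $\Phi(\text{shift}/\sqrt{2})$ for the area in this equal-variance normal case is a standard result and is used only as a cross-check.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def roc_point(t, shift):
    """(1 - specificity, sensitivity) at cut-off t; large scores indicate the trait."""
    alpha = 1.0 - Phi(t)        # P[score > t | normal]      = 1 - F(t)
    eta = 1.0 - Phi(t - shift)  # P[score > t | with trait]  = 1 - G(t)
    return alpha, eta


def area_under_roc(shift, low=-8.0, high=10.0, steps=3600):
    """Trapezoidal area under the ROC as the cut-off moves from high to low."""
    points = [roc_point(high - i * (high - low) / steps, shift)
              for i in range(steps + 1)]
    area = 0.0
    for (a0, e0), (a1, e1) in zip(points, points[1:]):
        area += 0.5 * (e0 + e1) * (a1 - a0)
    return area


if __name__ == "__main__":
    for shift in (1.0, 2.0):
        numeric = area_under_roc(shift)
        closed_form = Phi(shift / math.sqrt(2.0))
        print(f"shift={shift}: area={numeric:.4f} (closed form {closed_form:.4f})")
```

The computed areas should come out close to the values quoted for Fig. 1, although small differences in the third decimal are possible depending on rounding.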
3. The Importance of the Context in Interpreting the Results of Screening Tests

In medical and psychological applications, an individual who tests positive for a disease or condition on a screening test will be given a more accurate confirmatory test or intensive interview. The cost of a ‘false positive’ screening result on a medical exam is
often considered very small relative to a ‘false negative,’ which could lead to the failure of suitable treatment to be given in a timely fashion. A false positive result presumably would be identified in a subsequent more detailed exam. Similarly, when government agencies give employees in safety or security sensitive jobs a polygraph test, the loss of a potentially productive employee due to a false positive was deemed much less than the risk of hiring an employee who might be a risk to the public or a security risk. One can formalize the issue by including the costs of various errors and the prevalence, $\pi$, of the trait in the population being screened in determining the cut-off value of the screening test. Then the expected cost, which weights the probability of each type of error by its cost, is given by

$$w\alpha(1-\pi) + (1-w)\pi G(t).$$

Here the relative costs of a false positive and a false negative are $w$ and $1-w$, respectively, and as before $t = F^{-1}(1-\alpha)$. The choice of cut-off value, $t_0$, minimizing the expected cost satisfies

$$\frac{g(t)}{f(t)} = \frac{w(1-\pi)}{(1-w)\pi} \tag{1}$$
Whenever the ratio, g/f, of the density functions is a monotone function, the previous equation yields an optimum cut-off point, $t_o$, which depends on the costs and the prevalence of the trait. Note that for any value of π, the greater the cost of a false positive, the larger will be the optimum value, $t_o$. This reflects the fact that the specificity needs to be high in order to keep the false positive rate low. Although the relative costs of the two types of error are not always easy to obtain and the prevalence may only be approximately known, Eqn. (1) may aid in choosing the critical value. In practice, one should also assess the effect that slight changes in the costs and assumed prevalence have on the choice of the cut-off value. The choice of t that satisfies condition (1) may not be optimal if one desires to estimate the prevalence of the trait in a population rather than to classify individuals. Yanagawa and Tokudome (1990) determine t when the objective is to minimize the relative absolute error of the estimator of prevalence on the basis of the screening test results. The HIV/AIDS epidemic raised questions about the standard assumptions concerning the relative costs of the two types of error. A ‘false positive’ classification would not only mean that a well individual would worry until the results of the confirmatory test were completed; it might also have social and economic consequences if friends or an employer learned of the result.
Similar problems arise in screening blood donors and in studies concerning the association of genetic markers and serious diseases. Recall that the vast majority of donors or volunteers for genetic studies are doing a public service and are being screened to protect others or advance knowledge. If a donation tests positive, clearly it should not be used for transfusion. Should a screened-positive donor be informed of their status? Because the prevalence of infected donors is very small, the PVP is quite low, so that most of the donors who screen positive are false positives. Thus, blood banks typically do not inform them and rely on approaches that encourage donors from high-risk groups to exclude themselves from the donor pool (Nusbacher et al. 1986). Similarly, in a study (Hartge et al. 1998) of the prevalence of mutations in two genes that have been linked to cancer, the study participants were not notified of their results.

The screening test paradigm is useful in evaluating tests used to select employees. The utility of a test depends on the costs associated with administering the test and the costs associated with the two types of error. Traditionally, employers focused on the costs of a false positive, that is, of hiring an employee who does not perform well, such as termination costs and the possible loss of customers. The costs of a false negative are more difficult to estimate. The civil-rights law, which was designed to open job opportunities to minorities, emphasized the importance of using appropriate tests, i.e., tests that select better workers. Employers need to check whether the tests or job requirements (e.g., possession of a high school diploma) have a disparate impact upon a legally protected group. When a test or requirement excludes a significantly greater fraction of minority members than of majority ones, the employer needs to validate it, i.e., show that it is predictive of on-the-job performance. Arvey (1979) and Paetzold and Willborn (1994) discuss these issues.
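The cost-based choice of cut-off in Eqn. (1) of this section can be made concrete with a small numerical sketch. It assumes, purely for illustration, that test scores are N(0, 1) among individuals without the trait (distribution F) and N(μ, 1) among those with it (distribution G). These normal forms, the parameter values, and the function names are assumptions and are not taken from the article. Under them the ratio g(t)/f(t) = exp(μt − μ²/2) is monotone, so Eqn. (1) can be solved explicitly for the cut-off.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_cost(t, w, prev, mu):
    """w * alpha * (1 - prev) + (1 - w) * prev * G(t), with alpha = 1 - F(t)."""
    alpha = 1.0 - norm_cdf(t)     # false positive rate under F = N(0, 1)
    miss = norm_cdf(t - mu)       # G(t): false negative rate under G = N(mu, 1)
    return w * alpha * (1.0 - prev) + (1.0 - w) * prev * miss

def optimal_cutoff(w, prev, mu):
    """Solve g(t)/f(t) = w(1 - prev) / ((1 - w) prev) in the normal-shift case."""
    ratio = w * (1.0 - prev) / ((1.0 - w) * prev)
    return mu / 2.0 + math.log(ratio) / mu

mu, prev = 2.0, 0.05              # illustrative separation and prevalence
for w in (0.1, 0.3, 0.5):
    t_o = optimal_cutoff(w, prev, mu)
    cost = expected_cost(t_o, w, prev, mu)
    print(f"w={w}: cut-off={t_o:.3f}, expected cost={cost:.4f}")
```

In this sketch a larger relative cost w of a false positive moves the cut-off upward, and re-running it with slightly different w or prevalence is one way to carry out the sensitivity check recommended in the text.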
4. Estimating the Accuracy of the Screening Tests

So far, we have assumed that we can estimate the accuracy of the screening tests on samples from two populations where the true status of the individuals is known with certainty. In practice, this is often not the case, which can lead to biased estimates of the sensitivity and specificity of a screening test, as some of the individuals believed to be normal have the trait, and vice versa. If one has samples from only one population to which to apply both the screening and the confirmatory test, then one cannot estimate the accuracy rates. The data would be organized into a 2×2 table with four cells, only three of which are independent. There are five parameters, however: the two accuracy rates of each of the two tests plus the prevalence of the trait in the population. In some situations, the prevalence of the trait may vary amongst sub-populations. If one can find two such sub-populations and if the accuracy rates of both tests are the same in both of those sub-populations, then one has two 2×2 tables with six independent cells with which to estimate six parameters. Estimation can then be carried out (Hui and Walter 1980). This approach assumes that the two tests are conditionally independent given the true status of the individual. When this assumption is not satisfied, Vacek (1985) showed that the estimates of the sensitivity and specificity of the tests are biased. This topic is an active area of research, recently reviewed by Hui and Zhou (1998). A variety of latent-class models have been developed that relax the assumption of conditional independence (see Faraone and Tsuang 1994 and Yang and Becker 1997 and the literature they cite).
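The parameter count behind the two-population design can be written out explicitly. As a sketch under the conditional-independence assumption, let $\pi_k$ be the prevalence in sub-population k = 1, 2, and let $S_j$ and $C_j$ be the sensitivity and specificity of test j = 1, 2; this notation is introduced here for illustration only and is not the article's. The cell probabilities of the 2×2 table for sub-population k are then

```latex
\begin{aligned}
P_k(+,+) &= \pi_k S_1 S_2 + (1-\pi_k)(1-C_1)(1-C_2),\\
P_k(+,-) &= \pi_k S_1 (1-S_2) + (1-\pi_k)(1-C_1)\,C_2,\\
P_k(-,+) &= \pi_k (1-S_1)\,S_2 + (1-\pi_k)\,C_1 (1-C_2),\\
P_k(-,-) &= \pi_k (1-S_1)(1-S_2) + (1-\pi_k)\,C_1 C_2 .
\end{aligned}
```

The four probabilities in each table sum to one, so each table supplies three independent equations. One table therefore gives three equations in five unknowns, while two tables with different prevalences give six equations in the six unknowns $(\pi_1, \pi_2, S_1, S_2, C_1, C_2)$.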
5. Applications and Future Concerns

Historically, screening tests were used to identify individuals with a disease or trait, e.g., as a first stage in diagnosing medical or psychological conditions or in selecting students or employees. They are being increasingly used, often in conjunction with a second, confirmatory test, in prevalence surveys for public health planning. The techniques developed are often applicable, with suitable modifications, to social science surveys. Some examples of prevalence surveys illustrate their utility. Katz et al. (1995) compared two instruments for determining the presence of psychiatric disorders, in part to assess the need for psychiatric care in the community and the available services. They found that increasing the original cut-off score yielded higher specificity without a substantial loss of sensitivity. Similar studies were carried out in Holland by Hodiamont et al. (1987), who found a lower prevalence (7.5 percent) than the 16 percent estimate in York. The two studies, however, used different classification systems, illustrating that one needs to examine carefully the methodology underlying various surveys before making international comparisons. Gupta et al. (1997) used several criteria based on the results of an EKG and an individual’s medical history to estimate the prevalence of heart disease in India.

Often one has prior knowledge of the prevalence of a trait in a population, especially if one is screening similar populations on a regular basis, as employers, medical plans, or blood centers do. Bayesian methods incorporate this background information and can yield more accurate estimates (see Geisser 1993, and Johnson, Gastwirth and Pearson 2001). A cost-effective approach is to use an inexpensive screen at a first stage and retest the positives with a more definitive test. Bayesian methodology for such studies was developed by Erkanli et al. (1997).

The problem of misclassification arises often in questionnaire surveys. Laurikka et al. (1995) estimated the sensitivity and specificity of self-reporting of varicose veins. While both measures were greater than 0.90, the specificity was lower (0.83) for individuals with a family history than for those with negative histories. Sorenson (1998) observed that self-reports are often accepted as true and found that the potential misclassification could lead to noticeable (10 percent) errors in estimated mortality rates. The distortion that misclassification errors can have on estimates of low-prevalence traits, because of the high fraction of false positive classifications, was illustrated in the earlier discussion of screening tests for HIV/AIDS. Hemenway (1997) applies these concepts to demonstrate that surveys typically overestimate rare events, in particular the self-defense uses of guns. Thus it is essential to incorporate the accuracy rates into the prevalence estimate (Hilden 1979, Gastwirth 1987, Rahme and Joseph 1998).

Sinclair and Gastwirth (1994) utilized the Hui–Walter paradigm to assess the accuracy of both the original and re-interview (by supervisors) classifications in labor force surveys. In its evaluation, the Census Bureau assumes that the re-interview data are correct; however, those authors found that both interviews had similar accuracy rates. In situations where one can obtain three or more classifications, all the parameters are identifiable (Walter and Irwig 1988). Gastwirth and Sinclair (1998) utilized this feature of the screening test approach to suggest an alternative design for judge–jury agreement studies in which another expert, e.g., a law professor or retired judge, assesses the evidence.
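How accuracy rates enter a prevalence estimate can be sketched with the standard misclassification identity: if a test with a given sensitivity and specificity is applied to a population with prevalence π, the expected fraction classified positive is π × sensitivity + (1 − π) × (1 − specificity), and this can be inverted for π. The code below is an illustrative sketch with made-up numbers; the function names are hypothetical, and the simple inversion ignores sampling variability, which the papers cited above address.

```python
def apparent_prevalence(prev, sens, spec):
    """Expected fraction of positive classifications."""
    return prev * sens + (1.0 - prev) * (1.0 - spec)

def corrected_prevalence(p_pos, sens, spec):
    """Invert the identity above; the result is clipped to [0, 1]."""
    if sens + spec <= 1.0:
        raise ValueError("test must be informative: sens + spec > 1")
    est = (p_pos + spec - 1.0) / (sens + spec - 1.0)
    return min(1.0, max(0.0, est))

def predictive_value_positive(prev, sens, spec):
    """Fraction of positive classifications that are true positives."""
    return prev * sens / apparent_prevalence(prev, sens, spec)

prev, sens, spec = 0.01, 0.95, 0.97    # illustrative values only
p_pos = apparent_prevalence(prev, sens, spec)
print(f"true prevalence      = {prev:.4f}")
print(f"apparent prevalence  = {p_pos:.4f}")
print(f"corrected prevalence = {corrected_prevalence(p_pos, sens, spec):.4f}")
print(f"PVP                  = {predictive_value_positive(prev, sens, spec):.3f}")
```

With a rare trait, the uncorrected fraction of positives in this sketch is several times the true prevalence and most positives are false, which is the overestimation of rare events discussed above.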
6. Conclusion

Many selection or classification problems can be viewed from the screening test paradigm. The context of the application determines the relative costs of a misclassification or erroneous identification. In criminal trials, society has decided that the cost of an erroneous conviction far outweighs the cost of an erroneous acquittal, while in testing job applicants the cost of not hiring a competent worker is not as serious. The two types of error vary with the threshold or cut-off value, and the accuracy rates corresponding to these choices are summarized by the ROC curve. There is a burgeoning literature in this area as researchers are incorporating relevant covariates, e.g., prior health status or educational background, into the classification procedures. Recent issues of Biometrics, Multivariate Behavioral Research, Psychometrika, and Applied Psychological Measurement, as well as the medical journals cited in the article, contain a variety of articles presenting new techniques and applications of them to the problems discussed.

See also: Selection Bias, Statistics of
Bibliography

Arvey R D 1979 Fairness in Selecting Employees. Addison-Wesley, Reading, MA
Brookmeyer R, Gail M H 1994 AIDS Epidemiology: A Quantitative Approach. Oxford University Press, New York
Campbell G 1994 Advances in statistical methodology for the evaluation of diagnostic and laboratory tests. Statistics in Medicine 13: 499–508
Erkanli A, Soyer R, Stangl D 1997 Bayesian inference in two-phase prevalence studies. Statistics in Medicine 16: 1121–33
Faraone S V, Tsuang M T 1994 Measuring diagnostic accuracy in the absence of a gold standard. American Journal of Psychiatry 151: 650–7
Gail M H, Green S B 1976 A generalization of the one-sided two-sample Kolmogorov–Smirnov statistic for evaluating diagnostic tests. Biometrics 32: 561–70
Gastwirth J L 1987 The statistical precision of medical screening procedures: Application to polygraph and AIDS antibodies test data (with discussion). Statistical Science 2: 213–38
Gastwirth J L, Sinclair M D 1998 Diagnostic test methodology in the design and analysis of judge–jury agreement studies. Jurimetrics Journal 39: 59–78
Geisser S 1993 Predictive Inference. Chapman and Hall, London
Greenhouse S W, Mantel N 1950 The evaluation of diagnostic tests. Biometrics 16: 399–412
Gupta R, Prakash H, Gupta V P, Gupta K D 1997 Prevalence and determinants of coronary heart disease in a rural population of India. Journal of Clinical Epidemiology 50: 203–9
Hartge P, Struewing J P, Wacholder S, Brody L C, Tucker M A 1999 The prevalence of common BRCA1 and BRCA2 mutations among Ashkenazi Jews. American Journal of Human Genetics 64: 963–70
Hemenway D 1997 The myth of millions of annual self-defense gun uses: A case study of survey overestimates of rare events. Chance 10: 6–10
Hilden J 1979 A further comment on ‘Estimating prevalence from the results of a screening test.’ American Journal of Epidemiology 109: 721–2
Hilgers R A 1991 Distribution-free confidence bounds for ROC curves. Methods and Information Medicine 30: 96–101
Hodiamont P, Peer N, Syben N 1987 Epidemiological aspects of psychiatric disorder in a Dutch health area. Psychological Medicine 17: 495–505
Hseih F S, Turnbull B W 1996 Nonparametric methods for evaluating diagnostic tests. Statistics Sinica 6: 47–62
Hui S L, Walter S D 1980 Estimating the error rates of diagnostic tests. Biometrics 36: 167–71
Hui S L, Zhou X H 1998 Evaluation of diagnostic tests without gold standards. Statistical Methods in Medical Research 7: 354–70
Johnson W O, Gastwirth J L, Pearson L M 2001 Screening without a gold standard: The Hui–Walter paradigm revisited. American Journal of Epidemiology 153: 921–4
Katz R, Stephen J, Shaw B F, Matthew A, Newman F, Rosenbluth M 1995 The East York health needs study: Prevalence of DSM-III-R psychiatric disorder in a sample of Canadian women. British Journal of Psychiatry 166: 100–6
Koehler J J 1996 The base rate fallacy reconsidered: descriptive, normative and methodological challenges (with discussion). Behavioral and Brain Sciences 19: 1–53
Laurikka J, Laara E, Sisto T, Tarkka M, Auvinen O, Hakama M 1995 Misclassification in a questionnaire survey of varicose veins. Journal of Clinical Epidemiology 48: 1175–8
Nusbacher J, Chiavetta J, Naiman R, Buchner B, Scalia V, Horst R 1986 Evaluation of a confidential method of excluding blood donors exposed to human immunodeficiency virus. Transfusion 26: 539–41
Paetzold R, Willborn S 1994 Statistical Proof of Discrimination. Shepard’s/McGraw Hill, Colorado Springs, CO
Rahme E, Joseph L 1998 Estimating the prevalence of a rare disease: adjusted maximum likelihood. The Statistician 47: 149–58
Sinclair M D, Gastwirth J L 1994 On procedures for evaluating the effectiveness of reinterview survey methods: Application to labor force data. Journal of American Statistical Assn 91: 961–9
Sorenson S B 1998 Identifying Hispanics in existing databases. Evaluation Review 22: 520–34
Vacek P M 1985 The effect of conditional dependence on the evaluation of diagnostic tests. Biometrics 41: 959–68
Walter S D, Irwig L M 1988 Estimation of test error rates, disease prevalence and relative risk from misclassified data: A review. Journal of Clinical Epidemiology 41: 923–37
Wieand S, Gail M H, James B R, James K 1989 A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 76: 585–92
Yanagawa T, Tokudome S 1990 Use of screening tests to assess cancer risk and to estimate the risk of adult T-cell leukemia/lymphoma. Environmental Health Perspectives 87: 77–82
Yang I, Becker M P 1997 Latent variable modeling of diagnostic accuracy. Biometrics 52: 948–58
Zou K H, Hall W J, Shapiro D E 1997 Smooth non-parametric receiver operating characteristic (ROC) curves for continuous diagnostic tests. Statistics in Medicine 16: 214–56
J. L. Gastwirth
Search, Economics of

The economics of search studies the implications of market frictions for economic behavior and market performance. ‘Frictions’ in this context include anything that interferes with the smooth and instantaneous exchange of goods and services. The most commonly studied problems arise from imperfect information about the location of buyers and sellers, their prices, and the quality of the goods and services that they trade. The key implication of these frictions is that individuals are prepared to spend time and other resources on exchange; they search before buying or selling. The labor market has attracted most theoretical and empirical interest in this area of research, because of the heterogeneities that characterize it and the existence of good data on flows of workers and jobs between activity and inactivity, which can be used to test its propositions.
1. Historical Background

The first formal model of individual behavior, due to Stigler (1961), was in the context of a goods market: choosing the optimal number of sellers to search before buying at the lowest price. Stigler’s rule, known as a fixed-sample rule, was abandoned in favor of sequential stopping rules: choosing the optimal reservation price and buying at the first store encountered that sells at or below the reservation price (see McCall 1970 for an early influential paper). The first major impetus to research in the economics of search came with the publication of Phelps et al. (1970), which showed that search theory could be used to analyze the natural rate of unemployment and the inflation-unemployment trade-off, the central research questions of macroeconomics at that time. Although interest in the inflation-unemployment trade-off has since waned, interest in search theory as a tool to analyze transitions in the labor market and equilibrium unemployment has increased. The momentum in this direction came in the 1980s, when contributions by Diamond (1982), Mortensen (1982) and Pissarides (1985) showed that search theory could be used to construct equilibrium models of the labor market with more accurate predictions than the traditional neoclassical model (see Pissarides 2000, Mortensen and Pissarides 1999a, 1999b for reviews). The appearance of comprehensive data on job and worker flows, which can be studied with the tools of search theory, also contributed to interest in this direction (see Leonard 1987, Dunne et al. 1989, Davis et al. 1996, Blanchard and Diamond 1990). This article reviews major developments since the mid-1980s, with explicit reference to labor markets.
2. Job Search

An individual has one unit of labor to sell to firms, which create jobs. The valuation of labor takes place under the assumptions that agents have infinite horizons and discount future income flows at the constant rate r, that they know the future path of prices and wages and the stochastic processes that govern the arrival of trading partners, and that they maximize the present discounted value of expected incomes. Let $U_t$ be the expected present discounted value of a unit of labor before trade at time t (the ‘value’ of an unemployed worker) and $W_t$ the expected value of an employed worker. During a short time interval δt the unemployed worker receives income bδt, and a job offer arrives with probability aδt. The frictions studied in the economics of search are summarized in the arrival process. In the absence of frictions, $a \to \infty$. With frictions and search, $a > 0$; with no search, $a = 0$. A large part of the literature is devoted to specifying the arrival process, an issue addressed in Sect. 3.1. The choice of search intensity has also been studied, by making a an increasing function of search effort, but this issue is not addressed here (see Pissarides 2000, Chap. 5). If a job offer arrives, the individual has the option of taking it, for an expected return $W_{t+\delta t}$, or of not taking it and keeping instead the return $U_{t+\delta t}$. If no offer arrives, the individual’s return is $U_{t+\delta t}$. Therefore, with discount rate r, $U_t$ satisfies the Bellman equation

\[
U_t = b\,\delta t + a\,\delta t\,\frac{\max(W_{t+\delta t},\, U_{t+\delta t})}{1 + r\,\delta t} + (1 - a\,\delta t)\,\frac{U_{t+\delta t}}{1 + r\,\delta t} \tag{1}
\]

Rearrangement of terms, dropping a term of order δt that vanishes in the limit, yields

\[
rU_t = b + a\left(\max(W_{t+\delta t},\, U_{t+\delta t}) - U_{t+\delta t}\right) + \frac{U_{t+\delta t} - U_t}{\delta t} \tag{2}
\]

Taking the limit of (2) as $\delta t \to 0$, and omitting subscripts for convenience, yields

\[
rU = b + a\left(\max(W, U) - U\right) + \dot{U} \tag{3}
\]
where $\dot{U}$ denotes the rate of change of U. Equation (3) is a fundamental equation in the economics of search. It can be given the interpretation of an arbitrage equation for the valuation of an asset in a perfect capital market with risk-free interest rate r. This asset yields coupon payment b and, at some rate a, gives its holder the option of a discrete change in its valuation, from U to W. Optimality requires that the option is taken (and the existing valuation given up) if $W \ge U$. The last term, $\dot{U}$, shows capital gains or losses due to changes in the market valuation of the asset. In most of the economics of labor-market search, however, research concentrates on ‘steady states,’ namely, on situations where the discount rate, transition rates, and income flows are all constant. With infinite horizons there are then stationary solutions to the valuation equations, obtained from (3) with $\dot{U} = 0$. One simple way of solving (3) is to assume that employment is an ‘absorbing state,’ so that when a job that offers wage w is accepted, it is kept for life. Then $W = w/r$, and if the individual is sampling from a known wage offer distribution F(w), the stationary version of (3) satisfies

\[
rU = b + a\left(\int \max(w/r,\, U)\, dF(w) - U\right) \tag{4}
\]

The option to accept a job offer is taken if $w/r \ge U$, giving the reservation wage equation

\[
\xi = rU \tag{5}
\]

The reservation wage is defined as the minimum acceptable wage, and it is obtained as the solution to (4) and (5). Partial models of search and empirical research on the duration of unemployment have explored generalized forms of Eqn. (4) to derive the properties of transitions of individuals from unemployment to employment (see Devine and Kiefer 1991). For a known F(w) with upper support A, Eqn. (4) specializes to

\[
\xi = b + \frac{a}{r}\int_{\xi}^{A} (w - \xi)\, dF(w) \tag{6}
\]
Various forms of (6) have been estimated in the empirical literature or used in the construction of partial models of the labor market. The transition rate from unemployment to employment (the unemployment ‘hazard rate’) is $a(1 - F(\xi))$, and it depends both on the arrival of offers and on the individual’s reservation wage. The parameters a and r and those of the wage offer distribution can be made to depend on the individual’s characteristics. The empirical literature has generally found that unemployment compensation acts as a disincentive on individual transitions through the influence of b on the reservation wage, but the effect is not strong. The offer arrival rate increases the hazard rate, despite the fact that the reservation wage increases in a. A number of personal characteristics influence reservation wages and transitions, including age, education, and race.
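The reservation wage defined by Eqn. (6) usually has to be found numerically. The following is a minimal sketch that solves (6) by bisection for a wage offer density supplied by the caller; the uniform offer distribution on [0, 1], the parameter values, and the function names are illustrative assumptions, not part of the article.

```python
def surplus_integral(xi, pdf, upper, n=4000):
    """Midpoint-rule value of the integral of (w - xi) * pdf(w) over [xi, upper]."""
    if xi >= upper:
        return 0.0
    h = (upper - xi) / n
    # at the i-th midpoint, w - xi equals (i + 0.5) * h
    return sum((i + 0.5) * h * pdf(xi + (i + 0.5) * h) for i in range(n)) * h

def reservation_wage(b, a, r, pdf, upper, tol=1e-10):
    """Bisection on xi - b - (a / r) * integral, which is increasing in xi."""
    def gap(xi):
        return xi - b - (a / r) * surplus_integral(xi, pdf, upper)
    lo, hi = b, max(b, upper)        # gap(lo) <= 0 <= gap(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def uniform_pdf(w):
    return 1.0 if 0.0 <= w <= 1.0 else 0.0

b, a, r = 0.2, 0.5, 0.05             # illustrative values only
xi = reservation_wage(b, a, r, uniform_pdf, upper=1.0)
hazard = a * (1.0 - xi)              # a * (1 - F(xi)) with F uniform on [0, 1]
print(f"reservation wage = {xi:.4f}, hazard rate = {hazard:.4f}")
```

Raising b or a in this sketch raises the reservation wage, so a higher offer arrival rate has two opposing effects on the hazard rate, as noted in the text.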
3. Two-sided Matching for Given Wages

Recent work in the economics of search has focused mainly on the equilibrium implications of frictions and search decisions. An equilibrium model needs to specify the decisions of firms and solve for the offer arrival rate a. In addition, a mechanism is needed to ensure that search is an ongoing process. The latter is achieved by introducing a probability λδt that a negative shock will hit a job during a short time interval δt. When the negative shock arrives the job is closed down (‘destroyed’), and the worker has to search again to find another job. For the moment, λ is assumed to be a positive constant (see Sect. 5).
3.1. The Aggregate Matching Function

To derive the equilibrium offer arrival rate, suppose that at time t there are u unemployed workers and v vacant jobs. In a short time interval δt each unemployed worker moves to employment with probability aδt, so in a large market the total flow of workers from unemployment to employment, and the total flow of jobs from the vacant state to production, are both auδt. A key assumption in the equilibrium literature is that the total flows satisfy an aggregate matching function. The aggregate matching function is a black box that gives the outcome of the search process in terms of the inputs into search. If the u unemployed workers are the only job seekers and they search with fixed intensity of one unit each, and firms also search with fixed intensity of one unit for each job vacancy, the matching function gives

\[
m = m(u, v) \tag{7}
\]

with m standing for the flow of matches, au. The function is usually assumed to be continuous and differentiable, with positive first partial derivatives and negative second derivatives, and to satisfy constant returns to scale (see Petrongolo and Pissarides 2000 for a review). A commonly used matching function in the theoretical literature, derived from the assumption of uncoordinated random search, is the exponential

\[
m = v\left(1 - e^{-ku/v}\right), \qquad k > 0 \tag{8}
\]

The empirical literature, however, estimates a log-linear (constant elasticity) form, which parallels the Cobb–Douglas production function specification, with the elasticity on unemployment estimated in the range 0.5–0.7. The fact that job matching is pairwise implies that the transition rates of jobs and workers are related Poisson processes. Given au = m, the rate at which workers find jobs is a = m/u. If q is the rate of arrival of workers to vacant jobs, then total job flows are qv = au, and so q = m/v. The equilibrium literature generally ignores individual differences and treats the average rates m/u and m/v as the rates at which jobs and workers, respectively, arrive to each searching worker and vacant job. By the properties of the matching function,

\[
q = m\!\left(\frac{u}{v},\, 1\right) \tag{9}
\]

\[
\equiv m(\theta^{-1}, 1) \equiv q(\theta) \tag{10}
\]

with $q'(\theta) < 0$ and elasticity $-\eta(\theta) \in (-1, 0)$. Here, θ = v/u is a measure of the tightness of the market, the ratio of the inputs of firms into search to the inputs of workers. Similarly, the transition rate of workers is

\[
a = m\!\left(1,\, \frac{v}{u}\right) \equiv \theta q(\theta) \tag{11}
\]
By the elasticity properties of q(θ), $\partial a/\partial\theta > 0$. In the steady state, the inverses of the transition rates, 1/q(θ) and 1/(θq(θ)), are the expected durations of a vacancy and of unemployment, respectively. The influence of tightness on the transition rates is independent of the level of wage rates. If there are more vacant jobs for each unemployed worker, the arrival rate of workers to the typical vacancy is lower and the arrival rate of job offers to the typical unemployed worker is higher, irrespective of the level of wages. When a worker and a firm meet and are considering whether or not to stay together, they are not likely to take into account the implications of their action for market tightness and the transition rates of other unmatched agents. For this reason, the influence of tightness on the transition rates is known as a search externality. Several papers in the economics of search have explored the efficiency properties of equilibrium given the existence of the externality (see Sect. 4.3.2 and Diamond 1982, Mortensen 1982, Pissarides 1984, Hosios 1990).

3.2 Job Creation

A job is an asset owned by the firm and is valued in a perfect capital market characterized by the same risk-free interest rate r. Suppose that in order to recruit a worker, a firm has to bear set-up cost K to open a job vacancy and in addition has to pay a given flow cost c for the duration of the vacancy. The flow cost can be interpreted as an advertising and recruitment cost, or as the cost of having an unfilled position in what might be a complex business environment (which is not modeled). Let V be the value of a vacant position and J the value of a filled one. Reasoning as in the case of the value of a job seeker, U, the Bellman equation satisfied by V is

\[
rV = -c + q(\theta)(J - V) \tag{12}
\]

The vacant job costs c and the firm is offered the option to take a worker at rate q(θ). Since the firm has to pay set-up cost K to create a job, it will have an incentive to open a job if (and only if) $V \ge K$. The key assumption made about job creation is that the gains from job creation are always exhausted, so jobs are created up to the point where

\[
V = K \tag{13}
\]

Substitution of V = K from (13) into (12) yields

\[
J = K + \frac{rK + c}{q(\theta)} \tag{14}
\]

Competition requires that the expected present discounted value of profit when the worker arrives, the value of a filled job J, should be just sufficient to cover the initial cost K and the accumulated costs for the duration of the vacancy: interest on the initial outlay, rK, and ongoing costs, c, for the expected duration of the vacancy, 1/q(θ). Let productivity be a constant p in all jobs and the wage rate a constant w. With break-up rate λ, the value of a job satisfies the Bellman equation

\[
rJ = p - w - \lambda J \tag{15}
\]

The flow of profit to the firm is p − w until a negative shock arrives that reduces its value to 0. Replacing J in (14) by its expression in (15) yields the job creation condition

\[
\frac{p - w}{r + \lambda} - K - \frac{rK + c}{q(\theta)} = 0 \tag{16}
\]
Equation (16) determines θ for each w and parallels the conventional labor demand curve. A higher wage rate makes it more expensive for firms to open jobs, leading to lower market tightness. Frictions slow down the arrival of suitable workers to vacant jobs and so the firm incurs some additional recruitment costs. If the arrival rate is infinitely fast, as in Walrasian economics, q (θ) is infinite and the last term in (16) disappears. The assumptions underlying (13) ensure that at the margin, the recruitment cost is just covered.
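The job creation condition can be solved in closed form once a matching function is chosen. The sketch below assumes the constant-elasticity (Cobb–Douglas) form, under which q(θ) = m0 · θ^(−η); the scale parameter m0, all parameter values, and the function names are illustrative assumptions rather than values from the article.

```python
def vacancy_filling_rate(theta, m0=1.0, eta=0.5):
    """q(theta) = m0 * theta**(-eta) under the assumed Cobb-Douglas matching function."""
    return m0 * theta ** (-eta)

def tightness_given_wage(w, p=1.0, r=0.05, lam=0.1, K=0.1, c=0.3, m0=1.0, eta=0.5):
    """Solve (p - w)/(r + lam) - K - (r*K + c)/q(theta) = 0 for theta."""
    net_value = (p - w) / (r + lam) - K     # value of a filled job net of set-up cost
    if net_value <= 0.0:
        return 0.0                          # no vacancy is worth opening
    q_required = (r * K + c) / net_value    # filling rate that just covers recruitment costs
    return (m0 / q_required) ** (1.0 / eta)

for w in (0.70, 0.80, 0.90):
    theta = tightness_given_wage(w)
    q = vacancy_filling_rate(theta)
    print(f"w={w:.2f}: theta={theta:.3f}, q={q:.3f}, job-finding rate={theta * q:.3f}")
```

In this sketch a higher wage lowers tightness, tracing out a downward-sloping relation between w and θ analogous to a labor demand curve.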
4. Wage Setting With search frictions, there is no supply of labor that can be equated with demand to give wages. The conventional supply of labor is constant here: there is a fixed number of workers in the market, which is usually normalized to unity, and each supplies a single unit of labor. In search equilibrium there are local monopoly rents. Firms and workers who are together can start producing immediately. If they break up, they can start producing only after they find another partner through an expensive process of search. The value of the search costs that they save by staying together correspond to a pure economic rent; they could be taken away from them and they would still stay together. Wages need to share those rents. Two approaches dominate in the literature. The first and more commonly-used approach employs the solution to a Nash bargaining problem (see Diamond 1982, Pissarides 2000). The Nash solution allocates the rents according to each side’s ‘threat points.’ The threat points in this case are the returns from search, and the Nash solution gives to each side the same surplus over and above their expected returns from search. Generalizing this solution concept, wages are determined such that the net gain accruing to the 13763
Search, Economics of worker from the match, WkU, is a fixed proportion β of the total surplus, the sum of the worker’s and the firm’s surplus, JkV. This sharing rule can be obtained as the maximization of the product (WkU ) β ( JkV )"−β, β ? (0, 1)
been found by a number of authors (e.g., Blanchflower and Oswald 1994), although it should be noted that similar wage equations can also be derived from other theoretical frameworks.
(17)
The coefficient β can be given the interpretation of bargaining strength, although strictly speaking, bargaining strength in the conventional Nash solution is given by the threat points U and V (see Binmore et al. 1986 for an interpretation of β in terms of rates of time preference). The second approach to wage determination postulates that the firm ‘posts’ a wage rate for the job, which the worker either takes or leaves. The posted wage may be above the worker’s reservation wage because of some ‘efficiency wage’ type arguments, for example, in order to reduce labor turnover, encourage more effort or attract more job applicants.
4.1 Bargaining The value of an employed worker, W, satisfies the Bellman equation

$$ rW = w - \lambda (W - U) \tag{18} $$

The worker earns wage w and gives up the employment gain $W - U$ when the negative shock arrives. No worker has an incentive to quit into unemployment or search for another job whilst employed, for as long as $w \geq b$, which is assumed. With (15) and (17) in place, the Nash bargaining solution to (18) gives the sharing rule

$$ W - U = \beta (W - U + J - V) \tag{19} $$

Making use of the value equations and sharing rule yields

$$ w = rU + \beta\,[\,p - (r + \lambda) K - rU\,] \tag{20} $$

$$ w = (1 - \beta)\, b + \beta\,[\,p - (r + \lambda) K + (rK + c)\,\theta\,] \tag{21} $$

There is a premium on the reservation wage which depends on the worker’s bargaining strength and the net surplus produced. Wages depend positively on unemployment income and the productivity of the job, the first because of the effect that unemployment income has on the cost of unemployment and the second because of the monopoly rents and bargaining. Wages also depend on market tightness. In tighter markets they are higher, because the expected duration of unemployment in the event of disagreement is less. Empirical evidence supporting this wage equation has [...]

4.2 Equilibrium Equation (21) replaces the conventional labor supply curve and closes the system. When combined with the job creation condition (16) it gives unique solutions for wages and market tightness. A variety of intuitive properties are satisfied by this equilibrium. For example, higher labor productivity implies higher wages and tightness; higher unemployment income implies higher wages but lower tightness. It remains to obtain the employment rate in equilibrium. The labor force size is fixed, so by appropriate normalizations, if at some time t unemployment is $u_t$, employment is $1 - u_t$. In a short time interval δt, $a_t u_t\, \delta t$ workers are matched and $\lambda (1 - u_t)\, \delta t$ workers lose their jobs. Given that $a_t = \theta_t q(\theta_t)$, the evolution of unemployment is given by

$$ u_{t+\delta t} = u_t + \lambda (1 - u_t)\, \delta t - \theta_t q(\theta_t)\, u_t\, \delta t \tag{22} $$

Dividing through by δt and taking the limit as δt → 0 yields

$$ \dot{u} = \lambda (1 - u) - \theta q(\theta)\, u \tag{23} $$

Because the solution for θ is independent of u, this is a stable differential equation for unemployment with a unique equilibrium

$$ u = \frac{\lambda}{\lambda + \theta q(\theta)} \tag{24} $$
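The unemployment dynamics in (22) to (24) can be illustrated numerically. The following is a minimal sketch, not part of the original article: it assumes a Cobb-Douglas form $q(\theta) = A\theta^{-\eta}$ for the vacancy-filling rate, treats tightness θ as a given constant rather than solving the job creation condition (16), and uses hypothetical parameter values and function names.

```python
def job_finding_rate(theta, A=1.0, eta=0.5):
    """a = theta * q(theta), using the ASSUMED form q(theta) = A * theta**(-eta)."""
    return theta * A * theta ** (-eta)


def steady_state_unemployment(lam, theta, A=1.0, eta=0.5):
    """Equation (24): u = lam / (lam + theta * q(theta))."""
    a = job_finding_rate(theta, A, eta)
    return lam / (lam + a)


def simulate_unemployment(u0, lam, theta, dt=0.01, steps=5000, A=1.0, eta=0.5):
    """Iterate equation (22) forward with constant tightness theta."""
    a = job_finding_rate(theta, A, eta)
    u = u0
    for _ in range(steps):
        # inflow: lam * (1 - u) * dt job losses; outflow: a * u * dt matches
        u = u + lam * (1.0 - u) * dt - a * u * dt
    return u


if __name__ == "__main__":
    lam, theta = 0.1, 1.0  # hypothetical values, for illustration only
    u_star = steady_state_unemployment(lam, theta)
    u_end = simulate_unemployment(u0=0.5, lam=lam, theta=theta)
    print(f"steady state u* = {u_star:.4f}, simulated u = {u_end:.4f}")
```

Starting from any initial unemployment rate between 0 and 1, the simulated path approaches the value given by (24), which is the sense in which the differential equation (23) is stable.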
Equation (24) is often referred to as the Beveridge curve, after William Beveridge who first described such a ‘frictional’ equilibrium. Plotted in a space with vacancies on the vertical axis and unemployment on the horizontal, it is a convex-to-the-origin curve. In the early literature the impact of frictions on the labor market was measured by the distance of this curve from the origin (see also Pissarides 2000, Blanchard and Diamond 1989).
4.3 Wage Posting Wage posting is an alternative to wage bargaining, with different implications for search equilibrium. The firm posts a wage for the job and the searching worker either takes it or leaves it. Three different models of wage posting are examined.
4.3.1 Single offers, no prior information. Workers search sequentially one firm at a time, they discover the firm’s offer after they have made contact and they have to accept it or reject it (with no recall) before they can sample another firm. The worker who contacts a firm that posts wage $w_i$ has two options: accept the wage offer of the firm and enjoy expected return $W_i$, obtained from (18) for $w = w_i$, or reject it and continue search for return U. The firm that maximizes profit chooses $w_i$ subject to $W_i \geq U$. Since the firm will have no incentive to offer the worker anything over and above the minimum required to make workers accept its offer, in this model wages are driven to the worker’s reservation wage, $w_i = b$ (Diamond 1971). In terms of the bargaining solution, the ‘Diamond’ equilibrium (see (20), (21)) requires $\beta = 0$. The model can then be solved as before, by replacing β by 0 in the job creation condition and wage equation. Tightness, and consequently unemployment, absorb all shocks other than those operating through unemployment income, which also change wages. (There is a paradox in this model, in that if wages are equal to unemployment income no worker will have an incentive to search.)
4.3.2 Competitive search. The next model increases the amount of information that workers have before they contact the firm (Moen 1997). Workers can see the wage posted by each firm but because of frictions they cannot be certain that they will get the job if they apply. If the probability of a job offer across firms is the same, all workers will apply for the highest-wage job. Queues will build up and this will reduce the probability of an offer. In equilibrium, the wage offer and queue characterizing each job have to balance each other out, so that all firms get applicants. The length of the queue is derived from the matching process. The firm that posts wage $w_i$ makes an offer on average after $1/q(\theta_i)$ periods and a job applicant gets an offer on average after $1/[\theta_i q(\theta_i)]$ periods. Implicit in this formulation is the assumption that more than one firm offers the same wage, and firms compete for the applicants at this wage. Workers apply to only one job at a time. Suppose now there is a firm, or group of firms, such that when workers join their queue they derive expected income stream $\bar{U}$, the highest in the market. The constraint facing a firm when choosing its wage offer is that the worker who applies to it derives at least as much expected utility as $\bar{U}$. The expected profit of the firm that posts wage $w_i$ solves

$$ rV_i = -c + q(\theta_i)\,(J_i - V_i) \tag{25} $$

$$ rJ_i = p - w_i - \lambda J_i \tag{26} $$

The worker’s expected returns from applying to the firm posting this wage satisfy the system of equations

$$ rU_i = b + \theta_i q(\theta_i)\,(W_i - U_i) \tag{27} $$

$$ rW_i = w_i - \lambda\,(W_i - \bar{U}) \tag{28} $$

The firm chooses $w_i$ to maximize $V_i$ subject to $U_i \geq \bar{U}$. The first-order maximization conditions imply that all firms offer the same wage, which satisfies

$$ W - U = \frac{\eta}{1 - \eta}\,(J - V) \tag{29} $$
Comparison of (29) with (19) shows that the solution is indeed similar to the Nash solution but with the share of labor given by the (negative of the) elasticity of q(θ), η (which is equal to the unemployment elasticity of the underlying matching function). The rest of the model can be solved as in the Nash case. There is a special significance to the share of labor obtained in this formulation. In the case where wages are determined according to the Nash rules, the firm and worker choose the wage after they meet, so it is unlikely that they will internalize the search externalities; they do not take into account the effect of their choices on the transition rates of unmatched agents. It can be shown that with constant returns to scale there is a unique internalizing rule which requires $\beta = \eta$, the solution of the wage posting model considered here (Hosios 1990, Pissarides 1984, 2000). For this reason, this particular wage posting model is often called the competitive search equilibrium. The key assumption that gives efficiency is the relaxation of the informational restrictions on workers, which lets them know both the firm’s wage offer and the length of the queue associated with it.
4.3.3 Wage differentials. The third version of the wage posting model relaxes the assumption that workers have the choice of at most one wage offer at a time but does not allow workers knowledge of the wage offer before they apply (Burdett and Judd 1983, Burdett and Mortensen 1998, Montgomery 1991). The easiest way to introduce this is to allow workers the possibility of search on the job, i.e., to let them continue looking for a better job after they accepted one. Suppose for simplicity that job offers arrive to searching workers at the same rate a, irrespective of whether they are employed or unemployed. Suppose also that job search is costless (except for the time cost) and job changing is costless. Then, if a worker is earning w now and another offer paying $w'$ comes along, the worker accepts the new offer if (and only if) $w' > w$. The worker’s reservation wage is the current wage.
Unemployment pays b, so no firm can pay below b and attract workers. Consider a firm that pays just above b. Anyone applying for a job from the state of unemployment will accept its offer, but no one else will. Employed job seekers will have no incentive to quit their jobs to work at a wage close to b. In addition, this firm’s workers will be quitting to join other firms, which may be paying above b. So a firm paying a low wage will have high turnover and will be waiting long before it can fill its vacancies. A firm paying a high wage will be attracting workers both from unemployment and from other firms paying less than itself. Moreover, it will not be losing workers to other firms. So high-wage firms will have fewer vacant positions. Now suppose the two firms have access to the same technology. The low-wage firm enjoys a lot of profit from each position, but has a lot of vacant positions. The high-wage firm enjoys less profit from each position, but has them filled. It is possible to show under general conditions that high-wage and low-wage firms will co-exist in equilibrium. Burdett and Mortensen (1998) show this by assuming that there is a distribution of wage offers for homogeneous labor, F(w), and demonstrating that (a) no two firms will offer the same wage, (b) firms can choose any wage between the minimum b and a maximum $\bar{w}$, and enjoy the same profit in the steady state. The maximum is given by

$$ \bar{w} = p - \left( \frac{\lambda}{a + \lambda} \right)^{2} (p - b) \tag{30} $$
where as before p is the productivity of each worker and λ the job destruction rate. The interesting result about wage posting in this model is that once the assumption that the worker can only consider one offer at a time is relaxed, a distribution of wage offers for homogeneous labor arises. The distribution satisfies some appealing properties. As $a \to 0$, the upper support tends to $\bar{w} = b$, the Diamond solution. At the other extreme, as $a \to \infty$, it can be shown that all wage offers converge to p, the competitive solution. Intuitively, $a = 0$ maximizes the frictions suffered by workers and $a = \infty$ eliminates them.
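The limiting behavior of the upper support can be checked numerically. The sketch below simply evaluates equation (30) for a range of offer arrival rates; the function name and the parameter values are hypothetical, chosen only for illustration.

```python
def upper_support(p, b, a, lam):
    """Equation (30): w_bar = p - (lam / (a + lam))**2 * (p - b)."""
    return p - (lam / (a + lam)) ** 2 * (p - b)


if __name__ == "__main__":
    p, b, lam = 1.0, 0.4, 0.1  # hypothetical productivity, unemployment income, destruction rate
    for a in (0.0, 0.1, 1.0, 10.0, 1000.0):
        # a = 0 gives the Diamond wage b; large a approaches productivity p
        print(f"a = {a:8.1f}  ->  w_bar = {upper_support(p, b, a, lam):.6f}")
```

The upper support rises monotonically from b towards p as a increases, matching the two limits described in the text.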
5. Job Destruction A body of empirical literature shows that there is a lot of ‘job churning,’ with many jobs closing down and new ones opening to take their place. It is also found that both the job creation and job destruction rates, especially the latter, vary a lot over the cycle (see Leonard 1987, Dunne et al. 1989, Davis et al. 1996). This is an issue addressed by search theorists. Returning to the model with a Nash wage rule, the job creation flow is the matching rate m(u, v). The job creation rate is defined as the job creation flow divided by employment, $m(u, v)/(1 - u)$, which is

$$ \frac{m(u, v)}{1 - u} = \theta q(\theta)\, \frac{u}{1 - u} \tag{31} $$
Since both θ and u are endogenous variables of the model that respond to shocks, the model predicts a variable job creation rate that can be tested against the data (see Mortensen and Pissarides 1994, Cole and Rogerson 1999). But the job destruction flow is $\lambda (1 - u)$, so the job destruction rate is a constant λ, contrary to observation. Two alternative ways of making it variable are considered.
5.1 Idiosyncratic Shocks In the discussion so far, the productivity of a job is p until a negative shock arrives that reduces it to zero. Generalizing this idea, suppose that although initially the productivity is p, when a shock arrives it changes it to some other value px. The component p is common to all jobs but x is specific to each job. It has distribution G(x) in the range [0, 1]. New jobs start with specific productivity $x = 1$; over time shocks arrive at rate λ that transform this productivity to a value between 0 and 1, according to the distribution G. The firm has the choice of either continuing to produce at the new productivity, or closing the job down. The idea is that initially firms have a choice over their product type and technique and choose the combination that yields maximum productivity. Over time techniques are not reversible, so if the payoffs from a given choice change, the firm has the choice of either continuing in the new environment or destroying the job. As in the case where workers are faced with take-it-or-leave-it choices from a given wage distribution, Mortensen and Pissarides (1994) show that when the firm has the choice of either taking a productivity or leaving it, its decision is governed by a reservation productivity, denoted R. The reservation productivity depends on all the parameters of the model. Profit maximization under the Nash solution to the wage bargain implies that both workers and firms agree about the optimal choice of R, i.e., which jobs should be closed and which should continue in operation. With knowledge of R, the flow of job closures generalizes to $\lambda G(R)(1 - u)$, the fraction of jobs that get shocks below the reservation productivity. The job destruction rate then becomes $\lambda G(R)$, which responds to the parameters of the economy through the responses of R to shocks. An interesting feature of the job creation and job destruction rates, which conforms to observation, is that because the job creation rate depends on unemployment, which is a slow-moving variable, whereas the job destruction rate does not depend on it, the job destruction rate is more volatile than the job creation rate. For example, a rise in general productivity p, associated with a positive cyclical shock, reduces job destruction immediately by reducing the reservation productivity, but increases job creation at first and then reduces it, as tightness rises at first but unemployment falls in response.
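The two rates discussed above can be computed side by side. This is a rough sketch under stated assumptions rather than the model of Mortensen and Pissarides (1994) itself: G is taken to be uniform on [0, 1], q(θ) is given an assumed Cobb-Douglas form, R and θ are treated as given numbers instead of being solved from the model, and the steady-state unemployment formula is an inference obtained by equating the two flows in the same way as for equation (24). All names and values are hypothetical.

```python
def uniform_cdf(x):
    """ASSUMED distribution G: uniform on [0, 1]."""
    return min(max(x, 0.0), 1.0)


def q(theta):
    """ASSUMED vacancy-filling rate q(theta) = theta**(-0.5)."""
    return theta ** -0.5


def job_destruction_rate(lam, R, G=uniform_cdf):
    """lam * G(R): shocks arrive at rate lam and close the job when x falls below R."""
    return lam * G(R)


def job_creation_rate(theta, u):
    """Equation (31): theta * q(theta) * u / (1 - u)."""
    return theta * q(theta) * u / (1.0 - u)


def steady_state_unemployment(lam, R, theta, G=uniform_cdf):
    """Unemployment at which job creation and job destruction flows balance."""
    s = job_destruction_rate(lam, R, G)
    return s / (s + theta * q(theta))


if __name__ == "__main__":
    lam, R, theta = 0.1, 0.3, 1.0  # hypothetical values
    u = steady_state_unemployment(lam, R, theta)
    print(job_destruction_rate(lam, R), job_creation_rate(theta, u))
```

Away from the steady state the destruction rate moves immediately with R, whereas the creation rate also depends on the current level of u, which adjusts only gradually.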
5.2 Technological Progress and Obsolescence Another way of modeling job destruction borrows ideas from Schumpeter’s theory of growth through creative destruction (see Aghion and Howitt 1994, Caballero and Hammour 1994). Jobs are again created at the best technology but once created, their technology cannot be updated. During technological progress, the owners of the jobs have the option of continuing in production with the initial technology or closing the job down and opening another, more advanced one. As new jobs are technologically more advanced, the wage offers that workers can get from outside improve over time. There comes a time when the worker’s outside options have risen sufficiently to render the job obsolete. Formally, the model can be set up as before, with the technology of the job fixed at the frontier technology at creation time and wages growing over time because of growth in the returns from search. The value of a job created at time 0 and becoming obsolete at T is

$$ J_0 = \int_0^T e^{-(r + \lambda) t}\,[\,p(0) - w(t)\,]\,dt \tag{32} $$

As before, r is the discount rate and λ is the arrival rate of negative shocks that may lead to earlier job destruction. p(0) is the initial best technology and w(t) the growing wage rate. The job is destroyed when w(t) reaches p(0); i.e., the job life that maximizes $J_0$ is defined by $w(T^*) = p(0)$. A useful restriction to have when there is growth is to assume that both unemployment income and the cost of recruitment grow at the exogenous rate of growth of the economy. As an example, consider the wage equation (21) with the restriction $K = 0$ and $b(t) = b e^{gt}$, $c(t) = c e^{gt}$, where g is the rate of growth of the economy. $T^*$ then satisfies

$$ e^{gT^*} = \frac{(1 - \beta)\, p}{(1 - \beta)\, b + \beta c \theta} \tag{33} $$
Jobs are destroyed more frequently when growth is faster, when unemployment income is higher and when the tightness of the market is higher.
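The step from the wage equation to the condition for $T^*$ can be written out explicitly. The derivation below is a sketch that uses only the restrictions stated above ($K = 0$, unemployment income and recruitment cost growing at rate g) and writes p for p(0); the closed form for $T^*$ in the last line is obtained by taking logarithms and is not stated in the original text.

```latex
\begin{align*}
w(t) &= (1-\beta)\, b\, e^{gt} + \beta \left[\, p + c\,\theta\, e^{gt} \,\right]
      && \text{(21) with } K = 0, \\
w(T^*) = p \;&\Longrightarrow\;
      e^{gT^*} \left[ (1-\beta)\, b + \beta c \theta \right] = (1-\beta)\, p, \\
T^* &= \frac{1}{g}\, \ln \frac{(1-\beta)\, p}{(1-\beta)\, b + \beta c \theta}.
\end{align*}
```

A positive job life requires $(1-\beta)p > (1-\beta)b + \beta c\theta$. Under that condition $T^*$ falls when g, b, or θ rises, which is the comparative-statics statement made in the text.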
Job destruction in this model has two components, the jobs destroyed because of the arrival of shocks, $\lambda (1 - u)$, and those destroyed because of obsolescence. The latter group was created $T^*$ periods earlier and survived to age $T^*$, $\theta q(\theta)\, u\, e^{-\lambda T^*}$ (note that in the steady state the job creation flow is a constant $\theta q(\theta) u$). Therefore, the job destruction rate now is $\lambda + \theta q(\theta)\, u\, e^{-\lambda T^*}/(1 - u)$, which varies in response to changes in $T^*$ but also in response to changes in the job creation rate $\theta q(\theta) u/(1 - u)$. See also: Behavioral Economics; Consumer Economics; Consumption, Sociology of; Economics: Overview; Information, Economics of; Labor Markets, Labor Movements, and Gender in Developing Nations; Labor Supply; Market Research; Market Structure and Performance; Stigler, George Joseph (1911–91); Transaction Costs and Property Rights; Wage Differentials and Structure; Work, Sociology of
Bibliography
Aghion P, Howitt P 1994 Growth and unemployment. Review of Economic Studies 61: 477–94
Binmore K G, Rubinstein A, Wolinsky A 1986 The Nash bargaining solution in economic modelling. Rand Journal of Economics 17: 176–88
Blanchard O J, Diamond P A 1989 The Beveridge curve. Brookings Papers on Economic Activity 1: 1–60
Blanchard O J, Diamond P A 1990 The cyclical behavior of the gross flows of US workers. Brookings Papers on Economic Activity 2: 85–155
Blanchflower D G, Oswald A J 1994 The Wage Curve. MIT Press, Cambridge, MA
Burdett K, Judd K 1983 Equilibrium price distributions. Econometrica 51: 955–70
Burdett K, Mortensen D T 1998 Wage differentials, employer size, and unemployment. International Economic Review 39: 257–73
Caballero R J, Hammour M L 1994 The cleansing effect of recessions. American Economic Review 84: 1350–68
Cole H L, Rogerson R 1999 Can the Mortensen–Pissarides matching model match the business cycle facts? International Economic Review 40: 933–59
Davis S J, Haltiwanger J C, Schuh S 1996 Job Creation and Destruction. MIT Press, Cambridge, MA
Devine T J, Kiefer N M 1991 Empirical Labor Economics: The Search Approach. Oxford University Press, Oxford
Diamond P A 1971 A model of price adjustment. Journal of Economic Theory 3: 156–68
Diamond P A 1982 Wage determination and efficiency in search equilibrium. Review of Economic Studies 49: 217–27
Dunne T, Roberts M J, Samuelson L 1989 Plant turnover and gross employment flows in the manufacturing sector. Journal of Labor Economics 7: 48–71
Hosios A J 1990 On the efficiency of matching and related models of search and unemployment. Review of Economic Studies 57: 279–98
Leonard J S 1987 In the wrong place at the wrong time: The extent of frictional and structural unemployment. In: Lang K, Leonard J S (eds.) Unemployment and the Structure of Labor Markets. Basil Blackwell, New York
McCall J J 1970 Economics of information and job search. Quarterly Journal of Economics 84: 113–26
Moen E R 1997 Competitive search equilibrium. Journal of Political Economy 105: 385–411
Montgomery J 1991 Equilibrium wage dispersion and interindustry wage differentials. Quarterly Journal of Economics 106: 163–79
Mortensen D T 1982 The matching process as a noncooperative/bargaining game. In: McCall J J (ed.) The Economics of Information and Uncertainty. University of Chicago Press, Chicago, IL
Mortensen D T, Pissarides C A 1994 Job creation and job destruction in the theory of unemployment. Review of Economic Studies 61: 397–415
Mortensen D T, Pissarides C A 1999a Job reallocation, employment fluctuations, and unemployment. In: Woodford M, Taylor J (eds.) Handbook of Macroeconomics. North-Holland, Amsterdam
Mortensen D T, Pissarides C A 1999b New developments in models of search in the labor market. In: Ashenfelter O, Card D (eds.) Handbook of Labor Economics. North-Holland, Amsterdam
Petrongolo B, Pissarides C A 2000 Looking into the black box: A survey of the matching function. Journal of Economic Literature 39: 390–431
Phelps E S et al. 1970 Microeconomic Foundations of Employment and Inflation Theory. Norton, New York
Pissarides C A 1984 Efficient job rejection. Economic Journal 94: 97–108
Pissarides C A 1985 Short-run equilibrium dynamics of unemployment, vacancies, and real wages. American Economic Review 75: 676–90
Pissarides C A 2000 Equilibrium Unemployment Theory, 2nd edn. MIT Press, Cambridge, MA
Stigler G J 1961 The economics of information. Journal of Political Economy 69: 213–25
C. A. Pissarides
Second Language Acquisition Humans are all born with the capacity to learn and to use a language; but they are not born with a language. It is not part of human genetic endowment that ‘horse’ means ‘equine quadruped,’ that the past tense is marked by ‘-ed,’ or that the negation follows the finite verb; this knowledge must be derived from the input with which the learner is confronted. The ways which lead this innate language faculty to the knowledge of a particular linguistic system vary considerably, depending on factors such as age, nature of input and whether this task is undertaken for the first time (‘first language acquisition,’ FLA) or not (‘second language acquisition,’ SLA). SLA is not a homogeneous phenomenon, for at least two reasons. First, it need not wait until the learner has completed FLA; hence, there is a continuous transition from bilingual FLA, in which a child is exposed more or less simultaneously to two systems from birth, to the
adult’s struggles with a new kind of linguistic input. Second, there is a wide range of ways in which the human language faculty gains access to a second linguistic system, ranging from metalinguistic description, as in traditional Latin classes, to language learning by everyday communication, as in the case of a foreign worker. In the history of mankind, explicit teaching of a language is a relatively late phenomenon, and untutored learning was, and probably still is, the most common case; but due to its practical importance, SLA in the classroom still dominates research. Linguists and laymen alike tend to consider children’s way to their mother tongue to be the most important type of language acquisition. This view seems most natural; but it leads easily to a distorted picture of how the human language faculty functions, and what its typical manifestations are. FLA is a very complex mixture of cognitive, social and linguistic developments, and it is not easy to isolate its purely linguistic components. The acquisition of the English tense and aspect system, for example, not only requires the learning of a particular mapping of forms and meanings, but also the development of the concept of time itself. Moreover, most people learn more than one language, albeit to different degrees of perfection. Therefore, the normal manifestation of the human language faculty is a ‘learner variety,’ i.e., a linguistic system which comes more or less close to the linguistic habits of a particular social group. In a child’s case, the final learner variety is usually a ‘perfect replication’ of these habits; children who grow up in multilingual communities often achieve two or even three such perfect replications. Adults who set out to learn another language hardly ever reach a stage where they speak like those from whom they learn; their ‘learner varieties’ normally fossilize at an earlier stage. 
This does not mean that their final learner variety is less of a language, or less efficient; there is no reason to assume that a linguistic system which says ‘He swam yesterday’ is a superior manifestation of the human language faculty than a system which says ‘He swimmed yesterday’ or even ‘He swim yesterday.’ It is just the way the English do it, and deviations from their norms are stigmatized. If the study of language acquisition, and of SLA in particular, should inform us about the nature of the human language faculty, then it must not focus on issues of perfect replication and why it fails sometimes, but try to clarify how the human language faculty deals under varying conditions with particular forms of linguistic input to which it has access. The first step to this end is to isolate the crucial factors which play a role in this process, and to look at ways they can vary. The second step is to investigate what happens under varying constellations. The final step is to draw generalizations from these findings and to turn them into a theory not just of language acquisition, but of the nature of human language itself (Klein 1986).
The picture which research on SLA offers at the time of writing is much less systematic. As with so many other disciplines, it has its origin in practical concerns; researchers were looking for scientific ways to improve foreign language teaching, and this seems impossible without a deeper understanding of the principles of SLA. Therefore, most empirical work in this field is still in the classroom. A second source of inspiration was research on FLA, which started much earlier and therefore set the theoretical and methodological stage. More recently, work in theoretical linguistics has increasingly influenced research on SLA. These and other influences, for example from cognitive and social psychology, resulted in a very scattered picture of theories, methods and findings. Rather than reviewing this research, the following discussion will concentrate on three key issues (useful surveys are found in Ellis 1994, Ritchie and Bhatia 1996, Mitchell and Myles 1998, Braidi 1999).
1. SLA and Foreign Language Instruction The pedagogical background of SLA research has led naturally to a particular view on SLA, for which two assumptions are constitutive: (a) There is a well-defined target of the acquisition process: the language to be learned. This target language is a clearly fixed entity, a structurally and functionally balanced system, mastered by those who have learned it in childhood, and more or less correctly described in grammars and dictionaries. (b) SLA learners miss this target at varying degrees and in varying respects: they make errors in production as well as in comprehension, because they lack the appropriate knowledge or skills. This is the target deviation perspective. It is the teacher’s task to erase, or at least to minimize, the deviations; it is the researcher’s task to investigate which ‘errors’ occur when and for which reasons. As a consequence, learners’ performance in production or comprehension is not studied very much in its own right, as a manifestation of learning capacity, but in relation to a set norm; not in terms of what learners do, but in terms of what they fail to do. The learners’ utterances at some time during the process of acquisition are considered to be more or less successful attempts to reproduce the structural properties of target language utterances. Learners try to do what the mature speaker does, but do it less well. Three reasons make the target deviation perspective so natural and attractive, in fact, almost self-evident. First, it is the natural perspective of the language teacher: language teaching is a normative process, and the teacher is responsible for moving students as closely to some norm as possible. Second, it is also the natural perspective of all of those who had to learn a second language in the classroom, and that means, also, of practically every language researcher. Third, the target deviation perspective provides the researcher with a simple and clear design for empirical work. There is a yardstick against which the learners’ production and comprehension can be measured: the target language, or actually what grammar books and dictionaries say about it. What is measured is the differences between what learners do and what the set norm demands. Therefore, the dominant method in SLA research was, and is, error analysis: Learners’ errors are marked and then either counted and statistically analyzed, or they are interpreted individually (Corder 1981, Ellis 1994, pp. 561–664). There are two problems with this perspective. First, it does not tell us what learners do but what they are not able to do. Second, its results reflect not just the principles according to which the human language faculty functions, but the efficiency of a particular teaching method. Therefore, this approach may be of eminent importance to the language teacher, but it is of limited value if we want to understand the nature of human language.
2. FLA and SLA Experience shows that FLA normally leads to ‘perfect command’ of the target language, whereas SLA hardly ever does. Why this difference? Is perfect attainment of a second language possible at all? Does the learning process only stop at an earlier point, or does it follow different principles? The last question has found two opposite answers. The identity hypothesis, advocated by many researchers in the early 1970s, claims that the underlying processes are essentially the same across all types of acquisition. Under this view, the fact that the learner already knows a language plays no role: there is no transfer from the ‘source language’ (Odlin 1989). Evidence came mainly from the order in which certain grammatical phenomena, such as inflectional morphemes or the position of negation, are acquired. It turned out, however, that these similarities are quite isolated; there are hardly any supporters of the identity hypothesis anymore. Under the opposite view, it is mainly structural differences between source and target language that cause problems for learners. This contrastive hypothesis has given rise to a number of contrastive grammars for pedagogical purposes. But while there are many clear cases in which learners’ first language interferes in the learning process, structural contrasts can at best account for some properties of the acquisitional process. In acquisition outside the classroom, for example, all learners regularly develop a particular type of ‘learner variety’ which is essentially independent of source and target language (see Klein and Perdue 1997). The net result of thirty years of research is simply that there are similarities as well as dissimilarities. The varying success in final attainment could be due to (a) age differences, or (b) to the fact that there is already a language which blocks the acquisition of a second language. The second possibility is ruled out by the fact that school age children normally have no problem in learning a second language to perfection; hence, the varying success must be an age effect. Apparently, the capacity to learn a language does not disappear, but it deteriorates with age. Since this capacity is stored in the brain, it seems plausible to assume that changes in the brain are responsible for the age effect. The clearest statement of this view is Lenneberg’s theory of a biologically fixed ‘critical period,’ during which the brain is receptive for language; it ranges approximately from birth to puberty. After this period, linguistic knowledge can only be learned in a different form, roughly like the knowledge of historical or geographical facts (Lenneberg 1967). This theory has the seductive charm of simple solutions, and hence has been welcomed with great enthusiasm. But as far as is known, all potentially relevant changes in the brain occur in the first four years of life, rather than around puberty. Moreover, all available evidence shows that the capacity to learn a new language deteriorates only gradually; there is no clear boundary at puberty or at any other time. Finally, it could be shown that ‘perfect attainment’ is perhaps rare but definitely possible after puberty (see Birdsong 1999). It appears, therefore, that there is no clear biological threshold to language acquisition; the age effect is due to a much wider array of factors (Singleton 1989).
3. SLA and Theoretical Linguistics The apparent ease and speed with which children, despite deviant and insufficient input, become perfect speakers of their mother tongue has led Noam Chomsky and other generative grammarians to assume that a great deal of the necessary linguistic knowledge is innate. Since every newborn can learn any language, this innate knowledge must be universal, and it is this ‘universal grammar’ (UG) which is the proper object of linguistic theory. Since languages also differ in some respects (otherwise, SLA would be superfluous), the competence of mature speakers is supposed to include a ‘peripheral part,’ which includes all idiosyncratic properties and must be learned by input analysis, and a ‘core.’ The core consists of a number of universal principles, the UG. Initially, these principles include a number of ‘open parameters,’ i.e., variable parts which must be fixed by input analysis. Chomsky made this point only for FLA, and only in the mid 1980s was the question raised whether UG is still ‘accessible’ in SLA. A number of empirical studies tested the potential ‘resetting’ of various parameters. Spanish, for example, allows the omission of a subject pronoun, a property which is structurally linked to other features such as a relatively rich inflectional morphology and relatively free word order; these and other properties form the ‘pro-drop parameter.’ English children have set this parameter the opposite way when acquiring their language. Are adult English learners of Spanish able to ‘reset’ it, or do they have to learn all of these properties by input analysis? Results are highly controversial (see e.g., Eubank 1991, Epstein et al. 1997). Although inspired by theoretical linguistics, most empirical research in this framework keeps the traditional ‘target deviation perspective’; with only a few exceptions, it deals with acquisition in the classroom, hence reflecting the effects of teaching methods. Moreover, there is no agreement on the definition of the parameters themselves; in fact, more recent versions of generative grammar have essentially abandoned this notion. Finally, it is an open issue as to which parts of linguistic knowledge form the core and which parts belong to the periphery, and hence must be learned from the input. These language-specific parts clearly include the entire lexicon, the inventory of phonemes, inflectional morphology, all syntactic properties in which languages can differ; in short, almost everything. It seems more promising, therefore, to look at how learners construct their learner varieties by input analysis.
4. Learner Varieties The alternative to the target deviation perspective is to understand the learners’ performance at any given time as an immediate manifestation of their capacity to speak and to understand: form and function of these utterances are governed by principles, and these principles are those characteristic of the human language faculty. Early attempts in this direction are reflected in notions such as ‘interlanguage,’ ‘approximate systems’ and so on. Since the 1980s, most empirical work on SLA outside the classroom has taken this ‘learner variety perspective’ (von Stutterheim 1986, Perdue 1993, Dietrich et al. 1995). In its most elaborate form, it can be characterized by three key assumptions (Klein and Perdue 1997). (a) During the acquisitional process, learners pass through a series of learner varieties. Both the internal organization of each variety at a given time and the transition from one variety to the next are essentially systematic in nature. (b) There is a small set of principles which are present in all learner varieties. The actual structure of an utterance in a learner variety is determined by a particular interaction of these principles. The kind of interaction may vary, depending on various factors, such as the learner’s source language. With ongoing input analysis, the interaction changes. Picking up some component of noun morphology from the input, for example, may cause the learner to modify the weight of other factors to mark the grammatical status of a noun phrase. Therefore, learning a new feature is not adding a new piece to a puzzle which the learner has to
put together. Rather, it entails a sometimes minimal, sometimes substantial reorganization of the whole variety, where the balance of the various factors successively approaches the balance characteristic of the target language. (c) Learner varieties are not imperfect imitations of a ‘real language’ (the target language), but systems in their own right. They are characterized by a particular lexical repertoire and by a particular interaction of structural principles. Fully developed languages, such as Spanish, Chinese or Russian, are only special cases of learner varieties. They represent a relatively stable state of language acquisition: that state where learners stop learning because there is no difference between their variety and the variety of their social environment, from which they get input. Thus, the process of language acquisition is not to be characterized in terms of errors and deviations, but in terms of the twofold systematicity which it exhibits: the inherent systematicity of a learner variety at a given time, and the way in which such a learner variety evolves into another one. If we want to understand the acquisitional process, we must try to uncover this twofold systematicity, rather than look at how and why a learner misses the target. See also: First Language Acquisition: Cross-linguistic; Foreign Language Teaching and Learning; Language Acquisition; Language Development, Neural Basis of
Bibliography
Birdsong D (ed.) 1999 Second Language Acquisition and the Critical Period Hypothesis. Erlbaum, Mahwah, NJ
Braidi S M 1999 The Acquisition of Second Language Syntax. Arnold, London
Corder P 1981 Error Analysis and Interlanguage. Oxford University Press, Oxford, UK
Dietrich R, Klein W, Noyau C 1995 Temporality in a Second Language. Benjamins, Amsterdam
Ellis R 1994 The Study of Second Language Acquisition. Oxford University Press, Oxford, UK
Epstein S, Flynn S, Martohardjono G 1997 Second language acquisition. Theoretical and experimental issues in contemporary research. Behavioural and Brain Sciences 19: 677–758
Eubank L (ed.) 1991 Point–Counterpoint: Universal Grammar in the Second Language. Benjamins, Amsterdam
Klein W 1986 Second Language Acquisition. Cambridge University Press, Cambridge, UK
Klein W, Perdue C 1997 The basic variety, or couldn’t natural languages be much simpler? Second Language Research 13: 301–47
Lenneberg E 1967 Biological Foundations of Language. Wiley, New York
Mitchell R, Myles F 1998 Second Language Learning Theories. Arnold, London
Odlin T 1989 Language Transfer. Cambridge University Press, Cambridge, UK
Perdue C (ed.) 1993 Adult Language Acquisition: Crosslinguistic Perspectives. Cambridge University Press, Cambridge, UK, 2 Vols
Ritchie W C, Bhatia T (eds.) 1996 Handbook of Second Language Acquisition. Academic Press, New York
Singleton D 1989 Language Acquisition: The Age Factor. Multilingual Matters, Clevedon, UK
Stutterheim C von 1986 Temporalität in der Zweitsprache (Temporality in second language). De Gruyter, Berlin
W. Klein
Second World War, The
1. The Second World War: A Narrative World War II was an event of massive significance. For at least fifty years after its end in 1945 it continued to condition societies and ideas throughout the world. Much of the politics of the second half of the twentieth century can be read as occurring in an ‘after-war’ context. The war exacted a death toll of at least 60 million, and probably tens of millions more than that (figures for China and the rest of Asia are mere guesses and the USSR’s sacrifice has risen from seven to 20 to 29 or more million as time has passed, circumstances varied, and the requirements of history altered). A majority of the casualties were civilians, a drastic change from World War I when some 90 percent of deaths were still occasioned at the fronts. Moreover, the invention of the atom bomb during the war and its deployment by the USA at Hiroshima and Nagasaki (August 6 and 9, 1945) suggested that, in any future nuclear conflict, civilians would compose 90 percent or more of the victims. When this apparent knowledge was added to the revelations of Nazi German barbarism on the eastern front and the Nazis’ massacre of European Jewry, either in pit killings or when deliberately transported to such death camps as Auschwitz-Birkenau, Treblinka, Sobibor, Chelmno, Belzec, and Majdanek, another casualty of the war seemed to be optimism itself. Certainly at ‘Auschwitz’ and perhaps at Hiroshima, ‘civilization,’ the modernity of the Enlightenment, the belief in the perfectibility of humankind, had led not to hope and life but instead to degradation and death. This more or less open fearfulness, with its automatic resultant linking of a pessimism of the intellect to any optimism of the will among postwar social reformers, may be the grandest generalization that can be made about the meaning of World War II. Big history, however, should not lose sight of microhistory.
Actually World War II was fought on many fronts, at different times, for different reasons, and with different effects. In this sense, there was a multiplicity of World War II’s. In September 1939, a war broke out between Nazi Germany and authoritarian Poland. The liberal democratic leadership of Britain and France intervened
saying that they would defend Poland, although in practice they made do with ‘phoney war’ until the Nazi forces took the military initiative, first in Denmark and Norway, and then in the Low Countries and France in April–May 1940. In June 1940, Fascist Italy entered the war as had been envisaged in the ‘Pact of Steel’ signed with its Nazi ally in May 1939. War now spread to the Italian empire in North and East Africa and, from October 1940, to the Balkans. Italian forces botched what they had hoped would be a successful Blitzkrieg against Greece and the effect over the next year was to bring most of the other Balkan states into the conflict. Often these fragile states dissolved into multiple civil wars; the most complicated, and the one of most lasting significance, began in Yugoslavia in March–April 1941. On June 22, 1941, the Nazi Germans invaded the Soviet Union, commencing what was, in some eyes, the ‘real’ World War II, and certainly the one that was inspired by the most direct ideological impulse and which unleashed the most horrendous brutality. In the course of the campaign in the east it is estimated that the Germans sacked 1,710 towns and 70,000 villages. During the epic siege of Leningrad from 1941 to 1944, a million or so of the city’s inhabitants starved to death. In their invasion, the Germans were joined by an assortment of anticommunist allies and friends, including military forces from authoritarian Romania and Fascist Italy. Many Lithuanians, Latvians, and Estonians, and quite a few anti-Soviet elements within the USSR (Ukrainian nationalists, people from the Caucasus, and others) acted as auxiliaries of Nazi power. The Nazis were even embarrassed by a ‘Russian’ army under General A. A. Vlasov, willing to fight on their side against Stalin and his system.
Volunteers also came from pro-fascist circles in France, and from Spain and Portugal, states ruled by clerical and reactionary dictators who hated communists but were not fully reconciled to the radical thrust of much Nazi-fascist rhetoric and some Nazi-fascist policy. On December 7, 1941, the war widened again when the Japanese air force attacked the American Pacific fleet at anchor at Pearl Harbor in Hawaii. In the following weeks, the Japanese army and navy thrust south and east, dislodging the British from Singapore by February 15, 1942. They went on to seize the Philippines and Dutch East Indies (later Indonesia) and were in striking distance of Australia before being checked at the Battle of the Coral Sea in May 1942. They simultaneously continued the terrible campaign that they had been waging in China since 1937 (or rather since 1931, when they had attacked Manchukuo or Manchuria). In their special wars, the militarist Japanese leadership tried to throw off what they called the imperialist yoke of US capitalism and the older ‘white’ metropolitan empires. The purity of their anti-imperial motives was damaged, however, by their own commitment to empire and by their merciless killing of
Asians. Moreover, as in Europe, their invasions often touched off varieties of civil war provoked by the highly complex stratification of society in the region. At the forefront of such campaigns were often local nationalists who imagined communities subservient neither to European powers nor the Japanese. On December 11, 1941, Germany and Italy, loyal to the terms of the anti-Comintern pact, had also declared war on the USA, somewhat ironically so, given that Japan, checked by military defeats in an unofficial war against the USSR at Khalkhin Gol and Nomonhan in 1938–9, had now engaged with other enemies. The Italians would find out the implications in North Africa, the Germans after the Allied invasion of France on ‘D-Day’ (June 6, 1944), as well as in Italy from September 8, 1943 (the Fascist dictator, Benito Mussolini, overthrown on July 25, was thereafter restored as a sort of German puppet in northern Italy; Allied forces moved slowly up the peninsula from the south, liberating Rome on June 4, 1944 but only reaching Milan at the end of the war in late April 1945). Of the participants in the war, the USA, once fully mobilized, possessed the biggest and most productive economy, and was therefore of crucial importance in the eventual defeat of the anti-Comintern powers. The campaign the Americans fought with the most passion, and with an evident racism of their own, was the war against Japan. In another sense, the USA had a relatively soft war, not disputed on its own territory and not requiring the sort of physical or spiritual sacrifice obligatory from most other combatants. The USA’s special World War II was not really a visceral one. If the war was fought in many different ways, it is equally true that the variety of conflicts did not all end at the same time and in the same way. The Nazi armies surrendered on May 8, 1945, the Japanese on August 15. But matters were more complicated than that.
France’s special World War II would commemorate ‘victory’ from the date of the liberation of Paris on August 25, 1944 (and General Charles De Gaulle would firmly proclaim that Paris and France had liberated themselves). In most of Nazi-fascist occupied Europe and in parts of Asia, partisan movements had never altogether accepted defeat. Communists were invariably prominent in such resistance, even if quite a few still envisaged themselves as fighting as much for the Soviet revolution as for the liberty of their own nation state. Every successive ‘liberation,’ in Europe most frequently coming after the military victory of the Red Army, and in the Pacific that of the USA, had its own special character. Yugoslavia and China were two especially complicated places where the resistance was very strong but where it was contested, not only by the Nazi-fascists and the Japanese but also by local anticommunist and nationalist or particularist forces. The effects and memory of their wars were by definition to be very different from such societies as the USA, Australia, and the UK which did not endure foreign
occupation, though the last, with its severe experience of bombing, was itself different from the other two. In sum, World War II was not just an enormously influential event but also an extraordinarily complicated one. Its complexity has in turn stimulated many passionate and long-lasting debates about its historical meaning.
2. The Causes of War For many years it was customary to argue that World War II as compared to World War I had a simple cause. This was ‘Hitler’s war.’ Mainstream analysis continues to ascribe to the Nazi dictator great responsibility for the invasion of Poland and for the spreading of the war thereafter, and especially for the launching of Operation Barbarossa against the USSR. Nonetheless, from the 1960s, the course of historiography, especially as exemplified in the rise of social history, did not favor a ‘Great Man’ view of the past and tended to urge that even dictators had limits to their free will. As early as 1964, English radical historian A. J. P. Taylor (1964) argued in a book entitled The Origins of the Second World War that the several crises which led up to September 1939 and what he sardonically called the ‘War for Danzig’ needed to be understood in a variety of contexts, including the peace settlements at the end of World War I, the course of German political and social history, the institutionalization of the Russian revolution, with its victorious but feared and paranoid communist and then Stalinist regime, and the lights and shadows of democratic liberalism in Western Europe. Taylor wrote with a flaunted stylistic brilliance and practiced a brittle historical cleverness. He was destined to be misunderstood and often gloried in the misunderstanding. His book thus produced an enormous controversy, the first of many to be sparked by attempts to define the meaning of World War II. At the same time, Taylor’s idiosyncrasies ensured that his work could easily enough be dismissed by those mainstream historians who liked to feel the weight of their commitment to Rankean principles and to make their liking evident. Nonetheless the issues raised by Taylor did not go away. 
In West Germany, the so-called ‘Fischer controversy,’ sparked by the Hamburg liberal historian Fritz Fischer’s Griff nach der Weltmacht, a massively documented study of German aims during World War I and also published in 1961, raged through the decade. Although Fischer, then and thereafter, wrote almost exclusively about World War I and about imperial Germany, he was read as commenting on World War II and, indeed, on Germany’s divided fate in its aftermath. Two issues were prominent. Was imperial Germany an aggressive power in a way that bore comparison with the Nazi regime during the 1930s?
Was the motive of the German leadership as much domestic as foreign: did they seek foreign adventure and even world war in order to divert the pressure building for social democracy? Was the appalling conflict from 1939 to 1945 caused by the ‘German problem,’ which may have begun as early as 1848 and may have continued after 1945? Fischer’s work was more directly influential than Taylor’s and the issue of the relationship between Innenpolitik and Aussenpolitik fitted neatly into the preoccupations of new social historians who, by the 1970s, were scornfully dismissing the work of ‘old fashioned diplomatic historians.’ For all that, specialist works on the causation of World War II continued to privilege the power of Hitler and acknowledge the ideological thrust of Nazism. English independent Marxist, Tim Mason, may have tried to pursue a Fischerian line in asking how much the diplomatic crises of 1938–9 were prompted by the contradictions of Nazi economic and social policy, but his essays remained at the periphery of most analysis. Rather such firmly Rankean historians as Gerhard Weinberg and Donald Watt assembled evidence which, in their eyes, only confirmed that the war was caused by Hitler. The newest and most authoritative English-language biographer of the Führer, Ian Kershaw, despite his background in social history, does not disagree. Hitler may have been erratic as an executive. The Nazi totalitarian state and society, contrary to its propaganda about militant efficiency and a people cheerfully bound into a Volksgemeinschaft, may in practice have often been ramshackle. But the dictator, Kershaw argued, did possess power. Indeed, so all-embracing was his will that Germans strove to ‘work towards’ their Führer, to accept his ideas and implement his policies before he had fully formulated them.
For a generation in the wake of the Fischer controversy, scholarship on Nazism had separated into ‘intentionalists’ (advocates of Great Man history) and ‘functionalists’ (those who preferred to emphasize the role of structures and contexts and who were especially alert to the ‘institutional darwinism’ of the Nazi regime). Now Kershaw, often a historian of the golden mean, seemed to have found a way to resolve and end that conflict. The only major variants on this reaffirmation that Hitler had provoked the war came from certain conservative viewpoints. Some anticommunist historians focused on the Ribbentrop–Molotov pact (August 23, 1939), placing responsibility for the war on ‘Stalin’ and the Russian revolution in what seemed to others a highly tendentious effort to blame the victim. More common was the view expressed most succinctly by Zionist historian Lucy Dawidowicz that the whole conflict was in essence a ‘war against the Jews.’ In this interpretation, German nationalism, German anticommunism, German racism towards Slavs, Nazi repression of the socialist and communist left, none in any sense equated with Hitlerian anti-Semitism. Hitler (or, in the variant recently made notorious by Daniel Goldhagen, the Germans) wanted to kill Jews; that was the purpose of the Nazi regime; that was the aim of its wars. In other ex-combatant societies, further examples of local focus are evident. In England, fans of appeasement still existed, the most prominent being John Charmley. For him, the problem with the war lay simply in Britain’s engagement in it. As far as the British empire was concerned, Nazism did not need to be fought and the USSR should never have been an ally. Worst of all was the fact that the USA had dominated the postwar world. Charmley has never quite said so, but the implication of his work is that the ‘real’ World War II for Britain, and the one it most dramatically lost, was implicitly fought against its American ‘cousins.’ The Asian-Pacific conflict has similarly been subject to historical revision. In the 1960s Gabriel Kolko and other American ‘new leftists’ applied a Marxian model to their nation’s foreign policy, being very critical of the gap between its idealistic and liberal theory and its realist and capitalist practice. They were in their turn duly subjected to withering fire from more patriotic and traditional historians. Nonetheless a consensus grew that, at least in regard to the onset of the American–Japanese war, US policy makers carried some responsibility. By the 1980s, liberal historian John Dower was even urging that the two rivals had been ‘enemies of a kind,’ neither of them guiltless of racism and brutality. In Japan, by contrast, an officially sponsored silence long hung over everything to do with the war. The Ministry of Education was particularly anxious that schoolchildren not be exposed to worrying facts about such terrible events as the massacre in Nanking in 1937, the practice of germ warfare, the exploitation of ‘comfort women,’ and the many other examples of Japanese murder, rape, and pillage in Asia and the Pacific.
Nonetheless, a stubborn undercurrent of opinion exemplified in the work of historian Ienaga Saburo continued to contest the Ministry line and, by the 1990s, the Japanese leadership had gone further than ever before in admitting some of the misdeeds of its militarist predecessors. Fully critical history may still not be especially appreciated in Tokyo. But, by the end of the 1990s, Japan was not the only society to behave that way in a world in which the ideology of economic rationalism had achieved unparalleled hegemony backed by what American democratic historian Peter Novick has called ‘bumper sticker’ lessons from the past.
3. The Course of the War In the preceding paragraphs it has not always been possible to keep fully separate discussions about the
coming of World War II from what happened after hostilities commenced. In any case, military history in the pure sense, just like diplomatic history, after 1945 soon lost ground professionally. To most eyes, the military history of the war can swiftly enough be told. Of the anti-Comintern states, Germany and Japan, but not Italy, won rapid initial victories, exulting in their respective Blitzkriegs. But their triumphs were always brittle. Germany and its allies may have been good at getting into wars, but their ideologies made it difficult for them thereafter to contemplate any policy except the complete liquidation of the enemy. Hitler and the Japanese imperial and military leadership thus made no attempts to offer compromise from a position of strength, and Mussolini’s occasional flirtation with the idea always involved Nazi sacrifice in their wars, especially that in the east, rather than Italian loss. Nor did the anti-Comintern states make the most of the huge territories which they had conquered and the immense material resources that they therefore controlled. Nazi Germany is something of a case study in this regard. In the West, where the war was always gentler, the Nazis found plenty of direct and indirect collaborators. They were thus, for example, able to harness a very considerable proportion of the French economy to their cause. They also started to construct a new economic order that was not utterly unlike some of the developments which would occur in Western Europe after Nazi-fascism had been defeated. With extraordinary contradiction for a state built on an utter commitment to racial purity, Nazi Germany, already before 1939, needed immigrants to staff its economy. Once the war began, this requirement became still more pressing. One partial solution was to import workers by agreement with its ally Italy and by arrangement with the friendly Vichy regime in France.
Not all such French ‘guest-workers’ came unwillingly and not all had especially bad wars. In Germany they joined other, more reluctant, immigrants from the east, who were often little more than slave laborers. Poles and Soviet prisoners of war constituted the majority of these; at first they were frequently worked to death. However, as the Nazi armies turned back and the war settled into one of attrition and retreat, as symbolized by the great defeat at Stalingrad (November 1942–January 1943), the Germans began to treat even laborers from the east in a way that allowed some minimum chance of their survival and which also permitted some tolerable productivity from their labor. The exception was, of course, the Jews, who, from September–October 1941, became the objects of the ‘Final Solution,’ a devotion to murder confirmed by officials who attended the Wannsee conference in January 1942. In terms of fighting a war, the adoption of the policy of extermination was, of course, counterproductive in many senses, among which was the economic. In staffing and fueling the trains, which transported the Jews to the death camps, in the low
productivity of these and the other camps, the Nazis wasted resources needed at the front. Saul Friedlander will write elsewhere in this volume about the debates about the meaning of the Holocaust. It is worth noting in this segment, however, that each combatant society has argued about its particular experience of being visited by the Nazi war machine and therefore of being exposed to collaboration with it. Two important examples occurred in France and the USSR. The spring of 1940 brought disaster to the Third French Republic. Now was the time of what Marc Bloch, one of the founders of the great structuralist historical school of the Annales, a patriotic Frenchman and a Jew, called the ‘strange defeat.’ Military historians have demonstrated that France was not especially inferior in armament to the invading Germans. Rather, France lost for reasons to do with morale and the domestic divisions of French society. As a result, by June 1940 the French state and empire had collapsed. During the next four years, the inheritance of the Third Republic was disputed between the Vichy regime headed by Marshal Pétain within the rump of French metropolitan territories, the ‘Free French’ under General Charles De Gaulle, resident in London, and there, however reluctantly and churlishly, dependent on Allied goodwill and finance, and a partisan movement which gradually became more active in the occupied zones. This last was typically divided between communists and other forces, some of which disliked communists as much as they hated the invaders. The years of Axis occupation were thus also the time of the ‘Franco-French civil war,’ with killings and purge trials extending well beyond liberation. The meaning of war, occupation, and liberation in France has been much disputed after 1945.
It took a film maker, Marcel Ophuls in Le Chagrin et la pitié (1971), and an American historian, Robert Paxton, to break a generation of silence about the troubling implications of this period of national history, although Socialist President François Mitterrand (1981–95), with his own equivocal experience of Vichy, was scarcely an unalloyed advocate of openness during his term in office. Historian Henry Rousso has brilliantly examined the ‘Vichy syndrome’ and done much to expose some of the obfuscations favored by many different leadership groups in post-1945 French politics. Perhaps his work does not go far enough, however. With its fall, and then with decolonization after 1945, France had lost its political empire, promising to become just another European state. However, this loss was curiously compensated by the rise and affirmation of French culture. In almost every area of the humanities, such French intellectuals as De Beauvoir, Braudel, Barthes, Foucault, Lévi-Strauss, Lyotard, Baudrillard, and Nora, charted the way to postmodernity. They did so sometimes invoking the pro-Nazi philosopher Martin Heidegger and almost always without much reckoning of the collapse of the French nation state in 1940. The intellectuals of France
were much given to forgetting their nation’s fall, the better to affirm their own rights to cultural imperium. If France provides a case study of the transmutation of defeat into victory, the USSR offers the reverse example of a victor whose people would eventually learn that ‘actually’ they had lost. The nature of the Soviet war effort is still in need of research. Sheila Fitzpatrick, a historian of Soviet social history, has depicted a population by 1939 brutalized and depressed by the tyranny, incompetence, and contradictions of Stalinism. Her work does not really explain, however, how that same populace fought so stubbornly ‘for the motherland, for Stalin.’ No doubt the absurd murderousness of Nazi policies gave them little alternative. No doubt the aid which eventually flowed through from the USA was of great significance. But something does remain unexplained about how ‘Stalin’s Russia’ won its ‘Great Patriotic War.’
4. The Consequences of the War By now very clear, however, are the consequences of the war for the USSR. In the short term the war made the Soviet state the second superpower, the global rival to the USA, and gave Stalin, until his death in 1953, an almost deified status. However, as the postwar decades passed, it became clear that the USSR and its expanded sphere of influence in Eastern Europe were not recovering from the war with the speed being dramatically exemplified in Western Europe and Japan. Indeed, the history of the USSR, at least until Gorbachev’s accession to the party secretaryship in 1985 and, arguably, until the fall of the Berlin Wall and the collapse of communism (1989–91), should best be read as that of a generation who had fought and won its wars (including the terrible domestic campaigns for collectivization as well as the purges of the 1930s). Leaders and people were unwilling or unable to move beyond that visceral experience. After 1945, the USSR became the archetypal place ‘where old soldiers did not die nor even fade away.’ Brezhnev, Chernenko, and their many imitators further down the power structure were frozen into a past that had an ever-diminishing connection with a world facing a new technological and economic revolution. Other ex-combatant societies were more open to change than the USSR but a certain sort of remembering and a certain sort of forgetting can readily enough be located in them, too. Memory proved most threatening in Yugoslavia. There the victorious partisans under Josip Broz Tito seemed for a time to have won a worthwhile victory. The special history of their campaigns against the Nazi-fascist occupiers of their country allowed them claims to independence from the USSR, which they duly exercised after 1948. At the same time the barbarity, during the war, of the collaborationist Croat fascist regime under Ante
Pavelić, whose savagery embarrassed even the Germans, seemed to suggest that the region was indeed best administered by a unitary state. The fall of communism, however, also brought down a communist Yugoslavia in which such Serb leaders as Slobodan Milosevic, unable to believe in official ideals, increasingly recalled the nationalism of wartime Cetniks rather than the internationalist Marxism once espoused by the partisans. This memory of war and murder justified new wars and new murders, even if with the somewhat ironical result that, by the end of the 1990s, the (Serb) winners of the last war had become losers and the (Slovene, Croat, Bosnian Moslem, and Kosovar) losers had become winners. In Italy, the country with the largest communist party in the West and a polity which had a ‘border idiosyncrasy,’ bearing some comparison with Yugoslavia’s role in the Eastern Bloc, the inheritance of war and fascism similarly possessed peculiar features. Postwar Italy began by renouncing fascism, empire, and war, in 1946 abandoning the monarchy that had tolerated the imposition of Mussolini’s dictatorship and, in 1947, adopting a constitution which made considerable claim that the new Republic would be based on labor. In practice, however, Italy took its place in the Cold War West. Its purging of ex-Fascists was soon abandoned and both the power elites and the legislative base of the new regime exhibited much continuity with their Fascist predecessors. Nonetheless, from the 1960s, an ideology of antifascism was accorded more prominence in a liberalizing society. From 1978 to 1985, Sandro Pertini, an independent socialist who had spent many years in a Fascist jail and been personally involved in the decision to execute Mussolini, became Italy’s president. Widely popular, he seemed an embodiment of the national rejection of the Fascist past. Once again, however, the process of memory was taking a turn and a different useable past was beginning to emerge.
Left terrorists in the 1970s had called themselves the new Resistance and declared that they were fighting a Fascist-style state; the governing Christian Democrats were thought to be merely a mask behind which lurked the Fascist beast. The murder of Aldo Moro in 1978 drove Italians decisively away from this sort of rhetoric and, in the 1980s and 1990s, Italians sought instead a ‘pacification’ with the past in which ex-Fascists had as much right to be heard as ex-partisans. Among historians, ‘anti-antiFascists,’ led by Renzo De Felice, the biographer of Mussolini, provided evidence and moral justification for this cause. Media magnate and conservative politician Silvio Berlusconi joined those who agreed that Italy’s World War II had lost its ethical charge.
Among the Western European ex-combatant states, perhaps the UK was the place where the official myth of the war survived with least challenge. A vast range of British society and behavior was influenced by Britain’s war. The Welfare State, as codified by the
postwar Labor government under Clement Attlee, was explained and justified as a reward for the effort of the British people in the ‘people’s war.’ Wartime conservative Prime Minister, Winston Churchill, despite his many evident limitations, remained a national icon. British comedies from the Goons in the 1950s to Dad’s Army in the 1970s and 1980s to Goodnight Sweetheart in the 1990s were obsessively set in the war. From the alleged acuteness of their secret service activities to the alleged idealism of their rejection of Nazism, the British have constantly sought to preserve the lion’s share of the positives of World War II for themselves. The suspicion of a common European currency and the many other examples of continuing British insularity in turn reflect the British cherishing of the fact that they fought alone against the Nazifascists from 1940 to 1941, and express their associated annoyed perplexity that somehow their wartime sacrifice entailed a slower route to postwar prosperity compared with that of their continental neighbors. Memory after memory, history after history, World War IIs, in their appalling plenitude, still eddy around. As the millennium ended, another historian wrote a major book about the meaning of an aspect of the war, and about the construction of that meaning. Peter Novick’s The Holocaust in American Life (1999) daringly wondered whether the privileging of the Nazi killing of the Jews in contemporary Jewish and even gentile American discourse is altogether a positive. Being a historical victim at one time in the past, he argued cogently, can obscure as well as explain. His caution is timely. It is probably good that World War IIs are with us still; it will be better if the interpretation of so many drastic events can still occasion democratic debate, courteous, passionate, and humble debate, and if we can therefore avoid possessing a final solution to its many problems. 
See also: Cold War, The; Contemporary History; First World War, The; Genocide: Historical Aspects; Holocaust, The; International Relations, History of; Military History; War: Anthropological Aspects; War: Causes and Patterns; War Crimes Tribunals; War, Sociology of; Warfare in History
Bibliography
Bartov O 1986 The Eastern Front 1941–5: German Troops and the Barbarisation of Warfare. St. Martin’s Press, New York
Bosworth R J B 1993 Explaining Auschwitz and Hiroshima: History Writing and the Second World War 1945–1990. Routledge, London
Bosworth R J B 1998 The Italian Dictatorship: Problems and Perspectives in the Interpretation of Mussolini and Fascism. Arnold, London
Calder A 1969 The People’s War: Britain 1939–45. Pantheon, London
Charmley J 1993 Churchill: The End of Glory: A Political Biography. Hodder and Stoughton, London
Dawidowicz L 1986 The War Against the Jews 1933–45, rev. edn. Penguin, Harmondsworth, UK
Dower J W 1986 War Without Mercy: Race and Power in the Pacific War. Pantheon, New York
Fischer F 1967 Germany’s Aims in the First World War. Chatto and Windus, London
Fischer F 1986 From Kaiserreich to Third Reich: Elements of Continuity in German History, 1871–1945. Allen and Unwin, London
Fitzpatrick S 1999 Everyday Stalinism: Ordinary Life in Extraordinary Times: Soviet Russia in the 1930s. Oxford University Press, New York
Gorodetsky G 1999 Grand Delusion: Stalin and the German Invasion of Russia. Yale University Press, New Haven, CT
Ienaga S 1979 Japan’s Last War: World War II and the Japanese 1931–1945. Australia University Press, Canberra, Australia
Hogan M J (ed.) 1996 Hiroshima in History and Memory. Cambridge University Press, Cambridge, UK
Kershaw I 1999 Hitler 1889–1936: Hubris. W. W. Norton, New York
Kershaw I 2000 Hitler 1936–1945: Nemesis. Allen Lane, London
Kolko G 1968 The Politics of War: The World and United States Foreign Policy 1943–5. Random House, New York
Milward A 1977 War, Economy and Society 1939–1945. University of California Press, Berkeley, CA
Novick P 1999 The Holocaust in American Life. Houghton Mifflin, Boston
Parker R A C 1990 Struggle for Survival: The History of the Second World War. Oxford University Press, Oxford, UK
Rousso H 1991 The Vichy Syndrome: History and Memory in France Since 1944. Harvard University Press, Cambridge, MA
Taylor A J P 1964 The Origins of the Second World War, rev. edn. Penguin, Harmondsworth, UK
Thorne C 1986 The Far Eastern War: States and Societies 1941–5. Unwin, London
Tumarkin N 1994 The Living and the Dead: The Rise and Fall of the Cult of World War II in Russia. Basic Books, New York
Watt D C 1989 How War Came: The Immediate Origins of the Second World War 1938–1939. Pantheon Books, New York
Weinberg G L 1994 A World at Arms: A Global History of World War II. Cambridge University Press, Cambridge, UK
R. J. B. Bosworth
Secondary Analysis: Methodology Secondary analysis refers to a set of research practices that involve utilizing data collected by someone else or data that has been collected for another purpose (e.g., administrative records). It is used, to varying degrees, across a wide range of disciplines and throughout the world. Given this breadth, it is not surprising that it has taken on numerous forms. It is also conducted for several distinct reasons. Although not a research methodology, per se, several features distinguish it from other research activities. In turn, these features create opportunities and limitations for the secondary
analyst. Central issues concern: (a) data availability, access, and documentation; (b) maintaining confidentiality and privacy pledges made by primary researchers; and (c) proprietary rights and data ownership.
1. Secondary Analysis as a ‘Methodology’ The use of secondary data in behavioral and social sciences is ubiquitous, appearing in a number of traditional (i.e., quantitative) and untraditional (i.e., qualitative) forms. Despite its rather extended roots in the social and behavioral sciences, it is not widely celebrated as a method. As partial evidence of its relative obscurity, consider the fact that between 1978 and the present, a search of PsycINFO uncovered only 36 articles and books using the keyword ‘secondary analysis.’ Furthermore, the phrase ‘secondary analysis’ appeared in fewer than half of the article or book titles. Books on the topic are also relatively scarce; fewer than a dozen were uncovered through the same search (e.g., Boruch et al. 1981, Elder et al. 1993, Hyman 1972, Stewart 1984). Ironically, the apparent obscurity of secondary analysis as a methodology is due to its pervasiveness. That is, the use of secondary data sources is so commonplace in many fields (e.g., economics, education, and sociology) that there is little need to call attention to it as a separable methodology. This makes sense because, unlike ethnography, survey research, or quasi-experimentation, each of which has distinctive methodological procedures and practices, secondary analysis does not involve a new or different set of tactics. Even from a statistical point of view, there is little to distinguish it from primary analysis, and most of the measurement, design, and statistical issues facing the secondary analyst are largely the same as those faced by a primary analyst. The obvious exception is that the secondary analyst is constrained by the scope and nature of the design (e.g., the sample, sample sizes, attrition, missing data, measures, and research design) inasmuch as these have been specified by someone else (McCall and Appelbaum 1991). As such, secondary analysis boils down to a data resource, not a methodology, per se.
However, the use of secondary data—especially when micro data records are used—does involve a unique set of logistical, ethical and practical considerations.
2. Varieties of Secondary Analysis Because primary data can assume a number of forms (e.g., data based on cross-sectional surveys, administrative records, panel surveys, observations), secondary analysis has taken on a variety of forms. Two main categories can be identified: (a) some data are developed for the explicit purpose of serving as a national resource; and (b) other forms of data are byproducts of the actions of individuals and organizations. The latter are included in the definition of secondary analysis because they play a large role in contemporary research. Given the increased use of the Internet, more administrative records, facts, and statistics (public and private) also will be readily available for use in research.
2.1 Traditional Forms of Secondary Analysis The most common forms of secondary data include population censuses, continuous or regular surveys, national cohort studies, multisource data sets, administrative records, and one-time surveys or research studies. In the US, interest in secondary analysis was prompted by the appearance of public opinion polls and surveys in the 1940s and 1950s, and continued in the 1960s, when the federal government began, in earnest, gathering data through large-scale surveys based on representative sampling (Hyman 1972). Of particular interest to social science researchers are the large-scale, nationally representative, longitudinal (panel) surveys. In the USA, even cursory literature searches reveal hundreds of publications in economics and sociology that use data drawn from, for example, the Panel Study of Income Dynamics (PSID), the National Longitudinal Survey of Youth (NLSY), and the General Social Survey (GSS). In the United Kingdom, the General Household Survey (GHS) has been conducted annually since the beginning of the 1970s. The reuse of large-scale survey data is also evident across many nations simultaneously, as when Ravillion and Chen (1997) examined the correspondence between the rate of growth in gross domestic product and changes in poverty rates.
The willingness of domestic and foreign governments to invest in data collection prompted Cherlin (1991) to speculate that the term secondary analysis will become obsolete. Many large-scale surveys are being sponsored by governments with no primary data analyses in mind. Whereas large-scale surveys like the PSID are deliberately fielded to answer a multitude of research questions about populations and subgroups, substantial data gathering also is undertaken by governments as byproducts of their operation, to monitor their own processes and outcomes, or to assess environmental conditions or events (e.g., weather, rainfall). These are generally referred to as administrative or governmental records. The use of governmental records in research has a long tradition; over 100 years ago, Emile Durkheim used government statistics to examine some of the causes of suicide (Cherlin 1991). Individual research studies also are fodder for secondary analysis. These reanalyses are undertaken to check the accuracy of statistical conclusions in primary studies; to test new statistical procedures or substantive hypotheses; to resolve conflicts among researchers; and to test supplemental hypotheses suggested in primary analyses. The latter are usually conducted by the primary analyst, and this category represents the prevalent use of the term ‘secondary analysis.’ In recent years, secondary analysts have reanalyzed multiple studies that use the same instrumentation, a tactic more akin to the spirit of meta-analysis. A particularly important role for secondary analysis of individual studies is testing or demonstrating the superiority of new statistical methods. A thoughtful example is provided by Muthen and Curran (1997). They showed that latent growth curve models, by exerting greater control over extraneous sources of error, produced larger intervention effects than had been previously reported. More generally, in the past 25 years, efforts to address causal questions with extant data have produced substantial advances in statistical modeling (Duncan 1991). In addition, reanalysis of data from program evaluation studies has been undertaken to assure policy-makers that the results of primary studies are technically sound (Boruch et al. 1981).
2.2 Less Traditional Forms of Secondary Analysis Although not a traditional form of secondary analysis, Webb et al. (1965) demonstrated that a substantial amount of ‘data’ are produced by individuals and organizations (public and private) as byproducts of their daily transactions. In their now classic text Unobtrusive Measures: Nonreactive Research in the Social Sciences, they identified a host of unconventional ways in which researchers can reuse existing data. In making their case, they showed how physical traces (erosion and accretion), the content of running records (e.g., mass media, judicial records), private records (sales records, credit card purchases), and contrived observations (e.g., hidden cameras) can be used as sources of data in research.
The literature is filled with studies based on creative uses of these artifacts. Examples include using: the type, amount, and ‘quality’ of household garbage to assess the dietary and recycling habits of Arizona residents; the amount of trash left in the streets by revelers at Mardi Gras to estimate the size of the daily crowd; and the differential wear and tear seen in separate sections to ascertain the popularity of sections of the International Encyclopedia of the Social Sciences. Finally, whereas secondary analysis has been historically viewed as a quantitative endeavor, reanalysis of qualitative data is now regarded as a legitimate part of the enterprise. It would appear that Webb et al. (1965) had a substantial influence on subsequent generations of primary researchers. The most underacknowledged uses of secondary data are within primary evaluation studies, where a mix of new and existing data are increasingly being used to evaluate the effectiveness of social interventions. The use of mixed method evaluation designs represents a core feature of contemporary evaluation theory and practice (Lipsey and Cordray 2000).
2.3 Advantages and Disadvantages of Secondary Analysis Secondary analysis offers several advantages over primary data collection in behavioral and social sciences, but it also has its shortcomings. On the positive side of the ledger, re-use of data is efficient. These efficiencies include: (a) replication (or not) of findings across investigators; and (b) the discovery of biases in conventional statistical methods. Both of these benefit science by winnowing false hypotheses more quickly and offering new evidence (estimates) in their place. Testing new hypotheses, beyond those envisioned by the data developers, adds to the efficiency of knowledge acquisition. These benefits have served as partial justification for the cost of conducting large-scale, nationally representative surveys (using cross-sectional and panel designs) that query individuals on a wide array of topics. Some topics involving special populations (e.g., twins) or long time frames cannot be investigated without reliance upon archives and data sharing among investigators. Alternatively, secondary data impose limits on what can be studied, where, and over what period of time. McCall and Appelbaum (1991) suggest using a Feasibility Matrix of Sample × Measure × Assessment Age as a tool for prescreening the potential utility of secondary data sets. In addition, technical problems (e.g., selection biases, sample attrition, and missing data) can be so severe as to limit the value of the primary data set. As such, assessing data quality probably needs to be included in the McCall–Appelbaum matrix.
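The prescreening idea can be illustrated with a small sketch. The text only names the three dimensions of the Feasibility Matrix, so the layout below (one true/false cell per combination of sample, measure, and assessment age) is an assumption made for illustration, and the cohort names, measures, and ages are hypothetical.

```python
from itertools import product

# Hypothetical inventory of an archived data set: each tuple records that a
# given measure was collected for a given sample at a given assessment age.
AVAILABLE = {
    ("cohort_A", "reading_score", 8),
    ("cohort_A", "reading_score", 12),
    ("cohort_A", "family_income", 8),
    ("cohort_B", "reading_score", 12),
}

def feasibility_matrix(samples, measures, ages, available):
    """Map every (sample, measure, age) cell to True if data exist for it."""
    return {cell: cell in available for cell in product(samples, measures, ages)}

def is_feasible(required_cells, matrix):
    """A planned analysis is feasible only if every cell it needs is filled."""
    return all(matrix.get(cell, False) for cell in required_cells)

matrix = feasibility_matrix(
    samples=["cohort_A", "cohort_B"],
    measures=["reading_score", "family_income"],
    ages=[8, 12],
    available=AVAILABLE,
)

# Two hypothetical research questions about change in reading between ages 8 and 12.
growth_in_a = [("cohort_A", "reading_score", 8), ("cohort_A", "reading_score", 12)]
growth_in_b = [("cohort_B", "reading_score", 8), ("cohort_B", "reading_score", 12)]

print(is_feasible(growth_in_a, matrix))  # both cells are filled
print(is_feasible(growth_in_b, matrix))  # the age-8 cell is missing
```

A data-quality dimension of the kind suggested above could be added by storing a rating in each cell instead of a true/false flag.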
The existence of data can short-circuit one facet of the scientific process, tempting the analyst to ‘mine’ the data rather than initiate analyses based on a theory or hypothesis. Although interesting results may emerge, such practices can lead to unreliable findings or findings limited to a single operationalization. The extent to which these problems in secondary analysis have influenced knowledge development is unknown. However, a mismatch between theory and data can be avoided with proper consideration of the relevance and quality of each data source.
3. What is Unique About Secondary Analysis? Using data generated by someone else does raise several issues that make secondary analysis somewhat unique. Obviously, secondary analysis is possible only when the data are available, easy to access, and in a form that is usable. Availability and access can be facilitated or impeded by a number of factors. In particular, because data do not ‘speak for themselves,’ they must be well documented. Data from government-sponsored, large surveys, panels and so on are often routinely archived and well documented. This is not uniform across all types of secondary data. Data sharing among individual investigators can become contentious, especially in light of questions about proprietary rights and data ownership. Researchers are obliged to honor original pledges of confidentiality and privacy. Balancing the desire to share data with these ethical requirements may require configuring the data in a way that may limit how it is disclosed and the methodological options available to the secondary analyst.
3.1 Availability, Access and Documentation Establishment of data archives, advances in computer technology, and the emergence of the World Wide Web (WWW) have greatly facilitated access to data and solved some of the early problems that plagued the first generation of secondary analysts (e.g., poor documentation, noncommon data formats and language). Since the 1960s, the Inter-university Consortium for Political and Social Research (ICPSR) has functioned as a major repository and dissemination service in the US. The National Archives have served a similar function for some governmental data. Increasingly, these roles have been devolved to individual governmental agencies that have developed skills in storing and disseminating their own data. In addition, the Internet has the potential for revolutionizing the use and transfer of data. Secondary analysis also has been institutionalized within journal and professional codes of conduct. To facilitate access to data appearing in scientific journals, many journals have adopted policies whereby contributing authors are expected to make the data from their studies available for a designated period of time (usually three to five years). Similarly, professional associations (e.g., the American Psychological Association) have incorporated data sharing into their codes of ethical behavior. Whereas archives and agencies require that data be properly documented, journals and professional associations are generally silent on this aspect of the data sharing process. Because data documentation is a process that records analytic decisions as they unfold over the course of the study, primary study authors need to be aware of these nontrivial editorial and ethical demands (see Boruch et al. 1981).
3.2 Ethical Considerations and Disclosure The need to protect the confidentiality and privacy of research participants is sometimes at odds with the desire to make data available to others. Resolving these competing values requires careful consideration at the time that primary research is conducted, documented, stored, and disseminated. Concealing the identity of participants can often be accomplished by removing personal identifiers from the data file. To the extent that identification is still possible through deductive disclosure (combining information to produce a unique profile of an individual), alternative procedures are needed. If identifiers are needed to link records (as in longitudinal research), additional layers of protection are needed. Fortunately, Boruch and Cecil (1979) provide a comprehensive treatment of the available statistical procedures (e.g., inoculating raw data with known amounts of error, randomized response techniques, collapsing categories) and institutional procedures (e.g., third parties who would serve as ‘data brokers’) that can be used to mitigate these problems. Data sharing among individual researchers requires explicit attention to ethical, statistical, organizational, and logistical (e.g., video editing and image modification) issues throughout the research process.
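As a rough illustration of deductive disclosure and of one remedy named above (collapsing categories), the sketch below counts how many records carry a unique combination of values before and after coarsening. It is not drawn from Boruch and Cecil (1979); the records, field names, and the 30-year age bands are hypothetical choices made only for the example.

```python
from collections import Counter

# Hypothetical de-identified records: names are already removed, but the
# remaining fields can still combine into a unique profile.
records = [
    {"age": 34, "zip": "37203", "occupation": "nurse"},
    {"age": 36, "zip": "37203", "occupation": "nurse"},
    {"age": 35, "zip": "37212", "occupation": "teacher"},
    {"age": 61, "zip": "37212", "occupation": "judge"},
    {"age": 38, "zip": "37212", "occupation": "teacher"},
    {"age": 67, "zip": "37203", "occupation": "judge"},
]

def unique_profiles(rows, keys):
    """Return the rows whose combination of values on `keys` occurs only once."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [row for row in rows if counts[tuple(row[k] for k in keys)] == 1]

def collapse(row):
    """Coarsen age into 30-year bands and drop the postal code entirely."""
    low = (row["age"] // 30) * 30
    return {"age_band": f"{low}-{low + 29}", "occupation": row["occupation"]}

at_risk_before = unique_profiles(records, ["age", "zip", "occupation"])
collapsed = [collapse(row) for row in records]
at_risk_after = unique_profiles(collapsed, ["age_band", "occupation"])

print(len(at_risk_before), len(at_risk_after))  # prints "6 0"
```

The trade-off noted in the text is visible here: the collapsed file no longer supports analyses that need exact age or location.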
3.3 Proprietary Rights and Data Ownership Changing policies, practices, and technology will facilitate data sharing and, as a consequence, increase use of data collected by someone else. Alternatively, secondary analysis of data for the purposes of addressing disputes among analysts can be quite difficult to negotiate. When conflict arises, it often revolves around who ‘owns’ the data and if, when, how, and how much of it should be disclosed to others. Establishing proprietary rights to research and evaluation data is not a simple matter. Data ownership depends on how the research was financed, policies of the sponsoring and host organizations, conditions specified in laws (e.g., Freedom of Information Act, Privacy Act), data sharing policies of the journal in which the research is published, and ethical guidelines of professional associations to which authors belong. As researchers embark on primary or secondary analyses, it is necessary to understand these avenues and constraints.
4. A Summary Note Access to quantitative and qualitative data from governments, businesses, and individual researchers has greatly facilitated the practice of secondary analysis. Changes in information technology, notably greater use of the World Wide Web, will undoubtedly enhance these practices even further at primary and secondary levels of research and evaluation.
See also: Archives and Historical Databases; Data Archives: International; Databases, Core: Demography and Registers; Intellectual Property: Legal Aspects; Intellectual Property Rights: Ethical Aspects; International Research: Programs and Databases; Meta-analysis: Overview; Privacy of Individuals in Social Research: Confidentiality; Unobtrusive Measures
Bibliography
Boruch R F, Cecil J S 1979 Assuring the Confidentiality of Social Research Data. University of Pennsylvania Press, Philadelphia, PA
Boruch R F, Wortman P M, Cordray D S and Associates (eds.) 1981 Reanalyzing Program Evaluations: Policies and Practices for Secondary Analysis of Social and Educational Programs, 1st edn. Jossey-Bass, San Francisco, CA
Cherlin A 1991 On analyzing other people’s data. Developmental Psychology 27(6): 946–8
Duncan G J 1991 Made in heaven: Secondary data analysis and interdisciplinary collaborators. Developmental Psychology 27(6): 949–51
Elder Jr. G H, Pavalko E K, Clipp E C 1993 Working with Archival Data: Studying Lives. Sage Publications, Newbury Park, CA
Hyman H H 1972 Secondary Analysis of Sample Surveys: Principles, Procedures and Potentialities. Wiley, London
Lipsey M W, Cordray D S 2000 Evaluation methods for social intervention. Annual Review of Psychology 51: 345–75
McCall R B, Appelbaum M I 1991 Some issues of conducting secondary analysis. Developmental Psychology 27(6): 911–17
Muthen B O, Curran P J 1997 General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods 2: 371–402
Ravillion M, Chen S H 1997 What can new survey data tell us about recent changes in distribution and poverty? World Bank Economic Review 11: 357–82
Stewart D W 1984 Secondary Research: Information Sources and Methods. Sage, Beverly Hills, CA
Webb E J, Campbell D T, Schwartz R D, Sechrest L 1965 Unobtrusive Measures: Nonreactive Research in the Social Sciences. Rand McNally, Chicago, IL
D. S. Cordray
Secrecy, Anthropology of A simple definition of secrecy as the deliberate concealment of information has served cross-cultural comparison (e.g., Tefft 1980, Bok 1982). Under this definition, people in cultures everywhere have secrets. Since a secret may turn out to be ‘empty’ (no hidden information in fact exists), some anthropologists have suggested that the definition be refocused on the practice or the ‘doing’ of secrets rather than on a secret’s informational content. Anthropologists have been primarily interested in what could be called public, or institutionalized, secrecy: those persistent practices of hiding information within kinship, religious, political, or economic groups. Areas of research have included concealments of ritual knowledge and the social functions of secret societies. Uses of more personal, or private, secrets have also attracted attention insofar as these relate to cultural constructions of personhood and to the dynamics of interpersonal relationships. The study of secrets has posed obvious methodological and ethical problems for ethnographers.
1. The Paradox of Secrecy The revelation of secrets is as important as the keeping of them. Ethnographers have called this the ‘paradox’ of secrecy (Bellman 1984). For a secret to persist within a social order, it must eventually be told to someone who may in turn keep the secret until it passes along again. Furthermore, the social consequence of secrets rests on the fact that people know of their existence even though content remains hidden. Public awareness of the existence of secrets can afford prestige and power to those who know what others do not. Many accounts of secrecy revisit Georg Simmel’s ground-breaking analysis of secrecy and secret societies (1950 [1908]), including his observation that secrets are a form of ‘inner property.’ Simmel suggested that secrets are a type of ‘adornment’—a jewel or bauble that seduces public attention. Secrets with social consequence thus must be at least partly transparent and also periodically transacted and revealed.
2. Public Secrecy Simmel’s commodification opened the door to a political economy of secrecy. If secrets are property, then they are produced, exchanged, and consumed. Secrets are public because they have exchange value within a knowledge marketplace, and because such exchange has political consequence. Systems of political hierarchy may be grounded in an unequal distribution of secrets. Those in the know may dominate those who do not know so long as the ignorant grant that hidden knowledge has value.
2.1 Secret Societies Anthropologists have pursued Simmel’s original concern with secret societies, documenting a variety of secret groups, associations, lodges, and clubs in cultures around the world the members of which are pledged to maintain secrets, ritual and otherwise. The
early comparative category ‘secret society’ was catholic enough to encompass Melanesian men’s houses and grade societies, Australian totemic cults, Native American medicine lodges, West African male and female associations, and more (Webster 1908). Secrecy itself—the one attribute that these diverse organizations had in common—thus enticed anthropological attention (as Simmel predicted). Liberal theories of democratic process and of the capitalist marketplace are suspicious of secrecy as they are of cabals and monopolies. Democracy requires an informed citizenry, and the market is supposed to depend on the free flow of information. From this perspective, secrecy generally goes against the public good. Structural-functionalist analysis, however, argued that secret societies commonly serve important social functions, including the education of children, preservation of political authority, punishment of lawbreakers, stimulation of the economy, and the like (Little 1949). Such ‘social’ secret societies address important community needs, unlike ‘anti-social’ secret societies whose goals are criminal or revolutionary (Wedgewood 1930).
2.2 Secrecy and Power In the latter years of the twentieth century, anthropological attention returned to issues of power and inequality. Neutral definitions of culture as more-or-less shared knowledge gave way to new concerns with the contradictions, gaps, variation, and disparity in that knowledge. Secret knowledge and secret societies might have social functions, but they also maintain ruling political and economic regimes. The distribution of public secrets typically parallels that of other rights and powers. Adults hide information from children, often until the latter have been ritually initiated into adulthood. Men refuse to share ritual knowledge with women. Family and lineage groups own various sorts of genealogical, medical, or historical knowledge, sharing these only with kin. Anthropologists have used a political economy of secrets to account for various forms of inequality. For example, the power of Kpelle elders over youth in Liberia was grounded in their management of the secrets of the Poro society (Murphy 1980). Beyond West Africa, anthropologists have argued that older, male keepers of secrets thereby acquired authority over women and the other uninitiated in various societies of Melanesia (Barth 1975), Native America (particularly Southwestern Pueblo cultures), and Australia (Keen 1994). Male appropriation of religious ritual, of access to the supernatural, of technology, of medicine, of sacred texts, of history, or of other valued information cemented men’s authority over the ignorant. Furthermore, women, children, and others in subordinate political position may be forced to reveal what they know, or otherwise denied rights to secrecy. Rights to have secrets are as consequential as the right to reveal information. Conversely, secrets can be a device to resist power. The dominated conceal what they know from authority. Women share knowledge kept from men. Slaves commune in secret languages. Children construct hidden worlds that evade adult awareness. In this reading, secrecy is a weapon of the weak that functions to resist and deflect supervision from above. Secrecy can also preserve a person’s idiosyncratic beliefs and practices from public deprecation as in the case, for example, of middle-class English witches (Luhrmann 1989). Alongside preservation of regimes of power, secrecy also contributes to perceptions of individual identity. Self-understanding may transform after a person has acquired secret information. Boys, for example, come to redefine themselves as men after progressing through an initiation ritual during which they learn adult secrets (Herdt 1990).
2.3 Rights to Reveal Secrets Many analysts of secrecy systems have concluded that many secrets are not actually secret. Women know men’s tricks: those spirit voices overheard during ritual are really bamboo flutes or bullroarers. Children pretend not to know that their parents are not really Santa Claus. Kin from one lineage are familiar with the supposedly secret names and genealogies of their neighbors. In oral societies, the leakiness of secret knowledge, in fact, helps maintain its viability within a community. If a secret holder dies, others are around to reveal lost knowledge if need be. Systems of secrecy often rely upon inequalities in rights to reveal knowledge rather than on an effective concealment of information, rights that the Kpelle summarize in the term ‘you cannot talk it’ (Bellman 1984). Folk copyrights of this sort regulate who may speak about what (Lindstrom 1990). Even though someone may know the content of concealed information, that person may not repeat this knowledge in public without the copyright to do so. Family groups in Vanuatu, for example, own songs, myths, genealogies, and medical recipes that others may not publicly reveal without that family’s permission. In private contexts, however, public secrets are surreptitiously discussed. Knowledge is regulated not just by restricted transmission (by secrecy) but also by copyrights that limit who speaks publicly. Folk systems of copyright transform secrets into ‘open’ secrets. Those supposed not to know must pretend not to know. And those supposed to know pretend to think that only they, in fact, do know. Anthropological analyses of the social and psychological dynamics of open secrecy prefigured work on other similar institutions, including the homosexual ‘closet’ (Sedgwick 1990, see also Taussig 1999 on public secrecy).
3. Personal Secrecy Secrecy becomes possible given particular assumptions about personhood. The person must comprise some sort of inner self where secrets are stored along with an intention and capacity to conceal information. One can imagine a culture where notions of the person lacking such an inner self might deny the possibility of secrecy. No such culture exists, although Western conceptions of childhood have often presumed that psychologically immature children lack competence with secrecy, and that aptitudes to intentionally conceal information emerge as part of the child developmental process (Bok 1982). Western historians, too, have suggested that there have been different regimes of secrecy in the past, related to shifts in constructions of personhood. Simmel (1950) connected the evolution of secrecy to that of individualization and urbane modernity (his argument recalls similar evolutionary accounts of personal privacy). In pre-urban societies, lack of individualism and everyday intimacies of contact made secrecy difficult. Simmel supposed that secrets increased with developing opportunities for personal reserve and discretion. More recent historians, stimulated by the work of Michel Foucault (1978), have argued instead that modernity shrinks opportunities for personal secrecy, and that bureaucratic power structures increasingly have come to rely upon the monitoring of individuals, either by themselves or by institutional devices that extract information. According to Foucault, the origins of such extraction trace back to the Christian practice of confession. People are obliged to reveal their secrets to ecclesiastical, juridical, and other authorities. Revelation to authority confirms one’s subjugation within a social order. The modern individual possesses inner capacities to conceal information but also contradictory urges and duties to reveal those secrets.
Stimulated by the work of Erving Goffman (1959) and others, students of interpersonal relationships have remarked on the significance of secrets, masks, and ‘face’ in an everyday micropolitics of self-presentation. Such studies have noted more egalitarian uses of revelation. People strengthen their relationships, creating communities of trust, by revealing secret information. This may be a secret about themselves, or a secret about another that they pass along, often in return for the promise ‘not to tell.’ Such secrets are a kind of gossip (Bok 1982), the exchange of which remarks and maintains sentiments of friendship. Personal secrets are a social currency that people invest in their relationships. Whereas public secrets maintain political value insofar as their transmission is restricted, the value of personal secrets flows from their easy everyday revelation.
4. The Study of Secrecy Anthropology’s interest in cross-cultural description as a whole can be taken to be the desire to learn and reveal other people’s secrets (Taussig 1999). Methodologically, ethnographers face obvious problems of access to concealed information, but the study of secrecy raises even larger ethical puzzles. Anthropological codes of ethics generally require ethnographers to ensure that research does not harm the safety, dignity, or privacy of the people they study. Some researchers have described the structure and function of secrecy systems while avoiding details of secret knowledge content. Others have promised not to make their publications available to those members of a community (women, often) who should not have access to secret information. A few have refrained from publishing at all and restrict access to their fieldnotes. Ethical issues are thorniest where the secrets that anthropologists probe help maintain unjust social orders. See also: Censorship and Secrecy: Legal Perspectives; Emotions, Sociology of; Guilt; Interpersonal Trust across the Lifespan; Knowledge, Anthropology of; Ritual; Trust: Philosophical Aspects; Trust, Sociology of
Bibliography
Barth F 1975 Ritual and Knowledge Among the Baktaman of New Guinea. Yale University Press, New Haven, CT
Bellman B L 1984 The Language of Secrecy: Symbols & Metaphors in Poro Ritual. Rutgers University Press, New Brunswick, NJ
Bok S 1982 Secrets: On the Ethics of Concealment and Revelation, 1st edn. Pantheon Books, New York
Foucault M 1978 The History of Sexuality Volume 1: An Introduction. Pantheon Books, New York
Goffman E 1959 The Presentation of Self in Everyday Life. Doubleday, Garden City, NY
Herdt G 1990 Secret societies and secret collectives. Oceania 60: 361–81
Keen I 1994 Knowledge and Secrecy in an Aboriginal Religion. Oxford University Press, Oxford, UK
Lindstrom L 1990 Knowledge and Power in a South Pacific Society. Smithsonian Institution Press, Washington, DC
Little K L 1949 The role of the secret society in cultural specialization. American Anthropologist 51: 199–212
Luhrmann T M 1989 The magic of secrecy. Ethos 17: 131–65
Murphy W P 1980 Secret knowledge as property and power in Kpelle society: Elders versus youth. Africa 50: 193–207
Sedgwick E K 1990 Epistemology of the Closet. University of California Press, Berkeley, CA
Simmel G 1950 [1908] The secret and the secret society. In: Wolff K (ed.) The Sociology of Georg Simmel. Free Press, Glencoe, IL
Taussig M 1999 Defacement: Public Secrecy and the Labor of the Negative. Stanford University Press, Stanford, CA
Tefft S K 1980 Secrecy: A Cross-cultural Perspective. Human Sciences Press, New York
Webster H 1908 Primitive Secret Societies: A Study in Early Politics and Religion, 2nd edn. rev. Macmillan, New York
Wedgewood C H 1930 The nature and function of secret societies. Oceania 1: 129–45
L. Lindstrom
Secular Religions The term ‘secular religions’ can be used to describe certain apparently secular enterprises that appear to share common features with enterprises usually thought of as ‘religious.’ ‘Secular religion’ is one of several terms (including ‘civil religion,’ ‘invisible religion,’ ‘para-religion,’ and ‘quasi-religion’) which draw attention to religious and religious-like beliefs and activities which do not fit easily into the Western folk conception of religion as a distinct institutional structure focused on a transcendent being or beings.
1. Examples of ‘Secular Religions’ At first blush, the very notion of ‘secular religion’ would appear to be an oxymoron. In both popular and sociological usage, the term ‘religion’ is typically associated with the realm of the ‘sacred’ and contrasted with that of the ‘secular.’ However, a number of scholars have pointed out striking similarities between ostensibly religious and ostensibly secular undertakings. Secular ideologies, it is argued, may, like religions, unite followers into a community of shared beliefs. They may provide adherents with a sense of meaning and ultimate purpose. They may inspire in believers a sense of transcendence usually associated with religion. Attempts to find parallels between the sacred and the secular have been especially common in studies of the political realm and in studies of therapeutic organizations and activities.
1.1 Political ‘Religions’ Numerous scholars have highlighted the religious aspects of political movements and ideologies. Communism, for example, has often been regarded as a secular religion. Zuo (1991) has recently described the veneration of Chairman Mao during the Chinese Cultural Revolution as a political religion replete with sacred beings (Mao himself), sacred texts (the Little Red Book), and ritual (political denunciations). The term ‘political religion’ has also been employed to describe attempts made in developing societies to rally support for the concept of the nation. Crippin (1988) has gone so far as to argue that nationalism is the religion par excellence in modern society and that it is displacing more traditional forms of religion. O’Toole (1977) has used the term ‘sect’ to describe certain political groups operating in Canada, including the Socialist Labor Party, followers of De Leon who wait for a Communist millennium which they regard as imminent. Such social movements as environmentalism, the animal rights movement, and the health food movement have been described as quasi-religions insofar as they provide adherents with a coherent worldview and sense of purpose at the same time that they command a great deal of loyalty.
1.2 Therapeutic ‘Religions’ Many scholars have drawn attention to ritual aspects of Western medical practice. Others have pointed out that psychiatrists have much in common with shamans and other religious healers. A number of researchers have pointed out the similarities that exist between self-help groups and religious organizations (Galanter 1989, Rudy and Greil 1988). One family within the self-help movement that is perhaps more obviously ‘religious’ in character than most of the others includes Alcoholics Anonymous (AA) and other 12-step or codependency groups. A few among many of the religious characteristics of AA are a conception of the sacred, ceremonies and rituals, creedal statements, experiences of transcendence, and the presence of an AA philosophy of life.
1.3 Other Examples Other examples of attempts to point out analogies between apparently secular enterprises and religious ones abound in the social scientific literature. Within the realm of sport, attention has been paid to ritual elements and experiences of transcendence to be found in cricket, baseball, football, and the Olympic games. In the sphere of business, some writers have drawn attention to the sectarian characteristics of certain types of business, such as home-party sales organizations and direct sales organizations. It has now become commonplace among students of corporate culture to regard such ordinary activities as meetings, presentations and retirement dinners as rituals. Among the voluntary organizations that have been interpreted as quasi-religions are the Boy Scouts, groups of hobbyists and collectors, and ‘fan clubs.’ Jindra (1994) has described the phenomenon of Star Trek ‘fandom’ as religious in that it provides participants with a common worldview and inspires high levels of commitment. Several writers have attempted to characterize atheism itself as a religious enterprise, characterizing Ethical Culture and other humanist groups as analogous to religions. Still others have viewed faith in science as the dominant religious perspective of the contemporary era.
2. ‘Secular Religions’ and the Problem of Defining Religion It may be argued that an interest in secular religion is a natural outgrowth of the functionalist perspective in classical sociology, which tended to view religion as serving a necessary function for the maintenance of society by uniting members of a society into a common moral universe. If, as many early sociologists thought, supernaturalistic conceptions of the universe were destined to recede in the face of industrialization and the increasing influence of science, the question arose as to what phenomena might serve as the ‘social cement’ of future societies. Comte’s call for a ‘religion of humanity’ qualifies as an early sociological mention of the notion of a ‘secular religion.’ Durkheim’s expectation that a ‘cult of the individual’ might play a similar role in the maintenance of society as that traditionally played by religions represents another early effort to provide intellectual justification for the idea that apparently secular ideologies may have religious characteristics. As appealing and intuitive as the idea that secular enterprises may share important features with religions might be, the notion of a ‘secular religion’ must confront theoretical problems concerning the appropriate sociological definition of ‘religion’ and of ‘the sacred.’ Functional definitions of religion emphasize that the essential element in religion is the provision of an encompassing system of meaning or the ability to connect people to the ultimate conditions of their existence. Substantive definitions of religion argue that what distinguishes religion from other types of human activity is its reference to the sacred, the supernatural, or the superempirical. The advantage of functional definitions is a breadth and inclusiveness that allows one to look at beliefs and practices not commonly referred to as religious but which may nonetheless resemble religious phenomena in important ways. 
One major disadvantage is that they may have the effect of so broadening the concept of religion that it becomes meaningless. While substantive definitions avoid this problem, they may result in allowing traditional Western conceptions of the nature of religion to determine what is to be included in the ‘religious’ category. Viewed from the perspective of the debate over the definition of religion, the theoretical problem with the concept of ‘secular
religion’ is that it seems to rely simultaneously on both a broad functional approach to religion and on a narrower substantive approach. The idea that environmentalism or nationalism could be properly called a religion relies on a functional definition, while the idea that such religions are ‘secular’ rather than ‘sacred’ necessarily depends on a substantive definition. One might argue that both functional and substantive definitions of religion share the weakness that they privilege social scientists’ conceptions of religion over folk conceptions of religion, that is to say the ways that people use the term ‘religion’ in everyday life. Some social scientists would therefore argue that the search for a scientifically valid definition of religion is futile and that the best that can be done is to define religion ethnographically, treating it as a ‘category of discourse’ with meanings that are changeable over time. The position that scholars take with regard to these definitional issues will necessarily influence the way in which they approach the study of ‘secular religions.’ This is perhaps one reason why there is at present no consensus within the social sciences with regard to the most appropriate method for studying phenomena residing on the border between the sacred and the secular.
3. Approaches to the Study of ‘Secular Religions’ Within contemporary sociology and anthropology, there exist several different research traditions that focus on the boundary between the religious and the non-religious.
3.1 The Notion of Civil Religion While Rousseau coined the term ‘civil religion,’ its development as a social scientific concept is attributed to Robert Bellah (1967). Working within the functionalist tradition, which sees religion as integrative for society, Bellah argued for the existence of a US civil religion, an ‘institutionalized collection of sacred beliefs about the American nation,’ which binds US citizens together in spite of denominational pluralism. A key tenet of US civil religion is the conception of the USA as a nation with a divinely ordained mission. Although the civil religion concept was developed in the US context, it has been applied to the analysis of many states including Canada, Israel, and Malaysia (see Christianity: Evangelical, Revivalist, and Pentecostal).
3.2 The Implicit Religion Tradition Although the term ‘implicit religion’ was coined and popularized by Edward Bailey (1983), the concept owes much to the work of Thomas Luckmann (1967).
Working with a broad functional definition of religion as the transcending of biological nature and the formation of a self, Luckmann argues that religion is a human universal. Luckmann maintains that traditional religions have become irrelevant in the modern world but that religion, rather than disappearing, has become personalized, privatized, and ‘invisible’ (see Religiosity: Modern). Following Luckmann’s lead, Bailey’s implicit religion approach looks for the experience of the sacred within events of everyday life ordinarily dismissed as profane. Thus, in a study of interaction in an English public house, Bailey interprets the ethos of ‘being a man,’ mastering alcohol and respecting the selves of others as implicitly religious.
3.3 The Study of ‘Religious’ Forms There also exist a relatively large number of studies of general social ‘forms’ which are deemed to be relevant in both sacred and secular contexts. General discussions of ‘secular ritual,’ for example, fall into this category. Goffman’s (1967) work on the functional significance of such rituals of everyday life as ‘saving face’ and showing ‘deference’ is particularly well known. Collins (1981) treats ‘interaction ritual chains’ as a key element in his attempt to lay a microsociological foundation for macro-sociology. Once conversations, social greetings, and athletic contests are allowed to count as ritual, ritual becomes virtually coterminous with social life. For this reason, some scholars reject ‘ritual’ as a meaningless category, while others argue that the ubiquity of ritual simply means that ritualizing is a fundamental human activity. Many sociological studies of the commitment process either explicitly or implicitly present the commitment process as operating in much the same way in both sacred and secular contexts (Kanter 1972).
A number of writers have developed models of the identity change process which highlight the similarity between religious conversion and other forms of identity change (Galanter 1989, Greil and Rudy 1983). Studies that employ the term ‘sect’ in the analysis of apparently secular organizations have already been discussed.
3.4 The ‘Quasi-religion’ Approach The quasi-religion approach popularized by Greil (1993) and his colleagues relies on an ethnographic approach to the term ‘religion,’ regarding it not as an objective category susceptible to social scientific definition, but as a claim negotiated in the course of social interaction. Greil distinguishes between para-religions and quasi-religions. Para-religions, which are ostensibly nonreligious entities that nonetheless deal with matters of ultimate concern, resemble the enterprises referred to in this article as ‘secular religions.’ The term ‘quasi-religion,’ on the other hand, refers to groups, like AA or certain occult groups, which deal with matters of transcendence or ultimate concern, but which do not see themselves or present themselves unambiguously as religious. The concern here is not to determine whether a particular group is or is not a quasi-religion but to examine the process by which particular groups attempt to claim or repudiate the religious label and by which other groups and individuals respond to these claims.
4. The Theoretical Significance of Secular Religions Pursuing the analogy between ostensibly secular enterprises and ‘religion’ as it is usually conceived raises important questions concerning the proper definition of religion, the process of secularization and the nature of religion in contemporary societies. The study of secular religions puts into bold relief the problematic nature of social scientific attempts to define religion. How one defines religion has important consequences for how one thinks about secular religions and for whether or not one regards the concept as useful. Attention to religious border phenomena presses one to consider the value of an ethnographic approach to definition that conceptualizes religion as a category of discourse whose precise meaning and implications are continually being negotiated in the course of social interaction. The study of secular religions also directs attention to difficulties in specifying and evaluating the secularization thesis, which claims that the influence of religion is declining in contemporary societies (see Secularization). It should be clear that one’s approach to the question of secularization (including whether one even conceives of secularization as a theoretical possibility) depends on one’s definition of religion and on one’s treatment of religious border phenomena. Several of the approaches to secular religion discussed here imply that the traditional Western conception of religion as an institutionalized set of beliefs and practices focusing on a transcendent deity is beginning to lose its hold over many people. Likewise, many people seem to be tending to give expression to their experiences of transcendence in institutional contexts that have not typically been thought of as religious. If this is, in fact, the case, the line between religion and nonreligion may be expected to become even more blurred and the study of religious border phenomena even more central to the social scientific enterprise.
See also: Communism; Laicization, History of; Nationalism, Sociology of; New Religious Movements; Religion, Sociology of; Religiosity, Sociology of
Bibliography
Bailey E 1983 The implicit religion of contemporary society: An orientation and plea for its study. Religion 13: 69–83
Bellah R N 1967 Civil religion in America. Daedalus 96: 1–21
Collins R 1981 On the micro-foundations of macro-sociology. American Journal of Sociology 86: 984–1014
Crippin T 1988 Old and new gods in the modern world: Toward a theory of religious transformation. Social Forces 67: 316–36
Galanter M 1989 Cults: Faith, Healing, and Coercion. Oxford University Press, New York
Goffman E 1967 Interaction Ritual: Essays on Face-to-Face Behavior, 1st edn. Anchor Books, Garden City, NY
Greil A L 1993 Explorations along the sacred frontier: Notes on para-religions, quasi-religions, and other boundary phenomena. In: Bromley D, Hadden J K (eds.) Handbook of Cults and Sects in America: Assessing Two Decades of Research and Theory Development (Volume 2 of Religion and the Social Order). JAI Press, Greenwich, CT, pp. 153–72
Greil A L, Rudy D R 1983 Conversion to the world view of Alcoholics Anonymous: A refinement of conversion theory. Qualitative Sociology 6: 5–28
Jindra M 1994 Star trek fandom as a religious phenomenon. Sociology of Religion 55: 27–51
Kanter R M 1972 Commitment and Community: Communes and Utopias in Sociological Perspective. Harvard University Press, Cambridge, MA
Luckmann T 1967 The Invisible Religion: The Problem of Religion in Modern Society. Macmillan, New York
O’Toole R 1977 The Precipitous Path: Studies in Political Sects. Peter Martin, Toronto, Canada
Rudy D R, Greil A L 1988 Is Alcoholics Anonymous a religious organization? Sociological Analysis 50: 41–51
Zuo J P 1991 Political religion: The case of the cultural revolution in China. Sociological Analysis 52: 99–110
A. L. Greil
Secularization In its precise historical sense, ‘secularization’ refers to the transfer of persons, things, meanings, etc., from ecclesiastical or religious to civil or lay use. In its broadest sense, often postulated as a universal developmental process, secularization refers to the progressive decline of religious beliefs, practices, and institutions.
1. The Term ‘Secularization’ Etymologically, the term secularization derives from the medieval Latin word saeculum, with its dual temporal-spatial connotation of secular age and secular world. Such a semantic connotation points to the fact that social reality in medieval Christendom was structured through a system of classification which
divided ‘this world’ into two heterogeneous realms or spheres, ‘the religious’ and ‘the secular.’ This was a particular, and historically rather unique, variant of the kind of universal dualist system of classification of social reality into sacred and profane realms, postulated by Émile Durkheim. In fact, Western European Christendom was structured through a double dualist system of classification. There was, on the one hand, the dualism between ‘this world’ (the City of Man) and ‘the other world’ (the City of God). There was, on the other hand, the dualism within ‘this world’ between a ‘religious’ and a ‘secular’ sphere. Both dualisms were mediated, moreover, by the sacramental nature of the church, situated in the middle, simultaneously belonging to both worlds and, therefore, able to mediate sacramentally between the two. The differentiation between the cloistered regular clergy and the secular clergy living in the world was one of the many manifestations of this dualism. The term secularization was first used in canon law to refer to the process whereby a religious monk left the cloister to return to the world and thus become a secular priest. In reference to an actual historical process, however, the term secularization was first used to signify the lay expropriation of monasteries, landholdings, and the mortmain wealth of the church after the Protestant Reformation. Thereafter, secularization has come to designate any transfer from religious or ecclesiastical to civil or lay use.
1.1 Secularization as a Historical Process Secularization refers to the historical process whereby the dualist system within ‘this world’ and the sacramental structures of mediation between this world and the other world progressively break down until the medieval system of classification disappears. Max Weber’s expressive image of the breaking of the monastery walls remains perhaps the best graphic expression of this radical spatial restructuration initiated by the Protestant Reformation. This process, which Weber conceptualized as a general reorientation of religion from an other-worldly to an inner-worldly direction, is literally a process of secularization. Religious ‘callings’ are redirected to the saeculum. Salvation and religious perfection are no longer to be found in withdrawal from the world but in the midst of worldly secular activities. In fact, the symbolic wall separating the religious and the secular realms breaks down. The separation between ‘this world’ and ‘the other world,’ for the time being at least, remains. But, from now on, there will be only one single ‘this world,’ the secular one, within which religion will have to find its own place. To study what new systems of classification and differentiation emerge within this one secular world and what new place religion will have within it
is precisely the analytical task of the theory of secularization. Obviously, such a concept of secularization refers to a particular historical process of transformation of Western European Christian societies and might not be directly applicable to other non-Christian societies with very different modes of structuration of the sacred and profane realms. It could hardly be applicable, for instance, to such ‘religions’ as Confucianism or Taoism, insofar as they are not characterized by high tension with ‘the world’ and have no ecclesiastical organization. In a sense those religions which have always been ‘worldly’ and ‘lay’ do not need to undergo a process of secularization. In itself such a spatial-structural concept of secularization describes only changes in the location of Christian religion from medieval to modern societies. It tells very little, however, about the extent and character of the religious beliefs, practices, and experiences of individuals and groups living in such societies. Yet the theory of secularization adopted by the modern social sciences incorporated the beliefs in progress and the critiques of religion of the Enlightenment and of Positivism and assumed that the historical process of secularization entailed the progressive decline of religion in the modern world. Thus, the theory of secularization became embedded in a philosophy of history that saw history as the progressive evolution of humanity from superstition to reason, from belief to unbelief, from religion to science.
2. The Secularization Paradigm Secularization might have been the only theory within the social sciences that was able to attain the status of a paradigm. In one form or another, with the possible exception of Alexis de Tocqueville, Vilfredo Pareto, and William James, the thesis of secularization was shared by all the founders. Paradoxically, the consensus was such that for over a century the theory of secularization remained not only uncontested but also untested. Even Durkheim’s and Weber’s work, while serving as the foundation for later theories of secularization, offers scant empirical analysis of modern processes of secularization, particularly of the way in which those processes affect the place, nature and role of religion in the modern world. Even after freeing themselves from some of the rationalist and positivist prejudices about religion, they still share the major intellectual assumptions of the age about the future of religion. For Durkheim, the old gods were growing old or already dead and the dysfunctional historical religions would not be able to compete with the new functional gods and secular moralities which modern societies were bound to generate. For Weber, the process of
intellectual rationalization, carried to its culmination by modern science, had ended in the complete disenchantment of the world, while the functional differentiation of the secular spheres had displaced the old integrative monotheism, replacing it with the modern polytheism of values and their unceasing and irreconcilable struggle. The old churches remain only as a refuge for those ‘who cannot bear the fate of the times like a man’ and are willing to make the inevitable ‘intellectual sacrifice.’ Only in the 1960s does one find the first attempts to develop more systematic and empirically grounded formulations of the theory of secularization in the works of Acquaviva (1961), Berger (1967), Luckmann (1963), and Wilson (1966). It was then, at the very moment when theologians were celebrating the death of God and the secular city, that the first flaws in the theory became noticeable and the first systematic critiques were raised by Martin (1969) and Greeley (1972) in what constituted the first secularization debate. For the first time it became possible to separate the theory of secularization from its ideological origins in the Enlightenment critique of religion and to distinguish the theory of secularization, as a theory of the modern autonomous differentiation of the secular and the religious spheres, from the thesis that the end result of the process of modern differentiation would be the progressive erosion, decline and eventual disappearance of religion. Greeley (1972) already pointed out that the secularization of society, which he conceded, by no means implied the end of church religiosity, the emergence of ‘secular man,’ or the social irrelevance of religion in modern secular societies. Yet after three decades the secularization debate continues unabated.
Defenders of the theory tend to point to the secularization of society and to the decline of church religiosity in Europe as substantiating evidence, while critics tend to emphasize the persistent religiosity in the United States and widespread signs of religious revival as damaging counterevidence that justifies discarding the whole theory as a ‘myth.’
3. The Three Subtheses of the Theory of Secularization Although it is often viewed as a single unified theory, the paradigm of secularization is actually made up of three different and disparate propositions: secularization as differentiation of the secular spheres from religious institutions and norms, secularization as general decline of religious beliefs and practices, and secularization as privatization or marginalization of religion to a privatized sphere. Strictly speaking, the core and central thesis of the theory of secularization is the conceptualization of the historical process of societal modernization as a process of functional differentiation and emancipation of the secular
spheres—primarily the state, the economy, and science—from religion and the concomitant specialized and functional differentiation of religion within its own newly found religious sphere. The other subtheses, the decline and privatization of religion, were added as allegedly necessary structural consequences of the process of secularization. Maintaining this analytical distinction should allow the examination and testing of the validity of each of the three propositions independently of each other and thus refocus the often fruitless secularization debate into comparative historical analysis that could account for obviously different patterns of secularization.
3.1 The Differentiation and Secularization of Society
The medieval dichotomous classification of reality into religious and secular realms was to a large extent dictated by the church. In this sense, the official perspective from which medieval societies saw themselves was a religious one. Everything within the saeculum remained an undifferentiated whole as long as it was viewed from the outside, from the perspective of the religious. The fall of the religious walls put an end to this dichotomous way of thinking and opened up an entire new space for processes of internal differentiation of the various secular spheres. Now, for the first time, the various secular spheres (politics, economics, law, science, art, etc.) could come fully into their own, become differentiated from each other, and follow what Weber called their ‘internal and lawful autonomy.’ The religious sphere, in turn, became a less central and spatially diminished sphere within the new secular system, but also a more internally differentiated one, specializing in ‘its own religious’ function and losing many other ‘nonreligious’ (clerical, educational, social welfare) functions (Luhmann 1977). The loss in functions entailed as well a significant loss in hegemony and power. It is unnecessary to enter into the controversial search for first causes setting the modern process of differentiation into motion. It suffices to stress the role which four related and parallel developments played in undermining the medieval religious system of classification: the Protestant Reformation; the formation of modern states; the growth of modern capitalism; and the early modern scientific revolution. Each of the four developments contributed its own dynamic to modern processes of secularization. The four of them together were certainly more than sufficient to carry the process through.
The Protestant Reformation, by undermining the universalist claims of the Roman Catholic church, helped to destroy the old organic system of Western Christendom and to liberate the secular spheres from religious control. Protestantism also served to legitimate the rise of bourgeois man and of the new entrepreneurial classes, the rise of the modern sovereign state against the universal Christian monarchy, and the triumph of the new science against Catholic scholasticism. Moreover, Protestantism may also be viewed as a form of internal secularization, as the vehicle through which Christian religious contents were to assume institutionalized secular forms in modern societies, thereby erasing the religious/secular divide. If the universalist claims of the church as a salvation organization were undermined by the religious pluralism introduced by the Reformation, its monopolist compulsory character was undermined by the rise of a modern secular state which progressively was able to concentrate and monopolize the means of violence and coercion within its territory. In the early absolutist phase the alliance of throne and altar became even more accentuated. The churches attempted to reproduce the model of Christendom at the national level, but all the territorial national churches, Anglican, Lutheran, Catholic, and Orthodox, fell under the caesaro–papist control of the absolutist state. As the political costs of enforcing conformity became too high, the principle cuius regio eius religio turned into the principle of religious tolerance and state neutrality towards privatized religion, the liberal state’s preferred form of religion. Eventually, new secular raison d’état principles led to the constitutional separation of church and state, even though some countries such as England and the Scandinavian Lutheran countries may have maintained formal establishment. Capitalism, that revolutionizing force in history which according to Marx ‘melts all that is solid into air and profanes all that is holy,’ had already sprouted within the womb of the old Christian society in the medieval towns.
No other sphere of the saeculum would prove more secular and more unsusceptible to religious and moral regulation than the capitalist market. Nowhere is the transvaluation of values which takes place from Medieval to Puritan Christianity as radical and as evident as in the change of attitude towards ‘charity’ (that most Christian of virtues) and towards poverty. Following Weber (1958), one could distinguish three phases and meanings of capitalist secularization: in the Puritan phase, ‘asceticism was carried out of monastic cells into everyday life’ and secular economic activities acquired the meaning and compulsion of a religious calling; in the utilitarian phase, as the religious roots dried out, the irrational compulsion turned into ‘sober economic virtue’ and ‘utilitarian worldliness’; finally, once capitalism ‘rests on mechanical foundations,’ it no longer needs religious or moral support and begins to penetrate and colonize the religious sphere itself, subjecting it to the logic of commodification (Berger 1967). The conflict between the church and the new science, symbolized by the trial of Galileo, was not as much about the substantive truth or falsity of the new
Copernican theory as it was about the validity of the claims of the new science to have discovered a new autonomous method of obtaining and verifying truth. The conflict was above all about science’s claims to differentiated autonomy from religion. The Newtonian Enlightenment established a new synthesis between faith and reason, which in Anglo-Saxon countries was to last until the Darwinian crisis of the second half of the nineteenth century. Across the Channel, however, the Enlightenment became patently radicalized and militantly anti-religious. Science was transformed into a scientific and scientistic worldview which claimed to have replaced religion the way a new scientific paradigm replaces an outmoded one. As each of these carriers (Protestantism, the modern state, modern capitalism, and modern science) developed different dynamics in different places and at different times, the patterns and the outcomes of the historical process of secularization varied accordingly. Yet it is striking how few comparative historical studies of secularization there are which would take these four, or other, variables into account. David Martin’s A General Theory of Secularization (1978) is perhaps the single prominent exception. Only when it comes to capitalism has it been nearly universally recognized that economic development affects the ‘rates of secularization,’ that is, the extent and relative distribution of religious beliefs and practices. This positive insight, however, turns into a blinder when it is made into the sole variable accounting for different rates of secularization.
As a result, those cases in which the expected positive correlation is not found between rates of secularization and rates of industrialization, urbanization, proletarianization, and education (in short, indicators of socio-economic development) are classified as ‘exceptions’ to the ‘rule.’
3.2 The Decline of Religion Thesis
The assumption, often stated but mostly unstated, that religion in the modern world was declining and would likely continue to decline has been until recently a dominant postulate of the theory of secularization. It was based primarily on evidence from European societies showing that the more closely people were involved in industrial production, the less religious they became, or at least, the less they took part in institutional church religiosity. The theory assumed that the European trends were universal and that non-European societies would evince similar rates of religious decline with increasing industrialization. It is this part of the theory which has proven patently wrong. One should keep in mind the inherent difficulties in comparative studies of religion. Globally, the evidence is insufficient and very uneven. There is often no consensus as to what counts as religion and even when there is agreement on the object of study, there is likely to be disagreement on which of the dimensions of religiosity (membership affiliation, official vs. popular religion, beliefs, ritual and nonritual practices, experiences, doctrinal knowledge, and their behavioral and ethical effects) one should measure and how various dimensions should be ranked and compared. Nevertheless, one can prudently state that since World War II, despite rapid increases in industrialization, urbanization, and education, most religious traditions in most parts of the world have either experienced some growth or maintained their vitality (Whaling 1987). The main exceptions were: the rapid decline of primal religions mostly in exchange for more ‘advanced’ ones (mainly Islam and Christianity), the sudden and dramatic decline of religion in communist countries, a process which is now being reversed after the fall of communism, and the continuous decline of religion throughout much of Western Europe (and, one could add, some of its colonial outposts such as Argentina, Uruguay, New Zealand, and Quebec). What remains, therefore, as significant and overwhelming evidence is the progressive and apparently still continuing decline of religion in Western Europe. It is this evidence which has always served as the empirical basis for most theories of secularization. Were it not for the fact that religion shows no uniform sign of decline in Japan or the United States, two equally modern societies, one could still perhaps maintain the ‘modernizing’ developmentalist assumption that it is only a matter of time before the more ‘backward’ societies catch up with the more ‘modern’ ones. But such an assumption is no longer tenable. Leaving aside the evidence from Japan, a case which should be crucial, however, for any attempt to develop a general theory of secularization, there is still the need to explain the obviously contrasting religious trends in Western Europe and the United States. Until very recently, most of the comparative observations as well as attempts at explanation came from the European side.
European visitors have always been struck by the vitality of American religion. The United States appeared simultaneously as the land of ‘perfect disestablishment’ and as ‘the land of religiosity par excellence.’ Yet, Europeans rarely felt compelled to put into question the thesis of the general decline of religion in view of the American counterevidence. Religious decline was so much taken for granted that what required an explanation was the American ‘deviation’ from the European ‘norm.’ The standard explanations have been either the expedient appeal to ‘American exceptionalism’ or the casuistic strategy of ruling out the American evidence as irrelevant, because American religion was supposed to have become so ‘secular’ that it should no longer count as religion (Luckmann 1963). From a global–historical perspective, however, it is the dramatic decline of religion in Europe that truly demands an explanation. A plausible answer would require a search for independent variables, for those independent carriers of secularization present in Europe but absent in the United States. Looking at the four historical carriers mentioned above, neither Protestantism nor capitalism would appear as plausible candidates. The state and scientific culture, however, could serve as plausible independent variables, since church–state relations and the scientific worldviews carried by the Enlightenment were significantly different in Europe and America. What the United States never had was an absolutist state and its ecclesiastical counterpart, a caesaro–papist state church. It was the caesaro–papist embrace of throne and altar under absolutism that perhaps more than anything else determined the decline of church religion in Europe. Consistently throughout Europe, nonestablished churches and sects in most countries have been able to withstand the secularizing trends better than the established church. It was the very attempt to preserve and prolong Christendom in every state and thus to resist modern functional differentiation that nearly destroyed the churches in Europe. The Enlightenment and its critique of religion became themselves independent carriers of processes of secularization wherever the established churches became obstacles to the modern process of functional differentiation or resisted the emancipation of the cognitive–scientific, political–practical, or aesthetic–expressive secular spheres from religious and ecclesiastical tutelage. In such cases, the Enlightenment critique of religion was usually adopted by social movements and political parties, becoming in the process a self-fulfilling prophecy. By contrast, wherever religion itself accepted, perhaps even furthered, the functional differentiation of the secular spheres from the religious sphere, the radical Enlightenment and its critique of religion became superfluous.
Simply put, the more religions resist the process of modern differentiation, that is, secularization in the first sense, the more they will tend in the long run to suffer religious decline, that is, secularization in the second sense.
3.3 The Privatization of Religion Thesis
As a corollary of the thesis of differentiation, religious disestablishment entails the privatization of religion. Religion becomes indeed ‘a private affair.’ Insofar as freedom of conscience, ‘the first freedom’ as well as the precondition of all modern freedoms, is related intrinsically to ‘the right to privacy’ (that is, to the modern institutionalization of a private sphere free from governmental intrusion as well as free from ecclesiastical control), and inasmuch as the right to privacy serves as the very foundation of modern liberalism and of modern individualism, then indeed the privatization of religion is essential to modern societies. There is, however, another more radical version of the thesis of privatization which often appears as a corollary of the decline of religion thesis. In modern secular societies, whatever residual religion remains becomes so subjective and privatized that it turns ‘invisible,’ that is, marginal and irrelevant from a societal point of view. Not only are traditional religious institutions becoming increasingly irrelevant but, Luckmann (1963) adds, modern religion itself is no longer to be found inside the churches. The modern quest for salvation and personal meaning has withdrawn to the private sphere of the self. Significant for the structure of modern secular societies is the fact that this quest for subjective meaning is a strictly personal affair. The primary ‘public’ institutions (state, economy) no longer need or are interested in maintaining a sacred cosmos or a public religious worldview. The functionalist thesis of privatization turns problematic when instead of being a falsifiable empirical theory of dominant historical trends, it becomes a prescriptive normative theory of how religious institutions ought to behave in the modern world. Unlike secular differentiation, which remains a structural trend that serves to define the very structure of modernity, the privatization of religion is a historical option, a ‘preferred option’ to be sure, but an option nonetheless. Privatization is preferred internally as evinced by general pietistic trends, by processes of religious individuation, and by the reflexive nature of modern religion. Privatization is constrained externally by structural trends of differentiation which force religion into a circumscribed and differentiated religious sphere. Above all, privatization is mandated ideologically by liberal categories of thought which permeate modern political and constitutional theories.
The theory of secularization should be free from such a liberal ideological bias and admit that there may be legitimate forms of ‘public’ religion in the modern world, which are not necessarily anti-modern fundamentalist reactions and which do not need to endanger either modern individual freedoms or modern differentiated structures. Many of the recent critiques and revisions of the paradigm of secularization derive from the fact that in the 1980s religions throughout the world thrust themselves unexpectedly into the public arena of moral and political contestation, demonstrating that religions in the modern secular world continue and will likely continue to have a public dimension. See also: Religion: Evolution and Development; Religion, Phenomenology of; Religion, Sociology of; Secular Religions
Bibliography
Acquaviva S S 1961 L’eclissi del sacro nella civiltà industriale. Edizioni di Comunità, Milan [1979 The Decline of the Sacred in Industrial Society. Blackwell, Oxford]
Berger P 1967 The Sacred Canopy. Doubleday, Garden City, NY
Casanova J 1994 Public Religions in the Modern World. University of Chicago Press, Chicago
Dobbelaere K 1981 Secularization: A Multidimensional Concept. Sage, Beverly Hills, CA
Greeley A 1972 Unsecular Man: The Persistence of Religion. Schocken, New York
Greeley A 1989 Religious Change in America. Harvard University Press, Cambridge, MA
Hadden J K, Shupe A (eds.) 1989 Secularization and Fundamentalism Reconsidered. Paragon House, New York
Luckmann T 1963 Das Problem der Religion in der modernen Gesellschaft. Rombach, Freiburg [1967 Invisible Religion. Macmillan, New York]
Luhmann N 1977 Funktion der Religion. Suhrkamp Verlag, Frankfurt [1984 Religious Dogmatics and the Evolution of Societies. Mellen Press, New York]
Martin D 1969 The Religious and the Secular. Schocken, New York
Martin D 1978 A General Theory of Secularization. Blackwell, Oxford, UK
Stark R, Bainbridge W S 1985 The Future of Religion. University of California Press, Berkeley, CA
Weber M 1958 The Protestant Ethic and the Spirit of Capitalism. Scribner’s Sons, New York
Whaling F (ed.) 1987 Religion in Today’s World: The Religious Situation of the World from 1945 to the Present Day. T & T Clark, Edinburgh, UK
Wilson B 1966 Religion in Secular Society. Watts, London
Wilson B 1985 Secularization: The inherited model. In: Hammond P (ed.) The Sacred in a Secular Age. University of California Press, Berkeley, CA
J. Casanova
Segregation Indices
Measuring the residential segregation of racial and ethnic populations became very important to social scientists during the civil rights movement in the United States (Duncan and Duncan 1955; Taeuber and Taeuber 1965). Until the mid-1970s, when important critiques of the dissimilarity index stimulated efforts to develop alternatives, research almost exclusively relied upon this measure, and focused on the residential segregation of blacks from whites in the United States. Despite the many alternatives proposed, only the isolation index has really joined the dissimilarity index as a measure frequently used in empirical research on residential segregation. However, applications have extended substantially, to other racial and ethnic populations, and also to groups within and across races, defined by income, poverty, nativity, or other characteristics. Virtually all segregation indices are constructed by aggregating the observed distributions of the ‘minority’ and referent populations across a metropolitan area or city divided into small geographic areas, usually census tracts or census blocks in the United States. These geographic units are sufficiently small to be taken as rough approximations of neighborhoods whose residents would acknowledge and share some sense of common spatial identity, and hence some sense of community. Massey and Denton’s (1988) review and analysis of 20 indices that either had been or could be proposed for measuring distinct aspects of residential segregation represents the major watershed in the field. Massey and Denton suggested that the 20 indices measured five distinct dimensions of residential segregation:
(a) Evenness: indices that measure how the observed areal distributions of the minority and majority group populations in a metropolitan area deviate from distributions representing various definitions and criteria for equality or randomness.
(b) Exposure: indices based on the probability that the next person a minority group member encounters in their neighborhood or geographic area will also be a member of that group, or will instead be a member of the referent group.
(c) Concentration: indices that measure the degree to which a minority population is concentrated in a relatively small number of compact tracts.
(d) Centralization: indices that measure the degree to which a minority population resides close to the center or ‘downtown’ areas of a city.
(e) Clustering: indices that measure the degree to which the minority population disproportionately resides in neighborhoods or geographic units that adjoin one another, and hence form clusters or enclaves.
1. Measures of Evenness
The index of dissimilarity remains by far the most frequently used in empirical research on residential segregation. It is built on the premise that a metropolitan area or city would be integrated completely if all of its census tracts (or other areal units) had exactly the same proportion of minority group and majority group residents as the metropolitan area or city as a whole. The index is calculated as the population-weighted average deviation of the tracts or areal units from this proportionate representation, where the deviation in tract i is $|p_i - P|$. Thus,
$$D = \sum_{i=1}^{n} \left[ t_i \, |p_i - P| \right] \Big/ \left[ 2TP(1 - P) \right]$$
where $t_i$ and $p_i$ are the total population and the minority proportion of tract i, and T and P are the total population and the minority proportion of the metropolitan area as a whole. The resulting dissimilarity score can be interpreted as representing the percentage of the metropolitan area’s minority residents who would have to move from tracts or areal units where the minority group is overrepresented to areal units where they are underrepresented to achieve proportionate representation or complete integration. The primary criticism of the dissimilarity index is that it is insensitive to moves (or ‘transfers’) of minority group members among tracts where they are over-represented or under-represented. Several indices developed in literatures measuring income inequalities overcome this violation of the ‘transfers principle.’
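As a rough illustration of the dissimilarity formula above, the following Python sketch computes D from per-tract counts. The function name, the argument layout (parallel lists of minority and total counts), and the three-tract cities are assumptions made for this example only; they are not taken from Massey and Denton (1988).

```python
def dissimilarity(minority, total):
    """Index of dissimilarity D from per-tract minority and total counts."""
    T = sum(total)                 # total population of the metropolitan area
    P = sum(minority) / T          # minority proportion of the metropolitan area
    # Population-weighted absolute deviation of each tract's minority
    # proportion p_i = x_i / t_i from P; empty tracts contribute nothing.
    deviation = sum(t * abs(x / t - P)
                    for x, t in zip(minority, total) if t > 0)
    return deviation / (2 * T * P * (1 - P))


# Hypothetical cities with 100 residents in every tract.
d_uneven = dissimilarity([90, 10, 0], [100, 100, 100])
d_even = dissimilarity([30, 30, 30], [100, 100, 100])
d_split = dissimilarity([100, 0], [100, 100])
print(round(d_uneven, 3), round(d_even, 3), round(d_split, 3))
```

In the first hypothetical city the score is 0.85: under the interpretation given above, 85 percent of the minority residents would have to change tracts to reach proportionate representation. The evenly mixed city scores 0.0 and the fully divided one 1.0.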
The Gini coefficient averages the differences in the proportions of the minority group in every pair of tracts or areal units in the metropolitan area, where $|p_i - p_j|$ represents the difference in these proportions for tracts i and j
$$G = \sum_{i=1}^{n} \sum_{j=1}^{n} \left[ t_i t_j \, |p_i - p_j| \right] \Big/ \left[ 2T^2 P(1 - P) \right]$$
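The Gini formula can be sketched in the same way. The example below is an illustration under assumed data: the function name and the two four-tract distributions are hypothetical, chosen so that minority residents are transferred between two tracts that are both over-represented.

```python
def gini(minority, total):
    """Gini segregation coefficient G from per-tract counts."""
    T = sum(total)
    P = sum(minority) / T
    # (t_i, p_i) for every populated tract.
    tracts = [(t, x / t) for x, t in zip(minority, total) if t > 0]
    # Double sum over all ordered pairs of tracts.
    pair_sum = sum(ti * tj * abs(pi - pj)
                   for ti, pi in tracts
                   for tj, pj in tracts)
    return pair_sum / (2 * T ** 2 * P * (1 - P))


total = [100, 100, 100, 100]
g_before = gini([90, 60, 0, 0], total)   # before the transfer
g_after = gini([75, 75, 0, 0], total)    # 15 residents moved from tract 1 to tract 2
print(round(g_before, 3), round(g_after, 3))
```

Here G falls from 0.88 to 0.80 after the transfer, whereas the dissimilarity index is 0.80 for both distributions, because both tracts remain over-represented. This is the insensitivity to transfers that the Gini coefficient is meant to overcome.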
The Atkinson measures allow researchers to develop indices that weight tracts where the minority group is under-represented ($p_i < P$) and over-represented ($p_i > P$) equally ($b = 0.5$), or to weight under-representation ($0 < b < 0.5$) or over-representation ($0.5 < b < 1.0$) more heavily
$$A = 1 - \frac{P}{1 - P} \left| \frac{1}{PT} \sum_{i=1}^{n} (1 - p_i)^{(1 - b)} \, p_i^{\,b} \, t_i \right|^{1/(1 - b)}$$
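A minimal sketch of the Atkinson formula, assuming the same per-tract count layout as a convenience; the function name, the default b = 0.5, and the example cities are illustrative assumptions rather than part of the original specification.

```python
def atkinson(minority, total, b=0.5):
    """Atkinson segregation index A with shape parameter 0 < b < 1."""
    if not 0.0 < b < 1.0:
        raise ValueError("b must lie strictly between 0 and 1")
    T = sum(total)
    P = sum(minority) / T
    # Sum of (1 - p_i)^(1 - b) * p_i^b * t_i over populated tracts, scaled by 1 / (P T).
    inner = sum((1 - x / t) ** (1 - b) * (x / t) ** b * t
                for x, t in zip(minority, total) if t > 0) / (P * T)
    return 1 - (P / (1 - P)) * abs(inner) ** (1 / (1 - b))


a_even = atkinson([30, 30], [100, 100])            # identical tract proportions
a_split = atkinson([100, 0], [100, 100])           # complete separation
a_low_b = atkinson([90, 10, 0], [100, 100, 100], b=0.25)
a_high_b = atkinson([90, 10, 0], [100, 100, 100], b=0.75)
print(round(a_even, 6), round(a_split, 6), round(a_low_b, 3), round(a_high_b, 3))
```

The evenly mixed city yields 0 and the fully divided city yields 1. Running the last two calls shows how the score for one and the same distribution shifts with the choice of b.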
The theoretical or conceptual value of the Atkinson indices lies in applications where one might be more interested in reducing under-representation (e.g., integrating ‘lily-white’ suburbs) or in reducing over-representation (e.g., desegregating all black or Hispanic ghettos or barrios). Entropy or information indices are based on the concept that entropy is maximized and information minimized when distributions are in equilibrium. In segregation applications, this would occur when the proportions of the minority and majority group populations were equal (50 percent for two groups) in a metropolitan area. The entropy index measures the percentage of the metropolitan area’s entropy that is represented by the average deviation of entropy in each tract or areal unit from that in the metropolitan area as a whole. Thus, where entropy in each tract i and in the metropolitan area or city are represented respectively by
$$E_i = p_i \log(1/p_i) + (1 - p_i) \log\left(1/(1 - p_i)\right)$$
and
$$E = P \log(1/P) + (1 - P) \log\left(1/(1 - P)\right)$$
then
$$H = \sum_{i=1}^{n} \left[ t_i (E - E_i) \right] \Big/ ET$$
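The entropy index can be sketched as follows. The helper names and the example cities are assumptions for illustration; the convention that a tract with no minority or no majority residents has zero entropy (the limit of p log(1/p) as p approaches 0) is stated explicitly in the code.

```python
import math


def binary_entropy(p):
    """Two-group entropy; taken to be 0 when p is 0 or 1 (limiting value)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return p * math.log(1 / p) + (1 - p) * math.log(1 / (1 - p))


def entropy_index(minority, total):
    """Entropy (information) index H from per-tract counts."""
    T = sum(total)
    E = binary_entropy(sum(minority) / T)      # entropy of the whole area
    weighted = sum(t * (E - binary_entropy(x / t))
                   for x, t in zip(minority, total) if t > 0)
    return weighted / (E * T)


h_even = entropy_index([30, 30, 30], [100, 100, 100])
h_split = entropy_index([100, 0], [100, 100])
h_mixed = entropy_index([90, 10, 0], [100, 100, 100])
print(round(h_even, 6), round(h_split, 6), round(h_mixed, 3))
```

Because H is a ratio of entropies, the base of the logarithm cancels; natural logarithms are used here only for convenience.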
2. Measures of Exposure
Scores on the entropy index depend on P, the proportion that the minority group represents of the metropolitan area or city population. This violates a criterion that proposes that segregation indices should have ‘compositional invariance,’ that is, be independent of the proportions of minority and majority group members in different cities or metropolitan areas. However, in many critical applications such as school desegregation, the very low percentages of minority or majority group students in a district constitute major barriers to increasing racial integration or balance. The conceptual appropriateness and theoretical value of indices that are influenced by the relative proportions of minority and majority group members is thus undeniable, and they are among the few measures besides the dissimilarity index that are used extensively in empirical research. The interaction and isolation measures are designed specifically to measure segregation defined by the degree to which the relative sizes of the minority and majority populations, as well as their distributions across neighborhoods or areal units, affect the chances of their availability to interact with one another in those neighborhoods. The interaction index is the average, weighted by the proportion of the metropolitan area’s minority group living in tract i, of the majority group’s proportion in each tract or areal unit i. It can be interpreted as representing the probability that the next person that a minority resident of a neighborhood or tract encounters will belong to the majority population.
$${}_xP^*_y = \sum_{i=1}^{n} \left[ (x_i/X)(y_i/t_i) \right]$$
where $x_i$, $y_i$, and $t_i$ are the minority, majority, and total populations of tract i, and X is the total minority population of the metropolitan area. Conversely, the isolation index measures the likelihood that minority residents in a tract encounter only each other, and is calculated as the minority-weighted average of the minority group’s proportion in each tract or areal unit i
$${}_xP^*_x = \sum_{i=1}^{n} \left[ (x_i/X)(x_i/t_i) \right]$$
Because the interaction and isolation indices depend on the composition of the population, ${}_xP^*_y$ and its counterpart ${}_yP^*_x$ (the exposure of the majority to the minority) will be equal only if the majority and minority groups represent equal proportions of the population. The correlation ratio, $Eta^2$, adjusts for the effects of the minority proportion of the population, and indeed can be classified as an evenness measure.
$$V = Eta^2 = ({}_xP^*_x - P)/(1 - P)$$
3. Measures of Concentration
Concentration indices seek to measure the extent to which the minority population occupies less of the metropolitan area’s or city’s geographic area than the majority or total population, and hence also lives at higher densities. If the minority population was uniformly distributed throughout the city or metropolitan area, the proportion of the total minority population in tract or areal unit i would be equal to that tract’s proportion of the total area in the metropolitan area ($a_i/A$). Delta measures concentration by aggregating the deviations from this expectation
$$\Delta = 0.5 \sum_{i=1}^{n} \left| (x_i/X) - (a_i/A) \right|$$
Like the index of dissimilarity, the score on delta can be interpreted as representing the percentage of the minority group population that would have to move from areas of higher to lower than average density for the group to live at uniform densities across the metropolitan area. Massey and Denton (1988) proposed new measures of concentration. The absolute concentration index compares the observed distribution of the minority group in the metropolitan area against the theoretical minimum and maximum areas they could occupy. If tracts are ordered and the total population is cumulated from smallest to largest in area, the minimum area that could accommodate the minority group population is reached at tract $n_1$, where the cumulated population reaches or exceeds the total minority population (X) of the metropolitan area. Analogously, the maximum area that the minority population could occupy is obtained by cumulating the total population from the tracts largest in area to smallest, and finding the point $n_2$ where the cumulated population equals or exceeds the minority total.
$$ACO = 1 - \left\{ \left[ \sum_{i=1}^{n} (x_i a_i/X) - \sum_{i=1}^{n_1} (t_i a_i/T_1) \right] \Big/ \left[ \sum_{i=n_2}^{n} (t_i a_i/T_2) - \sum_{i=1}^{n_1} (t_i a_i/T_1) \right] \right\}$$
where $T_1$ is the total population of tracts 1 to $n_1$ and $T_2$ is the total population of tracts $n_2$ to n. The relative concentration index compares the relative concentration observed for the minority and majority populations with the maximum ratio possible, which occurs if the minority population were concentrated in the smallest possible area, and the majority population spread across the largest.
$$RCO = \left\{ \left[ \sum_{i=1}^{n} (x_i a_i/X) \right] \Big/ \left[ \sum_{i=1}^{n} (y_i a_i/Y) \right] - 1 \right\} \Big/ \left\{ \left[ \sum_{i=1}^{n_1} (t_i a_i/T_1) \right] \Big/ \left[ \sum_{i=n_2}^{n} (t_i a_i/T_2) \right] - 1 \right\}$$
The segregation indices usually range from 0.0 to 1.0. In contrast, the relative concentration index ranges from −1.0 to 1.0, with the negative values indicating that the minority population is less concentrated than the majority.
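The exposure indices of Section 2 and the delta index of Section 3 lend themselves to a short sketch. The function names, the two-group assumption (each tract total is the sum of minority and majority counts), and the three-tract data are hypothetical choices made for this illustration.

```python
def interaction(x, y):
    """Interaction index: minority-weighted average of the majority share per tract."""
    X = sum(x)
    return sum((xi / X) * (yi / (xi + yi))
               for xi, yi in zip(x, y) if xi + yi > 0)


def isolation(x, y):
    """Isolation index: minority-weighted average of the minority share per tract."""
    X = sum(x)
    return sum((xi / X) * (xi / (xi + yi))
               for xi, yi in zip(x, y) if xi + yi > 0)


def eta_squared(x, y):
    """Correlation ratio: isolation adjusted for the minority proportion P."""
    P = sum(x) / (sum(x) + sum(y))
    return (isolation(x, y) - P) / (1 - P)


def delta(x, areas):
    """Delta: half the summed gap between minority share and area share per tract."""
    X, A = sum(x), sum(areas)
    return 0.5 * sum(abs(xi / X - ai / A) for xi, ai in zip(x, areas))


x = [90, 10, 0]        # minority residents per tract (assumed data)
y = [10, 90, 100]      # majority residents per tract
areas = [1, 1, 2]      # tract areas in arbitrary units
print(round(interaction(x, y), 3), round(isolation(x, y), 3),
      round(eta_squared(x, y), 3), round(delta(x, areas), 3))
```

With only two groups the interaction and isolation scores sum to 1 (0.18 and 0.82 here). Exchanging the roles of x and y in `interaction` gives the majority's exposure to the minority, which differs from 0.18 because the two groups are of unequal size.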
4. Measures of Centralization
Because older and cheaper housing is usually closer to the center of the city, and because housing discrimination prevented blacks and other minorities in the United States from suburbanizing for decades after whites did, minority populations are often concentrated in neighborhoods closer to the center of the city. Both centralization measures order the metropolitan area’s tracts or areal units by increasing distance from the central business district. The absolute centralization index aggregates the deviations of the cumulative proportion of the minority ($X_i$) reached at each tract i from the cumulative proportion of the land area ($A_i$) reached by that tract
$$ACE = \sum_{i=1}^{n} (X_{i-1} A_i) - \sum_{i=1}^{n} (X_i A_{i-1})$$
The relative centralization index aggregates the deviations of the cumulative proportions of the minority ($X_i$) and majority ($Y_i$) populations reached at each tract i from what would be expected were they comparably distributed
$$RCE = \sum_{i=1}^{n} (X_{i-1} Y_i) - \sum_{i=1}^{n} (X_i Y_{i-1})$$
The ACE ranges from 0.0 to 1.0, but the RCE from −1.0 to 1.0, with negative values indicating that the minority population is less centralized than the majority.
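The two centralization formulas can be sketched as below, assuming the tracts are already listed in order of increasing distance from the central business district and that the cumulative proportions start from zero ($X_0 = A_0 = Y_0 = 0$). The function names and the three-tract data are hypothetical.

```python
from itertools import accumulate


def cumulative_shares(values):
    """Cumulative proportions with a leading 0, so element i is the share up to tract i."""
    total = sum(values)
    return [0.0] + [c / total for c in accumulate(values)]


def centralization(first, second):
    """Sum of first[i-1] * second[i] minus sum of first[i] * second[i-1] over tracts."""
    F, S = cumulative_shares(first), cumulative_shares(second)
    n = len(first)
    return (sum(F[i - 1] * S[i] for i in range(1, n + 1))
            - sum(F[i] * S[i - 1] for i in range(1, n + 1)))


# Tracts ordered from the city center outwards (assumed data).
x = [80, 20, 0]        # minority residents
y = [20, 80, 100]      # majority residents
areas = [1, 1, 2]      # land area

ace = centralization(x, areas)   # absolute centralization: minority vs. land area
rce = centralization(x, y)       # relative centralization: minority vs. majority
print(round(ace, 3), round(rce, 3))
```

For these data ACE is 0.65 and RCE is 0.80, reflecting a minority population that sits much closer to the center than either the land area or the majority population.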
5. Measures of Clustering
The final dimension of residential segregation measures the degree to which tracts with disproportionately large minority populations adjoin one another, creating, for example, large ghettos or barrios, or are scattered throughout the metropolitan area. This is a form of the geographer’s contiguity and checkerboard problems, and drawing on this literature, Massey and Denton (1988) proposed the following measure of absolute clustering
$$ACL = \left\{ \left[ \sum_{i=1}^{n} (x_i/X) \sum_{j=1}^{n} (c_{ij} x_j) \right] - \left[ (X/n^2) \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} \right] \right\} \Big/ \left\{ \left[ \sum_{i=1}^{n} (x_i/X) \sum_{j=1}^{n} (c_{ij} t_j) \right] - \left[ (X/n^2) \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} \right] \right\}$$
where $c_{ij} = \exp(-d_{ij})$, that is, the negative exponential of the distance $d_{ij}$ between the centroids of tracts i and j (and where $d_{ii} = (0.6 a_i)^{0.5}$). The negative exponential is used to estimate the otherwise massive contiguity matrix whose elements equal 1.0 when tracts i and j are contiguous, and 0.0 when they are not. The formula takes the average number of minority residents in nearby tracts as a proportion of the total population in those tracts. The absolute clustering index has a minimum of 0.0 and can approach, but never reach, 1.0. The spatial proximity index compares the average proximity of minority group residents to one another ($P_{xx}$), and the majority group residents with one another ($P_{yy}$) to the average proximity among the residents in the total population ($P_{tt}$), weighted by the fraction that each group represents in the population. Thus
$$SP = (X P_{xx} + Y P_{yy}) / (T P_{tt})$$
where
$$P_{xx} = \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_i x_j / X^2$$
and $P_{yy}$ and $P_{tt}$ are calculated by analogues of $P_{xx}$. Comparing the relative intragroup proximities of the minority and majority populations produces a measure of relative clustering.
$$RCL = (P_{xx}/P_{yy}) - 1$$
Clustering can also be measured by extending the concepts underlying the interaction and isolation measures to estimate how these probabilities should decay with distance. The distance–decay interaction and isolation indices can be constructed by aggregating, over each tract i, the probability that the next person one meets anywhere in the city is respectively a majority or minority resident from tract j, and that this probability exponentially decays with increasing distance between i and j. Thus
$$DP^*_{xy} = \sum_{i=1}^{n} (x_i/X) \sum_{j=1}^{n} (k_{ij} y_j / t_j)$$
and
$$DP^*_{xx} = \sum_{i=1}^{n} (x_i/X) \sum_{j=1}^{n} (k_{ij} x_j / t_j)$$
where
$$k_{ij} = \left[ t_j \exp(-d_{ij}) \right] \Big/ \sum_{j=1}^{n} \left[ t_j \exp(-d_{ij}) \right]$$
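The proximity-based clustering measures can be illustrated with a small sketch. It assumes planar centroid coordinates with Euclidean distances, uses $c_{ij} = \exp(-d_{ij})$ with $d_{ii} = (0.6 a_i)^{0.5}$ as in the text, and invents a four-tract city laid out along a line; all names and data are hypothetical.

```python
import math


def proximity_weights(centroids, areas):
    """Matrix c_ij = exp(-d_ij), with d_ii = sqrt(0.6 * a_i) on the diagonal."""
    n = len(centroids)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                d = math.sqrt(0.6 * areas[i])
            else:
                d = math.hypot(centroids[i][0] - centroids[j][0],
                               centroids[i][1] - centroids[j][1])
            c[i][j] = math.exp(-d)
    return c


def mean_proximity(counts, c):
    """Average proximity among members of one group (P_xx, P_yy, or P_tt)."""
    n = len(counts)
    total = sum(counts)
    return sum(c[i][j] * counts[i] * counts[j]
               for i in range(n) for j in range(n)) / total ** 2


def spatial_proximity(x, y, c):
    """SP = (X P_xx + Y P_yy) / (T P_tt) for two groups."""
    t = [xi + yi for xi, yi in zip(x, y)]
    X, Y, T = sum(x), sum(y), sum(t)
    return (X * mean_proximity(x, c) + Y * mean_proximity(y, c)) / (T * mean_proximity(t, c))


def relative_clustering(x, y, c):
    """RCL = P_xx / P_yy - 1."""
    return mean_proximity(x, c) / mean_proximity(y, c) - 1


# Four unit-area tracts in a row (assumed geometry).
centroids = [(0, 0), (1, 0), (2, 0), (3, 0)]
c = proximity_weights(centroids, [1, 1, 1, 1])

sp_clustered = spatial_proximity([90, 90, 0, 0], [10, 10, 100, 100], c)
rcl_clustered = relative_clustering([90, 90, 0, 0], [10, 10, 100, 100], c)
sp_mixed = spatial_proximity([20, 20, 20, 20], [80, 80, 80, 80], c)
rcl_mixed = relative_clustering([20, 20, 20, 20], [80, 80, 80, 80], c)
print(round(sp_clustered, 3), round(rcl_clustered, 3),
      round(sp_mixed, 3), round(rcl_mixed, 3))
```

When both groups are spread in proportion to the total population, the three average proximities coincide, so SP equals 1 and RCL equals 0. When the minority is packed into two adjacent tracts, SP rises above 1 and RCL becomes positive.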
6. Conclusion

Few empirical researchers have used more than one or two of these 20 indices in studying residential segregation, and the full spectrum remains of interest primarily to those devoted to constructing measures of the phenomenon in its varied dimensions. Massey and Denton (1988) used factor analysis to extract one index from each dimension and to define a concept of hypersegregation for blacks in the United States based on their high segregation scores on each of the five dimensions. This has extended the use of measures beyond the dissimilarity and interaction/isolation indices. However, the literature is still far from developing an over-arching framework that might suggest how several measures might be used systematically to identify how urban areas vary in the five-dimensional space suggested or with what consequences. One might imagine, for example, identifying and comparing urban areas where high unevenness results from low integration in the suburbs rather than large ghettos; those with concentrated and perhaps centralized ghettos or barrios vs. others with comparably high unevenness or isolation but less clustered distributions of the minority population. The full set of indices has much potential value that remains to be developed.

See also: Ethnic Conflicts; Locational Conflict (NIMBY); Population Composition by Race and Ethnicity: North America; Race and Urban Planning; Racial Relations; Residential Concentration/Segregation, Demographic Effects of; Residential Segregation: Sociological Aspects; Urban Geography; Urban Sociology
Bibliography
Duncan O D, Duncan B 1955 A methodological analysis of segregation indices. American Sociological Review 20: 210–17
Massey D S, Denton N A 1988 The dimensions of residential segregation. Social Forces 67: 281–315
Taeuber K E, Taeuber A F 1965 Negroes in Cities: Residential Segregation and Neighborhood Change. Aldine, Chicago

R. Harrison

Selection Bias, Statistics of

In almost all areas of scientific inquiry researchers often want to infer, on the basis of sample data, the characteristics of a relationship that holds in the relevant population. Frequently this is the relationship between a dependent variable and one or more explanatory variables. In some cases, however, although one may have complete information on the explanatory variables, information on the dependent variable is lacking for some observations or ‘units.’ Furthermore, whether or not this information is present may not be conditionally independent (given the model) of the value taken by the dependent variable itself. This is a case of selection bias.

1. Selection Bias

The following illustrates the basic problem. Suppose one hypothesizes the following relationship, in the population, between a dependent variable, Y*, and a set of j = 1, …, J explanatory variables, X

\[ y_i^* = x_i'\beta + \varepsilon_i \tag{1} \]

Here the unit-specific values of Y* are denoted $y_i^*$ (where i denotes the ith observation) and each unit’s values of the X variables are contained in the vector $x_i$. ε is a random error and the parameter vector, β, is to be estimated. Specific assumptions about the distribution of ε will determine the choice of estimation technique. If Y* is considered as a latent variable, a variety of different statistical models can be generated from Eqn. (1) depending on the manner in which Y* is actually observed (though still, at this point, assuming that Y* is observed, in some fashion, for all units). To do this, write a second equation that defines the relationship between the observed variable Y, and the latent variable, Y*. For example, where Y* is completely observed

\[ y_i = y_i^* \tag{2a} \]

but one might also have

\[ y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases} \tag{2b} \]

In the case of Eqn. (2(b)) Y is the binary observed realization of the underlying latent variable, Y*. More complex relationships between Y* and Y are also possible. To capture the idea of selection bias, suppose that there is another latent variable Z* such that whether or not we observe a value for Y depends on the value of Z* which is given by

\[ z_i^* = w_i'\alpha + \nu_i \tag{3} \]

Here w is a vector of explanatory variables with coefficients α, and ν is a random error term. The observed variable Z is defined as

\[ z_i = \begin{cases} 1 & \text{if } z_i^* > 0 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]

Finally, define the observation equation for Y as depending on Z* as follows

\[ y_i \text{ is observed if } z_i = 1, \qquad y_i \text{ is unobserved if } z_i = 0 \tag{5} \]
Equations (1), (2), and (5), together with Eqns. (3) and (4), link the latent variable Y* to its observed counterpart, Y, when observation of the latter depends on the value of another variable, Z*. Selection bias occurs when there is a nonzero covariance between the error terms ε and ν. More complex selection bias models can be generated by having, for instance, more than one Z* variable. The censored regression (or Tobit) model is a simpler case in which whether Y is observed or not depends on whether it exceeds (or, in other cases, falls below) a given threshold value.
2. The Problem

Selection bias is a problem because if one tries to estimate β using standard statistical methods the resulting estimates will be poor and potentially misleading. For example, if Eqn. (2(a)) specifies the relationship between Y and Y*, discarding cases where Y is unobserved and running an OLS regression on the remainder will give estimates that are biased and inconsistent.

Where might this kind of selection bias arise? Suppose one draws a random sample from a population and carries out a survey. Some respondents may refuse to answer some, though not all, questions. If the response to such a question played the role of dependent variable in a regression analysis there would be a selection bias problem if the probability of having responded to the item was not independent of the response value, given the specification of the regression model. This may not always be so: in the simplest case nonresponse might be random. Alternatively, response might be independent of the response value controlling for a set of measured covariates. In this case there is selection on observables (Heckman and Robb 1985). But often neither of these cases holds and a nonzero residual correlation between the selection Eqn. (3) and the outcome Eqn. (1) means that there is selection on unobservables.

In studies of the criminal justice system in which the population is all those charged with a crime, sentence severity is observed only for those found guilty. In studies of the effectiveness of university education the outcome (say, examination results) is only observed for those who had completed that period of education. In studies of women’s earnings in paid work the earnings of women who are not in the labor force cannot be observed. To these examples could be added very many more. Selection bias is a pervasive problem. As a consequence a variety of methods has been proposed to deal with it.
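A small simulation can make the bias concrete. The sketch below is illustrative only: the coefficient values, the choice w = x with α = 1, and the function names are assumptions made for the example, not part of the model as stated. It generates data from Eqns. (1), (3), and (4), discards the cases with z = 0, and compares the OLS slope on the remainder with the slope in the full (normally unavailable) sample.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple regression of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

def simulate(n=20000, rho=0.8, seed=0):
    """Return (full-sample slope, selected-sample slope) for an assumed model."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        eps = rng.gauss(0, 1)
        # nu has unit variance and correlation rho with eps
        nu = rho * eps + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        xs.append(x)
        ys.append(1.0 + 2.0 * x + eps)      # Eqn. (1) with (2a), true slope 2
        zs.append(1 if x + nu > 0 else 0)   # Eqns. (3)-(4), assuming w = x, alpha = 1
    sel_x = [x for x, z in zip(xs, zs) if z == 1]
    sel_y = [y for y, z in zip(ys, zs) if z == 1]
    return ols_slope(xs, ys), ols_slope(sel_x, sel_y)

full, selected = simulate()
print("full sample:", round(full, 3), "selected sample:", round(selected, 3))
```

With correlated errors the slope in the selected sample falls noticeably below the true value of 2, whereas setting `rho=0` (selection unrelated to ε given x) leaves it approximately unbiased.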
The problem of selection bias arises because of the nonzero correlation of the errors in Eqns. (1) and (3), and this arises commonly in the following context. Suppose there is a nonexperimental comparison of two groups, one exposed to some policy treatment, the other not. In this case the outcome of interest, Y, could be observed for members of both groups. Let Z now be the indicator of group membership (so that, for instance, Z = 0 means membership of the comparison group and Z = 1 means membership of the treatment group). In the case where Eqn. (2(a)) holds write the equation for Y as

\[ y_i = x_i'\beta + \gamma z_i + \varepsilon_i \tag{6} \]

and interest would focus on the estimate of γ as the gross effect on the outcome measure of being in the treatment, rather than the comparison, group. The problem of selection bias still arises to the extent that ε and ν have a nonzero correlation. The difference between this and the type of selection bias discussed initially (namely that now Y is observed for both groups) is more apparent than real as an alternative formulation shows. For each unit Y is observed as the response to membership of either the treatment or comparison group, but not both. Let $Y_1$ be the outcome given treatment and $Y_0$ given no treatment. Then the ith individual unit’s gain from treatment is simply $\Delta_i = Y_{1i} - Y_{0i}$. But Δ cannot be measured since one cannot simultaneously observe a unit’s values of $Y_1$ and $Y_0$. Instead

\[ Y_i = Z_i Y_{1i} + (1 - Z_i) Y_{0i} \tag{7} \]

is observed. Eqn. (7) thus specifies the incomplete observation of two latent variables, $Y_1$ and $Y_0$. This set-up can easily be extended to cases with more than two groups. As might be expected, this approach is commonly encountered in the evaluation literature but one of its earliest formulations was by Roy (1951) as the problem of self-selection.
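One way to see why a simple comparison of group means can mislead is to decompose it in the notation of Eqn. (7). The identity below is a standard one, added here as an illustration rather than taken from the text; it assumes only that the expectations exist, and the labels under the braces are descriptive names chosen for this example.

```latex
\begin{align*}
E(Y \mid Z = 1) - E(Y \mid Z = 0)
  &= E(Y_1 \mid Z = 1) - E(Y_0 \mid Z = 0) \\
  &= \underbrace{E(Y_1 - Y_0 \mid Z = 1)}_{\text{average gain among the treated}}
   + \underbrace{E(Y_0 \mid Z = 1) - E(Y_0 \mid Z = 0)}_{\text{selection term}}
\end{align*}
```

The second term is zero when the two groups would have had the same average outcome in the absence of treatment, as under random assignment; otherwise the raw difference in means mixes the treatment gain with pre-existing differences between the groups.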
3. The Solutions

Broadly speaking there are two ways to address problems of selection bias: ex ante by research design, and ex post by statistical adjustment. The modern literature on correcting for selection bias is itself biased towards the latter, beginning with the work of Heckman in the 1970s (Heckman 1976, 1979) whose so-called ‘two-step’ method is widely used and built into many econometrics programs. The argument that underlies this technique is the following. Assume the set-up as defined by Eqns. (1), (2(a)), (3), (4), and (5) and a nonzero covariance between ε and ν. Then we can write the ‘outcome equation’ (i.e., the regression equation for Y, given its observability) as

\[ E(y_i \mid z_i = 1, x_i) = x_i'\beta + E(\varepsilon_i \mid z_i = 1) \tag{8} \]

Because there are only observations of Y when z = 1 there is an extra term for the conditional expectation of ε. Because of the nonzero covariance between ε and ν and if, as is usually assumed, E(ε) = 0, then this conditional expectation cannot be zero. One can write

\[ E(y_i \mid z_i = 1, x_i) = x_i'\beta + E(\varepsilon_i \mid \nu_i > -w_i'\alpha) \tag{9} \]

Using a standard result for the value of a truncated bivariate normal distribution, the second term on the right-hand side of this equation is given by

\[ E(\varepsilon_i \mid \nu_i > -w_i'\alpha) = \rho \sigma_\varepsilon \sigma_\nu \frac{\phi(w_i'\alpha)}{\Phi(w_i'\alpha)} \tag{10} \]

Here the σs are the standard deviations of the respective error terms from Eqns. (1) and (3); ρ is the correlation between them; φ and Φ are, respectively, the density and distribution functions of the standard normal and their ratio, as it appears in Eqn. (10), is termed the ‘inverse Mills ratio.’ Heckman’s two-step method requires first running a probit regression with Z as the dependent variable, estimated using all the observations in the data. Then, for those observations for which Y is observed, the probit coefficient estimates are used to compute the value of the inverse Mills ratio. This is then inserted as an extra variable in the OLS regression in which Y is the dependent variable to give

\[ E(y_i \mid z_i = 1, x_i) = x_i'\beta + \theta \hat{\lambda}_i \tag{11} \]

where λ is the inverse Mills ratio and the ‘hat’ indicates that it is estimated. Its coefficient, θ, is then itself an estimate of $\rho \sigma_\varepsilon \sigma_\nu$. This is the covariance between ε and ν ($\sigma_\nu$ is set to unity in the probit). This approach is extended readily to the case in which an outcome is observed for both groups (Z = 0 and Z = 1).

What are the assumptions of this approach and what are the properties of the resulting estimator when we apply this method to data from a sample of the population? First, it is assumed that all the other requirements for an OLS regression are met. Second, in the set-up outlined above, the joint distribution of ε and ν should be bivariate normal (though, in general, weaker assumptions suffice, namely that ν be normally distributed and that the expectation of ε, conditional on ν, be linear in ν; see Olsen 1980). Given that the assumptions hold, the two-step estimator yields consistent estimates of the population β in Eqn. (1). The conventional OLS standard errors are incorrect (due to heteroscedasticity and the use of an estimated value of λ) but they are readily corrected. An alternative to the two-step approach is to estimate both the selection and outcome equations simultaneously using maximum likelihood (ML). The resulting estimates are asymptotically unbiased and more efficient than those from the two-step method (see Nelson 1984, who compares OLS, the two-step method and ML).
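The two steps described above can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the implementation found in any econometrics package: the data-generating values, the presence of a variable w that affects selection but not the outcome, and all function names are hypothetical, and the standard-error correction mentioned in the text is deliberately left out.

```python
import math
import random

def phi(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2 * math.pi)

def Phi(v):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(v / math.sqrt(2)))

def solve(A, b):
    """Solve A u = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        u[r] = (M[r][n] - sum(M[r][k] * u[k] for k in range(r + 1, n))) / M[r][r]
    return u

def ols(rows, ys):
    """Least squares coefficients from the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y for r, y in zip(rows, ys)) for a in range(k)]
    return solve(xtx, xty)

def probit(rows, zs, iters=15):
    """Probit maximum likelihood by Fisher scoring, started at zero."""
    k = len(rows[0])
    alpha = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for r, z in zip(rows, zs):
            idx = sum(a * v for a, v in zip(alpha, r))
            p = min(max(Phi(idx), 1e-10), 1 - 1e-10)
            d = phi(idx)
            for i in range(k):
                grad[i] += d * (z - p) / (p * (1 - p)) * r[i]
                for j in range(k):
                    info[i][j] += d * d / (p * (1 - p)) * r[i] * r[j]
        alpha = [a + s for a, s in zip(alpha, solve(info, grad))]
    return alpha

def heckman_two_step(xs, ws, zs, ys):
    """Return (naive OLS [const, beta], two-step [const, beta, theta])."""
    W = [[1.0, x, w] for x, w in zip(xs, ws)]
    alpha = probit(W, zs)                      # step 1: probit on all observations
    naive_rows, rows, y_sel = [], [], []
    for r, x, z, y in zip(W, xs, zs, ys):
        if z == 1:                             # Y is used only where it is observed
            idx = sum(a * v for a, v in zip(alpha, r))
            mills = phi(idx) / max(Phi(idx), 1e-12)   # estimated inverse Mills ratio
            naive_rows.append([1.0, x])
            rows.append([1.0, x, mills])
            y_sel.append(y)
    return ols(naive_rows, y_sel), ols(rows, y_sel)   # step 2: augmented OLS, Eqn. (11)

def simulate(n=5000, rho=0.8, seed=1):
    """Hypothetical data: y* = 1 + 2x + eps, z* = x + w + nu, corr(eps, nu) = rho."""
    rng = random.Random(seed)
    xs, ws, zs, ys = [], [], [], []
    for _ in range(n):
        x, w, eps = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        nu = rho * eps + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        xs.append(x)
        ws.append(w)
        zs.append(1 if x + w + nu > 0 else 0)
        ys.append(1.0 + 2.0 * x + eps)   # stored for all units, used only if z = 1
    return xs, ws, zs, ys

naive, corrected = heckman_two_step(*simulate())
print("naive slope:", round(naive[1], 3), "two-step slope:", round(corrected[1], 3))
```

In this simulated setting the naive slope on the selected cases falls short of the true value of 2, while the two-step slope is close to it and the coefficient on the estimated inverse Mills ratio is positive, as the assumed positive error correlation implies. The sketch relies on w entering only the selection equation; the identification issues that arise without such a variable are taken up below.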
This two-step method is, in any case, applicable only when the relationship between Y* and Y is as given by Eqn. (2(a)). When this relationship is given by, for example, Eqn. (2(b)), the two-step method is inconsistent. In these cases ML is the most feasible option. If the outcome variable is itself binary, the joint log-likelihood of the selection and outcome equations has the same general form as that for a bivariate probit but one in which there are three, rather than four, possible outcomes. They are z = 1 and y = 1; z = 1 and y = 0; and z = 0 (in which case y is not observed).

Although the two-step method is probably the most widely used approach to correcting for selection bias it has been subjected to much criticism. Among the main objections are the following:
(a) sensitivity to distributional assumptions. Practical implementation of the method renders it particularly sensitive in this respect. If the assumptions are not met the estimator has no desirable properties (i.e., it is not even consistent).
(b) identification and robustness. It is common to find that the estimate of the inverse Mills ratio is highly correlated either with the explanatory variables or with the intercept in Eqn. (11).

The extent to which such problems will arise depends mainly on three things. They are: (i) the specification of the selection equation; (ii) the sample correlation between the two sets of explanatory variables, X and W (call this q); (iii) the degree of sample selection in the sample (p, the proportion of cases for which z = 1). For example, if X and W are identical, the two-equation system is identified only because of the nonlinearity of the probit equation. But for some ranges of p the probit function is almost linear, with the result that the estimated λ will be close to a linear function of the explanatory variables in the probit; and thus it will be highly correlated with the X variables in the outcome equation. In general, the correlation between the inverse Mills ratio estimate and the other explanatory variables in this equation will be greater the greater is q and the closer is p to zero or one (this issue is discussed in detail in Breen 1996). If the selection equation does not discriminate well between the selected and unselected observations the estimated inverse Mills ratio will be approximately a constant, and there will therefore be a high correlation between it and the intercept of the outcome equation.

Both these objections reflect genuine difficulties in applying the model. In principle the solution to the identification and robustness problems is simple: ensure that the probit equation discriminates well and do not rely on the nonlinearity of the probit for identification. In practice it may be rather difficult to achieve these things. On the other hand, the issue of distributional assumptions may be even less tractable, obliging recourse to semiparametric or other approaches (Lee 1994; Cosslett 1991).
An alternative is to try to assess the likely degree of bias that sample selection might induce in a given case. Rubin (1977) presents a Bayesian approach in which the investigator computes the likely degree of bias, conditional on a prior belief about the relationship between the parameters of the distribution of Y in the selected and nonselected samples. The mean and the variance of the latter sample might, for example, be expressed as a function of their values in the selected sample, conditioning on the values of observed covariates. It is then straightforward to express the extent of selection bias for plausible values of the parameters of this function and to place a corresponding Bayesian probability interval around any estimates. Rosenbaum (1995, 1996) suggests and illustrates other sensitivity analyses for selection bias in nonexperimental evaluations.
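The flavor of such a sensitivity analysis can be conveyed by a deliberately simplified, non-Bayesian calculation. The sketch below is not Rubin's procedure: it merely assumes that the mean of Y among nonselected units differs from the selected-sample mean by k selected-sample standard deviations, and reports the population mean implied by each assumed k. The function name and the numbers are hypothetical.

```python
def implied_population_means(mean_sel, sd_sel, p_selected, shifts):
    """Population mean of Y implied by each assumed shift k (in SD units).

    Assumes mean_nonselected = mean_sel + k * sd_sel, a simplifying
    assumption made only for this illustration.
    """
    out = {}
    for k in shifts:
        mean_nonsel = mean_sel + k * sd_sel
        out[k] = p_selected * mean_sel + (1 - p_selected) * mean_nonsel
    return out

# 75% of units observed, with observed mean 10 and SD 2.
print(implied_population_means(10.0, 2.0, 0.75, [-1.0, 0.0, 1.0]))
```

Reading across the assumed shifts shows how far the conclusion about the population mean moves as the assumption about the unobserved cases is varied.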
4. Program Evaluation and Selection Bias
For several years now the literature on selection bias has been dominated by discussion of how to evaluate programs (such as job training programs) when randomized assignment to treatment and control group is not possible. In a very widely cited paper, Lalonde (1986) compared the measured effectiveness of labor market programs using a randomized assignment design with their effectiveness computed using the Heckman method and showed that they led to quite different results (but see also Heckman and Hotz 1989). While some have seen this as a fatal criticism of the method, one consequence has been the development of greater awareness of the need to ensure the suitability of different selection bias correction approaches for specific cases. Another has been a greater concern to identify other possible sources of bias in nonrandomized program evaluations.

In the statistical literature, matching methods commonly are advocated in program evaluations in the absence of randomization. They involve the direct comparison of treated and untreated units that share common, or very similar, values on a set of relevant covariates, X. The rationale for this is the assumption that, conditional on X, and assuming only selection on observables (all of which are in X), the observed outcome for nonparticipants has the same distribution as the unobserved outcome that participants would have had, had they not participated (see Heckman et al. 1997). This then allows one to estimate Δ. In passing, note that matching is also used commonly to impute missing values in item nonresponse in surveys (Little and Rubin 1987).

A central issue is how to carry out such matching. If there are K covariates, then the problem is one of matching pairs, or sets, of participants and nonparticipants in this K-dimensional space. But by a result due to Rosenbaum and Rubin (1983) (see also Rosenbaum 1995), matching using a scalar quantity called the propensity score is equally effective. The propensity score is simply the probability of being in the treatment, rather than the comparison group, given the observed covariates. Matching on the propensity score controls bias due to all observed covariates and, even if there is selection on unobservables, the method nevertheless produces treatment and comparison groups with the same distribution of X variables. This is so under the assumption that the true propensity score is known, but Rosenbaum and Rubin (1984) show that an estimated propensity score (typically using a logit model) appears to perform at least as well.

The use of matching draws attention to the problem of the possibly different distributions of covariates (or, equally, of propensity scores) in the treatment and comparison groups. There are two important aspects: first, the support of the propensity score may be different, so some ranges of propensity score values may be present in one group and not the other. More simply, some participants may have no comparable nonparticipants. Second, the distributions of the set of common values of propensity scores (i.e., which appear in both groups) may be different (Heckman et al. 1997, 1998). Hitherto, in practice, both of these sources of bias in evaluating the program’s effects commonly would have been confounded with selection bias proper. The use of propensity scores can remove the former but, if there is also selection on unobservables (i.e., selection bias proper), matching cannot be expected to solve the problem. Indeed, it is possible that these different forms of bias may have opposite signs, so that correcting for some but not others may make the overall bias greater. Nevertheless, the ability to draw such finer distinctions among biases is a valuable development, not least because it allows one to focus on methods for controlling selection bias, free from the contaminating effects of other biases.
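Matching on an estimated propensity score can be sketched as follows. The example is a minimal illustration under strong assumptions that are not taken from the text: a single observed covariate, selection on observables only, a constant treatment gain of 1, one-to-one nearest-neighbor matching with replacement, and hypothetical function names. The logit is fitted by Newton-Raphson, and treated units whose scores fall outside the range of comparison-group scores are dropped to respect common support.

```python
import bisect
import math
import random

def fit_logit(xs, zs, iters=25):
    """Logit of z on (1, x) by Newton-Raphson; returns (a0, a1)."""
    a0 = a1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, z in zip(xs, zs):
            p = 1 / (1 + math.exp(-(a0 + a1 * x)))
            wgt = p * (1 - p)
            g0 += z - p
            g1 += (z - p) * x
            h00 += wgt
            h01 += wgt * x
            h11 += wgt * x * x
        det = h00 * h11 - h01 * h01
        a0 += (h11 * g0 - h01 * g1) / det
        a1 += (h00 * g1 - h01 * g0) / det
    return a0, a1

def att_by_matching(xs, zs, ys):
    """Average gain among treated units, matching each to its nearest control."""
    a0, a1 = fit_logit(xs, zs)
    score = [1 / (1 + math.exp(-(a0 + a1 * x))) for x in xs]
    controls = sorted((s, y) for s, z, y in zip(score, zs, ys) if z == 0)
    c_scores = [s for s, _ in controls]
    lo, hi = c_scores[0], c_scores[-1]
    gains = []
    for s, z, y in zip(score, zs, ys):
        if z == 1 and lo <= s <= hi:              # keep only the common support
            k = bisect.bisect_left(c_scores, s)
            cands = [c for c in (k - 1, k) if 0 <= c < len(controls)]
            j = min(cands, key=lambda c: abs(c_scores[c] - s))
            gains.append(y - controls[j][1])
    return sum(gains) / len(gains)

def simulate(n=4000, seed=2):
    """Hypothetical data: treatment more likely at high x, true gain equal to 1."""
    rng = random.Random(seed)
    xs, zs, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = 1 if rng.random() < 1 / (1 + math.exp(-x)) else 0
        xs.append(x)
        zs.append(z)
        ys.append(1.0 + 2.0 * x + 1.0 * z + rng.gauss(0, 1))
    return xs, zs, ys

xs, zs, ys = simulate()
treated = [y for y, z in zip(ys, zs) if z == 1]
control = [y for y, z in zip(ys, zs) if z == 0]
naive_diff = sum(treated) / len(treated) - sum(control) / len(control)
print("naive difference:", round(naive_diff, 3),
      "matched estimate:", round(att_by_matching(xs, zs, ys), 3))
```

Because selection here depends only on the observed covariate, the matched estimate lies near the true gain while the raw difference in means is far too large. Had selection also depended on an unobserved variable correlated with the outcome, matching on the score would not remove the resulting bias, in line with the caveat above.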
5. Conclusion

This article has only skimmed the surface of selection bias statistics. In particular, it has looked only at models for cross-sectional data: other approaches are possible given other sorts of data. For example, given longitudinal data, with observations of Y for each unit at two (or more) points in time, one prior to the program to be evaluated, the other after it, then it may be possible to specify and test a model which assumes that unobserved causes of selection bias are unit-specific and time-invariant and can therefore be removed by differencing (Heckman and Robb 1985, Heckman et al. 1997, 1998). Considerations of this sort draw attention to the ex ante approach to dealing with selection bias that was referred to earlier. Here the emphasis falls on designing research so as to remove, or at least reduce, selection bias. Random assignment is one possibility, though not always feasible in practice. Nevertheless, quasi-experimental approaches, interrupted time-series, and longitudinal designs with repeated measures on the outcome variable are among a range of possibilities that could be considered in designing one’s research to minimize selection bias problems (see Rosenbaum 1999 and associated comments). Even so, there will still be many instances in which analysts use data over whose collection they have had no control and where the use of ex post adjustments will be unavoidable.

To conclude: much work on the selection bias problem has been undertaken since the 1980s yet there is no widespread agreement on which statistical methods are most suitable for use in correcting for the problem. Some broad guidelines for practitioners can, however, be discerned:
(a) the need to ensure that methods to correct for selection bias are appropriate and that their requirements (such as distributional properties) are met;
(b) the need to be aware of other, possibly confounding, sources of bias;
(c) the usefulness of analyses of the sensitivity of conclusions to various possible magnitudes of selection bias; and of the sensitivity of the selection-bias corrected results to the assumptions of whatever method is employed; and
(d) the desirability of designing research so that selection bias problems are, as far as possible, eliminated without the need for complex and sometimes fragile ex post adjustment.

See also: Longitudinal Research: Panel Retention; Mortality Differentials: Selection and Causation; Nonequivalent Group Designs; Screening and Selection
Bibliography
Breen R 1996 Regression Models: Censored, Sample-selected or Truncated Data. Sage, Thousand Oaks, CA
Cosslett S 1991 Semiparametric estimation of a regression model with sample selectivity. In: Barnett W A, Powell J, Tauchen G (eds.) Nonparametric and Semiparametric Methods in Econometrics and Statistics. Cambridge University Press, Cambridge, UK
Heckman J J 1976 The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement 5: 475–92
Heckman J J 1979 Sample selection bias as a specification error. Econometrica 47: 153–61
Heckman J J, Hotz V J 1989 Choosing among alternative nonexperimental methods for estimating the impact of social programs: The case of manpower training. Journal of the American Statistical Association 84(408): 862–74
Heckman J J, Ichimura H, Smith J, Todd P E 1998 Characterizing selection bias using experimental data. Econometrica 66: 1017–98
Heckman J J, Ichimura H, Todd P E 1997 Matching as an econometric evaluation estimator: Evidence from evaluating a job training programme. Review of Economic Studies 64: 605–54
Heckman J J, Robb R 1985 Alternative methods for evaluating the impact of interventions. In: Heckman J J, Singer B (eds.) Longitudinal Analysis of Labor Market Data. Cambridge University Press, New York
Lalonde R J 1986 Evaluating the econometric evaluations of training programs with experimental data. American Economic Review 76: 604–20
Lee L F 1994 Semi-parametric two-stage estimation of sample selection models subject to Tobit-type selection rules. Journal of Econometrics 61: 305–44
Little R J A, Rubin D B 1987 Statistical Analysis with Missing Data. Wiley, New York
Nelson F D 1984 Efficiency of the two-step estimator for models with endogenous sample selection. Journal of Econometrics 24: 181–96
Olsen R J 1980 A least squares correction for selectivity bias. Econometrica 48: 1815–20
Rosenbaum P R 1995 Observational Studies. Springer-Verlag, New York
Rosenbaum P R 1996 Observational studies and nonrandomized experiments. In: Ghosh S, Rao C R (eds.) Handbook of Statistics. Elsevier, Amsterdam, Vol. 13
Rosenbaum P R 1999 Choice as an alternative to control in observational studies. Statistical Science 14: 259–304
Rosenbaum P R, Rubin D B 1983 The central role of the propensity score in observational studies for causal effects. Biometrika 70: 41–55
Rosenbaum P R, Rubin D B 1984 Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association 79: 516–24
Roy A D 1951 Some thoughts on the distribution of earnings. Oxford Economic Papers 3: 135–46
Rubin D B 1977 Formalizing subjective notions about the effect of nonrespondents in sample surveys. Journal of the American Statistical Association 72: 538–43

R. Breen
Self-concepts: Educational Aspects

The capacity to reflect on one’s own capabilities and actions is uniquely human. Early in life, young children begin to form beliefs about themselves which may serve as reference mechanisms for perceiving the world and oneself, and for regulating emotion, motivation, and action. Research on self-related beliefs can be traced back to the seminal writings of William James (1893) who distinguished the self as ‘I’ (‘existential self’) and as ‘me’ (‘categorical self’), the latter implying cognitions about the self as an object of thinking (as represented in a sentence like ‘I think about me’). Research was continued throughout the decades after that and gained a central status in personality and social psychology as well as educational research after the cognitive paradigm shift in the 1950s and 1960s. At that time, behavioristic approaches were gradually replaced by cognitive perspectives, thus making it possible to acknowledge the importance of cognitive processes in human agency. Since then, the number of studies on self-concepts has increased drastically. This also applies to studies on the educational relevance of self-concepts (Hansford and Hattie 1982, Helmke 1992).
1. Definition of the Term ‘Self-concept’

The term ‘self-concept’ is used in two interrelated ways. Talking about the self-concept of a person has to be differentiated from referring to a number of different self-concepts of this person. Usage of the term in self-concept research implies that ‘the’ self-concept may be defined as the total set of a person’s cognitive representations of him- or herself which are stored in memory. In the plural sense, different self-concepts of a person are subsets of such representations relating to different self-attributes (like abilities, physical appearance, social status, etc.). In both variants, the term implies that self-concepts are self-representations which are more or less enduring over time, in contrast to situational self-perceptions. The definition is open as to whether these representations refer to factual reality of personal attributes, or to fictitious (e.g., possible or ideal) attributes.

There are two definitional problems which remain conceptually unresolved and impede scientific progress. First, it is unclear whether self-related emotions should be included or not. One may argue that emotions are different from cognition and should be regarded as separate constructs. From such a perspective, (cognitive) self-concepts and (emotional) self-related feelings might be subsumed under umbrella constructs like self-esteem, comprising both cognitive and affective facets, but should not be mixed up conceptually. On the other hand, prominent measures of the construct take more integrative perspectives by combining cognitive and affective self-evaluative items. One example is H. Marsh’s widely used Self-Description Questionnaires (SDQ: Marsh 1988), which measure academic self-concepts both by cognitive items (e.g., ‘I am good at mathematics’) and by affective items (e.g., ‘I enjoy doing work in mathematics’).

Second, there is no common agreement on the conceptual status of self-related cognitions linking the self to one’s own actions and the environment. Examples are self-efficacy beliefs pertaining to one’s own capabilities to be successful in solving specific types of tasks, thus cognitively linking the self (own capabilities) to own actions (task performance) and to environmental demands (a domain of tasks; cf. Pajares 1996; see Self-efficacy: Educational Aspects). It may be argued that such self-representations should be regarded as part of a person’s self-concept as well. However, terms
like self-concept, on the one hand, and self-efficacy beliefs, control beliefs etc., on the other, have been used as if they were relating to conceptually distinct phenomena, and have been addressed by different traditions of research. These research traditions tend to mutually ignore each other in spite of overlapping constructs and parallel findings, implying that more ‘cross-talk’ among researchers should take place in order to reduce conceptual confusion and the proliferation of redundant constructs which still prevails at the beginning of the twenty-first century (Pajares 1996).
2. Facets and Structures of Self-concepts

Self-related cognitions can refer to different attributes of the self, and can imply descriptive or evaluative perspectives pertaining to the factual or nonfactual (e.g., ideal) reality of these attributes. Research has begun to analyze the structures of these different representational facets.
2.1 Representations of Attributes: Hierarchies of Self-concepts

Self-concepts pertaining to different attributes may vary along two dimensions: the domain of attributes which is addressed, and their generality. It may be theorized that self-concepts are hierarchically organized along these two dimensions in similar ways as semantic memory networks can be assumed to be, implying that more general self-concepts are located at top levels of the self-concept hierarchy, and more specific self-concepts at lower levels. In educational research, the hierarchical model put forward by Shavelson et al. (1976) stimulated studies on facets of self-concepts differing in generality. This model implied that a ‘general self-concept’ is located at the top of the hierarchy; an ‘academic self-concept’ pertaining to one’s academic capabilities as well as social, emotional, and physical nonacademic self-concepts at the second level; self-concepts relating to different school subjects and to social, emotional, and physical subdomains at the third level; and more specific self-concepts implying evaluations of situation-specific behavior at lower levels.

Studies showed that correlations between academic self-concepts pertaining to different subjects tend to be near zero, in contrast to performance which typically is correlated positively across academic domains. An explanation has been provided by Marsh’s internal–external frame of reference model (I/E model) positing that the formation of academic self-concepts may be based on internal standards of comparison implying within-subject comparison of abilities across domains, as well as external standards implying between-subjects social comparison with other students (Marsh 1986). Applying internal standards may lead to negative correlations between self-concepts pertaining to different domains (e.g., perceiving high ability in math may lead to lowered estimates of verbal abilities, and vice versa). External standards would imply positive correlations since achievement in domains like mathematics and languages is positively correlated across students. The model assumes that students use both types of standards, implying that opposing effects may cancel out. Accordingly, the Shavelson et al. (1976) model has been reformulated by assuming that students hold math-related and verbal-related academic self-concepts, but no general academic self-concept (Marsh and Shavelson 1985).
2.2 Different Perspectives on Attributes
Self-representations can imply descriptive as well as evaluative accounts of attributes (e.g., ‘I am tall’ vs. ‘I am an attractive woman’). In academic self-concepts, this distinction may often be blurred because descriptions of academic competence may use comparison information implying some evaluation as well (e.g., ‘I am better at math than most of my classmates’). Any subjective evaluation of academic competence may use a number of different standards of evaluation, e.g., (a) social comparison standards and (b) intraindividual comparison across domains as addressed by Marsh’s I/E model (see above), as well as (c) intraindividual comparison of current and past competence (implying an evaluation of one’s academic development), (d) mastery-oriented comparison relating one’s competence to content-based criteria of minimal or optimal performance in an academic domain, and (e) cooperative standards linking individual performance to group performance. Beyond existing attributes (‘real self’), self-representations can pertain to attributes of the self which do not factually exist, but might exist (‘possible selves’), are wanted by oneself (‘ideal self’), are wanted by others (‘ought self’), etc. Concepts of nonreal selves may be as important for affect and motivated self-development as concepts of the real self. For example, perceived discrepancies between ideal and real self may be assumed to trigger depressive emotion, whereas discrepancies between ought and real self may give rise to anxiety (Higgins 1987).
3. Self-concepts and Academic Achievement
The relation between self-concept and academic achievement is one of the most often analyzed problems in both self-concept and educational research. In the first stage of research on this problem, connections between the two constructs were analyzed by correlational
methods based on cross-sectional field studies. Results implied that self-concepts and achievement may be positively linked. However, the magnitude of the correlation proved to depend on the self-concept and achievement constructs used. For example, in the meta-analysis provided by Hansford and Hattie (1982), the average correlation between self-concept and achievement/performance measures was r = 0.22 for general self-esteem, and r = 0.42 for self-concept of ability. When both self-concepts and achievement are measured in more domain-specific ways, correlations tend to be even higher (e.g., correlations for self-concepts and academic achievement in specific school subjects). This pattern of relations implies that the link gets closer when self-concept and criterion measure are matched, and when both are conceptualized in domain-specific ways.
In the second stage, researchers began to analyze the causal mechanisms producing these relations. From a theoretical perspective, the ‘skill development’ model maintained that academic self-concepts are the result of prior academic achievement, whereas the ‘self-enhancement’ approach implied that self-concepts influence students’ achievement (cf. Helmke and van Aken 1995). In a number of longitudinal studies, both hypotheses were tested competitively by using cross-lagged panel analysis and structural equation modeling. Results showed that both hypotheses may be valid, thus implying that self-concepts and achievement may be linked by reciprocal causation: Prior academic achievement exerted effects over time on subsequent academic self-concepts, and self-concepts influenced later achievement, although effect sizes for the latter causal path were less strong and consistent (cf. Helmke and van Aken 1995, Marsh and Yeung 1997). This evidence suggests that academic self-concepts are partly based on information about one’s own academic achievement, and may contribute to later academic learning and achievement.
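The logic of a cross-lagged panel analysis can be sketched on simulated two-wave data. Everything in the following example is an assumption made for illustration: the path values (0.3 for achievement to later self-concept, 0.1 for self-concept to later achievement), the noise levels, and the helper names are hypothetical and are not taken from the studies cited above. Each wave-2 variable is regressed on both wave-1 variables, so that the cross-lagged slopes estimate the ‘skill development’ and ‘self-enhancement’ paths while controlling for stability.

```python
import random


def ols_two_predictors(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (all variables centered first)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate_panel(n=2000, seed=0):
    """Two-wave data with assumed (hypothetical) stability and cross-lagged paths."""
    rng = random.Random(seed)
    ach1, sc1, ach2, sc2 = [], [], [], []
    for _ in range(n):
        a1 = rng.gauss(0, 1)
        s1 = 0.5 * a1 + rng.gauss(0, 1)
        a2 = 0.6 * a1 + 0.1 * s1 + rng.gauss(0, 0.5)  # self-enhancement path: 0.1
        s2 = 0.5 * s1 + 0.3 * a1 + rng.gauss(0, 0.5)  # skill-development path: 0.3
        ach1.append(a1)
        sc1.append(s1)
        ach2.append(a2)
        sc2.append(s2)
    return ach1, sc1, ach2, sc2


ach1, sc1, ach2, sc2 = simulate_panel()
# Wave-2 self-concept on wave-1 self-concept (stability) and wave-1 achievement
stability_sc, ach_to_sc = ols_two_predictors(sc1, ach1, sc2)
# Wave-2 achievement on wave-1 achievement (stability) and wave-1 self-concept
stability_ach, sc_to_ach = ols_two_predictors(ach1, sc1, ach2)
print("achievement -> self-concept:", round(ach_to_sc, 2))
print("self-concept -> achievement:", round(sc_to_ach, 2))
```

In this simulated setting both cross-lagged slopes are recovered as positive, with the achievement-to-self-concept path the stronger of the two, which is the qualitative pattern of reciprocal effects reported above. Real analyses would typically use latent-variable structural equation models rather than two separate regressions.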
Not much is known about the exact nature of these mechanisms. Investigators have just begun to explore possible mediators and moderators of self-concept/achievement relations. Judging from theory and available evidence, the following may be assumed.
3.1 Effects of Achievement on Self-concepts
Academic feedback of achievement (e.g., by grades) may underlie the formation of self-representations of academic capabilities. The match between feedback and self-representations may depend on the cumulativeness and consistency of feedback, its salience, the reference norms used, and the degree of congruency with competing information from parents, peers, or one’s past. This implies that the impact of achievement on self-concepts may differ between schools, teachers, and classrooms using different standards and practices of feedback (cf. Helmke 1992). Finally, beyond exerting effects on academic self-concepts, achievement feedback can be assumed to influence students’ general sense of self-worth (Pekrun 1990), thus affecting students’ overall psychological health and personality development as well.
3.2 Effects of Self-concepts on Achievement
Self-concepts may influence the formation of self-efficacy and success expectations when students are confronted with academic tasks. These expectations may underlie academic emotions and motivation, which may in turn influence effort, strategies of learning, and on-task behavior directly affecting learning and performance (Pekrun 1993). Research has shown that self-concepts implying moderate overestimation of one’s own capabilities may be optimal for motivation and learning gains, whereas an underestimation may be rather detrimental. However, self-evaluations precisely reflecting reality may also be suboptimal for learning and achievement, indicating that there may be a conflict between the educational goals of teaching students to be self-realistic vs. highly motivated (Helmke 1992).
4. Development of Self-concepts
4.1 Basic Needs Underlying Self-concept Development
Two classes of basic human needs seem to govern the development of self-representations (cf. Epstein 1973). The first comprises general needs for maximizing pleasure and minimizing pain, from which needs for self-enhancement follow, implying motivation to establish and maintain a positive view of the self. The second category comprises needs to perceive reality and foresee the future in such ways that adaptation and survival are possible, thus implying needs for self-perceptions which are consistent with reality, and consistent with each other (needs for consistency). Self-enhancement and consistency may converge when feedback about the self is positive. However, they may be in conflict when feedback is negative: The need for self-enhancement would suggest not accepting negative feedback, whereas needs for reality-oriented consistency would imply endorsing it.
4.2 Developmental Courses across the Life Span
The development of self-concepts is characterized by the interplay of mechanisms driven by these conflicting basic needs. In general, in the absence of strong information about reality, humans tend to overestimate their capabilities in self-enhancing ways. At the same time, they search for self-related information, and tend to endorse this information even if it is negative, provided that such negative information is salient, consistent, and cumulative. For example, it has repeatedly been found that many children entering school drastically overestimate their competences to master academic demands, but adjust their self-evaluations downward when diverging cumulative feedback is given (Helmke 1992). The process of realistically interpreting self-related feedback may be impeded in young children by their lack of sufficiently sophisticated metacognitive competences. Nevertheless, the interplay of beginners’ optimism and experts’ (relative) realism may characterize not only the early elementary years, but later phases of entering new institutions as well (university, marriage, a new job, etc.), thus implying a dynamic interplay of self-related optimism and realism across much of the life span.
4.3 Impact of Educational Environments
The influence of social environments was addressed early in the twentieth century by symbolic interactionism, which postulated that interactions with significant others may shape the development of self-concepts (cf. the concept of the ‘looking glass self,’ Cooley 1902). Generally, a number of environmental variables may be influential. Self-concepts can develop according to direct attributions of traits and personal worth by other persons, provided that such attributions are consistent with other sources of information and interpreted accordingly. A second source of self-related information consists of indirect, implicit attributions which are conveyed by others’ emotional and instrumental behavior towards the developing person. Of specific importance are acceptance and support by others implying attributions of personal worth, thereby influencing the development of a person’s general self-esteem (Pekrun 1990). Beyond attributions, social environments may define situational conditions for the development of knowledge, skills, motivation, and behavior, which may in turn contribute to self-perceptions of one’s own competences. For example, instruction may build up knowledge and skills influencing knowledge-specific academic self-concepts; support of autonomy may foster the acquisition of self-regulatory abilities and, thereby, the development of related self-concepts of abilities; and consistent behavioral rules may enhance the cognitive predictability of students’ environments, which may also positively affect their overall sense of competence. As outlined above, concerning academic self-concepts, feedback of achievement implying information about abilities and effort may be of specific importance, and the effects of such feedback may depend on the reference norms used. Feedback given according to competitive, interindividually referenced norms implies that the self-concepts of high achievers may benefit, whereas low achievers may have difficulties protecting their self-esteem. In contrast, intraindividual, mastery-oriented, and cooperative norms may be better suited to foster low achievers’ self-concepts as well. Finally, classroom conditions may also be influential. Specifically, grouping of students may influence students’ relative, within-classroom achievement positions, thus affecting their academic self-concepts when competitive standards of grading are used. For example, being in a low-ability class would help an average student to maintain positive academic self-concepts, whereas being in a class of highly gifted students would enforce downward adjustment of self-evaluation (‘big-fish-little-pond effect,’ Marsh 1987).
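The frame-of-reference idea behind the big-fish-little-pond effect can be illustrated with a deliberately simple sketch. The index below (own achievement minus the class mean) and the example scores are hypothetical and serve only to show how the same student can occupy very different relative positions in different classes; it is not a model proposed by Marsh (1987).

```python
from statistics import mean


def relative_standing(own_achievement, class_achievements):
    """Toy social-comparison index: own achievement minus the class mean.

    Positive values mean the student stands above the classroom reference
    group; negative values mean the student stands below it.
    """
    return own_achievement - mean(class_achievements)


student_score = 100                      # an 'average' student (assumed score)
low_ability_class = [80, 85, 90, 100]    # hypothetical class including the student
gifted_class = [100, 120, 125, 130]      # hypothetical class including the student

in_low_class = relative_standing(student_score, low_ability_class)
in_gifted_class = relative_standing(student_score, gifted_class)
print(in_low_class, in_gifted_class)
```

Under competitive, interindividually referenced grading, the positive standing in the first class and the negative standing in the second are the kind of contrast that would be expected to push academic self-concept in opposite directions.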
5. Summary of Implications for Educational Practice
Fostering students’ self-concepts may be beneficial for their achievement and personality development. Furthermore, since positive self-esteem may be regarded as a key element of psychological health and well-being, nurturing self-esteem may be regarded as an educational goal which is valuable in itself. From the available research summarized above, it follows that education may contribute substantially to self-concept development. Concerning parents as well as academic institutions like the school, acceptance and support may be of primary importance for the development of general self-esteem. In addition, giving students specific feedback on achievement, traits, and abilities may help in shaping optimistic self-concepts and expectancies which nevertheless are grounded in reality. In designing feedback, it may be helpful to use mastery-oriented, individual, and cooperative reference norms instead of relying on social comparison standards and competitive grading. Furthermore, educational environments may help students’ self-concept development by providing high-quality instruction, consistent normative and behavioral structures implying predictability, as well as sufficient autonomy support fostering students’ sense of controllability and competence.
See also: Motivation, Learning, and Instruction; School Achievement: Cognitive and Motivational Determinants; Schooling: Impact on Cognitive and Motivational Development; Self-development in Childhood; Self-efficacy; Self-efficacy: Educational Aspects; Self-regulated Learning
Bibliography
Bracken B A (ed.) 1996 Handbook of Self-concept. Wiley, New York
Cooley C H 1902 Human Nature and the Social Order. Scribner, New York
Covington M V 1992 Making the Grade. Cambridge University Press, New York
Covington M V, Beery R 1976 Self-worth and School Learning. Holt, Rinehart and Winston, New York
Epstein S 1973 The self-concept revisited or a theory of a theory. American Psychologist 28: 404–16
Hansford B C, Hattie J A 1982 The relationship between self and achievement/performance measures. Review of Educational Research 52: 123–42
Hattie J 1992 The Self-concept. Erlbaum, Hillsdale, NJ
Helmke A 1992 Selbstvertrauen und schulische Leistungen. Hogrefe, Göttingen, Germany
Helmke A, van Aken M A G 1995 The causal ordering of academic achievement and self-concept of ability during elementary school: A longitudinal study. Journal of Educational Psychology 87: 624–37
Higgins E T 1987 Self-discrepancy: A theory relating self and affect. Psychological Review 94: 319–40
James W 1893 Psychology. Fawcett, New York
Kulik J A, Kulik C L 1992 Meta-analytic findings on grouping processes. Gifted Child Quarterly 36: 73–7
Marsh H W 1986 Verbal and math self-concepts: An internal/external frame of reference model. American Educational Research Journal 23: 129–49
Marsh H W 1987 The big-fish-little-pond effect on academic self-concept. Journal of Educational Psychology 79: 280–95
Marsh H W 1988 Self Description Questionnaire I (SDQ): Manual and Research Monograph. Psychological Corporation, San Antonio, TX
Marsh H W, Shavelson R J 1985 Self-concept: Its multifaceted, hierarchical structure. Educational Psychologist 20: 107–25
Marsh H W, Yeung A S 1997 Causal effects of academic self-concept on academic achievement: Structural equation models of longitudinal data. Journal of Educational Psychology 89: 41–54
Pajares F 1996 Self-efficacy beliefs in academic settings. Review of Educational Research 66: 543–78
Pekrun R 1990 Social support, achievement evaluations, and self-concepts in adolescence. In: Oppenheimer L (ed.) The Self-concept. Springer, Berlin, pp. 107–19
Pekrun R 1993 Facets of adolescents’ academic motivation: A longitudinal expectancy-value approach. In: Maehr M, Pintrich P (eds.) Advances in Motivation and Achievement. JAI Press, Greenwich, CT, Vol. 8, pp. 139–89
Shavelson R J, Hubner J J, Stanton G C 1976 Self-concept: Validation of construct interpretations. Review of Educational Research 46: 407–41
Skaalvik E M, Rankin R J 1995 A test of the internal/external frame of reference model at different levels of math and verbal self-perception. American Educational Research Journal 32: 161–84
R. Pekrun
Self-conscious Emotions, Psychology of
Shame, guilt, embarrassment, and pride are members of a family of ‘self-conscious emotions’ that are evoked by self-reflection and self-evaluation. This self-evaluation may be implicit or explicit, consciously experienced or transpiring beyond our awareness. But either way, the self is the object of self-conscious emotions. In contrast to ‘basic’ emotions (e.g., anger, fear, joy), which are present very early in life, self-conscious emotions have been described as ‘secondary,’ ‘derived,’ or ‘complex’ emotions because they emerge later and hinge on several cognitive achievements: recognition of the self as separate from others, and a set of standards against which the self is evaluated. For example, Lewis et al. (1989) showed that the capacity to experience embarrassment coincides with the emergence of self-recognition. Very young children first show behavioral signs of embarrassment during the same developmental phase (15–24 months) in which they show a rudimentary sense of self. Moreover, within this range, children who display self-recognition (in a ‘rouge’ test) are the same children who display signs of embarrassment in an unrelated task.
1. Shame and Guilt
The terms ‘shame’ and ‘guilt’ are often used interchangeably. When people do distinguish the two, they typically suggest that shame arises from public exposure and disapproval of a failure or transgression, whereas guilt is more ‘private,’ arising from one’s own conscience. Recent empirical research has failed to support this public/private distinction. For example, Tangney et al. (1996) found that people’s real-life shame and guilt experiences are each most likely to occur in the presence of others. More important, neither the presence of others nor others’ awareness of the respondents’ behavior distinguished between shame and guilt. Overall, there are surprisingly few differences in the types of events that elicit shame and guilt. Shame is somewhat more likely in response to violations of social norms, but most types of events (e.g., lying, cheating, stealing, failing to help another) result in guilt for some people and shame for others. So what is the difference between shame and guilt? Empirical research supports Helen Block Lewis’s (1971) notion that shame involves a negative evaluation of the global self, whereas guilt involves a negative evaluation of a specific behavior. This differential emphasis on self (‘I did that horrible thing,’ with the stress on ‘I’) vs. behavior (‘I did that horrible thing,’ with the stress on the deed) leads to different affective experiences. Shame is an acutely painful emotion typically accompanied by a sense of shrinking, ‘being small,’ and by a sense of worthlessness and powerlessness. Shamed people also feel exposed. Although shame doesn’t necessarily involve an actual observing audience, there is often the imagery of how one’s defective self would appear to others. Not surprisingly, shame often leads to a desire to escape or hide, to sink into the floor and disappear.
In contrast, guilt is typically less painful and devastating because the primary concern is with a specific behavior, not the entire self. Guilt doesn’t affect one’s core identity. Instead, there is tension, remorse, and regret over the ‘bad thing done’ and a nagging preoccupation with the transgression. Rather than motivating avoidance, guilt typically motivates reparative action.
1.1 Implications of Shame and Guilt for Interpersonal Adjustment
Research has consistently shown that, on balance, guilt is the more adaptive emotion, benefiting relationships in a variety of ways (Baumeister et al. 1994, Tangney 1995). Three sets of findings illustrate the adaptive, ‘relationship-enhancing functions’ of guilt, in contrast to the hidden costs of shame. First, shame typically leads to attempts to deny, hide, or escape; guilt typically leads to reparative action: confessing, apologizing, undoing. Thus, guilt orients people in a more constructive, proactive, future-oriented direction, whereas shame orients people toward separation, distance, and defense. Second, a special link between guilt and empathy has been observed at the levels of both emotion states and dispositions. Studies of children, college students, and adults (Tangney 1995) show that guilt-prone individuals are generally empathic individuals. In contrast, shame-proneness is associated with an impaired capacity for other-oriented empathy and a propensity for ‘self-oriented’ personal distress responses. Similar findings are evident when considering feelings of shame and guilt ‘in the moment.’ Individual differences aside, when people describe personal guilt experiences, they convey greater empathy for others, compared to descriptions of shame experiences. It appears that by focusing on a bad behavior (as opposed to a bad self), people experiencing guilt are relatively free of the egocentric, self-involved focus of shame. Instead, their focus on a specific behavior is likely to highlight the consequences for distressed others. Third, there is a special link between shame and anger, again observed at both the dispositional and state levels. At all ages, shame-prone individuals are also prone to feelings of anger and hostility (Tangney 1995). Moreover, once angered, they tend to manage their anger in an aggressive, unconstructive fashion.
In contrast, guilt is generally associated with constructive means of handling anger. Similar findings have been observed at the situational level, too. In a study of couples’ real-life episodes of anger, shamed partners were significantly more angry, more aggressive, and less likely to elicit conciliatory behavior from the offending partner (Tangney 1995). What accounts for this link between shame and anger? When feeling shame, people initially direct hostility inward (‘I’m such a bad person’). But this hostility may be redirected outward in a defensive attempt to protect the self by shifting the blame elsewhere (‘Oh what a horrible person I am, and damn it, how could you make me feel that way!’). In sum, studies employing diverse samples, measures, and methods converge. All things equal, it’s better if your friend, partner, child, or boss feels guilt than shame. Shame motivates behaviors that interfere with interpersonal relationships. Guilt helps keep people constructively engaged in the relationship at hand.
1.2 Implications of Shame and Guilt for Psychological Adjustment
Although guilt appears to be the more ‘moral’ or adaptive emotion when considering social adjustment, is there a trade-off vis-à-vis individual psychological adjustment? Does the tendency to experience guilt or shame leave one vulnerable to psychological problems? Researchers consistently report a relationship between proneness to shame and a whole host of psychological symptoms, including depression, anxiety, eating disorder symptoms, subclinical sociopathy, and low self-esteem (Harder et al. 1992, Tangney et al. 1995). This relationship is robust across measurement methods and diverse populations. There is more controversy regarding the relationship of guilt to psychopathology. The traditional view is that guilt plays a significant role in psychological symptoms. Clinical theory and case studies make frequent reference to a maladaptive guilt characterized by chronic self-blame and obsessive rumination over one’s transgressions. On the other hand, recent theory and research have emphasized the adaptive functions of guilt, particularly for interpersonal behavior. Tangney et al.
(1995) argued that once one makes the critical distinction between shame and guilt, there’s no compelling reason to expect guilt over specific behaviors to be associated with poor psychological adjustment. The empirical research is similarly mixed. Studies employing adjective checklist-type (and other globally worded) measures find both shame-proneness and guilt-proneness correlated with psychological symptoms. On the other hand, measures sensitive to the self vs. behavior distinction show no relationship between proneness to ‘shame-free’ guilt and psychopathology.
1.3 Development of Guilt Distinct from Shame
The experience of self-conscious emotions requires the development of standards and a recognized self. In addition, a third ability is required to experience guilt (about specific behaviors) independent of shame (about the self): the ability to make a clear distinction between self and behavior. Developmental research indicates that children do not begin making meaningful distinctions between attributions to ability (enduring characteristics) vs. attributions to effort (more unstable, volitional factors) until about age eight, the same age at which researchers find interpretable differences in children’s descriptions of shame and guilt experiences (Ferguson et al. 1991).
2. Embarrassment
Miller (1996) defines embarrassment as ‘an aversive state of mortification, abashment, and chagrin that follows public social predicaments’ (p. 322). Indeed, embarrassment appears to be the most ‘social’ of the self-conscious emotions, occurring almost without exception in the company of others.
2.1 Causes of Embarrassment
In Miller’s (1996) catalog of embarrassing events described by several hundred adolescents and adults, ‘normative public deficiencies’ (tripping in front of a large class, forgetting someone’s name, unintended bodily induced noises) were at the top of the list. But there were many other types of embarrassment situations: awkward social interactions, conspicuousness in the absence of any deficiency, ‘team transgressions’ (being embarrassed by a member of one’s group), and ‘empathic’ embarrassment. The diversity of situations that lead to embarrassment has posed a challenge to efforts at constructing a comprehensive ‘account’ of embarrassment. Some theorists believe that the crux of embarrassment is negative evaluation by others (Edelmann 1981, Miller 1996). This social evaluation account runs into difficulty with embarrassment events that involve no apparent deficiency (e.g., being the center of attention during a ‘Happy Birthday’ chorus). Other theorists subscribe to the ‘dramaturgic’ account, positing that embarrassment occurs when implicit social roles and scripts are disrupted. A flubbed performance, an unanticipated belch, and being the focus of ‘Happy Birthday’ each represent a deviation from accustomed social scripts. Lewis (1992) distinguished between two types of embarrassment: embarrassment due to exposure and embarrassment due to negative self-evaluation. According to Lewis (1992), embarrassment due to exposure emerges early in life, once children develop a rudimentary sense of self. When children develop standards, rules, and goals (SRGs), a second type of embarrassment emerges: ‘embarrassment as mild shame’ associated with failure in relation to SRGs.
2.2 Functions of Embarrassment
Although there is debate about the fundamental causes of embarrassment, there is general agreement about its adaptive significance. Gilbert (1997) suggests that embarrassment serves an important social function by signaling appeasement to others. When untoward behavior threatens a person’s standing in an important social group, visible signs of embarrassment function as a nonverbal acknowledgment of shared social standards, thus defusing negative evaluations and reducing the likelihood of retaliation. Evidence from studies of both humans and nonhuman primates supports this remedial function of embarrassment (Keltner and Buswell 1997).
2.3 Embarrassment and Shame
Is there a difference between shame and embarrassment? Some theorists essentially equate the two emotions. A more dominant view is that shame and embarrassment differ in intensity of affect and/or severity of transgression. Still others propose that shame is tied to perceived deficiencies of one’s core self, whereas embarrassment results from deficiencies in one’s presented self. Recent research suggests that shame and embarrassment are indeed quite different emotions, more distinct, even, than shame and guilt. For example, comparing adults’ personal shame, guilt, and embarrassment experiences, Tangney et al. (1996) found that shame was a more intense, painful emotion that involved a greater sense of moral transgression. But controlling for intensity and morality, shame and embarrassment still differed markedly along many affective, cognitive, and motivational dimensions. When shamed, people felt greater responsibility, regret, and self-directed anger. Embarrassment was marked by more humor, blushing, and a greater sense of exposure.
2.4 Individual Differences in Embarrassability
As with shame and guilt, people vary in their propensity to experience embarrassment. These individual differences are evident within the first years of life, and are relatively stable across time. Research has shown that embarrassability is associated with neuroticism, high levels of negative affect, self-consciousness, and a fear of negative evaluation from others. Miller (1996) has shown that this fear of negative evaluation is not due to poor social skills, but rather to a heightened concern for social rules and standards.
3. Pride
Of the self-conscious emotions, pride has received the least attention. Most research comes from developmental psychology, particularly in the achievement domain.
3.1 Developmental Issues
There appear to be substantial developmental shifts in the types of situations that induce pride, the nature of the pride experience itself, and the ways in which pride is expressed. For example, Stipek et al. (1992) observed developmental changes in the criteria children use for evaluating success and failure, and in the types of situation that lead to pride. Children under 33 months respond positively to task-intrinsic criteria (e.g., completing a tower of blocks), but they do not seem to grasp the concept of competition (e.g., winning or losing a race to complete a tower of blocks). It is only after 33 months that children show enhanced pride in response to a competitive win. There are also developmental shifts in the importance of praise from others. Stipek et al. (1992) reported that all children aged 13–39 months smiled and exhibited pleasure with their successes. But there were age differences in social referencing. As children neared two years of age, they began to seek eye contact with parents upon completing a task, often actively soliciting parental recognition, which in turn enhanced children’s pleasure with achievements. Stipek et al. (1992) suggest that the importance of external praise may be curvilinear across the lifespan. Very young children take pleasure in simply having an immediate effect on their environment; as they develop self-consciousness (at about two), others’ reactions shape their emotional response to success and failure. Still later, as standards become increasingly internalized, pride again becomes more autonomous, less contingent on others’ praise and approval.
3.2 Two Types of Pride?
Both Tangney (1990) and Lewis (1992) have suggested that there are two types of pride. Paralleling the self vs. behavior distinction of guilt and shame, Tangney (1990) distinguished between pride in self (‘alpha’ pride) and pride in behavior (‘beta’ pride).
Similarly, Lewis (1992) distinguished between pride (arising from attributing one’s success to a specific action) and hubris (pridefulness arising from attributions of success to the global self). Lewis (1992) views hubris as largely maladaptive, noting that hubristic individuals are inclined to distort and invent situations to enhance the self, which can lead to interpersonal problems.
4. Future Research
Future research will no doubt focus on biological and social factors that shape individual differences in self-conscious emotions. In addition, we need to know more about the conditions under which guilt, shame, pride, and embarrassment are most likely to be adaptive vs. maladaptive. Finally, more cross-cultural research is needed. Kitayama et al. (1995) make the compelling argument that, owing to cultural differences in the construction of the self, self-conscious emotions may be especially culturally sensitive.
See also: Culture and Emotion; Emotion and Expression; Emotion, Neural Basis of; Emotions, Evolution of; Emotions, Psychological Structure of; Shame and the Social Bond
Bibliography
Baumeister R F, Stillwell A M, Heatherton T F 1994 Guilt: An interpersonal approach. Psychological Bulletin 115: 243–67
Edelmann R J 1981 Embarrassment: The state of research. Current Psychological Reviews 1: 125–38
Ferguson T J, Stegge H, Damhuis I 1991 Children’s understanding of guilt and shame. Child Development 62: 827–39
Gilbert P 1997 The evolution of social attractiveness and its role in shame, humiliation, guilt, and therapy. British Journal of Medical Psychology 70: 113–47
Harder D W, Cutler L, Rockart L 1992 Assessment of shame and guilt and their relationship to psychopathology. Journal of Personality Assessment 59: 584–604
Keltner D, Buswell B N 1997 Embarrassment: Its distinct form and appeasement functions. Psychological Bulletin 122: 250–70
Kitayama S, Markus H R, Matsumoto H 1995 Culture, self, and emotion: A cultural perspective on ‘self-conscious’ emotion. In: Tangney J P, Fischer K W (eds.) Self-conscious Emotions: The Psychology of Shame, Guilt, Embarrassment, and Pride. Guilford Press, New York, pp. 439–64
Lewis H B 1971 Shame and Guilt in Neurosis. International Universities Press, New York
Lewis M 1992 Shame: The Exposed Self. Free Press, New York
Lewis M, Sullivan M W, Stanger C, Weiss M 1989 Self-development and self-conscious emotions. Child Development 60: 146–56
Mascolo M F, Fischer K W 1995 Developmental transformation in appraisals for pride, shame, and guilt. In: Tangney J P, Fischer K W (eds.) Self-conscious Emotions: The Psychology of Shame, Guilt, Embarrassment, and Pride. Guilford Press, New York, pp. 64–113
Miller R S 1996 Embarrassment: Poise and Peril in Everyday Life. Guilford Press, New York
Stipek D J, Recchia S, McClintic S 1992 Self-evaluation in young children. Monographs of the Society for Research in Child Development 57 (1, Serial No. 226) R5–R83
Tangney J P 1990 Assessing individual differences in proneness to shame and guilt: Development of the self-conscious affect and attribution inventory. Journal of Personality and Social Psychology 59: 102–11
Tangney J P 1995 Shame and guilt in interpersonal relationships. In: Tangney J P, Fischer K W (eds.) Self-conscious Emotions: The Psychology of Shame, Guilt, Embarrassment, and Pride. Guilford Press, New York, pp. 114–39
Tangney J P, Burggraf S A, Wagner P E 1995 Shame-proneness, guilt-proneness, and psychological symptoms. In: Tangney J P, Fischer K W (eds.) Self-conscious Emotions: The Psychology of Shame, Guilt, Embarrassment, and Pride. Guilford Press, New York, pp. 343–67
Tangney J P, Miller R S, Flicker L, Barlow D H 1996 Are shame, guilt and embarrassment distinct emotions? Journal of Personality and Social Psychology 70: 1256–69
Tangney J P, Wagner P, Gramzow R 1992 Proneness to shame, proneness to guilt, and psychopathology. Journal of Abnormal Psychology 101: 469–78
J. P. Tangney
Self-development in Childhood
In addition to domain-specific self-concepts, the ability to evaluate one’s overall worth as a person emerges in middle childhood. The level of such global self-esteem varies tremendously across children and is determined by how adequate they feel in domains of importance as well as the extent to which significant others (e.g., parents and peers) approve of them as a person. Efforts to promote positive self-esteem are critical, given that low self-esteem is associated with many psychological liabilities including depressed affect, lack of energy, and hopelessness about the future.
1. Introduction
Beginning in the second year of life, toddlers begin to talk about themselves. With development, they come to understand that they possess various characteristics, some of which may be positive (‘I’m smart’) and some of which may be negative (‘I’m unpopular’). Of particular interest is how the very nature of such self-evaluations changes with development, as well as how it varies among individual children and adolescents, across two basic evaluative categories: (a) domain-specific self-concepts, i.e., how one judges one’s attributes in particular arenas (e.g., scholastic competence, social competence), and (b) global self-esteem, i.e., how one evaluates one’s overall worth as a person (for a complete treatment of self-development in childhood and adolescence, see Harter 1999).
Developmental shifts in the nature of self-evaluations are driven by changes in the child’s cognitive capabilities (see Self-knowledge: Philosophical Aspects and Self-evaluative Process, Psychology of). Cognitive-developmental theory and findings (see Piaget 1960, 1963, Fischer 1980) alert us to the fact that the young child is limited to very specific, concrete representations of self and others, for example, ‘I know my ABCs’ (see Harter 1999). In middle to later childhood, the ability to form higher-order concepts about one’s attributes and abilities (e.g., ‘I’m smart’) emerges. There are further cognitive advances at adolescence, allowing the teenager to form abstract concepts about the self that transcend concrete behavioral manifestations and higher-order generalizations (e.g., ‘I’m intelligent’).
2. Developmental Differences in Domain-specific Self-concepts
Domain-specific evaluative judgments are observed at every developmental level. However, the precise nature of these judgments varies with age (see Table 1). In Table 1, five common domains in which children and adolescents make evaluative judgments about the self are identified: Scholastic competence, Athletic competence, Social competence, Behavioral conduct, and Physical appearance. The types of statements vary, however, across three age periods, early childhood, later childhood, and adolescence, in keeping with the cognitive abilities of each age period.
2.1 Early Childhood
Young children provide very concrete accounts of their capabilities, evaluating specific behaviors. Thus, they communicate how they know their ABCs, how they can run very fast, how they are nice to a particular friend, how they don’t hit their sister, and how they possess a specific physical feature such as pretty blond hair. Of particular interest in such accounts is the fact that the young child typically provides a litany of virtues, touting his or her positive skills and attributes. One cognitive limitation of this age period is that the young child cannot distinguish the wish to be competent from reality. As a result, young children typically overestimate their abilities because they do not yet have the skills to evaluate themselves realistically. Another cognitive characteristic that contributes to potential distortions is the pervasiveness of all-or-none thinking. That is, evaluations are either all positive or all negative. With regard to self-evaluations, they are typically all positive. (Exceptions to this positivity bias can be observed in children who are chronically abused, since severe maltreatment is often accompanied by parental messages that make the child feel inadequate, incompetent, and unlovable. Such children will also engage in all-or-none thinking but conclude that they are all bad.)
2.2 Middle to Later Childhood
As the child grows older, the ability to make higher-order generalizations in evaluating his or her abilities and attributes emerges. Thus, rather than cite prowess at a particular activity, the child may observe that he or she is good at sports, in general. This inference can further be justified in that the child can describe his or her talent at several sports (e.g., good at soccer, basketball, baseball). Thus, the higher-order generalization represents a cognitive construction in which an over-arching judgment (good at sports) is defined in terms of specific examples which warrant this conclusion. Similar processes allow the older child to conclude that he or she is smart (e.g., does well in math, science, and history). The structure of a higher-order generalization about being well behaved could include such components as obeying parents, not getting in trouble, and trying to do what is right. A generalization concerning the ability to make friends may subsume accounts of having friends at school, making friends easily at camp, and developing friendships readily upon moving to a new neighborhood. The perception that one is good-looking may be based on one’s positive evaluation of one’s face, hair, and body.
During middle childhood, all-or-none thinking diminishes and the aura of positivity fades. Thus, children do not typically think that they are all virtuous in every domain. The more common pattern is for them to feel more adequate in some domains than others. For example, one child may feel that he or she is good at schoolwork and is well behaved, whereas he or she is not that good at sports, does not think that he or she is good-looking, and reports that it is hard to make friends. Another child may report the opposite pattern. There are numerous combinations of positive and negative evaluations across these domains that children can and do report. Moreover, they may report both positive and negative judgments within a given domain, for example, they are smart in some school subjects (math and science) but ‘dumb’ in others (English and social studies). Such evaluations may also be accompanied by self-affects that also emerge in later childhood, for example, feeling proud of one’s accomplishments but ashamed of one’s perceived failures (see also Self-conscious Emotions, Psychology of). This ability to consider both positive and negative characteristics is a major cognitive-developmental acquisition. Thus, beginning in middle to later childhood, these distinctions result in a profile of self-evaluations across domains.
Contributing to this advance is the ability to engage in social comparison. Beginning in middle childhood one can use comparisons with others as a barometer of the skills and attributes of the self. In contrast, the young child cannot simultaneously compare his or her attributes to the characteristics of another in order to detect similarities or differences that have implications for the self. Although the ability to utilize social comparison information for the purpose of self-evaluation represents a cognitive-developmental advance, it also ushers in new, potential liabilities. With the emergence of the ability to rank-order the performance of other children, all but the most capable children will necessarily fall short of excellence. Thus, the very ability and penchant to compare the self with others makes one’s self-concept vulnerable, particularly if one does not measure up in domains that are highly valued. The more general effects of social comparison can be observed in findings revealing that domain-specific self-concepts become more negative during middle and later childhood, compared to early childhood.

Table 1 Developmental changes in the nature of self-evaluative statements across different domains

Domains                 Early childhood: specific behaviors   Later childhood: generalizations(a)   Adolescence: abstractions(a)
Scholastic competence   I know my A,B,C’s                     I’m smart in school                   I’m intelligent
Athletic competence     I can run very fast                   I’m good at sports                    I’m athletically talented
Social competence       I’m nice to my friend, Jason          It’s easy for me to make friends      I’m popular
Behavioral conduct      I don’t hit my sister                 I’m well behaved                      I think of myself as a moral person
Physical appearance     I have pretty blond hair              I’m good looking                      I’m physically attractive

(a) Examples in the table represent positive self-evaluations. However, during later childhood and adolescence, negative judgments are also observed.

2.3 Adolescence
For the adolescent, there are further cognitive-developmental advances that alter the nature of domain-specific self-evaluations. As noted earlier, adolescence brings with it the ability to create more abstract judgments about one’s attributes and abilities. Thus, one no longer merely considers oneself to be good at sports but to be athletically talented. One is no longer merely smart but views the self more generally as intelligent, where successful academic performance, general problem-solving ability, and creativity might all be subsumed under the abstraction of intelligence. Abstractions may be similarly constructed in the other domains. For example, in the domain of behavioral conduct, there will be a shift from the perception that one is well behaved to a sense that one is a moral or principled person. In the domains of social competence and appearance, abstractions may take the form of perceptions that one is popular and physically attractive.
These illustrative examples all represent positive self-evaluations. However, during adolescence (as well as in later childhood), judgments about one’s attributes will also involve negative self-evaluations. Thus, certain individuals may judge the self to be unattractive, unpopular, unprincipled, etc. Of particular interest is the fact that when abstractions emerge, the adolescent typically does not have total control over these new acquisitions, just as when one is acquiring a new athletic skill (e.g., swinging a bat, maneuvering skis), one lacks a certain level of control. In the cognitive realm, such lack of control often leads to overgeneralizations that can shift dramatically across situations or time. For example, the adolescent may conclude at one point in time that he or she is exceedingly popular but then, in the face of a minor social rebuff, may conclude that he or she is extremely unpopular. Gradually, adolescents gain control over these self-relevant abstractions such that they become capable of more balanced and accurate self-representations (see Harter 1999).
3. Global Self-esteem
The ability to evaluate one’s worth as a person also undergoes developmental change. The young child simply is incapable, cognitively, of developing the verbal concept of his or her value as a person. This ability emerges at the approximate age of eight. However, young children exude a sense of value or worth in their behavior. The primary behavioral manifestations involve displays of confidence, independence, mastery attempts, and exploration (see Harter 1999). Thus, behaviors that communicate to others that children are sure of themselves are manifestations of high self-esteem in early childhood.
At about the third grade, children begin to develop the concept that they like, or don’t like, the kind of person they are (Harter 1999, Rosenberg 1979). Thus, they can respond to general items asking them to rate the extent to which they are pleased with themselves, like who they are, and think they are fine as a person. Here, the shift reflects the emergence of an ability to construct a higher-order generalization about the self. This type of concept can be built upon perceptions that one has a number of specific qualities, for example, that one is competent, well behaved, attractive, etc. (namely, the type of domain-specific self-evaluations identified in Table 1). It can also be built upon the observation that significant others, for example, parents, peers, and teachers, think highly of the self. This process is greatly influenced by advances in the child’s ability to take the perspective of significant others (Selman 1980). During adolescence, one’s evaluation of one’s global worth as a person may be further elaborated, drawing upon more domains and sources of approval, and will also become more abstract. Thus, adolescents can directly acknowledge that they have high or low self-esteem, as a general abstraction about the self (see also Self-esteem in Adulthood).
4. Individual Differences in Domain-specific Self-concepts as well as Global Self-esteem
Although there are predictable cognitively based developmental changes in the nature of how most children and adolescents describe and evaluate themselves, there are striking individual differences in how positively or negatively the self is evaluated. Moreover, one observes different profiles of children’s perceptions of their competence or adequacy across the various self-concept domains, in that children evaluate themselves differently across domains. Consider the profiles of four different children. One child, Child A, may feel very good about her scholastic performance, although this is in sharp contrast to her opinion of her athletic ability, where she evaluates herself quite poorly. Socially she feels reasonably well accepted by her peers. In addition, she considers herself to be well behaved. Her feelings about her appearance, however, are relatively negative. Child A also reports very high self-esteem. Another child, Child B, has a very different configuration of scores. This is a boy who feels very incompetent when it comes to schoolwork. However, he considers himself to be very competent, athletically, and feels well received by peers. He judges his behavioral conduct to be less commendable. In contrast, he thinks he is relatively good-looking. Like Child A, he also reports high self-esteem. Other profiles are exemplified by Child C and Child D, neither of whom feels good about themselves scholastically or athletically. They evaluate themselves much more positively in the domains of social acceptance, conduct, and physical appearance. In fact, their profiles are quite similar to each other across the five specific domains. However, judgments of their self-esteem are extremely different. Child C has very high self-esteem whereas Child D has very low self-esteem.
This raises a puzzling question: how can two children look so similar with regard to their domain-specific self-concepts but evaluate their global self-esteem so differently? We turn to this issue next, in examining the causes of global self-esteem.
5. The Causes of Children’s Level of Self-esteem
Our understanding of the antecedents of global self-esteem has been greatly aided by the formulations of two historical scholars of the self, William James (1892) and Charles Horton Cooley (1902). Each suggested rather different pathways to self-esteem, defined as an overall evaluation of one’s worth as a person (see reviews by Harter 1999, Rosenberg 1979). James focused on how the individual assessed his or her competence in domains where one had aspirations to succeed. Cooley focused on the salience of the opinions that others held about the self, opinions which one incorporated into one’s global sense of self (see Self: History of the Concept).
5.1 Competence–Adequacy in Domains of Importance
For James, global self-esteem derived from the evaluations of one’s sense of competence or adequacy in the various domains of one’s life relative to how important it was to be successful in these domains. Thus, if one feels one is successful in domains deemed important, high self-esteem will result. Conversely, if one falls short of one’s goal in domains where one has aspirations to be successful, one will experience low self-esteem. One does not, therefore, have to be a superstar in every domain to have high self-esteem. Rather, one only needs to feel adequate or competent in those areas judged to be important. Thus, a child may evaluate himself or herself as unathletic; however, if athletic prowess is not an aspiration, then self-esteem will not be negatively affected. That is, the high self-esteem individual can discount the importance of areas in which he or she does not feel successful.
This analysis can be applied to the profiles of Child C and Child D. In fact, we have directly examined this explanation in research studies by asking children to rate how important it is for them to be successful (Harter 1999). The findings reveal that high self-esteem individuals feel competent in domains they rate as important. Low self-esteem individuals report that areas in which they are unsuccessful are still very important to them. Thus, Child C represents an example of an individual who feels that social acceptance, conduct, and appearance, domains in which she evaluates herself positively, are very important but that the two domains where she is less successful, scholastic competence and athletic competence, are not that important. In contrast, Child D rates all domains as important, including the two domains where he is not successful: scholastic competence and athletic competence. Thus, the discrepancy between high importance and perceived inadequacy contributes to low self-esteem.
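James’s formulation can be illustrated with a small sketch. The Python below is only a toy model, not the scoring procedure used in the research cited above: the 1 to 4 rating scale, the specific scores, the yes/no treatment of importance, and the function name competence_in_important_domains are all assumptions made for illustration. It averages perceived competence over only those domains a child rates as important, using hypothetical profiles patterned on Child C and Child D.

```python
from statistics import mean

# Hypothetical competence ratings on an assumed 1-4 scale (4 = most positive).
# Child C and Child D share the same domain-specific profile, as in the text.
COMPETENCE = {
    "scholastic": 1.5,
    "athletic": 1.5,
    "social": 3.5,
    "conduct": 3.5,
    "appearance": 3.5,
}

# Domains each child rates as important (assumed here to be a yes/no judgment).
IMPORTANT_TO_C = {"social", "conduct", "appearance"}
IMPORTANT_TO_D = set(COMPETENCE)  # Child D rates every domain as important


def competence_in_important_domains(competence, important):
    """Average perceived competence over the domains rated as important."""
    return mean(competence[domain] for domain in important)


child_c = competence_in_important_domains(COMPETENCE, IMPORTANT_TO_C)
child_d = competence_in_important_domains(COMPETENCE, IMPORTANT_TO_D)

print(f"Child C: {child_c:.2f}")
print(f"Child D: {child_d:.2f}")
```

Under these assumed numbers, Child C’s average is higher than Child D’s even though their domain ratings are identical, because Child C discounts the two domains in which she feels less successful.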
5.2 Incorporation of the Opinions of Significant Others
Another important factor influencing self-esteem can be derived from the writings of Cooley (1902) who metaphorically made reference to the ‘looking-glass self’ (see Oosterwegel and Oppenheimer 1993). According to this formulation, significant others (e.g., parents and peers) were social mirrors into which one gazed in order to determine what they thought of the self. Thus, in evaluating the self, one would adopt what one felt were the judgments of these others whose opinions were considered important. Thus, the approval, support, or positive regard from significant others became a critical source of one’s own sense of worth as a person. For example, children who receive approval from parents and peers will report much higher self-esteem than children who experience disapproval from parents and peers.
Findings reveal that both of these factors, competence in domains of importance and the perceived support of significant others, combine to influence a child’s or adolescent’s self-esteem. Thus, as can be observed in Fig. 1, those who feel competent in domains of importance and who also report high support, rate themselves as having the highest self-esteem. Those who feel inadequate in domains deemed important and who also report low levels of support, rate themselves as having the lowest self-esteem. Other combinations fall in between (data from Harter 1993).
Figure 1 Findings on how competence in domains of importance and social support combine to predict global self-esteem (vertical axis: self-esteem, scale shown from 2.00 to 3.75; horizontal axis: average competence in important domains, low, moderate, high; separate lines for high, moderate, and low support)
6. Conclusions
Two types of self-representations that can be observed in children and adolescents were distinguished: evaluative judgments of competence or adequacy in specific domains and the global evaluation of one’s worth as a person, namely overall self-esteem. Each of these undergoes developmental change based on age-related cognitive advances. In addition, older children and adolescents vary tremendously with regard to whether self-evaluations are positive or negative. Within a given individual, there will be a profile of self-evaluations, some of which are more positive and some more negative. More positive self-concepts in domains considered important, as well as approval from significant others, will lead to high self-esteem. Conversely, negative self-concepts in domains considered important, coupled with lack of approval from significant others, will result in low self-esteem.
Self-esteem is particularly important since it is associated with very important outcomes or consequences. Perhaps the most well-documented consequence of low self-esteem is depression. Children and adolescents (as well as adults) with the constellation of causes leading to low self-esteem will invariably report that they feel depressed, emotionally, and are hopeless about their futures; the most seriously depressed consider suicide. Thus, it is critical that we intervene for those experiencing low self-esteem. Our model of the causes of self-esteem suggests strategies that may be fruitful, for example, improving skills, helping individuals discount the importance of domains in which it is unlikely that they can improve, and providing support in the form of approval for who they are as people.
Future research, however, is necessary to determine the different pathways to low and high self-esteem. For example, for one child, the sense of inadequacy in particular domains may be the pathway to low self-esteem. For another child, lack of support from parents or peers may represent the primary cause. Future efforts should be directed to the identification of these different pathways since they have critical implications for intervention efforts to enhance feelings of worth for those children with low self-esteem (see also Self-concepts: Educational Aspects). Positive self-esteem is clearly a psychological commodity, a resource that is important for us to foster in our children and adolescents if we want them to lead productive and happy lives.
See also: Identity in Childhood and Adolescence; Personality Development in Childhood; Self-concepts: Educational Aspects; Self: History of the Concept; Self-knowledge: Philosophical Aspects; Self-regulation in Childhood
Bibliography
Cooley C H 1902 Human Nature and the Social Order. Charles Scribner’s Sons, New York
Damon W, Hart D 1988 Self-understanding in Childhood and Adolescence. Cambridge University Press, New York
Fischer K W 1980 A theory of cognitive development: the control and construction of hierarchies of skills. Psychological Review 87: 477–531
Harter S 1993 Causes and consequences of low self-esteem in children and adolescents. In: Baumeister R F (ed.) Self-esteem: The Puzzle of Low Self-regard. Plenum, New York
Harter S 1999 The Construction of the Self: A Developmental Perspective. Guilford Press, New York
James W 1892 Psychology: The Briefer Course. Henry Holt, New York
Oosterwegel A, Oppenheimer L 1993 The Self-system: Developmental Changes Between and Within Self-concepts. Erlbaum, Hillsdale, NJ
Piaget J 1960 The Psychology of Intelligence. Littlefield, Adams, Patterson, NJ
Piaget J 1963 The Origins of Intelligence in Children. Norton, New York
Rosenberg M 1979 Conceiving the Self. Basic Books, New York
Selman R L 1980 The Growth of Interpersonal Understanding. Academic Press, New York
S. Harter
Self-efficacy
Self-efficacy refers to the individual’s capacity to produce important effects. People who are aware of being able to make a difference feel good and therefore take initiatives; people who perceive themselves as helpless are unhappy and are not motivated for actions. This article treats the main concepts related to self-efficacy, their theoretical and historical contexts, their functions and practical uses, as well as developmental and educational/therapeutic aspects.
1. Concepts
Everything that happens is caused to happen (Aristotle). Making changes means being a cause or providing a cause that produces a change. As the true causes are difficult to identify, the terms conditions and contingencies are often used instead. An effect is contingent upon a condition or a set of conditions if it always occurs when the condition or the set of conditions is met. Such conditions are sufficient but not necessary for producing the effect. Here we are interested in human actions as necessary conditions of change (see Motivation and Actions, Psychology of). Actions, too, depend on conditions. Considering person-related conditions of effective actions, we can differentiate aspects like knowledge, initiative, perseverance, intelligence, experience, physical force, help from others, and more. Thus, instead of saying that an actor is able to produce a certain effect, we can say more elaborately that an actor is endowed with certain means or conditions that enable him or her to attain certain goals (Fig. 1).
We say that individuals (or groups) are in control of a specific goal if they are able to produce the corresponding changes (horizontal line of Fig. 1). More elaborately, they are in control of a specific goal if they are aware of the necessary contingencies and if they are competent enough to make these contingencies work (both diagonal lines in Fig. 1). Control is complete if these contingencies are necessary and sufficient; control is partial if the contingencies are necessary but not sufficient. Instead of control, Bandura (1977) introduced the word efficacy, more specifically self-efficacy. We use control and self-efficacy interchangeably.
Figure 1 Means–ends relations and agency as components of control (adapted from Skinner et al. 1988 and Flammer 1990)
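The definitions above can be summarized in a short sketch. The Python below is a hypothetical illustration, not a formalization given by Bandura, Skinner et al., or Flammer: the class Contingency, its fields, and the function control_level are invented names, and treating awareness and competence as yes/no values is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    """A means-ends relation known to the actor (hypothetical structure)."""
    necessary: bool   # the goal cannot be reached without this means
    sufficient: bool  # the means alone is enough to reach the goal


def control_level(aware: bool, competent: bool, contingency: Contingency) -> str:
    """Classify control over a goal following the definitions in the text."""
    # Being in control requires awareness of the contingency and the
    # competence to make it work.
    if not (aware and competent):
        return "none"
    if contingency.necessary and contingency.sufficient:
        return "complete"
    if contingency.necessary:
        return "partial"
    # The text does not classify contingencies that are not necessary.
    return "unclassified"


# Assumed example: studying is treated as necessary but not sufficient for passing.
studying = Contingency(necessary=True, sufficient=False)
print(control_level(True, True, studying))   # partial
print(control_level(True, False, studying))  # none
```

The sketch keeps the two components of Fig. 1 separate on purpose: knowing that a means leads to an end (the contingency) does not by itself give control unless the person is also able to use that means.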
Controlling means putting control into action. This is not equivalent to having control or being in control: people are in control of certain states of affairs if they can put control into action, even if they do not. For example, somebody has control over buying a new bicycle, even if he or she does not buy it. People can have control over certain events without knowing it; they will then probably miss possible chances to activate such control. On the other hand, people may believe that they have some control, but in fact do not. That may make them feel good as long as there is no need to put this control to work. Obviously, it is important that people not only have control, but also know that they have control.
Not being in control of an important situation is equivalent to being helpless in this respect (Seligman 1975). It has been shown that the psychological effects of helplessness (HL) differ depending on whether helpless persons believe themselves to be helpless forever (chronic HL), whether being helpless is unique to them (personal vs. universal HL), and whether helplessness is related to a specific domain or to most domains of life (specific vs. global HL). In the worst case, helpless people are (a) deeply sad about not having control, (b) demotivated to take initiatives or to invest effort and perseverance, (c) cognitively blind to any alternative or better view of the state of the world, and (d) inclined to devalue themselves.
Obviously, at least in subjectively important domains, we prefer self-efficacy to helplessness: self-efficacy beliefs provide us with security and pride. When we lack self-efficacy in important domains we either strive for self-efficacy (by fighting, learning, or training) or search for compensation.
A common type of compensation consists of seeking help or delegating personal control (= indirect control or proxy control), e.g., paying a gardener to care for one’s garden, putting a doctor in charge of one’s health, or praying to God for a favor in a seemingly hopeless situation. Another way of compensating for a lack of (primary) control is to use secondary control (Rothbaum et al. 1982). While control (i.e., primary control) consists of making the world fit with one’s goals and aspirations, secondary control accommodates personal aspirations or personal interpretations of the actual state in order to make them fit with the world (see Coping across the Lifespan).
2. History of the Concept of Self-efficacy
In the 1950s, Rotter (1954) suggested the concept locus of control, meaning the place where control of desired reinforcement for behavior is exerted. Internal control means control within the person; external control means control outside the person, possibly in powerful others, in objective external conditions, or in chance or luck. Rotter and his associates have developed valid measuring instruments that have been used in thousands (!) of studies demonstrating that internal locus of control is positively correlated with almost all desirable attributes of humans.
Fritz Heider (1944), who studied the subjective attributions for observed actions, had already suggested the concepts of internality and externality. The true origin, i.e., the ‘cause’ of an observed action, is either attributed to the person (internal, personal liability) or to person-independent conditions (external, no personal liability). Heider’s work has triggered a large research tradition on causal attributions. Results of this research helped to differentiate the subjective interpretations of experiences of helplessness (see above; see also Attributional Processes: Psychological). Consequently, attribution theory remains an important element of self-efficacy theory.
Modern self-efficacy theory goes beyond Rotter’s theory insofar as it is more differentiated (e.g., contingency vs. competence, primary vs. secondary), refers distinctly to specific domains of action (e.g., health, school), and is elaborated to include aspects other than personality (e.g., motivation, development).
3. Self-efficacy as an Important Element of a Happy and Successful Person
Individuals with high self-efficacy beliefs also report strong feelings of well-being and high self-esteem in general (Bandura 1997, Flammer 1990). They are willing to take initiative in related domains, to apply effort if needed, and to persevere in their efforts as long as they believe in their efficacy. Potentially stressful situations produce less subjective stress in highly self-efficacious individuals. However, while self-efficacy acts as a buffer against stress, it can also, indirectly, produce stress insofar as it can induce overly ambitious individuals to assume more responsibilities than they are able to cope with in sheer quantity. Moreover, self-efficacy has been reported to exert a positive influence on recovery from surgery or illness and on healthy lifestyles. It is not surprising that high self-efficacy beliefs enhance school success; likewise, school failure inhibits the related self-efficacy beliefs, again partly depending on the individual’s attributional patterns.
Interestingly, it has been demonstrated repeatedly and in several cultures that in most domains healthy and happy individuals tend to slightly overestimate themselves. Realistic estimation of self-efficacy is rather typical for persons vulnerable to depressed mood, and clear underestimation increases the chance for a clinical (reactive) depression. On the other hand, major overestimation might result in painful and harmful clashes with reality.
4. The Deelopment of Self-efficacy Beliefs Evidently, the newborn baby does not have selfefficacy beliefs in our sense. The basic structure of the self-efficacy beliefs develops within the first three or four years. According to Flammer’s (1995) analysis, the infant’s development towards the basic understanding of self-efficacy proceeds through a developmental sequence consisting of the acquisition of (a) the basic event schema (i.e., that classes of events happen), (b) the elementary causal schema (conditions, i.e., actions, events), (c) the understanding of personally producing effects, (d) the understanding of success and failure in aiming at nontrivial goals (visible as pride and as shame, respectively), and (e) the discovery of being not only the origin of one certain change but also capable of producing such changes. Obviously, this development proceeds in domains that are accessible by the infant so far. Later on, this development will have to be extended to further domains. As to the domain of school success, within the second half of the first decade of life, the child learns more and more differentiations of means towards the same ends. Thus, he or she gradually abandons a global concept of simply being or not being able and singles out—probably in this sequence—the factor effort (more effort is needed to solve tasks—a typical lesson to be learned early in school), the factors individual ability and task difficulty (higher difficulty requiring more ability), and finally the understanding of the compensatory relation between effort and ability (it is possible to reach the same goals by being less capable but more hard-working). In adolescence and early adulthood more lessons have to be learned. More and more domains become accessible to personal control due to increased cognitive, physical, or economic strength and social power. This is exciting, indeed. 
However, individuals constantly have to select from the choices that are offered to them (Flammer 1996). Trying to control everything results in overburdening. It is one thing to deselect control domains because they compete with higher-priority control domains; it is another to be forced to renounce control because no accessible contingencies seem to exist. As long as there are enough attractive alternatives available, this is not painful, but it can severely hurt handicapped individuals and old people when they lose control of important domains. Old people are well advised both not to resign too early and to search for compensations. Such compensations consist of artifacts of all kinds (from memory aids to hearing aids), but they also include the above-mentioned compensations such as indirect control (social resources) and secondary control. Indeed, it seems that the extent and the importance of secondary control increase over the lifetime (Heckhausen and Schulz 1995). Baltes and Silverberg (1994) have even suggested that, under certain conditions, people in old people’s homes adjust better if they give up personal control altogether in certain domains. Alluding to the concept of learned helplessness, they called such behavior learned dependency. Learned dependency helps to avoid certain social conflicts; the only remaining personal control may be the control of giving in.
5. Educational and Therapeutic Aspects Given the way the basic structure of self-efficacy develops, contingent behavior by the caregivers is crucial already within the first weeks of life. Caregivers’ behavior should be predictable, i.e., contingent at least upon the baby’s actual behavior, and as far as possible upon the baby’s perceptions, feelings, and intentions. This requires enormous sensitivity towards the child. Fortunately, researchers have demonstrated that parental empathy is partly a natural gift in the majority of attentive parents. Studies have shown that contingent behavior fosters children’s happiness, but also their willingness to learn and their curiosity. If caregivers are judged as not reacting contingently enough, we also have to consider that some babies show quite unorganized behavior and make the caregiver’s task very difficult (‘difficult babies’). In such cases it is difficult to decide whether the noncontingency has originated from the caregiver or from the baby. The subsequent steps in the development of self-efficacy require that caregivers provide freedom for experimentation, let the child try by himself or herself, and comment on the successes and failures in a way that lets the child establish and maintain confidence in his or her efficacy (Schneewind 1995). Nevertheless, caregivers should try to protect the child from dangerous experiences and from frequent hopeless ones. Psychotherapy with individuals who have severely undermined self-efficacy beliefs is difficult. Teaching them and trying to convince them that they are really capable even when they believe they are not does not help much. Helping them to recall prior success experiences from memory, instead of being impressed by failures only, is more efficient. Even more efficient are new and successful experiences. Helpless individuals not only interpret failures to their disadvantage, they also play down their contribution to eventual success. Parallel to these findings, memory research has demonstrated that depressed people’s memories of their own actions are biased towards recalling more of their failures than of their successes. This leaves us with an important contrast: while children and healthy adults tend to overestimate their self-efficacy, individuals who have lost confidence in themselves immunize such devastating beliefs by not trying anymore, by self-damaging attributions, and by recalling their biography in a way that is consistent with their beliefs. Given the pervasive influence of positive beliefs in self-efficacy, it is important to help individuals establish and maintain self-efficacy beliefs at a high level, and to guide failure-expecting persons to positive experiences.
6. Conclusion Within the last decades, theory and research have established self-efficacy beliefs as important elements in the understanding of human action and human well-being in a very large sense. However, little is known so far about differences in self-efficacy beliefs in different life domains and among different cultures. Further research should include more systematic comparisons between cultures, between life domains, and, if possible, between historical times. In addition, it is suggested that in the future investigators consider more seriously the fact that all changes are due to a multitude of necessary conditions. More specifically, there is a need for researchers to consider the efficacy and efficacy-beliefs of interacting people, that is, to examine concepts such as shared control or ‘common efficacy.’ See also: Control Behavior: Psychological Perspectives; Learned Helplessness; Motivation and Actions, Psychology of; Self-efficacy and Health; Self-efficacy: Educational Aspects; Self-regulation in Adulthood; Self-regulation in Childhood
Bibliography Baltes M M, Silverberg S B 1994 The dynamics between dependency and autonomy. In: Featherman D L, Lerner R M, Perlmutter M (eds.) Life-span Development and Behavior. Erlbaum, New York Bandura A 1977 Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review 84: 191–215 Bandura A 1997 The Exercise of Control. Freeman, New York Flammer A 1990 Erfahrung der eigenen Wirksamkeit [Experiencing One’s Own Efficacy]. Huber, Bern, Switzerland Flammer A 1995 Developmental analysis of control beliefs. In: Bandura A (ed.) Self-efficacy in Changing Societies. Cambridge University Press, New York Flammer A 1996 Entwicklungstheorien [Theories of Development]. Huber, Bern, Switzerland
Heckhausen J, Schulz R 1995 The life-span theory of control. Psychological Review 102: 284–304 Heider F 1944 Social perception and phenomenal causality. Psychological Review 51: 358–74 Rothbaum F, Weisz J R, Snyder S S 1982 Changing the world and changing the self: a two-process model of perceived control. Journal of Personality and Social Psychology 42: 5–37 Rotter J B 1954 Social Learning and Clinical Psychology. Prentice-Hall, Englewood Cliffs, NJ Schneewind K A 1995 Impact of family processes on control beliefs. In: Bandura A (ed.) Self-efficacy in Changing Societies. Cambridge University Press, New York Seligman M E P 1975 Helplessness. On Depression, Development and Death. Freeman, San Francisco Skinner E A, Chapman M, Baltes P B 1988 Control, means–ends, and agency beliefs: A new conceptualization and its measurement during childhood. Journal of Personality and Social Psychology 54: 117–33
A. Flammer
Self-efficacy and Health The quality of human health is heavily influenced by lifestyle habits. By exercising control over several health habits people can live longer and healthier and slow the process of aging (see Control Beliefs: Health Perspectives). These habits include exercising, reducing dietary fat, refraining from smoking, keeping blood pressure down, and developing effective ways of coping with stressors. If the huge health benefits of these few lifestyle habits were put into a pill it would be declared a spectacular breakthrough in the field of medicine. Recent years have witnessed a major change in the conception of human health and illness from a disease model to a health model. It is just as meaningful to speak of levels of vitality as of degrees of impairment. The health model, therefore, focuses on health promotion as well as disease prevention. Perceived self-efficacy plays a key role in the self-management of habits that enhance health and those that impair it.
1. Perceived Self-efficacy Perceived self-efficacy is concerned with people’s beliefs in their capabilities to exercise control over their own functioning and over environmental events. Such beliefs influence what courses of action people choose to pursue, the goals they set for themselves and their commitment to them, how much effort they put forth in given endeavors, how long they persevere in the face of obstacles and failure experiences, their resilience to adversity, whether their thought patterns are self-hindering or self-aiding, how much stress and depression they experience in coping with taxing
environmental demands, and the level of accomplishments they realize (Bandura 1997, Schwarzer 1992). In social cognitive theory, perceived self-efficacy operates in concert with other determinants in regulating lifestyle habits. They include the positive and negative outcomes people expect their actions to produce. These outcome expectations may take the form of aversive and pleasurable physical effects, approving and disapproving social reactions, or self-evaluative consequences expressed as self-satisfaction and self-censure. Personal goals, rooted in a value system, provide further self-incentives and guides for health habits. The perceived sociostructural facilitators and impediments operate as another set of determinants of health habits. Self-efficacy is a key determinant in the causal structure because it affects health behavior both directly and through its influence on these other determinants. The stronger the perceived efficacy, the higher the goal challenges people set for themselves, the more they expect their efforts to produce desired outcomes, and the more they view obstacles and impediments to personal change as surmountable. There are two major ways in which a sense of personal efficacy affects human health. At the more basic level, such beliefs activate biological systems that mediate health and disease. The second level is concerned with the exercise of direct control over habits that affect health and the rate of biological aging.
2. Impact of Efficacy Beliefs on Biological Systems Stress is an important contributor to many physical dysfunctions (O’Leary 1990). Perceived controllability appears to be the key organizing principle in explaining the biological effects of stress. Exposure to stressors with the ability to exercise some control over them has no adverse physical effects. But exposure to the same stressors without the ability to control them impairs immune function (Herbert and Cohen 1993b, Maier et al. 1985). Epidemiological and correlational studies indicate that lack of behavioral or perceived control over stressors increases susceptibility to bacterial and viral infections, contributes to the development of physical disorders, and accelerates the rate of progression of disease (Schneiderman et al. 1992). In social cognitive theory, stress reactions arise from perceived inefficacy to exercise control over aversive threats and taxing environmental demands (Bandura 1986). If people believe they can deal effectively with potential stressors, they are not perturbed by them. But, if they believe they cannot control aversive events, they distress themselves and impair their level of functioning. Perceived inefficacy to manage stressors activates autonomic, catecholamine, and opioid systems that modulate the immune system in ways that can increase susceptibility to illness (Bandura 1997, O’Leary 1990). The immunosuppressive effects of stressors are not the whole story, however. People are repeatedly bombarded with taxing demands and stressors in their daily lives. If stressors only impaired immune function, people would be highly vulnerable to infective agents that would leave them chronically bedridden with illnesses or quickly do them in. Most human stress is activated while competencies are being developed and expanded. Stress aroused while gaining a sense of mastery over aversive events strengthens components of the immune system (Wiedenfeld et al. 1990). The more rapid the growth of perceived coping efficacy, the greater the boost to the immune system. Immunoenhancement during development of coping capabilities vital to effective adaptation has evolutionary survival value. The field of health functioning has been heavily preoccupied with the physiologically debilitating effects of stressors. Self-efficacy theory also acknowledges the physiologically strengthening effects of mastery over stressors. As Dienstbier (1989) has shown, a growing number of studies provide empirical support for physiological toughening by successful coping. Depression is another affective pathway through which perceived coping efficacy can affect health functioning. Depression has been shown to reduce immune function and to heighten susceptibility to disease (Herbert and Cohen 1993a). The more severe the depression, the greater the reduction in immunity. Perceived inefficacy to exercise control over things one values highly produces bouts of depression (Bandura 1997). Social support reduces vulnerability to stress, depression, and physical illness. But social support is not a self-forming entity waiting around to buffer harried people against stressors. People have to go out and find, create, and maintain supportive relationships for themselves.
This requires a robust sense of social efficacy. Perceived social inefficacy contributes to depression both directly and by curtailing development of social supports (Holahan and Holahan 1987). Social support, in turn, enhances perceived self-efficacy. Mediational analyses show that social support alleviates depression and fosters health-promoting behavior only to the extent that it boosts personal efficacy.
3. Self-efficacy in Promoting Healthful Lifestyles Lifestyle habits can enhance or impair health (see Health Behaviors). This enables people to exert some behavioral control over their vitality and quality of health. Efficacy beliefs affect every phase of personal change: whether people even consider changing their health habits; whether they enlist the motivation and perseverance needed to succeed should they choose to do so; and how well they maintain the habit changes they have achieved (Bandura 1997).
3.1 Initiation of Change People’s beliefs that they can motivate themselves and regulate their own behavior play a crucial role in whether they even consider changing detrimental health habits. They see little point in even trying if they believe they do not have what it takes to succeed. If they make an attempt, they give up easily in the absence of quick results or in the face of setbacks. Among those who change detrimental health habits on their own, the successful ones have stronger perceived self-efficacy at the outset than nonchangers and subsequent relapsers. Efforts to get people to adopt healthful practices rely heavily on persuasive communications in health education campaigns. Health communications foster adoption of healthful practices mainly by raising beliefs in personal efficacy, rather than by transmitting information on how habits affect health, by arousing fear of disease, or by increasing perception of one’s personal vulnerability or risk (Meyerowitz and Chaiken 1987). Helping people reduce health-impairing habits requires a change in emphasis, from trying to scare people into health to equipping them with the skills and self-beliefs needed to exercise control over their health habits. In community-wide health campaigns, people’s pre-existing beliefs that they can exercise control over their health habits, and the efficacy beliefs enhanced by the campaign, both contribute to health-promoting habits (Maibach et al. 1991).
3.2 Adoption of Change Effective self-regulation of health behavior is not achieved through an act of will. It requires development of self-regulatory skills. To build their sense of efficacy, people must develop skills in influencing their own motivation and behavior. In programs that teach these skills, they learn how to monitor their health behavior and the social and cognitive conditions under which they engage in it; set attainable subgoals to motivate and guide their efforts; draw from an array of coping strategies rather than rely on a single technique; enlist self-motivating incentives and social supports to sustain the effort needed to succeed; and apply multiple self-influence consistently and persistently (Bandura 1997, Perri 1985). Once equipped with skills and belief in their self-regulatory capabilities, people are better able to adopt behaviors that promote health, and to eliminate those that impair it. A large body of evidence reveals that the self-efficacy belief system operates as a common mechanism through which psychosocial treatments affect different types of health outcomes (Bandura 1997, Holden 1991).
3.3 Maintenance of Change It is one thing to get people to adopt beneficial health habits. It is another thing to get them to adhere to them. Maintenance of habit change relies heavily on self-regulatory capabilities and the functional value of the behavior. Development of self-regulatory capabilities requires instilling a resilient sense of efficacy as well as imparting skills. Experiences in exercising control over troublesome situations serve as efficacy builders. Efficacy affirmation trials are an important aspect of self-management because, if people are not fully convinced of their personal efficacy, they rapidly abandon the skills they have been taught when they fail to get quick results or suffer reverses. Like any other activity, self-management of health habits includes improvement, setbacks, plateaus, and recoveries. Studies of refractory detrimental habits show that a low sense of efficacy increases vulnerability to relapse (Bandura 1997, Marlatt et al. 1995). To strengthen resilience, people need to develop coping strategies not only to manage common precipitants of breakdown, but to reinstate control after setbacks. This involves training in how to manage failure (see Health: Self-regulation).
4. Self-management Health Systems Healthcare expenditures are soaring at a rapid rate. With people living longer and the need for healthcare services rising with age, societies are confronted with major challenges on how to keep people healthy throughout their lifespan, otherwise they will be swamped with burgeoning health costs. Health systems generally focus heavily on the supply side with the aim of reducing, rationing, and curtailing access to health to contain health costs. The social cognitive approach works on the demand side by helping people to stay healthy through good self-management of health habits. This requires intensifying health promotion efforts and restructuring health delivery systems to make them more productive. Efficacy-based models have been devised combining knowledge of self-regulation of health habits with computer-assisted implementation that provides effective health-promoting services in ways that are individualized, intensive and highly convenient (DeBusk et al. 1994). In this type of self-management system, people monitor their health habits. They set short-term goals for themselves and receive periodic feedback of progress towards their goals along with guides on how to manage troublesome situations.
Efficacy ratings identify areas in which self-regulatory skills must be developed and strengthened if beneficial changes are to be achieved and maintained. The productivity of the system is vastly expanded by combining self-regulatory principles with the power of computer-assisted implementation. A single implementer, assisted by a computerized coordinating and mailing system, provides intensive individualized training in self-management for large numbers of people simultaneously. The self-management system reduces health risk factors, improves health status, and enhances the quality of life in cost-effective ways (Bandura 1997). The self-management system is well received by participants because it is individually tailored to their needs; provides continuing personalized guidance and informative feedback that enables them to exercise considerable control over their own change; is a home-based program that does not require any special facilities, equipment, or attendance at group meetings that usually have high dropout rates; serves large numbers of people simultaneously under the guidance of a single implementer; is not constrained by time and place; and provides valuable health-promotion services at low cost. By combining the high individualization of the clinical approach with the large-scale applicability of the public health approach, the self-management system includes the features that ensure high social utility. Linking the interactive aspects of the self-management model to the Internet can vastly expand its availability for preventive and promotive guidance. Chronic disease has become the dominant form of illness and the major cause of disability. The treatment of chronic disease must focus on self-management of physical conditions over time rather than on cure. This requires, among other things, pain amelioration, enhancement and maintenance of functioning with growing physical disability, and development of self-regulative compensatory skills.
Holman and Lorig (1992) have devised a prototypical model for the self-management of different types of chronic diseases. Patients are taught cognitive and behavioral pain control techniques; proximal goal setting combined with self-incentives as motivators to increase levels of activity; problem solving and self-diagnostic skills; and the ability to locate community resources and to manage medication programs. How healthcare systems deal with clients can alter their efficacy in ways that support or undermine their restorative efforts. Clients are, therefore, taught how to take greater initiative for their healthcare in dealings with health personnel. These skills are developed through modeling of self-management skills, guided mastery practice, and enabling feedback. The self-management program retards the biological progression of the disease, raises perceived self-regulatory efficacy, reduces pain and distress, fosters better cognitive symptom management, lessens the impairment of role functions, improves the quality of life, and decreases the use of medical services. Both the perceived self-efficacy at the outset and the efficacy beliefs instilled by the self-management program predict later health status and functioning (Holman and Lorig 1992).
5. Childhood Health Promotion Models Many of the lifelong habits that jeopardize health are formed during childhood and adolescence (see Childhood Health). For example, unless youngsters take up the smoking habit as teenagers they rarely become smokers in adulthood. It is easier to prevent detrimental health habits than to try to change them after they have become deeply entrenched as part of a lifestyle. The biopsychosocial model provides a valuable public health tool for societal efforts to promote the health of its youth. Health habits are rooted in familial practices. But schools have a vital role to play in promoting the health of a nation. This is the only place where all children can be easily reached, so it provides a natural setting for promoting healthful habits and building self-management skills. Effective health promotion models include several major components. The first component is informational. It informs people of the health risks and benefits of different lifestyle habits. The second component develops the social and self-regulative skills for translating informed concerns into effective preventive action. As noted earlier, this includes self-monitoring of health practices, goal setting, and enlistment of self-incentives for personal change. The third component builds a resilient sense of self-regulatory efficacy to support the exercise of control in the face of difficulties that inevitably arise. Personal change occurs within a network of social influences. Depending on their nature, social influences can aid, retard, or undermine efforts at personal change. The final component, therefore, enlists and creates social supports for desired changes in health habits (see Social Support and Health). Educational efforts to promote the health of youth usually produce weak results. They are heavy on didactics but meager on personal enablement.
They provide factual information about health, but do little to equip children with the skills and self-beliefs that enable them to manage the emotional and social pressures to adopt detrimental health habits. Managing health habits involves managing emotional states and diverse social pressures for unhealthy behavior, not just targeting a specific health behavior for change. Health promotion programs that encompass the essential elements of the self-regulatory model prevent or reduce injurious health habits. Health knowledge can be conveyed readily, but changes in values,
attitudes, and health habits require greater effort. The more behavioral mastery experiences provided in the form of role enactments, the greater the beneficial changes (Bruvold 1993). The more intensive the program and the better the implementation, the stronger the impact (Connell et al. 1985). Comprehensive approaches that integrate guided mastery health programs with family and community efforts are more successful in promoting health and preventing adoption of detrimental health habits, than are programs in which the schools try to do it alone (Perry et al. 1992).
6. Efficacy Beliefs in Prognostic Judgment and Health Outcomes Much of the work in the health field is concerned with diagnosing maladies, forecasting the likely course of different physical disorders, and prescribing appropriate remedies. Medical prognostic judgments involve probabilistic inferences from knowledge of varying quality and inclusiveness about the multiple factors governing the course of a given disorder. One important issue regarding medical prognosis concerns the scope of determinants included in a prognostic model. Because psychosocial factors account for some of the variability in the course of health functioning, inclusion of self-efficacy determinants in prognostic models enhances their predictive power (Bandura 2000). Recovery from medical conditions is partly governed by social factors. Recovery from a heart attack provides one example. About half the patients who experience heart attacks have uncomplicated ones. Their hearts heal rapidly, and they are physically capable of resuming an active life. But the psychological and physical recovery is slow for those patients who believe they have an impaired heart. The recovery task is to convince patients that they have a sufficiently robust heart to resume productive lives. Spouses’ judgments of patients’ physical and cardiac capabilities can aid or retard the recovery process (see Social Support and Recovery from Disease and Medical Procedures). Programs that raise and strengthen spouses’ and patients’ beliefs in their cardiac capabilities enhance recovery of cardiovascular capacity (Taylor et al. 1985). The couple’s joint belief in the patient’s cardiac efficacy is the best predictor of improvement in cardiac functioning. Those who believe that their partners have a robust heart are more likely to encourage them to resume an active life than those who believe their partner’s heart is impaired and vulnerable to further damage. Pursuit of an active life strengthens the cardiovascular system.
Prognostic judgments are not simply inert forecasts of a natural course of a disease. Prognostic expectations can affect patients’ beliefs in their physical efficacy. Therefore, diagnosticians not only foretell,
but may partly influence the course of recovery from disease. Prognostic expectations are conveyed to patients by attitude, word, and the type and level of care provided them. People are more likely to be treated in enabling ways under positive than under negative expectations. Differential care that promotes in patients different levels of personal efficacy and skill in managing health-related behavior can exert stronger impact on the trajectories of health functioning than simply conveying prognostic information. Prognostic judgments have a self-confirming potential. Expectations can alter patients’ sense of efficacy and behavior in ways that confirm the original expectations. The self-efficacy mechanism operates as one important mediator of self-confirming effects.
7. Socially Oriented Approaches to Health The quality of health of a nation is a social matter not just a personal one. It requires changing the practices of social systems that impair health rather than just changing the habits of individuals. Vast sums of money are spent annually in advertising and marketing products and promoting lifestyles detrimental to health. With regard to injurious environmental conditions, some industrial and agricultural practices inject carcinogens and harmful pollutants into the air we breathe, the food we eat, and the water we drink, all of which take a heavy toll on health. Vigorous economic and political battles are fought over environmental health and where to set the limits of acceptable risk. We do not lack sound policy prescriptions in the field of health. What is lacking is the collective efficacy to realize them. People’s beliefs in their collective efficacy to accomplish social change by perseverant group action play a key role in the policy and public health approach to health promotion and disease prevention (Bandura 1997, Wallack et al. 1993). Such social efforts take a variety of forms. They raise public awareness of health hazards, educate and influence policymakers, devise effective strategies for improving health conditions, and mobilize public support to enact policy initiatives. While concerted efforts are made to change sociostructural practices, people need to improve their current life circumstances over which they command some control. Psychosocial models that work best in improving health and preventing disease promote community self-help through collective enablement (McAlister et al. 1991). Given that health is heavily influenced by behavioral, environmental, and economic factors, health promotion requires greater emphasis on the development and enlistment of collective efficacy for socially oriented initiatives. 
See also: Control Beliefs: Health Perspectives; Health Behavior: Psychosocial Theories; Health Behaviors;
Health Education and Health Promotion; Health: Self-regulation; Self-efficacy; Self-efficacy: Educational Aspects
Bibliography Bandura A 1986 Social Foundations of Thought and Action: A Social Cognitive Theory. Prentice-Hall, Englewood Cliffs, NJ Bandura A 1997 Self-efficacy: The Exercise of Control. W H Freeman, New York Bandura A 2000 Psychological aspects of prognostic judgments. In: Evans R W, Baskin D S, Yatsu F M (eds.) Prognosis of Neurological Disorders, 2nd edn. Oxford University Press, New York, pp. 11–27 Bruvold W H 1993 A meta-analysis of adolescent smoking prevention programs. American Journal of Public Health 83: 872–80 Connell D B, Turner R R, Mason E F 1985 Summary of findings of the school health education evaluation: Health promotion effectiveness, implementation, and costs. Journal of School Health 55: 316–21 DeBusk R F et al. 1994 A case-management system for coronary risk factor modification after acute myocardial infarction. Annals of Internal Medicine 120: 721–9 Dienstbier R A 1989 Arousal and physiological toughness: Implications for mental and physical health. Psychological Review 96: 84–100 Herbert T B, Cohen S 1993a Depression and immunity: A meta-analytic review. Psychological Bulletin 113: 472–86 Herbert T B, Cohen S 1993b Stress and immunity in humans: a meta-analytic review. Psychosomatic Medicine 55: 364–79 Holahan C K, Holahan C J 1987 Self-efficacy, social support, and depression in aging: a longitudinal analysis. Journal of Gerontology 42: 65–8 Holden G 1991 The relationship of self-efficacy appraisals to subsequent health related outcomes: A meta-analysis. Social Work in Health Care 16: 53–93 Holman H, Lorig K 1992 Perceived self-efficacy in self-management of chronic disease. In: Schwarzer R (ed.) Self-efficacy: Thought Control of Action. Hemisphere Pub. Corp, Washington, DC, pp. 305–23 Maibach E, Flora J, Nass C 1991 Changes in self-efficacy and health behavior in response to a minimal contact community health campaign.
Health Communication 3: 1–15 Maier S F, Laudenslager M L, Ryan S M 1985 Stressor controllability, immune function, and endogenous opiates. In: Brush F R, Overmier J B (eds.) Affect, Conditioning, and Cognition: Essays on the Determinants of Behaior. Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 183–201 Marlatt G A, Baer J S, Quigley L A 1995 Self-efficacy and addictive behavior. In: Bandura A (ed.) Self-efficacy in Changing Societies. Cambridge University Press, New York, pp. 289–315 McAlister A L, Puska P, Orlandi M, Bye L L, Zbylot P 1991 Behaviour modification: Principles and illustrations. In: Holland W W, Detels R, Knox G (eds.) Oxford Textbook of Public Health: Applications in Public Health, 2nd edn. Oxford University Press, Oxford, UK, pp. 3–16 Meyerowitz B E, Chaiken S 1987 The effect of message framing on breast self-examination attitudes intentions and behavior. Journal of Personality and Social Psychology 52: 500–10
O’Leary A 1990 Stress, emotion, and human immune function. Psychological Bulletin 108: 363–82
Perri M G 1985 Self-change strategies for the control of smoking, obesity, and problem drinking. In: Shiffman S, Wills T A (eds.) Coping and Substance Use. Academic Press, Orlando, FL, pp. 295–317
Perry C L, Kelder S H, Murray D M, Klepp K 1992 Community-wide smoking prevention: Long-term outcomes of the Minnesota heart health program and the class of 1989 study. American Journal of Public Health 82: 1210–6
Schneiderman N, McCabe P M, Baum A (eds.) 1992 Stress and Disease Processes: Perspectives in Behavioral Medicine. Erlbaum, Hillsdale, NJ
Schwarzer R (ed.) 1992 Self-efficacy: Thought Control of Action. Hemisphere Pub. Corp, Washington, DC
Taylor C B, Bandura A, Ewart C K, Miller N H, Debusk R F 1985 Exercise testing to enhance wives’ confidence in their husbands’ cardiac capabilities soon after clinically uncomplicated acute myocardial infarction. American Journal of Cardiology 55: 635–8
Wallack L, Dorfman L, Jernigan D, Themba M 1993 Media Advocacy and Public Health: Power for Prevention. Sage, Newbury Park, CA
Wiedenfeld S A, Bandura A, Levine S, O’Leary A, Brown S, Raska K 1990 Impact of perceived self-efficacy in coping with stressors on components of the immune system. Journal of Personality and Social Psychology 59: 1082–94
A. Bandura
Self-efficacy: Educational Aspects
Current theoretical accounts of learning and instruction postulate that students are active seekers and processors of information (Pintrich et al. 1986). Research indicates that students’ cognitions influence the instigation, direction, strength, and persistence of achievement behaviors (Schunk 1995). This article reviews the role of one type of personal cognition: self-efficacy, or one’s perceived capabilities for learning or performing behaviors at designated levels (Bandura 1997). The role of self-efficacy in educational contexts is discussed, including the cues that students use to appraise their self-efficacy. A model of the operation of self-efficacy is explained, along with some key findings from educational research. The entry concludes by describing the role of teacher efficacy and suggesting future research directions.
1. Self-efficacy Theory
Self-efficacy can affect choice of activities, effort, persistence, and achievement (Bandura 1997, Schunk 1991). Compared with students who doubt their learning capabilities, those with high self-efficacy for accomplishing a task participate more readily, work
harder, persist longer when they encounter difficulties, and demonstrate higher achievement.
Learners acquire information to appraise self-efficacy from their performance accomplishments, vicarious (observational) experiences, forms of persuasion, and physiological reactions. Students’ own performances offer reliable guides for assessing efficacy. Successes raise self-efficacy and failures lower it, but once a strong sense of self-efficacy is developed a failure may not have much impact (Bandura 1986).
Learners also acquire self-efficacy information from knowledge of others through classroom social comparisons. Similar others offer the best basis for comparison. Students who observe similar peers perform a task are apt to believe that they, too, are capable of accomplishing it. Information acquired vicariously typically has a weaker effect on self-efficacy than performance-based information because the former can be negated easily by subsequent failures.
Students often receive persuasive information from teachers and parents that they are capable of performing a task (e.g., ‘You can do this’). Positive feedback enhances self-efficacy, but this increase will be temporary if subsequent efforts turn out poorly. Students also acquire efficacy information from physiological reactions (e.g., heart rate, sweating). Symptoms signaling anxiety might be interpreted to mean that one lacks skills.
Information acquired from these sources does not automatically influence self-efficacy; rather, it is cognitively appraised (Bandura 1986). In appraising efficacy, learners weigh and combine perceptions of their ability, the difficulty of the task, the amount of effort expended, the amount of external assistance received, the number and pattern of successes and failures, similarity to models, and credibility of persuaders (Schunk 1991).
Self-efficacy is not the only influence in educational settings.
Achievement behavior also depends on knowledge and skills, outcome expectations, and the perceived value of outcomes (Schunk 1991). High self-efficacy does not produce competent performances when requisite knowledge and skills are lacking. Outcome expectations, or beliefs concerning the probable outcomes of actions, are important because students strive for positive outcomes. Perceived value of outcomes refers to how much learners desire certain outcomes relative to others. Learners are motivated to act in ways that they believe will result in outcomes they value.
Self-efficacy is dynamic and changes as learning occurs. The hypothesized process whereby self-efficacy operates during learning is as follows (Schunk 1996). Students enter learning situations with varying degrees of self-efficacy for learning. They also have goals in mind, such as learning the material, working quickly, pleasing the teacher, and making a high grade. As they engage in the task, they receive cues about how well they are performing, and they use these cues to assess
their learning progress and their self-efficacy for continued learning. Perceived progress sustains motivation and leads to continued learning. Perceptions of little progress do not necessarily diminish self-efficacy if learners believe they know how to perform better, such as by working harder, seeking help, or switching to a more effective strategy (Schunk 1996).
2. Factors Affecting Self-efficacy
There are many instructional, social, and environmental factors that operate during learning. Several of these factors have been investigated to determine how they influence learners’ self-efficacy. For example, research has explored the roles of goal setting, social modeling, rewards, attributional feedback, social comparisons, progress monitoring, opportunities for self-evaluation of progress, progress feedback, and strategy instruction (Schunk 1995).
As originally conceptualized by Bandura, self-efficacy is a domain-specific construct. Self-efficacy research in education has tended to follow this guidance and assess students’ self-efficacy within domains at the level of individual tasks. In mathematics, for example, students may be shown sample multiplication problems and for each sample judge their confidence for solving similar problems correctly. Efficacy scales typically are numerical and range from low to high confidence. After completing the efficacy assessment, students are presented with actual problems to solve. These achievement test problems correspond closely to those on the self-efficacy test, although they are not identical. Such specificity allows researchers to relate self-efficacy to achievement to determine correspondence and prediction (Pajares 1996). Other measures often collected by self-efficacy researchers include persistence, motivation, and self-regulation strategies. Following a pretest, students receive instruction complemented by one or more of the preceding educational variables. After the instruction is completed, students receive a post-test; in some studies follow-up maintenance testing is done.
A general finding from much educational self-efficacy research is that educational variables influence self-efficacy to the extent that they convey to learners information about their progress in learning (Schunk 1995).
For example, much research shows that specific proximal goals raise self-efficacy, motivation, and achievement better than do general goals (Schunk 1995). Short-term specific goals provide a clear standard against which to compare learning progress. As learners determine that they are making progress, this enhances their self-efficacy for continued learning. In contrast, assessing progress against a general goal (e.g., ‘Do your best’) is difficult; thus, learners receive less clear information about progress, and self-efficacy is not strengthened as well.
3. Predictive Utility of Self-efficacy
Self-efficacy research has examined the relation of self-efficacy to such educational outcomes as motivation, persistence, and achievement (Pajares 1996). Significant and positive correlations have been obtained across many studies between self-efficacy assessed prior to instruction and subsequent motivation during instruction. Initial judgments of self-efficacy have been found also to correlate positively and significantly with post-test measures of self-efficacy and achievement collected following instruction.
Multiple regression has been used to determine the percentage of variability in skillful performance accounted for by self-efficacy. Schunk and Swartz (1993) found that post-test self-efficacy was the strongest predictor of children’s paragraph writing skills. Shell et al. (1989) found that although self-efficacy and outcome expectations predicted reading and writing achievement, self-efficacy was the strongest predictor.
Several studies have tested causal models. Schunk (1981) employed path analysis to reproduce the correlation matrix comprising long-division instructional method, self-efficacy, persistence, and achievement. The best model showed a direct effect of method on achievement and an indirect effect through persistence and self-efficacy, an indirect effect of method on persistence through self-efficacy, and a direct effect of self-efficacy on persistence and achievement. Schunk and Gunn (1986) found that the largest direct influence on achievement was due to use of effective learning strategies; achievement also was heavily influenced by self-efficacy.
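To make concrete what ‘percentage of variability accounted for’ means, the following minimal Python sketch computes the squared correlation for a single predictor. It is an illustration only: the scores are invented and are not data from any study cited here, the function name `variance_explained` is hypothetical, and the one-predictor case is a simplification of the multiple regression models used in the research described above.

```python
def variance_explained(predictor, outcome):
    """Proportion of variance in `outcome` accounted for by a
    one-predictor linear regression on `predictor` (the squared
    Pearson correlation). Hypothetical helper for illustration."""
    if len(predictor) != len(outcome) or len(predictor) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(predictor) / len(predictor)
    mean_y = sum(outcome) / len(outcome)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    syy = sum((y - mean_y) ** 2 for y in outcome)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for the proportion to be defined")
    return (sxy * sxy) / (sxx * syy)


# Invented scores: self-efficacy ratings and achievement test results
# for five hypothetical students.
self_efficacy = [1, 2, 3, 4, 5]
achievement = [2, 4, 5, 4, 5]

share = variance_explained(self_efficacy, achievement)
print(f"Self-efficacy accounts for {share:.0%} of achievement variance")
```

With several predictors, as in the studies cited, the analogous quantity is the model's overall proportion of explained variance, and the contribution of self-efficacy is judged relative to the other predictors.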
4. Teacher Self-efficacy
Self-efficacy is applicable to teachers as well as students. Ashton and Webb (1986) postulated that self-efficacy should influence teachers’ activities, efforts, and persistence. Teachers with low self-efficacy may avoid planning activities that they believe exceed their capabilities, may not persist with students having difficulties, may expend little effort to find materials, and may not reteach content in ways students might better understand. Teachers with higher self-efficacy might develop challenging activities, help students succeed, and persevere with students who have trouble learning. These motivational effects enhance student learning and substantiate teachers’ self-efficacy by suggesting that they can help students learn.
Correlational data show that self-efficacy is related to teaching behavior. Ashton and Webb (1986) found that teachers with higher self-efficacy were likely to have a positive classroom environment (e.g., less student anxiety and teacher criticism), support students’ ideas, and meet the needs of all students. High teacher self-efficacy was positively associated with use of praise, individual attention to students, checking on
students’ progress in learning, and students’ mathematical and language achievement. Tschannen-Moran et al. (1998) discuss teacher self-efficacy in greater depth.
5. Future Research Directions
The proliferation of self-efficacy research in education has deepened understanding of the construct but also has resulted in a multitude of measures. In developing efficacy assessments it is imperative that researchers attempt to be faithful to Bandura’s (1986) conceptualization of self-efficacy as a domain-specific construct. Research will benefit from researchers publishing their instruments along with validation data, including evidence of reliability and validity.
As self-efficacy research continues in settings where learning occurs, it will be necessary to collect longitudinal data showing how self-efficacy changes over time as a consequence of learning. This focus will require broadening of self-efficacy assessments from reliance on numerical scales to qualitative data. Researchers also should relate measures of teacher self-efficacy to those of student self-efficacy to test the idea that these variables reciprocally influence one another.
Finally, research is needed on the role of self-efficacy during self-regulation. Self-regulation refers to self-generated thoughts and actions that are systematically oriented toward attainment of one’s learning goals (Zimmerman 1990). Self-efficacy has the potential to influence many aspects of self-regulation, yet to date only a few areas have been explored in research. This focus will become more critical as self-regulation assumes an increasingly important role in education.
Bibliography
Ashton P T, Webb R B 1986 Making a Difference: Teachers’ Sense of Efficacy and Student Achievement. Longman, New York
Bandura A 1986 Social Foundations of Thought and Action: A Social Cognitive Theory. Prentice-Hall, Englewood Cliffs, NJ
Bandura A 1997 Self-efficacy: The Exercise of Control. Freeman, New York
Pajares F 1996 Self-efficacy beliefs in academic settings. Review of Educational Research 66: 543–78
Pintrich P R, Cross D R, Kozma R B, McKeachie W J 1986 Instructional psychology. Annual Review of Psychology 37: 611–51
Schunk D H 1981 Modeling and attributional effects on children’s achievement: A self-efficacy analysis. Journal of Educational Psychology 73: 93–105
Schunk D H 1991 Self-efficacy and academic motivation. Educational Psychologist 26: 207–31
Schunk D H 1995 Self-efficacy and education and instruction. In: Maddux J E (ed.) Self-efficacy, Adaptation, and Adjustment: Theory, Research, and Application. Plenum, New York, pp. 281–303
Schunk D H 1996 Goal and self-evaluative influences during children’s cognitive skill learning. American Educational Research Journal 33: 359–82
Schunk D H, Gunn T P 1986 Self-efficacy and skill development: Influence of task strategies and attributions. Journal of Educational Research 79: 238–44
Schunk D H, Swartz C W 1993 Goals and progress feedback: Effects on self-efficacy and writing achievement. Contemporary Educational Psychology 18: 337–54
Shell D F, Murphy C C, Bruning R H 1989 Self-efficacy and outcome expectancy mechanisms in reading and writing achievement. Journal of Educational Psychology 81: 91–100
Tschannen-Moran M, Hoy A W, Hoy W K 1998 Teacher efficacy: Its meaning and measure. Review of Educational Research 68: 202–48
Zimmerman B J 1990 Self-regulating academic learning and achievement: The emergence of a social cognitive perspective. Educational Psychology Review 2: 173–201
D. H. Schunk
Self-esteem in Adulthood
Self-esteem refers to a global judgment of the worth or value of the self as a whole (similar to self-regard, self-respect, and self-acceptance), or to evaluations of specific aspects of the self (e.g., appearance self-esteem or academic self-esteem). The focus of this article is on global self-esteem, which has distinct theoretical importance and consequences (Baumeister 1998, Rosenberg et al. 1995). Although thousands of studies of self-esteem have been published (Mruk 1995), many central questions about the nature, functioning, and importance of global self-esteem remain unresolved (Baumeister 1998).
1. The Importance of Self-esteem
Global self-esteem is a central aspect of the subjective quality of life. It is related strongly to positive affect and life satisfaction (Diener 1984), less anxiety (Solomon et al. 1991), and fewer depressive symptoms (Crandall 1973). High and low levels of self-esteem appear to be associated with different motivational orientations. High self-esteem people focus on self-enhancement and ‘being all that they can be,’ whereas low self-esteem people focus on self-protection and avoiding failure and humiliation (Baumeister et al. 1989).
Low self-esteem has been proposed as a cause of many social problems, such as teenage pregnancy, aggression, eating disorders, and poor school achievement. However, evidence that low self-esteem is a cause, rather than a symptom, of these problems is scarce (Baumeister 1998, Dawes 1994, Mecca et al. 1989), and some researchers have suggested that high
self-esteem may actually be the cause of social problems such as aggression (Baumeister 1998). Because of unresolved issues regarding the nature and functioning of self-esteem and how to measure it, firm conclusions about whether high self-esteem is or is not socially useful seem premature.
2. Issues in the Self-esteem Literature
2.1 Trait or State?
Psychologists typically assume that self-esteem is a psychological trait (i.e., that it is stable over time and across situations). In support of this view, self-esteem tends to be highly stable over long periods of time (see, e.g., Rosenberg 1979). Self-esteem is also a state, however, changing in response to events and experiences in the course of life (Heatherton and Polivy 1991). James (1890) suggested that self-esteem has qualities of both a state and a trait. On the one hand, it rises and falls in response to achievements and setbacks relevant to one’s aspirations. On the other hand, he recognized that people tend to have average levels of self-esteem that are not linked directly to their objective circumstances. Research supports James’ intuition, although people differ: for some, self-esteem is relatively stable and trait-like across time, whereas for others it is more state-like, fluctuating daily (Kernis and Waschull 1995).
2.2 Affect or Cognitive Judgment?
Researchers disagree about whether self-esteem is fundamentally a feeling or a judgment about the self. Global self-esteem and mood are correlated strongly, leading some to conclude that affect is a component of self-esteem (e.g., Brown 1993, Pelham and Swann 1989). Others argue that self-esteem is a cognitive judgment, based on standards of worth and accessible information about how well an individual is meeting those standards (see, e.g., Moretti and Higgins 1990). Current mood may be one source of information on which judgments of self-worth are based (Schwarz and Strack 1999). It seems likely that the relationship between self-esteem and mood is complex: both mood and self-esteem may be affected independently by life events; mood may be a source of information on which judgments of self-esteem are based; and mood may also be a consequence of having high or low self-esteem.
2.3 Where Does Self-esteem Come From?
Why are some people high and others low in self-esteem? James (1890) suggested that global self-esteem
is determined by successes divided by pretensions, or how well a person is doing in areas that are personally important. Although it might seem logical that high self-esteem results from success in life (e.g., being smart, attractive, wealthy, and popular), these objective outcomes are related only weakly to self-esteem. For example, socioeconomic status (Twenge and Campbell 1999), physical attractiveness as rated by observers (Diener et al. 1995, Feingold 1992), obesity (Miller and Downey 1999), school achievement (Rosenberg et al. 1995), and popularity (Wylie 1979) are related only weakly to global self-esteem. A stronger relationship is observed between global self-esteem and how well people believe they are doing in important domains, but the direction of this relationship is unclear. Believing one is doing well in important domains might cause high self-esteem, or people may think they are doing well because they have high self-esteem. Experimental studies have demonstrated that specific self-evaluations are sensitive to manipulated success or failure, but evidence that global self-esteem responds to such feedback is very scarce (Blascovich and Tomaka 1991).
Mead (1934) and Cooley (1902) proposed that self-esteem develops in social relationships. Cooley (1902) argued that subjectively interpreted feedback from others is a main source of information about the self. The self-concept arises from imagining how others perceive and evaluate the self. These ‘reflected appraisals’ affect self-perceptions and self-evaluations, resulting in what Cooley (1902) described as the ‘looking glass self.’ Mead (1934) argued that the looking glass self is a product of, and essential to, social interaction.
To interact smoothly and effectively with others, people need to anticipate how others will react to them, and so they need to learn to see themselves through the eyes of others, either the specific people with whom they interact or a generalized view of how most people see the self, that is, a ‘generalized other.’ Research indicates that self-esteem is related only weakly to others’ evaluations of the self, but is related strongly to beliefs about others’ evaluations (Shrauger and Schoeneman 1979). Evidence regarding the causal direction of this effect is scarce.
A third view, which can encompass these others, is that self-esteem is a judgment of self-worth constructed on the basis of information and standards for the self that are available and accessible at the moment. People may differ in the self-standards that are chronically accessible to them (see, e.g., Higgins 1987). For example, some people may judge their self-worth chronically according to whether they are competent in important domains, whereas others judge their self-worth chronically according to whether others approve of them (Crocker and Wolfe 2001). In this view, self-esteem will be stable over time if the standards used to evaluate the self and information about how well an individual is doing relative to those standards are stable. When circumstances make alternative standards salient, or alter beliefs about how well the
individual is doing relative to those standards, then self-esteem changes (Crocker 1999, Quinn and Crocker 1999).
2.4 Defensive or Genuine?
Genuine self-esteem is usually assumed to be synonymous with self-worth, self-respect, and self-acceptance, despite awareness of one’s flaws and shortcomings. Yet, many studies have demonstrated that people who are high in self-esteem are more likely to make excuses for failure, derogate others when threatened, and have unrealistically positive views of themselves (see Baumeister 1998, Blaine and Crocker 1993, Taylor and Brown 1988 for reviews), behaviors which appear to be quite defensive. These findings have fueled the suspicion that many people who are outwardly high in self-esteem inwardly harbor serious doubts about their self-worth, and have defensive, rather than genuinely high, self-esteem. Although the distinction between genuine and defensively high self-esteem has a long history in psychology, researchers have had little success at distinguishing these two types of high self-esteem empirically.
One view is that implicit, or nonconscious, evaluations of the self are dissociated from conscious self-evaluations (see, e.g., Greenwald and Banaji 1995). According to this view, genuine high self-esteem results from having high explicit (conscious) and implicit (nonconscious) self-esteem, whereas defensively high self-esteem results from having high explicit self-esteem and low implicit self-esteem. A measure of implicit self-esteem based on the Implicit Association Test (Greenwald et al. 1998) shows the predicted dissociation between implicit and explicit measures, but to date research has not demonstrated that defensive behaviors such as blaming others for failure are associated uniquely with the combination of high explicit and low implicit self-esteem.
Another view is that people who have stable high self-esteem are relatively nondefensive in the face of failure, whereas people with unstable high self-esteem are both defensive and hostile when they fail (see Kernis and Waschull 1995 for a review).
According to Kernis and his colleagues, people with unstable high self-esteem have a high level of ego-involvement in everyday events; consequently, their self-esteem is at stake even when relatively minor negative events occur. Considerable evidence has accumulated supporting the view that people with unstable high self-esteem are defensive, whereas people with stable high self-esteem are not. Crocker and Wolfe (2001) argue that instability of self-esteem results when outcomes in a person’s life are relevant to their contingencies, or conditions, of self-worth. Consequently, Crocker and Wolfe argue that people are defensive when they receive negative or threatening information in domains in which their self-esteem is contingent. To date, however, they have not provided empirical support for their view.
In general, the issue of defensive vs. genuine self-esteem has focused attention on dimensions of self-esteem that go beyond whether it is high or low. This broader perspective on multiple dimensions of self-esteem might help resolve several issues in the self-esteem literature (Crocker and Wolfe 2001). For example, the role of self-esteem in social problems such as substance abuse and eating disorders may be linked to instability or contingencies of self-esteem as well as, or instead of, level of self-esteem.
2.5 A Cultural Universal?
Psychologists have long assumed that there is a universal need to have high self-esteem. Consistent with this view, most people have high self-esteem, and will go to great lengths to achieve, maintain, and protect it. Yet, almost all of this research has been conducted in a North American cultural context, leaving open the possibility that the need for high self-esteem is a culturally specific phenomenon (see Heine et al. 1999 for a review). Levels of self-esteem in Asians are related to how much time they have spent in the USA or Canada (Heine et al. 1999). Furthermore, Asians and Asian-Americans, on average, do not show the same self-enhancing tendencies so characteristic of North Americans. Heine et al. argue that there are fundamental cultural differences in the nature and importance of self-esteem. Taking Japan as an example, they argue that, whereas in the USA and Canada people are self-enhancing and motivated to achieve, maintain, and protect high self-esteem, in Japan people are self-critical and motivated to improve the self. This self-critical orientation in Japan, they argue, is adaptive in a culture that values self-criticism and self-improvement, and considers them to be evidence of commitment to the group. Consequently, in contrast to North America, self-criticism may lead to positive self-feelings resulting from living up to cultural standards that value self-criticism. The notion that the need for self-esteem and the motivation to self-enhance are not universal is a crucial development in self-esteem research.
3. Measurement of Self-esteem
In 1974, Wylie reviewed research on self-esteem and criticized researchers for developing idiosyncratic measures of self-esteem rather than using well-established, psychometrically valid and reliable instruments. Although measurement of self-esteem has improved somewhat since Wylie’s critique (see Blascovich and Tomaka 1991 for a review), there remains a tendency
for researchers to develop idiosyncratic measures for particular studies. Issues in the measurement of self-esteem tend to reflect issues in its conceptualization. Measures of trait self-esteem assess global judgments of self-worth, self-respect, or self-regard that encourage respondents to consider how they usually or generally evaluate themselves (see, e.g., Rosenberg 1965), or assess evaluations of the self in several domains and create a composite score, on the assumption that such a composite is an indicator of global self-esteem (see, e.g., Coopersmith 1967). Measures of state self-esteem, on the other hand, focus on how the person feels at a specific moment in time. Some state measures assess momentary evaluations of the self in one or more domains such as appearance or performance (see, e.g., Heatherton and Polivy 1991). Others assess current mood or self-related affect with self-ratings on items such as feeling proud, important, and valuable (see, e.g., Leary et al. 1995). Both types of state self-esteem measures appear to be responsive to positive and negative events. Researchers interested in state self-esteem tend not to measure momentary or current self-worth, self-regard, or self-respect (i.e., global state self-esteem). Consequently, studies of state self-esteem and studies of trait self-esteem tend to measure different constructs, making results across these types of studies difficult to compare.
4. Future Directions
Several important issues remain to be addressed by research. First, additional progress is needed in measurement and conceptualization of self-esteem, and in identifying and validating different types of self-esteem (e.g., defensive vs. genuine, implicit vs. explicit, stable vs. unstable, and contingent vs. noncontingent). Second, possible cultural and subcultural differences in the nature and functioning of self-esteem are a very important area needing further exploration. Only after progress has been made in these areas will researchers be able to provide definitive answers to questions about the social importance of self-esteem.
See also: Intrinsic Motivation, Psychology of; Self-efficacy; Self-evaluative Process, Psychology of; Self: History of the Concept; Self-regulation in Adulthood; Well-being (Subjective), Psychology of
Bibliography
Baumeister R F 1998 The self. In: Gilbert D T, Fiske S T, Lindzey G (eds.) Handbook of Social Psychology, 4th edn. McGraw-Hill, New York, pp. 680–740
Baumeister R F, Tice D M, Hutton D G 1989 Self-presentational motivations and personality differences in self-esteem. Journal of Personality 57: 547–79
Blaine B, Crocker J 1993 Self-esteem and self-serving biases in reactions to positive and negative events: An integrative review. In: Baumeister R F (ed.) Self-esteem: The Puzzle of Low Self-regard. Erlbaum, Hillsdale, NJ, pp. 55–85
Blascovich J, Tomaka J 1991 Measures of self-esteem. In: Robinson J P, Shaver P R, Wrightsman L S (eds.) Measures of Personality and Social Psychological Attitudes. Academic Press, San Diego, CA, pp. 115–60
Brown J D 1993 Motivational conflict and the self: The double-bind of low self-esteem. In: Baumeister R F (ed.) Self-esteem: The Puzzle of Low Self-regard. Plenum, New York, pp. 117–30
Cooley C H 1902 Human Nature and the Social Order. Schocken, New York
Coopersmith S 1967 The Antecedents of Self-esteem. W. H. Freeman, San Francisco, CA
Crandall R 1973 The measurement of self-esteem and related constructs. In: Robinson J, Shaver P R (eds.) Measures of Social Psychological Attitudes. Institute for Social Research, Ann Arbor, MI
Crocker J 1999 Social stigma and self-esteem: Situational construction of self-worth. Journal of Experimental Social Psychology 35: 89–107
Crocker J, Wolfe C T 2001 Contingencies of self-worth. Psychological Review 108: 593–623
Dawes R M 1994 House of Cards: Psychology and Psychotherapy Built on Myth. Free Press, New York
Diener E 1984 Subjective well-being. Psychological Bulletin 95: 542–75
Diener E, Wolsic B, Fujita F 1995 Physical attractiveness and subjective well-being. Journal of Personality and Social Psychology 69: 120–29
Feingold A 1992 Good-looking people are not what we think. Psychological Bulletin 111: 304–41
Greenwald A G, Banaji M R 1995 Implicit social cognition: Attitudes, self-esteem, and stereotypes. Psychological Review 102: 4–27
Greenwald A G, McGhee D E, Schwartz J L K 1998 Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology 74: 1464–80
Heatherton T F, Polivy J 1991 Development and validation of a scale for measuring state self-esteem. Journal of Personality and Social Psychology 60: 895–910
Heine S J, Lehman D R, Markus H R, Kitayama S 1999 Is there a universal need for positive self-regard? Psychological Review 106: 766–94
Higgins E T 1987 Self-discrepancy: A theory relating self and affect. Psychological Review 94: 319–40
James W 1890 The Principles of Psychology. Harvard University Press, Cambridge, MA, Vol. 1
Kernis M H, Waschull S B 1995 The interactive roles of stability and level of self-esteem: Research and theory. In: Zanna M P (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 27, pp. 93–141
Leary M R, Tambor E S, Terdal S K, Downs D L 1995 Self-esteem as an interpersonal monitor: The sociometer hypothesis. Journal of Personality and Social Psychology 68: 518–30
Mead G H 1934 Mind, Self, and Society. University of Chicago Press, Chicago, IL
Mecca A M, Smelser N J, Vasconcellos J (eds.) 1989 The Social Importance of Self-esteem. University of California Press, Berkeley, CA
13825
Self-esteem in Adulthood Miller C T, Downey K T 1999 A meta-analysis of heavyweight and self-esteem. Personality and Social Psychology Reiew 3: 68–84 Moretti M M, Higgins E T 1990 Relating self-discrepancy to self-esteem: The contribution of discrepancy beyond actual self-ratings. Journal of Experimental Social Psychology 26: 108–23 Mruk C 1995 Self-esteem: Research, Theory, and Practice. Springer, New York Pelham B W, Swann W B Jr 1989 From self-conceptions to selfworth: On the sources and structure of global self-esteem. Journal of Personality and Social Psychology 57: 672–80 Quinn D M, Crocker J 1999 When ideology hurts: Effects of feeling fat and the protestant ethic on the psychological wellbeing of women. Journal of Personality and Social Psychology 77: 402–14 Rosenberg M 1965 Society and the Adolescent Self-image. Princeton University Press, Princeton, NJ Rosenberg M 1979 Conceiing the Self. Basic Books, New York Rosenberg M, Schooler C, Schoenbach C, Rosenberg F 1995 Global self-esteem and specific self-esteem: Different concepts, different outcomes. American Sociological Reiew 60: 141–56 Schwarz N, Strack F 1999 Reports of subjective well-being: Judgmental processes and their methodological implications. In: Kahneman D, Diener D, Schwarz N (eds.) Well-being: Foundations of Hedonic Psychology. Russell Sage, New York, pp. 61–84 Shrauger J S, Schoeneman T J 1979 Symbolic interactionist view of self-concept: Through the looking-glass darkly. Psychological Bulletin 86: 549–73 Solomon S, Greenberg J, Pyszczynski T 1991 A terror management theory of social behavior: The psychological functions of self-esteem and cultural worldviews. In: Zanna M P (ed.) Adances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 24, pp. 91–159 Taylor S E, Brown J D 1988 Illusion and well-being: A socialpsychological perspective on mental health. 
Psychological Bulletin 103: 193–210 Twenge J M Campbell W K 1999 Does Self-esteem Relate to Being Rich, Successful, and Well-educated? A Meta-analytic Reiew. Manuscript, Case-Western Reserve University Wylie R C 1974 The Self-concept (rev. edn.). University of Nebraska Press, Lincoln, NE Wylie R C 1979 The Self-concept: Theory and Research on Selected Topics, 2nd edn. University of Nebraska Press, Lincoln, NE, Vol. 2
J. Crocker
Self-evaluative Process, Psychology of
This article explores some of the processes associated with evaluative response to one’s own self. Evaluation refers to a response registering the idea/feeling that some aspect of one’s world is good or bad, likeable or dislikable, valuable or worthless. It is one of our most fundamental psychological responses. Indeed, research on the psychology of meaning has revealed that evaluation is the single most important aspect of meaning. Evaluative responses are fast; there is evidence that we have an evaluation of some things even before we are completely able to recognize them. Evaluative responses are sometimes automatic: we do not set out to make them and we often cannot turn them off. Moreover, evaluation often colors our interpretation of situations. For example, ambiguous actions of people we like are interpreted more benevolently than the same actions of people we do not like. An evaluative response attached to the self is often termed self-esteem, and we will use the terms self-esteem and self-evaluation interchangeably.
1. Individual Differences in Self-evaluation
There are literally thousands of studies reported since the 1950s measuring self-esteem and comparing persons who are high with persons who are low on this dimension (see Self-esteem in Adulthood). Most frequently, self-esteem is assessed by self-report. One of the most popular measures (Rosenberg 1965) consists of ten items like ‘I am a person of worth’ followed by a series of graded response options, e.g., strongly agree, agree, disagree, strongly disagree. Such measures have proven to be reliable and valid. However, they are subject to the same general criticisms as any self-report measure: scores can be distorted by the tendency to agree with an item regardless of its content and by the tendency to try to create a favorable impression. Moreover, there may be aspects of one’s self-evaluation that are not easily accessible to conscious awareness. To address some of these concerns, ‘implicit’ measures of self-esteem are currently being explored (Greenwald and Banaji 1995). Most of these measures work by priming the self, i.e., making the self salient, and then measuring the impact of self-salience on other evaluative responses. For example, when the self is primed, the more positive the self-evaluation, the faster one should be in making other positive evaluative judgments. As of this writing, implicit measures of self-evaluation show great promise, but it is still unclear what impact such measures will have on our ultimate understanding of self-evaluation.
Individual differences in self-evaluation have been associated with a variety of psychological traits. For example, compared to persons with low self-esteem, persons with high self-esteem tend to achieve more in school and to be less depressed, better adjusted, less socially anxious, and more satisfied with life. Going through this research almost leads one to draw the conclusion that all the good things in life are positively associated with self-esteem. Indeed, the intuition that high self-esteem is good is so compelling that the state of California even put together a task force to promote self-esteem. There are, of course, many arguments for the positive impact of self-esteem. However, most of the studies rely on correlational methods. Correlational methodology makes it difficult to know if self-esteem is a cause or an effect of these other variables. For example, it may be that high self-esteem leads to school achievement, but it may also be that school achievement improves self-esteem. It may also be that the correlation between achievement and self-esteem is not causal at all; each may be caused by the same third variable, e.g., general health.
Recent research is beginning to correct the simple view of self-esteem as always ‘good.’ For example, it may be persons who are high in self-esteem rather than persons who are low in self-esteem that are most likely to be aggressive (Baumeister et al. 1996). Why? Persons high in self-esteem have more to lose when confronted by failure or a personal affront. Related to this suggestion is the observation that self-esteem may be stable in some persons but unstable in others. Stability of self-esteem is consequential (Kernis and Waschull 1995). Persons whose self-esteem is high on the average but whose self-evaluation fluctuates over time score higher on a hostility measure than persons who are high in self-esteem but whose self-evaluation is stable. Perhaps it is persons who aspire to feel positive about themselves but are unsure of themselves that tend to fluctuate in their self-evaluation and to respond aggressively to threats to self-esteem.
2. Self Motives
The idea that persons strive to maintain a positive self-evaluation is obvious. It is not difficult to notice that people respond positively to success and compliments and negatively to failure and insults. They tend to seek out persons who respect their accomplishments and situations in which they can do well. In spite of the obviousness and ubiquity of a self-enhancement motive, at least two other motives have captured some research attention. One is the motive for self-knowledge, i.e., a self-assessment motive, and the other is a consistency or self-verification motive.
Feeling good about ourselves can take us only so far. It is also important to have accurate knowledge about the self. Leon Festinger (1954) suggested that we have a drive to evaluate our abilities and opinions. Indeed, we seem to be fascinated by information about ourselves. We want to know what others think of us. We go to psychotherapists to learn more about ourselves. We are even curious about what the stars have to say about our lives (note the popularity of horoscopes). Systematic research has varied the diagnosticity of experimental tasks, i.e., the extent to which we believe the task is truly revealing of our abilities. Under some conditions, for example, when certainty is low or when we are in a particularly good mood, we prefer diagnostic feedback to flattering feedback. Thus, there is evidence for the self-assessment motive.
Another motive that has received some research attention is the tendency to verify one’s current view of the self (Swann 1990). According to this point of view, people are motivated to confirm their self-view. They will seek out persons and situations that provide belief-consistent feedback. Persons with a positive view of self will seek out others who also evaluate them positively and situations in which they can succeed. Note that this is exactly the same expectation that could be derived from a self-enhancement point of view. However, self-verification and self-enhancement predictions diverge when a person has a negative view of self. The self-verification hypothesis predicts that persons with a negative self-view will seek out others who also perceive them negatively and situations that will lead to poor performance. There is some evidence for the self-verification prediction. On the other hand, neither the self-assessment motive nor the self-verification motive appears to be as robust as the self-enhancement motive (Sedikides 1993).
3. Self-enhancement
As noted above, self-enhancement processes are frequent, easy to observe, and robust. William James and a host of contemporary workers have focused on particular mechanisms by which self-evaluation is affected. For example, James suggests that threats to self-esteem are stronger if they involve abilities on which the self has ‘pretensions’ or aspires to do well. Others have noted that feelings of success and failure that affect self-evaluation often come from comparison with other persons. Still other investigators have shown the impact of self-enhancement on cognition. For example, the kinds of causal attributions people make for their own successes and failures are often self-serving. Persons tend to locate the causes of success internally (due to my trying, ability) and the causes of failure externally (due to bad luck, task difficulty). The self-enhancement mechanisms proposed in the psychological literature have been so numerous and so diverse that the collection of them has sometimes been dubbed the ‘self zoo.’ However, three general classes of mechanism encompass many of the proposed self-enhancement/protection mechanisms: social comparison, inconsistency reduction, and value expression.
3.1 Social Comparison
One large class of self-enhancement mechanisms concerns social comparisons. The Self-Evaluation Maintenance (SEM) model (Tesser 1988), for example, proposes that when another person does better than we do at some activity, our own self-evaluation is affected. The greater our ‘closeness’ to the other person (through similarity, contiguity, personal relationship, etc.), the greater the effect on our self-evaluation. Being outperformed by another can lower self-evaluation by inviting unflattering self-comparison, or it can raise self-evaluation through a kind of ‘basking in reflected glory.’ (Examples of basking are seen in statements like, ‘That’s my friend Bob, the best widget maker in the county.’) The relevance of the performance domain determines the relative importance of these opposing processes. If the performance domain is important to one’s self-definition, i.e., high relevance, then the comparison process will be dominant. One’s self-evaluation will be threatened by a close other’s better performance. If the performance domain is unimportant to one’s self-definition, i.e., low relevance, then the reflection process will be dominant. One’s self-evaluation will be augmented by a close other’s better performance. Thus, combinations of relative performance, closeness, and relevance are the antecedents of self-esteem threat or enhancement.
The assumption that people are motivated to protect or enhance self-evaluation, combined with the sketch of how another’s performance affects self-evaluation, provides the information needed to predict self-evaluation maintenance behavior. An example: Suppose Nancy learns that she made a B+ on the test. The only other person from Nancy’s dormitory in this chemistry class, Kaela, made an A+. This should be threatening to Nancy: Kaela outperformed her; Kaela is psychologically close (same dormitory); and chemistry is high in relevance to Nancy, who is studying to be a doctor. What can Nancy do to reduce this threat and maintain a positive self-evaluation? She can change the performance differential by working harder herself or by preventing Kaela from doing well, e.g., hide the assignments, put the wrong catalyst in Kaela’s beaker. She can reduce her psychological connection to Kaela, e.g., change dorms and avoid the same classes. Alternatively, she can convince herself that this performance domain is not self-relevant, e.g., chemistry is not highly relevant to the kind of medicine in which she is most interested. Laboratory experiments have produced evidence for each of these modes of dealing with social comparison threat to self-evaluation.
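The qualitative SEM predictions above can be sketched as a toy calculation. This is only an illustration under stated assumptions: the SEM model as described here is verbal, so the 0-to-1 scales, the linear weighting, the name `sem_effect`, and Nancy's numbers are all hypothetical choices made for the example, not part of the model itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Comparison:
    """One episode of being outperformed (all scales are illustrative)."""
    performance_gap: float  # how far the other person is ahead; must be > 0
    closeness: float        # 0 = psychologically distant, 1 = very close
    relevance: float        # 0 = irrelevant to self-definition, 1 = central


def sem_effect(c: Comparison) -> float:
    """Toy signed effect on self-evaluation (negative = threat, positive = boost).

    Assumptions, not taken from the source: closeness scales the size of the
    effect, and relevance shifts weight from reflection to comparison.
    """
    if c.performance_gap <= 0:
        raise ValueError("this sketch only covers being outperformed")
    comparison = -c.relevance * c.performance_gap         # unflattering comparison
    reflection = (1.0 - c.relevance) * c.performance_gap  # basking in reflected glory
    return c.closeness * (comparison + reflection)


# Hypothetical numbers for Nancy: close other, highly relevant domain.
nancy = Comparison(performance_gap=0.3, closeness=0.8, relevance=0.9)
print("baseline:", sem_effect(nancy))

# The three coping routes named in the text, each changing one antecedent.
print("narrow the gap:", sem_effect(replace(nancy, performance_gap=0.1)))
print("reduce closeness:", sem_effect(replace(nancy, closeness=0.2)))
print("reduce relevance:", sem_effect(replace(nancy, relevance=0.2)))
```

In this sketch each of the three routes makes the result less negative than the baseline, and lowering relevance far enough flips the sign, which mirrors the shift from comparison to reflection described in the text.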
3.2 Cognitive Consistency
The number of variations within this approach to self-evaluation regulation is also substantial. An example of this approach is cognitive dissonance theory (Festinger 1957). According to dissonance theory, self-esteem is threatened by inconsistency. Holding beliefs that are logically or ‘psychologically’ inconsistent, i.e., dissonant, with one another is uncomfortable. For example, suppose a student agrees to a request to write an essay in favor of a tuition increase at her school. Her knowledge that she is opposed to a tuition increase is dissonant with her knowledge that she agreed to write an essay in favor of a tuition increase. One way to reduce this threatening dissonance is for the student to change her attitude to be more in favor of a tuition increase.
Note that social comparison mechanisms and consistency reduction mechanisms are both self-enhancement strategies, yet they seem to have little in common. Threat from dissonance rarely has anything to do with the performance of another, i.e., social comparison. Similarly, inconsistency is generally irrelevant to an SEM threat, whereas another’s performance is crucial. Attitude change is the usual mode of dissonance threat reduction; on the other hand, changes in closeness, performance, or relevance are the SEM modes.
3.3 Value Expression
The notion that expressing one’s most cherished values can affect self-esteem also has a productive history in social psychology. Simply expressing who we are and affirming our important values seems to have a positive effect on self-evaluation. According to self-affirmation theory (e.g., Steele 1988), self-evaluation has at its root a concern with a sense of global self-integrity. Self-integrity refers to holding self-conceptions and images that one is ‘adaptively and morally adequate, that is, as competent, good, coherent, unitary, stable, and capable of free choice, capable of controlling important outcomes, and so on’ (Steele 1988, p. 262). If the locus of a threat to self-esteem is self-integrity, then the behavior to reduce that threat is self-affirmation or a declaration of the significance of an important self-value. Again, note that as a self-enhancement strategy, affirming a cherished value is qualitatively different from the SEM behaviors of changing closeness, relevance, or performance, or the dissonance behavior of attitude change.
3.4 Putting It All Together
We have briefly described three classes of self-enhancement mechanisms: social comparison, cognitive consistency, and value expression. Each of these mechanisms is presumed to regulate self-evaluation, yet they are strikingly different from one another. These differences raise the question of whether self-evaluation is a unitary system or whether there are three (or more) independent self-evaluation systems. The goal of the self-enhancement motive is to maintain positive self-esteem. If there is a unitary self-evaluation system, the various self-evaluation mechanisms should substitute for one another. For example, if behaving inconsistently reduces self-evaluation, then a positive social comparison experience, being part of the same system, should be able to restore self-evaluation. On the other hand, if there are separate self-evaluation systems, then a positive social comparison experience will not be able to restore a reduction in self-evaluation originating with inconsistent behavior. One would have to reduce the inconsistency to restore one’s self-evaluation. Recent research favors the former interpretation. At least under certain circumstances, the three self-enhancement mechanisms are mutually substitutable for one another in maintaining self-evaluation. In short, self-evaluation appears to be a unitary system with multiple processes for regulating itself (Tesser et al. 1996).
4. The Origins of Self-evaluation
Psychologists have only recently begun to think about the origins of the self-enhancement motive. One line of research suggests that the self-enhancement motive grows out of an instinct for self-preservation coupled with knowledge of our own mortality. Although we as individuals may not live on, we understand that the culture of which we are a part does live on. ‘Immortality’ comes from our connection with our culture. Self-evaluation is a psychological indicator of the extent to which we are connected and acceptable to our culture and, hence, an index of our own ‘immortality’ (Pyszczynski et al. 1997). Another line of work builds on the observation that evolution has predisposed us to be social, gregarious animals who are highly dependent on group living. We wish to maintain a positive self-esteem because self-esteem is a kind of ‘sociometer’ that indicates the extent to which we are regarded positively or negatively by others (Leary et al. 1995). Also taking an evolutionary perspective, the SEM model builds on the sociometer idea in two ways. It suggests that groups differ in the power they have to affect self-esteem. Compared to psychologically distant groups, psychologically close groups are typically more consequential to our well-being and they have greater impact on our self-esteem. The SEM model also suggests that division of labor is fundamental to groups to maximize efficiency and to avoid conflict. Consequently, self-evaluation is more sensitive to feedback regarding the self’s own niche in the group. See Beach and Tesser (in press) for discussion.
5. Summary
Self-evaluation has a productive history in psychology. Individual differences in self-esteem tend to be correlated with a number of positive attributes such as school achievement, general happiness, and lack of depression. There is a strong tendency for people to maintain a positive self-esteem, but there is also evidence of motives for self-accuracy and for self-verification. At least three processes affect self-evaluation: social comparison, cognitive consistency, and value expression. Although these processes are qualitatively different from one another, they are substitutable for one another in maintaining self-esteem. Self-enhancement is thought to have evolutionary roots in the individual’s connections to groups.
See also: Cognitive Dissonance; Self-esteem in Adulthood; Self-monitoring, Psychology of; Self-regulation in Adulthood; Social Comparison, Psychology of
Bibliography
Baumeister R F, Smart L, Boden J M 1996 Relation of threatened egotism to violence and aggression: The dark side of high self-esteem. Psychological Review 103: 5–33
Beach S R H, Tesser A in press Self-evaluation maintenance and evolution: Some speculative notes. In: Suls J, Wheeler L (eds.) Handbook of Social Comparison. Lawrence Erlbaum Associates, Mahwah, NJ
Festinger L 1954 A theory of social comparison processes. Human Relations 7: 117–40
Festinger L 1957 A Theory of Cognitive Dissonance. Row, Peterson, Evanston, IL
Greenwald A G, Banaji M R 1995 Implicit social cognition: Attitudes, self-esteem, and stereotypes. Psychological Review 102: 4–27
James W 1905 The Principles of Psychology. Holt, New York, Vol. 1
Kernis M H, Waschull S B 1995 The interactive roles of stability and level of self-esteem: Research and theory. In: Berkowitz L (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 27, pp. 93–141
Leary M R, Tambor E S, Terdal S K, Downs D L 1995 Self-esteem as an interpersonal monitor: The sociometer hypothesis. Journal of Personality and Social Psychology 68: 518–30
Pyszczynski T, Greenberg J, Solomon S 1997 Why do we need what we need? A terror management perspective on the roots of human social motivation. Psychological Inquiry 8: 1–20
Rosenberg M 1965 Society and the Adolescent Self-image. Princeton University Press, Princeton, NJ
Sedikides C 1993 Assessment, enhancement, and verification determinants of the self-evaluation process. Journal of Personality and Social Psychology 65(2): 317–38
Steele C M 1988 The psychology of self-affirmation: Sustaining the integrity of the self. In: Berkowitz L (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 21, pp. 261–302
Swann W B 1990 To be adored or to be known? The interplay of self-enhancement and self-verification. In: Sorrentino R M, Higgins E T (eds.) Handbook of Motivation & Cognition. Guilford Press, New York, Vol. 2, pp. 408–48
Tesser A 1988 Toward a self-evaluation maintenance model of social behavior. In: Berkowitz L (ed.) Advances in Experimental Social Psychology. Academic Press, San Diego, CA, Vol. 21, pp. 181–227
Tesser A, Martin L, Cornell D 1996 On the substitutability of self-protective mechanisms. In: Gollwitzer P M, Bargh J A (eds.) The Psychology of Action: Linking Motivation and Cognition to Behavior. Guilford Press, New York, pp. 48–67
A. Tesser
Self-fulfilling Prophecies
A self-fulfilling prophecy occurs when an originally false social belief leads to its own fulfillment. The self-fulfilling prophecy was first described by Merton (1948), who applied it to test anxiety, bank failures, and discrimination. This article reviews some of the controversies surrounding early self-fulfilling prophecy research, and traces how those controversies have led to modern research on relations between social beliefs and social reality.
Self-fulfilling prophecies did not receive much attention until Rosenthal and Jacobson’s (1968) Pygmalion study. Teachers were led to believe that randomly selected students would show dramatic increases in IQ over the school year. Results seemed to show that, especially in the earlier grade levels, those students gained more in IQ than other students. Thus, the teachers’ initially false belief that some students would show unusual IQ gains became true.
1. Controversy, Replication, and Meta-analysis
Rosenthal and Jacobson’s study (1968) was highly controversial. Although it seemed to explain the low achievement of disadvantaged students, it was criticized on methodological and statistical grounds. This controversy inspired attempts at replication. Only about one third of these early attempts succeeded (Rosenthal and Rubin 1978). Critics concluded that the phenomenon was unreliable. Proponents concluded that this demonstrated the existence of self-fulfilling prophecies because, if only chance differences were occurring, replications would succeed just 5 percent of the time.
This controversy inspired Rosenthal’s work on meta-analysis, i.e., statistical techniques for summarizing the results of multiple studies. Rosenthal and Rubin’s (1978) meta-analysis of the first 345 studies of interpersonal expectancy effects conclusively demonstrated that self-fulfilling prophecies are a real and reliable phenomenon. That meta-analysis also showed that they were neither pervasive (nearly two-thirds of the studies failed to find the effect) nor powerful (effect sizes, in terms of correlation or regression coefficients, averaged 0.2–0.3).
This, however, did not end the controversy. Although few modern researchers dispute the existence of self-fulfilling prophecies in general, several do dispute the claims that teacher expectations influence student intelligence (Snow 1995). For example, Snow (1995) concluded that: (a) the expectancy effect disappears if one removes five students with implausible IQ increases (100 points within one year) from the Rosenthal and Jacobson (1968) study, and (b) the literature fails to demonstrate an effect of teacher expectations on IQ.
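The proponents' chance argument in this section can be made concrete with a binomial tail probability. The sketch below is illustrative only: it treats the 345 studies as independent, assumes each would reach significance with probability 0.05 if nothing but chance were operating, and reads "about one third" as exactly 115 successes. The count of 115 and the independence assumption are simplifications introduced here, not figures reported in the text.

```python
from math import comb


def binomial_tail(n: int, k_min: int, p: float) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(k_min, n + 1))


N_STUDIES = 345    # number of studies in the meta-analysis (from the text)
N_SUCCESSES = 115  # assumption: "about one third" taken as exactly 115
ALPHA = 0.05       # chance success rate per study under the null hypothesis

expected_by_chance = N_STUDIES * ALPHA
p_value = binomial_tail(N_STUDIES, N_SUCCESSES, ALPHA)

print(f"successes expected by chance alone: {expected_by_chance:.2f}")
print(f"P(at least {N_SUCCESSES} successes by chance): {p_value:.3g}")
```

Under these assumptions chance alone would produce roughly 17 successful studies, so a success rate near one third is far more than chance predicts, even though most individual studies fail. This is why the same data could be read as showing both a real effect and a modest one.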
2. Self-fulfilling Stereotypes
Some researchers saw in self-fulfilling prophecies an explanation for social inequalities. Thus, in the 1970s, research began addressing the self-fulfilling effects of stereotypes. The main idea was that if most stereotypes were inaccurate and if self-fulfilling prophecies were common and powerful (as was believed), then negative stereotypes regarding intelligence, achievement, motivation, etc., may produce self-fulfilling prophecies that lead individuals from devalued groups to objectively confirm those stereotypes. The early experimental research seemed to support this perspective:
(a) White interviewers’ racial stereotypes could undermine the performance of Black interviewees.
(b) Males acted more warmly towards, and evoked warmer behavior from, female interaction partners erroneously believed to be more physically attractive.
(c) When interacting with a sexist male who was either physically attractive or who was interviewing them for a job, women altered their behavior to appear more consistent with traditional sex stereotypes.
(d) Teachers used social class as a major basis for expectations and treated students from middle class backgrounds more favorably than students from lower class backgrounds (see reviews by Jussim et al. 1996 and Snyder 1984).
3. Widespread Acceptance and More Questions
The role of self-fulfilling prophecies in creating social problems, however, remained unclear because many of the early self-fulfilling prophecy experiments suffered important limitations. In most, if the expectancy manipulation was successful, perceivers developed erroneous expectations. However, under naturalistic conditions perceivers may develop accurate expectations, which, by definition, do not create self-fulfilling prophecies (because self-fulfilling prophecies begin with an initially false belief). Three categories of research have addressed the limitations of the early experiments in different ways.
3.1 Naturalistic Studies of Teacher Expectations
Longitudinal, quantitative investigations of naturally occurring teacher expectancies addressed the accuracy problem directly. All assessed relations between teacher expectations and students’ past and future achievement. If teacher expectations predicted future achievement beyond effects accounted for by students’ past achievement, results were interpreted as providing evidence consistent with self-fulfilling prophecies. These studies also provided data capable of addressing two related questions: (a) How large are naturally occurring teacher expectation effects? (b) Do teacher expectations predict students’ achievement more because they create self-fulfilling prophecies or more because they are accurate?
The results were consistent:
(a) In terms of standardized regression coefficients, the self-fulfilling effects of teacher expectations were about 0.1–0.2.
(b) Teacher expectations were strongly based on students’ past achievement.
(c) Teachers’ expectations predicted students’ achievement more because they were accurate than because they led to self-fulfilling prophecies.
(d) Even teachers’ perceptions of differences between students from different demographic groups (i.e., stereotypes) were mostly accurate (Jussim et al. 1996).
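The logic of these studies, asking whether expectations predict future achievement beyond past achievement, can be sketched with a two-predictor standardized regression. This is a minimal illustration, not the analysis used in any of the cited studies: the helper names and the eight data points are hypothetical, and the coefficients are computed from Pearson correlations using the standard two-predictor formula.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


def standardized_betas(expectation, past, future):
    """Standardized coefficients for the model future ~ expectation + past.

    Returns (beta_expectation, beta_past). beta_expectation is the part of
    the prediction that past achievement does not already account for.
    """
    r_fe = pearson(future, expectation)
    r_fp = pearson(future, past)
    r_ep = pearson(expectation, past)
    denom = 1.0 - r_ep**2
    return (r_fe - r_fp * r_ep) / denom, (r_fp - r_fe * r_ep) / denom


# Hypothetical data: past scores, teacher ratings, and later scores that were
# constructed to depend on both past achievement and the teacher's rating.
past = [50, 55, 60, 65, 70, 75, 80, 85]
expectation = [2, 3, 3, 4, 3, 5, 4, 5]
future = [p + 2 * e for p, e in zip(past, expectation)]

beta_expectation, beta_past = standardized_betas(expectation, past, future)
print(f"beta for expectations (controlling for past): {beta_expectation:.3f}")
print(f"beta for past achievement: {beta_past:.3f}")
```

If future achievement depended on past achievement alone, the coefficient for expectations would be zero even though expectations and future achievement were strongly correlated. That case corresponds to expectations that predict well simply because they are accurate.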
3.2 Naturalistic Studies of Close Relationships
Recent research has begun investigating the occurrence of self-fulfilling prophecies in close relationships. Children come to view their math abilities in a manner consistent with their mothers’ sex stereotypes (Jacobs and Eccles 1992). New college roommates change each other’s self-perceptions of academic and athletic ability (McNulty and Swann 1994). People who feel anxious that their romantic partners will reject them often evoke rejection from those partners (Downey et al. 1998). Furthermore, the more positive the illusions one holds regarding one’s romantic partner, the longer that relationship is likely to continue and the more positively one’s romantic partner will come to view himself or herself (Murray et al. 1996). As with the teacher expectation studies, however, self-fulfilling prophecy effect sizes average about 0.2.
3.3 Nonconscious Priming of Stereotypes
Chen and Bargh (1997) avoided inducing false expectations by nonconsciously priming a stereotype and observing its effect on social interaction. First, they subliminally presented to perceivers either African-American or White faces. Perceivers and targets (all of whom were White) were then placed into different rooms where they communicated through microphones and headphones. These interactions were recorded and rated for hostility. Perceivers primed with an African-American face were rated as more hostile, and targets interacting with more hostile perceivers reciprocated with greater hostility themselves. The expectancy effect size was 0.23.
4. Moderators
Failures to replicate and generally small effect sizes prompted some researchers to begin searching for moderators, i.e., factors that inhibit or facilitate self-fulfilling prophecies (Jussim et al. 1996). This research has identified some conditions under which powerful self-fulfilling prophecies occurred and many conditions under which self-fulfilling prophecies did not occur. Identified moderators include characteristics of perceivers, targets, and the situation.
4.1 Perceiver Moderators
(a) Perceivers motivated to be accurate or sociable are not likely to produce self-fulfilling prophecies.
(b) Perceivers motivated to confirm a particular belief about a target, or to arrive at a stable impression of a target, are more likely to produce self-fulfilling prophecies.
(c) Perceivers with a rigid cognitive style or who are certain of their beliefs about targets are more likely to produce self-fulfilling prophecies.
4.2 Target Moderators
(a) Unclear self-perceptions lead people to become more vulnerable to self-fulfilling prophecies.
(b) When perceivers have something targets want (such as a job), targets often confirm perceivers’ beliefs in order to create a favorable impression.
(c) When targets desire to facilitate smooth social interactions, they are also more likely to confirm perceivers’ expectations.
(d) When targets believe that perceivers hold a negative belief about them, they often act to disconfirm that belief. Similarly, when their main goal is to defend a threatened identity, or to express their personal attributes, they are also likely to disconfirm perceivers’ expectations.
(e) Self-fulfilling prophecies are stronger among students from at least some stigmatized social groups (African-American students, students from lower social class backgrounds, and students with a history of low achievement).
4.3 Situational Moderators (a) Self-fulfilling prophecies are most common when people enter new situations, such as kindergarten or military service.
(b) Experimental studies conducted in educational contexts were much more likely to obtain self-fulfilling prophecies if the expectancy manipulation occurred early in the school year (presumably, because teachers were more open to the information at that time).
5. Accumulation Small self-fulfilling prophecy effects, if they accumulate over time, might lead to large differences between targets. If, for example, stereotype-based expectations lead to small differences in the intellectual achievement of students from middle class or poor backgrounds each year, those differences may accumulate over time to lead to large social class differences in achievement. This argument lies at the heart of claims emphasizing the power of self-fulfilling prophecies to contribute to social problems. Self-fulfilling prophecies in the classroom, however, do not accumulate. All studies examining this issue have failed to find accumulation and, instead, have generally found that teacher expectation effects dissipate over time (Smith et al. 1999). Whether self-fulfilling prophecies accumulate outside of the classroom is currently unknown.
6. Future Directions 6.1 Naturalistic Studies Outside of Classrooms Nearly all of the early naturalistic research focused on teacher expectations. Thus, the recent emergence of research on self-fulfilling prophecies in close relationships has been sorely needed, and will likely continue. The expectations that parents, employers, therapists, coaches, etc. develop regarding their children, employees, clients, athletes, etc. are all rich areas for future research. 6.2 Stereotype Threat Stereotype threat refers to concern that one’s actions may fulfill a negative cultural stereotype of one’s group (Steele 1997). Such concerns may, paradoxically, lead to the fulfillment of those stereotypes. For example, African-American students who believe they are taking a test of intelligence (triggering potential concern about confirming negative cultural stereotypes regarding African-American intelligence) perform worse than White students; however, when led to believe that the same test is one of ‘problem-solving,’ the differences evaporate. Similar patterns have occurred for women taking standardized math tests (Steele 1997), and for middle class and poor students on intelligence tests (Croizet and Claire 1998). Stereotype threat is a relatively new concept in the social
sciences, and has been thus far used primarily to explain demographic differences in standardized test performance. In addition, it helps identify how cultural stereotypes (beliefs about the widespread beliefs regarding groups) may be self-fulfilling, even in the absence of a specific perceiver with an inaccurate stereotype. As such, it promises to remain an important topic in the social sciences for some time.
7. Conclusion Self-fulfilling prophecies are pervasive in the sense that they occur in many different contexts. They are not pervasive in the sense that self-fulfilling prophecy effect sizes are typically small, and many studies have failed to find them. Because of the alleged power of expectancy effects to create social problems, teachers have sometimes been accused of perpetrating injustices based on race, class, sex, and other demographic categories. This accusation is unjustified. Teacher expectations predict student achievement primarily because those expectations are accurate. Furthermore, even when inaccurate, teacher expectations do not usually influence students very much; and even when they do influence students, such influence is likely to dissipate over time. Sometimes, however, both inside and outside the classroom, self-fulfilling prophecies can be powerful. In the classroom, the effects among some groups (low achievers, African-Americans, students from lower social class backgrounds) have been quite powerful. Although self-fulfilling prophecies in the classroom do not accumulate, they can be very long lasting, detectable as many as six years after the original teacher-student relationship (Smith et al. 1999). Outside the classroom, recent research has demonstrated the potentially important role of self-fulfilling prophecies in close relationships, and in the maintenance of socio-cultural stereotypes. Thus, self-fulfilling prophecies occur in a wide variety of contexts and are a major phenomenon linking social perception to social behavior. See also: Decision Making, Psychology of; Personality and Conceptions of the Self; Self-concepts: Educational Aspects; Stereotypes, Social Psychology of; Teacher Behavior and Student Outcomes
Bibliography Chen M, Bargh J A 1997 Nonconscious behavioral confirmation processes: The self-fulfilling consequences of automatic stereotype activation. Journal of Experimental Social Psychology 33: 541–60 Croizet J, Claire T 1998 Extending the concept of stereotype threat to social class: The intellectual underperformance of students from low socioeconomic backgrounds. Personality and Social Psychology Bulletin 24: 588–94
Downey G, Freitas A L, Michaelis B, Khouri H 1998 The self-fulfilling prophecy in close relationships: Rejection sensitivity and rejection by romantic partners. Journal of Personality and Social Psychology 75: 545–60 Jacobs J E, Eccles J S 1992 The impact of mothers’ gender-role stereotypic beliefs on mothers’ and children’s ability perceptions. Journal of Personality and Social Psychology 63: 932–44 Jussim L, Eccles J, Madon S 1996 Social perception, social stereotypes, and teacher expectations: Accuracy and the quest for the powerful self-fulfilling prophecy. Advances in Experimental Social Psychology 28: 281–388 Merton R K 1948 The self-fulfilling prophecy. Antioch Review 8: 193–210 McNulty S E, Swann W B 1994 Identity negotiation in roommate relationships: The self as architect and consequence of social reality. Journal of Personality and Social Psychology 67: 1012–23 Murray S L, Holmes J G, Griffin D W 1996 The self-fulfilling nature of positive illusions in romantic relationships: Love is not blind, but prescient. Journal of Personality and Social Psychology 71: 1155–80 Rosenthal R, Jacobson L 1968 Pygmalion in the classroom: Teacher expectations and pupils’ intellectual development. Holt, Rinehart, and Winston, New York Rosenthal R, Rubin D B 1978 Interpersonal expectancy effects: The first 345 studies. Behavioral and Brain Sciences 1: 377–415 Smith A E, Jussim L, Eccles J 1999 Do self-fulfilling prophecies accumulate, dissipate, or remain stable over time? Journal of Personality and Social Psychology 77: 548–65 Snow R E 1995 Pygmalion and intelligence? Current Directions in Psychological Science 4: 169–71 Snyder M 1984 When belief creates reality. Advances in Experimental Social Psychology 18: 247–305 Steele C M 1997 A threat in the air: How stereotypes shape intellectual identity and performance. American Psychologist 52: 613–29
L. Jussim
Self: History of the Concept It is often asserted that the concept of the self emerges only in early modern times in connection with the concern for subjectivity that is taken to be characteristic of modernity. While it is true that the term ‘self’ as a noun, describing that which in a person is really and intrinsically this person, comes into use only from the seventeenth century onwards, what can be called the question of selfhood was not unfamiliar to earlier thinkers. This is, in its core, the question whether there is—and if so, what is—some unity, or at least continuity and coherence, of a human being over the life-course beyond their bodily constitution, beyond the unity of the body. In the history of human thought, this question has been answered in a great variety of different ways. Broadly, empirical traditions tended to doubt the
existence of such unity. They could observe changes, even radical transformations, in the human mind and had thus no ground to postulate any a priori unity. Transcendental traditions, in contrast, tended to argue that there had to be a unity of apperception, or of consciousness. Otherwise, not even the question of the self could be asked. In the intellectual space between these two positions, so to say, the issue could be addressed in ways that are more specific to the social and psychological sciences. Then, the faculty of memory, for instance, could be seen as enabling a sense of continuous self to emerge. Human subjects have the ability to narrate their lives. Or, selves could be seen as formed in the interaction with others, that is, in the mirroring of an image of oneself through the responses to one’s own words and actions by others. In moral philosophy, the existence of a continuous self was seen as a precondition for holding human beings accountable for their past deeds. Selfhood was thus linked to moral and political responsibility.
1. Selfhood in the Social Sciences In these forms, the question of the self was already posed at the historical moment when the social sciences, by and large in the contemporary understanding of the term, arose towards the end of the eighteenth century. It is noteworthy, then, that the emerging social sciences rather reduced the range of ways of exploring selfhood across the nineteenth century. In some of their fields, like liberal political philosophy and political economy/economics, they postulated a rational self, able to make choices and to be responsible for her, or rather his, deeds. In other fields, in particular in the sociological way of reasoning, the orientations and behaviors of human beings were seen as determined by their social position. This opposition has become known as the one between an under- and an oversocialized conception of the human being (Wrong 1961). More cautiously, the empirical and behavioral strands of social research restricted themselves to observing human behavior and refrained from making any assumptions about selfhood at all. Regularities emerged here only through the aggregation of observations. What all these approaches had in common, though, was that they aimed at regularizing and stabilizing human orientations and behavior. Whether human beings were under- or oversocialized or just happened to behave according to patterns, the broader question of selfhood, namely whether and how a sense of continuity and coherence in a human being forms, was rather neglected or answered a priori by theoretical postulate. In this light, it seems appropriate to say that the social sciences have developed a serious interest in
questions of human selfhood only late, systematically only in the early twentieth century. Furthermore, they have largely ignored ways of representing the human self that were proposed in other areas, in philosophy for instance, but maybe even more importantly in literature. Thus, issues such as individuality and subjectivity, the possible idiosyncrasy of a life-course and life-project, have long remained outside the focus of the social sciences, which have tended to look at the question from the perspective of a fully developed, stable, personal identity, rather than making the continuity and coherence of the self an issue open to investigation. This focus can hardly be understood otherwise than through the perceived need to expect a political stability of the world on the basis of a presupposed coherence of human orientations and actions (see Freedom/Liberty: Impact on the Social Sciences). The ground for an interest in such broader issues was prepared outside of what is conventionally recognized as social science, namely by Friedrich Nietzsche, and later by Sigmund Freud. Nietzsche radically rejected the problems of moral and political philosophy and thus liberated the self from the impositions of the rules of the collective life. Freud located the drives toward a fuller realization of one’s self in the human psyche and connected the history of civilization with the repression of such drives. Against such a ‘Nietzschean-Freudian’ background (Rorty 1989), Georg Simmel and George Herbert Mead could observe the ways in which identities are formed in social interaction and conceptualize variations of self-formation in different social contexts. From then on, a sociology and social psychology of selfhood and identity has developed which no longer relies on presuppositions about some essence of human nature and is able to connect its findings to both child psychology and phenomenology.
It emphasizes the socially constructed nature of selfhood, but remains capable, at least in principle, of analyzing the specific social contexts of self-formation, thus working towards a comparative-historical sociology of selfhood. This broadened debate on selfhood has created a semantic space in which aspects of the self are emphasized in various ways (Friese 2001). This space can be described by connecting the concept of self to notions of modernity, of meaning, and of difference.
2. Selfhood and Modernity The idea of human beings as autonomous subjects is often taken to be characteristic of modernity as an era, or at least of the self-understanding of modernity as an ethos (see Modernity: History of the Concept). Such a view entails a conception of the self as rather continuous and coherent, since only on such a basis is the choice of a path of action and a way of life as well as the acceptance of the responsibility for one’s deeds conceivable. The concept of selfhood is here conditioned by the need to maintain a notion of human autonomy and agentiality as a basic tenet of what modernity is about, namely the possibility to shape the world by conscious human action. Such modernism, however, is not necessarily tied to the atomist and rationalist individualism of some versions of economic and political thought, most notably neoclassical economics and rational choice theory. In the light of the twentieth-century developments in sociology and (social) psychology, as mentioned above, the view that connects selfhood to modernity mostly, in all its more sophisticated forms, starts out from an assumption of constitutive sociality of the human being. Not absolute autonomy, but rather the conviction that human beings have to construct their self-identities can then be seen as characteristically modern (Hollis 1985). Unlike the concept of the rational, autonomous self, this concept is open towards important qualifications in terms of the corporeality, situatedness, and possible nonteleological character of human action (Joas 1996). Without having to presuppose the self-sustained individual of modernity, the aim is to demonstrate how autonomous selves develop through social interactions over certain phases of the life-course (Joas 1998, Straub 2001). The commitment to autonomy and agentiality, characteristic of modernity, becomes visible rather in the fact that formation of self (or, of self-identity) is here understood as the forming and determining of the durably significant orientations in a life. It is related to the formation of a consciousness of one’s own existence and thus biographically predominantly to the period of adolescence. Crises of identity occur accordingly during growing up; more precisely one should speak of life crises during the formation of one’s identity. Self-identity once constituted is seen as basically stable further on.
No necessary connection is presupposed between self-formation and individuality; theoretically, human beings may well form highly similar identities in great numbers. Since the very concept of identity is connected to continuity and coherence of self, however, stability is turned into a conceptual assumption. Objections against such a conceptualization go in two different directions. On the one hand, this discourse, which often has its roots in (social) psychology, stands in a basic tension to any culturalist concept of identity, which emphasizes meaning. On the other hand, doubts about the presupposition of continuity and coherence of selfhood employ notions of difference and alterity that are not reducible to the idea that selves are formed by relating to others, by intersubjectivity.
3. Selfhood and Meaning Some critics of the close connection between self-formation and modernity argue that every form of selfhood is dependent on the cultural resources that
are at hand to the particular human being when giving shape to their important orientations in life. Human beings give meaning to their lives by interpreting their situations with the help of moral-cultural languages that precede their own existence and surround them. Cultural determinism is the strong version of such theorizing, mostly out of use nowadays, but many current social theories adopt a weaker version of this reasoning which indeed sustains the notion of the continuity and coherence of the self but sees this self as strongly embedded in cultural contexts. Such a view of the self has its modern source in romanticism (see Romanticism: Impact on Social Thought). Philosophically, it emerged as a response to the rationalist leanings of the Enlightenment, and politically, against the conceptions of abstract freedom in individualist liberalism. Unlike cultural determinism, however, which has a strongly oversocialized conception of the human being and thus hardly any concept of self at all, romanticism emphasizes agency and creativity in the process of self-formation and self-realization. Human beings are seen in their singularity, but the relation to others is an essential and inescapable part of their understanding of their own selves. Charles Taylor’s inquiry into The Sources of the Self is a most recent and forceful restatement of this conception. Taylor starts out from the familiar argument that the advent of modernity indicates that common frameworks for moral evaluation can no longer be presumed to exist. The key subsequent contention is then that the ability and inclination to question any existing such framework of meaning does not lead into a sustainable position that would hold that no such frameworks are needed at all, a view he calls the ‘naturalist supposition’ (Taylor 1989, p. 30).
If such frameworks of meaning are what gives human beings identity, allows them to orient themselves in social and moral space, then they are not ‘things we invent’ and may as well not invent. They need to be seen as ‘answers to questions which inescapably pre-exist for us,’ or, in other words, ‘they are contestable answers to inescapable questions’ (Taylor 1989, pp. 38, 40). Taylor develops here the contours of a concept of inescapability as part of a moral-social philosophy of selfhood under conditions of modernity. Meaning-centered conceptions of selfhood have recently been underlined as a basis of communitarian positions in moral and political philosophy, such as Sandel’s concept (1982) of the ‘encumbered self.’
4. Selfhood and Otherness Arguably, these two concepts of selfhood remain within the frame of a debate in which the under- and the oversocialized views of the human being occupy the extreme points. The introduction of interaction
and intersubjectivity in self-formation has created intermediate theoretical positions, and, possibly more importantly, it has allowed different accentuations of selfhood without making positions mutually incompatible. Nevertheless, the modernity-oriented view still emphasizes a self that acts upon the world, whereas the meaning-oriented view underlines the fact that the self is provided the sense of their actions by the world of which they are a part. Since the mid-1970s, in contrast, theoretical and empirical developments have tended to break up the existing two-dimensional mode of conceptualization. On the one hand, the notion of a ‘decentering of the subject’ has been proposed mainly from poststructuralist discussions. On the other hand, the observation of both multiple and changing basic orientations in human beings has led to the proliferation of the (infelicitously coined) term ‘postmodern identity’ as a new form of selfhood. In both cases, a strong concept of self has been abandoned, in the one case on the basis of philosophical reflection, in the other, grounded on empirical observation. Both the idea of a ‘decentering’ and that of a ‘postmodern self’ question some major implications of the more standard sociological view of selfhood, namely the existence of the human self as a unit and its persistence as the ‘same’ self over time. In the former perspective, the philosophical maxim of thinking identity and difference as one double concept, rather than as one concept opposed to another, is translated into sociological thinking as the need to think self and other as a relation, rather than as a subject and a context or object. Such notions rather underline the nonidentitarian character of being by pointing to the issue of ‘the other in me’ (Emmanuel Lévinas) as a question of co-constitution rather than of interaction. While emphasized in recent poststructuralist thought, similar ideas can be found, for instance, in Arendt’s (1978, pp.
183–7) insistence on the ‘two-in-one,’ on the relation to oneself as another, as the very precondition for thought. In a broad sense, they go back to the ancient view of the friend as an ‘other self.’ And Theunissen (1977/1965) had already conceptualized self-other relations on the basis of an understanding of ‘the Other’ as referring to all ‘concepts through which philosophy interprets the presence and present of the fellow/co-human being (Mitmensch) or of the transcendental original form of this human being.’ In one particular formulation, Cavell’s reflections provide an example for a thinking about selfhood that does not presuppose an idea of identity, coherence, or consistency (Cavell 1989, 1990). He cautions against ‘any fixed, metaphysical interpretation of the idea of a self’ and against the idea of ‘a noumenal self as one’s “true self” and of this entity as having desires and requiring expression.’ In contrast, Cavell suggests that the ‘idea of the self must be such that it can contain, let us say, an intuition of partial compliance with its idea
of itself, hence of distance from itself’ or, in other words, he advocates the idea of ‘the unattained but attainable self’ (Cavell 1990, pp. 31, 34). Cavell proposes here a relation of the unattained and the attainable as constitutive for the self, that is, he makes the very question of attainability a central feature of a theory of selfhood.
5. Selfhood and Sociohistorical Transformations The development of this threefold semantic space has considerably enriched the conceptualization and analysis of selfhood in the social sciences during the twentieth century. If one considers the three fields of inquiry as explorations of the various dimensions of selfhood, rather than as mutually exclusive perspectives, the question of the constitution of the human self emerges as a problématique that concerns all human beings and for which there may be a variety of processes which can be determined generally only in their forms but not in their results. As a consequence, however, it becomes much more difficult to draw conclusions about social structure and political order from the prevailing character of selfhood. This has often been seen as a ‘weakness’ of symbolic interactionism, for instance, which Mead’s view of the self is said to have inaugurated, as a social theory that allegedly cannot address issues of societal constitution. But by the same move Mead allows for and recognizes a plurality of selves that has returned to the center of discussion today, after the renewal of a regressive synthesis of identity and society in Talcott Parsons’s work (to which, as is often overlooked, Erik Erikson contributed as well). The question of the relation between selfhood and society and politics thus needs to be rephrased as a question of comparative-historical sociology rather than merely of social theory. Some indications as to how this relation should be understood can be found when reading the historical sociology of twentieth-century societies in this light. The social upheaval during the second half of the nineteenth century with industrialization, urbanization and the phrasing of ‘the social question’ is often seen as a first ‘modern’ uprooting of established forms of selfhood, as a first massive process of ‘disembedding’ (Giddens 1990).
The development towards so-called mass societies during the first half of the twentieth century lets the question of the relation between individuation and growth of the self emerge. Totalitarianism has been analyzed in terms of an imbalance between imposed individuation and delayed self-formation, the result having been the tendency towards ‘escape from freedom’ and into stable collective identities (Fromm 1941, Arendt 1958). The first three decades after the Second World War are then considered as a form of re-embedding of selves into the institutional frames of democratic welfare societies. Most recently, the indications of dissolution and dismantling of the rather comprehensive set of
social institutions of the interventionist welfare state are one of the reasons to focus sociological debate again on questions of selfhood and identity. During this period, as some contributions argue, no longer continuity and coherence but transience, instability, and inclination to change are said to be marks of the important life orientations of contemporary human beings (see Shotter and Gergen 1989, Lash and Friedman 1992, Kellner 1995). However, many of these analyses are challenged on grounds of the limited representativity of the empirical sample or on conceptual grounds. Given the theoretical insights described above as the emergence of the threefold semantic space of selfhood, any attempt to offer a full-scale reformulation of the issue of the varieties of selfhood in different sociohistorical configurations would be adventurous. Forms of self-constitution and substantive orientations of human selves are just likely to be highly variable across large populations with varieties of experiences and sociocultural positions. However, the notion of recurring crises of modernity and the identification of historically distinct processes of disembedding and re-embedding could be the basis for a socially more specific analysis of the formation and stability of social selves (Wagner 1994). The social existence of the ‘modern’ idea that human beings construct their selves is what many societies have in common throughout the 1800s and 1900s. As such, it does not give any guidance in defining different configurations. Therefore, three qualifying criteria have been introduced. First, the existence of the idea of construction of selfhood still leaves open the question whether all human beings living in a given social context share it and are affected by it. The social permeation of the idea may be limited. Second, human beings in the process of constructing their selves may consider this as a matter of choice, as a truly modernist perspective would have it. 
In many circumstances, however, though a knowledge and a sense of the fact of social construction prevails, it may appear to human beings as almost natural, as in some sense pre-given or ascribed, which social self they are going to have. Third, the stability of any self one has chosen may vary. Such a construction of selfhood may be considered a once-in-a-lifetime occurrence, but may also be regarded as less committing and, for instance, open to reconsideration and change at a later age. These criteria widen the scope of constructability of selfhood. All conditions of construction have existed for some individuals or groups at any time during the past two centuries in the West. But the widening of the scope of construction may mark the transitions from one to another social configuration of modernity. These transitions entail social processes of disembedding and provoke transformations of social selves, in the course of which not only other selves are acquired but the possibility of construction is also more widely perceived.
See also: Culture and the Self (Implications for Psychological Theory): Cultural Concerns; Identity and Identification: Philosophical Aspects; Identity in Anthropology; Identity: Social; Interactionism and Personality; Modernity; Person and Self: Philosophical Aspects; Personality and Conceptions of the Self; Self: Philosophical Aspects
Bibliography Arendt H 1958 The Origins of Totalitarianism, 2nd edn. World Publishing Company, Cleveland, OH, and Meridian Books, New York Arendt H 1978 The Life of the Mind, Vol. 1: Thinking. Harcourt, Brace, Jovanovich, New York Cavell S 1989 The New yet Unapproachable America. Lectures After Emerson After Wittgenstein. Living Batch Press, Albuquerque, NM Cavell S 1990 Conditions Handsome and Unhandsome: The Constitution of Emersonian Perfectionism. The University of Chicago Press, Chicago Friese H 2001 Identity: Desire, name and difference. In: Friese H (ed.) Identities. Berghahn, Oxford, UK Fromm E 1941 Escape from Freedom. Holt, Rinehart, and Winston, New York Giddens A 1990 The Consequences of Modernity. Polity, Cambridge Hollis M 1985 Of masks and men. In: Carrithers M, Collins S, Lukes S (eds.) The Category of the Person: Anthropology, Philosophy, History. Cambridge University Press, Cambridge, UK and New York Joas H 1996 The Creativity of Action [trans. by Gaines J, Keast P]. Polity, Cambridge Joas H 1998 The autonomy of the self. The Meadian heritage and its postmodern challenge. European Journal of Social Theory 1(1): 7–18 Kellner D 1995 Media Culture, Cultural Studies, Identity and Politics Between the Modern and the Postmodern. Routledge, London and New York Lash S, Friedman J (eds.) 1992 Modernity and Identity. Blackwell, Oxford, UK Rorty R 1989 Contingency, Irony, and Solidarity. Cambridge University Press, Cambridge, UK Sandel M J 1982 Liberalism and the Limits of Justice. Cambridge University Press, Cambridge, UK and New York Shotter J, Gergen K J (eds.) 1989 Texts of Identity. Sage, London Straub J 2001 Personal and collective identity. A conceptual analysis. In: Friese H (ed.) Identities. Berghahn, Oxford, UK Taylor C 1989 Sources of the Self. The Making of Modern Identity. Harvard University Press, Cambridge, MA Theunissen M 1977/1965 Der Andere. de Gruyter, Berlin Wagner P 1994 A Sociology of Modernity: Liberty and Discipline.
Routledge, London and New York Wrong D H 1961 The oversocialized conception of man in modern sociology. American Sociological Review 26: 183–93
P. Wagner
Self-knowledge: Philosophical Aspects ‘Self-knowledge,’ here understood as the knowledge a creature has of its own mental states, processes, and dispositions, has come to be, along with consciousness, an increasingly compelling topic for philosophers and scientists investigating the structure and development of various cognitive capacities. Unlike consciousness, however, the capacity for self-knowledge is generally held to be the exclusive privilege of human beings or, at least, of creatures with the kind of conceptual capacity that comes with having a language of a certain rich complexity. Prima facie, self-knowledge involves more than simply having sensations (or sensory experiences). Many say it involves even more than having moods, emotions, and ‘first-order’ intentional states, paradigmatically beliefs and desires, but also hopes, fears, wishes, occurrent thoughts, and so on. Some philosophers disagree, claiming that first-order intentional states (individuated by ‘propositional content’) are themselves properly restricted to language-using creatures and, by that token, to self-knowing creatures. Nevertheless, it is generally agreed that self-knowledge involves, minimally, the capacity for a conceptually rich awareness or understanding of one’s own mental states and processes, and hence the capacity, as it is often put, to form ‘second-order’ beliefs about these, a capacity that can be extended to, or perhaps only comes with, forming beliefs about the mental states and processes of others. More generally, then, self-knowledge involves the capacity for adopting what Daniel Dennett calls the intentional stance (Dennett 1987): understanding oneself (and others) as having experiences and ways of representing the world that shape agential behavior. But this capacity should not be understood in purely spectatorial terms.
The more sophisticated a creature’s intentional understanding, the more sophisticated it will be at acting in a world partially constituted by agential activity. Thus, the capacity for intentional understanding continually plays into, and potentially modifies, an agent’s own intentionally directed activity, a point I will return to below.
1. Empirical Findings: Testing the Limits of Intentional Understanding A number of experiments have been designed to test for the presence of such a capacity in nonhuman animals (primarily hominids) and very young children, with interestingly controversial results. Some investigators claim that the evidence for intentional understanding (including a fairly sophisticated awareness of self) is persuasively strong in the case of certain primates (chimpanzees and bonobos, though for a deflationary interpretation of this evidence based on
further empirical trials, see Povinelli and Eddy (1996)). Experiments with young children (most famously the ‘false-belief task’) seem clearly to indicate that intentional understanding comes in degrees, passing through fairly discernible stages. How precisely to characterize these stages and what constitutes the mechanism of change is hotly debated (Astington et al. 1988), yet there is an emerging pattern in the empirical data that is noteworthy for standard philosophical discussions of self-knowledge. First, under certain, not very unusual conditions, young children will make mistakes in attributing basic mental states to themselves (desires and beliefs) that are inconceivable from the adult point of view because of the direct first-personal awareness adults seem to have of these states. Second, the errors children make in self-attribution mirror in kind and frequency the errors they make in attributing mental states to others; there seems to be no dramatic asymmetry between self-knowledge and knowledge of other minds, at least for children (Flavell et al. 1995, Gopnik 1993). Moreover, though children become increasingly more sophisticated in their intentional self- and other-attributions, this symmetry and systematicness in kind and frequency of error seems to persist in notable ways into adulthood, with subjects ‘confabulating’ explanations for their actions and feelings that are purportedly based on direct introspective awareness of their own minds (Nisbett and Wilson 1977). To psychologists investigating these phenomena, this pattern of results suggests that, for both children and adults, judgments about mental states and processes are affected (perhaps unconsciously) by background beliefs about how such states are caused and how they in turn cause behavior, and these beliefs can be systematically wrong, at least in detail.
There is nothing special about the capacity for self-knowledge that avoids the error-causing beliefs embedded in our ‘folk-psychology.’
2. The Philosophical Project: Explaining the Special Character of Self-knowledge Although judgments about our own mental states and processes are clearly not infallible, there are distinctive features of self-knowledge that require explanation. These features, and the interrelations among them, are variously presented in the philosophical literature, but they are generally held to consist in the following (Wright et al. 1998, pp. 1–2). (a) Immediacy: the knowledge we have of our own sensations, emotions, and intentional states is normally not based on behavioral or contextual evidence; it is groundless or immediate, seemingly involving some kind of ‘special access’ to our own mind. (b) Saliency (or, transparency): knowledge of our own minds, especially with regard to sensations and occurrent thoughts, is unavoidable; if we have a
particular thought or sensation, we typically know that we have it (such mental states are ‘self-intimating’). (c) (First-person) authority: judgments about our own minds (expressed in sincere first-person claims) are typically treated by others as correct simply by virtue of our having made them. Normally, such claims require no justification and are treated as true, so long as there is no overwhelmingly defeating counter-evidence to suggest that we are wrong about ourselves. The traditional account against which most discussions of self-knowledge react is drawn from René Descartes (Descartes 1911). Viewing the mind as a non-physical substance directly knowable only from the first-person point of view, Descartes had a ready explanation of first-person authority: we clearly and distinctly perceive the contents of our own minds, which are by their nature transparent to us. Alternatively, since we have no special perceptual access to the minds of others, the only way we can make judgments about them is by observing their external bodily movements and inferring from their similarities to ours that they are likewise caused by mental states and processes. Such weakly inferred third-person judgments are perforce dramatically less secure than first-person judgments, which are themselves infallible. On this view, the commonly acknowledged asymmetry between first- and third-person judgments is exaggerated in both directions. This Cartesian account has been discredited in almost everyone’s estimation, and not just empirically but conceptually. Even the picturesque metaphor of the ‘mind and its contents’ was held by Gilbert Ryle to encourage such bad habits of thought that nothing worthwhile could be said in its defense.
The ‘metaphysical iron curtain’ it introduced between knowledge of self and knowledge of others was, in Ryle’s view, sufficient to show the bankruptcy of the Cartesian picture by making impossible our everyday practices of correctly applying, and being able to correct our applications of, ‘mental conduct’ concepts either to others or to ourselves. Since this would undermine the possibility of a coherent language containing such concepts, it undermines the possibility of our conceptualizing ourselves intentionally at all. Thus, so far from providing an explanation of our capacity for self-knowledge, Descartes’ theory discredits it. We find out about ourselves in much the same way that we find out about other people, Ryle is famous for saying: ‘A residual difference in the supplies of the requisite data makes some differences in degree between what I can know about myself and what I can know about you, but these differences are not all in favor of self-knowledge’ (Ryle 1949, p. 155). Although Ryle exhaustively illustrated his view that the ways we find out about ourselves are as various as the kinds of things we find out, thus debarring any simple philosophical analysis of self- and other-knowledge, he is
generally thought to have promoted the doctrine of ‘logical behaviorism’ (not Ryle’s term), according to which all talk about so-called ‘mental’ states and processes (our own or anyone else’s) is really just talk about actual and potential displays of overt behavior. In fact, though clearly suspicious of philosophical defenses of first-person authority based on a principled epistemic privilege, Ryle is more profitably read as following Wittgenstein in trying to reveal how the ‘logical’ (or ‘grammatical’) structure of a pervasive way of thinking about the ‘mind and its contents’ inevitably generates self-defeating explanations of various phenomena, including our capacity for self-knowledge. Any substantial account of such phenomena must therefore exemplify a different kind of logical structure altogether (see discussions of Wittgenstein’s view in McDowell 1991, Wright 1991).
3. Current Proposals: ‘Causal–Perceptual’ vs. ‘Constitutive’ Explanations of Self-knowledge This proposed opposition between different ‘logical types’ of explanation is given more substance in contemporary work, with philosophers pursuing two quite different methodological approaches to understanding the purported immediacy, saliency, and authority of self-knowledge. The first, ‘causal–perceptual’ approach focuses on the mechanics of self-knowledge, including the nature of the states we are purportedly making judgments about, the nature of the states that constitute those judgments, and the nature of the connection between them. The second, so-called ‘constitutive’ approach focuses on the larger implications of our capacity for self-knowledge, including what role it plays in specifying conditions for the possibility of our powers and practices of agency. Adopting either one of these approaches does not rule out addressing the questions featured by the alternative approach, although the sorts of answers given are materially shaped by which questions occupy center field.
3.1 Causal–Perceptual Accounts of Self-knowledge According to some philosophers, the basic defects of the Cartesian view stem not from the idea that first-person knowledge is a form of inner perception, but from its dualistic metaphysics and an overly restricted conception of what perception involves. On Descartes’s ‘act-object’ model of perception, mental states are objects before the mind’s ‘inner eye’ which we perceive (as we do external objects) by the act of discerning their intrinsic (nonrelational) properties. Without mounting any radical ‘grammatical’ critique, problems with this view can be readily identified,
especially once the mind is understood to be part of the material world (see Shoemaker 1996, Wright et al. 1998). (a) There is no organ of inner perception analogous to the eye (but we do have a capacity for sensing the position of our bodies, by proprioception, and this does not require any specialized organ; perhaps introspection is like proprioception). (b) Sensations could possibly count as ‘objects’ of inner perception, but intentional states (beliefs and desires) are less plausible candidates. First, phenomenologically, we do not seem to be aware of our intentional states as discernibly perceptible in the way that sensations are; there is no perceptual experience that precedes, or comes with, our judgments about such states—no ‘qualia.’ Second, intentional states are individuated by their propositional content, and their propositional content is plausibly determined by (causal) relational features these states bear to the subject’s physical or social environment, rather than intrinsic features of the states themselves. If the ‘externalist’ view of content is right, then authoritative self-knowledge of these states cannot be based on inwardly perceiving their intrinsic properties (see Ludlow and Martin 1998). (c) In contrast with external perception, there are no mediating perceptual experiences between the object perceived and judgments based on those perceivings. If mental states and processes are objects of inner perception, they must be ‘self-intimating’ objects, where the object of perception is itself a perceptual experience of that object. Taken literally, this idea is hard to make coherent, though some have found it invitingly appropriate in the case of sensations as a way of marking what is so unusually special and mysterious about them. Many of these problems (and more) are satisfactorily addressed by adopting a ‘tracking’ model of inner perception (Armstrong 1968). 
We do have ‘special access’ to our own minds in so far as the second-order beliefs we form about our (ontologically distinct) first-order states and processes are reliably caused, perhaps subcognitively, by the very first-order states and processes such beliefs are about. There need be no distinctive phenomenology that characterizes such access, and no need therefore to give different accounts of our self-knowledge of intentional states versus sensations, imaginings, and other phenomenologically differentiated phenomena. Of course, if all such states are functionally defined (that is, dispositionally, in terms of their causal relations to other mental states, to perceptual inputs and to behavioral outputs), there is nothing about them that makes them in principle accessible only to the person whose states they are. But this is a strength of the approach. Such states are accessible in a special way to the subject (by normally and reliably causing the requisite second-order beliefs), thereby accounting for the immediacy, saliency, and authority of our first-person judgments.
Yet because causal mechanisms can break down in particular cases, first-person error may be expected (as in self-deception). Such errors may also be detected: since first-order states are constitutively related to a subject’s linguistic and nonlinguistic behavior, significant evidence of noncohesiveness between the subject’s sayings and doings may be used to override her self-ascriptions. Moreover, there is no inconsistency between this account of the special character of self-knowledge and psychologists’ findings of systematic error in self- and other-attribution. It may be just as they suggest: since the concepts in terms of which we express our second-order beliefs (about self and other) are functionally defined in terms of folk-psychological theory, systematic error may result from the theory’s deficiencies.
3.2 Constitutive Accounts of Self-knowledge The causal–perceptual approach to self-knowledge has many advantages, including, it would seem, the ontological and conceptual independence of first-order states from the second-order beliefs that track them in self-knowers. For this means there is no conceptual bar on attributing beliefs, desires, and so forth to creatures that lack the capacity for self-knowledge but still behave rationally enough to be well predicted from the intentional stance. Why then do some philosophers insist that having such first-order intentional states is constitutively linked to having authoritative self-knowledge, and therefore, presumably, to a subject’s forming second-order beliefs about her first-order states for which ‘truth is the default condition’ (Wright 1991)? To begin with a negative answer: if causal–perceptualists are right in saying that the subject’s status as an authoritative self-knower is merely contingently dependent on a reliable causal mechanism linking first- and second-order states, then we could not account for the special role self-knowledge plays in constituting that subject as a rational and responsible agent (Bilgrami 1998, Burge 1996, Shoemaker 1996). How so? Consider, for instance, the causal–perceptualist account of error in self-attribution: The subject forms a false second-order belief due to a breakdown in the causal mechanism normally linking that belief to the appropriate first-order state. Her lapse of rationality consists in her saying one thing (because of the second-order belief) and doing another (because of her first-order states). But why count this a lapse of rationality (as surely we do), rather than simply a self-misperception, analogous (except for the causal pathway) to misperceptions of others?
What imperative is there for a subject’s self-attributions to line up with her other sayings and doings in order for her to be a coherent subject, responsible and responsive to the norms embedded in our folk-psychological conception of rational agency?
Different authors take various tacks in answering this question, but there is a discernible common thread. An agent cannot act linguistically or nonlinguistically in ways that respond self-consciously to these norms, unless she knows in a directive sense what she is doing, i.e., unless she recognizes herself as acting in accord with certain intentional states that she knows to be hers, and knows in a way that acknowledges her own agency in producing and maintaining them. Moreover, an agent is self-consciously responsive to norms that govern her own intentional states (such as believing that p), not just by recognizing what counts as norm-compliant behavior, but actually by regulating herself (her judgments, thoughts, and actions) in accord with such norms. On this picture, if a person’s sincere claims about herself do not cohere with the (other) things she does and says, we may say she fails to know herself. But, just as importantly, we say that she acts unknowingly. That is, there is some sense (perhaps pathological) in which she acts, but without authoring her own acts as a free and responsible agent. Hence, she does not authorize such acts; they are out of her reflective control. To fail in self-knowledge on this account is not to be wrong in the passively spectatorial sense that one can be wrong about others; it is to fail more actively in exercising one’s powers of agency. Constitutive explanations of self-knowledge display a very different logical structure from causal–perceptual accounts. Consequently, they suggest a different way of conceptualizing the intentional capacities of self-knowers, a way that has become needlessly complicated through retaining the traditional architecture of first-order states and second-order beliefs about them that are ontologically, if not conceptually, distinct and that supposedly underlie and get expressed in first-person claims (McGeer 1996).
This distinction makes sense on a causal–perceptualist account of self-knowledge; but it only confuses and obscures on the constitutive account, leading, for instance, to inappropriate descriptions of what distinguishes us from other animals. The constitutive explanation of self-knowledge focuses on what it means to be an intentional creature of a certain sort, an intentional creature that is capable of actively regulating her own intentional states, revising and maintaining them in self-conscious recognition of various norms. No doubt there are automatic mechanisms also involved in regulating intentional states in an environmentally useful way, mechanisms we may well share with nonlinguistic creatures and which explain why adopting the intentional stance towards them works as well as it does (Shoemaker 1996, essay 11). But in our own case, we increase the range and power of the intentional stance by understanding what rational, responsible agency requires of us and molding ourselves to suit. The apparatus of first- and second-order states may play some useful role in further elucidating this capacity for intentional self-regulation, but more likely
it encourages bad habits of thought, a lingering vestige of the dying, but never quite dead, Cartesian tradition in epistemology and philosophy of mind. See also: Culture and the Self (Implications for Psychological Theory): Cultural Concerns; Person and Self: Philosophical Aspects; Self-development in Childhood; Self: History of the Concept; Self: Philosophical Aspects
Bibliography
Armstrong D M 1968 A Materialist Theory of the Mind. Routledge and Kegan Paul, London
Astington J W, Harris P L, Olson D R (eds.) 1988 Developing Theories of Mind. Cambridge University Press, Cambridge, UK
Bilgrami A 1998 Self-knowledge and resentment. In: Wright C, Smith B C, Macdonald C (eds.) Knowing Our Own Minds. Clarendon Press, Oxford, UK
Burge T 1996 Our entitlement to self-knowledge. Proceedings of the Aristotelian Society 96: 91–111
Dennett D C 1987 The Intentional Stance. MIT Press, Cambridge, MA
Descartes R 1911 Meditations on first philosophy. In: Haldane E S, Ross G R T (eds.) The Philosophical Works of Descartes. Cambridge University Press, Cambridge, UK, Vol. 1, pp. 131–200
Flavell J H, Green F L, Flavell E R 1995 Young Children’s Knowledge About Thinking. University of Chicago Press, Chicago
Gopnik A 1993 How we know our minds: The illusion of first-person knowledge of intentionality. Behavioral and Brain Sciences 16: 1–14
Ludlow P, Martin N (eds.) 1998 Externalism and Self-knowledge. Cambridge University Press, Cambridge, UK
McDowell J 1991 Intentionality and interiority in Wittgenstein. In: Puhl K (ed.) Meaning Scepticism. W. de Gruyter, Berlin, pp. 168–9
McGeer V 1996 Is ‘Self-knowledge’ an empirical problem? Renegotiating the space of philosophical explanation. Journal of Philosophy 93: 483–515
Nisbett R E, Wilson T D 1977 Telling more than we can know: Verbal reports on mental processes. Psychological Review 84(3): 231–59
Povinelli D J, Eddy T J 1996 What young chimpanzees know about seeing. Monographs of the Society for Research in Child Development 61(3): 1–190
Ryle G 1949 The Concept of Mind. Hutchinson’s University Library, London
Shoemaker S 1996 The First Person Perspective and Other Essays. Cambridge University Press, Cambridge, UK
Wright C 1991 Wittgenstein’s later philosophy of mind: Sensation, privacy and intention. In: Puhl K (ed.) Meaning Scepticism. W. de Gruyter, Berlin, pp. 126–47
Wright C, Smith B C, Macdonald C (eds.) 1998 Knowing Our Own Minds. Clarendon Press, Oxford, UK
V. McGeer
Self-monitoring, Psychology of According to the theory of self-monitoring, people differ in the extent to which they monitor (i.e., observe and control) their expressive behavior and self-presentation (Snyder 1974, 1987). Individuals high in self-monitoring are thought to regulate their expressive self-presentation for the sake of public appearances, and thus be highly responsive to social and interpersonal cues to situationally appropriate performances. Individuals low in self-monitoring are thought to lack either the ability or the motivation to regulate their expressive self-presentations for such purposes. Their expressive behaviors are thought instead to reflect their own inner states and dispositions, including their attitudes, emotions, self-conceptions, and traits of personality. Research on self-monitoring typically has employed multi-item, self-report measures to identify people high and low in self-monitoring. The two most frequently employed measuring instruments are the 25 true–false items of the original Self-monitoring Scale (Snyder 1974) and an 18-item refinement of this measure (Gangestad and Snyder 1985; see also Lennox and Wolfe 1984). Empirical investigations of testable hypotheses spawned by self-monitoring theory have accumulated into a fairly sizable published literature (for a review of the literature, see Gangestad and Snyder 2000).
1. Major Themes of Self-monitoring Theory and Research Soon after its inception, and partially in response to critical theoretical issues of the times, self-monitoring was offered as a partial resolution of the ‘traits vs. situations’ and ‘attitudes and behaviors’ controversies in personality and social psychology. The propositions of self-monitoring theory suggested that the behavior of low self-monitors ought to be predicted readily from measures of their attitudes, traits, and dispositions whereas that of high self-monitors ought to be best predicted from knowledge of features of the situations in which they operate. Self-monitoring promised a ‘moderator variable’ resolution to debates concerning the relative roles of person and situation in determining behavior. These issues set the agenda for the first generation of research and theorizing on self-monitoring, designed primarily to document the relatively ‘situational’ orientation of high self-monitors and the comparatively ‘dispositional’ orientation of low self-monitors (for a review, see Snyder 1987). In a second generation of research and theorizing, investigations moved beyond issues of dispositional and situational determination of behavior to examinations of the linkages between self-monitoring and
interpersonal orientations. Perhaps the most prominent of these programs concerns the links between expressive control and interpersonal orientations, as revealed in friendships, romantic relationships, and sexual involvements (e.g., Snyder et al. 1985). Other such programs of research concern advertising, persuasion, and consumer behavior (e.g., Snyder and DeBono 1985), personnel selection (e.g., Snyder et al. 1988), organizational behavior (Caldwell and O’Reilly 1982, Kilduff 1992), socialization and developmental processes (e.g., Eisenberg et al. 1991, Graziano and Waschull 1995), and cross-cultural studies (e.g., Gudykunst 1985). Central themes in these programs of research have been that high self-monitors live in worlds of public appearances created by strategic use of impression management, and that low self-monitors live in worlds revolving around the private realities of their personal identities and the coherent expression of these identities across diverse life domains. Consistent with these themes, research on interpersonal orientations has revealed that high, relative to low, self-monitors choose as activity partners friends who will facilitate the construction of their own situationally-appropriate images and appearances (e.g., Snyder et al. 1983). Perhaps because of their concern with images and appearances, high self-monitors have romantic relationships characterized by less intimacy than those of low self-monitors.
Also consistent with these themes, explorations of consumer attitudes and behavior have revealed that high self-monitors value consumer products for their strategic value in cultivating social images and public appearances, reacting positively to advertising appeals that associate products with status and prestige; by contrast, low self-monitors judge consumer products in terms of the quality of the products stripped of their image-creating and status-enhancing veneer, choosing products that they can trust to perform their intended functions well (e.g., DeBono and Packer 1991). These same orientations manifest themselves in the workplace as well, with high self-monitors preferring positions that call for the exercise of their self-presentational skills; thus, for example, high self-monitors perform particularly well in occupations that call for flexibility and adaptiveness in dealings with diverse constituencies (e.g., Caldwell and O’Reilly 1982) whereas low self-monitors appear to function best in dealing with relatively homogeneous work groups. It should be recognized that, although these programs of research, for the most part, have not grounded their hypotheses or interpretations in self-monitoring’s traditionally fertile ground (issues concerning the dispositional vs. situational control of behavior), they do nevertheless reflect the spirit of the self-monitoring construct. That is, their guiding themes represent clear expressions of self-monitoring theory’s defining concerns with the worlds of public appearances and social images, and the processes by
which appearances and images are constructed and sustained. However, it should also be recognized that these lines of research go beyond showing that individual differences in concern for cultivating public appearances affect self-presentational behaviors. These programs of research have demonstrated that these concerns, and their manifestations in expressive control, permeate the very fabric of individuals’ lives, affecting their friendship worlds, their romantic lives, their interactions with the consumer marketplace, and their work worlds.
2. The Nature of Self-monitoring Despite their generativity, the self-monitoring construct and its measure have been the subject of considerable controversy over how self-monitoring ought to be interpreted and measured. The roots of this controversy are factor analyses that clearly reveal that the items of the Self-monitoring Scale are multifactorial, with the emergence of three factors being the most familiar product of these factor analyses (Briggs et al. 1980). These factor analyses, and attempts to interpret them, have stimulated a critically important question: Is self-monitoring truly a unitary phenomenon? Although there is widespread agreement about the multifactorial nature of the items of the Self-monitoring Scale, there exist diverging viewpoints on the interpretation of this state of affairs. One interpretation is that some criterion variables represented in the literature might relate to one factor, other criterion variables to a second independent factor, and yet others to still a third factor—an interpretation which holds that self-monitoring is not a unitary phenomenon (e.g., Briggs and Cheek 1986). Without disputing the multifactorial nature of the self-monitoring items, it is nevertheless possible to construe self-monitoring as a unitary psychological construct. Taxonomic analyses have revealed that the self-monitoring subscales all tap, to varying degrees, a common latent variable that may reflect two discrete or quasidiscrete self-monitoring classes (Gangestad and Snyder 1985). In addition, the Self-monitoring Scale itself taps a large common factor accounting for variance in its items and correlating, to varying degrees, with its subscales; this general factor approximates the Self-monitoring Scale’s first unrotated factor (Snyder and Gangestad 1986). 
Thus, the Self-monitoring Scale may ‘work’ to predict diverse phenomena of individual and social functioning because it taps this general factor; this interpretation is congruent with self-monitoring as a unitary, conceptually meaningful psychological construct. Although much of the debate about the nature of the self-monitoring construct has focused on contrasting interpretations of the internal structure of the Self-monitoring Scale, it is possible to consult another source of evidence with which to address the major issues of the self-monitoring controversy: the literature on the Self-monitoring Scale’s relations with criterion variables external to the scale itself (i.e., behavioral, behavioroid, and performance measures of phenomena relevant to self-monitoring theorizing). Based on a quantitative review of the literature on the Self-monitoring Scale’s relations with behavioral and behavioroid external criterion variables, it appears that, with some important exceptions, a wide range of external criteria tap a dimension directly measured by the Self-monitoring Scale (Gangestad and Snyder 2000). Based on this quantitative appraisal of the self-monitoring literature, it is possible to offer some specifications of what self-monitoring is and what it is not, specifications that may guide the next generations of theory and research on self-monitoring. That is, it is possible to identify ‘exclusionary messages’ about features of self-monitoring theory that should not receive the attention heretofore accorded them (e.g., delimiting the scope of self-monitoring as a moderator variable such that claims about peer-self agreement ought no longer be made, although claims about behavioral variability may yet be made), and to identify ‘inclusionary messages’ about features that should define the evolving agenda for theory and research on self-monitoring (e.g., focusing on the links between self-monitoring and strategic motivational agendas associated with engaging in, or eschewing, impression management tactics that involve the construction of social appearances and the cultivation of images).
3. Conclusions To some extent, the productivity and generativity of the self-monitoring construct may derive from the fact that it appears to capture one of the fundamental dichotomies of psychology—whether behavior is a product of forces that operate from outside of the individual (exemplified by the ‘situational’ orientation of the high self-monitor) or whether it is governed by influences that guide from within the individual (typified by the ‘dispositional’ orientation of the low self-monitor). In theory and research, self-monitoring has served as a focal point for issues in assessment, in the role of scale construction in theory building, and in examining fundamental questions about personality and social behavior, particularly those concerning how individuals incorporate inputs from their own stable and enduring dispositions and inputs from the situational contexts in which they operate into agendas for action that guide their functioning as individuals and as social beings.
See also: Impression Management, Psychology of; Personality and Conceptions of the Self; Self-conscious Emotions, Psychology of; Self-evaluative Process, Psychology of; Self-knowledge: Philosophical Aspects
Bibliography
Briggs S R, Cheek J M 1986 The role of factor analysis in the development and evaluation of personality scales. Journal of Personality 54: 106–48
Briggs S R, Cheek J M, Buss A H 1980 An analysis of the self-monitoring scale. Journal of Personality and Social Psychology 38: 679–86
Caldwell D F, O’Reilly C A 1982 Boundary spanning and individual performance: The impact of self-monitoring. Journal of Applied Psychology 67: 124–27
DeBono K G, Packer M 1991 The effects of advertising appeal on perceptions of product quality. Personality and Social Psychology Bulletin 17: 194–200
Eisenberg N, Fabes R, Schaller M, Carlo G, Miller P 1991 The relation of parental characteristics and practices to children’s vicarious emotional responding. Child Development 62: 1393–1408
Gangestad S, Snyder M 1985 ‘To carve nature at its joints’: On the existence of discrete classes in personality. Psychological Review 92: 317–49
Gangestad S, Snyder M 2000 Self-monitoring: appraisal and reappraisal. Psychological Bulletin 126: 530–55
Graziano W G, Waschull S B 1995 Social development and self-monitoring. Review of Personality and Social Psychology 15: 233–60
Gudykunst W B 1985 The influence of cultural similarity, type of relationship, and self-monitoring on uncertainty reduction processes. Communication Monographs 52: 203–17
Kilduff M 1992 The friendship network as a decision-making resource: dispositional moderators of social influences on organization choice. Journal of Personality and Social Psychology 62: 168–80
Lennox R, Wolfe R 1984 Revision of the self-monitoring scale. Journal of Personality and Social Psychology 46: 1349–64
Snyder M 1974 Self-monitoring of expressive behavior. Journal of Personality and Social Psychology 30: 526–37
Snyder M 1987 Public appearances, Public realities: The Psychology of Self-monitoring. W. H. Freeman, New York
Snyder M, Berscheid E, Glick P 1985 Focusing on the exterior and the interior: Two investigations of the initiation of personal relationships. Journal of Personality and Social Psychology 48: 1427–39
Snyder M, Berscheid E, Matwychuk A 1988 Orientations toward personnel selection: Differential reliance on appearance and personality. Journal of Personality and Social Psychology 54: 972–9
Snyder M, DeBono K G 1985 Appeals to image and claims about quality: Understanding the psychology of advertising. Journal of Personality and Social Psychology 49: 586–97
Snyder M, Gangestad S 1986 On the nature of self-monitoring: Matters of assessment, matters of validity. Journal of Personality and Social Psychology 51: 125–39
Snyder M, Gangestad S, Simpson J A 1983 Choosing friends as activity partners: The role of self-monitoring. Journal of Personality and Social Psychology 45: 1061–72
M. Snyder
Self-organizing Dynamical Systems Interactions, nonlinearity, emergence and context, though omnipresent in the social and behavioral sciences, have proved remarkably resistant to understanding. New light may be shed on these problems by virtue of the introduction and development of theoretical concepts and methods of self-organization and the mathematical tools of (nonlinear) dynamical systems (Haken 1996, Kelso 1995). Self-organized dynamics promises both a language for, and a strategy toward, understanding human behavior on multiple levels of description. A key problem on any given level of description is to identify the essential dynamical variables characterizing the formation and change of behavioral patterns so that the pattern dynamics, the rules governing behavior, may be found and their predictions pursued. Dynamics offers more than a concise mathematical description of different kinds and classes of behavior. Dynamical models provide new measures and predict new phenomena often not observed before in studies of individual human and social behavior. And they may help explain effects previously attributed to more static, nondynamical mechanisms. Therein lies the promise of dynamics for the social and behavioral sciences at large. Dynamics offers a new way to think about, perhaps even solve, old problems. Here, following a brief historical introduction, the main ideas of dynamical systems are described in a nontechnical fashion. The connection between notions of self-organization and nonlinear dynamical systems is then addressed as a foundation for understanding behavioral pattern formation, flexibility, and change. A flavor of the approach, which stresses an intimate relationship between theory and experiment, is given in the context of a few examples from the behavioral and social sciences. Finally, some extensions of the approach are discussed briefly, including a way to incorporate meaningful information into the self-organizing dynamics.
1. A Short History of Dynamical Systems For the eighteenth century Scottish philosopher David Hume, all the sciences bore a relation to the human mind. In his Treatise of Human Nature (1738), Hume first divided the mind into its contents: ideas and impressions. Then he added dynamics, noting the impossibility of simple ideas forming more complex ones without some bond of union between them. Hume’s three dynamical laws for the association of ideas (resemblance, contiguity, and cause and effect) were thought to be responsible for controlling all mental operations. A kind of attraction, Hume thought, existed in the mental world: the motion of the mind was conceived as analogous to the motion of a body, as described earlier by Newton. Mental ‘stuff’ was governed (somehow) by dynamics. All this has a strangely contemporary ring to it. Dynamics is not only the language of the natural sciences, but has permeated the social, behavioral, and neurosciences as well (Arbib et al. 1998, Kelso 1995, Vallacher and Nowak 1997). Cognitive science, whose theory and paradigms are historically embedded in the language of the digital computer, may be the most recent of the human sciences to fall under the spell of dynamics (Port and van Gelder 1995, Thelen and Smith 1994). Which factors led to the primacy of dynamics as the language of science, whether natural or human? First was the insight of the French mathematician Henri Poincaré to put dynamics into geometric form. Poincaré introduced an array of powerful methods for visualizing how things behave, somewhat independent of the things themselves. Second, following Poincaré’s lead, were developments in understanding nonlinear dynamical systems from a mathematical point of view. These are sets of differential equations that can exhibit multiple solutions for the same parameters and are capable of highly irregular, even unpredictable behavior. Third was the ability to visualize how these equations behave by integrating them numerically on a digital computer. Parameters can be varied and parameter spaces explored without doing a ‘real’ experiment or collecting data of any kind.
In fact, computer simulation and visualization is often the only possible method of exploring complex behavior. Finally, were quite specific examples of cooperative (or ‘collective’) behavior in physics (e.g., fluids, lasers), chemistry (e.g., chemical reactions), biology (e.g., morphogenesis), and later, in the behavioral sciences. The behavioral experiments (Kelso 1984) and consequent theoretical modeling (Haken et al. 1985) showed for the first time that dynamic patterns of human behavior, how behavior forms, stabilizes, and changes, may be described quite precisely in the mathematical language of nonlinear dynamical systems. Not only were new effects predicted and observed, it was possible to derive these emergent patterns and the pattern dynamics by nonlinearly coupling the component parts. Since these early empirical and theoretical developments, ideas of self-organization and dynamics have infiltrated the social, behavioral, economic, and cognitive sciences, not to speak of literature and the arts.
2. Dynamical Systems: Some Terminology What exactly is a dynamical system? It is not possible here to do justice to the mathematics of dynamical systems and its numerous technical applications, for example, in time series analysis (e.g., Strogatz 1994). For present purposes, only the basics are considered. Fundamentally, a dynamical system pertains to anything (such as the behavior of a human being or a social group) that evolves over time. Mathematically, if $X_1, X_2, \ldots, X_n$ are the variables characterizing the system’s behavior, a dynamical system is a system of equations stipulating the temporal evolution of $X$, the vector of all permissible X-values. Therein lies the rub: in the social and behavioral sciences one has to find these Xs and identify their dynamics on a chosen level of description. If $X$ is a continuous function of time, then the dynamics of $X$ is typically described by a set of first order ordinary differential equations (ODEs). The order of the equation refers to the highest derivative that appears in the equation. Thus, $\dot{X} = F(X)$ is a first order equation, where the dot above the $X$ denotes the first derivative with respect to time and $F(X)$ gives the vector field. ODEs are of greatest interest when $F$ is nonlinear because, depending on the particular form of $F$, a wide variety of behaviors can arise. Another important class of dynamical system is difference equations or maps. Maps are often useful when the behavior of a system appears to be organized around discrete temporal events. Maps and ODEs provide complementary ways to model and study nonlinear dynamical behavior, but we shall not pursue maps further here (see, however, van Geert 1995 as an example of the approach). A vector field is defined at each and every point of the state or phase space of the system, defined by the values of the variables $X_1, X_2, \ldots, X_n$. As these states evolve in time they can be represented graphically as a trajectory. The state space is filled with trajectories.
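The idea of a trajectory generated by a first order system $\dot{X} = F(X)$ can be made concrete with a few lines of numerical integration. The following is a minimal sketch, not a method taken from the article: the classical fourth-order Runge-Kutta scheme and the damped oscillator used as $F$ are assumptions chosen purely for illustration, as are all function names.

```python
from typing import Callable, List

State = List[float]
VectorField = Callable[[State], State]


def rk4_step(F: VectorField, x: State, dt: float) -> State:
    """Advance the state x by one time step dt (classical Runge-Kutta)."""

    def shifted(base: State, direction: State, scale: float) -> State:
        # Return base + scale * direction, component by component.
        return [b + scale * d for b, d in zip(base, direction)]

    k1 = F(x)
    k2 = F(shifted(x, k1, dt / 2.0))
    k3 = F(shifted(x, k2, dt / 2.0))
    k4 = F(shifted(x, k3, dt))
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def trajectory(F: VectorField, x0: State, dt: float, steps: int) -> List[State]:
    """Return the list of states visited, starting from the initial condition x0."""
    states = [list(x0)]
    for _ in range(steps):
        states.append(rk4_step(F, states[-1], dt))
    return states


def damped_oscillator(x: State) -> State:
    """Hypothetical vector field: X = (position, velocity) with linear damping."""
    position, velocity = x
    return [velocity, -position - 0.5 * velocity]


path = trajectory(damped_oscillator, [1.0, 0.0], dt=0.01, steps=5000)
print(path[0], path[-1])
```

Each entry of `path` is one point of the trajectory in the two-dimensional state space; plotting position against velocity would show the curve spiralling in toward the rest state.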
At any point in a given trajectory, a velocity vector can be determined by differentiation. These velocity vectors prescribe a velocity vector field, which is assumed to represent the flow or behavior of the dynamical system (Abraham and Shaw 1982). This flow often has characteristic tendencies particularly in dissipative systems, named thus because their state space volume decreases or dissipates over time toward attractors. In other words, trajectories converge over time (technically, as time goes to infinity) to subsets of the phase space (a given attractor’s basin of attraction). One may even think of an attractor as a representation of the goal of the behaving system (Saltzman and Kelso 1987). The term transient defines the segment of a trajectory starting from some initial condition in the basin of attraction until it settles into an attractor. Attractors can be fixed points, in which all initial conditions converge to some stable rest state. Attractors can be periodic, exhibiting preferred rhythms or orbits on which the system settles regardless of where it starts. Or, there can be so-called strange attractors; strange because they exhibit deterministic chaos, a type of irregular behavior resembling random noise, yet often containing pockets of quite ordered behavior. The presence of chaos in physical systems is ubiquitous, and there is some evidence suggesting that it may also play an important role in certain biological systems, including the human brain and its disorders (Kelso and Fuchs 1995). Chaos means sensitive dependence on initial conditions: nudging a chaotic system by just a hair can change its future behavior dramatically. In the vernacular, human behavior certainly seems ‘chaotic’ sometimes. However, although an area of active interest in many research fields, it remains unclear whether chaos, and its close relative, fractals (Mandelbrot 1982), provide any essential insights in the social and behavioral sciences. What may be rather more important to appreciate is that parameters, so-called control parameters, can move a system through a vast array of different kinds of dynamic behavior of which chaos is only one. Thus, when a parameter changes smoothly, the attractor in general also changes smoothly. In other words, sizeable changes in the input have little or no effect on the resulting output. However, when the control parameter passes through a critical point or threshold in an intrinsically nonlinear system an abrupt change in the attractor can occur. This sensitive dependence on parameters is called a bifurcation in mathematics, or a nonequilibrium phase transition in physical theories of pattern formation (Haken 1977).
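Sensitive dependence on initial conditions is easy to see numerically. The sketch below uses the logistic map $x_{n+1} = r\,x_n(1 - x_n)$ at $r = 4$, a standard textbook example of deterministic chaos that is not discussed in this article and is assumed here only for illustration; the function name and starting values are likewise hypothetical.

```python
from typing import List


def logistic_orbit(x0: float, r: float = 4.0, steps: int = 100) -> List[float]:
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two initial conditions that differ by one part in ten billion.
reference = logistic_orbit(0.2)
nudged = logistic_orbit(0.2 + 1e-10)

# Distance between the two orbits at every iteration.
separation = [abs(p - q) for p, q in zip(reference, nudged)]

for n in (0, 5, 10, 20, 30, 40, 50):
    print(n, separation[n])
```

The printed separations start out negligible and grow until the two orbits bear no resemblance to each other, even though the rule generating them is fully deterministic.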
3. Concepts of Self-organization Dynamical theory, by itself, does not give us any essential insights into how patterns of behavior may be created in complex systems that contain multiple (often different) elements interacting nonlinearly with each other and their environment. This is where the concept of self-organization is helpful (Haken 1977, Nicolis and Prigogine 1977). Self-organization refers to the spontaneous formation of pattern and pattern change in complex systems whose elements adapt to the very patterns of behavior they create. Think of birds flocking, fish schooling, bees swarming. There is no ‘self’ inside the system ordering the elements, dictating to them what to do and when to do it. Rather, the system, which includes the environment, literally organizes itself. Inevitably, when the elements form a coupled system with the world, coordinated patterns of behavior arise. Emergent pattern reflects the collective behavior of the elements. Collective behavior as a result of self-organization reduces the very large number of degrees of freedom into a much smaller set of relevant dynamical variables called, appropriately enough, collective variables. The enormous compression of degrees of freedom near critical points can arise because events occur on different timescales: the faster individual elements in the system may become ‘enslaved’ to the slower, emergent pattern or collective variables, and lose their status as relevant behavioral quantities (Haken 1977). Alternatively, one may conceive of a hierarchy of timescales for various processes underlying human behavior. On a given level of the hierarchy are pattern variables subject to constraints (e.g., of the task) that act as boundary conditions on the pattern dynamics. At the next level down are component processes and events that typically operate on faster timescales. Notice that in this scheme (Kelso 1995), the key is to choose a level of description and understand the relation between adjacent levels, not to reduce to some ‘fundamental’ lower level. So what causes behavioral patterns to form? And what causes pattern change? It is here that the connection between processes of self-organization and dynamical systems theory becomes quite explicit. In complex systems that are open to exchanges of information with their environment, naturally occurring environmental conditions or intrinsic, endogenous factors may take the form of control parameters in a nonlinear dynamical system. When the control parameter crosses a critical value, instability occurs, leading to the formation of new (or different) patterns. Dynamic instability is the generic mechanism underlying spontaneous self-organized pattern formation and change in all systems coupled to their internal or external environment. The reason is that near instability the individual elements, in order to adjust to current conditions (control parameter values), must order themselves in a new or different way. Fluctuations are always present, constantly testing whether a given behavioral pattern is stable.
Fluctuations are not just noise; rather, they allow the system to discover new, more adaptive behavioral patterns. The patterns that emerge and change near instabilities have a rich and highly nonlinear pattern, or coordination dynamics. Herein lies the basis of the hypothesis that human beings and the basic forms of behavior they produce, may be understood as self-organizing dynamical systems (Kelso 1995). Humans, complex systems consisting of multiple, interacting elements, produce behavioral patterns that are captured by collective variables. The resulting dynamical rules are essentially nonlinear and thus capable of producing a rich repertoire of behaviors. By this hypothesis, all human behavior, even human cognitive development (Magnusson 1995, Thelen and Smith 1994) and learning (Zanone and Kelso 1992), which occur on very different time scales, arises because complex material systems, through the process of self-organization, create a dynamic, pattern forming system that is capable of both behavioral simplicity (such as fixed point behavior) and enormous, even creative, behavioral complexity (such as chaotic behavior). No new principles, it seems, need be introduced (but see Sect. 5).
4. Finding Dynamical Laws in Behavioral and Social Systems In the social and behavioral sciences the key collective variables characterizing the system’s state space are seldom known a priori and have to be identified. Science always needs special entry points, places where irrelevant details may be pruned away while retaining the essential aspects of the behavior one is trying to understand. Unique to the perspective of self-organized dynamical systems is its focus on qualitative change, places where behavior bifurcates. The reason is that qualitative change affords a clear distinction between one pattern and another, thereby enabling one to identify the key pattern variables that define behavioral states. Likewise, any parameter that induces qualitative behavioral change qualifies as a control parameter. The payoff from knowing collective pattern variables and control parameters is high: they enable one to obtain the dynamical rules of behavior on a chosen level of description. By adopting the same strategy the next level down, the individual component dynamics may be studied and identified. It is the interaction between these that creates the patterns at the level above, thereby building a bridge across levels of description.
5. The Laws that Bind Us It has long been recognized that social relationships are an emergent product of the process of interaction. Commonly studied relationships are the mother–infant interaction, the marriage bond and the patient–therapist relation, all of which are extremely complicated and difficult to understand. A more direct approach may be to study simpler kinds of social coordination, with the aim of determining whether such emergent behavior is self-organized, and if so what its dynamics are. As an illustrative example, consider two people coordinating their movements. In this particular task each individual is instructed to oscillate a limb (the lower leg in this case) in the same or opposite direction to that of the other (Schmidt et al. 1990). Obviously the two people must watch each other in order to do the task. Then, either by an instruction from the experimenter or by following a metronome whose rate systematically increases, the social dyad also must speed up their movements. When moving their legs up and down in the same direction, the two members of the dyad can remain
Figure 1 The potential, V(φ), of Eqn. (2) (with Δω = 0) and Eqn. (4) (with Δω ≠ 0). Black balls symbolize stable coordinated behaviors and white balls correspond to unstable behavioral states (see text for details)
synchronized throughout a broad range of speeds. However, when moving their legs in the opposite direction (one person’s leg extending at the knee while the other’s is flexing), such is not the case. Instead, at certain critical speeds participants spontaneously change their behavior so that the legs now move in the same direction. How might these social coordination phenomena be explained? The pattern variable that changes qualitatively at the transition is the relative phase, φ. When the two people move in the same direction, they are in-phase with each other, φ = 0. When they move in different directions, their behavior is antiphase (φ = ±π radians or ±180 degrees). The phase relation is a good candidate for a collective variable because it clearly captures the spatiotemporal ordering between the two interacting components. Moreover, φ changes far more slowly than the variables that might describe the individual components (position, velocity, acceleration, electromyographic activity of contracting muscles, etc.). Importantly, φ changes abruptly at the transition.
The simplest dynamics that captures all the observed facts is
$$\dot{\phi} = -a \sin\phi - 2b \sin 2\phi \tag{1}$$
where φ is the relative phase between the movements of the two individuals, $\dot{\phi}$ is the derivative of φ with respect to time, and the ratio b/a is a control parameter corresponding to the movement rate in the experiment. An equivalent formulation of Eqn. (1) is $\dot{\phi} = -\partial V(\phi)/\partial\phi$ with
$$V(\phi) = -a \cos\phi - b \cos 2\phi \tag{2}$$
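The equivalence of Eqns. (1) and (2) can be checked numerically: the rate of change given by Eqn. (1) should equal the negative slope of the potential in Eqn. (2) at every phase. The sketch below does this with a central finite difference; the function names, the step size `h`, and the parameter values are assumptions made for illustration only.

```python
import math


def hkb_rate(phi: float, a: float, b: float) -> float:
    """Right-hand side of Eqn. (1): time derivative of the relative phase."""
    return -a * math.sin(phi) - 2.0 * b * math.sin(2.0 * phi)


def hkb_potential(phi: float, a: float, b: float) -> float:
    """Potential V(phi) of Eqn. (2)."""
    return -a * math.cos(phi) - b * math.cos(2.0 * phi)


def negative_slope(phi: float, a: float, b: float, h: float = 1e-6) -> float:
    """Approximate -dV/dphi with a central finite difference of width 2h."""
    rise = hkb_potential(phi + h, a, b) - hkb_potential(phi - h, a, b)
    return -rise / (2.0 * h)


# Compare the two formulations on a grid of phases (illustrative a and b).
a, b = 1.0, 0.5
phases = [k * math.pi / 8.0 for k in range(-8, 9)]
largest_gap = max(
    abs(hkb_rate(phi, a, b) - negative_slope(phi, a, b)) for phi in phases
)
print(largest_gap)
```

The printed gap should be at the level of finite-difference error, which is what one expects if Eqn. (1) is the gradient dynamics of the potential in Eqn. (2).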
In the literature, this is called the HKB model of coordinated behavior, after Haken et al. (1985) who formulated it as an explanation of the coordination of limb movements within a single person (Kelso 1984). Figure 1 (top) allows one to develop an intuitive understanding of the behavior of Eqn. (2), as well as to connect to the key concepts of stability and instability in self-organizing dynamical systems introduced earlier. The dynamics can be visualized as a particle moving in a potential function, V(φ). The minima of the potential are points of vanishing force, giving rise to stable solutions of the HKB dynamics. As long as the speed parameter (b/a) is slow, meaning the cycle period is long, Eqn. (2) has two stable fixed point attractors, collective states at φ = 0 and φ = ±π rad. Thus, two coordinated behavioral patterns coexist for exactly the same parameter values, the essentially nonlinear feature of bistability. Such bi- and in general multistability is common throughout the behavioral and social sciences. Ambiguous figures (old woman or young girl? Faces or vase?) are well-known examples (e.g., Kruse and Stadler 1995). As the ratio b/a is decreased, meaning that the cycle period gets shorter as the legs speed up, the formerly stable fixed point at φ = ±π rad becomes unstable, and turns into a repellor. Any small perturbation will now kick the system into the basin of attraction of the stable fixed point (behavioral pattern) at φ = 0. Notice also that once there, the system’s behavior will stay in the in-phase attractor, even if the direction of the control parameter is reversed. This is called hysteresis, a basic form of memory in nonlinear dynamical systems. What about the individual components making up the social dyad? Research has established that these take the form of self-sustaining oscillators, archetypal of all time-dependent behavior whether regular or not. The particular functional form of the oscillator need not occupy the reader here. More important is the nature of the nonlinear coupling that produces the emergent coordination.
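The loss of stability and the hysteresis just described can be reproduced by integrating Eqn. (1) while the control parameter is stepped down and then back up. The following is a rough sketch under stated assumptions: the forward Euler integrator, the size of the perturbing kick, the chosen b/a values, and all function names are illustrative and do not come from the article. Linearizing Eqn. (1) about a fixed point gives the slope $-a\cos\phi - 4b\cos 2\phi$, so the antiphase state is stable only while b/a exceeds 1/4; this threshold is a consequence of Eqn. (1) rather than a value quoted in the text.

```python
import math


def hkb_rate(phi: float, a: float, b: float) -> float:
    """Right-hand side of Eqn. (1)."""
    return -a * math.sin(phi) - 2.0 * b * math.sin(2.0 * phi)


def slope(phi_star: float, a: float, b: float) -> float:
    """Linearization of Eqn. (1) at a fixed point; negative means stable."""
    return -a * math.cos(phi_star) - 4.0 * b * math.cos(2.0 * phi_star)


def settle(phi: float, a: float, b: float, dt: float = 0.01, steps: int = 20000) -> float:
    """Integrate Eqn. (1) with forward Euler until the phase has relaxed."""
    for _ in range(steps):
        phi += dt * hkb_rate(phi, a, b)
    return phi


def wrap(phi: float) -> float:
    """Map a phase onto the interval (-pi, pi]."""
    return math.atan2(math.sin(phi), math.cos(phi))


def sweep(ratios, phi_start: float, a: float = 1.0, kick: float = 0.01):
    """Step through b/a values, perturbing the phase slightly at each step."""
    phi = phi_start
    history = []
    for ratio in ratios:
        phi = settle(phi + kick, a, ratio * a)
        history.append((ratio, phi))
    return history


speeding_up = [1.0, 0.75, 0.5, 0.3, 0.2, 0.1]   # b/a decreases as rate increases
slowing_down = list(reversed(speeding_up))

forward = sweep(speeding_up, math.pi)            # begin in the antiphase pattern
backward = sweep(slowing_down, forward[-1][1])   # then reverse the parameter

for ratio, phi in forward + backward:
    print(ratio, round(wrap(phi), 3))
```

In this sketch the phase stays near ±π while b/a is above 1/4, switches to the in-phase pattern once b/a drops below it, and remains in-phase on the way back up, which is the hysteresis described in the text.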
The simplest, perhaps fundamental, coupling that guarantees all the observed emergent properties of coordinated behavioral patterns (multistability, flexible switching among behavioral states, and primitive memory) is
$$K_{12} = (\dot{X}_1 - \dot{X}_2)\,\{\alpha + \beta (X_1 - X_2)^2\} \tag{3}$$
where $X_1$ and $X_2$ are the individual components and α and β are coupling parameters. Notice that the ‘next level up,’ the level of behavioral patterns and the dynamical rule that governs them (Eqns. (1) and (2)), can be derived from the level below, the individual components and their nonlinear interaction. One may call this constructive reductionism: by focusing on adjacent levels, under the task constraint of interpersonal coordination, the individual parts can be put together to create the behavior of the whole. The basic self-organized dynamics, Eqns. (2) and (3), have been extended in numerous ways, only a few of which are mentioned here. (a) Critical slowing down and enhancement of fluctuations. Introducing stochastic forces into Eqns. (1) and (2) allows key predictions to be tested and quantitatively evaluated. Critical slowing is easy to understand from Fig. 1 (top). As the minima at φ = ±π become shallower and shallower, the time it takes to adjust to a small perturbation grows longer and longer. Thus,
the local relaxation time is predicted to increase as the instability is approached because the restoring force (given as the gradient in the potential) becomes smaller. Likewise, the variability of φ is predicted to increase due to the flattening of the potential near the transition point. Both predictions have been confirmed in a wide variety of experimental systems, including recordings of the human brain. (b) Symmetry breaking. Notice that Eqns. (1) and (2) are symmetric: the dynamical system is 2π periodic and is identical under left-right reflection (φ is the same as −φ). This assumes that the individual components are identical, which is seldom, if ever, the case. Nature thrives on broken symmetry. To accommodate this fact, a term Δω is incorporated into the dynamics
$$\dot{\phi} = \Delta\omega - a \sin\phi - 2b \sin 2\phi, \quad \text{and} \quad V(\phi) = -\Delta\omega\,\phi - a \cos\phi - b \cos 2\phi \tag{4}$$
for the equation of motion and the potential respectively. Small values of ∆ω shift the attractive fixed points (Fig. 1 middle) in an adaptive manner. For larger values of ∆ω the attractors disappear entirely (Fig. 1 bottom) causing the relative phase to drift: no coordination between the components occurs. Note, however, that the dynamics still retain some curvature (Fig. 1 bottom right): even though there are no attractors there is still attraction to where the attractors used to be. The reason is that the difference (∆ω) between the individual components is sufficiently large that they do their own thing, while still retaining a tendency to cooperate. This is how global integration, in which the component parts are locked together, is reconciled with the tendency of the parts to function locally, as individual autonomous units (Kelso 1995). (c) Information: a new principle? Unlike the behavior of inanimate things, the self-organizing dynamics of human behavior is fundamentally informational, though not in the standard sense of data communicated across a channel (Shannon and Weaver 1949). Rather, collective variables are context-dependent and intrinsically meaningful. Context-dependence does not imply lack of reproducibility. Nor does it mean that every new context requires a new collective variable or order parameter. As noted above, within- and between-person coordinated behaviors are described by exactly the same self-organizing dynamics. One of the consequences of identifying the latter is that in order to modify or change the system’s behavior, any new information (say a task to be learned, an intention to change one’s behavior) must be expressed in terms of parameters acting on system-relevant collective variables. The benefit of identifying the latter is that one knows what to modify. Likewise, the collective variable dynamics—prior to the introduction of new information—influences how that information is used. 
Thus, information is not lying out there as mere data: information is meaningful to the extent that it modifies, and is modified by, the collective variable dynamics. (d) Generalization. The basic dynamics (Eqns. (1–4)) can readily be elaborated as a model of emergent coordinated behavior among many anatomically different components. Self-organized behavioral patterns such as singing in a group or making a ‘wave’ during a football game are common, yet unstudied examples. Recently, Neda et al. (2000) have examined a simpler group activity: applause in theater and opera audiences in Romania and Hungary. After an exceptional performance, initially thunderous incoherent clapping gives way to slower, synchronized clapping. Measurements indicate that the clapping period suddenly doubles at the onset of the synchronized phase, and slowly decreases as synchronization is lost. This pattern is a cultural phenomenon in many parts of Europe: a collective request for an encore. Increasing frequency (decreasing period) is a measure of the urgency of the message, and culminates in the transition back to noise when the performers reappear. These results are readily explained by a model of a group of globally coupled nonlinear oscillators (Kuramoto 1984)
$$\frac{d\phi_k}{dt} = \omega_k + \frac{K}{N} \sum_{j=1}^{N} \sin(\phi_j - \phi_k) \tag{5}$$
in which $\phi_k$ and $\omega_k$ are the phase and natural frequency of oscillator $k$ among $N$, and a critical coupling parameter, $K_c$, determines the different modes of clapping behavior. $K_c$ is a function of the dispersion ($D$) of clapping frequencies
$$K_c = \sqrt{\frac{2}{\pi^3}}\, D \tag{6}$$
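The qualitative content of Eqn. (5) can be explored with a small simulation. The sketch below is illustrative only: the Gaussian spread of natural frequencies, the population size, the Euler time step, the random seed, and the function names are all assumptions. It uses the mean-field identity $\frac{1}{N}\sum_j \sin(\phi_j - \phi_k) = r \sin(\psi - \phi_k)$, where $r e^{i\psi}$ is the average of $e^{i\phi_j}$ and $r$ measures the degree of synchrony. The two coupling values are deliberately chosen far on either side of the transition, so the sketch does not depend on the exact numerical value of $K_c$.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Return (r, psi): coherence and mean phase of the oscillator population."""
    mean_field = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_field), cmath.phase(mean_field)


def simulate_kuramoto(K, D, n=200, dt=0.02, steps=3000, seed=0):
    """Integrate Eqn. (5) with forward Euler; return r averaged over the second half."""
    rng = random.Random(seed)
    # Assumed: natural frequencies drawn from a Gaussian of standard deviation D,
    # centred on zero (equivalent to viewing the group in a rotating frame).
    omega = [rng.gauss(0.0, D) for _ in range(n)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    late_coherence = []
    for step in range(steps):
        r, psi = order_parameter(phases)
        phases = [
            p + dt * (w + K * r * math.sin(psi - p))
            for p, w in zip(phases, omega)
        ]
        if step >= steps // 2:
            late_coherence.append(r)
    return sum(late_coherence) / len(late_coherence)


weak = simulate_kuramoto(K=0.1, D=1.0)     # coupling small relative to dispersion
strong = simulate_kuramoto(K=5.0, D=1.0)   # coupling large relative to dispersion
print(round(weak, 3), round(strong, 3))
```

With weak coupling relative to the dispersion the coherence stays near the small value expected of independent oscillators, whereas strong coupling drives it close to one, the analogue of an audience falling into synchronized clapping.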
During fast clapping, synchronization is not possible due to the large dispersion of clapping frequencies. Slower, synchronized clapping at double the period arises when small dispersion appears. Period doubling rhythmic applause tends not to occur in big open-air concerts where the informational coupling among the audience is small. K can also be societally imposed. In Eastern European communities during communist times, synchronization was seldom destroyed because enthusiasm was often low for the ‘great leader’s’ speech. For people in the West, the cultural information content of different clapping patterns may be quite different. Regardless, the mathematical descriptions for coordinated behavior (of social dyads and of the psychology of the crowd) are remarkably similar.
6. Conclusion and Future Directions The theoretical concepts and methods of self-organizing dynamics are likely to play an ever greater role in the social, behavioral, and cognitive sciences, especially as the interactions among disciplines continue to grow. Up to now, the use of nonlinear dynamics is still quite restricted, and largely metaphorical. One reason is that the tools are difficult to learn, and require a degree of mathematical sophistication. Their implementation in real systems is nontrivial, requiring a different approach to experimentation and observation. Another reason is that the dynamical perspective is often cast in opposition to more conventional theoretical approaches, instead of as an aid to understanding. The former tends to emphasize decentralization, collective decision-making and cooperative behavior among many interacting elements. The latter tends to focus on individual psychological processes such as intention, perception, attention, memory, and so forth. Yet there is increasing evidence that intending, perceiving, attending, deciding, emoting, and remembering have a dynamics as well. The language of dynamics serves to bridge individual and group processes. In each case, dynamics must be filled with content, with key variables and parameters obtained for the system under study. Every system is different, but what we learn about one may aid in understanding another. What may be most important are the principles and insights gained when human brains and human behavior are seen in the light of self-organizing dynamics.
See also: Chaos Theory; Computational Psycholinguistics; Dynamic Decision Making; Emergent Properties; Evolution: Self-organization Theory; Hume, David (1711–76); Neural Systems and Behavior: Dynamical Systems Approaches; Stochastic Dynamic Models (Choice, Response, and Time)
Bibliography
Abraham R H, Shaw C D 1982 Dynamics: The Geometry of Behavior. Ariel Press, Santa Cruz, CA
Arbib M A, Erdi P, Szentagothai J 1998 Neural Organization: Structure, Function and Dynamics. MIT Press, Cambridge, MA
Haken H 1977 Synergetics, an Introduction: Non-equilibrium Phase Transitions and Self-organization in Physics, Chemistry and Biology. Springer, Berlin
Haken H 1996 Principles of Brain Functioning. Springer, Berlin
Haken H, Kelso J A S, Bunz H 1985 A theoretical model of phase transitions in human hand movements. Biological Cybernetics 51: 347–56
Kelso J A S 1984 Phase transitions and critical behavior in human bimanual coordination. American Journal of Physiology: Regulatory, Integrative and Comparative 15: R1000–4
Kelso J A S 1995 Dynamic Patterns: The Self-organization of Brain and Behavior. MIT Press, Cambridge, MA
Kelso J A S, Fuchs A 1995 Self-organizing dynamics of the human brain: Critical instabilities and Sil’nikov chaos. Chaos 5(1): 64–9
Kuramoto Y 1984 Chemical Oscillations, Waves and Turbulence. Springer-Verlag, Berlin
Magnusson D 1995 Individual development: A holistic, integrated model. In: Moen P, Elder G H Jr, Lüscher K (eds.) Examining Lives in Context. American Psychological Association, Washington, DC
Mandelbrot B 1982 The Fractal Geometry of Nature. Freeman, New York
Neda Z, Ravasz E, Brechet Y, Vicsek T, Barabasi A L 2000 The sound of many hands clapping. Nature 403: 849–50
Nicolis G, Prigogine I 1977 Self-organization in Nonequilibrium Systems. Wiley, New York
Port R F, van Gelder T 1995 Mind as Motion: Explorations in the Dynamics of Cognition. MIT Press, Cambridge, MA
Saltzman E L, Kelso J A S 1987 Skilled actions: A task dynamic approach. Psychological Review 94: 84–106
Schmidt R C, Carello C, Turvey M T 1990 Phase transitions and critical fluctuations in the visual coordination of rhythmic movements between people. Journal of Experimental Psychology: Human Perception and Performance 16: 227–47
Shannon C E, Weaver W 1949 The Mathematical Theory of Communication. University of Illinois Press, Chicago
Strogatz S H 1994 Nonlinear Dynamics and Chaos. Addison-Wesley, Reading, MA
Thelen E, Smith L B 1994 A Dynamic Systems Approach to the Development of Cognition and Action. MIT Press, Cambridge, MA
Vallacher R R, Nowak A 1997 The emergence of dynamical social psychology. Psychological Inquiry 8: 73–99
Van Geert P 1995 Learning and development: A semantic and mathematical analysis. Human Development 38: 123–45
Zanone P G, Kelso J A S 1992 The evolution of behavioral attractors with learning: Nonequilibrium phase transitions. Journal of Experimental Psychology: Human Perception and Performance 18/2: 403–21
J. A. S. Kelso
Self: Philosophical Aspects
There are two related ways in which philosophical reflection on the self may usefully be brought to bear in social science. One concerns the traditional problem about personal identity. The other concerns the distinctive kinds of social relations into which only selves can enter. Sustained reflection on the second issue reveals striking analogies between some of the social relations into which only selves can enter and certain first personal relations that only a self can bear to itself. These analogies entail two possibilities that bear on the first issue concerning personal identity, namely, the possibility that there could be group selves who are composed of many human beings and the possibility that there could be multiple selves within a single human being. Section 1 outlines the classical problem of personal identity that was originally posed by Locke and continues to generate philosophical debate. Section 2 discusses some of the distinctive social relations into which only selves can enter. Section 3 draws several analogies between such social relations and first personal relations. Section 4 shows how these analogies point to the possibilities of group and multiple selves. It closes with some cautionary remarks about how overlooking these two possibilities can lead to confusions about methodological individualism.
1. The Philosophical Problem about Personal Identity
To have a self is to be self-conscious in roughly the sense that Locke took to be the defining mark of the person. In his words, a person is ‘a thinking, intelligent being, that has reason and reflection, and can consider itself as itself, the same thinking thing, in different times and places’ (Locke 1979). In general, all of these capabilities (for thought, reason, reflection, and reflexive self-reference) tend to be lumped together by philosophers under the common heading of self-consciousness. Except in some generous accounts of animal intelligence, it is generally assumed that the only self-conscious beings are human beings. Locke famously disagreed. This was not because he believed that there are other species of animal besides the human species that actually qualify as persons (though he did discuss at some length a certain parrot that was reported to have remarkable conversational abilities). It was rather because he thought that the condition of personal identity should not be understood in biological terms at all; it should be understood, rather, in phenomenological terms. A person’s identity extends, as he put it, exactly as far as its consciousness extends. He saw no reason why the life of a person so construed, as a persisting center of consciousness, should necessarily coincide with the biological lifespan of a given human animal. In defense of this distinction between personal and animal identity he offered the following thought experiment: first, imagine that the consciousnesses of a prince and a cobbler are switched each into the other’s body and, then, consider who would be who thereafter. He thought it intuitively obvious that the identity of the prince is tied to his consciousness in such a way that he would remain the same person, would remain himself, even after having gained a new, cobbling body (similar remarks apply to the cobbler). Many objections have been raised against Locke’s argument and conclusion.
It remains a hotly disputed matter among philosophers whether he was right to distinguish the identity of the person from the identity of the human animal (Parfit 1984, Perry 1975, Rorty 1979). This philosophical dispute has not drawn much attention from social scientists, who have generally assumed, against Locke, that the life of each individual self or person coincides with the life of a particular human being. The anti-Lockean assumption is so pervasive in the social sciences that it serves as common ground even among those who take opposing positions on the issue concerning whether all social facts are reducible to facts about individuals or whether there are irreducibly social facts (see Individualism versus Collectivism: Philosophical Aspects; Methodological Individualism: Philosophical Aspects). Both sides take for granted that the only individuals in question are human beings. Prima facie, it seems reasonable that social scientists should disregard Locke’s distinction between the self or person and the human being. For something that neither Locke nor neo-Lockeans have ever done is offer an actual instance of his distinction. All they have argued for is the possibility that a person’s consciousness could be transferred from one human body to another and, in every case, they have resorted to thought experiments in their attempts to show this. There is a significant difference between Locke’s original thought experiment about the prince and the subsequent neo-Lockean experiments. Whereas his simply stipulates that a transfer of consciousness has occurred, the latter typically describe some particular process by which the transfer of consciousness is supposed to be accomplished, as in, for example, brain transplantation or brain reprogramming. However, although this might make the neo-Lockean thought experiments seem more plausible, there is no realistic expectation that what they ask us to imagine will ever actually come to pass (Wilkes 1988). Because this is so, they have no particular bearing on the empirical concerns of social science. Nevertheless, it is important to be clear about whether the anti-Lockean assumption that informs social science is correct. For the way in which we conceive the boundaries that mark one individual self off from another will inevitably determine how we conceive the domain of social relations that is the common subject of all the social sciences.
So we need to be absolutely clear about whether it is right to assume, against Locke, that the identity of the individual is at bottom a biological fact. As the next two sections will show, there are reasons to side with Locke. But these reasons differ from Locke’s own in several respects. Unlike his, they are not derived from thought experiments. Furthermore, they support a somewhat different version of his distinction between the identity of the self or person and the human being. Finally, they invite the expectation that this version of his distinction can be realized in fact and not just in imagination—which makes it relevant to social science after all.
2. The Sociality of Selves: Rational Relations
Starting roughly with Hegel, various philosophers have mounted arguments to the effect that self-consciousness is a social achievement that simply cannot be possessed by beings who stand outside of all social relations (Wittgenstein 1953, Davidson 1984).
The common conclusion of these arguments is overwhelmingly plausible. But even if the conclusion were shown to be false, it would remain true that many self-conscious beings are social and, furthermore, their social relations reflect their self-consciousness in interesting ways. In order to gain a full appreciation of this fact, we must shift our attention away from the phenomenological dimension of self-consciousness that Locke emphasized in his analysis of personal identity, and toward the other dimension of self-consciousness that he emphasized in his definition of the person as a ‘being with reason and reflection.’ One need not cite Locke as an authority in order to see that reflective rationality is just as crucial to selfhood as its phenomenological dimension. For it obviously will not do to define the self just in phenomenological terms, as a center of consciousness, and this is so even if Locke was right to analyze the identity of the self or person in such terms. After all, some sort of center of consciousness is presupposed by all sentience, even in the case of animals that clearly do not have selves, such as wallabies or cockatoos. It also will not do to suppose that selfhood is in place whenever the sort of consciousness that goes together with sentience is accompanied by the mere capacity to refer reflexively to oneself. For, assuming that wallabies and cockatoos can represent things (which some philosophers deny (Davidson 1984)), it is very likely that some of their representations will function in ways that are characteristic of self-representation. Consider, for example, mental maps in which such animals represent their spatial relations to the various objects they perceive; such maps require a way to mark the distinction between their own bodies and other bodies, a distinction which is not inaptly characterized as the distinction between self and other (Evans 1982).
Self-consciousness in the sense that goes together with selfhood involves a much more sophisticated sort of self-representation that incorporates a conception of oneself as a thinking thing and, along with it, the concept of thought itself. Armed with this concept, a self-conscious being can not only represent itself; it can self-ascribe particular thoughts and, in doing so, it can bring to bear on them the critical perspective of reflective rationality. Whenever selves in this full sense enter into social relations, they enter into relations of reciprocal recognition. Each is self-conscious; each conceives itself as an object of the other’s recognition; each conceives the other as so conceiving itself (i.e., as an object of the other’s recognition); each conceives both as having a common nature and, in so doing, each conceives both as so conceiving both (i.e., as having a common nature). Since this reciprocally recognized common nature is social as well as rational, it affords the possibility of the social exercise of rational capacities. Thus, corresponding to the capacity to ascribe thoughts to oneself, there is the capacity to ascribe thoughts to others; corresponding to the capacity to critically evaluate one’s own thoughts there is the capacity to critically evaluate the thoughts of others; and together with all this comes another extraordinary social capacity, the capacity for specifically rational influence in which selves aim to move one another by appealing to the normative force of reasons. Such rational influence is attempted in virtually every kind of social engagement that takes place among selves, including: engaging in ordinary conversation and argument; offering bribes; lodging threats; bargaining; cooperating; collaborating. In all of these kinds of engagement reasons are offered up in one guise or another in the hope that their normative force will move someone else.
3. Analogies Between First Personal and Social Relations
It is because selves have a rational nature that analogies arise between their first personal relations to their own selves and their social relations to one another. Here are three examples.
3.1 Critical Evaluation
It is a platitude that the principles of rationality are both impersonal and universal. They lay down normative requirements that anyone must meet in order to be rational, and they constitute the standards by which everyone evaluates their own and everyone else’s rational performances. It follows that when we evaluate our own thoughts by these standards, we are really considering them as if they were anyone’s, which is to say, as if they were not our own. We sometimes describe this aspect of critical self-evaluation with the phrase ‘critical distance.’ There is a specific action by which we accomplish this distancing: we temporarily suspend our commitment to act on our thoughts until after we have determined whether keeping the commitment would be in accord with the normative requirements of rationality. If we determine that it would not be in accord with those requirements, then the temporary suspension of our commitment becomes permanent, and our critical distance on our thoughts gives way to literally disowning them. But before getting all the way to the point of disowning our thoughts we must first have gotten to a point where our relations to our thoughts are rather like our relations to the thoughts of others. For, whether we decide to disown the thoughts under critical scrutiny or to reclaim them, the normative standards on the basis of which we do so are necessarily conceived by us as impersonal and universal. In consequence, there is a straightforward sense in which we are doing the same thing when we apply the standards to our own thoughts and when we apply them to the thoughts of others. There is this difference, though: whereas self-evaluation involves distancing ourselves from our own thoughts, evaluating the thoughts of others involves a movement in the opposite direction. We must project ourselves into their points of view so that we can appeal to their own sense of how the principles of rationality apply in their specific cases. This sort of criticism that involves projection is sometimes called ‘internal criticism,’ in contrast with the sort of ‘external criticism’ that proceeds without taking into account whether its target will either grasp it or be moved by it (Williams). This difference between internal criticism of others and self-criticism (the difference between projection and distancing) does nothing to undermine the analogies between them. On the contrary, for by these opposite movements two distinct selves can bring themselves to the very same place. That is, when one self projects itself into another’s point of view for the purposes of internal criticism, it can reach the very same place that the other gets to by distancing itself from its own thoughts for the purposes of self-criticism. Moreover, there is a sense in which what the two selves do from that place is not merely analogous, but the very same thing. Each of the two selves brings to bear the same critical perspective of rationality on the same particular self’s thoughts. The only difference that remains is that in the one case it is a social act directed at another, whereas in the other case it is a self-directed act.
3.2 Joint Activities
Just as internal criticism of another replicates what individual selves do when they critically evaluate themselves, so also, joint activities carried out by many selves may replicate the activities of an individual self. For the purposes of such activities, distinct selves need to look for areas of overlap between (or among) their points of view, such as common ends, common beliefs about their common ends, and common beliefs about how they might act together so as to realize those ends. These areas of commonly recognized overlap can then serve as the basis from which the distinct selves deliberate and act together and, as such, the areas of overlap will serve as a common point of view which is shared by those selves. Whenever distinct selves do deliberate together from such a common point of view, they will be guided by the very same principles that ordinarily govern individual deliberation, such as consistency, closure (the principle that one should accept the implications of one’s attitudes), and transitivity of preferences. All such principles of rationality taken together define what it is for an individual to be fully or optimally rational. For this reason, they generally apply separately to each individual but not to groups of individuals. (Thus, different parties to a dispute might be rational even though they disagree, but they cannot be rational if they believe contradictions.) The one exception is joint activities in which such groups of individuals deliberate together from a common point of view. To take a limited example, you and I might deliberate individually about a philosophical problem or we might do so together. If we do it individually, each of us will aim to be consistent in our reasoning because that is what the principles of rationality require of us individually. But these principles do not require us to do this together; which is to say, you and I may disagree about many things without compromising our ability to pursue our individual deliberations from our separate points of view in a completely rational manner. What would be compromised, however, is our ability to work on a philosophical problem together. We cannot do that without resolving at least some of our disagreements, namely, those that bear on the problem. This means that if we truly are committed to working on the problem together, then we should recognize a sort of conditional requirement of rationality, a requirement to achieve as much consistency between us as would be necessary to carry out our joint project. Although we needn’t achieve overall consistency together in order to do this (that is, we needn’t agree on every issue in every domain), nevertheless, we must resolve our disagreements about the relevant philosophical matters. When we resolve these disagreements we will be doing in degree the very same thing that individuals do when they respond to the rational requirement of consistency. The same holds for the other principles of rationality such as closure and transitivity of preferences.
For, in general, whenever distinct individuals pursue ends together, they temporarily cease deliberating and acting separately from their individual points of view in order to act and deliberate together from their common point of view. They will also strive to achieve by social means within a group—albeit in diminished degree—the very sorts of rational relations and outcomes that are characteristic of the individual self (Korsgaard 1989).
3.3 Rationality over Time
Although there is some sense in which individual points of view are extended in time, the deliberations that proceed from such points of view always take place in time from the perspective of the present. The question immediately arises, why should an individual take its future attitudes into account when it deliberates from the perspective of the present? The question becomes poignant as soon as we concede that an individual’s attitudes may change significantly over time. Thus, consider Parfit’s Russian nobleman who believes as a young man that he should give away all of his money to the poor, but who also believes that he will become much more conservative when he is older and will then regret having given the money away. What should he do? Parfit answers as follows: if he recognizes any normative pressure toward an altruistic stance from which he is bound to take the desires of others into account, then he must extend that stance to his own future desires as well, regardless of the fact that he now disapproves of them (Parfit 1984). Other philosophers have also insisted on parity between the normative demands of prudence and altruism. Among moral philosophers, the parity is often taken in the positive way that Parfit does, as showing that we have reason to be both prudent and altruistic (Sidgwick 1981, Nagel 1970). But some philosophers have taken it in a negative way, as showing that we lack reason to be either. According to them, it follows from the fact that deliberation always proceeds from the perspective of the present that rationality does not require us to take our own future attitudes into account any more than it requires us to take the attitudes of others into account. If they are right, then the rational relations that hold within an individual over time are comparable to social relations within a group (Elster 1979, Levi 1985). However, anyone who is struck by this last suggestion (that an individual over time is, rationally speaking, analogous to a group) would do well to bear in mind what the first two analogies brought out, which is that the social exercise of rational capacities within a group can, in any case, replicate individual rationality.
4. Group Selves and Multiple Selves
It might seem implausible that the normative analogies between first personal and social relations could ever jeopardize the metaphysical distinction between them, which is grounded in whatever facts constitute the identity and distinctness of selves. According to Locke these facts are phenomenological; according to his opponents, the facts are biological. In both cases, the metaphysical facts on offer would clearly be adequate to ground the distinction between first personal and social relations, and this is so no matter how deep and pervasive the analogies between these two sorts of relations might be at the normative level. Yet there is a way of thinking about the self that calls this into question. It is the very way of thinking that predominated in Locke’s initial definition of the person as a being with ‘reason and reflection.’ When we emphasize this rational dimension of the self, we will find it natural to use a normative criterion in order to individuate selves. According to this normative criterion, we have one self wherever we find something that is subject to the normative requirements that define individual rationality. There is some disagreement among philosophers about exactly what these requirements are. But, even so, we can say this much with confidence: it is a sufficient condition for being a rational individual who is subject to the requirements of rationality that the individual have some conception of what those requirements are and be in some degree responsive to them, in the sense of at least trying to meet them when it deliberates and accepting criticism when it fails to meet them (Bilgrami 1998). Of course, such criticism might issue from without as well as within, since a rational individual in this sense would certainly have the conceptual resources with which to comprehend efforts by others to wield rational influence over it. This ensures that there is a readily available public criterion by which such rational individuals can be identified for social purposes: there is one such individual wherever there is an interlocutor whom we can rationally engage. It must be admitted that, due to the analogies between first personal and social relations, the boundaries between such individuals will not always be clear. Yet this does not mean that the normative conception of the self is inadequate or unworkable. It only means that the conception has unfamiliar and, indeed, revisionist consequences. For, if we press the analogies to their logical limit, it follows that there can be group selves comprising many human beings and, also, multiple selves within a single human being (Rovane 1998). Take the group case first. We have seen that, for the purposes of joint activities, distinct selves must find areas of overlap between their individual points of view which can then serve as a common point of view from which they deliberate and act together. We have also seen that such joint deliberations replicate individual rationality in degree.
Now, suppose that this were carried to the following extreme: a group of human beings resolved to engage only in joint activities; accordingly, they resolved to rank all of their preferences about such joint activities together and to pool all of their information; and, finally, they resolved always to deliberate jointly in the light of their jointly ranked preferences and pooled information. The overlap between their points of view would then be so complete that they no longer had separate points of view at all. They would still have separate bodies and separate centers of consciousness. But, despite their biological and phenomenological separation, they would nevertheless all share a single rational point of view from which they deliberated and acted as one. Consequently, they could also be rationally engaged in the ways that only individual selves can be engaged. There is good reason to regard such a group as a bona fide individual in its own right (Korsgaard 1989, Rovane 1998, Scruton 1989). Now take the multiple case. Recall that it might be rational for an individual to disregard its future attitudes when it deliberates about what to do in the present, in just the way that it might be rational (though possibly immoral) to disregard the attitudes of others. If the normative demands of rationality are thus confined to one’s own present attitudes, then one’s future self will be no more bound by one’s present deliberations than one is presently bound to take one’s future attitudes into account. In other words, one’s future self will be free to disregard any long-term intentions or commitments that one is thinking of undertaking now. The point is not that there are no such things as long-term intentions or commitments. The point is that when one makes them one does not thereby bind one’s future self in any causal or metaphysical sense. It is always up to one’s future self to comply or not as it wishes, and one could not alter this aspect of one’s situation except by undermining one’s future agency altogether. Although this may seem alarming, it does not follow that individual selves do not endure over time. All that follows is that the unity of an individual self over time is constituted by the same sorts of facts that constitute the unity of a group self. Just as the latter requires shared commitments to joint endeavors so too does the former, the only difference being that the latter involves sharing across bodies while the former involves sharing across time. This serves to bring out that the metaphysical facts in terms of which Locke and his opponents analyze the identity of the self over time are, in a sense, normatively impotent. The fact that there is a persisting center of consciousness, or a persisting animal body, does not suffice to bind the self together rationally over time. And, arguably (though this is more counterintuitive), such facts do not even suffice to bind the self together rationally in the present either. That is why we are able to understand the phenomenon of dissociative identity disorder, formerly labeled multiple personality disorder. Human beings who suffer from this disorder exhibit more than one point of view, each of which can be rationally engaged on its own.
Since none of these points of view encompasses all of the human host’s attitudes, there is no possibility of rationally engaging the whole human being. At any given time one can engage only one part of it or another. Each of these parts functions more or less as a separate self, as is shown precisely by the susceptibility of each to rational engagement on its own. Although these multiple selves cannot be rationally engaged simultaneously, there is good reason to suppose that they nevertheless exist simultaneously. While one such self is talking, its companions are typically observing and thinking on their own (this generally comes out in subsequent conversation with them). Furthermore, these multiple selves who coexist simultaneously within the same body are sometimes ‘co-conscious’ as well; that is, they sometimes have direct phenomenological access to one another’s thoughts. But their consciousness of one another’s thoughts does not put them into the normative first person relation that selves bear to their own thoughts; for they do not take one another’s thoughts as a basis for their deliberations and actions. Thus neither their shared body nor their shared consciousness suffices to bind multiple personalities together in the sense that is required by the normative conception of the self. On that conception, they qualify as individual selves in their own right (Braude 1995, Rovane 1998, Wilkes 1988). Between these two extreme cases of group and multiple selves, the normative conception provides for a wide spectrum of cases, exhibiting differing degrees of rational unity within and among human beings. The conception says that whenever there is enough rational unity (though it is hard to say how much is enough) there is a rational being with its own point of view who can be rationally engaged as such and who, therefore, qualifies as an individual self in its own right.
5. Implications for Methodological Individualism
It is not the case that all of the facts about the Columbia Philosophy Department can be reduced to facts about its members atomistically described. Nor is it the case that all of the facts about that department can be explained by appealing to facts about its members atomistically described. Why? Because sometimes the members of the department stop reasoning from their own separate human-size points of view and begin to reason together from the point of view of the department’s distinctive projects, needs, opportunities and so on. When the members of the department do this, they give over a portion of their human lives to the life of the department in a quite literal sense, so that the department itself can deliberate and act from a point of view that cannot be equated with any of their individual points of view. Outside the context of this article, this claim about the Columbia Philosophy Department might naturally be taken to contradict methodological individualism. This is because it is generally assumed that the only individuals there are or can be are human individuals. That is what leads many to interpret the facts about the Columbia Philosophy Department, insofar as they cannot be properly described or explained by appealing to facts about its individual human members atomistically described, as irreducibly social phenomena. However, this interpretation fails to take account of the way in which the department instantiates the rational properties that are characteristic of the individual self (vs. a social whole).
See also: Person and Self: Philosophical Aspects; Personality and Conceptions of the Self; Self: History of the Concept; Social Identity, Psychology of
Bibliography
Bilgrami A 1998 Self-knowledge and resentment. In: Wright C et al. (eds.) Knowing Our Own Minds. Oxford University Press, Oxford, UK
Braude S E 1995 First Person Plural: Multiple Personality Disorder and the Philosophy of Mind. Rowman & Littlefield, Lanham, MD
Davidson D 1984 Inquiries into Truth and Interpretation. Oxford University Press, New York
Elster J 1979 Ulysses and the Sirens: Studies in Rationality and Irrationality. Cambridge University Press, New York
Evans G 1982 Varieties of Reference. Oxford University Press, New York
Gilbert M 1989 On Social Facts. Routledge, London
Korsgaard C 1989 Personal identity and the unity of agency: A Kantian response to Parfit. Philosophy and Public Affairs 18(2)
Levi I 1985 Hard Choices. Cambridge University Press, New York
Locke J 1979 An Essay Concerning Human Understanding. Nidditch P (ed.). Oxford University Press, Oxford, UK
Nagel T 1970 The Possibility of Altruism. Clarendon, Oxford, UK
Parfit D 1984 Reasons and Persons. Clarendon, Oxford, UK
Perry J (ed.) 1975 Personal Identity. University of California Press, Berkeley, CA
Rorty A O (ed.) 1979 The Identities of Persons. University of California Press, Berkeley, CA
Rovane C 1998 The Bounds of Agency: An Essay in Revisionary Metaphysics. Princeton University Press, Princeton, NJ
Scruton R 1989 Corporate persons. Proceedings of the Aristotelian Society 63
Sidgwick H 1981 The Methods of Ethics. Hackett, Indianapolis, IN
Wilkes K V 1988 Real People: Personal Identity Without Thought Experiments. Oxford University Press, New York
Wittgenstein L 1953 Philosophical Investigations. Macmillan, New York
C. Rovane
Self-regulated Learning Self-regulated learning refers to how students become masters of their own learning processes. Neither a mental ability nor a performance skill, self-regulation is instead the self-directive process through which learners transform their mental abilities into task-related skills in diverse areas of functioning, such as academia, sport, music, and health. This article will define self-regulated learning and describe the intellectual context in which the construct emerged, changes in researchers’ emphasis over time as well as current emphases, methodological issues related to the construct, and directions for future research.
1. Defining Self-regulated Learning Self-regulated learning involves metacognitive, motivational, and behavioral processes that are personally initiated to acquire knowledge and skill, such as goal setting, planning, learning strategies, self-reinforcement, self-recording, and self-instruction. A self-regulated learning perspective shifts the focus of educational analyses from students’ learning abilities and instructional environments as fixed entities to students’ self-initiated processes for improving their methods and environments for learning. This approach views learning as an activity that students do for themselves in a proactive way, rather than as a covert event that happens to them reactively as a result of teaching experiences. Self-regulated learning theory and research is not limited to asocial forms of education, such as discovery learning, self-education through reading, studying, programmed instruction, or computer-assisted instruction, but can include social forms of learning such as modeling, guidance, and feedback from peers, coaches, and teachers. The key issue defining learning as self-regulated is not whether it is socially isolated but rather whether the learner displays personal initiative, perseverance, and adaptive skill in pursuing it. Most contemporary self-regulation theorists have avoided dualistic distinctions between internal and external control of learning and have envisioned self-regulation in broader, more interactive terms. Students can self-regulate their learning not only through covert cognitive means but also through overt behavioral means, such as selecting, modifying, or constructing advantageous personal environments or seeking social support. A learner’s sense of self is not limited to individualized forms of learning but includes self-coordinated collective forms of learning in which personal outcomes are achieved through the actions of others, such as family members, team-mates, or friends, or through use of physical environment resources, such as tools. Thus, covert self-regulatory processes are viewed as reciprocally interdependent with behavioral, social, and environmental self-regulatory processes. Self-regulation is defined as a variable process rather than as a personal attribute that is either present or absent. Even the most helpless learners attempt to control their functioning, but the quality and consistency (i.e., quantity) of their processes are low.
Novice learners typically rely on naive forms of self-regulatory processes, such as setting nonspecific distal goals, using nonstrategic methods, monitoring themselves inaccurately, attributing outcomes to uncontrollable sources of causation, and reacting defensively. By contrast, expert learners display powerful forms of self-regulatory processes, especially during the initial phase of learning. Student efforts to self-regulate their learning have been analyzed in terms of three cyclical learning phases. Forethought phase processes anticipate efforts to learn and include self-motivational beliefs, such as self-efficacy, outcome expectations, and intrinsic interest, as well as task analysis skills, such as planning, goal setting, and strategy choice. Performance phase processes seek to optimize learning efforts and include use of time management, imagery, self-verbalization, and self-observation processes. Self-reflection phase processes follow efforts to learn and provide understanding of the personal implication of outcomes. They include self-judgment processes, such as self-evaluation and attributions, and self-reactive processes, such as self-satisfaction and adaptive/defensive inferences. Because novice learners fail to use effective forethought processes proactively, such as proximal goal setting and powerful learning strategies, they must rely on reactive processes occurring after learning attempts that often have been unsuccessful. Such unfortunate experiences will trigger negative self-evaluations, self-dissatisfaction, and defensive self-reflections, all of which undermine the self-motivation necessary to continue cyclical efforts to learn. By understanding self-regulation in this cyclical interactive way, qualitative as well as quantitative differences in process can be identified for intervention.
2. Intellectual Context for Self-regulated Learning Research Interest in students’ self-regulated learning as a formal topic emerged during the 1970s and early 1980s out of general efforts to study human self-control. Promising investigations of children’s use of self-regulatory processes like goal setting, self-reinforcement, self-recording, and self-instruction, in such areas of personal control as eating and task completion prompted educational researchers and reformers to consider their use by students during academic learning. Interest in self-regulation of learning was also stimulated by awareness of the limitations of prior efforts to improve achievement that stressed the role of mental ability, social environmental background of students, or qualitative standards of schools. Each of these reform movements viewed students as playing primarily a reactive rather than a proactive role in their own development. In contrast to prior reformers who focused on how educators should adapt instruction to students based on their mental ability, sociocultural background, or achievement of educational standards, self-regulation theorists focused on how students could proactively initiate or substantially supplement experiences designed to educate themselves. Interest in self-regulation of learning emerged from many theoretical sources during the 1970s and 1980s. For example, operant researchers adapted the principles and technology of B. F. Skinner for personal use, especially the use of environmental control, self-recording, and self-reinforcement. Their preference for the use of single-subject research paradigms and time series data was especially useful for individuals seeking greater self-regulation of learning.
During this same time period, phenomenological researchers shifted from monolithic, global views of how self-perceptions influenced learning to hierarchical, domain-specific views and began developing new self-concept tests to assess functioning in specific academic domains. The research of Hattie, Marsh, Shavelson, and others was especially influential in gaining new currency for the role of academic domain self-concepts in learning. During this era, social learning researchers, such as Bandura, Schunk, and Zimmerman, shifted their emphasis from modeling to self-regulation and renamed the approach as social cognitive. They identified a new task-specific motive for learning, self-efficacy belief, and they linked it empirically to other social cognitive processes, such as goal setting, self-observation, self-judgment, and self-reaction. During this era, researchers such as Corno, Gollwitzer, Heckhausen, Kuhl, and others resurrected volitional notions of self-regulation to explain human efforts to pursue courses of learning in the face of competing events. In their view, self-regulatory control of action can be undermined by ruminating, extrinsic focusing, and vacillating, which interfere with the formation and implementation of an intention. Also during the 1970s and 1980s, suppressed writings of Vygotsky were published in English that explained more fully how children’s inner speech emerges from social interactions and serves as a source of self-control. Cognitive behaviorists, such as Meichenbaum, developed models of internalization training on the basis of Vygotsky’s description of how overt speech becomes self-directive. During this same era, cognitive constructivists shifted their interest from cognitive stages to metacognition and the use of learning strategies to explain self-regulated efforts to learn. The research and theory of Flavell played a major role in effecting this transition by describing self-regulation in terms of metacognitive knowledge, self-monitoring, and control of learning. Research on self-regulation was also influenced by the emergence of goal theories during the 1970s and 1980s. Locke and Latham showed that setting specific, proximal, challenging but attainable goals greatly influenced the effectiveness of learners’ efforts to learn. Theorists such as Ames, Dweck, Maehr, Midgley, and Nicholls identified individual differences in goal orientations that affected students’ efforts to learn on their own.
These researchers found that learning or mastery goal orientations facilitated persistence and effort during self-directed efforts to learn, whereas performance or ego goal orientations curtailed motivation and achievement. During this same period, another perspective emerged that focused on the role of intrinsic interest in learning. Deci, Harackiewicz, Lepper, Ryan, Vallerand, and others demonstrated that perceptions of personal control, competence, or interest in a task were predictive of intrinsic efforts to learn on one’s own. This focus on intrinsic motivation was accompanied by a resurgence of research on the role of various forms of interest in self-directed learning by a host of scholars in Europe as well as elsewhere, such as Eccles, Hidi, Krapp, Renninger, Schiefele, Wigfield, and others. During these same decades, research on self-regulation of learning was influenced by cybernetic conceptions of how information is processed. Researchers such as Carver, Scheier, and Winne demonstrated the important role of executive processes, such as goal setting and self-monitoring, and feedback control loops in self-directed efforts to learn.
3. Changes in Emphasis Over Time Before the 1980s, researchers focused on the impact of separate self-regulatory processes, such as goal setting, self-efficacy, self-instruction, volition, strategy learning, and self-management, with little consideration for their broader implications regarding student learning of academic subject matter. Interest in the latter topic began to coalesce in the mid-1980s with the publication of journal articles describing various types of self-regulated learning processes, good learning strategy users, self-efficacious learners, and metacognitive engagement, among other topics (Jan Simons and Beukhof 1987, Zimmerman 1986). By 1991, a variety of theories and some initial research on self-regulated learning and academic achievement had been published in special journal articles and edited textbooks (Maehr and Pintrich 1991, Zimmerman and Schunk 1989). These accounts of academic learning, which described the motivational and self-reactive as well as the metacognitive aspects of self-regulation, spurred considerable research. By the mid-1990s, a number of edited texts had been published chronicling the results of this first wave of descriptive research and experimental studies of self-regulated learning (Pintrich 1995, Schunk and Zimmerman 1994). The success of these empirical studies stimulated interest in systematic interventions to improve students’ self-regulated learning, and the results of these implementations emerged in journal articles and textbooks by the end of the 1990s (Schunk and Zimmerman 1998).
4. Emphases in Current Theory and Research There is much current interest in understanding the self-motivational aspects of self-regulation as well as the metacognitive aspects (Heckhausen and Dweck 1998). Self-regulated learners are distinguished by their personal initiative and associated motivational characteristics, such as higher self-efficacy beliefs, learning goal orientations, favorable self-attributions, and intrinsic motivation, as well as by their strategic and self-monitoring competence. The issue of self-motivation is of both practical and theoretical importance. On the practical side, researchers often confront apathy or helplessness when they seek to improve students’ use of self-regulatory processes, and they need viable methods for overcoming this lack of motivation. On the theoretical side, researchers need to understand how motivational beliefs interact with learning processes in a way that enhances students’ initiative and perseverance. A number of models have included motivational and learning features as interactive components, such as Pintrich’s self-schema model, Boekaerts’s three-layered model, Kuhl’s action/state control model, and Bandura, Schunk, and Zimmerman’s cyclical phase model. These models are designed to transcend conceptual barriers between learning and motivational processes and to understand their reciprocal interaction. For example, causal attributions are expected to affect not only students’ persistence and emotional reactions but also adaptations in their methods of learning. Each of these models seeks to explain how learning can become self-motivating and can sustain effort over obstacles and time. A second issue of current interest is the acquisition of self-regulatory skills in deficient populations of learners. Study skill training programs have been instituted with diverse populations and age groups of students. These intervention programs have involved a variety of formats, such as separate strategy or study skill training classes, strategy training classes linked explicitly to subject matter classes, such as in mathematics or writing, or personal trainer services in classrooms and counseling centers. This research is uncovering a fascinating body of evidence regarding the metacognitive, motivational, and behavioral limitations of at-risk students. For example, naive learners often overestimate their knowledge and skill, which can lead to understudying, nonstrategic approaches, procrastination, and faulty attributions. Successful academic functioning requires accurate self-perceptions before appropriate goals are set and strategies are chosen. Many of these interventions are predicated on models that envision self-regulatory processes as cyclically interdependent, and a goal of these approaches is to explain how self-fulfilling cycles of learning can be initiated and sustained by students.
5. Methodological Issues Initial efforts to measure self-regulated learning processes relied on inventories in which students are asked to rate their use of specific learning strategies, various types of academic beliefs and attitudes, typical methods of study, as well as their efforts to plan and manage their study time. Another method is the use of structured interviews that involve open-ended questions about problematic learning contexts, such as writing a theme, completing math assignments, and motivating oneself to study under difficult circumstances. The latter form of assessment requires students to create their own answers, and experimenters to train coders to recognize and classify various qualitative forms of self-regulatory strategies. Although both of these approaches have reported substantial correlations with measures of academic success, they are limited by their retrospective or prospective nature, that is, their focus on the usual or preferred methods of learning. As such, they depend on recall or anticipatory knowledge rather than on actual functioning under taxing circumstances. To avoid these limitations, there have been efforts to study self-regulation on-line at various key points during learning episodes using speak-aloud, experimenter questioning, and various performance and outcome measures. Experimental studies employing the latter measures offer the best opportunity to test the causal linkage among various self-regulatory processes, but even this approach has potential shortcomings, such as inadvertent cuing or interference with self-regulatory processes. Two common research design issues have emerged in academic interventions with at-risk populations of students: the lack of a suitable control group, and the reactive effects of self-regulatory assessments. Educators have asked self-regulation researchers to provide assistance to students who are often in jeopardy of expulsion, and it is unethical to withhold treatment from some of these students for experimental purposes. One solution is to use intensive within-subject, time-series designs in which treatments are introduced in sequential phases. However, as students begin to collect and graph data on themselves, they become more self-observant and self-reflective, which can produce unstable baselines. This, in turn, can confound causal inferences about the effectiveness of other self-regulatory components of the intervention.
6. Future Directions for Research and Theory
6.1 Role of Technology The computer has been recommended as an ideal instrument to study and enhance students’ self-regulation. Program menus can be faded when they are no longer needed, and performance processes and outcomes can be logged in either a hidden or overt fashion. Computers provide the ultimate feedback to the experimenter or the learner because results can be analyzed and graphed in countless ways to uncover underlying deficiencies. Winne is developing a computer learning environment designed to study and facilitate self-regulation by the student.
6.2 Developmental Research Relatively little attention has been devoted to forms of self-regulation that can be performed by young children. Very young children have difficulty observing and judging their own functioning, are not particularly strategic in their approach to learning, and tend to reason intuitively from beliefs rather than evidence. Their self-judgments of competence do not match their teachers’ judgments until approximately the fifth grade. However, there is reason to expect that simple forms of self-regulatory processes begin to emerge during the early years in elementary school. For example, there is evidence that children can make self-comparisons with their earlier performance at the outset of elementary school. These issues are connected to the key underlying issue of how self-regulation of learning develops in naturalistic as well as designed instructional contexts.
6.3 Out-of-school Influences There is increasing research showing that nonacademic factors such as peer groups, families, and part-time employment strongly affect school achievement. Schunk and Zimmerman have discussed the forms of social influence that these social groups can have on students’ self-regulatory development. Brody and colleagues have found that parental monitoring of their children’s activities and standard setting regarding their children’s performance were very predictive of the children’s academic as well as behavioral self-regulation, which in turn was predictive of their cognitive and social development. Martinez-Pons has recently reported that parental modeling and support for their children’s self-regulation was predictive of the youngsters’ success in school. More attention should be given to the psychosocial origins of self-regulatory competence in academic learning.
6.4 Role of Teachers The recent focus on academic intervention research has uncovered evidence that teachers often conduct classrooms where it is difficult for students to self-regulate effectively. For example, teachers who fail to set specific instructional goals, are ambiguous or inconsistent about their criteria for judging classroom performance, or give ambiguous feedback about schoolwork make it difficult for students to take charge of their learning. Of course, students who enter such classes with well-honed self-regulatory skills possess personal resources that poorly self-regulated learners do not possess. Such self-regulatory experts can turn to extra-classroom sources of information, can deduce subtle unspecified criteria for success, and can rely on self-efficacious beliefs derived from earlier successful learning experiences. Regarding the development of self-regulated learners, few teachers ask students to make systematic self-judgments about their schoolwork, and as a result, students are not prompted or encouraged to use self-regulatory subprocesses such as self-observation, self-judgment, and self-reactions. Students who lack awareness of their functioning have little reason to try to alter their personal methods of learning. Finally, teachers who run classrooms where students have little personal choice over their goals, methods, and outcomes of learning can undermine students’ perceptions of control and assumption of responsibility for their classroom outcomes. See also: Learning to Learn; Motivation, Learning, and Instruction; Self-efficacy: Educational Aspects
Bibliography
Heckhausen J, Dweck C S (eds.) 1998 Motivation and Self-regulation Across the Lifespan. Cambridge University Press, Cambridge, UK
Jan Simons P R, Beukhof R (eds.) 1987 Regulation of Learning. SVO, Den Haag, The Netherlands
Maehr M L, Pintrich P R (eds.) 1991 Advances in Motivation and Achievement: Goals and Self-regulatory Processes. JAI Press, Greenwich, CT, Vol. 7
Pintrich P (ed.) 1995 New Directions in College Teaching and Learning: Understanding Self-regulated Learning. Jossey-Bass, San Francisco, No. 63, Fall
Schunk D H, Zimmerman B J (eds.) 1994 Self-regulation of Learning and Performance: Issues and Educational Applications. Erlbaum, Hillsdale, NJ
Schunk D H, Zimmerman B J (eds.) 1998 Self-regulated Learning: From Teaching to Self-reflective Practice. Guilford Press, New York
Zimmerman B J (ed.) 1986 Self-regulated learning. Contemporary Educational Psychology 11, Special Issue
Zimmerman B J, Schunk D H (eds.) 1989 Self-regulated Learning and Academic Achievement: Theory, Research, and Practice. Springer, New York
B. J. Zimmerman
Self-regulation in Adulthood Self-regulation is one of the principal functions of the human self, and it consists of processes by which the self manages its own states and actions so as to pursue goals, conform to ideals and other standards, and maintain or achieve desired inner states. Many experts use self-regulation interchangeably with the everyday term self-control, although some invoke subtle distinctions such as restricting self-control to refer to resisting temptation and stifling impulses.
1. Scope of Self-regulation Most knowledge about self-regulation can be grouped into four broad categories and one narrower category. The most familiar is undoubtedly impulse control, which refers to regulating one’s behavior so as not to carry out motivated acts that could have harmful or undesirable consequences. Dieting, responsible sexual behavior, recovery from addiction, and control of aggression all fall in the category of impulse control. A second category is affect regulation, or the control of emotions. The most common form of this is the attempt to bring oneself out of a bad mood or bring an end to emotional distress. In principle, however, affect regulation can refer to any attempt to alter any mood or emotion, including all attempts to induce, end, or prolong either positive or negative emotions. Affect regulation is widely regarded as the most difficult and problematic of the major spheres, because most people cannot change their moods and emotions by direct control or act of will, and so people must rely on indirect strategies, which are often ineffective. The third category is the control of thought. This includes efforts to stifle unwanted ideas or to concentrate on a particular line of thought. It can also encompass efforts to direct reasoning and inference processes, such as in trying to make a convincing case for a predetermined conclusion or to think an issue carefully through so as to reach an accurate judgment. The fourth major category is performance regulation. Effective performance often requires self-regulation, which may include making oneself put forth extra effort or persevere (especially in the face of failure), avoiding choking under pressure, and making optimal tradeoffs between speed and accuracy. A narrower category of self-regulation involves superordinate regulation, which is sometimes called self-management. These processes cut across others and involve managing one’s life so as to afford promising opportunities. Choosing challenges or tasks (such as college courses) that are appropriately suited to one’s talents, avoiding situations that bring hard-to-resist temptations, and conserving one’s resources during stressful periods fall in this category.
2. Importance of Self-regulation The pragmatic importance of self-regulation can scarcely be overstated. Indeed, most of the personal and social problems afflicting citizens of modern, highly developed countries involve some degree of failure at self-control. These problems include drug and alcohol abuse, addiction, venereal disease, unwanted pregnancy, violence, gambling, school failure and dropping out, eating disorders, drunk driving, poor physical fitness, failure to save money, excessive spending and debt, child abuse, and behavioral binge patterns. Experts regularly tell people that they could live longer, healthier lives if they would only quit smoking, eat right, and exercise regularly, but people consistently fail to regulate themselves sufficiently in those three areas. Longitudinal research has confirmed the enduring benefits of self-regulation. Mischel and his colleagues (e.g., Mischel 1996) found that children who were better able to exercise self-control (in the form of resisting temptation and delaying gratification) at age 4 years were more successful socially and academically over a decade later. Thus, these regulatory competencies appear to be stable and to yield positive benefits in multiple spheres over long periods of time. Self-regulation enables people to resist urges for immediate gratification and adaptively pursue their enlightened self-interest in the form of long-term goals. Self-regulation also has considerable theoretical importance. As one of the self’s most important and adaptive functions, self-regulation is central to the effective functioning of the entire personality. The processes by which the self controls itself offer important insights into how the self is structured and how it operates. Higgins (1996) has analyzed the ‘sovereignty of self-regulation,’ by which he means that self-regulation is the master or supreme function of the self.
3. Feedback Loops Psychologists have borrowed important insights from cybernetic theory to explain self-regulatory processes (e.g., Powers 1973). The influential work by Carver and Scheier (1981, 1982, 1998) analyzed self-awareness and self-regulation processes in terms of feedback loops that control behavior. In any system (including mechanical ones such as thermostats for heating/cooling systems), control processes depend on monitoring current status, comparing it with goals or standards, and initiating changes where necessary. The basic form of a feedback loop is summarized in the acronym TOTE, which stands for test, operate, test, exit. The test phase involves assessing the current status and comparing it with the goals or ideals. If the comparison yields a discrepancy, some operation is initiated that is designed to remedy the deficit and bring the status into line with what is desired. Repeated tests are performed at intervals during the operation so as to gauge the progress toward the goals. When a test reveals that the desired state has been reached, the final (exit) step is enacted, and the regulatory process is ended. Feedback loops do not necessarily exist in isolation, of course. Carver and Scheier (1981, 1982) explained that people may have hierarchies of such loops. At a particular moment, a person’s actions may form part of the pursuit of long-term goals (such as having a successful career), medium-term goals (such as doing well in courses that will furnish qualifications for that career), and short-term goals (such as persisting to finish a particular assignment in such a course). Once a given feedback loop is exited, indicating the successful completion of a short-term act of self-regulation (e.g., finishing the assignment), the person may revert to aiming at the long-term goals. Emotion plays a central role in the operation of the feedback loop (Carver and Scheier 1998).
Naturally, reaching a goal typically brings positive emotions, but such pleasant feelings may also arise simply from making suitable progress toward the goal. Thus, emotion may arise from the rate of change of discrepancy between the current state and the goal or standard. Meanwhile, negative emotions may arise not only when things get worse but also simply when one stands still and thereby fails to make progress. Carver and Scheier (1998) phrase this as a ‘cruise control’ theory of affect and self-regulation, invoking the analogy to an automobile’s cruise control mechanism. Like that mechanism, emotion kicks in to regulate the process whenever the speed deviates from the prescribed rate of progress toward the goal.
Although this analysis has emphasized feedback loops that seek to reduce discrepancies between real and ideal circumstances, there are also discrepancy-enlarging loops (positive feedback loops, in cybernetic terms) in which the goal is to increase the discrepancy between oneself and some standard. For example, people may seek to increase the discrepancy between themselves and the average person.
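The TOTE cycle and the ‘cruise control’ account of affect described in this section can be illustrated with a minimal sketch. It is not drawn from Carver and Scheier’s work: the function names (tote_loop, affect_signals, study_one_page), the numeric representation of status and goal, and the expected_rate parameter are all hypothetical choices made for illustration.

```python
from typing import Callable, List


def tote_loop(status: float, goal: float,
              operate: Callable[[float], float],
              tolerance: float = 0.0,
              max_steps: int = 100) -> List[float]:
    """Run a test-operate-test-exit cycle; return every status observed."""
    history = [status]
    for _ in range(max_steps):
        # Test: compare the current status with the goal.
        if abs(goal - status) <= tolerance:
            break  # Exit: the desired state has been reached.
        # Operate: act to bring the status closer to the goal.
        status = operate(status)
        history.append(status)
    return history


def affect_signals(history: List[float], goal: float,
                   expected_rate: float) -> List[float]:
    """Hypothetical affect measure: progress per step minus expected progress.

    Positive values stand for pleasant affect (faster than expected),
    negative values for unpleasant affect (slower, or standing still).
    """
    gaps = [abs(goal - s) for s in history]
    return [(before - after) - expected_rate
            for before, after in zip(gaps, gaps[1:])]


def study_one_page(pages_done: float) -> float:
    # Illustrative operation: each step completes one page of an assignment.
    return pages_done + 1


history = tote_loop(status=0, goal=3, operate=study_one_page)
on_pace = affect_signals(history, goal=3, expected_rate=1)
stalled = affect_signals([0, 0], goal=3, expected_rate=1)
```

In this sketch a learner who finishes one page per step while expecting one page per step registers neutral affect, whereas standing still registers a negative value even though nothing has become worse, which mirrors the point that failing to make progress can itself produce negative emotion.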
4. Self-regulation Failure Two main types of self-regulatory failure have been identified (for a review, see Baumeister et al. 1994). The more common is underregulation, in which the person fails to exert the requisite self-control. For example, the person may give in to temptation or give up prematurely at a task. The other category is misregulation, in which the person exerts control over the self but does so in some counterproductive manner so that the desired result is thwarted. Several common themes and factors have been identified in connection with underregulation. One is a failure to monitor the self, which prevents the feedback loop from functioning. Thus, successful dieters often count their calories and keep track of everything they eat. When they cease to monitor how much they eat, such as when watching television or when distracted by emotional distress, they may eat much more without realizing it. Alcohol contributes to almost every known pattern of self-control failure, including impulsive violence, overeating, smoking, gambling and spending, and emotional excess. This pervasive effect probably occurs because alcohol reduces people’s attention to self and thereby impairs their ability to monitor their own behavior (see Hull 1981). Emotional distress likewise contributes to underregulation, and although there are multiple pathways by which it produces this effect, one of them is undoubtedly impairment of the self-monitoring feedback loop. A perennial question is whether underregulation occurs because the self is weak or because the impulse is too strong. The latter invokes the concept of ‘irresistible impulses,’ which is popular with defense lawyers and addicts hoping to imply that no person could have resisted the problematic urge. After reviewing the evidence, Baumeister et al.
(1994) concluded that most instances of underregulation involve some degree of acquiescence by the individual, contrary to the image of a valiant person being overwhelmed by an irresistible impulse. For example, during a drinking binge, the person will actively participate in procuring and consuming alcohol, which contradicts claims of having lost control of behavior and being passively victimized by the addiction. Yet clearly people would prefer to resist these failures and so at some level do experience them as against their will. Self-deception and inner conflict undoubtedly contribute to muddying the theoretical waters. At present, then, the relative contributions of powerful
impulses and weak or acquiescent self-control have not been fully understood, except to indicate that both seem to play some role. Misregulation typically involves some erroneous understanding of how self and world interact. In one familiar example, sad or depressed people may consume alcohol in the expectation that it will make them feel better, when in fact alcohol is a depressant that often makes them feel even worse. The strategy therefore backfires. By the same token, when under pressure, people may increase their attention to their own performance processes, under the assumption that this extra care will improve performance, but such attention can disrupt the smooth execution of skills, causing what is commonly described as ‘choking under pressure.’
5. Strength and Depletion The successful operation of the feedback loop depends on the ‘operate’ stage, in which the person actually changes the self so as to bring about the desired result. Recent work suggests that these operations involve a limited resource akin to energy or strength. The traditional concept of ‘willpower’ thus is being revived in psychological theory. Willpower appears to be a limited resource that is depleted when people exert self-control. In laboratory studies, if people perform one act of self-regulation, they seem to be impaired afterwards. For example, after resisting the temptation to eat forbidden chocolates, people are less able to make themselves keep trying on a difficult, frustrating task (Baumeister et al. 1998). Whenever there are multiple demands on self-regulation, performance gradually deteriorates and afterwards the self appears to suffer from depletion of resources (Muraven and Baumeister 2000). Although the nature of this resource is not known, several important conclusions about it are available. First, it appears that the same strength or energy is used for many different acts of self-regulation, as opposed to each sphere of self-control using a different facility. Trying to quit smoking or keep a diet may therefore impair a person’s ability to persist at work tasks or to maintain control over emotions. Second, the resource is sufficiently limited that even small exertions cause some degree of depletion. These effects do not necessarily imply pending exhaustion; rather, people conserve their resources when partly depleted. Third, the resource is also used for making decisions and choices, taking responsibility, and exercising initiative. Fourth, there is some evidence that willpower can be increased by regular exercise of self-control. Fifth, rest and sleep seem to be effective at replenishing the resource. The effects of ego depletion provide yet another insight into self-control failure.
This helps explain the familiar observation that self-control tends to deteriorate when people are working under stress or pressure, because they are expending their resources on coping with the stress and therefore have less available for other regulatory tasks.
6. Ironic Processes Another contribution to understanding self-regulatory failure invokes the notion of different mental processes that can work at cross-purposes. Wegner (1994) proposed that effective self-regulation involves both a monitoring process, which remains vigilant for threat or danger (including, for example, the temptation to do something forbidden), and an operating process, which compels the self to act in the desirable fashion. When the monitoring process detects a threat, the operating process helps to prevent disaster, such as when the dieter refuses the offer of dessert. Unfortunately, the monitoring process is generally more automatic than the operating process, so the monitor may continue to notice temptations or other threats even when the operating process is not working. Upon completion of a diet, for example, the dieter may stop saying no to all tempting foods, but the monitoring process continues to draw attention to every delicious morsel that becomes available, so that the person feels constantly tempted to eat. Likewise, when people are tired or depleted, the monitoring system may continue to seek out troublesome stimuli, with disastrous results.
7. Conclusion Effective self-regulation depends on three ingredients. First, one must have clear and unconflicting standards, such as clear goals or ideals. Second, one must monitor the self consistently, such as by keeping track of behavior. Third, one must have the wherewithal to produce the necessary changes in oneself, and that may include willpower (for direct change) or knowledge of effective strategies (for indirect efforts, such as for changing emotions). Problems in any area can impair self-regulation. Several areas for future research on self-regulation are promising. The nature of the psychological resource that is expended must be clarified, and brain processes may shed valuable light on how self-regulation works. Interpersonal aspects of self-regulation need to be explicated. Research on the replenishment of depleted resources will also help elucidate the nature of self-control, in addition to being of interest in its own right. Individual differences in self-regulation and developmental processes for acquiring self-control need considerably more study. Applications to clinical, industrial, and relationship phenomena need to be examined. Effective self-regulation includes the ability to adapt oneself to diverse circumstances. It is a vital aspect of human success in life.

Bibliography
Baumeister R F, Bratslavsky E, Muraven M, Tice D 1998 Ego depletion: Is the active self a limited resource? Journal of Personality and Social Psychology 74: 1252–65
Baumeister R F, Heatherton T F, Tice D 1994 Losing Control: How and Why People Fail at Self-regulation. Academic Press, San Diego, CA
Carver C S, Scheier M F 1981 Attention and Self-regulation. Springer, New York
Carver C S, Scheier M F 1982 Control theory: A useful conceptual framework for personality–social, clinical and health psychology. Psychological Bulletin 92: 111–35
Carver C S, Scheier M F 1998 On the Self-regulation of Behavior. Cambridge University Press, New York
Higgins E 1996 The ‘self digest’: Self-knowledge serving self-regulatory functions. Journal of Personality and Social Psychology 71: 1062–83
Hull J G 1981 A self-awareness model of the causes and effects of alcohol consumption. Journal of Abnormal Psychology 90: 586–600
Mischel W 1996 From good intentions to willpower. In: Gollwitzer P M, Bargh J A (eds.) The Psychology of Action. Guilford, New York
Muraven M, Baumeister R F 2000 Self-regulation and depletion of limited resources: Does self-control resemble a muscle? Psychological Bulletin 126: 247–59
Powers W T 1973 Behavior: the Control of Perception. Aldine Publishing Company, Chicago
Wegner D M 1994 Ironic processes of mental control. Psychological Review 101: 34–52

R. F. Baumeister
Self-regulation in Childhood Self-regulation has several meanings in the psychology literature, some of which have conceptual links with each other. Three definitions of self-regulation are provided below, but only the first is the theme of this article.
1. Definitions Self-regulation refers to the development of children’s ability to follow everyday customs and valued norms embraced and prescribed by their parents and others. Self-regulation is a vital constituent of the socialization process. A broad construct, self-regulation encompasses a diverse set of behaviors such as compliance, the ability to delay actions when appropriate, and modulation of emotions and of motor and vocal activities as suitable to norm-based contexts. The self in self-regulation is an essential feature, and involves a sentient self, one that recognizes and understands the reasons for standards and evaluates one’s own actions in relation to others’ feelings and needs. Thus the hallmark of self-regulation is the ability to act in accordance with various family and social values in the absence of external monitors, across a variety of situations, but neither slavishly nor mindlessly.
A second meaning of self-regulation refers to various physiological or psychobiological processes that function adaptively to situational demands, and often do not involve normative standards. Examples include the regulation of intensity of arousal subsequent to emotion-producing events, or the control of centrally regulated physiological and perceptual systems in response to novel events. Centrally regulated systems include electrodermal responses, vagal tone and heart rate, and attention control. Variation in these responses is often studied for links to temperament styles, control of arousal states, and emotion control and coping (e.g., Rothbart et al. 1995).
A third view of self-regulation comes from recent perspectives in motivation. Here, with an emphasis on an individual’s goals and the processes that shape the person’s actions, a major role is given to the self and to evaluation of gains and losses (e.g., Heckhausen and Dweck 1998). Autonomy, control, self-integrity, and efficacy are essential in order for the individual to be effective in the pursuit of desired goals. Within the motivational framework, substantial attention is directed to understanding how adults facilitate or inhibit the child’s autonomy.
As noted earlier, this entry is concerned with the development of self-regulation to family and sociocultural standards. Related articles include Coping across the Lifespan; Control Behavior: Psychological Perspectives; and Self-regulation in Adulthood.
2. Historical Contexts The quest to understand the origins and development of self-regulation was a late twentieth century conceptual endeavor grounded in a clinical issue: why do some children as young as two and three years of age reveal high levels of dysregulated behaviors? Examples include resistance to parental requests, difficulty with family routines, high levels of activity, and unfocused attention often associated with an inability ‘to wait.’ At the time, earlier findings (e.g., impulse control, resistance to discipline) had implicated factors such as ineffective parenting, as well as distortions in the child’s own cognitive or language skills. With rare exceptions, few attempts had been made to understand antecedents, age-related correlates, and consequences of early self-regulated or dys-regulated behaviors. It seemed that a more comprehensive and integrative perspective could lead to a broadened developmental model, and provide insights about multiple contributors to early-appearing problematic behaviors. This was the impetus for a new view of self-regulation (Kopp 1982). Family and social standards were increasingly emphasized because of their relevance for self-regulation (Kopp 1987, Gralinski and Kopp 1993).
Self-regulation is not a unique psychological construct; rather, it has ideational and behavioral links to courses of action variously labeled as self-management, self-control, and will. These emphases are longstanding and can be found in biblical passages, early philosophical writings, in manuals for parents during the seventeenth century and on to the present, and in the essays of William James, Freud, and John Dewey. More recent thinking can be found in several domains of psychology: developmental, personality, social learning, psychopathology. The enduring historical importance underscores the crucial fact that individual adherence to behavioral norms provides essential support for the structure and function of family and sociocultural groups. Societal norms are grounded in historical time and locale, and are reflected by caregivers’ socialization practices. Changes in practices have implications for the kinds of behaviors expected of children (including those associated with self-regulation). The dramatic changes in the sociocultural scene in the USA since the mid-1900s provide an example. At the beginning of the twentieth century, parenting was primarily authoritarian and children were expected to conform to standards without dissent. However, a mid-century transformation in child-rearing practices occurred. The change was linked to two decades of economic hardship, a devastating war, quests for family life among returning veterans, and a pediatrician (Benjamin Spock) whose counsel to mothers about trusting their own judgement would resonate with new parents. In time, as a more relaxed style of parenting emerged, there was increasing tolerance for children to assert themselves about norms. Parents began to reveal a willingness to negotiate with their children about a variety of things including how norms could be met.
At the beginning of the twenty-first century, a reciprocal style of parenting (often termed authoritative; Baumrind 1967) is far more common than the more inflexible authoritarian approach. For social scientists, this changing socialization orientation led to new or recast research topics. Studies focused on the characteristics and consequences of authoritative parenting, children’s autonomy needs, and understanding how the self-determination needs of the individual balance with the value-based requirements of social–cultural groups. The challenges of defining and explaining the balancing process continue.
3. Socialization and Self-regulation 3.1 Principles Despite changing emphases in socialization, three principles remain constant; these have implications for the development of self-regulation. First, most sociocultural groups expect primary authority figures such as parents to begin the active process of socialization. These individuals actively expose older infants and toddlers to family norms and practices. Later, these socializing agents will be joined by others who typically include teachers, age mates, friends, and neighbors. New socializing agents may reinforce previous norms, expand the interpretation of norms, or introduce new ones. Children are expected to adapt their behavior accordingly. Second, parents tend to be mindful, perhaps implicitly, of a child’s developmental capabilities, the family’s functional needs, and sociocultural standards when they teach norms. Protection of the child from harm seems to be the most salient initial child-rearing value. Then the content of parents’ socialization efforts progresses to other family concerns, and from there to the broader context of neighborhood, community, and societal norms. Across sociocultural groups, there are assumptions that toddlers and preschoolers will learn a few specific family and neighborhood norms, whereas school-aged children will adopt family and cultural customs and moral standards, formal laws, and economic values. Adolescents are expected to prepare themselves for social, emotional, and economic independence using as infrastructure the norms of the family and culture. The third principle is that socialization and self-regulation are adaptive, bidirectional processes. Children and adolescents are not passive recipients in the socialization process. With the growth of their own cognitive skills and recognition of their own self needs, children show an increasing desire to have a say in defining the everyday customs that they encounter, and the ways customs are followed. Thus each succeeding generation in modern, democratic sociocultural groups modifies the content or enactment of customs and norms in some way. An implication for children is that effective self-regulation represents thoughtful, rather than indiscriminate, adherence to family practices and social norms.
3.2 Studying Self-regulation in the Context of Socialization Self-regulation is often used as a conceptual frame for research rather than as a design element that incorporates the three themes of compliance, delay, and behavioral modulation. Instead, most research has focused on the study of compliance during the preschool period (typically, from 3 to 5 years of age). Although this emphasis is warranted (adherence to norms is considered to be a crucial developmental task for this age; Sroufe and Rutter 1984), important knowledge gaps exist. There is fragmented information about effective integration of compliance, delay, and behavioral modulation, and about developmental trends among younger and older children. Recent descriptive studies also reveal considerable complexity in young children’s responses to parental requests. This finding has prompted efforts to refine operational definitions, expand the research contexts for the study of self-regulation, and utilize data-collection procedures that involve multiple measurement techniques and informants. These recent efforts have largely focused on children without major behavioral problems. Lastly, considerable research has highlighted parental and child correlates of self-regulation components, albeit sometimes within a limited age period and one or two contexts. The most complete database is available for the preschool years.
3.3 Research Data: Parents, Children, and Social Norms The timing of active socialization originates in child behaviors such as the onset of walking. When very young children locomote on their own, they discover opportunities to explore, sometimes with the potential for physical or psychological harm. Thus parents start to identify acceptable and unacceptable behaviors for the child, and attempt to obtain some measure of compliance using a variety of attention-getting techniques (Schaffer 1996). These beginnings mark the dynamic and occasionally formidable interplay between parents’ socialization efforts and young children’s trajectory toward self-regulation. Although parents use their child’s behavior as a socializing cue, they also rely on family needs (e.g., the composition of a family, living arrangements), as well as societal norms. A toddler in a large family with limited living space is likely to be exposed to somewhat different restrictions from those placed on an only child living in spacious quarters. However, three overarching themes unite parents’ initial socialization efforts: young children must be protected from harming themselves or others (LeVine 1974, Gralinski and Kopp 1993); they must not tamper with family members’ possessions; and they must learn to respect others’ feelings (Dunn 1988).
As children become older and their social environments increase, parental cautions about normative standards extend beyond the immediate family to peers, neighbors, and teachers, and focus on conventions and moral values, among other topics. In addition to structuring the content of socialization norms, the how of parental socialization is crucial. There is unequivocal consensus about the effect of child-rearing styles, at least among Euro-American middle-class families. Sensitive and knowledgeable parenting is correlated with children’s compliance to norms, across age periods (Kochanska and Aksan 1995).
3.4 A Self-regulation Developmental Trajectory Learning to deal with parents’ norm-related requests is an all-powerful challenge. Children must bring their cognitive, linguistic, social, and emotional resources to situations that demand self-regulation. The task is most difficult for young children: they have cognitive and language limitations; they are inclined to explore anything that looks interesting; they long for autonomy and self-assertion. It is not surprising that self-regulation takes years to mature into an effective process.

Table 1 Self-regulation: developmental landmarks
Components of self-regulation: Compliance; Delay/waiting (object rewards, others’ attention); Modulation of emotions, vocalizations, and motor activities
Landmarks | Compliance | Delay/waiting | Modulation
Emergent: responses tend to be situation specific and unpredictable | Early 2nd year | Early 2nd year | Early PS(a)
Functional: a few predictable norm responses; some cognizance of self role | End 2nd year; early PS | Mid-PS | Mid-PS
Integrative: response coherence across two self-regulation components; nascent causal reasoning | Late PS | Late PS | Late PS
Internalized: understands, accepts others’ values and self role for social norms | Late PS(b); S-A(c) | S-A | S-A
Future oriented: planful behavior for some norm contexts | Mid-late S-A | Mid-late S-A | Mid-late S-A
(a) Preschool period. (b) Italics indicate presumed progression. (c) School age years.

Table 1 depicts the three components of self-regulation along with ages associated with developing behavioral landmarks. The ages represent approximations; the landmarks highlight behaviors that typify increasing maturity to parental norm-based requests. Using compliance as an exemplar, Emergent signifies that responses to a parent prohibition (e.g., child does not stand on a kitchen chair) may occur sporadically. Functional portrays fairly regular compliance to a rule (e.g., child does not stand on chairs, whether in the living room or kitchen); children may show signs of remorse when they do not comply. Integrative refers to predictability (i.e., coherence) in children’s behavior across different kinds of norm-based situations (e.g., a child does not yell in a market, respects a sibling’s possessions, and waits to be served dinner). Extant data reveal that compliance to family do’s and don’ts tends to be easier for young children than norm-based situations that require waiting or modulation of ongoing behaviors (also shown in Table 1). This disparity may be related to additional regulatory demands imposed on children when timing or refined behavioral nuances are critical for an effective response. In these instances, the use of strategic behaviors (e.g., self-distraction, conscious suppression efforts) may be crucial for effective self-regulation. Despite inevitable setbacks, children become more responsive to specific family and other social rules. The growth of self-regulation is facilitated by greater understanding of people and events, widening sensitivity to everyday happenings in the family and
neighborhood, and increasing ability to talk about self needs in relation to others’ needs. The importance of language in self-regulation is exemplified by the transition from outright refusals common among toddlers to attempts to talk about and negotiate task demands among older preschoolers (Klimes-Dougan and Kopp 1999). Still, children of this age may have difficulty with anger control, waiting for parental attentiveness, and sharing possessions. An extensive database on topics related to self-regulation in the preschool years suggests that correlates include gender (many girls learn behavioral controls sooner than boys), temperament characteristics such as controlled attention and inhibition of motor acts, verbal skills, effective self-distraction, and competent use of strategies (Eisenberg and Fabes 1992, Metcalfe and Mischel 1999). Overall, continuing efforts of parents to socialize their children gradually dovetail with children’s growth of their own behavioral, cognitive, social, and motivational repertoire. Few details exist about how parents and children collaborate with their own resources so that children increasingly take on responsibility for self-regulation. What can be said with certainty is that self-regulation gradually emerges when children understand the reasons for values and standards, possess a cognizant self, are able to suppress an immediate goal that runs counter to family norms, and voluntarily assume responsibility for their own actions across a variety of situations. Self-generated and self-monitored adherence to norms is self-regulation. This autonomous, sometimes conscious, effort to frame activities with normative values stands in contrast to early forms of compliance that do not rely on knowledge or self-awareness. Effective self-regulation is by definition an adaptive, developmental process in that children have to discover how to meet their own self needs while in general following societal standards across many settings. The task is a changing one because each age period is associated with new socialization demands imposed by parents, teachers, other individuals, and the larger social context.
4. Self-regulation: Its Value and Directions The construct of self-regulation has heightened awareness of dys-regulation and its long-term implications. Findings from related research point to long-standing problems when young children have difficulty with compliance, delay, and impulse control. To date, however, the developmental antecedents of these problems have been elusive. A renewed focus on the toddler years should be useful: two important developments occur in the second and third years that have relevance for the growth of self-regulation. These are emergent selfhood and the control of attention and consciousness. It is well known that effective norm-based behaviors require reconciliation of conflicts between self-goals and social norm demands. Among mature individuals, these conflicts are often met with reflection about courses of action, informal appraisals of costs and benefits, and plans for making reparations should they be necessary. This complex cognitive strategy is not available to young children. Understanding how very young children begin to sort through competing self and social goals may provide insights into effective paths to self-regulation. With respect to conscious learning about social norms, this almost certainly occurs, but when and how are not well understood. However, consciousness demands psychic energy, so it is in children’s best interests to sort out those family and social norms that require relatively habitual responses from those that demand additional attentional or memory efforts from themselves. How children learn to differentiate and classify standards may yield understanding about strategies useful for modulating behaviors in novel norm-based contexts, and why some children falter in these situations. 
See also: Parenting: Attitudes and Beliefs; Self-development in Childhood; Self-regulation in Adulthood; Socialization and Education: Theoretical Perspectives; Socialization in Infancy and Childhood; Socialization, Sociology of
Bibliography
Baumrind D 1967 Child care practices anteceding three patterns of preschool behavior. Genetic Psychology Monographs 75: 43–88
Dunn J 1988 The Beginning of Social Understanding. Blackwell, Oxford, UK
Eisenberg N, Fabes R A 1992 Emotion, regulation, and the development of social competence. In: Clark M S (ed.) Review of Personality and Social Psychology: Vol. 14. Emotion and Social Behavior. Sage, Newbury Park, CA, pp. 119–50
Gralinski J H, Kopp C B 1993 Everyday rules for behavior: Mothers’ requests to young children. Developmental Psychology 29: 573–84
Heckhausen J M, Dweck C S 1998 Motivation and Self-regulation Across the Life Span. Cambridge University Press, New York
Klimes-Dougan B, Kopp C B 1999 Children’s conflict tactics with mothers: A longitudinal investigation of the toddler and preschool years. Merrill-Palmer Quarterly 45: 226–41
Kochanska G, Aksan N 1995 Mother–child mutually positive affect, the quality of child compliance to requests and prohibitions, and maternal control as correlates of early internalization. Child Development 66: 236–54
Kopp C B 1982 Antecedents of self-regulation: a developmental perspective. Developmental Psychology 18: 199–214
Kopp C B 1987 The growth of self-regulation: caregivers and children. In: Eisenberg N (ed.) Contemporary Topics in Developmental Psychology. Wiley, New York, pp. 34–55
LeVine R A 1974 Parental goals: A cross-cultural view. In: Leichter H J (ed.) The Family as Educator. Teachers College Press, New York, pp. 56–70
Metcalfe J, Mischel W 1999 A hot/cool-system analysis of delay of gratification: dynamics of willpower. Psychological Review 106: 3–19
Rothbart M K, Posner M I, Hershey K L 1995 Temperament, attention, and developmental psychopathology. In: Cicchetti D, Cohen D J (eds.) Manual of Developmental Psychopathology. Wiley, New York, Vol. 1, pp. 315–40
Schaffer H R 1996 Social Development. Blackwell, Oxford, UK
Sroufe L A, Rutter M 1984 The domain of developmental psychopathology. Child Development 55: 17–29
C. B. Kopp
Semantic Knowledge: Neural Basis of Semantic knowledge is a type of long-term memory, commonly referred to as semantic memory, consisting of concepts, facts, ideas, and beliefs (e.g., Tulving 1983). Semantic memory is thus distinct from episodic or autobiographical memories, which are unique to an individual and tied to a specific time and place. For example, answering the question ‘What does the word breakfast mean?’ requires semantic memory. In contrast, answering the question ‘What did you have for breakfast yesterday?’ requires episodic memory, to retrieve information about events in our personal past, as well as semantic memory, to understand the question. Semantic memory therefore includes the information stored in our brains that represents the meaning of words and objects. Understanding the nature of meaning, however, has proven to be a fairly intractable problem, especially regarding the meaning of words. One important reason why this is so is that words have multiple meanings. The specific meaning of a word is determined by context, and comprehension is possible because we have contextual representations (see Miller 1999 for a discussion of lexical semantics and context). Cognitive neuroscientists have begun to get traction on the problem of meaning in the brain by limiting inquiry to concrete objects as represented by pictures, and by their names. Consider, for example, the most common meanings of two concrete objects: a camel and a wrench. Camel is defined as ‘either of two species of large, domesticated ruminants (genus Camelus) with a humped back, long neck, and large cushioned feet,’ whereas wrench is defined as ‘any number of tools used for holding and turning nuts, bolts, pipes, etc.’ (Webster’s New World Dictionary 1988). Two things are noteworthy about these definitions. First, they are largely about features; camels are large and have a humped back; wrenches hold and turn things. Second, different types of features are emphasized for different types of object. The definition of the camel consists of information about its visual appearance, whereas the definition of the wrench emphasizes how it is used. Differences in the types of feature that define different objects have played, and continue to play, a central role in models of how semantic knowledge is organized in the human brain. Another point about these brief definitions is that they include only part, and perhaps only a small part, of the information we may possess about these objects. For example, we may know that camels are found primarily in Asia and Africa, that they are known as the ‘ships of the desert,’ and that the word camel can also refer to a color and a brand of cigarettes. Similarly, we also know that camels are larger than a bread box and weigh less than a 747 jumbo jet.
Although this information is also part of semantic memory, little is known about the brain bases of these associative and inferential processes. Neuroscientists are, however, beginning to gain insights into the functional neuroanatomy associated with identifying objects and retrieving information about specific object features and attributes.
1. Semantic Representations A central question for investigators interested in the functional neuroanatomy of semantic memory has been to determine how information about object concepts is represented in the brain. A particularly influential idea guiding much of this work is that the features and attributes that define an object are stored in the perceptual and motor systems active when we first learned about that object. For example, information about the visual form of an object, its typical
color, its unique pattern of motion, would be stored in or near regions of the visual system that mediate perception of form, color, and motion. Similarly, knowledge about the sequences of motor movements associated with the use of an object would be stored in or near the motor systems active when that object was used. This idea has a long history in behavioral neurology. Indeed, many neurologists at the beginning of the twentieth century assumed that the concept of an object was composed of the information about that object learned through direct sensory experience and stored in or near sensory and motor cortices (e.g., Lissauer 1988, 1890).
2. Semantic Deficits Result from Damage to the Left Temporal Lobe The modern era of study on the organization of semantic knowledge in the human brain began with Elizabeth Warrington’s seminal paper ‘The selective impairment of semantic memory’ (Warrington 1975). Warrington reported three patients with progressive dementing disorders that provided neurological evidence for a semantic memory system. There were three main components to the disorder. First, it was selective. The disorder could not be accounted for by general intellectual impairment, sensory or perceptual problems, or an expressive language disorder. Second, the disorder was global, in the sense that it was neither material- nor modality-specific. Object knowledge was impaired regardless of whether objects were represented by pictures or their written or spoken names. Third, the disorder was graded. Knowledge of specific object attributes (e.g., does a camel have two, four, or six legs?) was more impaired than knowledge of superordinate category information (i.e., is a camel a mammal, bird, or insect?). Following Warrington’s report, similar patterns of semantic memory dysfunction have been reported in patients with brain damage resulting from a wide range of etiologies. These have included patients with progressive dementias such as Alzheimer’s disease and semantic dementia, herpes encephalitis, and closed head injury (for review, see Patterson and Hodges 1995). Consistent with the properties established by Warrington, these patients typically had marked difficulty producing object names under a variety of circumstances, including naming pictures of objects, naming from the written descriptions of objects, and generating lists of objects that belong to a specific category (e.g., animals, fruits and vegetables, furniture, etc.). 
In addition, the deficits were associated primarily with damage to the left temporal lobe, suggesting that information about object concepts may be stored, at least in part, in this region of the brain.
3. Brain Damage Can Lead to Category-specific Semantic Deficits Other patients have been described with relatively selective deficits in recognizing, naming, and retrieving information about different object categories. The categories that have attracted the most attention are animals and tools. This is because of a large and growing number of reports of patients with greater difficulty naming and retrieving information about animals (and often other living things) than about tools (and often other man-made objects). Reports of the opposite pattern of dissociation (greater for tools than animals) are less frequent. However, enough carefully-studied cases have been reported to provide convincing evidence that these categories can be doubly dissociated as a result of brain damage. The impairment in these patients is not limited to visual recognition. Deficits occur when knowledge is probed visually and verbally, and therefore assumed to reflect damage to the semantic system or systems (for review, see Forde and Humphreys 1999). While it is now generally accepted that these disorders are genuine, their explanation, on both the cognitive and neural levels, remains controversial. Two general types of explanation have been proposed. The most common explanation focuses on the disruption of stored information about object features. Specifically, it has been proposed that knowledge about animals and tools can be disrupted selectively because these categories are dependent on information about different types of features stored in different regions of the brain. As exemplified by the definitions provided previously, animals are defined primarily by what they look like, and functional attributes play a much smaller role in their definition. In contrast, functional information, specifically how an object is used, is critical for defining tools. 
As a result, damage to areas where object form information is stored leads to deficits for categories that are overly dependent on visual form information, whereas damage to regions where object use information is stored leads to deficits for categories overly dependent on functional information. The finding that patients with category-specific deficits for animals also have difficulties with other visual-form-based categories, such as precious stones, provides additional support for this view (e.g., Warrington and Shallice 1984). This general framework for explaining category-specific disorders was first proposed by Warrington and colleagues in the mid- to late 1980s (e.g., Warrington and Shallice 1984). Influential extensions and reformulations of this general idea have been provided by a number of investigators, including Farah and McClelland (1991), Damasio (1990), Caramazza et al. (1990), and Humphreys and Riddoch (1987). The second type of explanation focuses on broader semantic distinctions (e.g., animate v. inanimate
objects), rather than on features and attributes, as the key to understanding category-specific deficits. A variant of this argument has recently been proposed by Caramazza and Shelton (1998) to counter a number of difficulties with feature-based formulations. Specifically, these investigators note that a central prediction of at least some feature-based models is that patients with a category-specific deficit for animals should have more difficulty in answering questions that probe knowledge about visual information about animals (does an elephant have a long tail?), than about function information (is an elephant found in the jungle?). As Caramazza and Shelton (1998) show, at least some patients with an animal-specific knowledge disorder (and, according to their argument, all genuine cases) have equivalent difficulty with both visual and functional questions about animals. As a result of these and other findings, Caramazza and Shelton argue that category-specific disorders cannot be explained by feature-based models. Instead, they propose that such disorders reflect evolutionary adaptations for animate objects, foods, and perhaps by default, tools, and other manufactured objects (the ‘domain-specific hypothesis’; Caramazza and Shelton 1998).
4. Functional Brain Imaging Reveals that Semantic Knowledge about Objects is Distributed in Different Regions of the Brain Functional brain imaging allows investigators to identify the neural systems active when normal individuals perform different types of task. These studies have confirmed findings with brain-damaged subjects, and have begun to extend knowledge of the neural basis of semantic memory. Although functional brain imaging is a relatively new tool, the current evidence suggests that information about different object features is stored in different regions of the cerebral cortex. One body of evidence in support of this claim comes from experiments using word generation tasks. Subjects are presented with the name of an object, or a picture of an object, and required to generate a word denoting a specific feature or attribute associated with that object. In one study, positron emission tomography (PET) was used to investigate differences in patterns of brain activity when subjects generated the name of an action typically associated with an object (e.g., saying the word ‘pull’ in response to a static, achromatic line-drawing of a child’s wagon), relative to generating an associated color word (e.g., saying ‘red’ in response to the wagon) (Martin et al. 1995). Relative to generating a color word, action-word generation activated a posterior region of the left temporal lobe, called the middle temporal gyrus. This location was of particular interest because it was just anterior to sites known to be active during motion
Figure 1 Brain regions active during retrieval of color and action knowledge. A Ventral view of the human brain showing approximate locations of regions in the occipital cortex active during color perception (black ovals), and color word generation in response to object pictures and object names (gray ovals). This region is also active when subjects generate color imagery in response to verbal commands, and in subjects with auditory word–color synesthesia (for review, see Martin 2001). B Lateral view of the left hemisphere showing location of the region active during motion perception (area MT; black oval), and region of the middle temporal gyrus active when subjects generated action words to object pictures and object names (gray oval). Location of this region is based on over 20 studies encompassing nine native languages (for review, see Martin 2001). The white oval indicates the approximate location of the region of inferior prefrontal cortex implicated in selecting, retrieving, and maintaining lexical and phonological information (for review, see Martin and Chao 2001)
perception (area MT). In contrast, relative to generating action words, color-word generation activated a region on the ventral surface, or underside, of the temporal lobe, called the fusiform gyrus. This location was of particular interest because it was just anterior to sites known to be active during color perception. Finally, the same pattern of results was found when subjects were presented with the written names of objects, rather than an object picture (Fig. 1) (for review see Martin 2001). These findings, and findings from other studies using a wide range of semantic processing tasks, provide support for two important ideas about the neural representation of semantic knowledge. First, there is a single semantic system in the brain, rather than separate semantic systems for different modalities of input (visual, auditory) or types of material (pictures of objects, words) (e.g., Vandenberghe et al. 1996). This system includes multiple brain regions, especially in the temporal and frontal lobes of the left hemisphere (for review see Price et al. 1999, Martin 2001). Second,
information about object features and attributes is not stored in a single location, but rather is stored as a distributed network of discrete cortical areas. Moreover, the locations of the sites appear to follow a specific plan that parallels the organization of sensory systems and, as will be reviewed below, motor systems as well.
5. Ventral Occipitotemporal Cortex and the Representation of Object Form Another body of evidence that object concepts may be represented by distributed feature networks comes from studies contrasting patterns of neural activity associated with naming, and performing other types of tasks with objects from different categories. A feature common to all concrete objects is that they have physical shape or form. Evidence is accumulating that suggests that many object categories elicit distinct
Figure 2 Brain regions showing object category-related patterns of activation. Approximate location of regions in the ventral temporal cortex A, and lateral cortex B, active during viewing, naming, and retrieving information about animals (black ovals) and tools (gray ovals). Because of their anatomic proximity to visual object form (and form-related features like color), motion, and motor processing areas, it has been suggested that information about how an object appears, moves, and is used, is stored in these regions (see text for details). These regions interact with lower-level visual processing areas in occipital cortex (double arrows); especially when discriminating among items of similar appearance (e.g., four-legged animals, faces). 1. Superior temporal sulcus (STS). 2. Middle temporal gyrus. 3. Premotor cortex
patterns of neural activity in regions involved in object form processing (ventral occipital and temporal cortex). Moreover, the locations of these category-related activations appear to be consistent across individual subjects and processing tasks (e.g., naming object pictures, matching pictures, reading object names). This seems to be especially so for objects defined primarily by visual form-related features such as animals, faces, and landmarks. Early reports using PET found that naming (Martin et al. 1996) and matching (Perani et al. 1995) pictures of animals resulted in greater activation of the left occipital cortex than performing these same tasks with pictures of tools. Because the occipital cortex is involved primarily in the early stages of visual processing, it was suggested that this activity reflected top-down activation from more anterior sites in the occipitotemporal object processing stream (Martin et al. 1996). This may occur whenever detailed information about visual features or form is needed to identify an object. Specifically, naming objects that differ from other members of the same category by relatively subtle differences in visual form (four-legged
animals) may require access to stored information about visual detail. Retrieving this information, in turn, may require participation of early visual processing areas in the occipital cortex. A subsequent report showing that unilateral occipital lesions could result in greater difficulty in naming and retrieving information about animals than tools provided converging evidence for this view (Tranel et al. 1997). However, most patients with semantic deficits for animals have had lesions confined to the temporal lobes (for review, see Gainotti et al. 1995). Functional brain imaging of normal individuals has now provided evidence for category-related activations in the ventral region of the temporal lobes. This has been accomplished by using functional magnetic resonance imaging (fMRI), which provides better spatial resolution than was possible using PET. A number of investigators have found that distinct regions of the ventral temporal cortex show differential responses to different object categories. In one study, viewing, naming, and matching pictures of animals, as well as answering written questions about animals, were found to activate the lateral region of the fusiform
gyrus, relative to performing these tasks with pictures and names of tools. In contrast, the medial fusiform was more active for tools than animals. A similar, but not identical, pattern of activation was found for viewing faces (lateral fusiform) relative to viewing houses (medial fusiform) (Chao et al. 1999). Other investigators have also reported face-related activity in the lateral region of the fusiform gyrus, and house-related activity in more medial regions of the ventral temporal lobe, including the fusiform, lingual, and parahippocampal gyrus (Fig. 2) (for review, see Kanwisher et al. 2001, Martin 2001). These findings suggest that different object categories elicit activity in different regions of ventral temporal cortex, as defined by the location of their peak activation. Moreover, the topological arrangement of these peaks was consistent across subjects and tasks. Importantly, however, the activity associated with each object category was not limited to a specific region of the ventral occipitotemporal cortex, but rather was distributed over much of the region (Chao et al. 1999). Additional evidence for the distributed nature of object representations in the ventral temporal cortex comes from single cell recordings from intracranial depth electrodes implanted in epileptic patients (Kreiman et al. 2000a). Recordings from the regions of the medial temporal cortex (entorhinal cortex, hippocampus, and amygdala), which receive major inputs from the ventral temporal regions described above, identified neurons that showed highly selective responses to different object categories, including animals, faces, and houses. Moreover, the responses of the neurons were category-specific rather than stimulus-specific. That is, animal-responsive cells responded to all pictures of animals, rather than to one picture or a select few.
Studies reporting similar patterns of neural activity when subjects view and imagine objects provide further support that object information is stored in these regions of cortex. For example, regions active during face perception are also active when subjects imagine famous individuals (O’Craven and Kanwisher 2000). Similar findings have been reported for viewing and imagining known landmarks (O’Craven and Kanwisher 2000), houses, and even chairs (Ishai et al. 2000). In addition, the majority of category-selective neurons recorded from human temporal cortex also responded selectively when the patients were asked to imagine these objects (Kreiman et al. 2000b). Taken together, the data suggest that the ventral occipitotemporal cortex may be best viewed, not as a mosaic of discrete category-specific areas, but rather as a lumpy feature-space, representing stored information about features of object form shared by members of a category. How this feature space is organized, and why its topological arrangement is so consistent from one subject to another, are critical questions for future investigations.
6. Lateral Temporal Cortex and the Representation of Object Motion Information about how objects move through space, and patterns of motor movements associated with their use, are other features that could aid object identification. This would be especially true for categories of manufactured objects such as tools that have a more variable mapping between their name and their visual form than a category such as four-legged animals. Thus access to these additional features may be required to identify them as unique entities. Here again, evidence is accumulating that naming and identifying objects with motion-related attributes activate areas close to regions that mediate perception of object motion (posterior region of the lateral temporal lobe) with different patterns of activity associated with biological and manufactured objects. A number of laboratories using a variety of paradigms with pictures and words have reported that tools elicit greater activity in the posterior, left middle temporal gyrus than animals and other object categories (for review, see Martin 2001). Moreover, the active region was just anterior to area MT, and overlapped with the region active in the verb generation studies discussed above. Damage to this region has been reported selectively to impair tool recognition and naming (Tranel et al. 1997). In contrast, naming animals and viewing faces elicits greater activity in the superior temporal sulcus (STS) (Fig. 2). This region is of particular interest because of its association with the perception of biological motion in monkeys as well as humans (for review, see Allison et al. 2000). As suggested for the ventral temporal cortex, neurons in the lateral temporal cortex may also be tuned to features that objects within a category share. The nature of these features is unknown; however, based on its anatomical proximity to visual motion processing areas, this region may be tuned to features of motion associated with different objects. 
In support of this conjecture, increased activity in the posterior lateral temporal cortex has been found when subjects viewed static pictures of objects that imply motion (Kourtzi and Kanwisher 2000, Senior et al. 2000), and when subjects focused attention on the direction of eye gaze (Hoffman and Haxby 2000). Investigation of the differences in the properties of motion associated with biological and manufactured objects may provide clues to the organization of this region.
7. Ventral Premotor Cortex and the Representation of Use-associated Motor Movements If activations associated with different object categories reflect stored information about object properties, then one would expect tools to elicit activity in
motor-related regions. Several laboratories have reported this association. Specifically, greater activation of the left ventral premotor cortex has been found for naming tools relative to naming animals, viewing pictures of tools relative to viewing pictures of animals, faces, and houses, and generating action words to tools (Fig. 2). Mental imagery (e.g., imagining manipulating objects with the right hand) has also resulted in ventral premotor activation (Fig. 2) (for review, see Martin 2001). Electrophysiological studies have identified cells in monkey ventral premotor cortex that responded not only when objects were grasped, but also when the animals viewed objects they had had experience of manipulating (Jeannerod et al. 1995). The ventral premotor activation noted in the human neuroimaging studies may reflect a similar process. These findings are consistent with reports of patients with greater difficulty in naming tools than animals following damage to the left lateral frontal cortex (for review, see Gainotti et al. 1995) and suggest that the left ventral premotor cortex may be necessary for naming and retrieving information about tools.
8. Conclusions and Future Directions Evidence from functional brain imaging studies provides considerable support for feature-based models of semantic representation. However, some of the findings could be interpreted as evidence for the ‘domain-specific’ hypothesis as well. For example, the clustering of activations associated with animals and faces, on the one hand, and tools and houses, on the other, may be viewed as consistent with this interpretation. Other evidence suggests, however, that not all nonbiological object representations cluster together. For example, it has been reported that activity associated with a category of objects of no evolutionary significance (chairs) was located lateral to the face-responsive region (in the inferior temporal gyrus), rather than medially where tools and houses elicit their strongest activity (Ishai et al. 1999). Both functional brain imaging and patient studies suggest that object knowledge is represented in distributed cortical networks. There do not seem to be single regions that map on to whole object categories. Nevertheless, there may be a broader organization of these networks that reflects evolutionarily adapted, domain-specific knowledge systems for biological and nonbiological kinds of object. This possibility remains to be explored. Although progress is being made in understanding the neural substrate of some aspects of meaning, many important questions and issues remain. For example, semantic representations are prelexical. Yet, to be of service, they must be linked intimately to the lexicon. Little is known about how the lexicon is organized in the brain, and how lexical and semantic networks interact (Damasio et al. 1996).
Another important issue concerns the neural basis of retrieval from semantic memory. Semantic knowledge, like all stored information, is not useful unless it can be retrieved efficiently. Studies of patients with focal lesions have shown that the left lateral prefrontal cortex is involved critically in word retrieval, even in the absence of a frank aphasia (e.g., Baldo and Shimamura 1998). Functional brain imaging studies have confirmed this association (Fig. 2). Moreover, recent evidence suggests that different regions of the left inferior prefrontal cortex may mediate selection among competing alternatives in semantic memory, whereas other regions may be involved in retrieving, manipulating, and maintaining semantic information in working memory (for review, see Martin and Chao 2001). Much additional work will be needed to describe the role that different regions of prefrontal cortex play in semantic processing. Another critical question will be to determine how semantic object representations are modified by experience. Some of the findings discussed here suggest that the topological arrangement among object categories in the cortex is relatively fixed. Other evidence suggests a much more flexible organization, in which the development of expertise with an object category involves a particular portion of the fusiform gyrus (Gauthier et al. 1999). Longitudinal studies tracking changes in the brain associated with learning about completely novel objects should help to clarify this issue. There is also little known about where object information unrelated to sensory or motor properties is stored (e.g., that camels live in Asia and Africa). Similarly, little is known about the representation of abstract concepts (e.g., honor, liberty, and justice), metaphors, and the like. Finally, questions concerning the neural systems involved in object category formation have been almost totally neglected. E. E. 
Smith and colleagues have shown that exemplar-based and rule-based categorization activate different neural structures (Smith et al. 1999). This finding suggests that categorization processes may be a fruitful area for future investigation. The advent of techniques to converge fMRI data with technologies such as magnetoencephalography (MEG) that provide temporal information on the order of milliseconds, should provide a wealth of new information on the neural basis of semantic knowledge. See also: Comprehension, Cognitive Psychology of; Dementia, Semantic; Evolution and Language: Overview; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Meaning and Rule-following: Philosophical Aspects; Memory for Meaning and Surface Memory; Semantic Similarity, Cognitive Psychology of; Semantics; Sentence Comprehension, Psychology of; Word, Linguistics of; Word Meaning: Psychological Aspects
Bibliography
Allison T, Puce A, McCarthy G 2000 Social perception and visual cues: role of the STS. Trends in Cognitive Science 4: 267–78
Baldo J V, Shimamura A P 1998 Letter and category fluency in patients with frontal lobe lesions. Neuropsychology 12: 259–67
Caramazza A, Shelton J R 1998 Domain-specific knowledge systems in the brain: the animate-inanimate distinction. Journal of Cognitive Neuroscience 10: 1–34
Caramazza A, Hillis A E, Rapp B C, Romani C 1990 The multiple semantics hypothesis: multiple confusions? Cognitive Neuropsychology 7: 161–89
Chao L L, Haxby J V, Martin A 1999 Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects. Nature Neuroscience 2: 913–19
Damasio A R 1990 Category-related recognition deficits as a clue to the neural substrates of knowledge. Trends in Neurosciences 13: 95–8
Damasio H, Grabowski T J, Tranel D, Hichwa R D, Damasio A R 1996 A neural basis for lexical retrieval. Nature 380: 499–505
Farah M J, McClelland J L 1991 A computational model of semantic memory impairment: Modality specificity and emergent category specificity. Journal of Experimental Psychology: General 120: 339–57
Forde E M E, Humphreys G W 1999 Category-specific recognition impairments: a review of important case studies and influential theories. Aphasiology 13: 169–93
Gainotti G, Silveri M C, Daniele A, Giustolisi L 1995 Neuroanatomical correlates of category-specific semantic disorders: a critical survey. Memory 3: 247–64
Gauthier I, Tarr M J, Anderson A W, Skudlarski P, Gore J C 1999 Activation of the middle fusiform ‘face area’ increases with expertise in recognizing novel objects. Nature Neuroscience 2: 568–73
Hoffman E A, Haxby J V 2000 Distinct representations of eye gaze and identity in the distributed human neural system for face perception. Nature Neuroscience 3: 80–4
Humphreys G W, Riddoch M J 1987 On telling your fruit from your vegetables: A consideration of category-specific deficits after brain damage. Trends in Neurosciences 10: 145–48
Ishai A, Ungerleider L G, Haxby J V 2000 Distributed neural systems for the generation of visual images. Neuron 28: 979–90
Ishai A, Ungerleider L G, Martin A, Schouten J L, Haxby J V 1999 Distributed representation of objects in the ventral visual pathway. Proceedings of the National Academy of Sciences, USA 96: 9379–84
Jeannerod M, Arbib M A, Rizzolatti G, Sakata H 1995 Grasping objects: The cortical mechanisms of visuomotor transformation. Trends in Neurosciences 18: 314–20
Kanwisher N, Downing P, Epstein R, Kourtzi Z 2001 Functional neuroimaging of visual recognition. In: Cabeza R, Kingstone A (eds.) Handbook of Functional Neuroimaging of Cognition. MIT Press, Cambridge, MA
Kourtzi Z, Kanwisher N 2000 Activation in human MT/MST by static images with implied motion. Journal of Cognitive Neuroscience 12: 48–55
Kreiman G, Koch C, Fried I 2000a Category-specific visual responses of single neurons in the human medial temporal lobe. Nature Neuroscience 3: 946–53
Kreiman G, Koch C, Fried I 2000b Imagery neurons in the human brain. Nature 408: 357–61
Lissauer H 1988 [1890] A case of visual agnosia with a contribution to theory. Cognitive Neuropsychology 5: 157–92
Martin A 2001 Functional neuroimaging of semantic memory. In: Cabeza R, Kingstone A (eds.) Handbook of Functional Neuroimaging of Cognition. MIT Press, Cambridge, MA
Martin A, Chao L L 2001 Semantic memory and the brain: Structure and processes. Current Opinion in Neurobiology 11: 194–201
Martin A, Haxby J V, Lalonde F M, Wiggs C L, Ungerleider L G 1995 Discrete cortical regions associated with knowledge of color and knowledge of action. Science 270: 102–5
Martin A, Wiggs C L, Ungerleider L G, Haxby J V 1996 Neural correlates of category-specific knowledge. Nature 379: 649–52
Miller G A 1999 On knowing a word. Annual Review of Psychology 50: 1–19
O’Craven K M, Kanwisher N 2000 Mental imagery of faces and places activates corresponding stimulus-specific brain regions. Journal of Cognitive Neuroscience 6: 1013–23
Patterson K, Hodges J R 1995 Disorders of semantic memory. In: Baddeley A D, Wilson B A, Watts F N (eds.) Handbook of Memory Disorders. Wiley, New York
Perani D, Cappa S F, Bettinardi V, Bressi S, Gorno-Tempini M, Matarrese M, Fazio F 1995 Different neural systems for the recognition of animals and man-made tools. Neuroreport 6: 1637–41
Price C J, Indefrey P, van Turennout M 1999 The neural architecture underlying the processing of written and spoken word forms. In: Brown C M, Hagoort P (eds.) Neurocognition of Language Processing. Oxford University Press, New York
Senior C, Barnes J, Giampietro V, Simmons A, Bullmore E T, Brammer M, David A S 2000 The functional neuro-anatomy of implicit motion perception of ‘representational momentum’. Current Biology 10: 16–22
Smith E E, Patalano A L, Jonides J 1999 Alternative strategies of categorization. Cognition 65: 167–96
Tranel D, Damasio H, Damasio A R 1997 A neural basis for the retrieval of conceptual knowledge. Neuropsychologia 35: 1319–27
Tulving E 1983 Elements of Episodic Memory. Oxford University Press, New York
Vandenberghe R, Price C, Wise R, Josephs O, Frackowiak R S 1996 Functional anatomy of a common semantic system for words and pictures. Nature 383: 254–6
Warrington E K 1975 The selective impairment of semantic memory. Quarterly Journal of Experimental Psychology 27: 635–57
Warrington E K, Shallice T 1984 Category specific semantic impairments. Brain 107: 829–54
Webster’s New World Dictionary 1988 3rd college edn. Simon and Schuster, New York
A. Martin
Semantic Processing: Statistical Approaches The development of methods for representing meaning is a critical aspect of cognitive modeling and of applications that must extract meaning from text input. This ability to derive meaning is the key to any
approach that needs to use or evaluate knowledge. Nevertheless, determining how meaning is represented and how information can be converted to this representation is a difficult task. For example, any theory of meaning must describe how the meaning of each individual concept is specified and how relationships among concepts may be measured. Thus, appropriate representations of meaning must first be developed. Second, computational techniques must be available to permit the derivation and modeling of meaning using these representations. Finally, some form of information about the concepts must be available in order to permit a computational technique to derive the meaning of the concepts. This information about the concepts can either be more structured human-based input, for example dictionaries or links among related concepts, or less structured natural language. With the advent of more powerful computing and the availability of on-line texts and machine-readable dictionaries, novel techniques have been developed that can automatically derive semantic representations. These techniques capture effects of regularities inherent in language to learn about semantic relationships among words. Other techniques have relied on hand-coding of semantic information, which is then placed in an electronic database so that users may still apply statistical analyses on the words in the database. All of these techniques can be incorporated into methods for modeling a wide range of psychological phenomena such as language acquisition, discourse processing, and memory. In addition, the techniques can be used in applied settings in which a computer can derive semantic knowledge representations from text. These settings include information retrieval, natural language processing, and discourse analysis.
1. The Representation of Meaning
Definitions of semantics generally encompass the concepts of knowledge of the world, of meanings of words, and of relationships among the words. Thus, techniques that perform analyses of semantics must have representations that can account for these different elements. Because these computational techniques rely on natural language, typically in the form of electronic text, they must further be able to convert information contained in the text to the appropriate representation. Two representational systems that are widely used in implementations of computational techniques are feature systems and semantic networks. While other methods of semantic representation (e.g., schemas) also account for semantic information in psychological models, they are not as easily specified computationally.
1.1 Feature Systems
The primary assumption behind feature systems is that if a fixed number of basic semantic features can be discovered, then semantic concepts can be defined by the combination, or composition, of these features. Thus, different concepts will have different basic semantic features or levels of these features. Concepts vary in their relatedness to each other, based on the degree to which they have the same semantic features (e.g., Smith and Medin 1981). One approach to defining the meaning of words empirically with a feature system was the semantic differential (Osgood et al. 1957). By having people rate how words fell on Likert scales of different bipolar adjectives (e.g., fair–unfair, active–passive), a word could be represented as the combination of its ratings. By collecting large numbers of these ratings, Osgood could define a multidimensional semantic space in which the meaning of a word is represented as a point based on its ratings on each scale, and words could be compared with each other by measuring distances in the space.
Feature representations have been widely used in psychological models of memory and categorization. They provide a simple representation for encoding information for input into computational models, most particularly connectionist models (e.g., Rumelhart and McClelland 1986). While the representation permits assigning concepts based on their features, it does not explicitly show relations among the concepts. In addition, in many models using feature representations, the dimensions are hand-created; therefore, it is not clear whether the features are based on real-world constructions, or are derived psychological constructions developed on the fly, based on the appropriate context. Some of the statistical techniques described below avoid the problem of using humanly defined dimensions by automatically extracting relevant dimensions.
1.2 Semantic or Associative Networks
A semantic network approach views the meaning of concepts as being determined by their relations to other concepts. Concepts are represented as nodes, with labeled links (e.g., IS-A or Part-of) representing relationships among the nodes. Thus, knowledge is a combination of information about concepts and how those concepts relate to each other. Based on the idea that activation can spread from one node to another, semantic networks have been quite influential in the development of models of memory. Semantic networks and spreading activation have been widely used for modeling sentence verification times and priming, and have been incorporated into many localist connectionist models. Semantic networks permit economies of storage because concepts can inherit properties shared by other concepts (e.g., Collins and Quillian 1969). This basis has made semantic networks a popular approach for the development of computer-based lexicons, most particularly within the field of artificial intelligence.
Nevertheless, many of the assumptions for relatedness among concepts must still be based on handcrafted networks in which the creator of the network uses their knowledge to develop the concepts and links.
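The economy of storage that inheritance provides can be illustrated with a minimal sketch. The concepts, links, and properties below are invented for illustration (a handcrafted toy network in the spirit of the approach, not data from any published lexicon), and the function names are hypothetical.

```python
# Hypothetical toy network: each concept has one IS-A parent and a few
# properties stored locally at the most general level where they apply.
IS_A = {"canary": "bird", "bird": "animal", "animal": None}
PROPERTIES = {
    "canary": {"is yellow", "can sing"},
    "bird": {"has wings", "can fly"},
    "animal": {"breathes", "eats"},
}


def inherited_properties(concept):
    """Collect properties stored at the concept and at every IS-A ancestor."""
    collected = set()
    while concept is not None:
        collected |= PROPERTIES.get(concept, set())
        concept = IS_A.get(concept)
    return collected


def has_property(concept, prop):
    """True if the property is stored at the concept or inherited from above."""
    return prop in inherited_properties(concept)


print(sorted(inherited_properties("canary")))
print(has_property("canary", "breathes"))
```

Here "breathes" is stored once, at the animal node, yet it is retrievable for canary by following IS-A links upward. Every link and property still had to be entered by hand, which is the limitation noted above.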
2. Statistical and Human Approaches to Deriving Meaning
No matter what type of representation is used for meaning, some form of information must be stored within the representation for it to be useful. There are two primary techniques to derive representations and fill in the lexical information. The first is to handcraft the information through human judgments about definitions, categorization, organization, or associations among concepts. The second is to use automated methods to extract information from existing on-line texts. The former approach relies on human expertise, and it is assumed that humans’ introspective abilities and domain knowledge will provide an accurate representation of lexical relations. The latter approach assumes that the techniques can create a useful and accurate representation of meaning from processing natural language input.
2.1 Human Extraction of Meaning for Knowledge Bases
Long before the existence of computing, humans developed techniques for recording the meaning of words. These lexical compilations include dictionaries, thesauri, and ontologies. While not strictly statistical techniques for deriving meaning, human-based methods deserve mention for several reasons. Many of these lexical compilations are now available on-line. For example, there are a number of machine-readable dictionaries on-line (e.g., Wilks et al. 1996), as well as projects to develop large ontologies of domain knowledge. Because of this availability, statistical techniques can be applied to these on-line lexicons to extract novel information automatically from a lexicon, for example to discover new semantic relations. In addition, the information from existing lexical entries can be used in automatic techniques for categorizing new lexical items. One notable approach has been the development of WordNet (see Fellbaum 1998). WordNet is a hand-built on-line lexical reference database which represents both the forms and meanings of words. Lexical concepts are organized in synonym sets (synsets) which represent semantic and lexical relations among concepts including synonymy, antonymy, hyponymy, meronymy, and morphological relations. Automated techniques (described in Fellbaum 1998) have been applied to WordNet for a number of applications including discovering new relations among words, performing automated word-sense identification, and
applying it to information retrieval through automated query expansion. Along with machine-readable dictionaries and ontologies, additional human-based approaches to deriving word similarity have collected large numbers of word associations to derive word-association norms (e.g., Deese 1965) and ratings of words on different dimensions (e.g., Osgood et al. 1957). The statistics from these collections are incorporated into computational cognitive models. In human-generated representations, however, there is much overhead involved in collecting the set of information before the lexical representation can be used. Because of this, it is not easily adapted to new languages or new domains. Further, handcrafted derivations of relationships among words do not provide a basis for a representational theory.
2.2 Automatic Techniques for Deriving Semantics
‘You shall know a word by the company it keeps’ (Firth 1957). The context in which any individual word is used can provide information about both the word’s syntactic role and the semantic contributions of the word to the context. Thus, with appropriate techniques for measuring the use of words in language, we should be able to infer the meaning of those words. While much research has focused on automatically extracting syntactic regularities from language, there has been a recent increase in research on approaches to extracting semantic information. This research takes a structural linguistic approach in its assumption that the structure of meaning (or of language in general) can be derived or approximated through distributional measures and statistical analyses.
To derive a semantic representation automatically from language, several assumptions must be fulfilled. First, it is assumed that information about the cooccurrence of words within contexts will provide appropriate information about semantic relationships. For example, the fact that ‘house’ and ‘roof’ often occur in the same context or in similar contexts suggests that there is a relationship between them. Second, assumptions must be made about what constitutes the context in which words appear. Some techniques use a moving window which moves across the text analyzing five to 10 words as a context; others use sentences, paragraphs, or complete documents as the context in which words appear. Finally, corpora of hundreds of thousands to millions of words of running text are required so that each word occurs often enough, in enough contexts, for its usage to be estimated.
A mathematical overview of many statistical approaches to natural language may be found in Manning and Schütze (1999), and a review of statistical techniques applied to corpora may be found in Boguraev and Pustejovsky (1996). Below, we focus on a few techniques that have been applied to psychological modeling, have shown psychological plausibility, and/or have provided applications that may be used more directly within cognitive science. The models described all use feature representations of words, in which words are represented as vectors of features. In addition, the models automatically derive the feature dimensions rather than having them predefined by the researcher.
2.2.1 The HAL model. The HAL (Hyperspace Analogue to Language) model uses lexical cooccurrence to develop a high-dimensional semantic representation of words (Burgess and Lund 1999). Using a large corpus (320 million words) of naturally occurring text, the model derives vector representations of words based on a 10-word moving window. Vectors for words that are used in related contexts (within the same window) have high similarity. This vector representation can characterize a variety of semantic and grammatical features of words, and has been used to investigate a wide range of cognitive phenomena. For example, HAL has been used for modeling results from priming and categorization studies, resolving semantic ambiguity, modeling cerebral asymmetries in semantic representations, and modeling semantic differences among word classes, such as concrete and abstract nouns.
2.2.2 Latent Semantic Analysis. Like HAL, Latent Semantic Analysis (LSA) derives a high-dimensional vector representation based on analyses of large corpora (Landauer and Dumais 1997). However, LSA uses a fixed window of context (e.g., the paragraph level) to perform an analysis of cooccurrence across the corpus. A factor analytic technique (singular value decomposition) is then applied to the cooccurrence matrix in order to derive a reduced set of dimensions (typically 300 to 500).
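The two LSA steps just described, building a word-by-context count matrix and reducing its dimensionality, can be sketched on a toy scale. Everything below is an illustrative assumption: the three-"document" corpus is invented, only two dimensions are kept rather than hundreds, and the leading singular vectors are found by a simple pure-Python power iteration standing in for a library SVD routine. It is a sketch of the idea, not the procedure of Landauer and Dumais (1997).

```python
import math

# Hypothetical toy corpus: each inner list is one context (e.g., a paragraph).
docs = [
    ["house", "roof"],
    ["home", "roof"],
    ["dog", "bone"],
]
vocab = sorted({w for d in docs for w in d})

# Word-by-context count matrix A (rows: words, columns: contexts).
A = [[d.count(w) for d in docs] for w in vocab]


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def top_singular_pairs(matrix, k, iters=500):
    """Leading k (singular value, left singular vector) pairs of `matrix`,
    via power iteration with deflation on the Gram matrix A * A^T."""
    n = len(matrix)
    gram = [[sum(x * y for x, y in zip(matrix[i], matrix[j])) for j in range(n)]
            for i in range(n)]
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _ in range(iters):
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue; its square root is the singular value.
        lam = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((math.sqrt(lam), v))
        # Deflate: remove the component just found before searching for the next one.
        gram = [[gram[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return pairs


pairs = top_singular_pairs(A, k=2)
raw = {w: A[i] for i, w in enumerate(vocab)}
# Reduced word vectors: left singular vector entries scaled by singular values.
reduced = {w: [s * u[i] for s, u in pairs] for i, w in enumerate(vocab)}

print("raw     house~home:", cosine(raw["house"], raw["home"]))
print("reduced house~home:", round(cosine(reduced["house"], reduced["home"]), 3))
print("reduced house~dog :", round(cosine(reduced["house"], reduced["dog"]), 3))
```

In the raw counts, "house" and "home" never share a context, so their vectors are orthogonal. After reduction to two dimensions they fall on the same dimension because both occur with "roof", while "dog" stays apart.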
This dimension reduction causes words that are used in similar contexts, even if they are not used in the same context, to have similar vectors. For example, although ‘house’ and ‘home’ both tend to occur with ‘roof,’ they seldom occur together in language. Nevertheless, they would have similar vector representations in LSA. In LSA, vectors for individual words can be summed to provide measures of the meaning of larger units of text. Thus, the meaning of a paragraph would be the sum of the vectors of the words in that paragraph. This basis permits the comparison of larger units of text, such as comparing the meaning of sentences, paragraphs, or whole documents to each other. LSA has been applied to a number of different corpora, ranging from large samples of language that children and adults would have encountered to specific
corpora on particular domains, such as individual course topics. For the more general corpora, it derives a generalized semantic representation of knowledge similar to that general knowledge acquired by people in life. For the domain-specific corpora, it generates a representation more similar to that of people knowledgeable within that domain area. LSA has been used both as a theoretical model and as a tool for the characterization of semantic relatedness of units of language (see Landauer et al. 1998 for a review). As a theoretical model, LSA has been used to model the speed of acquisition of new words by children; its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word-word and passage-word lexical priming data; and it accurately estimates textual coherence and the learnability of texts by individual students. The vector representation in LSA can be applied within other theoretical models. For example, propositional representations based on LSA-derived vectors have been integrated into the Construction-Integration model, a symbolic connectionist model of language (see Kintsch 1998). As an application, LSA has been used to measure the quality and quantity of knowledge contained in essays, for matching user queries to documents in information retrieval, and for performing automatic discourse segmentation.
2.2.3 Connectionist approaches. Connectionist modeling uses a network of interacting processing units operating on feature vectors to model cognitive phenomena. It has been widely used to model aspects of language processing. Although in some connectionist models words or concepts are represented as vectors in which the features have been predefined (e.g., McClelland and Kawamoto 1986), recent models have automatically derived the representation.
Elman (1990) implemented a simple recurrent network that used a moving window analyzing a set of sentences from a small lexicon and artificial grammar. Based on a cluster analysis of the activation values of the hidden units, the model could predict syntactic and semantic distinctions in the language, and was able to discover lexical classes based on word order. One current limitation, however, is that it is not clear how well the approach can scale up to much larger corpora. Nevertheless, like LSA, due to the constraint satisfaction in connectionist models, the pattern of activation represented in the hidden units goes beyond direct cooccurrence, and captures more of the contextual usage of words.
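Returning to the moving-window idea that HAL relies on (Section 2.2.1), the counting step can be sketched as follows. This is a deliberately simplified illustration: the sentence is invented, the window is two words rather than 10, counts are symmetric and unweighted, and the function names are hypothetical, so it should not be read as the published HAL procedure.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(tokens, window=2):
    """For every word, count how often each other word falls within
    `window` positions on either side of it."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Hypothetical miniature "corpus".
text = ("the dog chewed the bone the puppy chewed the bone "
        "the car needs fuel").split()
vecs = cooccurrence_vectors(text, window=2)

print(dict(vecs["dog"]))
print("dog~puppy:", round(cosine(vecs["dog"], vecs["puppy"]), 3))
print("dog~car  :", round(cosine(vecs["dog"], vecs["car"]), 3))
```

Words used in similar local contexts ("dog" and "puppy") end up with more similar count vectors than words used in different contexts ("dog" and "car"), even in this tiny sample. A realistic model would need a corpus many orders of magnitude larger.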
3. Conclusions
Statistical techniques for extracting meaning from on-line texts and for extending the use of machine-readable dictionaries have become viable approaches for creating semantic-based models and applications. The techniques go beyond modeling just cooccurrence of words. For example, the singular value decomposition in LSA or the use of hidden units in connectionist models permits derivation of semantic similarities that are not found in local cooccurrence but that are seen in human knowledge representations. The techniques incorporate both the idea of feature vectors, from feature-based models, and the idea that words can be defined by their relationships to other words found in semantic networks.
3.1 Advantages and Disadvantages of Statistical Semantic Approaches
Compared to many models of semantic memory, statistical semantic approaches are quite parsimonious, using very few assumptions and parameters to derive an effective representation of meaning. They further avoid problems of human-based meaning extraction since the techniques can process realistic environmental input (natural language) directly into a representation. The techniques are fast, requiring only hours or days to develop new lexicons, and can work in any language. They typically need large amounts of natural language to derive representation; thus, large corpora must be obtained. Nevertheless, because they are applied to large corpora, the lexicons that are developed provide realistic representations of tens to hundreds of thousands of words in a language.
3.2 Theoretical and Applied Uses of Statistical Semantics
Within cognitive modeling, statistical semantic techniques can be applied to almost any model that must incorporate meaning. They can therefore be used in modeling in such areas as semantic and associative priming, lexical ambiguity resolution, anaphoric resolution, acquisition of language, categorization, lexical effects on word recognition, and higher-level discourse processing.
The techniques can provide useful additions to a wide range of applications that must encode or model meaning in language; for example, information retrieval, automated message understanding, machine translation, and discourse analysis.
3.3 The Future of Statistical Semantic Models in Psychology
While it is important that the techniques provide effective representations for applications, it is also important that the techniques have psychological plausibility. The study of human language processing can help inform the development of more effective methods of deriving and representing semantics. In turn, the development of the techniques can help improve cognitive models. For this to happen, strong ties must be formed between linguists, computational experts, and psychologists. In addition, the techniques and lexicons derived from them are not widely available to all researchers. For the techniques to succeed, better distribution is needed, either through Web interfaces or through software, so that they can be more easily incorporated into a wider range of cognitive models.
See also: Connectionist Models of Language Processing; Lexical Access, Cognitive Psychology of; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Lexical Semantics; Lexicon; Memory for Meaning and Surface Memory; Semantic Knowledge: Neural Basis of; Semantic Similarity, Cognitive Psychology of; Semantics; Word Meaning: Psychological Aspects
Bibliography
Boguraev B, Pustejovsky J 1996 Corpus Processing for Lexical Acquisition. MIT Press, Cambridge, MA
Burgess C, Lund K 1999 The dynamics of meaning in memory. In: Dietrich E, Markman A B (eds.) Cognitive Dynamics: Conceptual and Representational Change in Humans and Machines. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 117–56
Collins A M, Quillian M R 1969 Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8: 240–7
Deese J 1965 The Structure of Associations in Language and Thought. Johns Hopkins University Press, Baltimore, MD
Elman J L 1990 Finding structure in time. Cognitive Science 14: 179–211
Fellbaum C 1998 WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA
Firth J R 1968 A synopsis of linguistic theory 1930–1955. In: Palmer F (ed.) Selected Papers of J.R. Firth. Longman, New York, pp. 32–52
Kintsch W 1998 Comprehension: A Paradigm for Cognition. Cambridge University Press, New York
Landauer T K, Dumais S T 1997 A solution to Plato’s problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104: 211–40
Landauer T K, Foltz P W, Laham D 1998 An introduction to Latent Semantic Analysis. Discourse Processes 25: 259–84
Manning C D, Schütze H 1999 Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA
McClelland J L, Kawamoto A H 1986 Mechanisms of sentence processing: Assigning roles to constituents. In: Rumelhart D E, McClelland J L (eds.) PDP Research Group. Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Vol. 2. MIT Press, Cambridge, MA, pp. 272–325
Osgood C E, Suci G J, Tannenbaum P H 1957 The Measurement of Meaning. University of Illinois Press, Urbana, IL
Rumelhart D E, McClelland J L 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA, Vol. 1
Smith E E, Medin D L 1981 Categories and Concepts. Harvard University Press, Cambridge, MA
Wilks Y, Slator B, Guthrie L 1996 Electric Words: Dictionaries, Computers and Meanings. MIT Press, Cambridge, MA
P. W. Foltz
Semantic Similarity, Cognitive Psychology of
Semantic similarity refers to similarity based on meaning as opposed to form. The term is used widely throughout cognitive psychology and related areas such as psycholinguistics, memory, reasoning, and neuropsychology. For example, semantically similar words are confusable with each other, and can prime each other, with the consequence that verbal memory performance is heavily dependent on similarity of meaning (see False Memories, Psychology of; Priming, Cognitive Psychology of). In the context of reasoning, people draw inductive inferences on the basis of semantic similarity, for example, inferring that properties of cows are more likely to be true of horses than to be true of hedgehogs (Heit 2001).
But what is semantic similarity? In experimental work, semantic similarity is often estimated directly from participants’ explicit judgments. However, there are several advantages to representing semantic similarity within a standardized framework or model. These advantages include the greater consistency and other known mathematical properties that can be imposed by a formal model, and the possible benefits of data reduction and the extraction of meaningful dimensions or features of the stimuli being analyzed. The history of research concerning semantic similarity can be captured in terms of the succession of models that have been applied to this topic. The models themselves can be divided into generic models of similarity and a number of more specialized models aimed at particular aspects of semantic similarity.
1. General Models of Similarity
Although the following accounts are meant to address similarity in general terms, they can be readily applied to semantic similarity. The two classical, generic approaches to modeling similarity are spatial models of similarity and Tversky’s (1977) contrast model. More recently, structured approaches have addressed limitations of both of these classical accounts.
1.1 Spatial Representations
Spatial models seek to represent similarity in terms of distance in a psychological space (Shepard 1980). An
item’s position is determined through its coordinate values along the relevant dimensions; nearby points thus represent similar items, whereas distant items are psychologically very different. For example, a spatial representation of mammal categories might represent different animals on dimensions of size and ferocity, with lions and tigers being close on both of these dimensions (Fig. 1). The relevant space is derived using the technique of multidimensional scaling (MDS), a statistical procedure for dimensionality reduction. MDS works from experimental participants’ judgments, typically in matrices of proximity data such as pairwise confusions or similarity ratings between items. Spatial models have been used widely for visualization purposes and as the heart of detailed cognitive models, e.g., of categorization and recognition memory (Nosofsky 1991). Less widely used are related statistical procedures such as hierarchical clustering (Shepard 1980) although they have also been used in the context of semantic similarity.
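The core idea of a spatial model, that similarity is read off as distance between points, can be shown with a minimal sketch. The coordinates below are invented for illustration on arbitrary 0 to 10 scales of size and ferocity; they are not the output of an actual multidimensional scaling analysis, and the function names are hypothetical.

```python
import math

# Hypothetical (size, ferocity) coordinates; invented, not MDS output.
coords = {
    "lion": (7.0, 9.0),
    "tiger": (7.5, 9.5),
    "elephant": (10.0, 4.0),
    "hedgehog": (1.0, 2.0),
}


def distance(a, b):
    """Euclidean distance between two items in the psychological space."""
    (x1, y1), (x2, y2) = coords[a], coords[b]
    return math.hypot(x1 - x2, y1 - y2)


def nearest(item):
    """The most similar other item, i.e., the closest point in the space."""
    others = (other for other in coords if other != item)
    return min(others, key=lambda other: distance(item, other))


print("lion-tiger   :", round(distance("lion", "tiger"), 2))
print("lion-hedgehog:", round(distance("lion", "hedgehog"), 2))
print("nearest to lion:", nearest("lion"))
```

Note that `distance(a, b)` always equals `distance(b, a)`. This built-in symmetry is a property of any distance-based scheme.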
1.2 The Contrast Model
The traditional alternative to spatial models is Tversky’s (1977) contrast model. This account was developed to address perceived limitations of spatial models which bring these into conflict with behavioral data (but also see Nosofsky 1991). Chief among these are violations of the so-called metric axioms that underlie any spatial scheme, such as asymmetries in human similarity judgments that are not well captured by spatial distance, which should be symmetrical. For example, people judged the similarity of North Korea to China to be greater than the similarity of China to North Korea. In the contrast model (Eqn. (1)), the similarity between items a and b is a positive function of the features common to both items and a negative function of the distinctive features of a and also the distinctive features of b. Each of these three feature sets is governed by a weighting parameter which allows the model to capture asymmetries according to the nature of a particular task. According to the focusing hypothesis, greater attention is given to distinctive features of the first item in a comparison than of the second item, hence α > β. When China is more familiar than North Korea, having more known distinctive features, then the similarity from China to North Korea should be lower than the similarity of North Korea to China.
$$S(a, b) = \theta f(A \cap B) - \alpha f(A - B) - \beta f(B - A) \tag{1}$$
where A and B denote the feature sets of items a and b.
Although applications of the contrast model to the modeling of specific cognitive tasks are fewer than those of spatial models, an application to semantic similarity can be found in Ortony (1979), who applied the model to distinctions between literal and metaphorical similarity.
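The asymmetry that the contrast model predicts can be checked with a small worked sketch of Eqn. (1). Several things here are assumptions made for illustration only: the function f is taken to be simple set size, the parameter values are arbitrary apart from respecting α > β, and the feature sets are abstract placeholders rather than features elicited from participants.

```python
def contrast_similarity(a, b, theta=1.0, alpha=0.8, beta=0.2):
    """Similarity of item a to item b under the contrast model.

    a, b  : sets of features (A and B in Eqn. (1))
    theta : weight on common features
    alpha : weight on features distinctive to a (the first item)
    beta  : weight on features distinctive to b (the second item)
    f is assumed here to be the number of features in a set.
    """
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    return theta * common - alpha * only_a - beta * only_b


# Hypothetical feature sets: the familiar item has more known distinctive features.
familiar = {"shared_1", "shared_2", "fam_1", "fam_2", "fam_3", "fam_4", "fam_5"}
unfamiliar = {"shared_1", "shared_2", "unfam_1"}

print("unfamiliar -> familiar:", contrast_similarity(unfamiliar, familiar))
print("familiar -> unfamiliar:", contrast_similarity(familiar, unfamiliar))
```

With α > β, the less familiar item is judged more similar to the familiar one than the reverse, matching the direction of the North Korea and China example. Setting α equal to β removes the asymmetry.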
Figure 1 Multidimensional scaling representation of mammals (axes: Size and Ferocity; items plotted: Tiger, Lion, Wolf, Elephant, Dog, Giraffe, Hedgehog, Pig)
1.3 Structured Representations
Despite the efforts aimed at establishing the superiority of either the contrast model or spatial models, it has been argued that both approaches share a fundamental limitation in that they define similarity based on oversimplified kinds of representations: points in space or feature sets. Arguably most theories of the representation of natural objects, visual textures, sentences, etc., assume that these cannot be represented in line with these restrictions (see Feature Representations in Cognitive Psychology). Instead, they seem to require structured representations: complex representations of objects, their parts and properties, and the interrelationships between them. Descriptions such as SIBLING-OF (Linus, Lucy) and SIBLING-OF (Lucy, Linus) cannot be translated in an obvious way to either lists of features or points in space that represent the similarity between these two items as well as their differences from BROTHER-OF (Linus, Lucy). (See also Fodor and Pylyshyn (1988) for a critique of attempts in connectionist networks to represent relational structure in a featural framework.) The perceived need for accounts of similarity to work with structured representations has given rise to the structural alignment account (see Markman (2001) for an overview), which has its roots in research on analogical reasoning (see Mental Models, Psychology of). Structural alignment operates over structured representations such as frames (see Schemas, Frames, and Scripts in Cognitive Psychology) consisting of slots and fillers. The comparison process requires that at least some of the predicates, that is, relations such as ABOVE (x, y), are identical across the comparison.
These identical predicates are placed in correspondence. The alignment process then seeks to build maximal structurally consistent matches between the two representations. Structural alignment has been implemented in a variety of computational models (e.g., Falkenhainer et al. 1990, Goldstone and Medin 1994) that have been used to capture behavioral data, especially similarity judgments and analogical reasoning. Experimental results have supported an important prediction of the structural alignment account, that there will be a greater impact of alignable differences compared with nonalignable differences. Nonalignable differences between two representations are elements of one object that have no correspondence in the other. In contrast, alignable differences refer to representational elements that have corresponding roles in the two representations but fail to match exactly. For example, imagine two tables, one with a flower on top and the other with a bowl on top in addition to a chair beneath it. Comparing the two scenes, flower vs. bowl would be an alignable difference, whereas the chair would be a nonalignable difference.
2. Specialized Models of Semantic Similarity
2.1 Semantic Differentials
Leaving behind generic models of similarity, semantic similarity has also been captured through a variety of specially built models and approaches. The first of these is Osgood et al.’s (1957) semantic differential. This approach used psychometric techniques to compute the psychological distance between concepts. Participants were required to rate concepts on 10–20 bipolar scales for a set of semantically relevant dimensions, such as positive–negative and feminine–masculine. The ratings of words on these dimensions would essentially form their coordinates in semantic space. The approach is thus related to spatial models of similarity, with the difference primarily in the way that the semantic space is derived. Whereas models based on multidimensional scaling require pairwise proximity data for all concepts of interest, the semantic differential approach requires that all concepts of interest are rated on all relevant dimensions. Therefore, the semantic differential method suffers from the difficulty that the relevant dimensions must be stipulated in advance. Still, semantic differentials have been widely used, largely because the data are straightforward to collect, analyze, and interpret.
2.2 Semantic Networks and Featural Models
Osgood et al.’s work motivated the semantic feature model of Smith et al. (1974), which addressed how people verify statements such as ‘a robin is a bird’ and ‘a chicken is a bird,’ considering the defining and characteristic features of these concepts. Verifying the latter statement would be slower owing to the different characteristic features for chicken vs. bird. The semantic feature model was developed in contrast to Collins and Quillian’s (1969) semantic network model. In this model semantic meaning is captured by nodes that correspond to individual concepts. These nodes are connected by a variety of links representing the nature of the relationship between the nodes (such as an IS-A link between robin and bird). Semantic similarity (or distance) is captured in terms of the number of links that must be traversed to reach one concept from another.
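The link-counting measure just described amounts to a shortest-path search over the network. The sketch below uses a breadth-first search over a tiny invented network; the concepts and links are illustrative assumptions, and the function names are hypothetical.

```python
from collections import deque

# Hypothetical IS-A links, treated as traversable in both directions.
LINKS = [("robin", "bird"), ("chicken", "bird"),
         ("bird", "animal"), ("dog", "animal")]


def build_graph(links):
    """Adjacency sets for an undirected network of concepts."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def link_distance(graph, start, goal):
    """Fewest links between two concepts, or None if they are unconnected."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


graph = build_graph(LINKS)
for pair in [("robin", "bird"), ("robin", "animal"), ("chicken", "bird")]:
    print(pair, link_distance(graph, *pair))
```

In this toy network robin and chicken are each exactly one link from bird, so a pure link count treats them identically.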
Although the semantic network model was a groundbreaking attempt at capturing structured relations, it still suffered from some problems. For example, the model correctly predicts that verifying ‘a robin is an animal’ is slower than verifying ‘a robin is a bird’ owing to more links being traversed for the first statement. However, this model does not capture typicality effects such as the difference between robin and chicken.
2.3 High-dimensional Context Spaces
Osgood et al.’s semantic differential can also be seen as a predecessor of contemporary corpus-derived measures of semantics and semantic similarity. For example, Burgess and Lund’s (1997) hyperspace analogue to language model (HAL) learns a high-dimensional context from large-scale linguistic corpora that encompass many millions of words of speech or text. The model tracks lexical co-occurrences throughout the corpus and from these derives a high-dimensional representational space. The meaning of a word is conceived of as a vector. Each element of this vector corresponds to another word in the model, with the value of an element representing the number of times that the two words co-occurred within the discourse samples that constitute the corpus. For example, the vector for ‘dog’ will contain an element reflecting the number of times that the word ‘bone’ was found within a given range of words in the corpus. These vectors can be viewed as the coordinates of points (individual words) in a high-dimensional semantic space. Semantic similarity is then a matter of distance between points in this space. Several other such usage-based models have been proposed to date; similar in spirit to HAL, for example, is Landauer and Dumais’s (1997) latent semantic analysis (see also Semantic Processing: Statistical Approaches). The basic approach might be seen as taking Wittgenstein’s famous adage of ‘meaning as use’ to its logical consequence. Its prime advantage over related approaches such as spatial models of similarity and the semantic differential lies in the ability to derive semantics, and thus measures of semantic similarity, for arbitrarily large numbers of words without the need for any specially collected behavioral data. The ability of these models to capture a wide variety of phenomena, such as results in semantic priming and effects of semantic context on syntactic processing, has been impressive.
2.4 Connectionist Approaches
The final approach to semantic similarity to be discussed shares with these context-based models a statistical orientation, but connectionist modeling has been popular particularly in neuropsychological work on language and language processing. In connectionist models, the semantics of words are represented as patterns of activations, or banks of units representing individual semantic features.
Semantic similarity is then simply the amount of overlap between different patterns, hence these models are related to the spatial accounts of similarity. However, the typically nonlinear activation functions used in these models allow virtually arbitrary re-representations of such basic similarities. The representation schemes utilized in these models tend to be handcrafted rather than derived empirically as in other schemes such as multidimensional scaling and high-dimensional context spaces. However, it is often only very general properties of these semantic representations and the similarities between them that are crucial to a model’s behavior, such as whether these representations are ‘dense’ (i.e., involve the activation of many semantic features) or ‘sparse,’ so that the actual semantic features chosen are not crucial. For example, this distinction between dense and sparse representation has been used to capture patterns of semantic errors
associated with acquired reading disorders (Plaut and Shallice 1993) and also patterns of category-specific deficits following localized brain damage (Farah and McClelland 1991).
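As an illustration of the co-occurrence approach described in Section 2.3, the following minimal Python sketch builds count vectors from a token list and compares words by Euclidean distance. It is a simplified reading of the description above, not the HAL model itself: the symmetric window, the function names `cooccurrence_vectors` and `distance`, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(tokens, window=2):
    """For every word, count how often each other word occurs within `window` positions."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def distance(u, v):
    """Euclidean distance between two sparse count vectors (missing entries count as 0)."""
    keys = set(u) | set(v)
    return sqrt(sum((u[k] - v[k]) ** 2 for k in keys))


if __name__ == "__main__":
    # Hypothetical toy corpus; a real model would use many millions of words.
    corpus = "the dog chewed a bone and the cat chased the dog".split()
    vecs = cooccurrence_vectors(corpus, window=3)
    print("dog/bone count:", vecs["dog"]["bone"])
    print("dog-cat distance:", distance(vecs["dog"], vecs["cat"]))
```

Smaller distances correspond to words that occur in more similar contexts; with a corpus this small the numbers carry no empirical weight.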
3. Conclusion

There is a wide variety of models of semantic similarity available. Underlying this array of approaches is a fundamental tension that is as yet unresolved. Many of the models reviewed can be classed as loosely spatial in nature. These models have all been applied extensively to behavioral data, yet there are fundamental limitations to spatial approaches with respect to their representational capacity, made clear by the successes of the contrast model and the structural alignment approach. The behavioral evidence that relational structure is important to similarity seems particularly compelling. This leaves this area of research with two contrasting and seemingly incompatible strands of models, each of which successfully relates to experimental data. It is possible only to speculate on how this contradiction might ultimately be resolved. One possibility is that spatial models and structured representations capture different aspects or types of semantic similarity: one automatic, precompiled, effortless, and in some ways shallow, and one the result of more in-depth, off-line, or metacognitive processing. Such a distinction between types of processing of similarity would be compatible with the success of models based on, for example, context spaces in capturing phenomena such as semantic priming, and the success of models such as those based on structural alignment in capturing phenomena such as analogy, or the understanding of novel concept combinations. Whether such a distinction will take shape or whether there will one day be a single, all-encompassing account remains for future research.

See also: Categorization and Similarity Models; Categorization and Similarity Models: Neuroscience Applications; Dementia, Semantic; Lexical Semantics; Semantic Knowledge: Neural Basis of; Semantic Processing: Statistical Approaches; Semantics
Bibliography

Burgess C, Lund K 1997 Modeling parsing constraints with high-dimensional context space. Language and Cognitive Processes 12: 177–210
Collins A M, Quillian M R 1969 Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8: 240–7
Falkenhainer B, Forbus K D, Gentner D 1990 The structure-mapping engine: Algorithm and examples. Artificial Intelligence 41: 1–63
Farah M J, McClelland J L 1991 A computational model of semantic memory impairment: Modality-specificity and emergent category-specificity. Journal of Experimental Psychology: General 120: 339–57
Fodor J, Pylyshyn Z 1988 Connectionism and cognitive architecture: A critical analysis. Cognition 28: 3–71
Goldstone R L, Medin D L 1994 The time course of comparison. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 29–50
Heit E 2001 Properties of inductive reasoning. Psychonomic Bulletin and Review 7: 569–92
Landauer T K, Dumais S T 1997 A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review 104: 211–40
Markman A B 2001 Structural alignment, similarity, and the internal structure of category representations. In: Hahn U, Ramscar M (eds.) Similarity and Categorization. Oxford University Press, Oxford, UK
Nosofsky R M 1991 Stimulus bias, asymmetric similarity and classification. Cognitive Psychology 23: 94–140
Ortony A 1979 Beyond literal similarity. Psychological Review 86: 161–80
Osgood C E, Suci G J, Tannenbaum P H 1957 The Measurement of Meaning. University of Illinois Press, Urbana, IL
Plaut D C, Shallice T 1993 Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology 10: 377–500
Shepard R N 1980 Multidimensional scaling, tree-fitting, and clustering. Science 210: 390–7
Smith E E, Shoben E J, Rips L J 1974 Structure and process in semantic memory: A featural model for semantic decisions. Psychological Review 81: 214–41
Tversky A 1977 Features of similarity. Psychological Review 84: 327–52
U. Hahn and E. Heit
Semantics

Semantics is the study of meaning communicated through language, and is usually taken to be one of the three main branches of linguistics, along with phonology, the study of sound systems, and grammar, which includes the study of word structure (morphology) and of sentence structure (syntax). This entry surveys some of the main topics of current semantics research.
1. Introduction

Traditionally, the main focus of linguistic semantics has been on word meaning, or lexical semantics. Since classical times writers have commented on the fact, noticed surely by most reflecting individuals, that the meaning of words changes over time. Such observations are the seeds of etymology, the study of the history of words. Over longer stretches of time, such changes become very obvious, especially in literate societies. Words seem to shift around: some narrow in meaning, such as English ‘queen,’ which earlier meant ‘woman, wife’ but now means ‘wife of a king.’ Others become more general, while still others shift to take on new meaning or disappear altogether. Words are borrowed from language to language. The study of such processes is now part of historical semantics (Fisiak 1985). Another motivation for the study of word meaning comes from dictionary writers as they try to establish meaning correspondences between words in different languages, or, in monolingual dictionaries, seek to provide definitions for all the words of a language in terms of a simple core vocabulary. In lexicology similarities and differences in word meaning are a central concern.

The principled study of the meaning of phrases and sentences has only become established in linguistics relatively recently. Thus it is still common for descriptive grammars of individual languages to contain no separate section on semantics other than providing a lexicon. Nonetheless it has always been clear that one can identify semantic relations between sentences. Speakers of English know from the semantics of negation that nominal negation has different effects than sentence negation, so that ‘No-one complained’ may aptly be used to answer ‘Who complained?,’ while ‘Someone did not complain’ may not.
Two sentences may seem to say essentially the same thing, even be paraphrases of each other, yet one may be more suited to one context than another, like the pair ‘Bandits looted the train’ and ‘The train was looted by bandits.’ A single sentence may be internally inconsistent, such as ‘Today is now tomorrow,’ or seem to be repetitive or redundant in meaning, such as ‘A capital city is a capital city.’ Another feature of sentence meaning is the regularity with which listeners draw inferences from sentences, and often take these to be part of the meaning of what was said. Some inferential links are very strong, such as entailment. Thus we say that ‘Bob drank all of the beer’ entails ‘Bob drank some of the beer’ (assuming the same individual Bob, beer, etc.), because it is hard to think of a situation where acceptance of the second sentence would not follow automatically from acceptance of the first. Other inferential links are weaker and more contextually dependent: from the utterance ‘Bob drank some of the beer’ it might be reasonable to infer ‘Bob didn’t drink all of the beer,’ but it is possible to think of situations where this inference would not hold. We might say that a speaker of the first sentence is implying the second, in a certain context. Speakers of all languages regularly predict and use such inferential behavior to convey their meaning, such that often more meaning seems to be communicated than is explicitly stated. All these aspects of sentence meaning are under study in various semantic frameworks.
Semanticists share with philosophers an interest in key issues in the use of language, notably in reference. We use this term to describe the way in which speakers can pick out, or name, entities in the world by using words as symbols. Many scholars, especially formal semanticists, accept Frege’s distinction between reference (in German, Bedeutung) and sense (Sinn); see Frege (1980). Reference is the act of identifying an entity (the referent) while sense is the means of doing so. Two different linguistic expressions such as ‘the number after nine’ and ‘the number before eleven’ differ in sense but they both share the same referent, ‘ten.’ For semanticists it is particularly interesting to study the various mechanisms that a language offers to speakers for this act of referring. These include names such as ‘Dublin,’ nouns such as ‘cat,’ which can be used to refer to a single individual, ‘your cat,’ or a whole class, ‘Cats are carnivorous,’ quantified nominals such as ‘many cats,’ ‘some cats,’ ‘a few cats,’ etc. Linguists as well as philosophers have to account for language’s ability to allow us to refer to nonexistent and hypothetical referents such as ‘World War Three,’ ‘the still undiscovered cure for cancer,’ ‘the end of the world.’ Semanticists also share interests with psychologists, for if sense is the meaning of an expression, it seems natural to many semanticists to equate it with a conceptual representation. Cognitive semanticists, in particular, (for example Lakoff 1987, Talmy 2000), but also some generative linguists (Jackendoff 1996), seek to explore the relationship between semantic structure and conceptual structure. One axis of the debate is whether words, for example, are simply labels for concepts, or whether there is a need for an independent semantic interface that isolates just grammatically relevant elements of conceptual structure. 
As Jackendoff (1996) points out, many languages make grammatical distinctions corresponding to the conceptual distinctions of gender and number, but few involve distinctions of colour or between different animal species. If certain aspects of concepts are more relevant to grammatical rules, as is also claimed by Pinker (1989), this may be justification for a semantic interface.
2. Approaches to Meaning

Even in these brief remarks we have had to touch on the crucial relationship between meaning and context. Language of course typically occurs in acts of communication, and linguists have to cope with the fact that utterances of the same words may communicate different meanings to different individuals in different contexts. One response to this problem is to hypothesize that linguistic units such as words, phrases, and sentences have an element of inherent meaning that does not vary across contexts. This is sometimes called inherent, or simply, sentence meaning. Language
users, for example speakers and listeners, then enrich this sentence meaning with contextual information to create the particular meaning the speaker means to convey at the specific time, which can then be called speaker meaning. One common way of reflecting this view is to divide the study of meaning into semantics, which becomes the study of sentence meaning, and pragmatics, which is then the study of speaker meaning, or how speakers use language in concrete situations. This is an attempt to deal with the tension between the relative predictability of language between fellow speakers and the great variability of individual interpretations in interactive contexts. One consequence of this approach is the view that the words that a speaker utters underdetermine their intended meaning.

Semantics as a branch of linguistics is marked by the theoretical fragmentation of the field as a whole. The distinction between formal and functional approaches, for example, is as marked in semantics as elsewhere. This is a large subject to broach here but see Givón (1995) and Newmeyer (1998) for characteristic and somewhat antagonistic views. One important difference is the attitude to the autonomy of levels of analysis. Are semantics and syntax best treated as autonomous areas of study, each with its own characteristic entities and processes? A related question at a more general level is whether linguistic processes can be described independently of general psychological processes or the study of social interaction. Scholars in different theoretical frameworks will give contradictory answers to these questions of micro- and macroautonomy. Autonomy at both levels is characteristic of semantics within generative grammar; see, for example, Chomsky (1995).
Functionalists such as Halliday (1996) and Harder (1996) would on the other hand argue against microautonomy, suggesting that grammatical relations and structure cannot be understood without reference to semantic function. They also seek motivation for linguistic structure in the dynamics of communicative interaction. A slightly different external mapping is characteristic of cognitive semantics, for example Lakoff (1987) and Langacker (1987), where semantic structures are correlated to conceptual structures. Another dividing issue in semantics is the value of formal representations. Scholars are divided on whether our knowledge of semantics is sufficiently mature to support attempts at mathematical or other symbolic modeling; indeed, on whether such modeling serves any use in this area. Partee (1996), for example, defends the view of formal semanticists that the application of symbolic logic to natural languages, following in particular the work of Montague (1974), represents a great advance in semantic description. Jackendoff (1990), on the other hand, acknowledges the value of formalism in semantic theory and description but argues that formal logic is too narrow adequately to describe meaning in language. Other
scholars, such as Wierzbicka (1992), view the search for formalism as premature and distracting. There has been an explosive increase in the research in formal semantics since Montague’s (1974) proposal that the analysis of formal languages could serve as the basis for the description of natural languages. Montague’s original theory comprised a syntax for the natural language, say English, a syntax for the logical language into which English should be translated (intensional logic), rules for the translation, and rules for the semantic interpretation of the intensional logic. This and subsequent formal approaches are typically referential (or denotational) in that their emphasis is on the connection of language with a set of possible worlds, including the real, external world and the hypothetical worlds set up by speakers. Crucial to this correspondence is the notion of truth, defined at the sentence level. A sentence is true if it correctly describes a situation in some world. In this view, the meaning of a sentence is characterized by describing the conditions which must hold for it to be true. The central task for such approaches is to extend the formal language to cope with the semantic features of natural language while maintaining the rigor and precision of the methodology. See the papers in Lappin (1996) for typical research in this paradigm. Research in cognitive semantics presents an alternative strategy. Cognitive semanticists reject what they see as the mathematical, antimentalist approach of formal semantics. In their view meaning is described by relating linguistic expressions to mental entities, conventionalized conceptual structures. These semanticists have proposed a number of conceptual structures and processes, many deriving from perception and bodily experience and, in particular, conceptual models of space. 
Proposals for underlying conceptual structures include image schemas (Johnson 1987), mental spaces (Fauconnier 1994), and conceptual spaces (Gärdenfors 1999). Another focus of interest is the processes for extending concepts, and here special attention is given to metaphor. Lakoff (1987) and Johnson (1987) have argued against the classical view of metaphor and metonymy as something outside normal language, added as a kind of stylistic ornament. For these writers metaphor is an essential element in our categorization of the world and our thinking processes. Cognitive semanticists have also investigated the conceptual processes which reveal the importance of the speaker’s perspective and construal of a scene, including viewpoint shifting, figure-ground shifting, and profiling (Langacker 1987).
3. Topics in Sentence Semantics

Many of the semantic systems of language, for example tense (see Tense, Aspect, and Mood, Linguistics of), aspect, mood, and negation, are marked grammatically on individual words such as verbs. However, they operate over the whole sentence. This ‘localization’ is the reason that descriptive grammars usually distribute semantic description over their analyses of grammatical forms. Such semantic systems offer the speaker a range of meaning distinctions through which to communicate a message. Theoretical semanticists attempt to characterize each system qua system, as in for example Verkuyl’s (1993) work on aspect and Hornstein’s (1990) work on tense. Typological linguists try to characterize the variation in such systems across the world’s languages, as in the studies of tense and aspect by Comrie (1976, 1985), Binnick (1991), and Bybee et al. (1994). We can sketch some basic features of some of these systems.
3.1 Situation Type and Aspect

Situation type and aspect are terms for a language’s resources that allow a speaker to describe the temporal ‘shape’ of events. The term situation type is used to describe the system encoded in the words of a language, while aspect is used for the grammatical systems which perform a similar role. To take one example, languages typically allow speakers to describe a situation either as static, as in ‘The bananas are ripe,’ or as dynamic, as in ‘The bananas are ripening.’ Here the state is the result of the process but the same situation can be viewed as more static or dynamic, as in ‘The baby is asleep’ and ‘The baby is sleeping.’ As these examples show, this distinction is lexically marked: in English, for example, adjectives are typically used for states, and verbs for dynamic situations. There is, however, a group of stative verbs, such as ‘know,’ ‘understand,’ ‘love,’ ‘hate,’ which describe static situation types. There are a number of semantic distinctions typically found amongst dynamic verbs, for example the telic/atelic (bounded/unbounded) distinction and the punctual/durative distinction. Telic verbs describe processes which are seen as having a natural completion, which atelic verbs do not. A telic example is ‘Matthew was growing up,’ and an atelic example is ‘Matthew was drinking.’ If these processes are interrupted at any point, we can automatically say ‘Matthew drank,’ but not ‘Matthew grew up.’ However, atelic verbs can form telic phrases and sentences by combining with other grammatical elements, so that ‘Matthew was drinking a pint of beer’ is telic.
Durative verbs, as the term suggests, describe processes that last for a period of time, while punctual describes those that seem so instantaneous that they have no detectable internal structure, as in the comparison between ‘The man slept’ and ‘The light flashed.’ As has often been observed, if an English punctual verb is used with a durative adverbial, the result is an iterative meaning, as in ‘The light flashed all night,’ where we understand the event to be repeated over the time mentioned.
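The distinctions just described can be summarized as a small feature table. The Python sketch below is only an illustration under stated assumptions: the class `SituationType`, the two test functions, and the feature values for ‘sleep’ and ‘flash’ on the telic dimension are hypothetical choices, since the text classifies those two verbs only as durative and punctual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SituationType:
    phrase: str
    dynamic: bool   # False for states such as 'know'
    telic: bool     # True if the process has a natural completion
    durative: bool  # False for punctual events such as 'flash'


def progressive_entails_past(s):
    """Interruption test: 'was V-ing' licenses 'V-ed' only for atelic dynamic situations."""
    return s.dynamic and not s.telic


def iterative_with_durative_adverbial(s):
    """A punctual verb combined with an adverbial like 'all night' is read as repeated."""
    return s.dynamic and not s.durative


# Examples taken from the text; telic values for 'sleep' and 'flash' are assumptions.
GROW_UP = SituationType("grow up", dynamic=True, telic=True, durative=True)
DRINK = SituationType("drink", dynamic=True, telic=False, durative=True)
DRINK_PINT = SituationType("drink a pint of beer", dynamic=True, telic=True, durative=True)
SLEEP = SituationType("sleep", dynamic=True, telic=False, durative=True)
FLASH = SituationType("flash", dynamic=True, telic=False, durative=False)

if __name__ == "__main__":
    for s in (GROW_UP, DRINK, DRINK_PINT, SLEEP, FLASH):
        print(s.phrase, progressive_entails_past(s), iterative_with_durative_adverbial(s))
```

Treating telicity as a property of the whole phrase, rather than of the verb alone, is what lets ‘drink’ and ‘drink a pint of beer’ receive different values.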
Situation type typically interacts with aspect. Aspect is the grammatical system that allows the speaker choices in how to portray the internal temporal nature of a situation. An event, for example, may be viewed as closed and completed, as in ‘Joan wrote a book,’ or as an ongoing process, perhaps unfinished, as in ‘Joan was writing a book.’ The latter verb form is described as being in the progressive aspect in English, but similar distinctions are very common in the languages of the world. In many languages we find described a distinction between perfective and imperfective aspects, used to describe complete versus incomplete events; see Bybee et al. (1994) for a survey. As mentioned above, aspect is intimately associated both with situation type and tense. In Classical Arabic the perfective is strongly associated with past tense (Comrie 1976, Binnick 1991). In English, for example, stative verbs are typically not used with progressive aspect, so that one may say ‘I know some French’ but not ‘I am knowing some French.’ Staying with the progressive, when it is used in the present tense in English (and in many other languages) it carries a meaning of proximate future or confident prediction as in ‘We’re driving to Los Angeles’ or ‘I’m leaving you.’ The combination of the three semantic categories of tense, situation type, and aspect produces a complex system that allows speakers to make subtle distinctions in relating an event or describing a situation.
3.2 Modality

Modality is a semantic system that allows speakers to express varying attitudes to a proposition. Semanticists have traditionally identified two types of modality. One is termed epistemic modality, which encodes a speaker’s commitment to, or belief in, a proposition, from the certainty of ‘The ozone layer is shrinking’ to the weaker commitments of ‘The ozone layer may/might/could be shrinking.’ The second is deontic modality, where the speaker signals a judgment toward social factors of obligation, responsibility, and permission, as in the various interpretations of ‘You must/can/may/ought to borrow this book.’ These examples show that similar markers, here auxiliary verbs, can be used for both types. When modality distinctions are marked by particular verbal forms, these are traditionally called moods. Thus many languages, including Classical Greek and Somali, have a verb form labeled the optative mood for expressing wishes and desires. Other markers of modality in English include verbs of propositional attitude, as in ‘I know/believe/think/doubt that the ozone layer is shrinking,’ and modal adjectives, as in ‘It is certain/probable/likely/possible that the ozone layer is shrinking.’

A related semantic system is evidentiality, where a speaker communicates the basis or source for presenting a proposition. In English and many other languages this may be done by adding expressions like ‘allegedly,’ ‘so I’ve heard,’ ‘they say,’ etc., but certain languages mark such differences morphologically, as in Makah, a Nootkan language spoken in Washington State (Jacobsen 1986, p. 10):

wiki:caxaw: ‘It’s bad weather’ (seen or experienced directly);
wiki:caxakpi:d: ‘It looks like bad weather’ (inference from physical evidence);
wiki:caxakqad\I: ‘It sounds like bad weather’ (on the evidence of hearing); and
wiki:caxakwa:d: ‘I’m told there’s bad weather’ (quoting someone else).

3.3 Semantic Roles

This term describes the speaker’s semantic repertoire for relating participants in a described event. One influential proposal in the semantics literature is that each language contains a set of semantic roles, the choice of which is partly determined by the lexical semantics of the verb selected by the speaker. A characteristic list of such roles is:

agent: the initiator of some action, capable of acting with volition;
patient: the entity undergoing the effect of some action, often undergoing some change in state;
theme: the entity which is moved by an action, or whose location is described;
experiencer: the entity which is aware of the action or state described by the predicate but which is not in control of the action or state;
beneficiary: the entity for whose benefit the action was performed;
instrument: the means by which an action is performed or something comes about;
location: the place in which something is situated or takes place;
goal: the entity towards which something moves;
recipient: the entity which receives something; and
source: the entity from which something moves.

In an example like ‘Harry immobilized the tank with a broomstick,’ the entity Harry is described as the agent, the tank as the patient, and the broomstick as the instrument.
These roles have also variously been called deep semantic cases, thematic relations, participant roles, and thematic roles. One concern is to explain the matching between semantic roles and grammatical relations. In many languages, as in the last example, there is a tendency for the subject of the sentence to correspond to the agent and for the direct object to correspond to a patient or theme; an instrument often occurs as a prepositional phrase. Certain verbs allow variations from this basic mapping, for example the alternation we find with English verbs such as ‘break’: ‘The boy broke the window with a stone’ (subject = agent); ‘The stone broke the window’ (subject = instrument); ‘The window broke’ (subject = patient). Clearly verbs can be arranged into classes depending on the variations of mappings they allow, and not all English verbs pattern like ‘break.’ We can say ‘The admiral watched the battle with a telescope,’ but ‘The telescope watched the battle’ and ‘The battle watched’ sound decidedly odd. From this literature emerges the claim that certain mappings are more natural or universal. One proposal is that, for example, there is an implicational hierarchy governing the mapping to subject, typically such as: agent > recipient/benefactive > theme/patient > instrument > location. In such a hierarchy each left element is more preferred than its right neighbor, so that moving rightward along the string gives us fewer expected subjects. The hierarchy also makes certain typological claims: if a language allows a certain semantic role to be subject, it will allow all those to its left. Thus if we find that a language allows the role instrument to be subject, we predict that it allows the roles to the left, but we do not know if it allows location subjects.

One further application of semantic roles is in lexical semantics, where the notion allows verbs to be classified by their semantic argument structure. Verbs are assigned semantic role templates or grids by which they may be sorted into natural classes. Thus, English has a class of transfer, or giving verbs, which in one type includes the verbs ‘give,’ ‘lend,’ ‘supply,’ ‘pay,’ ‘donate,’ ‘contribute.’ These verbs encode a view of the transfer from the perspective of the agent and may be assigned the pattern agent, theme, recipient, as in ‘The committee donated aid to the famine victims.’ A second subclass of these transfer verbs encodes the process from the perspective of the recipient.
These verbs include ‘receive,’ ‘accept,’ ‘borrow,’ ‘buy,’ ‘purchase,’ ‘rent,’ ‘hire,’ and have the pattern recipient, theme, source, as in ‘The victims received aid from the committee.’

3.4 Entailment, Presupposition, and Implication

These terms relate to types of information a hearer gains from an utterance but which are not stated directly by the speaker. These phenomena have received a lot of attention because they seem to straddle the putative divide between semantics and pragmatics described above, and because they reveal the dynamic and interactive nature of understanding the meaning of utterances. Entailment describes a relationship between sentences such that on the basis of one sentence, a hearer will accept a second, unstated sentence purely on the basis of the meaning of the first. Thus sentence A entails sentence B, if it is not possible to accept A but reject B. In this view a sentence such as ‘I bought a dog today’ entails ‘I bought an animal today’; or ‘President Kennedy was assassinated yesterday’ entails ‘President Kennedy is now dead.’ Clearly these sentential relations depend on lexical relations: a speaker who understands the meaning of the English word ‘dog’ knows that a dog is an animal; similarly the verb ‘assassinate’ necessarily involves the death of the unfortunate object argument. Entailment then is seen as a purely automatic process, involving no reasoning or deduction, but following from the hearer’s linguistic knowledge. Entailment is amenable to characterization by truth conditions. A sentence is said to entail another if the truth of the first guarantees the truth of the second, and the falsity of the second guarantees the falsity of the first.

Presupposition, on the other hand, is a more complicated notion. In basic terms, the idea is simple enough: that a speaker communicates certain assumptions aside from the main message. A range of linguistic elements communicates these assumptions. Some, such as names, and definiteness markers such as the articles ‘the’ and ‘my,’ presuppose the existence of entities. Thus ‘James Brown is in town’ presupposes the existence of a person so called. Other elements have more specific presuppositions. A verb such as ‘stop’ presupposes a preexisting situation. So a sentence ‘Christopher has stopped smoking’ presupposes ‘Christopher smoked.’ If treated as a truth-conditional relation, presupposition is distinguished from entailment by the fact that it survives under negation: ‘Christopher has not stopped smoking’ still presupposes ‘Christopher smoked,’ but the sentence ‘I didn’t buy a dog today’ does not entail ‘I bought an animal today.’ There are a number of other differences between entailment and presupposition that cast doubt on the adequacy of a purely semantic, truth-conditional account of the latter. Presuppositions are notoriously context sensitive, for example. They may be cancelled without causing an anomaly: a hearer can reply ‘Christopher hasn’t stopped smoking, because he never smoked’ to cancel the presupposition by what is sometimes called metalinguistic negation.
This dependency on context has led some writers to propose that presupposition is a pragmatic notion, definable in terms of the set of background assumptions that the speaker assumes is shared in the conversation. See Beaver (1997) for discussion.

A third type of inference is Grice’s conversational implicature (1975, 1978). This is an extremely context-sensitive type of inference which allows participants in a conversation to maintain coherence. So, given the invented exchange below,

A: Did you give Mary the book?
B: I haven’t seen her yet.

it is reasonable for A to infer the answer ‘no’ to her question. Grice proposed that such inferences are routinely relied on by both speakers and hearers, and that this reliance is based on certain assumptions that hearers make about a speaker’s conduct. Grice classified these into several different types, giving rise to different types of inference, or, from the speaker’s point of view, what he termed implicatures. The four main maxims are called Quality, Quantity,
Relevance, and Manner (Grice 1975, 1978). They amount to a claim that a listener will assume, unless there is evidence to the contrary, that a speaker will have calculated their utterance along a number of parameters: they will tell the truth, try to estimate what their audience knows, and package their material accordingly, have some idea of the current topic, and give some thought to their audience being able to understand them. In our example above, it is A’s assumption that B’s reply is intended to be relevant that allows the inference ‘no.’ Implicature has three characteristics: first, that it is implied rather than said; second, that its existence is a result of the context i.e., the specific interaction. There is no guarantee that in other contexts ‘I haven’t seen her’ will be used to communicate ‘no.’ Third, implicature is cancelable without causing a contradiction. Thus the implicature ‘no’ in our example can be cancelled if B adds the clause ‘but I mailed it to her last week.’ These three notions—entailment, presupposition, and implicature—can all be seen as types of inference. They are all produced in conversation, and are taken by participants to be part of the meaning of what a speaker has said. They differ in a number of features and crucially in context sensitivity. The attempt to provide a unified analysis of them all is a challenge to semantic and pragmatic theories. See Sperber and Wilson (1995) for an attempt at such a unified approach.
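The truth-conditional definitions of entailment and presupposition given earlier in this section can be illustrated with a toy model of possible worlds. The Python sketch below is one possible rendering, not a standard implementation: the four atomic facts, the function names, and the choice to model presupposition failure with a third value (`None`) are all assumptions made for illustration. Implicature is deliberately left out, since it depends on context rather than on truth conditions.

```python
from itertools import product

# Hypothetical atomic facts that together define a "world".
FACTS = ("bought_dog", "bought_animal", "smoked_before", "smokes_now")


def all_worlds():
    """Enumerate every combination of facts consistent with the lexicon (a dog is an animal)."""
    for values in product([True, False], repeat=len(FACTS)):
        w = dict(zip(FACTS, values))
        if w["bought_dog"] and not w["bought_animal"]:
            continue  # excluded by the lexical relation between 'dog' and 'animal'
        yield w


WORLDS = list(all_worlds())


def bought_dog(w):
    return w["bought_dog"]


def bought_animal(w):
    return w["bought_animal"]


def smoked(w):
    return w["smoked_before"]


def stopped_smoking(w):
    """Has a truth value only if Christopher smoked before; otherwise None (undefined)."""
    if not w["smoked_before"]:
        return None
    return not w["smokes_now"]


def negate(sentence):
    """Sentence negation that leaves undefined values undefined."""
    def negated(w):
        value = sentence(w)
        return None if value is None else not value
    return negated


def entails(a, b, worlds=WORLDS):
    """a entails b if b is true in every world in which a is true."""
    return all(b(w) is True for w in worlds if a(w) is True)


def presupposes(s, p, worlds=WORLDS):
    """s presupposes p if p is true wherever s has any truth value at all."""
    return all(p(w) is True for w in worlds if s(w) is not None)


if __name__ == "__main__":
    print(entails(bought_dog, bought_animal))
    print(entails(negate(bought_dog), bought_animal))
    print(presupposes(negate(stopped_smoking), smoked))
```

In this model the entailment from ‘I bought a dog’ to ‘I bought an animal’ disappears under negation, whereas ‘Christopher smoked’ follows from both the positive and the negated ‘stop’ sentence.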
4. Future Developments

Although semantics remains theoretically a very diverse field, it is possible to detect some shared trends which seem likely to develop further. One is a move away from a static view of sentences in isolation, detached from the speaker/writer’s act of communication, toward dynamic, discourse-based approaches. This has always been characteristic of functional approaches to meaning but has also been noticeable in formal approaches as they move away from their more philosophical origins. Among examples of this we might mention discourse representation theory (Kamp and Reyle 1993) and dynamic semantics (Groenendijk et al. 1996). Another development which seems likely to continue is a closer integration with other disciplines in cognitive science. In particular, computational techniques seem certain to make further impact on a range of semantic inquiry, from lexicography to the modeling of questions and other forms of dialogue. A subfield of computational semantics has emerged and will continue to develop; see Rosner and Johnson (1992) for example.

See also: Etymology; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Lexical Semantics; Lexicology and Lexicography; Lexicon; Semantic Knowledge: Neural Basis of; Semantic Processing: Statistical Approaches; Semantic Similarity, Cognitive Psychology of; Word Meaning: Psychological Aspects
Bibliography

Beaver D 1997 Presuppositions. In: Van Benthem J, Ter Meulen A (eds.). Handbook of Logic and Language. Elsevier, Amsterdam, pp. 939–1008
Binnick R I 1991 Time and the Verb: A Guide to Tense and Aspect. Oxford University Press, Oxford, UK
Bybee J, Perkins R, Pagliuca W 1994 The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of the World. University of Chicago Press, Chicago
Chomsky N 1995 The Minimalist Program. MIT Press, Cambridge, MA
Comrie B 1976 Aspect: An Introduction to the Study of Verbal Aspect and Related Problems. Cambridge University Press, Cambridge, UK
Comrie B 1985 Tense. Cambridge University Press, Cambridge, UK
Fauconnier G 1994 Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge University Press, Cambridge, UK
Fisiak J (ed.) 1985 Historical Semantics – Historical Word Formation. Mouton de Gruyter, Berlin
Frege G 1980 Translations from the Philosophical Writings of Gottlob Frege [ed. Geach P, Black M]. Blackwell, Oxford, UK
Gärdenfors P 1999 Some tenets of cognitive semantics. In: Allwood J, Gärdenfors P (eds.). Cognitive Semantics: Meaning and Cognition. John Benjamins, Amsterdam, pp. 12–36
Givón T 1995 Functionalism and Grammar. John Benjamins, Amsterdam
Grice H P 1975 Logic and conversation. In: Cole P, Morgan J (eds.). Syntax and Semantics, Vol. 3: Speech Acts. Academic Press, New York, pp. 43–58
Grice H P 1978 Further notes on logic and conversation. In: Cole P (ed.). Syntax and Semantics 9: Pragmatics. Academic Press, New York, pp. 113–28
Groenendijk J, Stokhof M, Veltman F 1996 Coreference and modality. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 179–214
Halliday M A K 1994 An Introduction to Functional Grammar, 2nd edn. Edward Arnold, London
Harder P 1996 Functional Semantics: A Theory of Meaning, Structure and Tense in English. Mouton de Gruyter, Berlin
Hornstein N 1990 As Time Goes By: Tense and Universal Grammar. MIT Press, Cambridge, MA
Jackendoff R 1990 Semantic Structures. MIT Press, Cambridge, MA
Jackendoff R 1996 Semantics and cognition. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 539–60
Jacobsen W H Jr 1986 The heterogeneity of evidentials in Makah. In: Chafe W, Nichols J (eds.). Evidentiality: The Linguistic Coding of Epistemology. Ablex, Norwood, NJ, pp. 3–28
Johnson M 1987 The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason. University of Chicago Press, Chicago
Kamp H, Reyle U 1993 From Discourse to Logic. Kluwer, Dordrecht, The Netherlands
Lakoff G 1987 Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. University of Chicago Press, Chicago
Langacker R W 1987 Foundations of Cognitive Grammar. Stanford University Press, Stanford, CA
Lappin S (ed.) 1996 The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK
Lehmann W P 1992 Historical Linguistics, 3rd edn. Routledge, London
Montague R 1974 Formal Philosophy: Selected Papers of Richard Montague [ed. Thomason R H]. Yale University Press, New Haven, CT
Newmeyer F J 1998 Language Form and Language Function. MIT Press, Cambridge, MA
Partee B H 1996 The development of formal semantics in linguistic theory. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 11–38
Pinker S 1989 Learnability and Cognition: The Acquisition of Argument Structure. MIT Press, Cambridge, MA
Rosner M, Johnson R (eds.) 1992 Computational Linguistics and Formal Semantics. Cambridge University Press, Cambridge, UK
Sperber D, Wilson D 1995 Relevance: Communication and Cognition, 2nd edn. Blackwell, Oxford, UK
Talmy L 2000 Toward a Cognitive Semantics. MIT Press, Cambridge, MA
Verkuyl H J 1993 A Theory of Aspectuality: The Interaction Between Temporal and Atemporal Structure. Cambridge University Press, Cambridge, UK
Wierzbicka A 1992 Semantics, Culture, and Cognition. Universal Concepts in Culture-specific Configurations. Oxford University Press, Oxford, UK
J. I. Saeed
Semiotics Semiotics is an interdisciplinary field that studies ‘the life of signs within society’ (Saussure 1959, p. 16). While ‘signs’ most commonly refers to the elements of verbal language and other vehicles of communication, it also denotes any means of representing or knowing about an aspect of reality. As a result, semiotics has developed as a close cousin of such traditional disciplines as philosophy and psychology. In the social sciences and humanities, semiotics has become an influential approach to research on culture and communication particularly since the 1960s. This article describes the classical origins of semiotics, exemplifies its application to contemporary culture, and outlines its implications for the theory of science.
1. Origins: Logic and Linguistics

Two different senses of ‘signs’ can be traced in the works of Aristotle (Clarke 1990, p. 15) (see Aristotle (384–322 BC)). First, the mental impressions that people have are signs which represent certain objects in the world to them. Second, spoken and written expressions are signs with which people are able to represent and communicate a particular understanding of these objects to others. The first sense, of mental impressions, points to the classical understanding of signs, not as words or images, but as naturally occurring evidence that something is the case. For example, fever is a sign of illness, and clouds are a sign that it may rain, to the extent that these signs are interpreted by humans. The second sense of signs as means of communication points towards the distinction that came to inform most modern science between conventional signs, especially verbal language, and sense data, or natural signs. Modern natural scientists could be said to study natural signs with reference to specialized conventional signs.

Given the ambition of semiotics to address the fundamental conditions of human knowledge, it has been an important undercurrent in the history of science and ideas. The first explicit statement regarding natural signs and language as two varieties of one general category of signs came not from Aristotle, but from St. Augustine (c. AD 400). This position was elaborated throughout the medieval period with reference, in part, to the understanding of nature as ‘God’s Book’ and an analogy to the Bible. However, it was not until the seventeenth century that the term semiotic emerged in John Locke’s An Essay concerning Human Understanding (1690). Here, Locke proposed a science of signs in general, only to restrict his own focus to ‘the most usual’ signs, namely, verbal language and logic (Clarke 1990, p. 40).
1.1 Peircean Logic and Semiotics

Charles Sanders Peirce (1839–1914) was the first thinker to recover this undercurrent in an attempt to develop a general semiotic. He understood his theory of signs as a form of logic, which informed a comprehensive system for understanding the nature of being and of knowledge. The key to the system is Peirce’s definition of the sign as having three aspects:

A sign, or representamen, is something which stands to somebody for something in some respect or capacity. It addresses somebody, that is, creates in the mind of that person an equivalent sign, or perhaps a more developed sign. That sign which it creates I call the interpretant of the first sign. That sign stands for something, its object.
An important implication of this definition is that signs always serve to mediate between objects in the world, including social facts, and concepts in the mind. Contrary to various skepticist positions, from Antiquity to postmodernism, Peirce’s point was not that signs are what we know, but how we come to know what we can justify saying that we know, in science and in everyday life. Peircean semiotics married a classical, Aristotelian notion of realism with the modern, Kantian position (see Kant, Immanuel (1724–1804)) that humans necessarily construct their understanding of reality in the form of particular cognitive categories.

A further implication is that human understanding is not a singular internalization of reality, but a continuous process of interpretation, what is called semiosis. This is witnessed, for example, in the process of scientific discovery, but also in the ongoing coordination of social life. Figure 1 illustrates the process of semiosis, noting how any given interpretation (interpretant) itself serves as a sign in the next stage of an unending process which differentiates the understanding of objects in reality. Although Peirce’s outlook was that of a logician and a natural scientist, semiosis can be taken to refer to the processes of communication by which cultures are maintained and societies reproduced and, to a degree, reformed.

Figure 1. The process of semiosis

One of the most influential elements of Peirce’s semiotics outside logic and philosophy has been his categorization of different types of signs, particularly icon, index, and symbol. Icons relate to their objects through resemblance (e.g., a realistic painting); indices have a causal relation (e.g., fever as a symptom of illness); and symbols have an arbitrary relation to their object (e.g., words). Different disciplines and fields have relied on these types as analytical instruments in order to describe the ways in which humans perceive and act upon reality.
1.2 Saussurean Linguistics and Semiology

Compared with Peirce, the other main figure in the development of semiotics, Ferdinand de Saussure (1857–1913), placed a specific focus on verbal language (see Saussure, Ferdinand de (1857–1913)), what Peirce had referred to as symbols. Probably the main achievement of Saussure was to outline the framework for modern linguistics (see Linguistics: Overview). In contrast to the emphasis that earlier philology had placed on the diachronic perspective of how languages change over time, Saussure proposed to study language as a system in a synchronic perspective. Language as an abstract system (langue) could be distinguished, at least for analytical purposes, from the actual uses of language (parole). The language system has two dimensions. Along the syntagmatic dimension, letters, words, phrases, etc., are the units that combine to make up meaningful wholes, and each of these units has been chosen as one of several possibilities along a paradigmatic dimension, for example, one verb in preference to another. This combinatory system helps to account for the remarkable flexibility of language as a medium of social interaction.

To be precise, Saussure referred to the broader science of signs, as cited in the introduction to this article, not as semiotics, but as semiology. Like Locke, and later Peirce, Saussure relied on classical Greek to coin a technical term, and the two variants are explained, in part, by the fact that Peirce and Saussure had no knowledge of each other’s work. Moreover, in Saussure’s own work, the program for a semiology remained undeveloped, appearing almost as an aside from his main enterprise of linguistics. It was during the consolidation of semiotics as an interdisciplinary field from the 1960s that ‘semiotics’ became the agreed term, overriding Peirce’s ‘semiotic’ and Saussure’s ‘semiology’, as symbolized by the formation in 1969 of the International Association for Semiotic Studies.
It was also during this period that Saussure’s systemic approach was redeveloped to apply to culture and society. One important legacy of Saussure has been his account of the arbitrariness of the linguistic sign. This sign type is said to have two sides, a signified (concept) and a signifier (the acoustic image associated with it), whose relation is conventional or arbitrary. While this argument is sometimes construed in skepticist and relativist terms, as if sign users were paradoxically free to choose their own meanings, and hence almost destined to remain divorced from any consensual reality, the point is rather that the linguistic system as a whole, including the interrelations between signs and
their components, is arbitrary, but fixed by social convention. When applied to studies of cultural forms other than language, or when extended to the description of social structures as signs, the principle of arbitrariness has been a source of analytical difficulties.
2. Applications: Media, Communication and Culture

It was the application of semiotics to questions of meaning beyond logic and linguistics which served to consolidate semiotics as a recognizable, if still heterogeneous, field from the 1960s. Drawing on other traditions of aesthetic and social research, this emerging field began to make concrete what the study of ‘the life of signs within society’ might mean. The objects of analysis ranged from artworks to mass media to the life forms of premodern societies, but studies were united by a common interest in culture in the broad sense of worldviews that orient social action. The work of Claude Lévi-Strauss (1963) on structural anthropology was characteristic of, and highly influential in, such research on shared, underlying systems of interpretation which might explain the viewpoints or actions of individuals in a given context. It was Saussure’s emphasis on signs as a system, then, which provided a model for structuralist theories (see Structuralism) about social power (e.g., Louis Althusser) and about the unconscious as a ‘language’ (e.g., Jacques Lacan) (see Psychoanalysis: Overview).

Of the French scholars who were especially instrumental in this consolidation, Roland Barthes, along with A. J. Greimas, stands out as an innovative and systematic theorist. His model of two levels of signification, reproduced in Fig. 2, has been one of the most widely copied attempts to link concrete sign vehicles, such as texts and images, with the ‘myths’ or ideologies which they articulate. Building on Louis Hjelmslev’s formal linguistics, Barthes (1973) suggested that the combined signifier and signified (expressive form and conceptual content) of one sign (e.g., a picture of a black man in a French uniform saluting the flag) may become the expressive form of a further, ideological content (e.g., that French imperialism is not a discriminatory or oppressive system). Barthes’ political agenda, shared by much semiotic scholarship since then, was that this semiotic mechanism serves to naturalize particular worldviews, while oppressing others, and should be deconstructed (see Critical Theory: Contemporary).

Figure 2. Two levels of signification. Language level: 1. Signifier, 2. Signified, 3. Sign. Myth level: I SIGNIFIER (the sign of the language level), II SIGNIFIED, III SIGN (reproduced by permission of Roland Barthes [Random House Group Ltd (Jonathan Cape)] from Mythologies, p. 115, first published in French by Éditions du Seuil in 1957)

In retrospect, one can distinguish two ways of appropriating semiotics in social and cultural research. On the one hand, semiotics may be treated as a methodology for examining signs, whose social or philosophical implications are interpreted with reference to another set of theoretical concepts. On the other hand, semiotics may also supply the theoretical framework, so that societies, psyches, and images are understood not just in procedural, but also in actual conceptual terms as signs. The latter position is compatible with a variant of semiotics which assumes that biological and even cosmological processes are also best understood as semioses. This ambitious and arguably imperialistic extension of logical and linguistic concepts into other fields has encountered criticism, for example, in cases where Saussure’s principle of arbitrariness has been taken to apply to visual communication or to the political and economic organization of society. A semioticization of, for instance, audiovisual media (for example, see Visual Images in the Media) can be said to neglect their appeal to radically different perceptual registers, which are, in certain respects, natural rather than conventional (e.g., Messaris 1994) (for example, see Cognitive Science: Overview). Similarly, a de facto replacement of social science by semiotics may exaggerate the extent to which signs, rather than material and institutional conditions, determine the course of society, perhaps in response to semioticians’ political ambition of making a social difference by deconstructing signs.
In recent decades, these concerns, along with critiques of Saussurean formalist systems, have led to attempts to formulate a social semiotics, integrating semiotic methodology with other social and communication theory (e.g., Hodge and Kress 1988, Jensen 1995). On the relations between semiotics and other fields, see Nöth (1990).

A final distinction within semiotic studies of society and culture arises from the question of what is a ‘medium’. Media research has drawn substantially on semiotics to account for the distinctive sign types, codes, narratives and modes of address of different media such as newspapers, television or the Internet (for example, see Mass Media: Introduction and Schools of Thought). A particular challenge has been the film medium, which was described by Christian Metz (1974) not as a language system, but as a ‘language’ drawing on several semiotic codes. Beyond this approach to media as commonly understood, semioticians, taking their inspiration from Lévi-Strauss’ anthropology, have described other objects and artifacts as vehicles of meaning. One rich example is Barthes’ study of the structure of fashion (see Fashion, Sociology of), as depicted in magazines (Barthes 1985). The double meaning of a ‘medium’ is another indication both of the potential theoretical richness of semiotics and of the pitfalls of confusing different levels of analysis.
3. Theory of Signs and Theory of Science

It is this double edge which most likely explains the continuing influence of semiotic ideas since Antiquity, mostly, however, as an undercurrent of science. The understanding articulated in Aristotle’s writings of signs as evidence for an interpreter of something else which is at least temporarily absent, or as an interface which is not identical with the thing that is in evidence to the interpreter, may be taken as a cultural leap that has made possible scientific reflexivity as well as social organization beyond the here and now. Further, the sign concept has promoted the crucial distinction, beginning with Aristotle, between necessary and probable relations, as in logical inferences. Returning to the examples above, fever is a sure sign that the person is ill, since illness is a necessary condition of fever. But clouds are only a probable sign of rain. Empirical social research is mostly built on the premise that probable signs can lead to warranted inferences.

The link between a theory of signs and a theory of science has been explored primarily from a Peircean perspective. For one thing, Peirce himself made important contributions to logic and to the theory of science generally, notably by adding to deduction and induction the form of inference called abduction, which he detected both in everyday reasoning and at the core of scientific innovation. For another thing, the wider philosophical tradition of pragmatism (for example, see Pragmatism: Philosophical Aspects), which Peirce inaugurated, has emphasized the interrelations between knowledge, education, and action, and it has recently enjoyed a renaissance at the juncture between theory of science and social theory (e.g., Bernstein 1991, Joas 1993). By contrast, the Saussurean tradition has often bracketed issues concerning the relationship between the word and the world (Jakobson 1981, p. 19), even if critical research in this vein has recruited theories from other schools of thought in order to move semiology beyond the internal analysis of signs.

For future research, semiotics is likely to remain a source of inspiration in various scientific domains, as it has been for almost 2500 years. In light of the record from the past century of intensified interest in theories of signs, however, the field is less likely to develop into a coherent discipline. One of the main opportunities for semiotics may be to develop a meta-framework for understanding how different disciplines and fields conceive of their ‘data’ and ‘concepts’. The semiotic heritage offers both systematic analytical procedures and a means of reflexivity regarding the role of signs in society as well as in the social sciences.

See also: Communication: Philosophical Aspects
Bibliography

Barthes R 1973 Mythologies. Paladin, London
Barthes R 1985 The Fashion System. Cape, London
Bernstein R J 1991 The New Constellation. Polity Press, Cambridge, UK
Bouissac P (ed.) 1998 Encyclopedia of Semiotics. Oxford University Press, New York
Clarke D S Jr 1990 Sources of Semiotic. Southern Illinois University Press, Carbondale, IL
Greimas A J, Courtés J 1982 Semiotics and Language: An Analytical Dictionary. Indiana University Press, Bloomington, IN
Hodge R, Kress G 1988 Social Semiotics. Polity Press, Cambridge, UK
Jakobson R 1981 Linguistics and poetics. In: Selected Writings. Mouton, The Hague, The Netherlands, Vol. 3
Jensen K B 1995 The Social Semiotics of Mass Communication. Sage, London
Joas H 1993 Pragmatism and Social Theory. University of Chicago Press, Chicago, IL
Lévi-Strauss C 1963 Structural Anthropology. Basic Books, New York
Messaris P 1994 Visual ‘Literacy’: Image, Mind, and Reality. Westview Press, Boulder, CO
Metz C 1974 Language and Cinema. Mouton, The Hague, The Netherlands
Nöth W 1990 Handbook of Semiotics. Indiana University Press, Bloomington, IN
Peirce C S 1982 Writings of Charles S. Peirce: A Chronological Edition. Indiana University Press, Bloomington, IN
Peirce C S 1992–98 The Essential Peirce. Indiana University Press, Bloomington, IN, Vols. 1–2
Posner R, Robering K, Sebeok T A (eds.) 1997–98 Semiotik: Ein Handbuch zu den zeichentheoretischen Grundlagen von Natur und Kultur/Semiotics: A Handbook on the Sign-Theoretic Foundations of Nature and Culture. Walter de Gruyter, Berlin, Vols. 1–2
de Saussure F 1959 Course in General Linguistics. Peter Owen, London
Sebeok T A 1994 Encyclopedic Dictionary of Semiotics, 2nd edn. Mouton de Gruyter, Berlin, Vols. 1–3
K. B. Jensen
Semiparametric Models Much empirical research in the social sciences is concerned with estimating conditional mean functions. For example, labor economists are interested in estimating the mean wages of employed individuals, conditional on characteristics such as years of work
experience and education. The most frequently used estimation methods assume that the conditional mean function is known up to a set of constant parameters that can be estimated from data, possibly by ordinary least squares. Models in which the only unknown quantities are a finite set of constant parameters are called ‘parametric.’ The use of a parametric model greatly simplifies estimation, statistical inference, and interpretation of the estimation results but is rarely justified by theoretical or other a priori considerations. Estimation and inference based on convenient but incorrect assumptions about the form of the conditional mean function can be highly misleading. Semiparametric statistical methods reduce the strength of the assumptions required for estimation and inference, thereby reducing the opportunities for obtaining misleading results. These methods are applicable to a wide variety of estimation problems in economics and other fields.
1. Introduction

A conditional mean function gives the mean of a dependent variable $Y$ conditional on a vector of explanatory variables $X$. Denote the mean of $Y$ conditional on $X = x$ by $E(Y \mid x)$. For example, suppose that $Y$ is a worker’s weekly wage (or, more often in applied econometrics, the logarithm of the wage) and $X$ includes such variables as years of work experience and education, race, and sex. Then $E(Y \mid x)$ is the mean wage (or logarithm of the wage) when experience and the other explanatory variables have the values specified by $x$. As an illustration, the solid line in Fig. 1 shows an estimate of the mean of the logarithm of weekly wages, $\log W$, conditional on years of work experience, $EXP$, for white males with 12 years of education who work full time and live in urban areas of the North Central USA. The estimate was obtained by applying a nonparametric method (explained in Sect. 2) to data from the 1993 Current Population Survey (CPS). The estimated conditional mean of $\log W$ increases steadily up to approximately 30 years of experience and is flat thereafter.

Figure 1. Estimates of $E(\log W \mid EXP)$

In most applications, $E(Y \mid x)$ is unknown and must be estimated from data on the variables of interest. In the case of estimating a wage function, the data consist of observations of individuals’ wages, years of experience, and other characteristics. The most widely used method for estimating $E(Y \mid x)$ is not the nonparametric method mentioned previously, but rather a method that assumes that $E(Y \mid x)$ is known up to finitely many constant parameters. This gives a ‘parametric model’ for $E(Y \mid x)$. Often, $E(Y \mid x)$ is assumed to be a linear function of $x$, in which case the parameters can be estimated by ordinary least squares (OLS), among other ways. A linear function has the form $E(Y \mid x) = \beta'x$, where $\beta$ is a vector of coefficients. For example, if $x$ consists of an intercept and the two variables $x_1$ and $x_2$, then $\beta$ has three components, and $\beta'x = \beta_0 + \beta_1 x_1 + \beta_2 x_2$. OLS estimators are described in many textbooks. See, for example, Goldberger (1998).

The OLS estimator of $E(Y \mid x)$ can be highly misleading if $E(Y \mid x)$ is not linear in the components of $x$, that is, if there is no $\beta$ such that $E(Y \mid x) = \beta'x$. This problem is illustrated by the dashed and dotted lines in Fig. 1, which show two parametric estimates of the mean of the logarithm of weekly wages conditional on years of work experience. The dashed line is the OLS estimate that is obtained by assuming that $E(\log W \mid EXP)$ is the linear function $E(\log W \mid EXP) = \beta_0 + \beta_1 EXP$. The dotted line is the OLS estimate that is obtained by assuming that $E(\log W \mid EXP)$ is quadratic: $E(\log W \mid EXP) = \beta_0 + \beta_1 EXP + \beta_2 EXP^2$. The nonparametric estimate (solid line) places no restrictions on the shape of $E(\log W \mid EXP)$. The linear and quadratic models give misleading estimates of $E(\log W \mid EXP)$. The linear model indicates that $E(\log W \mid EXP)$ steadily increases as experience increases. The quadratic model indicates that $E(\log W \mid EXP)$ decreases after 32 years of experience. In contrast, the nonparametric estimate of $E(\log W \mid EXP)$ becomes nearly flat at approximately 30 years of experience. Because the nonparametric estimate does not restrict the conditional mean function to be linear or quadratic, it is more likely to represent the true conditional mean function.

The opportunities for specification error increase if $Y$ is binary. For example, consider a model of the choice of travel mode for the trip to work. Suppose that the available modes are automobile and transit. Let $Y = 1$ if an individual chooses automobile and $Y = 0$ if the individual chooses transit. Let $X$ be a vector of explanatory variables such as the travel times and costs by automobile and transit. Then $E(Y \mid x)$ is the probability that $Y = 1$ (the probability that the individual chooses automobile) conditional on $X = x$. This probability will be denoted $P(Y = 1 \mid x)$. In applications of binary response models, it is often assumed that $P(Y = 1 \mid x) = G(\beta'x)$, where $\beta$ is a vector of constant coefficients and $G$ is a known probability distribution function. Often, $G$ is assumed to be the cumulative standard normal distribution function, which yields a ‘binary probit’ model, or the cumulative logistic distribution function, which yields a ‘binary logit’ model (see Multivariate Analysis: Discrete Variables (Logistic Regression)). The coefficients $\beta$ can then be estimated by the method of maximum likelihood (Amemiya 1985). However, there are now two potential sources of specification error. First, the dependence of $Y$ on $x$ may not be through the linear index $\beta'x$. Second, even if the index $\beta'x$ is correct, the ‘response function’ $G$ may not be the normal or logistic distribution function. See Horowitz (1993, 1998) for examples of specification errors in binary response models and their consequences.

Many investigators attempt to minimize the risk of specification error by carrying out a ‘specification search’ in which several different models are estimated and conclusions are based on the one that appears to fit the data best. Specification searches may be unavoidable in some applications, but they have many undesirable properties and their use should be minimized. There is no guarantee that a specification search will include the correct model or a good approximation to it. If the search includes the correct model, there is no guarantee that it will be selected by the investigator’s model selection criteria. Moreover, the search process invalidates the statistical theory on which inference is based.

The rest of this entry describes methods that deal with the problem of specification error by relaxing the assumptions about functional form that are made by parametric models. The possibility of specification error can be essentially eliminated through the use of nonparametric estimation methods. These are described in Sect. 2. They assume that $E(Y \mid x)$ is a smooth function but make no other assumptions about its shape or functional form. However, nonparametric methods have important disadvantages that seriously limit their usefulness in applications. Semiparametric methods, which are described in Sect. 3, offer a compromise. They make assumptions about functional form that are stronger than those of a nonparametric model but less restrictive than the assumptions of a parametric model, thereby reducing (though not eliminating) the possibility of specification error. In addition, semiparametric methods avoid the most serious practical disadvantages of nonparametric methods.
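The way a linear or quadratic specification can mislead, as discussed above for the wage example, can be sketched on synthetic data. The following is only an illustration under stated assumptions: the data are simulated rather than taken from the CPS, the function true_mean (a conditional mean that rises for 25 years and is flat thereafter), the noise level, the sample size, and the helper ols_fit are all hypothetical choices made for this sketch.

```python
import random


def ols_fit(rows, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def true_mean(exp_years):
    """Hypothetical E(log W | EXP): rises for 25 years, then stays flat."""
    return 5.5 + 0.03 * min(exp_years, 25.0)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
exp_sample = [rng.uniform(0.0, 50.0) for _ in range(2000)]
log_wage = [true_mean(e) + rng.gauss(0.0, 0.2) for e in exp_sample]

# Linear model: E(log W | EXP) = b0 + b1 * EXP
beta_lin = ols_fit([[1.0, e] for e in exp_sample], log_wage)
# Quadratic model: E(log W | EXP) = b0 + b1 * EXP + b2 * EXP^2
beta_quad = ols_fit([[1.0, e, e * e] for e in exp_sample], log_wage)

# Experience level at which the fitted quadratic turns (a peak when b2 < 0).
peak = -beta_quad[1] / (2.0 * beta_quad[2])

for e in (10.0, 25.0, 40.0, 50.0):
    lin = beta_lin[0] + beta_lin[1] * e
    quad = beta_quad[0] + beta_quad[1] * e + beta_quad[2] * e * e
    print(f"EXP={e:4.0f}  true={true_mean(e):.3f}  linear={lin:.3f}  quadratic={quad:.3f}")
print(f"fitted quadratic turns at about {peak:.1f} years")
```

In this simulated setting the linear fit keeps rising over the whole range and the quadratic fit turns downward at high experience, although the assumed true mean is flat there. Both patterns are artifacts of the imposed functional form, which is the kind of specification error the text describes.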
2. Nonparametric Models

In nonparametric estimation, $E(Y \mid x)$ is assumed to satisfy smoothness conditions such as differentiability, but no assumptions are made about its shape or the form of its dependence on $x$. Härdle (1990) and Fan and Gijbels (1996) provide detailed discussions of nonparametric estimation methods. One easily understood and frequently used method is called ‘kernel estimation.’ This method was used to produce the solid line in Fig. 1. To describe the kernel method simply, assume that $X$ is a continuously distributed, scalar random variable. Let $\{Y_i, X_i : i = 1, \ldots, n\}$ be a random sample of $n$ observations of $(Y, X)$. Let $K$ be a probability density function that is bounded, continuous, and symmetrical about zero. For example, $K$ may be the standard normal density function. Let $\{h_n\}$ be a sequence of positive numbers that converges to 0 as $n \to \infty$. For each $n = 1, 2, \ldots$ and $i = 1, \ldots, n$ define the function $w_{ni}(\cdot)$ by

$$w_{ni}(x) = \frac{K[(x - X_i)/h_n]}{\sum_{j=1}^{n} K[(x - X_j)/h_n]} \tag{1}$$

Then the kernel nonparametric estimator of $E(Y \mid x)$ is

$$H_n(x) = \sum_{i=1}^{n} w_{ni}(x) Y_i \tag{2}$$

$H_n(x)$ is a weighted average of the observed values of $Y$. Observations $Y_i$ for which $X_i$ is close to $x$ get higher weight than do observations for which $X_i$ is far from $x$. It can be shown that if $h_n \to 0$ and $n h_n/(\log n) \to \infty$ as $n \to \infty$, then $H_n(x) \to E(Y \mid x)$ with probability 1. Thus, if $n$ is large, $H_n(x)$ is likely to be very close to $E(Y \mid x)$. Härdle (1990) provides a detailed discussion of the statistical properties of kernel nonparametric estimators.

Nonparametric estimation minimizes the risk of specification error, but the price of this flexibility can be high. One important reason for this is that the precision of a nonparametric estimator decreases rapidly as the number of continuously distributed components of $X$ increases. This phenomenon is called the ‘curse of dimensionality.’ As a result of it, impracticably large samples are usually needed to obtain acceptable estimation precision if $X$ is multidimensional, as it often is in social science applications. For example, a labor economist may want to estimate mean log wages conditional on years of work experience, years of education, and one or more indicators of skill levels, thus making the dimension of $X$ at least 3. See Exploratory Data Analysis: Multivariate Approaches (Nonparametric Regression) for further discussion of the curse of dimensionality.

Another problem is that nonparametric estimates do not have simple analytic forms, so they can be difficult to display, communicate, and interpret when $X$ is multidimensional. If $X$ is one- or two-dimensional, then the estimate of $E(Y \mid x)$ can be displayed graphically as in Fig. 1, but only reduced-dimension projections can be displayed when $X$ has three or more components. Many such displays and much skill in interpreting them can be needed to fully convey and comprehend the shape of the estimate of $E(Y \mid x)$.

Another problem with nonparametric estimation is that it does not permit extrapolation. That is, it does not provide predictions of $E(Y \mid x)$ at points $x$ that are outside of the support (or range) of the random variable $X$. This is a serious drawback in policy analysis and forecasting, where it is often important to predict what might happen under conditions that do not exist in the available data. Finally, in nonparametric estimation, it can be difficult to impose restrictions suggested by economic or other theory. Matzkin (1994) discusses this issue.

Semiparametric methods permit greater estimation precision than do nonparametric methods when $X$ is multidimensional. In addition, semiparametric estimates are easier to display and interpret than nonparametric ones and provide limited capabilities for extrapolation and for imposing restrictions derived from economic or other theoretical models.
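As a minimal sketch of the kernel estimator defined by Eqns. (1) and (2) in Sect. 2, the following Python code computes $H_n(x)$ with a standard normal kernel. The simulated data, the regression function sin(x), the bandwidth h = 0.2, and the function names are illustrative assumptions and do not come from the original entry.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, one admissible choice of K."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kernel_regression(x_eval, x_obs, y_obs, h):
    """Evaluate H_n(x) = sum_i w_ni(x) Y_i at each point of x_eval."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # k[a, i] = K((x_a - X_i) / h)
    k = gaussian_kernel((x_eval[:, None] - x_obs[None, :]) / h)
    weights = k / k.sum(axis=1, keepdims=True)   # Eqn. (1): each row sums to one
    return weights @ y_obs                        # Eqn. (2)

# Hypothetical simulated sample with E(Y | x) = sin(x)
rng = np.random.default_rng(0)
x_obs = rng.uniform(0.0, 3.0, size=500)
y_obs = np.sin(x_obs) + 0.2 * rng.standard_normal(500)

grid = np.array([0.5, 1.5, 2.5])
fit = kernel_regression(grid, x_obs, y_obs, h=0.2)
print(np.round(fit, 2), np.round(np.sin(grid), 2))
```

Because the weights in each row sum to one, the estimate at any point is a weighted average of the observed $Y_i$, as the text describes. In practice the bandwidth would be chosen by a data-driven rule rather than fixed in advance.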
3. Semiparametric Models

The term ‘semiparametric’ refers to models in which there is an unknown function in addition to an unknown finite-dimensional parameter. For example, the binary response model $P(Y = 1 \mid x) = G(\beta'x)$ is semiparametric if the function $G$ and the vector of coefficients $\beta$ are both treated as unknown quantities. This section describes two semiparametric models of conditional mean functions that are important in applications. The section also describes a related class of models that have no unknown finite-dimensional parameters but, like semiparametric models, mitigate the disadvantages of fully nonparametric models. In addition to the estimation of conditional mean functions, semiparametric methods can be used to estimate conditional quantile and hazard functions, binary response models in which there is heteroskedasticity of unknown form, transformation models, and censored and truncated mean- and median-regression models, among others. Space does not permit discussion of these models here. Horowitz (1998) and Powell (1994) provide more comprehensive treatments in which these models are discussed.

3.1 Single Index Models

In a semiparametric single index model, the conditional mean function has the form

$$E(Y \mid x) = G(\beta'x) \tag{3}$$
where $\beta$ is an unknown constant vector and $G$ is an unknown function. The quantity $\beta'x$ is called an ‘index.’ The inferential problem is to estimate $G$ and $\beta$ from observations of $(Y, X)$. $G$ in Eqn. (3) is analogous to a link function in a generalized linear model, except that in Eqn. (3) $G$ is unknown and must be estimated. Model (3) contains many widely used parametric models as special cases. For example, if $G$ is the identity function, then Eqn. (3) is a linear model. If $G$ is the cumulative normal or logistic distribution function, then Eqn. (3) is a binary probit or logit model. When $G$ is unknown, Eqn. (3) provides a specification that is more flexible than a parametric model but retains many of the desirable features of parametric models, as will now be explained.

One important property of single index models is that they avoid the curse of dimensionality. This is because the index $\beta'x$ aggregates the dimensions of $x$, thereby achieving ‘dimension reduction.’ Consequently, the difference between the estimator of $G$ and the true function can be made to converge to zero at the same rate that would be achieved if $\beta'x$ were observable. Moreover, $\beta$ can be estimated with the same rate of convergence that is achieved in a parametric model. Thus, in terms of the rates of convergence of estimators, a single index model is as accurate as a parametric model for estimating $\beta$ and as accurate as a one-dimensional nonparametric model for estimating $G$. This dimension reduction feature of single index models gives them a considerable advantage over nonparametric methods in applications where $X$ is multidimensional and the single index structure is plausible.
A single index model permits limited extrapolation. Specifically, it yields predictions of $E(Y \mid x)$ at values of $x$ that are not in the support of $X$ but are in the support of $\beta'X$. Of course, there is a price that must be paid for the ability to extrapolate. A single index model makes assumptions that are stronger than those of a nonparametric model. These assumptions are testable on the support of $X$ but not outside of it. Thus, extrapolation (unavoidably) relies on untestable assumptions about the behavior of $E(Y \mid x)$ beyond the support of $X$.

Before $\beta$ and $G$ can be estimated, restrictions must be imposed that ensure their identification. That is, $\beta$ and $G$ must be uniquely determined by the population distribution of $(Y, X)$. Identification of single index models has been investigated by Ichimura (1993) and, for the special case of binary response models, Manski (1988). It is clear that $\beta$ is not identified if $G$ is a constant function or there is an exact linear relation among the components of $X$ (perfect multicollinearity). In addition, Eqn. (3) is observationally equivalent to the model $E(Y \mid x) = G^*(\gamma + \delta\beta'x)$, where $\gamma$ and $\delta \neq 0$ are arbitrary and $G^*$ is defined by the relation $G^*(\gamma + \delta\nu) = G(\nu)$ for all $\nu$ in the support of $\beta'X$. Therefore, $\beta$ and $G$ are not identified unless restrictions are imposed that uniquely specify $\gamma$ and $\delta$. The restriction on $\gamma$ is called ‘location normalization’ and can be imposed by requiring $X$ to contain no constant (intercept) component. The restriction on $\delta$ is called ‘scale normalization.’ Scale normalization can be achieved by setting the $\beta$ coefficient of one component of $X$ equal to one. A further identification requirement is that $X$ must include at least one continuously distributed component whose $\beta$ coefficient is nonzero. Horowitz (1998) gives an example that illustrates the need for this requirement. Other more technical identification requirements are discussed by Ichimura (1993) and Manski (1988).
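The need for location and scale normalization can be checked numerically. The sketch below assumes, purely for illustration, a logistic $G$, a particular $\beta$, and arbitrary values of $\gamma$ and $\delta$. It confirms that the shifted and rescaled index, paired with the correspondingly transformed link $G^*$, produces the same conditional mean.

```python
import numpy as np

def G(v):
    """An arbitrary link chosen for illustration (logistic)."""
    return 1.0 / (1.0 + np.exp(-v))

beta = np.array([1.0, -0.5])     # hypothetical index coefficients
gamma, delta = 2.0, 3.0          # arbitrary location and scale, delta != 0

def G_star(u):
    """Defined by G*(gamma + delta * v) = G(v), i.e. G*(u) = G((u - gamma) / delta)."""
    return G((u - gamma) / delta)

rng = np.random.default_rng(1)
x = rng.standard_normal((5, 2))
index = x @ beta

mean_original = G(index)                        # E(Y | x) under (G, beta)
mean_rescaled = G_star(gamma + delta * index)   # E(Y | x) under (G*, gamma, delta * beta)
print(np.allclose(mean_original, mean_rescaled))
```

Since the two parameterizations imply identical conditional means at every $x$, no amount of data can distinguish them, which is why $\gamma$ and $\delta$ must be fixed by normalization.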
The main estimation challenge in single index models is estimating $\beta$. Given an estimator $b_n$ of $\beta$, $G$ can be estimated by carrying out the nonparametric regression of $Y$ on $b_n'X$ (e.g., by using the kernel method described in Sect. 2). Several estimators of $\beta$ are available. Ichimura (1993) describes a nonlinear least squares estimator. Klein and Spady (1993) describe a semiparametric maximum likelihood estimator for the case in which $Y$ is binary. These estimators are difficult to compute because they require solving complicated nonlinear optimization problems. Powell et al. (1989) describe a ‘density-weighted average derivative estimator’ (DWADE) that is noniterative and easily computed. The DWADE applies when all components of $X$ are continuous random variables. It is based on the relation

$$\beta \propto E\left[p(X)\,\frac{\partial G(\beta'X)}{\partial X}\right] = -2E\left[Y\,\frac{\partial p(X)}{\partial X}\right] \tag{4}$$

where $p$ is the probability density function of $X$ and the equality follows from integrating the first expectation by parts. Thus, $\beta$ can be estimated up to scale by estimating the expression on the right-hand side of Eqn. (4). Powell et al. (1989) show that this can be done by replacing $p$ with a nonparametric estimator and replacing the population expectation $E$ with a sample average. Horowitz and Härdle (1996) extend this method to models in which some components of $X$ are discrete. They also give an empirical example that illustrates the usefulness of single index models. Ichimura and Lee (1991) investigate a multiple index generalization of Eqn. (3).
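The right-hand side of Eqn. (4) can be sketched in a few lines. The code below is a simplified illustration rather than the exact procedure of Powell et al. (1989). It assumes a Gaussian product kernel, a fixed bandwidth, and simulated data with a hypothetical increasing link $G(v) = v + 0.5\sin(v)$, and it omits the refinements (such as bias-reducing kernels) that the asymptotic theory requires.

```python
import numpy as np

def dwade(y, x, h):
    """Sample analog of -2 E[Y dp(X)/dX]; proportional to beta under Eqn. (4)."""
    n, d = x.shape
    u = (x[:, None, :] - x[None, :, :]) / h               # u[i, j] = (X_i - X_j) / h
    k = np.exp(-0.5 * (u ** 2).sum(axis=2)) / (2.0 * np.pi) ** (d / 2.0)
    # Gradient of the leave-one-out kernel density estimate at X_i. The Gaussian
    # kernel has gradient -u K(u); the j = i term is zero because u[i, i] = 0.
    grad_p = -(u * k[:, :, None]).sum(axis=1) / ((n - 1) * h ** (d + 1))
    return -2.0 * (y[:, None] * grad_p).mean(axis=0)

# Hypothetical simulated single index data
rng = np.random.default_rng(2)
n, beta = 1000, np.array([1.0, 2.0])
x = rng.standard_normal((n, 2))
index = x @ beta
y = index + 0.5 * np.sin(index) + 0.5 * rng.standard_normal(n)

delta_hat = dwade(y, x, h=0.6)
b_hat = delta_hat / delta_hat[0]     # scale normalization: first coefficient set to one
print(b_hat)
```

The estimate recovers $\beta$ only up to scale, so the last step applies the scale normalization described above. No iteration or optimization is involved, which is the computational advantage the text attributes to the DWADE.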
3.2 Partially Linear Models

In a partially linear model, $X$ is partitioned into two nonoverlapping subvectors, $X_1$ and $X_2$. The model has the form

$$E(Y \mid x_1, x_2) = \beta'x_1 + G(x_2) \tag{5}$$

where $\beta$ is an unknown constant vector and $G$ is an unknown function. This model is distinct from the class of single index models. A single index model is not partially linear unless $G$ is a linear function. Conversely, a partially linear model is a single index model only in this case. Stock (1989, 1991) and Engle et al. (1986) illustrate the use of Eqn. (5) in applications.

Identification of $\beta$ requires the ‘exclusion restriction’ that none of the components of $X_1$ are perfectly predictable by components of $X_2$. When $\beta$ is identified, it can be estimated with an $n^{-1/2}$ rate of convergence regardless of the dimensions of $X_1$ and $X_2$. Thus, the curse of dimensionality is avoided in estimating $\beta$. An estimator of $\beta$ can be obtained by observing that Eqn. (5) implies

$$Y - E(Y \mid x_2) = \beta'[X_1 - E(X_1 \mid x_2)] + U \tag{6}$$

where $U$ is an unobserved random variable satisfying $E(U \mid x_1, x_2) = 0$. Robinson (1988) shows that under regularity conditions $\beta$ can be estimated by applying OLS to Eqn. (6) after replacing $E(Y \mid x_2)$ and $E(X_1 \mid x_2)$ with nonparametric estimators. The estimator of $\beta$, $b_n$, converges at rate $n^{-1/2}$ and is asymptotically normally distributed. $G$ can be estimated by carrying out the nonparametric regression of $Y - b_n'X_1$ on $X_2$. Unlike $b_n$, the estimator of $G$ suffers from the curse of dimensionality; its rate of convergence decreases as the dimension of $X_2$ increases.
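The two-step procedure based on Eqn. (6) can be sketched as follows. This is a minimal illustration under stated assumptions: $X_2$ is scalar, the nonparametric steps use a Gaussian-kernel regression with a fixed bandwidth, and the data are simulated with hypothetical values $\beta = (1, -2)$ and $G(x_2) = \sin(2\pi x_2)$. It is not a full implementation of Robinson's (1988) estimator, which involves additional technical devices.

```python
import numpy as np

def nw(x_eval, x_obs, y_obs, h):
    """Kernel regression of y_obs on a scalar regressor, Gaussian kernel."""
    k = np.exp(-0.5 * ((x_eval[:, None] - x_obs[None, :]) / h) ** 2)
    return (k @ y_obs) / k.sum(axis=1)

def partially_linear(y, x1, x2, h):
    """Estimate beta and G in E(Y | x1, x2) = beta'x1 + G(x2), with x2 scalar."""
    ey = nw(x2, x2, y, h)                                   # estimate of E(Y | x2)
    ex1 = np.column_stack([nw(x2, x2, x1[:, j], h)          # estimate of E(X1 | x2)
                           for j in range(x1.shape[1])])
    b, *_ = np.linalg.lstsq(x1 - ex1, y - ey, rcond=None)   # OLS applied to Eqn. (6)
    g_hat = nw(x2, x2, y - x1 @ b, h)                       # regress Y - b'X1 on X2
    return b, g_hat

# Hypothetical simulated data
rng = np.random.default_rng(3)
n = 1000
x2 = rng.uniform(0.0, 1.0, n)
x1 = np.column_stack([x2 + rng.standard_normal(n), rng.standard_normal(n)])
beta = np.array([1.0, -2.0])
g_true = np.sin(2.0 * np.pi * x2)
y = x1 @ beta + g_true + 0.3 * rng.standard_normal(n)

b_n, g_hat = partially_linear(y, x1, x2, h=0.05)
print(np.round(b_n, 2))
```

The first component of $X_1$ is correlated with $X_2$ but not perfectly predictable from it, so the exclusion restriction holds in this simulated design.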
3.3 Nonparametric Additive Models

Let $X$ have $d$ continuously distributed components that are denoted $X_1, \ldots, X_d$. In a nonparametric additive model of the conditional mean function,

$$E(Y \mid x) = \mu + f_1(x_1) + \cdots + f_d(x_d) \tag{7}$$

where $\mu$ is a constant and $f_1, \ldots, f_d$ are unknown functions that satisfy a location normalization condition such as

$$\int f_k(\nu)\, w_k(\nu)\, d\nu = 0, \qquad k = 1, \ldots, d \tag{8}$$

where $w_k$ is a non-negative weight function. An additive model is distinct from a single index model unless $E(Y \mid x)$ is a linear function of $x$. Additive and partially linear models are distinct unless $E(Y \mid x)$ is partially linear and $G$ in Eqn. (5) is additive.

An estimator of $f_k$ ($k = 1, \ldots, d$) can be obtained by observing that Eqns. (7) and (8) imply

$$f_k(x_k) = \int E(Y \mid x)\, w_{-k}(x_{-k})\, dx_{-k} \tag{9}$$

where $x_{-k}$ is the vector consisting of all components of $x$ except the $k$th and $w_{-k}$ is a weight function that satisfies $\int w_{-k}(x_{-k})\, dx_{-k} = 1$. The estimator of $f_k$ is obtained by replacing $E(Y \mid x)$ on the right-hand side of Eqn. (9) with nonparametric estimators. Linton and Nielsen (1995) and Linton (1997) present the details of the procedure and extensions of it. Under suitable conditions, the estimator of $f_k$ converges to the true $f_k$ at rate $n^{-2/5}$ regardless of the dimension of $X$. Thus, the additive model provides dimension reduction. It also permits extrapolation of $E(Y \mid x)$ within the rectangle formed by the supports of the individual components of $X$. Hastie and Tibshirani (1990) and Exploratory Data Analysis: Multivariate Approaches (Nonparametric Regression) discuss an alternative estimation procedure called ‘backfitting.’ This procedure is widely used, but its asymptotic properties are not yet well understood.

Linton and Härdle (1996) describe a generalized additive model whose form is

$$E(Y \mid x) = G[\mu + f_1(x_1) + \cdots + f_d(x_d)] \tag{10}$$

where $f_1, \ldots, f_d$ are unknown functions and $G$ is a known, strictly increasing (or decreasing) function. Horowitz (2001) describes a version of Eqn. (10) in which $G$ is unknown. Both forms of Eqn. (10) achieve dimension reduction. When $G$ is unknown, Eqn. (10) nests additive and single index models and, under certain conditions, partially linear models.

The use of the nonparametric additive specification (7) can be illustrated by estimating the model

$$E(\log W \mid EXP, EDUC) = \mu + f_{EXP}(EXP) + f_{EDUC}(EDUC)$$

where $W$ and $EXP$ are defined as in Sect. 1, and $EDUC$ denotes years of education. The data are taken from the 1993 CPS and are for white males with 14 or fewer years of education who work full time and live in urban areas of the North Central US. The results are shown in Fig. 2. The unknown functions $f_{EXP}$ and $f_{EDUC}$ are estimated by the method of Linton and Nielsen (1995) and are normalized so that $f_{EXP}(2) = f_{EDUC}(5) = 0$. The estimates of $f_{EXP}$ (Fig. 2a) and $f_{EDUC}$ (Fig. 2b) are nonlinear and differently shaped. Functions $f_{EXP}$ and $f_{EDUC}$ with different shapes cannot be produced by a single index model, and a lengthy specification search might be needed to find a parametric model that produces the shapes shown in Fig. 2. Some of the fluctuations of the estimates of $f_{EXP}$ and $f_{EDUC}$ may be artifacts of random sampling error rather than features of $E(\log W \mid EXP, EDUC)$. However, a more elaborate analysis that takes account of the effects of random sampling error rejects the hypothesis that either function is linear.

Figure 2 (a) Estimate of $f_{EXP}$ in additive nonparametric model of $E(\log W \mid EXP, EDUC)$. (b) Estimate of $f_{EDUC}$ in additive nonparametric model of $E(\log W \mid EXP, EDUC)$
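The marginal integration idea behind Eqn. (9) in Sect. 3.3 can be sketched for $d = 2$. The code below is an illustration under several assumptions that are not taken from the original entry: the pilot estimator is a bivariate Gaussian-kernel regression with a fixed bandwidth, the weight function $w_{-k}$ is replaced by the empirical distribution of the observed $X_{-k}$ (so the integral becomes a sample average), and the data are simulated with $f_1(x_1) = x_1^2$. The estimate is normalized to zero at a reference point, as in the empirical example above, so any additive constant cancels.

```python
import numpy as np

def nw2(pts, x_obs, y_obs, h):
    """Bivariate kernel regression estimate of E(Y | x) at the rows of pts."""
    u = (pts[:, None, :] - x_obs[None, :, :]) / h
    k = np.exp(-0.5 * (u ** 2).sum(axis=2))
    return (k @ y_obs) / k.sum(axis=1)

def marginal_integration(grid, k_idx, x_obs, y_obs, h):
    """Average the pilot estimate over the observed values of the other component."""
    out = np.empty(len(grid))
    for a, g in enumerate(grid):
        pts = x_obs.copy()
        pts[:, k_idx] = g               # fix component k at g, keep the observed x_{-k}
        out[a] = nw2(pts, x_obs, y_obs, h).mean()
    return out

# Hypothetical simulated additive data: mu = 1, f_1(x) = x^2, f_2(x) = 0.5 sin(3x)
rng = np.random.default_rng(4)
n = 600
x = rng.uniform(-1.0, 1.0, (n, 2))
y = 1.0 + x[:, 0] ** 2 + 0.5 * np.sin(3.0 * x[:, 1]) + 0.2 * rng.standard_normal(n)

grid = np.linspace(-0.6, 0.6, 7)
f1_hat = marginal_integration(grid, 0, x, y, h=0.2)
f1_hat = f1_hat - f1_hat[3]             # normalize so that the estimate is zero at x_1 = 0
print(np.round(f1_hat, 2))
```

Under this normalization the target is $f_1(x_1) - f_1(0) = x_1^2$, which can be compared directly with the printed values.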
4. Conclusions

This article has described several semiparametric methods for estimating conditional mean functions. These methods relax the restrictive assumptions made by linear and other parametric models, thereby reducing (though not eliminating) the likelihood of seriously misleading inference. The value of semiparametric methods in empirical research has been demonstrated. Their use is likely to increase as their availability in commercial statistical software packages increases.
Bibliography

Amemiya T 1985 Advanced Econometrics. Harvard University Press, Cambridge, MA
Engle R F, Granger C W J, Rice J, Weiss A 1986 Semiparametric estimates of the relationship between weather and electricity sales. Journal of the American Statistical Association 81: 310–20
Fan J, Gijbels I 1996 Local Polynomial Modelling and its Applications. Chapman & Hall, London
Goldberger A S 1998 Introductory Econometrics. Harvard University Press, Cambridge, MA
Härdle W 1990 Applied Nonparametric Regression. Cambridge University Press, Cambridge, UK
Hastie T J, Tibshirani R J 1990 Generalized Additive Models. Chapman & Hall, London
Horowitz J L 1993 Semiparametric and nonparametric estimation of quantal response models. In: Maddala G S, Rao C R, Vinod H D (eds.) Handbook of Statistics. Elsevier, Amsterdam, Vol. 11, pp. 45–72
Horowitz J L 1998 Semiparametric Methods in Econometrics. Springer-Verlag, New York
Horowitz J L 2001 Nonparametric estimation of a generalized additive model with an unknown link function. Econometrica 69: 599–631
Horowitz J L, Härdle W 1996 Direct semiparametric estimation of single-index models with discrete covariates. Journal of the American Statistical Association 91: 1632–40
Ichimura H 1993 Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics 58: 71–120
Ichimura H, Lee L-F 1991 Semiparametric least squares estimation of multiple index models: single equation estimation. In: Barnett W A, Powell J, Tauchen G (eds.) Nonparametric and Semiparametric Methods in Econometrics and Statistics. Cambridge University Press, Cambridge, UK, pp. 3–49
Klein R W, Spady R H 1993 An efficient semiparametric estimator for binary response models. Econometrica 61: 387–421
Linton O B 1997 Efficient estimation of additive nonparametric regression models. Biometrika 84: 469–73
Linton O B, Härdle W 1996 Estimating additive regression models with known links. Biometrika 83: 529–40
Linton O B, Nielsen J P 1995 A kernel method of estimating structured nonparametric regression based on marginal integration. Biometrika 82: 93–100
Manski C F 1988 Identification of binary response models. Journal of the American Statistical Association 83: 729–38
Matzkin R L 1994 Restrictions of economic theory in nonparametric methods. In: Engle R F, McFadden D L (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. 4, pp. 2523–58
Powell J L 1994 Estimation of semiparametric models. In: Engle R F, McFadden D L (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. 4, pp. 2444–521
Powell J L, Stock J H, Stoker T M 1989 Semiparametric estimation of index coefficients. Econometrica 57: 1403–30
Robinson P M 1988 Root-N-consistent semiparametric regression. Econometrica 56: 931–54
Stock J H 1989 Nonparametric policy analysis. Journal of the American Statistical Association 84: 567–75
Stock J H 1991 Nonparametric policy analysis: An application to estimating hazardous waste cleanup benefits. In: Barnett W A, Powell J, Tauchen G (eds.) Nonparametric and Semiparametric Methods in Econometrics and Statistics. Cambridge University Press, Cambridge, UK, pp. 77–98
J. L. Horowitz
Senescence: Genetic Theories

Senescence is the progressive deterioration of vitality that accompanies increasing age. Like other features of organismal life histories, patterns of senescence vary between individuals within populations, between populations of the same species, and between species, suggesting that they are modifiable by genetic factors and subject to evolutionary change. In this article, the various evolutionary forces that might direct genetic modifications of senescence are considered, and a theoretical framework for understanding the evolution of life histories is presented. The secondary problem of the maintenance of genetic variation for life history traits is also reviewed.
1. Medawar’s Principle

The modern evolutionary theory of senescence begins with Medawar, who argued that ‘… the efficacy of natural selection deteriorates with increasing age’ (1952, p. 23). A simple hypothetical example similar to a case considered by Hamilton (1966) illustrates the principle. Consider two genetic variants in humans, both having age-specific effects as follows: each variant confers complete immunity against a lethal disease, but only for one particular year of life. The first variant gives immunity to 12-year-olds, while the second variant confers immunity at the age of 60 years. What are the relative selective advantages of the genetic variants? If, for simplicity, the effects of parental care are ignored and it is also assumed that menopause always comes before 60 years, then it is immediately obvious that the second variant is selectively neutral, having no effect at all on the ability of carriers to transmit genes to the next generation, whereas the first variant has a significant selective advantage. This example illustrates the general principle that natural selection is most effective in the young. To obtain a more exact and quantitative understanding of the relation between organismal age and the force of selection, it is necessary to develop a description of selection in age-structured populations.
2. Age-structured Populations

Some organisms, such as annual plants, complete their life cycles in discrete fashion, exhibiting no overlap of parental and offspring generations. However, most higher organisms, including humans, have overlapping generations. Under the latter circumstances, the description of population composition and growth requires two kinds of information: age-specific survival and age-specific fertility. Survivorship, denoted $l(x)$, is defined as the probability of survival from birth or hatching until age $x$. A survivorship curve is a graph of $l(x)$ versus $x$, where $x$ ranges from zero to the greatest age attained in the population. The survivorship is initially 100 percent at birth and then declines to zero at the maximum observed age; it cannot increase with increasing age. If a cohort of 1,000 age-synchronized individuals is followed throughout their lives, then 500 of them will be alive at the age when $l(x)$ is 0.50, 100 will be alive when $l(x)$ is 0.10, and so on. Age-specific fertility, represented as $m(x)$, is defined as the average number of progeny produced by a female of age $x$.

One of the fundamentals of demography is that, under a wide range of conditions, a population having fixed $l(x)$ and $m(x)$ schedules will eventually attain a stable age-structure. That is, after a period of time the proportions of the population in each age-class will reach unchanging values. If the survivorship or fertility schedules are altered, then a different age-structure will evolve. Prior to attaining the stable age distribution, population growth is likely to be erratic, but once the stable age distribution is reached, then, under the assumption of unlimited resources, the population will grow smoothly. In particular, the population will exhibit exponential growth described by the following equation:

$$N(t) = N(0)\,e^{rt} \tag{1}$$
where $N(t)$ is population size as a function of time $t$, $N(0)$ is initial population size, $e$ is the base of natural logarithms, and $r$ is the Malthusian parameter, also known as the intrinsic rate of increase of the population. The parameter $r$ combines the effects of age-specific survival and fertility and translates them into a population growth rate. The value of $r$ is the implicit solution to the following equation, known as the Euler–Lotka equation:

$$\int e^{-rx}\, l(x)\, m(x)\, dx = 1 \tag{2}$$
The significance of the Malthusian parameter is that it reflects fitness, in the Darwinian sense, in an age-structured population. If there are several genotypes in a population, and if those genotypes differ with respect to age-specific survival or fertility patterns, then each genotype will have a particular $r$ value. Those $r$’s specify the rate at which a population consisting of only that genotype would grow, once the stable age distribution has been attained. The $r$’s also specify the relative fitnesses of the genotypes in a genotypically mixed population. The genotype with the highest $r$ has the highest fitness and will be favored by natural selection under conditions that allow population growth. Much of the evolutionary theory relating to senescence and life histories uses the Malthusian parameter $r$ as a surrogate for Darwinian fitness, essentially asking what changes in the $l(x)$ and $m(x)$ schedules would maximize the intrinsic rate of increase.

There is one other quantity that arises as a measure of fitness in populations with overlapping generations. Fisher (1930) defined ‘reproductive value,’ which is the expected number of progeny that will be produced by an individual of age $x$ over the rest of its lifetime, given that it has survived to age $x$. Reproductive value is not the same as fitness, because it does not take into account the chances of surviving to age $x$.
3. Hamilton’s Perturbation Analysis

Hamilton (1966) asked the following question: What sorts of small genetic changes in the $l(x)$ or $m(x)$ schedules will be favored by natural selection? To answer this question he employed the Malthusian parameter $r$ as a measure of fitness, assuming that the modifications of $l(x)$ and $m(x)$ that lead to the highest value of $r$ will be the ones to evolve. He also approximated the continuous functions described above with their discrete-time counterparts. The discrete-time rate of population increase is:

$$\lambda = e^{r} \tag{3}$$

The discrete-time version of the Euler–Lotka equation is

$$\sum_{x} l_x m_x \lambda^{-x} = 1 \tag{4}$$

Age-specific survival is expressed in discrete time as:

$$l_x = p_1 p_2 p_3 \cdots p_x \tag{5}$$

where $p_x$ is the probability of surviving the duration of the $x$th age class given that one has survived to the beginning of age class $x$. Now consider the evolutionary fate of a mutation which causes a small change in the ability to survive at some particular age $a$. The new mutation will be favored by natural selection if it causes an increase in $r$, or, what is equivalent in the discrete-time case, an increase in $\ln \lambda$. The effect of the perturbation is studied by examining the partial derivative of $\lambda$ with respect to $p_a$. Hamilton obtained a closed form of this derivative and was able to conclude the following:

(a) The force of selection, as indicated by the partial derivative, is highest at the youngest pre-reproductive ages, begins to decline when reproduction commences, and drops to zero when reproduction ceases.

(b) If a mutation causes a gain in survival at a particular age $a_1$ and an equal loss in survival at age $a_2$, then such a mutation will increase in the population only if $a_1 < a_2$.

(c) If a mutation causes a gain in fertility at a particular age $a_1$ and an equal loss of fertility at age $a_2$, then such a mutation will increase in the population only if $a_1 < a_2$.

(d) If a mutation causes a loss in survival at a particular age and an increase in fertility at that same age, then the limits of the loss in survival that can be tolerated are set by the inverse of the reproductive value. That is, if the reproductive value at age $x$ is large, then only a small reduction of survival can be exchanged for a gain in fertility, but if the remaining reproductive value is small, then a large reduction in survival can evolve. (For further explication of these results, see Roughgarden 1996, p. 363.)

Hamilton’s general conclusion is that ‘for organisms that reproduce repeatedly senescence is to be expected as an inevitable consequence of the working of natural selection’ (1966, p. 26). This is a view that is clearly consistent with Medawar (1952). For a technical discussion of the validity of the assumption that the Malthusian parameter is equivalent to fitness in age-structured populations, see Charlesworth (1980), whose models extend the early results of Haldane (1927) and Norton (1928).
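Conclusion (a) can be illustrated numerically. The sketch below assumes a hypothetical life table with ten age classes, constant survival $p_x = 0.9$, and reproduction in age classes 4 to 7. It solves Eqn. (4) for $\lambda$ by bisection and then evaluates the proportional sensitivity $\partial \ln\lambda / \partial \ln p_a$ in two ways: by finite differences, and from a closed form obtained here by implicit differentiation of Eqn. (4) under the indexing of Eqn. (5). That closed form is stated in the code comments as an assumption of this sketch; it is not quoted from Hamilton (1966), whose indexing conventions may differ.

```python
import numpy as np

def growth_rate(p, m):
    """Solve sum_x l_x m_x lambda^{-x} = 1 (Eqn. 4) for lambda by bisection."""
    ages = np.arange(1, len(p) + 1, dtype=float)
    l = np.cumprod(p)                       # Eqn. (5): l_x = p_1 p_2 ... p_x
    lo, hi = 0.01, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.sum(l * m * mid ** (-ages)) > 1.0:
            lo = mid                        # sum too large, so lambda must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical life table
p = np.full(10, 0.9)
m = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0], dtype=float)
lam = growth_rate(p, m)

# Closed form from implicit differentiation (assumption of this sketch):
# d ln(lambda) / d ln(p_a) = sum_{x >= a} l_x m_x lambda^{-x} / sum_x x l_x m_x lambda^{-x}
ages = np.arange(1, 11, dtype=float)
terms = np.cumprod(p) * m * lam ** (-ages)
closed = np.array([terms[a:].sum() for a in range(10)]) / np.sum(ages * terms)

# Finite-difference check: raise survival in one age class by a small proportion
eps = 1e-6
numeric = np.empty(10)
for a in range(10):
    q = p.copy()
    q[a] *= 1.0 + eps
    numeric[a] = (np.log(growth_rate(q, m)) - np.log(lam)) / np.log(1.0 + eps)

print(np.round(closed, 3))
```

In the printed sensitivities, the values are constant and maximal up to the onset of reproduction, decline across the reproductive age classes, and are zero once reproduction has ceased.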
4. Pleiotropy

Pleiotropy means that a single gene affects two or more characters. In the context of life history evolution, pleiotropy means that a single gene affects the fitness of the organism at two or more ages. It is convenient to categorize the combinations of age-specific pleiotropic effects as shown in Table 1. If a new mutation improves fitness in both young and old animals, then it is likely to be favored by natural selection, and will increase in the population. Conversely, a gene that decreases fitness in both young and old organisms will be eliminated by natural selection.

Table 1 Pleiotropic effects of mutations affecting life history

Effect on fitness in the young    Effect on fitness in the old    Evolutionary fate
+                                 +                               Increase
−                                 −                               Decrease
−                                 −                               Decrease
−                                 +                               Decrease
+                                 −                               Increase

The more interesting cases in Table 1 are those in which the fitness effects on young and old organisms are negatively correlated, a condition referred to as ‘negative pleiotropy’ or ‘antagonistic pleiotropy.’ Medawar’s principle suggests that mutations that improve early fitness at the expense of late fitness will be favored by natural selection, while those with the converse effects will be eliminated. The possibility that genes might increase fitness at one age and also decrease it at another was mentioned by early theorists, but the first strong advocate of this mechanism of the evolution of senescence was Williams (1957), who noted that natural selection will tend to maximize vigor in the young at the expense of vigor later in life. An example of negatively pleiotropic gene action of the sort that Williams proposed is shown in Table 2.

Table 2 Fitness parameters for a one-locus model of antagonistic pleiotropy

Genotype:            A1A1      A1A2      A2A2
Fitness in young:    High      Medium    Low
Fitness in old:      Low       Medium    High

Williams argued that, in the course of selecting for the allele A1, which is beneficial at young ages, the deleterious effects of that allele on the old are brought along; in this scenario, senescence evolves as an incidental consequence of adaptation at earlier ages. The exact mathematical conditions for the increase of antagonistic, pleiotropic mutations have been derived (Charlesworth 1980), verifying that such mutations can indeed increase in populations.

While the theoretical basis for antagonistic pleiotropy is sound and widely accepted, it is unclear whether there exists the special sort of genetic variation that this mechanism requires. While it is easy to imagine physiological situations in which there could occur trade-offs between the fitness of the young and the old, there are few, if any, actual cases of such variation identified (Finch 1990, p. 37), even though a half century has passed since Williams’ proposal.
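The selective advantage of an early benefit paid for by an equal late cost can be illustrated with a discrete life table, taking the growth rate $\lambda$ as the measure of fitness. The sketch below assumes a hypothetical baseline (ten age classes, survival 0.9 per class, one offspring per age class from 4 to 7) and a hypothetical pleiotropic effect that moves 0.1 units of fertility between age classes 4 and 7. The values are illustrative only.

```python
import numpy as np

def growth_rate(p, m):
    """Solve sum_x l_x m_x lambda^{-x} = 1 for lambda by bisection."""
    ages = np.arange(1, len(p) + 1, dtype=float)
    l = np.cumprod(p)                       # l_x = p_1 p_2 ... p_x
    lo, hi = 0.01, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.sum(l * m * mid ** (-ages)) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical baseline life table
p = np.full(10, 0.9)
m_base = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0], dtype=float)

# Hypothetical pleiotropic effect: +0.1 fertility at age class 4, -0.1 at age class 7
shift = np.zeros(10)
shift[3], shift[6] = 0.1, -0.1

lam_base = growth_rate(p, m_base)
lam_early = growth_rate(p, m_base + shift)   # early gain, late loss
lam_late = growth_rate(p, m_base - shift)    # early loss, late gain
print(lam_early, lam_base, lam_late)
```

In this growing population the variant with the early gain has the highest growth rate and the variant with the late gain has the lowest, in line with the last two rows of Table 1.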
Negative correlations between life history characters are sometimes construed as evidence for pleiotropy, but this interpretation overlooks the fact that phenotypic correlations arise from factors other than pleiotropy, including the correlation of environmental factors and the correlation of alleles at genetically linked loci (linkage disequilibrium). What is required for the antagonistic pleiotropy model is not just evidence of trade-offs in life history traits, which is abundant, but a demonstration that there exist tradeoffs in life history characters that are mediated by alternative alleles at specific polymorphic loci. Until such genes are characterized and shown to play a role 13899
Senescence: Genetic Theories of in life history evolution, the antagonistic pleiotropy model will remain an interesting theoretical construct, but one of unknown, and possibly negligible, biological significance.
5. Mutation Accumulation
Germ-line mutations, which are changes in the DNA sequence in sperm and egg cells, occur at low but nonzero rates, largely as a result of proof-reading errors in the enzymes that replicate DNA. This slow, steady input of genetic variants has the potential to corrupt the gene pool, since almost all of the novel variants that have some effect are deleterious. However, natural selection works against the corrupting effect by removing carriers of deleterious mutations. A balance is reached between the steady input of deleterious genes through mutation and their removal by natural selection. One of the characteristics of the equilibrium balance state is that, for any particular gene, the deleterious alleles are present at low frequencies, usually much less than 1 percent. The low rate of occurrence of each of many hereditary human diseases is thought to reflect the mutation-selection balance operating at many genes, each of which is capable of being mutated to produce a deleterious condition (Hartl and Clark 1997).
The classical mutation-selection balance model is appropriate for mutations that have deleterious effects early in life, but what happens when the disability is expressed only late in life? Medawar (1952) suggested that natural selection will be unable to counteract the feeble pressure of repetitive mutation if the mutant genes make their effects known at advanced ages, either post-reproductively or at ages not attained by most of the members of the species. This follows naturally from his proposal that the force of selection declines with increasing age. Under such conditions, the deleterious mutations would gradually accumulate, unchecked by natural selection. In this view, senescence is a process driven entirely by mutation. This mechanism for the evolution of senescence is distinct from, but not mutually exclusive of, antagonistic pleiotropy. While the pleiotropy process suggests that senescence is the incidental consequence of adaptation, the mutation accumulation model invokes deterioration without adaptation (Partridge and Barton 1993).
Charlesworth (1980) has analyzed a deterministic model of an age-structured population with recurrent mutation. He derived an approximation for the frequency of heterozygous carriers of deleterious alleles and found that the equilibrium frequency is inversely proportional to the selection intensity. The significance of this result is that when there is only very weak selection pressure, as at advanced ages, then mutant alleles can attain high frequencies under the influence of recurrent mutation. This result verifies the earlier conjectures of Medawar and Williams. In contrast to the situation with antagonistic pleiotropy, there is experimental evidence for the kinds of genetic variation that the model requires, namely spontaneous mutations with age-specific effects on vital rates (Mueller 1987, Pletcher et al. 1998, 1999).
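The inverse relation between equilibrium frequency and selection intensity can be illustrated with a much simpler model than Charlesworth's. The sketch below assumes a haploid population without age structure, with a hypothetical mutation rate `u` and selection coefficient `s`; the function name and all values are illustrative and do not come from the article. In this toy recursion the mutant frequency settles at u/s, so weakening selection tenfold raises the equilibrium frequency tenfold, which is the qualitative pattern expected for genes expressed at advanced ages.

```python
def equilibrium_frequency(u, s, generations=50_000):
    """Iterate selection, then mutation, until the mutant frequency settles.

    Toy haploid model (an assumption, not the age-structured model in the text):
    u is the per-generation mutation rate to the deleterious allele,
    s is the selection coefficient against carriers.
    """
    q = 0.0  # start with no mutant alleles
    for _ in range(generations):
        # Selection: carriers have relative fitness 1 - s.
        q_after_selection = q * (1.0 - s) / (1.0 - s * q)
        # Mutation: a fraction u of the remaining normal alleles mutate.
        q = q_after_selection + u * (1.0 - q_after_selection)
    return q


if __name__ == "__main__":
    u = 1e-5  # hypothetical mutation rate
    for s in (0.1, 0.01, 0.001):  # progressively weaker selection
        q = equilibrium_frequency(u, s)
        print(f"s = {s}: simulated q = {q:.6f}, u/s = {u / s:.6f}")
```

In an age-structured setting the role of `s` would be played by a selection intensity that shrinks with the age at which the mutation is expressed; that refinement is not modeled here.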
6. Postponement of Deleterious Effects
Medawar (1952) also considered the case of Huntington’s chorea, a grave and ultimately fatal nervous disorder that usually manifests itself in middle-aged patients. He suggested that there could be selection in favor of genetic modifiers which have as their main effect the postponement of the effects of the Huntington’s gene or other genes causing hereditary disorders. This suggestion, which very much resembles an earlier proposal of R. A. Fisher concerning the evolution of dominance, is unlikely to be correct. While it makes sense that there would be some selection in favor of delaying the mutant effect, Charlesworth (1980, p. 219) has shown that the selection pressure exerted on such a hypothetical modifier gene would be exceedingly small, on the order of the mutation rate. This is because the modifier has an effect on fitness only when it co-occurs with the Huntington’s or other disease gene, which is at mutation-selection balance and present in only a small fraction of the population. Under such conditions the evolutionary fate of the modifier is likely to be determined by genetic drift or other stochastic factors rather than the minuscule selective pressure.
7. The Variation Problem
While the primary concern of theorists has been to explain the degeneration of vitality associated with old age, there is a secondary problem that can also be addressed with these models. Genetic variation is the raw material upon which natural selection operates to produce adaptations and new species. The mechanisms by which variation is maintained in populations are therefore of considerable interest to evolutionary geneticists. To what extent do genetic models of senescence tend to maintain variation in life histories within populations? Several authors have addressed the question and come to two different answers depending upon the theoretical construct employed.
Curtsinger et al. (1994) analyzed deterministic one- and two-locus models of antagonistic pleiotropy and asked under what conditions polymorphisms would be maintained. The conditions for stable polymorphism were found to be rather restrictive, especially with weak selection. The conditions were also found to be very sensitive to dominance parameters; in particular, reversal of dominance properties with respect to the two traits is often required for polymorphism, but seems improbable on biochemical grounds.
Tuljapurkar (1997) gives an overview of modeling strategies and describes some of his own models in which mortality is assumed to depend on both organismal age and random variables in the environment. In these models, the relative fitnesses are measured by a stochastic growth rate, which reflects average vital rates and environmental variability. Results from several related models show that phenotypic combinations that differ in age-specific fertility can be equally fit in a range of stochastic environments. The paper concludes that polymorphisms for length of reproductive life can be readily maintained by selection in temporally varying environments.
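Why a fitness measure must reflect environmental variability as well as average vital rates can be illustrated with a deliberately simplified calculation. The sketch below ignores age structure and assumes that each phenotype is described only by a sequence of per-period growth factors (all values hypothetical); the long-run growth rate is then the mean of the log growth factors. This is not Tuljapurkar's model, only an illustration of how two different life histories can end up equally fit.

```python
import math


def stochastic_growth_rate(growth_factors):
    """Long-run growth rate: mean log of the per-period growth factors."""
    return sum(math.log(g) for g in growth_factors) / len(growth_factors)


# Hypothetical phenotype A: the same modest growth in good and bad periods.
steady = [1.1, 1.1]
# Hypothetical phenotype B: booms in good periods, only replaces itself in bad ones.
variable = [1.21, 1.0]

if __name__ == "__main__":
    print("steady  :", stochastic_growth_rate(steady))
    print("variable:", stochastic_growth_rate(variable))
```

The variable phenotype has the higher arithmetic mean growth factor, yet its long-run growth rate matches that of the steady phenotype, because variability lowers the mean of the logarithms.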
8. Future Directions
Two important challenges to the genetical theories of senescence arise from recent experimental work. The first challenge concerns mortality rates at advanced ages. Observations of survival in experimental organisms are usually presented in terms of age-specific survivorship, as defined in Sect. 2, but if sample sizes are sufficiently large then the survival data can also be analyzed in terms of hazard functions, which define the instantaneous risk of death as a function of age. Unlike survivorship, the hazard function can be nonmonotonic. Many experimental studies of moderate sample size have documented that the hazard increases approximately exponentially with age, a dynamic generally referred to as the Gompertz law (Finch 1990).
Recent experiments have been done on an unusually large scale, making it possible to estimate hazards at very advanced ages. For Drosophila, nematode worms, and medflies, hazard functions increase exponentially in the early part of the life history, as expected, but at the most advanced ages the hazard functions decelerate, bending over and producing unexpected ‘mortality plateaus’ (see Vaupel et al. 1998, for a review of the experimental evidence and data on human populations). The existence of mortality plateaus at advanced, post-reproductive ages poses a challenge for mutation accumulation models, which predict, under a wide range of assumptions, a ‘wall’ of high mortality at the age when reproduction is completed. A preliminary attempt to accommodate mortality plateaus into antagonistic pleiotropy models has failed (Pletcher and Curtsinger 1998, Wachter 1999). One possible solution is that the mortality plateaus are caused by population heterogeneity of both genetic and nongenetic origin (Pletcher and Curtsinger 2000, Service 2000). It has so far proved very difficult to measure the relevant heterogeneity and determine whether it is of sufficient magnitude to produce the plateaus.
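How heterogeneity alone could bend a Gompertz curve into a plateau can be sketched with the standard gamma-frailty model from demography, which is an assumption introduced here and not a model described in the article. Each individual's hazard is taken to be a fixed multiple (its frailty, gamma-distributed with mean 1 and variance `frailty_variance`) of a Gompertz baseline `a * exp(b * age)`; all parameter values are hypothetical. Because the frailest individuals die first, the hazard observed among survivors levels off near `b / frailty_variance`, even though every individual's own hazard keeps rising exponentially.

```python
import math


def gompertz_hazard(age, a, b):
    """Baseline (individual-level) Gompertz hazard a * exp(b * age)."""
    return a * math.exp(b * age)


def population_hazard(age, a, b, frailty_variance):
    """Hazard among survivors when individual frailty is gamma-distributed.

    Assumes frailty has mean 1 and the given variance (standard gamma-frailty
    result); this model is an illustration, not taken from the article.
    """
    # Cumulative baseline hazard from age 0 up to the given age.
    cumulative_baseline = (a / b) * (math.exp(b * age) - 1.0)
    return gompertz_hazard(age, a, b) / (1.0 + frailty_variance * cumulative_baseline)


if __name__ == "__main__":
    a, b, var = 0.001, 0.1, 0.5  # hypothetical parameters, age in days
    for age in (0, 20, 40, 60, 80, 100, 120):
        print(age,
              round(gompertz_hazard(age, a, b), 4),
              round(population_hazard(age, a, b, var), 4))
    print("plateau level b / variance =", b / var)
```

Whether real populations contain enough heterogeneity for this mechanism to account for observed plateaus is exactly the empirical question raised above; the sketch shows only that the mechanism is mathematically possible.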
A second possibility that can explain some features of the plateaus is a model of positive pleiotropy, in which late-life mortality rates are kept from inflating by the positively correlated effects of alleles selected for early survival (Pletcher and Curtsinger 1998). Models of positive pleiotropy merit further investigation.
The second experimental challenge to current theory concerns genetic variance. The mutation accumulation model predicts that genetic variance for mortality should increase at advanced ages. Recent experiments document instead a decline of genetic variance at advanced ages in experimental populations of Drosophila (Promislow et al. 1996). Hughes and Charlesworth (1994) initially reported that genetic variance for mortality increases with age in their Drosophila populations, but a re-analysis of the data shows close concordance with the Promislow result (Shaw et al. 1999). At present no one knows why genetic variance declines at advanced ages; it could be related to the mortality plateaus described above.
Three other lines of research appear to hold promise. Group selection is a process by which collections of organisms succeed or fail as a collective, either becoming extinct or spawning new groups in competition with other groups. Group selection arguments are sometimes invoked to explain altruistic behaviors, but evolutionary biologists typically disdain group selection, because the process tends to be much weaker than selection between individuals, and also seems to work only in very small populations (Williams 1966). However, group selection might play a role in shaping post-reproductive mortality rates, when individual-level selection is essentially inoperative. This type of model could be particularly relevant to human evolution, and is essentially unexplored. A related type of model that needs further development involves the evolution of vital rates in combination with kin selection, taking into account the effects of post-reproductive survival, parental care, and fitness effects mediated through relatives in the kin group (Roach 1992).
Finally, as noted by Tuljapurkar (1997), the theoretical methods are limited to small perturbations and local analyses, under which conditions the population is always close to demographic equilibrium. There does not exist at present a theory that can accommodate large mutational changes in vital rates in combination with non-equilibrium demographics.
See also: Aging and Health in Old Age; Aging, Theories of; Brain Aging (Normal): Behavioral, Cognitive, and Personality Consequences; Differential Aging; Lifespan Development, Theory of; Old Age and Centenarians; Spatial Memory Loss of Normal Aging: Animal Models and Neural Mechanisms
Bibliography
Charlesworth B 1980 Evolution in Age-structured Populations. Cambridge University Press, Cambridge, UK
Curtsinger J W, Service P, Prout T 1994 Antagonistic pleiotropy, reversal of dominance, and genetic polymorphism. American Naturalist 144: 210–28
Finch C E 1990 Longevity, Senescence, and the Genome. University of Chicago Press, Chicago
Fisher R A 1930 The Genetical Theory of Natural Selection. Clarendon Press, Oxford, UK
Haldane J B S 1927 A mathematical theory of natural and artificial selection, Part IV. Mathematical Proceedings of the Cambridge Philosophical Society 32: 607–15
Hamilton W D 1966 The moulding of senescence by natural selection. Journal of Theoretical Biology 12: 12–45
Hartl D L, Clark A G 1997 Principles of Population Genetics, 3rd edn. Sinauer Associates Inc., Sunderland, MA
Hughes K A, Charlesworth B 1994 A genetic analysis of senescence in Drosophila. Nature 367: 64–6
Medawar P B 1952 An Unsolved Problem of Biology. H K Lewis, London
Mueller L D 1987 Evolution of accelerated senescence in laboratory populations of Drosophila. Proceedings of the National Academy of Sciences of the USA 84: 1974–77
Norton H T J 1928 Natural selection and Mendelian variation. Proceedings of the London Mathematical Society 28: 1–45
Partridge L, Barton N H 1993 Optimality, mutation, and the evolution of ageing. Nature 362: 305–11
Pletcher S D, Curtsinger J W 1998 Mortality plateaus and the evolution of senescence: Why are old-age mortality rates so low? Evolution 52: 454–64
Pletcher S D, Curtsinger J W 2000 The influence of environmentally induced heterogeneity on age-specific genetic variance for mortality rates. Genetical Research Cambridge 75: 321–9
Pletcher S D, Houle D, Curtsinger J W 1998 Age-specific properties of spontaneous mutations affecting mortality in Drosophila melanogaster. Genetics 148: 287–303
Pletcher S D, Houle D, Curtsinger J W 1999 The evolution of age-specific mortality rates in Drosophila melanogaster: Genetic divergence among unselected lines. Genetics 153: 813–23
Promislow D E L, Tatar M, Khazaeli A A, Curtsinger J W 1996 Age-specific patterns of genetic variance in Drosophila melanogaster. I. Mortality. Genetics 143: 839–48
Roach D A 1992 Parental care and the allocation of resources across generations. Evolutionary Ecology 6: 187–97
Roughgarden J 1996 Theory of Population Genetics and Evolutionary Ecology. An Introduction. Prentice Hall, Upper Saddle River, NJ
Service P M 2000 Heterogeneity in individual mortality risk and its importance for evolutionary studies of senescence. American Naturalist 156: 1–13
Shaw F, Promislow D E L, Tatar M, Hughes L, Geyer C 1999 Towards reconciling inferences concerning genetic variation in senescence in Drosophila melanogaster. Genetics 152: 553–66
Tuljapurkar S 1997 The evolution of senescence. In: Wachter K W, Finch C E (eds.) Between Zeus and the Salmon. National Academy Press, Washington, DC, pp. 65–77
Vaupel J W, Carey J R, Christensen K, Johnson T E, Yashin A I, Holm N V, Iachine I A, Kannisto V, Khazaeli A A, Liedo P, Longo V D, Zeng Y, Manton K G, Curtsinger J W 1998 Biodemographic trajectories of longevity. Science 280: 855–60
Wachter K W 1999 Evolutionary demographic models for mortality plateaus. Proceedings of the National Academy of Sciences of the USA 96: 10544–7
Williams G C 1957 Pleiotropy, natural selection, and the evolution of senescence. Evolution 11: 398–411
Williams G C 1966 Adaptation and Natural Selection, A Critique of Some Current Evolutionary Thought. Princeton University Press, Princeton, NJ
J. W. Curtsinger
Sensation and Perception: Direct Scaling
One aim of psychophysics is to measure the subjective intensity of sensation. We easily hear that a tone of given intensity has a particular loudness, and that a tone of higher intensity sounds louder still. The measurement problem lies in the question ‘How much louder?’ Direct scaling is a particular way of answering that question: the observer is asked to assign numbers corresponding to the subjective magnitudes of given stimulus intensities, thus providing the capability of saying that, for example, one tone sounds twice or ten times as loud as another.
1. History
1.1 Fechner’s Solution
G. Fechner first proposed a solution, albeit an indirect one, in 1860 (Boring 1950). He accepted Weber’s law, which states that the difference threshold (the physical size of a difference in intensity needed to tell one signal from another) grows in proportion to signal intensity; the constant of proportionality, the Weber fraction, is unique to each sensory continuum. Fechner assumed that all difference thresholds are subjectively constant and therefore can serve as the unit of measurement for scales of sensation. He concluded that subjective magnitude grows by constant differences as stimulus magnitude grows by constant ratios, a logarithmic relation now known as Fechner’s law.
1.2 Category Rating and Magnitude Scaling
In Fechner’s laboratory, there also evolved a direct approach to measuring sensation magnitude. This was the method of absolute judgment (later called category rating), in which the subject assigns stimuli to categories according to some subjective aspect. Originally designed to study esthetic judgment, it was adapted by 1920 to studying sensation magnitude; early findings using this method supported Fechner’s law. In the 1950s, S. S. Stevens refined another and related direct scaling procedure in which the subject estimates the apparent magnitude of a signal by assigning to it a numerical value (hence, magnitude estimation). Data from this method were better fitted by a power law.
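Fechner's construction in Sect. 1.1 can be made concrete with a short calculation. The sketch below assumes a hypothetical Weber fraction of 0.1 and an absolute threshold of 1 (arbitrary units); the function names are illustrative. It counts difference thresholds by stepping upward from threshold, each step multiplying intensity by (1 + Weber fraction), and compares the count with the logarithmic formula that the stepping procedure implies.

```python
import math


def jnd_steps(intensity, threshold, weber_fraction):
    """Count whole difference-threshold steps from threshold up to intensity."""
    steps, level = 0, threshold
    # Under Weber's law each just-noticeable step multiplies intensity
    # by the same factor (1 + weber_fraction).
    while level * (1 + weber_fraction) <= intensity:
        level *= 1 + weber_fraction
        steps += 1
    return steps


def fechner_magnitude(intensity, threshold, weber_fraction):
    """Continuous version of the step count: a logarithm of intensity."""
    return math.log(intensity / threshold) / math.log(1 + weber_fraction)


if __name__ == "__main__":
    k = 0.1  # hypothetical Weber fraction
    for intensity in (10, 100, 1000):
        print(intensity,
              jnd_steps(intensity, 1, k),
              round(fechner_magnitude(intensity, 1, k), 2))
```

Each tenfold increase in intensity adds the same number of steps, which is the sense in which constant stimulus ratios yield constant subjective differences.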
2. Methods
All the direct scaling methods require an observer to judge a series of stimuli varying along (at least) one dimension. For instance, the experimenter might present a pure tone at several sound pressure levels covering a 1000:1 range and request judgments of loudness, or present a light disk at several luminances covering a 10:1 range and request judgments of brightness.
2.1 Category Rating
The observer is instructed to assign each stimulus to a category, which may be designated descriptively (very soft, soft, … very loud) or numerically (1, 2, … , 7); the assignments should create subjectively equal category intervals. The allowable response values, including the extreme values, are specified by the experimenter. Usually the number of categories ranges from 5 to 11, although smaller and larger values are sometimes used. Typically, the stimulus range does not exceed 10:1, although it may do so.
2.2 Magnitude Estimation and Production
In magnitude estimation, the observer is instructed to assign to each stimulus a (positive) number so that the number reflects its apparent intensity. There are no limits on allowable response values. Originally, one stimulus value was designated as standard and a numerical value assigned to it as modulus. Later, however, the use of designated standard and assigned modulus was abandoned and the observer instructed to choose any numbers appropriate to represent apparent magnitudes. Typically the stimulus range is at least 100:1 and is often greater. In magnitude production, stimulus and judgmental continua are interchanged; the experimenter presents a set of numbers, and the observer produces values on some intensive continuum to effect a subjective match for each presented number.
2.3 Cross-modal Matching
The use of numbers, which has provoked many objections, can be avoided by instructing the observer to match intensities on one physical continuum directly to intensities on another. For example, the observer may produce luminances to match sound pressure levels so that brightness of a light disk is equivalent to loudness of a tone, or the observer may produce handgrip forces to match odor intensities so that perceived effort is equivalent to strength of smell.
3. Achievements of Direct Scaling
3.1 A Quantitative Phenomenology
These methods made possible a quantitative phenomenology of the various sensory systems. For example, in hearing, the importance of salient variables, such as frequency and intensity, to the ability of the observer to detect a sound or to tell the difference between two sounds, had long been known. Furthermore, by having the observer match tones of varying frequency and intensity to a fixed reference tone, equal loudness contours were established. What remained unknown were the loudness relations among these contours. Magnitude estimation, by providing a numerical reference scale, permits the observation that one tone sounds twice, or ten times, or half as loud as another. Thus, magnitude estimation has been used to map the effects of variables known to influence perception for loudness, brightness, tastes and smells, and tactile sensations, as well as internal states such as perceived effort, fatigue, and satiety (Stevens 1975). It has also been used to study continua without a physical measure (e.g., seriousness of crime, severity of punishment, and their relationship) (Sellin and Wolfgang 1964). Category rating has been employed to study the cognitive integration of perceptual dimensions (Anderson 1981).
3.2 Intermodal Comparisons
Direct scaling also provides the capability to compare and contrast phenomena that occur in several modalities, such as temporal and spatial summation, sensory adaptation and recovery, and spatial inhibition (brightness contrast and auditory masking), at levels above absolute threshold (Marks 1974). For example, a persisting olfactory stimulus of constant intensity produces a smell intensity that diminishes in strength over time. That diminution can be assessed by asking an observer to assign numbers to the apparent magnitude of the smell at different elapsed times, e.g., after 5 s, 10 s, 30 s, 60 s, … ; in this way, the course of sensory adaptation can be traced for this stimulus and for others of differing initial intensities. When the procedure is repeated in a different modality, the parameters of the adaptation curves can be compared.
3.3 Stevens’ Power Law
The third major achievement of direct scaling is the discovery that, to at least a first approximation, equal stimulus ratios produce equal judgmental ratios. This nearly universal relation is called Stevens’ power law, after S. S. Stevens who established it: subjective magnitude is proportional to intensity raised to some power. Further, that exponent takes distinctive values for each stimulus continuum, ranging from 0.3 for luminance to 2.0 or more for electric shock. Later experiments showed that the magnitude exponents predicted cross-modal matching exponents: if continua A and B had magnitude exponents a and b, and if a cross-modal match of B to A was obtained, the resulting exponent was a/b. Indeed, the cross-modal exponents for a variety of continua are connected in a transitive network (Daning 1983), a finding with important theoretical implications.
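The power law and the cross-modal prediction can be checked with a few lines of arithmetic. The sketch below is a minimal illustration on noise-free synthetic data: it estimates an exponent as the slope of a least-squares line in log-log coordinates, then generates cross-modal matches from two magnitude exponents and recovers the predicted slope a/b. The exponent 0.3 for luminance comes from the text; the value 0.6 for the other continuum, the unit scale constants, and the function names are assumptions made for the example.

```python
import math


def fit_exponent(intensities, responses):
    """Slope of the least-squares line through (log intensity, log response)."""
    lx = [math.log(x) for x in intensities]
    ly = [math.log(y) for y in responses]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    sxx = sum((x - mx) ** 2 for x in lx)
    return sxy / sxx


def cross_modal_match(intensity_a, exp_a, exp_b):
    """Intensity on continuum B judged equal to intensity_a on continuum A.

    Assumes both continua follow power laws with unit scale constants, so
    equal subjective magnitude means intensity_a**exp_a == intensity_b**exp_b.
    """
    return intensity_a ** (exp_a / exp_b)


if __name__ == "__main__":
    intensities = [1.0, 10.0, 100.0, 1000.0]
    a, b = 0.6, 0.3  # a is hypothetical; b is the luminance exponent from the text
    estimates = [x ** b for x in intensities]  # idealized magnitude estimates
    print("recovered exponent:", fit_exponent(intensities, estimates))
    matches = [cross_modal_match(x, a, b) for x in intensities]
    print("cross-modal exponent:", fit_exponent(intensities, matches), "predicted:", a / b)
```

Real judgments are noisy and, as Sect. 4.2 describes, the fitted exponent depends on the procedure, so a recovered slope from actual data would not match its nominal value this cleanly.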
4. Problems of Direct Scaling
4.1 Use of Numbers as Responses
Critics questioned treating the numbers that subjects assigned to stimulus magnitudes as if they were measurements. However, the practice was validated by the discovery that the results obtained with number as the matching continuum are in agreement with the results obtained with cross-modal matching (Stevens 1975). Furthermore, in cross-modal matching, some other stimulus continuum can substitute for number, since the choice of a reference continuum is, for most purposes, arbitrary. Whether the numbers used by an observer constitute a direct measure of sensation is an unresolved, and perhaps unresolvable, question (Teghtsoonian 1974).
4.2 The Psychophysical Regression Effect
Since magnitude estimation (assigning numbers to physical intensities) and magnitude production (assigning physical intensities to match numbers) are inverses of each other, they should produce the same exponent. However, this is not the case. The size of the difference in exponents may be quite large, with the result that a precise value of the exponent for a given stimulus continuum cannot be specified; no satisfactory combination rule has been agreed upon. The size, and indeed the direction, of the difference depends on the range of physical intensities (or numbers) presented; for practical purposes, the smallest effect is exhibited when the range is large. Some portion, at least, of this regression effect depends on sequential effects, the influence exerted by previous stimuli and judgments on subsequent judgments.
4.3 Category vs. Ratio Scales
Although both category rating and magnitude estimation purport to give scales of sensation magnitude, they do not, in general, agree with each other: both scales can be characterized as power functions, but the exponents are not the same. Much discussion has centred on whether subjects can make two kinds of judgments or whether the experimenter transforms a single kind of magnitude judgment by treating it as ratio or interval. However, an experimental determination of the sources of variance in the type of scale produced shows that instructions to judge either ratios or intervals account for almost none of the variance; much more important are such methodological variables as whether judgmental end-points are assigned or free and whether the range of stimuli is small or large. With free end-points and a large range, category judgment and magnitude estimation produce the same results; the obtained power functions have stable exponents characteristic of magnitude estimation (Montgomery 1975).
4.4 Local vs. Global Psychophysics
A long-standing conundrum in psychophysics has been the relation between the local (thresholds, both absolute and difference) and the global (measurements of subjective magnitude at levels above absolute threshold) (Luce and Krumhansl 1988). Fechner believed he had combined the two by using difference thresholds as subjective units to yield a logarithmic scale of sensation. Stevens (1961) proposed, while honoring Fechner, to repeal his law and to substitute a power scale of sensation; he believed that one could not derive measures of sensation magnitude from threshold determinations. An early attempt to reconcile these two positions was R. Teghtsoonian’s argument (1971) that both difference thresholds and power law exponents are indices of dynamic range. He proposed that there is a common scale of sensory magnitude for all perceptual continua and that the observer’s dynamic range for each continuum maps completely onto that common scale. For example, the least sound intensity experienced has the same subjective magnitude as the least luminance, and the greatest sound intensity to which the auditory system responds has the same subjective magnitude as the greatest luminance. Thus the mapping of widely divergent dynamic ranges for the several perceptual continua onto a single subjective magnitude range determines their power law exponents, and the mapping of widely divergent difference thresholds onto a single subjective difference threshold determines their Weber fractions.
See also: Fechnerian Psychophysics; Memory Psychophysics; Psychophysical Theory and Laws, History of; Psychophysics; Scaling: Correspondence Analysis; Stevens, Stanley Smith (1906–73).
Bibliography
Anderson N H 1981 Foundations of Information Integration Theory. Academic Press, New York
Boring E G 1950 A History of Experimental Psychology, 2nd edn. Appleton-Century-Crofts, New York
Daning R 1983 Intraindividual consistencies in cross-modal matching across several continua. Perception and Psychophysics 33: 516–22
Luce R D, Krumhansl C L 1988 Measurement, scaling, and psychophysics. In: Atkinson R C, Herrnstein R J, Lindzey G, Luce R D (eds.) Stevens’ Handbook of Experimental Psychology, 2nd edn. Vol. 1: Perception and Motivation. Wiley, New York
Marks L E 1974 Sensory Processes: The New Psychophysics. Academic Press, New York and London
Montgomery H 1975 Direct estimation: Effect of methodological factors on scale type. Scandinavian Journal of Psychology 16: 19–29
Sellin J T, Wolfgang M E 1964 The Measurement of Delinquency. Wiley, New York
Stevens S S 1961 To honor Fechner and repeal his law. Science 133: 80–6
Stevens S S 1975 Psychophysics: Introduction to its Perceptual, Neural, and Social Prospects. Wiley, New York
Teghtsoonian R 1971 On the exponents in Stevens’s law and the constant in Ekman’s law. Psychological Review 78: 71–80
Teghtsoonian R 1974 On facts and theories in psychophysics: Does Ekman’s law exist? In: Moskowitz H R, Scharf B, Stevens J C (eds.) Sensation and Measurement: Papers in Honor of S. S. Stevens. Reidel, Dordrecht, The Netherlands
M. Teghtsoonian
Sensation Seeking: Behavioral Expressions and Biosocial Bases
Sensation seeking is a personality trait defined as the tendency to seek varied, novel, complex, and intense sensations and experiences and the willingness to take risks for the sake of such experience (Zuckerman 1979, 1994). The standard measure of the trait is the Sensation Seeking Scale (SSS). In the most widely used version (form V) it consists of four subscales: (a) Thrill and Adventure Seeking (through risky and unusual sports or other activities); (b) Experience Seeking (through the mind and senses, travel, and an unconventional lifestyle); (c) Disinhibition (through social and sexual stimulation, lively parties, and social drinking); (d) Boredom Susceptibility (aversion to lack of change and variety in experience and people). Form V uses a total score based on the sum of the four subscales. More recently a single scale called ‘Impulsive Sensation Seeking’ has been developed, which combines items measuring the tendency to act impulsively without planning ahead with adaptations of items from the SSS that assess the general need for excitement without mention of specific interests or activities (Zuckerman 1993).
Similar constructs and measures have been developed by other researchers, including change seeking, stimulus variation seeking, excitement seeking, arousal seeking, novelty seeking, and venturesomeness. Most of these scales correlate very highly with the SSS and usually have the same kind of behavioral correlates. Novelty seeking, a construct and scale devised by Cloninger (1987), not only correlates highly with Impulsive Sensation Seeking, but is based on a similar kind of biobehavioral model. Individuals high on novelty seeking are described as impulsive, exploratory, fickle, and excitable. They are easily attracted to new interests and activities, but are easily distracted and bored. Those who are low on this trait are described as reflective, rigid, loyal, stoic, frugal, orderly, and persistent. They are reluctant to initiate new activities, are preoccupied with details, and think long and carefully before making decisions.
1. Theoretical and Research Origins of the Construct
The first version of the SSS was an attempt to provide an operational measure of the construct ‘optimal levels of stimulation and arousal’ (the level at which one feels and functions best). The author was conducting experiments on sensory deprivation (SD) in which participants volunteered to spend some length of time (from 1 to 24 hours in the author’s experiments and up to two weeks in those of other investigators) in a dark soundproof room. It was theorized that persons with high optimal levels of stimulation would be most deprived by the absence of stimulation. The experiments showed that high sensation seekers became more restless over the course of an eight-hour experiment but did not experience more anxiety than low sensation seekers (as defined by the first version of the SSS). The construct changed as research using the SSS was extended to the broader domain of life experience.
2. Behavioral Expressions
Contrary to what we expected, the volunteers for the SD experiments were more commonly high sensation seekers than lows. This was because they had heard that people had weird experiences in SD, like hallucinations. High sensation seekers volunteered for any kind of unusual experiment, like hypnosis or drug studies. They did not volunteer for more ordinary experiments. The concept of sensation seeking therefore came to center on the need for novel stimulation or inner experiences rather than on stimulation per se.
Surveys of drug use and preferences showed that high sensation seekers tended to be polydrug users of both stimulant and depressant drugs whereas lows did not use any drugs (Segal et al. 1980). It was the novel experience of drugs, not their effect on arousal, that attracted high sensation seekers. Of course hallucinatory drugs had a particular attraction for those who scored high on this trait. They were not deterred by the legal, social, or physical risks entailed in drug use. Sensation seeking in preadolescents predicts their later alcohol and drug use and abuse.
Similarly, sensation seekers proved to be attracted to risky sports that provided intense or novel sensations and experiences, like mountain climbing, parachuting, scuba diving, and car racing. They were not found in ordinary field sports or among compulsive exercisers. Outside of sports the high sensation seekers were found to drive their cars faster and more recklessly than lower sensation seekers. High sensation seekers are attracted to exciting vocations such as firefighting, emergency room work, flying, air-traffic control, and dangerous military assignments. When they are stuck in monotonous desk jobs they report more job dissatisfaction than low sensation seekers.
In interpersonal relationships high sensation seekers tend to value the fun and games aspects rather than intimacy and commitment (Richardson et al. 1988). They tend to have more premarital sexual experience with more partners and engage in ‘risky sex.’ There is a high degree of assortative mating based on sensation seeking, i.e., highs tend to marry highs and lows tend to marry lows. Couples coming for marital therapy tend to have discrepant scores on the trait. Divorced persons are higher on the trait than monogamously marrieds.
Not all expressions of sensation seeking are risky. High sensation seekers like designs and works of art that are complex and emotionally evocative (expressionist). They like explicit sex and horror films and intense rock music. When watching television they tend to frequently switch channels (‘channel surfing’). High sensation seekers enjoy sexual and nonsense types of humor (Ruch 1988). Low sensation seekers prefer realistic pastoral art pictures, media forms like situation comedies, and quiet background music.
3. Psychopathology
Sensation seeking is an essentially normal trait and most of those who are very high or low on the trait are free from psychopathology. However, persons with certain kinds of disorders involving a lack of impulse control tend to be high sensation seekers. Sensation seeking is a primary motive for those with antisocial personality disorder; their antisocial behavior involves great risks where the only motive is sometimes the increase of excitement. Other disorders with a high proportion of sensation seekers include conduct disorders, borderline personality disorders, alcohol and drug abuse, and bipolar (manic-depressive) disorders. It has been discovered that there are genetic and biological trait links between many of these disorders and sensation seeking. For instance, as yet unaffected children of bipolars tend to be high sensation seekers and also show similar differences on an enzyme to be discussed.
4. Genetics
Studies of twins raised in intact families show a relatively high degree of heritability (60 percent), compared to other personality traits that are typically in the range 30–50 percent. Analyses show no effects for the shared family environment; the environment that is important is that outside of the family, which affects each twin differently. A study of twins separated shortly after birth and adopted into different families confirms these results (Hur and Bouchard 1997). A specific gene has been found to be related to novelty seeking (Ebstein et al. 1996). The gene produces one class of receptors for the neurotransmitter dopamine. One of the two major forms of the gene is found more often in high sensation seekers. This form of the gene has also been found in high percentages of heroin abusers, pathological gamblers, and children with attention deficit hyperactivity disorder. The neurotransmitter dopamine as well as the other two monoamines in the brain, norepinephrine and serotonin, are theorized to underlie the three behavioral mechanisms involved in sensation seeking: strong approach, and weak arousal and inhibition.
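As a rough illustration of how a heritability figure of this kind can be obtained from twin data, the sketch below applies Falconer's classical approximation to twin correlations. The two correlations (0.60 for identical twins, 0.30 for fraternal twins) are hypothetical values chosen only to reproduce the pattern described above; they are not taken from the cited studies, which typically rely on model fitting rather than this shortcut.

```python
def falconer_estimates(r_mz, r_dz):
    """Decompose trait variance from twin correlations (Falconer's approximation).

    r_mz: trait correlation between identical (monozygotic) twins
    r_dz: trait correlation between fraternal (dizygotic) twins
    """
    heritability = 2 * (r_mz - r_dz)   # genetic share of the variance
    shared_env = 2 * r_dz - r_mz       # family environment common to both twins
    nonshared_env = 1 - r_mz           # environment unique to each twin (plus error)
    return heritability, shared_env, nonshared_env


# Hypothetical correlations, for illustration only.
h2, c2, e2 = falconer_estimates(r_mz=0.60, r_dz=0.30)
print(f"heritability={h2:.2f} shared={c2:.2f} nonshared={e2:.2f}")
```

With these assumed inputs the genetic share comes out at 60 percent and the shared-environment term at zero, so the remaining variance is attributed to influences that affect each twin differently.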
5. Biochemistry
Males who are high on the sensation seeking trait have high levels of the hormone testosterone compared to average levels in lower sensation seekers (Daitzman and Zuckerman 1980). This finding is consistent with the difference between men and women in the trait and the finding that sensation seeking peaks in the late teens and declines with age in both sexes. The enzyme monoamine oxidase (MAO) type B is lower in high sensation seekers than in those who score low on the trait. This is also consistent with age and gender differences since women are higher than men on MAO at all ages and MAO rises in the brain and blood platelets with age. Type B MAO is a regulator of the monoamines, particularly dopamine, and low levels imply a lack of regulation perhaps related to the impulsivity characteristic of many high sensation seekers. Low levels of MAO are also found in disorders characterized by poor behavioral control: attention deficit hyperactivity disorder, antisocial and borderline personality disorders, alcoholism, drug abuse, pathological gambling disorder, and mania (bipolar disorder). MAO is part of the genetic predisposition for these disorders as shown by the finding that the enzyme is low in as yet nonaffected children of alcoholics and those with bipolar disorder. Evidence of behavioral differences in newborn infants related to MAO levels also shows the early effects on temperament. MAO differences are also related to behavioral traits in monkeys analogous to those of high and low sensation seeking humans.
6. Psychophysiology
Differences in the psychophysiological responses of the brain and autonomic nervous system as a function of stimulus intensity and novelty have been found and generally replicated (Zuckerman 1990). The heart rate response reflecting orienting to moderately intense and novel stimuli is stronger in high sensation seekers than in lows, perhaps reflecting their interest in novel stimuli (experience seeking) and disinterest in repeated stimuli (boredom susceptibility). The cortical evoked potential (EP) reflects the magnitude of the brain cortex response to stimuli. Augmenting–reducing is a measure of the relationship between the amplitude of the EP and the intensity of stimuli. A high positive slope (augmenting) is characteristic of high sensation seekers (primarily those of the disinhibition type), and very low slopes, sometimes reflecting a reduction of response at the highest stimulus intensities (reducing), are found primarily in low sensation seekers. These EP augmenting–reducing differences have been related to differences in behavioral control in individual cats and strains of rats analogous to sensation seeking behavior in humans (Siegel and Driscoll 1996).
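The augmenting–reducing slope described above can be sketched as an ordinary least-squares slope of EP amplitude on stimulus intensity. The intensity levels and amplitudes below are hypothetical numbers in arbitrary units, invented purely for illustration, and the function name `ep_slope` is likewise an assumption rather than an established scoring procedure.

```python
def ep_slope(intensities, amplitudes):
    """Least-squares slope of EP amplitude regressed on stimulus intensity."""
    n = len(intensities)
    mean_x = sum(intensities) / n
    mean_y = sum(amplitudes) / n
    covariance = sum((x - mean_x) * (y - mean_y)
                     for x, y in zip(intensities, amplitudes))
    variance = sum((x - mean_x) ** 2 for x in intensities)
    return covariance / variance


# Hypothetical data: four stimulus intensity levels, EP amplitudes in arbitrary units.
intensity_levels = [1, 2, 3, 4]
augmenter = [4.0, 6.0, 8.5, 11.0]  # amplitude keeps growing with intensity
reducer = [4.0, 5.0, 5.5, 4.5]     # amplitude drops at the highest intensity

augmenter_slope = ep_slope(intensity_levels, augmenter)
reducer_slope = ep_slope(intensity_levels, reducer)
print(augmenter_slope, reducer_slope)
```

In this toy data set the augmenter yields a steep positive slope, while the reducer's drop at the highest intensity pulls its slope close to zero.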
7. Evolution
Comparative studies of humans and other species using the same biological markers suggest that the trait of impulsive sensation seeking has evolved in the mammalian line (Zuckerman 1984). Exploration and foraging are risky but adaptive in species that must move frequently to avoid exhaustion of the resources in an area. The balance between sensation seeking and fear determines exploration of the novel environment. Our own hominid species that came out of Africa and settled the entire earth in about 100,000 years had to have at least a moderate degree of sensation seeking. The hunting of large animals by men, warfare, and the seeking of mates outside of the group all involved risks which may have been overcome by the sensation-seeking pleasure in such activity. Individual differences in the trait are seen in human infants before the major effects of socialization are seen. This suggests that impulsive sensation seeking is a basic trait related to individual differences in the approach, arousal, and inhibition behavioral mechanisms in humans and other species.
8. Future Directions
One gene (DRD4) has been associated with sensation seeking but it only accounts for 10 percent of the genetic variance. The search for other major genes will continue and the understanding of what these genes do in the nervous system will fill in the large biological gap between genes and sensation-seeking behavior. Longitudinal studies that begin with genetic markers like DRD4, biological markers like MAO, and behavioral markers like reactions to novelty will be used to find out how specific environmental factors interact with dispositions to determine the expressions of sensation seeking, for instance why one sensation seeker becomes a criminal and another a firefighter who goes skydiving on weekends.
See also: Gender Differences in Personality and Social Behavior; Genetic Studies of Personality; Personality and Crime; Personality and Risk Taking; Temperament and Human Development; Temperament: Familial Analysis and Genetic Aspects
Bibliography
Cloninger C R 1987 A systematic model for clinical description and classification of personality variants. Archives of General Psychiatry 44: 573–88
Cloninger C R, Sigvardsson S, Bohman M 1988 Childhood personality predicts alcohol abuse in young adults. Alcoholism: Clinical and Experimental Research 12: 494–505
Daitzman R J, Zuckerman M 1980 Disinhibitory sensation seeking, personality, and gonadal hormones. Personality and Individual Differences 1: 103–10
Ebstein R P, Novick O, Umansky R, Priel B, Oser Y, Blaine D, Bennett E R, Nemanov L, Katz M, Belmaker R H 1996 Dopamine D4 receptor (D4DR) exon III polymorphism associated with the human personality trait, novelty seeking. Nature Genetics 12: 78–80
Hur Y, Bouchard T J Jr 1997 The genetic correlation between impulsivity and sensation seeking traits. Behavior Genetics 27: 455–63
Richardson D R, Medvin N, Hammock G 1988 Love styles, relationship experience, and sensation seeking: A test of validity. Personality and Individual Differences 9: 645–51
Ruch W 1988 Sensation seeking and the enjoyment of structure and content of humor: Stability of findings across four samples. Personality and Individual Differences 9: 861–71
Segal B S, Huba G J, Singer J L 1980 Drugs, Daydreaming and Personality: Study of College Youth. Erlbaum, Hillsdale, NJ
Siegel J, Driscoll P 1996 Recent developments in an animal model of visual evoked potential augmenting/reducing and sensation seeking behavior. Neuropsychobiology 34: 130–5
Teichman M, Barnea Z, Rahav G 1989 Sensation seeking, state and trait anxiety, and depressive mood in adolescent substance users. International Journal of the Addictions 24: 87–9
Zuckerman M 1979 Sensation Seeking: Beyond the Optimal Level of Arousal. Erlbaum, Hillsdale, NJ
Zuckerman M 1984 Sensation seeking: A comparative approach to a human trait. Behavioral and Brain Sciences 7: 413–71
Zuckerman M 1990 The psychophysiology of sensation seeking. Journal of Personality 58: 313–45
Zuckerman M 1993 Sensation seeking and impulsivity: A marriage of traits made in biology? In: McCown W G, Johnson J L, Shure M B (eds.) The Impulsive Client: Theory, Research, and Treatment. American Psychological Association, Washington, DC, pp. 71–91
Zuckerman M 1994 Behavioral Expressions and Biosocial Bases of Sensation Seeking. Cambridge University Press, New York
M. Zuckerman
Sensitive Periods in Development, Neural Basis of
In all animals including man, the organization of brain and behavior is not fully determined by genetic information. Information from the environment, mediated by sensory organs, plays an important role in shaping the central nervous system (CNS) or a given behavior to its adult appearance. Many studies have shown that the environmental influence on development not only differs between species, but also varies over the course of development. In many cases, external stimulation affects development only during a very short period of time. The same stimulus, given earlier or later in life, may have no effect or no visible effect. This phenomenon, a time-limited influence of external stimulation on the wiring of the CNS or on the performance of a given behavior, has been called, with only subtle differences in meaning, ‘sensitive’ or ‘critical periods,’ or ‘sensitive phases.’ I shall use ‘sensitive period’ here simply because it is the most frequent term.
1. Examples of Developmental Phenomena with Sensitive Periods
1.1 Imprinting and Song Learning
Probably the best known example of early external influences of the environment on the organization of behavior is the so-called ‘imprinting’ process (Lorenz 1935), by which a young bird restricts its social preference to a particular animal or object. In the course of ‘filial imprinting,’ for example, a young chick or duck learns about the object that it has followed when leaving the nest (Hess 1973). Young zebra finches, in the course of ‘sexual imprinting’ (Immelmann 1972), learn the features of an object that subsequently releases courtship behavior in fully grown birds. In addition to these two phenomena, many other paradigms of imprinting have been described, for example homing in salmon, habitat imprinting, acoustic imprinting, or celestial orientation in birds. All imprinting phenomena are characterized by at least two criteria (Immelmann and Suomi 1982). First, learning about the object which the bird follows later on, or to which courtship behavior is directed, is restricted to a sensitive period early in development. In the case of filial imprinting, this phase is quite short (several hours) and it starts directly after hatching. In sexual imprinting, which has been investigated mainly in birds that hatch underdeveloped and with closed eyes, the sensitive period starts on the day of eye opening, and may last for several days. The second feature of all imprinting paradigms examined so far is that the information storage is rather stable. The preference for an object to follow or to court, which has been established in the course of the sensitive period, cannot be altered later on. Whenever the bird has a chance to choose between the imprinted object and another one, it will choose the familiar imprinted object.
Song learning is, at first glance, a little more complicated than imprinting. It has been shown to comprise two parts (Konishi 1965, Marler 1970). Early in life a young male bird (only males, in most avian species, sing) learns the song he himself will sing as an adult, and he learns it mainly from his father. At the time of learning, the young male is not able to sing by himself. This time span is called the acquisition period, and it is thought that the male stores some template of the song he has heard during this phase of learning. When the bird grows older, he starts singing by himself, and it has been shown that he tries to match his own song with the previously acquired template. During this ‘crystallization period,’ the young male selects his final song from a bigger set of other songs that he was singing at the beginning. Thereafter, this selected song remains stable and shows only minor variation. 
The other songs are, in most cases, not uttered any longer. Song learning thus shows the same characteristics as imprinting. It occurs during a sensitive period, and after the crystallization period, the song that is selected cannot easily be altered. Recent research indicates that a second event like crystallization may also be required in imprinting. At least formally, one can also separate in imprinting an acquisition period (which is the ‘classical’ sensitive period) and a second event that may be called stabilization (Bischof 1994).
As already mentioned, development is an interplay between genetic instruction and acquired information. This is also the case in imprinting and song learning (Bischof 1997). In imprinting, not only is the behavior for which the object is learned genetically determined, but there are also some genetic constraints that make certain objects easier to learn than others. In filial imprinting, it has been shown for example that there is a natural preference for red over other colors, and this preference can be enhanced or diminished by selective breeding. In song learning, only a small variety of songs can be learned by most species, and there is some indication that certain features of song are innate and not alterable by early learning.
1.2 Plasticity of the Visual Cortex in Cats
The best-known example of sensitive periods in neural development is the plasticity of neurons of the visual cortex in cats (Hubel and Wiesel 1970). In the adult cat, most neurons in area 17 are driven by visual stimulation of the left as well as the right eye, and are thus defined as binocular. If one eye of a kitten is briefly sutured closed in its early postnatal life, the access of the eyes to cortical neurons is dramatically altered. There is an obvious lack of binocular neurons in the visual cortex of such kittens, and most of the neurons are driven exclusively, or are at least strongly dominated, by the non-deprived eye. These changes in ‘ocular dominance distribution’ are only observed if monocular deprivation occurs during postnatal development; the same deprivation in an adult cat does not cause any change. Thus, there is a sensitive period during which the alteration of the visual input affects the wiring of neurons within the visual cortex. The wiring that has been established by the end of this sensitive period remains stable for the rest of life. Most of the results obtained in cats were later confirmed and extended by research on other animals, including monkey, ferret, rat, and mouse (Berardi et al. 2000).
As in imprinting and song learning, there is evidence that in addition to early learning, genetic instruction plays an important role for the organization of the visual cortex. The basic pattern of wiring is already there at birth, and this basic pattern is either stabilized during the sensitive period if the sensory information is adequate, or can be altered if the information coming from the eyes deviates from normal, for example if one eye is closed as in the experiments above, or if the eyes are not aligned appropriately, as is the case in squinting.
2. Similarity of Events with Sensitive Periods on the Behavioral and on the Neuronal Level
The short overviews in Sect. 1 already show that in addition to the fact that the environment affects the organization of brain and behavior only during a sensitive period, there are three features that are similar in all examples. First, there is some genetic preorganization that is maintained or altered by sensory information during the sensitive period. Second, the shape of the sensitive period over time is very similar in all cases: the effect of environmental information increases quickly to a maximum, and then declines much more slowly, going asymptotically to zero. Third, once the sensitive period is over, the brain structure or the behavior is not easily altered again. However, so far our comparison has concerned only phenomena; the effects may be due to very different mechanisms. It is therefore necessary to go into more detail and to discuss the mechanisms by which storage of information occurs in the different paradigms, and how the start, the duration, and the end of sensitive periods, respectively, are determined. Sects. 2.1–2.3 will show that there is indeed a lot of similarity on the level of mechanisms.
2.1 Control of Sensitivity
The control of the time span over which external information is able to affect brain and behavior was initially thought to be due to a genetically determined window which was opened for some time during development, allowing external information to access the CNS. However, this turned out to be too simple an idea because the environmental sensitivity could be shifted in time by experimental conditions (Bateson 1979). Dark rearing, for example, delays the sensitive period during which ocular dominance can be shifted by monocular deprivation, and the sensitive period for imprinting lasts longer if the young bird is isolated and thus not able to see the appropriate stimulus which leads to imprinting. In imprinting, one can also show that the sensitive period is prolonged if the stimulus is not optimal. For example, exposing a zebra finch male to another species such as a Bengalese finch leads to a prolongation of the sensitive period.
The ideas that were raised to explain these phenomena were as follows (Bischof 1997): the natural onset of the sensitive periods coincides with the functioning of the sensory systems involved. Thus, the sensitive period for sexual imprinting in zebra finches starts at about 10 days of age, when the eyes are fully open, while in the precocial chicks, filial imprinting starts directly after hatching because these birds hatch with open eyes. The sensitive period for monocular deprivation should also start at eye opening. This is roughly correct, but the effect is quite low at the beginning, and some recent results indicate that eye opening may trigger some intermediate events that then lead to enhanced sensitivity of the affected area of the brain.
Why does the sensitive period have a time course with a quite sharp increase to a maximum of sensitivity, but a slow, asymptotic decrease? One idea is that the sensitive period is a self-terminating process. If we suppose that an object is described by a limited number of bits of information, or, on the storage side, that there is a limited store for the information which has to be acquired, it is easy to imagine that the probability of storage for a given bit of information is high at the beginning and goes asymptotically to zero, dependent on the amount of information already stored (Bischof 1985).
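The self-terminating idea can be made concrete with a small numerical sketch. It is not the model of Bischof (1985) itself; it simply assumes a store of fixed capacity, an uptake per time step proportional to the fraction of the store still free, and sensory input that matures linearly over the first few steps. All parameter names and values (`capacity`, `rate`, `onset_steps`) are hypothetical.

```python
def simulate_uptake(steps=60, capacity=100.0, rate=0.15, onset_steps=5):
    """Return the amount of information newly stored at each time step."""
    stored = 0.0
    uptake = []
    for t in range(steps):
        # Assumption: sensory input matures linearly over the first few steps.
        gate = min(1.0, (t + 1) / onset_steps)
        # Storage becomes less likely as the limited store fills up.
        free_fraction = 1.0 - stored / capacity
        gained = rate * gate * free_fraction * capacity
        stored += gained
        uptake.append(gained)
    return uptake


uptake = simulate_uptake()
peak_step = uptake.index(max(uptake))
print("peak at step", peak_step, "; uptake at last step", round(uptake[-1], 4))
```

Under these assumptions the uptake per step rises quickly while the sensory system matures, peaks once the input is fully available, and then decays geometrically toward zero without ever reaching it, which mirrors the asymmetric time course described above.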
2.2 Sites and Modes of Storage
For all the examples of phase-specific developmental phenomena in Sect. 1, the locations within the brain are known where the plastic events can be observed. Therefore, it is possible to compare the changes of wiring between different examples. It was in visual cortex plasticity that it was first detected that the anatomical basis for the development of neuronal specificity was a segregation of previously overlapping neuronal elements within area 17 (LeVay et al. 1978). Thus, the specification of neurons was caused by a reduction of preexisting neuronal elements. This principle was also found in imprinting and song learning. In both paradigms, the spine density of neurons within the areas that are involved in the learning process was substantially reduced in the course of the sensitive period (Bischof 1997). This indicates that pruning of preexisting elements is an essential part of the physiological mechanisms underlying phase-dependent developmental plasticity and learning. The reduction of spine density is stable thereafter; it cannot be enhanced by any treatment once it has occurred. This is also an indication that the reduction of spines may be the anatomical basis of imprinting-like learning and developmental plasticity.
2.3 Ultrastructural Events
It has long been speculated that the machinery causing learning-induced changes during imprinting, song learning, and cortical plasticity may be different from that causing changes in adult learning, because there is a decrease in spine density instead of an increase, and the changes are stable and cannot be reversed. However, concerning the basic machinery, no significant differences were found. Developmental learning can obviously also be explained by ‘Hebbian’ synapses which strengthen their connections if pre- and postsynaptic neurons fire together, and disconnect if the activity is asynchronous. To cause changes in the postsynaptic neuron, NMDA receptors are involved as in other learning paradigms, causing Long Term Potentiation (LTP) or Long Term Depression (LTD) as well as the cascades of second messengers which finally lead to the activation of the genome, which then causes long-term changes in synaptic efficiency.
The difference from adult learning obviously lies in the fact that plasticity is limited to a certain time span. Many ideas have been developed as to which systems may gate plasticity (Katz 1999). One of the earliest ideas was that myelination delimits plasticity. Unspecified projections, adrenergic, serotoninergic, or cholinergic, have been shown to play a role wherever this was investigated. Recent evidence from experiments with knockout animals points towards neurotrophic agents which may only be available in limited amounts; when the resource is exhausted, plasticity is no longer possible (Berardi and Maffei 1999). Another very interesting finding is that inhibition plays a role (Fagiolini and Hensch 2000); neurons may have to reach a genetically determined balance of inhibition and excitation to become plastic. Whether these latter ideas can also be applied to the other early learning paradigms has to be examined.
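The Hebbian rule mentioned in this section can be illustrated with a toy sketch of monocular deprivation. It is a deliberately simplified assumption, not a biophysical model: each eye's input to a cortical neuron is reduced to one weight between 0 and 1, coincident pre- and postsynaptic activity moves the weight toward 1, and asynchronous activity moves it toward 0. The function name `hebbian_step`, the learning rate, and the starting weights are all hypothetical.

```python
def hebbian_step(weight, pre_active, post_active, rate=0.1):
    """Update one synaptic weight (between 0 and 1) for a single time window."""
    if pre_active and post_active:
        # Pre- and postsynaptic neurons fire together: strengthen (LTP-like).
        return weight + rate * (1.0 - weight)
    if pre_active != post_active:
        # Activity is asynchronous: weaken (LTD-like).
        return weight - rate * weight
    return weight  # both silent: no change


open_eye, closed_eye = 0.5, 0.5
for _ in range(20):
    # Assumption: the cortical neuron fires whenever the open eye drives it,
    # while the sutured eye stays silent.
    open_eye = hebbian_step(open_eye, pre_active=True, post_active=True)
    closed_eye = hebbian_step(closed_eye, pre_active=False, post_active=True)

print(round(open_eye, 3), round(closed_eye, 3))
```

Starting from equal weights, the open eye's connection approaches its maximum while the deprived eye's connection shrinks toward zero, a crude analogue of the shift in ocular dominance described in Sect. 1.2.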
3. Is Generalization to Humans Possible?
One has to be very careful if one generalizes examples from one species to another, and this is even more true for generalization from animals to humans. However, there are some hints that at least part of the results described in Sect. 2 can be applied to humans (Braun 1996). On the neuronal level, it is generally accepted that amblyopia, a visual deficit, is based on the mechanisms described for the development of the visual cortex. It has been shown that if the misalignment of the eyes which causes amblyopia is corrected in humans during early development, the connection between eyes and cortical neurons can be restored, and that this is no longer possible in adults (Hohmann and Creutzfeldt 1975). On the behavioral level, it has been shown that language learning has so much in common with song learning (Doupe and Kuhl 1999) that it is intriguing to speculate that the neuronal machinery may be similar in both paradigms. However, one has to be aware that the similarity is as yet only on the phenomenological level.
Since the early days of imprinting research, it has also been discussed whether aggressiveness, social competence, and similar traits are imprinted (Leiderman 1981), and whether this can also be applied to humans. The frightening idea was that in this case parents could easily make big mistakes if they did not confront their children with the appropriate surroundings. However, evidence is sparse even in animals that social competence is severely influenced by early experience. If there is an influence, ways are available, even in the case of sexual imprinting in the zebra finch, to overcome the effects of imprinting at least temporarily. However, the fact that imprinting effects are in most cases only covered but not eliminated may be reason enough to pay some attention to the conditions under which children grow up. 
See also: Birdsong and Vocal Learning during Development; Brain Development, Ontogenetic Neurobiology of; Neural Plasticity in Visual Cortex; Prenatal and Infant Development: Overview; Visual Development: Infant
Bibliography
Bateson P 1979 How do sensitive periods arise and what are they for? Animal Behaviour 27: 470–86
Berardi N, Maffei L 1999 From visual experience to visual function: Roles of neurotrophins. Journal of Neurobiology 41: 119–26
Berardi N, Pizzorusso T, Maffei L 2000 Critical periods during sensory development. Current Opinion in Neurobiology 10: 138–45
Bischof H-J 1985 Environmental influences on early development: A comparison of imprinting and cortical plasticity. In: Bateson P P G, Klopfer P H (eds.) Perspectives in Ethology, Vol. 6. Mechanisms. Plenum Press, New York, pp. 169–217
Bischof H-J 1994 Sexual imprinting as a two-stage process. In: Hogan J A, Bolhuis J J (eds.) Causal Mechanisms of Behavioural Development. Cambridge University Press, Cambridge, UK, pp. 82–7
Bischof H-J 1997 Song learning, filial imprinting, and sexual imprinting: Three variations of a common theme? Biomedical Research–Tokyo 18(Suppl. 1): 133–46
Braun K 1996 Synaptic reorganization in early childhood experience and learning processes: Relevance for the development of mental diseases. Zeitschrift für Klinische Psychologie Psychiatrie und Psychotherapie 44: 253–66
Doupe A J, Kuhl P K 1999 Birdsong and human speech: Common themes and mechanisms. Annual Review of Neuroscience 22: 567–631
Fagiolini M, Hensch T K 2000 Inhibitory threshold for critical-period activation in primary visual cortex. Nature 404: 183–6
Hess E H 1973 Imprinting: Early Experience and the Developmental Psychobiology of Attachment. Van Nostrand Reinhold, New York
Hohmann A, Creutzfeldt O D 1975 Squint and the development of binocularity in humans. Nature 254: 613–4
Hubel D H, Wiesel T N 1970 The period of susceptibility to the physiological effects of unilateral eye closure in kittens. Journal of Physiology (London) 206: 419–36
Immelmann K 1972 The influence of early experience upon the development of social behaviour in estrildine finches. Proceedings of the 15th International Ornithological Congress. The Hague 316–38
Immelmann K, Suomi S J 1982 Sensitive phases in development. In: Immelmann K, Barlow G W, Petrinovich L, Main M (eds.) Behavioural Development. Cambridge University Press, Cambridge, UK, pp. 395–431
Katz L C 1999 What’s critical for the critical period in visual cortex? Cell 99: 673–6
Konishi M 1965 The role of auditory feedback in the control of vocalization in the white-crowned sparrow. Zeitschrift für Tierpsychologie 22: 770–83
Leiderman P H 1981 Human mother-infant social bonding: Is there a sensitive phase? In: Immelmann K (ed.) Behavioral Development. The Bielefeld Interdisciplinary Project. Cambridge University Press, Cambridge, UK
LeVay S, Stryker M P, Shatz C J 1978 Ocular dominance columns and their development in layer IV of the cat’s visual cortex: A quantitative study. Journal of Comparative Neurology 179: 223–44
Lorenz K 1935 Der Kumpan in der Umwelt des Vogels. Journal für Ornithologie 83: 137–213, 289–413
Marler P 1970 A comparative approach to vocal learning: Song development in white-crowned sparrows. Journal of Comparative and Physiological Psychology 71(Suppl.): 1–25
H.-J. Bischof
Sentence Comprehension, Psychology of
In the process of mapping from form (speech or printed text) to meaning, listeners and readers have the task of combining individual word meanings into sentence meanings. This article examines the cognitive challenges posed by this task, describes some experimental techniques that psycholinguists have used to understand the task, summarizes some basic empirical phenomena of sentence comprehension, and surveys the range of cognitive theories that have been developed to explain how people comprehend sentences.
1. The Tasks of Sentence Comprehension
Listeners, and skilled readers, recognize individual words in a manner which appears effortless but which actually hides a wealth of complex cognitive processes (e.g., see Lexical Access, Cognitive Psychology of; Word Recognition, Cognitive Psychology of). They can go further and identify the message being conveyed by a discourse, they can create a mental model of the scenario being described, and they can engage in further social interaction with the person who produced the words. The gap between word and message is bridged by the process of sentence comprehension.
A sentence conveys meaning. The meaning of a sentence is composed out of the meanings of its words, guided by the grammatical relations that hold between the words of the sentence. The psychology of sentence comprehension is concerned with the cognitive processes that permit a reader or listener to determine how the word meanings are to be combined in a way that satisfies the intention of the writer or speaker.
The reader/listener’s task of mapping from print or sound to meaning is not trivial. Small differences in the input can make great changes in meaning. Consider the contrast between ‘Tom gave his dog biscuits’ and ‘Tom gave him dog biscuits.’ The difference between ‘his’ and ‘him’ signals a difference in the ‘case’ of the pronoun, an aspect of its morphology that carries information about the relation it has to a verb or some other ‘case-assigning’ word in a sentence. The form ‘him’ signals the accusative case, which means that the pronoun has some immediate relation to the verb, in this case the indirect object or ‘recipient’ of the action denoted by the verb. The form ‘his,’ on the other hand, signals the possessive or ‘genitive’ case, which means that the pronoun is immediately related to the following noun ‘dog.’ This whole noun phrase in turn takes the recipient role. Morphologically signaled case actually plays a minor role in English, although it plays a very important role in many other languages. Languages like English signal the structural relations between their words primarily by word order (compare ‘The dog bit the man’ and ‘The man bit the dog’), and English, like all languages, signals structural relations by the choice of particular lexical items (compare ‘The man helped the child to first base’ and ‘The man helped the child on first base’).
Listeners and readers must be sensitive to subtle pieces of information about the structure of sentences. Their task is complicated by the fact that they must also be sensitive to structural relations that span arbitrarily long distances. Consider sentences like ‘The boy likes the girl,’ ‘The boy from the small town likes the girl,’ ‘The boy from the small town where I grew up likes the girl,’ and so forth. There is no limit to the amount of material that can be included in the sentence to modify ‘the boy.’ Nonetheless, the form of the final verb, ‘likes,’ must agree in number with the initial subject, ‘the boy.’ Or consider a sentence like ‘Which girl did the boy from the small town … like?’ The initial question phrase, ‘which girl,’ must be understood to be the direct object of the final verb ‘like,’ even though the two phrases can be separated by an arbitrarily long distance. Readers and listeners (and writers and speakers) are clearly sensitive to such ‘long distance dependencies’ (cf. Clifton and Frazier 1989, Fodor 1978).
A third problem faced by listeners and readers is the ubiquitous ambiguity (temporary or permanent) of language. Ambiguity begins in the speech stream, which often can be segmented at different points into different words (see, e.g., Speech Perception; see also Cutler et al. 1997). 
A given word form can correspond to different lexical concepts (e.g., the money vs. river meaning of ‘bank’). And a string of words can exhibit different structural relations, which may or may not be resolved by the end of a sentence (consider, e.g., the temporary ambiguity in ‘I understood the article on word recognition was written by a real expert’). Readers and listeners are generally not aware of such ambiguities (although puns force awareness), but that only means that their cognitive processes adequately resolve the ambiguities in the course of sentence comprehension.
1.1 The Role of Psycholinguistics in the Development of Cognitive Psychology
The modern study of sentence comprehension began at the start of the cognitive revolution, in the late 1950s and early 1960s. Prior to this time, complex behaviors were claimed to be the result of (possibly complex) stimulus–response contingencies. Sentences were presumed to be produced and understood by chaining
strings of words together, under the control of environmental and internal stimuli. Psycholinguistic phenomena of sentence comprehension (and language acquisition; see e.g., Language Acquisition) demonstrated the inadequacy of stimulus–response chaining mechanisms. The sheer fact that we are always understanding and producing completely novel sentences was hard enough to explain in a stimulus–response framework, but even sharper arguments against behaviorism were constructed. Consider just one example. Long-distance dependencies, discussed above, could not be analyzed in terms of stimulus–response chains. If stimuli are chains of words, then the effective stimulus that links a ‘which’ question phrase with the point in the sentence where it has to be interpreted (often referred to as a ‘gap’) would have to be different for each added word that intervenes between the ‘which’ word (‘stimulus’) and its gap (‘response’). In the limit, this means that an arbitrarily large (potentially infinite) number of stimuli would have been associated with a response, which is not possible within stimulus–response theory. 1.2 The Role of Linguistic Knowledge in Sentence Comprehension Arguments like the one just given were among the strongest reasons to develop a psychology of cognitive processes (see Miller 1962). Cognitive psychologists took arguments like this to support the claim that the mind had to have processes that operated on structures more abstract than stimulus–response chains. Many psycholinguists took the new generative grammars being developed by Chomsky (1957, see also Chomsky 1986) to provide descriptions of the structures that cognitive processes could operate on. One important structure is phrase structure, the division of a sentence into its hierarchically arranged phrases and the labeling of these phrases. 
For example, ‘I understood the article on word recognition was written by an expert’ is divided into the subject ‘I’ and the verb phrase ‘understood … expert’; this verb phrase is divided into the verb ‘understood’ and its embedded sentence complement ‘the article … was written by an expert’; this sentence is similarly divided into subject and verb phrase, and so forth. Another structure (identified only after several years of development of linguistic theory) is the long-distance dependency between ‘moved items’ or ‘fillers’ and their ‘traces’ or ‘gaps’ (as in the relation between a ‘which’ phrase and its gap, discussed earlier). Still other structures involve case relations (e.g. the distinction between accusative and genitive case discussed earlier) and thematic role relations (e.g., the distinction between theme or affected object and recipient in ‘Tom gave his dog biscuits’). Early psycholinguists devoted much attention to the ‘psychological reality’ of grammatical structures, arguing whether they are really involved in sentence
comprehension (Fodor et al. 1974). As psycholinguistic theory developed, it became apparent that the real debate involves not whether these structures are real but how they are identified in the course of sentence comprehension and how, in detail, the mind creates and represents them (Frazier 1995). It now seems impossible to explain how we understand sentences without theorizing about how people assign structure to sentences. However, as we will see later, there is very lively debate about just how we do identify sentence structure.
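The phrase-structure division described above for ‘I understood the article on word recognition was written by an expert’ can be sketched as a nested data structure. The following is only an illustrative sketch: the node labels (S, NP, VP, V), the coarse bracketing, and the helper names `leaves` and `depth` are assumptions made for this example and are not taken from any particular grammar or from the works cited.

```python
# Toy bracketing of the example sentence. Labels and granularity are
# illustrative assumptions, not a claim about the correct linguistic analysis.
tree = ("S",
        ("NP", "I"),
        ("VP",
         ("V", "understood"),
         ("S",                                   # embedded sentence complement
          ("NP", "the article on word recognition"),
          ("VP", "was written by an expert"))))


def leaves(node):
    """Return the words under a node in left-to-right order."""
    if isinstance(node, str):
        return node.split()
    _label, *children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def depth(node):
    """Number of labeled phrase levels on the longest path to a word."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node[1:])


print(" ".join(leaves(tree)))
print(depth(tree))
```

Reading the leaves back in order recovers the word string, while the nesting records which words group together; it is this grouping, not the linear chain of words, that the structural accounts discussed here operate on.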
2. Ways of Studying Sentence Comprehension
Sentence comprehension seems almost effortless and automatic. How can one observe the fine-grain details of such a smoothly flowing process? Early psycholinguists focused on what was remembered once a sentence was comprehended. They learned some interesting things. Grammatical structure can predict which sentences will be confused in recall and recognition and it can predict what words from a sentence will be good recall probes. While these findings supported the ‘psychological reality’ of grammatical structures, other findings indicated that the gist of a sentence was typically more salient in memory than its specific form or even its grammatical structure. (See Tanenhaus and Trueswell 1995 for a more detailed review.)
2.1 Online Measures
Showing that the end state of comprehension is gist tells us little about how meaning is arrived at. The early psycholinguists recognized this, and developed ‘online’ tasks to probe the early stages of sentence comprehension. Some used a technique of sentence verification, measuring the time taken to decide whether a sentence was true or false (e.g., when spoken of a picture). Others (e.g., Just et al. 1982, see also Haberlandt 1994) measured the time readers took to read each word or phrase of a sentence when sentence presentation was under their control (pressing a button brings on each new word), thereby getting a more precise look at the difficulty readers experience word by word and allowing the development of theories about how readers construct interpretations of sentences. The development of techniques for measuring eye movements during reading (Rayner et al. 1989; see also Eye Movements in Reading), coupled with the discovery that the eye generally seems to be directed toward whatever input the reader is processing, allowed even more sensitive and less disruptive ways of measuring the course of reading. In addition to devising techniques for measuring which parts of sentences readers found easy or hard, psycholinguists have developed ways to assess just what a reader or listener might be thinking about at any given time. One can interrupt a spoken sentence with a probe (e.g., a word to name or to recognize from earlier in the sentence) and use probe reaction times to make inferences about how highly activated the probe word is. For example, responses to probes for fillers such as ‘girl’ in ‘Which girl did the boy from the small town like’ are sometimes observed to be faster after the word that assigns the filler a thematic role (‘like’) than at other points in the sentence, as if the filler was reactivated at the gap. (Note that this technique is methodologically tricky and can mislead a researcher if it is misused; McKoon et al. 1994.) One can also measure ERPs (event-related brain potentials, electrical traces of underlying brain activities measured at the scalp) at critical points in sentences, for example, when a sentence becomes implausible or ungrammatical or simply difficult to process (Kutas and van Petten 1994). There appear to be distinct signatures of different types of comprehension disruption. For instance, implausibility or violation of expectations commonly leads to an ‘N400,’ a negative-going potential shift peaking about 400 ms after the introduction of the implausibility. This allows a researcher to test theories about just when and how disruption will appear in hearing or reading a sentence.
3. Phenomena of Sentence Comprehension
3.1 Clausal Units
Early theories of processing, guided by linguistic analyses that posited ‘transformational’ rules whose domains could be as large as a full clause, proposed that full sentence analysis and interpretation took place only at a clause boundary. These theories were supported by evidence that memory for verbatim information declines across a clause boundary, that clauses resist interruption by extraneous sounds, and that readers pause at ends of clauses and sentences to engage in ‘wrap-up’ processes (Fodor et al. 1974, Just and Carpenter 1980).
3.2 Immediacy
However, a great deal of evidence now exists indicating that readers do substantial interpretation on a word-by-word basis, without waiting for a clause boundary. The initial evidence for this claim came from Marslen-Wilson’s (1973) work on ‘fast shadowing’ (repeating what one hears, with no more than a quarter-second delay). He showed that if the shadowed message contained a mispronunciation of a word’s end (e.g., ‘cigareppe’), the shadower would fairly often correct it spontaneously. Since this happened more often in
normal than in scrambled sentences, the listener must be using grammatical structure or meaning within a quarter-second or so to constrain recognition of words. Evidence from measuring eye movements in reading led to a similar conclusion (see Eye Movements in Reading). For instance, if a person reads ‘While Mary was mending the sock fell off her lap,’ the reader’s eyes are likely to fixate a long time on the word ‘fell’ or to regress from that word to an earlier point in the sentence. This happens because the verb ‘fell’ is grammatically inconsistent with the normally preferred structural analysis of ‘ … mending the sock’; ‘the sock’ must be the subject of ‘fell,’ not the object of ‘mending.’ Similarly, a person who reads ‘When the car stopped the moon came up’ is likely to fixate a longer than normal time on ‘moon.’ Here, ‘moon’ is simply implausible as the direct object of ‘stop,’ not ungrammatical. This pattern of results indicates that readers (and presumably listeners) create and evaluate grammatical and meaning relations word by word, without waiting for a clause boundary. It is tempting to hypothesize that readers and listeners always perform a full semantic analysis of sentences word by word, with never a delay. This ‘immediacy hypothesis’ may be too strong. Some evidence indicates that, while true lexical ambiguities (e.g., the difference between the bank of a river and a bank for money) may be resolved word by word, sense ambiguities (e.g., the difference between a newspaper as something to read and a newspaper as an institution) may not be (Frazier and Rayner 1990). Similarly, determining the antecedent of a pronoun may be a task that readers do not always complete at the first opportunity. Nonetheless, one secure conclusion is that readers and listeners generally understand a great deal of what a sentence conveys with little or no lag.
3.3 Garden-pathing
Another much-studied phenomenon is ‘garden-pathing.’ When readers and listeners construct word-by-word interpretations of sentences, they sometimes make mistakes. Bever (1970) initiated the study of garden-pathing with his infamous sentence ‘The horse raced past the barn fell’ (compare ‘The car driven into the garage caught fire’ if you doubt that the former sentence is actually possible in English). Readers are ‘led down the garden path’ by their preference to take ‘horse’ as the agent of ‘race,’ and are disrupted when the final verb ‘fell’ forces a revision so that ‘raced past the barn’ is taken as a relative clause that modifies ‘horse,’ and ‘horse’ is taken to be the theme of ‘race’ and the subject of ‘fell.’ The mistakes readers and listeners make can be extremely diagnostic of the decision rules they follow. A simple empirical generalization (which may or may not describe the cognitive process a reader or listener is engaging in; see below) is that when a choice between two analyses has to be made, the reader/listener initially favors (a) the choice that is simpler in terms of grammatical structure, or (b) in the absence of complexity differences, the choice that permits new material to be related to the most recently processed old material. In the ‘horse raced’ sentence, the main verb analysis is simpler than the relative clause analysis. In ‘While Mary was mending the sock fell off her lap,’ a reader/listener prefers to relate ‘the sock’ to the recently processed verb ‘mending’ rather than to material that has not yet been received. In both cases, these preferences are disconfirmed by material that follows, resulting in a garden-path. A great deal of experimental work using techniques of self-paced reading and eye movement measurement has indicated that reading is slowed (and regressive eye movements are encouraged) at points where garden-paths occur. Similarly, experimental research using ERPs has indicated the existence of distinct patterns of neural response to being garden-pathed. Research using these, and other, techniques has gone further and demonstrated the existence of subtle garden-paths that may not be apparent to conscious introspection (e.g., the temporary tendency to take ‘the article … ’ as the direct object of ‘understand’ in ‘I understood the article on word recognition was written by a real expert’). Normally, it appears that the rules readers and listeners use to decide on the grammatical structures of sentences function so smoothly that their operation cannot be observed. But just as the analysis of visual illusions allows insight into the normal processes of visual perception by identifying when they go astray, identifying when a rule for analyzing sentence structure gives the wrong answer can go far in determining what rule is actually being followed.
3.4 Lexical and Frequency Effects
Once the existence and basic nature of garden-paths were discovered, researchers realized that garden-paths did not always occur. The sentence presented earlier, ‘The car driven into the garage caught fire’ does not seem to lead to a garden-path. This may be due in part to the fact that the verb ‘driven’ is unambiguously a participle, not a main verb, and the normally preferred simpler analysis is grammatically blocked. However, other cases indicate that more subtle effects exist. Sentences with verbs that are obligatorily transitive are relatively easy (e.g., ‘The dictator captured in the coup was hated’ is easier than ‘The dictator fought in the coup was hated’). Sentences with verbs like ‘cook,’ whose subject can take on the thematic role of theme/affected object, are particularly easy (‘The soup cooked in the pot tasted good’).
Sentences in which the subject is implausible as agent of the first verb but plausible as its theme are easier than sentences in which the subject is a plausible agent (e.g., ‘The evidence examined by the lawyer was unreliable’ is easier than ‘The defendant examined by the lawyer was unreliable,’ although this difference may depend on a reader being able to see the start of the disambiguating ‘by’ phrase while reading the first verb). Consider a different grammatical structure, prepositional phrase attachment. It is easier to interpret a prepositional phrase as an argument of an action verb than as a modifier of a noun (e.g., ‘The doctor examined the patient with a stethoscope’ is easier than ‘The doctor examined the patient with a broken wrist’), which is consistent with a grammatical analysis in which the former sentence is structurally simpler than the latter. However, this difference reverses when the verb is a verb of perception or a ‘psychological’ verb and the noun that may be modified is indefinite rather than definite (e.g., ‘The salesman glanced at a customer with suspicion’ is harder than ‘The salesman glanced at a customer with ripped jeans’) (Spivey-Knowlton and Sedivy 1995; see also Frazier and Clifton 1996, Mitchell 1994, Tanenhaus and Trueswell 1995, for further discussion and citation of experimental articles). These results indicate that grammatical structure is not the only factor that affects sentence comprehension. Subtle details of lexical structure do as well. Further, the sheer frequency with which different structures occur and the frequency with which particular words occur in different structures seem to affect sentence comprehension: For whatever reason, more frequent constructions are easier to comprehend. For example, if a verb is used relatively frequently as a participle compared to its use as a simple past tense, the difficulty of the ‘horse raced’ type garden-path is reduced.
Similarly, the normal preference to take a noun phrase following a verb as its direct object (which leads to difficulty in sentences like ‘I understood the article on word recognition was written by an expert’) is reduced when the verb is more often used with sentence complements than with direct objects. MacDonald et al. (1994) review these findings, spell out one theoretical interpretation which will be considered shortly, and discuss the need to develop efficient ways of counting the frequency of structures and to decide on the proper level of detail for counting structures.
3.5 Effects of Context
Sentences are normally not interpreted in isolation, which raises the question of how context can affect the processes by which they are understood. One line of research has emphasized the referential requirements of grammatical structures. One use of a modifier, such as a relative clause, is to select one referent from a set of possible referents. For instance, if there are two books on a table, you can ask for the ‘book that has the red cover.’ Some researchers (e.g., Altmann and Steedman 1988) have suggested that the difficulty of the relative clause in a ‘horse raced’ type garden-path arises simply because the sentence is presented out of context and there is no set of referents for the relative clause to select from. This suggestion entails that the garden-path would disappear in a context in which two horses are introduced, one of which was raced past a barn. Most experimental research fails to give strong support to this claim (cf. Mitchell 1994, Tanenhaus and Trueswell 1995). However, the claim may be correct for weaker garden-paths, for example the prepositional phrase attachment discussed above (‘The doctor examined the patient with a broken wrist’), at least when the verb does not require a prepositional phrase argument. And a related claim may even be correct for relative clause modification when temporal relations, not simply referential context, affect the plausibility of main verb vs. relative clause interpretations (cf. Tanenhaus and Trueswell 1995).
3.6 Effects of Prosody
Most research on sentence comprehension has focused on reading. However, some experimental techniques permit the use of spoken sentences, which raises the possibility of examining whether the prosody of a sentence (its rhythm and melody) can affect how it is comprehended. The presence of a prosodic boundary (marked in English by a change in pitch at the end of the prosodic phrase, elongation of its final word, and possibly the presence of a pause) can affect the interpretation of a sentence. It can even eliminate the normal preferences that result in garden-paths. For example, Kjelgaard and Speer (1999) showed that the proper pattern of prosodic boundaries can eliminate the normal difficulty of sentences like ‘Whenever Madonna sings a song is a hit.’ As these authors emphasize, though, there is more to prosody than just putting in pauses. The entire prosodic pattern of an utterance (including the presence and location of ‘pitch accents’; Schafer et al. 1996) can affect how it is interpreted.
4. Psychological Models of Sentence Comprehension
Psycholinguists have learned a great deal about the phenomena of sentence comprehension. This does not mean that they agree about the cognitive processes of readers and listeners that produce these phenomena. Early theories of how people understand sentences claimed that the rules that make up a linguistic theory of a language are known (implicitly) to language users,
who use the rules directly in comprehending language (see Fodor et al. 1974). This did not work with the rules provided by early transformational grammars. Their rules operated on domains as large as a clause, and it soon became clear that people did not wait for the end of a clause to understand a sentence. These early direct incorporation theories were replaced by what can be called ‘detective-style’ theories. Readers and listeners were presumed to search for clues of any kind to the relations among words in a sentence. Psycholinguistic theorizing became little more than listing cues that people used. In the 1970s, as grammars changed to use more restrictive types of rules (especially emphasizing phrase-structure rules), a new breed of grammar-based theories developed. These theories added descriptions of decision strategies to the claim that people use the rules provided by the grammar, focusing on how they choose among various alternative rules in analyzing a sentence (Frazier 1987; cf. Frazier and Clifton 1996, Mitchell 1994, for further discussion of these theories). They claimed, for instance, that readers and listeners incorporate each new word into a sentence structure in the simplest and quickest possible way. This led to compelling accounts of garden-path phenomena, and the theories under discussion are often referred to as ‘garden-path’ theories. In the 1970s and 1980s, linguistic theory moved away from positing broad, generally applicable rules to making claims about information contained in individual lexical items (constrained by some extremely broad general principles). This encouraged the development of ‘lexicalist’ theories of sentence comprehension (see MacDonald et al. 1994, for the most completely developed statement of such a theory). These theories emphasize the specific contribution of individual words, rather than the effects of conformity to generally applicable word configurations.
An individual word can provide a variety of kinds of information, including the possible and preferred thematic roles for its arguments, the frequency with which it is used in various constructions, and the entities it can refer to, as well as information about the phrase structure configurations it can appear in. Lexicalist theories propose that all these kinds of information are available and are used in deciding among alternative analyses. It is clear that contemporary theories of sentence comprehension differ in the range of information that they claim guides sentence analysis. Garden-path theories are ‘modular’ (cf. Fodor 1983) in that they claim that only certain necessarily relevant types of information affect initial decisions about sentence structure. Lexicalist theories are generally nonmodular in allowing many different kinds of information to affect these decisions. The theories also differ in what can be called depth-first vs. breadth-first processing (see Clifton 2000, for discussion). Garden-path theories are typically depth-first (although logically they need not be); they typically claim that a single analysis is built initially and then evaluated. Lexicalist theories are typically breadth-first; they claim that several different analyses are activated in parallel (generally being projected from the head of a phrase), and that all available types of information are used to choose among them. Experimental data have thus far not been able to settle the debate between theorists favoring depth-first models and theorists favoring breadth-first ones. Garden-path theories explain garden-paths elegantly, and provide a more adequate account of how sentence structures are created than lexicalist theories do. On the other hand, they are forced to ascribe the effects of lexical structure, frequency, and context described above to a ‘reanalysis’ stage, which follows discovery that the initially preferred analysis is incorrect. Lexicalist theories, on the other hand, provide a natural account of effects of lexical structure, frequency, and context, especially when implemented as connectionist constraint-satisfaction models. But they have not yet been developed in a way that adequately explains where sentence structures come from (it is unappealing to say that large parts of the grammar are stored with each lexical entry) or that explains why structurally complex sentences are overwhelmingly the ones that lead to garden-paths.
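The depth-first vs. breadth-first contrast can be made concrete with a deliberately tiny sketch. Everything in it is a hypothetical stand-in chosen for illustration: the two-word verb lexicon, the two viability checks, and the function names are assumptions, not a real grammar and not the implementation of any theory discussed above. In particular, actual lexicalist models weigh graded sources of evidence rather than simply pruning analyses.

```python
# Hypothetical mini-lexicon: the only word forms treated as verbs here.
VERBS = {"raced", "fell"}


def main_verb_ok(prefix):
    # Treats the first verb as the main verb, so a second verb cannot be attached.
    return sum(word in VERBS for word in prefix) <= 1


def reduced_relative_ok(prefix):
    # Treats the first verb as a participle modifying the subject,
    # so a later verb can still serve as the main verb.
    return True


# Ordered from structurally simplest to more complex (an assumption of the sketch).
ANALYSES = [("main verb", main_verb_ok), ("reduced relative", reduced_relative_ok)]


def serial_parse(words):
    """Depth-first: hold one analysis and switch only when the input rules it out."""
    current = 0
    reanalysis_points = []
    for i in range(1, len(words) + 1):
        while not ANALYSES[current][1](words[:i]):
            reanalysis_points.append(words[i - 1])  # word that forced the revision
            current += 1
    return ANALYSES[current][0], reanalysis_points


def parallel_parse(words):
    """Breadth-first: keep every analysis alive and prune those the input rules out."""
    alive = list(ANALYSES)
    for i in range(1, len(words) + 1):
        alive = [(name, ok) for name, ok in alive if ok(words[:i])]
    return [name for name, _ in alive]


sentence = "The horse raced past the barn fell".split()
print(serial_parse(sentence))
print(parallel_parse(sentence))
```

In the serial version the cost of the garden-path shows up as an explicit revision at ‘fell’; in the parallel version the same word merely removes one of the candidates that were being carried along.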
5. Beyond Parsing
This article has focused on how readers and listeners use their knowledge of language to compose sentence meanings from words. While much of the power of language comes from readers’ and listeners’ abilities to ‘follow the rules’ and arrive at grammatically licensed interpretations of sentences, it is also clear that people have the ability to go beyond such literal interpretation. The context in which an utterance is heard or a sentence is read can influence its interpretation far more profoundly than the context effects discussed above. The meaning of words can be altered depending on the context in which they occur (consider ‘bank’ in the context of a trip to the river to fish vs. a purchase of a house, or ‘red’ in the context of a fire truck vs. a red-haired woman). The reference of a definite noun phrase can depend on the discourse context. A listener can determine whether a speaker who says ‘Can you open this door?’ is asking about the adequacy of a job of carpentry vs. requesting some assistance. Speakers can rely on listeners to get their drift when they use figurative language such as metaphors and possibly irony (‘My copy editor is a butcher’). And very generally, what a listener or reader takes a speaker or writer to mean may depend on their shared goals and mutual knowledge (cf. Clark 1996). Nonetheless, all these varied uses of language depend on the listener/reader’s ability to put words
together into sentence meanings, which makes the study of the topics considered in this article an important branch of cognitive psychology.
See also: Eye Movements in Reading; Knowledge Activation in Text Comprehension and Problem Solving, Psychology of; Syntactic Aspects of Language, Neural Basis of; Syntax; Syntax–Semantics Interface; Text Comprehension: Models in Psychology
Bibliography
Altmann G, Steedman M 1988 Interaction with context during human sentence processing. Cognition 30: 191–238
Bever T G 1970 The cognitive basis for linguistic structures. In: Hayes J R (ed.) Cognition and the Development of Language. Wiley, New York, pp. 279–352
Chomsky N 1957 Syntactic Structures. Mouton, The Hague, The Netherlands
Chomsky N 1986 Knowledge of Language: Its Nature, Origin, and Use. Praeger, New York
Clark H H 1996 Using Language. Cambridge University Press, Cambridge, UK
Clifton Jr C 2000 Evaluating models of sentence processing. In: Crocker M, Pickering M, Clifton Jr C (eds.) Architecture and Mechanisms of Language Processing. Cambridge University Press, Cambridge, UK, pp. 31–55
Clifton Jr C, Frazier L 1989 Comprehending sentences with long-distance dependencies. In: Carlson G N, Tanenhaus M K (eds.) Linguistic Structure in Language Processing. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 273–318
Cutler A, Dahan D, Van Donselaar W 1997 Prosody in the comprehension of spoken language: A literature review. Language and Speech 40: 141–201
Fodor J A 1983 Modularity of Mind. MIT Press, Cambridge, MA
Fodor J A, Bever T G, Garrett M F 1974 The Psychology of Language: An Introduction to Psycholinguistics and Generative Grammar. McGraw-Hill, New York
Fodor J D 1978 Parsing strategies and constraints on transformations. Linguistic Inquiry 9: 427–74
Frazier L 1987 Sentence processing: A tutorial review. Attention and Performance 12: 559–86
Frazier L 1995 Representational issues in psycholinguistics. In: Miller J L, Eimas P D (eds.) Handbook of Perception and Cognition, Vol. 11: Speech, Language, and Communication. Academic Press, San Diego
Frazier L, Clifton Jr C 1996 Construal. MIT Press, Cambridge, MA
Frazier L, Rayner K 1990 Taking on semantic commitments: Processing multiple meanings vs. multiple senses. Journal of Memory and Language 29: 181–200
Haberlandt K F 1994 Methods in reading research. In: Gernsbacher M A (ed.) Handbook of Psycholinguistics. Academic Press, San Diego, CA, pp. 1–31
Just M A, Carpenter P A 1980 A theory of reading: From eye fixations to comprehension. Psychological Review 85: 329–354
Just M A, Carpenter P A, Woolley J D 1982 Paradigms and processes in reading comprehension. Journal of Experimental Psychology: General 111: 228–38
Kjelgaard M M, Speer S R 1999 Prosodic facilitation and interference in the resolution of temporary syntactic closure ambiguity. Journal of Memory and Language 40: 153–94
Kutas M, Van Petten C K 1994 Psycholinguistics electrified: Event-related brain potential investigations. In: Gernsbacher M A (ed.) Handbook of Psycholinguistics. Academic Press, San Diego, CA, pp. 83–143
MacDonald M C, Pearlmutter N J, Seidenberg M S 1994 Lexical nature of syntactic ambiguity resolution. Psychological Review 101: 676–703
Marslen-Wilson W D 1973 Linguistic structure and speech shadowing at very short latencies. Nature 244: 522–3
McKoon G, Ratcliff R, Ward G 1994 Testing theories of language processing: An empirical investigation of the on-line lexical decision task. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 1219–28
Miller G A 1962 Some psychological studies of grammar. American Psychologist 17: 748–62
Mitchell D C 1994 Sentence parsing. In: Gernsbacher M A (ed.) Handbook of Psycholinguistics. Academic Press, San Diego, CA, pp. 375–410
Rayner K, Sereno S, Morris R, Schmauder R, Clifton Jr C 1989 Eye movements and on-line language comprehension processes. Language and Cognitive Processes 4: SI 21–50
Schafer A, Carter J, Clifton Jr C, Frazier L 1996 Focus in relative clause construal. Language and Cognitive Processes 11: 135–63
Spivey-Knowlton M, Sedivy J C 1995 Resolving attachment ambiguities with multiple constraints. Cognition 55: 227–67
Tanenhaus M K, Trueswell J C 1995 Sentence comprehension. In: Miller J, Eimas P (eds.) Handbook of Perception and Cognition: Speech, Language, and Communication. Academic Press, San Diego, CA, Vol. 11 pp. 217–62
C. Clifton, Jr.
Sequential Decision Making
Sequential decision making describes a situation where the decision maker (DM) makes successive observations of a process before a final decision is made, in contrast to dynamic decision making (see Dynamic Decision Making) which is more concerned with controlling a process over time. Formally, a sequential decision problem is defined such that the DM can take observations $X_1, X_2, \ldots$ one at a time. After each observation $X_n$ the DM can decide to terminate the process and make a final decision from a set of decisions $D$, or continue the process and take the next observation $X_{n+1}$. If the observations $X_1, X_2, \ldots$ form a random sample, the procedure is called sequential sampling. In most sequential decision problems there is an implicit or explicit cost associated with each observation. The procedure to decide when to stop taking observations and when to continue is called the stopping rule. The objective in sequential decision making is to find a stopping rule that optimizes the
decision in terms of minimizing losses or maximizing gains including observation costs. The optimal stopping rule is also called the optimal strategy or the optimal policy. A wide variety of sequential decision problems have been discussed in the statistics literature, including search problems, inventory problems, gambling problems, and secretary-type problems, including sampling with and without recall. Several methods have been proposed to solve the optimization problem under specified conditions, including dynamic programming, Markov chains, and Bayesian analysis. In the psychological literature, sequential decision problems are better known as optional stopping problems. One line of research using sequential decision making is concerned with seeking information in situations such as buying houses, searching for a job candidate, price searching, or target search. The DM continues taking observations until a decision criterion for acceptance is reached. Another line of research applies sequential decision making to account for information processing in binary choice tasks (see Diffusion and Random Walk Processes; Stochastic Dynamic Models (Choice, Response, and Time)), and hypothesis testing such as in signal detection tasks (see Signal Detection Theory: Multidimensional). The DM continues taking observations until either of two decision criteria is reached. Depending on the particular research area, observations are also called offers, options, items, applicants, information, and the like. Observation costs explicitly include not only money but possibly also time, effort, aggravation, discomfort, and so on. Contrary to the objective of statisticians or economists, psychologists are less interested in determining the optimal stopping rule, and more interested in discussing the variables that affect human decision behavior in sequential decision tasks.
Optimal decision strategies are considered as normative models, and their predictions are compared to actual choice behavior.
1. Sequential Decision Making with One Decision Criterion

In sequential decision making with one decision criterion the DM takes costly observations $X_n$, $n = 1, 2, \ldots$, of a random process one at a time. After observing $X_n = x_n$ the DM has to decide whether to continue sampling observations or to stop. In the former case, the observation $X_{n+1}$ is taken at a cost of $c_{n+1}$; in the latter case the DM receives a net payoff that consists of the payoff minus the observation costs. The DM’s objective is to find a stopping rule that maximizes the expected net payoff. The optimal stopping rule depends on the specific assumptions made about the situation: (a) the distribution of $X$ is known, not known, or partly known; (b) the $X_i$ are distributed identically for all $i$, or come from the same distribution family but with different parameters, or have different distributions; (c) the number of possible observations, $n$, is bounded or unbounded; (d) the sampling procedure, e.g., whether it is possible to take the highest value observed so far when stopping (sampling with recall) or only the last value (sampling without recall); and (e) the cost function, $c_n$, is fixed for each observation or is a function of $n$. Many of these problems have been studied theoretically by mathematicians and experimentally by psychologists. Pioneering experimental work was done in a series of papers by Rapoport and colleagues (1966, 1969, 1970, 1972).
1.1 Unknown Sample Distribution: Secretary-type Problems

Kahan et al. (1967) investigated decision behavior in a sequential search task where the DM had to find the largest of a set of $n = 200$ numbers, observed one at a time from a deck of cards. The observations were taken in random order without replacement. The DM could only declare the current observation as the largest number (sampling without recall), and could compare the number with the previously presented numbers. No explicit cost was charged for each observation, i.e., $c = 0$. The sample distribution was unknown to the DM. A reward was paid only when the card with the highest number was selected, and nothing otherwise. This describes a decision situation that is known as the secretary problem (a job candidate search problem; for various other names, see, e.g., Freeman 1983) which, in its simplest form, makes explicit the following assumptions (Ferguson 1989): (a) only one position is available, (b) the number $n$ of applicants is known, (c) applicants are interviewed sequentially in random order, each order being equally likely, (d) all applicants can be ranked without ties, and the decision to reject or accept an applicant must be based only on the relative ranks of the applicants interviewed so far, (e) an applicant once rejected cannot later be recalled, and (f) the payoff is 1 when choosing the best of the $n$ applicants, 0 otherwise. The optimal strategy for this kind of problem is to reject the first $s - 1$, $s \ge 1$, items (cards, applicants, draws) and then choose the first item that is best in the relative ranking so far. With

\[
a_s = \frac{1}{s} + \frac{1}{s+1} + \cdots + \frac{1}{n-1}, \tag{1}
\]

the optimal strategy is to stop if $a_s \le 1$ and to continue if $a_s > 1$, which can easily be determined for small $n$. For large $n$, the probability of choosing the best item is
approximated by $1/e$ and the optimal $s$ by $n/e$ ($e = 2.71\ldots$). (For derivations, see, e.g., DeGroot 1970, Freeman 1983, Gilbert and Mosteller 1966.) Kahan et al. (1967) reported that about 40 percent of their subjects did not follow the optimal strategy but stopped too late and rejected a card that should have been accepted. The failure of the optimal strategy to describe behavior was attributed to its inadequacy for the task as presented. Although at the beginning of the experiment the participants did not know anything about the distribution (as the model requires), they could learn about the distribution by taking observations (partial information). To guarantee ignorance of the distribution, Gilbert and Mosteller (1966) recommended supplying only the rank of the observation made so far and not the actual value. Seale and Rapoport (1997) conducted an experiment following this advice. They found that participants (with $n = 40$ and $n = 80$) stopped earlier than prescribed by the optimal stopping rule. They proposed simple decision rules or heuristics to describe the actual choice behavior. Using a cutoff rule, the DM rejects the first $s - 1$ applicants and then chooses the next top-ranked applicant, i.e., the candidate. The DM simply counts the number of applicants and then stops on the first candidate after observing $s - 1$ applicants. Under a candidate count rule, the DM counts the number of candidates and chooses the $j$th candidate. A successive non-candidate rule requires the DM to choose the first candidate after observing at least $k$ consecutive noncandidates following the last candidate. The secretary problem has been extended and generalized in many different directions within the mathematical statistics field. Each of the above assumptions has been relaxed in one way or another (Ferguson 1989, Freeman 1983).
However, the label of secretary problem tends to be used only when the distribution is unknown and the decision to stop or to continue depends only on the relative ranking of the observations taken so far and not on their actual values.
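The cutoff $s$ of Eqn. (1) and the resulting chance of picking the best item can be checked numerically for small $n$. The following is a minimal sketch, not part of the original studies: the function names are illustrative, a tie $a_s = 1$ is resolved by stopping (an arbitrary choice), and the win probability uses the standard expression $\frac{s-1}{n}\sum_{j=s}^{n}\frac{1}{j-1}$ for a rule that rejects the first $s - 1$ items.

```python
def a(s: int, n: int) -> float:
    """a_s = 1/s + 1/(s+1) + ... + 1/(n-1), as in Eqn. (1)."""
    return sum(1.0 / j for j in range(s, n))


def optimal_cutoff(n: int) -> int:
    """Smallest s with a_s <= 1; the first s - 1 items are rejected.

    a(n, n) is an empty sum (0), so a cutoff always exists.
    """
    return next(s for s in range(1, n + 1) if a(s, n) <= 1.0)


def win_probability(s: int, n: int) -> float:
    """Chance of choosing the best of n items when the first s - 1 are rejected
    and the first later candidate (best so far) is accepted."""
    if s == 1:
        return 1.0 / n  # the first item is accepted unconditionally
    return (s - 1) / n * sum(1.0 / (j - 1) for j in range(s, n + 1))


if __name__ == "__main__":
    for n in (4, 40, 80, 200):
        s = optimal_cutoff(n)
        print(n, s, round(win_probability(s, n), 4))
```

For large $n$ the printed probabilities should approach $1/e$ and the cutoffs $n/e$, in line with the approximation quoted above.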
1.2 Known Sample Distribution

Rapoport and Tversky (1966, 1970) investigated choice behavior when the mean and the variance of the distribution were known to the DM. The cost for each observation was fixed but the amount varied across experimental conditions, and the number of possible observations $n$ was unbounded (1966) or bounded and known (1970). Behavior for sampling with and without recall was compared. When sampling is without recall only the value of the last observation, $X_n = x_n$, can be received, and the payoff is this value minus the total sampling cost, i.e., $x_n - cn$. The optimal strategy is to find a stopping rule that maximizes the expected payoff $E(X_N - cN)$. When sampling is with recall, the highest value observed so far can be selected and the payoff is $\max(x_1, \ldots, x_n) - cn$, and the optimal strategy is to find a stopping rule that maximizes $E(\max(X_1, \ldots, X_N) - cN)$. In the following, $v$ with subscripts and $v^*$ denote the expected gain from an (optimal) procedure.
1.2.1 Number of Observations Unbounded. If $n$ is unbounded, i.e., if the number of observations that can be taken is unlimited, and $X_1, X_2, \ldots$ are sampled from a known distribution function $F(x)$, the optimal strategy is the same for both sampling with and without recall. In particular, the optimal strategy is to continue to take observations whenever the observed value $x_j < v^*$, and to stop taking observations as soon as an observed value $x_j \ge v^*$, where $v^*$ is the unique solution of

\[
\int_{v^*}^{\infty} (x - v^*)\, dF(x) = c, \qquad -\infty < v^* < \infty. \tag{2}
\]

When the observations are taken from a standard normal distribution with density function $\varphi(x)$ and distribution function $\Phi(x)$, we have that

\[
v^* = \frac{\varphi(v^*) - c}{1 - \Phi(v^*)}. \tag{3}
\]
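Equation (3) defines $v^*$ only implicitly, so in practice it has to be found numerically. The sketch below is one way to do this, assuming a standard normal distribution and a cost $c$ that is positive and not astronomically small; the function names are illustrative. It uses the fact that the left-hand side of Eqn. (2) equals $\varphi(v) - v\,(1 - \Phi(v))$ for a standard normal and decreases in $v$, so plain bisection suffices.

```python
import math


def phi(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def upper_tail(x: float) -> float:
    """1 - Phi(x), computed via erfc to limit cancellation for large x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def expected_excess(v: float) -> float:
    """Left-hand side of Eqn. (2) for a standard normal: E[(X - v)^+]."""
    return phi(v) - v * upper_tail(v)


def reservation_value(c: float, tol: float = 1e-12) -> float:
    """Solve expected_excess(v) = c by bisection.

    Assumes c > 0 and c larger than about 1e-20, so that the root lies
    between the brackets used below.
    """
    lo, hi = -c - 1.0, 10.0  # expected_excess(lo) >= c + 1; expected_excess(hi) is tiny
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_excess(mid) > c:
            lo = mid  # excess still above the cost: root is to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for cost in (0.05, 0.1, 0.5):
        print(cost, round(reservation_value(cost), 4))
```

Higher observation costs give lower values of $v^*$, which matches the intuition that costly sampling should make the DM accept less attractive offers.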
Although sampling with and without recall have the same solution, they seem to be different from a psychological point of view. Rapoport and Tversky (1966) found that the group sampling without recall took significantly fewer observations than the participants sampling with recall. The mean number of observations for both groups decreased with increasing cost $c$, and the difference between the groups in the number of observations taken diminished. However, the participants in both groups took fewer observations than prescribed by the optimal strategy. This nonoptimal behavior of the participants was attributed to a lack of thorough knowledge of the distributions.

1.2.2 Number of Observations Bounded. If $n$, $n \ge 2$, is bounded, i.e., if not more than $n$ observations can be taken, the optimal stopping rules for sampling with and without recall are different. For sampling without recall, an optimal procedure is to continue taking observations whenever $x_j < v_{n-j} - c$ and to stop as soon as $x_j \ge v_{n-j} - c$, where $j = 1, 2, \ldots, n$ indicates the number of observations which remain available and

\[
v_{j+1} = (v_j - c) + \int_{v_j - c}^{\infty} \bigl(x - (v_j - c)\bigr)\, dF(x). \tag{4}
\]

With $v_1 = E(X) - c$, the sequence can be computed successively. Again, assuming a standard normal distribution, $v_{j+1} = \varphi(v_j - c) + (v_j - c)\,\Phi(v_j - c)$. For sampling with recall, the optimal strategy is to continue the process whenever a value $x_j < v^*$ and to stop taking observations as soon as an observed value $x_j \ge v^*$, where $v^*$ is as in Eqn. (2), which is the same solution as for $n$ unbounded. (For derivations of the strategies, see DeGroot 1970, Sakaguchi 1961.) Rapoport and Tversky (1970) investigated choice behavior within this scenario. Sampling was done both with and without recall. The number of observations that could be taken as well as observation cost varied across experimental groups. One third of the participants did not follow the optimal strategy. Under both sampling procedures and all cost conditions, they took on average fewer observations than predicted by the corresponding optimal stopping rules. There were no systematic differences due to cost, as observed in their previous study. They concluded that ‘the optimal model provides a reasonable good account of the behavior of the subjects’ (p. 119).
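The backward computation for the bounded case without recall can be illustrated with a short sketch. It is written in a parametrization chosen here for clarity, which may be indexed differently from the one in the text: $w_j$ is the expected net gain of the optimal rule when $j$ observations are still available, counting the cost of every further observation, so that $w_1 = E(X) - c$, $w_{j+1} = E[\max(X, w_j)] - c$, and the DM stops on an observed value $x$ exactly when $x \ge w_j$. A standard normal distribution is assumed, for which $E[\max(X, a)] = \varphi(a) + a\,\Phi(a)$; all names are illustrative.

```python
import math


def phi(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def continuation_values(n: int, c: float) -> list:
    """Return [w_1, ..., w_n] for sampling without recall.

    w_j is the expected net gain of continuing optimally with j observations
    left; it is also the stopping threshold for the value currently in hand.
    """
    w = [0.0 - c]  # one observation left: E(X) - c, with E(X) = 0
    for _ in range(n - 1):
        last = w[-1]
        # E[max(X, last)] = phi(last) + last * Phi(last) for a standard normal X
        w.append(phi(last) + last * Phi(last) - c)
    return w


if __name__ == "__main__":
    for j, value in enumerate(continuation_values(10, 0.1), start=1):
        print(j, round(value, 4))
```

As the number of remaining observations grows, the thresholds rise toward the value $v^*$ that solves Eqn. (2), which is the sense in which the unbounded problem is the limit of the bounded one.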
1.3 Different Sample Distributions for Each Observation

Most research concerned with sequential decision making assumes that the observations are sampled from the same distribution, i.e., the $X_i$ are distributed identically for all $i$. For many decision situations, however, the observations may be sampled from the same distribution family with different parameters, or from different distributions. Especially in economic areas, such as price search, it is reasonable to assume that the distributions from which observations are taken change over time. The sequence of those samples has been called a nonstationary series. Of particular interest are two special nonstationary series: ascending and descending series. For ascending series, the observations are drawn from distributions, usually normal distributions, whose mean increases as $i$ increases; for descending series the mean of the distribution decreases as $i$ increases, $i$ indicating the sample index. For both cases, experiments have been conducted to investigate choice behavior in a changing environment. Shapira and Venezia (1981) compared choice behavior for ascending, descending, and constant (identically distributed $X_i$) series. In one experiment (numbers from a deck of cards), the distributions were known to the DM; no explicit observation costs were imposed; sampling occurred without recall; and the number of observations that could be taken was limited to $n = 7$. The variance of the distributions varied across experimental groups. An optimal procedure was assumed to continue taking observations whenever $x_j < v_{n-j}$, and to stop as soon as $x_j \ge v_{n-j}$, where $j = 1, 2, \ldots, n$ indicates the number of observations which remain available. $k = 1, \ldots, n$ indicates the specific distribution for the $j$th observation. Thus

\[
v_{j+1} = v_j + \int_{v_j}^{\infty} (x - v_j)\, dF_k(x). \tag{5}
\]

With $v_1 = E(X_1)$ the sequence can be computed successively. Assuming a standard normal distribution, $v_{j+1} = \varphi_k(v_j) + v_j\,\Phi_k(v_j)$. Across all conditions, 58 percent of the participants behaved in an optimal way. The proportion of optimal stopping did not depend on the type of series but on the size of the variance. Nonoptimal stopping (24 percent stopped too early; 18 percent too late) depended on the series and on the size of the variance. In particular, participants stopped too early on ascending and too late on descending series. A similar result was observed by Brickman (1972). In this study, departing from the optimal stopping rule was attributed to an inadequacy of the stopping rule taken for the particular experimental conditions (assuming complete knowledge of the distributions). In a secretary problem design (see Sect. 1.1), Corbin et al. (1975) were less concerned with optimal choice behavior than with the processes by which the participants made their selections, and with factors that influenced those processes. The emphasis of the investigation was on decision making heuristics rather than on the adequacy of optimal models. With the same optimal stopping rule for all experimental conditions, they found that stopping behavior depended on contextual variables such as the ascending or descending trend of the inspected numbers of the stack.
2. Search Problems—Multiple Information Sources In a sequential decision making task with multiple information sources, the DM has the option to take information sequentially from different sources. Each information source may provide valid information with a particular probability and at different cost. The task is not only to decide to stop or to continue the process but also, if continuing, which source of information to consult. Early experimental studies were done by Kanarick et al. (1969), Rapoport (1969), Rapoport et al. (1972). A typical task is to find an object (e.g., a black ball) which is hidden in one of several possible locations (e.g., in one of several bins containing white balls). The optimal search strategy depends on further task specifications, such as whether the object can move from one location to another, how many objects are to be found, and whether the search process may stop before the object has been found. Rapoport (1969)
investigated the case when a single object that could not move was to be found in one of $r$, $r \ge 2$, possible locations. The DM was not allowed to stop the process before the target was found. All of the following were known to the DM: the a priori probability $p_i$, $p_i > 0$, that the object is in location $i$, $i = 1, 2, \ldots, r$, with $\sum_{i=1}^{r} p_i = 1$; a miss probability $\alpha_i$, $0 < \alpha_i < 1$, that even if the object is in location $i$ it will not be found in a particular search of that location ($1 - \alpha_i$ is referred to as the respective detection probability); and a cost, $c_i$, for a single observation at location $i$. The objective of the DM is to find a search strategy that minimizes the expected cost. For $i = 1, \ldots, r$ and $j = 1, 2, \ldots$ let $\Pi_{ij}$ denote the probability that the object is found at location $i$ during the $j$th search and the search is terminated. Then

\[
\Pi_{ij} = p_i\, \alpha_i^{\,j-1} (1 - \alpha_i), \qquad i = 1, \ldots, r, \quad j = 1, 2, \ldots \tag{6}
\]

If all values of $\Pi_{ij}/c_i$ for all values of $i$ and $j$ are arranged in order of decreasing magnitude, the optimal strategy is to search according to this ordering (for derivations, see DeGroot 1970). Ties may be ordered arbitrarily among themselves. The optimal strategy is determined by the detection probabilities and observation costs, and optimal search behavior implies a balance between maximizing the detection probability and minimizing the observation cost. Rapoport (1969) found that participants did not behave optimally. They were more concerned with maximizing the probability of detecting the target than with minimizing observation cost. Increasing the difference in observation cost $c_i$ among the $i = 1, 2, 3, 4$ locations increased the deviation from the optimal strategy even further. Rapoport et al. (1972) varied the search problems by allowing the DM to terminate the search at any time; adding a terminal reward, $R$, for finding the target; and a terminal penalty, $B$, for not finding the target. Most participants showed a bias toward maximizing detection probability vs. minimizing search cost per observation, similar to the previous study.
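The ordering rule based on Eqn. (6) can be sketched in a few lines. Because $\Pi_{ij}/c_i$ decreases in $j$ for every location, the full ordering can be produced lazily by always searching the location whose next ratio is currently largest. The function name, the 0-based location indices, and the example numbers below are illustrative assumptions; ties are broken by location index, which is one of the arbitrary orderings the rule permits.

```python
import heapq


def search_order(p, alpha, cost, n_steps):
    """Return the first n_steps locations (0-based) of the optimal search order.

    p[i]     prior probability that the object is in location i
    alpha[i] miss probability for one search of location i
    cost[i]  cost of one search of location i
    """
    # Heap entries are (-ratio, location); the ratio for the first search of
    # location i is p_i * (1 - alpha_i) / c_i, following Eqn. (6).
    heap = [(-(p[i] * (1.0 - alpha[i]) / cost[i]), i) for i in range(len(p))]
    heapq.heapify(heap)
    order = []
    for _ in range(n_steps):
        neg_ratio, i = heapq.heappop(heap)
        order.append(i)
        # The next search of the same location has its ratio scaled by alpha_i.
        heapq.heappush(heap, (neg_ratio * alpha[i], i))
    return order


if __name__ == "__main__":
    priors = [0.5, 0.3, 0.2]
    misses = [0.5, 0.2, 0.1]
    print(search_order(priors, misses, [1.0, 1.0, 1.0], 6))
    print(search_order(priors, misses, [5.0, 1.0, 1.0], 6))
```

In the second call the most probable location becomes five times as expensive to inspect and drops back in the order, which is the trade-off between detection probability and cost that participants in the experiments tended to neglect.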
3. Sequential Decision Making with Two or More Possible Decisions

A random sample $X_1, X_2, \ldots$ is generated by an unknown state of nature, $\Theta$. The DM can take observations one at a time. After observing $X_n = x_n$ the DM makes inferences about $\Theta$ based on the values of $X_1, \ldots, X_n$ and can decide whether to continue sampling observations or to stop the process. In the former case, observation $X_{n+1}$ is taken; in the latter, the DM makes a final decision $d \in D$. The consequences to the DM depend on the decision $d$ and the value $\theta$.

The statistical theory for this situation was developed by Wald during the 1940s. It has been used to test hypotheses and estimate parameters. In psychological research, sequential decision making of this kind is usually limited to two decisions, $D = \{d_1, d_2\}$, and applied to binary choice tasks (see Diffusion and Random Walk Processes; Stochastic Dynamic Models (Choice, Response, and Time); Bayesian Theory: History of Applications). The standard theory of sequential analysis by Wald (1947) does not include considerations of observation costs $C(n)$, losses for terminal decisions $L(\theta, d)$, and a priori (subjective) probabilities $\pi$ of the alternative states of nature. Deferred decision theory generalizes the original theory by including these variables explicitly. The objective of the DM is to find a stopping rule that minimizes expected loss (called risk) and expected observation cost. The form of that optimal stopping rule depends mainly on the assumptions about the number of observations that can be taken (bounded or unbounded), and on the assumption of cost per observation (fixed or not) (see DeGroot 1970). Birdsall and Roberts (1965), Edwards (1965), and Rapoport and Burkheimer (1971) introduced the idea of deferred decision theory as normative models of choice behavior to the psychological community. Experiments investigating human behavior in deferred decision tasks have been carried out by Pitz and colleagues (e.g., Pitz et al. 1969), and by Busemeyer and Rapoport (1988). Rapoport and Wallsten (1972) summarize experimental findings.

For illustration, assume the decision problem in its simplest form. Suppose two possible states of nature, $\theta_1$ or $\theta_2$, and two possible decisions, $d_1$ and $d_2$. Cost $c$ per observation is fixed and the number of observations is unbounded. The DM does not know which of the states of nature, $\theta_1$ or $\theta_2$, is generating the observations, but there are a priori probabilities $\pi$ that it is $\theta_1$ and $(1 - \pi)$ that it is $\theta_2$. Let $w_i$ denote the loss for a terminal decision incurred by the DM in deciding that $\theta_i$ is not the correct state of nature when it actually is ($i = 1, 2$). No losses are assumed when the DM makes a correct decision. Let $\pi_n$ denote the posterior probability that $\theta_1$ is the correct state of nature generating the observations after $n$ observations have been made. The total posterior expected loss is $r_n = \min\{w_1 \pi_n, w_2 (1 - \pi_n)\} + nc$. The DM’s objective is to minimize the expected loss. An optimal stopping rule is specified in terms of decision boundaries, $\alpha$ and $\beta$. If the posterior probability is greater than or equal to $\alpha$, then decision $d_1$ is made; if the posterior probability is smaller than or equal to $\beta$, then $d_2$ is selected; otherwise sampling continues.

See also: Decision Making (Naturalistic), Psychology of; Decision Making, Psychology of; Decision Research: Behavioral; Dynamic Decision Making; Multi-attribute Decision Making in Urban Studies
Sequential Statistical Methods
Bibliography

Birdsall T G, Roberts R A 1965 Theory of signal detectability: Deferred decision theory. The Journal of the Acoustical Society of America 37: 1064–74
Brickman P 1972 Optional stopping on ascending and descending series. Organizational Behavior and Human Performance 7: 53–62
Busemeyer J R, Rapoport A 1988 Psychological models of deferred decision making. Journal of Mathematical Psychology 32(2): 91–133
Corbin R M, Olson C L, Abbondanza M 1975 Context effects in optional stopping decisions. Organizational Behavior and Human Performance 14: 207–16
DeGroot M H 1970 Optimal Statistical Decisions. McGraw-Hill, New York
Edwards W 1965 Optimal strategies for seeking information: Models for statistics, choice response times, and human information processing. Journal of Mathematical Psychology 2: 312–29
Ferguson T S 1989 Who solved the secretary problem? Statistical Science 4(3): 282–96
Freeman P R 1983 The secretary problem and its extensions: A review. International Statistical Review 51: 189–206
Gilbert J P, Mosteller F 1966 Recognizing the maximum of a sequence. Journal of the American Statistical Association 61: 35–73
Kahan J P, Rapoport A, Jones L E 1967 Decision making in a sequential search task. Perception & Psychophysics 2(8): 374–6
Kanarick A F, Huntington J M, Peterson R C 1969 Multisource information acquisition with optional stopping. Human Factors 11: 379–85
Pitz G F, Reinhold H, Geller E S 1969 Strategies of information seeking in deferred decision making. Organizational Behavior and Human Performance 4: 1–19
Rapoport A 1969 Effects of observation cost on sequential search behavior. Perception & Psychophysics 6(4): 234–40
Rapoport A, Tversky A 1966 Cost and accessibility of offers as determinants of optional stopping. Psychonomic Science 4: 45–6
Rapoport A, Tversky A 1970 Choice behavior in an optional stopping task. Organizational Behavior and Human Performance 5: 105–20
Rapoport A, Burkheimer G J 1971 Models of deferred decision making. Journal of Mathematical Psychology 8: 508–38
Rapoport A, Lissitz R W, McAllister H A 1972 Search behavior with and without optional stopping. Organizational Behavior and Human Performance 7: 1–17
Rapoport A, Wallsten T S 1972 Individual decision behavior. Annual Review of Psychology 23: 131–76
Sakaguchi M 1961 Dynamic programming of some sequential sampling design. Journal of Mathematical Analysis and Applications 2: 446–66
Seale D A, Rapoport A 1997 Sequential decision making with relative ranks: An experimental investigation of the ‘secretary problem.’ Organizational Behavior and Human Decision Processes 69(3): 221–36
Shapira Z, Venezia I 1981 Optional stopping on nonstationary series. Organizational Behavior and Human Performance 27: 32–49
Wald A 1947 Sequential Analysis. Wiley, New York
A. Diederich
Statistics plays two fundamental roles in empirical research. One is in determining the data collection process: the experimental design. The other is in analyzing the data once it has been collected. For the purposes of this article, two types of experimental designs are distinguished: sequential and nonsequential. In a sequential design the data that accrue in an experiment can affect the future course of the experiment. For example, an observation made on one experimental unit treated in a particular way may determine the treatment used for the next experimental unit. The term ‘adaptive’ is commonly used as an alternative to sequential. In a nonsequential design the investigator can carry out the entire experiment without knowing any of the interim results. The distinction between sequential and nonsequential is murky. An investigator’s ability to carry out an experiment exactly as planned is uncertain, as information that becomes available from within and outside the experiment may lead the investigator to amend the design. In addition, a nonsequential experiment may give results that encourage the investigator to run a second experiment, one that might even simply be a continuation of the first. Considered separately, both experiments are nonsequential, but the larger experiment that consists of the two separate experiments is sequential. In a typical type of nonsequential design, 20 patients suffering from depression are administered a drug and their improvements are assessed. An example sequential variation is the following. Patients’ improvements are recorded ‘in sequence’ during the experiment. The experiment stops should it happen that at least nine of the first 10 patients, or no more than one of the first 10 patients improve(s). On the other hand, if between two and eight of the first 10 patients improve then sampling continues to a second set of 10 patients, making the total sample size equal to 20 in that case. 
Another type of sequential variation is when the dose of the drug is increased for the second 10 patients should it happen that fewer than four of the first 10 improve. Much more complicated sequential designs are possible. For example, the first patient may be assigned a dose in the middle of a range of possible doses. If the patient improves then the next patient is assigned the next lower dose, and if the first patient does not improve then the next patient is assigned the next higher dose. This process continues, always dropping the dosage if the immediately preceding patient improved, and increasing the dosage if the immediately preceding patient did not improve. This is called an ‘up-and-down’ design. Procedures in which batches of experimental units (such as groups of 10 patients each) are analyzed before proceeding to the next stage of the experiment
are called ‘group-sequential.’ Designs such as the up-and-down design, in which the course of the experiment can change after each experimental unit responds, are called ‘fully sequential.’ So a fully sequential design is a group-sequential design in which the group size is one. Designs in which the decision of when to stop the experiment depends on the accumulating results are called ‘sequential stopping.’ Using rules to determine which treatments to assign to the next experimental unit or batch of units is called ‘sequential allocation.’ Designs of most scientific experiments are sequential, although perhaps not formally so. Investigators usually want to conserve time and resources. In particular, they do not want to continue an experiment if they have already learned what they set out to learn, and this is so whether their conclusion is positive or negative, or if finding a conclusive answer would be prohibitively expensive. (An experiment in which the investigator discovers that the standard deviation of the observations is much larger than originally thought is an example: continuing would be prohibitively expensive because the required sample size would be large.) Sequential designs are difficult or impossible to use in some investigations. For example, results might take a long time to obtain, and waiting for them would mean delaying other aspects of the experiment. Suppose one is interested in whether grade-schoolers diagnosed with attention deficit hyperactivity disorder (ADHD) should be prescribed Ritalin. The outcome of interest is whether children on Ritalin will be addicted to drugs as adults. Consider assigning groups of 10 children to Ritalin and 10 to a placebo, and waiting to observe their outcomes before deciding whether to assign an additional group of 10 patients to each treatment. The delay in observation means that it would probably take hundreds of years to get an answer to the overall question.
The long-term nature of the endpoint means that any reasonable experiment addressing this question would necessarily be nonsequential, with large numbers of children assigned to the two groups before any information at all would become available about the endpoint.
1. Analyzing Data from Sequential Experiments: Frequentist Case

Consider an experiment of a particular type, say one to assess extrasensory perception (ESP) ability. A subject claiming to have ESP is asked to choose between two colors. The null hypothesis of no ability is that the subject is only guessing, in which case the correct color has a probability of 1/2. Suppose the subject gets 13 correct out of 17 tries. How should these results be analyzed and reported? The answer depends on one’s statistical philosophy. Frequentists and Bayesians
take different approaches. Frequentist analyses depend on whether the experiment’s design is sequential, and if it is sequential the conclusions will differ depending on the actual design used. In the nonsequential case the subject is given exactly 17 tries. The frequentist P-value is the probability of results as extreme as or more extreme than those observed. The results are said to be ‘statistically significant’ if the P-value is less than 5 percent. A convention is to include both 13 or more successes and 13 or more failures. (This ‘two-sided’ case allows for the possibility that the subject has ESP but has inverted the ‘extrasensory’ signals.) Assuming the null hypothesis and that the tries are independent, the number of successes has a binomial distribution. Binomial probabilities can be approximated using the normal distribution. The z-score for 13 out of 17 is about 2, and so the probability of 13 or more successes or 13 or more failures is about 0.05 (the exact binomial probability is 0.049), and so the results are statistically significant at the 5 percent level and the null hypothesis is rejected. Now suppose the experiment is sequential. The frequentist significance level is now different, and it depends on the actual design used. Suppose the design is to sample until the subject gets at least four successes and at least four failures (same data, different design). Here, results as extreme as or more extreme than those observed are 13 or more successes (and exactly four failures) or 13 or more failures (and exactly four successes). The total probability of these values is 0.021, less than 0.049, and so the results are now more highly significant than if the experiment’s design had been nonsequential. Consider another sequential design, one of a type of group-sequential designs commonly used in clinical trials. The experimental plan is to stop at 17 tries if 13 or more are successes or 13 or more are failures, and hence the experiment is stopped on target.
But if after 17 tries the number of successes is between five and 12 then the experiment continues to a total of 44 tries. If at that time, 29 or more are successes or 29 or more are failures then the null hypothesis is rejected. To set the context, suppose the experiment is nonsequential, with sample size fixed at 44 and no possibility of stopping at 17; then the exact significance level is again 0.049. When using a sequential design, one must consider all possible ways of rejecting the null hypothesis in calculating a significance level. In the group-sequential design there are more ways to reject than in the nonsequential design with the sample size fixed at 17 (or fixed at 44). The overall probability of rejecting is greater than 0.049 but is somewhat less than 0.049 + 0.049 because some sample paths that reject the null hypothesis at sample size 17 also reject it at sample size 44. The total probability of rejecting the null hypothesis for this design is actually 0.080. Therefore, even though the results beyond the first 17 observations are never observed, the fact that they might have been observed makes 13 successes of 17 no longer statistically significant (since 0.08 is greater than 0.05). The three designs above are summarized in Table 1. The table includes a fourth design in which the significance level cannot be found.

Table 1 Summary of experimental designs and related significance levels

Stopping rule                                               Significance level
After 17 observations (nonsequential)                       0.049
After at least 4 successes and 4 failures                   0.021
After 17 or 44 observations, depending on interim results   0.08
Stop when ‘you think you know the answer’                   Undefined

To preserve a 0.05 significance level in group-sequential or fully sequential designs, investigators must adopt more stringent requirements for stopping and rejecting the null hypothesis; that is, they must include fewer observations in the region where the null hypothesis is rejected. For example, the investigator in the above study might drop 13 successes or failures in 17 tries and 29 successes or failures in 44 tries from the rejection region. The investigator would stop and claim significance only if there are at least 14 successes or at least 14 failures in the first 17 tries, and claim significance after 44 tries only if there are at least 30 successes or at least 30 failures. The nominal significance levels (those appropriate had the experiment been nonsequential) at n = 17 and n = 44 are 0.013 and 0.027, and the overall (or adjusted) significance level of rejecting the null hypothesis is 0.032. (No symmetric rejection regions containing more observations allow the significance level to be greater than this but still smaller than 0.05.) With this design, 13 successes out of 17 is not statistically significant (as indicated above) because this data point is not in the rejection region. The above discussion is in the context of significance testing. But the same issues apply in all types of frequentist inferences, including confidence intervals. The implications of the need to modify rejection regions depending on the design of an experiment are profound. In view of the penalties that an investigator pays in significance level that are due to repeated analyses of accumulating data, investigators strive to minimize the number of such analyses.
They shy away from using sequential designs and so may miss opportunities to stop or otherwise modify the experiment depending on accumulating results. What happens if investigators fail to reveal that other analyses did occur, or that the experiment might have continued had other results been observed? Any frequentist conclusion that fails to take the other analyses into account is meaningless. Strictly speaking, 13924
this is a breach of scientific ethics when carrying out frequentist analyses. But it is difficult to find fault with investigators who do not understand the subtleties of frequentist reasoning and who fail to make necessary adjustments to their inferences. For more information about the frequentist approach to sequential experimentation, see Whitehead (1992).
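The adjusted significance level of a two-stage design such as the one described above can be checked by direct enumeration under the null hypothesis. The following Python sketch is illustrative only. It assumes a success probability of 1/2 under the null hypothesis, and it assumes that the experiment always continues to 44 observations when the first 17 do not fall in the rejection region. The function names are hypothetical and do not come from the article.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent tries."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_tail(n, hi):
    """Nominal level: P(at least hi successes or at least hi failures) in n tries."""
    lo = n - hi  # at least hi failures means at most n - hi successes
    return sum(binom_pmf(k, n) for k in range(n + 1) if k >= hi or k <= lo)


def overall_level(n1=17, hi1=14, n2=44, hi2=30):
    """Overall level of the two-stage design (assumed continuation rule:
    go on to n2 observations whenever stage 1 does not reject)."""
    lo1, lo2 = n1 - hi1, n2 - hi2
    level = 0.0
    for k1 in range(n1 + 1):
        p_stage1 = binom_pmf(k1, n1)
        if k1 >= hi1 or k1 <= lo1:
            level += p_stage1  # stop and reject at the first look
            continue
        for k2 in range(n2 - n1 + 1):
            total = k1 + k2
            if total >= hi2 or total <= lo2:
                level += p_stage1 * binom_pmf(k2, n2 - n1)
    return level


nominal_17 = two_sided_tail(17, 14)
nominal_44 = two_sided_tail(44, 30)
overall = overall_level()
print(round(nominal_17, 4), round(nominal_44, 4), round(overall, 4))
```

The nominal level at n = 17 is 1668/131072, or about 0.0127. The overall level necessarily lies between that value and the sum of the two nominal levels, which is the penalty for the second look that the text describes.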
2. Analyzing Data from Sequential Experiments: Bayesian Case
When taking a Bayesian approach (see Bayesian Statistics) (or a likelihood approach), conclusions are based only on the observed experimental results and do not depend on the experiment’s design. So the murky distinction that exists between sequential and nonsequential designs is irrelevant in a Bayesian approach. In the example considered above, 13 successes out of 17 tries will give rise to the same inference in each of the designs considered. Bayesian conclusions depend only on the data actually observed and not otherwise on the experimental design (Berger and Wolpert 1984, Berry 1987).

The Bayesian paradigm is inherently sequential. Bayes’s theorem prescribes the way learning takes place under uncertainty. It specifies how an observation modifies one’s state of knowledge (Berry 1996). Moreover, each observation that is planned has a probability distribution. After 13 successes in 17 tries, the probability of success on the next try can be found. This requires a distribution, called a ‘prior distribution,’ for the probability of success on the first of the 17 tries. Suppose the prior distribution is uniform from zero to one. (This is symmetric about the null hypothesis of 1/2, but it is unlikely to be anyone’s actual prior distribution in the case of ESP because it gives essentially all the probability to some ESP ability.) The predictive probability of a success on the 18th try is then (13 + 1)/(17 + 2) = 0.737, called ‘Laplace’s rule of succession’ (Berry 1996, p. 204). Whether to take this 18th observation can be evaluated by weighing the additional knowledge gained (having 14 successes out of 18, with probability 0.737, or 13 successes out of 18, with probability 0.263) against the costs associated with the observation.

Predictive probabilities are fundamental in a Bayesian approach to sequential experimentation. They indicate how likely it is that the various possibilities for future data will happen, given the data currently available. Suppose that after 13 successes of 17 tries one is entertaining taking an additional 27 observations. One may be interested in getting at least 30 successes out of the total of 44 observations, which means at least 17 of the additional 27 observations are successes. The predictive probability of this is about 50 percent. Or one may be interested in getting successes in at least 1/2 (22) of the 44 tries. The corresponding predictive probability is 99.5 percent.

The ability to use Bayes’s theorem for updating one’s state of knowledge and the use of predictive probabilities make the Bayesian approach appealing to researchers in the sequential design of experiments. As a consequence, many researchers who prefer a frequentist perspective use the Bayesian approach in the context of sequential experimentation. If they are interested in finding the frequentist operating characteristics (such as significance level and power), these can be calculated by simulation.

The next section (Sect. 3) considers a special type of sequential experiment. The goals of the section are to describe some of the calculational issues that arise in solving sequential problems and to convey some of the interesting aspects of sequential problems. It takes a Bayesian perspective.
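Predictive probabilities of the kind discussed above can be computed from the beta-binomial distribution. The following Python sketch is a minimal illustration, assuming a Beta(a, b) prior with the uniform prior as the default (a = b = 1). The function names are hypothetical, and the values it prints depend on the prior assumed.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def predictive_pmf(k, m, s, f, a=1.0, b=1.0):
    """P(k successes in m future tries | s successes and f failures so far),
    under a Beta(a, b) prior on the success probability (assumed prior)."""
    a_post, b_post = a + s, b + f  # posterior is Beta(a + s, b + f)
    return comb(m, k) * exp(
        log_beta(a_post + k, b_post + m - k) - log_beta(a_post, b_post)
    )


def predictive_tail(k_min, m, s, f, a=1.0, b=1.0):
    """P(at least k_min successes in m future tries | data so far)."""
    return sum(predictive_pmf(k, m, s, f, a, b) for k in range(k_min, m + 1))


# Laplace's rule of succession: one more try after 13 successes, 4 failures.
next_success = predictive_pmf(1, 1, 13, 4)  # equals 14/19
print(round(next_success, 3))

# Events for 27 further tries: at least 17 more successes (30 of 44 in total)
# and at least 9 more successes (22 of 44 in total).
print(predictive_tail(17, 27, 13, 4))
print(predictive_tail(9, 27, 13, 4))
```

With a single future try the formula reduces to (s + 1)/(s + f + 2), which is the rule of succession quoted in the text.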
3. Sequential Allocation of Experiments: Bandit Problems
In many types of experiments, including many clinical trials, experimental units are randomized in a balanced fashion to the candidate treatments. The advantage of a balanced design is that it gives maximal information about the differences between treatments. In some types of experiment, including some clinical trials, it may be important to obtain good results on the units that are part of the experiment. In such cases treatments (or arms) are assigned based on accumulating results; that is, assignment is sequential. The goal is to maximize the overall effectiveness for the units in the experiment and also, perhaps, for units not actually in the experiment but whose treatment might benefit from information gained in the experiment.

Specifying a design is difficult. The first matter to be considered is the arm selected for the initial unit. Suppose that the first observation is X_1. The second component of the design is the arm selected next, given X_1 and also given the first arm selected. The third component depends on X_1 and the second observation X_2, and on the corresponding arms selected. And so on. A design is optimal if it maximizes the expected number of successes. An arm is optimal if it is the first selection of an optimal design.

Consider temporarily an experiment with n units and two available arms. Outcomes are dichotomous: arm 1 has success probability p_1 and arm 2 has success probability p_2. The goal is to maximize the expected number of successes among the n units. Arm 1 is standard and has known success proportion p_1. Arm 2 has unknown efficacy. Uncertainty about p_2 is given in terms of a prior probability distribution. To be specific, suppose that this is uniform on the interval from 0 to 1.
Table 2 Possible designs and associated expected number of successes

Design        Expected number of successes
{1; 1, 1}     2p_1
{1; 1, 2}     p_1 + p_1^2 + (1 - p_1)/2
{1; 2, 1}     p_1 + p_1/2 + (1 - p_1)p_1
{1; 2, 2}     p_1 + 1/2
{2; 1, 1}     1/2 + p_1
{2; 1, 2}     1/2 + (1/2)p_1 + (1/2)(1/3)
{2; 2, 1}     1/2 + (1/2)(2/3) + (1/2)p_1
{2; 2, 2}     2(1/2) = 1
If n = 1 then the design requires only an initial selection, arm 1 or arm 2. Choosing arm 1 has expected number of successes p_1. Choosing arm 2 has conditional expected number of successes p_2, and an unconditional expected number of successes, the prior probability of success, which is 1/2. Therefore arm 1 is optimal if p_1 ≥ 1/2 and arm 2 is optimal if p_1 ≤ 1/2. (Both arms, and any randomization between them, are optimal when p_1 = 1/2.)

The problem is more complicated for n ≥ 2. Consider n = 2. There are two initial choices and two choices depending on the result of the first observation. There are eight possible designs. One can write a design as {a; a_S, a_F}, where a is the initial selection, a_S is the next selection should the first observation be a success, and a_F is the next selection should the first observation be a failure. To find the expected number of successes for a particular design, one needs to know such quantities as the probability of a success on arm 2 after a success on arm 2 (which is 2/3) and the probability of a success on arm 2 after a failure on arm 2 (which is 1/3).

The possible designs and their associated expected numbers of successes are given in Table 2. It is easy to check that only three of these designs, {1; 1, 1}, {2; 2, 1}, and {2; 2, 2}, are candidates for the maximum. If p_1 ≥ 5/9 then {1; 1, 1} is optimal; if 1/3 ≤ p_1 ≤ 5/9 then {2; 2, 1} is optimal; and if p_1 ≤ 1/3 then {2; 2, 2} is optimal. For example, if p_1 = 1/2 then it is optimal to use the unknown arm 2 initially. If the outcome is a success, then a decision is made to ‘stay with a winner’ and use arm 2 again. If a failure occurs, then the decision is made to switch to the known arm 1.

Enumeration of designs is tedious for large n. Most designs can be dropped from consideration based on theoretical results (Berry and Fristedt 1985). For example, there is a breakeven value of p_1, say p_1*, such that arm 1 is optimal for p_1 ≥ p_1*. Also, one need consider only those designs that continue to use arm 1 once it has been selected. But many designs remain. Backward induction can be used to find an optimal design (Berry and Fristedt 1985).
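The backward induction mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of Berry and Fristedt (1985): it assumes a known p_1 and a uniform prior on p_2, keeps the counts of successes and failures on arm 2 as the state, and does not exploit the result that arm 1 need never be abandoned. The function names are hypothetical, and the bisection step assumes that the advantage of arm 1 as a first selection increases with p_1.

```python
from functools import lru_cache


def first_selection_values(n, p1):
    """Expected successes over n units when the first unit gets arm 1 or arm 2
    and all later selections are optimal (uniform prior on p2 assumed)."""

    @lru_cache(maxsize=None)
    def value(m, s, f):
        # m units remain; arm 2 has shown s successes and f failures so far.
        if m == 0:
            return 0.0
        q = (s + 1) / (s + f + 2)  # predictive success probability on arm 2
        use_arm1 = p1 + value(m - 1, s, f)
        use_arm2 = q * (1 + value(m - 1, s + 1, f)) + (1 - q) * value(m - 1, s, f + 1)
        return max(use_arm1, use_arm2)

    start_arm1 = p1 + value(n - 1, 0, 0)
    start_arm2 = 0.5 * (1 + value(n - 1, 1, 0)) + 0.5 * value(n - 1, 0, 1)
    return start_arm1, start_arm2


def optimal_expected_successes(n, p1):
    """Maximal expected number of successes among n units."""
    return max(first_selection_values(n, p1))


def breakeven(n, tol=1e-6):
    """Approximate the value of p1 above which arm 1 is an optimal first selection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        start_arm1, start_arm2 = first_selection_values(n, mid)
        if start_arm1 >= start_arm2:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


print(optimal_expected_successes(2, 0.5) / 2)  # expected proportion for n = 2
print(breakeven(2))
```

For n = 2 and p_1 = 1/2 the optimal expected number of successes is 1/2 + 1/3 + 1/4 = 13/12, the value of design {2; 2, 1} in Table 2, and the breakeven value is 5/9. The number of states grows roughly as n^3, so this plain recursion is practical only for moderate n.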
Table 3 The optimal expected proportion of successes for selected values of n and for fixed p_1 = 1/2

n         Proportion of successes
1         0.500
2         0.542
5         0.570
10        0.582
20        0.596
50        0.607
100       0.613
200       0.617
500       0.621
1,000     0.622
10,000    0.624

Table 4 The breakeven values of p_1* for selected values of n

n         p_1*
1         0.500
2         0.556
5         0.636
10        0.698
20        0.758
50        0.826
100       0.869
200       0.902
500       0.935
1,000     0.954
10,000    0.985

Table 3 gives the optimal expected proportion of successes for selected values of n and for fixed p_1 = 1/2. Asymptotically, for large n, the maximal expected proportion of successes is 5/8, which is the expected value of the maximum of p_1 and p_2. (With p_1 = 1/2 and p_2 uniform on the interval from 0 to 1, this expectation is (1/2)(1/2) + (1 - 1/4)/2 = 5/8.) Both arms offer the same chance of success on the current unit, but only arm 2 gives information that can help in choosing between the arms for treating later units.

Table 4 gives the breakeven values of p_1 for selected values of n. This table shows that information is more important for larger n. For example, if p_1 = 0.75 then arm 1 would be optimal for n = 10, but it would be advisable to test arm 2 when n = 100; this is so even though arm 1 has probability of 0.75 of being better than arm 2.

When there are several arms with unknown characteristics, the problem is still more complicated. Optimal designs may well indicate selection of an arm that was used previously and set aside in favor of another arm because of inadequate performance. For the methods and theory for solving such problems, see Berry (1972), and Berry and Fristedt (1985). The optimal designs are generally difficult to describe. Berry (1978) provides easy-to-use sequential designs that are not optimal but that perform reasonably well.

Suppose the n units in the experiment are a subset of the N units on which arms 1 and 2 can be applied. Berry and Eick (1995) consider the case of two arms with dichotomous response and show how to incorporate all N units into the design problem. They find the optimal Bayes design when p_1 and p_2 have independent uniform prior distributions. They compare this with various other sequential designs and with a particular nonsequential design: balanced randomization to arms 1 and 2. The Bayes design performs best on average, of course, but it is also robust in the sense that it outperforms the other designs for essentially all pairs of p_1 and p_2.

4. Further Reading
The pioneers in sequential statistical methods were Wald (1947) and Barnard (1944). They put forth the sequential probability ratio test (SPRT), which is of fundamental importance in sequential stopping problems. The study of the SPRT dominated the theory and methodology of sequential experimentation for decades.

For further reading about Bayesian vs. frequentist issues in sequential design, see Berger (1986), Berger and Berry (1988), Berger and Wolpert (1984), and Berry (1987, 1993). For further reading about the frequentist perspective, see Chow et al. (1971) and Whitehead (1992). For further reading about Bayesian design issues, see Berry (1993), Berry and Stangl (1996), Chernoff and Ray (1965), Cornfield (1966), and Lindley and Barnett (1965). For further reading about bandit problems, see Berry (1972, 1978), Berry and Eick (1995), Berry and Fristedt (1985), Bradt et al. (1956), Friedman et al. (1964), Murphy (1965), Rapoport (1967), Rothschild (1974), Viscusi (1979), and Whittle (1982/3). The journal Sequential Analysis is dedicated to the subject of this article.

See also: Clinical Treatment Outcome Research: Control and Comparison Groups; Experimental Design: Overview; Experimental Design: Randomization and Social Experiments; Psychological Treatments: Randomized Controlled Clinical Trials; Quasi-Experimental Designs
Bibliography
Barnard G A 1944 Statistical Methods and Quality Control, Report No. QC/R/7. British Ministry of Supply, London
Berger J O 1986 Statistical Decision Theory and Bayesian Analysis, 2nd edn. Springer, New York
Berger J O, Berry D A 1988 Statistical analysis and the illusion of objectivity. American Scientist 76: 159–65
Berger J O, Wolpert R L 1984 The Likelihood Principle. Institute of Mathematical Statistics, Hayward, CA
Berry D A 1972 A Bernoulli two-armed bandit. Annals of Mathematical Statistics 43: 871–97
Berry D A 1978 Modified two-armed bandit strategies for certain clinical trials. Journal of the American Statistical Association 73: 339–45
Berry D A 1987 Interim analysis in clinical trials: The role of the likelihood principle. American Statistician 41: 117–22
Berry D A 1993 A case for Bayesianism in clinical trials (with discussion). Statistics in Medicine 12: 1377–404
Berry D A 1996 Statistics: A Bayesian Perspective. Duxbury Press, Belmont, CA
Berry D A, Eick S G 1995 Adaptive assignment versus balanced randomization in clinical trials: A decision analysis. Statistics in Medicine 14: 231–46
Berry D A, Fristedt B 1985 Bandit Problems: Sequential Allocation of Experiments. Chapman and Hall, London
Berry D A, Stangl D K 1996 Bayesian methods in health-related research. In: Berry D A, Stangl D K (eds.) Bayesian Biostatistics. Marcel Dekker, New York, pp. 1–66
Bradt R N, Johnson S M, Karlin S 1956 On sequential designs for maximizing the sum of n observations. Annals of Mathematical Statistics 27: 1060–70
Chernoff H, Ray S N 1965 A Bayes sequential sampling inspection plan. Annals of Mathematical Statistics 36: 1387–407
Chow Y S, Robbins H, Siegmund D 1971 Great Expectations. Houghton Mifflin, Boston
Cornfield J 1966 Sequential trials, sequential analysis and the likelihood principle. American Statistician 20: 18–23
Friedman M P, Padilla G, Gelfand H 1964 The learning of choices between bets. Journal of Mathematical Psychology 1: 375–85
Lindley D V, Barnett B N 1965 Sequential sampling: Two decision problems with linear losses for binomial and normal random variables. Biometrika 52: 507–32
Murphy R E Jr. 1965 Adaptive Processes in Economic Systems. Academic Press, New York
Rapoport A 1967 Dynamic programming models for multistage decision making tasks. Journal of Mathematical Psychology 4: 48–71
Rothschild M 1974 A two-armed bandit theory of market pricing. Journal of Economic Theory 9: 185–202
Viscusi W K 1979 Employment Hazards: An Investigation of Market Performance. Harvard University Press, Cambridge, MA
Wald A 1947 Sequential Analysis. Wiley, New York
Whitehead J 1992 The Design and Analysis of Sequential Clinical Trials. Horwood, Chichester, UK
Whittle P 1982/3 Optimization Over Time. Wiley, New York, Vols. 1 and 2
D. A. Berry
Serial Verb Constructions
1. Definition and Importance
The term ‘serial verb construction’ (SVC) is usually applied to a range of apparently similar syntactic constructions in different languages, in which several verbs occur together within one clause or unit, without evidence of either subordination or coordination of the verbs. For example, sentences like (1), from the West African language Twi, are typical of what most scholars would regard as an SVC:

(1) Kofi de pono no baae
Kofi take-PAST table the come-PAST
‘Kofi brought the table’ or more literally, ‘Kofi took the table and came with it’

SVCs have attracted attention from linguists for three main reasons. First, syntacticians have found them interesting because the possibility of having more than one main (i.e., nondependent, nonauxiliary) verb within a clause or clause-like unit challenges traditional assumptions that a clause contains exactly one predicate (see Foley and Olson 1985, pp. 17, 57). Second, because of their prevalence both in languages of West Africa and in certain Creole languages, they have become central to debates among Creole language researchers, being seen either as evidence for the importance of substrate (i.e., African mother-tongue) influence on the Creole, or as deriving from universal properties of human language which are manifested in the creolization process (see Pidgin and Creole Languages). Third, serial verbs show a tendency to be reanalyzed as other grammatical categories (complementizers and prepositions), with implications for the processes of historical change (see Historical Linguistics: Overview).

While different researchers use different criteria for defining serial verbs, the following set would probably be accepted by most and is based on one described by McWhorter (1997, p. 22) as a ‘relatively uncontroversial distillation of the conclusions of various scholars’:

(2) (a) an SVC contains only one overt subject;
(b) it contains no overt markers of coordination or subordination;
(c) it expresses concomitant actions (either simultaneous or consecutive) or a single action;
(d) it falls under one intonational contour;
(e) it has tense–modality–aspect marking on none of the verbs, or on one only (usually the first or last in the series) or on all of them; and
(f) it contains a second verb which is not obligatorily subcategorized for by the first verb.

Even the above six conditions leave it unclear where exactly to draw the line around SVCs, particularly in languages which have little morphology. In some cases of what may appear to be SVCs, we may be dealing simply with unmarked coordination of verb phrases expressing simultaneous actions.
2. Serial Verb Types
Serial constructions can be classified into different types on the basis of the functions of the verbs in the series relative to one another. The sentences in (3) provide examples of different functional types of SVC which have been discussed in the literature.

(3) (a) Directional complements: V_2 is go, come, or another common intransitive motion verb which functions to indicate the directionality of the action denoted by V_1.
Olú gbé àga wá
Olu take chair come
‘Olu brought a chair’ (Yoruba, West African)

(b) Other motion verb complements: similar to type (a), but V_2 may be a transitive motion verb with its own object expressing the goal of the action denoted by V_1.
Kofi tow agyan no wuu Amma
Kofi throw-PAST arrow the pierce-PAST Amma
‘Kofi shot Amma with the arrow’

(c) Instrumental constructions: V_1 is take or a semantically similar common verb, while V_2 denotes an action performed with the aid of the object of V_1.
Kofi teki a nefi koti a brede
Kofi take the knife cut the bread
‘Kofi cut the bread with a knife’ (Sranan, Caribbean Creole)

(d) Dative constructions: V_2 is usually give, with an object which denotes the (semantic) indirect object of V_1.
Ogyaw ne sika mãã me
he-leave-PAST his money give-PAST me
‘He left me his money’ (Twi, West African)

(e) Comparative constructions: a verb meaning pass or surpass is used with an object to express comparison. In this case ‘V_1’ is often in fact an adjective.
Amba tranga pasa Kofi
Amba strong pass Kofi
‘Amba is stronger than Kofi’ (Sranan, Caribbean Creole)

(f) Resultative constructions: V_2 denotes the result or consequence of an action denoted by V_1.
Kofi naki Amba kiri
Kofi hit Amba kill
‘Kofi struck Amba dead’ (or more literally, ‘Kofi hit Amba and killed her’) (Sranan, Caribbean Creole)

(g) Idiomatic constructions (lexical idioms): These are cases where the meaning of the verbs together is not derivable from the meanings of the verbs separately. They are not found in all serializing languages, but some of the West African group are particularly rich in them.
Anyi-Baule (West African): bu ‘hit’, nĩã ‘look’, bu…nĩã ‘say, tell’
Yoruba (West African): là ‘cut open’, yé ‘understand’, là…yé ‘explain’
3. Distribution in the World’s Languages
Serial verb constructions have been identified in many languages, but it is clear that they occur in areal clusters and are not spread evenly among the languages of the world. So far serializing languages have been identified in the following areas: West Africa (Kwa and related language families), the Caribbean (Creole languages which have a historical relationship with Kwa languages), Central America (e.g., Misumalpan), Papua New Guinea (Papuan languages and Tok Pisin, or Melanesian Pidgin English), and Southeast Asia (e.g., Chinese, Vietnamese, Thai). There are isolated reports of SVC-like constructions from elsewhere. How credible these are depends on which criteria are adopted to define SVCs.
4. Grammatical Analyses Numerous researchers have put forward proposals for grammatical analyses of SVCs. The analyses may be classified into two types, with some degree of overlap. Semantic analyses typically seek to account for SVCs in terms of how they break down verbal concepts into more basic semantic components (take–carry–come for bring, for example). Syntactic analyses differ mainly according to whether they treat SVCs as a phrase structure phenomenon (usually under a version of X-bar theory), as a form of complementation, subordination, or secondary predication (i.e., involving more than one clause-like unit), or as a lexical phenomenon (e.g., as single but disjoint lexical items (see Syntax)). The following phrase structure (4) proposed for SVCs in the literature is typical of those advocated by a number of researchers: (4)
VP → V XP VP, where X is N or P
This produces a right-branching tree with a theoretically infinite number of verbs in a series. Such a structure would allow lexical relations and relations like ‘subject-of’ and ‘object-of’ to hold between verbs in the series, and captures the intuition that verbs (or verb phrases) function as complements to verbs earlier in the sequence.

Some researchers have treated SVCs as involving complementation or subordination, with clauses or clause-like units embedded within the VP and dependent on a higher verb. For example, Larson (1991, pp. 198–205) has argued in favor of analyzing serial constructions as a case of ‘secondary predication.’ A structure similar to that for Carol rubbed her finger raw in English can also, he says, account for serial combinations of the take…come and hit…kill types ((a) and (f) above).

Foley and Olson (1985) propose a distinction between ‘nuclear’ and ‘core’ serialization. In nuclear serialization the serial verbs all occur in the nucleus of the clause; in other words, the verbs form a single unit which must share arguments and modifiers. The core layer of the clause consists of the nuclear layer plus the core arguments of the verb (Foley and Olson 1985, p. 34). In ‘core’ serialization two cores, each with its own nucleus and arguments, are joined together. Languages may exhibit both kinds of serialization or just one. Examples (a) and (b) below (Foley and Olson 1985, p. 38) illustrate this difference; (a) is an example of core, and (b) of nuclear, serialization:

(5) (a) fu fi fase isoe
he sit letter write
‘He sat down and wrote a letter’ (Barai, Papuan)
(b) fu fase fi isoe
he letter sit write
‘He sat writing a letter’

This evidence suggests that ‘serialization’ may not be a unitary phenomenon. Although a single account which explains the two or more different construction types (such as Foley and Olson’s) would be welcome, separate analyses may be needed as more types of SVC come to light.

5. The Creolists’ Debates
SVCs are among the most salient grammatical features of many Creoles of the Caribbean area which distinguish them grammatically from their European lexifier languages. As such they have attracted attention from creolists, who, finding similar structures in West African languages which historically are connected with the Creoles in question, have claimed SVCs are evidence of substrate influence on the grammar of the Creoles concerned.

An alternative view has been offered by Bickerton (1981) and Byrne (1987), who regard the similarities between SVCs in the Creole and West African languages as coincidental. Instead, they argue, SVCs result in Creoles from universal principles of grammar, and are a consequence of the rudimentary verbal structure (V but not VP) which exists, they say, in a Creole at the earliest stage of development. They point out the existence of SVCs or serial-like structures in other pidgins and Creoles such as Tok Pisin (New Guinea Pidgin) and Hawaiian Creole English, which have no historical connections with West Africa.

McWhorter (1997) argues against Bickerton and Byrne, claiming that the Caribbean Creoles not only share SVCs with West African languages, but that their SVCs are structurally similar to each other in ways that SVCs from other areal language groupings are not. Using the nuclear/core distinction proposed by Foley and Olson (see above), he argues that Papuan languages serialize at the nuclear level, while Kwa languages, Caribbean Creoles, Chinese, and other Southeast Asian languages serialize at the core level. Considering also the range of pidgins and Creoles which lack SVCs altogether, he concludes that ‘SVCs have appeared around the world in precisely the creoles which had serializing substrata’ (1997, p. 39). SVCs have thus become a major site of contention between ‘substratists’ and ‘universalists’ in Creole studies, with both sides remaining committed to their own positions (see Pidgin and Creole Languages).

6. SVCs and Language Change
Particular verbs which participate in serial constructions in some cases appear to have been reanalyzed as members of another syntactic category, e.g., from verb to preposition (give reanalyzed as for) or complementizer (say reanalyzed as that). In such cases, the position of the verb usually makes it amenable to such a reanalysis (e.g., if it typically occurs immediately before its objects, it may be reanalyzed as a preposition if the semantics encourage this interpretation). Reanalyzed serial verbs lose, to varying degrees, their verbal properties and take on, again to varying degrees, the morphological characteristics of their new category. Lord (1973) describes cases of reanalysis of verbs as prepositions, comitative markers, and a subordinating conjunction in Yoruba, Gã, Ewe, and Fon. Verbs with the meaning ‘say’ are susceptible to reinterpretation as complementizers (introducing sentential clauses), while those with the meaning ‘finish’ have a tendency to be interpreted as completive aspect markers (see Grammaticalization).
7. Related Constructions
Lack of a satisfactory definition of ‘serial verb construction’ makes it difficult to decide what may count as a related kind of structure. Chinese has a range of verbal structures which bear resemblances to SVCs. One of these, the class of co-verbs, is widely believed to be the result of the reanalysis of serial verbs as prepositions. Others, so-called resultatives, resemble secondary predications. The Bantu languages, though related to the serializing languages of West Africa, do not seem to have SVCs. However, some have verbal chains where the second and subsequent verbs have different marking from the first. In many Bantu languages, the verb meaning say is homophonous with a complementizer. These phenomena remain to be explained by a general theory.

See also: Syntactic Aspects of Language, Neural Basis of; Syntax; Valency and Argument Structure in Syntax
Bibliography
Bickerton D 1981 Roots of Language. Karoma, Ann Arbor, MI
Byrne F 1987 Grammatical Relations in a Radical Creole: Verb Complementation in Saramaccan. Benjamins, Amsterdam
Foley W A, Olson M 1985 Clausehood and verb serialization. In: Nichols J A, Woodbury A C (eds.) Grammar Inside and Outside the Clause: Some Approaches to Theory from the Field. Cambridge University Press, Cambridge, UK, pp. 17–60
Larson R K 1991 Some issues in verb serialization. In: Lefebvre C (ed.) Serial Verbs: Grammatical, Comparative and Cognitive Approaches. Benjamins, Amsterdam, pp. 184–210
Lefebvre C J (ed.) 1991 Serial Verbs: Grammatical, Comparative and Cognitive Approaches. Benjamins, Amsterdam
Lord C 1973 Serial verbs in transition. Studies in African Linguistics 4(3): 269–96
McWhorter J H 1997 Towards a New Model of Creole Genesis. Peter Lang, New York
Sebba M 1987 The Syntax of Serial Verbs: An Investigation into Serialization in Sranan and other Languages. Benjamins, Amsterdam
M. Sebba
Service Economy, Geography of

Producer services are types of services demanded primarily by businesses and governments, used as inputs in the process of production. We may split the demand for all services broadly into two categories: (a) demands originating from household consumers for services that they use, and (b) demands for services originating from other sources. An example of services demanded by household consumers is retail grocery services, while examples of producer services are management consulting services, advertising services, and computer systems engineering.
1. Development of the Term ‘Producer Services’
Producer services have emerged as one of the most rapidly growing industries in advanced economies, while the larger service economy has also exhibited aggregate growth in most countries. In the 1930s, Fisher (1939) observed the general tendency for shifts in the composition of employment as welfare rose, with an expanding service sector. Empirical evidence of this transformation was provided by Clark (1957), and the Clark–Fisher model was a popular description of development sequences observed in nations that were early participants in the Industrial Revolution. However, this model was criticized as a necessary pattern of economic development by scholars who observed regions and nations that did not experience the pattern of structural transformation encompassed in the Clark–Fisher model. Moreover, critics of the Clark–Fisher model also argued that the size of the service economy relative to that of goods and primary production required efforts to classify the service economy into categories more meaningful than Fisher’s residual category ‘services.’
The classification of service industries based upon their nature and their source of demand was pioneered in the 1960s and 1970s. The classic vision of services developed by Adam Smith (‘they perish the very instant of their performance’) was challenged as scholars recognized that services such as legal briefs or management services can have enduring value, similar to investments in physical capital. The term producer services became applied to services with ‘intermediate’ as opposed to ‘final’ markets. While this distinction between intermediate or producer services and final or consumer services is appealing, it is not without difficulties. Some industries do not fit neatly into one category or another. For example, households as well as businesses and governments demand legal services, and hotels (a function generally regarded as a consumer service) serve both business travelers and households. Moreover, many of the functions performed by producer service businesses are also performed internally by other businesses. An example is the presence of in-house accounting functions in most businesses, while at the same time there are firms specializing in performing accounting services for their clients. Notwithstanding these difficulties of classification, there is now widespread acceptance of the producer services as a distinctive category of service industry. However, variations in the classification of industries among nations, as well as differences in the inclusion of specific industries by particular scholars, lead to varying sectoral definitions of producer services. Business, legal, and engineering and management services are included in most studies, while the financial services are often considered to be a part of the producer services. It is less common to consider services to transportation, wholesaling, and membership organizations as a component of the producer services.
Table 1 documents growth in producer services employment in the US economy between 1985 and 1995 by broad industrial groups. The growth of producer services over this time period was double the national rate of job growth, and this growth rate was almost identical in metropolitan and rural areas. While financial services grew relatively slowly, employment growth in business and professional services was very strong. Key sectors within this group include temporary help agencies, advertising, copy and duplicating, equipment rental, computer services, detective and protective, architectural and engineering, accounting, research and development, and management consulting and public relations services.
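The comparison of growth rates drawn from Table 1 can be checked with a few lines of arithmetic. The following is only an illustrative sketch: it uses the two total rows as reported in Table 1, and the implied 1985 employment levels rest on the assumption that percentage growth is measured relative to 1985 employment. The function names are hypothetical.

```python
# Figures as reported in Table 1: (job growth in thousands, percentage growth).
TABLE_1 = {
    "Total, producer services": (6117, 45.0),
    "Total, all industries": (19039, 23.4),
}


def implied_base(job_growth, pct_growth):
    """Back out the implied 1985 employment level (thousands).

    Assumes percentage growth is computed relative to 1985 employment.
    """
    return job_growth / (pct_growth / 100.0)


def growth_ratio(sector, benchmark):
    """Ratio of two percentage growth rates taken from TABLE_1."""
    return TABLE_1[sector][1] / TABLE_1[benchmark][1]


ratio = growth_ratio("Total, producer services", "Total, all industries")
print(f"Producer services grew {ratio:.2f} times as fast as all industries")
for sector, (jobs, pct) in TABLE_1.items():
    base = implied_base(jobs, pct)
    print(f"{sector}: implied 1985 base of about {base:,.0f} thousand")
```

The ratio comes out slightly under two, which is consistent with the statement that producer services grew at roughly double the national rate.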
Table 1 US employment change in producer services, 1985–95

Sector                                  Job growth (thousands)   Percentage growth
Finance, insurance, and real estate                        994                16.6
Business and professional services                       4,249                79.2
Legal services                                             275                40.0
Membership organizations                                   599                38.6
Total, producer services                                 6,117                45.0
Total, all industries                                   19,039                23.4

2. Early Research on the Geography of Producer Services Geographers’ and regional economists’ research on producer services only started in the 1970s. Pioneering
research includes the investigations of office location in the United Kingdom by Daniels (1975), and the studies of corporate control, headquarters, and research functions related to corporate headquarters undertaken in the United States by Stanback (1979). Daniels (1985) provided a comprehensive summary of this research, distinguishing between theoretical approaches designed to explain the geography of consumer and producer services. The emphasis in this early research focused on large metropolitan areas, recognizing the disproportionate concentration of producer services employment in the largest nodal centers. This concentration was viewed as a byproduct of the search for agglomeration economies, both within freestanding producer services and in the research and development organizations associated with corporate headquarters. This early research also identified the growth of international business service organizations that were concentrated in large metropolitan areas to serve both transnational offices of home-country clients and to generate an international client base. Early research on producer services also documented tendencies for decentralization from central business districts into suburban locations, as well as the relatively rapid development of producer services in smaller urban regions. In the United States, Noyelle and Stanback (1983) published the first comprehensive geographical portrait of employment patterns within the producer services, differentiating their structure through a classification of employment, and relating this classification to growth trends between 1959 and 1976. While documenting relatively rapid growth of producer services in smaller urban areas with relatively low employment concentrations, this early research did not point towards employment decentralization, although uncertainties related to the impact of the development of information technologies were recognized as a factor that would temper future geographical trends. 
The geography of producer services was examined thoroughly in the United Kingdom through the use of secondary data in the mid-1980s (Marshall et al. 1988). This research documented the growing importance of the producer services as a source of employment, in an economy that was shedding manufacturing jobs. It encouraged a richer appreciation of the role of
producer services by policy-makers and scholars and called for primary research to better understand development forces within producer service enterprises. While many scholars in this early period of research on producer services emphasized market linkages with manufacturers or corporate headquarter establishments, other research found industrial markets to be tied broadly to all sectors of the economy, a fact also borne out in input-output accounts. Research also documented geographic markets of producer service establishments, finding that a substantial percentage of revenues were derived from nonlocal clients, making the producer services a component of the regional economic base (Beyers and Alvine 1985).
3. Emphasis in Current Research on the Geography of Producer Services The 1990s saw an explosion of geographic research on producer services. Approaches to research in recent years can be grouped into studies based on secondary data describing regional patterns or changes in patterns of producer service activity, and studies utilizing primary data to develop empirical insights into important theoretical issues. For treatment of related topics see Location Theory; Economic Geography; Industrial Geography; Finance, Geography of; Retail Trade; Urban Geography; and Cities: Capital, Global, and World.
3.1 Regional Distribution of Producer Services The uneven distribution of producer service employment continues to be documented in various national studies, including recent work for the United States, Canada, Germany, and the Nordic countries. While these regions have exhibited varying geographic trends in the development of producer services, they share the relatively rapid growth of these industries. They also share the fact that relatively few regions have a concentration or share of employment at or above the national average. In Canada, the trend has been towards greater concentration, a pattern which Coffey and Shearmur
(1996) describe as uneven spatial development. Utilizing data for the 1971 to 1991 time period, they find a broad tendency for producer service employment to have become more concentrated over time. The trend in the United States, Germany, and the Nordic countries differs from that found in Canada. In the Nordic countries producer service employment is strongly concentrated in the capitals, but growth in peripheral areas of Norway and Finland is generally well above average, while in Denmark and Sweden many peripheral areas exhibit slow growth rates (Illeris and Sjøholt 1995). In Germany, the growth pattern exhibits no correlation between city size and the growth of business service employment (Illeris 1996). The United States also exhibits an uneven distribution of employment in the producer services. In 1995 only 34 of the 172 urban-focused economic areas defined by the US Bureau of Economic Analysis had an employment concentration in producer services at or above the national average, and these were predominantly the largest metropolitan areas in the country. However, between 1985 and 1995 the concentration of employment in these regions diminished somewhat, while regions with the lowest concentrations in 1985 tended to show increases in their level of producer services employment. Producer service growth rates in metropolitan and rural territory have been almost identical over this same time period in the United States. There is also considerable evidence of deconcentration of employment within metropolitan areas from central business districts into ‘Edge Cities’ and suburban locations in Canada and the United States (Coffey et al. 1996a, Coffey et al. 1996b, Harrington and Campbell 1997).
Survey research in Montreal indicates that this deconcentration is related more strongly to new producer service businesses starting up in suburban locations than to the relocation of existing establishments from central city locations (Coffey et al. 1996b).
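The phrase ‘employment concentration at or above the national average’ used in this section is commonly operationalized as a location quotient: a region’s share of employment in a sector divided by the national share. The sketch below assumes that reading; the article does not state the exact measure used, and the region names and employment counts are hypothetical.

```python
def location_quotient(region_sector, region_total, nation_sector, nation_total):
    """Regional sector share of employment divided by the national sector share.

    A value of 1.0 means the region matches the national average; values
    above 1.0 indicate a relative concentration of the sector.
    """
    return (region_sector / region_total) / (nation_sector / nation_total)


# Hypothetical employment counts in thousands: (producer services, all jobs).
regions = {
    "Large metro": (300, 1500),
    "Small city": (20, 200),
}
# The hypothetical "nation" is simply the sum of the two regions.
nation_sector = sum(sector for sector, _ in regions.values())
nation_total = sum(total for _, total in regions.values())

for name, (sector, total) in regions.items():
    lq = location_quotient(sector, total, nation_sector, nation_total)
    status = "at or above" if lq >= 1.0 else "below"
    print(f"{name}: LQ = {lq:.2f} ({status} the national average)")
```

Because the quotient is a ratio of shares, a small region can in principle score above 1.0 even though its absolute employment is low, which is why concentration and growth are discussed separately in this section.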
3.2 Producer Services and Regional Development Numerous studies have now been undertaken documenting the geographic markets of producer service establishments; for summaries see Illeris (1996) and Harrington et al. (1991). Illeris’ summary of these studies indicates typical nonlocal market shares of 35 percent, but if the shares are weighted by the value of sales the nonlocal share rises to 56 percent. In both urban and rural settings establishments are divided into those with strong interregional or international markets, and those with primarily localized markets. Over time establishments tend to become more export oriented, with American firms tending to have expanded interregional business, while European firms more often enter international markets (O’Farrell et al. 1998). Thus, producer services contribute to the economic base of communities and
their contribution is rising over time due to their relatively rapid growth rate. Growth in producer service employment in the United States has occurred largely through the expansion in the number of business establishments, with little change in the average size of establishments. Between 1983 and 1993 the number of producer service establishments increased from 0.95 million to 1.34 million, while average employment per establishment increased from 11 to 12 persons. The majority of this employment growth occurred in single unit establishments, nonpayroll proprietorships, and partnerships. People are starting these new producer service establishments because they want to be their own boss, they have identified a market opportunity, their personal circumstances lead them to wish to start a business, and they have identified ways to increase their personal income. Most people starting new companies were engaged in the same occupation in the same industry before starting their firm, and few report that they started companies because they were put out of work by their former employer in a move designed to downsize in-house producer service departments (Beyers and Lindahl 1996a). The use of producer services also has regional development impacts. Work in New York State (MacPherson 1997) has documented that manufacturers who make strong use of producer services as inputs are more innovative than firms that do not use these services, which in turn has helped stimulate their growth rate. Similar positive impacts on the development of clients of producer service businesses have been documented in the United Kingdom (Wood 1996) and more broadly in Europe (Illeris 1996).
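The gap reported in this section between a typical nonlocal market share (35 percent) and a sales-weighted share (56 percent) arises whenever larger establishments are more export oriented than smaller ones. The sketch below illustrates the two calculations with hypothetical establishments; the sales figures and shares are invented for illustration and do not reproduce the surveys summarized by Illeris.

```python
def unweighted_share(firms):
    """Simple mean of each establishment's nonlocal share of sales."""
    return sum(share for _, share in firms) / len(firms)


def sales_weighted_share(firms):
    """Nonlocal sales summed over establishments, divided by total sales."""
    total_sales = sum(sales for sales, _ in firms)
    nonlocal_sales = sum(sales * share for sales, share in firms)
    return nonlocal_sales / total_sales


# Hypothetical establishments: (total sales, nonlocal share of those sales).
firms = [
    (100, 0.10),   # small, mostly local clients
    (200, 0.20),   # small, mostly local clients
    (2000, 0.70),  # large, mostly nonlocal clients
]

print(f"Unweighted nonlocal share:     {unweighted_share(firms):.1%}")
print(f"Sales-weighted nonlocal share: {sales_weighted_share(firms):.1%}")
```

In this toy case the weighted figure is well above the unweighted one because the single large establishment dominates total sales, the same mechanism that would raise a 35 percent average toward a higher weighted value.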
3.3 Demand Factors Considerable debate has raged over the reasons for the rapid growth of the producer services, and one common perception has been that this growth has occurred because of downsizing and outsourcing by producer service clients to save money on the acquisition of these services. However, evidence has accumulated that discounts the significance of this perspective. Demand for producer services is increasing for a number of reasons beyond the cost of acquiring these services, based on research with both the users of these services and their suppliers (Beyers and Lindahl 1996a, Coffey and Drolet 1996). Key factors driving the demand for producer services include: the lack of expertise internal to the client to produce the service, a mismatch between the size of the client and the volume of their need for the service, the need for third-party opinions or expert testimony, increases in government regulations requiring use of particular services, and the need for assistance in managing the complexity of firms or to stay abreast of new technologies.
Evidence also indicates that producer service establishments often do business with clients who have in-house departments producing similar services. However, it is rare that there is direct competition with these in-house departments. Instead, relationships are generally complementary. Current evidence indicates that there have been some changes in the degree of in-house vs. market acquisition of producer services, but those selling these services do not perceive the balance of these in-house or market purchases changing dramatically (Beyers and Lindahl 1996a).
3.4 Supply and Competitive Advantage Considerations The supply of producer services is undertaken in a market environment that ranges from being highly competitive to one in which there is very little competition. In order to position themselves in this marketplace, producer service businesses exhibit competitive strategies, and as with the demand side, recent evidence points towards the prevalence of competitive strategies based on differentiation as opposed to cost (Lindahl and Beyers 1999, Hitchens et al. 1996, Coffey 1996). The typical firm tries to develop a marketplace niche, positioning itself to be different from competitors through forces such as an established reputation for supplying its specialized service, the quality of its service, its personal attention to client needs, its specialized expertise, the speed with which it can perform the service, its creativity, and its ability to adapt quickly to client needs. The ability to deliver the service at a lower cost than the client could produce it is also a factor considered important by some producer service establishments.
3.5 Flexible Production Systems The flexible production paradigm has been extended from its origin in the manufacturing environment in recent years into the producer services (Coffey and Bailly 1991).
There are a variety of aspects to the issue of flexibility in the production process, including the nature of the service being produced, the way in which firms organize themselves to produce their services, and their relationships with other firms in the production and delivery process. The labor force within producer services has become somewhat more complex, with the strong growth of the temporary help industry that dispatches a mixture of part-time and full-time temporary workers, as well as some increase in the use of contract or part-time employees within producer service establishments. However, over 90 percent of employment remains full-time within the producer services (Beyers and Lindahl 1999). The production of producer services is most frequently undertaken in a manner that requires the labor force to be organized in a customized manner for each job, although a minority of firms approach work
in a routinized manner. Almost three-fourths of firms rely on outside expertise to produce their service, and half of producer service firms collaborate with other firms to extend their range of expertise or geographic markets. The services supplied by producer service establishments are also changing frequently; half of the establishments interviewed in a US study had changed their services over the previous five years. They do so for multiple reasons, including changing client expectations, shifts in geographic or sectoral markets, changes in government regulations, and changes in information technologies with their related effects on the skills of employees. These changes most frequently produce more diversified service offerings, but an alternative pathway is to become more specialized or to change the form or nature of services currently being produced (Beyers and Lindahl 1999).
3.6 Information Technologies Work in the producer services typically involves face-to-face meetings with clients at some point in the production process. This work may or may not require written or graphical documents. In some cases the client must travel to the producer service firm’s office, and in other cases the producer service staff travel to the client. These movements can be localized or international in scale. However, in addition to these personal meetings, there is extensive and growing use of a variety of information technologies in the production and delivery of producer services work. Computer networks, facsimile machines, telephone conversations, courier services, and the postal system all play a role, alongside a growing use of the Internet, e-mail, and computer file transfers between producers and clients. Some routine functions have become the subject of e-commerce, but much of the work undertaken in the producer services is nonroutine, requiring creative interactions between clients and suppliers.
3.7 Location Factors The diversity of market orientations of producer service establishments leads to divergent responses with regard to the choice of business locations (Illeris 1996). Most small establishments are located conveniently close to the founder’s residence. However, founders often search for a residence that suits them, a factor driven by quality of life considerations for many rural producer service establishments (Beyers and Lindahl 1996b). Businesses with spatially dispersed markets are drawn to locations convenient to travel networks, including airports and the Interstate Highway System in the US. Businesses
with highly localized markets position themselves conveniently to these markets. Often ownership of a building, or a prestigious site that may affect the trust or confidence of clients, is an influencing factor. Establishments that are part of large multi-establishment firms select locations useful from a firm-network-of-locations perspective, which may be either an intra- or interurban pattern of offices.
4. Methodological Considerations Much of the research reported upon here has been conducted in the United States, Canada, or the United Kingdom. While there has been a rapid increase in the volume of research on the themes touched upon in the preceding paragraphs, the base of field research is still slender even within the nations just mentioned. More case studies are needed in urban and rural settings, focused on marketplace dynamics, production processes, and the impact of technological development on the evolution of the producer services. The current explosion of e-commerce and its use by producer service firms is a case in point. Recent research has tended to be conducted by surveying either producer service enterprises or their clients, but only rarely are both surveyed to obtain answers that can be cross-tabulated. Individual researchers have developed their own survey protocols, yielding results that tend not to be comparable with other research. Means need to be developed to bring theoretical and empirical approaches into greater consistency and comparability. There are variations among countries in the organization of production systems related to the producer services. Some countries internalize these functions to a relatively high degree within other categories of industrial activity, and greater thought needs to be given to ways in which these functions can be measured within such organizations.
5. Future Directions of Theory and Research Research on the producer services will continue both at an aggregate scale and at the level of the firm or establishment. Given the recent history of job growth in this sector of advanced economies, there will certainly be studies documenting the ongoing evolution of the geographical distribution of producer services. This research needs to proceed at a variety of spatial scales, ranging from the intrametropolitan, to the intermetropolitan or interregional within nations, to the international. International knowledge of development trends is currently sketchy, especially in developing countries where accounts may be less
disaggregated than in developed countries with regard to service industries. Research at the level of the firm and establishment must continue to better understand the start-up and evolution of firms. While there is a growing body of evidence regarding motivations and histories of firm founders, there is currently little knowledge of the movement of employees into and among producer service establishments. There is also meager knowledge of the distribution of producer service occupations and work in establishments not classified as producer services. Case studies are needed of collaboration, subcontracting, client-seller interaction, processes of price formation, the interplay between the evolution of information technologies and service-product concepts, and types of behavior that are related to superior performance as measured by indicators such as sales growth rate, sales per employee, or profit. There is also a pressing need for international comparative research, not just in the regions that have been relatively well researched (generally the US, Canada, and Europe), but also in other parts of the world. Finally, more robust models and theory are needed in relation to the geography of producer services.
See also: Geodemographics; Market Areas; Services Marketing
Bibliography
Beyers W B, Alvine M J 1985 Export services in postindustrial society. Papers of the Regional Science Association 57: 33–45
Beyers W B, Lindahl D P 1996a Explaining the demand for producer services: Is cost-driven externalization the major factor? Papers in Regional Science 75: 351–74
Beyers W B, Lindahl D P 1996b Lone eagles and high fliers in rural producer services. Rural Development Perspectives 12: 2–10
Beyers W B, Lindahl D P 1999 Workplace flexibilities in the producer services. The Service Industries Journal 19: 35–60
Clark C 1957 The Conditions of Economic Progress. Macmillan, London
Coffey W J 1996 Forward and backward linkages of producer service establishments: Evidence from the Montreal metropolitan area. Urban Geography 17: 604–32
Coffey W J, Bailly A S 1991 Producer services and flexible production: An exploratory analysis. Growth and Change 22: 95–117
Coffey W J, Drolet R 1996 Make or buy? Internalization and externalization of producer service inputs in the Montreal metropolitan area. Canadian Journal of Regional Science 29: 25–48
Coffey W J, Drolet R, Polèse M 1996a The intrametropolitan location of high order services: Patterns, factors and mobility in Montreal. Papers in Regional Science 75: 293–323
Coffey W J, Polèse M, Drolet R 1996b Examining the thesis of central business district decline: Evidence from the Montreal metropolitan area. Environment and Planning A 28: 1795–1814
Coffey W J, Shearmur R G 1996 Employment Growth and Change in the Canadian Urban System, 1971–94. Canadian Policy Research Networks, Ottawa, Canada
Daniels P W 1975 Office Location. Bell, London
Daniels P W 1985 Service Industries, A Geographical Appraisal. Methuen, London
Fisher A 1939 Production, primary, secondary, and tertiary. Economic Record 15: 24–38
Harrington J W, Campbell H S Jr 1997 The suburbanization of producer service employment. Growth and Change 28: 335–59
Harrington J W, MacPherson A D, Lombard J R 1991 Interregional trade in producer services: Review and synthesis. Growth and Change 22: 75–94
Hitchens D M W N, O’Farrell P N, Conway C D 1996 The competitiveness of business services in the Republic of Ireland, Northern Ireland, Wales, and the South East of England. Environment and Planning A 28: 1299–1313
Illeris S 1996 The Service Economy. A Geographical Approach. Wiley, Chichester, UK
Illeris S, Sjøholt P 1995 The Nordic countries: High quality services in a low density environment. Progress in Planning 43: 205–221
Lindahl D P, Beyers W B 1999 The creation of competitive advantage by producer service establishments. Economic Geography 75: 1–20
MacPherson A 1997 The role of producer service outsourcing in the innovation performance of New York State manufacturing firms. Annals of the Association of American Geographers 87: 52–71
Marshall J, Wood P A, Daniels P W, McKinnon A, Bachtler J, Damesick P, Thrift N, Gillespie A, Green A, Leyshon A 1988 Services and Uneven Development. Oxford University Press, Oxford, UK
Noyelle T J, Stanback T M 1983 The Economic Transformation of American Cities. Rowman and Allanheld, Totowa, NJ
O’Farrell P N, Wood P A, Zheng J 1998 Regional influences on foreign market development by business service companies: Elements of a strategic context explanation. Regional Studies 32: 31–48
Stanback T M 1979 Understanding the Service Economy: Employment, Productivity, Location. Allanheld and Osmun, Totowa, NJ
Wood P 1996 Business services, the management of change and regional development in the UK: A corporate client perspective. Transactions of the Institute of British Geographers 21: 649–65
W. B. Beyers
Services Marketing Marketing, as a philosophy and as a function, has already reached the maturity stage. In the 1950s, the basic emphasis was on consumer goods and the mass marketing approach; in the 1960s, it was on the marketing potential of durable goods; and in the 1970s, the emphasis was on industrial goods. Only in the 1980s did service organizations start to take a professional interest in marketing approaches
and tools. More recently also, the public services, and the nonprofit services in general, have begun to participate in marketing. Despite its recent arrival, services marketing is undoubtedly the most innovative and, thanks to new technology, seems as though it may change the old approach completely. The trend is moving from mass marketing to one-to-one marketing, a typical feature of services promotion.
1. Peculiarities of Services Marketing If services marketing is becoming so important, it is necessary to understand its peculiarities. Everything is related to the intangibility of services: they are ‘events’ more than ‘goods,’ and as a consequence it is impossible to store them. They are ‘happenings,’ where it is difficult to forecast what will be produced and what the consumer will get. The relationship becomes increasingly important, especially when the consumer becomes a ‘part-time producer,’ a ‘pro-sumer’ (producer + consumer). Interaction with people is very important in many services, such as restaurants, air transportation, tourism, banking, and so on. The interaction is not only between producer and consumer, but also among the users, as happens in schools among students. Because of these elements there is another very important aspect: in services marketing it is difficult to standardize performance, and quality can be very different from one case to another. Quality control ahead of the event is impossible, because nobody knows in advance the service that will actually be provided to the customer. Many external factors can affect the level of quality, e.g., the climate, or the unpredictable behavior of customers. Consequently, it is not always possible to fulfill promises at the point of delivery, even if the organization does its best to deliver the required level of quality. Another peculiarity is that services are not ‘visible’ and so cannot represent a ‘status symbol,’ unless the service can be matched with tangible objects, such as credit cards, university ties, football scarves, etc. But the most important aspect is that services cannot be stored, and therefore production and use are simultaneous. This creates problems in terms of production capacity, because it is difficult to match fluctuations in demand over seasons, weeks, or days. It is possible to synchronize production and use by means of different tools.
The most used approach is pricing, because varying service prices may be applied at different periods of time (e.g., telephone charges, or airline tickets). There are other tools, such as ‘user education,’ implemented by the service producer. In this case, the producer tries to ‘educate’ the consumer to ask for a particular, better-quality, service during the so-called ‘valley periods’ when sales are slow. Other solutions involve the employment of part-time workers during ‘peak periods,’ or the implementation of maintenance activity in the ‘valley periods’; making reservations is a typical approach in the case of theatres or in the health care service. In all cases it is necessary to overcome fluctuating demand with a ‘synchromarketing activity.’
2. Critical Success Factors in Services Marketing Because of the peculiarities of services, there are many critical success factors in services marketing: (a) service industrialization, (b) strategic image management, (c) customer satisfaction surveys, (d) operations, personnel, and marketing interaction, (e) internal marketing, (f) user education and participation, (g) managerial control of costs and investments, (h) relationship with the public authorities, (i) quality control, and (j) synchromarketing. Of these, further comment can be devoted to internal marketing, whose goal is ‘to create an internal environment which supports customer consciousness and sales mindedness among the personnel’ (Gronroos 1990). ‘Internal marketing means applying the philosophy and practices of marketing to people who serve the external customers, so that the best possible people can be employed and retained, and they will do the best possible work’ (Berry and Parasuraman 1991). Another point to emphasize is quality control, because, as has been said, the quality of a service is more perceived than objective. Many studies say that the focal point in services is perceived quality, which results from a comparison between the expected quality and the image of the service actually provided. As a consequence, it is very important to survey customer opinion on a regular basis, and to check whether, according to this opinion, service is becoming better or worse. A further point is related to the importance of a company’s image as a factor affecting, positively or negatively, customers’ judgment. This is why it is necessary to manage a company’s image with a strongly strategic approach, outlining what the company is, what people think it is, and how it wants people to think of it. Image improvement is not only a matter of communication; it also involves personnel behavior, services provided, and material facilities.
3. Future Perspectives At the start of the twenty-first century, services marketing is very important in the developed countries, because of the relevance of services in terms of GNP, employment, etc., and we can predict that it is going to increase further in importance, thanks to new technologies that facilitate the relationship between producer and customer. There will be an increasing amount of one-to-one marketing, which is particularly
suitable for services. Another important development will concern the internationalization of services (e.g., McDonald’s, Club Med, Manchester United, Hertz, etc.), where the critical success factors will be related to the improvement of ‘high tech’ in the back office and ‘high touch’ in the front line: industrialization and personalization at the same time, to achieve high customer satisfaction in every country. Sales of products now often include the offer of some services, as, for example, in the automotive industry, where this is represented by the after-sales service, a very important aspect affecting the buyer decision process. Offers such as this are the key issues that, in most cases, actually make the difference, given that the basic product is very frequently similar to other products and the difference is provided by the service included in the product.
See also: Advertising Agencies; Advertising and Advertisements; Advertising: General; Advertising, Psychology of; Computers and Society; Internet: Psychological Perspectives; Market Research; Marketing Strategies; Markets: Artistic and Cultural; Service Economy, Geography of
Bibliography
Berry L L, Parasuraman A 1991 Marketing Services: Competing Through Quality. The Free Press
Cherubini S 1981 Il Marketing dei Servizi. Franco Angeli Editore
Cowell D 1984 The Marketing of Services. Heinemann
Donnelly J H, George W R (eds.) 1981 Marketing of Services, AMA’s Proceedings Series.
Eiglier P, Langeard E 1987 Servuction: Le Marketing des Services. McGraw-Hill, New York
Eiglier P, Langeard E, Lovelock C H, Bateson J E G, Young R F 1977 Marketing Consumer Services: New Insights. Marketing Science Institute
Fisk R P 2000 Interactive Services Marketing. Houghton Mifflin
Gronroos C 1990 Service Management and Marketing. Managing the Moments of Truth in Service Competition. Lexington Books
Heskett J L, Sasser W E, Hart C W L 1986 Service Breakthroughs. Changing the Rules of the Game. The Free Press, New York
Lovelock C H 1984 Services Marketing. Prentice-Hall, Englewood Cliffs, NJ
Normann R 1984 Service Management. Strategy and Leadership in Service Businesses. Wiley, New York
Payne A 1993 The Essence of Services Marketing. Prentice-Hall, Englewood Cliffs, NJ
Payne A, McDonald M H B 1997 Marketing Planning for Services. Butterworth-Heinemann, Oxford, UK
Zeithaml V A 1996 Services Marketing. McGraw-Hill, New York
S. Cherubini
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Settlement and Landscape Archaeology
For contemporary archaeology, settlement and landscape approaches represent an increasingly important focus that is vital for a core mission of the discipline: to describe, understand, and explain long-term cultural and behavioral change. Despite this significance, few syntheses of this topic have been undertaken (cf. Parsons 1972, Ammerman 1981, Fish and Kowalewski 1990, Billman and Feinman 1999). Yet settlement and landscape approaches provide the only large-scale perspective for the majority of premodern societies. These studies are reliant on archaeological surface surveys, which discover and record the distribution of material traces of past human presence/habitation across a landscape (see Survey and Excavation (Field Methods) in Archaeology). The examination and analysis of these physical remains found on the ground surface (e.g., potsherds, stone artifacts, house foundations, or earthworks) provide the empirical foundation for the interpretation of ancient settlement patterns and landscapes.
1. Historical Background
Although the roots of settlement pattern and landscape approaches extend back to the end of the nineteenth century, archaeological survey has only come into its own in the post-World War II era. Spurred by the analytical emphases of Steward (1938), Willey’s Virú Valley archaeological survey (1953) provided a key impetus for settlement pattern research in the Americas. In contrast, the landscape approach, which has a more focal emphasis on the relationship between sites and their physical environments, has its roots in the UK. Nevertheless, contemporary archaeological studies indicate a high degree of intellectual cross-fertilization between these different surface approaches.
1.1 Early Foundations for Archaeological Survey in the Americas and England
The American settlement pattern tradition goes back to scholars such as Morgan (1881), who queried how the remnants of Native American residential architecture reflected the social organization of the native peoples who occupied them. Yet the questions posed by Morgan led to relatively few immediate changes in how archaeology was practiced, and for several decades few scholars endeavored to address the specific questions regarding the relationship between settlement and social behavior that Morgan posed. When surface reconnaissance was undertaken by archaeologists, it tended to be a largely unsystematic exercise carried out to find sites worthy of excavation.
In the UK, the landscape approach, pioneered by Fox (1923), was more narrowly focused on the definition of distributional relationships between different categories of settlements and environmental features (e.g., soils, vegetation, topography). Often these early studies relied on and summarized surveys and excavations that were carried out by numerous investigators using a variety of field procedures rather than more uniform or systematic coverage implemented by a single research team. At the same time, the European landscape tradition generally has had a closer link to romantic thought as opposed to the more positivistic roots of the North American settlement pattern tradition (e.g., Sherratt 1996).
1.2 The Development of Settlement Archaeology
By the 1930s and 1940s, US archaeologists working in several global regions recognized that changing patterns of social organization could not be reconstructed and interpreted through empirical records that relied exclusively on the excavation of a single site or community within a specific region. For example, in the lower Mississippi Valley, Phillips et al. (1951) located and mapped archaeological sites across a large area to analyze shifting patterns of ceramic styles and settlements over broad spatial domains and temporal contexts. Yet the most influential and problem-focused investigation of that era was that of Willey in the Virú Valley. Willey’s project was the first to formally elucidate the scope and potential analytical utility of settlement patterns for understanding long-term change in human economic and social relationships. His vision moved beyond the basic correlation of environmental features and settlements as well as beyond the mere definition of archetypical settlement types for a given region. In addition to its theoretical contributions, the Virú program also was innovative methodologically, employing (for the first time in the Western Hemisphere) vertical air photographs in the location and mapping of ancient settlements. Although Willey did not carry out his survey entirely on foot, he did achieve reasonably systematic areal coverage for a defined geographic domain for which he could examine changes in the frequency of site types, as well as diachronic shifts in settlement patterns.
Conceptually and methodologically, these early settlement pattern projects of the 1930s and 1940s established the intellectual underpinnings for a number of multigenerational regional archaeological survey programs that were initiated in at least four global regions during the 1950s and 1960s. In many ways, these later survey programs were integral to the theoretical and methodological re-evaluations that occurred in archaeological thought and practice under the guise of ‘the New Archaeology’ or processualism. The latter theoretical framework stemmed in part from an expressed emphasis on understanding long-term processes of behavioral change and cultural transition at the population (and so regional) scale. This perspective, which replaced a more normative emphasis on archetypical sites or cultural patterns, was made possible to a significant degree by the novel diachronic and broad scalar vantages pieced together for specific areas through systematic regional settlement pattern fieldwork and analysis.
1.3 Large-scale Regional Survey Programs
During the 1950s through the 1970s, major regional settlement pattern programs were initiated in the heartlands of three areas where early civilizations emerged (Greater Mesopotamia, highland Mexico, and the Aegean), as well as in one area known for its rich and diverse archaeological heritage (the Southwest USA). The achievements of the Virú project also stimulated continued Andean settlement pattern surveys, although a concerted push for regional research did not take root there until somewhat later (e.g., Parsons et al. 1997, Billman and Feinman 1999).
Beginning in 1957, Robert M. Adams (e.g., 1965, 1981) and his associates methodically traversed the deserts and plains of the Near East by jeep, mapping earthen tells and other visible sites. Based on the coverage of hundreds of square kilometers, these pioneering studies of regional settlement history served to unravel some of the processes associated with the early emergence of social, political, and economic complexity in Greater Mesopotamia.
Shortly thereafter, in highland Mexico, large-scale, systematic surveys were initiated in the area’s two largest mountain valleys (the Basin of Mexico and the Valley of Oaxaca). These two projects implemented field-by-field, pedestrian coverage of some of the largest contiguous survey regions in the world, elucidating the diachronic settlement patterns for regions in which some of the earliest and most extensive cities in the ancient Americas were situated (e.g., Sanders et al. 1979, Blanton et al. 1993). After decades, about half of the Basin of Mexico and almost the entire Valley of Oaxaca were traversed by foot.
In the Aegean, regional surveys (McDonald and Rapp 1972, Renfrew 1972) were designed to place important sites with long excavation histories in broader spatial contexts. Once again, these investigations brought new regional vantages to areas that already had witnessed decades of excavation and textual analyses. Over the same period, settlement pattern studies were carried out in diverse ecological settings across the US Southwest, primarily to examine the differential distributions of archaeological sites in relation to their natural environments, and to determine changes in the numbers and sizes of settlements across the landscape over time.
In each of the areas investigated, the wider the study domain covered, the more diverse and complex were the patterns found. Growth in one part of a larger study area was often timed with the decrease in the size and number of sites in another. And settlement trends for given regions generally were reflected in episodes of both growth and decline.
Each of these major survey regions (including much of the Andes) is an arid to semiarid environment. Without question, broad-scale surface surveys have been most effectively implemented in regions that lack dense ground cover, and therefore the resultant field findings have been most robust. In turn, these findings have fomented long research traditions carried out by trained crews, thereby contributing to the intellectual rewards of these efforts. As Ammerman (1981, p. 74) has recognized, ‘major factors in the success of the projects would appear to be the sheer volume of work done and the experience that workers have gradually built up over the years.’
1.4 Settlement Pattern Research at Smaller Scales of Analysis
Although settlement pattern approaches were most broadly applied at the regional scale, other studies followed similar conceptual principles in the examination of occupational surfaces, structures, and communities. At the scale of individual living surfaces or house floors, such distributional analyses have provided key indications as to which activities (such as cooking, food preparation, and toolmaking) were undertaken in different sectors (activity areas) of specific structures (e.g., Flannery and Winter 1976) or surfaces (e.g., Flannery 1986, pp. 321–423). In many respects, the current emphasis on household archaeology (e.g., Wilk and Rathje 1982) is an extension of settlement pattern studies (see Household Archaeology). Both household and settlement pattern approaches have fostered a growing interest in the nonelite sector of complex societies, and so have spurred the effort to understand societies as more than just undifferentiated, normative wholes.
At the intermediate scale of single sites or communities, settlement pattern approaches have compared the distribution of architectural and artifactual evidence across individual sites. Such investigations have clearly demonstrated significant intrasettlement variation in the functional use of space (e.g., Hill 1970), as well as distinctions in socioeconomic status and occupational history (e.g., Blanton 1978). From a comparative perspective, detailed settlement pattern maps and plans of specific sites have provided key insights into the similarities and differences between contemporaneous cities and communities in specific regions, as well as the elucidation of important patterns of cross-cultural diversity.
2. Contemporary Research Strategies and Ongoing Debates
The expansion of settlement pattern and landscape approaches over the last decades has promoted the increasing acceptance of less normative perspectives on cultural change and diversity across the discipline of archaeology. In many global domains, archaeological surveys have provided a new regional-scale (and in a few cases, macroregional-scale) vantage on past social systems. Settlement pattern studies also have yielded a preliminary means for estimating the parameters of diachronic demographic change and distribution at the scale of populations, something almost impossible to obtain from excavations alone. Nevertheless, important discussions continue over the environmental constraints on implementation, the relative strengths and weaknesses of different survey methodologies, issues of chronological control, procedures for population estimation, and the appropriate means for the interpretation of settlement pattern data.
2.1 Environmental Constraints
Although systematic settlement pattern and landscape studies have been undertaken in diverse environmental settings including heavily vegetated locales such as the Guatemalan Petén, the eastern woodlands of North America, and temperate Europe, the most sustained and broadly implemented regional survey programs to date have been enacted in arid environments. In large part, this preference pertains to the relative ease of finding the artifactual and architectural residues of ancient sites on the surface of landscapes that lack thick vegetal covers. Nevertheless, archaeologists have devised a variety of means, such as the interpolation of satellite images, the detailed analysis of aerial photographs, and subsurface testing programs, that can be employed to locate and map past settlements in locales where they are difficult to find through pedestrian coverage alone.
In each study area, regional surveys also have to adapt their specific field methodologies (the intensity of the planned coverage) and the sizes of the areas that they endeavor to examine to the nature of the terrain and the density of artifactual debris (generally nonperishable ancient refuse) associated with the sites in the specified region. For example, sedentary pottery-using peoples generally created more garbage than did mobile foragers; the latter usually employed more perishable containers (e.g., baskets, cloth bags). Consequently, other things being equal, the sites of foragers are generally less accessible through settlement and landscape approaches than are the ancient settlements that were inhabited for longer durations (especially when ceramics were used).
2.2 Survey Methodologies and Sampling
Practically since the inception of settlement pattern research, archaeologists have employed a range of different field survey methods. A critical distinction has been drawn between full-coverage and sample surveys. The former approaches rely on the complete and systematic coverage of the study region by members of a survey team. In order to ensure the full coverage of large survey blocks, team members often space themselves 25–50 m apart, depending on the specific ground cover, the terrain, and the density of archaeological materials. As a consequence, isolated artifact finds can occasionally be missed. But the researchers generally can discern a reasonably complete picture of settlement pattern change across a given region.
Alternatively, sample surveys by definition are restricted to the investigation of only a part of (a sample of) the study region. Frequently such studies (because they only cover sections of larger regions) allow for the closer spacing of crew members. Archaeologists have employed a range of different sampling designs. Samples chosen for investigation may be selected randomly or stratified by a range of diverse factors, including environmental variables. Nevertheless, regardless of the specific sampling designs employed, such sample surveys face the problem of extrapolating the results from their surveyed samples to larger target domains that are the ultimate focus of study. Ultimately, such sample surveys have been shown to be more successful at estimating the total number of sites in a given study region than at defining the spacing between sites or at discovering rare types of settlement.
The appropriateness of sample design can only be decided by the kinds of information that the investigator aims to recover. There is no single correct way to conduct archaeological survey, but certain methodological procedures have proven more productive in specific contexts and given particular research aims.
2.3 Chronological Constraints and Considerations
One of the principal strengths of settlement pattern research is that it provides a broad-scale perspective on the changing distribution of human occupation across landscapes. Yet the precision of such temporal sequences depends on the quality of chronological control (see Chronology, Stratigraphy, and Dating Methods in Archaeology). The dating of sites during surveys must depend on the recovery and temporal placement of chronologically diagnostic artifacts from the surface of such occupations. Artifacts found on the surface usually are already removed from their depositional contexts. Finer chronometric dating methods generally are of little direct utility for settlement pattern research, since such methods are premised on the recovery of materials in their depositional contexts. Of course, chronometric techniques can be used in more indirect fashion to refine the relative chronological sequences that are derived from the temporal ordering of diagnostic artifacts (typically pottery).
In many regions, the chronological sequences can only be refined to periods of several hundred years in length. As a result, sites of shorter occupational durations that may be judged to be contemporaneous in fact could have been inhabited sequentially. In the same vein, the size of certain occupations may be overestimated as episodes of habitation are conflated. Although every effort should be made to minimize such analytical errors, these problems in themselves do not negate the general importance of the long-term regional perspective on occupational histories that in many areas of the world can be derived from archaeological survey alone. Although the broad-brush perspective from surveys may never provide the precision or detailed views that are possible from excavation, it yields an encompassing representation at the population scale that excavations cannot achieve.
Adequate holistic perspectives on past societies rely on the multiscalar vantages that are provided through the integration of wide-ranging archaeological surveys with targeted excavations.
2.4 Population Estimation
One of the key changes in archaeological thought and conceptualization over the past half-century has been the shift from essentialist/normative thinking about ancient societies to a more populational perspective. But the issue of how to define past populations, their constituent parts, and the changing modes of interaction between those parts remains challenging at best. Clearly, multiscalar perspectives on past social systems are necessary to collect the basic data required to estimate areal shifts in population size and distribution. Yet considerable debate has been engendered over the means employed by archaeologists to extrapolate from the density and dispersal of surface artifacts pertaining to a specific phase to the estimated sizes of past communities or populations.
Generally, archaeologists have relied on some combination of the empirically derived size of a past settlement, along with a comparative determination of surface artifact densities at that settlement, to generate demographic estimates for a given community. When the estimates are completed for each settlement across an entire survey region, extrapolations become possible for larger study domains. By necessity, the specific equations to estimate past populations vary from one region to another because community densities are far from uniform over time or space. Yet due to chronological limitations, as well as the processes of deposition, disturbance, and destruction, our techniques for measuring ancient populations remain coarse-grained. Although much refinement is still needed to translate survey data into quantitative estimates of population with a degree of precision and accuracy, systematic regional surveys still can provide the basic patterns of long-term demographic change over time and space that cannot be ascertained in any other way.
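The general procedure described above (site size combined with a surface artifact density class, then summed across a survey region) can be sketched in a few lines of Python. This is only an illustrative sketch: the density classes, the persons-per-hectare ranges, and the function names are hypothetical placeholders, not values or equations from any particular survey project, and real coefficients would have to be derived separately for each region and phase.

```python
# Hypothetical persons-per-hectare ranges for each surface artifact
# density class. These numbers are placeholders for illustration only.
DENSITY_RANGES = {
    "light": (5, 10),
    "moderate": (10, 25),
    "heavy": (25, 50),
}


def estimate_site_population(area_ha, density_class):
    """Return a (low, high) population range for one site in one phase."""
    low, high = DENSITY_RANGES[density_class]
    return (area_ha * low, area_ha * high)


def estimate_regional_population(sites):
    """Sum the per-site ranges over all sites assigned to one phase."""
    ranges = [estimate_site_population(area, cls) for area, cls in sites]
    return (sum(r[0] for r in ranges), sum(r[1] for r in ranges))


# Three hypothetical sites: (occupied area in hectares, density class).
sites = [(2.0, "light"), (10.0, "moderate"), (40.0, "heavy")]
region_low, region_high = estimate_regional_population(sites)
print(region_low, region_high)
```

Reporting a range rather than a single figure reflects the coarse-grained character of such estimates noted in the text.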
2.5 The Interpretation of Regional Data
Beyond the broad-brush assessment of demographic trends and site distribution in relation to environmental considerations, archaeologists have interpreted and analyzed regional sets of data in a variety of ways. Landscape approaches, which began with a focused perspective on humans and their surrounding environment, have continued in that vein, often at smaller scales. Such studies often examine in detail the placement of sites in a specific setting with an eye toward landscape conservation and the meanings behind site placement (Sherratt 1996). At the same time, some landscape studies have emphasized the identification of ancient agrarian features and their construction and use.
In contrast, the different settlement pattern investigations have employed a range of analytical and interpretive strategies. In general, these have applied more quantitative procedures and asked more comparatively informed questions. Over the last 40 years (e.g., Johnson 1977), a suite of locational models derived from outside the discipline has served as guides against which different sets of archaeological data could be measured and compared. Yet debates have arisen over the underlying assumptions of such models and whether they are appropriate for understanding the preindustrial past. For that reason, even when comparatively close fits were achieved between heuristically derived expectations and empirical findings, questions regarding equifinality (similar outcomes due to different processes) emerged. More recently, theory-building efforts have endeavored to rework and expand these locational models to specifically archaeological contexts with a modicum of success. Continued work in this vein, along with the integration of some of the conceptual strengths from both landscape and settlement pattern approaches, is requisite to understanding the complex web of relations that govern human-to-human and human-to-environment interactions across diverse regions over long expanses of time.
3. Looking Forward: The Critical Role of Settlement Studies
The key feature and attribute of archaeology is its long temporal panorama on human social formations. Understanding these formations and how they changed, diversified, and varied requires a regional/populational perspective (as well as other vantages at other scales). Over the last century, the methodological and interpretive toolkits necessary to obtain this broad-scale view have emerged, diverged, and thrived. The emergence of archaeological survey (and settlement pattern and landscape approaches) has been central to the disciplinary growth of archaeology and its increasing ability to address and to contribute to questions of long-term societal change. At the same time, the advent of settlement pattern studies has had a critical role in moving the discipline as a whole from normative to populational frameworks.
Yet settlement pattern work has only recently entered the popular notion of this discipline, long wrongly equated with and defined by excavation alone. Likewise, many archaeologists find it difficult to come to grips with a regional perspective that has its strength in (broad) representation at the expense of specific reconstructed detail. Finally, the potential for theoretical contributions and insights from settlement pattern and landscape approaches (and the wealth of data collected by such studies) has only begun to be realized. In many respects, the growth of regional survey and analysis represents one of the most important conceptual developments of twentieth-century archaeology. Yet at the same time, there are still so many mountains (literally and figuratively) to climb.
See also: Chronology, Stratigraphy, and Dating Methods in Archaeology; Household Archaeology; Survey and Excavation (Field Methods) in Archaeology
Bibliography
Adams R M 1965 Land Behind Baghdad: A History of Settlement on the Diyala Plains. University of Chicago Press, Chicago
Adams R M 1981 Heartland of Cities: Surveys of Ancient Settlement and Land Use on the Central Floodplain of the Euphrates. University of Chicago Press, Chicago
Ammerman A J 1981 Surveys and archaeological research. Annual Review of Anthropology 10: 63–88
Billman B R, Feinman G M (eds.) 1999 Settlement Pattern Studies in the Americas: Fifty Years Since Virú. Smithsonian Institution Press, Washington, DC
Blanton R E 1978 Monte Albán: Settlement Patterns at the Ancient Zapotec Capital. Academic Press, New York
Blanton R E, Kowalewski S A, Feinman G M, Finsten L M 1993 Ancient Mesoamerica: A Comparison of Change in Three Regions, 2nd edn. Cambridge University Press, Cambridge, UK
Fish S K, Kowalewski S A (eds.) 1990 The Archaeology of Regions: A Case for Full-coverage Survey. Smithsonian Institution Press, Washington, DC
Flannery K V (ed.) 1986 Guilá Naquitz: Archaic Foraging and Early Agriculture in Oaxaca, Mexico. Academic Press, Orlando, FL
Flannery K V, Winter M C 1976 Analyzing household activities. In: Flannery K V (ed.) The Early Mesoamerican Village. Academic Press, New York, pp. 34–47
Fox C 1923 The Archaeology of the Cambridge Region. Cambridge University Press, Cambridge, UK
Hill J N 1970 Broken K Pueblo: Prehistoric Social Organization in the American Southwest. University of Arizona Press, Tucson, AZ
Johnson G A 1977 Aspects of regional analysis in archaeology. Annual Review of Anthropology 6: 479–508
McDonald W A, Rapp G R Jr (eds.) 1972 The Minnesota Messenia Expedition. Reconstructing a Bronze Age Environment. University of Minnesota Press, Minneapolis, MN
Morgan L H 1881 Houses and House Life of the American Aborigines. US Department of Interior, Washington, DC
Parsons J R 1972 Archaeological settlement patterns. Annual Review of Anthropology 1: 127–50
Parsons J R, Hastings C M, Matos R 1997 Rebuilding the state in highland Peru: Herder-cultivator interaction during the Late Intermediate period in Tarama-Chinchaycocha region. Latin American Antiquity 8: 317–41
Phillips P, Ford J A, Griffin J B 1951 Archaeological Survey in the Lower Mississippi Alluvial Valley, 1941–47. Peabody Museum of Archaeology and Ethnology, Cambridge, MA
Renfrew C 1972 The Emergence of Civilisation. The Cyclades and the Aegean in the Third Millennium BC. Methuen, London
Sanders W T, Parsons J R, Santley R S 1979 The Basin of Mexico: Ecological Processes in the Evolution of a Civilization. Academic Press, New York
Sherratt A 1996 ‘Settlement patterns’ or ‘landscape studies’? Reconciling reason and romance. Archaeological Dialogues 3: 140–59
Steward J H 1938 Basin Plateau Aboriginal Sociopolitical Groups. Bureau of American Ethnology, Washington, DC
Wilk R R, Rathje W L (eds.) 1982 Archaeology of the Household: Building a Prehistory of Domestic Life. American Behavioral Scientist 25(6)
Willey G R 1953 Prehistoric Settlement Patterns in the Virú Valley, Peru. Bureau of American Ethnology, Washington, DC
G. M. Feinman
Sex Differences in Pay
Differences in pay between men and women remain pervasive in late twentieth-century labor markets, with women’s average earnings consistently below men’s even when differences in hours of work are taken into account. This ‘gender pay gap’ may result from unequal pay within jobs, but is also related to the different types of jobs occupied by men and women. Considerable debate has arisen over the extent to which it is evidence of discrimination in labor markets or simply a result of the individual attributes and choices of men and women. The size of the pay gap may also be influenced by the institutional framework of pay bargaining and regulation in different countries. The article reviews a range of factors thought to contribute to the gender pay gap, and the strategies designed to eradicate it. It draws primarily on evidence from affluent industrialized nations.
1. The Gender Pay Gap: Trends and Comparisons
While the gap between men’s and women’s average earnings has narrowed in most countries in the late twentieth century, significant inequality remains. Moreover, substantial cross-national differences are apparent. Table 1 illustrates some of these variations in selected countries. To enhance comparison, the data are drawn as far as possible from one source (International Labour Office [ILO] Yearbook of Labour Statistics). Where available, the figures are for hourly earnings, as data for longer periods produce deflated estimates of women’s relative earnings due to the tendency for women to work fewer hours than men. Also, while the most widely available and reliable data are for manufacturing, women’s employment in industrialized nations tends to be concentrated more in service industries. Thus earnings in the ILO’s more inclusive category of ‘non-agricultural activity’ are also reported where available, although these are less satisfactory for cross-national comparison due to variations in the range of industries included. Even within countries, series breaks in data collection may limit the accuracy of comparisons over time. Some caution is therefore needed in interpreting cross-national differences and trends.
Nevertheless, a number of broad observations can be made on the basis of the data presented in Table 1. Looking first at trends within countries, an increase in women’s average earnings relative to men’s since 1970 is evident in all of the countries listed except Japan. In many cases this has occurred mainly in the 1970s, although some countries (for example, Belgium, Canada, and the USA) show marked improvement since 1980. In several cases, however, closing of the gender pay gap has slowed in the 1990s.
The table also provides some indication of cross-national variation in the size of the gender pay gap. Figures for Canada and the USA are from different sources and not strictly comparable with data from the other countries. Among the others listed, the picture of cross-national variance appears broadly similar for manufacturing and ‘non-agricultural’ data. On the basis of the hourly manufacturing data, which are the most reliable for comparative purposes, Sweden stands out as having the narrowest gender pay gap in the mid-1990s. The large gap in Japan reflects the use of monthly rather than hourly earnings, and thus the impact of different working hours between men and women. However, OECD approximations of hourly earnings in manufacturing in Japan still show a very large pay gap (OECD 1988, p. 212), and ILO data indicate that on this issue Japan looks more like other East Asian countries (for example, South Korea and Singapore) than the other countries listed in the table.
Overall, the data raise several questions. Why is there a gender pay gap? What accounts for its variation over time and across nations? What can, or should, be done to eradicate it? These questions are addressed in the following sections which examine possible explanations for the gender pay gap, the effect of institutional factors on its variation across countries, and the strategies most likely to assist in its elimination.

Table 1 Women’s earnings as a percentage of men’s (i), selected countries and years (ii)

                Manufacturing                           Non-agricultural industries (iii)
Country         1970      1980      1990      1995      1970      1980      1990      1995
Australia       64        79        83        85        65        86        88        90
Belgium         68        70        75        79        67        69        75        79
Canada (iv)     n/a       n/a       n/a       n/a       60        64        68        73
Denmark         74        86        85        85 (ix)   72        85        83        n/a
France          77 (vii)  77        79        79 (x)    n/a       79        81        81 (x)
Japan (v)       45        44        41        44 (ix)   51        54        50        n/a
Netherlands     72        75        75        75        74        78        78        76
New Zealand     66 (viii) 71        74        77        72 (viii) 77        81        81
Sweden          80        90        89        90        n/a       n/a       n/a       n/a
UK              58        69        68        71        60        70        76        79
USA (vi)        n/a       n/a       n/a       n/a       n/a       65        78        81

Sources: International Labour Office (ILO) Yearbook of Labour Statistics. ILO, Geneva, various issues; Statistics Canada, Earnings of Men and Women, Cat No 13-217 Annual, Statistics Canada, Ottawa, various issues; United States Department of Labor, Women’s Bureau, Women’s Earnings as a Percent of Men’s, http://www.dol.gov/dol/wb/public/wb_pubs/7996.htm (accessed October 10, 1999).
i. Percentages are based on hourly earnings except for Canada and Japan.
ii. Countries have been selected on the basis of data availability and to provide a span across continents.
iii. ILO ‘non-agricultural’ activity groups. Data for Canada and the USA are from different sources and are based on all industries.
iv. For Canada, percentages are based on yearly earnings for full-year, full-time workers. The use of yearly wages deflates the figures as women work fewer hours than men on average, even when the comparison is limited to full-time workers.
v. For Japan, percentages are based on monthly earnings.
vi. For the USA, percentages are based on hourly earnings, but these are for workers paid an hourly wage and are not directly comparable with hourly data for other countries.
vii. 1972.
viii. 1974.
ix. 1992.
x. 1993.
n/a: Data unavailable.
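The measure reported in Table 1, and the reason hourly data are preferred over data for longer pay periods, can be illustrated with a short worked example in Python. The wage and hours figures below are hypothetical and chosen only to make the arithmetic easy to follow; they are not drawn from the ILO or any other source cited here.

```python
def earnings_ratio(women_earnings, men_earnings):
    """Women's earnings expressed as a percentage of men's."""
    return 100.0 * women_earnings / men_earnings


# Hypothetical average hourly wages and monthly hours worked.
women_hourly, men_hourly = 18.0, 20.0
women_hours, men_hours = 140.0, 160.0

# Ratio based on hourly earnings.
hourly_ratio = earnings_ratio(women_hourly, men_hourly)

# Ratio based on monthly earnings (hourly wage times hours worked).
monthly_ratio = earnings_ratio(women_hourly * women_hours,
                               men_hourly * men_hours)

print(hourly_ratio)   # 90.0
print(monthly_ratio)  # 78.75
```

With identical hourly wages, the monthly ratio is lower simply because the hypothetical women work fewer hours, which is the sense in which monthly or yearly data deflate women's relative earnings. The pay gap itself is 100 minus the ratio.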
2. Explaining the Gender Pay Gap
One of the most direct ways in which sex differences in pay have arisen historically is through the establishment of different rates of pay for men and women in wage-setting processes. The assumption that men, as ‘breadwinners,’ should be entitled to higher pay than women underpinned pay determination in many countries until well into the twentieth century. In some countries this was institutionalized in the form of a ‘family wage’ for men. While early challenges to these assumptions were made in several countries, equal pay rates for men and women were usually only achieved where men’s jobs were threatened by cheaper female labor (see, for example, Ryan and Conlon 1989, pp. 99–100). In most countries it was not until the 1960s and 1970s, following the growth of second wave feminism and increasing involvement of women in paid employment, that the inequity of different rates of pay for men and women was responded to directly through policies of equal pay for equal work.
However, much of the pay difference between men and women results not from different pay rates in the same jobs, but from the location of men and women in different jobs. This sex segregation of the labor market has proved remarkably resistant to change, and in Walby’s (1988) view, accounts for the persistence of the gender pay gap since World War II in spite of the increasing human capital (i.e., education and labor market experience) of women over that time period (see Sex Segregation at Work). Segregation is significant for pay inequality to the extent that female-dominated sectors of the labor market deliver lower pay. Female-dominated occupations, for example, may be low-skilled or involve skills that have been undervalued. They may also be less likely to provide discretionary payments such as overtime or bonuses.
Empirical studies demonstrate that the proportion of women in an occupational group is negatively associated with wage levels, with typically around one-third of the gender pay gap shown to be due to occupational segregation by sex (Treiman and Hartmann 1981). Results for this type of analysis are highly dependent on the level of disaggregation of occupations applied. Finer disaggregation uncovers greater inequality, as vertical segregation exists within broad occupational groups, with men more likely to be in the higher status, higher paid, jobs. Concentration of women in a small number of occupational groups is also significant for pay outcomes. Grimshaw and Rubery (1997), for example, demonstrate that in seven OECD countries around 60 percent of women are concentrated in just 10 occupational groups out of a total of between 50 and 80, with little change in this situation since the mid-1980s. Moreover, their analysis shows a wage penalty associated with this concentration of employment. Women within these occupations, on average, earn less than the all-occupation average.
Another form of labor market division potentially affecting pay differences is the distribution of men and women between firms. Blau (1977) has shown that in the USA women and men are differently distributed among firms, with women more likely to be employed in comparatively low-paying firms. These sorts of divisions may contribute to sex differences in earnings within occupations. A further type of division that has become of increasing significance is that between full-time permanent and other, less regular, types of employment. Part-time work is highly female dominated in most countries, and on average tends to be less well remunerated per hour than full-time work. Waldfogel (1997, p. 215), for example, shows that part-time work carries a wage penalty for the women in her sample. Thus, although part-time positions may assist women to retain careers by facilitating the combination of work and family responsibilities, the concentration of women in this type of work may contribute to the gender pay gap.
In sum, sex differences in pay can arise from different pay rates in particular jobs or the distribution of female employment into lower paying jobs.
In both cases, pay differences may be due to forms of discrimination or to non-discriminatory influences. Non-discriminatory influences include the individual attributes and choices of men and women, and although these may reflect broader social inequities, they can be distinguished from overt forms of discrimination within the labor market. Both areas are considered below.
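The two sources of the gap summarized above, different pay rates within jobs and different distributions across jobs, can be separated arithmetically. The sketch below is illustrative only: the job names, employment shares, and wages are hypothetical, and weighting the rate component by women's shares while valuing the distribution component at men's wages is just one of several possible conventions.

```python
def decompose_gap(jobs):
    """Split the gap in mean pay into a within-job rate part and a
    job-distribution part.

    jobs maps a job name to (share_men, share_women, wage_men, wage_women),
    where the shares are each sex's employment shares and sum to 1.
    """
    mean_men = sum(sm * wm for sm, sf, wm, wf in jobs.values())
    mean_women = sum(sf * wf for sm, sf, wm, wf in jobs.values())

    # Different pay rates in the same job, weighted by women's job shares.
    rate_part = sum(sf * (wm - wf) for sm, sf, wm, wf in jobs.values())

    # Different distribution across jobs, valued at men's wages.
    distribution_part = sum((sm - sf) * wm for sm, sf, wm, wf in jobs.values())

    return mean_men - mean_women, rate_part, distribution_part


# Hypothetical two-job labor market (all numbers invented for illustration).
example_jobs = {
    "clerical": (0.2, 0.7, 20.0, 19.0),
    "technical": (0.8, 0.3, 30.0, 29.0),
}

total_gap, rate_part, distribution_part = decompose_gap(example_jobs)
print(total_gap, rate_part, distribution_part)
```

In this invented example most of the gap comes from the distribution term, which mirrors the point that segregation, rather than unequal rates for the same job, can dominate the overall difference.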
2.1 Individual Attributes and Choices
A wide body of literature has addressed the extent to which the gender pay gap can be explained by individual attributes and choices. Human capital theory suggests that productivity related differences between men and women, such as education, skill, and labor market experience, may account for sex differences in earnings. Typically, analyses decompose earnings differences between men and women into a component that can be ‘explained’ by productivity factors, and another that represents the ‘unexplained’ elements of the gender pay gap that are ascribed to labor market discrimination (Oaxaca 1973). Whether human capital differences themselves might be evidence of social inequality (for example, reflecting differential access to training or restricted choices about labor force attachment) has not been an explicit part of this analytical approach.
Numerous studies have been conducted for different countries, and although there are difficulties of measurement and interpretation that complicate such analyses, human capital variables typically account for less than half, and often considerably smaller proportions, of the gender pay gap (Treiman and Hartmann 1981, p. 42). The studies which explain most have been those including detailed estimates of employment experience or labor force attachment (for example, Corcoran and Duncan 1979). Where women’s more intermittent labor force attachment has been captured effectively, as in longitudinal surveys or work histories, the negative effect on wages has been clearly demonstrated (Waldfogel 1997, pp. 210–11).
Alongside intermittent labor force attachment, several studies show a wage penalty for women associated with the presence of children. In a regression model utilizing data from 1968–88, Waldfogel (1997, p. 212) identifies a penalty in hourly wages associated with having a child, and a larger penalty for two or more children, even after controlling for actual employment experience and factors such as education. The effects of family and domestic labor responsibilities are thus likely to be cumulative, lowering women’s earnings through reducing employment experience and the capacity to retain career paths.
While some may interpret such findings as evidence that a proportion of the gender pay gap is non-discriminatory and simply due to individual choices, others may observe that women’s disproportionate responsibility for family care affects the range of choices available (see Motherhood: Economic Aspects).
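The decomposition into ‘explained’ and ‘unexplained’ components described in this section can be sketched with a single productivity variable. The example below is a minimal illustration, not the procedure of any cited study: the data are invented, experience is the only regressor, and using the men's coefficients as the reference wage structure is an assumed convention.

```python
def ols(x, y):
    """Simple least-squares fit of y on one regressor x.
    Returns (intercept, slope, mean_x, mean_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - slope * mean_x, slope, mean_x, mean_y


def decompose(x_men, y_men, x_women, y_women):
    """Split the gap in mean log wages into an 'explained' part (different
    average characteristics, valued at men's returns) and an 'unexplained'
    part (different intercepts and returns)."""
    a_m, b_m, mx_m, my_m = ols(x_men, y_men)
    a_w, b_w, mx_w, my_w = ols(x_women, y_women)
    explained = b_m * (mx_m - mx_w)
    unexplained = (a_m - a_w) + (b_m - b_w) * mx_w
    return my_m - my_w, explained, unexplained


# Hypothetical data: years of experience and log hourly wages.
exp_men, logw_men = [2, 6, 10, 14], [2.10, 2.30, 2.50, 2.70]
exp_women, logw_women = [1, 4, 7, 10], [1.93, 2.02, 2.11, 2.20]

gap, explained, unexplained = decompose(exp_men, logw_men,
                                        exp_women, logw_women)
print(f"gap={gap:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
```

Because a least-squares line passes through the group means, the two parts sum exactly to the gap in mean log wages. As the text notes, the 'explained' part is not thereby shown to be free of social inequality; it only reflects measured differences in characteristics.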
2.2 Discrimination in Employment
There are several types of labor market discrimination with implications for sex differences in pay. For example, employers may discriminate against women when recruiting and promoting staff, preferring to hire men for higher status positions, and investing more in training and career support for men. This may be ‘statistical discrimination,’ reflecting assumptions that women will be more likely than men to leave work, or have lower levels of commitment to it, once they have family responsibilities. However, it is not clear that women leave jobs more frequently than men (see England 1992, p. 33), hence the rationality of this type of discrimination is questionable. Less direct forms of discrimination may be the result of customary practice, with long-standing procedures and organizational cultures effectively hindering the advancement of women (see Cockburn 1991).
Discrimination may also be evident in the way pay rates are established. While the overt use of different rates of pay for men and women discussed earlier is no longer widespread, female-dominated areas of employment may be relatively underpaid. England’s (1992) analysis provides some evidence for this by showing that the sex composition of an occupation explains between 5 percent and 11 percent of the gender pay gap even when factors such as different types of demands for skill and effort, and industrial and organizational characteristics are controlled for. Undervaluation could be the result of women’s comparatively low bargaining power, and could also reflect gender-biased estimations of the value of skills in female-dominated occupations. England (1992), for example, shows that ‘nurturing’ skills carry a wage penalty, thus suggesting that this type of work is devalued because of its association with typical ‘women’s work.’ Such findings indicate that policies of ‘comparable worth’ or ‘equal pay for work of equal value’ (that is, comparisons of dissimilar jobs to produce estimations of job value free of gender bias) have an important role to play.
3. The Role of Institutions
While the factors considered thus far contribute to an understanding of the gender pay gap in a general sense, they are frequently less useful in explaining differences between countries. For example, differences between countries in the level of occupational segregation appear to be unrelated to performance on sex differences in pay. In Japan, the combination of low levels of occupational segregation and a large gender pay gap may be explained partly by women’s relative lack of access to seniority track positions in large firms (Anker 1997, p. 335), that is, by other types of segregation. However, a different type of explanation is necessary to understand why some countries that are highly sex-segregated by occupation (such as Sweden and Australia) have comparatively narrow gender pay gaps.
This anomaly suggests the importance of institutional factors in influencing the gender pay gap. Blau and Kahn (1992), for example, point out that the effect of occupational concentration on sex differences in pay will be influenced by the overall wage distribution in any country: where wages are relatively compressed, the effect of segregation on earnings will be minimized. Wage distribution is in turn likely to be a product of the institutional framework for wage bargaining, with more centralized and regulated systems conducive to lower levels of wage dispersion, and, therefore, a narrower gender pay gap. Cross-national statistical evidence supports these links, showing association between centralized wage fixation and high levels of pay equity (Whitehouse 1992, Gunderson 1994).
This relationship is likely to result not only from wage compression, but also from an enhanced capacity to implement equal pay measures in more centralized systems. Rubery (1994) notes that the degree of centralization has implications for several matters of relevance to pay equity outcomes, including the maintenance of minimum standards and the scope for equal value comparisons. Decentralized pay systems tend to provide more limited scope for comparisons to support equal pay for work of equal value cases, and results may be limited to specific enterprises, or to individuals. Overall, trends in wage bargaining arrangements may be more influential on pay equity outcomes than specific gender equity measures (Rubery 1994, Rubery et al. 1997, Whitehouse 1992). Wage bargaining systems are, however, country-specific, and embedded in national employment systems (Rubery et al. 1997). Translation of institutional structures across countries is therefore unlikely to be a viable proposition.
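The wage compression argument can be illustrated with a stylized calculation. The sketch below assumes, purely for illustration, that women's average position sits a fixed number of standard deviations below the men's mean in the log-wage distribution, so that the pay ratio depends only on how dispersed wages are. The function name and all numbers are hypothetical and are not taken from the studies cited.

```python
import math


def female_to_male_pay_ratio(position_gap_sd, log_wage_sd):
    """Stylized pay ratio when women's average position is
    `position_gap_sd` standard deviations below the men's mean and
    log wages have standard deviation `log_wage_sd`.

    The log pay gap is position_gap_sd * log_wage_sd, so the ratio of
    women's to men's pay is exp(-gap).
    """
    return math.exp(-position_gap_sd * log_wage_sd)


# Same relative position for women, two hypothetical wage structures.
compressed = female_to_male_pay_ratio(0.5, 0.3)  # centralized bargaining
dispersed = female_to_male_pay_ratio(0.5, 0.6)   # decentralized bargaining

print(f"compressed: {compressed:.2f}, dispersed: {dispersed:.2f}")
```

With the same degree of segregation (the same relative position), the more compressed wage structure yields a pay ratio closer to one, which is the pattern the cross-national comparison describes.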
4. Policy Options
As the gender pay gap cannot be attributed to one type of cause, a number of strategies will be necessary to attempt its elimination. While addressing pre-market impediments to women’s advancement (such as sex-role stereotypes and their impact on educational and employment choices) is part of this agenda, the main strategies are those designed to remove barriers within the labor market. International conventions such as the ILO’s Equal Remuneration Convention (No. 100) and the United Nations’ Convention on the Elimination of All Forms of Discrimination Against Women have provided some impetus for action, and most countries have now implemented some form of equal pay legislation or prohibition against discrimination in employment. However, the efficacy of such measures is highly variable.
The most direct strategies are legislative provisions requiring the payment of equal pay for equal work, and equal pay for work of equal value. Equal pay for equal work provisions have been most effective where they have included a requirement to remove differential pay rates for men and women in industrial agreements. The rapid narrowing of the gender pay gap in Australia and the UK in the 1970s, for example, demonstrates the effectiveness of measures that removed direct discrimination in collective agreements (Gregory et al. 1989, Zabalza and Tzannatos 1985). Improvements in Sweden in the same decade also reflect the advantage of widespread coverage by collective agreements, and in that case predate the introduction of equal pay legislation. However, while equal pay requirements may be quite effective initially where they equalize minimum rates between men and women across a comprehensive set of collective agreements, the level of segregation in the labor market means that most provisions for equal pay for equal work apply to only a small proportion of women, as few men and women are in fact doing the same work.
Provisions for equal pay for work of equal value (comparable worth) open up a wider field for contestation, but this strategy has proved difficult to implement. Historical bias in the way ‘female’ jobs are valued has not been easy to eradicate as even quite detailed job evaluation methods retain aspects of gender bias and may in fact perpetuate existing hierarchies (Steinberg 1992). Moreover, cases have proved complex and time consuming. Comparable worth does permit reconsideration of the valuation of work, however, and is particularly important given the apparent resistance of patterns of occupational segregation to change. It will be most effective where the scope for comparison is wide, and results apply collectively to types of jobs rather than to individuals.
Apart from strategies that deal directly with pay, provisions to prohibit discrimination in employment have been adopted in most countries. Anti-discrimination legislation and equal employment opportunity or affirmative action measures aim to prevent sex discrimination in hiring and placement, and in some cases seek to correct for past sex discrimination by requiring attention to the unequal distribution of men and women within organizations. While it is difficult to estimate the impact of such measures on the gender pay gap, they have no doubt restricted the scope for overt discrimination and contributed to a gradual change in customs and attitudes.
Given the impact of family responsibilities on women’s earnings noted earlier, erosion of the gender pay gap will also require the use of strategies to assist in the combination of work and family responsibilities. Ultimately this also requires a more even division of domestic labor between men and women to assist women to retain career progression and employment experience. Paid parental and family leave, and measures to deliver flexibility with job security while combining employment and caring responsibilities will be part of this agenda, although experience thus far suggests that encouragement to fathers to share these types of provisions will also be necessary.
Finally, it must be emphasized that the gender pay gap is dependent on a wide range of policy and institutional factors, most of which are not designed with gender equity goals in mind. In particular, wage bargaining arrangements and employment policies may affect the size of the gender pay gap by affecting outcomes such as wage dispersion. Trends away from centralized and regulated forms of pay bargaining are therefore of some concern as they may increase dispersion, which in turn may erode gains made through equal pay or comparable worth strategies. In short, the pursuit of pay equity cannot be limited to a single agenda, but requires multiple policy measures, and, like most endeavors attempting significant social change, will require a long period of time to achieve.
Bibliography
Anker R 1997 Theories of occupational segregation by sex: An overview. International Labour Review 136: 315–39
Blau F 1977 Equal Pay in the Office. DC Heath and Company, Lexington, MA
Blau F, Kahn L 1992 The gender earnings gap: Learning from international comparisons. American Economic Review, Papers and Proceedings 82: 533–8
Cockburn C 1991 In the Way of Women: Men’s Resistance to Sex Equality in Organizations. ILR Press, Ithaca, NY
Corcoran M, Duncan G 1979 Work history, labor force attachments, and earnings differences between the races and sexes. Journal of Human Resources 14: 3–20
England P 1992 Comparable Worth: Theories and Evidence. Aldine de Gruyter, New York
Gregory R G, Anstie R, Daly A, Ho V 1989 Women’s pay in Australia, Great Britain and the United States: The role of laws, regulations, and human capital. In: Michael R T, Hartmann H I, O’Farrell B (eds.) Pay Equity: Empirical Inquiries. National Academy Press, Washington, DC, pp. 222–42
Grimshaw D, Rubery J 1997 The Concentration of Women’s Employment and Relative Occupational Pay: A Statistical Framework for Comparative Analysis. Labour Market and Social Policy, Occasional Paper No 26, Organisation for Economic Cooperation and Development, Paris
Gunderson M 1994 Comparable Worth and Gender Discrimination: An International Perspective. International Labour Office, Geneva, Switzerland
Oaxaca R 1973 Male–female wage differentials in urban labor markets. International Economic Review 14: 693–709
OECD 1988 Employment Outlook. Organisation for Economic Cooperation and Development, Paris
Rubery J 1994 Decentralisation and individualisation: The implications for equal pay. Economies et Sociétés 18: 79–97
Rubery J, Bettio F, Fagan C, Maier F, Quack S, Villa P 1997 Payment structures and gender pay differentials: Some societal effects. The International Journal of Human Resource Management 8: 131–49
Ryan E, Conlon A 1989 Gentle Invaders: Australian Women at Work. Penguin, Ringwood, Victoria
Steinberg R J 1992 Gendered instructions: Cultural lag and gender bias in the Hay system of job evaluation. Work and Occupations 19: 387–423
Treiman D J, Hartmann H I (eds.) 1981 Women, Work and Wages: Equal Pay for Jobs of Equal Value. National Academy Press, Washington, DC
Walby S 1988 Introduction. In: Walby S (ed.) Gender Segregation at Work. Open University Press, Milton Keynes, UK
Waldfogel J 1997 The effect of children on women’s wages. American Sociological Review 62: 209–17
Whitehouse G 1992 Legislation and labour market gender inequality: An analysis of OECD countries. Work, Employment & Society 6: 65–86
Zabalza A, Tzannatos Z 1985 Women and Equal Pay: The Effect of Legislation on Female Employment and Wages in Britain. Cambridge University Press, Cambridge, UK
G. Whitehouse
Sex Hormones and their Brain Receptors
Nerve cells react not only to electrical and chemical signals from other neurons, but also to a variety of steroid factors arising from outside the brain. Steroids, particularly gonadal steroids, have profound effects on brain development, sexual differentiation, central nervous control of puberty, the stress response, and many functions in the mature brain including cognition and memory (Brinton 1998, McEwen 1997). The brain contains receptors for all five classes of steroid hormones: estrogens, progestins, androgens, glucocorticoids, and mineralocorticoids. Steroid hormone receptors are not uniformly distributed, but occupy specific loci within the brain (McEwen 1999, Shughrue et al. 1997). As expected, the hypothalamus has a rich abundance of steroid receptors, particularly receptors for the sex hormones estrogen, progesterone, and testosterone. Remarkably, the limbic system is also a site for steroid action and contains estrogen, androgen, and glucocorticoid receptors. The cerebral cortex is also a site of steroid action and expresses receptors for estrogens and glucocorticoids. Some of these receptors arise from separate genes while others arise from alternative splicing of a single gene. For example, the progesterone receptor exists in multiple isoforms derived from the alternative splicing of one gene (Whitfield et al. 1999), whereas estrogen receptors have multiple receptor subtypes (ERα and ERβ) that originate from separate genes, as well as multiple isoforms (Warner et al. 1999, Keightley 1998).
1. Structure of Steroid Receptors
All genomic steroid hormone receptors are composed of at least three domains: the N-terminal (or A/B) region, the DNA binding domain containing two zinc fingers (region C), and the ligand-binding domain to which the hormone binds (region E). Select steroid receptors, such as the estrogen receptors, have an additional C-terminal (or F) region, which serves to modify receptor function (Tsai and O’Malley 1994, Warner et al. 1999) (Fig. 1A). The amino-terminal A/B region of steroid receptors is highly variable and contains a transactivation domain, which interacts with components of the core transcriptional complex. Region C of these receptors contains a core sequence of 66 amino acids that is the most highly conserved region of the nuclear hormone receptor family. This region folds into two structures that bind zinc in a way characteristic of type II zinc fingers. These two type II zinc clusters are believed to be involved in specific DNA binding, thus giving the C region its designation as the DNA binding domain. In addition, the C region plays a role in receptor dimerization (Fig. 1A and B). The D region contains a hinge portion that may be conformationally altered upon ligand binding. The E/F region is functionally complex since it contains regions important for ligand binding and receptor dimerization, nuclear localization, and interactions with transcriptional coactivators and corepressors (Horwitz et al. 1996).
It has been observed that a truncated receptor lacking regions A/B and C can bind hormone with an affinity similar to that of the complete receptor, suggesting that region E is an independently folded structural domain. Amino acids within region E that are conserved amongst all members of the nuclear receptor family are thought to be responsible for the conformation of a hydrophobic pocket necessary for steroid binding, whereas the non-conserved amino acids may be important for ligand specificity. The conserved amino acid sequences within the ligand binding domain of the nuclear receptors make up a common structural motif, which is composed of 11–12 individual α helices, with helix 12 being absent in some members of the superfamily of steroid receptors. Although domain F is not required for transcriptional response to hormone, it appears to be important for modulating receptor function in response to different ligands.
Of prime importance in estrogen receptor function is helix 12 (Fig. 1C), found in the most carboxyl-terminal region of the ligand-binding domain, which is thought to be responsible for transcriptional activation function (AF-2) activity (Warner et al. 1999). This region realigns over the ligand-binding pocket when associated with agonists, but takes on a different conformation with antagonists (Fig. 1). Such conformational changes are thought to affect interactions with coactivators, since helix 12 is required for interaction with several of the coactivator proteins and mutations that are known to abolish AF-2 transcriptional activity also abolish the interaction of the nuclear receptors with several of their associated proteins (McKenna et al. 1999, Warner et al. 1999). The ability of different ligands to determine the conformational state of the receptor has profound consequences on the ability of the steroid receptor to drive gene transcription.

Figure 1 Structural aspects of steroid receptors. A. The steroid receptors can be divided into 6 functional domains A, B, C, D, E, and F. The function of each domain is indicated by the solid lines. B. Ribbon representation of the ligand binding domain of estrogen receptor α indicating the ligand binding pocket. Upon ligand binding the receptor dimerizes and then can act as a transcription factor (adapted from Brzozowski et al. 1997). C. Ribbon representation of ERα demonstrating the conformational alterations induced by 17β-estradiol. Upon binding estradiol, helix 12 folds to cover the ligand binding pocket leaving the coactivator binding site free. D. Ribbon representation of ERα demonstrating the conformational alterations induced by raloxifene. Upon binding raloxifene, helix 12 is displaced obscuring the coactivator binding site (adapted from Brzozowski et al. 1997)
2. Classical Mechanism of Sex Steroid Action
Classically, sex steroids exert their effects by binding to intracellular receptors and modifying gene expression (Tsai and O’Malley 1994). Steroid receptors are ligand-activated transcription factors that modulate specific gene expression. In the presence of hormone, two receptor monomers dimerize and bind to short DNA sequences located in the vicinity of hormone-regulated genes (Fig. 2). These specific DNA sequences, or hormone response elements (HRE), contain palindromic or directly repeating half-sites (Tsai and O’Malley 1994). Since an HRE exerts its action irrespective of its orientation and when positioned at a variable distance upstream or downstream from a variety of promoters (Tsai and O’Malley 1994), it is an enhancer. Steroid receptors represent inducible enhancer factors that contain regions important for hormone binding, HRE recognition, and activation of transcription.
The mechanism of transactivation by nuclear receptors has recently proved more complex with the discovery of an increasing number of coregulators. Coregulator proteins can modulate the efficacy and direction of steroid-induced gene transcription. Coregulators include the coactivators ERAP160 and ERAP140, and RIP160, RIP140, and RIP80, which were biochemically identified by their ability to specifically interact with the hormone binding domain of the receptor in a ligand-dependent manner (McKenna et al. 1999) (Fig. 2). For example, the interaction between estrogen receptors and coregulators was promoted by estradiol whereas antiestrogens did not promote this interaction. Further studies led to the identification of many other coactivators including glucocorticoid receptor interacting protein 1 (GRIP 1), steroid receptor coactivator 1 (SRC 1), and transcriptional intermediary factor 2 (TIF 2). When cotransfected with nuclear receptors, including estrogen receptor, these coactivators are capable of augmenting ligand-dependent transactivation. In addition, the phospho-CREB-binding protein (CBP) and the related p300 have been demonstrated to be estrogen receptor-associated proteins and involved in ligand-dependent transactivation. It has also been shown that coactivator RIP140 interacts with the ER in the presence of estrogen, and this interaction enhances transcriptional activity by 4- to 100-fold, depending on promoter context.
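The palindromic half-site arrangement of a hormone response element described in this section can be made concrete with a small sequence search. The sketch below is illustrative only: the half-site GGTCA and the three-base spacer follow the commonly cited consensus estrogen response element, which is not given in the text and is used here as an assumption, and the test sequence is invented.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_palindromic_sites(dna, half_site, spacer=3):
    """Return start positions where `half_site` is followed, after
    `spacer` arbitrary bases, by its own reverse complement.

    Such an inverted repeat reads the same on both strands, which lets
    the two receptor monomers of a dimer each contact one half-site.
    """
    dna = dna.upper()
    half = half_site.upper()
    partner = reverse_complement(half)
    k = len(half)
    total = 2 * k + spacer
    hits = []
    for i in range(len(dna) - total + 1):
        if dna[i:i + k] == half and dna[i + k + spacer:i + total] == partner:
            hits.append(i)
    return hits


# Invented sequence containing one inverted repeat: GGTCA cag TGACC.
sequence = "TTAGGTCACAGTGACCTAA"
sites = find_palindromic_sites(sequence, "GGTCA", spacer=3)
print(sites)  # → [3]
```

A directly repeating arrangement could be searched the same way by looking for the half-site itself, rather than its reverse complement, after the spacer.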
3. Rapid Signaling Effects of Sex Steroids
Although steroids induce many of their effects in the brain through activation of genomic receptors, nontranscriptional actions of estrogens, progestins, glucocorticoids, and aldosterone have been observed in a variety of tissues, including the brain (Brinton 1993, 1994, Watson et al. 1995, Pappas et al. 1995). These nontranscriptional actions are characterized by short-term effects that range from milliseconds to several minutes. Additionally, these effects still occur in the presence of actinomycin D or cycloheximide, known inhibitors of transcription and of protein synthesis, respectively. The first suggestion of the nontranscriptional actions of steroids came when it was observed that progesterone induced rapid anesthetic and sedative actions when injected subcutaneously (Brinton 1994). Then in the 1960s and 1970s numerous studies suggested that estrogen modulated the electrical activity of a variety of nerve cells (Foy et al. 1999). Subsequent studies utilizing various techniques and preparations have shown that these effects occur too rapidly to be genomic in nature (Foy and Teyler 1983).
Morphological studies of neuronal development conducted in vitro have shown that estrogenic steroids exert a growth-promoting, neurotrophic effect on hippocampal and cortical neurons via a mechanism that requires activation of NMDA receptors (Brinton et al. 1997). In vivo studies have revealed a proliferation of dendritic spines following 17β-estradiol treatment that can be prevented by blockade of NMDA receptor channels, though not by AMPA or muscarinic receptor antagonists (Woolley 1999). Other reports have provided evidence that chronic 17β-estradiol treatment increases the number of NMDA receptor binding sites, and NMDA receptor-mediated responses (Woolley 1999).
Figure 2 Classical mechanism of steroid action. Upon binding of steroid (S) to an inactive steroid receptor, the receptor is activated and two receptor-ligand monomers dimerize and bind to the hormone response element (HRE). Coactivators such as RIP140, CBP, and SRC-1 bind to and link the hormone receptor with the general transcription factors (GTF) and RNA polymerase of the transcription machinery to alter transcription
Recent studies suggest a direct link between the estrogen receptor and the mitogen-activated protein kinase (MAPK) signaling cascade (Singh et al. 2000). MAPKs are a family of serine-threonine kinases that become phosphorylated and activated in response to a variety of cell growth signals. In neuronal cells, estrogen resulted in neuroprotection that was associated with a rapid activation of the MAPK signaling pathway (Singh et al. 2000). These neuroprotective effects, which occurred within 5 minutes of estrogen exposure, are thought to occur through the transient activation of c-src-tyrosine kinases and tyrosine phosphorylation of p21(ras)-guanine nucleotide activating protein (Fig. 3). Additionally, the potentiation of the NMDA receptor mediated neuronal response by estrogen is thought to be mediated by c-src-tyrosine kinases (Bi et al. 2000). It is not yet clear whether these effects require the classical estrogen receptor (ERα/β) or if there is an as yet undiscovered membrane receptor that mediates these rapid effects of estrogen (Razandi et al. 1999). Thus, while steroid receptors are ligand-induced transcriptional enhancers, they are also activators of intracellular signaling pathways that can profoundly influence neuronal function and survival.

Figure 3 Rapid signaling effects of sex steroids. Estradiol, acting through a possible membrane bound or cytoplasmic receptor, causes the transient activation of c-src-tyrosine kinase. Activated c-src can potentiate the glutamate response through the NMDA receptor. Additionally, c-src can cause the phosphorylation of p21 (ras)-guanine nucleotide activating protein leading to the activation of MAP Kinase, which has many downstream effects on cell survival and growth

Challenges remain in our understanding of steroid action and sites of action in brain. Remarkably, our knowledge of genomic sites of steroid action has greatly expanded in the past decade while the membrane sites of steroid action remain a challenge to fully characterize. Our understanding of steroid effects in brain has led to the expansion of the role of steroid receptors beyond sexual differentiation and reproductive neuroendocrine function to regulators of nearly every aspect of brain function including cognition. The full range of mechanisms by which steroids influence such a wide array of brain functions remains to be discovered.
See also: Gender Differences in Personality and Social Behavior; Gender-related Development
Bibliography
Bi R, Broutman G, Foy M R, Thompson R F, Baudry M 2000 The tyrosine kinase and mitogen-activated protein kinase pathways mediate multiple effects of estrogen in hippocampus. Proceedings of the National Academy of Science, USA 97: 3602–7
Brinton R D 1993 17 β-estradiol induction of filopodial growth in cultured hippocampal neurons within minutes of exposure. Molecular Cell Neuroscience 4: 36–46
Brinton R D 1994 The neurosteroid 3 alpha-hydroxy-5 alpha-pregnan-20-one induces cytoarchitectural regression in cultured fetal hippocampal neurons. Journal of Neuroscience 14: 2763–74
Brinton R D 1998 Estrogens and Alzheimer’s disease. In: Marwah J, Teitelbaum H (eds.) Advances in Neurodegenerative Disorders, Vol. 2: Alzheimer’s and Aging. Prominent Press, Scottsdale, AZ, pp. 99
Brinton R D, Proffitt P, Tran J, Luu R 1997 Equilin, a principal component of the estrogen replacement therapy premarin, increases the growth of cortical neurons via an NMDA receptor-dependent mechanism. Experimental Neurology 147: 211–20
Foy M R, Teyler T J 1983 17-alpha-Estradiol and 17-beta-estradiol in hippocampus. Brain Research Bulletin 10: 735–9
Foy M R, Xu J, Xie X, Brinton R D, Thompson R F, Berger T W 1999 17 β-estradiol enhances NMDA receptor mediated EPSPs and long-term potentiation in hippocampal CA1 cells. Journal of Neurophysiology 81: 925–8
Horwitz K B, Jackson T A, Bain D L, Richer J K, Takimoto G S, Tung L 1996 Nuclear receptor coactivators and corepressors. Molecular Endocrinology 10: 1167–77
Keightley M C 1998 Steroid receptor isoforms: Exception or rule? Molecular & Cellular Endocrinology 137: 1–5
McEwen B S 1997 Hormones as regulators of brain development: Life-long effects related to health and disease. Acta Paediatrica 422 (Suppl.): 41–4
McEwen B S 1999 Clinical review 108: The molecular and neuroanatomical basis for estrogen effects in the central nervous system. Journal of Clinical Endocrinology & Metabolism 84: 1790–7
McKenna N J, Lanz R B, O’Malley B W 1999 Nuclear receptor coregulators: Cellular and molecular biology. Endocrine Reviews 20: 321–44
Pappas T C, Gametchu B, Watson C S 1995 Membrane estrogen receptors identified by multiple antibody labeling and impeded-ligand binding. FASEB 9: 404–10
Razandi M, Pedram A, Greene G L, Levin E R 1999 Cell membrane and nuclear estrogen receptors (ERs) originate from a single transcript: Studies of ERalpha and ERbeta expressed in Chinese hamster ovary cells. Molecular Endocrinology 13: 307–19
Shughrue P J, Lane M V, Merchenthaler I 1997 Comparative distribution of estrogen receptor-alpha and -beta mRNA in the rat central nervous system. Journal of Comparative Neurology 388: 507–25
Singh M, Setalo G J, Guan X, Frail D E, Toran-Allerand C D 2000 Estrogen-induced activation of the mitogen-activated protein kinase cascade in the cerebral cortex of estrogen receptor-alpha knock-out mice. Journal of Neuroscience 20: 1694–1700
Tsai M J, O’Malley B W 1994 Molecular mechanisms of action of steroid/thyroid receptor superfamily members. Annual Review of Biochemistry 63: 451–86
Warner M, Nilsson S, Gustafsson J A 1999 The estrogen receptor family [Review] [50 refs]. Current Opinions in Obstetrics & Gynecology 11: 249–54
Watson C S, Pappas T C, Gametchu B 1995 The other estrogen receptor in the plasma membrane: Implications for the actions of environmental estrogens. Environmental Health Perspectives 103 (Suppl.): 41–50
Whitfield G K, Jurutka P W, Haussler C A, Haussler M R 1999 Steroid hormone receptors: Evolution, ligands, and molecular basis of biologic function. Journal of Cell Biochemistry 32–3 (Suppl.): 110–22
Woolley C S 1999 Electrophysiological and cellular effects of estrogen on neuronal function. Critical Reviews in Neurobiology 13: 1–20
R. D. Brinton and J. T. Nilsen
Sex Offenders, Clinical Psychology of
The clinical psychology of sex offenders involves assessment, treatment, and prevention. Clinical assessment involves the careful description of the problem and an estimation of the risk for recidivism, or re-offense. Clinical treatment involves interventions to reduce the risk of recidivism. Clinical prevention involves interventions before a person becomes a sex offender to reduce the risk of offending. There is much more research on assessment and treatment of sex offenders than there is on prevention of sex offending.
1. Who are Sex Offenders?
A sex offender is anyone who has forced another person to engage in sexual contact against their will. Sex offending may or may not involve physical force. For example, a person can use psychological force to get another person to have sex, as in the case of a power differential between two persons (e.g., employer–employee). However, coercive sex that involves physical force is more likely to be regarded as sex offending than coercive sex that does not. Another issue is the ability of the victim to give consent. Minors and developmentally disabled persons are usually considered to lack the ability to give consent. A person’s ability to give consent may also become impaired, such as in the case of substance abuse. Thus, a person engaging in sex with a person who is unable to consent or whose ability to give consent is impaired might be considered a sex offender. The term sex offender usually is associated with a person who has been apprehended in a legal context for their coercive sexual behavior. Nevertheless, not all persons who engage in sexually coercive behaviors are caught. Thus, two persons could engage in the same sexually coercive behavior, but the one who is caught would be considered a sex offender and the one who is not caught would not. Moreover, legal statutes
vary from jurisdiction to jurisdiction. For example, sexual penetration may be required to be considered a sex offender in some jurisdictions, while other jurisdictions may have broader definitions that include nonpenetrative forms of sexual contact. Legal jargon may obscure the meaning and impact of sexually coercive behavior, as well. For example, terms such as ‘indecent liberties’ and ‘indecent assault’ may both refer to rape, defined as forced sexual intercourse. However, indecent liberties or indecent assault charges carry very different legal meanings and punishments than does rape. Similarly, rape committed by a stranger is more likely to be prosecuted than rape committed by someone known to the victim. Nevertheless, the behavior in both instances is rape. Because of these inconsistencies and vagaries in legal definitions of sex offending, the current article will adopt a broad approach to sexual aggression, focusing on coercive sexual behavior whether or not the perpetrator has been apprehended. There is often disagreement between perpetrators and victims on whether sexual aggression actually occurred, or on its seriousness. A comprehensive discussion of the veracity of perpetrator and victim reports is beyond the scope of this article. However, current definitions of sexual aggression give more weight to victims’ perceptions of the occurrence of sexual aggression, given perpetrators’ tendencies to defend and minimize aggressive behavior and the inherent power differential between victims and offenders. Accusing someone of sexual victimization usually brings negative attention to the accuser, which makes such accusations disadvantageous to make. In courtroom settings, victims’ personal and sexual histories are examined in cases of sexual abuse in a manner that is unparalleled for other types of offenses.
Nevertheless, child custody disputes appear to be one context in which there may be some risk, albeit small, for false accusations of sexual abuse on the part of a parent seeking to disqualify the parenting fitness of the other parent. The Diagnostic and Statistical Manual (DSM) of the American Psychiatric Association includes several sexual disorders, or paraphilias, that sex offenders may have. These include fetishism (sexual arousal associated with nonliving objects), transvestic fetishism (sexual arousal associated with the act of cross-dressing), voyeurism (observing unsuspecting nude individuals in the process of disrobing or engaging in sexual activity), exhibitionism (public genital exposure), frotteurism (touching a nonconsenting person’s genitalia or breasts, or rubbing one’s genitals against a nonconsenting person’s thighs or buttocks), pedophilia (sexual attraction or contact involving children), sexual masochism (sexual arousal associated with suffering), and sexual sadism (sexual arousal associated with inflicting suffering). Although the fetish disorders do not involve contact with an actual person, these disorders can result in arrest when a fetishist
steals the fetish items (e.g., undergarments). Rape often does not involve sexual arousal directly associated with suffering and is more commonly classified in DSM as a component of antisocial personality disorder than as a sexual disorder. Such a classification appears justified, in that many rapists engage in both sexual and nonsexual forms of aggression and other rule-violating behaviors. The focus of this article will be on rape and child molestation because there exists more research on these topics than on any of the other types of sexual offenses. The emphasis in the literature on these two sex offenses reflects the fact that these may create more harm to victims than the other disorders.
2. History
The clinical study of sex offenders had its beginnings in research on sexuality. Some of the earliest scholarly writings on sexual deviance were detailed case studies by psychiatrist Krafft-Ebing (1965/1886). Krafft-Ebing postulated that all sexual deviations were the result of masturbation. The case study method focused exclusively on highly disturbed individuals without matched control cases, which precluded objective considerations of etiology (Rosen and Beck 1988). Kinsey and colleagues (1948) conducted large-scale normative surveys of sexual behavior, resulting in major works on male and female sexuality. Because adult–child sexual contact was relatively common in his samples, Kinsey underplayed the negative effects of such behavior (Rosen and Beck 1988). Work on sexual deviance continued at the Kinsey Institute after Kinsey’s death (Gebhard et al. 1965). A major advance in the assessment of sexual arousal was the development of laboratory measures of penile response by Freund (1963). Freund’s measure, known as the penile plethysmograph, involved an inflatable tube constructed from a condom, by which penile volume change in response to erotic stimuli (e.g., nude photographs) was measured by air displacement (Rosen and Beck 1988). Less intrusive penile measures to assess circumference changes were later developed (Bancroft et al. 1966; Barlow et al. 1970). Among behaviorists, penile response to deviant stimuli (e.g., children, rape) became virtually a gold standard of measurement for sexual deviance (e.g., Abel et al. 1977). The emphasis of behaviorists on the role of sexual arousal in sex offending came under criticism from feminist theories. Rape was conceptualized as a ‘pseudosexual’ act of anger and violence rather than as a sexual disorder. This approach was popularized by Groth (1979). More recent conceptualizations have incorporated sexual, affective, cognitive, and developmental motivational components of sex offending (Hall et al. 1991).
3. Risk Factors for Sex Offending
One major risk factor for being a sex offender is being male. Fewer than 1 percent of females perpetrate any form of sexual aggression, whereas the percentage of men who are rapists ranges from 7 to 10 percent, and the percentage of men who admit to sexual contact with children is 3 to 4 percent. Evolutionary psychologists suggest that mating with multiple partners provides males with a reproductive advantage. Thus, sexually aggressive behavior may be a by-product of evolutionary history. Nevertheless, most men are not sexually aggressive. Thus, being a male may be a risk factor for being sexually aggressive, but certain environmental conditions may be necessary for someone to become sexually aggressive. For example, aggressive behavior, including sexually aggressive behavior, is accepted and socialized among males more than females in most societies. Another risk factor for males becoming sexually aggressive, particularly against children, is personal sexual victimization. Boys who have been sexually victimized are more likely to engage in sexualized behaviors (e.g., sexual touching) immediately following sexual victimization than boys who have not. Moreover, in recent studies, over half of adult sex offenders have reported being sexually abused during childhood. However, the sexual abuse of males is not invariably associated with becoming sexually abusive, as the majority of males who are sexually abused do not become abusers. The single best predictor of sex offending is past sex offending. Persons who have sexually offended multiple times in the past are more likely to sexually offend in the future than those who have limited histories of sex offending. Recidivism rates for child molesters and rapists are similar at 25 to 32 percent. Persons who have committed multiple sex offenses have broken the barriers against offending and have lowered the threshold for re-offending by developing a pattern of behavior.
Thus, sex offenders having multiple offenses are the group that poses the greatest risk to community safety. Moreover, the predictive utility of past sex offending suggests that interventions with first-time offenders and interventions to prevent sex offending from starting are critical to avoid the establishment of an ingrained pattern of behavior.
4. Motivational Factors for Sexual Offending
What causes men to become sex offenders? One common explanation has been the sexual preference hypothesis that sex offenders are more sexually aroused by coercive than by consenting sexual activity. Learning theories posit that sexual arousal to coercive sexual activity is conditioned (see Classical Conditioning and Clinical Psychology). For example, a pedophile may have had sexually arousing experiences
during childhood with peers and may never have outgrown these experiences. The fusion of violent and sexual images in the media may influence some men to associate sexual arousal with aggressive activity. The sexual preference hypothesis appears to be most appropriate for some sex offenders against children, particularly those who offend against boys. For many men who molest children, their sexual arousal to children exceeds their sexual arousal to adults, as assessed by genital measures. However, some child molesters, particularly incestuous offenders, may be more sexually aroused by adults than by children. The sexual preference hypothesis is less applicable to men who rape adults. Most rapists’ sexual arousal is greatest in response to adult consenting sexual activity and less in response to sexual activity that involves physical force. However, a minority of rapists do exhibit a sexual preference for rape over consenting adult sexual activity. These rapists would be considered sexual sadists. Another common explanation of sex offending has been anger and power. Some feminist scholars have gone as far as to say that sex offending is a ‘pseudosexual’ act and that it is not about sex. However, recent feminist scholarship recognizes the sexual and aggressive aspects of sex offending. If sex offending is a violent, rather than a sexual, act, why does the perpetrator not simply assault the victim rather than sexually assaulting the victim? Anger and power are most relevant in explaining the rape of women by males. Adversarial relationships with women may cause some men to attempt to enforce their sense of superiority over women via rape. Some child molesters may also have anger and power motives for their offending. However, depression is a more common precursor to child molesting than anger. One source of depression may be perceived or actual social incompetence in peer relationships. 
Sexual contact with children may represent a maladaptive method of coping with this depression. Excuses to justify sexually aggressive behavior may be the motivation for some forms of sex offending. Such excuses deny or minimize the impact of sexual aggression and are known as cognitive distortions. Cognitive distortions are common in acquaintance rape situations and in incest. An acquaintance rapist may contend that rape cannot occur in the context of a relationship or that the existence of a relationship justifies any type of sexual contact that the rapist desires. Common cognitive distortions among incest offenders are that sexual contact with their child is a form of affection, or sexual education, or that it merely amounts to horseplay. Thus, the motivation for a person employing cognitive distortions is that the sexual contact is ‘normal’ and is not aggressive. It is likely that these different types of motivation for sex offending may interact for many sex offenders. Sexual arousal to coercive sexual behavior, emotional problems, and cognitive distortions may coexist for
some sex offenders, and a singular explanation of the basis of the problem may be inadequate. Nevertheless, complex explanations of sex offending are less likely to result in treatment interventions than are explanations that identify the major motivational issues.
5. Treatment Interventions for Sex Offenders
The most common forms of treatment for sex offenders have been behavioral, cognitive-behavioral, and psychohormonal interventions. Behavioral methods have typically involved interventions to reduce sexual arousal to inappropriate stimuli (e.g., children, rape) (see Behavior Therapy: Psychiatric Aspects). Aversive conditioning, which pairs sexual arousal to the inappropriate stimulus with an aversive stimulus (e.g., foul odor, thoughts about punishment for sex offending), is perhaps the most widely used behavioral treatment with sex offenders. Cognitive-behavioral methods focus on the influence of cognitive factors, including individual beliefs, standards, and values, on sex-offending behavior (see Cognitive Therapy). Common cognitive-behavioral treatments often involve cognitive restructuring, empathy enhancement, social skills training, and self-control techniques. Relapse prevention, which has been adapted from the treatment of addictive behaviors, has gained relatively wide acceptance in the treatment of sex offenders. This cognitive-behavioral method involves self-control via anticipating and coping with situations following treatment that create high risk for relapse (e.g., social contact with potential victims) (see Relapse Prevention Training). Psychohormonal treatments involve the use of antiandrogen drugs to reduce the production of testosterone and other androgens. Such androgen reduction suppresses sexual arousal. The antiandrogen that has been most commonly used is medroxyprogesterone (Depo Provera). Evidence across recent studies of rapists, child molesters, and exhibitionists suggests that cognitive-behavioral and psychohormonal interventions may be more effective than no treatment or behavioral treatments. Recidivism rates among sex offenders, as measured by arrests for sex offenses, are 25–32 percent. Sex offender recidivism rates in studies of behavioral treatments have been approximately 40 percent.
Both cognitive-behavioral and psychohormonal treatments have yielded recidivism rates of 13 percent. Why is there such a high rate of recidivism in studies of behavioral treatments? It could be contended that behavioral treatments are too narrowly focused on sexual arousal and ignore other motivational issues (e.g., emotional, cognitive). However, psychohormonal treatments also primarily focus on sexual arousal reduction and result in relatively low recidivism rates. Thus, it is possible that any positive effects of behavioral treatments may ‘wear off’ over time. It is also possible that the sex offenders in the
behavioral treatment studies were at higher risk for recidivism, although the recidivism rates for all forms of treatment are based on both inpatient and outpatient samples. Thus, there is support for cognitive-behavioral and psychohormonal treatments being the most effective treatments with sex offenders. Nevertheless, the evidence of this treatment effectiveness is somewhat limited. A major limitation of psychohormonal treatments is compliance. These treatments usually involve intramuscular injections and are effective only as long as they are complied with. Sex offender refusal and dropout rates with psychohormonal treatments range from 50 to 66 percent. Moreover, a study directly comparing the relative effectiveness of cognitive-behavioral vs. antiandrogen treatments within the same population of sex offenders is needed. Such a study could clarify whether the effective suppression of sexual arousal achieved by antiandrogen treatments is sufficient to reduce recidivism or whether the more comprehensive aspects of cognitive-behavioral treatments offer necessary adjuncts. Also unknown is whether a combined cognitive-behavioral plus antiandrogen approach would be superior to either approach alone. An encouraging development is evidence of effectiveness of cognitive-behavioral methods in reducing recidivism among adolescent sex offenders. Interventions with adolescents may prevent the development of a history of sex offending that is associated with reoffending. However, there are extremely few outcome studies of adolescent sex offender treatment. Most of the available treatment research on sex offenders has been conducted with European American populations. It is unknown whether these treatments are equally effective with other groups. For example, cognitive-behavioral methods are individually based. However, individually based interventions may not be as effective in cultures in which there is a strong group orientation.
Thus, individual change might be offset by group norms. For example, in some patriarchal groups, misogynous attitudes may be normative and sexual aggression permissible. Conversely, there may be protective cultural factors that could be mobilized to prevent sex offending. Thus, the context in which interventions are used needs to be examined in future research.
6. Prevention
There have been virtually no prevention studies that have examined perpetration of sex offending as an outcome measure. Yet the costs to society of sex offending, in terms of damage to and rehabilitation of victims and of incarceration and rehabilitation of offenders, implore us to seek ways to prevent the problem before it occurs. The recidivism rates of the most effective treatments are at 13 percent, which is
significantly better than other forms of treatment or no treatment. Yet, it could be contended that a 13 percent recidivism rate is unacceptably high. Perhaps the effective components of cognitive-behavioral interventions could be adapted for proactive use in prevention programs. Most sexual abuse prevention programs have focused on potential victims for interventions. However, perpetrators, not victims, are responsible for sex offenses. A completely effective sex offense prevention program would eliminate the need for victim programs. Prevention programs for potential victims are critically important in terms of empowerment. However, there has been a disproportionate amount of attention in sexual abuse prevention to victims. Relatively simple modifications of existing victim prevention programs could potentially go a long way toward preventing perpetration of sex offenses. For example, most children’s sexual abuse prevention programs present the concept of ‘bad touch,’ which usually is instigated by someone else. Missing from most of these programs, however, is the idea that not only should someone else not ‘bad touch’ you, but you also should not ‘bad touch’ someone else. Such an intervention could reach many potential perpetrators who would not otherwise receive this information. The impact of efforts to prevent sex offense perpetration is unknown. Thus, there is a great need for the development and evaluation of interventions to prevent sex offending.
See also: Childhood Sexual Abuse and Risk for Adult Psychopathology; Rape and Sexual Coercion; Regulation: Sexual Behavior; Sexual Harassment: Legal Perspectives; Sexual Harassment: Social and Psychological Issues; Sexual Perversions (Paraphilias); Treatment of the Repetitive Criminal Sex Offender: United States
Bibliography
Abel G G, Barlow D H, Blanchard E B, Guild D 1977 The components of rapists’ sexual arousal. Archives of General Psychiatry 34: 895–903
Bancroft J H, Gwynne Jones H E, Pullan B P 1966 A simple transducer for measuring penile erection with comments on its use in the treatment of sexual disorder. Behaviour Research and Therapy 4: 239–41
Barbaree H E, Marshall W L, Hudson S M 1993 The Juvenile Sex Offender. Guilford Press, New York
Barlow D, Becker R, Leitenberg H, Agras W W 1970 A mechanical strain gauge for recording penile circumference change. Journal of Applied Behavior Analysis 3: 73–6
Buss D M, Malamuth N M 1996 Sex, Power, Conflict: Evolutionary and Feminist Perspectives. Oxford University Press, New York
Freund K 1963 A laboratory method for diagnosing predominance of homo- or heteroerotic interest in the male. Behaviour Research and Therapy 1: 85–93
Gebhard P H, Gagnon J H, Pomeroy W B, Christenson C V 1965 Sex Offenders: An Analysis of Types. Harper & Row, New York
Groth A N 1979 Men Who Rape: The Psychology of the Offender. Plenum, New York
Hall G C N 1995 Sexual offender recidivism revisited: A meta-analysis of recent treatment studies. Journal of Consulting and Clinical Psychology 63: 802–9
Hall G C N 1996 Theory-based Assessment, Treatment, and Prevention of Sexual Aggression. Oxford University Press, New York
Hall G C N, Andersen B L, Aarestad S, Barongan C 2000 Sexual dysfunction and deviation. In: Hersen H, Bellack A S (eds.) Psychopathology in Adulthood, 2nd edn. Allyn and Bacon, Boston
Hall G C N, Barongan C 1997 Prevention of sexual aggression: Sociocultural risk and protective factors. American Psychologist 52: 5–14
Hall G C N, Hirschman R, Beutler L E (eds.) 1991 Special section: Theories of sexual aggression. Journal of Consulting and Clinical Psychology 59: 619–81
Heilbrun K, Nezu C M, Keeney M, Chung S, Wasserman A L 1998 Sexual offending: Linking assessment, intervention, and decision making. Psychology, Public Policy, and Law 4: 138–74
Kinsey A C, Pomeroy W B, Martin C E 1948 Sexual Behavior in the Human Male. W B Saunders, Philadelphia
Krafft-Ebing R von 1965 Psychopathia Sexualis. Putnam, New York (Original work published in 1886)
Marshall W L, Fernandez Y M, Hudson S M, Ward T 1998 Sourcebook of Treatment Programs for Sexual Offenders. Plenum Press, New York
Prentky R A, Knight R A, Lee A F S 1997 Risk factors associated with recidivism among extrafamilial child molesters. Journal of Consulting and Clinical Psychology 65: 141–9
Rosen R C, Beck J C 1988 Patterns of Sexual Arousal: Psychophysiological Processes and Clinical Applications. Guilford, New York
G. C. N. Hall
Sex Preferences in Western Societies
Since 1950, a great number of authors working in the fields of anthropology, demography, sociology, and psychology in North America and Europe have tried to determine parental sex preferences. The authors have used various approaches and samples, but in general, results are relatively consistent. In this entry, the four methods most frequently used in this field will be reviewed: (a) first-child preference, (b) only-child preference, (c) sex preference for the next child, and (d) the parity-progression ratio technique.
1. First-child Preference
Many authors have tried to determine the sex preference of adult individuals with regard to their firstborn child with a question such as ‘For your first child would you prefer a girl or a boy?’ More than 30 studies
conducted in North America, for the most part with samples of college students or respondents recruited for fertility studies, have indicated that most women and men prefer a boy rather than a girl for their firstborn child. These findings give the strong impression that the preference for a boy as a firstborn has been universal among people in Western societies since 1950. Since these results are based on a hypothetical situation that might not happen in the respondent’s life, Steinbacher and Gilroy (1985) have argued that an assessment of sex preference when the women are pregnant would be more valid. A review of the literature in English and French found 16 empirical investigations in which it was possible to identify the maternal sex preference of pregnant women. In eight of these studies, the information concerning the expectant father’s preference was also available, either from the fathers themselves (in three studies) or from their pregnant wives (in five studies). These studies clearly indicate that first-time pregnant women more often prefer a girl than a boy, especially after 1981, when a preference for a girl is shown in six out of seven studies. The data concerning expectant fathers are different; in fact, in seven out of the eight studies, men preferred a boy rather than a girl for a first child. Concerning the variables associated with the sex preference for a first child, two studies have presented data on pregnant women (Uddenberg et al. 1971, Steinbacher and Gilroy 1985). First, Uddenberg et al. (1971) found that women who grew up with only female siblings more often preferred a son first in a small sample of 81 first-time pregnant women in Sweden. (Some other studies showed the opposite: namely, that the more sisters a woman has, the greater her preference for a girl.) In addition, women who desire a girl are psychologically more autonomous than women who desire a boy. Interestingly, Uddenberg et al.
(1971) also found no significant difference between the age or the social classes of the women who preferred a girl or a boy, or who expressed no specific preference. Second, the study by Steinbacher and Gilroy (1985), which dealt with 140 first-time pregnant women in the USA, reported that older women more often chose the no-preference category, and that those who agreed strongly with the women’s movement preferred a girl rather than a boy. Otherwise, they did not find any significant relationship in variables such as race, income, marital status, or religion. Interestingly, in this type of literature, the no-preference percentage has varied from 25 percent to 59 percent since 1981. An important point is to determine whether in fact many of the women who claim to have no preference use this answer to hide a preference, as has been suggested by Pharis and Manosevitz (1980). Marleau et al. (1996) consequently checked the validity of this traditional sex preference question by comparing the answers to that question with the answers to a ‘feminine/masculine’ scale which assessed how pregnant women imagined the sex of their future baby. It was shown that women having expressed no sex preference on a direct question had in fact no explicit image of their baby as male or female on the ‘feminine/masculine’ scale. This experiment seems to confirm the validity of the classical direct question.
2. Only-child Preference
In reviewing the literature relative to only-child preference, Marleau and Maheu (1998) identified many studies in which it was possible to identify women’s and/or men’s sex preference(s). Two subgroups of studies are found. The first one consists of 11 studies in which the subjects were forced to select the sex of a child on the basis of the hypothetical situation that they would have only one child in their whole lives. In the second subgroup, five studies were designed to elicit the number of children the subjects desired in their lives. Some subjects declared that they wanted only one child, and it was possible from these answers to determine the sex of this child for further analysis. The results of these two subgroups of studies were collapsed together for the final analysis. It should be noted that all of these studies were made in the USA between 1951 and 1991, and that in the majority of studies the samples consisted of college or university students. The main results indicate that women, in general, prefer a boy to a girl for an only child. However, in three of the five most recently published studies, women more often prefer a girl rather than a boy for an only child. Results for men show that in nearly all studies, at least 70 percent prefer a boy. Some of these authors have tried to determine whether any variables are related to the sex preference. The most frequently found connection has been to education; in the two most recent decades, the data indicate that women who have reached university level more often prefer a girl than a boy for an only child. Men, whether they have a university education or not, prefer a boy more often. Pooler (1991) has identified two other variables: the variable ‘wife retaining her own name’ and the variable ‘religion.’ Female students who agree with the idea of retaining their own name after marriage more often prefer a girl.
In addition, Jewish students prefer a girl whereas Catholics and Protestants prefer a boy. Some authors have hypothesized that the women’s preference for a female only child could be attributed to the fact that the perceived traditional female role disadvantage appears now to be significantly diminishing in Western societies. For example, Hammer and McFerran (1988) showed, significantly, that all subgroups of females (except unmarried noncollege females) would prefer to be reborn as a female.
3. The Preference for Sex of the Next Child
Another method consists in asking individuals their sex preference for a next child, based on the existing family composition. This type of question is found habitually in fertility surveys. In general, the data indicate that women with only one child more often prefer a child of the opposite sex. Moreover, a high percentage of women prefer a child of the opposite sex when they already have two or three children of the same sex. (Men rarely participate in these fertility surveys.) For example, Marleau and Saucier (1993) showed that almost half of the women who already had a child hoped that their second child would be of the opposite sex in data from the Canadian Fertility Survey of 1984. Nearly 80 percent of women who already had two boys desired a girl for their next child, whereas nearly 50 percent of those with two girls preferred a boy. Some authors have worked with other measures, especially with the mean number of desired children. Here the results are mixed. Some studies have found that women with a boy and a girl intend to have the same mean number of children as those who have two children of the same sex, but other studies have found that the mean number of children desired is higher for the latter group. Other authors have worked with a measure such as the use of contraception by women who have already had children. A study done by Krishnan (1993) with the Canadian Fertility Survey of 1984 on women aged between 18 and 49 showed a son preference: the women who already had two sons were more likely to use contraception than those who had two girls.
4. Parity Progression Ratio
A method used by many authors is the parity progression ratio. This technique consists in observing real behavior rather than being satisfied with verbal statements of attitude as in the methods mentioned above. The parity progression ratio is the proportion of couples at a given parity who have at least one additional child. If certain sex compositions of existing children are associated with a lower than average progression ratio, the inference is made that the predominant sex in those compositions is preferred. More than 30 studies were identified in the literature. In general, the data indicate that couples with one child continue to bear children regardless of the sex of the first child. Parents with two children of the same sex are more likely to go on than those with one child of each sex. For example, it was shown (Marleau and Saucier 1996) in a large sample from the Canadian General Social Survey of 1990 that 58 percent of the couples with two children of the same sex were more likely to have another child as compared with 53 percent of couples with two children of both sexes. On the other hand, those who stopped childbearing the more often were those with a boy first and a boy second. No such differences in behavior occurred in couples with three children. This method is interesting because it can be computed from large databases collected for other purposes. But, it remains a weakness of this method that it gives good results only if sex preferences are relatively homogeneous in the population studied.
5. Conclusion
When we compare the findings reviewed above, we note that a global tendency is revealed by the first three, namely the increasing preference among women for a girl over a boy since 1980. More research will be needed to verify whether this trend will continue and to understand the reasons for this recent shift. A further trend is revealed by the first method, namely the increasing proportion of women who have chosen the no-preference option since 1980. This trend is visible for both pregnant and nonpregnant women.
See also: Family Size Preferences; Family Systems and the Preferred Sex of Children; Family Theory and the Realities of Childbearing Behavior; Family Theory: Economics of Childbearing; Fertility Control: Overview; Fertility: Proximate Determinants
Bibliography
Hammer M, McFerran J 1988 Preference for sex of child: A research update. Individual Psychology 44: 481–91
Krishnan V 1993 Gender of children and contraceptive use. Journal of Biosocial Science 25: 213–21
Marleau J D, Maheu M 1998 Un garçon ou une fille? Le choix des hommes et des femmes à l’égard d’un seul enfant. Population 5: 1033–42
Marleau J D, Saucier J-F 1993 Préférence des femmes canadiennes et québécoises non enceintes quant au sexe du premier enfant. Cahiers québécois de démographie 22: 363–72
Marleau J D, Saucier J-F 1996 Influence du sexe des premiers enfants sur le comportement reproducteur: une étude canadienne. Population 2: 460–3
Marleau J D, Saucier J-F, Bernazzani O, Borgeat F, David H 1996 Mental representations of pregnant nulliparous women having no sex preference. Psychological Reports 79: 464–6
Pharis M E, Manosevitz M 1980 Parental models of infancy: A note on gender preferences for firstborns. Psychological Reports 47: 763–8
Pooler W S 1991 Sex of child preferences among college students. Sex Roles 9/10: 569–76
Steinbacher R, Gilroy F G 1985 Preference for sex of child among primiparous women. The Journal of Psychology 124: 283–8
Uddenberg N, Almgren P E, Nilsson A 1971 Preference for sex of the child among pregnant women. Journal of Biosocial Science 3: 267–80
J. D. Marleau and J.-F. Saucier
Sex-role Development and Education
Today’s debate on sex or gender is dominated by two antagonistic paradigms: sociobiology and constructivism. On evolutionary terms man’s characteristics are determined by natural selection operating at the level of genes. Man has survived millions of years of competitive struggle for existence. This by definition implies egotism as a core feature of his genes, i.e., a readiness to maximize their own reproductive chances at the cost of others. Egotistic genes may produce ‘altruistic’ behavior when the net reproductive success of their own copies (embodied in close relatives) can be increased (kinship altruism) or future repayments in emergencies can be secured (reciprocal altruism). Gene egotism has implications for sex differences. Across all species females’ investment in reproduction is greater: they produce fewer and larger, i.e., more costly gametes; among mammals they invest time in carrying and nursing the young and among humans in taking care of them during an extended phase of dependency. Both sexes—being but vehicles of egotistic genes—strive to maximize reproductive success. Different strategies, however, will be efficient. In view of their high pre-investment, taking care of the young pays for females and given that females will take care, it pays better for males to spread their genes as widely as possible. For females sexual reserve is the better strategy—it allows them to test a male’s fidelity and increase his investments; and it pays to select high status males, since their greater resources increase chances of survival for the few young a female will be able to bear. In contrast, males profit from quick seductions and from selecting females according to beauty and youthfulness—criteria that indicate high reproductive capacity.
Further assumptions are needed: given that genes happen to reside half of the time of their existence in male, the other half in female bodies, they have to be assumed to operate differently in different environments. Also, the adaptive value of present characteristics can be justified only by reference to life conditions of our ancestors—of which we know little. Constructivism draws a contrary picture. Sex differences—so its core claim—are but social constructions. In fact, the classificatory system itself is a modern Western design stipulating the following features (see Tyrell 1986). (a) Reference to physical aspects, i.e., the presence/absence of a penis (some cultures focus more on social activities like childcare or warfare). (b) Binary, i.e., a strictly exclusive categorization (some cultures allow for or even ascribe a positive status to hermaphrodites). (c) Inclusive, i.e., all individuals are classified even those with unclear genetic make-up (using an operation to improve outward appearance if necessary). (d) Irreversible (except via operation) (some cultures allow for social sex role changes). (e) Ascriptive from birth on (some cultures define children as neutrals and ascribe sex role membership only in initiation rituals).
The very assumption of large sex differences is a modern idea that from the constructivist perspective is mistaken: it is an essentialist reification of what in fact is but a cooperative interactive achievement. Humans—so the basic tenet—don’t ‘have’ a sex and they ‘are’ not males or females, rather they ‘act’ and ‘see’ each other as such. Studies on transsexuality analyze the ways in which individuals perform and recognize gender. The two paradigms are based on opposite assumptions. In sociobiology man is but a vehicle for powerful genes, in constructivism he is an omnipotent creator. In sociobiology he is a lone wolf entering social relations only for reproductive concerns, in constructivism he is a social being whose very (even sexual) identity is dependent on interactive co-construction. Nevertheless, both paradigms concur in their ahistorical approach. In sociobiology genes that survived under the living conditions of our ancestors determine human dispositions forever—across all periods and societies. In constructivism man is created ever anew in each interaction situation. Both approaches simply ignore the power of history—consolidated in collective traditions and reproduced in biographical learning processes.
These will come to the fore in the following analysis of present sex role understanding. Three aspects will be discussed: its empirical description, historical emergence, and ontogenetic development.
1. Sex Differences—a Descriptive Perspective
Sex differences can be analyzed on various levels: on the level of the individual, the culture, the social structure.
1.1 Psychological Level
Recent meta-analyses of US data show that with greater educational equality sex differences in cognitive performance have largely disappeared over the past few decades except for a slight overrepresentation of males at the very top of mathematical and the very bottom of verbal abilities and an average higher male performance in spatial ability tasks. Male spatial superiority is even greater among Mexicans but has not been found among Eskimos. These latter findings suggest a connection between the development of spatial understanding and cultural differences in degree of supervision and control exerted upon young girls. It has been claimed that morality is gendered: women are maintained to be more flexible and care-oriented, men to be more rigidly oriented to abstract principles and more autonomous. These differences are seen to arise from differences in the structure of self shaped by the early experience of female mothering. Girls can maintain the primary identification with the first caretaker (relational self), while boys—in order to become different—have to distance themselves (autonomous self) (Gilligan and Wiggins 1988). Neither flexibility nor care, however, are specific to women. Flexibility is a correlate of a modern moral understanding. Kant had still ascribed exceptionless validity to negative duties given that God—not man—was held responsible for any harm resulting from compliance. On innerworldly terms, however, impartially minimizing harm is given priority over strict obedience to rules. If exceptions are at all deemed justifiable they will more likely be conceded by those who are aware of possible costs incurred by anyone affected. Individuals who are personally involved will be more knowledgeable of such costs. This may explain why women—in agreement with Gilligan—were found to judge more flexibly with respect to abortion, yet at the same time more rigidly with respect to the issue of draft resistance than men (Döbert and Nunner-Winkler 1985). Besides, those in power can insist more rigidly on their convictions, thus flexibility might be the virtue of subordinates. Care is not part of a relational self-structure produced in early childhood—rather it is part of the female role obligation.
Indeed, preschool girls showed no more empathic concerns than boys (Nunner-Winkler 1994), but a majority of (especially older) German subjects more often justified strictly condemning working mothers by referring to their dereliction of duty and to their egotistic strivings for self-fulfillment than to the harm their children might suffer.
1.2 Cultural Level
Two aspects—although empirically concurring—need to be distinguished: gender stereotypes and gender-role obligations. Gender stereotypes are collectively shared assumptions about the different ‘nature’ of men and women. Across cultures men are assumed to be aggressive, independent, and assertive, women to be emotional and sensitive, empathic, and compliant. Contradictory evidence does not detract from such persuasions—immunity to empirical refutation is the very core of stereotypes and there are mechanisms to uphold them.
Expectations guide the way observations are perceived, encoded, and interpreted, e.g., noncompliance is perceived as a sign of strength if shown by a man, of dogmatism if shown by a woman. Behavior that conforms to expectations is encoded on abstract terms, discrepant behavior with concrete situational details. This eases making use of the ‘except clause’ when interpreting deviant cases (e.g., for a woman she is extraordinarily assertive). Thus, suitably framed and interpreted even conflicting observations can stabilize stereotypes. With gender-role obligations, women are assigned to the private sphere, men to the public realm. Thus, taking care of children and household chores is seen to be primarily women’s task, breadwinning men’s. These roles are defined by contrasting features. Family roles are ascribed, diffuse, particularistic, affective and collectivity-oriented; occupational roles are achieved, specific, universalistic, affectively neutral, and self-oriented (Parsons 1964). Identifying and living in agreement with one’s gender role will influence ways of reacting, feeling, and judging. Thus, gender differences might be understood as a correlate not primarily of genetic dispositions or of a self-structure shaped in infancy, but rather of the culturally institutionalized division of labor between the sexes.
1.3 Sociostructural Level
Gender stereotypes and role obligations influence career choice and commitment to the occupational sphere. In consequence, there is a high gender segregation of the workforce. The proportion of women is over 90 percent in some fields (e.g., secretary, receptionist, kindergarten-teacher) and less than 5 percent in others (e.g., mechanic, airplane pilot). Jobs that are considered women’s work tend to offer fewer opportunities for advancement, less prestige, and lower pay than jobs occupied primarily by men. Worldwide the gender gap in average wage is 30–40 percent and it shows little sign of closing. Top positions in economy, politics, and sciences are almost exclusively filled by men, and part-time working is almost exclusively a female phenomenon. Both men and women tend to hold negative attitudes towards females in authority. Women entering male occupations are critically scrutinized, males entering female occupations (e.g., nursing) in contrast easily win acceptance and promotion.
2. The Historical Emergence of Gender Differences
There are two (partly independent) dimensions implied in the debate on gender roles: the hierarchical one of equality vs. inequality of rights and the horizontal one of difference vs. sameness in personality make-up. We begin with equality. According to medieval understanding individuals find themselves in different social positions by the will of God and it is God who commanded that women obey men. Enlightenment declared all men to be equal—irrespective of gender (or race). Thus, a new justification was needed if the subordination of women (or black slaves) was to be maintained. This instigated a search for ‘natural’ differences between women and men (between blacks and whites) that soon succeeded in specifying differences in brain size or the shape and position of sexual organs (in IQ)—much to the detriment of women (or blacks). Legal discrimination has largely discontinued. Women (and blacks) are granted full rights to vote or to participate in the educational system. The assumption of gender differences, however, is still prevalent. It arose in consequence of the industrialization process. In agricultural economies women had their own sphere of control (house, garden, cattle, commercialization of surplus products) and their contribution was essential for subsistence: ‘These women in no way resemble the 19th century image of women as chaste, coy, demure … Peers describe them as wild, daring, rebellious, unruly’ (Bock and Duden 1977). Industrialization led to a separation of productive and reproductive work, i.e., to the contrast between familial and occupational roles described above. With rapid urbanization and increasing anonymity around the turn of the twentieth century, antimodernist discontent grew. Increasingly, female ‘complementary virtues,’ e.g., empathy, emotionality, sensitivity came to be seen as bulwark against the cold rationality of the structure of capitalist economy and bureaucratic administration. This sentiment was (and partly still is) shared even by feminists deeply committed to secure legal, political, and social equality for women.
3. Ontogenetic Development
Each generation of newborns is an invasion of barbarians—nonetheless, within a decade or two most turn into useful members of their specific society. How is this effected? Education is too narrow a term in that it refers primarily to methods purposefully applied in order to produce desired results. Children, however, are influenced not primarily by planned educational actions, but rather by the entirety of their life conditions. They are not merely passive objects to social instruction, rather—in mostly implicit learning processes—they actively reconstruct the basic rule systems underlying their experiences. This way they acquire knowledge systems, value orientations, action dispositions. In this (self-)socialization process different learning mechanisms are at work: classical and instrumental conditioning produce response tendencies; through bestowal and withdrawal of love a conformity disposition is shaped; through parental authority and firmness the internalization of values is furthered; children imitate behavior of models they deem interesting and they implicitly recognize regularities and rule structures. With increasing cognitive and ego development reflexive self-distancing from and consciously taking a stance towards one’s previous learning history becomes possible. How are sex roles acquired? Increasingly parents advocate identical educational goals for boys and girls. Nevertheless, unwittingly, especially fathers tend to treat them differently—handling male infants more roughly and disapproving of sissy behavior (Golombok and Fivush 1994). Also, from early on, children prefer same-sex playmates (Maccoby 1990). Such early experiences may leave some traces. More direct sex role learning, however, seems to depend on sociocognitive prerequisites. A change has been documented to occur in children’s understanding of concepts. They shift from focusing on externally observable surface features to basic definitional criteria (in the case of nominal terms) or to the assumption of stable and essential inner characteristics that all members of a given category share (in the case of natural kind terms). Sex is treated like a natural kind term, i.e., sex is understood to remain constant despite outward changes, to denote some ‘essential’ even if unobservable commonness, and to allow for generalizing new information across all members of the same category (Gelman et al. 1986). This constitutes a universal formal frame of reference (that makes stereotypical thinking so irresistible). It needs to be filled with content. Children learn what is typical and appropriate for men and women in their culture by beginning to selectively observe and imitate exemplary same-sex models (Slaby and Frey 1975).
Largely, this learning process is intrinsically motivated (Kohlberg 1966), i.e., by the desire to be a ‘real boy/girl’ and become a ‘real man/woman.’ It proceeds by (mostly implicitly) reconstructing those gendered behavioral, expressive, and feeling rules that are institutionalized in the given culture. In our culture there are many cues from which children will read what constitutes sex-appropriate demeanor. The sexual division of labor is seen already in the family: women are more likely to sacrifice their career for the family (and men their family-life for their career) and even full-time-employed mothers spend considerably more time on childcare and housework than fathers. In the school, teachers tend to give more attention to boys and praise them for the quality of their work (while praising girls for neatness). Curricula may segregate the sexes, offering home economics to girls, contact sport or mechanic training to boys. The social structure of the school impresses the idea of male authority over women with most principals in elementary school being male although most teachers are female. In fact, it has been found that first-graders attending schools with a female principal display less stereotypical views on gender roles than children in schools with a male principal. These early lessons on sex differences are reinforced in public: in politics and business top positions are mostly filled by men and books, films, TV, and advertisements depict men as dominant, powerful, and strong, women as beautiful, charming, and yielding. Thus, consciously treating boys and girls alike in family and kindergarten will be of little avail in counterbalancing the impressive overall picture of structural sexual asymmetry and alleged differences between the sexes in personality characteristics and behavioral dispositions.
4. The Future of Sex Roles
With modernization, ascriptive categories lose importance. Social systems increasingly come to be differentiated in subsystems each fulfilling a specific function and operating according to its own code (Luhmann 1998). Gender is the code of the family; it cannot substitute for the codes of other subsystems, e.g., in science it is the truth of a statement, on the market it is the purchasing power, in court it is the lawfulness of the sentence that counts; the gender of the author, the customer, the judge are (or should be) irrelevant. True, there still exists inequality between the genders. Nevertheless, all over the world it has been drastically reduced over the past few decades. In all countries women have profited more by the educational expansion of the 1960s and 1970s; while at the beginning of the twentieth century in less than 1 percent of 133 countries analyzed voting rights were conceded to women, today all of them with male franchise (over 90 percent) have extended suffrage to women (Ramirez et al. 1997). Many countries have inserted a clause concerning equal social participation rights for women in their constitution and set up special institutions; also some have introduced affirmative actions. Increasingly, women come into prestigious positions which will improve chances for succeeding women who now meet with role models and old girls’ networks. Nevertheless, merely increasing the proportion of women on top will not suffice. It may only increase the tendency of a split up of female biographies with women opting for either family or career (while men can have both). A real change requires a more equal distribution of productive and reproductive work between the genders. In this respect Sweden has been quite successful by institutionalizing an egalitarian welfare regime (e.g., providing publicly financed daycare centers across the whole country; granting a generous leave of absence to both parents at child birth, and offering a specific paternity leave to fathers which 83 percent make use of). In Sweden we find a high birthrate (1.8) along with a high rate of female employment (75.5 percent). This stands in contrast to the situation in Germany where both rates are low (1.3; 59.7 percent). The German welfare system is described as paternalistic (e.g., women are offered extended publicly financed maternity leaves yet there are hardly any daycare centers for infants or all-day schools) and the disapproval of working mothers is especially high (Garhammer 1997). Such differences between countries may indicate that social policy does have an important part to play in reducing or reproducing gender inequalities.
See also: Education and Gender: Historical Perspectives; Education (Primary and Secondary Schools) and Gender; Gender and School Learning: Mathematics and Science; Gender Differences in Personality and Social Behavior
Bibliography
Bock G, Duden B 1977 Arbeit aus Liebe—Liebe als Arbeit. Zur Entstehung der Hausarbeit im Kapitalismus. Frauen und Wissenschaft. Beiträge zur Berliner Sommeruniversität für Frauen Juli 1976. Berlin, 118ff
Döbert R, Nunner-Winkler G 1985 Value change and morality. In: Lind G, Hartmann H A, Wakenhut R (eds.) Moral Development and the Social Environment. Precedent Publishing, Inc., Chicago, pp. 125–53
Garhammer M 1997 Familiale und gesellschaftliche Arbeitsteilung—ein europäischer Vergleich. Zeitschrift für Familienforschung 9: 28–70
Gelman S A, Collman P, Maccoby E E 1986 Inferring properties from categories versus inferring categories from properties: The case of gender. Child Development 57: 396–404
Gilligan C, Wiggins G 1988 The origins of morality in early childhood relationships. In: Gilligan C, Ward J V, Taylor J M (eds.) Mapping the Moral Domain: A Contribution of Women’s Thinking to Psychological Theory and Education. Harvard University Press, Cambridge, MA, pp. 110–38
Golombok S, Fivush R 1994 Gender Development. Cambridge University Press, New York
Kohlberg L 1966 A cognitive-developmental analysis of children’s sex-role concepts and attitudes. In: Maccoby E E (ed.) The Development of Sex Differences. Leland Stanford Junior University, Stanford, CA, pp. 82–173
Luhmann N 1998 Die Gesellschaft der Gesellschaft. Suhrkamp, Frankfurt a.M
Maccoby E E 1990 Gender and relationships. A developmental account. American Psychologist 4: 513–20
Nunner-Winkler G 1994 Der Mythos von den zwei Moralen. Deutsche Zeitschrift für Philosophie 42: 237–54
Parsons T 1964 The Social System. The Free Press of Glencoe, London
Ramirez F, Soysal Y, Shanahan S 1997 The changing logic of political citizenship: Cross-national acquisition of women’s suffrage rights, 1890–1990. American Sociological Review 62: 735–45
Slaby R G, Frey K S 1975 Development of gender constancy and selective attention to same-sex models. Child Development 46: 849–56
Tyrell H 1986 Geschlechtliche Differenzierung und Geschlechterklassifikation. Kölner Zeitschrift für Soziologie und Sozialpsychologie 38: 450–89
G. Nunner-Winkler
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Sex Segregation at Work
All societies organize work through a sexual division of labor in which women and men typically perform different tasks. These tasks may also depend on people’s age, race, ethnicity, nativity, and so forth. Although this sexual division of labor divides market and nonmarket work by sex, sex segregation usually refers to the sexual division of labor among paid workers. Thus, sex segregation is the tendency for the sexes to do different kinds of paid work in different settings. Customarily, ‘segregation’ denotes physical separation, as in school segregation by race. However, the term ‘sex segregation’ refers to physical, functional, or nominal differentiation in the work that women and men do. ‘Men’s work’ does not simply differ from ‘women’s work,’ however; predominantly male jobs tend to be better and more highly valued. This occurs both in vertical segregation, which assigns men higher-level jobs than women in the same occupation, and in horizontal segregation, in which the sexes pursue different occupations.
1. Sex Segregation and Sex Inequality
Segregation—whatever its form—is a key engine of inequality. The domains reserved for members of dominant and subordinate groups are not just different, they are unequal, with the more desirable domains reserved for members of dominant groups. Because economic, social, and psychological rewards are distributed through people’s jobs, segregation both facilitates and legitimates unequal treatment. Sex segregation reinforces sex stereotypes, thereby justifying the sexual division of labor it entails, and it reduces equal-status cross-sex contacts that could precipitate challenges to differentiation. In sum, workplace segregation benefits men and harms women (Cotter et al. 1997).
1.1 Consequences of Sex Segregation
Job, establishment, and occupational segregation account for most of the differences in the rewards that men and women garner from employment. The most important of these differences is earnings. Estimates of the importance of occupational segregation for the pay gap between the sexes vary widely, depending on the analytic approach. An estimated one-third of the earnings gap results from occupational segregation in the USA and abroad (Anker 1998). Establishment-level data for the USA, Norway, and Sweden by Petersen and his colleagues attribute about three-quarters of the gap to occupational segregation (Petersen and Morgan 1995). This approach may still underestimate the effect of occupational segregation on the pay gap: the amount of sex segregation in metropolitan labor markets determines how strongly segregation affects women’s and men’s earnings, with all women losing pay and all men gaining in highly segregated markets, and no pay gap if occupational segregation were very low (Cotter et al. 1997, p. 725). Men dominate the more lucrative jobs partly because employers reserve such jobs for men and partly because female work is devalued (England 1992). The under-valuation of female activities not only shrinks women’s pay, it also reduces their economic independence, familial power, and sense of entitlement. Sex segregation also creates disparities in the sexes’ authority, mobility opportunities, working conditions, and chance to acquire skills. Segregation disproportionately relegates women to dead-end jobs or jobs on short career ladders, thus reducing their aspirations and opportunity for mobility (Baron et al. 1986). This vertical segregation creates a ‘glass ceiling’ that concentrates women in the lower level positions and reduces their authority.
2. Theories of Segregation
The most common theories of sex segregation emphasize the preferences of workers and employers. Hypothetically, gender-role socialization or a sexual division of domestic labor that encourages men to pick jobs that maximize earnings and women to select jobs that facilitate child-rearing lead to sex-differentiated preferences (England 1992). Men’s vested interests also hypothetically prompt them to exclude women from ‘men’s jobs.’ The universality of the assumed sex-specific preferences limits the value of preference explanations: because preferences theoretically vary between, not within the sexes, the sexes should be completely segregated. Empirical evidence casts doubt on preference theories: youthful occupational preferences are only loosely related to the sex composition of adults’ occupation (see Jacobs 1989, 1999). Moreover, the sexes similarly value high pay, autonomy, prestige, and advancement opportunities, which should minimize between-sex differences. Explanations focusing on employers’ preferences stemming from sex biases and attempts to minimize employment costs through statistical discrimination are limited by their emphasis on hard-to-measure motives. Although it stands to reason that employers influence job-level segregation because they assign workers to jobs, it is difficult to learn why. However, by examining the effects of training and turnover costs and skill on women’s exclusion and sex segregation, Bielby and Baron (1986) showed that employers discriminated statistically against women by imputing to individuals stereotypical characteristics of their sex, but contrary to neoclassical economic theory, this practice was neither efficient nor rational.
Sex Segregation at Work In sum, standard theoretical approaches to segregation predict universally high segregation rather than explaining variation in its level. Although these theories have stimulated considerable research, it is difficult to directly test them, so the importance of women’s and employers’ ‘choices’ remains a matter of debate. More fruitful explanations (summarized later) try to explain variation in segregation across aggregates or over time.
3. Measuring Segregation
Segregation is a property of aggregates, not individuals. Its measurement depends on how it is conceptualized. The most common conceptualization, the sexes’ different representation across work categories, is measured by the index of dissimilarity, D, whose formula is $D = \frac{1}{2}\sum_i |f_i - m_i|$ ($f_i$ and $m_i$ represent the percentages of female and male workers in each work entity $i$). D is the percentage of either female or male workers who would have to change to a work category in which they are underrepresented for the sexes’ distributions across categories to be identical. The size of D depends both on the extent of segregation and the relative sizes of the work entities (e.g., occupation), so it is unsuitable for comparing segregation across populations with work entities of different relative sizes. A size-standardized variant of D which holds entity size constant permits comparing levels of segregation across populations. Segregation is also conceptualized as the concentration of the sexes in work entities dominated by one sex. This conception is relevant to the thesis that women are disproportionately crowded into relatively few occupations. A simple measure of concentration is the proportion of each sex in the n occupations that employ the most men or women. (For an index of concentration, see Jacobs (1999), who warns that concentration measures assume that female and male occupations are aggregated to the same degree.) A third conception of segregation is each sex’s probability of having a coworker in their job or occupation of the same or the other sex, designated as P* (Jacobs 1999). The value of P* depends on both the extent of segregation and the sex composition of the population. Occupations can be described in terms of their sex composition, and are sometimes characterized as ‘integrated’ or ‘segregated’ or ‘female- or male-dominated’ based on whether one sex is disproportionately represented.
For example, in the USA, billing clerk is a segregated occupation because 89 percent of its incumbents are female, whereas women are just 47 percent of the labor force. The individual-level experience of segregation is the sex composition of a worker’s own job or occupation, so a billing clerk holds a predominantly female occupation. In 1992, Charles proposed using log-linear methods to study sex segregation. Because log-linear methods
adjust for both the varying sizes of work entities and the sexes’ respective shares of the labor force they are useful for studying why the extent of segregation varies cross-nationally and over time. They also allow researchers to estimate the impact of independent variables on a sex’s representation in specific occupations or occupational categories, net of the occupational structure (Charles 1992, p. 488, 1998). All measures of segregation are sensitive to the level of aggregation of the work units of interest. If segregation exists within units—which it usually does—the more aggregated the units, the less segregation will be observed. Although fully disaggregated data, such as establishment-level data, are preferable, they are hard to come by. Cross-national data are particularly likely to be aggregated because researchers need to make work units comparable across countries and because the data some countries collect are broadly aggregated.
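The measures described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the head counts are hypothetical, the size-standardized index is computed in one commonly used way (each sex's within-category share is substituted for its raw count), and P* is written in the usual exposure-index form; the text above does not spell out either formula. The last helper merges two categories to show why more aggregated work units reveal less segregation.

```python
from typing import List, Sequence


def dissimilarity(women: Sequence[float], men: Sequence[float]) -> float:
    """Index of dissimilarity D on a 0-100 scale.

    women[i] and men[i] are head counts in work category i. Each sex's
    counts are turned into shares of that sex's total before comparing.
    """
    total_w, total_m = sum(women), sum(men)
    return 50.0 * sum(abs(w / total_w - m / total_m) for w, m in zip(women, men))


def size_standardized(women: Sequence[float], men: Sequence[float]) -> float:
    """D recomputed as if every category employed the same number of workers.

    Assumed formulation: replace raw counts by each sex's share within the
    category, then apply the ordinary D formula. Categories must be non-empty.
    """
    w_share = [w / (w + m) for w, m in zip(women, men)]
    m_share = [m / (w + m) for w, m in zip(women, men)]
    return dissimilarity(w_share, m_share)


def p_star_women(women: Sequence[float], men: Sequence[float]) -> float:
    """Chance that a randomly chosen woman's coworker in her category is a woman.

    Assumed exposure-index form of P*; it depends on both segregation and
    the overall sex composition of the workforce.
    """
    total_w = sum(women)
    return sum((w / total_w) * (w / (w + m)) for w, m in zip(women, men))


def merge(counts: Sequence[float], i: int, j: int) -> List[float]:
    """Aggregate categories i and j (with i < j) into one coarser category."""
    merged = list(counts)
    extra = merged.pop(j)
    merged[i] += extra
    return merged


if __name__ == "__main__":
    # Hypothetical head counts for four occupations of 100 workers each.
    women = [90, 60, 30, 20]
    men = [10, 40, 70, 80]
    print("D:", round(dissimilarity(women, men), 1))
    print("size-standardized D:", round(size_standardized(women, men), 1))
    print("P* (women with women):", round(p_star_women(women, men), 2))
    # Merging the two middle occupations hides the segregation between them.
    print("D after merging:", round(dissimilarity(merge(women, 1, 2), merge(men, 1, 2)), 1))
```

With these made-up counts D is 50, and merging the two middle occupations lowers it to 40, which is the aggregation effect described above. Because all four hypothetical occupations are the same size, the size-standardized value coincides with D here; the two diverge once category sizes differ.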
4. The Extent of Segregation
Most research on segregation focuses on the sexes’ concentration in different occupations. (An occupation comprises jobs that involve similar activities within and across establishments.) In a ‘typical’ country in the early 1990s about 55 percent of workers held either ‘male’ or ‘female’ jobs (Anker 1998, p. 5). Although countries vary considerably in the extent to which the sexes are segregated occupationally, in the 1990s most developed nations had segregation indices between 50 and 70 (Anker 1998). In 1997, the segregation index for detailed occupations in the USA was 53.9 (Jacobs 1999). The sexes are most segregated in the Middle East and Africa. In the early 1990s, segregation indices across detailed occupations were at least 70 for Tunisia, Kuwait, and Jordan, largely because of Muslim proscriptions against contact between the sexes. Most of the EU societies and ‘transition economies’ (former Soviet satellites) had indices in the mid-50s. Segregation was lowest in Asian/Pacific nations, with indices ranging from 36 (China) to 50 (Japan). China’s low segregation level results from its large, sex-integrated agricultural sector. Among the Western developed countries, Italian and US workers were least segregated, and Scandinavians most segregated. In these countries, occupational segregation is positively related to women’s labor force participation; indeed, the same factors facilitate women’s labor force participation and encourage sex segregation: paid parental leave and part-time work, which lead to statistical discrimination against women, and the shift of customarily female domestic tasks into the market economy (Charles 1992). Also contributing to cross-national differences in segregation are the relative sizes of the manufacturing and service sectors, since men tend to dominate the former and women the latter (Charles 1998). Variation in national commitment to gender ideology also explains cross-national differences in segregation (Charles 1992).
4.1 Segregation Across Jobs
Much higher levels of sex segregation exist in establishment-level studies of job segregation (a job is a position in an establishment whose incumbents perform particular tasks) because people in the same occupation work in different establishments or different jobs (Bielby and Baron 1984; Tomaskovic-Devey 1993). For example, male lawyers are more likely than female lawyers to work for law firms, and within law firms men dominate litigation. Organizations’ characteristics and personnel practices affect how segregated they are. Small and large establishments are the most segregated. Small establishments are segregated because they employ workers of only one sex, whereas large bureaucracies have segregative personnel practices, such as sex-segregated job ladders (Bielby and Baron 1984).
5. Trends in Segregation
In the last quarter of the twentieth century segregation fell in most of the world, although women continued to dominate clerical and service occupations, and most customarily male production jobs remained inaccessible to women (Charles 1998). Segregation increased during the 1970s and 1980s in Asian/Pacific countries; remained constant in Southern and Central European countries; and declined in the USA, Western Europe, the Middle East and North Africa, and other developing countries (Anker 1998, p. 328). In most countries declines reflected increased occupational integration as well as shifts in the occupational structure in which heavily segregated occupations shrank. In the USA, increased integration resulted primarily from women’s movement into customarily male occupations. The higher pay in customarily male jobs draws women. Large numbers of men enter customarily female occupations only when they become much more attractive, and such changes are rare. The size-standardized index was stable between 1990 and 1997, indicating that after 1990 real occupational integration halted, although fewer persons worked in heavily segregated occupations (Jacobs 1999).
5.1 Explaining the Decline in Segregation
According to the theoretical framework summarized above, the more similar the sexes become in their preferences and skills, the less segregated they will be. According to cross-national research, the narrowing of the education gap between the sexes has contributed to integration. In the USA, for example, as fewer women majored in education and more majored in business, educational segregation declined and occupations integrated. The shrinking experience gap between the sexes may also reduce segregation if it relieves employers’ misgivings over hiring women for jobs that involve firm-specific skills. However, employers’ hiring and job-assignment practices are the proximate determinants of the level of segregation. Many establishments employ women in customarily male jobs only when there are not enough qualified men. Thus, organizational and occupational growth both foster integration if they cause a shortfall of male labor. Shortages have also brought about integration when occupations deteriorated in their pay, autonomy, or working conditions (e.g., pharmacist, typesetter; Reskin and Roos 1990) and thus became less attractive to men. Employers who face economic penalties for segregating and whose job assignments are monitored are more likely than are others to integrate jobs (Baron et al. 1991). Thus, anti-discrimination and affirmative action regulations have fostered integration by making it illegal for employers to exclude workers from jobs on the basis of their sex. These rules have made a difference largely through ‘class-action’ lawsuits against employers. Anti-discrimination laws are ineffective in reducing segregation when regulations are not enforced and when class-action lawsuits are not permitted. Finally, according to development theory, modernization replaces ascription with achievement in filling social positions. However, modernization is associated with the growth of economic sectors that most societies label as ‘women’s work.’ Thus, the net effect of modernization, as measured by GNP, has been segregative because it expands female-dominated clerical and sales occupations (Charles 1992).
5.2 Nominal Integration
What appears to be genuine integration may be a stage in a process of re-segregation in which women are replacing men as the predominant sex. In the USA, for example, insurance adjusting and typesetting re-segregated over a 20-year period when these occupations changed from male- to female-dominated. For individual women, the corresponding process is the ‘revolving door’ through which women in male occupations pass back into customarily female occupations (Jacobs 1989).
5.3 Explaining Stability
Because the amount of sex segregation has changed so little, scholars have devoted as much attention to explaining stability as to explaining change. Unchanging levels of segregation may result from stability in the causal variables or from processes whose opposing effects cancel each other out. Gender ideology strongly favors stability. Many occupations are sex-labeled, representing a cultural consensus that they are appropriate for one sex but not the other (e.g., childcare worker, automobile mechanic). Although these labels are not deterministic (indeed they sometimes vary across societies), they predispose employers to prefer sex-typical workers and workers to prefer sex-typical jobs. Although state policies have altered gender ideology, it is usually a conservative force. Work organizations also resist change in customary business practices. Organizations’ structures and practices reflect the cultural environment present when they were founded, and inertia preserves these structures and practices. Such practices include sex-segregated promotion ladders, credentials that are required for jobs, and the use of workers’ informal networks to fill jobs. For example, recruitment through networks perpetuates the sex composition of jobs because people’s acquaintances tend to be of their own sex. Statistical discrimination or stereotype-based job assignments perpetuate the sex composition of jobs because they prevent employers from discovering that their stereotypes are unfounded. Thus, barring external pressures to alter job assignments, job segregation is unlikely to change. Of course, stable levels of segregation may be misleading if they are the result of counteracting forces that have opposing effects on the level of sex segregation. For example, in the 1980s Japan’s occupations became slightly more integrated, but this was offset by the increased number of workers employed in occupations that were either male- or female-dominated. The net effect of the growth of female-dominated clerical and service sectors in most economically advanced countries will cancel out some shift toward greater integration within occupations.
See also: Affirmative Action: Comparative Policies and Controversies; Affirmative Action: Empirical Work on its Effectiveness; Affirmative Action Programs (India): Cultural Concerns; Affirmative Action Programs (United States): Cultural Concerns; Affirmative Action, Sociology of; Discrimination; Education (Higher) and Gender; Equality of Opportunity; Feminist Legal Theory; Gender and School Learning: Mathematics and Science; Gender and the Law; Modernization, Sociological Theories of; Sex Differences in Pay
Bibliography
Anker R 1998 Gender and Jobs: Sex Segregation of Occupations in the World. International Labour Office, Geneva, Switzerland
Baron J N, Davis-Blake A, Bielby W T 1986 The structure of opportunity: How promotion ladders vary within and among organizations. Administrative Science Quarterly 31: 248–73
Baron J N, Mittman B S, Newman A E 1991 Targets of opportunity: Organizational and environmental determinants of gender integration within the California Civil Service 1979–1985. American Journal of Sociology 96: 1362–1401
Bielby W T, Baron J N 1984 A woman’s place is with other women. In: Reskin B F (ed.) Sex Segregation in the Workplace: Trends, Explanations, Remedies. National Academy Press, Washington, DC
Bielby W T, Baron J N 1986 Men and women at work: Sex segregation and statistical discrimination. American Journal of Sociology 91: 759–99
Charles M 1992 Cross-national variation in occupational sex segregation. American Sociological Review 57: 483–502
Charles M 1998 Structure, culture, and sex segregation in Europe. Research in Social Stratification and Mobility 16: 89–116
Cotter D A, DeFiore J A, Hermsen J M, Kowalewski B, Vanneman R 1997 All women benefit: The macro-level effects of occupational integration on earnings inequality. American Sociological Review 62: 714–34
England P 1992 Comparable Worth: Theories and Evidence. Aldine de Gruyter, New York
Gross E 1968 Plus ça change … ? The sexual structure of occupations over time. Social Problems 16: 198–208
Hakim C 1992 Explaining trends in occupational segregation: The measurement, causes, and consequences of the sexual division of labor. European Sociological Review 8: 127–52
Jacobs J A 1989 Revolving Doors. Stanford University Press, Stanford, CA
Jacobs J A 1999 The sex segregation of occupations: Prospects for the 21st century. Forthcoming In: Powell G A (ed.) Handbook of Gender in Organizations. Sage, Newbury Park, CA
Petersen T, Morgan L A 1995 Separate and unequal: Occupation-establishment sex segregation and the gender wage gap. American Journal of Sociology 101: 329–65
Reskin B F 1993 Sex segregation in the workplace. Annual Review of Sociology 19: 241–70
Reskin B F, Roos P A 1990 Job Queues, Gender Queues: Explaining Women’s Inroads Into Customarily Male Occupations. Temple University Press, Philadelphia, PA
Tomaskovic-Devey D 1993 Gender and Racial Inequality at Work. Cornell University, Ithaca, NY
B. F. Reskin
Sex Therapy, Clinical Psychology of
Sex therapy is an approach to the treatment of sexual problems. Results of psychophysiological studies in the 1960s of sexual responses in sexually functional subjects allowed gynecologist William Masters and psychologist Virginia Johnson to develop a therapeutic format for the treatment of sexual inadequacy. Apart from organic causes, sexual problems may be attributed to interpersonal difficulties and to problems with emotional functioning. Masters and Johnson were convinced that ‘adequate’ stimulation would result in sexual response in an engaged partner. Adequate stimulation feels good, is rewarding, and facilitates focus of attention on those feelings. Originally, adequate stimulation in therapies was restricted to behaviors comprising the classical heterosexual foreplay, coitus, and orgasm sequence. This criterion for adequate stimulation has been abandoned because of the rich diversity in sexual practices that exist in the real world. Masters and Johnson hypothesized that fear of failure, and as a consequence becoming a spectator of one’s own feelings, is the most important cause of sexual inadequacy. They devised a very ingenious procedure to assist people in becoming engaged in feelings of sexual excitement. Three important steps were specified. In the first step people learn to accept that mutual touching feels good. In this step demands for sexual performance, and anxieties related to such demands, are precluded by an instruction not to touch genitals or secondary sex characteristics (e.g., breasts). When the first step leads to acceptance of positive bodily feelings, the second step demands including genitals and breasts in mutual touching and caressing. Masters and Johnson have suggested some variations to accommodate specific sexual problems (e.g., premature ejaculation, erectile problems). The second step has to result in the physiological signs of sexual excitement: vaginal lubrication in women and erection of the penis in men. In the classical sequence the first and second steps prepared for the third step, which consisted of coital positions, again with variations, and stimulation through coitus to orgasm. Although coitus may be relevant from a reproductive point of view, it is not always the way, and certainly not the only way, to experience the ultimate pleasures of sex.
1. Desire, Excitement, and Orgasm
Masters and Johnson, on the basis of their psychophysiological studies, proposed a model of sexual response consisting of three phases: an excitement phase, an orgasm phase, and a resolution phase. The excitement phase and orgasm phase may be recognized as the second and third step of the classic treatment format. When other therapists began to apply Masters and Johnson’s format it became clear that many patients do not easily engage in interactions prescribed by the sex therapy format. Apparently, an initial motivational step was missing. People who hesitate or avoid intimate interactions may lack desire, which often means that they do not expect the interaction to be rewarding. In 1979 Helen Kaplan proposed adding a desire phase, preceding the three phases specified by Masters and Johnson. Since Kaplan’s proposal it has become clear that the prevalence of lack of desire is considerable. With hindsight most people accept that lack of desire must be the most important sexual problem. It is a problem for the individual who does not arouse desire in his or her partner. In most instances, not feeling desire is in itself unproblematic, but lack of sexual desire may become a problem in the relationship.
Sex therapy was a fresh and new treatment. Sexual problems were openly discussed, there was no time-consuming delving into past conflicts, and there were suggestions for a direct reversal of symptoms of sexual failure. Masters and Johnson preferred working with couples because the interaction within the couple often contributes in important ways to the sexual difficulties. Other therapists have offered treatment to individuals and to groups of individuals or couples. An alternative to Masters and Johnson’s therapy format is the mimicking of normal sexual development through the use of masturbation. This has been an important step for many women, especially those who missed this aspect of discovery of their own sexuality. In this approach people learn to induce sexual excitement through masturbation to eventually apply this ‘skill’ in interaction with a partner. Some approaches developed from behavior therapy and rational emotive therapy focused on performance anxiety and fear of failure. Others used interventions from couple and group therapy. It is fair to say that nowadays almost all approaches to sexual difficulties incorporate elements from the Masters and Johnson sex therapy format.
2. Sexual Dysfunctions in Men
2.1 Diagnostic Procedures
The aim of the initial clinical interview is to gather detailed information concerning current sexual functioning, onset of the sexual complaint, and the context in which the difficulty occurred. This information gathering may be aided by the use of a structured interview and paper-and-pencil measures regarding sexual history and functioning. An individual and conjoint partner interview, if possible, can provide additional relationship information and can corroborate data provided by the patient. The initial clinical interview should help the clinician in formulating the problem. It is important to seek the patient’s agreement with the therapist’s formulation of the problem. When such a formulation is agreed upon, it may guide further diagnostic procedures. Many men with erectile dysfunction may be wary of psychological causes of their problem. Psychological causes seem to imply that the man himself is responsible for his problem. This may add to the threat to his male identity that he is already experiencing by not being able to function sexually. Considering the way a man may experience his problem, it can be expected that it will not be easy to explain to him the contribution of psychological factors. A clinician knowledgeable in biopsychosocial aspects of sexual functioning should be able to discuss the problem openly with the patient. Dysfunctional performance is meaningful performance in the sense that misinformation, emotional states, and obsessive concerns about performance provide information about the patient’s ‘theory’ of sexual functioning. When contrasting this information with what is known about variations in adequate sexual functioning, it is often clear that one cannot but predict that the patient must fail. For the clinician a problem arises when, even with adequate stimulation and adequate processing of stimulus information according to the clinician’s judgment, no response results, either at a physiological or a psychological level. At this point, a number of assessment methods aimed at identifying different components or mechanisms of sexual functioning may be considered. In principle, two main strategies may be followed. In the first, although a psychological factor interfering with response cannot be inferred from the report of the patient, one can still suspect some psychological factor at work. Possibly the patient is not aware of this factor, and thus cannot report on it. Eliminating this psychological influence may result in adequate response. The second strategy applies when even with adequate (psychological) stimulation and processing, responding is prevented by physiological dysfunction. Physiological assessment may then aid in arriving at a diagnostic conclusion. The biopsychosocial approach predicts that it is inadequate to choose one of these strategies exclusively. The fact that sexual functioning is always psychophysiological functioning means that there may always be an unforeseen psychological or biological factor.
2.2 Psychological Treatments of Sexual Dysfunctions in Men
2.2.1 Approaches to treatment. The most important transformation of the treatment of sexual dysfunctions occurred after the publication of Masters and Johnson’s (1970) Human Sexual Inadequacy. First of all, they brought sex into the treatment of sexual problems. Before the publication of their seminal book, sexual problems were conceived as consequences of (nonsexual) psychological conflicts, immaturity, and relational conflicts. In most therapies for sexual problems sex was not a topic in the therapeutic transactions. There were always things ‘underlying,’ ‘behind,’ and ‘besides’ the sexual symptoms that deserved discussion. Masters and Johnson proposed to attempt directly to reverse the sexual dysfunction by a kind of graded practice and focus on sexual feelings. If sexual arousal depends directly on sexual stimulation, that very stimulation should be the topic of discussion. Here the second important transformation occurred: A sexual dysfunction was no longer something pertaining to an individual; rather, it was regarded as a dysfunction of the couple. It was assumed they did not communicate in a way that allowed sexual arousal to occur when they intended to ‘produce’ it. Masters and Johnson thus
initially considered the couple as the ‘problem’ unit. Treatment goals were associated with the couple concept: The treatment goal was orgasm through coital stimulation. This connection between treatment format and goals was lost once Masters and Johnson’s concept was used in common therapeutic practice. People came in for treatment as individuals. Male orgasm through coitus adequately fulfills reproductive goals, but it is not very satisfactory for many women because they do not easily achieve orgasm through coitus. What has remained over the years, since 1970, is a direct focus on dysfunctional sex and a focus on sexual sensations and feelings as a vehicle for reversal of the dysfunction. What Masters and Johnson tried to achieve in their treatment model is a shift in their patients’ focus of attention. Let us look at one of Masters and Johnson’s interventions to elucidate this point. People with sexual dysfunctions tend to wait and look for the occurrence of feelings, instead of feeling what occurs—hence, the spectator role. Their attention is directed towards something that is not there or does not exist, which is frustrating. In simplest form, Masters and Johnson propose to redirect attention by using the following steps: first of all they manipulate expectations by instructing the patient about what is allowed to occur and what is not. It is explained to the patient that nonsexual feelings are to be accepted as a way to accept sexual feelings later on, and therefore sexual areas are excluded in the initial homework tasks. From a psychological point of view this manipulation is ingenious; it directs attention away from sex—when you feel a caress on your arm it may be pleasant but (now) it is not sexual—however, at the same time, it defines sexual feelings as feelings in ‘sexual areas.’ To attain a direct approach of sexual function numerous variants of couple, communication, and group therapy have been used. 
Rational emotive therapy has been used to change expectations and emotions (see Behavior Psychotherapy: Rational and Emotive). To remedy biographical memories connected with sexual dysfunction, psychoanalytic approaches have been used as well as cognitive behavior therapy approaches (see Behavior Therapy: Psychological Perspectives). There are specific interventions for some dysfunctions; for example, premature ejaculation has been treated with attempts at heightening the threshold for ejaculatory release (stop–start or squeeze techniques). Recently, as a spin-off of research into cardiac vascular smooth muscle pharmacology, drugs have become available which act by relaxing smooth muscles in the spongiose and cavernous structures in the penis. This relaxation is necessary to allow blood flow into the penis, thus causing an erection. Some of these drugs (e.g., sildenafil, sold as Viagra) support the natural neurophysiological reaction to sexual stimuli. Others act locally in the penis without sexual stimulation. To slow down speed of ejaculation in men with premature ejaculation, SSRIs (selective serotonin reuptake inhibitors) seem to be helpful. Smooth muscle relaxants like sildenafil are helpful in older men when fewer naturally occurring transmitters are available. Men with vascular or neurodegenerative diseases (e.g., diabetes, multiple sclerosis) may also benefit from the use of smooth muscle relaxants. Although these drugs are very effective, they do not help every man with erectile problems. Pharmacological treatment for erectile disorder may be an important step in restoring sexual function. Most couples will need information and advice to understand what they may expect from this treatment. For many it will not bring the final resolution of their relationship problems. In addition to drug treatment they will need some form of sex therapy or psychotherapy.
2.2.2 Validated treatments for male sexual dysfunctions. It has been difficult to get an overview of treatments for sexual dysfunctions because any proposal about how to approach dysfunctions counted as valid. This has changed through the introduction of criteria for validated or evidence-based practice by the American Psychological Association (APA). From the timely review by O’Donohue et al. (1999) of psychotherapies for male sexual dysfunctions it appears that the state of the art is far from satisfactory. Following the criteria of APA’s Task Force, they found no controlled outcome studies for male orgasmic disorder, sexual aversion disorder, hypoactive sexual desire disorder, and dyspareunia in men. For premature ejaculation and for erectile disorder there is evidence for the usefulness of psychological treatment, but effects are limited and often unstable over time. Although the evidence-based practice movement should be firmly supported, unqualified support would be disastrous for the practice of the treatment of sexual problems.
The care for patients with sexual problems must be continued even without proof according to the rules of ‘good clinical practice.’ The sensible clinician will learn to be very careful about any claims concerning either diagnostic procedures or treatments.
3. Sexual Dysfunctions in Women
3.1 Diagnostic Methods
As with men, initial interviews should help the clinician in formulating the problem and in deciding whether sex therapy is indicated. Since sexual problems can be a consequence of sexual trauma, it is necessary to ask if the woman ever experienced sexual abuse. An important issue is the agreement between therapist and patient about the formulation of the problem and the nature of the treatment. To reach a decision to accept treatment, the patient needs to be properly informed about what the diagnosis and the treatment involve. Depending on the nature of the complaint, the initial interviews may be followed by medical assessments. In contrast to the assessment of men, the psychobiological assessment of women’s sexual problems is not well developed.
3.2 Psychological Treatments of Sexual Dysfunctions in Women
3.2.1 Approaches to treatment. As in men, the treatment of sexual dysfunction in women contains many elements from Masters and Johnson sex therapy. As noted before, an important addition, especially in women, is the use of masturbation to discover their own sexuality. Low sexual desire is generally treated with sensate focus exercises to minimize performance pressure, and communication training. In the treatment of sexual aversion the focus is on decreasing anxiety, the common core of sexual aversions. Behavioral techniques, like exposure, are most commonly used. Treatment of sexual arousal disorder generally consists of sensate focus exercises and masturbation training, with the emphasis on becoming more self-focused and assertive in asking for adequate stimulation. For the treatment of primary (lifelong) anorgasmia there exists a well-described treatment protocol. Basic elements of this program are education, self-exploration and body awareness, and directed masturbation. Because of the broad range of problems behind the diagnosis of secondary (not lifelong) and situational anorgasmia, there is no major treatment strategy for this sexual disorder. Depending on the problem, education, disinhibition strategies, and assertiveness training are used. It is important to identify unrealistic goals for treatment, like achieving orgasm during intercourse without clitoral stimulation. For dyspareunia (genital pain often associated with intercourse), there are multiple possible somatic and psychological causes. A common picture is vulvar vestibulitis, pain at small inflamed spots at the lower side of the vaginal opening. However, often there is no clear organic cause for the pain. Treatment should be tuned to the specific causes diagnosed and can vary from patient to patient.
Behavioral interventions typically include prohibition of intercourse, finger exploration of the vagina, first by the woman, then by her partner. Sensate-focus exercises may be used to increase sexual arousal and sexual satisfaction. Pelvic floor muscle exercises and relaxation training can be recommended in case of vaginismus or a high level of muscle tension in the pelvic floor.
Treatment of vaginismus commonly involves exposure to vaginal penetration by using dilators of increasing size or the woman’s fingers. Pelvic floor muscle exercises may be used to provide training in discrimination of vaginal muscle contraction and relaxation, and to teach voluntary control over muscle spasm. Pharmacological treatment of sexual disorders of women is just beginning. Smooth muscle relaxants have been used in women to ameliorate sexual arousal and as a consequence hypoactive sexual desire. It appears that drugs like sildenafil produce smooth muscle relaxation and increased genital blood flow, but they have no effect on the subjective experience of sexual response. In women with subphysiological levels of testosterone (mainly in the postmenopause), testosterone patches appear to have an effect on mood, energy, and libido. 3.2.2 Validated treatments for women’s sexual dysfunctions. Reviews of treatments for sexual dysfunctions in women following the criteria for validated or evidence-based practice have been published (O’Donohue et al. 1997, Heiman and Meston 1997, Baucom et al. 1998). Heiman and Meston conclude that treatments for primary anorgasmia fulfil the criteria of ‘well-established,’ and secondary anorgasmia studies fall into the ‘probably efficacious’ group. They conclude with some reservations that vaginismus appears to be successfully treated if repeated practice with vaginal dilators is included in the treatment. Their reservations are due to a lack of controlled or treatment comparison studies of vaginismus. All authors conclude that adequate data on the treatment of sexual desire disorder, sexual arousal disorder, and dyspareunia are lacking. Although the evidence-based practice movement deserves support, care for patients with sexual problems must be continued even without proof according to the rules of ‘good clinical practice.’
4. Future Directions Sex therapy bloomed in the 1970s and 1980s, but reviews of evidence-based treatments suggest that developments stagnated and very few new studies have been undertaken. The recent shift to biological approaches will continue, at least for a while. Viagra and testosterone patches will shortly be followed by more centrally acting drugs (e.g., dopamine agonists). The search for drugs has provoked a wide range of studies into the biological basis of sexual function. This work inspires behavioral and cognitive neuroscience studies, which may provide a framework and new tools to better understand sexual emotions and sexual motivation. See also: Psychological Treatments, Empirically Supported; Sexuality and Gender
Bibliography
Baucom D H, Shoham V, Mueser K T, Daiuto A D, Stickle T R 1998 Empirically supported couple and family interventions for marital distress and adult mental health problems. Journal of Consulting and Clinical Psychology 66: 53–88
Heiman J R, Meston C M 1997 Evaluating sexual dysfunction in women. Clinical Obstetrics and Gynecology 40: 616–29
Janssen E, Everaerd W 1993 Determinants of male sexual arousal. Annual Review of Sex Research 4: 211–45
Kaplan H S 1995 The Sexual Desire Disorders: Dysfunctional Regulation of Sexual Motivation. Brunner/Mazel, New York
Kolodny R C, Masters W H, Johnson V E 1979 Textbook of Sexual Medicine. 1st edn., Little, Brown, Boston
Laan E, Everaerd W 1995 Determinants of female sexual arousal: Psychophysiological theory and data. Annual Review of Sex Research 6: 32–76
Laumann E O, Paik A, Rosen R C 1999 Sexual dysfunction in the United States: Prevalence and predictors. Journal of the American Medical Association 281: 537–44
Masters W H, Johnson V E 1970 Human Sexual Inadequacy. 1st edn., Little, Brown, Boston
O’Donohue W T, Dopke C A, Swingen D N 1997 Psychotherapy for female sexual dysfunction: A review. Clinical Psychology Review 17: 537–66
O’Donohue W T, Geer J H (eds.) 1993 Handbook of Sexual Dysfunctions: Assessment and Treatment. Allyn and Bacon, Boston
O’Donohue W T, Swingen D N, Dopke C A, Regev L V 1999 Psychotherapy for male sexual dysfunction: A review. Clinical Psychology Review 19: 591–630
W. Everaerd
Sexual Attitudes and Behavior The focus in this article is on changes in premarital, homosexual, and extramarital sexual attitudes and behaviors as revealed in nationally representative surveys during the second half of the twentieth century. Attention will be mainly on shared sexual attitudes and behaviors in various societies. A shared attitude is a cultural orientation that pressures us to have positive or negative feelings toward some behavior, and to think about that behavior in a particular way.
1. Sexual Attitudes: Premarital Sexuality Findings from various countries will be examined and compared but because of the large number of national surveys conducted in the USA, we will start there. The first nationally representative sample of adults in the USA that used a series of scientifically designed questions to measure premarital sexual attitudes was completed in 1963 (Reiss 1967). The National Opinion Research Center (NORC) at the University of Chicago was contracted to do the survey. Reiss composed 24 questions about premarital sexual relationships that
formed two unidimensional scales and several subscales (Reiss 1967, Chap. 2). The highest acceptance was for the question asking about premarital coitus for a man when engaged. Even on this question only 20 percent (30 percent of males and 10 percent of females) said they agreed that such premarital coitus was acceptable (Reiss 1967, p. 31). The date of this survey (early 1963) was strategic because it was at the start of the rapid increase in premarital sexuality that came to be known as the sexual revolution, and which soon swept the USA and much of the western world. Two years later, in 1965, NORC fielded a national survey that contained four questions on premarital sexual attitudes. One of the questions also asked about the acceptance of premarital coitus for males when engaged. Scott (1998) reports an average acceptance rate of 28 percent, consisting of 37 percent of males and 21 percent of females in this survey. Acceptance was designated by a person checking either of the last two categories (‘wrong only sometimes’ and ‘not wrong at all’), and disapproval was indicated by a person checking either of the first two categories (‘always wrong’ and ‘almost always wrong’). The 20 percent acceptance rate in the 1963 national survey and the 28 percent rate in this 1965 survey lend credence to a rough estimate that at the start of the sexual revolution in premarital sex (1963–5), about one quarter of adults in the USA accepted premarital coitus while three quarters disapproved of it. In 1970 Albert Klassen and his colleagues at the Kinsey Institute conducted the next nationally representative survey (Klassen et al. 1989). Klassen also used the NORC to conduct his survey. He used four questions and asked about the acceptability of premarital intercourse for males and females, when in love and when not in love.
Reiss reported that the responses to questions about males specifying love and those specifying engagement were less than 2 percent apart, and so the Klassen question about premarital sex for males in love can be compared to the two previous surveys. Klassen reports that 52 percent (60 percent of males and 45 percent of females) chose the two acceptant categories of ‘wrong only sometimes’ or ‘not wrong at all’ (Klassen et al. 1989, p. 389). In just seven years acceptance of coitus rose from 20 percent in 1963 to 28 percent in 1965 and to 52 percent in 1970. The questions used in all three national surveys seem comparable and, most importantly, the size of the difference from 1963 to 1970 is so large that it is hard not to conclude that in those seven years, something that can be called a revolution began to evidence itself in American attitudes toward premarital coitus. Reiss considered the increased autonomy of females and young people as the key factor in the increased acceptance of premarital sexuality and developed the Autonomy Theory explanation of the sexual revolution around this concept (Reiss 1967, Chap. 10, Reiss and Miller 1979). In the 1960s a higher percentage of females were employed than ever before and this
meant more autonomy for them, and more autonomy for their children from parental controls and indoctrination. Females more than males were impacted by this increased autonomy and the three studies show much greater proportionate changes in female attitudes than in males. Adult males’ acceptance doubled from 30 percent in 1963 to 60 percent in 1970, whereas adult females’ acceptance rose to 4.5 times its 1963 level, from 10 percent to 45 percent, during the same period. The autonomy theory predicted that since female autonomy was increasing the most, female premarital permissiveness would increase the most. This is precisely what happened. Starting in 1972 the NORC introduced the General Social Survey (GSS) to gather national data annually or biennially on adults in the USA concerning premarital sexuality and a wide range of other nonsexual attitudes and behaviors. These data afford a basis here to examine the change in premarital sexual attitudes from 1972 to 1998. Unfortunately, the GSS researchers did not ask a question modeled after that in the 1963, 1965, and 1970 surveys, which all specified gender and the presence of love or engagement. The GSS question basically taps a respondent’s global response to the acceptability of premarital coitus. It asked: ‘If a man and a woman have sex relations before marriage, do you think it is always wrong, almost always wrong, wrong only sometimes or not wrong at all’ (Davis and Smith 1999, p. 235). The ‘always wrong’ response to the GSS question is the only response that clearly excludes acceptance of coitus even for an engaged or in love male and so it will be considered as the response indicating rejection of such behavior. The 1972 GSS survey reported that 63 percent accepted premarital coitus under some condition and 37 percent checked the ‘always wrong’ category. That was a significant growth from the 52 percent acceptance reported by Klassen for 1970. By 1975 the GSS surveys reported that acceptance had risen to 69 percent.
The rise was small after 1975 and acceptance was 74 percent in 1998 (Davis and Smith 1999, p. 235). Looking at all the national surveys in the USA from 1963 to 1998 the evidence is that the period of most rapid change in acceptance of premarital coitus was from 1963 to 1975, with an overall increase from 20 percent to 69 percent. That is the period that we can most accurately label as a premarital sexual revolution. In all the surveys discussed the data showed that females, much more than males, increased their acceptance of premarital coitus and this led to more gender equality in attitudes towards premarital coitus. The gender comparisons in 1963 were 10 percent female acceptance to 30 percent male acceptance. By 1975 the comparisons were 65 percent female acceptance to 74 percent male acceptance. Studies in a number of European societies indicate that even after the increased acceptance of premarital coitus in the USA, many Western countries were still more acceptant than the USA. For example, using
data from the International Social Survey Program (ISSP) of 1994, comparing the USA to five other societies, Scott reports that only Ireland was less acceptant of premarital coitus than the USA. Germany and Sweden were much more acceptant, and even Britain and Poland were significantly more acceptant (Scott 1998, p. 833). Unlike in the USA, Scott reports increases in acceptance of premarital coitus continuing in Britain during the 1980s and 1990s. There are other national surveys that can be studied such as the 1971 and 1992 national surveys in Finland that show similar trends to what was found in the USA (Kontula and Haavio-Mannila 1995, Chap. 12). Frequent mention of comparable changes in premarital sexual attitudes can be found in the International Encyclopedia of Sexuality in its accounts of 31 societies (Francoeur 1997). There are also a number of other European countries with national surveys taken in the 1980s and 1990s but lacking earlier national surveys for comparison. Nevertheless, what evidence we have on these other societies seems to support a significant increase in the acceptance of premarital coitus similar to what was happening in the USA, although not necessarily in the exact same years.
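The US trend described in this section can be checked with a few lines of arithmetic. The sketch below simply tabulates the acceptance percentages quoted in the text and computes percentage-point changes; the variable and function names are illustrative, and because the 1963, 1965, and 1970 questions differ in wording from the GSS question, the comparison is only approximate.

```python
# Acceptance of premarital coitus among US adults, in percent, as quoted in
# the text (1963: Reiss; 1965: NORC; 1970: Klassen; 1972-1998: GSS).
# Question wording differs across surveys, so comparisons are approximate.
ACCEPTANCE = {1963: 20, 1965: 28, 1970: 52, 1972: 63, 1975: 69, 1998: 74}

# Acceptance by gender, in percent, for the two years the text compares.
BY_GENDER = {
    1963: {"male": 30, "female": 10},
    1975: {"male": 74, "female": 65},
}


def point_change(series, start, end):
    """Percentage-point change in acceptance between two survey years."""
    return series[end] - series[start]


def points_per_year(series, start, end):
    """Average percentage-point change per year between two survey years."""
    return point_change(series, start, end) / (end - start)


def gender_gap(year):
    """Male minus female acceptance, in percentage points."""
    return BY_GENDER[year]["male"] - BY_GENDER[year]["female"]


if __name__ == "__main__":
    for start, end in [(1963, 1975), (1975, 1998)]:
        change = point_change(ACCEPTANCE, start, end)
        rate = points_per_year(ACCEPTANCE, start, end)
        print(f"{start}-{end}: {change:+d} points, {rate:.2f} points/year")
    for year in BY_GENDER:
        print(f"{year}: gender gap {gender_gap(year)} points")
```

On these figures acceptance rose 49 points between 1963 and 1975 (about 4.08 points per year) but only 5 points between 1975 and 1998 (about 0.22 points per year), while the male-female gap narrowed from 20 points to 9.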
2. Sexual Attitudes: Homosexuality In 1973, a question on homosexual behavior was first asked in the GSS national survey in the USA. No distinction was made between male and female homosexuality. Nineteen percent accepted homosexual behavior as ‘wrong only sometimes’ or ‘not wrong at all.’ That did not vary a great deal until the 1993 GSS survey where acceptance jumped to over 29 percent. It then rose to 34 percent in 1996 and to 36 percent in 1998 (Davis and Smith 1999). One can only speculate as to why in the early 1990s this change accelerated in the USA. Perhaps the changes in the 1980s toward greater civil rights for homosexuals encouraged the increase in acceptance of homosexual behavior itself. However, changes in the intensity of feelings cannot be indicated by the simple percent distribution on the GSS question on homosexuality. Clearly, using just one question to measure a sexual attitude, while useful, does not afford us sufficient information. There are national data on homosexuality from other countries. Inglehart, using his two World Value Surveys, compared 20 societies in 1981 and 1990 on the question of homosexuality. He reported that in all but three of the 20 (Ireland, Japan, South Africa) there was an increase in acceptance of homosexuality between 1981 and 1990 (Inglehart 1997, p. 279). In addition, the ISSP (1994) reported that Poland and Ireland were less acceptant of homosexuality than the USA, whereas Britain, West Germany, East Germany, and Sweden were more acceptant (Scott 1998, p. 833). When we compare changes in males and females in
the USA using the GSS data we find that females changed more than males in accepting homosexuality. Homosexual attitudes have traditionally been one of the very few sexuality areas where females equal or exceed males in the acceptance of a sexual behavior. This type of male/female difference was commonly found also in Western European countries studied in the World Values Surveys. But this higher female level of acceptance of homosexuality was not typically found in Eastern European or Asian countries (Inglehart 1997).
3. Sexual Attitudes: Extramarital Sexuality Extramarital sexual attitudes present a very different trend from either premarital or homosexual attitudes. The GSS data for the USA show that the acceptance of extramarital sexuality fell significantly between 1973 and 1988. Acceptance of extramarital sexuality (answering ‘wrong only sometimes’ and ‘not wrong at all’) in 1973 was 16 percent but by 1988 this had dropped to only 8 percent. Although male acceptance stayed higher than that of females, both genders showed close to a 50 percent decrease in acceptance over that period. The fear of HIV/AIDS may have played a role. Negative experiences with extramarital sexuality during the era of rapidly increasing divorce rates in the 1970s may also have contributed to this conservative trend. This more conservative shift in extramarital attitudes was seen in only a minority of the 20 countries on which Inglehart presents data for the period 1981–90 (Inglehart 1997, p. 367). France, Northern Ireland, Sweden, Argentina, and South Africa showed the sharpest decreases in their acceptance of extramarital sexuality. Meanwhile, Mexico, Italy, Finland, and Hungary evidenced the strongest changes toward greater acceptance of extramarital sex. This finding presents a most interesting puzzle as to why some countries changed to be more restrictive while others became more acceptant of extramarital sexuality. Adding more to this puzzle is the finding by Inglehart that from 1981 to 1990, 16 (of 19) countries increased their belief that a child needs two parents to be happy (Inglehart 1997, p. 287). Thus, there was clearly more agreement in most of these countries on the importance of the two-parent family than on extramarital sexuality. It must be that in some countries, extramarital sexuality was not seen as a challenge to the stability of the two-parent family.
There is a need here to also study the various types of extramarital coital relationships to distinguish the impact on stable relationships of having a casual vs. a love affair and/or a consensual vs. a nonconsensual affair (Reiss 1986, Chap. 3). These sexual complexities are but one of the many sexual conundrums waiting to be deciphered by more detailed research that can clarify and elaborate the survey research data presented here.
4. Relation of Sexual Behaviors and Sexual Attitudes The relation of sexual behavior to the attitude changes we have noted can be explored in several national studies in the USA. The 1982 National Survey of Family Growth (NSFG) interviewed females 15 to 44 years old. These 15–44 year old females were asked about their first coitus, and from this we can obtain retrospective reports for those in this sample who were teenagers in the early 1960s. Hopkins reports that the average percent nonvirginal for 15–19 year old females in 1962 was 13 percent and this rose to 30 percent by 1971 (Hopkins 1997). Zelnik and Kantner (1980) undertook three nationally representative samples of teenage females in the USA during the 1970s. They report that the percentage of 15–19 year old females who had experienced coitus rose from 30 percent in 1971 to 43 percent in 1976, and finally to 50 percent in 1979 (Zelnik and Kantner 1980). The 50 percent rate dropped to 45 percent in 1982 and rose back to about 50 percent in 1988 and has stayed close to that level since then (Singh and Darroch 1999, Reiss 1997, Chap. 3). There were many other changes in teenage sexual relationships, such as improved contraception, that cannot be discussed here. Finally, it should be noted that premarital coital behavior, just like premarital attitudes, changed much more for females than for males (Laumann et al. 1994). When we compare these behavioral changes with the attitudinal changes discussed above, it is apparent that the large increase in the acceptance of premarital coitus starting in the 1960s was very much in line with the contemporaneous increases in teenage coital behavior. Using the GSS surveys allows one to compare attitudes with behavior in all three areas of premarital, homosexual, and extramarital for specific years. These data show a relatively close relationship between attitudes and behaviors in all three areas.
For example, among those who said premarital sex is ‘always wrong’ 32 percent had premarital coitus in the last year, while among those who said premarital coitus was ‘not wrong at all,’ 86 percent had premarital coitus in the last year (Smith 1994, p. 89). The comparable figures for homosexuality are 1 percent vs. 15 percent, and for extramarital sex 2 percent vs. 18 percent. These are very large differences and they support the interactive relationship of attitudes and behavior in our sexual lives which others have commented upon (Reiss 1967, Chap. 7, Klassen et al. 1989, p. 253). Many of the same countries that were noted for increases in sexual attitudes since the 1960s also have national data supporting changes in sexual behavior, particularly in premarital sexuality (Inglehart 1997). A large national survey in England presents data supporting a very close association of attitudes and behaviors in premarital, homosexual, and extramarital sexuality (Johnson et al. 1994, p. 245). Also, in the area of homosexuality, Laumann reports similarities in his
1992 American data concerning homosexuality with data from other national surveys (Laumann et al. 1994). Finally, Francoeur’s encyclopedic work written by experts from 31 societies also supports a close connection between attitude changes and behavior changes in many of the countries studied (Francoeur 1997). The puzzle concerning how, and in what temporal sequences, attitudes and behavior influence each other is one that requires careful research and theoretical attention.
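The size of the attitude-behavior association reported from the GSS can be made concrete with a short calculation. The sketch below uses only the percentages quoted above from Smith (1994); the names are illustrative, and it assumes that the "comparable figures" for homosexual and extramarital sex contrast the same two response categories ('always wrong' vs. 'not wrong at all') as the premarital figures.

```python
# Percent reporting the behavior in the last year, by attitude toward it,
# as quoted in the text from Smith (1994). Pairing the homosexual and
# extramarital figures with the same two response categories is an assumption.
BEHAVIOR_BY_ATTITUDE = {
    "premarital": {"always_wrong": 32, "not_wrong_at_all": 86},
    "homosexual": {"always_wrong": 1, "not_wrong_at_all": 15},
    "extramarital": {"always_wrong": 2, "not_wrong_at_all": 18},
}


def attitude_gap(row):
    """Percentage-point gap in behavior between the two attitude groups."""
    return row["not_wrong_at_all"] - row["always_wrong"]


def attitude_ratio(row):
    """How many times more common the behavior is among those who accept it."""
    return row["not_wrong_at_all"] / row["always_wrong"]


if __name__ == "__main__":
    for area, row in BEHAVIOR_BY_ATTITUDE.items():
        print(f"{area}: gap {attitude_gap(row)} points, "
              f"ratio {attitude_ratio(row):.1f}x")
```

The gap is largest in absolute terms for premarital sex (54 points), while the ratio is largest for homosexual behavior (15 to 1). Neither measure says anything about which way the influence between attitude and behavior runs.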
5. Conclusions The representative national surveys examined lead to several important conclusions about sexual attitudes and behaviors in the post-Kinsey era. It seems clear that there has been a sexual revolution in the area of premarital sexuality in the USA and in a large number of other Western countries. This is evidenced in both attitudes and behaviors and more strongly on the part of females than males. There has also been a more moderate increase in the acceptance of homosexuality. Finally, it was found that the acceptance of extramarital sexuality has actually decreased in the USA and elsewhere while increasing in a few other countries. There is a dearth of theories regarding why such sexual changes have occurred. The autonomy theory argues that the key variable in premarital sexual attitude change is the rise in autonomy (Reiss 1967, Reiss and Miller 1979). Such a change is also part of increased social acceptance of gender equality and of premarital sexuality, particularly in a gender equal relationship. The extensive examination by Hopkins of national data in the USA to test the autonomy theory’s ability to explain premarital sexuality trends from 1960 to 1990 strongly supports the increases in female gender equality as a key determinant of changes in autonomy, which in turn produced changes in premarital sexual attitudes and behaviors (Hopkins 1997, Chap. 6).
6. Future Directions Reiss has delineated the nature of a new sexual ethic that he finds is increasingly popular in many countries in the Western world. He calls this new ethic, HER Sexual Pluralism, meaning that the moral yardstick in a sexual relationship is now the degree of Honesty, Equality, and Responsibility present. The older norms that judged people by whether they had performed a specific sexual behavior, have increasingly been replaced by focusing on the HER relationship parameters (Reiss 1997, Chaps. 1 and 10). This new ethic fits with the increased acceptance of premarital and homosexual sexuality, for the HER ethic does not use
marriage as the Rubicon of good and bad sexuality. The drop in the acceptance of extramarital sexuality in many countries may reflect the difficulty in carrying out two HER relationships simultaneously. HER sexual pluralism is well integrated with the more gender equal type of evolving Western society and is predicted to become the dominant sexual ethic of the twenty-first century. One other theoretical explanation of sexual trends comes from Inglehart (1997). He postulates that as capitalist societies become more affluent and increasing numbers of their citizens feel secure, there has occurred a rise in ‘non materialist’ values. These new values stress well being and quality of life, over accumulation of more economic wealth. Inglehart argues that as part of this major emphasis on quality of life, we are witnessing a liberation and pluralization of sexual values, particularly in the premarital and homosexual areas. Inglehart’s thesis is quite compatible with the autonomy theory because both theories are placing the birth of the changes in sexuality as a key part of the development of a new type of society; a society that is more autonomous and more concerned with the quality of life, than with economic survival. The fact that young people in these societies evidence these sexuality trends more than older people lends further support to the future growth of these changes. There appears to be a change in our basic social institutions in much of the world, and with that, a change in sexual relationships is occurring in line with the emerging HER sexual pluralism ethic. Many aspects of sexual attitudes and behaviors could not be discussed in this article. Even the surveys in the three areas examined illustrate the need for a more detailed examination of the many nuances of sexuality in each area. In addition, sexual science requires better coverage of peoples outside the Western World (Barry and Schlegel 1980, Reiss 1986).
We can make progress by combining qualitative and quantitative methods in our work and linking those many disciplines that study sexuality. One way that significant progress can be encouraged is by the establishment of a multidisciplinary Ph.D. degree in sexual science, and in the Spring of 1999 the Kinsey Institute started work to produce just such a degree program (Reiss 1999). This program will be immensely helpful in expanding sexual science’s ability to explain our sexual lives. Theory is explanation, and science without theory is just bookkeeping. In all science we need to know why something is the way we find it, not just describe what is found. With the growth of our scientific explanations we will be better able to contain the myriad of sexual problems that plague sexual relationships worldwide. See also: Gay, Lesbian, and Bisexual Youth; Gay/Lesbian Movements; Prostitution; Rape and Sexual Coercion; Rationality and Feminist Thought; Regulation: Sexual Behavior; Reproductive Rights in Affluent Nations; Sexual Behavior and Maternal Functions, Neurobiology of; Sexual Behavior: Sociological Perspective; Sexual Orientation: Historical and Social Construction; Sexuality and Gender
Bibliography
Barry H, Schlegel A (eds.) 1980 Cross-Cultural Samples and Codes. University of Pittsburgh Press, Pittsburgh, PA
Davis J A, Smith T W 1999 General Social Surveys, 1972–1998. University of Connecticut, Storrs, CT
Francoeur R T 1997 The International Encyclopedia of Sexuality. Continuum Publishing, New York, 3 Vols.
Hopkins K W 1997 An explanation for the trends in American teenagers’ premarital coital behavior and attitudes between 1960–1990. Unpublished doctoral dissertation, University of Minnesota, Minneapolis, MN
Inglehart R 1997 Modernization and Postmodernization: Cultural, Economic, and Political Change in 43 Societies. Princeton University Press, Princeton, NJ
Johnson A M, Wadsworth J, Wellings K, Field J, Bradshaw S 1994 Sexual Attitudes and Lifestyles. Blackwell Scientific Publications, Oxford, UK
Klassen A D, Williams C J, Levitt E E 1989 Sex and Morality in the US. Wesleyan University Press, Middletown, CT
Kontula O, Haavio-Mannila E 1995 Sexual Pleasures: Enhancement of Sex Life in Finland, 1971–1992. Dartmouth Publishing, Aldershot, UK
Laumann E O, Gagnon J H, Michael R T, Michaels S 1994 The Social Organization of Sexuality. University of Chicago Press, Chicago
Reiss I L 1967 The Social Context of Premarital Sexual Permissiveness. Holt, Rinehart & Winston, New York
Reiss I L, Miller B C 1979 Heterosexual permissiveness: a theoretical analysis. In: Burr W, Hill R, Nye I, Reiss I (eds.) Contemporary Theories About the Family. Free Press of MacMillan, New York, Vol. 1
Reiss I L 1986 Journey Into Sexuality: An Exploratory Voyage. Prentice-Hall, Englewood Cliffs, NJ
Reiss I L 1997 Solving America’s Sexual Crises. Prometheus Books, Amherst, NY
Reiss I L 1999 Evaluating sexual science: Problems and prospects. Annual Review of Sex Research 10: 236–71
Scott J 1998 Changing attitudes to sexual morality: A cross-national comparison. Sociology 32: 815–45
Singh S, Darroch J E 1999 Trends in sexual activity among adolescent American women: 1982–1995. Family Planning Perspectives 31: 212–19
Smith T W 1994 Attitudes toward sexual permissiveness: Trends, correlates, and behavioral connections. In: Rossi A S (ed.) Sexuality Across the Life Course. University of Chicago Press, Chicago
Zelnik M, Kantner J 1980 Sexual activity, contraceptive use and pregnancy among Metropolitan-area teenagers: 1971–1979. Family Planning Perspectives 12: 230–7
I. L. Reiss
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Sexual Behavior and Maternal Functions, Neurobiology of During the second half of the twentieth century, interdisciplinary research efforts have generated considerable information about the physiological mechanisms that control mammalian reproductive behaviors. The approach has used field and laboratory studies with animals to elucidate general principles that are beginning to be assimilated by the social sciences as they toil to understand the sexual and parental behaviors of our own species. What follows is a review of basic research on sexual behavior, sexual differentiation, and maternal functions in mammals including humans.
1. Sexual Behavior Mammalian sexual behavior is facilitated in males and females by the hormonal secretions of the testes and ovaries, respectively. In females the display of sexual behavior shows cycles closely associated with fluctuations in the hormonal output of the ovaries. In many species, including rodents commonly used in laboratory experiments, the cyclic display of behavior includes both changes in the ability of the female to copulate as well as her motivation or desire to engage in sexual behavior. In these species intercourse is often physically impossible except during a brief period of optimal hormonal stimulation of central and peripheral tissues. The central effects of ovarian hormones (i.e., estrogen (E) and progesterone (P)) in the facilitation of female sexual behavior are mediated by receptors found in several brain regions including the ventromedial nucleus of the hypothalamus (VMH), the preoptic area, and the midbrain central gray. The effects of E and P on neurons of the VMH appear to be sufficient to facilitate the display of the postural adjustment necessary for copulation in female rats (i.e., the lordosis reflex), and the display of lordosis is prevented by lesions of the VMH (Pfaff et al. 1994). Female rats with VMH lesions can show lordosis if other neural systems that normally inhibit the display of lordosis are also surgically removed (Yamanouchi et al. 1985). The rewarding or reinforcing aspects of female copulation are likely to be mediated by dopaminergic systems that include the nucleus accumbens. The rewarding aspects of female sexuality appear to be activated only when the female can control when and how often she has contact with the male during mating, i.e., when the female can ‘pace’ the copulatory encounter (Erskine 1989). In sharp contrast with nonprimate species, female monkeys are capable of engaging in sexual behavior at all phases of the ovulatory cycle and after ovariectomy.
This fact, which of course also applies to women, has often been used to question the importance of ovarian hormones in the modulation of female sexuality in
primates. Other work, however, has shown that in spite of having the ability to copulate throughout the menstrual cycle, female monkeys show fluctuations in sexual motivation that are predictable from the pattern of ovarian E production across the cycle. For the behavioral effects of E to be evident, female monkeys must be tested in situations in which they are given the opportunity to choose between avoiding contact with the male or alternatively to approach the male and solicit his attention. In a series of elegant experiments conducted under naturalistic conditions Wallen and associates (Wallen 1990, 2000) have shown that in rhesus females, the willingness to leave an all-female group in order to approach a sexually active male peaks at the time of maximal E production at the end of the follicular phase. Further, under the same conditions, females never approach the male if their ovarian functions are suppressed by pharmacological manipulations. As argued by Wallen, in rhesus females and perhaps also in women, ovarian hormones do not determine the ability to engage in sexual behavior but have salient effects on sexual desire. Almost nothing is known about where E acts in the brain to facilitate sexual motivation in female primates. In male mammals, castration results in a reduction in sexual behavior and replacement therapy with testosterone (T) or its metabolites restores the behavior to precastration levels. In nonprimate species both the motivational and performance aspects of male sexuality are affected by lack of T. In men and possibly other primates, lack of T seems to affect sexual motivation more than sexual ability (Wallen 2000). Thus, men with undetectable circulating levels of T can reach full erections when shown visual erotic materials, but report little sexual interest in the absence of T replacement. In men, erectile dysfunction is often due to nonendocrine causes such as vascular pathologies or damage to peripheral nerves.
In the brain T seems to act primarily in the medial preoptic area (MPOA) to facilitate male sexual behavior, but it is evident that other brain regions as well as the spinal cord and other peripheral sites need hormonal stimulation for the optimal display of sexual behavior in males. Lesions of the MPOA have immediate and profound disruptive effects on male sexual behavior and these effects are remarkably consistent across species. Lesions of the MPOA however do not equally affect all components of male sexual behavior and some of the effects of the lesions are paradoxical. For example, in male rats lesions of the MPOA that virtually abolish male sexual behavior do not seem to affect willingness to work on an operant task when the reward is access to a receptive female. Similarly, male mice continue to show courtship behavior directed to females after receiving large lesions of the MPOA. It has been suggested that lesions of the MPOA selectively affect consummatory aspects of male behavior leaving appetitive or motivational components intact. Not all the data fit this proposed dichotomy. For
example, rhesus monkeys that show copulatory deficits after MPOA damage do not lose all the consummatory aspects of male behavior; after the lesions the animals are capable of achieving erections and they frequently masturbate to ejaculation (Nelson 2000). The view that male sexual behavior results primarily from T action in the MPOA and female sexual behavior from E and P action in the VMH is not without detractors. In a recent study of male sexual behavior in the rat, it was found that androgen antagonists implanted directly into the MPOA did block male sexual behavior, as would be expected. However, the most effective site for androgen antagonist blockade was in the VMH (McGinnis et al. 1996). There is also considerable overlap between the neural systems that are active during female sexual behavior and those that are active in the male during copulation. These studies may indicate that the appetitive aspects of sexual behavior (seeking out a partner, courtship, etc.) are under the control of similar neural systems in both sexes (Wersinger et al. 1993).
2. Sexual Differentiation
In addition to the activational effects gonadal hormones have on adult sexual behavior, there is an extensive literature showing that gonadal hormones also affect the development of brain systems that regulate male and female sexual behavior. These long-lasting effects are often referred to as ‘organizational effects’ to distinguish them from the concurrent facilitative actions that gonadal hormones have in the adult. Organizational hormone actions are thought to occur primarily during the period of sex differentiation of the nervous system. In species with short gestation, such as rats and hamsters, this developmental period occurs around the time of birth, whereas in those species with longer gestation times, such as primates, sex differentiation of the nervous system takes place during fetal development (Nelson 2000). Normal males are exposed to androgens throughout this early developmental period as a result of the testes becoming active during fetal development. When genetic female rats were treated with T throughout this time they showed significant masculinization of behavior and genital morphology. As adults, these experimental females displayed most of the elements of male sexual behavior including the ejaculatory reflex. When males were deprived of androgens during this time they were less likely to show the consummatory responses associated with masculine copulation. Further, in the absence of androgens during early development, male rodents develop into adults who show most of the elements of female sexual behavior if treated with ovarian hormones. For many laboratory rodents the behavioral masculinizing effects of T treatment result from metabolites of T, namely estradiol and reduced androgens such as dihydrotestosterone. There is good evidence that for some species it is the estrogenic metabolite that is essential for behavioral masculinization. Ovarian hormones do not appear to play a role in the development of the neural systems that underlie feminine sexual behavior in mammals. When female laboratory rodents, such as the golden hamster, are treated with estrogen during early development they actually show reduced levels of female sexual behavior as adults. However, their levels of male-like behavior, such as mounting receptive females, are increased, findings that are consistent with the concept that estrogen is a masculinizing metabolite of T in males. The findings on the effects of gonadal hormones during early development of the female are not easily interpreted, because an independent definition of just what is feminine and what is masculine is not available. On the one hand it is clear that if a female rodent is exposed to high levels of androgen throughout early development she will develop a male-like anatomy and will, as an adult, show all the elements of male sexual behavior. On the other hand, female rodents are often exposed to low levels of androgens normally during gestation as a result of developing next to a male in the uterus. These females are often more dominant than other females and less attractive to males (VomSaal 1989). But they still copulate as females and reproduce, suggesting that normal variations in female phenotype may result from normal variation in androgen exposure during development. This variability in female behavior and attractiveness may be an important part of any normal female population. The problem lies in defining the limits of normal female variation resulting from androgen (or estrogen) exposure, as opposed to what androgen effects might be interpreted as masculinization (Fang and Clemens 1999). The criteria for making this distinction have not yet been defined.
Numerous sex differences have been reported for the mammalian nervous system and most of these probably result from the differential exposure to gonadal hormones that occurs during sex differentiation. It is also presumed that the sex differences in behavior that we have noted result from these differences in the nervous system of males and females, but few models are available to show a strong correlation between sex differences in the CNS and sex differences in behavior (but see Ulibarri and Yahr 1996). Gonadal hormones can influence the development of the nervous system in a number of ways such as altering anatomical connectivity, neurochemical specificity, or cell survival. For example, while both male and female rats have the same number of nerve cells in the dorsomedial nucleus of the lumbosacral cord prior to sex differentiation, in the absence of androgen many of these cells die in the female, a normal process referred to as ‘apoptosis.’ This differential cell-death rate leaves the adult male rat with a larger dorsomedial nucleus than the female. In some brain regions it is suspected that hormones may actually promote cell death, resulting in nuclei that are larger in the female than in the male. There are also numerous examples of sex differences in the peripheral nervous system. For many years sex differences in human behavior were regarded as reflections of differences in how male and female children are reared. However, the accumulation of volumes of work showing that sex differences in nonhumans are strongly influenced by differential hormone exposure has forced scientists from many fields to re-evaluate the nurture hypothesis. Most would probably now agree to several general statements.
1. The brains of male and female humans are structurally very different.
2. These differences may reflect, at least in part, the different endocrine histories that characterize the development of men and women.
3. Some of the behavioral differences between males and females may result from these differences in their nervous systems.
Disagreement occurs when we try to specify how much of one trait or another is due to biological factors and how much to experience, and it must be recognized at the outset that a clean separation of nature from nurture is not possible. Most of the evidence for biological factors operating to produce sex differences in human behavior comes from clinical or field psychology studies. A number of syndromes involve variation in androgen or estrogen levels during early development: congenital adrenal hyperplasia (CAH) is a syndrome in which the adrenal gland produces more androgen than normal. Turner's syndrome is characterized by regression of the ovaries and reduced levels of androgen and E from an early age. Hypogonadism is a condition where boys are exposed to lower than normal levels of androgen.
There are also populations of girls and boys who were exposed to the synthetic estrogen diethylstilbestrol (DES) during fetal development, as well as populations whose mothers were treated with androgen-like hormones during pregnancy. A number of studies point to a change in girls' play behavior as a result of exposure to androgens during fetal development. CAH girls or girls exposed to exogenous androgenic compounds are often found to show an increased preference for playing with toys preferred by boys and to be less likely to play with toys preferred by untreated girls. These androgen-exposed girls are also more likely to be regarded as ‘tomboys’ than their untreated sibs or controls, and they engage in more rough and tumble play than controls (Hines 1993). Some have argued that variation in androgen or E levels during early development may affect sexual orientation, but a general consensus on this point has not been reached at this time.
3. Maternal Functions
For rodents, especially in the case of the laboratory rat, the endocrine and neural mechanisms responsible for the onset and maintenance of maternal behavior
are relatively well understood (Numan 1994). This understanding has stemmed to a large extent from the careful description of behaviors shown by maternal rats; these behaviors are easy to identify and to quantify in the laboratory. Such a rich and objective behavioral description is often lacking for other species including our own. In rats, all components of maternal care of the young (except milk production) can be induced in the absence of the hormonal changes that normally accompany pregnancy, parturition, and lactation. Thus, virgin female rats become maternal if they are repeatedly exposed to pups of the right age, a process referred to as sensitization. Hormones, nevertheless, play a critical role in the facilitation of maternal behavior and are necessary for the coordination of maternal care and the arrival of the litter. Using different experimental paradigms, several laboratories have identified E as the principal hormone in the facilitation of maternal behavior. When administered systemically or when delivered directly into the MPOA, E triggers the display of maternal behavior under many experimental conditions. Prolactin from the anterior pituitary or a prolactin-like factor from the placenta also play a role, albeit secondary, in the facilitation of maternal functions; administration of prolactin or analogs of this hormone enhances the activational effects of E on maternal behavior. Also, mice lacking functional prolactin receptors show poor maternal care. Other peptides and steroids have been implicated in the support of maternal functions (Nelson 2000). For example, central infusions of oxytocin facilitate maternal behavior under some conditions and the effects of P on food intake and fuel partitioning are crucial to meet the energetic challenges of pregnancy and lactation (Wade and Schneider 1992). The integrity of the MPOA is necessary for the display of normal maternal behavior. 
Damage of the MPOA using conventional lesions or chemical lesions that spare fibers of passage interferes with normal maternal behavior under several endocrine conditions. Similar behavioral deficits are seen after knife cuts that interrupt the lateral connections of the MPOA. Cuts that interrupt other connections of the MPOA do not reproduce the behavioral deficits seen after complete lesions (Numan 1994). Normally, male rats do not participate in the care of the young, but exposing males to pups for several days can induce components of maternal behavior. Compared to females, adult males require more days of sensitization with pups and tend to show less robust maternal behavior. This sex difference is not evident before puberty (Stern 1987) suggesting that sexual differentiation of behavior may not be completed until after sexual maturation. When males are induced to care for the pups, the behavior is sensitive to the disruptive effects of lesions of the MPOA. Since MPOA damage affects both male sexual behavior and the display of maternal care, it is possible that there is partial overlap between the neural circuits
that support these two behavioral functions. Alternatively, MPOA lesions may affect more fundamental aspects of the behavioral repertoire of the animals, and such a deficit then becomes evident in different ways under different social conditions. The effects of MPOA lesions in females are also complex and not fully understood. In addition to affecting maternal functions, MPOA damage results in a facilitation of the lordosis reflex concurrently with a reduction in the females’ willingness to approach a sexually active male in testing situations that permit female pacing. Fine-grained analyses of the functional anatomy of the MPOA are needed to further elucidate the precise role of this area in mammalian reproductive functions. Studies of monogamous mammalian species where males and females remain together after mating offer the opportunity to study paternal care as well as social bonding between parents and between parents and offspring. One promising model is that of the prairie vole, in which both the female and the male care for the young (DeVries and Villalba 1999). Studies of these monogamous rodents suggest that bonding of the female to the male and to her young may be enhanced by oxytocin, a peptide hormone synthesized in the hypothalamus and secreted by the posterior pituitary. In the male, paternal care appears to result, not from oxytocin, but from another hypothalamic hormone, vasopressin. In addition to these posterior pituitary hormones, investigators have found evidence for a role of the endogenous opiates in strengthening the bond between mother and offspring (Keverne et al. 1999).
See also: Queer Theory; Sex Hormones and their Brain Receptors; Sexual Behavior: Sociological Perspective; Sexual Orientation: Biological Influences
Bibliography
Carter C S, Lederhendler I I, Kirkpatrick B (eds.) 1999 The Integrative Neurobiology of Affiliation. MIT Press, Cambridge, MA
DeVries G J, Villalba C 1999 Brain sexual dimorphism and sex differences in parental and other social behaviors. In: Carter C S, Lederhendler I I, Kirkpatrick B (eds.) The Integrative Neurobiology of Affiliation. MIT Press, Cambridge, MA, pp. 155–68
Erskine M 1989 Solicitation behavior in the estrous female rat: A review. Hormones and Behavior 23: 473–502
Fang J, Clemens L G 1999 Contextual determinants of female–female mounting in laboratory rats. Animal Behaviour 57: 545–55
Haug M R, Whalen R E, Aron C, Olsen K L (eds.) 1993 The Development of Sex Differences and Similarities in Behavior. Kluwer, Boston, MA
Hines M 1993 Hormonal and neural correlates of sex-typed behavioral development in human beings. In: Haug M, Whalen R E, Aron C, Olsen K L (eds.) The Development of Sex Differences and Similarities in Behavior. Kluwer, Boston, MA, pp. 131–50
Keverne E G, Nevison C M, Martel F L 1999 Early learning and the social bond. In: Carter C S, Lederhendler I I, Kirkpatrick B (eds.) The Integrative Neurobiology of Affiliation. MIT Press, Cambridge, MA, pp. 263–74
McGinnis M Y, Williams G W, Lumia A R 1996 Inhibition of male sex behavior by androgen receptor blockade in preoptic area or hypothalamus, but not amygdala or septum. Physiology & Behavior 60: 783–89
Nelson R J 2000 An Introduction to Behavioral Endocrinology, 2nd edn. Sinauer Associates, Sunderland, MA
Numan M 1994 Maternal behavior. In: Knobil E, Neil J D (eds.) The Physiology of Reproduction, 2nd edn. Raven Press, New York, Vol. 2, pp. 221–302
Pfaff D W, Schwartz-Giblin S, McCarthy M M, Kow L-M 1994 Cellular and molecular mechanisms of female reproductive behaviors. In: Knobil E, Neil J D (eds.) The Physiology of Reproduction, 2nd edn. Raven Press, New York, Vol. 2, pp. 107–220
Pfaus J G 1996 Homologies of animals and human sexual behaviors. Hormones and Behavior 30: 187–200
Stern J M 1987 Pubertal decline in maternal responsiveness in Long-Evans rats: Maturational influences. Physiology & Behavior 41: 93–99
Ulibarri C, Yahr P 1996 Effects of androgen and estrogens on sexual differentiation of sex behavior, scent marking, and the sexually dimorphic area of the gerbil hypothalamus. Hormones and Behavior 30: 107–30
VomSaal F S 1989 Sexual differentiation in litter-bearing mammals: Influence of sex of adjacent fetuses in utero. Journal of Animal Science 67: 1824–40
Wade G N, Schneider J E 1992 Metabolic fuels and reproduction in female mammals. Neuroscience and Biobehavioral Reviews 16: 235–72
Wallen K 1990 Desire and ability: Hormones and the regulation of female sexual behavior. Neuroscience and Biobehavioral Reviews 14: 233–41
Wallen K 2000 Risky business: Social context and hormonal modulation of primate sexual desire. In: Wallen K, Schneider J E (eds.) Reproduction in Context. MIT Press, Cambridge, MA, pp. 289–323
Wallen K, Schneider J E (eds.) 2000 Reproduction in Context. MIT Press, Cambridge, MA
Wersinger S R, Baum M J, Erskine M S 1993 Mating-induced FOS-like immunoreactivity in the rat forebrain: A sex comparison and a dimorphic effect of pelvic nerve transection. Journal of Neuroendocrinology 5: 557–68
Yamanouchi K, Matsumoto A, Arai Y 1985 Neural and hormonal control of lordosis behavior in the rat. Zoological Sciences 2: 617–27
A. A. Nunez and L. Clemens
Sexual Behavior: Sociological Perspective
This article considers social science research in the field of human sexual behavior since the start of the nineteenth century. Sexual behavior is understood here in a broad sense to include not just sexual acts but also the associated verbal interactions and emotions (most notably love), as well as sexual desires, fantasies, and dysfunctions.
1. Overview
The main disciplines considered here are the sociology and anthropology of sexuality, the psychology and psychopathology of sexual behavior, and sexology. These disciplines began to take form in the nineteenth century, and were influenced by other intellectual currents and areas of knowledge which, though not discussed in this article, need to be indicated:
(a) Eugenics, in the form given it by Francis Galton from the 1860s. Eugenicist preoccupations were shared by most of the leading sexologists of the late nineteenth and early twentieth centuries (notably Havelock Ellis and Auguste Forel).
(b) The history of sexuality. This developed in the nineteenth century, initially as the history of erotic art and practices, and of prostitution; and later as the history of sexuality in the ancient world and in other cultural contexts.
(c) The ethology of sexuality. This emerged at the end of the nineteenth century, and has grown in influence since the 1960s, related partly to the development of sociobiology.
Three phases can be identified in the development of social science research on sexuality: the nineteenth century (when the emphasis was on the study of prostitution and the psychopathology of sexual behavior); the period 1900–45 (that of the great sexological syntheses, the pioneering anthropological monographs and the first sex surveys); and the period beginning in 1946 (marked in particular by an expansion of quantitative research on the general population).
2. 1830–99: From the ‘Pathological’ to the ‘Normal’
It would be inaccurate to portray the nineteenth century in the industrialized countries as uniformly puritanical. The period was, of course, characterized by a widespread double standard in sexual morality (much less restrictive for young men than for young women), repression of masturbation, hypocrisy over the expression of love and sexual desires, censorship of literature and erotic art, and so on. But the nineteenth century also saw the development of feminism, the struggle for the civil rights of homosexuals, and the introduction of contraceptive methods in many countries. It was in the nineteenth century also that sexual behavior emerged as a major subject of scientific study.
A characteristic of much work on sexuality in this period is how a concentration on the ‘pathological’ and ‘deviant’ was used to cast new light on the ‘normal’; that is, on the behavior most widespread in the population. For example, the quantitative study of prostitution preceded that of sexuality in marriage; the ‘perversions’ (referred to today as ‘paraphilias’) were examined before heterosexual intercourse between married couples; and the first scientific description of the orgasm (in 1855 by the French physician, Félix Roubaud) actually appeared in a study on impotence. The first quantitative research on sexual behavior was conducted in the 1830s, much of it using the questionnaire technique being developed at this time. The first major empirical study in this field, based on quantification and combining sociological and psychological perspectives, was the investigation of prostitution in Paris conducted by the physician Alexandre Parent-Duchâtelet (1836). The second main current of research in the nineteenth century was concerned with the psychopathology of sexuality. The years 1886 and 1887 were a decisive period. The first edition of Richard von Krafft-Ebing’s Psychopathia sexualis, which presented a systematic classification of ‘sexual perversions,’ was published in 1886. In 1887, the psychologist Alfred Binet (Binet 2001) published an article with the title ‘Le fétichisme dans l’amour’ (‘Erotic fetishism’). Binet’s text was the origin of an intellectual fashion for labeling the various ‘sexual perversions’ as ‘isms’ (the psychiatrist Charles Lasègue had coined the term ‘exhibitionists’ in 1877 but not the word ‘exhibitionism’). By ‘fetishism,’ Binet referred to the fact of being particularly, or indeed exclusively, sexually excited by one part of the body or aspect of character, or by objects invested with a sexual significance (for example, underwear or shoes).
He argued that the fetishism of any given individual was usually formed in childhood or adolescence, through a psychological process of association, during or after an experience that stirred the first strong sexual feelings. For Binet, many ‘sexual perversions’ as well as homosexuality should be considered as different forms of ‘erotic fetishism.’ Finally, he asserted that ‘pathological’ fetishism (such as obsessional fetishism for certain objects) was merely an ‘exaggerated’ form of the fetishism characteristic of ‘normal’ love (that of the majority of people). For a time this notion provided the unifying perspective for the psychopathology of sexuality, beginning with that elaborated by Krafft-Ebing in successive editions of his Psychopathia sexualis. The years that followed saw the generalization of the terms ‘sadism’ and ‘masochism’ (popularized by Krafft-Ebing), ‘narcissism’ (invented by Havelock Ellis and Paul Näcke in 1898–99), and ‘transvestism’ (introduced by Magnus Hirschfeld around 1910). The psychoanalysis of Sigmund Freud integrated these various expressions and, most importantly, Binet’s
ideas about the lasting influence of childhood sexual impressions. This period also saw the development, encouraged by Alfred Binet and Pierre Janet, of the analysis of the sexual content in ordinary daydreams. The first questionnaire-based surveys of what in the twentieth century came to be referred to as sexual fantasies were carried out in the United States in the 1890s, in the context of research on adolescence led by G. S. Hall.
3. 1900–45: Large-scale Sexological Syntheses, Pioneering Anthropological Monographs and the First Sex Surveys
The large volume of research conducted between 1900 and the end of World War II can be divided into three main currents (for a general view of the most significant contributions from this period, see E. Westermarck (1936)). The first current is that of sexological research. It was in this period that sexology acquired an institutional status. The first sexological societies were set up in Germany in the years after 1910, and in 1914 Albert Eulenburg and Iwan Bloch founded the period’s most important journal of sexology (the Zeitschrift für Sexualwissenschaft). The first Institute for Sexual Science was opened by Magnus Hirschfeld in Berlin in 1919. In the 1920s the first international conferences of sex research were held. This period also saw publication of large-scale works of synthesis, in particular those by Auguste Forel (Swiss), Albert Moll, Hermann Rohleder, Magnus Hirschfeld (German), followed later by René Guyon (French) and Gregorio Marañón (Spanish). In the 1920s and 1930s, sexology became more self-consciously ‘political.’ A stated aim was to advance the ‘sexual liberation’ of young people and women, a cause advocated in influential books by B. Lindsey and W. Evans, Bertrand Russell, and Wilhelm Reich. Also published in these years were a number of extremely successful works popularizing sexological questions (the best known being those of the Englishwoman, Marie Stopes, and of the Dutch gynecologist, Th. H. Van de Velde).
The aim of these manuals was to promote an enjoyment of marital sex, and the emphasis was accordingly on sexual harmony, orgasm, sexual dysfunctions, and no longer—as had been the case at the end of the nineteenth century—on ‘sexual perversions.’ The most representative and influential expression of the various tendencies in sexology at this time were the seven volumes of Studies in the Psychology of Sex (1900–28) by the Englishman, Havelock Ellis. The second current of research is in sexual anthropology. Broadly speaking these were either comparative studies of marriage and sexual life (E. Crawley, W. I. Thomas, W. G. Sumner, A. Van Gennep, R.
Briffault, K. Wikman and, foremost, E. Westermarck) or anthropological monographs, in particular those by B. Malinowski, M. Mead and G. Gorer. The most important of these monographs is that of Bronislaw Malinowski (1929) on the natives of the Trobriand Islands in New Guinea: topics examined include prenuptial sexuality, marriage and divorce, procreation, orgiastic festivals, erotic attraction, sexual practices, orgasm, the magic of love and beauty, erotic dreams and fantasies, as well as the morals of sex (decency and decorum, sexual aberrations, sexual taboos). The third and final current from this period is quantitative studies of sex behavior. Before 1914 this research was conducted mainly in Russia, Germany and Scandinavia. Between the wars, it developed primarily in the United States (R. Pearl, G. V. Hamilton, K. B. Davis, R. L. Dickinson, and L. Beam). These works prepared the way for and in many respects prefigured the research conducted by Kinsey and his co-workers from 1938.
4. Post-1946: Empirical Research in the Age of Sexual Liberalization
In many countries, the second half of the twentieth century was a period of sexual liberalization. The improved status of women was reflected in a greater recognition of their rights in sexual matters (with implications for partner choice, use of contraception and abortion, as well as sexual pleasure). One consequence of this change was to encourage research into contraception: the contraceptive pill became available from 1960, and sterilization for contraceptive purposes was the most widely used means of birth control in the world by the end of the 1970s. Research was also encouraged into the physiology of the orgasm, and particularly the female orgasm, in the 1950s and 1960s (E. Gräfenberg, A. M. Kegel, A. C. Kinsey, W. H. Masters and V. E. Johnson, etc.). Another important factor of change was the arrival at adolescence in the 1960s of the postwar baby boom generation; economic affluence was the context for their demands for greater sexual freedom. This aspiration was reflected in a fall in age of first intercourse, especially for young women, and, related to this, a decline in the norm of female virginity at first marriage (or formation of first stable union). This liberalization reached its peak in the developed countries at the end of the 1970s and was brought to an abrupt halt by the AIDS epidemic, awareness of which began to develop, first in the United States, from 1981. In the course of the last fifty years of the twentieth century, the social sciences made a major contribution to the understanding of human sexuality. Special mention must be made of the contributions from historical demography, history (R. Van Gulik, K. J. Dover, M. Foucault, P. Brown, and others),
ethnology (V. Elwin, G. P. Murdock, C. S. Ford and F. A. Beach), the psychology of sexuality (see Eysenck and Wilson 1979), but also research originating in gay and lesbian studies conducted from a perspective of ‘social constructionism’ (sexuality is not a biological given but is socially constructed), as well as research on sexual identity, transsexualism, pornography and fantasies (for a general overview of the research mentioned above see: Ariès and Béjin (eds.) 1982, Allgeier and Allgeier 1988, McLaren 1999). Many of the new insights into sexual behavior acquired in this period have come from quantitative empirical research. The most influential of this research in the 1940s and 1950s was that directed in the United States by Alfred Kinsey. Between 1938 and 1954, Kinsey and his co-researchers interviewed more than 16,000 volunteers. While the personal information they collected was probably reliable, the sample constructed by Kinsey’s team was not representative of the US adolescent and adult population. Kinsey et al. (1948, 1953) distinguished the following ‘sources of sexual outlet’: masturbation, nocturnal emissions (or sex dreams), premarital heterosexual petting, premarital coitus (or intercourse), marital coitus, extramarital coitus, intercourse with prostitutes, homosexual responses and contacts, animal contacts. They established that the sexual history of each individual represents a unique combination of these sources of outlet and showed that between individuals there could be wide variation in ‘total sexual outlet’ (the sum of the orgasms derived from the various sources of sexual outlet). They also identified a number of sociological patterns. For example, compared with less educated people, better educated men and women had first heterosexual intercourse later, but had greater acceptance and experience of masturbation, heterosexual petting, foreplay and orogenital sexual practices.
Also, according to these researchers, people with a premarital petting experience were more likely to have a stable marriage. In these two volumes there were curious omissions, the most striking being the almost total neglect of the emotions, notably love. And the interpretations given by the authors were sometimes debatable, such as the presentation of premature ejaculation as almost ‘normal’ because it happened to be widespread in the United States, or in considering women’s erotic imagination to be much less developed than men’s on the grounds that it seemed less responsive to sexually explicit images. It was in large part because of these two volumes that sexology between the 1950s and the end of the 1970s often resembled little more than what has been described as ‘orgasmology’ (see Ariès and Béjin (eds.) 1982, pp. 183 et seq.). However, they do represent an important stage in the development of the sociology of sexuality. A large number of quantitative surveys on sexuality were conducted in the 1960s and 1970s, influenced in
part by Kinsey’s research, but which gave much greater attention to the sexual attitudes, personality, family background, feelings and even fantasies of the people being interviewed. It should also be noted that this research was increasingly based on representative samples. These surveys were conducted on adult populations (Sweden, England, France, Finland, United States) but also on adolescents (England, United States, Denmark), young students and workers (West Germany) and homosexuals (United States, West Germany, in particular). They were conducted in a climate of increased politicization of sexuality that recalled the 1920s and often had utopian aspirations. The adoption of alternative sexual lifestyles (exemplified by the ‘communes’) was advocated by some; others celebrated the revolutionary potential of the (chiefly clitoral) orgasm. But this climate changed rapidly in the 1980s with the emergence of AIDS. This epidemic, for which no vaccine existed, demonstrated the need for up-to-date empirical data on sexual behavior as a basis for encouraging preventive behavior in the population (more careful partner selection, and the use of HIV testing and condoms, etc.). In response, large-scale surveys of the general population, often using probability samples, were carried out in the 1990s, in Europe and the United States (see Wellings et al. 1994, Laumann et al. 1994, Kontula and Haavio-Mannila 1995, Béjin 2001) and in the developing world (Cleland and Ferry (eds.) 1995). The sex surveys of the 1990s cannot be summarized here, but a number of their shared characteristics can be identified. They are based either on interviews or questionnaires (either face-to-face, or self-administered, or by telephone), and have been facilitated by the unquestionably greater willingness in recent decades to talk about sex, as reflected in better participation and response rates and fewer dropouts.
The theories advanced to interpret the data collected are often, though not always, those which Laumann et al. (1994, pp. 5–24) refer to as ‘scripting theory,’ ‘choice theory’ and ‘social network theory.’ The first postulates that, because of their exposure to an acculturation process, individuals usually follow ‘sexual scripts’ which prescribe with whom, when, where, how, and why they should have sex. The second places the emphasis on the costs (in time, money, emotional and physical energy, personal reputation, etc.) of sexual behavior. The third seeks to understand why some types of sexual relations occur between people with similar social characteristics whereas others (more unconventional) involve socially more contrasted individuals. One noteworthy finding from these surveys is that, compared with the developed countries, those of Sub-Saharan Africa are characterized by earlier occurrence of first heterosexual intercourse and a higher level of multiple partnership, but also by a substantially higher
percentage of people who had not had intercourse during the previous month. In other words, in the countries with a ‘young’ population structure, heterosexual activity tends to begin sooner but occurs less frequently and extends over a shorter period.
5. Conclusion: Future Directions
To simplify, it can be said that in the nineteenth century the social sciences (including sexology) focused primarily on the forms of sexuality considered to be ‘deviant’ or ‘perverse.’ In addition, they gave priority to an exploration of behavior before beginning to study the psychological aspects (personality, sexual desires and fantasies). The same process occurred in the twentieth century, but on a broader scale, taking as subject the general population and thus an ‘average’ sexuality. Initially, between the 1920s and the end of the 1960s, the emphasis was on sexual practices (in particular, ‘sexual technique,’ and the orgasm). In a second period, however, especially since the start of the 1970s, attention has focused increasingly on sexual desire and fantasies. The fact, for example, that Viagra (the erection-enhancing drug first marketed in 1998) only works for men who feel attracted to their partners, illustrates the need for an understanding of the interior aspects of sexuality, in particular of the psychological blockages, desires and fantasies. This suggests that one trend in the future will be a growth of research on sexual fantasies and the complexities of sexual orientation and identity, and into the effects of pornography, cybersex and sexual addictions. A second probable trend in future research is the development of the comparative study of ‘national’ sexualities as revealed in the sex surveys of the 1990s. The data on sexuality that has been assembled needs to be subjected to analysis and interpretation. A comparative analysis of national sex surveys offers an excellent means of identifying the influence of culture on sexual attitudes and behavior as well as on desires and fantasies. It should be possible, for example, to compare the respective influence of cultures that are predominantly hedonistic or ascetic in orientation.
Other factors to be assessed include religious beliefs and practices, population aging, democratization, and the relationship between the sexes. This material is potentially the basis for new syntheses in the sociology of sexuality, comparable in scope to the great sociosexological syntheses produced in the first thirty years or so of the twentieth century. A third trend in future research will be a continuing exploration of the themes developed since the 1970s whose common point is their focus on sexual behavior that is more or less coercive in character: paedophilia, sexual tourism, sexual violence, sexual harassment, and sexual mutilations, notably those inflicted on women.
A fourth line of research for the future follows from the growing numbers of elderly and old people in the population, who increasingly expect to remain sexually active. Their sexuality will surely be the subject of further and more detailed research. The fifth and final trend for the future concerns a partial renewal of social science research on sexuality from the recent developments in the ethology of sexuality. More generally, indeed, interdisciplinary perspectives can be expected to inform and enrich all areas of social science research in the field of sexual behavior.
See also: Eugenics, History of; Family and Kinship, History of; Heterosexism and Homophobia; Sex Therapy, Clinical Psychology of; Sexual Attitudes and Behavior; Sociobiology: Overview
Bibliography
Allgeier A R, Allgeier E R 1988 Sexual Interactions. Heath, Lexington, MA
Ariès P, Béjin A (eds.) 1982 Sexualités occidentales. Seuil, Paris [1985 Western Sexuality. Basil Blackwell, Oxford, UK]
Béjin A 2001 Les fantasmes et la vie sexuelle des Français. Payot, Paris
Binet A 2001 Le fétichisme dans l’amour [1st edn. 1887]. Payot, Paris
Cleland J, Ferry B (eds.) 1995 Sexual Behaviour and AIDS in the Developing World. Taylor and Francis, London
Ellis H 1900–28 Studies in the Psychology of Sex (7 Vols.). Davis, Philadelphia, PA
Eysenck H J, Wilson G 1979 The Psychology of Sex. Dent, London
Kinsey A C, Pomeroy W B, Martin C E 1948 Sexual Behavior in the Human Male. Saunders, Philadelphia, PA
Kinsey A C, Pomeroy W B, Martin C E, Gebhard P H 1953 Sexual Behavior in the Human Female. Saunders, Philadelphia, PA
Kontula O, Haavio-Mannila E 1995 Sexual Pleasures. Enhancement of Sex Life in Finland, 1971–1992. Dartmouth, Aldershot, UK
Laumann E O, Gagnon J H, Michael R T, Michaels S 1994 The Social Organization of Sexuality. Sexual Practices in the United States. University of Chicago Press, Chicago, IL
McLaren A 1999 Twentieth-century Sexuality: A History. Basil Blackwell, Oxford, UK
Malinowski B 1929 The Sexual Life of Savages in North-Western Melanesia. Routledge, London
Parent-Duchâtelet A 1836 De la prostitution dans la ville de Paris. Baillière, Paris
Wellings K, Field J, Johnson A M, Wadsworth J 1994 Sexual Behaviour in Britain. Penguin Books, Harmondsworth, UK
Westermarck E A 1936 The Future of Marriage in Western Civilization. Macmillan, London
A. Béjin
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Sexual Harassment: Legal Perspectives
After a remarkably swift development in law and popular consciousness, sexual harassment remains the subject of controversy and debate. The concept of sexual harassment is largely an American invention. Like many other concepts that have mobilized social action in the United States, this one emerged in the context of law reform. In the 1970s, US feminists succeeded in establishing sexual harassment as a form of sex discrimination prohibited by Title VII of the Civil Rights Act of 1964 (the major federal statute prohibiting discrimination in employment). Since then, the concept has taken hold in broader arenas such as organizational practice, social science research, media coverage, and everyday thought. Although Title VII remains the primary legal weapon against sexual harassment in the US, traditional anti-discrimination law squares uneasily with the overtly sexual definition of harassment that emanated from 1970s feminist activism and ideas. Just at the time when this narrow sexual definition of harassment has come under criticism in the US, however, that same definition has begun to spread to other nations, inviting inquiry around the globe. Today, sexual harassment’s conceptual boundaries are being rethought in the law, in the wider culture, and in feminist thought.
1. Definitions and Origins
In the United States, harassment is predominantly defined in terms of unwanted sexual advances. In the eyes of the public and the law, the quintessential case of harassment involves a powerful male supervisor who makes sexual advances toward a female subordinate. Harassment is an abuse of sexuality; it connotes men using their workplace power to satisfy their sexual needs. This sexual model of harassment was forged in early Title VII law. Some of the earliest cases were brought by women who had been fired for refusing their bosses’ sexual advances. Lower courts at first rejected these claims, reasoning that the women had been fired because of their refusal to have affairs with their supervisors and not ‘because of [their] sex’ within the meaning of the law. The appellate courts reversed; they held employers responsible for the bosses’ conduct as a form of sex discrimination now called quid pro quo harassment. The results were a step forward: It was crucial for the courts to acknowledge that sexual advances can be used as a tool of sex discrimination. But the reasoning spelled trouble, because the courts’ logic equated the two. The courts said the harassment was based on sex under Title VII because the advances were driven by a sexual attraction that the male supervisor felt for a woman but would not have felt for a man. By locating
the sex bias in the sexual attraction presumed to underlie the supervisor’s advances, these decisions singled out (hetero)sexual desire as the sine qua non of harassment. Had the supervisor demoted the plaintiff or denigrated her intelligence, the court would have had far more difficulty concluding that the conduct was a form of sexism proscribed by law. Even at the time, there were broader frameworks for understanding women’s experiences as sex harassment. Carroll Brodsky’s book, The Harassed Worker (1976), for example, articulated a comprehensive nonsexual definition of harassment as ‘treatment that persistently provokes, pressures, frightens, intimidates, or otherwise discomforts another person.’ Rather than a form of sexual exploitation, Brodsky saw harassment as ‘a mechanism for achieving exclusion and protection of privilege in situations where there are no formal mechanisms available’ (Brodsky 1976, p. 4). In his usage, ‘sexual harassment’ referred not simply to sexual advances, but to all uses of sexuality as a way of tormenting those who felt ‘discomfort about discussing sex or relating sexually’ (Brodsky 1976, p. 28). A few Title VII decisions had already recognized sexual taunting and ridicule as a mechanism for male supervisors and co-workers to drive women away from higher-paying jobs and fields reserved for men. Indeed, the very concept of harassment as a form of discrimination in the terms and conditions of employment was first invented in race discrimination cases, where judges had discovered that employers could achieve racial segregation not only through formal employment decisions such as hiring and firing, but also through informal, everyday interactions that create an atmosphere of racial inferiority in which it is more difficult for people of color to work.
Analogizing to the race cases, the courts might have likened bosses’ demands for sexual favors from women to other discriminatory supervisory demands, such as requiring black women to perform heavy cleaning that is not part of their job description in order to keep them in their place, or requiring women to wear gendered forms of dress or to perform stereotypically feminine duties not considered part of the job when men do it. Or judges might have located the sexism in a male boss’s exercise of the paternalist prerogative to punish, as an employee, someone who dares to step out of her place as a woman by refusing the boss sexual favors; sociological analysis reveals that male bosses have penalized female employees for other non-sexual infractions that represent gender insubordination rather than job incompetence (Crull 1987). But the courts relied instead on a sexualized framework put forward by some feminist lawyers and activists. Toward the mid-1970s, US cultural-radical feminists moved toward a simplistic view of heterosexuality as the lynchpin of women’s oppression (Willis 1992, p. 144). Given this ideological commitment, it is not surprising that these early feminists conceived of women’s workplace harassment as a
form of unwanted sexual advances analogous to rape. Lin Farley’s book, Sexual Shakedown, defined harassment as ‘staring at, commenting upon, or touching a woman’s body; requests for acquiescence in sexual behavior; repeated nonreciprocated propositions for dates; demands for sexual intercourse; and rape’ (Farley 1978, p. 15). A few years later, Catharine MacKinnon argued that harassment is discriminatory precisely because it is sexual in nature, and because heterosexual sexual relations are the primary mechanism through which male dominance and female subordination are maintained (MacKinnon 1979). In the US, less than a decade later, the consolidation of this narrow view of women’s workplace harassment was largely complete. The 1980 Equal Employment Opportunity Commission (EEOC) guidelines defined sex harassment as ‘unwelcome sexual advances, requests for sexual favors, and other verbal or physical conduct of a sexual nature,’ a definition courts have read to require overtly sexual conduct for purposes of proving both quid pro quo harassment (which involves conditioning employment opportunities on submission to sexual advances) and hostile work environment harassment (which involves creating an intimidating or hostile work environment based on sex). Indeed, in hostile environment cases, the lower courts have tended to exonerate even serious sexist misconduct if it does not resemble a sexual advance (Schultz 1998). Media coverage has infused this view of harassment into popular culture. The 1991 Anita Hill-Clarence Thomas controversy helped solidify the view of harassment as sexual predation. Hill, at the time a novice lawyer in her mid-twenties, claimed that Thomas, her then-supervisor at the Department of Education and later Chair of the EEOC, had pressured her to go out with him and regaled her with lewd accounts of pornographic films and his own sexual prowess (Mayer and Abramson 1994).
Soon afterward, the news media broke the story of Tailhook, in which drunken Navy pilots sexually assaulted scores of women at a raucous convention (Lancaster 1991). Later in the 1990s, public attention turned to the harassment lawsuit of a former Arkansas state employee, Paula Jones, who alleged that President Bill Clinton had made crude sexual advances towards her while he was the Governor of Arkansas. Organizations seeking to avoid legal liability for sexual harassment have also defined it in terms of overtly sexual conduct. Many employers have adopted sexual harassment policies, but Schultz’s research reveals that these policies are rarely if ever integrated into broader anti-discrimination programs. Indeed, some employers’ efforts to avert sexual harassment may undermine their own efforts to integrate women fully into the workplace, as firms adopt segregationist strategies designed to limit sexual contact between men and women (such as prohibiting men and women from traveling together). Such policies reinforce perceptions of women as sexual objects and deprive them
of the equal training and opportunity the law was meant to guarantee.
2. Current Challenges
In recent years, the prevailing understanding of sexual harassment has come under challenge in the US. Civil libertarians have voiced concern that imposing vicarious liability on employers for their employees’ sexual harassment gives employers a powerful incentive to curb workers’ freedom of speech and sexual expression in the workplace. Critics say harassment law incorporates vague standards, including the requirement that the hostile work environment harassment be ‘sufficiently severe or pervasive to alter the conditions of the victim’s employment and create an abusive working environment’ (Meritor Savings Bank vs. Vinson, p. 67), that permit employers to adopt broad policies that chill sexual speech (Strossen 1995, Volokh 1992). Many employers may not limit their policies to ‘unwelcome’ conduct, because they will want to avoid costly, contentious inquiries into whether particular sexual interactions were unwelcome. Because few employers have a stake in protecting their employees’ freedom of expression (and only government employers have any First Amendment obligation to do so), many firms may simply adopt broad, across-the-board proscriptions on sexual activity and talk on the part of employees (Rosen 1998, Hager 1998). Although there has been no systematic research in this area, some alarming incidents have been reported (Schuldt 1997, Grimsley 1996, p. A1). Such concerns have resonated with a new generation of feminist legal scholars, who have begun to worry about the extent to which equating workplace sexual interaction with sex discrimination replicates neo-Victorian stereotypes of women’s sexual sensibilities (Abrams 1998, Franke 1997, Strossen 1995).
While civil libertarians have urged repealing or restricting the range of employer liability under Title VII, often proposing instead to hold individual harassers responsible for their own sexual misconduct under common law (Hager 1998, Rosen 1998), younger feminist scholars have focused on reforming sex harassment law to bring it in line with traditional antidiscrimination goals. Kathryn Abrams defines harassment as sex discrimination not because (hetero)sexual relations inherently subordinate women, but because she believes male workers use sexual advances to preserve masculine control and norms in the workplace; she would limit liability to cases in which the harasser manifestly disregards the victim’s objections or ambivalence toward his advances (Abrams 1998). Katherine Franke argues that harassment is a ‘technology of sexism’ through which men police the boundaries of gender; she would focus on whether men have used sexuality to press women or other men
into conventional ‘feminine’ or ‘masculine’ roles (Franke 1997). Both Abrams and Franke attempt to break with the old equation of sexuality and sexism. Yet neither makes a decisive break, for each retains the idea that (hetero)sexual objectification is the key producer of gender (for Abrams, of gender subordination in the workplace, for Franke, of gender performance throughout social life). Bolder analyses seek to jettison altogether sexual harassment’s conceptual underpinnings in sexuality. Janet Halley’s queer theory-based critique emphasizes both the cultural/psychic dangers of outlawing the expression of sexuality and the heavier-handed repression such an approach places on sexual minorities. To the extent that harassment law focuses on whether sexual conduct is offensive to a reasonable person, judges and juries will rely on their ‘common sense’ to evaluate the advances (and other actions) of gays, lesbians, bisexuals, and other sexual dissidents as inherently more offensive than those of heterosexuals (Halley 2000). As Kenji Yoshino has observed, the courts have conditioned liability in same-sex sexual harassment cases on the harasser’s sexual orientation. Conduct that courts consider an unwanted sexual advance when the harasser is homosexual is deemed innocuous horseplay when the harasser is heterosexual, an approach that sets up a two-tiered system of justice that has nothing to do with the victim’s injury (Yoshino 2000). (See Sexual Orientation and the Law.) For related reasons, Schultz has argued that sex harassment law should abandon its emphasis on sexual misconduct and focus on gender-based exclusion from work’s privileges. She argues for reconceptualizing harassment as a means of preserving the masculine composition and character of highly-valued types of work and work competence (Schultz 1998).
In Schultz’s view, the prevailing sexual understanding of harassment is too narrow, because it neglects more common, non-sexual forms of gender-based mistreatment and discrimination that keep women in their place and prevent them from occupying the same heights of pay, prestige, and authority as men. Indeed, Schultz contends, the centrality of occupational identity to mainstream manhood leads some men to harass other men they regard as unsuitably masculine for the job. At the same time, the sexual model risks repression of workers’ sexual talk or interaction, even where they do not threaten gender equality on the job. Schultz’s call to move away from a sexuality-based harassment jurisprudence builds on the earlier work of Regina Austin, who recognized in 1988 that most workplace harassment was not ‘sexual’ in nature and proposed a tort of worker abuse to protect employees from class-based mistreatment at the hands of their bosses (Austin 1988). Although Austin was concerned with structures of class and race, recently a more individual dignitary approach has been revived by Anita Bernstein and by Rosa Ehrenreich, who propose protecting
employees’ rights to equal respect through antidiscrimination and tort law, respectively (Bernstein 1997, Ehrenreich 1999). Recognizing harassment as just another form of discriminatory treatment restores Title VII’s protections to those who allege discriminatory abuse based on characteristics other than gender or even race, such as religion and disability (Goldsmith 1999).
3. Future Directions
In the United States, the sexual model of harassment always rested uneasily alongside traditional employment discrimination law, which is concerned with work, not sexuality. Although the future is far from certain, the US Supreme Court’s recent sexual harassment decisions seem poised to restore harassment law to its traditional focus. The Court’s 1998 decisions in Burlington Industries vs. Ellerth and Faragher vs. City of Boca Raton held that a company’s vicarious liability for a supervisor’s harassment turns on whether the harassment involves a ‘tangible employment action’ (such as hiring, firing, or promotion), not on the content of the misconduct or its characterization as quid pro quo or hostile environment harassment. In the absence of such a tangible action, companies can avoid liability by proving that they adequately corrected harassment a victim reported (or reasonably should have reported) through acceptable in-house channels. By creating a loophole for companies that investigate harassment through their own procedures, the Court sought to counter any current incentives for managers to ban sexual interactions across the board. By adhering to vicarious liability where harassment involves the same tangible employment decisions as more traditional forms of discriminatory treatment, the Court acknowledged that harassment is simply a form of employment discrimination subject to the usual legal rules (White 1999). The Court’s decision in Oncale vs. Sundowner Offshore Services further reconciles sexual harassment with the traditional discrimination approach. In Oncale, the Court held that male-on-male harassment is actionable under Title VII, taking pains to emphasize that harassment is not to be equated with conduct that is sexual in content or design. ‘We have never held that workplace harassment … is … discrimination because of sex merely because the words used have sexual content or connotations,’ said the Court (Oncale vs.
Sundowner Offshore Services, Inc., p. 80). By the same token, ‘harassing conduct need not be motivated by sexual desire to support an inference of discrimination on the basis of sex’ (Oncale vs. Sundowner Offshore Services, Inc., p. 80). ‘The critical issue,’ stressed the Court, ‘is whether members of one sex are exposed to disadvantageous terms or conditions of employment to which members of the other
sex are not exposed’ (Oncale vs. Sundowner Offshore Services, Inc., p. 80). The social sciences are also beginning to look beyond the sexual model, providing evidence of the need to conceptualize sex harassment in broader terms. In the early years, most workers attempted to document the prevalence of harassment, albeit with limited results due to their implicit reliance on a conceptually narrow (yet often vague) notion of harassment specified by lists of sexual behaviors derived from the EEOC guidelines (Welsh 1999). Over time, the research has become more multi-faceted, as a burgeoning scholarship has focused greater attention on defining harassment, theorizing its causes, documenting its consequences, and developing predictors. Yet much research remains wedded to the sexual view. Prominent theories of harassment posit that men’s tendency to stereotype women as sexual objects is ‘primed’ by the presence of sexual materials or behavior to induce a ‘sex-role spillover’ effect that leads men to sexualize women inappropriately in the workplace (Fiske 1993, Gutek 1992). Predictive models search for characteristics that predispose men to harassment, such as a propensity to sexualize women over whom they have some supervisory authority (Pryor et al. 1995). Further examples abound (see Welsh 1999, Borgida and Fiske 1995). Nonetheless, recent research has opened up a broader horizon. A few studies have included measures of gender-based harassment that is not necessarily sexual in content or design: The results suggest that such harassment is more widespread than overtly sexual forms (Frank et al. 1998, Fitzgerald et al. 1988).
Social psychologists have begun to look beyond sexual objectification to explore how other gender-based stereotypes and motives can combine with occupational identities and institutional contexts to produce a variety of forms of workplace harassment and discrimination that are not all motivated by sexual attraction (Fiske and Glick 1995). Even some economists have moved away from the conventional view of harassment as sexual coercion that is unrelated to the internal dynamics of labor markets (Posner 1989) to develop innovative theories to explain how incumbent workers can obtain the power to exclude and disadvantage aspiring entrants (Lindbeck and Snower 1988). These developments align social psychological and economic theories more closely with those of sociologists, who have long emphasized harassment’s connection to structural features of organizations such as gross numerical imbalance, standardless selection processes, and the absence of managerial accountability (Reskin 2000, Schultz 1991). In tension with such efforts to transcend a narrow definition of sex harassment in the United States are developments in some other parts of the globe, where the sexual model championed by early US cultural-radical feminists seems to be gaining headway. According to some commentators, the American understanding of sex harassment has been disseminated abroad so successfully that it now forms the foundation for international debates on sex harassment (Cahill 2000). Following closely on the heels of legal developments in the USA, for example, the European Union took steps to condemn and outlaw workplace sex harassment as a violation of women’s dignity, and it defined this concept in terms strikingly similar to the definition promulgated by the US EEOC. Encouraged by EU initiatives, feminists in Europe have drawn on particular features of the US model to promote versions of sex harassment law that resonate with their own traditions. In Austria, Cahill shows, feminists capitalized on desires to signal Austria’s compliance with the economic First World’s laws to press for a harassment law that adopts both the sexualized substantive definition and the privatized enforcement mechanisms of the US approach. The importation of these features allows critics to reject not only the law but also the very existence of harassment as a nonindigenous, imperialist export (Cahill 2000). In France, as Saguy shows, French feminists won a law that criminalizes the use of ‘orders, threats, constraint or serious pressure in the goal of obtaining sexual favors, by someone abusing the authority conferred by his position’ (Saguy 2000, p. 19 and note 34). This approach incorporates the American cultural-radical feminist view of harassment as a form of sexual abuse yet simultaneously signals distance from what the French perceive to be US sexual prudishness by highlighting that harassment is an abuse of hierarchical authority, an idea that conforms to conventional French views of hierarchical power as inimical to equality.
Neither French feminists nor lawmakers connect harassment to a larger system of workplace gender inequality that relegates women to inferior jobs; the gender segregation of work is accepted as a ‘neutral’ background condition rather than challenged as the structural context of inequality in which sex harassment flourishes and which it fosters. Friedman’s work suggests that even in Germany, where there is a tradition of using law to promote workers’ empowerment, transplanting the US model of harassment-as-sexual-overtures serves to exacerbate what German feminists have decried as a conservative judicial tendency to define sex harassment as a violation of female sexual honor which requires moral purity (Friedman 2000). It is perhaps ironic that just as German (and other) feminists are trying to highlight the specificity of sexual harassment, a new generation of US feminists are striving to incorporate sexual harassment into broader frameworks for understanding the dynamics of ingroup/outgroup exclusion among different groups of workers, a project that might well be assisted by expanding the German concept of mobbing as ‘a pervasive pattern of workplace harassment that targets an individual for exclusion and abuse at the hands of coworkers and supervisors’ (Friedman 2000,
p. 6). What is needed is a structural analysis of how a variety of different forms of harassment (including sexual advances, gender-based and other forms of taunting, physical threats, verbal denigration, work sabotage, heightened surveillance, and social isolation) can be used by socially dominant groups to code and to claim scarce social resources (such as good jobs and a sense of entitlement to them) to the exclusion of others. In this project, Americans have as much to learn from other countries’ experiences as vice versa. It is cause for optimism that so many scholars, activists and policymakers are struggling toward such a broader and deeper understanding of harassment. Inspired by new research, new feminisms, and newer social movements (such as queer theory) that are calling into question the old top-down, male-female sexual model, the law of sex harassment awaits an overhaul to fit the world of the twenty-first century.
See also: Autonomy at Work; Gender and the Law; Gender Differences in Personality and Social Behavior; Heterosexism and Homophobia; Lesbians: Historical Perspectives; Lesbians: Social and Economic Situation; Male Dominance; Organizations: Authority and Power; Psychological Climate in the Work Setting; Rape and Sexual Coercion; Regulation: Sexual Behavior; Sex Differences in Pay; Sex Segregation at Work; Sexual Attitudes and Behavior; Sexual Behavior: Sociological Perspective; Sexual Harassment: Social and Psychological Issues; Sexuality and Gender; Workplace Environmental Psychology
Bibliography

Abrams K 1998 The new jurisprudence of sexual harassment. Cornell Law Review 83: 1169–1230
Arvey R D, Cavanaugh M A 1995 Using surveys to assess the prevalence of sexual harassment: Some methodological problems. Journal of Social Issues 51(1): 39–50
Austin R 1988 Employer abuse, worker resistance, and the tort of intentional infliction of emotional distress. Stanford Law Review 41: 1–59
Bernstein A 1997 Treating sexual harassment with respect. Harvard Law Review 111: 445–527
Borgida E, Fiske S T (eds.) 1995 Special issue: gender stereotyping, sexual harassment, and the law. Journal of Social Issues 51(1): 1–207
Brodsky C 1976 The Harassed Worker. Lexington Books, Lexington, MA
Burlington Industries v. Ellerth. 1998. 524 US 742
Cahill M 2000 The legal problem of sexual harassment and its international diffusion: A case study of Austrian sexual harassment law. (unpublished manuscript)
Crull P 1987 Searching for the causes of sexual harassment: An examination of two prototypes. In: Bose C, Feldberg R, Sokoloff N (eds.) Hidden Aspects of Women’s Work. Praeger Press, New York, pp. 225–44
Ehrenreich R 1999 Dignity and discrimination: Toward a pluralistic understanding of workplace harassment. Georgetown Law Journal 88: 1–64
Faragher v. City of Boca Raton. 1998. 524 US 775
Farley L 1978 Sexual Shakedown: The Sexual Harassment of Women on the Job. McGraw-Hill, New York
Fiske S T 1993 Controlling other people: The impact of power on stereotyping. American Psychologist 48: 621–8
Fiske S T, Glick P 1995 Ambivalence and stereotypes cause sexual harassment: A theory with implications for organizational change. Journal of Social Issues 51: 97–115
Fitzgerald L F et al. 1988 The incidence and dimensions of sexual harassment in academia and the workplace. Journal of Vocational Behavior 32: 152–75
Frank E et al. 1998 Prevalence and correlates of harassment among US women physicians. Archives of Internal Medicine 158: 352–8
Franke K 1998 What’s wrong with sexual harassment? Stanford Law Review 49: 691–772
Friedman G 2000 Dignity at work: workplace harassment in Germany and the United States. (unpublished manuscript)
Goldsmith E 1999 God’s House or the Law’s. Yale Law Journal 108: 1433–40
Grimsley K D 1996 In combating sexual harassment, companies sometimes overreact. Washington Post (Dec. 23): A1
Gutek B 1992 Understanding sexual harassment at work. Notre Dame Journal of Law, Ethics & Public Policy 6: 335–58
Hager M 1998 Harassment as a tort: Why title VII hostile environment liability should be curtailed. Connecticut Law Review 30: 375–439
Halley J 2000 Sexuality harassment. (unpublished manuscript)
Kanter R M 1997 Men and Women of the Corporation. Basic Books, New York
Lancaster J 1991 Navy ‘gauntlet’ probed: Sex harassment alleged at fliers’ convention. Washington Post (Oct. 30): A1
Lindbeck A, Snower D 1988 The Insider-Outsider Theory of Employment and Unemployment. MIT Press, Cambridge, MA
MacKinnon C 1979 The Sexual Harassment of Working Women. Yale University Press, New Haven, CT
Mayer J, Abramson J 1994 Strange Justice: the Selling of Clarence Thomas. Houghton Mifflin, Boston
Meritor Federal Savings Bank v. Vinson. 1986. 477 US 57
Oncale v. Sundowner Offshore Services, Inc. 1998. 523 US 75
Posner R 1989 An economic analysis of sex discrimination laws. University of Chicago Law Review 56: 1311–35
Pryor J B et al. 1995 A social psychological model for predicting sexual harassment. Journal of Social Issues 51: 69–84
Reskin B 2000 The proximate causes of employment discrimination. Contemporary Sociology 29: 319–28
Rosen J 1998 In defense of gender blindness. The New Republic 218: 25–35
Saguy A C 2000 Sexual harassment in France and the United States: Activists and public figures defend their definitions. In: Lamont M, Thevenot L (eds.) Rethinking Comparative Cultural Sociology: Polities and Repertoires of Evaluation in France and the United States. Cambridge University Press, Cambridge, UK, pp. 56–93
Schuldt G 1997 Ex-Miller exec. copied page with anatomical word: man suing over firing says he showed copy of female co-worker. Milwaukee Sentinel (June 26), 1
Schultz V 1991 Telling stories about women and work: Judicial interpretations of sex segregation in Title VII Cases raising the lack of interest argument. Harvard Law Review 103: 1749–843
Schultz V 1998 Reconceptualizing sexual harassment. Yale Law Journal 107: 1683–1804
Strossen N 1995 Defending Pornography: Free Speech, Sex, and the Fight for Women’s Rights. Anchor Books, New York
Volokh E 1992 Freedom of speech and workplace harassment. UCLA Law Review 39: 1791–872
Welsh S 1999 Gender and sexual harassment. Annual Review of Sociology 25: 169–90
White R H 1999 There’s nothing special about sex: The supreme court mainstreams sexual harassment. William & Mary Bill of Rights Journal 7: 725–53
Willis E 1992 Radical feminism and feminist radicalism. In: Willis E (ed.) No More Nice Girls: Countercultural Essays. Wesleyan University Press, London, pp. 117–50
Yoshino K 2000 The epistemic contract of bisexual erasure. Stanford Law Review 52: 353–461
V. Schultz and E. Goldsmith
Sexual Harassment: Social and Psychological Issues

Sexual harassment is generally defined as unwelcome sexual advances, requests for sexual favors, or other verbal or physical conduct of a sexual nature that is either a condition of work or is severe and pervasive enough to interfere with work performance or to create a hostile, intimidating work environment. It may consist of words, gestures, touching, or the presence of sexual material in the work environment. It typically involves a pattern of behavior over a period of time, rather than a single event. We might think about the former as ‘an episode of sexual harassment.’ In perhaps 90 percent of the episodes, women are the recipients and men are the initiators, but both sexes can harass and both can be harassed by the same sex or the other sex. If sexual harassment meets certain criteria (e.g., unwelcome, severe, and pervasive), it is illegal in many countries, but not all behavior commonly considered sexual harassment violates the law.
1. A Short History of Research on Sexual Harassment

It is no doubt safe to assume that sexual harassment has been around for a long time, but it has been labeled, studied, and legislated for only about 20 years. In 1978, journalist Lin Farley wrote Sexual Shakedown to bring attention to the phenomenon. In 1979, legal scholar Catharine MacKinnon wrote an influential book that would provide a legal framework for dealing with sexual harassment in the US. MacKinnon argued that sexual harassment was a form of sex discrimination (i.e., it denies women equal opportunity in the workplace) and therefore Title VII of the 1964 Civil Rights Act, which forbids discrimination on the basis of sex (among other social categories), should apply. A year after her book was published, the US Equal Employment Opportunity Commission established guidelines on sexual harassment. Early empirical studies of sexual harassment in the workplace and academia started appearing in print about the same time. By 1982, at least one journal (Journal of Social Issues) had produced a whole issue devoted to scholarship on the topic. Today sexual harassment is studied by scholars in many countries who work in many fields, including law, psychology (clinical, forensic, organizational, social), sociology, management and human resources, history, anthropology, communication, and the humanities. Within the social sciences, sexual harassment is studied both through quantitative techniques that focus on the measurement of constructs and the determination of base rate statistics and through qualitative case-study techniques that focus on specific occupations such as wait staff and female coal miners.
2. Measurement of Sexual Harassment

In the early 1980s, researchers frequently used the term ‘social-sexual behaviors’ to distinguish a set of behaviors that might constitute sexual harassment from a legally defined measure of sexual harassment. These sets typically included behaviors unlikely to be considered sexual harassment either under the law or by a majority of the population. By including a broad range of behaviors, researchers could learn whether people’s views of specific behaviors differ over time (or across samples). It would also allow researchers to see if legal and illegal social-sexual behaviors have common antecedents and consequences. More recently, however, sexual harassment is what researchers say they are measuring. This has caused some confusion because many people seem to interpret statistics on sexual harassment to indicate the percentage of the workforce that would have a strong legal claim of sexual harassment. This is not true. Researchers have not attempted to capture the legal definition of harassment because: (a) the legal definition changes as the law develops, so the legal definition is a moving target; (b) laws vary from country to country; (c) targets may experience negative consequences of sexual harassment without having the harassment rise to meet a legal definition; (d) there is no reason to believe that we can learn about sexual harassment only by measuring it to conform to its legal definition. Some scholars now make explicit the point that a lay definition of sexual harassment does not necessarily imply that a law has been broken.

A global item, say, ‘Have you ever been sexually harassed?’ is rarely used to measure sexual harassment because some researchers contend that it results in an under-reporting of the phenomenon. Workers seem reluctant to acknowledge that they have been sexually harassed. In addition, asking respondents if they have been sexually harassed places a great cognitive load on them, as they would have to determine first what constitutes sexual harassment and then determine if they had experienced any behavior that met those criteria. In many studies, a single question asking the respondent if she has been sexually harassed is used as an indicator of acknowledging or labeling sexual harassment rather than an indicator of sexual harassment, per se.

Most studies measure sexual harassment by asking respondents if they have experienced any of a list of behaviors that might be considered sexual harassment. These measures are generally (but not always) designed to be suitable for both sexes. In some cases, respondents are asked whether they have experienced a list of behaviors that might be considered sexual harassment, and later in the survey asked which of those behaviors they consider sexual harassment, allowing the researcher to determine which of a broad range of behaviors respondents have experienced that they consider sexual harassment.

Multi-item measures of sexual harassment are also based on a list of behaviors that people may have experienced. The best known of these measures is the Sexual Experiences Questionnaire (SEQ) developed by Louise Fitzgerald and her colleagues. The SEQ has undergone many changes; the number of questions asked, the wording of questions, and the wording of responses have all been modified, and its refinement is ongoing. The number of subscales that emerge from it has also changed over time, so it is important to keep in mind that not all studies using the SEQ necessarily use the same set of questions, scored the same way. At this point it cannot be used to assess changes over time or differences across studies.
3. The Prevalence of Sexual Harassment

A number of studies have relied on random sample surveys, including studies of government workers, the military, specific geographical areas, and specific work organizations. Some of these have obtained quite high response rates; some studies have been conducted in Spanish as well as in English; and a few longitudinal analyses have been conducted. In measuring prevalence, there is some debate about the timeframe that should be considered. Researchers typically inquire about experiences within the past year, the prior two years, or throughout the person’s entire work life, depending on the purpose of the study. While critics have cautioned that retrospective measures will introduce inaccuracy or bias, researchers active in the field are less concerned. It is difficult to know if one is currently being sexually harassed, and qualitative studies provide convincing examples of events initially not labeled sexual harassment that came to be so labeled at a later date.
3.1 The Prevalence of Sexual Harassment of Women

The random-sample studies together suggest that from about 35 percent to 50 percent of women have been sexually harassed at some point in their working lives, where sexual harassment refers to behavior that most people consider sexual harassment. Estimates are higher among certain groups such as women who work in male-dominated occupations. The most commonly reported social-sexual behaviors are the less severe ones, involving sexist or sexual comments, undue attention, or body language. Sexual coercion is, fortunately, much rarer, involving 1–3 percent of many samples of women. In contrast, a study of allegations in cases tried in court showed a much higher incidence of severe behaviors, with 22 percent involving physical assault, 58 percent nonviolent physical contact, and 18 percent violent physical contact. The incidence of sexual harassment of women appears to be rather stable. Three US Merit Systems Protection Board studies spanning 14 years show that 42–44 percent of women in the federal workforce have experienced one or more of a list of potentially sexually harassing behaviors within the previous 24 months. In addition, the number of charges filed with the US Equal Employment Opportunity Commission may be leveling off in the range of 15,000–16,000 per year (in a labor force of about 120 million people).

3.2 The Prevalence of Sexual Harassment of Men

Men have been included in studies of sexual harassment from the very beginning. In her study of a random sample of working men and women in Los Angeles County, Barbara Gutek found that some time during their working lives, from 9 percent to 35 percent of men (depending on definition of harassment) had experienced some behavior initiated by one or more women that they considered sexual harassment.
The US Merit Systems Protection Board studies found that from 14 percent to 19 percent of men in the Federal workforce experienced at least one episode of a sexually harassing experience (initiated by either men or women) within the previous two years. These surveys revealed that about one-fifth of the harassed men were harassed by another man. Data from the 1988 US Department of Defense Survey of Sex Roles in the Active Duty Military revealed that about one-third of the men (but about 1 percent of the women) who experienced at least one of nine types of uninvited, unwanted sexual attention during the previous 12 months reported that the initiator was the same sex. The harassment of men by other men tends to be of two types: lewd comments that were considered offensive and attempts to enforce male gender role behaviors. Harassment by women is somewhat different, consisting of negative remarks and/or unwanted sexual attention.
The available research on sexual harassment of men, admittedly much less than the research on women, suggests that many of the behaviors women might find offensive are not considered offensive by men when the initiators are women, and/or that men report few negative consequences. In addition, a disproportionate percentage of men’s most distressing experiences of sexual harassment come from other men. Presumably these are especially distressing because the recipient’s masculinity and/or sexual orientation are being called into question.
3.3 The Prevalence of Sexual Harassment Among Other Groups

Relatively few studies have either focused on or found consistent differences in the experience of sexual harassment beyond the consistent differential rate of sexual harassment of women vs. men. It may be the case that younger and unmarried women are somewhat more likely targets of sexual harassment than older and married women. Lesbian women may be more likely to be sexually harassed than heterosexual women, or they may simply be more likely to label their experiences sexual harassment. Although several authors have suggested that in the US women of color (Asian, African-American, Hispanic, and American Indian) are more likely to be targets of sexual harassment than Caucasian women, the evidence is far from clear. Several random sample surveys found no clear link between ethnicity and the experience of sexual harassment, but qualitative studies suggest that women of color experience sexual harassment frequently. Two kinds of arguments have been advanced for reasons why minority women might experience relatively more sexual harassment. A direct argument relies on stereotyping of minorities. Although the stereotypes of African-American women differ from stereotypes of Chicanas or Asian-American women, in each case the stereotype might place these women at greater risk. An indirect argument relies on concepts of power and marginality. As women of color are less powerful and more marginal by virtue of their ethnicity than white women, they may be more prone to sexual harassment.
4. Explanations for Sexual Harassment

Why sexual harassment exists has been of interest to social scientists since the phenomenon acquired a label. The various explanations can be subsumed into four categories: natural/biological perspectives, organizational perspectives, sociocultural explanations, and individual differences perspectives. These explanations tend to be broad in scope and not easily testable in a laboratory.
4.1 Natural/Biological Explanations

There are two natural/biological perspectives: a hormonal model and an adaptive/evolutionary explanation. While intriguing, neither is supported by available data.
4.2 Organizational Explanations

There are two organizational perspectives: sex-role spillover and organizational power. Sex-role spillover, defined as the carryover into the workplace of gender-based expectations that are irrelevant or inappropriate to work, occurs because gender role is more salient than work role and because, under many circumstances, men and women fall back on sex-role stereotypes to define how to behave and how to treat those of the other sex. Sex-role spillover tends to occur most often when the gender ratio is heavily skewed in either direction, i.e., when the job is held predominantly either by men or by women. In the first situation (predominantly male), nontraditionally employed women are treated differently than their more numerous male co-workers, are aware of that different treatment, report relatively frequent social-sexual behavior at work, and tend to see sexual harassment as a problem. In the second situation (predominantly female), female workers hold jobs that take on aspects of the female sex role; where one of those aspects is sex object (e.g., cocktail waitress, some receptionists), women may become targets of unwanted sexual attention but may attribute the way they are treated to their job, not their gender. Several studies find some support for this perspective.

The earliest writings on sexual harassment were about men abusing the power that comes from their positions in organizations to coerce or intimidate subordinate women. Some subsequent statements on the power perspective are gender neutral, suggesting that although men tend to harass women, in principle if women occupied more positions of power, they might harass men in equal measure. This interpretation of sexual harassment as an abuse of organizational power is contradicted by research showing that about half or more of harassment comes from peers. In addition, both customers and subordinates are also sources of harassment.
Sexual harassment by subordinates has been documented primarily in academic settings where, for example, studies find up to half of female faculty at universities had experienced one or more sexually harassing behaviors by male students. While power cannot explain all sexual harassment, various kinds of power—formal organizational power and informal power stemming from ability to influence—remain potent explanations for at least some sexual harassment. For example, the fact that sexual harassment by customers is fairly common can be explained, at least in part, by the emphasis employers place on customer satisfaction and the notion that the ‘customer is always right.’

Some researchers focus less on broad theoretical perspectives and more on the types of organizational factors that need to be included in models of sexual harassment. Thus far, contact with the other sex and an unprofessional and/or sexualized work environment have been identified as correlates of sexual harassment.

4.3 Sociocultural Explanations

There are at least two ways of thinking of the broader sociocultural context. One is that behavior at work is merely an extension of male dominance that thrives in the larger society. Overall, there is general agreement in the literature about the characteristics of the sex stratification system and the socialization patterns that maintain it. The exaggeration of these roles can lead to sexual harassment. For example, men can sexually harass women when they are overly exuberant in pursuing sexual self-interest at work, when they feel entitled to treat women as sex objects, or when they feel superior to women and express their superiority by berating and belittling the female sex. The second way of thinking of the broader sociocultural context is to study the sociocultural system itself and examine how and why status is assigned. According to this view, sexual harassment is an organizing principle of our system of heterosexuality, rather than the consequence of systematic deviance.

4.4 Individual Difference Explanations

Although the data suggest that most sexual harassers are men, most men (and women) are not sexual harassers. This makes the study of personality characteristics particularly relevant.
The search for individual-level characteristics of perpetrators does not negate any of the other explanations, but helps to determine, for example, which men in a male-dominated society or which men in powerful positions in organizations harass women when most men do not. John Pryor developed a measure of the Likelihood to Sexually Harass (LSH) in men, consisting of 10 vignettes that place the respondent in a position to grant someone a job benefit in exchange for sexual favors. Currently the most widely known individual difference measure used in the study of sexual harassment, the LSH has been validated in a number of studies. For example, undergraduate men who score relatively high on the LSH demonstrate more sexual behavior in a lab experiment and hold more negative attitudes toward women relative to those who report a lower likelihood to sexually harass. Sexual harassment may be an attempt to immediately gratify the desire for discrimination, intimidation, or sexual pleasure, and therefore those people with low self-control may also be more likely to sexually harass. Some research findings support this gender-neutral explanation.
5. Factors Affecting Judgments of Sexual Harassment

The most widely published area of research on sexual harassment measures people’s perceptions about sexual harassment. One set of studies attempts to understand which specific behaviors (e.g., repeated requests for a date, sexual touching, stares or glances, a sexually oriented joke) respondents consider to be sexual harassment. The other set of studies attempts to understand factors that affect the way respondents perceive behavior that might be considered sexual harassment. Typically, respondents are asked to read a vignette in which factors are manipulated and then they are asked to make judgments about the behavior in the vignette. The factors that are manipulated include characteristics of the behavior (e.g., touching vs. comments), characteristics of the situation (e.g., the relationship between the initiator and recipient), and characteristics of the initiator and recipient (e.g., sex, age, attractiveness, occupation). In addition, characteristics of the rater (e.g., sex, age) are typically measured.

5.1 The Effects of Rater Sex on Judgments of Sexual Harassment

Sex is the most frequently studied feature in studies about perceptions of sexual harassment. In all, hundreds of studies have been done. These studies have been reviewed using traditional methods and meta-analyses. These reviews reach two conclusions: (a) women consistently rate vignettes and specific behaviors as more sexually harassing than men do and (b) the average difference is small, raising questions about the practical significance of these results. Practical significance is important because in the United States it would appear that these studies have had an indirect influence on the Ninth Circuit’s 1991 decision, Ellison v. Brady. In that case, the court adopted a new legal standard in hostile environment cases of sexual harassment, the reasonable woman standard, which replaces the traditional reasonable person standard in that Circuit.
Juries are asked to evaluate the events from the perspective of a reasonable woman: taking into account all the facts, would a reasonable woman consider the plaintiff to be sexually harassed? The predominant claim is that the new standard would force judges and juries to look at the case from the perspective of the complainant, who is typically a woman. This would presumably make it less difficult for a plaintiff to make a convincing claim of hostile work environment harassment. Some recent research suggests that a reasonable woman standard as currently implemented may not result in different judgments than the traditional reasonable person standard; in any case, the magnitude of the gender gap in perceptions about sexual harassment may not justify a change in standards, regardless of the effect of the standard itself.
5.2 The Effects of Other Rater Characteristics on Judgments of Sexual Harassment

Although gender is far and away the most widely studied factor in the study of sexual harassment perceptions, other factors have been studied. For example, when the initiator is higher status than the recipient, judges generally respond more positively toward the recipient, more negatively toward the initiator, and perceive more harassment than when the initiator is not a supervisor. In addition, several studies that compare students with workers find that students have a broader, more lenient view of social-sexual behavior relative to samples of workers who are typically somewhat older and have more work experience.
6. Some Remaining Issues

While much is known, we lack a complete picture of sexual harassment. Must concern about sexual harassment eliminate any kind of dating or flirtation at work? How responsible should employers be for the behavior of their employees? How should targets of harassing behavior respond in order to eliminate harassment without damaging their own career possibilities?

See also: Gender and Place; Male Dominance; Sexual Attitudes and Behavior; Sexual Harassment: Legal Perspectives; Workplace Safety and Health
Bibliography

Borgida E, Fiske S T (eds.) 1995 Gender stereotyping, sexual harassment, and the law. Journal of Social Issues 51(1)
Bowes-Sperry L, Tata J 1999 A multiperspective framework of sexual harassment. In: Powell G N (ed.) Handbook of Gender and Work. Sage Publications, Inc., Thousand Oaks, CA, pp. 263–80
Brewer M B, Berk R A (eds.) 1982 Beyond Nine to Five: Sexual harassment on the job. Journal of Social Issues 38(4): 1–4
Estrich S 1991 Sex at work. Stanford Law Review 43: 813–61
Franke K M 1997 What’s wrong with sexual harassment? Stanford Law Review 49: 691–772
Gutek B A 1985 Sex and the Workplace. Jossey-Bass Publishers, San Francisco
Gutek B A, Done R 2001 Sexual harassment. In: Unger R K (ed.) Handbook of the Psychology of Women and Gender. Wiley, New York
Harris v. Forklift Systems. 1993. 114 S. Ct. 367
MacKinnon C A 1979 Sexual Harassment of Working Women: A Case of Sex Discrimination. Yale University Press, New Haven, CT
O’Donohue W (ed.) 1997 Sexual Harassment: Theory, Research, and Treatment. Allyn and Bacon, Boston
Pryor J B, McKinney K (Issue eds.) 1995 Special Issue: Research advances in sexual harassment. Basic and Applied Social Psychology 17(4)
Stockdale M S 1996 Sexual Harassment in the Workplace: Perspectives, Frontiers, and Response Strategies. Sage Publications, Inc., Newbury Park, CA
Welsh S 1999 Gender and sexual harassment. Annual Review of Sociology 25: 169–90
Williams C L, Giuffre P A, Dellinger K 1999 Sexuality in the workplace: Organizational control, sexual harassment, and the pursuit of pleasure. Annual Review of Sociology 25: 73–93
B. A. Gutek
Sexual Orientation and the Law

Sexual orientation refers to any classification based on sexual desire or conduct, such as heterosexuality, same-sex sexuality, or bisexuality. It is a classification that includes all sexual orientations, just as race is a classification that includes all races (White, Black, Asian, etc.). However, just as discussions of race tend to focus on marginalized racial groups (such as people of color), discussions of sexual orientation commonly refer to minority sexual orientations: gay, lesbian, or bisexual. As a result, this article will reflect that focus on gay people. It will, however, also note the growing literature on heterosexuality and transgender people.
1. Homosexuality and Heterosexuality as Historically Contingent

The study of the law relating to sexually marginalized people, also known as ‘queer legal theory,’ evolved in the 1980s and 1990s out of feminist legal theory, postmodern theory, and critical race theory. The most influential work informing queer legal theory is Michel Foucault’s History of Sexuality, Vol. 1 (Foucault 1978), which posited that sexual orientation is socially constructed rather than naturally or divinely ordained. This point has been further developed by Judith Butler’s performativity theory of sexual orientation and gender, which posits that identities are performed rather than biologically or divinely ordained (Butler 1990).
1.1 From Conduct to Status and Back to Conduct

The social construction of sexual orientation is revealed by the different ways in which society and law have viewed same-sex sexuality. Different terms used to describe same-sex sexuality reflect these changes. Prior to the late nineteenth century, same-sex sexuality was viewed as sinful conduct, redeemable through repentance. The terms ‘homosexual’ and ‘heterosexual’ did not appear in English until 1892, along with the idea that sexual orientation was a status. These terms are products of German and English medical research on same-sex sexuality, which saw same-sex sexuality as a status, gave it a name, contrasted it with the previously unmarked status of heterosexuality, and assigned it social meaning as a constellation of characteristics that a person is (rather than particular sexual acts that one does). The terms ‘invert’ or ‘homosexual’ refer to this medicalized understanding of same-sex sexuality as a sickness rather than sinful conduct. Oscar Wilde’s 1895 trials and imprisonment for gross indecency are credited with introducing into popular culture the status of the male homosexual as effete, artistic, self-centered, and sybaritic. A third stage, beginning in the 1940s and 1950s, relied on the scientific research of Alfred Kinsey and Evelyn Hooker to contend that same-sex sexuality was a normal variation from opposite-sex sexuality, rather than a disease (Hooker 1957, Kinsey et al. 1948). The nascent homophile movement adopted the term ‘gay’ to distance itself from the medical judgment associated with ‘homosexual.’ A fourth stage, emerging through queer theory in the 1990s, rejected the very notion of sexual orientation status (evil or neutral), pointing out the indeterminate and potentially subordinating qualities associated with status-based notions of identity. This stage dubbed minority sexual orientations ‘queer’ (reclaiming the epithet), treating this anti-status as an epistemological category. To be queer is to believe that racial, sexual, and sexual orientation subordination is wrong.
Queer theory arguably uncouples status from conduct, so that one person can be both queer and actively heterosexual.
1.2 Fluidity Between Sexual Orientation and Gender Identity
Sexual orientation and gender issues overlap, but are also distinct in important ways. Both are associated with gender nonconformity. They differ, however, in their fundamental premises. Conventional approaches to sexual orientation presuppose two sexes, and categorize a woman as lesbian if she is with another woman (homosexual literally meaning same-sex), and heterosexual if she is with a man (heterosexual literally meaning different-sex). Transgender theory, in contrast, posits that there are more than two sexes, and that people exist along a continuum of masculinities and femininities. This reasoning renders gay theory’s primary assumption absurd; to speak of a person being gay or heterosexual ceases to make sense when there are many more than the two options of being a woman-with-a-woman or woman-with-a-man.
‘Transgender’ generally refers to gender norm transgression. Some transgender people, known as transsexuals, undergo medical treatment to change from one sex to another. Other categories of transgender people include transvestites (people who cross-dress but do not undergo medical gender reassignment), and transgendered people (those who do not conform to gender norms for the sex they were assigned at birth, but who do not undergo medical treatment to reassign their sex). Some cultures do not draw bright lines to distinguish same-sex sexuality from gender identity issues. In Bolivia, for example, men who engage in same-sex sexuality do so within a cultural understanding that one of the participants, in some sense, is female. Men who engage in same-sex sexuality call themselves gente de ambiente, or people of the atmosphere (West and Green 1997). Three subcategories of gente de ambiente highlight the overlap between sexual orientation and transgender identity in Bolivian culture. Travestis, or transvestites, view themselves as women trapped in men’s bodies. Camuflados, the camouflaged, see themselves as men, but take the receptive role in intercourse, and often pass as heterosexual in public spaces. The third category, hombres, or just men, take the penetrating role in intercourse, and are distinguished from heterosexual men in that they respond to sexual overtures from other men. In Bolivia, and elsewhere, these sexual orientation and gender categories are fluid. A man might sleep with men exclusively, only occasionally, or for money. He might desire to sleep with men, but refrain for fear of social or legal condemnation. The very difficulty in deciding the characteristics of identity categories forms the thesis of much postmodern scholarship that identity generally, and sexual identity in particular, is indeterminate because it varies historically, culturally, and within a particular individual’s lifetime.
2. Changing Focus of Sexual Orientation Research
Sexual orientation law is a synergy of gay rights advocacy and theoretical academic writings. Advocates bring cases and support legislation countering anti-gay discrimination. Scholars both craft theories that inform these efforts and evaluate the cases and statutes that become law. Two important areas involve anti-sodomy laws and the ban on same-sex marriage.
2.1 Anti-sodomy Laws
Because the criminalization of same-sex sexuality is the most obvious expression of state condemnation, sexual orientation law began by challenging anti-sodomy laws. In the US case Bowers vs. Hardwick, 478 US 186 (1986), the US Supreme Court upheld Georgia’s statute criminalizing sodomy against a constitutional privacy challenge. The Court issued extraordinarily homophobic declarations, such as the concurring opinion’s citation of Blackstone (1859) to describe same-sex sodomy as ‘an offense of “deeper malignity” than rape, a heinous act “the very mention of which is a disgrace to human nature.”’ The gratuitous nastiness inspired both numerous critiques of the decision (Goldstein 1988, Thomas 1992) and challenges to other laws disadvantaging gay people.
One strand of this critique challenges the validity of anti-sodomy statutes, arguing that sodomy is an historically contingent category. For example, early Roman–Dutch law (imported to Colonial South Africa) interpreted sodomy as including a wide range of nonconforming sexual acts, such as anal penetration, bestiality, masturbation, oral penetration, penetration with an inanimate object, interfemoral intercourse, and heterosexual intercourse between a Jew and a Christian (West and Green 1997, p. 6). This insight undermines the logic of sodomy law defenders, who, along with the US Supreme Court in Bowers vs. Hardwick, rest their defense on ‘millennia of moral teaching’ (Goldstein 1988). If what counts as reprehensible conduct differs markedly among people, places, and times, theorists reason, then one cannot invoke a uniform condemnation of the conduct.
While many governments have decriminalized same-sex sexuality, the criminal ban remains strong in some places. In 2000 in Malaysia, for example, former Deputy Prime Minister Anwar Ibrahim was convicted of sodomy and sentenced to nine years in prison after a 14-month trial. At the opposite extreme, the 1996 South African Constitution explicitly forbade sexual orientation discrimination.
The ban on gays in the military, like anti-sodomy laws, excludes gay people from full citizenship. While the ban remains in the US, other countries, such as Israel, allow gay people to serve in the military.
2.2 The Ban on Same-sex Marriage
Although anti-sodomy laws are rarely enforced, their existence impedes other anti-discrimination claims, such as gay people’s rights to marry or to serve in the military. Since the 1980s, many countries (e.g., South Africa) and US states (e.g., Georgia) have decriminalized same-sex sexuality. Legal advocacy and scholarship have moved on to contest discrimination in other areas. In Romer vs. Evans, 517 US 620 (1996), the US Supreme Court invalidated a state constitutional amendment that forbade any state entity from protecting gay, lesbian, or bisexual people from discrimination. Advocates and legal scholars, however, have paid far more attention to marriage litigation. In 2001, the Netherlands became the first country to allow same-sex couples to marry. Other countries such as Denmark, France, Greenland, Hungary, Iceland, Norway, and Sweden each recognize some type of partnership, which accords same-sex couples (and sometimes opposite-sex unmarried couples) various benefits accorded to married couples. Parts of Canada and Spain also recognize same-sex partnerships, and domestic partners enjoy some employment benefits in Argentina, Canada, Israel, New Zealand, and South Africa. Other countries, such as Australia, Austria, Belgium, Brazil, the Czech Republic, Germany, Portugal, Spain, and the UK recognize narrower rights, such as succession rights in private housing and some inheritance rights. Australia, New Zealand, Belgium, Denmark, Finland, Germany, Iceland, The Netherlands, Norway, South Africa, Sweden, and the UK recognize same-sex relationships for immigration purposes. Smaller governmental units, such as cities, counties, and provinces around the world, ban discrimination on the basis of sexual orientation.
In the USA, only one state, Vermont, accords same-sex relationships recognition akin to marriage; the federal government and a majority of the states explicitly refuse to recognize same-sex marriage. The Vermont Supreme Court in Baker vs. Vermont, 744 A.2d 864 [Vt. 1999] found that banning same-sex marriage (or its equivalent under another name) violated the Common Benefits Clause of the Vermont constitution. That clause provides that ‘government is, or ought to be, instituted for the common benefit, protection, and security of the people, nation, or community, and not the particular emolument or advantage of any single person, family, or set of persons, who are only a part of that community.’ Following the court’s instructions, the Vermont legislature created a structure parallel to marriage for same-sex couples, calling it civil union.
3. Theoretical Approaches to Sexual Orientation Law
The emphasis in current theory and research on sexual orientation law is to deconstruct and reconstruct legal regulations that subordinate gay and transgendered people. This rubric includes various ideological and methodological approaches.
3.1 Ideology
Ideological diversity includes both classical liberal and critical approaches. Liberal approaches suggest that gay people are sufficiently similar to heterosexuals to justify equal legal treatment. This approach suggests that altering legal rules to lift the ban on gay participation in marriage or the military will not fundamentally change these institutions. A critical approach, in contrast, contests essentialized notions of identity, seeing it as socially constructed rather than reflecting essential commonalities among people who engage in same-sex sexuality. This critical approach often seeks a more comprehensive restructuring of legal regulations than simply adding gay people to existing institutions. Instead, critical approaches propose incorporating sexual orientation analysis into other antisubordination discourses, such as feminism, critical race studies, and class analysis. Critical theorists might further propose alternatives to marriage, such as domestic partnership for all people, contending that only new institutions can alleviate the sex/gender hierarchies inherent in marriage.
3.2 Methodology
Postmodern, empirical, and legal economic approaches have contributed to the literature on sexual orientation law. One empirical approach compiles data about rates of arrest and prosecution for sexual offenses, and points out that consensual same-sex sexual activity is unfairly singled out for criminal penalty (Eskridge 1999). The two major premises of this approach, that like parties should be treated alike, and further that the state should generally refrain from interfering in private consensual activities, together lead to the conclusion that the state statutes criminalizing sodomy should be invalidated.
One postmodern approach closely examines the language and reasoning of a judicial opinion to decode the cultural context of the decision, revealing, for example, the Supreme Court’s strategic and biased collapse of status and conduct in Bowers vs. Hardwick. This approach focuses on how the majority decision in Hardwick collapsed the distinction between status and conduct, and ignored the many cross-sex couples who commit sodomy by engaging in acts such as fellatio, cunnilingus, and anal intercourse. The majority in Hardwick framed the question as ‘whether the Federal Constitution confers a fundamental right upon homosexuals to engage in sodomy,’ focusing the legal inquiry on both a bad act (sodomy) and a bad status (homosexuality). Under this reasoning:
[I]f sodomy is bad … then homosexuals and heterosexuals who do it are bad. If homosexuals are bad, we are bad whether we’ve engaged in sodomy or not. To hold both of these positions with consistency, you have to be willing to say that many, many heterosexuals are bad. This the majority Justices never acknowledged … They wanted the badness of each to contaminate the other—while heterosexual personhood remained out of the picture, protected from the taint with which it was logically involved. (Halley 1999, pp. 7–8)
By leaving the categories of sexual orientation status and conduct ambiguous, legal doctrine governing consensual sexuality ‘remained always ready to focus on “act” or “status” according to the expediencies of the situation’ (Halley 1999, p. 8).
An emerging strand of legal scholarship uses legal economic premises to examine heterosexuality as well as same-sex sexuality. US Court of Appeals Judge Richard Posner’s book Sex and Reason (1992) posited a bioeconomic theory of sexuality that combined sociobiology with legal economics. Posner’s often controversial analysis (describing, for example, the economic efficiency in some contexts of female infanticide and baby selling, which he renames ‘parental-right selling’) provoked a firestorm of response. Yet recent scholarship has built on Posner’s economic approach to posit a bargaining theory of sexuality, suggesting that legal regulation should equalize the bargaining positions between men and women (Hirshman and Larson 1998).
4. Future Directions of Theory and Research
The youth of sexual orientation legal research makes it unpredictable. Four likely future trends emerge. First, future research may well include increased use of empirical methods such as compilations of data from court records in criminal or family law cases. Second, queer legal theory’s emphasis on post-identity analysis and legal doctrine’s focus on identity as a foundational category require that future scholarship address the tensions between deconstructing identity and reconstructing a legal regime that does not discriminate on the basis of sexuality or gender performance. Third, scholars are likely to try to resolve the analytical tension between a binary construction of sex that underlies gay theory and a fluid construction of sex that underlies transgender approaches. Fourth and finally, future scholarship and advocacy may focus on the legal regulation of heterosexuality and bisexuality, further developing the literature on how legal regulation misconstrues identity as fixed or natural. On an ideological level, theoretical research is likely to continue to develop in both liberal (assimilationist) and critical (utopian) strands.
See also: Civil Rights; Family Law; Feminist Legal Theory; Gay/Lesbian Movements; Gender and the Law; Privacy: Legal Aspects; Queer Theory; Regulation: Sexual Behavior; Sex Segregation at Work; Sexual Attitudes and Behavior; Sexual Orientation: Historical and Social Construction; Social Class and Gender
Bibliography
Blackstone W 1859 Commentaries. Harper Brothers, New York, Vol. IV
Butler J 1990 Gender Trouble: Feminism and the Subversion of Identity. Routledge, New York
Eskridge W 1999 Gaylaw: Challenging the Apartheid of the Closet. Harvard University Press, Cambridge, MA
Foucault M 1978 The History of Sexuality, Vol. I: An Introduction. Pantheon, New York
Goldstein A 1988 History, homosexuality and political values: Searching for the hidden determinants of Bowers v. Hardwick. Yale Law Journal 97: 1073–103
Halley J 1999 Don’t: A Reader’s Guide to the Military’s Anti-Gay Policy. Duke University, Durham, NC
Hirshman L, Larson J 1998 Hard Bargains: The Politics of Sex. Oxford University Press, New York
Hooker E 1957 The adjustment of the male overt homosexual. Journal of Projective Techniques 21: 18–31
International Lesbian and Gay Association, http://www.ilga.org/
Kinsey A, Pomeroy W, Martin C 1948 Sexual Behavior in the Human Male. W. B. Saunders, Philadelphia, PA
Posner R 1992 Sex and Reason. Harvard University Press, Cambridge, MA
Robson R 1992 Lesbian (Out)law: Survival Under the Rule of Law. Firebrand Books, Ithaca, NY
Rubinstein W 1997 Sexual Orientation and the Law (2nd edn.). West, St. Paul, MN
Symposium: InterSEXionality: Interdisciplinary perspectives on queering legal theory. 1998. Denver University Law Review 75: 1129–464
Thomas K 1992 Beyond the privacy principle. Columbia Law Review 92: 1431–516
Valdes F 1995 Queers, sissies, dykes and tomboys: Deconstructing the conflation of ‘sex,’ ‘gender,’ and ‘sexual orientation’ in Euro-American law and society. California Law Review 83: 1–377
West D, Green R 1997 Sociolegal Control of Homosexuality: A Multinational Comparison. Plenum Press, New York
M. M. Ertman
Sexual Orientation: Biological Influences
How humans develop sexual orientations is a question that people have pondered for at least a century. In the later part of the twentieth century, scientists began to articulate sophisticated biological theories of how sexual orientations develop. This article critically surveys contemporary biological theories of the development of sexual orientations. In particular, it examines three recent studies of how sexual orientations develop and their theoretical underpinnings. The focus is on these studies because they are cited approvingly by almost every scientist trying to develop a biological theory of sexual orientation and because they are typical in their assumptions and methodology.
1. What Is a Sexual Orientation?
A person’s sexual orientation concerns his or her sexual desires and fantasies towards others in virtue of their sex (gender). However, a person’s sexual orientation is only one part of a person’s sexual interest generally. People have a wide range of sexual tastes. Some are attracted to people of certain ages, people of certain body types, of certain races, of certain hair colors, of certain personality types, of certain professions, as well as to people of a certain sex and a certain sexual orientation. Further, people are not only sexually interested in certain sorts of people, some also have quite specific interests in certain sorts of sexual acts, certain venues for sex, or a certain frequency of having sex. We recognize that people can be sorted into all sorts of groups in virtue of their sexual interests, but most contemporary scientific studies treat only the sex of the people to whom a person is sexually attracted as an essential feature of him or her. Doing so may be culturally salient but it is not scientifically justified.
2. What Makes a Theory a Biological One?
To say that sexual orientation is biologically based is an ambiguous claim; there are various senses in which it is trivially true that sexual orientation is biological. Everything psychological is biologically based. Humans can have sexual orientations while inanimate objects and one-celled organisms cannot because of our biological/psychological make-up. The same sort of claim is true with respect to having a favorite type of music: humans but not single-celled organisms can have a favorite type of music. Even though a preference for classical music seems a paradigmatic example of a learned trait, such a preference is also biological in that a certain cognitive complexity is required in order to have such a preference. Sexual orientation is at least biologically based in the sense that musical preferences are.
Biological research on sexual orientation makes a much bolder central claim: that a person’s sexual orientation is inborn or determined at a very early age and, as a result, is ‘wired into’ his or her brain. To understand the significance of such a claim, it is useful to contrast three models of the role genes and other biological factors might play in sexual orientation (Byne 1996, Stein 1999, pp. 123–7). According to the permissive model, genes or other biological factors influence neuroanatomical structures on which experience inscribes sexual orientation, but biological factors do not directly or indirectly determine sexual orientation. Something like the permissive model correctly describes the development of musical preferences. Various genetic and biological factors make it possible for our experiences with various kinds of music to shape our musical preferences. Contrast the permissive model with the direct model, according to which genes, hormones, or other biological factors directly influence the brain structures that underlie sexual orientation.
According to the direct model, the neurological structures responsible for the direction of a person’s sexual attraction toward men or women develop at a very early age as a result of a person’s genes or other biological factors. One version of the direct model sees genes in the q28 region of the X chromosome as coding for a set of proteins that causes the INAH-3 region of the hypothalamus to develop so as to determine a person’s sexual orientation (LeVay and Hamer 1994). The direct and the permissive models can be contrasted with the indirect model, according to which genes code for (and/or other biological factors influence) temperamental or personality factors that shape how a person interacts with his or her environment and experiences of it, which, in turn, affects the development of his or her sexual orientation. On this view, the same gene (or set of genes) might predispose to homosexuality in some environments, to heterosexuality in others, and have no effect on sexual orientation in others. An example of such a theory is Daryl Bem’s theory of sexual orientation according to which biological factors code for childhood personality types and temperaments (for example, aggressiveness, willingness to engage in physical contact, and so on) (Bem 1996, Peplau et al. 1998). In societies like ours where there are significantly different gender roles typically associated with men and women, these different personality and temperament types get molded into gender roles that, in turn, play a crucial role in the development of sexual orientation. The three biological theories of the development of sexual orientation discussed below accept the direct model. However, what evidence there is for these theories is equally consistent with the indirect model.
3. Is Sexual Orientation Wired into the Brain?
In 1991, Simon LeVay published a study of the size of INAH-3, a particular cell group in the hypothalamus (LeVay 1991). Starting from the assumption that there are neurological differences between men and women, LeVay decided to look for sexual-orientation differences in some of the areas of the hypothalamus that seem to exhibit sex differentiation. He reasoned as follows: given that most people who are primarily attracted to women are men and most people who are primarily attracted to men are women, in order to discover where sexual orientation is reflected in the brain, we should look in parts of the brain that are structured differently for men and women. This picture is based on seeing gay men as having female-typical characteristics and seeing lesbians as having male-typical characteristics. Seeing gay men and lesbians as gender inverts has a certain cultural salience but its scientific merit has been subject to serious criticism (Stein 1999, pp. 202–5, Byne 1994).
To examine the hypothalamus, LeVay had to study portions of human brain tissue that are accessible only after the person has died. Further, LeVay needed to know the sexual orientations of the people associated with the brain tissue he was studying. LeVay’s study was made possible as a result of the AIDS epidemic, which had the result of making available brains from people whose self-reported sexual histories are to some extent part of their medical records. LeVay examined 41 brains: 19 of them from men LeVay presumed to be gay because they had died of complications due to AIDS and their medical records suggested that they had been exposed to HIV (the virus that causes AIDS) through sexual activity with other men; six of them from men of undetermined sexual orientation who also died of AIDS and who LeVay presumed were heterosexual; 10 of them from men of undetermined sexual orientation who died of causes other than AIDS and who were also presumed to be heterosexual; and six of them from women all of whom were presumed to be heterosexual, one of whom died of AIDS and five of whom died from other causes.
LeVay found that, on average, the INAH-3 of the presumed gay men were significantly smaller than those of the presumed heterosexual men and about the same size as those of the women. From this, he inferred that gay men’s INAH-3 are in a sense ‘feminized.’ Although LeVay rather cautiously concluded that his ‘results do not allow one to decide if the size of INAH-3 in an individual is the cause or consequence of that individual’s sexual orientation or if the size of INAH-3 and sexual orientation co-vary under the influence of some third unidentified variable,’ he also said that his study illustrates that ‘sexual orientation in humans is amenable to study at the biological level’ (LeVay 1991, p. 1036). In media interviews after the publication of his study he made even stronger claims; he said, for example, that the study ‘opens the door to find the answer’ to the question of ‘what makes people gay or straight’ (Gelman 1992).
4. Is Sexual Orientation Inherited?
Various studies suggest that sexual orientation runs in families (e.g., Pillard and Weinrich 1986). Such studies show that a same-sex sibling of a homosexual is more likely to be a homosexual than a same-sex sibling of a heterosexual is to be a homosexual; more simply, for example, the brother of a gay man is more likely to be gay than the brother of a straight man. These studies do not establish that sexual orientation is genetic because most siblings, in addition to sharing a significant percentage of their genes, share many environmental variables, that is, they are raised in the same house, are fed the same meals, attend the same schools and have many of the same adult role models. For these reasons, disentangling inherited and environmental influences requires more sophisticated studies.
Heritability studies done by Michael Bailey and Richard Pillard assessed sexual orientation in identical twins, fraternal twins, nontwin biological siblings, and similarly-aged unrelated adopted siblings (Bailey and Pillard 1991, Bailey et al. 1993). If sexual orientation is genetic, then, first, all identical twins should have the same sexual orientation and, second, the rate of homosexuality among the adopted siblings should be equal to the rate of homosexuality in the general population. If, on the other hand, identical twins are as likely to have the same sexual orientation as adopted siblings, this suggests that genetic factors make very little contribution to sexual orientation. In both twin studies, subjects were recruited through ads placed in gay publications that asked for homosexual or bisexual volunteers with twin or adoptive siblings. Volunteers were encouraged to reply to the ad ‘regardless of the sexual orientation’ of their siblings.
In both of these studies, the percentage of identical twins who are both homosexual is substantially higher than the percentage with respect to fraternal twins. For example, 48 percent of the identical twins of lesbians were also lesbians, 16 percent of the fraternal twin sisters were lesbians, 14 percent of the nontwin biological sisters were lesbians, as were 6 percent of adoptive sisters. These results show that sexual orientation is at least partly not the result of genetic factors. However, the higher concordance rate is consistent with a genetic effect because identical twins share all of their genes while fraternal twins, on average, share only half of their genes. Also consistent with a genetic effect is the result that the concordance rates for both types of twins are higher than the concordance rates for adopted siblings.
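The two readings of these concordance figures can be made concrete with a small Python sketch. It is offered only as an illustration: the figures are the rates for women quoted above, while the variable and function names are hypothetical, and the comparison is descriptive rather than a statistical test.

```python
# Concordance rates for sisters of lesbian subjects, as quoted in the text
# (Bailey et al. 1993). All names below are illustrative, not from the study.
CONCORDANCE = {
    "identical twins": 0.48,
    "fraternal twins": 0.16,
    "nontwin biological sisters": 0.14,
    "adoptive sisters": 0.06,
}


def fully_genetic(rates):
    """A purely genetic trait would make every identical-twin pair concordant."""
    return rates["identical twins"] == 1.0


def ordered_by_relatedness(rates):
    """True if concordance never rises as genetic relatedness falls."""
    values = list(rates.values())  # dicts preserve insertion order
    return all(a >= b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    print("fully genetic:", fully_genetic(CONCORDANCE))
    print("ordered by relatedness:", ordered_by_relatedness(CONCORDANCE))
```

The first check fails because fewer than all identical twins are concordant; the second holds, which is the pattern the text describes as consistent with (though not proof of) a genetic effect.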
5. Is Sexual Orientation Genetic?
Building on heritability studies, Dean Hamer and his collaborators obtained DNA samples from gay brothers in families in which homosexuality seemed surprisingly common. These samples were analyzed using linkage analysis, a technique for narrowing the location of a gene for some trait, to see if there was any particular portion of the X chromosome that was identical in the pairs of brothers at an unexpectedly high frequency (Hamer et al. 1993). Hamer found that a higher than expected percentage of the pairs of gay brothers had the same genetic sequences in a particular portion of the q28 region of the X chromosome (82 percent rather than the expected 50 percent). In other words, he found that gay brothers are much more likely to share the same genetic sequence in this particular region than they are to share the same genetic sequence in any other region of the X chromosome. Hamer’s study suggests that the q28 region is the particular place where sexual orientation differences are inscribed. This study does not, contrary to popular belief, claim to identify any particular genetic sequence associated with homosexuality. At best, it has found that many of the pairs of homosexual brothers had the same genetic sequences in this portion of the X chromosome. When Hamer is at his most precise, he says that his study shows that ‘at least one subtype of male sexual orientation is genetically influenced’ (Hamer et al. 1993, p. 321). In other contexts, Hamer is less careful. For example, in his book, The Science of Desire, he talks of ‘gay genes,’ going so far as to use this term in the book’s subtitle (Hamer and Copeland 1994).
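To see why a sharing rate of 82 percent stands out against an expected 50 percent, one can ask how often chance alone would produce such a result. The Python sketch below is an illustration under stated assumptions, not a reconstruction of the linkage analysis: it treats each pair of brothers as an independent trial with a 50 percent chance of sharing the region, and it assumes 33 sharing pairs out of 40, a count used here because it corresponds to roughly 82 percent. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(n, k, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# ASSUMPTION for illustration: 40 brother pairs, 33 of them sharing the region.
N_PAIRS = 40
N_SHARING = 33

if __name__ == "__main__":
    tail = binomial_upper_tail(N_PAIRS, N_SHARING)
    print(f"observed share: {N_SHARING / N_PAIRS:.3f}")
    print(f"P(at least {N_SHARING} of {N_PAIRS} by chance) = {tail:.2e}")
```

A small tail probability indicates only that the sharing is unlikely to be a chance fluctuation in this sample; it does not identify a gene, and it says nothing about whether the result will replicate.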
6. Problems with These (and Other) Studies
6.1 Lack of Confirmation
Independent confirmation is the hallmark of the scientific method. LeVay’s results have not been independently confirmed (Byne 1996) and Hamer’s study has been recently disconfirmed (Rice et al. 1999). Although various research teams have confirmed the twin study results, a recent and more sophisticated study by Bailey undermines the methodology of early twin studies (Bailey et al. in press). Bailey systematically recruited subjects from a registry of identical and fraternal twins in Australia. In women, the percentage of identical twins of bisexuals and homosexuals who are also either bisexual or homosexual was between 24 and 30 percent (depending on how the boundaries of these groups are drawn), while the percentage of same-sex fraternal twins of bisexual and homosexual women was between 10 and 30 percent. Not only is the difference between identical and fraternal twins significantly smaller than in previous studies but the percentages for identical twins with the same sexual orientations are dramatically lower. Although these results can be read as consistent with the direct model, the evidence is much weaker than earlier heritability studies suggested.
6.2 Problems with the Subject Pool
This Australian study shows that the results of earlier twin and family studies (Bailey and Pillard 1991, Bailey et al. 1993, Hamer et al. 1993), which recruited subjects through HIV clinics, lesbian and/or gay organizations, newspapers, and other nonsystematic methods, must have been inflated by sampling bias. In particular, it suggests that gay men and lesbians with identical twins of the same sexual orientation are more likely to participate in such studies. If an experiment makes use of a subject pool that is in some way biased, then this gives rise to doubts about the conclusions based on it. LeVay’s subject pool is also biased. In particular, the homosexual population in his study is made up exclusively of men who died from complications due to AIDS and who told hospital staff that they had engaged in same-sex sexual activities.
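How self-selection can inflate twin concordance rates may be illustrated with a toy calculation. The Python sketch below assumes, purely hypothetically, that concordant twin pairs are some fixed number of times more likely to volunteer than discordant pairs; the function name and the numbers are illustrative and are not drawn from any of the studies discussed.

```python
def observed_concordance(true_rate, relative_volunteer_rate):
    """Concordance seen among volunteers when concordant pairs respond more often.

    true_rate: fraction of twin pairs in the population that are concordant.
    relative_volunteer_rate: how many times more likely a concordant pair is
        to volunteer than a discordant pair (1.0 means no bias).
    """
    concordant = true_rate * relative_volunteer_rate  # weighted concordant share
    discordant = 1.0 - true_rate                      # discordant share, weight 1
    return concordant / (concordant + discordant)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the direction of the effect.
    for bias in (1.0, 2.0, 3.0):
        print(f"bias x{bias:.0f}: observed = {observed_concordance(0.25, bias):.2f}")
```

Under these assumed numbers, a true concordance of 25 percent with concordant pairs three times as likely to volunteer would appear as 50 percent among the volunteers, while a bias factor of one leaves the rate unchanged.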
6.3 Methods of Classification
A related problem with such studies concerns the determination of subjects’ sexual orientations. Conclusions based on studies that inaccurately assign sexual orientations to subjects are weak. LeVay assumed, on the basis of no particular evidence, that all the women in his study were heterosexual and that all the men in his studies whose hospital records did not indicate same-sex sexual activity were heterosexual. Other studies that rely on a family member to report a person’s sexual orientation may be similarly problematic. Further, by assuming that there are two sexual orientations (gay and straight) and that people can be easily and reliably classified as one or the other on the basis of their behavior or their self-report, scientific research accepts, without strong justification, that our cultural presumptions about human sexual desires are scientifically valid (Stein 1999, pp. 201–13).
6.4 Undefended Assumptions
Generally, most biological research on sexual orientation accepts without argument a quite particular picture of sexual orientation. For example, many studies in the emerging research program unquestioningly accept the inversion assumption, according to which lesbians and gay men are seen as gender inverts, and many studies assume the direct relevance of animal models of sexual behavior to human sexual orientation (Stein 1999, pp. 164–79). More crucially, such studies typically accept that a person’s sexual orientation is a deep scientific property about her and that sexual orientation is a ‘window into a person’s soul.’ This view of the centrality of sexual orientation to human nature is neither culturally universal nor scientifically established (Stein 1990, Stein 1999, pp. 93–116).
7. Conclusion
Although there is some evidence that is consistent with biological factors playing an indirect role in the development of sexual orientations, there is no convincing evidence that biological factors are a direct cause of sexual orientation. How human sexual desires develop is an interesting research question, but we are probably rather far from answering it.

See also: Culture as Explanation: Cultural Concerns; Feminist Theory: Radical Lesbian; Sexual Preference: Genetic Aspects
Bibliography
Bailey J M, Pillard R C 1991 A genetic study of male sexual orientation. Archives of General Psychiatry 48: 1089–96
Bailey J M, Pillard R C, Neale M C, Agyei Y 1993 Heritable factors influence sexual orientation in women. Archives of General Psychiatry 50: 217–23
Bailey J M, Dunne M P, Martin N G in press The distribution, correlates, and determinants of sexual orientation in an Australian twin sample
Bem D J 1996 Exotic becomes erotic: A developmental theory of sexual orientation. Psychological Review 103: 320–35
Byne W 1994 The biological evidence challenged. Scientific American 270(May): 50–5
Byne W 1996 Biology and sexual orientation: Implications of endocrinological and neuroanatomical research. In: Cabaj R, Stein T (eds.) Textbook of Homosexuality and Mental Health. American Psychiatric Press, Washington, DC
Gelman D 1992 Born or bred? Newsweek, February 24: 46–53
Hamer D, Copeland P 1994 The Science of Desire: The Search for the Gay Gene and the Biology of Behavior. Simon and Schuster, New York
Hamer D H, Hu S, Magnuson V L, Hu N, Pattatucci A 1993 A linkage between DNA markers on the X chromosome and male sexual orientation. Science 261: 321–7
LeVay S 1991 A difference in hypothalamic structure between heterosexual and homosexual men. Science 253: 1034–7
LeVay S, Hamer D H 1994 Evidence for a biological influence in male homosexuality. Scientific American 270(May): 44–9
Peplau L A, Garnets L D, Spalding L R, Conley T D, Veniegas R C 1998 A critique of Bem’s ‘exotic becomes erotic’ theory of sexual orientation. Psychological Review 105: 387–94
Pillard R C, Weinrich J 1986 Evidence of familial nature of male homosexuality. Archives of General Psychiatry 43: 808–12
Rice G, Anderson C, Risch N, Ebers G 1999 Male homosexuality: Absence of linkage to microsatellite markers at Xq28. Science 284: 665–7
Stein E D (ed.) 1990 Forms of Desire: Sexual Orientation and the Social Constructionist Controversy. Garland, New York
Stein E D 1999 The Mismeasure of Desire: The Science, Theory, and Ethics of Sexual Orientation. Oxford University Press, New York
E. Stein
Sexual Orientation: Historical and Social Construction
Sexual orientation is a much used but fundamentally ambiguous concept. It came into use in discussions of sexuality in the 1970s largely as a synonym for homosexual desire and object choice, less frequently for heterosexual patterns. ‘Sexual orientation’ suggests an essential sexual nature. The task of historical or social constructionist approaches is to suggest that this belief is what itself needs investigation. Constructionist approaches seek to do two broad things: to understand the emergence of sexual categorizations (such as ‘the homosexual’ or ‘the heterosexual’ in western cultures since the nineteenth century) within their specific historical and cultural contexts; and to interpret the sexual meanings, both subjective and social, which allow people to identify with, or reject, these categorizations. These approaches are, thus, largely preoccupied, not with what causes individual desires or orientations, but with how specific definitions develop within their historic contexts, and the effects these definitions have on individual self-identifications and collective meanings.
1. The Rise of Constructionist Approaches to Sexuality The classical starting point for social constructionist approaches is widely seen as an essay on The Homosexual Role by the British sociologist, Mary McIntosh (1968). Its influence can be traced in a range of historical studies from the mid-1970s (Weeks 1977, Greenberg 1988), and it has been anthologized frequently (e.g., Stein 1992). What is important about the work is that it asks what was at the time a new question: not, as had been traditional in the sexological tradition from the late nineteenth century, what are the causes of homosexuality, but rather, why are we so concerned with seeing homosexuality as a condition that has causes? And in tackling that new question, McIntosh proposed an approach that opened up a new research agenda through seeing homosexuals ‘as a social category, rather than a medical or psychiatric one.’ Only in this way, she suggests, can the right questions be asked, and new answers proposed. Using Kinsey (Kinsey et al. 1948), McIntosh makes a critical distinction between homosexual behavior and ‘the homosexual role.’ Homosexual behavior is widespread; but distinctive roles have developed only in some cultures, and do not necessarily encompass all forms of homosexual activity. The creation of a specialized, despised, and punished role or category of homosexual, such as that which developed in Britain from the early eighteenth century, was designed to keep the bulk of society pure in rather the same way that the similar treatment of some kinds of criminal keeps the rest of society law-abiding. McIntosh drew on a variety of intellectual sources, from structural functionalism to dramaturgical approaches, but clearly central to her argument was a form of labeling theory. The creation of the homosexual role was a form of social control designed to minoritize the experience, and protect and sustain social order and traditional sexual patterns. 
If McIntosh put on the agenda the process of social categorization, another related but distinctive approach that shaped social constructionism came from the work of Gagnon and Simon, summarized in their book Sexual Conduct: The Social Sources of Human Sexuality (1973). Drawing again on the work of Kinsey and symbolic interactionist traditions, they argued that, contrary to the teachings of sexology, sexuality, far from being the essence of ‘the natural,’ was subject to sociocultural shaping to an extraordinary degree. The importance culture attributes to sexuality may, therefore, they speculated, not be a result of its intrinsic significance: society may have had a need to invent its importance and power at some point in history. Sexual activities of all kinds, they suggested, were not the results of an inherent drive but of complex psychosocial processes of development, and it is only because they are embedded in social scripts that the physical acts themselves become important. These insights suggested the possibility of exploring the complex processes by which individuals acquired subjective meanings in interaction with significant others, and the effects of ‘sexual stigma’ on these developmental processes (Plummer 1975).

By the mid-1970s, it is possible to detect the clear emergence of a distinctive sociological account, with two related concerns. One focused on the social categorization of sexuality, asking questions about what historical factors shaped sexual differences which appeared as natural but were in fact cultural. The other was concerned primarily with methods of understanding the shaping of subjective meanings through sexual scripting, which allowed a better understanding of the balance between individual and collective sexual meanings.

A third theoretical element now came into play: that represented by the work of Foucault (1976/1979). Foucault’s essay is often seen, misleadingly, as the starting point of constructionist approaches, but there can be no doubt of the subsequent impact of what was planned as a brief prolegomenon to a multivolume study. Like Gagnon and Simon, Foucault appeared to be arguing that ‘sexuality’ was a ‘historical invention.’ Like McIntosh, and others who had been influenced by her, he saw the emergence of the concept of a distinctive homosexual personage as a historical process, with the late nineteenth century as the key moment. The process of medicalization, in particular, was seen as a vital explanatory factor.
Like McIntosh, he suggested that psychologists and psychiatrists have not been objective scientists of desire, as the sexological tradition proclaimed, but on the contrary ‘diagnostic agents in the process of social labeling’ (McIntosh 1968). But at the same time, his suggestion that people do not react passively to social categorization—‘where there is power, there is resistance’—left open the question of how individuals react to social definitions, how, in a word that now became central to the debate, identities are formed in the intersection of social and subjective meanings.
2. Sexual Behavior, Sexual Categories, and Sexual Identities
The most crucial distinction for social constructionism is between sexual behavior, categories, and identities. Kinsey et al. (1948) had shown that there was no necessary connection at all between what people did sexually and how they identified themselves. If, in a much disputed figure, 37 per cent of the male population had had some sort of sexual contact with other men to the point of orgasm, yet a much smaller percentage claimed to be exclusively homosexual, identity had to be explained by something other than sexual proclivity or practice. Yet at the same time, by the 1970s, many self-proclaimed homosexuals were ‘coming out,’ in the wake of the new lesbian and gay movement. Many saw in the historicization of the homosexual category a way of explaining the stigma that homosexuality carried. What was made in history could be changed in history. Others, however, believed clearly that homosexuality was intrinsic to their sense of self and social identity, essential to their nature. This was at the heart of the so-called social constructionist–essentialist controversy in the 1970s and 1980s (Stein 1992). For many, a critique of essentialism could also be conceived of as an attack on the very idea of a homosexual identity, a fundamental challenge to the hard-won gains of the lesbian and gay movement, and the claim to recognition of homosexuals as a legitimate minority group. This was the source of the appeal of subsequent theories of a ‘gay gene’ or ‘gay brain,’ which suggested that sexual orientation was wired into the human individual. It is important to make several clear points in response to these debates, where social scientific debates became a marker of social movement differences. First, the distinction between behavior, categories, and identities does not require ignoring questions of causation; it merely suspends them as irrelevant to the question of the social organization of sexuality. Foucault himself stated that: ‘On this question I have absolutely nothing to say’ (cited in Halperin 1995).
The really important issue is not whether there is a biological or psychological propensity that distinguishes those who are sexually attracted to people of the same gender from those who are not. More fundamental are the meanings these propensities acquire, however or why ever they occur, the social categorizations that attempt to demarcate the boundaries of meanings, and their effect on collective attitudes and individual sense of self. Social categorizations have effects in the real world, whether or not they are direct reflections of inherent qualities and drives. The second point to be made is that the value of the argument about the relevance of theories of a ‘homosexual role’ does not depend ultimately on the validity of the variants of role theory (cf. Whitam and Mathy 1986; Stein 1992). The use of the word ‘role’ was seen by McIntosh (1968) as a form of shorthand, referring not only to a cultural conception or a set of ideas but also to a complex of institutional arrangements which depended on and reinforced these ideas. Its real importance as a concept is that it defined an issue that required exploration. Terms such as constructionism
and roles are in the end no more than heuristic devices to identify and understand a problem in studying sexuality in general and homosexuality in particular. It is transparently obvious that the forms of behavior, identity, institutional arrangements, regulation, beliefs, ideologies, even the various definitions of the ‘sexual,’ vary enormously through time and across cultures and subcultures. A major objective of historical and social constructionist studies of the erotic has been to problematize the taken for granted, to denaturalize sexuality in order to understand its human dimensions and the coils of power in which it is entwined, how it is shaped in and by historical forces and events. The historicization of the idea of the homosexual condition is an excellent pioneering example of this. The third point that requires underlining is that, regardless of evidence for the contingency of sexual identities, this should not imply that personal sexual identities, once acquired, can readily be sloughed off. The fact that categories and social identities are shaped in history does not in any way undermine the fact that they are fully lived as real. The complex relationship between societal categorization and the formation of subjectivities and sexual identities has in fact been the key focus of writing about homosexuality since the mid-1970s. On the one hand, there is a need to understand the classifying and categorizing processes which have shaped our concepts of homosexuality— the law, medicine, religion, patterns of stigmatization, formal and informal patterns of social regulation. On the other, it is necessary to understand the level of individual and collective reception of, and battle with, these classifications and categorizations. 
The best historical work has attempted to hold these two levels together, avoiding both sociological determinism (you are what society dictates) and extreme voluntarism (you can be anything you want): neither is true (see discussion in Vance 1989). Some of the most interesting work has attempted to explore the subcultures, networks, urban spaces, or even rural idylls that provided the space, the conditions of possibility, for the emergence of distinctive homosexual identities. McIntosh’s suggestion that the late seventeenth century saw the emergence of a subcultural context for a distinctive homosexual role in England has been enormously influential. Her rediscovery of the London mollies’ clubs has been the starting point of numerous historical excavations (e.g., Trumbach 1977; Bray 1982). There is now plentiful work which attempts to show that subcultures and identities existed before the late seventeenth century, for example, in the early Christian world (Boswell 1980), or in other parts of Europe (see essays in Herdt 1994), just as there have been scholars who have argued that we cannot really talk about homosexual identities until the late nineteenth, or even mid-twentieth centuries (see essays in Plummer 1981). There is a real historical debate. As a result, it would now seem remarkable to discuss sexual identities (and their complex relationship to social categorizations) without a sense of their historical and social context. Sexual identities are made in history, not in nature.
3. Heterosexuality and Homosexuality Identities are not, however, created in isolation from other social phenomena. In particular, a history of homosexual identities cannot possibly be a history of a single homogeneous entity, because the very notion of homosexuality is dependent, at the very least, on the existence of a concept of heterosexuality, which in turn presupposes a binary notion of gender. Only if there is a sharply demarcated difference between men and women does it become meaningful to distinguish same sex from other sex relationships. Social constructionism, Vance (1989) noted, had paid little attention to the construction of heterosexuality. But without a wider sense that ‘the heterosexual’ was also a social construction, attempts to explain the invention of ‘the homosexual’ made little sense. One of the early attractions of the first volume of Foucault’s The History of Sexuality was precisely that it both offered an account of the birth of the modern homosexual, and put that into a broader historical framework: by postulating the invention of sexuality as a category in western thought, and in delineating the shifting relationships between men and women, adults and children, the normal and the perverse, as constituent elements in this process. Foucault himself was criticized for putting insufficient emphasis on the gendered nature of this process, but this was more than compensated for by the developing feminist critique of ‘the heterosexual institution,’ with its own complex history (MacKinnon 1987; Richardson 1996). Central to these debates was the perception that sexuality in general is not a domain of easy pluralism, where homosexuality and heterosexuality sit easily side by side. It is structured in dominance, with heterosexuality privileged, and that privilege is essentially male oriented. 
Homosexuality is constructed as a subordinate formation within the ‘heterosexual continuum,’ with male and female homosexuality having a different relationship to the dominant forms. In turn, once this is recognized, it becomes both possible and necessary to explore the socially constructed patterns of femininity and masculinity (Connell 1995). Although the constructionist debates began within the disciplines of sociology and history, later developments, taking forward both theoretical and political (especially feminist) interventions, owed a great deal to poststructuralist and deconstructionist literary studies, and to the emergence of ‘queer studies.’ Whereas history and sociology characteristically have attempted to produce order and pattern out of the chaos of events, the main feature of these approaches is to show the binary conflicts reflected in literary texts.
The texts are read as sites of gender and sexual contestation and, therefore, of power and resistance (Sedgwick 1990). Sedgwick’s work, and that of the American philosopher Butler (1990), were in part attempts to move away from the essentialist/constructionist binaries by emphasizing the ‘performative’ nature of sex and gender. This in turn opened up what might be called the ‘queer moment’ that radically challenged the relevance of fixed sexual categorizations. For queer theorists, the perverse is the worm at the center of the normal, giving rise to sexual and cultural dissidence and a transgressive ethic, which constantly works to unsettle binarism and to suggest alternatives.
4. Comparative Perspectives
Much of the debate about the homosexual/heterosexual binary divide was based on the perceived Western experience, and was located in some sense of a historical development. Yet from the beginning, comparisons with non-Western sexual patterns were central to constructionist perspectives. Foucault (1976/1979) compared the Western ‘science of sex’ with the non-Western ‘erotic arts.’ It was the very fact of different patterns of ‘institutionalised homosexuality’ that formed the starting point of McIntosh’s essay, where she had identified two key regulating elements: between adults and nonadults (as in the intergenerational sex which denoted the passage from childhood to adulthood in some tribal and premodern societies), and between the genders (as in the case of the native American ‘berdache’). For some writers, these patterns are the keys to understanding homosexuality in premodern times. Historians have traced the evolution over time of patterns of homosexual life which have shifted from intergenerational ordering, through categorization around gender and class, to recognizably modern forms of egalitarian relationships (for a historical perspective see Trumbach 1998). So it is not surprising that constructionist approaches have led to an efflorescence of studies of sexuality in general, and homosexuality in particular, in other cultures, tribal, Islamic, southern (Herdt 1994). This comparative framework increasingly has been deployed within contemporary western societies to highlight the difficulty of subsuming behavior within a confining definition of condition or fixed orientation.
5. Beyond Constructionism
Historical and social constructionism has advanced and changed rapidly since the 1970s. The ‘category’ that early scholars were anxious to deconstruct has become ‘categories’ which proliferate in contemporary societies. ‘Roles,’ neat slots into which people could be expected to fit as a response to the bidding of the agents of social control, have become ‘performances’
(Butler 1990) or ‘necessary fictions’ (Weeks 1995), whose contingencies demand exploration. ‘Identities,’ which once seemed categoric, are now seen as fluid, relational, hybrid: people are not quite today what they were yesterday, or will be tomorrow. Identities have come to be seen as built around personal ‘narratives,’ stories people tell each other in the various interpretive communities to which they belong (Plummer 1995). Individual identities, it is increasingly recognized, are negotiated in the ever-changing relationship between self and other, within rapidly changing social structures and meanings. Sexual orientation may, or may not, be a product of genetics, psychosocial structuring, or environmental pressures. That issue, which has tortured sexology for over a century, may or may not be resolved at some stage in the future. For the constructionist, however, other questions are central: not what causes the variety of sexual desires, ‘preferences’ or orientations that have existed in various societies at different times, but how societies shape meanings around sexual diversity, and the effects these have on individual lives.

See also: Feminist Theory: Radical Lesbian; Gay, Lesbian, and Bisexual Youth; Gay/Lesbian Movements; Gender Differences in Personality and Social Behavior; Gender Ideology: Cross-cultural Aspects; Heterosexism and Homophobia; Masculinities and Femininities; Queer Theory; Sex-role Development and Education; Sexual Attitudes and Behavior; Sexual Behavior: Sociological Perspective; Sexual Orientation and the Law; Sexual Orientation: Biological Influences; Sexual Preference: Genetic Aspects; Sexuality and Gender; Sexuality and Geography
Bibliography
Altman D, Vance C, Vicinus M, Weeks J et al. 1989 Homosexuality, Which Homosexuality? GMP Publishers, London
Boswell J 1980 Christianity, Social Tolerance, and Homosexuality: Gay People in Western Europe from the Beginning of the Christian Era to the Fourteenth Century. University of Chicago Press, Chicago
Bray A 1982 Homosexuality in Renaissance England. GMP Publishers, London
Butler J 1990 Gender Trouble: Feminism and the Subversion of Identity. Routledge, New York
Connell R W 1995 Masculinities. University of California Press, Berkeley, CA
Foucault M 1976/1979 Histoire de la sexualité, 1, La Volonté de savoir. Gallimard, Paris [trans. 1979 as The History of Sexuality, Vol. 1, An Introduction. Allen Lane, London]
Gagnon J H, Simon W 1973 Sexual Conduct: The Social Sources of Human Sexuality. Aldine, Chicago
Greenberg D E 1988 The Construction of Homosexuality. University of Chicago Press, Chicago
Halperin D 1995 Saint Foucault: Towards a Gay Hagiography. Oxford University Press, New York
Herdt G (ed.) 1994 Third Sex, Third Gender: Beyond Sexual Dimorphism in Culture and History. Zone Books, New York
Kinsey A C, Pomeroy W B, Martin C E 1948 Sexual Behavior in the Human Male. Saunders, Philadelphia
McIntosh M 1968 The homosexual role. Social Problems 16(2): 182–92
MacKinnon C A 1987 Feminism Unmodified: Discourses on Life and Law. Harvard University Press, Cambridge, MA
Plummer K 1975 Sexual Stigma: An Interactionist Account. Routledge and Kegan Paul, London
Plummer K 1995 Telling Sexual Stories: Power, Change and Social Worlds. Routledge, London
Richardson D (ed.) 1996 Theorising Heterosexuality: Telling it Straight. Open University Press, Buckingham, UK
Sedgwick E K 1990 Epistemology of the Closet. University of California Press, Berkeley, CA
Stein E (ed.) 1992 Forms of Desire: Sexual Orientation and the Social Constructionist Controversy. Routledge, New York
Trumbach R 1977 London’s sodomites: Homosexual behavior and Western culture in the 18th century. Journal of Social History 11(1): 1–33
Trumbach R 1998 Sex and the Gender Revolution, Heterosexuality and the Third Gender in Enlightenment London. Chicago University Press, Chicago, Vol. 1
Vance C S 1989 Social Construction Theory: Problems in the History of Sexuality. In: Altman D, Vance C, Vicinus M, Weeks J et al. (eds.) Homosexuality, Which Homosexuality? GMP Publishers, London, pp. 13–34
Weeks J 1977 Coming Out: Homosexual Politics in Britain from the Nineteenth Century to the Present. Quartet Books, London
Weeks J 1995 Invented Moralities: Sexual Values in an Age of Uncertainty. Columbia University Press, New York
Whitam F L, Mathy R M 1986 Male Homosexuality in Four Societies. Praeger, New York

J. Weeks

Sexual Perversions (Paraphilias)
1. Definition
Modern usage tends to favor the word ‘paraphilia’ rather than ‘deviance’ or ‘perversion.’ In this article these latter terms will be used only in their historical context. The word ‘paraphilia’ means a love of (philia) the beyond or irregular (para), and is used instead of those words which today have pejorative implications that are not always relevant. The term itself is used to describe people, usually men, with intense sexual urges that are directed towards nonhuman objects, or the suffering or humiliation of oneself or one’s partner, or more unacceptably, towards others who are incapable of giving informed consent, such as children, animals, or unwilling adults. People who are paraphiliacs often exhibit three or four different aspects, and clinical psychiatric conditions (personality disorders or depression) may sometimes be present. Paraphilias include: exhibitionism (the exposure of one’s genitals to a stranger, sometimes culminating in masturbation); voyeurism or peeping (the observance of strangers undressing or having sexual intercourse, without their being aware of the voyeur, who usually masturbates); fetishism (the use of inanimate objects for arousal, usually articles of women’s underwear, although if these are used to cross-dress, transvestic fetishism is the diagnosis); frotteurism (the rubbing of the genitals against the buttocks, or the fondling of an unsuspecting woman, usually in a crowded situation, so that detection of the perpetrator is unlikely). Pedophilia refers to men sexually attracted to children, some to girls, some to boys and some to either sex. Some pedophiles are attracted sexually only to children, the exclusive type, but some are attracted to adults as well, the nonexclusive type. Sexual sadism and sexual masochism (S&M) involve the infliction or reception of pain, respectively, to achieve arousal and orgasm. Sadistic fantasies that involve obtaining complete control of the victim are particularly dangerous and may result in death. Some masochistic behaviors that involve self-asphyxiation as part of a masturbatory ritual can result in accidental death. Other paraphilias include the use for sexual purposes of corpses (necrophilia), animals (zoophilia), feces (coprophilia), enemas (klismaphilia) and urine (urophilia). It is of particular interest that just as masturbation is nowadays excluded from being considered a deviant act, so homosexuality, which figured so largely in this area in earlier times, was removed as a paraphilia in the 1974 edition of the Diagnostic and Statistical Manual of Mental Disorders.
2. Greek Mythology The ancient Greeks often ascribed to their gods extreme forms of sexual behavior. Whether this reflected fears or wishful thinking or actual practices is obviously very difficult to evaluate. Zeus’s myriad affairs were often conducted in the form of animals or even inanimate objects, such as a cloud or a shower of gold, and his objects of desire were male or female, as indeed were those of the other gods and heroes. Artemis and Athena both eschewed sex altogether and remained adamantly virginal. The mighty Hercules cross-dressed to be humiliated by Queen Omphale, while Theseus and Achilles both donned women’s clothes without apparent later loss of esteem (Licht 1969, Bullough 1976).
3. History
3.1 The Hunter-gatherer Past
In most species sexuality and reproduction are of necessity tied together and overlap, but in modern humans it is the relatively recent separation of sexuality from reproduction that represents a crucial evolutionary phase. Despite huge sociological changes we are genetically still at the stage of hunter-gatherers, who had a shorter life, no menopause and long periods between childbirth due to prolonged breast feeding. Perhaps because of better nutrition, and possibly exposure to electric light (which affects the hypothalamus), girls in the developed countries in the 1990s have their first period at around 13 years of age, and ovulation starts soon afterwards, with pregnancy being a distinct possibility. Usually most women have conceived and had their children by 30–40 years of age, which, assuming many will reach 80 years of age, leaves some 40–50 years of life to enjoy nonreproductive sex. Sex therefore takes on new meanings and fulfils other roles, perhaps as a recreational activity, which allows time for sexual variant behavior (Short 1976).

3.2 Sexuality
Numerous so-called Venus figurines and cave paintings have been found from between 25,000 and 10,000 years ago that depict almost all of the sexual acts with which we are familiar today. What the meanings and significance of these depictions were to the people who created them we can only speculate. Early studies showed that paraphilias were present universally in every culture and throughout every historical period. Often tied up with religious rites, sacred prostitution, and/or phallic cults, public shows of self-immolation, sadism, fetishism, and homosexuality flourished. Iwan Bloch believed that every sensory organ could function as an erotogenous zone, and so form the basis for a so-called perversion. Freud pointed out that the ancients laid stress on the instinct itself, whereas we today are preoccupied with its object. Thus it was considered wrong to be passive, that is, to be penetrated anally by one’s slave, but acceptable to penetrate him or her. Today the object, that is, whom a man penetrates, is important.
3.3 Classifications
The enormous variations found in human sexual behavior have always been part of the heritage of humankind, but it was not until the eighteenth century in the West that such activities were labeled and put into categories, so that attempts at a scientific classification could be made. These categories varied from culture to culture and showed a great deal of fluidity, as each culture constructed the behavior in different ways. The essential behaviors remained constant, however, and it was rather how each society considered them, whether they approved or disapproved, that created the particular sexual climate. Classifications thus offer an insight into the current thinking of each particular time. The Marquis de Sade (1740–1814), while imprisoned in the Bastille, wrote a full description of most perversions and suggested a classification about
100 years before his successors, Krafft-Ebing, Havelock Ellis, and Freud. He wrote many short stories on sexual themes including incest and homosexuality. He also foresaw the women’s rights issue, which he treated in a compassionate and modern way, yet this did not preclude his use of women as victims in his writings. His 120 Days of Sodom is perhaps his main work on perversion, although all his works contain little homilies about sex and morals. As de Sade lived in an age that was both turbulent and brutal, it says something for the humanity of the man that he implored us not to laugh at or sneer at those with deviant impulses, but rather to look upon them as one might a cripple, with pity and understanding. De Sade offers the following classification for sexuality (Gorer 1962):
(a) People with weak or repressed sexual desires: the nondescript majority, whom he saw as cannon fodder for the next two groups;
(b) Natural perverts (born so);
(c) Libertines: people who imitate group (b), but wilfully rather than innately.
De Sade’s admiration was for group (b). The 120 Days of Sodom, written while Sade was in the Bastille, contains examples of all these, and the range extends from foot fetishism, various types of voyeurism, obsessional rituals and bestiality, to frank sadism and murder. Sade described many sexual acts involved with body fluids, sperm, blood, urine, saliva, and feces. Jeremy Bentham (1748–1832), the utilitarian philosopher who believed that nature has placed humankind under the governance of two sovereign masters, pain and pleasure, published his essay on pederasty in 1785 in which he argued cogently for the decriminalization of sodomy (then punishable by hanging), and other sexual acts. He challenged the notion current at the time that such behaviors weaken men, threaten marriage, and diminish the population.
He went on to discuss these punishments as an irrational antipathy to pleasure, and he highlighted the dangers that prohibitions incur, i.e., possible blackmail and false accusations (Crompton 1978). In the nineteenth century, Darwin in his theory of evolution was able to explain the presence of gill slits and the tail present in the early human embryo as well as the appendix, as evidence of our evolutionary past. Ernst Haeckel, Darwin’s forceful adherent, developed the concept that ontogeny recapitulated phylogeny (see Evolution, Natural and Social: Philosophical Aspects). It was argued from this that if elements from our past physical evolution were so preserved, why should this not apply to our psychological history? Thus sexual perversions were deemed to reflect a return to an earlier developmental stage in phylogeny that had somehow become fixated, so that perverts, like other races and women, were at a lower point on the evolutionary ladder. Following on from Darwin’s observations of hermaphroditism in nature, Karl A. Ulrichs (whose pseudonym was Numa Numantius)
believed male homosexuals to be female souls in men’s bodies. Succeeding generations of doctors and sexologists argued these points and many offered alternative classifications of the perversions to reflect their views. These are mainly of historical value now, but perhaps we could look at Krafft-Ebing’s classification as an example. He believed that life was a struggle between instinct and civilization, and that mental illness created no new instincts, but only altered those that already existed. He believed that hereditary taint and excessive masturbation were among the causes of sexual perversion. His classification, which is self-explanatory, was as follows: Group 1: Deviation of the sex impulse. Too little sexual feeling (impotence, frigidity); too much (satyriasis and nymphomania); and sexual feeling appearing at the wrong time (in childhood or old age) or being directed wrongly (at children, the elderly, or animals). Group 2: Sexual release through other forms of activity (inflicting or receiving pain or sexual bondage). Group 3: Inverted sexuality (homosexuality, bisexuality, and transvestism). Krafft-Ebing’s Psychopathia Sexualis ran into 12 editions in his lifetime, with many revisions, and although he did write some of the more explicit details in Latin, the British Medical Journal in 1893 lamented that the entire book had not been written in Latin, ‘in the decent obscurity of a dead language.’ Like Sade before him, Krafft-Ebing spoke of ‘perverts’ as ‘stepchildren of nature,’ and asked for a more medical and sympathetic approach rather than mere legal strictures. His belief that most perversions were mainly congenital was challenged by Alfred Binet, who argued that with fetishism (a term he coined) the fetish may take many different forms, for example, an obsession with the color of the eyes, bodily odors, various types of dress, or different inanimate objects.
Heredity could not dictate this choice; rather, a chance phenomenon occurring in childhood was the more likely cause. Furthermore, on this argument, the fetishist could just as well become a sadist, masochist, or homosexual if early childhood events had been so conducive. As many individuals were exposed to such childhood events and did not develop a perversion, however, Binet had to conclude that there could well be a congenital morbid state in those who did so. Albert Moll argued further that as all biological organs and functions were subject to variations and anomalies, why should the sexual behaviors be any different? Moll further believed that the sense of smell was always an important factor in mammalian sexuality, and that the advent of clothing in human culture diminished this. Wilhelm Fliess, an ENT surgeon and colleague of Freud, drew attention to the erectile
tissues of the nose and its similarity to that of the penis and clitoris. The upright posture adopted by humans had distanced them from the ground, away from feces and smell, so diminishing the importance of olfaction in human arousal, and forming what Freud described as an abandoned erotogenous zone (Sulloway 1992). Today these mechanisms have been somewhat elaborated. A region in the nasal septum (Jacobson’s organ) is linked by nerves to the hypothalamus, a region of the brain that controls sex-hormone secretion. Small volatile chemicals known as pheromones, found in sweat around the genital and axillary regions, are known to stimulate this nasal pathway. Subtle differences in the pheromone balance depend on the individual’s genetic make-up, and it is possible that these differences give a biological basis for avoidance of attraction to one’s near relatives who exhibit similar profiles. The importance of smell in animal sexuality is well known, and it is of interest that to many fetishists it is soiled rather than clean female underwear that arouses them (Goodman 1998). Freud believed that each individual from infancy to adulthood repeated the moral development of the race, from sexual promiscuity, to perversion, and then on to heterosexual monogamy. There was a correct developmental path, during which the infant, from the stage of polymorphous perversity, went through various phases of development, and negotiated the Oedipal situation and castration complex. Freud believed that perversions arose because of arrested development leading to fixations on this pathway due to sexual trauma. Those who did not deviate when so exposed in childhood, Freud believed to have been protected by constitutional factors, namely an innate sense of propriety that had been acquired through moral evolution.
He classified as deviants those who have different sexual aims (from normal vaginal intercourse), for example, sadists and fetishists, and those who have a different love object (from the normal heterosexual partner), such as pedophiles or homosexuals (Weeks 1985) (see Psychoanalysis: Current Status).
4. Modern Times At the beginning of the twenty-first century, technology-led cultures have had a profound effect on sexual behavior, which has expanded to fill the different niches in the manner of Darwin’s ‘tangled bank.’ We may see even greater fluidity of such behaviors in the future. Just two areas will be briefly considered. 4.1 Fetishism and Fashion Today the fashion industry makes much use of fetishism in its designs. Magazines cater for a whole range of people, both gay and straight, with fetishistic and S&M interests. Relatively new phenomena have
appeared in recent times, such as ‘vogueing’ where the devotee dresses up as a facsimile body of an admired personality either of the same sex (homovestism), or of the opposite sex (the well-known transvestism). Common examples are Elvis Presley and Madonna lookalikes. Voguers compete with each other for realness. Fans who obtain objects, such as articles of clothing or signed photographs from their idols, or who in the USA search the dustbins of stars for trinkets (so-called ‘trashcanners’) and later use these in their sexual fantasies, resemble classical fetishists in their behavior (Gammon and Makinen 1994, Goodman 1998). 4.2 Technology and Sexuality Every technological invention has altered society in some way and this applies equally to the field of sexual behavior. Newspapers allow wide advertisement of sexual services, the telephone made possible obscene telephone calls, while commercial sex lines have utilized this phenomenon so that men (it is mainly men who use them) can dial and indulge in sexual talk for the price of the call. The automobile is often used as a place of sexual assignation, the home video for viewing obscene material, and the camcorder for making erotic home movies. The term ‘cybersex’ is used for people who wish to indulge in erotic fantasies through the Internet. The Internet provides a means of getting in touch with other like-minded individuals worldwide, so that fantasies, which may previously have existed only in the mind of one individual, can be exchanged, enhanced, and embellished. Arrangements have been made for such people to meet in order to carry out acts, which have included pedophilia, rape, and even murder. The perpetrator(s) may take on a different age or gender persona (‘gender bending’), to entrap the unwary, especially children. As a result of this, groups have been set up to combat these problems, for example, the Cyberangels in the UK, who monitor the net and inform the police where necessary. 
Psychologically, some individuals in tedious relationships may find that occasional looks at erotica enliven their sex lives, but others have become obsessed with and addicted to on-line sex (‘hot chatting’), and indeed have needed therapy to help them cope (Durkin and Bryant 1995). Virtual reality, which offers the possibility of taking part in a virtual sexual scenario, will further increase the scope of sexual variant behavior.
5. Recent Scientific Advances in the Understanding of the Paraphilias 5.1 The Brain The human brain reached its present size and proportions about 50,000–100,000 years ago. The brain
consists of regions which formed at different epochs in vertebrate evolution. It is the hypothalamus that is largely concerned with the hormonal control of reproduction. Certain nuclei found here in homosexual men have been claimed to be more female than male-like, and similar findings have been made in male to female transsexuals (Swaab et al. 1997). Sexual fantasy depends on the cerebral cortex, which is of such complexity that an almost infinite variety of mental responses is possible, which ensures a unique plasticity to the range of sexual behaviors. Furthermore, following head injury or the use of certain drugs, paraphilic behavior may become evident in individuals who did not show such tendencies before. Medical conditions such as temporal lobe epilepsy have been associated with fetishism in some patients. New noninvasive techniques for studying the brain, such as functional magnetic resonance imaging and positron emission tomography, are helping to elucidate its functions.
5.2 Genes and the Developing Fetus The contributions of genes and development to adult sexual behavior have been a topic of intense debate for many years. Recently, new discoveries have added fuel to this argument, and new concepts have been considered. In humans the Y-chromosome slows down the growth rate in the male fetus, which is consequently born relatively more immature compared with the female and is therefore more vulnerable. Thus males have a higher perinatal mortality and a higher incidence of accidental death, and later in life they show an increased vulnerability to cardiac disease and certain forms of cancer. Mental handicap, which includes autism and epilepsy, is more common in men than women and perhaps the preponderance of the paraphilias in males is also a reflection of this vulnerability (Ounsted and Taylor 1972). Do genes play some part in determining human sexual behaviors and orientation? Certainly in Drosophila, the fruit fly, where male and female behavior were once considered separate and exclusive, recent experimental manipulations of various genes have produced bisexual and homosexual behaviors in males. Even courtship chains, both heterosexual and homosexual (like something out of the Marquis de Sade), have been seen, behaviors that never occur in the wild (Yamamoto et al. 1996). In humans, some paraphiliac behavior does seem to run in families (Gorman 1964, Cryan et al. 1992), and suggestions of a gene occurring on the X-chromosome that predisposes to male homosexuality have been made (Hamer and Copeland 1994). Developmental factors are also of seeming importance in future behaviors and it is the presence of the two hormones, testosterone and estrogen, that are thought to influence the developing fetus. This has been shown in mice and other species. When a male fetus is placed in between two female fetuses, leakage of female hormones may feminize the male. Testosterone can masculinize the female if a female is between two males (Vom Saal and Bronson 1980). Stress applied to pregnant rats just one week before delivery resulted in homosexual and bisexual behaviors in males (Ward and Weisz 1980), and Dorner et al. (1983) thought that the stress of war in pregnant women resulted in a higher incidence of male homosexuality in the German population. Sex hormones, particularly testosterone, are known to interact with the immune system. It has been suggested that the mother’s immune system triggered by a male fetus can affect male psychosexual development, especially if she has had many pregnancies. This would further inhibit her immune system and may explain the preponderance of elder brothers in the families of homosexuals and certain pedophiles (Blanchard and Bogaert 1996) (see Homosexuality and Psychiatry).
6. The Future Freud and the sexologists considered the libido in terms of the combustion engine and electricity, as well as using biological ideas that were extant at the time. Today as knowledge in all fields converges, concepts from one area are fertilized as never before by ideas from others. Cybernetics, the study of control and communication in artificial neural networks, has been applied to biological systems such as the brain. Chaos theory, a concept derived from nonlinear dynamics and used initially to predict the weather, has been applied to numerous other areas of research which include fetal development and sexual behavior, as well as various branches of psychology (Goodman 1997). Waddington (1975) portrayed fetal development as a mountainous terrain, in which the fetus is the ball that rolls down the valleys, which depict the possible developmental pathways. These pathways represent the culmination of millions of years of evolution and are relatively resistant to change. Both genetic factors and early environmental stresses may, however, divert the fetus onto another pathway, although the system does have a degree of stability. It may well be that certain paraphiliacs and individuals with homosexual or transsexual identities have been diverted from the more common (but not necessarily more normal) path of heterosexual identity. If sexual orientation and behavior are linked to cognition, as seems to be true, then variations could have evolutionary possibilities by throwing up individuals who have the ability to think differently from their peers, offering no little advantage in the struggle for survival. The epistemological solipsism of the developing brain needs sex for its development in the world and not just for procreation (Freeman 1995). Sexual behavior and its variants therefore may merely be a reflection of this process. We should contemplate it with a sense of awe.
See also: Rape and Sexual Coercion
Bibliography
American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders: DSM-IV, 4th edn. American Psychiatric Association, Washington, DC
Blanchard R, Bogaert A F 1996 Homosexuality in men and number of older brothers. American Journal of Psychiatry 153(1): 27–31
Bullough V L 1976 Sexual Variance in Society and History. Wiley, New York
Crompton L 1978 Offences against oneself, by Jeremy Bentham, Part 1 (ca. 1785). Journal of Homosexuality 3(4): 389–405
Cryan E M J, Butcher G J, Webb M G T 1992 Obsessive-compulsive disorder and paraphilia in a monozygotic twin pair. British Journal of Psychiatry 161: 694–8
Dorner G, Schenk B, Schmiedel B, Ahrens L 1983 Stressful events in prenatal life of bi- and homosexual men. Experimental Clinical Endocrinology 81: 83–7
Durkin K F, Bryant C D 1995 “Log on to sex”: Some notes on the carnal computer and erotic cyberspace as an emerging research frontier. Deviant Behavior: An Interdisciplinary Journal 16: 179–200
Freeman W J 1995 Societies of Brains. A Study in the Neuroscience of Love and Hate. Lawrence Erlbaum, Hillsdale, NJ
Gammon L, Makinen M 1994 Female Fetishism: A New Look. Lawrence & Wishart, London
Goodman R E 1997 Understanding human sexuality: Specifically homosexuality and the paraphilias in terms of chaos theory and fetal development. Medical Hypotheses 48(3): 237–43
Goodman R E 1998 The paraphilias: An evolutionary and developmental perspective. In: Freeman H, Pullen I, Stein G, Wilkinson G (eds.) Seminars in Psychosexual Disorders. Gaskell, London, pp. 142–55
Gorer G 1963 The Life and Ideas of the Marquis de Sade. Peter Owen, London
Gorman G F 1964 Fetishism occurring in identical twins. British Journal of Psychiatry 110: 255–6
Hamer D, Copeland P 1994 The Science of Desire: The Search for the Gay Gene and the Biology of Behavior. Simon and Schuster, New York
Licht H 1969 Sexual Life in Ancient Greece. Panther Books, London
Ounsted C, Taylor D C 1972 The Y-chromosome message, a point of view. In: Ounsted C, Taylor D C (eds.) Gender Differences: Their Ontogeny and Significance. Churchill Livingstone, Edinburgh, UK
Short R V 1976 The evolution of human reproduction. Proceedings of the Royal Society of London, Biology 195: 3–24
Swaab D F, Zhou J N, Fodor M, Hofman M A 1997 Sexual differentiation of the human hypothalamus: Differences according to sex, sexual orientation and transsexuality. In: Ellis L, Ebertz L (eds.) Sexual Orientation. Toward Biological Understanding. Praeger, Westport, CT, pp. 129–50
Sulloway F J 1992 Freud. Biologist of the Mind. Harvard University Press, Cambridge, MA
Vom Saal F S, Bronson F H 1980 Sexual characteristics of adult female mice are correlated with their blood testosterone levels during prenatal development. Science 208: 597–9
Waddington C H 1975 The Evolution of an Evolutionist. Edinburgh University Press, Edinburgh, UK
Ward I L, Weisz J 1980 Maternal stress alters plasma testosterone in foetal males. Science 207: 328–9
Weeks J 1985 Sexuality and its Discontents. Routledge and Kegan Paul, London
Yamamoto D, Ito H, Fujitani K 1996 Genetic dissection of sexual orientation: Behavioral, cellular, and molecular approaches in Drosophila melanogaster. Neuroscience Research 26: 95–107
R. E. Goodman
Sexual Preference: Genetic Aspects In many animals, females preferentially mate with males that are adorned with extravagant traits like bright feathers or complicated courtship behavior. As a result, such sexual preference has led to the evolution of many elaborate male signals, for example nightingale song or peacock feathers. Many extravagant male traits are thus clearly caused by female sexual preferences, but why did female preference evolve in the first place? Several different hypotheses have been proposed to explain the evolution of female choice. Here, these hypotheses and the available empirical evidence that might help to distinguish between them are discussed.
1. Sexual Selection and the Sex Roles Males and females in most animal species differ to a large extent in their morphology and their behavior. Female birds are, for example, often drab and coy compared to their colorful and sexually active mates. In many species, males compete with other males for access to females and often use specialized horns, teeth, or other weapons, whereas their mates often do not fight with other females. The basic reason behind these sex differences is gamete size, which often differs by several orders of magnitude between the sexes. Females produce relatively large gametes, the eggs, and accordingly can only produce relatively few of them. Males, on the other hand, produce large numbers of tiny sperm cells. Due to this difference in the number of gametes available, female reproductive success usually is limited by the number of eggs produced, and multiple matings by females have at most only a small effect on female reproductive success. Male reproductive success, however, is mainly limited by the number of eggs fertilized and thus by the number of mates obtained and not so much by sperm
number (Clutton-Brock and Vincent 1991). Males therefore can benefit from attractiveness and from competition for mating partners whereas females probably benefit more by choosing the best ones among the many available mates. In accordance with this expectation, females often seem to carefully examine the available males and reject mating attempts by nonpreferred males (Andersson 1994). In peacocks, for example, hens prefer males with large colorful feathers that are distributed symmetrically, and in some other birds, females prefer males with large song repertoires. The few species where males heavily invest in offspring, e.g., some katydids with large parental investment, as well as seahorses and wading birds with exclusive paternal care, confirm the rule. In these species females usually compete for mates and males choose among females because males are limited by the number of offspring they can care for and females are limited by the number of males they can obtain to care for their offspring. These studies on sex role reversed species show that the sex difference in the benefit of additional matings after the first one is the driving force behind sexual selection. With standard sex roles, male traits that impair survival but render males especially attractive to females can evolve since male traits that are preferred by females will cause an increased mating frequency. Due to the effect of mating frequency on male reproductive success, these traits can lead to an increased lifetime mating success even when male survival decreases. The splendid feathers of male peacocks for example probably decrease male survival because this shiny ornament will also attract predators and make escape less efficient. Less adorned males may live longer. However, when they leave, on average, fewer offspring than attractive males, genes for being less adorned will go extinct.
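The trade-off between survival and mating frequency described above can be illustrated with a small calculation. The sketch below is purely illustrative: the function name and every number in it are hypothetical assumptions, not values from the studies cited. It treats expected lifetime offspring number as the product of the probability of surviving to breed, the number of matings obtained, and the offspring produced per mating.

```python
def expected_lifetime_offspring(survival_prob, matings_if_alive, offspring_per_mating):
    """Expected offspring = P(survive to breed) * matings * offspring per mating."""
    return survival_prob * matings_if_alive * offspring_per_mating


# Hypothetical values: the adorned male survives less often but, if he
# survives, is chosen by more females than the plain male.
adorned = expected_lifetime_offspring(survival_prob=0.5, matings_if_alive=4.0,
                                      offspring_per_mating=3.0)
plain = expected_lifetime_offspring(survival_prob=0.8, matings_if_alive=1.5,
                                    offspring_per_mating=3.0)

print("adorned:", adorned, "plain:", plain)
print("ornament favored:", adorned > plain)
```

With these assumed numbers the ornament is favored despite its survival cost; if the mating advantage were smaller than the survival penalty, the comparison would reverse.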
Compared to the evolution of male traits under the influence of female choice, the evolution of female preference is less easy to explain. In the next section the possible routes for the evolution of female preference are described.
2. Theoretical Models for the Evolution of Preferences Several hypotheses have been presented to explain the evolution of female preference that causes females to choose their mating partners among the available males (see Kirkpatrick and Ryan 1991, Maynard Smith 1991, for reviews). The simplest possibility would be that males differ in the amount of resources provided or in their fertilization rate. According to this hypothesis, choosy females increase the number of offspring produced and they thus benefit directly from their choice. In some birds, for example, those males are preferred as mates that provide superior territories and outstanding paternal care. Another possible explanation for the evolution of female preferences is
that choice has an influence on the genetic quality of the offspring. According to this hypothesis females benefit indirectly when offspring of preferred males have superior viability or increased mating success compared to the offspring of nonpreferred males. And finally, female preference might have evolved in a context other than sexual selection. Female birds might, for example, prefer males with blue feathers because it is adaptive to react positively to blue when blue berries constitute a valuable resource, so that female sensory physiology became tuned to this color. This model has accordingly been termed the sensory exploitation hypothesis, meaning that males exploit female sensory physiology to attain attractiveness (Ryan and Keddy-Hector 1992). For all of these models, genetic variation in preference is essential since evolution of all traits rests on the existence of genetic variance. Despite the importance of such data, not very many species have been examined for genetic variance in female preference. In most of the examined species, ranging from insects to mice, within-population variation in female preference seems generally to be influenced by additive genetic variance (Bakker and Pomiankowski 1995). Such genetic variance in preference means that this trait can easily evolve whenever there is a selective advantage to a specific preference. In the following paragraph the different hypotheses for the evolution of female preferences will be discussed in more detail, with a focus on the genetic aspects of these models. When females benefit directly from their choice by increased lifetime fecundity, it is easy to see that any new choice will evolve as long as the extra benefit gained is larger than the extra cost paid for female choice. For example, females might prefer healthy mates because this might reduce the risk of infection during mating.
Choosing males with intact and shiny feathers might thus reduce the risk of ectoparasite transfer during copulation. For models that rest on indirect benefits of female preference, females only gain from choice regarding the genetic quality of their offspring. For these indirect benefit models of female preference, the male trait also needs to have genetic variance. One of these hypotheses, Fisher’s arbitrary sexual selection model, predicts that female choice should evolve in response to the evolution of male traits. If females prefer males with a specific trait size, a linkage disequilibrium (a nonrandom association of genes) will build up, since the genes for the preference and the genes for the male trait will co-occur together more frequently than by chance. The existence of choosy females will, in addition, cause increased reproductive success of these preferred males. Since these males also carry a disproportional share of the preference allele, due to the linkage disequilibrium that is caused by the preference, the preference evolves in response to the evolution of the male trait (Kirkpatrick and Ryan 1991). If sexual selection is strong and if preference and male trait are heritable, both the
male trait and the female preference are predicted to evolve by a positive feedback that eventually might lead to runaway selection. The distinctive and sufficient genetic condition for Fisher’s arbitrary sexual selection model thus is a genetic correlation between female preference and male traits. According to the good genes hypothesis, another hypothesis that suggests indirect benefits of female preferences, choosy females benefit not only by producing preferred sons but also by producing offspring with superior viability. For such a process to work, male attractiveness needs to indicate male viability. Speaking in genetic terminology, there has to be a genetic correlation between male signaling traits and viability. Since a genetic correlation between female preference and male signaling traits will also build up as a result under the good genes model, such a correlation thus cannot be taken as evidence for the arbitrary sexual selection model but it means that one of the indirect benefit models helps to maintain the preference. To provide evidence that the sensory exploitation model helps to explain the evolution of female preference, one has to show that female preference evolved before the male trait and thus independently of mate choice. The usual way is to show that female preference is more ancestral in phylogeny than the male trait. One of the best studied examples of this hypothesis is the túngara frog, Physalaemus pustulosus, where females have a preference for a male acoustic signal that is not produced by the males, but males of a closely related species do produce it. The suggested explanation for this pattern is that female choice evolved first in the ancestor of both species and the male trait evolved later in only one of those species. However, the loss of the attractive male signal during phylogeny cannot be excluded as an alternative explanation.
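The build-up of linkage disequilibrium under Fisher’s model, described above, can be sketched numerically. The following is a minimal illustration under stated assumptions, not the formulation used in the works cited: it assumes a haploid population with one preference locus (alleles P1, P2) and one male trait locus (alleles T1, T2), no viability differences, and females carrying P2 that are `a` times more likely to accept a T2 male than a T1 male. The function names and parameter values are hypothetical.

```python
def next_generation(x, a, r):
    """One generation of a haploid two-locus mate-choice model (a sketch).

    x maps haplotypes (preference allele, trait allele) to frequencies,
    with alleles coded 1 or 2. P2 females accept T2 males `a` times as
    readily as T1 males; P1 females mate at random. r is the recombination
    rate between the two loci.
    """
    t2 = sum(f for (p, t), f in x.items() if t == 2)  # frequency of T2 males
    new = {h: 0.0 for h in x}
    for (pf, tf), freq_f in x.items():          # mother's haplotype
        # Normalization so that every female mates exactly once.
        norm = 1.0 + (a - 1.0) * t2 if pf == 2 else 1.0
        for (pm, tm), freq_m in x.items():      # father's haplotype
            weight = a if (pf == 2 and tm == 2) else 1.0
            pair = freq_f * freq_m * weight / norm
            # Non-recombinant offspring copy one parental haplotype.
            new[(pf, tf)] += pair * 0.5 * (1.0 - r)
            new[(pm, tm)] += pair * 0.5 * (1.0 - r)
            # Recombinant offspring combine alleles from both parents.
            new[(pf, tm)] += pair * 0.5 * r
            new[(pm, tf)] += pair * 0.5 * r
    return new


def summarize(x):
    """Return (frequency of P2, frequency of T2, linkage disequilibrium D)."""
    p2 = x[(2, 1)] + x[(2, 2)]
    t2 = x[(1, 2)] + x[(2, 2)]
    d = x[(2, 2)] - p2 * t2
    return p2, t2, d


# Hypothetical start: P2 at 0.5, T2 at 0.1, and no association (D = 0).
start = {(1, 1): 0.45, (1, 2): 0.05, (2, 1): 0.45, (2, 2): 0.05}
state = dict(start)
for _ in range(20):
    state = next_generation(state, a=3.0, r=0.5)

p2_end, t2_end, d_end = summarize(state)
print("P2:", p2_end, "T2:", t2_end, "D:", d_end)
```

In this sketch the preferred trait allele increases because T2 males obtain more matings, a positive association D between P2 and T2 arises from nonrandom mating alone, and the preference allele then increases by hitchhiking with the trait, even though the preference itself confers no direct advantage.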
3. Testing Genetic Aspects of Sexual Selection Models The good genes hypothesis predicts that females benefit from choosing attractive males because these males will produce offspring with superior viability. The critical problem for this hypothesis is that attractive males need to have higher viability. Why should such a correlation between attractiveness and viability exist? The most prominent explanation is the following: Attractive signals are costly and only males with superior viability are able to afford these costs so that only those males can attain high attractiveness. This view of condition-dependent signaling is supported by empirical data showing that in various insects, spiders, and frogs the production of attractive courtship signals consumes more energy than the production of less attractive signals. Also, males that suffer from disease or are starving are usually not as attractive as healthy competitors. Such a process of condition-dependent signaling leads to a genetic correlation between male signaling traits and male viability when there is genetic variation for both traits. If females prefer males with signaling traits that are correlated with viability, female choice can evolve since it will become associated with high viability and a rare female choice allele can thus be expected to increase in frequency when the costs of female choice are low. If females choose among males on the basis of a male trait, a genetic correlation will build up between male trait and female preference at the same time. Since this prediction is identical to the critical and sufficient condition for the arbitrary sexual selection model, experimental separation of these two nonexclusive hypotheses is difficult. The most frequent method to examine whether the good genes hypothesis contributes to maintain female preference is to compare the viability of the offspring of preferred males with the viability of the offspring of average or nonpreferred males. In some studies using this method, female choice seems to have an astonishingly large effect on offspring viability; in other studies, no significant effect was observed (see Møller and Alatalo 1999 for a review). Theoretical models show that these indirect sexual selection benefits are unlikely to increase offspring fitness by more than a few percent (Kirkpatrick 1996). Since direct benefits are suggested to incur larger benefits, evolution of female preferences is predicted to be dominated by direct benefits if these can be obtained by female choice. However, even a small indirect benefit to sexual preferences may be large enough for quick evolution when choice is not very costly. With an exhaustive literature review Møller and Alatalo (1999) showed that a significant positive correlation coefficient between male viability and attractiveness exists on average and they therefore proposed that females can gain a few percent regarding the viability of their offspring when they carefully choose among the potential mates.
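The statement that a benefit of only a few percent can still drive the spread of a rare choice allele when choice is cheap can be illustrated with a deliberately simplified calculation. The sketch below assumes a haploid one-locus model in which carriers of the choice allele have relative fitness 1 + benefit − cost; this is a heuristic stand-in, since indirect benefits in reality act through genetic correlations and are weaker than direct selection of the same size. The function name and all parameter values are hypothetical.

```python
def choice_allele_trajectory(p0, benefit, cost, generations):
    """Frequency of a choice allele under simple haploid selection.

    Carriers have relative fitness 1 + benefit - cost; non-carriers have 1.
    Returns the list of frequencies from generation 0 onward.
    """
    s = benefit - cost            # net selective advantage of choosing
    p = p0
    trajectory = [p]
    for _ in range(generations):
        mean_fitness = 1.0 + p * s
        p = p * (1.0 + s) / mean_fitness
        trajectory.append(p)
    return trajectory


# Hypothetical values: a 2% viability benefit with a cheap or an expensive choice.
spreading = choice_allele_trajectory(0.01, benefit=0.02, cost=0.005, generations=1000)
declining = choice_allele_trajectory(0.01, benefit=0.02, cost=0.03, generations=1000)

print("cheap choice, final frequency:", spreading[-1])
print("costly choice, final frequency:", declining[-1])
```

Under these assumptions the allele spreads whenever the benefit exceeds the cost and is lost otherwise, which is why the size of the cost of choosing matters as much as the size of the benefit.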
In general, the evolution and maintenance of sexual preference seems to be due to various factors. Empirical evidence exists in favor of each of the most prominent hypotheses—direct benefits, sensory exploitation, Fisher’s arbitrary sexual selection and good gene models. These hypotheses are not mutually exclusive and some of them might work synergistically. In the future, more studies examining the quantitative genetics of sexually selected traits are necessary to evaluate the importance of the different hypotheses (Bakker 1999).
4. The Maintenance of Genetic Variation in Male Traits 4.1 Theoretical Expectation Both indirect sexual selection models depend on the existence of genetic variance in male traits and will in
turn also cause strong selection on these traits. For theoretical reasons it has often been argued that genetic variance of traits under strong selection will usually be smaller than the genetic variance of traits that are less strongly selected (Fisher 1930). This difference occurs because under strong selection, all but the most beneficial alleles will disappear quickly from a population. In line with these arguments, life history traits that are closely connected to fitness have a lower additive genetic variance than morphological traits that are believed to be under weaker selection. If strong selection depleted genetic variance in male attractiveness or viability, the benefit of female preference would decrease and, if there are costs to female choice, females would accordingly benefit from refraining from choice and saving these costs. However, females seem to choose strongly among males in most animal species (Andersson 1994). Furthermore, female choice and the effects of sexual selection on male traits are most obvious in lekking species, where males court females at courtship territories and do not provide any resources for the females. In these cases, direct benefits are unlikely and it is not easy to see why sensory exploitation should be more frequent in these species. The indirect benefit models are thus the only ones that are likely to explain why females strongly select among the available males in lekking species and elsewhere when males do not provide resources. However, when females exert sexual selection on male viability or attractiveness, the genetic variance in male quality should decrease and the benefit of the preference will also decrease in response. The important question thus is whether male genetic variance is large enough to counterbalance the costs of being choosy.
4.2 Empirical Evidence for Genetic Variance among Males
There is ample evidence for genetic variance in male signaling traits and in attractiveness: crosses between populations that differ in the male signaling trait show that the difference is inherited; heritability estimates from father-son comparisons show that genetic variation exists even within populations; and artificial selection experiments have shown that sexually selected male traits can evolve quickly. Despite strong sexual selection on male signaling traits, sexually selected traits generally seem to have heritabilities that are as large as the values for traits assumed to be under only weak selection (Pomiankowski and Møller 1995). The deviation from the theoretical expectation is even more impressive when one compares the additive genetic variance (another measure of genetic variance, one that does not depend on the extent of environmental influence on the trait under consideration) between sexually selected traits and traits assumed not to be under sexual selection. Sexually selected traits have significantly larger additive genetic variance than other traits, showing that the existing genetic variation is sufficient for both indirect sexual selection models.
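To make the father-son comparison concrete, the following is a minimal sketch of how a heritability estimate can be derived from such data. It relies on the standard quantitative-genetic result that the slope of offspring values regressed on the values of a single parent estimates half the narrow-sense heritability, so that h² is roughly twice the slope. The function names and the four data pairs are hypothetical and chosen only for illustration; real estimates would also need to account for shared environments and measurement error.

```python
def regression_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def heritability_from_father_son(fathers, sons):
    """Estimate h^2 as twice the son-on-father regression slope.

    Assumes a single-parent regression with no shared-environment effects.
    """
    return 2.0 * regression_slope(fathers, sons)


# Hypothetical trait values (arbitrary units) for four father-son pairs.
fathers = [10.0, 12.0, 14.0, 16.0]
sons = [12.5, 13.0, 13.5, 14.0]

h2 = heritability_from_father_son(fathers, sons)
print(round(h2, 3))  # 0.5
```

In this invented data set the sons' values rise by 0.25 units per unit increase in the fathers' values, which corresponds to an estimated heritability of 0.5.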
4.3 Possible Reasons for the Maintenance of Genetic Variance
The extent of genetic variation present in natural populations thus seems to contradict the theoretical expectations. It is, therefore, important to understand how extensive genetic variance can be maintained in the face of strong sexual selection. Among the hypotheses put forward to explain these data, three proposed causes for the maintenance of genetic variance are discussed here in some detail because they have received considerable attention: capture of genetic variance in condition, host-parasite coevolution, and meiotic drive. Male signals are likely to indicate male viability honestly only when attractive signals are costly, because otherwise males with low viability would also be able to produce them. When signals are costly, only males in good condition will be able to produce attractive signals (Zahavi 1975). As a result, male signal quality will indicate a male’s condition. Since male condition is assumed to possess large genetic variance because many traits influence condition, male signals will capture genetic variance in condition and can in this way maintain the genetic variance in male viability necessary for the good genes model (Rowe and Houle 1996). Another hypothesis suggests that host-parasite coevolution is important in maintaining genetic variance in male signaling traits (Hamilton and Zuk 1982, Westneat and Birkhead 1998). Assume there are male traits that indicate that a male is free of parasites or resistant to them. If such resistance is heritable, females would clearly benefit from choosing those males, and female preference could accordingly evolve. With increasing resistance, selection on the parasite would lead to new types of parasites that in turn would cause selection for new resistance.
This arms race can in principle lead to a persistent and large advantage to female preference for resistant males: genetic variance in males is maintained because the superior male genotypes change with time. There is some indication that such a process also works in humans, where scents from potential partners with dissimilar MHC genes (an immunologically important group of genes) are preferred (Wedekind and Furi 1997). Preference of females for males resistant to the action of meiotic drive has also recently been proposed as a scenario that allows persistent benefits of female choice. In stalk-eyed flies, females prefer males with longer eyestalks, and lines selected for longer eyestalks show increased resistance to meiotic drive (Wilkinson et al. 1998). It was, therefore, suggested that females benefit from choosing males with large eyestalks because resistance to meiotic drive is more frequent in these males. Theoretical simulations, however, revealed that the predicted process cannot occur, but they showed that the avoidance of males possessing meiotic drive does allow persistent benefits for female preference (Reinhold et al. 1999).
5. Genetics of Sexually Selected Traits: X-chromosomal Bias
Based on reciprocal crosses between two Drosophila species, Ewing (1969) long ago proposed that a disproportionate share of the genes that influence traits important for mate recognition reside on the X-chromosome. Recently, two reviews revealed that X-chromosomal genes do have a disproportionate effect on sex and reproduction related traits in humans and on sexually selected traits in animals. Using large molecular databases, Saifi and Chandra (1999) compared the linkage of mutations influencing traits related to sex and reproduction with that of mutations influencing all other traits. Their analysis shows that genes influencing traits related to sex and reproduction are several times more likely to be linked to the X-chromosome than genes influencing other traits. With a different method using literature data on reciprocal crosses, Reinhold (1998) found that the influence of X-chromosomal genes is much stronger for traits that are likely to be under sexual selection than for other traits. On average, about one third of the difference between the two parental lines used for the reciprocal crosses was caused by X-chromosomal genes when sexually selected traits were considered. For traits classified as not under sexual selection, this value was much smaller (on average two percent of the difference was due to X-linked genes) and was not significantly different from zero. Such a bias towards the X-chromosome can be expected for traits that are influenced by sexually antagonistic selection (Rice 1984) and for sex-limited traits that are under fluctuating selection (Reinhold 1999). Antagonistic selection occurs if the optimal phenotype differs for males and females and if a genetic correlation between the sexes prevents the phenotypes from reaching their evolutionarily stable optimum.
Under sexually antagonistic selection, sex-linked traits can be expected to evolve faster than other traits because sex-linked traits almost always differ in their expression in the two sexes. Rare recessive X-chromosomal genes, for example, will always be expressed when they occur in the heterogametic sex (the sex that has two different sex chromosomes; in humans the males, who possess an X as well as a Y-chromosome) and will be expressed hardly at all in the homogametic sex. This difference in expression provides the raw material on which selection can work, so that preferential X-linkage can be expected for traits under sexually antagonistic selection. Under fluctuating selection the fate of an allele is influenced by its geometric mean fitness, a fitness measure that is equivalent to the effective interest rate when the interest rate on a financial investment fluctuates in time. This fitness measure is influenced by the extent of expression of a trait. Autosomal and X-chromosomal genes differ in the extent of their expression: (a) X-chromosomal genes coding for sex-limited male traits in heterogametic males are expressed in only one third of their copies, because the other two thirds of all X-chromosomes are present in females, which do not express the genes under consideration; (b) autosomal sex-limited genes, i.e., genes that are expressed in only one sex and reside not on the sex chromosomes but on any of the other chromosomes, are, in contrast, expressed in one half of their copies provided they are not totally recessive. Due to this difference in expression, autosomal genes coding for the same phenotype as X-chromosomal genes have a disadvantage compared to X-linked genes. As a consequence, X-chromosomal genes coding for sex-limited traits are expected to evolve more easily than autosomal genes (Reinhold 1999). The observed X-bias for sexually selected traits can accordingly be explained by the effect of fluctuating selection on these sex-limited traits.
See also: Genetics and Mate Choice; Sex Hormones and their Brain Receptors; Sexual Attitudes and Behavior; Sexual Orientation: Biological Influences; Sexual Orientation: Historical and Social Construction; Sexuality and Gender; Y-chromosomes and Evolution
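As an illustration of the geometric mean fitness invoked in Section 5, the sketch below compares two hypothetical alleles whose fitness fluctuates between two kinds of generations. Both have the same arithmetic mean fitness, but the allele whose fitness fluctuates less has the higher geometric mean, which is the quantity that governs long-term growth under fluctuating selection (just as the effective interest rate governs the growth of an investment). The fitness values and variable names are invented for illustration and are not taken from Reinhold (1999); treating weaker expression as weaker fluctuation is a simplifying assumption of this sketch.

```python
import math


def arithmetic_mean(fitnesses):
    """Ordinary average of per-generation fitness values."""
    return sum(fitnesses) / len(fitnesses)


def geometric_mean_fitness(fitnesses):
    """Geometric mean, computed as the exponential of the mean log fitness."""
    return math.exp(sum(math.log(w) for w in fitnesses) / len(fitnesses))


# Hypothetical per-generation fitness in a "good" and a "bad" generation.
high_fluctuation = [1.5, 0.5]    # assumed: effect exposed to selection more often
low_fluctuation = [1.25, 0.75]   # assumed: effect exposed to selection less often

for name, ws in [("high", high_fluctuation), ("low", low_fluctuation)]:
    print(name, arithmetic_mean(ws), round(geometric_mean_fitness(ws), 4))
# high 1.0 0.866
# low 1.0 0.9682
```

Under these assumptions the allele with the smaller fluctuation would increase relative to the other in the long run, even though a single-generation average would not distinguish the two.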
Bibliography
Andersson M 1994 Sexual Selection. Princeton University Press, Princeton, NJ
Bakker T C M 1999 The study of intersexual selection using quantitative genetics. Behaviour 136: 1237–65
Bakker T C M, Pomiankowski A 1995 The genetic basis of female mate preferences. Journal of Evolutionary Biology 8: 129–71
Clutton-Brock T H, Vincent A C J 1991 Sexual selection and the potential reproductive rates of males and females. Nature 351: 58–60
Ewing A W 1969 The genetic basis of sound production in Drosophila pseudoobscura and D. persimilis. Animal Behaviour 17: 555–60
Fisher R A 1930 The Genetical Theory of Natural Selection. Clarendon Press, Oxford, UK
Hamilton W D, Zuk M 1982 Heritable true fitness and bright birds: A role for parasites. Science 218: 384–7
Kirkpatrick M 1996 Good genes and direct selection in the evolution of mating preferences. Evolution 50: 2125–40
Kirkpatrick M, Ryan M J 1991 The evolution of mating preferences and the paradox of the lek. Nature 350: 33–8
Møller A P, Alatalo R V 1999 Good-genes effects in sexual selection. Proceedings of the Royal Society of London Series B 266: 85–91
Pomiankowski A, Møller A P 1995 A resolution of the lek paradox. Proceedings of the Royal Society of London Series B 260: 21–9
Reinhold K 1998 Sex linkage among genes controlling sexually selected traits. Behavioural Ecology and Sociobiology 44: 1–7
Reinhold K 1999 Evolutionary genetics of sex-limited traits under fluctuating selection. Journal of Evolutionary Biology 12: 897–902
Reinhold K, Engqvist L, Misof B, Kurtz J 1999 Meiotic drive and evolution of female choice. Proceedings of the Royal Society of London Series B 266: 1341–5
Rice W R 1984 Sex chromosomes and the evolution of sexual dimorphism. Evolution 38: 735–42
Rowe L, Houle D 1996 The lek paradox and the capture of genetic variance by condition dependent traits. Proceedings of the Royal Society of London Series B 263: 1415–21
Ryan M J, Keddy-Hector A 1992 Directional patterns of female mate choice and the role of sensory biases. The American Naturalist 139: S4–S35
Saifi G M, Chandra H S 1999 An apparent excess of sex- and reproduction-related genes on the human X chromosome. Proceedings of the Royal Society of London Series B 266: 203–9
Smith J M 1991 Theories of sexual selection. Trends in Ecology and Evolution 6: 146–51
Wedekind C, Furi S 1997 Body odour preferences in men and women: Do they aim for specific MHC combinations or simply heterozygosity? Proceedings of the Royal Society of London Series B 264: 1471–9
Westneat D F, Birkhead T R 1998 Alternative hypotheses linking the immune system and mate choice for good genes. Proceedings of the Royal Society of London Series B 265: 1065–73
Wilkinson G S, Presgraves D C, Crymes L 1998 Male eye span in stalk-eyed flies indicates genetic quality by meiotic drive suppression. Nature 391: 276–9
Zahavi A 1975 Mate selection - a selection for a handicap. Journal of Theoretical Biology 53: 205–14
K. Reinhold
Sexual Risk Behaviors
By the mid-1980s, the menace of AIDS had made indiscriminate sexual behavior a serious risk to health. Under certain conditions sexual behavior can threaten health and psychosocial well-being. Unwanted pregnancies and sexually transmitted diseases (STDs) such as gonorrhea and syphilis have long been potential negative consequences of sexual contact. Yet today HIV infection receives the most attention. For this reason, the article focuses on the risk of HIV infection through sexual contact; HIV infection through needle sharing, maternal-child transmission, and transfusion of blood and blood products is not addressed. The article is divided into five sections. Section 1 defines sexual risk behavior and delimits the scope. Section 2 discusses the research methodology in this field. Section 3 presents epidemiological data on sexual behavior patterns and HIV infection status among relevant population groups. Section 4 examines the determinants of risk behavior. Section 5 concludes with a discussion of strategies for reducing risk exposure.
1. Definitions
Sexuality is an innate aspect of humanity and is oriented toward sensory pleasure and instinctual satisfaction. It goes beyond pure reproduction and constitutes an important part of sensual interaction. Sexual behavior embodies the tension between biological determination, societal and cultural norms, and an individual’s personal life choices. Sexuality and sexual behavior are subject to constant change. Sexual mores and practices differ widely both within and between cultures, and also from epoch to epoch. One defining feature of sexuality is, therefore, the diversity of sexual conduct and the attitudes surrounding it. Sexual risk behavior can be described as sexual behavior or actions that jeopardize the individual’s physical or social health. High-risk sexual practices, such as unprotected intercourse with infected individuals, constitute unsafe sexual conduct and therefore deserve to be classified as a health risk behavior or behavioral pathogen (Matarazzo et al. 1984) on a par with smoking, lack of exercise, unhealthy diet, and excessive consumption of alcohol. Among the STDs, which include gonorrhea, syphilis, genital herpes, condylomata, and hepatitis B among others, HIV infection and AIDS are by far the most dangerous. AIDS has linked love and sexuality with disease and death. If an HIV-infected person fails to inform a partner of his or her serostatus (positive HIV antibody test), unprotected sexual intercourse resulting in the partner’s infection also has legal consequences. In general, unprotected sexual intercourse (i.e., without condoms) with an HIV-infected partner carries a risk of infection. HIV may be acquired asymptomatically and transmitted unknowingly. There is a ‘hierarchy’ in the level of risk involved in sexual practices.
High-risk sexual practices include unprotected anal intercourse (especially for the receptive partner) and unprotected vaginal intercourse (again, more for the receptive partner) with infected individuals, as well as any other practices resulting in the entry or exchange of sexual fluids or blood among the partners. Oral sex, on the other hand, carries only a low risk (see Sexually Transmitted Diseases: Psychosocial Aspects). Petting is not considered sexual risk behavior. The risk of sexual behavior is a function of the partner’s infection status, the sexual practices employed, and the protective measures used. Sex with a large number of sexual partners is also risky because of the higher probability of coming into contact with an infected partner. The risk can be minimized or ruled out entirely by appropriate protective measures: low-risk sexual practices, use of condoms, and avoidance of sexual contact with HIV positive partners.
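The statement that a larger number of partners raises the probability of contact with an infected partner can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that each partner is drawn independently from a population with a fixed prevalence of infection; it describes the chance of encountering at least one infected partner, not the chance of transmission, which also depends on the sexual practices and protective measures used. The function name and the prevalence value are hypothetical.

```python
def prob_at_least_one_infected(prevalence, n_partners):
    """Probability that at least one of n independently chosen partners is infected.

    Assumes every partner has the same, independent probability `prevalence`
    of being infected; real partner choice is not random in this way.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if n_partners < 0:
        raise ValueError("n_partners must be non-negative")
    return 1.0 - (1.0 - prevalence) ** n_partners


# Hypothetical prevalence of 5 percent, chosen only to show the shape of the curve.
for n in (1, 5, 10, 20):
    print(n, round(prob_at_least_one_infected(0.05, n), 3))
```

Under these assumptions the probability grows quickly with the number of partners but never reaches certainty, which is why the protective measures listed above matter at every level of partner number.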
Sexual risk behavior also applies to the area of family planning. Unwanted pregnancies may occur if the sexual partners do not use safe methods of contraception. If the partners do not wish to have children, sexual contact risks an undesired outcome. Unwanted pregnancy can have consequences ranging from changes in life plans and in the partnership to decisions to abort, with the ensuing emotional stress and medical risks. Deviant sexual behavior such as exhibitionism or pedophilia cannot be defined as risk behavior. The analysis of sexual risk behavior falls in the domains of psychology, sociology, medicine, and the public health sciences (Bancroft 1997, McConaghy 1993). Research is available in the areas of health psychology, health sociology, and social science AIDS research (von Campenhourdt et al. 1997; see the journal AIDS Care; see HIV Risk Interventions). Studies more often use the terms ‘HIV risk behavior,’ ‘HIV protective behavior,’ ‘AIDS-preventive behavior,’ ‘HIV-related preventive behavior,’ ‘safer sexual behavior,’ ‘safe sex,’ or ‘safer sex’ than ‘sexual risk behavior’ (DiClemente and Peterson 1994, Oskamp and Thompson 1996). These descriptions ought to be given preference over the term ‘sexual risk behavior’ in order to avoid pathologizing certain sexual practices; it is not the sexual practice as such that presents the risk, but the partner’s infection status. Although the term ‘sexual risk behavior’ continues to be used here, this issue ought to be kept in mind.
2. Research on Sexual Risk Behavior
Assessment of sexual risk behavior is complicated by problems in research methodology. Fundamentally, research in sexology faces the same methodological problems as other social sciences. Studies of sexual behavior require special survey instruments and must take into account factors influencing data acquisition and results (see Bancroft 1997). The most important methods of data collection include questionnaires, personal interviews, telephone surveys, and self-monitoring. None of these methods is clearly superior to the others. The quality of retrospective data depends on the selected period, the memory capacity of study participants, the frequency of the behavior, and whether the behavior is typical. In general, research on sexuality has no adequate external criteria against which to validate reporting. Psychophysiological data exist only to a limited extent, and field experiments and participant observation are usually not applicable. Since the different survey methods and instruments have different advantages and disadvantages, a combination of methods would be appropriate. However, this is often not feasible due to practical constraints, including limited resources and restricted access to target groups. When conducting scientific surveys on sexual behavior, it is important to inform and instruct study participants about the purpose of the investigation. Questions must be phrased in a neutral and unprejudiced manner. For sensitive issues, such as taboo topics and unusual sexual practices, the phrasing of questions and their position in the questionnaire are critical. When surveying populations with group-specific language codes (e.g., minorities that have a linguistic subculture), it is necessary to discuss sexual terms with the interviewees beforehand. The willingness to participate in the study and to answer survey questions can be influenced by the fear of reprisals if the answers become public. Depending on the objectives of the study and the study sample, relative anonymity (e.g., via telephone interview) can play an important role in the willingness to participate and the openness of responses. Whatever the survey instrument, exaggerations, understatements, and socially desirable answers are to be expected, particularly to questions about sexual risk behavior. Study participants report their own behavior based on role expectations and their self-image. Thus, it is important to assess and control for relevant motivational and dispositional variables, such as social desirability, the tendency for self-disclosure, and attitudes toward sexuality. The person of the researcher, in particular his or her sex, age, and sexual orientation, has an important influence on response behavior. For example, for a heterosexual study sample it may be advisable that male participants be questioned by a male interviewer and female participants by a female interviewer. A heterosexual interviewer must be prepared for the possibility of not being accepted by homosexual participants. This applies in particular when questions address highly intimate sexual experiences and sexual risk behavior. Generalizability is one of the most serious problems when reporting data from scientific surveys on sexuality.
Access to the study sample and the willingness to participate are more closely linked to relevant personal characteristics of the participants (sexual experiences, attitude toward sexuality) than in other research. Participants tend to show, for example, greater self-disclosure, a broader repertoire of sexual activities, less guilt, and less anxiety than nonparticipants. Since randomized or quota selection is rarely possible, an exact description of the sampling frame is essential to permit an estimate of the representativeness of the sample and the generalizability of the results. Study findings are often based on reports by individuals who are, to a substantial degree, self-selected. This must be taken into consideration when evaluating the findings of studies on sexual behavior.
3. Epidemiology
Data on the epidemiology of sexual risk behavior can be acquired from different sources. Numbers can be obtained through studies on sexual risk behavior that focus on the behavior itself. As mentioned above, methodological problems often plague such studies. Even if rates of condom use and numbers of sexual partners are assessed correctly, they may not provide the specific information needed to describe the epidemiology of sexual risk behavior. The finding that about 50 percent of heterosexual men do not regularly use condoms is only relevant if they do not use condoms in sexual encounters with potentially infected partners. This information, however, is rarely available from existing studies. Numbers can also be extrapolated from incidence and prevalence rates of sexually transmitted diseases or unwanted pregnancies. These data focus on consequences rather than on the risk behavior itself, and extrapolation to the prevalence of risk behaviors is complicated by the fact that it is sometimes difficult to determine how an infection was transmitted (sexual contact, transfusion, or needle sharing). In addition, reporting may be incomplete or inaccurate for the following reasons: (a) not all countries require compulsory registration of HIV infections, (b) cases are often recorded anonymously, which may result in multiple registrations of the same case, and (c) there may be a high number of unreported cases that do not find their way into the statistics. Despite these methodological problems, epidemiological data are necessary and useful for behavioral research. The available epidemiological data on sexual risk behavior from European and US sources are presented below for five subgroups: adolescents, ethnic groups, homosexuals, heterosexuals, and prostitutes. These subgroups are not exhaustive, but they illustrate the differences and specificity of epidemiological data on sexual risk behavior (see Sexually Transmitted Diseases: Psychosocial Aspects).
3.1 Adolescents
Adolescents constitute a special group with regard to sexual risk behavior. Their initiation into sexuality may shape their sexual behavior for many years to come.
They have as yet no rigid sexual behavior patterns and are subject to influences from their peers, parents, the media, and other sources. Compared to the 1970s, teenagers are sexually active at a younger age. Many teenagers (about 25 percent of males and 40 percent of females) have had sexual intercourse by the time they reach 15 years of age, and the mean age of first sexual intercourse is between 16 and 17 years. Because adolescents are sexually active earlier in their lives, they also engage in sexual risk behavior at an earlier age. The majority have several short-term sexual relationships, and by the end of their teens about half report having had more than four partners. The majority of adolescents report being exclusively heterosexual, but an increasing number of teenage males report being bisexual or homosexual. Data on infection rates show that sexually transmitted diseases are widespread among adolescents.
The human papilloma virus (HPV) is likely to be the most common STD among adolescents, with a prevalence of 28–46 percent among women under the age of 25 in the US (Centers for Disease Control and Prevention 2000). The prevalence of HIV infection is increasing slightly among adolescents, but accurate data are difficult to obtain. Because of the long incubation period of HIV, those with AIDS in their twenties probably contracted the virus as adolescents, and such cases are on the rise. Teenage pregnancy is still an issue, although numbers generally have decreased. The numbers are still high, especially in the US, where 11 percent of all females between ages 15 and 19 become pregnant (Adler and Rosengard 1996). Data on behavior indicate that about 75 percent of adolescents use contraception. Adolescents in steady relationships predominantly use the pill, while adolescents with casual sexual encounters mainly use condoms. Most adolescents are aware of the risk of pregnancy when sexually active, and they use condoms for the purpose of contraception rather than for protection against STDs. Most of them have sufficient knowledge about HIV infection and AIDS, although erroneous assumptions still prevail. Personal vulnerability to AIDS is perceived to be fairly low, and awareness of other sexually transmitted diseases is practically nonexistent. Alcohol and drug use, common among adolescents, further influences their sexual behavior, including by reducing the frequency of condom use.
3.2 Ethnic Groups
There are distinct differences in HIV infection among different ethnic groups. Data on infection rates are available mainly from studies conducted in the US. These show that African-Americans face the highest risk of contracting HIV. Although African-Americans make up about 12 percent of the US population, they account for 57 percent of HIV diagnoses and 45 percent of AIDS cases. Almost two-thirds of all reported AIDS cases in women are among African-Americans.
The incidence rate of reported AIDS cases among African-Americans is eight times that among whites (Centers for Disease Control and Prevention 1999). The Hispanic population has the next highest prevalence rates. Hispanics accounted for about 20 percent of the total number of new AIDS cases in 1998, while they made up about 13 percent of the population. Their rate is about four times that of whites. It is likely that differences in sexual behavior underlie these statistics, but most studies of behavior have focused on one specific group rather than comparing groups with each other. Possible reasons for differences in infection rates could be that (a) members of ethnic minority groups often have a lower socioeconomic status and less access to health care, (b) they are less educated and score lower on knowledge of HIV risk behavior, (c) they communicate less with partners about sexual topics, and men often have a dominant role in relationships, such that women have difficulty discussing promiscuity and condom use, and (d) men in these groups are often less open about their sexual orientation and their HIV status than white men.
3.3 Homosexuals
Homosexual men (men who have sex with men) are a high-risk group for contracting HIV infection. Estimates suggest that 5 percent to 8 percent of all homosexual men are HIV positive. Unprotected sexual contact among homosexual men accounts for about 50 percent of all HIV infections (Robert Koch-Institut 1999). Compared to other groups, homosexual men are more likely to engage in sexual risk behavior, including receptive or insertive anal sex and higher numbers of sexual partners. However, prevention campaigns seem to have influenced the sexual behavior of this group. An estimated three-quarters of homosexual men now report using condoms, especially with anonymous partners. There also seems to be a decline in the overall number of sexual partners. However, some homosexual men continue to expose themselves to considerable risk of HIV infection. Although more homosexual men are using condoms, they often do not use them consistently in all potentially risky sexual encounters. Moreover, those who still engage in unprotected sex are usually sexually very active and/or have multiple partners. Studies on sexual behavior often overlook this fact when reporting rates of condom use. Homosexual women are at practically no risk of HIV infection through sexual behavior.
3.4 Heterosexuals
Heterosexuals in monogamous relationships are at low risk for contracting HIV. However, often both partners have had previous sexual contacts, and there is no guarantee that none of those past partners was infected. Most cases of HIV transmission in this setting are male to female and result from the man’s past high-risk sexual contacts, including homosexual relations and encounters with prostitutes. Sex tourism to countries with high rates of HIV adds particular risk. Differences in sexual behavior between men and women, combined with the relatively higher efficiency of HIV transmission from male to female, have the consequence that in developed countries only 5 percent of all HIV infections in men are due to unprotected heterosexual intercourse, whereas for women the figure is 33 percent. Rates of condom use among heterosexual couples vary from study to study, but about 40–60 percent of sexually active individuals report not using condoms. No conclusive data are available on the settings in which condoms are used. Up to 23 percent of heterosexual men and 35 percent of heterosexual women report having had two or more sexual partners in the past five years, and up to 6 percent of men and 3 percent of women admit to extramarital sex in the past 12 months. Thus, condom use in these settings is of greatest relevance (Johnson et al. 1994, Laumann et al. 1994). Most individuals in monogamous relationships use condoms as a method of contraception rather than as a means of protection against infection.
3.5 Prostitutes
There seems to be a distinction between ‘drug-related’ and ‘professional’ prostitutes. HIV infections are fairly uncommon among professional prostitutes in developed countries; studies report rates of 0.5 percent to 4 percent. Rates of infection among drug-related prostitutes are much higher, with about 30 percent being HIV infected. Behavioral patterns differ between the two groups. Drug-related prostitutes may be forced by either financial need or dependence on a drug supplier to conform to the wishes of their clients, often including sexual intercourse without condoms. This is less the case for professional prostitutes, who have more of a choice in their clients and who can insist on using condoms. Rates for unprotected sex among drug-related prostitutes range from 41 percent to 74 percent. In contrast, rates for professional prostitutes range from 20 percent to 50 percent. Rates may be even higher for male prostitutes (Kleiber and Velten 1994).
4. Determinants of Risk Behavior
Models that attempt to describe and explain HIV risk and protective behavior include the following factors: cultural factors (e.g., ethnic group, social norms), social and environmental factors (e.g., membership in subgroups, knowing HIV-infected individuals), demographic factors (e.g., socioeconomic status, marital status), biographic factors (e.g., sexual orientation, attitude toward health), and psychosocial factors (e.g., level of knowledge about STD risk, self-efficacy, attitude toward condoms). Sociological models emphasize the way risk behavior is influenced by social class, educational level, and the overall social context. Social disadvantage often goes hand in hand with a lack of opportunities for health maintenance and medical care, and with a higher level of risk-taking behavior. Sexual behavior is embedded in the mode of living of the individual and is closely tied to the social environment. Some sociological theories focus on communication in intimate relationships and on economic factors (e.g., financial dependence). Especially among subgroups such as substance abusers, prostitutes, and the socioeconomically disadvantaged, social and economic conditions can be the central factor that controls behavior. Psychological models focus on determinants of risk behavior that entail processes taking place within the individual. Although a large number of studies have been published since the mid-1980s, and despite a long tradition of research into risk-taking behavior even before the era of AIDS (Yates 1992), there is no comprehensive and sufficiently well-founded theory to explain sexual risk behavior (Bengel 1996). Most significant in the field of sexual risk behavior are the social-psychological models (e.g., the Theory of Reasoned Action and Planned Behavior, the Theory of Protection Motivation, and the AIDS Risk Reduction Model; see Health Behavior: Psychosocial Theories). For the purpose of this article the AIDS Risk Reduction Model is presented because it offers the most direct links to preventive strategies. It distinguishes between (a) demographic and personality variables, (b) labeling stage variables, (c) commitment stage variables, and (d) enactment stage variables (Catania et al. 1990). Demographic factors such as gender, age, and education, as well as personality factors such as impulsivity or the readiness to take risks, contribute little to explaining sexual risk behavior (Sheeran et al. 1999). Also of limited predictive value are labeling stage variables such as knowledge of AIDS, sexual experience, and threat appraisal or risk perception. The assumption is that each individual makes a personal assessment of the risk of infection or disease (perceived vulnerability and perceived severity). A heterosexual, monogamous male, for instance, may perceive the menace of AIDS as a severe health issue, but may feel personally invulnerable or nonsusceptible.
However, some individuals who engage in manifest sexual risk behaviors may underestimate their personal risk as compared to others and thereby exhibit an ‘optimistic bias’ (see Health Risk Appraisal and Optimistic Bias). Many studies have shown significant but tenuous correlations between threat appraisal variables and risk behavior. Commitment stage variables influence behavior more substantially and include: (a) social influence: perception of social pressure from significant others to use or not use a condom and of a sexual partner’s attitude toward condoms; (b) beliefs about condoms: attitudes toward condoms, intentions, and perceived barriers to condom use; (c) self-efficacy: confidence in the ability to protect oneself against HIV; and (d) pregnancy prevention: condom use for contraceptive purposes. Social pressure, self-efficacy, attitudes toward condoms, as well as previous use of condoms correlate closely and significantly with HIV protective behavior. The extent to which protective or risky behavior is displayed also depends on situational and interactive factors, the enactment stage variables. Particularly in casual sexual encounters, lack of immediate
condom availability can be the decisive determinant of risk behavior. The nature of the relationship and, in particular, the level of communication about safe sex play a central role in risk behavior. Can the partners communicate about HIV and protective behavior, or are they afraid of offending the partner and jeopardizing what could otherwise be a valued romance? The significance of the influencing factors above varies, depending on the target behavior (e.g., condom use, sexual practices) and on target group (e.g., homosexuals, adolescents, prostitutes; see, e.g., Flowers et al. 1997). All available theoretical approaches and models assume that HIV-protective behavior is governed by a rational decision process. Emotional and motivational factors, as well as planned behavior and action control, largely have been disregarded in these models and have also been insufficiently researched. After experiencing a risk situation, individuals change both their risk perception and their appraisal of the options available for risk management. Especially when uncertainty or fear about HIV infection is high, cognitive coping (e.g., ‘I know my partner’s friends, so he is not HIV infected’) and behavioral coping (e.g., seeking HIV antibody testing) are deployed.
5. Strategies for Behavioral Change
Reducing the rates of infections with HIV and other sexually transmitted diseases and preventing unwanted pregnancies constitute major tasks for health science and policy. Although sexual risk behavior in most important target groups is difficult to assess, and explanatory models lack empirical validation and are incomplete, preventive programs must be developed and implemented (Kelly 1995). A societal agreement on target groups and on methods used in such prevention programs is essential. The need to prevent the spread of AIDS has triggered lively and controversial discussions in many countries: should the emphasis be on information and personal responsibility, or should regulatory measures be employed to stem the tide of the disease? Controversy has raged among scientists about the ability to influence sexual behavior and the right of the state to intervene in such intimate affairs. There is agreement, however, that sexual risk behavior cannot be regarded as an isolated mode of personal conduct, but must be seen in the context of an individual’s lifestyle and social environment. Prevention programs promote the use of condoms as the basic method of protection. They promulgate a message of personal responsibility to prevent risk: ‘protect yourself and others.’ AIDS prevention programs in Western European and North American countries have pursued two objectives: (a) to convey basic information on modes of transmission and methods of protection and (b) to motivate the population to assess individual risk and undertake behavioral change if needed. These recommendations start by urging ‘safe sex,’ that is, use of condoms and avoidance of sexual practices in which body fluids are exchanged, in any situation where infection is a potential risk. Individuals are also advised to reduce the number of sexual partners, to avoid anonymous sexual partners, and to reduce the use of substances that may result in loss of control.
Prevention programs use mass-communication, personal, and structural measures. Mass communication involves dissemination of information via media such as radio, TV, newspaper, and posters, as well as distribution of brochures and informational leaflets. These media may be intended for all audiences or may be aimed at a specific group. The information is conveyed in simple, concrete, everyday language, describing the modes of transmission and the options available for protection. Personal measures include telephone hotlines, events, and seminars for special target groups, street work, individual counseling, and information for sex tourists. These person-to-person prevention programs aim at fostering recognition of the problem, improving the level of knowledge, and changing attitudes, intentions, and behaviors of members of particular target groups. Structural measures include the provision of sterile syringes for intravenous drug users, easy access to condoms, and the improvement of the social situation of prostitutes. Preventive measures must be tailored to the lifestyle, environment, and language of each target group given its specific risk behavior pattern and risk of HIV infection. Programs that rely entirely on the generation of fear enhance risk perception but offer no coping alternatives. Specific messages alone, for instance an appeal for condom use, are also often inadequate to bring about (permanent) behavior changes.
Communication among sexual partners should be encouraged as one of the crucial target parameters of prevention. Evaluation of prevention programs has suggested that they have been successful at conveying the most important information about AIDS. In certain sections of the population there are nevertheless uncertainties, irrational assumptions, and false convictions about risk of infection through such activities as kissing or work contacts. The acceptance level of this information varies widely depending on the target group and is lowest among intravenous drug users. As expected, outside of those groups at especially high risk, the greatest fear of infection prevails among persons below the age of 35 and among singles. Only moderate behavioral change has been found among intravenous drug users. Rapid and significant changes in behavior have been found among homosexual males, especially in cities (fewer sexual partners, fewer high-risk sexual practices, increased use of condoms). Yet only a minority practices safe sex all the time, and some of the behavioral changes are relatively unstable.
Some experts fear that the messages of preventive campaigns are wearing off and that some individuals are now less concerned about becoming infected than in the past. This may be due to the impact of the new antiviral drugs, which have changed the perception of HIV from that of a death sentence to that of a chronic, manageable disease.
See also: Adolescent Health and Health Behaviors; Health Behavior: Psychosocial Theories; Health Behaviors; Health Risk Appraisal and Optimistic Bias; HIV Risk Interventions; Regulation: Sexual Behavior; Sexual Attitudes and Behavior; Sexual Behavior: Sociological Perspective; Sexually Transmitted Diseases: Psychosocial Aspects
Bibliography
Adler N E, Rosengard C 1996 Adolescent contraceptive behavior: Raging hormones or rational decision making? In: Oskamp S, Thompson S C (eds.) Understanding and Preventing HIV Risk Behavior. Safer Sex and Drug Use. Sage, Thousand Oaks, CA
Bancroft J (ed.) 1997 Researching Sexual Behavior. Methodological Issues. Indiana Press, Bloomington, IN
Bengel J (ed.) 1996 Risikoverhalten und Schutz vor AIDS: Wahrnehmung und Abwehr des HIV-Risikos – Situationen, Partnerinteraktionen, Schutzverhalten (Risk behavior and protection against AIDS: Perception of and defense against the risk of HIV – situations, partner interactions, protective behavior). Edition Sigma, Berlin
von Campenhourdt L, Cohen M, Guizzardi G, Hausser D (eds.) 1997 Sexual Interactions and HIV-Risk. Taylor & Francis, London
Catania J A, Kegeles S M, Coates T J 1990 Towards an understanding of risk behavior: An AIDS risk reduction model. Health Education Quarterly 17: 53–72
Centers for Disease Control and Prevention 1999 HIV/AIDS among African–Americans (On-line). Available: http://www.cdc.gov/hiv/pubs/facts/afam.pdf
Centers for Disease Control and Prevention 2000 Tracking the hidden epidemics. Trends in the STD epidemics in the United States (On-line). Available: http://www.cdc.gov/nchstp/dstd/Stats_Trends/Stats_and_Trends.htm
DiClemente R J, Peterson J L (eds.) 1994 Preventing AIDS. Plenum, New York
Flowers P, Sheeran P, Beail N, Smith J A 1997 The role of psychosocial factors in HIV risk-reduction among gay and bisexual men: A quantitative review. Psychology and Health 12: 197–230
Johnson A M, Wadsworth J, Wellings K, Field J 1994 Sexual Attitudes and Lifestyles. Blackwell, Oxford, UK
Kelly J A 1995 Changing HIV Risk Behavior: Practical Strategies. Guilford Press, New York
Kleiber D, Velten D 1994 Prostitutionskunden. Eine Untersuchung über soziale und psychologische Charakteristika von Besuchern weiblicher Prostituierter in Zeiten von AIDS (Clients of prostitutes. An investigation into social and psychological characteristics of customers of female prostitutes in the era of AIDS). Nomos Verlagsgesellschaft, Baden-Baden
Laumann E O, Gagnon J H, Michael R T, Michaels S 1994 The Social Organization of Sexuality. Sexual Practices in the United States. University of Chicago Press, Chicago
Matarazzo J D, Weiss S M, Herd J A, Miller N E, Weiss S M (eds.) 1984 Behavioral Health: A Handbook of Health Enhancement and Disease Prevention. Wiley, New York
McConaghy N (ed.) 1993 Sexual Behavior. Problems and Management. Plenum, New York
Oskamp S, Thompson S C (eds.) 1996 Understanding and Preventing HIV Risk Behavior. Safer Sex and Drug Use. Sage, Thousand Oaks, CA
Robert Koch-Institut 1999 AIDS/HIV Halbjahresbericht I/99. Bericht des AIDS-Zentrums im Robert Koch-Institut über aktuelle epidemiologische Daten (AIDS/HIV semiannual report I/99. Report by the AIDS Center at the Robert Koch-Institute on Current Epidemiological Data). Berlin
Sheeran P, Abraham C, Orbell S 1999 Psychosocial correlates of heterosexual condom use: A meta-analysis. Psychological Bulletin 125: 90–132
Yates J F 1992 Risk-taking Behavior. Wiley, Chichester, UK
J. Bengel
Sexuality and Gender
The meaning of the terms sexuality and gender, and the ways that writers have theorized the relationship between the two, have changed considerably over the last 40 years. The term sexuality has various connotations. It can refer to forms of behavior, it may include ideas about pleasure and desire, and it is also a term that is used to refer to a person’s sense of sexual being, a central aspect of one’s identity, as well as certain kinds of relationships. The concept of gender has also been understood in relation to varying criteria. Prior to the 1960s, it was a term that referred primarily to what is coded in language as masculine or feminine. The meaning of the term gender has subsequently been extended to refer to personality traits and behaviors that are specifically associated either with women or men; to any social construction having to do with the male/female distinction, including those which demarcate female bodies from male bodies; and to the existence of material social groups, ‘men’ and ‘women,’ that are the product of unequal relationships. In this latter sense, gender as a socially meaningful category is dependent on a hierarchy already existing in any given society, where one class of people (men) have systematic and institutionalized power and privilege over another class of people (women) (Delphy 1993). The term patriarchy or, more recently, the phrase ‘patriarchal gender regimes,’ is used as a way of conceptualizing the oppression of women which results. More recently, the notion of gender as social practice has emerged and is associated with the work of Judith Butler (1990), who argues that gender is performatively enacted through a continual citation
and reiteration of social norms. Butler offers a similar analysis of sexuality, claiming that far from being fixed and naturally occurring, (hetero)sexuality is ‘unstable,’ dependent on ongoing, continuous, and repeated performances by individuals ‘doing heterosexuality,’ which produce the illusion of stability. There is no ‘real’ or ‘natural’ sexuality to be copied or imitated: heterosexuality is itself continually in the process of being reproduced. Such ideas are part of the establishment of a new canon of work on sexuality and gender that has emerged since the 1960s. This newer approach differs radically from the older tradition put forward by biologists, medical researchers, and sexologists, which developed through the late nineteenth century and was profoundly influential during the first half of the twentieth century. The traditional approach to understanding sexuality and gender has been primarily concerned with establishing ‘natural’ or ‘biological’ explanations for human behavior. Such analyses are generally referred to as essentialist. More recent approaches, although not necessarily denying the role of biological factors, have emphasized the importance of social and cultural factors, in what is now commonly known as the social constructionist approach. The sociological study of sexuality emerged in the 1960s and 1970s and was informed by a number of theoretical approaches that were significant at the time, notably symbolic interactionism, labeling theory and deviancy theory, and feminism. The work of writers such as John Gagnon and William Simon (1967, 1973) in the US, and Mary McIntosh (1968) and Kenneth Plummer (1975) in the UK, was particularly important in establishing a different focus for thinking about sexuality. A primary concern of such works was to highlight how sexuality is social rather than natural behavior and, as a consequence, a legitimate subject for sociological enquiry.
Gagnon and Simon developed the notion of sexual scripts which, they argued, we make use of to help define who, what, where, when, how, and (most anti-essentialist of all) why we have sex. A script refers to a set of symbolic constructs which invest actors and actions, contexts and situations, with ‘sexual’ meaning, or not, as the case may be. People behave in certain ways according to the meanings that are imputed to things; meanings which are specific to particular historical and cultural contexts; meanings that are derived from scripts learnt through socialization and which are modified through ongoing social interactions with others. Most radical of all, Gagnon and Simon claimed that not only is sexual conduct socially learnt behavior, but the reason for wanting to engage in sexual activity, what in essentialist terms is referred to as sexual ‘drive’ or ‘instinct,’ is in fact a socially learnt goal. Unlike Freud, who claimed the opposite to be true, Gagnon and Simon suggested that social motives underlie sexual actions. They saw gender as central to this and detailed how in contemporary Western societies sexual scripts are different for girls and boys, women and men. Here gender is seen as a central organizing principle in the interactional process of constructing sexual scripts. In this sense gender can be seen as constitutive of sexuality, at the same time as sexuality can be seen as expressive of gender. Thus, for example, Gagnon and Simon argue that men frequently express and gratify their desire to appear ‘masculine’ through specific forms of sexual conduct. For example, for young men in most Western cultures first sexual intercourse is a key moment in becoming a ‘real man,’ whereas this is not the same for young women. It is first menstruation rather than first heterosex that marks being constituted as ‘women.’
Another important contribution to the contemporary study of sexuality, which has posed similar challenges to essentialist theories, is the discourse analysis approach. One example is the work of Michel Foucault (1979) and his followers, who claim that sexuality is a modern ‘invention’ and that by taking ‘sexuality’ as their object of study, various discourses, in particular medicine and psychiatry, have produced an artificial unity of the body and its pleasures: a compilation of bodily sensations, pleasures, feelings, experiences, and actions which we call ‘the sexual.’ Foucault understands sex not as some essential aspect of personality governed by natural laws that scientists may discover, but as an idea specific to certain cultures and historical periods. Foucault draws attention to the fact that the history of sexuality is a history of changing forms of regulation and control over sexuality. What ‘sexuality’ is defined as, its importance for society, and to us as individuals may vary from one historical period to the next.
Furthermore, Foucault argues, as do interactionists, that sexuality is regulated not only through prohibition, but is produced through definition and categorization, in particular through the creation of sexual categories such as, for example, ‘heterosexual’ and ‘homosexual.’ Foucault argues that, while both heterosexual and homosexual behavior has existed in all societies, there was no concept of a person whose sexual identity is ‘homosexual’ until relatively recently. Although there is some disagreement among writers as to precisely when the idea of the homosexual person emerged, it has its origins in the seventeenth to nineteenth centuries, with the category lesbian emerging somewhat later than that of male homosexuality. Such analyses have also highlighted how medical and psychiatric knowledge during the late nineteenth and early twentieth centuries was a key factor in the use of the term ‘homosexual’ to designate a certain type of person rather than a form of sexual conduct. A major criticism of Foucault’s work is that insufficient attention is given to examining the relationship between sexuality and gender. Feminist writers in particular have pointed out how in Foucault’s account of sexuality there is little analysis
of how women and men often have different discourses of sexuality. Sexuality is employed as a unitary concept and, such critics claim, that sexuality is male. Despite such criticisms, many feminists have utilized Foucauldian perspectives.
A further challenge to essentialist ideas about sexuality and gender is associated with psychoanalysis, in particular the reinterpretation of Freud by Jacques Lacan. For Lacan and his followers sexuality is not a natural energy that is repressed; it is language rather than biology that is central to the construction of ‘desire.’ Lacanian psychoanalysis has had a significant influence on the development of feminist theories of sexuality and gender, although some writers have been critical of Lacan’s work (Butler 1990).
At the same time as social scientists and historians were beginning to challenge the assumption that sexual desires and practices were rooted in ‘nature,’ more and more people were beginning to question dominant ideas about gender roles and sexuality. The late 1960s/early 1970s saw the emergence of both women’s and gay and lesbian liberation movements in the US and Europe. An important contribution to analyses of sexuality and gender at that time was the distinction feminists, along with some sociologists and psychologists, sought to make between the terms sex and gender. Sex referred to the biological distinction between females and males, whereas gender was developed and used as a contrasting term to sex. Gender refers to the social meanings and value attached to being female or male in any given society, expressed in terms of the concepts femininity and masculinity, as distinct from that which is thought to be biologically given (sex). Feminists have used the sex/gender distinction to argue that although there may exist certain biological differences between females and males, societies superimpose different norms of personality and behavior that produce ‘women’ and ‘men’ as social categories.
It is this reasoning that led Simone de Beauvoir (1964) to famously remark ‘One is not Born a Woman.’ More recently, a new understanding of gender has emerged. Rather than viewing sex and gender as distinct entities, sex being the foundation upon which gender is superimposed, gender has increasingly been used to refer to any social construction to do with the female/male binary, including male and female bodies. The body, it is argued, is not free from social interpretation, but is itself a socially constructed phenomenon. It is through understandings of gender that we interpret and establish meanings for bodily differences that are termed sexual difference (Nicholson 1994). Sex, in this model, is subsumed under gender. Without gender we could not read bodies as differently sexed; gender provides the categories of meaning for us to interpret how the body appears to us as ‘sexed.’
Feminists have critiqued essentialist understandings of both sexuality and gender and have played an important role in establishing a body of research and theory that supports the social constructionist view. However, feminist theories of sexuality are not only concerned with detailing the ways in which our sexual desires and practices are socially shaped; they are also concerned to ask how sexuality relates to gender and, more specifically, what the relationship is between sexuality and gender inequality. It is this question which perhaps more than any other provoked discussion and controversy between feminists during the 1970s and 1980s. Most feminists would agree that historically women have had less control in sexual encounters than their male partners and are still subjected to a double standard of sexual conduct that favors men. It is, for example, seen as ‘natural’ for boys to want to have sex and with different partners, whereas exactly the same behavior that would be seen as understandable and extolled in a boy is censured in a girl. Sexually active women are subject to criticism and are in danger of being regarded as a ‘slut’ or a ‘slag.’ Where feminists tend to differ is over the importance of sexuality in understanding gendered power differences. For many radical feminists sexuality is understood to be one of the key mechanisms through which men have regulated women’s lives. Sexuality, as it is currently constructed, is not merely a reflection of the power that men have over women in other spheres, but is also productive of those unequal power relationships. Sexuality both reflects and serves to maintain gender divisions. From this perspective the concern is not so much how sexual desires and practices are affected by gender inequalities, but, more generally, how constructions of sexuality constrain women in many aspects of their daily lives, from restricting their access to public space to shaping health, education, work, and leisure opportunities (Richardson 1997).
Fears of sexual violence, for instance, may result in many women being afraid to go out in public on their own, especially at night. It is also becoming clearer how sexuality affects women’s position in the labor market in numerous ways, from being judged by their looks as right or wrong for the job, to sexual harassment in the workplace as a common reason given by women for leaving paid employment. Other feminists have been reluctant to attribute this significance to sexuality in determining gender relations. They prefer to regard the social control of women through sexuality as the outcome of gendered power inequalities, rather than its purpose. There is then a fundamental theoretical disagreement within feminist theories of sexuality, over the extent to which sexuality can be seen as a site of male power and privilege, as distinct from something that gendered power inequalities act upon and influence.
In the 1990s a new perspective on sexuality and sexual politics emerged, fueled by the impact of HIV and AIDS on gay communities and the anti-homosexual feelings and responses that HIV/AIDS revitalized, especially among the ‘moral right.’ One response by scholars was queer theory, a diverse body of work that aims to question the assumption in past theory and research that heterosexuality is ‘natural’ and normal. Queer theory is often identified, especially in its early stages of development, with writers associated with literary criticism and cultural studies, and generally denotes a postmodernist approach to understanding categories of gender and sexuality. The work of Eve Sedgwick (1990), Judith Butler (1990), and Teresa de Lauretis (1991), for instance, might be taken as key to the development of queer theory. A principal characteristic of queer theory is that it problematizes sexual and gender categories in seeking the deconstruction of binary divides underpinning and reinforcing them such as, for instance, woman/man; feminine/masculine; heterosexual/homosexual; essentialist/constructionist. While queer theory aims to develop existing notions of gender and sexuality, there are broader implications of such interventions. As Sedgwick (1990) and others have argued, the main point is the critique of existing theory for its heterosexist bias rather than simply the production of theory about those whose sexualities are marginalized such as, for example, lesbians and gay men. Sexuality is the primary focus for analysis within queer theory and, while acknowledging the importance of gender, the suggestion is that sexuality and gender can be separated analytically. In particular, queer theorists are centrally concerned with the homo/heterosexual binary and the ways in which this operates as a fundamental organizing principle in modern societies.
The emphasis is on the centrality of homosexuality to heterosexual culture and the ways in which the hetero/homo binary serves to define heterosexuality at the center, with homosexuality positioned as the marginalized ‘other.’ Feminist perspectives, on the other hand, have tended to privilege gender in their analyses (the woman/man binary) and, as I have already outlined above, are principally concerned with sexuality insofar as it is seen as constitutive of, as well as determined by, gendered power relations. Of particular significance for the development of our understanding of the relationship between queer and feminism is a rethinking of the distinction between sexuality and gender.
The relationship between sexuality and gender has, then, been theorized in different ways by different writers. These can be grouped into five broad categories. First, some theories place greater emphasis on gender insofar as concepts of sexuality are understood to be largely founded upon notions of gender (Gagnon and Simon 1973, Jackson 1996). For example, it is impossible to talk meaningfully about heterosexuality or homosexuality without first having a notion of one’s sexual desires and relationships as being directed to a particular category of gendered persons. Others propose a different relationship, where sexuality is understood to be constitutive of gender. The radical feminist Catherine MacKinnon (1982), for example, suggests that it is through the experience of ‘sexuality,’ as it is currently constructed, that women learn about gender, learn what ‘being a woman’ means. ‘Specifically, “woman” is defined by what male desire requires for arousal and satisfaction and is socially tautologous with “female sexuality” and the “female sex”’ (MacKinnon 1982). A third way of understanding the relationship between sexuality and gender, which moves away from causal relationships where one is seen as more or less determining of the other, is one which allows for the possibility that the two categories can be considered as analytically separate, if related, domains. Gayle Rubin (1984) in her account of what she terms a ‘sex/gender system,’ and others who have been influenced by her work, such as Eve Sedgwick (1990), make this distinction between sexuality and gender, which means that it is possible for sexuality to be theorized apart from the framework of gender difference. This is a model favored by many queer theorists. Alternatively, we may reject all of these approaches in favor of developing a fourth model which relies on the notion that sexuality and gender are inherently co-dependent and may not usefully be distinguished one from the other (Wilton 1996). The stage in our contemporary understandings of sexuality and gender is such that there can be no simple, causal model that will suffice to explain the interconnections between them. However, rather than wanting to privilege one over the other, or seeking to analytically distinguish sexuality and gender or, alternatively, to collapse the two, we might propose a fifth approach which investigates ‘their complex interimplication’ (Butler 1997).
It is this articulation of new ways of thinking about sexuality and gender in a dynamic, historically, and socially specific relationship that is one of the main tasks facing both feminist and queer theory (Richardson 2000).
See also: Female Genital Mutilation; Feminist Theory: Psychoanalytic; Feminist Theory: Radical Lesbian; Gay, Lesbian, and Bisexual Youth; Gay/Lesbian Movements; Gender and Reproductive Health; Gender Differences in Personality and Social Behavior; Gender Ideology: Cross-cultural Aspects; Heterosexism and Homophobia; Lesbians: Historical Perspectives; Male Dominance; Masculinities and Femininities; Prostitution; Queer Theory; Rape and Sexual Coercion; Rationality and Feminist Thought; Regulation: Sexual Behavior; Sex Offenders, Clinical Psychology of; Sex-role Development and Education; Sex Therapy, Clinical Psychology of; Sexual Attitudes and Behavior; Sexual Behavior: Sociological Perspective; Sexual Harassment: Legal Perspectives; Sexual Harassment: Social and Psychological Issues; Sexual Orientation and the Law; Sexual Orientation: Biological Influences; Sexual Orientation: Historical and Social Construction; Sexual Risk Behaviors; Sexuality and Geography; Teen Sexuality; Transsexuality, Transvestism, and Transgender
Bibliography
Butler J 1990 Gender Trouble: Feminism and the Subversion of Identity. Routledge, New York
Butler J 1997 Critically queer. In: Phelan S (ed.) Playing with Fire: Queer Politics, Queer Theories. Routledge, London
de Beauvoir S 1964 The Second Sex. Bantam Books, New York
de Lauretis T 1991 Queer theory: lesbian and gay sexualities: an introduction. Differences: Journal of Feminist Cultural Studies 3(2): iii–xviii
Delphy C 1993 Rethinking sex and gender. Women’s Studies International Forum 16(1): 1–9
Foucault M 1979 The History of Sexuality. Allen Lane, London, Vol. 1
Gagnon J H, Simon W (eds.) 1967 Sexual Deviance. Harper & Row, London
Gagnon J H, Simon W H 1973 Sexual Conduct: The Social Sources of Human Sexuality. Harper & Row, London
Jackson S 1996 Heterosexuality and feminist theory. In: Richardson D (ed.) Theorising Heterosexuality. Open University Press, Buckingham, UK
MacKinnon C A 1982 Feminism, Marxism, method and the state: an agenda for theory. Signs: Journal of Women in Culture and Society 7(3): 515–44
McIntosh M 1968 The homosexual role. Social Problems 16(2): 182–92
Nicholson L 1994 Interpreting gender. Signs: Journal of Women in Culture and Society 20(1): 79–105
Plummer K 1975 Sexual Stigma: An Interactionist Account. Routledge and Kegan Paul, London
Rich A 1980 Compulsory heterosexuality and lesbian existence. Signs: Journal of Women in Culture and Society 5(4): 631–60
Richardson D 1997 Sexuality and feminism. In: Robinson V, Richardson D (eds.) Introducing Women’s Studies: Feminist Theory and Practice, 2nd edn. Macmillan, Basingstoke, UK
Richardson D 2000 Re: Thinking Sexuality. Sage, London
Rubin G 1984 Thinking sex: notes for a radical theory of the politics of sexuality. In: Vance C S (ed.) Pleasure and Danger: Exploring Female Sexuality. Routledge, London
Sedgwick E K 1990 Epistemology of the Closet. University of California Press, Berkeley, CA
Wilton T 1996 Which one’s the man? The heterosexualisation of lesbian sex. In: Richardson D (ed.) Theorising Heterosexuality. Open University Press, Buckingham, UK
D. Richardson
Sexuality and Geography
Geographical questions concerning sexuality are ideally investigated by combining empirical work with theories of how and why particular sexualities garner spatial force and cultural meaning. In practice, most
theoretical treatments of sexuality in geography have dealt with normative forms of heterosexuality in industrial contexts. Queries about heterosexuality have centered partially on theorizing why and how procreation is given symbolic, spatial, and practical centrality in different cultural and historical contexts. Other theoretical queries have involved analyzing how heterosexual metaphors and expectations inform the structure and meaning of language, discourse, and place. For instance, what is the historical and spatial significance of saying master bedroom, cockpit, capitalist penetration, motherland, virgin territory, mother-earth, or father-sky? In contrast, empirical work has centered on documenting where nonheterosexualities take place, how states and societies regulate nonheterosexual places, bodies, and practices, and the sociospatial difficulties and resistances faced by those who desire nonheterosexual places and identities. Some empirical work also addresses the marginalized places of sex workers in heterosexual sex-work trades. The disparity between theoretical and empirical work reflects how knowledge and power intersect spatially. Most persons today are compelled to assume, and are hence familiar with, the bodily and spatial arrangements of heterosexual gestures and public displays of affection, procreational practices (marriages, buying a ‘family’ home, having children), and forms of entertainment. Consequently, little explanatory or descriptive analytical space is needed. In contrast, the bodily and spatial arrangements of those assuming non-normative sexualities are commonly hidden from view and relatively unknown. They therefore require documentation before theorization can proceed. Intersections between geography and sexuality cannot be understood solely by listing research accomplished to date.
Rather, discussion about how geographers have formally and informally navigated sexuality debates is needed to contextualize how external politics have informed internal concerns and research agendas.
1. The Sociology of Sexuality Research in Geography
Few geographers have, until recently, explored how sexuality shapes and is shaped by the social and spatial organization (what geographers call the sociospatiality) of everyday life. Fewer still have explored how geography as a discipline is shaped by heterosexual norms and expectations, or what is called heteronormativity. The ways that modern heterosexuality informs everyday life are not seen or studied because it is taken as natural. Hence, most people unconsciously use it to structure the sociospatiality and meaning of their lives, living out and within what Butler (1990) calls a ‘heterosexual matrix.’ Many cultural settings in which heterosexuality predominates place great practical and ideological value on procreation. Here, couplings of persons with genitalia qualified as oppositely sexed are presumed to be biologically natural, feeding into culturally constructed ideals of gender identities and relations (WGSG 1997). In modern industrial contexts, boys and girls are expected to grow up to be fathers and mothers who settle into a nuclear household to procreate. Given modern heterosexuality’s centrality in determining what is natural, it commonly grounds notions of ontology (being), epistemology (ways of knowing), and truth. The socially constructed ‘naturalness’ of heterosexuality in modern contexts is produced and reproduced at a variety of geographical scales. Many nation-states, for example, define and regulate heterosexuality as the moral basis of state law and moral order (e.g., The Family), forbidding sexual alliances not ostensibly geared towards procreation (Nast 1998). Such laws and related political, cultural, and epistemological strictures have historically limited the kinds of spatial questions geographers have thought, imagined, or asked. Today, precisely when the existence and sovereignty of nation-states is being challenged by transnational actors, so, too, is the relevance of the nuclear family being questioned, suggesting partial societal and political symbiosis between family and state types. Nevertheless, for the time being, heterosexual strictures predominate across sociospatial domains, helping to account for geography’s generalized disciplinary anxiety towards: (a) those who conduct research on, or question, how heterosexuality is sociospatially made the norm, (b) geographers who hold queer identities, and (c) those engaged in queer identity research. The word queer, here, refers to persons practicing non-normative kinds of sexuality. The word is not meant to efface sexuality differences, but to stress the oppositional contexts through which those who are non-normative are made marginal (see Elder et al. in press).
By the late 1980s geographers in significant numbers had brought analytical attention to the violence involved in sustaining what Rich (1980) has named ‘compulsory heterosexuality’ (e.g., Foord and Gregson 1986, Bell 1991, Valentine 1993a, 1993b, WGSG 1997, Nast 1998). Part of this violence has to do with the fact that most geographic research fails to question how heterosexuality is spatially channeled to shape the discipline and everyday life. Heterosexuality’s incorporation into the micropractices and spaces of the discipline means that heteronormativity dominates the ways geographers investigate space. Witness the sustained heterosexist practices and spaces of geography departments and conferences (e.g., Chouinard and Grant 1995, Valentine 1998), heterosexist geographical framings or epistemologies of space, nature, and landscape, and the discipline’s heterosexist empirical interests, divisions, and foci (e.g., Rose 1993, Binnie 1997a). Even feminist geographical analyses rarely show how constructions of gender relate to the heterosexual practices to which they are attached (Foord and Gregson 1986, Townsend in WGSG 1997, p. 54). In these ways, heterosexuality underpins geographical notions of self, place, sex, and gender (Knopp 1994). Consequently, geographers reproduce hegemonic versions of hetero sex and oedipal (or, nuclear) family life, helping to make these appear legitimate, natural, morally right, and innocent.
2. Theoretical Research on Heterosexuality and Geography
Perhaps the first geographers to theorize heterosexuality’s impact were Foord and Gregson (1986). Using a realist theoretical framework, they argued that gender divisions of male and female derive from procreation and, hence, heterosexuality. Later, Rose (1993) demonstrated how the language, interests, methods, theories, and practices of certain subfields in geography (particularly time geography and humanistic geography) belie a masculinity that, respectively, ignores women’s lives or renders them nostalgic objects of desire, not unlike how landscapes and nature have been romanticized and feminized in the discipline. Yet the masculinity Rose describes is heterosexual; it is heteromasculinity. Heteromasculinity can additionally be argued to reside in epistemologies and empirical foci of geographic research generally (Knopp 1994), especially physical geography and some Marxian analyses of space. Nast (1998) theorizes that modern heteromasculinity is not a singular entity set in structural opposition to femininity, as in traditional binary theories of gender. Rather, the heteromasculine is structured along two imagined, symbolized, and enacted avenues: (a) the filial (son-like) and (b) the paternal. The filial is constructed as hyperembodied and hence celebrates-as-it-constructs men’s superior physical strength and courage. The paternal is different in that it celebrates men’s superior intellectual strength, objectivity, and cunning and is written about as though disembodied. These two masculinities are antagonistically inter-related in an imaginary and symbolic framework supportive of the prototypical industrial, oedipal family. In the context of globalization and nationalisms, Gibson-Graham (1999) and Nast (1998), respectively, theorize how heterosexualities and geographies discursively and practically make one another.
Gibson-Graham analyzes the striking rhetorical parallels between some Marxian theoretical constructions of capital and some social constructions of rape that cast women as victims. Thus, many Marxian theories script capitalism as violently irrepressible, a force that destroys noncapitalist social relations with which it
comes into contact. Such theorization gathers force from heterosexualized language and imagery, such that capitalism and its effects are rendered naturally and forever singular, hard, rapacious, and spatially penetrative. Analogously, raped women have commonly been scripted as victims of men’s naturally and singularly superior strength and virility. Rape, once initiated, is judged as unstoppable or inevitable. Women are therefore encouraged to submit (like noncapitalist groups) to heteromasculinity’s logic-violence. For Gibson-Graham, what needs to be resisted is not the ‘reality’ that women and markets are naturally rapable or that men and capitalism are monolithically superior and rapacious. Rather, imaginaries and discursive formations need to be deployed that allow for difference and agency and that support, facilitate, and valorize resistance and change. Nast (1998), drawing upon Foucaultian notions of discourse as useful, similarly shows how heteronormativity is dispersed across sociospatial circuits of everyday life, including discourses and practices of the body, nation, transnation, and world. Focusing on modern nationalist language and imagery, she shows that many representations of nation-states depend upon images of a pure maternal and pure nuclear family. In these instances, heteromasculine control over women’s bodies and procreation is commonly linked to eugenics or racialized notions of national purity. Obversely, some nation-states, particularly fascist ones, are represented using phallic language and imagery that glorify the oedipal (nuclear) family father. Cast as superior protector of family and nation, he and his sons defend against that deemed weak and polluting, including other ‘races’ and the maternal. Nuclear familial imagery and practices have also been used in the context of industrial interests, providing, for example, practical and aesthetic means for systematically alienating and reproducing labor.
Here, the nuclear family is a core setting for producing and socializing industrial workers, with the family and family home fetishized as natural protected places of privacy and peace.
3. Empirical Research on Sexuality and Geography in Sociospatial Context
Most sexuality research in geography consists of empirical studies of sexuality. In many cases the research is undertheorized, partially reflecting the nascent state of sexuality research wherein specificities of sexualities are mapped out before they are theorized across contexts. Many of those documenting the spatial contours of sexualized bodies and practices have focused on: (a) marginalized heterosexuals, such as sex workers, and (b) queer bodies and places. The former is heteronormative to the degree that the researchers do not specify that it is heterosexuality
with which they work; they also do not theorize heterosexuality’s specific sociospatial force in the contexts they study. Consequently their work reinforces popular senses that when one speaks of sex, it is naturally and singularly heterosex. In contrast, because they work against the grain of heterosexual norms and expectations, researchers who study the spatiality of queer bodies and places call attention to the specific sexuality(ies) through and against which they work. Perhaps because in some postindustrial contexts the nuclear family and procreational economies went into decline in the 1970s (alongside the Fordist family), social movements involving queer sexualities obtained greater popular recognition and political and material might, especially where related to gay men. It was at this time that a first wave of queer geographical work broke surface, dealing primarily with white gay male investors in urban contexts and drawing largely upon Marxian theories of rent. Many of these works were unpublished and bore descriptive titles (but see Weightman 1980; for unpublished works, see Elder et al. in press). In tandem with increased scholarly production was an increase in the scholarly visibility of queer geographers (see Elder et al. in press). Evidence of increased networking among sexuality researchers is Mapping Desire, the first edited collection on sexuality and geography (Bell and Valentine 1995). The collection, like other works produced towards the late 1980s and 1990s, reflects broad geographical interests, ranging from heterosexual prostitution in Spain, to state surveillance and disciplinary actions towards Black men and women in apartheid South Africa, to the ‘unsafeness’ of the traditional nuclear family home for many lesbians. The work consists mostly of case studies detailing how marginalized identities are lived out, negotiated, and contested.
Ironically, given the collection’s title, no spatial theories of desire are presented. Institutional recognition of queer geographers or sexuality research was not sought in English-speaking contexts until the 1990s, and then only in the United States and Britain. In the 1980s, members of the Association of American Geographers (AAG) formed the Lesbigay Caucus, subsequently holding organizational meetings at annual AAG meetings and cosponsoring sessions. In 1992 a small group of British geographers created The Sexuality and Space Network, a research and social network affiliated loosely with the Institute of British Geographers. That same year, members of the Lesbigay Caucus decided to work for creation of a Sexuality and Space Specialty Group within the AAG which, unlike the caucus, would sponsor sessions and research committed to exploring inter-relations between sexuality and space. The Specialty Group was granted official status in 1996, becoming the first group of its kind in the discipline.
Empirical work on sexuality and geography is gendered, especially in the context of queer research. Most scholars writing about lesbian spaces are women documenting geographies of women’s fear and discrimination and how these are resisted at different scales and in different contexts. In contrast, most researchers addressing gay male spaces are men. Unlike lesbian-related research, research related to gay men speaks mostly to proactive market measures successfully taken in securing urban-based private property and territories, particularly queer cultural districts produced through large investments in real estate and small business. In either case, western contexts prevail. Thus, Patricia Meono-Picado (Jones et al. 1997) documents how Latina lesbians in New York used spatial tactics to oppose the homophobic programming of a Spanish-language radio station, Valentine (1993b) and McDowell (in Bell and Valentine 1995) chart how some lesbians tenuously negotiate their identities in the workplace, Rothenberg (in Bell and Valentine 1995) discusses the informal networks lesbians use to create community and safety, and Johnston (1998) speaks of the importance of gyms as safe public spaces for alternative female embodiments. Valentine also discusses discriminations faced by lesbians every day (1993a) and the considerable diversity within and among lesbian communities (in Jones et al. 1997). Finally, Chouinard and Grant (1995) argue against marginalization of disabled and lesbian women in the academy, comparing their exclusions as disabled and lesbian women, respectively. In contrast, the urban market emphasis present in much gay male research is partly evident in titles such as ‘The Broadway Corridor: Gay Businesses as Agents of Revitalization in Long Beach, California’ (Ketteringham 1983 cited in Elder et al. in press).
The title secondarily speaks to a phenomenon present in much research, particularly about gay male spaces, namely, the exclusionary use of the word ‘gay.’ While a study may pertain to gay men only, the word ‘gay,’ like heteropatriarchal designations of ‘man,’ is commonly deployed as though it speaks for all queer communities. It thereby obliterates the specificities and identities of those not gay-male-identified (Chouinard and Grant 1995). Perhaps the most prominent and prolific scholar pioneering research into urban markets and queer life is Larry Knopp. Though his work centers on gay male gentrification, he has consistently attempted to theorize gendered spatial differences in queer opportunity structures and life (e.g., Knopp 1990, Lauria and Knopp 1985). Gay men’s desires and differential abilities to procure spatial security at scales and densities greater than those obtained by lesbians (for example, Boystown in Chicago, the Castro district in San Francisco, the West End in Vancouver, Mykonos and Corfu in Greece, and cruising areas for western gay men in Bangkok) have been a source of scholarly debate.
Castells (1983) argued early on, for example, that urban-based gay men form distinct neighborhoods because, as men, they are inherently more territorial. Adler and Brennan (1992) disagree, contending that lesbians, as women, are relatively disadvantaged economically and more prone to violence and attack. Consequently, they are less able to secure territory, legitimacy, and visibility (see also Knopp 1990). More recent analyses have taken a comparative or cross-cultural look at queer life, entertaining both greater analytical specificity and diversity along lines of gender, ‘race,’ disability, and/or national or rural–urban location (e.g., Brown 2000, Bell and Valentine 1995, Chouinard and Grant 1995). There is also nascent research on state regulation of sexuality, national identity, and citizenship (e.g., Binnie 1997b, Nast 1998) and on intersections of sexuality and the academy (JGHE 1999). Despite the increased presence of queer research and researchers in the discipline, sexuality research, especially that which deconstructs heterosexuality or which makes queer sexuality visible, is still considered radical (Chouinard and Grant 1995, Valentine 1998, JGHE 1999). Negative reactions point to anxieties over research that exposes heterosexuality’s artifice, research that implicitly argues against the privileges and privileging power relations upon which heterosexual ontologies of truth and gender are grounded. The depth of fear is made poignantly clear in a recent article by Valentine (1998) about the ongoing homophobic harassment levied at her by someone in the discipline who collapses her sexuality (she is a lesbian) with her sexuality research on lesbians, explicitly naming both a moral abomination.
4. Future Research
A number of areas for future research suggest themselves. First, theorization is needed in addressing sexuality–geography links. What kinds of spatial and symbolic work do normative and queer sexualities practically accomplish across historical and cultural place? More work is needed that explores how normative heterosexualities vary across time and place, in keeping with diverse sociocultural expressions, uses, and values of kinship and family structures. Moreover, the nuclear family is apparently in decline unevenly across place and time, in keeping with shifts in places of industrialization, raising the question as to whether some sexualities and affective familial patterns are better suited to certain political economies than to others (see also Knopp 1994). If this is the case, what kinds of sexual identities and desires work well in postindustrial places, and how are industrial restructuring processes affecting oedipal family life and the commoditization of normative and oppositionally sexed identities? What sorts of alternative sociospatial alliances are needed to allow diverse sexualities to co-exist? Are gay white men between the ages of 25 and 40 ideally situated economically and culturally to take advantage of the decline of the nuclear family in postindustrial contexts, which do not depend as much upon, or value, procreation? Second, sexuality–geography research needs to be more theoretically and empirically attuned to differences produced through normative and oppositional constructions of race, gender, class, disability, age, religious beliefs, and nation. Little gender research has been done in the context of queer communities, for example. Are sociospatial interactions between gay male and lesbian communities devoid of gendered tensions? Does patriarchy disappear in spaces of gay male communities (Chouinard and Grant 1995, Knopp 1990)? Are lesbians or gay men situated outside the signifying strictures of heteronormative codes of femininity and masculinity? And how are constructions of race expressed sexually and geographically? Much work needs to be done on how persons of color have been put into infantilized, subordinate positions and places associated with the white-led Family. African-American men after emancipation, for example, were consistently constructed as rapist-sons sexually desirous of the white mother. What kind of work do these and other similarly sexed constructions accomplish in colonial and neocolonial contexts (Nast 2000)? Similarly, why does sex with young boys and men of color, particularly in Southeast Asia, have such wide market appeal among privileged gay white men, resulting in considerable sexualized tourism investment? And what kind of cultural, political, and economic work is achieved through sado-masochistic constructions threaded across sexed and raced communities? What precisely is being constructed and why? Finally, how do sociospatial debates about, and constructions of, sexuality continue to shape the theories, practices, and empirical concerns of geography?
See also: Gender and Environment; Gender and Place; Queer Theory; Sexuality and Gender
Bibliography
Adler S, Brennan J 1992 Gender and space: Lesbians and gay men in the city. International Journal of Urban and Rural Research 16: 24–34
Bell D J 1991 Insignificant others: Lesbian and gay geographies. Area 23: 323–9
Bell D, Valentine G (eds.) 1995 Mapping Desire. Routledge, New York
Binnie J 1997a Coming out of geography: Towards a queer epistemology? Environment and Planning D: Society and Space 15: 223–37
Binnie J 1997b Invisible Europeans: Sexual citizenship in the New Europe. Environment and Planning A 29: 237–48
Brown M P 2000 Closet Geographies. Routledge, New York
Butler J P 1990 Gender Trouble. Routledge, New York
Castells M 1983 The City and the Grassroots. University of California Press, Berkeley, CA
Chouinard V, Grant A 1995 On being not even anywhere near ‘‘the project’’: Ways of putting ourselves in the picture. Antipode 27: 137–66
Elder G, Knopp L, Nast H in press Geography and sexuality. In: Gaile G, Willmott C (eds.) Geography in America at the Dawn of the 21st Century. Oxford University Press, Oxford, UK
Foord J, Gregson N 1986 Patriarchy: towards a reconceptualisation. Antipode 18: 186–211
Gibson-Graham J K 1998 Queerying globalization. In: Nast H, Pile S (eds.) Places Through the Body. Routledge, New York
Johnston L 1998 Reading the sexed bodies and spaces of gyms. In: Nast H J, Pile S (eds.) Places Through the Body. Routledge, New York
Jones J P III, Nast H J, Roberts S H (eds.) 1997 Thresholds in Feminist Geography. Rowman and Littlefield, Lanham, MD
JGHE Symposium: Teaching Sexualities in Geography 1999 The Journal of Geography in Higher Education 23: 77–123
Knopp L 1990 Some theoretical implications of gay involvement in an urban land market. Political Geography Quarterly 9: 337–52
Knopp L 1994 Social justice, sexuality, and the city. Urban Geography 15: 644–60
Lauria M, Knopp L 1985 Towards an analysis of the role of gay communities in the urban renaissance. Urban Geography 6: 152–69
Nast H 1998 Unsexy geographies. Gender, Place and Culture 5: 191–206
Nast H J 2000 Mapping the ‘unconscious’: Racism and the oedipal family. Annals of the Association of American Geographers 90(2): 215–55
Rich A 1980 Compulsory heterosexuality and lesbian existence. Signs 5: 631–60
Rose G 1993 Feminism and Geography. University of Minnesota Press, Minneapolis, MN
Valentine G 1993a (Hetero)sexing space: Lesbian perceptions and experiences of everyday spaces. Environment and Planning D: Society and Space 11: 395–413
Valentine G 1993b Negotiating and managing multiple sexual identities: Lesbian time-space strategies. Transactions of the Institute of British Geographers 18: 237–48
Valentine G 1998 ‘‘Sticks and Stones May Break my Bones’’: A personal geography of harassment. Antipode 30: 305–32
Weightman B 1980 Gay bars as private places. Landscape 24: 9–17
Women and Geography Study Group (WGSG) 1997 Feminist Geographies: Explorations in Diversity and Difference. Longman, Harlow, UK
H. J. Nast
Sexually Transmitted Diseases: Psychosocial Aspects
Sexually transmitted diseases (STDs) including HIV place an enormous burden on the public’s health. In 1995, the World Health Organization (WHO) estimated that, worldwide, there were 333 million new
cases of chlamydia, gonorrhea, syphilis, and trichomoniasis in 15- to 49-year-olds. In 1999, WHO estimated that there were 5.6 million new HIV infections worldwide, and that 33.6 million people were living with AIDS. Although it is not who you are, but what you do that determines whether you will expose yourself or others to STDs including HIV, STDs disproportionately affect various demographic groupings. For example, STDs are more prevalent in urban settings, in unmarried individuals, and in young adults. They are also more prevalent among the socioeconomically disadvantaged and various ethnic groups. For example, in 1996 in the United States, the incidence of reported gonorrhea per 100,000 was 826 among black non-Hispanics, 106 among Native Americans, 69 among Hispanics, 26 among white non-Hispanics, and 18.6 among Asian/Pacific Islanders (Aral and Holmes 1999). Demographic differences in STD rates are most likely explained by differences in sexual behaviors or disease prevalence. Thus, demographic variables are often referred to as risk markers or risk indicators. In contrast, sexual and healthcare behaviors that directly influence the probability of acquiring or transmitting STDs represent true risk factors. From a psychosocial perspective, it is these behavioral risk factors (and not risk markers) that are critical for an understanding of disease transmission (see Sexual Risk Behaviors).
1. Transmission Dynamics
To understand how behavior contributes to the spread of an STD, consider May and Anderson’s (1987) model of transmission dynamics:

$R_0 = \beta c D$

where $R_0$ is the reproductive rate of infection (when $R_0$ is greater than 1, the epidemic is growing; when $R_0$ is less than 1, the epidemic is dying out; and when $R_0 = 1$, the epidemic is in a state of equilibrium), $\beta$ is a measure of infectivity or transmissibility, $c$ is a measure of interaction rates between susceptibles and infected individuals, and $D$ is a measure of duration of infectiousness.

Each of the parameters on the right-hand side of the equation can be influenced by behavior. For example, the transmission rate ($\beta$) can be lowered by increasing consistent and correct condom use or by delaying the onset of sexual activity. Transmissibility can also be reduced by vaccines, but people must use these vaccines and, before that, people must have participated in vaccine trials. Decreasing the rate of new partner acquisition will reduce the sexual interaction rate $c$ and, at least for bacterial STDs, duration of infectiousness $D$ can be reduced through the detection of asymptomatic STDs or through early treatment of symptomatic STDs. Thus, increasing care-seeking behavior and/or increasing the likelihood that one will participate in screening programs can affect the reproductive rate. Given that STDs are important cofactors in the transmission of HIV, early detection and treatment of STDs will also influence HIV transmissibility ($\beta$). Finally, $D$ can also be influenced by patient compliance with medical treatment, as well as compliance with partner notification. Clearly, there are a number of different behaviors that, if changed or reinforced, could have an impact on the reproductive rate $R_0$ of HIV and other STDs. A critical question is whether it is necessary to consider each of these behaviors as a unique entity, or whether there are some more general principles that can guide our understanding of any behavior. Fortunately, even though every behavior is unique, there are only a limited number of variables that need to be considered in attempting to predict, understand, or change any given behavior.
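The way behavior change moves the reproductive rate across the threshold of 1 can be illustrated with a minimal Python sketch of the product of infectivity, interaction rate, and duration of infectiousness described above. The function names and all numeric values are hypothetical and chosen only for illustration; they are not estimates from May and Anderson (1987) or from any data set.

```python
def reproductive_rate(beta, c, duration):
    """Reproductive rate as the product beta * c * D, as described in the text."""
    return beta * c * duration


def epidemic_state(r0):
    """Classify the epidemic by comparing the reproductive rate with 1."""
    if r0 > 1:
        return "growing"
    if r0 < 1:
        return "dying out"
    return "equilibrium"


# Hypothetical baseline: beta = 0.25 per partnership, c = 2 new partners
# per year, D = 3 years of infectiousness (illustrative values only).
baseline = reproductive_rate(beta=0.25, c=2.0, duration=3.0)

# Hypothetical effect of earlier detection and treatment: D is halved.
with_screening = reproductive_rate(beta=0.25, c=2.0, duration=1.5)

print(baseline, epidemic_state(baseline))              # 1.5 growing
print(with_screening, epidemic_state(with_screening))  # 0.75 dying out
```

Because the three parameters multiply, halving any one of them halves the reproductive rate; in this made-up example, shortening the duration of infectiousness alone is enough to bring the rate below 1.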
2. Psychosocial Determinants of Intention and Behavior
Figure 1 provides an integration of several different leading theories of behavioral prediction and behavior change (cf. Fishbein et al. 1992). Before describing this model, however, it is worth noting that theoretical models such as the one presented in Fig. 1 have often been criticized as ‘Western’ or ‘US’ models that don’t apply to other cultures or countries. When properly applied, however, these models are culturally specific, and they require one to understand the behavior from the perspective of the population being considered. Each of the variables in the model can be found in almost any culture or population. In fact, the theoretical variables contained in the model have been assessed in over 50 countries in both the developed and the developing world. Moreover, the relative importance of each of the variables in the model is expected to vary as a function of both the behavior and the population under consideration (see Health Behavior: Psychosocial Theories).
2.1 Determinants of Behavior
Looking at Fig. 1, it can be seen that any given behavior is most likely to occur if one has a strong intention to perform the behavior, if one has the necessary skills and abilities required to perform the behavior, and if there are no environmental constraints preventing behavioral performance. Indeed, if one has made a strong commitment (or formed a strong intention), has the necessary skills and abilities, and if there are no environmental constraints to prevent behavioral performance, the probability is close to one that the behavior will be performed (Fishbein et al. 1992).

Clearly, very different types of interventions will be necessary if one has formed an intention but is unable to act upon it, than if one has little or no intention to perform the behavior in question. Thus, in some populations or cultures, the behavior may not be performed because people have not yet formed intentions to perform the behavior, while in others, the problem may be a lack of skills and/or the presence of environmental constraints. In still other cultures, more than one of these factors may be relevant. For example, among female commercial sex workers (CSWs) in Seattle, Washington, only 30 percent intend to use condoms for vaginal sex with their main partners, and of those, only 40 percent have acted on their intentions (von Haeften et al. 2000). Clearly, if people have formed the desired intention but are not acting on it, a successful intervention will either be directed at skills building or involve social engineering to remove (or to help people overcome) environmental constraints.

Figure 1. An integrative model
2.2 Determinants of Intentions
On the other hand, if strong intentions to perform the behavior in question have not been formed, the model suggests that there are three primary determinants of intention: the attitude toward performing the behavior (i.e., the person’s overall feelings of favorableness or unfavorableness toward performing the behavior), perceived norms concerning performance of the behavior (including both perceptions of what others think one should do as well as perceptions of what others are doing), and one’s self-efficacy with respect to performing the behavior (i.e., one’s belief that one can perform the behavior even under a number of difficult circumstances) (see Self-efficacy and Health).

As indicated above, the relative importance of these three psychosocial variables as determinants of intention will depend upon both the behavior and the population being considered. Thus, for example, one behavior may be determined primarily by attitudinal considerations while another may be influenced primarily by feelings of self-efficacy. Similarly, a behavior that is driven attitudinally in one population or culture may be driven normatively in another. Thus, before developing interventions to change intentions, it is important first to determine the degree to which that intention is under attitudinal, normative, or self-efficacy control in the population in question. Once again, it should be clear that very different interventions are needed for attitudinally controlled behaviors than for behaviors that are under normative influence or are related strongly to feelings of self-efficacy. Clearly, one size does not fit all, and interventions that are successful in one culture or population may be a complete failure in another.
2.3 Determinants of Attitudes, Norms, and Self-efficacy
The model in Fig. 1 also recognizes that attitudes, perceived norms, and self-efficacy are all, themselves, functions of underlying beliefs: about the outcomes of performing the behavior in question, about the normative proscriptions and/or behaviors of specific referents, and about specific barriers to behavioral performance. Thus, for example, the more one believes that performing the behavior in question will lead to ‘good’ outcomes and prevent ‘bad’ outcomes, the more favorable one’s attitude toward performing the behavior. Similarly, the more one believes that specific others think one should (or should not) perform the behavior in question, and the more one is motivated to comply with those specific others, the more social pressure one will feel (or the stronger the norm) with respect to performing (or not performing) the behavior. Finally, the more one perceives that one can (i.e., has the necessary skills and abilities to) perform the behavior, even in the face of specific barriers or obstacles, the stronger will be one’s self-efficacy with respect to performing the behavior.

It is at this level that the substantive uniqueness of each behavior comes into play. For example, the barriers to using and/or the outcomes (or consequences) of using a condom for vaginal sex with one’s spouse or main partner may be very different from those associated with using a condom for vaginal sex with a commercial sex worker or an occasional partner. Yet it is these specific beliefs that must be addressed in an intervention if one wishes to change intentions and behavior. Although an investigator can sit in their office and develop measures of attitudes, perceived norms, and self-efficacy, they cannot tell you what a given population (or a given person) believes about performing a given behavior. Thus, one must go to members of that population to identify salient behavioral, normative, and efficacy beliefs. One must understand the behavior from the perspective of the population one is considering.
2.4 The Role of ‘External’ Variables
Finally, Fig. 1 also shows the role played by more traditional demographic, personality, attitudinal, and other individual difference variables (such as perceived risk). According to the model, these types of variables primarily play an indirect role in influencing behavior. For example, while men and women may hold different beliefs about performing some behaviors, they may hold very similar beliefs with respect to others. Similarly, rich and poor, old and young, those from developing and developed countries, those who do and do not perceive they are at risk for a given illness, those with favorable and unfavorable attitudes toward family planning, etc., may hold different attitudinal, normative, or self-efficacy beliefs with respect to one behavior but may hold similar beliefs with respect to another. Thus, there is no necessary relation between these ‘external’ or ‘background’ variables and any given behavior. Nevertheless, external variables such as cultural and personality differences should be reflected in the underlying belief structure.
Sexually Transmitted Diseases: Psychosocial Aspects
3. The Role of Theory in HIV/STD Prevention
Models like the one represented by Fig. 1 have served as the theoretical underpinning for a number of successful behavioral interventions in the HIV/STD arena (cf. Kalichman et al. 1996). For example, the US Centers for Disease Control and Prevention (CDC) have supported two large multisite studies based on the model in Fig. 1. The first, the AIDS Community Demonstration Projects (Fishbein et al. 1999), attempted to reach members of populations at risk for HIV/STD that were unlikely to come into contact with the health department. The second, Project RESPECT, was a multisite randomized controlled trial designed to evaluate the effectiveness of HIV/STD counseling and testing (Kamb et al. 1998). Project RESPECT asked whether prevention counseling or enhanced prevention counseling was more effective in increasing condom use and reducing incident STDs than standard education.

Although based on the same theoretical model, these two interventions were logistically very different. In one, the intervention was delivered ‘in the street’ by volunteer networks recruited from the community. In the other, the intervention was delivered one-on-one by trained counselors in an STD clinic. Thus, one involved community participation and mobilization while the other involved working within established public health settings. In addition, one was evaluated using the community as the unit of analysis while the other looked for behavior change at the individual level. Despite these logistic differences, both interventions produced highly significant behavioral change. In addition, in the clinic setting (where it was feasible to obtain biologic outcome measures), the intervention also produced a significant reduction in incident STDs.

The success of these two interventions is due largely to their reliance on established behavioral principles. More important, it appears that theory-based approaches that are tailored to specific populations and behaviors can be effective in changing STD/HIV-related behaviors in different cultures and communities (cf. NIH 1997).
4. The Relation between Behavioral and Biological Outcome Measures
Unfortunately, behavior change (and in particular, self-reported behavior change) is often viewed as insufficient evidence for disease prevention, and several investigators have questioned the validity of self-reports of behavior and their utility as outcome measures in HIV/STD prevention research. For example, Zenilman et al. (1995) investigated the relationship between STD clinic patients’ self-reported condom use and STD incidence. Patients coming to an STD clinic in Baltimore, Maryland, who agreed to participate in the study were examined for STDs upon entry and approximately 3 months later. At the time of the follow-up exam, participants were asked to report the number of times they had sex in the past month and the number of times they had used a condom while having sex. Those who reported 100 percent condom use were compared with those who reported sometime use or no use of condoms with respect to incident (or new) STDs. Zenilman et al. (1995) found no significant relationship between self-reported condom use and incident STDs. Based on this finding, they questioned the veracity of the self-reports and suggested that intervention studies using self-reported condom use as the primary outcome measure were at best suspect and at worst invalid. Such a view fails to recognize the complex relationship between behavioral and biological measures.

4.1 Transmission Dynamics Revisited
Consider again the May and Anderson (1987) model of transmission dynamics: R0 = βcD. It is important to recognize that the impact on the reproductive rate of a change in any one parameter will depend upon the values of the other two parameters. Thus, for example, if one attempted to lower the reproductive rate of HIV by reducing transmission efficiency (either by reducing STDs or by increasing condom use), the impact of such a reduction would depend upon both the prevalence of the disease in the population and the sexual mixing patterns in that population. Clearly, if there is no disease in the population, decreases (or increases) in transmission efficiency can have very little to do with the spread of the disease. Similarly, a reduction in STD rates or an increase in condom use in those who are at low risk of exposure to partners with HIV will have little or no impact on the epidemic.
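The point that the effect of changing one parameter depends on the other two follows directly from the multiplicative form of the model, and can be sketched numerically. The values below are hypothetical and serve only to show the pattern; the function names are likewise illustrative.

```python
def reproductive_rate(beta: float, c: float, d: float) -> float:
    """Product of infectivity, interaction rate, and duration of infectiousness."""
    return beta * c * d


def absolute_change(beta_before: float, beta_after: float,
                    c: float, d: float) -> float:
    """Drop in the reproductive rate when only beta changes."""
    return (reproductive_rate(beta_before, c, d)
            - reproductive_rate(beta_after, c, d))


# The same hypothetical halving of beta, applied to two groups that
# differ only in their (assumed) interaction rate c.
high_interaction = absolute_change(0.2, 0.1, c=6.0, d=2.0)
low_interaction = absolute_change(0.2, 0.1, c=0.5, d=2.0)

print(high_interaction, low_interaction)
```

Under these assumed values the identical reduction in transmission efficiency lowers the reproductive rate far more in the high-interaction group, which is the pattern the text describes for core groups versus those at low risk of exposure.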
In contrast, a reduction in STDs or an increase in condom use in those members of the population who are most likely to transmit and/or acquire HIV (that is, in the so-called core group) can, depending upon prevalence of the disease in the population, have a big impact on the epidemic (cf. Pequegnat et al. 2000).

To complicate matters further, it must also be recognized that changes in one parameter may directly or indirectly influence one of the other parameters. For example, at least some people have argued that an intervention program that successfully increased condom use could also lead to an increase in number of partners (perhaps because one now felt somewhat safer). If this were in fact the case, an increase in condom use would not necessarily lead to a decrease in the reproductive rate. In other words, the impact of a given increase (or decrease) in condom use on STD/HIV incidence will differ, depending upon the values of the other parameters in the model. In addition, it is important to recognize that condom use behaviors are very different with ‘safe’ than with ‘risky’ partners. For example, condoms are used much more frequently with ‘occasional’ or ‘new’ partners than with ‘main’ or ‘regular’ partners (see, e.g., Fishbein et al. 1999). Thus, one should not expect to find a simple linear relation (i.e., a correlation) between decreases in transmission efficiency and reductions in HIV seroconversions.

Moreover, it should be recognized that many other factors may influence transmission efficiency. For example, the degree of infectivity of the donor, characteristics of the host, and the type and frequency of sexual practices all influence transmission efficiency, and variations in these factors will also influence the nature of a relationship between increased condom use and the incidence of STDs (including HIV). In addition, although correct and consistent condom use can prevent HIV, gonorrhea, syphilis, and probably chlamydia, condoms are less effective in interrupting transmission of herpes and genital warts. Although one is always better off using a condom than not using a condom, the impact of condom use is expected to vary by disease. Moreover, for many STDs, transmission from men to women is much more efficient than from women to men. For example, with one unprotected coital episode with a person with gonorrhea, there is about a 50 to 90 percent chance of transmission from male to female, but only about a 20 percent chance of transmission from female to male.

It should also be noted that one can acquire an STD even if one always uses a condom. Consistent condom use is not necessarily correct condom use, and incorrect condom use and condom use errors occur with surprisingly high frequencies (Fishbein and Pequegnat 2000, Warner et al. 1998). In addition, at least some ‘new’ or incident infections may be ‘old’ STDs that initially went undetected or that did not respond to treatment. Despite these complexities, it is important to understand when, and under what circumstances, behavior change will be related to a reduction in STD incidence.
Unfortunately, it is unlikely that this will occur until behaviors are assessed more precisely and new (or incident) STDs can be more accurately identified. From a behavioral science or psychosocial perspective, the two most pressing problems and the greatest challenges are to assess correct, as well as consistent, condom use, and to identify those at high or low risk for transmitting or acquiring STDs (see HIV Risk Interventions).
4.2 Assessing Correct and Consistent Condom Use
Condom use is most often assessed by asking respondents how many times they have engaged in sex and then asking them to indicate how many of these times they used a condom. One will get very different answers to these questions depending upon the time frame used (lifetime, past year, past 3 months, past month, past week, last time), and the extent to which one distinguishes between the type of sex (vaginal, anal, or oral) and type of partner (steady, occasional, paying client). Irrespective of the time frame, type of partner, or type of sex, these two numbers (i.e., number of sex acts and number of times condoms were used) can be used in at least two very different ways.

In most of the literature (particularly the social psychological literature), the most common outcome measure is the percent of times the respondent reports condom use (i.e., for each subject, one divides the number of times condoms were used by the number of sex acts). Perhaps a more appropriate measure would be to subtract the number of times condoms were used from the number of sex acts. This would yield a measure of the number of unprotected sex acts. Clearly, if one is truly interested in preventing disease or pregnancy, it is the number of unprotected sex acts and not the percent of times condoms are used that should be the critical variable. Obviously, there is a difference in one’s risk of acquiring an STD if one has sex 1,000 times and uses a condom 900 times than if one has sex 10 times and uses a condom 9 times. Both of these people will have used a condom 90 percent of the time, but the former will have engaged in 100 unprotected sex acts while the latter will have engaged in only 1.

By considering the number of unprotected sex acts rather than the percent of times condoms are used, it becomes clear how someone who reports always using a condom can get an STD or become pregnant. To put it simply, consistent condom use is not necessarily correct condom use, and incorrect condom use almost always equates to unprotected sex.
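The difference between the two outcome measures can be made concrete with the two respondents from the text (1,000 sex acts with 900 condom uses, and 10 sex acts with 9 condom uses). The following is a minimal sketch; the function and variable names are illustrative, not taken from any of the cited studies.

```python
def percent_condom_use(sex_acts: int, condom_acts: int) -> float:
    """Percent of sex acts in which a condom was reported."""
    if sex_acts <= 0:
        raise ValueError("sex_acts must be positive")
    return 100.0 * condom_acts / sex_acts


def unprotected_acts(sex_acts: int, condom_acts: int) -> int:
    """Number of sex acts in which no condom was reported."""
    return sex_acts - condom_acts


# The two respondents described in the text, as (sex acts, condom uses).
respondents = {"frequent": (1000, 900), "infrequent": (10, 9)}

for label, (acts, condoms) in respondents.items():
    print(label,
          percent_condom_use(acts, condoms),
          unprotected_acts(acts, condoms))
```

Both respondents score 90 percent on the first measure, while the second measure separates them by a factor of 100. Note that this sketch counts only acts with no condom at all; incorrect use, which the text treats as effectively unprotected, would have to be measured separately and added.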
4.3 How Often Does Incorrect Condom Use Occur?
There is a great deal of incorrect condom use. For example, Warner et al. (1998) asked 47 sexually active male college students (between 18 and 29 years of age) to report the number of times they had vaginal intercourse and the number of times they used condoms in the last month. In addition, the subjects were asked to quantify the number of times they experienced several problems (e.g., breakage, slippage) while using condoms. Altogether, the 47 men used a total of 270 condoms in the month preceding the study. Seventeen percent of the men reported that intercourse was started without a condom, but they then stopped to put one on; 12.8 percent reported breaking a condom during intercourse or withdrawal; 8.5 percent started intercourse with a condom, but then removed it and continued intercourse; and 6.4 percent reported that the condom fell off during intercourse or withdrawal. Note that all of these people could have honestly reported always using a condom, and all could have transmitted and/or acquired an STD!

Similar findings were obtained from both men and women attending STD clinics in the US (Fishbein and Pequegnat 2000). For example, fully 34 percent of women and 36 percent of men reported condom breakage during the past 12 months, with 11 percent of women and 15 percent of men reporting condom breakage in the past 3 months. Similarly, 31 percent of the women and 28 percent of the men reported that a condom fell off in the past 12 months, while 8 percent of both men and women reported slippage in the past 3 months. Perhaps not surprisingly, women were significantly more likely than men to report condom leakage (17 percent vs. 9 percent). But again, the remarkable findings are the high proportions of both men and women reporting different types of condom mistakes. For example, 31 percent of the men and 36 percent of the women reported starting sex without a condom and then putting one on, while 26 percent of the men and 23 percent of the women reported starting sex with a condom and then taking it off and continuing intercourse. This probably reflects the difference between using a condom for family planning purposes and using one for the prevention of STDs including HIV. The practice of having some sex without a condom probably reflects incorrect beliefs about how one prevents pregnancy. Irrespective of the reason for these behaviors, the fact remains that all of these ‘errors’ could have led to the transmission and/or acquisition of an STD despite the fact that condoms had, in fact, been used.

In general, and within the constraints of one’s ability to accurately recall past events, people do appear to be honest in reporting their sexual behaviors, including their condom use. Behavioral scientists must obtain better measures, not of condom use per se, but of correct and consistent condom use, or perhaps even more important, of the number of unprotected sex acts in which a person engages.
4.4 Sex With ‘Safe’ and ‘Risky’ Partners
Whether one uses a condom correctly or not is essentially irrelevant as long as one is having sex with a safe (i.e., uninfected) partner. However, to prevent the acquisition (or transmission) of disease, correct and consistent condom use is essential when one has sex with a risky (i.e., infected) partner. Although it is not possible to always know whether another person is safe or risky, it seems reasonable to assume that those who have sex with either a new partner, an occasional partner, or a main partner who is having sex outside of the relationship are at higher risk than those who have sex only with a main partner who is believed to be monogamous. Consistent with this, STD clinic patients with potential high-risk partners (i.e., new, occasional, or nonmonogamous main partners) were significantly more likely to acquire an STD than those with potential low-risk partners (18.5 percent vs. 10.4 percent; Fishbein and Jarvis 2000).
One can also assess people’s perceptions that their partners put them at risk for acquiring HIV. For example, the STD clinic patients were also asked to indicate, on seven-place likely (7)/unlikely (1) scales, whether having unprotected sex with their main and/or occasional partners would increase their chances of acquiring HIV. Not surprisingly, those who felt it was likely that unprotected sex would increase their chances of acquiring HIV (i.e., those with scores of 5, 6, or 7) were, in fact, significantly more likely to acquire a new STD (19.4 percent) than those who perceived that their partner(s) did not put them at risk (i.e., those with scores of 1, 2, 3, or 4; STD incidence = 12.6 percent).

Somewhat surprisingly, clients’ perceptions of the risk status of their partners were not highly correlated with whether they were actually with a potentially safe or dangerous partner (r = .11, p < .001). Nevertheless, both risk scores independently relate to STD acquisition. More specifically, those who were ‘hi’ on both actual and perceived risk were almost four times as likely to acquire a new STD as those who were ‘lo’ on both actual and perceived risk. Those with mixed patterns (i.e., hi/lo or lo/hi) were intermediate in STD acquisition (Fishbein and Jarvis 2000). So, it does appear possible not only to find behavioral measures that distinguish between those who are more or less likely to acquire a new STD but, equally important, to identify measures that distinguish between those who are, or are not, having sex with partners who are potentially placing them at high risk for acquiring an STD.
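The two incidence comparisons reported in this section (18.5 vs. 10.4 percent by partner type, and 19.4 vs. 12.6 percent by perceived risk) can be restated as simple ratios of incidence. The sketch below does only that arithmetic; the function and variable names are illustrative, and the ratios are not figures reported by Fishbein and Jarvis (2000) themselves.

```python
def incidence_ratio(higher_risk_pct: float, lower_risk_pct: float) -> float:
    """Ratio of STD incidence in the higher-risk group to the lower-risk group."""
    if lower_risk_pct <= 0:
        raise ValueError("lower_risk_pct must be positive")
    return higher_risk_pct / lower_risk_pct


# Incidence percentages quoted in the text.
by_partner_type = incidence_ratio(18.5, 10.4)    # 'actual' risk measure
by_perceived_risk = incidence_ratio(19.4, 12.6)  # perceived risk measure

print(round(by_partner_type, 2), round(by_perceived_risk, 2))
```

Each measure taken alone corresponds to a ratio well below the almost fourfold difference the text reports for those high versus low on both measures, which is consistent with the statement that the two risk scores relate to STD acquisition independently.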
4.5 STD Incidence as a Function of Risk and Correct and Consistent Condom Use
If this combined measure of actual and perceived risk is ‘accurate’ in identifying whether or not a person is having sex with a risky (i.e., an infected) or a safe (i.e., a noninfected) partner, condom use should make no difference with low-risk partners, but correct and consistent condom use should make a major difference with high-risk partners. More specifically, correct and consistent condom use should significantly reduce STD incidence among those having sex with potential high-risk partners. Consistent with this, while condom use was unrelated to STD incidence among those with low-risk partners, correct and consistent condom use did significantly reduce STD incidence among those at high risk.

Not only are these findings important for understanding the relationship between condom use and STD incidence, but they provide evidence for both the validity and utility of self-reported behaviors. In addition, these data provide clear evidence that if behavioral interventions significantly increase correct and consistent condom use among people who perceive they are at risk and/or who have a new, an occasional, or a nonmonogamous main partner, there will be a significant reduction in STD incidence. On the other hand, if, among this high-risk group, we only increase consistency of use without increasing correct use and/or if we only increase condom use among those at low risk, then we will see little or no reduction in STD (or HIV) incidence.
5. Future Directions
Clearly it is time to stop using STD incidence as a ‘gold standard’ to validate behavioral self-reports and to start paying more attention to understanding the relationships between behavioral and biological outcome measures. It is also important to continue to develop theory-based, culturally sensitive interventions to change a number of STD/HIV-related behaviors. While much of the focus to date has been on increasing consistent condom use, interventions are needed to increase correct condom use, to increase the likelihood that people will come in for screening and early treatment, and, for those already infected, to increase adherence to their medical regimens. Indeed, it does appear that we now know how to change behavior, and that under appropriate conditions, behavior change will lead to a reduction in STD incidence. While many investigators have called for ‘new’ theories of behavior change, ‘new’ theories are probably unnecessary. What is needed, however, is for investigators and interventionists to better understand and correctly utilize existing, empirically supported theories in developing and evaluating behavior change interventions.

See also: Health Behavior: Psychosocial Theories; HIV Risk Interventions; Self-efficacy and Health; Sexual Risk Behaviors; Vulnerability and Perceived Susceptibility, Psychology of
Bibliography
Aral S O, Holmes K K 1999 Social and behavioral determinants of the epidemiology of STDs: Industrialized and developing countries. In: Holmes K K, Sparling P F, Mardh P-A, Lemon S M, Stamm W E, Piot P, Wasserheit J N (eds.) Sexually Transmitted Diseases. McGraw-Hill, New York
Fishbein M, Higgins D L, Rietmeijer C 1999 Community-level HIV intervention in five cities: Final outcome data from the CDC AIDS community demonstration projects. American Journal of Public Health 89(3): 336–45
Fishbein M, Bandura A, Triandis H C, Kanfer F H, Becker M H, Middlestadt S E 1992 Factors Influencing Behavior and Behavior Change: Final Report, Theorist’s Workshop. National Institute of Mental Health, Rockville, MD
Fishbein M, Jarvis B 2000 Failure to find a behavioral surrogate for STD incidence: What does it really mean? Sexually Transmitted Diseases 27: 452–5
Fishbein M, Pequegnat W 2000 Using behavioral and biological outcome measures for evaluating AIDS prevention interventions: A commentary. Sexually Transmitted Diseases 27(2): 101–10
Kalichman S C, Carey M P, Johnson B T 1996 Prevention of sexually transmitted HIV infection: A meta-analytic review of the behavioral outcome literature. Annals of Behavioral Medicine 18(1): 6–15
Kamb M L, Fishbein M, Douglas J M, Rhodes F, Rogers J, Bolan G, Zenilman J, Hoxworth T, Mallotte C K, Iatesta M, Kent C, Lentz A, Graziano S, Byers R H, Peterman T A, the Project RESPECT Study Group 1998 HIV/STD prevention counseling for high-risk behaviors: Results from a multicenter, randomized controlled trial. Journal of the American Medical Association 280(13): 1161–7
May R M, Anderson R M 1987 Transmission dynamics of HIV infection. Nature 326: 137–42
NIH Consensus Development Panel 1997 Statement from consensus development conference on interventions to prevent HIV risk behaviors. NIH Consensus Statement 15(2): 1–41
Pequegnat W, Fishbein M, Celantano D, Ehrhardt A, Garnett G, Holtgrave D, Jaccard J, Schachter J, Zenilman J 2000 NIMH/APPC workgroup on behavioral and biological outcomes in HIV/STD prevention studies: A position statement. Sexually Transmitted Diseases 27(3): 127–32
von Haeften I, Fishbein M, Kaspryzk D, Montano D 2000 Acting on one’s intentions: Variations in condom use intentions and behaviors as a function of type of partner, gender, ethnicity and risk. Psychology, Health & Medicine 5(2): 163–71
Warner L, Clay-Warner J, Boles J, Williamson J 1998 Assessing condom use practices: Implications for evaluating method and user effectiveness. Sexually Transmitted Diseases 25(6): 273–7
Zenilman J M, Weisman C S, Rompalo A M, Ellish N, Upchurch D M, Hook E W, Clinton D 1995 Condom use to prevent incident STDs: The validity of self-reported condom use. Sexually Transmitted Diseases 22(1): 15–21
M. Fishbein
Shamanism
Shamanism is a tradition of part-time religious specialists who establish and maintain personalistic relations with specific spirit beings through the use of controlled and culturally scripted altered states of consciousness (ASC). Shamans employ powers derived from spirits to heal sickness, to guide the dead to their final destinations, to influence animals and forces of nature in a way that benefits their communities, to initiate assaults on enemies, and to protect their own communities from external aggression. Shamans exercise mastery over ASC and use them as a means to the culturally approved end of mediating between human, animal, and supernatural worlds. Shamans draw upon background knowledge, conveyed through myth and ritual, which renders intelligible the potentially chaotic experience of ASC. The criterion of control helps to distinguish shamanism from the use of ASC in other traditions.

Shamanism has long been a subject of inquiry and controversy in diverse academic disciplines, with many hundreds of accounts of shamanic practices published by the early 1900s. It has also been a topic of spiritual interest to the wider public in Europe and North America for the past several decades. Depending on the perspective taken, shamanism is either the most archaic and universal form of human spirituality or a culturally distinct religious complex with historical roots in Siberia and a path of diffusion into North and South America. The latter view is adopted in the following.
1. Siberian Origin and Common Features
The term ‘shaman’ is drawn from seventeenth-century Russian sources reporting on an eastern Siberian people, the Evenk (Tungus). In the classic work on the subject by the historian of religion Mircea Eliade, ‘shamanism’ is used to refer to a complex of beliefs and practices diffused from northern Asia to societies in central and eastern Asia, as well as through all of North and South America. Eliade’s comparative work specifies a core constellation of ideas common to shamanic traditions. The classic shamanic initiation involves the novice’s selection (frequently unwanted and resisted) by a spirit or spirits, and a traumatic and dangerous series of ordeals, followed by a death and rebirth that sometimes involves violent dismembering and subsequent reconstitution of the fledgling shaman’s body. The archetypal shamanic cosmology is vertically tiered, with earth occupying the middle level and a cosmic tree or world mountain serving as a connecting path for shamans to travel to other cosmic planes (up or down) in pursuit of their ‘helping spirits.’

Shamans use a variety of techniques to enter the ASC in which they communicate with their spirit helpers: sensory deprivation (e.g., fasting, meditation), repetitive drumming and/or dancing, and ingestion of substances with psychoactive properties (e.g., plant hallucinogens). The last of these means is particularly important in the shamanism of Central and South America, where an impressive array of plants and animal-derived chemicals have been used for their hallucinogenic properties (e.g., peyote, datura, virola, and poison from the Bufo marinus toad). Among the most commonly reported characteristics of the shaman’s ASC are transformations into animals, and flights to distant places. Shamanic animals include the jaguar (lowland South America), the wolf (North American Chukchee), and bears, reindeer and fish (Lapp).
Shamanic flights are undertaken to recapture the wandering souls of the sick, to intercede on behalf of hunters with spirits that control animal species, and to guide the dead to their final destination. The fact that shamans are often called upon to heal the sick should not lead to the conclusion that they are always concerned with the well-being of their fellows. While the Hippocratic Oath of Western medicine prohibits physicians from doing harm, shamans frequently engage in actions meant to sicken, if not kill, their adversaries. This is an important corrective to romanticized images of the shamanic vocation.
2. Popular and Professional ‘Shamanism’

Beginning in the 1960s, professional and popular interests in shamanism converged. Inspired by the experimentation with hallucinogenic drugs on the part of American and European youth, attention was drawn to the physiology and psychology of altered states of consciousness. In academic circles, this led to speculations about the neurochemical foundations of hallucinatory experiences and their potential therapeutic benefits. Investigations were undertaken into the imagery of shamanic states of consciousness to determine how much could be attributed to universal, physiologically related sensory experiences. One often-cited collection of essays, published in the journal Ethos (Prince 1982), sought to link the shaman’s ASC to the production of the naturally occurring opiates known as endorphins. Scholars argued that this might account for the healing effect of shamanic therapies. In the popular cultures of North America and Europe, stress came to be placed on shamanism as an avenue to self-exploration, a means by which persons from any culture could advance their quest for spiritual understandings. The legacy of this development is to be found in the forms of ‘neoshamanism’ among New Age religions, which draw eclectically from more traditional shamanic practices. Native North and South American societies have been special sources of inspiration for practitioners of neoshamanism, but not always with the willing support of the indigenous people. At the same time, and in response to the evolution of research in symbolic anthropology, anthropologists have contributed increasingly sophisticated ethnographic analyses of the metaphysical underpinnings of shamanism in specific societies.
These anthropologists concentrate their attention on the complexities of shamanic cosmologies and the relationships between shamanic symbolism and crucial features of the natural world (e.g., important animal and plant species, meteorological phenomena, and geographic landmarks). An especially detailed analysis of a shamanic tradition comes from the Warao of the Orinoco Delta in Venezuela. The German-born and American-trained anthropologist Johannes Wilbert documents the mythic cosmology that supports the work of three distinct kinds of shamans, each of which is responsible for the care and feeding of specific gods. The highest ranking shaman, the Wishiratu (‘Master of Pains’ or Priest Shaman), responds when the Kanobos (Supreme Beings) send sickness to a Warao village. The Bahanarotu is also a curer, but has a special responsibility for a fertility cult centered on an astonishingly complex supernatural location, the House of Tobacco Smoke, where four insect couples and a blind snake play a game of chance under the watchful eye of a swallow-tailed kite. The Hoarotu (Dark Shaman) has the unpleasant but essential task of feeding human flesh to the Scarlet Macaw, the God of the West. The preferred food of the other gods is tobacco smoke, which is also the vehicle by which Warao shamans enter ASC: they hyperventilate on long cigars made of a strong black leaf tobacco.
3. Research Directions

3.1 Dead Ends and Detours

There have been a number of dead ends and detours in shamanism studies. An unfortunate amount of energy was spent in fruitless discussions of the mental health of shamans, that is, of whether or not their abnormal behavior justified the application of psychiatric diagnoses (e.g., schizophrenia). Most scholars now recognize that the culturally patterned nature of shamans’ behavior and the positive value placed on their social role make the application of mental illness labels inappropriate. Another distraction came when cultural evolutionists hypothesized that shamanism was an intermediary stage in the developmental sequence of religious forms, between magic and institutionalized religion. Attempts to treat shamanism exclusively as an archaic expression of human religiosity founder on the perseverance and even revitalization of shamanic traditions in cultures around the world. For example, the remarkable resiliency of shamanic practices is evident in the continuity between contemporary curanderismo (curing) on Peru’s north coast and shamanic iconography dating to the pre-Hispanic Chavin culture (approx. 900–200 BC).

3.2 Positive Directions

A still limited amount of research has been directed toward the important question of how effective shamanic treatments are. This work has been troubled by serious theoretical issues (e.g., what constitutes ‘efficacy’) and thorny methodological questions (e.g., is the double-blind, randomized trial possible under the unusual circumstances of shamanic rituals?). The previously cited endorphin theory has not been confirmed ethnographically. Another approach has adapted psychiatric questionnaires to interviews with shamanic patients before and after treatment. A fruitful line of investigation has focused on the relationship between shamanism and culturally constructed notions of gender.
Shamans sometimes ‘bend’ gender roles (e.g., transvestitism) and the sicknesses they treat can be entangled in gender-based conflicts. Some scholars suggest that where women dominate in the shaman role there is a less adversarial model of supernatural mediation than with male shamans. The best examples of gender-focused research come from East Asia (especially Korea), where the majority of shamans are women, and from the Chilean Mapuche, among whom female and cross-dressing male shamans have predominated since at least the sixteenth century. A final body of research worth noting focuses on the capacity of shamanic traditions to survive and thrive even under the most disruptive social and political conditions. Agents of culture change, whether they are Soviet-era ideologues in Siberia or Christian missionaries in the Amazon, have been repeatedly frustrated in their attempts to obliterate shamanic practices. In recent decades, shamans have even become central figures in the politics of ethnicity and in antidevelopment protests. Scholars have shown that shamans accomplish this survival feat not by replicating ancient traditions, but by continuously reinventing them in the light of new realities and competing symbolic structures. While this strategy may occasion debates about what constitutes an ‘authentic’ shaman, it is nevertheless the key to shamanism’s continuing success.
4. A Plea for Terminological Precision

The most serious threat to the academic study of shamanism lies in the broad application of the term to any religiously inspired trance form. To so dilute the concept as to make it applicable to !Kung trance dancers, New Age spiritualists, and Mexican Huichol peyote pilgrims is to render it meaningless. The value of a scientific term is that it groups together phenomena that are alike in significant regards, while distinguishing those that are different. The promiscuous use of the term shaman will ultimately leave it as generic as ‘spiritual healer,’ and just as devoid of analytic value.

See also: Alternative and Complementary Healing Practices; Healing; Religion and Health; Spirit Possession, Anthropology of
Bibliography

Atkinson J M 1992 Shamanisms today. Annual Review of Anthropology 21: 307–30
Bacigalupo A M Shamans of the Cinnamon Tree, Priestesses of the Moon: Gender and healing among the Chilean Mapuche. Manuscript in preparation
Brown M F 1989 Dark side of the shaman. Natural History 11: 8–10
Eliade M 1972 Shamanism: Archaic Techniques of Ecstasy. Princeton University Press, Princeton, NJ
Furst P T (ed.) 1972 Flesh of the Gods: The Ritual Use of Hallucinogens. Praeger, New York
Hultkrantz A 1992 Shamanic Healing and Ritual Drama. Crossroad, New York
Joralemon D 1990 The selling of the shaman and the problem of informant legitimacy. Journal of Anthropological Research 46: 105–18
Joralemon D, Sharon D 1993 Sorcery and Shamanism: Curanderos and Clients in Northern Peru. University of Utah, Salt Lake City, UT
Kalweit H 1988 Dreamtime & Inner Space: The World of the Shaman, 1st edn. Shambhala, Boston
Kendall L 1985 Shamans, Housewives, and Other Restless Spirits: Women in Korean Ritual Life. University of Hawaii Press, Honolulu, HI
Kleinman A, Sung L H 1979 Why do indigenous practitioners successfully heal? Social Science & Medicine 13(1B): 7–26
Peters L G, Price-Williams D 1980 Towards an experiential analysis of shamanism. American Ethnologist 7: 398–418
Prince R 1982 Shamans and endorphins: Hypotheses for a synthesis. Ethos 10(4): 409–23
Reichel-Dolmatoff G 1975 The Shaman and the Jaguar. Temple University Press, Philadelphia, PA
Shirokogoroff S M 1935 The Psychomental Complex of the Tungus. Kegan Paul, Trench, Trubner, London
Silverman J 1967 Shamanism and acute schizophrenia. American Anthropologist 69: 21–31
Sullivan L E 1988 Icanchu’s Drum: An Orientation to Meaning in South American Religions. Macmillan, New York
Wilbert J 1993 Mystic Endowment: Religious Ethnography of the Warao Indians. Harvard University Press, Cambridge, MA
D. Joralemon
Shame and the Social Bond

Many theorists have at least implied that emotions are a powerful force in social process. Although Weber didn’t refer to emotions directly, his emphasis on values implies it, since values are emotionally charged beliefs. Especially in his later works, Durkheim proposed that collective sentiments created social solidarity through moral community. G. H. Mead proposed emotion as an important ingredient in his social psychology. For Parsons it is a component of social action in his AGIL scheme (Parsons and Shils 1951). Marx implicated emotions in class tensions and in the solidarity of rebelling classes. Durkheim proposed that ‘… what holds a society together, the ‘glue’ of solidarity, and [Marx implied that] what mobilizes conflict, the energy of mobilized groups, are emotions’ (Collins 1990). But even the theorists who dealt with emotions explicitly, Durkheim, Mead, and Parsons, did not develop concepts of emotion, investigate their occurrence, or collect emotion data. Their discussions of emotion, therefore, have not borne fruit. The researchers whose work is reviewed here took the step of investigating a specific emotion.
1. Seven Pioneers in the Study of Social Shame

Five of the six sociologists reviewed acted independently of each other. In the case of Elias and Sennett, their discovery of shame seems forced upon them by their data. Neither Simmel nor Cooley defines what he means by shame. Goffman only partially defined embarrassment. The exception is Helen Lynd, who was self-conscious about shame as a concept. Lynd’s book on shame was contemporaneous with Goffman’s first writings on embarrassment and realized their main point: face-work meant avoiding embarrassment and shame. Helen Lewis’s empirical work on shame (1971) was strongly influenced by Lynd’s book. She also was sophisticated in formulating a concept of shame, and in using systematic methods to study it. Sennett’s work involved slight outside influence. He approvingly cited the Lynd book on shame in The Hidden Injuries of Class (1972), and his Authority (1980) has a chapter on shame.

1.1 Simmel: Shame and Fashion

Shame plays a significant part in only one of Simmel’s essays, on fashion (1904). People want variation and change, he argued, but they also anticipate shame if they stray from the behavior and appearance of others. Fashion is the solution to this problem, since one can change along with others, avoiding being isolated, and therefore shame (p. 553). Simmel’s idea about fashion implies conformity in thought and behavior among one group in a society, the fashionable ones, and distance from another, those who do not follow fashion, relating shame to social bonds. There is a quality to Simmel’s treatment of shame that is somewhat difficult to describe, but needs description, since it characterizes most of the other sociological treatments reviewed here. Simmel’s use of shame is casual and unselfconscious. His analysis of the shame component in fashion occurs in a single long paragraph. Shame is not mentioned before or after. He doesn’t conceptualize shame or define it, seeming to assume that the reader will know the meaning of the term. Similar problems are prominent in Cooley, Elias, Sennett, and Goffman. Lynd and Lewis are exceptions, since they both attempted to define shame and locate it with respect to other emotions.
1.2 Cooley: Shame and the Looking Glass Self

Cooley (1922), like Simmel, was direct in naming shame. For Cooley, shame and pride both arose from self-monitoring, the process that was at the center of his social psychology. His concept of ‘the looking glass self,’ which implies the social nature of the self, refers directly and exclusively to pride and shame. But he made no attempt to define either emotion. Instead he used the vernacular words as if they were self-explanatory. To give just one example of the ensuing confusion: in English and other European languages, the word pride used without qualification usually has an inflection of arrogance or hubris (pride goeth before the fall). In current usage, in order to refer to the kind of pride implied in Cooley’s analysis, the opposite of shame, one must add a qualifier like justified or genuine. Using undefined emotion words is confusing. However, Cooley’s analysis of self-monitoring suggests that pride and shame are the basic social emotions. His formulation of the social basis of shame in self-monitoring can be used to amend Mead’s social psychology. Perhaps the combined Mead–Cooley formulation can solve the inside–outside problem that plagues psychoanalytic and other psychological approaches to shame, as I suggest below.

1.3 Elias: Shame in the Civilizing Process

Elias undertook an ambitious historical analysis of what he calls the ‘civilizing process’ (1994). He traced changes in the development of personality and social norms from the fifteenth century to the present. Like Weber, he gave prominence to the development of rationality. Unlike Weber, however, he gave equal prominence to emotional change, particularly to changes in the threshold of shame: ‘No less characteristic of a civilizing process than “rationalization” is the peculiar molding of the drive economy that we call “shame” and “repugnance” or “embarrassment”.’ Using excerpts from advice manuals, Elias outlined a theory of modernity. By examining advice concerning etiquette, especially table manners, body functions, sexuality, and anger, he suggests that a key aspect of modernity involved a veritable explosion of shame. Elias showed that there was much less shame about manners and emotions in the early part of the period he studied than there was in the nineteenth century. In the eighteenth century, a change began occurring in advice on manners. What was said openly and directly earlier begins only to be hinted at, or left unsaid entirely. Moreover, justifications are offered less. One is mannerly because it is the right thing to do.
Any decent person will be courteous; the intimation is that bad manners are not only wrong but also unspeakable, the beginning of repression. The change that Elias documents is gradual but relentless; by a continuing succession of small decrements, etiquette books fall silent about the reliance of manners, style, and identity on respect, honor, and pride, and avoidance of shame and embarrassment. By the end of the eighteenth century, the social basis of decorum and decency had become virtually unspeakable. Unlike Freud or anyone else, Elias documents, step by step, the sequence of events that led to the repression of emotions in modern civilization. By the nineteenth century, Elias proposed, manners are inculcated no longer by way of adult to adult verbal discourse, in which justifications are offered. Socialization shifts from slow and conscious changes by adults over centuries to swift and silent indoctrination of children in their earliest years. No justification is offered to most children; courtesy has become absolute. Moreover, any really decent person would not have to be told. In modern societies, socialization automatically inculcates and represses shame.

1.4 Richard Sennett: Is Shame the Hidden Injury of Class?

Although The Hidden Injuries of Class (1972) carries a powerful message, it is not easy to summarize. The narrative consists of quotes from interviews and the authors’ brief interpretations. They do not devise a conceptual scheme and a systematic method. For this reason, readers are required to devise their own conceptual scheme, as I do here. The book is based on participant observation in communities, schools, clubs, and bars, and 150 interviews with white working class males, mostly of Italian or Jewish background, in Boston for one year beginning in July of 1969 (pp. 40–1). The hidden injuries that Sennett and Cobb discovered might be paraphrased as follows. First, their working class men felt that, because of their class position, they were not accorded the respect that they should have received from others, particularly from their teachers, bosses, and even from their own children. That is, these men have many complaints about their status. Secondly, these men also felt that their class position was at least partly their own fault. Sennett and Cobb imply that social class is responsible for both injuries. They believe that their working men did not get the respect they deserved because of their social class, and that the second injury, lack of self-respect, is also the fault of class, rather than the men’s own fault, as most of them thought. Sennett and Cobb argue that in US society, respect is largely based on individual achievement, the extent that one’s accomplishments provide a unique identity that stands out from the mass of others. The role of public schools in the development of abilities forms a central part of Sennett and Cobb’s argument.
Their informants lacked self-respect, the authors thought, because the schooling of working class boys did not develop their individual talents in a way that would allow them to stand out from the mass as adults. In the language of emotions, they carry a burden of feelings of rejection and inadequacy, which is to say chronic low self-esteem (shame). From their observations of schools, Sennett and Cobb argue that teachers single out for attention and praise only a small percentage of the students, usually those who are talented or closest to middle-class. This praise and attention allows the singled-out students to develop their potential for achievement. The large majority of the boys, however, are ignored and, in subtle ways, rejected. There are a few working class boys who achieve their potential through academic or athletic talent. But the large mass does not. For them, rather than opening up the world, public schools close it off. Education, rather than becoming a source of growth, provides only shame and rejection. For the majority of students, surviving school means running a gauntlet of shame. These students learn by the second or third grade that it is better to be silent in class than to risk the humiliation of a wrong answer. Even students with the right answers must deal with having the wrong accent, clothing, or physical appearance. For most students, schooling is a vale of shame.

1.5 Helen Lynd: Shame and Identity

During her lifetime, Helen Lynd was a well-known sociologist. With her husband, Robert, she published the first US community studies, Middletown and Middletown in Transition. But Lynd was also profoundly interested in developing an interdisciplinary approach to social science. In her study On Shame and the Search for Identity (1961), she dealt with both the social and psychological sides of shame. She also clearly named the emotion of shame and its cognates, and located her study within previous scholarship, especially psychoanalytic studies. But Lynd also modified and extended the study of shame by developing a concept, and by integrating its social and psychological components. In the first two chapters, Lynd introduced the concept of shame, using examples from literature to clarify each point. In the next section, she critiques mainstream approaches in psychology and the social sciences. She then introduces ideas from lesser known approaches, showing how they might resolve some of the difficulties. Finally, she has an extended discussion of the concept of identity, suggesting that it might serve to unify the study of persons by integrating the concepts of self, ego, and social role under the larger idea of identity. Lynd’s approach to shame is much more analytical and self-conscious than that of the other sociologists reviewed here. They treated shame as a vernacular word.
For them, shame sprang out of their data, unavoidable. But Lynd encounters shame deliberately, as part of her exploration of identity. Lynd explains that shame and its cognates get left out because they are deeply hidden, but at the same time pervasive. She makes this point in many ways, particularly in the way she carefully distinguishes shame from guilt. One idea that Lynd develops is profoundly important for a social theory of shame and the bond, that sharing one’s shame with another can strengthen the relationship: ‘The very fact that shame is an isolating experience also means that … sharing and communicating it … can bring about particular closeness with other persons’ (Lynd 1961, p. 66). In another place, Lynd went on to connect the process of risking the communication of shame with the kind of role-taking that Cooley and Mead had described: ‘communicating shame can be an experience of … entering into the mind and feelings of another person’ (p. 249). Lynd’s idea about the effects of communicating and not communicating shame was pivotal for Lewis’s (1971) concepts of acknowledged and unacknowledged shame, and their relationship to the state of the social bond, as outlined below.

1.6 Goffman: Embarrassment and Shame in Everyday Life

Although shame goes largely unnamed in Goffman’s early work, embarrassment and the avoidance of embarrassment form the central thread. Goffman’s Everyperson is always desperately worried about his image in the eyes of the other, trying to present herself with her best foot forward to avoid shame. This work elaborates, and indeed, fleshes out, Cooley’s abstract idea of the way in which the looking glass self leads directly to pride or shame. Interaction Ritual (1967) made two specific contributions to shame studies. In his study of face work, Goffman states what may be seen as a model of ‘face’ as the avoidance of embarrassment, and losing face as suffering embarrassment. This is an advance, because it offers readily observable markers for empirical studies of face. The importance of this idea is recognized, all too briefly, at the beginning of Brown and Levinson’s (1987) study of politeness behavior. Goffman’s second contribution to the study of shame was made in a concise essay on the role of embarrassment in social interaction (1967). Unlike any of the other shame pioneers in sociology, he begins the essay with an attempt at definition. His definition is a definite advance, but it also foretells a limitation of the whole essay, since it is behavioral and physiological, ignoring inner experience. Framing his analysis in what he thought of as purely sociological mode, Goffman omitted feelings and thoughts. His solution to the inside–outside problem was to ignore most of inner experience, just as Freud ignored most of outside events.
However, Goffman affirms Cooley’s point on the centrality of the emotions of shame and pride in normal, everyday social relationships: ‘One assumes that embarrassment is a normal part of normal social life, the individual becoming uneasy not because he is personally maladjusted but rather because he is not … embarrassment is not an irrational impulse breaking through socially prescribed behavior, but part of this orderly behavior itself’ (1967, pp. 109, 111). Even Goffman’s partial definition of the state of embarrassment represents an advance. One of the most serious limitations of current contributions to the sociology of emotions is the lack of definitions of the emotions under discussion. Much like Cooley, Elias, and Sennett, Kemper (1978) offers no definitions of emotions, assuming that they go without saying. Hochschild (1983) attempts to conceptualize various emotions in an appendix, but doesn’t go as far as to give concrete definitions of emotional states. Only in Retzinger (1991, 1995) can conceptual and operational definitions of the emotions of shame and anger be found.
2. Lewis’s Discovery of Unacknowledged Shame

Helen Lewis’s book on shame (1971) involved an analysis of verbatim transcripts of hundreds of psychotherapy sessions. She encountered shame because she used a systematic method for identifying emotions, the Gottschalk–Gleser method (Gottschalk et al. 1969, Gottschalk 1995), which involves use of long lists of keywords that are correlated with specific emotions. Lewis found that anger, fear, grief, and anxiety cues showed up from time to time in some of the transcripts. She was surprised by the massive frequency of shame cues. Her most relevant findings:

(a) Prevalence: Lewis found a high frequency of shame markers in all the sessions, far outranking markers of all other emotions combined.
(b) Lack of awareness: Lewis noted that patient or therapist almost never referred to shame or its near cognates. Even the word embarrassment was seldom used.

In analyzing the context in which shame markers occurred, Lewis identified a specific context: situations in which the patient seemed to feel distant from, rejected, criticized, or exposed by the therapist. However, the patients showed two different, seemingly opposite responses in the shame context. In one, the patient seemed to be suffering psychological pain, but failed to identify it as shame. Lewis called this form overt, undifferentiated shame. In a second kind of response, the patient seemed not to be in pain, revealing an emotional response only by rapid, obsessional speech on topics that seemed somewhat removed from the dialogue. Lewis called this second response bypassed shame.
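The keyword-based procedure described above can be illustrated with a minimal sketch. It assumes only what the text states, namely that lists of keywords are associated with specific emotions and that their occurrences in a transcript are counted. The cue lists, the function name count_emotion_cues, and the sample sentence are hypothetical and are not taken from the Gottschalk–Gleser scales, which are considerably more elaborate.

```python
import re
from collections import Counter

# Hypothetical cue lists, for illustration only.
# These are NOT the actual Gottschalk-Gleser content analysis scales.
EMOTION_CUES = {
    "shame": {"ashamed", "embarrassed", "foolish", "ridiculous", "humiliated"},
    "anger": {"angry", "furious", "annoyed", "resentful"},
    "fear": {"afraid", "scared", "frightened"},
}


def count_emotion_cues(transcript, cues=EMOTION_CUES):
    """Count how many words in the transcript match each emotion's cue list."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", transcript.lower())
    # Start every emotion at zero so absent emotions are still reported.
    counts = Counter({emotion: 0 for emotion in cues})
    for word in words:
        for emotion, keywords in cues.items():
            if word in keywords:
                counts[emotion] += 1
    return counts


# Hypothetical transcript fragment.
sample = "I felt foolish, really ridiculous. I was annoyed, then ashamed of it."
cue_counts = count_emotion_cues(sample)
print(cue_counts)
```

A tally of this kind only flags candidate markers. Identifying the context in which they occur, as Lewis did, still requires reading the surrounding dialogue.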
3. Shame, Anger, and Conflict

In her transcripts, Lewis found many episodes of shame that extended over long periods of time. Since emotions are commonly understood to be brief signals (a few seconds) that alert us for action, the existence of long-lasting emotions is something of a puzzle. Lewis’s solution to this puzzle may be of great interest in the social sciences, since it provides an emotional basis for longstanding hostility, withdrawal, or alienation. She argued that her subjects often seemed to have emotional reactions to their emotions, and that this loop may extend indefinitely. She called these reactions ‘feeling traps.’ The trap that arose most frequently in her data involved shame and anger. A patient interprets an expression by the therapist as hostile, rejecting or critical, and responds with shame or embarrassment. However, the patient instantaneously masks the shame with anger, then is ashamed of being angry.
Apparently, each emotion in the sequence is brief, but the loop can go on forever. This proposal suggests a new source of protracted conflict and alienation, one hinted at in Simmel’s treatment of conflict. Although Lewis didn’t discuss other kinds of spirals, there is one that may be as important as the shame–anger loop. If one is ashamed of being ashamed, it is possible to enter into a shame–shame loop that leads to silence and withdrawal. Elias’s work on modesty implies this kind of loop.
4. Shame and the Social Bond

Finally, Lewis interpreted her findings in explicitly social terms. She proposed that shame arises when there is a threat to the social bond, as was the case in all of the shame episodes she discovered in the transcripts. Every person, she argued, fears social disconnection from others. Lewis’s solution to the outside–inside problem parallels and advances the Darwin–Mead–Cooley definition of the social context of shame. She proposed that shame is a bodily and/or mental response to the threat of disconnection from the other. Shame, she argued, can occur in response to threats to the bond from the other, but it can also occur in response to actions in the ‘inner theatre,’ in the interior monologue in which we see ourselves from the point of view of others. Her reasoning fits Cooley’s formulation of shame dynamics, and also Mead’s (1934) more general framework: the self is a social construction, a process constructed from both external and internal social interaction, in role-playing and role-taking.
5. Shame as the Social Emotion

Drawing upon the work of these pioneers, it is possible to take further steps toward defining shame. By shame, I mean a large family of emotions that includes many cognates and variants, most notably embarrassment, humiliation and related feelings such as shyness, that involve reactions to rejection or feelings of failure or inadequacy. What unites all these cognates is that they involve the feeling of a threat to the social bond. That is, I use a sociological definition of shame, rather than the more common psychological one (perception of a discrepancy between ideal and actual self). If one postulates that shame is generated by a threat to the bond, no matter how slight, then a wide range of cognates and variants follow: not only embarrassment, shyness, and modesty, but also feelings of rejection or failure, and heightened self-consciousness of any kind. Note that this definition usually subsumes the psychological one, since most ideals are social, rather than individual. If, as proposed here, shame is a result of threat to the bond, shame would be the most social of the basic emotions. Fear is a signal of danger to the body, anger a signal of frustration, and so on. The sources of fear and anger, unlike shame, are not uniquely social. Grief also has a social origin, since it signals the loss of a bond. But bond loss is not a frequent event. Shame, on the other hand, involves even a slight threat to the bond and so, following Goffman, is pervasive in virtually all social interaction. As Goffman’s work suggests, all human beings are extremely sensitive to the exact amount of deference they are accorded. Even slight discrepancies generate shame or embarrassment. As Darwin (1872) noted, the discrepancy can even be in the positive direction; too much deference can generate the embarrassment of heightened self-consciousness. Especially important for social control is a positive variant, a sense of shame. That is, shame figures in most social interaction because members may only occasionally feel shame, but they are constantly anticipating it, as Goffman implied. Goffman’s treatment points to the slightness of threats to the bond that lead to anticipation of shame. For that reason, my use of the term shame is much broader than its vernacular use. In common parlance, shame is an intensely negative, crisis emotion closely connected with disgrace. But this is much too narrow if we expect shame to be generated by even the slightest threat to the bond.
Gottschalk L A 1995 Content Analysis of Verbal Behavior. Lawrence Erlbaum Associates, Hillsdale, NJ
Gottschalk L, Winget C, Gleser G 1969 Manual of Instruction for Using the Gottschalk–Gleser Content Analysis Scales. University of California Press, Berkeley, CA
Hochschild A R 1983 The Managed Heart. University of California Press, Berkeley, CA
Kemper T D 1978 A Social Interactional Theory of Emotions. Wiley, New York
Lewis H B 1971 Shame and Guilt in Neurosis. International Universities Press, New York
Lynd H M 1961 On Shame and the Search for Identity. Science Editions, New York
Mead G H 1934 Mind, Self, and Society. University of Chicago Press, Chicago
Parsons T, Shils E 1951 Toward a General Theory of Action. Harvard University Press, Cambridge
Retzinger S M 1991 Violent Emotions. Sage, Newbury Park, CA
Retzinger S M 1995 Identifying shame and anger in discourse. American Behavioral Scientist 38: 1104–13
Sennett R 1980 Authority. Alfred Knopf, New York
Sennett R, Cobb J 1972 The Hidden Injuries of Class, 1st edn. Knopf, New York
Simmel G 1904 Fashion. International Quarterly X: 130–55 [Reprinted in the American Journal of Sociology 62: 541–59]
T. J. Scheff
6. Conclusion

The classic sociologists believed that emotions are crucially involved in the structure and change of whole societies. The authors reviewed here suggest that shame is the premier social emotion. Lynd’s work, particularly, suggests how acknowledgment of shame can strengthen bonds, and by implication, lack of acknowledgment can create alienation. Lewis’s work further suggests how shame–anger loops can create perpetual hostility and alienation. Acknowledged shame could be the glue that holds relationships and societies together, and unacknowledged shame the force that divides them.

See also: Civilizational Analysis, History of; Emotion: History of the Concept; Emotions, Evolution of; Emotions, Sociology of; Identity: Social; Moral Sentiments in Society; Norms; Values, Sociology of
Bibliography

Brown P, Levinson S C 1987 Politeness: Some Universals in Language Usage. Cambridge University Press, Cambridge
Collins R 1990 Stratification, emotional energy, and the transient emotions. In: Kemper T D (ed.) Research Agendas in the Sociology of Emotions. State University of New York Press, Albany, NY, pp. 27–57
Cooley C H 1922 Human Nature and the Social Order. C. Scribner’s Sons, New York
Darwin C 1872 The Expression of the Emotions in Man and Animals. John Murray, London
Elias N 1994 The Civilizing Process. Blackwell, Oxford, UK
Goffman E 1967 Interaction Ritual. Anchor, New York
Shared Belief

1. Introducing the Concept of Shared Belief

The importance of the notion of shared belief has been emphasized by philosophers, economists, sociologists, and psychologists at least since the 1960s (the earlier contributors include Schelling 1960, Scheff 1967, Lewis 1969, and Schiffer 1972). This article deals with the strongest kind of shared belief, to be called ‘mutual belief.’ While a belief can be shared in the trivial sense of two or more people having the same belief, mutual belief involves a strong kind of sharing that requires at least the participants’ awareness (belief) of their similar belief. Mutual belief is one central kind of collective attitude, examples of others being collective intentions, wants, hopes, and fears. Understandably, collective attitudes are central explanatory notions in the social sciences, as one of the tasks of these sciences is to study collective phenomena, including various forms of collective thinking and acting (see below for illustrations).

There is some special, additional interest in the notion of shared belief as mutual belief. First, mutual beliefs serve to characterize social or intersubjective existence in a sense that does not rely on the participants’ making agreements or contracts. Thus, many social relations, properties, and events arguably involve mutual beliefs (cf. Lewis 1969, Ruben 1985, Lagerspetz 1995, Tuomela 1995). As a simple example, think of the practice of two persons, A and B, shaking
hands. It presupposes that A believes that B and A are shaking hands, and that A also believes that B believes similarly; and B must believe analogously. Similarly, communication has been argued by many philosophers, especially Grice, to involve mutual belief (cf. Schiffer 1972, Grice 1989). Second, characterizations of many other collective attitudes in fact depend on the notion of mutual belief (cf. Balzer and Tuomela 1997).

In social psychology and sociology, theoreticians often speak about consensus instead of mutual belief. The notion of consensus (with the core meaning ‘mutual belief’) has been regarded as relevant to such topics as public opinion, values, mass action, norms, roles, communication, socialization, and group cohesion. It can also be mentioned that fads, fashions, crazes, religious movements, and many other related phenomena have been analyzed partly in terms of shared beliefs, consensus, shared consensus, mutual belief, or some similar notions. As pointed out by the sociologist Scheff (1967), such analyses have often gone wrong because they have treated consensus merely as shared first-order belief. Thus, as Scheff argues, consensus as mere first-order agreement does not properly account for ‘pluralistic ignorance’ (where people agree but do not realize it) and ‘false consensus’ (where people mistakenly think that they agree). Scheff proposes an analysis in terms of levels of agreement corresponding to a hierarchy of so-called loop beliefs (e.g., in the case of two persons A and B, A believes that B believes that A believes that p). As is easily seen, pluralistic ignorance and false consensus are second-level phenomena. The third level will have to be brought in when speaking about people’s awareness of these phenomena.
Other well-known social psychological notions requiring more than shared belief are Mead’s concept of ‘taking the role of the generalized other,’ Dewey’s ‘interpenetration of perspectives,’ and Laing’s metaperspectives (see, for example, Scheff 1967).

There are two different conceptual–logical approaches to understanding the notion of mutual (or, to use an equivalent term, common) belief: (a) the iterative account, and (b) the reflexive or fixed-point account. According to the iterative account, mutual belief is assumed to mean iteratable beliefs or dispositions to believe (cf. Lewis 1969, Chap. 2, and, for the weaker account in terms of dispositions to come to believe, Tuomela 1995, Chap. 1). In the two-person case, mutual belief amounts to this according to the iterative account: A and B believe that p, A believes that B believes that p (and similarly for B), A believes that B believes that A believes that p (and similarly for B); and the iteration can continue as far as the situation demands. In the case of loop beliefs there is accordingly mutual awareness only in a somewhat rudimentary sense. As will be seen, in many cases one needs only two iterations for functionally adequate mutual belief: A and B believe that p and they also believe that they believe that p. However, there are
other cases in which it may be necessary to go higher up in the hierarchy.

The fixed-point notion of mutual belief can be stated as follows: A and B mutually believe that p if and only if they believe that p and also believe that it is mutually believed by them that p. No iteration of beliefs is involved here, at least not explicitly. Correspondingly, a clear distinction can be made between the iterative or level account and the fixed-point account (to be briefly commented on in Sect. 4).

One can speak of the individual (or personal) mode and the group mode of having an attitude such as belief or intention. This article deals with the general notion of mutual belief, be it in the individual mode or in the group mode. Some remarks on this distinction are nevertheless appropriate here. The group-mode sense, expressible by ‘We, as a group, believe that p,’ requires that the group in question is collectively committed to upholding its mutual belief or at least to keeping the members informed about whether it is or can be upheld. This contrasts with mutual belief in an aggregative individual mode involving only personal commitments to the belief in question. When believing in the group-mode sense, group members are accordingly committed to a certain shared view of a topic and to group-mode thoughts such as ‘We, as a group, believe that p.’ Group-mode beliefs are central in the analysis of the kinds of beliefs that structured social groups such as organizations and states have. According to the ‘positional’ account defended by Tuomela (1995), the group members authorized for belief or view formation collectively accept the views which will qualify as the beliefs the group has. These views are group-mode views accepted for the group and are, strictly speaking, acceptances of something as the group’s views rather than beliefs in the strict sense.
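The two accounts of mutual belief stated in this section can be written schematically for the two-person case. The notation is an assumption introduced here for illustration and is not the article's own: $B_A\,p$ reads 'A believes that p', and $MB_{AB}\,p$ reads 'A and B mutually believe that p'.

```latex
\begin{align*}
\text{(iterative)}\quad MB_{AB}\,p \;&\Longleftrightarrow\;
   B_A\,p \,\wedge\, B_B\,p
   \,\wedge\, B_A B_B\,p \,\wedge\, B_B B_A\,p
   \,\wedge\, B_A B_B B_A\,p \,\wedge\, B_B B_A B_B\,p \,\wedge\, \cdots \\[4pt]
\text{(fixed-point)}\quad MB_{AB}\,p \;&\Longleftrightarrow\;
   B_A\,p \,\wedge\, B_B\,p
   \,\wedge\, B_A\,MB_{AB}\,p \,\wedge\, B_B\,MB_{AB}\,p
\end{align*}
```

In the iterative schema the conjunction is continued only 'as far as the situation demands'; in the fixed-point schema the defined notion $MB_{AB}$ reappears on the right-hand side, so no explicit hierarchy of levels is written out.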
2. Mutual Beliefs More Precisely Characterized

This section will be concerned with what mutual beliefs involve over and above plain shared beliefs, and Sect. 3 takes up the problem of how many layers of hierarchical beliefs are conceptually or psychologically needed. (The discussion below draws on the treatment in Tuomela 1984 and 1995.) Consider now the iterative account, starting with the case of two persons, A and B. Recall from Sect. 1 that according to the standard iterative account A and B mutually believe that p if and only if A believes that p, B believes that p, A believes that B believes that p (and similarly for B), A believes that B believes that A believes that p (and similarly for B), and so on, in principle ad infinitum. In the general case, the account defines that it is mutually believed in a group, say G, that p if and only if (a) everyone in G believes that p, (b) everyone believes that everyone believes that p, (c) everyone believes that everyone believes that everyone
believes that p, and so on ad infinitum. The word ‘everyone’ can of course be qualified if needed and made dependent on some special characteristics; for example, it can be restricted to concern only every fully fledged, adequately informed, and suitably rational member of the group.

A major problem with the iterative account is that in some cases it seems psychologically unrealistic. It loads people’s minds with iterated beliefs which people, after all, do not experientially have and possibly, because of lack of rationality and memory, cannot have. Thus, the account must be improved to make it correspond better to psychological reality. One way to go is to operate partly in terms of lack of disbelief. A person’s lack of disbelief that p is defined as its not being the case that the person believes the negation of p. One can try to define mutual belief schematically by saying that it is mutually believed in G that p if and only if everyone believes that p, iterated n times, and from level n + 1 on everyone lacks the disbelief that p. With n = 2, viz. two levels actually present, this definition says that mutual belief amounts to everyone’s believing that p and everyone’s believing that everyone believes that p and that everyone lacks the disbelief that p from the second level on. The basic reason for using this amended iterative approach is that no more levels of positive belief than people actually have should be required. Provided that the value of n can be established, the present account satisfies this. It must be assumed here that the agents in question are to some degree intelligent and rational as well as free from emotional and other disturbances so that they, for instance, do not lack a higher-order belief when they really ought to have one in order to function optimally. While the iterative account amended with the use of the notion of lack of disbelief seems viable for some purposes, an alternative analysis is worth mentioning.
This analysis amends the original iterative account in that it requires iterative beliefs up to some level n and from that level on requires only the disposition to acquire higher-order beliefs in appropriate conditions. These appropriate conditions, serving to determine the value of n, include both background conditions and more specific conditions needed for acquiring the belief in question. First, the agents in G must be assumed to be adequately informed and share both general cultural and group-specific information; especially, they must have the same standards of reasoning so that they are able to ‘work out the same conclusions’ (see Lewis 1969, Chap. 2, and Heal 1978). Second, they must be sufficiently intelligent and rational, and they must be free from cognitive and emotional disturbances to a sufficient degree (so that there are no psychological obstacles for adding the (n + 1)st-order belief when a genuine mutual belief is involved and for not adding it, or indeed for adding its negation, when a genuine mutual belief is not involved). A typical releasing condition for the disposition to acquire a higher-order belief would be simply that the agents are asked about higher-order beliefs (or are presented with other analogous problems concerning them). There is not much to choose between the two amended accounts, and the matter will not be discussed further here.

With the conceptual machinery at hand, one can deal with, for instance, the mentioned phenomena of pluralistic ignorance and false consensus. An account of pluralistic ignorance, concerning something p, must obviously include the idea that everyone believes that p. Second, it must say that the agents do not have the belief that they agree. Compatibly with this, it can still be required that they do not disbelieve that they believe that p. As to false consensus, it must be required that not everyone believes that p and that, nevertheless, it is believed by everyone that everyone believes that p.
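The conditions discussed in this section can be made concrete with a small toy model. The sketch below is only an illustration under stated assumptions: each agent's beliefs are a plain set of propositions, `E(q)` is a hypothetical constructor for 'everyone in the group believes that q', the infinite tail of the lack-of-disbelief clause is cut off at an arbitrary `max_level`, and 'the agents do not have the belief that they agree' is read as 'no agent believes E(p)'. None of these names or choices comes from the article.

```python
def E(prop):
    """Build the proposition 'everyone in the group believes that prop'."""
    return ("E", prop)


def Not(prop):
    """Build the negation of prop."""
    return ("not", prop)


def iterate_E(p, layers):
    """Wrap p in `layers` layers of E; zero layers is p itself."""
    for _ in range(layers):
        p = E(p)
    return p


def everyone_believes(beliefs, prop):
    """True if every agent's belief set contains prop."""
    return all(prop in held for held in beliefs.values())


def everyone_lacks_disbelief(beliefs, prop):
    """True if no agent believes the negation of prop."""
    return all(Not(prop) not in held for held in beliefs.values())


def amended_mutual_belief(beliefs, p, n, max_level):
    """Positive belief at levels 1..n, lack of disbelief at levels n+1..max_level.

    Level k is 'everyone believes ... (k times) ... that p', so an agent at
    level k must hold p wrapped in k - 1 layers of E.
    """
    positive = all(
        everyone_believes(beliefs, iterate_E(p, k - 1)) for k in range(1, n + 1)
    )
    no_disbelief = all(
        everyone_lacks_disbelief(beliefs, iterate_E(p, k - 1))
        for k in range(n + 1, max_level + 1)
    )
    return positive and no_disbelief


def pluralistic_ignorance(beliefs, p):
    """Everyone believes p, but no one believes that everyone believes p."""
    nobody_sees_agreement = all(E(p) not in held for held in beliefs.values())
    return everyone_believes(beliefs, p) and nobody_sees_agreement


def false_consensus(beliefs, p):
    """Not everyone believes p, yet everyone believes that everyone does."""
    return not everyone_believes(beliefs, p) and everyone_believes(beliefs, E(p))


# Hypothetical two-person groups used only to exercise the definitions.
aware = {"A": {"p", E("p")}, "B": {"p", E("p")}}
ignorant = {"A": {"p"}, "B": {"p"}}
mistaken = {"A": {"p", E("p")}, "B": {E("p")}}
```

With these example groups, `aware` satisfies the amended account for n = 2, `ignorant` is a case of pluralistic ignorance, and `mistaken` is a case of false consensus. Representing beliefs as finite sets sidesteps the dispositional reading of higher-order belief, which this sketch does not attempt to model.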
3. Mutual Beliefs and the Level Problem

Let us next discuss the level problem. Given the iterative account, the problem is how many iterative levels of belief are conceptually, epistemically, or psychologically needed for success in various cases. The criteria here are functionality and success. For instance, how many levels of belief does successful joint action require? The level question is difficult to answer in general terms. However, it can be conjectured that in certain cases at least two levels (viz. n = 2), relative to the base level, are needed for mutual belief, but no more. In some other cases this is not enough. Generally speaking, there are very seldom epistemic or psychological reasons for going beyond the fourth level. For many agents level n = 4 probably is psychologically impossible or at least very difficult to handle in one’s reasoning. The base level is the level zero at which the sentence or proposition p is located, and p will concern or be about an agent’s or agents’ actions, dispositions to act, intentions, beliefs (or something related); or, alternatively, p can be (or concern) a belief content.

In the analysis below it is accepted that the beliefs, at least those going beyond the level n = 2, need not be proper occurrent or standing beliefs or even subconscious ones but only dispositions to form the belief in question. (See Audi 1994, for the relevant notion of a disposition to form a belief.) The claim that n = 2 is not only necessary but also sufficient in many typical cases is allowed to involve social loop beliefs where the content of mutual belief could be, for instance, that A believes that B believes that A will act in a certain way, or believes that such and such is the case. What the base level will be is relative to the topic at hand. That at least two levels are needed can be regarded as a conceptual truth (related to the notion ‘social existence’); that in many typical cases no more is needed or that in some other
cases four layers are needed can be regarded as general psychological theses concerning the actual contents of people’s minds in those situations.

The present view allows that a need may arise to add layers indefinitely. There follows a recursive argument to show that such a situation is possible in principle. Given an analysis of mutual belief in terms of iteration of beliefs such as ours, then at each level under consideration one may start asking questions about people’s realizing (or believing) such and such, where ‘such and such’ refers to the level at hand. This game obviously can go on indefinitely as far as the involved agents’ reasoning capacities (or whatever relevant capacities) permit. This is an inductive argument (in the mathematical sense; the first step is obvious) for the open and infinite character of the hierarchy of nested beliefs.

To discuss the level problem in more concrete terms, let us consider mutual belief in the case of two agents, A and B. First we let our base-level sentences be related to the agents’ performances of their parts as follows:
(a) A will do her part of X (in symbols, p(A));
(b) B will do her part of X (p(B)).
Next, let us assume:
(i) A believes that (a) and (b);
(ii) B believes that (a) and (b).
This is a social or mutual recognition of the agents’ part-performances in the context of joint action. The sentence p in (a) and (b) may alternatively concern other things than action. Thus it may represent a belief content, for example p = ‘The Earth is flat,’ and we will consider this case below. Considering joint action, it can be argued that n = 2 is both necessary and sufficient for successful performance. In this example, A will obviously have to believe that B will do her part. This gives her some social motivation to perform her own part.
Concentrating on our reference point individual A and genuine joint action X (in which the agents’ performances of their parts are interdependent), A must also believe (or be disposed to believe, at any rate, viz. come to form the belief if asked about the matter) that B believes that A will do her part. For otherwise A could not believe with good reason that B will do her part, for B would, if she does not so believe, lack the social motivation springing from her belief that A will do her part. However, if she did lack that motivation, A could not on this kind of ground defensibly form the belief that B will perform her part. This argument shows that it must be required that the agents have a loop belief or at least a disposition to have a loop belief:
(iii) A believes that (i) and (ii);
(iv) B believes that (i) and (ii).
For instance, (iii) gives the loop ‘A believes that B believes that p(A),’ and this means that in the present kind of case n = 2 is necessary and normally also sufficient (relative to the chosen base level).
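The structure of conditions (i) to (iv) can be sketched in code. This is a minimal illustration, not part of the original analysis: `B(agent, prop)` is a hypothetical constructor for 'agent believes that prop', `facts` is an assumed set of belief sentences taken to hold in the situation, and dispositions to believe are not modeled separately from beliefs.

```python
from itertools import product


def B(agent, prop):
    """Build the sentence 'agent believes that prop'."""
    return ("B", agent, prop)


def level(agents, base, k):
    """Return all belief sentences at iteration level k (k >= 1) over `base`.

    Level 1 corresponds to (i) and (ii); level 2 to (iii) and (iv).
    """
    sentences = set(base)
    for _ in range(k):
        sentences = {B(agent, s) for agent, s in product(agents, sentences)}
    return sentences


def believed_to_level(facts, agents, base, n):
    """True if every belief sentence at levels 1..n is among `facts`."""
    return all(level(agents, base, k) <= facts for k in range(1, n + 1))


AGENTS = ("A", "B")
BASE = ("p(A)", "p(B)")  # (a) A will do her part; (b) B will do her part

# A situation in which exactly the level-1 and level-2 beliefs hold.
facts = level(AGENTS, BASE, 1) | level(AGENTS, BASE, 2)

# The loop belief singled out in the text: A believes that B believes that p(A).
loop = B("A", B("B", "p(A)"))
```

Here `believed_to_level(facts, AGENTS, BASE, 2)` holds, while removing `loop` from `facts` makes it fail, which mirrors the claim that the loop belief is required at n = 2. Note that this encoding also generates reflexive sentences such as 'A believes that A believes that p(A)', because (iii) is stated as A believing both (i) and (ii).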
An analogous claim can be defended in the case of belief. Here the defense can be given in terms of the recognition by the agents that they share a belief. Suppose that each member of the Flat Earth Society not only believes that the Earth is flat, but, because of being organized into a belief group, the members all share this information. This second-order belief indicates the first level at which social and mutual belief can be spoken of in an interesting sense, for at this level people recognize that the others in the group believe similarly, and this establishes a doxastic social connection; at level one that recognition is, however, missing. This is a typical case of a social property, and again n = 2 is both necessary and normally sufficient for acting in matters related to the shape of the Earth qua a member of the society.

Suppose next that the members of the Flat Earth Society now begin to reflect upon this second-order belief of theirs and come to form the belief that indeed it does exist. (Perhaps a sociologist has just discovered that they share the second-order belief and told them about it.) They might wonder how strong this second-order awareness is, and so on. All this entails that on some occasions third-order beliefs may be needed, although the standard case clearly would be second-order beliefs accompanied by a disposition to go higher up, if the need arises. Analogously, in some very special cases of joint action, double loop beliefs, for example, may be needed for rational action.

The reader should be reminded that not all social notions depend on mutual belief. For instance, latent social influence and power do not require it, nor does a unilateral love relation between two persons.
4. Mutual Belief as Shared We-belief

What have been called ‘we-attitudes’ are central for an account of social life (see Tuomela 1995, Chap. 1, Tuomela and Balzer 1999). A we-attitude is a person’s attitude (say belief, want, fear, etc.) concerning something p (a proposition, or sentence) such that this person (1) has the attitude in question, (2) believes that everyone in the group has the attitude, and (3) also believes that there is a mutual belief in the group to the effect that everyone in the group has the attitude. Of these, clause (1) is required because there cannot of course be a truly shared attitude without all the members participating in that attitude. Clause (2) gives a social reason for adopting and having the attitude, and (3) strengthens the reason by making it intersubjective. (A shared we-attitude can be either in the group mode or in the individual mode; cf. Sect. 1.) A shared we-belief is a we-belief (be it in the individual or in the group mode) which (ideally) all the group members have. To take an example, a shared we-belief that the Earth is flat entails that the group members believe that the Earth is flat, that the group members believe that the group members believe that the Earth is flat, and also that it
is a mutual belief in the group that the group members believe that the Earth is flat.

Shared we-beliefs can be related to the reflexive or fixed-point account of mutual belief. According to the simplest fixed-point account, mutual belief is defined as follows: it is a mutual belief in a group that p if and only if everyone in the group believes that p and also that it is mutually believed in the group that p. It can thus be seen that the account of mutual belief given by the fixed-point theory is equivalent to the definiens in the definition of a shared we-belief. In the fixed-point approach the syntactical infinity involved in the iterative approach is cut short by a finite fixed-point formula, that is, an impredicative construct in which the joint notion to be ‘defined’ already occurs in the definiens. Under certain rationality assumptions about the notion of belief it can be proved that the iterative approach, which continues iterations ad infinitum, gives the fixed-point property as a theorem (see Halpern and Moses 1992 and, for a more general account, Balzer and Tuomela 1997). The fixed-point account is in some contexts psychologically more realistic, as people are not required to keep iterative hierarchies in their minds. Note, however, that it depends on context whether the iterative approach or the fixed-point approach is more appropriate. Thus, in the case of successful joint action, at least loop beliefs must be required.

See also: Action, Collective; Collective Behavior, Sociology of; Collective Beliefs: Sociological Explanation; Collective Identity and Expressive Forms; Collective Memory, Anthropology of; Collective Memory, Psychology of; Genes and Culture, Coevolution of; Human Cognition, Evolution of; Memes and Cultural Viruses; Religion: Culture Contact; Religion: Evolution and Development; Religion, Sociology of
Scheff T J 1967 Toward a sociological model of consensus. American Sociological Review 32: 32–46
Schelling T C 1960 The Strategy of Conflict. Harvard University Press, Cambridge, MA
Schiffer S 1972 Meaning. Oxford University Press, Oxford, UK
Tuomela R 1984 A Theory of Social Action. Reidel, Dordrecht, The Netherlands
Tuomela R 1995 The Importance of Us: A Philosophical Study of Basic Social Notions. Stanford Series in Philosophy, Stanford University Press, Stanford, CA
Tuomela R, Balzer W 1999 Collective acceptance and collective social notions. Synthese 117: 175–205
R. Tuomela
Sherrington, Sir Charles Scott (1857–1952)

Like this old Earth that lolls through sun and shade,
Our part is less to make than to be made.
(Sherrington 1940, p. 259)
To characterize the broad span of Sherrington’s engagement, he should be described as ‘Pathologist, Physiologist, Philosopher and Poet.’ It is somewhat arbitrary to pick out only two of them because it is the combination of all these elements that made him both great and unique. However, hardly any other physiologist since 1800 made such a fundamentally important contribution to the knowledge and to the development of concepts in modern neurophysiology. Moreover, when advanced in years, he was the first to attempt a contribution to the solution of the age-old problem of brain/body and mind on a well-founded neuroscientific basis.
Bibliography

Audi R 1994 Dispositional beliefs and dispositions to believe. Noûs 28: 419–34
Balzer W, Tuomela R 1997 A fixed point approach to collective attitudes. In: Holmström-Hintikka G, Tuomela R (eds.) Contemporary Action Theory II. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 115–42
Grice P 1989 Studies in the Ways of Words. Harvard University Press, Cambridge, MA
Halpern J, Moses Y 1992 A guide to completeness and complexity for modal logics of knowledge and belief. Artificial Intelligence 54: 319–79
Heal J 1978 Common knowledge. Philosophical Quarterly 28: 116–31
Lagerspetz E 1995 The Opposite Mirrors: An Essay on the Conventionalist Theory of Institutions. Kluwer Academic Publishers, Dordrecht, The Netherlands
Lewis D 1969 Convention, A Philosophical Study. Harvard University Press, Cambridge, MA
Ruben D-H 1985 The Metaphysics of the Social World. Routledge, London

1. The Life
Born in London on November 27, 1857, Sherrington was brought up in Ipswich and Norwich. From his earliest years he felt a strong attraction to poetry, a love which never left him throughout his life. Some of his poems were published in various magazines and later were partly collected in a small volume (Sherrington 1925). In 1876 Sherrington first enrolled at St. Thomas’ Hospital Medical School in London and in 1881 matriculated at Cambridge University, where he received his B.A. in 1884 and his M.D. in 1885. During his undergraduate years he started his research in neuroanatomy and neuropathology, publishing his first papers on these studies in 1884 together with John Newport Langley (Langley and Sherrington 1884). This was the beginning of a very productive publishing activity with almost 320 scientific publications (for Sherrington’s complete bibliography see Fulton 1952). During his initial postgraduate period at Cambridge
Sherrington’s scientific interests concentrated on neurohistology and pathology. This period was interrupted by expeditions to epidemics of Asian cholera in Spain and Italy. The autopsied material taken during these expeditions he analyzed together with J. Graham Brown and Charles S. Roy in England (1885) and with Rudolf Virchow and Robert Koch in Germany (1886). In 1890 Sherrington was appointed lecturer in physiology at St. Thomas’ Hospital in London, where he mainly performed clinical work. A year later he received the position of Professor-Superintendent at the Brown Institute at the University of London, where he became remarkably productive with a variety of publications which reflected his pathological, but also his increasing neurophysiological interest. In 1895 Sherrington received his first appointment to a full professorship of physiology at the University College in Liverpool, where he investigated intensely the organization of the spinal cord. At age 56 (in 1913) Sherrington entered probably the busiest period of his scientific career when he was appointed to the chair of physiology at Oxford University, from which he did not retire until 1935, when he was 78. It was during his Oxford time that he, together with Edgar Adrian, received the Nobel Prize for Medicine in 1932 (the greatest tribute among numerous honours and awards, such as the Presidency of the Royal Society 1920–25, some 21 international honorary doctorates, and many medals and honorary memberships of British and international societies and academies, among these the corresponding membership of l’Institut de France). Three of his students at Oxford were later Nobel Prize winners: Howard Walter Florey, John Carew Eccles, and Ragnar Granit. In the early 1930s Sherrington performed his last physiological experiments and he felt free to turn to his other great love, the philosophy of the nervous system.
Having retired to his old hometown of Ipswich in 1935, his scientific interest was increasingly enriched by philosophical considerations that became evident in the Gifford Lectures, held between 1937 and 1938. Evidently, it was also Sherrington’s interest in Natural Philosophy that caused him to work on Goethe (Sherrington 1942) and Jean Fernel, a French physician and philosopher of the sixteenth century (Sherrington 1946). Though his body was crippled by arthritis, his intellectual capacities remained undimmed until his death in Eastbourne on March 4, 1952.
2. The Physiologist

Regarding the nervous system as a functional unity resulting from its well-organised parts, Sherrington investigated a variety of motor and sensory systems: the topographical organisation of the motor cortex, the binocular integration in vision, as well as the
different receptor systems. However, his main efforts concentrated on the spinal cord and its reflex function in motor control. He explicitly admitted that spinal reflex behaviour is not ‘the most important and far-reaching of all types of nerve behavior’ but offers the advantage that ‘it can be studied free from complication with the psyche’ (Sherrington 1906/1947, Foreword, p. xiii). In the spinal cord, isolated from the brain, he carefully analyzed almost all spinal reflex types, from the ‘simple’ two-neurone reflex arc of the stretch reflex (first described by him), via the complex, partly nociceptive flexion and crossed extension reflexes and co-ordinated rhythmic scratching and stepping reflexes, up to goal-directed reflex movements of the limbs towards irritating stimuli. Right from the beginning he demonstrated that spinal reflexes do not follow a stereotyped all-or-nothing behavior, but that the adaptability of reflexes, based on graduated amplitude, irradiation, and mutual interference between different reflex types, is of great importance for motor control. Sherrington was the first to suggest that inhibition had at least the same functional importance for motor coordination, and other nervous functions, as excitation, not only at the spinal level but also at the brain level and for descending motor commands. This view also opened up new insights into the background of motor disorders observed in various neurological syndromes. Though some of the reflexes and other motor and sensory phenomena described by Sherrington had been noted earlier, his experimental observations and their analyses were altogether more exact and careful than those of former investigations. Thus, he disproved, for example, several of Pflüger’s reflex laws and the assumption of Descartes (1677) and John and Charles Bell (1826) that reciprocal inhibition is localized peripherally in the muscle.
However, the higher experimental sophistication alone would surely not have received that particularly high and enduring recognition. Indeed, Sherrington’s most important contribution to neuroscience resulted from his unique capability to condense his immense knowledge of sensory and motor functions and of the structure of the nervous system (partly derived from an intense contact with the famous Spanish neurohistologist Santiago Ramón y Cajal) in an almost visionary synopsis of the integrative action of the nervous system, which laid the foundation for the great progress of neuroscience in the twentieth century and which has lost none of its basic validity. This synopsis made his book The Integrative Action of the Nervous System (1906) a milestone in neurophysiology, which has been compared ‘in importance with Harvey’s’ 300 years older ‘De motu cordis in marking a turning point in the history of physiological thought’ (Fulton 1952, p. 174). For Sherrington, spinal reflexes did not have a ‘purpose’ but a ‘meaning.’ He replaced the formerly presumed ‘soul’ of the spinal cord (e.g., Legallois 1812, Pflüger
Sherrington, Sir Charles Scott (1857–1952) 1853) as the driving force for reflexes and even voluntary movements by an astonishing exactly considered function of spinal interneuronal activity. He replaced the Cartesian reflex behaviorist view by the still-relevant view that the performance of coordinated movements as a basis for animal and ‘conscious’ human behavior requires an integration of ‘intrinsically’ generated brain functions with ‘extrinsically’ induced reflex functions. In this context he suggested that the pure ‘apsychical’ reflex behavior without a contribution of ‘mind’ loses in importance with phylogenetic ascendance, playing the smallest role in man: ‘The spinal man is more crippled than is the spinal frog’ (Sherrington 1906\1947, Foreword, p. xiv). As a true scientist Sherrington never felt that he had arrived at any incontrovertible truth and in a typically unpretentious understatement he concluded: ‘I have dealt very imperfectly with a small bit of a large problem…. The smallness of the fragment is disappointing’ (1931 p. 27); this a statement from a man about whom Frederick George Donnan wrote in 1921 to the Nobel Committee: ‘In the field of physiology and medicine, Professor Sherrington’s works remind one of that of Kepler, Copernicus and Newton in the field of mechanics and gravitation’ (in H. Schu$ ck et al., 1962, p. 310).
3. The Philosopher
Like Descartes before him and Eccles after him, for example, Sherrington realised that mere experimental investigation of the nervous system was not sufficient to solve the fundamental problem of the body/brain–mind relation, particularly the difficulty of explaining how it actually works. In the Rede Lecture on The Brain and its Mechanism (Sherrington 1933), in which he considered the complexity of the brain functions in motor control and behavior and in which he tentatively approached this problem, Sherrington took a fully dualistic view. He even denied any scientific right to ‘conjoin mental experience with the physiological…. The two … seem to remain disparate and disconnected’ (1933, pp. 22–3). But later, in the Gifford Lectures, which he delivered in Edinburgh in 1937–1938 (published as Man on his Nature in Sherrington 1940 and as an extensively revised new edition in 1951), he took up the great enigma of the brain–mind relation in the light of his new neuroscientific knowledge. Rejecting the assumption of a mysterious mind of a ‘heavenly’ origin, he stated, ‘Ours is an earthly mind which fits our earthly body’ (Sherrington 1940, p. 164). ‘Mind’ for Sherrington was coupled to motor acts, ‘mind’ only being recognizable via motor acts. He suggested that ‘mind’ is graduable, that a recognizable ‘mind’ develops in the developing brain in humans and, to varying degrees, in animals. Regarding the relation between the two concepts of ‘energy’ (assuming ‘matter’ also as ‘energy’ in the physical sense) and
‘mind’, he concluded that ‘mind’ as a function of the living brain is mortal, while the ‘energy’ (the matter) is immortal, just changing its state when the brain is dying. In his discussions he entirely avoided the complicating problem of an ‘immortal soul’, which he, being himself oriented towards Natural Religion, left to the Revealed Religions: ‘When on the other hand the mind-concept is so applied as to insert into the human individual an immortal soul, again a trespass is committed’ (Sherrington 1940, p. 355). When Sherrington, with something like resignation, stated, ‘But the problem of how that [body-mind] liaison is effected remains unsolved; it remains where Aristotle left it more than 2000 years ago’ (Sherrington 1906/1947, Foreword, p. xxiii), he probably underestimated his own contribution towards a solution: he defined ‘mind’ as a function of the brain requiring a complex integrative action of the nervous system, a view which was taken up and further developed by one of his students, Sir John C. Eccles (Eccles 1970, Eccles and Gibson 1979).
4. And Now?
As stated in Sect. 3, the experimentally well-based theories and concepts, which Sherrington had creatively developed with great visionary imagination, formed a well-recognised secure foundation for future developments in the neurosciences. The half-century after Sherrington’s death has brought an immense increase in neuroscientific knowledge. First, in particular, the technique of microrecordings from nerve cells and the analysis of ionic membrane mechanisms confirmed and extended the knowledge and hypotheses based on Sherrington’s work on brain and spinal functions in sensory perception, motor control, and behavior. Then new techniques allowed for analytic investigation of more and more detailed neuronal structures, functions, and interactions down to the molecular level. New genetic engineering approaches enabled the ‘production’ of innumerable variants of mice with defined genetic defects at the molecular level. However, the bewildering mass of results originating from these experiments opened up a host of new questions. What is required now is a new ‘Sherringtonian’ approach which comprises a complex critical analysis of today’s entire neuroscientific knowledge (from molecular mechanisms up to the level of complex behaviour) in an integrative synopsis and which develops new concepts to explain the sophisticated brain functions and their relation to mind as expressed by behavior. Despite the exponential increase in detailed neuroscientific knowledge, we have not really come closer to a solution of that problem since Sherrington.
See also: Behavioral Neuroscience; Cognitive Neuroscience; Consciousness, Neural Basis of; Cross-modal (Multi-sensory) Integration; Functional Brain Imaging; Motor Control; Motor Cortex; Motor Skills, Psychology of; Neural Plasticity of Spinal Reflexes; Neural Representations of Direction (Head Direction Cells); Psychophysiology; Theory of Mind; Topographic Maps in the Brain; Vestibulo-ocular Reflex, Adaptation of the
Bibliography
Bell J, Bell C 1826 Anatomy and Physiology of the Human Body. Edinburgh
Descartes R 1677 De homine, ed. by de la Forge. Amsterdam
Eccles J C 1970 Facing Reality. Springer-Verlag, New York
Eccles J C, Gibson W C 1979 Sherrington: His Life and Thought. Springer International, Berlin
Fulton J F 1952 Sir Charles Scott Sherrington, O.M. Journal of Neurophysiology 15: 167–190
Granit R 1966 Charles Scott Sherrington: An Appraisal. Nelson, London
Langley J N, Sherrington C S 1884 Secondary degeneration of nerve tracts following removal of the cortex of the cerebrum in the dog. Journal of Physiology 5: 49–65
Legallois M 1812 Expériences sur le principe de la vie. D’Hautes, Paris
Pflüger E 1853 Die sensorischen Funktionen des Rückenmarks der Wirbeltiere. Springer, Berlin
Schück H, Sohlman R, Österling A, Liljestrand G, Westgren A, Siegbahn M, Schou A, Ståhle N K 1962 Nobel, the Man and His Prizes. The Nobel Foundation, Elsevier, Amsterdam
Sherrington C S 1906/1947 The Integrative Action of the Nervous System, 2nd edn, Yale University Press, New Haven, CT, with a new Foreword, 1947
Sherrington C S 1925 The Assaying of Brabantius and Other Verse. Oxford University Press, London
Sherrington C S 1931 Quantitative management of contraction in lowest level co-ordination. Hughlings Jackson Lecture. Brain 54: 1–28
Sherrington C S 1933 The Brain and its Mechanism. Cambridge University Press, Cambridge, UK
Sherrington C S 1940/1951 Man on his Nature, 2nd rev. edn. 1951. Cambridge University Press, Cambridge, UK
Sherrington C S 1942 Goethe on Nature and on Science. Cambridge University Press, Cambridge, UK
Sherrington C S 1946 The Endeavour of Jean Fernel. Cambridge University Press, Cambridge, UK
E. D. Schomburg
Short-term Memory, Cognitive Psychology of
The concept of short-term memory has been of theoretical significance to cognitive psychology since the late 1950s. Some investigators have even argued that ‘all the work of memory is in the short-term system’ (Shiffrin 1999, p. 21). However, others have claimed that short-term memory is an archaic concept and there is no need to distinguish the processes involved in short-term memory from other memory processes (Crowder 1993).
1. Definitions and Terminology
‘Short-term memory’ refers to memory over a short time interval, usually 30 s or less. Another term for the same concept is ‘immediate memory.’ Both these terms have been distinguished from the related terms ‘short-term store’ and ‘primary memory,’ each of which refers to a hypothetical temporary memory system. However, the term ‘short-term memory’ has also been used by many authors to refer to the temporary memory system. Thus, here the term ‘short-term memory’ is used in both senses. It is important to distinguish ‘short-term memory’ from the related concepts ‘working memory’ (see Working Memory, Psychology of) and ‘sensory memory’. Some authors have used the terms ‘short-term memory’ and ‘working memory’ as synonymous, and indeed the term ‘working memory’ has been gradually replacing the term ‘short-term memory’ in the literature (and some authors now refer to ‘short-term working memory’; see Estes 1999). However, ‘working memory’ was originally adopted to convey the idea that active processing as well as passive storage is involved in temporary memory. ‘Sensory memory’ refers to memory that is even shorter in duration than short-term memory. Further, sensory memory reflects the original sensation or perception of a stimulus and is specific to the modality in which the stimulus was presented, whereas information in short-term memory has been coded so that it is in a format different from that originally perceived.
2. Historical Development and Empirical Observations
An initial description of short-term memory was given by James (1890), who used the term ‘primary memory’ and described it as that which is held momentarily in consciousness. The intense study of short-term memory began almost 70 years later with the development of the distractor paradigm (Brown 1958, Peterson and Peterson 1959).
2.1 The Distractor Paradigm
In this paradigm, a short list of items (usually five or fewer) is presented to subjects for study. If the subjects recall the list immediately, perfect performance results because the list falls within their memory span. However, before the subjects recall the items, they engage in an interpolated activity, the distractor task,
which usually involves counting or responding to irrelevant material. The purpose of the distractor task is to prevent the subjects from rehearsing the list items. The length of the distractor task varies, and its duration is called the ‘retention interval.’ By comparing performance after various retention intervals, it was found that the rate of forgetting information from short-term memory is very rapid, so that after less than 30 s little information remains about the list of items. Another central finding from the distractor paradigm involves the serial position curve, which reveals accuracy for each item in the list as a function of its position. The curve is typically bowed and symmetrical with higher accuracy for the initial and final positions than for the middle positions. The distractor task typically requires serial recall (i.e., recall in the order in which the items were presented; see Representation of Serial Order, Cognitive Psychology of). Two types of errors occur with this procedure. Transposition errors are order errors in which subjects recall an item in the wrong position. Nontransposition errors are substitution errors in which subjects replace an item with one not included in the list. For example, if subjects receive the list BKFH and recall VKBH, they make a nontransposition error in the first position and a transposition error in the third. The bowed serial position curve reflects the pattern of transposition errors. The pattern of nontransposition errors shows instead a relatively flat function with errors increasing slightly across the list positions. Errors in the distractor paradigm can also be classified by examining the relation between the correct item and the item that replaces it. In the example, the subjects replace B with V. Because those two letters have similar-sounding names, the error is called an ‘acoustic confusion error’, and such errors occur often in the distractor paradigm.
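The distinction between transposition and nontransposition errors can be made concrete with a small sketch. The Python function below is illustrative only: the name `classify_errors` is hypothetical, and the sketch assumes a list without repeated items and a recall attempt of the same length as the list. It labels each serial position as correct, a transposition error (a list item in the wrong position), or a nontransposition error (an item that was not in the list).

```python
def classify_errors(presented, recalled):
    """Label each serial position of a recall attempt.

    Assumes (for illustration) that `presented` has no repeated items and
    that `recalled` has the same length as `presented`.
    """
    labels = []
    for target, response in zip(presented, recalled):
        if response == target:
            labels.append("correct")
        elif response in presented:
            # A list item reported at the wrong position: an order error.
            labels.append("transposition")
        else:
            # An item that was never presented: a substitution error.
            labels.append("nontransposition")
    return labels


# The example from the text: list BKFH recalled as VKBH.
print(classify_errors("BKFH", "VKBH"))
# -> ['nontransposition', 'correct', 'transposition', 'correct']
```

Tallying these labels by position over many trials would give the two serial position functions described above, one for each error type.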
Even visually presented items are typically coded in short-term memory in an acoustic or speech representation. This fact was first illustrated in an experiment in which subjects were shown a list of six consonants for immediate serial recall. The errors were classified by a confusion matrix, in which the columns indicate the letter presented and the rows indicate the letter actually recalled. A high correlation was found between the confusion matrix resulting from this memory task when subjects recalled a list of visually presented letters, and the confusion matrix resulting from a listening task when subjects heard one letter at a time and simply named it with no memory requirement (Conrad 1964).
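The comparison of confusion matrices can be sketched in a few lines of Python. The counts below are invented toy data chosen only to show the computation; they are not Conrad's data, and the helper names (`confusion_matrix`, `off_diagonal`, `pearson`) are hypothetical. Following the layout described above, columns index the letter presented and rows the letter reported, and the correlation is computed over the error cells only (an assumption of this sketch).

```python
from collections import Counter
from math import sqrt


def confusion_matrix(pairs, letters):
    """Rows = letter reported, columns = letter presented."""
    counts = Counter(pairs)  # keys are (presented, reported) tuples
    return [[counts[(presented, reported)] for presented in letters]
            for reported in letters]


def off_diagonal(matrix):
    """Flatten the error cells, skipping correct responses on the diagonal."""
    n = len(matrix)
    return [matrix[r][c] for r in range(n) for c in range(n) if r != c]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


letters = ["B", "V", "F", "S"]
# Toy (presented, reported) error pairs: NOT real experimental data.
memory_errors = ([("B", "V")] * 5 + [("V", "B")] * 4 +
                 [("F", "S")] * 3 + [("S", "F")] * 4 + [("B", "F")])
listening_errors = ([("B", "V")] * 6 + [("V", "B")] * 5 +
                    [("F", "S")] * 4 + [("S", "F")] * 3 + [("V", "S")])

memory_matrix = confusion_matrix(memory_errors, letters)
listening_matrix = confusion_matrix(listening_errors, letters)
r = pearson(off_diagonal(memory_matrix), off_diagonal(listening_matrix))
print(f"correlation between error patterns: r = {r:.2f}")
```

In these toy counts both tasks confuse the same similar-sounding pairs (B with V, F with S), so the correlation is high. That is the pattern of result that the experiment is described as showing.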
2.2 The Free Recall Task
Another paradigm commonly used in the early investigation of short-term memory was the free recall task, in which subjects are given a relatively long list of items and then recall them in any order they choose. When the list is recalled immediately, the subjects
show greater memory for the most recent items. This ‘recency effect’ was attributed to the fact that the final items in the list, but not the earlier ones, are still in short-term memory at the termination of the list presentation. Support for this conclusion came from the observation that if presentation of the list is followed by a distractor task, then there is no recency effect, although as in the case of immediate recall there is a ‘primacy effect,’ or advantage for the initial items in the list. The explanation offered for the elimination of the recency effect with the distractor task is that the final list items are no longer in short-term memory after the distractor activity. Further support for this explanation came from finding that other variables like list length and presentation rate had differential effects on the recency and earlier sections of the serial position curve. Specifically, subjects are less likely to recall an item when it occurs in a longer list for all except the most recent list positions. Likewise, subjects are less likely to recall an item in a list presented at a faster rate for all but the recency part of the serial position curve.
3. Theoretical Accounts
The most widely accepted account of short-term memory was presented in the 1960s in what was subsequently termed the ‘modal model’ because of its popularity. The core assumption of that model is the distinction between short-term memory, which is transient, and long-term memory, which is relatively permanent. The fullest description of the modal model was provided by Atkinson and Shiffrin (1968), who also distinguished between short-term memory and sensory memory.
3.1 The Buffer Model
Atkinson and Shiffrin’s ‘buffer model,’ as it has been called, is characterized along two dimensions (Atkinson and Shiffrin 1968). Following a computer analogy, the first dimension involves the structural features of the system, analogous to computer hardware, and the second dimension involves the ‘control processes,’ or operations under the control of the subjects, analogous to computer software. The structural features of the buffer model include the sensory registers (with different registers for each sense), short-term store, and long-term store. The control process emphasized is rote rehearsal, which takes place in part of short-term store called the ‘buffer.’ The rehearsal buffer has a small capacity with a fixed number of slots (about four). To account for performance in the free recall task, it is assumed that each item enters the buffer and when the buffer is full the newest item displaces a randomly selected older item. While an item is in the buffer, information about it is transferred to long-term store, with the amount of information transferred a linear function of the time spent in the buffer. Although information about an item thereby gets transferred to long-term store, the item remains in the buffer until it is displaced by an incoming item. At test, subjects first respond with any items in the buffer and then try to retrieve other items from long-term store, with the number of retrieval attempts fixed (see Memory Retrieval).
These assumptions allow the model to account for the various empirical observations found with the free recall task. The model accounts for the recency effect and its elimination with a distractor task because the final items are still in the buffer immediately after list presentation but get displaced from the buffer by the interpolated material of the distractor task. The model accounts for the primacy effect by assuming that the buffer starts out empty so the initial items reside in the buffer longer than subsequent items because they are not subject to displacement until the buffer is full. The effect of list length is due to the fixed number of attempts to retrieve information from long-term store. The longer the list, the smaller is the likelihood of finding a particular item. The effect of presentation rate is due to the linear function for transferring information from short-term to long-term store. More information is transferred when the rate is slower so that retrieving an item from long-term store is more likely at a slower rate.
There have been numerous refinements and expansions of the buffer model since it was first proposed. These updated versions have been termed ‘SAM’ (search of associative memory) and ‘REM’ (retrieving effectively from memory). However, these refinements have been focused on the search processes in long-term memory, and the short-term memory component of the model remains largely intact (Shiffrin 1999).
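The buffer assumptions described above can be illustrated with a minimal simulation. The Python sketch below is not Atkinson and Shiffrin's implementation. It assumes a four-slot buffer, uniform random displacement of one resident item when the buffer is full, and one time unit of rehearsal per presented item for every item currently in the buffer. The function names and parameters are hypothetical, and the retrieval stage from long-term store is left out.

```python
import random


def run_trial(list_length, capacity=4, seconds_per_item=1.0, rng=random):
    """Simulate one list presentation through a fixed-capacity rehearsal buffer.

    Returns (time_in_buffer, final_buffer): the rehearsal time accrued by each
    list position, and the set of positions still in the buffer at the end.
    """
    buffer = []  # list positions currently being rehearsed
    time_in_buffer = [0.0] * list_length
    for position in range(list_length):
        if len(buffer) == capacity:
            # Buffer full: the incoming item displaces a random older item.
            buffer.pop(rng.randrange(capacity))
        buffer.append(position)
        # Every resident item is rehearsed during this presentation interval.
        for resident in buffer:
            time_in_buffer[resident] += seconds_per_item
    return time_in_buffer, set(buffer)


def mean_time_in_buffer(list_length, trials=2000, seed=0, **kwargs):
    """Average rehearsal time per list position over many simulated trials."""
    rng = random.Random(seed)
    totals = [0.0] * list_length
    for _ in range(trials):
        times, _final = run_trial(list_length, rng=rng, **kwargs)
        for position, elapsed in enumerate(times):
            totals[position] += elapsed
    return [total / trials for total in totals]


means = mean_time_in_buffer(list_length=20)
for position in (0, 10, 19):
    print(f"position {position + 1:2d}: mean time in buffer = {means[position]:.2f}")
```

Under these assumptions the first items accrue the most rehearsal time, because they cannot be displaced until the buffer fills. If transfer to long-term store is linear in that time, this yields a primacy advantage. The last item is always still in the buffer at the end of the list, which corresponds to the recency effect in immediate recall. Increasing `seconds_per_item` (a slower presentation rate) scales every item's rehearsal time, in line with the presentation-rate account given above.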
3.2 The Perturbation Model
Whereas the buffer model explains the results of the free recall task, a popular model proposed by Estes (1972) explains the results of the distractor paradigm. According to this ‘perturbation model’, the representation in memory of each item in a list is associated with a representation of the experimental context. This contextual representation is known as a ‘control element.’ It is assumed that if the control element is activated, then the representations of the items associated with it are activated in turn, allowing for their recall. Forgetting is attributed in part to the fact that the context shifts with time so that subjects may be unable to activate the appropriate control element after a delay. Reactivation of the item representations by the control element does not only occur at the time of test. In addition, there is a reverberatory loop providing a periodic recurrent reactivation of each item’s representation, with the difference in reactivation times for the various items reflecting the difference in their initial presentation times. This reverberatory activity provides the basis for the short-term memory of the order of the items in a list. Because of random error in the reactivation process, timing perturbations result, and these perturbations may be large enough to span the interval separating item representations, thereby leading to interchanges in the order of item reactivations and, hence, transposition errors during recall. Because such interchanges can occur in either the forward or backward direction for middle items in the list but in only one direction for the first and last items, the bowed serial position function for transposition errors is predicted. The symmetry in the serial position function is predicted with the assumption that timing perturbations do not start until all the list items have been presented. Because interchanges can also occur between the list items and the interpolated distractor items, the gradually increasing proportion of nontransposition errors is also predicted.
The original version of the perturbation model included only the perturbation process responsible for short-term memory, but subsequent research documented the need to include a long-term memory process in addition to the short-term perturbation process (Healy and Cunningham 1999). Other extensions allowed the perturbation model to account for a wide range of findings in the distractor paradigm and also to provide insights into other memory processes such as those responsible for memory distortions in eyewitness situations (Estes 1999; see also Eyewitness Memory: Psychological Aspects).
3.3 Theoretical Controversies and Complications
Although models of memory typically include the distinction between short- and long-term memory, some investigators have pointed to problems with some of the evidence establishing the need to postulate a distinct short-term memory.
For example, Craik and Lockhart (1972) argued that short-term memory cannot be distinguished by the exclusive use of speech coding, as suggested by Conrad (1964). Craik and Lockhart proposed an alternative framework called ‘levels of processing,’ according to which information is encoded at different levels and the level of processing determines the subsequent rate of forgetting (see Memory: Levels of Processing). More recent arguments against the need for a separate short-term memory were made by Crowder (1993), who pointed out that the rapid forgetting across retention intervals in the distractor paradigm is not found on the first trial of an experiment. He also pointed out that a recency effect like that in immediate free recall is found in a number of tasks relying exclusively on long-term memory. Healy and McNamara (1996) dismissed some of these arguments with two general considerations. First, information
can be rapidly encoded in long-term memory, so that, for example, recall can derive from long-term memory rather than short-term memory on the first trial of an experiment before interference from previous trials has degraded the long-term memory representation. Second, although recency effects occur in many memory paradigms, the specific properties and causes of the serial position functions differ across paradigms.
Memory models also typically make the distinction between sensory and short-term memory. However, an important exception is Nairne’s ‘feature model’ (Nairne 1990). Instead of distinguishing between these two types of memory stores or processes, the feature model distinguishes between two types of memory trace features: modality independent (which involve speech coding) and modality dependent (which are perceptual but not sensory). According to this model, during recall subjects compare the features of the memory trace to the features of various item candidates. Forgetting occurs as the result of feature overwriting, by which a new feature can overwrite an old feature but only if the two features are of the same type (modality dependent or modality independent).
Although short-term memory is distinguished from both sensory and long-term memory by most contemporary models of memory, including a recent ‘composite model’ proposed by Estes (1999) to describe the current state of theorizing, many current models have broken down short-term memory into different subcomponent processes. The most popular of these models is Baddeley’s ‘working memory model’ (Baddeley 1992), which includes an attentional control system called the ‘central executive’ (see Attention: Models) and two ‘slave systems,’ the phonological loop (responsible for speech coding) and the visuospatial sketchpad (responsible for the coding of both spatial and visual information).
Recent neuropsychological evidence (Martin and Romani 1994) has led to a further breakdown into semantic and syntactic components in addition to the components of Baddeley’s system. Despite these controversies and complications, it seems clear that the concept of ‘short-term memory’ will continue in the future to play an important role in our theoretical understanding of cognitive processes. See also: Learning and Memory: Computational Models; Learning and Memory, Neural Basis of; Long-term Potentiation (Hippocampus); Memory Models: Quantitative; Short-term Memory: Psychological and Neural Aspects; Visual Memory, Psychology of; Working Memory, Neural Basis of; Working Memory, Psychology of
Bibliography
Atkinson R C, Shiffrin R M 1968 Human memory: A proposed system and its control processes. In: Spence K W, Spence J T (eds.) The Psychology of Learning and Motivation: Advances in Research and Theory 2. Academic Press, San Diego, pp. 89–195
Baddeley A D 1992 Is working memory working? The fifteenth Bartlett lecture. Quarterly Journal of Experimental Psychology 44: 1–31
Brown J 1958 Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental Psychology 10: 12–21
Conrad R 1964 Acoustic confusions in immediate memory. British Journal of Psychology 55: 75–84
Craik F I M, Lockhart R S 1972 Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior 11: 671–84
Crowder R G 1993 Short-term memory: Where do we stand? Memory and Cognition 21: 142–45
Estes W K 1972 An associative basis for coding and organization in memory. In: Melton A W, Martin E (eds.) Coding Processes in Human Memory. Winston, Washington, DC, pp. 161–90
Estes W K 1999 Models of human memory: A 30-year retrospective. In: Izawa C (ed.) On Human Memory: Evolution, Progress, and Reflections on the 30th Anniversary of the Atkinson–Shiffrin Model. Erlbaum, Mahwah, NJ, pp. 59–87
Healy A F, Cunningham T C 1999 Recall of order information: Evidence requiring a dual-storage memory model. In: Izawa C (ed.) On Human Memory: Evolution, Progress, and Reflections on the 30th Anniversary of the Atkinson–Shiffrin Model. Erlbaum, Mahwah, NJ, pp. 151–64
Healy A F, McNamara D S 1996 Verbal learning and memory: Does the modal model still work? Annual Review of Psychology 47: 143–72
James W 1890 The Principles of Psychology. Holt, New York
Martin R C, Romani C 1994 Verbal working memory and sentence comprehension: A multiple-components view. Neuropsychology 8: 506–23
Nairne J S 1990 A feature model of immediate memory. Memory and Cognition 18: 251–69
Peterson L R, Peterson M J 1959 Short-term retention of individual verbal items. Journal of Experimental Psychology 58: 193–8
Shiffrin R M 1999 30 years of memory. In: Izawa C (ed.) On Human Memory: Evolution, Progress, and Reflections on the 30th Anniversary of the Atkinson–Shiffrin Model. Erlbaum, Mahwah, NJ, pp. 17–33
A. F. Healy
Short-term Memory: Psychological and Neural Aspects
1. Short-term Memory
‘Short-term memory’ refers to a number of systems with limited capacity (in the verbal domain, roughly the ‘magical’ number 7 ± 2 items: Miller 1956) concerned with the temporary retention (in the range of seconds) of a variety of materials. Knowledge of the functional and anatomical organization of short-term memory in humans, and of its role in cognition, as at the turn of the twenty-first century, is presented here, drawing data from three main sources of evidence: (a) behavioral studies in normal individuals and in brain-injured patients with selective neuropsychological short-term memory deficits; (b) correlations between the anatomical localization of the cerebral lesion and the short-term memory disorder of brain-damaged patients; (c) correlations between the activation of specific cerebral areas and the execution of tasks assessing short-term retention in normal subjects. The two most extensively investigated aspects of short-term memory are considered: verbal and visual/spatial. ‘Short-term memory’ is closely related to the concept of ‘working memory’ (see Working Memory, Psychology of; Working Memory, Neural Basis of). The present article focuses on the ‘storage’ and ‘rehearsal’ components of the system, rather than on the cognitive operations and executive functions currently associated with ‘working memory.’ However, the section devoted to the uses of short-term memory illustrates some of the working aspects of short-term retention.
2. Historical Origin of the Construct ‘Short-term Memory’
Suggestions of a distinction between two types of memory, one concerned with temporary retention, and the other having the function of a storehouse for materials which have been laid out of sight, date back at least to John Locke’s ‘Essay Concerning Human Understanding’ (1700). William James (1895) revived the distinction, suggesting the existence of a limited-capacity ‘primary memory,’ embracing the present and the immediate past, and subserving consciousness (see Consciousness, Neural Basis of). Psychological research in the nineteenth century and in the first half of the twentieth century was, however, mainly concerned with the diverse factors affecting learning and retention, in the context of a basically unitary view of human memory. It was only in the 1950s that short-term memory became the object of systematic behavioral studies in normal subjects (Baddeley 1976). In the late 1960s the division of human memory into a short- and a long-term system became a current view (Atkinson and Shiffrin 1968).
3. Functional Architecture of Short-term Memory
3.1 Evidence from Normal Subjects
Three main behavioral phenomena suggest the existence of a discrete limited-capacity system, concerned with short-term retention (Baddeley 1976, Baddeley 1986, Glanzer 1972). (a) Accuracy in recalling a short list of stimuli (e.g., trigrams of consonants or of words) decreases dramatically in a few seconds if the subjects’ repetition of the memory material (rehearsal) is prevented by a distracting activity such as counting
backwards by threes. (b) In the immediate free recall of a sequence of events, such as words, the final five to six stimuli on the list are recalled better than the preceding ones. This ‘recency effect’ vanishes after a few seconds of distracting activity, and is minimally affected by factors such as age, rate of presentation of the stimuli, and word frequency. (c) In the immediate serial recall of verbal material (memory span), the subject’s performance is affected by factors such as phonological similarity and word length, with the effects of semantic factors being comparatively minor. Each of these phenomena subsequently proved to be considerably more complex than initially thought. They were also interpreted as compatible with a unitary, single-system, view of human memory. This account proved untenable, however, mainly on the basis of neuropsychological evidence.
These empirical observations illustrate the main characteristics of ‘short-term memory’: a retention system with limited capacity, where the memory trace, in the time range of seconds, shows a decay, which may be prevented through rehearsal. Material stored in short-term memory has a specific representational format, which, in the case of the extensively investigated verbal domain, involves phonological codes, separately from lexical-semantic representations stored in long-term memory. The latter contribute, however, to immediate retention, e.g., in verbal span tasks.
The functional architecture of phonological short-term memory has been investigated in detail using effects which break down storage and rehearsal subcomponents. The effect of phonological similarity, whereby the immediate serial recall of auditory and visual verbal material is poorer for sequences of phonologically similar stimuli than for dissimilar ones, reflects the coding which takes place in the phonological short-term store.
The effect of word length, whereby the immediate serial recall of auditory and visual verbal material is poorer for sequences of long words than for short ones, is held to reflect the activity of the process of rehearsal, abolished by ‘articulatory suppression,’ i.e., a continuous uttering of an irrelevant speech sound. Suppression, while disrupting rehearsal, also reduces immediate memory performance in span tasks. The interaction between phonological similarity, input modality and articulatory suppression, with suppression abolishing phonological similarity only when the stimuli are presented visually, suggests that rehearsal participates in the process of conveying visual verbal material to the phonological short-term store. Finally, some phonological judgments (such as rhyme and stress assignment) are held to involve the articulatory components of rehearsal, because the performance of normal subjects is selectively impaired by suppression (Burani et al. 1991). In the nonverbal domain a similar distinction is drawn between visual and spatial short- and long-term memory systems, with the relevant representational
Figure 1 Short-term forgetting of a single letter by patient PV after 3 to 15 seconds of delay filled by arithmetic distracting activity. With an auditory input, dramatic forgetting occurred within a few seconds. In the visual modality, the patient’s performance was at ceiling. Immediate recall was fully accurate in both input modalities, ruling out perceptual and response-related deficits, differently from the short-term memory disorder (Source: data from Basso A, Spinnler H, Vallar G, Zanobio M E 1982 Left hemisphere damage and selective impairment of auditory-verbal short-term memory. Neuropsychologia 20: 263–274)
format being in terms of the shape or spatial location of the stimulus (Baddeley 1986, Della Sala and Logie 1993). Similarity, recency and interference effects have also been observed in the visuo-spatial domain. Visuo-spatial short-term memory is likely to comprise storage and rehearsal (pictorial and spatial) components. In the case of spatial locations, rehearsal may be conceived in terms of planned movements (e.g., ocular, manual reaching, locomotion) towards a target coded in a spatial reference frame (e.g., egocentric).
3.2 Evidence from Brain-injured Patients Studies in patients with localized brain damage provide unequivocal evidence that supports the independence of short- and long-term memory systems, in the form of a double dissociation of deficits. In patients with ‘global amnesia,’ which may be characterized as a selective impairment of the declarative or explicit component of long-term memory (see Declarative Memory, Neural Basis of; Episodic and Autobiographical Memory: Psychological and Neural Aspects) verbal and visuo-spatial short-term memory
are unimpaired, with immediate serial span, the recency effect in immediate free recall, and short-term forgetting all being within normal limits. This functional dissociation is compatible with a serial organization of the two systems, with temporary retention being a necessary condition for long-term storage (Atkinson and Shiffrin 1968). Since the late 1960s selective impairments of short-term memory have been reported (Vallar and Papagno 1995, Vallar and Shallice 1990). The more extensively investigated area concerns auditory-verbal (phonological) short-term memory. Patients show a disproportionate reduction of the auditory-verbal span to an average of less than three items (digits, letters, or words); the recency effect in immediate free recall of auditory-verbal lists of words is abolished; short-term forgetting is abnormally rapid. The disorder is modality-specific, with the level of performance being better when the material is presented visually (Fig. 1). This input-related dissociation has two main implications: (a) discrete phonological and visual short-term memory components exist; (b) in the input–output processing chain, the phonological short-term store should be envisaged as an input system, rather than an output buffer store. This division argues against a monolithic view of the system as a single store, which is amodal, i.e., not specific for the different sensory modalities (see Atkinson and Shiffrin 1968). Additional support for this input locus of the system comes from the observation that in some of these patients speech production is entirely preserved. Patients with defective phonological short-term memory may show unimpaired long-term verbal and visuo-spatial learning (Fig. 2).
This observation further corroborates the independence of short- and long-term memory systems, but is incompatible with a serial organization, in which defective short-term memory entails a long-term memory impairment (see Atkinson and Shiffrin 1968), suggesting a parallel architecture instead. After early perceptual analysis, information may enter short- or long-term memory, either of which may be selectively disrupted by brain damage. The learning abilities of these patients are, however, dramatically impaired when the phonological material to be learned does not possess pre-existing lexical-semantic representations in long-term memory. This is the case of pronounceable meaningless sequences of letters, such as nonwords, or words of a language unknown to the subject (Fig. 2). This phonological learning deficit indicates that, within a specific representational domain, temporary retention in short-term memory is necessary for stable long-term learning and retention, as predicted by serial models. Other brain-injured patients show deficits of short-term retention in the visual and spatial domain, even though these patterns of impairment have been explored less extensively than the phonological disorder. Immediate retention of short sequences of spatial
Figure 2 Word (A) and nonword (B) paired-associate learning by patient PV and matched control subjects (C), with auditory presentation. The patient’s performance was within the normal range with native language (Italian) words, dramatically defective with unfamiliar nonwords (Russian words transliterated into Italian) (Source: Baddeley A D, Papagno C, Vallar G 1988 When long-term learning depends on short-term storage. Journal of Memory and Language 27: 586–595)
locations, as assessed by a block tapping task (a spatial analogue of digit span), may be selectively defective, with verbal memory being unaffected, in both its short- and long-term components. In other patients, the deficit may involve the visual (shape) component of short-term memory, while impairments in the short-term visual recognition of unfamiliar faces, objects, voices, and colours have also been described. The disorder of patients with defective visual imagery (as assessed, for instance, by tasks requiring colour or size comparisons) may be interpreted in terms of an impaired visual short-term memory store (see Visual Imagery, Neural Basis of). A deficit of visuo-spatial short-term memory may also disrupt long-term learning of unfamiliar non-verbal material, as assessed by recognition memory for unfamiliar faces and objects. This extends to the visuo-spatial domain the conclusion that long-term acquisition requires short-term storage. The process of rehearsal has been traditionally described as an activity which, through rote repetition, refreshes the short-term memory trace, preventing its decay. The precise characteristics of rehearsal have been elucidated in more detail in recent years, particularly in the domain of phonological memory. Rehearsal may be conceived in terms of the recoding of the memory trace from input (auditory-verbal) to output-related (articulatory) representations and
vice-versa. More specifically, rehearsal of verbal material has long been regarded as ‘articulatory’ in nature, involving output-related verbal processes— such as the motor programming of speech production in a phonological assembly system or output buffer— the actual articulation of the material to be rehearsed (subvocal rehearsal), or both. Anarthric subjects, who are unable to utter any articulated speech sound due to congenital disorders or brain damage acquired in adult age, may, nonetheless, show a preserved immediate memory, including verbal rehearsal. This suggests that the process is ‘central’ and does not require the activity of the peripheral musculature. Brain-damaged patients with a selective impairment of rehearsal (Vallar et al. 1997) show a defective immediate verbal span, as do patients with damage to the phonological short-term store. Both components of phonological memory contribute to immediate retention, though the phonological store provides a major retention capacity. The ability to perform phonological judgments is disrupted, however, by damage to the rehearsal process, but not to the phonological store. By contrast, a defective rehearsal leaves the recency effect in immediate free recall of auditory lists of words largely unimpaired. This recency effect is largely reduced or absent in patients with damage to the phonological short-term store.
4. The Uses of Short-term Memory Is there a use for a system providing the temporary retention of a limited number of stimuli, besides infrequent situations such as the following? A friend tells us an unfamiliar eight-digit number, which we have to dial on a telephone placed on the other side of a large street, and we have no paper and pencil to write it down. The answer is positive. Short-term retention contributes to the stable acquisition of new information in long-term memory. More specifically, phonological short-term memory plays an important role in learning new vocabulary and participates in the processes of speech comprehension and production (see Speech Perception; Speech Production, Psychology of).
4.1 Long-term Learning The observation that patients with defective auditory-verbal span are also impaired in learning unfamiliar pronounceable letter sequences gives rise to the possibility that phonological memory may contribute to a relevant aspect of language development, the acquisition of vocabulary (Fig. 2). Similarly, subjects with a developmental deficit of phonological memory are impaired in vocabulary acquisition and in nonword learning. An opposite pattern is provided by subjects with a congenital cognitive impairment which selectively spares phonological short-term memory: in these subjects, the acquisition of vocabulary and of foreign languages, and nonword learning, are also preserved. Converging evidence from different subject populations supports this view. Correlational studies in children have shown that the capacity of phonological memory is a main predictor of the subsequent acquisition of vocabulary, both in the native and in a second language. In normal adult subjects, the variables which disrupt immediate memory span (phonological similarity, item length, articulatory suppression) also impair the acquisition of nonwords. Polyglots have a greater capacity of phonological memory, compared to nonpolyglots, and a better ability to learn novel words. Phonological short-term memory may be considered as a learning device for the acquisition of novel phonological representations, and the building up of the phonological lexicon (Baddeley et al. 1998). A few observations in brain-damaged patients suggest a similar role for visuo-spatial short-term memory in the acquisition of new visual information, such as unfamiliar faces and objects.
4.2 Language Processing The idea that short-term retention contributes to speech comprehension dates back to the 1960s. Phonological memory may hold incoming auditory-verbal strings, while syntactic and lexical-semantic analyses are performed. Patients with defective phonological memory show a preserved comprehension of individual words, as well as many sentential materials, and a normal ability to decide whether or not sentences are grammatically correct. This may reflect, on the one hand, the operation of on-line lexical-semantic processes, heuristics and pragmatics, and, on the other, the independence of syntactic and lexical-semantic processes from phonological memory. Patients are, however, impaired by ‘complex’ sentences, where ‘complexity’ refers to a number of non-mutually-exclusive factors, such as: (a) a high speed of material presentation, which prevents the immediate build-up of an unambiguous cognitive representation; (b) word order conveying meaning-crucial information (e.g., in sentences in which a semantic anomaly is introduced by a change in the linear arrangement of words: ‘The world divides the equator into two hemispheres, the southern and the northern’); (c) extralinguistic presuppositions biasing the interpretation of the spoken message. Under such conditions, adequate interpretation may require backtracking to the verbatim (phonological) representation of the sentence, temporarily held in phonological memory. This provides a ‘backup’ or ‘mnemonic window’ resource for performing supplementary cognitive operations necessary for comprehension (Vallar and Shallice 1990).
5. Neural Architecture of Short-term Memory 5.1 Phonological Short-term Memory Anatomoclinical correlation studies in brain-damaged patients with a selective impairment of the auditory-verbal span indicate that the inferior parietal lobule (supramarginal gyrus) of the left hemisphere, at the temporoparietal junction, represents the main neural correlate of the ‘store’ component of phonological short-term memory (Vallar and Papagno 1995). The frontal premotor regions in the left hemisphere and other structures such as the insula are the major neural correlates of the ‘rehearsal’ component, even though the available anatomoclinical data are more limited (Vallar et al. 1997). Functional neuroimaging studies in normal subjects concur with this pathological evidence, to suggest a left-hemisphere-based network. Activation in the left supramarginal gyrus [Brodmann’s area (BA) 40] is associated with the ‘store’ component of short-term phonological memory, activation in the left frontal premotor BA 44 (Broca’s area) and BA 6, and in the left insula, with the ‘rehearsal’ component (Paulesu et al. 1996). In these studies, in line with the behavioral evidence from normal subjects and patients, an immediate verbal span task activates both the inferior parietal region (phonological short-term store) and the premotor cortex (rehearsal process) in the left hemisphere. Conversely, rhyme judgements selectively activate the left premotor regions, whose damage, in turn, disrupts the patients’ ability to perform this task. These activation and lesion-based data support, from an anatomofunctional perspective, the behavioral distinction between a ‘storage’ component and a ‘rehearsal process’ in phonological short-term memory. Furthermore, they qualify ‘rehearsal’ as a process which makes use of components also concerned with the planning (i.e., programming in the left premotor cortex) of articulated speech. Seen in this perspective, phonological memory may be regarded as a component of the language system. Finally, connectionist modelling of this architecture is currently being developed (Burgess and Hitch 1999).
5.2 Visual and Spatial Short-term Memory Studies in brain-damaged patients suggest an association between damage to the posterior regions of the right hemisphere and defective spatial short-term memory, as assessed by tasks requiring the reproduction of sequences of spatial locations. There is also evidence suggesting that damage to the posterior regions of the left hemisphere brings about disorders of visual short-term memory, such as defective immediate retention of sequences of visual stimuli (e.g., lines), and impaired recognition of more than one visual stimulus at a time (defective simultaneous form perception) (Vallar and Papagno 1995). Neuroimaging activation studies in humans support the distinction between primarily spatial (a location, a ‘where’ component) and visual short-term memory systems (Smith et al. 1995). The main neural correlates of spatial memory for locations include the occipital extra-striate, posterior–inferior parietal, dorsolateral premotor and prefrontal cortices. Short-term visual recognition memory is associated with activations in a network including the posterior–inferior parietal, temporal and ventral premotor and prefrontal cortices. Right hemisphere regions may play a more relevant role in spatial memory for locations, while left hemisphere regions are more involved in visual memory for objects. These patterns of activation indicate an association of the ‘dorsal visual stream’ with spatial short-term memory for locations and of the ‘ventral visual stream’ with short-term visual recognition memory.
6. Conclusion Behavioural observations and neuroanatomical evidence from normal subjects and brain-injured patients concur in suggesting that ‘short-term memory’ should be conceived as a multiple-component system with specific functional properties and discrete neural correlates. These systems secure the retention of a limited amount of material in the time range of seconds and contribute to relevant aspects of cognition, such as long-term learning.
See also: Declarative Memory, Neural Basis of; Episodic and Autobiographical Memory: Psychological and Neural Aspects; Learning and Memory, Neural Basis of; Memory Retrieval; Memory: Synaptic Mechanisms; Recognition Memory (in Primates), Neural Basis of; Short-term Memory, Cognitive Psychology of; Working Memory, Neural Basis of
Bibliography
Atkinson R C, Shiffrin R M 1968 Human memory: a proposed system and its control processes. In: Spence K W, Taylor Spence J (eds.) The Psychology of Learning and Motivation. Advances in Research and Theory. Academic Press, New York
Baddeley A D 1976 The Psychology of Memory. Basic Books, New York
Baddeley A D 1986 Working Memory. Clarendon Press, Oxford, UK
Baddeley A D, Gathercole S, Papagno C 1998 The phonological loop as a language learning device. Psychological Review 105: 158–73
Burani C, Vallar G, Bottini G 1991 Articulatory coding and phonological judgements on written words and pictures: the role of the phonological output buffer. European Journal of Cognitive Psychology 3: 379–98
Burgess N, Hitch G J 1999 Memory for serial order: A network model of the phonological loop and its timing. Psychological Review 106: 551–81
Della Sala S, Logie R H 1993 When working memory does not work: the role of working memory in neuropsychology. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology. Elsevier, Amsterdam
Glanzer M 1972 Storage mechanisms in recall. In: Bower G H (ed.) The Psychology of Learning and Motivation. Advances in Research and Theory. Academic Press, New York
James W 1895 The Principles of Psychology. Holt, New York
Miller G A 1956 The magic number seven, plus or minus two: some limits to our capacity for processing information. Psychological Review 63: 81–93
Paulesu E, Frith U, Snowling M, Gallagher A, Morton J, Frackowiak R S J, Frith C D 1996 Is developmental dyslexia a disconnection syndrome? Evidence from PET scanning. Brain 119: 143–57
Smith E E, Jonides J, Koeppe R A, Awh E, Schumacher E H, Minoshima S 1995 Spatial versus object working memory: PET investigations. Journal of Cognitive Neuroscience 7: 337–56
Vallar G, Di Betta A M, Silveri M C 1997 The phonological short-term store-rehearsal system: Patterns of impairment and neural correlates. Neuropsychologia 35: 795–812
Vallar G, Papagno C 1995 Neuropsychological impairments of short-term memory. In: Baddeley A D, Wilson B A, Watts F (eds.) Handbook of Memory Disorders. Wiley, Chichester, UK
Vallar G, Shallice T (eds.) 1990 Neuropsychological Impairments of Short-term Memory. Cambridge University Press, Cambridge, UK
G. Vallar
Shyness and Behavioral Inhibition Since the 1980s, the study of shyness, behavioral inhibition, and social withdrawal in childhood has taken on a research trajectory that can best be described as voluminous. Yet, these related phenomena remain something of a mystery, carrying with them a variety of definitions and a number of very different perspectives concerning psychological significance. Because there appear to be several different characterizations of forms of shy, inhibited, or nonsocial behavior, the first goal of this article is to provide a definitional framework.
1. Shyness, Behavioral Inhibition, and Social Withdrawal: Definitions In their efforts to identify the etiology of children’s personalities and social behaviors, developmental scientists have attempted to determine the relevant dispositional dimensions of temperament that may underlie children’s actions that are displayed consistently across situations, and continuously over time. One such dispositional construct is behavioral inhibition (Kagan 1997). Behavioral inhibition has been defined variously as (a) an inborn bias to respond to unfamiliar events by showing anxiety; (b) a specific vulnerability to the uncertainty all children feel when encountering unfamiliar events that cannot be assimilated easily; and (c) one end of a continuum of possible initial behavioral reactions to unfamiliar objects or challenging social situations. These definitions highlight some common elements: behavioral inhibition is (a) a pattern of responding or behaving, (b) possibly biologically determined, such that (c) when unfamiliar and/or challenging situations are encountered, (d) the child shows signs of anxiety, distress, or disorganization. The term shyness has been used to refer to inhibition in response to novel social situations. In infancy and early childhood, shyness is elicited by feelings of anxiety and distress when confronted by unfamiliar people. To some, such behavior serves an adaptive purpose in that it removes children from situations they perceive as discomforting and dangerous (Buss 1986). But variability of the infantile shyness response is great; some children clearly demonstrate the behavior excessively. In middle childhood, children have the cognitive skill to compare themselves with others and to
understand that others can, and do, pass judgment on them. It is this understanding of social comparison and evaluation that can elicit shy behavior among older children, adolescents and adults (Buss 1986). That is, children may remove themselves from situations they believe will be discomforting because others will pass judgment on their skills, personae, and so on. Again, variability is great; some children display shyness that is clearly out of the ordinary. A related construct is social withdrawal, a phenomenon that refers to the consistent (across situations and over time) display of solitary behavior when encountering familiar and/or unfamiliar peers (Rubin and Stewart 1996). Thus, shyness is embedded within the construct of social withdrawal. The common denominator for inhibition, shyness, and social withdrawal is that the representative behavior is one that moves children away from their social worlds, that is, solitude.
2. Different Forms of Inhibition, Shyness, and Solitude Inhibited behavior in the face of the unfamiliar typically has been assessed by observing very young children as they are confronted by unfamiliar adults, objects, or peers. Insofar as shyness is concerned, it is best exemplified behaviorally by the display of socially reticent behavior during the preschool and early childhood periods. Among unfamiliar others, reticent preschoolers hang back, watching peers from afar; they make few social initiations and back away from initiations made by others. When asked to speak up in groups of unfamiliar peers, reticent children choose not to do so at all, and if they do speak, it is uncomfortably and for a very short period of time. Further, when asked to collaborate with peers to solve problems, they spend their time watching others rather than providing the requested help. Reticence is one of several forms of solitude demonstrated by children (Rubin and Asendorpf 1993). Some preschoolers appear to be satisfied playing with objects rather than people. These children play alone, constructing and exploring objects. The term used to describe such behavior is solitary-passivity (Coplan et al. 1994). When others initiate interaction with solitary-passive children, they do not back away; further, they participate fully and actively in group problem-solving behavior. Thus, unlike shy children who have a high social avoidance motivation, those whose solitary behavior is mainly of the solitary-passive ilk may be described as having a low social approach motivation. Still other children demonstrate solitude that appears to reflect immaturity. These young children engage in solitary-sensorimotor activity (solitary-active play), repeating motor movements with or without objects (Coplan et al. 1994).
2.1 Developmental Differences and Change Typically all forms of solitude decrease from early through middle and late childhood. As noted above, causes of shy behavior change from early wariness to the anxiety associated with being socially evaluated. While reticent behavior tends to decrease with age, it is also the case that the ‘meanings’ of different forms of solitude change as well. For example, as children come to cope with their social anxieties among unfamiliar peers, they become increasingly likely to display that form of solitude that had been viewed earlier as normal and adaptive, namely solitary exploration and construction. Thus, with increasing age, such constructive solitude becomes increasingly associated with measures of social wariness, and physiological markers of anxiety and emotion dysregulation (see below).
3. Putative Causes of Inhibition, Shyness, and Social Withdrawal 3.1 Physiology Behavioral inhibition has been thought to emanate from a physiological ‘hard wiring’ that evokes caution, wariness, and timidity in unfamiliar social and nonsocial situations. Inhibited infants and toddlers differ from their uninhibited counterparts in ways that imply variability in the threshold of excitability of the amygdala and its projections to the cortex, hypothalamus, sympathetic nervous system, corpus striatum, and central gray. Evidence that there is a physiological basis underpinning behavioral inhibition is drawn from numerous psychophysiological studies. For example, stable patterns of right frontal EEG asymmetries in infancy predict temperamental fearfulness and behavioral inhibition in early childhood (Calkins et al. 1996). The functional role of hemispheric asymmetries in the regulation of emotion may be understood in terms of an underlying motivational basis for emotional behavior, specifically along the approach–withdrawal continuum. Infants exhibiting greater relative right frontal asymmetry are more likely to withdraw from mild stress, whereas infants exhibiting the opposite pattern of activation are more likely to approach. Another physiological entity that distinguishes wary from nonwary infants and toddlers is vagal tone, an index of the functional status or efficiency of the nervous system, marking both general reactivity and the ability to regulate one’s level of arousal. Reliable associations have been found between vagal tone and inhibition in infants and toddlers: children with lower vagal tone (consistently high heart rate due to less parasympathetic influence) tend to be more behaviorally inhibited (Kagan et al. 1987). In early childhood, reticent behavior is associated with the same physiological markers as is the case for
behavioral inhibition in infancy and toddlerhood. Thus, in early childhood, reticent, fearful, solitary behavior is associated with greater relative right frontal EEG activation; but constructive solitude is not (Fox et al. 1996). Further, parents view children who have such EEG asymmetries as anxious. Among older, elementary school-age children, shy, reticent behavior among familiar peers (i.e., social withdrawal) has been associated positively with internalized negative emotions such as nervousness, distress, and upset; and negatively related to positive emotions such as enthusiasm and excitement.
3.2 Parent–Child Relationships Attachment theorists maintain that the primary relationship develops during the first year of life, usually between the mother and the infant (Bowlby 1973). Maternal sensitivity and responsiveness influence whether the relationship will be secure or insecure. Researchers have shown that securely attached infants are likely to be well adjusted, socially competent, and successful at forming peer relationships in early and middle childhood whereas insecurely attached children may be less successful at social developmental tasks. Researchers have proposed that those infants who are temperamentally reactive and who receive insensitive, unresponsive parenting come to develop an insecure-ambivalent (‘C’-type) attachment relationship with their primary caregivers (e.g., Calkins and Fox 1992). In novel settings these ‘C’ babies maintain close proximity to the attachment figure (usually the mother). When the mother leaves the room briefly, these infants become quite unsettled. Upon reunion with the mother, these infants show angry, resistant behaviors interspersed with proximity- or contact-seeking behaviors. It is argued further that this constellation of infant emotional hyperarousability and insecure attachment may lead the child to display inhibited/wary behaviors as a toddler, and there are data supportive of this conjecture. Given that the social behaviors of preschoolers and toddlers who have an insecure ‘C’-type attachment history are thought to be guided largely by fear of rejection, it is unsurprising to find that when these children are observed in peer group settings, they appear to avoid rejection by demonstrating passive, adult-dependent behavior and withdrawal from social interaction. Lastly, ‘C’ babies lack confidence and assertiveness at age 4 years; then, at age 7 years they are seen as passively withdrawn (Renken et al. 1989).
3.3 Parenting Recently, researchers have shown that parental influence and control maintain and exacerbate child inhibition and social withdrawal. For example, mothers of extremely inhibited toddlers have been observed to display overly solicitous behaviors (i.e., intrusively controlling, unresponsive, physically affectionate). Mothers of shy preschoolers do not encourage independence and exploration. And mothers of socially withdrawn children tend to be highly controlling, overprotective, and poor reciprocators of their child’s displays of positive behavior and positive affect. Lastly, researchers have shown that mothers of socially withdrawn children are more likely than those of normal children to use such forms of psychological control as devaluation, criticism, and disapproval. Taken together, these parenting practices may attack the child’s sense of self-worth (for a review, see Rubin et al. 1995).
4. Correlates and Consequences of Inhibition, Shyness, and Social Withdrawal Researchers who have followed, longitudinally, the course of development for inhibited infants have found strong consistency of behavior over time. As a group, children identified as extremely inhibited are more likely to be socially wary with unfamiliar peers in both the laboratory and at school, and to exhibit signs of physiological stress during social interactions (Kagan et al. 1987). In a longitudinal study extending into adulthood, Caspi and Silva (1995) found that individuals identified as shy, fearful, and withdrawn at 3 years reported at 18 years that they preferred to stick with safe activities, were cautious and submissive, and had little desire to influence others. A subsequent follow-up at age 21 on interpersonal functioning showed that these same children were normally adjusted in both their work settings and their romantic relationships. Social withdrawal appears to carry with it the risk of a child’s developing negative thoughts and feelings about the self. Highlighting the potential long-term outcomes of social withdrawal is a recent report which showed that social withdrawal among familiar peers in school at age 7 years predicted negative self-perceived social competence, low self-worth, loneliness, and felt peer-group insecurity among adolescents aged 14 years (Rubin et al. 1995). These latter findings are augmented by related research findings. For example, Renshaw and Brown (1993) found that passive withdrawal at age 9 to 12 years predicted loneliness assessed one year later. Ollendick et al. (1990) reported that 10-year-old socially withdrawn children were more likely to be perceived by peers as withdrawn and anxious, more disliked by peers, and more likely to have dropped out of school than their well-adjusted counterparts five years later.
Finally, Morison and Masten (1991) indicated that children perceived by peers as withdrawn and isolated in middle childhood were more likely to think negatively of their social competencies and relationships in adolescence. In sum, it would appear as if early social withdrawal, or its relation to anxiety, represents a behavioral marker for psychological and interpersonal maladaptation in childhood and adolescence.
5. Summary and Future Directions The study of inhibition and shyness garnered an enormous amount of attention in the 1990s. Most empirical research has focused on the contemporaneous and predictive correlates of social reticence, shyness, and withdrawal at different points in childhood and adolescence. These correlated variables include those of the biological, intrapersonal, interpersonal, and psychopathology ilk that have been chosen from conceptual frameworks pertaining to the etiology, stability, and outcomes of socially wary and withdrawn behaviors. Thus far, it appears that socially inhibited children have a biological disposition that fosters emotional dysregulation in the company of others. These children, if overly directed and protected by their primary caregiver, become reticent and withdrawn in the peer group. In turn, such behavior precludes the development of social skills and the initiation and maintenance of positive peer relationships. Yet again, this transactional experience seems to lead children to develop anxiety, loneliness, and negative self-perceptions of their relationships and social skills. Despite these strong conclusions, however, it is important to recognize that the data bases upon which these conclusions rest are relatively few. Clearly, replication work is necessary. The extent to which dispositional factors interact with parenting styles and parent–child relationships to predict the consistent display of socially withdrawn behavior in familiar peer contexts still needs to be established. Further, the sex differences discussed above require additional attention. Lastly, our knowledge of the developmental course of inhibition, shyness, and social withdrawal is constrained by the almost sole reliance on data gathered in Western cultures. 
Little is known about the developmental course of these phenomena in Eastern cultures such as those in China, Japan, or India; and even less is known in Southern cultures such as those found in South America, Africa, and southern Europe. It may well be that, depending on the culture within which these phenomena are studied, the biological, interpersonal, and intrapersonal causes, concomitants, and consequences of inhibition, shyness, and social withdrawal vary. In short, cross-cultural research is necessary, not only for the study of these phenomena, but also for most behaviors that are viewed as deviant or reflective of intrapsychic abnormalities in the West.
See also: Attachment Theory: Psychological; Emotional Inhibition and Health; Personality and Social Behavior; Personality Development and Temperament; Temperament and Human Development
Bibliography
Bowlby J 1973 Attachment and Loss. Attachment. Basic Books, New York, Vol. 1
Buss A H 1986 A theory of shyness. In: Jones W H, Cheek J M, Briggs S R (eds.) Shyness: Perspectives on Research and Treatment. Plenum, New York
Calkins S D, Fox N A 1992 The relations among infant temperament, security of attachment and behavioral inhibition at 24 months. Child Development 63: 1456–72
Calkins S D, Fox N A, Marshall T R 1996 Behavioral and physiological antecedents of inhibited and uninhibited behavior. Child Development 67: 523–40
Caspi A, Silva P A 1995 Temperamental qualities at age three predict personality traits in young adulthood: Longitudinal evidence from a birth cohort. Child Development 66: 486–98
Coplan R J, Rubin K H, Fox N A, Calkins S D, Stewart S L 1994 Being alone, playing alone, and acting alone: Distinguishing among reticence, and passive- and active-solitude in young children. Child Development 65: 129–37
Fox N A, Schmidt L A, Calkins S D, Rubin K H, Coplan R J 1996 The role of frontal activation in the regulation and dysregulation of social behavior during the preschool years. Development and Psychopathology 8: 89–102
Kagan J 1997 Temperament and the reactions to unfamiliarity. Child Development 68: 139–43
Kagan J, Reznick J S, Snidman N 1987 The physiology and psychology of behavioral inhibition in children. Child Development 58: 1459–73
Morison P, Masten A S 1991 Peer reputation in middle childhood as a predictor of adaptation in adolescence: A seven-year follow-up. Child Development 62: 991–1007
Ollendick T H, Greene R W, Weist M D, Oswald D P 1990 The predictive validity of teacher nominations: A five-year follow-up of at risk youth. Journal of Abnormal Child Psychology 18: 699–713
Renken B, Egeland B, Marvinney D, Mangelsdorf S, Sroufe L 1989 Early childhood antecedents of aggression and passive-withdrawal in early elementary school. Journal of Personality 57: 257–81
Renshaw P D, Brown P J 1993 Loneliness in middle childhood: Concurrent and longitudinal predictors. Child Development 64: 1271–84
Rubin K H, Asendorpf J 1993 Social withdrawal, inhibition and shyness in childhood: Conceptual and definitional issues. In: Rubin K H, Asendorpf J (eds.) Social Withdrawal, Inhibition, and Shyness in Childhood. L. Erlbaum Associates, Hillsdale, NJ
Rubin K H, Chen X, McDougall P, Bowker A, McKinnon J 1995 The Waterloo Longitudinal Project: Predicting internalizing and externalizing problems in adolescence. Development and Psychopathology 7: 751–64
Rubin K H, Stewart S L 1996 Social withdrawal. In: Mash E J, Barkley R A (eds.) Child Psychopathology. Guilford, New York, pp. 277–307
Rubin K H, Stewart S, Chen X 1995 Parents of aggressive and withdrawn children. In: Bornstein M H (ed.) Handbook of Parenting: Children and Parenting. L. Erlbaum Associates, Mahwah, NJ, Vol. 1, pp. 225–84
K. H. Rubin
Sibling-order Effects Historically, sibling order has influenced important aspects of social, economic, and political life, and it continues to do so today in many traditional societies. Discriminatory inheritance laws and customs about royal succession that favor firstborns and eldest sons are just two examples. Sibling-order effects have also been documented for a wide variety of behavioral tendencies, although the magnitude and causal interpretation of these effects have been subject to debate. In the Darwinian process of competing for parental favor, siblings often employ strategies that are shaped by their order of birth within the family, and these strategies exert a lasting impact on personality. Radical revolutions are particularly likely to elicit differences in support by birth order. These behavioral patterns appear to be mediated by differences in personality as well as by differing degrees of identification with the family.
1. Social, Economic, and Political Consequences Many societies—especially in past centuries and in non-Western parts of the world—have engaged in practices that favor one sibling position over another. For example, most traditional societies permit infanticide, especially when a child has a birth defect or when an immediately older infant is still breastfeeding. However, no society condones the killing of the elder of two siblings (Daly and Wilson 1988). An extensive survey of birth order and its relationship to social behavior in 39 non-Western societies found that the birth of a first child typically increases the status of parents and stabilizes the marriage. In addition, firstborns were generally found to receive more elaborate birth ceremonies, were given special privileges, and tended to exert authority over their younger siblings. In most of these 39 societies, firstborns maintained supremacy over their younger siblings even in adulthood. They also gained control of a greater share of parental property than laterborns, were more likely to become head of their local kin group, and tended to exert greater power in relationships with nonfamily members (Rosenblatt and Skoogberg 1974).
Previously, many Western societies employed sibling order as a means of deciding who inherits parental property or assumes political power. Primogeniture (the policy by which the eldest child or son automatically inherits property or political authority) was the most commonly employed mechanism, but other discriminatory inheritance practices have also been employed. For example, secondogeniture involves leaving the bulk of parental property to the second child or second son, and ultimogeniture involves leaving such property to the lastborn or youngest son. Most variations in inheritance practices by sibling order can be understood by considering the advantages that accrue from such policies, given local economic circumstances (Hrdy and Judge 1993). For example, primogeniture has generally been practiced in societies where wealth is stable and based on limited land, and where talent does not matter much. Leaving the bulk of parental property to a single offspring avoids subdividing estates and hence reducing the social status of the family patronymic, something that was particularly important in past centuries among the landed aristocracy. However, in Renaissance Venice economic fortunes were generally based on speculative commerce rather than ownership of property, and parents typically subdivided their estates equally so as to maximize the chances of having one or more commercially successful offspring (Herlihy 1977). Ultimogeniture is a policy often found in societies that impose high death taxes on property. This inheritance practice has the consequence of maximizing the interval between episodes of taxation, thus reducing the overall tax burden. Even in societies employing inheritance systems that favor one sibling over others, parents have commonly provided more or less equally for their offspring by requiring the child who inherits the family estate to pay a compensatory amount to each sibling.
Primogeniture and related practices did not always mean disinheritance—a common misassumption. In medieval and early modern times, however, primogeniture among the landed aristocracy did mean that some younger sons and daughters faced difficult economic and social prospects. Among the medieval Portuguese nobility, for example, landless younger sons and daughters were significantly less likely to marry and leave surviving offspring (Boone 1986). Younger sons, for example, left 1.6 fewer children than did eldest sons. Younger sons were also nine times more likely than eldest sons to father a child out of wedlock. Because they posed a serious threat to political stability in their own country, younger sons were channeled by the state into expansionist military campaigns in faraway places such as India, where they often died in battle or from disease. Similarly, the Crusades can be seen, in part, as a church-state response to this domestic danger (Duby 1977). The surplus of landless younger daughters in the titled nobility was dealt with by sending them to nunneries.
2. Personality Birth-order differences have long been claimed in the domain of personality, although these claims have remained controversial despite considerable research on this topic. Psychologists have investigated the consequences of birth order ever since Charles Darwin’s cousin Francis Galton (1874) reported that eldest sons were overrepresented as members of the Royal Society. After breaking away from Sigmund Freud in 1910 to found a variant school of psychoanalysis, Alfred Adler (1927) focused on birth order in his own attempt to emphasize the importance of social factors in personality development. A secondborn, Adler considered firstborns to be ‘power-hungry conservatives.’ He characterized middleborns as competitive and lastborns as spoiled and lazy. Since Adler speculated about birth order and its consequences for personality in 1927, psychologists have conducted more than 2000 studies on the subject. Critics of this extensive literature have argued that most of these studies are inadequately controlled for key covariates, such as social class and sibship size; that studies often conflict; and that birth-order differences in personality and IQ seem to have been overrated (Ernst and Angst 1983). Meta-analysis—a technique for aggregating findings in order to increase statistical power and reliability—suggests a different conclusion. If we consider only those well-designed studies controlling for important background variables that covary with birth order and can introduce spurious cross-correlations, a meta-analytic review reveals consistent birth-order differences for a wide range of personality traits (Sulloway 1995). For instance, firstborns are generally found to be more conscientious than laterborns, a difference that is exemplified by their being more responsible, ambitious, and self-disciplined. In addition, firstborns tend to be more conforming to authority and respectful of parents. 
Firstborns also tend to have higher IQs than their younger siblings: a reduction of about one IQ point is observed, on average, with each increment in birth rank (Zajonc and Mullally 1997). These and other birth-order differences in personality can be usefully understood from a Darwinian perspective on family life (Sulloway 1996). Because they share, on average, only half of their genes, siblings will tend to compete unless the benefits of cooperating are greater than twice the costs (Hamilton 1964). In this Darwinian story about sibling competition, birth order does not exert a direct biological influence on personality. For example, there are no genes for being a firstborn or a laterborn. Instead, birth order is best seen as a proxy for various environmental influences, particularly disparities in age, physical size, power, and status within the family. These physical and mental differences lead siblings to pursue alternative strategies in their efforts to maximize parental investment (which includes emotional as well as physical
resources). This perspective on how sibling order shapes personality accords with research in behavioral genetics, which finds that most environmental influences on personality are not shared by siblings and hence belong to the ‘nonshared environment’ (Plomin and Daniels 1987). Prior to the twentieth century, half of all children did not survive childhood. Even minor differences in parental favor would have increased an offspring’s chances of getting out of childhood alive. Because eldest children have already survived the perilous early years of childhood, they are more likely than younger siblings to reach the age of reproduction and to pass on their parents’ genes. Quite simply, a surviving eldest child was generally a better Darwinian bet than a younger child, which is why parents in most traditional societies tend to bias investment, consciously or unconsciously, toward older offspring. An exception to this generalization involves youngest children born toward the end of a woman’s reproductive career. These children also tend to be favored, since they cannot be replaced (Salmon and Daly 1998). Even when parents do not favor one offspring over another, siblings compete to prevent favoritism. Siblings do so, in part, by cultivating family niches that correspond closely with differences in birth order. Firstborns, for example, tend to act as surrogate parents toward their younger siblings, which is a good way of currying parental favor. As a consequence, firstborns tend to be more conservative and parent-identified than their younger siblings, which is one of the most robustly documented findings in the literature. Laterborns cannot compete as effectively as firstborns for this surrogate parent niche, since they cannot baby-sit themselves. As a consequence, laterborns seek alternative family niches that will help them to cultivate parental favor in different ways.
To do so, they must often look within themselves for latent talents that can only be discovered through systematic experimentation. Toward this end, laterborns tend to be more open to experience (that is, more imaginative, prone to fantasies, and unconventional), propensities that offer an increased prospect of finding a valued and unoccupied family niche. In addition to their efforts to cultivate alternative family niches, siblings diverge in personality and interests because they employ differing strategies in dealing with one another. These strategies are similar to those observed in mammalian dominance hierarchies. Because firstborns are physically bigger than their younger siblings, they are more likely to employ physical aggression and intimidation in dealing with rivals. Firstborns are the ‘alpha males’ of their sibling group, and they generally boss and dominate their younger brothers and sisters. For their own part, laterborns tend to employ low-power strategies to obtain what they want, including pleading, whining, cajoling, humor, social intelligence, and, whenever expedient, appealing to parents for assistance.
Laterborns also tend to form coalitions with one another in an effort to circumvent the physical advantages enjoyed by the firstborn. Middle children are the most inclined to employ diplomatic and cooperative strategies. Some middle children are particularly adept at nonviolent methods of protest. Martin Luther King, Jr., the middle of three children, began his career as a champion of nonviolent reform by interceding in the episodes of merciless teasing that his younger brother inflicted upon their elder sister. Only children represent a controlled experiment in birth-order research. They have no siblings and therefore experience no sibling rivalry. As a consequence, they are not driven to occupy a particular family niche. Although only children, like other firstborns, are generally ambitious and conform to parental authority, they are intermediate between firstborns and laterborns on most other personality traits. Because age spacing can affect functional sibling order, a firstborn whose next younger sibling is six or more years younger is effectively like an only child. Similarly, large age gaps can make some laterborns into functional firstborns or only children. That brothers and sisters differ from one another for strategic reasons, rather than randomly, has been shown by studies involving more than one sibling from the same family (Schachter 1982). In one study, firstborns were found to be significantly different in personality and interests from secondborns, who were significantly different from thirdborns. By contrast, the first and third siblings were not as different as adjacent pairs, presumably because of less competition. This process of sibling differentiation has been termed ‘deidentification’ and extends to relationships with parents. For example, a firstborn child sometimes identifies more strongly with one parent than another. 
In such cases, the secondborn tends to identify more strongly with the parent that is not already preferred by the firstborn. The most compelling evidence for birth-order effects in personality comes from studies in which siblings assess each other’s personalities (Paulhus et al. 1999, Sulloway 1999). Such within-family designs control for the kinds of spurious correlations that can result from comparing individuals from different family backgrounds. Studies based on such direct sibling comparisons exhibit consistent birth-order effects in the expected direction. When the results are organized according to the Five-Factor Model of personality (Costa and McCrae 1992), firstborns tend to be more conscientious and slightly more neurotic than laterborns, whereas laterborns tend to be more agreeable, extraverted, and open to experience than firstborns. Although reasonably consistent patterns for birth order and personality are observed according to the Five-Factor Model of personality, findings are significantly heterogeneous for three of the five personality dimensions. For example, firstborns tend to be more assertive than laterborns, which is an indicator of extraversion, but laterborns tend to score higher on most other facets of extraversion, which include being more fun-loving, sociable, and excitement-seeking. Similarly, laterborns tend to be more open to experience in the sense of being unconventional, whereas firstborns tend to be more open to experience in the sense of being intellectually oriented. Lastly, firstborns are more neurotic in the sense of being anxious about status, but laterborns are more neurotic in the sense of being more self-conscious. As measured by direct sibling comparisons within the family, birth-order differences explain about four percent of the variance in personality, less than does gender and substantially more than do age, family size, or social class. In studies controlled for differences in age, sex, social class, and sibship size, siblings are about twice as likely to exhibit traits that are consistent with their sibling positions as to exhibit inconsistent traits. In short, not every laterborn has a laterborn personality (just as some firstborns deviate from the expected trend), but a reasonably consonant pattern is nevertheless present. One still-unresolved question about birth order and personality is the extent to which within-family patterns of behavior transfer to behavior outside the family of origin. Recent studies suggest that birth-order effects observed in extrafamilial contexts are about one-third to one-half the magnitude of those manifested within the family (Sulloway 1999). Relative to firstborn spouses and roommates, for example, laterborn spouses and roommates are generally perceived to be more agreeable and extraverted, but less conscientious and neurotic. Outside the family of origin, birth-order effects seem to manifest themselves most strongly in intimate living relationships and in dominance hierarchies.
These findings are not surprising, since they involve the kind of behavioral contexts in which sibling strategies were originally learned.
3. Behavior During Radical Revolutions Some of the best evidence for sibling differences in behavior outside the family of origin comes from social and intellectual history. During socially radical revolutions, laterborns have generally been more likely than firstborns to question the status quo and to adopt a revolutionary alternative. This difference is observed even after controlling for the fact that laterborns, in past centuries at least, appear to have been more socially liberal than firstborns. As a rule, these effects are very context sensitive. During the Protestant Reformation, for example, laterborns were nine times more likely than firstborns to suffer martyrdom in support of the Reformed faith. (This statistic is corrected for the greater number of laterborns in the general population.) In countries that turned Protestant, such as England, firstborns were five times
more likely than laterborns to suffer martyrdom for their refusal to abandon their allegiance to the Catholic faith. Thus laterborns gave their lives in the service of radical rebellion, whereas firstborns did so in an effort to preserve the waning orthodoxy. There was no generalized tendency, however, for laterborns to become ‘martyrs’ (Sulloway 1996). Similarly, during the French Revolution firstborn deputies to the National Convention generally voted according to the expectations of their social class, whereas laterborn deputies were more likely to vote in the opposite manner. For this reason, there was only a modest tendency for firstborns and laterborns to vote in a particular manner, since voting was generally influenced by other important considerations that determined what it meant to conform or rebel in a rapidly changing political environment. In response to radical conceptual changes in science one finds similar differences by birth order. Firstborns tend to be the guardians of what Thomas Kuhn (1970) has called ‘normal’ science, during which research is guided by the prevailing paradigms of the day. By contrast, laterborns tend to be the outlaws of science, sometimes flouting its accepted methods and assumptions in the process of attempting radical conceptual change. During the early years of the Copernican revolution, which challenged church doctrine by claiming that the earth revolves around the sun, laterborns were five times more likely than firstborns to endorse this heretical theory. Copernicus himself was the youngest of four children. When Charles Darwin, the fifth of six children, became an evolutionist in the 1830s, he was 10 times more likely to do so than a firstborn.
During other notable revolutions in science (including those led by laterborns such as Bacon, Descartes, Hutton, Semmelweis, and Heisenberg, and by occasional firstborns such as Newton and Einstein) younger siblings have typically been twice as likely as firstborns to endorse the new and radical viewpoint. Conversely, when new scientific doctrines, such as vitalism and eugenics, have appealed strongly to social conservatives, firstborns have been more likely than laterborns to pioneer and endorse such novel but ideologically conservative ideas. These birth-order effects typically fade over the course of scientific progress, as conceptual innovations accumulate supporting evidence and eventually convince the scientific majority. In spite of their predilection for supporting radical innovations, laterborns have no monopoly on scientific truth. They were nine times more likely than firstborns to support Franz Joseph Gall’s bogus theory of phrenology, which held that character and personality could be read by means of bumps on the head. In their willingness to question the status quo, laterborns run the risk of error through over-eager rebellion, just as firstborns sometimes err by resisting valid conceptual changes until the mounting evidence can no longer be denied.
During ‘normal’ science, firstborns possess a slight advantage over laterborns. Being more academically successful, firstborns are more likely than laterborns to become scientists. They also tend to win more Nobel prizes. This finding might seem surprising, but, according to the terms of Nobel’s will, Nobel prizes have generally been given for ‘discoveries’ or creative puzzle solving in science, not for radical conceptual revolutions. Firstborn scientists who have innovated within the system include James Watson and Francis Crick, who together unraveled the structure of DNA, and Jonas Salk, who developed the polio vaccine. Such historical facts highlight another important point, namely, that firstborns and laterborns do not differ in overall levels of ‘creativity.’ Rather, firstborns and laterborns are preadapted to solving dissimilar kinds of problems by employing disparate kinds of creative strategies.
4. Family Sentiments The kinds of birth-order effects that are observed during radical historical revolutions may depend as much on differences in ‘family sentiments’ as they do on personality. As Salmon and Daly (1998) have shown, firstborns (and to a lesser extent lastborns) are more strongly attached to the family system than are middle children. Historically, radical revolutions have tapped differences in family sentiments in two important ways. First, being considerably older than their offspring, parents were more likely to endorse the status quo, given that age is one of the best predictors of the acceptance of new and radical ideas (Hull et al. 1978). For this reason, endorsing a radical revolution has usually meant opposing parental values and authority, something that firstborns (and to a lesser extent lastborns) are less likely to do than are middleborns. Second, most radical revolutions, even in fields such as science, have tended to raise issues that are directly relevant to issues of parental investment and discrimination among offspring, filial loyalty, and overall identification with the family system. During the Reformation, for example, Protestant leaders such as Martin Luther strongly advocated the right to marriage by previously celibate clergymen and nuns, who tended to be younger sons and daughters. These Protestant leaders also advocated the egalitarian principles of partible inheritance, by which property and even political rule were subdivided equally among offspring (Fichtner 1989). During the height of the controversies raging over Darwin’s theory of evolution, Darwin noted that discriminatory inheritance practices posed a serious impediment to human evolution, remarking to Alfred Russel Wallace: ‘But oh, what a scheme is primogeniture for destroying natural selection!’ (Sulloway 1996, p. 54). Even when radical ideology does not play a central role in conceptual change, and when discriminatory
inheritance systems are also not a factor, differences in identification with the family can impact on social and political attitudes. As Salmon (1998) has shown experimentally, political speeches appeal differentially to individuals by birth order depending on whether kinship terms are included in the speech. Firstborns and lastborns are more likely to react positively to speeches that employ kinship terms such as ‘brother’ and ‘sister,’ whereas middle children prefer speeches that employ references to ‘friends.’ To the extent that radical social movements make use of kinship terms, which they often do, birth-order differences in family sentiments will tend to influence how siblings react to radical change.
5. Conclusion In past centuries birth order and functional sibling order have influenced such diverse social and political phenomena as royal succession, expansionist military campaigns, religious wars and crusades, geographic exploration, and inheritance laws. Even today, birth order and family niches more generally are among the environmental sources of personality because they cause siblings to experience the family environment in dissimilar ways. In particular, birth order introduces the need for differing strategies in dealing with sibling rivals as part of the universal quest for parental favor. This is a Darwinian story, albeit one with a marked environmental twist. Although siblings appear to be hard-wired to compete for parental favor, the specific niche in which they have grown up determines the particular strategies they adopt within their own family. Finally, because birth order and family niches underlie differences in family sentiments, including filial loyalty and self-conceptions about family identity, the family has historically supplied a powerful engine for revolutionary change. See also: Family, Anthropology of; Family as Institution; Family Processes; Family Size Preferences; Family Systems and the Preferred Sex of Children; Galton, Sir Francis (1822–1911); Personality Development in Childhood; Property: Legal Aspects of Intergenerational Transmission; Sibling Relationships, Psychology of
Bibliography
Adler A 1927 Understanding Human Nature. Greenberg, New York
Boone J L 1986 Parental investment and elite family structure in preindustrial states: case study of late medieval-early modern Portuguese genealogies. American Anthropologist 88: 859–78
Costa P T, McCrae R R 1992 NEO PI-R Professional Manual. Psychological Assessment Resources, Odessa, FL
Daly M, Wilson M 1988 Homicide. Aldine de Gruyter, Hawthorne, New York
Duby G 1977 The Chivalrous Society. Edward Arnold, London
Ernst C, Angst J 1983 Birth Order: Its Influence on Personality. Springer-Verlag, Berlin
Fichtner P S 1989 Protestantism and Primogeniture in Early Modern Germany. Yale University Press, New Haven, CT
Galton F 1874 English Men of Science. Macmillan, London
Hamilton W 1964 The genetical evolution of social behavior. Parts I and II. Journal of Theoretical Biology 7: 1–52
Herlihy D 1977 Family and property in Renaissance Florence. In: Herlihy D, Udovitch A L (eds.) The Medieval City. Yale University Press, New Haven, CT, pp. 3–24
Hrdy S, Judge D S 1993 Darwin and the puzzle of primogeniture. Human Nature 4: 1–45
Hull D L, Tessner P D, Diamond A M 1978 Planck’s principle. Science 202: 717–23
Kuhn T S 1970 The Structure of Scientific Revolutions, 2nd edn. University of Chicago Press, Chicago
Paulhus D L, Chen D, Trapnell P D 1999 Birth order and personality within families. Psychological Science 10: 482–8
Plomin R, Daniels D 1987 Why are children in the same family so different from one another? Behavioral and Brain Sciences 10: 1–60
Rosenblatt P C, Skoogberg E L 1974 Birth order in cross-cultural perspective. Developmental Psychology 10: 48–54
Salmon C 1998 The evocative nature of kin terminology in political rhetoric. Politics and the Life Sciences 17: 51–7
Salmon C A, Daly M 1998 Birth order and familial sentiment: Middleborns are different. Human Behavior and Evolution 19: 299–312
Schachter F F 1982 Sibling deidentification and split-parent identifications: A family tetrad. In: Lamb M E, Sutton-Smith B (eds.) Sibling Relationships: Their Nature and Significance across the Lifespan. Lawrence Erlbaum, Hillsdale, NJ, pp. 123–52
Sulloway F J 1995 Birth order and evolutionary psychology: A meta-analytic overview. Psychological Inquiry 6: 75–80
Sulloway F J 1996 Born to Rebel: Birth Order, Family Dynamics, and Creative Lives. Pantheon, New York
Sulloway F J 1999 Birth order. In: Runco M A, Pritzker S (eds.) Encyclopedia of Creativity 1: 189–202
Zajonc R B, Mullally P R 1997 Birth order: Reconciling conflicting effects. American Psychologist 52: 685–99
F. J. Sulloway
Sibling Relationships, Psychology of
Siblings (brothers and sisters) have a key place in legends, history, and literature throughout the world, from the era of the Egyptians and Greeks onward. The great majority of children (around 80 percent in Europe and the USA) grow up with siblings, and for most individuals their relationships with their siblings are the longest-lasting in their lives. Scientific study of the psychology of siblings is relatively recent, but is fast-growing, centering chiefly on studies of childhood and adolescence. The scientific interest of siblings lies, in particular, in the following domains: the nature and the potential influence of siblings on each other’s development and adjustment; the illuminating perspective the study of siblings provides on developmental issues; and the challenge that siblings present to our understanding of how families influence development: why siblings differ notably in personality and adjustment even though they grow up within the same family, the significance of their shared and separate family experiences, and their genetic relatedness.
1. The Term ‘Siblings’
This term is usually applied to ‘full’ siblings, brothers and sisters who are the offspring of the same mother and father and share, on average, 50 percent of their genes. However, with the changes in family structure during the later decades of the twentieth century, increasing numbers of children have relationships with ‘half siblings,’ children with whom they share one biological parent, and ‘step-siblings,’ children who are unrelated biologically.
2. The Nature of Sibling Relationships
2.1 Characteristics of the Relationship
The relationship between siblings is one that is characterized by distinctive emotional power and intimacy from infancy onward. It is a relationship that offers children unique opportunities for learning about self and other, with considerable potential for affecting children’s well-being, intimately linked as it is with each child’s relationship with the parents. Clinicians and family systems theorists have, from early in the twentieth century, expressed interest in the part siblings play in family relationships, and in the adjustment of individuals. However, until the 1980s there was relatively little systematic research on siblings (with the notable exception of the classic studies of birth order by Koch 1954). In the 1980s and 1990s, research interest broadened greatly to include the investigation of sibling developmental influences, sources of individual differences between siblings, and links between family relationships (Boer and Dunn 1990, Brody 1996). Individual differences in how siblings get along with each other are very marked from early infancy; siblings’ feelings range from extreme hostility and rivalry to affection and support, are often ambivalent, and are expressed uninhibitedly. The relationship is also notable for its intimacy: siblings know each other very well, and this can be a source of both support and conflict. These characteristics increase the potential of the relationship for developmental influence. Because of the emotional intensity and familiarity of their relationships, the study of siblings provides an illuminating window on what children understand about each other, which has challenged and informed conceptions of the development of early social understanding (Dunn 1992).
2.2 Developmental Changes in Sibling Relationships
As children’s powers of understanding and communication develop, their sibling relationships change. Younger siblings play an increasingly active role in the relationship during the preschool years, and their older siblings take increasing interest in them. During middle childhood their relationships become more egalitarian; there is disagreement about how far this reflects an increase in the power of younger siblings over older siblings, or a decrease in the dominance both older and younger try to exert. A decrease in warmth between siblings during adolescence parallels the patterns of change found in the parent–child relationship as adolescents become increasingly involved with peers outside the family (Boer and Dunn 1990). There is a paucity of studies of siblings in adulthood; however, what information we have indicates that in the USA, most siblings maintain contact, communicate, and share experiences until very late in life (Cicirelli 1996). During middle age, most adults describe feelings of closeness, rather than rivalry, with their siblings, even when they have reduced contact, though family crises can evoke sibling conflict. Closeness and companionship become increasingly evident among older adults. Step- and half-siblings also continue to keep contact with each other, though they see each other less often than full siblings. Relationships with sisters appear to be particularly important in old age; this is generally attributed to women’s emotional expressiveness and their traditional roles as nurturers.
Little systematic research has focused on ethnic differences; however, in a national sample in the USA, sibling relationships among African-American, Hispanic, non-Hispanic white, and Asian-American adult respondents were compared; the conclusion was that the similarities across groups in contact and social support far outweighed the differences (Riedmann and White 1996). Siblings play a central role in adults’ lives in many other cultures; the psychology of these relationships remains to be studied systematically.
2.3 Continuities in Individual Differences in Relationships
Do the marked individual differences in the affection or hostility that siblings show toward each other in early childhood show stability through middle childhood and adolescence? Research described in Brody (1996) indicates there is some continuity over time for both the positive and negative aspects of sibling relations, but there is also evidence for change. Contributors to change included new friendships children formed during the school years, which led to
loss of warmth in the sibling relationship, or increased jealousy; developmental changes in the individuals; and life events (the majority of which contributed to increased intimacy and warmth, an exception being divorce or separation of parents). Research on adult siblings shows the relationship is not a static one during adulthood: life events, employment changes, etc. affect adult siblings’ relations—some increasing intimacy and contact, others with negative effects (Cicirelli 1996).
2.4 Influences on Individual Differences
The temperament of each individual in a sibling dyad, and the match between their temperaments, is important in relation to conflict between them (Brody 1996). The effects of gender and age gap are less clear: findings are inconsistent for young siblings, while research on children in middle childhood indicates that these ‘family constellation’ effects influence the relationship in complex ways; gender and socioeconomic status apparently increase in importance as influences on the sibling relationship during adolescence. Most striking are the links between the quality of sibling relationships and other relationships within the family: the parents’ relationships with each child, and the mother–partner or marital relationship. These connections are considered next.
3. Siblings, Parents, and Peers: Connections Between Relationships
To what extent and in what ways do parent–child relationships influence sibling relationships? Currently there is much debate and some inconsistency in the research findings. First, there is some evidence that the security of children’s attachment to their parents is correlated with later sibling relationship quality, and that positive parent–child relations are correlated with positive sibling relations. Note that conclusions cannot be drawn about the direction of causal influence from these correlational data. There are also data that fit with a ‘compensatory’ model, in which intense supportive sibling relationships develop in families in which parents are uninvolved. This pattern may be characteristic of families at the extremes of stress and relationship difficulties. More consistent is the evidence that differential parent–child relationships (in which more attention and affection, and less punishment, are shown by a parent toward one sibling than another) are associated with more conflicted, hostile sibling relationships (Hetherington et al. 1994). These links are especially clear for families under stress, such as those who have experienced divorce, and in those with disabled or sick siblings. Again, note that the evidence is correlational.
Indirect links between parent–child and sibling relationships have been documented in studies of the arrival of a sibling; a number of different processes are implicated, ranging from general emotional disturbance to processes of increasing cognitive complexity, such as increased positivity between siblings in families in which mothers talked to the firstborn about the feelings and needs of the baby in the early months. This evidence indicates that even with young siblings, processes of attribution and reflection may be implicated in the quality of their developing relationship. The quality of the marital relationship is also linked to sibling relationships: mother–partner hostility is associated with increased negativity between siblings, while affection between adult partners is associated with positivity between siblings. Both direct pathways between marital and sibling relationships, and indirect pathways (via parent–child relationships) are implicated (Dunn et al. 1999).
4. Developmental Influence
4.1 Influence on Adjustment
Three adjustment outcome areas in which siblings are likely to exert influence are aggressive and externalizing behavior, internalizing problems, and self-esteem. For example, hostile aggressive sibling behavior is correlated with increasing aggressive behavior by the other sibling. Patterson and his group have shown the shaping role that siblings play in this pattern (Patterson 1986), for both clinic and community samples. The arrival of a sibling is consistently found to be linked to increased problems: disturbance in bodily functions, withdrawal, aggressiveness, dependency, and anxiety. It is thought these changes are linked to parallel changes in interaction between the ‘displaced’ older siblings and the parents. A growing literature links sibling influence to deviant behavior in adolescence. For example, frequent and problematic drinking by siblings increases adolescents’ tendency to drink: siblings appear to have both a direct effect and an indirect later effect on other sibs’ risks of becoming a drinker, the latter through adolescents’ selection of peers who drink. Siblings can also be an important source of support in times of stress, and act as therapists for siblings with some problems, such as eating disorders (Boer and Dunn 1990).
4.2 Influence on Social Understanding
The kinds of experience children have with their siblings are related to key aspects of their sociocognitive development. For instance, positive cooperative experiences with older siblings are correlated with the development of greater powers of understanding emotion and others’ mental states, both in early and middle childhood. Direction of effects in these associations is not clear: children who are good at understanding feelings and others’ minds are more effective cooperative play companions, thus their early sophistication in social understanding may contribute to the development of cooperative play, which itself fosters further social understanding (Dunn 1992). Other aspects of prosocial, cooperative behavior, pretend play, and conflict management have all been reported to be associated with the experience of friendly sibling interactions. While it appears plausible that experiences with siblings should ‘generalize’ to children’s relationships with peers, the story appears more complex, and there is not consistent evidence for simple positive links. The emotional dynamics and demands of sibling and friend relationships are very different.
5. Why Are Siblings So Different From One Another?
Striking differences between siblings in personality, adjustment, and psychopathology have been documented in a wide range of studies. These present a challenge to conventional views of family influence, as the children are growing up within the same family. Extensive studies by behavior geneticists have now shown that the sources of environmental influence that make individuals different from one another work within rather than between families. The contribution of sibling studies to our understanding of the relative roles of genetics and environment in studies of socialization and development has been notable: the message here is not that family influence is unimportant, but that we need to document those experiences that are specific to each child in the family, and need to study more than one child in the family if we are to clarify what are the salient environmental influences on development (Dunn and Plomin 1990, Hetherington et al. 1994). The developmental research that documents the extreme sensitivity with which children monitor the interaction between other family members, and the significance of differential parent–child relationships combine here to clarify the social processes that are implicated in the development of differences between children growing up in the ‘same’ family.
6. Methodological Issues
For the study of siblings in childhood, a combination of naturalistic and structured observations and interviews with parents and siblings has proved most useful. Children are articulate and forthcoming about their relations with their siblings, and cope with interviews and questionnaires from an early age. As in the study of any relationship, it is important to get both participants’ views on their relationship as these may differ, and both are valid. Cicirelli (1996) points out some particular methodological problems with studying adult siblings: incomplete data sets formed when one sibling is unwilling or unable to participate, choices over which dyad is picked, and limitations in sample representativeness.
7. Future Directions Exciting future directions in sibling research include: clarification of the links between family relationships (including those involving step-relations); identification of the processes of family influence—for which we need studies with more than one child per family; further investigation of the role of genetics in individual development; and further exploration of significant sibling experiences for social understanding in middle childhood and adolescence. See also: Family Processes; Genetic Studies of Personality; Infancy and Childhood: Emotional Development; Sibling-order Effects; Social Relationships in Adulthood
Bibliography
Boer F, Dunn J 1990 Children’s Sibling Relationships: Developmental and Clinical Issues. Lawrence Erlbaum Associates, Mahwah, NJ
Brody G H 1996 Sibling Relationships: Their Causes and Consequences. Ablex, Norwood, NJ
Cicirelli V G 1996 Sibling relationships in middle and old age. In: Brody G H (ed.) Sibling Relationships: Their Causes and Consequences. Ablex, Norwood, NJ, pp. 47–73
Dunn J 1992 Siblings and development. Current Directions in Psychological Science 1: 6–9
Dunn J, Deater-Deckard K, Pickering K, Beveridge M and the ALSPAC study team 1999 Siblings, parents and partners: Family relationships within a longitudinal community study. Journal of Child Psychology and Psychiatry 40: 1025–37
Dunn J, Plomin R 1990 Separate Lives: Why Siblings are so Different. Basic Books, New York
Hetherington E M, Reiss D, Plomin R 1994 Separate Social Worlds of Siblings: The Impact of Nonshared Environment on Development. Lawrence Erlbaum Associates, Mahwah, NJ
Koch H L 1954 The relation of ‘primary mental abilities’ in five- and six-year-olds to sex of child and characteristics of his sibling. Child Development 25: 209–23
Patterson G R 1986 The contribution of siblings to training for fighting: A microsocial analysis. In: Olweus D, Block J, Radke-Yarrow M (eds.) Development of Antisocial and Prosocial Behavior: Research, Theories, and Issues. Academic Press, Orlando, FL, pp. 235–61
Riedmann A, White L 1996 Adult sibling relationships: Racial and ethnic comparisons. In: Brody G H et al. (ed.) Sibling Relationships: Their Causes and Consequences: Advances in Applied Developmental Psychology. Ablex, Norwood, NJ, pp. 105–26
J. Dunn
Sign Language Until recently, most of the scientific understanding of the human capacity for language has come from the study of spoken languages. It has been assumed that the organizational properties of language are connected inseparably with the sounds of speech, and the fact that language is normally spoken and heard determines the basic principles of grammar, as well as the organization of the brain for language. There is good evidence that structures involved in breathing and chewing have evolved into a versatile and efficient system for the production of sounds in humans. Studies of brain organization indicate that the left cerebral hemisphere is specialized for processing linguistic information in the auditory–vocal mode and that the major language-mediating areas of the brain are connected intimately with the auditory–vocal channel. It has even been argued that hearing and the development of speech are necessary precursors to this cerebral specialization for language. Thus, the link between biology and linguistic behavior has been identified with the particular sensory modality in which language has developed. Although the path of human evolution has been in conjunction with the thousands of spoken languages the world over, recent research into signed languages has revealed the existence of primary linguistic systems that have developed naturally independent of spoken languages in a visual–manual mode. American Sign Language (ASL), for example, a sign language passed down from one generation to the next of deaf people, has all the complexity of spoken languages, and is as capable of expressing science, poetry, wit, historical change, and infinite nuances of meaning as are spoken languages. Importantly, ASL and other signed languages are not derived from the spoken language of the surrounding community: rather, they are autonomous languages with their own grammatical form and meaning. 
Although it was thought originally that signed languages were universal pantomime, or broken forms of spoken language on the hands, or loose collections of vague gestures, now scientists around the world have found that there are signed languages that spring up wherever there are communities and generations of deaf people (Klima and Bellugi 1988). One can now specify the ways in which the formal properties of languages are shaped by their modalities of expression, sifting properties peculiar to a particular language mode from more general properties common to all languages. ASL, for example, exhibits formal structuring at the same levels as spoken languages (the internal structure of lexical units and the grammatical scaffolding underlying sentences) as well as the same kinds of organizational principles as spoken languages. Yet the form this grammatical structuring assumes in a visual–manual language is apparently
deeply influenced by the modality in which the language is cast (Bellugi 1980). The existence of signed languages allows us to enquire about the determinants of language organization from a different perspective. What would language be like if its transmission were not based on the vocal tract and the ear? How is language organized when it is based instead on the hands moving in space and the eyes? Do these transmission channel differences result in any deeper differences? It is now clear that there are many different signed languages arising independently of one another and of spoken languages. At the core, spoken and signed languages are essentially the same in terms of organizational principles and rule systems. Nevertheless, on the surface, signed and spoken languages differ markedly. ASL and other signed languages display complex linguistic structure, but unlike spoken languages, convey much of their structure by manipulating spatial relations, making use of spatial contrasts at all linguistic levels (Bellugi et al. 1989).
1. The Structure of Sign Language
As already noted, the most striking surface difference between signed and spoken languages is the reliance on spatial contrasts, most evident in the grammar of the language. At the lexical level, signs can be separated from one another minimally by manual parameters (handshape, movement, location). The signs for summer, ugly, and dry are just the same in terms of handshape and movement, and differ only in the spatial location of the signs (forehead, nose, or chin). Instead of relying on linear order for grammatical morphology, as in English (act, acting, acted, acts), ASL grammatical processes nest sign stems in spatial patterns of considerable complexity (see Fig. 1), marking grammatical functions such as number, aspect, and person spatially. Grammatically complex forms can be nested spatially, one inside the other, with different orderings producing different hierarchically organized meanings. Similarly, the syntactic structure specifying relations of signs to one another in sentences of ASL is also essentially organized spatially. Nominal signs may be associated with abstract points in a plane of signing space, and it is the direction of movement of verb signs between such endpoints that marks grammatical relations. Whereas in English, the sentences ‘the cat bites the dog’ and ‘the dog bites the cat’ are differentiated by the order of the words, in ASL these differences are signaled by the movement of the verb between points associated with the signs for cat and dog in a plane of signing space. Pronominal signs directed toward such previously established points or loci clearly function to refer back to nominals, even with many signs intervening (see Fig. 2). This spatial organization underlying syntax is a unique property of visual-gestural systems (Bellugi et al. 1989).
2. The Acquisition of Sign Language by Deaf Children of Deaf Parents
Findings revealing the special grammatical structuring of a language in a visual mode lead to questions about the acquisition of sign language, its effect on nonlanguage visuospatial cognition, and its representation in the brain. Despite the dramatic surface differences between spoken and signed languages (simultaneity and the nesting of sign stems in complex co-occurring spatial patterns), the acquisition of sign language by deaf children of deaf parents shows a remarkable similarity to that of hearing children learning a spoken language. Grammatical processes in ASL are acquired by deaf children at the same rate and through the same processes as grammatical processes are acquired by hearing children learning English, as if there were a biological timetable underlying language acquisition.
Figure 1 Spatially organized (three-dimensional) morphology in ASL: a sign stem embedded in spatial patterns. Panels: (a) GIVE (uninflected); (b) GIVE [Durational] ‘give continuously’; (c) GIVE [Exhaustive] ‘give to each’; (d) GIVE [[Exhaustive] Durational] ‘give to each, that action recurring over time’; (e) GIVE [[Durational] Exhaustive] ‘give continuously to each in turn’; (f) GIVE [[Durational] Exhaustive] ‘give continuously to each in turn, that action recurring over time’
Figure 2 Spatialized syntax
Figure 3 Deaf children’s grammatical overregularizations. Forms shown: *BED [N:dual], BED; *DUCK [N:dual], DUCK; *FUN [N:dual], FUN
First words and first signs appear at around 12 months; combining two words or two signs in children, whether deaf or hearing, occurs by about 18–20 months, and the rapid expansion signaling the development of grammar (morphology and syntax) develops in both spoken and signed languages by about 3–3½ years. Just as hearing children show their discovery of grammatical regularities by producing overregularizations (‘I goed there,’ ‘we helded the rabbit’), deaf children learning sign language show evidence of learning the spatial morphology signaling plural forms and aspectual forms by producing overregularizations in spatial patterns (see Fig. 3).
3. Neural Systems Subserving Visuospatial Languages
The differences between signed and spoken languages provide an especially powerful tool for understanding the neural systems subserving language. Consider the following: In hearing/speaking individuals, language
processing is mediated generally by the left cerebral hemisphere, whereas visuospatial processing is mediated by the right cerebral hemisphere. But what about a language that is communicated using spatial contrasts rather than temporal contrasts? On the one hand, the fact that sign language has the same kind of complex linguistic structure as spoken languages and the same expressivity might lead one to expect left hemisphere mediation. On the other hand, the spatial medium so central to the linguistic structure of sign language clearly suggests right hemisphere or bilateral mediation. In fact, the answer to this question is dependent on the answer to another, deeper, question concerning the basis of the left hemisphere specialization for language. Specifically, is the left hemisphere specialized for language processing per se (i.e., is there a brain basis for language as an independent entity)? Or is the left hemisphere’s dominance generalized to process any type of information that is presented in terms of temporal contrasts? If the left hemisphere is indeed specialized for processing language itself, sign language processing should be mediated by the left
hemisphere, as is spoken language. If, however, the left hemisphere is specialized for processing fast temporal contrasts in general, one would expect sign language processing to be mediated by the right hemisphere. The study of sign languages in deaf signers permits us to pit the nature of the signal (auditory-temporal vs. visual-spatial) against the type of information (linguistic vs. nonlinguistic) that is encoded in that signal as a means of examining the neurobiological basis of language (Poizner et al. 1990). One program of studies examines deaf lifelong signers with focal lesions to the left or the right cerebral hemisphere. Major areas, each focusing on a special property of the visual-gestural modality as it bears on the investigation of brain organization for language, are investigated. There are intensive studies of large groups of deaf signers with left or right hemisphere focal lesions in one program (Salk); all are highly skilled ASL signers, and all used sign as a primary form of communication throughout their lives. Individuals were examined with an extensive battery of experimental probes, including formal testing of ASL at all structural levels; spatial cognitive probes sensitive to right-hemisphere damage in hearing people; and new methods of brain imaging, including structural and functional magnetic resonance imaging (MRI, fMRI), event-related potentials (ERP), and positron emission tomography (PET). This large pool of well-studied and thoroughly characterized subjects, together with new methods of brain imaging and sensitive tests of signed as well as spoken language allows for a new perspective on the determinants of brain organization for language (Hickok and Bellugi 2000, Hickok et al. 1996).
3.1 Left Hemisphere Lesions and Sign Language Grammar
The first major finding is that so far only deaf signers with damage to the left hemisphere show sign language aphasias. Marked impairment in sign language after left hemisphere lesions was found in the majority of the left hemisphere damaged (LHD) signers, but not in any of the right hemisphere damaged (RHD) signers, whose language profiles were much like matched controls. Figure 4 presents a comparison of LHD, RHD, and normal control profiles of sign characteristics from The Salk Sign Diagnostic Aphasia Examination, a measure of sign aphasia. The RHD signers showed no impairment at all in any aspect of ASL grammar; their signing was rich, complex, and without deficit, even in the spatial organization underlying sentences of ASL. By contrast, signers with LHD showed markedly contrasting profiles: one was agrammatic after her stroke, producing only nouns and a few verbs with none of the grammatical apparatus of ASL, another made frequent paraphasias at the sign internal level, and several showed many
grammatical paraphasias, including neologisms, particularly in morphology. Another deaf signer showed deficits in the spatially encoded grammatical operations which link signs in sentences, a remarkable failure in the spatially organized syntax of the language. Still another deaf signer with focal lesions to the left hemisphere revealed a dissociation not found in spoken language: a dissociation between sign and gesture, with a cleavage between capacities for sign language (severely impaired) and manual pantomime (spared). In contrast, none of the RHD signers showed any within-sentence deficits; they were completely unimpaired in sign sentences and not one showed aphasia for sign language (in contrast to their marked nonlanguage spatial deficits, described below) (Hickok and Bellugi 2000, Hickok et al. 1998). Moreover, there are dramatic differences in performance between left- and right-hemisphere damaged signers in formal experimental probes of sign competence. For example, a test of the equivalent of rhyming in ASL provides a probe of phonological processing. Two signs ‘rhyme’ if they are similar in all but one phonological parametric value such as handshape, location, or movement. To tap this aspect of phonological processing, subjects are presented with an array of pictured objects and asked to pick out the two objects whose signs rhyme (Fig. 5). The ASL signs for key and apple share the same handshape and movement, and differ only in location, and thus are like rhymed pairs. LHD signers are significantly impaired relative to RHD signers and controls on this test, another sign of the marked difference in effects of right- and left-hemisphere lesions on signing. On other tests of ASL processing at different structural levels, there are similar distinctions between left- and right-lesioned signers, with the right-lesioned signers much like the controls, but the signers with left hemisphere lesions significantly impaired in language processing.
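The rhyme criterion just described (similar in all but one of handshape, movement, and location) can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the test procedure used in the studies: the `Sign` record, the function names, and all parameter values (`hs1`, `mv1`, `loc_a`, and so on) are hypothetical placeholders, since the text states only that key and apple share handshape and movement and differ in location.

```python
from dataclasses import dataclass
from itertools import combinations

# The three manual parameters named in the text.
PARAMETERS = ("handshape", "movement", "location")


@dataclass(frozen=True)
class Sign:
    """Hypothetical record of a sign's phonological parameter values."""
    gloss: str
    handshape: str
    movement: str
    location: str


def differing_parameters(a: Sign, b: Sign) -> list:
    """Return the names of the parameters on which two signs differ."""
    return [p for p in PARAMETERS if getattr(a, p) != getattr(b, p)]


def signs_rhyme(a: Sign, b: Sign) -> bool:
    """Two signs 'rhyme' if they differ in exactly one parameter value."""
    return len(differing_parameters(a, b)) == 1


def find_rhyming_pairs(array: list) -> list:
    """Return the gloss pairs in a pictured-object array whose signs rhyme."""
    return [(a.gloss, b.gloss) for a, b in combinations(array, 2)
            if signs_rhyme(a, b)]


# Placeholder values: only the pattern of sameness and difference
# (same handshape and movement, different location) comes from the text.
KEY = Sign("KEY", "hs1", "mv1", "loc_a")
APPLE = Sign("APPLE", "hs1", "mv1", "loc_b")
# Hypothetical distractor items for the array.
DISTRACTOR_1 = Sign("DISTRACTOR_1", "hs2", "mv2", "loc_c")
DISTRACTOR_2 = Sign("DISTRACTOR_2", "hs3", "mv3", "loc_d")

pairs = find_rhyming_pairs([KEY, DISTRACTOR_1, APPLE, DISTRACTOR_2])
```

With these placeholder values, `pairs` contains only the key and apple pair, and `differing_parameters(KEY, APPLE)` names location as the single contrasting parameter. The same check applies to the summer, ugly, and dry example given earlier, where the signs differ only in location (forehead, nose, or chin).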
Moreover, studies have found that there can be differential breakdown of linguistic components of sign language (lexicon and grammar) with different left hemisphere lesions.
3.2 Right Hemisphere Lesions and Nonlanguage Spatial Processing
These results from language testing contrast sharply with results on tests of nonlanguage spatial cognition. RHD signers are significantly more impaired on a wide range of spatial cognitive tasks than LHD signers, who show little impairment. Drawings of many of the RHD signers (but not those with LHD) show severe spatial distortions, neglect of the left side of space, and lack of perspective. RHD deaf signers show lack of perspective, left neglect, and spatial disorganization on an array of spatial cognitive nonlanguage tests (block design, drawing, hierarchical
Left Hemisphere Lesions Lead to Sign Language Aphasias: rating scale profiles of sign characteristics (scale 1–7) for LHD signers, control deaf signers, and RHD signers. Scales: melodic line, phrase length, articulatory agility, grammatical form, paraphasia in running sign, sign finding, and sign comprehension.
(z = -2) (z = -1.5) (z = -1) (z = -.5)
(z = +.5) (z = +1)
Normal (z = 0)
(z = +.5)
(z = +1)
Figure 4 Left hemisphere lesions lead to sign language aphasias
Rhyming Task with LHD and RHD Signers Sample Item on ASL Rhyming Test
100
%Correct
80 60 40 20 0
LHD Signers
RHD Signers
Figure 5 Rhyming task with LHD and RHD signers
processing), compared with LHD deaf signers. Yet, astonishingly, these severe spatial deficits among RHD signers do not affect their competence in a spatially nested language, ASL. The case of a signer with a right parietal lesion leading to severe left neglect is of special interest: Whereas his drawings show characteristic omissions on the left side of space, his signing (including the spatially organized syntax) is impeccable, with signs and spatially organized syntax perfectly maintained. The finding that sign aphasia follows left hemisphere lesions but not right hemisphere lesions provides a strong case for a modality-independent linguistic basis for the left hemisphere specialization for language. These data suggest that the left hemisphere is predisposed biologically for language, independent of language modality. Thus, hearing and speech are not necessary for the development of hemisphere specialization—sound is not crucial. Furthermore, the finding of a dissociation between competence in a spatial language and competence in nonlinguistic spatial cognition demonstrates that it is the type of information that is encoded in a signal (i.e., linguistic vs. spatial information) rather than the nature of the signal itself (i.e., spatial vs. temporal) that determines 14070
the organization of the brain for higher cognitive functions.
4. Language, Modality, and the Brain
Analysis of patterns of breakdown in deaf signers provides new perspectives on the determinants of hemispheric specialization for language. First, the data show that hearing and speech are not necessary for the development of hemispheric specialization: sound is not crucial. Second, it is the left hemisphere that is dominant for sign language. Deaf signers with damage to the left hemisphere show marked sign language deficits but relatively intact capacity for processing nonlanguage visuospatial relations. Signers with damage to the right hemisphere show the reverse pattern. Thus, not only is there left hemisphere specialization for language functioning, there is also complementary specialization for nonlanguage spatial functioning. The fact that grammatical information in sign language is conveyed via spatial manipulation does not alter this complementary specialization. Furthermore, components of sign language (lexicon and grammar) can be selectively impaired, reflecting differential breakdown of sign language along linguistically relevant lines. These data suggest that the left hemisphere in humans may have an innate predisposition for language, regardless of the modality. Since sign language involves an interplay between visuospatial and linguistic relations, studies of sign language breakdown in deaf signers may, in the long run, bring us closer to the fundamental principles underlying hemispheric specialization.
See also: Evolution and Language: Overview; Language and Animal Competencies; Sign Language: Psychological and Neural Aspects

Bibliography
Bellugi U 1980 The structuring of language: Clues from the similarities between signed and spoken language. In: Bellugi U, Studdert-Kennedy M (eds.) Signed and Spoken Language: Biological Constraints on Linguistic Form. Dahlem Konferenzen. Weinheim/Deerfield Beach, FL, pp. 115–40
Bellugi U, Poizner H, Klima E S 1989 Language, modality and the brain. Trends in Neurosciences 10: 380–8
Emmorey K, Kosslyn S M, Bellugi U 1993 Visual imagery and visual-spatial language: Enhanced imagery abilities in deaf and hearing ASL signers. Cognition 46: 139–81
Hickok G, Bellugi U 2000 The signs of aphasia. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology, 2nd edn. Elsevier Science Publishers, Amsterdam, The Netherlands
Hickok G, Bellugi U, Klima E S 1996 The neurobiology of signed language and its implications for the neural organization of language. Nature 381: 699–702
Hickok G, Bellugi U, Klima E S 1998 The basis of the neural organization for language: Evidence from sign language aphasia. Reviews in the Neurosciences 8: 205–22
Klima E S, Bellugi U 1988 The Signs of Language. Harvard University Press, Cambridge, MA
Poizner H, Klima E S, Bellugi U 1990 What the Hands Reveal About the Brain. MIT Press/Bradford Books, Cambridge, MA

U. Bellugi

Sign Language: Psychological and Neural Aspects
Signed languages of the deaf are naturally occurring human languages. The existence of languages expressed in different modalities (i.e., oral–aural, manual–visual) provides a unique opportunity to explore and distinguish those properties shared by all human languages from those that arise in response to the modality in which the language is expressed. Despite the differences in language form, signed languages possess formal linguistic properties found in spoken languages. Sign language acquisition follows a developmental trajectory similar to spoken languages. Memory for signs exhibits patterns of interference and forgetting that are similar to those found for speech. Early use of sign language may enhance certain aspects of nonlanguage visual perception. Neuropsychological studies show that left hemisphere regions are important in both spoken and sign language processing.

1. Linguistic Principles of American Sign Language
1.1 Language and Deaf Culture
Sign languages are naturally occurring manual languages that arise in communities of deaf individuals. These manual communication systems are fully expressive, systematic human languages and are not merely conventionalized systems of pantomime nor manual codifications of a spoken language. Many types of deafness are inheritable and it is not unusual to find isolated communities of deaf individuals who have developed complex manual languages (Groce 1985). The term Deaf Community has been used to describe a sociolinguistic entity that plays a crucial role in a deaf person’s exposure to and acceptance of sign language (Padden and Humphries 1988). American Sign Language (ASL), used by members of the Deaf community in the USA and Canada, is only one of many sign languages of the world, but it is the one that has been studied most extensively.

1.2 Structure of American Sign Language
A sign consists of a hand configuration that travels about a movement path and is directed to or about a specific body location. Sign languages differ from one another in the inventories and compositions of handshapes, movements, and locations used to signal linguistic contrasts, just as spoken languages differ from one another in the selection of sounds used and in how those sounds are organized into words. Many sign languages incorporate a subsystem in which orthographic symbols used in the surrounding spoken language communities are represented manually on the hands. One example is the American English manual alphabet, which is produced on one hand and allows users of ASL to represent English lexical items. All human languages, whether spoken or signed, exhibit levels of structure which govern the composition and formation of word forms and specify how words combine into sentences. In formal linguistic analyses, these structural levels are referred to as the phonology, morphology, and syntax of the language.
In this context, phonological organization refers to the patterning of the abstract formational units of a natural language (Coulter and Anderson 1993). Compared to spoken languages in which contrastive units (for example, phonemes) are largely arranged in a linear fashion, signed languages exhibit simultaneous layering of information. For example, in a sign, a handshape will co-occur with, rather than follow sequentially, a distinct movement pattern. ASL exhibits complex morphology. Morphological markings in ASL are expressed as dynamic movement patterns overlaid on a more basic sign form. These nested morphological forms stand in contrast to the linear suffixation common in spoken languages (Klima and Bellugi 1979). The prevalence of simultaneous layering of phonological content and the nested morphological devices observed across many different sign languages likely reflect an influence of modality on the realization of linguistic structure. Thus the shape of human languages reflects the constraints and affordances imposed by the articulator systems involved in transmission of the signal (i.e., oral versus manual) and the receptive mechanisms for decoding the signal (i.e., auditory versus visual). A unique property of ASL linguistic structure is the reliance upon visuospatial mechanisms to signal linguistic contrasts and relations. One example concerns the use of facial expressions in ASL. In ASL, certain syntactic and adverbial constructions are marked by specific and obligatory facial expressions (Liddell 1980). These linguistic facial expressions differ significantly in appearance and execution from affective facial expressions. A second example concerns the use of inflectional morphology to express subject and object relationships. At the syntactic level, nominals introduced into the discourse are assigned arbitrary reference points along a horizontal plane in the signing space.
Signs with pronominal function are directed toward these points, and the class of verbs which require subject/object agreement obligatorily move between these points (Lillo-Martin and Klima 1990). Thus, whereas many spoken languages represent grammatical functions through case marking or linear ordering, in ASL grammatical function is expressed through spatial mechanisms. This same system of spatial reference, when used across sentences, serves as a means of discourse cohesion (Winston 1995).
1.3 Sign Language Acquisition
Children exposed to signed languages from birth acquire these languages on a similar maturational timetable as children exposed to spoken languages (Meier 1991). Prelinguistic infants, whether normally hearing or deaf, engage in vocal play commonly known as babbling. Recent research has shown that prelinguistic gestural play referred to as manual babbling will accompany vocal babbling. There appear to be significant continuities between prelinguistic gesture and early signs in deaf children exposed to American Sign Language, though questions remain as to whether manual babbling is constrained by predominantly motoric or linguistic factors (Cheek et al. 2001, Petitto and Marantette 1991). Between 10 and 12 months of age, children reared in signing homes begin to produce their first signs, with two-sign combinations appearing at approximately 18 months. Some research has suggested a precocious early vocabulary development in signing children. However, these reports are likely to be a reflection of parents’ and experimenters’ earlier recognition of signs compared to words rather than underlying differences in development of symbolic capacities of signing and speaking children. Signing infants produce the same range of grammatical errors in signing that have been documented for spoken language, including phonological substitutions, morphological overregularizations, and anaphoric referencing confusions (Petitto 2000). For example, in the phonological domain a deaf child will often use only a subset of the handshapes found in the adult inventory, opting for a simpler set of ‘unmarked’ handshapes. This is similar to the restricted range of consonant usage common in children acquiring a spoken language.

2. Psycholinguistic Aspects of Sign Language
Psycholinguistic studies of American Sign Language have examined how the different signaling characteristics of the language impact transmission and recognition. Signs take longer to produce than comparable word forms. The average number of words per second in running speech is about 4 to 5, compared with 2 to 3 signs per second in fluent signing. However, despite differences in word transmission rate, the proposition rate for speech and sign is the same, roughly one proposition every 1 to 2 seconds. Compare, for example, the English phrase ‘I have already been to California’ with the ASL equivalent, which can be succinctly signed using three monosyllabic signs, glossed as FINISH TOUCH CALIFORNIA.

2.1 Sign Recognition
Studies of sign recognition have examined how signs are recognized in time. Recognition appears to follow a systematic pattern in which information about the location of the sign is reliably identified first, followed by handshape information and finally the movement. Identification of the movement reliably leads to the identification of the sign. Interestingly, despite the slower articulation of a sign compared to a word, sign recognition appears to be faster than word recognition. Specifically, it has been observed that proportionally less of the signal needs to be processed in order to uniquely identify a sign compared to a spoken word. For example, one study reports that only 240 msec. or 35 percent of a sign has to be seen before a sign is identified (Emmorey and Corina 1990). In comparable studies of spoken English, Grosjean (1980) reports that 330 msec. or 83 percent of a word has to be heard before a word can be identified. This finding is due, in part, to the simultaneous patterning of phonological information in signs compared to the more linear patterning of phonological information characteristic of spoken languages. These structural differences in turn have implications for the organization of lexical neighborhoods. Spoken languages, such as English, may have many words that share their initial phonological structure (tram, tramp, trampoline, etc.), leading to greater coactivation (and thus longer processing time) during word recognition. In contrast, ASL sign forms are often formationally quite distinct from one another, permitting quicker selection of a unique lexical entry.
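As a rough check on how the recognition figures above fit together, the reported identification points (240 msec. at 35 percent of a sign; 330 msec. at 83 percent of a word) can be converted into implied average item durations. The sketch below is back-of-envelope arithmetic only: the function name is hypothetical, the two figures come from different studies, and the implied durations are derived here rather than reported in the text.

```python
def implied_duration_ms(identified_after_ms: float, fraction_needed: float) -> float:
    """Total duration implied when `fraction_needed` of the item takes `identified_after_ms`."""
    return identified_after_ms / fraction_needed


# Figures quoted in the text: 240 msec. = 35% of a sign; 330 msec. = 83% of a word.
sign_total_ms = implied_duration_ms(240, 0.35)
word_total_ms = implied_duration_ms(330, 0.83)

print(f"implied sign duration: {sign_total_ms:.0f} ms")
print(f"implied word duration: {word_total_ms:.0f} ms")
```

On these numbers the implied sign is considerably longer than the implied word, yet it is identified earlier in absolute time, which is the pattern the text describes: slower articulation but faster recognition.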
2.2 Memory for Signs
Efforts to explore how the demands of sign language processing influence memory and attention have led to several significant findings. Early studies of memory for lists of signs report classic patterns of forgetting and interference, including serial position effects of primacy and recency (i.e., signs at the beginning and the end of a list are better remembered than items in the middle). Likewise, when sign lists are composed of phonologically similar sign forms, signers exhibit poorer recall (Klima and Bellugi 1979). This result is similar to what has been reported for users of spoken language, where subjects exhibit poorer memory for lists of similarly sounding words (Conrad and Hull 1964). These findings indicate that signs, like words, are encoded into short-term memory in a phonological or articulatory code rather than in a semantic code. More recent work has assessed whether Baddeley’s (1986) working memory model pertains to sign language processing. This model includes components that encode linguistic and visuospatial information as well as a central executive which mediates between immediate and long-term memory. Given the spatial nature of the sign signal, the question of which working memory component(s) is engaged during sign language processing is particularly interesting. Wilson and Emmorey (2000) have shown word-length effects in signs; it is easier to maintain a cohort of short signs (monosyllabic signs) than long signs (polysyllabic signs) in working memory. They also report effects of ‘articulatory suppression.’ In these studies, requiring a signer to produce repetitive, sign-like hand movements while encoding and maintaining signs in memory has a detrimental effect on aspects of recall. Once again, analogous effects are known to exist for spoken languages and these phenomena provide support for a multicomponent model of linguistic working memory that includes both an articulatory loop and a phonological store. Finally, under some circumstances deaf signers are able to utilize linguistically relevant spatial information to encode signs, suggesting the engagement of a visuospatial component of working memory.

2.3 Perception and Attention
Recent experiments with native users of signed languages have shown that experience with a visual sign language may improve or alter visual perception of nonlanguage stimuli. For example, compared to hearing nonsigners, deaf signers have been shown to possess enhanced or altered perception along several visual dimensions such as motion processing (Bosworth and Dobkins 1999), mental rotation, and processing of facial features (Emmorey 2001). For example, the ability to detect and attend selectively to peripheral, but not central, targets in vision is enhanced in signers (Neville 1991). Consistent with this finding is evidence for increased functional connectivity in neural areas mediating peripheral visual motion in the deaf (Bavelier et al. 2000). As motion processing, mental rotation, and facial processing underlie aspects of sign language comprehension, these perceptual changes have been attributed to experience with sign language. Moreover, several of these studies have reported such enhancements in hearing signers raised in deaf signing households. These findings provide further evidence that these visual perception enhancements are related to the acquisition of a visual manual language, and are not due to compensatory mechanisms developed as a result of auditory deprivation.
3. Neural Organization of Signed Language
3.1 Hemispheric Specialization
One of the most significant findings in neuropsychology is that the two cerebral hemispheres show complementary functional specialization, whereby the left hemisphere mediates language behaviors while the right hemisphere mediates visuospatial abilities. As noted, signed languages make significant use of visuospatial mechanisms to convey linguistic information. Thus sign languages exhibit properties for which each of the cerebral hemispheres shows specialization. Neuropsychological studies of brain-injured deaf signers and functional imaging studies of brain-intact signers have provided insights into the neural systems underlying sign language processing.

3.2 Aphasia in Signed Language
Aphasia refers to acquired impairments in the use of language following damage to the perisylvian region of the left hemisphere. Neuropsychological case studies of deaf signers convincingly demonstrate that aphasia in signed language is also found after left hemisphere perisylvian damage (Poizner et al. 1987). Moreover, within the left hemisphere, production and comprehension impairments follow the well-established anterior versus posterior dichotomy. Studies have documented that frontal anterior damage leads to Broca-like sign aphasia. In these cases, normal fluent signing is reduced to effortful, single-sign, ‘telegraphic’ output with little morphological complexity (such as verb agreement). Comprehension, however, is left largely intact. Wernicke-like sign aphasia following damage to the posterior third of the perisylvian region presents with fluent but often semantically opaque output, and comprehension also suffers. In cases of left hemisphere damage, sign language paraphasias may be observed. Paraphasia may be formationally disordered (for example, substituting an incorrect handshape) or semantically disordered (producing a sign that is semantically related to the intended form). These investigations serve to underscore the importance of left hemisphere structures for the mediation of signed languages and illustrate that sign language breakdown is not haphazard, but rather honors linguistic boundaries (Corina 2000). The effects of cortical damage to primary language areas that result in aphasic disturbance can be differentiated from more general impairments of movement. For example, Parkinson’s disease leads to errors involving timing, scope, and precision of general movements including, but not limited to, those involved in speech and signing. These errors produce phonetic disruptions in signing, rather than the higher-level phonemic disruptions that are apparent in aphasic signing (Corina 1999).
Apraxia is defined as an impairment of the execution of a learned movement (e.g., saluting, knowing how to work a knife and fork). Lesions associated with the left inferior parietal lobe result in an inability to perform and comprehend gestures (Gonzalez Rothi and Heilman 1997). Convincing dissociations of sign language impairment with well-preserved praxic abilities have been reported. In one case, a subject with marked sign language aphasia affecting both production and comprehension produced unencumbered pantomime. Moreover, both comprehension and production of pantomime were found to be better preserved than was sign language. These data indicate that language impairments following left hemisphere damage are not attributable to undifferentiated symbolic impairments and demonstrate that ASL is not simply an elaborate pantomimic system.
3.3 Role of the Right Hemisphere in ASL
Studies of signers with right hemisphere damage (RHD) have reported significant visuospatial disruption in nonlinguistic domains (e.g., face processing, drawing, block construction, route finding, etc.) but have reported only minimal language disruption. Problems in the organization of discourse have been observed in RHD signers, as have also been reported in users of spoken languages. More controversial are the influences of visuospatial impairment on the use of highly spatialized components of the language such as complex verb agreement and the classifier system. Further work is needed to understand whether these infrequently reported impairments (both in comprehension and production) reflect core linguistic deficits or rather reflect secondary effects of impaired visuospatial processing.

3.4 Functional Imaging
Recent studies using functional brain imaging have explored the neural organization of language in users of signed languages. These studies have consistently found participation of classic left hemisphere perisylvian language areas in the mediation of sign language in profoundly deaf, lifelong signers. For example, positron emission tomography studies of production in British Sign Language (McGuire et al. 1997) and Langue des Signes Québécoise (Petitto et al. 2000) reported that deaf subjects activated left inferior frontal regions, regions similar to those that mediate speech in hearing subjects. Researchers using functional magnetic resonance imaging techniques to investigate comprehension of ASL in deaf and hearing native users of signed language have shown significant activations in frontal opercular areas (including Broca’s area and dorsolateral prefrontal cortex) as well as in posterior temporal areas (such as Wernicke’s area and the angular gyrus) (Neville et al. 1998). Also reported was extensive activation in right hemisphere regions. Subsequent studies have confirmed that aspects of the right posterior hemisphere activation appear to be unique to sign language processing and present only in signers who acquired sign language from birth (Newman et al. in press).

Investigations of psychological and neural aspects of signing reveal strong commonalities in the development and cognitive processing of signed and spoken languages despite major differences in the surface form of these languages. Early exposure to a sign language may lead to enhancements in the specific neural systems underlying visual processing. There appears to be a strong biological predisposition for left hemisphere structures in the mediation of language, regardless of the modality of expression.
See also: Language Acquisition; Language Development, Neural Basis of; Sign Language
Bibliography
Baddeley A 1986 Working Memory. Oxford University Press, New York
Bavelier D, Tomann A, Hutton C, Mitchell T, Corina D, Liu G, Neville H 2000 Visual attention at the periphery is enhanced in congenitally deaf individuals. Journal of Neuroscience 20: RC93
Bosworth R, Dobkins K 1999 Left-hemisphere dominance for motion processing in deaf signers. Psychological Science 10: 256–62
Cheek A, Cormier K, Repp A, Meier R 2001 Prelinguistic gesture predicts mastery and error in the production of early signs. Language
Corina D 1999 Neural disorders of language and movement: Evidence from American Sign Language. In: Messing L, Campbell R (eds.) Gesture, Speech and Sign. Oxford University Press, New York
Corina D 2000 Some observations regarding paraphasia in American Sign Language. In: Emmorey K, Lane H (eds.) The Signs of Language Revisited: An Anthology to Honor Ursula Bellugi and Edward Klima. Lawrence Erlbaum Associates, Mahwah, NJ
Coulter G, Anderson S 1993 Introduction. In: Coulter G (ed.) Phonetics and Phonology: Current Issues in ASL Phonology. Academic Press, San Diego, CA
Conrad R, Hull A 1964 Information, acoustic confusion and memory span. British Journal of Psychology 55: 429–32
Emmorey K 2001 Language, Cognition, and the Brain: Insights From Sign Language Research. Lawrence Erlbaum Associates, Mahwah, NJ
Emmorey K, Corina D 1990 Lexical recognition in sign language: Effects of phonetic structure and morphology. Perceptual and Motor Skills 71: 1227–52
Gonzalez Rothi L, Heilman K 1997 Introduction to limb apraxia. In: Gonzalez Rothi L, Heilman K (eds.) Apraxia: The Neuropsychology of Action. Psychology Press, Hove, UK
Grosjean F 1980 Spoken word recognition processes and the gating paradigm. Perception and Psychophysics 28: 267–83
Groce J 1985 Everyone Here Spoke Sign Language. Harvard University Press, Cambridge, MA
Klima E, Bellugi U 1979 The Signs of Language. Harvard University Press, Cambridge, MA
Liddell S 1980 American Sign Language Syntax. Mouton, The Hague, The Netherlands
Lillo-Martin D, Klima E 1990 Pointing out differences: ASL pronouns in syntactic theory. In: Fisher S, Siple P (eds.) Theoretical Issues in Sign Language Research I: Linguistics. University of Chicago Press, Chicago
McGuire P, Robertson D, Thacker A, David A, Frackowiak R, Frith C 1997 Neural correlates of thinking in sign language. Neuroreport 8: 695–8
Meier R 1991 Language acquisition by deaf children. American Scientist 79: 60–70
Neville H 1991 Neurobiology of cognitive and language processing: Effects of early experience. In: Gibson K, Peterson A (eds.) Brain Maturation and Cognitive Development: Comparative and Cross-cultural Perspectives. Aldine de Gruyter Press, Hawthorne, NY
Neville H, Bavelier D, Corina D, Rauschecker J, Karni A, Lalwani A, Braun A, Clark V, Jezzard P, Turner R 1998 Cerebral organization for language in deaf and hearing subjects: Biological constraints and effects of experience. Proceedings of the National Academy of Science 90: 922–9
Newman A, Corina D, Tomann A, Bavelier D, Jezzard P, Braun A, Clark V, Mitchell T, Neville H (submitted) Effects of age of acquisition on cortical organization for American Sign Language: an fMRI study. Nature Neuroscience
Padden C, Humphries T 1988 Deaf in America: Voices from a Culture. Harvard University Press, Cambridge, MA
Petitto L 2000 The acquisition of natural signed languages: Lessons in the nature of human language and its biological foundations. In: Chamberlain C, Morford J (eds.) Language Acquisition by Eye. Lawrence Erlbaum Associates, Mahwah, NJ
Petitto L, Marantette P 1991 Babbling in the manual mode: Evidence for the ontogeny of language. Science 251: 1493–6
Petitto L, Zatorre R, Gauna K, Nikelski E, Dostie D, Evans A 2000 Speech-like cerebral activity in profoundly deaf people processing signed languages: Implications for the neural basis of human language. Proceedings of the National Academy of Science 97: 13961–6
Poizner H, Klima E, Bellugi U 1987 What the Hands Reveal About the Brain. MIT Press, Cambridge, MA
Wilson M, Emmorey K 2000 When does modality matter? Evidence from ASL on the nature of working memory. In: Emmorey K, Lane H (eds.) The Signs of Language Revisited: An Anthology to Honor Ursula Bellugi and Edward Klima. Lawrence Erlbaum Associates, Mahwah, NJ
Winston E 1995 Spatial mapping in comparative discourse frames. In: Emmorey K, Reilly J (eds.) Language, Gesture, and Space. Lawrence Erlbaum Associates, Mahwah, NJ
D. P. Corina
Signal Detection Theory
Signal detection theory (SDT) is a framework for interpreting data from experiments in which accuracy is measured. It was developed in a military context (see Signal Detection Theory, History of), then applied to sensory studies of auditory and visual detection, and is now widely used in cognitive science, diagnostic medicine, and many other fields. The key tenets of SDT are that the internal representations of stimulus events include variability, and that perception (in an experiment, or in everyday life) incorporates a decision process. The theory characterizes both the representation and the decision rule.
1. Detection and Discrimination Experiments
Consider an auditory experiment to determine the ability of a listener to detect a weak sound. On some trials, the sound is presented, and the listener is instructed to say ‘yes’ (I heard it) or ‘no’ (I did not hear it). On other trials, there is no sound, discouraging the observer from responding ‘yes’ indiscriminately. The presence of variability in sensory systems leads us to call these latter presentations noise (N) trials, and the former signal plus noise (S+N) trials. Possible data based on 50 trials of each type are given in Table 1. Responses of ‘yes’ on signal trials are called hits, and on no-signal trials are called false alarms. The observer’s performance can be summarized by a hit rate (here 42/50 = 0.84) and a false-alarm rate (here 15/50 = 0.30).

Table 1 Response frequencies in an auditory detection experiment

                  ‘Yes’   ‘No’   Total
Signal + noise      42      8      50
Noise               15     35      50

The variability principle implies that repeated presentation of the signal leads to a distribution of internal effect, as does repeated presentation of no signal. In the most common SDT model, these distributions are assumed to be normal, and to differ only in mean, as shown in Fig. 1. How does the observer reach a decision in this experiment, when any observation on the ‘internal effect’ axis is ambiguous with regard to source? The best decision rule is to establish a criterion value c on the axis, and respond ‘yes’ for values to the right of c, ‘no’ for values to the left. The location of the criterion can be determined from the data, for the area above it under the S+N distribution must equal the hit rate. Consulting a table of the normal distribution reveals that in the example above the criterion must therefore be about one standard deviation below the mean of that distribution. Similarly, the area above the criterion under the N distribution must equal the false-alarm rate, so the criterion is about 0.5 standard deviations above the mean of that distribution.

Figure 1 Distributions assumed by SDT to result from N and S+N. The horizontal axis is scaled in standard deviation units, and the difference between the means is d′. The observer responds ‘yes’ for values to the right of the criterion (vertical line), ‘no’ for those to the left

2. Sensitivity and Response Bias
Figure 1 offers natural interpretations of the sensitivity of the observer in the task and of the decision rule. Sensitivity, usually denoted d′, is reflected by the difference between the two distribution means; this characteristic of the representation is unaffected by the location of the criterion, and is thus free of response bias. The magnitude of d′ is found easily by computing the distances between each mean and the criterion, found in the last section:

$$ d' = z(\text{hit rate}) - z(\text{false-alarm rate}) \tag{1} $$

where z(p) is the point on a unit-normal distribution below which the area p can be found. The function z is negative for arguments less than 0.5 and positive for arguments above that value. In the example,

$$ d' = z(0.84) - z(0.30) = 0.994 - (-0.524) = 1.518 $$
Criterion location is one natural measure of response bias. If evaluated relative to the point at which the two distributions cross, it can be calculated by a formula analogous to Eqn. 1:

$$ c = -\tfrac{1}{2}\,[z(\text{hit rate}) + z(\text{false-alarm rate})] \tag{2} $$

In the example above,

$$ c = -\tfrac{1}{2}\,[z(0.84) + z(0.30)] = -0.235, $$

that is, the criterion is slightly to the left of the crossover point, as in fact is shown in Fig. 1.
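The two calculations in this section can be reproduced with a short script. The following is a minimal sketch, not part of the original article: it assumes the equal-variance normal model described above, uses Python's standard-library `statistics.NormalDist` for the inverse normal function, and the helper names `z`, `d_prime`, and `criterion` are chosen here for illustration only.

```python
from statistics import NormalDist


def z(p: float) -> float:
    """Inverse unit-normal CDF: the point below which the area is p."""
    return NormalDist().inv_cdf(p)


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Sensitivity, Eqn. 1: z(hit rate) - z(false-alarm rate)."""
    return z(hit_rate) - z(fa_rate)


def criterion(hit_rate: float, fa_rate: float) -> float:
    """Criterion location relative to the crossover point, Eqn. 2."""
    return -0.5 * (z(hit_rate) + z(fa_rate))


# Response frequencies from Table 1.
hits, misses = 42, 8
false_alarms, correct_rejections = 15, 35

hit_rate = hits / (hits + misses)
fa_rate = false_alarms / (false_alarms + correct_rejections)

print("d' =", round(d_prime(hit_rate, fa_rate), 3))
print("c  =", round(criterion(hit_rate, fa_rate), 3))
```

The printed values should agree with the 1.518 and −0.235 quoted in the text up to the rounding of the tabled z values.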
3. Receiver Operating Characteristic (ROC) Curves
If sensitivity truly is unaffected by the location of the response criterion, then a shift in that criterion should leave d′ unchanged. An observer’s decision rule can be manipulated between experimental runs through instructions or payoffs. In the more efficient rating method, a rating response is required; the analysis assumes that each distinct rating corresponds to a different region along the axis of Fig. 1, and that each region is defined by a different criterion. The data are analyzed by calculating a hit and false-alarm rate for each criterion separately. A plot of hit rate vs. false-alarm rate is called a receiver operating characteristic, or ROC; the ROC that would be produced from the representation in Fig. 1 is shown in Fig. 2. Points at the lower end of the curve correspond to high criteria, and the curve is traced out from left to right as the criterion moves from right to left in Fig. 1.

Figure 2 An ROC curve, the relation between hit and false-alarm rates, both of which increase as the criterion location moves (from right to left, in Fig. 1)

The shape of the ROC is determined by the form of the distributions in the representation, and data largely support the normality assumption. However, the assumption that the S+N and N distributions have the same variance is often incorrect (Swets 1986). In such cases, the area under the ROC is a good measure of accuracy (Swets and Pickett 1982); this index ranges from 0.5 for chance performance to 1.0 for perfect discrimination. An entire ROC (from a rating experiment, for example) is needed to estimate this area.

4. Alternative Assumptions About Representation
The essence of signal detection theory is the distinction between the representation and the decision rule, not the shape of the underlying distributions, and two types of nonnormal distributions have been important. One such distribution is the logistic, which has a shape very similar to the normal but is computationally more tractable. Logistic models have been used for complex experimental designs, and for statistical analysis. Another family of distributions, whose shape is rectangular, leads to rectilinear ROCs and can be interpreted as support for ‘thresholds’ that divide discrete perceptual states. Such curves are unusual, experimentally, but have been found in studies of recognition memory (Yonelinas and Jacoby 1994). Models using threshold assumptions have been developed for complex experimental situations, such as source monitoring in memory; in this context, they are called multinomial models (see Discrete State Models of Information Processing).

5. Identification and Classification Along a Single Dimension
Detection theory is extended easily to experiments with multiple stimuli differing on a single dimension. For example, a set of tones differing only in intensity may be presented one at a time for identification (the observer must identify the exact stimulus presented) or classification (the observer must sort the set into two or more categories). The representation is assumed to be a set of distributions along a decision axis, with m − 1 criteria used to assign responses to m categories. SDT analysis allows the calculation of d′ (or a related measure, if equal variances are not assumed) for any stimulus pair, and an interesting question concerns the relation between this value and that found in a discrimination task using just those two stimuli. According to a theory of Braida and Durlach (1988), identification d′ will be lower than discrimination d′ by an amount that reflects memory constraints in the former task. The predicted relation is qualitatively correct for many stimulus dimensions, and can be quantitatively predicted for tones differing in intensity.
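The ROC construction described in Sect. 3 (one hit rate and one false-alarm rate per criterion, and the area under the resulting curve) can be sketched numerically. This is an illustrative sketch under stated assumptions, not the article's own procedure: it assumes the equal-variance normal model with N centered at 0 and S+N at d′, sweeps hypothetical criterion locations, and estimates the area with the trapezoidal rule. The function names are invented for the example. Under these assumptions the area should be close to Φ(d′/√2), which the last line computes for comparison.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()  # unit-normal distribution


def roc_points(d_prime, criteria):
    """Return (false-alarm rate, hit rate) for each criterion location.

    Assumes N ~ Normal(0, 1) and S+N ~ Normal(d_prime, 1); each criterion
    is a position on the decision axis measured from the N mean.
    """
    points = []
    for k in criteria:
        fa = 1.0 - PHI.cdf(k)             # area above k under N
        hit = 1.0 - PHI.cdf(k - d_prime)  # area above k under S+N
        points.append((fa, hit))
    return points


def area_under(points):
    """Trapezoidal area under the ROC, anchored at (0, 0) and (1, 1)."""
    pts = sorted([(0.0, 0.0)] + list(points) + [(1.0, 1.0)])
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


d = 1.518
# Criteria from strict (far right) to lenient (far left), as in Fig. 1.
criteria = [i / 10.0 for i in range(60, -41, -1)]
curve = roc_points(d, criteria)
auc = area_under(curve)

print("trapezoidal area:", round(auc, 3))
print("Phi(d'/sqrt(2)): ", round(PHI.cdf(d / sqrt(2.0)), 3))
```

With real rating data only a handful of points are available, so the trapezoidal estimate would be coarser than in this dense sweep.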
6. Complex Discrimination Designs and Multidimensional Detection Theory
The one-interval design for measuring discrimination is appropriate for some applications (it seems quite natural in recognition memory, for example, where the S+N trials are studied items and the N trials are foils), but other methods are often preferred. One popular technique is two-alternative forced-choice (2AFC), in which both stimuli are presented on every trial and the observer must say whether N or S+N occurred first. Another method is same-different, in which there are again two intervals, but each interval can contain either of the two stimuli and the observer must say whether the two stimuli are the same or different. These and many other designs can be analyzed by assuming that each interval is represented by a value on a separate, independent perceptual dimension. Although d′ is a characteristic of the stimuli and the observer, and therefore constant across paradigms, performance as measured by percent correct [p(c)] varies widely (Macmillan and Creelman 1991). For example, Sect. 2 showed that if d′ = 1.5, p(c) = 0.77 in the one-interval task; but in 2AFC the same observer would be expected to score p(c) = 0.86, and in the same-different task as low as p(c) = 0.61. The relation between one-interval and 2AFC was one of the first predictions of SDT to be tested (Green and Swets 1966), whereas that between same-different and other paradigms is a more recent topic of investigation. An interesting aspect of same-different performance is that two distinct decision strategies are available, and their use depends on aspects of the experimental design and stimulus set (Irwin and Francis 1995).

Many identification and classification experiments use stimulus sets that themselves vary in multiple dimensions perceptually, and SDT analysis can be extended to this more general case. Multidimensional detection theory has been particularly helpful in clarifying questions about independence (see Ashby 1992; see also Signal Detection Theory: Multidimensional).
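The one-interval and 2AFC percent-correct values quoted in this section can be checked with a few lines of code. This is a hedged sketch rather than the article's derivation: it assumes the equal-variance normal model, an unbiased observer in the one-interval task (criterion midway between the means, giving p(c) = Φ(d′/2)), and the standard 2AFC relation p(c) = Φ(d′/√2). The function names are illustrative. The same-different value is not computed because, as noted above, it depends on which decision strategy the observer adopts.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # unit-normal cumulative distribution function


def pc_one_interval(d_prime: float) -> float:
    """Proportion correct in the one-interval task for an unbiased observer."""
    return PHI(d_prime / 2.0)


def pc_2afc(d_prime: float) -> float:
    """Proportion correct in two-alternative forced-choice."""
    return PHI(d_prime / sqrt(2.0))


d = 1.5
print("one-interval p(c):", round(pc_one_interval(d), 2))
print("2AFC p(c):        ", round(pc_2afc(d), 2))
```

For the biased observer of Table 1, the one-interval figure can also be obtained directly from the data as (0.84 + 0.70)/2 = 0.77.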
7. Comparing Observed with Ideal or Theoretically-derived Performance
Detection theory has provided a forum for asking a slate of questions about the optimality of performance. One set of issues concerns the degree to which human behavior falls below the ideal level set by stimulus variability. Early auditory experiments using external noise revealed that human ‘efficiency’ was often quite high, though never perfect.

A different type of application concerns the theoretical prediction of the ratio between the S+N and N standard deviations, which can be derived from the shape of the ROC. For example, this ratio should be less than unity for detection experiments in which signal parameters are not known by the observer (Graham 1989). Ratcliff et al. (1994) showed that for an important class of recognition memory models, the ratio should be approximately 0.8, as it is empirically.

Some strong assumptions about the decision process that is at the heart of SDT have been tested. In general, these are well supported for one-dimensional representations, but complications sometimes arise for higher dimensionality. Do observers establish fixed criteria? Experiments requiring discrimination between distributions of stimuli have shown that they do when the variability is in one dimension and, with adequate experience, in two dimensions as well. Do observers set ‘optimal’ criteria? Actual criterion settings are often conservative compared to ideal ones predicted by Bayes’s Theorem. In a multidimensional task, observers appear to adopt appropriate decision boundaries when those are simple in form (e.g., linear), but fall short of optimality when more complex rules are required. Does the representation remain fixed in the face of task variation? In one dimension, this amounts to the original, often-replicated datum that d′ (or some sensitivity measure) is unchanged when criterion changes. In more than one dimension, attentional effects can alter the representation (Nosofsky 1986), and the question of what is invariant in these cases is still unresolved.
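To make the notion of an ‘ideal’ criterion concrete, the sketch below computes the likelihood-ratio criterion that maximizes expected payoff and converts it to a criterion location. It is an illustration under assumptions not stated in the article: the equal-variance normal model (for which ln β = c · d′, with c measured from the crossover point as in Eqn. 2), the expected-value rule β = [P(N)/P(S+N)] × [(V_CR + C_FA)/(V_H + C_M)] found in standard SDT textbooks, and hypothetical parameter names for the payoffs.

```python
from math import log


def optimal_beta(p_signal, value_hit=1.0, cost_miss=1.0,
                 value_cr=1.0, cost_fa=1.0):
    """Likelihood-ratio criterion that maximizes expected payoff.

    Payoff arguments are hypothetical magnitudes (all positive); with the
    defaults the rule reduces to the prior odds against a signal.
    """
    p_noise = 1.0 - p_signal
    return (p_noise / p_signal) * ((value_cr + cost_fa) / (value_hit + cost_miss))


def optimal_criterion(d_prime, beta):
    """Criterion location relative to the crossover point (equal variances)."""
    return log(beta) / d_prime


d = 1.5
for p_signal in (0.5, 0.25):
    beta = optimal_beta(p_signal)
    print(p_signal, round(beta, 2), round(optimal_criterion(d, beta), 3))
```

With equal priors and symmetric payoffs the ideal criterion sits at the crossover point (c = 0); rarer signals push it to the right. An observed c can then be compared with this value to judge whether an observer is more conservative or more liberal than the ideal.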
See also: Decision Theory: Classical; Perception: History of the Concept; Psychophysical Theory and Laws, History of; Signal Detection Theory, History of; Signal Detection Theory: Multidimensional; Time Perception Models
Bibliography
Ashby F G (ed.) 1992 Multidimensional Models of Perception and Cognition. Erlbaum, Hillsdale, NJ
Braida L D, Durlach N I 1988 Peripheral and central factors in intensity perception. In: Edelman G M, Gall W E, Cowan W M (eds.) Auditory Function. Wiley, New York, pp. 559–83
Graham N V 1989 Visual Pattern Analyzers. Oxford University Press, Oxford, UK
Green D M, Swets J A 1966 Signal Detection Theory and Psychophysics. Wiley, New York
Irwin R J, Francis M A 1995 Perception of simple and complex visual stimuli: Decision strategies and hemispheric differences in same-different judgments. Perception 24: 787–809
Macmillan N A, Creelman C D 1991 Detection Theory: A User’s Guide. Cambridge University Press, New York
Nosofsky R M 1986 Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General 115: 39–57
Ratcliff R, McKoon G, Tindall M 1994 Empirical generality of data from recognition memory receiver-operating characteristic functions and implications for the global memory models. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 763–85
Swets J A 1986 Form of empirical ROCs in discrimination and diagnostic tasks. Psychological Bulletin 99: 181–98
Swets J A, Pickett R M 1982 Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, New York
Yonelinas A P, Jacoby L L 1994 Dissociations of processes in recognition memory: Effects of interference and of response speed. Canadian Journal of Experimental Psychology 48: 516–34
N. A. Macmillan
Signal Detection Theory, History of
1. Introduction and Scope
Signal detection theory (SDT) sprouted from World War II research on radar into a probability-based theory in the early 1950s. It specifies the optimal observation and decision processes for detecting electronic signals against a background of random interference or noise. The engineering theory, culminating in the work of Wesley W. Peterson and Theodore G. Birdsall (Peterson et al. 1954), had foundations in mathematical developments for theories of statistical inference, beginning with those advanced by Jerzy Neyman and E. S. Pearson (1933). SDT was taken into psychophysics, then a century-old branch of psychology, when the human observer’s detection of weak signals, or discrimination between similar signals, was seen by psychologists as a problem of inference. In psychology, SDT is a model for a theory of how organisms make fine discriminations and it specifies model-based methods of data collection and analysis. Notably, through its analytical technique called the receiver operating characteristic (ROC), it separates sensory and decision factors and provides independent measures of them. SDT’s approach is now used in many areas in which discrimination is studied in psychology, including cognitive as well as sensory processes. From psychology, SDT and the ROC came to be applied in a wide range of practical diagnostic tasks, in which a decision is made between two confusable alternatives (Swets 1996) (see also Signal Detection Theory).
2. Electronic Detection and Statistical Inference
Central to both electronics and statistics is a conception of a pair of overlapping, bell-shaped, probability (density) functions arrayed along a unidimensional variable, which is the weight of evidence derived from observation. In statistical theory, the probabilities are conditional on the null hypothesis or the alternative hypothesis, while in electronic detection theory they are observational probabilities conditional on noise-alone or signal-plus-noise. The greater the positive weight of evidence, the more likely a signal, or significant experimental effect, is present. A cutpoint must be set along the variable to identify those observed values above the cutpoint that would lead to rejection of the null hypothesis or acceptance of the presence of the designated signal. Errors are of commission or omission: type I and type II errors in statistics, and false alarms and misses in detection. Correct outcomes in detection are hits and correct rejections. Various decision rules specify optimal cutpoints for different decision goals. The weight of evidence is unidimensional because just two decision alternatives are considered. Optimal weights of evidence are monotone increasing with the likelihood ratio, the ratio of the two overlapping bell-shaped probability functions. It is this ratio, and not the shapes of the functions, that is important.
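As a worked illustration of the likelihood ratio, consider one convenient special case. It is an assumption of this example, not a requirement of the theory: the noise-alone density is unit normal with mean 0, and the signal-plus-noise density is unit normal with mean d′ > 0. Writing f_N and f_SN for the two densities (symbols introduced here), the ratio is

```latex
\[
\ell(x) \;=\; \frac{f_{SN}(x)}{f_{N}(x)}
        \;=\; \frac{\exp\!\left[-\tfrac{1}{2}(x-d')^{2}\right]}
                   {\exp\!\left[-\tfrac{1}{2}x^{2}\right]}
        \;=\; \exp\!\left(d'x-\tfrac{1}{2}d'^{2}\right),
\qquad
\ln \ell(x) \;=\; d'\left(x-\tfrac{1}{2}d'\right).
\]
```

In this case the log likelihood ratio is a strictly increasing function of x, so a cutpoint on the observation x is equivalent to a cutpoint on the likelihood ratio, and the ratio equals 1 midway between the two means.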
3. Modern Signal Detection Theory
In the early 1950s, Peterson and Birdsall, then graduate students in electrical engineering and mathematics, respectively, at the University of Michigan, developed the general mathematical theory of signal detection that is still current. In the process they devised the ROC, a graphical technique permitting measurement of two independent aspects of detection performance: (a) the location of the observer’s cutpoint or decision criterion and (b) the observer’s sensitivity, or ability to discriminate between signal-plus-noise and noise-alone, irrespective of any chosen cutpoint. The two measures characterize performance better than the single one often used, namely, the signal-to-noise ratio (SNR) expressed in energy terms that is necessary to provide a 50 percent hit (correct signal acceptance) probability at a fixed decision cutpoint, say, one yielding a conditional false-alarm probability of 0.05. A curve showing conditional hit probability versus SNR is familiar in detection theory and is equivalent to the power function of statistical theory. These curves can be derived as a special case of the ROCs for the SNR values of interest.

4. The ROC
The ROC is a plot of the conditional hit (true-positive) proportion against the conditional false-alarm (false-positive) proportion for all possible locations of the decision cutpoint. Peterson’s and Birdsall’s SDT specifies a form of ROC that begins at the lower left corner (where both proportions are zero) and rises with smoothly decreasing slope to the upper right corner (where both proportions are 1.0) as the decision cutpoint varies from very strict (at the right end of the decision variable) to very lenient (at the left end). This is a ‘proper’ ROC, and specifically not a ‘singular’ ROC, or one intersecting axes at points between 0 and 1.0. The locus of the curve, or just the proportion of area beneath it, gives a measure of discrimination capacity. An index of a point along the curve, for example, the slope of the curve at the point, gives a measure of the decision cutpoint that yielded that point (the particular slope equals the criterial value of likelihood ratio). SDT specifies the discrimination capacity attainable by an ‘ideal observer’ for any SNR for various practical combinations of signal and noise parameters (e.g., signal specified exactly and signal specified statistically) and hence the human observer’s efficiency can be calculated under various signal conditions. SDT also specifies the optimal cutoff as a function of the signal’s prior probability and the benefits and costs of the decision outcomes.

5. Psychology’s Need for SDT and the ROC
Wilson P. Tanner, Jr., and John A. Swets, graduate students in psychology at Michigan in the early 1950s, became aware of Peterson’s and Birdsall’s work on the same campus as it began. Tanner became acquainted through his interest in mathematical and electronic concepts as models for psychological and neural processes and Swets was attracted because of his interest in quantifying the ‘instruction stimulus’ in psychophysical tasks, in order to control the observer’s response criterion. As they pursued studies in sensory psychology, these psychologists had become dissatisfied with psychophysical and psychological theories based on overlapping bell-shaped functions that assumed fixed decision cutpoints and incorporated measures only of discrimination capacity, which were possibly confounded by attitudinal or decision processes. Prominent among such theories were those of Gustav Theodor Fechner (1860), Louis Leon Thurstone (1927), and H. Richard Blackwell (1963); see Swets (1996, Chap. 1).

Fechner and Thurstone conceived of a symmetrical cutpoint, where the two distributions cross, because they presented two stimuli on each trial with no particular valence; that is, both stimuli were ‘signals’ (lifted weights, samples of handwriting). Fechner also studied detection of single signals, but did not compare them to a noise-alone alternative and his theory was essentially one of variable representations of a signal compared to an invariant cutpoint. This cutpoint was viewed as a physiologically determined sensory threshold (akin to all-or-nothing nerve firing) and hence the values of the sensory variable beneath it were thought to be indistinguishable from one another. Blackwell’s task and model were explicitly of signals in noise, but the observer was thought to have a fixed cutpoint near the top of the noise function, for which the false-alarm probability was negligible, and, again, the values below that cutpoint were deemed indistinguishable. For both threshold theorists, the single measure of performance was the signal strength required to yield 50 percent correct positive responses, like the effective SNR of electronic detection theory, and it was taken as a statistical estimate of a sensory threshold. In psychology, the curve relating percentage of signals detected to signal strength is called the psychometric function.

There had been some earlier concern in psychology for the effect of the observer’s attitude on the measured threshold, e.g., Graham’s (1951) call for quantification of instruction stimuli in the psychophysical equation, but no good way to deal with non-sensory factors had emerged. In 1952, Tanner and Swets joined Peterson and Birdsall as staff in a laboratory of the electrical engineering department called the Electronic Defense Group. They were aware of then-new conceptions of neural functioning in which stimulus inputs found an already active nervous system and neurons were not lying quiescent till fired at full force. In short, neural noise as well as environmental noise was likely to be a factor in detection and then the observer’s task is readily conceived as a choice between statistical hypotheses. Though a minority idea in the history of psychophysics (see Corso 1963), the possibility that the observer deliberately sets a cutpoint on a continuous variable (weight of evidence) seemed likely to Tanner and Swets. A budding cognitive psychology (e.g., ‘new look’ in perception) supported the notion of extrasensory determinants of perceptual phenomena, as represented in SDT by expectancies (prior probabilities) and motivation (benefits and costs).
6. Psychophysical Experiments
The new SDT was first tested in Swets’s doctoral thesis in Blackwell’s vision laboratory (Tanner and Swets 1954; Swets et al. 1961; see Swets 1964, Chap. 1). It was then tested with greater control of the physical signal and noise parameters in new facilities for auditory research in the electrical engineering laboratory. Sufficiently neat empirical ROCs were obtained in the form of the curved arc specified by SDT. ROCs found to be predicted by Blackwell’s ‘high-threshold’ theory (straight lines extending from some point along the left axis, that depended on signal strength, to the upper right hand corner) clearly did not fit the data. Other threshold theories, e.g., ‘low-threshold’ theories, did not fare much better in experimental tests (Swets 1961; see Swets 1964, Chap. 4). It was recognized later that linear ROCs of slope = 1, intersecting left and upper edges of the graph symmetrically, were predicted by several other measures and their implicit models (Swets 1996, Chap. 3) and also gave very poor fits to sensory data (and other ROC data to be described; Swets 1996, Chap. 2).

After a stint as observer in Swets’s thesis studies, David M. Green joined the engineering laboratory as an undergraduate research assistant. He soon became a full partner in research, and co-authored a laboratory technical report with Tanner and Swets in 1956 that included the first auditory studies testing SDT. Birdsall has collaborated in research with the psychologists over the following decades and commented on a draft of this article.

The measure of discrimination performance used first was denoted d′ (‘d prime’) and defined as the difference between the means of two implicit, overlapping, normal (Gaussian) functions for signal-plus-noise and noise-alone, of equal variance, divided by their common standard deviation. A value of d′ can be calculated for each ROC point, using the so-called single-stimulus yes–no method of data collection. Similarly, a value of d′ can be calculated for an observer’s performance under two other methods of data collection: the multiple-stimulus forced-choice method (one stimulus being signal-plus-noise and the rest noise-alone) and the single-stimulus confidence-rating method. Experimental tests showed that when the same signal and nonsignal stimuli were used with the three methods, essentially the same values of d′ were obtained. The measures specified by threshold theories had not achieved that degree of consistency or internal validity.

For many, the most conclusive evidence favoring SDT’s continuous decision variable and rejecting the high-threshold theory came from an experiment suggested by Robert Z. Norman, a Michigan mathematics student. Carried out with visual stimuli, the critical finding was that a second choice made in a four-alternative forced-choice test, when the first choice was incorrect, was correct with probability greater than 1/3 and that the probability of it being correct increased with signal strength (Swets et al. 1961, Swets 1964, Chap. 1). Validating the rating method, as mentioned above, had the important effect of providing an efficient method for obtaining empirical ROCs. Whereas under the yes–no method the observer is induced to set a different cutpoint in each of several observing sessions (each providing an ROC point), under the rating method the observer effectively maintains several decision cutpoints simultaneously (the boundaries of the rating categories) so that several empirical ROC points (enough to define a curve) can be obtained in one observing session.
7. Dissemination of SDT in Psychology
Despite the growing evidence for it, acceptance of SDT was not rapid or broad in psychophysical circles. This was partly due to threshold theory having been ingrained in the field for a century (as a concept and a collection of methods and measures), and probably also because SDT arose in an engineering context (and retained the engineers’ terminology), because the early visual data were noisy, and because the first article (Tanner and Swets 1954) was cryptic. Dissemination was assisted when J. C. R. Licklider brought Swets and Green to the Massachusetts Institute of Technology where they participated in Cambridge’s hotbed of psychophysics and mathematical psychology, and where offering a special summer course for postgraduates led to a published collection of approximately 35 of the articles on SDT in psychology that had appeared by then, along with tables of d′ (Swets 1964). Licklider also hired both of them part-time at the research firm of Bolt Beranek and Newman Inc., where they obtained contract support from the National Aeronautics and Space Administration to write a systematic textbook (Green and Swets 1966). Meanwhile, Tanner mentored a series of exceptional graduate students at Michigan. All were introduced by Licklider to the Acoustical Society of America where they enjoyed a feverishly intense venue for discussion of their ideas at biannual meetings and in the Society’s journal. Another factor was the active collaboration with Tanner of James P. Egan and his students at Indiana University.

8. Extensions of SDT in Psychology
Egan recognized the potential for applying SDT in psychology beyond the traditional psychophysical tasks. He extended it to the less tightly defined vigilance task, that is, the practical military and industrial observing task of ‘low-probability watch’ (Egan et al. 1961; see Swets 1964, Chap. 15), and also to speech communication (Egan and Clarke 1957; see Swets 1964, Chap. 30). He also brought SDT to purely cognitive tasks with experiments in recognition memory (see Green and Swets 1966, Sect. 12.9). Applications were then made by others to those tasks, and also to tasks of conceptual judgment, animal discrimination and learning, word recognition, attention, visual imagery, manual control, and reaction time (see Swets 1996, Chap. 1). A broad sample of empirical ROCs obtained in these areas indicates their very similar forms (Swets 1996, Chap. 2). An Annual Review chapter reviews applications in clinical psychology, for example, to predicting acts of violence (McFall and Treat 1999).

9. Applications in Diagnostics
An analysis of performance measurement in information retrieval suggested that two-by-two contingency tables for any diagnostic task were grist for SDT’s mill (Swets 1996, Chap. 9). This idea was advanced considerably by Lusted’s (1968) application to medical diagnosis. A standard protocol for evaluating diagnostic performance via SDT methods, with an emphasis on medical images, was sponsored by the National Cancer Institute (Swets and Pickett 1982). By 2000, ‘ROC’ is specified as a key word in over 1000 medical articles each year, ranging from radiology to blood analysis. Other diagnostic applications are being made to aptitude testing, materials testing, weather forecasting, and polygraph lie detection (Swets 1996, Chap. 4). A recent development is to use SDT to improve, as well as to evaluate, diagnostic accuracy. Observers’ ratings of the relevant dimensions of a diagnostic task (e.g., perceptual features in an X-ray) are merged in a manner specified by the theory to yield an optimal estimate of the probability that a ‘signal’ is present (Swets 1996, Chap. 8). The latest work both in psychology and diagnostics has been described didactically (Swets 1998).
See also: Psychophysical Theory and Laws, History of; Signal Detection Theory; Signal Detection Theory: Multidimensional
Bibliography
Blackwell H R 1963 Neural theories of simple visual discriminations. Journal of the Optical Society of America 53: 129–60
Corso J F 1963 A theoretico-historical review of the threshold concept. Psychological Bulletin 60: 356–70
Fechner G T 1860 Elemente der Psychophysik. Breitkopf & Härtel, Leipzig, Germany [English translation of Vol. 1 by Adler H E 1966. In: Howes D H, Boring E G (eds.) Elements of Psychophysics. Holt, Rinehart and Winston, New York]
Graham C H 1951 Visual perception. In: Stevens S S (ed.) Handbook of Experimental Psychology. Wiley, New York
Green D M, Swets J A 1966 Signal Detection Theory and Psychophysics. Wiley, New York. Reprinted 1988 by Peninsula, Los Altos Hills, CA
Lusted L B 1968 Introduction to Medical Decision Making. C. C. Thomas, Springfield, IL
McFall R M, Treat T A 1999 Quantifying the information value of clinical assessments with signal detection theory. Annual Review of Psychology 50: 215–41
Neyman J, Pearson E S 1933 On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London A231: 289–311
Peterson W W, Birdsall T G, Fox W C 1954 The theory of signal detectability. Transactions of the Institute of Radio Engineers Professional Group on Information Theory, PGIT 4: 171–212. Also In: Luce R D, Bush R R, Galanter E 1963 (eds.) Readings in Mathematical Psychology. Wiley, New York, Vol. 1
Swets J A (ed.) 1964 Signal Detection and Recognition by Human Observers. Wiley, New York. Reprinted 1988 by Peninsula, Los Altos Hills, CA
Swets J A 1996 Signal Detection Theory and ROC Analysis in Psychology and Diagnostics: Collected Papers. L. Erlbaum Associates, Mahwah, NJ
Swets J A 1998 Separating discrimination and decision in detection, recognition, and matters of life and death. In: Osherson D (series ed.) Scarborough D, Sternberg S (vol. eds.) An Invitation to Cognitive Science: Vol 4, Methods, Models, and Conceptual Issues. MIT, Cambridge, MA
Swets J A, Pickett R M 1982 Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, New York
Tanner W P Jr, Swets J A 1954 A decision making theory of visual detection. Psychological Review 61: 401–9
Thurstone L L 1927 A law of comparative judgment. Psychological Review 34: 273–86
J. A. Swets
Signal Detection Theory: Multidimensional
Multidimensional signal detection theory (MSDT) is an extension of signal detection theory (SDT) to more than one dimension, with each dimension representing a different source of information along which the ‘signal’ is registered. The development of MSDT was motivated by psychologists who wanted to study information processing of more realistic, multidimensional stimuli, for example, the shape and color of geometric objects, or pitch, loudness, and timbre of tones. The typical questions of interest are: Are the dimensions (e.g., color and shape of an object) processed independently, or does the perception of one dimension (e.g., color) depend on the perception of the second (shape), and if so, what is the nature of their dependence? In MSDT, several types of independence of perceptual dimensions are defined and related to two sets (marginal and conditional) of sensitivity, d′, and response bias, β, parameters. This article gives the definitions, and outlines the uses in studying information processing of dimensions. MSDT is also known as General Recognition Theory (Ashby and Townsend 1986) or Decision Bound Theory (e.g., Ashby and Maddox 1994).
1. Introduction and Notation

Stimulus dimensions, A, B, …, represent different types of information about a class of stimuli, for example, in a two-dimensional (2-D) case, the shape (A) and color (B) of objects. The researcher specifies the levels on each dimension to study (e.g., square and rectangle are levels of shape, and blue, turquoise, and green are levels of color). A multidimensional ‘signal’ is stimulus $A_iB_j$, where the subscripts indicate the specific levels on the corresponding dimensions (e.g., a green square is $A_1B_3$). For each physical stimulus dimension A (B, …), there is assumed a corresponding unique psychological or perceptual dimension X (Y, …). The minimal stimulus set considered in this article consists of four 2-D stimuli, $\{A_1B_1, A_1B_2, A_2B_1, A_2B_2\}$, constructed by combining two levels on dimension A ($i = 1, 2$) with each of two levels on dimension B ($j = 1, 2$). For example, the set constructed from two levels of color (blue, green) and two levels of shape (square, rectangle) is {blue square, blue rectangle, green square, green rectangle}.

On a single trial, one stimulus is selected from the set at random, and it is presented to the observer, whose task is to identify the stimulus. The presentation of a 2-D stimulus results in a percept, a point $(x, y)$ in X-Y perceptual space. The observer’s response is based on the percept’s location relative to the observer’s placement of decision criterion bounds in the perceptual space, typically one bound per dimension, denoted $c_A, c_B, \ldots$. These bounds divide the perceptual space into response regions such that if, for example, $(x, y)$ falls above $c_A$ and below $c_B$, the (2-D) response would be ‘21’ (or any response that uniquely identifies it as $A_2B_1$).

As in unidimensional SDT, multiple presentations of a single stimulus are assumed to lead to a distribution of percepts, and one distribution (usually Gaussian) is assumed for each stimulus. An example with four 2-D stimuli is illustrated in Fig. 1. The bivariate joint densities, $f_{A_iB_j}(x, y)$, are represented more easily as equal-density contours (Fig. 1(b)), which are obtained by slicing off the tops of the joint densities in Fig. 1(a) at a constant height and viewing the space from ‘above.’

[Figure 1. (a) The perceptual space showing four joint bivariate densities, $f_{A_iB_j}(x, y)$, that represent all possible (theoretical) perceptual effects for each of the four 2-D stimuli $A_iB_j$; (b) Equal density contours of the bivariate densities in (a) and the decision bounds $c_A$ and $c_B$, with levels 1 and 2 of dimension A on the x axis and levels 1 and 2 of dimension B on the y axis; (c) Marginal densities, $g_{A_iB_j}(x)$, representing perceptual effects for each stimulus $A_iB_j$ on dimension A only. In this example, $g_{A_1B_1}(x) = g_{A_1B_2}(x)$ but $g_{A_2B_1}(x) \neq g_{A_2B_2}(x)$.]

The joint density for each stimulus $A_iB_j$ yields: (a) Marginal densities on each dimension, $g_{A_iB_j}(x)$ and $g_{A_iB_j}(y)$, obtained by integrating over all values of the second dimension (see Fig. 1(c) for marginal densities $g_{A_iB_j}(x)$ for dimension A); and (b) Conditional densities for each dimension (not shown), conditioning on the observer’s response to the second dimension, and obtained by integrating only over values either above (or below) the decision bound on the second dimension, for example, $g_{A_iB_j}(x \mid y > c_B)$.

An observer’s responses are tabulated in a 4 × 4 confusion matrix, a table of conditional probability estimates, $P(R_p \mid A_iB_j)$, that are proportions of trials on which response $R_p$ was emitted when stimulus $A_iB_j$ was presented. These proportions estimate the volumes under each joint density in each response region. For example, the proportion of trials on which the observer responded ‘11’ when in fact stimulus $A_1B_2$ was presented is

$$P(R_{11} \mid A_1B_2) = \int_{-\infty}^{c_B}\int_{-\infty}^{c_A} f_{A_1B_2}(x, y)\,dx\,dy \tag{1}$$

2. Definitions of Independence of Dimensions

Several types of independence of dimensions are defined within MSDT: perceptual separability (PS), decisional separability (DS), and perceptual independence (PI).

2.1 Perceptual Separability (PS) and Decisional Separability (DS)

Dimension A is PS from dimension B if the perceptual effects of A are equal across levels of dimension B, that is, $g_{A_iB_1}(x) = g_{A_iB_2}(x)$, for all values of x and both $i = 1$ and $i = 2$. Similarly, dimension B is perceptually separable from dimension A if $g_{A_1B_j}(y) = g_{A_2B_j}(y)$, for all y, and $j = 1, 2$. DS is a similar type of independence of the decisions that the observer makes about the stimuli. DS holds for dimension A (or B) when the decision bound $c_A$ (or $c_B$) is parallel to the Y- (or X-) perceptual axis. PS and DS are global types of independence defined for perceptual dimensions across levels of the second dimension for all stimuli in the set. They can also be asymmetrical relations. In Fig. 1(b), for example, PS holds for dimension B across levels of dimension A, but PS fails for dimension A across levels of B; also, DS holds for dimension A but fails for dimension B.

2.2 Perceptual Independence (PI)

PI is distinguished from PS in that it is a local form of independence among dimensions within a single stimulus. PI holds within stimulus $A_iB_j$ if

$$f_{A_iB_j}(x, y) = g_{A_iB_j}(x)\,g_{A_iB_j}(y)$$

for all $(x, y)$. This is statistical independence, and it is a symmetric relation in a given stimulus. The shape of an equal density contour describes the correlation among the perceived dimensions in one stimulus. A circle indicates PI in a stimulus with variances equal on the two dimensions. An ellipse with its major and minor axes parallel to the perceptual axes also indicates PI, but with unequal variances on the two dimensions. A tilted ellipse indicates failure of PI, with the direction of the tilt (positive or negative slope of the major axis) indicating the direction of the dependence. In Fig. 1(b), stimuli $A_1B_1$ and $A_2B_1$ show PI, stimulus $A_2B_2$ shows a positive dependence, and stimulus $A_1B_2$ shows a negative dependence.
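As a rough numerical illustration of Eqn. (1) and of the PI condition, the sketch below evaluates the volume of a bivariate Gaussian density that lies below both decision bounds. It assumes Gaussian percepts and decision bounds parallel to the axes; the function names and every parameter value are hypothetical and are not taken from the article. When the correlation is zero (PI holds), the volume should factor into the product of the two marginal probabilities.

```python
import math


def std_normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def prob_response_11(mu_x, mu_y, sd_x, sd_y, rho, c_a, c_b, steps=2000):
    """P(x < c_a, y < c_b) for a bivariate Gaussian with |rho| < 1.

    The outer integral over x uses Simpson's rule (steps must be even);
    the inner integral over y is the conditional normal CDF of y given x.
    """
    lower = mu_x - 10.0 * sd_x  # stands in for minus infinity
    if c_a <= lower:
        return 0.0
    h = (c_a - lower) / steps
    cond_sd = sd_y * math.sqrt(1.0 - rho * rho)

    def integrand(x):
        z_x = (x - mu_x) / sd_x
        cond_mean = mu_y + rho * sd_y * z_x
        return std_normal_pdf(z_x) / sd_x * std_normal_cdf((c_b - cond_mean) / cond_sd)

    total = integrand(lower) + integrand(c_a)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(lower + k * h)
    return total * h / 3.0


# Hypothetical parameters for stimulus A1B2 and for the bounds c_A and c_B.
p_dependent = prob_response_11(0.0, 1.5, 1.0, 1.0, rho=-0.5, c_a=0.75, c_b=0.75)
p_independent = prob_response_11(0.0, 1.5, 1.0, 1.0, rho=0.0, c_a=0.75, c_b=0.75)
product_of_marginals = std_normal_cdf(0.75) * std_normal_cdf(0.75 - 1.5)
print(p_dependent, p_independent, product_of_marginals)
```

With a nonzero correlation the volume no longer equals the product of the marginals, which is the sense in which a tilted contour signals a failure of PI.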
3. Marginal and Conditional d′s and βs

As in unidimensional SDT, sensitivity, $d'$, and response bias, $\beta$, are parameters that describe different characteristics of the observer. Two sets of these parameters are defined: (a) marginal $d'$s and $\beta$s, and (b) conditional $d'$s and $\beta$s. The marginal and conditional $d'$s and $\beta$s for the four-stimulus set discussed here can be generalized to any number of dimensions and any number of levels: for details, see Kadlec and Townsend (in Ashby 1992, Chap. 8).

3.1 Marginal d′s and βs

Marginal $d'$s and $\beta$s are defined using the marginal densities, $g_{A_iB_j}(x)$ and $g_{A_iB_j}(y)$. Let the mean vector and variance-covariance matrix of $f_{A_iB_j}(x, y)$, for stimulus $A_iB_j$, be denoted by

$$\mu(A_iB_j) = \begin{bmatrix} \mu_x(A_iB_j) \\ \mu_y(A_iB_j) \end{bmatrix} \quad \text{and} \quad \Sigma(A_iB_j) = \begin{bmatrix} \sigma_x^2(A_iB_j) & \sigma_{xy}(A_iB_j) \\ \sigma_{xy}(A_iB_j) & \sigma_y^2(A_iB_j) \end{bmatrix} \tag{2}$$

respectively. The marginal $d'$ for dimension A at level j of dimension B is

$$d'(A \text{ at } B_j) = \frac{\mu_x(A_2B_j) - \mu_x(A_1B_j)}{\sigma_x(A_1B_j)} \quad \text{for } j = 1, 2 \tag{3}$$

The subscript x indicates that for a $d'$ on dimension A, only the X-coordinates of the mean vectors and the standard deviation along the X-perceptual axis are employed. As in unidimensional SDT, the standard deviation of the ‘noise’ stimulus (here indicated as level 1 of dimension A) is used, and without loss of generality, it is assumed that $\mu(A_1B_1) = [0\ 0]^T$ (T denotes transpose) and $\Sigma(A_1B_1) = I$ (i.e., $\sigma_x^2 = \sigma_y^2 = 1$, and $\sigma_{xy} = 0$). Similarly, the marginal $d'$ for dimension B at level i of dimension A is

$$d'(B \text{ at } A_i) = \frac{\mu_y(A_iB_2) - \mu_y(A_iB_1)}{\sigma_y(A_iB_1)} \quad \text{for } i = 1, 2 \tag{4}$$

In a parallel fashion, marginal $\beta$s are defined as

$$\beta(A \text{ at } B_j) = \frac{g_{A_2B_j}(c_A)}{g_{A_1B_j}(c_A)} \quad \text{for } j = 1, 2 \qquad \text{and} \qquad \beta(B \text{ at } A_i) = \frac{g_{A_iB_2}(c_B)}{g_{A_iB_1}(c_B)} \quad \text{for } i = 1, 2 \tag{5}$$

where $c_A$ and $c_B$ are the decision bounds for dimension A and B, respectively.

3.2 Conditional d′s and βs

In the four-stimulus set, two pairs of conditional $d'$s and $\beta$s are defined for each dimension, conditioning on the observer’s response on the second. Using the terminology of unidimensional SDT for simplicity, with level 1 indicating the ‘noise’ stimulus and level 2 the ‘signal’ stimulus, one pair of conditional $d'$s for dimension A, with B presented at level 1, is: (a) $d'(A \mid \text{‘correct rejection’ on } B)$, conditioned on the observer’s correct response that B was at level 1, defined using parameters of $g_{A_iB_1}(x \mid y < c_B)$, $i = 1$ and 2, in Eqn. (3); and (b) $d'(A \mid \text{‘false alarm’ on } B)$, conditioned on the incorrect response that B was at level 2, using parameters of $g_{A_iB_1}(x \mid y > c_B)$ in Eqn. (3). The second pair of conditional $d'$s for dimension A, with B presented at level 2, is: (c) $d'(A \mid \text{‘hit’ on } B)$, conditioned on the correct response that B was at level 2; and (d) $d'(A \mid \text{‘miss’ on } B)$, conditioned on the incorrect response that B was at level 1. Similar pairs are defined for conditional $\beta$s, and for $d'$s and $\beta$s for dimension B conditional on dimension A responses.
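The marginal definitions in Eqns. (3) and (5) can be sketched numerically as follows. This is only an illustration under the assumption of Gaussian marginal densities; the means, standard deviations, decision bound, and function names are all hypothetical and chosen so that the marginal density of A is the same at both levels of B for level 1 of A but not for level 2.

```python
from statistics import NormalDist

# Hypothetical X-axis means and standard deviations of the marginal densities
# g_{AiBj}(x), keyed by (i, j). These values are illustrative only.
mean_x = {(1, 1): 0.0, (1, 2): 0.0, (2, 1): 1.2, (2, 2): 1.8}
sd_x = {(1, 1): 1.0, (1, 2): 1.0, (2, 1): 1.0, (2, 2): 1.0}
c_a = 0.6  # hypothetical decision bound on dimension A


def marginal_d_prime_a(j):
    """Eqn. (3): distance between the level-2 and level-1 means of A at level j
    of B, scaled by the standard deviation of the 'noise' (level 1) stimulus."""
    return (mean_x[(2, j)] - mean_x[(1, j)]) / sd_x[(1, j)]


def marginal_beta_a(j, bound):
    """Eqn. (5): ratio of the 'signal' to the 'noise' marginal density at the bound."""
    signal = NormalDist(mean_x[(2, j)], sd_x[(2, j)])
    noise = NormalDist(mean_x[(1, j)], sd_x[(1, j)])
    return signal.pdf(bound) / noise.pdf(bound)


for j in (1, 2):
    print(j, marginal_d_prime_a(j), marginal_beta_a(j, c_a))
```

In this made-up configuration the two marginal d′ values for dimension A differ across levels of B, which is the pattern one would expect when PS fails for dimension A.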
4. Using the d′s and βs to Test Dimensional Interactions

These $d'$ and $\beta$ parameters are linked theoretically to PS, DS, and PI (Kadlec and Townsend 1992). Their estimates, obtained from the confusion matrix proportions (Eqn. (1)), can thus be used to draw tentative inferences about dimensional interactions (Kadlec 1999). DS for dimension A (or B) is inferred when $c_A$ ($c_B$) estimates are equal at all levels of dimension B (A). Equal marginal $d'$s for a dimension across levels of the other are consistent with PS on that dimension. A stronger inference for PS on both dimensions can be drawn if the densities form a rectangle (but not a parallelogram), and this condition is testable if DS and PI hold (Kadlec and Hicks 1998; see also Thomas 1999 for new theoretical developments). When DS holds, evidence for PI is gained if conditional $d'$s are found to be equal; conversely, unequal conditional $d'$s indicate PI failure, and the direction of the inequality provides information about the direction of the dimensional dependencies in the stimuli. Exploring these relationships and looking for stronger tests of PS and PI is currently an active area of research.

MSDT is a relatively new development, and with its definitions of independence of dimensions, it is a valuable analytic and theoretical tool for studying interactions among stimulus dimensions. Current applications include studies of (a) human visual perception (of symmetry, Gestalt ‘laws’ of perceptual grouping, and faces); and (b) source monitoring in memory. It can be used in any research domain where the study of dimensional interactions is of interest.

See also: Memory Psychophysics; Multidimensional Scaling in Psychology; Psychometrics; Psychophysics; Signal Detection Theory; Signal Detection Theory, History of; Visual Perception, Neural Basis of
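The kind of estimation described in Sect. 4 can be sketched as follows for dimension A. The confusion-matrix counts are made up for illustration, and the estimators used here (the difference of z-transformed hit and false-alarm rates for d′, and the negative z-transformed false-alarm rate for the bound) assume equal-variance Gaussian marginals with the ‘noise’ mean at 0 and unit standard deviation. This is a simplified sketch, not the procedure implemented in the software cited above.

```python
from statistics import NormalDist

# Response labels: first digit is the level reported on A, second on B.
RESPONSES = ("11", "12", "21", "22")

# Made-up response counts; rows are stimuli (i, j) = A_iB_j, columns follow RESPONSES.
counts = {
    (1, 1): (60, 15, 20, 5),
    (1, 2): (15, 60, 5, 20),
    (2, 1): (20, 5, 60, 15),
    (2, 2): (5, 20, 15, 60),
}


def rate_a_reported_2(stimulus):
    """Proportion of trials with this stimulus on which level 2 of A was reported."""
    row = counts[stimulus]
    said_2 = sum(n for resp, n in zip(RESPONSES, row) if resp[0] == "2")
    return said_2 / sum(row)


def marginal_estimates_a(j):
    """Estimated marginal d' and decision bound c_A at level j of dimension B.

    Rates of exactly 0 or 1 would make the z-transform undefined.
    """
    z = NormalDist().inv_cdf
    hit = rate_a_reported_2((2, j))
    false_alarm = rate_a_reported_2((1, j))
    return z(hit) - z(false_alarm), -z(false_alarm)


for j in (1, 2):
    d_prime, bound = marginal_estimates_a(j)
    print(j, round(d_prime, 3), round(bound, 3))
```

In these made-up data the estimated bound and marginal d′ are the same at both levels of B, the pattern that would be read as consistent with DS and PS for dimension A.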
Bibliography

Ashby F G (ed.) 1992 Multidimensional Models of Perception and Cognition. Erlbaum, Hillsdale, NJ
Ashby F G, Maddox W T 1994 A response time theory of separability and integrality in speeded classification. Journal of Mathematical Psychology 38: 423–66
Ashby F G, Townsend J T 1986 Varieties of perceptual independence. Psychological Review 93: 154–79
Kadlec H 1999 MSDA-2: Updated version of software for multidimensional signal detection analyses. Behavior Research Methods, Instruments & Computers 31: 384–5
Kadlec H, Hicks C L 1998 Invariance of perceptual spaces. Journal of Experimental Psychology: Human Perception & Performance 24: 80–104
Kadlec H, Townsend J T 1992 Implications of marginal and conditional detection parameters for the separabilities and independence of perceptual dimensions. Journal of Mathematical Psychology 36: 325–74
Thomas R D 1999 Assessing sensitivity in a multidimensional space: Some problems and a definition of a general d′. Psychonomic Bulletin & Review 6: 224–38
H. Kadlec
Significance, Tests of

A recent article on the health benefits of herbal tea reported that its use led to a decreased incidence of insomnia in an experiment conducted at a sleep disorders clinic. Patients at the clinic were randomly assigned to daily consumption of herbal tea or a caffeine-free beverage of their choice, and were followed up for 10 months. The reported improvement was stated to be ‘statistically significant (p < 0.05).’ The implication intended was that the improvement should be attributed to the beneficial effects of the tea. In fact, this article does not really exist, but examples of this sort are reported regularly in science articles in the press, and are very common in journal publications in a number of fields in science and social science. It has even been argued, in the press, that ‘statistical significance’ has become so widely used to give an imprimatur of scientific acceptability that it is in fact misused and should be abandoned (Matthews 1998).

A test of significance assesses the agreement between the data and an hypothesized statistical model. The magnitude of the agreement is expressed as an observed level of significance, or p-value, which is the probability of obtaining data as or more extreme than the observed data, if the hypothesized model were true. A very small p-value suggests that either the observed data are not compatible with the hypothesized model or an event of very small probability has been observed. A large p-value indicates that the observed data are compatible with the hypothesized model. Since the p-value is a probability, it must be between 0 and 1, and a very common convention is to declare values smaller than 0.05 as ‘small’ or ‘statistically significant’ and values larger than 0.05 as ‘not statistically significant.’ In this article we shall try to explain the historical rationale of this very arbitrary cut-off point.

To make these vague definitions more concrete, it is necessary to consider statistical models and the notion of a (simplifying) hypothesis within that model. The theory of this is outlined in Sect. 1, with some more advanced topics considered in Sect. 2. This introduction concludes with a highly idealized example to convey the idea of data being inconsistent with an hypothesized model.

Example 1. Students in a statistics class partake in an activity to assess their ability to distinguish between two competing brands of cola, and to identify their preferred brand from taste alone. Each of the 20 students expresses a preference for one brand or the other, but just one student claims to be able to discriminate perfectly between the two. Twenty cups of each brand are prepared by the instructor and labelled ‘1’ and ‘2.’ Each student is to record which label corresponds to Brand A. The result is that 12 students correctly identify the competing brands, although not the student who claimed a perfect ability to discriminate.

The labelling of the cups as 1 or 2 by the instructor was completely random, i.e., cup 1 was equally likely to contain Brand A or B. The students did not discuss their opinions with their classmates, and the taste testing was completed fairly quickly. Under these conditions, it is plausible that each student has a probability of 1/2 of identifying the brands correctly simply by guessing, so that about 10 students would identify the brands correctly with no discriminatory ability at all. That 12 students did so does not seem inconsistent with guesswork, and the p-value helps to quantify this. The probability of observing 12 or more correct results if one correct result has probability 1/2 and the guesses are independent can be computed by the binomial formula as

$$\left\{\binom{20}{12} + \binom{20}{13} + \cdots + \binom{20}{20}\right\}\left(\tfrac{1}{2}\right)^{20} \approx 0.25. \tag{1}$$

This is not at all an unlikely event, so there is no evidence from these data that the number of correct answers could not have been obtained by guessing: in more statistical language, assuming a binomial model, the observed data are consistent with probability of success 1/2.

The student who claimed to have perfect discrimination, but actually guessed incorrectly, argued that her abilities should not be dismissed on the basis of one taste test, so the class carried out some computations to see what the p-value for the same observed data would be if the number of trials was increased. The probability of one or zero mistakes in a set of n trials, for various values of n, is given in Table 1. From this we see that, for example, one or no mistakes in five trials is consistent with guesswork, but the same result in 10 trials is much less so.

Table 1. Probability of zero or one mistakes in n independent Bernoulli trials with probability of a mistake = 0.5

n     Probability
5     0.1875
6     0.1094
7     0.0625
8     0.0352
9     0.0195
10    0.0107
11    0.0059
12    0.0032
13    0.0017
14    0.0009
15    0.0005

In both parts of this example we assumed a model of independent trials, each of which could result in a success or failure, with constant probability of success. Our calculations also assumed this constant probability of success was 0.5. This latter restriction on the model is often called a ‘null hypothesis’, and the test of significance is a test of this null hypothesis; the p-value measures the consistency of the data with this null hypothesis. In many applications the null hypothesis plays the role of a conservative position that the experimenter hopes to disprove, and one reason for requiring rather small p-values before declaring statistical significance is to raise the standard of proof required to replace a relatively simple working hypothesis with one that is possibly more complex and less well understood.

As formulated here, the hypothesis being tested is that the probability of a correct choice is 0.5, and not the other aspects of the model, such as independence of the trials and unchanging probability of success. The number of observed successes does not measure such model features; it provides information only on the probability of success. Functions of the data that do measure such model features can be constructed, and from these, significance tests that assess the fit of an assumed model; these play an important role in statistical inference as well.
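The two calculations in Example 1 are easy to reproduce. The sketch below, with hypothetical function names, computes the upper tail of the binomial distribution used in Eqn. (1) and the probability of at most one mistake in n trials that is tabulated in Table 1, assuming independent guesses with probability 1/2 throughout.

```python
from math import comb


def binomial_upper_tail(n, k, p=0.5):
    """Probability of k or more successes in n independent Bernoulli(p) trials."""
    return sum(comb(n, y) * p**y * (1.0 - p) ** (n - y) for y in range(k, n + 1))


def prob_at_most_one_mistake(n):
    """Probability of zero or one mistakes in n trials when each is a fair guess."""
    return (comb(n, 0) + comb(n, 1)) / 2**n


# Taste test: 12 or more correct identifications among 20 students.
print(round(binomial_upper_tail(20, 12), 4))

# One or no mistakes in n trials, for the values of n used in Table 1.
for n in range(5, 16):
    print(n, round(prob_at_most_one_mistake(n), 4))
```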
1. Model Based Inference

1.1 Models and Null Hypothesis

We assume that we have a statistical model for a random variable Y taking values in a sample space $\mathcal{Y}$, described by a parametric family of densities $\{f(y; \theta);\ \theta \in \Theta\}$. Tests of significance can in fact be constructed in more general settings, but this framework is useful for defining the main ideas. If Y is the total number of successes in n independent Bernoulli trials with constant probability of success, then

$$f(y; \theta) = \binom{n}{y}\theta^{y}(1 - \theta)^{n - y} \tag{2}$$

with $\Theta = [0, 1]$ and $\mathcal{Y} = \{0, 1, \ldots, n\}$. If Y is a continuous random variable following a normal or bell curve distribution with mean $\theta_1$ and variance $\theta_2^2$, then

$$f(y; \theta) = \frac{1}{\sqrt{2\pi}\,\theta_2}\exp\left\{-\frac{1}{2\theta_2^2}(y - \theta_1)^2\right\} \tag{3}$$

with $\Theta = \mathbb{R} \times \mathbb{R}^{+}$ and $\mathcal{Y} = \mathbb{R}$. The model for n independent observations from this distribution is

$$f(y_1, \ldots, y_n; \theta) = \frac{1}{(\sqrt{2\pi})^{n}\,\theta_2^{n}}\exp\left\{-\frac{1}{2\theta_2^2}\sum_{i=1}^{n}(y_i - \theta_1)^2\right\} \tag{4}$$

with $\Theta = \mathbb{R} \times \mathbb{R}^{+}$ and $\mathcal{Y} = \mathbb{R}^{n}$. For further discussion of statistical models, see Statistical Sufficiency; Distributions, Statistical: Special and Discrete; Distributions, Statistical: Approximations; Distributions, Statistical: Special and Continuous.
As noted above, we assume the model is given, and our interest is in inference about the parameter θ. While this could take various forms, a test of significance starts with a so-called null hypothesis about θ, of the form

$$H_0\colon \theta = \theta_0 \tag{5}$$

or

$$H_0\colon \theta \in \Theta_0. \tag{6}$$
In Eqn. (5) the parameter θ is fully specified, and $H_0$ is called a point null hypothesis or a simple null hypothesis. If θ is not fully specified, as in Eqn. (6), $H_0$ is called a composite null hypothesis. In the taste-testing examples the simple null hypothesis was θ = 0.5. In the normal model, Eqn. (3), a hypothesis about the mean, such as $H_0\colon \theta_1 = 0$, is composite, since the variance is left unspecified. Another composite null hypothesis is $H_0\colon \theta_2 = \theta_1$, which restricts the full parameter space to a one-dimensional curve in $\mathbb{R} \times \mathbb{R}^{+}$.

A test is constructed by choosing a test statistic, which is a function of the data that in some natural way measures departure from what is expected under the null hypothesis, and which has been standardized so that its distribution is known either exactly or to a good approximation under the null hypothesis. Test statistics are usually constructed so that large values indicate a discrepancy from the hypothesis.

Example 2. In the binomial model (2), the distribution of Y is completely specified by the null hypothesis θ = 0.5 as

$$f(y) = \binom{n}{y}2^{-n} \tag{7}$$

and the consistency of a given observed value $y_0$ of y with this hypothesis is measured by the p-value $\sum_{y = y_0}^{n}\binom{n}{y}2^{-n}$, the probability of observing a value as or more extreme than $y_0$. If $y_0$ is quite a bit smaller than expected, then it would be more usual to compute the p-value as $\sum_{y = 0}^{y_0}\binom{n}{y}2^{-n}$. Each of these calculations was carried out in the discussion of taste testing in the introduction.

Example 3. In independent sampling from the normal distribution, given in Eqn. (4), we usually test the composite null hypothesis $H_0\colon \theta_1 = \theta_{10}$ by constructing the t-statistic

$$T = \sqrt{n}\,(\bar{Y} - \theta_{10})/S \tag{8}$$

where $\bar{Y} = n^{-1}\sum_{i=1}^{n} Y_i$ and $S^2 = (n - 1)^{-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$. Under $H_0$, T follows a t-distribution on $n - 1$ degrees of freedom, and the p-value is

$$\Pr\{T \ge \sqrt{n}\,(\bar{y} - \theta_{10})/s\}$$
where $\bar{y}$ and s are the values observed in the sample. This probability needs to be computed numerically from an expression for the cumulative distribution function of the t-distribution. Historically, tables of this distribution were provided for ready reference, typically by identifying a few critical values, such as $t_{0.10}$, $t_{0.05}$, and $t_{0.01}$, satisfying $\Pr\{T_\nu \ge t_\alpha\} = \alpha$, where $T_\nu$ is a random variable following a t-distribution on ν degrees of freedom. It was arguably the publication of these tables that led to a focus on the use of particular fixed levels for testing in applied work.

Example 4. Assume the model specifies that $Y_1, \ldots, Y_n$ are independent, identically distributed from a distribution with density $f(\cdot)$ on $\mathbb{R}$ and that we are interested in testing whether or not $f(\cdot)$ is a normal density:

$$H_0\colon f(y) = (\sqrt{2\pi})^{-1} e^{-\frac{1}{2}y^2} \tag{9}$$

or

$$H_0\colon f(y) = (\sqrt{2\pi})^{-1}\theta_2^{-1}\exp\{-(1/2\theta_2^2)(y - \theta_1)^2\}; \tag{10}$$

the former is a simple and the latter is a composite null hypothesis. For this problem it is less obvious how to construct a test statistic or how to choose among alternative test statistics. Under Eqn. (9) we know the distribution of each observation (standard normal) and thus of any function of the observations. The ordered values of Y, $Y_{(1)} \le \cdots \le Y_{(n)}$, could be compared to their expected values under Eqn. (9), for example by plotting one against the other, and deviation of this plot from a line with intercept 0 and slope 1 could be measured in various ways. In the case of the composite null hypothesis Eqn. (10), we could make use of the result that under $H_0$, $(Y_i - \bar{Y})/S$ has a distribution free of $\theta_1$ and $\theta_2$, and the vector of these residuals is independent of the pair $(\bar{Y}, S^2)$, and then for example compare the skewness $n^{-1}\sum\{(Y_i - \bar{Y})/S\}^3$ with that expected under normality.

Example 5. Suppose we have a sample of independent observations $Y_1, \ldots, Y_n$ on a circle of radius 1 and our null hypothesis is that the observations are uniformly distributed on the circle. One choice of a test statistic is $T_1 = \sum_{i=1}^{n}\cos Y_i$, very large positive (or negative) values indicating a concentration of observations at angle 0 (or π). If, instead, we wish to detect clumps of observations at two angles differing by π, then $T_2 = \sum_{i=1}^{n}\{\cos(2Y_i) - 1\}$ would be more appropriate. The exact distribution of $T_1$ under $H_0$ is not available in closed form, but the mean and variance are readily computed as 0 and n/2, so a normal approximation might be used to compute the p-value.

In the examples described above, the test statistics are ad hoc choices likely to be large if the null hypothesis is not true; these are called pure tests of significance. Clearly, a more sensitive test can be constructed if we have more specific knowledge of the
likely form of departures from the null hypothesis. The theory of hypothesis testing formalizes this by setting up a null hypothesis and alternative hypothesis, and seeking to construct an optimal test for discriminating between them. See Hypothesis Testing in Statistics. In the remainder of this section we consider a less formal approach based on the likelihood function.
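The t-test of Example 3 can be sketched with the standard library alone. The sample values and function names below are hypothetical. Because no closed form is assumed for the cumulative distribution function, the upper-tail probability is obtained by integrating the t density numerically with Simpson's rule, which is one simple choice among several.

```python
import math
from statistics import mean, stdev


def t_statistic(sample, theta_10):
    """T = sqrt(n) * (ybar - theta_10) / s, with s the (n - 1)-divisor standard deviation."""
    n = len(sample)
    return math.sqrt(n) * (mean(sample) - theta_10) / stdev(sample)


def t_density(t, df):
    """Density of the t-distribution on df degrees of freedom."""
    const = math.gamma((df + 1) / 2.0) / (math.sqrt(df * math.pi) * math.gamma(df / 2.0))
    return const * (1.0 + t * t / df) ** (-(df + 1) / 2.0)


def t_upper_tail(t_obs, df, steps=2000):
    """Pr{T >= t_obs}: Simpson integration of the density from 0 to |t_obs|
    (steps must be even), then use of the symmetry of the distribution."""
    a = abs(t_obs)
    h = a / steps
    total = t_density(0.0, df) + t_density(a, df)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * t_density(k * h, df)
    upper = 0.5 - total * h / 3.0
    return upper if t_obs >= 0 else 1.0 - upper


# Hypothetical sample; null hypothesis theta_1 = 0.
sample = [0.3, -0.1, 0.8, 0.5, 0.2, 0.9, -0.2, 0.4]
t_obs = t_statistic(sample, 0.0)
print(t_obs, t_upper_tail(t_obs, len(sample) - 1))
```

The printed probability is the one-sided p-value of Example 3; tabulated critical values can be checked the same way, by confirming that the upper tail beyond a tabulated value equals the stated level.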
1.2 Significance Tests in Parametric Models

In parametric models, tests of significance are often constructed by using the likelihood function, and the p-value is computed by using an established approximation to the distribution of the test statistic. The likelihood function is proportional to the joint density of the data,

$$L(\theta; y) = c(y)\,f(y; \theta). \tag{11}$$

We first suppose that we are testing the simple null hypothesis $H_0\colon \theta = \theta_0$ in the parametric model $f(y; \theta)$. Three test statistics often constructed from the likelihood function are the Wald or maximum likelihood statistic

$$w_e = (\hat{\theta} - \theta_0)^{T} j(\hat{\theta})(\hat{\theta} - \theta_0), \tag{12}$$

the Rao or score statistic

$$w_u = U(\theta_0)^{T}\{j(\hat{\theta})\}^{-1}U(\theta_0), \tag{13}$$

and the likelihood ratio statistic

$$w = 2\{\ell(\hat{\theta}) - \ell(\theta_0)\} \tag{14}$$

where in Eqns. (12), (13), and (14) the following notation is used:

$$\sup_{\theta} L(\theta; y) = L(\hat{\theta}; y) \tag{15}$$

$$\ell(\theta) = \log L(\theta) \tag{16}$$

$$U(\theta) = \ell'(\theta), \qquad j(\theta) = -\ell''(\theta). \tag{17}$$
The distribution of each of the statistics in Eqns. (12), (13), and (14) can be approximated by a $\chi^2_k$ distribution, where k is the dimension of θ in the model. This relies on being able to apply a central limit theorem to $U(\theta)$, and to identify the maximum likelihood estimator $\hat{\theta}$ with the root of the equation $U(\theta) = 0$. The precise regularity conditions needed are somewhat elaborate; see, for example, Lehmann and Casella (1998, Chap. 6) and references therein. The important point is that under the simple null hypothesis the approximating distribution of each of these test statistics is completely known, so p-values are readily computed.

In the case that the hypothesis is composite, a similar triple of test statistics computed from the likelihood function is available, but the notation needed to define them is more elaborate. The details can be found in, for example, Cox and Hinkley (1974, Chap. 9, Sect. 3) and the notation above follows theirs.

If θ is a one-dimensional parameter, then a one-sided version of the test statistics given in Eqns. (12), (13), and (14) can be used instead, since the square root of $w_e$, $w_u$, or $w$ follows approximately a standard normal distribution. It is rare that the exact distribution of test statistics can be computed, but the normal or chi-squared approximation can often be improved, and this has been a subject of much research in the theory of statistics over the past several years. A good book-length reference is Barndorff-Nielsen and Cox (1994). One result of this research is that among the three test statistics the square root of the likelihood-ratio statistic $w$ is generally preferred on a number of grounds, including the accuracy of the normal approximation to its exact distribution. This is true for both simple and composite tests of a scalar parameter.
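To make the three likelihood-based statistics concrete, the sketch below evaluates them for the binomial model of Eqn. (2), using the taste-test data (12 successes in 20 trials) and the simple null hypothesis θ = 0.5. The function names are hypothetical. The score statistic follows Eqn. (13) as written, with the observed information evaluated at the maximum likelihood estimate, and the p-values use the chi-squared approximation on one degree of freedom.

```python
import math


def log_lik(theta, y, n):
    """Binomial log likelihood, omitting the constant log C(n, y)."""
    return y * math.log(theta) + (n - y) * math.log(1.0 - theta)


def score(theta, y, n):
    """U(theta): first derivative of the log likelihood."""
    return y / theta - (n - y) / (1.0 - theta)


def obs_info(theta, y, n):
    """j(theta): minus the second derivative of the log likelihood."""
    return y / theta**2 + (n - y) / (1.0 - theta) ** 2


def likelihood_statistics(y, n, theta0):
    """Wald, score, and likelihood ratio statistics of Eqns. (12)-(14); needs 0 < y < n."""
    theta_hat = y / n
    wald = (theta_hat - theta0) ** 2 * obs_info(theta_hat, y, n)
    rao = score(theta0, y, n) ** 2 / obs_info(theta_hat, y, n)
    lr = 2.0 * (log_lik(theta_hat, y, n) - log_lik(theta0, y, n))
    return wald, rao, lr


def chi2_1_upper_tail(w):
    """Pr{chi-squared on 1 df >= w}, which equals Pr{|Z| >= sqrt(w)} for standard normal Z."""
    return math.erfc(math.sqrt(w / 2.0))


for name, value in zip(("Wald", "score", "LR"), likelihood_statistics(12, 20, 0.5)):
    print(name, round(value, 4), round(chi2_1_upper_tail(value), 4))
```

The three statistics are close but not identical for these data, which is typical when the sample is moderate and the null value is not far from the estimate.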
1.3 Significance Functions and Posterior Probabilities

We can also use a test of significance to consider the whole set or interval of values of θ that are consistent with the data. If θ is scalar, one of the simplest ways to do this is to compute $r(\theta) = \pm\sqrt{w(\theta)}$ as a function of θ, and tabulate or plot $\Phi\{r(\theta)\}$ against θ, choosing the negative square root for $\theta > \hat{\theta}$, and the positive square root otherwise. This significance function will in regular models decrease from one to zero as θ ranges over an interval of values. The θ values for which $\Phi(r)$ is 0.975 and 0.025 provide the endpoints of an approximate 95 percent confidence interval for θ.

In a Bayesian approach to inference it is possible to make probability statements about the parameter or parameters in the model by constructing a posterior probability distribution for them. In a model with a scalar parameter θ based on a prior $\pi(\theta)$ and model $f(y; \theta)$ we compute a posterior density for θ as

$$\pi(\theta \mid y) \propto f(y; \theta)\,\pi(\theta) \tag{18}$$

and can assess any particular value $\theta_0$ by computing

$$\int_{\theta_0}^{\infty}\pi(\theta \mid y)\,d\theta,$$

called the posterior probability of θ being larger than $\theta_0$. This posterior probability is different from a p-value: a p-value assesses the data in light of a fixed value of θ, and the posterior probability assesses a fixed value of θ in light of the probability distribution ascribed to the parameter. Many people find a posterior probability easier to understand, and indeed often interpret the p-value in this way. There is a literature on choosing priors called ‘matching priors’ to reconcile these two approaches to inference; recent developments are perhaps best approached from Kass and Wasserman (1996). See also Bayesian Statistics.
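A significance function of the kind described at the start of this subsection can be sketched for the binomial taste-test data (12 successes in 20 trials). The function names are hypothetical, and the endpoints are found by simple bisection, relying on the function being decreasing in θ; this is a minimal illustration of the idea, not a recommended general-purpose routine.

```python
import math
from statistics import NormalDist


def log_lik(theta, y, n):
    """Binomial log likelihood, omitting the constant log C(n, y)."""
    return y * math.log(theta) + (n - y) * math.log(1.0 - theta)


def signed_root(theta, y, n):
    """r(theta): signed square root of the likelihood ratio statistic w(theta),
    negative for theta above the maximum likelihood estimate."""
    theta_hat = y / n
    w = max(2.0 * (log_lik(theta_hat, y, n) - log_lik(theta, y, n)), 0.0)
    root = math.sqrt(w)
    return -root if theta > theta_hat else root


def significance(theta, y, n):
    """Significance function: the standard normal CDF evaluated at r(theta)."""
    return NormalDist().cdf(signed_root(theta, y, n))


def solve_for(target, y, n, tol=1e-10):
    """Find theta with significance(theta) = target by bisection."""
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if significance(mid, y, n) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


lower = solve_for(0.975, 12, 20)
upper = solve_for(0.025, 12, 20)
print(round(lower, 3), round(upper, 3))
```

The two printed values are the endpoints of the approximate 95 percent confidence interval obtained from the significance function.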
1.4 Hypothesis Testing

Little has been said here about the choice of a test statistic for carrying out a test of significance. The difficulty is that the theory of significance testing provides no guidance on this choice. The likelihood based test statistics described above have proved to be reasonably effective in parametric models, but in a more complicated problem, such as testing the goodness of fit of an hypothesized model, this approach is often not available. To make further progress in the choice of test statistics, the classical approach is to formulate a notion of a ‘powerful’ test statistic, i.e., one that will reliably lead to small p-values when the null hypothesis is not correct. To do this in a systematic way requires specifying what model might hold if in fact the null hypothesis is incorrect. In parametric models where the null hypothesis is $H_0\colon \theta = \theta_0$ the alternative may well be $H_a\colon \theta \neq \theta_0$. In more general settings the null hypothesis might be $H_0$: ‘the model is normal’ and the alternative $H_a$: ‘the model is not normal.’ Even in the parametric setting, if θ is a vector parameter it may be necessary to consider what direction in the parameter space away from $\theta_0$ is of interest. The formalization of these ideas is the theory of hypothesis testing, which considers both null and alternative hypotheses, and optimal choices of test statistics. See Hypothesis Testing in Statistics; Goodness of Fit: Overview.

2. Further Topics

2.1 Combining Tests of Significance

The p-value is a function of the data, taking small values when the data are incompatible with the null hypothesis, and vice versa. As a function of y the p-value itself has a distribution under the model $f(y; \theta)$, and in particular under the null hypothesis $H_0$ it has the uniform distribution on the interval (0, 1). In principle, then, if we have computed p-values from a number of different datasets, the p-values can be compared to observations from a U(0, 1) distribution with the objective of obtaining evidence of failure of the null hypothesis across the collection of datasets. This is one of the ideas behind meta analysis; see Meta-analysis: Overview and Meta-analysis: Tools. One difficulty is that the studies will nearly always differ in a number of respects that may mean they are not all measuring the same parameter, or measuring it in the same way. Another difficulty is that studies for which the p-value is not ‘statistically significant’ will often not have been published, and thus are unavailable to be included in a meta analysis. This selection effect may seriously bias the results of the meta analysis.
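One standard way to compare a set of independent p-values with the U(0, 1) distribution is Fisher's combination method, which the text above does not name and which is offered here only as an illustration. If k independent p-values are uniform on (0, 1), then minus twice the sum of their logarithms follows a chi-squared distribution on 2k degrees of freedom. The function names and the p-values in the example are hypothetical.

```python
import math


def chi2_even_df_upper_tail(x, half_df):
    """Pr{chi-squared on 2 * half_df degrees of freedom >= x}.

    For even degrees of freedom the upper tail has the closed form
    exp(-x/2) * sum_{i < half_df} (x/2)**i / i!.
    """
    term, total = 1.0, 1.0
    for i in range(1, half_df):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total


def fisher_combined(p_values):
    """Fisher's statistic -2 * sum(log p) and its p-value under the null hypothesis."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_even_df_upper_tail(statistic, len(p_values))


# Hypothetical p-values from four independent studies of the same null hypothesis.
statistic, combined_p = fisher_combined([0.08, 0.20, 0.04, 0.11])
print(round(statistic, 3), round(combined_p, 4))
```

As the surrounding discussion warns, such a combination is only meaningful if the studies address the same hypothesis and if unpublished, non-significant results have not been selectively omitted.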
2.2 Sample Size
For a fixed observed effect, the p-value is a decreasing function of the size of the sample, so that a very large study is more likely to show ‘statistical significance’ than a smaller study. This has led to considerable criticism of the p-value as a summary measure. Some statisticians have argued that, for this reason, posterior probabilities are a better measure of disagreement with the null hypothesis; see, for example, Berger and Sellke (1987). To some extent the criticism can be countered by noting that the p-value is just one summary measure of a set of data, and excessive reliance on one measure is inappropriate. In a parametric setting it is nearly always advisable to provide, along with the p-value for testing a particular value of the parameter of interest, an estimate of the observed effect and an indication of the precision of this estimate. This can be accomplished by reporting a significance function, if the parameter of interest is one-dimensional. At a more practical level, it should always be noted that a small p-value should be interpreted in the context of other aspects of the study. For example, a p-value of less than 0.05 could be obtained by a very small difference in a study of 10,000 cases or a relatively larger difference in a study of 1,000. While a 1 percent reduction may be of substantial importance for some scientific contexts, this needs to be evaluated in its context, and not by relying on the fact that it is ‘statistically significant.’ Unfortunately, the notion that a study report is complete if and only if the p-value is found to be less than 0.05 is fairly widely ingrained in some disciplines, and indeed forms a part of the requirements of some government agencies for approving new treatments. Another point of confusion in the evaluation of p-values for testing scalar parameters is the distinction sometimes made between one-sided and two-sided tests of significance.
A reliable procedure is to compute the p-value as twice the smaller of the probabilities that the test statistic is larger than or smaller than the observed value, under the null hypothesis. This so-called two-sided p-value measures disagreement with the null hypothesis in two directions away from the null hypothesis, towards the alternative that the, say, new treatment is worse than the old treatment as well as better. In the absence of very concrete a priori evidence that the alternative hypothesis is genuinely one-sided, this p-value is preferable. In testing hypotheses about parameters of dimension larger than one, it can be difficult, as noted above, to define the relevant direction away from the null hypothesis. One solution is to identify the parameters of interest individually, and carry out separate tests on those parameters. This will usually be effective unless the dimension of the parameter of interest is unusually large, but in many applications it would be rare to have more than a dozen such parameters. Extremely large datasets such as arise in so-called data mining applications raise quite new problems of inference, however.

Significance, Tests of

Table 2
Hypothetical data from a small fictitious experiment on insomnia

                                Herbal tea   Other beverage   Total
Decrease in insomnia reported   x            13 − x           13
No decrease reported            10 − x       x − 3            7
No. of participants             10           10               20
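The two-sided p-value described in this section, twice the smaller of the two tail probabilities, can be sketched for a discrete test statistic as follows. This is an illustration under stated assumptions: the null distribution is taken to be binomial, both tails include the observed value, and the result is capped at 1. None of these choices is prescribed by the article, and the function names are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, prob):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * prob ** k * (1 - prob) ** (n - k)


def two_sided_p_value(observed, n, prob=0.5):
    """Twice the smaller tail probability under the null, capped at 1."""
    upper = sum(binomial_pmf(k, n, prob) for k in range(observed, n + 1))  # P(T >= t)
    lower = sum(binomial_pmf(k, n, prob) for k in range(0, observed + 1))  # P(T <= t)
    return min(1.0, 2 * min(upper, lower))


if __name__ == "__main__":
    # 9 successes in 10 trials, null success probability 0.5
    print(two_sided_p_value(9, 10))
```

Because the smaller tail is doubled, an observation in either direction away from the null value gives the same p-value when the null distribution is symmetric.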
Table 3
The set of achievable p-values from Table 2, as a function of x. The p-value for Table 2 is given by the formula $\sum_{s=x}^{\min(10,13)} p(s)$, where $p(s) = \binom{10}{s}\binom{10}{13-s} \big/ \binom{20}{13}$. The mid p-value is given by $(1/2)\,p(x) + \sum_{s=x+1}^{\min(10,13)} p(s)$.

x     p(x)     p-value   Mid p-value
10    0.0015   0.0015    0.0007
9     0.0271   0.0286    0.0150
8     0.1463   0.1749    0.1018
7     0.3251   0.5000    0.3561
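The p(x) and p-value columns of Table 3 can be reproduced directly from the hypergeometric formula in the caption, using the margins of Table 2 (10 participants per beverage, 13 reporting a decrease, 20 in total). The sketch below is illustrative; the constant and function names are assumptions, and the mid p-value is computed from unrounded probabilities following the caption formula.

```python
from math import comb

ROW_TOTAL = 10     # participants given herbal tea
COL_TOTAL = 13     # participants reporting a decrease in insomnia
GRAND_TOTAL = 20   # all participants


def p(s):
    """Hypergeometric probability that s herbal-tea participants report a decrease."""
    return (comb(ROW_TOTAL, s) * comb(GRAND_TOTAL - ROW_TOTAL, COL_TOTAL - s)
            / comb(GRAND_TOTAL, COL_TOTAL))


def p_value(x):
    """One-sided p-value: probability of x or more decreases in the herbal-tea group."""
    upper = min(ROW_TOTAL, COL_TOTAL)
    return sum(p(s) for s in range(x, upper + 1))


def mid_p_value(x):
    """Half the probability of the observed table plus the more extreme tables."""
    return 0.5 * p(x) + p_value(x + 1)


if __name__ == "__main__":
    for x in (10, 9, 8, 7):
        print(x, round(p(x), 4), round(p_value(x), 4), round(mid_p_value(x), 4))
```

Only a handful of distinct p-values can occur because x is restricted to the integers from 3 to 10 once the margins are fixed.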
2.3 Fixed Level Testing
The problem of focusing on one or two so-called critical p-values is sometimes referred to as fixed-level testing. This was useful when computation of p-values was a very lengthy exercise, and it was usual to provide tables of critical values. It is now usually a very routine matter to compute the exact p-value, which is usually (and should be) reported along with other details such as sample size, estimated effect size, and details of the study design. There is still in some quarters a reliance on fixed-level testing, with the result that studies for which the p-value is judged ‘not statistically significant’ may not be published. This is sometimes called the ‘file drawer problem,’ and a quantitative analysis was considered in Dawid and Dickey (1977). More recently there has been a move to make the results of inconclusive studies available over the Internet, and if this becomes widespread practice it will alleviate the file drawer problem. This issue is particularly important for meta-analysis.

2.4 Achievable p-Values
In some problems where the distribution is concentrated on a discrete set, the number of available p-values will be relatively small. For example, Table 2 shows a hypothetical 2×2 table for the herbal tea experiment. The null hypothesis is that the two beverages are equally effective at reducing insomnia, and is here tested using Fisher’s exact test. The relevant calculations are produced in Table 3, from which we see that the set of achievable p-values is relatively small. Some authors have argued that for such highly discrete situations a better assessment of the null hypothesis can be achieved by the use of Barnard’s mid p-value, which is (1/2) Pr(X = x) + Pr(X > x). See Agresti (1992) and references therein.

3. Conclusion
A test of statistical significance is a mathematical calculation based on a test statistic, a null hypothesis, and the distribution of the test statistic under the null hypothesis. The result of the test is to indicate whether the data are consistent with the null hypothesis: if they are not, then either we have observed an event of low probability, or the null hypothesis is not correct. The choice of test statistic is in principle arbitrary, but in practice might be determined by convention in the field of application, by intuition in a relatively new setting, or by one or more considerations developed in statistical theory. It is convenient to use test statistics whose distributions can easily be calculated exactly or to a good approximation. It is useful to use a test statistic that is sensitive to the particular departures from the null hypothesis that are of particular interest in the application. A test of statistical significance is just one component of the analysis of a set of data, and should be supplemented by estimates of effects of interest, considerations related to sample size, and a discussion of the validity of any assumptions of independence or underlying models that have been made in the analysis. A statistically significant result is not necessarily an important result in any particular analysis, but needs to be considered in the context of research in that field. An eloquent introduction to tests of significance is given in Fisher (1935, Chap. 2). Kalbfleisch (1979,
Chap. 12) is a good textbook reference at an undergraduate level. The discussion here draws considerably from Cox and Hinkley (1974, Chap. 3), which is a good reference at a more advanced level. An excellent overview is given in Cox (1977). For a criticism of p-values see Schervish (1996) as well as Matthews (1998). See also: Distributions, Statistical: Approximations; Frequentist Inference; Goodness of Fit: Overview; Hypothesis Testing in Statistics; Likelihood in Statistics; Meta-analysis: Overview; Resampling Methods of Estimation
Bibliography
Agresti A 1992 A survey of exact inference for contingency tables. Statistical Science 7: 131–53
Barndorff-Nielsen O E, Cox D R 1994 Inference and Asymptotics. Chapman and Hall, London
Berger J O, Sellke T 1987 Testing a point null hypothesis: The irreconcilability of p-values and evidence. Journal of the American Statistical Association 82: 112–22
Cox D R 1977 The role of significance tests. Scandinavian Journal of Statistics 4: 49–70
Cox D R, Hinkley D V 1974 Theoretical Statistics. Chapman and Hall, London
Dawid A P, Dickey J M 1977 Properties of diagnostic data distributions. Journal of the American Statistical Association 72: 845–50
Fisher R A 1935 The Design of Experiments. Oliver and Boyd, Edinburgh, UK
Kalbfleisch J G 1979 Probability and Statistical Inference. Springer, New York, Vol. 2
Kass R E, Wasserman L 1996 Formal rules for selecting prior distributions: A review and annotated bibliography. Journal of the American Statistical Association 91: 1343–70
Lehmann E L, Casella G 1998 Theory of Point Estimation. Springer, New York
Matthews R 1998 The great health hoax. Sunday Telegraph 13 September. Reprinted at ourworld.compuserve.com/homepages/rajm/
Schervish M J 1996 P values: what they are and what they are not. American Statistician 96: 203–6
N. Reid
Simmel, Georg (1858–1918)
Georg Simmel was born in the heart of Berlin on March 1, 1858. He was the youngest son of Flora (born Bodstein) and Eduard Simmel, who, although coming from Jewish families, had been baptized into Christianity (he as a Catholic, she as a Protestant). Following the early death of Simmel’s father in 1874, the family suffered serious financial difficulties, which, where the young Georg was concerned, were overcome thanks to Julius Friedländer, a friend of the family (co-founder of the music publishing company ‘Peters’). Friedländer felt a strong sympathy for the young Simmel, and indeed took him under his wing as his protégé. Thus, Georg could attend high school and then university in Berlin, where he studied philosophy, history, art history, and social psychology (Völkerpsychologie). Simmel received his degree as Doctor of Philosophy in 1881, but not without difficulty: his first attempt at a doctoral thesis, ‘Psychological and Ethnological Studies on the Origins of Music,’ was not accepted, and he had instead to submit his previous work on Kant, On the Essence of Matter (Das Wesen der Materie nach Kant’s Physischer Monadologie), which had earned him a prize (Köhnke 1996). The Habilitation (postdoctoral qualification to lecture) came next in 1885, for which he also encountered some controversy, and after this his academic career began immediately as a Privatdozent (external lecturer). An extraordinary (außerordentliche) professorship without salary followed in 1901, and indeed he had to wait until 1914 before he was offered a regular professorship at the University of Strasbourg, where he remained until his death on September 26, 1918, shortly before the end of the First World War. During his lifetime, Simmel was a well-known figure in Berlin’s cultural world. He did not restrict himself merely to scientific or academic matters, but consistently showed great interest in the politics of his time, including contemporary social problems and the world of the arts. He sought to be in the presence of, and in contact with, the intellectuals and artists of his day. He married the painter Gertrud Kinel, with whom he had a son, Hans, and maintained friendships with Rainer Maria Rilke, Stefan George, and Auguste Rodin, amongst many others. At his home in Berlin, he organized private meetings and seminars (Simmel’s privatissimo), whose participants he would choose personally.
From contemporary testimonies we know that Simmel was in fact one of the greatest lecturers at the University of Berlin, and that his seminars were attended by a great number of students (Gassen and Landmann 1993). It is thus difficult to understand why Simmel did not have a more successful academic career. We know from his correspondence and from Gassen and Landmann’s attempts to reconstruct Simmel’s life and oeuvre that Georg Jellinek engaged himself, though with no positive results, in seeking to obtain an ordinary professorship for Simmel at the University of Heidelberg in 1908 (with, following Jellinek’s death, a further attempt by Alfred Weber in 1912, and another in 1915). His converted Jewish background (i.e., assimilated into German society) surely played a significant role in the lack of recognition Simmel received from the academic system; but also his peculiar and original understanding of science, which diverged greatly from established patterns, as well as his characteristic mode of writing, using essays instead of more ‘academic’ and standardized forms, contributed to his not being accepted into the rather formal and classical German academic milieu. Both attempts at obtaining a professorship for Simmel were blocked by the bureaucracy of the Grand Dukedom of Baden. In fact the letter of evaluation written by the Berlin historian Schäfer regarding a possible professorship for Simmel in Heidelberg in 1908 has as the primary argument for denying Simmel’s potential ability to be a good professor his ‘Jewishness,’ which, according to Schäfer, too obviously tinged Simmel’s character and intellectual efforts with a strong relativism and negativity, which could not be good for any student (Gassen and Landmann 1993, Köhnke 1996). Throughout his life Simmel had to fight against these kinds of accusation, and he endeavored to build a ‘positive relativism’ in an attempt to show that he did not question ‘absolute pillars’ and thus leave us with nothing, but sought instead to show that this sense of ‘absoluteness’ was also a product of human reciprocal actions and effects (Wechselwirkungen), and was not fundamentally absolute. Such an argument was too much for the society and scientific milieu of the time to accept and forgive, even when Simmel later radically rejected his Introduction to the Moral Sciences (Einleitung in die Moralwissenschaften: GSG3 1989, GSG4 1991), referring to it as a sin of his youth. This was the work in which his relativism, in the critical, ‘negative’ sense, had been most fully in bloom (Köhnke 1996).

1. Georg Simmel and the Social Sciences
Simmel’s contributions to the social sciences are immeasurable. Nevertheless, most of them remain misunderstood, or have been separated from the intentions of their creator, and, thus, their origins have been forgotten. From system theory to symbolic interactionism, almost all sociological theories need to rediscover Simmel as one of their main founding parents. Simmel’s interest in the social sciences, especially in sociology, can be traced to the very beginning of his academic career. After attending seminars offered by Moritz Lazarus and Heymann Steinthal (founders of the Völkerpsychologie) during his student years, he became a member of Gustav Schmoller’s circle, where he became acquainted with the debates concerning the national economy of the time, and for whom he held a conference on The Psychology of Money (GSG2 1989, pp. 49–65), which would later constitute the first pillar of one of his major works, The Philosophy of Money (Frisby and Köhnke 1989). In both of these circles Simmel became more aware of, and sensitized towards, social questions. Schmoller’s engagement with social questions, together with Lazarus’ and Steinthal’s emphasis on the level of ‘Überindividualität’ (supraindividuality), and their relativistic worldview and insistence that ethical principles are not of universal validity, as well as Simmel’s own interest in Spencer’s social theory, all helped to shape the contours of his sociological approach. These various influences were melded together with Simmel’s philosophical orientation, particularly his interest in Kant, which yielded a new and rather far-reaching sociological perspective that would evolve extensively throughout his life. For example, Simmel’s first sociological work, On Social Differentiation (Über Sociale Differenzierung: GSG2 1989), written at the very beginning of his academic career, was deeply influenced by Herbert Spencer and the ideas of Gustav Schmoller and his circle. Slowly, however, as the 1890s drew on, the admiration Simmel had felt for Spencer’s theories turned into rejection, and he distanced himself from an evolutionary-organicist approach to sociology. In this rejection of such theories, he brought his knowledge of Kant to his sociological thinking, and this would become the basis of his later contact and dialogue with the members of the southern-German neo-Kantian school, such as Rickert, Windelband, and with the sociologist who was closest to their ideas: Max Weber. This contact with neo-Kantianism influenced Simmel’s approach to the social sciences at the end of the nineteenth century and during the first years of the 1900s. Notwithstanding these varying approaches to sociology, Simmel’s interest in the discipline persisted right through his career. When Simmel, in his letter to Célestin Bouglé
of March 2, 1908, wrote: ‘At the moment I am occupied with printing my Sociology, which has finally come to its end,’ and, sentences later, added that work on this book ‘had dragged on for fifteen years’ (Simmel Archive at the University of Bielefeld) he indicated that his engagement with sociology was a long-term project. Considering that Sociology (Sociologie) was first published in 1908 (and therefore his work on it must have begun around 1893), we can find its first seed in his article The Problem of Sociology (Das Problem der Sociologie) originally published in 1894. Simmel must have thought this article a significant contribution to sociology, since he endeavored to spread it abroad as much as possible. Hence, the French translation of The Problem of Sociology appeared, simultaneously with the original German version, in September 1894. The American translation appeared in the Annals of the American Academy of Political and Social Science a year later, and, by the end of the century, the Italian and Russian translations were also in print. The American translation is of particular significance, since Simmel emphasized therein, in a footnote, that sociology involved an empirical basis and research, and should not be thought of as an independent offshoot from philosophy, but as a science concerning
the social problems of the nineteenth century. In the Italian translation he delivered an updated version of the text, wherein he introduced clear references to his theoretical polemic with Emile Durkheim. From his letter to Célestin Bouglé of February 15, 1894, we know that Simmel was, after the completion of The Problem of Sociology, quite excited by this new discipline, and that he did not foresee a shift away to any other fields of inquiry in his immediate future. During those years he worked on most of the key sociological areas of research, thus articulating the key social problems of his time within a sociological framework: for example, workers’ and women’s movements, religion, the family, prostitution, medicine, and ethics, amongst many others. He seemed deeply interested in putting his new theoretical proposals and framework for the constitution of sociology into practice. He realised the institutionalisation of the new discipline would be reinforced by the establishment of journals for the discipline; hence his participation in: the ‘Institut Internationale de Sociologie,’ for whom he became vice-president; the American Journal of Sociology, and, although only briefly (due to differences with Emile Durkheim), L’Année sociologique (Rammstedt 1992, p. 4). He also played with the idea of creating his own sociological magazine. Another means of solidifying the role of sociology within the scientific sphere was through academia, and he engaged himself in organising sociological seminars, offering them uninterruptedly from 1893 until his death in 1918.
On June 15, 1898, he wrote to Jellinek: ‘I am absolutely convinced that the problem, which I have presented in the Sociology, opens a new and important field of knowledge, and the teaching of the forms of sociation as such, in abstraction from their contents, truly represents a promising synthesis, a fruitful and immense task and understanding’ (Simmel Archive at the University of Bielefeld). Despite his original intention, it is clear Simmel did not work continuously on the Sociology for 15 years: from 1897 to 1900 he worked almost exclusively on The Philosophy of Money (Philosophie des Geldes), and he also found time for the writing and publication of his Kant (1904), as well as a reprint of his Introduction to the Moral Sciences (1904), which he had intended to rewrite, for he no longer accepted most of the ideas he had presented therein when it was first published in 1892 (although he did not achieve this). The revised edition of The Problems of the Philosophy of History (Probleme der Geschichtsphilosophie 1905), The Philosophy of Fashion (Philosophie der Mode 1905), and Religion (Die Religion 1906/1912) were all worked on by Simmel during this period too (Rammstedt 1992b). In The Problem of Sociology he questioned for the first time the lack of a theoretically well-defined object of study for the emerging discipline, and sought to develop a specific sociological approach, which would entail a distinct object of study, in order to bestow legitimacy and
scientific concreteness to a discipline, under attack from different, and more settled, lines of fire. Fearing his call had not been heard, Simmel endeavored to prove his point by writing a broader work, within which he attempted to put into practice the main guidelines he had suggested as being central to the newly emerging discipline. In this way the Sociology, almost one thousand pages long, was cobbled together from various bits and pieces, taken from several essays he had written between the publication of The Problem of Sociology in 1894 and its final completion in 1908. Simmel, as can be understood from his letters, was aware of the incompleteness of this work, but rescued it by saying that it was an attempt to realise that which he had suggested almost 15 years earlier, for it had not been noticed enough by the scientific community. Simmel did not wish, by that time, to be thought of as only a sociologist. This was due to sociology not being an established discipline within the academic world, which therefore did not allow him to obtain a professorship in the field (i.e., offering very little recognition). Nevertheless he never quite abandoned the field of sociology and continued to write about religion, women’s issues, and the family. He merely broadened his scope to include Lebensphilosophie (the philosophy of life) and cultural studies in general, whilst participating at the same time in the founding of the German Sociological Society (Deutsche Gesellschaft für Soziologie), for whom he served as one of the chairmen until 1913. Sociology marks the end of Simmel’s strongly Kantian period, and represents a turning point in his interests, for, from this moment onwards, he refused to be labeled as a sociologist, seeking instead to return to philosophy.
Indeed, following the publication of the ‘big’ Sociology, as it is usually called by those who work in Simmel studies, he did not publish any sociological papers for nine years (although he continued, as has already been mentioned, to offer sociological seminars until his death), instead devoting his efforts to philosophy, history and the philosophy of art. Hence, as part of Simmel’s output from these years we find, amongst other works, Kant and Goethe (Kant und Goethe 1906/1916), Goethe (1913), Kant (1904/1913/1918), The Principal Problems of Philosophy (Hauptprobleme der Philosophie 1911), The Philosophical Culture (Philosophische Kultur 1911/1918), and Rembrandt (1916). This new direction was accompanied and partly motivated by Simmel’s acknowledgment of Henri Bergson’s oeuvre, and his inclination towards Lebensphilosophie (the philosophy of life; see Fitzi 1999, GSG16 1999). Thus, ‘life’ became the primary focus of Simmel’s theoretical work, and consequently ‘society’ was pushed into a secondary role. During these years Simmel occupied himself with the study of artists, of their production, and how the relation between life and its formal expression is crystallised in their work. It was as if Simmel had lost interest in sociology, as he did not write a single line
concerning it for years. Yet, unexpectedly in 1917, a year before his death, he wrote Grundfragen der Soziologie (Main Questions of Sociology), the ‘small sociology,’ as it is called by Simmel scholars (i.e., in contrast to his 1908 work). The impetus for writing this book came from a publisher (Sammlung Göschen), who intended to print an introductory work to sociology, which he invited Simmel to write, because of the success other works of his had enjoyed for the same publisher. If Simmel had indeed distanced himself from all sociological questions, it is likely he would have merely reached to the shelf for his previous works and rewritten them in shorter form. But this was not the case, because, although he utilised older material, Simmel rewrote and redefined his perspective, and in the Main Questions of Sociology, scarcely 100 pages long, presented the final stage of his sociological reflections, which melded together his previous study of forms of sociation with a perspective from the philosophy of life. This approach to sociology fell into neglect after his death, awaiting revitalisation, harbouring a broad scope and original perspective for new generations of sociologists to rediscover (Simmel’s last work was a philosophical contribution to Lebensphilosophie, the Lebensanschauung (View of Life, 1918; GSG16 1999)).
2. The Object of Sociology
At a time when sociology was still far from being an established discipline, instead seeking to stand up and open its eyes for the first time, Simmel sought to release it from the burden of being ‘the science of society.’ According to him this burden was an impossible one for the new-born discipline to carry, since being the science of society meant having to compete with already settled and established disciplines for the legitimacy of its object of study; law and history, psychology and ethnology, all could argue the case that society was their object, thus leaving sociology with the mere pretence of including elements from them all. Viewed from this perspective, as an object society was an all-encompassing matter but, at the same time, it eluded any scientific investigation, just like sand falling through our fingers. Simmel, like Max Weber two decades later, contributed to the demystification of ‘society’ as some kind of essential entity (as it appeared in nineteenth-century sociology, and remained as such in the sociology of Durkheim and Tönnies), instead explaining it as a dynamic process, a continuous happening, a continuous becoming, which is nothing more than the mere sum of the existing forms of sociation. Individuals as well as society/ies are not units in and of themselves, though they may appear as self-sufficient units depending on the distance the observer interposes between him/herself and them (as observed objects). However, Simmel argued in his Main Questions of Sociology (GSG16 1999, pp. 62–8) that if we merely take individuals and pretend, through adopting a significant distance, to approach ‘society,’ or a social phenomenon, we will not reach our goal. Therefore, when conducting a sociological inquiry, what we must seek to do is base ourselves on the forms of sociation, which are built in Wechselwirkung. Simmel used this concept when either addressing ‘interactions’ or following the Kantian definition as ‘reciprocal actions and effects’ (Simmel 1978). The difference between the two possible meanings of this concept, or, more precisely, the differentiation of the two different concepts hidden behind the same word, has been addressed in the English translation of The Philosophy of Money; so the usual mistake of translating Wechselwirkung as a direct synonym for interaction has been corrected. According to Simmel, what actually takes place between individuals will be seen to be that which constitutes the object of sociology, not merely the individuals by themselves, or society as a whole, for, as previously mentioned, society is nothing but the sum of forms of sociation, a continuous process in and for which these forms intertwine and combine themselves to form the whole (GSG11 1992, p. 19). Thus, Simmel defined society as the sum of the forms of reciprocal actions and effects.
3. The Concept of ‘Form’
If we take reciprocal actions and effects as our starting point, we achieve only a perspective common to all social and human sciences: sociology as method. In order to construct sociology as an independent discipline, with a specific object of study, it is necessary to analytically differentiate between form and content, that is, ‘forms’ as means and patterns of interaction between individuals, social groups or institutions, and ‘contents’ as that which leads us to act, the emotions or goals of human beings. Thus, social forms were conceived as being the object of analysis for a scientific sociology, which would only be possible empirically. Contents, on the other hand, should be left to other disciplines, such as psychology or history, to analyze. Simmel’s sociological theory is orientated towards these ‘forms of sociation’ (defined as such in the subtitle of Sociology). For individuals to become social, they need to rely on such forms in order to channel their contents, as forms represent their only means of participation in social interaction. Forms are independent from contents, and when analyzing them, it is necessary that they are abstracted from individual, particular participation in concrete interactions: the question of social forms does not include that regarding the specific relationships between the participants involved in concrete interactions; we are dealing only with that which is between them, as for an
analysis of social forms the particular individuals involved in them are irrelevant. Human beings, with their particular ‘contents,’ only become social when they seek to realise these contents, and then acknowledge this is only possible in a social framework: via ‘exteriorising,’ via acting through forms. Hence forms are social objectivations, which impose themselves, with their norms, upon particular individuals, an imposition which could only be annulled by isolation. These impositions and constraints are part of what sociation is, a concept which plays a key role in Simmel’s theory, for he actually maintained that human beings are not social beings by nature. Simmel placed sociation, as a product of reciprocal actions and effects, at the centre of his formal, or pure (reine), sociology. The form of competition can be used as an example of what Simmel actually meant by ‘forms.’ Competition does not imply any specific contents, and remains the same, independent of those who are competing, or what they are competing for. Competition forms ‘contents’ (such as job seeking, or looking for the attention of a beloved person, amongst many others), limits the boundaries of actions, and actually brings them into being by giving them a framework and shape in which they can appear in the arena of actions and effects. This situation of having the choice between multiple forms for channelling one content is stressed appreciatively in Simmel’s sociology, particularly in his Lebensphilosophie, when he no longer contrasted the concept of form with the concept of content, but with life. In this later period, forms are the crystallization of the unretainable flow of life, yet also the channels of expressing life, for life cannot be expressed as only itself.
Simmel explained this apparent paradox by asserting that life is ‘more-life’ and also ‘more-than-life.’ The concept of ‘more-life’ merely implies that this continuous flow connects every complete moment with the next; ‘more-than-life’ implies that life is not life if it does not transcend its boundaries, and becomes crystallized in a form. So life becomes art, or science, for example; life becomes externalised and crystallised and hence expressed and fulfilled. For instance, we should understand art as one form of expressing (äussern) life through (artistic, aesthetic) forms. Actually uncovering how this perspective can be applied to sociology may appear to be a difficult task; the first step towards clarifying this was made by Simmel himself in his Main Questions of Sociology.
4. ‘Me’ and ‘You’
Deliberately distancing himself from the concept of ‘alter ego,’ Simmel emphasized the significance the concept of ‘you’ should have to all sociological theory. He articulated and embedded this concept within his (sociological) theory of knowledge. In parallel with Kant’s question on nature, Simmel formulated this question as ‘How is society possible?’, the first and well-known digression of his Sociology. In order to answer this question he proposed three a priori. According to the first, ‘Me’ and ‘You’ would see each other as ‘to some extent generalised’ (GSG11 1992, p. 47), and he therefore assumed that each individual’s perception of the other (i.e., ‘you’) would also be generalized to a certain degree. The second a priori affirms that each individual as an ‘element of a group is not only a part of the society, but, on top of this, also something else’ (GSG11 1992, p. 51) that derives its own uniqueness, its individuality, from the duality of being/not being sociated. The third a priori is related to the assignment of each individual to a position within their society (in the sense of the sum of highly differentiated reciprocal actions and effects): ‘that each individual is assigned, according to his qualities, to a particular position within his social milieu: that this position, which is ideally suited to him, actually exists within the social whole; this is the supposition according to which every individual lives his social life’ (GSG11 1992, p. 59); this is what Simmel meant when he wrote about the ‘general value of individuality.’ These sociological a priori, orientated towards role, type, and individual/society, also allow us to understand the central aspects of Simmel’s methodology, which can be framed together with the concepts of ‘differentiation,’ in the sense of division and difference, and of ‘dualism,’ in the sense of an irreconcilable, tragic opposition of the elements of a social whole.
5. A Proposal for Three Sociologies
According to Simmel, sociology needed to stand as an autonomous science in close relationship with ‘the principal problem areas’ (GSG16 1999, pp. 76–84) of social life, even when these areas were distant from each other. Already in 1895 he asserted that just which name to give to any particular group was quite unimportant, since the real question was to state problems and to solve them, and not at all to discuss the names which we should give to any particular groups (1895, Annals of the American Academy of Political and Social Science 6: 420). He stressed this orientation towards problem areas again in 1917, when he returned to theorizing on the Main Questions of Sociology. Thus, as an attempt to approach these main questions (i.e., which relationships exist between society and its elements, individuals) he concentrated on three different sets of problems with concrete examples, particularly in the third part of the first chapter. The first focuses on objectivity as a component of the social sphere of experience; the second includes the actual facts of life, in which, and through which, social groups are realised; and the third emphasises the significance of society (GSG16 1999, p. 84) for grasping and understanding attitudes towards the world and life. These three problem areas correspond to his three proposed sociologies. The ‘general sociology,’ wherever society exists, deals with the central relationships between the individuals and the social constructions resulting from them. The aim of general sociology is to show which reciprocally orientated values these social constructions and individuals possess. He gave an example for general sociology in the second chapter of the Main Questions of Sociology, by illustrating the relationships which exist between the ‘social and the individual level,’ a perspective which, at the same time, sociologizes the masses theory. The ‘pure or formal’ sociology focuses upon the multiple forms individuals use to embody contents, that is, emotions, impulses, and goals, in reciprocal actions and effects with others, be it with other people, social groups, or social organizations, and thus they constitute society. According to him, each form wins, entwined in these social processes, ‘an own life, a performance, which is free from all roots in contents’ (GSG16 1999, p. 106). The example Simmel chose for his formal sociology was the form of ‘sociability’ (Geselligkeit), which he presented in the third chapter of the Main Questions of Sociology. Finally, the ‘philosophical sociology’ circles the boundary of the ‘exact, orientated towards the immediate understanding of the factual’ (GSG16 1999, p. 84) empirical sociology: on the one hand the theory of knowledge (Erkenntnistheorie), on the other, the attempts to complement, through hypothesis and speculation, the unavoidably fragmentary character of factual, empirical phenomena (Empirie), in order to build a complete whole (GSG16 1999, p. 85). In the fourth chapter, entitled ‘The Individual and Society in Eighteenth and Nineteenth Century Views of Life’, Simmel illustrated the abstract necessity of individual freedom, which should be understood as a reaction to the contemporary, increasing, social constraints and obligations. Thus, reference is made to general conflicts between individual and society, which derive from the irreconcilable conflict between the idea of society as a whole, which ‘requires from its elements the onesidedness of a partial function,’ and the individual, who knows him/herself to be only partially socialized, and ‘him/herself who wants to be whole’ (GSG16 1999, p. 123).
6. Simmel and Modern Sociology
Simmel’s theoretical significance to contemporary sociology resides in the various theories that built on his sociology. As examples, it will be sufficient to name symbolic interactionism, conflict theory, functionalism, the sociology of small groups, and theories of modernity. Simmel also introduced into sociology the essay as an academic form of analysis, whilst his digressions within Sociology should be mentioned as well, for they have become generally recognized as classical texts; see, for example, his digressions on ‘The Letter’ (Brief), ‘Faithfulness’ (Treue), ‘Gratefulness’ (Dankbarkeit), and ‘The Stranger’ (Fremde). But above all sociology owes Simmel the freedom he gave it from the fixation on the ‘individual and society’ as an ontic object: in hindsight, a point of no return.

See also: Capitalism; Cities: Capital, Global, and World; Cities: Internal Structure; Differentiation: Social; Ethics and Values; Family, Anthropology of; Family as Institution; Fashion, Sociology of; Feminist Movements; Gender and Feminist Studies in Political Science; Groups, Sociology of; History and the Social Sciences; Individual/Society: History of the Concept; Individualism versus Collectivism: Philosophical Aspects; Interactionism: Symbolic; Kantian Ethics and Politics; Knowledge (Explicit and Implicit): Philosophical Aspects; Knowledge Representation; Knowledge, Sociology of; Labor Movements and Gender; Labor Movements, History of; Medicine, History of; Methodological Individualism in Sociology; Methodological Individualism: Philosophical Aspects; Modernity; Modernity: History of the Concept; Modernization and Modernity in History; Modernization, Sociological Theories of; Money, Sociology of; Motivational Development, Systems Theory of; Personality and Social Behavior; Personality Structure; Personality Theories; Prostitution; Religion, Sociology of; Science and Religion; Self-knowledge: Philosophical Aspects; Social Movements and Gender; Social Movements, History of: General; Sociology, Epistemology of; Sociology, History of; Sociology: Overview; Symbolic Interaction: Methodology; Theory: Sociological; Urban Life and Health; Urban Sociology
Bibliography
Dahme H J, Rammstedt O (eds.) 1983 Georg Simmel. Schriften zur Soziologie. Eine Auswahl. Suhrkamp, Frankfurt am Main, Germany
Fitzi G 1999 Henri Bergson und Georg Simmel: Ein Dialog zwischen Leben und Krieg. Die persönliche Beziehung und der wissenschaftliche Austausch zweier Intellektuellen im deutsch-französischen Kontext vor dem Ersten Weltkrieg. Doctoral thesis, University of Bielefeld
Gassen K, Landmann M 1993 Buch des Dankes an Georg Simmel. Briefe, Erinnerungen, Bibliographie. Zu seinem 100. Geburtstag am 1. März 1958. Duncker & Humblot, Berlin
GSG 10: Philosophie der Mode (1905). Die Religion (1906/1912). Kant und Goethe (1906/1916). Schopenhauer und Nietzsche (1907), ed. M Behr, V Krech, and G Schmidt, 1995
GSG 8: Aufsätze und Abhandlungen 1901–1908, Band II, ed. A Cavalli and V Krech, 1993
GSG 2: Aufsätze 1887–1890. Über sociale Differenzierung (1890). Die Probleme der Geschichtsphilosophie (1892), ed. H J Dahme, 1989
GSG 5: Aufsätze und Abhandlungen 1894–1900, ed. H J Dahme and D Frisby, 1992
GSG 16: Der Krieg und die geistigen Entscheidungen (1917). Grundfragen der Soziologie (1917). Vom Wesen des historischen Verstehens (1918). Der Konflikt der modernen Kultur (1918). Lebensanschauung (1918), ed. G Fitzi and O Rammstedt, 1999
GSG 6: Philosophie des Geldes (1900/1907), ed. D Frisby and K C Köhnke, 1989
GSG 3: Einleitung in die Moralwissenschaft. Eine Kritik der ethischen Grundbegriffe. Erster Band (1892/1904), ed. K C Köhnke, 1989
GSG 4: Einleitung in die Moralwissenschaft. Eine Kritik der ethischen Grundbegriffe. Zweiter Band (1893), ed. K C Köhnke, 1991
GSG 1: Das Wesen der Materie (1881). Abhandlungen 1882–1884. Rezensionen 1883–1901, ed. K C Köhnke, 1999
GSG 7: Aufsätze und Abhandlungen 1901–1908, Band I, ed. R Kramme, A Rammstedt, and O Rammstedt, 1995
GSG 14: Hauptprobleme der Philosophie (1910/1927). Philosophische Kultur (1911/1918), ed. R Kramme and O Rammstedt, 1996
GSG 9: Kant (1904/1913/1918). Die Probleme der Geschichtsphilosophie, 2. Fassung (1905/1907), ed. G Oakes and K Röttgers, 1997
GSG 11: Soziologie. Untersuchungen über die Formen der Vergesellschaftung (1908), ed. O Rammstedt, 1992
Köhnke K C 1996 Der junge Simmel in Theoriebeziehungen und sozialen Bewegungen. Suhrkamp, Frankfurt am Main, Germany
Otthein R (ed.) 1992a Georg Simmel Gesamtausgabe. Suhrkamp, Frankfurt, Germany
Rammstedt O 1992 Programm und Voraussetzungen der Soziologie Simmels. Simmel-Newsletter 2: 3–21
Simmel G 1978 The Philosophy of Money, trans. and ed. by D Frisby and T Bottomore. Routledge, London
Simmel G 1895 The Problem of Sociology. Annals of the American Academy of Political and Social Science 6
O. Rammstedt and N. Cantó-Milà
Simulation and Training in Work Settings
Training is a set of activities planned to bring about learning in order to achieve goals often established using a training needs analysis. Methods of training include: behavior modeling, based on video or real activities; action training, which provides an opportunity to learn from errors; rule- and exemplar-based approaches that differ in terms of the extent to which theory is covered; and methods that aim to develop learning skills, the skills needed to learn in new situations. Simulation involves using a computer-based or other form of model for training, providing an opportunity for systematic exposure to a variety of experiences. Simulations are often used in training complex skills such as flying or command and control. Issues in training highlight the differing emphasis that can be placed on goals related to the process of learning or performance outcomes; motivation and self-management in training; and the transfer dilemma, where effortful methods of training that enhance transfer may be less motivational. Finally, organizational issues and training evaluation are discussed, with a particular emphasis on the role of the environment and follow-up in enhancing transfer of skills learned during training to the workplace.
1. Overview
Training is a set of planned activities organised to bring about learning needed to achieve organizational goals. Organizations undertake training as part of the socialization of newcomers. Even with the best selection systems a gap frequently remains between the knowledge, skills, and abilities (KSAs) required in a job, and those possessed by the individual. Training is used by organizations to improve the fit between what an individual has to offer and what is required. Training is also used when job requirements change, such as with the introduction of new technology. Because of the frequency with which job requirements change, a major aim of training is to develop specific skills while also ensuring their transferability and adaptability. An ultimate aim is to increase learning skills, where the emphasis in training is on learning to learn. Many methods of training are available (e.g., behavioral modeling, action, computer-based training), with an increasing trend toward incorporating evaluation as part of the training. Although simulations have been used in training for decades, the reduced cost and increased sophistication of simulations now make them more readily available for a wide range of situations. Simulations provide increased opportunities to train for transferable and adaptable skills because trainees can experiment, make errors, and learn from feedback on complex and dynamic real-time tasks. Evaluation of training programmes remains an important component of the training plan, and can be used to enhance transfer.
2. Training Needs Analysis
Most training starts with a training needs analysis, where the present and future tasks and jobs that people do are analyzed to determine the task and job requirements. The identified requirements are then compared with what the individual has to offer. This is done within a broader organizational context. Many different methods of job, task, and organizational analysis can be used, including observation, questionnaires, key people consultation, interviews, group discussion, using records, and work samples (that is, letting people perform a certain activity and checking what they need to know to perform well). A transfer of training needs analysis (Hesketh 1997a) places a particular emphasis on identifying the cognitive processes that must be practised during learning to ensure that the skills can be transferred to contexts and tasks beyond the learning environment.
3. The Training Plan and Methods
There has been an increase in the methods of training that can be used within a broader training plan that includes transfer and evaluation (Quinones and Ehrenstein 1997). The training plan provides a way of combining the illustrative methods discussed below to ensure that the training needs identified in the analysis can be met.
3.1 Behavior Modeling
Behavior modeling is often combined with role-playing in training. A model is presented on video or in real life, and the rationale for the special behaviors of the model is discussed with the trainees. Trainees then role-play the action or interaction, and receive feedback from the trainers and fellow trainees. This type of training is very effective as it provides an opportunity for practice with feedback (e.g., Latham and Saari 1979).
3.2 Action Training
Action training follows from action theory (Frese and Zapf 1994) and exploratory learning. Key aspects of action training involve active learning and exploration, often while doing a task. Action learning is particularly effective as a method of training (Smith et al. 1997). Another important aspect of action training involves obtaining a good mental model of the task and how it should be approached. A mental model is an abstraction or representation of the task or function. Trainees can be helped to acquire a mental model through the use of ‘orientation posters’ or advance organizers, or through the provision of heuristic rules (rules of thumb) (Volpert et al. 1984). One of the advantages of action training is the opportunity to learn from feedback and errors. Feedback is particularly important in the early stages of learning, but fading the feedback at later stages of learning helps ensure that trainees develop their own self-assessment skills (Schmidt and Bjork 1992). Errors are central in action training since systematic exposure to errors during learning provides opportunities to correct faulty mental models while providing direct negative feedback. Although earlier learning theory approaches argued that there should be only positive feedback, active error training helps trainees develop a positive attitude toward errors because of their value in learning (Frese et al. 1991).
3.3 Rules versus Examples in Training
Although it has traditionally been assumed that rule-based training provides a sound basis for longer term transfer, recent research suggests that for complex nonlinear problems, exemplar training may be superior. Optimizing the combination of rules and examples may be critical. In order to facilitate transfer, examples should be chosen carefully to cover the typical areas of the problem. With only a few examples, the rule and the example are often confused. However, individual instances are difficult to recall if trainees are provided with too many examples (DeLosh et al. 1997), suggesting that care needs to be taken when deciding how many examples will be presented in training.
3.4 Developing Learning Skills: Learning to Learn
Downs and Perry (1984) offer a practical way of helping trainees develop learning skills such as knowing that there are different ways of learning (e.g., learning by Memorizing, Understanding and Doing, the MUD categories). Facts need to be memorized, concepts need to be understood, and tasks such as driving a motor car need to be learned by doing. The approach stresses the importance of selecting the most appropriate method for the material to be learned so that learning skills can be trained while also developing content skills. Methods that typically encourage learning skills as well as content skills involve action learning, active questioning and discovery learning, rather than direct lecturing and instruction. The ideas in the learning to learn literature lead to issues such as developing self-management techniques and focusing on learning.
3.5 Simulation Training
Simulation involves developing a model of an on-the-job situation that can be used for training and other purposes. The advantages of using simulation for training include reduced cost, opportunity to learn from errors, and the potential to reduce complexity during the early stages of learning. Historically, simulators have been used for training in military, industrial and transport industries, and in management training where business games are widespread. Pilot training on simulators is well developed, to the extent that many pilots can transfer directly from a flight simulator to a real aircraft (Salas et al. 1998). Within the aviation industry, crew cockpit resource management training has also been undertaken using simulators. Driving simulators are not as well developed, and the transfer of skills learned on a driving simulator to the actual road remains to be established. Nevertheless, there is a view that attitudes and driving decisions can be trained on a simulator. Simulations of management decision situations and command and control in the military, police, and emergency personnel are well developed, and widely used for training. These simulations provide a miniaturized version of real crises, with key decision points extracted and shrunk in time. They provide an opportunity for the trainee to experiment with decisions, make mistakes, and learn from these errors (Alluisi 1991). Simulation training has also been used to facilitate the acquisition of cross-cultural skills.
4. Issues in Training
4.1 Learning vs. Performance Goals
Whether the emphasis during learning is on performance or learning goals is used to explain differences in how people conceptualize their ability. Dweck and Leggett (1988) argue that some people conceptualize ability as increasing with learning (learning orientation), while others see ability as fixed (performance orientation). People with a learning orientation learn from mistakes and challenges. However, individuals with a performance orientation view mistakes as examples of poor performance and learn less from errors. Performance-oriented people also tend to demonstrate a helpless response to problems and are therefore less likely to overcome challenges. Martocchio (1994) showed that ability self-conceptualization was related to computer anxiety and self-efficacy.
Motivation plays a key role in transfer of training (Baldwin and Ford 1988). Numerous motivation issues have been studied, the most important ones being self-efficacy, relapse prevention, perceived payoffs, goals, and the training contract. Self-efficacy is critical to transfer in that people will use a skill only if they believe that they can actually perform the appropriate behavior. Relapse prevention focuses on teaching solutions to those situations in which it may prove difficult to use the newly learned skills (Marx 1982). Trainees who receive relapse prevention training have been found to use their skills more often and perform their job better.
4.2 Self-management in Training
Self-management implies that one acquires the skills to deal with difficulties, reward oneself, and increase self-efficacy. Metacognitive strategies showing evidence of self-management and self-reward during training are related to training performance. Furthermore, self-efficacy is related to increased post-training knowledge (Martocchio 1994) and transfer performance. Thus, self-efficacy functions as a predictor of both training and transfer performance.
4.3 Transfer Dilemma
Transfer is important to ensure that the skills learned in one context or on one task can be applied in a range of different contexts and tasks. For example, fire fighters may learn how to fight fires in urban areas, but their skills must also transfer to the bush or rural areas where they are frequently required to work in emergencies. In the field of technology, training in the use of one spreadsheet or database should lead to transfer to different spreadsheets and databases and a range of other software packages. Baldwin and Ford (1988) provide a model that highlights several factors that influence transfer, including the similarity of the training and transfer situation, the nature of the training methods used, and the extent to which environmental factors reinforce transfer. Annett and Sparrow (1986) explained that transfer would be best if the stimulus situations and behaviors trained shared identical elements with stimuli in the work environment and the behaviors required there. Because these are seldom identical, it is important to use methods of training that encourage learners to bridge the gap and transfer their skills (Hesketh 1997b), and to create an environment that reinforces them for doing so. A transfer of training needs analysis can be used to discover how best to design training to increase transfer knowledge and skill (Hesketh 1997a). For example, a transfer of training needs analysis for a team leader in fire fighting would highlight the types of decisions that needed to be made in different fire incidents and the specific cues used in making the decisions. This information would be used to design training that emphasized practice of these decision processes in a range of systematically chosen contexts to facilitate transfer. This approach ensures that the cognitive skills required for transfer are practised during training, and that knowledge about transfer and likely barriers is discovered during training (Von Papstein and Frese 1988).
Druckman and Bjork (1992) have highlighted a dilemma in that the methods of training that trainees enjoy, and that often lead to better performance on the training task, are not necessarily the ones that enhance long-term retention and transfer. Trainees enjoy methods of training that require less effortful cognitive processing. Yet to facilitate transfer, trainees need to engage in the problem solving and active processes that they will be required to do on transfer. This may create motivational difficulties. Designing the training to provide an appropriate level of challenge is important for both motivation and transfer.
4.4 Simulation and Transfer
Early debate about the appropriate level of physical and psychological fidelity of simulation for training remains important, but has been incorporated into the more general issue of transfer and generalization. A high fidelity simulation may facilitate transfer to a single context, but lower fidelity simulators may be more appropriate if transfer is required to a range of situations. Current research is addressing the best ways of integrating simulation with other forms of training, and how to optimize the level of fidelity for the particular purpose and type of simulation. The debate is being informed by research on the best way of combining rules and examples for training. Simulators provide an ideal opportunity to structure systematic exposure to a carefully chosen set of exemplar situations. Simulators have traditionally been used for training, but their potential use is much more widespread. Simulation may provide a more realistic selection task for situations that require dynamic decision-making. Simulations are also being used as a way of signing off competency levels. Questions remain about how to deal with re-testing with a simulator, e.g., whether the trainee should be given an opportunity to perform again on exactly the same sequence as used during an initial test or a transfer problem. Here the research on transfer of training is critical, and may be of use in resolving the reassessment debate. This illustrates the ways in which selection, assessment, and training are strongly related areas of research and practice.
4.5 Organizational Issues
Organizational characteristics often influence transfer success. Trainees develop expectations about whether or not it will pay off when they use what they have learned in training. Often, companies teach one thing and reward a completely different behavior. For example, trainees may learn to be cooperative in a training course, but may then be paid for their individual contribution in a highly competitive environment. In such situations there is no transfer. Trainees need to be reinforced for what they have learned, and to practise skills in circumstances where errors can be made without serious consequences.
For example, a practice niche can be created where a bank clerk who has learned a new program to calculate mortgages is provided with an opportunity to practise it first while answering written requests. Thus, the customer does not see all the mistakes the bank clerk makes when using the new program.
4.6 Evaluating Training
The importance of evaluation has always been emphasized in training, although until recently approaches were somewhat traditional and limited. The importance of training evaluation has been recognized because intuitive guesses about what works are often wrong. Druckman and Bjork (1991) concluded that many companies in the USA continue to use methods of training known to be suboptimal for transfer because they rely on short-term reactions evaluation, rather than examining the longer term retention and transfer of skills. Training evaluation can also be used as a way of indicating the importance of the skills learned, and of providing an opportunity to practise. Integrating training evaluation into the training plan is the best way of achieving this. The more detailed understanding of ways in which knowledge structures change with skill acquisition has also provided a basis for evaluating training. For example, experts tend to have hierarchically organized knowledge structures, and are able to read off solutions to problems far more quickly. These ideas can be used to suggest innovative ways of evaluating training programs.
Bibliography
Alluisi E A 1991 The development of technology for collective training: SIMNET, a case history. Human Factors 33: 343–62
Annett J, Sparrow J 1986 Transfer of training: A review of research and practical implications. Programmed Learning and Educational Technology (PLET) 22(2): 116–24
Baldwin T T, Ford J K 1988 Transfer of training: A review and directions for future research. Personnel Psychology 41: 63–105
DeLosh E L, Busemeyer J R, McDaniel M A 1997 Extrapolation: The sine qua non for abstraction in function learning. Journal of Experimental Psychology: Learning, Memory, and Cognition 23: 968–86
Downs S, Perry P 1984 Developing learning skills. Journal of European Industrial Training 8: 21–6
Druckman D, Bjork R A 1991 In the Mind’s Eye: Enhancing Human Performance. National Academy Press, Washington, DC
Dweck C S, Leggett E L 1988 A social-cognitive approach to motivation and personality. Psychological Review 95: 256–73
Frese M, Brodbeck F C, Heinbokel T, Mooser C, Schleiffenbaum E, Thiemann P 1991 Errors in training computer skills: On the positive functions of errors. Human–Computer Interaction 6: 77–93
Frese M, Zapf D 1994 Action as the core of work psychology: A German approach. In: Triandis H C, Dunnette M D, Hough L M (eds.) Handbook of Industrial and Organizational Psychology, 2nd edn. Consulting Psychologists Press, Palo Alto, CA, Vol. 4, pp. 271–340
Hesketh B 1997a W(h)ither dilemmas in training for transfer. Applied Psychology: An International Review 46: 380–6
Hesketh B 1997b Dilemmas in training for transfer and retention. Applied Psychology: An International Review 46: 317–39
Latham G P, Saari L M 1979 Application of social-learning theory to training supervisors through behavioural modelling. Journal of Applied Psychology 64: 239–46
Martocchio J J 1994 Effects of conceptions of ability on anxiety, self-efficacy, and learning in training. Journal of Applied Psychology 79: 819–25
Marx R D 1982 Relapse prevention for managerial training: A model for maintenance of behavior change. Academy of Management Review 7: 433–41
Quinones M A, Ehrenstein A 1997 Training for a Rapidly Changing Workplace: Applications of Psychological Research. APA, Washington, DC
Salas E, Bower C A, Rhodenizer L 1998 It is not how much you have but how you use it: toward a rational use of simulation to support aviation training. International Journal of Aviation Psychology 8: 197–208
Schmidt R A, Bjork R A 1992 New conceptualizations of practice: Common principles in three paradigms suggest new concepts for training. Psychological Science 3: 207–17
Smith E, Ford J K, Kozlowski S 1997 Building adaptive expertise: Implications for training design. In: Quinones M A, Ehrenstein A (eds.) Training for a Rapidly Changing Workplace: Applications of Psychological Research. APA, Washington, DC
Volpert W, Frommann R, Munzert J 1984 Die Wirkung heuristischer Regeln im Lernprozeß. Zeitschrift für Arbeitswissenschaft 38: 235–40
Von Papstein P, Frese M 1988 Transferring skills from training to the actual work situation: The role of task application knowledge, action styles, and job decision latitude. In: Soloway E, Frye D, Shepard S B (eds.) Human Factors in Computing Systems, ACM SIGCHI Proceedings, CHI ’88, pp. 55–60
B. Hesketh and M. Frese
Simultaneous Equation Estimates (Exact and Approximate), Distribution of
A simple example of a system of linear simultaneous equations may consist of production and consumption functions of a nation: $Y = a + bK + cL + \text{error}$, and $C = d + eY + \text{error}$. The variables $Y$, $K$, $L$, and $C$ represent the gross domestic product (GDP), the capital equipment, the labor input, and the consumption, respectively. These variables are measures of the level of economic activity of a nation. In the production function, $Y$ increases if the inputs $K$ and/or $L$ increase. $C$ increases if $Y$ increases in the consumption equation. Each equation is modeled to explain the variation in the left-hand side ‘explained’ variable by the variation in the right-hand side ‘explanatory’ variables. Error terms are added to analyze numerically the effect of the neglected factors from the right-hand side of the equation. These equations are different from the regression equations since the ‘explained’ variable $Y$ is the ‘explanatory’ variable in the $C$ equation, and $Y$ and $C$ are simultaneously determined by the two equations. Estimation of unknown coefficients and the properties of estimation methods are not straightforward compared with the ordinary least squares estimator. In practice, this kind of simultaneous equation system is extended to include more than 100 equations, and regularly updated to measure the economic activities of a nation. It is indispensable for analyzing numerically the effect of policy changes and public investments. In this article, the statistical model and the estimation methods of all the equations are first explained, followed by the estimation methods of a single equation and their asymptotic distributions. Explained next are the exact distributions, the asymptotic expansions, and the higher order efficiency of the estimators.
1. The System of Simultaneous Equations and Identification of the System

We write the structural form of a system consisting of G simultaneous equations as

\[ y_i = Y_i\beta_i + Z_i\gamma_i + u_i = (Y_i, Z_i)\delta_i + u_i, \quad i = 1, \dots, G \tag{1} \]

where $y_i$ and $Y_i$ are 1 and $G_i$ subcolumns in the $T \times G$ matrix of all endogenous variables $Y = (y_i, Y_i, Y_{ei})$, $Z_i$ consists of $K_i$ subcolumns in the $T \times K$ matrix of all exogenous variables $Z$, $\beta_i$ and $\gamma_i$ are $G_i \times 1$ and $K_i \times 1$ column vectors of unknown coefficients, $\delta_i = (\beta_i', \gamma_i')'$, and $u_i$ is the $T \times 1$ error term. This system of G equations with T observations is frequently summarized in a simple form

\[ YB + Z\Gamma = U \tag{2} \]

the ith column of which is $y_i - Y_i\beta_i - Z_i\gamma_i = u_i$, i.e., Eqn. (1). The ith columns of $B$ and $\Gamma$ may be denoted as $b_i$ and $c_i$, where $(G - G_i - 1)$ and $(K - K_i)$ elements are zero so that $y_i - Y_i\beta_i - Z_i\gamma_i = Yb_i + Zc_i$. Zero elements are called zero restrictions. It is assumed that each row of $U$ is independently distributed as $N(0, \Sigma)$. The reduced form of Eqn. (2) is

\[ Y = Z\Pi + V \tag{3} \]

where the $K \times G$ reduced form coefficient matrix is $\Pi = -\Gamma B^{-1}$, and the $T \times G$ reduced form error term is $V = UB^{-1}$. Each row of $V$ is assumed to be independently distributed as $N(0, \Omega)$, and then $\Omega = (B^{-1})'\Sigma B^{-1}$, i.e., $\Sigma = B'\Omega B$.

The definition $\Pi = -\Gamma B^{-1}$, or $-\Pi B = \Gamma$, is the key to identifying structural coefficients. Coefficients in $\beta_i$ are identified if they can be uniquely determined by the equation $-\Pi b_i = c_i$ given $\Pi$. This equation is reduced to $-\Pi_0 (1, -\beta_i')' = 0$, denoting the $(K - K_i) \times (1 + G_i)$ submatrix in $\Pi$ as $\Pi_0$. (Rows and columns are selected according to the zero elements in $c_i$ and the non-zero elements in $b_i$, respectively.) Given $\Pi_0$, this includes $(K - K_i)$ linear equations and $G_i$ unknowns, and $\beta_i$ is solvable if $\operatorname{rank}(\Pi_0) = G_i$. This means $(K - K_i)$ must be at least $G_i$, or $L = K - K_i - G_i$ must be at least 0. If $L = 0$, $\beta_i$ is uniquely determined. For positive $L$, there are $L$ linearly dependent rows in $\Pi_0$ since only $G_i$ rows are necessary to determine $\beta_i$ uniquely. Once $\beta_i$ is determined, $\gamma_i$ is determined by the other $K_i$ equations through $-\Pi b_i = c_i$. $L$ is called the number of the degrees of overidentifiability of the ith equation. Structural coefficients are not uniquely estimable if they are not identified (see Statistical Identification and Estimability).
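The order condition described in this section lends itself to a small check. The following is a minimal sketch, not part of the original article: the function names `overidentification_degree` and `classify_equation` are hypothetical, the example system is made up, and the rank condition on the selected submatrix of the reduced form coefficients is assumed to hold whenever the order condition does.

```python
def overidentification_degree(K: int, K_i: int, G_i: int) -> int:
    """Return L = K - K_i - G_i for the ith structural equation.

    K   : number of exogenous variables in the whole system
    K_i : number of exogenous variables included in equation i
    G_i : number of right-hand side endogenous variables in equation i
    """
    return K - K_i - G_i


def classify_equation(K: int, K_i: int, G_i: int) -> str:
    """Classify equation i by the order condition only.

    Assumption: the rank condition (rank of the selected submatrix of Pi
    equals G_i) holds whenever L >= 0; it is not checked here.
    """
    L = overidentification_degree(K, K_i, G_i)
    if L < 0:
        return "not identified"
    if L == 0:
        return "just identified"
    return f"overidentified (L = {L})"


if __name__ == "__main__":
    # Hypothetical system with K = 5 exogenous variables.
    for K_i, G_i in [(2, 1), (4, 1), (4, 2)]:
        print(K_i, G_i, classify_equation(5, K_i, G_i))
```

The classification uses counts only, so it can rule identification out but cannot by itself confirm it.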
2. Estimation Methods of the Whole System and the Asymptotic Distribution

The full information maximum likelihood (FIML) estimator of all nonzero structural coefficients $\delta_i$, $i = 1, \dots, G$, follows from Eqn. (3). Since it is in a linear regression form, the likelihood function can first be optimized with respect to $\Omega$. Once $\Omega$ is replaced by the solution of the first-order condition, the likelihood function is concentrated so that only $B$ and $\Gamma$ are unknown. Apart from sign and constants, the concentrated log-likelihood function is proportional to

\[ \ln|\Omega_R|, \quad \Omega_R = (Y - Z\Pi_R)'(Y - Z\Pi_R)/T, \quad \Pi_R = -\Gamma B^{-1}, \tag{4} \]

and all zero restrictions are included in the $B$ and $\Gamma$ matrices. In the FIML estimation, it is necessary to minimize $|\Omega_R|$ with respect to all non-zero structural coefficients. The FIML estimator is consistent, and the asymptotic distribution is derived by the central limit theorem. Stacking $\delta_i$, $i = 1, \dots, G$, in a column vector $\delta$, the FIML estimator $\hat\delta$ asymptotically approaches $N(0, -I^{-1})$ as follows:

\[ \sqrt{T}(\hat\delta - \delta) \xrightarrow{D} N(0, -I^{-1}), \quad I = \lim_{T \to \infty} \frac{1}{T} E\left( \frac{\partial^2 \ln|\Omega_R|}{\partial\delta\,\partial\delta'} \right). \tag{5} \]

$I$ is the limit of the average of the information matrix, i.e., $-I^{-1}$ is the asymptotic Cramer–Rao lower bound. The FIML estimator is therefore the best among consistent and asymptotically normal (BCAN) estimators.

The right-hand side endogenous variable $Y_i$ in (1) is defined by a set of $G_i$ columns in (3) such as $Y_i = Z\Pi_i + V_i$. By the definition of $V$, $Y_i$ or, equivalently, $V_i$ is correlated with $u_i$ since columns in $U$ are correlated with each other. The least squares estimator applied to (1) is inconsistent because of the correlation between $Y_i$ and $u_i$. Since $Z$ is assumed to be uncorrelated with $U$ in the limit, $Z$ is used as $K$ instruments in the instrumental variable method estimator. Premultiplying (1) by $Z'$, it follows that

\[ Z'y_i = (Z'Y_i, Z'Z_i)\delta_i + Z'u_i = (0, \dots, 0, Z'Y_i, Z'Z_i, 0, \dots, 0)\delta + u_i^*, \quad i = 1, \dots, G \tag{6} \]

where the transformed right-hand side variables $Z'Y_i$, whose columns are $K \times 1$, are not correlated with $u_i^*$ in the limit. Stacking all G transformed equations in a column form, the G equations are summarized as $w = X\delta + u^*$, where $w$ and $u^*$ stack $Z'y_i$ and $u_i^*$, $i = 1, \dots, G$, respectively, and are $GK \times 1$. The covariance between $u_i^*$ and $u_j^*$ is $\sigma_{ij}(Z'Z)$, which is the ith row and jth column sub-block in the covariance matrix of $u^*$. (The whole covariance matrix can be written as $\Sigma \otimes (Z'Z)$, where $\otimes$ signifies the Kronecker product.) Once $\Sigma$ is estimated consistently (by the 2SLS method explained in the next section), $\delta$ is efficiently estimated by the generalized least squares method

\[ \hat\delta_{3SLS} = \{X'[\hat\Sigma^{-1} \otimes (Z'Z)^{-1}]X\}^{-1}\{X'[\hat\Sigma^{-1} \otimes (Z'Z)^{-1}]w\}. \tag{7} \]

This is the three-stage least squares (3SLS) estimator by Zellner and Theil (1962). The assumption of normally distributed errors is not required in this estimation. The 3SLS estimator is consistent and is BCAN since it has the same asymptotic distribution as the FIML estimator.
3. Estimation Methods of a Single Equation and the Asymptotic Distribution

An alternative way of estimating structural coefficients is to pick one structural equation, the ith, in a G-equation system, and estimate $\delta_i$ neglecting the zero restrictions in the other equations. Because of this, the other $(G - 1)$ structural equations can be rewritten equivalently as $(G - 1)$ reduced form equations. The limited information maximum likelihood (LIML) estimator by Anderson and Rubin (1949) applies the FIML method to a $(1 + G_i)$-equation system consisting of (1) and $Y_i = Z\Pi_i + V_i$. This means the first column and the following $G_i$ columns of $B$ are $(1, -\beta_i')'$ and $(0, I)'$, respectively, and the first column and the following $G_i$ columns of $\Gamma$ are $(-\gamma_i', 0')'$ and $-\Pi_i$, respectively. (If we denote $Y$ as $(y_i, Y_i, Y_{ei})$, $Y_{ei}$ is weakly exogenous in estimating (1) or, equivalently, the structural and reduced form parameters of $(y_i, Y_i)$ given $Y_{ei}$ are variation-free from the reduced form coefficient of $Y_{ei}$. Then $Y_{ei}$ is omitted from (2), (3), (4), and (8).) Using these limited information $B$ and $\Gamma$ matrices, we minimize $|\Omega_R|$ with respect to $\delta_i$ and $\Pi_i$. Defining $P_F = F(F'F)^{-1}F'$ for any matrix $F$ of full column rank, $G = (y_i, Y_i)'(P_Z - P_{Z_i})(y_i, Y_i)$, and $C = (y_i, Y_i)'(I - P_Z)(y_i, Y_i)$, it turns out that $\beta_i$ is estimated by minimizing the least variance ratio

\[ \lambda(\beta_i) = \frac{(1, -\beta_i')\,G\,(1, -\beta_i')'}{(1, -\beta_i')\,C\,(1, -\beta_i')'}. \tag{8} \]

$\gamma_i$ is estimated by the least squares method applied to (1), replacing $\beta_i$ with its estimator. The two-stage least squares (2SLS) estimator is the generalized least squares estimator applied to Eqn. (6) using $Z'Z$ as the weight matrix. (See Eqn. (10), where k is set to 1.) The assumption of the normal distribution of the error term is not required in this estimation. Both LIML and 2SLS estimators are consistent, and the large sample distribution is

\[ \sqrt{T}(\hat\delta_i - \delta_i) \xrightarrow{D} N(0, -I^{-1}(\delta_i, \delta_i)) \tag{9} \]

where $I$ is calculated similarly to Eqn. (5) but $\Omega_R$ is defined using the limited information $B$ and $\Gamma$ matrices, and only the diagonal submatrix of $I^{-1}$ which corresponds to $\delta_i$ is used. (Partial derivatives are calculated with respect to $\delta_i$ and the columns in $\Pi_i$.) This asymptotic distribution is a particular case of Eqn. (5). Both estimators are consistent and BCAN under the zero restrictions imposed on Eqn. (1). The k-class estimator $\hat\delta_i(k)$ unifies both LIML and 2SLS estimators (Theil 1961). It is

\[ \hat\delta_i(k) = \begin{pmatrix} Y_i'P_ZY_i - (k-1)Y_i'(I - P_Z)Y_i & Y_i'Z_i \\ Z_i'Y_i & Z_i'Z_i \end{pmatrix}^{-1} \begin{pmatrix} Y_i'P_Z - (k-1)Y_i'(I - P_Z) \\ Z_i' \end{pmatrix} y_i. \tag{10} \]

This is the least squares estimator, the 2SLS estimator, and the LIML estimator when k is 0, 1, and $1 + \lambda$, respectively. The k-class estimator has two important properties. It is consistent if $\operatorname{plim}_{T \to \infty} k = 1$, and is BCAN if $\operatorname{plim}_{T \to \infty} \sqrt{T}(k - 1) = 0$. If k satisfies these conditions, the k-class estimator is consistent and BCAN even when $Y_i'(I - P_Z)Y_i$ and $Y_i'(I - P_Z)y_i$ are replaced with any matrix and vector of order $O_P(T)$.
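As a hedged check on the statement that the 2SLS estimator is the generalized least squares estimator applied to Eqn. (6), the following sketch works through the algebra for a single equation. It is a reconstruction and is not taken from the original text. The symbols $X_i$ and $w_i$ (the regressor block and left-hand side of the ith transformed equation) are introduced here for illustration, and the derivation assumes that the columns of $Z_i$ are among the columns of $Z$, so that $P_Z Z_i = Z_i$.

```latex
\begin{align*}
X_i &= (Z'Y_i,\; Z'Z_i), \qquad w_i = Z'y_i, \qquad
\operatorname{Var}(u_i^*) = \sigma_{ii}\,(Z'Z), \\
\hat\delta_{i,\mathrm{GLS}} &= \{X_i'(Z'Z)^{-1}X_i\}^{-1}\, X_i'(Z'Z)^{-1} w_i, \\
X_i'(Z'Z)^{-1}X_i &=
\begin{pmatrix} Y_i'P_ZY_i & Y_i'P_ZZ_i \\ Z_i'P_ZY_i & Z_i'P_ZZ_i \end{pmatrix}
= \begin{pmatrix} Y_i'P_ZY_i & Y_i'Z_i \\ Z_i'Y_i & Z_i'Z_i \end{pmatrix}, \\
X_i'(Z'Z)^{-1}w_i &=
\begin{pmatrix} Y_i'P_Z \\ Z_i'P_Z \end{pmatrix} y_i
= \begin{pmatrix} Y_i'P_Z \\ Z_i' \end{pmatrix} y_i .
\end{align*}
```

The scalar $\sigma_{ii}$ cancels between the two factors, and the two blocks coincide with those of Eqn. (10) when k = 1.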
4. Exact Distributions of the Single-equation Estimators

Several early studies compared the bias and mean squared errors of OLS, LIML, 2SLS, FIML, and 3SLS estimators by Monte Carlo simulations, since all but the OLS estimator are consistent and are indistinguishable on that basis. The OLS estimator was often found to be as reliable as the consistent estimators. Later studies turned to t ratios, and the real defect of the OLS estimator was found: its deviation from the standard normal distribution is worse than that of any of the simultaneous equation methods. See Cragg (1967) for related papers. Drawing general qualitative comparisons from simulations is difficult since simulations require setting values of all population parameters. Simulation studies on the small sample properties led to the derivation of the exact distributions, which were expected to permit general comparisons without depending on particular parameter values.

If $n \times 1$ column vectors $x_t$, $t = 1, \dots, T$, are independently distributed $N(m_t, \Omega)$, the distribution of $\sum_{t=1}^{T} x_t x_t'$ is the non-central Wishart distribution, denoted as $W_n(T, \Omega, M)$, where the non-centrality parameter is $M = \sum_{t=1}^{T} m_t m_t'$ (Anderson 1958, Chap. 13). The study of the exact distribution of the single-equation estimators started from the fact that $G = (G_{kl})$ and $C = (C_{kl})$, $k, l = 1, \dots, 1 + G_i$, in Eqn. (8) are the noncentral Wishart matrix $W_{1+G_i}(K - K_i, \Omega, M)$ with the noncentrality parameter $M = \Pi_R'Z'(P_Z - P_{Z_i})Z\Pi_R$, and the central Wishart matrix $W_{1+G_i}(T - K, \Omega, 0)$, respectively. With $G$ and $C$ partitioned conformably with $(y_i, Y_i)$, the 2SLS, OLS, and LIML estimators of $\beta_i$ are $G_{22}^{-1}G_{21}$, $(G_{22} + C_{22})^{-1}(G_{21} + C_{21})$, and $(G_{22} - \lambda C_{22})^{-1}(G_{21} - \lambda C_{21})$, respectively, and $\lambda$ is the minimum root of the polynomial equation $|G - \lambda C| = 0$. Since all estimators are functions of elements in the $G$ and $C$ matrices, their distributions can be characterized by the degrees of freedom of the two Wishart matrices and by the $M$ and $\Omega$ matrices.

In deriving the exact density functions, the 2SLS and OLS estimators can be treated in a similar way. In the 2SLS estimator, the joint density of $G$ is transformed into the joint density of $G_{22}^{-1}G_{21}$, $G_{11} - G_{12}G_{22}^{-1}G_{21}$, and $G_{22}$. Integrating out $G_{11} - G_{12}G_{22}^{-1}G_{21}$ and $G_{22}$ results in the joint density of $G_{22}^{-1}G_{21}$. The resulting density function includes infinite terms, and zonal polynomials when $G_i$ is greater than one. A pedagogical derivation is found in Press (1982, Chap. 5). In the LIML estimator, the joint density of $G$ and $C$ is transformed into that of characteristic roots and vectors. Since $\hat\beta$ is rewritten as a ratio of elements in a characteristic vector, the density function is derived by integrating out unnecessary random variables from the joint density function. However, the analytical operations are not easy when there are many endogenous variables. See Phillips (1983, 1985) for comprehensive reviews on the 2SLS and LIML estimators, respectively.

It was somewhat fruitless to derive exact distributions because these include nuisance parameters and infinite terms. It was difficult to draw general conclusions on the qualitative properties of the estimators from the numerical evaluations of these distributions. See Anderson et al. (1982). Qualitative properties of the estimators followed instead from the exact moments of estimators. Kinal (1980) proved that the (fixed) k-class estimator in a multiple endogenous variables case has moments up to $(T - K_i - G_i)$ if $0 \le k < 1$, and up to $L$ if $k = 1$. Mariano and Sawa (1972) proved that, in the $G_i = 1$ case, the mean and variance of the LIML estimator do not exist. (In Monte Carlo simulations, the bias of LIML estimators was often found to be smaller than that of others even though the exact mean does not exist. This showed the clear limitation of the simulation methods.)
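The expressions for the 2SLS, OLS, and LIML estimators in terms of the $G$ and $C$ matrices can be sketched for the simplest case, $G_i = 1$, in which both matrices are 2 by 2 and the determinantal equation is a quadratic in $\lambda$. This is a minimal illustration under stated assumptions: the function names are hypothetical, the numerical values of `G` and `C` are made up rather than computed from data, and `C` is assumed positive definite.

```python
import math


def min_root(G, C):
    """Smallest root lambda of |G - lambda*C| = 0 for symmetric 2x2 G and C.

    Expanding the determinant gives a*lambda^2 + b*lambda + c = 0 with
    a = det(C) and c = det(G). C is assumed positive definite, so a > 0
    and the roots are real.
    """
    (g11, g12), (_, g22) = G
    (c11, c12), (_, c22) = C
    a = c11 * c22 - c12 ** 2
    b = -(g11 * c22 + g22 * c11 - 2.0 * g12 * c12)
    c = g11 * g22 - g12 ** 2
    disc = b * b - 4.0 * a * c
    return (-b - math.sqrt(disc)) / (2.0 * a)


def beta_estimates(G, C):
    """2SLS, OLS and LIML estimates of a scalar beta_i from G and C."""
    g21, g22 = G[1][0], G[1][1]
    c21, c22 = C[1][0], C[1][1]
    lam = min_root(G, C)
    return {
        "2SLS": g21 / g22,
        "OLS": (g21 + c21) / (g22 + c22),
        "LIML": (g21 - lam * c21) / (g22 - lam * c22),
        "lambda": lam,
    }


if __name__ == "__main__":
    # Hypothetical matrices, chosen only to exercise the formulas.
    G = ((10.0, 6.0), (6.0, 5.0))
    C = ((20.0, 4.0), (4.0, 10.0))
    for name, value in beta_estimates(G, C).items():
        print(name, value)
```

Because $(1, -\hat\beta)'$ at the LIML estimate is the characteristic vector belonging to the minimum root, both rows of $(G - \lambda C)(1, -\hat\beta)'$ should vanish, which gives a simple internal consistency check on the computation.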
5. Asymptotic Expansions of the Distributions of the Single-equation Estimators

Asymptotic expansion of the distribution was introduced as an analytical tool which is more accurate than the asymptotic distribution but is less complicated than the exact distributions. For instance, the t ratio statistic, say X, which is commonly used in econometrics, has the density function $f(x) = c \cdot [1 + (x^2/m)]^{-(1+m)/2}$, where c is a constant and m is the degrees of freedom, under conditions including normally distributed error terms. Since the mean and variance of X are 0 and $m/(m - 2)$, the standardized statistic $Z = \sqrt{(m - 2)/m} \cdot X$ has the density $f(z) = c' \cdot [1 + (z^2/(m - 2))]^{-(1+m)/2}$, where $c'$ is a new constant. This density function is expanded to the third-order term as

\[ f(z) = \phi(z)\left\{1 + \frac{1}{4m}(z^4 - 6z^2 + 3)\right\} + o(m^{-1}) \tag{11} \]

where $\phi(z)$ is the standard normal density function, and the constant is adjusted so that the area under the curve is one. (Since the t distribution is symmetric around the origin, the $O(1/\sqrt{m})$ term does not appear in the right-hand side of the equation.) Rewriting in terms of X, the asymptotic expansion of the t statistic with m degrees of freedom is

\[ f(x) = \phi(x) + \frac{1}{4m}(x^4 - 2x^2 - 1)\phi(x) + o(m^{-1}) \tag{12} \]

The first term in the right-hand side is the $N(0, 1)$ density function. The second term in the right-hand side converges to zero as m goes to infinity. This second term gives the deviation of $f(x)$ from $\phi(x)$, and is called the third-order term. For finite m, the third-order term is expected to improve the standard normal approximation. The numerical evaluation of this expansion is easy.

The asymptotic expansions in the simultaneous equation estimators are long and include nuisance parameter matrices such as q below. See Phillips (1983) for a review and Phillips (1977) for the validity of the expansion. The asymptotic expansion does not require the assumption of the normal distribution of the error term. Fujikoshi et al. (1982) gave the expansion of the joint density of the estimators of $\delta_i$. In their study, the bias of the estimators is calculated from the asymptotic expansions as

\[ AM(\sqrt{T}(\hat\delta_i - \delta_i)) = \frac{1}{\sqrt{T}}(dL - 1)I^{-1}q + o(T^{-1}), \quad q = \frac{1}{\sigma^2}\begin{pmatrix} \Omega_{21} - \Omega_{22}\beta_i \\ 0 \end{pmatrix}, \tag{13} \]

where $\hat\delta_i$ is the estimator, d is 1 and 0 for the 2SLS and LIML estimators, respectively, and the $\Omega$ matrix is partitioned into submatrices conformable with the partitions of the $G$ and $C$ matrices. $AM(\cdot)$ stands for the mean operator, but uses the asymptotic expansion for the density function. (Recall that the exact mean does not exist in the LIML estimator.) It is possible to compare the two estimators in terms of the calculated bias. For example, the bias of the 2SLS estimator is 0 when the degree of overidentifiability L is 1. Further, the mean of the squared errors was calculated from the asymptotic expansions and used to compare estimators. It was proved that the mean squared error of the 2SLS estimator is smaller than that of the LIML estimator when L is less than or equal to 6. Historically, this kind of comparison of ‘approximate’ mean squared errors goes back to the ‘Nagar expansion’ by Nagar (1959) and the ‘small disturbance expansion’ by Kadane (1971). These qualitative comparisons gave researchers some guidance about the choice of estimators.

It was interesting to examine the accuracy of the approximations calculated by the asymptotic expansions of distributions. If the asymptotic expansions were accurate, calculation of the exact distributions could be avoided, and properties of estimators could be found easily from the approximations. However, the approximations were found not to be accurate enough to replace the exact distributions. In many cases the asymptotic expansion is accurate only when the asymptotic distribution, the first term in the expansion, is already accurate, and it is inaccurate when the asymptotic distribution is inaccurate. In particular, the asymptotic expansions are inaccurate when (a) the value of the asymptotic variance is small; (b) the value of L is large; or (c) the structural error term is highly correlated with the right-hand side endogenous variable. Three remarks are in order. First, $Z'Z/T$ is assumed to converge to a nonsingular fixed matrix in the asymptotic theory. Second, for an attempt to improve the accuracy of asymptotic distributions by incorporating large L values, see Morimune (1983). Third, the asymptotic expansions cannot trace closely the skewed exact distributions that arise particularly when the correlation is high.
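The claim that the third-order term improves on the standard normal approximation for the t statistic can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the choice of m = 10 and of the evaluation points is arbitrary.

```python
import math


def t_density(x: float, m: int) -> float:
    """Exact density of Student's t with m degrees of freedom."""
    c = math.gamma((m + 1) / 2) / (math.sqrt(m * math.pi) * math.gamma(m / 2))
    return c * (1.0 + x * x / m) ** (-(m + 1) / 2)


def normal_density(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def t_expansion(x: float, m: int) -> float:
    """Expansion of the t density to the third-order term, Eqn. (12),
    with the o(1/m) remainder dropped."""
    correction = (x ** 4 - 2.0 * x ** 2 - 1.0) / (4.0 * m)
    return normal_density(x) * (1.0 + correction)


if __name__ == "__main__":
    m = 10
    print("x, exact, normal, expansion")
    for x in (0.0, 1.0, 2.0):
        print(x, t_density(x, m), normal_density(x), t_expansion(x, m))
```

At these points the expansion lies closer to the exact density than the normal density does, although the improvement is not guaranteed uniformly in x, particularly far in the tails.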
6. Higher-order Efficiency of the System and the Single-equation Estimators

The asymptotic expansions of distributions can be used in comparing the probabilities of concentration of estimators about the true parameter values. One estimator is more desirable than another if its probability of concentration is greater than that of the other. This measure was used in comparing the single-equation estimators, and some qualitative results were derived. Furthermore, the third-order efficiency criterion was brought into the comparisons. This criterion requires that estimators be adjusted to have the same asymptotic bias as in Eqn. (13). Then the adjusted estimators are compared, and the maximum likelihood estimator is proved most efficient. It has the highest concentration about the true parameter values in terms of the asymptotic expansion of the distribution to the third-order $O(1/T)$ terms (see, for example, Akahira and Takeuchi 1981). The adjusted maximum likelihood estimator has the smallest mean-squared error at the same time, since the difference among estimators is found only in the mean-squared errors.

In whole system estimation, the FIML estimator is third-order efficient. The 3SLS estimator is less efficient than the FIML estimator in terms of the asymptotic probability of concentration once the bias of the two estimators is adjusted to be the same. Morimune and Sakata (1993) derived a simple adjustment of the 3SLS estimator so that the adjusted estimator has the same asymptotic expansion as the FIML estimator to the third-order $O(1/T)$ terms. This estimator is explained by modifying Eqn. (7). In Eqn. (6), $Y_i$ is replaced by $\hat{Y}_i = Z\hat\Pi_i = Z(Z'Z)^{-1}Z'Y_i$ so that the $X$ matrix consists of $Z'\hat{Y}_i$ and $Z'Z_i$. In the modified estimator, we estimate $\Sigma$ and $\Pi$ by the first round 3SLS estimator and replace $\hat{Y}_i$ in $X$ by $\tilde{Y}_i = Z\tilde\Pi_i$, where $\tilde\Pi_i$ consists of the proper subcolumns in $\tilde\Pi = -\hat\Gamma\hat{B}^{-1}$. The new $X$ matrix is denoted as $\tilde{X}$. Finally, the new estimator is

\[ \hat\delta_{M3SLS} = \{\tilde{X}'[\hat\Sigma^{-1} \otimes (Z'Z)^{-1}]\tilde{X}\}^{-1}\{\tilde{X}'[\hat\Sigma^{-1} \otimes (Z'Z)^{-1}]w\}. \tag{14} \]

This estimator has the same asymptotic expansion as the FIML estimator to the third-order terms and is third-order efficient. The LIML and 2SLS estimators are simple cases of the FIML and 3SLS estimators, respectively. The modified 2SLS estimator which follows from Eqn. (14) therefore has the same asymptotic expansion as the LIML estimator to the third-order term. The LIML estimator and the modified 2SLS estimator are third-order efficient in single-equation estimation.
7. Conclusion

Lawrence Klein received the 1980 Nobel prize for the creation of a macroeconometric model, which is an empirical form of a simultaneous equation system, and for its application to the analysis of economic fluctuations and economic policies. The macroeconometric model became a standard tool for analyzing the economies and policies of nations. Trygve Haavelmo received the 1989 Nobel prize for his contribution to the analysis of simultaneous structures. Haavelmo, together with other researchers at the Cowles Commission for Research in Economics, then at the University of Chicago, became the founders of simultaneous equation analysis in econometrics. Part of their research is collected in Hood and Koopmans (1953). Studies on the exact and approximate distributions of estimators came after the research conducted by the Cowles group, and helped to make econometrics rigorous.

Access to computers was the main concern when econometric model-building started spreading all over the world in the 1970s. Since then, the computer facilities surrounding econometric model-building have changed greatly. Bulky mainframe computers have been replaced by personal computers. Computer programs were at first written individually, mostly in Fortran, and were used for regression analyses in model estimation as well as for simulation studies in econometric theory. Packaged least squares programs later replaced the individually written programs in model estimation. They run on personal computers and have greatly facilitated the conducting of empirical studies.

See also: Simultaneous Equation Estimation: Overview
Bibliography

Akahira M, Takeuchi K 1981 Asymptotic Efficiency of Statistical Estimators: Concepts and Higher Order Asymptotic Efficiency. Springer, New York
Anderson T W 1958 An Introduction to Multivariate Statistical Analysis. Wiley, New York
Anderson T W, Rubin H 1949 Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20: 46–63
Anderson T W, Kunitomo N, Sawa T 1982 Evaluation of the distribution function of the limited information maximum likelihood estimator. Econometrica 50: 1009–27
Basmann R L 1957 A generalized classical method of linear estimation of coefficients in a structural equation. Econometrica 25: 77–83
Cragg J G 1967 On the relative small sample properties of several structural equation estimators. Econometrica 35: 89–110
Fujikoshi Y, Morimune K, Kunitomo N, Taniguchi M 1982 Asymptotic expansions of the distributions of the estimates of coefficients in a simultaneous equation system. Journal of Econometrics 18: 191–205
Hood W C, Koopmans T C 1953 Studies in Econometric Methods. Wiley, New York
Kadane J B 1971 Comparison of k-class estimates when the disturbance is small. Econometrica 39: 723–37
Kinal T W 1980 The existence of moments of k-class estimators. Econometrica 48: 241–9
Mariano R S, Sawa T 1972 The exact finite sample distribution of the limited information maximum likelihood estimator in the case of two included endogenous variables. Journal of the American Statistical Association 67: 159–63
Morimune K 1983 Approximate distributions of k-class estimators when the degree of over-identifiability is large compared with the sample size. Econometrica 51: 821–41
Morimune K, Sakata S 1993 Modified three stage-least squares estimator which is third-order efficient. Journal of Econometrics 57: 257–76
Nagar A L 1959 The bias and moment matrix of the general kappa-class estimators of the parameters in simultaneous equations. Econometrica 27: 575–95
Phillips P C B 1977 The general theorem in the theory of asymptotic expansions as approximation to the finite sample distributions of econometric estimators. Econometrica 45: 1517–34
Phillips P C B 1983 Exact small sample theory in the simultaneous equations model. In: Griliches Z, Intriligator M D (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. 1, Chap. 8
Phillips P C B 1985 The exact distribution of LIML: 2. International Economic Review 25(1): 249–61
Press S J 1982 Applied Multivariate Analysis: Using Bayesian and Frequentist Methods of Inference [original edn. 1972, 2nd edn.]. Robert E Krieger, Malabar, FL
Theil H 1961 Economic Forecasts and Policy, 2nd edn. North-Holland, New York, pp. 231–2, 334–6
Zellner A, Theil H 1962 Three-stage least squares: simultaneous estimation of simultaneous equations. Econometrica 30: 54–78
K. Morimune
Simultaneous Equation Estimation: Overview

Simultaneous equations are important tools for understanding behavior when two or more variables are determined by the interaction of two or more relationships in such a way that causation is joint rather than unidirectional. Such situations abound in economics, but also occur elsewhere. Haavelmo (1943, 1944) began their modern econometric treatment. The simplest economic example is the interaction of buyers and sellers in a competitive market, which jointly determines the quantity sold and the price. Another important example is the interaction of workers, consumers, investors, firms, and government in determining the economy’s output, employment, price level, and interest rates, as in macroeconometric forecasting models. Even when one is interested only in a single equation, it often is best interpreted as one of a system of simultaneous equations. Strotz and Wold (1960) argued that in principle every economic action is a response to a previous action of someone else, but even they agreed that simultaneous equations are useful when the data are yearly, quarterly, or monthly, because these periods are much longer than the typical market response time. This article discusses the essentials of simultaneous equation estimation using a simple linear example.

1. A Simple Supply–Demand Example

Let q and p stand for the quantity sold and price of a good, y and w for the income and wealth of buyers (assumed independently determined), $u_1$ and $u_2$ for unobservable random shocks, and Greek letters for unknown constant parameters. Suppose that market equilibrium, in which the price is such that suppliers want to sell the same quantity that demanders want to buy, is described by these linear equations:

\[ \text{supply:} \quad q = \gamma_1 + \beta_1 p + u_1, \quad \beta_1 > 0 \tag{1} \]
\[ \text{demand:} \quad q = \gamma_2 + \delta_2 y + \varepsilon_2 w + \beta_2 p + u_2, \quad \beta_2 < 0 \tag{2} \]

Neither equation alone can determine either p or q, because each equation contains both: these are simultaneous equations. Solve them for p and q thus:

\[ p = \pi_{11} + \pi_{12} y + \pi_{13} w + (u_2 - u_1)/\Delta \tag{3} \]
\[ q = \pi_{21} + \pi_{22} y + \pi_{23} w + (\beta_1 u_2 - \beta_2 u_1)/\Delta \tag{4} \]

where

\[ \pi_{11} = (\gamma_2 - \gamma_1)/\Delta, \quad \pi_{12} = \delta_2/\Delta, \quad \pi_{13} = \varepsilon_2/\Delta \tag{5} \]
\[ \pi_{21} = (\beta_1\gamma_2 - \beta_2\gamma_1)/\Delta, \quad \pi_{22} = \beta_1\delta_2/\Delta, \quad \pi_{23} = \beta_1\varepsilon_2/\Delta \tag{6} \]

and

\[ \Delta = \beta_1 - \beta_2 \tag{7} \]

2. Types of Variables: Structural and Reduced Form Equations

The variables p and q which are to be explained by the model are endogenous. Equations (1) and (2) are structural equations. Each of them describes the behavior of one of the building blocks of the model, and (as is typical) contains more than one endogenous variable. Equations (3) and (4) are reduced form equations. Each of them contains just one endogenous variable, and determines its value as a function of parameters, shocks, and explanatory variables (here y and w). The explanatory variables are predetermined if the shocks for any given period are independent of the explanatory variables for that period and all previous periods; they are exogenous if the shocks for each period are independent of the explanatory variables for every period. Thus all exogenous variables are predetermined, but not conversely. For example, the value of an endogenous variable from a previous period cannot be exogenous, but it is predetermined if the shocks are serially independent.

3. The Need for Estimation of Parameters

One job of a reduced form equation is to tell how an endogenous variable responds to a change in any predetermined or exogenous variable. Typically, no single structural equation can do this. Another job of a reduced form equation is to forecast the value of an endogenous variable in a future period, based on expected future values of the predetermined and exogenous variables and of the shocks (the latter can safely be set at zero in most cases). Typically, no single structural equation can do this either. If the reduced form equations are to do these jobs, numerical values of their parameters are needed (in this example, values of the π’s in Eqns. (3) and (4)).

Numerical values of the structural parameters are needed as well (in this example, the β’s, γ’s, $\delta_2$ and $\varepsilon_2$ in Eqns. (1) and (2)). There are several reasons. First, one wants to understand each of the system’s separate building blocks in its own right. Second, in overidentified cases (see Sect. 7) better estimates of the reduced form can be obtained by solving the estimated structure (as in Eqns. (5)–(7)) than by estimating the reduced form directly. Third, if forecasts made by the reduced form are poor, one wants to know which of the structural equations fit the forecast period’s data poorly, and which (if any) fit well. This is discovered by inserting observed values of the variables into each estimated structural equation to obtain an estimate of its forecast-period shock. Then one can revise the poorly fitting structural equation(s) and so try to improve the model’s accuracy (see Sect. 11). Fourth, if one wants to find the new values of the reduced form parameters after a change in a structural parameter, one needs to know the old values of all the structural parameters, as well as the new value of the one that has changed. In the example, one can then use Eqns. (5)–(7) to compute the reduced form parameters. The critique of Robert Lucas (1976) provides a warning about pitfalls in doing this.
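The mapping from structural to reduced form parameters in Eqns. (5)–(7) can be sketched in a few lines. This is an illustrative sketch only: the function names `reduced_form` and `forecast` are hypothetical, and the parameter values in the usage example are invented, not estimates from any data.

```python
def reduced_form(gamma1, beta1, gamma2, delta2, eps2, beta2):
    """Reduced form coefficients of Eqns. (3)-(4) from the structural
    parameters of the supply equation (1) and demand equation (2)."""
    delta = beta1 - beta2                      # Eqn. (7)
    if delta == 0:
        raise ValueError("beta1 must differ from beta2")
    return {
        "pi11": (gamma2 - gamma1) / delta,     # Eqn. (5)
        "pi12": delta2 / delta,
        "pi13": eps2 / delta,
        "pi21": (beta1 * gamma2 - beta2 * gamma1) / delta,  # Eqn. (6)
        "pi22": beta1 * delta2 / delta,
        "pi23": beta1 * eps2 / delta,
    }


def forecast(pi, y, w):
    """Price and quantity implied by the reduced form with shocks set to zero."""
    p = pi["pi11"] + pi["pi12"] * y + pi["pi13"] * w
    q = pi["pi21"] + pi["pi22"] * y + pi["pi23"] * w
    return p, q


if __name__ == "__main__":
    # Hypothetical structural parameters (beta1 > 0, beta2 < 0).
    params = dict(gamma1=1.0, beta1=1.0, gamma2=10.0,
                  delta2=1.0, eps2=0.5, beta2=-1.0)
    pi = reduced_form(**params)
    p, q = forecast(pi, y=2.0, w=4.0)
    print(pi)
    print(p, q)
```

To trace the effect of a change in one structural parameter, one would call `reduced_form` again with the new value and the unchanged old values of the others, which is the fourth use described above.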
4. Least Squares Estimators

Least squares (LS) estimators (see Linear Hypothesis) of an equation’s coefficients are biased if the shocks in each period are not independent of the explanatory variables in all periods. Clearly this is true of Eqns. (1) and (2), since Eqns. (3) and (4) show that both $u_1$ and $u_2$ influence both p and q. This is typically true of LS estimators of simultaneous structural equations. However, LS estimators often have small variances, so they may sometimes have acceptably small expected squared errors even when they are biased. LS estimators of the π’s in the reduced form, Eqns. (3) and (4), are unbiased if the shocks in each period have zero mean and constant variance, and are independent of y and w for all periods (so that y and w are exogenous). They have minimum variance among unbiased estimators and are consistent if in addition the shocks are uncorrelated across time. They remain consistent if the shocks are uncorrelated across time but y and w are predetermined rather than exogenous. The generalized least squares (GLS) method is minimum variance unbiased if the explanatory variables are exogenous but the shocks are correlated across time. This method requires information about the variances and covariances of the shocks.
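The size of the inconsistency of LS in the supply equation (1) can be illustrated under extra assumptions that the text does not make: suppose y, w, $u_1$, and $u_2$ are mutually independent with constant variances. Then Eqn. (3) implies that the covariance between p and $u_1$ is $-\operatorname{Var}(u_1)/\Delta$, and the probability limit of the LS slope is $\beta_1$ plus that covariance divided by the variance of p. The sketch below evaluates this limit; the function name and all numerical values are hypothetical.

```python
def ls_plim_supply_slope(beta1, beta2, delta2, eps2,
                         var_y, var_w, var_u1, var_u2):
    """Probability limit of the LS slope in the supply equation (1).

    Assumption (not from the text): y, w, u1 and u2 are mutually
    independent, so Var(p) and Cov(p, u1) follow directly from Eqn. (3).
    """
    delta = beta1 - beta2
    cov_p_u1 = -var_u1 / delta
    var_p = (delta2 ** 2 * var_y + eps2 ** 2 * var_w
             + var_u1 + var_u2) / delta ** 2
    return beta1 + cov_p_u1 / var_p


if __name__ == "__main__":
    # Hypothetical values with beta1 = 1.
    limit = ls_plim_supply_slope(beta1=1.0, beta2=-1.0, delta2=1.0, eps2=0.5,
                                 var_y=4.0, var_w=4.0, var_u1=1.0, var_u2=1.0)
    print(limit)
```

Under these assumptions the limit lies below $\beta_1$ whenever $\Delta > 0$, and the gap shrinks as the variation in y and w grows relative to the variances of the shocks.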
5. The Identifiability of Structural Parameters Having estimates of the reduced form parameters, one can try to make them yield estimators of the structural parameters. In the example this means trying to solve the estimated version of Eqns. (5) and (7) for estimators of the γ’s, β’s, δ and ε . Denote the LS estimators # # and (6) show that π# \π# of the π’s by π# . Equations (5) ## of "# is one estimator of β , and π# \π# is another. Either " #$ γ "$in Eqn. (1). These are them leads to an estimator of " indirect least squares (ILS) estimators. They are not unbiased, but they are consistent if the LS reduced form estimators are consistent. The supply parameters β and γ are identified, meaning that the data and the " " model reveal their values (subject to sampling variation). Indeed, they are oeridentified, because in small samples the two ways to estimate β from π# ’s yield " different results. (If either δ and ε were zero, there # # would be only one way, and the supply equation would be just identified. If both δ and ε were zero, # equation # there would be no way, and the supply would be unidentified.) Equations (5)–(7) have an infinite number of solutions for the demand parameters β , γ , # # δ and ε . Hence the data and the model do not reveal # # their values; they are unidentified. Another way to see this is to imagine price and quantity data being generated by intersections of the supply and demand curves in the pq plane. Ignoring shocks, the supply curve is fixed, but the demand curve shifts when y or w changes. Hence the intersections reveal the slope and position of the supply curve as the demand curve shifts. But they reveal nothing about the slope of the demand curve. The identifiability of parameters is crucial: they cannot be estimated if they are not identified. Reduced form parameters are usually identified, except in special cases. But the identifiability of structural parameters should be checked before one tries to estimate them. 
The least squares formula can be applied to an unidentified structural equation such as Eqn. (2), but the result is not an estimator of that equation. For a more detailed and general discussion, see Statistical Identification and Estimability.
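The overidentification of the supply slope can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the supply and demand forms, all parameter values, the distributions of y, w and the shocks, and the function names (simulate_market, ls_slopes, ils_supply_slope) are hypothetical choices, not taken from the article. It fits the two reduced form equations by LS and forms the two ILS ratios for the supply slope.

```python
import random


def simulate_market(n, seed=0):
    """Draw n observations (p, q, y, w) from a hypothetical supply/demand system."""
    rng = random.Random(seed)
    g1, b1 = 1.0, 0.5                      # assumed supply: q = g1 + b1*p + u1
    g2, b2, d2, e2 = 10.0, -1.0, 0.8, 0.5  # assumed demand: q = g2 + b2*p + d2*y + e2*w + u2
    rows = []
    for _ in range(n):
        y, w = rng.uniform(0, 10), rng.uniform(0, 10)  # predetermined variables
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)      # independent shocks
        p = (g2 - g1 + d2 * y + e2 * w + u2 - u1) / (b1 - b2)  # market-clearing price
        q = g1 + b1 * p + u1                                    # quantity on the supply curve
        rows.append((p, q, y, w))
    return rows


def ls_slopes(z, y, w):
    """LS slopes of z on y and w (intercept included), using deviations from means."""
    n = len(z)
    zm, ym, wm = sum(z) / n, sum(y) / n, sum(w) / n
    syy = sum((a - ym) ** 2 for a in y)
    sww = sum((b - wm) ** 2 for b in w)
    syw = sum((a - ym) * (b - wm) for a, b in zip(y, w))
    syz = sum((a - ym) * (c - zm) for a, c in zip(y, z))
    swz = sum((b - wm) * (c - zm) for b, c in zip(w, z))
    det = syy * sww - syw ** 2
    return (syz * sww - swz * syw) / det, (swz * syy - syz * syw) / det


def ils_supply_slope(rows):
    """Return the two ILS estimates of the supply slope: pi22/pi12 and pi23/pi13."""
    p, q, y, w = (list(col) for col in zip(*rows))
    pi12, pi13 = ls_slopes(p, y, w)  # reduced form equation for price
    pi22, pi23 = ls_slopes(q, y, w)  # reduced form equation for quantity
    return pi22 / pi12, pi23 / pi13


via_y, via_w = ils_supply_slope(simulate_market(5000))
print(via_y, via_w)
```

In a finite sample the two printed ratios differ from each other, which is what overidentification means here, while both should lie near the assumed supply slope of 0.5.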
6. Simultaneous Equations Estimation Methods
Many methods have been developed for estimating identifiable parameters of simultaneous structural equations without the bias and inconsistency of LS. Most of them exploit the assumed independence between shocks and predetermined variables. Some estimate one equation at a time, and others estimate the whole system at once.
7. Estimating One Structural Equation at a Time
The ILS method mentioned above is one way of estimating one equation that is part of a system. A second way is the instrumental variables (IV) method (Durbin 1954). LS and IV will be shown for Eqn. (1). Denote the deviation of each variable from its sample mean by an asterisk, for example, $p^* = p - \bar{p}$. Now rewrite Eqn. (1) in terms of deviations from sample means, thus eliminating the constant term $\gamma_1$; multiply it by $p^*$; sum it over all the sample observations; and divide the sum by $\sum p^{*2}$. The result is the LS estimator of $\beta_1$:
$$\hat{\beta}_1 = \frac{\sum q^* p^*}{\sum p^{*2}} = \beta_1 + \frac{\sum u_1^* p^*}{\sum p^{*2}} \qquad (8)$$
It is biased and inconsistent because the shock $u_1$ influences $p$ via Eqn. (3), so that the error term, the last term in Eqn. (8), does not equal zero in expectation or in probability limit. Now rewrite Eqn. (1) as before in terms of deviations from sample means, but this time multiply it by $y^*$, sum it over all observations, and divide by $\sum p^* y^*$. The result is an IV estimator of $\beta_1$:
$$\hat{\beta}_{1y} = \frac{\sum q^* y^*}{\sum p^* y^*} = \beta_1 + \frac{\sum u_1^* y^*}{\sum p^* y^*} \qquad (9)$$
If y is predetermined it is an instrument. In this case the IV estimator is consistent if $u_1$ has zero mean and constant variance and is serially independent, because then $\sum u_1^* y^*$ is zero in probability limit and $\sum p^* y^*$ is not. Similarly, if w is predetermined it is another instrument, and another IV estimator of $\beta_1$ is $\hat{\beta}_{1w} = \sum q^* w^* / \sum p^* w^*$. Clearly, these IV estimators are equivalent to the ILS estimators of $\beta_1$ obtained earlier. For example, the one based on the reduced form coefficients of y is
$$\frac{\hat{\pi}_{22}}{\hat{\pi}_{12}} = \frac{\sum q^* y^* / \sum y^{*2}}{\sum p^* y^* / \sum y^{*2}} = \frac{\sum q^* y^*}{\sum p^* y^*} = \hat{\beta}_{1y} \qquad (10)$$
Similarly, the other one is $\hat{\pi}_{23}/\hat{\pi}_{13} = \hat{\beta}_{1w}$. For a reduced form equation, IV is the same as LS, because the instruments are precisely the equation’s predetermined variables.
The two-stage least squares (2SLS) method (Theil 1972, Basmann 1957) is another way of estimating one structural equation at a time. For an overidentified equation it is superior to ILS or IV because it results in one estimator rather than two or more; for a just identified equation it reduces to ILS and IV. It first computes the LS estimator of the reduced form equation for each endogenous variable that appears on the right side of the structural equation to be
estimated (this is stage 1); it then replaces the observed data for those endogenous variables by the values calculated from the reduced form, and computes the LS estimator for the resulting equation (this is stage 2). For Eqn. (1), stage 1 is to compute the least squares estimators of the $\pi$’s in the price equation (3) of the reduced form; the second stage is to compute $\hat{p} = \hat{\pi}_{11} + \hat{\pi}_{12} y + \hat{\pi}_{13} w$, substitute this $\hat{p}$ for $p$ in (1), and compute the LS estimator $\sum q^* \hat{p}^* / \sum \hat{p}^{*2}$, which is the 2SLS estimator of $\beta_1$. 2SLS is an IV estimator; its instruments are the observed values of the equation’s predetermined variables and the reduced-form-calculated values of the endogenous variables from the equation’s right side. 2SLS is a generalized method of moments (GMOM) estimator (Hansen 1982). LS and IV are plain MOM estimators.
The k-class estimator (Nagar 1959) includes as special cases LS, 2SLS, and the limited information maximum likelihood estimator (LIML) (Anderson and Rubin 1949, 1950). LIML is similar to 2SLS (the two are the same for infinitely large samples), but it has largely been displaced by 2SLS because it requires iterative computations which 2SLS does not, and because it often has a larger variance and sometimes yields outlandish estimates (Kadane 1971, Theil 1972).
For an identified structural equation, ILS, IV, LIML, and 2SLS are consistent if the model’s explanatory variables are predetermined and the shocks have zero means and constant variances and covariances and are independent across time. An advantage of these methods is that, unlike LS, if applied to an unidentified structural equation they fail to yield estimates.
The Bayesian method of moments (BMOM) method (Zellner 1998) obtains estimators of linear reduced form and structural equations without making assumptions about the likelihood function of the data.
It assumes that the posterior expectations of the shocks given the data are uncorrelated with the predetermined variables, and that the differences between actual and estimated shocks have a covariance matrix of a particular form. Zellner shows how to find optimum BMOM estimators for several different loss functions, including a precision loss function which is a weighted sum of squares and cross-products of errors. For this loss function, the optimum BMOM estimator of an identified structural equation belongs to the k-class; it turns out to be LS if the ratio of the sample size to the number of predetermined variables in the model is 2, and it approaches 2SLS in the limit as this ratio grows without limit.
When a structural equation is overidentified, restrictions are placed on the reduced form parameters (such as $\pi_{22}/\pi_{12} = \pi_{23}/\pi_{13}$ in the example, because both ratios must equal $\beta_1$), but LS estimators of the reduced form ignore this information. Better estimators of the reduced form, because they use this information, can then be obtained by solving the estimated structure (see Sect. 3).
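The contrast between LS, IV, and 2SLS for the supply slope can be sketched numerically. The following is a minimal illustration, not the article’s own computation: the data-generating system, its parameter values, and the helper names (simulate_market, dev, dot, supply_slope_estimates) are all hypothetical assumptions. It computes the LS estimator in the form of Eqn. (8), the IV estimator with instrument y in the form of Eqn. (9), and the two stages of 2SLS.

```python
import random


def simulate_market(n, seed=0):
    """Draw n observations (p, q, y, w) from a hypothetical supply/demand system."""
    rng = random.Random(seed)
    g1, b1 = 1.0, 0.5                      # assumed supply: q = g1 + b1*p + u1
    g2, b2, d2, e2 = 10.0, -1.0, 0.8, 0.5  # assumed demand: q = g2 + b2*p + d2*y + e2*w + u2
    rows = []
    for _ in range(n):
        y, w = rng.uniform(0, 10), rng.uniform(0, 10)  # predetermined variables
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)      # independent shocks
        p = (g2 - g1 + d2 * y + e2 * w + u2 - u1) / (b1 - b2)
        q = g1 + b1 * p + u1
        rows.append((p, q, y, w))
    return rows


def dev(x):
    """Deviations of x from its sample mean."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def dot(a, b):
    """Sum of element-wise products."""
    return sum(s * t for s, t in zip(a, b))


def supply_slope_estimates(rows):
    """Return (LS, IV using y, 2SLS) estimates of the supply slope."""
    p, q, y, w = (dev(list(col)) for col in zip(*rows))
    ls = dot(q, p) / dot(p, p)    # LS, as in Eqn. (8)
    iv_y = dot(q, y) / dot(p, y)  # IV with instrument y, as in Eqn. (9)
    # Stage 1: LS of p on y and w in deviation form, then fitted values.
    det = dot(y, y) * dot(w, w) - dot(y, w) ** 2
    c_y = (dot(y, p) * dot(w, w) - dot(w, p) * dot(y, w)) / det
    c_w = (dot(w, p) * dot(y, y) - dot(y, p) * dot(y, w)) / det
    p_hat = [c_y * a + c_w * b for a, b in zip(y, w)]
    # Stage 2: LS of q on the fitted values of p.
    tsls = dot(q, p_hat) / dot(p_hat, p_hat)
    return ls, iv_y, tsls


ls, iv_y, tsls = supply_slope_estimates(simulate_market(5000))
print(ls, iv_y, tsls)
```

Under these assumptions the shock u1 is negatively correlated with p, so the LS figure should fall noticeably below the assumed slope of 0.5, while the IV and 2SLS figures should lie close to it.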
8. Estimating a Complete System
If in an identified complete simultaneous equations system the shocks are normally distributed and other suitable conditions are satisfied, an asymptotically efficient but computationally complex method of estimating all the parameters of a complete system is the full information maximum likelihood (FIML) method (Koopmans 1950, Durbin 1988). It involves maximizing the joint likelihood function of all the data and parameters, and requires iterative computations. It has largely been displaced by the three stage least squares (3SLS) method (Zellner and Theil 1962), which is also asymptotically efficient but is much easier to compute. 3SLS is an application of the GLS method. 3SLS gets the required information on the shocks’ covariances from 2SLS estimates of the shocks; hence the name 3SLS.
9. Cross-section vs. Time Series Studies
So far the treatment has concerned time series data, where the observations describe the same country, city, family, firm, or what not, at successive periods of time. The same type of analysis is appropriate, with obvious modifications, for cross-section data, where the observations describe different countries, firms, or what not, at a single point of time. The analysis has also been extended to panel data, that is, a time series of cross-sections.
10. The Decline of Simultaneous Equations Models
In recent years the econometric literature has paid diminishing attention to simultaneous equations models and their identifiability. Many modern textbooks give these topics little space, and that only near the end of the book (Davidson and MacKinnon 1993). Perhaps one reason is that modern computers can handle the estimation of nonlinear models, for which identification criteria are much more complicated and for which lack of identifiability is a much rarer problem.
11. The Problem of Choosing and Testing a Model
Thus far it has been presumed that the model being estimated is an accurate representation of the real-world process that actually generates the data. This is far from certain; at best it is likely to be only approximately true. Any model has been chosen by someone, perhaps with the aid of economic theory
about the maximization of profit or utility, or perhaps because it was suggested by previously observed data. But there is no guarantee that it contains the right variables, or that their endogeneity or exogeneity is correctly stated, or that the correct number of lagged variables has been chosen, or that its equations have the right mathematical form, or that the assumed distribution of its shocks is correct. One can perform diagnostic tests to see whether the estimated model fits past data well, and whether its calculated past shocks have constant variance and are free of any obvious systematic behavior. But this does not assure that it will do well with future data. In my view, the most stringent and most important test of a model is to expose it to data that were not available when the model was formulated. If it does not describe these data well, it leaves something to be desired. The more new data it can describe well, the more confidence one can have in it. But at any point the best that can be said about a model is that it has done a good job of describing the data that are available thus far.
See also: Instrumental Variables in Statistics and Econometrics; Linear Hypothesis; Simultaneous Equation Estimates (Exact and Approximate), Distribution of; Statistical Identification and Estimability
Bibliography
Anderson T W, Rubin H 1949 Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20: 46–63
Anderson T W, Rubin H 1950 The asymptotic properties of estimates of parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 21: 570–82
Basmann R L 1957 A generalized classical method of linear estimation of coefficients in a structural equation. Econometrica 25: 77–83
Christ C F (ed.) 1994 Simultaneous Equations Estimation. Edward Elgar, Aldershot, UK
Davidson R, MacKinnon J G 1993 Estimation and Inference in Econometrics. Oxford University Press, New York
Durbin J 1954 Errors in variables. Review of the International Statistical Institute 22: 23–32
Durbin J 1988 Maximum likelihood estimation of the parameters of a system of simultaneous regression equations. Econometric Theory 4: 159–70
Haavelmo T 1943 The statistical implications of a system of simultaneous equations. Econometrica 11: 1–12
Haavelmo T 1944 The probability approach in econometrics. Econometrica 12(suppl.): 1–115
Hansen L P 1982 Large sample properties of generalized method of moments estimators. Econometrica 50: 1029–54
Hausman J A 1983 Specification and estimation of simultaneous equation models. In: Griliches Z, Intriligator M D (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. 1, pp. 391–448
Kadane J B 1971 Comparison of k-class estimators when the disturbances are small. Econometrica 39: 723–7
Koopmans T C (ed.) 1950 Statistical Inference in Dynamic Economic Models. Cowles Commission Monograph 10. Wiley, New York
Lucas R E Jr 1976 Econometric policy evaluation: a critique. Carnegie-Rochester Conference Series on Public Policy 1: 19–46
Nagar A L 1959 The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations. Econometrica 27: 575–95
Strotz R H, Wold H O A 1960 A triptych on causal chain systems. Econometrica 28: 417–63
Theil H 1972 Principles of Econometrics. Wiley, New York
Zellner A 1998 The finite sample properties of simultaneous equations’ estimates and estimators Bayesian and non-Bayesian approaches. Journal of Econometrics 83: 185–212
Zellner A, Theil H 1962 Three stage least squares: simultaneous estimation of simultaneous equations. Econometrica 30: 54–78
C. F. Christ
Single-case Experimental Designs in Clinical Settings
1. Differing Research Traditions
Two major research traditions have advanced the behavioral sciences. One is based on the hypothetico-deductive approach to scientific reasoning where a hypothesis is constructed and tested to see if a phenomenon is an instance of a general principle. There is a presumed a priori understanding of the relationship between the variables of interest. These hypotheses are tested using empirical studies, generally using multiple subjects, and data are collected and analyzed using group experimental designs and inferential statistics. This tradition represents current mainstream research practice for many areas of behavioral science. However, there is another method of conducting research that focuses intensive study on an individual subject and makes use of inductive reasoning. In this approach, one generates hypotheses from a particular instance or the accumulation of instances in order to identify what might ultimately become a general principle. In practice, most research involves elements of both traditions, but the extensive study of individual cases has led to some of the most important discoveries in the behavioral sciences and therefore has a special place in the history of scientific discovery.
Single-case or single-subject design (also known as ‘N of one’) research is often employed when the researcher has limited access to a particular population and can therefore study only one or a few subjects. This method is also appropriate when one wishes to make an intensive study of a phenomenon in order to examine the conditions that maximize the strength of an effect. In clinical settings one is frequently interested in describing which variables affect a particular individual rather than in trying to infer what might be important from studying groups of subjects and assuming that the average effect for a group is the same as that observed in a particular subject. In the current era of increased accountability in clinical settings, single-case design can be used to demonstrate treatment effects for a particular clinical case (see Hayes et al. 1999 for a discussion of the changing research climate in applied settings).
2. Characteristics of Single-case Design
Single-case designs study intensively the process of change by taking many measures on the same individual subject over a period of time. The degree of control in single-case design experiments can often lead to the identification of important principles of change or lead to a precise understanding of clinically relevant variables in a specific clinical context. One of the most commonly used approaches in single-case design research is the interrupted time-series. A time-series consists of many repeated measures of the same variable(s) on one subject, while measuring or characterizing those elements of the experimental context that are presumed to explain any observed change in behavior. An interrupted time-series is analyzed by examining characteristics of the data before and after the experimental manipulation to look for evidence that the independent variable alters the dependent variable because of the interruption.
3. Specific Types of Single-case Designs
Single-case designs frequently make use of a graphical representation of the data. Repeated measures on the dependent variable take place over time, so the abscissa (X-axis) on any graph represents some sort of time scale. The dependent measure or behavior presumably being altered by the treatment is plotted on the ordinate (Y-axis). There are many variations of single-case designs that are well described (e.g., Franklin et al. 1997, Hersen and Barlow 1976, Kazdin 1982), but the following three approaches provide an overview of the general methodology.
3.1 The A-B-A Design
The classic design that is illustrative of this approach is called the A-B-A design (Hersen and Barlow 1976, pp. 167–97), where the letters refer to one or another experimental or naturally observed condition presumed to be associated with the behavior of importance. A hypothetical example is shown in Fig. 1. By convention, A refers to a baseline or control condition and the B indicates a condition where the behavior is expected to change. The B condition can be controlled by the experimenter or alternatively can result from some naturally occurring change in the environment (though only the former are true experiments). The second A in the A-B-A design indicates a return to the original conditions and is the primary means by which one infers that the B condition was the causal variable associated with any observed change from baseline in the target or dependent variable. Without the second A phase, there are several other plausible explanations for changes observed during the B phase besides the experimental manipulation. These include common threats to internal validity such as maturation or intervening historical events coincidental to initiating B.
[Figure 1. A-B-A design. Y-axis: counts of disruptive behavior; X-axis: series of class periods; phases: baseline, change in contingent attention, return to baseline.]
To illustrate the clinical use of such an approach, consider the data points in the first A to indicate repeated observations of a child under baseline conditions. The dependent variable of interest is the number of disruptive behaviors per class period that the child exhibits. A baseline of disruptive behaviors is observed and recorded over several class periods. During the baseline observations, the experimenter hypothesizes that the child’s disruptive behavior is under the control of contingent attention by the teacher. That is, when the child is disruptive, the teacher pays attention (even negative attention) to the child that unintentionally reinforces the behavior, but when the child is sitting still and is task oriented, the teacher does not attend to the child.
The experimenter then trains the teacher to ignore the disruptive behavior and contingently attend to the child (e.g., praise or give some token that is redeemable for later privileges) when the child is emitting behavior appropriate to the classroom task. After training the teacher to differentially reinforce appropriate classroom behavior, the teacher is instructed to implement these procedures and the B phase begins. Observations are made and data are gathered and plotted. A decrease in the number of disruptive behaviors is apparent during the B phase. To be certain that the decrease in disruptive behaviors is not due to some extraneous factor, such as the child being disciplined at home or simply maturing, the original conditions are reinstated. That is, the teacher is instructed to discontinue contingently attending to task-appropriate behavior and return to giving corrective attention when the child is disruptive. This second A phase is a return to the original conditions. The plot indicates that the number of disruptions returns to baseline (i.e., returns to its previous high level). This return to baseline in the second A condition is evidence that the change observed from baseline condition A to the implementation of the intervention during the B phase was under the control of the teacher’s change in contingent attention rather than some other factor. If another factor were responsible for the change, then the re-implementation of the original baseline conditions would not be expected to result in a return to the original high levels of disruptive behavior.
There are many variations of the A-B-A design. One could compare two or more interventions in an A-B-A-C-A design. Each A indicates some baseline condition, the B indicates one type of intervention, and the C indicates a different intervention. By examining differences in the patterns of responding for B and C, one can make treatment comparisons. The notation of BC together, as in A-B-A-C-A-BC-A, indicates that treatments B and C were combined for one phase of the study.
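The reversal logic of the A-B-A example can be made concrete with a short summary of phase means. This is only a rough sketch: the counts are invented, the function names (phase_means, shows_reversal) are hypothetical, and comparing phase means is a deliberately crude stand-in for the judgment a researcher would apply to the full plotted series.

```python
from itertools import groupby
from statistics import mean


def phase_means(counts, phases):
    """Mean count within each consecutive run of identical phase labels."""
    runs = groupby(zip(phases, counts), key=lambda pair: pair[0])
    return [(label, mean([value for _, value in run])) for label, run in runs]


def shows_reversal(means):
    """True if the B phase mean lies below both surrounding A phase means."""
    (_, first_a), (_, b), (_, second_a) = means
    return b < first_a and b < second_a


# Hypothetical counts of disruptive behavior per class period.
counts = [9, 10, 8, 9, 4, 3, 2, 3, 8, 9, 10, 9]
phases = ["A"] * 4 + ["B"] * 4 + ["A"] * 4

summary = phase_means(counts, phases)
print(summary, shows_reversal(summary))
```

Grouping by consecutive runs rather than by label keeps the two A phases separate, which matters because the inference rests on the second A phase returning toward the first.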
3.2 The Multiple-baseline Design
An A-B-A or A-B-A-C design assumes that the treatment (B or C) can be reversed during the subsequent A period. Sometimes it is impossible or unethical to reinstitute the original baseline conditions (A). In these cases, other designs can be used. One such approach is a multiple-baseline design. In a multiple-baseline design, baseline data are gathered across several environments (or behaviors). Then a treatment is introduced in one environment. Data continue to be gathered in selected other environments. Subsequently the treatment is implemented in each of the other environments, one at a time, and changes in target behavior observed (Poling and Grossett 1986). In the earlier example, if the child had been exhibiting disruptive behavior in math, spelling, and social studies classes, the same analysis of the problem might be applied. The teachers or the researcher might not be willing to reinstitute the baseline conditions if improvements were noted, because doing so would disrupt the learning of others and would not be in the best interests of the child.
[Figure 2. Example of multiple baseline design with same behaviour across different settings.]
Figure 2 shows how a multiple-baseline design might be implemented and the data presented visually. Baseline data are shown for each of the three classroom environments. While the initial level of disruptive behavior might be slightly different in each class, it is still high. The top graph in Fig. 2 shows the baseline number of disruptive behaviors in the math class. At the point where the vertical dotted line appears, the intervention is implemented and its effects appear to the right of the dotted line. The middle and bottom graphs show the same disruptive behaviors for the spelling and social studies classes. No intervention has yet been implemented and the disruptive behavior remains high and relatively stable in each of these two classrooms. This suggests that the changes seen in the math class were not due to some other cause outside of the school environment or some general policy change within the school, since the baseline conditions and frequency of disruptive behaviors are unchanged in the spelling and social studies classes. After four more weeks of baseline observation,
the same shift in attention treatment is implemented in the spelling class, but still not in the social studies class. The amount of disruptive behavior decreases following the treatment implementation in the spelling class, but not the social studies class. This second change from baseline is a replication of the effect of treatment shown in the first classroom and is further evidence that the independent variable is the cause of the change. There is no change in the social studies class behavior. This observation provides additional evidence that the independent variable rather than some extraneous variable is responsible for the change in the behavior of interest. Finally, the treatment is implemented in the social studies class with a resulting change paralleling those occurring in the other two classes when they were changed. Rather than having to reverse the salutary effects of a successful treatment as one would in an A-B-A reversal design, a multiple baseline design allows for successively extending the effects to new contexts as a means of demonstrating causal control. Multiple-baseline designs are used frequently in clinical settings because they do not require reversal of beneficial effects to demonstrate causality. Though Fig. 2 demonstrates control of the problem behavior by implementing changes sequentially across multiple settings, one could keep the environment constant and study the effects of a hypothesized controlling variable by targeting several individual behaviors. For example, a clinician could use this design to test whether disruptive behavior in an institutionalized patient could be reduced by sequentially reinforcing more constructive alternatives. A treatment team might target aggressive behavior, autistic speech, and odd motor behavior when the patient is in the day room. 
After establishing a baseline for each behavior, the team could intervene first on the aggressive behavior while keeping the preexisting contingencies in place for the other two behaviors. After the aggressive behavior has changed (or a predetermined time has elapsed), the autistic speech behavior in the same day room could be targeted, and so on for the odd motor behavior. The logic of the application of a multiple-baseline design is the same whether one studies the same behavior across different contexts or different behaviors across the same context. To demonstrate causal control, the behavior should change only when it is the target of a specific change strategy and other behaviors not targeted should remain at their previous baselines.
Interpretation can be difficult in multiple-baseline designs because it is not always easy to find behaviors that are functionally independent. Again, consider the disruptive classroom behavior described earlier. Even if one’s analysis that teacher attention reinforces disruptive behavior is correct, it may be difficult to show change in only the targeted behavior in a specific classroom. The child may learn that alternative behaviors can be reinforced in the other classrooms and the child may alter his or her behavior in ways that alter the teacher reactions, even though the teachers in the other baseline classes did not intend to change their behavior. Nevertheless, this design addresses many ethical and practical concerns about A-B-A (reversal) designs.
[Figure 3. Example from an Alternating Treatment Design (ATD).]
3.3 Alternating Treatment Design
The previous designs could be referred to as within-series designs because there is a series of observations within which the conditions are constant, then an intervention with another series of observations where the new conditions are constant for the duration of that series, and so on. The interpretation of results is based on what appears to be happening within a series of points compared with what is happening within another series. Another variation of the single-case design that is becoming increasingly recognized for its versatility is the alternating treatment design (ATD) (Barlow and Hayes 1979). The ATD can be thought of as a between-series design. It is a method of answering a number of important questions such as assessing the impact of two or more different treatment variations in a psychotherapy session. In an ATD, the experimenter rapidly and frequently alternates between two (or more) treatments on a random basis during a series and measures the response to the treatments on the dependent measure. This is referred to as a ‘between-series’ design because the data are arranged by treatments first and order second rather than by order first, where treatments vary only when series change. There may be no series of one particular type of treatment, although the same treatment may occur several times in a row.
To illustrate how an ATD might be applied, consider a situation where a therapist is trying to determine which of two ways of relating to a client produces more useful self-disclosure. One way of relating is to ask open-ended questions and the other is to ask direct questions. The dependent variable is a rating of clinically useful responsiveness from the client. The therapist would have a random order for asking one of the two types of questions. The name of the design does not mean that the questions alternate back and forth between two conditions on each subsequent therapist turn, but that they change randomly between treatments. One might observe the data in Fig. 3. The X-axis is the familiar time variable and the Y-axis the rating of the usefulness of the client response. The treatment conditions are shown as separate lines. Note that points for each condition are plotted according to the order in which they were observed, but that the data from each condition are joined as if they were collected in one continuous series (even though the therapists asked the questions in the order of O-C-O-O-O-C-C-O …). The data in Fig. 3 indicate that both methods of asking questions produce a trend for the usefulness of the information to increase as the session goes on. However, the open-ended format produces an overall higher degree of utility.
4. Analyses
There are four characteristics of a particular series that can be compared with other series. The first is the level of the dependent measure in a series. One way in which multiple series may differ is with respect to the stable level of a variable. The second way in which multiple series may differ is in the trend each series shows. The third characteristic that can be observed is the shape of responses over time. This is also referred to as the course of the series and can include changes in cyclicity. The last characteristic that could vary is the variability of responses over time. Figure 4 shows examples of how these characteristics might show treatment effects. One might observe changes in more than one of these characteristics within and across series.
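Three of these four characteristics (level, trend, and variability) have simple numerical summaries, sketched below under explicit assumptions: level is taken as the phase mean, trend as the least squares slope against observation order, and variability as the sample standard deviation. The function name describe_series and the data are hypothetical, and the shape or course of a series is not captured by these summaries.

```python
from statistics import mean, stdev


def describe_series(values):
    """Return (level, trend, variability) for one series of observations."""
    times = range(len(values))  # observation order 0, 1, 2, ...
    level = mean(values)
    t_bar = mean(times)
    # Least squares slope of the values against observation order.
    slope = (sum((t - t_bar) * (v - level) for t, v in zip(times, values))
             / sum((t - t_bar) ** 2 for t in times))
    return level, slope, stdev(values)


# Hypothetical baseline and treatment series for one subject.
baseline = [9, 10, 8, 9, 10]
treatment = [6, 5, 4, 3, 2]

print(describe_series(baseline))
print(describe_series(treatment))
```

Comparing the two printed tuples side by side separates a change in level from a change in trend, two effects that can be confused when a series is summarized by its mean alone.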
4.1 Visual Interpretation of Results
In single-case design research, there has been a long tradition of interpreting the data visually once they have been graphed. In part, this preference emerged because those doing carefully controlled single-case research were looking for large effects that should be readily observable without the use of statistical approaches. Figure 4 depicts examples of clear effects. They demonstrate the optimal conditions for inferring a treatment effect: a stable baseline and clearly different responses during the post-baseline phase. Suggestions for how to interpret graphical data have been described by Parsonson and Baer (1978).
[Figure 4. Examples of changes observed in single-case designs.]
However, there are many circumstances where patterns of results are not nearly so clear and errors in interpreting results are more common. When there are outlier data points (points which fall unusually far away from the central tendency of other points in the phase), it becomes more difficult to determine reliably if an effect exists by visual inspection alone. If there are carry-over effects from one phase to another that the experimenter may not initially anticipate, results can be difficult to interpret. Carry-over is observed when the effects of one phase do not immediately disappear when the experiment moves into another phase. For example, the actions of a drug may continue long after a subject stops taking the drug, either because the body metabolizes the drug slowly or because some irreversible learning changes occurred as a result of the drug effects. It is also difficult to interpret graphical results when there is a naturally occurring cyclicity to the data. This might occur when circumstances not taken into account by the experimenter lead to systematic changes in behaviors that might be interpreted as treatment effects. A comprehensive discussion of issues of visual analysis of graphical displays is provided by Franklin et al. (1996).
4.2 Traditional Measurement Issues
Any particular observation in a series is influenced by three sources of variability. The experimenter intends to identify systematic variability due to the intervention. However, two other sources of variability in an observation must also be considered. These are variability attributable to measurement error and variability due to extraneous sources, such as how well the subject is feeling at the time of a particular observation. Measurement error refers to how well the score on the measurement procedure used as the dependent measure represents the construct of interest. A detailed discussion of the reliability of measurement entails some complexities that can be seen intuitively if one compares how accurately one could measure two different dependent measures. If
the experimenter were measuring the occurrence of head-banging behavior in an institutionalized subject, frequency counts would be relatively easy to note accurately once there was some agreement by raters on what constituted an instance of the target behavior. There would still be some error, but it would be less than if one were trying to measure a dependent measure such as enthusiasm in a classroom. The construct of enthusiasm will be less easily characterized and have more error associated with its measurement and therefore be more likely to have higher variability. As single-case designs are applied to new domains of dependent measures such as psychotherapy research, researchers are increasingly attentive to traditional psychometric issues, including internal and external validity as well as to reliability. Traditionally, if reliability was considered in early applications of single-case designs, the usual method was to report interrater reliability as the percent agreement between two raters rating the dependent measure (Page and Iwata 1986). Often these measures were dichotomously scored counts of behaviors (e.g., a behavior occurred or did not occur). Simple agreement rates can produce inflated estimates of reliability. More sophisticated measures of reliability are being applied to single-case design research including kappa (Cohen 1960) and variations of intraclass correlation coefficients (Shrout and Fleiss 1979) to name a few. 4.3 Statistical Interpretation of Results While the traditional method of analyzing the results of single-case design research has been visual inspection of graphical representation of the data, there is a growing awareness of the utility of statistical verification of assertions of a treatment effect (Kruse and Gottman 1982). Research has shown that relying solely on visual inspection of results (particularly
when using response guided experimentation) may result in high Type I error rates (incorrectly concluding that a treatment effect exists when it does not). However, exactly how to conduct statistical analyses of time-series data appropriately is controversial. If the data points in a particular series were individual observations from separate individuals, then traditional parametric statistics including the analysis of variance, t-tests, or multiple regression could all be used if the appropriate statistical assumptions were met (e.g., normally distributed scores with equal variances). All of these statistical methods are variations of the general linear model. The most significant assumption in the general linear model is that residual errors are normally distributed and independent. A residual score is the difference between a particular score and the score predicted from a regression equation for that particular value of the predictor variable. The simple correlation between a variable and its residual has an expected value of 0. In single-case designs, it can be the case that any particular score for an individual is related to (correlated with) the immediately preceding observation. If scores are correlated with their immediately preceding score (or following score), then they are said to be serially dependent or autocorrelated. Autocorrelation violates a fundamental principle of traditional parametric statistics. An autocorrelation coefficient is a slightly modified version of the Pearson correlation coefficient (r). Normally r is calculated using pairs of observations (X_i, Y_i) for several subjects. The autocorrelation coefficient (ρk) is the correlation between a score or observation and the next observation (Y_i, Y_{i+1}). When this correlation is calculated between an observation and the next observation, it is called a lag 1 autocorrelation.
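As a rough illustration of the lag 1 autocorrelation just described, the following Python sketch computes a lag k autocorrelation for a single series. The function name `autocorrelation` and the example data are hypothetical, and the estimator shown (deviations from the overall series mean, divided by the total sum of squared deviations) is one common convention assumed here; it is not taken from the text.

```python
def autocorrelation(series, lag=1):
    """Estimate the lag-k autocorrelation of a single series of observations.

    Assumed estimator: products of deviations from the overall mean, taken
    over pairs (Y_i, Y_{i+lag}), divided by the total sum of squared deviations.
    """
    n = len(series)
    if lag < 1 or lag >= n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    deviations = [y - mean for y in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("series has no variability")
    numerator = sum(deviations[i] * deviations[i + lag] for i in range(n - lag))
    return numerator / denominator


# Hypothetical data: a steadily rising series and a rapidly alternating one.
rising = [1, 2, 3, 4, 5]
alternating = [1, -1, 1, -1]
print(autocorrelation(rising))       # positive: each score resembles the one before it
print(autocorrelation(alternating))  # negative: scores flip from one observation to the next
```

Passing a larger `lag` (for example 12 or 24) gives the longer-lag coefficients, although a short clinical series leaves few pairs from which to estimate them.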
The range of the autocorrelation coefficient is from −1.0 to +1.0. If the autocorrelation is 0, it indicates that there is no serial dependence in the data and the application of traditional statistical procedures can reasonably proceed. As ρk increases, there is increasing serial dependence. As ρk becomes negative, it indicates that there is a rapid cyclicity pattern. A lag can be extended out past 1 and interesting patterns of cyclicity may emerge. In economics, a large value of ρk at a lag 12 can indicate an annual pattern in the data. In physiological psychology, lags at 12 or 24 may indicate a systematic diurnal or circadian rhythm in the data. Statistical problems emerge when there is an autocorrelation among the residual scores. As the value of ρk increases (i.e., there is a positive autocorrelation) for the residuals, the computed values of many traditional statistics are inflated, and Type I errors occur at a higher than expected rate. Conversely, when residuals show a negative autocorrelation, the resulting statistics will be too small, producing misleading significance testing in the opposite direction. Suggestions for how to avoid problems of autocorrelation
include the use of distribution-free statistics such as randomization or permutation tests. However, if the residuals are autocorrelated even these approaches may not give fully accurate probability estimates (Gorman and Allison 1996, p. 172). There are many proposals for addressing the autocorrelation problem. For identifying patterns of cyclicity, even where cycles are embedded within other cycles, researchers have made use of spectral analysis techniques, including Fourier analysis. These are useful where there may be daily, monthly, and seasonal patterns in data such as one might see in changes in mood over long periods of observation. The most sophisticated of these approaches is called the autoregressive integrated moving average analysis or ARIMA method (Box and Jenkins 1976). This is a regression approach that tries to model the value of particular observation as a function of the parameters estimating the effects of past observations (φ), moving averages of the influence of residuals on the series (θ), and the removal of systematic trends throughout the entire time series. The computation of ARIMA models is a common feature of most statistical computer programs. Its primary limitation is that it requires many data points, often more than are practical in applied or clinical research settings. Additionally, there may be several ARIMA models that fit any particular data set, making results difficult to interpret. Alternative simplifications of regression-based analyses are being studied to help solve the interpretive problems of using visual inspection alone as a means of identifying treatment effects (Gorman and Allison 1996).
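One of the distribution-free options mentioned above, a randomization test, can be sketched for a simple A-B series as follows. This is a minimal illustration under stated assumptions, not a method prescribed by the text: it assumes the intervention start point was chosen at random before data collection from all start points that leave a minimum number of observations in each phase, and the function names, the `min_phase_length` parameter, and the data are hypothetical. As noted above, autocorrelated residuals can still distort the resulting probability.

```python
def mean(values):
    return sum(values) / len(values)


def phase_difference(series, start):
    """Mean of the treatment phase minus mean of the baseline phase."""
    return mean(series[start:]) - mean(series[:start])


def randomization_test(series, actual_start, min_phase_length=3):
    """One-sided randomization test for an A-B series.

    Assumes the intervention start was drawn at random, in advance, from every
    start point leaving at least `min_phase_length` observations per phase.
    """
    candidates = range(min_phase_length, len(series) - min_phase_length + 1)
    if actual_start not in candidates:
        raise ValueError("actual_start is not an admissible start point")
    observed = phase_difference(series, actual_start)
    # Count the admissible start points whose statistic is at least as large.
    as_extreme = sum(1 for s in candidates if phase_difference(series, s) >= observed)
    return observed, as_extreme / len(candidates)


# Hypothetical data: five baseline observations, then five treatment observations.
scores = [2, 3, 2, 3, 2, 8, 9, 8, 9, 8]
observed, p_value = randomization_test(scores, actual_start=5)
print(observed, p_value)
```

With only five admissible start points the smallest attainable probability is 1/5, which illustrates why short series limit what such tests can show.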
4.4 Generalization of Findings and Meta-analysis Although single-case designs can allow for convincing demonstrations of experimental control in a particular case, it is difficult to know how to generalize from single-case designs to other contexts. Single-case design researchers are often interested in specific effects, in a specific context, for a specific subject. Group designs provide information about average effects, across a defined set of conditions, for an average subject. Inferential problems exist for both approaches. For the single-case design approach, the question is whether results apply to any other set of circumstances. For the group design approach, the question is whether the average result observed for a group applies to a particular individual in the future or even if anyone in the group itself showed an average effect. In the last two decades of the twentieth century, meta-analysis has come to be used as a means of aggregating results across large numbers of studies to summarize the current state of knowledge about a particular topic. Most of the statistical work has focused on aggregating across group design studies.
More recently, meta-analysts have started to address how to aggregate the results from single-case design studies in order to try to increase the generalizability from this type of research. There are still significant problems in finding adequate ways to reflect changes in slopes (trends) between phases as an effect size statistic, as well as ways to correct for the effects of autocorrelation in meta-analysis just as there are for the primary studies upon which a meta-analysis is based. There is also discussion of whether and how single-case and group design results could be combined (Panel on Statistical Issues and Opportunities for Research in the Combination of Information, 1992).
5. Summary Technical issues aside, the difference between the goal of precision and control typified in single-case design versus the interest in estimates of overall effect sizes that are characteristic of those who favor group designs makes for a lively discussion about the importance of aggregation across studies for the purpose of generalization. With analytic methods for single-case design research becoming more sophisticated, use of this research strategy to advance science is likely to expand as researchers search for large effects under well specified conditions. See also: Behavior Analysis, Applied; Case Study: Logic; Case Study: Methods and Analysis; Case-oriented Research; Experimental Design: Overview; Experimental Design: Randomization and Social Experiments; Experimenter and Subject Artifacts: Methodology; Hypothesis Testing: Methodology and Limitations; Laboratory Experiment: Methodology; Medical Experiments: Ethical Aspects; Psychotherapy: Case Study; Quasi-Experimental Designs; Reinforcement, Principle of; Single-subject Designs: Methodology
Bibliography
Barlow D H, Hayes S C 1979 Alternating treatments design: One strategy for comparing the effects of two treatments in a single subject. Journal of Applied Behavior Analysis 12: 199–210
Box G E P, Jenkins G M 1976 Time Series Analysis, Forecasting, and Control. Holden-Day, San Francisco
Cohen J A 1960 A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20: 37–46
Franklin R D, Allison D B, Gorman B S 1997 Design and Analysis of Single-case Research. Lawrence Erlbaum Associates, Mahwah, NJ
Franklin R D, Gorman B S, Beasley T M, Allison D B 1996 Graphical display and visual analysis. In: Franklin R D, Allison D B, Gorman B S (eds.) Design and Analysis of Single-case Research. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 119–58
Gorman B S, Allison D B 1996 Statistical alternatives for single-case designs. In: Franklin R D, Allison D B, Gorman B S (eds.) Design and Analysis of Single-case Research. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 159–214
Hayes S C, Barlow D H, Nelson-Gray R O 1999 The Scientist-practitioner: Research and Accountability in the Age of Managed Care, 2nd edn. Allyn and Bacon, Boston
Hersen M, Barlow D H 1976 Single Case Experimental Designs: Strategies for Studying Behavior Change. Pergamon Press, New York
Kazdin A E 1982 Single Case Research Designs: Methods for Clinical and Applied Settings. Oxford University Press, New York
Kruse J A, Gottman J M 1982 Time series methodology in the study of sexual hormonal and behavioral cycles. Archives of Sexual Behavior 11: 405–15
Page T J, Iwata B A 1986 Interobserver agreement: History, theory, and current methods. In: Poling A, Fuqua R W (eds.) Research Methods in Applied Behavior Analysis: Issues and Advances. Plenum, New York, pp. 99–126
Panel on Statistical Issues and Opportunities for Research in the Combination of Information 1992 Contemporary Statistics: Statistical Issues and Opportunities for Research. National Academy Press, Washington, DC, Vol. 1
Parsonson B S, Baer D M 1978 The analysis and presentation of graphic data. In: Kratochwill T R (ed.) Single-subject Research: Strategies for Evaluating Change. Academic Press, New York, pp. 101–65
Poling A, Grossett D 1986 Basic research designs in applied behavior analysis. In: Poling A, Fuqua R W (eds.) Research Methods in Applied Behavior Analysis: Issues and Advances. Plenum, New York
Shrout P E, Fleiss J L 1979 Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin 86: 420–8
W. C. Follette
Single-subject Designs: Methodology Science is about knowledge and understanding and is fundamentally a human enterprise. Adequate methodology is at the heart of scientific activity for it allows scientists to answer fundamental epistemic questions about what is known or knowable about their subject matter. This includes the validity of data-based conclusions, the generality of findings, and how knowledge may be used to achieve practical goals. Choice of methodology is critical in answering such questions with precision and scope, and the adequacy of a given methodology is conditional upon the purposes for which it is put. The purpose of this chapter is to provide a succinct description of a methodological approach well suited for work with single organisms, whether in basic research or applied contexts, and particularly where large N designs or inferential statistics are nonsensical and even inappropriate. This approach, referred to here as ‘single-subject design’ methodology, has a long history in the medical,
behavioral, and applied sciences and includes several design options, from the simple to the highly sophisticated and complex. As will be seen, all such designs involve a rigorous and intensive experimental analysis of a single case, and often replications across several individual cases. No attempt will be made here to describe all available design options, including available statistical tests, as excellent resources are available on both subjects (see Barlow and Hersen 1984, Busk and Marascuilo 1992, Hayes et al. 1999, Huitema 1986, Iversen 1991, Kazdin 1982, Parsonson and Baer 1992). Rather, the intent is to provide (a) an overview of single-subject design methodology and the rationale guiding its use in experimental and applied contexts; (b) a succinct description of the main varieties of single-subject designs, including basic guidelines for their application; and (c) an outline of recent trends in the use of such designs in applied contexts as a means to increase accountability.
1. Single-subject Design: Overview and Rationale At the core, all scientists ultimately deal with particulars, whether cells, atoms, microbes, or the behavior of organisms. It is from such particulars that the facts of science and generality of findings are derived (Sidman 1960). This is true whether one aggregates a limited set of observations from a large sample of cases into groups that differ on one or more variables, or situations involving a few cases and a large number of observations of each over time (see also Hilliard 1993). The former is characteristic of hypothetico-deductive group design strategies that rely on inferential statistics to support scientific claims, whereas the latter idiographic and inductive approach is characteristic of single-subject methodology. As with other research designs, single-subject methodology is as much an approach to science as it is a set of methodological rules and procedures for answering scientific and practical questions. As an approach, single-subject designs have several distinguishing features.
1.1 Subject Matter of Single-subject Designs Perhaps the most notable feature of single-subject methodology is its subject matter and sample characteristics; namely, the intensive study of the behavior of single (i.e., N = 1) organisms. This feature is also a source of great misunderstanding (Furedy 1999), particularly with regard to the generality and scientific utility of research based on an N of 1. It should be stressed that single-subject methodology gets its name not from sample size per se, but rather because the unit of analysis is behavior of individuals (Perone 1991), or what Gottman (1973) referred to as ‘N-of-one-at-a-time’ designs. Skinner (1966) described it this way: ‘… instead of studying 1000 rats for one hour each, or 100 rats for 10 hours each, the investigator is likely to study one rat for a 1000 hours’ (p. 21); an approach that served as the foundation for operant learning (Skinner 1953, 1957) and one that, with the increasing popularity of behavior therapy, set the stage for a new approach to treatment development, treatment evaluation, and increased accountability (Barlow and Hersen 1984). Though single-subject research often involves more than one subject, the data are evaluated primarily on a case-by-case basis. The idiographic nature of this methodology is well suited for applied work where practitioners often attempt to influence and change problematic behavior(s) of individual clients (Hayes et al. 1999). Though not a requirement, the inclusion of more than one subject helps establish the robustness of observed effects across differentially imposed conditions, individuals, behaviors, settings, and/or time. Hence, direct or systematic replication serves to increase confidence in the observed effects and helps establish the reliability and generality of observed relations (Sidman 1960).
1.2 Analytic Aims of Single-subject Designs: Prediction and Influence (Control) As with other experimental design strategies, the fundamental premise driving the use of single-subject methodology is to demonstrate control (i.e., influence) via the manipulation of independent variables and analysis of the effects of such variables on behavior (dependent variables). This emphasis on systematic manipulation of independent variables distinguishes single-subject designs from other small N research such as case reports, case studies, and the like which are often lucidly descriptive, but less controlled and systematic. With few exceptions (see Perone 1991), the general convention of single-subject design is to change one variable at a time and to evaluate such effects on behavior. This ‘one-variable-at-a-time’ rule positions the basic or applied researcher to make clear causal statements about the effects of independent variables compared with a previous state (e.g., naturally occurring behavior during baseline), to compare different independent variables across or within conditions, and only then to evaluate interaction effects produced by two or more independent variables together. Indeed, single-subject methodology is unique in that it typically involves an intensive and rigorous experimental analysis of the behavior (i.e., thinking, emotional responses, overt motor acts) across time, and an equally rigorous attempt to isolate and control extraneous sources of variability (e.g., maturation, history) that may confound and/or mask any effects caused by systematic changes in the independent variable(s) across conditions and time (Sidman 1960). This point regarding the handling of extraneous sources of variability cannot be overstated. Iversen
(1991) and others (Barlow and Hersen 1984, Sidman 1960) have noted issues concerning how variability is handled in group design research and single-subject methodology. With regard to such issues, Iversen put it this way: [in group design research] the fit between theory and observation often is evaluated by means of statistical tests, and variability in data is considered a nuisance rather than an inspiration …. By averaging over N subjects the variability among the subjects is ignored N times. In other words, a data analysis restricted to only the mean generates N degrees of ignorance ... Single-subject designs are used because the data based on the individual subject are more accurate predictors of that individual’s behavior than are data based on an averaging for a group of subjects (p. 194).
Though the issue of when and how to aggregate data (if at all) is controversial in single-subject research (Hilliard 1993), there is a consensus that variability is to be pinpointed and controlled to the extent possible. This is consistent with the view that variability is imposed and determined from without, and not an intrinsic property of the subject matter under study (Sidman 1960). Efforts are deliberately made to control such noise so that prediction and control can be achieved with precision and scope. Consequently, those employing experimental single-subject design methods are more likely to commit Type II errors (i.e., to deny a difference and claim that a variable is not exerting control over behavior, when it is) than the more classic Type I errors (i.e., to claim that a variable is exerting control over behavior, when it is not; see Baer 1977).
1.3 Single-subject Design: Measurement Issues Single-subject design methodology also can be distinguished by the frequency with which behavior is sampled within conditions, and across differentially imposed conditions and time. It is common, for instance, for basic and applied researchers to include hundreds of samples of behavior across time (i.e., hours, days, weeks, and sessions) with individual subjects. Yet a minimum of three data points is required to establish level, trend, and variability within a given phase or design element (Barlow and Hersen 1984, Hayes et al. 1999). Changes in level, trend, and at times variability from one condition to the next, and particularly changes that are robust, clear, immediate, and stable from behavior observed in a prior condition, help support causal inferences. Here, stability is a relative term and should not be confused with something that is static or flat. Rather, stability provides a background by which to evaluate the reliability of changes produced by an independent variable across conditions and time. By convention, each condition is often continued until some semblance of stability is observed, at which point another condition is imposed, and behavior is evaluated relative to the previous condition or steady state (cf. Perone 1991). Such changes, in turn, are often evaluated visually and graphically relative to a prior less controlled state (i.e., baseline), or adjacent conditions involving another independent variable. In single-subject methodology, such stability can be evaluated within a condition (i.e., from one sample or observation to the next) and across conditions (i.e., changes in level, trend, and variability). The pragmatic appeal of such an approach, particularly in applied contexts, rests in allowing the practitioner to evaluate the effectiveness of imposed interventions in real time, and consequently to modify their intervention tactics as the data dictate. Indeed, unlike more formal group designs that are followed strictly once set in place, single-subject methodology is more flexible. It is common for design elements to be added, dropped, and/or modified as the data dictate so as to meet scientific and applied goals. This feature has obvious parallels with how practitioners work with their clients in designing treatment interventions (see also Hayes et al. 1999). In sum, all single-subject designs focus on individual organisms and aim to predict and influence behavior via: (a) repeated sampling and observation; (b) manipulation of one or more independent variables while isolating and controlling sources of extraneous variability to the extent possible; and (c) demonstration of stability within and across levels of imposed independent variables (see Perone 1991). Discussion will now turn to an enumeration of such features in the context of more popular single-subject designs.
2. Varieties of Single-subject Methodology Most single-subject designs can be generally classified as representing two main types: within-series and between-series designs. Though each will be discussed briefly in turn, it is important to recognize that neither class of design precludes combining elements from the other (e.g., combined-series elements). In other words, a within-series design may, for some purposes, lead the investigator to add between-series elements and vice versa, including other more complex elements (e.g., interaction effects, changing criterion elements).
2.1 Within-series Designs The basic logic and structure of within-series designs are simple; namely, to evaluate changes within a series of data points across time on a single measure or set of related measures (see Hayes et al. 1999). Stability is judged for each data point relative to other data points that immediately precede and follow it. By convention, such designs include comparisons between a naturally occurring state of behavior (denoted by the letter A) and the effects of an imposed manipulation of an independent variable or intervention (denoted by different letters such as B, C, and so on). A simple A-B design involves sampling naturally occurring behavior (A phase), followed by repeated assessment of responding when the independent variable is introduced (B phase). For example, suppose a researcher wanted to determine the effects of rational-emotive therapy on initiating social interactions for a particular client. If an A-B design were chosen, the rate of initiations prior to treatment would be compared with that following treatment in a manner analogous to popular pre-to-post comparisons in group outcome research. It should be noted, however, that A-B designs are inherently weak in controlling for threats to internal validity (e.g., testing, maturation, history, regression to the mean). Withdrawal/reversal designs control for such threats, and hence provide a more convincing case of experimental control. Withdrawal designs represent a replication of the basic A-B sequence a second time, with the term withdrawal representing the imposed removal of the active treatment or independent variable (i.e., a return to second baseline or A phase). For example, a simple A-B-A sequence allows one to evaluate treatment effects relative to baseline responding. If an effect due to treatment is present, then it should diminish once treatment is withdrawn and the subject or client is returned to an A phase. Other variations on withdrawal designs include B-A-B designs and A-B-A-B (reversal) designs (see Barlow and Hersen 1984, Hayes et al. 1999). A-B designs and withdrawal/reversal designs are typically used to compare the effects of a finite set of treatment variables with baseline response levels. Data regarding stable response trends are collected across several discrete periods (e.g., time, sessions), wherein the independent variable is either absent (baseline) or present (treatment phase).
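The level, trend, and variability of each phase in a within-series design can be summarized numerically as an aid to visual inspection. The Python sketch below is one hedged way to do so: it takes level as the phase mean, trend as a least-squares slope per observation, and variability as the sample standard deviation. These operational definitions, the function name `describe_phase`, and the A-B-A-B data are assumptions made for illustration only.

```python
from statistics import mean, stdev


def describe_phase(values):
    """Return level (mean), trend (least-squares slope per observation),
    and variability (sample standard deviation) for one phase."""
    if len(values) < 3:
        raise ValueError("at least three data points are needed per phase")
    sessions = range(len(values))
    x_bar, y_bar = mean(sessions), mean(values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sessions, values))
    sxx = sum((x - x_bar) ** 2 for x in sessions)
    return {"level": y_bar, "trend": sxy / sxx, "variability": stdev(values)}


# Hypothetical A-B-A-B data (e.g., social initiations per session).
phases = [
    ("A", [2, 3, 2, 3]),
    ("B", [6, 7, 8, 9]),
    ("A", [4, 3, 3, 2]),
    ("B", [7, 8, 8, 9]),
]
summaries = [(label, describe_phase(values)) for label, values in phases]
for label, summary in summaries:
    print(label, summary)
```

Comparing adjacent summaries shows whether level rises when treatment is introduced and falls again when it is withdrawn, which is the pattern a withdrawal design looks for.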
Phase shifts are data-driven, and response stability determines the next element added to the design, elements that may include other manipulations or treatments either alone (e.g., A-B-A-C-A-B-A) or in combination (e.g., A-B-A-B+C-A-B-A). As the manipulated behavior change is repeatedly demonstrated and replicated across increasing numbers of phase shifts and time, confidence in the role of the independent variable as a cause of such changes increases. This basic logic of within-series single-subject methodology has been expanded in sophisticated and at times complex ways to meet basic and applied purposes. For instance, such designs can be used to test the differential effects of more than one treatment. Other reversal designs (i.e., B-C-B-C) involve the comparison of differential, yet consecutive, treatment interventions across multiple phase shifts. These designs are similar to the above reversals in how control is evaluated, but differ primarily in that baseline phases are not required. Designs are also available that combine features of withdrawal and reversal. For example, A-B-A-B-C-B designs allow one to compare A and B phases with each other (reversal; A-B-A-B), and a second treatment to B (reversal; e.g., B-C-B), or even to evaluate the extent to which behavior tracks a
specified behavioral criterion (i.e., changing criterion designs; see Hayes et al. 1999). Changing criterion designs provide an alternative method for experimentally analyzing behavioral effects without subsequent treatment withdrawal. Criterion is set such that optimal levels given exposure can be met and then systematically increased (or decreased) to instill greater demand on acquiring new repertoires. For example, a child learning to add may be required to calculate 4 of 10 problems correctly to earn a prize. Once the child successfully and consistently demonstrates this level of responding, the demand increases to 6 of 10 correct problems. Changing criterion designs serve as a medium to demonstrate learning through achieving successive approximations of the end-state.
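The stepwise logic of a changing criterion design can be sketched in a few lines of Python. The rule used here, raising the criterion after a fixed number of consecutive sessions at or above it, is an assumption standing in for "successfully and consistently" in the example above; the function name, parameters, and scores are hypothetical.

```python
def criterion_schedule(scores, start, step, run_length, maximum):
    """Return the criterion in effect at each session.

    The criterion rises by `step` (capped at `maximum`) after `run_length`
    consecutive sessions at or above the current criterion.
    """
    criterion, streak, in_effect = start, 0, []
    for score in scores:
        in_effect.append(criterion)
        # A session below criterion resets the run of successes.
        streak = streak + 1 if score >= criterion else 0
        if streak == run_length:
            criterion = min(criterion + step, maximum)
            streak = 0
    return in_effect


# Hypothetical number of addition problems correct (out of 10) per session.
correct = [3, 4, 5, 4, 6, 6, 7]
print(criterion_schedule(correct, start=4, step=2, run_length=3, maximum=10))
# prints [4, 4, 4, 4, 6, 6, 6]
```

Plotting the scores against this schedule shows how closely behavior tracks each successive criterion.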
2.2 Between-series Designs Between-series designs differ primarily from within-series designs in that data are first grouped and compared by condition, and then by time, whereas the reverse is true for within-series designs. Between-series designs need not contain phases, as evaluation of level, trend, and stability are organized by condition first, but not by time alone (Hayes et al. 1999, p. 176). Designs within this category include ‘alternating-treatment design’ and ‘simultaneous-treatment design.’ The basic logic of both is the same, in that a minimum of two treatments are evaluated concurrently in an individual case. With alternating-treatment design, the concurrent evaluation is made between rapid and largely random alternations of two or more conditions (Barlow and Hayes 1979). Unlike within-series designs, alternating-treatment designs contain no phases (Hayes et al. 1999). The same is true for simultaneous-treatment designs; a design that is appropriate for situations where one wishes to evaluate the concurrent or simultaneous application of two or more treatments in a single case. Rapid or random alternation of treatment is not required with simultaneous-treatment design. What is necessary is that two or more conditions are simultaneously available, with the subject choosing between them. This particular design is not conducive to evaluating treatment outcome, but is appropriate for evaluating preference or choice (see Hayes et al. 1999). Alternating-treatment designs, by contrast, fit well with applied work, as therapists must often routinely target multiple problems concurrently, and thus need to switch rapidly between intervention tactics in the context of therapy.
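The idea of organizing data by condition first and by time second can be shown with a short Python sketch. The record layout (session, condition, value), the function name `group_by_condition`, and the data are hypothetical, and comparing condition means is only one simple way to compare the resulting series.

```python
from collections import defaultdict


def group_by_condition(observations):
    """Group (session, condition, value) records by condition,
    keeping session order within each condition."""
    series = defaultdict(list)
    for session, condition, value in sorted(observations):
        series[condition].append((session, value))
    return dict(series)


# Hypothetical alternating-treatment data: conditions B and C alternate
# in an irregular order across eight sessions.
observations = [
    (1, "B", 5), (2, "C", 8), (3, "C", 9), (4, "B", 4),
    (5, "B", 6), (6, "C", 9), (7, "B", 5), (8, "C", 10),
]
by_condition = group_by_condition(observations)
levels = {c: sum(v for _, v in pts) / len(pts) for c, pts in by_condition.items()}
print(levels)
```

Each condition now forms its own series over time, so level and trend can be compared between conditions without any phases.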
2.3 Combining Within- and Between-series Elements: Multiple-baseline Design Multiple-baseline designs build upon and integrate the basic logic and structure of within- and between-series elements. The fundamental premise of multiple-baseline designs is to replicate phase change effects systematically in more than one series, with each subsequent uninterrupted series serving as a control condition for the preceding interrupted series. Series can be compared and arranged across behaviors, across settings, across individuals, or some combination of these (see Hayes et al. 1999). Such designs require that the series under consideration be independent (e.g., two functionally distinct behaviors), and that intervention be administered sequentially beginning with the first series, while others are left uninterrupted as controls. For example, a client might present with three distinct behavior problems all requiring exposure therapy. All behaviors would be monitored during baseline (A phase), and after some semblance of stability is reached the treatment (B phase) would be applied to the first behavior series, while the remaining two behaviors are continuously monitored in an extended baseline. Once changes in the first behavior resulting from treatment reach stability, the second series would be interrupted and treatment applied, while the first behavior continues in the B phase and the third behavior is monitored in an extended baseline. The procedure followed for the first two series elements is then repeated for the third behavior. This logic can be similarly applied to multiple-baseline designs across individuals or settings (see Barlow and Hersen 1984, Hall et al.). More complex within-series elements (e.g., A-B-A-B, or B-C-B, or counterbalanced phases such as B-A-B and A-B-A) can be evaluated across behaviors, settings, and individuals. Note that with multiple-baseline designs the treatment is never withdrawn; rather, it is introduced systematically across a set of dependent variables.
Independent variable effects are inferred from how reliably behavior change correlates with the onset of treatment for a particular dependent variable. Multiple-baseline designs are quite popular, owing much to their ease of use, strength in ruling out threats to internal validity, built-in replication, and fit with the demands of applied practitioners working in therapy, schools, and institutions. Indeed, such designs can be useful when therapists are working with several clients presenting with similar problems, when clients begin therapy at different times, or in cases where targets for change in therapy occur sequentially.
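The staggered logic of the three-behavior example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name, the series labels, and the fixed session counts are hypothetical, and in practice the move from baseline to treatment is decided by the stability of the observed data rather than by a preset number of sessions.

```python
from typing import Dict, List


def multiple_baseline_schedule(
    series: List[str], first_start: int, stagger: int, total_sessions: int
) -> Dict[str, List[str]]:
    """Label each session of each series as 'A' (baseline) or 'B' (treatment).

    Assumption for illustration: treatment begins at session `first_start`
    for the first series and `stagger` sessions later for each subsequent one.
    """
    schedule: Dict[str, List[str]] = {}
    for i, name in enumerate(series):
        start = first_start + i * stagger  # session index where B phase begins
        if start >= total_sessions:
            raise ValueError(f"treatment for {name!r} would never begin")
        # Treatment is never withdrawn: once B starts, it runs to the end.
        schedule[name] = ["A"] * start + ["B"] * (total_sessions - start)
    return schedule


schedule = multiple_baseline_schedule(
    ["behavior_1", "behavior_2", "behavior_3"],
    first_start=3,
    stagger=3,
    total_sessions=12,
)
for name, phases in schedule.items():
    print(f"{name}: {''.join(phases)}")
```

Reading the printed rows top to bottom shows the design's control logic: while the first series is in treatment, the later series are still in extended baseline and serve as comparisons.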
3. Recent Trends in the Application of Single-subject Methods
Single-subject methodology has a long historic affiliation with basic and applied behavioral research, and the behavior therapy movement more generally (Barlow and Hersen 1984). Though the popularity of
single-subject methods in published research appearing in mainstream behavior therapy journals appears to be on the decline relative to use of group design methodology (Forsyth et al. 1999), there does appear to be a resurgence of interest in single-subject methodology in applied work. There are several possible reasons for this renewed interest, but only two are mentioned here. First, treatment-outcome research has increasingly relied on the now popular ‘randomized clinical trial’ group design methodology to establish empirical support for psychosocial therapies for specific behavioral disorders. Practitioners, who work predominantly with individual clients, are quick to point out that group-outcome data favoring a given intervention, however convincing statistically, include individuals who did not respond to therapy. Moreover, the ‘average’ group response to treatment may not generalize to the specific response of an individual client to that treatment. Thus, practitioners have been skeptical of how such research informs their work with individual clients. The second, and perhaps more important, reason for the renewed interest in single-subject methodology is driven by pragmatic concerns and the changing nature of the behavioral health care marketplace. Increasingly, third-party payers are requiring practitioners to demonstrate accountability, that is, to show that what they are doing with their clients is achieving the desired effect (i.e., good outcomes) in a cost-effective and lasting manner. Single-subject methodology can assist practitioners in making clinical decisions based on empirical data and in demonstrating accountability to third-party payers and consumers.
Further, such methodology, though requiring time and sufficient training to implement properly, is atheoretical and fits nicely with how practitioners from a variety of conceptual schools work routinely with their individual clients (Hayes et al. 1999). Thus, there is great potential in single-subject methodology for bridging the strained scientist–practitioner gap, and ultimately advancing the science of behavior change and an empirically driven approach to practice and treatment innovation.
4. Summary and Conclusions
Methodology is a way of knowing. Single-subject methodology represents a unique way of knowing that has as its subject matter an experimental analysis of the behavior of individual organisms. Described here were the assumptions driving this approach and the main varieties of design options available to address basic experimental and applied questions. Perhaps the greatest asset of single-subject methodology rests with the flexibility with which such designs can be constructed to meet the joint analytic goals of prediction and control over the behavior of individual organisms. This asset also represents one of the greatest liabilities of such designs in that flexibility requires knowledge of when and how to modify and/or add design elements to achieve analytic goals, and particularly skill in recognizing significant effects, sources of variability, and when one has demonstrated sufficient levels of prediction and control over the behavior of interest.
See also: Case Study: Logic; Case Study: Methods and Analysis; Psychotherapy: Case Study; Single-case Experimental Designs in Clinical Settings
Bibliography
Baer D M 1977 Perhaps it would be better not to know everything. Journal of Applied Behavior Analysis 10: 167–72
Barlow D H, Hayes S C 1979 Alternating treatments design: One strategy for comparing the effects of two treatments in a single subject. Journal of Applied Behavior Analysis 12: 199–210
Barlow D H, Hersen M 1984 Single-Case Experimental Designs: Strategies for Studying Behavior Change, 2nd edn. Allyn & Bacon, Boston, MA
Busk P L, Marascuilo L A 1992 Statistical analysis in single-case research: Issues, procedures, and recommendations, with applications to multiple behaviors. In: Kratochwill T R, Levin J R (eds.) Single-Case Research Design and Analysis. Plenum, New York, pp. 159–85
Forsyth J P, Kollins S, Palav A, Duff K, Maher S 1999 Has behavior therapy drifted from its experimental roots? A survey of publication trends in mainstream behavioral journals. Journal of Behavior Therapy and Experimental Psychiatry 30: 205–20
Furedy J J 1999 Commentary: On the limited role of the single-subject design in psychology: Hypothesis generating but not testing. Journal of Behavior Therapy and Experimental Psychiatry 30: 21–2
Gottman J M 1973 N-of-one and N-of-two research in psychotherapy. Psychological Bulletin 80: 93–105
Hall R V, Cristler C, Cranston S S, Tucker B 1970 Teachers and parents as researchers using multiple baseline designs. Journal of Applied Behavior Analysis 3: 247–55
Hayes S C, Barlow D H, Nelson-Gray R O 1999 The Scientist-Practitioner: Research and Accountability in the Age of Managed Care. Allyn & Bacon, Boston, MA
Hilliard R B 1993 Single-case methodology in psychotherapy process and outcome research. Journal of Consulting and Clinical Psychology 61: 373–80
Huitema B E 1986 Statistical analysis and single-subject designs: Some misunderstandings. In: Poling A, Fuqua R W (eds.) Research Methods in Applied Behavior Analysis. Plenum, New York, pp. 209–32
Iversen I H 1991 Methods of analyzing behavior patterns. In: Iversen I H, Lattal K A (eds.) Experimental Analysis of Behavior, pt 2. Elsevier, New York, pp. 193–241
Kazdin A E 1982 Single-Case Research Designs: Methods for Clinical and Applied Settings. Oxford University Press, New York
Parsonson B S, Baer D M 1992 The visual analysis of data, and current research into the stimuli controlling it. In: Kratochwill T R, Levin J R (eds.) Single-Case Research Design and Analysis. Plenum, New York, pp. 15–41
Perone M 1991 Experimental design in the analysis of free-operant behavior. In: Iversen I H, Lattal K A (eds.) Experimental Analysis of Behavior, pt 1. Elsevier, New York, pp. 135–71
Sidman M 1960 Tactics of Scientific Research: Evaluating Experimental Data in Psychology. Basic Books, New York
Skinner B F 1953 Science and Human Behavior. Macmillan, New York
Skinner B F 1957 The experimental analysis of behavior. American Scientist 45: 343–71
Skinner B F 1966 Operant behavior. In: Honig W K (ed.) Operant Behavior: Areas of Research and Application. Appleton-Century-Crofts, New York, pp. 12–32

J. P. Forsyth and C. G. Finlay
Situated Cognition: Contemporary Developments
Situated analyses of cognition that draw on the substantive aspects of Vygotsky’s and Leont’ev’s work (see Situated Cognition: Origins) did not become prominent in Western Europe and North America until the 1980s. Contemporary situated perspectives on cognition can be divided into two broad groups. One of these, cultural historical activity theory (CHAT), has developed largely independently of mainstream Western psychology by drawing inspiration directly from the writings of Vygotsky and Leont’ev. The other group, which I will call distributed cognition, has developed in reaction to mainstream cognitive science and incorporates aspects of the Soviet work.
1. Cultural Historical Activity Theory
It is important to clarify that the term ‘activity’ in the acronym CHAT refers to cultural activity or, in other words, to cultural practice. The reference to history indicates a focus on both the evolution of the cultural practices in which people participate and the development of their thinking as they participate in them. Olson’s (1995) analysis of the historical development of writing systems is paradigmatic of investigations that focus on the evolution of cultural practices at what might be termed the macrolevel of history writ large. Olson’s goal was not merely to document the first appearance of and subsequent changes in writing systems. Instead, he sought to demonstrate that changes in writing systems precipitated changes in thought that in turn made possible further developments in both writing and thinking. In doing so, he elaborated Vygotsky’s claim that it was not until reasonably sophisticated writing systems had
emerged that people became consciously aware of language as a means of monitoring and regulating thought. As Olson has made clear, the findings of his analysis have significant implications for children’s induction into literacy practices in school. Saxe’s (1991) investigation of the body parts counting system of the Oksapmin people of Papua New Guinea provides a useful point of contrast in that it was concerned with cultural change at a more local level. As Saxe reports, the Oksapmin’s counting system has no base structure and no distinct terms for numbers. Instead, the Oksapmin count collections of items (e.g., sweet potatoes) by beginning with the left index finger and naming various body parts as they move up the left arm, across the head and shoulders, and down the right arm, ending with the right index finger. At the time that Saxe conducted his fieldwork, a new technology was being introduced, a base-10 currency system. Saxe documents that the Oksapmin with the greatest experience in using the currency, the owners of indigenous trade stores, had developed relatively sophisticated reasoning strategies that were based on the body parts system but that privileged 10 (e.g., transforming the task of adding nine and seven into that of adding ten and six by ‘moving’ a body part). Saxe’s finding is significant in that it illustrates a general claim made by adherents of CHAT, namely that changes in the artifacts people use, and thus the cultural practices in which they participate, serve to reorganize their reasoning. The two CHAT studies discussed thus far, those of Olson and Saxe, investigated the evolution of cultural practices. A second body of CHAT research is exemplified by investigations that have compared mathematical reasoning in school with that in various out-of-school settings.
Following both Vygotsky and Leont’ev, these investigations can be interpreted as documenting the fusion of the forms of reasoning that people develop with the cultural practices in which they participate. A third line of CHAT research has focused on the changes that occur in people’s reasoning as they move from relatively peripheral participation to increasingly substantial participation in the practices of particular communities. In their overview of this type of research, Lave and Wenger (1991) clarify that the cultural tools used by community members are viewed as carrying a substantial portion of a practice’s intellectual heritage. As Lave and Wenger note, this implies that novices’ opportunities for learning depend crucially on their access to these tools as they are used by the community’s old-timers. Lave and Wenger also make it clear that in equating learning with increasingly substantial participation, they propose to dispense with traditional cognitive analyses entirely. In their view, someone’s failure to learn should be explained in terms of the organization of the community and the person’s opportunities for access to increasingly substantial forms of participation rather than in terms of cognitive deficits that are attributed to the person. This is clearly a relatively strong claim in that it equates an analysis of the conditions for the possibility of learning with an analysis of learning. It should be apparent from this overview that CHAT research spans a wide range of problems and issues. One theme that cuts across the three lines of work is a focus on groups of people’s reasoning, whether they be people using different writing systems in different historical epochs, Oksapmin trade store owners compared to Oksapmin who have less experience with the new currency, the mathematical reasoning of people in school and nonschool settings, or novices versus old-timers in a community of practice. Differences in the reasoning of people as they participate in the same practices are therefore accounted for in terms of either (a) experience in participating in the practice, (b) access to more substantial forms of participation, or (c) differences in the history of their participation in other practices. Each of these types of explanations instantiates Leont’ev’s dictum that the individual-in-cultural-practice rather than the individual per se is the appropriate unit of analysis.
2. Distributed Cognition
Whereas CHAT research often involves comparisons of groups of people’s reasoning, work conducted within the distributed cognition tradition typically focuses on the reasoning of individuals or small groups of people as they solve specific problems or complete specific tasks. Empirical studies conducted within this tradition therefore tend to involve detailed microanalysis of either an individual’s or a small group’s activity. Further, whereas CHAT researchers often frame people’s reasoning as acts of participation in relatively broad systems of cultural practices, distributed cognition theorists typically restrict their focus to the immediate physical, social, and symbolic environment. This circumscription of analyses to people’s reasoning in their immediate environments is one indication that this tradition has evolved from mainstream cognitive science. Several of the leading scholars in this tradition, such as John Seely Brown, Allan Collins, and James Greeno, initially achieved prominence within the cognitive science community before substantially modifying their theoretical commitments, in the process contributing to the emergence of a distributed perspective on cognition. The term ‘distributed intelligence’ or ‘distributed cognition’ is most closely associated with Roy Pea (1993). Pea coined this term to emphasize that, in his view, cognition is distributed across minds, persons, and symbolic and physical environments. As he and other distributed cognition theorists make clear in their writings, this perspective directly challenges a
foundational assumption of mainstream cognitive science. This is the assumption that cognition is bounded by the skin or by the skull and can be adequately accounted for solely in terms of people’s construction of internal mental models of an external world. Distributed cognition theorists instead see cognition as extending out into the immediate environment such that the environment becomes a resource for reasoning. In coming to this conclusion, distributed cognition theorists have been influenced by a number of studies conducted by CHAT researchers, one of the most frequently cited investigations being that of Scribner (1984). In this investigation, Scribner analyzed the reasoning of workers in a dairy as they filled orders by packing products into crates of different sizes. Her analysis revealed that the loaders did not perform purely mental calculations but instead used the structure of the crates as a resource in their reasoning. For example, if an order called for 10 units of a particular product and six units were already in a crate that held 12 units, experienced loaders rarely subtracted six from 10 to find how many additional units they needed. Instead, they might realize that an order of 10 units would leave two slots in the crate empty and just know immediately from looking at the partially filled crate that four additional units are needed. As part of her analysis, Scribner convincingly demonstrated that the loaders developed strategies of this type in situ as they went about their daily business of filling orders. For distributed cognition theorists, this indicated that the system that did the thinking was the loader in interaction with a crate. From the distributed perspective, the loaders’ ways of knowing are therefore treated as emergent relations between them and the immediate environment in which they worked.
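The arithmetic behind the crate example can be made explicit with a small Python sketch. The function names are hypothetical, and the code is not a model of the loaders' reasoning, which Scribner describes as perceptual rather than computational; it only shows that the route through the crate's empty slots arrives at the same answer as direct subtraction.

```python
def units_by_subtraction(order: int, in_crate: int) -> int:
    """School-style route: subtract what is already packed from the order."""
    return order - in_crate


def units_by_empty_slots(order: int, in_crate: int, capacity: int) -> int:
    """Crate-based route: compare the empty slots visible now with the
    empty slots the finished order should leave."""
    empty_now = capacity - in_crate      # 12 - 6 = 6 slots currently empty
    empty_when_done = capacity - order   # 12 - 10 = 2 slots should stay empty
    return empty_now - empty_when_done   # 6 - 2 = 4 units still to add


# Values from the example in the text: order of 10, six units packed, crate of 12.
needed_direct = units_by_subtraction(order=10, in_crate=6)
needed_crate = units_by_empty_slots(order=10, in_crate=6, capacity=12)
print(needed_direct, needed_crate)
```

Both routes yield four additional units; what differs is where the work is done, in the head or in the visible structure of the crate.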
Part of the reason that distributed cognition theorists attribute such significance to Scribner’s study and to other investigations conducted by CHAT researchers is that they capture what Hutchins (1995) refers to as ‘cognition in the wild.’ This focus on people’s reasoning as they engage in both everyday and workplace activities contrasts sharply with the traditional school-like tasks that are often used in mainstream cognitive science investigations. In addition to questioning whether people’s reasoning on school-like tasks constitutes a viable set of cases from which to develop general models of cognition, several distributed cognition theorists have also critiqued current school instruction. In doing so, they have broadened their focus beyond cognitive science’s traditional emphasis on the structure of particular tasks by drawing attention to the nature of the classroom activities within which the tasks take on meaning and significance for students. Brown et al. (1989) developed one such critique by observing that school instruction typically aims to teach students abstract concepts and general skills on the assumption that students will be able to apply
them directly in a wide range of settings. In challenging this assumption, Brown et al. argue that the appropriate use of a concept or skill requires engagement in activities similar to those in which the concept or skill was developed and is actually used. In their view, the well-documented finding that most students do not develop widely applicable concepts and skills in school is attributable to the radical differences between classroom activities and those of both the disciplines and of everyday, out-of-school settings. They contend that successful students learn to meet the teacher’s expectations by relying on specific features of classroom activities that are alien to activities in the other settings. In developing this explanation, Brown et al. treat the concepts and skills that students actually develop in school as relations between students and the material, social, and symbolic resources of the classroom environment. It might be concluded from the two examples given thus far, those of the dairy workers and of students relying on what might be termed superficial cues in the classroom, that distributed cognition theorists do not address more sophisticated types of reasoning. Researchers working in this tradition have in fact analyzed a number of highly technical, work-related activities. The most noteworthy of these studies is, perhaps, Hutchins’s (1995) analysis of the navigation team of a naval vessel as they brought their ship into San Diego harbor. In line with other investigations of this type, Hutchins argues that the entire navigation team and the artifacts it used constitute the appropriate unit for a cognitive analysis. From the distributed perspective, it is this system of people and artifacts that did the navigating and over which cognition was distributed. In developing his analysis, Hutchins pays particular attention to the role of the artifacts as elements of this cognitive system.
He argues, for example, that the cartographer has done much of the reasoning for the navigator who uses a map. This observation is characteristic of distributed analyses and implies that to understand a cognitive process, it is essential to understand how parts of that process have, in effect, been sedimented in tools and artifacts. Distributed cognition theorists therefore contend that the environments of human thinking are thoroughly artificial. In their view, it is by creating environments populated with cognitive resources that humans create the cognitive powers that they exercise in those environments. As a consequence, the claim that artifacts do not merely serve to amplify cognitive processes but instead reorganize them is a core tenet of the distributed cognition perspective.
3. Current Issues and Future Directions
It is apparent that CHAT and distributed cognition share a number of common assumptions. For example, both situate people’s reasoning within encompassing activities and both emphasize the crucial role of tools and artifacts in cognitive development. However, the illustrations presented in the preceding paragraphs also indicate that there are a number of subtle differences between the two traditions. One concerns the purview of the researcher in that distributed cognition theorists tend to focus on social, material, and symbolic resources within the immediate local environment whereas CHAT theorists frequently locate an individual’s activity within a more encompassing system of cultural practices. A second difference concerns the way in which researchers working in the two traditions address the historical dimension of cognition. Distributed cognition theorists tend to focus on tools and artifacts per se, which are viewed as carrying the reasoning of their developers from prior generations. In contrast, CHAT theorists treat artifacts as one aspect of a cultural practice, albeit an important one. Consequently, they situate cognition historically by analyzing how systems of cultural practices have evolved. Thus, whereas distributed cognition theorists focus on the history of the cognitive resources available in the immediate environment, CHAT theorists contend that this environment is defined by a historically contingent system of cultural practices. Despite these differences, it is possible to identify several current issues that cut across the two traditions. The two issues I will focus on are those of transfer and of participation in multiple communities of practice.
3.1 Transfer
As noted in the article Situated Cognition: Origins, the notion that knowledge is acquired in one setting and then transferred to other settings is central to the cognition plus view as well as to mainstream cognitive science. To avoid confusion, it is useful to differentiate between this notion of transfer as a theoretical idea and what might be termed the phenomenon called transfer. This term refers to specific instances of behavior that a cognition plus theorist would account for in terms of the transfer of knowledge from one situation to another. As will become clear, CHAT and distributed cognition theorists both readily acknowledge the phenomenon of transfer but propose different ways of accounting for it. An analysis developed by Bransford and Schwartz (1999) serves to summarize concerns about the traditional idea of transfer that underpins many cognitive science investigations. Writing from within the cognition plus view, Bransford and Schwartz (1999) observe that the theory underlying much cognitive science research characterizes transfer as the ability to apply previous learning directly to a new setting or problem. As they note, transfer investigations are designed to ensure that subjects do not have the opportunity to learn to solve
new problems either by getting feedback or by using texts and colleagues as resources. Although Bransford and Schwartz indicate that they consider this traditional perspective valid, they also argue for a broader conception of transfer that includes a focus on whether people are prepared to learn to solve new problems. In making this proposal, they attempt to bring the active nature of transfer to the fore. As part of their rationale, they give numerous examples to illustrate that people often learn to operate in a new setting by actively changing the setting rather than by mapping previously acquired understandings directly on to it. This leads them to argue that cognitive scientists should change their focus by looking for evidence of useful future learning rather than of direct application. Bransford and Schwartz’s proposal deals primarily with issues of method in that it does not challenge transfer as a theoretical idea. Nonetheless, their emphasis on preparation for future learning is evident in Greeno and MMAP’s (1998) distributed analysis of the phenomenon of transfer. Greeno and MMAP contend that the ways of knowing that people develop in particular settings emerge as relations between them and the immediate environment. This necessarily implies that transfer involves active adaptation to new circumstances. As part of their theoretical approach, Greeno and MMAP analyze specific environments in terms of the affordances and constraints that they provide for reasoning. In the case of a traditional mathematics classroom, for example, the affordances might include the organization of lessons and of the textbook. The constraints might include the need to produce correct answers, the limited time available to complete sets of exercises, and the lack of access to peers and other resources. Greeno and MMAP would characterize the process of learning to be a successful student in such a classroom as one of becoming attuned to these affordances and constraints. 
From this distributed perspective, the phenomenon called transfer is then explained in terms of the similarity of the constraints and affordances of different settings rather than in terms of the transportation of knowledge from one setting to another. The focus on preparation for future learning advocated by Bransford and Schwartz (1999) is quite explicit in Beach’s (1995) investigation of the transition between work and school in a Nepal village where formal schooling had been introduced during the last 20 years of the twentieth century. Beach worked within the CHAT tradition when he compared the arithmetical reasoning of high school students who were apprentice shopkeepers with the reasoning of shopkeepers who were attending adult education classes. His analysis revealed that the shopkeepers’ arithmetical reasoning was more closely related in the two situations than was that of the students. In line with the basic tenets of CHAT, Beach accounted for this finding by framing the shopkeepers’ and students’ reasoning as acts of participation in relatively global
cultural practices, those of shopkeeping and of studying arithmetic in school. His explanation hinges on the observation that the students making the school-to-work transition initially defined themselves as students but subsequently defined themselves as shopkeepers when they worked in a shop. In contrast, the shopkeepers continued to define themselves as shopkeepers even when they participated in the adult education classes. Their goal was to develop arithmetical competencies that would enable them to increase the profits of their shops. In Beach’s view, it is this relatively strong relationship between the shopkeepers’ participation in the practices of schooling and shopkeeping that explains the close relationship between their arithmetical reasoning in the two settings. Thus, whereas Greeno and MMAP account for instances of the phenomenon called transfer in terms of similarities in the affordances and constraints of immediate environments, Beach does so in terms of the experienced commensurability of certain forms of participation. In the case at hand, the phenomenon called transfer occurred to a greater extent with the shopkeepers because they experienced participating in the practices of shopkeeping and schooling as more commensurable than did the students. The contrast between Beach’s and Greeno and MMAP’s analyses illustrates how general differences between the distributed cognition and CHAT traditions can play out in explanations of the phenomenon called transfer. However, attempts to reconceptualize transfer within these two traditions are still in their early stages and there is every indication that this will continue to be a major focus of CHAT and distributed cognition research in the coming years.
3.2 Participating in Multiple Communities of Practice
To date, the bulk of CHAT research has focused on the forms of reasoning that people develop as they participate in particular cultural practices. For their part, distributed cognition theorists have been primarily concerned with the forms of reasoning that emerge as people use the cognitive resources of the immediate environment. In both cases, the focus has been on participation in well-circumscribed communities of practice or engagement in local systems of activity. An issue that appears to be emerging as a major research question is that of understanding how people deal with the tensions they experience when the practices of the different communities in which they participate are in conflict. The potential significance of this issue is illustrated by an example taken from education. A number of studies reveal that students’ home communities can involve differing norms of participation, language, and communication, some of which might be in conflict with those that the teacher seeks to
establish in the classroom. Pragmatically, these studies lead directly to concerns for equity and indicate the importance of viewing the diversity of the practices of students’ home communities as an instructional resource rather than an obstacle to be overcome. Theoretically, these studies indicate the importance of coming to understand how students attempt to resolve ongoing tensions between home and school practices. The challenge is therefore to develop analytical approaches that treat students’ activity in the classroom as situated not merely with respect to the immediate learning environment, but also with respect to their history of participation in the practices of particular out-of-school communities. The classroom would then be viewed as the immediate arena in which the students’ participation in multiple communities of practice plays out in face-to-face interaction. This same general orientation is also relevant when attempting to understand people’s activity in a number of other specific settings. As this is very much an emerging issue, CHAT and distributed cognition research on participation in multiple communities of practice is still in its infancy. The most comprehensive theoretical exposition to date is perhaps that of Wenger (1998). There is every reason to predict that this issue will become an increasingly prominent focus of future research in both traditions.
4. Concluding Comments
It should be apparent from this overview of contemporary developments in situated cognition that whereas CHAT researchers draw directly on Vygotsky’s and Leont’ev’s theoretical insights, the relationship to the Soviet scholarship is less direct in the case of distributed cognition research. Inspired in large measure by the findings of CHAT researchers, this latter tradition has emerged as a reaction to perceived limitations of mainstream cognitive science. As a consequence, distributed cognition theorists tend to maintain a dialog with their mainstream colleagues. In contrast, the problems that CHAT researchers view as significant are typically far less influenced by mainstream considerations. The issue addressed in the previous section of this overview was that of the possible future directions for the two research traditions. It is important to note that the focus on just two issues, transfer and participation in multiple communities, was necessarily selective. It would have been possible to highlight a number of other issues that include the learning of the core ideas of academic disciplines and the design of tools as cognitive resources. In both cases, it can legitimately be argued that the CHAT and distributed cognition traditions each have the potential to make significant contributions. More generally, it can legitimately be argued that both research traditions are in progressive phases at the present time.
See also: Cultural Psychology; Learning by Occasion Setting; Situated Cognition: Origins; Situated Knowledge: Feminist and Science and Technology Studies Perspectives; Vygotskij, Lev Semenovic (1896–1934); Vygotskij’s Theory of Human Development and New Approaches to Education
Bibliography
Beach K 1995 Activity as a mediator of sociocultural change and individual development: The case of school-work transition in Nepal. Mind, Culture, and Activity 2: 285–302
Bransford J D, Schwartz D L 1999 Rethinking transfer: A simple proposal with multiple implications. Review of Research in Education 24: 61–100
Brown J S, Collins A, Duguid P 1989 Situated cognition and the culture of learning. Educational Researcher 18(1): 32–42
Greeno J G 1998 The situativity of knowing, learning, and research. American Psychologist 53: 5–26
Hutchins E 1995 Cognition in the Wild. MIT Press, Cambridge, MA
Lave J, Wenger E 1991 Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, Cambridge, UK
Olson D R 1995 Writing and the mind. In: Wertsch J V, del Rio P, Alvarez A (eds.) Sociocultural Studies of Mind. Cambridge University Press, New York, pp. 95–123
Pea R D 1993 Practices of distributed intelligence and designs for education. In: Salomon G (ed.) Distributed Cognitions: Psychological and Educational Considerations. Cambridge University Press, New York, pp. 47–87
Saxe G B 1991 Culture and Cognitive Development: Studies in Mathematical Understanding. Erlbaum, Hillsdale, NJ
Scribner S 1984 Studying working intelligence. In: Rogoff B, Lave J (eds.) Everyday Cognition: Its Development in Social Context. Harvard University Press, Cambridge, MA, pp. 9–40
Wenger E 1998 Communities of Practice. Cambridge University Press, New York and Cambridge, UK
P. Cobb
Situated Cognition: Origins
Situated cognition encompasses a range of theoretical positions that are united by the assumption that cognition is inherently tied to the social and cultural contexts in which it occurs. This initial definition serves to differentiate situated perspectives on cognition from what Lave (1991) terms the cognition plus view. This latter view follows mainstream cognitive science in characterizing cognition as the internal processing of information. However, proponents of this view also acknowledge that individual cognition is influenced both by the tools and artifacts that people use to accomplish goals and by their ongoing social interactions with others. To accommodate these insights, cognition plus theorists analyze the social world as a network of
factors that influence individual cognition. Thus, although the cognition plus view expands the conditions that must be taken into account when developing adequate explanations of cognition and learning, it does not reconceptualize the basic nature of cognition. In contrast, situated cognition theorists challenge the assumption that social processes can be clearly partitioned off from cognitive processes and treated as external conditions for them. These theorists instead view cognition as extending out into the world and as being social through and through. They therefore attempt to break down a distinction that is basic both to mainstream cognitive science and to the cognition plus view, that between the individual reasoner and the world reasoned about.
1. Situation and Context
I can further clarify this key difference between the situated and cognition plus viewpoints by focusing on the underlying metaphors that serve to orient adherents to each position as they frame questions and plan investigations. These metaphors are apparent in the different ways that adherents to the two positions use the key terms situation and context (Cobb and Bowers 1999, Sfard 1998). The meaning of these terms in situated theories of cognition can be traced to the notion of position as physical location. In everyday conversation we frequently elaborate this notion metaphorically when we describe ourselves and others as being positioned with respect to circumstances in the world of social affairs. This metaphor is apparent in expressions such as, ‘My situation at work is pretty good at the moment.’ In this and similar examples, the world of social affairs (e.g., work) in which individuals are considered to be situated is the metaphorical correlate of the physical space in which material objects are situated in relation to each other. Situated cognition theorists such as Lave (1988), Saxe (1991), Rogoff (1995), and Cole (1996) elaborate this notion theoretically by introducing the concept of participation in cultural practices (see Fig. 1). Crucially, this core construct of participation is not restricted to face-to-face interactions with others. Instead, all individual actions are viewed as elements or aspects of an encompassing system of cultural practices, and individuals are viewed as participating in cultural practices even when they are in physical isolation from others. Consequently, when situated cognition theorists speak of context, they are referring to a sociocultural context that is defined in terms of participation in a cultural practice. This view of context is apparent in a number of investigations in which situated cognition theorists have compared mathematical reasoning in school with mathematical reasoning in various out-of-school settings such as grocery shopping (Lave 1988), packing crates in a dairy (Scribner 1984), selling candies on the street (Nunes et al. 1993, Saxe 1991), laying carpet (Masingila 1994), and growing sugar cane (DeAbreu 1995). These studies document significant differences in the forms of mathematical reasoning that arise in the context of different practices that involve the use of different tools and sign systems, and that are organized by different overall motives (e.g., learning mathematics as an end in itself in school vs. doing arithmetical calculations while selling candies on the street in order to survive economically). The findings of these studies have been influential, and challenge the view that mathematics is a universal form of reasoning that is free of the influence of culture. The underlying metaphor of the cognition plus view also involves the notion of position. However, whereas the core metaphor of situated theories is that of position in the world of social circumstances, the central metaphor of the cognition plus view is that of the transportation of an item from one physical location to another (see Fig. 1). This metaphor supports the characterization of knowledge as an internal cognitive structure that is constructed in one setting and subsequently transferred to other settings in which it is applied.
Figure 1 Metaphorical underpinnings of the cognition plus and situated cognition perspectives
In contrast to this treatment of knowledge as an entity that is acquired in a particular setting, situated cognition theorists are more concerned with knowing as an activity or, in other words, with types of reasoning that emerge as people participate in particular cultural practices. In line with the transfer metaphor, context, as it is defined by cognition plus theorists, consists of the task an individual is attempting to complete together with others’ actions and available tools and artifacts. Crucially, whereas situated cognition theorists view both tools and others’ actions as integral aspects of an encompassing practice that only have meaning in relation to each other, cognition plus theorists view them as aspects of what might be termed the stimulus environment that is external to internal cognitive processing. Consequently, from this latter perspective, context is under a researcher’s control just as is a subject’s physical location, and can therefore be systematically varied in experiments. Given this conception of context, cognition plus theorists can reasonably argue that cognition is not always tied to
context because people do frequently use what they have learned in one setting as they reason in other settings. However, this claim is open to dispute if it is interpreted in terms of situated cognition theory where all activity is viewed as occurring in the context of a cultural practice. Further, situated cognition theorists would dispute the claim that cognition is partly context independent because, from their point of view, an act of reasoning is necessarily an act of participation in a cultural practice. As an illustration, situated cognition theorists would argue that even a seemingly decontextualized form of reasoning such as research mathematics is situated in that mathematicians use common sign systems as they engage in the communal enterprise of producing results that are judged to be significant by adhering to agreed-upon standards of proof. For situated cognition theorists, mathematicians’ reasoning cannot be adequately understood unless it is located within the context of their research communities. Further, these theorists would argue that to be fully adequate, an explanation of mathematicians’ reasoning has to account for its development or genesis. To address this requirement, it would be necessary to analyze the process of the mathematicians’ induction into their research communities both during their graduate education and during the initial phases of their academic careers. This illustrative example is paradigmatic in that situated cognition theorists view learning as synonymous with changes in the ways that individuals participate in the practices of communities. In the case of research mathematics, these theorists would argue that mathematicians develop particular ways of reasoning as they become increasingly substantial participants in the practices of particular research communities. They would therefore characterize the mathematicians’ apparently decontextualized reasoning as embedded in these practices and thus as being situated.
2. Vygotsky’s Contribution
Just as situated cognition theorists argue that an adequate analysis of forms of reasoning requires that we understand their genesis, so an adequate account of contemporary situated perspectives requires that we trace their development. All current situated approaches owe a significant intellectual debt to the Russian psychologist Lev Vygotsky, who developed his cultural-historical theory of cognitive development in the period of intellectual ferment and social change that followed the Russian revolution. Vygotsky was profoundly influenced by Karl Marx’s argument that it is the making and use of tools that serves to differentiate humans from other animal species. For Vygotsky, human history is the history of artifacts such as language, counting systems, and writing, that
are not invented anew by each generation but are instead passed on and constitute the intellectual bequest of one generation to the next. His enduring contribution to psychology was to develop an analogy between the use of physical tools and the use of intellectual tools such as sign systems (Kozulin 1990, van der Veer and Valsiner 1991). He argued that just as the use of a physical tool serves to reorganize activity by making new goals possible, so the use of sign systems serves to reorganize thought. From this perspective, culture can therefore be viewed as a repository of sign systems and other artifacts that are appropriated by children in the course of their intellectual development (Vygotsky 1978). It is important to understand that for Vygotsky, children’s mastery of a counting system does not merely enhance or amplify an already existing cognitive capability. Instead, children’s ability to reason numerically is created as they appropriate the counting systems of their culture. This example illustrates Vygotsky’s more general claim that children’s minds are formed as they appropriate sign systems and other artifacts. Vygotsky refined his thesis that children’s cognitive development is situated with respect to the sign systems of their culture as he pursued several related lines of empirical inquiry. In his best known series of investigations, he attempted to demonstrate the crucial role of face-to-face interactions in which an adult or more knowledgeable peer supports the child’s use of an intellectual tool such as a counting system (Vygotsky 1962). For example, one of the child’s parents might engage the child in a play activity in the course of which parent and child count together.
Vygotsky interpreted observations of such interactions as evidence that the use of sign systems initially appears in children’s cognitive development on what he termed the ‘intermental plane of social interaction.’ He further observed that over time the adult gradually reduces the level of support until the child can eventually carry out what was previously a joint activity on his or her own. This observation supported his claim that the child’s mind is created via a process of internalization from the intermental plane of social interaction to the intramental plane of individual thought. A number of Western psychologists revived this line of work in the 1970s and 1980s by investigating how adults support or scaffold children’s learning as they interact with them. However, in focusing almost exclusively on the moves or strategies of the adult in supporting the child, they tended to portray learning as a relatively passive process. This conflicts with the active role that Vygotsky attributed to the child in constructing what he referred to as the ‘higher mental functions’ such as reflective thought. In addition, these turn-by-turn analyses of adult–child interactions typically overlooked the broader theoretical orientation that informed Vygotsky’s investigations. In focusing on social interaction, he attempted to demonstrate
both that the environment in which the child learns is socially organized and that it is the primary determinant of the forms of thinking that the child develops. He therefore rejected descriptions of the child’s learning environment that are cast in terms of absolute indices because such a characterization defines the environment in isolation from the child. He argued that the analyses should instead focus on what the environment means to the child. This, for him, involved analyzing the social situations in which the child participates and the cognitive processes that the child develops in the course of that participation. Thus far, in discussing Vygotsky’s work, I have emphasized the central role that he attributed to social interactions with more knowledgeable others. There is some indication that shortly before his death in 1934 at the age of 36, he was beginning to view the relation between social interaction and cognitive development as a special case of a more general relation between cultural practices and cognitive development (Davydov and Radzikhovskii 1985, Minick 1987). In doing so he came to see face-to-face interactions as located within an encompassing system of cultural practices. For example, he argued that it is not until children participate in the activities of formal schooling that they develop what he termed scientific concepts. In making this claim, he viewed classroom interactions between a teacher and his or her students as an aspect of the organized system of cultural practices that constituted formal schooling in the USSR at that time. A group of Soviet psychologists, the most prominent of whom was Alexei Leont’ev, developed this aspect of cultural-historical theory more fully after Vygotsky’s death.
3. Leont’ev’s Contribution
Two aspects of Leont’ev’s work are particularly significant from the vantage point of contemporary situated perspectives. The first concerns his clarification of an appropriate unit of analysis when accounting for intellectual development. Although face-to-face interactions constitute the immediate social situation of the child’s development, Leont’ev saw the encompassing cultural practices in which the child participates as constituting the broader context of his or her development (Leont’ev 1978, 1981). For example, Leont’ev might have viewed an interaction in which a parent engages a child in activities that involve counting as an instance of the child’s initial, supported participation in cultural practices that involve dealing with quantities. Further, he argued that children’s progressive participation in specific cultural practices underlies the development of their thinking. Intellectual development was, for him, synonymous with the process by which the child becomes a full participant in particular cultural practices. In other words, he viewed the development of children’s minds and
their increasingly substantial participation in various cultural practices as two aspects of a single process. This is a strongly situated position in that the cognitive capabilities that children develop are not seen as distinct from the cultural practices that constitute the context of their development. From this perspective, the cognitive characteristics a child develops are characteristics of the child-in-cultural-practice in that they cannot be defined apart from the practices that constitute the child’s world. For Leont’ev, this implied that the appropriate unit of analysis is the child-in-cultural-practice rather than the child per se. As stated earlier, it is this rejection of the individual as a unit of analysis that separates contemporary situated perspectives from what I called the cognition plus viewpoint. Leont’ev’s second important contribution concerns his analysis of the external world of material objects and events. Although Vygotsky brought sign systems and other cultural tools to the fore, he largely ignored material reality. In building on the legacy of his mentor, Leont’ev argued that material objects as they come to be experienced by developing children are defined by the cultural practices in which they participate. For example, a pen becomes a writing instrument rather than a brute material object for the child as he or she participates in literacy practices. In Leont’ev’s view, children do not come into contact with material reality directly, but are instead oriented to this reality as they participate in cultural practices. He therefore concluded that the meanings that material objects come to have are a product of their inclusion in specific practices. This in turn implied that these meanings cannot be defined independently of the practices.
This thesis only served to underscore his argument that the individual-in-cultural-practice constitutes the appropriate analytical unit for psychology.
See also: Cultural Psychology; Learning by Occasion Setting; Situated Cognition: Contemporary Developments; Situated Knowledge: Feminist and Science and Technology Studies Perspectives; Situated Learning: Out of School and in the Classroom; Transfer of Learning, Cognitive Psychology of; Vygotskij, Lev Semenovic (1896–1934); Vygotskij’s Theory of Human Development and New Approaches to Education
Bibliography
Cobb P, Bowers J 1999 Cognitive and situated perspectives in theory and practice. Educational Researcher 28(2): 4–15
Cole M 1996 Cultural Psychology. Harvard University Press, Cambridge, MA
Davydov V V, Radzikhovskii L A 1985 Vygotsky’s theory and the activity-oriented approach in psychology. In: Wertsch J V (ed.) Culture, Communication, and Cognition: Vygotskian Perspectives. Cambridge University Press, New York, pp. 35–65
DeAbreu G 1995 Understanding how children experience the relationship between home and school mathematics. Mind, Culture, and Activity 2: 119–42
Kozulin A 1990 Vygotsky’s Psychology: A Biography of Ideas. Harvard University Press, Cambridge, MA
Lave J 1988 Cognition in Practice: Mind, Mathematics, and Culture in Everyday Life. Cambridge University Press, New York
Lave J 1991 Situating learning in communities of practice. In: Resnick L B, Levine J M, Teasley S D (eds.) Perspectives on Socially Shared Cognition. American Psychological Association, Washington, DC, pp. 63–82
Leont’ev A N 1978 Activity, Consciousness, and Personality. Prentice-Hall, Englewood Cliffs, NJ
Leont’ev A N 1981 The problem of activity in psychology. In: Wertsch J V (ed.) The Concept of Activity in Soviet Psychology. Sharpe, Armonk, NY, pp. 37–71
Masingila J O 1994 Mathematics practice in carpet laying. Anthropology and Education Quarterly 25: 430–62
Minick N 1987 The development of Vygotsky’s thought: An introduction. In: Rieber R W, Carton A S (eds.) The Collected Works of Vygotsky, L.S. (Vol. 1): Problems of General Psychology. Plenum, New York, pp. 17–38
Nunes T, Schliemann A D, Carraher D W 1993 Street Mathematics and School Mathematics. Cambridge University Press, Cambridge, UK
Rogoff B 1995 Observing sociocultural activity on three planes: Participatory appropriation, guided participation, and apprenticeship. In: Wertsch J V, del Rio P, Alvarez A (eds.) Sociocultural Studies of Mind. Cambridge University Press, New York, pp. 139–64
Saxe G B 1991 Culture and Cognitive Development: Studies in Mathematical Understanding. Erlbaum, Hillsdale, NJ
Scribner S 1984 Studying working intelligence. In: Rogoff B, Lave J (eds.) Everyday Cognition: Its Development in Social Context. Harvard University Press, Cambridge, MA, pp. 9–40
Sfard A 1998 On two metaphors for learning and the dangers of choosing just one. Educational Researcher 27(2): 4–13
van der Veer R, Valsiner J 1991 Understanding Vygotsky: A Quest for Synthesis. Blackwell, Cambridge, MA
Vygotsky L S 1962 Thought and Language. MIT Press, Cambridge, MA
Vygotsky L S 1978 Mind in Society: The Development of Higher Psychological Processes. Harvard University Press, Cambridge, MA
P. Cobb
Situated Knowledge: Feminist and Science and Technology Studies Perspectives
The expression ‘situated knowledge,’ especially in its plural form ‘situated knowledges,’ is associated with feminist epistemology, feminist philosophy of science, and science and technology studies. The term was introduced by historian of the life sciences and feminist science and technology studies scholar Donna Haraway in her landmark essay Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective (Haraway 1991, pp. 183–201). The
essay was a response to feminist philosopher of science Sandra Harding’s discussion of the ‘science question in feminism’ (Harding 1986). In her analysis of the potential of modern science to contribute to the goals of feminism, Harding noted three different accounts of objective knowledge in feminist epistemology: feminist empiricism, feminist standpoint, and feminist postmodernism (Harding 1986, pp. 24–6). Feminist empiricism attempted to replace more biased with less biased science. The feminist standpoint, echoing the Marxist tradition from which it derived, stressed the relevance of the social positioning of the knower to the content of what is known. Feminist postmodernism accentuated the power dynamics underlying the use of the language of objectivity in science. Haraway, taking off from Harding, diagnosed a ‘Scylla and Charybdis’ of temptations between which feminists attempt to navigate on the question of objectivity: radical constructionism and feminist critical empiricism. As she put it, what feminists wanted from a theory of objectivity was ‘enforceable, reliable accounts of things not reducible to power moves and agonistic, high status games of rhetoric or to scientistic, positivist arrogance’ (Haraway 1991, p. 188). Her notion of situated knowledges was her attempt to provide just such a theory of objectivity. Despite the single-author provenance, the term resonated with attempts by other scholars in the history, sociology, and philosophy of science to address similar epistemological tensions. The concept was quickly and fruitfully taken up in science and technology studies and feminist theory, provoking a certain amount of reworking in its turn.
1. ‘Situated Knowledge’ and Objectivity: Constructed Yet Real
The term ‘situated knowledge’ derives its theoretical importance from its seemingly oxymoronic character, particularly when applied to knowledge about the natural world. It is common to think of modern scientific knowledge as universal, so that it has the same content no matter who possesses it. It is also almost definitional to hold that objective knowledge is warranted by the fact that it captures reality as it really is, rather than being warranted by the situational circumstances out of which the knowledge was generated or discovered (see Shapin 1994, pp. 1–8 for discussion of, and citations relevant to, various manifestations of these points). Thus, if the law of gravity enables us to make reliable experimental predictions, it is because there is such a thing as gravity that is adequately captured by our scientific understanding; in short, the truth of the knowledge is its own warrant. It is only in the case of false or superseded knowledge that we typically explain what went wrong by reference to faulty assumptions, sloppy work, ill-calibrated equipment, the Zeitgeist, or other aspects of the
context of discovery. The idea of ‘situated knowledge’ contests these supposed concomitants of objective knowledge. It suggests that objective knowledge, even our best scientific knowledge of the natural world, depends on the partiality of its material, technical, social, semiotic, and embodied means of being promulgated. Haraway’s notion thus has affinities with other feminist epistemologies which have noted that facts can differ in their content from one time, place, and knower to another (e.g., Collins 1989). It also has sympathies in common with sociologists of science and scholars of science and technology studies who have suggested that capturing ‘reality as it really is’ may be dependent on institutional, technical, and cultural norms (Kuhn 1962/1970), on practice (Clarke and Fujimura 1992, Pickering 1992), and on attempts to witness, measure, comprehend, or command assent to it (Latour 1987, Shapin 1994, Shapin and Schaffer 1985). All these scholars share a search for the theoretical resources to do justice to the embeddedness of science and truth. These challenges to conventional views of objectivity bring situated knowledges into conversation with key debates in the philosophy of science around the theory-ladenness of facts (Hesse 1980, pp. 63–110). Additionally, the suspicion of transcendent universalism entrains an epistemological and political distrust of clear-cut distinctions between subject and object, and a blurring of the distinction between context and content of knowledge or discovery. Situated knowledges are as hostile to relativism as they are to realism. Haraway describes relativism as ‘being nowhere while claiming to be everywhere equally’ (Haraway 1991, p. 191) and realism as ‘seeing everything from nowhere’ (Haraway 1991, p. 189), and conceives of them both as ‘god-tricks’ promising total, rather than partial, located, and embodied, vision.
In contrast to realist or relativist epistemologies, Haraway sees the possibility of sustained and rational objective inquiry in the epistemology of partial perspectives. This requires, she maintains, reclaiming vision as a series of technological and organic embodiments, as and when and where and how vision is actually enabled. This crafting of a feminist epistemology of situated knowledges on the basis of vision and partial perspective is noteworthy. The links in the history of science to militarism, capitalism, colonialism, and male supremacy have been theorized around the masculinist gaze of the powerful but disembodied knower disciplining and subjugating the weak by means of a multitude of technologies of surveillance. Feminists have lamented the privilege granted to the visual as a sure basis of knowledge and bemoaned the sidelining in modernity of what some cast as more feminine and less intrinsically violent ways of knowing involving emotion, voice, touch, and listening (Gilligan 1982). Haraway is concerned that feminists not cede power to those whose practices they wish
critically to engage. It is in this spirit that she grounds her feminist solution in an embrace of science and vision, ‘the real game in town, the one we must play’ (Haraway 1991, p. 184).
2. The Feminist Roots of the Dilemma
A tension between emancipatory empiricism and its associated egalitarian or socialist politics, and feminist postmodern constructionism and its associated identity politics, resonates throughout contemporary Western feminist theory. It is a recent hallmark of those engaged in feminist philosophical and social studies of science that they seek to resolve one or another version of this tension. One horn of the feminist dilemma, according to Haraway, represents the good feminist reasons to be attracted to radical constructionism. Feminist postmodernists, and analysts of science and technology influenced by semiotics (including Haraway herself), helped develop and often appeal to ‘a very strong social constructionist argument for all forms of knowledge claims’ including scientific ones (Akrich and Latour 1992, Haraway 1991, p. 184). This position has the benefit of showing the links between power (such things as status, equipment, rhetorical privilege, funding, and so on) and the production of knowledge and credibility. The downside, from the point of view of feminists interested in arguing for a better world, is that the radical constructionist argument risks rendering all knowledges as fundamentally ideological, with no basis for choosing between more and less just ideas, more and less true versions of reality. As Haraway provocatively expressed it, embracing this temptation seemed to leave no room for ‘those of us who would still like to talk about reality with more confidence than we allow the Christian right’s discussion of the Second Coming and their being raptured out of the final destruction of the world’ (Haraway 1991, p. 185). The second horn of the dilemma, according to Haraway, involves ‘holding out for a feminist version of objectivity’ through materialism or empiricism (Haraway 1991, p. 186). Haraway briefly discusses both Marxist-derived feminisms and feminist empiricism.
Feminisms with Marxist inspirations are several, and their genealogy can be traced in a number of ways. Feminists have long criticized Marxist humanism for its premise that the self-realization of man is dependent on the domination of nature, and for its account of the historical progression of modes of production that grants no historical agency to domestic and unpaid labor (Hartmann 1981). Some have responded by developing feminist versions of historical materialism (Hartsock 1983). Feminist standpoint theorists appropriated the general insight, inherited from both Marxist thought and the sociology of knowledge, that one’s social-structural position in
society (such things as one’s class or relation to the means of production, or one’s gender or ethnonational characteristics) determines or affects how and what one knows (Smith 1990). Likewise, the idea that some social structural positions confer epistemological privilege has been widely adopted by standpoint theorists and feminist epistemologists arguing for specifically feminine ways of knowing (Rose 1983). ‘Seeing from below,’ that is, from a position of subordination, has commonly been theorized by feminists as the position of epistemological privilege, on the grounds that those with little to gain from internalizing powerful ideologies would be able to see more clearly than those with an interest in reproducing the status quo. In Patricia Hill Collins’ version of standpoint theory, for example, these insights are used both to validate the knowledges of the historically disenfranchised, and to reverse the hegemonic ranking of knowledge and authority, and claim epistemological privilege for African-American women (Collins 1989). Psychoanalytic theory, particularly anglophone object relations theory, inspired some of the early writings on gender and science (Chodorow 1978, Keller 1985). Object relations theory attempted to explain the different relation of women and men to objectivity, abstract thought, and science in modern societies. To account for this difference, the theory posited gender-based differences in the socialization of sons and daughters in Western middle-class heterosexual nuclear families. Boys, according to this theory, are socialized to separate from the primary caregiver, who is the mother in this normative family scenario. They thus learn early and well, by analogy with their emotional development, that relational thinking is inappropriate for them; separating themselves from the object of knowledge, as from the object of love, is good.
Girls, on the other hand, are supposedly socialized to be like their primary caregiver, so that they can reproduce mothering when their turn comes. Relationality and connectivity, not abstraction and separation, are the analogous ordering devices of girls’ affective and epistemological worlds. As applied to objectivity and scientific knowledge, object relations theory seemed to explain to feminists, without resort to distasteful biological determinisms denying women scientific aptitude, why women were excluded from much of science and technology. It also suggested that there were (at least) two distinct ways of knowing, and that much might have been lost in the violence and separation of masculinist science that could be restored by a proper valuation of the feminine values of connection and empathy (Harding 1986, Keller 1983). Like Marxism, psychoanalytic approaches to objectivity gave feminists a means to show the relevance of one’s social position to knowledge. Like feminist empiricism, they encouraged the belief in the possibility of an improved, feminist, objectivity (Harding 1992).
Situated Knowledge: Feminist and Science and Technology Studies Perspectives
The feminist canon contains a number of empirical studies that have revealed the negative effects of such things as colonialism and stereotypes about race and gender on the production of reliable science (Fausto-Sterling 1995, Martin 1991/1996, Schiebinger 1989, Traweek 1988). Evelyn Fox Keller’s call for ‘dynamic objectivity’ (Keller 1985, pp. 115–26) and Sandra Harding’s demand for ‘strong objectivity’ (Harding 1992, p. 244) are exemplary of the aspirations of theoretical feminist empiricism. These projects seek to prescribe scientific methods capable of generating accounts of the world that would improve upon disembodied, masculinist portrayals of science because they would be alert to the practices of domination and oppression inherent in the creation, dissemination, and possession of knowledge. Feminist empiricism nonetheless remains problematic because of its reliance on the dichotomies of bias vs. objectivity, use vs. misuse, and science vs. pseudoscience. The feminist insight of the ‘contestability of every layer of the onion of scientific and technological constructions’ (Haraway 1991, p. 186) flies in the face of leaving these epistemological dichotomies intact.
3. Subjects, Objects, and Agency
Haraway’s notion of situated knowledges problematizes both subject and object. Unlike standpoint theories, which attribute epistemological privilege to subjugated knowers, and the sociology of knowledge, which attributes epistemological privilege to those in the right structural position vis-à-vis a given mode of production, Haraway attributes privilege to partiality. This shift underscores that ‘situated knowledge’ is more dynamic and hybrid than other epistemologies that take the position of the knower seriously, and involves ‘mobile positioning’ (Haraway 1991, p. 192). In situated knowledges based on embodied vision, neither subjects who experience, nor nature which is known, can be treated as straightforward, pretheoretical entities, ‘innocent and waiting outside the violations of language and culture’ (Haraway 1991, p. 109). Haraway maintains that romanticizing, and thus homogenizing and objectifying, the perfect subjugated subject position is not the solution to the violence inherent in dominant epistemologies. As feminists from developing countries have also insisted, there is no innocent, perfectly subjugated feminist subject position conferring epistemological privilege; all positionings are open to critical re-examination (Mohanty 1984/1991). Subjectivity is instead performed in and through the materiality of knowledge and practice of many kinds (Butler 1990, pp. 1–34). Conversely, the extraordinary range of objects in the physical, natural, social, political, biological, and human sciences about which institutionalized knowledge is produced should not be considered to be passive and inert. Haraway says that situated knowledges require thinking of the world in terms of the
‘apparatus of bodily production.’ The world cannot be reduced to a mere resource if subject and object are deeply interconnected. Bodies as objects of knowledge in the world should be thought of as ‘material-semiotic generative nodes,’ whose ‘boundaries materialize in social interaction’ (Haraway 1991, p. 201). The move to grant agency to material objects places the epistemology of situated knowledges at the center of recent scholarship in science and technology studies (Callon 1986, Latour 1987).
4. Uptake and Critique
Donna Haraway’s essay ranks among the most highly cited essays in science and technology studies and has been anthologized. As stated above, situated knowledges is a provocative and rich methodological metaphor with resonances in many quarters. The dialogue between Harding and Haraway continued after the publication of Situated Knowledges (Harding 1992, pp. 119–63). Haraway’s epistemology has directly influenced, and has in turn been influenced by, the recent work of sociologists and anthropologists of science (Clarke and Montini 1993, Rapp 1999), feminist philosophers of science (Wylie 1999), and practicing feminist scientists (Barad 1996). In addition, ‘situated knowledges’ is used as a standard technical term of the field by more junior scholars. Critics of situated knowledges have been few. Timothy Lenoir has pointed out that many of the epistemological ideas behind Haraway’s situated knowledges are found not only in other major strands of science and technology studies, but also in the work of continental philosophers such as Nietzsche. He likewise critiqued the idea of situated knowledges for its dependence on the apparatus of semiotics (Lenoir 1999, pp. 290–301). Historian Londa Schiebinger, in her recent book summarizing the effects of a generation of feminist scholarship on the practice of science, places Haraway’s situated knowledges together with Harding’s strong objectivity as attempts to integrate social context into scientific analysis (Schiebinger 1999). Implicit critiques have been leveled against the limitations of the idea of being situated, for example, in the development of De Laet’s and Mol’s mobile epistemology (De Laet and Mol 2000). Sheila Jasanoff and her colleagues have argued for bringing differently spatialized entities such as the nation, the local, and the global, into the epistemology of science and technology studies, while retaining the insights gained by paying attention to practice, vision, and measurement.
These critiques stand more as continuing conversations with situated knowledges than as rebuttals of it, however. Overall, the idea of situated knowledges remains central to feminist epistemology and science studies and to attempts to understand the role of modern science in society.
See also: Cognitive Psychology: Overview; Contextual Studies: Methodology; Feminist Epistemology; Feminist Political Theory and Political Science; Feminist Theory; Feminist Theory: Postmodern; Knowledge, Anthropology of; Knowledge, Sociology of; Rationality and Feminist Thought; Science and Technology, Anthropology of; Scientific Knowledge, Sociology of; Situation Model: Psychological
Bibliography
Akrich M, Latour B 1992 A summary of a convenient vocabulary for the semiotics of human and nonhuman assemblies. In: Bijker W, Law J (eds.) Shaping Technology/Building Society: Studies in Sociotechnical Change. MIT Press, Cambridge, MA, pp. 259–64
Barad K 1996 Meeting the universe halfway: realism and social constructivism without contradiction. In: Hankinson Nelson L, Nelson J (eds.) Feminism, Science, and the Philosophy of Science. Kluwer, Dordrecht, The Netherlands, pp. 161–94
Butler J 1990 Gender Trouble: Feminism and the Subversion of Identity. Routledge, London
Callon M 1986 Some elements of a sociology of translation: domestication of the scallops and fishermen of St. Brieuc Bay. In: Law J (ed.) Power, Action and Belief: A New Sociology of Knowledge. Routledge, London, pp. 196–233
Chodorow N 1978 The Reproduction of Mothering. University of California Press, Berkeley
Clarke A, Fujimura J (eds.) 1992 The Right Tools for the Job: At Work in Twentieth-Century Life Sciences. Princeton University Press, Princeton, NJ
Clarke A, Montini T 1993 The many faces of RU486: Tales of situated knowledges and technological contestations. Science, Technology and Human Values 18: 42–78
Collins P H 1989 The social construction of black feminist thought. Signs 14(4): 745–73
De Laet M, Mol A 2000 The Zimbabwe bush pump. Mechanics of a fluid technology. Social Studies of Science
Fausto-Sterling A 1995 Gender, race, and nation: The comparative anatomy of ‘Hottentot’ Women in Europe, 1815–1817. In: Terry J, Urla J (eds.) Deviant Bodies: Critical Perspectives on Difference in Science and Popular Culture. Indiana University Press, Bloomington, IN, pp. 19–48
Gilligan C 1982 In a Different Voice: Psychological Theory and Women’s Development. Harvard University Press, Cambridge, MA
Haraway D 1991 Simians, Cyborgs and Women: The Reinvention of Nature. Routledge, New York
Harding S 1986 The Science Question in Feminism. Cornell University Press, Ithaca, NY
Harding S 1992 Whose Science? Whose Knowledge? Thinking from Women’s Lives. Cornell University Press, Ithaca, NY
Hartmann H 1981 The unhappy marriage of Marxism and feminism. In: Sargent L (ed.) Women and Revolution. South End Press, Boston
Hartsock N 1983 The feminist standpoint: Developing the ground for a specifically feminist historical materialism. In: Harding S, Hintikka M (eds.) Discovering Reality: Feminist Perspectives on Epistemology, Metaphysics, Methodology, and Philosophy of Science. Reidel, Dordrecht, The Netherlands, pp. 283–310
Hesse M 1980 Revolutions and Reconstructions in the Philosophy of Science. Indiana University Press, Bloomington, IN
Keller E F 1983 A Feeling for the Organism: The Life and Work of Barbara McClintock. W. H. Freeman and Company, New York
Keller E F 1985 Reflections on Gender and Science. Yale University Press, New Haven, CT
Kuhn T S 1962/1970 The Structure of Scientific Revolutions, 2nd edn. Chicago University Press, Chicago
Latour B 1987 Science in Action: How to Follow Scientists and Engineers Through Society. Open University Press, Philadelphia, PA
Lenoir T 1999 Was the last turn the right turn? The semiotic turn and A. J. Greimas. In: Biagioli M (ed.) The Science Studies Reader. Routledge, London, pp. 290–301
Martin E 1991/1996 The egg and the sperm: how science has constructed a romance based on stereotypical male–female roles. In: Laslett B, Kohlstedt G S, Longino H, Hammonds E (eds.) Gender and Scientific Authority. Chicago University Press, Chicago, pp. 323–39
Mohanty C T 1984/1991 Under Western eyes: Feminist scholarship and colonial discourses. In: Mohanty C, Russo A, Torres L (eds.) Third World Women and the Politics of Feminism. Indiana University Press, Bloomington, IN, pp. 51–80
Pickering A (ed.) 1992 Science as Practice and Culture. University of Chicago Press, Chicago
Rapp R 1999 Testing Women, Testing the Fetus: The Social Impact of Amniocentesis in America. Routledge, London
Rose H 1983 Hand, brain, and heart: A feminist epistemology for the natural sciences. Signs 9(1): 73–90
Schiebinger L 1989 The Mind Has No Sex? Women in the Origins of Modern Science. Harvard University Press, Cambridge, MA
Schiebinger L 1999 Has Feminism Changed Science? Harvard University Press, Cambridge, MA
Shapin S 1994 A Social History of Truth: Civility and Science in Seventeenth-century England. University of Chicago Press, Chicago
Shapin S, Schaffer S 1985 Leviathan and the Air-pump: Hobbes, Boyle, and the Experimental Life. Princeton University Press, Princeton, NJ
Smith D 1990 The Conceptual Practices of Power: A Feminist Sociology of Knowledge. University of Toronto Press, Toronto, ON
Traweek S 1988 Beamtimes and Lifetimes: The World of High Energy Physicists. Harvard University Press, Cambridge, MA
Wylie A 1999 The engendering of archaeology: Refiguring feminist science studies. In: Biagioli M (ed.) The Science Studies Reader. Routledge, London, pp. 553–68
C. M. Thompson
Situated Learning: Out of School and in the Classroom
Situated learning is not a unitary, well-defined concept. From an educational point of view, the core idea behind the different uses of this term is to create a situational context for learning that strongly resembles possible application situations in order to ensure that the learning experiences foster ‘real-life’ problem solving.
1. Situated Cognition–Situated Learning
Since 1985, the notion of situatedness of learning and knowing has become prominent in a variety of scientific disciplines such as psychology, anthropology, and computer science. In this entry, however, we focus only on situatedness approaches in education and educational psychology. Since the late 1980s a particular educational problem has received much attention. In traditional forms of instruction, learners acquire knowledge that they can explicate, for example in examinations. When the learners are, however, confronted with complex ‘real-life’ problems, this knowledge is frequently not used, although it is relevant. Such knowledge is termed ‘inert knowledge’ (Renkl et al. 1996). This frequently found inertness phenomenon motivated some researchers to postulate that the whole notion of knowledge that is used in cognitive psychology as well as in everyday reasoning about educational issues is wrong (e.g., Lave 1988). It has been argued that knowledge is not some ‘entity’ in a person’s head that can be acquired in one situational context (e.g., classroom) and then be used in another context (e.g., workplace), but that instead it is context-bound. Hence, symbolic modeling of cognitive processes and structures is regarded as inappropriate. From a situatedness perspective, knowledge is generally constituted by the relation or interaction between an agent and the situational context they are acting in. Hence, it is proposed that we use the term ‘knowing’ instead of ‘knowledge’ in order to underline the process aspect of knowing/knowledge (Greeno et al. 1993). As a consequence of this conception of knowledge, learning must also be conceived as context-bound or situated. Thus, it is understandable that there is no transfer between such different contexts as the classroom and everyday life. The theoretical assumption of the situatedness of knowing and learning also has consequences for research methodology.
Laboratory experiments can no longer be seen as appropriate, because in this research strategy the phenomena are taken out of their ‘natural’ context and their character is changed (e.g., Lave 1988). Situatedness proponents conduct primarily qualitative field studies. Unfortunately, the term ‘situatedness’ is not well defined. One reason for the fuzziness of this construct is that its proponents differ in how far they depart from traditional cognitive concepts. Whereas Lave (1988), for example, radically rejects the notions of knowledge, transfer, and symbolic representations as not being sensible, other proponents, such as Greeno et al. (1993), hold a more modest position. Although
they also stress the situatedness of knowing and learning, they aim, for example, to analyze conditions for transfer, or claim that representations can play a role in human activity. A factor that further contributes to the vagueness of the notion of situated learning is that it is used in two ways: descriptive and prescriptive. When used descriptively, situated learning means that learning is analyzed as a context-bound process. In the field of education, however, situated learning is mostly used as a prescriptive concept. It is argued that learning should be situated in a context that resembles the application contexts. The prescriptive aspect of situated learning was quite appealing to the community of educational researchers in the 1990s. Hence, not only those who subscribe to the situated cognition perspective (i.e., rejection of cognitive concepts), but also people who still more or less remain within the traditional cognitive framework rely on this concept. The latter group often juxtapose situated learning with decontextualized (traditional) learning, in which concepts and principles are presented and acquired in an abstract way with little or no relation to ‘real-world’ problems. It is important to note that this ‘assimilated’ view, in principle, contradicts the more fundamental situatedness concept according to which there is no nonsituated learning. Even typical abstract school learning is situated in the very specific context of school culture, although this situatedness would usually be evaluated as unfavorable because of the differences between school and ‘real-life’ contexts. Hence, irrespective of the logical inconsistencies in the use of the notion of situated learning, the prescriptive notion of situated learning means that the learning and the application situations should be as similar as possible in order to ensure that the learning experiences have positive effects on ‘real-life’ problem solving.
2. Learning and Problem Solving in School and Outside
From a situatedness perspective, typical school learning is not much help for problem solving in everyday or professional life because the contexts are too different. Based on, among others, Resnick’s (1987) analyses of typical differences between these contexts, the following main points are outlined:
(a) Well-defined problems in school versus ill-defined problems outside. For example, word problems in mathematics are usually well defined. It is clear what the problem is all about, all the necessary information is given in the problem formulation, there is usually only one way to arrive at the solution that is labeled as appropriate, and so on. Nontrivial, ‘real-life’ problems (e.g., improvement of an organization’s communication structure) first often have to be defined more precisely for a productive solution: one has to decide whether one needs more information, one has to seek information, one has to decide what is relevant and irrelevant information, there are multiple ways of solving a problem, and so on.
(b) Content structured by theoretical systems in school versus structured by problems outside. In traditional forms of instruction, content is structured according to theoretical systematizations (e.g., biological taxonomies). This helps learners to organize the content and to remember it. One very salient systematization of content is the distinction between different school subjects. When a ‘real-life’ problem (e.g., pollution in a city or evaluation of an Internet company’s share) has to be solved, thinking within the boundaries of school subjects is often not helpful. Furthermore, the structure of concepts used in school is frequently not relevant to the problem at hand. In ‘real life,’ the nature of the problem to be solved determines what concepts and information are required and in which structure they are needed.
(c) Individual cognition in school versus shared cognition outside. In schools, the usual form of learning and performance is individualistic. Cooperation in examinations is even condemned. In professional or everyday life, in contrast, cooperation is valued and it is frequently necessary for solving problems.
(d) Pure mentation in school versus tool manipulation outside. In traditional instruction, pure ‘thought’ activities dominate. Students should learn to perform without the support of tools such as books, notes, calculators, etc. Especially in exams, tools are usually forbidden. In contrast, a very important skill in everyday or professional life is the competent use of tools.
(e) Symbol manipulation in school versus contextualized reasoning outside. Abstract manipulation of symbols is typical of traditional instruction. Students often fail to match symbols and symbolic processes to ‘real-world’ entities and processes. In ordinary life, on the other hand, not only are tools used, but reasoning processes are an integral part of activities that involve objects and other persons. ‘Real-world’ reasoning processes are typically situated in rich situational contexts.
(f) Generalized learning in school versus situation-specific competencies outside. One reason for the abstract character of traditional instruction is that it aims to teach general, widely usable skills and theoretical principles. Nobody can foresee what types of specific problem students will encounter in their later life. In everyday and professional life, in contrast, situation-specific skills must be acquired. Further learning is mostly aimed at competencies for specific demands (e.g., working with a new computer program).
On the one hand, this list is surely not complete and more differences could be outlined. On the other hand, this juxtaposition is somewhat oversimplifying. Nevertheless, there is a core of truth because typical learning in and out of school differs significantly. Given these
differences, a consequence of situatedness assumptions is to claim that learning environments in school should strongly resemble application contexts or that learning should take place in the field (e.g., ‘on the job’). A traditional model in which people acquire applicable skills in authentic contexts is apprenticeship learning.
3. Apprenticeship Learning as Situated Learning
The situatedness protagonists Lave and Wenger (1991) investigated out-of-school learning in the form of apprenticeship by means of qualitative analyses. They focused on people working in traditional skills, for example Indian midwives in Mexico, tailors in Liberia, and butchers in supermarkets. In these traditional apprenticeships, learners acquire mainly manual skills. In modern society, in contrast, ‘cognitive’ domains prevail (e.g., computer science, psychology), so that skilled activity is hardly ‘visible.’ Also in school learning, cognitive skills such as mathematical problem solving, reading, and writing dominate. Against this background, Collins et al. (1989) developed the instructional cognitive apprenticeship model. Herein the importance of explication or reification of cognitive processes (e.g., strategies, heuristics) during learning is stressed. Thus, cognitive processes can be approximately as explicit as the more manual skills trained in traditional apprenticeship. The core of cognitive apprenticeship is a special instructional sequence and the employment of authentic learning tasks. Experts provide models in applying their knowledge in authentic situations. Thereby they externalize (verbalize) their reasoning. The learners then work on authentic tasks of increasing complexity and diversity. An expert or teacher is assigned an important role as a model and as a coach providing scaffolding. The learners are increasingly encouraged to take an active role, as the support by the expert is gradually faded out. Articulation is promoted so that normally internal processes are externalized and can be reflected upon. This means that one’s own strategies can be compared with those of experts, are then open to feedback, and can be discussed. In addition, the student’s own cognitive strategies can be compared with those of other students.
In the course of interaction with experts and other learners, students can also get to know different perspectives on concepts and problems. As a result of this instructional sequence, the students increasingly work on their own (exploration) and may take over the role initially assumed by the expert. Lave and Wenger (1991) have characterized such a sequence as development from legitimate peripheral participation to full participation. It is important to note that the type of apprenticeship learning that is envisioned by the proponents of the situatedness approaches implies much more than the acquisition of ‘subject-matter knowledge.’ It is a process of enculturation. The learner gradually develops the competence to participate in a community of practice. Such participation presupposes more than the type of knowledge usually focused on in classroom learning. In addition, ‘tricks of the trade’ and knowledge of social norms, for example, are required.
4. Problem-based Learning as Situated Learning
Besides apprenticeship, problem-based learning is another possibility for implementing arrangements in accordance with a situatedness rationale. Learning should be motivated by a complex ‘real-world’ problem that is the starting point of a learning process. For example, Greeno and the Middle School Mathematics Through Applications Project Group (MMAP) (1998) designed learning arrangements in which mathematical reasoning is not triggered primarily in separate mathematics lessons, but within design activities in four domains: architecture, population biology, cryptography, and cartography. The design activities are supported by the employment of computer tools which are also typical of current situated learning arrangements. An instructional principle is to induce quantitative reasoning involving proportions, ratios, and rates during design activities that strongly resemble the activities of many everyday crafts and commercial practices. The mathematical reasoning within design activities can be quite sophisticated; however, it often remains implicit. The teachers’ task is to uncover the mathematics the students are implicitly using. For this purpose, there are, among others, curricular materials for ‘math extension’ units (i.e., explicit mathematics lessons).
5. Common Critiques of the Situatedness Approach
The situatedness camp has criticized the fundamental assumptions of cognitively oriented educational research. Hence it is not surprising that the situatedness camp has also been heavily attacked. Three major objections to the situatedness approach, together with the corresponding counterarguments, are outlined here:
(a) Faulty fundamental assumptions: Anderson et al. (1996), in particular, have argued that the situatedness approach is based on wrong assumptions such as ‘knowledge does not transfer between tasks’ or ‘training in abstraction is of little use.’ Anderson et al. cited empirical studies that contradict these assumptions. In his reply, Greeno (1997) argues that the assumptions that Anderson et al. criticize are indeed wrong, but that they are not the claims of the situativity approach. The core of a situativity theory is a different perspective on phenomena of learning. Instead of focusing on mental processes and structures, the situativity approach analyses ‘… the social and ecological interaction as its basis and builds toward a more comprehensive theory by … analyses of information structures in the contents of people’s interactions’ (Greeno 1997, p. 5).
(b) Triviality: It is also argued that there is nothing really new in the core arguments of the situatedness theories. For example, Vera and Simon (1993) argue that all findings of the situatedness research can be incorporated into the well-elaborated traditional framework of cognitive models (i.e., symbolic paradigm). From the situatedness perspective, however, analyzing symbolic structures and processes falls short. They may just be a special case of activity (Greeno and Moore 1993). Another triviality argument is that many claims of the situatedness protagonists were already articulated long before, for example by Dewey, Piaget, and Vygotsky, so that they are hardly stating anything new (e.g., Klauer 1999). Renkl (2000) counters that it is to the credit of the situatedness protagonists that they have reactivated these classical ideas. Most of these ideas played only a very minor role in educational mainstream research before the situatedness approach emerged. Furthermore, situated learning approaches bring classical ideas together with new developments (e.g., learning with new technologies) to form new ‘Gestalts.’
(c) Weak methodology: Klauer (1999) is one of many researchers who criticize the situatedness protagonists for employing purely qualitative research methods and for often relying merely on anecdotes to support their claims of the situatedness of cognition and learning (e.g., the ‘cottage cheese story’; cf. Lave 1988). Lave (1988), on the other hand, rejects the empirical-experimental paradigm as artificially decontextualizing phenomena in laboratory investigations. From a situatedness point of view, it is clear that results from laboratories are of questionable value for out-of-lab contexts.
It is important to note that many researchers who merely assimilated the notion of situated learning, but do not subscribe to radical situatedness, keep on researching within the traditional empirical framework. Hence, the methodological critique aims at the more radical situatedness proponents.
6. Possible Futures of Situated Learning
On the one hand, at least some situatedness protagonists in the area of education make strong statements with respect to the advantage of their approach over the traditional one. On the other hand, the ‘traditionalists’ defend themselves. Against this background it is interesting to ask what the future will bring. Four main possibilities are discussed. (a) Critical traditionalists (Klauer 1999) argue that the situatedness approach and the discussion around it will disappear as other more or less fruitless debates (e.g., the person–situation debate in psychology) have done before. (b) Others (e.g., Cobb and Bowers 1999) hope that the situatedness perspective will take the place of the cognitive paradigm in education, just as years ago the behavioral paradigm was driven out by the cognitive one. (c) Some ‘observers’ of the situatedness debate (e.g., Sfard 1998) argue that the two positions provide different metaphors for analyzing learning and that both are useful. Accordingly, they plead for a complementary coexistence. (d) Greeno and MMAP (1998) envision a very ambitious goal for their situativity approach; that is, to develop a situatedness approach that is a synthesis of the cognitive and the behavioral paradigm. Whatever possibility may become reality, there is at least consensus between the main opponents in the situatedness debate (cf. Anderson et al. 1997, Greeno 1997) on what the touchstone for the two approaches should be: the ability to improve education.
See also: Capitalism: Global; Piaget’s Theory of Human Development and Education; School Learning for Transfer; Situated Cognition: Contemporary Developments; Situated Cognition: Origins
Bibliography
Anderson J R, Reder L M, Simon H A 1996 Situated learning and education. Educational Researcher 25: 5–11
Anderson J R, Reder L M, Simon H A 1997 Situated versus cognitive perspectives: Form versus substance. Educational Researcher 26: 18–21
Cobb P, Bowers J 1999 Cognitive and situated perspectives in theory and practice. Educational Researcher 28: 4–15
Collins A, Brown J S, Newman S E 1989 Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In: Resnick L B (ed.) Knowing, Learning, and Instruction. Essays in Honor of Robert Glaser. Erlbaum, Hillsdale, NJ, pp. 453–94
Greeno J 1997 On claims that answer the wrong questions. Educational Researcher 26: 5–17
Greeno J G, Middle School Mathematics Through Applications Project Group 1998 The situativity of knowing, learning, and research. American Psychologist 53: 5–26
Greeno J G, Moore J L 1993 Situativity and symbols: Response to Vera and Simon. Cognitive Science 17: 49–59
Greeno J G, Smith D R, Moore J L 1993 Transfer of situated learning. In: Detterman D K, Sternberg R J (eds.) Transfer on Trial: Intelligence, Cognition, and Instruction. Ablex, Norwood, NJ, pp. 99–167
Klauer K J 1999 Situated learning: Paradigmenwechsel oder alter Wein in neuen Schläuchen? Zeitschrift für Pädagogische Psychologie 13: 117–21
Lave J 1988 Cognition in Practice: Mind, Mathematics, and Culture in Everyday Life. Cambridge University Press, Cambridge, UK
Lave J, Wenger E 1991 Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, Cambridge, UK
Renkl A 2000 Weder Paradigmenwechsel noch alter Wein! Eine Antwort auf Klauers ‘Situated Learning: Paradigmenwechsel oder alter Wein in neuen Schläuchen?’ Zeitschrift für Pädagogische Psychologie 14: 5–7
Renkl A, Mandl H, Gruber H 1996 Inert knowledge: Analyses and remedies. Educational Psychologist 31: 115–21
Resnick L B 1987 Learning in school and out. Educational Researcher 16: 13–20
Sfard A 1998 On two metaphors for learning and the dangers of choosing just one. Educational Researcher 27: 4–13
Vera A H, Simon H A 1993 Situated action: A symbolic interpretation. Cognitive Science 17: 7–48
A. Renkl
Situation Model: Psychological

For, it being once furnished with simple ideas, it [the mind] can put them together in several compositions, and so make variety of complex ideas, without examining whether they exist so together in nature (John Locke 1690 An Essay Concerning Human Understanding).
When we read a story, we combine the ideas derived from understanding words, clauses, and sentences into mental representations of events, people, objects, and their relations. These representations are called situation models. Thus, situation models are not representations of the text itself; rather, they could be viewed as mental microworlds. Constructing these microworlds is the essential feature of understanding. At the most basic level, situation models are mental representations of events. Aspects of events that are encoded in situation models are: what their nature is, where, when, and how they occur, and who and what is involved in them. Single-event models are integrated with models of related events constructed based on the preceding text, such that situation models may evolve into complex integrated representations of large numbers of related events. This is what occurs when we comprehend extended discourse, such as news articles, novels, or historical documents.
1. Background

Situation models were introduced to cognitive psychology by van Dijk and Kintsch (1983) and are based on earlier research in formal semantics and logic (e.g., Kripke 1963). The concept is primarily used in research on language and discourse comprehension. For a good understanding, it is important to distinguish situation models from similar concepts, such as mental models, scripts, and frames.
2. How Situation Models Differ

2.1 Mental Models

Originally proposed by Craik (1943) and elaborated and introduced to cognitive psychology by Johnson-Laird (1983), mental models (see Mental Models, Psychology of) are mental representations of real, hypothetical, or imaginary situations. Situation models can be viewed as a special type of mental model. Situation models are mental models of specific events. They are bound in time and space, whereas mental models in general are not. For example, heart surgeons have mental models of our blood circulatory system, but they construct a situation model of the state of patient X’s coronary artery at time (t).
2.2 Scripts and Frames

Originally proposed in the artificial intelligence literature by Schank and Abelson (1977) and Minsky (1975), respectively, scripts and frames (see Schemas, Frames, and Scripts in Cognitive Psychology) are representations of stereotypical sequences of events, such as going to a restaurant, and of spatial layouts, such as that of a living room or the interior of a church. Comprehenders use scripts and frames to construct situation models. For example, the restaurant script can be used to construct a mental representation of your friend’s visit to a local Italian restaurant last night. This is accomplished by filling in the slots of the script. Thus, scripts and frames can be considered types, whereas situation models are tokens. Furthermore, scripts and frames are semantic memory representations, whereas situation models are episodic memory representations.

3. Other Representations Constructed During Comprehension

It is often assumed that readers construct multilevel mental representations during text comprehension. Although there is no complete consensus as to what other types of mental representations, besides situation models, are constructed during text comprehension, empirical and intuitive evidence for the role of the following representations has been put forth. The surface structure is a mental representation of the actual wording of the text. Surface-structure representations are typically short-lived in memory, except when they have a certain pragmatic relevance (for example in the case of insults or jokes) or when the surface structure is constrained by prosodic features, such as rhyme and meter, as in some poetry. The textbase is a mental representation of the semantic meaning of what was explicitly stated in the text. The textbase usually decays rather rapidly. That is, within several days, comprehenders are unable to distinguish between what they read and what they inferred (see Memory for Meaning and Surface Memory). Some researchers view the textbase simply as that part of the situation model that was explicitly stated (rather than implied) and thus deny the textbase a special status. Analyses of naturalistic discourse suggest that comprehenders not only construct a model of the denoted situation, but also construct a model of the communicative context. For example, they make inferences about the attitudes of writers regarding the situation they describe. Van Dijk (1999) calls this type of representation a ‘context model’ and argues that no account of discourse comprehension is complete without the inclusion of a context model.

4. Why Situation Models are Needed to Explain Language Use

A task analysis of text comprehension shows the need for situation models. For example, when we comprehend the instructions that come with a household appliance, we form mental representations of the actions needed to operate or repair the device. It would be of little use to construct a mental representation of the wording of the instructions themselves. Similarly, when we read newspaper articles, our usual goal is to learn and be updated about some news event. For example, we want to know how the United States House of Representatives responded to the latest budget proposal by the President or why the latest peace negotiations in the Middle East came to a halt. In these cases, we construct mental representations of agents, events, goals, plans, and outcomes, rather than merely mental representations of clauses and words.

5. Components of Situation Models

Situation models are models of events. Events always occur at a certain time and place. In addition, events typically involve participants (agents and patients) and objects. Furthermore, events often entertain causal relations with other events or are part of a goal-plan structure. Thus, time, place, participants, objects, causes and effects, and goals and plans are components of situations, with time and place being obligatory. As linguists have observed, most of these components are routinely encoded in simple clauses. Verbs typically describe events (although some events can also be described by nouns, e.g., explosion), while nouns and pronouns denote participants and objects (although pronouns can also denote events) and prepositions primarily denote spatial relations (although spatial relations can also be denoted by, for instance, verbs, as in ‘The nightstand supported a lamp’ and prepositions can be used to indicate temporal relationships, as in ‘In an hour’). Temporal information can be expressed lexically in a variety of ways, but is also encoded grammatically in the form of verb tense and aspect in languages such as English, German, and French.

Causation and intentionality are denoted lexically by verbs (e.g., ‘caused,’ ‘decided to’) or adverbs (e.g., ‘therefore,’ ‘in order to’), but are often left to be inferred by the comprehender. For example, there is no explicitly stated causal connection between the following two events, ‘John dropped a banana peel on the floor. The waiter slipped,’ yet comprehenders can easily infer the connection. Participants and objects may have various kinds of properties, such as having blue eyes or a cylindrical shape, and temporary features, such as being sunburnt or overheated. Finally, participants and objects may be related in various ways (e.g., kinship, professional, ownership, part–whole, beginning–end, and so on).
6. A Simple Example

A simple example illustrates these points. Consider the following clause:

(1) John threw the bottle against the wall.
This clause describes an event (throw) that involves a male agent (John) and two objects (bottle, wall). The simple past tense indicates that the event occurred prior to the moment at which the sentence was uttered (e.g., as opposed to John will throw the bottle against the wall). John’s location is not explicitly stated, but we infer that he is in relative proximity to the wall. We may draw inferences as to the causal antecedent and consequent of this event. A plausible antecedent is that John was angry. A plausible consequent is that the bottle will break. Comprehenders are more likely to infer causal antecedents than causal consequences. Note that there are many other inferences that could be drawn. For example, one might infer that John is a middle-aged man and that the bottle was a wine bottle and that it was green and half-filled. One might also infer that the wall was a brick wall, and so on. Comprehenders typically make relatively few elaborative inferences of this kind. Rather, they are focused on the causal chain of events (Graesser et al. 1994).
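The components listed for sentence (1) can be written down as a small data structure. The following is only an illustrative sketch: the class name, the field names, and the string values are assumptions made for this example and are not part of any published situation-model formalism. It separates what the clause states from what a comprehender would plausibly infer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """One event in a situation model (all field names are hypothetical)."""
    action: str                       # what happened
    time: str                         # obligatory component
    place: str                        # obligatory component
    agents: List[str] = field(default_factory=list)
    objects: List[str] = field(default_factory=list)
    inferred_cause: Optional[str] = None        # causal antecedent, if inferred
    inferred_consequence: Optional[str] = None  # causal consequent, if inferred


def example_event() -> Event:
    """Encode 'John threw the bottle against the wall.'"""
    return Event(
        action="throw",
        time="before the moment of utterance",   # from the simple past tense
        place="near the wall",                   # inferred, not stated
        agents=["John"],
        objects=["bottle", "wall"],
        inferred_cause="John was angry",
        inferred_consequence="the bottle breaks",
    )


if __name__ == "__main__":
    print(example_event())
```

Elaborative details such as the color of the bottle are deliberately left out, in line with the observation that comprehenders rarely draw such inferences.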
7. Situation Models Establish Coherence

A major role of situation models in discourse comprehension is to help establish coherence. Suppose sentence (1) were followed in the text by (2):

(2) It shattered into a thousand pieces.

How would this sentence be comprehended? If the two sentences are part of a discourse, rather than read in isolation, they would have to be integrated in some fashion. They are about some set of events. Comprehenders assume by default that a new sentence will describe the next event in the chronological sequence. Thus, they will assume that the event described in (2) directly follows that in (1). This implies temporal contiguity between (1) and (2). The use of the simple past tense is consistent with this assumption. Event 2 occurs prior to the moment of utterance, but after event 1. Given that no passage of time was mentioned, temporal contiguity is assumed. The pronoun it in sentence (2) is taken to refer to some entity already in the situation model. There are three potential referents: John, the bottle, and the wall. John is not appropriate, because animate male human beings require the pronoun he. However, the bottle and the wall are both inanimate and thus compatible with the pronoun. Because there is no linguistic way to select the proper referent, the comprehender will use background knowledge that bottles are typically made out of glass, which is breakable, and walls out of harder materials, such as brick or concrete, to infer that it was the bottle that broke. The absence of a time lapse and the presence of an object from the previous event will cause the comprehender to assume that the second event takes place in roughly the same spatial region as the first, given that the same object cannot be at two different places at the same time. Thus, the two events can now be integrated into a single situation model in which an agent, a male individual named John, at t1 threw a breakable object, a bottle, against a wall, presumably out of anger, which immediately, at t2, caused the bottle to break into pieces. This is how situation models provide coherence in discourse. Discourse is more than a sequence of sentences. Rather, it is a coherent description of a sequence of events that are related on several dimensions. Situation model theory attempts to account for how these events are successively integrated into a coherent mental representation.

8. Situation Models Integrate Textual Information with Background Knowledge

Another function of situation models is that they allow for the integration of text-derived information with the comprehender’s background knowledge. Consider the following examples (from Sanford and Garrod 1998):

(3) Harry put the wallpaper on the table. Then he put his mug of coffee on the paper.

It is rather straightforward to integrate these sentences. They call for a spatial arrangement in which the paper is on top of the table and the mug on top of the paper and the coffee inside the mug. However, consider the following sentence pair, which differs from (3) by only one word:

(4) Harry put the wallpaper on the wall. Then he put his mug of coffee on the paper.

Many readers will have difficulty integrating these sentences, because this discourse snippet describes an impossible set of circumstances. Realizing this impossibility relies critically on the activation of background knowledge, in this case the knowledge that putting wallpaper on a wall produces a vertical surface which would not support a mug of coffee. Thus, even though (or, rather, because) there are linguistic cues in the second sentence affording integration with the first (the pronoun can be taken to refer to Harry, and paper to wallpaper), this integration of text-derived information does not produce the correct understanding that the described situation is impossible. It is necessary to activate the requisite background knowledge to make this determination. By extension, the requisite background knowledge will help the comprehender construct an adequate representation in (3).
9. Other Functions of Situation Models

Situation models are needed in translation. Word-for-word translations yield nonsense in most cases. In order to arrive at a proper translation, one has to construct a situation model based on the source language and then convey this situation model in the target language. Situation models are needed to explain learning from text. When we read a newspaper article about a current event, for example a war, we update our situation model of this event. We learn what the status of the actors and objects and their relations is at a new time. We would not learn by simply storing a mental representation of the text itself in long-term memory. In fact, what we know about a current or historical political situation is usually an amalgam of information obtained from various sources (TV news, newspapers, magazines, encyclopedias, conversations with friends, and so on).

10. The Representational Format of Situation Models

Situation models often have an almost perceptual quality. In the example about the bottle, the first thing we construct is an agent, next is his action, next is the instrument of the action, and subsequently, we see the consequence of the action. There is a debate regarding the perceptual nature of situation models. Traditionally, an amodal propositional format has been proposed for situation models (van Dijk and Kintsch 1983, Kintsch 1998). However, others have proposed perceptual symbols (Barsalou 1999, Glenberg 1997, Johnson-Laird 1983). In an amodal propositional representation, there is no analog correspondence between the mental representation and its referent. In a perceptual-symbol representation, there is. As a result, perceptual symbols represent more of the perceptual qualities of the referent than do amodal symbols, which are an arbitrary code. One of the major challenges of situation-model research will be to gather empirical evidence that speaks to the distinction between amodal and perceptual symbol systems as the representational format for situation models. Both amodal and perceptual symbol systems allow for the possibility that information constructed from a text and activated background knowledge from long-term memory have the same representational format, such that they can be integrated quite easily. This is an important quality, because the integration of text-derived information with background knowledge is an essential feature of comprehension.

11. Empirical Evidence

There is a wealth of empirical evidence supporting the notion of a situation model (see Zwaan and Radvansky 1998 for a review). Most of this evidence consists of reaction times collected while people are reading texts. In addition, there is an increasing amount of electrophysiological evidence (see, for example, Münte et al. 1998). Analyses of reading times, as well as electrophysiological measures, suggest that comprehenders have difficulty integrating a new event into the situation model when that event has different situational parameters from the previously described event, for instance when it occurs in a different time frame or involves a new participant. Evidence suggests that ease of integration of an event depends on its relatedness to the evolving situation model on each of the five situational dimensions (time, space, participants, causation, and goals). Probe-recognition studies show that the activation levels of concepts decrease after a change in situational parameters (e.g., a shift in time, space, participant, or goal structure). For example, people recognize the word ‘checked’ more quickly after having read ‘He checked his watch. A moment later, the doorbell rang’ than after ‘He checked his watch. An hour later, the doorbell rang.’ Thus, the ‘here and now’ of the narrated situation tends to be more accessible to comprehenders than other information. This, of course, mimics our everyday interaction with the world. Analyses of memory-retrieval data show that people tend to store events together in long-term memory based on the time and location at which they occur, the participants they involve, whether they are causally related, and whether or not they are part of the same goal-plan structure. There is evidence that memory retrieval is influenced by a combination of the link strengths between events on the five situational dimensions. It has furthermore been shown that people’s long-term memory for text typically reflects the situation that was described; memory representations for the discourse itself are much less resistant to decay or interference.
12. Computational Models

Kintsch (1998) has developed a computational model of text comprehension, the construction-integration model (see also Text Comprehension: Models in Psychology). In this model, a network consisting of nodes representing the surface structure, the textbase, and the situation models is successively constructed and integrated in a sequence of cycles. The construction of the network is done by hand. The nodes are propositions (or surface structure elements) and the links between them can be based on a variety of criteria, such as argument overlap, causal relatedness, or other aspects of situational relatedness. Integration occurs computationally by way of a constraint-satisfaction mechanism. Simulations using the construction-integration model have been most successful in (quantitatively) predicting text recall. In addition, the model has been shown to provide qualitative fit with a variety of findings in text comprehension, such as anaphoric resolution and sentence recognition. Similar models have been proposed by other researchers. Like the construction-integration model, these models include aspects of situation-model construction. However, there currently exists no full-fledged computational model of situation-model construction.
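To give a feel for what integration by constraint satisfaction can look like, here is a minimal spreading-activation sketch. It is a simplification written for this illustration, not Kintsch’s implementation: the proposition labels, the link weights, the clamping of negative activations to zero, and the convergence tolerance are all assumptions. The network is built by hand, as the text describes, and activation is then repeatedly spread across the links and renormalized until it settles.

```python
from typing import Dict, List, Tuple


def integrate(nodes: List[str],
              links: Dict[Tuple[str, str], float],
              max_cycles: int = 100,
              tol: float = 1e-4) -> Dict[str, float]:
    """Spread activation over a hand-built network until it stabilizes."""
    n = len(nodes)
    index = {name: i for i, name in enumerate(nodes)}
    # Symmetric weight matrix; every node supports itself with weight 1.
    w = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for (a, b), weight in links.items():
        w[index[a]][index[b]] = weight
        w[index[b]][index[a]] = weight

    act = [1.0] * n  # all nodes start fully active
    for _ in range(max_cycles):
        # One integration cycle: weighted sum of neighbors, floored at zero.
        new = [max(0.0, sum(w[i][j] * act[j] for j in range(n)))
               for i in range(n)]
        peak = max(new)
        if peak == 0.0:
            return dict(zip(nodes, new))
        new = [x / peak for x in new]  # renormalize to the range [0, 1]
        settled = max(abs(x - y) for x, y in zip(new, act)) < tol
        act = new
        if settled:
            break
    return dict(zip(nodes, act))


# Hypothetical network for sentences (1) and (2): two competing readings of "it".
NODES = ["THROW(john, bottle, wall)", "SHATTER(bottle)", "SHATTER(wall)"]
LINKS = {
    ("THROW(john, bottle, wall)", "SHATTER(bottle)"): 0.8,  # overlap plus causal knowledge
    ("THROW(john, bottle, wall)", "SHATTER(wall)"): 0.4,    # argument overlap only
    ("SHATTER(bottle)", "SHATTER(wall)"): -1.0,             # mutually exclusive readings
}

if __name__ == "__main__":
    for name, value in integrate(NODES, LINKS).items():
        print(f"{name}: {value:.3f}")
```

With these invented weights the reading in which the bottle shatters ends up more active than the reading in which the wall shatters, which is the qualitative pattern one would expect for anaphor resolution.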
13. Beyond Language

Situation models have significance beyond the domain of text comprehension. Researchers are beginning to apply this concept to other domains of cognitive psychology, such as the comprehension of visual media (e.g., movies) and autobiographical memory. In the first case, situation models are acquired vicariously, as in language comprehension, but are based, in part, on nonlinguistic visual and auditory information. In the second case, they are acquired via direct experience. The question of whether and how the mode of acquisition affects the nature of situation models is a fruitful one.

See also: Concept Learning and Representation: Models; Figurative Thought and Figurative Language, Cognitive Psychology of; Knowledge Activation in Text Comprehension and Problem Solving, Psychology of; Language and Thought: The Modern Whorfian Hypothesis; Literary Texts: Comprehension and Memory; Mental Models, Psychology of; Mental Representations, Psychology of; Narrative Comprehension, Psychology of; Reasoning with Mental Models; Sentence Comprehension, Psychology of
Bibliography

Barsalou L W 1999 Perceptual symbol systems. Behavioral and Brain Sciences 22: 577–660
Craik K 1943 The Nature of Explanation. Cambridge University Press, Cambridge, UK
Glenberg A M 1997 What memory is for. Behavioral and Brain Sciences 20: 1–19
Graesser A C, Singer M, Trabasso T 1994 Constructing inferences during narrative text comprehension. Psychological Review 101: 371–95
Johnson-Laird P N 1983 Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness. Harvard University Press, Cambridge, MA
Kintsch W 1998 Comprehension: A Paradigm for Cognition. Cambridge University Press, Cambridge, MA
Kripke S 1963 Semantical considerations on modal logics. Acta Philosophica Fennica 16: 83–94
Minsky M 1975 A framework for representing knowledge. In: Winston P H (ed.), The Psychology of Computer Vision. McGraw-Hill, New York, pp. 211–77
Münte T F, Schiltz K, Kutas M 1998 When temporal terms belie conceptual order. Nature 395: 71–3
Sanford A J, Garrod S C 1998 The role of scenario mapping in text comprehension. Discourse Processes 26: 159–90
Schank R C, Abelson R 1977 Scripts, Plans, Goals and Understanding: An Inquiry into Human Knowledge Structures. Erlbaum, Hillsdale, NJ
van Dijk T A 1999 Context models in discourse processing. In: van Oostendorp H, Goldman S R (eds.), The Construction of Mental Representations During Reading. Erlbaum, Mahwah, NJ, pp. 123–48
van Dijk T A, Kintsch W 1983 Strategies in Discourse Comprehension. Academic, New York
Zwaan R A, Radvansky G A 1998 Situation models in language comprehension and memory. Psychological Bulletin 123: 162–85
R. A. Zwaan
Skinner, Burrhus Frederick (1904–90)

After J. B. Watson, the founder of the behaviorist movement, Skinner has been the most influential, and also the most controversial, figure of behaviorism. His contributions to behavioral sciences are manifold: he designed original laboratory techniques for the study of animal and human behavior, which he put to work to produce new empirical data in the field of learning, and to develop a theory of operant behavior that led him eventually to a general psychological theory; he further elaborated the behaviorist approach, both refining and extending it, in a version of behavioral science which he labeled radical behaviorism; he formulated seminal proposals for applied fields such as education and therapy and pioneered machine-assisted learning; finally, building upon his conception of the causation of behavior, he ventured into social philosophy, questioning the traditional view of human nature and of the relation of humans to their physical and social environment. This part of his work has been the main source of sometimes violent controversy.
1. Biographical Landmarks

Skinner was born on March 20, 1904 in Susquehanna (Pennsylvania, USA) into a middle-class family and experienced the usual childhood and adolescence of provincial American life. He attended Hamilton College, which was not a particularly stimulating institution for him. He was first attracted by a literary career, which he soon gave up, after traveling to Europe. He turned to psychology, and was admitted to Harvard in 1928. He obtained his Ph.D. in 1931, with a theoretical thesis on the concept of reflex, a first landmark in his reflections on the causation of behavior, an issue he was to pursue throughout his scientific career. He stayed at Harvard five more years, as the beneficiary of an enviable fellowship, affiliated to the physiology laboratory headed by Crozier. In 1936, he was appointed professor at the University of Minnesota, where he developed his conditioning chamber for the study of operant behavior in animals (which was to be known as the Skinner box) and wrote his first book, The Behavior of Organisms (1938). In 1945, he moved to Indiana University, as Chairman of the Department of Psychology. In 1948, he was offered the prestigious Edgar Pierce Professorship in Psychology at Harvard University, where he was to stay until his death on August 18, 1990. Skinner received during his lifetime the highest national awards an American psychologist could receive, and he was praised as one of the most prominent psychologists of the century, in spite of harsh attacks against some of his ideas from opposite sides of scientific and lay circles.

Although most of Skinner’s laboratory research was carried out with animal subjects, mainly rats and pigeons, using the so-called operant procedure that will be described hereafter, he was essentially interested not in animal behavior proper, but in behavior at large, and more specifically in human behavior. Following the tradition of other experimental psychologists before him, such as Pavlov and Thorndike, and of most of his colleagues in the behaviorist school of thought, such as Hull, Tolman, or Guthrie, he resorted to animals as more accessible subjects than humans for basic studies on behavior, just as physiologists used to do quite successfully. The relevance of extrapolating from animals to humans is of course an important issue in psychology. However, Skinner’s main concern was obviously with humans, as evidenced by his literary writings in the field of social philosophy, namely the utopian novel Walden Two (1948) and the essay Beyond Freedom and Dignity (1971), as well as by his theoretical endeavors to account for human behavior (Science and Human Behavior 1953) and for verbal behavior (Verbal Behavior 1957).

2. Operant Conditioning and the Skinner Box
The operant conditioning chamber, often called the Skinner box, is a laboratory device derived from Thorndike’s puzzle box and from the mazes familiar to students of learning in rats by the time Skinner started his career. In its most common form, it consists of a closed space in which the animal moves freely; it is equipped with some object that the subject can manipulate easily (be it a lever for rats, or a small illuminated disk upon which pigeons can peck) and with a food dispenser for delivering calibrated quantities of food. By spontaneously exploring this particular environment, possibly with the help of the experimenter, who progressively shapes its behavior, the subject will eventually discover the basic relation between a defined response (pressing the lever or pecking the key) and the presentation of a reinforcing stimulus (a small food reward). The basic relation here is between an operant response (i.e., a response instrumental in producing some subsequent event) and its consequence (i.e., the reinforcement), rather than between a stimulus and a response elicited by it, as in Pavlovian or respondent conditioning. This simple situation may be made more complex either by introducing so-called discriminative stimuli, the function of which is not to trigger the response in the manner of a reflex, but to set additional conditions under which the response will be reinforced, or by changing the basic one response-one reinforcement link to some more complicated contingencies, for instance requiring a given number of responses for one reinforcement, or the passing of some defined delay.
A wide variety of schedules of reinforcement have been studied in this way, be it for their own sake as sources of information on the lawfulness of behavior (for instance, modern research has applied optimization models borrowed from economics to the study of operant behavior), or as efficient tools for other purposes (such as the analysis of sensory functions in animal psychophysics, of the effects of drugs acting upon the Central Nervous System in experimental psychopharmacology, or of cognitive capacities). The operant technique, as Skinner developed it from the 1930s to the 1950s, presented two important features. First, it emphasized the study of individual subjects through a long period of time rather than groups of subjects for a few sessions, as used to be the case in maze studies and the like. This interest in individual behavior would favor later applications to human subjects in educational and clinical settings. Second, the operations involved were soon automated by resorting to electromechanical circuits, to be replaced later by online computer control. This led to a level of efficiency and precision unprecedented in psychological research.
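The two contingencies mentioned above, a required number of responses and a required delay, are conventionally called fixed-ratio and fixed-interval schedules. The sketch below shows, under simplifying assumptions, how a control program might decide which responses are reinforced. The function names, the use of seconds as the time unit, and the choice to start the first interval at time zero are assumptions of this example, not a description of any actual laboratory software.

```python
from typing import List


def fixed_ratio(response_times: List[float], ratio: int) -> List[float]:
    """Reinforce every `ratio`-th response; return the reinforced times."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [t for count, t in enumerate(response_times, start=1)
            if count % ratio == 0]


def fixed_interval(response_times: List[float], interval: float) -> List[float]:
    """Reinforce the first response made at least `interval` seconds after
    the previous reinforcement (the first interval starts at time 0)."""
    reinforced: List[float] = []
    last = 0.0
    for t in response_times:
        if t - last >= interval:
            reinforced.append(t)
            last = t  # the next interval is timed from this reinforcement
    return reinforced


if __name__ == "__main__":
    presses = [1.0, 2.0, 5.0, 6.0, 11.0, 12.0]   # lever presses, in seconds
    print(fixed_ratio(presses, 3))       # every third press
    print(fixed_interval(presses, 5.0))  # first press after each 5 s delay
```

A ratio of 1 reproduces the basic one response-one reinforcement link described in the text.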
3. The Evolutionary Analogy

Skinner captured the essence of operant behavior in the formula ‘control of behavior by its consequences,’ and very early he pointed to the analogy between the selection of the response by the subsequent event and the mechanism at work in biological evolution. An increasingly large part of his theoretical contributions was eventually devoted to elaborating the evolutionary analogy (Skinner 1987). The generalization of the selectionist model to behavior acquisition at the individual level, initially little more than a metaphor, has recently gained credentials with the theses of neurobiologists, such as Changeux’s Generalised Darwinism (1983) or Edelman’s Neural Darwinism (1987), both of whom have substantiated in ontogeny selective processes previously reserved to phylogeny. One of the main tenets of Skinner’s theory thus converges with contemporary views in the neurosciences. Skinner extended the selectionist explanation to cultural practices and achievements, joining some schools of thought in cultural anthropology and in the history of science, such as Karl Popper’s selectionist account of scientific hypotheses.
4. Radical Behaviorism

As a behaviorist, Skinner viewed psychology as a branch of natural sciences, more explicitly as a branch of biology, which can deal with its subject matter using the same principles as in other fields of life sciences, albeit with specific implementations as required by its particular level of analysis. Skinner defended a brand of behaviorism quite distinct from the dominant view that prevailed in the second quarter of the century. His radical behaviorism was opposed to methodological behaviorism. For most psychologists, defining their science as the science of behavior, after Watson’s recommendation, did not really mean that they had abandoned mental life as the main objective of their inquiry; rather, they had simply resigned themselves to studying behavior, because they had to admit that they had no direct access to mental life. Such methodological behaviorism, in fact, remained basically dualistic. In contrast, radical behaviorism is definitely monist, and it rejects any distinction between what is called mental and behavioral. Skinner is, in this respect, closer to Watson’s view than to the position of other influential neobehaviorists of his generation, although he developed a far more sophisticated view of behavior than Watson’s. For instance, he rejected the simplistic claim that thought is nothing more than subvocal language, and admitted that not all behavior is directly observable. Part of human behavior is obviously private, or covert, and raises difficult methodological problems of accessibility; but this is no reason to give it different status in a scientific analysis. Skinner vigorously denounced mentalism not so much because it refers to events that would occur in another space and be of a different substance than behavior, but because it offers pseudo-explanations which give the illusion of understanding what in fact remains to be accounted for. Mentalistic explanations, very common in everyday psychology, were still quite frequent in scientific psychology. For example, in the field of motivation, all sorts of behavior were assigned internal needs as causal agents: not only do we eat because we are hungry (a simple statement which, at any rate from a scientist’s point of view, requires qualification), but we exhibit aggressive behavior because of some aggression drive, we interact with social partners because of a need for affiliation, we work to successful and creative outcomes because of a need for achievement, etc. For Skinner, such explanations excuse us from looking for the variables really responsible for the behavior; these variables are more often than not to be found in the environment, and can be traced to the history of the individuals interacting with their environment, social and physical. Combining this epistemological conception and the results of his empirical research, Skinner developed a theory of human behavior emphasizing the determining role of environmental contingencies on human actions.
However, along the lines of the evolutionary analogy, he suggested a model that would account equally well for novelty and creative behavior, as exhibited in artistic and scientific productions, and for stabilized, persistent habits adapted to unchanging conditions. Skinner devoted special attention to verbal behavior, the importance of which in the human species justified special treatment. He presented his book Verbal Behavior as ‘an essay in interpretation,’ proposing a functional analysis of the verbal exchanges composing the global episode of speaker-listener communication. In spite of the violent criticisms expressed by the linguist Chomsky (1959), Skinner’s analysis foreshadowed some aspects of the pragmatic approach adopted some years later by many psycholinguists, aware of the insufficiency of formal grammars to account for central features characterizing the use of language. He viewed verbal behavior as shaped by the linguistic community, and exerting a genuine type of control over an individual’s behavior, distinct from the action of the physical environment. A large part of human behavior is, in his terms, rule-governed,
that is, controlled by words, rather than shaped by contingencies, that is, through direct exposure to the physical world. Many behaviors can have one or the other origin: avoidance of flames can derive from direct experience with fire, or from warnings received during education. Once endowed with verbal behavior, individuals can use it to describe and anticipate their own behavior, or to develop it in its own right, as in literary composition. Setting aside the unfortunate use of the term rule, which invites confusion with the word as used in formal linguistics and with the notion of coercive control, what Skinner was pointing to is akin to a distinction now familiar in contemporary psychology and neuroscience between top-down and bottom-up causation.
5. Education

Skinner’s interests in applications covered three main areas: education, psychological treatment, and social practices at large. He dealt with the first two in a technical manner, developing principles and techniques for improving educational and therapeutic practices. His treatment of the third one is more akin to social philosophy than to scientific application, although he viewed it as very consistently rooted in his scientific thinking. At a time when school education in the US was criticized for its deficiencies and for its inefficiency in competing with the technological achievements of the Soviet Union, Skinner, like many other American scientists, inquired into the reasons for that state of affairs. Observing what was going on in any normal classroom, including in reputable schools, he concluded that it blatantly violated the most basic principles of learning derived from laboratory analysis of the learning process. Pupils and students were passively exposed to teachers’ monologues rather than actively producing behaviors followed by feedback; there was no attempt to adjust the teacher’s actions to the individual level and rhythm of learning; negative evaluation based on mistakes and errors prevailed over positive evaluation pointing to progress achieved; punitive controls, known to be poorly effective in shaping and maintaining complex behavior, were still widely used; general conditions and teaching practices were far from favorable to developing individual talents and creativity. Such criticisms had been made by others, but Skinner differed from them in the analysis of the causes and in the remedies proposed. He did not question the importance of endowing students with the basic knowledge and skills that they need if they are to engage in more complex and original activities. But he thought such skills could be mastered using more efficient methods than those currently in use in the classroom.
This was the origin of teaching machines, a term that would raise strong objections on the grounds that machines could not but lead to dehumanizing teaching. In fact, Skinner’s idea has been implemented since then in computer-assisted learning, which is now widely accepted, with little reference to the pioneering projects of the behaviorist psychologist. The device he designed in the 1950s appears quite primitive compared with modern computers: it was an electromechanical machine (adapted from a record player) built in such a way that it would present to the student, in a window, successive small frames of the material to be learned, each frame requiring an active answer from the learner. The latter could learn at his or her own individual rhythm, ideally with few or no errors, and eventually reach the end of the program, with the guarantee that the subject matter had been mastered from end to end. Good programs would make exams useless, if exams are just a way to check that the material has been covered and understood. Most important, students and teachers would not waste the few hours they could work together on tasks easily fulfilled using teaching devices; they could devote the time so spared to more constructive activities requiring direct human contact. In spite of numerous attacks in educational circles, Skinner’s project inspired many applications, such as programmed instruction in book form, until modern computers offered the elegant solution we know today. It also contributed to the development of individualized teaching approaches that favor methods allowing students to learn at their own pace in an autonomous and active way. Teaching machines are but one aspect of Skinner’s contribution to education. In a number of his writings, including his utopian novel, Skinner (1968, 1978) expressed his reflections on educational issues.
He was concerned especially with the disproportion between the resources devoted to education and the poor outcomes; with the tendency to level down individual differences; with the increasing distance between the school environment and real life; with violence in schools, and other matters which remain crucial issues today, with little improvement.
6. Behavior Therapy

Treatment of psychological disturbances was another field of application to which Skinner (1955, 1989) made influential contributions. By the middle of the twentieth century, psychopathology had elaborated refined descriptive systems of psychological disturbances and equally sophisticated explanatory models such as psychoanalytic theory. In contrast, methods of treatment were scarce and their results poor. Psychoanalytic treatment had limited indications and practical limitations. Rogers’s nondirective therapy, though quite popular, did not provide convincing results. Psychopharmacology was still in its infancy.
Skinner did not question the classical categorization of mental illnesses, nor did he propose miracle remedies. He simply suggested looking at them as disturbances in behavior, rather than as alterations of hypothetical mental structures, such as the psychic apparatus appealed to by psychoanalysis, of which abnormal behavior would be but observable indicators, or symptoms. Consequently, he proposed to attempt to change undesirable behavior by acting directly upon it, rather than upon underlying structures supposedly responsible for it. This approach was not totally new: behavior therapy had its origins in John Broadus Watson’s attempts to treat fear in children by resorting to Pavlovian conditioning and in the theoretical work of some neobehaviorists aimed at transposing some psychoanalytical concepts into learning theory models. What Skinner added was his genuine theoretical elaboration, especially based on his antimentalist stand, and techniques of behavior modification drawn from the operant laboratory, supplementing the Pavlovian techniques in use up to then. He also brought into the clinical field a sense of rigor transferred from the laboratory, perfectly compatible with the study of single cases. Skinner did not practice behavior therapy himself. His influence in the field was indirect, by stimulating pioneering research on psychotic patients and mentally deficient people. He gave a decisive impetus to the development of the behavioral approach to treatment, which soon acquired a major position in clinical psychology and counseling, and which eventually merged, somewhat paradoxically, with cognitively oriented practices into what are labeled behavioral-cognitive therapies.
7. Social Philosophy

Skinner’s social philosophy was based on a deep confidence that only science would help us in solving the problems we are facing in our modern societies. What is needed is a science of behavior, which in fact is now available, he thought, and could be applied if only we were ready to abandon traditional views of human nature. He first expressed his ideas in the novel Walden Two (1948). Written two years after the end of World War Two, the book describes a utopian community run according to principles derived from the psychological laboratory. It is by no means a totalitarian society, as some critics have claimed. Looked at retrospectively, it is surprisingly prescient about social issues that are still largely unsolved half a century later. For instance, working schedules at Walden Two have been arranged so that all tasks needed for the production of goods and the good functioning of the community are distributed among members according to a credit system which results in an average of 24 hours of work per week, avoiding unemployment, abolishing any social discrimination between
manual and intellectual work, and leaving many free hours for leisure activities such as sports, arts, and scientific research. Emphasis is put on active practice rather than passive watching, on co-operation rather than competition. The community is not isolated culturally: cultural products from outside such as books or records are of course welcome, but radio programs are filtered to eliminate advertising. Education is active; the school building symbolically has no door separating it from the life and work community; there are no age classes, no humiliating ranking; all learn at their own rhythm, in whatever orientation they feel appropriate, throughout their lifetime. Women enjoy complete equality with men. Waste of natural resources is avoided. Special offices in the management of the community are strictly limited in time, eliminating any risk of political careers. Similar themes, together with alarming concerns about pollution, violence, uncontrolled population growth, nuclear weapons, and the like, are further elaborated in the essay Beyond Freedom and Dignity (1971) and a number of later articles. In an alarming tone, Skinner points to what he feels is the core of our inability to deal with these issues, namely our obstinacy in keeping a conception of human nature which scientific inquiry shows us to be wrong, and which bars any solution to the problems we are confronted with. We still stick to a view of humans as being the center of the universe, free and autonomous, dominating nature, while we are but one among many elements of nature. As a species, we are the product of biological evolution; as cultural groups, the result of our history; and as individuals, the outcome of our interactions with the environment. Because we fail to admit this dependency, and to draw the consequences of it, we may endanger our own future.
Freedom, autonomy, and merit are not absolute values: they were forged throughout history, and more often than not they are used to disguise insidious controls, the mechanisms of which should be elucidated if we want to develop counter-controls eventually allowing for the survival of our species. Skinner viewed these various facets of his work as closely related, making for a highly consistent theory of human behavior, in which the critical analysis of social processes in modern society was deeply rooted in the experimental analysis of the behavior of animal subjects in the laboratory. So global an ambition has been criticized, and clearly the various aspects of his contribution did not meet the same fate. Although the operant technique is now a widely used procedure for many purposes in experimental research in psychology and related fields, although his early attempts to build teaching machines now appear as ancestors of computer-assisted learning and teaching, and although a number of principles of contingency analysis are now routinely put into practice in behavior therapies, radical behaviorism has been seriously questioned and even shaken by the rise of the cognitivist approach in psychology, while
Skinner’s social philosophy has been attacked from different fronts on both ideological and scientific grounds. Like most great theory builders of the twentieth century in psychology, from Sigmund Freud and Watson to Jean Piaget, Skinner may be blamed for having reduced the explanation of human nature to a very limited set of concepts and findings, namely those he had forged and observed in his own restricted field of research and reflection, ignoring even other concepts and facts in neighboring fields of psychology, let alone those of other sciences. It is clear that Skinner made no attempt to integrate, for example, the contributions of developmental or social psychology, nor those of sociology, cultural anthropology, or linguistics. Such neglect might have been deliberate, legitimated by the will to concentrate on what were, in Skinner’s mind, essential points left out by other branches of psychology or other sciences dealing with human societies. However, it might appear as sectarianism to those who favor an integrative and pluridisciplinary approach to the complex objects of the human sciences. It cannot be decided whether his influence would have been larger or smaller had he adopted a less exclusive stand.
See also: Autonomic Classical and Operant Conditioning; Behavior Therapy: Psychiatric Aspects; Behavior Therapy: Psychological Perspectives; Behaviorism; Behaviorism, History of; Conditioning and Habit Formation, Psychology of; Darwinism: Social; Educational Learning Theory; Educational Philosophy: Historical Perspectives; Evolutionary Epistemology; Freud, Sigmund (1856–1939); Lashley, Karl Spencer (1890–1958); Mental Health and Normality; Operant Conditioning and Clinical Psychology; Pavlov, Ivan Petrovich (1849–1936); Piaget, Jean (1896–1980); Psychological Treatment, Effectiveness of; Psychological Treatments, Empirically Supported; Psychology: Historical and Cultural Perspectives; Thorndike, Edward Lee (1874–1949); Utopias: Social; Watson, John Broadus (1878–1958)
Bibliography

Bjork D W 1997 B. F. Skinner, A Life. American Psychological Association, Washington, DC
Changeux J-P 1983 L’Homme neuronal. Fayard, Paris (The Neuronal Man)
Chomsky N 1959 Review of Skinner B F, Verbal behavior. Language 4: 16–49
Edelman G M 1987 Neural Darwinism: The Theory of Neuronal Group Selection. Basic Books, New York
Modgil S, Modgil C (eds.) 1987 B. F. Skinner: Consensus and Controversy. Falmer, New York
Richelle M 1993 B. F. Skinner, A Reappraisal. Erlbaum, Hove, London
Roales-Nieto J G, Luciano Soriano M C, Pérez Alvarez M (eds.) 1992 Vigencia de la Obra de Skinner. Universidad Granada Press, Granada (Robustness of Skinner’s Work)
Skinner B F 1938 The Behavior of Organisms. Appleton Century Crofts, New York
Skinner B F 1948 Walden Two. Macmillan, New York
Skinner B F 1953 Science and Human Behavior. Macmillan, New York
Skinner B F 1955 What is psychotic behavior. In: Gildea F (ed.) Theory and Treatment of the Psychoses: Some Newer Aspects. Washington University Studies, St Louis, MO, pp. 77–99
Skinner B F 1957 Verbal Behavior. Appleton Century Crofts, New York
Skinner B F 1961 Cumulative Record. Appleton Century Crofts, New York
Skinner B F 1968 The Technology of Teaching. Appleton Century Crofts, New York
Skinner B F 1971 Beyond Freedom and Dignity, 1st edn. Knopf, New York
Skinner B F 1978 Reflections on Behaviorism and Society. Prentice-Hall, Englewood Cliffs, NJ
Skinner B F 1987 Upon Further Reflection. Prentice-Hall, Englewood Cliffs, NJ
Skinner B F 1989 Recent Issues in the Analysis of Behavior. Merrill, Columbus, OH
M. N. Richelle
Slavery as Social Institution

Slavery is the most extreme form of the relations of domination. It has existed, at some time, in most parts of the world and at all levels of social development. This article examines five aspects of the institution: Its distinguishing features; the means by which persons were enslaved; the means by which owners acquired them; the treatment and condition of slaves; and manumission, or the release from slavery.
1. The Distinctive Features of Slavery

The traditional, and still conventional, approach is to define slavery in legal–economic terms, typically as ‘the status or condition of a person over whom any or all the powers attaching to the right of ownership are exercised’ (League of Nations 1938, Vol. 6). In this view, the slave is, quintessentially, a human chattel. This definition is problematic because it adequately describes mainly Western and modern capitalistic systems of slavery. In many non-Western parts of the world, several categories of persons who were clearly not slaves, such as unmarried women, concubines, debt bondsmen, indentured servants, sometimes serfs, and occasionally children, were bought and sold. Conversely, in many slave-holding societies certain categories of slaves, such as those born in the household, were not treated as chattels.
Slavery is a relation of domination that is distinctive in three respects. First, the power of the master was usually total, if not in law, almost always in practice. Violence was the basis of this power. Even where laws forbade the gratuitous killing of slaves, it was rare for masters to be prosecuted for murdering them, due to the universally recognized right of masters to punish their slaves, and to severe constraints placed on slaves in giving evidence in courts of law against their masters, or free persons generally. The totality of the master’s claims and powers in them meant that slaves could have no claims or powers in other persons or things, except with the master’s permission. A major consequence of this was that slaves had no custodial claims in their children; they were genealogical isolates, lacking all recognized rights of ancestry and descent. From this flowed the hereditary nature of their condition. Another distinctive consequence of the master’s total power is the fact that slaves were often treated as their masters’ surrogates, and hence could perform functions for them as if the masters were legally present, a valuable trait in premodern societies with advanced commodity production and long-distance trading, such as ancient Rome, where laws of agency, though badly needed, were nonexistent or poorly developed. Second, slaves were universally considered as outsiders, this being the major difference between them and serfs. They were natally alienated persons, deracinated in the act of their, or their ancestors’, enslavement, who were held not to belong to the societies in which they lived, even if they were born there. They lacked all legal or recognized status as independent members of a community. In kin-based societies, this was expressed in their definition as kinless persons; in more advanced, state-based societies, they lacked all claims and rights of citizenship.
Because they belonged only to their master, they could not belong to the community; because they were bonded only to their master’s household, they could share no recognized bond of loyalty and love with the community at large. The most ancient words for slaves in the Indo-European and several other families of languages translate as ‘those who do not belong,’ or ‘not among the beloved,’ in contrast with free members of the community, who were ‘among the beloved,’ and ‘those who belonged.’ Third, slaves were everywhere considered to be dishonored persons. They had no honor that a nonslave person need respect. Masters could violate all aspects of their slaves’ lives with impunity, including raping them. In most slave-holding societies, injuries against slaves by third parties were prosecuted, if at all, as injuries against the person and honor of the master. Where an honor-price or wergild existed, as in Anglo-Saxon Britain and other Germanic lands, its payment usually went to the master rather than to the injured slave. Universally, slavery was considered the most extreme form of degradation, so
much so that the slave’s very humanity was often in question. For all these reasons, there was a general tendency to conceive of slaves symbolically as socially dead persons. Their social death was often represented symbolically in ritual signs and acts of debasement, death and mourning: In clothing, hairstyles, naming practices, and other rituals of obeisance and nonbeing.
2. The Modes of Enslavement

Free persons became slaves in one of eight ways: capture in warfare; kidnapping; through tribute and taxation; indebtedness; punishment for crimes; abandonment and sale of children; self-enslavement; and birth. Capture in warfare is generally considered to have been the most important means of acquiring slaves, but this was true mainly of simpler, small-scale societies, and of certain volatile periods among politically centralized groups. Among even moderately advanced premodern societies, the logistics of warfare often made captivity a cumbersome and costly means of enslaving free persons. Kidnapping differed from captivity in warfare mainly in the fact that enslavement was its main or sole objective, and that it was usually a private act rather than the by-product of communal conflict. Other than birth, kidnapping in the forms of piracy and abduction was perhaps the main form of enslavement in the ancient Near East and the Mediterranean during Greek and Roman times; and this was true also of free persons who were enslaved in the trans-Saharan and transatlantic slave trades. Debt bondage, which was common in ancient Greece up to the end of the seventh century BC, the ancient Near East, and in South East Asia down to the twentieth century, could sometimes descend into real slavery, although nearly all societies in which it was practiced distinguished between the two institutions in at least three respects: Debt bondage was nonhereditary; bondsmen remained members of their community, however diminished; and they maintained certain basic rights, both in relation to the bondholder and to their spouses and children. Punishment for crimes was a major means of enslavement in small, kin-based societies; China, and to a lesser extent, Korea, were the only advanced societies in which it remained the primary way of becoming a slave.
Nonetheless, it persisted as a minor means of enslavement in all slaveholding societies, and became important historically in Europe as the antecedent of imprisonment for the punishment of crimes. The enslavement of foundlings was common in all politically centralized premodern societies, though rarely found in small-scale slaveholding communities. It was the humane alternative to infanticide and was especially important in China, India, European antiquity, and medieval Europe. It has been argued that
it ranked second to birth as a source of slaves in ancient Rome from as early as the first century CE until the end of the Western empire. Self-enslavement was rare and was often the consequence of extreme penury or catastrophic loss. In nearly all slave-holding societies where the institution was of any significance, birth rapidly became the most important means by which persons became slaves, and by which slaves were acquired. Contrary to a common misconception, this was true even of slave societies in which the slave population did not reproduce itself naturally. The fact that births failed to compensate for deaths, or to meet the increased demand for slaves (which was true of most of the slave societies of the New World up to the last decades of the eighteenth century), does not mean that birth did not remain the main source of slaves. However, the role of birth as a source of slaves was strongly mediated by the rules of status inheritance, which took account of the complications caused by mixed unions between slaves and free persons. There were four main rules. (a) The child’s status was determined by the mother’s status only, regardless of the father’s status. This was true of most modern and premodern Western slave societies, and of nearly all premodern non-Western groups with matrilineal rules of descent. (b) Status was determined by the father only, regardless of the mother’s status. This unusual pattern was found mainly among certain rigidly patrilineal groups, especially in Africa, where it was the practice among groups such as the Migiurtini Somali, the Margi of northern Nigeria, and among certain Ibo tribes. The practice, however, was not unknown in the West. It was the custom in Homeric Greece and was the norm during the seventeenth century in a few of the North American colonies such as Maryland and Virginia, and in South Africa and the French Antilles up to the 1680s.
(c) Status was determined by the principle of deterior condicio, that is, by the mother or father, whoever had the lower status. This was the harshest inheritance rule and was the practice in China from the period of the Han dynasties up to the reforms of the thirteenth and fourteenth centuries. It found its most extreme application in Korea, where it was the dominant mode down to 1731. The rule also applied in Visigothic Spain, and medieval and early modern Tuscany. The only known case in the New World was that of South Carolina in the early eighteenth century. (d) The fourth principle of slave inheritance, that of melior condicio, is in direct contrast with the last mentioned, in that the child inherited the status of the free parent, whatever the gender of the parent, as long as the father acknowledged his paternity. This is the earliest known rule of slave inheritance and may have been the most widely distributed. It was the norm in the ancient Near East and, with the exception of the Tuareg, it became the practice among nearly all
Islamic societies. The rule was supported among Muslims by another Koranic prescription and practice: The injunction that a slave woman was to be freed, along with her children, as soon as she bore a son for her master. The only known cases in Europe of the melior condicio rule both emerged during the thirteenth century. In Sweden, it was codified in the laws of Östergötland and Svealand as part of a general pattern of reforms. In Spain, religion was the decisive factor in the appearance of a modified version of the rule. Baptized children of a Saracen- or Jewish-owned slave and a Christian were immediately freed. Throughout Latin America, although the legal rule was of the first type (the children of slave women were to become slaves), the widespread custom of concubinage with slave women, coupled with the tendency to recognize and manumit the offspring of such unions, meant that, in practice, a modified version of the melior condicio rule prevailed where the free person was the father, which was usually the case with mixed unions.
3. The Acquisition of Slaves

Slaves not born to, or inherited by, an owner were acquired mainly through external and internal trading systems; as part of bride and dowry payments; as money; and as gifts. There were five major slave trading systems in world history. The Indian Ocean trade was the oldest, with records dating back to 1580 BC, and persisted down to the twentieth century AD. Slaves from sub-Saharan Africa were transported to the Middle and Near East as well as Southern Europe. It has been estimated that between the years 800 AD and 1800 approximately 3 million slaves were traded on this route, and over 2 million were traded during the nineteenth century. The Black Sea and Mediterranean slave trade supplied slaves to the ancient European empires and flourished from the seventh century BC through the end of the Middle Ages. Over a quarter of a million slaves may have been traded in this system during the first century of our era. The European slave trade prospered from the early ninth century AD to the middle of the twelfth, and was dominated by the Vikings. One of the two main trading routes ran westward across the North Sea; the other ran eastward. Celtic peoples, especially the Irish, were raided and sold in Scandinavia. Most slaves traded on the eastern routes were of Slavic ancestry. It was the Viking raiding, and wide distribution, of Slavic slaves throughout Europe that accounts for the common linguistic root of the term ‘slave’ in all major European languages. The trans-Saharan trade persisted from the mid-seventh century AD down to the twentieth and involved the trading of sub-Saharan Africans throughout North Africa and parts of Europe. It has been
estimated that some 6.85 million persons were traded in this system up to the end of the nineteenth century. Although it declined substantially during the twentieth century, largely under European pressure, significant numbers of Africans are still being traded in Sudan and Mauritania. The transatlantic slave trade was the largest in size and certainly the most extensive of all these systems. The most recent evidence suggests that between the years 1500 and 1870, some 11 million Africans were taken to the Americas. Of the 10.5 million who were forced from Africa between 1650 and 1870, 8.9 million survived the crossing. Although all the maritime West European nations engaged in this trade, the main traders were the British, Portuguese, and French. Four regions account for 80 percent of all slaves going to the New World: the Gold Coast (Ghana), the Bights of Benin and Biafra, and West-Central Africa. Forty percent of all slaves landed in Brazil; and 47 percent in the Caribbean. Although only 7 percent of all slaves who left Africa landed in North America, by 1810 the United States had, nonetheless, one of the largest slave populations due to its unusual success in the reproduction of its slave populations. For the entire three and a half centuries of the Atlantic slave trade, approximately 15 percent of those forced from Africa died on the Atlantic crossing (some 1.5 million) with losses averaging between 6000 and 8000 per year during the peak period between the years 1760 and 1810.
4. The Treatment of Slaves

It is difficult to generalize about the treatment of slaves, since this varied considerably not only between societies but also within them. There was no simple correlation of favorable factors. Furthermore, the same factor may operate in favor of slaves in one situation, but against them in the next. Thus, in many small, kin-based societies, slaves were relatively well treated and regarded as junior members of the master’s family, but could nonetheless be sacrificed brutally on special occasions. In advanced premodern societies such as Greece and Rome, as well as modern slave societies such as Brazil, slaves in the mines or latifundia suffered horribly short lives, while skilled urban slaves were often virtually free and sometimes even pampered. In the Caribbean, the provision ground system, by which slaves supported themselves, led to high levels of malnutrition compared to the US South, where masters provided nearly all the slaves’ provision. Nonetheless, Caribbean slaves cherished the provision ground system for the periods of self-determination and escape from the master’s direct control that it offered. In general, the most important factors influencing the condition of slaves were the uses to which they were put, their mode of acquisition, their location—
whether urban or rural—absenteeism of owners, proximity to the master, and the personal characteristics of the slaves. Slaves were acquired for purely economic, prestige, political, administrative, sexual, and ritual purposes. Slaves who worked in mines or in gangs on highly organized farming systems were often far worse off than those who were employed in some skilled craft in urban regions, especially where the latter were allowed to hire themselves out independently. Slaves acquired for military or administrative purposes, as was often the case in the Islamic world— the Janissaries and Mameluks being the classic examples—were clearly at an advantage when compared with lowly field hands or concubines. Newly bought slaves, especially those who grew up as free persons and were new to their masters’ society, usually led more wretched lives than those born to their owners. High levels of absenteeism among owners—which was true of the owners of slave latifundia in ancient Rome as well as the Caribbean slave societies and some parts of Latin America—often meant ill-usage by managers and overseers paid on a commission basis. Proximity to the master cut both ways with respect to the treatment of slaves. Slaves in the household were usually materially better off, and in some cases close ties developed between masters and these slaves, such as those between masters and their former nannies, or with a favored concubine. However, proximity meant more sustained and direct supervision, which might easily become brutal. Ethnic and somatic or perceived racial differences between masters and slaves operated in complex ways. Intra-ethnic slavery was uncommon in world history, although by no means nonexistent, while that between peoples of different perceived ‘racial’ groups was frequent. The common view that New World slavery was distinctive in that masters and slaves belonged to different perceived races is incorrect. 
Where there were somatic differences, the treatment of the slave depended on how these differences were perceived and, independently of this, how attractive the slave was in the eyes of the master. Scandinavian women were prized in the slave harems of many Islamic Sultans, but so were attractive Ethiopian and sub-Saharan women. Furthermore, in Muslim India and eighteenth-century England and France, dark-skinned slaves were the most favored, especially as young pages. In the New World, on the other hand, mulatto and other light-skinned female slaves were often better treated than their more African-looking counterparts. Two other factors should be mentioned in considering the treatment of slaves: Laws and religion. Slightly more than a half of all slave societies on which data exist had some kind of slave laws, in some cases elaborate servile codes, the oldest known being those of ancient Mesopotamia. Slightly less than a half had none. Laws did make a difference; a much higher proportion of societies without any slave codes tended to treat slaves harshly. Nonetheless, the effectiveness
of laws was mediated by other factors such as the relative size of the slave population, and religion. The degree to which religion, especially Islam and Christianity, influenced the treatment of slaves is a controversial subject. Islam had explicit injunctions for the treatment of slaves, and these were sometimes influential, especially those relating to manumission. Although racism and strong preference for light complexion were found throughout the Islamic lands, it is nonetheless the case that Islam as a creed has been more assimilative and universalist than any other of the world religions, and has rarely been implicated in egregiously racist movements similar to those that have tarnished the history of Christianity, such as apartheid, the Ku Klux Klan, Southern Christianity during the era of segregation, and the ethnic cleansing of Eastern Europe. However, Islam never developed any movement for abolition, and, in general, strongly supported the institution of slavery, especially as a means of winning converts. For most of its history up to the end of the eighteenth century, Christianity simply took slavery for granted, had little to say about the treatment of slaves, and generally urged slaves to obey their masters. This changed radically with the rise of evangelical Christianity in the late eighteenth and early nineteenth centuries, during which it played a critical role in the abolition of both the slave trade and of slavery in Europe and the Americas. Throughout the world, Christianity appealed strongly to slaves and ex-slave populations, and in the Americas their descendants are among the most devout Christians.
While Christianity may have had conservative influences on converted slaves, it is also the case that nearly all the major slave revolts from the late eighteenth century were influenced strongly by rebel leaders, such as Daddy Sharp in Jamaica and Nat Turner in America, who interpreted the faith in radical terms, or by leaders of syncretic Afro-Christian religions.
5. Manumission as an Integral Element of Slavery
With a few notable exceptions, manumission, the release from slavery, was an integral and necessary element of the institution wherever it became important. The reason for this is that it solved the incentive problem implicit in slavery. The promise of redemption proved to be the most important way of motivating the slaves to work assiduously on their masters’ behalf. Slaves were manumitted in a wide variety of ways, the most important being: Self-purchase or purchase by others, usually free relatives; through the postmortem or testamentary mode; through cohabitation or sexual relations; through adoption; through political means or by the state; and by various ritual or
sacral means. Self-purchase or purchase by relatives or friends of the slave into freedom was by far the most important means in the advanced slave economies of Greece and Rome, and in the modern capitalistic slave regimes. However, it was not the most widespread in other kinds of slave systems, and in parts of the world such as Africa it was uncommon. Post-mortem manumission by wills and other means was common in Islamic lands and in many parts of Africa. This form of manumission was usually intimately linked to religious practices and with expectations of religious returns for what was often defined as an act of piety. As indicated earlier, in many slave societies slave concubines sometimes gained their freedom for themselves and their children from the master. Manumission by the state for acts of heroism or for military action was an important, though episodic, form of manumission not only in the ancient world, but in many New World slave societies. Thousands of slaves gained their freedom in this way, not only in Latin America, but also in North America during the American War of Independence and in wars against Spain in the southern USA. Manumission by adoption was unusual, but in certain societies, such as ancient Rome, it constituted the most complete form of release from slavery. Slaves were sometimes manumitted for ritual or religious reasons, or on special celebratory occasions. Although thousands of slaves were manumitted at Delphi, ostensibly by being sold to Apollo, such manumissions had become a mere legal formalism by the second century BC, although the practice may have harked back to an earlier era when they were genuinely religious in character. In other societies, religious or ritual manumissions were often substitutes for earlier practices in which slaves were sacrificed. Since manumission meant the negation of slavery, for many peoples, freeing slaves was symbolically identical to killing them.
Such practices were common in some parts of Africa and the Pacific islands, and among some indigenous tribes of the Northwest Coast of America slaves were either killed or given away in potlatch ceremonies. There is an extension of this primitive symbolic logic in Christianity, where Christ’s sacrificial death is interpreted as a substitute for the redemption of mankind from enslavement to sin and eternal death, ‘redemption’ (from Latin redemptio) literally meaning ‘to purchase someone out of slavery.’ In all slave societies, certain categories of slave were more likely to be manumitted than others. The most important factors explaining the incidence of manumission were: Gender, the status of parents, age, skill, residence and location, the means of acquisition and, where relevant, skin color. These factors are similar to those influencing the treatment of slaves and will not be discussed further. They were also important in explaining varying rates between societies. Thus societies with relatively higher proportions of skilled slaves, a greater location
of slaves in urban areas, higher ratios of female to male slaves, and higher rates of concubinage between masters and slaves were more likely to have higher rates of manumission than those with lower levels of these attributes. Added to this is another critical variable: The availability of slaves, either internally or externally, to replace manumitted slaves. As long as such sources existed and the replacement value of the manumitted slave was less than the price of manumission, it suited slave-owners to manumit slaves, especially when the manumitted slaves were nearing, or had already reached, the end of their useful life. However, on rare occasions the supply of slaves was cut off when demand remained high or was on the increase. This always resulted in very low rates of manumission. The most striking such case in the history of slavery was the US South, where the rise of cotton-based, capitalistic slavery in the early nineteenth century came within a few years of the termination of the Atlantic slave trade to America. Planters responded by reducing the manumission rate to near zero. A similar situation, though not as extreme, developed in the Spanish islands of the Caribbean during the nineteenth century, when the plantation system developed in the context of British enforcement of the abolition of the slave trade in the region. The result was that previously high rates of manumission plunged to very low rates. A final point concerns the status of freed persons. This varied considerably across slave societies and bore little relation to the rate of manumission. Thus manumission rates were high in the Dutch colonies of Curaçao and in nineteenth-century Louisiana, but the condition of freedmen was wretched. Conversely, in the British Caribbean, where manumission rates were low, the condition of freedmen was relatively good, some groups achieving full civil liberties before the end of slavery.
The main factor explaining the difference in treatment was the availability of economic opportunities for freedmen, and the extent to which the dominant planter class needed them as allies against the slaves. In the Caribbean, the small proportion of slaveholders and Europeans and the existence of a vast and rebellious slave population gave much political leverage to the small but important freed population. No such conditions existed in the United States, where the free and white population greatly outnumbered the slaves, and conditions for rebellion were severely restricted. In the ancient world, ethnic barriers meant generally low status and few opportunities for the manumitted in the Greek states, in contrast with Rome, where cultural and economic factors favored the growth and prosperity of a large freedmen class, a class that eventually came to dominate Rome demographically and culturally, with major implications for Western civilization. See also: Caribbean: Sociocultural Aspects; Inequality; Inequality: Comparative Aspects; Property
Rights; Slavery: Comparative Aspects; Slaves/Slavery, History of; Subaltern History; West Africa: Sociocultural Aspects
Bibliography
Blackburn R 1997 The Making of New World Slavery. Verso, London
Cohen D W, Greene J P (eds.) 1972 Neither Slave Nor Free. Johns Hopkins University Press, Baltimore, MD
Davis D B 1966 The Problem of Slavery in Western Culture. Cornell University Press, Ithaca, NY
Drescher S, Engerman S L (eds.) 1998 A Historical Guide to World Slavery. Oxford University Press, New York
Engerman S 1973 Some considerations relating to property rights in man. Journal of Economic History 33: 43–65
Engerman S L (ed.) 1999 Terms of Labor: Slavery, Serfdom, and Free Labor. Stanford University Press, Stanford, CA
Engerman S, Genovese E (eds.) 1975 Race and Slavery in the Western Hemisphere. Princeton University Press, Princeton, NJ
Eltis D, Richardson D (eds.) 1997 Routes to Slavery. Frank Cass, London
Findlay R 1975 Slavery, incentives and manumission: A theoretical model. The Journal of Political Economy 83(5): 923–34
Finley M I (ed.) 1960 Slavery in Classical Antiquity: Views and Controversies. Heffer, Cambridge, UK
Fogel R W 1989 Without Consent or Contract: The Rise and Fall of American Slavery. Norton, New York
Garlan Y 1995 Les Esclaves en Grèce Ancienne. Editions de la Découverte, Paris
Kirschenbaum A 1987 Sons, Slaves and Freedmen in Roman Commerce. Catholic University Press, Washington, DC
Landers J (ed.) 1996 Against the odds. Slavery and Abolition, Special issue 17(1)
League of Nations 1938 Report to the League of Nations Advisory Committee of Experts on Slavery, Vol. 6. League of Nations, Geneva, Switzerland
Lovejoy P E 2000 Transformations in Slavery: A History of Slavery in Africa. Cambridge University Press, New York
Meillassoux C 1991 The Anthropology of Slavery: The Womb of Iron and Gold, trans. A Dasnois. Athlone, London
Miers S, Kopytoff I (eds.) 1977 Slavery in Africa. University of Wisconsin Press, Madison, WI
Miller J C 1993 Slavery and Slaving in World History: A Bibliography, 1900–1991. Kraus International, Millwood, NY
Patterson O 1967 The Sociology of Slavery: Jamaica, 1655–1838. McGibbon & Kee, London
Patterson O 1982 Slavery and Social Death. Harvard University Press, Cambridge, MA
de Queiros Mattoso K M 1986 To Be a Slave in Brazil, 1550–1888, trans. A Goldhammer. Rutgers University Press, New Brunswick, NJ
Reid A (ed.) 1983 Slavery, Bondage and Dependency in Southeast Asia. University of Queensland Press, St. Lucia, Queensland
Rodriguez J P (ed.) 1977 The Historical Encyclopedia of World Slavery. ABC-CLIO, Santa Barbara, CA
Shepherd V, Beckles H (eds.) 2000 Caribbean Slavery in the Atlantic World. Ian Randle Publishers, Kingston, Jamaica
Watson J (ed.) 1980 Asian and African Systems of Slavery. Blackwell, Oxford, UK
O. Patterson
Slavery: Comparative Aspects
In the comparative study of slavery it is important to distinguish between slaveholding societies and large-scale, or what Moses Finley called ‘genuine,’ slave societies (Finley 1968). The former refers to any society in which slavery exists as a recognized institution, regardless of its structural significance. Genuine slave societies belong to that subset of slaveholding societies in which important groups and social processes become heavily dependent on the institution. The institution of slavery goes back to the dawn of human history. It remained important down to the late nineteenth century, and persisted as a significant mode of labor exploitation in some Islamic lands as late as the second half of the twentieth century. Remnants of it are still to be found in the twenty-first century in a few areas. Yet it was only in a minority of cases that it metastasized socially into genuine slave societies, though in far more than the five cases erroneously claimed by Keith Hopkins (1978). Thus, the institution existed throughout the ancient Mediterranean, but only in Greece and Rome (and, possibly, Carthage) did genuine slave societies emerge. It was found in every advanced precapitalist society of Asia, but only in Korea did it develop into large-scale slavery. All Islamic societies had slavery, but in only a few did there emerge structural dependence on the institution. In all, there were approximately 50 cases of large-scale slavery in the precapitalist world. With the rise of the modern world, slavery became the basis of a brutally efficient variant of capitalism. There were at least 40 such cases, counting the Caribbean slave colonies as separate units (for the most complete list, see Patterson 1982, App. C). This article examines two sets of problems. The first concerns those factors associated with the presence of institutionalized slaveholding.
The critical question here is: why is the institution present in some societies yet not in other, apparently similar, ones? The second set of problems begins with the assumption that slavery exists, and attempts to account for the growth in significance of slavery. More specifically, such studies attempt to explain the origins, structure, dynamics, and consequences of genuine slave societies.
1. Comparative Approaches to Slaveholding Societies
There is now a vast and growing body of literature on slavery (Patterson 1977a, Miller 1993, Miller and
Holloran 2000). Yet, with the notable exception of certain anthropological studies (to be considered shortly) relatively few recent works are truly comparative in that they aim to arrive at general conclusions about the incidence, nature, and dynamics of slavery. Even those few works that compare two slaveholding societies tend to be concerned more with highlighting, through contrast, the distinctive features of the societies under consideration. This highly particularistic trend marks a regrettable departure from earlier studies of slavery. The evolutionists of the nineteenth century were the first to offer explanations for the presence of slaveholding. Their basic proposition—that there was a close relationship between stages of socioeconomic development and the rise of slavery—has received no support from modern comparative studies (Pryor 1977). However, some of their less grandiose views have survived later scrutiny. One was their emphasis on warfare, and the demand for labor at certain crucial points in their scales of development as the important factors (Biot 1840, Westermarck 1908). The other was their finding that the socioeconomic role of women was critical. It was claimed, for example, that the subjection of women provided both a social and an economic model for the enslavement of men (Biot 1840, Tourmagne 1880). The most important of the early-twentieth-century theorists was H. J. Nieboer (1910), who broke with the evolutionists with his open-resource theory. His work, which is still influential, was unusual also in its reliance on statistical data. His main hypothesis was that slavery existed to a significant degree only where land or some other crucial resource existed in abundance relative to labor. In such situations, free persons cannot be induced to work for others, so they must be forced to do so. The theory has had lasting appeal, especially for economic historians, since it is both testable and consistent with marginal utility theory. 
It was revived by Baks (1966) and by the MIT economist, Domar (1970). However, the theory has been shown to have little empirical support from modern cross-cultural data (Patterson 1977b, Pryor 1977); and Engerman (1973) has questioned seriously its theoretical consistency. In the course of criticizing Nieboer, Siegel (1945) proffered a functionalist theory which claimed that chattel slavery may be expected to occur in those societies where there is a tendency to reinforce autocratic rule by means of wealth symbols, and where the process results in a rather strongly demarcated class structure. There is little empirical support for this theory, and it verges on circularity. In many premodern hierarchical societies with wealth symbols, it was slavery which was the main cause of increased stratification. The best recent comparative work on slavery has come from historical anthropologists studying Africa. Meillassoux (1975, 1991) and his associates have
demonstrated ably from their studies of West Africa and the Sahel just how valuable a nondogmatic Marxian approach can be in understanding the dynamics of trade, ethnicity, status, and mode of production in complex lineage-based societies. The anthropologist Jack Goody (1980), drawing on Baks (1966), began by arguing that ‘slavery involves external as well as internal inequality, an unequal balance of power between peoples.’ From this unpromising start he enriches his analysis with both the ethnohistorical data on Africa and the cross-cultural statistical data from the Murdock Ethnographic Atlas. Central to his analysis is the role of women. The complex facts of slavery cannot be explained, he argues, ‘except by seeing the role of slaves as related to sex and reproduction as well as to farm and production, and in the case of eunuchs, to power and its non-proliferation.’ While being valuable, Goody’s arguments are confined to Africa, and even with respect to this continent they are insufficient to explain why it was that some African societies came to rely so heavily on the institution while closely related neighboring groups did not. The economist Frederic Pryor (1977) has come closest to formulating a general theory of premodern slavery using modern statistical techniques applied to cross-cultural data. He distinguishes between social and economic slavery, and argues that different factors explain the presence of one or the other. Central to his theory is the nineteenth-century idea that there is a correspondence or ‘homologism’ between male domination of women and masters’ domination of slaves.
He tried to demonstrate that economic slavery was most likely to occur ‘in societies where women normally perform most of the work so that the slave and the wife act as substitutes for each other,’ whereas social slavery was related to ‘the role of the wife in a polygynous situation.’ The theory is interesting and robustly argued but is problematic. Where women dominated the labor force there was no need for men to acquire slaves for economic purposes. On the contrary, it was precisely in such situations that slaves, when used at all, were acquired for social purposes. The presumed correspondence between wives and slaves is also questionable. There were profound differences between these two roles. Wives everywhere belonged to their communities, and were intimately kin-bound, whereas slaves everywhere were deracinated and kinless. Wives always had some rights, some power, at least in the protection of their kinsmen, in relation to their husbands and could rarely be killed with impunity, which was not true of slaves. Wives everywhere had honor, while slaves had none. Far more comparative work is needed for an understanding of the institutionalization of slavery. The main variables involved in any explanation are now well known. They are: The economic and social role of women; polygyny; internal and external warfare; the mode of subsistence (mainly whether pastoral
or agricultural); and the mode of political succession. However, the causal role and direction of each of these variables is complex, and their interaction with each other even more so. To take the role of women as an example, in some cases it is their role as producers that is important; in others, their role as reproducers. It is also difficult using static cross-cultural data to ascertain whether low female participation in production is the result of slavery or its cause.
2. Approaches to the Study of Genuine Slave Societies
Marxist scholars were the first to take seriously the problem of the origins, nature, and dynamics of large-scale slavery (for a review, see Patterson 1977a). Engels’s (1962) periodization view that large-scale slavery constituted an inevitable stage in the development of all human societies was merely one version of nineteenth-century evolutionism, discussed earlier. It dominated East European thought until the de-Stalinization movement, and is still the orthodox view in mainland China. More sophisticated, but no less problematic, have been attempts of modern Marxists to formulate a ‘slave mode of production.’ The most empirically grounded of these attempts is that of Perry Anderson (1974) who argued that ‘the slave mode of production was the decisive invention of the Graeco-Roman world, which provided the ultimate basis both of its accomplishments and its eclipse.’ There is now no longer any doubt that slavery was foundational for both Athens and Rome, and that these were the first societies in which large-scale or genuine slavery emerged. However, the concept of the ‘slave mode of production’ overemphasizes the materialistic aspects of slavery, making it of limited value for the comparative study of slavery. Slaves were indeed sometimes used to generate new economic systems, as in Rome and the modern plantation economies, but there are many cases where, even when used on a large scale, there was nothing innovative or distinctive about the economic structure. The narrow materialist focus not only leads to a misunderstanding of the relationship between slavery and technology in the ancient world, but more seriously, it fails to identify major differences between the Greek and Roman cases, and it is of no value in the study of genuine slave societies in which the structurally important role of slaves was noneconomic, as was true of most of the Islamic slave systems.
Several non-Marxist historical sociologists, drawing also on the experience of ancient Europe, have made important contributions to our understanding of genuine slave societies. According to Weber (1964), slavery on a large scale was possible only under three conditions: ‘(a) where it has been possible to maintain slaves very cheaply; (b) where there has been an
opportunity for regular recruitment through a well-supplied slave market; (c) in agricultural production on a large scale of the plantation type, or in very simple industrial processes.’ However suggestive, there is little support for these generalizations from the comparative data. Medieval Korea (Salem 1978) and the US South (Fogel and Engerman 1974) disprove (a) and (b). And work on urban slavery in the modern world, as well as on the relationship between slavery and technology in the ancient world, disproves (c) (Finley 1973). Finley (1960, 1973, 1981) was the first scholar to grapple seriously with the problem of defining genuine slave societies, which he explicitly did not confine to those in which the slaves were economically important. He also cautioned correctly against too great a reliance on numbers in accounting for such societies. His emphasis on the slave as a deracinated outsider led him to the conclusion that what was most advantageous about slaves was their flexibility, and their potential as tools of change for the slaveholder class. Finley also offered valuable pointers in his analyses of the relationship between slave and other forms of involuntary labor. And in criticizing Keith Hopkins’s (1978) conquest theory of the emergence of genuine slave societies, he persuasively encouraged an emphasis on demand, as opposed to supply, factors in the rise of slave society. Romans, he argued, captured many thousands of slaves during the Italian and Punic wars because a demand for them already existed and ‘not the other way around.’ He postulated three conditions for the existence of this demand: private ownership of land, and some concentration of holdings; ‘a sufficient development of commodity production of markets’; and ‘the unavailability of an internal labor supply.’
3. A Framework for the Study of Slave Societies
It is useful to approach the comparative study of slave society with an understanding of what structural dependence means. There are three fundamental questions. First, what was the nature of the dependence on slavery; second, what was the degree of dependence; and third, what was the direction of dependence. Answers to these three questions together determine what may be called the modes of articulation of slavery. The nature of dependence on slavery may have been primarily economic or social, political or militaristic, or a combination of these. Economic dependence was frequently the case, especially in ancient Rome and in the modern capitalistic slave systems. Here the critical questions are: What were the costs and benefits of imposing and maintaining an economy based on slave labor? Stanley Engerman (1973) has written authoritatively on this subject. He distinguishes between the costs of imposition, of enforcement, and of worker
productivity. Lower maintenance costs, more constant supply of labor, greater output due to the neglect of the non-pecuniary costs of labor, higher participation rates and greater labor intensity, and economies of scale, are the main factors proposed by Engerman in explaining the shift to slave labor. They apply as much to ancient Rome as they do to the modern capitalistic slave systems of America and the Caribbean. Military and bureaucratic dependence were the main noneconomic forms, and they could be as decisive for societies as was economic dependence. The rise of Islam was made possible by the reliance on slave soldiers, and the Abbasid Caliphate, along with other Islamic states, was maintained by slave and ex-slave soldiers and bureaucrats. Although there was little economic dependence on slaves in most Islamic states (with the notable exception of ninth-century Iraq), many of them qualify as genuine slave societies as a result of the politico-military dependence of these regimes on slavery, and the ways in which slaves influenced the character of their cultures, many of the literati also being slaves or descendants of slaves (see Pipes 1981, Crone 1981, Marmon 1999). The degree of dependence must be taken into account, although it is sometimes difficult to quantify. There has been a tendency to emphasize the size of the slave population over other variables, and while demography is important it can sometimes be misleading. Thus, as a way of playing down the importance of slavery, it has been noted that no more than one in three adults were slaves in Athens at the height of its slave system in the late fifth century BCE. But as M. I. Finley liked to point out, the same was true of most parts of the American Slave South, and no one has ever questioned the fact that it was a large-scale slave system. Finally, there is the direction of dependence.
Even where a society had a high functional dependence on slavery it was not necessarily the case that slavery played an active, causal role in the development of its distinctive character. The institution, though important, was not structurally transformative. In ancient Rome, the Sokoto Caliphate and the slave societies of the Caribbean, the American South and Brazil, slavery was actively articulated and transformative. In other cases, however, it was passively articulated, the classic instance being Korea during the Koryo and Yi periods where, although a majority of the rural population were at times slaves, there was no significant transformation of the economy, and no impact on the regime’s government and culture. The same was true of several of the modern Spanish colonies of Central and South America. During the sixteenth and seventeenth centuries there was marked structural dependence on slavery in Mexico and Central America as well as Peru, and the urban areas of Chile and Argentina, but the institution was not determinative in the course of development of these societies and, as in Korea, when slavery ended it left hardly a trace, either
cultural or social (Mellafe 1975, Palmer 1976, Klein 1986, Blackburn 1997). These three factors together determined a finite set of modes of articulation, which is a sociologically more useful construct than that of the so-called ‘slave mode of production.’ Space permits only the most cursory mention of some of the most important such modes of articulation. The lineage mode of articulation refers to those kin-based societies in which large-scale slavery was related critically to the rise to dominance of certain lineages in the process of class and state formation. In some cases, slavery originally served primarily economic ends; in others, mainly social and political ones; but in the majority the institution became multifunctional. This kind of genuine slave society was most commonly found in western and west-central Africa. Warfare, combined with some critical internal factor such as demographic change, accounted for the rise of slavery (see Miller 1977 for the Kongo kingdom). The ideal case of this mode of articulation was the Asante state (Wilks 1975, Klein 1981). Slaves were originally incorporated as a means of expanding the number of dependents, a tendency reinforced by the matrilineal principle of descent. The growing number of slaves enhanced the lineage heads who owned them, and facilitated the process of lineage hierarchy and state formation. Later, the role of slaves was greatly expanded to include a wide range of economic activities, including mining. The predatory circulation mode refers to those slave societies in which warfare and raiding mainly for slaves were the chief occupations of a highly predatory elite. The warrior class was usually assisted by a commercial class which traded heavily in slaves. There was usually a high rate of manumission of slaves, who not only contributed to the production of goods, but in their role as freedmen soldiers helped the ruling class to produce more slaves.
Thus there was a continuous circulation of persons in and out of slave status as outsiders were incorporated as loyal freedmen retainers, creating a constant need for more enslaved outsiders. Slaves and freedmen often played key roles in, and sometimes even dominated, the palatine service and elite executive jobs. The contiguous existence of pastoral and sedentary agricultural peoples with different levels of military might was the major factor in the development of this mode of articulation of slavery. The mode is strongly associated with Islam, and there are many examples of it in the Sahel and North Africa. The west–east spread of the Fulani over an area of some 3,000 miles over a period of 800 years provides one of the best cases of this mode, on which there is an abundance of historical and anthropological data (Lovejoy 2000, Pipes 1981, Meillassoux 1975, 1991). The embedded demesne mode embraces those patrimonial systems dominated by large landed estates in which slaves were incorporated on a substantial scale
to cultivate the demesne land of the lords. Serf or tenant laborers continued, in most cases, to be the primary producers of food for the society, and their rents or appropriated surpluses remained a major source of wealth for the ruling class. However, slaves were found to be a more productive form of labor on the home farms of the lords for the cost–benefit reasons analyzed by Engerman, mentioned above. The landowners got the best of both types of labor exploitation. This was a particularly valuable arrangement where a landed aristocracy needed to change to a new crop but was not prepared to contest the technical conservatism of serfs; where there was a supply of cheap labor across the border; or where there was a high level of internal absenteeism among the landed aristocracy. This is the least understood or recognized form of advanced slave systems, perhaps because its mode of articulation was usually passive. Many of them were to be found in medieval Europe, especially in France, Spain, and parts of Scandinavia (Dockes 1982, Verlinden 1955, 1977, Bonassie 1985, Anderson 1974, Patterson 1991). The slave systems of the western ‘states’ of eleventh-century England, where slave populations were as high as 20 percent of the total are likely examples. So, possibly, was Viking Iceland, which may well have had similar proportions of slaves (Williams 1937, Foote and Wilson 1970). However, the ideal case of embedded demesne slavery was to be found in Korea, especially during the Koryo and early Yi periods. Here the slave population sometimes exceeded that of other forms of bonded labor (Salem 1978, Wagner 1974, Hong 1979). The urban–industrial modes of articulation were those in which the urban elites came to rely heavily on slaves for support. Slaves played a relatively minor role in agriculture, although they may well have dominated the ‘home farms’ of certain segments of the ruling urban elites. 
Slave labor was concentrated in urban craft industries which produced goods for local consumption and exports, as well as the mining sector where it existed. Slavery emerged on a large scale in such systems as a result of a combination of factors among which were the changing nature and frequency of warfare, conquest of foreign lands, the changing tastes of the ruling class, crises in the internal supply of labor, shifts in food staples, growing commercial links with the outside world, and demographic changes. This mode of articulation could be either passive or active. To the extent that the character of the urban civilization depended on its urban economy, and to the extent that the economy depended on slave laborers, both manual and technical, to that degree were these systems active in their articulation. The classic case of the active mode of slave articulation was that of the ancient Greek slave systems, especially Athens of the fifth and fourth centuries BC (Finley 1981, Garlan 1982, De Ste. Croix 1981). Typical of the
passive mode of urban–industrial articulation were several of the Spanish slave systems of Central and South America during the sixteenth and seventeenth centuries, mentioned above. The Roman or urban–latifundic mode: ancient Roman slavery stands in a class by itself, having no real parallels in the ancient, medieval, or modern worlds. It came the closest to a system of total slavery in world history. It was distinctive, first, in the sheer magnitude of its imperial power and the degree of dependence on slavery, both at the imperial center and in its major colonial sectors. Second, Rome was unique in the extent of its reliance on slavery in both its rural and urban–industrial sectors. Third, the articulation of slavery was more actively transformative than in any other system, entailing what Hopkins (1978) called an ‘extrusion’ of free, small farmers and their replacement by slaves organized in gangs on large latifundia. Rome was unusual too, not only for its high levels of manumission, but for the extent to which slaves and slavery came to influence all aspects of its culture. The capitalist plantation mode: contrary to the views of early economic theorists such as Adam Smith, and of Marxist scholars until fairly recently (Genovese 1965), modern plantation slavery was in no way incompatible with capitalism. Indeed, the rise of capitalism was intimately bound up with this mode of articulation of slavery, and at its height in nineteenth-century America it was as profitable as the most advanced industrial factories of Europe or the northern United States (Fogel and Engerman 1974).
Plantation slavery constituted one version of the worldwide systemic spread of capitalism, in which capital accumulation was advanced through the use of slaves in the peripheral colonial regions, complementing the use of so-called free labor in the metropolitan centers, and of serf and other forms of dependent labor in the semiperipheral areas of Eastern Europe and Latin America (Wallerstein 1974, Blackburn 1997). While plantation slavery bore some organizational resemblance to the ancient slave latifundia, and indeed can be traced historically to late medieval and early modern variants of the ancient model (see Solow 1991), it was distinctive in its complex, transnational system of financial support, in its production for international export, its heavy reliance on a single crop, its advanced organizational structure, which in the case of sugar involved innovative agri-industrial farms, in the vast distances from which slaves were imported, entailing a complex transoceanic slave-trading system, and in its reliance on slaves from one major geographic region who differed sharply from the slaveholding class in ethnosomatic terms. However, it is important to recognize that there existed concurrently with the capitalistic plantation mode several other modes of slave articulation that were either precapitalist or at best protocapitalist. As has already been noted, many of the Spanish colonial
systems in the Americas relied on forms of slavery that were nonaccumulative and distinctly premodern in their articulation, for example the urban–industrial modes of Mexico and parts of South America which were focused heavily on mining. In the Spanish Caribbean up to the third quarter of the eighteenth century a peculiar premodern form of agri-pastoral slavery prevailed which had little in common with the highly capitalistic slave plantation systems of the neighboring French and British islands, and which had to be dismantled at considerable political and socioeconomic cost in order to make way for the belated capitalistic slave systems that were imposed forcefully during the nineteenth century (Knight 1970). While slavery was largely abolished in the Americas through the course of the nineteenth century (Blackburn 1988, Davis 1975), pockets of the institution persist into the twenty-first century in northern Africa and parts of Asia (Stearman 1999). With the possible exception of Mauritania, however, in none of these societies do we find genuine slave systems. It can be claimed, with cautious optimism, that this most inhuman form of societal organization has vanished from the world.
Bibliography
Anderson P 1974 Passages from Antiquity to Feudalism. London
Ayalon D 1951 L’Esclavage du Mamelouk. Jerusalem
Baks C J et al 1966 Slavery as a system of production in tribal societies. Bijdragen tot de Taal-, Land- en Volkenkunde 122
Biot E 1840 De l’abolition de l’esclavage ancien en Occident. Paris
Blackburn R 1988 The Overthrow of Colonial Slavery, 1776–1848. London
Blackburn R 1997 The Making of New World Slavery. London
Bonassie P 1985 Survie et extinction du regime esclavagiste dans l’Occident du haut moyen age. Cahiers de Civilisation Medievale 28
Crone P 1981 Slaves on Horses. New York
Davis D B 1966 The Problem of Slavery in Western Culture. Ithaca, NY
Davis D B 1975 The Problem of Slavery in the Age of Revolution, 1770–1823. Ithaca, NY
De Ste Croix G E M 1981 The Class Struggles in the Ancient Greek World. Ithaca, NY
Dockes P 1982 Medieval Slavery and Liberation. Chicago
Domar E 1970 The causes of slavery or serfdom. Journal of Economic History 30
Engels F 1962 Anti-Duhring. Moscow
Engerman S 1973 Some considerations relating to property rights in man. Journal of Economic History 33
Finley M I (ed.) 1960 Slavery in Classical Antiquity. Cambridge
Finley M I 1968 Slavery. International Encyclopedia of the Social Sciences, Vol. 14
Finley M I 1973 The Ancient Economy. London
Finley M I 1981 Economy and Society in Ancient Greece. In: Shaw B D, Saller R P (eds.), New York
Fogel R W, Engerman S 1974 Time on the Cross. Boston, Vols. 1 and 2
Foote P, Wilson D 1970 The Viking Achievement. London
Garlan Y 1982 Les esclaves en Grece ancienne. Paris
Genovese E D 1965 The Political Economy of Slavery. New York
Goody J 1980 Slavery in time and space. In: Watson J L (ed.) Asian and African Systems of Slavery. Oxford
Hong S H 1979 The legal status of the private slaves in the Koryo Dynasty (in Korean). Han’gukakpo 12
Hopkins K 1978 Conquerors and Slaves. Cambridge
Klein A N 1981 The two Asantes. In: Lovejoy P (ed.) The Ideology of Slavery in Africa. Beverly Hills
Klein H S 1986 African Slavery in Latin America and the Caribbean. New York
Knight F W 1970 Slave Society in Cuba During the 19th Century. Madison, WI
Lovejoy P E 2000 Transformations in Slavery: A History of Slavery in Africa. New York
Marmon S E (ed.) 1999 Slavery in the Islamic Middle East. Princeton, NJ
Meillassoux C (ed.) 1975 L’esclavage en Afrique precoloniale. Paris
Meillassoux C 1991 The Anthropology of Slavery: The Womb of Iron and Gold. (Trans Dasnois A). London
Mellafe R 1975 Negro Slavery in Latin America. Berkeley, CA
Miller J C 1977 Imbangala lineage slavery. In: Miers S, Kopytoff I (eds.) Slavery in Africa. Madison, WI
Miller J C (ed.) 1993 Slavery and Slaving in World History: A Bibliography. Millwood
Miller J C, Holloran J R (eds.) 2000 Slavery: Annual bibliographical supplement. Slavery and Abolition 21(3)
Nieboer H J 1910 Slavery as an Industrial System. Rotterdam, The Netherlands
Palmer C 1976 Slaves of the White God: Blacks in Mexico, 1570–1650. Cambridge, MA
Patterson O 1977a Slavery. Annual Review of Sociology 3: 407–49
Patterson O 1977b The structural origins of slavery: A critique of the Nieboer–Domar hypothesis. Annals of the New York Academy of Science 292: 12–34
Patterson O 1982 Slavery and Social Death. Cambridge, MA
Patterson O 1991 Freedom in the Making of Western Culture. New York
Pipes D 1981 Slave Soldiers and Islam. New Haven, CT
Pryor F L 1977 A comparative study of slave societies. Journal of Comparative Economics 1
Salem H 1978 Slavery in Medieval Korea. Ph.D. dissertation, Columbia University
Siegel B J 1945 Some methodological considerations for a comparative study of slavery. American Anthropologist 7
Solow B (ed.) 1991 Slavery and the Rise of the Atlantic System. Cambridge, MA
Stearman K 1999 Slavery Today. Hove
Tourmagne A 1880 Histoire de l’esclavage ancien et moderne. Paris
Verlinden C L’Esclavage dans l’Europe Medievale. Vols. 1 and 2 (Bruges, 1955; Ghent, 1977)
Wagner E 1974 Social stratification in 17th century Korea. Occasional Papers on Korea 1
Wallerstein I 1974 The Modern World System. New York
Weber M 1964 The Theory of Social and Economic Organization. New York
Westermarck E 1908 The Origins and Development of the Moral Ideas. New York
Wilks I 1975 Asante in the Nineteenth Century. Cambridge
Williams C O 1937 Thralldom in Ancient Iceland. Chicago, IL
O. Patterson
Slaves/Slavery, History of
Slavery comes in many culturally specific forms. Common to all these different forms is the fact that slaves are denied the most basic rights. Slaves remain outside the social contract that binds the other members of a given society; they are people without honor and kin, subject to violent domination. They are objects of the law, not its subjects, considered as property, or chattel. Hence, following Orlando Patterson, we may describe slavery as a form of ‘social death,’ and slaves as outsiders kept in a position of institutionalized marginality. In many languages, the term for foreigner also denoted slaves. Nowadays slavery is considered a most inhumane practice. The United Nations Universal Declaration of Human Rights proclaims: ‘No one shall be held in slavery or servitude; slavery and the slave trade shall be prohibited in all their forms.’ Signing the slavery conventions of 1926 and 1956, 122 states have committed themselves to abolishing slavery and similar practices such as debt bondage, serfdom, servile marriage, and child labor. In historical times, however, slavery as an institution was accepted in many societies with a minimum of social stratification all over the world, from early Mesopotamia and Pharaonic Egypt to ancient China, Korea, India, medieval Germany, and Muscovite Russia, from the Hebrews to the Aztecs and Cherokees. Yet, according to Moses Finley there were only five full-fledged slave societies with slaves constituting the majority of people and their work determining the whole economy. All five were located in what we tend to call the West, two in antiquity: Athenian Greece and Rome between the second century BC and the fourth century AD, three in modern times: Brazil, the Caribbean, and the southern US. The Ottoman Empire was notorious for its slave soldiers, the janissaries, taken from its Slavic neighbors to the north and from Africa.
The most exploitative form of chattel slavery emerged in the plantation economies of the New World under the aegis of European colonial rule with the UK playing a dominant role. Interestingly, this happened at the time when the medieval concept of labor as a common community resource was progressively replaced by a free labor regime in the UK. Ultimately, and for the first time in history, employer and employed in Europe were equal before the law, while slavery with its extreme status differential prevailed in the New World. The ensuing contradictions eventually led to the rise of an abolitionist movement in Europe and to the progressive banning of slavery in the nineteenth and twentieth centuries.
The last to make slavery illegal were the governments on the Arabian Peninsula. They did so in 1962. Where there are slaves, there are people arguing about slavery, some rationalizing it, others denouncing it, and yet others seeking to better demarcate the duties and rights of those involved. There is an extensive Islamic literature about slavery, paying particular attention to the question of who may, and who may not, be enslaved, to the duties of slave owners and the terms of manumission. Many classical and Christian authors dealt with the same problems. The abolitionist movement generated a literature of its own, stressing the horrors of slavery as in ‘Oroonoko,’ the justly famous novel by Aphra Behn, and the life narrative of Olaudah Equiano, who was taken captive as a child in his native Igbo village in what is now Nigeria. Common to all these contributions written by contemporaries are their normative approach, based in either religion or moral philosophy, their focus on institutional and legal matters, and, not least, their reliance on anecdotal evidence and personal experience. In ancient Greece and in Rome, slaves were perceived as captives of war even when bought, and were defined as things until the Code of Justinian defined them as persons. Christian and Islamic thought highlighted religious difference. Slavery was primarily the fate of nonbelievers taken captive in ‘just’ wars. Yet, neither Christianity nor Islam prevented the enslavement of coreligionists, and if conversion obliged Islamic slave owners to manumit their slaves, it did not change the slaves’ status in Christian societies. Later authors emphasized property relations instead.
The same line of argument informs the 1926 League of Nations slavery convention which defines slavery as the ‘status or condition of a person over whom any or all the powers attaching to the right of ownership are exercised.’ European and American scholars pioneered the historical study of slavery. Taking their cues from Greece, Rome, and medieval European experiences, they tended to interpret slavery as an intermediate stage in the general process of historical development. G. W. F. Hegel construed slavery as a sort of development aid for the benefit of Africans. Classical social sciences from Adam Smith to Max Weber identified slavery with backwardness. Marxist thinkers were even more explicit, positing primitive society, slavery, feudalism, capitalism, and communism as five consecutive stages of world history. Thus, slavery and capitalism, slavery and markets, and slavery and technological innovations were considered utterly incompatible. How then can one explain the rise of racial slavery in the plantation economies of the New World? From about 1500 to the late nineteenth century Portuguese, British, Dutch, French, Danish, Spanish, and American vessels carried some 12 million plus Africans across the Atlantic to be used mainly as field hands for the production of sugar, coffee, tobacco, rice, indigo, and cotton. Others were put to work in the gold and silver mines of Spanish America. Together
they constituted by far the largest forced migration the world has known. Moreover, how should one interpret slavery in the South of the US, the first modern nation, if it is in opposition to capitalist logic? Studies of US and West Indian slavery have dominated the field of slave studies from its very beginning. They set the agenda for research into other domains, such as the transatlantic slave trade and its precedents in the Mediterranean and the Black Sea during the Middle Ages and the Renaissance. In the Italian city-states slaves were common until the fourteenth century, and on the Iberian Peninsula up to the sixteenth century, many of them Moslems. Christian sailors, captured on the seas, ended up in North African slavery. Later, researchers shifted their attention to plantation slavery in other areas of the New World, and finally to slavery in various non-Western societies. In the beginning, scholars studying Southern slavery followed an institutional approach, describing plantation slavery as a closed system of generalized violence and totalitarian control with dehumanizing effects similar to Nazi concentration camps. Slaves in these pioneering interpretations were victims, devoid of any power and will of their own. All the power was said to be vested in the slave owners, who treated their slaves as mere chattel: whipping them into submission and keeping them at starvation level, even working them to death; denying them any family life and breeding them like cattle; buying and selling them at their will and pleasure. Then came Eugene D. Genovese’s monumental historical anthropology of plantation life in the American South. A Marxist himself, Genovese held on to the idea that the planters constituted a pre-capitalist class. Yet he showed that the slaves, although people without rights or hardly any rights, were not people without will. Quite the contrary, they were even able to shape the structures of the system as a whole.
Using cunning and other weapons of the weak in the daily transactions at the work place, they moved their owners to accept their humanity and some minimal duties. The outcome was a paternalist system with some give and take between slave-owner and slave. With this new paradigm slaves got their agency back. Slavery as such came to be seen as a negotiated relationship and as an intrinsically unstable process, the terms of which varied with time and place. This process spanned different stages from enslavement and transportation to life in slavery and possible, yet by no means inevitable, social re-integration. There were different ways to overcome the outsider status: resistance and flight, manumission or the purchase of freedom. Prompted by the American civil rights movement, slave lives, slave culture, and slave resistance developed into privileged subjects of study. Herbert G. Gutman showed that slaves against all odds were living in families, creating enduring networks of kinship and self help. Religion proved to be
another major sphere where slaves gained some cultural autonomy. Integrating Christian teachings and African beliefs they developed a richly textured religious life with dance and song as integral parts of worship. As to resistance, scholars had to contend with the fact that apart from the Haitian Revolution, slave rebels were nowhere able to end slavery. Studies of slave rebellions nonetheless brought to the fore numerous instances of organized resistance. A more salient aspect of these studies pioneered by John Hope Franklin was the insight into the complexities of resistance, which ranged from mocking tales and mimicking dances to the deliberate cultivation of African customs, to go-slow tactics and flight and the establishment of so-called Maroon societies. These societies developed into refuges for other slaves. Most of them had only a limited life span; some, however, such as the seventeenth-century kingdom of Palmares in the hinterland of Pernambuco, evolved into autonomous centers of power at the margins of the colonial system, often cultivating their own forms of slavery. Most successful were the rebels in relatively weak colonial states with open frontiers such as Brazil, Spanish America, and Surinam, where the Saramaka fought a successful war of liberation against the Dutch, lasting more than 100 years. A major advance in the study of slavery was achieved when historians shifted their focus from the interpretation of anecdotal literary evidence to the analysis of statistical material and other mass data with the help of economic modeling techniques. Philip D. Curtin was the first to attempt a census of the transatlantic slave trade using customs records from the Americas. His calculations have since been complemented and refined, yet his general insights have withstood later scrutiny and still hold true at the beginning of the twenty-first century.
Others tried to extend his census to the much longer established Trans-Saharan, Indian Ocean, and Red Sea trades although there is no comparable database for these regions. Current estimates stand at 12 million plus Africans transported along these routes from the seventh to the twentieth century. Others reconstructed slave prices at the points of purchase and sale, carefully tracing changes over time, and studying all the other aspects of slave trading and the plantation business. These endeavors engendered seminal new insights. Scholars were able to prove that the length of a voyage was the single most important factor determining mortality during the voyage across the sea, the ghastly Middle Passage, with death rates ranging from 10 to 20 percent and more. They showed that pricing was competitive and that supply was responsive to changes in price and that the transatlantic slave trade continued well into the nineteenth century. It ended when Brazil and Cuba emancipated their slaves in the 1880s, the last to do so in the Western hemisphere. Scholars also discovered that markets shifted eastwards and southwards along
the West African coast, from Senegambia and the Upper Guinea coast to the Bight of Biafra and northwestern Angola. Imports into Africa varied over time and from place to place with cotton goods, brandy, guns, and iron tools in high demand everywhere along the coast. With the recent publication of more than 27,000 Atlantic slave trading voyages in one single data set, even non-specialists have access to parts of the evidence used in the econometric computations. Taken together, the studies of the new economic history show that the Marxist thesis of an inherent contradiction between slavery and capitalism is untenable, as both slave trading and plantation economies were capitalist to the core, with prices set by the laws of supply and demand. In a major reinterpretation of slavery in the antebellum South Robert William Fogel and Stanley L. Engerman also convincingly demonstrated that during the reign of ‘King Cotton,’ the planters acted as capitalist entrepreneurs who were ready to innovate when it paid to do so and who used incentives rather than the whip to drive their slaves. They even ventured the thesis that in material terms (diet, clothing, housing, and life expectancy) slaves in the South were better off than many peasants and workers in Europe and the northern US. Moreover, the system was flourishing to the end, with high levels of productivity. Parallel quantitative studies of the plantation economies in the West Indies arrived at similar conclusions: abolition came not in the wake of economic decline, but rather in spite of economic success. Seymour Drescher even coined the term econocide. A major contentious issue among scholars has been the significance of slavery for the Industrial Revolution. In 1947 Eric Williams argued that the profits from the ‘triangular trade’ had financed the Industrial Revolution in the UK, adding that abolition was a consequence of economic decline.
This double-pronged ‘Williams thesis’ became a central tenet of dependency theory. A close look at hundreds and thousands of slave voyages, however, showed that average profits were smaller than scholars had previously assumed and could never explain why the UK achieved the stage of self-sustained growth in the latter half of the eighteenth century. New data on capital formation in that period point in the same direction. Nevertheless, slavery and the plantation economies contributed to British economic growth and the advent of modernity. Slavery reduced the cost of sugar and tobacco for European consumers and it made great fortunes; it created markets for cheap industrial mass products, and sugar plantations with their gang-labor system may even be interpreted as ‘factories in the fields’ with a work discipline prefiguring that of modern industrial plants. The impact of slavery was also felt in the realm of ideas and ideology. It furthered a racially informed negative perception of all things African in the West and set the stage for racism, a most
tragic legacy of more than 400 years of slave trading, that the West has not yet come to terms with. The rise of racial slavery in the Atlantic system is a reminder of the fact that economic self-interest and markets, when left to themselves, can produce the most immoral and humanly destructive institutions. The term ‘racial slavery’ refers to the fact that almost all slaves employed in the Americas hailed from Africa, as the Spanish had banned the enslavement of Amerindians by the 1540s. The impact of slavery on Africa is also highly controversial. Yet, once again a change of perspective has led to a shift of emphasis. An earlier generation of scholars with first-hand experience of colonial domination saw African societies mainly as victims of external forces. They singled out the transatlantic slave trade as a first step towards underdevelopment as it depopulated whole areas, or so ran the argument, and induced African rulers to engage in wars for the sake of making prisoners for sale in exchange for guns, which then triggered new wars. This set of ideas informed dependency theory as well as world-system analysis and the early nationalist African historiography. Walter Rodney even argued that African slavery on the Guinea coast was a result of external demand. Quantitative studies shattered some of these assumptions. The gun-slave cycle could not be substantiated. In addition, historical demography, although a risky business, points to demographic stagnation from 1750 to 1850, with population losses in the areas directly affected by slaving when other continents were beginning to experience demographic growth. The causal link between the slave trade and Africa’s economic backwardness as posited by Joseph E. Inikori is a contested issue.
A better understanding of the complexities of African politics has furthermore led scholars to argue that most African wars of the period were independent of the external demand for slaves, although the slave trade generated income, strengthening some polities while weakening others. Asante, Dahomey, and the Niger Delta city-states were among the main beneficiaries while the kingdom of Benin was one of those that abstained from participating in the slave trade. Slave sales shifted with the vagaries of African politics, not to forget the cycles of drought with the ensuing threats of starvation, which also led people into slavery. In other cases, slavery was a punishment for the crimes people had committed. To stay in business, European and American slave traders had to adapt to local circumstances. They were at the mercy of local merchants and local power holders. They hardly ever ventured beyond the ports of call and the forts they built along the West African coast. They also bought women and children, although the planters considered them second choice to men, and people from other areas than those that the planters preferred. Women constituted up to a third of
those sent to the Americas. The female ratio among slaves was higher than among early European immigrants. It ensured an enduring African physical presence. It is also further proof that African cultural and political parameters had the power to shape the Atlantic system, an argument best developed by David Eltis. Studies of African slavery have furthered the understanding of these cultural parameters. Slavery was widespread in precolonial Africa and expanded in the nineteenth century despite, or rather because of, the abolition of the transatlantic slave trade by the UK in 1807. Some scholars even argue that West African societies then developed into fully-fledged slave societies with 50 percent and more of their population kept in bondage. First among these were the Sudanic societies from the Futa Jallon to Sokoto and Adamawa, where Islamic revolutions and state-building initiated more intense slave raiding than ever before. Yet, African slavery, often referred to as ‘household slavery,’ markedly differed from chattel slavery. As elsewhere, slaves had to do the hardest work and their owners treated them as outsiders, the first to be sacrificed in public rituals and put to death at the whim of their owner. However, it was common to find slaves in positions of albeit borrowed authority, such as traders, officers, and court officials. Elsewhere slave and slave owner worked side by side. Alternatively, slaves lived in villages of their own. They owned property and, in some cases, even slaves of their own. More importantly, slave status was not fixed; rather it changed over time along a slave-to-kinship continuum, with people born into slavery exempt from further sale. Recent scholarship has stressed that women were the pre-eminent victims of slavery in Africa as they were valued for their labor power and for the offspring they might give birth to.
Moreover, one should never forget that power and prestige in African societies depended on the number of followers someone could count on rather than on the control of land, which was not yet a scarce resource, or not everywhere. According to H. J. Nieboer, agricultural slavery developed wherever land was plentiful and the productivity of agricultural labor was low. The slave to kinship paradigm, as expounded most forcefully by Igor Kopytoff and Suzanne Miers, lost some of its appeal when scholars discovered that in many places a slave past carries a social stigma to this day. Yet Africa is a continent with a wide spectrum of cultures. Hence, what prevails at one place may be contradicted at another. Nonetheless, the general rule of slavery getting more exploitative and more strictly closed when used for the production of export staples and in societies with strong state structures holds true for Africa as well as for the Americas. A case in point is slavery on the East African coast and the island of Zanzibar during the clove boom of the mid-nineteenth century. In West Africa too, slavery turned more adversarial when set to the production of palm oil in the aftermath of the abolition of the transatlantic slave trade, called the period of legitimate trade. The less centralized power in a society was, the easier it was for slaves to overcome their outsider status. This, however, meant integration into a land-holding kin group, as access to women and land, the keys to reproduction, was generally controlled by such groups. In modern Europe, on the other hand, property rights in human labor, one’s own or others’, were vested in the individual. Hence freedom in Africa, and in other non-Western societies, meant belonging, whereas in Europe it meant independence.
Colonialism also helped to keep the memory of slavery alive. In the early nineteenth century, the UK had done its best to end the slave trade. It even stationed a naval squadron for this purpose in West African waters and concluded anti-slave trade treaties with African rulers who had trouble seeing the benefits of abolition. The European public later tended to take the imperialist scramble for Africa as an anti-slavery crusade, but colonial administrations, while forcefully suppressing the slave trade, were reluctant to fight slavery as a social system; rather they closed their eyes and did their best to freeze existing social relations while claiming to bring progress to Africa. When slavery finally ended in Africa, it did so not so much as the result of conscious acts of emancipation but rather because of the emergence of a colonial labor market, with the socially defined older mechanisms for the integration of outsiders easing the transition. The men and women held in bondage eagerly took up alternatives wherever they were viable.
The cultural turn in social sciences has deeply influenced slave studies. There is no end to quantitative analyses. However, scholars have discovered that cultural parameters are important variables affecting what people do, even when they operate under the impress of a market system.
Even Homo oeconomicus has to consider cultural values. Gender and slave memories have also attracted much more attention than before. The 41 volumes of interviews with former slaves conducted in the 1920s and 1930s in the US remain unrivalled, yet cf. the splendid collection of peasant narratives from Niger presented by Jean Pierre Olivier de Sardan. Archeologists have started to investigate slave material culture. Others have begun to retrace the construction of different Creole identities and cultures in the Americas so rich in imaginative creativity. They all encounter the same problem: as slaves by definition were people without voice, what they experienced and what they thought is to a great extent buried in the documents written by the perpetrators of slavery. To give the slaves their voice back is one of the nobler tasks of historical scholarship, but it is immensely difficult. While studies of slavery become ever more detailed and localized, historians have come to realize that the Middle Passage is more a bridge than an abyss or a one-way street, binding different cultures together, a
point first made by the anthropologist Melville J. Herskovits. Slaves came with a history of their own to the Americas. Returning Africans and children of former slaves such as Olaudah Equiano, Wilmot Blyden, and James Africanus Horton were among the first to consciously define themselves as Africans. Hailing from different regions of Africa, yet sharing a common plight, they developed an awareness of a common African identity. The books they wrote helped to lay the foundations for Pan-Africanism, a strand of ideas and a political movement central to the freedom struggle of Africans in the twentieth century. David Brion Davis noted a similar correlation between slavery and freedom in Western religious, legal, and philosophical discourses. Hence the double paradox and legacy of slavery: it inflicted death and hardship on millions and millions of people of mostly African descent for the benefit of a few; and while it was instrumental for the rise of racial prejudice, it also shaped the notion of freedom and equality as we know and cherish it today. To consider these global dimensions is the challenge of any further research into the history of slavery, even when dealing with very specific local aspects of the problem. But this is easier said than done because it requires historians to venture beyond the deeply rooted traditions of privileging the history of a particular nation-state, usually their own, into the open sea of a new comparative history. The fight against slavery is not won so long as debt bondage, servile marriage, forced labor, child labor, trafficking of women, forced prostitution, and other forms of bondage exist in many societies, including the very rich countries of the West.
See also: Colonialism, Anthropology of; Colonization and Colonialism, History of; Human Rights, History of; Property, Philosophy of; Slavery as Social Institution; Slavery: Comparative Aspects; Trade and Exchange, Archaeology of
Bibliography
Bales K 1999 Disposable People: New Slavery in the Global Economy. University of California Press, Berkeley
Berlin I 1998 Many Thousands Gone. The First Two Centuries of Slavery in North America. The Belknap Press, Cambridge, MA
Blackburn R 1997 The Making of New World Slavery. From the Baroque to the Modern, 1492–1800. Verso, London
Clarence-Smith W G 1989 The Economics of the Indian Ocean Slave Trade in the Nineteenth Century. Frank Cass, London
Craton M 1978 Searching for the Invisible Man: Slaves and Plantation Life in Jamaica. Harvard University Press, Cambridge, MA
Curtin P D 1967 Africa Remembered; Narratives by West Africans from the Era of the Slave Trade. University of Wisconsin Press, Madison, WI
Curtin P D 1969 The Atlantic Slave Trade: A Census. University of Wisconsin Press, Madison, WI
Davis D B 1966 The Problem of Slavery in Western Culture. Cornell University Press, Ithaca, NY
Davis D B 1984 Slavery and Human Progress. Oxford University Press, New York
Drescher S 1977 Econocide: British Slavery in the Era of Abolition. University of Pittsburgh Press, Pittsburgh, PA
Drescher S, Engerman S L 1998 A Historical Guide to World Slavery. Oxford University Press, New York
Elkins S M 1968 Slavery; a Problem in American Institutional and Intellectual Life, 2nd edn. University of Chicago Press, Chicago
Eltis D (ed.) 1999 The Trans-Atlantic Slave Trade: a Database on CD-ROM. Cambridge University Press, Cambridge, UK
Eltis D 2000 The Rise of African Slavery in the Americas. Cambridge University Press, Cambridge, UK
Finley M I 1987 Classical Slavery. Frank Cass, London
Fogel R W 1989 Without Consent or Contract. The Rise and Fall of American Slavery, 1st edn. Norton, New York
Fogel R W, Engerman S L 1974 Time on the Cross; the Economics of American Negro Slavery. Little, Brown, Boston
Franklin J H, Moss A A 2000 From Slavery to Freedom: a History of African Americans, 8th edn. A. A. Knopf, New York
Genovese E D 1974 Roll, Jordan, Roll: The World the Slaves Made, 1st edn. Pantheon Books, New York
Gutman H G 1976 The Black Family in Slavery and Freedom, 1750–1925, 1st edn. Pantheon Books, New York
Hellie R 1982 Slavery in Russia, 1450–1725. University of Chicago Press, Chicago
Herskovits M J 1941 The Myth of the Negro Past. Harper & Brothers, New York
Inikori J E, Engerman S L (eds.) 1992 The Atlantic Slave Trade: Effects on Economies, Societies, and Peoples in Africa, the Americas, and Europe. Duke University Press, Durham, NC
James C L R 1938 The Black Jacobins: Toussaint l’Ouverture and the San Domingo Revolution. The Dial Press, New York
Klein H S 1986 African Slavery in Latin America and the Caribbean. Oxford University Press, New York
Knight F W 1997 The Slave Societies of the Caribbean. General History of the Caribbean. UNESCO Pub, London
Kopytoff I, Miers S 1977 African slavery as an institution of marginality. In: Miers S, Kopytoff I (eds.) Slavery in Africa: Historical and Anthropological Perspectives. University of Wisconsin Press, Madison, WI
Lovejoy P E 1983 Transformations in Slavery: a History of Slavery in Africa. Cambridge University Press, Cambridge, UK
Manning P 1990 Slavery and African Life. Occidental, Oriental, and African Slave Trades. Cambridge University Press, Cambridge, UK
Mattoso K M de Queirós 1986 To be a Slave in Brazil, 1550–1888. Trans. Goldhammer A. Rutgers University Press, New Brunswick, NJ
Meillassoux C 1991 The Anthropology of Slavery: The Womb of Iron and Gold. University of Chicago Press, Chicago
Miller J C 1993 Slavery and Slaving in World History: A Bibliography, 1900–1991. Kraus International Publications, Millwood, NY
Morrissey M 1989 Slave Women in the New World: Gender Stratification in the Caribbean. University Press of Kansas, Lawrence, KS
Nieboer H J 1910 Slavery as an Industrial System; Ethnological Researches, 2nd rev. edn. M Nijhoff, The Hague, The Netherlands
Olivier de Sardan J P 1976 Quand nos ancêtres étaient captifs. Récits paysans du Niger. Paris
Patterson O 1982 Slavery and Social Death. A Comparative Study. Harvard University Press, Cambridge, MA
Phillips U B 1918 American Negro Slavery: A Survey of the Supply, Employment and Control of Negro Labor as Determined by the Plantation Regime. D Appleton, New York
Price R 1973 Maroon Societies: Rebel Slave Communities in the Americas, 1st edn. Anchor Press, Garden City, NY
Raboteau A J 1978 Slave Religion: The ‘Invisible Institution’ in the Antebellum South. Oxford University Press, Oxford, UK
Rawick G P, Federal Writers’ Project 1972 The American Slave: a Composite Autobiography. Contributions in Afro-American and African studies; no. 11. Greenwood, Westport, CT
Reid A, Brewster J 1983 Slavery, Bondage, and Dependency in Southeast Asia. St Martin’s Press, New York
Robertson C C, Klein M A 1983 Women and Slavery in Africa. University of Wisconsin Press, Madison, WI
Rodney W 1982 African slavery and other forms of social oppression on the Upper Guinea Coast in the context of the African slave trade. In: Inikori J (ed.) Forced Migration. The Impact of the Export Slave Trade on African Societies. Africana, New York
Schwartz S B 1985 Sugar Plantations in the Formation of Brazilian Society: Bahia, 1550–1835. Cambridge University Press, Cambridge, UK
Sheriff A 1987 Slaves, Spices, and Ivory in Zanzibar: Integration of an East African Commercial Empire into the World Economy, 1770–1873. J. Currey, London
1980 Slavery & Abolition. Cass, London, Vol. 1
Stampp K M 1956 The Peculiar Institution: Slavery in the Ante-Bellum South, 1st edn. Knopf, New York
Thornton J 1992 Africa and Africans in the Making of the Atlantic World, 1400–1680. Cambridge University Press, Cambridge, UK
Toledano E R 1998 Slavery and Abolition in the Ottoman Middle East. University of Washington Press, Seattle, WA
Verger P F 1982 Orisha: les Dieux Yorouba en Afrique et au Nouveau Monde. Métailié, Paris
Verlinden C 1955 L’esclavage dans l’Europe Médiévale. De Tempel, Bruges, Belgium
Watson J L (ed.) 1980 Asian and African Systems of Slavery. University of California Press, Berkeley, CA
Williams E E 1994 Capitalism and Slavery, 1st edn. University of North Carolina Press, Chapel Hill, NC
Willis J R 1985 Slaves and Slavery in Muslim Africa. Frank Cass, London
Wirz A 1984 Sklaverei und kapitalistisches Weltsystem. Suhrkamp, Frankfurt am Main, Germany
A. Wirz
Sleep and Health
1. Sleep Physiology
Sleep is a ubiquitous mammalian phenomenon which is associated with diminished responsiveness to external stimuli and a familiar and delightful sense of restoration under the circumstances of a normal night of sleep. The importance of sleep in daily living can easily be discerned when an individual has even a modest restriction in a normal night of sleep. This has been well documented to be associated with performance decrement and alterations in mood. The mysteries of sleep are many, and how it produces its restorative effects remains vague, but it is clear that it is at the core of mental and physical well being. Biological functioning cannot be thoroughly appreciated without integrating sleep and sleep biology into our understanding of waking biology. It is obvious now to both the layman and the professional sleep researcher that sleeping behavior affects waking behavior, and waking behavior affects sleeping behavior. In the mid-1950s, some of the mysteries of sleep began to unravel through a series of discoveries by Nathaniel Kleitman, William Dement, and Eugene Aserinsky at the University of Chicago (Dement 1990). These individuals produced a series of studies which delineated the brain wave patterns associated with different stages of sleep, and the unique stage of sleep which was termed rapid eye movement (REM) sleep, shown to be associated with dreaming. These stages of sleep were determined by the concomitant recording of the electroencephalogram (EEG), electro-oculogram (EOG), and the electromyogram (EMG). The stages of sleep are characterized by a pattern of activity in each of these variables. Brain wave activity changes dramatically from waking to the deep stages of sleep (stages 3 and 4) with high voltage slow waves becoming increasingly dominant. These stages are distributed throughout the night in a well-defined pattern of activity associated with REM periods approximately every 90 minutes (Fig. 1). REM periods are interspersed with stages 2, 3, and 4 such that stages 3 and 4 are identified largely in the first third of the night, with relatively little noted in the last third of the night.
Figure 1 The distribution of sleep stages over the course of a normal night of sleep
REM sleep is accumulated with episodes occurring every 90 minutes as noted, but as the night progresses, the REM periods become progressively longer, thus producing very long REM periods in the last third of the sleeping interval. Thus, REM sleep is identified primarily in the last third of the night, with relatively
little in the first third of the night (Carskadon and Dement 1994). Of interest is the fact that REM sleep is associated with some unique physiologic changes. Thermoregulatory and respiratory functioning are markedly compromised during REM sleep. For example, the vasomotor mechanisms necessary to respond appropriately to increases and decreases in ambient temperature in order to maintain a normal core body temperature are suspended during REM sleep. This renders the mammalian organism poikilothermic, or cold blooded, during REM sleep. In addition, appropriate respiratory compensatory responses to both hypoxemia and hypercapnia are substantially blunted during REM sleep. These data combine to suggest that even under normal circumstances, REM sleep is a period of physiologic risk woven into the fabric of a normal night of sleep. Other phenomena associated with REM sleep are of considerable interest. For example, it is known that REM sleep is associated with a skeletal muscle paralysis. This has been substantiated by the documentation of hyperpolarization of motor neurons during REM sleep in cats. In addition, REM sleep is generally associated with thematic dreaming, and it is felt that the inhibition of skeletal muscles prevents the ‘acting out’ of dreams. Also, during each REM period penile tumescence occurs, resulting in a normal full erection in males (Carskadon and Dement 1994). Non-REM (NREM) sleep, alternatively, is associated with a slowing of physiological processes, including heart rate, blood pressure, and general metabolic rate, compared to the waking state. During the first episode of slow wave sleep (stages 3 and 4) there is a dramatic increase in the secretion of growth hormone. This has been shown to be specifically linked to the first episode of slow wave sleep, rather than a circadian phenomenon. No other hormone is quite so precisely identified with secretion during a specific stage of sleep.
2. Subjective Aspects of Sleep
The quality of sleep, and the physiologic parameters which produce ‘good sleep’ or a feeling of restoration in the morning are not understood. Deprivation of REM sleep as opposed to non-REM sleep does not appear to differentially affect subsequent mood or performance. It is well established that older adults have characteristic alterations in their sleep patterns, i.e., a marked diminution in stages 3 and 4 sleep, which are quite characteristically associated with complaints of nonrestorative, or poor sleep. Determining the physiologic characteristics of the subjective elements of ‘good sleep’ is difficult in that it is well known that the subjective and physiologic aspects of sleep can be easily disassociated. It is not unusual, for example, for individuals with significant complaints of insomnia or poor sleep to have essentially normal physiological sleep patterns. Sleep is well known to be easily disturbed by mood and psychological state. Sleep is clearly disturbed by anxiety, including in patients with generalized anxiety disorder. Characteristically this is associated with prolonged sleep onset latency and multiple awakenings subsequent to sleep onset. In addition, a well-recognized prodrome to a clinically significant depressive episode is an alteration in sleep pattern which includes an early onset of REM sleep and early morning awakenings with difficulty falling back to sleep. There are also data which document the fact that REM deprivation in depressed patients can produce a significant antidepressant effect (see Depression). Good sleep is commonly associated with good health and a sense of well being. Measures of overall functional status have been known to be significantly correlated with both subjective and objective measures of daytime sleepiness. Other studies have shown that sleep disordered breathing is associated with lower general health status, with appropriate controls for body mass index, age, smoking status, and a history of cardiovascular conditions. Even very mild degrees of sleep disordered breathing have been shown to be associated with subjective decrements in measures of health status which are comparable to those of individuals with chronic disease such as diabetes, arthritis, and hypertension. Complaints of poor sleep and/or insomnia, and daytime sleepiness and fatigue are common. A recent Gallup survey in the USA indicated that approximately one third of Americans have insomnia. Thirty-six percent of American adults indicated some type of sleep problem. Approximately 27 percent reported occasional insomnia, while 9 percent indicated that their sleep-related difficulty occurs on a regular, chronic basis.
In a study of a large sample of Australian workers, the prevalence of a significant elevation in the Epworth Sleepiness Scale was just under 11 percent. These data were not related significantly to age, sex, obesity, or the use of hypnotic drugs. Perhaps the most common cause of daytime fatigue and sleepiness, aside from self-imposed sleep restriction, is obstructive sleep apnea. The sine qua non of this sleep disorder is persistent sonorous snoring. Snoring itself is the most obvious manifestation of an increase in upper airway resistance, and as it progresses to more significant levels, the upper airway gradually diminishes in cross-sectional diameter and may produce a complete occlusion. Sonorous snoring can exist as a purely social nuisance or it can be associated with multiple episodes of partial and complete upper airway obstruction during sleep associated with dangerously low levels of oxygen saturation. This sleep-related breathing disorder is also associated with complaints of daytime fatigue and sleepiness that may be only minimally obvious to some individuals, but can be quite severe and debilitating in others (Orr 1997).
3. Sleep Disorders
There are a variety of documented sleep disorders which have been described in a diagnostic manual which can be obtained through the American Academy of Sleep Medicine. This manual describes the diagnostic criteria of a plethora of sleep disorders ranging from disorders which manifest themselves primarily as symptoms, e.g., insomnia and narcolepsy, to physiologic disorders which can be defined only through a polygraphic sleep study, such as sleep apnea, periodic limb movements during sleep, and nocturnal gastroesophageal reflux. The prevalence of these disorders varies from narcolepsy (5–7 cases per 10,000) to obstructive sleep apnea syndrome, or OSAS (2–4 cases per 100), to insomnia, with as much as 25 percent of the general population reporting occasional problems with this disorder. Space constraints do not permit a discussion of all of these disorders; therefore we will touch only on the most commonly encountered sleep disorders. Clearly, sleep disorders have well-documented consequences with regard not only to health but to daytime functioning. Perhaps the most common of all relates to the behavioral consequences of sleep fragmentation or sleep restriction. Whether as a result of anxiety or of a physiologic sleep disturbance, sleep restriction and sleep fragmentation can produce documentable declines in performance, and increases in daytime sleepiness. The effects of even minimal sleep restrictions are cumulative across nights, but the effects can be quickly reversed in a single night of normal sleep. Sleep restriction is commonly noted in shift workers; it is estimated that approximately 30 percent of the American work force works rotating shifts, or a permanent night shift. Even permanent night shift workers rarely obtain ‘normal’ sleep in the daytime, and complaints of inadequate or nonrestorative sleep are persistent in this group of workers.
Studies have shown that accidents are 20 percent higher on the night shift, and 20 percent of night shift workers report falling asleep on the job. The demands of our increasingly complex technologic society have created a workforce that is under greater stress, obtaining less sleep, and showing a clearly increasing prevalence of sleep complaints and sleep disorders. An epidemiological study has estimated that the prevalence of sleep disordered breathing (defined as five obstructive events per hour or greater) is 9 percent for women and 24 percent for men (Young et al. 1993). It was estimated that 2 percent of women and 4 percent of middle-aged men meet the minimal diagnostic criteria for sleep apnea, which include five obstructive events per hour with the concomitant symptom of daytime sleepiness. Although obstructive sleep apnea syndrome (OSAS) is felt to be a predominantly male phenomenon, the incidence rises sharply among postmenopausal women. Persistent, loud snoring, and the concomitant upper airway obstruction, has been shown to carry with it a variety of medical and behavioral risks. Various studies have shown that snoring is a significant predictor of hypertension, angina, and cerebral vascular accidents independent of other known risk factors. One study has shown in a multiple regression analysis that snoring was the only independent risk factor which differentiated stroke occurring during sleep and stroke occurring at other times of the day. Other studies have documented a higher mortality rate in patients with moderate to severe OSAS compared to those with a less severe manifestation of this disorder. Furthermore, other studies have shown that in individuals with curtailed life expectancy secondary to OSAS, the most common cause of death was myocardial infarction. Other studies have confirmed that more aggressive treatment of OSAS reduces the incidence of vascular mortality (Hla et al. 1994). The issue of the relationship between snoring, OSAS, and hypertension is somewhat controversial. The frequent association of significant OSAS with male gender and obesity makes it difficult to determine the relative contribution of each variable to hypertension. One large population study of 836 males from a general medical practice in England revealed a significant correlation between overnight hypoxemia and systemic blood pressure, but this could not be determined to be independent of age, obesity, and alcohol consumption (Stradling and Crosby 1990). No significant relationship was found with snoring.
Alternatively, another excellent study which actually utilized overnight polysomnography and 24-hour ambulatory blood pressure monitoring did find an association between hypertension and sleep apnea which is independent of obesity, age, and sex in a nonselected community based adult population (Hla et al. 1994). Clinically, it is recognized that appropriate treatment of OSAS will often result in a notable reduction in blood pressure, often independent of weight loss. In a recent National Institutes of Health supported project on sleep and cardiovascular functioning, preliminary data have shown that 22 to 48 percent of hypertensive patients have been observed with significant OSA, and 50 to 90 percent of sleep apnea patients have been documented to have hypertension. Perhaps the most well-known ‘sleeping disorder’ or ‘sleep sickness’ is narcolepsy. Long assumed to be synonymous with uncontrollable daytime sleepiness, narcolepsy is now known to be a neurological disorder, and genetic abnormalities have recently been described in a canine model of narcolepsy. This disorder is associated with an extraordinary degree of daytime sleepiness but that alone does not define the syndrome. Many individuals with OSAS also have an equivalent degree of daytime sleepiness. Narcolepsy is identified
by other ancillary symptoms. Perhaps the most common and dramatic is cataplexy, in which individuals have a partial or complete loss of skeletal muscle control in the face of emotional stimuli such as fear, anger, or laughter. In the sleep laboratory, narcolepsy is associated with a unique pattern of sleep where an onset of REM sleep is noted usually within 10 to 15 minutes after sleep onset. This is considered to be a pathognomonic sign of narcolepsy. Unfortunately, there is no cure for narcolepsy at the present time, and treatment is limited to the symptomatic improvement of daytime sleepiness with stimulant medication, and the control of the ancillary symptoms primarily via tricyclic antidepressant medication. The results of the Gallup survey indicated that 36 percent of American adults suffer from some type of sleep problem. Approximately 25 percent reported occasional insomnia, while 9 percent said that their sleep-related difficulty occurs on a regular or chronic basis. In addition, the survey documented that insomniacs were 2.5 times more likely than noninsomniacs to report vehicle accidents in which fatigue was a factor. Compared to noninsomniacs, insomniac patients also reported a significantly impaired ability to concentrate during the day. The well-documented excessive daytime sleepiness in patients with obstructive sleep apnea has been shown in several studies to result in a significant increase in traffic accidents, as well as at-fault accidents (Findley et al. 1989). The extreme sleepiness noted in patients with OSAS is almost certainly the result of an extreme degree of sleep fragmentation secondary to arousals from sleep by repeated events of obstructed breathing. Furthermore, it is well established that the sleepiness experienced by patients with OSAS is completely reversible by appropriate therapy which resolves the sleep related breathing disorder.
The most common approach to therapy currently is the application of a positive nasal airway pressure via a technique referred to as continuous positive airway pressure (CPAP). Sleep deprivation, whether the result of willful sleep restriction, poor sleep secondary to insomnia, or fragmented sleep secondary to OSAS, has substantial effects on physical and behavioral health as noted above. Studies on chronic sleep deprivation have shown minimal physiologic effects, but have documented dose related effects in performance decrements and decreased alertness. More recently, however, an extraordinary set of studies has demonstrated that long-term sleep deprivation does have very significant effects in rats, and can ultimately result in death. The most likely explanation of death appears to be a profound loss of thermal regulation (Rechtschaffen 1998). Also of interest are recent studies which suggest alterations in immune function secondary to chronic sleep deprivation in humans. In conclusion, the intuitive notion that sleep has important consequences with regard to one’s health appears to be clearly documented. Alterations in sleep producing a fragmentation of sleep or restriction in the normal duration of sleep have been shown to have significant consequences with regard to waking behavior. In addition, sleep disorders which fragment sleep, such as obstructive sleep apnea, produce effects that relate not only to waking behavior, but also have significant consequences with regard to both cardiovascular and cerebral vascular complications as well as producing a significant increase in mortality. Perhaps the most important message to be gleaned from the remarkable increase in our knowledge of sleep is that sleeping behavior affects waking behavior and waking behavior affects sleeping behavior.
See also: Stress and Health Research
Bibliography
Carskadon M, Dement W 1994 Normal human sleep: An overview. In: Kryger M, Roth T, Dement W (eds.) Principles and Practice of Sleep Medicine. Saunders, Philadelphia, PA
Dement W C 1990 A personal history of sleep disorders medicine. Journal of Clinical Neurophysiology 7: 17–47
Findley L J, Fabrizio M, Thommi G, Suratt P M 1989 Severity of sleep apnea and automobile crashes. New England Journal of Medicine 320: 868–9
Hla K M, Young T B, Bidwell T, et al 1994 Sleep apnea and hypertension. Annals of Internal Medicine 120: 382–8
Orr W 1997 Obstructive sleep apnea: Natural history and varieties of the clinical presentation. In: Pressman M, Orr W (eds.) Understanding Sleep. American Psychological Association, Washington, DC
Rechtschaffen A 1998 Current perspectives on the function of sleep. Perspectives in Biology and Medicine 41: 359–90
Stradling J R, Crosby J H 1990 Relation between systemic hypertension and sleep hypoxaemia or snoring: Analysis in 748 men drawn from general practice. British Medical Journal 300: 75
Young T, Palta M, Dempsey J, et al 1993 The occurrence of sleep-disordered breathing among middle-aged adults. The New England Journal of Medicine 328: 1230–5
W. C. Orr
Sleep Disorders: Psychiatric Aspects Complaints of too little sleep (insomnia) or too much sleep (hypersomnia), or of sleep that is not restorative enough, are termed dyssomnias. This chapter is intended to give an overview on the frequency and risk factors of dyssomnias, and on the psychiatric conditions that may be associated with them, and the physiopathological concepts related to sleep changes in depression. Of the greater than 30 percent of the general population who complain of insomnia during 14166
the course of one year, about 17 percent report that it is ‘serious.’ Insomnia is encountered more commonly in women than in men, and its prevalence increases with age. Insomnia represents one of the major features of depression and appears to be one of the prodromal symptoms of, and risk factors for, major depression. Since chronic dyssomnia most often occurs as a comorbid disturbance of psychiatric and physical conditions, a thorough evaluation of the patient and his/her sleep complaints is needed to lay the foundation for accurate diagnosis and effective treatment.
The diagnosis of insomnia is based upon the subjective complaint of sleeping too little. Patients report difficulties in initiating or maintaining sleep, or non-restorative sleep (i.e., not feeling well rested after sleep that is apparently adequate in amount), and tiredness during the day. Insomnia may occur as a primary disorder or as a secondary disorder due to other psychiatric conditions, general medical conditions, and/or substance misuse. Compared to secondary insomnia, the prevalence of primary insomnia is relatively small.
Hypersomnia includes complaints of excessive sleepiness characterized by prolonged sleep episodes and/or excessive sleepiness during the day. These symptoms may interfere with social, occupational, and/or other areas of functioning. Like insomnia, hypersomnia may occur as a primary disorder or as a secondary disorder due to psychiatric and/or medical conditions, and/or due to substance misuse.
Especially among patients encountered in psychiatric practices and hospitals, secondary sleep disturbances are more common than primary sleep disturbances. This is particularly important to bear in mind when evaluating a patient with sleep complaints. Dyssomnia is particularly often associated with psychiatric disorders such as depression, schizophrenia, anxiety disorders, or personality disorders, and with misuse of drugs and/or alcohol.
Whenever possible, it is clinically useful to assess which is the primary and which is the secondary disturbance. This will facilitate clinical treatment and management and improve preventive measures.
1. Frequency and Risk Factors
Somnipathies are among the most frequent complaints a general practitioner must deal with. According to epidemiological studies, 19–46 percent of the population report sleep problems. Of these, 13 percent suffer from moderate or severe disturbances. In terms of the narrow definition of diagnostic criteria, that is, initial insomnia, interrupted sleep, and disturbance of daily well-being, 1.3 percent of the population suffers from somnipathies. Risk factors for somnipathies are psychological stress or psychiatric illness. A high prevalence of
Figure 1 Sleep stages according to Rechtschaffen and Kales (1968)
somnipathies was reported by volunteers suffering from stress, tension, solitude, or depression. More severe sleep problems were found to be clearly related to psychiatric illness such as depression and anxiety disorders as well as substance misuse. A recent World Health Organization (WHO) collaborative study in 15 different countries found that insomnia is more common in females than in males and increases with age. In fact, it is thought that half of the population over 65 years of age suffers from chronic sleep disturbances. The WHO study also found that 51 percent of people with an insomnia complaint had a well-defined International Classification of Diseases-10 (ICD-10) mental disorder (mainly depression, anxiety, alcohol problems). Although many somnipathies develop intermittently, 70–80 percent of the surveyed subjects had been suffering from sleeping problems for more than one year. Disturbances of concentration and memory, difficulties in getting on with daily activities, and depressive mood changes were among the problems most often related to insomnia. The pressure imposed by their ailment leads many patients to seek help in alcohol and drugs. Insomnia also has costly financial consequences. The economic impact of insomnia can be divided
into direct and indirect costs. Direct costs of insomnia include outpatient visits, sleep recordings, and medications directly devoted to insomnia. There is too little knowledge about the exact figures. However, the direct costs of insomnia in the US in 1990 have been estimated at $10.9 billion (with $1.1 billion devoted to substances used to promote sleep and $9.8 billion associated with nursing home care for elderly subjects with sleep problems). The direct costs related to the evaluation of sleep disorders by practitioners seem to be a small part of the total costs of insomnia. The indirect costs of insomnia include the presumed secondary consequences of insomnia such as health and professional problems, and accidents. The exact quantification of these costs is, however, controversial. It is often not known whether sleep disorders are the cause or the consequence of various medical or psychiatric diseases. For instance, it has been observed that insomniacs report more medical problems than good sleepers do and have twice as many doctor visits per year as good sleepers do. Furthermore, subjects with severe insomnia appear to be hospitalized about twice as often as good sleepers. It has also been observed that insomniacs consume more medication for various problems than good sleepers do.
These results confirm previous observations showing that insomnia is statistically linked with a worse health status than that of individuals with good sleep. Again, it cannot be established whether insomnia is the cause or the result of this worse status. For instance, one could reasonably hypothesize that insomnia promotes fatigue that could increase the risk of some diseases or, more simply, lower the threshold for others, which could then develop more easily.
2. Regulation of Normal Sleep
The quantity and quality of sleep are measured with polysomnographic recordings (PSG) that include the electroencephalogram (EEG), electromyogram (EMG), and electrooculogram (EOG). Human sleep consists of two major states, Rapid-Eye-Movement (REM) sleep and non-REM sleep (Fig. 1). Non-REM sleep is characterized by increasing EEG synchronization and is divided into stages 1–4. Stages 3 and 4 are also termed slow wave sleep (SWS) or delta sleep (‘delta waves’). REM sleep has also been termed paradoxical sleep because of its wake-like desynchronized EEG pattern combined with an increased arousal threshold. Typical nocturnal sleep is characterized by 3–5 cycles of non-REM and REM sleep phases of about 60–100 minutes duration. At the beginning of the night, the cycles contain more non-REM sleep, particularly more SWS. Towards the end of the night, the amount of REM sleep increases and the amount of SWS decreases.
The following standard terms are used to describe quality and quantity of sleep: sleep continuity refers to the balance between sleep and wakefulness (i.e., initiating and maintaining sleep), and sleep architecture refers to the amount, distribution, and sequencing of specific sleep stages. Measures of sleep continuity include sleep latency (usually defined as the duration between lights out and the first occurrence of stage 2 sleep), intermittent wake after sleep onset, and sleep efficiency (ratio of the time spent asleep to total time spent in bed). REM latency refers to the elapsed time between sleep onset and the first occurrence of REM sleep. The amount of each sleep stage is quantified by its percentage of the total sleep time. REM density is a measure of eye movements during REM sleep; typically this is low in early REM sleep periods and increases in intensity with successive REM sleep periods. In normal sleep, the cycle of non-REM and REM sleep lasts approximately 60–100 minutes.
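The continuity and architecture measures defined above can be made concrete with a short sketch. It is an illustration only, not a scoring standard: the 30-second epoch length, the stage labels, the choice of the first stage 2 epoch as sleep onset, and the function name `sleep_measures` are all assumptions made for the example, and the sample night is invented.

```python
from collections import Counter

EPOCH_MIN = 0.5  # assumed epoch length: 30 seconds, expressed in minutes


def sleep_measures(hypnogram):
    """Compute basic sleep measures from a list of stage labels.

    hypnogram: one label per epoch from lights out to lights on.
    Assumed labels: 'W' (wake), '1', '2', '3', '4' (non-REM stages), 'REM'.
    """
    if "2" not in hypnogram:
        raise ValueError("no stage 2 sleep scored")

    # Sleep latency: lights out to the first occurrence of stage 2 sleep.
    onset = hypnogram.index("2")
    sleep_latency = onset * EPOCH_MIN

    # Sleep efficiency: time spent asleep divided by total time in bed.
    # Every non-wake epoch counts as sleep, including stage 1 before onset.
    asleep = [stage for stage in hypnogram if stage != "W"]
    total_sleep = len(asleep) * EPOCH_MIN
    time_in_bed = len(hypnogram) * EPOCH_MIN
    efficiency = total_sleep / time_in_bed

    # REM latency: sleep onset to the first REM epoch (None if no REM).
    rem_latency = None
    if "REM" in hypnogram[onset:]:
        rem_latency = (hypnogram.index("REM", onset) - onset) * EPOCH_MIN

    # Intermittent wake after sleep onset.
    waso = hypnogram[onset:].count("W") * EPOCH_MIN

    # Each stage as a percentage of total sleep time.
    counts = Counter(asleep)
    stage_pct = {stage: 100.0 * n / len(asleep) for stage, n in counts.items()}

    return {
        "sleep_latency_min": sleep_latency,
        "sleep_efficiency": efficiency,
        "rem_latency_min": rem_latency,
        "waso_min": waso,
        "total_sleep_min": total_sleep,
        "stage_percent": stage_pct,
    }


if __name__ == "__main__":
    # Invented, heavily shortened night used only to exercise the function.
    night = (["W"] * 20 + ["1"] * 10 + ["2"] * 50 + ["3"] * 30 + ["4"] * 30
             + ["2"] * 40 + ["REM"] * 30 + ["W"] * 10)
    for name, value in sleep_measures(night).items():
        print(name, value)
```

REM density is left out because it depends on counting eye movements within REM epochs, which a stage-only hypnogram does not record.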
Sleep usually begins with the non-REM sleep stage 1 and progresses to stage 4 before the appearance of the first REM period. The duration of the REM sleep episodes and REM density usually increase throughout subsequent sleep cycles. As shown in Fig. 2, the pattern of nocturnal growth hormone secretion is associated with the development of SWS during the first non-REM sleep
Figure 2 Bidirectional interaction between sleep EEG and nocturnal hormone secretion
period. Cortisol secretion, on the other hand, appears to be associated with an increased amount of REM sleep. Several non-REM and REM sleep measures, including the amount of SWS, REM latency, and REM density, may be particularly altered in affective and schizophrenic disorders, in aging, and with the administration of certain drugs.
At least three major processes are involved in the regulation of normal sleep. According to the Two-Process Model of sleep regulation, sleep and wakefulness are influenced by both a homeostatic and a circadian process that interact in a complex way. The homeostatic Process S (related to SWS) increases with the duration of wakefulness prior to sleep onset and augments sleep propensity. The circadian Process C, driven by the internal clock located in the suprachiasmatic nuclei (SCN), describes the daily cycle of sleepiness and wakefulness and is also related to REM sleep. The third process is the ultradian rhythm of non-REM and REM sleep. The electrophysiological measures are associated with endocrinological and other physiological events. Wakefulness, non-REM sleep, and REM sleep appear to be controlled by interacting neuronal networks, rather than by unique necessary and sufficient centers. A simplified neuroanatomy of sleep–wakefulness is shown in Fig. 3.
Figure 3 Neuroanatomy of sleep–wake regulation
The non-REM–REM sleep cycle is regulated within the brainstem. According to current concepts, REM sleep is initiated and maintained by cholinergic neurons originating within the lateral dorsal tegmental and pedunculopontine nuclei in the dorsal tegmentum, and is inhibited by noradrenergic and serotonergic neurons originating in the locus coeruleus and dorsal raphe nuclei. Human pharmacological data are consistent with the neurophysiological concepts of the control of REM and non-REM sleep. In contrast to the brain-activated state of REM sleep, non-REM sleep is characterized by synchronized, rhythmic inhibitory–excitatory potentials in large numbers of neurons in cortical and thalamic regions. Other areas implicated in the control of non-REM sleep include cholinergic and noncholinergic neurons in the basal forebrain, the hypothalamus, and the area of the solitary tract. In addition to neuronal mechanisms in the control of sleep, more than 30 different endogenous substances have been reported to be somnogenic. These include delta sleep-inducing peptide, prostaglandin D2, vasoactive intestinal peptide, adenosine, growth hormone-releasing hormone, and several cytokines including interleukins, tumor necrosis factor-α, and interferons. The significance of endogenous sleep factors in normal
sleep physiology remains to be proven, but the possibility has interesting implications. As the Two-Process Model of sleep regulation suggests, increased sleep propensity with progressive duration of wakefulness might be associated with the accumulation of a sleep factor, and the homeostatic function of sleep, or at least of delta sleep, might reflect its ‘catabolism.’
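The homeostatic and circadian processes described in this section can be illustrated with a toy simulation. The text states only that Process S rises with time awake and that Process C follows a daily cycle; the saturating-exponential form for S, the cosine form for C, every numerical constant, and the use of S minus C as a sleep-propensity index are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Illustrative constants only; not taken from the text.
TAU_RISE_H = 18.2    # assumed time constant for the build-up of S while awake
TAU_DECAY_H = 4.2    # assumed time constant for the decline of S while asleep


def step_process_s(s, dt_h, awake):
    """Advance homeostatic Process S (scaled 0..1) by dt_h hours."""
    if awake:
        # Saturating rise toward 1 with continued wakefulness.
        return 1.0 - (1.0 - s) * math.exp(-dt_h / TAU_RISE_H)
    # Exponential decline during sleep.
    return s * math.exp(-dt_h / TAU_DECAY_H)


def process_c(clock_h, amplitude=0.1, peak_h=18.0):
    """Circadian Process C as a 24-hour cosine (assumed shape and phase)."""
    return amplitude * math.cos(2.0 * math.pi * (clock_h - peak_h) / 24.0)


def simulate(hours_awake, hours_asleep, s0=0.2, wake_time_h=7.0, dt_h=0.25):
    """Return (clock hour, awake flag, S, S - C) after each time step."""
    n_wake = round(hours_awake / dt_h)
    n_sleep = round(hours_asleep / dt_h)
    s, clock = s0, wake_time_h
    trace = []
    for i in range(n_wake + n_sleep):
        awake = i < n_wake
        s = step_process_s(s, dt_h, awake)
        clock = (clock + dt_h) % 24.0
        # S - C serves here as a crude index of sleep propensity.
        trace.append((clock, awake, s, s - process_c(clock)))
    return trace


if __name__ == "__main__":
    trace = simulate(hours_awake=16, hours_asleep=8)
    for clock, awake, s, propensity in trace[::8]:
        state = "wake " if awake else "sleep"
        print(f"{clock:5.2f} h  {state}  S={s:.3f}  S-C={propensity:.3f}")
```

In this sketch, extending `hours_awake` raises S at sleep onset, which mirrors the statement that sleep propensity grows with the duration of prior wakefulness. The ultradian non-REM and REM cycle, the third process named in the text, is not modeled.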
3. Sleep and Psychiatric Disorders
Polygraphic sleep research helped pave the way to the modern age of scientific, biological psychiatry. Much of the early sleep research, beginning in the mid-1960s, was descriptive mapping of events taking place during sleep, and it laid out the currently rich picture of polysomnographic features in the different psychiatric disorders. Although many of the objective sleep abnormalities associated with psychiatric disorders appear not to be diagnostically specific, these studies have also established sleep measures as neurobiological windows into the underlying pathophysiology associated with psychiatric illnesses. Psychiatric disorders are among the most common primary causes of secondary sleep complaints, particularly of insomnia. Sleep abnormalities may be caused by central nervous system abnormalities associated with psychiatric illnesses as well as by accompanying
Table 1 Typical EEG sleep findings in psychiatric disorders (meta-analysis by Benca et al. 1992)
behavioral disturbances. Patients with depression, anxiety disorders, misuse of alcohol or drugs, schizophrenia, and personality disorders may complain of difficulty falling asleep, staying asleep, or of inadequate sleep. Although specific sleep patterns are not necessarily diagnostic of particular psychiatric disorders, there are relationships between certain sleep abnormalities and categories of psychiatric disorders (Table 1). A review of the literature on sleep in psychiatric disorders showed that EEG sleep in patients with affective disorders differed most frequently from that of normal control subjects.
4. Sleep Abnormalities in Depression
Sleep abnormalities in patients with major depressive disorders, as assessed by laboratory studies, can be classified as difficulties of sleep continuity, abnormal sleep architecture, and disruptions in the timing of REM sleep. Sleep initiation and maintenance difficulties include prolonged sleep latency (sleep onset insomnia), intermittent wakefulness and sleep fragmentation during the night, early morning awakenings with an inability to return to sleep, reduced sleep efficiency, and decreased total sleep time. With regard to sleep architecture, abnormalities have been reported in the amounts and distribution of non-REM sleep stages across the night and include increased shallow stage 1 sleep and reductions in the amount of deep, slow-wave (stages 3 + 4) sleep. REM sleep disturbances in depression include a short REM latency
(≤65 minutes), a prolonged first REM sleep period, and an increased total REM sleep time, particularly in the first half of the night. Sleep disturbances are generally more prevalent in depressed inpatients, whereas only 40–60 percent of outpatients show sleep abnormalities. Moreover, a recent meta-analysis indicated that no single polysomnographic variable might reliably distinguish depressed patients from healthy control subjects or from patients with other psychiatric disorders. Table 1 gives an overview of the typical EEG sleep changes in major psychiatric conditions. This finding prompted some researchers to conclude that clusters or combinations of sleep variables better describe the nature of sleep disturbances in depression.
Although there is some disagreement as to which specific sleep EEG variables best characterize depressed patients, the importance of sleep to depression is clear. Persistent sleep disturbance is associated with significant risk of both relapse and recurrence, and with increased risk of suicide. Sleep variables such as REM latency have also been shown to predict treatment response and clinical course of illness in at least some studies. It has also recently been suggested that the nature of the sleep disturbance at initial clinical presentation may be relevant to the choice of antidepressant medications and the likelihood of experiencing treatment-emergent side effects. One of the most sensitive parameters for discriminating patients with major depression from patients with other psychiatric disorders and from healthy subjects is REM density, which is substantially elevated only in depressed patients. The persistence of a depression-like sleep pattern in fully remitted depressed patients suggests that the pattern is a trait characteristic of the sleep measurements. However, in the past, subjects have undergone investigations only after the onset of the disorder, and therefore the altered sleep pattern may merely represent a biological scar.
The answer to the question ‘trait or scar’ lies in the investigation of potential patients before the onset of the disorder. The EEG sleep patterns of subjects without a personal history but with a strong family history of an affective disorder differed from those of controls without any personal or family history of psychiatric disorders: the former showed a depression-like sleep pattern with diminished SWS and increased REM density. Follow-up studies will determine whether the sleep pattern indeed represents a trait marker indicating vulnerability. The importance of sleep in depression is also shown in other ways. Many well-documented studies show that total and partial sleep deprivation or selective REM sleep deprivation has antidepressant effects. Additionally, following total or partial sleep deprivation, patients with depression appear to be uniquely susceptible to clinical mood changes when they return to sleep. Patients who have shown a clinically significant antidepressant response to sleep deprivation are
at risk of awakening depressed again, even after very short naps (see Depression; Antidepressant Drugs).
5. Neurobiology of Sleep: Relevance to Psychiatric Disorders
Inspired by the growing knowledge of the underlying neurobiology of sleep, investigators have proposed theories of the pathophysiology of psychiatric disorders. Because depression has been studied more than any other psychiatric syndrome in recent decades, the models have attempted to explain features of sleep in depression, such as short REM latency, decreased delta sleep, and the antidepressant effects of sleep deprivation in depressed patients. Among the most prominent models of sleep changes in depression are the cholinergic–aminergic imbalance hypothesis for depression, the Two-Process Model of sleep regulation, the Phase Advance Hypothesis, the Overarousal Hypothesis, and the REM sleep hypothesis.
The ‘cholinergic–aminergic imbalance hypothesis’ for depression postulates that depression arises from an increased ratio of cholinergic to aminergic neurotransmission in critical central synapses. Because various features of the sleep of depressed patients (decreased REM latency and delta sleep) have been simulated in normal volunteers by pharmacological probes, the reciprocal interaction hypothesis from basic sleep research and the cholinergic–aminergic imbalance model from clinical psychiatry have been correlated. The reciprocal interaction model assumes that the cyclic alternating pattern of non-REM and REM sleep is under the control of noradrenergic/serotonergic and cholinergic neuronal networks. Linking these concepts suggests that, if depression results from diminished noradrenergic and serotonergic neurotransmission, cholinergic activity would be expected to increase and thereby lead to the sleep disturbances of depression. In recent years the role of serotonin in the regulation of sleep has received increased attention.
Among other neurotransmitters, serotonin plays a role in the pathophysiology of depression and its treatment; virtually all antidepressants ultimately lead to an enhancement of serotonergic neurotransmission that is believed to be associated with clinical improvement. Depression is associated with a disinhibition of REM sleep (shortened REM latency, increased REM density), and serotonin leads to a suppression of REM sleep. Furthermore, there is evidence that the antidepressant effect of sleep deprivation is related to a modification of serotonergic neurotransmission; thus, sleep regulation and depression share common pathophysiological mechanisms at the serotonergic level. According to the ‘Two-Process Model of sleep regulation,’ depression is thought to result from, or to be associated with, a deficiency of Process S. This model
was supported by evidence that EEG power density in the delta frequencies was decreased during sleep in depressed patients compared with controls and by the fact that sleep deprivation increases both Process S and mood. In the ‘Extended Two-Process Model of sleep regulation’ the interaction of hormones with EEG features has been integrated. Preclinical investigations and studies in young and elderly normal controls and in patients with depression demonstrate that neuropeptides play a key role in sleep regulation. As an example, growth hormone-releasing hormone (GHRH) is a common stimulus of SWS and growth hormone release, whereas corticotropin-releasing hormone (CRH) exerts opposite effects. It is suggested that an imbalance of these peptides in favor of CRH contributes to changes in sleep EEG and endocrine activity during depression. Based on the findings that the hypothalamic–pituitary–adrenocortical (HPA) axis is dysregulated in depression, that CRH produces EEG sleep changes reminiscent of depression (SWS reduction, REM sleep disinhibition), and that the somatotrophic system reciprocally interacts with the HPA system (decreased GHRH and SWS), it has been postulated that deficient Process S is associated with deficient GHRH and an overdrive of CRH. Although abnormally high values of cortisol secretory activity normalize after recovery from depression, growth hormone release and several characteristic disturbances of the sleep EEG may remain unchanged. These putative trait-dependent alterations suggest that strategies aiming at restoration of these sleep changes are worthy of exploration for their potential antidepressant effect.
The depressed state appears to be associated with a disturbance not only of sleep–wake homeostasis but also of circadian function or of their interaction. The ‘Phase Advance Hypothesis’ suggests that the phase position of the underlying circadian oscillator is ‘phase advanced’ relative to the external clock time.
This is supported by studies showing that short REM latency could be simulated in normal controls by appropriate phase shifts of hours in bed.
As mentioned above, sleep deprivation has potent antidepressant effects in more than half of all depressed patients. This observation has prompted the hypothesis that depressed patients are ‘overaroused.’ Later, the ‘Overarousal Hypothesis’ was put forward and was supported by psychological self-rating studies suggesting that clinical improvement after sleep deprivation in depression may be associated simultaneously with subjective feelings of more energy (arousal), less tension, and more calmness (dearousal). Other data consistent with this hypothesis include the short, shallow, and fragmented sleep patterns, lowered arousal thresholds, and elevated nocturnal core body temperature often seen in depressed patients. This hypothesis has been tested by means of studies of localized cerebral glucose metabolism with 18F-deoxyglucose positron emission tomography in separate studies in depression, sleep deprivation, and the first non-REM period of the night. In these studies it was found that elevated brain metabolism prior to sleep deprivation predicted clinical benefits in depressed patients and that normalization of these measures is associated with clinical improvement. It was also found that local cerebral glucose metabolism in cingulate and amygdala at baseline was significantly higher in clinical responders than in nonresponders or normal controls. Furthermore, it was shown that glucose metabolic rate was increased during the first non-REM period in depressed patients compared with normal controls. Moreover, these studies demonstrated significant ‘hypofrontality,’ that is, a reduced ratio of frontal to occipital activity compared with normal controls.
The ‘REM sleep hypothesis’ of depression is based on the findings that REM sleep is enhanced in depression and that virtually all antidepressant drugs suppress REM sleep. Early studies, which have not been replicated, showed that a sustained remission from depression may be achieved by selective REM sleep deprivation carried out every night for about two weeks. This treatment modality was not pursued because the long-term REM sleep deprivation was too exhausting for the patients. Recently, researchers have proposed a treatment modality that combines some of the above hypotheses. The so-called ‘Sleep-Phase-Advance’ protocol, that is, scheduling sleeping times in a way that minimizes the occurrence of REM sleep, has been shown to produce a sustained antidepressant effect. However, this treatment modality is also demanding for both the patients and the institutional staff.
6. Future Perspectives
We have outlined just some aspects of sleep research that are relevant to depression. Linking basic and clinical approaches has in the past been one of the ‘royal roads’ to the neurobiological underpinnings of psychiatric diseases and their treatment, and to narrowing the gap between bench and bedside. One of the most impressive links between sleep and depression is the fact that sleep deprivation alleviates depressive symptoms within hours and that sleep may restore the initial symptoms within hours or even minutes. Given that the core symptoms of only a few psychiatric and medical conditions may be ‘switched’ on and off, one promising lead for future research in depression is the quest to understand the mechanisms underlying the neurobiological processes associated with sleep and wakefulness. In the future it is hoped that the rapidly evolving progress in basic neuroscience, including recent molecular biology with systematic screening of gene expression and functional brain imaging techniques, will help us in progressively uncovering the mystery of sleep, and ultimately in improving treatment strategies for depression.
Bibliography
Benca R M, Obermayer W H, Thisted R A, Gillin J C 1992 Sleep and psychiatric disorders: A meta-analysis. Arch. Gen. Psychiatry 49: 651–68
Borbély A A, Achermann P 1999 Sleep homeostasis and models of sleep regulation. Journal of Biological Rhythms 14(6): 557–68
Gillin J C, Seifritz E, Zoltoski R, Salin-Pascual R J 2000 Basic science of sleep. In: Sadock B J, Sadock V A (eds.) Kaplan & Sadock’s Comprehensive Textbook of Psychiatry—VII. Lippincott Williams and Wilkins, Philadelphia, pp. 199–208
Holsboer-Trachsler E, Kocher R 1996 Somnipathies: new recommendations for their diagnosis and treatment. Drugs of Today 32(6): 477–82
Kupfer D J 1999 Pathophysiology and management of insomnia during depression. Annals of Clinical Psychiatry 11(4): 267–76
Rechtschaffen A, Kales A 1968 A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects. Department of Health, Education and Welfare, Neurological Information Network, Bethesda
Silva J, Chase M, Sartorius N, Roth T 1996 Special report from a symposium held by the World Health Organization and the World Federation of Sleep Research Societies: an overview of insomnia’s and related disorders—recognition, epidemiology, and rational management. Sleep 19(5): 412–6
Steiger A, Holsboer F 1997 Neuropeptides and human sleep. Sleep 20: 1038–52
Tononi G, Cirelli C 1999 The frontiers of sleep. Trends in Neuroscience 22(10): 417–8
E. Holsboer-Trachsler and E. Seifritz
Sleep Disorders: Psychological Aspects
1. Definition
Good sleep is usually defined by its consequences. It is the amount and quality of sleep that results in the ability to feel and function well the next day. In contrast, a sleep disorder is a disturbance in the quantity or quality of sleep that interferes with waking performance and/or feelings of well-being. Psychological factors play a role in both the etiology and maintenance of many of the 84 sleep disorders recognized in The International Classification of Sleep Disorders Diagnostic Coding Manual (1997). These in turn have both short-term and long-term effects on the psychological functioning of patients.
2. Background
Historically, sleep has not been an area of concern in psychology. Aside from psychoanalytic theory, which placed a good deal of emphasis on the role of unconscious motivation in explaining behavior, most psychological theory was based on behavior that was objectively observable. With the discovery in the early 1950s of rapid eye movement (REM) sleep, and its
close association with the distinctive mental activity, dreaming, a good deal of work was undertaken to explore whether and how this interacted with waking behavior. The question of whether dreaming has some unique psychological function was driven by the observation that this phenomenon was universal, regularly recurring, and persistent when experimentally suppressed. Initially it was speculated that the study of dreaming might lead to better understanding of the psychoses. Was the hallucination of mental illness a misplaced dream? Was the high degree of brain activation in REM sleep conducive to memory consolidation of new learning? Did dreaming play a role in emotional adaptation? When, after some 25 years of experimentation, no clear consequences for waking behavior could be attributed to the suppression of REM sleep, interest in this area faded. Loss of REM sleep did not appear to interfere with memory consolidation, nor did it promote waking hallucinations. Its role in emotional adaptation remained speculative. Foulkes’s (1982) study of children’s dream development concluded that dream construction was not something special. It could be explained by the periodic internal activation of the brain stimulating bits of sensory memory material represented at the same level of cognitive sophistication as the child was capable of achieving in waking thought. Adult dreams were further diminished in importance by the Activation-Synthesis hypothesis of Hobson and McCarley (1977), whose work located the area of the brain where REM sleep turns on not where the higher mental processes take place but in the lower brain stem. In this theory dreams only acquire meaning after the fact, by association to what are initially unplanned, random sensory stimuli.
3. Current Research
Interest in the 24-hour mind, the interaction of waking–sleeping–waking behavior, came back into focus with the development of the field of sleep disorder medicine. The impact of disordered sleep on psychological functioning is documented most convincingly through large epidemiological studies of insomnia and hypersomnia that involved extensive follow-up. The findings of Ford and Kamerow (1989) that both insomnia and hypersomnia are significant risk factors for new psychiatric disorders have now been replicated in several large studies. Further work has established that the onset of a new episode of waking major depression can be predicted from the presence of two weeks of persistent insomnia (Perlis et al. 1997). These findings have sparked the development of psychological interventions for the control of insomnia in an effort to prevent the development of these psychiatric disorders. Some of these are behavioral programs that work directly to manipulate sleep
to improve its continuity; others are psychotherapeutic in nature, such as interpersonal therapy and cognitive behavioral therapy, which address the interaction patterns and the emotional and cognitive styles that are dysfunctional in depressed persons. Again, it was the observation that morning mood on awakening in major depression was frequently low that suggested an investigation of the REM sleep and dreaming of the depressed person to test whether defects in this system prevented overnight mood regulation or emotional adaptation. The work of Kupfer and his colleagues (Reynolds and Kupfer 1987) established that there are several REM sleep deviations associated with major depression not seen in normal individuals. This led to a variety of manipulations in an attempt to correct them. Vogel’s (Vogel et al. 1975) studies of extensive REM deprivation interspersed with recovery nights were most successful in improving waking mood and in bringing about remission without further treatment. Cartwright’s work (Cartwright et al. 1998) on the effect on morning mood of various within-night dream affect patterns showed that a ‘working through’ pattern, from dreams expressing negative mood to those with predominantly positive affect, predicted later remission from depression. The recurring nightmares characteristic of posttraumatic stress disorder (PTSD) have stimulated renewed interest in addressing dreams directly in treatment programs, especially for those with long-lasting symptoms of this disorder. These methods typically involve active rehearsal of mastery strategies for the control of the disturbing dream scenario. A good deal of effort is now being devoted to discovering those attitudes, beliefs, and habitual behaviors that are implicated in maintaining poor sleep patterns, since the correction of these may help to restore normal sleep and abort the development of a major psychiatric disorder.
Studies of the psychological profiles of those exhibiting various sleep disorders have most often employed the Minnesota Multiphasic Personality Inventory (MMPI). These show that insomnia patients have more scale elevations above the norms than do matched controls, especially on the scales indicating neurotic personality characteristics, such as phobias and obsessive compulsive disorders. Generally, insomnia patients appear to internalize tensions rather than express them, and may often somatize these and express them as pain syndromes. In terms of demographic variables, most studies report that more women than men complain of insomnia. Rates are higher in those who are separated or widowed rather than single or married, and among the unemployed. In addition, insomnia is more common in those of middle to lower socioeconomic status than in the highest class. This picture suggests that a loss of the time structure of work and of a love relationship may be precipitating psychological factors in disrupting sleep/wake rhythms.
Other sleep disorders, such as periodic limb movements of sleep and sleep-related breathing disorders, are not related to personality but do have an impact on relationships, due to an inability to share the bed at night, and on waking performance. Severe levels of any sleep disorder, whether insomnia or hypersomnia, limit work efficiency, cognitive clarity, and emotional stability.
4. Methodological Issues
Technological problems still hamper progress in this field. Sleep laboratory studies are expensive and in many ways unnatural. Home studies, while possible, do not allow repairs or adjustments to be made as needed. Subjects must be awakened to obtain reports of their ongoing mental activity in order to investigate dream content. This aborts the end of each REM period, thus truncating the dream story. Dream reports are also limited by the subject’s ability to translate his/her sensory experience during sleep into waking verbal terms. Recent studies using positron emission tomography (PET) to image brain activity during REM sleep have established that dreaming sleep differs from waking and from non-REM sleep in the activation of the limbic and paralimbic systems in the presence of lower activity in the dorsal–lateral frontal areas (Maquet et al. 1996, Nofzinger et al. 1997). This is interpreted as confirming that dreaming engages the emotional and drive-related memory systems in the absence of higher systems of planning and executive control. This gives new impetus to the study of dreams as a unique mental activity whose place in the 24-hour psychology may now move ahead on firmer ground.
5. Probable Future Directions
Work initiated by Solms (1997) on the effect of localized brain lesions on dream characteristics suggests a new line of investigation in mapping how the brain constructs dreams. This strategy may give more power to the effort to understand some of the dream disorders that are currently less well understood, perhaps even the hallucinations of psychoses.
See also: Sleep: Neural Systems; Sleep States and Somatomotor Activity
Bibliography
American Sleep Disorders Association 1997 The International Classification of Sleep Disorders Diagnostic and Coding Manual. Rochester, MN
Cartwright R, Young M, Mercer P, Bears M 1998 The role of REM sleep and dream variables in the prediction of remission from depression. Psychiatry Research 80: 249–55
Ford D E, Kamerow D B 1989 Epidemiologic study of sleep disturbances and psychiatric disorders: An opportunity for prevention? Journal of the American Medical Association 262: 1479–84
Foulkes D 1982 Children’s Dreams: A Longitudinal Study. Wiley and Sons, New York
Hobson J A, McCarley R W 1977 The brain as a dream state generator: An activation-synthesis hypothesis of the dream process. American Journal of Psychiatry 134: 1335–48
Maquet P, Peters J-M, Aerts J, Delfiore G, Degueldre C, Luxen A, Franck G 1996 Functional neuroanatomy of human rapid-eye-movement sleep and dreaming. Nature 383: 163–6
Nofzinger E A, Mintun M A, Wiseman M B, Kupfer D, Moore R 1997 Forebrain activation in REM sleep: an FDG PET study. Brain Research 770: 192–201
Perlis M L, Giles D E, Buysse D J, Tu X, Kupfer D 1997 Self-reported sleep disturbance as a prodromal symptom in recurrent depression. Journal of Affective Disorders 42: 209–12
Reynolds C F, Kupfer D J 1987 Sleep research in affective illness: state of the art circa 1987. Sleep 10: 199–215
Solms M 1997 The Neuropsychology of Dreams: A Clinicoanatomical Study. L. Erlbaum Associates, Mahwah, NJ
Vogel G, Thurmond A, Gibbsons P, Sloan K, Boyd M, Walker M 1975 REM sleep reduction effects on depressed syndromes. Archives of General Psychiatry 32: 765–7
R. D. Cartwright
Sleep States and Somatomotor Activity
The drive to sleep has an awesome power over our lives. If we go without sleep or drastically reduce it, the desire to sleep quickly becomes more important than life itself. The willingness to go to extraordinary efforts in order to obtain even a little sleep demonstrates vividly that sleep is a vital and necessary behavior. In fact, we often allow ourselves to be placed in life-threatening situations in order to satisfy the need to sleep. But what is sleep, really? All of us feel, at some basic level, that we really do understand what sleep is all about. We certainly know how it feels, and the ‘meaning’ of the word is generally accepted in ordinary conversation. But, in point of fact, no one knows what sleep really is. Once we proceed beyond a simple description of an apparently quiet state that is somehow different from wakefulness, we find that the following major questions about the nature of sleep are largely unanswered: Why do we sleep?; What is (are) the function(s) of sleep?; What are the mechanisms that initiate and sustain sleep?; and, How, when and why do we wake up? In order to understand ‘sleep,’ we must clarify the key processes and mechanisms that initiate and maintain this state. From a purely descriptive point of view, we simply need to know what is going on in the brain
(and body) during sleep. But first, we must agree on how to describe and/or define those behaviors that we intuitively understand as constituting the sleep states.
1. NREM and REM Sleep
Traditionally, three physiological measures are employed for describing the states of sleep in controlled laboratory situations. They are the EEG (electroencephalogram, which reflects the electrical activity of the cerebral cortex), the EOG (electrooculogram, which is a recording of eye movements), and the EMG (electromyogram, which is an indication of the degree of muscle activity, i.e., contraction). Together, these measures are used to define and differentiate sleep in mammals, which comprises two very different states. These states are called nonrapid eye movement (NREM) sleep and rapid eye movement (REM) sleep: each is nearly as different from the other as both are distinct from wakefulness. Generally speaking, the EEG during NREM sleep consists of slow large-amplitude waves. Involuntary, slow, rolling eye movements occur during the transition from drowsy wakefulness to NREM sleep; otherwise, NREM essentially lacks eye movements. Few motor events occur during NREM sleep; however, body repositioning and occasionally some motor
behavior, such as sleepwalking, talking, eating or cooking, take place during NREM sleep. In general, motor processes that occur during NREM sleep are comparable to those present during very relaxed wakefulness. The cortical EEG during REM sleep closely resembles the EEG of active wakefulness, i.e., it is of low voltage and high frequency. This is a surprising finding, considering that these two states are so dramatically different from a behavioral point of view. Bursts of rapid eye movements occur phasically during REM sleep. These eye movements are similar to those which would occur during wakefulness if one were looking at an object that went from right to left and then from left to right, rapidly and continuously, for a couple of seconds at a time. During REM sleep there is also a nonreciprocal flaccid paralysis, i.e., atonia, of the major muscle groups (with the principal exception being certain muscles used in respiration). However, at intervals that usually coincide with the phasic periods of rapid eye movements, there are brief muscle twitches that involve principally, but not exclusively, the distal muscles (e.g., the muscles of the fingers, toes, and face). REM sleep is subdivided into ‘phasic’ and ‘tonic’ periods. During phasic REM sleep periods there are brief episodes of rapid eye movements and muscle twitches as well as muscle atonia (Fig. 1). Tonic REM
Figure 1 Intracellular recording from a trigeminal jaw-closer motoneuron: correlation of membrane potential and state changes. The membrane potential increased rather abruptly at 3.5 min in conjunction with the decrease in neck muscle tone and transition from quiet (NREM) to active (REM) sleep. At 12.5 min, the membrane depolarized and the animal awakened. After the animal passed into quiet sleep again, a brief, aborted episode of active sleep occurred at 25.5 min that was accompanied by a phasic period of hyperpolarization. A minute later the animal once again entered active sleep, and the membrane potential increased. EEG trace, marginal cortex, membrane potential band pass on polygraphic record, DC to 0.1 Hz. PGO, ponto-geniculo-occipital potential (reprinted from Chase MH, Chandler SH, Nakamura Y 1980 Intracellular determination of membrane potential of trigeminal motoneurons during sleep and wakefulness. Journal of Neurophysiology 44: 349–58)
periods occur when the preceding phasically occurring eye and muscle twitches are absent, but there is still persistent muscle atonia.
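The qualitative distinctions drawn in this section between wakefulness, NREM sleep, and phasic and tonic REM sleep can be summarized as a small decision rule. The following is only an illustrative sketch: the `Epoch` structure, its field names, and the `classify_state` function are hypothetical conveniences, and real polygraphic scoring relies on quantitative criteria that are not captured here.

```python
from dataclasses import dataclass


@dataclass
class Epoch:
    """Qualitative polygraph features for one recording epoch (hypothetical structure)."""
    eeg_slow_large_waves: bool      # EEG: slow, large-amplitude waves dominate
    rapid_eye_movements: bool       # EOG: bursts of rapid eye movements
    muscle_atonia: bool             # EMG: tone of the major muscle groups is absent
    muscle_twitches: bool = False   # EMG: brief twitches, mainly of distal muscles


def classify_state(epoch: Epoch) -> str:
    """Assign a behavioral state using the qualitative description in the text."""
    # A slow, large-amplitude EEG is the hallmark of NREM sleep.
    if epoch.eeg_slow_large_waves:
        return "NREM"
    # From here on the EEG is of low voltage and high frequency, as in both
    # active wakefulness and REM sleep; atonia is what separates the two.
    if epoch.muscle_atonia:
        # Phasic REM: rapid eye movements and/or twitches on a background of atonia.
        if epoch.rapid_eye_movements or epoch.muscle_twitches:
            return "phasic REM"
        # Tonic REM: persistent atonia without the phasic events.
        return "tonic REM"
    return "wakefulness"
```

The rule makes explicit why the EEG alone cannot separate REM sleep from active wakefulness: in this sketch it is the EMG that decides between them once the EEG is of low voltage and high frequency.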
2. Somatomotor Activity During REM Sleep
Because there is very little motor behavior during NREM sleep, the present overview of somatomotor activity during sleep consists mainly of a description of motor events which take place during REM sleep and entails: (a) an exploration of the mechanisms that control muscle activity during this sleep state, and (b) a description of the ‘executive’ mechanisms that are responsible for these REM sleep-related patterns of motor control (Chase and Morales 1994). In order to understand the mechanisms responsible for the control of motor activity during REM sleep, it is important to first describe the changes in muscle fiber activity that occur during this state. The passage from wakefulness to NREM sleep is accompanied by a decrease in the degree of contraction of somatic muscle fibers; that is, there is a decrease in muscle tone, i.e., hypotonia. Surprisingly, during the state of REM sleep there occurs somatomotor atonia, or the complete lack of tone of many somatic muscles (Fig. 1). In order to understand how atonia of the somatic musculature during REM sleep is achieved, it is necessary to understand how the activity of muscle fibers is, in general, controlled. First of all, muscles (i.e., muscle fibers) do one thing: they contract/shorten. When muscle fibers contract, they do so because command signals, which are really trains of action potentials, travel from the cell bodies of motoneurons along their axons to activate muscle fibers by changing the permeability of the muscle fiber membrane. A great many muscle groups are constantly contracting, at least to some degree, when we maintain a specific posture or perform a movement while we are awake. Thus, tone, or some degree of muscle fiber contraction, is dependent on the asynchronous, sustained discharge of motoneurons and the action potentials that travel down their axons to initiate the contraction of muscle fibers.
There is a gradual but slight decline in muscle tone during NREM sleep compared with wakefulness. During REM sleep, the suppression of muscle activity and motoneuron discharge is so strikingly potent that it results in a complete loss of muscle tone, or atonia. Thus, the key to understanding atonia during REM sleep resides in understanding the manner in which the activity of motoneurons is reduced or eliminated during this state. There are two basic mechanisms that could, theoretically, be responsible for the elimination of motoneuron discharge during REM sleep: one is the direct inhibition of motoneurons (i.e., postsynaptic inhibition) and the second is a cessation of excitatory
Figure 2 High-gain intracellular recording of the membrane potential activity of a tibial motoneuron during wakefulness (A), quiet (NREM) sleep (B), and active (REM) sleep (C). Note the appearance, de novo during active sleep, of large-amplitude, repetitively occurring inhibitory postsynaptic potentials. Two representative potentials, which were aligned by their origins, are shown at higher gain and at an expanded time base (C1, 2). These potentials were photographed from the screen of a digital oscilloscope. The analog to digital conversion rate was 50 µs/min. During these recordings the membrane potential during active sleep was −67.0 mV; the antidromic action potential was 78.5 mV (reprinted from Morales FR, Boxer P, Chase MH 1987 Behavioral state-specific inhibitory postsynaptic potentials impinge on cat lumbar motoneurons during active sleep. Experimental Neurology 98: 418–35)
input to motoneurons (i.e., disfacilitation). Experimentally, these processes have been explored in the laboratory by recording, intracellularly, from individual motoneurons during REM sleep. During REM sleep, it has been found that motoneurons become hyperpolarized, and as a consequence, they become relatively unresponsive to excitatory input. Analyses of the various membrane properties of these motoneurons indicate that their lack of responsiveness is due primarily to postsynaptic inhibition (Fig. 1). Thus, it is clear that postsynaptic
Figure 3 Action potential generation during wakefulness (A) and a rapid eye movement period of active (REM) sleep (B). Gradual membrane depolarization (bar in A′) preceded the development of action potentials during wakefulness, whereas a strong hyperpolarizing drive was evident during a comparable period of active sleep (bar in B′). This difference was also observed preceding the subsequent generation of each action potential during the rapid eye movement episode. Action potentials in A′ and B′ are truncated owing to the high gain of the recording. Resting membrane potentials are −58 mV in (A) and −63 mV in (B). Data are unfiltered; records were obtained from a single tibial motoneuron (reprinted from Chase MH, Morales FR 1982 Phasic changes in motoneuron membrane potential during REM periods of active sleep. Neuroscience Letters 34: 177–82)
inhibition is the primary mechanism that is responsible for atonia of the somatic musculature during this state. Postsynaptic inhibition of motoneurons during REM sleep occurs when certain neurons which are located in the brainstem liberate glycine at their point of contact with motoneurons, which is called the synapse. When glycine is liberated, motoneurons respond by generating changes in their membrane properties and their ability to discharge, which are recorded as ‘inhibitory’ postsynaptic potentials. In an effort to elucidate the bases for motoneuron inhibition (which is viewed behaviorally as atonia) during REM sleep, spontaneous postsynaptic potentials, recorded from motoneurons, were examined. A unique pattern of sleep-specific inhibitory postsynaptic potentials was found to bombard motoneurons during REM sleep; they were of very large amplitude and occurred in abundant numbers (Fig. 2). Thus, it is clear that there is a unique set of presynaptic cells that are responsible for the REM sleep-specific inhibitory postsynaptic potentials that bombard motoneurons. For brainstem motoneurons (and likely
for spinal cord motoneurons as well), these presynaptic inhibitory neurons are located in the ventromedial medulla. In summary, postsynaptic inhibition is the principal process that is responsible for the atonia of the somatic musculature during REM sleep. This postsynaptic process is dependent on the presence of REM sleep-specific inhibitory potentials, which arise because glycine (an inhibitory neurotransmitter) is released by presynaptic neurons onto the surface of postsynaptic motoneurons. All of the inhibitory phenomena that have been described in the previous sections are not only present but also enhanced during the phasic rapid eye movement periods of REM sleep. How, then, could there be twitches and jerks of the eyes and limbs during phasic rapid eye movement periods when there is also enhanced inhibition? The answer is simple: most of these periods are accompanied not only by increased motoneuron inhibition but also by potent motor excitatory drives that impinge on motoneurons. Thus, during these phasic periods of REM sleep, even though there is a suppression of the activity of motoneurons, there occurs, paradoxically, the concurrent excitation of motoneurons whose discharge results in the activation of muscle fibers (which leads to twitches and jerks, mainly of the limbs and fingers) (Fig. 3). These patterns of activation reflect descending excitatory activity emanating from different nuclei in the pons and possibly from the forebrain as well. Consequently, from time to time, and for reasons as yet unknown, excitatory drives overpower the enhanced inhibitory drives during REM sleep. When this occurs, motoneurons discharge and the muscle fibers that they innervate contract. Thus, there is an increase in excitatory drives that is actually accompanied by an increase in inhibitory drives (Fig. 3).
When motoneurons discharge during the REM periods of active sleep, contraction of the muscles that they innervate is unusual because the resultant movements are abrupt, twitchy, and jerky; they are also without apparent purpose. Thus, from the perspective of motoneurons, REM sleep can be characterized as a state abundant in the availability of strikingly potent patterns of postsynaptic inhibition and, during REM periods, by new periods of excitation and also by enhanced postsynaptic inhibition.
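The apparent paradox described above, twitches occurring while inhibition is enhanced, follows from the fact that what matters is the balance of drives at the motoneuron. The toy calculation below is a sketch under strong assumptions: drives are summed linearly, the firing threshold and all drive magnitudes are hypothetical, and only the two resting potentials (−58 mV in wakefulness, −63 mV in REM sleep) are taken from the caption of Fig. 3. The function name `motoneuron_discharges` is likewise an illustrative invention.

```python
def motoneuron_discharges(resting_mv: float,
                          excitatory_drive_mv: float,
                          inhibitory_drive_mv: float,
                          threshold_mv: float = -50.0) -> bool:
    """Toy model: the cell discharges when summed drives reach threshold.

    Linear summation and the -50 mV threshold are simplifying assumptions,
    not measurements reported in the text.
    """
    membrane_mv = resting_mv + excitatory_drive_mv - inhibitory_drive_mv
    return membrane_mv >= threshold_mv


# Hypothetical drive magnitudes chosen only to illustrate the three situations.
wakefulness = motoneuron_discharges(-58.0, excitatory_drive_mv=10.0, inhibitory_drive_mv=0.0)
tonic_rem = motoneuron_discharges(-63.0, excitatory_drive_mv=10.0, inhibitory_drive_mv=8.0)
phasic_rem = motoneuron_discharges(-63.0, excitatory_drive_mv=30.0, inhibitory_drive_mv=15.0)
```

In this sketch the phasic REM case has more inhibition than the tonic case, yet the cell still discharges because the excitatory drive has grown by a larger amount, which is the situation the text describes as excitatory drives overpowering the enhanced inhibitory drives.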
3. The Executive Control Mechanism for Somatomotor Activity During REM Sleep
The preceding description of the neural control of somatomotor activity during REM sleep can also be viewed from the perspective of the central nervous system executive mechanisms that are responsible for this physiological pattern of activity. A critical structure
that is involved in the generation and/or maintenance of REM sleep, and especially the motor inhibitory aspects of this state, is the nucleus pontis oralis, which is located in the rostral portion of the pontine reticular tegmentum. This area is thought to be activated by cholinergic fibers that arise from cell bodies located in the laterodorsal pontine tegmentum and the pedunculopontine tegmentum. There is evidence that a neuronal mechanism that resides within or is part of the nucleus pontis oralis is involved in the generation of wakefulness (and motor excitation) as well as REM sleep (and motor inhibition). For example, studies of the nucleus pontis oralis have shown that it is central to the phenomenon of reticular response reversal. This is a phenomenon wherein stimulation of this pontine nucleus results in an increase in somatic reflex activity during wakefulness, but, remarkably, the identical stimulus during REM sleep yields potent postsynaptic inhibition of the same somatic reflexes. We have suggested the existence of a neuronal switch, based on the phenomenon of reticular response reversal, that is responsible for controlling the animal’s state and somatomotor inhibitory processes during wakefulness as well as REM sleep. The existence of this switch is based on the hypothesis that wakefulness occurs when inhibitory (GABAergic) neurons in the nucleus pontis oralis suppress the activity of REM sleep neurons that are also located in, or in the vicinity of, the nucleus pontis oralis (Xi et al. 1999). Thus, when REM sleep-controlling neurons discharge in this and related areas, the result is the generation of this state and its attendant patterns of somatomotor inhibition. In support of the preceding hypothesis, we have found that when the inhibitory neurotransmitter GABA is placed in the nucleus pontis oralis, prolonged periods of wakefulness and heightened motor activity are elicited in cats.
Conversely, the application of bicuculline, a GABA-A receptor antagonist, results in the occurrence of episodes of REM sleep and somatomotor inhibition of very long duration. We therefore conclude that a pontine GABAergic system plays a critical role in the control of wakefulness and REM sleep. There are also recent data which demonstrate that when cells of the nucleus pontis oralis discharge, they initiate a cascade of events, the first one being the excitation of a group of premotor inhibitory neurons in the lower part of the brainstem, i.e., in the medulla (Morales et al. 1999). Fibers from these medullary neurons release glycine onto motoneurons, which results in atonia of the somatic musculature. When the motor inhibitory mechanisms of REM sleep cease to function properly, and there is still ongoing motoneuron excitation (and/or enhanced motor excitation during the phasic periods of REM sleep), a syndrome occurs in cats that is called ‘REM without atonia.’ In humans, a comparable syndrome is called ‘REM Behavior Disorder.’ When humans
and cats with this syndrome enter REM sleep, they begin to twitch violently, jump about and appear to act out their dreams; their sleep is quite disrupted and they are certainly a threat both to themselves and, of course, to others. It is clear that the motor inhibition that occurs during REM sleep plays a critical role in the maintenance of this state. There is also no doubt that the simple release of glycine onto motoneurons during REM sleep, which results in atonia, i.e., a lack of muscle activity, has importance and implications of far-reaching and widespread proportions.
Bibliography
Chase M H, Chandler S H, Nakamura Y 1980 Intracellular determination of membrane potential of trigeminal motoneurons during sleep and wakefulness. Journal of Neurophysiology 44: 349–58
Chase M H, Morales F R 1982 Phasic changes in motoneuron membrane potential during REM periods of active sleep. Neuroscience Letters 34: 177–82
Chase M H, Morales F R 1994 The control of motoneurons during sleep. In: Kryger M H, Roth T, Dement W C (eds.) Principles and Practice of Sleep Medicine, 2nd edn. W.B. Saunders, Philadelphia, PA, pp. 163–75
Morales F R, Boxer P, Chase M H 1987 Behavioral state-specific inhibitory postsynaptic potentials impinge on cat lumbar motoneurons during active sleep. Experimental Neurology 98: 418–35
Morales F R, Sampogna S, Yamuy J, Chase M H 1999 c-fos expression in brainstem premotor interneurons during cholinergically-induced active sleep in the cat. Journal of Neuroscience 19: 9508–18
Xi M-C, Morales F R, Chase M H 1999 Evidence that wakefulness and REM sleep are controlled by a GABAergic pontine mechanism. Journal of Neurophysiology 82: 2015–19
M. H. Chase, J. Yamuy and F. R. Morales
Sleep: Neural Systems
Since we spend roughly one third of our lives asleep, it is remarkable that so little attention has been paid to the capacity of sleep to organize the social behavior of animals including humans. To grasp this point, try to imagine life without sleep: no bedrooms and no beds; no time out for the weary parent of a newborn infant; and nothing to do all through the long, cold, dark winter nights. Besides demonstrating the poverty of the sociobiology of sleep, this thought experiment serves to cast the biological details which follow in the broader terms of social adaptation. Family life, reproduction, childrearing, and even economic commerce all owe their temporal organization to the power of sleep. The dependence of these aspects of our
lives on sleep also underlines its importance to deeper aspects of biological adaptation, such as energy regulation, that are only beginning to be understood by modern scientists.
1. Behavioral Physiology of Sleep
Sleep is a behavioral state of homeothermic vertebrate mammals defined by: (a) characteristic relaxation of posture; (b) raised sensory thresholds; and (c) distinctive electrographic signs. Sleep tends to occur at certain times of day and becomes more probable as sleeplessness is prolonged. These two organizing principles are the pillars of Alex Borbély’s two-factor model: factor one captures the temporal or ‘circadian’ aspect while factor two captures the energetic or ‘homeostatic’ aspect (see Sleep Behavior). Sleep is usually associated with a marked diminution of motor activity and with the assumption of recumbent postures. Typically the eyes close and the somatic musculature becomes hypotonic. The threshold to external stimulation increases and animals become progressively less responsive to external stimuli as sleep deepens. The differentiation of sleep from states of torpor in animals that cannot regulate core body temperature has an important phylogenetic correlation with the neural structures mediating the electrographic signs of sleep. These include the cerebral cortex and thalamus, whose complex evolution underlies the distinctive electroencephalographic features of sleep in the higher vertebrate mammals. Sleep also constitutes the state of entry to and exit from hibernation in mammalian species that regulate temperature at lower levels during winter. In humans, who can report upon the subjective concomitants of these outwardly observable signs of sleep, it is now clear that mental activity undergoes a progressive and systematic reorganization throughout sleep. On first falling asleep, individuals may progressively lose awareness of the outside world and experience microhallucinations and illusions of movement of the body in space; after sleep onset, mental activity persists but is described as thoughtlike and perseverative, if it can be recalled at all upon awakening.
(For a description of the subsequent changes in mental activity, see the entry on dreaming.)
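The two-factor model mentioned in this section can be made concrete with a small numerical sketch. The version below assumes the common formulation in which homeostatic sleep pressure rises toward an upper limit during waking and decays exponentially during sleep, while the circadian factor is a 24-hour sinusoid. The function names, the time constants, the amplitude, and the peak hour are illustrative assumptions and are not given in the text.

```python
import math


def homeostatic_pressure(start: float, hours: float, asleep: bool,
                         tau_rise_h: float = 18.2,
                         tau_decay_h: float = 4.2) -> float:
    """Factor two: sleep pressure on a 0..1 scale after `hours` in one state.

    Time constants are illustrative values assumed for this sketch.
    """
    if asleep:
        # Pressure dissipates exponentially during sleep.
        return start * math.exp(-hours / tau_decay_h)
    # Pressure saturates toward 1.0 as wakefulness is prolonged.
    return 1.0 - (1.0 - start) * math.exp(-hours / tau_rise_h)


def circadian_drive(clock_hour: float, amplitude: float = 0.1,
                    peak_hour: float = 16.0) -> float:
    """Factor one: 24-hour sinusoid; amplitude and peak hour are assumptions."""
    return amplitude * math.cos(2.0 * math.pi * (clock_hour - peak_hour) / 24.0)


def sleep_propensity(pressure: float, clock_hour: float) -> float:
    """Propensity is high when pressure is high and the circadian drive is low."""
    return pressure - circadian_drive(clock_hour)
```

With these definitions, a subject who has been awake for 16 hours carries more pressure than one awake for 8 hours, and the same pressure yields a higher propensity at a clock time when the circadian drive is at its trough. This mirrors the statement that sleep tends to occur at certain times of day and becomes more probable as sleeplessness is prolonged.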
2. Electro-physiological Aspects of Sleep
There is a complex organization of behavioral, physiological, and psychological events within each sleep bout. To detect this organization, it is necessary to record the electroencephalogram (EEG) from the surface of the head (or directly from the cortical
structures of the brain), to record the movement of the eyes by means of the electrooculogram (EOG), and to record muscle tone by means of the electromyogram (EMG). These three electrographic parameters allow one to distinguish sleep from waking and to distinguish two distinctive and cyclically recurrent phases within sleep: NREM (non-rapid eye movement) and REM (rapid eye movement) sleep. NREM, also known as synchronized or quiet sleep, is characterized by a change in the EEG from a low-amplitude, high-frequency to a high-amplitude, low-frequency pattern (see Fig. 1a). The degree to which the EEG is progressively synchronized (that is, of high voltage and low frequency) can be subdivided into four stages in humans. In stage one, the EEG slows to the theta frequency range (4–7 cycles per second or cps) and is of low voltage (<50 µV) and arrhythmic. Stage two is characterized by the evolution of distinctive sleep spindles composed of augmenting and decrementing waves at a frequency of 12–15 cps and peak amplitudes of 100 µV. Stage three is demarcated by the addition to the spindling pattern of high-voltage (>100 µV) slow waves (1–4 cps), with no more than 50 percent of the record occupied by the latter. In stage four, the record is dominated by high-voltage (150–250 µV) slow waves (1–3 cps). At the same time that the EEG frequency is decreasing and the voltage increasing, muscle tone progressively declines and may be lost in most of the somatic musculature. Slow rolling eye movements first replace the rapid saccadic eye movements of waking and then also subside, with the eyes finally assuming a divergent upward gaze (Fig. 1a). After varying amounts of time, the progressive set of changes in the EEG reverses itself and the EEG finally resumes the low-voltage, fast character previously seen in waking.
Instead of waking, however, behavioral sleep persists; muscle tone, at first passively decreased, is now actively inhibited; and there arise in the electrooculogram stereotyped bursts of saccadic eye movement called rapid eye movements (the REMs, which give this sleep state the name REM sleep). Major body movements are quite likely to occur during the NREM–REM transition. The REM phase of sleep has also been called activated sleep (to signal the EEG desynchronization) and paradoxical sleep (to signal the maintenance of increased threshold to arousal in spite of the activated brain). Human consciousness is associated with the low-voltage, fast EEG activity of waking and REM sleep but its unique character in dreaming depends upon other aspects of REM neurophysiology (see Fig. 1b). In all mammals (including aquatic, arboreal, and flying species) sleep is organized in this cyclic fashion: sleep is initiated by NREM and punctuated by REM at regular intervals. Most animals compose a sleep bout out of three or more such cycles, and in mature humans the average nocturnal sleep period consists of four to five such cycles, each of 90–100 minutes
Figure 1 (a) Behavioral states in humans. The states of waking, NREM, and REM sleep have behavioral, polygraphic, and psychological manifestations which are depicted here. In the behavioral channel, posture shifts (detectable by time lapse photography or video) can be seen to occur during waking and in concert with phase changes of the sleep cycle. Two different mechanisms account for sleep immobility: disfacilitation (during stages I–IV of NREM sleep) and inhibition (during REM sleep). In dreams, we imagine that we move, but we do not. The sequence of these stages is schematically represented in the polygraph channel and sample tracings are also shown. Three variables are used to distinguish these states: the electromyogram (EMG), which is highest in waking, intermediate in NREM sleep, and lowest in REM sleep; and the electroencephalogram (EEG) and electrooculogram (EOG), which are both activated in waking and REM sleep, and inactivated in NREM sleep. Each sample record is about 20 s long. Other subjective and objective state variables are described in the three lower channels. (b) Sample records of human sleep (Snyder and Hobson)
duration. After a prolonged period of wake activity (as in humans) the first cycles are characterized by a preponderance of high-voltage, slow wave activity (i.e., the NREM phase is enhanced early in the sleep bout) while the last cycles show more low-voltage, fast wave activity (i.e., the REM phase is enhanced late in the sleep bout). The period length is fixed across any and all sleep periods in a given individual but is shorter in immature and smaller animals, indicating a correlation of NREM–REM cycle length with brain size.
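The four-stage subdivision of human NREM sleep given in this section lends itself to a simple rule-based sketch. The function below is illustrative only: its name and parameters are hypothetical, it takes as given that the epoch already belongs to NREM sleep, and the lower bound on slow-wave occupancy for stage three (`stage3_min_fraction`) is an assumption, since the text states only the 50 percent upper bound.

```python
def nrem_stage(slow_wave_fraction: float, has_spindles: bool,
               stage3_min_fraction: float = 0.2) -> int:
    """Return the NREM stage (1-4) for an epoch already known to be NREM.

    slow_wave_fraction: share of the record occupied by high-voltage slow waves.
    has_spindles: whether 12-15 cps sleep spindles are present.
    stage3_min_fraction: assumed lower bound for stage three (not from the text).
    """
    if not 0.0 <= slow_wave_fraction <= 1.0:
        raise ValueError("slow_wave_fraction must lie between 0 and 1")
    # Stage four: the record is dominated by high-voltage slow waves.
    if slow_wave_fraction > 0.5:
        return 4
    # Stage three: slow waves are present but fill no more than half the record.
    if slow_wave_fraction >= stage3_min_fraction:
        return 3
    # Stage two: sleep spindles without substantial slow-wave activity.
    if has_spindles:
        return 2
    # Stage one: low-voltage, arrhythmic theta activity.
    return 1
```

For example, `nrem_stage(0.35, True)` falls in stage three and `nrem_stage(0.7, True)` in stage four, reflecting the progressive synchronization of the EEG described in the text.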
3. Neural Mechanisms of Sleep
In all animals, sleep is one of a number of circadian behavioral functions controlled by an oscillator in the suprachiasmatic nucleus of the hypothalamus. It is the interaction of the intrinsic propensity to sleep at about 24-hour (i.e., circadian) intervals with the daily cycles of extrinsic light and temperature that gives sleep its diurnal organization. The exact mechanism by which hypothalamic structures act to potentiate sleep at one or another diurnal period is unknown, but increasing evidence suggests that peptide hormones are involved in regulating this coordination. The hypothesis that sleep is hormonally regulated is an old one and is linked to the subjective impression of fatigue with its implied humoral basis. The fact that the sleep drive increases with the duration of time spent awake in both normal and sleep-deprived animals clearly indicates the homeostatic regulation function of sleep, and experiments begun by Piéron at the beginning of the twentieth century suggested that a circulating humoral factor might mediate this propensity. Subsequent efforts to test the humoral hypothesis have involved the isolation of an S-muramyl peptide as a slow wave sleep-enhancing factor found both in the cerebrospinal fluid of sleep-deprived animals and in normal human urine. It remains to be established that this so-called factor S is involved in physiological sleep mediation; this is most important since factor S is known to be a product of bacterial cell walls but is not produced by the brain. Whatever its mediating mechanism, the onset of sleep has long been linked by physiologists to the concept of deafferentation of the brain. In early articulations of the deafferentation hypothesis it was thought that the brain essentially shut down when deprived of external inputs.
This idea was put forth by Frédéric Bremer to explain the results of midcollicular brain stem transection after which the forebrain persistently manifested the EEG synchronized pattern of slow wave sleep. Bremer thought that the so-called cerveau isolé (or isolated forebrain) preparation was asleep because it had been deprived of its afferent input. That interpretation was upset by Bremer’s own subsequent finding that transection of the neuraxis at C-1 produced intense EEG desynchronization and
hypervigilance in the so-called encéphale isolé (or isolated brain) preparation. The possibility of active neural mediation of arousal by brain stem structures situated somewhere between the midcollicular and high spinal levels was thus suggested. The faint possibility that afferent input (via the trigeminal nerve) might account for the difference between the two preparations was eliminated by Giuseppe Moruzzi’s finding that the mediopontine pretrigeminal transected preparation was also hyperalert. Thus the notion of sleep mediation via passive deafferentation was gradually replaced by the idea of sleep onset control via the active intervention of brain stem neural structures. This new idea does not preclude a contribution from deafferentation; indeed, the two appear to be complementary processes. The clear articulation of the concept of active brain stem control of the states of sleep and waking came from the 1949 finding by Giuseppe Moruzzi and Horace Magoun that high-frequency stimulation of the midbrain reticular formation produced immediate desynchronization of the electroencephalogram and behavioral arousal in drowsy or sleeping animals. The most effective part of the reticular activating system was its rostral pole in the diencephalon where the circadian clock is located. These data indicated that wakefulness was actively maintained by the tonic firing of neurons in an extensive reticular domain. Arousal effects could be obtained following high-frequency stimulation anywhere in the reticular formation from the medullary to the pontine to the midbrain and diencephalic levels. The activating effect was thought to be mediated by neurons in the thalamus, which in turn relayed tonic activation to the entire neocortex.
This concept has been recently confirmed at the cellular level by Mircea Steriade, following up on the earlier work of Dominic Purpura which had indicated that the spindles and slow waves of NREM sleep were a function of a thalamocortical neuronal interaction that appeared whenever brain stem activation subsided. Once Moruzzi and Magoun had substantiated the concept of active neural control of the states of waking and sleep, other evidence supporting this notion promptly followed. For example, it was found by Barry Sterman and Carmine Clemente that high-frequency stimulation of the basal forebrain could produce slow wave sleep in waking animals. When the same region was lesioned by Walle Nauta, arousal was enhanced. The cellular basis of these basal forebrain effects and their possible link to both the circadian oscillator of the hypothalamus and the arousal system of the reticular activating system are currently under intense investigation. The second dramatic demonstration that sleep states were under active neural control was Michel Jouvet’s lesion and transection work suggesting a role of the pontine brain stem in the timing and triggering of the REM phase of sleep. The pontine cat preparation
Figure 2 (a) Schematic representation of the REM sleep generation process. A distributed network involves cells at many brain levels (left). The network is represented as comprising 3 neuronal systems (center) that mediate REM sleep electrographic phenomena (right). Postulated inhibitory connections are shown as solid circles; postulated excitatory connections as open circles. In this diagram no distinction is made between neurotransmission and neuromodulatory functions of the depicted neurons. It should be noted that the actual synaptic signs of many of the aminergic and reticular pathways remain to be demonstrated, and, in many cases, the neuronal architecture is known to be far more complex than indicated here (e.g., the thalamus and cortex). Two additive effects of the marked reduction in firing rate by aminergic neurons at REM sleep onset are postulated: disinhibition (through removal of negative restraint) and facilitation (through positive feedback). The net result is strong tonic and phasic activation of reticular and sensorimotor neurons in REM sleep. REM sleep phenomena are postulated to be mediated as follows: EEG desynchronization results from a net tonic increase in reticular, thalamocortical, and cortical neuronal firing rates. PGO waves are the result of tonic disinhibition and phasic excitation of burst cells in the lateral pontomesencephalic tegmentum. Rapid eye movements are the consequence of phasic firing by reticular and vestibular cells; the latter (not shown) directly excite oculomotor neurons. Muscular atonia is the consequence of tonic postsynaptic inhibition of spinal anterior horn cells by the pontomedullary reticular formation. Muscle twitches occur when excitation by reticular and pyramidal tract motorneurons phasically overcomes the tonic inhibition of the anterior horn cells.
Anatomical abbreviations: RN, raphe nuclei; LC, locus coeruleus; P, peribrachial region; FTG, gigantocellular tegmental field; FTC, central tegmental field; FTP, parvocellular tegmental field; FTM, magnocellular tegmental field; TC, thalamocortical; CT, cortical; PT cell, pyramidal cell; III, oculomotor; IV, trochlear; V, trigeminal motor nuclei; AHC, anterior horn cell. (Modified from Hobson et al. 1986) (b) Synaptic modifications of the original reciprocal interaction model based upon recent findings. Reported data from animals (cat and rodent) are shown as solid lines and some of the recently proposed putative dynamic relationships are shown as dotted lines. The exponential magnification of cholinergic output predicted by the original model can also occur in this model with mutually excitatory cholinergic-non-cholinergic interactions (7.) taking the place of the previously postulated, mutually excitatory cholinergic-cholinergic interactions. The additional synaptic details can be superimposed on this revised reciprocal interaction model without altering the basic effects of aminergic and cholinergic influences on the REM sleep cycle. For example: i. Excitatory cholinergic-non-cholinergic interactions utilizing ACh and the excitatory amino acid transmitters enhance firing of REM-on cells (6., 7.) while inhibitory noradrenergic (4.), serotonergic (3.) and autoreceptor cholinergic (1.) interactions suppress REM-on cells. ii. Cholinergic effects upon aminergic neurons are both excitatory (2.), as hypothesized in the original reciprocal interaction model, and may also operate via presynaptic influences on noradrenergic-serotonergic as well as serotonergic-serotonergic circuits (8.). iii. Inhibitory cholinergic autoreceptors (1.) could contribute to the inhibition of LDT and PPT cholinergic neurons which is also caused by noradrenergic (4.) and serotonergic (3.) inputs. iv. GABAergic influences (9., 10.) as well as other neurotransmitters such as adenosine and
(with ablation of all forebrain structures above the level of the pontine tegmentum) continued to show periodic abolition of muscle tone and rapid eye movements at a frequency identical to the REM periods of normal sleep. The implications were (a) that the necessary and sufficient neurons for timing and triggering of REM were in the pons and (b) that this system became free-running when the pontine generator was relieved of circadian restraint from the hypothalamus. Small lesions placed in the pontine tegmentum between the locus coeruleus and the pontine reticular formation resulted in periods of REM sleep without atonia: the cats exhibited all the manifestations of REM sleep except the abolition of muscle tone, indicating that intrapontine connections were essential to mediate and coordinate different aspects of the REM sleep period.
4. A Cellular and Molecular Model of the Sleep–Wake Cycle
Microelectrode studies designed to determine the cellular mechanism of these effects have indicated that REM sleep is actively triggered by an extensive set of executive neurons ultimately involving at least the oculomotor, the vestibular, and the midbrain, pontine, and medullary reticular nuclei (see Fig. 2). The intrinsic excitability of this neuronal system appears to be lowest in waking, to increase progressively throughout NREM sleep, and to reach its peak during REM sleep. Cells in this system have thus been designated REM-on cells. In contrast, neurons in the aminergic brain stem nuclei (the serotoninergic dorsal raphe and the catecholaminergic locus coeruleus and peribrachial region) have reciprocally opposite activity curves. These aminergic REM-off cells fire maximally in waking, decrease activity in NREM sleep, and reach a low point at REM onset. The reciprocal interaction model of state control ascribes the progressive excitability of the executive (REM-on) cell population to disinhibition by the modulatory (REM-off) cell population. How the aminergic systems are turned off is unknown, but Cliff Saper has recently suggested that GABA-ergic inhibition may arise from neurones in the hypothalamic circadian control system.
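The reciprocal interaction described above can be illustrated with a toy simulation. The sketch below assumes Lotka-Volterra-type equations, a form often used to formalize reciprocal interaction of this kind: REM-on activity excites itself and is inhibited by REM-off activity, while REM-off activity is excited by REM-on activity and inhibits itself. The function names, the rate constants `a`, `b`, `c`, `d`, the starting values, and the time step are all hypothetical choices made for illustration; none of them comes from the text or from fitted data.

```python
"""Toy sketch of reciprocal interaction between REM-on and REM-off cells.

ASSUMPTION: the Lotka-Volterra form and every parameter value here are
illustrative only; they are not taken from the article.
"""


def derivatives(rem_on, rem_off, a=1.0, b=1.0, c=1.0, d=1.0):
    """Return the rates of change of REM-on and REM-off activity."""
    # REM-on cells excite themselves (a) and are inhibited by REM-off cells (b).
    d_on = a * rem_on - b * rem_on * rem_off
    # REM-off cells inhibit themselves (c) and are excited by REM-on cells (d).
    d_off = -c * rem_off + d * rem_on * rem_off
    return d_on, d_off


def simulate(rem_on=0.5, rem_off=1.0, dt=0.01, steps=3000):
    """Integrate with classical 4th-order Runge-Kutta; return (t, on, off) tuples."""
    trace = [(0.0, rem_on, rem_off)]
    for i in range(1, steps + 1):
        k1 = derivatives(rem_on, rem_off)
        k2 = derivatives(rem_on + 0.5 * dt * k1[0], rem_off + 0.5 * dt * k1[1])
        k3 = derivatives(rem_on + 0.5 * dt * k2[0], rem_off + 0.5 * dt * k2[1])
        k4 = derivatives(rem_on + dt * k3[0], rem_off + dt * k3[1])
        rem_on += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        rem_off += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        trace.append((i * dt, rem_on, rem_off))
    return trace


if __name__ == "__main__":
    # Print a coarse sample of the trajectory (arbitrary units).
    for t, on, off in simulate()[::300]:
        print(f"t={t:5.1f}  REM-on={on:.3f}  REM-off={off:.3f}")
```

In this sketch the two populations rise and fall out of phase: REM-on activity grows while REM-off activity is low, and the resulting rise in REM-off activity then suppresses REM-on activity again. It is a qualitative picture only and does not include the circadian or GABA-ergic inputs mentioned in the text.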
5. Cholinergic REM Sleep Mediation
Confirming the early studies of Jouvet and Raul Hernandez-Peon, numerous investigations have found prompt and sustained increases in REM sleep signs when cholinergic agonist drugs are microinjected into the pontine brain stem of cats. The behavioral syndrome produced by this treatment is indistinguishable from the physiological state of REM sleep, except that it is precipitated directly from waking with no intervening NREM sleep. The cats can be aroused but revert to sleep immediately when stimulation abates. This drug-induced state has all the physiological signs of REM sleep (such as the low voltage fast EEG, EMG atonia, REM, and ponto-geniculo-occipital (PGO) waves), suggesting that it is a valid experimental model for physiological REM. Humans who are administered cholinergic agonist by intravenous injection during the first NREM cycle also show potentiation of REM sleep; this cholinergically enhanced REM sleep is associated, as usual, with dreaming. Mixed nicotinic and muscarinic agonists (e.g., carbachol), pure muscarinic agonists (e.g., bethanechol) and dioxylane are equally effective potentiators of REM sleep and recent results suggest that activation of the M2 acetylcholine receptor suffices to trigger REM sleep behavior. The inference that acetylcholine induces REM sleep is supported by Helen Baghdoyan’s finding that neostigmine (an acetylcholinesterase inhibitor which prevents the breakdown of endogenously released acetylcholine) also potentiates REM sleep after some delay. All of these effects are dose-dependent and are competitively inhibited by the acetylcholine antagonist atropine. REM sleep induction is obtained only by injection of cholinergic agonists into the pons: midbrain and medullary injections produce intense arousal with incessant turning and barrel-rolling behavior, respectively.
Within the pons, the response pattern differs according to the site of injection, with maximal effects obtained from a region anterodorsal to the pontine reticular formation and bounded by the dorsal raphe and the locus coeruleus nuclei. In the peribrachial pons, where both aminergic and cholinergic neurons are intermingled, carbachol injection produces continuous PGO waves by activating the cholinergic PGO burst cells of the region. But agonist induction of these drug-induced waves is state-independent and REM sleep is not potentiated in the
Figure 2 (continued) nitric oxide (see text) may contribute to the modulation of these interactions. Abbreviations: open circles, excitatory postsynaptic potentials; closed circles, inhibitory postsynaptic potentials; mPRF, medial pontine reticular formation; PPT, pedunculopontine tegmental nucleus; LDT, laterodorsal tegmental nucleus; LCα, peri-locus coeruleus alpha; 5HT, serotonin; NE, norepinephrine; ACh, acetylcholine; glut, glutamate; AS, aspartate; GABA, gamma-aminobutyric acid
first 24 hours following drug injection. It is only later that REM sleep increases, reaching its three- to fourfold peak at 48–72 hours and thereafter remaining elevated for 6–10 days. By injection of carbachol into the sub- and peri-locus coeruleus regions of the pontine reticular formation, where tonic neuronal activity can be recorded during REM sleep, atonia may be generated while waking persists, suggesting a possible animal model of human cataplexy. Cholinergic cells of the PGO wave generator region of the peribrachial pons are known to project to both the lateral geniculate nucleus and the perigeniculate sectors of the thalamic reticular nucleus, where they are probably responsible for the phasic excitation of neurons in REM sleep. This supposition is supported by the finding that the PGO waves of the LGN are blocked by nicotinic antagonists. Wolf Singer’s recent work suggests the functional significance of these internally generated signals. Singer proposes that the resulting resonance contributes to the use-dependent plasticity of visual cortical networks. Besides providing evidence that REM sleep signs are cholinergically mediated, these recent pharmacological results provide neurobiologists with a powerful experimental tool, since a REM-sleep-like state can be produced at will. Thus Francisco Morales and Michael Chase have shown that the atonia produced by carbachol injection produces the same electrophysiological changes in lumbar motoneurons as does REM sleep: a decrease in input resistance and membrane time constant, and a reduction of excitability associated with discrete IPSPs. These findings facilitate the exploration of other neurotransmitters involved in the motorneuronal inhibition during REM sleep which is mediated, at least in part, by glycine. Another important advance is the intracellular recording of pontine reticular neurons in a slice preparation by Robert McCarley’s group.
Neuronal networks can be activated cholinergically, producing the same electrophysiological changes as those found in REM sleep of intact animals. A low threshold calcium spike has been identified as the mediator of firing pattern alterations in brain-stem slice preparations. The activation process within the pons itself has also invited study, using carbachol as an inducer of REM sleep. Using the moveable microwire technique, Peter Shiromani and Dennis McGinty showed that many neurons normally activated in REM sleep either were not activated or were inactivated by the drug. Ken-Ichi Yamamoto confirmed this surprising finding and found that the proportion of neurons showing REM-sleep-like behavior was greatest at the most sensitive injection sites. To achieve greater anatomical precision in localizing the carbachol injection site, James Quattrochi conjugated the cholinergic agonist carbachol to fluorescent microspheres, thereby reducing the rate of
diffusion tenfold and allowing all neurons projecting to the injection site to be identified by retrograde labeling. Surprisingly, the behavioral effects are no less potent despite the reduced diffusion rate of the agonist. They are maximized by injection into the same anterodorsal site described by Helen Baghdoyan. That site receives input from all the major nuclei implicated in control of the sleep cycle: the pontine gigantocellular tegmental field (glutamatergic), the dorsal raphe nuclei (serotonergic), the locus coeruleus (noradrenergic), and the dorsolateral tegmental and pedunculo-pontine nuclei (cholinergic). Thus the sensitive zone appears to be a point of convergence of the neurons postulated to interact reciprocally in order to generate the NREM–REM sleep cycle. Whether this is also a focal point of reciprocal projection back to those input sites can now be investigated.
6. Conclusions
Because sleep occurs in a social context, and because sleep actively determines many aspects of that context, its study constitutes fertile ground for integration across domains of inquiry. At the basic science level, the cellular and molecular mechanisms emphasized here reach inexorably down to the level of the genome. The upward extension of sleep psychophysiology to the psychology of individual consciousness, to dreaming, and hence to the mythology that guides cultures is also advancing rapidly. Altogether absent is a sound bridge to the social and behavioral realms where sleep has been ignored almost as if it were a non-behavior and hence devoid of social significance. Yet when, where, and especially with whom we sleep roots us profoundly in our domestic and interpersonal worlds.
See also: Circadian Rhythms; Dreaming, Neural Basis of; Hypothalamus; Sleep and Health; Sleep Disorders: Psychiatric Aspects; Sleep Disorders: Psychological Aspects; Sleep States and Somatomotor Activity
Bibliography
Hobson J A 1999 Consciousness. Scientific American Library
Datta S, Calvo J, Quattrochi J, Hobson J A 1991 Long-term enhancement of REM sleep following cholinergic stimulation. NeuroReport 2: 619–22
Jones B E 1991 Paradoxical sleep and its chemical/structural substrates in the brain. Neuroscience 40: 637–56
Gerber U, Stevens D R, McCarley R W, Greene R W 1991 Muscarinic agonists activate an inwardly rectifying potassium conductance in medial pontine reticular formation neurons of the rat in vitro. Journal of Neuroscience 11: 3861–7
Steriade M, Dossi R C, Nunez A 1991 Network modulation of a slow intrinsic oscillation of cat thalamocortical neurons implicated in sleep delta waves: Cortically induced synchronization and brainstem cholinergic suppression. Journal of Neuroscience 11: 3200–17
Steriade M, McCarley R W 1990 Brainstem Control of Wakefulness and Sleep. Plenum Press, New York
J. A. Hobson
Small-group Interaction and Gender When people interact together in groups that are small enough to allow person-to-person contact (2 to 20 people), regular patterns of behavior develop that organize their relations. The study of gender in this context examines how these patterns of behavior are affected by members’ social locations as men or women and the consequences this has for beliefs about gender differences and for gender inequality in society.
1. The Emergence of the Field
The systematic, empirical study of small group interaction and gender (hereafter, gender and interaction) developed during the 1940s and 1950s out of the confluence of a general social scientific interest in small groups and structural-functionalist theorizing about the origin of differentiated gender roles for men and women. Parsons and Bales (1955) argued that small groups, like all social systems, must manage instrumental functions of adaptation and goal attainment while also attending to expressive functions of group integration and the well-being of members. Differentiated instrumental and expressive gender roles develop to solve this problem in the small group system of the family. Children internalize these functionally specialized roles as personality traits that shape their behavior in all groups. Although eventually discredited on logical and empirical grounds, the functional account of gender roles stimulated broader empirical attention to gender and interaction, not only within the family but also in task-oriented groups such as committees and work groups. Evidence accumulated that gender’s effects on interaction are complex and quite context specific (see Aries 1996; Deaux and LaFrance 1998). This evidence is inconsistent with the view that gender is best understood as stable individual traits that affect men’s and women’s behavior in a consistent manner across situations. These findings on interaction contributed to a gradual transformation of social scientific approaches to gender. From an earlier view of gender as a matter of individual personality and family relations, social science has increasingly approached gender as a broad system of social difference and inequality that is best studied as part of social stratification as well as individual development and family organization.
Considering gender in a broader context draws attention to how the interactional patterns it entails are both similar to and different from those that characterize other systems of difference and inequality such as those based on race, ethnicity, or wealth. While interaction occurs between the advantaged and the disadvantaged on each of these forms of social difference, the rate of interaction across the gender divide is distinctively high. Gender divides the population into social categories of nearly equal size; it cross-cuts kin and households and is central for reproduction. Each of these factors increases the frequency with which men and women interact and the intimacy of the terms on which they do so. Furthermore, research on social cognition has shown that people automatically sex categorize (i.e., label as male or female) any concrete person with whom they interact. As a consequence, gender is potentially at play in all interaction and interactional events are likely to be important for the maintenance or change of a society’s cultural beliefs and practices about gender. West and Zimmerman (1987) argue that for gender to persist as a social phenomenon, people must continually ‘do gender’ by presenting themselves in ways that allow others to relate to them as men or women as that is culturally defined by society. Recognition of these distinctive aspects of gender has increased attention to the role that gendered patterns of interaction play in gender inequality. Interaction mediates the process by which people form social bonds, are socially evaluated by others, gain influence, and are directed towards or away from positions of power and valued social rewards. Interaction also provides the contexts in which people develop identities of competence and sociality. To the extent that gender moderates these interaction processes, it shapes the outcomes of men and women as well as shared beliefs about gender.
2. Current Theories
Four theoretical approaches predominate. Two, social role theory and expectation states theory, single out a society’s cultural beliefs about the nature and social value of men’s and women’s traits and competencies as primary factors that create gendered patterns of interaction in that society. The theories conceptualize these beliefs in slightly different terms (i.e., as gender stereotypes or gender status beliefs) but are in substantial agreement about their explanatory impact on interaction through the expectations for behavior that develop when these beliefs are evoked by the group situation. The remaining theories take somewhat different approaches.
2.1 Social Role Theory
Eagly’s (1987) social role theory argues that widely shared gender stereotypes develop from the gender
division of labor that characterizes a society. In western societies, men’s greater participation in paid positions of higher power and status and the disproportionate assignment of nurturant roles to women have created stereotypes that associate agency with men and communion with women. In addition, the gendered division of labor gives men and women differentiated skills. When gender stereotypes are salient in a group because of a mixed sex membership or a task or context that is culturally associated with one gender, stereotypes shape behavior directly through the expectations members form for one another’s behavior. When group members enact social roles that are more tightly linked to the context than gender, such as manager and employee in the workplace, these more proximate roles control their behavior rather than gender stereotypes. Even in situations where gender stereotypes do not control behavior, however, men and women may still act slightly differently due to their gender differentiated skills. Social role theory has a broad scope that applies to interaction in all contexts and addresses assertive, power related behaviors as well as supportive or feeling related behaviors (called socioemotional behaviors). The explanations offered by the theory are not highly specific or detailed, however. The theory predicts that women will generally act more communally and less instrumentally than men in the same context, that these differences will be greatest when gender is highly salient in the situation, and that gender differences will be weak or absent when people enact formal, institutional roles.
2.2 Expectation States Theory
Another major approach, Berger and co-workers’ expectation states theory, offers more detailed explanations within a narrower scope. The theory addresses the hierarchies of influence and esteem that develop among group members in goal-oriented contexts and makes predictions about when and how gender will shape these hierarchies due to the status value gender carries in society (see Ridgeway 1993; Wagner and Berger 1997). It does not address socioemotional behavior. Gender status beliefs are cultural beliefs that one gender (men) is more status worthy and generally more competent than the other (women) in addition to each having gender specific competencies. When gender status beliefs become salient due to the group’s mixed sex or gender associated context, they create implicit expectations in both men and women about the likely competence of goal oriented suggestions from a man compared to those from a similar woman. These often unconscious expectations shape men and women’s propensity to offer their ideas to the group, to stick with those ideas when others disagree, to positively evaluate the ideas of others, and
to accept or resist influence from others, creating a behavioral influence hierarchy that usually advantages men over women in the group. In mixed sex groups with a gender-neutral task, the theory predicts that men will participate more assertively and be more influential than women. If the group task or context is culturally linked to men, their influence advantage over women will be stronger. If the task or context is associated with women’s culturally expected competencies, however, the theory predicts that women will be somewhat more assertive and influential than men. There should be no gender differences in assertive influence behavior between men and women in same sex groups with a gender-neutral task, since gender status beliefs should not be salient.
2.3 Structural Identity Theories
A set of symbolic interactionist theories, including Heise and Smith-Lovin’s affect control theory and Burke’s identity theory, forms the structural identity approach (see Ridgeway and Smith-Lovin (1999) for a review). It, too, emphasizes shared cultural meanings about gender but focuses on the identity standards those beliefs create for individuals in groups. People learn cultural meanings about what it is to be masculine or feminine and these meanings become a personal gender identity standard that they seek to maintain through their actions. Identity standards act like control systems that shape behavior. If the context of interaction causes a person to seem more masculine or feminine than his or her gender identity standard, the person reacts with compensatory behaviors (e.g., warm behaviors to correct a too masculine impression). Consequently, different actions serve to express and maintain gender identities in different situational contexts. Since people automatically sex categorize one another, this approach assumes that gender identity standards affect behavior in all interaction, although the extent of their impact varies with gender’s salience in the context. Gender is often a background identity that modifies other, more situationally prominent identities, such as woman judge. Unlike the other theories, the predictions of structural identity theories focus primarily on the behavioral reactions gender produces to events in small groups.
2.4 Two-cultures Theory
Maltz and Borker’s (1982) two cultures theory, popularized by Tannen (1990), takes a different approach. It limits its scope to informal, friendly interaction. People learn rules for friendly conversation from peers in childhood, it argues. Since these peer groups tend to
be sex-segregated and because children exaggerate gender differences in the process of learning gender roles, boys’ and girls’ groups develop separate cultures that are gender-typed. Girls learn to use language to form bonds of closeness and equality, to criticize in nonchallenging ways, and to accurately interpret the intentions of others. Boys learn to use speech to compete for attention and assert positions of dominance. In adult mixed sex groups, these rules can cause miscommunication because men and women have learned to attribute different meanings to the same behavior. Men and women’s efforts to accommodate each other in mixed-sex interaction, however, modify their behavior slightly, reducing gender differences. In same sex interaction, gendered styles of interaction are reinforced. Thus, two cultures theory predicts greater gender differences in behavior between men and women in same-sex groups than in mixed sex groups. The theory has been criticized for ignoring status and power differences between men and women and oversimplifying childhood interaction patterns (see Aries 1996).
3. Research Findings
The body of systematic evidence about men and women’s behaviors in small group interaction is large and growing. Several methodological concerns must be kept in mind in order to interpret this evidence and infer general patterns.
3.1 Methodological Issues
Interaction in small groups is an inherently local phenomenon that is embedded within larger sociocultural structures and affected by many aspects of those structures besides gender. Three methodological problems result. First, care is required to ascertain that behavioral differences between men and women in a situation are indeed due to their gender and not to differences in their other roles, power, or statuses in the situation. Second, reasonably large samples of specific interactional behaviors are necessary to infer gendered patterns in their use. Third, attention must be paid to the specific cultural context within which the group is interacting. At present, almost all systematic research has been based on small groups in the US composed predominately of white, middle class people. Since several theories emphasize the importance of cultural beliefs about gender in shaping interaction, researchers must be alert to subcultural and cross-cultural variations in these beliefs and appropriately condition their empirical generalizations about gendered patterns of interaction. The available studies that compare US populations such as African–Americans whose gender beliefs are less
polarized than the dominant beliefs find that gender differences in interaction are also less for these populations (Filardo 1996).
3.2 Empirical Patterns in North American Groups
Taking these methodological concerns into account, narrative and meta-analytic reviews of research suggest several provisional conclusions about gender and interaction in groups governed by the dominant culture of North American society. Aries (1996), Deaux and LaFrance (1998), and Ridgeway and Smith-Lovin (1999) provide reviews of the research on which these conclusions are based. Gender differences in behavior do not always occur in small groups and vary greatly by context. Behavioral expectations associated with the specific focus and institutional context of the small group (e.g., the workplace, a committee, a friendship group, a student group) generally are more powerful determinants of both men’s and women’s behavior than gender. When gender differences occur, they tend to be small or moderate in effect size, meaning that there is usually at least a 70 percent overlap in the distributions of men’s and women’s behavior. When men and women are in formal, prescribed roles with the same power and status, there are few if any differences in their behavior. Research has shown that men and women in equivalent leadership or managerial roles interact similarly with subordinates of either sex. On the other hand, when women are gender atypical occupants of positions of power, they are sometimes perceived by others as less legitimate in those roles and elicit more negative evaluations when they behave in a highly directive, autocratic way than do equivalent men. These findings are in accord with the predictions of social role theory and expectation states theory. Influence over others and assertive, goal-directed behavior such as participation rates, task suggestions, and assertive patterns of gestures and eye gaze are associated with power and leadership in small groups.
In mixed-sex groups with a gender-neutral task, men have moderately higher rates of assertive behaviors and influence than do women who are otherwise their peers. When the group task or context is culturally linked to men, this gender difference increases. When the task or context is one associated with women, however, women’s rates of assertive behaviors and influence are slightly higher than men’s. When performance information clearly demonstrates that women in a mixed-sex group are as competent as the men, gender differences in assertiveness and influence disappear. In same-sex groups, there are no differences between men’s and women’s rates of assertive behaviors or influence levels. These patterns closely match the predictions of expectation states theory, are consistent with social role theory, and are inconsistent with two-cultures theory. They suggest that gender status beliefs in society and the expectations for competence in the situation that they create are an important determinant of gender differences in power and assertiveness in groups, independent of men’s and women’s personalities or skills. They indicate as well that both men and women act assertively or deferentially depending on the situational context. Men, like women, show higher rates of socioemotional behaviors when they are in subordinate rather than superordinate positions in groups. These are verbal and nonverbal behaviors that support the speech of others, express solidarity, and show active, attentive listenership. In mixed-sex groups, women engage in slightly more socioemotional behavior than men. However, women engage in the highest rates of socioemotional behaviors, and men the lowest, in same-sex groups. The latter findings are the only ones in partial accord with the two-cultures theory. That theory, however, does not account for the partial association of socioemotional behaviors with lower status positions. Status factors alone, however, do not explain women’s increased socioemotional behaviors in female groups. Assertive, instrumental behaviors appear to reflect power, competence, and status equally for men and women and, thus, do not reliably mark gender identity for the actor. To the extent that people signal gender identity consistently across interaction contexts, they appear to do so primarily through socioemotional behaviors that are less associated with instrumental outcomes.
4. Conclusions
Both the gender division of labor and gender inequality in a society depend on its cultural beliefs about the nature and social value of gender differences in competencies and traits. Such taken-for-granted beliefs allow actors to be reliably categorized as men and women in all contexts and understood as more or less appropriate candidates for different roles and positions in society. For such cultural beliefs to persist, people’s everyday interactions must be organized to support them. The empirical evidence from North America suggests that unequal role and status relationships produce many differences in interactional behavior that are commonly attributed to gender. Network research suggests that most interactions between men and women actually occur within the structural context of unequal role or status relations (see Ridgeway and Smith-Lovin 1999). These points together may account for the fact that people perceive gender differences to be pervasive in interaction, while studies of actual interaction show few behavioral differences between men and women of equal status and power. Small group interaction is an arena in which the appearance of gender differences is continually constructed through power and status relations and identity marking in the socioemotional realm. Theory and research on gender and interaction have focused on the way cultural beliefs about gender and structural roles shape interaction in ways that confirm the cultural beliefs. New approaches investigate the ways that interactional processes may perpetuate or undermine gender inequality in a society as that society undergoes economic change. If the cultural beliefs about gender that shape interaction change more slowly than economic arrangements, people interacting in gendered ways may rewrite gender inequality into newly emerging forms of socioeconomic organization in society. On the other hand, rapidly changing socioeconomic conditions may change the constraints on interaction between men and women in many contexts so that people’s experiences undermine consensual beliefs about gender and alter them over time.

See also: Androgyny; Cultural Variations in Interpersonal Relationships; Feminist Theory; Gender and Feminist Studies in Psychology; Gender and Feminist Studies in Sociology; Gender and Language: Cultural Concerns; Gender Differences in Personality and Social Behavior; Gender Ideology: Cross-cultural Aspects; Gender-related Development; Groups, Sociology of; Interactionism: Symbolic; Interpersonal Attraction, Psychology of; Language and Gender; Male Dominance; Masculinities and Femininities; Social Networks and Gender; Social Psychology: Sociological; Social Psychology, Theories of; Social Relationships in Adulthood; Stereotypes, Social Psychology of
Bibliography
Aries E 1996 Men and Women in Interaction: Reconsidering the Differences. Oxford University Press, New York
Deaux K, LaFrance M 1998 Gender. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. McGraw-Hill, Boston, Vol. 1, pp. 788–827
Eagly A H 1987 Sex Differences in Social Behavior: A Social-Role Interpretation. Erlbaum, Hillsdale, NJ
Filardo E K 1996 Gender patterns in African-American and white adolescents’ social interactions in same-race, mixed-sex groups. Journal of Personality and Social Psychology 71: 71–82
Maltz D N, Borker R A 1982 A cultural approach to male-female miscommunication. In: Gumperz J J (ed.) Language and Social Identity. Cambridge University Press, Cambridge, UK, pp. 196–256
Parsons T, Bales R F 1955 Family, Socialization, and Interaction Process. Free Press, Glencoe, IL
Ridgeway C L 1993 Gender, status, and the social psychology of expectations. In: England P (ed.) Theory on Gender/Feminism on Theory. Aldine de Gruyter, New York
Ridgeway C L, Smith-Lovin L 1999 The gender system and interaction. Annual Review of Sociology 25: 191–216
Tannen D 1990 You Just Don’t Understand: Women and Men in Conversation. Morrow, New York
Wagner D G, Berger J 1997 Gender and interpersonal task behaviors: Status expectation accounts. Sociological Perspectives 40: 1–32
West C, Zimmerman D 1987 Doing gender. Gender and Society 1: 125–51
C. L. Ridgeway
Smith, Adam (1723–90)
Adam Smith was born in Kirkcaldy, in the County of Fife, and baptized on June 5, 1723 (the date of birth is unknown). He was the son of Adam Smith, Clerk to the Court-Martial, later Comptroller of Customs in the town, and of Margaret Douglas of Strathendry.
1. Early Years
Smith entered Glasgow University in 1737, at the not uncommon age of 14. He was fortunate in his mother’s choice of University and in the period of his attendance. The old, nonspecialised system of ‘regenting’ had been replaced in the late 1720s by a new arrangement whereby individuals professed a single subject. There is little doubt that Smith benefited from the teaching of Alexander Dunlop (Greek) and of Robert Dick (Physics). But two men in particular are worthy of note in view of Smith’s later interests. The first is Robert Simson (1687–1768), whom Smith later described as one of the ‘greatest mathematicians that I have ever had the honour to be known to, and I believe one of the two greatest that have lived in my time’ (TMS, III.2.20). Smith could well have acquired from Simson and Matthew Stewart (father of Dugald) his early and continuing interest in mathematics. Dugald Stewart recalled in his memoir (Stewart 1977) that Smith’s favorite pursuits while at university were mathematics and natural philosophy. Campbell and Skinner noted that ‘Simson’s interests were shared generally in Scotland. From their stress on Greek geometry the Scots built up a reputation for their philosophical elucidation of Newtonian fluxions, notably in the Treatise on Fluxions (1724) by Colin Maclaurin (1698–1746), another pupil of Simson’s who held chairs of mathematics in Aberdeen and Edinburgh’ (1982, p. 20). But, important as it undoubtedly was, Simson’s influence upon Smith pales by comparison with that exerted by Francis Hutcheson (1694–1746). A student of Gerschom Carmichael, the first Professor of Moral Philosophy, Hutcheson succeeded him in 1729. A great stylist, Hutcheson lectured in English (rather than Latin) 3 days a week on classical sources and 5 days on Natural Religion, Morals, Jurisprudence, and Government. As Dugald Stewart was to observe, Hutcheson’s lectures contributed to diffuse, in Scotland, the taste for analytical discussion, and a spirit of liberal enquiry. It is believed that Smith graduated from Glasgow in 1740 and known that he was elected to the Snell Exhibition (scholarship) in the same year. He matriculated in Balliol College on July 7, and did not return to Scotland until 1746, the year that ended the Jacobite Rebellion and saw the death of Hutcheson. The 6 years spent in Oxford were often unhappy. Smith had a retiring personality, his health was poor, and the College was pro-Jacobite and ‘anti-Scottish.’ Some members were also ‘unenlightened,’ a fact confirmed by the confiscation by one tutor of David Hume’s Treatise of Human Nature. And yet Smith was to write to the Principal of Glasgow University (Archibald Davidson), on the occasion of his election as Lord Rector (letter 274, dated November 16, 1787):

No man can owe greater obligations than I do to the University of Glasgow. They educated me, they sent me to Oxford, soon after my return to Scotland they elected me one of their members, and afterwards preferred me to another office, to which the abilities and virtues of the never to be forgotten Dr Hutcheson had given a superior degree of illustration.
The reference to Oxford was no gilded memory. Balliol had one of the best libraries in Oxford. Smith’s occasional ill-health may be explained by his enthusiastic pursuit of its riches. It has been assumed that Smith developed an interest in Rhetoric and Belles Lettres during this period, although it now seems likely, in view of his later career, that he also further developed longer standing interests in literature, science, ethics, and jurisprudence.
2. Professor in Glasgow
Smith returned to Kirkcaldy in 1746 without any fixed plan. But his wide-ranging interests must have been known to his friends, three of whom arranged a program of public lectures which were delivered in Edinburgh between 1748 and 1751. The three friends were Robert Craigie of Glendoick, James Oswald of Dunnikier, and Henry Home, Lord Kames. The lectures were of an extramural nature and delivered to a ‘respectable auditory.’ They probably also included material on the history of science (astronomy), jurisprudence, and economics. The success of Smith’s courses in Edinburgh no doubt led to his appointment to the Glasgow Chair of Logic and Rhetoric in 1751, where once again he enjoyed the support of Henry Home, Lord Kames. Evidence gathered from former pupils confirms that Smith did lecture on rhetoric but also that he continued to deploy a wide range of interests, leading to the conclusion that he continued to lecture on the history of philosophy and science in the Glasgow years (Mizuta 2000, p. 101). Smith attached a great deal of importance to his essay on the history of Astronomy (Corr, letter 137), the major part of which may well have been completed soon after leaving Oxford (Ross 1995, p. 101). Adam Smith was translated to Hutcheson’s old chair of Moral Philosophy in 1752. As John Millar recalled to Dugald Stewart, the course was divided into four parts: natural theology, ethics, jurisprudence, and expediency (economics). Millar also confirmed that the substance of the course on ethics reappeared in The Theory of Moral Sentiments (1759) and that the last part of the course featured in the Wealth of Nations (1776). We know from Smith’s own words that he hoped to complete his wider plan by writing a ‘sort of Philosophical History of all the different branches of literature, of philosophy, poetry and eloquence’ together with ‘a sort of theory and history of law and government’ (Corr, letter 248). Smith returned to this theme in the advertisement to the sixth and final edition of TMS in noting that TMS and WN were parts of a wider study. ‘What remains, the theory of jurisprudence, which I have long projected, I have hitherto been hindered from executing, by the same occupations which had till now prevented me from revising the present work.’ But the outlines of the projected study are probably revealed by the content of LJ(A) and LJ(B), and by those passages in WN which can now be recognized as being derived from them (e.g., WN, III, and V.i.a.b). The links between the parts of the great plan are many and various.
The TMS, for example, may be regarded as an exercise in social philosophy, which was designed in part to show the way in which so self-regarding a creature as man erects (by natural as distinct from artificial means) barriers against his own passions, thus explaining the observed fact that he is always found in ‘troops and companies.’ The argument places a good deal of emphasis on the importance of general rules of behaviour which are related to experience and which may thus vary in content, together with the need for some system of government as a precondition of social order. The historical analysis, with its four socioeconomic stages, complements this argument by formally considering the origin of government and by explaining to some extent the forces which cause variations in accepted standards of behavior over time. Both are related in turn to Smith’s treatment of political economy. The most polished accounts of the emergent economy and of the psychology of the ‘economic man’ are to be found, respectively, in the third book of WN and in Part VI of TMS, which was added in 1790. Yet both areas of analysis are old and their substance would have been communicated to Smith’s students and understood by them to be what they possibly were: a preface to the treatment of political economy. From an analytical point of view, Smith’s treatment of economics in the final part of LJ was not lacking in sophistication, which is hardly surprising in view of his debts to Francis Hutcheson, and through him, to Gerschom Carmichael, and thus to Pufendorf.
3. The French Connection
Despite his anxieties, the TMS proved to be successful (Corr, letter 31) and attracted the attention of Charles Townshend, the statesman, who set out to persuade Smith to become tutor to the young Duke of Buccleuch. Smith eventually agreed, and left Glasgow with two young charges to begin a stay of almost 2 years in France. Smith resigned his Chair in February 1764 (Corr, letter 87). The party spent many months in Toulouse before visiting Avignon and Geneva (where Smith met the much-admired Voltaire) prior to reaching Paris to begin a stay of some 10 months. From an intellectual point of view, the visit was a resounding success and arguably influential in the sense that Smith was able to meet, amongst many others, François Quesnay and A. R. J. Turgot. Quesnay’s economic model dates from the late 1750s but it is noteworthy that he was working on a new version of the Tableau, the Analyse, during the course of Smith’s visit. It is also known that Smith met Turgot at this time and that the latter was at work on the Reflections. While Smith wrote a detailed commentary on Physiocratic teaching (WN IV.ix), his lectures on economics, as delivered in the closing months of 1763, did not include a model of the kind which he later associated with the French Economists, thus suggesting that he must have found a great deal to think about in the course of 1766. But there were more immediate concerns. The young Duke was seriously ill in the summer, leading Smith to call upon the professional services of Quesnay. Smith’s distress was further compounded by the illness and death of the younger brother, Hew Scott. The visit to Paris was ended abruptly. Smith spent the winter of 1766 in London and returned to Kirkcaldy in the following Spring to begin a stay of more than 6 years during which he worked upon successive drafts of the WN.
4. London 1773–76
Smith left Kirkcaldy in the spring of 1773 to begin a stay of some 3 years in London. David Hume assumed that publication of WN was imminent, but in fact the book did not appear until March 1776. The reason may well be that Smith was engaged in systematic revision of a complex work. But he was also concerned with two sophisticated areas of analysis. The first of these, one of the topics that Smith addressed during his stay in London, related to his interest in the optimal organization of public services, notably education, and features in a long letter addressed to William Cullen dated September 1774. Cullen had written to Smith seeking his opinion on proposals from the Royal College of Physicians in Edinburgh. The petition suggested that doctors should be graduates, that they should have attended College for at least 2 years, and that their competence be confirmed by examination. Smith opposed this position, arguing that universities should not have a monopoly of provision and in particular that the performance of academic staff should be an important factor in determining salaries. In sum, an efficient system of higher education requires: a state of free competition between universities and private teachers; a capacity effectively to compete in the market for men of letters; freedom of choice for students as between teachers, courses, and colleges, together with the capacity to be sensitive to market forces, even if those forces were not always themselves sufficient to ensure the provision of the basic infrastructure. Smith also argued that education should be paid for through a combination of private and public funding (WN, V.i.i.5). Smith was arguing in favour of the doctrine of ‘induced efficiency’ and applied the principles involved to the whole range of public services (Skinner 1996, chap. 8). At the same time, Smith also addressed another complex question, which was considered at great length in WN (Book IV, vii), namely the developing tensions with the American Colonies.
Indeed, Hume believed that Smith’s growing preoccupation with the Colonial Question was the cause of the delay in the publication of the book (Corr, letter 139). Smith’s long analysis of the issues involved provides the centerpiece of his critique of the ‘mercantile’ system of regulation (WN, IV, v.a.24). On Smith’s account the Regulating Acts of Trade and Navigation in effect confined the Colonies to the production of primary products, while the ‘mother country’ concentrated on more refined manufactures, thus creating a system of complementary markets which benefited both parties. But it was Smith’s contention that the British economy could not sustain indefinitely the cost of colonial administration and that in the long run the rate of growth in America would come into conflict with restrictions currently imposed. Smith, therefore, argued that Great Britain should dismantle the Acts voluntarily with a view to creating a single free trade area, in effect an Atlantic Economic Community, with a harmonized system of taxation, possessing all the advantages of a common language and culture.
Smith accepted the principle that there should be no taxation without representation. In this connection, he noted that: In the course of little more than a century perhaps the produce of American might exceed that of British taxation. The seat of empire would then naturally remove itself to that part of the empire which contributed most to the general defence of the whole (WN IV, vii.c.79).
Hugh Blair, Professor of Rhetoric in Edinburgh, objected to Smith’s inclusion of this material on the grounds that ‘it is too much like a publication for the present moment’ (Corr, 151). Others were more perceptive, recognizing that Smith was applying principles drawn from his wider system to the specific case of America (e.g., Thomas Pownall, Corr, ap.A). Pownall, a former Governor of Massachusetts, was an acute critic of Smith’s position (Skinner 1996, chap. 8) but recognized that Smith had sought to produce ‘an institute of the Principia of those laws of motion, by which the operations of the community are directed and regulated, and by which they should be examined’ (Corr p. 354).
5. The Wealth of Nations
Dugald Stewart, Professor of Moral Philosophy in Edinburgh, noted that: ‘It may be doubted, with respect to Mr Smith’s Inquiry, if there exists any book beyond the circle of the mathematical and physical sciences, which is at once so agreeable in its arrangement to the rules of a sound logic and so accessible to the examination of ordinary readers’ (Stewart 1977, IV.22). This is a compliment which Smith would have appreciated, conscious as he was of the ‘beauty of a systematical arrangement of different observations connected by a few common principles’ (WN, V.i.f. 25). But there are really two systems in WN. The first is a model of ‘conceptualised reality’ (Jensen 1984) which provides a description of a modern economy, which owed much to the Physiocrats. The second system is analytical and builds upon the first.

5.1 A Model of Conceptualized Reality
If the Theory of Moral Sentiments provides an account of the way in which people erect barriers against their own passions, thus meeting a basic precondition for economic activity, it also provided an account of the psychological judgments on which that activity depends. The historical argument on the other hand explains the origins and nature of the modern state and provides the reader with the means of understanding the essential nature of the exchange economy. For Smith: ‘the great commerce of every civilised society, is that carried on between the inhabitants of the town and those of the country ... The gains of both are mutual and reciprocal, and the division of labour is in this, as in all other cases, advantageous to all the different persons employed in the various occupations into which it is subdivided’ (WN, III.i.1). The concept of an economy involving a flow of goods and services, and the appreciation of the importance of intersectoral dependencies, were familiar in the eighteenth century. Such themes are dominant features of the work done, for example, by Sir James Steuart and David Hume. But what is distinctive about Smith’s work, at least as compared to his Scottish contemporaries, is the emphasis given to the importance of three distinct factors of production (land, labor, capital) and to the three categories of return (rent, wage, profit) which correspond to them. What is distinctive to the modern eye is the way in which Smith deployed these concepts in providing an account of the flow of goods and services between the sectors involved and between the different socioeconomic groups (proprietors of land, capitalists, and wage-labor). The approach is also of interest in that Smith, following the lead of the French Economists, worked in terms of period analysis (typically the year was chosen), so that the working of the economy is examined within a significant time dimension as well as over a series of time periods. Both versions of the argument emphasise the importance of capital, both fixed and circulating.
5.2 A Conceptual Analytical System
The ‘conceptual’ model which Smith had in mind when writing the Wealth of Nations is instructive and also helps to illustrate the series of separate, but interrelated problems, which economists must address if they are to attain the end which Smith proposed, namely an understanding of the full range of problems which have to be encountered. Smith, in fact, addressed a series of areas of analysis which began with the problem of value, before proceeding to the discussion of the determinants of price, the allocation of resources between competing uses, and, finally, an analysis of the forces which determine the distribution of income in any one time period and over time. The analysis offered in the first book enabled Smith to proceed directly to the treatment of macroeconomic issues and especially to a theory of growth which provides one of the dominant features of the work as a whole (cf. Skinner 1996, chap. 7). The idea of a single, all-embracing conceptual system, whose parts should be mutually consistent is not an ideal which is so easily attainable in an age where the division of labor has increased significantly the quantity of science through specialization. But Smith becomes even more informative when we map the content of the ‘conceptual (analytical) system’ against a model of the economy, which is essentially descriptive.
Perhaps the most significant feature of Smith’s vision of the ‘economic process,’ to use Blaug’s phrase, lies in the fact that it has a significant time dimension. For example, in dealing with the problems of value in exchange, Smith, following Hutcheson, made due allowance for the fact that the process involves judgments with regard to the utility of the commodities to be received, and the disutility involved in creating the commodities to be exchanged. In the manner of his predecessors, Smith was aware of the distinction between utility (and disutility) anticipated and realized, and, therefore, of the process of adjustment which would take place through time. Young (1997, p. 61) has emphasized that the process of exchange may itself be a source of pleasure (utility). In an argument which bears upon the analysis of the TMS, Smith also noted that choices made by the ‘rational’ individual may be constrained by the reaction of the spectator of his conduct, a much more complex situation than that which more modern approaches may suggest. Smith makes much of the point in his discussion of Mandeville’s ‘licentious’ doctrine that private vices are public benefits, in suggesting that the gratification of desire is perfectly consistent with observance of the rules of propriety as defined by the ‘spectator,’ that is, by an external agency. In an interesting variant on this theme, Etzioni (1988, pp. 21–4) noted the need to recognize ‘at least two irreducible sources of valuation or utility; pleasure and morality.’ He added that modern utility theory ‘does not recognise the distinct standing of morality as a major, distinct, source of valuations’ and hence as an explanation of ‘behaviour’ before going on to suggest that his own ‘deontological multi-utility model’ is closer to Smith than other modern approaches.
Smith’s theory of price, which allows for a wide range of changes in taste, is also distinctive in that it allows for competition among and between buyers and sellers, while presenting the allocative mechanism as one which involves simultaneous and interrelated adjustments in both factor and commodity markets. As befits a writer who was concerned to address the problems of change and adjustment, Smith’s position was also distinctive in that he was not directly concerned with the problem of equilibrium. For him the ‘natural’ (supply) price was: ‘as it were, the central price, to which the prices of all commodities are continually gravitating ... whatever may be the obstacles which hinder them from settling in this center of repose and continuance, they are constantly tending towards it’ (WN, I.vii.15). The picture was further refined in the sense that Smith introduced into this discussion the doctrine of ‘net advantages’ (WN, I.x.a.1). This technical area is familiar to labor economists, but in Smith’s case it becomes even more interesting in the sense that it provides a further link with the TMS, and with the discussion of constrained choice. It was Smith’s contention that men would only be prepared to embark on professions that attracted the disapprobation of the spectator if they could be suitably compensated (Skinner 1996, p. 155) in terms of monetary reward. But perhaps the most intriguing feature of the macroeconomic model is to be found in the way in which it was specified. As noted earlier, Smith argued that incomes are generated as a result of productive activity, thus making it possible for commodities to be withdrawn from the ‘circulating’ capital of society. As he pointed out, the consumption of goods withdrawn from the existing stock may be used up in the present period, or added to the stock reserved for immediate consumption, or used to replace more durable goods which had reached the end of their lives in the current period. In a similar manner, undertakers and merchants may add to their stock of materials, or to their holdings of fixed capital while replacing the plant which had reached the end of its operational life. It is equally obvious that undertakers and merchants may add to, or reduce, their inventories in ways that will reflect the changing patterns of demand for consumption and investment goods, and their past and current levels of production. Smith’s emphasis upon the point that different ‘goods’ have different life-cycles means that the pattern of purchase and replacement may vary continuously as the economy moves through different time periods, and in ways which reflect the various age profiles of particular products as well as the pattern of demand for them. If Smith’s model of the ‘circular flow’ is to be seen as a spiral, rather than a circle, it soon becomes evident that this spiral is likely to expand (and contract) through time at variable rates. It is perhaps this total vision of the complex working of the economy that led Mark Blaug to comment on Smith’s sophisticated grasp of the economic process and to distinguish this from his contribution to particular areas of economic analysis (cf.
Jensen 1984, Jeck 1994, Ranadive 1984). What Smith had produced was a model of conceptualized reality, which is essentially descriptive, and which was further illuminated by an analytical system which was so organized as to meet the requirement of the Newtonian model (Skinner 1996, chap. 7). Smith’s model(s) and the way in which they were specified confirmed his earlier claim that governments ought not to interfere with the economy, a theme stated in the ‘manifesto of 1775’ (Stewart 1977, IV.25), repeated in LJ, confirmed by Turgot, and even more eloquently defended in WN (IV.ix.51). Smith would no doubt be gratified that Hume had lived to see the publication of WN and pleased with the assessment that it has ‘Depth and Solidity and Acuteness’ (Corr, letter 150). Hume died in the summer of 1776. Two years later Smith was asked by a former pupil, Alexander Wedderburn, Solicitor-General in Lord North’s administration, to advise on the options open to the British Government in the aftermath of Burgoyne’s surrender at Saratoga. Smith returned to his old theme of Union, but recognized that the most likely outcome was military defeat (Corr, app B). In February of the same year, Smith was appointed Commissioner of Customs and of the Salt Duties, which gave him an income of £600 p.a. to be added to the £300 which he still received from the Duke. Smith then moved to Edinburgh where he lived with his mother and a cousin. He died in 1790 after instructing his executors, Joseph Black and James Hutton, to burn the bulk of his papers. We may believe that the pieces which survived (which include the Astronomy) and which were later published in the Essays on Philosophical Subjects, were all specified by Smith.
6. Influence
Adam Smith’s influence upon his successors is a subject worthy of a book rather than a few concluding paragraphs. But there are some obvious points to be made even if the list can hardly be exhaustive. If we look at the issues involved from the standpoint of economic theory and policy, the task becomes a little simpler. On the subject of policy, Teichgraeber has noted that Smith’s advocacy of free trade or economic liberalism did not ‘register any significant victories during his life-time’ (1987, p. 338). Indeed, Tribe has argued that ‘until the final decade of the eighteenth century, Sir James Steuart’s Inquiry was better known than Smith’s The Wealth of Nations’ (1988, p. 133). The reason is that Steuart’s extensive and unique knowledge of conditions in Europe, gained as a result of exile, made him acutely aware of problems of immediate relevance, such as unemployment, regional imbalance, underdeveloped economies and the difficulties which were presented in international trade as a result of variations in rates of growth (Skinner 1996, chap. 11). In view of later events, it is ironic to note that Alexander Hamilton considered Steuart’s policy of protection for infant industries to be more relevant to the interests of the young American Republic than the ‘fuzzy philosophy of Smith’ (Stevens 1975, pp. 215–7). But the situation was soon to change. In the course of a review of the way in which WN had been received, Black noted that: ‘On the side of policy, the general impression left by the historical evidence is that by 1826 not only the economists but a great many other influential public men were prepared to give assent and support to the system of natural liberty and the consequent doctrine of free trade set out by Adam Smith’ (1976, p. 47).
Black recorded that the system of natural liberty attracted attention on the occasion of every anniversary. But a cautionary note was struck by Viner (1928). Having reviewed Smith’s treatment of the function of the state in Adam Smith and Laissez-Faire, a seminal article, Viner concluded: ‘Adam Smith was not a doctrinaire advocate of laisser-faire. He saw a wide and elastic range of activity for government, and was prepared to extend it even further if government, by improving its standards of competence, honesty and public spirit, showed itself entitled to wider responsibilities’ (Wood 1984, i.164).
But this sophisticated view, which is now quite general, does not qualify Robbins’ point that Smith developed an important argument to the effect that economic freedom ‘rested on a two fold basis: belief in the desirability of freedom of choice for the consumer and belief in the effectiveness, in meeting this choice, of freedom on the part of the producers’ (Robbins 1953, p. 12). Smith added a dynamic dimension to this theme in his discussion of the Corn Laws (WN, IV.v.b). The thesis has proved to be enduringly attractive. Analytically, the situation is also intriguing. Teichgraeber’s research revealed that there ‘is no evidence to show that many people exploited his arguments with great care before the first two decades of the nineteenth century’ (1987, p. 339). He concluded: ‘It would seem at the time of his death that Smith was widely known and admired as the author of the Wealth of Nations. Yet it should be noted too that only a handful of his contemporaries had come to see his book as uniquely influential’ (1987, p. 363). Black has suggested that for Smith’s early nineteenth-century successors, the WN was ‘not so much a classical monument to be inspected, but as a structure to be examined and improved where necessary’ (1984, p. 44). There were ambiguities in Smith’s treatment of value, interest, rent, and population theory. These ambiguities were reduced by the work of Ricardo, Malthus, James Mill, and J. B. Say, making it possible to think of a classical system dominated by short-run self-equilibrating mechanisms and a long-run theory of growth. But there was one result of which Smith would not have approved in that the classical orthodoxy made it possible to think of economics as quite separate from ethics and history.
In a telling passage reflecting upon the order in which Smith developed his argument (ethics, history, economics), Hutchison concluded that Smith was unwittingly led, as if by an Invisible Hand, to promote an end which was no part of his intention, that ‘of establishing political economy as a separate autonomous discipline’ (1988, p. 355). But the economic content of WN did, after all, provide the basis of classical economics in the form of a coherent, all-embracing account of ‘general interconnexions’ (Robbins 1953, p. 172). As Viner had earlier pointed out, the source of Smith’s originality lies in his ‘detailed and elaborate application to the
wilderness of economic phenomena of the unifying concept of a co-ordinated and mutually interdependent system of cause and effect relationships which philosophers and theologians had already applied to the world in general’ (Wood 1984, i.143). Down the years it is the idea of system which has attracted sustained attention perhaps because it is now virtually impossible to duplicate a style of thinking which becomes more informative the further we are removed from it. No one who is familiar with the Smithian edifice can fail to notice that he thought mathematically and in a manner which reflects his early interest in related disciplines, including the life sciences, all mechanistic, evolutionary, static, and dynamic, which so profoundly affected the shape assumed by his System of Social Science.
Works of Adam Smith (OUP, 1976–1983)
Corr: Correspondence, eds E C Mossner and I S Ross (1977)
Early Draft of WN, in W R Scott, Adam Smith as Student and Professor (Jacksons, Glasgow, 1937)
EPS: Essays on Philosophical Subjects, incl. the Astronomy, general eds D D Raphael and A S Skinner (1980)
Letter: Letter to Authors of the Edinburgh Review (1756), in EPS
LJ(A) and LJ(B): Lectures on Jurisprudence, eds R L Meek, P G Stein and D D Raphael (1978)
LRBL: Lectures on Rhetoric and Belles Lettres, ed J C Bryce (1983)
TMS: The Theory of Moral Sentiments, eds D D Raphael and A L Macfie (1976)
WN: The Wealth of Nations, eds R H Campbell, A S Skinner and W B Todd (1976)
Stewart: Account of the Life and Writings of Adam Smith (1977), in Corr.
Bibliography
Black R D C 1976 Smith’s contribution in historical perspective. In: Wilson T, Skinner A S (eds.) The Market and the State. Oxford University Press, Oxford, UK
Campbell R H, Skinner A S 1982 Adam Smith. Croom-Helm, London
Etzioni A 1988 The Moral Dimension: Towards a New Economics. Macmillan, London
Groenewegen P 1969b Turgot and Adam Smith. Scottish Journal of Political Economy 16
Hutchison T 1988 Before Adam Smith. Blackwell, Oxford, UK
Jeck A 1994 The macro-structure of Adam Smith’s theoretical system. European Journal of the History of Economic Thought 3: 551–76
Jensen H E 1984 Sources and Contours in Adam Smith’s Conceptualised Reality. Wood, ii.194
Meek R L 1962 The Economics of Physiocracy. Allen and Unwin, London
Meek R L 1973 Turgot on Progress, Sociology and Economics. Cambridge, UK
Mizuta H 2000 Adam Smith: Critical Responses. Routledge, London
Ranadive K R 1984 The wealth of nations: the vision and conceptualisation. In: Wood J C (ed.) Adam Smith: Critical Assessments, ii. 244
Robbins L 1953 The Theory of Economic Policy in English Classical Political Economy. Macmillan, London
Ross I S 1995 Life of Adam Smith. Oxford University Press, Oxford, UK
Skinner A S 1996 A System of Social Science: Papers Relating to Adam Smith, 2nd edn. Oxford University Press, Oxford, UK
Stevens D 1975 Adam Smith and the colonial disturbances. In: Skinner A S, Wilson T (eds.) Essays on Adam Smith. Oxford University Press, Oxford, UK
Teichgraeber R 1987 Less abused than I had reason to expect: The reception of the Wealth of Nations in Britain 1776–1790. Historical Journal 30: 337–66
Tribe K P 1988 Governing Economy: The Reformation of German Economic Discourse, 1750–1840. Cambridge University Press, Cambridge, UK
Viner J 1928 Adam Smith and Laissez-Faire. Journal of Political Economy 35: 143–67
Wood J C 1984 Adam Smith: Critical Assessments. Croom-Helm, Beckenham, UK
Young J 1997 Economics as a Moral Science: The Political Economy of Adam Smith. Edward Elgar, Cheltenham, UK
A. S. Skinner
Smoking and Health
1. Introduction
Worldwide, 1.1 billion people use tobacco products, primarily cigarettes. An estimated four million die annually as a consequence. By the year 2030, an estimated 1.6 billion will consume tobacco and tobacco’s death toll will have skyrocketed to 10 million per year, making tobacco the leading behavioral cause of preventable premature death throughout the world. Tobacco already claims that dubious distinction in the world’s developed countries (World Health Organization 1997). The causes of this pandemic of avoidable disease and death are a complex web of physiological, psychological, social, marketing, and policy factors. Social and behavioral scientists of all disciplinary stripes have contributed to identifying and disentangling these factors, although no single model has yet adequately characterized the interactions of these determinants. Social and behavioral scientists have also elucidated important mechanisms by which both private sector and governmental intervention can discourage the initiation of smoking and encourage quitting. After a brief description of the burden of smoking and how epidemiological science unearthed the principal tobacco-related disease connections, the causes of tobacco consumption are summarized. The efforts of social and behavioral scientists to improve the ability of smokers to quit smoking are then described, including better theoretical characterization of the determinants of quitting success and the development of more effective cessation interventions. Next examined is how social and behavioral scientists have analyzed the effects of selected important tobacco control measures, and thereby contributed to the formulation and implementation of tobacco control programs and policies. Finally, the future contribution of the social and behavioral sciences to dealing with a potentially even more cataclysmic crisis in world health during the twenty-first century is considered. Traditional methods of youth smoking prevention, such as school health education and enforcement of minimum age-of-purchase laws, are only touched on due in part to limited scientific understanding of how initiation can be effectively discouraged, and due to its coverage elsewhere (Lantz et al. 2000; see Substance Abuse in Adolescents, Prevention of; Smoking Prevention and Smoking Cessation; Health Promotion in Schools).
2. Cigarette Smoking and Disease: Making the Connection
In the major industrialized nations, smoking causes from a sixth to a fifth of all mortality. The two major smoking-related causes of death are lung cancer and coronary heart disease (CHD). In the United States, epidemiologists estimate that smoking accounts for approximately 90 percent of all lung cancer deaths, with lung cancer the leading cancer cause of death in both men and women. Smoking is credited with better than a fifth of CHD deaths. In addition, smoking causes four-fifths of all chronic obstructive pulmonary disease mortality and just under a fifth of all stroke deaths (US Department of Health and Human Services 1989). The exposure of nonsmokers to environmental tobacco smoke is also a cause of disease and death (American Council on Science and Health 1999). The toll of smoking is proportionately smaller in developing countries, reflecting the more recent rise and lesser intensity of smoking to this point. However, projections indicate a future chronic disease epidemic quite comparable to that now experienced in the world’s more affluent nations (World Health Organization 1997).
Although the health hazards of tobacco smoking have been suspected for centuries, serious interest is a twentieth-century phenomenon, coincident with the advent of cigarette smoking. Prior to the twentieth century, tobacco was smoked primarily in pipes and cigars, chewed, or used nasally as snuff. Harsh tobaccos made deep inhalation of tobacco smoke difficult. Tobacco likely exacted a modest toll through the nineteenth century. In the US, four factors combined in the early twentieth century to make cigarette smoking the most popular, and lethal, form of tobacco consumption. Perfection of the Bonsack cigarette rolling machine in 1884 introduced relatively inexpensive and neater cigarettes to the market. Discovery of the American blend of tobaccos (combining more flavorful tobaccos from Turkey and Egypt with milder American tobaccos) made deep inhalation of cigarette smoke feasible. The development and marketing of Camel cigarettes in 1913, the first American blend cigarette, introduced this new-generation product to the American public with an advertising campaign often credited as inaugurating modern advertising. Finally, cigarettes were included in soldiers’ rations during World War I, permitting soldiers a quick and convenient battlefield tobacco break. Considered effeminate and unattractive prior to the war, cigarette smoking came home a ‘manly’ behavior. Since then, cigarette smoking has dominated all other forms of tobacco use by far in the US and in most countries of the world. Lung cancer was virtually unknown at the beginning of the twentieth century. By 1930, it had become the sixth leading cause of cancer death in men in the United States. Less than a quarter of a century later, lung cancer surpassed colorectal cancer to become the leading cause of male cancer death. It achieved the same status among American women in the mid-1980s, surpassing breast cancer (US Department of Health and Human Services 1989).
The rapidly growing rate of lung cancer in the early decades of the century spurred a number of epidemiologists and laboratory scientists to begin investigating the relationship between smoking and cancer. In 1950, Wynder and Graham published a now classic retrospective epidemiologic analysis that strongly linked the behavior to the disease. Throughout the decade, a series of articles documented similar findings from major prospective (cohort) mortality studies in the US and the UK. Scientific groups in both countries soon thereafter published seminal public health documents indicating smoking as the major cause of lung cancer and a cause of multiple other diseases (Royal College of Physicians of London 1962, US Public Health Service 1964, US Department of Health and Human Services 1989). By the end of the twentieth century, approximately 70,000 studies in English alone associated cigarette smoking with a wide variety of malignant neoplasms and cardiovascular and pulmonary diseases, as well as
numerous other disorders. The modern plague of tobacco-produced disease is now rapidly metastasizing—both figuratively and literally—to the world’s poorer nations, where increasing affluence and western image-making have combined to place cigarettes in the mouths of a sizable majority of men and a growing minority of women.
3. The Causes of Smoking
Smoking affords users a very mild ‘high’ and addresses a number of often seemingly competing physical and psychological needs, for relaxation or stimulation, distraction or concentration, for example. The crucial ingredient in the sustained use of cigarettes is the addictiveness of the nicotine in tobacco smoke. Once addicted, as most regular smokers are, smoking serves the physiological purpose of avoiding nicotine withdrawal. Nicotine affects brain receptors in much the same manner as other addictive substances, and withdrawal shares many of the same unpleasant characteristics (US Department of Health and Human Services 1988). In many countries, smoking is initiated during the teen and even pre-teen years. At the time of initiation, new smokers typically do not appreciate the nature of addiction, much less the addictiveness of nicotine. Further, they tend to be short sighted, unconcerned about potential distant adverse health consequences (US Department of Health and Human Services 1994). The combination translates into a large pool of new smokers who have adopted the behavior, and become addicted, without considering either the danger or addictiveness of smoking, a situation that most will come to regret. The physical effects of nicotine notwithstanding, history teaches that virtually all drug use is socially conditioned; tobacco smoking is no exception. The use of tobacco in the Americas prior to the arrival of Europeans demonstrates this vividly. Although tobacco played a prominent role in most native societies, in some tobacco smoking was restricted exclusively to the shaman, who used it for medicinal and religious purposes. In other societies, tobacco smoking anointed official nonsectarian functions of tribal leaders (such as the famous ‘peace pipe’). In still others, nearly all males smoked tobacco frequently, for personal and social reasons (Goodman 1993).
In contemporary society, the initiation of cigarette smoking is often viewed as a rite of passage to adulthood. In many poorer societies, smoking is seen as a symbol of affluence. In virtually all cases, smoking results from role modeling, of parents, prominent celebrities, or youthful peers. Many tobacco control advocates attribute much of smoking’s holding power to sophisticated tobacco industry marketing campaigns. In the United States alone, the industry spends over $5 billion per year on
advertising and other forms of marketing. The industry insists that the purpose of its advertising is solely to vie for shares of the existing market of adult smokers, a claim greeted with derision by the public health community. With US smokers exhibiting strong brand loyalty and two companies controlling three-quarters of the US market, marketing directed at brand switching (or its defensive analog, maintaining brand loyalty) would appear to be a relatively fruitless investment. Tobacco control advocates thus believe that much of the industry’s marketing effort is directed toward attracting new smokers, primarily children but also groups of first- and second-generation American adults not yet fully acculturated into American society (e.g., Hispanic females, who have a very low smoking prevalence). Similarly, the introduction of aggressive Western cigarette marketing into Asian societies and Eastern European and African countries has been attacked by the public health community as spurring the growth of smoking among children and other traditionally nonsmoking segments of society (e.g., Asian females). In many of these countries, cigarette advertising was nonexistent or restricted to modest use of print media prior to the introduction of advertising by the major multinational tobacco companies. Before dissolution of the Soviet Union, there was virtually no advertising of the state-produced cigarettes in the Eastern European countries. That ended with a flourish when multinational tobacco companies entered the newly liberated economies and, in some instances, took over cigarette production from the inefficient state-run enterprises. In Japan, the state tobacco monopoly had never bothered to advertise on television prior to the entry of the multinationals. In short order, competition from the advertising of Western cigarette brands led to cigarettes becoming the second most advertised product on Japanese television.
In Africa, the advertising media portray smoking as the indulgence of the much admired affluent set, with billboards in desperately poor villages depicting high-society Africans smoking and smiling in convivial social settings (McGinn 1997). Although many tobacco control advocates find advertising a convenient villain to attack, widespread smoking has often preceded formal marketing. Tobacco smoking spread like wildfire through much of Europe in the sixteenth and seventeenth centuries. Similarly, male smoking rates in Russia and the Eastern European countries were very high prior to the fall of Communism and the advent of modern cigarette marketing. The effects of advertising are further examined below in Sect. 5.
4. The Art and Science of Smoking Cessation
In countries in which the hazards of smoking have been well publicized, surveys find that most smokers
would like to quit and most have made at least one serious quit attempt, yet relatively few of those who try to quit succeed on any given attempt. In the US, for example, approximately three-quarters of smokers report they want to stop and as many as a third try each year. Yet only 2.5–3 percent succeed in quitting each year. Combined with persistent cessation efforts by many smokers, this modest quit rate has created a population in the US in which there are as many former smokers as current smokers (US Department of Health and Human Services 1989, 1990). In the aggregate, thus, quitting has significantly reduced the toll of smoking, but the toll remains stubbornly high due to the difficulty of quitting. Although most former smokers quit without the aid of formal programs or products, the widespread desire to quit, paired with its difficulty, has created a small but thriving market for smoking cessation. Formal cessation interventions range from the distribution of how-to-quit booklets, to mass media cessation campaigns, to individual and group counseling programs, to self-administered nicotine replacement therapy (NRT) products, to use of cessation pharmaceuticals combined with clinical counseling. The efficacy of interventions ranges from sustained quit rates of about 5–7 percent for the most general and least resource-intensive interventions (e.g., generic cessation booklets) to 30 percent or more for the most resource-intensive programs that combine sophisticated counseling with the use of pharmaceuticals (Agency for Health Care Policy and Research 1996). Over the years, behavioral scientists have helped refine smoking cessation techniques by evaluating interventions and developing theory applied to smoking cessation. 
Pre-eminent in the domain of theory have been models that characterize how smokers progress, through a series of ‘stages,’ to contemplate, attempt, and eventually succeed or fail in quitting, with cessation maintenance also examined (Prochaska and DiClemente 1983). Relating cessation advice to smokers’ stages of readiness to change constitutes one way of ‘tailoring’ cessation messages. Another involves tying cessation advice to the specific motivations of individual smokers to quit and their specific concerns. For example, consider a smoker motivated to quit primarily by the high cost of smoking but also concerned about gaining weight. Armed with this knowledge, a cessation counselor can develop specific information on the financial savings the smoker can expect once he or she quits, while suggesting specific strategies to avoid weight gain (or to deal with it if it occurs). Tailored cessation messages have the obvious virtue of meeting the idiosyncratic needs of individual smokers. In the absence of computer technology, however, they would entail the substantial cost of collecting information on individuals’ needs and concerns and then developing individualized (tailored) cessation advice for them. Computers, however, permit simple and inexpensive collection and translation of information into tailored cessation materials, such as individualized advice booklets and calendars with specific daily advice and reminders (Strecher 1999). Health information kiosks placed in malls and other public locations can provide instant tailored suggestions to help people quit smoking. The concept of tailoring holds great potential to enhance quit rates both in individual counseling situations and in low-cost nontreatment settings, such as these kiosks. Social and behavioral scientists have also played a central role in defining appropriate medical treatment of smokers. The US Agency for Health Care Policy and Research (1996) clinical guideline for smoking cessation was produced by an expert committee including social and behavioral scientists who had worked on smoking cessation as service providers or developers or evaluators of interventions. The guideline urges physicians to regularly counsel their smoking patients about the implications of the behavior and to encourage them to quit. It concludes that a highly effective cessation approach involves physician counseling to quit, supplemented with use of cessation pharmaceuticals and maintenance advice through follow-up contacts. A cost-effectiveness analysis of the guideline added economics to the social sciences contributing to understanding optimal smoking cessation therapy (Cromwell et al. 1997). Despite substantial improvements in the efficacy of cessation treatments, in any given year only a small fraction of the smokers who say they want to quit succeed in doing so, and only a small fraction of these have employed professional or programmatic assistance. As such, in any short-run period of time, the contribution of smoking cessation interventions to reducing the health toll of smoking is modest at best.
This source of frustration has led a subset of smoking cessation professionals to explore methods of achieving ‘harm reduction’ that do not depend on smokers completely overcoming their addictions to nicotine. Harm reduction techniques range from helping smokers to reduce their daily consumption of cigarettes to encouraging consideration of the long-term use of low-risk nicotine-delivery devices, such as nicotine ‘gum’ or the patch. Though fraught with problems, harm reduction may be an idea whose time has come (Warner 2001).
5. Analysis of the Effects of Tobacco Control Policies
Another way to grapple with the toll created by smoking is to develop policies that discourage the initiation or continuation of smoking, or that restrict it to areas in which it will not impose risk on nonsmokers. Social and behavioral scientists have devoted substantial effort to studying the impacts of
policies, as well as the processes by which such policies come to be adopted. This section examines the former. (For the latter, see, e.g., Fritschler and Hoefler 1996, Kagan and Vogel 1993.) The World Health Organization and the World Bank have described and evaluated a wide array of tobacco control policies in countries around the globe (Roemer 1993, Prabhat and Chaloupka 1999). Three that have commanded the greatest amount of research attention are taxation, restricting or banning advertising and promotion, and limiting smoking in public places.
5.1 Taxation
Research performed primarily by economists has established that taxing cigarettes is among the most effective measures to decrease smoking (Chaloupka and Warner 2000, Prabhat and Chaloupka 1999). Because the quantity of cigarettes consumed declines by an amount proportionately smaller than the associated tax rise, taxation increases government revenues at the same time that it decreases smoking. In developed countries, economists find that a 10 percent increase in cigarette price induces approximately a 4 percent decrease in quantity demanded. In developing countries, the demand impact may be twice as large. Effective dissemination of research findings has made taxation one of the central tenets of a comprehensive tobacco control program. Although the ‘bottom line’ about taxation is well established, research by economists and others points to concerns that remain unanswered. For example, does taxation discourage the initiation of smoking? Although evidence preponderantly suggests that it does, recent studies challenge the conventional view (Chaloupka and Warner 2000, Chaloupka 1999). ‘Side effects,’ or subtle unanticipated impacts of taxation, warrant additional attention as well. Notably, cigarette tax increases in the US cause some smokers, particularly younger ones, to switch to higher-nicotine cigarettes to get their customary dose of nicotine from fewer cigarettes (Evans and Farrelly 1998). In the emerging field of ‘behavioral economics,’ a small cadre of psychologists is studying experimentally how smoking responds to incentives. For example, investigators give subjects a daily ‘budget’ (e.g., play money) with which to buy cigarettes, food, and other commodities, at prices established by the investigators, and evaluate the effects of ‘taxation’ by raising the price of cigarettes. Findings generally have been quite consistent with those in the mainstream economics literature (Chaloupka et al. 1999).
Behavioral economics can investigate questions not addressable using existing real world data, but also has decided limitations in predicting how people will
respond to price changes in that real world. By learning from each other, the two fields can enrich the evidentiary base for future tobacco tax policy.
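The price response described in this section can be illustrated with a short calculation. The sketch below is only an illustration and is not drawn from the cited studies: it assumes a constant price elasticity of demand of −0.4 (the developed-country figure quoted above), full pass-through of the tax to the retail price, and hypothetical baseline figures (a $5.00 pack carrying $2.00 of tax). The function names are likewise illustrative.

```python
def quantity_change_pct(price_change_pct, elasticity):
    """Approximate % change in quantity demanded for a given % price change.

    Uses the simple linear (point-elasticity) approximation:
    %change in quantity = elasticity * %change in price.
    """
    return elasticity * price_change_pct


def tax_revenue_change_pct(price, tax, tax_increase, elasticity):
    """Approximate % change in tax revenue after a per-pack tax increase.

    Assumes (hypothetically) that the whole tax increase is passed
    through to the retail price.
    """
    price_change_pct = 100.0 * tax_increase / price
    q_change_pct = quantity_change_pct(price_change_pct, elasticity)
    # Revenue is tax per pack times packs sold; baseline quantity is 1.
    old_revenue = tax
    new_revenue = (tax + tax_increase) * (1.0 + q_change_pct / 100.0)
    return 100.0 * (new_revenue / old_revenue - 1.0)


if __name__ == "__main__":
    # A 10 percent price rise with elasticity -0.4.
    print(quantity_change_pct(10.0, -0.4))
    # Hypothetical pack: $5.00 price, $2.00 tax, $0.50 tax increase.
    print(tax_revenue_change_pct(5.0, 2.0, 0.5, -0.4))
```

Under these assumptions, a $0.50 tax increase raises the price by 10 percent, lowers consumption by about 4 percent, and raises tax revenue by about 20 percent, which is the pattern the section describes: smoking falls while government revenue rises.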
5.2 Advertising
A diverse group of sociologists, psychologists, economists, marketing specialists, and public health scholars has studied the effects of marketing on smoking. Research by several scholars concludes that modern Western-style advertising creates an imagery that many people, including large proportions of children, find easily recognizable and attractive. There is a strong correlation between children’s interest in cigarette marketing campaigns and their subsequent smoking behavior. Temporally, advertising campaigns directed at specific segments of a population have often been followed by significant growth in smoking in the targeted groups, including women in the US in the 1960s, young women in Asia in the 1990s, and children in many countries (Warner 2000, McGinn 1997). Whether any of these associations constitutes a causal relationship is the essential question. The same unidentified factor that makes cigarette advertising attractive to certain children could account for their subsequent smoking, independent of the advertising per se. Similarly, cigarette marketers might foresee an (independently occurring) expansion of a market segment and dive into the advertising void to compete for shares of that new market. In the absence of the ability to run randomized controlled trials, empirical analysis of the relationship between cigarette advertising and consumption has been unable to prove a causal connection or to estimate its likely extent (US Department of Health and Human Services 1989, Warner 2000). However, strong new evidence supporting causality comes from a study re-examining data on the relationship between countries’ policies with regard to cigarette advertising (ranging from no restrictions to complete bans) and levels of smoking within those societies (Saffer and Chaloupka 1999).
Blending marketing theory with empirical analysis, this study concluded that a complete ban on cigarette advertising would decrease smoking by about 6 percent, while partial bans (e.g., banning cigarette ads on the broadcast media) would be unlikely to have an impact on cigarette consumption. Combined with the previously existing evidence, the new research leads to the most plausible interpretation of the relationship between advertising and cigarette consumption. It is a conclusion that likely will satisfy neither tobacco industry defenders of the ‘brand-share only’ argument, nor tobacco control advocates who condemn marketing efforts as a principal cause of smoking. Cigarette advertising and other forms of
marketing likely do increase the amount of smoking in a statistically significant manner, possibly accounting for as much as 10 percent of cigarette consumption. (This figure is likely to vary among societies, depending on the maturity of the smoking market and on familiarity with large-scale marketing campaigns.) The converse, of course, is that advertising and marketing almost certainly do not account for the majority of cigarettes consumed. For these, one must turn to other influences, all less policy tractable than advertising, including role modeling, peer behavior, and the addictiveness of nicotine and smoking.
5.3 Restrictions on Smoking in Public Places
‘Clean indoor air laws,’ which restrict or prohibit smoking in public places and workplaces, grew out of concerns that the exposure of nonsmokers to environmental tobacco smoke (ETS) could create a risk of disease, leading in the US to rapid diffusion of state laws, beginning in 1973. A decade later, similar laws emerged at the local level of government, where most of the legislative action in the US has remained since then (Brownson et al. 1997). The scientific knowledge base actually followed early diffusion of legislation, with a number of studies of the relationship between ETS exposure and the risk of lung cancer published in the 1980s (Brownson et al. 1997). By the 1990s, the research base had become sufficiently strong that the US Environmental Protection Agency (EPA) declared ETS a ‘Class A Carcinogen,’ a proven environmental cause of cancer in nonsmoking humans. The EPA estimated that ETS caused approximately 3,000 lung cancer cases annually in the US, and also detailed nonfatal respiratory disease effects of ETS, particularly in children (US Environmental Protection Agency 1992). More recently, research has implicated ETS in heart disease deaths in adult nonsmokers, possibly an order of magnitude greater than the lung cancer impact (American Council on Science and Health 1999). Social and behavioral scientists have informed the debate by studying the process of the emergence and diffusion of clean indoor air laws and, subsequently, the effects of such laws on nonsmokers’ exposure to ETS and on smokers’ smoking behavior. Studies have found generally high levels of compliance with the laws, even in bars (where low compliance was anticipated by many) (Warner 2000), and reductions in employees’ exposure to ETS, as measured by self-reports, air sampling, and examination of employees’ body burdens of cotinine, a nicotine derivative.
Less obvious is whether clean indoor air laws discourage smoking or merely ‘reallocate’ it to times and places in which smoking is permitted. According to several studies, the laws do discourage smoking among workers in regulated workplaces, producing increased quit rates and lower daily consumption among continuing smokers (Brownson et al. 1997).
5.4 Interaction of Policies
As policy research becomes more sophisticated, the relationships among policies and their impacts will come to be better understood. Illustrative is research examining the joint effects of tax increases and the adoption of clean indoor air laws. For example, some of the smoking decline credited to tax increases might be associated with a citizenry more interested in reducing smoking. In turn, the latter could be reflected in the adoption of clean indoor air laws. Examining the joint effects of the two policies confirmed researchers’ suspicions, thereby suggesting a reduced (but still quite substantial) impact of taxation on smoking. Similarly, disentangling the impacts of state-level tobacco control programs that mix tax increases with media antismoking campaigns is important but analytically challenging. Recent research paves the way toward better evaluation of multiple-component policies (Chaloupka 1999, Chaloupka and Warner 2000).
6. The Future of Social and Behavioral Science Contributions to Tobacco Control
The social and behavioral sciences have contributed greatly toward understanding how and why the epidemic of smoking has evolved throughout the twentieth century. They have also elucidated a set of tools that can help society to extricate itself from the tenacious grip of this public health disaster. With innovations in information and pharmacological technology combining with better insights into human behavior, further improvements in assisting smokers to quit appear to be virtually certain. Although the long-range goal must focus on preventing future generations of children from starting to smoke, helping current adult smokers to quit remains critical to reducing the tobacco-attributable burden of disease and mortality over the next three decades. As such, the role of social and behavioral scientists in smoking cessation will likely be increasingly productive. The effectiveness of efforts to encourage smokers to quit depends to a significant degree on the environment in which smoking occurs. If smoking is increasingly viewed as antisocial, quitting smoking will become easier, and more urgent, for current smokers. Policy making is society’s best means of intentionally altering the environment; and better understanding of the effects of policy interventions, and of how they come to be adopted, will be crucial to shaping the environment with regard to smoking in the coming years. The multidimensional problems associated with tobacco use vividly illustrate the need for scientists of all disciplinary persuasions to work together. The next great challenge confronting the diverse fields of social and behavioral science in tobacco control is how to combine the methods and insights of the various disciplines into an integrated whole (Chaloupka 1999).
See also: Drug Addiction; Drug Addiction: Sociological Aspects; Drug Use and Abuse: Cultural Concerns; Drug Use and Abuse: Psychosocial Aspects; Smoking Prevention and Smoking Cessation; Substance Abuse in Adolescents, Prevention of
Bibliography
Agency for Health Care Policy and Research 1996 Smoking Cessation: Clinical Practice Guideline, No. 18, Information for Specialists. US Department of Health and Human Services, Public Health Service, Agency for Health Care Policy and Research (AHCPR Publication No. 96-0694), Rockville, MD
American Council on Science and Health 1999 Environmental Tobacco Smoke: Health Risk or Health Hype? American Council on Science and Health, New York
Brownson R C, Eriksen M P, Davis R M, Warner K E 1997 Environmental tobacco smoke: Health effects and policies to reduce exposure. Annual Review of Public Health 18: 163–85
Chaloupka F J 1999 Macro-social influences: The effects of prices and tobacco control policies on the demand for tobacco products. Nicotine & Tobacco Research 1(51): 105–9
Chaloupka F J, Grossman M, Bickel W K, Saffer H (eds.) 1999 The Economic Analysis of Substance Use and Abuse: An Integration of Econometric and Behavioral Economic Research. University of Chicago Press, Chicago
Chaloupka F J, Laixuthai A 1996 US trade policy and cigarette smoking in Asia. National Bureau of Economic Research Working Paper No. 5543. NBER, Cambridge, MA
Chaloupka F J, Warner K E 2000 The economics of smoking. In: Culyer A J, Newhouse J P (eds.) Handbook of Health Economics. Elsevier, Amsterdam
Cromwell J, Bartosch W J, Fiore M C, Hasselblad V, Baker T 1997 Cost-effectiveness of the clinical practice recommendations in the AHCPR guideline for smoking cessation. Journal of the American Medical Association 278: 1759–66
Evans W N, Farrelly M C 1998 The compensating behavior of smokers: taxes, tar, and nicotine. RAND Journal of Economics 29: 578–95
Fritschler A L, Hoefler J M 1996 Smoking and Politics: Policy Making and the Federal Bureaucracy, 5th edn. Prentice Hall, Upper Saddle River, NJ
Goodman J 1993 Tobacco in History: The Cultures of Dependence. Routledge, New York
Kagan R A, Vogel D 1993 The politics of smoking regulation: Canada, France, and the United States. In: Rabin R L, Sugarman S D (eds.) Smoking Policy: Law, Politics, & Culture. Oxford University Press, New York
Lantz P M, Jacobson P D, Warner K E, Wasserman J, Pollack H A, Berson J, Ahlstrom A 2000 Investing in youth tobacco control: a review of smoking prevention and control strategies. Tobacco Control 9: 47–63
McGinn A P 1997 The nicotine cartel. World Watch 10(4): 18–27
Prabhat J, Chaloupka F J 1999 Curbing the Epidemic: Governments and the Economics of Tobacco Control. World Bank, Washington, DC
Prochaska J O, DiClemente C C 1983 Stages and processes of self-change of smoking: toward an integrative model of change. Journal of Consulting and Clinical Psychology 51: 390–5
Roemer R 1993 Legislative Action to Combat the World Tobacco Epidemic, 2nd edn. World Health Organization, Geneva, Switzerland
Royal College of Physicians of London 1962 Smoking and Health: A Report on Smoking in Relation to Cancer of the Lung and other Diseases. Pitman Publishing Co., London
Saffer H, Chaloupka F J 1999 Tobacco advertising: economic theory and international evidence. National Bureau of Economic Research Working Paper No. 6958. NBER, Cambridge, MA
Strecher V J 1999 Computer-tailored smoking cessation materials: A review and discussion. Patient Education and Counseling 36: 107–17
US Department of Health and Human Services 1988 The Health Consequences of Smoking: Nicotine Addiction. (DHHS Publication No. (CDC) 88-8406), Government Printing Office, Washington, DC
US Department of Health and Human Services 1989 Reducing the Health Consequences of Smoking: 25 Years of Progress. A Report of the Surgeon General. US Department of Health and Human Services, Public Health Service, Centers for Disease Control, National Center for Chronic Disease Prevention and Health Promotion, Office of Smoking and Health (DHHS Publication No. (CDC) 89-8411), Rockville, MD
US Department of Health and Human Services 1990 The Health Benefits of Smoking Cessation. A Report of the Surgeon General. US Department of Health and Human Services, Public Health Service, Centers for Disease Control, National Center for Chronic Disease Prevention and Health Promotion, Office of Smoking and Health (DHHS Publication No. (CDC) 90-8416), Rockville, MD
US Department of Health and Human Services 1994 Preventing Tobacco Use Among Young People: A Report of the Surgeon General. US Department of Health and Human Services, Public Health Service, Centers for Disease Control, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health. US Government Printing Office, Washington, DC
US Environmental Protection Agency 1992 Respiratory Health Effects of Passive Smoking: Lung Cancer and Other Disorders. US Environmental Protection Agency, Washington, DC
US Public Health Service 1964 Smoking and Health. Report of the Advisory Committee to the Surgeon General of the Public Health Service. US Department of Health, Education, and Welfare, Public Health Service, Center for Disease Control. PHS Publication No. 1103, Washington, DC
Warner K E 2000 The economics of tobacco: Myths and realities. Tobacco Control 9: 78–89
Warner K E 2001 Reducing harms to smokers: Methods, their effectiveness, and the role of policy. In: Rabin R L, Sugarman S D (eds.) Regulating Tobacco: Premises and Policy Options. Oxford University Press, New York
World Health Organization 1997 Tobacco or Health: A Global Status Report. World Health Organization, Geneva, Switzerland
Wynder E L, Graham E A 1950 Journal of the American Medical Association 143: 329–96
K. E. Warner
Smoking Prevention and Smoking Cessation
Cigarettes became popular after 1915. Today, approximately 1.1 billion people aged 15 years and over smoke: about one-third of the global population (see Smoking and Health). This article provides an overview of health effects, and methods for smoking prevention and cessation.
1. Health Effects
In the late 1940s, epidemiologists noticed that annual death rates due to lung cancer had increased fifteenfold between 1922 and 1947 in several countries. Since the middle of the twentieth century, tobacco products have contributed to more than 60 million deaths in developed countries. The estimated annual mortality is 540,000 in the European Union, 461,000 in the USA, and 457,000 in the former USSR (Peto et al. 1994). Tobacco is held responsible for three and a half million deaths worldwide yearly: about seven percent of all deaths per year. This figure is projected to grow to ten million deaths per year by the 2020s: about 18 percent of all deaths in developed countries and 11 percent of all deaths in developing countries. Half a billion people now alive will then be killed by tobacco products; these products will have killed more people than any other single disease (WHO 1998). More than 40 chemicals in tobacco smoke cause cancer. Tobacco is a known or probable cause of about 25 diseases. Tobacco is recognized as the most important cause of lung cancer, but it kills even more people through many other diseases, including cancers at other sites, heart disease, stroke, emphysema, and other chronic lung diseases. Smokeless tobacco and cigars also have deadly consequences, including lung, larynx, esophageal, and oral cancer (USDHHS 1994). Lifetime smokers have a 50 percent chance of dying from tobacco. Half of these will die in middle age, before age 70, losing 22 years of normal life expectancy. Exposure to environmental tobacco smoke (ETS) has been found to be an established cause of lung cancer, ischemic heart disease, and chronic respiratory disease in adults. Reported effects for children are sudden infant death syndrome, bronchial hyperresponsiveness, atopy, asthma, respiratory diseases, reduced lung function, and middle ear disease. Barnes and Bero (1998) demonstrated from 106 reviews that conclusions on ETS were associated with the affiliations of the researchers. Overall, 37 percent concluded that passive smoking was not harmful to health. However, 74 percent of these were written by researchers with tobacco industry affiliations. The only factor associated with concluding that passive smoking is not harmful was whether the author was affiliated with the tobacco industry. Declining consumption in developed countries has been counterbalanced by increasing consumption in developing countries. Globally, 47 percent of men and 12 percent of women smoke. In developing countries 48 percent of men and 7 percent of women smoke, while in developed countries 42 percent of men smoke as do 24 percent of women. Tobacco use is regarded as the single most important public health issue in industrialized countries.
2. Smoking Prevention
The process of becoming a smoker can be divided into several stages: preparation (never smoking), initial smoking, experimental or occasional (monthly) smoking, and regular (weekly and daily) smoking. In the preparatory stage attitudes towards smoking are formed. Although at least 90 percent of the population has smoked a cigarette at some time, the likelihood of becoming a regular smoker increases if initial smoking is repeated several times. In the third stage, a child learns how to smoke and the perceived advantages of smoking start to outweigh the disadvantages. In the fourth stage smoking becomes a routine. Most onset of smoking takes place during adolescence, between the ages of 10 and 20. Smoking onset as well as cessation are influenced by a variety of cultural (e.g., availability, litigation, smoke-free places), biological (addiction), demographic (e.g., socioeconomic status, parent education), social (e.g., parental and peer smoking, parental and peer pressure, social bonding), and psychological (e.g., self-esteem, attitudes, self-efficacy expectations) factors. Various efforts have been undertaken to prevent youngsters from starting to smoke (see e.g., Hansen 1992, Reid et al. 1995, USDHHS 1994).
2.1 School-based Prevention Programs
Three types of school programs can be distinguished (see Health Promotion in Schools). Knowledge programs were not effective. The social influence approaches result in reduced onset ranging from 25 to 60 percent; effects may persist up to four years. Long-term effects are found with programs embedded in more comprehensive community approaches. The method includes five to ten lessons, emphasizes short-term consequences of smoking, discusses social (mostly peer) pressures, and includes refusal skills training. Life skills approaches focus on the training of generic life skills. Their effects are less strong than those of the social influence approaches.
2.2 Out-of-school Approaches
Youngsters can also be reached out of school. Mass media approaches and nonsmoking clubs are popular methods. They attract attention to the subject and may influence attitudes. A review of 63 studies about the effectiveness of mass media found small effects on behavior (Sowden and Arblaster 1999). Targeting smoking parents is important as well; children are almost three times as likely to smoke if their parents do. Helping parents to quit smoking may prevent their children from starting to smoke and may encourage adolescent cessation.
2.3 Policies
A range of policy interventions can be used to stimulate the prevention of smoking.
(a) Price policies can have preventive effects. Higher prices encourage cessation among current smokers and discourage initiation among young smokers. A price elasticity of −0.5 implies that a ten percent increase in price reduces consumption or prevalence by five percent. Reported price elasticities range between −0.20 and −0.50.
(b) School policies can stimulate smoking prevention. A study of the impact of school smoking policies on over 4,000 adolescents in 23 Californian schools found that schools with many smoking prevention policies had significantly lower smoking rates than schools with fewer policies and less emphasis on smoking prevention.
(c) Public smoking restrictions can contribute to adolescents’ beliefs that nonsmoking is normative and that smoking creates health problems. Smoking regulations are more effective in preventing teenagers from starting to smoke than in reducing their consumption. Restriction of sales to minors often results in noncompliance: between 60 and 90 percent of adolescents succeeded in buying tobacco products in situations where this was not allowed.
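The price elasticity arithmetic in item (a) can be illustrated with a short sketch. It assumes the simple linear reading of elasticity (percent change in consumption equals elasticity times percent change in price); the function name and the ten percent price increase are illustrative choices, not taken from the studies cited.

```python
def consumption_change_pct(elasticity: float, price_change_pct: float) -> float:
    """Approximate percent change in consumption for a given percent price change.

    Assumes the linear approximation: %change in quantity = elasticity * %change in price.
    """
    return elasticity * price_change_pct


# Hypothetical ten percent price increase, evaluated at both ends of the
# elasticity range reported in the text (-0.20 to -0.50).
for elasticity in (-0.20, -0.50):
    change = consumption_change_pct(elasticity, 10.0)
    print(f"elasticity {elasticity:+.2f}: consumption changes by {change:+.1f} percent")
```

Under this approximation the same ten percent price increase lowers consumption by two percent at the weak end of the range and by five percent at the strong end.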
3. Smoking Cessation
Physical addiction is caused by the pharmacological effects of nicotine. Psychological addiction occurs because smoking becomes linked with numerous daily circumstances and activities (e.g., eating, drinking) and emotional and stressful events. A person becomes motivated to quit if he or she has a positive attitude, encounters social support, and has high self-efficacy expectations towards quitting (De Vries and Mudde 1998). Cessation is a process: a smoker in precontemplation is not considering quitting within six months; a smoker in contemplation is considering quitting within six months, but not within a month; a smoker in preparation is considering quitting within a month; a person in action has quit recently; a person in maintenance has quit for more than six months (Velicer and Prochaska 1999). Three outcome measures can be used to assess smoking cessation: point prevalence abstinence (not having smoked during the preceding seven days), prolonged abstinence (not having smoked during six or twelve months), and continuous abstinence (not having smoked at all since the time of intervention). In longitudinal experimental designs more smokers than quitters may drop out, thus resulting in overly optimistic estimates of treatment success. To correct for this bias, dropouts are coded as smokers; this is referred to as the ‘intention to treat procedure.’ This procedure may, however, result in conservative effect estimates. The desirability of biochemical validation is still controversial. Misreporting seldom exceeds five percent. Detection of occasional smoking in youngsters is difficult and expensive. Biochemical validation of self-reports should be considered when high-demand situations are involved. A random subsample can be used to estimate bias and correct reported cessation rates. Cotinine has emerged as the measure of choice, with carbon monoxide as a cheaper but less sensitive alternative (Velicer et al. 1992). Comparing the results of cessation studies is hindered by differences in outcome measures, populations (e.g., the percentage of precontemplators), and follow-up periods. This overview includes methods that have evidence for success. The numerous studies assessing and reviewing the efficacy of smoking cessation interventions provide different success rates.
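The effect of the ‘intention to treat procedure’ on a reported quit rate can be sketched as follows. The follow-up data are hypothetical and the function names are illustrative; the sketch only contrasts a rate computed among participants who completed follow-up with a rate in which every dropout is coded as a smoker.

```python
from typing import Optional, Sequence


def completer_quit_rate(outcomes: Sequence[Optional[bool]]) -> float:
    """Quit rate among participants with follow-up data (None marks a dropout)."""
    observed = [o for o in outcomes if o is not None]
    return sum(observed) / len(observed)


def intention_to_treat_quit_rate(outcomes: Sequence[Optional[bool]]) -> float:
    """Quit rate among all enrolled participants, with dropouts coded as smokers."""
    quitters = sum(1 for o in outcomes if o is True)
    return quitters / len(outcomes)


# Hypothetical trial arm: 20 quitters, 60 continuing smokers, 20 dropouts.
outcomes = [True] * 20 + [False] * 60 + [None] * 20

print(completer_quit_rate(outcomes))           # 20 / 80 = 0.25
print(intention_to_treat_quit_rate(outcomes))  # 20 / 100 = 0.2
```

If some of the dropouts had in fact quit, the second figure understates the true rate, which is why the procedure is described as conservative.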
Because success rates vary across studies, the figures reported below are estimates derived from the studies cited in each subsection, as well as from inspection of other publications.
3.1 Pharmacotherapy
Pharmacotherapeutic interventions increase quit rates approximately 1.5- to 2-fold. The absolute probability of not smoking at 6–12 months is greater when additional high-intensity support is provided (Hughes et al. 1999, Silagy et al. 1999). Nicotine Replacement Therapy (NRT) products are available in a number of forms, including gum, transdermal patch, nasal spray, lozenge, and inhaler. NRT is recommended to be part of the core treatment package offered to all smokers. There are few instances in which the use of NRT is contraindicated. A meta-analysis including 49 trials found significant effects for gum, patches, nasal spray, inhaled nicotine, and sublingual tablets. These effects were largely independent of the intensity of additional support provided or the setting in which the NRT was offered. The efficacy of NRT appears also to be largely independent of other elements of treatment, although absolute success rates are higher with more intensive behavioral support. The NRT cessation effects range from 6 to 35 percent, mostly ranging between 15 and 25 percent. Bupropion is an atypical antidepressant that has both dopaminergic and adrenergic actions. Unlike NRT, smokers begin bupropion treatment one week prior to cessation. The recent evidence shows that cessation rates are comparable to those achieved with NRT, between 15 and 25 percent (Hurt et al. 1997, Jorenby et al. 1999). One study found that the nicotine patch was less effective than bupropion (Jorenby et al. 1999). The drug appears to work equally well in smokers with and without a past history of depression.
3.2 Motivational Strategies
Various studies report on the effectiveness of group courses, self-help materials, computer tailoring, and competitions (e.g., Eriksen and Gottlieb 1998, Fisher et al. 1993, Matson et al. 1993, Skaar et al. 1997, Strecher 1999, Velicer and Prochaska 1999). Group courses can be effective. Studies suggest 10 to 30 percent abstinence rates. The disadvantage is that they attract mostly smokers who are highly motivated to quit, and that many smokers want to quit on their own. Self-help cessation interventions use different formats such as brochures, cassettes, and self-help guides. Point prevalence quit rates at one-year follow-up range between 9 and 15 percent. Competitions were effective in three out of five workplace studies. No study showed enhanced smoking cessation past six months. Net effects on the cessation rate of competition plus group cessation program over group program alone varied between 1 and 25 percent in three studies. Computer tailoring results in personalized materials for smokers which are based on the (cognitive) characteristics of a person as measured by a questionnaire. Large segments of smokers can be reached (De Vries and Brug 1999). A review of ten randomized trials found significant effects up to 25–30 percent.
3.3 Policy Interventions
Policy interventions can affect cessation (Cummings et al. 1997, Eriksen and Gottlieb 1998, Glantz et al. 1997, Meier and Licari 1997, Zimring and Nelson 1995). Price policies, as with adolescents, influence cessation rates. Price elasticities for adults range between −0.3 and −0.55. Increases in cigarette prices may place a greater burden on those with lower incomes, who tend to have greater difficulty in stopping smoking. Smoke-free areas can be effective as well, although the evidence is not unequivocal. Smoking bans in workplaces resulted in reduced tobacco consumption or cessation at work, with results ranging from 12 percent to 39 percent. The findings on reductions in prevalence were not consistent. Advertising, including tobacco sponsorship of events, is believed to stimulate the uptake of smoking and to reinforce the habit. This finding was supported by results from the COMMIT trial. The use of health warnings, however, has been found to be able to reduce tobacco consumption. Litigation by states and other groups may change the way tobacco is advertised and sold.
3.4 Special Settings for Cessation
Multiple smoking cessation interventions can be applied in specific settings. The goal is to attract large segments of smokers, or particular segments of smokers. Advice by health intermediaries (physicians, nurses, midwives) is effective. A review describing the efficacy of 188 randomized controlled trials reported a two percent cessation rate resulting from a single routine consultation by physicians. While modest, the results were cost-effective. The effects are most salient among special risk groups, such as pregnant women and patients with cardiac diseases, although the additional cessation rates range from approximately 5 to 30 percent (Law and Tang 1995). In workplace settings cessation rates were four percent for a simple warning and eight percent for short counseling. They were highest among high-risk groups: 24 percent among workers at high risk for cardiovascular problems and 29 percent among asbestos-exposed workers (Eriksen and Gottlieb 1998). Community interventions have the potential to reach all segments of a community. They combine different methods, such as mass media approaches, counseling, and self-help materials. Both the broader cardiovascular risk reduction interventions and those focusing solely on smoking had very small effects (1–3 percent cessation rates), although somewhat larger effects may be reached when combining them with mass media approaches (Fisher et al. 1993, Velicer and Prochaska 1999). Workplace programs also often combine methods. They can reach large and diverse groups of smokers and may produce an average long-term quit rate of 13 percent (ranging from 1.5 percent to 37 percent) at an average of 12 months’ follow-up regardless of the intervention methods (Eriksen and Gottlieb 1998). Group programs were more effective than minimal programs, although less intensive treatments, when combined with high participation rates, can influence the total population as well.
4. Conclusions
School-based smoking prevention can prevent or delay smoking onset. It is unrealistic to expect long-term effects with short programs when prosmoking norms are communicated through multiple channels. Consequently, at least ten lessons are needed during adolescence in several grades. Outside of school, programs are needed, since adolescents with the highest rates of tobacco use are least likely to be reached through school-based programs (Stanton et al. 1995). Innovative cessation interventions are needed as well, because youth cessation programs show no long-term success. Since multiple interventions are found to be most effective, priority should be given to broad-based interventions aimed at the youngsters, the schools, the parents, and the community as a whole, including mass campaigns for all age groups, fiscal policies, restrictions on smoking, and bans on advertising. Pharmacotherapy, motivational strategies, and policies are effective tools to promote cessation. The potential of other antidepressants (e.g., nortriptyline) for smoking cessation needs attention as well. Innovative media strategies, such as computer tailoring, may reach less motivated smokers. Specific attention for underprivileged populations (e.g., ethnic populations, low socioeconomic groups) is needed as well. The combination of multiple methods given by multiple health care providers on multiple occasions will result in the greatest impact. Future research should assess the effectiveness of programs in smokers differing in motivational stages, since the most effective methods mostly attract smokers who are motivated to quit. Effective methods that only reach a small segment of motivated smokers have a low impact. Less effective methods that reach large segments of the population have a higher impact. Velicer and Prochaska (1999) suggest the greatest impact for computer-tailored programs. Finally, cost-effectiveness studies reveal that cessation methods are most cost-effective when they target a particular subgroup, combine multiple interventions, and aim at maintaining abstinence.
See also: Health Behaviors; Health Education and Health Promotion; Health: Self-regulation
Bibliography
Barnes D E, Bero L A 1998 Why review articles on the health effects of passive smoking reach different conclusions? Journal of the American Medical Association 279: 1566–70
Cummings K M, Hyland A, Pechacek T, Orlandi M, Lynn W R 1997 Comparison of recent trends in adolescent and adult cigarette smoking behaviour and brand preferences. Tobacco Control 6: suppl: 31–7
De Vries H, Brug H 1999 Computer-tailored interventions motivating people to adopt health promoting behaviours; introduction to a new approach. Patient Education and Counseling 36: 99–105
De Vries H, Mudde A 1998 Predicting stage transitions for smoking cessation applying the Attitude–Social influence–Efficacy model. Psychology & Health 13: 369–85
Eriksen M P, Gottlieb N H 1998 A review of the health impact of smoking control at the workplace. American Journal of Health Promotion 13: 83–104
Fisher B E, Lichtenstein E, Haire-Joshu D, Morgan G D, Rehberg H R 1993 Methods, successes, and failures of smoking cessation programs. Annual Reviews of Medicine 44: 481–513
Glantz S A, Fox B J, Lightwood J M 1997 Tobacco litigation. Issues for public health and public policy. Journal of the American Medical Association 277: 751–3
Hansen W B 1992 School-based substance abuse prevention: a review of the state of art in curriculum, 1980–1990. Health Education Research 7: 403–30
Hughes J R, Goldstein M G, Hurt R D, Shiffman S 1999 Recent advances in the pharmacotherapy of smoking. Journal of the American Medical Association 281: 72–6
Hurt R D, Sachs D P L, Glover E D et al. 1997 A comparison of sustained-release bupropion and placebo for smoking cessation. New England Journal of Medicine 337: 1195–202
Jorenby D E, Leishow S J, Nides M A, Rennard S I, Johnston J A, Hughes A R, Smith S S, Muramoto M L, Daughton D M, Doan K et al. 1999 A controlled trial of sustained-release bupropion, a nicotine patch or both for smoking cessation. New England Journal of Medicine 340: 685–91
Law M, Tang J L 1995 An analysis of the effectiveness of interventions intended to help people stop smoking [see comments]. Archives of Internal Medicine 155 (18): 1933–41
Matson D M, Lee J W, Hopp J 1993 The impact of incentives and competitions on participation and quit rates in worksite smoking cessation programs. American Journal of Health Promotion 7: 270–80
Meier K J, Licari M J 1997 The effect of cigarette taxes on cigarette consumption 1955–1994. American Journal of Public Health 87: 1126–30
Peto R, Lopez A D, Boreham J, Thun M, Heath C 1994 Mortality from Smoking in Developed Countries 1950–2000. Oxford University Press, Oxford, UK
Reid D J, McNeill A, Glynn T 1995 Reducing the prevalence of smoking in youth in Western countries: An international review. Tobacco Control 4: 266–77
Skaar K L, Tsoh J Y, McClure J B, Cinciripini P, Friedman K, Wetter D W, Gritz E 1997 Smoking cessation 1: An overview of research. Behavioral Medicine 23: 5–13
Silagy C, Mant D, Fowler G, Lancaster T 1999 Nicotine replacement therapy for smoking cessation (Cochrane Review). In: The Cochrane Library, Issue 4. Update Software, Oxford, UK
Sowden A J, Arblaster L 1999 Mass media interventions for preventing smoking in young people (Cochrane Review). In: The Cochrane Library, Issue 1. Update Software, Oxford, UK
Stanton W R, Gillespie A M, Lowe J R 1995 Reviewing the needs of unemployed youth in smoking intervention programmes. Drug and Alcohol Review 14: 101–8
Strecher V J 1999 Computer-tailored smoking cessation materials: A review and discussion. Patient Education and Counseling 36: 107–17
USDHHS 1994 Preventing Tobacco Use Among Young People: A Report of the Surgeon General. US Department of Health and Human Services, Public Health Service, Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health, Atlanta, GA
Velicer W, Prochaska J 1999 An expert system intervention for smoking cessation. Patient Education and Counseling 36: 119–29
Velicer W, Prochaska J, Rossi J, Snow M 1992 Assessing outcome in smoking cessation studies. Psychological Bulletin 111: 23–41
WHO 1998 Tobacco epidemic: Health dimensions. Fact Sheet 154. World Health Organization, Geneva, Switzerland
Zimring F E, Nelson W 1995 Cigarette taxes as cigarette policy. Tobacco Control 4: suppl: 25–33
H. de Vries
Soap Opera/Telenovela
‘Soap opera’ is a pejorative term coined in the 1930s by the entertainment trade press of the United States to designate the daytime dramatic serials that were broadcast on radio and aimed primarily at women. The word ‘soap’ refers to the sponsorship of the programs, which were usually produced by the advertising agencies of the soap and toiletry industries (see Advertising Agencies). The word ‘opera’ refers ironically to the dramatic character of the genre and its ‘low cultural quality.’ Soap operas are daytime serials with a dramatic content aimed primarily at a female audience. The genre emerged in the early period of the radio industry and has been broadcast by television stations since the 1950s. ‘Telenovela’ is a term used throughout Latin America to designate the melodramatic serials that became the most popular programs of the television industries of the region. They have been exported to every continent in the world. Although telenovelas have a similar history and share common features with soap operas, they are two distinct genres (see Television: Genres).
1. Soap Opera
The soap opera emerged in the US radio industry of the early 1930s. The genre developed as a response to a fundamental problem for the commercial radio stations and the advertising agencies: how to achieve the largest audience of potential consumers of certain products. The advertising agencies of the soap, toiletry, and foodstuff industries took on the role of developing programs that could attract the members of the general audience that were the main potential consumers: women between the ages of 18 and 49. When searching for formulas that could do this job, the agencies found in serials a type of narrative that
proved to be suitable in creating a faithful audience. By the late 1920s, a program that relied upon narrative and fictional characters, Amos ‘n’ Andy, established a crucial precedent by demonstrating the appeal of the radio serial form. The program presented the adventures of two Southern black men living in the city of Chicago and achieved an estimated daily audience of 40 million listeners by 1929. The first radio programs that can be classified as soap operas appeared in the early 1930s and were based on these early successful experiments. The serial Painted Dreams, first broadcast in October 1930 by the station WGN, told the story of an Irish woman, her household, and daughter. The program was written by a school teacher turned radio actress, Irna Phillips, who has been credited with ‘inventing’ the genre soap opera. Phillips wrote many successful plots, for example, The Guiding Light, the longest running soap ever. The serial was introduced in January 1937, and lasted for 19 years on radio. It is still broadcast on television by CBS. Frank Hummert and his wife, Anne, were other key figures in the birth of the soap opera. Their advertising agency created one of the first daytime serials ever produced, Stolen Husband, in 1931. Although the serial failed, it helped the Hummerts develop the formula that proved to be successful in attracting large audiences. Between 1933 and 1937, soap operas consolidated their dominant position in radio daytime programming and became the primary advertising vehicle for the soap, toiletry, and foodstuff industries. Although the soap opera faced a decline in broadcasting time and in number after this period of expansion, it remained one of the most popular genres in the history of radio. By 1940, the 64 serials being broadcast each day constituted 92 percent of all sponsored daytime broadcast hours and the 10 highest-rated daytime programs were all soap operas (Allen 1985).
It is estimated that in the same year, one of the best in the history of daytime serials, about 20 million female listeners, approximately half the women at home during the day, listened to two or more serials daily (Cantor and Pingree 1983) (see also Radio as Medium). The radio era of the soap opera in the US was terminated by the advent of a new technology, television, which would soon become the most important advertising medium. By 1955, radio soap operas were practically eliminated and the last ones were discontinued in the 1960–61 season. When the last daytime serials left radio in the early 1960s, they were already an established form of television programming and advertisers continued to produce soap operas after the transition to television. Procter and Gamble, for example, produced the first television network soap opera, The First Hundred Years, first broadcast by CBS in December 1950. The reason for the longevity of the daytime serials and their successful transition to television was their suitability for fulfilling one key demand of the commercial broadcasting
system: the need to attract the audience of heavy consumers in a cost-effective manner. The main feature of the narrative text offered by the soap operas is ‘openness.’ Soaps are characterized by the absence of ultimate closure, since the plots never begin or end, staying on the air for several years or decades. The central elements of the serials’ content usually gravitate towards issues such as love, family, intimate relationships, and other domestic concerns, although the genre has been characterized by a process of diversification. As we have seen, the history of the soap opera can be traced to the US. Nevertheless, the dramatic serials have become a common programming form all over the world, with specific national peculiarities (e.g., Allen 1995). In Europe, for example, an analysis of the network of family and romantic relationships portrayed by soap operas has shown that European serials are not simply an imported American genre. Europe has developed distinctive subtypes of the genre, most notably the ‘community model’ in which loves and family conflicts are colored by more or less pedestrian hardships of sickness, unemployment, and teenage drug habits (Liebes and Livingstone 1998). By definition, soap operas are daytime serials. Nevertheless, in the case of the US, some dramatic serials, such as Dallas and Dynasty, have been broadcast during prime-time hours and have achieved huge success at both the national and the international level. These prime-time serials have frequently been defined as soap operas, but they differ from daytime serials in several respects. Prime-time serials demand a more expensive production structure and have fewer episodes. They also have a faster pace that leads to more resolution of conflicts when compared to soap operas (Cantor and Pingree 1983).
Despite the fact that soap operas have been one of the most popular and central genres in the history of television (see Television: History), there were very few published works on television serials until the early 1970s. This ‘invisibility’ of the soap opera was due, to a great extent, to the low position it occupies in the hierarchy of taste in the cultural field (see Taste and Taste Culture) and to the unequal gender relations that constrain it. Media, academic, and public discourses about soap operas have been marked by the fact that these serials are ‘woman’s forms.’ It was feminism that transformed soap opera into a field for academic inquiry, although some feminists have also responded in a hostile manner toward the genre. Initially, feminist scholars used textual analysis to compare ‘real’ women with their stereotypical and unrealistic images (the sex object and the housewife), criticizing soap operas’ tendency to confirm female subordination. Later on, feminists used more varied methodological approaches, including qualitative audience research, to affirm and authenticate the pleasures offered by the genre to its female audiences. The feminist notion that ‘the personal is political’ helped turn media research in
the 1970s and 1980s away from news and current affairs toward ‘softer’ programs. Ensuing studies found soap operas’ focus on the domestic sphere of women’s everyday lives to provide meaningful, rather than escapist, entertainment; they tended to validate, rather than condemn, the role that soap opera viewing plays in women’s lives (see Brunsdon 1997, Brunsdon et al. 1997).
2. Telenovela
Telenovelas are melodramatic serials produced in Latin America. They have become the most popular programs of the television industries of the region and a successful phenomenon in the international market. Although telenovelas have features in common with soap operas, they are two distinct genres. Soap operas are broadcast during the day, usually in the afternoon, while telenovelas dominate the prime-time slots of Latin American TV networks. As a result, soap operas are watched primarily by women while telenovelas attract a broader audience in terms of age and gender. If soaps are ‘open’ forms, broadcast for years or decades, telenovelas are ‘closed,’ having a clear beginning and an end, and lasting approximately 180–200 episodes or 6–8 months. Finally, if soaps are characterized by authorial anonymity, telenovela writers in Latin America, particularly in Brazil, have personal styles that are recognized by the public (see Vink 1988). As with their American counterparts, telenovelas have their roots in the daytime serials that were broadcast by the commercial radio stations, the radionovelas. Early on, Cuba consolidated a commercial broadcasting system, which became an important production center of radio melodramatic serials. By 1930, the city of Havana had proportionally more radio stations than New York and in 1932 Cuba was the third country in the world in terms of the number of radio receivers (Ortiz et al. 1989). A key figure in the development of the radio serials in Latin America was the Cuban writer Felix Caignet. After the first radionovelas were broadcast in Cuba around 1935, Caignet started writing several melodramatic texts which became popular not only in Cuba, but all over Latin America. For example, the story El Derecho de Nacer (The Right to Be Born) has been broadcast by radio stations all over the region for several decades and was adapted with great success for television in the 1960s.
After the introduction of the first television stations in Mexico and Brazil in 1950, telenovelas gradually became the most popular programming form of the television industries of the continent. Since the late 1960s, telenovelas have dominated prime-time and commanded the highest advertising rates. Globo Network in Brazil and Televisa in Mexico became
giant media corporations and their successful programming strategies have been based largely on the local production of telenovelas. In the late 1970s, after becoming virtual monopolies in their national markets, these companies began exporting telenovelas with great success. Globo pioneered the conquest of the international market in the early 1980s by exporting the telenovela Escrava Isaura (Isaura, the Slave), which became a national obsession in countries like Poland, China, and Cuba. Mexico, on the other hand, concentrated first on Latin America and Hispanic audiences in the US. Later on, some of its telenovelas, such as Los Ricos También Lloran (The Rich Also Weep), were successful phenomena in countries like Italy and Russia. Latin American telenovelas have been viewed in more than 120 countries and besides Globo and Televisa, the telenovelas from Venevision Network of Venezuela have also been successful in the world market (see Mattelart and Mattelart 1990, Lopez 1995, Martin-Barbero 1995). The Latin American telenovelas focus on melodramatic conflicts, passion, tragic suffering, and moral conflicts. There are, nevertheless, important national peculiarities in the development of the genre. Mexican telenovelas are known for their weepiness and lack of specific historical references, while Brazilian telenovelas are more ‘realistic’ in their depiction of specific historical and political events of the country (see Lopez 1995), although more recently Mexican telenovelas such as Nada Personal (Nothing Personal) and Venezuelan serials like Por estas Calles (In These Streets) have also dealt with political and social contexts in a more explicit way. Latin American telenovelas have, therefore, played a growing role in representing the cultural, social, and political conflicts that have shaped the region.
See also: Adolescents: Leisure-time Activities; Cultural Policy; Mass Communication: Normative Frameworks; Mass Media and Cultural Identity; Mass Society: History of the Concept; Media and History: Cultural Concerns; Media and Social Movements; Media Imperialism; Media, Uses of; Popular Culture; Television: Genres; Television: History; Television: Industry
Bibliography
Allen R C 1985 Speaking of Soap Operas. University of North Carolina Press, Chapel Hill, NC and London
Allen R C (ed.) 1995 To Be Continued … Soap Operas Around The World. Routledge, London and New York
Brunsdon C 1997 Screen Tastes: Soap Opera to Satellite Dishes. Routledge, London and New York
Brunsdon C, D’Acci J, Spigel L (eds.) 1997 Feminist Television Criticism: A Reader. Clarendon Press, Oxford, UK
Cantor M G, Pingree S 1983 The Soap Opera. Sage, Beverly Hills, CA
Liebes T, Livingstone S 1998 European soap operas: The diversification of a genre. European Journal of Communication 13: 147–80
Lopez A 1995 Our welcomed guests: Telenovelas in Latin America. In: Allen R C (ed.) To Be Continued … Soap Operas Around The World. Routledge, London and New York
Martin-Barbero J 1995 Memory and form in the Latin American soap opera. In: Allen R C (ed.) To Be Continued … Soap Operas Around The World. Routledge, London and New York
Mattelart M, Mattelart A 1990 The Carnival of Images: Brazilian TV Fiction. Bergin & Garvey, New York
Ortiz R, Borelli S, Ramos J 1989 Telenovela: História e Produção, 1st edn. (Telenovela: History and Production). Brasiliense, São Paulo
Vink N 1988 The Telenovela and Emancipation: A Study on Television and Social Change in Brazil. Royal Tropical Institute, Amsterdam
M. P. Porto
Sociability in History
Sociability is a notion used by historians to apprehend different forms of social relationships, in particular interpersonal bonds that are initiated either consciously or unconsciously in a given context. General distinctions are made, according to their social character, between bourgeois sociability and popular sociability, and, according to the shape it assumes, between formal sociability and informal sociability.
1. Sociological Background
Before historians adopted the term sociability, sociologists had used it in various ways. The first to cite is Georg Simmel, who made the notion of Geselligkeit (sociability) a key concept of his formal sociology. In his Grundfragen der Soziologie (1917), Chapter III is entitled Die Geselligkeit. Beispiel der Reinen oder formalen Soziologie. According to Simmel, sociability is a form of socialization, engendered by interactions exercised among individuals on the basis of reciprocity and equality. This new sociological approach made it possible to shift the emphasis of social analysis from the content to the form of the social games. Another sociologist for whom the notion of sociability was fundamental was Georges Gurvitch. For him, sociability comes under what he calls microsociology. In La vocation actuelle de la sociologie (1950), a chapter is devoted to the microsociological scale where different forms of sociability are dealt with. One significant point is his distinction between organized sociability and spontaneous sociability.
This corresponds by and large with the historians’ distinction between formal sociability and informal sociability.
2. Adoption by Historians of the Notion of Sociability
Agulhon was the first to use this notion as a key concept in his historical analysis of Provençal society. Agulhon was not inspired directly by the sociological theories mentioned above; their interpenetration came about later little by little. The starting point was his dissertation of 1966, La sociabilité méridionale: Confréries et associations en Provence orientale dans la deuxième moitié du XVIIIe siècle. In this work, Agulhon threw into relief an extremely dense network of social bonds, given concrete expression at the popular level in the shape of confraternities of penitents, and at the higher level, masonic lodges. He recognized in these phenomena certain characteristics of Provençal society, and called them sociabilité méridionale, or southern sociability. He later extended his research beyond the French Revolution to the mid-nineteenth century, and discerned at the base of republican movements developing in the villages a strong tradition of southern sociability. In the first half of the nineteenth century it is to be seen at the popular level in the increasing number of associations known as chambrées, and at the higher level in the proliferation of cercles (Agulhon 1970). These pioneering works by Agulhon reverberated widely among social historians and opened up a new direction in the field of social history. The two following points represent the especial novelty of the notion. (a) Until then, the fundamental categories that constituted the very basis of social history had been the notions of social class and nation. Agulhon introduced a new dimension by adopting an ethnological perspective aimed at penetrating the intimacy of everyday life and discerning there the original social bonds that govern the society in question.
(b) Social history in France came into being through criticizing traditional political history, so much so that there was a certain tendency to separate politics from social history. That was also the case with the Annales school until the 1970s. Agulhon, on the other hand, succeeded in reintegrating politics and combining it closely with social studies. In this way, his work inaugurated a new political history, closely linked with political anthropology. The role played by Agulhon is, therefore, very important, but that does not mean that previous or parallel research in the same field had been lacking, even though it did not make use of the notion of sociability. To begin with, the very early case of Georges Lefebvre must be cited. In his 1932 presentation on the revolutionary crowd to the Semaine de
Synthèse presided over by Henri Berr (Foules révolutionnaires in the Proceedings of the colloquium, 1932), he identified three levels of social grouping: (a) simple (involuntary) crowd; (b) semivoluntary gathering; and (c) (voluntary) assembly. While the first level corresponds to an amorphous, involuntary crowd, and the third to a body that acts collectively with a common objective and a precise consciousness, like the Revolutionary sans-culottes, the second intermediate level represents people who form a certain community in their everyday lives through activities such as communal work in the fields, Sunday mass, evening meetings, markets or fairs, local festivals, or drinking in taverns, and who are thus imbued with their own collective mentality. By identifying these three levels, Georges Lefebvre especially underlined the importance of this second level for correctly apprehending the popular movements of the Revolutionary period. It must be noted that what he here calls ‘semi-voluntary gathering’ is none other than informal sociability. A similar preoccupation is also to be found among historians contemporary with Agulhon. Yves Castan, for example, succeeded in discerning the social bonds that affect ordinary people in southwest France, by discovering the systems of codes that determine their behavior (Honnêteté et relations sociales en Languedoc au XVIIIe siècle, 1974). Without actually using the word sociability, and without any prior contact with the work of Agulhon, this study by Castan was inspired by the same concerns and remains a classic of the history of popular sociability. For sociability in the upper levels of society, one of the best examples is presented by Roche. In his doctoral dissertation, he studied some 2,500 provincial academicians who constituted a cultural network of their own. Here is another instance of a classic study of sociability that does not actually use the term (Roche 1978).
3. Acceptance of the Notion of Sociability in French Historiography
The term sociability proposed by Agulhon in his first book was favorably accepted by historians such as Philippe Ariès, Emmanuel Le Roy Ladurie, and Daniel Roche, exceeding even the writer’s expectations. Vovelle, who compiled in 1980 a survey of the previous ten years’ research on southern sociability, concluded: ‘Sociability, a notion contrived or rediscovered over ten years ago, has entered the ranks of those supporting concepts needed by the history of mentalities in any attempt to define the collective realities on which it faces’ (Vovelle 1982). Agulhon himself revised and refined this theme on various occasions. Other historians also undertook parallel studies, first within the framework of the Mediterranean, and later crossing those frontiers that had
initially been assumed. Witness is borne by Guitton, whose book extended the horizons of investigation to villages throughout France, and discovered in northern France, too, intensely real-life social bonds that would constitute a village sociability (Guitton 1979). As research proceeded, discussion arose around Agulhon’s initial hypothesis. Above all, criticism concerned the following two points. (a) While Agulhon considered the density and vivacity of associative life as one of the special features of Provençal life or at most the regions of the south, other researchers such as Guitton, Michel Bozon, or Jean-Luc Marais demonstrated that the phenomenon exceeded the boundaries of Provence, and that the notion of sociability was also applicable to other regions. (b) While distinguishing between the two levels of sociability, namely informal (spontaneous) sociability and formal (organized) sociability, Agulhon attached importance to the evolution from the former to the latter, this latter being the very basis of associative life. Vovelle, however, believes that informal sociability retains a vague but persistent existence, without necessarily taking on any institutional form. Agulhon himself, in his later articles, revised his initial position. He admitted in part that the notion of sociability was operative on a general plane beyond regional temperament. He also acknowledged that informal sociability had a being of its own and played an important role, especially at the popular level (Agulhon 1976, 1986). Thus, since the 1970s, sociability has been an honored bastion of social history, alongside the history of mentalities or the history of everyday life (see the excellent survey in François and Reichardt 1987). Increasing numbers of monographs have appeared on a wide variety of themes.
At the level of informal sociability, different social activities have been studied, both for villages and for urban environments, for example, collective work in the fields, community assemblies, taverns, evening meetings (veillées), rough music (charivari), carnivals or feasts, youth groups, markets and fairs, pilgrimages, games, and divertissements. All of these activities are supposed to favor a spontaneous sociability, not organized but having organic life (J-P. Guitton, R. Muchembled, N. Pellegrin, M. Bozon). In the urban framework, neighborhoods and the social life in a street or a city quarter have been brought into relief, rather than the municipal institutions that used to be studied by classical urban historians (see Farge 1979). Confraternities and chambrées, which Agulhon showed to be so important, now constitute a major theme of research in popular sociability. As far as formal sociability itself is concerned, what counts above all is the sociability of the bourgeois middle classes or of the intellectual elite. Apart from freemasonry and the provincial academies mentioned earlier, objects of study include local activities of literary or other erudite people, societies of all kinds,
as well as less closed locations frequented by less elevated social strata: cafés or reading clubs (Roche, Jean-Pierre Chaline, Eric Saunier, D. Goodman). For research as a whole, the Association for Research in Sociability (ARS) at the University of Rouen has made a valuable contribution in its wide perspective of current research. The themes chosen for colloquia that have been organized there display a wide variety of work in progress (ARS 1987, 1989, 1992, 1997).
4. Beyond French Historiography: Influence or Coincidence
For Germany and Switzerland, it is a case more of coincidence than of influence. In 1983, François organized an important Franco–German colloquium on sociability. According to him, ‘The study of the facts of sociability and of associative life has become one of the most fertile fields for study in the new history,’ but ‘this parallel discovery, or re-discovery, of sociability as an object of historical research and the resulting profusion of studies have come about in France, as in Germany, in almost complete ignorance of inquiries being carried out at the same instant by historians in the neighboring country’ (François 1986). A stinging comment. He also points out that German historians were concentrating on institutionalized associations (Vereinswesen), while their French counterparts were interested rather in all forms of sociability (Geselligkeit), in particular informal sociability. This contrast between two neighboring cultural zones is also confirmed by Otto Dann in his presentation to the Franco–German colloquium. He attributes the contrast not to national temperaments, but to the difference in historical conditions experienced by the two countries in the eighteenth and nineteenth centuries. In any case he insists that henceforth for both Germany and France there is a primordial necessity to carry out parallel and joint research on these two complementary dimensions of sociability (François 1986). In Germany in the eighteenth century, a process of transition in upper-class sociability took place from the aristocracy or the court to the rising middle classes. There was a clear increase in numbers of associations such as academies, scientific societies, musical or literary societies, reading circles, masonic lodges, student associations. Also in the nineteenth century a transference is to be observed from bourgeois sociability to working-class sociability.
Workers’ educational associations were set up on the initiative of middle-class leaders, and working-class movements for political mobilization came into being. The end of the nineteenth century, with the rise in the consciousness of Germany as a single nation, saw the formation on a national scale of new organizations for
sport, for soldiers, and so on. One feature of these associations is that women were excluded. As a result, from 1848 onward, women began setting up their own associations to campaign for their rights to education, and indeed the right to vote. Because of its importance in the country’s cultural, social, and even political life, this kind of institutionalized sociability (Vereinswesen) has long been a focus of interest for German historians (Dann 1984, Siebert 1993). But recently they have begun to show interest in another kind of sociability, which is rather less formal and less rigidly organized, as in family or neighborhood ties. Family celebrations, dinner parties, balls, coffee house circles, and family concerts are in their turn being studied from the point of view of sociability. Historians are also examining the blurred boundaries between public and private, and including in their work the problems posed by gender history. It is in this direction that the current of the history of everyday life is flowing (see Lüdtke 1989). Here this new orientation in German historiography can be seen converging with the preoccupations of French historical anthropology. The contribution of Swiss historians is important, too. The work of Im Hof not only for Switzerland but also for Europe in the Enlightenment is well known (Im Hof 1982). In his presentation to the Franco-German colloquium mentioned previously, he stresses the liveliness and the density of associative life in Switzerland. During the first half of the nineteenth century there was a blossoming of associations of widely varied types: not only artists, musicians, historians, or students, but also gymnasts, marksmen, and singers banded together in associations. He points out, too, a boom in explicitly democratic sociability, but with a strong patriotic and national inclination, a predominant Swiss characteristic (François 1986).
In the Anglo-Saxon world, the term sociability as a historical notion is not very frequent, but there are certain currents in research which have clearly been inspired by the history of sociability. At the level of informal sociability, neighborhood has become an appealing theme for urban historians: the London suburbs of the seventeenth century and the local community in eighteenth-century Paris have been admirably well analyzed (Boulton 1987, Garrioch 1986). The bonds of sociability in taverns are also considered an essential element of popular culture (Brennan 1988). In this way the approaches of English social history concur with those of French historians of sociability. On the other hand, at the level of the history of ideas, a perceptive study by Gordon points up the correlations between egalitarian-minded associations and the formation of the public sphere (Gordon 1994). This important problem of the public sphere should be further examined in close connection with social practices. In the Mediterranean world, the influence of French historiography is more strongly felt. In Italy, Gemelli
and Malatesta compiled a volume of extracts from 10 or so major articles in the history of sociability, with a substantial Introduction which succeeds in locating the ‘adventures’ of sociability in the current of contemporary French historiography (Gemelli and Malatesta 1982). It should be noted that Italian microstoria and the new French history share many of their concerns. The works of Levi (1985) or Cerutti (1990) are good examples of this. Moreover, for analyzing urban societies in the Middle Ages and the Renaissance, Italian historians propose the notions of parenti (family), amici (friends), and vicini (neighbors) as elements of the social network. These are very suggestive from the point of view of sociability. Similarly, outside the Western world, new currents in historical research are to be seen. In Japan, for example, the immediate postwar years saw the birth of a powerful current of social science history which, strongly influenced by Marxism, tried to analyze Japanese society based almost exclusively on the notions of social class and nation-state. For criticizing the nationalistic historians, who had been dominant during the War, or the traditional pure positivists, these concepts worked well and made it possible to propose a structural vision of the history of Japan. Nevertheless, with the passage of time, this unidimensional viewpoint proved too categorical and not capable of explaining the sociocultural background of Japanese history. And the rapid, unexpected transformation of Japanese society after the War made it necessary to reconsider the framework of social analysis. Therefore, since the 1970s, a new current in historical research is to be observed which, endowed with a multidimensional viewpoint, tries first to define history as lived by the people in their everyday lives and from there to approach economic or political aspects in order to reformulate the mechanisms by which Japanese society functions.
It is in these new circumstances that notions of mentalities and of sociability have been integrated into Japanese historical research. A colloquium organized in Tokyo in 1994 by Hiroyuki Ninomiya on ‘Different forms of social ties, or to what extent can the notion of sociability operate’ was one outcome of these efforts (see Ninomiya 1995). In this way the notion of sociability has become operative even in the fields of Japanese history. The same endeavors are being followed up for other Asian regions.
5. Perspectives
The notion of sociability remains ambiguous. But despite its ambiguity, or rather because of its vagueness, it has been able to play a heuristic role, just like the notion of mentalities (see Le Goff 1974). Thus, it has opened up a new perspective in social history. To be sure, there are some important gaps to be filled in actual research.
(a) In general, formal (organized) and informal (spontaneous) sociability are treated separately even now. The connection and the interference between these two aspects of sociability must be elucidated.
(b) There is a certain tendency to consider sociability in isolation. Originally, sociability was about openness, exchange, communication. It would be interesting to look at it from the point of view of network and strategy theory, and to exchange problems.
(c) The relations between sociability and the creation of the public sphere should be reconsidered not only at the level of bourgeois sociability and its associations, but also at the level of popular sociability, without detaching the public sphere from everyday life.
See also: Civil Society, Concept and History of; Communication and Democracy; Friendship, Anthropology of; Social Capital; Voluntary Associations, Sociology of
Bibliography
Agulhon M 1966 La sociabilité méridionale: Confréries et associations dans la vie collective en Provence orientale à la fin du XVIIIe siècle dans la deuxième moitié du XVIIIe siècle, 2 Vols. La Pensée universitaire, Aix-en-Provence [2nd edn. under a modified title Pénitents et Francs-maçons de l’ancienne Provence: Essai sur la sociabilité méridionale. Fayard, Paris]
Agulhon M 1970 La République au village. Les populations du Var de la Révolution à la IIe République. Plon, Paris
Agulhon M 1976 La sociabilité, la sociologie et l’histoire. L’Arc 65 [Reprinted in Le cercle dans la France bourgeoise 1810–1848: Étude d’une mutation de sociabilité. Armand Colin, Paris, 1977]
Agulhon M 1979 Sociabilité populaire et sociabilité bourgeoise au XIXe siècle. In: Poujol G, Labourie R (eds.) Les Cultures Populaires. Privat, Toulouse
Agulhon M 1986 La sociabilité est-elle objet d’histoire? In: François E (ed.) Sociabilité et société bourgeoise en France, en Allemagne et en Suisse (1750–1850). Éditions Recherche sur les Civilisations, Paris
ARS (Association de la Recherche sur la Sociabilité de l’Université de Rouen) 1987 Sociabilité, pouvoirs et société, Actes du colloque de Rouen (1983). Rouen University, France
ARS 1989 Aux sources de la puissance: Sociabilité et parenté, Actes du colloque de Rouen (1987). Rouen University, France
ARS 1992 La sociabilité à table: Commensalité et convivialité à travers les âges, Actes du colloque de Rouen (1990). Rouen University, France
ARS 1997 La rue, lieu de sociabilité? Rencontres de la rue, Actes du colloque de Rouen (1994). Rouen University, France
Boulton J 1987 Neighbourhood and Society: A London Suburb in the Seventeenth Century. Cambridge University Press, Cambridge, UK
Brennan T E 1988 Public Drinking and Popular Culture in Eighteenth Century Paris. Princeton University Press, Princeton, NJ
Cerutti S 1990 La ville et les métiers. Naissance d’un langage corporatif (Turin, XVIIe–XVIIIe siècle). Editions de l’Ecole des Hautes Etudes en Sciences Sociales, Paris
Dann O (ed.) 1984 Vereinswesen und bürgerliche Gesellschaft in Deutschland. Oldenbourg, München
Farge A 1979 Vivre dans la rue au XVIIIe siècle. Gallimard, Paris
François E (ed.) 1986 Sociabilité et société bourgeoise en France, en Allemagne et en Suisse (1750–1850)/Geselligkeit, Vereinswesen und bürgerliche Gesellschaft in Frankreich, Deutschland und der Schweiz (1750–1850). Editions Recherche sur les Civilisations, Paris
François E, Reichardt R 1987 Les formes de sociabilité en France, du milieu du XVIIIe siècle au milieu du XIXe siècle. Revue d’histoire moderne et contemporaine 34: 3
Garrioch D 1986 Neighbourhood and Community in Paris, 1740–1790. Cambridge University Press, Cambridge, UK
Gemelli G, Malatesta M 1982 Forme di Sociabilità nella Storiografia Francese Contemporanea. Feltrinelli, Milano
Goodman D 1999 Sociabilité. In: Ferrone V, Roche D (eds.) Le Monde des Lumières. Fayard, Paris
Gordon D 1994 Citizens without Sovereignty: Equality and Sociability in French Thought, 1670–1789. Princeton University Press, Princeton, NJ
Guitton J-P 1979 La sociabilité villageoise dans l’ancienne France: Solidarités et voisinages du XVIe au XVIIIe siècle. Hachette, Paris
Gurvitch G 1950 La vocation actuelle de la sociologie. Presses Universitaires de France, Paris
Im Hof U 1982 Das gesellige Jahrhundert: Gesellschaft und Gesellschaften im Zeitalter der Aufklärung. Beck, München
Le Goff J 1974 Les mentalités: une histoire ambiguë. In: Le Goff J, Nora P (eds.) Faire de l’histoire, Vol. 3: Nouveaux objets. Gallimard, Paris
Levi G 1985 L’eredità immateriale. Carriera di un esorcista nel Piemonte del seicento. Giulio Einaudi, Turin, Italy
Lüdtke A (ed.) 1989 Alltagsgeschichte. Campus Verlag, Frankfurt, Germany
Muchembled R 1989 La violence au village. Sociabilité et comportements populaires en Artois du XVe au XIXe siècle. Editions Brepols, Turnhout
Ninomiya H (ed.) 1995 Musubiau Kataihi: Soshiabirite-ron no Shatei. Yamakawa-shuppansha, Tokyo
Roche D 1978 Le siècle des Lumières en province: Académies et Académiciens provinciaux, 1680–1789, 2 Vols. Mouton, The Hague
Roche D 1993 La France des Lumières. Fayard, Paris
Siebert E 1993 Der literarische Salon: Literatur und Geselligkeit zwischen Aufklärung und Vormärz. Metzler, Stuttgart
Vovelle M 1982 Dix ans de sociabilité méridionale. In: Vovelle M (ed.) Idéologies et mentalités. François Maspero, Paris
H. Ninomiya
Social Behavior (Emergent), Computer Models of
By the term ‘emergent social behavior’ we mean those patterns of social behavior that arise automatically without relevant specifications being in-built in the individual. Under what conditions such patterns may emerge, and of what kind they are, is unpredictable in itself, but can be learned from ‘bottom-up’ models. In such models, a so-called ‘process-oriented’ approach is used. An artificial system is constructed in which agents are equipped with behavioral rules and mechanisms. While these agents interact with their environment and other agents, their behavior is studied in the same way as in ethological studies. Results show that in this way simple agents generate complex social structures, and that the same set of rules may lead to different patterns depending on the past experiences of the individuals, the demography of their population or the distribution of their food. Studies of such complex systems were, of course, made possible by the development of computers, which have led to new modeling techniques. Initially, these studies dealt with emergent phenomena in chemical and physical research (Gleick 1987). In this article, the major issues and characteristics of these models (called ‘individual-oriented,’ ‘individual-based,’ or artificial life models) are explained by examples. Their connection with views on evolution and their general usage is discussed.
1. Major Issues of Self-Organized Social Behavior
The computer models in question deal with the emergence of: (a) groups; (b) collective ‘decisions’ (e.g., as regards the choice of a food source); (c) the distribution of tasks over individuals (such as brood care and foraging); (d) a reproductive ‘clock’; (e) the spatial structure of groups; and (f) the social organization of groups.
1.1 Group Formation
Many animal species live in groups which vary in characteristics, such as composition, size, permanency, and cohesion. One of the most cohesive and largest societies is found in army ants (with populations of up to 20,000,000 ants). Remarkably, these ants are practically blind and rely on a pheromonal system with which they mark their paths and by which they follow paths taken by others. Preying upon insect colonies, they raid over the forest in phalanxes of about 200,000 workers. Different species display different swarming patterns. To explain the differences among species in swarming, it is usually assumed that there are corresponding differences in the underlying behavioral motivations. Deneubourg and co-authors (1989), however, have shown that such markedly different swarming patterns may arise from one and the same system of laying and following pheromone trails, when ants are raiding in different environments with different distributions of food. To show this, the authors used a simulation in which ‘artificial ants’ moved in a discrete network of points, and marked their path with pheromones. Further, when choosing between left and right, they preferred the more strongly marked direction. By introducing different distributions of food, different swarming patterns arose from the interaction between the flow of ants heading away from the nest to collect food, and the different spatial distributions of the foragers returning with the food. These different swarm types were remarkably similar to those of certain army ants.
1.2 Collective ‘Decisions’
The use of trail pheromones is also a mechanism of collective ‘decision’ making. Experiments with real ants combined with modeling have shown that trail laying often causes ants to choose the food source that is nearest to the nest (Deneubourg and Goss 1989). This observation seems to suggest that individual ants possess considerable cognitive capacities, but Deneubourg and Goss show in a computer model that ants may accomplish this remarkable feat without comparing distances to various food sources. When ants mark their path by pheromones and follow the more strongly marked branch at crossroads, then, by returning to the nest more quickly when the road to and from the food source is shorter, they imprint the shorter path more heavily with pheromones. Thus, positive feedback is the result: as the shorter path attracts more ants, it also receives stronger marking, etc. In subsequent extensions of this model (Detrain et al. 1999), path marking and following are shown to imply risk avoidance automatically (which agrees with observations in some species of ants), as follows: a hostile confrontation delays the forager for a certain time before returning to the nest. Thus the path to the dangerous food source automatically is marked more slowly and less strongly. Consequently, the safest food source is usually ‘preferred.’
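The positive feedback between path length and marking can be illustrated with a minimal sketch. It is not the model of Deneubourg and Goss: it is a deterministic, mean-field simplification in which one unit of ant traffic leaves the nest per time step, and the choice rule (weight `(k + pheromone) ** n`) and all parameter values (`k`, `n`, the path lengths) are assumptions made for illustration only.

```python
def simulate_two_paths(short_len=2, long_len=4, steps=200, k=5.0, n=2.0):
    """Return the share of ant traffic choosing the short path at each step.

    Mean-field sketch (an assumption, not the published model): one unit of
    traffic leaves the nest per step and splits between the two paths. Each
    share marks its path only when its round trip (2 * path length steps)
    is complete, so the shorter path is reinforced sooner.
    """
    horizon = steps + 2 * max(short_len, long_len)
    due_short = [0.0] * horizon   # pheromone arriving on the short path at step t
    due_long = [0.0] * horizon    # pheromone arriving on the long path at step t
    pher_short = pher_long = 0.0
    shares = []
    for t in range(steps):
        pher_short += due_short[t]
        pher_long += due_long[t]
        # Ants prefer the more strongly marked branch; k keeps the choice
        # close to random while both branches are still weakly marked.
        w_short = (k + pher_short) ** n
        w_long = (k + pher_long) ** n
        p_short = w_short / (w_short + w_long)
        shares.append(p_short)
        due_short[t + 2 * short_len] += p_short
        due_long[t + 2 * long_len] += 1.0 - p_short
    return shares


if __name__ == "__main__":
    shares = simulate_two_paths()
    print(f"share on short path: first step {shares[0]:.2f}, last step {shares[-1]:.2f}")
```

No agent in this sketch compares distances: the bias toward the short path arises only because its marking arrives earlier. With equal path lengths the traffic stays evenly split.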
1.3 Distribution of Tasks
A division of labor occurs in colonies of many species of social insects. Remarkably, this division is flexible, i.e., the ratios of workers performing different tasks may vary according to the changing needs of the colony. The earliest model on the emergence of task division is that designed for bumblebees (Bombus terrestris) by Hogeweg and Hesper (1983, 1985). In an earlier experimental study, it was discovered that, during the growth of the colony, workers developed
into two types, ‘common’ and ‘elite’ workers. The activities carried out by the two types differ remarkably: whereas the ‘common’ workers mainly forage and take rests, the ‘elite’ workers are more active, feed the brood, interact often with each other and with the queen, and sometimes lay eggs. To study the conditions for the formation of the two types of workers, Hogeweg and Hesper (1983, 1985) set up a so-called ‘Mirror’ model. This kind of model is complex in the number of variables included, but simple as regards behavioral processes. It contains biological data concerning the time of the development of eggs, larvae, pupae, etc. Space in the model is divided into two parts, peripheral (where inactive common workers doze part of the time) and central (where the brood is, and where all interactions take place). The rules of the artificial adult bumblebees operate ‘locally’ in that their behavior is triggered by what they meet. What they meet is picked out in the model randomly from what is available in the space in which the bumblebee finds itself. For instance, if an adult bumblebee meets a larva, it feeds it, and if it meets a pupa of the proper age, it starts building a cell in which a new egg can be laid, etc. All workers start with the same dominance value after hatching, with only the queen starting with a much higher rank. When one adult meets another, a dominance interaction takes place, the outcome of which (victory or defeat) is self-reinforcing. The positive feedback is ‘damped,’ however, because expected outcomes reinforce or diminish the dominance values of both partners only slightly, whereas unexpected outcomes give rise to relatively large changes in dominance value. Dominance values of the artificial bumblebees influence almost all their behavioral activities; for instance, individuals of high rank are less likely to forage. 
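The damped, self-reinforcing dominance interaction described above can be sketched as follows. The specific rule is an assumption chosen to match the verbal description, not a transcription of the Mirror model: the chance that one individual wins is taken to be its relative dominance, and both dominance values change in proportion to the difference between the outcome and that expectation. The names `dominance_interaction`, `step`, and `floor` are hypothetical.

```python
import random


def dominance_interaction(dom_i, dom_j, rng, step=0.5, floor=0.01):
    """One dominance interaction between individuals i and j.

    Assumed rule: i wins with probability dom_i / (dom_i + dom_j). The
    winner's value rises and the loser's falls by the same amount, which is
    small when the outcome was expected and large when it was not.
    """
    expected = dom_i / (dom_i + dom_j)          # chance that i wins
    i_wins = rng.random() < expected
    change = ((1.0 if i_wins else 0.0) - expected) * step
    new_i = max(dom_i + change, floor)          # keep values positive
    new_j = max(dom_j - change, floor)
    return new_i, new_j, i_wins


if __name__ == "__main__":
    rng = random.Random(0)
    queen, worker = 10.0, 1.0                   # illustrative starting values
    for _ in range(5):
        queen, worker, queen_won = dominance_interaction(queen, worker, rng)
        print(f"queen {queen:.2f}  worker {worker:.2f}  queen won: {queen_won}")
```

Under this rule a victory of a high-ranking individual over a low-ranking one barely changes either value, whereas an upset shifts both considerably, which is the damping referred to in the text.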
It appears that this model generates two stable classes automatically, the ‘elites’ and ‘commons,’ with their typical dominance and behavioral conduct. For this result, the nest has to be divided into a center and a periphery (as found in real nests). The flexibility of the distribution of tasks becomes obvious upon halving the worker force in the model. In line with observations in similar experiments on real bumblebees, this increases markedly each individual’s tendency to forage. This arises in the model as follows: the lower number of workers reduces the number of encounters among them, and thus increases the frequency of encounters between bumblebees and brood. Encountering the brood more often induces workers to collect food more frequently. A different approach to the effect of social interactions on task allocation was taken by Pacala et al. (1996). In their model, individuals will stick to a certain task as long as it can be executed successfully, but otherwise give up (e.g., stop foraging when food is depleted). Then, as soon as they encounter another individual that is performing a task successfully, they will switch to perform that task too.
Another approach to the emergence of task division in social insects is described by Bonabeau et al. (1996). Their basic assumption is that different types of workers have different, genetically determined response thresholds for behavioral tasks. This is particularly relevant for dimorphic species, such as the ant Pheidole megacephala, in which two physically different castes, the so-called majors and minors, exist. Majors generally have higher thresholds than minors. This fixed threshold model has been extended to variable response-thresholds by Theraulaz et al. (1998). They added learning in the form of a reinforcement process. This implies that performing tasks lowers the threshold of task execution and not performing them heightens it.
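A minimal sketch of the threshold idea follows. The sigmoid response function used here, s^n / (s^n + θ^n) with n = 2, is a commonly used form in the threshold-model literature; the learning and forgetting rates, the threshold bounds, and the function names are assumptions for illustration. Thresholds are assumed to be positive.

```python
def response_probability(stimulus, threshold, steepness=2):
    """Chance that a worker engages in a task, given the task stimulus.

    The chance is 0.5 when the stimulus equals the worker's threshold, and
    rises toward 1 as the stimulus exceeds it (threshold must be positive).
    """
    s = stimulus ** steepness
    return s / (s + threshold ** steepness)


def reinforce(threshold, performed, learn=1.0, forget=0.5, lowest=1.0, highest=100.0):
    """Variable-threshold extension: performing a task lowers its threshold,
    not performing it raises the threshold (rates and bounds are assumed)."""
    threshold += -learn if performed else forget
    return min(max(threshold, lowest), highest)


if __name__ == "__main__":
    minor, major = 5.0, 20.0   # illustrative thresholds: majors higher than minors
    for stimulus in (2.0, 10.0, 40.0):
        print(stimulus,
              round(response_probability(stimulus, minor), 3),
              round(response_probability(stimulus, major), 3))
```

With fixed thresholds, minors respond at low stimulus levels and majors join in only when the stimulus has grown large, for instance after minors have been removed. With `reinforce`, initially identical workers can drift apart into specialists.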
1.4 Behavioral ‘Clock’
In real bumblebees, the queen switches at the end of the season from producing sterile offspring to fertile ones, after which she is either chased away or killed in spite of the fact that she is still as dominant as before. The time of departure/death of the queen is important for the ‘fitness’ (i.e., number of new queens) of the colony. After the demise of the queen, new queens and males (so-called generative offspring) are reared. It is assumed that the switch should take place some weeks before the end of the season, because at that time the colony is at its largest and can take care of the largest number of offspring. But it is very difficult to think of a possible external signal that could trigger such a switch. However, Hogeweg and Hesper (1983) show in their Bumblebee model that no external signal is needed, and that the switch may occur as a socially regulated ‘clock’ involving a fixed threshold, as follows. During the development of the colony, the queen is able to inhibit worker ovipositions by producing a certain pheromone that delays their production of eggs. Just before she is killed, she can no longer suppress the ‘elite’ workers, because, due to the large size of the colony, individual workers are less subjected to the dominance of the queen. So they start to lay unfertilized (drone) eggs. Further, the stress on the queen increases each time she has to perform dominance interactions with workers during her egg laying. When the ‘stress’ of the queen reaches a low fixed threshold value, she switches to producing male eggs (drones), and when it increases to a high fixed threshold, she is killed or chased away. The notion of this switch, which functions as a kind of social clock, has been challenged, because generative offspring are sometimes also found in small nests.
However, in response to this objection, Hogeweg and Hesper (1985) have shown that in small nests, created in the Bumblebee model by reducing the growth speed, the switch appears to occur at the same time as in large nests. This is due to a complicated feedback process.
1.5 Spatial Structure
In many animal groups, there is a spatial structure in which dominants are in the center and subordinates at the periphery. This spatial structure is usually explained by the well-known ‘selfish herd’ theory of Hamilton (1971). The basic assumption of this theory is that individuals in the center of a group are best protected against predators, and therefore have evolved a preference for positions in which group members are between them and the predators, the so-called ‘centripetal instinct.’ However, such a spatial structure with dominants in the center also emerges in a model designed by Hemelrijk (2000), called DomWorld, in which such a preference for the center is lacking. The model consists of a homogeneous virtual world inhabited by agents that are provided with only two tendencies: (a) to group, and (b) to perform dominance interactions. Dominance interactions reflect competition for resources (such as food and mates) and the agent’s capacity to be victorious (i.e., its dominance) depends, in accordance with ample empirical evidence, on the self-reinforcing effects of winning (and losing). In this model, the spatial configuration arises without any ‘centripetal instinct,’ but is the result of the mutually reinforcing effects of spatial structure and strong hierarchical differentiation. It can be explained as follows. Pronounced rank differentiation causes low-ranking agents to be chased away by many, and therefore to end up at the periphery. Automatically, this leaves dominants in the center. Besides, this spatial structure causes agents to interact mainly with partners close in rank; therefore, if a rank change occurs, it is only a minor one. Thus, the spatial structure stabilizes the hierarchy and maintains its differentiation. Spatial structure in human organizations is, for instance, evident in ghetto formation and in the formation of same-sex groups at parties.
The Harvard economist, Thomas Schelling (1971), has shown how a stronger segregation emerges than is wanted by the individuals—who have only a slight preference for their own type—due to a positive feedback. Once individuals obtain minority status in a subgroup, because they are surrounded by an insufficient number of individuals of their own type (race or sex), they leave the subgroup, and this induces others of the same type to leave their former subgroup too. Their arrival in another subgroup may in its turn induce individuals of the other type to leave. Accordingly, Schelling notes that, even if the preference to have some individuals of one’s own type close by is slight, its behavioral consequences may increase at a group level, leading to unexpected ‘macropatterns.’
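The feedback Schelling describes can be explored with a small grid sketch. This is an illustrative variant rather than Schelling's original set-up: the wrap-around grid, the eight-cell neighbourhood, the rule that a dissatisfied agent moves to a randomly chosen empty cell, and all parameter values (`size`, `empty_share`, `wanted`, `sweeps`) are assumptions.

```python
import random

EMPTY = None


def like_fraction(grid, r, c):
    """Share of the occupied neighbouring cells that hold the same type as (r, c)."""
    size = len(grid)
    same = other = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbour = grid[(r + dr) % size][(c + dc) % size]  # wrap-around edges
            if neighbour is EMPTY:
                continue
            if neighbour == grid[r][c]:
                same += 1
            else:
                other += 1
    return 1.0 if same + other == 0 else same / (same + other)


def mean_like_fraction(grid):
    """Average like_fraction over all agents: a simple segregation index."""
    size = len(grid)
    values = [like_fraction(grid, r, c)
              for r in range(size) for c in range(size) if grid[r][c] is not EMPTY]
    return sum(values) / len(values)


def run_schelling(size=20, empty_share=0.2, wanted=0.3, sweeps=50, seed=0):
    """Return (grid, index_before, index_after) for two equally large types."""
    rng = random.Random(seed)
    n_cells = size * size
    n_empty = max(1, int(n_cells * empty_share))
    n_a = (n_cells - n_empty) // 2
    cells = ["A"] * n_a + ["B"] * (n_cells - n_empty - n_a) + [EMPTY] * n_empty
    rng.shuffle(cells)
    grid = [cells[i * size:(i + 1) * size] for i in range(size)]
    before = mean_like_fraction(grid)
    for _ in range(sweeps):
        moved = False
        for r in range(size):
            for c in range(size):
                if grid[r][c] is EMPTY or like_fraction(grid, r, c) >= wanted:
                    continue
                # A dissatisfied agent leaves for a randomly chosen empty cell.
                free = [(i, j) for i in range(size) for j in range(size)
                        if grid[i][j] is EMPTY]
                i, j = rng.choice(free)
                grid[i][j], grid[r][c] = grid[r][c], EMPTY
                moved = True
        if not moved:
            break
    return grid, before, mean_like_fraction(grid)


if __name__ == "__main__":
    _, before, after = run_schelling()
    print(f"mean share of like neighbours: before {before:.2f}, after {after:.2f}")
```

Comparing the index before and after a run, for different values of `wanted`, is one way to examine how far the resulting segregation exceeds what each individual demands.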
1.6 Social Organization of Groups
Two types of society are generally distinguished in the study of animal behavior: egalitarian and despotic ones. In certain primate species, such as macaques, a steep dominance or power hierarchy characterizes a despotic society, and a weak hierarchy an egalitarian one. Furthermore, both societies differ in a number of other characteristics, such as bidirectionality of aggression and cohesion of grouping. For each of these (and other unmentioned) differences a separate, adaptive (i.e., evolutionary) explanation is usually given, but DomWorld provides us with a simpler explanation for many of them simultaneously (Hemelrijk 1999): by simply increasing the value of one parameter, namely, intensity of aggression (in which egalitarian and despotic macaques differ; see Thierry 1990), the artificial society switches from a typical egalitarian to a despotic society with a cascade of consequences that resemble characteristics observed in the corresponding society of macaques. Such resemblances imply that differences between egalitarian and despotic societies of macaques may be entirely due to one trait only, intensity of aggression. On an evolutionary timescale, one may imagine that species-specific differences in aggression intensity arose, for example, because in some populations of a common ancestor, individuals suffered more food shortages, and consequently only those with intense aggression could survive. Apart from variation among individuals in power, human societies are also characterized by a distribution of wealth.
Epstein and Axtell (1996) show how, under many circumstances, a very skewed distribution of wealth emerges among heterogeneous agents (for instance, differing in vision and metabolism) in a model called ‘Sugarscape.’ Such a skewed distribution of wealth appears to be a general outcome of many specifications of agents that are extracting resources from a landscape of fixed capacity.
2. Characteristics of Models for Emergent Social Patterns
Typical of models for self-organized social behavior are the following characteristics, shared by some or all of the above examples: (a) nearby perception and activation, (b) positive (instead of the more commonly used negative) feedback processes, (c) spatial distribution, (d) the possibility of studying complex behavior, and (e) ‘open-endedness’.
(a) Nearby perception and activation. Individual agents have information only about what is happening nearby. Further, timing of acts of agents is not ruled by a central clock, but nearby agents activate each other. Otherwise, agents are activated randomly.
(b) Positive feedback processes. Examples of these are the trail marking (with pheromones) and tracking system of ants, winning and losing of dominance interactions, and a mutual reinforcement between spatial structure and the development of the hierarchy among group-living agents.
(c) Spatial distribution. The inclusion of space in the model creates the possibility for spatial structures to emerge. Such a spatial structure may affect behavior of individuals.
(d) The possibility of studying complex behavior. In these models, behavior is recorded in behavioral units similar to those used by ethologists for studying the behavior of real animals. Data are analyzed to detect new behavioral activities and patterns.
(e) ‘Open-endedness’: at the time when the model is designed, it is unknown what patterns may emerge.
3. Emergence and Evolution
Because in these models behavioral mechanisms are implemented without continuous reference to pay-off benefits, this procedure leads to a parsimonious or restrictive view of evolution in the sense that the behavioral patterns observed need not always be based on trait-specific genes, and are not willed expressly by the individuals. Besides, many of the behavioral patterns usually attributed to long-term evolution may instead be explained as direct, short-term effects of other mechanisms. In these respects, these models agree with the social genetic and evolutionary explanations of, for instance, Lewontin, Gould, and Eldredge (Depew and Weber 1995), and they contrast with the thermodynamic and gene-focused views of Fisher and of sociobiology. In sociobiology, it is not the ‘processes’ that are of interest, but questions regarding the adaptivity of behavior and the evolutionary stability of strategies. These issues are studied in the sociobiologically oriented game-theoretic models. In contrast to models for emergence, behavioral strategies (such as ‘hawks and doves,’ ‘cooperators and defectors’) are defined in terms of pay-off without taking into account the underlying mechanisms or behavioral and environmental features; entities with different strategies are pitted against each other in order to examine which maintain themselves and which disappear, or in what proportions they may continue to exist in combination; new strategies that are not included from the start cannot arise in these game-theoretic models.
4. The Benefits of Models for Emergent Social Behavior
Models of self-organization are never to be regarded as a proof of how behavior arose. They are, however, useful for several reasons. First, they may suggest explanations that cannot be reached by theorizing alone. Take, for instance, Camazine’s model on the structuring of the comb of honey bees (Camazine 1991). The comb of honey bees is structured with brood in the center, and pollen and honey in layers around it. This organization is usually attributed to a cognitive blueprint in the honey bees. Camazine’s model shows, however, that the structuring of the comb may also result as a side-effect from the feeding activities of the brood, and from the relative quantities of pollen and honey imported or removed. Second, these models can be used as ‘search indicators’ for empirical studies. If models are firmly based on observed biological mechanisms (as in many of the above-mentioned examples), and if the emerging patterns agree with the patterns observed in real groups of organisms, then additional patterns observed in the model, but not yet noted in the real world, may be looked for in the real world. In the same way, in a model that has been shown to be relevant for real animal behavior, consequences of certain parameters may also be studied, e.g., the effects of different degrees of cohesion of grouping. To carry out such experiments in computer simulation is very easy. Usually, similar experiments are much more difficult in the real world. Third, the insight that certain patterns of social behavior may be emergent often leads to new, parsimonious hypotheses about their evolution. Fourth, such models can also be used in teaching pupils how to think in terms of self-organization in general as is done, for instance, by Resnick (1994). Fifth, certain kinds of insight obtained from models may be used to solve real-world problems (e.g., Bonabeau and Theraulaz 2000). For instance, the ant models by Deneubourg have been extended by Dorigo to solve the famous ‘traveling salesman problem.’ In this problem, a person must find the shortest route by which to pay a single visit to a number of cities. This problem is very difficult to solve, because the possibilities easily run into the millions.
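How quickly the possibilities ‘run into the millions’ can be made concrete with a short calculation. The sketch assumes round trips that return to the starting city and distances that are the same in both directions, in which case n cities admit (n − 1)!/2 distinct routes; the function name `count_round_trips` is hypothetical.

```python
from math import factorial


def count_round_trips(n_cities):
    """Number of distinct round trips visiting every city exactly once.

    Assumes the trip returns to its starting city and that a route and its
    reverse count as the same route: (n - 1)! / 2 for n >= 3.
    """
    if n_cities < 3:
        return 1
    return factorial(n_cities - 1) // 2


if __name__ == "__main__":
    for n in (5, 10, 12, 15):
        print(f"{n:2d} cities: {count_round_trips(n):,} possible round trips")
```

Under these assumptions ten cities already allow 181,440 routes and twelve cities nearly twenty million, so checking every route soon becomes impractical.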
See also: Artificial Social Agents; Emergent Properties; Evolutionary Social Psychology; Social Psychology; Social Psychology: Sociological; Social Psychology, Theories of; Social Simulation: Computational Approaches; Sociobiology: Overview; Sociobiology: Philosophical Aspects
Bibliography
Bonabeau E, Theraulaz G 2000 Swarm smarts. Scientific American 282: 72–80
Bonabeau E, Theraulaz G, Deneubourg J-L 1996 Quantitative study of the fixed threshold model for the regulation of division of labour in insect societies. Proceedings of the Royal Society London B: Biological Sciences 263: 1565–9
Camazine S 1991 Self-organizing pattern formation on the combs of honey bee colonies. Behavioural Ecology and Sociobiology 28: 61–76
Deneubourg J L, Goss S 1989 Collective patterns and decision-making. Ethology, Ecology and Evolution 1: 295–311
Deneubourg J L, Goss S, Franks N, Pasteels J M 1989 The blind leading the blind: modeling chemically mediated army ant raid patterns. Journal of Insect Behavior 2: 719–25
Depew D J, Weber B H 1995 Darwinism Evolving. Systems Dynamics and the Genealogy of Natural Selection. MIT Press, Cambridge, MA
Detrain C, Deneubourg J-L, Pasteels J M 1999 Decision-making in foraging by social insects. In: Detrain C, Deneubourg J-L, Pasteels J M (eds.) Information Processing in Social Insects. Birkhäuser Verlag, Basle, Switzerland, pp. 331–54
Epstein J M, Axtell R 1996 Growing Artificial Societies. Social Science from Bottom Up. MIT Press, Cambridge, MA
Gleick J 1987 Chaos: Making A New Science. Viking, New York
Hamilton W D 1971 Geometry for the selfish herd. Journal of Theoretical Biology 31: 295–311
Hemelrijk C K 1999 An individual-oriented model on the emergence of despotic and egalitarian societies. Proceedings of the Royal Society London B: Biological Sciences 266: 361–9
Hemelrijk C K 2000 Towards the integration of social dominance and spatial structure. Animal Behaviour 59: 1035–48
Hogeweg P, Hesper B 1983 The ontogeny of interaction structure in bumble bee colonies: a MIRROR model. Behavioral Ecology and Sociobiology 12: 271–83
Hogeweg P, Hesper B 1985 Socioinformatic processes: MIRROR modelling methodology. Journal of Theoretical Biology 113: 311–30
Pacala S W, Gordon D M, Godfray H J C 1996 Effects of social group size on information transfer and task allocation. Evolutionary Ecology 10: 127–65
Resnick M 1994 Turtles, Termites and Traffic Jams. Explorations in Massively Parallel Microworlds. MIT Press, Cambridge, MA
Schelling T C 1971 Dynamic models of segregation. Journal of Mathematical Sociology 1: 143–86
Theraulaz G, Bonabeau E, Deneubourg J-L 1998 Threshold reinforcement and the regulation of division of labour in social insects. Proceedings of Royal Society London B: Biological Sciences 265: 327–33
Thierry B 1990 Feedback loop between kinship and dominance: The macaque model. Journal of Theoretical Biology 145: 511–21
C. K. Hemelrijk
Social Capital
With roots in de Tocqueville, Durkheim, Weber, and others, current approaches to social capital were formulated most clearly by Coleman (1990), who distinguished the term from physical and human capital. Physical capital identifies investment in tools and other tangible productive equipment; human capital refers to less tangible investments in skill and knowledge by individuals (manifested especially by educational attainment). Social capital is even less tangible, and refers to relations among persons that facilitate action, embodied in the collective norms of communities that extend beyond immediate family members and the trustworthiness of the social environment on which obligations and expectations depend. Most ensuing discussions of social capital emphasize levels of trust and patterns of social networks within communities. Putnam takes social capital to reflect ‘features of social organization, such as trust, norms, and networks, that can improve the efficiency of society by facilitating coordinated actions’ (1993). For Fukuyama, social capital is ‘a capability that arises from the prevalence of trust in a society or certain parts of it’ (1995). More complete surveys of the term are available in Jackman and Miller (1998) and Portes (1998). While it is an intangible that affects the behavior of individuals, social capital itself is cast as a collective or group-level phenomenon that identifies the informal social contexts that constrain and influence the behavior of individuals. Because it is relational and inheres in social networks, social capital minimally involves dyads and typically groups of larger size: communities, neighborhoods, ethnic and national groups. In high social capital environments, individuals are networked into a variety of larger informal groups that emphasize norms of trust and reciprocity from which those individuals benefit. While the concept is intangible, the underlying variable is clearly consequential. There is broad agreement that effective social networks and pervasive interpersonal trust enhance efficiency in social relations and lower transaction costs, both of which have direct implications for social, economic, and political behavior. Some have further suggested that high levels of social capital foster cooperative and perhaps altruistic behavior, a theme with communitarian origins.
The tendency to couple social capital and cooperative behavior is especially pronounced among scholars uncomfortable with the premise that human behavior is grounded in the egoistic optimization of individual utility. However, the line between cooperation and collusion is fine. Much has been written about social capital recently. In a noted analysis, Putnam laments the political consequences of a decline of social capital in the US during the 1980s and 1990s, a trend he claims has led increasingly to a society of go-it-alone bowlers, isolated both from each other and from broader currents of community political life (Putnam 1995). Yet the evidence for any such pattern is mixed. Voter turnout in the US has declined since around 1960, and there has also been a decline in reported levels of political trust on some survey responses (Levi and Stoker 2000). These trends appear to be the exception rather than the rule, however, and their meaning is unclear. A broad review of the available evidence since the 1960s finds no reduction in social capital, indexed by general levels of trust or group membership (Ladd 1999). Shrinking membership rates in some organizations
(e.g., labor unions) have been more than offset by increases in others (e.g., environmental groups). Further, there is no evidence of a decline in social activities associated with group membership. Associational membership in the US thus appears alive and well. Data on trends in group membership in other countries are less complete, but do not suggest a decline. A more fundamental issue hinges on the causal significance attached to social capital, and here there is considerable disagreement. The bowling alone thesis attracted attention because of the consequences it imputes to social capital. Reduced social capital becomes noteworthy because it heralds increasing problems of governability and other social pathologies. The idea that social capital affects other outcomes is, indeed, a recurring theme in recent studies. For example, Putnam (1993) concludes that the effectiveness of Italy’s regional governments stems from durable regional differences in patterns of social networks (civic associations): his statistical analyses explain current performance as a function of social networks in place a century earlier, and at various points, regional differences are traced back almost 1,000 years. Similarly, Fukuyama (1995) argues that the key to economic growth is social capital in the form of a supporting culture of trust, or ‘spontaneous sociability.’ This theme is echoed in Harrison’s (1992) conclusion that political and economic developments hinge critically on values involving trust, ethical codes, and orientations to work and risk-taking. Inglehart (1997) offers a minor variation by suggesting that membership in voluntary associations and levels of trust explain stable democracy and patterns of democratization in recent years. In such ways, social capital has been linked to a variety of political and social outcomes. These claims are more than a simple statement that social capital is important.
They instead imply that social capital reflects enduring cultural norms passed across generations by socialization, norms that cannot be explained in terms of reasonable responses to current circumstances. Further, these cultural norms are durable: allowing for slight short-term forces, they reflect the long haul. Finally, these norms are taken to be the key exogenous factors in generating economic and governmental performance. Social capital is, in this sense, more crucial than objective conditions embodied in institutions, and endures in the face of institutional change. Social capital is critical in a second sense. Different levels of social capital lead people to process the same information differently, even in the face of the same institutional constraints and incentives. Such claims parallel earlier arguments linking political culture to development and governmental performance. For example, Banfield (1958) argued that the poverty of a small southern Italian town stemmed from a dominant ethos of ‘amoral familism.’ Residents displayed no concern for collective issues facing the
community, operating instead according to the axiom, ‘Maximize the material, short-run advantage of the nuclear family: assume that all others will do likewise.’ In current terms, social and political pathologies in the town were attributed to low levels of social capital. Similarly, Almond and Verba (1963) claim that variations in democratic performance reflect national differences in civic orientations as a function of social capital. An alternative approach identifies conditions that enhance levels of mutual trust. This is the strategy suggested by Coleman (1990) and others, one that parallels recent institutional approaches to politics that specify how institutions constrain behavior through the incentives they embody. Social capital (reflected in trust and group membership) then becomes a function of the political and economic performance induced by effective institutional arrangements, not the other way round. Consider the following examples. The Mafia is often seen as a product of distinctive enduring Sicilian subcultural norms in which trust is at a premium. Since the norms are seen as both exogenous and durable, the Mafia should persist indefinitely. In contrast, Gambetta (1993) takes the Sicilian Mafia to be a loose cartel of families that produces and sells private protection. Gambetta treats this product as a surrogate for the enforcement of property rights by the state, so that protection is an imperfect substitute for mutual trust, and demand for protection is highest when the state does not enforce contracts. While state enforcement of property rights constitutes a public good, however, Gambetta shows that the distinctive feature of the Mafia’s product is that the protection offered is private. Traders can distinguish insiders (other traders with Mafia protection) from outsiders (traders without it). The incentives for both insiders and the Mafia favor cheating outsiders, thereby increasing distrust.
Distrust is thus endogenized, a product of the Mafia, not its source. Recasting the argument this way makes a difference on at least two counts. First, by identifying the sources of low social capital in the form of distrust, it offers an analysis of ways in which levels of trust might be enhanced. In this case, one focuses on ways in which the state might enforce contracts even-handedly, as opposed to centering attention on a deep-seated subcultural ethos. Second, it offers a general approach that can be applied in different settings. For example, the rapid increase in Mafia activities in Russia after the collapse of the Soviet Union also appears to have origins in the state’s inability or unwillingness to enforce contracts (Varese 1994). Clearly it involves a stretch to attribute such changes to entrenched and distinctive subcultural values, whether from Sicily or elsewhere. Ethnically divided societies constitute another significant phenomenon, since ethnic markers have so
often been taken to reflect the boundaries of ‘primordial’ groups (the quintessential form of exogenously-defined cultural groups) that pose particular threats to states. However, despite numerous ethnically divided societies, ethnic conflict itself is relatively unusual, which implies that such conflict is not an automatic function of the ethnic differences themselves. A principal source of ethnic conflict in divided societies is fear of victimization in the form of aggression from rivals (Weingast 1998). When a group fears victimization and can employ the state against another, it has an incentive to mount a preemptive strike against the other. Since being an aggressor is preferable to being a victim, and because promises not to aggress by individual groups are insufficient to deter conflict, ethnic conflict is likely. It is, thus, crucial to construct political institutions that raise the costs for any group to use the state this way. Effective institutions modify incentives to make toleration self-enforcing, credibly committing each group to honor its promises and thereby minimizing the odds of victimization for any group. Such institutions provide incentives for groups and their leaders to conclude that neither they nor their rivals have an interest in striking first. They thereby help induce trust and co-operation. The advantage of this perspective is that it allows variations in conflict across divided societies to be specified. It also focuses more directly on the temporal ebbs and flows of ethnic conflict that occur with institutional collapse and rebuilding within divided societies. Levels of trust and patterns of group membership within communities, however intangible, are clearly of intrinsic social and political consequence, and recent work on social capital has restored these issues to center stage. Fully understanding the sources and distribution of social capital is an important issue on the current research agenda.
See also: Altruism and Self-interest; Coleman, James Samuel (1926–95); Human Capital: Educational Aspects; Mafia; Network Analysis; Networks and Linkages: Cultural Aspects; Networks: Social; Political Culture; Trust: Cultural Concerns; Trust: Philosophical Aspects; Trust, Sociology of
Bibliography

Almond G A, Verba S 1963 The Civic Culture: Political Attitudes and Democracy in Five Nations. Princeton University Press, Princeton, NJ
Banfield E C 1958 The Moral Basis of a Backward Society. Free Press, New York
Coleman J S 1990 Foundations of Social Theory. Harvard University Press, Cambridge, MA
Fukuyama F 1995 Trust: The Social Virtues and the Creation of Prosperity. Free Press, New York
Gambetta D 1993 The Sicilian Mafia: The Business of Private Protection. Harvard University Press, Cambridge, MA
Harrison L E 1992 Who Prospers? How Cultural Values Shape Economic and Political Success. Basic Books, New York
Inglehart R 1997 Modernization and Postmodernization: Cultural, Economic and Political Change in 43 Societies. Princeton University Press, Princeton, NJ
Jackman R W, Miller R A 1998 Social capital and politics. Annual Review of Political Science 1: 47–73
Ladd E C 1999 The Ladd Report. Free Press, New York
Levi M, Stoker L 2000 Political trust and trustworthiness. Annual Review of Political Science 3: 475–507
Portes A 1998 Social capital: Its origins and applications in modern sociology. Annual Review of Sociology 24: 1–24
Putnam R D 1993 Making Democracy Work: Civic Traditions in Modern Italy. Princeton University Press, Princeton, NJ
Putnam R D 1995 Bowling alone: America’s declining social capital. Journal of Democracy 6: 65–78
Varese F 1994 Is Sicily the future of Russia? Private protection and the rise of the Russian Mafia. European Journal of Sociology 35: 224–58
Weingast B R 1998 Constructing trust: The political and economic roots of ethnic and regional conflict. In: Soltan K, Uslaner E M, Haufler V (eds.) Institutions and Social Order. University of Michigan Press, Ann Arbor, MI
R. W. Jackman
Social Categorization, Psychology of

Humans experience the world through a filter of categorization. They distinguish among natural kind categories such as plants, animals, and minerals, and among artifact categories such as tools, machines, and works of art. Still other categories are abstract (e.g., peace, love, happiness) or ad hoc (e.g., things I need to pack for a trip). Social categorization is mainly concerned with the way people categorize themselves and others. Like any other kind of categorization, social categorization raises two fundamental questions. How do categories come to be coherent, and how do they affect human perception and action (Sternberg and Ben-Zeev 2001)? Before addressing these questions, one may wonder what would happen if there were no social categories. People would either be lumped together or considered individually. Though intuitively appealing, both options are impractical. Neither an all-inclusive human category nor all-exclusive person categories allow precise inferences about individuals. Such inferences are important because of their adaptive functions. People need to be able to predict—beyond chance levels—what others are like and what they are likely to do. Social categorization facilitates and constrains two kinds of inductive inferences. Downward induction occurs when the features of a category are attributed to individual members; upward induction occurs when the features of individuals are generalized to the
category at large. Both inferences become more precise as categories become smaller and more homogeneous (Holland et al. 1986). Knowing that Jacques is French, for example, suggests more precise attributions than merely knowing that he is European. These attributions, in turn, may change when even more exclusive categories (e.g., Parisian) are used (Nelson and Miller 1995). Similarly, the discovery that Jacques is a connoisseur of fine wine is more readily generalized to the French than to Europeans, and it is even more readily generalized to other people from Jacques’ region. Precision and economy are two inversely related aspects of the quality of inductive inferences. The use of exclusive and homogeneous social categories benefits precision, but it is not economical. Any given inference, be it attribution or generalization, can reach only few individuals. Conversely, large (and heterogeneous) categories permit attributions and generalizations to many members, but these inferences lack precision. Like object categories, social categories form hierarchies in which large heterogeneous categories comprise smaller and more homogeneous ones (Sternberg and Ben-Zeev 2001). In a hierarchy of religious categories, for example, the distinction between Christians and Muslims lies at a highly inclusive level. Further distinctions between Catholics and Protestants, and between Sunnis and Shiites, lie at more exclusive levels. Social categorization is extraordinarily flexible because any particular category falls within a web of overlapping and conflicting hierarchies (Wattenmaker 1995). When Jacques is considered Parisian, French, and Catholic, three hierarchies intersect. Most Parisians are French, and most French are Catholic, but no category is contained fully within another. Parisians can also be classed with other city-dwellers (e.g., Romans) as urbanites, and the French can also be classed with other nations (e.g., the Italians) as Europeans. 
The availability of alternative groupings permits strategic choices among equally valid depictions of the social world. Whether the choice of a social category is apt depends heavily on the context in which it is made. Natural kind categories are more resistant to changes in context. Most of the time, a furry companion is simply a dog.
1. Coherence

Categories are coherent if induction effectively differentiates among them. Downward induction is effective if the features of a category can be attributed to its own members at a higher rate than to the members of other categories. Upward induction is effective if the features observed in an individual can be generalized to other members of this category at a higher rate than they can be generalized to members of other categories.
Inspired by Aristotle’s beliefs about the structure of nature, the classic theory postulated the existence of features that are individually necessary and jointly sufficient for categorization (Sternberg and Ben-Zeev 2001). When there are defining features, inferences are certain and deductive instead of probabilistic and inductive. Because, for example, all geometric shapes with three sides and angles summing to 180 degrees are known to be triangles, no generalization needs to occur. Incidental features, such as the color of the shapes, have little inductive interest. Most social categories lack defining features. To illustrate, Allport (1954) noted that the Jews ‘cannot be classified as racial, ethnic, national, religious, or as any other single sociological type.’ The lack of classic definitions in the social domain creates a psychological challenge. How can we perceive and judge people by categories that we are unable to define? The idea of essences protects the faith in classic definitions of social categories. Essences are invisible—or at least so far undetected—defining features. No matter how much a person’s surface features may change, the philosophy of essences assumes the existence of an immutable underlying nature (Rothbart and Taylor 1992). Essentialist beliefs thus obscure the social origin of many categories. Even during the nineteenth century, some scientists believed that differences observed between human races were actually differences between species (Gould 1996). Since essences, by definition, are postulated rather than discovered, they tend to follow rather than precede social categorization. Once a set of social categories—be it by cultural convention or political expediency—is established, people tend to accept it as natural. They tend to believe that there must be defining features even if they are unable to name them.
Wittgenstein (1958), who was more interested in language than in nature, refuted the classic theory when he asked what games are. Although they have little in common, chess and bingo can be grouped together because other games share features with either one or the other (e.g., poker requires both skill and luck). Family resemblance makes categories cohere without requiring that a single feature is shared by all members (Rosch and Mervis 1975). The term family resemblance aptly describes what many vintage portraits of extended families reveal. The patriarch seated in the center is obviously related to everyone, but the cousins on either side have no apparent similarities. Categorization by family resemblance is effective because it captures the lumpiness of the social world. Family resemblance categories comprise members of varying typicality, they have unstable and fuzzy boundaries, and their features tend to be correlated with one another (Rosch and Mervis 1975). Consider the social category of ‘actors.’ Typical members, or prototypes (e.g., Humphrey Bogart), have more features in common with members of the same category
than with members of other categories. Prototypes come to mind faster and are recognized more easily than less typical members. Atypical members (e.g., Ronald Reagan) are not categorized effectively. Minor changes in their feature sets, or even changes in the perceiver’s attention to some features, can lead to categorization elsewhere. Crossing a fuzzy boundary, Ronald Reagan went from being a moderately typical actor to being a prototypical politician. His example illustrates how some features (e.g., the ability to portray sincerity) can contribute to separate family resemblance categories. Individual features tend to be correlated across categories. Feathered animals also tend to fly, and people who prefer Beaujolais over beer tend to say santé instead of prosit. Correlations among features and across categories provide useful, though partly redundant, information. They increase the perceiver’s confidence that categorization has been correct and they allow inductive inferences from one feature to another.
2. Consequences

Unless categorization is random (see discussion of ‘thin categories’ below), most individuals are more similar to other members of the same group than to members of other groups. Both attribution and generalization inferences benefit from this basic similarity relation. If there is no specific information about a group member, any trait can be attributed to him or her with a probability corresponding to the perceived prevalence of the trait in the group. To make such trait attributions is to stereotype. Stereotyping may be questioned on ethical grounds—especially when the expectations are incorrect (Banaji and Bhaskar 2000)—but as an inductive inference, stereotyping is necessary if beliefs about categories and their members are to be coherent (Krueger and Rothbart 1988). Conversely, traits observed in individuals may be generalized to the group. If there are many observations, the prevalence of a feature in a sample is the best estimate for its prevalence in the group at large. But is it possible to justify a generalization to the category if only a single category member has been observed? Mill (1843/1974) noted that people sometimes generalize from single instances, whereas at other times they do not. He realized that this fundamental problem of induction can only be understood by taking into account a priori expectations about the homogeneity of the group. In the simple case of dichotomous features (i.e., a person either does or does not possess a given trait), the group becomes more homogeneous as the prevalence of the trait approaches either zero percent or 100 percent. If traits are thought to be either present in all group members or in none, a single observation of a particular trait can
be generalized to everyone. If the group is thought to be more heterogeneous, generalizations are ‘regressive,’ which means they shift further towards the 50 percent mark. Suppose half of the traits are presumed common (with p = 0.8) and half are presumed rare (p = 0.2). Before there is relevant information about an individual, any particular trait may be expected with the average of these probabilities (i.e., p = 0.5). When the person turns out to be friendly, the trait of friendliness is more likely among the common ones than among the rare ones. The probability that friendliness is a common trait (0.8) may then be multiplied with the probability of encountering a friendly person if friendliness is indeed common (also 0.8). An analogous calculation is done for the possibility that the trait is rare (0.2 × 0.2), and the products are summed and multiplied by 100. The result is an estimate for the prevalence of friendliness (68 percent), which, although it does not match either of the assumed base rates, minimizes the expected error in the long run (Dawes 1990). Generalizations allow the formation and the change of stereotypes. Because generalizations, unlike attributions, are regressive, stereotypes are harder to change than to apply. Categories are coherent inasmuch as similarities within categories are greater than similarities between categories. This simple rule suggests that similarity relationships are there to be detected, and that social categorization is free from errors if this detection task is mastered. Somewhat surprisingly, however, no theory of similarity can fully explain why people use certain categories and not others. Before similarity can be estimated, a finite number of features must be selected from the infinite pool of possible features (Sternberg and Ben-Zeev 2001). These selections partly depend on value judgments or other (social) intuitions that may have little to do with similarity itself.
In other words, categorization depends heavily on the perceiver’s theories of how to construe the world and what features to ignore. Some social categories are so ‘thin’ that they are held together by little more than the label itself. Thin categories are superimposed on social reality rather than extracted from it (Krueger and Clement 1994). Labels create quasi-Aristotelian categories because they do double duty as defining features. Labels create patterns of inclusion and exclusion that might as well be different. The US Census 2000, for example, distinguishes between Hispanics and non-Hispanic Blacks, not between Blacks and non-Black Hispanics. Despite meager underlying similarity relations, these categories organize perception and thought, and thus provide a platform for prejudice, stereotyping, and intergroup relations. The perceiver’s perspective on social categories depends critically on their own sense of identity (Oakes et al. 1994). Some categories include the self, whereas others exclude it. The egocentric distinction between
ingroups and outgroups, which is peculiar to social categorization, has two major consequences: ingroup favoritism and perceptions of outgroup homogeneity. Both phenomena are related reciprocally to the thinness (or arbitrariness) of categorization and to inductive reasoning.
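The single-observation example in this section (common traits with p = 0.8, rare traits with p = 0.2, and a resulting estimate of 68 percent) can be retraced step by step. The following is a minimal sketch under the assumptions stated in the text; the function name `generalized_prevalence` and its parameters are hypothetical labels introduced here for illustration and do not come from Dawes (1990).

```python
def generalized_prevalence(p_common=0.8, p_rare=0.2, prior_common=0.5):
    """Estimate a trait's prevalence in a group after observing it once.

    Hypothetical helper: the parameter names and default values mirror
    the worked example in the text and are assumptions of this sketch.
    """
    prior_rare = 1.0 - prior_common
    # Chance of meeting a member who shows the trait before anything
    # is known about it (0.5 with the default values).
    p_observe = prior_common * p_common + prior_rare * p_rare
    # Bayes' rule: probability that the trait is common (or rare),
    # given that one group member has displayed it.
    post_common = prior_common * p_common / p_observe
    post_rare = prior_rare * p_rare / p_observe
    # Weight each assumed base rate by its updated probability.
    return post_common * p_common + post_rare * p_rare


if __name__ == "__main__":
    estimate = generalized_prevalence()
    print(f"Estimated prevalence: {estimate:.0%}")
```

Called with `p_common=1.0` and `p_rare=0.0`, the same function returns 1.0, which corresponds to the case in which traits are thought to be present in all members or in none and a single observation is generalized to everyone. As the two assumed base rates move toward each other, the estimate moves toward the 50 percent mark, which is the regressive pattern described above.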
3. Ingroup Favoritism

Typically, a group is judged more favorably by people who are in it than by people who are not. In real social categories, ingroup favoritism has many causes, such as competition for scarce resources, a history of ideological or symbolic conflict, or unrepresentative interactions among group members. Therefore, it is important to note that the same bias emerges in the so-called minimal group situation. Minimal groups are very thin categories created in research laboratories. Group membership is completely arbitrary; often it depends merely on the luck of the draw in a lottery. There is no contact among members, no information about the attributes of others, and no dependence on their actions. Still, people discriminate (Tajfel et al. 1971). Compared with outgroup members, ingroup members give away more money or symbolic rewards, they are more interested in making friends, and they expect to see and are prepared to remember more positive personality traits. These differences point to the crucial role of self. When people find themselves in an unfamiliar group, self-knowledge is the only information available to support judgments about others, and social categorization provides the only framework for the generalization of this knowledge. Indeed, people ‘project’ their own features to others with whom they know they have something salient in common, namely group membership (Krueger and Clement 1994). Because most people have favorable self-images, generalization creates a favorable description of the group. Outgroup members do not generalize, believing that their own features do not represent the group. This lack of induction results in a more neutral description of the group. Whose judgments are mistaken? The interplay of induction and categorization suggests that the judgments made by outgroup members are biased. When a population is arbitrarily broken up into different categories, these categories inherit the same features.
Therefore, any individual is similar to both ingroup and outgroups. Ingroup favoritism primarily reflects a failure to generalize one’s own favorable features to outgroups, rather than any excessive tendency to ascribe desirable features to the ingroup. Although egocentric and asymmetric induction patterns go a long way to explain ingroup favoritism, other motives also come into play (Abrams and Hogg 1999). Being able to discriminate against an outgroup
can strengthen a sense of social identity and heighten self-esteem while reducing feelings of uncertainty. These motives can be so strong that people forfeit possible gains for the ingroup as long as the ingroup is better off than the outgroup.
4. Perceptions of Outgroup Homogeneity

The mere act of categorization suggests that the categorized instances can be treated as equivalents. In social categorization, the assimilation of instances (people) to the group average is asymmetric. People who do not belong to the group usually perceive it to be even less variable than group members do (Park and Rothbart 1982). Italians, for example, may appear to be more homogeneous to Americans than to Italians themselves. This homogeneity bias is expressed in several ways. Individual group members are easily confused with one another; individual traits appear to be either extremely common or extremely rare; and judgments about individual group members on a given trait dimension do not depart much from the average of these judgments. The homogeneity bias is not a mere by-product of ingroup favoritism because it affects both positive and negative attributes, and because it is absent in minimal groups where ingroup favoritism is common. The psychological need to differentiate ingroups from outgroups—which contributes to ingroup favoritism—would lead people to perceive the ingroup as homogeneous. Moreover, the homogeneity bias does not simply reflect differences in induction. Although outgroup members may have less information than ingroup members do, differences in familiarity do not necessarily produce a bias. Statistically, sample size is unrelated to sample variance (unless the samples are very small). Finally, the selective induction from the self to ingroups works against the outgroup homogeneity bias. Induction reduces perceptions of variability and thus uncertainty. Not using self-related information to make inferences to outgroups thus supports perceptions of heterogeneity. The main sources of perceived outgroup homogeneity are the processes of categorization themselves. People who belong to a group are more likely to perceive differentiated subgroups (Park and Rothbart 1982).
Italians more than Americans, for example, further categorize Italians by geographic origin, distinguishing between Tuscans, Romans, Calabrians, and so forth. To ingroup members, the group is thereby located at a higher level in the hierarchy than it is to outgroup members. The homogeneity bias then results from the assumption that categories not divided any further are more homogeneous than divisible categories. The error is the failure to realize that members of other groups consider their own categories to be as divisible as we consider our own.
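The statistical remark in this section, that sample size is unrelated to sample variance unless the samples are very small, can be illustrated with an exact enumeration. The sketch below is one possible reading of that remark and not a procedure taken from the cited studies. The two-valued population and the function name `expected_sample_variance` are assumptions made for illustration. The code averages the variance over every possible sample of a given size, once with the usual n − 1 divisor and once with the uncorrected divisor n.

```python
from fractions import Fraction
from itertools import product


def expected_sample_variance(population, n, corrected=True):
    """Exact expected variance of size-n samples drawn with replacement.

    Hypothetical helper for illustration. `corrected=True` divides the
    sum of squares by n - 1; `corrected=False` divides by n.
    """
    divisor = n - 1 if corrected else n
    total = Fraction(0)
    count = 0
    # Enumerate every equally likely sample of size n.
    for sample in product(population, repeat=n):
        mean = Fraction(sum(sample), n)
        sum_of_squares = sum((Fraction(x) - mean) ** 2 for x in sample)
        total += sum_of_squares / divisor
        count += 1
    return total / count


if __name__ == "__main__":
    # Assumed dichotomous trait: absent (0) or present (1), each in
    # half of the group, so the population variance is 1/4.
    population = [0, 1]
    for n in (2, 3, 5, 8):
        corrected = expected_sample_variance(population, n)
        uncorrected = expected_sample_variance(population, n, corrected=False)
        print(n, corrected, uncorrected)
```

With the n − 1 divisor the expected sample variance equals the population variance of 1/4 at every sample size, so a perceiver with few observations is not bound to see less variability than one with many. The uncorrected estimate falls short by the factor (n − 1)/n, a gap that matters only when n is very small.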
5. Conclusion

Frameworks of social categorization may change across time and place, but they are functional in that they enable people to reason inductively and to situate themselves within the social world. Making predictions about the self and others is potentially more adaptive than withholding judgment altogether (Krueger 1998). In the process, however, social categorization also enables the persistence of prejudicial beliefs and discriminatory behavior. The tension between the adaptive and the biasing effects of categorization highlights the importance of this topic.

See also: Stereotypes, Social Psychology of
Bibliography

Abrams D, Hogg M A 1999 Social Identity and Social Cognition. Blackwell, Oxford
Allport G W 1954 The Nature of Prejudice. Addison-Wesley, Reading, MA
Banaji M R, Bhaskar R 2000 Implicit stereotypes and memory: the bounded rationality of social beliefs. In: Schacter D L, Scarry E (eds.) Memory, Brain and Belief. Cambridge University Press, Cambridge, UK, pp. 139–75
Dawes R M 1990 The potential nonfalsity of the false consensus effect. In: Hogarth R M (ed.) Insights in Decision Making: A Tribute to Hillel J. Einhorn. University of Chicago Press, Chicago, pp. 179–99
Gould S J 1996 The Mismeasure of Man. Norton, New York
Holland J H, Holyoak K J, Nisbett R E, Thagard P R 1986 Induction: Processes of Inference, Learning, and Discovery. MIT Press, Cambridge, MA
Krueger J 1998 On the perception of social consensus. Advances in Experimental Social Psychology 30: 163–240
Krueger J, Clement R W 1994 Memory-based judgments about multiple categories: A revision and extension of Tajfel’s accentuation theory. Journal of Personality and Social Psychology 67: 35–47
Krueger J, Clement R W 1996 Inferring category characteristics from sample characteristics: Inductive reasoning and social projection. Journal of Experimental Psychology: General 128: 52–68
Krueger J, Rothbart M 1988 Use of categorical and individuating information in making inferences about personality. Journal of Personality and Social Psychology 55: 187–95
Mill J S 1974/1843 A System of Logic Ratiocinative and Inductive. University of Toronto Press, Toronto, ON
Nelson L J, Miller D T 1995 The distinctiveness effect in social categorization. Psychological Science 6: 246–9
Oakes P J, Haslam S A, Turner J C 1994 Stereotyping and Social Reality. Blackwell, Oxford, UK
Park B, Rothbart M 1982 Perceptions of out-group homogeneity and levels of social categorization: Memory for the subordinate attributes on in-group and out-group members. Journal of Personality and Social Psychology 42: 1051–68
Rosch E, Mervis C B 1975 Family resemblances: Studies in the internal structure of categories. Cognitive Psychology 7: 573–605
Rothbart M, Taylor M 1992 Category labels and social reality: Do we view social categories as natural kinds? In: Semin G R, Fiedler K (eds.) Language, Interaction and Social Cognition. Sage, London, pp. 11–36
Sternberg R J, Ben-Zeev T 2001 Complex Cognition: The Psychology of Human Thought. Oxford University Press, New York
Tajfel H, Flament C, Billig M, Bundy R 1971 Social categorization and intergroup behaviour. European Journal of Social Psychology 1: 149–78
Wattenmaker W D 1995 Knowledge structures and linear separability: Integrating information in object and social categorization. Cognitive Psychology 28: 274–328
Wittgenstein L 1958 Philosophical Investigations, 2nd edn. Blackwell, Oxford
J. Krueger
Social Change: Types Two sets of assumptions must be made explicit to identify different types of social change. The actors’ attributes are needed to understand what they do and why. The social structure (the options and payoffs within which the actors choose and act) is needed to explain what happens and why. Together, the two sets of assumptions are needed to construct a logical narrative of systemic effects. Changes can consist in either (a) the actors changing: what they want, know, do, obtain, or are burdened with, i.e., actor change in cognition, attitudes, sentiments, attributes, actions; or (b) the structure changing: i.e., change of states, their correlatives or distributional aspects. These face the actors as facts, options, or structured costs and benefits, such as property rights, vacancies, socially available marriage partners, health hazards, or language boundaries. The structure may change the actors, their hearts and minds. The actors may change the structure, its positions and possibilities. And the actors may react to conditions they have brought forth—by an echo effect, so to speak, from an edifice of their own making. By answering two questions, (a) do the actors change, and (b) does the structure change, we may identify four kinds of models of social change, as shown in Fig. 1. Examples from classical texts below illustrate these types.
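The two-question classification can be written out as a small lookup. This is only a sketch: the mapping of yes/no answers to the four type names is inferred from the section descriptions that follow, and the function name `model_type` is a hypothetical label introduced here, not part of the original article.

```python
def model_type(actors_change: bool, structure_changes: bool) -> str:
    """Map the two yes/no questions to one of the four model types.

    The assignment below is an assumption read off the four numbered
    sections of the article, not a reproduction of its Fig. 1.
    """
    table = {
        (False, False): "aggregate effects",    # only distributions shift
        (True, False): "actor effects",         # e.g., learning by doing
        (False, True): "structural effects",    # e.g., tools, occupations
        (True, True): "dialectical effects",    # actors and structure both change
    }
    return table[(actors_change, structure_changes)]
```

For instance, a model in which workers gain skill while their tasks stay the same answers "yes" to the first question and "no" to the second, and so falls under actor effects.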
1. Models of Aggregate Effects An example of a model of ‘aggregate effects’ is found in microeconomic theory. Under ‘pure barter,’ each actor is assumed to prefer certain combinations of goods, to know who commands what goods, etc. Among the structural assumptions are that actors have rights to goods, that goods can be exchanged freely without transaction costs, etc. From such assumptions it follows that the actors will barter goods they like less for goods they want more. The driving force is the initial gap between what they have and what they want—between what they actually prefer and what they initially control. Hence, a new distribution of goods will arise among the actors from which nobody will want to move because none can get more satisfaction from another exchange. The actors themselves remain unchanged, as does the value of their goods. The model is one of change only in the sense that the actors’ bundles of goods at the end are different from those at the outset. Hence economists call this genre ‘models of static equilibrium,’ though a process runs until this equilibrium is reached. Distributional characteristics may be modified—e.g., the degree of concentration of goods. But they have no further motivational effects in these models—the actors’ choices shape the final allotment of goods, but they do not react to it once equilibrium is reached. The distribution of final control is just an aggregate outcome, not input for further action. Economics has many variations on this theme. It is a separate, empirical question to what extent such a model agrees with a barter process in the real world.
There are even more stripped-down models of aggregate effects. A stable population model assumes nothing about what the actors want or know—only that they have constant probabilities of offspring and death at specific ages. Initially the population pyramid may deviate from its final shape. Having arrived there, it will remain unchanged. The stable proportions in the different age/sex categories are equilibrium states which represent the ‘latent structural propensities of the process characterized by the [parameters]’ (Ryder 1964, p. 456)—the values that proportions move toward when parameter values remain constant. For some phenomena, even important ones, cut-down actor models (with no intentions or knowledge) are powerful. Simple assumptions can take you a long way in making a logical narrative.
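The stable population argument can be illustrated with a short projection. The sketch below assumes three age classes and made-up fertility and survival rates (none of these numbers come from Ryder or from the article), and the function names are hypothetical. It shows two very different initial pyramids being projected forward under the same constant rates.

```python
def project(pop, fertility, survival):
    """Advance an age-structured population by one period.

    pop[i]       -- size of age class i
    fertility[i] -- expected offspring per member of class i per period
    survival[i]  -- probability of surviving from class i to class i + 1
    """
    births = sum(f * n for f, n in zip(fertility, pop))
    aged = [s * n for s, n in zip(survival, pop[:-1])]
    return [births] + aged


def proportions(pop):
    """Return the share of the total population in each age class."""
    total = sum(pop)
    return [n / total for n in pop]


def stable_structure(pop, fertility, survival, periods=200):
    """Project repeatedly and return the resulting age proportions."""
    for _ in range(periods):
        # Rescaling each period keeps the numbers bounded; only the
        # proportions matter for the shape of the pyramid.
        pop = proportions(project(pop, fertility, survival))
    return pop


# Illustrative, assumed parameters (not taken from any source).
FERTILITY = [0.0, 1.2, 0.8]
SURVIVAL = [0.9, 0.7]

young_start = stable_structure([100.0, 0.0, 0.0], FERTILITY, SURVIVAL)
mixed_start = stable_structure([10.0, 50.0, 40.0], FERTILITY, SURVIVAL)
```

Under these assumed rates both starting points end up with the same age proportions, and a further projection step leaves them unchanged. No assumption about what the actors want or know enters the calculation.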
[Figure 1. Types of models of social change]
2. Models of Actor Effects
The second type of model is illustrated by ‘learning by doing.’ In Capital, Marx describes how a specialist (a ‘detail laborer’) acquires skills within manufacture production. A ‘collective laborer’ is the organized combination of such specialists. Manufacture ripens the virtuosity of the ‘detail laborer’ in performing his specialized function by repetition: A labourer who all his life performs one and the same simple operation, converts his whole body into the automatic, specialized implement of that operation. Consequently, he takes less time doing it, than the artificer who performs a whole series of operations in succession. But the collective labourer, who constitutes the living mechanism of manufacture, is made up solely of such specialized detail labourers. Hence in comparison with the independent handicraft, more is produced in a given time, or the productive power of labour is increased. Moreover, when once this fractional work is established as the exclusive function of one person, the methods he employs become perfected. The workman’s continued repetition of the same simple act, and the concentration of his attention on it, teach him by experience how to attain the desired effect with the minimum of exertion. (Marx [1864] 1967, p. 339)
The first logical element of this argument is actor assumptions: (a) Actors are purposive—i.e., they seek desired effects. Indeed, Marx was no strict determinist, but assumed purposive action explicitly: A spider conducts operations that resemble those of a weaver, and a bee puts to shame many an architect in the construction of her cells. But what distinguishes the worst architect from the best of bees is this, that the architect raises his structure in imagination before he erects it in reality. And at the end of every labor-process, we get a result that already existed in the imagination of the labourer at its commencement. He not only effects a change of form in the material on which he works, but he also realizes a purpose of his own. (Marx [1864] 1967, p. 178)
(b) The actors are satisficing, i.e., they start pursuing their goals by procedures that may not be the best (which provides room for improvement). (c) They recognize deviations from present satisfactory performance. (d) They evaluate deviations as either positive or negative by a criterion such as the ‘principle of least effort’ (to attain the desired effect with the minimum of exertion). (e) There are two sources of deviation from current routines. The actors may search for better solutions, i.e., purposively probing for improvements by ‘trial and error.’ Or actors make mistakes—a stochastic element producing random departures from current routines. Accidentally hitting better solutions could be called ‘spot and snatch.’ Hence the model has two sources of new alternatives—deliberate exploration and random variation—one sought ex ante, the other sighted ex post. In short, behavior is result controlled: when the outcome of an action deviates from the intended result, actors may modify their conduct. Negative deviations in terms of an actor’s purposes will be
avoided and result in a return to the standard routine. Positive deviations will be retained. The second logical element is the structure assumptions of Marx’ selection model: (a) The actors are set to repetitive tasks. (b) There are alternative ways of carrying out the tasks. (c) The choice of alternative has consequences for the actors, being rewarding or penalizing in terms of their goals. These consequences may be due to laws of nature (e.g., efforts required or difficulties encountered) or they can be socially organized (e.g., reducing costs or increasing payoffs). Such are the assumptions of Marx’ argument. He constructed a selection model which at the individual level works like this: the probability of mistakes is cut with every attempt, while methods giving better results are retained. Initially satisfactory but unwieldy procedures are abandoned—more effective ones are kept when captured, e.g., by the principle of least effort. Hence the laborer becomes a special tool for the task at hand—more adept and adroit. The best form found solidifies into a routine. And the collective worker as the sum of detail laborers can reach a perfection beyond any single individual carrying out all operations alone. Manufacture begets handicraft industries that push the skills required for carrying out a specific task in the same direction. The progression toward a best response pattern hence leads to homogenization of actions. In short, on the assumptions of purposive action and selection of beneficial deviations, Marx explains individual adaptation as well as uniforming selection—i.e., increased dexterity in repetitive tasks until the form best fitted to the task is found. Note that in this model what changes is not actors’ preferences but their skills—not what they want, but what they know. Their bias for less effort is stable. There are other models where actors change without this being part of their intention or recognition— effects may occur behind their backs. 
For example, a contagious disease may immunize a group against further outbreaks even if neither the contagion nor its effects are understood. If insights are won, actors may change practice, as when humans started to infect themselves with cowpox as protection against smallpox.
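The selection logic of ‘learning by doing’ described in this section can be sketched as a simple simulation. Everything numeric here is an assumption made for illustration (the probabilities of deliberate search and of accidental slips, the size of a deviation, the floor on effort), and the function name is hypothetical. The sketch keeps the actor’s criterion fixed (least effort) and lets only the routine change; it does not model the falling probability of mistakes.

```python
import random


def learn_by_doing(initial_effort, best_possible, repetitions,
                   p_search=0.2, p_slip=0.1, step=0.05, seed=0):
    """Simulate one worker repeating a task and refining a routine.

    Each repetition the worker either follows the routine, probes a
    variation on purpose ('trial and error', probability p_search), or
    departs from it by mistake (probability p_slip). A deviation that
    needs less effort is retained ('spot and snatch' when it arose by
    mistake); one that needs more effort is dropped and the worker
    returns to the routine.
    """
    rng = random.Random(seed)
    routine = initial_effort
    history = [routine]
    for _ in range(repetitions):
        if rng.random() < p_search + p_slip:
            # Effort of the deviating attempt; it cannot fall below the
            # assumed technical minimum for the task.
            attempt = max(best_possible, routine + rng.uniform(-step, step))
            if attempt < routine:  # least-effort criterion: keep improvements
                routine = attempt
        history.append(routine)
    return history


effort_path = learn_by_doing(initial_effort=1.0, best_possible=0.4,
                             repetitions=5000)
```

Because only improvements are retained, the effort path never rises, and it drifts toward the assumed minimum, where the routine then stays fixed. This mirrors the text’s point that the same mechanism accounts for both the change and the later constancy.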
3. Models of Structural Effects The brilliant Norwegian sociologist Eilert Sundt (1817–75) provides an example of structural change inspired by the logic of mutations and selection found in The Origin of Species. ‘Learning by doing’ can modify not just behavior, but also material forms. Sundt had a keen eye for the logic of both error-based ‘spot and snatch’ and the more deliberate ‘trial and error’ mentioned above. In this quote, both the stochastic component in generating variation and the purposive element in discovery and selection of slight beneficial deviations are exposed lucidly: A boat constructor may be skilled, and yet he will never get two boats exactly alike, even if he exerts himself to this end. The deviations occurring may be called accidental. But even a very small deviation usually is noticeable during navigation, but it is not accidental that the sailors come to notice that the boat that has become improved or more convenient for their purpose, and that they should recommend it to be chosen for imitation … One may think that each of these boats is perfected in its way, since it has reached perfection through one-sided development in one particular direction. Each kind of improvement has progressed to the point where further development would entail defects that would more than offset the advantage … Hence I conceive of the process in the following way: when the idea of new and improved forms had first been aroused, then a long series of prudent experiments each involving extremely small changes, could lead to the happy result that from the boat constructor’s shed there emerged a boat whose like all would desire. (Sundt [1862] 1976)
In this case, then, the material structure of society is modified via learning by doing. Note also that Sundt argued that the recognition of accidental discoveries (‘spot and snatch’) itself leads to the discovery of the usefulness of experiments (‘trial and error’), i.e., to second order learning. Marx used a similar line of argument to explain the differentiation of tools adapted to different tasks, and second order learning when science is used to improve industrial means of production. Before turning to the fourth type, dialectical change, the analyses of Marx and Sundt may highlight how the same basic mechanism can be used to explain constancy as well as change, i.e., the evolution of social habits, tools, and organizational forms toward an optimal equilibrium pattern. Or, put differently, how continual trial and error leads to optimal responses which are then crystallized as customs, standard operating procedures, or best practices. The skills required are pushed in the same direction and the progression toward a best response pattern leads to homogenization of actions, i.e., to uniforming selection. Likewise, adaptation for divergent purposes generates differentiation. The essence of ordinary work is repetition—i.e., variations on the same theme. But by chance variations and serendipity, skills can be acquired, tools developed, and workgroups rearranged. This is how Marx explains changes in organizations and in aggregates at the macro level. The first organizational consequence is that craftspeople, the best performers, may become locked into lifetime careers. The second organizational consequence is that, on the basis of skill and mastery, industry may sprout into different occupations—a structural change arising from the social organization of individual skills. That is, a particular task selects a best response that becomes fixed as a routine, different tasks select different types of best responses: ‘On that
groundwork, each separate branch of production acquires empirically the form that is technically suited to it, slowly perfects it, and as soon as a given degree of maturity has been reached, rapidly crystallizes that form’ (Marx [1864] 1967, p. 485). Marx uses his selection model to explain structural differentiation. The resulting structure may be no part of any individual’s or group’s intentions, though it is a result of their choices. By the same logic Marx explains change in the social structure of society as the gradual adaptation of different groups of workers. What is the best relative size of groups of specialized workers in different functions? Marx argues that there is a constant tendency toward an equilibrium distribution, forced by competition and the technical interdependence of the various stages of production: The result of the labour of one is the starting-point for the labour of the other. The one workman therefore gives occupation directly to the other. The labour-time necessary in each partial process, for attaining the desired effect, is learnt by experience; and the mechanism of Manufacture, as a whole, is based on the assumption that a given result will be obtained in a given time. It is only on this assumption that the various supplementary labour-processes can proceed uninterruptedly, simultaneously, and side by side. It is clear that this direct dependence of the operations, and therefore of the labourers, on each other, compels each one of them to spend on his work no more than the necessary time, and thus a continuity, uniformity, regularity, order, and even intensity of labour, of quite a different kind, is begotten than is to be found in an independent handicraft or even in simple cooperation.
The rule, that the labour-time expended on a commodity should not exceed that which is socially necessary for its production, appears, in the production of commodities generally, to be established by the mere effect of competition; since, to express ourselves superficially, each single producer is obliged to sell his commodity at its market price. In Manufacture, on the contrary, the turning out of a given quantum of product in a given time is a technical law of the process of production itself. (Marx [1864] 1967, p. 345)
There are two types of selection pressures in this argument: structural dependence, or integration that compels each to spend on his or her work no more than the necessary time, and economic competition. In short, the argument is based not just on purposive actors but on their location in a particular social macrostructure (competition) resulting in modification of another part of the structure (occupational composition). This argument also uses a conception of bottlenecks—a condition or situation that retards a process and thwarts the purposes of actors involved. This mode of explanation is often used by social scientists, e.g., for the logic of the industrial revolution: the invention of the Spinning Jenny made weaving a bottleneck and hence led to improved shuttles and machine looms requiring greater power to drive them, favoring the use of the steam engine, etc. Learning in this case consists in gradual adaptation of group sizes to obtain the most efficient production.
That definite numbers of workers are subjected to definite functions Marx calls ‘the iron law of proportionality’ (Marx [1864] 1967, p. 355). From firms, Marx moves on to the societal level: adjusting different work groups within the manufacture generates a division of labor at the aggregate or macro level. Third, the basic selection model is used to explain not only change—i.e., the succession of steps towards the best suited response—but also constancy, i.e., the locking in of a structurally best response. ‘History shows how the division of labour peculiar to manufacture, strictly so called, acquires the best adapted form at first by experience, as it were behind the backs of the actors, and then, like the guild handicrafts, strives to hold on to that form when once found, and here and there succeeds in keeping it for centuries’ (Marx [1864] 1967, p. 363). The fourth point made by both Sundt and Marx is a conceptual one: learning may take place not just by doing, but also through a second form of learning, namely, imitation. That is, those who have found the knack can be emulated, and, as Sundt argued, tools improved by accidents are chosen for imitation. Marx wrote: ‘since there are always several generations of labourers living at one time, and working together at the manufacture of a given article, the technical skill, the tricks of the trade thus acquired, become established, and are accumulated and handed down’ (Marx [1864] 1967, p. 339). In other words: socially organized purposive actors may learn vicariously. They recognize and copy the improved solutions others find to their own problems. Basically then, Marx and Sundt argued that a constant technology advances the development of an equilibrium pattern of behavior, corresponding to the best possible practice, which also congeals social forms at the macro level.
This takes place through the selection of the most adept methods by individuals, which, when aggregated, moves a group or even a mass of people from one distribution to another. They used selection mechanisms to explain a wide range of phenomena: development of individual dexterity, the evolution of lifetime careers, the homogenization of practices and the differentiation of tools, the evolution of occupations—indeed, the drift of large aggregates as well as equilibrium patterns and constancy. The fifth point is that they considered new technology as a driving force which may require readaptation of behavioral responses (cf. Arrow 1962). From the notion of change moving toward an equilibrium or constancy Marx introduces the notion of the constancy of change. This happens when the random nature of learning by doing yields to the systematic search for new knowledge by applied science: The principle, carried out in the factory system, of analysing the process of production into its constituent phases, and of solving the problems thus proposed by the application of
mechanics, of chemistry, and the whole range of the natural sciences, becomes the determining principle everywhere. Hence machinery squeezes itself into the manufacturing industries first for one detail process, then for another. Thus the solid crystal of their organisation, based on the old division of labor, becomes dissolved, and makes way for constant changes. (Marx [1864] 1967, p. 461)
Marx’ explicit logic, therefore, was not just one of learning by doing and second order learning by out-and-out search and science. He went even further in analyzing how the introduction of new implements required new learning by the users of those implements, and indeed, led to a change in attitudes as well. In other words, learning made people change their own conditions, and by adapting to those conditions people changed themselves by learning to learn. This is the double logic of dialectical models.
4. Models of Dialectical Effects Dialectical models analyze structures which motivate actors, by accident or design, to bring forth structural changes which then change the actors’ own attributes. In shorthand, such models look like this: Structure → Actoral Change → Structural Change → Actoral Change. This is, of course, a very important category for social analysis—the more so the longer the time perspective. People have not shaped the society into which they are born—they are molded by a society before they can remake it. But humans can remake themselves indirectly by changing the structure or culture within which they act. A classical text, Max Weber’s (1904–5) The Protestant Ethic and the Spirit of Capitalism, can serve to illustrate such models. Weber first identified a specific tendency among Protestants to develop economic rationalism not observed to the same extent among Catholics (Hernes 1989). Weber’s paradoxical thesis is that the supposed conflict between otherworldliness or asceticism on the one hand, and participation in capitalistic acquisition on the other, is in fact an intimate relationship. Weber’s model departs from what he calls traditionalism, populated by two types of actors: traditionalistic workers, and traditionalistic entrepreneurs. The key attribute of the former is that increasing their rewards does not lead to an increase in their output, but rather makes them decrease the amount of work they put in. People ‘by nature’ do not want to earn as much as possible: traditionalistic workers adjust their supply of labor to satisfy their traditional needs. This ‘immense stubborn resistance’ is the ‘leading trait of pre-capitalist labor.’ Likewise, traditional entrepreneurs adjust their efforts to maintain the traditional rate of profit, the traditional relations with labor, and the traditional circle of customers.
However, what these actors want is not hardwired—their character can be changed. Weber wants to explain this change from the tranquil state of traditionalism to the bustling dynamism of capitalism. The first part of his argument is about how the actors’ temperaments are changed: how a homo novus—or rather two—is brought forth. The second part of the argument is about how they trap themselves and everyone else in the new conditions or structure they create. The new entrepreneurs came to see moneymaking as an end in itself. The new workers came to see labor as an end in itself. The argument runs briefly as follows. In the beginning there was logic. Conceptual activity worked on beliefs to turn doctrines of faith into maxims for behavior. At the outset traditionalists had faith in God. He was omniscient and omnipotent. By pure logic this led to predestination: an omniscient God had to know ahead of time who were among the chosen; an omnipotent God must have decided already and must know the outcome of his decisions. Humans could do nothing to bring about their salvation by their own actions. Fates were preordained by God. Those who had salvation could not lose it and those without it could not gain it. What is more, neither group could know who had it. So for every true believer one question must have forced all others into the background: ‘Am I one of the elect?’ At first there seemed to be no answer; however, all the faithful were to act as if they were among the chosen. Doubts were temptations of the Devil—a sign of failing faith, hence of imperfect grace. And the faithful should act in the world to serve God’s glory to the best of their abilities. Those unwilling to work therefore had the symptoms of lack of grace. Hence there were some signs. But the logic of this argument resulted in a peculiar psychology of signs—a mental reversal of conditional probability notions.
The original argument ran: if you are among the elect, you work in a calling. The psychological reversal became: if you work in a calling, you have a sign that you are among the elect. Moreover, this sign—working—could be affected. By choosing to work hard, you attain a sign that you are among the chosen: In practice this means that God helps those who help themselves. Thus the Calvinist, as it is sometimes put, himself creates his own salvation, or, as would be more correct, the conviction of it. But this creation cannot, as in Catholicism, consist in a gradual accumulation of individual good works to one’s credit, but rather in a systematic self-control which at every moment stands before the inexorable alternative, chosen or damned. (Weber 1930, p. 115)
This is the first part of Weber’s analysis of actor change: the moral construction of the new actors— new workers and new entrepreneurs working in a calling for the glory of God—driven to restless work by their fierce quest for peace of mind.
However, this actoral change is not a sufficient condition to explain the structural change, i.e., the establishment of a new economic system—capitalism. Hence the next part of Weber’s logical chain is the following. Working in a calling was not enough. For even if many were called, few were believed chosen. Identifying who they were—and if one was among them—could only be done by comparisons. ‘Success reveals the blessings of God’ (Weber 1930, p. 271, n. 58). But success can only be measured in relative terms: ‘But God requires social achievement of the Christian because He wills that social life shall be organized according to His commandments, in accordance with that purpose’ (Weber 1930, p. 108, emphasis in original). Each Christian had, so to speak, to prove that he or she was ‘holier than thou.’ Moreover, each of them had to assess their success relative to others, and they had to do it in terms of economic achievement. The sign of salvation was not what you did unto others—it was whether you outdid the others. Here is the nexus between actor change and structural change. From this mode of thinking, the individualistic Protestants were, so to speak, connected behind their backs. Religious attainment became measured by private profit. The market united those that theology separated. From being religious isolates the Christians became rivals—assessed by the relative fortunes they amassed. But then, to hold their own, they forced each other to work harder. In effect they set up a new social structure for themselves by their thinking and by their actions. Their religious mindset produced the new social setup. By taking not a good day’s work, but the relative fruits of labor in a calling, as a measure of success, they in effect caught themselves in a prisoner’s dilemma.
Since capital accumulated but not capital consumed was measured, the logic of asceticism ‘turned with all its force against one thing: the spontaneous enjoyment of life and all it had to offer’ (Weber 1930, p. 166). Hence assets could also serve as a relative measure of success. And this had the ‘psychological effect of freeing the acquisition of goods from the inhibitions of traditionalistic ethic’ (p. 174). The double-barreled effect of ascetic Protestantism, in short, was to join the compulsion to work with the obligation to reinvest its output. But the prisoner’s dilemma that the Protestants established by their particular comparative, even compulsive, work ethic not only captured its creators. By their logic of relative success they also dragged everybody else into the vortex of the market—they changed the social structure in a decisive way because it also entrapped all others: The Puritan wanted to work in a calling; we are forced to do so. For when asceticism was carried out of the monastic cells into everyday life, and began to dominate worldly morality, it did its part in building the tremendous cosmos of the modern
economic order. This order is now bound to the technical and economic conditions of machine production which today determine the lives of all individuals who are born into this mechanism, not only those who are directly concerned with economic acquisition, with irresistible force. (Weber 1930, p. 181)
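The prisoner’s dilemma created by measuring success in relative terms can be made concrete with a two-person payoff sketch. The payoff numbers, the two labels (‘traditional’ for a good day’s work, ‘hard’ for restless work), and the function names are all assumptions introduced for illustration; Weber gives no such figures. Each believer gains standing only by outdoing the other, and hard work carries a cost.

```python
CHOICES = ["traditional", "hard"]


def payoff(me, other, standing_gain=3.0, effort_cost=1.0):
    """Payoff to one believer given both choices (illustrative numbers).

    Standing is purely relative: it is gained only by outworking the
    other person and lost by being outworked. Hard work always costs
    effort_cost, which is assumed smaller than standing_gain.
    """
    if me == "hard" and other == "traditional":
        standing = standing_gain
    elif me == "traditional" and other == "hard":
        standing = -standing_gain
    else:
        standing = 0.0  # equal effort reveals nothing about who is elect
    cost = effort_cost if me == "hard" else 0.0
    return standing - cost


def best_reply(other):
    """Return the choice that maximizes one's own payoff against `other`."""
    return max(CHOICES, key=lambda me: payoff(me, other))
```

Under these assumed payoffs, working hard is the best reply whatever the other person does, yet when both work hard each ends up worse off than if both had kept to a traditional day’s work. That is the sense in which the actors, by their own standard of comparison, force each other to work harder.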
The structure that the actors created turned on themselves with a compelling might—their internal convictions transformed into external forces, so to speak. Moreover, not only are people compelled to react to conditions of their own making. The new structure of the competitive market in turn creates yet new types of actors without the religious motivation from which it sprang: The capitalistic economy of the present day is an immense cosmos into which the individual is born, and which presents itself to him, at least as an individual, as an unalterable order of things in which he must live. It forces the individual, in so far as he is involved in the system of market relationships, to conform to capitalistic rules of action. The manufacturer who in the long run acts counter to these norms, will just as inevitably be eliminated from the economic scene as the worker who cannot or will not adapt himself to them will be thrown into the street without a job. Thus the capitalism of today, which has come to dominate economic life, educates and selects the economic subjects which it needs through a process of economic survival of the fittest. (Weber 1930, p. 54 f.)
By measuring success in relative terms, the Puritans established a prisoner’s dilemma in which they not only ambushed themselves and ensnared non-Puritans, but in addition, once it had been socially established, in turn liberated capitalism from its connection to their special Weltanschauung. It no longer needed the support of any religious forces: ‘Whoever does not adapt his manner of life to the conditions of capitalistic success must go under. But these are phenomena of a time in which modern capitalism has become dominant and has become emancipated from its old supports’ (Weber 1930, p. 72). Hence, having explained the generation of the capitalist system, Weber takes competition to be the source of its own reproduction, i.e., of the constancy of the new system. The paradox is that the religious belief that set people free from restraints on acquisitive activity, in turn liberated the resulting economic system from its creators and their beliefs. Once the new system is in place, it selects and shapes the actors needed to maintain it. The Protestant entrepreneurs who established the new system thereby in turn called forth yet another homo novus—industrialists and managers and workaholics infused not with Protestantism, but with the spirit of capitalism. The short version of Weber’s analysis is this: religious conversion led to economic comparison; and the importance of relative success as a religious sign led to competition. This was a systemic transformation that
in turn led not only to accumulation and innovation, but also to selection and socialization of new actors. A system brought into being by actors who feared eternal damnation bred actors who maintained it from concern for their daily survival and without religious qualms. Weber’s analysis provides a very good example of dialectical change in that it explains how a new type of actor, with different beliefs and knowledge, modifies its own environment out of existence and in turn creates the conditions for yet another type of actor that is different from itself. As Garrett Hardin (1969, p. 284) has written of ecological succession: ‘It is not only true that environments select organisms; organisms make new selective environments’ (cf. Hernes 1976, p. 533).
5. Conclusion The analysis of change is at the heart of social theory because it attempts to answer this question: How do humans cope with the effects of their own creativity? To carry out such analysis one needs simple, yet potent, tools. They are embedded in the classical works of social science. When they are identified, we can emulate the best practitioners in the field.
Bibliography
Arrow K 1962 The economic implications of learning by doing. Review of Economic Studies 9(June): 155–73
Hardin G 1969 The cybernetics of competition: A biologist’s view of society. In: Shephard P, McKinley D (eds.) The Subversive Science: Essays Towards an Ecology of Man. Houghton Mifflin, Boston, pp. 275–95
Hernes G 1976 Structural change in social processes. American Journal of Sociology 38(3): 513–46
Hernes G 1989 The logic of ‘The Protestant Ethic.’ Rationality and Society 1: 123–62
Marx K [1864] 1967 Capital. International Publishers, New York
Ryder N 1964 Notes on the concept of a population. American Journal of Sociology 69(March): 447–63
Sundt E [1862] 1976 Om havet. In: Samlede verker. Gyldendal, Oslo, Vol. 7
Weber M [1904–5] 1930 The Protestant Ethic and the Spirit of Capitalism. Translated by Talcott Parsons, Allen & Unwin, London
G. Hernes
Social Changes: Models There is an amazing multiplicity of social forms: rites, tools, languages, and rules. How has this variety come about and why is it forever changing? Yet most of the time humans behave like those around them. How has this conformity developed? And why is it maintained? Models of social change address these questions: What generates change, and is it differentiating or uniforming? What maintains established structures, whether they are simple or complex? Classics, such as Durkheim, Marx, Sundt and Weber, provided models of the transformation as well as the preservation of social forms, and analyzed diversification as well as homogenization. This they did not so much by general theoretical statements as by laying bare the social logic driving specific historical passages. They can be used to highlight models of social change and provide paragons that present-day studies can be appraised against.
1. Individual Actions and Social Facts Durkheim, in Suicide ([1897] 1951) sought to explain both constancy and variety. He asked: why are suicide rates so stable—indeed, more stable than ordinary death rates—and yet vary systematically between groups? Why do they take on a new stubborn stability after marked shifts? Durkheim held stable, but differing suicide rates to be social facts. They exist relatively independent of the individuals who compose the group on which they are measured. Social facts—e.g., the language we speak, rites we share, or streets we walk—endure, Durkheim argued, sui generis. Social facts have an existence of their own, like a swarm of moths which can change its shape or shift its position—yet, of course, there are no social phenomena without individuals being part of them: no swarm without the ephemeral movements of moths. Put paradoxically: social facts—both change and constancy—are relatively independent of those on whom they depend. Individuals are held in place by their fellows and take their cues from each other. This is important in the study of social change: it is always mediated through individuals who are shaped by their physical and social environment before they can act on it, and who change it by reacting to circumstances they have brought forth or that are transmitted from the past. Social change is at the core of sociological analysis. Yet, few studies cover the whole cycle of analyzing ‘how men react to conditions of their own making and in so doing change these conditions’ (Hernes 1976, p. 513). Rather they present snapshots of a condition, such as a poll of opinions or their causal determinants. The reasons are many: collecting data as a process unfolds may be too costly or time-consuming. Data from other sources do not easily lend themselves to analysis over time.
Survey techniques often result in a kind of aggregate social psychology rather than in studies of structural change, and when it restricts itself to ‘enumerating individual characteristics, [it] treats the individual as if he were detached from his environment and hence as an abstraction’ (Boudon 1971, p. 48). Indeed, the methodological successes of
survey analysis can lead to conceptual failures. So most social scientists do not go into deep analysis of social change even if it is at the core of social analysis.
2. Requirements of Theories of Social Change From the above observations, three logical requirements of theories of social change can be identified (Hernes 1976). First, it should be possible to use the same basic approach in studying both constancy and change. This implies that a stable structure should be seen as a process in (temporary) equilibrium and that the theory must include an explanation of why the forces or parameters governing a process themselves are regenerated. Structural change must be described in terms of the processes that generate the change. Structural stability must be explained in terms of the processes that not only maintain the stability, but also maintain the processes that maintain the stability. Some accounts are static, e.g., ‘percentage unemployed.’ Yet, from a model point of view, it not only makes better sense, but also makes for more interesting analysis, to conceptualize even ‘static’ phenomena in terms of processes. For example, if the number unemployed remains stable, this does not imply that the same individuals remain out of work—only that the flows of individuals in and out of employment cancel out. Hence stability can be viewed as a process in equilibrium (Coleman 1964). Second, models of social change should incorporate intrinsic sources of change. Changes in social systems may be induced from the outside, but not all changes can be thus induced (Elster 1971). Even when the impetus for change is imported, the mechanisms transmitting external impacts should be identified. Indeed, when an item is taken from the outside, it should be possible to employ the same kinds of explanations for its adoption as if it had been invented at home. Moreover, the sources of change should also be intrinsic to the theory in the sense that effects of previous actions provide or change premises for further choices of action.
Lauriston Sharp (1952) gives an example in ‘Steel axes for Stone Age Australians.’ He describes how the aboriginal Yir Yoront group as late as the 1930s lived an essentially isolated and independent economic existence based on Stone Age techniques. However, ‘their polished stone axes were disappearing fast and being replaced by steel axes which came to them in considerable numbers, directly or indirectly, from various European sources to the south’ (Sharp 1952, p. 17). Sharp then discusses what changes in the lives of the group resulted from the replacement of stone axes around which their culture was built. For example, stone axes were an item of trade, hence important in intertribal relations, belief systems, celebrations, and totemic ceremonies. Men crafted the stone axes, which were a symbol of their masculinity, yet they were lent to women who used them in their heavy chores. Furthermore, ‘The shift from stone to steel provided no major technological difficulties’ (Sharp 1952, p. 20). Sharp then analyzes how this change of tools changed trade relations, gender interactions, and authority structures.
The most disturbing effects of the steel axe, operating in conjunction with other elements also being introduced from the white man’s several sub-cultures, developed in the realms of traditional ideas, sentiments, and values. These were undermined at a rapidly mounting rate, with no new conceptions being defined to replace them. The result was the erection of a mental and moral void which foreshadowed the collapse and destruction of all Yir Yoront cultures, if not, indeed, the extinction of the biological group itself. (Sharp 1952, p. 21)
Here the new technology came from the outside— the impetus for change was exogenous. But its adoption is explained in terms of simple principles at the level of actors: steel axes were nearly free, good tools that were simple to use and required less effort—the same types of explanation we would use if they had been invented locally. The effects of this deus ex machina changed premises for further choices of action—in this case with devastating aggregate effects: it reduced women’s dependence on men, it undermined belief systems, it wreaked havoc with traditional trade relations and their social embellishments. Finally, as just seen, social change is mediated through individual actors. Hence the third requirement of theories of social change: they must show how macrovariables or structural characteristics affect individual motives, capacities, and choices and how the choices of actors in turn change the macrovariables. For the Yir Yoront, adopting a new tool changed their social structure in ways that were no part of their intention.
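The first requirement, that stability be seen as a process in equilibrium, can be made concrete with the unemployment example. What follows is only a minimal sketch under stated assumptions: the function names, the constant per-period job-loss and job-finding rates, and all numerical values are hypothetical illustrations and do not come from the works cited.

```python
def simulate_unemployment(labor_force, unemployed, separation_rate, finding_rate, periods):
    """Track the unemployed stock together with the gross flows in and out of it.

    Assumption (hypothetical): each period a fixed share of the employed lose
    their jobs and a fixed share of the unemployed find one.
    """
    history = []
    for _ in range(periods):
        employed = labor_force - unemployed
        inflow = separation_rate * employed    # people entering unemployment
        outflow = finding_rate * unemployed    # people leaving unemployment
        unemployed = unemployed + inflow - outflow
        history.append((unemployed, inflow, outflow))
    return history


def equilibrium_unemployed(labor_force, separation_rate, finding_rate):
    """Stock at which inflow equals outflow: s * (N - U) = f * U."""
    return labor_force * separation_rate / (separation_rate + finding_rate)


# Hypothetical values chosen only for illustration.
history = simulate_unemployment(
    labor_force=1000.0, unemployed=200.0,
    separation_rate=0.02, finding_rate=0.38, periods=200,
)
final_stock, final_inflow, final_outflow = history[-1]
target = equilibrium_unemployed(1000.0, 0.02, 0.38)
```

In this sketch the stock settles at a constant level while the inflow and outflow remain positive and equal, so the ‘static’ figure conceals continuous turnover of individuals.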
3. Elements of Models of Social Change Next we can identify the loci of change. Basically they can be of two kinds: (a) the actors can change, and (b) the structure within which they operate can change. Models must be constructed from these two sets of abstract elements—actors and structures—to mirror what drives the changes in them (Hernes 1989, 1999). Hence, the first element to specify in a model of social change is the types of actors assumed to operate. In the language of the theater, specifying the actors corresponds to the casting of a play. For example, in microeconomic theory two types of actors are distinguished: consumers maximizing utility by spending their resources to obtain goods, and producers maximizing profit by the quantities they market. The second element to specify is the structure the actors are lodged in—the staging of a play. For
example, in microeconomic theory three main market structures are distinguished: perfect competition where the actors are so many and small that none of them can affect the price; oligopoly, where the producers are so few that they react to each other’s decisions; and monopoly, where a single producer can set the price. One reason for combining actors and the structure within which they operate is to construct a model so that one can logically work out the systemic effects of actors’ choices within the structure—the plotting of the play. The point is to answer questions about what happens to the actors or to the structure as a consequence of the actors reacting to the initial conditions—i.e., the logical sequence of consequences that flow from the assumptions made. In other words, we construct models of actors within a structure in order to understand what the actors are doing and why, and to explain what results from their actions and why, whether the results are intended or not. In short, a model of change is a logical narrative. A logical narrative may have logical flaws and hence be theoretically unsound. And one such logical narrative may exclude others: it simultaneously empowers our reasoning and limits our comprehension—both our understanding of the actors and the explanation of the results of their interaction. A second purpose is to compare the logical outcomes of the model to a real life sequence of events. A compelling argument may be empirically false—the logical narrative may not match the historical record. Hence, if the model is to provide a valid understanding or a tenable explanation, there must be a rough goodness of fit between what the model logically leads us to expect about what will transpire and what actually takes place in the real world. The model cannot just be plausible in theory—it must at the end of the day be admissible by the facts: the observed must agree with the inferred.
If the model is ruled out by the facts, it must be discarded and another constructed from different assumptions that provide a more realistic understanding and credible explanation. For example, the simple model of perfect competition implies that profits will tend towards zero. If we observe that profits in fact are high, we may modify the model to better account for this fact by changing the assumptions, e.g., by allowing producers to change consumers’ attitudes (advertising) or by allowing for monopolization—i.e., discarding the model of perfect competition.
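The two purposes of a model described above, deriving what follows from the assumptions and then checking the result against observation, can be sketched for the perfect-competition example. This is an illustrative toy only: the linear demand curve, the identical firms with a fixed output, the one-at-a-time entry rule, the tolerance, and every number are assumptions introduced here, not part of the source.

```python
def profit_per_firm(n_firms, demand_intercept, demand_slope,
                    output_per_firm, unit_cost, fixed_cost):
    """Profit of one firm when n identical firms each sell a fixed output.

    Assumption (hypothetical): price falls linearly with total market output.
    """
    total_output = n_firms * output_per_firm
    price = max(demand_intercept - demand_slope * total_output, 0.0)
    return (price - unit_cost) * output_per_firm - fixed_cost


def free_entry_path(max_firms=10_000, **market):
    """Let firms enter one at a time while the next entrant would not lose money."""
    n = 1
    path = [profit_per_firm(n, **market)]
    while n < max_firms and profit_per_firm(n + 1, **market) >= 0:
        n += 1
        path.append(profit_per_firm(n, **market))
    return n, path


def model_fits(predicted, observed, tolerance):
    """Rough goodness of fit: is the observed value close to the inferred one?"""
    return abs(predicted - observed) <= tolerance


# Hypothetical market parameters, chosen so the arithmetic is exact.
market = dict(demand_intercept=100.0, demand_slope=0.125,
              output_per_firm=1.0, unit_cost=20.0, fixed_cost=1.0)
n_firms, profit_path = free_entry_path(**market)
predicted_profit = profit_path[-1]

# A hypothetical observation of persistently high profits contradicts the model.
observed_profit = 15.0
keep_model = model_fits(predicted_profit, observed_profit, tolerance=1.0)
```

Under these assumptions entry drives profit per firm down toward zero, so a hypothetical observation of high profits makes `keep_model` false, which is the signal to revise the assumptions, for instance by allowing advertising or monopolization as the text suggests.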
4. Specifying Actor Assumptions and Structure Assumptions To give a logical portrait of the types of actors involved in a social process, the following kinds of questions must be answered:
(a) What do they want—what are their preferences or purposes?
(b) What do they know—what are their beliefs, their potential for changing them or for learning?
(c) What can they do—what are their capacities or skills?
Sometimes answers are only rudimentarily specified, e.g., in models of contagion where contracting a disease clearly is not something sought but something caught. Actor assumptions allow us to understand what actors can do, what they actually do and why, or how others react to them and why. A similar set of questions must be answered to specify assumptions about the structure:
(a) What alternatives or constraints do the actors face? These may span from opportunity for migration to household budgets. They include collective properties such as laws and norms and physical barriers such as roads or rivers.
(b) What social positions can the actors hold—e.g., organizational roles? What are the correlates of positions—e.g., their status or the hierarchy of wages linked to a given division of labor? Rewards, rights, and burdens are structural correlates of positions— they also define, fix, or freeze the alternatives actors face.
(c) What are the distributional characteristics of actors, as distinct from their individual manifestations? For example, the number of other actors may affect options as in perfect competition, the age distribution may affect outlooks for marriage. The sex ratio and the population pyramid are structural characteristics.
Specifications of the actors’ attributes are needed to understand what they do and why. Specifications of the structure—options and payoffs—within which they choose and act are needed to explain what happens and why. The two sets of assumptions are needed to construct the logical narrative of systemic effects.
5. Intentions vs. Effects A key point in analyzing social change is the nexus between intentions and effects. Purposive action may have unanticipated consequences (Merton [1936] 1976). That is why understanding the actors is not sufficient—explaining the interworkings of their actions beyond their intentions is necessary. The net outcome of rational choices may be counterintuitive. The way to hell may be paved with good intentions and simultaneous choices based on the best of motives may be counterproductive by some hidden hex. Private vices may give public benefits (Mandeville [1714] 1989), self-serving interest may yield benefits as if guided by an invisible hand. Sometimes actors are
caught by what they have wrought; sometimes they do not master the world they have made. Such unintended effects, counterproductive actions and counterintuitive results which flow from the actors’ choices, not their intentions, often produce engaging logical narratives.
6. Expected vs. Desired Consequences The distinction between actors and structure is useful also when what happens deviates from what was anticipated. In 1936 Robert K. Merton wrote a famous article on ‘The unanticipated consequences of purposive social action’ (Merton [1936] 1976). A key point is that the effects of actors’ choices may interact to produce effects that are no part of the intentions. This notion is very useful for describing models or mechanisms of social change. The simplest models of change are of course when purposive actors produce intended results. What appears is what was planned, what materializes is what was sought. However, sometimes what turns up is not what the actors had in mind. Sometimes their best intentions interact to produce self-defeating results, as when a stampede to get out blocks the exit. Sometimes selfish motives produce positive overall effects over and beyond their expectations, with improved welfare coming as a by-product, as if an invisible hand guides self-interest to efficient use of resources. As the argument runs in Adam Smith’s classic text:
He [every individual] generally, indeed, neither intends to promote the public interest, nor knows how much he is promoting it. By preferring the support of domestic to that of foreign industry, he intends only his own security; and by directing that industry in such a manner as its produce may be of the greatest value, he intends only his own gain, and he is in this, as in many other cases, led by an invisible hand to promote an end which was no part of his intention. Nor is it always the worse for the society that it was no part of it. By pursuing his own interest he frequently promotes that of the society more effectually than when he really intends to promote it. I have never known much good done by those who affected to trade for the public good. It is an affectation, indeed, not very common among merchants, and very few words need be employed in dissuading them from it.
(Smith [1776] 1937, Book 4, Chap. 2)
A key point for the analysis of social change is that what results, whether it appears as the compelling influence from a hidden hex or an invisible hand, flows from the actors’ choices, not from their motives. That is why understanding the actors is not sufficient—explanation of interworkings of their actions beyond their intention, expectation, or belief is necessary. Simultaneous choices based on the best of motives may be counterproductive. The net outcome of rational choices may be counterintuitive. Silly choices
Figure 1 Actions and consequences
may prove lucky strikes, bad motives may result in good fortunes. The way to hell may be paved with good intentions just as private vices may give public benefits (Mandeville [1714] 1989). In fact, we can classify types of explanations from the answers to two questions: were the consequences expected or not, and were they desired or not? If the results coincide with what was desired, the actors have produced the intended results. If the results were not expected, but when they appear are desired, we have what could be called ‘invisible hand’ explanations, since they are of the kind illustrated by the quote from Adam Smith above. The case where the consequences are expected, but actors do not like them when they turn up, can be termed suboptimality (cf. Elster 1978): the actors know that when they act in their own interest, they nevertheless collectively end up in an undesired state—the joint outcome of their choices makes them worse off. ‘The tragedy of the commons’ illustrates the logic. A ‘commons’ is any resource that can be used as though it belongs to all. However, when rational agents use it, such a shared resource is doomed, though this is no part of the users’ intent. Everyone gains from using it, none gains anything from not taking out what they can when others do. When utilized near carrying capacity, additional use will degrade its value to all. ‘Ruin is the destination toward which all men rush, each pursuing his own best interest in a society that believes in the freedom of the commons’ (Hardin 1968). The inexorable working out of the resource’s ruin is Hardin’s tragedy of the commons. Social change is the consequence of the sum of self-interested actions. And all could be better off by restraints on their own action. If such collective restraints are imposed by the actors themselves, this represents a change of political structure.
This latter logic and model of structural change was first suggested by Thomas Hobbes ([1651] 1982) who called the commons ‘the state of nature’ and ‘war of all against all,’ whereas the joint establishment of an authority to
keep all from self-defeating actions he called Leviathan (cf. Hernes 1999). The fourth case is where the actors misjudge what others will do. Intentions that could be realized under constant conditions are rendered impossible when everyone seeks to realize them and thereby changes the conditions. Hog cycles are an economic example: producers act on the same assumptions and overinvest or underinvest in concert—the results of their decisions appear too late to become premises for their decisions. Erosion is another example—the irreversible destruction of alternatives. With a term Elster (1978) borrows from Sartre, this could be called counterfinality. The four kinds of models are summarized in Fig. 1. Sometimes the actors get caught by what they have wrought; sometimes they do not master the world they have made. But independent actions can also turn out to be blessings in disguise.
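The two questions behind the classification (were the consequences expected, and were they desired?) define a two-by-two table, which can be written out as a small lookup. This is a hedged sketch: the names `Explanation` and `classify` are hypothetical, and placing counterfinality in the cell that is neither expected nor desired is an inference from the text rather than something the source states outright.

```python
from enum import Enum


class Explanation(Enum):
    INTENDED_RESULT = "intended result"
    INVISIBLE_HAND = "invisible hand"
    SUBOPTIMALITY = "suboptimality"
    COUNTERFINALITY = "counterfinality"


def classify(expected: bool, desired: bool) -> Explanation:
    """Map the answers to the two questions onto one of the four model types."""
    if expected and desired:
        return Explanation.INTENDED_RESULT
    if desired:
        # Not expected, but welcome once it appears.
        return Explanation.INVISIBLE_HAND
    if expected:
        # Foreseen, yet the joint outcome leaves the actors worse off.
        return Explanation.SUBOPTIMALITY
    # Neither foreseen nor wanted (assumed placement of the fourth case).
    return Explanation.COUNTERFINALITY


# Examples from the text, coded under the assumptions stated above.
smith_domestic_industry = classify(expected=False, desired=True)
tragedy_of_the_commons = classify(expected=True, desired=False)
hog_cycles = classify(expected=False, desired=False)
```

The coding of each example reflects how the text characterizes it: the commons users foresee the ruin they bring about, whereas hog producers misjudge what the others will do.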
7. Models vs. Types of Social Changes Some key aspects of models of social change have been identified. A question of equal import is whether it is possible to identify basic types of social change. To answer this question requires a separate analysis. See also: Action, Collective; Action, Theories of Social; Durkheim, Emile (1858–1917); Marx, Karl (1818–89); Organizations: Unintended Consequences; Social Change: Types; Sociology: Overview; Structuralism, Theories of; Structure: Social; Theory: Sociological; Weber, Max (1864–1920)
Bibliography
Boudon R 1971 The Uses of Structuralism. Heinemann, London
Coleman J 1964 Introduction to Mathematical Sociology. The Free Press, Glencoe, IL
Durkheim E [1897] 1951 Suicide. The Free Press, Glencoe, IL
Elster J 1971 Nytt syn på økonomisk historie. Pax, Oslo
Elster J 1978 Logic and Society. Contradictions and Possible Worlds. Wiley, Chichester, UK
Hardin G 1968 The tragedy of the commons. Science 162: 1243–48
Hernes G 1976 Structural change in social processes. American Journal of Sociology 38(3): 513–46
Hernes G 1989 The logic of ‘The Protestant Ethic’. Rationality and Society 1: 123–62
Hernes G 1999 Real virtuality. In: Hedström P, Swedberg R (eds.) Social Mechanisms: An Analytic Approach to Social Theory. Cambridge University Press, Cambridge, UK, pp. 74–101
Hobbes T [1651] 1982 Leviathan. Penguin, London
Mandeville B [1714] 1989 The Fable of the Bees. Penguin, London
Merton R K [1936] 1976 The unanticipated consequences of purposive social action. In: Merton R K (ed.) Sociological Ambivalence. The Free Press, Glencoe, IL
Sharp L 1952 Steel axes for Stone Age Australians. Human Organization 2: 17–22
Smith A [1776] 1937 The Wealth of Nations. The Modern Library, New York
G. Hernes
Social Class and Gender Both ‘class’ and ‘gender’ are contested concepts. Debates around gender and class have focused on employment-based approaches to ‘class analysis.’ It is argued that these have reached an impasse. It is, however, still of importance to retain a focus on the processes of gender and class structuring.
1. Concepts and Definitions Three different meanings (or dimensions) of the class concept may be identified: (a) ‘Class’ as prestige, status, culture, or ‘lifestyles.’ (b) ‘Class’ as structured inequality (related to the possession of economic and power resources). (c) ‘Classes’ as actual or potential social and political actors. The use of ‘class’—often as an adjective—to describe prestige or lifestyle is probably the most common use of the concept in everyday speech. Social scientists have also carried out descriptive investigations of social differentiation (e.g., Warner 1963), and market research companies have devised occupational classifications in order to give an indication of lifestyles and consumption patterns. Following Weber, many sociologists would insist on the distinction between this use of ‘class’ to indicate prestige, or consumption patterns, and the use of ‘class’ in the second sense above, that is to describe
groups with differential access to economic and power resources (these could be the ownership of capital or productive resources, skills and qualifications, networks of contacts, etc.) that generate structures of material inequality. Such unequally rewarded groups are often described as ‘classes.’ The identification of these groups summarizes the outcome, in material terms, of the competition for resources in capitalist market societies. The term ‘class,’ however, has not just been used to describe levels of material inequalities or social prestige. ‘Classes’ have also been identified as actual or potential social forces, which have the capacity to transform society. Marx considered the struggle between classes to be the major motive force in human history. From the French Revolution and before, ‘classes’—particularly the lower classes—have been regarded as a possible threat to the established order. Thus ‘class’ is also a term with significant political overtones. The use of the single word ‘class,’ therefore, may describe hierarchical rankings of social prestige or lifestyles, patterns of material inequalities, as well as revolutionary or conservative social actors. It is a concept that is not the particular preserve of any individual branch of social science—unlike, for example, the concept of marginal utility in economics. It is not possible to identify a ‘correct’ sociological perspective, or an agreed use of the term. In contrast to ‘class,’ the concept of gender might seem to be unproblematic, given its overt physical manifestations in all spheres of human life. However, its meaning has also been disputed. It is commonplace to draw a distinction between sex and gender, where sex denotes anatomical and physiological differences, and gender the social construction of masculinity and femininity. This is not universal (i.e., there is no ‘essence’ of masculinity or femininity), but created and interpreted differently by different cultures. 
However, recent theoretical developments within poststructuralist and postmodernist feminism have argued that sex itself is a social construct rather than biologically given, and the categories ‘men’ and ‘women’ discursively constructed through ‘performance.’ Thus, ‘gender’ has become overlaid with ‘sexuality.’ This has resulted in a focus on the sexual identity of the individual, and the discourses through which these identities are constructed, rather than on gender as a set of social relations. However, an emphasis on the discursive construction of gender (or sex) effectively detracts from the investigation of the systematic structuring of inequality between men and women (through, for example, the gender division of labor). The question of sexuality has a focus on the individual and their associated identity—albeit an individual that is constructed, in process, undetermined, etc.—whereas gender is inevitably a relational concept. However, although gender, like class, is a social relationship of power, there is a crucial difference between the two phenomena. Were private property to be abolished overnight, and class differentiation processes outlawed, there would be no ‘classes.’ However, were the differentiation processes of ‘gender’ as we know it to cease tomorrow, approximately half of the human population would nevertheless continue to have breasts and wombs. To recognize the biological roots of gender, however, is not to collapse into a biological determinism or essentialism. While gender differentiation is universal and persistent, what constitutes this concrete ‘difference’ will be variable in its nature and extent, and subject to constant construction and reconstruction.
2. Approaches to Class Analysis The three different meanings of the class concept identified above are widely employed in a range of different approaches to the analysis of class and stratification. These approaches may be grouped within four areas (Crompton 1993, 1998): (a) The study of the processes of the emergence and perpetuation of advantaged and disadvantaged groups or ‘classes’ within society. This may be described as ‘class analysis’ as a ‘dependent variable’ enterprise, in which the ‘class’ (or classes) itself is the object of study. All three meanings of ‘class’ may be employed in this approach, and indeed, these different dimensions are often seen as being intertwined with each other—as in, for example, E. P. Thompson’s The Making of the English Working Class (1968). (b) The study of the concrete consequences (or outcomes) of class location. This may be described as ‘class’ as an ‘independent variable’ specialty, in which ‘class’ is the constant factor and its contribution to other factors—e.g. voting behavior—is that which has to be explained (Wright 1997). Occupational ‘class’ may be related to consumption, inequalities, and political preferences, thus all three meanings of ‘class’ may be drawn upon in empirical investigations. However, the methodology of ‘employment aggregate’ class analysis means that these different dimensions of class have to be treated as discrete variables. Thus, for example, its practitioners argue that it is important to establish that occupational classifications do not correspond to a ‘status’ scale. In practice, however, economic and cultural dimensions are closely intertwined (particularly in the case of gender), and this separation is problematic. (c) The discussion of the significance of ‘class,’ and class processes, for macro theories of societal change and development. These discussions mainly relate to the third meaning of class identified above. 
In Marx’s work, ‘classes’ are seen as having the causal capacity to shape societies and generate social change (Sayer and Walker 1992; it should be noted that these claims are
not only made by Marxists). Much of the debate with ‘reflexive’ or ‘post’ modernist critics of ‘class analysis’ has centered on this issue, i.e., whether, with increasing ‘individuation,’ ‘classes’ are still relevant social forces (Pakulski and Waters 1996, Clark and Lipset 1991). These preoccupations lead directly into a fourth approach to ‘class’ within sociology: (d) The development of class and status cultures and identities. These discussions have tended to have a major focus on the first meaning of class identified above. However, Bourdieu’s (1973) influential work combines an analysis of both class cultures and class inequalities, and focuses on both the processes of class formation as well as their outcomes.
3. Processes and Outcomes The notion of class processes, therefore, may be used as a general term to identify the patterns of change, tensions, and struggle that generate the differentiated groupings that we call ‘classes’—although, as we have stressed above, there would be no general agreement on the composition of these ‘classes.’ We have to distinguish between the processes of the formation and emergence of advantaged and disadvantaged, more powerful and less powerful, groups (or ‘classes’) in society, and the concrete consequences or outcomes of these processes. It has been suggested that this distinction broadly corresponds to two rather different approaches to ‘class analysis’ ((a) and (b) above). In a similar vein, it may be argued that it is important to distinguish between the processes of differentiation between men and women—that is, the construction of gender as a social relation—and the concrete consequences of this construction, that is, measurable differences between the sexes when taken in aggregate. Thus measurable ‘sex differences’ may be seen as outcomes of gender differentiation processes.
4. Measuring Social Class The occupational structure is often used to generate ‘class’ groupings. This approach to the measurement of class may be described as the ‘employment aggregate’ approach, and is so widespread that the occupational structure and the class structure are frequently referred to as if they were synonymous. From the 1970s leading practitioners of ‘class analysis’ argued that the classifications used by leading researchers of the time—(e.g., Blau and Duncan 1967)—did not measure ‘class’ in a truly sociological sense. Albeit from rather different theoretical perspectives, both Wright and Goldthorpe argued that the occupational measures previously used had in fact been hierarchical (‘gradational’) status scales rather than ‘relational’ class schemes. Thus throughout the
1980s, two major cross-national projects, both of which developed their own employment-based class schemes, were established. The International Class Project, directed by Wright (1985, 1997), was explicitly Marxist in its inspiration and the scheme(s) he devised classified jobs according to a Marxist analysis of relations of domination and exploitation in production. The Comparative Analysis of Social Mobility in Industrial Societies (CASMIN) project used an occupational classification initially derived from Goldthorpe’s study of social mobility. Even in these improved versions, however, there are a number of problems with the assumption that the job or occupational structure may be divided so as to generate ‘classes.’ The occupational structure does not give any indication of wealth or property holdings. Not all adult persons have an occupation and there are therefore problems in allocating an individual ‘class situation’; this has been particularly problematic in relation to the ‘economically inactive,’ such as housewives. The occupational structure also bears the imprint of other major stratifying factors (notably gender, race, and age), which are difficult to disentangle from those of ‘class.’ We will expand our discussion of feminist criticisms of ‘class analysis’ in the next section, but for the moment, it is important to reiterate the point that occupational class schemes are only a proxy for ‘classes.’ Occupation gives a reasonable indication of ‘life chances,’ but no accurate measure of property or wealth. Employment aggregates do not correspond to politically conscious collectivities. Rather, sociological class schemes may be seen as an approximate measure (probably the best we have) of the outcomes of the processes of class differentiation. Wright and Goldthorpe take different positions regarding the identification of these processes. Wright’s location of jobs into particular classes is guided by Marxist class theory.
However, Goldthorpe emphasizes that no particular theory has contributed to his class scheme which must be judged by its empirical adequacy—‘it is consequences, not antecedents, that matter.’ Nevertheless, the schemes intercorrelate quite highly and indeed Wright suggests that his scheme can be interpreted ‘in a Weberian or hybrid manner.’
5. Feminist Criticisms of ‘Class Analysis’ In contrast to both Goldthorpe and Wright, feminists have argued that the processes of class formation and emergence—the divisions of capital and labor which led to the development of the bourgeoisie and mass proletariat—were intimately bound up with parallel processes of gender differentiation (e.g., Bradley 1989). They have described the processes whereby the sexual division of labor which culminated in that stage of modern capitalism we may loosely describe as ‘Fordist’
was characterized by the ‘male breadwinner’ model of the division of labor, around which ‘masculine’ and ‘feminine’ gender blocs were crystallized. Thus feminist criticisms of employment aggregate class analysis derive from the observation that, because the primacy of women’s family responsibilities has been explicitly or implicitly treated as ‘natural’ and men have been dominant in the employment sphere, this approach has effectively excluded women from any systematic consideration in ‘class analysis.’ It is argued: (a) That the primary focus on paid employment does not take into account the unpaid domestic labor of women. Thus, women’s contribution to production is not examined or analyzed (as in, for example, the debate around ‘domestic labor,’ see Molyneux 1979). (b) The expansion of married women’s paid employment has, apparently, rendered problematic the practice of taking the ‘male breadwinner’s’ occupation as a proxy for the ‘class situation’ of the household. (c) That gendered occupational segregation makes it difficult to construct universalistic ‘class schemes’ (that is, classifications equally applicable to men and women). The crowding of women into lower-level occupations, as well as the stereotypical or cultural ‘gendering’ of particular occupations (such as nursing, for example), gives very different ‘class structures’ for men and women when the same scheme is applied to both. Even more problematic is the fact that the same occupation (or ‘class situation’) may be associated with very different ‘life chances’ as far as men and women are concerned (for example, clerical work). This third criticism emphasizes the de facto intertwining of class and gender within the employment structure (Crompton and Mann 1994). 
As noted above, the occupational structure emerging in many industrial societies in the nineteenth and early twentieth centuries was grounded in a division of labor in which women took the primary responsibility for domestic work, while male ‘breadwinners’ specialized in market work. Today this is changing in that married women have taken up market work, and this has had very important consequences for employment aggregate class analysis.
5.1 Response of Leading Practitioners When Goldthorpe’s major national investigation of the British class structure was published (1980/1987), it was subjected to extensive criticism on the grounds that it focused entirely on men, women only being included as wives. However, Goldthorpe argued that as the family is the unit of ‘class analysis,’ then the ‘class position’ of the family can be taken to be that of the head of the household, who will usually be a male. Thus were the assumptions of the male breadwinner model incorporated into sociological class analysis in Britain. To incorporate women’s employment into ‘class analysis’ on the same terms as men’s, he argued, would lead to confusion. Many women work in lower-level white-collar jobs (‘Intermediate’ class locations in Goldthorpe’s original class scheme), and to include their occupations would result in the generation of ‘excessive’ amounts of spurious social mobility. However, he subsequently modified his original position in adopting, with Erikson, a ‘dominance’ strategy, in which the class position of the household is taken as that of the ‘dominant’ occupation in material terms, whether a man or a woman holds this occupation. Furthermore, although Erikson and Goldthorpe (1992) still insist that the unit of class analysis is the household, their class scheme has been modified in its application to women as individuals. Class IIIb (routine nonmanual) has been categorized as ‘intermediate’ for men, but ‘labor contract’ for women. Besides these modifications of his original approach, Goldthorpe has demonstrated that as measured by the Goldthorpe class scheme, the pattern of women’s relative rates of social mobility closely parallels that of men. Thus although, because of the effects of occupational segregation, the distribution of men and women within occupational ‘class’ schemes is different, when considered separately, the impact of ‘class’ on mobility chances is similar for men and women. Wright’s analysis generally takes the individual, rather than the household, to be the unit of class analysis. However, in respect of economically inactive housewives, Wright (1997) employs a similar strategy to that of Goldthorpe. He introduces the notion of a ‘derived’ class location, which provides a ‘mediated’ linkage to the class structure via the class location of others. Wright is sensitive to the issue of gender, and the fact that gender is a major sorting mechanism within the occupational structure as well as reciprocally interacting with class.
Nevertheless, he argues that while gender is indeed highly relevant for understanding and explaining the concrete lived experiences of people, it does not follow that gender should be incorporated into the abstract concept of ‘class.’ Thus in his empirical work, ‘class’ and ‘gender’ are maintained as separate factors. We can see, therefore, that in responding to feminist criticisms, both Goldthorpe and Wright insist that class and gender should be considered as distinct causal processes. This analytical separation of class and gender may be seen as part of a more general strategy within the employment aggregate approach in which the continuing relevance of ‘class’ is demonstrated by the empirical evidence of ‘class effects’ (Goldthorpe and Marshall 1992). Although, therefore, Goldthorpe and Wright have apparently developed very different approaches to ‘class analysis,’ their underlying approach to the articulation of gender with class is in fact the same. It may be suggested that this stems from the similarity of the empirical techniques used by the CASMIN and International Class Projects: that is, the large-scale, cross-nationally comparative, sample survey. This kind of research proceeds by isolating a particular variable (in this case, employment class) and measuring its effects. Feminist criticisms, therefore, were important in making explicit the fact that the ‘employment aggregate’ approach within ‘class analysis’ is largely concerned with the outcomes of employment structuring (via its analysis of occupational aggregates or ‘classes’), rather than the processes of this structuring. To paraphrase Goldthorpe and Marshall, employment aggregate ‘class analysis’ now appears as a rather less ambitious project than it once appeared to be. In a parallel fashion, it may be suggested that a major weakness of Wright’s class project (not specific to the gender question) is that the linkage between Marxist theory and his ‘class’ categories has not been successfully achieved. Although important issues have been clarified, therefore, this debate can go no further, and Wright argues that it is necessary to get on with ‘… the messy business of empirically examining the way class and gender intersect.’ While one may be in broad agreement with this sentiment, it may be noted that Wright’s preferred methodology focuses only on the association between job categories and biological sex. However, developments within the employment structures of the advanced service economies suggest the need for an approach that recognizes the complex processes of the structuring of both class and gender.
6. The Restructuring of Gender and Class: the Shift to Services In today’s advanced service economies, employment is still a significant, although not the only, determinant of ‘life chances’ and access to them. The last 20 years, however, have seen two major, and related, developments. These are, first, the shift to service work, and second, the growth and development of women’s employment. The growth of male-dominated industrial employment was paralleled by the increasing development of social protections associated with the welfare state. Welfare state institutions, such as occupational pensions and other social benefits, were in many countries explicitly created on the assumption that the male breadwinner model was the norm, and women received social benefits via their ‘breadwinner.’ In countries without extensive state welfare protections, such as the US, there has been an extensive development of marketized personal services. The expansion of the welfare state (as well as the expansion of other services such as education and health) was a major source of employment growth for women. Similarly, marketized personal services are also female dominated. Thus the basis of the male breadwinner model was being eroded even as its principles were
being consolidated in national institutions and policies. Other factors were also leading to the growth of women’s employment, including ‘push’ factors such as rising levels of education, effective fertility controls, and the growth of ‘second wave’ feminism, as well as ‘pull’ factors including the buoyant labor markets of the 1950s and 1960s. Service expansion was fueled not only by state expenditure and job creation, but also the growth of financial, leisure, and business services. In the West, much of the growth of women’s employment in the decades after World War II was in low-level and subordinate jobs. This reflected not only the directly discriminatory employment practices that were a major target of feminist action and protest in the 1960s and 1970s, but also the relatively low level of formal employment experience among women entering the labor force at this time. However, over the last 10 years, women have been steadily increasing their representation in managerial and professional occupations.
7. The Growth of Women in Management and the Professions In all Western countries, the proportion of women in professional and managerial occupations is increasing at a faster rate than the proportion of women in the labor force as a whole. The level of increase of women’s representation in some heavily feminized professions, such as teaching, is relatively low, suggesting that these professions might be nearing female saturation. In other professions such as engineering, the increase of female representation is again only modest, but in these instances, from a low base. Well above average rates of increase in the numbers and proportions of women are to be found in public service, financial and legal professions, in ‘people centered’ management such as personnel, marketing, and advertising, and above average rates in previously male-dominated professions such as medicine. These trends support Esping-Andersen’s (1993) assertion that the postindustrial (or service) job hierarchy is becoming increasingly feminized. The feminization of the middle classes would appear to be a universal trend. This means that growing numbers of women are earning sufficient amounts to enable them to support a household. There has been an increase in the proportion of ‘dual breadwinner’ households. One outcome of this trend has been growing social polarization at the level of the household, as the gap between two-income and one-income households widens. Occupational polarization is also occurring with the growth of a new servant class. Another major outcome of the increase of women in managerial and professional occupations has been a decline in fertility levels, particularly among well-educated women. These demographic trends serve to illustrate, in a rather dramatic fashion, the de facto intertwining of employment and domestic life, which was one of the major points emphasized by ‘second-wave’ feminism.
8. Discussion Changes in the gender division of labor are leading to increasing tensions between the demands of employment and caring responsibilities (Hochschild 1997), and some have argued that gender conflicts are replacing class conflicts (Beck and Beck-Gernsheim 1995). We have suggested, however, that rather than viewing ‘gender’ and ‘class’ as alternative causal processes, we should focus on their interaction. With the mass entry of women into the labor force, it is increasingly problematic to regard paid and unpaid labor as residing in ‘separate spheres,’ as class analysis conventionally has done. Demands originating from the domestic realm, for example for time to carry out caring responsibilities or unpaid labor (both men and women may make these demands), might potentially have a substantial impact on the structuring of paid employment and thus on a significant aspect of class relations. Thus although debates between feminist critics and ‘employment aggregate’ practitioners may have ended in stalemate, the actual need for a refocus on the interaction of gender and class processes is, arguably, greater than ever. See also: Class Consciousness; Class: Social; Feminist Theory: Marxist and Socialist; Gender and Feminist Studies in Sociology; Gender, Class, Race, and Ethnicity, Social Construction of; Labor Movements and Gender; Right-wing Movements in the United States: Women and Gender; Sex Differences in Pay; Sex Segregation at Work; Social Movements and Gender
Bibliography
Beck U, Beck-Gernsheim E 1995 The Normal Chaos of Love. Polity Press, Cambridge, UK
Blau P, Duncan O D 1967 The American Occupational Structure. John Wiley, New York
Bourdieu P 1973 Distinction: A Social Critique of the Judgement of Taste. Routledge, London/New York
Bradley H 1989 Men’s Work, Women’s Work. Polity Press, Cambridge, UK
Clark T, Lipset S M 1991 Are social classes dying? International Sociology 6(4): 397–410
Crompton R 1993/1998 Class and Stratification. Polity Press, Cambridge, UK
Crompton R, Mann M (eds.) 1986/1996 Gender and Stratification. Polity Press, Cambridge, UK
Erikson R, Goldthorpe J H 1992 The Constant Flux. Clarendon Press, Oxford, UK
Esping-Andersen G (ed.) 1993 Changing Classes: Stratification and Mobility in Post-Industrial Societies. Sage, London
Goldthorpe J H 1980/1987 Social Mobility and Class Structure in Modern Britain. Clarendon Press, Oxford, UK
Goldthorpe J H, Marshall G 1992 The promising future of class analysis. Sociology 26(3): 381–400
Hochschild A R 1997 The Time Bind. Metropolitan Books, New York
Molyneux M 1979 Beyond the domestic labour debate. New Left Review 116
Pakulski J, Waters M 1996 The reshaping and dissolution of social class in advanced society. Theory and Society 25: 667–91
Sayer A, Walker R 1992 The New Social Economy. Blackwell Publishers, Oxford, UK
Thompson E P 1968 The Making of the English Working Class. Penguin, Harmondsworth, UK
Warner L 1963 Yankee City. Yale University Press, New Haven, CT
Wright E O 1985 Classes. Verso, London
Wright E O 1997 Class Counts. Cambridge University Press, Cambridge, UK
R. Crompton
Social Cognition and Affect, Psychology of The role of affect in social cognition has fascinated writers and philosophers since the dawn of antiquity. Starting with Plato, many theorists saw affect as a potentially dangerous, invasive force that subverts rational thinking. However, recent advances in social cognition, neuroanatomy, and psychophysiology suggest that affect is often a useful and even essential component of adaptive responding to social situations. The 1980s and 1990s produced something like an ‘affective revolution’ in psychology. Indeed, most of what we know about the role of affect in social thinking has been discovered since the early 1980s. As a result, we are now in a better position than ever before to pursue the age-old quest to understand the relationship between the rational and the emotional aspects of human nature. This article will consider early research on affect and social thinking, followed by a survey of contemporary theories and substantive research areas. The behavioral consequences of affect, and integrative theories of affect and social cognition, will also be discussed.
1. Background Although affect has long been recognized as having a critical influence on thinking and judgments, empirical research on this link is quite recent. Psychologists traditionally studied the three fundamental faculties of the human mind (affect, cognition, and conation) as
independent, isolated entities. Radical behaviorism consigned affect to the ‘black box’ of irrelevant mental phenomena, and the objective of most cognitive research was the study of cold, affectless ideation. The idea that affect is an integral and often adaptive part of how information about the social world is processed was not seriously entertained until the early 1980s. In a radical departure, Robert Zajonc argued that affective reactions often constitute the primary response to social stimuli and may influence subsequent attitudes and behaviors even in the absence of any cognitive memory trace (e.g., Zajonc 2000). Affective reactions such as feelings of anxiety, intimacy, or pleasure are also critical to cognitive representations of social experiences. More importantly, affect has a dynamic influence on memory, thinking, and judgments. Such ‘affect infusion’ was explained initially in terms of either psychodynamic or associationist principles. Psychoanalytic theories suggested that affect has an invasive quality and can ‘take over’ thinking and judgments unless adequate psychological control is exerted. Attempts to suppress affect may increase the ‘pressure’ and could cause affect infusion into unrelated thoughts and judgments. In contrast, conditioning theories maintained that reactions to previously neutral social objects can be conditioned by simultaneous exposure to emotion-arousing stimuli. Consistent with this idea, investigators found that people who feel good or bad report significantly more positive or negative attitudes, respectively, towards social stimuli. Byrne and Clore (1970) relied on this principle to explain the influence of affect on interpersonal attitudes and attraction.
2. Contemporary Cognitive Theories of Affect and Social Cognition In contrast to psychoanalytic and conditioning accounts, contemporary cognitive theories focus on the information-processing mechanisms that link affect and cognition. Two kinds of mechanisms have been proposed: memory-based accounts (e.g., the affect-priming model; see Bower 1981, Forgas 1995), and inferential models (e.g., the affect-as-information model; see Clore et al. 1994). In addition, several theories highlight the influence of affect on information-processing strategies (Bless 2000, Fiedler 2000).
2.1 Memory Models Affect may influence access to positively or negatively valenced ideas, producing more affect-congruent thoughts and judgments by happy or sad people. The associative network model proposed by Bower (1981) suggested that affect and cognition are integrally linked within an associative network of mental representations. Affect will prime associated thoughts and ideas that are more likely to be used in constructive cognitive tasks. However, affect priming is subject to important boundary conditions. Mood-state dependence in memory is more likely when the affective state is strong, salient, and self-relevant, and the task involves the active generation and elaboration of information (Eich and Macauley 2000).
2.2 Inferential Models Alternatively, rather than computing a judgment on the basis of recalled features of a target, individuals may simply ask themselves, ‘How do I feel about it?’ and in doing so, may use their mood to infer a reaction to the target. The ‘how-do-I-feel-about-it’ heuristic thus involves an inferential error: People misattribute their prevailing affect to an incorrect source. This theory is similar to conditioning approaches (Byrne and Clore 1970), as well as more recent work on misattribution, and research on judgmental heuristics. Accumulating evidence suggests that affect is most likely to be used as a heuristic cue when people are unfamiliar with the task, have no prior responses to fall back on, involvement is low, and cognitive resources are limited (Clore et al. 1994, Forgas 1995). Calling attention to the source of affect can reduce or even eliminate this effect. The affect-as-information model assumes that affect has invariant informational properties. Critics argue, however, that the informational value of affective states is not given but depends on the situation. The affect-as-information model has little to say about how affect and the stimulus information, and internal knowledge structures, are combined in producing a response. In that sense, this is more a theory of nonjudgment or aborted judgment rather than a theory of social judgment.
Social cognitive tasks that require some degree of processing are more likely to be influenced by affect-priming rather than affect-as-information mechanisms.
2.3 Affect and Information Processing In addition to informational effects (influencing what people think), affect can also influence the process of cognition, that is, how people think. It has been suggested that positive affect promotes less effortful and more superficial processing strategies, while negative affect triggers a more systematic and vigilant processing style. However, recent studies show that positive affect can also produce processing advantages. Happy people seem to adopt more creative and constructive thinking styles, use broader categories, show greater mental flexibility and perform better on secondary tasks (Bless 2000, Fiedler 2000). Explanations of mood effects on processing style at first emphasized the influence of affect on processing capacity. However, evidence for such accounts remains inconsistent. Alternative, motivational accounts suggest that people experiencing positive mood may try to maintain this pleasant state by refraining from any effortful activity. In contrast, negative affect may motivate people to engage in vigilant, effortful processing. Such a ‘cognitive tuning’ account seems consistent with evolutionary ideas about the adaptive functions of affect. More recently, Bless (2000) and Fiedler (2000) argued that positive and negative affective states trigger equally effortful, but qualitatively different processing styles. Thus, positive affect promotes a more assimilative, schema-based, top-down processing style, where pre-existing ideas, attitudes and representations dominate information processing. In contrast, negative affect produces a more accommodative, bottom-up and externally focused processing strategy where attention to situational information drives thinking.
3. The Empirical Evidence A growing number of empirical studies illustrate the role of affect in thinking and judgments in such areas as the perception of self and other people, the use of stereotypes, intergroup attitudes, attitude change, and persuasion.
3.1 Affect and Social Perception Affect has a significant influence on how people respond to others (Forgas 2001). According to memory-based theories, affect infusion into social judgments is due to greater availability of affectively primed information when interpreting ambiguous social behaviors. Thus, a smile that may be seen as ‘friendly’ in a good mood could be judged as ‘awkward’ or ‘condescending’ by an observer experiencing negative affect. Surprisingly, these mood effects seem to be greater when people need to think more extensively about a more complex or unusual person, as confirmed by studies measuring reaction times and processing latencies. The more ambiguous the target, the more people will need to rely on their stored knowledge about the world to make sense of these stimuli. Such affect infusion effects also occur in realistic interpersonal judgments when people evaluate their real-life interpersonal relationships (Forgas 1995, 2001).
3.2 Affect and the Self Affect has a particularly strong influence on self-related attitudes and judgments (Sedikides 1995): Positive affect improves, and negative affect impairs, the valence of self-conceptions. Thus, students in a negative mood tend to blame themselves more for failing, and those in a positive mood claim more credit for success in an exam (Forgas 1995). More recent studies indicate that affect congruence in self-related thinking is subject to a number of boundary conditions. Affect infusion into self-judgments is more likely when the task requires open, constructive thinking. Compared with peripheral self-conceptions, central and familiar self-conceptions that require less elaboration also show reduced affect sensitivity (Sedikides 1995). Low self-esteem persons generally also have less certain and stable self-conceptions, and affect appears to have a greater influence on their self-judgments. Affect intensity and motivational factors also moderate mood congruency effects. Overall, affect seems to have a strong mood-congruent influence on many self-related attitudes and judgments, but only when some degree of open and constructive processing is required, and there are no motivational forces to override affect congruence. Positive affect may also serve as a resource that allows people to overcome defensiveness and deal with potentially threatening information about themselves. People in a positive mood appear to be more willing to expose themselves voluntarily to threatening but diagnostic information. It seems that positive mood functions as a buffer, enabling people to handle the affective costs of receiving negative information, as long as the negative feedback is seen as useful (Trope et al. 2001).
3.3 Affect and Stereotyping Early theories based on the frustration–aggression hypothesis and psychoanalytic notions of projection implied that negative affect may directly produce negative responses to disliked groups. Conditioning theories offered an alternative explanation of such effects, and also suggested that increased contact with outgroup members in positive situations should reduce aversive feelings and improve intergroup relations, a prediction that received considerable support (the so-called ‘contact hypothesis’).
Positive affect may also promote more inclusive cognitive categorizations of social groups, thus reducing intergroup distinctions (Bodenhausen and Moreno 2001). Individual differences such as trait anxiety can also significantly moderate the influence of negative affect on intergroup judgments; paradoxically, highly anxious people seem to show reduced intergroup discrimination when in an aversive mood. Affect is thus likely to influence intergroup judgments both by influencing the information processing strategies adopted, and the way positive and negative information about outgroups is selected and used.
3.4 Affect in Persuasion and Attitude Change Writers on rhetoric have long assumed that producing an emotional response in an audience is an important prerequisite for effective persuasion. Recent dual-process theories of persuasion suggest that response to
a persuasive message depends on the particular information-processing strategy people use, partly influenced by mood (Petty et al. 2001). For example, college students are significantly more likely to agree with persuasive messages advocating comprehensive exams when they are interviewed on a pleasant, sunny day rather than an unpleasant, rainy day. In general, affect should selectively prime mood-congruent thoughts and should influence responses to persuasion when a systematic, central-route strategy is adopted. However, people in a positive mood often process persuasive messages more superficially, and respond to both low-quality and high-quality messages the same way (Petty et al. 2001). Sad persons, in contrast, seem to be more persuaded by high-quality messages, consistent with their more externally focused and bottom-up information-processing style (Bless 2000).
3.5 Affect and Cognitive Dissonance The experience of dissonance between attitudes and behaviors is a potent mechanism that produces attitude change. Cognitive dissonance involves negative affect because discrepancy among cognitions undermines our clear and certain knowledge about the world, and thus our ability to engage in effective action (Harmon-Jones 2001). Studies suggest that positive affect decreases, and negative affect increases, dissonance reduction and attitude change. Once consonance is restored, affective state also tends to improve. However, much work remains to be done in discovering the precise cognitive mechanisms responsible for these effects.
3.6 Affect, Cognition and Social Behavior Affect can influence not only social cognition but also subsequent social behaviors. People need to evaluate and plan their behaviors in inherently complex and uncertain social situations. Positive affect may thus prime positive information and produce more confident, friendly, and cooperative ‘approach’ attitudes and behaviors, whereas negative affect should prime negative memories and produce avoidant, defensive, or unfriendly attitudes and behaviors. Research shows that people in a happy mood respond more positively when approached with an impromptu request, and produce more confident and more direct requests themselves in a variety of social situations. These effects also influence real-life interpersonal behaviors. Further, affect infusion into attitudes and behaviors is greater when more elaborate, substantive processing is required to produce a strategic response in more complex social situations. Affective states also play an important role in elaborately planned interpersonal encounters such as the planning and performance of complex negotiating encounters (Forgas 2001). Why do these effects occur? When people face an
Social Cognition and Affect, Psychology of uncertain and unpredictable social encounter, such as a negotiating task, they need to rely on open, constructive processing in order to formulate new plans to guide their interpersonal behaviors. Affect can selectively prime more affect-congruent thoughts and associations, and these ideas will ultimately influence people’s plans and behaviors. However, affective influences on social behaviors are highly processdependent: affect infusion is increased or reduced depending on just how much open, constructive processing is required to deal with a more or less demanding interpersonal task. 3.7 Integratie Theories: The Affect Infusion Model Affective states thus have powerful informational and processing effects on the way people perceive, interpret, and represent social information. It is also clear, however, that affective influences on social cognition are highly context-sensitive. A recent integrative theory, the Affect Infusion Model (AIM; Forgas 1995) argued that affect infusion should only occur in circumstances that promote an open, constructive processing style. Constructive or generative processing may be defined here as genuinely open-ended, generative thinking that involves the selective search for, and on-line elaboration of, external information, and the constructive combination of stimulus details with internal knowledge structures that results in the creation of a new response. In contrast, merely reconstructive processing involves the retrieval and use of a previously computed response, with no online elaboration or stimulus details (Fiedler 2000). The AIM identifies four alternative processing strategies: direct access, motivated, heuristic, and substantive processing. These four strategies differ in terms of two basic dimensions: the degree of effort exerted, and the degree of openness or constructiveness of the information search strategy. 
The combination of these two processing features, quantity (effort) and quality (openness), produces four distinct processing styles: substantive processing (high effort/open, constructive), motivated processing (high effort/closed), heuristic processing (low effort/open, constructive), and direct access processing (low effort/closed). Direct access processing occurs when a pre-existing, stored response is directly accessed and used without any on-line elaboration. Motivated processing is driven by a strong pre-existing motivational objective that determines information search strategies and thus limits the likelihood that incidentally primed, affective information will be incorporated into the response. Heuristic processing, although a constructive process, is based on the limited consideration of information, and the use of mental shortcuts and heuristics (such as the affect-as-information heuristic) whenever possible. It is only substantive processing that involves the full and elaborate processing of stimulus information and the constructive integration of stimulus details with primed memories and associations. According to the AIM, mood congruence and affect infusion are most likely when constructive processing is used, such as substantive or heuristic processing. In contrast, affect is unlikely to influence the outcome of closed, merely reconstructive tasks involving motivated or direct access processing (see also Fiedler 2000).
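The two-dimensional taxonomy described above can be laid out as a small lookup table. The following Python sketch is only an illustration of that classification: the names `Strategy`, `classify`, and `affect_infusion_expected` are hypothetical, and reducing effort and openness to two yes/no values is a simplifying assumption made here, since the model itself speaks of degrees of effort and openness.

```python
from enum import Enum


class Strategy(Enum):
    """The four processing strategies named by the AIM."""
    DIRECT_ACCESS = "direct access"
    MOTIVATED = "motivated"
    HEURISTIC = "heuristic"
    SUBSTANTIVE = "substantive"


# Keys are (high_effort, constructive). Treating both dimensions as
# booleans is an assumption of this sketch, not part of the model.
_AIM_TABLE = {
    (True, True): Strategy.SUBSTANTIVE,     # high effort / open, constructive
    (True, False): Strategy.MOTIVATED,      # high effort / closed
    (False, True): Strategy.HEURISTIC,      # low effort / open, constructive
    (False, False): Strategy.DIRECT_ACCESS, # low effort / closed
}


def classify(high_effort: bool, constructive: bool) -> Strategy:
    """Return the strategy for a given effort/openness combination."""
    return _AIM_TABLE[(high_effort, constructive)]


def affect_infusion_expected(strategy: Strategy) -> bool:
    """Affect infusion is described as most likely under the two
    constructive strategies (substantive and heuristic)."""
    return strategy in (Strategy.SUBSTANTIVE, Strategy.HEURISTIC)


if __name__ == "__main__":
    for key, strategy in _AIM_TABLE.items():
        print(key, strategy.value, affect_infusion_expected(strategy))
```

In this layout the prediction about affect infusion depends only on the openness dimension, which mirrors the claim that closed, reconstructive strategies leave little room for mood-congruent material regardless of effort.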
4. Conclusion
The evidence shows that mild everyday affective states can have a highly significant influence on the way people perceive and interpret social situations. Further, several of the experiments discussed here show that different information-processing strategies play a key role in moderating these effects. Affect infusion influences not only attitudes and judgments but also interpersonal behaviors, such as the monitoring and interpretation of observed encounters; the formulation of, and responses to, requests; and the planning and execution of strategic negotiations. In contrast, affect infusion is absent whenever a social cognitive task could be performed using a simple, well-rehearsed direct access strategy, or a highly motivated strategy. Obviously a great deal more research is needed before we can fully understand the multiple influences that affect has on attitudes, judgments, and interpersonal behavior.
See also: Adulthood: Emotional Development; Automaticity of Action, Psychology of; Emotion in Cognition; Social Cognitive Theory and Clinical Psychology
Bibliography
Bless H 2000 The interplay of affect and cognition. In: Forgas J P (ed.) Feeling and Thinking: The Role of Affect in Social Cognition. Cambridge University Press, New York, pp. 201–22
Bodenhausen G V, Moreno K N 2001 How do I feel about them? The role of affective reactions in intergroup perception. In: Bless H, Forgas J P (eds.) The Role of Subjective States in Social Cognition and Behavior. Psychology Press, Philadelphia, pp. 283–303
Bower G H 1981 Mood and memory. American Psychologist 36: 129–48
Byrne D, Clore G L 1970 A reinforcement model of evaluation responses. Personality: An International Journal 1: 103–28
Clore G L, Schwarz N, Conway M 1994 Affective causes and consequences of social information processing. In: Wyer R S, Srull T K (eds.) Handbook of Social Cognition (2nd edn.). Erlbaum, Hillsdale, NJ, pp. 323–417
Eich E, Macauley D 2000 Fundamental factors in mood-dependent memory. In: Forgas J P (ed.) Feeling and Thinking: The Role of Affect in Social Cognition. Cambridge University Press, New York, pp. 109–30
Fiedler K 2000 Towards an integrative account of affect and cognition phenomena using the BIAS computer algorithm. In: Forgas J P (ed.) Feeling and Thinking: The Role of Affect in Social Cognition. Cambridge University Press, New York, pp. 223–52
Forgas J P 1995 Mood and judgment: The affect infusion model. Psychological Bulletin 117: 39–66
Forgas J P (ed.) 2000 Feeling and Thinking: The Role of Affect in Social Cognition. Cambridge University Press, New York
Forgas J P (ed.) 2001 The Handbook of Affect and Social Cognition. Erlbaum, Mahwah, NJ
Harmon-Jones E 2001 The role of affect in cognitive dissonance processes. In: Forgas J P (ed.) The Handbook of Affect and Social Cognition. Erlbaum, Mahwah, NJ, pp. 237–56
Petty R E, DeSteno D, Rucker D 2000 The role of affect in attitude change. In: Forgas J P (ed.) The Handbook of Affect and Social Cognition. Erlbaum, Mahwah, NJ
Sedikides C 1995 Central and peripheral self-conceptions are differentially influenced by mood: Tests of the differential sensitivity hypothesis. Journal of Personality and Social Psychology 69(4): 759–77
Trope Y, Ferguson M, Ragunathon R 2001 Mood as a resource in processing self-relevant information. In: Forgas J P (ed.) The Handbook of Affect and Social Cognition. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 257–89
Zajonc R B 2000 Feeling and thinking: Closing the debate over the independence of affect. In: Forgas J P (ed.) Feeling and Thinking: The Role of Affect in Social Cognition. Cambridge University Press, New York, pp. 31–58
J. P. Forgas
Social Cognition and Aging
There is a rich and abundant history of research in cognitive aging. However, studies typically have ignored the influence of social factors on cognition, with one exception: metacognition. More specifically, there has been considerable research on self and other beliefs about memory such as personal control and memory self-efficacy. In the 1990s, social cognition and aging research grew considerably in both breadth and scope. This movement reflects a broader interface between mainstream social cognition and cognitive aging to better understand how cognition in the aging adult operates in a social context. There are pervasive findings in the cognitive aging research that demonstrate age-related decline in cognitive performance such as working memory, abstract reasoning, and speed of processing. However, taking into consideration the pragmatics of cognition and social-contextual factors may help broaden our understanding of cognitive functioning in older adults beyond linear decline. Accordingly, a recent wave of studies has begun to examine social cognitive functioning in an effort to better understand the effects of cognitive aging and associated adaptive developmental processes. These studies raise important issues in relation to cognitive functioning such as age-related increases in experience, pragmatic knowledge, social expertise, beliefs about aging and social values, and social contextual variation. In addition, they highlight the importance of considering both the basic cognitive architecture of the aging adult and the functional architecture of everyday cognition in a social context. In the latter case, even if certain basic cognitive mechanisms have declined, older adults may possess appropriate knowledge and skills in a given context to function adequately through adaptation and compensation. Before we entertain these issues and findings in the extant literature, let us first consider a definition of the social cognitive perspective as it is applied to aging research.
1. Definition of Social Cognition
The basic goal of a social cognitive perspective is to understand how individuals make sense of themselves, others, and events in everyday life. These goals are reflected in the study of (a) cognitive mechanisms that may explain social cognitive processes (as illustrated above); (b) the content and structure of social knowledge; and (c) the influence of social situational and individual demands on social information processes and social knowledge. To illustrate these points within an adult development and aging perspective, first consider the well-documented finding that age-related decline in processing capacity correlates with age-related decline in cognitive performance. The question is whether age-related changes in working memory, attentional capacity, and speed of processing also relate to and/or explain age changes in social cognitive processes (Hess 1999, Schwarz et al. 1999). Researchers have suggested that age-related decline in processing capacity could explain the finding that older adults rely more on easily accessible trait-based information when making social judgments as opposed to more elaborative processing (Hess 1999). However, from the second goal perspective listed above, age-related change in content and structure of social knowledge, accessibility of this knowledge held in memory, and related motivation and goals of the individual in the context of concurrent processing resources may offer competing explanations of age differences in social reasoning. Finally, personal and situational factors associated with aging could also explain a reliance on easily accessible social knowledge and beliefs. For example, older adulthood has been associated with a decrease in personal characteristics such as tolerance for ambiguity and attitudinal flexibility (Blanchard-Fields 1999). It may be that such decreases in cognitive flexibility lead to social processing biases (Hess 1999).
Alternatively, older adulthood is associated with changing situational factors, which influence social goals such as an increase in the importance of emotional considerations. In this case such an increase in emotional salience influences how information is processed (see Turk-Charles and Carstensen 1999 for a review). Let us now turn to each of these goal perspectives in more detail.
2. Cognitive Mechanisms and Social Cognition
There is a history of social cognition research that posits information-processing models for making social judgments (e.g., Gilbert and Malone 1995). For example, Gilbert and Malone have shown that the ability to make nonbiased attributional judgments depends upon the cognitive demand accompanying those judgments. In this case individuals are more likely to rely on their initial dispositional attribution (attributing the cause of an outcome to personal characteristics of the main character involved in producing the outcome), ignoring compelling situational causes. Accordingly, in a number of studies, Blanchard-Fields and co-workers (see Blanchard-Fields 1999 for a review) have found that older adults consistently attributed the cause of problem dilemmas to dispositions about the primary protagonist more than younger age groups. However, this was only demonstrated when events reflected negative interpersonal content. It may be the case that older adults relied more heavily on easily accessible dispositional attributions given the demands on cognitive resources to deliberate about situational factors. However, keep in mind that these findings were content-specific (i.e., only found when situations were negative and interpersonally oriented). Hess and co-workers (see Hess 1999 for a review) have demonstrated that the use of diagnostic trait information (such as immoral behavior) in making initial impressions of an individual varied with age. They find age differences in the use of negative information in modifying impressions. When first presented with a positive portrayal of an individual (e.g., honest) followed by negative information (e.g., dishonest behavior) older adults were more willing to modify their impression from an initially positive to a more negative one. However, older adults did not modify their first impression when it was an initially negative portrayal followed by positive information.
Younger adults did not exhibit this pattern of impression change. Instead, they engaged in effortful processing of the text in their concern with the situational consistency of the new information presented. Hess suggests that older adults may rely more on life experiences and social rules of behavior in interpreting these types of situations, whereas young adults were more concerned with situational consistency of the new information presented. Again, it is suggested that older adults do not correct their initial impressions because they were affected by the availability of negative information or a negativity bias. Such influences of processing biases suggest that decline in cognitive functioning limits the ability of
older adults to override the impact of more easily accessible information such as the negativity bias. However, Hess also suggests that the use of such information appears to be related to age differences in social experience and changes in the salience of specific behavioral realms. Finally, another example of a processing resource approach to social cognition and aging is illustrated in the work of Schwarz et al. (1999). They found that manipulating the salience of the information given, and thus its accessibility, can modify social judgments in young and older adults. However, in some circumstances, older adults’ judgments show smaller context effects than young adults. One explanation of such findings is that reduced processing resources constrain the processing of contextual information by older adults. Based on the above research, it appears that both processing resource limitations as well as the social content of memory play an important role in understanding how older adults process social information. However, it is also the case that the extent to which social information is accessible may operate independently of a processing resource limitation to influence social judgments. Next we will explore the issue of knowledge-based views of social cognitive processing.
3. Social Knowledge and Social Cognition
There is a history of social cognition research that suggests that strongly held and easily accessible social beliefs and knowledge are activated automatically and can influence social judgments in general, and promote social judgment biases in particular (e.g., Gilbert and Malone 1995). The prepotent influence of such beliefs and knowledge on social judgments may operate even when individuals are not constrained by high cognitive demands and possess ample resources to engage in elaborative processing. This brings into play alternative explanations for older adults’ tendencies to rely on easily accessible information in making social judgments. First, as alluded to earlier, older adults may exhibit a cognitive style that forecloses further consideration of information such as intolerance for ambiguity or a need for structure. As indicated above, age-related decreases in attitudinal flexibility may lead to biases in social judgment. Second, strong attitudinal beliefs and values may be evoked by the presenting situation. For example, Blanchard-Fields (1999) suggests that particular types of situations such as negative interpersonal relationship situations trigger relatively automatic trait- and rule-based social rules or schemas pertinent to the actor in the problem. In this case the individual may not engage in intentional, effortful processing of detailed information. Instead, the older adults may have relied more on activated social beliefs resulting in dispositional biases. Thus, it is not a matter of insufficient processing capacity to engage in elaborative social inference, but overriding considerations (e.g., irrepressible beliefs or values). Accordingly, Chen and Blanchard-Fields (reviewed in Blanchard-Fields 1999) explored both a processing explanation and a belief content hypothesis to explain dispositional biases in older adults’ attributions about negative relationship situations. They manipulated the timing of when dispositional ratings were collected and obtained belief statements regarding appropriate behaviors in the social situations presented. They found that older adults produced more dispositional biases only when time was constrained. However, they also found that older adults made more evaluative rule statements about the main character than younger adults and this rendered all age differences in dispositional biases nonsignificant. Such findings suggest that a knowledge-based explanation of social judgments is a viable alternative to resource limitations. This approach is also evidenced in the research examining the influence of social situational factors and cognitive functioning in older adults. In this case, the research focus is on one’s changing life context and adaptation or social competence. We turn to this approach next.
4. Social Situational Factors, Social Competency, and Social Cognition and Aging
The social cognition paradigm has offered an enriched understanding of competency in older adulthood. From this perspective, social competence is seen as an important and valid dimension of cognition and intelligence. Social cognitive performance, from this perspective, incorporates principles derived from the lifespan contextual perspective and can be explained in terms of adaptation to particular life contexts. For example, research exploring such social cognitive constructs as possible selves (one’s hopes and fears for the future) indicates that they change throughout the latter half of the lifespan (Hooker 1999). For example, health-related selves achieve greater importance in older adulthood and relate to adaptive self-regulatory health behaviors. This is most evident in areas such as wisdom, practical and social problem-solving, and postformal reasoning. For example, Staudinger et al. (1992) argue that wisdom involves an expert body of knowledge and strategies related to life composition and life insight. These strategies are used to produce adaptive outcomes in managing and composing one’s life. Examples of such knowledge and strategies include the ability to select appropriate abilities and strategies, compensate for declining abilities, and optimize one’s performance, e.g., as in increased flexibility in coping behavior. Similarly, Turk-Charles and Carstensen’s (1999) work suggests that changes in the relative importance of social cognitive goals across the lifespan profoundly influence the interpretation of social information and the selection and maintenance of intimate relationships. In this latter case, important new variables affecting social information processing emerge. Carstensen and co-workers argue that perception of time is responsible for this change in social motivation. By considering such changes, age-related decreases in social interaction can be recast in terms of its adaptive significance. In another domain, Dixon (1999) examines cognition in social collaborative contexts. The primary thrust of this work examines age differences in the benefits or costs of collaboration in cognitive performance (e.g., memory, problem solving). He argues that collaborative cognitive functioning characterizes another aspect of social competence. Finally, we cannot ignore the negative role that social situational factors can play in the social cognitive functioning of older adults. This is reflected in the work on social stereotypes and aging. A number of researchers have examined adult developmental changes in the content and structure of stereotypes (e.g., Hummert 1999). For example, Hummert finds that older and young adults hold quite similar age stereotypes. However, she and other researchers find that older persons have more complex representations of the category ‘older adult’ as opposed to young adults who, in addition, hold more negative representations. However, unlike the social cognitive research reviewed above, these studies primarily look at age differences in the content of stereotypes. The explicit social cognitive mechanisms that account for these differences are not revealed clearly. Relative to the knowledge-based framework discussed above, the content of social knowledge, attitudes, and beliefs such as stereotypes is important. However, the content must be accessed or activated in order to influence behavior. An important question is how does stereotype activation encourage or discourage the individual to be a more effective information processor?
With respect to older adults, how do the stereotypes of aging provide expectations of mental deficiencies? In her research Levy (1996) suggests that such automatically activated (that is, subliminally primed) stereotypes about aging and memory adversely affect the cognitive performance of older adults. This is, of course, a provocative finding and in need of replication. However, the point here is that it provokes issues for further research. The issue of directly manipulating threatening and/or facilitative and nonfacilitative social contexts is only beginning to undergo examination.
5. Future Directions
In conclusion, the social cognition and aging research highlights the importance of considering social factors in explanations of social cognitive functioning. Such factors as the communicative context, social beliefs and values, cognitive style, stereotypes, and motivational goals, among others, influence social information processing in important ways. Thus, it is important not to limit explanations of social cognitive change to cognitive processing variables. There are important social factors influencing how and when older adults will attend to specific information and when this information will influence social cognitive functioning. However, current research in social cognition and aging has only begun to address important questions regarding older adults’ social cognitive functioning. In order to understand more fully the role and interplay of social cognitive structures and mechanisms in social cognition and aging, more research is needed. For example, are there qualitative changes in underlying mechanisms influencing social cognitive functioning? Age differences in the nature of social beliefs, values, and goals activated in social reasoning need to be explored. How and under what conditions do age differences in the content and activation of social knowledge and social belief systems account for social information processing? Can factors that influence variability as to when social judgment biases occur in young adults account for age-related variability? Under what conditions does working memory load compromise social information processing and under what conditions is it irrelevant? We can be optimistic in the growth of the social cognitive approach in research on aging. Although many of the cognitive and socio-emotional aspects of behavior and aging have been addressed, we have only begun to scratch the surface.
See also: Adult Cognitive Development: Post-Piagetian Perspectives; Brain Aging (Normal): Behavioral, Cognitive, and Personality Consequences; Cognitive Aging; Cognitive Development in Childhood and Adolescence; Lifespan Development, Theory of; Lifespan Theories of Cognitive Development; Social Learning, Cognition, and Personality Development; Wisdom, Psychology of
Bibliography
Blanchard-Fields F 1999 Social schematicity and causal attributions. In: Hess T M, Blanchard-Fields F (eds.) Social Cognition and Aging. Academic Press, San Diego, CA
Dixon R 1999 Exploring cognition in interactive situations: The aging of N+1 minds. In: Hess T M, Blanchard-Fields F (eds.) Social Cognition and Aging. Academic Press, San Diego, CA
Gilbert D T, Malone P S 1995 The correspondence bias. Psychological Bulletin 117: 21–38
Hess T M 1999 Cognitive and knowledge-based influences on social representations. In: Hess T M, Blanchard-Fields F (eds.) Social Cognition and Aging. Academic Press, San Diego, CA
Hooker K 1999 Possible selves in adulthood: Incorporating teleonomic relevance into studies of the self. In: Hess T M, Blanchard-Fields F (eds.) Social Cognition and Aging. Academic Press, San Diego, CA
Hummert M L 1999 A social cognitive perspective on age stereotypes. In: Hess T M, Blanchard-Fields F (eds.) Social Cognition and Aging. Academic Press, San Diego, CA
Levy B 1996 Improving memory in old age through implicit stereotyping. Journal of Personality and Social Psychology 71: 1092–107
Schwarz N, Park D, Knauper B, Sudman S 1999 Aging, Cognition, and Self-reports. Psychology Press, Philadelphia, PA
Staudinger U M, Smith J, Baltes P B 1992 Wisdom-related knowledge in a life review task: Age differences and the role of professional specialization. Psychology and Aging 7: 271–81
Turk-Charles S, Carstensen L L 1999 The role of time in the setting of social goals across the life span. In: Hess T M, Blanchard-Fields F (eds.) Social Cognition and Aging. Academic Press, San Diego, CA
F. Blanchard-Fields
Social Cognition in Childhood
‘Social cognition’ refers in very general terms to thinking about people, knowledge of people, and the social world, as well as to social processes in cognitive development. That is, it is a term that is used to refer both to the nature or content of children’s understanding of others and themselves, and to theories and explanations of how social experiences are involved in the development of understanding more broadly considered, that is, the role of social factors in cognitive growth. The study of social cognition then is at the border of cognitive and social developmental psychology (Butterworth 1982). It brings together very different theories of the growth of knowledge, and is a growing point for new ideas on the nature of children’s understanding of mind and emotion, and on the relation of this understanding to children’s prosocial and moral development, and their social relationships.
1. Changes in Ideas on Social Cognition
Five themes in the changing ideas on the development of social cognition over recent decades are evident. One concerns the question of how social experiences are implicated in sociocognitive development, in terms of the processes by which cognitive change takes place. The second concerns ideas and evidence on the nature of children’s understanding of emotion, mind, intention, and the connections between these inner states and people’s actions, a domain which has come to dominate cognitive developmental research since the early 1990s. The third theme is the rapidly growing interest in individual differences in social understanding, and the implications of these differences for a wide range of children’s outcomes. The fourth is the role of social experiences in contributing to these individual differences in children’s social understanding. The fifth is the recent interest in bridging the gap between the study of cognitive and of social development through a focus on how cognitive development is implicated in social relationships, moral and prosocial development. These themes will be considered in turn.
1.1 Individual and Social Approaches to Cognitive Development
Piaget’s ideas on cognitive development centered on both the process of understanding, and on the nature of knowledge, and a key notion was that the individual child actively constructs knowledge by selecting and interpreting information in the environment. The processes he proposed for the growth of intelligence were seen as applying to both the inanimate and the social world. His investigations of children’s understanding of the mind began in the 1920s (e.g., Piaget 1926); after these first books he focused chiefly upon children’s growing knowledge of the physical world, and the development of logicomathematical thought. The emergence of social knowledge was not seen as essentially different from other forms of knowledge, though social interactions (especially those with peers) were seen as contributing to children’s growing understanding (see Sect. 1.4 below). In contrast to Piaget’s focus on the individual actively constructing knowledge as an individual, there is a strong tradition within psychology that emphasizes the social nature of cognitive processes, beginning as early as the 1890s with Baldwin’s seminal writings (e.g., Baldwin 1895). In Russia, Vygotsky argued that cognitive growth was socially determined: that cognitive development proceeded from the social plane to the individual (see Wertsch 1985). He famously proposed that new cognitive abilities were first seen when a child was interacting with another person, and only later became internalized. Individual development was, he argued, embedded in a social context. Vygotsky emphasized the difference between a child’s level of potential ability, and his or her actual ability. The difference between these two levels he termed the ‘zone of proximal development’: abilities that were just beyond what a child could achieve alone, but was able to achieve when interacting with a cooperative, supportive other person (adult or peer).
Thus to Vygotsky, the scaffolding or facilitation provided in social interaction (especially interaction between adult and child, or expert and novice) was key for the development of new abilities. Within this broad theoretical framework, researchers have examined the significance of ‘scaffolding’ and facilitation in cognitive performance, and the ‘expert–novice’ interaction (Rogoff 1990). A whole new view of cognitive growth in infancy has flourished, in which the context in which new abilities are fostered is seen as essentially social (Hala 1997). So for example, recent research on the development of understanding of emotion takes off from Darwin’s observations of the development of emotions (which highlight the biological foundation and function of emotional expression) to emphasize the sociocultural contributions to the development of both expression and understanding of emotions. The idea that cognitive growth takes place through social interaction has also been experimentally investigated within a different research tradition, that of Doise and his colleagues in Geneva, with a series of studies that began in the 1970s (e.g., Doise and Mugny 1984). Rather than emphasizing scaffolding and supportive cooperation, they argued that cognitive conflict between individuals who may view a problem differently, or have different levels of ability, can stimulate cognitive development. These theoretical positions have all been concerned with the normative processes by which cognitive changes take place, rather than with the role of social experiences as contributors to individual differences in social cognition; this is considered in Sect. 1.4 below.
1.2 The Nature of Children’s Developing Social Understanding

After Piaget’s seminal work in the 1920s, it was not until the mid-1980s that interest in children’s understanding of mental life, and their perspective on why people behave the way they do, began to revive; since the early 1990s there has been a great increase in developmental research on children’s understanding of emotion, mind, and intention. The focus has been chiefly on normative developmental changes in children’s understanding of emotional and mental life and their relation to people’s behavior, beginning with the work on infancy that highlights the extent to which babies are ‘prepared to learn about people’ from very early in infancy. For example, babies discriminate different emotional expressions very early (a key first step for understanding emotions); by the time they are nine months old they already act in ways that indicate they have considerable grasp of affective communication, and are able to communicate their desires. Clearly, these are efforts at symbolic communication. At the end of the first year, the emergence of ‘social referencing’ is a key development: in situations of ambiguity for the baby (such as when a strange person or object appears), the baby looks at a familiar adult (in most experimental settings, the mother), and apparently is guided by her emotional expression, showing different behavior if she is expressing happiness or fear, for instance. Such differential behavior is taken as evidence that the baby ‘understands’ that
the mother’s emotional expression provides important information about the ambiguous situation. During the second year, babies begin to show pride, shame, and embarrassment: the ‘social emotions,’ emotions that highlight how closely tuned the children are to other people’s reaction to them. They also clearly show their understanding of others’ feelings in their attempts to comfort, tease, and joke, and, with the acquisition of language, in their talk about feelings (Bartsch and Wellman 1995). The marked changes between two and five years old in children’s ability to distinguish thoughts and things; their increasing interest in talking about feeling, thinking, and knowing; their curiosity about why people behave the way they do; their ability to play with hypothetical social events, to share narratives in pretend play, and to deliberately deceive others, have been documented in a burst of research since the early 1990s (Astington 1993).

There has been considerable controversy about the details of the timing of these normative changes (conclusions about children’s abilities depend very much on the methods used and the setting in which children are studied; see Sect. 2), and competing theoretical accounts of these cognitive changes have been articulated. However, the broad outline of development in what has been termed children’s ‘theory of mind’ has been mapped, as follows. Children’s understanding of people’s feelings and mental states, and their use of this understanding to explain and predict people’s behavior, begins in infancy, as the evidence of their intention to communicate shows. It develops in the toddler years with their growing ability to talk first about what they and other people see, want, and feel, then later in their talk about what they think and know.
The critical understanding that people behave in the world as they see it or believe it to be, even when those beliefs or perceptions may be inaccurate, is reached by most children around four years. Note that much of the most recent research emphasizes the substantial competence of children as young as three, and the gradual nature of developmental change. A key tool in this research has been the demonstration that children understand how a ‘false belief’ can influence how someone will behave.

Other key developments in children’s social understanding include children’s growing understanding that people are different, their understanding of the self, and their moral understanding. On the first of these, children’s understanding and attribution of traits, a clear developmental pattern was described from work in the 1970s and 1980s, with marked change around six to eight years in children’s appreciation of trait terms; it is usually presumed that this reflects age-related cognitive changes. Recently the ‘theory of mind’ research has been integrated with this ‘traditional’ approach, and there is increasing empirical support for the idea that both cognitive
development and sociocultural influence play a role in the development of trait attribution (Hala 1997). On the development of moral understanding, there has been a parallel shift from a focus primarily on the cognitive underpinnings of moral development (e.g., Kohlberg 1984) to a recognition that cognition, emotion, and moral action are closely interrelated (Eisenberg et al. 1997). Thus while we now know that developmental changes in prosocial behavior are complex, and influenced by the methodology employed (Eisenberg and Fabes 1998), it is recognized that the relations between perspective-taking abilities and prosocial behavior are moderated by affective factors, by the particular emotional relationship between the child and the other person involved, and by the child’s interpretation of the situation, and attributions about the individuals involved.
1.3 Individual Differences in Social Cognition

Individual differences in sociocognitive abilities have been shown to be importantly linked to children’s adjustment and ‘social competence’ (skills, deficits) in middle childhood; the wealth of research on social cognition and children’s peer relations, and specifically on problems in those relations, illustrates these associations. One line of research has focused on information-processing models (as in Dodge’s 1986 social information-processing model), which highlight children’s attributions about their peers, their goals, and strategies in peer relations. These models conceptualize an ordered series of steps in social information processing that includes encoding and representational processes, and search and decision-making strategies. Differences in processing patterns may underlie the form of aggressive behavior that children show towards their peers. Experimental studies with aggressive children show, for instance, that they do not attend to all relevant cues, attribute hostile intent to a protagonist in an ambiguous setting, and fail to generate enough viable alternative strategies. The original social information-processing models were criticized on the grounds that the linear model was too constrained, and that no role of emotion was included. Revised models now include emotional states as influences on what is perceived, how it is processed, and what attributions are made.

A second line of evidence links children’s thoughts about themselves, such as their perceptions of their own competence, to individual differences in peer relations (Ladd 1999). Thus children who are both aggressive and rejected by their peers, compared to other children, overestimate their own social skills and competence, and underestimate the degree to which their peers dislike them.
Other research links individual differences in early social understanding to later differences in moral sensibility, and to the ways in which children adjust to school. Such findings highlight the importance of clarifying the key processes implicated in the development of such individual differences in social understanding.
1.4 Social Experience as Contributor to Individual Differences in Social Understanding

To understand these individual differences in children’s social understanding, it is increasingly clear that their social experiences must be examined. Research that includes a focus on social experiences not only reveals children’s abilities vividly, but also suggests some key processes implicated in their development (Dunn 1996). The interactive events that have been highlighted include sharing a pretend world with another person (especially a peer or sibling), participation in conversations about inner states (Bartsch and Wellman 1995), joint narratives and causal discourse about the social world, and engagement in certain kinds of ‘constructive’ conflict with others. For example, several studies have shown that children who grew up in families in which they participated in discourse about feelings, and causal talk about why people behave the way they do, were, in later assessment of social understanding, more successful than those who had not engaged in such talk; other research has shown that those who had experienced much shared, cooperative pretend play with a sibling were also in later assessments particularly mature in their social understanding.

Three general principles that emerge from this empirical research are the following. First, the emotional context in which such interactions take place may be key to what is learned; it is not simply the child’s exposure to new information or another’s point of view that matters, but the emotions and pragmatics of the relationship between the interactants that are important. Second, how children use their social understanding in real-life interactions differs in different social relationships, depending on the quality of that relationship. Relationship quality depends on both partners, and is unlikely to be linked simply to the sociocognitive skills of either partner.
Third, different kinds of social relationship appear to play different roles in influencing the development of social understanding. Two recent lines of evidence illustrate this point. First, studies of child–child relationships show these peer interactions to be of special significance in the development of understanding feelings and other minds, and of moral sensibility (as originally proposed by Piaget [1932] 1965). Second, recent work on attachment quality links the security of children’s attachments to their later mind-reading abilities, and it is inferred that the key process implicated here is mothers’ sensitivity to their children’s current levels of understanding and their use of mental state terms in interactions with them, a propensity that Meins (1997) refers to as ‘mind-mindedness.’ The role of family processes, more broadly conceived, in influencing children’s social cognitive abilities during early and middle childhood, both directly and indirectly, continues to be a focus of much research interest. It should be noted that inferences about the direction of effects in such research present particular problems.
1.5 The Significance of Cognitive Processes in Social Development

The advances in our understanding of children’s cognitive development have important implications for how we view and study children’s social development and their social relationships, in terms of both the nature of social development and the processes implicated in it. First, what has been learned from cognitive research about the development of understanding of emotion and mind in childhood means not only that dimensions of empathy and understanding are part of children’s early close relationships, but that aspects of intimacy (previously thought relevant only to the relations between much older children) may also be present in the relationships between preschoolers. Thus observational research shows that three-year-olds who are close friends engage in shared imaginative play in which their cooperation and the smooth co-construction of narratives reflect an impressive ‘meeting of minds’: a mutual understanding of how the other thinks the play might develop, and of what the exciting next moves in the narrative are. Second, the cognitive research alerts us to the importance of looking beyond the ‘security’ aspects that have dominated conceptions of parent–child relationships. Manipulation of control and power within a dyadic relationship, deception, teasing, and shared humor all depend on a grasp of the other’s motivation, beliefs, desires, and capacities. These dimensions may well be relevant for a full account of young children’s relationships. Third, to understand developmental changes in children’s social relationships, the evidence for developments in metacognitive skills is clearly relevant. Perhaps most important of all are the implications of cognitive research for children with difficulties in social relations, for example children with problems such as autism or hyperactivity.
Whether these children’s social problems are antecedent to or sequels of their cognitive problems remains a question of central social and clinical importance, as well as one of theoretical interest. Two other lessons from cognitive research have major implications for the study of children’s sociocognitive development. The first concerns the developmental impact of relationships between other people. Cognitive research has shown us that children are interested in, and reflect on, others’ intentions,
thoughts, and feelings, reminding us to pay attention to the developmental impact of what is happening between the other people in children’s social worlds. Three lines of evidence illustrate this significance: the evidence for the negative impact of marital conflict on children’s adjustment; the research on co-parenting, which shows the poor outcome of children whose parents disagree and fail to support one another over parenting; and third, the accumulating evidence that differential treatment of siblings within the family is associated with later problems in children’s development. The second lesson is that the evidence for children’s interest in the inner states of others suggests that children may be sensitive to the responses, opinions, and relationships between other members of their peer groups to an extent that is only recently being appreciated. Recognizing that children’s problems in peer group relations are a key marker for later adjustment problems suggests that further attention to this sensitivity to the group is important.
2. Methodological Issues and Problems

In research on social cognition there is a new concern with the contextual validity of experimental tasks used, particularly in terms of the social situation in which children are placed in such tasks; their relationship with the experimenter plays a role in their task performance (Dunn 1996). More generally there is a concern to expand the social context of such studies to more ‘real-life settings’: research within a Vygotskian framework, for example, has chiefly been limited to adult–child encounters of a didactic nature, which are affectively neutral, unlike the interactional settings within the family in which children are likely to make advances in sociocognitive ability (Goodnow 1990). Research on children in real-life social settings, as opposed to controlled experimental settings, highlights the capacities that even young children can show in the domain of understanding other people’s feelings, beliefs, and intentions. Studies of natural discourse have been centrally important here (Bartsch and Wellman 1995). There is a tension here in current developmental research, as the naturalistic data suggest children’s capacities for social understanding are evident at considerably younger ages than would be expected on the basis of the standard experimental assessments of sociocognitive development; the various studies of deception provide a good example (Dunn 1996). Such studies not only show that children’s abilities may be misrepresented by their limited abilities in more formal settings; they have also contributed importantly to ideas on the social processes that may influence the development of individual differences in social cognition.
Different methodologies can importantly affect the inferences we make about children’s prosocial behavior (Eisenberg and Fabes 1998), as well as about their mind-reading and emotion understanding. Moreover, children use their sociocognitive abilities differently within their various relationships: one lesson from research that focuses on children in different relationships is that we should move away from the conception of sociocognitive capacity as a characteristic of a child that will be evident in all social situations (Dunn 1996).
3. Future Directions

Among the many challenges posed by the research to date, four stand out as key. First, to make progress in understanding the development of social understanding, we need to address the difficult issue of how affect is linked to cognitive change, and to bridge the divide between research on social and cognitive development. The developmental work that documents children’s explanations of people’s actions first in terms of emotions and desires, and then later incorporating the notion of belief, has led for instance to the provocative argument that an understanding of cognitive states arises from an earlier understanding of emotional states (Bartsch and Estes 1996). Second, we need to investigate the differentiation of different aspects of social understanding, too often combined into global categories at present, and to further explore the relation of language and general intelligence to the various aspects of understanding the social world. Third, the challenging question of how developmental accounts based on children within the normal range of individual differences relate to the development of children at the extremes remains both theoretically and clinically important. Fourth, more research on the implications of cultural and class differences, and more diversity in the children studied, are needed. Finally, at a methodological level, researchers need to include both experimental and naturalistic strategies, to study children in different social settings, and to use as informants both children and others in their social worlds.

See also: Cognitive Development: Child Education; Family Bargaining; Infancy and Childhood: Emotional Development; Piaget’s Theory of Child Development; Social Learning, Cognition, and Personality Development
Bibliography

Astington J W 1993 The Child’s Discovery of the Mind. Harvard University Press, Cambridge, MA
Baldwin J M 1895 Mental Development in the Child and the Race. Macmillan, New York
Bartsch K, Estes D 1996 Individual differences in children’s developing theory of mind and implications for metacognition. Learning and Individual Differences 8: 281–304
Bartsch K, Wellman H M 1995 Children Talk About the Mind. Oxford University Press, Oxford, UK
Butterworth G 1982 A brief account of the conflict between the individual and the social in models of cognitive growth. In: Butterworth G, Light P (eds.) Social Cognition: Studies of the Development of Understanding. The Harvester Press, Brighton, UK, pp. 3–16
Dodge K 1986 A social information processing model of social competence in children. In: Perlmutter M (ed.) Minnesota Symposium on Child Psychology. Erlbaum, Hillsdale, NJ, Vol. 18, pp. 77–125
Doise W, Mugny G 1984 The Social Development of the Intellect. Pergamon, Oxford, UK
Dunn J 1996 The Emanuel Miller Memorial Lecture 1995: Children’s relationships: Bridging the divide between cognitive and social development. Journal of Child Psychology and Psychiatry and Allied Disciplines 37: 507–18
Eisenberg N, Fabes R 1998 Prosocial development. In: Damon W (ed.) Handbook of Child Psychology. Wiley, New York, Vol. 3, pp. 701–78
Eisenberg N, Losoya S, Guthrie I K 1997 Social cognition and prosocial development. In: Hala S (ed.) The Development of Social Cognition. Psychology Press, East Sussex, UK, pp. 329–63
Goodnow J J 1990 The socialization of cognition: What’s involved? In: Stigler J W, Shweder R A, Herdt G (eds.) Cultural Psychology. Cambridge University Press, Cambridge, UK, pp. 259–86
Hala S 1997 The Development of Social Cognition. Psychology Press, East Sussex, UK
Kohlberg L 1984 The Psychology of Moral Development. Harper and Row, San Francisco, CA, Vol. 2
Ladd G W 1999 Peer relationships and social competence during early and middle childhood. Annual Review of Psychology 50: 333–59
Meins E 1997 Security of Attachment and the Social Development of Cognition. Psychology Press, East Sussex, UK
Piaget J 1926 The Language and Thought of the Child. Kegan Paul, London
Piaget J [1932] 1965 The Moral Judgement of the Child. Free Press, Glencoe, IL
Rogoff B 1990 Apprenticeship in Thinking. Oxford University Press, Oxford, UK
Wertsch J V 1985 Vygotsky and the Social Formation of Mind. Harvard University Press, Cambridge, MA
J. F. Dunn
Social Cognitive Theory and Clinical Psychology

1. Social Cognitive Theory

Human behavior has often been explained in terms of one-sided causation. In these unidirectional models, behavior is depicted as being predominantly shaped and controlled by environmental influences, or impelled by inner drives and dispositions. Social cognitive theory explains human functioning in terms of triadic reciprocal causation (Bandura 1986). In this model of reciprocal determinism, personal determinants (in the form of cognitive, biological, and emotional factors), behavior patterns, and environmental events all operate as interacting determinants that influence each other bidirectionally.
2. Agentic Perspective

Social cognitive theory is rooted in an agentic perspective. People are self-organizing, proactive, self-reflecting, and self-regulating, not just reactive organisms shaped and shepherded by external events. Human adaptation and change are rooted in social systems. Therefore, personal agency operates within a broad network of sociostructural influences. In these agentic transactions, people are producers as well as products of social systems. Sociostructural and personal determinants are treated as co-factors within a unified causal structure rather than as rival conceptions of human behavior.
3. Fundamental Human Capabilities

In social cognitive theory, people are characterized in terms of a number of fundamental capabilities. These are enlisted in effecting personal and social change.

3.1 Symbolizing Capability

The extraordinary capacity to represent events and their conditional relations in symbolic form provides humans with a powerful tool for comprehending their environment and for creating and managing environmental conditions that touch virtually every aspect of their lives. Most environmental events exert their effects indirectly, through cognitive processing, rather than directly. Cognitive factors partly determine which environmental events are observed, what meaning is conferred on them, what emotional impact and motivating power they have, and how the information they convey is organized and preserved for future use. Through the medium of symbols, people transform transient experiences into cognitive models that serve as guides for reasoning and action. By symbolizing their experiences, people give structure, meaning, and continuity to their lives. The sociocognitive methods for promoting personal and social change draw heavily on this symbolizing capability.

If put to faulty use, the capacity for symbolization can cause distress. Indeed, many human dysfunctions and torments stem from problems of thought. People live in a psychic environment largely of their own making. In their thoughts, they often dwell on painful
pasts and on perturbing futures of their own invention. They burden themselves with stressful arousal through apprehensive rumination, debilitate their efforts by self-impeding ideation, drive themselves to despondency by harsh self-evaluation and dejecting modes of thinking, and often act on misconceptions that get them into trouble. Thought can thus be a source of human failings and distress as well as a source of human accomplishments. Faulty styles of thinking can be changed to some extent by verbal analysis of faulty inferences from personal experiences. However, corrective and enabling mastery experiences are more persuasive than talk alone in altering faulty beliefs and dysfunctional styles of thinking (Bandura 1986, 1997, Rehm 1988, Williams 1990).
3.2 Vicarious Capability

There are two basic modes of learning. People learn by experiencing the effects of their actions, and through the power of social modeling. Trial-and-error learning is a tedious and hazardous process. Fortunately, this process can be short-cut by social modeling. Humans have evolved an advanced capacity for observational learning that enables them to expand their knowledge and competencies rapidly through the information conveyed by the rich variety of models (Bandura 1986, Rosenthal and Zimmerman 1978).

Modeling is not simply a process of response mimicry, as commonly believed. Modeled activities convey rules for generative behavior. Once observers extract the rules underlying the modeled activities, they can generate new patterns of behavior that go beyond what they have seen or heard. Self-regulatory and other cognitive skills can also be developed by using models to verbalize plans, action strategies, and self-guidance to counteract self-debilitating thought patterns, as solutions to problems are worked through (Meichenbaum 1984). Deficient and faulty habits of thought are corrected as participants adopt and enact the verbal self-guidance.

In addition to cultivating new competencies, modeling influences can alter incentive motivation, emotional proclivities, and value systems (Bandura 1986). Seeing others achieve desired outcomes by their efforts can instill motivating outcome expectations in observers that they can secure similar benefits for comparable performances; seeing others punished for engaging in certain activities can instill negative outcome expectations that serve as disincentives. Observers can also acquire lasting attitudes and emotional and behavioral proclivities toward persons, places, or things that have been associated with modeled emotional experiences. Observers learn to fear the things that frightened models, to dislike what repulsed them, and to like what gratified them.
During the course of their daily lives, people have direct contact with only a small sector of the physical
and social environment. As a result, their conceptions of social reality are greatly influenced by modeled representations of society, mainly by the mass media (Gerbner 1972). Video and computer systems feeding off telecommunications satellites are now rapidly diffusing new ideas, values, and styles of conduct worldwide. At the societal level, symbolic modeling is transforming how social systems operate and serving as a major vehicle for sociopolitical change (Bandura 1997). As Braithwaite (1994) has shown, the speed with which Eastern European rulers and regimes were toppled was greatly accelerated by televised modeling of successful mass action. The vast body of knowledge on modeling processes is being widely applied for personal development, therapeutic purposes, and social change (Bandura 1997, Bandura and Rosenthal 1978, Rosenthal and Steffek 1991). For human problems that stem from sociocognitive deficits, the guided mastery approach that is highly effective in cultivating personal efficacy and psychosocial competencies combines three components. First, the appropriate competencies are modeled in a stepwise fashion to convey the basic rules and strategies. Second, the learners receive guided practice under simulated conditions to develop proficiency in the skills. Third, they are provided with a graduated transfer program that helps them to apply their newly learned skills in their everyday lives in ways that will bring them success. In ameliorating anxiety and phobic dysfunction, the modeling provides coping strategies for managing threats. One can eventually overcome fears without the aid of modeling through repeated unscathed contact with threats, although it takes longer and is more stressful. However, modeling of cognitive and behavioral skills is a key ingredient in the development of complex competencies. Sociocognitive approaches rely on mastery experiences as the principal vehicle of change. 
The enabling power of social modeling is enhanced by guided mastery enactments. When people avoid what they dread, they lose touch with the reality they shun. Guided mastery treatment quickly restores reality testing in two ways. It provides disconfirming tests of phobic beliefs by persuasive demonstrations that what phobics dread is safe. Even more importantly, it provides confirmatory tests that phobics can exercise control over what they fear. Intractable phobics, of course, are not about to do what they dread. Therapists must, therefore, create environmental conditions that enable phobics to succeed despite themselves. This is achieved by enlisting a variety of performance-mastery aids. Threatening activities are repeatedly modeled to demonstrate coping strategies and to disconfirm people’s worst fears. Intimidating tasks are reduced to graduated subtasks of easily mastered steps. Joint performance with the therapist enables frightened people to do things they would refuse to do on their own. Another method for overcoming resistance is for phobics to perform the feared activity for only a short time. As they become bolder, the length of involvement is extended. Protective conditions can be introduced to weaken resistance that retards change. After effective functioning is fully restored, increasingly challenging self-directed mastery experiences are then arranged to strengthen and generalize the sense of coping efficacy. Varied mastery experiences build resiliency by reducing vulnerability to the negative effects of adverse experiences. This is a powerful treatment that eliminates phobias, anxiety arousal, and biochemical stress reactions, and wipes out recurrent nightmares and intrusive rumination (Bandura 1997, Williams 1990).

Symbolic modeling lends itself readily to society-wide applications through creative use of the electronic media. For example, the soaring population growth and the environmental devastation it produces is the most urgent global problem. Worldwide use of enabling and motivating dramatic serials is raising people’s efficacy to exercise control over their family lives, enhancing the status of women to have some say in how they live their lives, and lowering the rates of childbearing (Singhal and Rogers 1999, Vaughan et al. 1995).
3.3 Self-regulatory Capability

People are self-reactors with a capacity to motivate, guide, and regulate their activities through the anticipative mechanism of forethought. People set goals for themselves, anticipate the likely consequences of prospective actions, and plan courses of action that are likely to produce desired outcomes and avoid detrimental ones. The projected future is brought into the present through forethought. A forethoughtful perspective provides direction, coherence, and meaning to one’s life.

Much human motivation and behavior is regulated anticipatorily by the material and social outcomes expected for given courses of action. Social cognitive theory broadens this functionalism to include self-evaluative outcomes. People do things that give them satisfaction and a sense of self-worth, and refrain from actions that evoke self-devaluative reactions. Incentive systems are enlisted, if needed, to promote and help sustain personal and social change (Bandura 1986, 1998, O’Leary and Wilson 1987).

In keeping with the agentic perspective of social cognitive theory, the greatest benefits that psychological treatments can bestow are not specific remedies for particular problems, but the self-regulatory capabilities needed to deal effectively with whatever situations might arise. To the extent that treatment equips people to exercise influence over events in their lives, it initiates an ongoing process of self-regulative change.
Personal standards for judging and guiding one’s actions play a major role in self-motivation and in the exercise of self-directedness (Bandura 1991, Locke and Latham 1990). Self-regulatory control is achieved by creating incentives for one’s own actions and by anticipative reactions to one’s own behavior depending on how it measures up to personal standards. Sociocognitive principles of self-regulation provide explicit guidelines for self-motivation, personal development, and modification of detrimental styles of behavior. A favorable self-regulatory system provides a continuing source of motivation, self-directedness, and personal satisfaction, but a dysfunctional self-system can breed much human misery. For people who adopt stringent personal standards, most of their accomplishments bring them a sense of failure and self-disparagement (Bandura 1997, Rehm 1988). In its more extreme forms, harsh standards of self-evaluation give rise to despondency, chronic discouragement, and feelings of worthlessness and lack of purposefulness. For example, Hemingway, who took his own life, imposed upon himself demands that were unattainable throughout his life, pushed himself to extraordinary feats, and constantly demeaned his accomplishments. Effective treatments for despondency arising from stringent self-imposed standards remedy each of the dysfunctional aspects of the self-system (Rehm 1981). They correct self-belittling interpretative biases that minimize one’s successes and accentuate one’s failures. To increase self-satisfaction and a sense of personal accomplishment, participants are taught how to set themselves attainable subgoals in activities of personal significance and to focus their efforts and self-evaluation on progress toward their aspirations. Through proximal structuring, goals become motivating rather than demoralizing. 
Depressed individuals tend to be less self-rewarding for successes and more denying and self-punishing for failures than the nondepressed for similar performances. They are taught how to be more self-rewarding for their progressive personal attainments. Deficient or deviant standards also create problems, although the distress is inflicted on others rather than on oneself. Unprincipled individuals who pursue an ethic of expediency, and those who pride themselves on antisocial activities, readily engage in conduct that is socially injurious (Bandura 1991). Antisocial and transgressive styles of conduct have diverse sources requiring multifaceted approaches to personal change. Successful treatments promote the development of competencies and prosocial self-regulatory standards and styles of behavior sufficiently attractive to supplant antisocial ones (Bandura 1973, Reid and Patterson 1991, Silbert 1984). Education provides the best escape from crime and poverty. Promoting educational development is, therefore, a key factor in reducing crime and delinquency.
3.4 Self-reflective Capability
Among the mechanisms of personal agency none is more central or pervasive than people’s beliefs in their capability to exercise control over their own functioning and over environmental events (Bandura 1997). Efficacy beliefs are the foundation of human agency. Unless people believe that they can produce desired results by their actions, they have little incentive to act or to persevere in the face of difficulties. Meta-analyses, which combine the findings of numerous studies, attest to the influential role played by efficacy beliefs in human adaptation (Holden 1991, Holden et al. 1990, Multon et al. 1991, Stajkovic and Luthans 1998). Self-efficacy beliefs regulate human functioning through their impact on cognitive, motivational, emotional, and choice processes. They determine whether people think pessimistically or optimistically, and in self-enhancing or self-debilitating ways. Efficacy beliefs play a central role in the self-regulation of motivation through goal challenges and outcome expectations. Whether people act on the outcomes they expect prospective performances to produce depends on their beliefs about whether or not they can produce those performances. It is partly on the basis of efficacy beliefs that people choose what challenges to undertake, how much effort to expend in the endeavor, how long to persevere in the face of obstacles and failures, and whether failures are motivating or demoralizing. Efficacy beliefs play a key role in shaping the courses lives take by influencing the types of activities and environments people choose to enter. By choosing their environments, people can have a hand in what they become. People’s beliefs in their coping capabilities also play a pivotal role in the self-regulation of affective states (Bandura 1997). Efficacy beliefs influence how potential threats are perceived and cognitively processed.
If people believe they can manage threats they are not distressed by them. But if they believe they cannot control potential threats they experience high anxiety. Through inefficacious thinking they distress themselves and constrain and impair their functioning. Many human distresses result from failures of thought control. It is not the sheer frequency of disturbing cognitions, but the perceived inability to turn them off that is the major source of distress. Similarly, in obsessional disorders, it is not the ruminations, per se, but the perceived inefficacy to stop them that is perturbing. Phobics display high anxiety and physiological stress to threats they believe they cannot control. After their perceived efficacy is raised to the maximal level by guided mastery experiences, they manage the same threats with equanimity. It was widely assumed that phobic behavior was controlled by anxiety. Many therapeutic procedures were, therefore, keyed to extinguishing anxiety arousal.
The anxiety control theory has not withstood experimental scrutiny, however. A low sense of coping efficacy produces both anxiety and phobic behavior. People often perform activities even though highly anxious, as long as they believe they can master them. Intense stage fright, for example, does not stop actors from going on stage. Conversely, people avoid situations they believe exceed their coping capabilities without waiting for visceral arousal to tell them to do so. Williams (1992) has shown that people base their actions on efficacy beliefs in situations they regard as risky, not on anxiety arousal. A low sense of efficacy to exercise control over things one values can give rise to feelings of futility and despondency through several pathways. In one pathway, a low sense of efficacy to fulfill personal standards of worth gives rise to self-devaluation and depression. A second pathway occurs through a low sense of social efficacy to develop social relationships that bring satisfaction to people’s lives and enable them to manage chronic stressors. Another is through the exercise of control over depressing thoughts themselves. Low efficacy to regulate ruminative thought contributes to the occurrence of depressive episodes, how long they last, and how often they recur. People’s beliefs about their efficacy are constructed from four principal sources of information. The most effective way of instilling a strong sense of efficacy is through mastery experiences. Successes build a robust belief in one’s personal efficacy. Failures undermine it. Development of resilient self-efficacy requires experiences in overcoming obstacles through perseverant effort. The second method is by social modeling. Models serve as sources of competencies and motivation. Seeing people similar to oneself succeed by perseverant effort raises observers’ beliefs in their own capabilities. Social persuasion is the third mode of influence.
The fourth way of altering self-efficacy beliefs is to enhance physical strength and stamina and alter mood states on which people partly judge their capabilities. These different modes of influence create and strengthen beliefs of personal efficacy across diverse spheres of functioning. Such beliefs predict the level, scope, and durability of behavioral changes (Bandura 1997). Microanalyses of how given treatments work reveal that the efficacy belief system is a common pathway through which different types of treatment produce their effects. Knowledge of the determinants and processes governing the formation of efficacy beliefs provides explicit guidelines on how best to structure programs to achieve desired change.
4. Sociostructural Change
Many human problems are sociostructural, not simply individual. Common problems require social solutions through collective action. Social cognitive theory extends the conception of human agency to collective agency (Bandura 1997). People’s shared belief in their collective power to produce desired results is a key ingredient of collective agency. The stronger the people’s perceived collective efficacy, the higher their aspirations and motivational investment in their undertakings, the stronger their staying power in the face of impediments and setbacks, the higher their morale and resilience to adversity, and the greater their accomplishments. Efforts to improve the human condition must, therefore, be directed not only at treating the casualties of adverse social practices, but also at altering the social practices producing the casualties. The models of social change derived from social cognitive theory (Bandura 1986, 1997) draw heavily on knowledge of modeling, motivational, regulatory, and efficacy mechanisms operating at the collective level.
See also: Self-efficacy; Self-regulated Learning
Bibliography
Bandura A 1973 Aggression: A Social Learning Analysis. Prentice-Hall, Englewood Cliffs, NJ
Bandura A 1986 Social Foundations of Thought and Action: A Social Cognitive Theory. Prentice-Hall, Englewood Cliffs, NJ
Bandura A 1991 Social cognitive theory of moral thought and action. In: Kurtines W M, Gewirtz J L (eds.) Handbook of Moral Behavior and Development. Erlbaum, Hillsdale, NJ, Vol. 1, pp. 45–103
Bandura A 1997 Self-efficacy: The Exercise of Control. Freeman W H, New York
Bandura A, Rosenthal T L 1978 Psychological modeling: Theory and practice. In: Garfield S L, Bergin A E (eds.) Handbook of Psychotherapy and Behavior Change, 2nd edn. Wiley, New York, pp. 621–58
Braithwaite J 1994 A sociology of modeling and the politics of empowerment. British Journal of Sociology 45: 445–79
Gerbner G 1972 Communication and social environment. Scientific American 227: 152–60
Holden G 1991 The relationship of self-efficacy appraisals to subsequent health related outcomes: A meta-analysis. Social Work in Health Care 16: 53–93
Holden G, Moncher M S, Schinke S P, Barker K M 1990 Self-efficacy of children and adolescents: A meta-analysis. Psychological Reports 66: 1044–6
Locke E A, Latham G P 1990 A Theory of Goal Setting and Task Performance. Prentice-Hall, Englewood Cliffs, NJ
Meichenbaum D 1984 Teaching thinking: a cognitive-behavioral perspective. In: Glaser R, Chipman S, Segal J (eds.) Thinking and Learning Skills: Research and Open Questions. Erlbaum, Hillsdale, NJ, Vol. 2, pp. 407–26
Multon K D, Brown S D, Lent R W 1991 Relation of self-efficacy beliefs to academic outcomes: A meta-analytic investigation. Journal of Counseling Psychology 38: 30–8
O’Leary K D, Wilson G T 1987 Behavior Therapy: Application and Outcome, 2nd edn. Prentice-Hall, Englewood Cliffs, NJ
Rehm L P 1981 A self-control therapy program for treatment of depression. In: Clarkin J F, Glazer H I (eds.) Depression: Behavioral and Directive Treatment Strategies. Garland STPM Press, New York, pp. 68–110
Rehm L P 1988 Self-management and cognitive processes in depression. In: Alloy L B (ed.) Cognitive Processes in Depression. Guilford Press, New York, pp. 143–76
Reid J B, Patterson G R 1991 Early prevention and intervention with conduct problems: a social interactional model for the integration of research and practice. In: Stoner G, Shinn M, Walker H (eds.) Interventions for Achievements and Behavior Problems. National Association of School Psychologists, Silver Spring, MD, pp. 715–39
Rosenthal T L, Steffek B D 1991 Modeling applications. In: Kanfer F H, Goldstein A P (eds.) Helping People Change, 4th edn. Pergamon, New York, pp. 70–121
Rosenthal T L, Zimmerman B J 1978 Social Learning and Cognition. Academic Press, New York
Silbert M H 1984 Delancy Street Foundation—process of mutual restitution. In: Reissman F (ed.) Community Psychology Series. Human Sciences, New York, Vol. 10, pp. 41–52
Singhal A, Rogers E M 1999 Entertainment-education: A Communications Strategy for Social Change. Erlbaum, Mahwah, NJ
Stajkovic A D, Luthans F 1998 Self-efficacy and work-related performance: A meta-analysis. Psychological Bulletin 124: 240–61
Vaughan P W, Rogers E M, Swalehe R M A 1995 The effects of ‘Twende Na Wakati,’ an entertainment-education radio soap opera for family planning and HIV/AIDS prevention in Tanzania. Unpublished manuscript. University of New Mexico, Albuquerque, NM
Williams S L 1990 Guided mastery treatment of agoraphobia: beyond stimulus exposure. In: Hersen M, Eisler R M, Miller P M (eds.) Progress in Behavior Modification. Sage, Newbury Park, CA, Vol. 26, pp. 89–121
Williams S L 1992 Perceived self-efficacy and phobic disability. In: Schwarzer R (ed.) Self-efficacy: Thought Control of Action. Hemisphere, Washington, DC, pp. 149–76
A. Bandura
Social Comparison, Psychology of
Social comparison occurs when an individual compares himself or herself to another person. The dimension on which the comparison is made can be almost anything: attractiveness, tennis skill, wealth, etc. The comparison might be deliberately sought, or it might occur by chance. It is not social comparison when one compares one person to another, as in comparing one neighbor to another; the individual making the comparison must be included in the comparison for it to fit the commonly accepted definition.
1. The Original Theory
Leon Festinger’s (1954) paper, A Theory of Social Comparison Processes, still serves as the basic formulation, although there have been many additions and revisions. Festinger argued that it is necessary for people to know what they are capable of doing and to hold correct opinions about the world. Sometimes people can perform a physical reality test on their abilities and opinions, but that could be dangerous, inconvenient, or impossible. For example, if individuals want to know whether they can swim across a swift river, they could simply try it, but that could be dangerous. If it is a person’s opinion that the Russian Revolution would not have occurred without V. I. Lenin, there is simply no physical reality test of the truth of that opinion. Consequently, people often seek social reality rather than physical reality for their abilities and opinions, and that social reality is obtained through social comparison.
1.1 Similarity
A cornerstone of Festinger’s theory is that comparison with others who are similar provides better information than comparison with dissimilar others. The theory was ambiguous, however, about the concept of similarity, and critics argued that if an individual already knows that someone is similar, that implies that a comparison has already been made and that further comparison with that person would be redundant. Goethals and Darley (1977) combined attribution theory with social comparison theory and suggested that similarity should refer to similarity on ‘related attributes,’ attributes correlated with and predictive of the particular ability or opinion to be evaluated. To answer the question, ‘How good a tennis player am I?’ a person should compare with someone of the same age, gender, physical condition, and tennis experience. The person should logically expect to perform equally to that comparison target and, if this turns out to be so, would probably feel that he or she is a fairly good player (‘as good as I ought to be’). If the comparison is with someone advantaged (or disadvantaged) on related attributes, the person would expect to lose (or to win) and, if this turns out to be so, wouldn’t be able to draw any clear conclusion about his or her ability level. The person wouldn’t know whether the loss or win was due to ability or to the related attributes. ‘Did I win because I am a good tennis player, or because I am in better physical condition?’
1.2 The Unidirectional Drive Upward
An important difference between opinions and abilities in Festinger’s theory is a unidirectional drive upward for abilities but not for opinions. Regardless of the ability, people would generally like to have more of it than other people have. With opinions, however, people feel confident when they have the same opinions as others. Thus, all the members of a group can be satisfied when they have the same opinion, but not when they have exactly the same level of ability.
2. Methods for Studying Social Comparison
The three major methods for studying social comparison are ‘selection’ approaches, which examine characteristics of the people with whom individuals choose to compare; ‘reaction’ approaches, which examine individuals’ emotional or cognitive reactions to comparison information; and ‘narration’ approaches, which examine individuals’ self-reports of comparisons made (Wood 1996). Each of these will be examined below.
2.1 The Selection Approach
Individuals are given the choice of two or more comparison targets that differ in theoretically important ways. The choices they make (and related data) provide clues concerning social comparison motives. The first notable example, known as the ‘rank order paradigm,’ was introduced to study the unidirectional drive upward. Experimental participants were tested on a (fictitious) desirable or undesirable personality trait such as ‘intellectual flexibility’ or ‘intellectual rigidity’ and were told that they occupied the middle rank in a group of several other participants tested at the same time. Participants were given their own (fictitious) score on the test, as well as the rank of each of the other participants, and were told that they could be shown the score of one other person in the group. The rank of the person participants chose to see could be of someone better off than themselves (known as an ‘upward comparison’) or worse off than themselves (a ‘downward comparison’). In most of the experiments using this method, participants made upward comparisons. That is, they wanted to compare their scores to better scores, either more flexible or less rigid (Wheeler et al. 1969). They believed that they were similar to and almost as good as those better off than themselves and chose to compare with these others in order to confirm that similarity.
2.2 The Reaction Approach
Individuals are presented with social comparison information, and their responses to that information are measured. In what is known as the Mr. Clean/Mr. Dirty study, job applicants casually encountered another applicant whose appearance was either socially desirable (dark suit, well-groomed, and confident) or socially undesirable (torn and smelly clothes, poorly groomed, confused). Those participants who encountered Mr. Clean showed a decrease in self-esteem measured a few minutes later, whereas those who encountered Mr. Dirty increased in self-esteem (Morse and Gergen 1970). In this research, participants contrasted themselves against the other job applicant, but this is not always what happens. For example, Van der Zee et al. (1998) exposed cancer patients to a bogus interview with another patient who was doing better than participants (upward comparison) or worse (downward comparison). Patients responded with more positive affect to upward than to downward comparisons, particularly those patients who scored high on a measure of identification with the upward comparison patient. Collins (1996) called this ‘upward assimilation,’ the perception that one is similar to superior others. This is opposed to ‘upward contrast,’ in which comparing with someone who is better off (like Mr. Clean) makes people feel worse about themselves.
2.3 The Narrative Approach
This approach attempts to measure the comparisons that people make or have made in their everyday lives. At the most basic level, participants are sometimes asked who they have compared themselves with in the past. This places too great a burden on memory, however, and a better method is the Social Comparison Record, a diary recording of all comparisons as they occur over a period of two weeks, including the dimension of comparison (e.g., appearance, happiness), the direction (e.g., upward, downward), relationship to the target (e.g., friend, stranger), and mood before and after the comparison (Wheeler and Miyake 1992). Free response measures are also narrative in approach. These are spontaneous statements, usually made during the course of an interview or conversation, implying comparisons. Wood et al. (1985) reported that breast cancer patients made many free response comparison statements, most of which seemed to indicate downward comparison, such as ‘I think I am coping a little better than these other women.’
3. Downward Comparison Theory
The rank-order experiments drew attention to the difference between upward and downward comparisons, and one of the experiments (Hakmiller 1966) demonstrated downward comparison. Participants who were threatened with having a very high score on the undesirable personality trait ‘hostility toward one’s parents’ chose to compare with others who had even more of the trait. This was a downward comparison because it was with someone worse off than themselves. By verifying that someone had even more of the negative trait than themselves, participants could defend their self-esteem. Influenced by this research, Wills (1981) proposed a theory of downward comparison: that persons experiencing threat or negative affect will enhance their subjective well-being through comparison with someone worse off than themselves, thereby reducing distress. This prediction received support from studies of cancer and other medical patients, particularly those studies using free response measures (see Buunk and Gibbons 1997).
3.1 Challenges to Downward Comparison Theory
There is substantial controversy about the empirical support for downward comparison theory. The theory makes the intuitively appealing prediction that when people are experiencing distress they will make a downward comparison to reduce the distress. Others have questioned whether people experiencing distress are psychologically capable of making a downward comparison; their attention may be so focused on their distress that they are unable to see others as being worse off than themselves. In research using the Social Comparison Record (see above), Wheeler and Miyake (1992) found that when participants were experiencing negative affect, they made upward comparisons rather than the downward comparisons predicted by downward comparison theory. In contrast, when the same participants were experiencing positive affect, they made downward comparisons. Such a result is not consistent with downward comparison theory but is consistent with affect-cognition priming models in which affect primes (makes ready and available) cognitions about the self that are congruent with the affect. Experiencing negative affect primes a person to have negative thoughts about the self and thus to see others as superior to the self, leading to a contrastive upward comparison and further feelings of inferiority and negative affect.
4. A Return to the Concerns of the Original Theory
4.1 Abilities
It should be apparent that Festinger’s (1954) original theory is very different from most of the subsequent research and theory. Festinger was concerned with social comparison providing accurate evaluation of one’s opinions and abilities, whereas much of the later work was concerned with how social comparison relates to how people feel about themselves. Some recent work has returned to Festinger’s original concerns. For example, one of the questions addressed by Festinger’s theory was ‘Can I do X?’ Can I swim across the river? Can I complete a college degree? Can I have both a family and a career? One way to answer such questions is to compare oneself to someone, called a proxy, who has already succeeded at the task (Martin 2000). If people seem to have as much of the required ability or abilities as the proxy, or more, they could conclude with some confidence that they could also perform the task. There are two ways that people can be confident that they have as much of the underlying ability as the proxy: (a) they have competed directly against the proxy and have performed as well as the proxy when the proxy was performing to maximum, or (b) they have observed the proxy performing, are unsure whether the proxy was performing to maximum, but are similar to the proxy on attributes related to the ability. In both cases they have established with some degree of certainty that they are similar in ability to the proxy and thus have the same action possibilities.
4.2 Opinions and Attitudes
Suls (2000) proposed that the social comparison of opinions is best considered in terms of three different evaluative situations: preference assessment (i.e., ‘Do I like X?’), belief assessment (i.e., ‘Is X correct?’), and preference prediction (i.e., ‘Will I like X?’). Each evaluative question is associated with a different comparison dynamic.
(a) If the question is ‘Do I like X?,’ persons similar in related attributes have special importance. Thus, in deciding whether one likes a new political candidate, a person should prefer comparison with someone with similar political values and goals. Such a comparison target should share one’s assessment of the candidate, and if not, something is amiss, and the situation should be re-examined.
(b) If the question is ‘Is X correct?,’ comparisons with persons of more advantaged status (or ‘expert’) are most meaningful, although they also should hold certain basic values in common (the ‘similar expert’). Thus, to determine the truth of the belief that guns should not be in private hands, comparison with the opinion of an expert on the relevant statistics would be useful, but only if the expert shared one’s values about human life, freedom, etc.
(c) If the question is ‘Will I like X?,’ the most meaningful comparisons are with persons who have already experienced ‘X’ and who share a consistent history of preferences. Thus, in deciding whether a movie would be enjoyable, a person would consult with someone who has seen the movie and who has in the past usually agreed or usually disagreed with the person. If the other has usually agreed, the person would expect to have the same reaction to the movie; if the other has usually disagreed, the person would expect to have an opposed reaction to the movie.
5. Prospects
There is more interest in social comparison theory now than at any time in its almost half-century history because it has become clearer that comparison with other people is central to the human experience. The theory is being applied to new problems and is increasingly influencing and being influenced by other areas within psychology.
See also: Attributional Processes: Psychological; Cognitive Dissonance; Personality and Conceptions of the Self; Self-efficacy; Self-evaluative Process, Psychology of; Self-monitoring, Psychology of
Bibliography
Buunk B P, Gibbons F X (eds.) 1997 Health, Coping, and Well-Being: Perspectives from Social Comparison Theory. Lawrence Erlbaum Associates, Mahwah, NJ
Collins R L 1996 For better or worse: The impact of upward social comparison on self-evaluations. Psychological Bulletin 119(1): 51–69
Festinger L 1954 A theory of social comparison processes. Human Relations 7: 117–40
Goethals G R, Darley J M 1977 Social comparison theory: an attributional approach. In: Suls J M, Miller R L (eds.) Social Comparison Processes: Theoretical and Empirical Perspectives. Hemisphere, Washington, DC
Hakmiller K L 1966 Threat as a determinant of downward comparison. Journal of Experimental Social Psychology 2(1): 32–9
Martin R 2000 ‘Can I do X?’: Using the proxy comparison model to predict performance. In: Suls J, Wheeler L (eds.) Handbook of Social Comparison: Theory and Research. Plenum, New York
Morse S, Gergen K J 1970 Social comparison, self-consistency, and the concept of self. Journal of Personality and Social Psychology 16: 148–56
Suls J 2000 Opinion comparison: The role of corroborator, expert, and proxy in social influence. In: Suls J, Wheeler L (eds.) Handbook of Social Comparison: Theory and Research. Kluwer Academic/Plenum, New York
Van der Zee K, Buunk B, Sanderman R 1998 Neuroticism and reactions to social comparison information among cancer patients. Journal of Personality 66(2): 175–94
Wheeler L, Shaver K G, Jones R A, Goethals G R, Cooper J, Robinson J E, Gruder C L, Butzine K 1969 Factors determining choice of a comparison other. Journal of Experimental Social Psychology 5: 219–32
Wheeler L, Miyake K 1992 Social comparison in everyday life. Journal of Personality and Social Psychology 62(5): 760–73
Wills T A 1981 Downward comparison principles in social psychology. Psychological Bulletin 90: 245–71
Wood J V 1996 What is social comparison and how should we study it? Personality and Social Psychology Bulletin 22(5): 520–37
Wood J V, Taylor S E, Lichtman R R 1985 Social comparison in adjustment to breast cancer. Journal of Personality and Social Psychology 49: 1169–83
L. Wheeler, J. Suls, and R. Martin
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Social Competence: Childhood and Adolescence
1. Definitions
Social competence refers to the effectiveness of a person’s functioning as an individual, in dyadic relationships and in groups. Like other general constructs used by psychologists (e.g., risk, stress and vulnerability), competence refers to a general class or category of phenomena. In this respect, competence brings together a set of ideas or concepts regarding individuals’ capacities to function in their environments. Insofar as competence refers to a property of the relationship between the individual and the environment, one cannot think of individuals as being competent without knowing the contexts in which they are engaged. Competence is also a metatheoretical construct in the sense that it is not linked to a single theoretical perspective. Instead, it serves as an organizing principle for research programs and it allows scientists from diverse areas of study to see the links and commonalities among them.
1.1 Initial Conceptualizations
An etymological analysis of the word competence shows that it refers to the engagement of an individual in activity oriented toward the achievement of a state or goal (Bukowski et al. 1998). Therefore, a competent person is not a person who possesses a particular skill or ability but is a person who is able to engage in the activities needed to achieve particular goals or outcomes. This exact view, that competence is a process, serves as the centerpiece of White’s (1959) seminal conceptualization of competence as an organism’s orientation toward an efficient interaction with the environment. For White (1959), this orientation was a profound and basic form of motivation, as critical as any fundamental drive. According to his view, competence is not a form of motivated behavior that serves other drives, but is a fundamental form of motivation in and of itself. To support this argument, he referred to the high rates of exploratory behavior seen in many animals and the infant’s instinctual curiosity in novelty. Both of these phenomena show that organisms are inherently oriented toward interaction with the environment. He found evidence of this ‘instinct to master’ in the observations and interpretations provided by neo-analysts (e.g., Fenichel 1945, Hendrick 1942). Hendrick, for example, went so far as to write that individuals have an ‘inborn drive to do and to learn to do.’ Again, White was careful to point out that according to these neo-analysts, such activity is not in the service of another goal (e.g., anxiety reduction), but is instead a basic human motivation by itself. White noted that this proposal was central also to the ideas of Erik Erikson. In Erikson’s (1953) description of the latency years, he emphasized that this developmental period is a time in which competence is the main developmental struggle that children confront. Erikson wrote that ‘children need a sense of being able to make things and to make them well and even perfectly: this is what I call a sense of industry.’
1.2 Current Definitions The impact of White’s views is, in part, apparent in the definitions of competence that subsequent writers have offered. These include the following: ‘the effectiveness or adequacy with which an individual is capable of responding to various problematic situations which confront him’ (Goldfried and D’Zurilla 1969 p. 161); ‘an individual’s everyday effectiveness in dealing with his environment’ (Zigler 1973); ‘a judgment by another that an individual has behaved effectively’ (McFall 1982 p. 1); ‘attainment of relevant social goals in specified social contexts, using appropriate means and resulting in positive developmental outcomes’ (Ford 1982 p. 324); the ability ‘to make use of environmental and personal resources to achieve a good developmental outcome’ (Waters and Sroufe 1983 p. 81); and ‘the ability to engage effectively in complex interpersonal interaction and to use and understand people effectively’ (Oppenheimer 1989 p. 45). In a review of these definitions, Rubin et al. (1998) pointed out that they share at least two points: (a) an emphasis on effectiveness, and (b) an understanding of the environment so as to achieve one’s own needs or goals. One has to presume that the dynamism that stood at the center of White’s view is implicit in these definitions even if it is not stated explicitly. Using the definitions noted above, as well as others found in the work of Attili (1989), Dodge (1986), Gresham (1981), and Strayer (1989), as a point of departure, Rubin and Rose-Krasnor (1986, 1992) have defined social competence as the ability to achieve personal goals in social interaction while simultaneously maintaining positive relationships with others over time and across situations. A significant feature of this definition is its implicit recognition of the importance of both individual and social goals. This emphasis reflects an essential duality of self and other, placing the individual within a social and personal context.
The conceptualization of self and other as interdependent is an important feature of theory from several domains including personality theory (Bakan 1966, Sullivan 1953), gender role theory (Leaper 1994), feminist psychology (Jordan et al. 1991), and philosophy (Harré 1979, 1984). Accordingly, this definition is valuable as it points to the complex goals that persons confront as individuals and as members of groups. It is in this regard that individuals, as well as social scientists, have struggled with the tension between agentic and communal orientations.
2. The Socially Competent Child and Adolescent Children’s competence within social contexts depends upon their social skills. These skills are both behavioral and cognitive. The cognitive skills include the ability to understand and predict the internal states (i.e., the thoughts, feelings, and intentions) of other children and to understand the meaning of these states within the present context. Using this information, a competent child is able to develop plans for interacting with others to achieve desired outcomes and to evaluate the flow of the interaction as it unfolds. Part of this evaluation concerns the consequences of one’s actions as well as those of the partner, on both instrumental and moral grounds, so as to maintain positive interaction, and either repair or terminate negative interactions. In the behavioral domain, a competent child is able to act prosocially and is capable of displaying appropriate positive affect while avoiding or inhibiting negative affect and is able to communicate verbally and nonverbally so as to facilitate the partner’s understanding of the interaction. (These issues are discussed further in chapters found in Asher and Coie (1990) and Schneider et al. (1989).)
2.1 Competence as Process Recent approaches to social competence have gone beyond the emphasis on skills per se and have argued that social competence needs to be understood as a hierarchical process. Two models of this sort have been proposed by Rubin and Rose-Krasnor (1992) and Crick and Dodge (1994). In parallel to previous process-oriented approaches to social cognition and behavior (e.g., Miller et al. 1960) these models include a series of steps that are presumed to underlie social interaction. Rubin and Rose-Krasnor adopt an interpersonal problem solving model of social competence which is largely predicated on the view that most social interactions appear to be routine, reflecting an automaticity in thinking and a scriptedness in social interchanges (Abelson 1981, Nelson 1981, Schank and Abelson 1977). They note, for instance, that children quickly learn many rudimentary scripted forms of social behavior, such as greeting and leave-taking sequences, as well as acquiring an understanding that many social relationships ‘slot’ members into particular roles that are played out in highly predictable ways (e.g., Strayer 1989). Such scripts and role understandings allow children easy access to sequences of actions corresponding to highly familiar social
situations, thus facilitating social competence. Indeed, for both competent and less-than-competent children, familiar social situations easily elicit routine, script-driven behavior. In the absence of such familiarity, when appropriate and routine eliciting conditions are not present, this sort of script-driven enactment of social behavior is precluded.
2.2 Rubin and Rose-Krasnor Model Rubin and Rose-Krasnor propose that in such novel conditions, social competence derives from the following sequentially organized steps. First, a goal needs to be chosen. Often several goals are apparent, and they can conflict with each other. For this reason Dodge et al. (1989) argued that the successful coordination of goals is more likely to be a challenge to an individual’s competence than is the achievement of a particular goal. Second, the chosen goals need to be understood according to contextual factors, such as the person with whom the child is interacting (e.g., stranger, acquaintance, friend) and this person’s characteristics, as well as the features of the context (e.g., school, home, or playground), population density, and visibility (public versus private encounters). Next, strategies for achieving that goal need to be chosen. To do this, a child may retrieve one or more strategies from long-term memory, or new strategies may be constructed from those already available. Once a strategy is selected, the child must produce it in its relevant context, and the results need to be evaluated. If the strategy is judged to be successful, the problem-solving process ends and relevant information concerning the social interchange is retained in long-term memory. If a strategy has been judged as partially successful, the actor may accept the outcome as ‘successful enough’ and proceed as if the outcome was a success; alternatively, the individual may decide to judge the outcome as a failure. In this case, the child may stop the sequence and leave the goal unattained, or may choose either to repeat the original strategy or to alter the previous strategy in some way while maintaining the same goal.
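The sequence of steps described above can be read as a simple control loop. The following Python sketch is only an illustration of that reading and is not part of Rubin and Rose-Krasnor's own formulation. All names in it (Outcome, ProblemSolver, enact, accept_partial, max_attempts) are hypothetical. It assumes that the goal and its context have already been chosen and are passed in as inputs, that the ordered list of strategies stands in for retrieval from long-term memory, and that a failed strategy is always replaced by the next one (the option of repeating the same strategy is omitted for brevity).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Outcome(Enum):
    """Possible evaluations of an enacted strategy (hypothetical labels)."""
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILURE = "failure"


@dataclass
class ProblemSolver:
    # Ordered strategies standing in for retrieval from long-term memory.
    strategies: List[str]
    # If True, a partially successful outcome is accepted as 'successful enough'.
    accept_partial: bool = True
    # After this many attempts the goal is left unattained (an assumption).
    max_attempts: int = 3
    # Record of (goal, context, strategy) triples retained after success.
    memory: List[Tuple[str, str, str]] = field(default_factory=list)

    def solve(
        self,
        goal: str,
        context: str,
        enact: Callable[[str, str, str], Outcome],
    ) -> Optional[str]:
        """Try strategies in order; return the one that worked, or None."""
        for attempt, strategy in enumerate(self.strategies):
            if attempt >= self.max_attempts:
                break  # stop the sequence, goal unattained
            # Produce the strategy in context and evaluate the result.
            outcome = enact(strategy, goal, context)
            succeeded = outcome is Outcome.SUCCESS or (
                outcome is Outcome.PARTIAL and self.accept_partial
            )
            if succeeded:
                # Retain information about the interchange and end the process.
                self.memory.append((goal, context, strategy))
                return strategy
            # Otherwise alter the strategy while keeping the same goal.
        return None
```

In this sketch the caller supplies enact, a function that stands for the child's production and evaluation of a strategy; a fuller model would also represent goal selection and goal coordination, which the text identifies as a major challenge in its own right.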
2.3 Crick and Dodge Model Another approach to understanding the process of social competence is seen in a model proposed by Crick and Dodge (1994). They also emphasize the importance of information processing as the basis of social competence. Their model involves six steps, each of which involves a form of information processing that draws upon currently available information as well as a database of rules, schemas, and social knowledge. The six steps in their model are as follows.
First, the child encodes social cues by attending to relevant social information such as who is in the social milieu (a friend, a stranger, a popular child, an unpopular child), how it is that this person is expressing him or herself (affectively happy, angry, or sad), and so on. Second, the child must interpret the encoded cues. This requires that the child search his or her memory store in an effort to understand the meaning of the encoded information. For example, one of the social cues encoded at Step 1 might be a facial expression denoting anger. Interpreting the facial expression correctly may lead to the assumption that the social situation involves hostility and anger. Third, the child develops a goal or set of goals based on his/her interpretation of the encoded cues. Fourth, the child accesses or constructs potential responses to achieve the selected goals. The fundamental question asked here is ‘What are all the things I can do in this situation?’ Fifth, she or he must decide on a particular response to be enacted. This selection depends on various considerations including the expected outcomes of particular responses, his/her confidence that the selected strategy can be enacted, and his/her evaluation of the response itself. For example, the individual might consider the potential interpersonal consequences (‘Will she still like me?’), the instrumental consequences (‘Will I get what I want?’), and the moral value of the strategy. Finally, the child enacts the chosen response. The models of Rubin and Rose-Krasnor (1986) and Crick and Dodge (1994) are similar in that they both emphasize (a) that the processing underlying social competence occurs rapidly in real time, (b) that the steps of information processing follow a particular sequence, (c) that they are dynamically interrelated, yet separable, and (d) that processing can proceed automatically at the unconscious level.
Most importantly they regard social competence as a plan of action by which social goals are achieved.
3. Competence at Different Levels of Social Complexity Thoreau, in his description of his cabin near Walden Pond, said ‘I had three chairs in my house; one for solitude, two for friendship, three for society.’ This multilevel scheme applies equally well to social competence. As we stated above, competence is not a feature of a single domain of functioning but instead can be seen at multiple levels of social complexity, specifically, the individual, the dyad, and the group. This recognition that competence needs to be assessed at multiple levels of experience is a central theme in the writings of Bronfenbrenner and Crouter (1983) and Hinde (1979, 1987, 1995). Our point is that these different levels of complexity provide different contexts for competence. One goal of research on peer relations has been to
identify competence at different levels and to understand how these levels are related to each other. In the following sections we describe briefly the ways that competence is manifested at these three levels of analysis.
3.1 The Individual The level of the individual refers to the behavioral, affective, and cognitive characteristics and dispositions that a child brings to social interaction. These characteristics include behavioral tendencies, either derived from experience or temperament, cognitions about self and other, motivations and goals, relationship history, and affective states and traits. The level of the individual is the most basic level of social complexity. While it might appear that the level of the individual has no social complexity, such complexity can exist in a child’s representations of self and other (e.g., in internal working models) as well as in their planning for social interactions and relationships. Certainly, characteristics from the level of the individual can be interpreted as indicating social competence. The two features from the level of the individual that have been studied most broadly concern individuals’ general social behaviors and their social cognitions. In parallel to psychology’s general concern with the processes of (a) moving against others, (b) moving toward others, and (c) moving away from others, research on social behavior and competence has been largely organized around the constructs of aggression, sociability and helpfulness, and withdrawal. This interest in how the child’s individual behaviors influence experiences with peers has comprised a large portion of research on the peer system during the past 15 years. In fact, the very large literature on children’s popularity has been typically focused on behaviors at the level of the individual. Information about the correlates of popularity is important as it reveals which characteristics from the level of the individual are related to effectiveness in establishing a general place within the peer group.
3.1.1 Individuals and behavior. In a meta-analysis of this large body of literature regarding individual behaviors and popularity, Newcomb et al. (1993) were able to reach conclusions about differences between individuals from different popularity groups, especially the popular, rejected, and neglected groups. They were able to interpret these differences as a function of what they reveal about competence. Specifically, they reported that popular children (i.e., those who were liked by many children and disliked by very few) are more likely to be helpful, to interact actively with other children, to show leadership skills, and to engage in constructive play (see Newcomb et al. 1993). These findings show that popular children are capable of pursuing and achieving their own goals while simultaneously allowing others to do the same. In other words, they not only engage in ‘competent’ action themselves, but they are able to promote it in others. One critical finding in regard to the behavior of popular children is that they differ from other children on some aspects of aggression but not on others. Newcomb et al. differentiated assertive behaviors from disruptiveness. Popular children did not differ from others on assertive forms of aggression, but they were significantly less disruptive. These findings show that although popular children engage in assertive and agonistic behavior, these assertive actions are unlikely to interfere with the actions and goals of others. Consistent with other reports (Olweus 1977, Pepler and Rubin 1991), it would appear that some forms of aggression may be compatible with being popular and with the concept of social competence. As Hawley (1999) has argued, children’s strategies for controlling resources do not necessarily involve coercive or agonistic actions. In this way children can be dominant and competent. In contrast with the popular children, Newcomb et al. reported that rejected children not only showed high levels of disruptive behavior but also showed little evidence of engagement in activities with peers. In other words, their activities disrupted their peers’ effective engagement with the environment, and they showed little effective engagement themselves. This latter point is true of the neglected group also.
3.1.2 Individuals and social cognition. A second phenomenon from the level of the individual is that of social cognition. Social cognition includes a broad array of processes, including the way that an individual interprets and understands the behaviors of one’s self and of others. Akin to research on the link between behavior and experiences with peers, much of the research on social cognition has focused on how an individual’s use of social information is linked to the person’s experiences with peers at the group level. Indeed, research on social cognition has also examined how an individual’s understanding and interpretation of experience may partially determine the impact that experiences with peers have on development. This sort of social understanding is likely to be a prerequisite for competence. Without knowing how others will act, and without having a means of understanding the motives of others, individuals do not have a ‘ground’ on which to base their action (Harré 1974). Moreover, in order to interact effectively with the environment, one needs to possess the ‘local knowledge’ about how this environment functions (Geertz 1983). Considering this, it is not surprising that most models of social competence place a strong emphasis on social understanding as a
powerful antecedent to competent social action (Crick and Dodge 1994, Rubin and Rose-Krasnor 1986).
3.2 The Dyad Peer experiences at the level of the dyad can be conceptualized as reflecting either interactions or relationships. According to Hinde (e.g., 1979, 1987, 1995) interactions are the series of interchanges that occur between individuals over a limited time span. Interactions are defined as interdependencies in behavior (Hinde 1979). That is, two persons are interacting when their behaviors are mutually responsive. If a person addresses someone (e.g., says ‘Have a nice day’) and the other person responds (‘Don’t tell me what to do’) they have had an interaction, albeit a rather primitive one. Relationships are based on these patterns of interaction. The relationship consists of the cognitions, emotions, internalized expectations, and qualifications that the relationship partners construct as a result of their interactions with each other.
3.2.1 Basic aspects of friendship. Researchers can examine dyadic relationships, such as friendship, according to (a) what the partners do together (i.e., the content of the relationship); (b) the number of different activities in which the partners engage (i.e., the breadth of their interactions); and (c) the quality of the interactions within the relationship (e.g., reciprocal, complementary, positive, negative). As in the research on popularity in which associations between behavior and popularity were used as the basis of conclusions regarding competence, research on friendship, a primary form of relationship experience for children, informs us of the processes that are central to engagement at the dyadic level. In research on friendship, different characteristics of relationships can be contrasted empirically, for example, between friends and acquaintances or disliked others.
3.2.2 Conflict. One domain of interaction that nicely illustrates the nature of competence in friendships is that of conflict. Conflict is one of the few dimensions of interaction in which there are no differences between friends and nonfriends. Researchers have shown repeatedly that during childhood and adolescence, pairs of friends engage in the same amount of conflict as pairs of nonfriends (Hartup 1992, Laursen et al. 1996, Newcomb and Bagwell 1995). There is, however, a major difference in the conflict resolution strategies that friends and nonfriends adopt. In a large meta-analysis of research on friendship, Newcomb and Bagwell showed that
friends were more concerned about achieving an equitable resolution to conflicts than achieving outcomes that would favor one partner. More specifically, Laursen et al. and Hartup reported that friends are more likely than nonfriends to resolve conflicts in a way that will preserve or promote the continuity of their relationship. In this respect, competence in friendship appears to require the ability of the two partners to find a balance between individual and communal goals.
3.2.3 Self and other. A significant feature of this interpretation of these findings is its implicit recognition that competence in friendship is based on achieving a balance between personal desires and social consequences. This emphasis reflects an essential duality of self and other, placing the individual within a social and personal context. The conceptualization of self and other as interdependent is an important feature of theory from several domains including personality theory (Bakan 1966, Sullivan 1953), gender role theory (Leaper 1994), feminist psychology (Jordan et al. 1991), and philosophy (Harré 1979, 1984). This view of the self as inseparable from the other underscores the complex goals that persons confront as individuals and as members of dyads. In this way competence in friendship involves resolving the tension between agentic/individual and communal orientations.
3.3 The Group Group experiences refer to experiences among a set of individuals who have been organized by either formal or informal means. Group phenomena do not refer to individuals but instead refer to (a) the links that exist among the persons in the group, and (b) the features that characterize the group. Groups can be measured according to (a) their structure and size (see Benenson 1990), and (b) the themes or the content around which groups are organized (see Brown 1989). Many of the best known studies of children and their peers were concerned with the group per se, including Lewin et al.’s (1938) study of group climate and Sherif et al.’s (1961) study of intragroup loyalty and intergroup conflict.
3.3.1 Studying the group. In spite of the apparent importance of phenomena at the level of the group, group measures have received far less attention than measures conceptualized at the level of the dyad and the individual. This is surprising because persons generally refer to the peer system as a group phenomenon. Indeed, references to experiences with peers invariably refer to the ‘peer group.’ Moreover, as Hartup (1983) noted, early theory and research on
the peer system was organized around the idea that group experiences were powerful determinants of behavior. Nevertheless, research on the factors related to the peer group per se has been relatively sparse. In spite of a relative lack of research, the studies regarding peer group behavior generally converge on the same conclusion. This conclusion is that groups that show the most effective engagement with the environment (e.g., those that can complete a task most efficiently) are those that have the strongest sense of cohesion and shared purpose (e.g., Sherif et al. 1961) or that have the most engaged leadership (Lewin et al. 1938). These studies show that a factor that may be critical for an individual to be competent within a group is the shared knowledge or understanding of a task or ritual necessary for a group to function toward a common goal.
4. Conclusion and Future Directions Social competence refers to an individual’s capacity to achieve goals at the individual, dyadic, and group levels. Being competent means that one can balance one’s own goals with those of one’s partners and group members. Competent behavior is predicated on social cognitive processes underlying the encoding and interpretation of social problems and the development, execution, and evaluation of plans to solve problems. The socially competent child and adolescent achieves his/her own goals while promoting the goal-oriented actions of his/her friends and companions. In this regard, such children have enduring positive relationships with others. Research on the development of competence needs to be focused on how children resolve the demands required for functioning at the individual, dyadic, and group levels of social complexity. Consistent with definitions of competence, researchers need to treat competence as a process, rather than as a static individual entity. An emphasis on action and process is necessary to reveal the integrative mechanisms that underlie a child’s ‘competence’ in social and personal domains. See also: Friendship: Development in Childhood and Adolescence; Intrinsic Motivation, Psychology of; Motivation and Actions, Psychology of; Shyness and Behavioral Inhibition
Bibliography
Abelson R P 1981 The psychological status of the script concept. The American Psychologist 36: 715–29
Asher S R, Coie J D (eds.) 1990 Peer Rejection in Childhood. Cambridge University Press, New York
Attili G 1989 Social competence versus emotional security: The link between home relationships and behavior problems in preschool. In: Schneider B H, Attili G, Nadel J, Weissberg R P (eds.) Social Competence in Developmental Perspective. Kluwer Academic Publishers, Boston
Bakan D 1966 The Duality of Human Existence. Rand McNally, Chicago
Benenson J F 1990 Gender differences in social networks. Journal of Early Adolescence 10: 472–95
Bronfenbrenner U, Crouter A C 1983 The evolution of environmental models in developmental research. In: Hetherington E M (ed.) Handbook of Child Psychology: Vol. 1. History, Theories and Methods, 4th edn. Wiley, New York, pp. 357–414
Brown B B 1989 The role of peer groups in adolescents’ adjustment to secondary school. In: Berndt T J, Ladd G W (eds.) Peer Relationships in Child Development. Wiley, New York, pp. 188–216
Bukowski W M, Bergevin T, Sabongui A, Serbin L 1998 Competence: The short history of the future of an idea. In: Pushkar D, Bukowski W, Schwartzman A, Stack D, White D (eds.) Improving Competence Across the Life Span. Plenum, New York, pp. 91–100
Crick N R, Dodge K A 1994 A review and reformulation of social information processing mechanisms in children’s social adjustment. Psychological Bulletin 115: 74–101
Dodge K A 1986 A social information processing model of social competence in children. In: Perlmutter M (ed.) Minnesota Symposium on Child Psychology. L. Erlbaum Associates, Hillsdale, NJ, Vol. 18, pp. 77–125
Dodge K A, Asher S R, Parkhurst J T 1989 Social life as a goal coordination task. In: Ames C, Ames R (eds.) Research on Motivation in Education. Academic Press, San Diego, CA, Vol. 3, pp. 107–35
Erikson E H 1953 Growth and crises of the healthy personality. In: Kluckhohn C, Murray H A, Schneider D M (eds.) Personality in Nature, Society, and Culture, 2nd rev. and enl. edn. Knopf, New York, pp. 185–250
Fenichel O 1945 The Psychoanalytic Theory of Neurosis. W. W. Norton, New York
Ford M E 1982 Social cognition and social competence in adolescence. Developmental Psychology 18: 323–40
Geertz C 1983 Local Knowledge: Further Essays in Interpretive Anthropology. Basic Books, New York
Goldfried M R, D’Zurilla T J 1969 A behavior analytic model for assessing competence. In: Spielberger C D (ed.) Current Issues in Clinical and Community Psychology. Academic Press, New York, pp. 151–98
Gresham F M 1981 Validity of social skills measures for assessing social competence in low status children: A multivariate investigation. Developmental Psychology 17: 390–8
Harré R 1974 The conditions for a social psychology of childhood. In: Richards M P M (ed.) The Integration of a Child into a Social World. Cambridge University Press, Cambridge, UK
Harré R 1979 Social Being. Rowman and Littlefield, Totowa, NJ
Harré R 1984 Personal Being. Harvard University Press, Cambridge, MA
Hartup W W 1983 Peer relations. In: Hetherington E M (ed.) Handbook of Child Psychology: Vol. 4. Socialization, Personality and Social Development, 4th edn. Wiley, New York, pp. 103–96
Hartup W W 1992 Conflict and friendship relations. In: Shantz C U, Hartup W W (eds.) Conflict in Child and Adolescent Development. Cambridge University Press, Cambridge, UK, pp. 185–215
Hawley P H 1999 The ontogenesis of social dominance: A strategy-based evolutionary perspective. Developmental Review 19: 97–132
Hendrick I 1942 Work and the pleasure principle. Psychoanalytic Quarterly 11: 33–58
Hinde R A 1979 Towards Understanding Relationships. Academic Press, London
Hinde R A 1987 Individuals, Relationships and Culture. Cambridge University Press, Cambridge, UK
Hinde R A 1995 A suggested structure for a science of relationships. Personal Relationships 2: 1–15
Jordan J V, Kaplan A, Miller J B, Stiver I, Surrey J 1991 Women’s Growth in Connection: Writings from the Stone Center. Guilford Press, New York
Laursen B, Hartup W W, Koplas A L 1996 Toward understanding peer conflict. Merrill-Palmer Quarterly 42: 76–102
Leaper C 1994 Exploring the consequences of gender segregation on social relationships. In: Leaper C (ed.) Childhood Gender Segregation: Causes and Consequences. Jossey-Bass, San Francisco, CA, pp. 67–86
Lewin K, Lippitt R, White R K 1938 Patterns of aggressive behavior in experimentally created ‘social climates.’ Journal of Social Psychology 10: 271–99
McFall R M 1982 A review and reformulation of the concept of social skills. Behavioral Assessment 4: 1–33
Miller G A, Galanter E, Pribram K H 1960 Plans and the Structure of Behavior. Henry Holt, New York
Nelson K 1981 Social cognition in a script format. In: Flavell J H, Ross L (eds.) Social Cognitive Development. Cambridge University Press, New York, pp. 97–118
Newcomb A F, Bagwell C L 1995 Children’s friendship relations: A meta-analytic review. Psychological Bulletin 117: 306–47
Newcomb A F, Bukowski W M, Pattee L 1993 Children’s peer relations: A meta-analytic review of popular, rejected, neglected, controversial, and average sociometric status. Psychological Bulletin 113: 99–128
Olweus D 1977 Aggression and peer acceptance in adolescent boys: Two short-term longitudinal studies of ratings. Child Development 48: 1301–13
Oppenheimer L 1989 The nature of social action: Social competence versus social conformism. In: Schneider B H, Attili G, Nadel J, Weissberg R P (eds.) Social Competence in Developmental Perspective. Kluwer Academic Publishers, Boston, pp. 40–70
Pepler D J, Rubin K H (eds.) 1991 The Development and Treatment of Childhood Aggression. L. Erlbaum Associates, Hillsdale, NJ
Rubin K H, Bukowski W M, Parker J G 1998 The peer system: Interactions, relationships and groups. In: Damon W (series ed.), Eisenberg N (Vol. ed.) The Handbook of Child Psychology, 5th edn. J. Wiley, New York
Rubin K H, Rose-Krasnor L 1986 Social cognitive and social behavioral perspectives on problem solving. In: Perlmutter M (ed.) Minnesota Symposium on Child Psychology. L. Erlbaum Associates, Hillsdale, NJ, Vol. 18, pp. 1–68
Rubin K H, Rose-Krasnor L 1992 Interpersonal problem solving. In: Van Hasselt V B, Hersen M (eds.) Handbook of Social Development. Plenum, New York, pp. 283–323
Schank R C, Abelson R P 1977 Scripts, Plans, Goals, and Understanding. Erlbaum, Hillsdale, NJ
Schneider B H, Attili G, Nadel J, Weissberg R P (eds.) 1989 Social Competence in Developmental Perspective. Kluwer Academic Publishers, Boston
Sherif M, Harvey O J, White B J, Hood W R, Sherif C W 1961 Inter-group Conflict and Cooperation: The Robbers Cave Experiment. University of Oklahoma Press, Norman, OK
Strayer F F 1989 Co-adaptation within the early peer group: A psychobiological study of social competence. In: Schneider B H, Attili G, Nadel J, Weissberg R P (eds.) Social Competence in Developmental Perspective. Kluwer Academic Publishers, Boston, pp. 145–74
Sullivan H S 1953 The Interpersonal Theory of Psychiatry, 1st edn. Norton, New York
Waters E, Sroufe L A 1983 Social competence as a developmental construct. Developmental Review 3: 79–97
White R 1959 Motivation reconsidered: The concept of competence. Psychological Review 66: 297–333
Zigler E 1973 Project Head Start: Success or failure? Learning 1: 43–7
W. M. Bukowski, K. H. Rubin, and J. G. Parker
Social Constructivism In its original version, social constructivism is a view about the social nature of science. It rests on the methodological assumption that a sociological analysis of science and scientific knowledge can be empirically fruitful and epistemologically illuminating. This approach has generated detailed empirical studies of scientific practices (for instance, of what is going on in laboratories on a day-to-day basis). According to social constructivism, these studies show that which scientific beliefs are held to be true or false, and thus what the scientific facts are, does not depend exclusively on the objective external world but also (or even mainly or exclusively) on social arrangements resulting from negotiations between scientists in the course of scientific practices. It is in this sense that scientific knowledge and scientific facts are supposed to be socially constructed. Social constructivism is not a single, well-specified doctrine, however, but rather a family of related studies representing different versions of the general approach.
1. Historical Background
One of the guiding ideas of social constructivism is that science, scientific knowledge, and scientific practices are socially determined. Historically, this idea is rooted in the historical materialism developed by Marx. Historical materialism is best understood as a research program, roughly in the sense of the methodology of research programs proposed by Lakatos; the core of historical materialism consists of the claim that the economic base of a society determines its superstructure. This core is supposed to generate testable historical theories or hypotheses about what are, in various societies, the base and the superstructure, and about the specific ways in which they are related. The strongest, but not the only possible, interpretation is that the superstructure is causally determined by the base. This approach was taken up by sociologies of science worked out during the twentieth century. Thus, for instance, Karl Mannheim wanted to show how the humanities and theories of society are socially determined; and one of the leading American sociologists of science, Robert Merton, suggested that it is primarily the rise and success of science that is determined by social forces (Merton 1973).
Another important intellectual source of social constructivism is the historical turn of twentieth-century philosophy of science (Kuhn 1962, Bachelard 1937, Canguilhem 1966, Foucault 1969, Hacking 1990). This movement is vigorously opposed to scientific realism and rejects the claim, so crucial for the traditional philosophy of science, that there are universal rational scientific methods and rules such that the history of science can be rationally reconstructed as a continuous intellectual enterprise constantly improving our scientific knowledge by applying these methods. Instead, proponents of the movement see the history of science and scientific beliefs as shattered by contingent breaks and as dependent on normative attitudes and hidden non-scientific assumptions that an adequate philosophy of science must aspire to reveal.
A third theoretical background of social constructivism is the program of naturalizing epistemology initiated by Quine (Quine 1969). The crucial idea of this program is that traditional epistemology, centering around the question of the justification of claims to (scientific) knowledge, should be replaced by a scientific study of causal processes of belief formation and information. In this way, epistemology is supposed to be transformed into a natural science.
In sum, social constructivism combines three ideas, borrowed respectively from a broadly Marxist sociology of science, from the historical turn of twentieth-century philosophy of science, and from the program of naturalizing epistemology: that the development of scientific knowledge (a) is determined by social forces, (b) is essentially contingent and independent of rational methods, and (c) should be analyzed in terms of causal processes of belief formation.
2. The Edinburgh School of the Sociology of Science
The first prominent version of social constructivism is the Edinburgh school of the sociology of science. The proponents of this school start out by criticizing the Marxist sociology of science for the exclusion of mathematics and natural sciences as a proper subject of the sociology of science. The extended claim is that mathematics and natural sciences are socially determined too, and it is this extension that is fundamental to the further claim of many social constructivists that, in a sense, the world as such is socially constructed (Bloor 1991; Knorr-Cetina 1981; Latour and Woolgar 1986). Specifically, it is, according to the Edinburgh sociologists of science, not only the historical development of science, its rise and success, that is influenced by social forces; rather, it is the content of accepted scientific beliefs that is determined by social factors or by the social interests involved in scientific practices. Edinburgh sociologists of science conclude that there is no definite unique set of rational methods that guide scientific practices which can be referred to in order to explain how scientific results and beliefs are established. Another result is that it is not helpful to make a sharp distinction between the context of discovery and the context of justification in sociological investigations of scientific practices. In this way the Edinburgh school rejects fundamental assumptions held by the traditional philosophy of science.
If scientific practices and the development of science cannot be rationally reconstructed, then the sociological analysis of scientific practice must basically be a causal explanation of belief formation and belief validation; it is supposed to show in detail how specific scientific beliefs are established as a result of a causal process proceeding from social conditions, social interests, and negotiations between scientists guiding the specific scientific practice under consideration. Therefore, social constructivism is, in the view of the Edinburgh school, itself a kind of natural science. The domain of this sort of science is the set of all scientific beliefs that get accepted for the time being, independently of whether they are true or false and whether they ultimately prove to be successful or not.
Some constructivists focus not on belief formation, but rather on scientific facts: they want to know in which way such facts are hardened, i.e., get accepted as facts by all researchers in the field (Latour and Woolgar 1986). However, focusing in this way on scientific facts instead of scientific beliefs does not make much difference if scientific facts are supposed to be what accepted scientific beliefs are about—roughly the content of statements showing up in established scientific textbooks on the field (Quine 1969). Nevertheless, it is precisely by shifting the focus from beliefs to facts that some social constructivists are inclined to entertain an ontological version of social constructivism that sees scientists as essentially contributing to the existence of scientific facts. Obviously, this claim has idealistic implications only if scientific facts are understood in a strictly realistic fashion—roughly as the way things are independently of the way we take them to be. Social constructivism, as proposed by the Edinburgh sociologists, can also be formulated as a methodological claim mandating a naturalistic approach to scientific practices; as such, social constructivism is best understood as being part of a
naturalized epistemology concentrating on the investigation of the social causes of belief formation. In the methodological version, social constructivism does not seem to be committed to ontological claims; however, it remains committed at least to the epistemological claim that scientific knowledge cannot simply be seen as a good representation of the external objective world, but must rather be taken as a result of an extremely complex process involving mainly, and most importantly, social causes. In this way, social constructivism is not only a form of anti-rationalism, but also a form of anti-realism and in this twofold sense, a form of relativism.
3. The Actor-network Theory
The actor-network theory (Latour and Woolgar 1986, Latour 1987) is a form of constructivism that rejects the idea of a social determination of scientific knowledge, prominent in the Edinburgh school, mainly for the reason that the social is barely better understood than the natural. The leading thought is that scientific knowledge is an effect of established relations between objects, animals, and humans engaged in scientific practices. An actor is, according to this theory, everything that in some causal way affects the production of scientific statements and theories: not only scientists, but also, for instance, background assumptions, methodologies, techniques, social rules and institutions, routines, experiments, measurements and the appropriate instruments, scientific texts and, last but not least, external objects. For an entity to be an actor in this sense it is obviously not required to have contentful mental states, but only to be able to perform actions, understood as a kind of behavior describable under some intention. Thus, there can be many sorts of relations and interactions between actors; in particular, some actors can transform other actors (these transformations are sometimes called translations). A network is a set of actors such that there are relations and translations between the actors that are stable, in this way determining the place and functions of the actors within the network. Once a network has been established it implies a sort of closure that prevents other actors or relations from entering the network, thereby opening the possibility of the accumulation of scientific knowledge that is taken to be the result of translations within the network. Establishing a scientific belief, theory, or fact comes down, from the point of view of the actor-network theory, to placing these actors in a stable network.
In this sense, scientific beliefs, knowledge, theories, and facts are taken to be constructed by translations taking place in established networks. The actor-network theory shares a number of basic assumptions with social constructivism as conceived in the Edinburgh school. Thus, both approaches entertain a naturalistic account of scientific practices, do not presuppose a distinction between true or successful and false or unsuccessful scientific beliefs, and reject the possibility of a rational reconstruction of scientific practices and their outcomes. But constructivism in the sense of the actor-network theory is social not in the strong sense that social forces that are presupposed to exist largely independently of scientific practices have a causal impact on these practices, but rather in the extremely weak sense that, as a result of processes taking place in networks, a scientific claim can eventually be developed about a distinction between the natural and the social, and consequently also about the function of the social for scientific practices (Pickering 1992).
Social constructivism usually does not hold that, in the course of scientific practices, scientific facts of the external world are literally constructed of some other entities. The crucial idea of a (social) construction of scientific knowledge and scientific facts is, rather, that an analysis of the process and history of scientific belief formation will not be able to show that the methods of science continuously increase the probability that scientific beliefs will be good representations of an independent external world, and should not even try. Instead, scientific belief formation should be modeled in terms of very different factors, mainly social ones like rules, techniques, institutions, power relations, and negotiations, which affect scientific belief formation in a causal way that can be studied empirically and sociologically.
One of the most debated applications of social constructivism is the claim held by many feminists that gender is socially constructed (see Comparable Worth in Gender Studies, Feminist Political Theory and Political Science).
Originally, the most prominent feminist theories introduced a distinction between a naturalistic notion of sex and a social notion of gender in order to denounce all attempts to derive properties of gender from properties of sex as mere constructions of gender that do not refer to reality. More recently, in some postmodern feminist accounts of sex and gender, it is also the very distinction between sex and gender, and thus sex itself, that is supposed to be socially constructed.
4. Social Constructivism About the Social
In one of the most recent developments of social constructivism, leading proponents of the Edinburgh school try to exploit the thought that scientific knowledge is, after all, something like an institution and that institutions are constructed by humans (Bloor 1996). It seems to follow that scientific knowledge is, because constructed by humans, socially constructed. This idea is part of what is often called the intentionalist program of social ontology, the aim of which is to clarify the ontological status of social entities, i.e., of groups and institutions. The fundamental claim is
that it is on the basis of intentional mental states and associated actions that social entities come into existence. That is to say that human beings construct social entities by having specific contentful thoughts and by performing intentional actions. It is obvious that this approach is a sort of social constructivism in the sense that it is a theory about the construction of the social; this is held to be so insofar as it consists largely of normative interactions between humans. It must be emphasized that this sort of constructivism is restricted to the realm of the social, that it presupposes a sharp distinction between the natural and the social, and that it has nothing to say about the construction of natural facts or physical entities. Most importantly, though, the intentionalist program relies on an intentional vocabulary. In all these respects the program differs fundamentally from traditional versions of social constructivism.
Basically, the intentionalist program of social ontology is an individualistic account of the social; at the same time, however, one of the crucial theoretical moves is to introduce notions of collective intentionality and of collective actions. The first step is to assume that for some persons to be socially related in a specific way is to be treated by most of the other members of their society as being related in this way. Social relations are taken to imply rules as well as certain rights. In general, therefore, to treat some persons as socially related in way S is to treat them as being entitled or committed to perform actions according to the rules and rights derivable from S. One of the next important moves is to introduce the notion of a collective attitude (or, as it is sometimes called, a we-attitude) and, in particular, the notion of a collective intention to perform a specific action.
Roughly speaking, a person P, as a member of a set S of persons, has a contentful collective attitude of the sort A iff (a) P has the A-attitude herself, (b) P believes that every other member has the A-attitude, and (c) P believes that every other member M of S believes that every member of S other than M has the A-attitude. In particular, if a number of persons Pi intend to perform action A together or collectively, then it must be the case, in addition to (a)–(c), that every Pi intends to perform A by performing a specific action Ai as her specific contribution to the performance of A such that Pi thinks that Ai is necessary for A and results, together with the other specific actions Aj intended to be performed by the other Pj, in performing A, and that Pi believes that the other Pj intend to do the same thing and believe themselves that every other person intends to do the same thing. On the basis of these notions it can be asserted that a number of persons Pi have the collective intention to perform action A iff each Pi has the collective intention to do A by performing a specific Ai, in the sense just explained; and Pi perform a collective action A iff A can be described under some collective intention of the Pi. It might be suggested that a number of
persons Pi represent a social group iff they share a collective attitude. Finally, let MR be a set of semantically consistent rules Ri such that a number of persons collectively treat an X as a Y by following Ri, then MR is, in the most basic sense, an institution. In particular, if MR is the set of rules according to which scientific results and scientific knowledge are established, and if the persons in question are trained scientists, then MR is an institution of epistemic or scientific practices, and in this sense, scientific knowledge might be called an institution that is constructed by intentional acts of human beings (Searle 1995, Tuomela 1995).
This line of argument is a rough sketch that by no means exhausts the theoretical complexity of the intentionalist program. For instance, something has to be said about the mechanisms of the formation of collective attitudes which account for the stability of social groups and institutions; at this point, a theory of consensus and a theory of power have to be incorporated into the general account. In any case, even a rough sketch of this sort of social ontology shows how, to put it very generally, the mind can be thought of as bringing into existence, and keeping in existence (and thus as constructing), social entities in a fairly literal manner. It also becomes clear how and why traditional social constructivists might be inclined to render their position more precise by exploiting the intentionalist program of social ontology. This program is a philosophically sound approach; it makes evident, however, that human beings do not construct social facts as they construct natural facts. Traditional versions of social constructivism tend to downplay this distinction, which might be thought of as a fundamental flaw. Another problem with these versions is that they seem to commit themselves to relying exclusively on a causal, physicalist vocabulary.
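The definition of a collective attitude given earlier in this section, with its conditions (a)–(c), has a nested structure of beliefs about beliefs that can be made explicit in a small program. The following Python sketch is offered only as an illustration under simplifying assumptions: persons are identified by name, and beliefs are stored as plain sets of tuples. The names `Person`, `first_order`, `second_order`, `has_collective_attitude`, and `fully_convinced` are hypothetical and do not come from the literature cited here.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """A person P with her own attitudes and her beliefs about others."""
    name: str
    # Attitudes P holds herself, used for condition (a).
    attitudes: set = field(default_factory=set)
    # (member, attitude): P believes that `member` has `attitude`, condition (b).
    first_order: set = field(default_factory=set)
    # (believer, member, attitude): P believes that `believer` believes
    # that `member` has `attitude`, condition (c).
    second_order: set = field(default_factory=set)


def has_collective_attitude(p, group, attitude):
    """Check conditions (a)-(c) for person p within group (a list of names).

    Assumption of this sketch: p.name is itself an element of group.
    """
    others = [m for m in group if m != p.name]
    cond_a = attitude in p.attitudes
    cond_b = all((m, attitude) in p.first_order for m in others)
    # For every other member M, P must believe that M ascribes the attitude
    # to every member other than M (which includes P herself).
    cond_c = all((m, n, attitude) in p.second_order
                 for m in others for n in group if n != m)
    return cond_a and cond_b and cond_c


def fully_convinced(name, group, attitude):
    """Hypothetical helper: build a person who satisfies (a)-(c) by construction."""
    others = [m for m in group if m != name]
    return Person(
        name=name,
        attitudes={attitude},
        first_order={(m, attitude) for m in others},
        second_order={(m, n, attitude)
                      for m in others for n in group if n != m},
    )


if __name__ == "__main__":
    group = ["ann", "bob", "cho"]
    ann = fully_convinced("ann", group, "A")
    print(has_collective_attitude(ann, group, "A"))
    # Drop one second-order belief: ann no longer believes that bob
    # believes that cho has the attitude, so condition (c) fails.
    ann.second_order.discard(("bob", "cho", "A"))
    print(has_collective_attitude(ann, group, "A"))
```

The sketch covers only the belief conditions (a)–(c). The additional requirements for a collective intention, concerning each person's specific contributing action, are left out, and nothing here models how such beliefs are formed or stabilized.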
While it certainly makes sense to examine whether a purely naturalistic account of epistemic practices might be fruitful, it does not follow from this strategy, even if it proves fruitful, that intentionalist strategies must be excluded. On the contrary, it is doubtful whether epistemic practices can be adequately described without also using intentionalistic and normative vocabulary. Finally, it is hard to see how traditional social constructivists can overcome Davidson’s point that, according to interpretation theory, a basic form of rationality is constitutive for mastering a natural language and must therefore be the same in all epistemic practices (Davidson 1973); obviously, this insight is inconsistent with the unqualified contingency claim about the development of scientific knowledge which is so central to traditional social constructivism.
See also: Foucault, Michel (1926–84); History of Science: Constructivist Perspectives; Interactionism: Symbolic; Scientific Knowledge, Sociology of; Technology, Social Construction of
Bibliography
Bachelard G 1937 Le nouvel esprit scientifique. Alcan, Paris
Bloor D 1991 Knowledge and Social Imagery. University of Chicago Press, Chicago
Bloor D 1996 Idealism and the Sociology of Knowledge. Social Studies of Science 26: 839–56
Canguilhem G 1966 Le normal et le pathologique. PUF, Paris
Davidson D 1973 Radical interpretation. Dialectica 27: 313–28
Foucault M 1969 L’archéologie du savoir. Gallimard, Paris
Hacking I 1990 The Taming of Chance. Cambridge University Press, Cambridge, UK
Knorr-Cetina K D 1981 The Manufacture of Knowledge: An Essay on the Constructivist and Contextual Nature of Science. Pergamon Press, Oxford
Knorr-Cetina K D (ed.) 1982 Science in Context. Sage, London
Kuhn T S 1962 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Latour B 1987 Science in Action: How to Follow Scientists and Engineers Through Society. Harvard University Press, Cambridge, MA
Latour B, Woolgar S 1986 Laboratory Life: The Construction of Scientific Facts. Princeton University Press, Princeton, NJ
Merton R K 1973 The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press, Chicago
Pickering A (ed.) 1992 Science as Practice and Culture. University of Chicago Press, Chicago
Quine W V 1969 Ontological Relativity and Other Essays. Columbia University Press, New York
Searle J R 1995 The Construction of Social Reality. The Free Press, New York
Tuomela R 1995 The Importance of Us: A Philosophical Study of Basic Social Notions. Stanford University Press, Stanford, CA
W. Detel
Social Democracy
The term ‘social democracy’ can refer to a form of societal organization, an ideology, a set of public policies, a political party or group of parties, or a broad social and political movement centered on such parties and their allies within different sectors of society. The meaning of ‘social democracy’ has varied enormously since its original adoption by a small, vaguely-socialist group within the left wing of the French democratic-republican opposition prior to the Revolutions of 1848. At various times and places, the social-democratic label has been adopted by revolutionary Marxist, democratic-socialist, liberal/center-left, and even conservative-authoritarian political parties and tendencies, usually but not always with closer or more distant ties to blue-collar labor unions.
Most commonly, the term ‘social democracy’ is used either in connection with political parties that have belonged to the contemporary Socialist International or its predecessor, the Second International of 1889–1914, or with reference to policies and ideologies close to such parties. As parties calling themselves social-democratic have grown steadily more moderate over time, gradually becoming less and less anticapitalist and antiliberal, usage of the term ‘social democracy’ has evolved in a similar fashion. At the turn of the twenty-first century, ‘social democracy’ generally refers either to the programs and policies of the largely center-left political parties that belong to the Socialist International, or to a vague set of essentially liberal, center-left ideological and policy orientations that are often associated with these parties. While specific policies may vary, such center-left ‘social-democratic’ approaches share a commitment to a relatively free-market-based capitalist economy, a fairly developed welfare-state, a liberal-democratic (‘polyarchic’) political system, and a relatively internationalist approach to questions of foreign policy, security, and globalization. All over the world, parties and ideologies of this type are currently among the most important in the vast majority of countries with relatively competitive elections. The term ‘social democracy’ is rarely any longer used as a self-description by Marxist or other anticapitalist and antiliberal tendencies.
1. The History and Significance of Social Democracy
Ever since the Revolutions of 1848, which are generally considered to mark the birth of the social-democratic movement, the history of social democracy has been characterized by a profound ideological and tactical struggle between relatively radical, anticapitalist tendencies and relatively moderate, reformist orientations. In this dialectical contest, the radicals have generally adhered to some form of State socialism, usually of a Marxist (but anti-Leninist) variety, and/or some version of decentralized and radical-democratic cooperative socialism (‘self-management,’ ‘economic democracy’); the comparatively moderate reformists, by contrast, have increasingly made their peace with multiclass (‘catch-all’) parliamentary politics, liberal ideology, and the capitalist system. Over the decades, this internal struggle has interacted with the often complex relationships of social-democratic movements with other political, social, and economic forces within particular societies to give rise to a wide variety of more or less ‘socialist’ ideological visions and public policies, often very effective and/or highly original, which have directly or indirectly exerted a tremendous influence on virtually all aspects of political, social, economic, and intellectual life in the modern world, particularly in countries with relatively capitalist economic systems and relatively liberal-democratic political systems. Although social-democratic movements and governments have frequently been frustrated in their reformist efforts by powerful domestic or international pressures, the resulting compromises and modifications have often marked significant new developments in the history of their respective countries, and of the world as a whole. Indeed, even when confined to the ranks of the political opposition in a particular country, social democrats have sometimes played a major role in bringing about political, social, or economic reforms, by means of political pressure, foreign example, or intellectual influence.
Among the most important of these social-democratic contributions to the development of the ‘welfare-capitalist’ synthesis that has gradually come to dominate an increasing portion of the world since the 1880s, have been the following: (a) the striking (albeit gradual) democratization and humanization of political, social, and economic life; (b) the enormous growth and modernization of both the state and the economy, particularly in the realms of social policy, industrial policy, macroeconomic policy, industrial and labor relations (the much-celebrated ‘democratic corporatism’ of various Northern European countries, now largely defunct in a more market-oriented age), and recently also environmental policy; (c) the more or less successful promotion of social harmony and ideological consensus within previously much more polarized societies; and (d) strong support for international integration and world peace (although the temptation to join in nationalistic ventures, on the one hand, or in pacifistic crusades, on the other hand, has been a frequent source of tension within the social-democratic movement ever since its inception, most notably during the First World War).
Although self-styled ‘social-democratic’ parties and ideas have played a leading role in the history of an increasing number of countries around the world since the 1880s, the concept of ‘social democracy’ is generally couched in terms based on the experience of Northern European Social Democratic, Socialist, and Labor parties (particularly those of Sweden and the other Scandinavian countries, as well as Germany, Austria, Great Britain, the Netherlands, and Belgium), because they have usually been the leading forces associated with the ‘social-democratic’ label, whether in terms of ideological and tactical development, policy-innovation, electoral support, governmental influence, or international prestige. To the extent that world history since 1848 has been the story of the grand struggle between capitalism and socialism, social-democratic movements and ideas have often played a decisive role in finding a more or less successful middle-ground between the two (frequently under the slogan of some sort of ‘Third Way’ between unbridled capitalism and state-socialism), particularly in Sweden, Norway, and the other countries of Northern Europe. And to the extent that world history during the same period has been the story of the grand struggle between democracy and
dictatorship, social-democratic movements and ideas have contributed greatly to both sides: on the one hand, social democracy spawned radical, antiliberal offshoots that quickly evolved into the rival totalitarian movements of communism and fascism during and after the First World War; on the other hand, social democrats have often played a decisive role in resisting and defeating these totalitarian movements, as well as other antidemocratic political, social, economic, and intellectual forces.
From 1945 onwards, social-democratic parties all over the world have grown steadily more dependent on the success of the welfare-capitalist social-market economy and the support of middle-class voters and relatively privileged elites in both the public and private sectors. This has been the result of two major interrelated processes: first, the continuation of these parties’ aforementioned gradual movement toward the liberal center of the ideological spectrum; and second, the search for new bases of support and new reasons for existence under the impact of the gradual economic and cultural disappearance and disaffection of these parties’ traditional blue-collar industrial (‘proletarian’) base. The first of these processes, that of increasing ideological moderation, has largely been due to the combination of mass prosperity, the rise of the welfare-state, profound political and social democratization, the failure of Marxist ideas and State socialist regimes (‘real-existing socialism’), the spread of international peace and integration, American political and cultural hegemony, and the all-but-inevitable habituation of social-democratic elites to existing power-structures.
The second process, namely the disappearance and disaffection of social democracy’s old blue-collar base, has stemmed primarily from the combination of four interrelated developments that are associated with relentless economic and social modernization: first, much greater levels of social and economic mobility and opportunity, partly as a result of successful social-democratic policies; second, the accelerated disintegration and dissolution of inherited social and cultural solidarities and cleavages, including class-consciousness (‘individualization,’ ‘desegregation,’ ‘depillarization’); third, the drastic weakening of inherited attitudes, beliefs, and values (‘postmodernization,’ the rise of ‘postmaterialist’ or ‘left-libertarian’ values), particularly among highly-educated elites; and fourth, the transition from a predominantly ‘Fordist’ socioeconomic system based on mass production in ‘smokestack’ industries, to an increasingly ‘postindustrial’ economy based on much more sophisticated technology, a much older, more highly-skilled, and more highly-educated labor force, a burgeoning service sector, much more widespread share-ownership, much higher levels of flexibility in all areas of economic and social life, much greater concern for environmental protection, and much higher levels of international interdependence, integration, and homogenization (‘globalization’).
2. The Challenges Facing Social Democracy Today
In the twentieth century, social-democratic parties and elites increasingly staked their claim to rule less on utopian visions of ‘socialism’ than on their ability to manage and modernize the welfare-capitalist social-market economy more fairly, more humanely, more democratically, and, above all, more competently and more efficiently, than their rivals on the right and left. Yet, although egalitarian and statist alternatives to free markets and a fairly rapacious capitalism were in the year 2000 weaker and more uncertain than ever before, challenges from the populist and often vehemently antiliberal and antimodern right, which since the 1980s has increasingly drawn the support of traditionally social-democratic (or communist or Christian Democratic) working-class voters, are often close at hand. Thus social democracy is faced with the task of reconciling the interests and aspirations of an economically and culturally ever-more diverse potential constituency, a task which may prove extremely difficult once the worldwide economic boom at the turn of the twenty-first century dissipates. The danger, then, is that social democracy may become caught between the conflicting demands of relatively successful or privileged and relatively unsuccessful or underprivileged sectors within increasingly unequal and anomic postindustrial societies, with the attendant risk that important groups within either or both sectors may gravitate toward antiliberal and antimodern movements on the left or, more likely, the right.
See also: Industrial Policy; Liberalism; Liberalism: Historical Aspects; Political Parties, History of; Socialism; Socialism: Historical Aspects; Welfare State
T. P. Harsanyi
Social Demography: Role of Inequality and Social Differentiation

The description and analysis of inequality and social differentiation are fundamental components of social demography. In this context, a socially differentiated population is one characterized by a variety of subgroups that are distinct according to generally persistent social criteria such as race, ethnicity, religion, language, and so on. Inequality, on the other hand, is a hierarchical concept and is used here to refer to differences in the actual possession of resources of various types or in access to such resources. Resources take the form of political power or influence, income, schooling, health care services, etc. Differentiated groups may also be extremely unequal, but social difference need not imply inequality. This article briefly reviews several of the reasons why demographers study differentiation and inequality and then goes on to discuss how four different theoretical perspectives have been applied to the problem. The first emphasizes cultural explanations; competing somewhat with it are resource-based explanations which focus particularly on differences in opportunities and constraints. Cutting across both of these are reference group theory and life-course analysis.
1. Introduction

A perennial demographic question is the extent to which diverse or unequal groups exhibit marked differences in demographic behavior. Are there substantial race, ethnic, or religious differences in marital or fertility behavior, in mortality, or in patterns of migration? One reason it is important to study such differentials is that they indicate the extent to which the overall demographic performance of a population is a function of subgroup diversity. How big an impact a subgroup’s behavior can have on overall population measures depends on the distinctiveness of that group’s behavior weighted by its relative size. Although size counts for much, if the group’s behavior is unusual enough, it can still have a substantial impact on overall measures. For example, using never married women, aged 30–34, as a rough indicator of marriage delays in the US, 22 percent of all women of that age were never married in 1998. However, the proportion was much higher for African American than white women: 47 as opposed to 17 percent. The differences were so large that even though African American women were only 14 percent of all women in this age group while whites were 80 percent, the proportion never married was pulled up by five points (almost 30 percent) for all women as compared to whites (US Bureau of the Census 1998). Documenting subgroup differences in demographic behavior also provides important insights into the demographic concomitants of inequality and a socially differentiated population. It also poses analytical questions about the underlying determinants of these differences since group membership is usually an indicator of a variety of more proximate causes of an observed demographic pattern, e.g., the role of unequal access to medical care in ethnic group differences in mortality. A classic example in social demography is the influence of marriage timing on fertility in early industrializing populations. Given women’s circumscribed reproductive period, delayed marriage in a noncontracepting population typically leads to smaller completed family size, while an earlier marriage promotes higher fertility.
Thus, in rural areas in many parts of Western Europe, where inheritance of the farm was a precondition for marriage, marriages were often greatly delayed. On the other hand, factory workers in the cities usually married much earlier as their economic fortunes were tied to the developing wage economy rather than the traditional inheritance system (Anderson 1971).
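The never-married figures quoted in the introduction can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name and variable names are hypothetical, the inputs are the rounded percentages given in the text, and the decomposition uses the identity that the gap between the overall rate and a reference group's rate equals the sum, over groups, of each group's population share times its difference from the reference rate.

```python
def gap_decomposition(overall, reference_rate, share, rate):
    """Split the gap between an overall rate and a reference group's rate.

    overall        -- rate for the whole population (percent)
    reference_rate -- rate for the reference group (percent)
    share          -- population share of the subgroup of interest (0..1)
    rate           -- rate for the subgroup of interest (percent)
    """
    gap = overall - reference_rate            # absolute gap, percentage points
    relative_gap = gap / reference_rate       # gap relative to the reference rate
    contribution = share * (rate - reference_rate)  # points due to this subgroup
    return gap, relative_gap, contribution


# Rounded figures from the text: women aged 30-34, US, 1998.
gap, relative_gap, contribution = gap_decomposition(
    overall=22.0,         # percent never married, all women
    reference_rate=17.0,  # percent never married, white women
    share=0.14,           # African American women as a share of the age group
    rate=47.0,            # percent never married, African American women
)

print(f"gap: {gap:.1f} points ({relative_gap:.0%} relative to the white rate)")
print(f"contribution of African American women: {contribution:.1f} points")
```

With these rounded inputs, a group making up 14 percent of the population accounts for most of the five-point gap; the remainder reflects the roughly 6 percent of women in neither group, together with rounding in the published percentages.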
2. Varying Theoretical Perspectives

The remainder of this article will focus on the impact of inequality and social differentiation on demographic behavior by examining four theoretical approaches to the problem. Together these underlie most research in social demography; hence the discussion will begin with a brief introduction to the general thrust of each perspective before applying them to issues of social differentiation and inequality. Two distinct and often competing theoretical approaches have been particularly influential in social demography: cultural and resource explanations of demographic behavior. Cross-cutting them are two additional perspectives: reference group theory and life-course analysis.
2.1 Socioeconomic Constraints and Opportunities

Resource-based explanations focus on how differences in resources affect demographic behavior, e.g., differences in social or economic constraints and opportunities. Thus, a major feature of the Theory of the Demographic Transition is the argument that economic development raises the cost of children and reduces their economic value to families, thereby promoting the transition from the relatively high fertility characteristic of most preindustrial societies to the typically low fertility of industrialized societies (see Fertility Transition: Economic Explanations). By focusing on differences in resources, this approach also lends itself well to analysis of the determinants and consequences of inequality. In the resource-based approach, differences or changes in goals or preferences among individuals or groups are frequently not examined in any great detail. In fact, neoclassical economists have typically argued that preferences are fixed across time and do not systematically vary among individuals, thereby permitting changes in demographic behavior to be unambiguously attributed to changes in constraints. However, for most social demographers, such an assumption is generally too strong and variations in preferences are usually considered an important, if often understudied, source of differences in people’s behavior.
2.2 Cultural Factors

A second approach views ‘culture’ as the most important factor influencing demographic behavior. The cultural factors considered are highly variable, ranging from the narrowly to the broadly conceived: from norms regarding the permissibility of divorce or the number of children a couple should raise all the way to more complex and sometimes highly abstract constructs such as religious or basic value systems. For example, it has been argued that both earlier and more recent European fertility transitions are largely the result of a major change in the ‘ideational’ system marked by the growth of secular individualism among other value changes. While some recognition is given to the influence of economic factors, their importance is often minimized by scholars operating in this tradition (see Fertility Transition: Cultural Explanations; Demographic Transition, Second; Family Theory: Role of Changing Values). The cultural approach tends to emphasize differences in goals and values and hence seems more oriented towards explaining socially differentiated behavior than behavioral differences based on an inequality of resources. However, it is often recognized that certain cultural patterns arise out of conditions of inequality and may even contribute to its persistence (Parsons and Goldin 1989).
2.3 Reference Group Models

Cutting across these general perspectives are two additional approaches that have been influential in explanations of demographic behavior. The first is the use of reference group models. The idea here is that people look to selected others whose behavior or characteristics provide positive, or sometimes negative, models for themselves. How this approach is implemented partly depends on whether the researcher’s primary orientation is cultural or resource based. If cultural, the emphasis is more likely to be on reference groups as normative models to emulate. If the investigator is more resource oriented, the emphasis is likely to be on relative economic status concerns (see Social Networks and Fertility; Human Ecology: Insights on Demographic Behavior).
2.4 Life Course Approach

A second influential cross-cutting perspective has been the lifecycle/life course approach. Lifecycle analysis is the pioneering version of this approach and focuses particularly on age-related characteristics and behavior. It can be applied either to an individual (an individual’s career or family cycle, for example) or to a group (the family’s developmental cycle). Because of the scarcity of life history data, lifecycle analysis has often had to restrict itself to age patterns measured at one or more points in time (using cross-sectional data on the behavior of different age groups, i.e., ‘synthetic’ cohorts) rather than being able to follow individual life histories over real time. The later developing life course perspective adds a particular emphasis on such life histories, although it too has sometimes had to settle for cross-sectional data. The availability of longitudinal data has increased in recent years, permitting more extensive analyses of life histories. Some life course researchers have also placed great emphasis on normative issues, a major one being that there are norms governing the sequencing of life course transitions, for example a norm against marrying before completing school (Elder 1977, Hogan 1978).
While all these perspectives, singly and in combination, have been generally important in demography, the discussion below will focus on their use in analyses of the role of socioeconomic inequality and differentiation in demographic behavior. Although often viewed as competing, neither the resource nor cultural perspective alone can do an entirely satisfactory job of explaining patterns or changes in demographic behavior. Both approaches profit by incorporating elements from the other.
3. Cultural Differences and Demographic Behavior

Probably the most extensive research project which came to espouse a cultural explanation of demographic behavior was the Princeton European Fertility Project which examined the transition from high to low fertility occurring over much of late nineteenth and early twentieth century Europe. By using provincial-level data, the project set out to test several propositions of Demographic Transition Theory, one of which was that fertility declines were a consequence of economic development. A major conclusion of this body of research was that the observed fertility declines appeared to be only weakly related to conventional measures of socioeconomic development, such as the proportions in a province who were literate, employed in agriculture, lived in large cities, etc. In fact, in most European countries the fertility transition appeared to have begun at nearly the same time, despite considerable geographical variability in economic and social conditions. Hence, this too seemed to be a strong indication that the fertility transition was not an adaptive response to the changing conditions brought about by economic development. However, areas which shared similar cultural characteristics, such as language and religion, exhibited different rates of fertility change compared to areas at a similar economic level but with different cultural characteristics (Coale and Watkins 1986). To explain these findings, an innovation and diffusion hypothesis was invoked, involving elements of reference group as well as cultural theories. The argument is that the deliberate control of completed marital fertility was generally unknown in Europe throughout most of its history although large families were not highly desired.
However, late in the nineteenth century certain influential opinion leaders in local areas not only became innovators in the use of contraception but also in promoting the view that family size could and should be legitimately controlled. Their views and practices provided a behavioral model and were, accordingly, rapidly adopted by others. Such changes were more likely to occur within cultural groups, defined, for example, by language and religion, partly because of the ease of within-group communication. In addition, shared norms and values increased
the influence innovators were likely to have within the group, thereby fostering a rapid diffusion of innovations. Moreover, some cultural groups were more open to change than others. In Belgium, for example, traditional Catholic doctrines were much more powerful in the Flemish-speaking areas than in the French which were more secularized. Throughout much of Europe, France being a notable exception, fertility changes were more rapid in Protestant than Catholic areas (see Fertility Transition: Cultural Explanations). The Princeton Project’s contention that the European fertility transition was only weakly related to economic development has been subject to a variety of criticisms. The argument heavily depends on two empirical generalizations—first, that the adoption of fertility control was an innovative behavior and, second, that the fertility transition began at roughly the same time over most of Europe despite considerable regional variations in the pace of industrialization. However, because the kind of data collected during this historical period were severely limited, indirect statistical measures had to be developed as indicators of these behavioral patterns. Recent work on the validity of these measures, some by an original contributor to one of the measures (Trussell), indicates that they will not reliably detect the start of the transition until it is well underway and hence arguments about the simultaneity of transitions all over Europe, regardless of the level of economic development, are not reliable. Moreover, the statistical measure of the proportion of couples who were fertility controllers requires a fairly large number of them in order to be detected by the measure, casting doubt on the ability to determine whether fertility control was actually an innovation at the times indicated (Guinnane et al. 1994). 
In addition, recent analyses of German data, employing more sophisticated multivariate statistical methods than typically used in the Princeton project, have found that indicators of economic development were strongly related to declines in fertility during this period; however, although a cultural factor, such as religion, was important in predicting fertility levels, it did not have an effect on changes in fertility (Galloway et al. 1998). While there is increasing doubt that cultural factors alone can explain the European fertility transition, the continuing social policy goal of reducing fertility in currently developing societies has kept alive a strong interest in more rigorously developing an innovation and diffusion model. However, recent work in this area has placed a greater emphasis on the value of the process as a mechanism for explaining changing fertility patterns in conjunction with changing socioeconomic constraints rather than instead of them (Montgomery and Casterline 1996). Moreover, in contrast to the provincial-level data the Princeton project was necessarily limited to, detailed microlevel research involving network analysis is being carried
out in several less developed countries to determine the particular social interaction and networking mechanisms involved in promoting the diffusion of contraceptive use and declines in fertility (see Social Networks and Fertility; Human Ecology: Insights on Demographic Behavior). In general, although there is little doubt regarding the importance of investigating the effect of cultural differences on demographic behavior, in practice it is not always so easy to pin down the causal role of cultural factors. The easiest but probably also the least sophisticated approach is to assume that if there is a simple correlation observed this constitutes reliable evidence of a causal relationship between demographic behavior and membership in socially differentiated groups such as those defined by religion, ethnicity, social class, metropolitan residence, etc. Sometimes, the existence of cultural determinants is deduced as the residual explanation when other variables, such as socioeconomic ones, do not explain much of the between-group differences. However, to do so makes the unrealistic assumption that all other relevant variables are included in the analysis and are also properly measured. Moreover, given the preliminary nature of some of the theories under investigation as well as the uneven quality of demographic data and weaknesses in current measurement capabilities, attaching specific meanings to unexplained variance is always a risky procedure. Another problem involved in opting too quickly for purely cultural explanations is that cultural identities are usually related to economic characteristics. Hence multivariate analyses are essential for separating out the role of economic constraints from cultural characteristics. Even then, the nature of the causal process is not always clear.
For example, education is more highly stressed among Chinese than Latin American immigrant populations in the United States, suggesting a causal sequence between ethnicity, educational attainment, and socioeconomic status rather than one of these factors alone dominating the process. This may indicate a culturally related higher value placed on education among the Chinese (because of a Confucian tradition, for example), but it may also partly reflect differences in the socioeconomic origins of the two immigrant groups, one having a greater representation of professionals while the other comes from a much poorer socioeconomic background. Hence, sorting out the nature of the causal process is often complex and the data needed to accomplish this are not always readily available. The magnitude of the effect of cultural differentiation on demographic behavior can also be quite different depending on whether one is discussing levels or changes in behavior. For example, religion has historically had a consistently sizable impact on fertility in the cross-section but if the religious composition of an area does not change markedly over time, and the effect of any given religious identification
remains constant, religion’s role in changes in demographic behavior may be minimal. However, if different cultural groups vary in their openness to change introduced from the outside, as is sometimes argued, then cultural differences could indeed have a substantial impact on change.
4. Socioeconomic Constraints and Opportunities

Social scientists who primarily focus on the role of resources and constraints in demographic behavior have often ignored how institutional and normative commitments may constrain individual choices, both in terms of affecting preferences and also by limiting what are considered the legitimate means for realizing these preferences. In this respect, neoclassical economists have often chastised sociologists for their apparent willingness to posit a plethora of attitudes and norms to explain every observed behavior, thereby trivializing the analytical process. Nevertheless, it is possible to accept the idea of systematic variations and changes in preferences as a reasonable analytical tool and to apply it in a disciplined fashion in conjunction with resource constraints, that is, without positing a different or changing norm or value to explain every observed variation in demographic behavior. Hence the view that one must choose either a cultural approach or a resource constraint approach is unnecessarily confining intellectually.
4.1 Relative Economic Status Models

A major example of combining varying lifestyle aspirations with varying resource constraints to explain demographic behavior is found in Richard Easterlin’s work on relative cohort size in the US. Easterlin hypothesized that intergenerational shifts in relative cohort size produce intergenerational inequalities in people’s economic position which, in turn, affect age at marriage, fertility, and a number of other demographic responses. In addition, he maintained that lifestyle aspirations were learned in the parental household so that children raised in affluent households would develop higher-level consumption goals than those raised in poorer households. He combined these two views by arguing that cohorts who were entering young adulthood in the early postwar period were small due to low fertility during the Great Depression and were also raised in less affluent households, resulting in relatively modest consumption aspirations. However, because of this depression cohort’s small size, they were in a very favorable labor market position as adults which encouraged early marriage and childbearing, as well as larger families; in short, they produced the postwar baby boom. The baby boomers, in turn, were raised in these affluent households and acquired high consumption aspirations; however, because of their large size, their labor-market position was not as good as that of their parents and this led to delayed marriage and lower fertility, resulting in the baby bust. In short, the Easterlin hypothesis posited that a feedback effect was produced by this process and created self-generating demographic cycles. Due to intergenerational shifts in relative economic status and consumption aspirations, small cohorts begat large ones which, in turn, produced small cohorts (Easterlin 1987). Empirically, the status of Easterlin’s theory has been criticized on a number of counts (see Baby Booms and Baby Busts in the Twentieth Century). A major one is that the postwar demographic changes were primarily driven by ‘period’ rather than ‘cohort’ factors. For example, annual birth, marriage, and divorce rates have tended to fluctuate together among larger and smaller cohorts alike, suggesting that there was something about particular historical periods that influenced behavior rather than cohort size. Nevertheless, the relative economic status aspect of the model remains interesting because the existence of inequality can provide a motivation for adaptive behavioral changes to occur; thus it is an intrinsically dynamic type of model. Easterlin’s hypothesis focused only on intergenerational inequalities; however, reference group and relative economic status models can also be applied within generations. One example comes from the rapid postwar rise in US married women’s paid employment so that by 1990 around 70 percent of wives, aged 25–54, were in the labor force compared to less than 25 percent in 1950. There is little evidence that this transformation was precipitated by major changes in norms or values.
While opinion polls and other attitudinal studies indicate that approval of married women’s employment became increasingly positive with time, attitudinal changes did not precede the changes in employment behavior but rather lagged behind them. A far more influential factor appears to have been the postwar economic expansion which greatly increased job opportunities for married women because the demand for women workers in female dominated occupations far exceeded the size of the population group that had traditionally been the backbone of the female labor force: young single women (see Oppenheimer 1970). In addition to substantial changes in labor-market conditions, a type of innovation and diffusion model combined with a relative economic status approach might further clarify why the employment shift has taken place so rapidly. The rationale is that the rise in married women’s employment changes how families are stratified by reducing some previous socioeconomic inequalities and creating others. When married women’s employment outside the home was relatively rare, a family’s socioeconomic position was primarily dependent on the husband’s economic
status, although earlier in the twentieth century the work of the family’s adolescent and young adult children was also important at certain periods in the family’s developmental cycle. As wives’ employment became more common in the postwar period, however, many families within the same socioeconomic group, but without working wives, could increasingly be put at a relative economic disadvantage, thereby generating inequalities within groups which had previously been at roughly the same socioeconomic level, based on husbands’ economic position alone. On the other hand, wives’ employment can reduce the relative economic advantage of some families who might previously have been at a somewhat higher socioeconomic level. Hence, wives’ employment in some families but not others creates both an objective and subjective instability in how people experience the stratification system. But it also creates a model of how to deal with these instabilities—the entry of other wives into the labor force. Wives already working also provide a source of information on the nature of the job market for others contemplating entering the labor force, as well as providing models of strategies for how wives and mothers could cope with work and family responsibilities. The impact of this process should, if anything, have increased over time. As long as there were few wives employed, their work would have relatively little impact on both intra- and intergroup reference comparisons and the easier it would be to dismiss their behavior as ‘deviant’ or the ‘exception that proves the rule.’ However, as the proportion of wives working rose and became increasingly full-time, then their example would become more salient, thereby increasing its destabilizing effect on reference group comparisons. 
Moreover, the more wives who worked, the fewer nonworking women there would be around to socialize with, share childcare responsibilities with, etc., hence, the lifestyles as well as the relative socioeconomic position of families with nonworking wives would be increasingly affected. All this should have contributed to the continued rapid expansion of married women’s employment in the postwar era in addition to any changes in labor market conditions (Oppenheimer 1982). Although not easy to operationalize, due to data limitations, relative economic status models have attractive theoretical features for analyzing change. They posit changes in lifestyle aspirations which affect behavior but which nevertheless remain embedded in a stable preference structure—maintaining one’s socioeconomic position in the community. They also involve feedback effects that foster a continuation of the diffusion process whose momentum may even increase rather than peter out over time. Furthermore, the existence of feedback effects means that not all those changing their behavior need experience the same initial motivating factors that the earlier innovators were exposed to; whatever its original cause,
the behavior of innovators itself creates an additional set of pressures that influences others, in turn, to shift their behavior and attitudes as well. Nor need such an amalgam of a relative economic status model with an innovation and diffusion model operate independently of social norms. For example, in the American case, although there continued to be a general disapproval of married women’s employment long after it had already substantially increased, even early on in the process there was evidence of a considerable flexibility in attitudes, depending on the circumstances. Thus attitudes were very negative in the 1940s if it was specified there was only a limited number of jobs but much more permissive if a wife worked in order to facilitate a marriage (Oppenheimer 1970). Hence, the conditionality of many norms permits a certain amount of innovative behavior which may ultimately modify the normative structure more generally. In this case, changing socioeconomic conditions encouraged an increase in wives’ employment which was facilitated by norms viewing working to achieve family goals as a legitimate exception to the general disapproval of married women’s employment. The ultimate result, however, was not only a permanent change in employment behavior but a radical reduction in the general disapproval of wives’ working as well.
4.2 Socioeconomic Inequalities and the Life Course

While cross-sectional analyses are major sources of information on socioeconomic inequality and its demographic concomitants, they often miss an important dimension of inequality: the variation in socioeconomic well-being over the life course and how this may systematically differ among socioeconomic groups as well as change over time. However, there is a substantial body of work, primarily in historical demography, that directly addresses these problems within a life course context. An inescapable fact of human biology is that individuals’ economic productivity varies over the lifecycle. For example, very young children and the very old have few productive capabilities. The nature of the economic system also influences these variations and, along with it, the individual’s earnings position. If work involves considerable physical effort or health risks, a reduction in productivity may happen at a relatively young age so that the most economically productive part of the career cycle occurs fairly early in adulthood. On the other hand, if skill is the major job characteristic required, relatively high productivity may last well into old age; however, in this case, lengthy training requirements during an individual’s youth are likely to reduce productivity then. Research on middle-class English, European, and American families and on the working classes around the turn of the nineteenth century not only reveals the substantial
general economic inequality between these groups but the considerable differences in how relative affluence or deprivation was experienced over the life course, along with the resulting differences in demographic responses. In the case of the middle classes, while earnings might ultimately be substantial, early in the career cycle they were usually inadequate to permit a young man to marry and set up a household in the style considered appropriate to his social station in life. The common pattern was therefore for men to delay marriage until they were economically established and had proved themselves capable of ‘properly’ supporting a wife and family (Banks 1954, Alter 1988). In contrast to the middle classes, the career-cycle pattern of earnings was far different for industrial workers, both in Britain and many other nineteenth century industrializing populations. In addition to their generally much lower earnings, the lifecycle period where earnings were greatest and most assured was in young adulthood rather than middle age; hard physical labor, the risk of debilitating disease or accidental maiming or death in the far more dangerous work settings of that era all raised the risk that the middle and older years would be ones of declining or lost productivity. Several adaptive strategies seemed to have been extremely common in such populations. One was a major reliance on the earnings of children to help maintain or improve the family’s economic stability over its developmental cycle. Another was an early timing to the start of both marriage and childbearing, thus placing these transitions during the period when working class men’s earnings were at their greatest; in addition, this increased the likelihood that at least some children would be old enough to go out to work and contribute to the family’s income as the father aged and entered into a higher risk period. 
However, the downside to this strategy was a frequent pattern of life-cycle poverty or, at the very least, of relative deprivation. This occurred, for example, when there were several children in the household but none yet old enough to go out and work, a problem that was particularly acute if the earnings of the father were lost or much reduced. A different kind of life-cycle poverty typically occurred after all the children had left home and the parents were too old to engage in extensive employment. As a consequence of this life-cycle poverty pattern, the economic situation of individuals at any single point in time was a poor indicator of the frequency with which poverty or relative economic deprivation might be experienced over their entire life course (Anderson 1971, Haines 1979, Rowntree 1922). While such strategies might well be characterized as rational adaptations to the life course circumstances faced by many working class families, they were also cultural in nature. Historical research indicates that they had a major normative component; children were placed under a strong obligation to work and help improve the family’s economic well-being and they
were often allowed to keep very little of their earnings (Parsons and Goldin 1989). Although their work might also improve their own well-being while in the household, such a pattern could nevertheless work against children’s long-term self-interest because it tended to reduce the amount of schooling they received and hence their adult socioeconomic status. Moreover, delaying or even foregoing marriage was not entirely uncommon because of the obligations adult children had to help support the family and/or help care for younger siblings, especially if one or both parents were lost (Hareven 1982). Work on contemporary populations shows that different socioeconomic groups have also been experiencing quite contrasting career development patterns and that this can have significant demographic consequences. For example, consistent with their traditionally important and normatively prescribed economic role in the family, it was still true in the 1980s that American men were much less likely to marry while in an unstable labor market position with low earnings. How large a cumulative impact this had on their marriage timing, however, depended on the length of time a young man remained in such an ‘immature’ career state, and this varied considerably among racial and schooling groups. Compared to men with one or more years of college, a much higher proportion of high school graduates, as well as dropouts, were experiencing a far more difficult and prolonged transition to a relatively stable career position, leading to substantial educational differences in marriage timing. African American males were having particular difficulties and their marriages were much more delayed than those of whites, as indicated by the never married figures cited earlier.
These findings also suggest that, given the considerable rise in economic inequality in the US starting in the mid-1970s, increasing career-entry difficulties may have played a significant role in the overall rise in the age at marriage, not only for those reaching adulthood during the 1980s but for earlier and later cohorts as well (Oppenheimer et al. 1997; see Family Theory: Competing Perspectives in Social Demography).
See also: Differentiation: Social; Equality and Inequality: Legal Aspects; Inequality; Inequality: Comparative Aspects; Resource Geography
Bibliography
Alter G 1988 Family and the Female Life Course. University of Wisconsin Press, Madison, WI
Anderson M 1971 Family Structure in Nineteenth Century Lancashire. Cambridge University Press, Cambridge, UK
Banks J A 1954 Prosperity and Parenthood. Routledge & K. Paul, London
Coale A J, Watkins S C (eds.) 1986 The Decline of Fertility in Europe. Princeton University Press, Princeton, NJ
Easterlin R A 1987 Birth and Fortune. University of Chicago Press, Chicago
Elder G H 1977 Family history and the life course. Journal of Family History 2: 279–304
Galloway P R, Lee R D, Hammel E A 1998 Urban versus rural: fertility decline in the cities and rural districts of Prussia: 1875 to 1910. European Journal of Population 14: 209–64
Guinnane T W, Okun B S, Trussell J 1994 What do we know about the timing of fertility transitions in Europe? Demography 31: 1–20
Haines M 1979 Industrial work and the family life cycle, 1889–1890. Research in Economic History 4: 449–95
Hareven T K 1982 Family Time and Industrial Time. Cambridge University Press, Cambridge, UK
Hogan D P 1978 The variable order of events in the life course. American Sociological Review 43: 573–86
Montgomery M R, Casterline J B 1996 Social learning, social influence, and new models of fertility. Population and Development Review 22 (Suppl. 5): 151–75
Oppenheimer V K 1970 The Female Labor Force in the United States. University of California, Berkeley, CA
Oppenheimer V K 1982 Work and the Family: A Study in Social Demography. Academic Press, New York
Oppenheimer V K, Kalmijn M, Lim N 1997 Men’s career development and marriage timing during a period of rising inequality. Demography 34: 311–30
Parsons D O, Goldin C 1989 Parental altruism and self-interest: child labor among late nineteenth-century American families. Economic Inquiry 27: 637–59
Rowntree B S 1922 Poverty: A Study of Town Life, new edn. Longmans, Green, London
US Bureau of the Census 1998 Marital Status and Living Arrangements: March 1998. Current Population Reports, P20-514.
V. K. Oppenheimer
Social Dilemmas, Psychology of
The functioning of societies, groups, and relationships is strongly challenged by social dilemmas, situations in which private interests are at odds with collective interests. Social dilemmas are potentially disruptive for the functioning of collectives because the well-being of these larger units is threatened when most or all individuals act in a self-serving manner. For example, during periods of water shortage, it is tempting not to worry too much about how much water to use; but if most people exercise no restraint, this may create severe societal problems. Similarly, meetings at work would be considerably more productive if all people invested sufficient time and effort in preparing for them. Close partners, too, face social dilemmas, such as whether or not one should spend less time with one’s independent friends. Social dilemmas are so pervasive in everyday life that one can go so far as to claim that the most challenging task governments, organizations, and even partners in a relationship face is to solve social dilemmas successfully. After a brief discussion of the history of social dilemmas, as well as their methodological underpinnings, the basic question of how cooperation between people can be promoted will be addressed.
1. Social Dilemmas: An Historical and Methodological Perspective
The intellectual parents of social dilemmas can be found in early formulations of game theory concerned with the strategic analysis of conflicts of interests (e.g., Luce and Raiffa 1957). Game theory captures decisions in terms of experimental games, that is, situations in which each of the participants has to choose one of two (or more) alternatives, which affect one’s own outcomes as well as the other’s outcomes. The most famous game following from this work is the Prisoner’s Dilemma Game, which represents a true conflict of private interests and collective interests. The prisoner’s dilemma derives its name from an anecdote about two prisoners who were accused of robbing a bank. The district attorney, unable to prove that they were guilty, confronted the prisoners with two options: either confess to the crime (option D), or not confess to it (option C). Paradoxically, option D stands for the Defecting option and option C for the Cooperative option, for reasons that become apparent from the consequences these choices had for both prisoners. If both confess, each will receive a five-year sentence, whereas if neither confesses each of them receives a one-month sentence. Thus, each prisoner would be better off when neither confesses than when both confess. However, the decision is complicated by the fact that each prisoner has a clear temptation to confess. If one confesses and the other does not, then the prisoner who does not confess will receive a 10-year sentence, while the other will be set free. Thus, in a prisoner’s dilemma, two individuals are interdependent in that each person is better off by behaving noncooperatively, regardless of the other’s behavior (individual rationality). At the same time, both individuals are better off when both behave cooperatively, rather than when both behave noncooperatively (collective rationality).
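The two defining properties of the prisoner's dilemma can be checked directly against the sentences given in the anecdote. The following is a minimal sketch; the table layout and the function names are illustrative assumptions, and sentences are expressed in months (five years = 60, ten years = 120).

```python
# Sentences in months for (prisoner A, prisoner B), taken from the anecdote.
# "C" = not confess (cooperate), "D" = confess (defect).
# The dictionary layout and function names are illustrative assumptions.
SENTENCES = {
    ("C", "C"): (1, 1),      # neither confesses: one month each
    ("C", "D"): (120, 0),    # A stays silent, B confesses: A gets 10 years, B goes free
    ("D", "C"): (0, 120),    # A confesses, B stays silent
    ("D", "D"): (60, 60),    # both confess: five years each
}


def sentence_for_a(choice_a, choice_b):
    """Months served by prisoner A for a given pair of choices."""
    return SENTENCES[(choice_a, choice_b)][0]


def defection_dominates():
    """Individual rationality: confessing shortens A's sentence whatever B does."""
    return all(sentence_for_a("D", b) < sentence_for_a("C", b) for b in ("C", "D"))


def mutual_cooperation_beats_mutual_defection():
    """Collective rationality: both are better off under (C, C) than under (D, D)."""
    both_c = SENTENCES[("C", "C")]
    both_d = SENTENCES[("D", "D")]
    return all(c < d for c, d in zip(both_c, both_d))


if __name__ == "__main__":
    print(defection_dominates())
    print(mutual_cooperation_beats_mutual_defection())
```

Both checks hold for the sentences in the anecdote, which is what makes the situation a dilemma: the choice that is better for each prisoner individually leads to an outcome that is worse for both.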
The interdependence structure underlying prisoner’s dilemmas is characteristic of many dyadic interactions. For example, two neighbors living in apartments with thin walls may wish to play their favorite music at high volume, but they together would probably be better off by playing it at a somewhat lower volume. The conflict between individual and collective rationality apparent in the prisoner’s dilemma can be extended to situations involving many individuals, which are often referred to as N-person prisoner’s dilemmas. For example, trade unions can function considerably better to the extent that a greater
number of individuals become members; however, membership requires fees and time and thus is a costly option from an individual perspective. Social dilemmas is a generic concept, embodying both two-person and N-person prisoner’s dilemmas, as well as some situations that are quite similar to prisoner’s dilemmas, and that do represent a conflict between self-interest and collective interest. It is also noteworthy that in everyday life social dilemmas tend to take the form of either a resource dilemma or a public good dilemma. A resource dilemma calls for a decision whether or not to take (or how much to take) from a public resource. An example is the degree to which one uses water during a period of water shortage. A public good dilemma calls for a decision whether or not to contribute (or how much to contribute) to a public good. An example is the degree to which one contributes to the maintenance or improvement of public television. Public good dilemmas pose the problem of free-riding, that is, not contributing but still enjoying the benefits of the public good. In actual research on social dilemmas, researchers often use experimental games, a technique that is closely linked to the study of conflict, cooperation and competition, as well as intergroup relations (e.g., conflict and conflict resolution; cooperation and competition, intergroup relations). The experimental game approach has a number of advantages, in that it allows researchers to confront participants with a real dilemma, especially when the ‘outcomes’ represent real outcomes (e.g., money). Thus, this approach focuses on actual behavior in a context characterized by high levels of experimental realism: it is an involving experimental task. At the same time, to enhance mundane realism researchers have become increasingly interested in examining simulations of social dilemmas as they appear in the real world. Examples are well-designed simulations of the resource dilemma or the public good dilemma.
There is also a growing body of research that examines social dilemmas directly in the real world, using correlational survey research, observational research, and experimental field research (see Komorita and Parks 1995). This multitude of techniques and procedures is used not only by psychologists, but also by anthropologists, economists, sociologists, political scientists, and even biologists. Many or most of the findings reviewed below have been observed across various research paradigms.
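The free-riding problem in a public good dilemma can be illustrated with a simple payoff rule. The sketch below assumes a linear public goods game, a form commonly used in experimental work but not specified in the text: every participant starts with the same endowment, contributions are multiplied by a factor between 1 and the group size, and the result is shared equally. The function name and the default values are hypothetical.

```python
def public_good_payoffs(contributions, endowment=10.0, multiplier=2.0):
    """Payoff per participant in an assumed linear public goods game.

    Each participant keeps whatever was not contributed and receives an
    equal share of the multiplied pool. The dilemma arises when
    1 < multiplier < number of participants.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


if __name__ == "__main__":
    # Everyone contributes fully.
    print(public_good_payoffs([10, 10, 10, 10]))
    # The first participant free-rides on the other three.
    print(public_good_payoffs([0, 10, 10, 10]))
    # Nobody contributes.
    print(public_good_payoffs([0, 0, 0, 0]))
```

Under these assumed values the free-rider ends up with 25 units against 20 under full contribution, while the three contributors drop to 15; if nobody contributes, everyone keeps only the 10-unit endowment. Not contributing is individually attractive, yet universal non-contribution leaves all worse off than universal contribution.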
2. What Factors Promote Cooperation in Social Dilemmas?
How can one promote cooperation in social dilemmas? One important factor derives from the structure itself; that is, the degree to which self-interest and collective interest are conflicting. Cooperation can be enhanced by both decreasing the incentive associated with
noncooperative behavior and increasing the incentive associated with cooperative behavior (Kelley and Grzelak 1972). Generally, the influence of the structure itself has been shown to be strong not only in experimental work on social dilemmas, but also in field studies. For example, it has been found that monetary reward for electricity conservation was one of the most effective means of attaining electricity conservation. A second factor concerns the possibility of communication. Indeed, one interesting and potentially important way to escape from massive noncooperation is to coordinate the social dilemma in such a manner that people feel committed, or actually promise, to contribute to the collective welfare (Orbell et al. 1988). However, opportunities for verbal communication tend to be effective only when individuals indeed discuss their actions relevant to the dilemma at hand. One of the explanations for the fruitful effects of communication is that it enhances trust in others’ cooperative choices, and that it is difficult for people to break a promise. A third factor derives from the personalities involved, in particular differences among three types of social value orientation: prosocial orientation (enhancement of collective outcomes as well as equality in outcomes), individualistic orientation (enhancement of own outcomes with no or very little regard for others’ outcomes), and competitive orientation (enhancement of relative advantage over others; Messick and McClintock 1968; Van Lange 1999). People with a prosocial orientation exhibit greater cooperation, expect others to behave more cooperatively, are more trusting of others’ intentions and motivations, and interpret the dilemma more strongly as a moral issue and less as an intelligence task than do individualists and competitors.
Also, such personality differences tend to be a function of social interaction experiences, in that, for example, prosocials were raised in larger families, having more sisters in particular, than individualists and competitors, and people over time tend to become more prosocial and less individualistic and competitive (Van Lange et al. 1997; see also Personality Development in Childhood; Personality Development in Adulthood).
3. From Two-person Social Dilemmas to N-person Social Dilemmas
In two-person social dilemmas, individuals may to some degree be able to promote cooperation in the other. In this regard, one of the most effective strategies is the tit-for-tat strategy (Pruitt and Kimmel 1977), a strategy which starts with a cooperative choice, and subsequently mimics the other’s prior choice. Thus, tit-for-tat’s willingness to behave cooperatively is conditional upon the other’s cooperative
behavior. Several studies have revealed that tit-for-tat is more effective in eliciting cooperation in the other than a strategy which always makes cooperative choices irrespective of the other’s choices (perfect cooperation), which in turn is more effective than a strategy which always makes noncooperative choices irrespective of the other’s choices (perfect noncooperation). Tit-for-tat is assumed to be effective because it is nice, retaliatory, forgiving, and clear (Axelrod 1984). One potential limitation of tit-for-tat is that it can only be effectively used in the context of dyadic relationships or small groups. How is one to pursue tit-for-tat in large groups? Apart from the strategies that people use, there is typically a decline in cooperation as groups become larger in size. More specifically, cooperation typically declines as groups become larger up to about seven or eight persons. However, if groups become larger than seven or eight persons then the level of cooperation is not strongly influenced by a further increase in group size. How can one account for the group-size effect? It has been suggested that, with increasing group size, anonymity is greater, individuals become more pessimistic about the efficacy of their efforts to promote collective outcomes (Kerr 1996), and people tend to feel less personally responsible for good collective outcomes. An effective solution might be a structural one, in which the underlying structure is changed by implementing a system that rewards cooperation or sanctions noncooperation (Yamagishi 1986). The benefits need not necessarily be financial or even material. For example, reduction of car use (so as to reduce smog or congestion) might be promoted to some degree by establishing carpool priority lanes. Other solutions might derive from strengthening social norms, facilitating effective communication, or enhancing feelings of identity with the group or a sense of ‘we-ness’ (see also Social Categorization, Psychology of).
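The tit-for-tat rule described earlier in this section (start with a cooperative choice, then mimic the other's prior choice) is simple enough to write down and try against a few partners. This is a minimal sketch; the comparison strategies, the function names, and the string encoding of choices are assumptions made for illustration.

```python
def tit_for_tat(own_history, other_history):
    """Cooperate first, then repeat the other's previous choice."""
    return "C" if not other_history else other_history[-1]


def always_cooperate(own_history, other_history):
    """Perfect cooperation: cooperate irrespective of the other's choices."""
    return "C"


def always_defect(own_history, other_history):
    """Perfect noncooperation: defect irrespective of the other's choices."""
    return "D"


def defect_once(own_history, other_history):
    """Hypothetical partner that defects in the first round only."""
    return "D" if not own_history else "C"


def play(strategy_a, strategy_b, rounds=5):
    """Run a repeated two-person dilemma and return both choice sequences."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        hist_a.append(a)
        hist_b.append(b)
    return "".join(hist_a), "".join(hist_b)


if __name__ == "__main__":
    print(play(tit_for_tat, always_defect))
    print(play(tit_for_tat, defect_once, rounds=4))
```

Against a perfectly noncooperative partner, tit-for-tat is exploited only in the first round and defects thereafter (retaliatory). Against a partner who defects once, it answers with a single defection and then returns to cooperation (forgiving). The sketch only traces choices; it makes no claim about how effective the strategy is, which is an empirical matter.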
4. From Interpersonal to Intergroup Relations
Interestingly, interactions between groups are characterized by lower levels of cooperation and higher levels of competition than are interactions between individuals (Insko et al. 1998). This individual-group discontinuity effect is accounted for by, first, lower levels of trust in intergroup relations, eliciting fear of being exploited by the other group. Second, individuals within a group are able to support each other in their pursuit of the interest of their own group (and each member’s self-interest) at the expense of the other group. Third, responsibility for self-centered behavior is often shared in intergroup settings, whereas in interpersonal settings one is individually responsible for one’s actions. One way to enhance cooperation between groups is for one of the two groups to follow the tit-for-tat strategy, which yields a level of cooperation comparable to that found among interactions between individuals when one of them follows the tit-for-tat strategy (Insko et al. 1998).
5. Issues for Future Research Recent approaches to social dilemmas examine the ways in which social-cognitive mechanisms (e.g., construal of social dilemmas) might be associated with behavior in social dilemmas, examine how individuals themselves seek to change features of the social dilemma to enhance cooperation (e.g., excluding members from the group) or to escape from the dilemma (e.g., seeking greater independence), and examine the functionality and evolutionary stability of various strategies in large-scale social dilemmas through computer simulation techniques. These issues are examined by various social and behavioral scientists, and these interdisciplinary efforts should make important contributions to theory regarding cooperation and competition, social dilemmas, and interdependence as well as to providing effective solutions to various complex social dilemmas we face in the real world. See also: Altruism and Self-interest; Cooperation and Competition, Psychology of; Cooperation: Sociological Aspects; Decision-making Systems: Personal and Collective; Personality Development in Adulthood; Personality Development in Childhood; Prisoner’s Dilemma, One-shot and Iterated
Bibliography
Axelrod R 1984 The Evolution of Cooperation. Basic Books, New York
Insko C A, Schopler J, Pemberton M B, Wieselquist J, McIlraith S A, et al. 1998 Long-term outcome maximization and the reduction of interindividual-intergroup discontinuity. Journal of Personality and Social Psychology 75: 695–711
Kelley H H, Grzelak J 1972 Conflict between individual and common interests in an n-person relationship. Journal of Personality and Social Psychology 21: 190–7
Kerr N L 1996 ‘Does my contribution really matter?’: Efficacy in social dilemmas. In: Stroebe W, Hewstone H (eds.) European Review of Social Psychology. Wiley, Chichester, UK, Vol. 7, pp. 209–40
Komorita S S, Parks C D 1995 Interpersonal relations: Mixed-motive interaction. Annual Review of Psychology 46: 183–207
Luce R D, Raiffa H 1957 Games and Decisions: Introduction and Critical Survey. Wiley, New York
Messick D M, McClintock C G 1968 Motivational bases of choice in experimental games. Journal of Experimental Social Psychology 4: 1–25
Orbell J M, Van der Kragt A J, Dawes R M 1988 Explaining discussion-induced cooperation. Journal of Personality and Social Psychology 54: 811–19
Pruitt D G, Kimmel M J 1977 Twenty years of experimental gaming: Critique, synthesis, and suggestions for the future. Annual Review of Psychology 28: 363–92
Van Lange P A 1999 The pursuit of joint outcomes and equality in outcomes: An integrative model of social value orientation. Journal of Personality and Social Psychology 77: 337–49
Van Lange P A, Otten W, De Bruin E M, Joireman J A 1997 Development of prosocial, individualistic, and competitive orientations: Theory and preliminary evidence. Journal of Personality and Social Psychology 73: 733–46
Yamagishi T 1986 The provision of a sanctioning system as a public good. Journal of Personality and Social Psychology 51: 110–16
P. A. M. Van Lange
Social Evolution: Overview
1. Concepts and their Definitions
Social evolution: a process or set of processes of directional social change.
Unilinear evolution: the movement of societies along a single path of change.
Universal evolution: Steward’s name for the revival of unilinear evolutionism by Childe and White.
Multilinear evolution: Steward’s name for the movement of societies along many different paths of change.
General evolution: Sahlins’s name for the overall movement of society and culture along a major path.
Specific evolution: Sahlins’s name for the divergent evolutionary paths taken by different societies and cultures.
Progress: an improvement in the overall functional efficiency of society and the quality of life of the members of a society.
Evolutionary theory: any theory that attempts to describe and explain a sequence or sequences of directional social change.
Evolutionary transformations: see social evolution.
Survivals: structures or processes that have been carried by the force of custom or habit beyond the stage of society in which they originated.
Neolithic Revolution: the first great social transformation in human history associated with the domestication of plants and animals and the emergence of settled village life.
Urban Revolution: the name given by Childe to the second great social transformation in human history. It was characterized by the emergence of cities and states.
Environmental circumscription: blockage of the movement of peoples out of a geographic region by such natural barriers as mountain ranges or large bodies of water.
Social stratification: the existence within a society of distinct social strata, or groups distinguished by unequal levels of power, privilege, and prestige.
Economic surplus: a quantity of goods above and beyond what is needed by individuals for survival.
Adaptation: the process whereby individuals adjust themselves to their physical and social environments and solve the basic problems of human living.
Environmental depletion: the diminution of resources within an environment to a point whereby the standard of living is lowered.
Evolutionary materialism: the name given by Sanderson to the general theoretical framework developed by him on the basis of Harris’s evolutionary theory.
Evolutionary universal: the name given by Parsons to any new social structure or process that allows a society to achieve a higher level of adaptation or functional efficiency.
Evolutionary typology: a set of stages through which a society or one of its sectors is said to pass.
Differentiation: the emergence of increasing complexity within a society or one of its subsystems.
Militant society: Spencer’s name for a society that is highly specialized for warfare.
Industrial society: (a) a type of society characterized by urban life, the factory system, and the mass production of goods; (b) Spencer’s name for a type of society specialized for economic activity rather than warfare.
Savagery: Morgan’s name for the earliest and most primitive type of human society, one based on hunting and gathering, and a rudimentary technology.
Barbarism: Morgan’s name for an intermediate type of society organized around simple forms of agriculture.
Civilization: (a) a type of society characterized by elaborate social stratification, urbanism and city life, occupational differentiation and craft specialization, trade and markets, monumental architecture, writing and record keeping, and the presence of a state; (b) Morgan’s name for the highest stage of social evolution, one characterized by the phonetic alphabet and writing.
Societas: a stage of social evolution identified by Morgan that is organized on the basis of kinship and that is normally highly egalitarian and democratic.
Civitas: a stage of social evolution identified by Morgan that is organized along the lines of property and territory rather than kinship.
Mode of production: in Marxian theory, the economic base of society consisting of the level of technological development and the social relationships through which people carry on economic activity.
Primitive communism: in Marxian theory, the first stage of human society characterized by a rudimentary
technology and common ownership of the resources of nature.
Slavery: in Marxian theory, the second stage of human society characterized by social and economic relations between a large class of slaves and a small class of slave owners.
Ancient mode of production: see slavery (above).
Feudalism: in Marxian theory, the third stage of human society characterized by social and economic relations between a large class of peasants or serfs and a small class of landlords.
Capitalism: in Marxian theory, the fourth stage of human society characterized by social and economic relations between a large working class and a small class of capitalists. Capitalism is based on the production of goods for sale in a market for profit and the accumulation of capital over time.
Hunting and gathering: the simplest stage of social evolution characterized by an economy based on the hunting of wild animals and the collection of fruits, nuts, vegetables, and other plant materials.
Simple horticulture: a stage of social evolution characterized by societies in which people cultivate gardens through the use of digging sticks.
Advanced horticulture: a stage of social evolution characterized by societies in which people cultivate gardens through the use of metal hoes.
Agrarianism: a stage of social evolution characterized by societies in which people cultivate the land with plows and draft animals.
Simple agrarian society: a society that cultivates the land with plows and draft animals but that lacks iron tools and weapons.
Advanced agrarian society: a society that cultivates the land with plows and draft animals and that possesses iron tools and weapons.
Industrialism: see Industrial Society/Post-industrial Society: History of the Concept.
Industrial Revolution: a technological and economic transformation, beginning in England in the second half of the eighteenth century, characterized by the development of the factory system and the substitution of machinery for hand power.
Band: a residential association of nuclear families, usually numbering less than 100 individuals, who govern themselves by purely informal mechanisms.
Tribe: a population, numbering in the thousands or tens of thousands, sharing a common culture and language and living in politically autonomous and economically self-sufficient villages of a hundred to several hundred people.
Chiefdom: a hierarchical sociopolitical structure consisting of a chief and various subchiefs who administer an integrated set of villages.
State: a form of political organization characterized by a monopoly over the use of force within a society.
Capitalist world-economy: the name given by Wallerstein to the large-scale intersocietal system that began to develop in Europe in the sixteenth century.
The capitalist world-economy was associated with production of goods for the market and high levels of economic and geographical specialization.
Social evolution is a term that has been used in a number of different ways, but most often it refers to a form of social change that exhibits some sort of directional sequence, a passage from less of something to more of something, or from one type of social arrangement to another type. Usually the term connotes slow and gradual change, and most often it carries with it the notion that the change in question is progressive, i.e., that it leads to some sort of improvement or enhancement. However, social evolution can be fast and even sudden, and it need not lead to improvement or enhancement. Erik Olin Wright (1983) has offered a precise set of criteria for identifying an evolutionary theory of human society. According to him, a theory is evolutionary if it has four characteristics: (a) it proposes a typology of social forms with potential directionality; (b) the order or arrangement of the social forms is based on the assumption that the probability of remaining at the same stage exceeds the probability of regressing; (c) it asserts a probability of transition, even if only weak, from one stage to another; and (d) it proposes a mechanism or set of mechanisms that will explain movement from one stage to another. One of the benefits of Wright’s characterization is that it makes no assumption that social evolution is some sort of automatic or inevitable process. Regression is permitted (although unlikely), and much of the time societies may be undergoing little or no change. Nor need it be assumed that social evolution always proceeds through some sort of rigid sequence.
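Wright's criteria (b) and (c) are stated in terms of probabilities, so one way to make them concrete is to treat an ordered typology of stages as the states of a transition matrix. This reading is an assumption introduced here for illustration and is not Wright's own formalization; the function name and the example matrices are hypothetical. Row i of the matrix holds the probabilities of moving from stage i to each stage in one period.

```python
def satisfies_wright_b_and_c(transitions):
    """Check criteria (b) and (c) on a row-stochastic matrix of ordered stages.

    (b) at every stage above the first, the probability of staying
        exceeds the total probability of regressing to an earlier stage;
    (c) at every stage below the last, the probability of advancing
        to some later stage is positive, even if small.
    """
    last = len(transitions) - 1
    for i, row in enumerate(transitions):
        stay = row[i]
        regress = sum(row[:i])
        advance = sum(row[i + 1:])
        if i > 0 and not stay > regress:
            return False
        if i < last and not advance > 0:
            return False
    return True


if __name__ == "__main__":
    # Mostly stasis, weak forward movement, rare regression.
    sticky = [
        [0.90, 0.10, 0.00],
        [0.05, 0.85, 0.10],
        [0.00, 0.02, 0.98],
    ]
    # No movement at all between stages.
    frozen = [
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
    print(satisfies_wright_b_and_c(sticky))
    print(satisfies_wright_b_and_c(frozen))
```

The first matrix is consistent with the points made above: societies mostly stay where they are, regression is possible but unlikely, and nothing forces a rigid sequence. The second fails criterion (c) because no transition is ever asserted. Criteria (a) and (d), the typology itself and the explanatory mechanism, are substantive claims that a matrix cannot capture.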
2. Historical Development of Theories of Social Evolution
Theories of social evolution go back to the very beginnings of the social sciences and even earlier. The second half of the nineteenth century was a sort of ‘golden age’ of evolutionary thinking in sociology and anthropology, the fields that have contributed by far the most to the study of social evolution. There were many well-known evolutionary theorists during this time, but the three greatest were Herbert Spencer (1972, Peel 1971), Lewis Henry Morgan (1974), and Edward Burnett Tylor (1871). Spencer was a British social theorist who contributed to philosophy, psychology, biology, and sociology. He saw evolution as a process that occurred in the entirety of the natural and social world. All phenomena, he said, had an inherent tendency to change from a state of ‘incoherent homogeneity’ to a state of ‘coherent heterogeneity.’ In simpler terms and with respect to human societies, they grew and expanded over time, their structures and functions became increasingly differentiated, and these increasingly differentiated wholes achieved
higher levels of unification or integration. Morgan, an American anthropologist, saw social evolution as rooted in technological change, and in his great work Ancient Society (1974) he focused mainly on the evolution of government, property, and the family. In terms of the institutions of property and government, Morgan traced out a process whereby highly democratic and egalitarian societies were gradually replaced by societies marked by differential property ownership, social inequality, and the domination of the many by the few. Tylor was a British anthropologist who was much less concerned with the evolution of technology, economy, and political life and much more interested in what anthropologists regard as ideational and symbolic culture, i.e., with myth, language, customs, rituals, and, in particular, religion. Tylor is famous for his use of the concept of survivals, which are features of culture that have been carried by the force of custom or habit into types of societies beyond the one in which they initially appeared. For Tylor, survivals were excellent evidence of a process of social evolution. Completely outside of the disciplines of sociology and anthropology, Karl Marx and Friedrich Engels (1964) developed their own evolutionary model of society. This model focused on the economic evolution of societies over thousands of years and made use of a revised version of the Hegelian dialectic as the mechanism of evolution. Within the economic infrastructures of societies, contradictions or tensions built up that eventually led to explosive situations and the transition to a new stage of society. Engels developed his own highly abstract theory of social evolution in his book Anti-Dühring (1939). He also formulated a much less abstract theory in The Origin of the Family, Private Property, and the State (1970), a book that depended heavily on the ideas of Morgan.
After about 1890 the golden age of social evolutionary thought came to an end and was succeeded by what has been called ‘the antievolutionary reaction’ (Harris 1968, Sanderson 1990). Under the leadership of Franz Boas (1940), anthropologists in large numbers turned against the evolutionary theories of Spencer, Tylor, and Morgan. Evolutionary thought was kept alive to some extent through the work of such thinkers as L. T. Hobhouse (Hobhouse et al. 1965) and the celebrated sociologist William Graham Sumner (Sumner and Keller 1927). However, the intellectual atmosphere was for the most part unfriendly to it. But by the 1930s this atmosphere began to change and evolutionism revived. Three thinkers were most responsible for the revival. In 1936 V. Gordon Childe, an Australian archaeologist who taught in Scotland, wrote a famous book entitled Man Makes Himself. In this book Childe identified two great evolutionary transformations in human history. The first he called the Neolithic Revolution, which was associated with the domestication of plants and animals and led to larger and more settled societies. The second evolutionary transformation Childe called the Urban
Revolution, which was made possible by the invention of the plow. The Urban Revolution was marked by the emergence of the city, the development of sharp class divisions, and of the form of political organization known as the state. Following closely on the heels of Childe was the American anthropologist Leslie White. Throughout the 1940s White wrote a series of articles in which he attacked Boas and other antievolutionists for what he saw as their distortions of the claims of the nineteenth-century evolutionists. White (1943, 1959) also developed his own conception of the evolution of culture, which he expressed in terms of a law of evolution. This law stated that culture developed in direct proportion to the amount of energy per capita that societies had harnessed. Societies harnessed energy as they moved from one technological stage to another or as they made their existing technology more efficient, and thus for White technological change was what was driving the evolutionary process. The third major figure in the evolutionary revival was Julian Steward (1955). Steward introduced the concept of multilinear evolution, which he opposed to unilinear and universal evolution. According to him, the nineteenth-century evolutionists had all been guilty of seeing societies as moving through a fixed series of stages (unilinear evolution), and thus greatly oversimplified and distorted the process of evolutionary change. Steward called the resurrection of this idea by Childe and White universal evolution, and he regarded it as even more problematic than unilinear evolution. For Steward, evolution was largely multilinear, i.e., it moved along a series of paths rather than one grand path. By 1960 the evolutionary revival had proceeded to the point of putting evolutionary thinking back into a dominant position, at least in anthropology.
After 1960 a new generation of anthropologists (and at least one sociologist) built on this revival and solidified it (Sanderson 1990). Marshall Sahlins (1960) developed an important distinction between general and specific evolution. This paralleled the distinction Steward had made between unilinear or universal evolution and multilinear evolution, except that Sahlins saw both general (unilinear, universal) and specific (multilinear) evolution as real social processes that anthropologists should be studying. Despite being unique, specific evolutionary changes also involved an overall movement of social and cultural life from one stage of evolutionary development to another. Robert Carneiro (1970), like Sahlins a student of Leslie White, developed a theory of the evolution of the state that would become one of the most famous theories of this phenomenon. The critical factors in Carneiro’s theory were population growth, warfare, and what Carneiro called environmental circumscription. Circumscribed environments are those in which areas of fertile land are surrounded by natural barriers that prevent or impede the movement of people out of the area. These barriers include such things as mountain ranges or large bodies of water. Warfare is the result of population growth and resource scarcity, and when land is plentiful people respond to war by simply moving away to new land. But in circumscribed environments this process can only be taken so far. Land is eventually filled up, and the solution to more population pressure and resource scarcity becomes political conquest, a process that eventually results in the type of political society known as the state.
About the same time Gerhard Lenski (1966), a sociologist, was using the ideas of Childe and White to develop a theory of the evolution of social stratification. For Lenski the key to the rise of stratification was technological advancement and increasing economic productivity. As societies advance technologically, their economic productivity rises to the point where they can begin to produce an economic surplus, or a quantity of economic goods above and beyond what people need to survive. When economic surpluses arise competition and conflict over their control emerge, and as these surpluses grow larger the struggles between individuals and groups intensify and stratification systems become more elaborate and polarized. One of the most important theories of social evolution of this period was that of Marvin Harris (1977). He broke ranks with such theorists as Childe, White, Sahlins, and Lenski on a critical point. These other theorists saw social evolution as a generally adaptive and progressive process that was driven by technological change. When new technologies became available, people readily adopted them because they saw their potential for improving the quality of life. According to Harris, technological change came about for a very different reason: the tendency of societies to deplete their environments as the result of population growth. When populations grew, pressure on resources intensified and standards of living dropped.
At some point people had no choice but to advance their technologies so as to make their economies more productive. Thus farming replaced hunting and gathering, and farming with the use of the plow replaced farming with the use of simple hand tools. But technological change itself leads to further population growth and greater environmental depletion, and so a new wave of technological change will soon become necessary. For Harris, social evolution, at least in preindustrial societies, is a process in which people have been running as hard as they can just to avoid falling farther and farther behind.
Sanderson (1994a, 1994b, 1995) has built on the evolutionary ideas of Harris. He has formalized, extended, and to some extent modified them by developing a comprehensive theory that he calls evolutionary materialism. He uses evolutionary materialism as a general framework within which to understand three great evolutionary transformations: the Neolithic Revolution, the rise of the state and civilization, and the emergence of the modern capitalist world-economy. Like Harris’s, Sanderson’s view of social evolution is nonprogressivist and in some respects even antiprogressivist.
About this same time a very different kind of evolutionary theory was developed by the sociologist Talcott Parsons (1966, 1971). Parsons concentrated not on the evolution of technology and other features of material life but on the evolution of ideas and social institutions. Parsons formulated the concept of the evolutionary universal, which is a new kind of social structure or process that allows a society to achieve a new stage of evolutionary adaptation and thereby improve its level of functional efficiency. An early evolutionary universal was social stratification, which allowed a more efficient form of societal leadership to emerge. Later such evolutionary universals as administrative bureaucracy and money and markets emerged. An important evolutionary universal in more recent times, Parsons claimed, was the democratic association, which gave the use of power a broad societal consensus.
3. Typologies of Evolutionary Stages
Most theories of social evolution have involved the delineation of evolutionary stages. This was no less true in the nineteenth century than it is today. Spencer used two different evolutionary typologies. One of these was based explicitly on Spencer’s notion of evolution as increasing differentiation: simple, compound, doubly-compound, and trebly-compound societies. At one end of the scale of differentiation we have simple societies, which are politically headless or have only rudimentary forms of headship. At the other end, that of trebly-compound societies, we have civilizations such as the Egyptian empire, ancient Mexico, the Roman empire, or the contemporary United States. Spencer’s other typology distinguished between what he called militant and industrial societies. Militant societies are those in which individuals are subordinated to the will of society and forced to obey its dictates. They are produced by the needs associated with warfare and the preparation for it. Industrial societies, on the other hand, are characterized by an emphasis on economic activities rather than warfare, and are notable for their much greater individualism and overall freedom. Although Spencer did not work out in any precise way just how these two typologies corresponded to each other, just as he saw evolution as a process of increasing differentiation he also saw it as a generalized movement from militant to industrial societies. For Spencer the epitome of industrial societies was his own nineteenth-century Britain. Morgan divided societies into three major stages, which he called ‘ethnical periods’: savagery, barbarism, and civilization. The first two of these are divided into lower, middle, and upper substages. Societies in the stage of savagery are the oldest and most primitive.
They survive by hunting and gathering and use rudimentary tools, such as spears or bows and arrows. Societies have reached the stage of barbarism when they have begun to domesticate plants and animals and live by agriculture rather than by hunting and collecting. Civilization is achieved with the invention of the phonetic alphabet and writing. Tylor made use of a very similar typology, as did other nineteenth-century thinkers. However, Morgan also employed another typology, which involved the distinction between societas and civitas. Societies at the stage of societas were organized on the basis of kinship and had relatively egalitarian and democratic social and political arrangements. Societies having reached the stage of civitas, by contrast, rested on property and territory as organizational devices. Kinship declined greatly in importance and was replaced by the state as the integrator of society. Economic inequalities became prominent, and the democratic arrangements of earlier times gave way to one or another form of despotism. Marx and Engels introduced a typology of stages that was to become one of the most famous of all time. For them a society rested upon one of four possible modes of production: primitive communism, slavery, feudalism, or capitalism. The earliest human societies were at the stage of primitive communism. Here technology was very primitive, and people subsisted by hunting and gathering, by simple forms of agriculture, or by animal herding. Private property had not yet emerged, nor had class divisions. People lived in harmony and equality. Slavery, sometimes called the ancient mode of production, was characteristic of such earlier civilizations as Egypt and Mesopotamia and ancient Greece and Rome. Technology had undergone major advances since the stage of primitive communism, but as a result private property and class divisions had emerged. A small class of slave-owning masters confronted a large class of slaves.
Slavery was followed in due course by feudalism, which characterized Europe during the Middle Ages. Slavery had largely given way to serfdom, and the major class division in society was that between landlords and serfs. The transition to capitalism was associated with another major upheaval in social life. Landlords and peasants were replaced by capitalists and workers as societies industrialized and urbanized. In due time the capitalist mode of production would fill with internal contradictions and eventually burst asunder, giving way to a socialist society without class divisions. In recent decades very useful typologies have been produced by Gerhard Lenski (1966, 1970) and Elman Service (1971). Lenski has distinguished five major stages based on the level of technological development: hunting and gathering, simple horticulture, advanced horticulture, agrarianism, and industrialism. Hunting and gathering societies are the simplest and oldest. Here people survive by living on what nature provides. Men hunt and women collect fruits, nuts, vegetables,
and other plant matter. The transition to simple horticulture was made with the Neolithic Revolution that began in some parts of the world about 10,000 years ago. Simple horticulturalists subsist by a rudimentary form of agriculture that is based on the cultivation of gardens with the use of digging sticks. They often use a form of cultivation known as ‘slash-and-burn’ cultivation. This involves cutting down a section of forest, burning off the accumulated debris, and then spreading the ashes over the garden area as a form of fertilization. Usually gardens can only be used for a few years because soil fertility is quickly lost, so they will be abandoned in favor of other areas, only to be returned to many years hence. Advanced horticulturalists use the same basic form of cultivation but till the land with metal hoes rather than digging sticks, which give them a technological advantage. Agrarian societies are based on the plow and the harnessing of animal energy for plowing. Fields are completely denuded of shrubs and trees and slash-and-burn is abandoned. Industrial societies began to emerge with the Industrial Revolution of the late eighteenth century. They are based on industry, the factory system, and urban life. Service’s typology is based on the form of sociopolitical organization rather than the mode of technology, and distinguishes four stages: bands, tribes, chiefdoms, and states. Bands and tribes are similar in that they lack any formal mechanisms of social control. Leaders exist, but they have no real power or authority to compel anyone to do anything. They lead by example and they advise and suggest, but their decisions have no binding force. A band is usually a residential association of nuclear families, with a population anywhere from about 25 persons to 100. A tribe may be thought of as a larger collection of bands.
Often the term tribe is used to designate a population, numbering in the thousands or tens of thousands, that shares a common culture and language. This cultural and linguistic whole is composed of politically autonomous and economically self-sufficient villages, which may consist of anywhere from a hundred to several hundred persons. Chiefdoms develop when the villages of the tribe come to be subordinated to an administrative structure of the whole. This structure is hierarchical and is dominated by a chief and at least a few subchiefs, each of whom has the capacity to compel the actions of others. Chiefdoms rest on at least some degree of coercion, but by definition this coercion is not sufficient to prevail when there is a rebellion from below. When such a degree of coercion has been achieved, a state has emerged. States are thus large-scale, highly centralized political systems in which rulers have a monopoly over the use of force. Other typologies exist, and all have their strengths and weaknesses. Morton Fried (1967) developed a typology based on the degree of social inequality: egalitarian, rank, and stratified societies. Eric Wolf (1982) formulated a Marxist typology of stages based
on the mode of production: the kin-based mode, the tributary mode, and the capitalist mode. And Talcott Parsons (1966) formulated a typology based on three major stages, which he called primitive, intermediate, and modern societies. There is no one way to create a typology of evolutionary stages. What is important is that the typology describes a real directional sequence or process based upon a clearly identified mechanism of change. All of the typologies identified here do that to one extent or another.
4. Social Evolution: Its Main Outlines
The discussion now moves from typologies and theories of social evolution to the actual process of social evolution itself. There is considerable disagreement about what the most important dimensions and driving forces of this process are, but the following sketch probably would not produce too much controversy among social evolutionists (Sanderson 1995). Virtually all social evolutionists would agree that the first great social transformation was the Neolithic Revolution, which introduced plant and animal domestication into the world. The most striking thing about the Neolithic from an evolutionary perspective is that it occurred in remarkably parallel form in several major regions of the world. Beginning about 10,000 years ago in southwest Asia, the transition to communities based on agriculture rather than hunting and gathering occurred at later times in other parts of the Old World—southeast Asia, China, Africa, and Europe—and three parts of the New World: Mesoamerica, South America, and North America. It is now generally understood that most if not all of these Neolithic transitions occurred independently of one another, a phenomenon that can only be explained in evolutionary terms. The effects in these various regions were much the same too. The transition to agriculture led to settled and more densely populated communities that for a while remained relatively egalitarian but eventually gave way to stratified societies organized into chiefdoms. By around 5,000 years ago in several parts of the world societies that had evolved into chiefdoms were beginning to make the transition to a state level of political organization, or to what many scholars have called civilizations. This occurred first in Mesopotamia and Egypt, and then later in China, the Indus valley in northern India, parts of Europe, and Mesoamerica and Peru in the New World.
Civilizations are societies with highly centralized political institutions involving a monopoly over the use of force, elaborate social stratification, urbanism and city life, occupational differentiation and craft specialization, trade and markets, monumental architecture, and writing and record keeping. Most civilizations have been agrarian societies, thus cultivating the land with plows and draft animals and intensively fertilizing the soil. Like
the Neolithic Revolution, the transition to civilization and the state was a process of independent parallel evolution in several parts of the world. Social evolution occurred rapidly throughout the world between 10,000 and 5,000 years ago, but once the stage of civilization had been reached there was a considerable slowing of the evolutionary process. Social change continued in that societies became bigger and more densely populated, technology advanced, trade and markets expanded, and empires grew larger and more powerful. However, there was no qualitative leap to a new mode of social organization in the sense that there had been with the transition from hunter–gatherer to agricultural communities or to civilization and the state. Lenski (1970) has distinguished between simple and advanced agrarian societies on the basis of the possession by the latter of iron tools and weapons. He has suggested that advanced agrarian societies were in many ways so different from simple agrarian societies that it is difficult to include them in the same category. Yet both types were based on the same technological and economic foundations and the differences between them were more quantitative than qualitative. It was to take another several thousand years before there was a shift to a qualitatively new mode of social organization. What was this shift and when did it occur? This is where sociologists enter the picture. Most sociologists have taken the view that it was the Industrial Revolution of the late eighteenth century that introduced a qualitatively new form of social life, industrial society. However, in recent years some sociologists, along with social scientists from other fields, have moved this qualitative transformation back in time to the sixteenth century (Wallerstein 1974). The qualitative shift is then considered to be the transition to a capitalist world economy.
Capitalism—selling goods in a market in order to earn a profit—in some form or another had existed for thousands of years, but after the sixteenth century it began to get the upper hand and to replace earlier, precapitalist forms of social life. From this perspective, the Industrial Revolution was simply part of the logic inherent in the advance of capitalism. Most scholars have treated the rise of capitalism as a unique feature of European society and Europe’s decisive contribution to the world. However, at about the same point in history a society at the other end of the world, Japan, began to undergo a capitalist transition of its own, and it is possible to identify a number of striking similarities between Europe and Japan that may have facilitated their transition to the new economic system (Sanderson 1994b).
Once the transition to capitalism had been made and the Industrial Revolution had commenced, strikingly similar changes began to occur among rapidly industrializing societies. Today’s industrial capitalist societies all have very similar occupational structures, governments based on parliamentary democracy, systems of mass education that are closely linked to the work world, similar stratification systems with relatively high levels of mobility, and rationalization in the form of advanced science and technology. Some sociologists and most historians will protest that recognition of these similarities should not prevent us from seeing the many important differences among industrial capitalist societies. That is certainly true, but from the perspective of the evolutionist, it is the similarities that are more compelling.
5. Conclusions
It was noted above that evolutionary theories of society have ebbed and flowed in their popularity ever since they were introduced. They were extremely prominent between 1850 and 1890, fell to a low point between 1890 and 1930, and rose to prominence again around the middle of the twentieth century. But the bubble burst again, and beginning in the 1970s evolutionary theories fell from grace. They have been criticized on numerous grounds (Nisbet 1969, Sztompka 1993). Among other things, it has been charged: (a) that they overestimate the amount of directionality in history; (b) that they rely on teleological explanations, which are illegitimate; (c) that they have a strong endogenous bias and ignore, or at least insufficiently attend to, the external relations among societies; (d) that they employ a specious concept of adaptation; and (e) that their progressivist view of history is much too optimistic. All of these criticisms have been answered at length (Sanderson 1990, 1997). Precedent suggests that evolutionism will once again become prominent in sociology and anthropology. This type of theorizing has fared well in optimistic periods of history and poorly in more pessimistic periods. Western society and Western social science have long been in a pessimistic period, but this cannot last. With the eventual turn toward greater optimism, evolutionism will again assume its rightful place as a major perspective on social life and its historical changes.
See also: Cultural Evolution: Overview; Cultural Evolution: Theory and Models; Industrial Society/Post-industrial Society: History of the Concept; Inequality; Slavery as Social Institution
Bibliography
Boas F 1940 Race, Language, and Culture. Macmillan, New York
Carneiro R L 1970 A theory of the origin of the state. Science 169: 733–8
Childe V G 1936 Man Makes Himself. Watts, London
Engels F 1939 Herr Eugen Dühring’s Revolution in Science (Anti-Dühring). 3rd edn. International Publishers, New York
Engels F 1970 The Origin of the Family, Private Property, and the State. International Publishers, New York
Fried M H 1967 The Evolution of Political Society. Random House, New York
Harris M 1968 The Rise of Anthropological Theory. Crowell, New York
Harris M 1977 Cannibals and Kings: The Origins of Cultures. Random House, New York
Hobhouse L T, Wheeler G C, Ginsberg M 1965 The Material Culture and Social Institutions of the Simpler Peoples. Routledge and Kegan Paul, London
Lenski G 1966 Power and Privilege: A Theory of Social Stratification. McGraw-Hill, New York
Lenski G 1970 Human Societies: A Macro-level Introduction to Sociology. McGraw-Hill, New York
Marx K, Engels F 1964 The German Ideology. Progress Publishers, Moscow
Morgan L H 1974 Ancient Society, or Researches in the Line of Human Progress from Savagery through Barbarism to Civilization. Peter Smith, Gloucester, MA
Nisbet R A 1969 Social Change and History: Aspects of the Western Theory of Development. Oxford University Press, New York
Parsons T 1966 Societies: Evolutionary and Comparative Perspectives. Prentice-Hall, Englewood Cliffs, NJ
Parsons T 1971 The System of Modern Societies. Prentice-Hall, Englewood Cliffs, NJ
Peel J D Y 1971 Herbert Spencer: The Evolution of a Sociologist. Basic Books, New York
Sahlins M D 1960 Evolution: Specific and general. In: Sahlins M D, Service E R (eds.) Evolution and Culture. University of Michigan Press, Ann Arbor, MI
Sanderson S K 1990 Social Evolutionism: A Critical History. Basil Blackwell, Oxford, UK
Sanderson S K 1994a Evolutionary materialism: A theoretical strategy for the study of social evolution. Sociological Perspectives 37: 47–74
Sanderson S K 1994b The transition from feudalism to capitalism: The theoretical significance of the Japanese case. Review 17: 15–55
Sanderson S K 1995 Social Transformations: A General Theory of Historical Development. Blackwell, Oxford, UK
Sanderson S K 1997 Evolutionism and its critics. Journal of World-Systems Research 3: 94–114
Service E R 1971 Primitive Social Organization: An Evolutionary Perspective. 2nd edn. Random House, New York
Spencer H 1972 Herbert Spencer on Social Evolution. Peel J D Y (ed.) University of Chicago Press, Chicago
Steward J H 1955 Theory of Culture Change: The Methodology of Multilinear Evolution. University of Illinois Press, Urbana, IL
Sumner W G, Keller A G 1927 The Science of Society. 4 Vols. Yale University Press, New Haven, CT
Sztompka P 1993 The Sociology of Social Change. Blackwell, Oxford, UK
Tylor E B 1871 Primitive Culture: Researches into the Development of Mythology, Philosophy, Religion, Language, Art, and Custom. 2 Vols. John Murray, London
Wallerstein I 1974 The Modern World-system: Capitalist Agriculture and the Origins of the European World-economy in the Sixteenth Century. Academic Press, New York
White L A 1943 Energy and the evolution of culture. American Anthropologist 45: 335–56
White L A 1959 The Evolution of Culture. McGraw-Hill, New York
Wolf E R 1982 Europe and the People Without History. University of California Press, Berkeley, CA
Wright E O 1983 Giddens’s critique of Marxism. New Left Review 138: 11–35
S. K. Sanderson
Social Evolution, Sociology of
The term social evolution describes sociological theories that study the changes and development of societies as a whole, by adopting a long-term perspective. These theories have alternately emphasized: (a) the characteristics of successive social orders; (b) the mechanisms and laws that articulate this succession; (c) the direction of this evolution; and (d) its progressive character. Although explicit references to social evolution were certainly not lacking in the reflections of philosophers who dealt with the whole history of human evolution (Plato, Vico, Bossuet), actual theories on social evolution were only established in the eighteenth century (Mandeville, Hume, Ferguson, Smith, Savigny), and nineteenth century (Saint-Simon, Comte, Marx, Spencer, Menger, Durkheim, Tönnies). Even though they are radically different, these theories all identify a tendency to progress from relatively simple and uniform social organizations characterized by a small or even nonexistent division of labor, to social organizations that are marked by a growing division of labor and a considerable increase of population. In spite of their shared minimalist presuppositions, two currents of sociological theories have been established, each interpreting social evolution in a radically different way: (a) theories that try to understand society’s inevitable laws of evolution; and (b) theories that interpret social evolution as an unintended process generated by the spontaneous combination of individual actions.
1. Social Evolution as an Ineluctable Development
Many philosophers, sociologists and economists have tried to single out the laws that clearly explain the succession of different social phases. They have tried to identify evolutionary criteria that would allow them to look back at changes that have already taken place, and foresee future changes. According to Saint-Simon, a law of progress supports the process of evolution and this progress is not linear, since ‘organic periods’ of progress necessarily alternate with ‘critical periods’ of temporary and partial regression. However, the
industrial revolution and scientific progress inevitably lead to a more evolved organic period that is organized according to the principles of ‘positive science,’ which puts power in the hands of scientists. According to Comte (Comte 1875), ‘the development of civilization follows a necessary law’, which individuals ‘are subject to by invariable necessity’: the ‘law of the three phases’. This law establishes that every society goes through ‘theological, metaphysical and positive phases’ which follow a fixed and predetermined scheme. The positive phase characterizes the most advanced stage of social progress, dominated by a new and higher form of science, ‘social physics’, that permits the organization and management of society as a whole. Marx considers ‘dialectic materialism’ the progressive law of all society, marking the inevitable evolution from ‘primitive communism’ to the ‘feudal’ phase that leads to the ‘capitalistic’ phase, which in turn leads to the inevitable success of ‘communism.’ These theories, based on a holistic view of society that considers social evolution a teleocratic process, have been subjected to a great deal of methodological criticism. The impossibility of foreseeing how human knowledge will develop and the inevitable emergence of unintended consequences are sufficient criticism of the determinism of these theories. There is also a fundamental epistemological objection to the claim these theories make to give a scientific explanation of the entire social process: in order to explain a social phenomenon scientifically, universal laws are necessary, i.e., laws that are not empirically assessed only on the phenomenon itself. Where social evolution is concerned this is not possible, because it is a historically unique process. Consequently, there are no scientific ‘laws of evolution’ to be discovered, but only ‘tendencies’ that can and must be explained by laws (Popper 1957).
In a more general sense, these evolutionary theories view society as a closed system in which it is possible to determine the evolutionary dynamics. Marx, on the basis of a situational analysis, considered social evolution an emerging effect, and attributed a necessary and universal quality to mere tendencies, simply because he did not allow for the fact that a social system is necessarily opened by, and to, new information from the outside.
2. Social Evolution as a Spontaneous Order

Individualistic theories of social evolution are formulated in a completely different way. They do not seek laws of evolution, but try to define a principle of evolution that is determined as the spontaneous combination of subjectively rational, individual actions. Social order and its evolution are conceived mostly as the unintended result of the interaction between individuals with limited knowledge, pursuing their private interests through an ultimately positive exchange with others. B. De Mandeville and the Scottish economists and philosophers of the eighteenth century (Hume, Ferguson, Millar, Smith) were among the first supporters of this theory. They proposed an invisible hand explanation for the origin and development of the most important social institutions. Even before Smith, Mandeville explained the affirmation of the division of labor as a spontaneous result generated by the necessity of satisfying the needs of the individual in an increasingly efficient way. The division of labor, like other social institutions, is therefore, according to Ferguson’s famous definition, the ‘result of human action but not of human planning.’ Among the Scottish philosophers, Adam Smith in particular drew his conclusions at a fully macro-sociological level, identifying trade as the mechanism behind social order, since it spontaneously combines single actions. Smith believed that in the exchange with others, individuals are driven ‘by an invisible hand to support objectives that were not part of their original intentions. [...] By pursuing their own interests, they advance society’s interests more efficiently than was deliberately intended.’

Owing to Burke’s influence on the German historical school (Savigny, Niebuhr), the evolutionary approach was established in linguistics (Humboldt) and law in the early nineteenth century. It was thus adopted first of all in the human and social sciences, and only later in the natural sciences (for example, with Lyell in geology and Darwin in biology). In fact, Hayek defined these theories of evolutionism as ‘Darwinian prior to Darwin.’ As evolutionary theory was increasingly accepted in biology, theories of social evolution were given new impetus.
This was also thanks to Spencer, who should be considered one of the most significant theoreticians of a social evolutionism based on individualism, which was later developed by the economists of the Austrian school. Spencer considered social evolution a process generated by a combination of individual actions, which tend to organize spontaneously, establishing rules and social organizations that are selected on the basis of their fitness to perform the basic functions of human society (survival of the species, production of riches). The ‘obligatory co-operation’ at the base of ‘military society’ and the ‘spontaneous co-operation’ at the base of ‘industrial society’ are the two most important ‘types’ of social order that have selectively evolved. Spencer believed that societies where spontaneous co-operation was established had an evolutionary advantage because they maximised problem-solving capacity by safeguarding individual liberty and allowing the greatest possible number of compatible individual plans to be accomplished. For this reason, Spencer gives ‘spontaneous co-operation’ an ethical value because it is the best solution to reproduction of the species and communal welfare, which he considers the two most important issues of humanity.

The Austrian Economic School is probably the strongest current of theorists of social evolution with an individualistic foundation. It spanned the nineteenth and twentieth centuries with Menger, Böhm-Bawerk, Mises, and Hayek and the neo-Austrian Americans (Rothbard, Kirzner, Stigler). By emphasizing the primacy of theory in the social sciences and suggesting that they be refounded on an individualistic basis, Menger harshly criticises the ‘pragmatist’ theories that consider the genesis and evolution of social order as effects of intentional plans. In his opinion, ‘law, language, the State, currency, trade [...], in their various expressions and incessant changes, are largely the spontaneous result of social evolution.’ Mises developed a momentous theory of rationality (praxeology), providing an epistemological defence of social order as produced by the market, and demonstrating the impossibility of making an ‘economic calculus’ in a planned economy. But it was above all F. A. von Hayek who suggested the strong link between methodological individualism and social evolution. Hayek harshly criticised the reification of the Kollektivbegriffe, starting from the ‘social system’, and based his methodological individualism on an ontological individualism. He stated that there are only individuals whose intentional actions (and their unintentional consequences) tend to combine spontaneously, giving rise to a ‘spontaneous order’, meaning open systems that are self-generated and self-regulated. In Hayek’s view, the great ‘misunderstanding’ of evolutionists is confusing the study of social evolution with the impossible search for the ‘laws of evolution’ on the one hand and, on the other, with studying the selection process between single individuals (social Darwinism).
Social evolution is actually the result of ‘group selection’, meaning the competition between groups organized according to different rules, which are selected on the basis of their functional adaptation. Hayek, like Spencer, demonstrates the evolutionary superiority of the ‘spontaneous order’ (Cosmos) by distinguishing it from ‘contrived order’ (Taxis). This superiority is generated by the fact that the spontaneous order allows each individual to use the generalized knowledge of others to reach their goals. Competition between different types of organizations and individual plans is therefore the means of propagation and retention of evolutionary selection results. It is the method for exploring the unknown and, as such, is the driving force of social evolution.

These individualistic theories of social evolution are based on a fallibilistic view of knowledge, and as such they harshly criticise the rationalistic and ‘constructionist’ claims of scholars who try to discover the inevitable laws of social evolution and then, on the basis of this discovery, plan social order or at least a significant part of it. Social order should be seen instead as the unintended result of the spontaneous combination of rational, subjective actions that relate to situations in an adaptive way. This nomocratic order is an autopoietic system that incorporates the principle of its own evolution: a spontaneous cooperation between free individuals. As such, it is teleonomic, since it has a tendency to self-organization, and is not teleological, because it is not guided by any previously determined, predictable, and inevitable goal.
3. The Evolution of Social Systems

Several other important theories of social evolution exist besides these two currents of thought. Although they do not attempt to define the inevitable laws of development, they study the characteristics and sequential succession of cultural structures, systems, and social phases, considering them separate from and above the order of the single individuals of which they are composed. Durkheim and Parsons are the most significant exponents of these theories. Even though Durkheim’s works do contain interesting examples of the invisible hand explanation (such as the explanation of the genesis of the family in De la division du travail social), he also harshly criticises Smith and the ‘economists’ who interpreted social evolution as a result of the interaction between single individuals pursuing their private interests. Society cannot be ‘deduced’ from individual behavior because ‘the agents in exchanges remain mutually estranged’, since a subjective interest is ‘the least constant thing in the world.’ Durkheim considers society a ‘sui generis’ reality, and therefore ‘collective life is not the result of individual life; on the contrary, the latter is the result of the former.’ Society exercises ‘moral authority’ over individuals and has a ‘collective conscience.’ In archaic societies, which are characterized by a ‘mechanical solidarity’, the collective conscience entirely replaces individual conscience. In more advanced societies, characterized by an ‘organic solidarity’, a progressive individualization occurs that is not, however, sufficient to free the individual from collective tutorship. The evolution that leads to modern society is produced by the division of labor, which in turn, according to Durkheim (influenced by Spencer in this case), ‘varies in direct proportion to the volume and density’ of society.
Having given up trying to define a law of development, Durkheim finally explained the division of labor in a way that is similar to the invisible hand explanation. He writes in fact that ‘the economic services which the division of labor can provide are insignificant compared to the moral effect it produces, and its true function is to achieve solidarity between two or more individuals.’

Parsons’ ‘neo-evolutionism’ has a culturalistic imprint. After the Second World War, he clearly revised the views he expressed in The Structure of Social Action. Parsons defines three ‘levels’ of social evolution: ‘primitive society’ (centered on family relations), ‘intermediate society’ (regulated by religious institutions), and ‘modern society’ (established by the industrial revolution). The passage from one social stage to another can be explained by the laws of general evolution and the law of cybernetic hierarchy. The general laws of evolution (for which Parsons is much indebted to Spencer) explain how societies evolve through the ‘increase of adaptive capacity’, namely through their capacity to ‘transform the environment in order to satisfy their needs’ (Parsons 1971). Successive social phases, under the pressure of functional adaptation, are therefore characterized by a progressive ‘differentiation’ (of roles, functions, and sub-systems) and by their ‘integration.’ The law of cybernetic hierarchy explains that the ‘social system’ works and evolves through the restraints placed by ‘cultural structures’ (i.e., systems of rules and values) on the ‘social system’ and on individual actions, which Parsons places on the lower levels of the cybernetic scale. Parsons describes society as a ‘sui generis reality’, explicitly following Durkheim’s theories. Social evolution is therefore the result of the ‘evolutionary change’ of cultural variables. These variables have the ‘function’ of ‘legitimizing the rules of a society’, providing a view of society in its ‘totality.’
4. Evolution and Social Progress

A dogma of theories of social evolution in the eighteenth and nineteenth centuries, based on the Enlightenment and positivism, was that evolution coincided with progress. Saint-Simon, Comte, Marx, and, before them, Condorcet all believed that the laws of evolution coincided with the laws of progress, meaning that social evolution occurs in a predictable way that is equated with a necessary human progress. This attempt to make historic development and human progress coincide is invalidated by ‘Hume’s law’, which states that it is not logically possible to derive prescriptions from descriptions. The idea of progress cannot therefore be rationally founded. This does not prevent social evolution from being defined as positive or progressive, or views of it from being either optimistic (Spencer, Hobhouse, Childe, White, Parsons) or pessimistic (Malthus, Spengler), for these are evaluations based on subjective value choices. As such, they can be debated and critically discussed, but they cannot be considered as describing a necessity. It is possible to avoid the naturalistic fallacy of confusing goodness with evolution by following ‘Hume’s law.’ In order to assess the results of evolutionary adaptation, it is necessary to move from a historical-sociological plane to an ethical one. Individualistic evolutionists, like Spencer and Hayek, separated evolution from progress and, fearing that social evolution might alter the nature of the spontaneous order, gave progress a moral value by turning it into an ideal to be defended and developed should certain value choices be accepted (individual liberty, the increase of wealth and knowledge, etc.). The purpose of evolutionary ethics is precisely to demonstrate that spontaneous cooperation based on free individual actions is the best means to achieve these values.
5. Adaptation and Innovation

The variation/selection mechanism is the hard core of both biological and social evolution. As neo-Darwinian models were increasingly accepted in biology and theories of self-organized complex systems were increasingly accepted in physics, the social sciences also began to study this evolution-generating mechanism. However, the dynamics of this mechanism can be interpreted in different ways that once again oppose individualism to collectivism. It is possible to have a ‘structural’ view of innovation, by seeing it as a deterministic product of environmental instruction according to Lamarck’s theories. Variations would then be the manifestation of functional needs that arise from, and are selected by, the need to guarantee the social system’s equilibrium. Social changes would therefore be essentially conceived as ‘endogenous’, i.e., part of the social structure and necessary to its reproduction.

Individualistic evolutionists, like Weber and Schumpeter, proposed a completely different view of social innovation. They emphasized the individual, unpredictable, and strategic character of innovation by focusing respectively on the innovative role of charismatic élites and of entrepreneurs. Innovation does not answer a question posed by the environment; it is the result of individual strategies, and of strategies between individuals, who both condition and are conditioned by the environment in which they develop. Variations cannot be explained by strict Lamarckian theories, because the social organism’s ingrained environmental needs and tendencies do not deterministically generate them. On the other hand, they also cannot be interpreted using an orthodox Darwinian model, since they are not random or random-like. These two pure types of innovation correlate to two pure kinds of adaptation.
Parsons reduces adaptation almost exclusively to its functional dimension, defining it as a process that allows society as a whole (as well as its subsystems) to respond to some essential, functional needs. Another view of adaptation considers it closely linked to a biological model (especially Spencer and Hayek). Adaptation is seen as an endless order-generating process that reduces the variety of configurations produced by variations through rule and group selection. This selection essentially refers to groups that spontaneously adopt different rules and organizations. Among these groups, only those which are better equipped to address the most important social needs will survive.
On another level, a similar and complementary process of selection through competition takes place between individuals belonging to the same group, and this process determines the subjective actions and projects that are most adaptable to environmental circumstances and particularly to the rules of the group itself. However, the fact that emphasis is placed on group selection does not mean that this evolutionary prospect denies methodological individualism (as stated by D T Campbell 1995). Organized groups are the spontaneous result of the fact that individuals follow specific rules of conduct that are conveyed and spread by learning through imitation. Following these guidelines, one eventually considers tradition a necessary element of evolution, since it characterizes the process of retention of kinds of actions and dispositions.
6. Social Evolution and Evolutionary Epistemology

D T Campbell has used the expression ‘evolutionary epistemology’ to describe epistemological theories (from Poincaré and Mach to Popper) that saw the evolution of human knowledge as a process that progresses by trial and error-elimination, creating new hypotheses and selectively eliminating false ones. Evolutionary epistemology also helps to better understand the social evolution process, which is also a discovery procedure that lato sensu has a cognitive nature. Social innovations are nothing more than attempts to resolve problems (intentionally proposed or occurring spontaneously) that are selected on the basis of their capacity to deal with problematic situations. Comparing the two evolutionary processes, it can be said that: (a) fitness is nothing more than the elimination of errors; (b) the transmission of selected solutions (in social and cognitive contexts) means learning by error; and (c) the increase of complexity in social systems is simply an increase of background knowledge. Like the evolution of knowledge, social evolution will never reach a state of equilibrium because new instructions will result in an environmental change that will be the background of new variations. Finally, an important distinction should be made between social evolution and the evolution of knowledge. Even though the former cannot be reduced to a Lamarckian model, its environmental instruction is more significant, while the evolution of knowledge (if one excludes the collectivistic theories of some contemporary science sociologists such as Bloor, Barnes, and others) is articulated by creative inventions that, unlike social innovation, are much less affected by external selection.

See also: Comte, Auguste (1798–1857); Cycles: Social; Durkheim, Emile (1858–1917); Hayek, Friedrich A von (1899–1992); Marx, Karl (1818–89); Parsons, Talcott (1902–79); Social Evolution: Overview; Spencer, Herbert (1820–1903); Theory: Sociological
Bibliography

Boudon R 1986 Theories of Social Change. Basil Blackwell/Polity Press, London
Campbell D T, Heylighen F 1995 Selection of Organization at the Social Level: Obstacles and Facilitators of Metasystem Transition. World Futures: the Journal of General Evolution 45: 181–212
Childe V G 1951 Social Evolution. H Schumann, London
Comte A 1875 The Positive Philosophy. Trubner, London
Di Nuoscio E 2000 Epistemologia dell’azione e ordine spontaneo. Rubbettino, Soveria Mannelli
Durkheim E (1893) 1964 The Division of Labor in Society. Free Press of Glencoe, New York
Hayek F A von 1973–79 Law, Legislation and Liberty. Routledge, London, 3 Vols.
Hayek F A von 1988 The Fatal Conceit. Routledge, London
Heylighen F, Aerts A (eds.) 1999 The Evolution of Complexity. Kluwer, Dordrecht, The Netherlands
Infantino L 1998 Individualism in Modern Thought. Routledge, London
Parsons T 1971 The System of Modern Societies. Englewood Cliffs, NJ
Popper K R 1957 The Poverty of Historicism. Routledge, London
Sanderson S K 1990 Social Evolutionism: a Critical History. Cambridge, MA
Spencer H 1882–1896 Principles of Sociology. Williams and Norgate, London, 3 Vols.
Turchin V F 1977 The Phenomenon of Science. A Cybernetic Approach to Human Evolution. Columbia University Press, New York
White L A 1959 The Evolution of Culture. McGraw-Hill, New York
E. Di Nuoscio
Social Facilitation, Psychology of

Social facilitation refers to the effect of the presence of others on the individual. This phenomenon is of psychological interest to scholars who regard the presence of others as a minimal condition for social influence.
1. History of the Term

The term social facilitation was coined by the American Floyd Allport in an early textbook of social psychology (Allport 1924). Allport’s was a behavioral treatment of the field. As part of his Social Psychology, Allport maintained that a response called forth by a nonsocial stimulus could be augmented by an incidental social stimulus. He termed this contributory influence social facilitation. Allport reported demonstrations of the social facilitation of students’ task performance from the sight and sound of other students performing the same task. Allport’s research participants cancelled vowels more quickly, multiplied numbers more quickly, and generated word associations more quickly when they were working alongside peers than when they were not.

When an individual is in the presence of others, the others will invariably assume some role. Sometimes they are watching the individual, as members of an audience. Sometimes they occupy the same role as the individual, and all are engaged in the same behavior. In this latter case, the others are said to be co-actors. Audience and co-action researchers seek to infer social facilitation from experimental data, by comparing individuals who are in the presence of others with individuals who are alone. However, it is difficult to collect valid evidence of social facilitation, because the monitoring that the experiments require may compromise research participants’ sense of solitude. In such cases, no one is truly ‘alone,’ and the impact of others’ presence is difficult to isolate.

Social facilitation was first reported in an 1898 journal article by Norman Triplett. Studying records of bicycle races, the sports psychologist noted a tendency for cyclists to race more quickly when riding alongside competitors than when riding alone against the clock. Triplett developed a laboratory-based analogue of this social phenomenon when he had school children turn a fishing reel alone and in the presence of co-acting classmates. The presence of these young co-actors produced faster reel-turning, an effect that Triplett attributed to a ‘dynamogenic’ increase in motivation. Years later, Triplett’s 1898 study of others’ presence was dubbed the first experiment in social psychology, although its claim to this title has been disputed.
2. Effects of the Presence of Others

Social facilitations have been observed in a number of species. Chickens peck at food more quickly when other chickens are pecking; rats press a bar faster in the presence of other rats; cockroaches run with greater speed when running alongside other cockroaches. To the biologically oriented psychologist (e.g., Clayton 1978), these demonstrations of animal social facilitation hold profound interest.

People can be affected in many ways by the presence of others. Physiologically inclined researchers report social facilitations of human heart rate, blood pressure, and electrodermal activity. If often the presence of others increases an individual’s physiological arousal, sometimes affiliation with others can reduce high levels of arousal (Mullen et al. 1997). Psychological investigators have documented social facilitations of various behaviors (Kent 1994). In the presence of others, people eat large meals, express conventional judgments, and show a tendency to smile. Gender and cultural differences in social facilitation are sometimes observed. In the presence of others, males (but not females) display a heightened tolerance for pain. In the presence of others, Japanese (but not American) students conceal facial signs of distress.

Since Triplett’s pioneering investigation, psychologists have reported hundreds of experiments on the social facilitation of human task performance. The evidence that they have amassed indicates that the term social facilitation may be a misnomer. If sometimes the presence of others facilitates a person’s task performance, often it has the opposite effect. In the presence of others, people have trouble unscrambling anagrams; in the presence of others, people lack motor coordination; in the presence of others, people do poorly on memory tasks.

When is a person’s task performance facilitated by the presence of others, and when is it impaired? Psychologists had been pondering this question for decades before they found a convincing answer. In 1965, social psychologist Robert Zajonc published an article in the prestigious journal Science. There, Zajonc contended that the impact of the presence of others on task performance depends on the complexity of the performer’s task. The presence of others serves to facilitate the performance of simple tasks and to impair the performance of complex tasks, Zajonc claimed. Subsequent evidence verifies that the impact of the presence of others does, indeed, depend on task complexity. At the same time, it indicates that social impairments of complex performance are stronger than social facilitations of simple performance. People perform simple tasks more quickly in the presence of others. However, there is little evidence that others’ presence has any effect on simple performance quality, a quantitative review of 241 ‘social facilitation’ studies concluded (Bond and Titus 1983).
3. Explanations of Social Facilitation

Psychologists have attempted to explain social effects on task performance. In his 1965 article, Robert Zajonc proposed the most influential explanation. According to Zajonc, the presence of others increases an individual’s level of generalized drive. Generalized drive, learning psychologists Clark Hull and Kenneth Spence hypothesized, functions to energize an individual’s dominant response tendency toward any stimulus compound. Generalized drive enhances this dominant response tendency, to the exclusion of any weaker tendencies. According to Zajonc’s drive theory of social facilitation, the impact of the presence of others on the performance of a task should hinge on the performer’s dominant response tendency when attempting the task. On simple tasks, where the dominant tendency is toward a correct response, the presence of others should facilitate performance. On complex tasks, where the dominant tendency is to make a mistake, the presence of others should enhance this error, thereby impairing performance.

Zajonc’s theory intensified interest in the study of social facilitation. Many experiments were designed to test this drive theory of social facilitation, and the theory was often corroborated. When the presence of others caused errors in athletic performance, undermined spatial maze learning, and enhanced response tendencies that had been trained to dominance in the laboratory, researchers claimed evidence for the operation of a socially induced drive. The presence of others was shown to have different effects on task performance at different stages in the learning of the task: it retarded individuals’ acquisition of novel skills, then facilitated execution of those same skills once performance reached a plateau. Over the course of training, drive theorists reason, correct response tendencies come to dominate over errors; thus dominant-response enhancements can explain these opposing effects.

Although for a time theorists agreed that the presence of others functioned as a source of generalized drive, there were debates about the origin of the drive. Zajonc attributed drive to the unpredictability of species mates, and an individual’s need, in the presence of others, to be alert to the unexpected. Other social psychologists (notably Nickolas Cottrell) believed that drive reflected a learned tendency to associate the presence of others with positive and negative outcomes. Only when others were in a position to confer such outcomes would they have any effects on the individual, Cottrell implied. Positive and negative outcomes are often associated with the presence of others, all psychologists agreed. Many wondered, however, whether the mere presence of others would be sufficient to influence the individual, independent of any associative processes. In search of effects of the mere presence of others, experimenters conducted some colorful studies.
They fed rats in the presence of anaesthetized rats; confronted chickens with a dead companion; and had undergraduates trace spatial mazes in front of blindfolded mannequins. When results indicated that individuals can be affected by others who are merely present, many psychologists acted as though Cottrell’s analysis of social facilitation could be discarded. More recently, the theoretical significance of mere presence effects has been revisited. Robert S. Baron argues that if an individual learns to associate positive and negative outcomes with the presence of others, then others who are merely present should become a conditioned stimulus for those outcomes. From Baron’s perspective, mere presence effects are simply conditioned responses, and have no bearing on the origin of any socially induced drive. No doubt, there will be continuing discussion about the theoretical significance of the effects of the mere presence of others. In the meantime, it is apparent that others’ mere presence can have some effects.
As drive theorists engaged in intramural disputes, alternative theories of the effects of the presence of others were being developed (Geen 1989). A self-presentational analysis attributed the social facilitation of task performance to the performer’s regulation of a public image, and the social impairment of performance to embarrassment from a loss of public esteem. Theoretically, performers lose esteem when they must attempt complex tasks in public. Complex tasks, theorists note, are ones on which the individual makes many mistakes. Self-awareness theories held that the presence of others prompts individuals to compare their task performance to a performance standard. These self-evaluations may inspire performance facilitation or occasion a withdrawal from the task, depending on the outcome of the comparison. Cybernetic models of the self-comparison process were constructed. Performers must divide their attention between a task and any others who are present, an influential distraction-conflict theory emphasized. This theory made the novel prediction that social distractions would facilitate simple task performance, and inspired evidence of an audience-induced attentional overload (Baron 1986). The inability to visually monitor others’ reactions arouses uncertainty, and this is the key to understanding mere presence effects, writes the Australian psychologist Bernard Guerin (1993).
4. Related Phenomena

Alongside social facilitation from the presence of others are broader facets of the psychology of public performance. Performances, for example, take place in front of audiences of greater or lesser size. Mathematical equations have been developed to relate the number of people in the audience to the audience’s impact on the performer. Data have been fit to the proposed equations, and reveal a negatively accelerating impact as an audience grows in size. An audience may be supportive of the performer, or it may be hostile. Traditionally it was assumed that audience support would benefit the performer, and create a home-field advantage in sports. However, audience support may intensify pressure and make the performer choke, some paradoxical data suggest. Performers may bring positive expectations or negative expectations to a task. The presence of others benefits performers who expect to succeed, and impedes performers who expect to fail, the self-efficacy theorist Lawrence Sanna writes. The presence of co-actors may obscure the individual’s contributions to a group effort and provide an incentive to loaf, a related research tradition emphasizes. Psychologists continue to pursue these and other possibilities as part of their century-long study of social facilitation.
See also: Adolescent Development, Theories of; Adult Development, Psychology of; Deindividuation, Psychology of; Developmental Psychology; Group Decision Making, Social Psychology of; Group Processes, Social Psychology of; Group Productivity, Social Psychology of; Infant and Child Development, Theories of; Lifespan Development, Theory of
Bibliography
Allport F H 1924 Social Psychology. Houghton Mifflin, Boston
Baron R S 1986 Distraction-conflict theory: progress and problems. In: Berkowitz L (ed.) Advances in Experimental Social Psychology. Academic Press, New York, Vol. 19, pp. 1–40
Bond C F, Titus L J 1983 Social facilitation: a meta-analysis of 241 studies. Psychological Bulletin 94: 265–92
Clayton D A 1978 Socially facilitated behavior. The Quarterly Review of Biology 53: 373–92
Cottrell N B 1972 Social facilitation. In: McClintock C G (ed.) Experimental Social Psychology. Holt, New York, pp. 185–236
Geen R G 1989 Alternative conceptions of social facilitation. In: Paulus P B (ed.) Psychology of Group Influence, 2nd edn. L. Erlbaum Associates, Hillsdale, NJ, pp. 15–51
Guerin B 1993 Social Facilitation. Cambridge University Press, Cambridge
Kent M V 1994 The presence of others. In: Hare A P, Blumberg H H, Davies M F, Kent M V (eds.) Small Group Research: A Handbook. Ablex, Norwood, NJ, pp. 81–105
Mullen B, Bryant B, Driskell J E 1997 Presence of others and arousal: an integration. Group Dynamics: Theory, Research, and Practice 1: 52–64
Zajonc R B 1980 Compresence. In: Paulus P B (ed.) Psychology of Group Influence. L. Erlbaum Associates, Hillsdale, NJ, pp. 35–60
C. F. Bond, Jr.
Social Geography
1. Introduction—When and What
Social geography emerged as a significant sub-discipline during the 1960s. Although earlier references to ‘social geography’ can be found, Johnston (1987) notes that it is barely mentioned in reviews of geography published in the 1950s. The development of the new sub-discipline was part of exciting theoretical and methodological change that geography experienced in the 1960s. In the 1970s and 1980s, vigorous theoretical debates led to a variety of conflicting approaches to research in social geography. By the early 1980s, the sub-discipline was of major importance within human geography. Indeed, a recent discussion of the area claimed that it became ‘virtually synonymous with the whole field of human geography’ (Johnston et al. 1994, p. 562). Nevertheless, the growing prominence of cultural geography towards the end of the 1980s and into the 1990s had a major impact on social geography. This was neatly symbolized by the decision of the ‘Social Geography Study Group’ of the Institute of British Geographers (which had been formed in the early 1970s) to change its title to ‘The Social and Cultural Geography Study Group’ in the 1990s.
What is social geography and what do social geographers do? The rapid growth of the sub-discipline and the significant theoretical debates within it have involved shifts in its definition and practice. In 1965, Pahl suggested that it was about ‘the theoretical location of social groups and social characteristics, often within an urban setting’ (p. 82). In 1975, Emrys Jones argued that ‘social geography involves the understanding of the patterns which arise from the use social groups make of space as they see it, and of the processes involved in making and changing such patterns’ (p. 7). The emphasis on social groups evident in Pahl’s earlier definition remains of particular importance, in that what is held to make social geography ‘social’ is its concern not with individuals but with people as members of groups. However, there are some significant changes between the two definitions. First, Jones’ definition introduces the idea that social processes produce spatial patterns and vice versa. This understanding of social geography remained dominant into the 1980s. Second, Jones’ definition includes the phrase ‘the use social groups make of space as they see it’ (author’s emphasis). This signals an increasing interest in people’s subjective experiences in specific places—an interest that has become increasingly prominent in research designs. The understanding of the scope and aims of social geography implied in Jones’ definition endured until the mid-1980s when it came under increasing attack. These attacks and their implications for the development of social geography make better sense in the context of the account of the type of work that social geographers did from the 1960s to the mid-1980s, presented below.
2. Engaging with Positivism
2.1 Describing the Social Characteristics of Urban Areas
A major focus of research during the late 1950s and 1960s was the statistical description and analysis of the characteristics of the populations of urban areas. Central to this project was the identification of systematic patterns in the location of different social groups. Analysts examined which groups were likely to be segregated from other groups and whether there was evidence for particular patterns of location for different groups within cities. At this time, there were also important moves within human geography to apply positivist philosophy and methodology to the pursuit of geographical knowledge. These moves encouraged, on the one hand, attention to clearly specified theoretical statements and predictions and, on the other, the use of quantitative data and statistical techniques that were presented as allowing rigorous, scientific testing of hypotheses. Also associated with the growing commitment to positivism was a desire for predictive models of the location of human activity and people’s ‘spatial behavior,’ often drawing on neoclassical economics. For geographers interested in social groups and in cities this implied a need to develop predictive theories of the spatial form of cities, including theories of the location of social groups within urban areas. Geographers looking for theoretical and methodological inspiration turned to the work of urban sociologists. First, they drew upon the work of the Chicago School—including Louis Wirth, Robert Park, and Ernest Burgess—who carried out extensive studies of Chicago in the first half of the twentieth century. They used maps of a wide range of statistical information in order to identify ‘natural areas’—residential areas with distinctive physical and cultural characteristics—that the Chicago School sociologists argued developed from unplanned processes of competition and cooperation amongst urban residents. Burgess had developed a general model of urban residential land use as a series of concentric zones focused around the Central Business District. Second, urban geographers were drawn to the work of Eshref Shevky and Wendell Bell who had studied characteristics of the populations of Los Angeles and San Francisco.
Shevky and Bell had argued that the ‘increasing scale’ of society resulting from the increasing complexity of economic organization and the associated growth of urban areas could be linked to three dimensions of urban residential differentiation: ‘social rank’ or ‘economic status’; ‘urbanization’ or ‘family status’; and ‘segregation’ or ‘ethnic status.’ A key feature of their work was the development of ‘social area analysis’ using data on census tracts. They selected variables to represent the three dimensions and then used these to categorize census tracts in terms of ‘high’ or ‘low’ economic, family, or ethnic status. Although the theoretical links between increasing societal scale and the three dimensions of economic, family, and ethnic status were poorly specified and strongly criticized, geographers enthusiastically embraced the use of census data to characterize areas of the city. During the 1960s, more powerful computers allowed geographers to use sophisticated statistical techniques to analyze data on census tracts. Factorial ecology was particularly popular—the use of factor analysis or principal component analysis to identify a few general constructs which underlie a large number of variables. Shevky and Bell’s three status dimensions often emerged from the census data. The analyst could also classify census tracts in terms of the underlying constructs and map the results to see whether any systematic spatial patterns in characteristics of census tracts could be identified. These maps were compared with Burgess’ concentric model of urban form and comparisons between factorial ecologies of different cities were also made. Similar techniques were also used to look for relationships between residential differentiation, location, and patterns of crime, deviance, and mental health. The availability of census data for many cities (mostly in the ‘developed’ world) and increasing access to computers allowed social geographers to indulge in an orgy of analysis. However, there were some clear limitations to this work. First, the nature of the constructs identified was limited by the available census data: they could be as much a reflection of the data available as of the social ‘reality’ of urban areas. Second, because the data related to areas rather than individuals, false associations between variables may have been made (the ‘ecological fallacy’). Moreover, differences in the sizes and rules for creating census tracts between countries limited the comparability of the results. Third, although the practitioners of these analyses aspired to work within a positivist framework, much of the research was inductive description rather than tests of clearly specified hypotheses linked to well-identified theoretical propositions. Nevertheless, this work provided a considerable body of descriptive information on residential differentiation that has become part of the taken-for-granted knowledge of geography.
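The ‘ecological fallacy’ mentioned above can be made concrete with a small worked example. The sketch below uses entirely invented data (three hypothetical census tracts, each with three residents measured on two made-up variables) and a hand-written Pearson correlation; none of the names or numbers come from the studies discussed. It shows that tract averages can be perfectly positively correlated while the same two variables are negatively correlated across individuals.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented data: tract name -> list of (variable_1, variable_2) per resident.
TRACTS = {
    "A": [(1, 7), (4, 4), (7, 1)],
    "B": [(2, 8), (5, 5), (8, 2)],
    "C": [(3, 9), (6, 6), (9, 3)],
}


def individual_level_r(tracts):
    """Correlation computed over every resident, ignoring tract boundaries."""
    people = [person for residents in tracts.values() for person in residents]
    return pearson([p[0] for p in people], [p[1] for p in people])


def area_level_r(tracts):
    """Correlation computed over tract averages, as with census-tract data."""
    mean_1 = [sum(p[0] for p in r) / len(r) for r in tracts.values()]
    mean_2 = [sum(p[1] for p in r) / len(r) for r in tracts.values()]
    return pearson(mean_1, mean_2)


if __name__ == "__main__":
    print(f"area-level r:       {area_level_r(TRACTS):+.2f}")
    print(f"individual-level r: {individual_level_r(TRACTS):+.2f}")
```

With these invented figures the tract-level correlation is +1.0 and the individual-level correlation is −0.8, so an analyst who had only the tract averages would draw the wrong conclusion about individuals.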
2.2 The Behavioral Challenge
The descriptive analysis of urban residential characteristics was not the only type of work in social geography done during the 1960s and early 1970s. There were those who, despite their hopes for a positivist approach to the development of theory within geography, were dissatisfied with the approaches to studying human behavior that were used widely at that time. In the 1960s, much research in human geography drew its inspiration from models in economic geography that posited that people possessed perfect information and were rational and optimizing in their behavior. Dissatisfaction with these assumptions prompted some researchers to turn to psychological theory concerned with cognition and decision-making in order to investigate what information people had available to use in spatial decision-making and how they used it. The principal method employed in these studies was the questionnaire survey, frequently incorporating some variety of attitudinal and personality tests drawn from social psychology. The research carried out within this broad framework is known as ‘behavioral geography.’ Work in behavioral geography was concerned with the locational decision-making of firms, individuals’ decisions over where to live, people’s everyday movements, and their information and ideas about space and place. Gould’s work on mental maps was particularly influential (Gould and White 1974). His initial research examined American undergraduates’ views of each US state as a place in which to live after graduation. The results showed that individuals had a strong preference for their home area and that this preference was superimposed upon an image of a national space with marked variations in preferences. The idea of an incomplete, biased and personal vision of the spatial world conjured up by the term ‘mental map’ became part of geographers’ ideas about how people made decisions about both everyday movement and migration. Information about space and place was also recognized as important within studies of residential mobility and of people’s everyday travel behavior. In the former, researchers were influenced by the early work of the sociologist Peter Rossi whose 1955 book Why Families Move was subtitled A study in the social psychology of residential mobility. A large number of studies of residential mobility in different cities and countries resulted. In the latter, diary data on travel behavior were used to develop inductive theories about the way people choose and order travel destinations. Travel behavior and migration research explored how people searched for information in space and developed strategies for choosing locations. Research using the perspectives of behavioral geography—while part of social geography—remains marginal to the field because of its focus on the individual rather than on people as interacting with and partly constituted by their social milieu.
However, the impossibility of such an individualist approach is evident in much of the work actually carried out. Researchers often examined contrasts between individuals drawn from different social groups (for example, different income, class, age and, occasionally, gender groups). Others examined the influence of the resources available to individuals on their spatial cognition and so ended up investigating differences between social groups. The research is also significant to social geography because criticisms of its individualistic approach were important to the development of alternatives.
3. Competing Approaches
During the early 1970s, many geographers became dissatisfied not only with the philosophical approach of positivism but also with the lack of critical political engagement that they detected in contemporary research. These concerns grew out of the radical ideas promoted in America and Europe as part of opposition to the American war in Vietnam, anti-nuclear campaigns (especially in Britain and European countries), civil rights (particularly in America), and the growing ‘rediscovery’ of poverty in the prosperous countries of the West. What was needed, it was felt, was research that could suggest how to change society, not merely how to understand it. Positivism was criticized for suggesting that an objective approach to social science was necessary, feasible, and desirable and that it was possible to find ‘universal laws’ of human society. Diverse competing ‘radical’ approaches were proposed; they all shared the desire to challenge the inequalities and injustices of contemporary society although some sought this through changing our valuation of human beings, some through institutional change and some through revolutionary political action. A sketch of some of the main approaches follows. Although here they are separated into demarcated ‘boxes,’ in reality their boundaries were fuzzy.
3.1 Humanistic Geography
The humanistic geography that emerged in the 1970s was animated by a desire to present an alternative to positivism. It shared the behavioral geographers’ critique of some positivist approaches as dehumanized—insofar as they did not engage with the world as experienced by individuals—but added to this the more fundamental criticism that positivist social scientists made false claims to objectivity—by failing to recognize the subjectivity of both the observed and the observer. Here, humanistic geography made a decisive break with positivism and turned to approaches which require empathy with people’s own perception of the values and meanings which inform their action (Jackson and Smith 1984). Two other important features of humanistic approaches were the insistence that people were active agents shaping their own worlds and a fascination with the links between place and social identity. Although links have been made between the humanistic geography of the 1970s and a variety of early work in geography, the most immediate influence was the Chicago School. Humanistic geographers turned to the wealth of studies of the ‘cultural life’ of ‘natural areas’ produced by the Chicago School, as well as drawing on the ‘community studies’ tradition in sociology which had itself been partially inspired by the Chicago School. One of the major strands of work focused on understanding the way of life of particular social groups in particular areas of a city (although a few studies of rural areas were also carried out). It used ethnographic methods including participant observation, firsthand observation and interviews as well as documentary information and census materials. Other researchers used ethnographic methods to explore features of crime and deviance in selected neighborhoods, often using factorial ecology to identify suitable areas for study. Humanistic geographers also engaged with what were for 1970s geographers the new philosophical approaches of phenomenology and existentialism. There were two important consequences. First, there was a move towards thinking of the individual as constituted through the social interactions in which they take part. These social interactions and the everyday behaviors in which people engage create a taken-for-granted world which the social scientist can attempt to deconstruct. Second, there was an attempt to avoid prior theorization about the values and meanings of those people who are studied and to be open to understanding the world through their eyes. However, this still allows the findings of such studies to be interpreted within a theoretical or practical context and for disagreement about such interpretations to be fierce. Another focus was on understanding the cultural meanings of, and values attached to, particular landscapes and built forms. This work employed the methods of hermeneutics (for example, Meinig 1979). In the 1970s, this work was considered to be on the margins of social geography and soon became seen as part of the revival of cultural geography. As such, its investigation of the multiplicity and instability of meanings, their communication and reformulation began a tradition of work that has had important implications for contemporary social geography.
3.2 Welfare and Weber
David Smith’s Human Geography: A Welfare Approach (1977) addressed what had begun to emerge as a new approach to human geography of particular significance to social geography. This approach focused on ‘who gets what, where, and how’—that is, on issues of social justice and inequality. It therefore involved both arguments about what constitutes social (and often, territorial) justice and attempts to measure empirically various types of social inequality. An early focus was on developing social indicators to measure inequality between social groups living in different areas at a variety of scales ranging from the city to the world. This research built on the expertise developed in earlier descriptive analyses of census data on cities. It also included measurement of people’s spatial access to services (for example health care or education) or proximity to negative factors such as pollution. However, although this answered questions of where inequality or deprivation might be found and who suffered such problems, it did not answer questions of how this inequality arose. Consequently, researchers developed a range of analyses of the operation of particular agencies and agents whose actions seemed significant to the distribution of resources. A substantial body of work thus developed around state and private institutions’ management of the distribution of resources, drawing on the theories of Max Weber. The concept of ‘managerialism’ was prominent. It was first proposed by Pahl (1975), who suggested that within both the state and private sector there were ‘gatekeepers’ whose decisions were crucial to access to resources and so to people’s quality of life. Within social geography, the ideas of managerialism stimulated a wide range of studies of access to both public and private sector housing. These were counterposed to the earlier studies of housing search and housing choice to suggest that they had paid too little attention to the constraints on people’s choices in the housing market. The theme of choice and constraint reappears within this body of work applied not only to housing but also to resources like health care, education, services for the elderly, parks, and open space and delivery services in rural areas. However, the ideas of Weber also introduced an increasing interest in conflict between and among institutions and groups. Here social geographers drew upon the ideas of the sociologist John Rex (1968) on housing classes, as well as on Weberian ideas about the operation of bureaucracies and professional groups, to explore conflicts over the distribution of resources. A different approach to access used the ideas of time geography developed by the Swedish geographer Hagerstrand (1975). He suggested that time and space are resources which people must use to realize the ‘projects’ which they wish to carry out. Since traversing space requires time and many activities require the co-presence of individuals or presence at a particular location, not all projects can be realized.
People’s ability to carry out particular projects is limited by capacity, coupling, and authority constraints. Despite the enthusiasm of some geographers and a few empirical studies, the potential of these ideas remained regrettably underdeveloped. Most empirical use of Hagerstrand’s ideas focused on the physical limits to access and did not develop the possibilities of a more fully realized social theory of human behavior in space and time. The majority of welfare-focused research was concerned with cities in ‘developed’ countries; however, attention was also paid to inequalities in the rural areas of such countries and these perspectives have been used in studies of developing countries, Eastern Europe and the former USSR. There were overlaps with other radical approaches. The study of conflict and constraint often employed ethnographic methods to explore the values and patterns of behavior of gatekeepers and the social groups attempting to access resources. Attempts to answer the how element of welfare geographers’ problematic often drew on research using Marxist theory which sought to situate explanation of issues such as housing access within an analysis of the operation of capitalist society.
3.3 Marxist Approaches
The book that best illustrates the thinking that drew geographers to Marxist theory was Harvey’s Social Justice and the City (1973). Another key event was the founding of Antipode: a Radical Journal of Geography in 1969 at Clark University in the USA. Harvey’s book charts and explains a shift in his thinking from liberal to Marxist. At the start of the 1970s, very few geographers knew anything about Marxism, and Marxist analyses of the political economy of cities and regions were rare. By the end of the decade, there had been a major expansion in the use of Marxist approaches sufficient to create a new orthodoxy. This had implications across all geography, while in social geography Marxist approaches were embraced with particular enthusiasm. The analysis of residential differentiation and change had been a central part of social geography since its inception. The new Marxist theory offered ways of understanding these processes through ideas of class struggle and capital accumulation, and large-scale patterns of investment and disinvestment could be interpreted in terms of crises in processes of accumulation. Such analyses were applied to gentrification and suburbanization. Processes of socialization within residential areas could also be theorized as involving the reproduction of particular social subgroups (class fractions). These early formulations were challenged by an important new body of socialist-feminist work that developed in the early 1980s. While this incorporated many of the postulates of Marxism, it criticized the conceptual and practical neglect of gendered power relations and the failure to identify and explore the links between gender and class relations.
Critical empirical research on both gentrification and suburbanization sought to explore the ways in which gender relations were involved in the construction of the spatial structure of the city and the implications of this for women’s oppression in society. In general, this socialist-feminist work was more sensitive than other Marxist approaches to the role of human agency and to humanist claims concerning the partiality of knowledge. The approach to analyzing access to services was also strongly affected by Marxist analysis. Castells’ work on collective consumption and social movements was highly influential. Castells defines collective consumption as the provision of services that are produced and managed collectively, and distributed through non-market allocation processes. He argued that these were provided through the State apparatus because they were necessary to the reproduction of labor power but would not be provided directly by the capitalist market because of a lower than average profit rate. Castells offered a highly contentious definition of the city as ‘a unit of collective consumption corresponding more or less to the daily organization of a section of labour power’ (1976, p. 148). More importantly for research in social geography, he also spelt out how a variety of social movements might arise in response to inadequacies in the provision of collective consumption. Such social movements might fragment class solidarity and reduce the possibility of radical political change or enhance it. Castells’ ideas stimulated extensive conceptual and empirical research in both urban sociology and in social geography.
3.4 Critiques and Cross-fertilization
By the early 1980s, social geographers had become better versed in social theory and more critical and sophisticated in its application. There were debate and cross-fertilization of ideas amongst researchers working in the three areas identified above and increasing awareness of their shortcomings as well as their strengths. There were strong critiques from humanists and feminists of the neglect of human agency in Marxist approaches in geography and this encouraged geographers’ interest in the sociologist Anthony Giddens’ ‘structuration’ theory (Giddens 1981). This attempted to show how human agency and structures were mutually constitutive within a time-space framework and drew on the earlier ideas of Hagerstrand. These critiques encouraged attempts to develop a more ‘humanistic’ Marxism sensitive to both human agency and contingency. At the same time, those concerned with issues of welfare and social justice were increasingly using some form of political economy approach deriving from Marxism to frame their research. All three approaches, as well as earlier work in social geography, were criticized by feminist geographers for their neglect of gender (Bowlby and McDowell 1987) and a major new body of research by feminist social geographers began to develop.
This involved ethnographic empirical study of the everyday life of women and incorporated ideas drawn from welfare geography (in relation to issues of access and inequality), humanistic geography (in relation to women’s experience of and social relations within particular places) and, as indicated above, Marxist geography. Some of this feminist work was linked to another major development within geography—the explosive growth of cultural geography during the late 1980s with its insistence that because meaning is constantly created and changed through shared but contested discourses, both individual and group self-definitions are unstable and contested. This new cultural geography drew on earlier humanistic social geographical research as well as on feminist and postmodern critiques of modernist theory. In particular, modernist theory was criticized for assuming a white, bourgeois, and masculine subject and for its attempts to discover universal truth. The impact of these critiques, and of the new empirical work done in the 1980s and 1990s in both cultural and economic geography, has been to call into question both earlier definitions of social geography and the coherence of the sub-discipline.
4. What is ‘the Social’ in Social Geography?
The later 1980s and the 1990s saw a progressive destruction of the main features of Emrys Jones’s 1975 definition of social geography. That definition involved the separation of social process and spatial pattern and a focus on the use of space and creation of spatial patterns by social groups.
4.1 Social Process and Spatial Pattern or ‘Spatiality’?
The 1980s saw growing dissatisfaction with the idea of social processes as distinct from spatial patterns, on the grounds that, because all social interactions necessarily take place in and through space, spatial pattern cannot be isolated from social process. These attacks derived from Marxist, humanistic, and cultural geographers’ critiques of positivist analyses of abstract space as an empty container of social processes. Instead, Marxists argued that space was produced through capitalist social processes and through processes of class and other groups’ opposition to capitalism. Humanistic and cultural geographers were also using phenomenological theory and empirical research to argue that our understandings of space are produced through our everyday social activities and do not draw on prior concepts of abstract space.
4.2 The Social vs. the Economic
In the early work, people’s participation in the life of social groups made social geography social and distinguished it from approaches that focused on the individual. The existence of social groups and the salience of characteristics such as ethnicity or income were largely taken for granted. Questions about which social groups to study, and why, have had a number of implications for social geography. First, increasing familiarity with Marxist and Weberian social theory made social geographers more aware of the need to define what were significant social groups in relation to the major constellations of power in society.
Since economic relations are clearly of central importance to social power this required social geographers to consider the relations between the social and the economic. Were social relations to be seen as simply derivative of more ‘fundamental’ economic relations? If so, was social geography of less significance than economic geography? Second, social geography had always concerned itself with issues of social reproduction—with housing, education, health care, and crime. However, as the new feminist work of the 1980s soon established, they had neglected some central aspects of social reproduction—notably those to do with domestic labor and the home. This feminist work also showed through analyses of the links between the gendering of the ‘home’ and the ‘workplace’ how untenable was the division between social reproduction and production. Thus, they claimed that ‘production and reproduction—the conventionally accepted subjects of economic and social geography respectively—cannot analytically be separated’ (Bowlby and McDowell 1987, p. 305). At the same time, researchers in economic geography were arguing for analyzing economies and economic relations as social constructs. During the 1990s, the importance of examining the social relations embedded in economic exchange was re-emphasized by a number of researchers. In particular, feminist research on labor markets showed clearly the interpenetration of the social and the economic (Hanson and Pratt 1995). Thus, the term ‘social’ no longer clearly delimits the subject matter for social geographers.
4.3 Discourses and Social Groups
Meanwhile a resurgence of cultural geography and associated growth of interest in postmodern theory has called into question social geographers’ accustomed ways of thinking. On the one hand, these developments have drawn attention to the power relations inherent in the definition of social groups and to the political problems of representation this poses. They have also stressed the fluidity and contested nature of such definitions. On the other hand, postmodern approaches have questioned the validity of modernist ‘grand theory’ such as that of Marx and Weber that has been part of the stock in trade of social geographers from 1975 to 2000. Cultural geography has thus required social geographers to examine processes of representation and communication more closely and to question the idea that there is a single ‘social reality’ to be uncovered through research.
4.4 What Future for Social Geography?

There are significant vulnerabilities in the current situation of social geography. It now lacks a distinctive focus. In the 1990s, the definition of social geography given in the Dictionary of Human Geography was ‘The study of social relations in space and the spatial structures that underpin those relations’ (Johnston et al. 1994, p. 562). This is so broad that it is difficult to see where social geography ends and other sub-disciplines—for example, economic, political, or cultural geography—start. It is also arguable that the term has become shorthand for research that concentrates on issues of ‘social relations in space’ only in Western societies. Despite the existence of much research in developing countries which, if carried out in a Western society,
would be termed ‘social geography’ there has been too little interaction between those researchers working on developing and developed countries. There are, however, some indications that this is changing with greater appreciation of the implications of the globalization of communication and economic relationships and the significance of diasporic cultures. For many social geographers the growth of cultural geography presents an opportunity to combine many of the themes of the two sub-disciplines in fruitful investigation of social inequality and difference and to focus on and destabilize definitions of a wider range of ‘social groups’—such as disabled people and gay and lesbian people. Social geographers’ tradition of political engagement and ethnographic research can combine positively with cultural geographers’ sensitivity to discourse and symbolic expression of difference. The rapprochements between social, cultural, and development geographers seem likely to produce valuable research. Nevertheless, the term ‘social geography’ on its own—as opposed to ‘social and cultural geography’ or ‘social and development geography’—seems likely to revert to its earlier obscurity.

See also: Access: Geographical; Behavioral Geography; Crime, Geography of; Critical Realism in Geography; Cultural Geography; Disability, Geography of; Gender and Feminist Studies in Geography; Humanistic Geography; Marxist Geography; Postcolonial Geography; Postmodern Urbanism; Postmodernism in Geography; Qualitative Methods in Geography; Residential Segregation: Geographic Aspects; Rural Geography; Sexuality and Geography; Space and Social Theory in Geography; Spatial Analysis in Geography; Time-geography; Urban Geography; Urban System in Geography
Bibliography

Bowlby S R, McDowell L 1987 The feminist challenge to social geography. In: Pacione M (ed.) Social Geography: Progress and Prospect. Croom Helm, London
Castells M 1976 Theoretical propositions for an experimental study of urban social movements. In: Pickvance C (ed.) Urban Sociology: Critical Essays. Methuen, London
Giddens A 1981 A Contemporary Critique of Historical Materialism. Macmillan, London
Gould P, White R 1974 Mental Maps. Penguin, London
Hägerstrand T 1975 Space, time and human conditions. In: Karlqvist A, Lundqvist L, Snickars F (eds.) Dynamic Allocation of Urban Space. Saxon House, Farnborough, UK, pp. 3–14
Hanson S, Pratt G 1995 Gender, Work and Space. Routledge, New York
Harvey D 1973 Social Justice and the City. Edward Arnold, London
Jackson P, Smith S 1984 Exploring Social Geography. Allen and Unwin, London
Jones E 1975 Readings in Social Geography. Oxford University Press, Oxford, UK
Johnston R J 1987 Theory and methodology in social geography. In: Pacione M (ed.) Social Geography: Progress and Prospect. Croom Helm, London
Johnston R J, Gregory D, Smith D M 1994 The Dictionary of Human Geography, 3rd edn. Blackwell, Oxford, UK
Meinig D (ed.) 1979 The Interpretation of Ordinary Landscapes. Oxford University Press, New York
Pahl R 1965 Trends in social geography. In: Chorley R J, Haggett P (eds.) Frontiers in Geographical Teaching. Methuen, London
Pahl R 1975 Whose City? 2nd edn. Penguin, Harmondsworth, UK
Rex J 1968 The sociology of a zone in transition. In: Pahl R E (ed.) Readings in Urban Sociology. Pergamon Press, Oxford, UK
Smith D 1977 Human Geography: A Welfare Approach. Edward Arnold, New York
Shevky E, Bell W 1955 Social Area Analysis. Stanford University Press, Stanford, CA
S. Bowlby
Social History

The term ‘social history’ refers to a subdiscipline of the historical sciences on the one hand and to a general approach to history that focuses on society at large on the other hand. In both manifestations social history developed from marginal and tentative origins at the end of the nineteenth and the beginning of the twentieth centuries and experienced a triumphant expansion from the 1950s to the 1980s. Throughout, social history can be best defined in terms of what it wants not to be or against that to which it proposes an alternative. On one hand it distinguished itself from political history, which had been dominant during most of the nineteenth and the early twentieth centuries. Consequently, the ‘social’ in social history meant dealing with the structures of societies and social change, social movements, groups and classes, conditions of work and ways of life, families, households, local communities, urbanization, mobility, ethnic groups, etc. On the other hand social history challenged dominant historical narratives which were constructed around the history of politics and the state or around the history of ideas by stressing instead social change as a core dimension around which historical synthesis and diagnosis of the contemporary world should be organized. With these aims in mind, social historians (including historical sociologists and economic historians) sought to uncover the relationships between economic, demographic and social processes and structures, as well as their impact on political institutions, the distribution of resources, social movements, shared worldviews, and forms of public and
private behavior. For the comprehensive variant of ‘social history,’ the term ‘history of society’ (Gesellschaftsgeschichte) has gained currency. Not only in Western Europe and North America but also in other parts of the world, the establishment, expansion, and specialization of social history belong among the significant trends of interwar and particularly postwar intellectual and academic life. Nevertheless, in recent years, it shares with other fields of the humanities and social sciences a ‘time of doubts’ (Chartier 1993), a period of challenges from interpretative, constructivist, and discursive approaches (Palmer 1990, Bonnel and Hunt 1999).
1. Early Institutionalization (Late Nineteenth–Early Twentieth Century)

Like other generations of historians in the twentieth century that have at some point used the label ‘new’ for themselves, the claim to innovation by social historians in the 1960s and 1970s obscured the rich diversity of predecessors and competitors, which goes back to the second half of the nineteenth century (Kocka 1986). In that period, the academic professionalization and disciplinary differentiation of economics, sociology, and political science on the one hand and history on the other prevailed. The mutual contact zones or, by contrast, the strength of boundaries were marked by significant national differences. The resulting interactions and hierarchies between these ‘fields of knowledge’ (Ringer 1992) differed in respect to the amount of space they provided for historical approaches. The ‘Historical School’ of political economy in Germany, dominant in the last third of the nineteenth century, is a case in point. It was part of economics and did not gain much recognition in the discipline of history. Some professional historians took part in a process of specialization in ‘social and economic history’ that kept a critical distance as much from this strand of economics as from the traditional history of great men, events, and ideas. The development of the combined field ‘social and economic history’ marked a double opposition, on the one hand to the history of political events and on the other hand to an increasingly ahistorical economic science. The creation of a specialized journal in Germany allows one to give a precise date to the development of this subdiscipline of historiography.
In 1893 the journal for social and economic history was founded, which was continued after ten years (and still exists today) as the Vierteljahrschrift für Sozial- und Wirtschaftsgeschichte (VSWG), under the direction of the Austrian socialist Ludo Moritz Hartmann and the conservative medievalist Georg von Below. That journal served as a direct reference for Lucien Febvre and Marc Bloch (see Bloch, Marc Léopold Benjamin (1886–1944)) when they founded the Annales in 1929
in Strasbourg. Not surprisingly, their journal was first called Annales d’histoire économique et sociale (1929–1939). The international exchanges and mutual observation between some of the protagonists of historiographic innovations in the early decades of the twentieth century were intense. Thus, only two years after the French flagship, the Polish journal Roczniki dziejów społecznych i gospodarczych (Yearbooks for social and economic history) appeared. These early journals shared a further characteristic which distinguished them from the mainstream political history around them, namely a keen interest in ‘universal,’ non-nationalistic history. In Germany, social and economic history remained a specialized branch without much impact on general history. In England, economic history maintained close links to social themes. After World War II, the French Annales historians tended to move their journal away from its earlier identification with this ‘subdiscipline.’ Similarly, the program represented by Ernest Labrousse for a history of class relations taking as its starting point the economic ‘basis’ broadened out more and more into the areas of urban history, quantitative analysis of social strata and the integral description of regional societies. Where ‘social and economic history’ continued to be an institutionalized specialty, it did not necessarily figure as an area of major innovations, as was later the case with the ‘new’ history of the 1960s and 1970s. These early efforts to establish a specialized field and an alternative view of history with ‘society’ as the main object and organizing principle of analysis were paralleled by similar movements in various countries.
In the USA, the New History of the turn of the century; in France, Henri Berr with his Revue de Synthèse (Historique) starting in 1900 and the reception of Durkheimian sociology by historians; in Germany, the development of a historical sociology by Max Weber and others; and in Great Britain in the first half of the twentieth century the impulses of George Unwin and R. H. Tawney for the development of an economic history that included social and cultural factors, were all among the most influential strands of sociohistorical thinking (Iggers 1975). Two characteristics of these debates and innovations deserve emphasis: the challenge of Marxism in a broad, nonorthodox sense provoked both emulation and animosity. Secondly, it is interesting to note in light of current discussions that especially in the German context the concept of ‘culture’ rather than that of ‘society’ was conceived as the vantage point for the synthetic organization of knowledge in the social sciences. In the 1950s the representatives of French ‘new history’ used the term ‘history of civilization’ to define the broadest possible framework of historical synthesis. Even Trevelyan’s 1942 definition of social history as ‘the history of a people with the politics left out’ which is cited as often as it is refuted provides a surprisingly wide room for the ‘social,’ implicitly
including culture, mentalities, everyday life. (For a detailed treatment of the above topics see Historiography and Historical Thought: Indigenous Cultures in the Americas; History and the Social Sciences). In recent times a further root of social history has attracted wider attention. It lies in the antimodern nationalist currents, critical of industrialization, which in particular in the 1920s and 1930s presented a special attraction for the younger social and human scientists. The interest in rural life, population, and race combined particularly in Germany with aggressive nationalism and criticism of the new political borders (Raphael 1997). This historical writing opposed conventional research subjects and forms; it turned away from the state and classical political history, looked for alternatives to the traditional focus on elites, and took as its example the methods which had already seen much progress in demography, geography, and anthropology. In the 1930s and 1940s a number of representatives of this so-called Volksgeschichte (history of the people) moved closer to National Socialism and became active as wartime political consultants. As some of these historians and social scientists played a significant role in the modernizing of the human sciences in the German Federal Republic, a debate arose as to the extent to which lines of continuity led from ‘history of the people’ to social history, and concerning the relation between its representatives’ rightist views and their proposed ‘structural history’ of modern industrial society (Schulze and Oexle 1999, Conrad 1999).
2. Postwar Expansion (1950s to 1970s)

Not only in Western Europe and North America but also in other parts of the world, the establishment, expansion, and specialization of social history belong among the significant trends of interwar and particularly postwar intellectual and academic life. The spread of the ‘new’ histories since the 1960s occurred at a time when systems of higher education in Europe and North America were dramatically expanding. The general increase in academic jobs, student enrolments and Ph.D. candidates, as well as the massive growth in book publications and journals, allowed the various strands of social history to find ample space without replacing or pushing aside more traditional ways of writing and teaching history. The institutional growth of social history followed different national directions. But it can generally be said that in the 1970s a high point was reached as far as the establishment within universities and among the larger public was concerned. The most impressive indicator is the founding of new journals concentrated around the year 1975: Social Science History (USA, 1975), Geschichte und Gesellschaft (West Germany, 1976), History Workshop Journal (UK, 1976); Social
History (UK, 1976). Along with the Journal of Social History (USA) founded in 1967 they represented the new ‘historical social science,’ complementing the older ‘school-forming’ reviews such as above all Annales ESC (France, refounded 1946) and Past and Present (UK, 1952).
2.1 National Variations and International Circulation of Ideas

The triumph of social history in the 1960s and 1970s was a transnational phenomenon. The themes, questions, and methods were astonishingly similar in different countries; the international specialist public was interconnected through increased contacts. In many places the attractiveness and legitimacy of new approaches were aided by models coming from other countries. Particularly the French ‘nouvelle histoire,’ often described by others as the Annales school, developed into a veritable export hit. Already at the international historians’ congresses in Paris in 1950 and in Rome in 1955, its French representatives and Italian or English allies very much set the tone (Erdmann 1987). In the following decades the international circulation of concepts and approaches took on greater strength through the translation of books. A few historiographic models or theoretical influences (such as for instance the work of Fernand Braudel) became widespread in the countries of southern Europe and South America, while others won international recognition via their reception and dissemination on the North American market. Yet for all that the national differences remained considerable, even in retrospect and after several decades of approaches and exchanges. The strengths of the French school remained—with a few exceptions—limited to the Middle Ages and the early modern period. Not the genesis of industrial society but the explanation of the revolution of 1789 was and remains the dominant theme for French historians. By contrast in Great Britain, an early market society and ‘first industrial nation,’ historians gave particular importance to these factors. The central approach which English social historians shared with social reformers and social scientists was however focused on the persistence and dynamism of class society. It is by no means a coincidence that E. P.
Thompson in his pioneering work The Making of the English Working Class (1963) made decisive strides for social history in this field. And the historical self-examination in other countries can also—though certainly not entirely—be characterized through central motives in their social development: the USA as an immigration and frontier society, in which ethnic cleavages are given more weight than class conflicts; Sweden with its experiences of mass emigration and social crises at the end of the nineteenth and beginning of the twentieth centuries; and finally Japan, which after
World War II looked to describe its latest developments in terms of Marxian-inspired modernization models (Conrad 1999). The political breaches and catastrophes of the twentieth century also formed the node of the new German social history, which was especially influenced by the political social history by the ‘Bielefeld School’ historians. The German Sonderweg, the German divergence from the West, and the social conditions for the national-socialist rule were the central preoccupations of a whole generation of historians who broke with ‘historicist’ (see Historicism) and individualizing explanatory models (Kocka 1999, Welskopp 1998).
2.2 Topics, Neighbors, Models

The place of social history in the intellectual landscape of the historical and social sciences of the post-World War II era can be understood by tracing the main trends with respect to its preferred research topics and questions, interdisciplinary alliances, methods, and explanatory models. These trends were most conspicuous in the period from the late 1950s to the 1970s, which witnessed the most pronounced change toward the ‘new’ social history. The reorientation effected by the spread of social history is most obvious with respect to the subjects chosen for monographs, in the programs of congresses or in the contents of major journals. The history of social classes and social stratification in general entered into the historical mainstream. In researching social mobility and standards of life in the past, historians in Europe and North America tended to prolong the prehistory of their contemporary societies into the nineteenth century (and later even further back). Labor history (Labor History) left its subaltern and ideological niche and expanded its focus to the realities of work, family, and everyday life. The economic interests and social position of political actors were taken seriously and entered into the ‘social interpretation’ of great processes like the American and French revolutions, colonialism and imperialism, the development of political parties and social movements, or the causes of war. The fluctuation of harvests, prices and wages, the ebb and flow of population movements and land use, even microbes and the climate were considered with growing technical sophistication and by an increasing number of historians. Population history with the more micro-oriented historical demography became a leading sector of the new historiography: migration, demographic behaviour, household structure were reconstructed by large research projects.
The Cambridge Group for the History of Population and Social Structure founded in 1964 by Peter Laslett, E. A. Wrigley, and Roger Schofield as well as the historical department of the French Institut National d’Études Démographiques
represented the effort of making history ‘scientific,’ and influenced similar enterprises in Germany and Austria, the Scandinavian countries and North America among others. Similar aims were held by historically minded sociologists, political scientists, and economists who joined historians in the USA to create a discussion forum under the heading of ‘Social Science History’ (with conferences since the 1960s and a journal of the same name since 1975). The flourishing subfields, each with their own journals, congresses, and experts, read like a repertoire of social history themes (Stearns 1994): labor, demographic, family, urban, rural, ethnic, women’s, and gender histories emerged and became aspects of a more encompassing social history (Social History). Internal sequences and echoes were to be observed in the establishment of subdisciplines and the proliferation of new topics: the growing emphasis on mentality, agency, subjectivity, and experience since the late 1970s was itself a response to the establishment of abstract, often quantitative studies on standard of living, family and demography, social and geographic mobility, or political interest groups. Women’s historians criticized the unspoken gender assumptions of much of historical demography and family studies, as well as of the history of labor markets and welfare states. The very meaning of the term ‘social’ was itself reconceived; it lost its direct association with ‘socialist’ movements or ‘social policy’ and embraced instead the whole abundance of the life-world. The enlarged thematic spectrum of the Amsterdam-based International Review of Social History (1956), the Parisian Le Mouvement Social (1960) or the German journal Archiv für Sozialgeschichte (1961), are examples of this transition which had gained considerable ground by the mid-1970s. The shift in interdisciplinary alliances mirrored and underpinned topical interests.
The nearest neighbors to history were no longer philology, philosophy, or law, but economics, political science, and sociology. For the French Annales School the close relationship with geography and demography was particularly important. The West German focus on a ‘political social history’ of the late nineteenth and early twentieth centuries led to an alliance with historical sociology and political science. Various strands of descriptive, often quantitative economics, social surveys, and sociography are found among the more practical roots of historians who embarked on studies of living standards, family formation, or business cycles. Equally it should not be forgotten that the enormous growth of sociology in the postwar era held a particular attraction for historians. The varieties of social and cultural anthropology, or ethnology as it is called in some countries, served as models for a minority of social historians, especially those working on early modern history. The journal Comparative Studies in Society and History (founded in 1958) became an important forum for historically
minded anthropologists and anthropologically interested historians and social scientists. The relationship between history and anthropology—a discipline which has also changed dramatically during the last three or four decades—is a good indicator for the underlying change of emphasis from ‘society’ to ‘culture’ within social history. Natalie Z. Davis’s influential collection of essays Society and Culture in Early Modern France (Davis 1975) or the wide impact of Clifford Geertz’s writings, especially his 1973 collection The Interpretation of Cultures, are cases in point. Parallel to these shifts, the protagonists and agents of the new social history reflected on the mechanisms of scientific change as such. What appeared to be happening under their own eyes and with their help in the humanities and social sciences seemed to resemble a ‘paradigm shift,’ as Thomas S. Kuhn had described it for physics in his influential book The Structure of Scientific Revolutions of 1962. Notwithstanding Kuhn’s own doubts as to whether his model could be applied to the social sciences, the ‘paradigm shift’ became a battle cry in the nascent struggle to establish social history over and against the history of politics, great men, or ideas. The implications of Kuhn’s work for the history of science were also welcomed by social historians. Indeed, the emphasis on social and external factors in the development and application of knowledge clearly corresponded to the efforts of social historians to stress context, interests, and structural constraints when dealing with ideas and ideologies (Appleby et al. 1994). Although at the time many were fascinated by the idea of a ‘paradigm shift,’ in retrospect we see a more subtly differentiated, less abrupt transition. It is certainly true that in individual national academic traditions there were in part intense debates as to the ‘right’ sort of history.
And in most cases a political dimension was also present, in which social history was defined as being more to the left, and political history more to the right. However rather than superseding an old paradigm, social historians could work on several levels (institutional, methodological, and thematic) and go about broadening and innovating their field within quite different time frames. If one sees the success story of social history as a pluralistic project fed from many different sources, it becomes easier to understand its expansion and, later, its reorientation in the direction of cultural history. In fact the many social-political and intra-academic forces which were involved in the rise of social history have continued to develop independently, and gradually cast doubt upon the plausibility of their own basic assumptions. These definitional clarifications should not, however, obscure the impact of the new social history on the traditions of historiography. The historicist principles of the historians’ craft were challenged by analytical, theory-driven approaches. The borrowing of concepts and even the adaptation of ‘middle range
theories’ (R. K. Merton) as tools in historical research and writing has become a trademark of social historians as compared to traditional ‘narrative historians’ (Le Goff and Nora 1985, Fairburn 1999). The use of comparative methods gained wide currency (Tilly 1984). Theories of modernization (Modernization, Sociological Theories of) became more attractive, above all for historians of the nineteenth and twentieth centuries. Even if the explicit use of theories remained in general limited, process-oriented terms such as industrialization, urbanization, secularization, and professionalization became ever more widely used in historiography. And they were by no means limited to social history in the more restricted sense. Modernization, industrial society, even democratization, all were part of the meta narrative governing a good part of social history writ large—albeit with national differences and often heavy criticism from alternative political currents within social history (see for example Judt 1979). Yet despite evident tension the height of the social history wave was marked by a good deal of common ground. The explanatory models proposed by the ‘new’ histories of the 1950s, 1960s, and 1970s shared the aim of explaining collective behavior and social relations by linking them to the structures of the economy and society. Although most opposed the rigidity of official Marxism-Leninism, many historians conceived of society as an architecture which resembled a vaguely materialist interaction between ‘basis’ and ‘superstructure.’ In her work on the French revolution Lynn Hunt explicitly rejected the widespread partition of monographs in three levels of economy/demography, society, and politics or mentality by simply turning it on its head (Hunt 1990, pp. 102–3). Interests, world views, political stances were largely understood as collective responses to objective social positions.
Since the 1970s, social history has experienced shifts in its basic assumptions about causality that have led to a much less homogenous situation. But there seems to be a widespread refocusing first on agency and practices, secondly on meaning and identity, and thirdly on microprocesses.
2.3 The Will to Synthesis

The social-historical approach to history set different requirements on the writing of an overall historical account than conventional political history. The narrative structure of political events, the periodization by means of wars or revolutions, and the intentionality of a historiography oriented towards the ‘big players’ could not be simply transposed on to social history. In this new field syntheses therefore had to contain— whether one liked it or not—a reflection on the character of ‘society.’ The will to synthesis is for this reason part and parcel of social history’s emphasis on a comprehensive approach to history. Here social
history sets itself off most clearly from the study of social and economic phenomena within a context which continues to be defined in political terms. The authors of large syntheses in social history are for that reason also more or less explicitly protagonists of a ‘history of society,’ or of a histoire totale in the French tradition. Mikhail Rostovtzeff’s Social and Economic History of the Roman Empire (1926) and Marc Bloch’s Société féodale (1939–1940) are both early and distinguished examples. Fernand Braudel, Eric Hobsbawm, and Hans-Ulrich Wehler emphasized in their own way this requirement with monumental general histories. But they could do so only by consciously borrowing concepts and models from the neighboring social sciences, and by setting out theoretical considerations on social structure and on the motors of social change. Whether these involved models on temporal and geographic structures such as Braudel’s, or a flexible use of Marxist theories such as Hobsbawm’s, or an explicit use of modernization theories such as Wehler’s—in each case this type of general history distinguished itself from the conventional historiography also in its explicit theoretical orientation. Yet a clear difference in the degree to which theory governs the historical narrative or the display of arguments remains between these works and those of historical sociologists such as Immanuel Wallerstein, Theda Skocpol, Charles Tilly, or Michael Mann—authors who nevertheless share common interests with the historians. But also at a more humble geographical and temporal level social historians have continuously endeavored to reconstruct the object of their investigations, society, as a coherent if multifaceted system, sometimes even in a literary way. Above all however they continue writing within their respective national historical frameworks (Wehler 1987–1995).
Often this takes the form of collections of articles in which various authors describe the individual aspects of society (a recent example is Thompson 1990). But even the short, broad-based handbook giving an overview of the state of research belongs in this category. In part this reflects the tendency to specific research on social classes and strata which leaves out further societal, cultural, and political contexts (Charle 1994). Although the emergence of the newer directions of gender, experience, and cultural history has increased the fear that social history would become even more fragmented, José Harris (1993), for example, has explicitly endeavored to integrate these very perspectives into a general historical account.
3. The Revenge of Success

However, social history’s expansion in quantitative scale and conceptual scope was large enough to produce serious side-effects. First, specialization increased dramatically. Greater demand for technical
expertise such as quantification, the creation of such subfields as historical demography, or the tendency of methodological approaches like oral history to take on a life of their own contributed to the differentiation of the field. Second, the proliferation of topics as well as of local or regional units of analysis deemed worthy of a monograph combined to create a very colorful map of approaches and subjects. In the late 1980s and later, representatives of social history looking back on the success story of their field could make out the first signs of self-doubt already in the second half of the 1970s, at the high point of social history’s intellectual and institutional glory (see contributions in Kocka 1989). More than any other approach to history, social history gradually came to be marked by fragmentation and dispersion. Third, social history as a general approach gained access to other historiographical periods and to other disciplines in the humanities. In literary criticism and modern languages the social history of literature became a trademark in the 1970s and early 1980s and, since then, has met with growing skepticism and disillusionment. Certainly it must be said that these developments represent a tribute to the success of social history. In the course of differentiation and specialization, the new approaches and subfields which had established themselves under the heading of social history gradually changed their character and underlying priorities (Burke 1991). What is often described at the end of the twentieth century as a cultural ‘turn’ in fact emerged over the years.
Indeed, disillusionment with the social history of the 1970s can best be understood not as a ‘paradigm shift’ but rather as an incremental, self-reinforcing, sometimes conflictual transition on different, non-unified, levels of historical work: research methods, preferred topics and sources, examples taken from other disciplines, new ‘buzz words’ and concepts, more hidden understandings of causality, agency, and evolution. Without question, the more recent set of shifts, from social history to the ‘new cultural history,’ makes us think harder about the establishment of social history and its concomitant changes of the societal and political context in earlier decades. The recent ‘cultural turn’ can first of all be understood immanently, that is intra-academically. In its various manifestations, social history continued in the tradition of an oppositional approach against the historical mainstream. Small wonder that it has attracted attacks from every new wave of critical and revisionist historians as well as the revival of older (biographic, narrative) forms of historical writing. The concept of gender has in many countries been at the forefront of debates around structure and agency, macrosociological categories and deconstructivist criticism. Particularly in Germany, a fierce debate with aspects of a generation conflict divided the ‘history of society’ embodied in the Bielefeld school of Gesellschaftsgeschichte (Iggers 1975, Kocka 1986,
Wehler 1987–1995) and the ‘history of everyday life’ (Alltagsgeschichte) represented by grassroots history workshops and a number of younger academics at the margin of the profession (Lüdtke 1995). Yet one must also consider the external, extra-academic factors which in the last decades have changed the relationship of the social sciences to society. The social and political movements of the postwar years made their mark on social history. Especially in the USA the expansion of themes and problems in social history since the 1960s had been given a strong push by the corresponding conflicts in American society: the movements of Afro-Americans, of students, of women, and of gays and lesbians added to the traditional input from the workers’ movement and radical intellectuals. Since the 1980s, the already proverbial trinity of race, class, and gender reflected these social forces in an academic discourse. In Europe, too, a similar, by no means linear elective affinity could be observed between social movements (the alternative left, the women’s, ecological and peace movements) and fresh ideas in history (gender, everyday, and microhistory). The 1980s and 1990s experienced a strong move toward independence and pluralization of identity politics and the associated attempts at claiming one’s own history for oneself. Taken together these trends have aided the transition away from a history of anonymous social processes and structures. This reorientation became even stronger with growing critique of the ethnocentric dominance of industrial societies in social history leading up to this time. In recent years, the slow inroads which postcolonial discourses have also made into historical science have on the one hand furthered the significance of classical social history, this time in the countries of Asia, Africa, and Latin America.
On the other hand the established concepts of modernization, development, progress, and culture were subject to a thoroughgoing critique and redefinition. Finally, the breakdown of actually existing socialism in 1989/1990 and the resulting political and economic transformation led to reflection on the basics of politics and the economy (Kocka 2000). The answers to these challenges and interrogations are by no means clear. The voices which speak of a ‘crisis’ are not restricted to social history, but are directed at the historical sciences, if not the humanities in general. Two trends can be made out. On the one hand, beyond all methodological differences, the self-reflection of the academic historian has moved to center stage. The idea that one can produce reliable knowledge about society, contemporary or past, was in 2000 taken far less for granted than a few decades previously (Bonnell and Hunt 1999). On the other hand the historicization of all forms of knowledge has become, as it were, a patent remedy in all debates. In opening up towards this sociocultural direction, it is true that social history may have lost in unity, perhaps in
recognizability, but it has gained in intellectual dynamism. See also: Business History; Cultural History; Economic History; Family and Kinship, History of; Gender and Feminist Studies in History; Gender History; Historical Demography; History and the Social Sciences; Labor History; Life Course in History; Modernization and Modernity in History; Social Movements, History of: General; Urban History; Work and Labor: History of the Concept; Work, History of; Working Classes, History of
Bibliography
Appleby J, Hunt L, Jacob M 1994 Telling the Truth about History. Norton, New York
Bonnell V E, Hunt L (eds.) 1999 Beyond the Cultural Turn. New Directions in the Study of Society and Culture. University of California Press, Berkeley, CA
Burke P (ed.) 1991 New Perspectives on Historical Writing. Cambridge University Press, Cambridge, UK
Charle C 1994 A Social History of France in the Nineteenth Century. Berg, Oxford, UK
Chartier R 1993 Le temps des doutes. Le Monde. Paris, March 18: vi–vii
Conrad S 1999 Auf der Suche nach der verlorenen Nation. Geschichtsschreibung in Westdeutschland und Japan 1945–1960. Vandenhoeck & Ruprecht, Göttingen, Germany
Davis N Z 1975 Society and Culture in Early Modern France. Stanford University Press, Stanford, CA
Erdmann K E 1987 Die Ökumene der Historiker. Geschichte der Internationalen Historikerkongresse und des Comité International des Sciences Historiques. Vandenhoeck & Ruprecht, Göttingen, Germany
Fairburn M 1999 Social History. Problems, Strategies and Methods. Macmillan, London
Geertz C 1973 The Interpretation of Cultures. Basic Books, New York
Harris J 1993 Private Lives, Public Spirit: Britain 1870–1914 (The Penguin Social History of Britain). Oxford University Press, Oxford, UK
Hobsbawm E J 1972 From social history to the history of society. In: Gilbert F, Graubard R (eds.) Historical Studies Today. Norton, New York
Hunt L 1990 History beyond social theory. In: Carroll D (ed.) The States of ‘Theory’. History, Art, and Critical Discourse. Columbia University Press, New York
Iggers G G 1975 New Directions in European Historiography. Wesleyan University Press, Middletown, CT
Iggers G G 1997 Historiography in the Twentieth Century: From Scientific Objectivity to the Postmodern Challenge. Wesleyan University Press, Middletown, CT
Judt T 1979 A clown in regal purple: Social history and the historians. History Workshop Journal 7: 66–94
Kocka J 1986 Sozialgeschichte. Begriff, Entwicklung, Probleme, 2nd edn. Vandenhoeck & Ruprecht, Göttingen, Germany
Kocka J (ed.) 1989 Sozialgeschichte im internationalen Überblick. Ergebnisse und Tendenzen der Forschung. Wissenschaftliche Buchgesellschaft, Darmstadt, Germany
Kocka J 1999 Asymmetrical historical comparison: the case of the German Sonderweg. History and Theory 38: 45–50
Kocka J 2000 Historische Sozialwissenschaft heute. In: Nolte P, Hettling M et al. (eds.) Perspektiven der Gesellschaftsgeschichte. Beck, Munich, Germany
Le Goff J, Nora P (eds.) 1985 Constructing the Past: Essays in Historical Methodology. Cambridge University Press, Cambridge, UK
Lüdtke A (ed.) 1995 The History of Everyday Life. Princeton University Press, Princeton, NJ
Palmer B D 1990 Descent into Discourse. The Reification of Language and the Writing of Social History. Temple University Press, Philadelphia, PA
Raphael L 1997 Die ‘neue Geschichte’ – Umbrüche und neue Wege der Geschichtsschreibung in internationaler Perspektive (1880–1940). In: Küttler W et al. (eds.) Geschichtsdiskurs, Vol. 4. Fischer, Frankfurt/Main, Germany
Ringer F 1992 Fields of Knowledge. French Academic Culture in Comparative Perspective, 1890–1920. Cambridge University Press, Cambridge, UK
Schulze W, Oexle O G (eds.) 1999 Deutsche Historiker im Nationalsozialismus. Fischer, Frankfurt/Main, Germany
Stearns P N (ed.) 1994 Encyclopedia of Social History. Garland, New York
Thompson F M L (ed.) 1990 The Cambridge Social History of Britain, 3 Vols. Cambridge University Press, Cambridge, UK
Tilly C 1984 Big Structures, Large Processes, Huge Comparisons. Russell Sage Foundation, New York
Wehler H-U 1987–95 Deutsche Gesellschaftsgeschichte, 3 Vols. Beck, München, Germany
Welskopp T 1999 Westbindung auf dem ‘Sonderweg’. Die deutsche Sozialgeschichte vom Appendix der Wirtschaftsgeschichte zur historischen Sozialwissenschaft. In: Küttler W et al. (eds.) Geschichtsdiskurs, Vol. 5. Fischer, Frankfurt/Main, Germany
C. Conrad
Social Identity, Psychology of

1. Social Identity
Social identity comprises those aspects of the self-concept that are derived from belonging to social groups and categories. Social identity theory was developed by Henri Tajfel and colleagues at the University of Bristol in the 1970s and 1980s. Social identity is a concept that continues to be used widely in the social psychology of intergroup relations, social influence, and the self. This article describes social identity theory and associated work on intergroup relations and group processes. It points to the diverse areas of research to which the theory has contributed and concludes with issues that are currently being explored.
2. Minimal Conditions for Intergroup Bias
In the 1960s and 1970s social psychology underwent a crisis of confidence that generated a powerful critique of individualistic theorizing and research. In response, European social psychologists began to develop
theories that concerned the relationship between individuals and society. Social identity connected cognitive processes of categorization with the social structure of society. Research on social identity therefore ranges from the detailed analysis of cognitive processes underlying stereotyping and prejudice to wide-ranging questions such as reactions to group disadvantage. Tajfel’s early work on social categorization included development of the minimal group paradigm. In this widely used paradigm people are randomly allocated to categories that have little meaning (e.g., A group and B group, or people who prefer Klee vs. Kandinsky). Research participants are unaware of which other individuals belong to the category, or what their characteristics might be. Evidence shows that minimal group membership is a sufficient basis for people to allocate points or money preferentially to anonymous members of their own category. They also evaluate in-group members more positively, although it seems that reward allocation and evaluations are not highly correlated. If a trivial categorization is sufficient to generate in-group bias, it can be reasoned that more significant and meaningful categorizations have substantial real-world consequences. An important theoretical implication of in-group bias in the minimal group paradigm is that in-group bias does not necessarily have its roots in objective or realistic intergroup conflict or in personality differences or interpersonal dependency.
3. Social Identity Theory
Social identity theory (Tajfel and Turner 1979) offers an explanation for minimal intergroup bias, and also a broader statement of how relationships between real-world groups relate to social identity. According to social identity theory, there is a continuum on which identity and relationships can be depicted as purely personal and interpersonal at one extreme, and purely social, or intergroup, at the other. Personal identity is derived from personality, idiosyncratic traits and features, and particular personal relationships with other individuals. A special emphasis of social identity theory is that an effective account of group and intergroup phenomena is not possible if the theory only considers personal identity and relationships. Social identity is the individual’s knowledge that he or she belongs to certain social groups together with the emotional and value significance of the group memberships. Social identity affects and is affected by intergroup relationships. The theory aims to explain the uniformity and coherence of group and intergroup behavior as mediated by social identity (see Hogg and Abrams 1988 for a detailed account).

3.1 Categorization
Categorization leads to accentuation of differences among objects in different categories, and accentuation of similarities of objects within categories. When social categorizations are salient, accentuation results in stereotypical perceptions of both categories. When people categorize one another they generally include the self in one of the categories. Attributes that are stereotypical for the in-group may then be ascribed to self, a process labeled self-stereotyping. As a result of self-stereotyping the self is ‘depersonalized.’ This means that the self and other in-group members are seen as interchangeable in their relationship to out-group members.
3.2 Self-categorization Self-categorization theory (Turner et al. 1987) extends social identity theory. It proposes that the salience of self-categorization depends on a combination of the relative accessibility of the categories (i.e., the perceiver’s readiness to make use of them) and the comparative and normative fit between the social stimuli (i.e., people being perceived) and the available categories. The more that the targets are classifiable by a clear criterion (such as gender) and the more that their behavior seems to fit the category normatively (e.g., being domineering vs. considerate), the more readily the categorization will be applied. The categories will be represented in terms of prototypes based on a meta-contrast ratio that minimizes the differences within categories and maximizes the differences between categories on criterial dimensions. Categorizing self and others involves perceiving them in terms of their closeness to the relevant group prototype, rather than in terms of their unique personal characteristics. According to self-categorization theory, stereotyping of in-group members, outgroup members, and the self is a normal functional process that responds flexibly to the particular categories that are being compared. The theory therefore challenges traditional models of stereotyping as a form of bias, and of the self-concept as a stable psychological structure (see Abrams and Hogg 2001).
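The meta-contrast idea can be illustrated with a small numerical sketch. The formulation used here (mean absolute difference between a member and out-group members, divided by the mean absolute difference between that member and fellow in-group members) is one common way of operationalizing the ratio and is an assumption of this example, as are the function names and the attitude scores, which are hypothetical. Under these assumptions, the in-group member with the highest ratio is treated as the most prototypical.

```python
from statistics import mean


def meta_contrast_ratio(index, ingroup, outgroup):
    """Ratio of mean between-group distance to mean within-group distance
    for the in-group member at position `index` (assumed formulation)."""
    own = ingroup[index]
    others = ingroup[:index] + ingroup[index + 1:]
    between = mean(abs(own - o) for o in outgroup)
    within = mean(abs(own - o) for o in others)
    # If a member is identical to all fellow in-group members, the ratio is unbounded.
    return float("inf") if within == 0 else between / within


def most_prototypical(ingroup, outgroup):
    """Return the in-group position with the highest ratio, plus all ratios."""
    ratios = [meta_contrast_ratio(i, ingroup, outgroup) for i in range(len(ingroup))]
    best = max(range(len(ingroup)), key=ratios.__getitem__)
    return ingroup[best], ratios


# Hypothetical positions on a single criterial dimension (e.g., an attitude scale).
ingroup_scores = [2.0, 3.0, 4.0]
outgroup_scores = [7.0, 8.0, 9.0]
prototype, all_ratios = most_prototypical(ingroup_scores, outgroup_scores)
print(prototype, all_ratios)
```

In this toy case the middle in-group member scores highest, because it is close to its fellow members while remaining clearly different from the out-group. Changing the comparison out-group changes the ratios, which is the sense in which prototypes respond to the categories being compared.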
3.2.1 Social influence. Depersonalization is thought to underlie social influence within groups. When social identity is salient, in-group prototypes serve as norms or reference points. For example, attraction to group members is affected more strongly by shared social identity than by interpersonal similarities (Hogg 1992). The coordinated and consistent behavior of crowd members also seems to be more explicable in terms of a shared social identity than in terms of characteristics of the individuals, resulting from a process of referent informational influence (Turner 1991). In classic social influence paradigms such as Asch’s conformity experiments, or Sherif’s norm formation experiments, people are more strongly influenced when they share a category with the source of influence than when they do not. Moreover, reactions to group deviants are much more hostile if they are deviating toward an out-group norm (Marques et al. 1998).
3.3 Motivation
3.3.1 Positive distinctiveness. Social identity theory offers a motivational explanation for in-group bias. First, judgments about self as a group member are held to be associated with the outcome of social comparisons between the in-group and relevant out-groups. Second, it is assumed that people desire a satisfactory self-image, and positive self-esteem. Positive self-evaluation as a group member can be achieved by ensuring that the in-group is positively distinctive from the out-group. Usually group members will engage in social competition with out-groups to try to make the in-group positively distinctive. For example, in minimal group experiments, people show a consistent bias both toward maximizing in-group profit and toward maximizing differential profit in favor of the in-group, even when the total in-group profit suffers. The theory does not argue that material considerations are unimportant, but that the symbolic meaning of the group’s position relative to other groups is a powerful motivating consideration.
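The difference between maximizing in-group profit and maximizing differential profit can be made concrete with a toy allocation matrix. The sketch below is illustrative only: the point values are hypothetical, and the strategy names (maximum in-group profit, maximum difference, maximum joint profit, fairness) are labels commonly used in the minimal group literature rather than terms defined in this article.

```python
# Each option is (points to an in-group member, points to an out-group member).
# The values are hypothetical and chosen so that the strategies pull apart.
options = [(19, 25), (16, 19), (13, 13), (10, 7), (7, 1)]


def max_ingroup_profit(opts):
    """Pick the option giving the in-group member the most points."""
    return max(opts, key=lambda o: o[0])


def max_difference(opts):
    """Pick the option with the largest in-group advantage over the out-group."""
    return max(opts, key=lambda o: o[0] - o[1])


def max_joint_profit(opts):
    """Pick the option with the largest total payout to both recipients."""
    return max(opts, key=lambda o: o[0] + o[1])


def fairness(opts):
    """Pick the option with the smallest gap between the two recipients."""
    return min(opts, key=lambda o: abs(o[0] - o[1]))


for strategy in (max_ingroup_profit, max_difference, max_joint_profit, fairness):
    print(strategy.__name__, strategy(options))
```

With these numbers, choosing the largest differential means giving the in-group member 7 points instead of 19, which is the sense in which differentiation can come at the cost of absolute in-group profit.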
3.3.2 The self-esteem hypothesis. The motivational component of the theory, sometimes described as the ‘self-esteem hypothesis,’ has been scrutinized in some detail (Abrams and Hogg 2001). Two corollaries of the hypothesis are that expression of in-group bias should maintain or increase self-esteem, and that lowered self-esteem should motivate the expression of in-group bias. These hypotheses only relate to the specific evaluations of self as a group member immediately before and after judgments or allocations have been made toward the in-group and the out-group. However, few empirical tests have adhered to these requirements, and the evidence supporting them is weak or mixed. Evidence does suggest that people with higher personal self-esteem are more likely to show in-group bias and favoritism. As a consequence of theoretical and practical difficulties of the self-esteem hypothesis, other motivational processes have been suggested. Brewer (1991) has proposed that people seek optimal distinctiveness, such that they have some assimilation to a group, but the group has some distinctiveness from other groups. Hogg and Abrams (1993) have suggested that people are motivated to reduce uncertainty, and that clarification of group membership, adherence to group norms, and associating positive group features with the self are ways to achieve this. However, apart from conformity to pre-existing biased in-group norms, there seems to be no strong alternative candidate to the self-esteem hypothesis as an explanation for in-group bias.
3.4 Social Structure
A second component of social identity theory explicitly addresses the status relationships among groups. To maintain or enhance social identity, people may engage in a variety of strategies. The choice of strategy will depend on perceptions of the relationship between their own and other social groups. The boundaries between groups may be viewed as permeable or impermeable, the status differences as legitimate or illegitimate, and the nature of these differences as being stable or unstable. When status differences are seen as legitimate and stable, the differences between groups are secure. If group boundaries are permeable, the individuals may have a social mobility belief structure. In this situation they may strive to improve their identity by moving to the higher status group. For example, it is possible to go from school to university by passing the relevant entrance examinations. Social mobility belief structures are strongly endorsed by Western individualistic democracies, in which individual effort and ability are viewed as the basis for the allocation of social goods and resources. When boundaries between groups are viewed as impermeable (e.g., that between men and women, or between blacks and whites), members consider that their social identity can only be improved if the social relationship between the groups changes (a social change belief structure). When the social structure is viewed as illegitimate and/or unstable, the intergroup relationship is ‘insecure,’ and it becomes possible to imagine cognitive alternatives to the status quo. Members of lower status groups will engage in social competition with higher status groups to evaluate and reward their own group more positively. In the absence of cognitive alternatives, group members may still find ways of conferring positive distinctiveness on the in-group, using a social creativity strategy.
Social creativity strategies include comparing the groups on new (in-group favoring) dimensions, trying to re-value the existing dimensions of comparison and thereby recasting in-group attributes positively, and engaging in comparisons with other, lower status, out-groups. There is considerable evidence from experimental tests of the theory that these strategies are adopted (e.g., Ellemers et al. 1999).
3.5 Language
Social identity has been studied widely in natural intergroup settings. One important area of research is language use and maintenance. Language is an important marker of social identity. For example, use of the matched guise technique, in which the same speaker delivers a passage using different accents or dialects, has shown that judgments of in-group and out-group speakers reveal clear in-group favoritism. Moreover, speakers’ accents converge or diverge depending on whether they view the relationship positively. The ability of a group to sustain its language may reflect its ethnolinguistic vitality, which is a function of the perceived status, demography, and institutional support that it has (Giles and Johnson 1987). Finally, there seems to be a linguistic intergroup bias in which people describe positive in-group and negative out-group attributes in more abstract terms, while using more specific terminology for negative in-group and positive out-group attributes (Semin and Fiedler 1991).

3.6 Collective Behavior
A second significant area of research is collective behavior and protest. Traditional theories, such as relative deprivation or frustration–aggression theory, have depicted mass protest, rioting, and revolution as products of frustration or dissatisfaction felt by individuals. However, research has shown that social protest and mobilization is predicted by the perception of deprivation of the group relative to other groups. In turn, collective action is more likely when group members feel there is no legitimate alternative, no possibility of social mobility, and when they identify strongly with their group. Some evidence suggests that collective action is usually a last resort, but other evidence suggests that participation in collective action is likely when people identify strongly with their group (Kelly and Breinlinger 1996).

3.7 Organizational Behavior
Social identity is also highly relevant to organizational behavior. Commitment to organizational goals and willingness to contribute to organizational activities, as well as lowered absenteeism and turnover intentions, have all been shown to be associated with higher identification with organizations (Ashforth and Mael 1989, Tyler and Smith 1998). Moreover, people’s endorsement of leaders seems to be affected at least as much by the leader’s perceived prototypicality (representativeness of the specific in-group norm) as by how effective or stereotypically leader-like the person is.

4. Critical Comments and Future Directions
The concept of social identity is widely accepted in social psychology and has contributed to a large volume of research (Robinson 1996). Current work is moving away from the idea that group members are
primarily motivated by self-enhancement, and examines the idea that people may be motivated to reduce uncertainty or find meaning from their group memberships. In addition, identification may be manifested in different ways (Deaux 1996), and there are individual differences that affect people’s behavior, such as differences in how strongly they identify with particular groups, differences in their social dominance orientation, and differences in their willingness to express prejudice. It seems that the strength with which people identify with groups, and the degree of threat to social identity, can moderate the relationship between group status and intergroup bias (Ellemers et al. 1999). Current work is also examining the cognitive processes that affect the self–group relationship. For example, some researchers argue that the self is the primary psychological unit and that it simply extends to include in-groups, in contrast to the idea that the self can be redefined by salient self-categorizations. Recent research also shows evidence of a rapprochement between social cognition approaches to intergroup relations and the social identity perspective (Abrams and Hogg 1999). See also: Ethnic Identity/Ethnicity and Archaeology; Ethnic Identity, Psychology of; Identity and Identification: Philosophical Aspects; Identity in Anthropology; Identity in Childhood and Adolescence; Identity: Social; Multiculturalism and Identity Politics: Cultural Concerns; Personal Identity: Philosophical Aspects; Regional Identity
Bibliography
Abrams D, Hogg M A (eds.) 1999 Social Identity and Social Cognition. Blackwell, Oxford, UK
Abrams D, Hogg M A 2001 Collective identity: group membership and self-conception. In: Hogg M A, Tindale R S (eds.) Handbook of Social Psychology: Group Processes. Blackwell, Oxford, UK, pp. 425–60
Ashforth B E, Mael F 1989 Social identity theory and the organization. Academy of Management Review 14: 20–39
Brewer M B 1991 The social self: on being the same and different at the same time. Personality and Social Psychology Bulletin 17: 475–82
Deaux K 1996 Social identification. In: Higgins E T, Kruglanski A W (eds.) Social Psychology: Handbook of Basic Principles. Guilford Press, New York, pp. 777–98
Ellemers N, Spears R, Doosje B 1999 Social Identity. Blackwell, Oxford, UK
Giles H, Johnson P 1987 Ethnolinguistic identity theory: a social psychological approach to language maintenance. International Journal of the Sociology of Language 68: 256–69
Hogg M A 1992 The Social Psychology of Group Cohesiveness. Harvester Wheatsheaf, Hemel Hempstead, UK
Hogg M A, Abrams D 1988 Social Identifications: A Social Psychology of Intergroup Relations and Group Processes. Routledge, London
Hogg M A, Abrams D 1993 Group Motivation: Social Psychological Perspectives. Harvester Wheatsheaf, Hemel Hempstead, UK
Kelly C, Breinlinger S 1996 The Social Psychology of Collective Action: Identity, Injustice and Gender. Taylor and Francis, London
Marques J M, Abrams D, Martinez-Taboada C, Paez D 1998 The role of categorization and in-group norms in judgments of groups and their members. Journal of Personality and Social Psychology 75: 976–88
Robinson P (ed.) 1996 Social Groups and Identities: Developing the Legacy of Henri Tajfel. Butterworth-Heinemann, Oxford, UK
Semin G R, Fiedler K 1991 The linguistic category model: its bases, applications and range. In: Stroebe W, Hewstone M (eds.) European Review of Social Psychology. J. Wiley, Chichester, UK, Vol. 2, pp. 1–30
Tajfel H, Turner J C 1979 An integrative theory of intergroup conflict. In: Austin W G, Worchel S (eds.) The Social Psychology of Intergroup Relations. Brooks-Cole, Monterey, CA
Turner J C 1991 Social Influence. Open University Press, Milton Keynes, UK
Turner J C, Hogg M A, Oakes P J, Reicher S D, Wetherell M 1987 Rediscovering the Social Group: A Self-categorization Theory. Blackwell, Oxford, UK
Tyler T R, Smith H J 1998 Social justice and social movements. In: Gilbert D, Fiske S T, Lindzey G (eds.) Handbook of Social Psychology, 4th edn. McGraw-Hill, New York, pp. 595–629
D. Abrams
Social Inequality and Schooling

Do systems of education in modern societies help to break down the inheritance of privilege from one generation to the next, thereby providing opportunities for mobility for members of less privileged groups, or do schools largely reinforce and legitimate the inequalities among individuals evident in the larger society? This article shows that social background affects educational attainment; discusses mechanisms through which family privilege is translated into advantaged educational opportunities and achievement; identifies unequal school resources and opportunities as determinants of inequality in outcomes; and shows that egalitarian-based school reforms have generally failed to reduce social inequalities in education significantly. Unequal educational outcomes derive both from inequality in educational opportunities and inequalities in family environment and resources.
1. Social Background and Educational Inequality
1.1 Basic Relationships
A relative constant about education in the modern world is that social background has a major impact on educational performance and on educational attainment, even net of ability: on average, students from more advantaged class and/or status backgrounds perform better in school and go further in the educational system, thereby obtaining higher-level educational credentials (Ishida et al. 1995). The association between family background and educational outcomes is even stronger in the developing world than in advanced industrial societies and is less strong in the USA than in the many European countries in which more explicit class and gender barriers in access to schooling (e.g., in the form of fees for secondary schooling and gender-segregated schools, respectively) have existed historically (see Brint 1998). Given that educational performance and credentials are important determinants of adult success, education thereby plays an important role in the reproduction of social inequality. Some theorists propose that education is a form of ‘social capital’ that has replaced property as a means of intergenerational status transmission (Bourdieu 1973). Whereas elite families once ensured their children’s future positions by passing on property, they now ensure their children’s futures by helping them obtain the ‘best’ educational credentials. Nonetheless, even the strongest proponents of class reproduction theory recognize that education does not perfectly reproduce the class/status structure of modern societies. It also provides opportunities for some talented individuals from less advantaged backgrounds to succeed. Social scientists have for decades tried to determine why class and status inequalities in educational outcomes occur. Part of the explanation, as discussed more fully below, is or has been due to unequal educational opportunities.
Further, the growth of mass education throughout the world has nominally equalized educational opportunities at some levels (primary education in the developing world, secondary education in the developed world) while displacing the level of education that has real consequence for adult success to a higher level, one to which elites retain privileged access (e.g., the university level in developed countries). School expansion, therefore, has not reduced the educational advantage that elite children enjoy (Walters 2000, pp. 254–7). But even when educational access is formally equal, inequalities in outcomes still occur. Some of this is due to class or status segregation within schools or school systems: disadvantaged children tend to go to schools with largely disadvantaged student bodies (i.e., there is between-school segregation), or to be in streams or tracks within schools that are fairly socially homogeneous (i.e., there is within-school segregation). That is, the proximate schooling experience is class- and status-segregated to a large extent in many societies, and this segregation has an adverse effect on disadvantaged children’s achievement (Oakes et al. 1992). Some of the inequality in educational outcomes is also due to the direct effects that families and family background have on learning, achievement, and educational experiences, aside from wealthy parents’ ability to buy private schooling for their children or, in some developing countries at present, poor families’ dependence on child labor, which interferes with schooling. Children from more advantaged families, for example, are more likely to start school with strong academic skills, be helped in their school work by their parents, have parents who actively intervene in their schooling careers, exhibit linguistic patterns preferred by teachers, have resources to buy learning materials or supplemental help, and in general live in a family environment that supports learning and academic accomplishment (Brint 1998, pp. 208–14). In many societies, educational programs have been established to help disadvantaged children make up for their family-based learning ‘deficits’ (one notable example is Project Head Start in the USA). With the exception of far-reaching educational reforms in some socialist states, however (see Sect. 2.3), reforms have not attempted to deprive children from advantaged families of most of their sources of educational advantage.
1.2 Bases of Social Inequality
The particular social groups that are educationally ‘advantaged’ and ‘disadvantaged’ vary by places and times, depending on the stratification system in place in the larger society. Generally, those groups that are advantaged in society at large are advantaged with respect to educational opportunities and educational outcomes. In most societies in the modern world, social class position confers tremendous educational advantage or disadvantage. Children from advantaged social classes have better educational opportunities, have higher levels of ‘achievement’ as officially measured and recognized by the school, and receive on average ‘better’ educational credentials. In capitalist societies, upper-class and upper-middle-class children reap these educational class advantages. In socialist societies, children of party elites reaped educational class advantages, official policy notwithstanding (see Shavit and Blossfeld 1993). Class is arguably the most important source of educational advantage or disadvantage, and the source that has been most consistent in recent decades.
Another common social distinction that has important educational consequences is race and ethnicity. The degree of educational disadvantage associated with being a member of a disadvantaged racial or ethnic group has varied historically. The USA and South Africa, for example, historically had public educational systems that segregated blacks into separate schools and offered them far fewer educational opportunities; in both cases, changes in state policy eliminated the explicit racial barriers to school access although some forms of racial inequality in education persist. There is also significant cross-cultural variation in the racial or ethnic groups that are educationally disadvantaged: For example, Koreans and other Asians perform better than whites in American schools but are educationally disadvantaged in Japan (Brint 1998, p. 216).
The final major basis of social inequality in education is gender. Gender inequality in education—both in terms of access and outcomes—has decreased dramatically in recent decades in most advanced industrial countries. Whereas girls were once commonly barred from certain types or levels of education, gender barriers in access are insignificant in most industrial countries at present but remain in those less developed countries in which women are still largely restricted to the private (family) sphere. Similarly, in most industrial countries schools offer similar if not identical curricular opportunities at present to boys and girls, and girls outperform boys in some important respects (e.g., in the USA girls earn better grades on average than boys and have higher rates of graduation from high school and college). The significance of gender as a basis of social inequality in education has declined significantly, then, in many parts of the world but remained high in societies in which gender remains a significant basis of differentiation in the larger society.
2. Social Inequality in School Access and Resources
2.1 Translating Advantaged Family Background to Educational Advantage
The ways in which children from advantaged families gain access to advantaged schools or schooling experiences vary from society to society. Most European societies have long been known for their class-based systems of education (which are also often ethnic-based), characterized by a clear differentiation into educational streams at relatively early ages. Children from advantaged backgrounds were more likely to enter streams leading to university-level education at the ages of 10 or 12, whereas at the same time working-class children or racial and ethnic minorities were more likely to enter streams leading to earlier school exit and vocational training. At present, some European states retain a system of early branching (e.g., Germany and Austria), albeit in modified form from prior decades. Others, however (e.g., France, England, and Sweden), have eliminated early branching, have combined vocational and academic training in the same secondary school buildings, and allow easier movement among streams at the high-school level—in part because of the explicit recognition that early and relatively permanent branching maximizes the effect of family background on schooling outcomes (see Brint 1998, pp. 29–64).
But even in the USA—a society whose educational system is regarded as more open, in which most students attend comprehensive high schools, in which most students get second, third, or more chances for access to high-level education, and in which a private school secondary education is less important for access to the top universities—inequality in access to ‘good’ educational opportunities is still evident. In part this is due to the ability of advantaged parents to use their personal resources to send their children to expensive private schools if they so choose. But largely it is due to wide variation in the USA among states, within states, and among school districts in the quality of public schools. This variation favors those states and localities in which advantaged families predominate, because of the historic reliance in the USA on local property taxes to fund public education. Unlike the financing basis of state-supported schooling in many other industrialized countries, this system allows wide variation in per-student school funding. This reliance on local taxes in the USA has produced a system where the ‘best’ schools—the ones with the most expensive facilities, the better-trained and most highly paid teachers, the most advanced curricular opportunities that ‘produce’ students with the highest standardized test scores, and send the highest proportion of their graduates to college—are located in those local areas with the highest concentrations of wealth, largely very wealthy suburbs of large metropolitan areas (Kozol 1987). This system produces sharp inequalities in educational access by class and race, the latter because African-Americans are concentrated in central cities, which have crumbling and fiscally strapped public-school systems.
2.2 The Importance of School Resources
Starting with the publication known as the Coleman Report in the United States (Coleman et al. 1966), numerous studies by social scientists have been interpreted to show that schools—or, more precisely, resources devoted to schools—do not matter. Coleman found that family socioeconomic background is a much more important determinant of educational achievement as measured by standardized test scores than spending on schools. One implication of this finding is that any observed differences in public moneys spent on schools attended by the elite versus schools attended by the disadvantaged are irrelevant with respect to class or status differences in educational outcomes. In other words, families are implicated in the creation of inequality in educational outcomes, but schools are not. As politically popular as this argument has proven to be, at least in the USA, a number of recent meta-analyses show that the existing studies offer stronger proof of a positive effect of resources on student test scores than an absence of an effect (e.g., Hedges et al. 1994; but for a counter interpretation see Hanushek 1994). Further, an important longitudinal experimental study in Tennessee, in which students were randomly assigned to different-sized classes, demonstrated that placement in relatively small classes, an important type of school resource, had a significant effect on students’ performance on standardized tests in the first and subsequent years of the study (Krueger 1999).
There are other problems with the argument that inequality in school resources is inconsequential with respect to educational outcomes. One, Coleman’s (and many others’) measure of educational outcome was ability, not achievement, years of schooling completed, entry to college, or the like, all of which are far more consequential for adult success than one’s score on an ability test. A number of studies have found positive effects of school resources on educational attainment and on adult earnings (see review in Card and Krueger 1996). Two, aggregate school-level family socioeconomic status and school funding co-vary, making it hard to disentangle the effects of family background and school spending. Three, what families ‘know’ themselves about the importance of going to a well-funded school belies the argument that school spending doesn’t matter: when given the opportunity, families go to extraordinary lengths to get their own children placed in resource-rich schools.
2.3 Egalitarian School Reforms
Although it is doubtful that eliminating class, race, gender, and other bases of inequality in access to schooling or the quality of schooling would result in equality in educational outcomes (because of the strong effects family background has on school performance and persistence), it is clear that the presence of class, race, gender, or other barriers to access to school or to particular levels of school promotes social inequality in education and in adult life. Thus, in the past several decades many countries have instituted school reforms intended to eliminate historic patterns of unequal access to education. In those places where racial minorities were allowed only separate—and unequal—educations, desegregation was an important step toward equalization of opportunity. In societies in which class barriers to secondary education existed in the form of school fees and the like, such as Ireland, elimination of fees was intended to equalize class access to secondary education (Raftery and Hout 1993). Some socialist societies went much further, instituting systems that gave clear preference in access to education to children from formerly disadvantaged backgrounds (see Shavit and Blossfeld 1993).
Interestingly, in very few cases did these reforms, even the most far-reaching ones, succeed in eliminating former class or status disadvantages in schooling. Social differentials in educational attainments, for example, have proven remarkably resistant to egalitarian school reform (Shavit and Blossfeld 1993). One major reason for this is that increasing the access of formerly disadvantaged groups to a particular level of schooling—say secondary education—increases the proportion of advantaged children attending as well and generally does not significantly alter class or status effects on the likelihood of entering that level of education (Raftery and Hout 1993, p. 60). Further, once secondary education, for example, is readily available to formerly disadvantaged groups, higher education becomes the significant arena in which class and status advantages are preserved (Walters 2000). Gender is an exception: in some Western societies, women are now more likely than men to obtain baccalaureate degrees, although not Ph.D.s.
It appears that the only way to significantly reduce class or status inequalities in educational outcomes is to combine egalitarian school reform with a variety of social policies that significantly reduce inequalities in living conditions, family-based resources, and occupational opportunities. Of the 13 countries covered by Shavit and Blossfeld’s (1993) important edited book, for example—including a range of countries from capitalist to state socialist—only in Sweden and the Netherlands were egalitarian school reforms coupled with social welfare policies that substantially reduced inequalities in family resources and did educational inequality significantly decline. Further, the substantial decline in gender inequality in education witnessed in some societies is due to school-based reforms as well as a significant decline in gender stratification in the public sphere. The failure of sweeping school reform in a number of socialist states, including ‘reverse discrimination’ policies intended to favor children of working-class or peasant backgrounds, is particularly illuminating. 
To an important degree, new elites took the place of old elites (e.g., party elites replaced old class elites) but children of elites still enjoyed educational advantages. Further, none of these reforms closed off all possible means by which even former elites could ensure their children’s educational advantage: for example, bribes, favors based on personal ties, passing on various forms of cultural capital that helped children overcome formal barriers to their access to privileged schooling, and the lingering effect of family financial capital (see summary in Walters 2000, p. 255).
3. Summary
Social inequality in education is an enduring feature of the twentieth century. There have been changes in which social groups are advantaged and disadvantaged, but groups that are disadvantaged in the larger society are generally—though not always—disadvantaged with respect to educational achievement and attainment, if not with respect to educational opportunities. Many societies have implemented changes that make access to education formally more open to disadvantaged groups, but these changes and reforms have not succeeded in creating equality of educational outcomes. Inequality in family-based resources and patterns of stratification and inequality in the larger society continue to be determinants of social inequality in education.
See also: Academic Achievement: Cultural and Social Influences; Economic Status, Inheritance of: Education, Class, and Genetics; Education and Gender: Historical Perspectives; Educational Sociology; Equality and Inequality: Legal Aspects; Equality of Opportunity; Gender, Class, Race, and Ethnicity, Social Construction of; Inequality: Comparative Aspects; Social Inequality in History (Stratification and Classes)
Bibliography
Bourdieu P 1973 Cultural reproduction and social reproduction. In: Brown R (ed.) Knowledge, Education and Cultural Change. Tavistock, London, pp. 71–112
Brint S 1998 Schools and Societies. Pine Forge Press, Thousand Oaks, CA
Card D, Krueger A B 1996 School resources and student outcomes: An overview of the literature and new evidence from North and South Carolina. Journal of Economic Perspectives 10: 31–50
Coleman J S et al. 1966 Equality of Educational Opportunity. Government Printing Office, Washington, DC
Hanushek E A 1994 Making Schools Work: Improving Performance and Controlling Costs. Brookings Institution, Washington, DC
Hedges L V, Laine R, Greenwald R 1994 Does money matter? A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher 23: 5–14
Ishida H, Muller W, Ridge J M 1995 Class origin, class destination, and education: A cross-national study of ten industrial societies. American Journal of Sociology 101: 145–93
Kozol J 1987 Savage Inequalities: Children in America’s Schools. HarperCollins, New York
Krueger A B 1999 Experimental estimates of education production functions. Quarterly Journal of Economics 114: 497–532
Oakes J, Gamoran A, Page R N 1992 Curriculum and differentiation: Opportunities, outcomes and meanings. In: Jackson P W (ed.) Handbook of Research on Curriculum. American Educational Research Association, Washington, DC, pp. 570–608
Raftery A E, Hout M 1993 Maximally maintained inequality: Expansion, reform, and opportunity in Irish education, 1921–75. Sociology of Education 66: 41–62
Shavit Y, Blossfeld H-P (eds.) 1993 Persistent Inequality: Changing Educational Attainment in Thirteen Countries. Westview Press, Boulder, CO
Walters P B 2000 The limits of growth: School expansion and school reform in historical perspective. In: Hallinan M (ed.) Handbook of the Sociology of Education. Kluwer Academic/Plenum Publishers, New York, pp. 241–64
P. Barnhouse Walters
Social Inequality in History (Stratification and Classes)
As a universal fact of all human civilization, social inequality has been shaping advanced societies in a most fundamental way for many thousands of years, especially since the rise of more complex cultures and political systems in the wake of the ‘Neolithic Revolution.’ The concept of social inequality refers to an unequal distribution of positions in society, a distribution which is determined by structures of power (e.g., the modern state and its administrative elites), of technology and economy (e.g., industrial technology and capitalism), and of cultural expression and perception (e.g., Christian religion and its attempt to define and explain a ‘holy’ order of worldly society). In modern societies, social inequality is often framed as a differentiation of society in distinct groups of similar positions conceivable as ‘layers’ of the social order on a vertical scale. A general term for this type of social inequality structure is ‘stratification.’ In the Western tradition, the concept of stratification often—in a more or less explicit way—refers to a ranking of social positions according to the distribution of economic resources; the conflicting groups resulting from this are frequently called ‘classes’ (in a broad sense). The notion of a society divided into classes, which through their collective action fundamentally shape the dynamic processes of society, has been most widely applied to the specific historical case of Western industrial societies with their market economy, but has also been transferred from here to the description of agrarian societies in ancient or medieval times. Since around 1970, concepts of stratification and class have been criticized for their neglect of other dimensions of social differentiation and inequality such as gender, race, or ethnicity. 
Important as this recent restructuring of the debate on historical (and contemporary) modes of inequality is, stratification and class have been essential concepts for social history in the past (and appear to remain so in the future, if in a modified way). Indeed, it may be said that the development of social history as a field of historical research is hardly conceivable without the analytical concepts of stratification and class. In considering social inequality, stratification, and class in history, two perspectives may be taken. First, stratification is an empirical problem, a problem of historical subject matter. In this regard, one can follow the development of systems of social inequality throughout history, from antiquity to the twenty-first century. Second, stratification refers to an analytical and theoretical problem in (social) history; it appears as a problem of conceptualizing historical processes and structures. In this vein, one can look at concepts, approaches, and methods in research on stratification and class, and consider their development, especially since the nineteenth century, and their usage in different historiographical traditions. This article will concentrate on the second, the conceptual level. Both perspectives are intimately linked, however, since there is no empirical description of inequality without an acute awareness of concepts and language. And for historians (as opposed to social theorists), all methodological discussion eventually points to empirical research, in order to better understand systems of social inequality in the past and, therefore, in the present.
1. Theories of Social Stratification and Class Formation: Historical Development Through the 1960s
Even before the rise of the modern social sciences with their more elaborate theories of social structure, the explanation (and justification) of inequality was an important part of intellectual thought in advanced, literate societies, often within a general philosophical or theological context. In ancient Athens, Aristotle regarded society as consisting of three hierarchically ordered groups, among which he favored the middle, or ‘mesor,’ as being most important for the stability of the polis. Aristotle’s scheme of ‘stratification’ remains influential in attempts at defining society in terms of ‘upper,’ ‘middle,’ and ‘lower’ classes or strata, among which the middle class is held in highest regard (e.g., in political culture and public discourse in the USA). In feudalist Western Europe during the Middle Ages, a different kind of tripartite scheme of inequality gained prominence: the idea of a functional differentiation of society into the three orders of Christian clergymen, noblemen, and peasants. With the rise of cities and the emergence of town burghers as an influential group, the functionalist scheme was adapted to new social realities, but it remained a central concept for the interpretation of inequality.
While these theories are themselves a subject of historical inquiry because they tell a lot about the framing and self-understanding of pre-modern societies (Duby 1978), a major shift toward modern concepts of stratification and class only occurred in the late eighteenth and nineteenth centuries. In the historical and philosophical works of the Scottish enlightenment (e.g., Adam Smith, Adam Ferguson, and John Millar), principal stages of civilizational development were associated with particular systems of social inequality and stratified class structure. The language of social inequality that is still in use, with its typical metaphors of ‘class’ and ‘strata,’ is heavily indebted to scientific discourses of the eighteenth century, and especially to the enlightenment’s quest for ‘classification’ (Karl v. Linné) and order in the realms of both nature and culture. In the following decades, with Adam Smith and David Ricardo providing important linkages, attention in the theory of stratification increasingly shifted towards a search for the origins of inequality in economic systems of production. The most powerful, and historically most influential, theory of class as a product of basic economic forces was for a long time Karl Marx’s (and Friedrich Engels’) concept of Historical Materialism, a theory according to which at each stage of society, an oppressed lower class rises against the economic and political dominance of a ruling class, and eventually overturns the old system through class struggle and revolution. Whatever criticism may be made of Marx’s theory and its political consequences, it nevertheless became the major reference point for most theories of inequality and stratification until the middle of the twentieth century, and possibly also thereafter. For social history in particular, it provided an extremely suggestive framework for understanding historical change in terms of economic progress, class formation, and societal conflict, instead of referring to diplomacy, politics, and the ‘great men’ as driving forces of history, as was the case in much traditional nineteenth and early twentieth century history. 
After Marx, theories of inequality based on functional differentiation and the division of labor played an important role for a while (e.g., with Herbert Spencer in the UK, Gustav Schmoller in Germany, and Émile Durkheim in France), but they too were primarily grounded in the idea of economic foundations of particular class systems. At the beginning of the twentieth century, German sociologist Max Weber in some ways moved class analysis back to Marxian scope and intentions, while on the other hand he developed, if in a fragmentary manner not easy to grasp for later scholars, a broader, historically more flexible, and less dogmatic framework for the interpretation of social inequality and class structures. Although Weber, like most of his contemporaries, under the impression of industrial capitalism’s triumphant advance, acknowledged the enormous impact of economic forces on the molding of society, he was also prepared to systematically consider other dimensions of stratification. Political power, in Weber’s view, often played a crucial role in determining positions in a class system; inequality also appeared to him as fundamentally shaped by ‘culture’: by general interpretations of the world and the social order as, for example, provided by religion (Weltbilder), as well as by the honor, prestige, or status attributed to an individual (or to members of a larger group, such as a specific profession) within the normative setting of a society (see Weber 1978).
After a brief period of neglect, Weber’s concepts of social inequality and social class exerted a strong influence on both American and European social science after 1945. In postwar American sociology and political science in particular, stratification became one of the dominant topics in the field, a topic that all major ‘schools’ and theoretical currents had to reckon with (see Barber et al. 1968). Despite an enormous variety of approaches and methods, some common features of stratification research in the 1950s and early 1960s stand out, making them attractive for historians. Often, they were not just present-minded, but embedded in a macro-sociological and historical framework of thought. Typically, they were interested in linkages between class and political power structures, and in the cultural dimensions of inequality as expressed in neo-Weberian concepts like ‘status’ or ‘prestige’ (Bendix and Lipset 1953, Lenski 1966). Since the late 1950s, Weberian and functionalist research in stratification began to overlap with a renaissance of critical theories of society and inequality that were attempting to benefit from the Marxian tradition in an undogmatic manner. Authors like Ralf Dahrendorf or Anthony Giddens were widely read for their innovative intellectual combinations of Marxism and liberalism, or Marxian and Weberian traditions (see Dahrendorf 1959, Giddens 1973). In Western social science, the revival of the Marxian tradition proved short-lived and mostly faded away in the late 1970s, leaving some strands of cultural Marxism as survivors into the then emerging new age of culturalism in the social sciences. 
However, Weber’s influence, because of the historical depth of his work and a rare combination of flexibility and analytical precision in his categories, never quite vanished, and his work continuously remained inspiring for historians doing research on stratification and class. At the same time, Weber proved remarkably resistant to changing intellectual moods since the late 1970s and 1980s, when classical concepts of stratification and class came under attack. He thus continued to be a starting point in some of the most important recent approaches in the sociology and social history of inequality and class.
2. Stratification and Social Classes in Historiography: Trends and Examples
With the rise of social history to intellectual dominance in the 1960s and 1970s in most Western countries, research on the history of stratification and class, mostly in modern societies since the Industrial Revolution of the nineteenth century, became a central historiographical theme, and even a topic defining what the ‘New Social History’ was all about. In the general political and academic climate of the 1960s, with its social-democratic air and its resurgence of Neo-Marxian theories, interest in the less privileged groups of industrial society—and especially in the working class—spilled over into history and not only generated a host of special studies on stratification, mobility, and class formation, but also influenced synthetic approaches in social history and the general discourse of historians in that field. While this was a pervasive international tendency, it yielded different results within the national traditions of social history. Therefore, Germany and the US will be considered as two important examples.
2.1 Germany
With its strong methodological roots in nineteenth-century historicism and its political conservatism in the historical profession, Germany was a late-comer in the development of postwar social history, and in many respects remained dependent on the reception of theoretical innovation from other countries such as France, the UK, and the USA. However, due to its strong traditions in social theory, and the theory of class in particular (Marx, Weber), and also because of the great historical significance of class structures in late nineteenth- and early twentieth-century Germany, topics like stratification and class formation quickly took center stage in West German social history during the late 1960s and 1970s. To a larger degree than in other countries, stratification and class emerged as the paradigmatic categories of research in social history (see, e.g., Kocka 1975), a tendency that came under attack in the 1990s as many historians came to regard competing concepts such as gender and ethnicity as at least equally important. For the history of stratification and class in the 1970s and 1980s, two approaches deserve special mention.
First, as was the case elsewhere, the history of classes initially concentrated on the less privileged groups and lower strata of society. Here, a particular focus was on the formation of a modern, industrial working class in the nineteenth century, and on the problem of the homogenization of that class from the origins of a multitude of pre-industrial, agrarian, and artisanal occupations and interests. In what he termed the ‘Weberian application of a Marxian model,’ historian Jürgen Kocka studied the formation of a German proletariat, which he conceived as a Marxian class in its basic socio-economic determination, and at the same time as a Weberian ideal type with which reality can never fully correspond (see Kocka 1980, 1987; see also Labor History; Working Classes, History of). Class formation in this approach was considered an historical process progressing through different levels. It proceeded—ideally, though not necessarily in historical reality—from economic homogenization in the workplace towards increasing social and cultural coherence and might end with class consciousness and the political organization of the working class. Since the 1980s, class formation models have also been applied to the upper strata of German society, particularly the bourgeoisie and the professional middle classes, but in this case ‘culture’ appeared to operate as a much more influential determinant of class than a common economic position.
Second, German traditions in social theory fostered an unusually strong interest in the historical conceptualization of the overall class structure and stratification system, again with an emphasis on the period of German modernization and industrialization from the late eighteenth century onwards. In conceiving social history as the ‘history of society,’ and the history of society as grounded in structures and processes of social inequality, stratification, and class, German historian Hans-Ulrich Wehler in a very influential project attempted a synthesis of modern German history in terms of a history of inequality and class. His Gesellschaftsgeschichte systematically inquires into the transformation of the estate system of pre-industrial society into an industrial society comprised of social classes, with the bourgeoisie and proletariat in the center, but also giving attention to the nobility and to agrarian classes such as farmers or farmworkers (Wehler 1987–95). Typical for much German social history in the 1970s and 1980s, Wehler’s approach is methodologically under the heavy influence of Max Weber’s categories; and it politically adheres to a vision of an increasingly equal, less class-rifted (if not classless) society. 
While most other German authors are not as ambitious and comprehensive as Wehler in their concern with social structures, they have, however, been more inclined to expand the explanatory framework of social inequality into the realm of culture. A new focus of research is on the complicated relationship between social structure on the one hand, and cultural perceptions and practices on the other hand.
2.2 United States In its self-image, the US was born as a country of equality, defined by a lack of class and stratification in the European sense, and by the predominance of a broad middle class. The fact that the US has never known an estate system of social order has indeed, in combination with the country’s more pragmatist tradition in social thought, exerted a deep influence on social scientific and historical approaches toward class structure and socio-economic inequality in America. On the other hand, critical strands in social theory and historiography, which were able to identify and discuss inequality of class, appeared earlier than in Germany
and remained stronger in the first half of the twentieth century—with the ‘Progressive historians’ such as Charles Beard being the most important example. During the 1960s, the blossoming of the ‘New Social History,’ together with a more self-critical mood in American society in general, prompted a new interest in the history of social class and inequality in America, much in the same way as in other Western countries at about the same time. With regard to approach and method, historians often concentrated on less privileged groups, such as workers and immigrants, and they often did so in case studies of particular cities and through the study of social mobility among the lower strata of urban society (Thernstrom 1964; see also Social Mobility, History of). For some time around 1970, Neo-Marxist approaches to the study of class and ‘class struggle’ in American society drew some attention, but on the whole, even leftist historians of class and inequality undertook much less theoretical effort in conceptualizing their topics than in Germany. Similar to trends in European social history, however, quantitative methods (as in the study of social mobility) since the late 1970s lost much of their earlier appeal. Instead, the problem of class was more often approached through qualitative, even hermeneutical methods, and through the lenses of class culture and class experience. Also, outside the somewhat narrowing field of working class and labor history, the study of social classes shifted its main focus from the lower classes to the middle classes and their origins in nineteenth-century social change (e.g., Blumin 1989, Zunz 1990). This shift may be understood as an attempt to deconstruct the pervasive myth of the American middle class, and at the same time to historically explain the middle class orientation of twentieth-century America. 
In a broader perspective, the study of stratification and class as such after the 1970s retreated from the center of concern for social historians much more quickly, and more pervasively, than in Europe. Obviously, the most important reason for this is the existence of other, competing dimensions of inequality in American society, especially the inequality of race and ethnicity. Class formation in the US rarely occurred in ethnically homogeneous milieus, and often racial and ethnic identity turned out to be more important, and certainly of a much more durable character, than class identity and economic position. For the same reason, synthetic efforts with generalized views on American society from the perspective of stratification and class were hardly undertaken by American historians. The one large project that probably comes closest to such a synthesis is the two-volume textbook ‘Who Built America?,’ collectively conceived and written by the American Social History Project under the intellectual guidance of labor historian Herbert Gutman (see American Social History Project 1989–92). The book attempts a synthesis of American history from colonial times to the present by centering its narrative around the development of social classes, particularly the industrial working class, and the struggle of less privileged groups for participation in the economic and political resources of ‘mainstream’ America. At the same time, the authors’ efforts demonstrate that a narrative of American social history from the viewpoint of stratification and class alone cannot make sense; it needs to be expanded by, and connected with, additional categories of inequality such as race, ethnicity, and gender.
3. The Challenge of Culture in the History of Stratification and Class The discussion of the two national examples, especially the case of the US, has already hinted at important new trends in historical research which have begun to change the field in fundamental ways since the 1980s and will probably continue to do so in the future. Although many of the recent tendencies stem from similar motives—e.g., the increasing dissatisfaction with interpretations of class based on the structures of male industrial work—and therefore often overlap in historiographical practice, it is useful to distinguish two horizons of criticism and expansion of the classical stratification and inequality paradigms. Some scholars, for different reasons, fundamentally question the relevance of concepts such as ‘class’ and ‘stratification’ and try to develop alternative concepts, or alternative theoretical and methodological viewpoints on the problem of inequality in history (see Sect. 4). Others consider ‘class’ and ‘stratification’ as still important, or even indispensable notions for social history, yet argue that they have to be redefined in order to meet the challenges of social reality and social science at the turn of the twenty-first century. This ongoing redefinition of class is in most cases centered around the idea of ‘culture.’ Culture, especially in the forms of ‘language’ and ‘experience,’ is now widely believed to be the prime factor in determining stratification systems and class structures. However, at a closer look, the challenge of culture differentiates into a number of theoretical and empirical approaches.
3.1 Class as Social Action, Experience, and Culture In 1963, the English Marxist historian E. P. Thompson published his two-volume ‘Making of the English Working Class,’ which is arguably among the dozen most influential works in twentieth-century social history (Thompson 1963). At that time, classical approaches to class formation grounded in systematic sociology or political science were still widely in use, and in some countries just beginning to influence the expanding field of social history. Thompson’s massive
study was rather conventional in the sense of its practical empiricism and its Marxian notion of a working class winning identity and togetherness in the struggle against its bourgeois class enemy. What was new, however, was Thompson’s insistence that ‘class’ was not a quasi-objective structure determined by certain economic forces, but a very fluid result of practical action and social experience on the part of the lower classes in the decades around 1800. As Thompson largely focused on pre-industrial work and living experience, he discovered the roots of class in traditions and re-inventions of popular culture. In many respects, Thompson’s book was ahead of its time. Reception and debate in Germany, for example, only took off in the 1980s and turned his pioneering work into a major stimulant for the then emerging culturalist turn in social history (see Cultural History).
3.2 Class Experience: From Production to Consumption Classical approaches to the history of stratification, from the Enlightenment theories of a division of labor to Marx and beyond, have based their identification of certain classes, and their description of the overall inequality structure of society, on the sphere of production, and on the relative position an individual (mostly seen as a male adult and family head) acquired in that society’s system of work and employment. As the nineteenth-century fascination with factory work and industrial production faded and the twentieth century increasingly developed the self-image of a consumer economy and society, it seemed appropriate to search for social spheres of class formation outside the workplace experience: at home, in leisure activities, and especially in consumption. In American and in British historiography in particular, the history of consumption has been one of the most significant discoveries for the field of social and cultural history in the 1990s (see Consumption, History of), and several attempts have been made to conceptually link consumption and class, some of which build on much earlier efforts such as Thorstein Veblen’s ‘Theory of the Leisure Class’ with its ideas about the role of ‘conspicuous consumption’ for the life of the upper classes. The notion of consumption can play a key role because of its ability to connect the perspective of economy (the appropriation of goods by groups with different economic resources) with the perspective of culture (insofar as goods serve as symbols and as expressions of individual or collective identity and lifestyle). It also allows for a more adequate consideration of female and private-life contributions to the formation of social inequality and class. In an important empirical study on Chicago in the 1920s and 1930s, historian Lizabeth Cohen investigated the formation of class identity among ethnic workers and their families through their shared experience of consumption and modern mass culture institutions like radio (Cohen 1990). As the history of consumption still flourishes at the turn of the twenty-first century, it is likely that its impact on the history of inequality and class will grow even larger in the near future.
3.3 Pierre Bourdieu: Class and Cultural Distinction The first two examples of the cultural expansion of the history of class represent developments within the historical profession, yet trends and new concepts in other social scientific disciplines have also had their impact on the topic. National traditions of thought remain influential, with Max Weber, for example, still being of prime importance in Germany, and anthropologist Clifford Geertz stimulating American debates. In international perspective, however, one scholar’s work in particular has received near unanimous attention and acclaim by social historians during the 1980s and 1990s, namely French sociologist Pierre Bourdieu’s theory of class and cultural distinction as developed in his major work La distinction (Bourdieu 1979). Arguing from his research on contemporary French society, Bourdieu secures a central place in social research for the problem of inequality and even sticks to the concept of class. Yet in a brilliant way he enlarges that concept by understanding class as the result of a complex ensemble of not just economic, but also social and cultural resources or types of ‘capital.’ Similar to the category of ‘prestige’ in 1950s American sociology, Bourdieu defines an individual’s position in society in terms of his or her ‘distinction’ and ‘habitus,’ a concept referring to socio-cultural outlook and self-stylization. Though obviously fitting the self-oriented consumer and leisure society of the later twentieth century especially well, the appeal of Bourdieu’s approach also extends to the analysis of more traditional societies in the early modern era, and even to non-Western societies, not least because of Bourdieu’s own intellectual origins as a social anthropologist. As Bourdieu at the same time draws on ‘classical’ authors of stratification research such as Weber and Marx, his work is extremely important in providing a bridge between the classical theoretical tradition and the more recent approaches that center around language, experience, culture, and consumption.
4. Beyond Class, Beyond Stratification? While the ‘culturalist turn’ in the humanities has substantially expanded and reframed perspectives on stratification and class in history since around 1980, other scholars have gone a step further in arguing that the conceptual framework of stratification and class might better be abandoned altogether and replaced by alternative concepts of social order. However, that does not mean that they would deny the existence, and empirical relevance, of stratification systems and class formation under certain historical circumstances. Rather, the argument is that the category of class itself is a historical product of the nineteenth century, and that therefore the continuing conceptual primacy of class endangers the ability of scholars to recognize different lines of social division and inequality.
4.1 Gender, Race, and Ethnicity This point has been most persuasively made by feminist social scientists and historians who have argued that the predominance of class in social history has in fact only been able to describe the male half of the social world, and even that in an inadequate manner, because class formation among male individuals (or household heads) cannot be conceived of independently of the family and gender systems out of which the sphere of production and patriarchy grows (see Canning 1992). In a similar way, historians of race and ethnicity have maintained that the stratification of a society in terms of individuals’ economic positions never functions as an independent variable in multiracial societies such as the US, Brazil, or South Africa, nor even in European societies which are increasingly being shaped by their immigrant populations. It has hence become almost a standard method in American, and now also in British social history, to frame the problem of social inequality in the triad of ‘race, class, and gender.’ In the view of many feminist historians, however, this solution is an unsatisfactory compromise, because it implicitly affirms a conceptual primacy of class by molding the two other concepts after the example of class, or as mere expansions of it (see Scott 1986). Therefore, the history of gender is in need of developing specific theoretical foundations out of twentieth-century social theory in order to establish itself as an independent approach in social studies, and not just a part of the ‘litany of class, race, and gender’ (Scott 1986, p. 1055). In the near future, we will perhaps see more theoretical efforts, and empirical historical studies, going in that direction, not just in gender history, but also in the history of ethnicity.
4.2 From Classes to Life-Styles, ‘Milieus,’ and Radical Individualism A completely different line of thought, which also calls traditional concepts of stratification and class into question, has emerged in recent sociological work on contemporary Western societies in their ‘postindustrial’ stage. As industrial work, and the labor and class relations so intimately connected with it, loses its central status for the definition of modern societies, these societies may no longer be adequately described as stratified class societies. The large, socially coherent and even politically self-conscious groups such as the ‘working class’ are considered to be rapidly dissolving, without new, equally coherent group structures building up. Instead, social relations in modern society may be described as based on radical individualization, and a similarity of social status at best finds expression in rather diffuse social ‘milieus’ (see Beck 1986, Hradil 1987). Western societies, in this perspective, are not becoming completely leveled societies with citizens of equal material resources, but the absence of material wants, and a re-definition of the value system towards post-material orientations, dramatically diminish the relevance of occupational situation, income, and even class-specific consumption patterns. Just as ‘social inequality’ (in the form of social strata and classes) was the defining element of nineteenth- and early twentieth-century societies, these authors hold, so categories like ‘risk’ or ‘information’ are defining societies on the verge of the twenty-first century. Although not yet widely received in social history, these new developments in theoretical and empirical sociology will probably have a considerable impact on historical views on inequality and social structure in the future. They offer important insights into the now rapidly expanding field of twentieth-century social history, but it may also be asked whether societies before the industrial age can be interpreted beyond the classical thought system of social estate, class, and stratification.
4.3 Constructing Class, Constructing Inequality: From Economic Realism to Cultural Relativism With the class structure of industrial societies appearing more as an historical exception than as historical normalcy, and with the interest in the socio-economic foundations of inequality and class diminishing, historians have become more sensitive to the relativist or constructivist character of social structure in general and of ‘class’ in particular. Especially in British social history, questions have been asked about the ‘language of class,’ that is, about the use of concepts such as class by the historical actors themselves, who were trying to express collective self-identities by communicating new notions of social bonding. With regard to the working classes, the main thrust of this research was on political organization and political action (e.g., in English Chartism), and the objective nature of class was hardly ever fundamentally doubted (see Stedman Jones 1983). Especially in the 1990s, however, and with empirical interest shifting somewhat to the middle classes, the language approach was radicalized under the double influence of cultural and linguistic theory on the one hand, and of ‘radical constructivism’ in epistemology and the sociology of knowledge on the other hand. Stratification and class were now analysed as ‘imaginary’ realms of communication and culture; they figured as notions that expressed ‘visions’ of the social order as perceived by contemporaries, rather than realities of economic and power structure as described by social scientists and historians (see Corfield 1991, Joyce 1991, Nolte 2000, Wahrman 1995). Some of these approaches fundamentally undermine any ‘realistic’ concept of social inequality and class, and therefore sound like a triumphant signal for the final abandonment of any social scientific or historical study of stratification and class structures. Yet at the same time it can be argued that the language-oriented and constructivist approaches have introduced important expansions to the more traditional methods of social history. They have provided bridges between social and intellectual history and have induced historians to use language and concepts more carefully. Ironically, in devoting so much energy to the problem of class in historical language and self-consciousness, this work has also underscored the importance of inequality and class in past societies.
5. Conclusion After a long period of undisputed reign as a kind of master concept in the study of present and past social structures, theories of stratification and class are now undergoing a period of substantial challenge and criticism, redefinition, and even rejection. The question may thus arise whether theoretical and empirical interest in social inequality will persist in the future, especially in the discipline of history, which is now characterized by a very pronounced shift away from problems of socio-economic structure in favor of a history of culture, symbolism, and subjectivity. It is almost certain that inequality, stratification, and class will lose (and already have lost) some of their original relevance as defining ground for social history as such, when compared with the 1950s to 1970s boom of those topics in the social sciences. In addition, the emerging intercultural debates on the history of non-Western societies may prove that concepts such as ‘class’ and ‘social strata,’ which are obviously deeply rooted in the Western tradition of thought and culture since classical antiquity, are not suitable instruments for analyzing Indian, or Central African, societies, past or present. As ‘postcolonial theories’ in the social and cultural sciences describe the ‘hybrid’ character of such societies, the hybridity of class and competing principles of social relations may also be a distinctive mark of their inequality structures. On the other hand, three points may be made which serve as a reminder of the continuing importance, not only of the general notion of social inequality, but also of more specific concepts like stratification and class. First, social inequality, whatever it is grounded on and in whatever way it is communicated, is an anthropological feature of all human civilization, not only in the West. Second, inequality has often taken the form of a hierarchical order, or ‘stratification,’ of distinct groups according to their access to power, their rights, and material well-being. Moreover, the hierarchical mode of social differentiation is a widespread mark of the less advanced, less technologically and culturally complex societies in which historians are often especially interested. Third, it is hard not to acknowledge the success, and the astonishing flexibility, of the concept of ‘class’ for describing systems of inequality. As new and intellectually promising approaches have been developed since around 1980 that link ‘culture’ with ‘class,’ and as these approaches are now applied in empirical research, there can be little doubt that the history of inequality, stratification, and class will retain a prominent place in the field of social and cultural history.
See also: Class: Social; Equality and Inequality: Legal Aspects; Inequality; Inequality: Comparative Aspects; Marx, Karl (1818–89); Mobility: Social; Social Stratification; Weber, Max (1864–1920)
Bibliography
Barber B, Lipset S M, Hodge R W, Siegel P M, Stinchcombe A L, Rodman H 1968 Stratification, Social. In: Sills D (ed.) International Encyclopedia of the Social Sciences. Macmillan and Free Press, New York, Vol. 15
Beck U 1986 Risikogesellschaft. Auf dem Weg in eine andere Moderne. Suhrkamp, Frankfurt, Germany
Bendix R, Lipset S M (eds.) 1953 Class, Status, and Power: Social Stratification in Comparative Perspective. Free Press, New York
Blumin S M 1989 The Emergence of the Middle Class. Social Experience in the American City, 1760–1900. Cambridge University Press, New York
Bourdieu P 1979 La distinction. Critique sociale du jugement. Editions de Minuit, Paris
Canning K 1992 Gender and the Politics of Class Formation: Rethinking German Labor History. American Historical Review 97: 736–68
Cohen L 1990 Making a New Deal: Industrial Workers in Chicago, 1919–1939. Cambridge University Press, Cambridge, UK
Corfield P J (ed.) 1991 Language, History, and Class. Blackwell, Oxford
Dahrendorf R 1959 Class and Class Conflict in Industrial Society. Stanford University Press, Stanford, CA
Duby G 1978 Les Trois Ordres ou l’Imaginaire du Féodalisme. Gallimard, Paris
Giddens A 1973 The Class Structure of the Advanced Societies. Hutchinson, London
Hradil S 1987 Sozialstrukturanalyse in einer fortgeschrittenen Gesellschaft. Von Klassen und Schichten zu Lagen und Milieus. Leske und Budrich, Opladen, Germany
Joyce P 1991 Visions of the People: Industrial England and the Question of Class 1848–1914. Cambridge University Press, Cambridge, UK
Kocka J 1975 Theorien in der Sozial- und Gesellschaftsgeschichte. Vorschläge zur historischen Schichtungsanalyse. Geschichte und Gesellschaft 1: 9–42
Kocka J 1980 The Study of Social Mobility and the Formation of the Working Class in the 19th Century. Le Mouvement Social 111: 97–117
Kocka J 1987 Lohnarbeit und Klassenbildung. Arbeiter und Arbeiterbewegung in Deutschland 1800–1875. J H W Dietz, Berlin, Germany
Lenski G E 1966 Power and Privilege: A Theory of Social Stratification. McGraw Hill, New York
Levine B et al. American Social History Project 1989–92 Who Built America? Working People and the Nation’s Economy, Politics, Culture, and Society. 2 Vols. Pantheon, New York
Nolte P 2000 Die Ordnung der deutschen Gesellschaft: Selbstentwurf und Selbstbeschreibung im 20. Jahrhundert. Beck, Munich, Germany
Scott J W 1986 Gender: A Useful Category of Historical Analysis. American Historical Review 91: 1053–5
Stedman Jones G 1983 Languages of Class: Studies in English Working Class History, 1832–1982. Cambridge University Press, Cambridge, UK
Thernstrom S 1964 Poverty and Progress: Social Mobility in a Nineteenth Century City. Harvard University Press, Cambridge, MA
Thompson E P 1963 The Making of the English Working Class. Penguin, Harmondsworth, UK
Wahrman D 1995 Imagining the Middle Class. The Political Representation of Class in Britain, 1780–1840. Cambridge University Press, Cambridge, UK
Weber M 1978 Economy and Society. An Outline of Interpretive Sociology. In: Roth G (ed.) 2 Vols. University of California Press, Berkeley, CA
Wehler H-U 1987–95 Deutsche Gesellschaftsgeschichte, 3 Vols. C H Beck, Munich, Germany
Zunz O 1990 Making America Corporate 1870–1920. University of Chicago Press, Chicago
P. Nolte
Social Influence, Psychology of Social influence refers to the ways people alter the attitudes or behavior of others. Typically social influence results from a specific action, command, or request, but people also alter their attitudes and behaviors in response to what they perceive others might do or think. Social psychologists often divide social influence phenomena into three categories (conformity, compliance, and obedience), although the distinction between the categories is often difficult to discern. Conformity generally refers to changes in opinion or behavior to match the perceived group norm. These changes are voluntary, although the individual may not always be aware that he or she is conforming. Compliance refers to changes in behavior that are elicited by a direct request. These requests usually call upon the individual to buy a product, do a favor, or donate money or services. Obedience refers to responses to direct commands or demands from an authority figure. Although individuals are not physically coerced into these responses, they often feel as if they have little choice but to obey the command.
1. Conformity People exhibit conformity when they change their attitudes or behavior to resemble what they believe most people like them would think or do. Everyday examples of conformity include selecting clothes to match what other people are wearing and recycling newspapers because it appears that most people support recycling. Psychologists have identified two primary motives to explain conformity behavior: informational influence and normative influence (Deutsch and Gerard 1955). When informational influence is operating, individuals conform because they want to be accurate. People agree with the perceived norm because they lack confidence in their own judgment and assume that the typical judgment is correct. When conformity reflects normative influence, the individual’s primary concern is to gain social approval and to avoid the consequences of appearing deviant. People go along with the perceived norm because they hope their conformity will lead to acceptance from the group or because they hope to avoid the criticism, ridicule, and possible rejection that may follow a failure to conform. In practice, informational influence and normative influence often operate hand in hand. For example, a conforming individual may hope to gain acceptance from a group and may also believe that the majority of group members are correct in their judgment. The first demonstration of informational influence in a conformity study was reported by Muzafer Sherif (1936). This classic investigation placed male participants in a totally darkened room. Periodically a small dot of light appeared on a wall approximately 15 feet away, and the participants’ task was to report how much, if at all, they saw the light move. In fact, the light did not move. Sherif took advantage of a perceptual illusion known as the autokinetic effect, the tendency to see a stationary point of light move in a dark environment.
When three participants sat together and reported their perceived movement aloud, Sherif found that over time the estimates each participant gave tended to resemble those of the others in the room. In other words, each group established its own norm for the amount of movement they perceived, and the group members tended to see an amount of movement close to that norm. These participants typically were unaware that they had been influenced by the group, and later investigations found participants continued to rely on the group norm even after the other individuals were no longer in the room.
A second set of early conformity studies demonstrated the power of normative influence. Solomon Asch (1951) asked participants to engage in a simple line judgment task. In the standard procedure, several individuals at a time were presented with a picture of a ‘standard’ line accompanied by three comparison lines. The task was to identify which of the comparison lines was the same length as the standard line. The individuals gave their responses aloud for the experimenter to record. Because the length of the lines differed by noticeable amounts, the task was an easy one. However, only one of the individuals in the study was a real participant. The others were confederates who, beginning with the third line judgment task, began to give wrong answers (before the real participant’s turn to respond). On 12 of the remaining 16 trials, the real participant heard each of the others select a comparison line that clearly was incorrect. The question was whether the participants would give what they knew to be the correct answer, or if they would go along with the response given by the group. Typically, participants went along with what their senses told them. However, on 37 percent of the trials the participants conformed to the norm and gave the same response as the rest of the group. Moreover, approximately three-quarters of the participants went along with the majority at least once. Most of the conforming participants in Asch’s studies were motivated by normative influence. That is, these individuals were concerned about how they would look in the eyes of the other participants if they were to express their true judgments. When participants in one study were allowed to give their responses in private, the rate of conformity dropped dramatically (Asch 1956). However, at least some informational influence also appeared to be operating in the Asch studies. That is, some people came to believe that the shared perception of the group must be more accurate than their own visual perception.
1.1 Variables Affecting Conformity Although numerous replications of Asch’s findings provide persuasive evidence for the power of conformity, not all people conform to the same degree in the same situation. For example, cultural variables have been found to play a role in the conformity process. Bond and Smith (1996) reviewed 133 studies using the Asch procedure conducted in 17 countries. They found participants from collectivist cultures (see Cultural Psychology) were more likely to conform to the group norm than people from individualistic cultures. These investigators also found a trend toward less conformity since the 1950s in studies with US participants. Thus, changes in societal standards, such as the value placed on independence and autonomy, may affect the extent to which members of a given culture conform. Age also appears to play a role in this process. Consistent with conventional wisdom, researchers find that young adolescents are particularly prone to conforming with the behavior of peers. In addition, several investigations find that personality plays a role in conformity. People who have a high need to feel in control are less likely to conform, whereas those who are highly concerned with the impression they make on others are more likely to go along with the perceived norm. Psychologists have not always agreed on the extent to which gender influences conformity behavior (Cooper 1979, Eagly and Chrvala 1986). Although some disagreement remains about the size of the effect, it does appear that women are more likely to conform than men. Moreover, this tendency to conform seems to be especially likely in face-to-face situations like the Asch procedure. Explanations for this difference include men’s need to appear independent and women’s tendency to promote smooth social interactions. Finally, the number of other people in the situation affects conformity behavior. Asch (1956) varied the number of confederates in his procedure, creating conditions with one to fifteen confederates. He found that conformity increased noticeably as the number of confederates grew, but that this effect leveled off after three or four. In other words, five or fifteen confederates giving an incorrect response to the line judgment task generated no more tendency to conform to the norm than four confederates. Other researchers have developed elaborate models to account for the effects of crowd size on conformity (Latané 1981, Tanford and Penrod 1984). These models consider such variables as characteristics of the individuals in the group and physical proximity of the group members.
1.2 Perceived Norms
The impact of norms is not limited to situations like the Asch paradigm, in which people observe the behavior of others and face immediate social consequences. Rather, many daily decisions are influenced by what individuals believe to be the normative behavior for a given situation. Whether people give money to a charitable cause, yield the right of way to pedestrians, or take an extra copy of the paper from a newspaper box depends to some degree on what they perceive to be the normative behavior in that situation. Researchers have identified two kinds of perceived norms (Cialdini et al. 1991). Injunctive norms refer to correct behavior, that is, what society says one should do in the situation. Injunctive norms tell people they should give to charitable causes and that they should not take a newspaper they have not paid for. Descriptive norms refer to what most people actually would do in the situation. Although people typically
do not have adequate data to make an accurate assessment of what most people would do, they commonly rely on whatever information they have to estimate the descriptive norm, however incorrect they may be. Injunctive and descriptive norms often are very similar and lead us to the same decision about our behavior. However, psychologists also point to some notable exceptions. For example, a widely accepted injunctive norm states that people should not litter in public settings. But a parking garage filled with bits of litter suggests to passersby that the descriptive norm— what people actually do—is quite different. Whether people respond to the injunctive or descriptive norm in this situation may depend on which of these two is brought to their awareness. Researchers in one set of studies placed useless flyers on windshields in a parking garage (Reno et al. 1993). When the drivers’ attention was drawn to the litter that others had apparently tossed to the ground (descriptive norm), they tended to also drop the flyer on the ground. When these drivers were made aware of the injunctive norm, such as when they saw another person pick up a piece of litter to keep the environment clean, most decided against littering.
2. Compliance
People demonstrate compliance when they agree to an explicit request, such as a request to buy a product or to volunteer their time. Much of the research on compliance has examined the effectiveness of sequential-request procedures. By presenting participants with certain kinds of request in specific sequences, investigators have identified many of the psychological processes that lead people to accept or refuse a request. Four of these procedures are described here.
The most widely researched of the sequential-request procedures is the foot-in-the-door technique (Freedman and Fraser 1966). The procedure begins with a small request that virtually everyone agrees to. At some later point, the same requester or a different individual returns and presents a larger request. This second request is known as the target request, because getting participants to agree with the larger request is the real purpose of the procedure. If the foot-in-the-door manipulation is successful, participants will comply with the target request at a higher rate than if they had been presented only with the target request. Numerous investigations demonstrate that the foot-in-the-door procedure can increase compliance to the request. However, reviews of this literature find that many studies fail to produce the effect and that the overall size of the effect is not large (Burger 1999). Part of the problem with this research is that investigators have used a variety of procedures to create a foot-in-the-door manipulation, some of which are not effective. For example, the foot-in-the-door procedure is more likely to increase compliance when participants are allowed to perform the initial request, when the initial request requires more than a minimal amount of effort, and when the second request appears to be a continuation of the initial request (Burger 1999).
Researchers disagree on the psychological processes underlying the foot-in-the-door phenomenon. The most widely cited interpretation for the effect is the self-perception explanation. According to this account, agreeing with the initial request causes people to alter the way they think about themselves. When later presented with the target request, these individuals are said to use their compliance to the small request as an indicator of their attitude toward such causes or toward participating in these kinds of activities. Because they agreed to the small request, they are more likely than control participants to agree to the target request.
Another common sequential-request procedure begins with an initial request so large that most people refuse. These individuals then are presented immediately with a smaller (target) request. If this door-in-the-face technique is successful, participants will agree to the target request more often than if they had not first rejected the large request (Cialdini et al. 1975). For example, participants in one study were asked to volunteer for a two-year placement working as a youth counselor. After they refused, experimenters asked participants if they would chaperone some children on a two-hour trip to the zoo. These participants agreed at a higher rate than those asked only the target request (Cialdini et al. 1975). The door-in-the-face procedure is said to be effective because the requester takes advantage of a widely practiced social rule called the reciprocity norm.
The reciprocity norm governs many social exchanges and requires that people return favors and acts of kindness (Gouldner 1960). Most people feel a strong sense of obligation to someone who has done them a favor, even when that favor was not requested. Moreover, studies find that people often give back more than they received in order to relieve themselves of their obligation. When applied to the door-in-the-face situation, the requester ‘gives’ the individual what appears to be a concession. That is, the requester seems to give up something he or she wants, and the individual reciprocates by giving something to the requester, i.e., the target request. Another sequential-request procedure begins when the participant agrees to a request at a given price. Requesters using the low-ball procedure then raise the cost of the request slightly. The procedure leads to more compliance at the higher price than when the price is presented at the outset (Cialdini et al. 1978). For example, undergraduates in one study agreed to participate in a psychology experiment. The students then were informed that the experiment was to be held
at 7.00 a.m., thus making participation more costly than they probably had imagined. Nonetheless, these students showed up for the early morning experiment more often than those told about the time of the experiment at the beginning (Cialdini et al. 1978). The effectiveness of the low-ball procedure is said to result from a sense of commitment that develops when the individual agrees to the initial request. The individuals feel committed to perform the agreed-upon behavior, or at least to do something for the requester (Burger and Petty 1981). This sense of commitment causes people to continue to agree to the request even at the higher price.
In contrast to the small-to-large price progression used in the low-ball technique, the that’s-not-all procedure uses a large-to-small price progression. Requesters using this procedure present the request at a given price, then improve the deal before the individual can respond (Burger 1986). For example, participants in one study were told the price of a cupcake was one dollar. Before participants could respond, the salesperson said the price really was 75 cents. The participants in this condition were more likely to buy the cupcake than those told at the outset that the price was 75 cents (Burger 1986). The that’s-not-all procedure is said to work in part because, as in the door-in-the-face procedure, the requester appears to be making a concession. The procedure also is effective because it appears to alter the ‘anchor point’ the individual uses to decide if the final price is a good or fair one. That is, participants are said to use the initial price as a reference point when deciding whether to agree with the final price. Because the final price is lower, participants are more likely to see it as reasonable than when they are presented only with the lower price.
3. Obedience
Obedience refers to an individual’s response to a command from an authority figure. Implicit in this description is the notion that the recipient of the command is reluctant to engage in the behavior and probably would not unless given direct orders to do so. Consequently, psychologists’ ability to conduct controlled laboratory studies of obedience has been limited by concerns for the welfare of participants who are torn between obeying commands and following their conscience. As a result of heightened concerns for the ethical treatment of human participants in recent years, most of the insights psychologists have about the obedience process come from a series of demonstrations conducted in the 1960s by Stanley Milgram (1974).
The basic procedure in Milgram’s studies used one real participant, one confederate posing as a participant, and an experimenter. The study was described to participants as a learning experiment designed to test the effects of punishment. The participant’s task was to teach a list of word associations to the confederate by punishing the confederate for incorrect answers. The method of punishment was electric shock, with the participant instructed to give increasingly severe shocks to the confederate for each wrong answer. The participant’s dilemma occurred when the confederate complained about the shocks and demanded to be released from the shock apparatus. In most versions of the procedure the participant could hear the confederate screaming in pain through the wall that separated them. Although it was apparent that the confederate was suffering and was probably receiving dangerous levels of electric shock, the experimenter commanded the participant to continue giving shocks. The procedure continued until the participant refused to go on, or until the participant pressed the highest button on the apparatus (labeled 450 volts) three times, for a total of 33 shocks.
The results of the study were unexpected. Prior to conducting the investigation, Milgram asked students, middle-class adults, and psychiatrists to predict the results. Everyone agreed that the typical participant would stop very early in the shock sequence and that virtually no one would continue to the 450 volt level. However, in the basic procedure Milgram found that 65 percent of the participants continued to follow the experimenter’s commands all the way to the end.
Milgram (1974) conducted a series of variations on the basic procedure in an effort to explain the participants’ high level of obedience. He found that proximity of the participant and confederate appeared to play a role. Obedience dropped to 40 percent when the participant was in the same room as the confederate, and dropped to 30 percent when the participant was required to touch the confederate to administer the shock.
Although the original studies were conducted with male participants, a similar pattern was found when Milgram used women as participants. The level of obedience did not diminish when the experimenter gave his commands in a meek manner, but dropped to 20 percent when the commands were given by another confederate posing as a participant. Milgram concluded from this last finding that obedience does not depend on the authority figure’s personality or charismatic style, but rather that obedience requires only that the individual be recognized as a legitimate authority. Most social psychologists point to Milgram’s research as an example of the often-unrecognized power of the situation to influence behavior. Although personality characteristics of the participants can affect whether they obey or refuse the command (Blass 1991), the discrepancy between people’s predictions and the actual outcome of the Milgram studies suggests that the situation had a greater influence than most people recognize. Other psychologists point out that Milgram’s procedure also took advantage of conformity and compliance processes. For example,
by moving from small to progressively larger shocks, Milgram placed participants in a situation similar to that used in foot-in-the-door demonstrations. Milgram also found a drop in obedience when participants had information about descriptive norms. When participants saw two other participants (actually confederates) refuse the command, only 10 percent obeyed the experimenter’s commands to the end.
4. Future Directions
Although researchers have identified many social influence phenomena, questions remain about the psychological processes underlying some of these behaviors. Future research is likely to examine these processes, with particular attention to the way multiple processes combine to produce conformity and compliance. Investigators also may focus on variables affecting social influence. In particular, cultural differences in conformity and compliance behavior are likely to be explored. However, future research on obedience probably will be limited by concerns for the ethical treatment of human participants. For many years now, these concerns have prevented researchers from replicating the procedures used by Milgram in his obedience studies. The challenge for investigators is to develop procedures that create a realistic obedience situation without exposing participants to excessive emotional stress.
See also: Compliance and Obedience: Legal; Conformity: Sociological Aspects; Experimental Design: Compliance; Impression Management, Psychology of; Obedience: Social Psychological Perspectives
Bibliography
Asch S E 1951 Effects of group pressure upon the modification and distortion of judgments. In: Guetzkow H (ed.) Groups, Leadership, and Men. Carnegie Press, Pittsburgh, PA
Asch S E 1956 Studies of independence and conformity: A minority of one against a unanimous majority. Psychological Monographs 70 (9): 416
Blass T 1991 Understanding behavior in the Milgram obedience experiment: The role of personality, situations, and their interactions. Journal of Personality and Social Psychology 60: 398–413
Bond R, Smith P B 1996 Culture and conformity: A meta-analysis of studies using Asch’s (1952b, 1956) line judgment task. Psychological Bulletin 119: 111–37
Burger J M 1986 Increasing compliance by improving the deal: The that’s-not-all technique. Journal of Personality and Social Psychology 51: 277–83
Burger J M 1999 The foot-in-the-door compliance procedure: A multiple-process analysis and review. Personality and Social Psychology Review 3: 303–25
Burger J M, Petty R E 1981 The low-ball compliance technique: Task or person commitment? Journal of Personality and Social Psychology 40: 492–500
Cialdini R B, Cacioppo J T, Bassett R, Miller J A 1978 The low-ball procedure for producing compliance: Commitment then cost. Journal of Personality and Social Psychology 36: 463–76
Cialdini R B, Kallgren C A, Reno R R 1991 A focus theory of normative conduct: A theoretical refinement and reevaluation of the role of norms in human behavior. Advances in Experimental Social Psychology 24: 201–34
Cialdini R B, Vincent J E, Lewis S K, Catalan J, Wheeler D, Darby B L 1975 Reciprocal concessions procedure for inducing compliance: The door-in-the-face technique. Journal of Personality and Social Psychology 31: 206–15
Cooper H M 1979 Statistically combining independent studies: A meta-analysis of sex differences in conformity research. Journal of Personality and Social Psychology 37: 131–46
Deutsch M, Gerard H B 1955 A study of normative and informational social influences upon individual judgment. Journal of Abnormal and Social Psychology 51: 629–36
Eagly A H, Chravala C 1986 Sex differences in conformity: Status and gender-role interpretation. Psychology of Women Quarterly 10: 203–20
Freedman J L, Fraser S C 1966 Compliance without pressure: The foot-in-the-door technique. Journal of Personality and Social Psychology 4: 195–202
Gouldner A W 1960 The norm of reciprocity: A preliminary statement. American Sociological Review 25: 161–78
Latané B 1981 The psychology of social impact. American Psychologist 36: 343–56
Milgram S 1974 Obedience to Authority. Harper, New York
Reno R R, Cialdini R B, Kallgren C A 1993 The transsituational influence of social norms. Journal of Personality and Social Psychology 64: 104–12
Sherif M 1936 The Psychology of Social Norms. Harper, New York
Tanford S, Penrod S 1984 Social influence model: A formal integration of research on majority and minority influence processes. Psychological Bulletin 95: 189–225
J. M. Burger
Social Insurance: Legal Aspects
1. Social Insurance and Private Insurance
The notion of insurance is widespread and clear: insurance is a way of providing for the consequences of negative future events (risks or contingencies) in a collective manner with a special technique of financing. The distinction between private and social insurance is less clear: social insurance is organized by public authorities through legislation, and is usually compulsory, while private insurance is usually voluntary. Social insurance seeks to provide coverage against the consequences of risks like sickness, industrial accident, invalidity, old age, and unemployment for a large part of the population of a nation. To attain these goals, a compulsory form of insurance is considered a necessity. Bismarck’s social insurance in Germany is in some ways the prototype of social insurance. However, his social insurance covered only workers rather than the general population (Koehler and Zacher 1982).
The difference between social and private insurance can be found in the goals, the techniques of protection, the legal form of the insurance bodies as private or public organizations, and the way of financing the benefits. But there is no clear distinction between them. Sometimes, the goals of social insurance are attained by compulsory private insurance and, in some countries, the bodies responsible for social insurance are private bodies. One of the main distinctions between social and private insurance concerns the ways in which risk is calculated and financed. In the field of private insurance, the basis of calculation of the premiums is the risk. In the field of social insurance, the risk is less important, and social aims (such as health and welfare) prevail against the economic criteria. Private insurance, moreover, distributes the burden of an individual risk across a group. Social insurance has the same effects, but it also effects income redistribution between those who are healthy and sick, richer and poorer, younger and older people, or households without and with family members.
2. Two Characteristic Items of Social Insurance: Financing and Organization
Thus, two characteristic items of social insurance remain, which allow us to distinguish between social insurance and other techniques of social protection. Those interlinked items are the ways of financing social insurance and the organization of social insurance.
2.1 Financing
Social insurance benefits were originally financed by contributions levied on earnings and divided between employee and employer, thus expressing the shared responsibility between employer and employee with regard to work-related social risks. State subsidies completed the financing. However, this scheme changed over time. Some countries introduced increasing state subsidies, while others changed the proportion of the contributions burden payable by employer and employee. In Germany, for example, the employees had to relinquish a public holiday to compensate employers for the new financial burden created by the introduction of long-term care insurance. Currently, there are ongoing discussions about changing the financial basis from the earnings of employees to the surplus value of enterprises.
2.2 Organization
Originally, social insurance bodies were special bodies that were not directly integrated in the administrative organization of a state. Sometimes, they are co-administered by representatives of employers and employees. This kind of self-government is more prevalent in countries where social insurance bodies are private organizations. In countries with special public authorities for social insurance, but without self-governing or autonomous elements of administration, the ideal image of social insurance is left behind.
3. Social Insurance and Social Security
The concept of social insurance should not be confused with the concept of social security (cf. UN Declaration of Human Rights, art. 22), which is an international goal of social policy. Social insurance is considered a special but nevertheless widespread technique of social protection. It combines the scope and character of a compulsory insurance on the one hand, and the financing of the benefits by obligatory contributions on the other. From the perspective of international law, social insurance does not entail special legal problems. Nevertheless, international law scholars have taken a special interest in social insurance as a mechanism of social protection. Thus, many international treaties and other international or regional instruments, especially involving Europe, address social insurance. In the field of social security, international law has two main dimensions: (1) coordinating different national social security schemes to guarantee freedom of movement, and (2) defining social standards and aims (e.g., minimum standards, standards to attain, principles). The coordination task was originally primarily due to the migration of workers, while the standard-defining task began with the Treaty of Versailles, which can be considered the basis of the development of an international social policy (Zacher 1976).
4. Fulfillment of International Requirements (International Standards of Social Insurance)
Two kinds of requirements prevail internationally: attainment of minimum standards and achievement of higher levels than a given minimum.
4.1 Minimum Standards
The best known and most basic international instrument to set minimum standards of social security is Convention 102 of the International Labor Organization (ILO) (1952). This convention, ratified by 40 states, contains minimum standards regarding medical care and benefits in the case of sickness, unemployment, old age, employment injury, maternity, invalidity, and for survivors. This convention applies to insurance schemes as well as other schemes of protection.
4.2 Higher Levels of Social Standards
The European Social Charter (1961) of the Council of Europe, a regional instrument for the European countries, serves as a social standard instrument for the European Union nations. It is now ratified by 24 states. In the frame of the Council of Europe, other social security conventions (e.g., Code of Social Security) also aim at higher social standards.
5. Social Insurance Benefits
5.1 Range of Benefits
The range of risks covered by social insurance schemes, labeled as ‘traditional’ social risks or contingencies, is described in ILO Convention 102 on minimal standards of social security (see Sect. 4.1) (von Maydell and Hohnerlein 1994). As new social risks develop, it is not clear whether protection will be provided by social insurance schemes or by other means (van Langendonck 1997, Pieters 1998). Social insurance bodies do not provide benefits unique to social insurance. Other bodies of social protection and even private insurance bodies provide comparable or identical benefits. However, there are typical links between social insurance and the benefit provided, especially for those cash benefits designed to replace former earnings, e.g., sickness benefits or old-age pensions. As the insured person pays contributions based on his/her earnings (usually a certain percentage of the income), the benefit replacing the earnings is calculated as a function of the former earnings and is not provided as a flat-rate sum. Whereas private insurance is almost always paid in cash, social insurance benefits can be either benefits in kind or in cash.
5.2 Guarantee of Benefits
In some countries, entitlement and even qualifying periods for social insurance entitlement are protected by the constitution. In the German constitutional doctrine, contributions to the old-age pension scheme are protected by the constitution as property. The new Ukrainian Constitution gives another kind of guarantee which may apply to social insurance benefits: it requires that the content and scope of existing rights and freedoms shall not be diminished in the adoption of new laws or in the amendment of laws that are in force.
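The link between earnings-based contributions and earnings-related cash benefits, described under Range of Benefits, can be illustrated with a short sketch. All rates and amounts below are hypothetical and chosen only for illustration; they do not describe any actual national scheme, and the function names are assumptions of this sketch.

```python
def split_contribution(earnings, contribution_rate, employer_share=0.5):
    """Split an earnings-based contribution between employee and employer.

    contribution_rate and employer_share are hypothetical parameters.
    """
    total = earnings * contribution_rate
    employer_part = total * employer_share
    return {"employee": total - employer_part, "employer": employer_part}


def earnings_related_benefit(former_earnings, replacement_rate):
    """Cash benefit calculated as a function of former earnings."""
    return former_earnings * replacement_rate


def flat_rate_benefit(former_earnings, amount):
    """Flat-rate benefit: the same sum regardless of former earnings."""
    return amount


# Two insured persons with different (hypothetical) monthly earnings.
for earnings in (2000, 4000):
    paid = split_contribution(earnings, contribution_rate=0.10)
    related = earnings_related_benefit(earnings, replacement_rate=0.60)
    flat = flat_rate_benefit(earnings, amount=1000)
    print(earnings, paid, round(related, 2), flat)
```

Under the earnings-related rule, the person who earned and contributed twice as much receives twice the benefit, whereas the flat-rate rule pays both the same sum.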
5.3 Public/Private Mix of Benefits
Initially, social insurance benefits were the only way of providing a subsistence base in the case of risk occurrence. Nowadays, social insurance benefits are increasingly complemented by other means of protection. Mainly in the field of old-age pensions, the interplay of public, occupational, and private protection increasingly serves as a current mode of protection (International Bank for Reconstruction and Development 1994).
6. Future Developments in the Field of Social Insurance
The technique of social insurance was one of the most important tools for reaching the goals of social security in the evolution of the welfare states. Social insurance, in the sense of Bismarck’s social insurance, has undergone major changes. The initial role of social insurance, the social protection of workers, has lost its importance as the personal scope of social insurance has widened. Other forms of social protection, especially state-financed social programs for families, unemployed people, or for persons in need are increasingly prevalent. Today, there is no unique mode of using social insurance to address social problems. Social insurance schemes rely mainly on financing by contributions and are thus, as regards their financing basis, dependent on the evolution of earnings. This reliance is considered a critical factor in the future financial stability of social insurance schemes and is broadly under discussion (Pieters 1999b).
See also: Health Insurance: Economic and Risk Aspects; Insurance; Insurance and the Law; Law and Risk; Legal Insurance; National Health Care and Insurance Systems; Social Security
Bibliography
Comparative Tables of Social Security Schemes in Council of Europe Member States not Members of the European Union, in Australia, in Canada, in Armenia and in New Zealand 1999 (9th ed., situation at 1 July 1998) Council of Europe Publishers, Strasbourg
International Bank for Reconstruction and Development 1994 Averting the Old Age Crisis: Policies to Protect the Old and Promote Growth. Oxford University Press, New York
International Social Security Association 1993 Social Security Systems in Central and Eastern Europe. International Social Security Association, Geneva, Switzerland
International Social Security Association 1998 Social Security Issues in English-speaking African Countries. International Social Security Association, Geneva, Switzerland
International Social Security Association 1998 Social Security in Asia and the Pacific: the Road Ahead. International Social Security Association, Manila, Philippines
Koehler P A, Zacher H F (eds.) 1982 The Evolution of Social Insurance 1881–1981. Studies of Germany, France, Great Britain, Austria and Switzerland. Pinter, London
Langendonck J van (ed.) 1997 The New Social Risks. Kluwer Law International, London
Maydell B von, Hohnerlein E M (eds.) 1994 The Transformation of Social Security Systems in Central and Eastern Europe. Peeters Press, Leuven, Belgium
Myers R J 1985 Social Security. 3rd edn., published for McCahan Foundation, Bryn Mawr, PA by R D Irwin, Homewood, IL
Pennings F 1998 Introduction to European Social Security Law. 2nd edn., Kluwer Law International, The Hague, Netherlands
Pieters D 1993 Introduction into the Basic Principles of Social Security. Kluwer, Deventer, Netherlands
Pieters D (ed.) 1998 Social Protection of the Next Generation in Europe. Kluwer Law International, London
Pieters D (ed.) 1999a International Impact upon Social Security. Kluwer Law International, London
Pieters D (ed.) 1999b Changing Work Patterns and Social Security. Kluwer Law International, London
Swedish National Social Insurance Board – European Commission 1997 25 Years of Regulation (EEC) No. 1408/71 on Social Security for Migrant Workers. Stockholm
Zacher H F 1976 Internationales und Europäisches Sozialrecht. Verlag R. S. Schulz, Percha/Kempenhausen am Starnberger See, Germany
G. Igl
Social Integration, Social Networks, and Health
Since the mid-1970s there have been dozens of articles, and now books, on issues related to social networks and social support. It is now recognized widely that social relationships and affiliation have powerful effects on physical and mental health for a number of reasons (House et al. 1988, Berkman and Glass 2000, Cohen et al. 2000). When investigators write about the impact of social relationships on health, many terms are used loosely and interchangeably, including social networks, social support, social ties, and social integration. The aim of this article is to discuss: (a) theoretical orientations from diverse disciplines which are fundamental to advancing research in this area; (b) findings related to mortality; (c) a set of definitions of networks and aspects of networks and support; and (d) an overarching model which integrates multilevel phenomena.
1. Theoretical Orientations
There are several sets of theories that form the bedrock for the empirical investigation of social relationships and their influence on health. The earliest theories came from sociologists such as Emile Durkheim, as well as from psychoanalysts such as John Bowlby, who first formulated attachment theory. A major wave of conceptual development also came from anthropologists including Bott and Mitchell, as well as quantitative sociologists such as Burt, Laumann, and Wellman who, along with others, have developed social network analysis. This eclectic mix of theoretical approaches coupled with the contributions of epidemiologists form the foundation of research on social ties and health.
Durkheim’s contribution to the study of the relationship between society and health is immeasurable. Perhaps most important is the contribution he has made to the understanding of how social integration and cohesion influence suicide (Durkheim 1897). Durkheim’s primary aim was to explain how individual pathology was a function of social dynamics. In light of recent attention to ‘upstream’ determinants of health, Durkheim’s work re-emerges with great relevance today.
Bowlby (1969), one of the most important psychiatrists of the twentieth century, proposed theories suggesting that the environment, especially in early childhood, played a critical role in the genesis of neurosis. Bowlby proposed that there is a universal human need to form close affectional bonds. Attachment theory, proposed by Bowlby, contends that the attached figure creates a secure base from which an infant or toddler can explore and venture forth. The strength of Bowlby’s theory lies in its articulation of an individual’s need for secure attachment for its own sake, for the love and reliability it provides, and for its own ‘safe haven.’ Primary attachment promotes a sense of security and self-esteem that ultimately provides the basis on which the individual will form lasting, secure and loving relationships in adult life.
1.1 Social Network Theory: A New Way of Looking at Social Structure and Community
During the mid-1950s, a number of British anthropologists found it increasingly difficult to understand the behavior of either individuals or groups on the basis of traditional categories such as kin groups, tribes, or villages. Bott (1957) and Mitchell (1969) developed the concept of ‘social networks’ to analyze ties that cut across traditional kinship, residential, and class groups to explain behaviors they observed such as access to jobs, political activity, or marital roles. The development of social network models provided a way to view the structural properties of relationships among people.
Network analysis ‘focuses on the characteristic patterns of ties between actors in a social system rather than on characteristics of the individual actors themselves and use these descriptions to study how these social structures constrain network members’ behavior’ (Hall and Wellman 1985, p. 26). Network analysis focuses on the structure and composition of the network, and the contents or specific resources that flow through those networks. The strength of social network theory rests on the testable assumption that the social structure of the network itself is largely responsible for determining individual behavior and attitudes, by shaping the flow of resources which determine access to opportunities and constraints on behavior.
2. Health, Social Networks, and Integration
From the mid-1970s through the present, there has been a series of studies showing consistently that the lack of social ties or social networks predicted mortality from almost every cause of death. These studies have been done in the USA, Europe, and Asia and most often captured numbers of close friends and relatives, marital status, and affiliation or membership in religious and voluntary associations. These measures were conceptualized in any number of ways as assessments of social networks or ties, social connectedness, integration, activity, or embeddedness. Whatever they were named, they defined embeddedness or integration uniformly as involvement with ties spanning the range from intimate to extended.
In the first of these studies, from Alameda County (Berkman and Syme 1979), men and women who lacked ties to others (in this case, based on an index assessing contacts with friends and relatives, marital status, and church and group membership) were 1.9 to 3.1 times more likely to die in a 9-year follow-up period than those who had many more contacts. Another study in Tecumseh, Michigan (House et al. 1982) shows a similarly strong association for men, but not for women, between social connectedness/social participation and mortality risk over a 10–12 year period. An additional strength of this study was the ability to control for some biomedical predictors assessed from physical examination (e.g., cholesterol, blood pressure, and respiratory function).
Similar results from several more studies have been reported, one from a study in the United States and three from Scandinavia. Using data from Evans County, Georgia, Schoenbach et al. (1986) found risks to be significant in older white men and women even when controlling for biomedical and sociodemographic risk factors, although some racial and gender differences were observed. In Sweden, two studies (Welin et al. 1985, Orth-Gomer and Johnson 1987) report significantly increased risks among socially isolated adults. Finally, in a study of 13,301 men and women in Eastern Finland, Kaplan et al. (1988) have shown that an index of social connections predicts mortality risk for men but not for women, independent of standard cardiovascular risk factors.
Studies of older men and women confirm the continued importance of these relationships into late life (Seeman et al. 1993a). Furthermore, two studies of large cohorts, one of men and women in a large Health Maintenance Organization (HMO) (Vogt et al. 1992) and one of 32,000 male health professionals (Kawachi et al. 1996), suggest that social networks are, in general, more strongly related to mortality than to the incidence or onset of disease. Two studies, in Danish men (Penninx et al. 1997), and Japanese men and women (Sugisawa et al. 1994), further indicate that aspects of social isolation or social support are related to mortality. Virtually all of these studies find that people who are socially isolated or disconnected from others have between two and five times the risk of dying from all causes compared to those who maintain strong ties to friends, family, and community.
Social networks and support have been found to predict a very broad array of other health outcomes, from survival post-myocardial infarction to disease progression, functioning, and the onset and course of infectious diseases. The reader is referred to several lengthy reviews mentioned earlier for additional information.
3. A Conceptual Model Linking Social Networks to Health
Although the power of measures of networks or social integration to predict health outcomes is indisputable, the interpretation of what the measures actually assess has been open to much debate. Hall and Wellman (1985) have commented appropriately that much of the work in social epidemiology has used the term social networks metaphorically, since rarely have investigators conformed to more standard assessments used in network analysis. This criticism is reflected in calls to develop a second generation of network measures (Berkman 1986, House et al. 1988).
A second wave of research developed in reaction to this early work, and as an outgrowth of work in health psychology, that turned the orientation of the field in several ways (House and Kahn 1985, Sarason et al. 1990). These social scientists focused on the qualitative aspects of social relations (i.e., their provision of social support or, conversely, detrimental aspects of relationships) rather than on the elaboration of the structural aspects of social networks. Most of these investigators follow an assumption that what is most important about networks is the support function they provide. While social support is among the primary pathways by which social networks may influence physical and mental health status, it is not the only critical pathway (Berkman and Glass 2000). Moreover, the exclusive study of more proximal pathways detracts from the need to focus on the social context and structural underpinnings that may have an important influence on the types and extent of social support provided.
In order to have a comprehensive framework in which to explain these phenomena, it is helpful to move ‘upstream’ and return to a more Durkheimian orientation to network structure and social context. It is critical to maintain a view of social networks as lodged within those larger social and cultural contexts which shape the structure of networks. In fact, some of the most interesting recent work in the field relates social affiliation to social status, and social and economic inequality (Kawachi et al. 1997).
Conceptually, social networks are embedded in a macrosocial environment in which large-scale social forces may influence network structure, which in turn sets in motion a cascading causal process running from the macrosocial to the psychobiological to affect health. Serious consideration of the larger macrosocial context in which networks form and are sustained has been lacking in all but a small number of studies, and is almost completely absent in studies of social network influences on health.
Networks may operate at the behavioral level through at least four primary pathways: (a) provision of social support, (b) social influence, (c) social engagement and attachment, and (d) access to resources and material goods. These psychosocial and behavioral processes may influence even more proximate pathways to health status, including: (a) direct physiological responses, (b) psychological states including self-esteem, self-efficacy, and depression, (c) health-damaging behaviors such as tobacco consumption or high-risk sexual activity, and health-promoting behaviors such as appropriate health service utilization or exercise, and (d) exposure to infectious disease agents such as HIV, other sexually transmitted diseases (STDs), or tuberculosis.
The four primary psychosocial or behavioral pathways by which networks may influence health will be reviewed briefly. The reader is referred to Berkman and Glass (2000) for a lengthier review.
3.1 The Assessment of Social Networks
Social networks might be defined as the web of social relationships that surround an individual and the characteristics of those ties. Burt (1982) has defined network models as describing ‘the structure of one or more networks of relations within a system of actors.’ Network characteristics cover: (a) range or size (number of network members), (b) density (the extent to which the members are connected to each other), (c) boundedness (the degree to which they are defined on the basis of traditional structures such as kin, work, neighborhood), and (d) homogeneity (the extent to which individuals are similar to each other in a network). Related to network structure, characteristics of individual ties include: (a) frequency of contact, (b) multiplexity (the number of types of transactions or support flowing through ties), (c) duration (the length of time an individual knows another), and (d) reciprocity (the extent to which exchanges are reciprocal).
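Several of the characteristics listed above can be expressed as simple proportions. The following Python sketch is an illustration only and is not drawn from the studies cited: the function names, the toy network, and the operational definitions are assumptions made for this example (density as the share of possible member pairs that are tied, homogeneity as the share of member pairs sharing one attribute, reciprocity as the share of directed exchanges that are returned). Published network measures often use other definitions.

```python
from itertools import combinations


def network_size(members):
    """Range or size: the number of distinct network members."""
    return len(set(members))


def density(members, ties):
    """Share of possible member pairs that are actually tied (assumed definition)."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    # Treat ties as undirected; ignore self-ties and ties to non-members.
    tied = {frozenset(t) for t in ties if t[0] != t[1] and set(t) <= members}
    return len(tied) / (n * (n - 1) / 2)


def homogeneity(attribute_by_member):
    """Share of member pairs with the same attribute value (assumed definition)."""
    pairs = list(combinations(attribute_by_member.values(), 2))
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if a == b) / len(pairs)


def reciprocity(exchanges):
    """Share of directed exchanges (giver, receiver) that are returned."""
    directed = set(exchanges)
    if not directed:
        return 0.0
    return sum(1 for a, b in directed if (b, a) in directed) / len(directed)


# Hypothetical four-person network used only to exercise the functions.
members = ["Ana", "Ben", "Cal", "Dee"]
ties = [("Ana", "Ben"), ("Ben", "Cal"), ("Ana", "Cal")]
age_group = {"Ana": "older", "Ben": "older", "Cal": "younger", "Dee": "older"}
exchanges = [("Ana", "Ben"), ("Ben", "Ana"), ("Ana", "Cal")]

print(network_size(members))    # number of members
print(density(members, ties))   # tied pairs / possible pairs
print(homogeneity(age_group))   # same-attribute pairs / all pairs
print(reciprocity(exchanges))   # returned exchanges / all exchanges
```

In this toy network three of the six possible pairs are tied, so the density is 0.5. Dee, who has no ties, lowers density without changing size, which illustrates why the two characteristics are assessed separately.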
3.2 Downstream Social and Behavioral Pathways
3.2.1 Social Support. Moving downstream, discussion follows of the mediating pathways by which networks might influence health status. Most obviously the structure of network ties influences health via the provision of many kinds of support. This framework acknowledges immediately that not all ties are supportive, and that there is variation in the type, frequency, intensity, and extent of support provided. For example, some ties provide several types of support while other ties are specialized and provide only one type.
Social support typically is divided into subtypes which include emotional, instrumental, appraisal, and informational support (House 1981). Emotional support is related to the amount of ‘love and caring, sympathy and understanding and/or esteem or value available from others.’ Emotional support is most often provided by a confidant or intimate other, although less intimate ties can provide such support under circumscribed conditions. Instrumental support refers to help, aid, or assistance with tangible needs such as getting groceries, getting to appointments, phoning, cooking, cleaning, or paying bills. House identifies instrumental support as aid in kind, money, or labor. Appraisal support, often defined as the third type of support, relates to help in decision-making, giving appropriate feedback, or help deciding which course of action to take. Informational support is related to the provision of advice or information in the service of particular needs. Emotional, appraisal, and informational support are often difficult to disaggregate and have various other definitions (e.g., self-esteem support).
Perhaps even deeper than support are the ways in which social relationships provide a basis for intimacy and attachment. Intimacy and attachment have meaning not only for relationships that we think of traditionally as intimate (e.g., between partners, parents, and children) but for more extended ties. For instance, when relationships are solid at a community level, individuals feel strong bonds and attachment to places (e.g., neighborhood) and organizations (e.g., voluntary and religious).
3.2.2 Social influence. Networks may influence health via several other pathways. One pathway that is often ignored is based on social influence. Shared norms around health behaviors (e.g., alcohol and cigarette consumption, healthcare utilization) might be powerful sources of social influence with direct consequences for the behaviors of network members, quite apart from the provision of social support taking place within the network concurrently. For instance, cigarette smoking by peers is among the best predictors of smoking for adolescents (Landrine et al. 1994). The social influence which extends from the network’s values and norms constitutes an important and under-appreciated pathway through which networks impact health.
3.2.3 Social engagement. A third and more difficult to define pathway by which networks may influence health status is by promoting social participation and social engagement. Participation and engagement result from the enactment of potential ties in real life activity. Getting together with friends, attending social functions, participating in occupational or social roles, group recreation, and church attendance are all instances of social engagement. Thus, through opportunities for engagement, social networks define and reinforce meaningful social roles including parental, familial, occupational, and community roles, which in turn provide a sense of value, belonging, and attachment. Several recent studies suggest that social engagement is critical in maintaining cognitive ability (Bassuk et al. 1999) and reducing mortality (Glass et al. 1999). In addition, network participation provides opportunities for companionship and sociability. Rook (1990) argues that these behaviors and attitudes are not the result of the provision of support per se, but are the consequence of participation in a meaningful social context in and of itself. One reason measures of social integration or ‘connectedness’ may be such powerful predictors of mortality over long periods of follow-up is that these ties give meaning to individuals’ lives by virtue of enabling them to participate in it fully, to be obligated (in fact, often to be providers of support), and to feel attached to their community.
3.2.4 Person-to-person contact. Another behavioral pathway by which networks influence disease is by restricting or promoting exposure to infectious disease agents. In this regard, the methodological links between epidemiology and networks are striking. What is perhaps most remarkable is that the same network characteristics that can be health-promoting can at the same time be health-damaging if they serve as vectors for the spread of infectious disease.
The contribution of social network analysis to the modeling of disease transmission is the understanding that in many, if not most, cases, disease is not spread randomly throughout a population. Social network analysis is well suited to the development of models in which exposure between individuals is not random but rather is based on geographic location, sociodemographic characteristics (age, race, gender), or other important characteristics of the individual such as socioeconomic position, occupation, or sexual orientation (Laumann et al. 1989). Furthermore, because social network analysis focuses on characteristics of the network rather than on characteristics of the individual, it is suited ideally to the study of diffusion of transmissible diseases through populations via bridging ties between networks, or uncovering characteristics of ego-centered networks that promote the spread of disease.
3.2.5 Access to material resources. Surprisingly little research has sought to examine differential access to material goods, resources, and services as a mechanism through which social networks might operate. This is unfortunate given the work of sociologists showing that social networks operate by regulating an individual’s access to life-opportunities by virtue of the extent to which networks overlap each other. In this way networks operate to provide access or to restrict opportunities in much the same way that social status works. Perhaps the most important among the studies exploring this tie is Granovetter’s classic study of the power of ‘weak ties’ that, on the one hand, lack intimacy, but on the other facilitate the diffusion of influence and information, and provide opportunities for mobility (Granovetter 1973).
There are five mechanisms by which the structure of social networks might influence disease patterns.
While social support is the mechanism most commonly invoked, social networks also influence health through additional behavioral mechanisms including (a) forces of social influence, (b) levels of social engagement and participation, (c) the regulation of contact with infectious disease, and (d) access to material goods and resources. To date, the evidence linking aspects of social relationships to health outcomes is strongest for general measures of social integration, social support, and social engagement. However, it should be noted that these mechanisms are not mutually exclusive. In fact, it is most likely that in many cases they operate simultaneously.
4. Conclusion
The aim in this review was to integrate some classical theoretical work in sociology, anthropology, and psychiatry with the current empirical research underway on social networks, social integration, and social support. Rather than review the vast amount of work on health outcomes, which is the subject of several excellent recent papers, this article has developed a conceptual framework to guide future work. With the development of this framework, two issues of profound importance stand out.
The first is the ‘upstream’ question of identifying those conditions that influence the development and structure of social networks. Such questions have been the substantive focus of much of social network research, especially in relationship to urbanization, social stratification, and culture change. Yet little of this work becomes integrated with health issues in a way that might guide us in the development of policies or interventions to improve the health of the public. Of particular interest would be more cross-cultural work comparing countries with different values regarding social relationships, community, or sense of obligation. The same might be true of specific areas within countries, or specific cultural or ethnic groups with clearly defined values.
The second major issue relates to the ‘downstream’ question. Many investigators have assumed that networks influence health via social support functions. The framework in this article makes clear that this is but one pathway linking networks to health outcomes. Furthermore, the work on conflicts and stress points out that not all relationships are positive in valence, and that some of the most powerful impacts that social relationships may have on health are through acts of abuse, violence, and trauma (Rook 1992). Fully elucidating these downstream experiences, and the biological mechanisms through which they are linked to health, remains a major challenge in the field.
See also: Illness: Dyadic and Collective Coping; Integration: Social; Networks: Social; Social Network Models: Statistical; Social Networks and Fertility; Social Support and Health; Social Support and Recovery from Disease and Medical Procedures; Social Support and Stress; Social Support, Psychology of
Bibliography
Bassuk S, Glass T, Berkman L 1999 Social disengagement and incident cognitive decline in community-dwelling elderly persons. Annals of Internal Medicine 131: 165–73
Berkman L 1986 Social networks, support and health: Taking the next step forward. American Journal of Epidemiology 123(4): 559–62
Berkman L F, Glass T 2000 Social integration, social networks, social support and health. In: Berkman L F, Kawachi I (eds.) Social Epidemiology. Oxford University Press, New York, pp. 137–73
Berkman L F, Syme S 1979 Social networks, host resistance, and mortality: A nine-year follow-up of Alameda County residents. American Journal of Epidemiology 109: 186–204
Bott E 1957 Family and Social Network. Tavistock Press, London
Bowlby J 1969 Attachment and Loss. Hogarth Press, London
Burt R S 1982 Toward a Structural Theory of Action. Academic Press, New York
Cohen S, Underwood S, Gottlieb B 2000 Social Support Measures and Intervention. Oxford University Press, New York
Durkheim E 1897 Suicide. Free Press, New York
Glass T, Mendes de Leon C, Marottoli R, Berkman L F 1999 Population based study of social and productive activities as predictors of survival among elderly Americans. British Medical Journal 319: 478–83
Granovetter M 1973 The strength of weak ties. American Journal of Sociology 78: 1360–80
Hall A, Wellman B 1985 Social networks and social support. In: Cohen S, Syme S L (eds.) Social Support and Health. Academic Press, Orlando, FL, pp. 23–41
House J S 1981 Work Stress and Social Support. Addison-Wesley, Reading, MA
House J S, Kahn R 1985 Measures and concepts of social support. In: Cohen S, Syme S L (eds.) Social Support and Health. Academic Press, Orlando, FL, pp. 79–108
House J S, Landis K R, Umberson D 1988 Social relationships and health. Science 241: 540–5
House J S, Robbins C, Metzner H 1982 The association of social relationships and activities with mortality: Prospective evidence from the Tecumseh Community Health Study. American Journal of Epidemiology 116: 123–40
Kaplan G, Salonen J, Cohen R, Brand R, Syme S, Puska P 1988 Social connections and mortality from all causes and cardiovascular disease: Prospective evidence from eastern Finland. American Journal of Epidemiology 128: 370–80
Kawachi I, Colditz G A, Ascherio A, Rimm E B, Giovannucci E, Stampfer M J et al. 1996 A prospective study of social networks in relation to total mortality and cardiovascular disease in men in the U.S.A. Journal of Epidemiology and Community Health 50: 245–51
Kawachi I, Kennedy B, Lochner K, Prothrow-Stith D 1997 Social capital, income inequality, and mortality. American Journal of Public Health 87: 1491–8
Landrine H, Richardson J K, Klonoff E A, Flay B 1994 Cultural diversity in the predictors of adolescent cigarette smoking: the relative influence of peers. Journal of Behavioral Medicine 17: 331–6
Laumann E O, Gagnon J H, Michaels S, Michael R T, Coleman J S 1989 Monitoring the AIDS epidemic in the US: A network approach. Science 244: 1186–9
Mitchell J C 1969 The Concept and Use of Social Networks. Manchester University Press, Manchester, UK
Orth-Gomer K, Johnson J 1987 Social network interaction and mortality: A six year follow-up study of a random sample of the Swedish population. Journal of Chronic Diseases 40: 949–57
Penninx B W, van Tilburg T, Kriegsman D M, Deeg D J, Boeke A J, van Eijk J T 1997 Effects of social support and personal coping resources on mortality in older age: The Longitudinal Aging Study, Amsterdam. American Journal of Epidemiology 146: 510–9
Rook K S 1990 Social relationships as a source of companionship: Implications for older adults’ psychological well being. In: Sarason B R, Sarason I G, Pierce G R (eds.) Social Support: An Interactional View. Wiley, New York, pp. 221–50
Rook K S 1992 Detrimental aspects of social relationships: Taking stock of an emerging literature. In: Veiel H O, Baumann U (eds.) The Meaning and Measurement of Social Support. Hemisphere, New York, pp. 157–69
Sarason B R, Sarason I G, Pierce G (eds.) 1990 Social Support: An Interactional View. Wiley, New York
Schoenbach V, Kaplan B, Freedman L, Kleinbaum D 1986 Social ties and mortality in Evans County, Georgia. American Journal of Epidemiology 123: 577–91
Seeman T, Berkman L, Kohout F, LaCroix A, Glynn R, Blazer D 1993 Intercommunity variation in the association between social ties and mortality in the elderly: A comparative analysis of three communities. Annals of Epidemiology 3: 325–35
Sugisawa H, Liang J, Liu X 1994 Social networks, social support and mortality among older people in Japan. Journal of Gerontology 49: S3–S13
Vogt T M, Mullooly J P, Ernst D, Pope C R, Hollis J F 1992 Social networks as predictors of ischemic heart disease, cancer, stroke, and hypertension: Incidence, survival and mortality. Journal of Clinical Epidemiology 45: 659–66
Welin L, Tibblin G, Svardsudd K, Tibblin B, Ander-Peciva S, Larsson B et al. 1985 Prospective study of social influences on mortality: The study of men born in 1913 and 1923. Lancet 1: 915–8
L. F. Berkman
Social Intervention in Psychiatric Disorders
As early as the nineteenth century, psychiatrists suspected that harmful influences from the social environment might promote the appearance of mental illnesses or have a deleterious effect on their course. In that era, the isolation of the mentally ill in the ‘healthful’ milieu of institutions far removed from their original living environment was thought to be an appropriate means of shielding the patients from the disease-producing influences of their surroundings.
The objective of psychiatric reforms since the mid to late 1950s was to resettle chronically mentally ill persons from large custodial institutions to community settings. The idea behind this was to (re-)integrate patients in their normal living environment as rapidly as possible, because it was thought that their natural social network would make a major supporting contribution to the healing process (Rössler 2001).
This reform has changed the perception of mental illness. Enabling disabled persons to live a normal life in the community caused a shift away from a focus on an illness model toward a model of functional disability (Grove 1994). As such, other outcome measures aside from clinical conditions became relevant. Since then, social role functioning (including social relationships, work, and leisure), as well as quality of life and the family burden, has become of major interest for affected individuals living in the community as well as for the health professionals involved in the treatment and care of those persons. Most of the social intervention strategies as applied by health professionals in the community can be integrated within the concept of psychiatric rehabilitation (Rössler 2000).
In current scientific discourse, environmental and psychosocial factors are often relegated to the status of a single variable of merely peripheral interest, without further specification. Biological science has indeed made enormous progress toward an understanding of mental disorders, but it should not be forgotten that research in the social sciences has also yielded a considerable body of knowledge with respect to the psychosocial and environmental factors that influence them (Rössler 2001). The relevance of psychosocial and environmental problems is also reflected in the DSM-IV and ICD-10. Axis 4 of the DSM (American Psychiatric Association 1994) and codes Z 55–Z 65 and Z 73 of the ICD-10 (WHO 1993) are assigned for reporting psychosocial and environmental problems that may affect the diagnosis, treatment, and prognosis of mental disorders.
1. Target Population of Social Interventions
Although the majority of the chronically mentally ill have the diagnosis of schizophrenic disorders, other patient groups with psychotic and nonpsychotic disorders are targeted by social interventions. All patients suffering from severe mental illness (SMI) now require social interventions. The core group is drawn from patients with:
• persistent psychopathology,
• marked instability characterized by frequent relapses, and
• social maladaptation (Royal College of Psychiatrists 1996).
There are other definitions currently used to characterize the chronically mentally ill, but they all share some common elements, i.e., a diagnosis of mental illness, prolonged duration, and role incapacity. Furthermore, up to 50 percent of persons with SMI carry dual diagnoses, especially in combination with substance abuse (Cuffel 1996). Young adult chronic patients constitute an additional category that is diagnostically more complicated (Schwartz et al. 1983). These patients present complex pictures of symptomatology difficult to categorize within our diagnostic and classification systems. Many of them also have a history of suicide attempts. All in all, they represent a patient population that is mostly difficult to treat.
2. Models of Vulnerability
There are several vulnerability–stress models forming the theoretical basis into which biological and social intervention approaches of psychiatric treatment, rehabilitation, and care can be integrated. These models were developed mainly for persons with schizophrenic disorders, but they apply essentially to all individuals with SMI. In the vulnerability–stress model it is presumed that socioenvironmental stressors superimposed on an underlying and enduring biological vulnerability lead to abnormalities in central nervous system functioning (Zubin and Spring 1977). In the vulnerability–stress–competence model (Nuechterlein and Dawson 1984), individuals, families, and natural support systems can cope with stressors and have the competence to create a higher threshold for symptomatic illness. In an interactive–developmental model (Strauss et al. 1981), the persons affected do not remain in a passive role but actively interact with their environment. They modulate and select the kind of help they get from other people, they choose to comply with the treatment, and they seek or avoid social support and stressful situations.
The interactive–developmental model is the basis for the consumer movement in mental health care. People with mental disorders and their caregivers prefer to see themselves as consumers of mental health services with an active interest in learning about psychiatric disorders and in selecting the respective treatment approaches. Consumerism makes it possible to take up the affected person’s perspective and to consider seriously courses of action relevant to them (Kopelowicz and Liberman 1995).
3. Conceptual Framework of Social Interventions
The overall philosophy of social interventions in psychiatric disorders comprises two intervention strategies. The first strategy is individual-centered and aims at developing the patient’s skill in interacting with a stressful environment. There is no consensus among researchers as to what this individual-centered approach actually accomplishes. Some understand developing the patient’s skill as an approach to help disabled persons to compensate for impairments and to function optimally with the deficiencies they have. Others assume that different forms of individual-centered interventions help with recovery from the disorder itself, while the contributing factors to the healing process are not clear (Strauss 1986).
The second strategy is an ecological approach directed toward developing environmental resources to reduce potential stressors. This approach includes social interventions concerning housing and occupation as well as measures to build natural networks, as many chronically mentally ill and disabled persons lose close and stable relationships in the course of the disease. Most disabled persons need a combination of both approaches.
In contrast to acute treatment, there are mostly no legal powers to enforce social interventions in long-term care. Thus, the sufferer’s autonomy concerning treatment decisions should be respected. Within this frame, therapeutic alliance plays a crucially important role in engaging the affected persons in their own care planning. As such, the therapist should acknowledge that disagreement about the illness between patient and therapist is not always the result of the illness process (Eichenberger and Rössler 2000).
4. The International Classification of Impairments, Disabilities, and Handicaps
The long-term consequences of major mental disorders targeted by social interventions might be described on different levels. A useful framework is provided by the International Classification of Impairments, Disabilities, and Handicaps (WHO 1980). In addition to acute symptoms, chronically mentally ill patients, in particular individuals with schizophrenia, have impairments in the most basic cognitive processes concerning information input, memory, and abstraction. These cognitive impairments are present even when all symptoms have subsided, and they seem to be enduring markers of a vulnerability to the illness. Persistent and severe impairments lead to disability, i.e., affected persons will be limited in performing certain roles in their social and occupational environment. Finally, handicaps occur when disabilities place the individual at a disadvantage relative to others in society. On more specified levels concerning impairments, medication and cognitive training are provided. Effective pharmacotherapy and cognitive training stabilize affected persons so that they are more responsive to learning from their environment. Interventions concerning disability encompass mainly social skills training and family intervention. Furthermore, social interventions emphasize vocational training suitable to the extent of the handicap.

4.1 Reduction of Impairments
4.1.1 Medication. Even when taken regularly, medication is only able to reduce rather than abolish symptoms, and delay rather than prevent relapse. In addition to that, many of the traditional drugs have adverse effects and can weaken patients’ ability to perform their social roles or can impair their social and vocational functioning. As such, it is not surprising that noncompliance with medication is one of the most serious problems in the long-term care of persons with SMI.
Several treatment strategies have been developed for improving compliance with medication. For patients with cognitive deficits, the complexity of the medication regimen should be reduced (Gaebel 1997). Some patients need to be given the medication by a professional service on a regular basis. Depot medication (i.e., intramuscular injections that release the medication slowly into the bloodstream) might be
especially useful for patients with fluctuating motivation or insight. It is also useful to involve families and caregivers in increasing compliance with medication. Yet, many patients living in the community have to take responsibility for their medication themselves. Training in self-management of medication (Eckman et al. 1990) emphasizes the patient’s autonomy and increases acceptance of and responsibility for treatment. This also includes variation of medication without consultation within certain limits.
4.1.2 Cognitive training. With respect to cognitive deficits, most studies have yielded consistent results: remediation of specific deficits through repeated practice or related techniques is successful. But it is not clear whether and how the remediation of cognitive deficits improves functional outcome and community functioning. Addressing these limitations, Brenner et al. (1992) developed a comprehensive cognitive training program which they called ‘integrated psychological therapy’ (IPT). IPT comprises five stages: cognitive differentiation, social perception, communication skills, interpersonal problem solving, and social skills training. It is presented to groups of five to seven patients for 30 to 60 minutes two or three times a week. The hierarchically arranged program is aimed at facilitating acquisition and maintenance of more complex skills by remediating cognitive deficits. Findings indicate that IPT reduces psychotic disorganization and improves social, cognitive, and problem-solving skills (Penn and Mueser 1996).

4.2 Remediation of Disabilities
4.2.1 Skills training. In recent years, social skills training in psychiatric rehabilitation has become very popular and has been disseminated widely. The most prominent proponent of skills training is Robert Liberman, who has designed systematic and structured skills training since the mid-1970s.
Liberman and his colleagues packaged the training in the form of modules which focus on different topics such as medication management, symptom management, substance abuse management, basic conversational skills, interpersonal problem solving, friendship and intimacy, recreation and leisure, workplace fundamentals, community (re-)entry, and family involvement. Each module is composed of skill areas. The symptom management module, for example, encompasses skill areas such as identifying warning signs of relapse, developing a relapse prevention plan, coping with persistent symptoms, and avoiding alcohol and drugs. The skill areas are taught in exercises with demonstration videos, role play and problem-solving exercises, and in vivo and homework assignments (Liberman 1988).
The results of several controlled studies suggest that disabled individuals can be taught a wide range of social skills. Social and community functioning improve when the trained skills are relevant for the affected person’s daily life, and the environment perceives and reinforces the changed behavior. Unlike the effects of medication, the benefits of skills training emerge slowly. Furthermore, long-term training has to be provided for positive effects (Penn and Mueser 1996).
4.2.2 Family intervention. As a consequence of deinstitutionalization, the burden of care has fallen increasingly on the relatives of the mentally ill. It is estimated that up to 65 percent of mentally disabled individuals live with their families (Goldmann et al. 1981). This is a task many families do not choose voluntarily. Additionally, not all families are equally capable of giving full support to their disabled member, and may not be willing to replace an insufficient health care system (Johnson 1990). But families also represent support systems which provide natural settings for context-dependent learning important for recovery of functioning (Schooler 1995). As such there has been a growing interest in helping families since the beginnings of care reforms (Strachan 1986). One area of interest dealt with the expectations of relatives concerning the provision of care. Relatives quite often feel ignored, not taken seriously, and insufficiently informed by health professionals. They also may feel that their contribution to care is not appreciated, or that they will be blamed for the patient’s problems. It is certainly not surprising that there is a lot of frustration and resentment among relatives related to the extent of the physical, financial, and emotional family burden.
As important research has demonstrated a relationship between a negative affective family atmosphere and an unfavorable course of a mental disorder, the need for helping families and patients with family intervention programs became clear. All family intervention programs include an educational aspect covering basic information on the etiology, treatment, and prognosis of schizophrenia. Although family interventions differ in the treatment focus, all family interventions are organized around the principal goal of providing family members with more information about the disorder and strategies for managing common problems. There are more common features: family interventions are concrete, practical, and focus on everyday coping. Family intervention programs have produced the most promising results. Family intervention is effective for lowering relapse rate and improving outcome (e.g., social functioning). Possibly, family intervention can reduce family burden. Furthermore, the treatment gains are fairly stable.
However, it is not clear what the effective components of the different models are. Additionally, family interventions differ in frequency and length of treatment. There are also no criteria for the minimum amount of treatment necessary. Finally, it should be noted that most of the family interventions were developed in the context of Western societies during deinstitutionalization. Family caregiving might be quite different in a different cultural context. This refers to other cultures in total as well as to minority groups in Western societies (Guarnaccia 1998).
4.2.3 Reduction of handicaps by vocational rehabilitation. The beneficial effects of work for mental health have been known for centuries (Harding et al. 1987). Vocational rehabilitation therefore has been a core element of psychiatric reforms since their beginnings. Vocational rehabilitation is based on the assumption that work not only improves activity, social contacts, etc., but may also promote gains in related areas such as self-esteem and quality of life, as work and employment are a step away from dependency toward integration into society. Enhanced self-esteem in turn improves adherence to long-term care of individuals with impaired insight (McElroy 1987). Vocational rehabilitation originated in psychiatric institutions, where the lack of activity and stimulation led to apathy and withdrawal among their inpatients. Long before the introduction of medication, occupational and work therapy contributed to sustainable improvements in long-stay inpatients. Today occupational and work therapy are no longer hospital-based, but represent the starting point for a wide variety of rehabilitative techniques teaching vocational skills (Royal College of Psychiatrists/College of Occupational Therapists 1992). Vocational rehabilitation programs in the community provide a series of graded steps to promote job (re-)entry.
For less disabled persons, brief and focused techniques are used to teach them how to find a job, fill out applications, and conduct employment interviews (Jacobs et al. 1988). In transitional employment a temporary work environment is provided to teach vocational skills that should enable the affected person to move on to competitive employment. But too often the gap between transitional and competitive employment is so large that the mentally disabled individuals remain in a temporary work environment. Sheltered workshops providing prevocational training also quite often prove to be a dead end for disabled persons. Today, the most promising vocational rehabilitation model is supported employment. In this model, disabled persons are placed in competitive employment as soon as possible and receive all the support needed to maintain their position. Skills are taught in
the actual environment, tailored to the job and the individual. If necessary, additional advice is given to the employer. As the course of SMI is unpredictable and job requirements often change, the support provided is neither time-limited nor delivered on a fixed schedule (Wallace 1998). Participation in supported employment programs is followed by an increase in the ability to find and keep employment (Baronet and Gerber 1998). But most of the research does not allow evaluation of the impact of vocational rehabilitation programs on nonvocational domains. Although findings regarding supported employment are encouraging, many critical issues remain unanswered. Most individuals in supported employment obtain unskilled part-time jobs. Therefore, the high drop-out rate (up to 40%) is not surprising. Since most studies only evaluated short (12–18 months) follow-up periods, the long-term impact remains unclear. At the time of writing, it is not known which individuals benefit from supported employment and which do not (Mueser et al. 1998). It must be acknowledged that the integration into the labor market by no means depends only on the ability of the persons affected to fulfil a work role and on the provision of sophisticated vocational training and support techniques, but also on the willingness of society to integrate its most disabled members. One result of the difficulties of integrating mentally disabled individuals into the common labor market has been the steady growth of cooperatives which operate commercially with disabled and nondisabled staff working together on equal terms and sharing in management. The mental health professionals work in the background providing support and expertise (Grove 1994). Cooperatives serve as a bridge between sheltered workshops and competitive jobs on the common labor market.
4.2.4 Developing environmental resources. Effective community psychiatry requires individualized and specialized treatment which has to be embedded in a comprehensive and coordinated system of community services. But even when a variety of services is available, they are poorly linked in many cases, and costly duplication may occur. While developing community support systems it became obvious that there was a need to coordinate and integrate the services provided as each professional involved concentrates on different aspects of the same patient. Therefore, as a key coordinating and integrating mechanism, the concept of case management (CM) originated. CM focuses on all aspects of the physical and social environment. The core elements of CM are the assessment of patient needs, the development of comprehensive service plans for the patients, and arrangement of service delivery (Rössler et al. 1992).
Since the 1980s, a range of different models of CM have been developed which go beyond the original idea that CM serves mainly to link the patient to needed services and to coordinate those services. Today most clinical case managers also provide direct services in the patient’s natural environment. This model is called intensive case management (ICM). ICM is difficult to distinguish from assertive community treatment (ACT). The basic components of ACT were developed by Stein and Test in the 1970s (Stein and Test 1980). The original program was designed as a community-based alternative to hospital treatment for persons with severe mental illnesses. A comprehensive range of treatment, rehabilitation, and support services in the community is provided through a multidisciplinary team. ACT is characterized by an assertive outreach approach, i.e., interventions are provided mainly in the natural environment of the disabled individuals (Scott and Dixon 1995). Research on CM and ACT indicates that these models can reduce time in hospital, but have only small or moderate effects on improving symptomatology and social functioning (Mueser et al. 1998).
5. Conclusions
The goal of social intervention embedded in the broader concept of psychiatric rehabilitation is to help disabled individuals to establish the emotional, social, and intellectual skills needed to live, learn, and work in the community with the least amount of professional support (Anthony 1979). The refinement of all these approaches as described has achieved a level where they should be made readily available for every person with SMI. But we have to be aware that it is a long way to translate research into practice. In 1992, the National Institute of Mental Health, for example, funded the ‘schizophrenia patient outcome research team’ (PORT) to develop and disseminate recommendations for the treatment of schizophrenia based on existing scientific evidence. The recommendations address in essence all interventions as presented above (Lehman and Steinwachs 1998). Lehman and colleagues assessed the patterns of usual care for schizophrenic patients and examined the conformance rate with the PORT treatment recommendations. The conformance rate was modest, generally below 50 percent. As such, it is obvious that current care practice has to be improved substantially in the light of the outcome research available (Lehman et al. 1995). Serious mental illness can devastate the lives of people who suffer from it. These persons have a right to be treated according to current knowledge. But mental illnesses are associated with a significant amount of stigma, resulting in social isolation and discrimination in housing, education, and employment opportunities. Mentally ill people are even disadvantaged in receiving appropriate treatment and care, so stigma and discrimination further increase the burden on affected persons and their relatives. Yet, social interventions aim at reintegrating the mentally ill into the community and improving their quality of life. For these reasons a responsive and open-minded environment is indispensable. Therefore, translating research into practice always encompasses combating stigma and discrimination of mental illnesses as well (World Psychiatric Association 1999).

See also: Health Interventions: Community-based; Mental Health: Community Interventions; Mental Illness, Epidemiology of; Mental Illness: Family and Patient Organizations; Public Psychiatry and Public Mental Health Systems; Vulnerability and Perceived Susceptibility, Psychology of
Bibliography
American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders (DSM IV), 4th edn. American Psychiatric Association, Washington, DC
Anthony C W 1979 The Principles of Psychiatric Rehabilitation. University Park Press, Baltimore
Baronet A M, Gerber G 1998 Psychiatric rehabilitation: Efficacy of four models. Clinical Psychology Review 18: 189–228
Brenner H D, Hodel B, Genner R, Roder V, Corrigan P W 1992 Biological and cognitive vulnerability factors in schizophrenia: Implications for treatment. British Journal of Psychiatry 161(suppl. 18): 154–63
Cuffel B 1996 Comorbid substances use disorder: Prevalence, patterns of use, and course. In: Drake R E, Mueser K T (eds.) Dual Diagnosis of Major Mental Illness and Substance Abuse: Recent Research and Clinical Implications. Jossey-Bass, San Francisco, Vol. 2, pp. 93–105
Eckman T, Liberman R, Phipps C, Blair K 1990 Teaching medication management skills to schizophrenic patients. Journal of Clinical Psychopharmacology 10: 33–8
Eichenberger A, Rössler W 2000 Comparison of self-ratings and therapist ratings of outpatients’ psychosocial status. Journal of Nervous and Mental Disease 188: 297–300
Gaebel W 1997 Towards the improvement of compliance: The significance of psychoeducation and new antipsychotic drugs. International Clinical Psychopharmacology 12(suppl. 1): 37–42
Goldmann H H, Gattozzi A A, Taube C A 1981 Defining and counting the chronically mentally ill. Hospital and Community Psychiatry 32: 21–7
Grove B 1994 Reform of mental health care in Europe. Progress and change in the last decade. British Journal of Psychiatry 165: 431–3
Guarnaccia P J 1998 Multicultural experiences of family caregiving: A study of African American, European American, and Hispanic American families. New Directions for Mental Health Services 77: 45–61
Harding C, Strauss J, Hafez H, Lieberman P 1987 Work and mental illness: I. Toward an integration of the rehabilitation process. Journal of Nervous and Mental Disease 175: 317–26
Jacobs H E, Kardashian S, Kreinbring R K, Ponder R, Simpson A 1988 A skills-oriented model for facilitating employment in psychiatrically disabled persons. Rehabilitation Counseling Bulletin 27: 87–96
Johnson D 1990 The family’s experience of living with mental illness. In: Lefley H, Johnson D (eds.) Families as Allies in Treatment of the Mentally Ill: New Directions for Mental Health Professionals. American Psychiatric Press, Washington, DC, pp. 31–63
Kopelowicz A, Liberman R P 1995 Biobehavioral treatment and rehabilitation of schizophrenia. Harvard Review of Psychiatry 3: 55–64
Lehman A F, Carpenter W T J, Goldman H H, Steinwachs D M 1995 Treatment outcomes in schizophrenia: Implications for practice, policy, and research. Schizophrenia Bulletin 21: 669–75
Lehman A F, Steinwachs D M 1998 Survey Co-Investigators of the PORT Project. At issue: Translating research into practice: The Schizophrenia Patient Outcomes Research Team (PORT) Treatment Recommendations; and Patterns of usual care for schizophrenia: Initial results from the Schizophrenia Patient Outcomes Research Team (PORT) Client Survey. Schizophrenia Bulletin 24: 1–10, 11–20
Liberman R (ed.) 1988 Psychiatric Rehabilitation of Chronic Mental Patients. American Psychiatric Press, Washington, DC
McElroy E 1987 Sources of distress among families of the hospitalized mentally ill. New Directions for Mental Health Services 34: 61–72
Mueser K T, Bond G R, Drake R E, Resnick S G 1998 Models of community care for severe mental illness: A review of research on case management. Schizophrenia Bulletin 24: 37–74
Nuechterlein K H, Dawson M E 1984 A heuristic vulnerability/stress model of schizophrenic episodes. Schizophrenia Bulletin 10(2): 300–12
Penn D L, Mueser K T 1996 Research update on the psychosocial treatment of schizophrenia. American Journal of Psychiatry 153: 607–17
Rössler W 2000 Rehabilitation techniques. In: Gelder M G, Lopez-Ibor J J (eds.) New Oxford Textbook of Psychiatry. Oxford University Press, Oxford, pp. 1141–95
Rössler W 2001 Schizophrenia: Psychosocial factors. In: Henn F, Sartorius N, Helmchen H, Lauter H (eds.) Contemporary Psychiatry, Vol. 3, Part 1. Springer, Berlin, pp. 121–8
Rössler W, Fätkenheuer B, Löffler W, Riecher-Rössler A 1992 Does case management reduce the rehospitalization rate? Acta Psychiatrica Scandinavica 86: 445–9
Royal College of Psychiatrists 1996 Psychiatric Rehabilitation, rev. edn. Gaskell, London
Royal College of Psychiatrists/College of Occupational Therapists 1992 Occupational Therapy and Mental Disorders. A Consensus Statement by the Royal College of Psychiatrists and the College of Occupational Therapists. Royal College of Psychiatrists, London
Schooler N 1995 Integration of family and drug treatment strategies in the treatment of schizophrenia: A selective review. International Clinical Psychopharmacology 10(suppl. 3): 73–80
Schwartz S, Goldfinger S, Ratener M, Cutler D 1983 The young adult patient and the care system: Fragmentation prototypes. New Directions for Mental Health Services 19: 23–36
Scott J E, Dixon L B 1995 Assertive community treatment and case management for schizophrenia. Schizophrenia Bulletin 21: 657–68
Stein L I, Test M A 1980 Alternative to mental hospital treatment: I. Conceptual model, treatment program, and clinical evaluation. Archives of General Psychiatry 37: 392–7
Strachan A 1986 Family intervention for the rehabilitation of schizophrenia: Toward protection and coping. Schizophrenia Bulletin 4: 678–98
Strauss J S 1986 What does rehabilitation accomplish? Schizophrenia Bulletin 12: 720–3
Strauss J, Bartko J, Carpenter W T Jr 1981 New directions in diagnosis: The longitudinal processes of schizophrenia. American Journal of Psychiatry 138: 954–8
Wallace C J 1998 Social skills training in psychiatric rehabilitation: Recent findings. International Review of Psychiatry 19: 9–19
World Health Organization (WHO) 1980 International Classification of Impairments, Disabilities and Handicaps. WHO, Geneva
World Health Organization (WHO) 1993 International Statistical Classification of Diseases and Related Health Problems: Tenth Revision. WHO, Geneva
World Psychiatric Association 1999 The WPA Global Programme Against Stigma and Discrimination Because of Schizophrenia. WPA, Geneva
Zubin J, Spring B 1977 Vulnerability: A new view of schizophrenia. Journal of Abnormal Psychology 86: 103–26
W. Rössler
Social Justice
Social justice is a subcategory of the wider term justice and concerns justice as it refers to the distribution of valued goods and necessary burdens. Such distributions can take place at all levels of societal aggregation: at the micro level of face-to-face interaction, at the meso level of intermediate institutions and organizations, and at the macro level of a society’s basic structure. The ways in which these various levels provide (or deny) access to social positions, in which they regulate the allocation of rights and duties as well as of scarce goods and necessary burdens, and in which they interact through an encompassing framework of institutionalized rules and procedures, form the subject of social justice.
1. The Levels of Social Justice
1.1 The Macro Level
The macro level is itself subdivided into several layers. The highest layer is that of the constitutional order of a society: its polity determining the (permitted) forms of government and the ground rules, as well as scope, of ‘normal’ politics; its economic regime, its social and legal systems, and the integration of its major institutions into one large scheme. The next lower layer is that of concrete politics. Here, different social groups compete for influence in the designation of particular
policies concerning all aspects of social life which can become a matter of political and legal regulation under the chosen constitutional frame. And finally, there are the policies themselves, e.g., particular fiscal, economic, educational, welfare, etc. policies. The initial design of such policies can have far-reaching implications for future policies because once a particular policy structure or pattern is established in a given field, it tends to shape the public’s expectations towards it and to lead to a certain degree of inertia, making radical changes difficult and limiting the options available to future policy makers. In many ways, these policies also affect what happens at the meso level.

1.2 The Meso Level
This is the level of organizations. When considering organizations from the point of view of social justice, it is useful to distinguish between types of organizations according to the societal sector or social subsystem into which they fall, e.g., business organizations belonging to the economic system, political parties belonging to the political system, hospitals belonging to the health system, schools belonging to the educational system, etc., because the various subsystems follow different logics of operation, subjecting their organizations to separate ‘local’ rationalities which in turn affect the ways in which these organizations (and their representatives) conceive of and deal with ‘local’ justice problems. Local justice problems concern the allocation of various in-kind (and mostly non- or not fully marketable) goods such as university places, jobs, public housing, and scarce medical resources; in short, the rules for inclusion into or exclusion from organizations and for the distribution of some of their main ‘products’ and services. The problems are typically dealt with and resolved by the respective organizations themselves.
However, both the degree of local autonomy and of local scarcity are subject to the influence of macro level decisions. Thus, the state can mandate particular principles that must be observed in their handling (e.g., antidiscrimination laws) or it can refrain from doing so; it can issue rules for some such problems but not for all of them, and these rules can be more or less determinate. As for the degree of scarcity, on the other hand, the state’s influence is particularly evident in the health sphere. The lower the budget assigned to healthcare as a whole (macro allocation level), the greater the need for rationing in the various branches of the health system (micro allocation level).

1.3 The Micro Level
This level is not to be confused with the above micro allocation level referring to problems arising within organizations. Instead it concerns a separate societal
level, namely that of interactions between individuals in relatively unorganized social contexts. Families, and the ways in which they allocate the burdens facing them in their everyday life, are an example. It is obvious that at least theoretically this level of decision making leaves the widest scope for individual choice. However, in reality this scope is multiply constrained by cultural traditions framing the perception of appropriate choices, even though the traditions themselves may undergo constant changes. Moreover, higher-level decisions can have a significant impact on this micro level as well. Thus, if a welfare state exists which provides generous old age pensions, then this will obviously relieve the younger generation’s (immediate) duties toward their parents. Likewise, if such a welfare state implicitly favors a pattern of lifelong, uninterrupted full-time employment, then, given a cultural background in which it is ‘normal’ for women to bear the bulk of child-rearing duties coupled with a tendency for work organizations to allocate better paid jobs to men, this will weaken women’s bargaining position in the negotiation of the distribution of household chores.
2. Theories of Social Justice
Most theories of social justice focus on the macro level. Some, like the famous Theory of Justice by John Rawls (1971), view this level, or more precisely, its highest layer, as the primary, if not only subject of a theory of social justice (see also Barry 1995). Other conceptions, such as Michael Walzer’s (1983), include a much wider range of issues. Many theories of the first type deliberately restrict themselves to the formulation of principles for a society’s basic structure. Once these principles are met, they are willing to leave the ‘rest’ to the political process. If the rules by which political decisions are made are fair, then, according to these theories, the outcomes must be accepted as legitimate by everyone because they have been brought about in a manner against which no one can reasonably object. Hence, the obligation to comply with policies and norms even if one objects to their content. The nature of these theories is thus predominantly procedural (e.g., Habermas 1983, Röhl and Machura 1997), although some, like the Rawlsian theory, also formulate one or more substantive principles. There are also macro level theories which go beyond the basic structure and address issues of normal politics as well, e.g., welfare policies relating to public education, medical care, or unemployment benefits, as in the case of David Miller (1976). The second group of theories is much broader in scope, covering the whole range from micro to macro issues. More recently, there has been a development towards more specialization on particular types of justice problems arising within single subsystems of
society. An example of this is the proliferating work in so-called applied ethics, especially in the fields of medical ethics and bioethics. Some of these theories also cover issues of distributive justice, with special attention devoted to problems of medical triage. Another recent development is the aim to cut across the various social subsystems and to cover, in a systematic way, the whole range of micro allocation problems facing intermediate institutions and organizations. This is the local justice approach developed by Jon Elster (1992). And finally, there is a growing body of feminist literature (e.g., Okin 1989). Much of this work starts from relative disadvantages that women encounter at the micro and meso levels and looks at the ways in which macro level institutions or policies account for them. One pervasive problem in the theoretical literature on social justice is a widespread lack of awareness of differences in the subject matters of the available conceptions. The result is that the participants in the pertinent debates often talk past each other. They will, for instance, argue that a principle suggested by a particular theory is unsuitable for a particular problem, but ignore that the principle was never meant to regulate problems of that sort because its suggested scope is much more limited than the critics impute. Or they will argue that a theory fails to give due weight to a particular principle, but fail to realize that the principle in question may be appropriate for some situations, but not necessarily for the ones forming the subject of the criticized theory. A favorite target of this sort of criticism is John Rawls’ theory. A second problem is that much of the theorizing in the philosophical literature lacks a proper understanding of the structure and functioning conditions of modern society.
But political theories (as they often conceive themselves) that do not ‘know’ the society whose operations they seek to guide (see Luhmann 1980 for this criticism) are unlikely to be politically very instructive. The social theory underlying most normative theories of social justice is often rather simplistic, variously modeled after the Greek polis, a medieval village community, the market place or a handful of stranded islanders having to regulate their common affairs. The complexity of modern society is mostly ignored by this kind of theorizing, although its principles are proposed for the operations of real societies, not fictitious ones. A third problem is that the available theories of justice are anything but clear about the meaning of social justice. Some associate it with a particular understanding of equality, others equate it with some notion of fairness, and still others use the term in yet another way. There is a plethora of theories promoting different principles of justice, some favoring desert (or equity) as the first principle, others rights, and still others needs, but even those choosing the same principle may interpret it very differently. For example, how is one to determine, measure and remunerate a person’s deserts (and in which contexts are they relevant)? No consensus exists on any of these questions in the pertinent literature. Moreover, there are not only various single-principle conceptions, but also proposals for pluralist theories, some covering all three of the above principles, others allowing for as many principles as there are ‘spheres’ of justice. In the latter case, the meaning ascribed to social justice becomes largely a function of the problem at hand. Taken together, these (and further) difficulties leave the impression of a great deal of messiness and ambiguity in the (mostly normative) theorizing of social justice.
3. Empirical Work on Social Justice
The situation is not much different in the social science literature on social justice. Here, the starting point is often some ideal of equality drawn from the Enlightenment tradition and forming a standard against which the performance of a society’s institutions is judged (e.g., Kluegel and Smith 1986). The ideal is rarely openly expressed or explicitly defended, but nevertheless drives ongoing attempts at laying bare inequalities in all areas of life and the social mechanisms underlying and reproducing them, with the implicit message being that the contradiction between ideal and reality calls for social reform. One problem with this literature is its hidden normativity, its unquestioned support for a certain notion of equality whose rightness is presumed (and presupposed) rather than argued for. Another problem is that it possesses no measure for admissible degrees of inequality or for prescribed degrees of equality. This problem is discussed with great vigor and sophistication in philosophy (see, e.g., Dworkin 1981a, 1981b) and (welfare) economics (e.g., Sen 1992). The advances made in this literature, however, have so far remained both unheeded and unmatched in the social scientific work on inequality. In a related way, this problem also comes up in a second body of social science literature on social justice, that on judgments of justice. Its primary subject is that of income (in)justice and (in)equality (see, e.g., Alves and Rossi 1978, Kluegel et al. 1995). To simplify somewhat, a recurrent finding across societies and social categories is that most people tend to favor some degree of inequality over absolute equality (although they disagree about the proper extent and about the fields in which inequalities may prevail or should exist). They also tend to underrate the factually existing income (and hence, wealth) differences.
Still, they would mostly prefer a narrowing of the differences that they believe to exist, even though most people are roughly satisfied with their own earnings. Furthermore, there is a widespread belief that people’s earnings roughly reflect their economic desert and that this is also what they should
reflect. At the same time, market economies are generally preferred over planned economies, and their outcomes are largely taken to be desert-based. However, market allocations are anything but that; they are, as has been noted repeatedly, in principle unprincipled (Hirsch 1977). And this is by no means the only inconsistency; the one impression that inevitably emerges from the literature on people’s judgments is that these judgments are multiply confused. So what is one to make of them? A third type of empirical work on social justice is Elster’s local justice approach. This work is concerned less with judgments of justice than with the ways in which justice problems arising in organizations are actually resolved and with the weight that justice considerations are accorded in such situations (see Elster 1995, Schmidt and Hartmann 1997). The aim is to supplement the normative literature with factual knowledge about the fields into which it seeks to intervene, the underlying premise being that normative theories need empirical foundations in order to render them applicable to a social reality which is much more complex than philosophical theories usually allow for. Finally, in a series of articles Raymond Boudon (1997, 1999; see also Boudon and Betton 1999) has suggested a rationalist and cognitivist conception of justice and morality more generally, according to which there are objective normative truths just as there are objective scientific truths, the basis for accepting them being in both cases their grounding in strong reasons. This conception, which is directed primarily against relativistic and culturalistic positions, seems very promising but also raises a number of methodological questions: how are we to determine, by sociological means, what these truths are, how can we ‘find’ and accurately identify them, and what are the techniques needed to separate them from those normative statements and proposals which only claim to be true? 
Given the novelty and as yet mainly programmatic nature of Boudon’s approach, it seems too early for a final assessment of its merits.
4. Desiderata
The concept of social justice is used very differently in the available literature, so if one is not already committed to a particular definition, it is hard to distil a common core out of its many usages. To some extent this has to do with the term’s indiscriminate application to almost any kind of distribution problem without consideration of the different quality of problems arising at different levels and in different subsystems of society. A first desideratum is therefore that these differences be adequately acknowledged in attempts at theorizing about the concept: social justice in the design of a society’s basic structure may not mean the same as social justice in the determination of wage levels by firms or in the distribution of household
duties. To the extent that there are overlaps between the term’s meanings and links between single justice problems and/or levels of societal aggregation, they must be carefully examined and extracted. But if the term social is to have any serious meaning at all, then it is also important to develop a clear sense of the types of problems it is to be applied to, of their differences, their relative interdependence and independence. Or, to put it differently, what is needed is greater clarity about the locus of regulation of different categories of problems. Some such problems need to be regulated at the macro level, others are better left to lower-level instances. And it is quite likely that even a well-ordered basic structure will leave many problems of social justice unresolved and hence cannot prevent all injustices in the spheres for which it provides no determinate answers. If this is the case, then it is all the more important that the theoretical discussion be more focused and less diffuse. Second, there is a need for more cross-disciplinary exchange, allowing the empirical and normative approaches to enrich each other reciprocally (see Miller 1999, Schmidt 1994, 1999). The philosophical literature needs an improved understanding of the real world because elegance is no guarantee of relevance; in fact, it frequently is an impediment because simplification can easily result in simplistic ideas about real world problems and their suitable regulation. The social science literature, on the other hand, needs more terminological, methodological and conceptual sophistication, and a better understanding of the categorical and epistemological differences between ‘is’ and ‘ought’ questions, in order to find its proper place in the analysis of normativity. A familiarization with the pertinent philosophical literature can be of great benefit in this regard.
See also: Deprivation: Relative; Institutions; Legitimacy; Legitimacy, Sociology of; Moral Reasoning in Psychology; Moral Sentiments in Society; Norms; Values, Sociology of
Bibliography
Alves W, Rossi P 1978 Who should get what? Fairness judgments of the distribution of earnings. American Journal of Sociology 84: 541–64
Barry B 1995 Justice as Impartiality. Clarendon Press, Oxford, UK
Boudon R 1997 The moral sense. International Sociology 12: 5–24
Boudon R 1999 Multiculturalism and value relativism. In: Honneger C, Hradil S, Traxler F (eds.) Grenzenlose Gesellschaft? Leske and Budrich, Opladen, Germany, pp. 97–118
Boudon R, Betton E 1999 Explaining the feelings of justice. Ethical Theory and Moral Practice 2: 365–98
Dworkin R 1981a What is equality? Part 1: Equality of welfare. Philosophy and Public Affairs 10: 185–246
Dworkin R 1981b What is equality? Part 2: Equality of resources. Philosophy and Public Affairs 10: 283–345
Elster J 1992 Local Justice. How Institutions Allocate Scarce Goods and Necessary Burdens. Russell Sage Foundation, New York
Elster J (ed.) 1995 Local Justice in America. Russell Sage Foundation, New York
Habermas J 1983 Moralbewußtsein und kommunikatives Handeln. Suhrkamp, Frankfurt, Germany
Hirsch F 1977 The Social Limits to Growth. Routledge & Kegan Paul, London
Kluegel J R, Smith E R 1986 Beliefs About Inequality. Americans’ Beliefs of What Is and What Ought to Be. Aldine de Gruyter, New York
Kluegel J R, Mason B, Wegener B (eds.) 1995 Social Justice and Political Change: Public Opinion in Capitalist and Postcommunist States. Aldine de Gruyter, New York
Luhmann N 1980 Gesellschaftsstruktur und Semantik. Studien zur Wissenssoziologie der modernen Gesellschaft. Suhrkamp, Frankfurt, Germany
Miller D 1976 Social Justice. Clarendon Press, Oxford, UK
Miller D 1999 Principles of Social Justice. Harvard University Press, Cambridge, MA
Okin S 1989 Justice, Gender and the Family. Basic Books, New York
Rawls J 1971 A Theory of Justice. Harvard University Press, Cambridge, MA
Röhl K F, Machura S (eds.) 1997 Procedural Justice. Ashgate, Aldershot, UK
Schmidt V H 1994 Bounded Justice. Social Science Information 33: 305–33
Schmidt V H (ed.) 1999 Justice in Philosophy and Social Science. Special issue of Ethical Theory and Moral Practice. Kluwer, Dordrecht, The Netherlands
Schmidt V H, Hartmann B K 1997 Lokale Gerechtigkeit in Deutschland. Studien zur Verteilung knapper Bildungs-, Arbeits- und Gesundheitsgüter. Westdeutscher Verlag, Opladen, Germany
Sen A 1992 Inequality Re-examined. Russell Sage Foundation, New York
Walzer M 1983 Spheres of Justice. A Defense of Pluralism and Equality. Basic Books, New York
V. H. Schmidt
Social Learning, Cognition, and Personality Development
Social learning theory views the course of human development on the basis of children’s socialization experiences and acquisition of self-regulation. Children’s development of personality characteristics, such as dependency and aggression, as well as their skill in academics, sports, arts, or professions, are assumed to emerge from learning experiences embedded within the social milieu of their family, peers, gender, and culture. This article describes a social learning perspective, the intellectual context from which it emerged, changes in researchers’ emphasis over time, current emphases, methodological issues related to the construct, and directions for future research.
1. Defining a Social Learning Perspective
Social learning theory defines children’s socialization in terms of specific social learning experiences, such as modeling, tuition, and reinforcement, and the cognitions, emotions, and behavior that emerge from these formative experiences. Children’s personal reactions were explained historically using personality constructs, such as identification, conditioning, or drive reduction, but are now explained in cognitive terms, such as personal agency beliefs and self-regulatory processes. Social learning experiences are responsible for children’s development of self-regulatory competence to adapt to changing settings, including those associated with their age or level of physical development. Self-regulation is essential to children’s development because socialization involves giving up immediately pleasurable activities or familiar methods of coping to achieve delayed benefits.
2. Intellectual Context for Social Learning Research
Social learning research grew initially out of a seminar offered at Yale University during the late 1930s by Clark Hull. Among the members of that historic class of students were John Dollard, Neal Miller, Robert Sears, and John Whiting. The goal of the seminar was to provide learning explanations for key aspects of children’s personality development that had been discussed by Freud, such as dependency, aggression, identification, conscience formation, and defense mechanisms. Of particular interest to class members was the psychoanalytic notion of children’s identification with powerful models, such as parents or other adults, which Freud used to explain a wide range of personality characteristics, such as dependency, aggression, and even psychosexual development. Identification was singled out by Miller and Dollard (1941) in a series of experimental studies of imitative learning. They found that a general tendency to imitate could be learned through reinforcement and that rewards could be structured to teach counter-imitation (i.e., the opposite choice of a model) as well as imitation. They concluded that imitation was a learned class of instrumental behavior but was not a learning process in its own right. Sears studied the impact of parents and their relationship with their children on the youngsters’ personality development, especially socialization processes that could explain how children internalize the values, attitudes, and behavior of the culture in which they were raised. According to Sears’ identification theory, social learning experiences lead children to internalize self-sustaining drives to emulate their parents and other adult figures. Sears was particularly interested in children’s development of self-control over aggressive behavior. Whiting conducted cross-cultural studies of differences in
children’s socialization to determine whether the severity of these experiences could lead to cultural differences in anxiety, guilt, and the formation of conscience.
3. Changes in Emphasis Over Time
Albert Bandura’s first book with Richard Walters on adolescent aggression tested Sears’ identification theory as a basis for internalization of controls over children’s aggressive behavior. This research ultimately led these researchers to reject the psychoanalytic theoretical basis for social learning in a second book (Bandura and Walters 1963). They disproved the Freudian hypothesis that modeled violence would reduce observers’ aggressive drives through identification and catharsis by demonstrating that such modeling actually taught or facilitated aggression, especially when modeling was accompanied by vicarious rewards. In other research, Bandura and his colleagues showed that through modeling and vicarious reward experiences children could learn new responses without actually performing them or receiving rewards. This disproved earlier instrumental conditioning accounts of modeling and imitation, and led Bandura to distinguish the cognitive effects of learning due to modeling from the motivational effects of rewards on imitative performance. To explain these two aspects of imitative learning, Bandura distinguished four subprocesses: two that affect vicarious learning (attention and retention) and two that affect imitative performance (production and motivation). To learn a new skill socially, a learner must attend to and encode a model’s actions, and to perform a skill imitatively, a learner must be motivated and motorically able to produce the response. Bandura used the label observational learning to describe these distinctive influences of modeling without reference to Freudian identification, Hullian acquired drives, or instrumental conditioning. During the late 1960s and early 1970s, social learning researchers began to study vicarious learning of abstract properties of a model’s performance, such as grammar or conceptual classes underlying spoken language.
They discovered that children could acquire a variety of abstract rules with minimal mimicry of a model’s specific behaviors, such as imitating a model’s style of questioning but not the model’s exact words (Rosenthal et al. 1970). These demonstrations of abstract modeling on children’s concepts, rules, and reasoning attracted adherents looking for alternatives to stage views of children’s development. Many learning researchers, including those interested in social learning processes, questioned the validity of claims of an invariant sequence of qualitative stages that are resistant to direct training and are transituational in their impact. Instead they sought to demonstrate that children could be taught advanced
stage Piagetian concepts, Chomskian grammatical rules, and Kohlbergian or Piagetian moral judgments and that task and situational variations could cause cross-stage variations in these measures. Mary Harris, Robert Liebert, Ted Rosenthal, James Sherman, Grover Whitehurst, and Barry Zimmerman, among many others, used abstract modeling techniques to teach advanced stage functioning among children of a variety of ages, and the results were chronicled in the book on social learning and cognition (Rosenthal and Zimmerman 1978). Although age-related shifts in children’s functioning were of interest to social learning researchers, they sought to explain these outcomes more on the basis of changes in social learning experience, hierarchies of goals and knowledge, and level of motoric competence rather than simply cognitive growth and development. From a social learning perspective, the shifts in children’s personal functioning that numerous stage theories place at the approximate ages of 2, 7, and 12 years are better predicted by social learning experiences and accomplishments, such as the acquisition of speech and mobility, entrance into school, and the onset of puberty. When considering later lifespan developmental issues, social learning researchers have focused on adult life-path experiences associated with employment, marriage, child rearing, and physical declines in functioning rather than cognitive stage indices. During the late 1970s, social learning researchers began to shift their interest from children’s initial acquisition of important personal skills via modeling to the other end of the developmental spectrum: the point at which they can regulate these skills on their own. To explain children’s development and use of self-regulation, Bandura (1977) depicted social learning experiences in triadic terms involving reciprocal causal relations between personal (cognitive-affective), behavioral, and environmental sources of influence (see Fig. 1).
Figure 1 Reciprocal causation among triadic classes of social learning determinants
Each of the three components of his theory influences the other components bidirectionally. Cognitive events, although central to learning
and performance, were constrained by specific features of social environmental contexts and by specific performance outcomes. The close bidirectional linkage of cognitive-affective processes to performance outcomes and situational contexts enabled Bandura’s triadic model to explain dynamic features of children’s development, such as variations in children’s aggression across settings, as well as enduring individual differences. Walter Mischel also used a triadic social learning formulation to explain children’s development of self-regulatory capability to delay gratification. He found that both modeling experiences and cognitive encoding enhanced children’s willingness to defer a small reward for a later larger one. Michael Mahoney and Carl Thoresen studied adults’ self-control of a range of clinical problems triadically in terms of covert personal forms of self-control as well as environmental and behavioral forms. Bandura (1977) undertook a series of clinical training studies using highly effective (or mastery) models to improve the self-control of patients plagued with disabling fears, such as snake and dog phobias. These modeling studies were designed to enable patients to overcome their phobia by showing the effectiveness of specific handling techniques with the feared animal. Bandura discovered that although these patients were confident about the effectiveness of a model’s techniques to handle the feared animal, they were not confident about their own ability to learn and use those techniques, and this cognitive and affective deficit greatly affected their willingness to change. Bandura defined these personal perceptions of capability to learn or perform as self-efficacy beliefs.
4. Emphases in Current Theory and Research
Increasing evidence of the importance of cognitive aspects of self-regulation, especially self-efficacy beliefs, led Bandura to rename his approach as social cognitive theory (Bandura 1986). His book inspired numerous researchers, such as Gail Hackett, Robert Lent, Alan Marlatt, Ann O’Leary, Frank Pajares, Dale Schunk, Ralf Schwarzer, Anita Woolfolk Hoy, Robert Wood, and Barry Zimmerman, among others, to begin studying the role of self-efficacy and self-regulatory processes in a wide variety of areas, such as academic functioning, physical health and medicine, sports, business and management, mental health, and the arts. A social cognitive model was used very extensively to explain how students acquire the sense of self-efficacy and the self-regulatory skill to assume responsibility for their academic learning (Schunk and Zimmerman 1994). Much of this research focused on how modeling experiences could lead school-aged children to acquire self-regulatory control of their academic performance and pursue desired careers. During the 1990s, Bandura undertook a series of studies of the psychosocial influences of families, peers,
and schools on children’s development of self-efficacy to self-regulate their academic and social attainments. In research with Claudio Barbarelli, Gian Vittorio Caprara, Manuel Martinez-Pons, Concetta Pastorelli, and Barry Zimmerman, Bandura found that self-efficacy beliefs and goals of parents for their children significantly affected the youngsters’ self-efficacy beliefs, aspirations, level of depression, and adherence to moral codes of conduct. Children’s self-efficacy was assessed in a variety of areas of functioning, such as perceived social efficacy and efficacy to manage peer pressure for detrimental conduct, both of which were found to contribute to the youngsters’ academic attainments. In 1997, Bandura published a 20-year review of research on self-efficacy involving hundreds of studies that spanned such tasks as mental and physical health, sport, academics, and business and management (Bandura 1997). Self-efficacy proved to be highly predictive of such motivational measures as free choice, persistence, and effort, of measures of self-regulation, and of social and academic learning outcomes.
5. Methodological Issues
In order to capture the triadic interdependence of person-related processes, such as children’s self-efficacy beliefs, with their behavior and environmental setting, Bandura has advocated situationally specific forms of assessment and microanalyses of cognitive and behavioral functioning. By assessing children’s, peers’, and adults’ functioning in this specific manner before, during, and after efforts to perform behaviorally, researchers can test causal relations among social and self-regulatory processes. There is a body of evidence indicating that task-specific measures, such as self-efficacy judgments, are more predictive of actual performance efforts and outcomes than global trait measures of self-belief (Bandura 1997). The greater predictive power of task-specific measures was also reported by Mischel (1968) when he reviewed the literature on a wide range of personality traits, such as friendliness or conscientiousness. He found higher correlations when situationally specific measures were employed than when trait measures of personal processes were employed, and he concluded that context-free measures of personal belief or competence were unable to explain variations in behavioral functioning across contexts.
6. Future Directions for Research and Theory
To explain better the sequential cognitive and behavioral aspects of self-regulation, Zimmerman (1998) developed a three-phase, cyclical self-regulation model. Forethought phase processes anticipate efforts to learn and include self-motivational beliefs such as
self-efficacy. Performance phase processes seek to optimize learning efforts and include the use of imagery, self-verbalization, and self-observation. Self-reflection processes follow efforts to learn and provide understanding of the personal implications of outcomes. They include self-judgment processes such as attributions and self-reaction processes such as self-satisfaction, defensiveness, or adaptation. These self-reactions are hypothesized to influence forethought processes cyclically regarding future efforts to perform. It is also expected that as children develop skill in an area of functioning, they will show stronger cyclical links among these self-regulatory processes. Schunk and Zimmerman (1997) identified four successive social learning levels associated with children’s development of a particular skill, such as writing or golf. A first level, observational learning, occurs when a learner induces the major features of a skill or strategy (for complex skills) from watching a model learn or perform, whereas a second level, emulative learning, is attained when a learner’s motoric performance approximates the general strategic form of a model’s. Emulative accuracy can be improved further when a model assumes a teaching role and provides guidance, feedback, and social reinforcement during performance efforts. A third level, self-controlled learning, is attained when a learner masters the use of a skill or strategy in a structured setting outside the presence of a model by relying on a previously abstracted standard from a model’s performance. A fourth level, self-regulated learning, is attained when learners can systematically adapt their performance to changing personal and contextual conditions without reference to a model. It is not assumed that this is a stage model; thus, learners do not have to advance through the four levels in sequence, nor is it assumed that the highest level of self-regulation will be used continuously without regard to motivation or context.
Rather, it is hypothesized that students who master skills or subskills (in the case of complex tasks) according to this social learning sequence will display greater motivation and achievement than students using self-discovery methods. This multilevel formulation integrates the findings of modeling research with those of self-regulation research within a sequential model of children’s socialization.
See also: Family Bargaining; Infancy and Childhood: Emotional Development; Personality and Marriage; Self-regulation in Childhood; Social Cognition in Childhood; Social Competence: Childhood and Adolescence; Socialization and Education: Theoretical Perspectives
Bibliography
Bandura A 1977 Social Learning Theory. Prentice-Hall, Englewood Cliffs, NJ
Bandura A 1986 Social Foundations of Thought and Action: a Social Cognitive Theory. Prentice-Hall, Englewood Cliffs, NJ
Bandura A 1997 Self-efficacy: the Exercise of Control. Freeman, New York
Bandura A, Walters R H 1963 Social Learning and Personality Development. Holt, Rinehart and Winston, New York
Miller N, Dollard J 1941 Social Learning and Imitation. Yale University Press, New Haven, CT
Mischel W 1968 Personality and Assessment. Wiley, New York
Rosenthal T L, Zimmerman B J 1978 Social Learning and Cognition. Academic Press, New York
Rosenthal T L, Zimmerman B J, Durning K 1970 Observationally induced changes in children’s interrogative classes. Journal of Personality and Social Psychology 16: 631–88
Schunk D H, Zimmerman B J (eds.) 1994 Self-regulation of Learning and Performance: Issues and Educational Applications. Erlbaum, Hillsdale, NJ
Schunk D H, Zimmerman B J 1997 Social origins of self-regulatory competence. Educational Psychologist 32: 195–208
Zimmerman B J 1998 Developing self-fulfilling cycles of academic regulation: an analysis of exemplary instructional models. In: Schunk D H, Zimmerman B J (eds.) Self-regulated Learning: from Teaching to Self-reflective Practice. Guilford Press, New York, pp. 1–19
B. J. Zimmerman
Social Mobility, History of
1. General Questions
Social mobility was one of the important themes of social history during its beginnings in the 1960s and the 1970s. Historians explore social mobility for two reasons. First, they want to study the general question of the history of equality of social opportunities. The question whether modern societies enlarged or reduced chances of social ascent for men as well as for women was attractive for historians. The rising openness or reinforced exclusiveness of modern elites, the rising or declining opportunities of upward social mobility for men and women from the lower classes, immigrants, poor families, and ethnic groups, the broadening or reduced access to channels of social ascent such as education, careers in business or public bureaucracies, and politics, and the role of family networks were central topics. Second, historians discuss the history of social mobility in a comparative view. European and American historians explore the myth that opportunities for social ascent in America were unique compared to Europe. Historians also investigated the myth of unblocked social mobility in communist countries before 1989/91 compared to Western Europe. Social mobility was a historical theme to which major contributions came not only from historians, but also from sociologists and political scientists.
2. Definitions and Sources
In studying the history of social mobility, historians usually investigate only the social mobility of individuals. They usually do not explore under the heading of social mobility the grading up or grading down of entire social groups or entire social classes. Moreover, the study of social mobility primarily focuses upon mobility in social structures and hierarchies rather than upon the geographical mobility of individuals as the term might suggest. It also mostly concentrates upon occupational mobility. When investigating career mobility (or intra-generational mobility), historians trace the mobility of individuals between different occupational positions or their persistence in the same occupation throughout their lives. When they investigate mobility between generations (or intergenerational mobility), they compare the occupation of the father, the mother or the ancestors at specific points in their lives with the occupations of a historical individual. Occupation is usually seen as the crucial indicator of the situation of an individual in a historical society. To be sure, historians are fully aware of the fact that the activity of an individual in history, more often than today, might comprise a variety of occupations at the same point in time or might include professions that were still in the making. Historians of social mobility work to link various historical sources and to trace individual persons through marriage license files, census materials, tax files, last wills, records of churches and public administrations, and autobiographical sources. The competence of historians in linking various sources has grown distinctly. During recent years, the study of social mobility has become highly sophisticated. Individual careers are more frequently explored in micro studies in as much detail as possible. These studies investigate a few richly documented individual cases rather than all members of a local society.
They often use autobiographical materials difficult to analyze quantitatively. This type of micro study rarely concentrates only on social mobility, but covers a large variety of aspects. At the same time, the international and interregional comparative study of the history of social mobility has become somewhat more frequent, using the rich results of about 30 years of historical research in this field.
3. Three Main Debates
A first debate on social mobility by historians covers the increase of social mobility from 1800 to 2000. In this debate the rise of social mobility has various meanings. It sometimes means a more meritocratic recruitment, especially for the few most prestigious, most powerful, and best paid positions. It sometimes means more mobility between occupations, upward mobility as well as downward mobility, and job mobility in the same occupation as well as mobility between occupations in the same social class. Sometimes, the increase of social mobility includes the chances of both genders and of minorities. Sometimes it means a clear increase of the opportunities of the lower classes compared to the opportunities of the upper and middle classes and not just more social mobility for everybody. One has to make sure which meaning is used by individual authors. The advocates of the rise of social mobility usually think of a general increase of mobile people, but often also of a rising number of upwardly mobile persons since industrialization. They argue that various major social changes should have led to more social mobility and to more social ascent: the general decline of the fertility rate during the late nineteenth and early twentieth centuries made it possible for parents not only to invest more in individual help and education of their children, but also to promote their own professional careers. The rapid expansion of secondary and higher education, especially since the end of the nineteenth century, enlarged the chances for better training enormously. The rapid increase of geographical mobility since the second half of the nineteenth century led to a widening of the labor market and to a greater variety of new chances. The fundamental changes of the active population from the predominance of agrarian work up to the nineteenth and early twentieth centuries to the predominance of service work, especially since the 1970s, generated substantial social mobility between occupations. The distinct increase of the sheer number of occupations in all modern societies since the Industrial Revolution also must have led to more social mobility.
The general change of mentalities, the weakening of the emotional identification with specific professions, social milieus, and specific local milieus, and the rising readiness for job mobility and for lifelong training further enlarged the number of socially mobile persons. The rise of the welfare state, the mitigation of individual life crises, and the guarantee of individual social security clearly improved the chances for further training and for the purposeful use of occupational chances. Deliberate government policies of enhancing educational and occupational opportunities for the lower classes, for women, ethnic and religious minorities, and immigrants should also have had an impact on social mobility. In sum, a substantial list of factors in favor of an increase of social mobility from 1800 to 2000 can be put forward from this side of the debate. The advocates of stability or even the decline in social mobility are a heterogeneous group. Their arguments stem from very different ideas of social developments. It is sometimes argued that nineteenth- and early twentieth-century industrialization not only led to a rising number and a fundamental change of occupations, but also to a class society in which the major
social classes, the middle class, the lower middle class, the working class, the peasants, and in some societies also the aristocracy tended to reinforce the demarcation lines to other social classes and, hence, to reduce rather than to enlarge the number of mobile persons. Other advocates of the skeptical view argue that the fundamental upheaval of modern societies during industrialization led to a unique rise of social mobility, of upward as well as downward mobility, and that modern societies thereafter became more closed as the generation of pioneers in business ended, as most occupational careers became more formalized and more dependent on formal education, as modern bureaucracies emerged, and as mentalities adapted to the modern, highly regulated job markets. Other advocates argue for the stability of social mobility rates in a different and much narrower sense. They argue that the long-term change of social mobility from the Industrial Revolution until the present was mostly structural, that is, it depended almost exclusively upon the redefinition of the active population rather than on the reduction of social, cultural, and political barriers. In this view social mobility remained stable if one abstracts from the changes simply induced by alterations in occupational structure (as peasants, for example, became workers, a real change, but not necessarily a case of upward mobility). Still other advocates of the long-term stability of social mobility posit a stable inequality of educational and occupational chances of lower classes, women, and minorities in comparison with the educational and occupational chances of the middle and upper class, the male population, or the ethnic majority, respectively. Since the beginning of quantitative studies of social mobility after World War II, this long debate has led to a large number of historical studies of social mobility and to a wide range of results. To sum up briefly, three main results ought to be mentioned.
First, only in very rare cases could a clear decline of social mobility rates be found. Most studies show either stable or increasing rates of social mobility, depending upon the type of community and country and the generation and the period under investigation. There is no overwhelming overall evidence either for the stability or for an increase of social mobility rates. Second, changes of overall social mobility rates do in fact depend to a large degree on changes in occupational and educational structure. So one can say that modern societies became more mobile to a large degree because education expanded so much and because occupational change became so frequent and normal. Finally, there is much evidence that the educational and social mobility of the lower classes and women did not increase to the detriment of the educational and occupational chances of the middle and upper classes and men. Except for the Eastern European countries in some specific periods, social mobility was usually not a zero-sum game.
A second debate covers the more advanced social mobility in the United States. This old debate dates at least from the early nineteenth century, when the French social scientist Alexis de Tocqueville argued that American society offered more opportunities of social ascent than did European society. Various pieces of evidence were presented in favor of the American lead. Some empirical studies by sociologists demonstrated that in some crucial aspects a clear American lead could be shown. This was especially true for mobility into the professions. Higher education was more extensive and offered more chances than in Europe. Hence, the social ascent from the lower classes into the professions that are based on higher education was clearly more frequent than in Europe. In addition, comparative historical studies on late nineteenth- and early twentieth-century American and European cities showed that in a special sense a modest American lead existed during that period: unskilled workers in fact moved up into white-collar positions in American cities somewhat more frequently than in European cities. Finally, historians demonstrated that the important difference between American and European societies could be found in the American creed that America offered more opportunities than closed European societies. Other recent comparative studies tend to argue that a general lead of American society no longer exists in comparison with Europe. According to this view industrialization and modernization led to about the same opportunities everywhere. Various international comparisons of social mobility rates support this view and demonstrate that the overall rates were not distinctly higher in the US than in old societies such as Western Europe or Japan. As these comparisons almost exclusively deal with the second half of the twentieth century, the results might be due to the fundamental social changes in Europe and Japan since World War II.
The lead of communist countries in social opportunities especially in the USSR during the 1920s and 1930s and in the Eastern European communist countries in the late 1940s and 1950s is the topic of a less intensive, but important debate. Some historical studies of social mobility demonstrate that during these periods rates of upward social mobility into the higher ranks of the social hierarchy were substantial compared to Western European societies. This was true partly because of the rapid expansion of higher education, the communist abolition of the middle class, and the rapid change of the employment structure due to rapid industrialization. However, the rise of social opportunities in communist countries, if it rose at all, was largely limited to the period of the system upheaval. Most comparative studies of the 1970s and 1980s show that rates of social mobility were not distinctly higher in Eastern Europe compared to Western Europe. This was partly because the communist political and administrative elite became
exclusive and gentrified, because social change slowed down, and because the expansion of higher education was reduced in several communist countries. Gender contrasts in social mobility have been explored in only a small number of studies by historians. A debate among historians has not yet started. But gender contrasts undoubtedly will add new important aspects to the general debate about long-term trends in social mobility. Four conclusions can be expected from the few studies. First, in a more radical sense than in the study of male mobility, female mobility raises the question whether social mobility should in fact be centered around occupational mobility or whether other factors such as marriage and unpaid or partially paid work in emerging professions should be taken into account much more than they have been so far. Second, the question of a rise of downward mobility during the transition to modern society, i.e., during the rise of female activity outside the family sphere, is to be explored. A study of female social mobility in twentieth-century Berlin demonstrates a high rate of intergenerational déclassement of active women during the early parts of the twentieth century. Third, the study of the social mobility of women demonstrates much more clearly than that of males the effects of economic crises and fundamental transitions on social mobility. Opportunities for women seem to have depended strongly on economic prosperity and on long-term social stability. In periods of economic crisis and rapid transitions such as the upheaval of 1989/91, women more than men belonged to the losers. Here again the study of female mobility might draw the attention of historians to a more general aspect of mobility that has not been investigated enough. Finally, the social mobility of women also demonstrates that definite changes of social opportunities can only be reached in a long-term perspective.
It was shown that even when important channels of upward social mobility such as education offered equal chances to women, they did not lead to a parallel improvement of occupational chances for women. Besides the study of institutions and policies, the historical study of the experience of social mobility and the perception of social mobility will become crucial.
4. The Future Historical Research on Social Mobility
During 1980–2000 social mobility was much less frequently investigated by historians. The declining interest in social mobility has partly to do with the rising research standards that are more and more difficult to match by projects limited in time and financial resources. Moreover, the initial questions asked in this field lost their former attraction because it became widely accepted that social mobility rates were about the same in most societies and did not rise distinctly during industrialization and modernization in the nineteenth and twentieth centuries. Finally, the general thematic trends in historical research made social mobility look unlike a modern theme. The future of the study of social mobility is that of a normal theme among many others in history rather than a top theme in an expanding branch of history as in the 1960s and 1970s. One can hope for four sorts of studies of neglected aspects of social mobility. A first future aspect is the historical study of social mobility in Eastern Europe, but also outside Europe, and thereafter international comparisons reaching beyond Western Europe and the US. The question of the particularities of social mobility in Europe might then be answered differently. A second future aspect of social mobility is gender contrasts of social mobility. We need a well-reflected number of case studies of contrasting countries, different female activities, and contrasting general conditions such as prosperity and economic depression, peace and war periods, stability and transitions. A third future theme is specific factors of social mobility such as religion, types of family, immigration, unemployment, and poverty, and more generally the impact of background social milieu. It seems that historians will explore these contexts of social mobility often by looking at a limited number of individuals and that they will also include the subjective experience of social mobility. The fourth future topic is to be the history of the debate on social mobility, as an aspect of the history of identities, e.g., the European or American identity. The debate among historians was inspired by this debate, and this topic will help us to understand better why historians became interested in this theme in the 1960s.
See also: Bourgeoisie/Middle Classes, History of; Class Consciousness; Class: Social; Elites: Sociological Aspects; Mobility: Social; Political Elites: Recruitment and Careers; Social Stratification; Working Classes, History of.
H. Kaelble
Social Movements and Gender
Social movements can be defined as collectivities acting outside institutionalized channels to promote or resist change in an institution, society, or the world order. Although there is a diversity of theoretical approaches to the study of social movements, until recently mainstream theory and research in the fields of social movements and political sociology have ignored the influence of gender on social protest. Beginning in the 1990s, scholars trained in the gender tradition that flourished in the social sciences during the 1980s turned their attention to questions pertaining to the gender dynamics of a variety of historical and contemporary movements. As a result, a thriving body of research demonstrates that gender hierarchy is so persistent that, even in movements that do not evoke the language of gender conflict or explicitly embrace gender change, the mobilization, leadership patterns, strategies, ideologies, and outcomes of social movements are influenced by gender and gender differences. This article delineates the way gender relations and hierarchy influence various social movements, including feminist, antifeminist, men’s, labor, civil rights,
gay and lesbian, nationalist, right-wing, and fascist movements. It ends by discussing the broader implications of integrating gender into mainstream theories of social movements.
1. Linking Gender and Social Movement Theories
Notions of femininity and masculinity, the gender division of labor in private and public spheres, and the numerous institutions, practices, and structures that reinforce male dominance are not expressions of natural differences between women and men. Rather, the gender order is constructed socially and gender is a pervasive feature of social life. Although gender scholars fairly universally adopt social constructionist perspectives to understand gender relations, there is considerable variation in the epistemologies, contexts, and levels of analyses of the various theories that have developed to explain gender hierarchy. The most sophisticated conceptualizations treat gender as an element of social relationships that, like race and class, operates at multiple levels. Gender categorizes and distinguishes the sexes and ranks men above women of the same race and class (Lorber 1994). Multilevel frameworks seek to avoid universal generalizations that are impervious to the historicity and contextualized nature of perceived differences between the sexes by conceptualizing gender as operating at three levels. Multilevel approaches, first, view gender as organizing social life at the interactional level through the processes of socialization and sex categorization that encourage gender-appropriate identities and behavior in everyday interactions. Second, multilevel theories emphasize the complex ways gender operates at the structural level, so that gender distinctions and hierarchy serve as a basis for socioeconomic arrangements, the division of labor in families, the organization of sexual expression and emotions, work and organizational hierarchies, and state policies.
Finally, multilevel theories attend to the cultural level, specifically to the way gender distinctions and hierarchy are expressed in ideology and cultural practices, such as sexuality and language, as well as in the symbolic codes and practices of institutions such as art, religion, medicine, law, and the mass media. Recent writings on social movements treat the social construction of gender as taking place at these three respective levels. The significance of gender is more apparent for some movements than for others. Its relevance is obvious in the case of women’s movements targeting discriminatory policies and practices pertaining to employment, family, reproduction, and sexuality. Nevertheless, gender is also constructed in movements that do not explicitly evoke the language of gender conflict. One example is McAdam’s (1992) landmark study of the US 1964 Freedom Summer campaign, when white college students spent the
summer in the South participating in the US civil rights movement. The traditional sexual double standard meant that women were less likely to be recruited to the Mississippi project than men because of the taboos associated with interracial sexual relationships, especially those between Black men and white women. Those women who managed to make it to Mississippi were more likely to be slotted into traditional female roles—specifically, clerical work and teaching rather than canvassing and voter registration. Furthermore, the biographical consequences of participation differed for men and women. Men were more likely to remain activists over the course of their lives than women, and women who participated in Freedom Summer were less likely to marry than men. To provide another example, masculinity is integral to nationalist politics in the contemporary world (Nagel 1998). Gender hierarchy is expressed in forms such as the construction of patriotic manhood and exalted motherhood as icons of nationalist ideology. The relevance of gender can also be seen in the domination of masculine interests in the ideology of nationalist movements and sexualized militarism that simultaneously constructs the male enemy as over-sexed and under-sexed (as rapists and wimps) and the female enemy as promiscuous (sluts and whores). A systematic incorporation of gender into an understanding of social movements requires that we examine the way gender processes are embedded in the factors that trigger and sustain social movements. The most influential theoretical perspectives on social movements combine the insights of classical collective behavior theory, resource mobilization theory, and new social movements theory (McAdam et al. 1996). 
They emphasize three factors that explain the emergence and development of any movement: shifting political and social opportunities and constraints; the forms of organization used by groups seeking to mobilize; and variations in the ways challenging groups frame and interpret their grievances and identities. Let us examine the way gender comes into play in each of these three broad sets of conditions that are relevant to the emergence and operation of social movements.
1.1 Political and Cultural Opportunities and the Gender Regime of Institutions
One of the major premises of social movement theory is that changes in the larger political and cultural context can stimulate or discourage people from participating in collective action. Shifts in gender differentiation and gender stratification in the particular sociohistorical context have been important facilitating conditions for the emergence and operation of feminist movements. In modern welfare states, public policy with respect to pregnancy and motherhood historically has been formulated either by treating women and men the same with gender-neutral policies or by recognizing women’s special needs as child bearers and mothers and providing female-specific benefits. In the US, during the early twentieth century, government support of maternity often took the form of ‘protective legislation’ that regulated women’s hours and working conditions and prohibited women from working in occupations thought to be dangerous. While nineteenth-century feminists, by and large, supported special female legislation and other policies consistent with women’s primary identity as mothers, by the late 1960s the feminist campaign to formulate public policy switched to the sameness side of the sameness/difference divide, reflected in campaigns for policies such as the Equal Rights Amendment and the Family and Medical Leave Act. Feminist movements in the US historically have targeted gender inequality in a wide range of institutions, including medicine, religion, education, politics and the law, the military, and the media. For instance, feminists criticized the medical system’s bias against women in medical research, the labeling and classification of medical conditions, and doctor/patient relationships. The entry of larger numbers of women into the medical field beginning in the 1970s and 1980s facilitated the growth and expansion of feminist women’s health centers and women’s self-help movements focused on conditions unique to women, such as rape, battering, breast cancer, childbirth and pregnancy, and postpartum illness.
1.2 Gendered Mobilizing Structures
Expanding political and cultural opportunities create possibilities for collective action, but the emergence of a movement depends upon whether aggrieved groups are able to develop the organization and solidarity necessary to get a challenge underway. Gender differences and relations figure heavily into the pre-existing mobilizing structures or nonmovement ties that exist among individuals prior to mobilization and the organizational form a movement takes once it is underway. Gender, racial, class, ethnic, and other inequalities are often embedded in the informal networks, clubs, and formal organizations that provide the incentives for people to take the risks associated with participating in protest groups. In her study of African American women’s participation in the US civil rights movement, Robnett (1996) identifies a distinct form of grassroots behind-the-scenes leadership carried out by women who were prevented from occupying formal leadership positions by the exclusionary practices of the Black church. Gender divisions can also facilitate mobilization. For example, sex-segregated women’s colleges, clubs, associations, and informal networks historically have been fertile
grounds for the growth of feminism. Blee’s (1991) research on women in the 1920s Ku Klux Klan provides further evidence of the importance of sex-segregated networks. She reveals how women drew upon family ties and traditions of church suppers and family reunions to mobilize. She describes the way Klan women deployed women’s gossip in the form of ‘poison squads’ to spread a message of racial and religious hatred. The connective structures that link leaders and followers and different parts of a movement, allowing them to persist, can also be gendered. Some scholars have argued that there is no particular structure essential to a feminist organization. Others, however, find that some feminists’ beliefs in fundamental differences between female and male values have led to feminist organizations with distinctive emotion cultures that dictate open displays of emotion, empathy, and attention to participants’ biographies. By contrast, the large-scale all-male meetings of the Promise Keepers, a mostly white, middle-class, and heterosexual men’s rights movement that arose in the late 1990s to reassert essentialist views of women’s and men’s roles in the family, were steeped in an aura of powerful masculinity, holding meetings in stadiums associated with organized sports events (Messner 1997). Gender can also be a subtext in the organizational logic, beliefs, and practices of movements that do not make explicit gender claims. Einwohner (1999) found that prevailing ideas about gender and gender differences are a driving force behind the largely female composition, emotion-laden strategies, and negative reactions of the public to the contemporary animal rights movement.
The well known decentralized grassroots Alinsky model that influenced early social welfare activism since the early 1970s, according to Stall and Stoecker (1998), is organized along an essentially masculine logic that views community organizing as a means to power, in contrast with ‘the women-centered model,’ which treats community building as a goal of organizing. Brown (1992) implicates gendered organization in the decline of the Black Panther Party. She recounts the way the male power rituals of the Party—specifically its military structure, use of violence and aggression, and promotion of sexual rivalries—not only excluded women from the Black power struggle but planted the seeds of its destruction. As this last example illustrates, gender inequality often persists within social movement organizations as Fonow (1998) reports in her research on women’s participation in the 1985 Wheeling–Pittsburgh steel strike. In this case, major strike activities—picket duty and meal preparation—reflected a gender division of labor. While working kitchen duty helped mobilize the support of the larger community by bringing together the wives and families of workers and nonworkers, the exclusion of women steelworkers from the picket lines—seen as an expression of ‘virile unionism’—
undermined solidarity between men and women steelworkers.
1.3 Gender and Framing Processes
Social movements often appropriate gender ideology to legitimate and inspire collective action and to identify a challenging group’s commonalities. Contemporary scholars of social movements rely upon two concepts for analyzing the unique frames of understanding that people use to define their shared grievances: collective action frames and collective identities. To the extent that gender operates as a constitutive element of social interaction and relationships, it is not surprising that images of masculinity and femininity appear in the language and ideas social movement activists use to frame movement demands and injustices. Gender dualism, for example, was woven tightly into German National Socialism. Not only was masculinity glorified and male bonding considered the foundation of the Nazi state, but gender-linked oppositions served as the metaphor for the polarization of Jews as weak and emotional and Aryans as strong and rational. The language of gender difference and power is pervasive in contemporary women’s self-help movements and serves as a major framework for understanding the problems that trouble women. In the postpartum self-help movement, gender plays a pivotal role in mobilizing participants. Activists linked postpartum illness to women’s responsibilities for parenting, to ideas that mothers (not fathers) have to be self-sacrificing, and to the gender bias of the largely male medical establishment (Taylor 1999). Social movement scholars use the concept of collective identity to research the question of how challenging groups define and make sense of the question ‘who we are’ and the complex and changing process collective actors use to draw the circles that separate ‘us from them.’ Gender difference in virtually every culture is key to the way actors identify themselves as persons, organize social relations, and symbolize meaningful social events and processes.
That means that the identities that social movements of all kinds create and recreate are bound to be molded, at least in part, by gender meanings and practices. For example, Taylor and Whittier (1992) argue that in the lesbian feminist movement of the 1970s and 1980s, the drawing of boundaries between male and female promoted the kind of oppositional consciousness necessary for organizing one’s life around feminism. Similarly, some branches of the contemporary breast cancer movement frame women’s commonalities in gender terms by connecting the experiences of breast cancer survivors to the societal emphasis on women’s breasts as crucial to women’s attractiveness to men. At the same time, such activists point out, the medical establishment has provided less funding and attention to breast cancer than to diseases that affect men more than women
(Klawiter 1999). If gender symbolism plays an important role in the socially constructed solidarities that mobilize collective action, the development of gender solidarity is never automatic, even in women’s movements. Analyzing African American women’s autobiographies written before 1970, Brush (1999) shows that the messages that came from the civil rights movement encouraged them to interpret their grievances in terms of race rather than gender, while the women’s movement made gender seem irrelevant to Black women’s experiences. The tendency to see the generic ‘Black’ as male and the generic ‘woman’ as white shows that using gendered meanings to construct a group’s commonalities can exclude some people, since deciding who ‘we’ are also entails deciding who ‘we’ are not (Collins 1990). Gender as a set of cultural beliefs and practices figures heavily into many of the identity movements that have swept the political landscape in recent years. In attempting to privilege white racial identity, the contemporary white supremacist movement not only presents a range of racial enemies but deploys gender metaphors—e.g., treating the Black male body as dangerous and animal-like and the white female body as always at risk and in need of protection—to appeal to the growing number of disillusioned white males who feel threatened by what they perceive to be the successes of social movements for gender and racial equality (Ferber 1998). Finally, gender symbolism and ideas about masculinity and femininity figure heavily in the modern gay and lesbian movement, which challenges the notion that women are feminine and men masculine and, further, that erotic attraction must be between two people of different sexes and genders (Gamson 1997).
2. The Significance of Integrating the Study of Social Movements and Gender
Feminist scholars interested in the study of social movements recently have called attention to the gendering of social movement processes and social movement theory. The role of gender in the emergence and dynamics of social movements, even those seemingly not about gender, has been obscured through the gender-neutral discourse that characterizes prevailing theories of social movements. Treating gender as an analytic category illuminates a range of new questions pertaining to the political opportunities, mobilizing structures, and framing processes of movements. A synthesis of the theoretical literatures on gender and social movements also has the potential to advance the field of gender. Theories of gender emphasize the maintenance and reproduction of gender inequality and neglect countervailing processes of resistance, challenge, conflict, and change. The literature on social movements spotlights the role of human agency in the making and remaking of femininity, masculinity, and the existing gender system.
See also: Collective Behavior, Sociology of; Feminist Movements; Gender and Feminist Studies in Sociology; Social Movements, History of: General; Social Protests, History of
Bibliography
Blee K M 1991 Women of the Klan: Racism and Gender in the 1920s. University of California Press, Berkeley, CA
Brown E 1992 A Taste of Power: A Black Woman’s Story. Pantheon Books, New York
Brush P S 1999 The influence of social movements on articulations of race and gender in Black women’s autobiographies. Gender & Society 13: 120–37
Collins P H 1990 Black Feminist Thought. Unwin Hyman, Boston
Einwohner R L 1999 Gender, class, and social movement outcomes: Identity and effectiveness in two animal rights campaigns. Gender & Society 13: 56–76
Ferber A L 1998 White Man Falling: Race, Gender, and White Supremacy. Rowman and Littlefield, Lanham, MD
Fonow M M 1998 Protest engendered: The participation of women steelworkers in the Wheeling–Pittsburgh steel strike of 1985. Gender & Society 12: 710–28
Gamson J 1997 Messages of exclusion: Gender, movements, and symbolic boundaries. Gender & Society 11: 178–99
Klawiter M E 1999 Racing for the cure, walking women, and toxic touring: Mapping cultures of action within the Bay Area terrain of breast cancer. Social Problems 46: 104–26
Lorber J 1994 Paradoxes of Gender. Yale University Press, New Haven, CT
McAdam D 1992 Gender as a mediator of the activist experience: The case of Freedom Summer. American Journal of Sociology 97: 1211–40
McAdam D, McCarthy J D, Zald M N 1996 Comparative Perspectives on Social Movements: Political Opportunities, Mobilizing Structures, and Cultural Framings. Cambridge University Press, New York
Messner M A 1997 Politics of Masculinities: Men in Movements. Sage, Thousand Oaks, CA
Nagel J 1998 Masculinity and nationalism: Gender and sexuality in the making of nations. Journal of Ethnic Racial Studies 21: 242–69
Robnett B 1996 African–American women in the civil rights movement, 1954–1965: Gender, leadership, and micromobilization. American Journal of Sociology 101: 1661–93
Stall S, Stoecker R 1998 Community organizing or organizing community? Gender and the crafts of empowerment. Gender & Society 12: 729–56
Taylor V 1999 Gender and social movements: Gender processes in women’s self-help movements. Gender & Society 13: 8–33
Taylor V, Whittier N 1992 Collective identity and lesbian feminist mobilization. In: Morris A D, Mueller C M (eds.) Frontiers in Social Movement Theory. Yale University Press, New Haven, CT, pp. 104–29
V. Taylor
Social Movements: Environmental Movements
Since World War II environmental issues have increasingly entered Western political discourses. From the 1992 UN conference in Rio de Janeiro onwards, environmental issues appeared on the government agendas of the South and of reforming countries and were promoted by international programs under the label of sustainable development. This contribution examines the ways in which environmental issues were taken up by non-formal collective political action, often referred to as the ‘New Social Movements.’ It looks at the informal actors participating in the movements and how they organize themselves and identify with their demands. Furthermore, it focuses on their strategies for engaging with power, which have redrawn the lines of what is regarded as ‘political’ and ‘politics.’
1. Social Movements and New Social Movements: Theoretical Perspectives
Research on social movements, as the term is understood today, dates back to the 1930s when US sociologists, who were confronted with the turmoil of the emergence of fascism and communism in Europe, concentrated on ideological aspects of collective behavior, and perceived collective political action as a threat to the established order (Eyerman and Jamison 1991, p. 10). Herbert Blumer from the Chicago School was the first to outline the positive potential of the new forms of social interaction. In contrast to this social-psychological focus emphasizing the individual, Talcott Parsons took a structuralist-functionalist approach drawing on the classic work of Max Weber and Emile Durkheim. The combination of the two views resulted in a framework which has come to be known as the collective behavior approach, balancing the micro-oriented approach with the macro-oriented. This perspective dominated research on social movements until the 1960s (Eyerman and Jamison 1991, p. 11f.). However, the civil rights movements and the students’ demonstrations of the late 1960s led American sociologists to criticize the functionalist logic of the collective behavior theory, which devalued action as reactive behavior without strategic capability. US sociologists directed their attention instead toward the costs and benefits of taking part in social movements, grounding their concepts in actors as ‘movement ‘‘entrepreneurs’’’ who make rational choices, using the material and symbolic resources available. Critics have pointed out that this emphasis on resources normally held by influential social groups leads to neglect of the resistance mounted by marginalized actors. Moreover the exclusive focus on
rational choices does not take adequate account of emotions (Della Porta and Diani 1999, pp. 7–9). As a response to this criticism the field of political research has directed its attention toward interactions between political establishment insiders and outsiders, and toward the outsiders’ less conventional forms of action. This framework allowed social movements to be viewed as part of political debates and dialogs rather than as excluded from the political system. Despite their insights, however, analysts of political processes have failed to generate an explanation for the innovative nature of social movements and the new spaces created and occupied by contemporary movements of youth groups, of women, and of homosexual or minority ethnic groups (Della Porta and Diani 1999, p. 10). Della Porta and Diani draw upon concepts developed by European scholars, whose criticism of the deterministic Marxist models shed new light on the ‘New Social Movements’ in Europe. In particular, Alain Touraine’s so-called postindustrial approach has become influential. It posits that the ruling and popular classes remain major antagonists and forces of social change. The New Social Movements are characterized by their decentralized and participatory organizational structures as well as their critical attitudes towards ideology and solidarity structures. As Alberto Melucci states, material claims are, in contrast to the workers’ movement, no longer central to the movements’ agenda. Guarantees of security and state welfare are no longer demanded. Instead, the intrusion of the political-administrative system into daily life is something to be resisted (Della Porta and Diani 1999, p. 12f.).
Although the exchange between European and American scholars has intensified and research on social movements has been taken up by a variety of disciplines, Calderón et al. criticize the still predominant structuralist research paradigm, arguing that a structuralist view cannot provide sufficient insight into the social expression brought about by the movements (Calderón et al. 1992, p. 23). Referring to Melucci, David Slater (1997, p. 259) uses the metaphor of the ‘nomads of the present’ for the post-1989 movements, capturing the fluidity and territorial flexibility of today’s mobilization. Intensified cooperation between the disciplines and the theoretical schools has led to regarding manifestations of resistance and opposition as something increasingly independent from universalist discourses. Connected in terms of space, these new mobilizations stick together as ‘archipelagos of resistance’ characterized both by their global reach and their local embedding and specificities (Slater 1997, p. 259). This framework seems particularly helpful for understanding the emergence of social movements with an environmental focus. Before turning to environmental issues, however, it is necessary to consider the state’s changing role in confronting collective political action.
Thirty years ago collective action was strongly oriented toward the state and state politics. Today, collective practices create their own cultural identities and specific spaces for social expression, taking advantage of and contributing to the severe crisis faced by nation states worldwide (Calderón et al. 1992, p. 23). Globalization processes are undermining what used to be the territorial state’s significant spaces of action. Furthermore, the state is faced with the fragmentation of civil society: segments of society create their own spaces, often with a self-determined distance from the state which, in its modernist conception, has traditionally furthered integration by excluding certain social categories (Calderón et al. 1992, p. 25). Thus the state has lost its central role as the focus of attention for social movements, i.e., for collective action concerning ethnic or ecological questions. In general, both emerging forms of social and political participation as well as highly influential economic players are challenging the legitimacy of the state and its political representatives on environmental issues. Taking his analysis of resistance even further, Steven Pile (1997, p. 3) criticizes social movements research that operates exclusively with the dual model of ‘authority’ on the one hand and ‘resistance’ on the other. He perceives resistant political subjects as also influenced by driving forces ‘… such as desire and anger, capacity and ability, happiness and fear, dreaming and forgetting,’ which mobilize forces not simply based on the antagonism of established power and opposition. This analytical frame seems useful for understanding social movements organized around issues of identity and natural resources.
2. Characteristics of Social Movements with an Environmental Focus
The first of the New Social Movements started in 1968 with the student revolts in Western Europe. Independent of established political institutions, these new forms of social and political participation contrasted sharply with the traditional workers’ movements. Starting out as student riots supported by feminist activists and intellectuals, this movement soon grew, also becoming more critical of transformations occurring in industrial society. The New Social Movements questioned modernism and economic progress, focusing on the costs of social transformation. In the 1970s these issues were extended to environmental conflicts in Western Europe. Protest groups not only questioned the values of industrial society and the price of such things as economic progress, security, and sustainability, but also the political process, its parties, and politicians. Touraine identified such central conflicts in post-industrial society as the struggle for the control of symbolic production. In addition, the organizational structures developed by the New Social Movements are different from those
of traditional political parties: they are more decentralized and follow participatory organizational principles. They usually involve people who have not been interested in traditional politics or who have not been integrated into traditional political parties before. In general these movements want to experience new forms of democracy and focus on environmental and non-profit issues. Another characteristic feature of the New Social Movements, especially of those concerned with environmental issues, is the high involvement of women, minorities, young people, and people from the South. Seager (1996, p. 271) in her writing on ‘hysterical housewives’ points to the high percentage of women participating in environmentalist and conservationist organizations. Women around the world would, according to Seager, support more restrictive environmental laws and spend more money on ecological projects. However, they often face inflexible bureaucracies which adhere to fixed beliefs about economic growth and success. The potential of the politics of resistance lies in the interstices of self-determination, interpersonal and international solidarity as well as solidarity with future generations and the environment. According to Della Porta and Diani (1999, p. 14f.), the New Social Movements evince at least four characteristic features:
(a) Informal interaction networks. The movements can be regarded as informal interaction networks between a plurality of individuals, groups and/or organizations with loose and dispersed links. Such networks promote the circulation of essential resources for action (such as information or material resources), creating and preserving a system of meaning through adequate organizational resources. In many cases the networks intend to create a new model of democracy based on greater participation and decentralization than in traditional democracies.
(b) Shared beliefs and solidarity.
Another characteristic feature of the New Social Movements is the set of beliefs and the sense of belonging they share. Social movements help to find new approaches to existing environmental conflicts, to raise new public issues and to create visions. By contributing to a symbolic redefinition of reality and possible future perspectives they construct new collective identities and value systems. In many cases it is up to the leaders of movements to create appropriate ideological representations.
(c) Collective action focusing on conflicts. The actors of social and environmental movements are usually engaged in political and cultural conflict. Very often they set their value system against that of a powerful establishment. The actors involved in the movements share common interests, values, collective experiences, and a feeling of belonging to certain places and environments.
(d) Use of protest. The movements make use of
unusual patterns and deviating forms of political behavior and public protest, e.g., civil disobedience, collective action, and the mobilization of public opinion by spectacular activities (Della Porta and Diani 1999, p. 14f.).
3. New Actors and Alliances in the Political Struggle on Environmental Issues
According to Calderón et al. (1992, p. 24) the reason for the emergence of social movements lies in the tension between social demands and the inability of the existing political institutions to hear these demands, not to mention give in to them. The protagonists of traditional social movements were rather large social entities easily identified by one common social denominator, e.g., their identity as workers or peasants. Unlike the protagonists of earlier social movements, today’s activists are not held together by a political force, i.e., by an aggregation of interests and actions. Although the New Social Movements are led by seemingly unimportant social actors, i.e., people in the lower ranks of the social hierarchy, such as black people, mothers, indigenous or disabled people, they leave a considerable mark on public platforms (Calderón et al. 1992, p. 26). Their highly specific interests and demands enable them to challenge the established structures more effectively. As Steven Pile (1997, p. 1) describes it, they perform ‘acts of resistance [that] take place (…) in the spaces under the noses of the oppressor ….’ This framework requires a more complex understanding of power and how it is distributed. The simple model of authority in the upper rank and its victims among the lower ranks has to be dropped. In many of the New Social Movements feminist and ecological activists have joined together for collective action. There are several reasons for this (Rocheleau et al. 1996, p.
15ff.): declining ecological and economic conditions which activate women because they threaten the survival of families; the impact of structural adjustment policies and the ‘retreat of the state’ from support of public services, social welfare, and environmental regulations; consciousness raising and political awareness; the political marginality of most women; the expansion of the women’s movement to address converging issues of gender, race, class and culture. They focus on policy and environmental management issues, access to and distribution of resources under conditions of environmental decline, resource scarcity and political change, and environmental sustainability. The unequal ‘Geometries of Power’ (Massey 1993) are responsible for generating social movements with an environmental focus and have led to a shift in scale between locally anchored movements and international organisations with global impact. In many
cases small indigenous groups succeeded by involving Non-Governmental Organisations (NGOs) from Western countries in their struggles and were able to achieve their goals by globalizing the conflict. Some small and spontaneous local movements have expanded to become influential global players in environmental and human rights arenas. Several international NGOs, e.g., Friends of the Earth International, the WWF, Greenpeace, Global 2000, Peace Brigades International, have become global social movements with highly sophisticated organizational structures and professional fundraising. In some cases local social movements with a specific environmental focus obtained wider social support and political influence, and were finally transformed into political parties (see below). Several Green Parties in Europe have developed along this line (Wastl-Walter 1996).
4. Issues
4.1 Preservation of Natural Diversity
Of the world’s biodiversity, 98 percent exists on lands controlled by indigenous peoples. The land uses that sustain this diversity are intimately tied to the survival of the people in question, who often struggle to preserve their ‘space’ and lifeways against exogenous forces (Della Porta and Diani 1999, p. 12). Power relations are usually tilted against indigenous peoples, who have little political and social capital. These relations are further blurred by values that differ from those of the West. For some indigenous peoples, land is not only life, it is sacred. Hence many conflicts are also cultural in nature, and can only be won if indigenous groups obtain the support of Western social movements, which can contribute their strategies and resources for successful confrontation with global economic powers. A well-known example is the conflict of interest dating from 1988 concerning the construction of the world’s biggest dam on land belonging to the Kayapo Indians on the Rio Xingú in Brazil. If the project were realised, 7.6 million ha would be flooded and 11 indigenous peoples would be forced to leave their land. Furthermore, a unique ecosystem would be destroyed. The actors involved are the Government of Brazil, the World Bank, the Kayapo leaders and people, the OAB (Ordem dos Advogados do Brasil), the NWF (National Wildlife Federation), the EDF (Environmental Defense Fund), Amnesty International, Survival International, Cultural Survival, and the International Society of Ethnobiology. The example shows that the actors involved range from local groups to groups with an international dimension. The groups differ vastly in terms of interests. Since various peoples would lose their home and hence
their existence, human rights issues—and therefore the possibility of new alliances—should be considered as well.
4.2 Preservation of Resources and the Right of Access to Environmental Resources
A central aim of social movements with an environmental focus is the preservation of and right of access to environmental resources. Natural resources are endangered by the rich world’s resource-intensive, wasteful, and polluting habits of production and consumption as well as by the environmentally destructive practices of the poor, who are forced to intensify their production under inappropriate conditions leading, for example, to depletion and degradation of large areas. Similar conflicts could be mentioned: in terms of land use, this means food crops vs. cash crops or sustainable farming vs. industrialization. One of the first well-known examples of conflicts of this kind was the Kalinga women’s struggle against the Chico Dam project (Philippines) from 1967 until 1987. The women linked the sociopolitical issue of emancipation with their ecological interests, questioning the right of access to environmental resources as such. The dam project, which was financed by the IMF/World Bank, was the first of their projects that had to be cancelled due to the opposition of the indigenous population. In Europe the first conflicts where social movements were concerned with environmental issues were the anti-nuclear power movements of the 1970s. Again the main actors in terms of protest and the key speakers were women and mothers (Mütter gegen Wackersdorf in Germany). They were worried about the preservation of a healthy environment for their children and spoke against further economic and technological development at any price. They also questioned the legitimacy of technical experts.
5. Methods and Strategies of the New Social and Environmental Movements
These examples illustrate how the New Social Movements are forced to apply differing strategies and methods to succeed against the political establishment and the economic powers that be. Steven Pile draws our attention to the specific geographies around which acts of resistance unfold: ‘on the streets, outside military bases, and so on, or, further, around specific geographical entities such as the nation, or ‘our land,’ or world peace, or the rainforests; or, over other kinds of geographies, such as riots in urban places or revolts by peasants in the countryside or jamming government web sites in cyberspace’ (Pile 1997, p. 1).
The groups involved are often poorly resourced and act on a local level. In order to succeed they go public, thus extending their reach, i.e., allying themselves in ‘Joint Ventures’ with global players such as NGOs. For that reason, as Jayanta Bandyopadhyay states, social movements with an environmental focus cannot merely develop fighting strategies; they also have to invest in public relations in order to become a trendsetting movement (Bandyopadhyay 1999, p. 880). The movements are forced to use reassessed indigenous knowledge to overcome the arguments of modern scientific knowledge systems. And these alternative sources of knowledge need to be fed into channels of modern communication technology in order to reach a broad audience. Local movements brought about by so-called ‘disposable people’ (Ekins 1992) need to create a myth about themselves which triggers activists and solidarity movements among the more privileged. In many cases these groups have charismatic leaders or ‘movement poets’ (as they are called in the Chipko movement in India), who by personalizing the conflicts manage to obtain access to the mass media. Several such people have gained an international reputation in the course of their commitment, e.g., Medha Patkar, the leader of the movement against the Sardar Sarovar Dam in India, who was awarded the alternative Nobel Prize. The history of the Green party in Germany reveals a different developmental pattern: the party has come to represent the ‘Eco-Establishment’ (Seager 1993, p. 170). A social movement has become an official public agent. The debate between ‘fundamentalists’ and ‘realists’ was about whether to promote social change from within the system (realists) or from the outside (fundamentalists). Spontaneous demonstrations relying on a very affective approach have yielded to pragmatic and committed politics using the official channels of public life.
Professional respectability has replaced passionate involvement. A change in essence and style towards the perspectives of the establishment can be noted, with an acceptance of free market economy criteria, an emphasis on practicability, efficiency, and scientific arguments (Seager 1993, p. 201).
6. Social and Political Relevance of Environmental Movements
Environmental problems are caused by conflicting interests regarding natural resources and the preservation of the environment. Apart from those problems social movements also view environmental conflicts as a crisis of democracy and participation and as a cultural and ethical crisis because of the uneven power relations and conflicting systems of values and norms involved. With new actors entering the political field, traditional national politics and even states themselves are called into question by environmental
movements. Traditional political parties lose members and voters while green parties or more spontaneous political groups gain influence. In most cases it is impossible to solve the conflicts at either the local, regional, or national level. The issues at stake are manifold, each interest group portraying its own perspective. It is the interstices between self-determination and solidarity with human and nonhuman agents which are likely to create a substantial mobilizing potential. Hence new networks, operating at an international level and able to consider the differing interests and to mediate the uneven power relations, have to be built up in order for social movements with little formal political power to be heard.
See also: Environmental Justice; Environmental Security; Indigenous Knowledge and Technology; Moral Economy and Economy of Affection
Bibliography
Bandyopadhyay J 1999 Of floated myths and floated realities. Economic and Political Weekly 34: 880
Bula-at L 1996 The struggle of the Kalinga women against the Chico Dam project. idoc internazionale 3(July–Sept): 24–7
Calderón F, Piscitelli A, Reyna J L 1992 Social movements: Actors, theories, expectations. In: Escobar A, Alvarez S E (eds.) The Making of Social Movements in Latin America. Identity, Strategy and Democracy. Westview Press, Boulder, pp. 19–36
Della Porta D 1995 Social Movements, Political Violence, and the State. Cambridge University Press, Cambridge, UK
Della Porta D, Diani M 1999 Social Movements. An Introduction. Blackwell, Oxford, UK
Ekins P 1992 A New World Order. Grassroots Movements for Global Change. Routledge, London
Eyerman R, Jamison A 1991 Social Movements. A Cognitive Approach. Pennsylvania State University Press, University Park, PA
Massey D 1993 Power geometry and a progressive sense of place. In: Bird B et al. (eds.) Mapping the Futures: Local Cultures. Global Change. Routledge, London, pp. 59–69
McCormick J 1989 The Global Environmental Movement. Reclaiming Paradise. Belhaven Press, London
Mohanty M, Törnquist O (eds.) 1998 People’s Rights. Social Movements and the State in the Third World. Sage, Thousand Oaks, CA
Pile S 1997 Opposition, political identities and spaces of resistance. In: Pile S, Keith M (eds.) Geographies of Resistance. Routledge, London, pp. 1–32
Posey D 1995 High noon am Rio Xingú. In: Rasper M (ed.) Landräuber: Gier und Macht – Bodenschätze contra Menschenrechte. Focus, Giessen, pp. 164–75
Rocheleau D, Thomas-Slayter B, Wangari E (eds.) 1996 Feminist Political Ecology: Global Issues and Local Experiences. Routledge, London
Seager J 1993 Earth Follies: Coming to Terms with the Global Environment Crisis. Routledge, New York
Seager J 1996 Hysterical housewives and other mad women: Grassroot environmental organizing in the United States. In: Rocheleau D, Thomas-Slayter B, Wangari E (eds.) Feminist Political Ecology: Global Issues and Local Experiences. Routledge, London, pp. 271–86
Slater D 1997 Spatial politics/social movements. Questions of (b)orders and resistance in global times. In: Pile S, Keith M (eds.) Geographies of Resistance. Routledge, London, pp. 258–76
Staggenborg S 1998 Gender, Family and Social Movements. Pine Forge Press, Thousand Oaks, CA
Wastl-Walter D 1996 Protecting the environment against state policy in Austria: From women’s participation in protest to new voices in parliament. In: Rocheleau D, Thomas-Slayter B, Wangari E (eds.) Feminist Political Ecology: Global Issues and Local Experiences. Routledge, London, pp. 86–104
D. Wastl-Walter
Social Movements, Geography of
Social movements, like all social relations, are geographically constituted. What one means by ‘social movements,’ however, is open to a wide range of interpretations. Here, social movements are defined broadly as collective mobilizations of people seeking social or political change. While social movements necessarily take place somewhere, their geography involves considerably more than an inventory of their locations. Geography is not a background onto which aspatial social processes are mapped. Rather, it is a fundamental aspect of the constitution of all social processes, affecting their evolution and outcome directly. It is part of the explanation of social processes, as well as an outcome (Massey and Allen 1984). While a variety of geographic concepts can be employed in the analysis of social movements, three core geographic concepts are particularly important: space, place, and scale. Each of these concepts has its own evolution, which can be seen in the ways in which analyses of the geography of social movements have changed over time.
1. Evolving Geographic Concepts and the Analysis of Social Movements

1.1 Spatial-analytic Analysis of Social Movements

The spatial-analytic paradigm that dominated geographical analysis in the 1960s and 1970s worked from a conception of space that was based, by and large, on measures of distance. In this paradigm, social phenomena are described and explained in terms of absolute (e.g., with reference to a fixed grid such as latitude and longitude) or relative (e.g., transportation costs or travel time) location. Virtually all social phenomena have been shown to possess a ‘distance decay’ function as interaction costs increase with distance from the activity center. The spatial diffusion of information, as well as barriers to such diffusion, became a common theme in much spatial-analytic research, including that addressing social movements (see Diffusion: Geographical Aspects). The growth and mobilization of social movements is clearly dependent upon flows of information, as demonstrated by Sharp’s (1973) analysis of the spatial diffusion of US postal strikes, and Hedstrom’s (1994) work on the spatial diffusion of Swedish trade unions. Most spatial-analytic work on social movements, however, examines the co-presence of specific social conditions within clearly delimited places. Adams (1973), for instance, examines the co-location of factors promoting urban protests, while Cutter et al. (1986) consider how characteristics measured at the state level are associated with support for nuclear weapons freeze referenda (see Peace Movements, History of). In the 1970s, attempts to explain the occurrence of social phenomena in terms of distance or co-location came under heavy criticism for ignoring power relations, and in particular, the relation between ‘the social’ and ‘the spatial.’
1.2 Contemporary Conceptions of Space and the Analysis of Social Movements

Influenced by the work of Henri Lefebvre and others, contemporary conceptions of space emphasize the mutual constitution of ‘the social’ and ‘the spatial.’ Space is seen increasingly as a necessary component of the constitution of power relations. Equally significant, space is no longer conceptualized primarily in terms of the physical location of material objects. Instead, both material spatial practices and symbolic or representational spaces are examined.

As political economy approaches swept through the social sciences in the 1970s and 1980s, the spatial constitution of material interests was linked explicitly to the geography of social movements. The relationship between the geography of political and economic conditions and the incidence of protest is demonstrated in detail in Charlesworth’s (1983) atlas of rural protest in the UK. Bennett and Earle (1983) show that the failure of socialism in the USA can be linked directly to geographic patterns of wage and skill differentiation; where differentiation was substantial, working class solidarity was undermined (see Labor Movements, History of). Similarly, Massey (1984) argues that the spatial division of labor affects directly the material interests that underlie political movements. Markusen (1989) makes a somewhat different point: a region’s stage in the profit cycle shapes the types of movements that are likely to emerge in it. Environmental movements are more likely to emerge in economically booming regions where rapid growth places stress on the environment, while pro-growth politics are common in regions of decline (see Social Movements: Environmental Movements).

Geographically differentiated state systems also shape the geography of social movements. Kitschelt’s (1986) comparison of the political structures of France, Sweden, West Germany, and the USA shows that differences in central state structures are related to national-level variations in anti-nuclear mobilization. Other studies link local-level variation in social movement mobilization and success to differences in political opportunity structures among local states (Duncan et al. 1988, McAdam 1996, Tarrow 1996). In the USA many social movements began as highly localized movements in politically opportune places, then spread nationally (see Citizen Participation).

Symbolic or representational spaces affect the geography of social movements as well. The symbolic codings of spaces, i.e., the often subtle and unacknowledged signs that shape people’s reactions to spaces, may indicate belonging for some groups and exclusion for others, empowerment for some groups and repression for others, safety for some groups and danger for others. Such codings can foster or hinder social movement mobilization. The creation of symbolic space is intertwined with the process of collective identity formation, as Marston (1988) shows for ethnic identity, Rose (1993) demonstrates for gendered identity, and Ruddick (1996) illustrates for interlocking ‘race,’ class, and gender identities (see Social Movements and Gender). Such identities are crucial to the formation of solidarity necessary for movement mobilization (see Social Movements, History of: General). The dominant symbolic codings of space are open to contestation, and counter-representations are often constructed by insurgent groups. Representational spaces, and their internalization, relate directly to the ways in which social movement messages resonate in some places, and not in others.
1.3 Place and the Analysis of Social Movements

Many geographic analyses emphasize the place-based constitution of social processes. Place is best understood as a multidimensional concept entailing ‘locale,’ i.e., the geographical setting of everyday social relations; ‘location,’ i.e., geographical position vis-à-vis wider scale interactions; and ‘sense of place,’ i.e., the esthetic and emotional bonds to locale (Agnew 1987). The concept of place stresses the unique ways in which social processes may interact in specific milieus, calling our attention to the ‘situatedness’ of all human action and institutions. At the same time, place is not to be equated with ‘the local.’ Place implies the articulation of localized processes with other processes operating at broader scales.

Much of the geographical literature on social movements stresses the ways in which identities and allegiances are shaped in place. Thrift and Williams (1987) analyze the ways in which consciousness and identity are produced through place-specific social practices and the time–space paths of individuals’ everyday lives. Katznelson’s (1981) work can be interpreted as a place-sensitive analysis of US workplace and community politics, stressing how distinct places shape identity in distinct ways. Savage’s (1987) analysis of British working-class politics demonstrates how the interactions of workplace and lifeworld spaces have effects on identity that vary from place to place. Recent work on identity construction stresses the place-specific interplay of spaces of production, reproduction, and the state that produce multiple, fragmented, and sometimes contradictory identities, all with direct implications for social movement mobilization (Ruddick 1996, Cope 1996, Brown 1997).

Place itself can be a primary basis for collective identity construction. The defense of places associated with specific collective identities is a common theme of many new social movements (Melucci 1994). Increasingly, cities find themselves in competition with each other for capital investment; in response, place-based identification and coalition building has often taken precedence over class and other axes of social identity (Hudson and Sadler 1986, Swyngedouw 1989).

Identity is not all that is shaped in place. Agnew (1987, p. 59) observes that ‘different places produce different degrees and orientations of organizational capacity.’ Routledge (1993), for example, shows that social and environmental movements in India faced opportunities for, and constraints to, mobilization that relate to place-specific social networks, systems of meaning, and patterns of political repression. Miller (2000) shows how a variety of place-specific characteristics, including political opportunity structures, available resources, material interests, and attachments to place, shaped the strategies and effectiveness of peace organizing differentially in several Boston area municipalities.
1.4 Scale and the Analysis of Social Movements

The concept of place clearly carries within it issues of geographic scale. Geographic scale has commonly been thought of as a fairly straightforward technical issue relating to the telescoping frame of geographic analysis (from local to global), but such a notion of scale as pre-given or unproblematic is increasingly challenged. Most analyses of social movements assume that the appropriate scale of analysis is that of the nation state, but such an assumption overlooks the often critical subnational, as well as international, constitution of social movement processes. Moreover, rather than being seen as an ‘external fact awaiting discovery,’ scale should be viewed as a ‘way of framing conceptions of reality’ (Delaney and Leitner 1997, pp. 94–5) that is employed by social movement organizations and their opponents as a way of gaining political advantage. As Smith (1993, p. 90) explains:

The construction of scale is not simply a spatial solidification of contested social forces and processes; the corollary also holds. Scale is an active progenitor of specific social processes. In a literal as much as metaphorical way, scale both contains social activity and at the same time provides an already partitioned geography within which social activity takes place. Scale demarcates the sites of social contest, the object as well as the resolution of contest.
2. Geographic Tactics and Strategies of Social Movements

Given that all social movements are geographically constituted, all social movement organizations must make decisions regarding the geographic tactics and strategies they employ. Sometimes these decisions are explicitly considered; often they are not. Social movements frame their messages in particular ways that they hope will resonate with particular groups. Given a highly differentiated geography of culture, symbolization, and interests, the resonance of any given frame will depend in substantial measure on the geographic targeting of the message. Similarly, the recruitment of social movement members is inextricably intertwined with the geography of mobilization activities (Harvey 1985, Miller 2000).

While identities constructed in place and place-based identities frequently are central to social movement solidarity, the absence of such bases of solidarity can be rectified through efforts to construct new places of group interaction and identity. Castells (1983), for example, details how the growth of gay bars, businesses, and gathering places was crucial to the construction of shared identity in the San Francisco gay community. Castells also shows how, in the case of the Madrid citizens’ movement, the construction of a community center in the Orcasitas shantytown was essential to breaking down social walls and building broad, place-based bonds around which the movement organized.

The tactics and strategies that social movements employ to challenge the existing social order are rooted in an often-unexamined normative geography. Cresswell (1996) makes a compelling case that spatial structures and systems of places provide ideologically loaded schemes of perception. Dominant conceptions of what is ‘normal’ and ‘appropriate’ are defined ‘to a significant degree, geographically, and deviance from this normality is also shot through with geographical assumptions concerning what and who belong where’ (p. 27). He illustrates his argument with several examples, including that of the Greenham Common women’s peace camp that challenged British participation in the nuclear arms race through ‘out of place’ behavior that ‘denaturalized the dominant order [and] showed people that what seemed natural, could, in fact, be otherwise’ (p. 125). Indeed, social movements can be mobilized and social conditions can be changed through a re-symbolization of existing spaces or through a re-territorialization of existing symbolization (Pile and Keith 1997). Harvey (1989, p. 233) makes a case for the re-symbolization of space when he argues:

If workers can be persuaded … that space is an open field of play for capital but a closed terrain for themselves, then a crucial advantage accrues to the capitalists … power in the realms of representation may end up being as important as power over the materiality of spatial organization itself.
An example of the re-territorialization of existing symbolization can be found in the recent work that has been done in several US metropolitan areas to recast the ongoing political battles between central cities and suburbs. In several metropolitan areas this common conception of territorial interests is being replaced with one that aligns the central cities with the inner ring suburbs, creating a far more powerful territorial alliance behind the movement for a more equitable metropolitan distribution of affordable housing, school funding, and tax revenue (Orfield 1997).

Many of the efforts of social movements to redefine the spaces of struggle involve attempts to ‘jump’ to a more favorable geographic scale (Smith 1993). In addition to changing symbolic geography, altering the scale of material processes can be central to social movement strategy. Examples include shifting from regional to national-scale labor bargaining (Herod 1998), and seeking out more favorable political opportunity structures at a different level of the state, e.g., local rather than central (Miller 2000). The geographic tactics and strategies adopted by social movements and their opponents play a powerful role in the nature of social movement mobilization and the degree of success or failure social movements achieve.

See also: Community Power Structure; Interest Groups, History of; Peace Movements; Political Protest and Civil Disobedience
Bibliography
Adams J 1973 The geography of riots and civil disorders in the 1960s. In: Albaum M (ed.) Geography and Contemporary Issues. Wiley, New York, pp. 542–64
Agnew J A 1987 Place and Politics. Allen & Unwin, Boston, MA
Bennett S, Earle C 1983 Socialism in America: a geographical interpretation of its failure. Political Geography Quarterly 2(1): 31–55
Brown M 1997 Radical politics out of place? In: Pile S, Keith M (eds.) Geographies of Resistance. Routledge, New York, pp. 152–67
Castells M 1983 The City and the Grassroots. University of California Press, Berkeley, CA
Charlesworth A (ed.) 1983 An Atlas of Rural Protest in Britain, 1548–1900. University of Pennsylvania Press, Philadelphia, PA
Cope M 1996 Weaving the everyday: identity, space, and power in Lawrence, Massachusetts, 1920–1939. Urban Geography 17(2): 179–204
Cresswell T 1996 In Place/Out of Place. University of Minnesota Press, Minneapolis, MN
Cutter S L, Holcomb H B, Shatin D 1986 Spatial patterns of support for a nuclear weapons freeze. The Professional Geographer 38(1): 42–52
Delaney D, Leitner H 1997 The political construction of scale. Political Geography 16(2): 93–97
Duncan S, Goodwin M, Halford S 1988 Policy variations in local states: Uneven development and local social relations. International Journal of Urban and Regional Research 12(1): 107–28
Harvey D 1985 Consciousness and the Urban Experience. Johns Hopkins University Press, Baltimore
Harvey D 1989 The Condition of Postmodernity. Blackwell, Cambridge, MA
Hedstrom P 1994 Contagious collectivities: On the spatial diffusion of Swedish trade unions, 1890–1940. American Journal of Sociology 99(5): 1157–79
Herod A (ed.) 1998 Organizing the Landscape: Geographical Perspectives on Labor Unionism. University of Minnesota Press, Minneapolis, MN
Hudson R, Sadler D 1986 Contesting work closures in Western Europe’s old industrial regions: Defending place or betraying class? In: Scott A J, Storper M (eds.) Production, Work, Territory. Allen & Unwin, Boston, pp. 172–94
Katznelson I 1981 City Trenches. University of Chicago Press, Chicago
Kitschelt H 1986 Political opportunity structures and political protest: Anti-nuclear movements in Four Democracies. American Sociological Review 49: 770–83
Markusen A 1989 Industrial restructuring and regional politics. In: Beauregard R (ed.) Economic Restructuring and Regional Politics. Sage, Newbury Park, CA, pp. 115–48
Marston S A 1988 Neighborhood and politics: Irish ethnicity in nineteenth century Lowell, Massachusetts. Annals of the Association of American Geographers 78: 414–32
Massey D 1984 Spatial Divisions of Labor. Methuen, New York
Massey D, Allen J 1984 Geography Matters. Cambridge University Press, New York
McAdam D 1996 Conceptual origins, current problems, future directions. In: McAdam D, McCarthy J D, Zald M N (eds.) Comparative Perspectives on Social Movements. Cambridge University Press, New York, pp. 23–40
Melucci A 1994 A strange kind of newness: what’s ‘new’ in new social movements? In: Larana E, Johnston H, Gusfield J R (eds.) New Social Movements: From Ideology to Identity. Temple University Press, Philadelphia, pp. 101–30
Miller B A 2000 Geography and Social Movements. University of Minnesota Press, Minneapolis, MN
Orfield M 1997 Metropolitics: A Regional Agenda for Community and Stability. The Brookings Institution, Washington, DC
Pile S, Keith M 1997 Geographies of Resistance. Routledge, New York
Rose G 1993 Feminism and Geography. University of Minnesota Press, Minneapolis, MN
Routledge P 1993 Terrains of Resistance: Nonviolent Social Movements and the Contestation of Place in India. Praeger, Westport, CT
Ruddick S 1996 Constructing difference in public spaces: Race, class, and gender as interlocking systems. Urban Geography 17(2): 132–51
Savage M 1987 The Dynamics of Working Class Politics: The Labour Movement in Preston, 1880–1940. Cambridge University Press, Cambridge, UK
Sharp V 1973 The 1970 postal strikes: The behavioral element in spatial diffusion. In: Albaum M (ed.) Geography and Contemporary Issues. Wiley, New York, pp. 523–32
Smith N 1993 Homeless/global: scaling places. In: Bird J, Curtis B, Putnam T, Robertson G, Tickner L (eds.) Mapping the Futures: Local Cultures, Global Change. Routledge, New York, pp. 87–119
Swyngedouw E 1989 The Heart of the Place: The Resurrection of Locality in an Age of Hyperspace. Geografiska Annaler 71B: 31–42
Tarrow S 1996 States and opportunities: The political structuring of social movements. In: McAdam D, McCarthy J D, Zald M (eds.) Comparative Perspectives on Social Movements. Cambridge University Press, New York, pp. 41–61
Thrift N, Williams P 1987 The geography of class formation. In: Thrift N, Williams P (eds.) Class and Space: The Making of Urban Society. Routledge & Kegan Paul, New York, pp. 1–22
B. Miller
Social Movements, History of: General

Social movements, according to sociologist Sidney Tarrow’s definition, are ‘collective challenges, based on common purposes and social solidarities in sustained interaction with elites, opponents and authorities’ (Tarrow 1998, p. 4). This definition, derived from studies of both historical and contemporary social movements, locates the earliest movements in the late eighteenth and early nineteenth centuries. The emergence of social movements was contemporaneous with: (a) the increased centralization of national states that accompanied war and revolution in western Europe and North America; (b) the rise of the popular press and the circulation of notions of natural rights and liberty put forward by Enlightenment and revolutionary thinkers; and (c) particularly in England and the English-speaking colonies of North America, the rise of evangelical and nonconforming religions in which literacy and reading were cultivated and principles of human rights formulated. From that period and place, social movements have proliferated to the world.

The chief focus of this article is the history of social movements in the West that have attracted most scholarly attention; it draws particularly on studies done in the years 1975–2000 (a period in which the concept of social movement has been redefined and developed) for its theoretical framework and empirical base. The British movement for the abolition of slavery, an emblematic and by some criteria the first social movement, serves as a concrete example of an early crusade that embodied many of the central features of the phenomenon.
1. The Abolition of Slavery in Great Britain

The eighteenth and nineteenth century British abolitionist movement was distinctive in that it was aimed at an institution which had only a minor presence in Great Britain, whereas most social movements have been concerned with issues close to hand. Slavery nevertheless involved many Britons: British ships sailing from Liverpool were active in the slave trade carrying slavers to Africa and slaves to the Americas, and the ‘peculiar institution’ was common in Britain’s North American and universal in its Caribbean colonies. The question of the legality of slavery in Britain was raised as early as the 1720s, but it was not until 1772 that Lord Chief Justice Mansfield ruled that slavery was illegal under English law. English members of the Society of Friends had developed the earliest critique of slavery as contrary to Christian teaching, and local Quaker meetings first excluded slave traders in 1761 and later slave owners from membership. Quakers then turned to political action to end British involvement in the slave trade, establishing in 1787 the Society for the Abolition of the Slave Trade, commonly known as the London Committee. This organization welcomed non-Friends as members, and recruited both middle and working-class adherents.

The political technique utilized by the Committee in this and later campaigns was mass petitions to Parliament. Thomas Clarkson, a gifted organizer, and other activists worked outward from London and local Friends meetings to form regional organizations, recruit members, distribute leaflets, and collect signatures on petitions to Parliament calling for the end of the slave trade. (Clarkson and his associates began their crusade with an exclusive focus on the slave trade, as less contentious than abolition.)
Visual images portraying the horrors of the slave trade, such as a print of the cross section of a slave ship with its tightly packed human cargo and Josiah Wedgwood’s medallion of a kneeling chained slave, inscribed with the motto ‘Am I not a Man and Brother?’, were highly effective in attracting supporters. There was an outpouring of support, with about 60,000 male signers on the first petitions, including 11,000 men from Manchester, the center of the burgeoning cotton spinning industry. Parliamentary sympathy for the cause seemed strong up to 1792, when 519 petitions with up to 400,000 signers were collected. However, legislative support dropped with the slave revolt in Haiti, the contemporaneous radicalization of the French Revolution, and the outbreak of war against France. In 1794, the House of Lords decisively rejected the abolitionist resolutions presented to it.

Political opportunities for action on abolition reappeared in 1804, after Britain’s victories in the Napoleonic Wars and its seizure of the French Caribbean colonies. Clarkson and other organizers again canvassed the country seeking signatures; he also sought support from members of Parliament, in particular William Wilberforce, an Evangelical MP from Yorkshire who spoke eloquently about the evils of the trade, and Samuel Whitbread, a nonconformist Protestant already known as a social reformer. Clarkson’s powers of persuasion and his Parliamentary allies combined to produce in 1807 a majority vote in Parliament to forbid British ships’ participation in the slave trade to both British possessions and foreign slave-holding colonies. The African Institution, founded in the same year, took as its task ascertaining the effectiveness of the British legislation. Although British diplomats signed agreements eschewing the slave trade with other European powers, and indeed paid compensation when they did, the results were uneven.

British activists resolved to continue their efforts, founding the Society for the Amelioration and Gradual Abolition of Slavery (known familiarly as the Anti-Slavery Society) in 1823 and the Agency Committee (1831), which demanded immediate and unconditional slave emancipation. The Agency’s professional organizers traveled around England, promoting local petition drives and pressure on candidates in the Parliamentary elections to declare their support of full abolition.
Members of the House of Commons elected under the 1832 Reform Act were ready to vote for the abolition of slavery in British possessions, but the House of Lords was not. The outcome was a compromise Emancipation Act of 1833 that set August 1, 1834 as the date on which slavery would end, but stipulated a complex gradual process to complete freedom. The former slaves would become ‘apprentices,’ working for pay three-quarters of their time for their ex-masters; one-quarter of the ex-slaves’ time was to be reserved for them to cultivate their newly-granted plots. Slave owners would be compensated by the state for the loss of their ‘property.’

The story did not end there, however. Disagreements about strategy between the Anti-Slavery Society (which had accepted the compromise Emancipation Act) and the Agency (which had not) continued, and the latter established yet another organization (the Central Negro Emancipation Committee) and sought to persuade Parliament that oppressive conditions still prevailed. In the end, the complex system based on the 1833 Act proved difficult to manage, and the planters themselves ended apprenticeship in 1838.

The international abolition movement matured in the 1830s, and began to hold annual congresses (which some American abolitionists attended) in 1840. The United States Civil War (1861–5) was the outcome of a series of legal and political disputes about slavery and states’ rights. In 1863 President Abraham Lincoln’s Emancipation Proclamation ended slavery in the Confederacy, and during the postwar Reconstruction the Thirteenth, Fourteenth, and Fifteenth Constitutional amendments (which completed abolition, made the ex-slaves citizens, and guaranteed their right to vote) were ratified. The abolition of slavery in Cuba (1886) and Brazil (1888) ended legal slavery in the West.
2. The Social Character of Movements

Over the period 1975–2000, sociologists have begun to define ‘contentious politics’ (confrontations of ordinary people with authorities and/or elites) as the building blocks of social movements. Sidney Tarrow defines the conditions under which contentious politics is triggered:

When changing political opportunities and constraints create incentives for social actors who lack resources on their own, they contend through known repertoires of contention and expand them by creating innovations at their margins. When backed by dense social networks and galvanized by culturally resonant, action-oriented symbols, contentious politics leads to sustained interaction with opponents. The result is the social movement (Tarrow 1998, p. 2).
The case of the British movement for the abolition of the slave trade and slavery illustrates this definition: (a) the movement was collective and interactive, between authorities and the populace; it was conducted early on through organizations established for the purpose and spread by movement professionals; (b) it was aimed at political authorities (in this case Parliament) which could take action to achieve the movement’s goal; (c) each phase of activism on abolition began with an opening of political opportunity and ended when new political constraints closed off forward movement; (d) the network of Quaker meetings was the first activated in the cause, but other church groups and local branches of the anti-slavery societies joined the abolitionist network over the period; (e) interaction was contentious and drew on symbols that helped promote action; and (f) participation in the movement came to be perceived by activists as constitutive of their identities.
3. Woman’s Suffrage in the United States

The formation of the United States woman’s suffrage movement was closely connected to the social groups and organizations that promoted the abolition of slavery there. For example, American woman’s suffrage activists Lucretia Mott and Elizabeth Cady Stanton first met each other at the London World Anti-Slavery Convention of 1840, where they discussed the possibility of forming an organization to promote women’s rights. Mott, a New Englander then living in Philadelphia and the senior of the two, had founded the first Female Anti-Slavery Society in the United States; her home was a stop on the ‘underground railroad’ through which slaves escaped to Canada and freedom. The well-educated daughter of a judge, Stanton was born in western New York State, married an abolitionist leader, and eventually settled with her family in Seneca Falls, NY. Stanton and Mott renewed their acquaintance in 1848, when they and three like-minded women issued a call to a women’s rights conference in the summer. To their surprise, some 300 women and men came, applauded Stanton’s speech echoing the Declaration of Independence, and passed a resolution on women’s suffrage: ‘Resolved, that it is the duty of the women of this country to secure to themselves their sacred right to the elective franchise.’

The movement developed slowly through annual conferences, spreading primarily to the Middle West and East coast. In 1851, Stanton met Susan B. Anthony, then a paid organizer in the temperance movement. Anthony brought her organizational skills to women’s rights, beginning a drive in 1854 based on organizers in each county of the state of New York who recruited local women. These in turn collected signatures for a petition calling for women’s control of their own earnings, mothers’ guardianship of children in divorce, and women’s suffrage. The first concrete accomplishment, a New York state law granting women property rights, was passed in 1860.
With the outbreak of the Civil War in April 1861, organizing efforts were temporarily suspended. Not for long, however, as Anthony and Stanton reactivated their abolitionist and women’s rights networks in 1863, issuing a call to a mobilization meeting in New York. The outcome was the National Women’s Loyal League, which set to work gathering close to 400,000 signatures on a petition for the Thirteenth constitutional amendment that would free the slaves. Women also served on the Civil War Sanitary Commission, which raised funds and organized and administered nursing and support services for the military hospitals and convalescent centers.

The campaign for ratification of the Fourteenth and Fifteenth amendments (the first made male ex-slaves citizens and the second granted them the vote) split the former abolitionist movement colleagues between those who supported rights for the ex-slaves as a first priority, and those who supported universal suffrage. In 1869, two new organizations dedicated to women’s suffrage emerged from the split: the National Woman Suffrage Association, founded by Anthony and Stanton, which objected to the Fourteenth and Fifteenth Amendments because they specified male ex-slaves; and the American Woman Suffrage Association, founded by Lucy Stone, her husband Henry B. Blackwell, and Mary Livermore. These associations pursued the cause by different methods and separately for 20 years, during which the westward movement was completed, the country industrialized, and new women’s associations such as the Women’s Christian Temperance Union (1874) and the National Association of Colored Women (1895) were founded. In 1890, the National and the American Associations merged to form the National American Woman Suffrage Association, with Elizabeth Cady Stanton as its first president; she was succeeded two years later by Susan B. Anthony, who served to 1900. Led after 1900 by Carrie Chapman Catt, a gifted organizer, the National American Association continued its strategy: seeking constitutional change state by state. Catt resigned in 1904 to play an active role in founding the International Woman Suffrage Alliance launched that year. There was little activity around the woman suffrage (‘Susan B. Anthony’) amendment to the United States Constitution for over 10 years.

New tactics appeared in 1913, devised by the more militant new movement recruits Alice Paul and Lucy Burns, who were appointed to the National American Suffrage Association’s Congressional Committee. They promptly organized a demonstration of some 5,000 women who marched in Washington in support of a constitutional amendment the day before Woodrow Wilson’s first inauguration. Despite the organization’s permit for the parade, the police were unprepared for the crowd reaction and the women were mobbed and jeered.
Later that year Paul and Burns established the Congressional Union, and their organization eventually became independent of the National American, which continued to pursue its state-by-state strategy, with only slow progress. In 1915, Carrie Chapman Catt agreed to lead the campaign for women’s suffrage in New York State, and once again became president of the National American Suffrage Association with a newly energetic board committed to her plan for a renewed campaign for a national constitutional amendment. (The Congressional Union and its successor the Woman’s Party did not join the effort. The Woman’s Party nevertheless contributed to the process by courting arrest through disruptive and provocative action, which made the National American’s actions look reasonable.) At its convention in late 1917 the latter warned that if Congress did not approve the suffrage amendment before the next election, it would adopt the British suffragette method: opposing the election of Congressional candidates who opposed the federal
amendment, no matter what their party. The amendment passed in the House of Representatives by the required two-thirds majority in January 1918, the Senate in 1919, and ratification by the requisite proportion of states was completed in 1920.
4. Twentieth Century Social Movements: Italian Fascism
In the early twentieth century, several movements demonstrated unequivocally the adaptability of the social movement to regressive ends. The Italian Fascist movement serves as an example. After the end of World War One late in 1918, Italy was swept by waves of contentious politics: demobilized troops demanded jobs; peasants and agricultural laborers agitated for land; consumers called for price controls; and workers struck for higher wages. Among the groups engaging in collective action were the fasci di combattimento founded by Benito Mussolini (formerly a revolutionary socialist), who convened the first fascio in Milan on March 21, 1919. At a public rally two days later the national program was only vaguely described (it was not fully formulated and publicized until months later). The program called for integration of all Italian ‘national’ territory into the nation, and pledged to oppose all neutralist politics no matter the party involved; other radical social and political demands included universal (male) suffrage, a constituent assembly, a republic with local autonomy, the eight-hour day, worker participation in factory management, land to the peasants, elimination of the Senate and all noble titles, and an end to conscription, expropriative taxes, speculation, and monopoly. The Fascists’ first public action (April 15, 1919) was a counter-demonstration, together with students, World War veterans, and revolutionary syndicalists, which attacked an anarchist/socialist march protesting repression in Milan. It ended with the sacking and burning of the headquarters of the socialist newspaper Avanti!
In the next two years most Italian popular contention was dominated by socialists and organized social Catholics (represented by the newly founded Popular Party), who both sought to control and lead the strikes in northern industries and large-scale agriculture in the Po Valley, the land occupations, and eventually, in the fall of 1920, the factory occupations. These struggles all occurred in northern Italy, where socialist and Popolari strength was concentrated. The Fascist counterattack from its urban bases (again in the north) began in the same period: Fascists physically assaulted unionized farm workers on strike, attacked the headquarters of peasant leagues, cooperatives, and unions, burned left-wing newspapers’ offices and printing plants, and broke up meetings of the socialist communal councils as well. Socialist and Popular Party organizations crumbled as Fascist membership and attacks both increased.
The Fascists had become a political force to be reckoned with and their success went hand in hand with the demobilization of the left. In August 1921, Mussolini signed the so-called Pacification Pact with the socialists and Popolari, promising to eschew violence, but his followers forced him to back away from the agreement and he publicly renounced it later in the year. He also announced the formation of a political party, which he described as a rival to the state, outside traditional politics. Violent clashes between socialists and Fascists continued into 1922, when the Fascist threat to seize the state forced the cancellation of a general strike planned by an ad hoc group of unions; by then the entire Po Valley, and much of North and Central Italy, were in Fascist hands. As the constitutional parties called for a new government, Mussolini threatened to march on Rome to seize power. The march never happened; instead, Fascist Black Shirts occupied provincial government and party headquarters, post offices, and police stations (again in the north, where socialist power had been concentrated), sometimes without resistance. Fascist threats and his own concerns led the king to refuse to sign the declaration of martial law which the prime minister had drawn up, and Mussolini’s first government (based on a coalition of center and right parties) was formally constituted on October 31, 1922. Mussolini used nationalism wreathed in patriotism as a weapon against the left, and he continued to be a master of language and image once in power.
Although there had been social movements in the European colonies (and in China, as well) before the outbreak of World War I in 1914, there were many more non-Western movements in the twentieth century; they ranged in principal concerns from nationalism to widow burning and female circumcision (in India and Kenya respectively).
Nationalists sometimes rejected Western notions of modernity and connected customs that seemed unacceptable to Westerners to colonial cultural authenticity. Two examples are Mahatma Gandhi in his representation of himself with a spinning wheel and in his advocacy of khadi (handspun and woven cotton cloth which he himself wore in his later years), and Kenyan nationalist Jomo Kenyatta who defended female circumcision on behalf of the Kikuyu Central Association, a pre-independence nationalist organization. More recently, televised news reports and rapid communication by telephone, fax, and e-mail have made possible the almost instantaneous transmission of news not only in print, but of visual images as well. To what extent can we speak of transnational social movements?
5. Transnational Social Movements
The above discussion of anti-slavery and women’s rights as social movements notes the international dimensions that were integral to social movements
even in the late eighteenth century; similarities and connections of social movements across borders are not new. Sidney Tarrow offers a more restrictive definition of a transnational social movement: sustained contentious interactions with opponents—national or nonnational—by connected networks of challengers organized across national boundaries … What is important … is that the challengers themselves be both rooted in domestic social networks and connected to one another more than episodically through common ways of seeing the world, or through informal or organizational ties, and that their challenges be contentious in deed as well as in word (1998, p. 184).
Tarrow discusses three movements that fit his definition: the contemporary ecological and environmental movement, the European and American peace movement of the 1980s, and Islamic fundamentalism. The ecological/environmental movement, which mounted coordinated worldwide Earth Day demonstrations in the late 1970s and early 1980s, provides the example discussed here. Some of its social movement organizations and their coordinated activities across borders do qualify as transnational organizations, as Tarrow demonstrates. Greenpeace, for example, has been a genuinely transnational social movement organization, with professional organizers and offices in many countries, and supporters who share a common world view and are willing to support its advocacy. Founded in Vancouver, Canada, it had, according to its 1992–3 report, over five million supporters (contributors), 43 offices, and over a thousand paid employees. It developed out of domestic movements that had similar goals, organized advocacy networks, and conducted contentious politics that crossed national boundaries. Its members and sympathizers demonstrated against French nuclear testing in the Pacific and against abusive fishing for whales by Japanese and Soviet whalers. Nevertheless, most international connections are less sustained and less professionalized than Greenpeace. Not all issues can attract support across the varied conditions of world polities, and most political, social, and economic international movements continue to work within states and national-level private institutions like large business organizations or labor unions rather than through transnational voluntary associations which represent individuals’ world views directly.
6. Overview: Social Movements from the Eighteenth to the Twentieth Century
Sidney Tarrow’s definition of the social movement (quoted in the opening sentence of this article) posits its applicability to the entire period since the first appearance of social movements in the late eighteenth century up to the present. His definition is based on the premise that although the issues and tactics of social movements have changed over the two centuries in which they have proliferated, the defining form of social movements (their collective nature, common purposes, and shared base in social solidarities with roots in the social structure in which they occur) has remained relatively unchanged. Not all social movement scholars agree with this premise; among European sociologists and political scientists in particular the concept of ‘new social movements’ has been developed since 1985. Kriesi et al. (1995) argue that a new type of social movement, which addresses similar issues yet varies significantly in form according to the national political context, has emerged in Western Europe since 1968. Kriesi and his colleagues emphasize contemporary cross-national variation in domestic politics and corresponding differences in the timing and other characteristics of social movements. The examples that they discuss under the rubric of new social movements include ‘the ecology movement (with its antinuclear energy branch), the solidarity movement (with the Third World), the contemporary women’s movement, the squatters’ movement, as well as various other movements mobilizing for the rights of discriminated-against minorities’ such as gays (1995, p. xii). The new type of social movement discussed by Kriesi et al. (1995) is classified according to the chief constituencies of the movements and the connections among them. Tarrow’s definition of social movements leaves open the characteristics of their members and their goals and stresses the movements’ longstanding form. Two alternatives present themselves. In terms of their political concerns and context, recent social movements may herald changes in some aspects of movement politics and require redefinition of the concept.
Alternatively, the broad definition with which this article begins argues for stability in the form of social movements even as the issues that bring them into the public arena change. Which of these alternatives will better help us understand contemporary and future social movements is an empirical question that can only be settled with a longer span of observations.
Bibliography
Anderson B 1991 Imagined Communities. Verso, London
Costain A R, McFarland A S (eds.) Social Movements and American Political Institutions. Rowman and Littlefield, Boulder, CO
Keck M E, Sikkink K 1998 Activists Beyond Borders: Advocacy Networks in International Politics. Cornell University Press, Ithaca, NY
Kriesi H, Koopmans R, Duyvendak J W, Giugni M G (eds.) 1995 New Social Movements in Western Europe: A Comparative Analysis. University of Minnesota Press, Minneapolis, MN
McAdam D, McCarthy J D, Zald M N (eds.) 1996 Comparative Perspectives on Social Movements: Political Opportunities, Mobilizing Structures, and Cultural Framings. Cambridge University Press, Cambridge, UK
Melucci A 1985 The symbolic challenge of contemporary movements. Social Research 52(4): 789–816
Offe C 1985 New social movements: Challenging the boundaries of institutional politics. Social Research 52(4): 817–67
Tarrow S 1998 Power in Movement: Social Movements and Contentious Politics, 2nd rev. edn. Cambridge University Press, Cambridge, UK
Tilly C 1995 Popular Contention in Great Britain, 1758–1834. Harvard University Press, Cambridge, MA
Touraine A 1985 An introduction to the study of social movements. Social Research 52(4): 749–87
L. Tilly
Social Movements: Psychological Perspectives
Social movements are sustained actions undertaken by organized groups of people seeking social or political change. The groups involved in social movements are structured, have goals and ideologies, and pursue their goals over time. The psychology of social movements focuses on the motives that lead people to join, work on behalf of, and stay within social movements. One reason people join social movements is the desire to adopt a favorable identity and/or engage in positive actions. Social identity theory notes that people use groups to define themselves, so joining a group is one way to create a favorable identity (Tajfel 1982; see Social Identity, Psychology of). When people identify with a group, they take on the values and lifestyle of that group, seeing themselves as defined through the attributes that define the group. People joining a sports team, for example, will come to think about themselves in terms of their athletic ability, a prototypical attribute of the group. In addition to giving people values or attributes to define themselves, identifying with groups provides people with information about their self-worth. Since people value positive self-worth, they are more likely to identify with and join a group when the group they join has a favorable reputation. Of course, people can also work collectively to better the reputation of their group, as in the gay rights movement, the civil rights movement, or other social movements motivated by the desire to enhance the stature of low status groups (Kelly and Kelly 1994, Simon et al. 1998). Social identity theory thus suggests one motivation for joining social movements: to create a favorable identity and to work on behalf of the values and lifestyle that identity represents. Although this is an important motivation for social movements, this positive motivation has not been the focus of most of the psychological research on social movement participation. Instead, the focus has been on protest movements, which people join in response to perceived injustice. That motivation is explored in several theories, the earliest being the theory of relative deprivation.
1. Relative Deprivation
The factors that shape the generation of the grievances that lead to social movements are first addressed by the psychological literature on relative deprivation. This literature argues that people’s feelings about their social conditions are only loosely connected to their objective situations. Instead of developing directly from people’s objective circumstances, subjective feelings of satisfaction and dissatisfaction are linked to the comparisons that people make of their own outcomes to other outcomes. This work makes the important point that it is the interpretation of grievances that is key to their role in shaping social movements (see Klandermans 1989). One type of comparison involves temporal comparisons of one’s current outcomes to one’s outcomes at some other point in time. Temporal comparisons are the basis for several of the best known models of the psychology underlying the development of social movements (Davies 1962, Gurr 1970). Gurr (1970) distinguishes among several different patterns of relative deprivation, each of which can create the necessary conditions for social movements. The best known pattern is progressive deprivation, which describes a situation in which both prospects and expectations are rising, but expectations are rising higher (Davies’s ‘J-curve’). In this case, the discrepancy between expectations and prospects increases, leading to dissatisfaction and support for social movements. A second type of comparison involves social comparison to the outcomes of others. Again the key point is that people’s satisfaction is not directly linked to the objective quality of their outcomes. Instead, it is comparative. The main question is how people decide to whom to compare the outcomes they receive (Suls and Wills 1991). Studies suggest that when people are comparing their outcomes to standards, they apply principles of entitlement or deservingness (see Justice: Social Psychological Perspectives).
Use of these justice-based principles shows that people’s feelings about outcomes are linked to their assessments of their fairness. The importance of evaluations of outcome fairness (‘distributive justice’) has been compellingly demonstrated in the literature on the distribution of rewards in work settings, where equity is a central principle of justice. This literature shows that people are most satisfied when they receive equitable outcomes (Walster et al. 1978). Justice theories also suggest that people react to the way that outcomes are distributed, that is, to the
process of decision making, and to their treatment during that process (‘procedural justice,’ Thibaut and Walker 1975). Studies demonstrate that people’s assessments of the fairness of procedures shape their satisfaction with their outcomes and with authorities and institutions (Tyler et al. 1997, Tyler and Smith 1997). These findings link people’s motivation to join social movements to their views about the fairness of social outcomes and procedures. In particular, they suggest that people react negatively to experiencing unfair procedures. However, the experience of injustice can potentially lead to either individual or collective actions: people can respond to perceived injustice by acting as an individual, or they can join a group. Only those actions that involve a collective response to injustice are relevant to the issue of social movements.
2. When do People Act Collectively?
Early research on both distributive and procedural justice focused primarily upon the feelings of the individuals who experienced justice or injustice. On this level, there was widespread evidence that justice played a central role in shaping people’s feelings and behaviors. Further, people both reacted to their own experiences of justice or injustice and acted to provide justice to other people when they observed injustice. However, most of this research did not acknowledge that justice can also be conceptualized and responded to on the level of groups, not individuals. People can think about injustice in terms of unfairness to groups and they can respond to injustice on a group level. The perception of group level injustice is the key to the development of social movements. The first crucial insight is that people can think about injustice in terms of unfairness to groups. This insight is contained in the early distinction between egoistical and fraternal deprivation (Runciman 1966). Instead of being concerned with personally receiving fairness, people can judge the fairness received by groups. Psychologists have increasingly recognized that group memberships are an important element in people’s self-definitions (Hogg and Abrams 1988) and this has heightened attention to people’s reactions to the experiences of the groups to which they belong, as well as to their own personal experiences. The key issue is how people interpret experiences, that is, their view about the reason for the injustices they experience or observe. If people feel that the injustice is occurring to someone as an individual person, either themselves or someone else, and is due to their personal actions, they will respond personally. If they feel that the injustice is occurring to someone due to their membership in particular groups, or to all the people who share a common group membership, they will respond collectively.
Social identity theory argues that the way people interpret their experience is determined by how they think about people. In particular, people are influenced by the degree to which they construct their sense of self in terms of the groups to which they belong (Hogg and Abrams 1988). To the degree that people think of themselves or others in personal terms, they interpret experiences as reflecting unique personal characteristics and behaviors, and they think about fairness in personal terms. To the degree that they think of people in terms of the group(s) to which they belong (their ‘social’ self), they interpret experiences as reflecting treatment as a member of groups and think about justice in group terms. Studies confirm that those with strong social selves are more likely to interpret experiences as being shaped by others’ attitudes toward the groups to which they and others belong, and to respond to injustice by engaging in collective behavior (Grant and Brown 1995, Kelly and Kelly 1994). So, thinking about injustice in collective terms leads to acting collectively in response to injustice, for example, by joining social movements. Decisions about whether to respond to injustice individually or collectively are not only shaped by judgments about why injustice is occurring. Choices among possible behavioral responses are also influenced by people’s judgments about the intergroup situation (Ellemers 1993). For example, people are influenced by their assessments of the permeability of group boundaries. If people believe that it is possible for individuals to move from low status groups to higher status groups, they are more likely to act as individuals. If people believe that the group’s boundaries are not permeable, they are more likely to act collectively to raise the status of their group. Interestingly, studies suggest that very few low status group members need to be successful for people to view the group’s boundaries as sufficiently permeable to justify individual, as opposed to collective, action (Wright et al.
1990). People are also influenced by the perceived stability of group status. When they believe that group status can change, people are more likely to act collectively on behalf of the group. People are similarly affected by their views about the legitimacy of the status of existing groups. If people view current social arrangements as illegitimate, they are more likely to engage in collective action to change them (Major 1994). Finally, people are influenced by pragmatic concerns. They respond to their sense of the probable gains and losses associated with various types of actions (Klandermans and Oegema 1987). When people decide that injustice has occurred, their general behavioral reaction is to do nothing. This inaction reflects the real risks associated with confronting powerful others, and an objective recognition of the low likelihood of success. On the other hand, risk is not enough of an explanation for people’s behavior. For example, although the willingness to engage in collective action is influenced by the likelihood of
success (Klandermans 1997), studies of collective action suggest that people often engage in collective actions against injustice even when the likelihood of success is small. Similar factors shape people’s motivation to engage in groups for positive, identity based, reasons. If people feel that their identity is firmly rooted in their group membership (stability), they will act to try to build the stature of their group through collective action. An example is ethnicity, which is often linked to physical appearance. If people feel that their status is not changeable, they are more likely to identify with that group, and to work to improve its status. Of course, such identification can also be strong when physical appearance is not involved. For example, many people identify strongly with being gay, even though they could ‘pass for straight.’ However, when people are members of a low status group they are more strongly tempted to leave the group as an individual and join a higher status group, instead of defining themselves in terms of group membership.
3. Types of Collective Responses to Injustice
Two dimensions distinguish among collective responses to injustice. One dimension is sustained vs. temporary action. The other dimension is normative vs. non-normative. Many of the most dramatic and widely seen collective responses to injustice are temporary events. Riots have this quality. They erupt quickly in response to some instance of injustice that reflects underlying issues of injustice. Riots typically spread rapidly and widely, and end quickly. They are not sustained and have little formal structure, or even a clear agenda beyond responding to an injustice. In contrast, social movements have leadership structures, political agendas, and long-term plans. Such movements may engage in actions such as violent protests, but these actions are typically planned and in the service of long-term goals. The second dimension of collective movements distinguishes normative vs. non-normative responses to injustice. People can work for groups within the context of social norms in the political process to get their candidate elected. They can also step outside of social norms to engage in illegal and even violent social protests.
4. Summary
The psychological perspective on social movements examines the factors that encourage people to join and participate in social movements. Studies suggest that the experience of injustice is key to the development of a personal sense of grievance. When injustice is viewed as being linked to group membership, it is likely to lead to participation in collective actions, such as social movements.
See also: Attitudes and Behavior; Civil Rights Movement, The; Conflict and Conflict Resolution, Social Psychology of; Ethnic and Racial Social Movements; Feminist Movements; Gay/Lesbian Movements; Intergroup Relations, Social Psychology of; Justice: Social Psychological Perspectives; Labor Movements and Gender; Labor Movements, History of; Media and Social Movements; National Liberation Movements; New Religious Movements; Parties/Movements: Extreme Right; Peace Movements; Peace Movements, History of; Right-wing Movements in the United States: Women and Gender; Science and Social Movements; Social Categorization, Psychology of; Social Identity, Psychology of; Social Movements and Gender; Social Movements: Environmental Movements; Social Movements, Geography of; Social Movements, History of: General; Social Movements: Resource Mobilization Theory; Social Movements, Sociology of; Social Psychology; Stigma, Social Psychology of; Youth Movements; Youth Movements, History of
Bibliography
Davies J C 1962 Toward a theory of revolution. American Sociological Review 27: 5–9
Ellemers N 1993 The influence of socio-structural variables on identity management strategies. European Review of Social Psychology 4
Grant P R, Brown R 1995 From ethnocentrism to collective protest. Social Psychology Quarterly 58: 195–212
Gurr T R 1970 Why Men Rebel. Princeton University Press, Princeton, NJ
Hogg M A, Abrams D 1988 Social Identifications: A Social Psychology of Intergroup Relations and Group Processes. Routledge, New York
Kelly C, Kelly J 1994 Who gets involved in collective action? Human Relations 47: 63–88
Klandermans B 1989 Grievance interpretation and success expectations. Social Behavior 4: 113–25
Klandermans B 1997 The Social Psychology of Protest. Blackwell, Cambridge, UK
Klandermans B, Oegema D 1987 Potentials, networks, motivations, and barriers: Steps toward participation in social movements. American Sociological Review 52: 519–31
Major B 1994 From social inequality to personal entitlement: The role of social comparisons, legitimacy appraisals, and group membership. Advances in Experimental Social Psychology 26: 293–355
Runciman W G 1966 Relative Deprivation and Social Justice. University of California Press, Berkeley, CA
Simon B, Loewy M, Stürmer S, Weber U, Freytag P, Habig C, Kampmeier C, Spahlinger P 1998 Collective identification and social movement participation. Journal of Personality and Social Psychology 74: 646–58
Suls J, Wills T A 1991 Social Comparison. Erlbaum, Hillsdale, NJ
Tajfel H 1982 Human Groups and Social Categories. Cambridge University Press, New York
Thibaut J, Walker L 1975 Procedural Justice. Erlbaum, Hillsdale, NJ
Tyler T R, Boeckmann R J, Smith H J, Huo Y J 1997 Social Justice in a Diverse Society. Westview, Boulder, CO
Tyler T R, Smith H J 1997 Social justice and social movements. In: Gilbert D T, Fiske S T, Lindzey G (eds.) Handbook of Social Psychology. Vol. 2, pp. 595–632
Walster E, Walster G W, Berscheid E 1978 Equity: Theory and Research. Allyn and Bacon, Boston
Wright S C, Taylor D M, Moghaddam F M 1990 Responding to membership in a disadvantaged group. Journal of Personality and Social Psychology 58: 994–1003
T. Tyler
Social Movements: Resource Mobilization Theory
Resource mobilization theory (RMT) is premised on the idea that the central factor shaping the rise, development, and outcome of social movements is resources. ‘Resource’ here is taken broadly to mean any social, political, or economic asset or capacity that can contribute to collective action. Thus, the solidarities and cultural outlooks of groups that enable them to act collectively are relevant, as are individual resources, such as communication skills, discretionary time-schedules, and independence from negative constraints, and organizational resources (e.g., leadership, meeting and communication facilities, and an esprit de corps). Drawing on a rational choice approach to collective action, RMT views social movements as collective attempts to bring about change in social institutions and the distribution of socially valued goods. RMT is typically hailed as the dominant paradigm in contemporary social movement research. Some have contended that it is opposed by a rival ‘new social movement’ approach (Dalton and Kuechler 1990), but recent developments (discussed below) have shown that most arguments about the formation of collective identities and the role of values and social networks in collective action can be integrated into RMT. Research informed by RMT has focused on three major issues: (a) the microprocesses giving rise to individual participation; (b) organizational processes shaping mobilization; and (c) the political opportunities that guide social movement development and outcomes.
1. Collective Goods and Individual Participation
Early RMT argued that grievances stemming from structural strains and relative deprivation are secondary or possibly irrelevant to movement participation and that what changes, giving rise to social movements, is the availability of resources (McCarthy and Zald 1987, Jenkins and Perrow 1977). This assumption, however, has been questioned as researchers have shown that grievances in the sense of articulated collective claims about social injustice, including suddenly imposed grievances (Walsh and Warland 1983), are critical to individual participation. Drawing on rational choice assumptions, some point to the utility calculations of participants, arguing that they respond to perceived costs and benefits (Opp 1989), including the incentives of solidarity and self-esteem flowing from acting on principle. A major focus of debate has been Olson’s (1971) theory of collective action, according to which selective incentives (i.e., individual benefits) are critical to participation while collective goods per se create free riding. Arguing against this selective incentive thesis, Klandermans (1995) shows that most free riding is due to ignorance, high perceived costs in terms of work and family obligations, weak integration into interpersonal networks which provide both information and solidary incentives, and the perceived irrelevance of participation. Marwell and Oliver (1993) argue that free riding is an option only when individuals perceive that their contributions to the collective good follow a decelerating production function (i.e., each contribution makes others’ subsequent contributions less worthwhile). Mass mobilization follows an accelerating production function (i.e., each contribution makes the next one more worthwhile). Hence, there may be large start-up costs to collective action but there is no point of diminishing returns. Thus, the key questions are expectations about the number of likely participants and the likelihood that the goal will be achieved.
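The contrast between decelerating and accelerating production functions can be made concrete with a small numerical sketch. This is an illustration only, not Marwell and Oliver's own model: the square-root and quadratic curves, the group size of 100, and all function names are assumptions, chosen simply because the first curve is concave and the second convex.

```python
import math

GROUP_SIZE = 100  # hypothetical number of potential contributors


def decelerating(n, total=GROUP_SIZE):
    """Assumed concave curve: share of the collective good after n contributions."""
    return math.sqrt(n / total)


def accelerating(n, total=GROUP_SIZE):
    """Assumed convex curve: share of the collective good after n contributions."""
    return (n / total) ** 2


def marginal_gains(production, total=GROUP_SIZE):
    """Extra share of the good produced by each successive contribution."""
    return [production(n + 1, total) - production(n, total) for n in range(total)]


dec_gains = marginal_gains(decelerating)
acc_gains = marginal_gains(accelerating)

# Compare what the first and the last contributor add under each curve.
print("decelerating: first", round(dec_gains[0], 4), "last", round(dec_gains[-1], 4))
print("accelerating: first", round(acc_gains[0], 4), "last", round(acc_gains[-1], 4))
```

Under the assumed concave curve the first contributions add the most and later ones add little, which is the situation in which free riding is tempting. Under the assumed convex curve early contributions add almost nothing (the start-up cost) while each later one adds more than the one before it, so returns never diminish.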
RMT argues that collective identity and solidarity stemming from interpersonal networks provide for participation. Marwell and Oliver (1993) contend that organizers (or movement entrepreneurs) pay much of the start-up costs by building solidary networks, developing injustice frames (Snow et al. 1986), and creating collective perceptions of an accelerating production function. Klandermans (1997) shows that both selective and collective incentives matter with some movements depending more on selective incentives and others on collective commitments to movement goals. Thus, the women’s movement and unions rely more on selective incentives while purposive commitments are paramount to the peace movement.
2. Organizational Processes
Zald and McCarthy (1987) argue that contemporary movements are professionalized, led and maintained by paid full-time staff with minimal ‘grassroots’ support. ‘Conscience constituents’ in the sense of sympathetic bystanders, wealthy patrons, and institutional sponsors are the major contributors to these movements. Because these movements ‘speak for’ rather than mobilize the aggrieved, grassroots grievances are seen as secondary if not irrelevant to them. Mass media (including direct mail solicitation and the Internet) have lowered organizing costs, and the development of professional advocacy careers and training schools provides a ready supply of movement entrepreneurs. Thus, there has been a proliferation of professional movements, as in the environmental movement, sections of the women’s movement, the peace and human rights campaigns, and other ‘public interest’ causes (Berry 1997). According to organizational ecology theory, the increased density of social movement organizations (SMOs) creates organizational survival problems and increases SMO mortality (Minkoff 1995). Similarly, issue attention cycles limit the ability of professional movement entrepreneurs to mobilize external support, forcing them to migrate among causes as public attention shifts. This argument, however, has been criticized for overstating the importance of professionalization and misdiagnosing its origins. Many contemporary movements are based indigenously on the voluntary labor of direct beneficiaries who are integrated by loosely structured networks, strong collective identities, and volunteer organizers. Morris (1984) shows that the Southern civil rights movement was centered organizationally in African-American churches and organized by a loosely knit coalition of ministers and college students. While external institutional resources may be central to the professionalization of movements, these contributions are often spurred by indigenous protests. Thus, the civil rights protests mobilized by the more militant groups spurred foundations and government officials to donate to the more moderate SMOs, creating a ‘radical flank’ effect (Haines 1986). Professionalization did not, however, demobilize these movements but rather channeled them into more moderate institutional activities, including the implementation of public policy gains (Jenkins and Eckert 1986). Thus, the professional model is most relevant for post-movement SMOs that have become institutionalized interest associations and ‘public interest’ movements that lack cohesive constituencies, such as the environmental, consumer, and general human rights movements.
3. Political Opportunities
A core rational choice idea is that participants respond to perceived costs and the likelihood of success. Thus, movement development should be shaped by political opportunities. Drawing on Piven and Cloward’s (1977) thesis that elite divisions legitimize movement demands and thus facilitate protest and Tilly’s (1978) arguments about elite-challenger coalitions that lead to revolutionary situations, McAdam (1982) showed that competition between Democratic and Republican Presidential candidates for the African-American vote facilitated the rise of the civil rights movement. Jenkins (1985) showed that a dominant left-center governing coalition removed political obstacles to farm worker unionization and strengthened the institutional sponsors of the United Farm Worker Union, thus giving it more organizing space. Kriesi et al. (1995) show in a comparison of Western Europe that decentralized federal states, left-center governments, and strong alliances with unions facilitate the rise of the environmental and women’s movements, but opportunities mattered little for the more culturally oriented homosexual and counter-cultural movements. Distinguishing between ‘early riser’ and ‘late riser’ movements that develop within a general protest cycle, Tarrow (1998) argues that ‘late risers’ are more successful because they can benefit from the policy precedents and resources created by ‘early riser’ challenges. In a widely cited analysis, Gamson (1975/1990) argues that opportunities along with strategies are central to movement outcomes. Movements that ‘think small’ in terms of narrow incremental gains, use formal organization and thus avoid schisms, have allies, use unruly and violent tactics, and organize during political crisis periods are more likely to secure political acceptance and tangible benefits. By contrast, movements that attempt to overthrow their targets, rely on decentralized informal organization, lack political allies, use conventional tactics, and organize during periods of stability are less successful.
Subsequent research has focused largely on small-scale policy changes, such as the passage of favorable legislation and public expenditure, showing that the resources under movement control (especially disruptive protest) combined with political opportunities are central to bringing about favorable policy outcomes (Jenkins 1985, Amenta 1998). Cress and Snow (2000) show that combinations of favorable strategies and opportunities are critical to the policy victories of homeless movements in US cities. Burstein (1985) argues that favorable public opinion is the key factor in equal employment legislation, with nonviolent protest serving merely as a catalyst for greater public support. However, violent and highly disruptive protest often creates public opinion backlashes, making it more difficult for movements to gain favorable policies and raising the possibility of political repression. It is also important to examine the broader impact of movements on culture and social institutions. Thus, expanding on Gamson’s (1975/1990) typology of movement ‘acceptance’ and ‘tangible gains,’ movements also create new cultural codes and collective identities, such as in the women’s and counter-cultural youth movements. The women’s movement made it more legitimate for women to pursue full-time careers, to treat motherhood as optional, and promoted
changes in social institutions from schools to collegiate sports and family practices that ensured broader social inclusiveness. Movement participation also has profound biographical consequences, creating ‘bridge-burning’ experiences that channel activists into specific adult careers and transform their personal lives. Thus, McAdam (1988) shows that the adult lives of the white ‘summer of 1964’ student activists who went South to aid the civil rights movement were changed dramatically, leading most to embark on careers in the law, academe, and social work that allowed them to act on their political activist beliefs and leading to life course experiences, such as lower marital stability, geographic mobility, and lower income. A major problem in inferring the effects of social movements is distinguishing between authentic movement impacts and ongoing social changes that are contributing simultaneously to both movement mobilization and changes reflecting the proclaimed goals of the movement. Most research is ‘movement-centered’ in that it takes the activities of the movement as central and focuses on the ostensible impact (e.g., public policy changes) of movement activity. It thus neglects the broader context in which movements emerge. Some contend, however, that these changes would have occurred without the movement or that, by working through movement activity, these deeper social changes are more important. Thus, for example, the changes in work and family typically attributed to the women’s movement are also rooted in deeper cultural changes emphasizing individual autonomy and in the structural growth of service sector employment where most women have found work. Thus, while the women’s movement may be central in these changes, evaluating how important it is relative to broader social changes that have also facilitated the women’s movement is complex and difficult. Another complexity is that movement impact is often indirect and unintended.
Thus, the women’s movement encouraged more women to go into professional careers, which reinforced the number of feminist activists and legislators, which in turn strengthened pressures for feminist legislation, and so on. Similarly, social revolutions rarely have the outcomes intended by revolutionary leaders but, nonetheless, may have major unintended impacts on people’s lives and the societies in which they live. Thus, RMT currently is being expanded by research on movement impact as well as by efforts to assess the social and cultural processes involved in mobilization. See also: Action, Collective; Collective Behavior, Sociology of; Leadership in Organizations, Sociology of; Organizations, Sociology of; Political Sociology; Social Movements, Sociology of
Bibliography
Amenta E A 1998 Bold Relief. Princeton University Press, Princeton, NJ
Berry J M 1997 The Interest Group Society, 3rd edn. Longman, New York
Burstein P 1985 Discrimination, Jobs and Politics. University of Chicago Press, Chicago
Cress D M, Snow D A 2000 The outcomes of homeless mobilization. American Journal of Sociology 105: 1063–1104
Dalton R J, Kuechler M 1990 Challenging the Political Order. Oxford University Press, New York
Gamson W A 1975/1990 The Strategy of Social Protest. Dorsey, Homewood, IL
Haines H H 1986 Black Radicals and the Civil Rights Mainstream, 1954–1970. University of Tennessee Press, Knoxville, TN
Jenkins J C 1985 The Politics of Insurgency: The Farm Worker Movement in the 1960s. Columbia University Press, New York
Jenkins J C, Eckert C M 1986 Channeling black insurgency: Elite patronage and the channeling of social protest. American Sociological Review 51: 812–29
Jenkins J C, Perrow C 1977 Insurgency of the powerless: Farm worker movements. American Sociological Review 42: 249–68
Klandermans B 1995 The Social Psychology of Protest. Blackwell, Cambridge, MA
Kriesi H P, Koopmans R, Duyvendak J W, Giugni M G 1995 New Social Movements in Western Europe: A Comparative Analysis. University of Minnesota Press, Minneapolis, MN
Marwell G, Oliver P 1993 The Critical Mass. Cambridge University Press, Cambridge, UK
McAdam D 1982 Political Process and the Development of Black Insurgency, 1930–1970. University of Chicago Press, Chicago
McAdam D 1988 Freedom Summer. Oxford University Press, New York
Minkoff D C 1995 Organizing for Equality. Rutgers University Press, New Brunswick, NJ
Morris A D 1984 The Origins of the Civil Rights Movement. Oxford University Press, Oxford, UK
Olson M 1971 The Logic of Collective Action. Harvard University Press, Cambridge, MA
Opp K D 1989 The Rationality of Political Protest. Westview, Boulder, CO
Piven F F, Cloward R A 1977 Poor People’s Movements. Pantheon, New York
Snow D A, Rochford E B, Worden S K, Benford R D 1986 Frame alignment processes, micromobilization and movement participation. American Sociological Review 51: 464–81
Tarrow S G 1998 Power in Movement. Cambridge University Press, Cambridge, UK
Tilly C 1978 From Mobilization to Revolution. Addison-Wesley, Reading, MA
Walsh E J, Warland R H 1983 Social movement involvement in the wake of a nuclear accident. American Sociological Review 48: 764–80
Zald M N, McCarthy J D 1987 Social Movements in an Organizational Society. Transaction, New Brunswick, NJ
J. C. Jenkins
Social Movements, Sociology of
For the whole set of actions and events subsumed under the heading of social movements, the problem of boundary demarcation is not to be taken lightly.
Social movements unquestionably are, as Hanspeter Kriesi (1988, p. 350) puts it, ‘elusive phenomena with unclear boundaries in time and space.’ Furthermore, the question is not a mere empirical one to be solved, in each case, by deciding when a social movement begins and when it ends, or by circumscribing the social field in which it takes place and evolves. Social movements are also hard to grasp conceptually. First of all, even if a social movement as such cannot be equated with an organized group (or several ones), it has distinctively collective features: in this respect, Heberle (1951) had grounds for defining it ‘as a kind of social collective.’ Second, the scholar must be aware of two pitfalls: on the one hand, it is mistaken to consider a social movement to be a self-perpetuating crowd, for a crowd totally lacks mechanisms for sustaining association between people; on the other hand, it is an unduly narrow view of social movements to conceive them as necessarily linked to class-based actions, although the history of the concept can, as we will see, account for this usage. Finally, drawing the line between social movements and other close phenomena is, as a general rule, difficult, and that is why Turner and Killian ([1957] 1987) characterized these phenomena as ‘quasi movements.’ A classical instance of these borderline cases is to be found in some religious movements, such as messianic and millenarian sects: at the beginning of the twenty-first century, it is generally agreed upon that they both fall within the ambit of social movements when they set up meetings for voicing protest and aim at establishing a new type of social order. What emerges from this view is perhaps the main criterion for defining a social movement, that is, its critical relation to social change. Herbert Blumer (1946, p. 
199) aptly dwelt on this point when he suggested that social movements should be viewed ‘as collective enterprises to establish a new order of life,’ and his concise wording may still be used as a starting point. But social movements may differ both in the direction and the nature of the change aimed at: some act ‘to promote a change,’ some others ‘to resist’ it, in the words of Turner and Killian ([1957] 1987); and the change claimed is sometimes a partial one, and sometimes implies a transformation of the whole social order. A second distinctive feature of social movements is the use of uninstitutionalized means by participants, at least on some occasions and in some situations. A third one is also worth noting: the actions of a social movement are protest-oriented and even contest-oriented; therefore they have political relevance. We are now in a position to state an elementary, although tentative, definition of a social movement: it is ‘a collective enterprise of protest and contention which aims at bringing about changes of varying importance in the social and/or political structure by the frequent but not necessarily exclusive use of uninstitutionalized means’ (Chazel 1992, p. 270).
1. From the Formulation of the Idea to the Development of a Self-contained Field
The term soziale Bewegung was coined by the German scholar Lorenz von Stein ([1850] 1964) in his pioneering study, first published in German as Geschichte der sozialen Bewegung in Frankreich von 1789 bis auf unsere Tage. Rejecting Hegel’s idealism, he did not study socialism and communism as forms of social thought, but regarded them as expressing the strivings of the industrial proletariat toward economic and political power; in his view, they were an essential development of the period, and that is why, for him, the social movement meant the efforts of the rising industrial working class at organizing and sustaining mass action. Although Marx and Engels shared many interests with Stein, they did not themselves use the term. But Stein’s conception remained influential until the beginning of the twentieth century. In Werner Sombart’s ([1896] 1909, p. 2) wording, the social movement is still defined as ‘the conception of all the attempts at emancipation on the part of the proletariat.’ Rudolf Heberle is to be credited with discarding this exclusive identification of the notion with proletarian movements in industrial society: the definition must be enlarged to take account of such important phenomena as peasant movements, Fascism, and National Socialism, or independence uprisings against colonial powers. It is to be noted that Heberle is still under the restraining influence of the traditional conception, when he states that ‘the main criterion of a social movement is [to] aim [at] bring[ing] about fundamental changes in the social order, especially in the basic institutions of property and labor relationships’ (1951, p.
6) or when he draws a somewhat arbitrary distinction between ‘genuine social movements’ and more or less short-lived ‘protest movements.’ Yet Heberle’s work must be acknowledged as a landmark in the field, paving the way for the systematic study of social movements: he also points out their political implications and relevance, and therefore subtitles his book, Social Movements, An Introduction to Political Sociology. However original his work was, Heberle, belonging to a generation of political exiles, was the heir to European social theory, and research in social movements mostly developed along other lines in the USA. Social movements were to be conceived as one major form of collective behavior which Robert Park, who coined the term, defined as ‘the behavior of individuals under the influence of an impulse that is common and collective, an impulse, in other words, that is the result of social interaction’ ([1921] 1969). This notion has broad applicability and, while for Park it denoted rather an approach to social phenomena, it quickly became a label for a distinctive field of investigation. Some valuable studies and surveys of social movements, such as those, already mentioned, of Blumer, Turner, and Killian, have been worked out within this framework. But Neil Smelser’s (1962) Theory of Collective Behavior, which is the most ambitious attempt at building up an elaborate theory, cannot be categorized in such a straightforward way, because he approaches the empirical phenomena subsumed under the field of collective behavior from a quite different theoretical perspective, the Parsonian hierarchy of ‘components of social action.’ By using criteria derived from this conceptual frame of reference, Smelser draws a demarcation line between ‘panics,’ ‘crazes,’ and ‘hostile outbursts’ on the one hand and social movements on the other, while making a further distinction between ‘norm-oriented movements’ and ‘value-oriented movements.’ This latter distinction, because it proceeds from an analytical basis, is more significant than purely descriptive ones, and it may be quite appropriate for differentiating, within a large movement, between a reform-oriented trend and a more radical one. Yet participants in social movements often account for their normative commitments by referring to widespread values. Distinguished as they are by their relation to a distinctive component of social action, these different kinds of collective behavior share common attributes, as Smelser makes clear in his formal characterization. All episodes of collective behavior imply ‘an uninstitutionalized mobilization for action in order to modify one or more kinds of strain on the basis of a generalized [belief]’ (Smelser 1962, p. 71). Two major features of this conception are to be pointed out: the sharp contrast between the uninstitutionalized character of collective behavior and established behavior, and Smelser’s emphasis on beliefs of a very peculiar type, because they entail a ‘process of short-circuiting,’ that is, ‘the jump from extremely high levels of generality to specific, concrete situations,’ (p.
82) and these are the very points which will later on give rise to much controversy. The irony of an unexpected outcome also is worth noticing: students of collective behavior were so keen on bringing out the salient characteristics of social movements that this topic gradually became a teaching and research area in itself. A quite significant instance of this development can be found in Robert Faris’s Handbook of Modern Sociology (1964), in which the chapter on ‘Social Movements,’ written by Lewis Killian, is next to the one on ‘Collective Behavior’ by Ralph Turner.
2. New Emphases in Theory and Research
The turbulent 1960s aroused much interest in social movements, and many scholars did not agree with Smelser’s statement that social movements are, as any other kind of collective behavior, ‘the action of the impatient.’ During this period important developments internal to the social sciences were also under way, with systematic attempts at tackling sociological problems in terms of the economic paradigm. Facing troubled times that they had not predicted, as often happens, and grappling with theoretical puzzles of a new kind, sociologists and political scientists were thus induced to take a fresh and finally inventive approach to social movements. In this process, Mancur Olson’s The Logic of Collective Action (1965) was a seminal step. Olson drew attention to a vexing paradox, unnoticed in traditional theory: ‘it does not follow, because all of the individuals in a group would gain if they achieved their group objective, that they would act to achieve that objective, even if they were all rational and self-interested’ (p. 2); indeed, in very large groups individuals will not as a rule act to obtain a collective good. In Olson’s view, there is only one way of inducing rational actors in such ‘latent’ groups to mobilize to provide a collective good: it consists of an incentive which operates selectively upon some individuals in the group and which, by offering positive inducements to them or by coercing them, leads these individuals to act in the common interest. Actually, Olson’s theoretical importance lies less in ‘selective incentives,’ which have been appraised by several scholars as a quite limited remedy, than in his skill in throwing light on the social dilemma of movement participation itself: mobilization is never to be taken for granted. Thus the true significance of Olson’s book rested in its setting a theoretical puzzle for a new generation of scholars: how and when are social actors able to overcome the dilemma of movement participation? These attempts at solving the dilemma resulted in what is now known as ‘resource mobilization’ theory.
Yet, because there are different emphases within this theoretical approach, we can differentiate three distinctive ways of overcoming the dilemma, all of which ascribe weight to organization. The first one, which may be regarded as classic, has been elaborated by Anthony Oberschall in his book Social Conflict and Social Movements (1973). As Oberschall stresses in his ‘sociological theory of mobilization,’ collective protest is more likely to be present in a collectivity which has a strong organizational base, whether it is of a communal or of an associational kind, and which also is cut off from the rest of society, that is, is ‘segmented.’ A second way of getting over the dilemma was put forward by John McCarthy and Mayer Zald, who coined the convenient label, ‘resource mobilization’: they assert that particular attention must be paid to outside support, funding, and leadership; consequently, they dwell on the prominent part of ‘conscience constituents’ and ‘adherents’ on the one hand, of political entrepreneurs on the other, arguing that they ‘turn Olson on his head’ (1977). It is a theoretical point worth noticing, although it may be observed that Zald
and McCarthy overrate the aggregate import of ‘professional social movements’ during the 1960s. The third way of solving the dilemma perhaps is not so closely connected with resource mobilization theory; but the major questions at issue are the same. Charles Tilly (1978) puts as much emphasis as Oberschall or Zald and McCarthy on organization and interests, yet he stresses the political context in which mobilizations take place. Therefore he calls attention to emerging coalitions, of which a quite significant kind brings together ‘members’ of the ‘polity’ and ‘challengers’ aiming at entering it; and he takes account of the role of the government in this process, which can facilitate as well as repress an emergent mobilization. Taken as a whole, this stream of thinking is to be considered ‘revisionist,’ as William Gamson aptly called it in his ‘Introduction’ to the collected essays by Zald and McCarthy (1987), and it is so on two basic points. First, the advocates of the new approach assume that participation in social movements is to be explained in terms of rationality, as opposed to the prevailing view, which was exemplified by Smelser’s emphasis on ‘generalized beliefs.’ Second, they are anxious to downgrade the psychosocial approach to social movements, as it was used and developed by collective behavior theorists. When at a later date Doug McAdam (1982) boldly asserts that social movements must be regarded as political phenomena, no longer as psychosocial ones, he is in full agreement with this line of thought. The theoretical import of McAdam’s (1982) book has to be pointed out. In many respects he brings resource mobilization theory to completion, by joining the strand derived from Oberschall (although the relative weight of ‘indigenous organizational strength’ vs. outside support in the civil rights movement is a bone of contention between them), and the one derived from Tilly, whose term, ‘political process,’ he took up.
But on an important point, which resource mobilization theorists did not tackle, he makes a significant advance by drawing attention to potential participants’ cognitive activity. Thus McAdam paves the way for attempts at multidimensional synthesis, in which, somewhat ironically, the psychosocial viewpoint will be awarded due consideration. For both of these achievements, his book can be regarded as a landmark.
3. Contemporary Advances and Issues
Any comment on the latest trends is open to question and correction, but it is worth trying to outline the main line of development and, as already mentioned, the keynote of the times seems to lie in attempts at multidimensional synthesis. Yet the path has gradually to be mapped out, and there are heuristic blind alleys, which are often linked with some overemphasis on one analytical dimension or empirical type of social movements. Truly, social movements are in many respects significant political phenomena and, as mentioned, Tilly and McAdam opened up new vistas. But it does not follow that the whole approach must center on the concept of ‘political opportunity structure,’ which then is unduly stretched. As William Gamson and David Meyer aptly put it: the concept of political opportunity structure is ... in danger of becoming a sponge that soaks up virtually every aspect of the social movement environment ... Used to explain so much, it may ultimately explain nothing at all (1996, p. 275).
It is well known that with new waves of contention different problems are to be put on the research agenda. Thus the ‘new social movement’ school was quite justified in investigating the distinctive features of this type of movement. But the wrong inference that these movements were unparalleled was not always resisted: ‘new social movements’ perhaps are not so new as they appeared to be. Nevertheless this approach, by stressing the search for a collective identity, revives interest in the cultural dimension of social movements, as opposed to the resource mobilization stream of thinking, which had little regard for it; therefore the reconciliation of these quite divergent viewpoints is urgently needed. The abundance of books on social movements during the 1990s, such as those edited by Aldon Morris and Carol Mueller (1992), Hank Johnston and Bert Klandermans (1995), McAdam et al. (1996), or the individual works by Sidney Tarrow (1994) and Klandermans (1997), along with the publishing of a specialized journal, Mobilization, first testifies to the institutionalization of an academic field. But their theoretical significance lies mainly in systematic attempts at integrating mobilizing structures and organizations, political opportunities, and ‘framing’ processes. A true synthesis seems to be under way, although it is not an easy job to articulate distinctive dimensions with one another. A last point is worth noting: the clarifying of cognitive and discursive processes, whether in terms of causal attribution, as by Myra Ferree and Frederick Miller (1985), or in terms of ‘frame alignment’ and ‘frame resonance,’ as by David Snow and his associates (1986, 1988), is part and parcel of this development.
This new interest in collective beliefs can be construed as Smelser’s revenge, but it is an ambiguous one: as opposed to ‘generalized beliefs,’ the beliefs which are now to be investigated are reasonable ones and can be interpreted in terms of a rational, though not simply rationalistic, model.
See also: Action, Collective; Civil Rights Movement, The; Collective Behavior, Sociology of; Conflict Sociology; Feminist Movements; Gay/Lesbian Movements; Identity Movements; Participation: Political; Peace Movements, History of; Revolutions, Sociology of; Revolutions, Theories of; Social Change: Types; Social Movements: Environmental Movements; Social Movements, Geography of; Social Movements, History of: General; Social Movements: Psychological Perspectives; Social Movements: Resource Mobilization Theory; Youth Movements, History of
Bibliography
Blumer H 1946 Collective behavior. In: Lee A M (ed.) New Outline of the Principles of Sociology. Barnes and Noble, New York, pp. 166–222
Chazel F 1992 Mouvements sociaux. In: Boudon R (ed.) Traité de sociologie. Presses Universitaires de France, Paris, pp. 263–312
Chazel F 1997 Les ajustements cognitifs dans les mobilisations collectives. In: Boudon R, Bouvier A, Chazel F (eds.) Cognition et sciences sociales. Presses Universitaires de France, Paris, pp. 193–206
Faris R E L (ed.) 1964 Handbook of Modern Sociology. Rand McNally, Chicago
Ferree M M, Miller F D 1985 Mobilization and Meaning: Toward an integration of social psychological and resource perspectives on social movements. Sociological Inquiry 55: 38–61
Gamson W, Meyer D 1996 Framing political opportunity. In: McAdam D, McCarthy J D, Zald M N (eds.) Comparative Perspectives on Social Movements: Political Opportunities, Mobilizing Structures and Cultural Framings. Cambridge University Press, Cambridge, UK, pp. 275–90
Gusfield J R 1969 Symbolic Crusade. University of Illinois Press, Urbana, IL
Heberle R 1951 Social Movements: An Introduction to Political Sociology. Appleton-Century-Crofts, New York
Hirschman A O 1970 Exit, Voice and Loyalty. Harvard University Press, Cambridge, MA
Johnston H, Klandermans B (eds.) 1995 Culture and Social Movements. University of Minnesota Press, Minneapolis, MN
Killian L M 1964 Social movements. In: Faris R E L (ed.) Handbook of Modern Sociology. Rand McNally, Chicago, pp. 426–55
Klandermans B 1997 The Social Psychology of Protest. Blackwell, Oxford
Kriesi H 1988 The interdependence of structure and action: Some reflections on the state of the art. In: Klandermans B, Kriesi H, Tarrow S (eds.) From Structure to Action: Comparing Social Movement Research across Cultures. JAI Press, Greenwich, CT, pp. 349–68
McAdam D 1982 Political Process and the Development of Black Insurgency. University of Chicago Press, Chicago
McAdam D, McCarthy J D, Zald M N (eds.) 1996 Comparative Perspectives on Social Movements: Political Opportunities, Mobilizing Structures and Cultural Framings. Cambridge University Press, Cambridge, UK
McCarthy J D, Zald M N 1987 Resource mobilization and social movement: a partial theory. In: Zald M N, McCarthy J D (eds.) Social Movements in an Organizational Society: Collected Essays. Transaction Books, New Brunswick, NJ
Morris A D, Mueller C M (eds.) 1992 Frontiers in Social Movement Theory. Yale University Press, New Haven, CT
Oberschall A 1973 Social Conflict and Social Movements. Prentice-Hall, Englewood Cliffs, NJ
Oberschall A 1993 Social Movements: Ideologies, Interests and Identities. Transaction Books, New Brunswick, NJ
Olson M 1965 The Logic of Collective Action: Public Goods and the Theory of Groups. Harvard University Press, Cambridge, MA
Park R [1921] 1969 Collective behavior. In: Park R, Burgess E W (eds.) Introduction to the Science of Sociology. Greenwood Press, New York, pp. 865–78
Smelser N J 1962 Theory of Collective Behavior. Free Press, New York
Snow D, Benford R D 1988 Ideology, frame resonance and participant mobilization. In: Klandermans B, Kriesi H, Tarrow S (eds.) From Structure to Action: Comparing Social Movement Research across Cultures. JAI Press, Greenwich, CT, pp. 197–217
Snow D, Rochford E B, Worden S K, Benford R D 1986 Frame alignment processes, micro-mobilization and movement participation. American Sociological Review 51: 464–81
Sombart W [1896] 1909 Socialism and the Social Movement. Dent, London
Stein L von [1850] 1964 The History of the Social Movement in France 1789–1850 [trans. K. Mengelberg]. Bedminster Press, Totowa, NJ
Tarrow S 1994 Power in Movement: Social Movements and Contentious Politics. Cambridge University Press, Cambridge, UK
Tilly C 1978 From Mobilization to Revolution. Addison-Wesley, Reading, MA
Touraine A 1988 Return of the Actor: Social Theory in Postindustrial Society. University of Minnesota Press, Minneapolis, MN
Turner R H, Killian L M [1957] 1987 Collective Behavior, 3rd edn. Prentice-Hall, Englewood Cliffs, NJ
Zald M N, McCarthy J D (eds.) 1987 Social Movements in an Organizational Society: Collected Essays. Transaction Books, New Brunswick, NJ
F. Chazel
Social Network Models: Statistical

Late twentieth-century developments in statistical models for social networks reflect an increasing theoretical focus in the social and behavioral sciences on the interdependence of social actors in dynamic, network-based social settings. As a result, growing importance has been accorded to the problem of modeling the dynamic and complex interdependencies among network ties and the actions of the individuals whom they link. The early focus of statistical network modeling on the mathematical and statistical properties of Bernoulli and dyad-independent random graph distributions has been replaced by efforts to construct theoretically and empirically plausible parametric models for structural network phenomena and their changes over time. The models in this article are based on graphs in which the nodes are individual actors, 'linked' by lines representing relational ties (see Graphical Models: Overview).
1. Statistical Models for Social Networks

1.1 Notation

In most of the literature on statistical models for networks, the set $N = \{1, 2, \ldots, n\}$ of network nodes is regarded as fixed and the edges, or ties, between nodes are assumed to be random. The tie linking node $i$ to node $j$ ($i, j \in N$) may be denoted by the random variable $X_{ij}$. In the simplest case of binary-valued random variables, $X_{ij}$ takes the value 1 if the tie is present and 0 otherwise. Other cases of interest include: (a) valued networks, where $X_{ij}$ is assumed to take values in the set $\{0, 1, \ldots, C-1\}$; (b) multiple relational or multivariate networks, where the variable $X_{ijk}$ represents the possible tie of type $k$ from node $i$ to node $j$ (with $k \in R = \{1, 2, \ldots, r\}$, a fixed set of types of tie); and (c) time-dependent networks, where $X_{ijt}$ represents the tie from node $i$ to node $j$ at time $t$. In each case, network ties may be directed (e.g., $X_{ij}$ and $X_{ji}$ are distinct random variables) or nondirected (e.g., $X_{ij}$ and $X_{ji}$ are not distinguished).

The $n \times n$ array $X = [X_{ij}]$ of random variables can be regarded as the adjacency matrix of a random (directed) graph on $N$; the state space of all possible realizations of these arrays may be denoted by $\Omega_n$. The array $x = [x_{ij}]$ denotes a realization of $X$, with $x_{ij} = 1$ if there is an observed tie from node $i$ to node $j$, and $x_{ij} = 0$ otherwise. In some cases, models may also refer to variables measured on actor attributes: for instance, the $m$th attribute $Z_i^{[m]}$ of actor $i$ gives rise to a random vector $Z^{[m]}$ with realization $z^{[m]}$.

The adjacency matrix of an illustrative nondirected binary network is shown in Table 1. The data are from a study of interorganizational networks amongst mental health agencies conducted by Morrissey et al. (1994). The nodes of the network are 37 mental health agencies in one of the US cities in the Morrissey study; an edge links two agencies if they engage in regular and mutual exchange of information. The graphical representation of these data is shown in Fig. 1. We use this network to illustrate several of the distributions and models described below.
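The notation above can be made concrete with a short sketch. It assumes a hypothetical 5-node nondirected network (not the Table 1 data) and illustrative function names; it builds the adjacency matrix $x = [x_{ij}]$ and derives a few descriptive statistics from it, including the count of triads by number of edges that is used later when null models are discussed.

```python
from itertools import combinations


def make_adjacency(n, edges):
    """Build the n x n adjacency matrix x = [x_ij] of a nondirected binary network."""
    x = [[0] * n for _ in range(n)]
    for i, j in edges:
        if i == j:
            raise ValueError("self-ties are excluded")
        x[i][j] = 1
        x[j][i] = 1  # nondirected: x_ij and x_ji are not distinguished
    return x


def edge_count(x):
    """Number of ties, counting each unordered pair of nodes once."""
    return sum(x[i][j] for i, j in combinations(range(len(x)), 2))


def degrees(x):
    """Row sums of x: the number of ties in which each node takes part."""
    return [sum(row) for row in x]


def triad_census(x):
    """counts[e] = number of unordered node triples containing exactly e ties."""
    counts = [0, 0, 0, 0]
    for i, j, k in combinations(range(len(x)), 3):
        counts[x[i][j] + x[i][k] + x[j][k]] += 1
    return counts


# Hypothetical 5-node network; nodes are indexed from 0 here rather than from 1.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
x = make_adjacency(5, toy_edges)
print(edge_count(x))    # 5
print(degrees(x))       # [2, 2, 3, 2, 1]
print(triad_census(x))  # [0, 6, 3, 1]
```

A directed network would simply drop the symmetric assignment, so that $x_{ij}$ and $x_{ji}$ are stored separately.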
1.2 Bernoulli Graphs

A simple statistical model for a (directed) graph assumes a Bernoulli distribution, in which each edge, or tie, is statistically independent of all others and governed by a theoretical probability $P_{ij}$. In addition to edge independence, simplified versions also assume equal probabilities across ties; other versions allow the probabilities to depend on structural parameters (Frank and Nowicki 1993). These distributions have often been used as models (Bollobás 1985), but are of questionable utility due to the independence assumption. Bernoulli graph distributions are described at length by Erdős and Rényi (1960) and Frank (1981).

1.3 Dyadic Structure in Networks

Statistical models for social network phenomena have been developed from their edge-independent beginnings in a number of major ways. In one important series of developments, the $p_1$ model introduced by Holland and Leinhardt (1981) and developed by Fienberg and Wasserman (1981) recognized the theoretical and empirical importance of dyadic structure in social networks, that is, of the interdependence of the variables $X_{ij}$ and $X_{ji}$. This class of Bernoulli dyad distributions, and their generalization to valued, multivariate, and time-dependent forms, gave parametric expression to ideas of reciprocity and exchange in dyads and their development over time. The model assumes that each dyad $(X_{ij}, X_{ji})$ is independent of every other and, in a commonly constrained form, specifies:

$$P(X = x) = \exp\Bigl\{ \sum_{i<j} \lambda_{ij} + \theta \sum_{i \neq j} x_{ij} + \rho \sum_{i<j} x_{ij} x_{ji} + \sum_{i} \alpha_i \sum_{j \neq i} x_{ij} + \sum_{j} \beta_j \sum_{i \neq j} x_{ij} \Bigr\} \tag{1}$$

where $\theta$ is a density parameter; $\rho$ is a reciprocity parameter; the parameters $\alpha_i$ and $\beta_j$ reflect individual differences in expansiveness and popularity; and $\lambda_{ij}$ ensures that probabilities for each dyad sum to 1. This model is log-linear (see Multivariate Analysis: Discrete Variables (Loglinear Models)).

An important elaboration of the $p_1$ model permits the dependence of density, popularity, and expansiveness parameters on a partition of the nodes into subsets that can either be specified a priori (e.g., from hypotheses about social position) or can be identified a posteriori from observed patterns of node interrelationships (e.g., Wasserman and Faust 1994). The resulting stochastic blockmodels represent hypotheses about the interdependence of social positions and the patterning of network ties (as originally elaborated in the notion of blockmodel by White et al. 1976). A potentially valuable variant of these models regards the allocation of nodes to blocks as a latent classification; Snijders and Nowicki (1997) have provided Bayesian estimation methods for the case of two latent classes.

Another valuable extension of the $p_1$ model is the so-called $p_2$ model (Lazega and van Duijn 1997). Like the $p_1$ model, the $p_2$ model assumes dyad independence, but it adds the possibility of arbitrary node and dyad covariates for the parameters of the $p_1$ model, and models the unexplained parts of the popularity and expansiveness parameters as random rather than fixed effects.

1.4 Null Models for Networks

Despite the useful parameterization of dyadic effects associated with the $p_1$ model and its generalizations, the assumption of dyadic independence has attracted
sustained criticism. Thus, another series of developments has been motivated by the problem of assessing the degree and nature of departures from simple structural assumptions like dyadic independence. Following a general approach exemplified by early work of Rapoport (e.g., Rapoport 1949), a number of conditional uniform random graph distributions were introduced as null models for exploring the structural features of social networks (see Wasserman and Faust 1994 for a historical account). These distributions, denoted by $U \mid Q$, are defined over subsets $Q$ of the state space $\Omega_n$ of directed graphs, and assign equal probability to each member of $Q$. The subset $Q$ is usually chosen to have some specified set of properties (e.g., a fixed number of mutual, asymmetric, and null dyads, as in the $U \mid MAN$ distribution of Holland and Leinhardt 1975). When $Q$ is equal to $\Omega_n$ the distribution is referred to as the uniform (di)graph distribution, and is equivalent to a Bernoulli distribution with homogeneous tie probabilities. Enumeration of the members of $Q$ and simulation of $U \mid Q$ is often straightforward (e.g., see Wasserman and Pattison, in press), although certain cases, such as the distribution that is conditional on the indegree $\sum_j x_{ji}$ and outdegree $\sum_j x_{ij}$ of each node $i$ in the network, require more complicated approaches (Snijders 1991, Wasserman 1977).

Figure 1. Graphical representation of interorganizational networks amongst mental health agencies in one US city (data from Morrissey et al. (1994)).

A typical application of these distributions is to assess whether the occurrence of certain higher-order (e.g., triadic) features in an observed network is unusual, given the assumption that the data arose from a uniform distribution that is conditional on plausible lower-order (e.g., dyadic) features (e.g., Holland and Leinhardt 1975). For example, the network displayed in Table 1 has 84 edges, and 333 and 69 triads with exactly 2 and 3 edges, respectively. If $Q$ denotes the subset of $\Omega_{37}$ with 84 edges, the expected numbers of triads of each kind are 250.1 and 50.8, and the observed values lie at the 0.003 and 0.005 percentiles of the $U \mid x_{++} = 84$ distribution, respectively.

This general approach has also been developed for the analysis of multivariate social networks. The best-known example of this approach is probably the quadratic assignment procedure or model (QAP) for networks (Hubert and Baker 1978). In this case, the association between two graphs defined on the same set of nodes is assessed using a uniform multivariate graph distribution that is conditional on the unlabelled graph structure of each univariate graph. In a more general framework for the evaluation of structural properties in multivariate graphs, Pattison et al. (in press) also introduced various multigraph distributions associated with stochastic blockmodel hypotheses.
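The null-model logic of this section can be sketched as a small Monte Carlo test. The example below is only an illustration under stated assumptions: it uses a hypothetical 8-node nondirected network rather than the Table 1 data, conditions only on the number of edges (the $U \mid x_{++} = m$ distribution for nondirected graphs), and uses the triangle count as the higher-order feature. The function names are invented for this sketch.

```python
import random
from itertools import combinations


def count_triangles(n, edges):
    """Number of triads in which all three ties are present."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    # Each triangle is found once from each of its three edges.
    return sum(len(adj[i] & adj[j]) for i, j in edges) // 3


def upper_tail_proportion(n, observed_edges, draws=2000, seed=0):
    """Compare the observed triangle count with draws from U | x++ = m."""
    rng = random.Random(seed)
    pairs = list(combinations(range(n), 2))  # all possible nondirected ties
    m = len(observed_edges)
    observed = count_triangles(n, observed_edges)
    at_least = 0
    for _ in range(draws):
        simulated = rng.sample(pairs, m)  # uniform over graphs with m edges
        if count_triangles(n, simulated) >= observed:
            at_least += 1
    return observed, at_least / draws


# Hypothetical data: two disjoint 4-cliques on 8 nodes (12 edges, 8 triangles).
toy = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2))
observed, proportion = upper_tail_proportion(8, toy)
print(observed, proportion)
```

A small proportion indicates that the observed network contains more triangles than is typical of graphs with the same number of edges. Conditioning on richer lower-order features, such as the degree of every node, would require the more complicated sampling approaches cited above.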
1.5 Extra-dyadic Local Structure in Networks

A significant step in the development of parametric statistical models for social networks was taken by Frank and Strauss (1986) with the introduction of the class of Markov random graphs. This class of models permitted the parameterization of extra-dyadic local structural forms and so allowed a more explicit link between some important network conceptualizations and statistical network models. Frank and Strauss (1986) adapted research originally developed for modeling spatial dependencies (see Spatial Statistical Methods) among observations to graphs. They observed that the Hammersley–Clifford theorem provides a general probability distribution for $X$ from a specification of which pairs $(X_{ij}, X_{kl})$ of edge random variables are conditionally dependent, given the values of all other random variables (see Graphical Models: Overview). Define a dependence graph $D$ with node set $N(D) = \{X_{ij} : i, j \in N, i \neq j\}$ and edge set $E(D) = \{(X_{ij}, X_{kl})$ : $X_{ij}$ and $X_{kl}$ are assumed to be conditionally dependent, given the rest of $X\}$. Frank and Strauss used $D$ to obtain a model for $\Pr(X = x)$, denoted $p^*$ by Wasserman and Pattison (1996), in terms of parameters and substructures corresponding to cliques of $D$. The model has the form

$$\Pr(X = x) = p^*(x) = (1/c) \exp\Bigl\{ \sum_{P \subseteq N(D)} \alpha_P z_P(x) \Bigr\} \tag{2}$$

where:
(a) the summation is over all cliques $P$ of $D$ (with a clique of $D$ defined as a nonempty subset $P$ of $N(D)$ such that $|P| = 1$ or $(X_{ij}, X_{kl}) \in E(D)$ for all $X_{ij}, X_{kl} \in P$);
(b) $z_P(x) = \prod_{X_{ij} \in P} x_{ij}$ is the (observed) network statistic corresponding to the clique $P$ of $D$; and
(c) $c = \sum_x \exp\{\sum_P \alpha_P z_P(x)\}$ is a normalizing quantity.

The specification of a dependence graph raises a deep theoretical issue for network analysis: what constitutes an appropriate set of dependence assumptions? Frank and Strauss (1986) proposed a Markov assumption, in which $(X_{ij}, X_{kl}) \in E(D)$ whenever $\{i, j\} \cap \{k, l\} \neq \emptyset$. This assumption implies that the occurrence of a network tie from one node to another
is conditionally dependent on the presence or absence of other ties in a local neighborhood of the tie. A Markovian local neighborhood for $X_{ij}$ comprises all possible ties involving $i$ and/or $j$. Any network tie is seen both as dependent on its neighboring ties and as participating in the neighborhood of other ties, so the resulting model embodies the assumption that social networks have a local self-organizing quality. In the case of Markovian neighborhoods, cliques in the dependence graph correspond to triadic and star-like network configurations. A homogeneity assumption, that model parameters do not depend on the specific identities of nodes but only on the network configurations to which they correspond, yields a model that expresses the probability of the network in terms of propensities for subgraphs of triadic and star-like structures to occur. For example, a homogeneous Markov model for the network of Table 1 with parameters corresponding to edges, 2-stars, and triangles appears to be a substantial improvement in fit over a Bernoulli model with just an edge parameter, since the mean absolute residual for fitted edge probabilities drops from 0.220 to 0.175. Pseudolikelihood estimates for model parameters (Wasserman and Pattison 1996) are −3.31, 0.056, and 0.616 for edges, 2-stars, and triangles, respectively, suggesting that graphs with a large number of triangles are relatively more likely in the assumed homogeneous Markov $p^*$ distribution than those with few.

$p^*$ random graph models permit the parameterization of many important ideas about local structure in univariate social networks, including transitivity, local clustering, degree variability, and centralization. Valued (Robins et al. 1999) and multivariate (Pattison and Wasserman 1999) generalizations also lead to parameterizations of substantively interesting multirelational concepts, such as those associated with balance and clusterability, generalized transitivity and exchange, and the strength of weak ties.
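To make the homogeneous Markov model more tangible, the sketch below computes, for a nondirected network, how the edge, 2-star, and triangle counts change when a single tie is switched on, and the conditional probability of that tie given the rest of the graph. Under a model of form (2), the conditional log-odds of a tie equal the parameter-weighted sum of these change statistics, which is also why pseudolikelihood estimation can be carried out as a logistic regression of the tie indicators on them. The 5-node network and the function names are hypothetical; the parameter values are the estimates quoted above and are used purely for illustration.

```python
import math


def build_neighbors(n, edges):
    """Represent a nondirected graph as a list of neighbor sets."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def change_statistics(adj, i, j):
    """Increase in (edges, 2-stars, triangles) when tie i-j is switched from 0 to 1."""
    deg_i = len(adj[i] - {j})      # ties of i other than the tie to j
    deg_j = len(adj[j] - {i})      # ties of j other than the tie to i
    shared = len(adj[i] & adj[j])  # partners common to i and j
    return (1, deg_i + deg_j, shared)


def conditional_tie_probability(adj, i, j, theta):
    """P(x_ij = 1 | all other ties) under the homogeneous Markov model."""
    logit = sum(t * d for t, d in zip(theta, change_statistics(adj, i, j)))
    return 1.0 / (1.0 + math.exp(-logit))


THETA = (-3.31, 0.056, 0.616)  # edge, 2-star, triangle estimates quoted in the text
adj = build_neighbors(5, [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)])
print(change_statistics(adj, 0, 1))  # (1, 2, 1)
print(change_statistics(adj, 0, 4))  # (1, 3, 0)
print(conditional_tie_probability(adj, 0, 1, THETA))
print(conditional_tie_probability(adj, 0, 4, THETA))
```

With a positive triangle parameter, the tie between nodes 0 and 1, which would close a triangle, receives a higher conditional probability than the tie between nodes 0 and 4, which would not.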
1.6 Prospects for Structural Models

The $p^*$ family of random graph models poses many interesting questions, both substantive and statistical. A major empirical question is the sufficiency of these local structural descriptions: do they suffice, or do they need to accommodate other characterizations of local dependence, more global considerations, other types of constraints (e.g., temporal and spatial), or variability in local structural dependencies? On the statistical side, major questions are raised by the complexity and rich parameterization of models that are deemed to be theoretically plausible. For instance, does a Markov chain Monte Carlo (see Markov Chain Monte Carlo Methods) maximum likelihood approach to parameter estimation offer a viable alternative to approximate methods such as pseudolikelihood estimation (Strauss and Ikeda 1990)? How can these distributions best be simulated, and what are the convergence properties of various approaches, such as the Metropolis-Hastings algorithm?
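One way to approach the simulation question is a tie-toggling Metropolis sampler, sketched below for a nondirected homogeneous Markov graph with edge, 2-star, and triangle parameters. This is a minimal illustration under stated assumptions, not a recommended estimation procedure: the proposal (pick one dyad uniformly at random and flip it), the starting graph, the number of steps, and the parameter values are all choices made for this example, and nothing here addresses the convergence questions raised above.

```python
import math
import random


def change_statistics(adj, i, j):
    """Increase in (edges, 2-stars, triangles) when tie i-j is switched on."""
    deg_i = len(adj[i] - {j})
    deg_j = len(adj[j] - {i})
    return (1, deg_i + deg_j, len(adj[i] & adj[j]))


def metropolis_sample(n, theta, steps, seed=0):
    """Run a tie-toggling Metropolis chain; return the final graph as neighbor sets."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]  # start from the empty graph
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)  # propose toggling one dyad at random
        log_ratio = sum(t * d for t, d in zip(theta, change_statistics(adj, i, j)))
        present = j in adj[i]
        if present:
            log_ratio = -log_ratio  # the proposal deletes the tie
        # Symmetric proposal, so accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            if present:
                adj[i].discard(j)
                adj[j].discard(i)
            else:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def edge_count(adj):
    return sum(len(neighbors) for neighbors in adj) // 2


# Hypothetical parameter values: edge parameter only, 2-star and triangle set to 0.
sparse = metropolis_sample(10, (-3.0, 0.0, 0.0), steps=5000, seed=1)
dense = metropolis_sample(10, (3.0, 0.0, 0.0), steps=5000, seed=1)
print(edge_count(sparse), edge_count(dense))
```

With the 2-star and triangle parameters set to zero the chain targets a Bernoulli graph, which gives a simple sanity check: a strongly negative edge parameter should yield a sparse graph and a strongly positive one a dense graph.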
2. Dynamic Models

Empirically and theoretically defensible parametric models for networks constitute one challenge for statistical modeling, but an arguably more significant challenge is to develop models for the emergence of network phenomena, including the evolution of networks and the unfolding of individual actions (e.g., voting, attitude change, decision-making) and interpersonal transactions (e.g., patterns of communication or interpersonal exchange) in the context of longer-standing relational ties. Early attempts to model the evolution of networks in either discrete or continuous time assumed dyad independence and Markov processes in time. A step towards continuous time Markov chain models for network evolution that relaxes the assumption of dyad independence has been taken by Snijders (1996). These models incorporate both random change and change arising from individuals' attempts to optimize features of their local network environment (such as the extent to which reciprocity and balance prevail) and use the computer package SIENA for fitting. A tension function is proposed to represent the function that actors wish to minimize, and Robbins–Monro procedures are used to estimate free parameters. The approach illustrates the potentially valuable role of simulation techniques for models that make empirically plausible assumptions; clearly, such methods provide a promising focus for future development.

The value of simulation has also been demonstrated by Watts and Strogatz (1998) and Watts (1999) in their analysis of the so-called small-world phenomenon (the tendency for individuals to be linked by short acquaintanceship network paths). They considered a probabilistic model for network change in which the probability of a new tie was determined, in part, by existing patterns of neighboring ties (a clustering component) and, in part, by a small random component. They used simulations to establish that only a small random component was needed for the simulated network to be likely to possess the small-world property.
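The small-world result can be illustrated with a simulation sketch. The code below does not implement the tie-formation model described above; as a simpler stand-in it uses the ring-lattice rewiring construction associated with Watts and Strogatz (1998), in which a highly clustered lattice plays the role of the clustering component and a rewiring probability `p` plays the role of the random component. The network size, neighborhood width, and value of `p` are hypothetical choices for this example.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Tie each node to its k nearest neighbors on each side of a ring."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewired_lattice(n, k, p, seed=0):
    """With probability p, replace each lattice tie i-j by a tie from i to a random node."""
    rng = random.Random(seed)
    adj = ring_lattice(n, k)
    for step in range(1, k + 1):
        for i in range(n):
            j = (i + step) % n
            if j in adj[i] and rng.random() < p:
                options = [v for v in range(n) if v != i and v not in adj[i]]
                if options:
                    new = rng.choice(options)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (breadth-first search)."""
    total, pairs = 0, 0
    for source in range(len(adj)):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def average_clustering(adj):
    """Mean, over nodes with at least two ties, of the share of tied neighbor pairs."""
    values = []
    for neighbors in adj:
        k = len(neighbors)
        if k < 2:
            continue
        links = sum(1 for u in neighbors for v in neighbors if u < v and v in adj[u])
        values.append(links / (k * (k - 1) / 2))
    return sum(values) / len(values)


for p in (0.0, 0.1):
    g = rewired_lattice(100, 3, p, seed=1)
    print(p, round(average_path_length(g), 2), round(average_clustering(g), 2))
```

Comparing the two printed rows shows the pattern the text describes: a modest amount of random rewiring shortens typical path lengths considerably while leaving much of the local clustering in place.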
3. Models Bridging Analytic Levels

A recurring theme among social theorists is the need for models that integrate the dynamic, interdependent, interpersonal, and cultural contexts in which social action occurs. The network modeling challenge posed by such framing of the nature of social processes is the development of models that represent interdependencies across several analytic categories (e.g., individuals, relational ties, settings, groups). Snijders' (1996) model for network change represents one step in this direction; the class of social influence models represents another (Friedkin 1998). These models are exemplified by the network effect model for the $m$th attribute variable:

$$z^{[m]} = \alpha x z^{[m]} + y\beta + \varepsilon \tag{3}$$

where $\alpha$ is a network effects parameter, $y$ is a matrix of exogenous attribute variables with corresponding parameter vector $\beta$, and $\varepsilon$ is a vector of residuals.

A further step in the direction of modeling interdependent node attributes and network ties has been to generalize the dependence graph approach to $p^*$ network models (Sect. 1.5) so as to include both node attribute variables ($Z^{[m]}$) and relational variables ($X$). This step is facilitated if directed dependencies are permitted in the dependence graph (i.e., dependencies in which one random variable is assumed to be conditionally dependent on a second, but the second not on the first). Robins et al. (in press) drew on graphical modeling techniques to construct models with directed dependencies among network ties and individual attributes. If the dependencies are directed from network ties to node attributes, models for social influence are obtained; with dependencies oriented from node attributes to network ties, social selection models result. Indeed, this generalization introduces substantial flexibility in model construction, with potential applications to joint influence and selection models, discrete time models for network change, network path models, and group-level attribute models.

The dependence approach of Sect. 1.5 has also been used to construct principled models for more general multiway relational data structures representing interdependent analytic categories. Examples to date have included bipartite (Faust and Skvoretz 1999) and k-partite (Pattison et al. 1998) graphs and illustrate how this general distribution has the capacity for a systematic and coherent approach to a variety of major network modeling challenges.

4. Conclusion
In addition to some of the specific future prospects already noted, a more general expectation is that the current classes of models are likely to be the foundation of substantial empirical and model-building activity in the early twenty-first century. From their early beginnings, statistical approaches to network modeling have now reached a point where sustained interaction with empirical research is likely to allow the development of greatly improved models for processes occurring on networks, with important applications in many fields of social and behavioral science, including epidemiology, organizational studies, social psychology, and politics. For more information on these approaches to social network analysis, please see Wasserman and Faust (1994) and Wasserman and Pattison (1996, in press).

See also: Graphical Models: Overview; Multivariate Analysis: Discrete Variables (Loglinear Models); Network Analysis; Network Models of Tasks; Networks and Linkages: Cultural Aspects; Networks: Social; Spatial Statistical Methods
Bibliography
Bollobás B 1985 Random Graphs. Academic Press, London
Erdős P, Rényi A 1960 On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5: 17–61
Faust K, Skvoretz J 1999 Logit models for affiliation networks. In: Becker M, Sobel M (eds.) Sociological Methodology 1999. Blackwell, New York, pp. 253–80
Fienberg S, Wasserman S 1981 Categorical data analysis of single sociometric relations. In: Leinhardt S (ed.) Sociological Methodology 1981. Jossey-Bass, San Francisco, pp. 156–92
Frank O 1981 A survey of statistical methods for graph analysis. In: Leinhardt S (ed.) Sociological Methodology 1981. Jossey-Bass, San Francisco, pp. 110–55
Frank O, Nowicki K 1993 Exploratory statistical analysis of networks. In: Gimbel J, Kennedy J W, Quintas L V (eds.) Quo Vadis Graph Theory? A Source Book for Challenges and Directions. North-Holland, Amsterdam
Frank O, Strauss D 1986 Markov graphs. Journal of the American Statistical Association 81: 832–42
Friedkin N 1998 A Structural Theory of Social Influence. Cambridge University Press, New York
Holland P W, Leinhardt S 1975 The statistical analysis of local structure in social networks. In: Heise D R (ed.) Sociological Methodology 1976. Jossey-Bass, San Francisco, pp. 1–45
Holland P W, Leinhardt S 1981 An exponential family of probability distributions for directed graphs. Journal of the American Statistical Association 76: 33–50
Hubert L J, Baker F B 1978 Evaluating the conformity of sociometric measurements. Psychometrika 43: 31–41
Lazega E, van Duijn M 1997 Position in formal structure, personal characteristics and choices of advisors in a law firm: A logistic regression model for dyadic network data. Social Networks 19: 375–97
Morrissey J P, Calloway M 1994 Local mental health authorities and service system change: Evidence from the Robert Wood Johnson Foundation Program on chronic mental illness. The Milbank Quarterly 72(1): 49–80
Pattison P, Mische A, Robins G L 1998 The plurality of social relations: k-partite representations of interdependent social forms. Keynote address, Conference on Ordinal and Symbolic Data Analysis, Amherst, MA, Sept. 28–30
Pattison P E, Wasserman S 1999 Logit models and logistic regressions for social networks, II. Multivariate relations. British Journal of Mathematical and Statistical Psychology 52: 169–194
Pattison P, Wasserman S, Robins G L, Kanfer A M 2000 Statistical evaluation of algebraic constraints for social networks. Journal of Mathematical Psychology 44: 536–68
Rapoport A 1949 Outline of a probabilistic approach to animal sociology, I. Bulletin of Mathematical Biophysics 11: 183–196
Robins G L, Pattison P, Elliott P 2001 Network models for social influence processes. Psychometrika 66: 161–190
Robins G L, Pattison P, Wasserman S 1999 Logit models and logistic regressions for social networks, III. Valued relations. Psychometrika 64: 371–94
Snijders T A B 1991 Enumeration and simulation methods for 0–1 matrices with given marginals. Psychometrika 56: 397–417
Snijders T A B 1996 Stochastic actor-oriented models for network change. Journal of Mathematical Sociology 21: 149–72
Snijders T A B, Nowicki K 1997 Estimation and prediction for stochastic blockmodels for graphs with latent block structure. Journal of Classification 14: 75–100
Strauss D, Ikeda M 1990 Pseudolikelihood estimation for social networks. Journal of the American Statistical Association 85: 204–12
Wasserman S 1977 Random directed graph distributions and the triad census in social networks. Journal of Mathematical Sociology 5: 61–86
Wasserman S, Faust K 1994 Social Network Analysis: Methods and Applications. Cambridge University Press, New York
Wasserman S, Pattison P E 1996 Logit models and logistic regressions for social networks, I. An introduction to Markov random graphs and p*. Psychometrika 60: 401–25
Wasserman S, Pattison P E in press Multivariate Random Graph Distributions. Springer Lecture Note Series in Statistics, Springer-Verlag, New York
Watts D 1999 Networks, dynamics, and the small-world phenomenon. American Journal of Sociology 105: 493–527
Watts D, Strogatz S 1998 Collective dynamics of ‘small-world’ networks. Nature 393: 440–2
White H C, Boorman S, Breiger R L 1976 Social structure from multiple networks, I. Blockmodels of roles and positions. American Journal of Sociology 81: 730–80
P. Pattison and S. Wasserman
Social Networks and Fertility

Explanations offered for fertility decline in both developed and developing countries underwent fundamental changes in the last decades of the twentieth century. In addition to more refined economic reasoning and the introduction of cultural aspects, theories of the fertility transition now routinely reserve a place for diffusion effects. Such effects arise because individuals are themselves members of larger groups. The information that is known to group members, the choices they make, and the norms and values they adhere to, can all exert a powerful influence on individual incentives to innovate. In settings in which fertility has been high, for example, innovation might take the form of adopting modern contraception and fertility limitation. Social interaction with group members represents one important avenue along which such innovative demographic behavior can diffuse: peers, community members or relatives may provide relevant information to potential adopters of family planning, and they also represent the institutional and social structures that may facilitate or constrain individual fertility or family planning decisions.

Theories of diffusion usually encompass two central elements. First, they emphasize a particular set of social interactions that are associated with the adoption of innovations, such as social networks, opinion leaders, public media, etc. Second, diffusion models have an explicit dynamic dimension in which the adoption of an innovation is often associated with ‘contagion’ or ‘tipping point’ phenomena, which can accelerate social change. Because fertility transitions are often rapid, irreversible, and only weakly associated with the specific socioeconomic conditions in a population, such diffusionist explanations have appealed to demographers ever since the European Fertility Project was carried out (Coale and Watkins 1986). At the same time, diffusionist approaches have been criticized for several key reasons, such as the unclear definition of what diffuses, the unclear connection with economic models of behavior, the mostly indirect empirical evidence, and the insufficient knowledge about the preconditions for the onset of the diffusion process. Several of these potential problems can be overcome by investigating the diffusion process with the tools of social network analysis.
1. Why Social Networks Matter for Understanding Fertility

The theoretical perspective of social networks rests on one central insight: human action is not carried out by individuals who make their decisions in isolation and independence, but by socialized actors who are deeply rooted in their social environments. These environments consist of other individuals with whom a focal actor maintains various relationships such as kinship, love, friendship, competition, or business transactions. Within these relationships, norms are negotiated and enforced, and information and goods are exchanged (Mitchell 1973). On the one hand, this implies that a network member has access to resources owned by other network partners. In addition, a network enables its members to use the resources of community members to whom they are tied only indirectly. Social networks therefore constitute a pool of resources which form opportunity structures for action (Lin 1999). The kind and quality of these resources are decisive for the goals that are attainable for the
network members. On the other hand, social relationships create, maintain, and modify attitudes (Erickson 1988). People feel uncomfortable if the attitudes of their network partners are too different from their own. Social networks therefore bring the attitudes of their members in accordance with each other. Alternatively, if the network fails to bring attitudes into conformity, the relationships become less intensive or are abandoned. The resources that are available in a network, and the tendencies towards similar attitudes, depend on the characteristics of individuals’ direct relationships with one another as well as on the structure of the entire network. One basic characteristic of network relations is their strength (Granovetter 1973). Strong relationships, i.e., relationships that are characterized by a high frequency of contact or by a high level of emotional intensity, can be influential because they tend to transfer more valuable goods, more binding norms and more trustworthy—but also more redundant—information. On the other hand, a fundamental dimension of the network structure is its density. A dense network, i.e., a network in which a high proportion of the individuals are in direct contact with one another, can lead to a high degree of normative pressure on its members and it can provide intensive support for members in situations when help is needed. Less dense networks are characterized by a greater degree of heterogeneity. Such networks provide more opportunities for individual development, and they are the source of more diverse and heterogeneous information. The mechanisms by which social networks affect the diffusion process can be summarized under the headings ‘social learning,’ ‘joint evaluation,’ and ‘social influence’ (Bongaarts and Watkins 1996, Montgomery and Casterline 1996). 
At early stages, potential adopters need to be informed about the existence of new methods of fertility control, their availability, costs and benefits, and possible side effects. ‘Social learning’ is defined as the process by which individuals (a) learn about the existence and technical details of new contraceptive techniques, and (b) reduce the uncertainties associated with the adoption of these methods by drawing on the experience of network partners. The spread of information about modern contraceptives is always accompanied by the spread of new ideas about the social role of women, the status of the family, and the desired number of children. ‘Joint evaluation’ refers to the fact that social networks (a) facilitate the transformation of these new ideas into terms that are meaningful in the local context, and (b) assist in the evaluation of the respective social changes. A woman’s decision to adopt a modern form of contraception, however, rests not only on information, conviction, and discursive evaluation, but also on the normative pressures of dominant groups or actors. ‘Social influence’ describes the process by which an individual’s preferences and attitudes are influenced by the prevailing norms and values of the social environment. The direction of this social influence can differ in different stages of fertility decline. In the early stages, social influence may constrain the adoption of contraception, even where demographers may interpret adoption as being in a woman’s best interest, because deliberate control of fertility is still considered deviant. Once a noticeable fertility decline has set in, however, the direction of social influence can shift. As low fertility becomes commonplace, the interaction among women or couples is likely to encourage a decline in the demand for children and behaviors that lead to a limiting of fertility.
2. Empirical Evidence
The earliest systematic evidence for the influence of diffusion processes and social networks on fertility decisions is found in two seminal analyses of demographic change. First, the European Fertility Project (Coale and Watkins 1986) showed that areas which are culturally similar, in language or ethnicity for example, often exhibit similar fertility patterns despite substantially different socioeconomic conditions and development levels. Thus, new ideas about fertility as well as new contraceptive methods seem to have diffused within cultural and linguistic boundaries. More sophisticated approaches employing similar methodological ideas have used the correlation pattern in pooled time series to show support for the role of social interactions in the adoption of family planning over time (Montgomery and Casterline 1993). Second, the Taichung family planning experiment in Taiwan (Freedman and Takeshita 1969) used a randomized experiment and longitudinal follow-up to show how direct interventions of a family planning program targeted to some individuals spread to others who were not directly contacted by the program. The study thus suggests that family planning interventions extend beyond the direct effect on the contacted individuals and that interaction in social networks constitutes an important aspect in the diffusion of new contraceptive methods and ideas. The above evidence, although intriguing, is far from conclusive. In all of these data, the evidence is indirect and based on correlated fertility developments in geographically and/or culturally close populations. Support for the relevance of social networks in fertility transitions requires more individual-level data, gathered either in structured surveys or ethnographic studies. The work of Watkins and collaborators in South Nyanza, Kenya, for example, represents an important step in this direction.
Qualitative group interviews show that casual conversations about family planning occur frequently as women go about their daily activities. The network partners in these conversations are usually geographically and socially close
to one another and are of a similar age or education. Frequently they also know each other and share multiple relations, e.g., stemming from money lending, help in the household, or family ties. These qualitative impressions are further supported by survey data that include individuals and their social networks. Until recently, only one effort had been made to assemble such data, and this was by E. Rogers and colleagues in their pioneering work in rural Korea (Rogers and Kincaid 1981). Currently there are several related research projects being conducted in Kenya and Malawi (Watkins and colleagues), Thailand (Entwisle and colleagues), and Ghana (Casterline and colleagues). In all of these projects socioeconomic panel information on women and couples is being collected along with data on social networks that may have influenced contraceptive decisions. Research using these data has consistently shown that the prevalence of contraceptive use in social networks is positively associated with the probability of women using family planning, and that the use of family planning by network partners tends to increase both the knowledge about modern methods and the willingness to use them (Entwisle and Godley 1998, Kohler et al. 2001, Montgomery and Chung 1998). Although this association is robust across different socioeconomic settings and studies, a causal interpretation is not necessarily warranted. In particular, it has been challenging for researchers to distinguish between true network effects on fertility decisions on one side, and other unobserved factors which render the behavior of network members similar on the other. Nevertheless, additional work that fully exploits the panel nature of the above studies and improves the temporal measures of interactions and the adoption of family planning has the potential to overcome this limitation. 
A further criticism of existing empirical work is that the link between the theoretical arguments of why social interaction matters and the respective empirical implementation is weak in many analyses. One possible way to overcome this limitation is to use additional information about the structure of the social network, e.g., the density. Dense networks impose more behavioral constraints on the adoption or nonadoption of family planning than do sparse networks, which are more open to innovative information and behavior. Thus, density can be used as an indicator for the mechanisms of ‘social influence’ and ‘social learning.’ If the prevalence of contraceptive use in dense networks is more relevant for the decision to adopt contraception than it is in sparse networks, then social influence is the most relevant aspect of social networks. If it is the other way around, however, then it is social learning that is the dominant component. In the South Nyanza district of Kenya, for instance, evidence for both pathways of influence can be found. Moreover, the relative importance of these two mechanisms depends on the socioeconomic context of the regions.
Social learning seems to be relatively more important in areas with higher levels of market activity (Kohler et al. 2001). Although the empirical evidence supporting the argument that social networks play an important role in fertility decisions is accumulating from various sources, it remains inconclusive in important aspects, e.g., as regards the empirical identification of causal network effects. Nevertheless, the extensive efforts currently in progress to collect data, combined with both an improved theoretical understanding of social interaction and a more sophisticated empirical methodology, are likely to reveal further insights.
3. Implications of Social Networks For Fertility Change
Current research on social networks and fertility is in part stimulated by the fact that the dynamic implications of fertility models that include social interactions seem to be more consistent with the empirical evidence on fertility transitions than individualistic models. However, specific theoretical analyses of social interactions and fertility decisions are required as further support for this conclusion. Early formal models treat the diffusion of low fertility in a similar manner to the way the spread of contagious diseases is treated in epidemiological models (Rosero-Bixby and Casterline 1993). More recent developments have used stochastic processes or simulation, and they explicitly consider the interplay between social networks, women’s fertility decisions, and economic incentives on a microeconomic level (Kohler 2000a, 2000b).
The main consequence of interactions in social networks is that fertility decisions of women or couples in a population become interdependent and exert mutual influences on each other. The implications of this interdependence can be illustrated in a simple nonlinear model of social interaction. In particular, assume that the probability that a woman ever adopts modern contraception is given by
$$P(\text{woman adopts modern contraception} \mid x, \bar{y}) = F\big(\alpha(\bar{y} - 0.5) + \beta x + \gamma\big) \qquad (1)$$
where α and β are positive parameters, γ represents individual characteristics, x is the level of family planning effort, F is the cumulative logistic function, and $\bar{y}$ is the proportion of women in the population who use modern contraception in the reference period. Hence, the social utility term $\alpha(\bar{y} - 0.5)$ captures the dependence of a woman’s contraceptive choice on the fertility behavior in her network, where we assume for the sake of simplicity that this network includes all members of the population. The model implies that social effects are absent in a population in which half of the members use modern contraception and the other half do not. If the proportion of contraceptive users exceeds (is below) 0.5, then the social network increases (decreases) the probability of using contraception. This simple model already reveals that the aggregate properties of fertility dynamics in the presence of social interaction are distinct from the fertility dynamics in a purely individualistic model. In particular, the relevance of social interaction arises from three factors: social multiplier effects, multiple equilibria, and dense vs. sparse networks.
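As an illustrative sketch only, Eqn. (1) can be evaluated numerically, reading it as P = F(α(ȳ − 0.5) + βx + γ) with F the logistic function. The function name is hypothetical and the parameter values are arbitrary assumptions chosen for illustration, not estimates from the studies cited.

```python
import math

def adoption_probability(y_bar, x, alpha, beta, gamma):
    """Eqn. (1): F(alpha * (y_bar - 0.5) + beta * x + gamma), F = logistic CDF."""
    z = alpha * (y_bar - 0.5) + beta * x + gamma
    return 1.0 / (1.0 + math.exp(-z))

# Assumed illustrative values: alpha = 3, beta = 1, gamma = 0, x = 0.
p_low = adoption_probability(0.2, 0.0, 3.0, 1.0, 0.0)   # few users in network
p_mid = adoption_probability(0.5, 0.0, 3.0, 1.0, 0.0)   # social term vanishes
p_high = adoption_probability(0.8, 0.0, 3.0, 1.0, 0.0)  # many users in network
```

With prevalence at exactly one half the social term drops out, so the probability equals what a purely individualistic model (α = 0) would give; below and above one half the network lowers and raises it, respectively.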
Figure 1 Implications of social networks for population dynamics. In the left-hand panel, social networks exert a relatively weak influence on individual contraceptive decisions (α = 3), while in the right-hand panel the social network effect is relatively strong (α = 6)
3.1 Social Multiplier Effects
Social interaction implies the presence of multiplier effects which change how the fertility level adjusts to changes in family planning programs. An increase in family planning effort x, for instance, increases the probability of choosing modern contraception in two distinct ways. The direct effect reflects the increase in the propensity to choose modern contraception while holding the population prevalence $\bar{y}$ constant. However, because the increased level of x affects all community members, the prevalence of modern contraception in the population tends to increase as well. This change in the community has a feedback or social multiplier effect on individual behavior in future periods via the social utility term in Eqn. (1). This means that the total change in the prevalence of family planning exceeds the direct effect. The left-hand panel of Fig. 1 shows how this total effect is composed of a direct effect and an indirect effect through social interaction. The interaction in social networks thus enhances the changes in contraceptive behavior that follow changes in family planning efforts or other socioeconomic changes.
3.2 Multiple Equilibria
Many formal fertility models assume that there is only one aggregate fertility level, denoted as ‘equilibrium,’ associated with any given level of socioeconomic conditions. When the interdependence of fertility decisions in social networks is sufficiently strong, however, multiple equilibria can emerge (right-hand panel of Fig. 1). This possibility of there being multiple equilibria is relevant because in this case a population can be ‘stuck’ at a low level of contraceptive use, even though a second equilibrium exists with a higher level of contraceptive use and higher individual utility levels.
In this case, increases in the policy level x (or other socioeconomic changes) can induce a transition from the low contraceptive use to the high contraceptive use equilibrium. These transitions between equilibria are often thought to occur at a rapid pace, resulting in significant changes in fertility behavior within a relatively short space of time. Moreover, both the high and low contraceptive use equilibria are stable, so that a transitory increase in the family planning effort can yield sustainable long-term changes in fertility levels.
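The multiplier and multiple-equilibria arguments can be sketched numerically. The sketch below rests on assumptions not made in the text: γ is treated as one common value for all women, so that an equilibrium is a prevalence level that reproduces itself through Eqn. (1); β = 1, γ = 0 and the effort levels are arbitrary; only α = 3 and α = 6 are taken from Fig. 1. All function names are hypothetical.

```python
import math

def response(y_bar, x, alpha, beta, gamma):
    """Adoption probability from Eqn. (1) at population prevalence y_bar."""
    return 1.0 / (1.0 + math.exp(-(alpha * (y_bar - 0.5) + beta * x + gamma)))

def find_equilibria(x, alpha, beta, gamma, n_grid=1000, tol=1e-12):
    """Prevalence levels y in [0, 1) satisfying y = response(y, ...)."""
    def gap(y):
        return response(y, x, alpha, beta, gamma) - y

    roots = []
    for i in range(n_grid):
        lo, hi = i / n_grid, (i + 1) / n_grid
        if gap(lo) == 0.0:               # root exactly on a grid point
            roots.append(lo)
        elif gap(lo) * gap(hi) < 0.0:    # sign change: bisect inside the cell
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if gap(lo) * gap(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

# Weak social interaction (alpha = 3): a single equilibrium.
weak = find_equilibria(x=0.0, alpha=3.0, beta=1.0, gamma=0.0)
# Strong social interaction (alpha = 6): several equilibria.
strong = find_equilibria(x=0.0, alpha=6.0, beta=1.0, gamma=0.0)

# Social multiplier for alpha = 3 when effort rises from x = 0 to x = 0.5:
# the direct effect holds prevalence fixed, the total effect lets it adjust.
y0 = weak[0]
direct_effect = response(y0, 0.5, 3.0, 1.0, 0.0) - y0
y1 = find_equilibria(x=0.5, alpha=3.0, beta=1.0, gamma=0.0)[0]
total_effect = y1 - y0
```

Under these assumed values the α = 3 case has one equilibrium and the total effect exceeds the direct effect, while the α = 6 case has a low and a high equilibrium separated by a middle fixed point at one half. The grid-plus-bisection search is only one simple way to locate the fixed points; it is not the method of the cited models.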
3.3 Populations With Dense Versus Sparse Networks
An important question in diffusion models is whether ‘more’ interaction in networks always leads to larger multiplier effects and more rapid diffusion. It can be shown that already in simple models such as Eqn. (1) social networks can be status-quo enforcing: the long-term effect associated with changes in the program effort can decrease with an increase in parameter α, which reflects the importance of the network influences in individual fertility decisions. The changes in contraceptive use resulting from changes in the program effort can therefore be smaller for very densely knit societies than for more sparse societies.
Bibliography
Bongaarts J, Watkins S C 1996 Social interactions and contemporary fertility transitions. Population and Development Review 22: 639–82
Coale A J, Watkins S C 1986 The Decline of Fertility in Europe. Princeton University Press, Princeton, NJ
Entwisle B, Godley J 1998 Village network and patterns of contraceptive choice. Paper presented at a meeting of the Population Committee of the National Academy of Sciences, January 29–30, Washington, DC
Erickson B H 1988 The relational basis of attitudes. In: Wellman B, Berkowitz S D (eds.) Social Structures: A Network Approach. Cambridge University Press, Cambridge, pp. 99–121
Freedman R, Takeshita J Y 1969 Family Planning in Taiwan: An Experiment in Social Change. Princeton University Press, Princeton, NJ
Granovetter M S 1973 The strength of weak ties. American Journal of Sociology 73: 1361–80
Kohler H-P 2000a Fertility decline as a coordination problem. Journal of Development Economics 63: 231–63
Kohler H-P 2000b Social interaction and fluctuations in birth rates. Population Studies 54: 223–38
Kohler H-P, Behrman J R, Watkins S C 2001 The density of social networks and fertility decisions: Evidence from South Nyanza District, Kenya. Demography 38: 43–58
Lin N 1999 Building a network theory of social capital. Connections 22: 28–51
Mitchell J C 1973 Networks, norms, and institutions. In: Boissevain J, Mitchell J C (eds.) Network Analysis: Studies in Human Interaction. Mouton, The Hague, Holland, pp. 2–35
Montgomery M R, Casterline J B 1993 The diffusion of fertility control in Taiwan: Evidence from pooled cross-section time-series models. Population Studies 47: 457–79
Montgomery M R, Casterline J B 1996 Social learning, social influence learning and new models of fertility. Population and Development Review 22: S151–75
Montgomery M R, Chung W S 1998 Social networks and the diffusion of fertility control in the Republic of Korea. In: Leete R (ed.) The Dynamics of Values in Fertility Change. Oxford University Press, Oxford
Rogers E M, Kincaid D L 1981 Communication Networks: Toward a New Paradigm for Research. Free Press, New York
Rosero-Bixby L, Casterline J B 1993 Modelling diffusion effects in fertility transition. Population Studies 47: 147–67
H. P. Kohler and C. Bühler
Social Networks and Gender
One of the most frequently cited causes for inequalities in the distribution of income, power, and status to men and women in the workplace is gender differences in social networks (Baron and Pfeffer 1994). The term network refers to the set of relationships defined by an individual and his or her direct contacts with others (see Ibarra 1993 for various definitions). Networks shape careers by awarding access to jobs, providing mentoring and sponsorship, channeling the flow of information and referrals, augmenting power and reputations, and increasing the likelihood and speed of promotion (e.g., Brass 1985, Burt 1992, Ibarra 1995, Kanter 1977). This article reviews the last two decades of research on men’s and women’s work networks and proposes a social structural perspective that can integrate findings about gender differences in network characteristics and their consequences. A social structural view underscores the embeddedness of social relationships within structural contexts that constrain the availability and ease of developing various kinds of social contacts. But such a perspective also gives prominence to the interaction patterns that intervene between causal processes at the macrostructural level of socioeconomic organizations and those operating at the individual level. Three broad sets of factors moderate the relationship between gender and work-related networks: sex stratification, interaction dynamics, and gender beliefs and identities. Women are under-represented in executive positions and men and women are segregated into different jobs, occupations, firms, and industries such that women are over-represented in lower paying jobs (Baron and Pfeffer 1994, Reskin 1993). Men and women interact on a daily basis in social life, but the structure of gender roles outside the workplace is such that they engage in different activities and form part of different social circles outside of work (Ross 1987). Voluntary group activities, for example, mirror and reinforce the gender segregation found in the workplace (McPherson and Smith-Lovin 1986).
Furthermore, as in the workplace, women’s social roles tend to be lower in status and prestige than men’s; as a result, although men and women interact frequently, only a minority of these cross-gender interactions occur between people of roughly equivalent power and status (Ridgeway and Smith-Lovin 1999). Kanter (1977) established the effects of differences in power, opportunity, and numbers on interaction dynamics and individual attitudes. Interaction dynamics, which are both cause and consequence of stratification inside and outside the workplace, further reinforce differences in networks via several mechanisms. Well-documented preferences for homophily, i.e., same-gender interaction, lead men and women to organize discretionary interaction within different social circles (Smith-Lovin and McPherson 1993). Preferences for interaction with high-status individuals (those who are powerful and upwardly mobile) influence the dynamics of interpersonal attraction and network formation (Ibarra 1992). But ‘female’ is a lower ascribed status than ‘male’ (Ridgeway 1991). Thus, for women, preferences for homophily and preferences for status conflict, while these preferences are congruent for men. This serves to further segregate networks and aids in the perpetuation of stereotypic beliefs about gender and work competency (Ridgeway and Smith-Lovin 1999). Gender beliefs and identities, which are themselves a product of interaction patterns, affect how people interact with the constraints of their structural contexts, shaping what contacts men and women seek and how they negotiate their interactions (Ridgeway and Smith-Lovin 1999). In sum, the literature on work networks suggests that social structures, social interaction, and social identities intertwine to differentiate women’s and men’s networks. Macrostructural demographic and job stratification patterns interact with the microdynamics of interpersonal exchange and identity construction to reinforce gender differences in networks as well as in the types of networks that each needs in order to further their attainment. The research evidence in each of these areas is reviewed below.
1. Differences in Networks
Structuralists have typically studied how men and women enter the occupational world with different human capital and how they are distributed in different jobs, hierarchical levels, firms, and even industries. These structural factors account for most of the differences observed in men’s and women’s networks and related mobility outcomes. When men and women hold similar jobs, occupations, and hierarchical levels, they have similar networks and attain equally central informal positions (Brass 1985, Ibarra 1992, Miller 1986, Burt 1992, Aldrich et al. 1989, Moore 1990). Regardless of position, however, organizational demography affects networks by regulating the availability of same-gender contacts that are also influential (Ibarra 1992). Compared to men, therefore, women typically have fewer high-status, same-gender contacts available to them for work-related interaction (Ibarra 1992). As a result, women working in male-dominated industries, firms, or jobs tend to develop ‘functionally segregated’ networks with personal and professional contacts relegated to different, rarely overlapping networks. This differs from the networks of their male counterparts, who tend to develop more ‘multiplex’ network relationships, i.e., relationships that have both a personal and professional dimension, rather than just one. The trust and loyalty associated with coaching and advocacy for promotion is more likely to develop in relationships defined by multiple dimensions (Kanter 1977). Outside the workplace, family and life-course factors also produce similar structural effects. Kinship ties tend to account for a greater proportion of women’s ties and this effect is heightened with child rearing when the effects of occupation are held constant (Campbell 1988, Fischer and Oliker 1983, Moore 1990). Sex segregation of activities inside and outside the workplace, therefore, limits the status of contacts and multiplexity of women’s work relationships and thus the career instrumentality of their network.
2. Differences in the Relationship Between Networks and Professional Outcomes
More recent research suggests that, due to the structural factors outlined above, women must use different network means to achieve the same career goals as men. Many studies, reviewed in more detail below, find that an interaction between gender and human capital or structural position—rather than main effects for either—explains network characteristics. Men appear to be better able than women to convert positional and human capital resources such as hierarchical rank, educational attainment, and external professional contacts into access to central network positions (Ibarra 1992, Miller 1986, Miller et al. 1981). Burt (1992) found that men furthered their career mobility by creating networks rich in ‘bridging ties’ to people who were not directly connected to one another. Women, by contrast, required strong ties to strategic partners to signal their legitimacy and thus contribute to their advancement. He argues that networks dominated by strong ties, i.e., relationships high in emotional intensity, intimacy, or frequency of interaction (Granovetter 1982), while detrimental for men are beneficial for women by serving to compensate for their lower status and legitimacy in the managerial world. In a study of Fortune 500 middle managers, Ibarra (1997) found that fast-track men reported networks much like those of Burt’s highly mobile managers, while the fast-track women built networks that were higher in both the tie strength and range of contacts. These findings are consistent with Granovetter’s (1982) argument that weak ties are less advantageous for people in socially or economically insecure positions. Women may also develop different network routes in order to compensate for the few women available to them in the workplace.
Men’s workplace ties are primarily with men, regardless of the type of relationship involved; for women, by contrast, contacts with men provide certain types of instrumental access, while relationships with a mix of men and women serve as sources of social support and friendship (Ibarra 1992, Lincoln and Miller 1979). Brass (1985) found that homophilous workplace networks were more detrimental for women than for men because those who control the promotion process tend to be male. Ibarra (1997), by contrast, found that same-gender ties help managerial women advance. When women are under-represented in managerial and professional ranks, homophilous ties play a critical role in helping those in the minority gain advice from others who have had similar experiences. A beneficial side effect is that reaching out to other women typically requires women to extend their network links beyond the immediate workgroups or functional areas, adding breadth and range to their networks.
3. Effects of Identity on Networks
Network behavior is not only driven by instrumental motives such as career advancement. People initiate and maintain relationships as expressions of valued social identities, and participate in activities and institutions that are congruent with those identities (Stryker and Serpe 1982). Gender identity, therefore, may affect a person’s choice of social contexts (within which different types of potential network contacts are more or less available) as well as the discretionary contacts that are developed within those contexts. But, the simple existence of homophily on some trait is not sufficient motive for tie formation (Ibarra 1997). Women vary in the strength of their identification with their multiple identities, and this fact may explain large within-gender group differences in network homophily. Chatman and Brown (1996) found that social identity mediates the effects of demographic similarity at the dyadic level on friendship formation in business school student networks. Thus, identification with one’s gender group explains variance in the formation and nature of network ties over and above that explained by demography. Demography, however, is still expected to play an important role because ascribed categories such as gender acquire their meaning in the context of social interaction (Wharton 1992). In work environments in which the distribution of power and status favors men numerically, the extent to which gender is a salient identity will affect women’s interaction choices. Ely (1994), for example, found a correlation between the organizational demography of law firms and the quality of peer and hierarchical relationships among women in those firms. She argued that this correlation was mediated by the extent to which the women in those firms identified with each other on the basis of gender. 
People have multiple identities (e.g., consultant, wife, mother, tennis player, Protestant) that vary in their importance to the individual and in the degree to which they are reinforced or embedded in one’s network (Stryker and Serpe 1982). Stryker and Serpe argue that a person’s ‘commitment’ to any given identity is a function of the number, affective importance, and multiplexity of network ties that are formed by the person when enacting the identity. As discussed above, the demographics of the workplace and social groups outside work make it easier for men’s personal and professional contacts to work
synergistically; women’s are more likely to be uniplex, or valuable on a single dimension. This pattern is exacerbated by life course and family factors because gender interacts statistically with major family transitions to change network patterns (Elder and O’Rand 1995). Munch et al. (1997) found that childbirth leads women to drop their job and career-related ties in order to limit the overall size of their networks, while men substitute family ties for male social friends, but without losing job contacts. Changes in family status, therefore, are associated with changes in patterns of social relations, which shape and give meaning to women’s and men’s experiences, both in their families and jobs. The network configurations created by major family transitions, in turn, will shape the hierarchical ordering of social identities and the cultural assumptions that women and men make about the appropriate balance of job- and family-related work. Changes in network characteristics associated with family transitions may explain cohort and individual differences in women’s responses to job–family demands.
4. Conclusion
The existing evidence suggests that different network configurations are effective in promoting the careers of men and women. In most organizations men and women differ in the jobs they hold, in their opportunity for informal interaction with high-status, same-gender others, and the expectations that govern their interactions with others. Outside the workplace, men and women are subject to different role demands. They come to develop different types of networks because they are, in fact, operating in different social contexts that require different network configurations to accomplish similar objectives. These findings raise critical questions for future research on the interaction between gender and networks as it affects the demographics of the workplace and course of careers. They suggest that future research should shift focus from comparing men’s and women’s networks towards exploring why different networks work for women and men, and what factors affect the formation of networks with such characteristics.
See also: Gender and Feminist Studies in Sociology; Language and Gender; Networks: Social; Sex Segregation at Work; Small-group Interaction and Gender
Bibliography Aldrich H, Reese P R, Dubini P 1989 Women on the verge of a breakthrough: Networking among entrepreneurs in the United States and Italy. Entrepreneurship and Regional Deelopment 1: 339–56
Baron J, Pfeffer J 1994 The social psychology of organizations and inequality. Social Psychology Quarterly 57: 190–209 Brass D J 1985 Men’s and women’s networks: A study of interaction patterns and influence in an organization. Academy of Management Journal 28: 327–43 Burt R S 1992 Structural Holes. Harvard University Press, Cambridge, MA Campbell K E 1988 Gender differences in job-related networks. In: Work and Occupations 15: 179–200 Chatman J A, Brown R A 1996 It takes two to tango: Demographic similarity, social identity and friendship. Paper presented at the Stanford Conference on Power, Politics and Influence Ely R J 1994 The effects of organizational demographics and social identity on relationships among professional women. Administratie Science Quarterly 39: 203–38 Fischer C, Oliker S 1983 A research note on friendship, gender, and the life cycle. Social Forces 62: 124–32 Granovetter M 1982 The strength of weak ties: A network theory revisited. In: Marsden P V, Lin N (eds.) Social Structure and Network Analysis. Sage, Beverly Hills, CA Ibarra H 1992 Homophily and differential returns: Sex differences in network structure and access in an advertising firm. Administratie Science Quarterly 37: 422–47 Ibarra H 1993 Personal networks of women and minorities in management: A conceptual framework. Academy of Management Reiew 18(1): 56–87 Ibarra H 1997 Paving an alternate route: Gender differences in managerial networks for career development. Social Psychology Quarterly 60: 91–102 Ibarra H, Smith-Lovin L 1997 New directions in social network research on gender and careers. In: Jackson S, Cooper C (eds.) Futures of Organizations. John Wiley & Sons, New York, pp. 359–84 Kanter R M 1977 Men and Women of the Corporation. Basic Books, New York McPherson J M, Smith-Lovin L 1986 Sex segregation in voluntary associations. 
American Sociological Reiew 51: 61–79 McPherson J M, Smith-Lovin L 1987 Homophily in voluntary organizations: Status distance and the composition of face-toface groups. American Journal of Sociology 52: 370–79 Miller J 1986 Pathways in the Workplace. Cambridge University Press, Cambridge, UK Moore G 1990 Structural determinants of men’s and women’s personal networks. American Sociological Reiew 55: 726–35 Munch A, Miller McPherson J, Smith-Lovin J 1997 Gender, children and social context: A research note on the effects of childrearing for men and women. American Sociological Reiew 62: 509–20 Reskin B 1993 Sex segregation in the workplace. Annual Reiew of Sociology 19: 241–70 Ridgeway C 1991 The social construction of status value: Gender and other nominal characteristics. Social Forces 70: 367–86 Ridgeway C L, Smith-Lovin L 1999 The gender system and interaction. Annual Reiew of Sociology 25: 191–216 Ross C E 1987 The division of labor at home. Social Forces 65: 816–33 Smith-Lovin L, McPherson M J 1993 You are who you know: A network approach to gender. In: England P (ed.) Theory on Gender\Feminism on Theory. Aldine, New York Stryker S, Serpe R T 1982 Commitment, identity salience and role behavior. In: Ickes W, Knowles E (eds.) Personality, Roles and Social Behaior. Springer-Verlag, New York, pp. 199–218
Social Networks and Gender Wharton A S 1992 The social construction of gender and race in organizations: A social identity and group mobilization perspective. In: Tolbert P, Bacharach S (eds.) Research in the Sociology of Organizations. JAI Press, Greenwich, CT, Vol. 10, pp. 55–84
H. Ibarra
Social Neuroscience Social neuroscience refers to the study of the relationship between neural and social processes. A fundamental assumption in the science of psychology is that neurophysiological processes underlie social and psychological phenomena. The early American psychologist William James (1890\1950) was among the first to articulate this assumption; to recognize that developmental, environmental, and sociocultural factors influence the neurophysiological processes underlying psychological and social phenomena; and to acknowledge that these influences could be studied as neurophysiological transactions. James also asserted that unnecessary diseconomies and conundrums would result if neurophysiological events were the exclusive focus of psychological investigations. Despite this early recognition that molecular processes and mechanisms bear on molar social psychological processes and vice versa, the early identification of biology with innate mechanisms and the identification of social psychology with self-report data contributed to an antipathy that endured long after each field became more sophisticated. Accordingly, neuroscientific analyses tended to be limited to behavioral phenomena occurring within an organism (e.g., attention, learning, memory), whereas social psychological analyses traditionally eschewed biological data in favor of self-report and behavioral data. As neuroscientific approaches were applied to the study of diseases and to elementary cognitive and behavioral processes, subtler medical and behavioral phenomena succumbed to neuroscientific inquiry. The 1990s were declared by the US Congress to be the ‘decade of the brain,’ reflecting the interest in and importance of the neurosciences in theory of and research on cognition, emotion, behavior, and health.
1. Doctrine of Multilevel Analyses The decade of the brain has led to a realization that a comprehensive understanding of the brain cannot be achieved by a focus on neural mechanisms alone. The brain, the organ of the mind, is a fundamental but interacting component of a developing, aging individual who is a mere actor in the larger theater of life. This theater is undeniably social, beginning with prenatal care, caregiver–infant attachment, and early childhood experiences and ending with feelings of loneliness or embeddedness and with familial or societal decisions about care for the elderly. Mental disorders such as depression, anxiety, schizophrenia, and phobia are both determined by and are determinants of social processes; and disorders such as substance abuse, prejudice and stigmatization, family discord, worker dissatisfaction and productivity, and the spread of acquired immunodeficiency syndrome are quintessentially social as well as neurophysiological processes. Indeed, humans are such social animals that a basic ‘need to belong’ has been posited (Baumeister and Leary 1995, Gardner et al. in press). People form associations and connections with others from the moment they are born. The very survival of newborns depends on their attachment to and nurturance by others over an extended period of time. Accordingly, evolution has sculpted the human genome to be sensitive to and succoring of contact and relationships with others. For instance, caregiving and attachment have hormonal (Uvnas-Mosberg 1997) and neurophysiological substrates (Carter et al. 1997). Communication, the bedrock of complex social interaction, is universal and ubiquitous in humans. In the rare instances in which human language is not modeled or taught, language develops nevertheless (Goldin-Meadow and Mylander 1983, 1984). The reciprocal influences between social and biological levels of organization do not stop at infancy. Affiliation and nurturant social relationships, for instance, are essential for physical and psychological well-being across the lifespan. Disruptions of social connections, whether through ridicule, separation, divorce, or bereavement, are among the most stressful events people must endure (Gardner et al. 2000).
Berkman and Syme (1979) operationalized social connections as marriage, contacts with friends and extended family members, church membership, and other group affiliations. They found that adults with fewer social connections suffered higher rates of mortality over the succeeding nine years even after accounting for self-reports of physical health, socioeconomic status, smoking, alcohol consumption, obesity, race, life satisfaction, physical activity, and preventive health service usage. House et al. (1982) replicated these findings using physical examinations to assess health status. In their review of five prospective studies, House et al. (1988) concluded that social isolation was a major risk factor for morbidity and mortality from widely varying causes. This relationship was evident even after statistically controlling for known biological risk factors, social status, and baseline measures of health. The negative health consequences of social isolation were particularly strong among some of the fastest growing segments of
the population: the elderly, the poor, and minorities such as African Americans. Astonishingly, the strength of social isolation as a risk factor was found to be comparable to high blood pressure, obesity, and sedentary lifestyles. Social isolation and loneliness are associated with poorer mental as well as physical well-being (e.g., Ernst and Cacioppo 1999, Gupta and Korte 1994, Perkins 1991). People who report having contact with five or more intimate friends in the prior six months are 60 percent more likely to report that their lives are ‘very happy,’ as compared to those who do not report such contact (Burt 1986). People appear to be cognizant of the importance of social relationships. When asked ‘what is necessary for happiness?’ most rated relationships with friends and family as being the most important factor (Berscheid 1985). Initially, studies of the neural structures and processes associated with psychological and social events were limited to animal models, post-mortem examinations, multiply determined peripheral assessments, and observations of the occasional unfortunate individual who suffered trauma to or disorders of localized areas of the brain. Developments in electrophysiological recording, functional brain imaging, neurochemical techniques, neuroimmunologic measures, and ambulatory recording procedures have increasingly made it possible to investigate the role of neural systems and processes in intact humans. These developments fostered multilevel integrative analyses of the relationship between neural and social processes. Cacioppo and Berntson (1992) coined the term social neuroscience and outlined several principles for spanning molar and molecular levels of organization.
1.1 The Principle of Multiple Determinism The principle of multiple determinism specifies that a target event at one level of organization, but particularly at molar (e.g., social) levels of organization, may have multiple antecedents within or across levels of organization. For example, consider the multiple factors that contribute to drug abuse. On the biological level, researchers have identified the contribution of individual differences in the susceptibility of the endogenous opioid receptor system, while on the social level investigators have noted the important role of social context. Both operate, and our understanding of drug abuse is incomplete if either perspective is excluded. A corollary to this principle is that the mapping between elements across levels of organization becomes more complex (e.g., many-to-many) as the number of intervening levels of organization increases. One implication is that the likelihood of complex and potentially obscure mappings increases as one skips levels of organization.
1.2 The Principle of Nonadditive Determinism The principle of nonadditive determinism specifies that properties of the whole are not always readily predictable from the properties of the parts. Consider an illustrative study by Haber and Barchas (1983), who investigated the effects of amphetamine on primate behavior. The behavior of nonhuman primates was examined following the administration of amphetamine or placebo. No clear pattern emerged between the drug and placebo conditions until each primate’s position in the social hierarchy was considered. When this social factor was taken into account, amphetamines were found to increase dominant behavior in primates high in the social hierarchy and to increase submissive behavior in primates low in the social hierarchy. The importance of this study derives from its demonstration of how the effects of physiological changes on social behavior can appear unreliable until the analysis is extended across levels of organization. A strictly physiological (or social) analysis, regardless of the sophistication of the measurement technology, may not have unraveled the orderly relationship that existed.
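The logic of this example can be sketched numerically. The scores below are invented purely for illustration and are not data from Haber and Barchas (1983); the names `scores` and `drug_effect` are likewise hypothetical. The sketch only shows how a drug effect that averages out to nothing across the whole group can become orderly once social rank is taken into account.

```python
from statistics import mean

# Hypothetical dominance scores (positive = more dominant behavior,
# negative = more submissive behavior). Invented for illustration only;
# these are NOT values from the study discussed in the text.
scores = {
    ("high", "placebo"): [2.0, 1.0, 3.0],
    ("high", "amphetamine"): [5.0, 4.0, 6.0],
    ("low", "placebo"): [-2.0, -1.0, -3.0],
    ("low", "amphetamine"): [-5.0, -4.0, -6.0],
}


def drug_effect(ranks):
    """Mean score under amphetamine minus mean score under placebo,
    pooled over the given social ranks."""
    drug = [s for r in ranks for s in scores[(r, "amphetamine")]]
    placebo = [s for r in ranks for s in scores[(r, "placebo")]]
    return mean(drug) - mean(placebo)


# Ignoring rank: the opposite-signed effects cancel in the pooled comparison.
overall_effect = drug_effect(["high", "low"])

# Conditioning on rank: the drug amplifies whichever tendency is already there.
high_rank_effect = drug_effect(["high"])
low_rank_effect = drug_effect(["low"])

print(overall_effect, high_rank_effect, low_rank_effect)
```

In this toy setup the pooled comparison shows no drug effect, while the rank-specific comparisons show effects of equal size and opposite sign, which is the pattern the principle of nonadditive determinism describes.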
1.3 The Principle of Reciprocal Determinism The principle of reciprocal determinism specifies that there can be mutual influences between microscopic (e.g., biological) and macroscopic (e.g., social) factors in determining behavior. For example, not only has the level of testosterone in nonhuman male primates been shown to promote sexual behavior, but the availability of receptive females influences the level of testosterone in nonhuman primates (Bernstein et al. 1983, Rose et al. 1972). Within social psychology, Zillmann (1994) has demonstrated that exposure to violent and erotic materials influences the level of physiological arousal in males, and that the level of physiological arousal has a reciprocal influence on the perceptions of and tendencies toward sex and aggression. Accordingly, comprehensive accounts of these behaviors cannot be achieved if either biological or social levels of organization are considered unnecessary or irrelevant.
2. Insights Fostered by Multilevel Analyses Recent research has provided growing evidence that multilevel analyses spanning neural and social perspectives can foster more comprehensive accounts of cognition, emotion, behavior, and health (e.g., Anderson 1998). First, important inroads to the logic of social processes have emerged from theory and research in the neurosciences (e.g., brain organization and localization of function, genetic determinants of behavior). Research in the neurosciences, for instance, has revealed that early ideas about memory, in which knowledge and skills were thought to be served by the same underlying mechanism, are incorrect. Instead, separable neural systems underlie the memory of people, episodes and ideas (declarative, episodic memory), and the memory of procedures, interactions, or skills (procedural, implicit memory). Similarly, research in the neurosciences has influenced what are thought to be the processes underlying social attitudes and decisions. Mechanisms for differentiating hostile from hospitable environmental stimuli are imperative for the survival of species and for the formation and maintenance of social units. The human brain and body therefore can be conceptualized as being shaped by natural selection to calculate utility and to respond accordingly. Evaluative decisions and responses are so critical that organisms have rudimentary reflexes for categorizing and approaching or withdrawing from certain classes of stimuli, and for providing metabolic support for these actions (Lang et al. 1990). A remarkable feature of humans is the extent to which the evaluation of utility is shaped by learning and cognition. The processes subserving the evaluation of hostile and hospitable events are also physiological processes and cannot be understood fully without considering the structural and functional aspects of the physical substrates. Noninvasive investigations of the physiological operations associated with evaluative processes provide an important window through which to view these processes without perturbing them. For instance, the neural circuitry involved in computing the utility of a stimulus (i.e., evaluative processing) diverges from the circuitry involved in identification and discrimination (i.e., nonevaluative processing).
Furthermore, evaluative discriminations by humans (e.g., attitudes) have traditionally been conceptualized as being bipolar (hostile–hospitable) and have been measured using bipolar scales to gauge the net affective predisposition toward a stimulus. Such an approach treats positive and negative evaluative processes (and the resulting affective states and judgments of utility) as equivalent directions on a single scale, reciprocally activated, and interchangeable in the sense that increasing one is equivalent to decreasing the other. Physical constraints may restrict behavioral manifestations of utility calculations to bivalent actions (approach–withdrawal), but research in the behavioral and social neurosciences now suggests that approach and withdrawal are behavioral manifestations that come from distinguishable motivational substrates. This work has led to evolutionary models of affect and emotion in which a stimulus may vary in terms of the strength of positive evaluative activation (i.e., positivity) and the strength of negative evaluative activation (i.e., negativity) it evokes. Furthermore, the underlying positive and negative evaluative processes are distinguishable, are characterized by distinct activation functions, and have distinguishable antecedents (Cacioppo and Berntson 1994). Second, the study of social processes has challenged existing theories in the neurosciences, resulting in refinements, extensions, or complete revolutions in neuroscientific theory and research. Classically, immune functions were considered to reflect specific and nonspecific physiological responses to pathogens or tissue damage. It is now clear that immune responses are heavily influenced by central nervous processes that are affected by social interactions and processes. For instance, the effects of social context now appear to be among the most powerful determinants of the expression of immune reactions (Glaser and Kiecolt-Glaser 1994). It is clear that an understanding of immunocompetence will be inadequate in the absence of considerations of psychosocial factors. Research on these interactions was activated by investigations demonstrating the direct and moderating effects of psychosocial factors (e.g., conditioned stimuli, bereavement, social support, major life events) on immune competence (Kiecolt-Glaser et al. 1984). Thus, major advances in the neurosciences can derive from increasing the scope of the analysis to include the contributions of social factors and processes. Third, the social environment shapes neural structures and processes and vice versa. Meaney and colleagues (Liu et al. 1997, Meaney et al. 1993), for instance, have found that experimental manipulations of maternal care influence the development of individual differences in neuroendocrine responses to stress in rats. As adults, the offspring of mothers that exhibited more licking and grooming of pups during the first ten days of life were also characterized by reduced adrenocorticotropic hormone and corticosterone responses to acute stress. As mothers, these rats also tended to lick and groom their pups.
Finally, deciphering the structure and function of the brain is fostered by sophisticated social psychological theories in which the elementary operations underlying complex social behaviors are explicated, and by experimental paradigms that allow these social psychological operations to be studied in isolation using neuroscientific methods. Brain imaging studies of the neural bases of emotion have contrasted positive and negative emotions based on the assumption that these emotions are served by the same neural structures. Research and theory in the social and behavioral sciences now suggests that the processes underlying positive and negative emotions are separable, and paradigms for eliciting positive and negative emotions have been developed. Accordingly, recent brain imaging studies in which positive and neutral emotions are contrasted, and negative and neutral emotions are contrasted, have revealed the activation of distinguishable neural structures during positive and negative emotions.
In sum, social neuroscience is an interdisciplinary field that is helping to illuminate questions ranging from the social sciences to the neurosciences by examining how organismic processes are shaped, modulated, and modified by social factors and vice versa. See also: Behavioral Neuroscience; Comparative Neuroscience; Psychophysiology
Bibliography Anderson N B 1998 Levels of analysis in health science. A framework for integrating sociobehavioral and biomedical research. Annals of the New York Academy of Sciences 840: 563–76 Baumeister R F, Leary M R 1995 The need to belong: Desire for interpersonal attachment as a fundamental human motivation. Psychological Bulletin 117: 497–529 Berkman L F, Syme S L 1979 Social networks, host resistance, and mortality: A nine-year follow-up study of Alameda County residents. American Journal of Epidemiology 109: 186–204 Bernstein I S, Gordon T P, Rose R M 1983 The interaction of hormones, behavior, and social context in nonhuman primates. In: Svare B B (ed.) Hormones and Aggressive Behavior. Plenum Press, New York, pp. 535–61 Berscheid E 1985 Interpersonal attraction. In: Lindzey G, Aronson E (eds.) The Handbook of Social Psychology. Random House, New York Burt R S 1986 Strangers, friends, and happiness. GSS Technical Report 72: National Opinion Research Center, University of Chicago, Chicago Cacioppo J T, Berntson G G 1992 Social psychological contributions to the decade of the brain: The doctrine of multilevel analysis. American Psychologist 47: 1019–28 Cacioppo J T, Berntson G G 1994 Relationship between attitudes and evaluative space: A critical review, with emphasis on the separability of positive and negative substrates. Psychological Bulletin 115: 401–23 Carter C S, Lederhendler I I, Kirkpatrick B 1997 The Integrative Neurobiology of Affiliation. New York Academy of Sciences, New York Ernst J M, Cacioppo J T 1999 Lonely hearts: Psychological perspectives on loneliness. Applied and Preventive Psychology 8: 1–22 Gardner W L, Gabriel S, Diekman A B 2000 Interpersonal processes. In: Cacioppo J T, Tassinary L G, Berntson G G (eds.) Handbook of Psychophysiology, 2nd edn, Cambridge University Press, New York, pp. 643–64 George M S, Ketter T A, Parekh P I, Horwitz B, Herscovitch P, Post R M 1995 Brain activity during transient sadness and happiness in healthy women. 
American Journal of Psychiatry 152: 341–51 Glaser R, Kiecolt-Glaser J K 1994 Handbook of Human Stress and Immunity. Academic Press, San Diego, CA Goldin-Meadow S, Mylander C 1983 Gestural communication in deaf children: Noneffect of parental input on language development. Science 221: 372–4 Goldin-Meadow S, Mylander C 1984 Gestural communication
in deaf children: The effects and noneffects of parental input on early language development. Monographs of the Society for Research in Child Development 49: 1–121 Gupta V, Korte C 1994 The effects of a confidant and a peer group on the well-being of single elders. International Journal of Aging and Human Development 39: 293–302 Haber S N, Barchas P R 1983 The regulatory effect of social rank on behavior after amphetamine administration. In: Barchas P R (ed.) Social Hierarchies: Essays Toward a Sociophysiological Perspective. Greenwood Press, Westport, CT, pp. 119–32 House J S, Landis K R, Umberson D 1988 Social relationships and health. Science 241: 540–5 House J S, Robbins C, Metzner H L 1982 The association of social relationships and activities with mortality: Prospective evidence from the Tecumseh Community Health Study. American Journal of Epidemiology 116: 123–40 James W 1890\1950 Principles of Psychology. Dover, New York Kiecolt-Glaser J K, Garner W, Speicher C E, Penn G, Glaser R 1984 Psychosocial modifiers of immunocompetence in medical students. Psychosomatic Medicine 46: 7–14 Lang P J, Bradley M M, Cuthbert B N 1990 Emotion, attention, and the startle reflex. Psychological Review 97: 377–95 Liu D, Diorio J, Tannenbaum B, Caldji C, Francis D, Freedman A, Sharma S, Pearson D, Plotsky P M, Meaney M J 1997 Maternal care, hippocampal glucocorticoid receptors, and hypothalamic-pituitary-adrenal responses to stress. Science 277: 1659–62 Meaney M J, Bhatnagar S, Diorio J, Larocque S, Francis D, O’Donnell D, Shanks N, Sharma S, Smythe J, Viau V 1993 Molecular basis for the development of individual differences in the hypothalamic pituitary adrenal stress response. Cellular and Molecular Neurobiology 13: 321–47 Perkins H W 1991 Religious commitment, ‘Yuppie’ values, and well-being in post collegiate life. 
Review of Religious Research 32: 244–51 Rose R M, Bernstein I S, Gordon T P 1972 Plasma testosterone levels in the male rhesus: Influences of sexual and social stimuli. Science 178: 643–5 Uvnas-Mosberg K 1997 Physiological and endocrine effects of social contact. Annals of the New York Academy of Sciences 807: 146–63 Zillmann D 1994 Cognition-excitation interdependencies in the escalation of anger and angry aggression. In: Potegal M, Knutson J F et al. (eds.) The Dynamics of Aggression: Biological and Social Processes in Dyads and Groups. Erlbaum, Hillsdale, NJ, pp. 45–71
J. T. Cacioppo and G. G. Berntson
Social Properties (Facts and Entities): Philosophical Aspects One area of philosophical concern, dating from the very beginning of philosophy itself, is ontology: what is there?; or, what kinds of things are there? Many questions about existence are straightforwardly empirical, and hence of no interest to the ontologist. Are there unicorns, wombats, planets circling stars other than our Sun? But other existence questions are not empirical, and these are the province of the ontologist. Are there any physical objects, ideas, forms, numbers, sets, propositions, minds, concepts, a Deity, social wholes? To answer these questions is to engage in producing a list of the metaphysical kinds of things that one believes exist. ‘Existence’ in what sense? Some philosophers have held that each metaphysically different kind of thing exists in its own, special sense of ‘existence’; or they may distinguish between being, existence, subsistence, and so on. A more plausible view is that ‘existence’ is univocal, that whatever exists exists in the same sense, and that differences in the ‘mode’ of existence, say, of mind and body, arise not because existence differs in the two cases but because of the differences between minds and bodies themselves. Bodies have a spatial ‘mode’ of existence, it might be thought, and minds do not, but, if so, that is because of the differences between them, not because they exist in different senses of that word. Besides, if existence had a special sense for each metaphysical kind of thing, ontology could become somewhat trivial. To the question of whether, say, propositions really exist, it would be anodyne simply to say that they do exist, but only in the special proposition-sense of existence. Or that minds exist, but only in the special mental sense of existence. Or that possibilia subsist rather than exist (since that is what possible things do) whereas actual things exist rather than subsist. Ontology would entail no honest labor, and therefore no real rewards.
1. Ontological Commitments and How to Wriggle Out of Them When we speak, indeed even whenever we simply think, we typically take ourselves to be speaking or thinking about various things. It is not easy to capture this notion of aboutness, but here I rely on the readers’ intuitive grasp of the idea. I do not always think or speak about anything, whenever I speak or think: in (a) ‘whoever is a bachelor is neurotic,’ I am not trying to speak or think about any particular thing, or even about any set of things. In (a), I am merely linking up two properties: the property of being a bachelor and the property of being neurotic, but I am not speaking about those properties. If I want to speak about the set of bachelors and the set of neurotic persons, I could say: (b) ‘The set of bachelors is a proper subset of the set of neurotic persons.’ (b) is certainly not equivalent to (a), since (b) has ontological commitments of which (a) is wholly innocent. I might be thinking or speaking about Socrates or the man I met yesterday or the tallest building in New York. Let us call whatever I think or speak about ‘entities.’ There are a number of ways in which things can go wrong when I speak about entities. First, I can make mistakes, e.g., when I take myself to be thinking about something that does not in fact exist: Atlantis, Vulcan, witches, or phlogiston. These are cases in which we really are intending to think about something, but fail. Let us call these sorts of cases, ‘cases of elimination.’ There simply are no such things, and if we have at first placed them on our list of what there is, on finding that they do not exist, we remove or eliminate them from the list. Some philosophers have said this about minds, or mental states. In cases of elimination, nothing can be salvaged from talk about such things as it stands. Such talk is plain false (see Lycan and Pappas 1972). Second, there are cases which, on first appearance, seem to be ones in which I am speaking or thinking about something, but closer inspection dispels the appearance. Not all cases of apparently thinking about something turn out, on analysis, really to be such. I may not be really intending to talk about the thing in question at all. I may, for example, talk about the average family, but such talk or thought is not really about what it might seem to be about. There is no such thing as the average family about which I am talking. Average-family talk or thought is merely shorthand for talking about the number of people in the country divided by the number of households (or something like this anyway). There is no such thing as the average family, nothing that should be taken ontologically seriously, beyond taking seriously households and people. Let us call these cases, ‘cases of replacement.’ The original talk, r₁ (‘the average family’), has apparent ontological commitments of some sort. The replacing talk, r₂ (‘households’ and ‘persons’), does not have those same apparent ontological commitments (although of course it has some of its own).
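The average-family case can be made concrete with a small sketch. It assumes, following the text’s own rough gloss, that average-family talk is shorthand for the number of people divided by the number of households; the `Household` class, the function name, and the sample figures are hypothetical and serve only as illustration. The point it shows is that the computation quantifies over households and people alone, and that no item in the data answers to ‘the average family.’

```python
from dataclasses import dataclass


@dataclass
class Household:
    members: int  # number of people living in this household


def average_household_size(households):
    """Replacing talk: people divided by households.
    Nothing here refers to an 'average family' as an entity."""
    total_people = sum(h.members for h in households)
    return total_people / len(households)


# Hypothetical sample, invented for illustration.
households = [Household(1), Household(4), Household(2), Household(3)]

size = average_household_size(households)

# No actual household has the 'average' number of members, so the
# original talk cannot be about any one of them.
no_household_is_average = all(h.members != size for h in households)

print(size, no_household_is_average)
```

Any sentence about ‘the average family’ in this toy setting is true exactly when the corresponding sentence about households and people is true, which is all the replacement strategy requires.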
The burden of proof, in such cases, falls on the person wishing to show that some of the ontological commitments of the original talk are apparent but not real. He or she must indicate the replacing talk that lacks them. In these cases, our original talk gets salvaged, because it is true, if the replacing sentence without the same apparent ontological commitments is true. What must be the case for the replacing talk, r₂, to count as an adequate replacement for the original talk, r₁? The two talks, r₁ and r₂, must convey the same information, and part of that requirement is that r₁ if and only if r₂. No two sentences can convey the same information unless they at least have the same truth values. Beyond this, matters get controversial, but the same-truth-value condition will suffice for our purposes. It is not required that the replacement can actually be carried out in practice. Actually replacing r₁ by r₂ might be cumbersome, complicated, long-winded, or whatever. The replacement must be theoretically possible, and the supporter of the replacement needs to show how the replacement is possible in principle, but not necessarily in practice. Third, and closely related to the above, are cases in which our talk or thought really and not just apparently is about some entities, but they turn out not to be sui generis, since they can be reductively identified as or with another thing. These are, appropriately, cases of reductive identification. This is the reductive identification which claims so much of philosophers’ energies. There are, it may be said, physical objects, but they are really only sets of sense data; there are mental states, but they are identical with brain states; there are numbers but they are merely the same as sets of sets. Such attempts at reductive identification allow the philosophical cake to be both had and eaten. Along with common sense, it can be asserted that there are such things. But, along with some ideal of philosophical acceptability, they can be shown to be reductively identified as merely another thing after all. Insofar as philosophical reductive identification succeeds, it allows us to shorten our ontological list, while not denying the claims of common sense. It is for this reason that the term ‘reductive’ is appropriate; these identifications permit a reduction in the number of basic ontological commitments. On a materialist reductive analysis, for example, there are minds, but they do not have to be taken ontologically seriously, beyond taking bodily or brain states or whatever seriously. Or, according to Berkeleyian idealism, there are tables, chairs, and so on, but they are only ideas in the mind. Once ideas in the mind are on the ontological list, adding tables and chairs would be superfluous. Tables and chairs are already on the list, if ideas in the mind are. 
Again, in cases of reductive identification, our talk gets salvaged, since the original talk is true if the analysis containing the reductive identifier is (see Reduction, Varieties of).
2. Entities, Properties, Facts Socrates or the man I met yesterday or the tallest building in New York are concrete objects. Sets and numbers and propositions, if there are such, are abstract objects, which are neither spatial nor temporal. Other things we think or speak about include spatio-temporal nonconcreta: events and actions, such as the eruption of Mount Vesuvius and the first running of a mile under 4 min; processes such as the erosion of the soil in some specific place over a specific period; and states such as the lake’s remaining calm on such-and-such a night or the stillness of the air on a summer’s eve or the presence of helium in a balloon. Thinking or talking about any sort of thing is caught by reference; we refer to something, namely, to that entity our thought or language is about. This entry
confines itself to discussion of facts, entities, and properties, as such referents of our thought, but no discussion of social ontology could be complete unless it brought to the fore actions, and in particular social actions (see Joint Action and Action, Collective). Typically, when I think about something, I also ascribe some feature or characteristic to it. For example, when I say of Socrates that he is snub-nosed, or of the man I met yesterday that he is a spy, or of the erosion of the soil that it was accidental, I ascribe such a feature or characteristic. Let us call these features or characteristics ‘properties.’ The duality of reference and ascription is captured grammatically by the distinction between subject and predicate, and metaphysically by the distinction between entity and property. (All of this needs serious qualification, and extension to other sorts of sentences, but I forego that pleasure here.) Are there, ontologically speaking, properties as well as entities? There are at least two ways in which to handle this question. First, there is the austere point of view that says that ascribing properties is not enough to be ontologically committed to them. There are properties, only if we talk or think about them, refer to them, and if such talk or thought cannot be eliminated or replaced by other talk or thought which makes no such ontological commitment (see Quine 1963). There are, though, cases in which we do appear to refer to properties: (a) ‘The color crimson is a shade of the color red.’ It is easy to show that (b) ‘The set of crimson things is a subset of the set of red things’ will not do as a replacement for (a). And if no such replacement or elimination is possible, then there are properties, although they may not be sui generis, if any reductive identification of them is possible. 
Second, there is the more relaxed approach, which says that we are ontologically committed to properties if they are actually instantiated (or perhaps even if only well defined), whether or not we refer to them or talk about them. Ascription alone, without reference, may still involve us in ontological commitment. If so, ‘this visual patch is red,’ which ascribes the property of being red to something, but does not refer to the property, commits us to the property of being red as well as to a visual patch, since the sentence ensures that redness has at least one actual instance. Finally, and much more controversially, some philosophers have believed that our account of what there is must include not only entities and properties, but also facts, for example the fact that some entity e has some property P. Facts have been enlisted for many purposes, one such being a correspondence theory of truth, and another being causal analysis. Are sentences true when they correspond to the facts? Is it the fact that the match was struck that caused it to light? These two particular uses of facts have come in for some devastating criticism. But even so, there may well be facts, even if they are useless in theories of truth and for analyses of causality.
If it is a fact that Socrates is snub-nosed, then is there not at least that fact? At least on first appearance, it seems that there are facts. If so, there are not only Socrates and the property of being snub-nosed, but also Socrates’ being snub-nosed (a state) and the fact that he is snub-nosed. If there are facts, then they count as a kind of entity, if they can be referred to and occupy subject position in sentences. ‘The fact that Fiorello LaGuardia was elected disturbed many of his opponents,’ was no doubt true, so we do seem to refer to facts. Without mere replacement, or elimination, or reductive identification, we have facts on our ontological list, as well as entities and properties, states, and events. Needless to say, philosophers have disputed whether all of these ‘reifications’ are really required, and whether some can be understood in terms of the others. Perhaps numbers are just sets. Perhaps propositions are just sets of similar sentences. Perhaps minds are just brains. Perhaps there is the fact that Socrates is snub-nosed, or perhaps there is no such thing. And even if there is this fact, maybe it is identical with the state, Socrates’ being snub-nosed. But none of that need be decided here.
3. Some Social Ontology
Thought and discourse about society, whether lay or by the social scientist, raises these ontological questions too. Just as I can think about Socrates or the tallest building in New York, so too I can think about France or the Icelandic working class or the commodity market or the National Football League. Our very thought about society appears to commit us to believing in nation states, governments, classes, social structures, tribes, clubs, associations, and so on, just as our thought about Socrates and the tallest building in New York appears to commit us to believing in persons and buildings. And we are really committed to these social entities, unless we can show that these turn out to be cases in which either we can, without loss, eliminate discourse about social entities entirely, or we can replace such talk with other talk that makes no similar ontological commitment, or there is some reductive identification available for the putative social entities. What does ‘social’ connote? The basic application of the term relates to properties (for more about social properties, see below). What makes a property a social property? This is, I think, a difficult question. Some entity’s having a property may have a social cause or social effect, without making the property itself social. For example, the soil’s being eroded may be caused by inept bureaucracy, and may result in a stock market plunge, but none of that makes the property of being eroded a social property. Whether a property is social
depends on something intrinsic about the property itself, and not on the causes or effects of its being instantiated. A property is social (roughly, without some refinements unnecessary here) if and only if from the fact that it is instantiated, it follows that at least two people exist and have an interlocking system of beliefs and expectations about one another’s thoughts or actions. Intuitively, I think this gets our idea of sociality about right, although one might just think that the requisite number should be greater than two. For example, the property of jointly carrying something is not social on this account, and rightly so, because two men might jointly carry something without either being aware of what the other was doing, oblivious to each other’s very existence. Jointly carrying requires two people, but makes no demands on their beliefs and expectations about one another. Purchasing a stone is, on the other hand, a social property on this account, and rightly so. No two people could be engaged in such a transaction without having a whole host of beliefs and expectations about the actions of the other. Social entities, if such there be, are entities such that they have at least one social property essentially. France, for example, is essentially a nation; the Icelandic working class is essentially a class, a working class, and Icelandic. Once the idea of a social property is to hand, we can use it to introduce the idea of sociality as applied to other ontological categories. So our idea of the social fans outwards, as it were, from our idea of a social property. Social entities, if such there be, are never concrete, in the way in which earlier examples of entities were. Concrete entities, like Socrates and the tallest building in New York, are exclusive space occupiers, no two of which can occupy the same space at the same time. 
Social entities do, in general, appear to be space occupiers; France occupies a certain region of physical space, the Icelandic working class is physically in Iceland. But social entities are not exclusive space occupiers. Suppose a diocese of the Catholic Church, call it ‘Sancta Gallia,’ occupies exactly the same geographical area as does France. If so, two distinct social entities, France and Sancta Gallia, occupy the same physical space at the same time. What, then, of the three options we have available, for dealing with these apparent ontological commitments of our social discourse? Elimination of discourse about social entities would seem to involve an intellectual loss; we could not even in principle say things we would want to say without the discourse. (Imagine where we would be without sociology, anthropology, and a part of economics, for example.) The issue here is not just one of linguistic economy. It is not that we could not nearly as easily say what we want without social discourse but could only do so in an altogether more baroque fashion; rather, the claim is that we would have no way in principle of conveying what we want to say without that social discourse.
On the other hand, replacement does seem possible for some very simple examples. A case which has been mentioned in the literature is ‘The Jewish race is cohesive,’ which appears to commit the speaker or thinker to an entity, the Jewish race. But, it is alleged, this is replaceable by ‘Jews tend to marry other Jews,’ etc., and the latter makes no such ontological commitment to the existence of a race. ‘The Jewish race is cohesive’ behaves very much like ‘The average family has 3.2 members.’ (See J. W. N. Watkins, ‘Ideal Types and Historical Explanation’ and ‘Historical Explanation in the Social Sciences,’ the first of which is reprinted in Ryan (1973), and both of which are reprinted in O’Neill (1973).) I do not wish to claim that these replacements are never available; perhaps they do work for the case just mentioned. But it seems hard to see what replacement strategy would work for many other examples. Consider ‘France is a charter member of the United Nations.’ What sentence or sentences could replace this, yet convey the same information, at least in the sense of having the same truth value, and yet make no ontic commitments to a nation state and a world organization? I can here only invite the reader to suggest candidates; I myself know of none. Replacement does not work as a general strategy. However, it is one thing to admit that there are such entities as France, the United Nations, and the Icelandic working class, but quite another to insist that they cannot be reductively identified with something more respectable. Many philosophers are drawn to reductive strategies of one kind or another: there are mental states, but they are (merely) identical with brain states; there are numbers, but they are (merely) identical with sets of sets; there are physical objects, but they are (merely) identical with sets of sense data.
So too various reductive strategies suggest themselves for social entities: France is (merely) a geographical area, a set of persons, etc. No one can ‘prove’ that no reductive strategy could work. What must be done, I think, is to evaluate each reductive strategy that suggests itself, one-by-one. There are some general lessons one can draw, though. The vague thought, ‘Nations are identical with the people who make them up,’ is ill-formed. In any reductive identification, names must flank both sides of the identity sign. So, for instance, ‘France = …’ Well, identical with what? Name fillers for the space on the right-hand side of the identity sign will either be extensional or nonextensional entities, in the following sense. Sets and aggregates and land masses are extensional, since, if a and b are sets or aggregates or land masses, a = b if and only if a and b have exactly the same members, or components, or parts. But no extensional entity in this sense could be identical with France, for example, since their identity conditions will differ. France can remain numerically one and the same, in spite of variation in its citizenry
or residents, or land occupancy. Even if one considered the set of all actual citizens of France or pieces of land that France ever occupied at any time, past, present, and future, there would be counterfactual differences. France could have occupied a different piece of land than it did, or had different citizens or residents than it did, but this could not be true of the set or aggregate of its citizens, residents, or lands. France is nonextensional; sets, aggregates, and land masses are extensional, and no nonextensional entity could be identical with an extensional one. (In fact, this little argument which relies on counterfactual differences assumes that ‘France’ and ‘the set of …’ both rigidly designate, since the identity required is identity across all possible worlds. But I would argue that this assumption is true.) If France is a nonextensional entity and no nonextensional entity can be identical with an extensional one, then what about some nonextensional entity for the reductive identification of France? Examples that come to mind are things like groups. Is France to be identified with the group of French people? The problem with this suggestion is that a group seems itself to be a social entity, and hence hardly a candidate for the reductive identification of France with a nonsocial entity. What the determined reductivist needs to find is a nonextensional entity which is also nonsocial, for the reductive identification of France, and I am aware of no such obvious candidate. So much for social entities. What sort of properties get ascribed to things when we think or talk socially?
We can think or speak of either social entities or nonsocial ones like Socrates or the tallest building in New York and, in either case, ascribe social properties to them: ‘France is a charter member of the United Nations’ ascribes to a pair of social entities, France and the United Nations, the relational social property of being a charter member of; ‘Socrates was mayor of Athens’ ascribes (falsely, as it happens) to a nonsocial entity, Socrates, the relational social property of being a mayor of. From the fact that social properties figure in our discourse, it does not yet follow that we refer to or think about social properties. The relational social property, being the mayor of, certainly figures in ‘Fiorello LaGuardia was mayor of New York,’ but the sentence is not about that social property. It is only about Fiorello LaGuardia and New York. On the relaxed criterion, the sentence commits us to the existence of three things: a person, a city, and the social property of being a mayor of. On the austere criterion, since the sentence is only about Fiorello LaGuardia and New York, the sentence commits us only to the existence of a person and a city, but not to the existence of a social property. But even were we to adopt the austere view, there are a host of arguments, modeled on ones about properties generally, which do move us to talking about social properties, and so to the existence of
social properties. Here are terse statements of two such arguments: (a) If Iceland is capitalist, and if France is capitalist, then there is some property that Iceland and France share; (b) If France is a charter member of the United Nations, then it follows that France is a member of the United Nations. The inference in (b) requires taking properties ontologically seriously. The first argument parallels one by Hilary Putnam (1980) on properties generally; the second parallels one by Donald Davidson (1980) on action. So we talk and think about social properties, and I do not see how we could simply eliminate, do without, such discourse. Nor do I know of any discourse which could replace discourse about social properties, and which both lacked the ontological commitment to social properties but allowed us to convey the same information or messages. So, on the austere view, there seem to be social properties, as well as nonsocial ones. A fortiori, on the relaxed view, we get properties, social and nonsocial, just insofar as we ascribe them to actually existent things, or just in case they are well defined. Could each social property be reductively identified either with some specific nonsocial property, or with some finite Boolean construction of such? I doubt whether this could be so. Let me give an indication of why I think this. Some social properties are both nonvariable and weak, in the following senses. A social property P is nonvariable if and only if, whenever P is instantiated, some specific set s of interlocking beliefs and expectations exists. For example, the property of participating in the custom of drinking tea at breakfast is nonvariable, since the beliefs and expectations associated with it must be the set of beliefs and expectations about drinking tea at breakfast.
A social property P is weak if and only if, whenever P is instantiated, the interlocking beliefs and expectations are about types of events which are not themselves social. The social property of participating in the custom of drinking tea at breakfast is also weak, since the beliefs and expectations associated with it concern drinking tea at breakfast, which is itself a nonsocial event or activity. But the most characteristic sorts of social property are neither nonvariable nor weak; they are variable and strong. Consider the property of being a mayor. If that property is instantiated, it is true that some set of interlocking beliefs and expectations must exist, but there is no one particular set that must exist. There is an indefinitely wide range of things that a mayor might be expected to do in various different societies or social settings. So the property is variable, in the sense of being indefinitely variably instantiable. Moreover the property is strong, since what a mayor is expected to do involves events and activities like receiving the ceremonial key to the city, opening formal events, representing the city at certain social or political functions, and so on, and these things are themselves social events or activities.
It seems to me that these two features of typically social properties make the reductive identification of social properties with nonsocial ones exceedingly implausible. One might have thought that a social property could be reductively identified with the nested and interlocking set of beliefs and expectations with which it is associated. But since there is an indefinitely long list of such sets associated with each typically social property, and these beliefs and expectations themselves require the existence of social events or activities, the reductive identification appears to fail. I do not have the space here to discuss the issue at length of whether it is necessary to admit facts to our ontology. Frequently, the case for or against reduction of social science to physical science has been put in terms of facts; supervenience claims in social science typically enlist social and nonsocial facts as the terms of the debate. (See Maurice Mandelbaum, ‘Societal facts,’ reprinted in Ryan (1973) and O’Neill (1973).) But what does seem plain is that if there are facts, then, in the light of what we have already said above about social properties and social entities, there must be social facts. A fact would qualify as social if either it was a fact about a social entity, or it was a fact about any kind of entity, social or nonsocial, and attributed a social property to that entity. So the fact that the United Nations contributed to relief in Kosovo is a social fact, since it is about a social entity, and the fact that LaGuardia was mayor of New York is a social fact, since it attributes a social property to a nonsocial entity, namely to a person. Often, the positions described in the relevant literature about social ontology are posed in terms of individualism and holism, sometimes with the qualifying adjective, ‘methodological’ added (see Methodological Individualism: Philosophical Aspects).
Sometimes, the positions are expressed in terms of facts, and at other times, in terms of entities. Supervenience claims for the social (‘the social supervenes on the nonsocial’) are sometimes posed in terms of facts, at other times in terms of properties. It might be claimed that there are no irreducible social facts, or that there are no irreducible social entities or that there are no irreducible social properties. As can be seen from the discussion above, the positions are not equivalent. For example, as I have claimed, some social facts ascribe social properties to nonsocial entities. Rarely, if at all, is the distinction drawn between the existence of entities, properties, and facts. Does individualism in the social sciences eschew social entities but admit irreducible social properties? Or should it eschew both? What should it say about facts? Might there be distinct versions of individualism, so that it comes in stronger and weaker forms? The lesson is that these terms, ‘holism’ and ‘individualism,’ do not name unambiguous positions unless and until various distinctions are drawn, and the ontological claims of each position are clearly
specified. (See S. Lukes, ‘Methodological Individualism Reconsidered,’ reprinted in Ryan (1973).) See also: Atomism and Holism: Philosophical Aspects; Causation: Physical, Mental, and Social; Causes and Laws: Philosophical Aspects; Identity and Identification: Philosophical Aspects; Individualism versus Collectivism: Philosophical Aspects; Shared Belief
Bibliography
Currie G 1984 Individualism and global supervenience. British Journal for the Philosophy of Science 35: 345–58
Davidson D 1980 The logical form of action sentences. Reprinted in Davidson D Actions and Events. Clarendon Press, Oxford
Garfinkel A 1981 Forms of Explanation: Rethinking the Questions in Social Theory. Yale University Press, New Haven, CT
Gilbert M 1989 On Social Facts. Routledge, London
Goodman N 1961 About. Mind 70: 1–24
Lycan W, Pappas G 1972 What is eliminative materialism? Australasian Journal of Philosophy 50: 149–59
McKeon R (ed.) 1966 The Basic Works of Aristotle. Random House, New York
Mellor D H 1982 The reduction of society. Philosophy 57: 51–75
O’Neill J (ed.) 1973 Modes of Individualism and Collectivism. Heinemann Educational Books, London
Pettit P 1993 The Common Mind. Oxford University Press, Oxford, UK, Chaps. 3, 4
Putnam H 1980 On properties. Reprinted in Putnam H Philosophical Papers, Cambridge University Press, Cambridge, UK, Vol. 1
Quine W V O 1963 On what there is. In: From A Logical Point of View. Harper and Row, New York, pp. 1–19
Quinton A 1975–6 Social objects. Proceedings of the Aristotelian Society 76: 1–27
Ruben D-H 1985 The Metaphysics of the Social World. Routledge and Kegan Paul, London
Ruben D-H 1990 Explaining Explanation. Routledge, London, Chap. V
Ruben D-H 1998 The philosophy of the social sciences. In: Grayling A C (ed.) Philosophy 2: Further Through the Subject. Oxford University Press, Oxford, UK, pp. 420–69
Ryan A (ed.) 1973 The Philosophy of Social Explanation. Oxford University Press, London
Taylor B 1985 Modes of Occurrence. Blackwell, Oxford, UK
D.-H. Ruben
Social Protests, History of
Social protests have been integral to collective life since the first complex communities were formed. One of the oldest references to protest was found on an
inscription in a tomb in Egypt’s Valley of the Kings from the reign (1198–66 BC) of Rameses III in the New Kingdom translated as follows: ‘Today the gang of workmen have passed by the walls of the royal tomb saying: we are hungry, eighteen days have gone by, and they sat down behind the funerary temple of Tuthmosis III … the workmen remained in the same place all day’ (Romer 1982, pp. 193–5). Social protest is defined here as contentious action undertaken collectively in response to perceived injustice or unfair action on the part of those who hold legitimate political and economic power. It seeks to achieve social (as opposed to political and economic) ends, or alternatively to restore or return to earlier ways of life. Examples of such aims are a more equitable distribution of privilege or wealth, reducing inequality among persons or groups, changing or restoring religious beliefs and/or cultural practices, and reversing cultural change. This article examines social protest in Western and non-Western societies from Antiquity through the nineteenth century.
1. Protest in Imperial Rome
The authoritarian regime of Imperial Rome did not offer many opportunities for peaceful communication of ordinary people’s grievances or wishes; despite the city’s heavy policing, however, the large agglomerations of people in the Colosseum, Circus, and theaters offered the possibility of mass (and more or less anonymous) expressions of popular opinion. In these settings, crowds could shout demands in the Emperor’s presence for the disciplining of a hated official, or object to an unpopular war. The streets too were a locus for popular protest, especially that about the shortage or high price of grain. Street disorder of this type contributed to the fall of Nero. Crowds of ordinary citizens might also call for the downfall of a usurper of the Imperial throne or express their dissatisfaction with a prefect or his policy. The grain riot or angry complaints about the high price of bread continued to be a feature of popular protest in many parts of the world, right into the twentieth century. Most medieval and early modern European states sought to regulate the grain trade and concomitantly the price of bread at retail. These controls began to be modified and eventually removed as trade was liberalized in Western Europe, and newly formulated laissez-faire theories were put into effect in the late seventeenth century. The result was an upsurge of two types of protest: (a) crowd blockage of grain being moved on roads or waterways out of a growing region with a surplus to one with a shortage, or a city (especially an administrative city or the capital) with a stronger claim on adequate supplies; and (b) the market riot with confiscation of grain, its sale by the
crowd, and often payment (at lower than the asking price, of course) to the owner for his grain. Market riots were more likely to occur in cities or towns, sometimes along with attacks on bakers and looting of their shops.
2. Protest about Food Supply and Taxation in Old Regime France
Looking more closely at the relationship of food supply and public order in ancien regime France, widespread food riots beginning in the late winter or early spring (when the price of grain began to rise if the fall’s crop had been small) and involving large numbers of people occurred in cycles beginning in the seventeenth century and ended only in the early 1850s. The timing of the upsurge suggests a relationship with political centralization, including policy decisions concerning economic matters and the modification of longstanding paternalist policies, especially those concerning the production and marketing of grain. Several of the modern period’s great revolutions (French, 1789 and Russian, 1917) began with contention about food prices, and went on to attack, and overthrow, monarchies. Tax revolts are another common form of protest that span the centuries of European history; the well-documented case of France is again illustrative. Increased taxation to finance the dynastic ambitions and consequent wars of the Bourbon kings made heavy demands on ordinary French men and women, to which they responded with protest. In a political system in which noble status (and ecclesiastical office) exempted the upper classes from most taxation, the fiscal burden fell disproportionately on peasants and commoners. Seventeenth-century disputes over the legitimacy of new direct taxes, including attacks on collectors, were usually community affairs, but sometimes transcended single communities when the tocsin was sounded at a tax collector’s first stop to arouse a whole region. New indirect taxes were farmed out to individuals (tax farmers) who had the responsibility for collecting them through private agents; both tax farmers and agents were subject to attack.
Although at first, punishment was more exemplary than extensive, widespread repression became common over the seventeenth century; the label ‘croquant’ (a countryman armed with a stick) was first applied to peasant rebels in 1594 in southwest France, Yves Bercé tells us, and continued to be used in recurrent rebellions until 1752. The titles of other seventeenth and early eighteenth-century rebellions likewise suggest the commonalty: Sabotiers (shoemakers, 1658), Bonnets Rouges (red caps, 1675), and Camisards (white shirts, intermittently from 1685 to 1710). These rebellions occurred in off-the-beaten-track areas of France that had formerly been only lightly taxed or exempt from particular taxes, so that inhabitants took extreme umbrage at the imposition of new taxes or stricter enforcement of the old.
3. Early Modern Religious Protest in Europe
The revolt of the Camisards illustrates religious protest, another type of social protest which can be found across Europe in the sixteenth to eighteenth centuries. In France, the Edict of Nantes (1598, promulgated after decades of religious wars by Henry IV) had granted freedom of conscience to Protestants all over France, permitted public worship by Protestants in designated towns, and guaranteed their civil rights; in 1685, however, Louis XIV revoked the edict, thus capping several decades of decrees limiting the effect of the policy of tolerance, and increasing the pressure on Protestants. They nevertheless continued their forbidden forms of worship (often outdoors, since they had no churches), especially in the southwest province of Languedoc; there Protestant Camisard bands conducted a guerilla war in the hills, clashing over 25 years with the forces of order. The military repression was expensive, and the rebels reduced tax revenues by attacking tax collectors and destroying property, but royal forces finally succeeded in ending Protestant protest in 1710. Over the last years of the Old Regime, a more tolerant policy emerged unevenly, alternating with periods of intensified repression. Outside France, but in Western and Central Europe, the sixteenth century was also a period of extensive social protest connected with the Protestant Reformation. The German Peasant War has been the object of controversy since it began and Catholic authorities promptly claimed that Luther’s revolt against the church had promoted peasant rebellion. Historians of the rebellion have also disagreed, with Marxists— starting with Friedrich Engels—claiming that the Peasant War was socioeconomic class warfare and non- or anti-Marxists interpreting it as a conservative movement seeking to restore Catholic principles of right and justice. More recently, Peter Blickle (1981) has again emphasized the religious underpinnings of the conflict. 
He argues that contrasts between urban and rural revolt are not clear cut; both urban and rural ordinary people emphasized justification through scripture. And there was only one outcome: the rising ended with the defeat of the rebels and the execution of their leaders. In 1536, the Pilgrimage of Grace challenged Henry VIII’s dissolution of the monasteries in England, with a brief aftermath early in 1537. Like the revolt of the Camisards, the Pilgrimage (the term includes several more or less coordinated local risings) mixed religious issues with political and economic ones; unlike the Camisards, however, the Pilgrims were protesting Henry and his minister Thomas Cromwell’s
rejection of aspects of Roman Catholic doctrine. The Pilgrims’ banners portrayed the Five Wounds of Christ and they were often led by priests. Among the significant political events which preceded the revolt in 1536 were the passage of parliamentary acts which dissolved the ‘lesser monasteries,’ and permitted the king to name his own successor; the clergy of England then rejected the doctrine of purgatory, and accepted Henry’s Ten Articles making (what seemed to some) modest changes in religious doctrine, a compromise document. The Pilgrimage began with the rising of nine ‘hosts’ (popular armies, including members of the gentry, clergymen, town dwellers, and peasants) starting in Lincolnshire, followed by eight more raised in Yorkshire and elsewhere in the north. Attracting new recruits on their way to take the city of York without a battle, the rebels united around the term ‘pilgrimage’ to designate their undertaking to ask the king’s grace in preserving ‘Christ’s church’ and their sworn oath. Robert Aske, a member of the gentry, became their spokesperson. First a truce, and then a general pardon from the king were bargained and granted (it contained promises of a parliament in York and that the abbeys which had been closed under the dissolution decree would be reopened) which led to class divisions within the hosts and in January 1537, renewed rebellion, which was put down militarily. Compared to the German Peasant War, the Pilgrims of 1536 included gentry, peasants, and townsmen, but like most early modern large-scale protests, it failed to achieve its backward-looking goals.
4. England’s Glorious Revolution and its American Colonies
England’s seventeenth century Stuart monarchs struggled with Parliament over taxation and royal prerogative, a struggle that ended in civil war when Charles I ordered the arrest of his opponents in the House of Commons. Charles I was beheaded after he was defeated in battle but refused to abandon his opposition to parliamentary claims. When the Puritan Republic ended after Oliver Cromwell’s death with the restoration of the Stuarts, Charles II (1660–85) and James II (1685–88), the struggle between kings and Parliament resumed. The struggle was finally resolved in the Glorious Revolution of 1689, when James II abdicated to be replaced by William and Mary; the outcome was a greater role in English governance for Parliament. Wars consumed the West European mainland until the mid-eighteenth century, when monarchical governments reached a new great power balance in which Spain played only a peripheral role. France and Great Britain (following the unification under the same monarch of England and Scotland) faced each other as rivals not only in Europe but also in North America
and South Asia, where both had colonial interests. The following paragraphs look at social protest in other world regions, both colonized and free, starting with the British colonies in North America and continuing with two Asian examples. During the Restoration period, the British kings sought to increase their control over their North American colonies by passing Navigation Acts, based on mercantilist economic theory, which would limit colonial trade and manufacturing in order to promote English trade and manufacturing. In pursuing this goal James II replaced the proprietary governments of New Hampshire, Massachusetts, and the Carolinas with royal governors and suspended their elected assemblies. The colonists resisted these interventionist efforts, overthrew the governors of New York and Massachusetts, and replaced the proprietor of Maryland. The interests of colonial populations had diverged from those of the royal governments in England, including that of William and Mary after the Glorious Revolution (1689) in which James II was forced into exile. The 13 North American British colonies increasingly became another locus for the rivalry of France and Britain that was also being played out in Europe. With the end in 1763 of the Seven Years’ War, Britain took over most of the former French colonies, while domestically it staggered under an enormous war debt. The government searched for a solution to its liquidity crisis in the colonies, attempting to contain settlement south of Canada to the area east of the Appalachians, impose new taxes, and limit self-government in North America. These heavy-handed British policies only increased colonist dissatisfaction, and to make matters worse they were followed by new revenue-generating laws: the Sugar Act (to improve enforcement of the duties on molasses), the Currency Act (which forbade the colonies from issuing paper money), and the Stamp Act (which taxed the paper used in legal documents and publications).
Public protest erupted in several of the colonies, with coordinated resistance in Virginia and Massachusetts, and a meeting (the Stamp Act Congress) of delegates from nine colonies in New York in 1765 to plan further protest. Colonists boycotted imports from Britain and formed the Sons of Liberty, who sponsored more public meetings; attacks on tax collectors spread. Although the Stamp Act was repealed, new taxes were passed, including a duty on tea; to enforce compliance special courts were established and British troops were transferred from the frontiers to eastern cities. The dispute again escalated as colonists began new boycotts and renewed attacks on tax collectors. The tea grievance did not disappear even though the new duties were again repealed, for Parliament granted the British East India Company a monopoly (another mercantilist policy) for the importation of tea into the colonies. This time patriots disguised as Native Americans dumped a large quantity of tea into Boston Harbor. British authorities placed the Massachusetts colony under military rule, and open rebellion followed, starting in 1776. By 1781, the colonists had defeated Britain with the help of France and a peace treaty was signed which granted unconditional independence to the 13 colonies. Spanish America also had its colonial protest and revolutions, beginning when Spain was distracted by Napoleon’s invasion in 1808 and most of its colonies in the Americas achieved independence.
5. France and Germany before and after 1789
Protest in continental Europe in the eighteenth century fell into the patterns described for early modern England. In France, protest about grain shortfalls or bread prices continued to dominate even as the monarchy sought to improve the circulation of grain so that regional shortages would not arouse public ire. In fact, the methods that the French state followed to reduce protest (removing market and price controls) merely agitated local populations because they could no longer count on controlled grain markets. The ‘free trade’ solution for France’s economic problems had the effect of making matters worse from the point of view of consumers, and both Napoleon and the restored monarchy after 1815 reinstated some controls in difficult times. After the establishment of the Third Republic in 1871 both the paternalist policies and old forms of protest disappeared. Expanded political incorporation via universal male suffrage (passed early in the Revolution of 1848 but manipulated under Napoleon III to produce needed electoral majorities) was finally implemented in the 1870s. Better harvests and the more efficient transportation that the railroads provided for moving grain to regions with poor harvests helped to reduce protest as well. Worker unionization became legal in 1864 and, with time, strikes became a routinized form of collective action in France as well as Britain. This does not mean that violence disappeared, for violence is an interaction between police (or the military in cases of martial law) and protesters. The point is that both more professional methods on the part of the authorities handling strikes and protest and greater formalization and nationalization of political parties and workers’ organizations led to the development of routinized policies for dealing with collective protest, thus reducing its impact (Tilly 1971).
The late nineteenth and early twentieth century was a watershed in typical forms of collective protest, one which the German states also crossed. Reports from German states such as Saxony in 1842 detailed how rural craftsmen who were dependent on wages to buy bread had mobbed grain dealers who had bought and were moving grain out of the region to more distant urban areas (Tilly et al. 1975, p. 192). By the early twentieth century the industrialization of production, unionization of laborers, and urban populations had increased disproportionately in Prussia. (Unified Germany was a federation of formerly independent states, the most powerful being Prussia, and urban public order was the responsibility of the individual states.) In 1910 there was a massive disciplined demonstration in Berlin in which the Social Democratic Party (SPD) leaders called for a political response: reform of the Prussian three-class electoral laws. The demonstration was then still a comparatively recently adopted form of protest through which ordinary people could bring their dissatisfaction with public policies to the attention of the government.
6. Social Protest in Nineteenth-century Asia
We turn now to Asia, to examine two non-European cases of nineteenth-century protest: in the Dutch East Indies and China. The Dutch East India Company (founded in 1602) had displaced the Portuguese in Southeast Asia by seizing Malacca at the southern end of the Malay Peninsula and conquering the kingdom of Acheh on Sumatra as well as the western part of Java by mid-century. The Dutch built their administrative capital, Batavia, in western Java, leaving the rest of the island for some years to the Javanese kingdom of Mataram. A series of civil wars and regional revolts involving the Dutch and local princes ended in the 1750s with the division of Mataram into four new princely states, but lower-level war continued. By then, Chinese immigrants controlled local and inland trade and the Dutch ran plantations and long-distance trade in coffee and teak. By the late nineteenth century, Michael Adas (1979) shows, Dutch control over Javanese society had profoundly undermined the island’s formerly independent economy. The Dutch system for the collection of tolls on roads and canals (using Chinese subcontractors) was similar to seventeenth- and eighteenth-century French tax farming and aroused similar objections. Moreover, the Dutch could not win militarily because they lacked knowledge of Java’s interior and in the end they chose expediency and limited goals, preventing further unrest and raising new revenues to pay the costs of the war. This was done by forcing Javanese peasants to dedicate part of their time to cultivating cash crops to be sold overseas by the government.
Although peace was restored, Dutch agricultural policy severely distorted Javanese social and demographic patterns, and drained its economy, as Clifford Geertz (1963) demonstrates in his study of ‘agricultural involution.’ China was never fully colonized, but Western intervention in its economic and political affairs beginning in the eighteenth century nevertheless had a great impact on its social and economic development. The Taiping Rebellion of the mid-1800s serves as an example here. Jesuit missionaries had been active for some time in the area around Guangzhou (Canton), as were British merchants buying tea and looking for customers to buy their products, with little success.
The immediate result was a negative trade balance for the British; to remedy this, European merchants together with Chinese partners imported opium from India into China through Canton, a profitable commerce that grew rapidly in the early years of the nineteenth century. When the Qing court belatedly addressed the problem in 1839 and sought to cut off the trade altogether, the British easily established their superiority in gunboats and arms and forced the Chinese to sign a treaty that opened five ports to them with very low tariffs. Soon after, a Chinese treaty with the United States legalized the importation of opium by foreigners. Chinese resentment of both missionary efforts to convert them and Westerners’ privileges increased. The study of C. K. Yang (1975) of ‘mass action incidents’ (group actions involving public issues) that he links with others to constitute ‘events’ shows an upsurge of incidents in the years 1846 to 1875. The best known of these events is the Taiping Rebellion, which started in the poor and backward region of Guangxi, to the west of Guangzhou. Its originator and leader, Hong Xiuquan, was a disappointed office seeker who encountered Protestant missionaries in Guangxi and, inspired by the Christianity they preached to him, developed his own messianic version of the message. Believing himself to be the brother of Jesus Christ, he would found a new Heavenly Kingdom (Taiping) that would bring peace on earth. The Qing government sent a military force to Thistle Mountain, where Hong had settled, to arrest the Taiping leaders, but the mission failed, and the Taiping continued to preach. New converts extended the membership beyond the Taiping. Life among the Taipings was highly regimented, with sex-segregated military work teams; women were forbidden to bind their feet and were welcomed as workers and soldiers.
The Qing forces of order were unable to chase down the Taipings, who crossed the Yangzi River and in 1853 occupied Nanjing, holding it for more than 10 years. The rebellion was only crushed after the French and British, who had invaded Beijing in 1860 to punish the Qing for not following up the treaties ending the Opium War, joined the Qing army against the Taipings. Frederic Wakeman Jr. (1966) concludes that the Taiping Rebellion earned the unhappy record as the world’s bloodiest civil war, in which tens of millions were killed in fighting and executions. As social protest, it combined many incidents and lasted for years, but nevertheless is considered one event because of the numerous links between the incidents.
7. Twentieth-century Protest in the West
The twentieth century has seen new or reinvented forms of social protest such as boycotts, sit-ins, and more formalized social movements (q.v.). Citizens of most national states have by and large received rights to participate fully in the political process, but the extent to which governments respect those rights continues to differ. Whether citizens can influence policy formally (through elections and representative government) or informally (through protest) is still disputed between citizens and states in many parts of the world. What has changed is a greater consensus between protestors and authorities that has largely formalized protest and how it is expressed. Even in the West, however, social protest may become violent either through design (for example, the anarchist protest at the World Trade Organization’s 1999 meetings in Seattle), or through the breakdown of policing procedures that have been developed in Europe and North America to handle such situations. The contours of social protest continue to shift in many countries, especially in those in which channels for making citizens’ opinions known are very limited or non-existent and where governments seek to limit or even prevent popular participation. Social protest has not disappeared: it continues to occur in both authoritarian states and those with long-established formalization of procedures as a means for expressing dissent or the desire for change.
See also: Right-wing Movements in the United States: Women and Gender; Social Movements and Gender
Bibliography
Adas M 1979 Prophets of Rebellion: Millenarian Protest Movements Against the European Colonial Order. University of North Carolina Press, Chapel Hill, NC
Africa T 1971 Urban violence in imperial Rome. Journal of Interdisciplinary History 2: 3–19
Berce Y-M 1990 History of Peasant Revolts: The Social Origins of Rebellion in Early Modern France (Trans. Whitmore A). Cornell University Press, Ithaca, NY
Blickle P 1981 The Revolution of 1525: The German Peasant War from a New Perspective (Trans. Brady T A, Midelfort H C E). Johns Hopkins Press, Baltimore
Bush M 1996 The Pilgrimage of Grace: A Study of the Rebel Armies of October 1536. Manchester University Press, Manchester, UK
Davies C S L 1968 The pilgrimage of grace reconsidered. Past and Present 41: 54–76
Geertz C 1963 Agricultural Involution: The Process of Ecological Change in Indonesia. Association of Asian Studies, University of California Press, Berkeley, CA
Romer J 1982 People of the Nile: Everyday Life in Ancient Egypt. Crown, New York
Scribner B, Benecke G (eds.) 1979 The German Peasant War of 1525: New Viewpoints. George Allen and Unwin, London
Tilly C 1986 The Contentious French. Belknap Press of Harvard University Press, Cambridge, MA
Tilly C, Tilly L A, Tilly R 1975 The Rebellious Century. Harvard University Press, Cambridge, MA
Tilly L A 1971 The food riot as a form of political conflict in France. Journal of Interdisciplinary History 2: 23–57
Wakeman F 1966 Strangers at the Gate: Social Disorder in South China, 1839–1861. University of California Press, Berkeley, CA
Weller R P 1994 Resistance, Chaos and Control in China: Taiping Rebels, Taiwanese Ghosts and Tiananmen. University of Washington Press, Seattle
Yang C K 1975 Some preliminary statistical patterns of mass actions in nineteenth-century China. In: Wakeman F, Grant C (eds.) Conflict and Control in Late Imperial China. University of California Press, Berkeley, CA, pp. 174–210
L. Tilly
Social Psychology
Social psychology may be defined as the scientific study of how people perceive, influence, and relate to other people. Explicit in this definition are the field’s modus operandi (empirical, data-based investigations of theories and hypotheses derived from those theories) and its goal of understanding the processes by which people think about and interact with others. Traditionally, this very broad umbrella incorporates diverse specific topics, including, but not limited to, attitudes, social cognition, decision-making, social motivation, prejudice and stereotyping, social identity, persuasion, altruism, aggression, interpersonal attraction, close relationships, conformity, group dynamics, group productivity, and conflict resolution, all of which are discussed in more detail elsewhere in this Encyclopedia. Because social interaction is fundamental to most human behavior, it is useful to consider how the perspective of social psychology fits with that of the other behavioral and social sciences. Although in its early development social psychology was closely aligned with sociology, primarily because of the shared interest in social interaction, since the 1960s these fields have become largely independent of each other. The social psychological perspective is fundamentally psychological; it tends to focus on the thoughts, feelings, and behavior of individuals, whereas the sociological perspective is more immediately concerned with the analysis of social structure and the operation of social and societal systems. Similarly, social psychology tends toward a relatively individual-centered analysis of certain questions that other disciplines such as anthropology, economics, and political science generally address at more systemic levels of analysis (although the recent emergence in those fields of such specialties as game theory and behavioral economics suggests growing commonalities).
In contrast, social psychology’s interest in situations (discussed below) and normative patterns of behavior differs from that of personality psychology (see Personality Psychology), which tends to be concerned with individual differences and their ramifications. Whereas personality psychologists seek to understand consistencies of a single individual’s behavior across varying situations, social psychologists seek to understand consistencies across persons in their responses to particular situations. Thus, the social psychological level of analysis may be regarded as a mid-station between those theoretical systems that conceptualize behavior on the one hand in terms of the individual’s unique internal properties and on the other hand in terms of the external social structure.
1. Core Principles of Social Psychology
1.1 The Situational Context of Behavior
Social psychology traditionally has emphasized the latter term in Kurt Lewin’s classic dictum, B = f(P, E) (behavior is a function of the person and the environment), to reflect the fact that behavior varies, often profoundly so, depending on the social environment in which it occurs (see also: Lewin, Kurt (1890–1947)). The term social environment typically is used in a very broad manner, encompassing not only patently interpersonal examples (e.g., whether there is conflict or commonality of interest between partners, whether or not a new acquaintance is a member of one’s in-group, and whether performance feedback is supportive or controlling) but also its relatively more impersonal features (e.g., whether persuasive appeals are based on style or substance, whether induced behavior is logically consistent or inconsistent with personal beliefs, and whether circumstances encourage deliberate or automatic processing of social information). Regardless of which variables are included under the heading of situational context, however, social psychology’s published literature is replete with evidence demonstrating that context is responsible for clear, substantial, and theoretically informative differences in behavior. Although situationism is sometimes positioned conceptually as if it were the logical antithesis of the influence of personality, in reality nearly all contemporary social psychologists believe not only that both the situational context and individual differences matter, but that each affects the other interactively and continuously. For example, situations must be perceived and interpreted by the individual to possess relevant features in order for influence to occur; these interpretations are often shaped significantly by person factors (which are not limited to traits but may also include goals and motives).
Similarly, personality traits are likely to be influential only to the extent that the situation affords relevant opportunities for their expression (e.g., social anxiety may have relatively little impact during interaction with close friends, but substantial effects with strangers). Individual differences also play an important role in determining which situations the individual enters or avoids. Thus, it is most appropriate to view social psychology as emphasizing the complex interplay of forces within the person and in the outside situation in casting an individual’s interpretation of, and response to, situational contexts.
1.2 A Process Approach is Most Informative
Phenomena may be regarded in terms of their distinctive and unique features or as the product of more general processes. Social psychologists, because they believe that certain general principles underlie many specific behaviors observed in everyday social life, typically focus their attention on these shared processes. For example, a social psychologist interested in prejudice likely would consider psychological mechanisms responsible for negative affect toward members of outgroups, rather than the details of prejudice toward one or another particular outgroup (except, of course, insofar as those details might be informative about the general process). Similarly, social psychologists study interaction between spouses less to understand marriage per se and more to discern how high degrees of interdependence influence social interaction. This emphasis does not imply that social psychologists are uninterested in the relevance of their research to particular instances; rather it denotes their belief that theoretical analysis grounded in broadly applicable principles is likely to be most revealing. The field’s emphasis on general processes is evident in the concerted attention routinely invested in identifying causal mechanisms responsible for a given phenomenon. That is, once a phenomenon has been demonstrated, researchers almost invariably seek to explain its operation by ascertaining the relevant underlying mechanisms; in other words, to know when, why, and how something occurs.
Typically such investigations address questions of moderation (establishing conditions under which a phenomenon is augmented or diminished) and of mediation (determining the specific step-by-step operations by which one variable produces changes in another variable). In recent years, the field has been particularly attentive to mediation by cognitive and affective processes, as described below. Considerable attention also is devoted toward ruling out alternative explanations; that is, toward conducting fine-grained experimental tests capable of distinguishing among alternative, often somewhat similar, explanations for a given phenomenon.
1.3 Basic Theory and its Application
Kurt Lewin’s other classic dictum was that ‘there is nothing so practical as a good theory.’ Many social psychologists, in fact, initially were attracted to the discipline because of its potential for generating
solutions to the pressing social problems of the day. Thus, the oft-cited distinction between ‘basic’ and ‘applied’ research is to many social psychologists somewhat misleading in the sense that basic research is often triggered by contemporary events and societal needs, and in turn may produce important evidence not only for better understanding these problems but also for generating interventions. The credo that sound theory and rigorous empiricism provide the firmest possible foundation for social application underlies Lewin’s conviction, one that is shared by many current researchers. Effective applications require good theory, and good theories have real-world relevance. As the social psychological literature has grown, so, too, has the dissemination of its principles and findings into the culture at large. As a result, the impact of social psychology is sometimes difficult to trace, although numerous examples are readily apparent, ranging from concrete interventions derived explicitly from experimental findings to more amorphous influences on prevailing public opinions. Documented impact of social psychological research can be found, for example, on such contemporary social issues as prejudice and intergroup relations, sex discrimination, healthcare, the operation of the legal system, and family violence.
2. A Brief History of Social Psychology
Social psychological thinking has deep roots. Rudimentary versions of contemporary theories are evident in many ancient sources, such as the Bible and the writings of Aristotle, Plato, and the other Greek philosophers. But it was not until the turn of the twentieth century, with the field’s first experimental studies and publication of the first textbooks explicitly bearing the name Social Psychology, that the scientific study of human social interaction began to cultivate its own disciplinary identity. Although the field’s development has been more or less continuous since then, two distinct and noteworthy trends mark its intellectual progression. Much of the field’s most influential early work was concerned with social influence, namely the processes by which the real or imagined presence of others affects belief and behavior. Studies by Muzafer Sherif and Solomon Asch are considered landmarks, but the most durable and widespread impact came from Lewin’s elegant Field Theory, which conceived of social relations in terms of interdependent force fields, such that each organism (i.e., person) acts on and responds to the other organisms in its space. As compelling as Lewin’s conceptual model was his emphasis on experimentation as a formal method for evaluating theoretical propositions, an emphasis that became increasingly popular during the heady post-World War II expansion of psychological science. Field theory, especially in its application to investigating longstanding questions about group dynamics and social influence, and its accompanying experimental method quickly became a dominant theme in the field’s growth. In the late 1950s and 1960s, social psychology’s purview began to widen, such that social influences on the mental life of individuals came to occupy center stage. Notable in this trend were studies of the impact of social feedback on self-appraisals, including Leon Festinger’s theory of social comparison processes, Stanley Schachter’s studies of self-attribution of emotion, and Festinger’s subsequent theory of cognitive dissonance, which concerned self-perceived consistencies and inconsistencies between one’s beliefs and behavior. In several respects, this movement set the stage for the so-called cognitive revolution of the 1970s and 1980s, in which principles of cognitive psychology were applied to social stimuli (i.e., other people, groups) and phenomena. Fundamental to the original conception of social cognition, as this subdiscipline came to be called, was the idea that many cognitive processes are designed to enable people to contend with the basic interpersonal tasks of everyday life. In their processing of social information, it was argued, people often function as naive scientists, seeking, within human limits, to comprehend cause-and-effect associations among the many pieces of social information encountered (see Heuristics in Social Cognition). The earliest emphasis in social cognition research was on understanding people’s attributions of causality, led by the influential theories of E. E. Jones and Harold Kelley. Subsequent work has broadened this purview considerably. More recent trends in social psychology balance the emphasis on cognitive processes with a functionalist perspective, or as Fiske (1992) put it, ‘thinking is for doing.’ For social psychology, the doing, of course, is relating to others in the social world.
Social cognition is thus best considered as a mediating process between the individual’s interpersonal goals and circumstances on the one hand, and his or her social behavior on the other hand. Embodying this perspective, contemporary research has increasingly emphasized affect in mediating the individual-social behavior link as well as the importance of the self and its motives. The logic of this latter work assumes that because action is functional and purposive, reactions to particular situations may help reveal the social motives operative in that context. Also, there is increasing recognition of the nature of ongoing relationships as part of the interpersonal context of behavior.
3. Methods of Social Psychological Research
It is no accident that social psychologists pride themselves on being rigorous, broad, and clever methodologists. The field’s domain is wide (ranging from society-wide intergroup relations to intimate dyads) and its principled emphasis on identifying cognitive and affective mediators (ranging from cultural norms to idiosyncratic emotions) dictates that its research methods be flexible and diverse, yet suitable to uncovering and disentangling often subtle phenomena. One reason why social psychologists are in demand across a variety of disciplines and applied settings is their deftness in balancing the demands of experimental rigor, the necessity of adapting methods to diverse questions, and the various methodological issues inherent in studying human behavior. Social psychology’s favored method is the laboratory experiment, in which research participants are randomly assigned to one of several conditions defined by an experimenter so as to permit evaluation of a prespecified hypothesis about cause-and-effect. This emphasis follows directly from the high priority ascribed to establishing an effect’s internal validity: laboratory manipulation allows experimenters to construct conditions whose carefully controlled differences explicitly contrast alternative explanations for a phenomenon while ruling out experimental artifacts. Contemporary social psychology has myriad other strategies and techniques in its toolkit, reflecting the realization that many questions are more appropriately addressed by different methods and the further realization that, because all methods are limited in one way or another, the validity of a research program ultimately depends on its methodological diversity. Other popular methods include quasi-experiments (when manipulation is not possible), surveys and questionnaires, daily event recording, behavioral observation, and meta-analysis. Technological innovation, both from within the field and by expropriating advances in other disciplines, has facilitated researchers’ ability to ask more sophisticated questions of their data, and to answer those questions with a greater degree of precision.
Examples of these include computerized methods for investigating cognitive or affective mediation of basic social processes, especially when such mediation occurs outside of the individual’s conscious awareness (e.g., priming and response latency methods), methods for assessing psychophysiological systems that underlie social behavior, internet-based data collection, and videotape-based protocols for fine-grained analysis of observable behavior in dyads and groups. The social psychologist’s readiness to capitalize on new methodologies that provide fresh perspectives on the field’s longstanding conceptual interests is further demonstrated by the eager acceptance of new statistical procedures in the literature. Recent and noteworthy examples include structural equation modelling, multilevel models, and methods for analyzing data in interdependent groups and dyads.
4. The Future of Social Psychology
Because an understanding of social relations is in one way or another fundamental across all of the social and behavioral sciences, social psychology is a crossroads discipline. Managing that position, that is, acquiring theoretical insights and new methods from neighboring disciplines while exporting its own theories, findings, and tools to those disciplines, represents an important challenge for social psychology. Several boundaries seem particularly promising at this juncture: neuroscience, with its ability to identify markers of social psychological phenomena in the brain; cognitive psychology, which enhances the field’s understanding of mental representation of social entities; evolutionary psychology and behavioral genetics, with particular reference to identifying and understanding the specialized mechanisms created by evolution in making humans the most social of animals; and the study of culture, a contextual variable whose specific cognitive, affective, and interpersonal manifestations ultimately influence all social behavior. At the same time, in its zeal to embrace related disciplines, social psychology must remain focused on its core phenomena. Despite the numerous important empirical findings and theoretical insights of the past century, many more questions remain to be asked and answered. The reasons underlying the importance of building bridges with related disciplines are one and the same as the reasons why social psychology must maintain its commitment to continued investigation of the field’s core processes: that many of the most important unresolved questions about human functioning and well-being concern the processes by which people perceive, influence, and relate to other people.
See also: Attitude Change: Psychological; Attitude Formation: Function and Structure; Attitudes and Behavior; Attributional Processes: Psychological; Automaticity of Action, Psychology of; Cognitive Dissonance; Communication and Social Psychology; Conflict and Conflict Resolution, Social Psychology of; Cultural Variations in Interpersonal Relationships; Deception, Psychology of; Group Decision Making, Social Psychology of; Group Processes, Social Psychology of; Group Productivity, Social Psychology of; Interactionism and Personality; Intergroup Relations, Social Psychology of; Interpersonal Attraction, Psychology of; Justice: Social Psychological Perspectives; Love and Intimacy, Psychology of; Mental Representation of Persons, Psychology of; Motivation and Actions, Psychology of; Obedience: Social Psychological Perspectives; Person Perception, Accuracy of; Reactance, Psychology of; Self-conscious Emotions, Psychology of; Social Categorization, Psychology of; Social Cognition and Affect, Psychology of; Social Comparison, Psychology of; Social Dilemmas, Psychology of; Social Facilitation, Psychology of; Social Psychology, Theories of; Social Relationships in Adulthood; Stereotypes, Social Psychology of; Stigma, Social Psychology of
Bibliography
Fiske S T 1992 Thinking is for doing: Portraits of social cognition from daguerreotype to laser photo. Journal of Personality and Social Psychology 63: 877–89
Gilbert D T, Fiske S T, Lindzey G (eds.) 1998 The Handbook of Social Psychology. McGraw-Hill, New York
Hewstone M, Manstead A S R (eds.) Blackwell Encyclopedia of Social Psychology. Blackwell, Oxford, UK
Jones E E 1998 Major developments in five decades of social psychology. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. McGraw-Hill, New York, Vol. 1, pp. 3–57
Kruglanski A, Higgins E T (eds.) 1996 Social Psychology: Handbook of Basic Principles. Guilford, New York
Reis H T, Judd C M (eds.) 2000 Handbook of Research Methods in Social and Personality Psychology. Cambridge University Press, New York
Ross L, Nisbett R E 1991 The Person and the Situation: Perspectives of Social Psychology. McGraw-Hill, New York
Smith E R, Mackie D M 2000 Social Psychology. Psychology Press, Philadelphia, PA
Taylor S E 1998 The social being in social psychology. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. McGraw-Hill, New York, Vol. 1, pp. 58–95
H. T. Reis
Social Psychology: Research Methods
Social psychological research draws on methods in use throughout the behavioral and social sciences. The goal of social psychology is to understand human behavior as it is affected by and affects social contexts. Accordingly, social psychologists attempt to measure and explain characteristics of individuals as social actors and characteristics of the social environments in which they act. Importantly, they also attempt to understand the interactive effects of both individual and environmental variables. Social psychological methods are fundamentally empirical methods, using observation and data to guide theoretical development. Ultimately, however, those methods are simply tools in the development of theoretical insights into the nature of human social behavior.
1. Research Validities
Methods of research in social psychology are useful only in so far as they permit one to critically evaluate and develop theoretical claims. Accordingly, this section defines a set of research validities: criteria by which any given piece of research, and any program of research, can be evaluated. Ultimately, research methods that permit the accomplishment of these validities permit the gathering of data that will be maximally informative for theory construction. Importantly, however, as will be discussed, there are no perfect methods that simultaneously maximize all of the research validities. Ultimately, convergence on theoretical insights is only gained through the use of multiple methods of research, triangulating across individual studies and individual methods, to maximize the validity of an entire program of research (for general discussions of research validities, see Brewer 2000, Cook and Campbell 1979).
1.1 Construct Validity
Theoretical claims in social psychology typically involve statements about abstract and poorly defined constructs. So, for instance, small group research in social psychology has been interested in the role of alternative leadership styles in the successful functioning of the group. Researchers in intergroup relations have explored how levels of group cohesion affect the derogation of outgroups. Research in person perception has explored how information that is inconsistent with prior beliefs about someone is cognitively integrated into an overall impression of what that person is like. In each of these examples, the phenomena that social psychologists are interested in understanding are quite vague constructs, amenable to multiple alternative definitions. To gather data that are informative for theory development, these constructs must be measured. The success of that measurement determines the construct validity of any piece of research. In other words, construct validity concerns the extent to which the constructs of theoretical interest have been successfully captured by the variables actually measured in research. The fundamental tenet of construct validity is that multiple operationalizations of any construct are mandatory. Any one measure is only an imperfect indicator of the construct it is supposed to measure. Confidence in measurement is produced only if multiple operationalizations tend to converge or triangulate upon the same answer.
1.2 Internal Validity
Theoretical claims in social psychology typically involve claims about the causal impact of constructs upon each other (e.g., social norms cause conformity, persuasive communications cause attitude change). Once the constructs have been operationalized, research that is informative for theory construction and evaluation is research in which one can argue not only that one variable is associated with another, but also that one variable (the independent variable) is causally responsible for variation in another (the dependent variable). Research that is more internally valid is research where causal claims are more defensible. Internal validity is affected by the nature of the research design that is employed. Ultimately, confidence in causal claims is only possible in the case of what are called randomized experimental designs, in which the units of observation (typically persons) have been randomly assigned to levels of the independent variable. For instance, to examine whether group size (the independent variable) influences the quality of a group decision (the dependent variable), people would be randomly assigned to groups of different sizes. Random assignment means that the researcher typically must control and manipulate the independent variable. Importantly, random assignment is not the same thing as haphazard or uncontrolled assignment. It involves careful planning so that the probability of any unit occurring in a particular level of the independent variable is constant across all experimental units. For many research questions in social psychology, randomized experiments are not feasible. Accordingly, a variety of quasi-experimental designs can be employed. As is later described, these involve the collection of longitudinal data, without random assignment of units to levels of the independent variable. Finally, from the point of view of internal validity, cross-sectional (also known as correlational) studies are the least valid, although data from such studies can be quite informative for other reasons. Additionally, the absence of covariation between the independent and dependent variables in a correlational study would certainly lead one to doubt the existence of a causal relationship between the two.
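The distinction between random and haphazard assignment can be made concrete with a short sketch. The Python below is illustrative only: the function name `randomly_assign`, the participant labels, and the two group-size levels are assumptions introduced for the example, not part of any published procedure. It shuffles the units and deals them out to the levels in turn, so that every unit has the same probability of ending up in any given level.

```python
import random
from collections import Counter


def randomly_assign(units, levels, seed=None):
    """Assign each unit to one level of the independent variable.

    The units are shuffled and then dealt out to the levels in turn, so
    every unit has the same chance of landing in each level and the
    level sizes differ by at most one.
    """
    rng = random.Random(seed)      # a fixed seed makes the assignment reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: levels[i % len(levels)] for i, unit in enumerate(shuffled)}


# Hypothetical study: twelve participants, two levels of group size.
participants = [f"P{i:02d}" for i in range(1, 13)]
assignment = randomly_assign(
    participants, ["3-person group", "6-person group"], seed=1
)
counts = Counter(assignment.values())   # how many participants per level
```

Recording the seed documents the assignment rule, which is one way of showing that the allocation was planned rather than haphazard.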
1.3 External Validity
Data are always collected from specific units in specific settings. Obviously, theoretical conclusions require generalization from those units and settings to broader populations of theoretical interest. The external validity of any given piece of research indexes the extent to which this sort of generalization is feasible. In general, generalization from specific samples to populations of interest is only possible if probability-sampling methods have been used to select the units and settings from which to collect data. Although widely used in social psychology, nonprobability-sampling methods (e.g., gathering data from participants who are given experimental credit in introductory psychology classes) pose serious threats to external validity.
1.4 Tradeoffs Among Research Validities
It is exceedingly likely that there are tradeoffs among these validities in any particular study. For instance, when an independent variable is manipulated in a randomized experimental design in order to maximize internal validity, the construct validity of the manipulated independent variable may be seriously compromised. Historically, much of the research reported in the leading social psychological journals has relied upon randomized experiments conducted in laboratory settings with participants recruited from introductory psychology pools. Implicitly, such research has assigned internal validity paramount importance while compromising the other two validities. Although tradeoffs among the validities may be inevitable in any particular study or data collection effort, it is important to employ a variety of different research strategies and designs across a line of research, moving from the laboratory to the field, from experimental designs to correlational ones, from manipulated independent variables to carefully measured ones. In this way, across a series of studies, it becomes possible to maximize all three sorts of research validities.
2. Research Designs
As just defined, randomized experimental designs, in wide use in social psychology, involve the random assignment of units to levels of the independent variables. Most frequently, multiple independent variables are included in crossed experimental designs, so that their separate as well as interactive effects can be assessed simultaneously. The interactive effects of multiple independent variables have been of great theoretical importance in social psychology, as they suggest, for instance, that the impact of situational or environmental factors may depend on characteristics of the units being assessed (i.e., person by situation interactions). If the effects of independent variables are considered to be short-lived, then a within-subjects research design may be employed to maximize power. In other words, each experimental unit occurs in each level of the independent variable (or variables). Random assignment in this case means that the order of the levels of the independent variables to which units are exposed is counterbalanced, with units randomly assigned to the different possible orders. Many experimental designs in social psychology involve some mixture of between-subject and within-subject independent variables, and these are typically crossed so that their interactions can be assessed. The typical analytic approach to such data involves the use of analysis of variance.
Quasi-experimental designs involve the longitudinal assessment of units on the dependent measure(s) both before and after exposure to levels of the independent variable. The classic quasi-experimental design is the non-equivalent control group design in which units
are assessed on the dependent variable twice, once as a pretest and once as a post-test. Some units are exposed to the experimental treatment and others to the control condition (i.e., the two levels of the independent variable), but the assignment process is typically an uncontrolled one in which the assignment rule, rather than being random, is unknown. Analyses typically involve the analysis of differences between conditions on the post-test, attempting to remove or adjust for any pre-existing differences on the pretest (through an analysis of covariance or some other adjustment procedure).
A more elaborate version of the non-equivalent control group design uses the same basic approach except that multiple pretests and multiple post-tests are assessed. Such designs permit greater confidence about treatment effects because differences in units in spontaneous growth can be modeled, and these differences can then be ruled out as competing explanations for any condition differences. A particularly strong version of this quasi-experimental design would be one in which the treatment variable is turned on or off at different points along the time sequence during which the dependent variable is being assessed. There may be multiple measures of the dependent variable over time and at some time during that sequence the treatment gets turned on, but the time point at which that occurs varies between units (with some units serving as control units where the treatment never gets turned on).
The regression-discontinuity quasi-experimental design is one in which a known, rather than a random, assignment rule is used to assign units to either the treatment or control conditions. Although seldom used in basic social psychology, this design has received attention in more applied research endeavors (e.g., deciding who will receive a new and scarce medical treatment based on the severity of symptoms manifested by patients).
And finally, as defined above, correlational designs involve the cross-sectional measurement of both the independent and dependent variables. (See Cook and Campbell 1979, Judd and Kenny 1981, Smith 2000, and West et al. 2000 for discussions of alternative research designs and their implications.)
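The covariance adjustment described for the non-equivalent control group design can be sketched as follows. This is a minimal illustration on simulated data, not an analysis from the sources cited: the sample size, coefficients, and assignment rule are all assumptions made up for the example. The treatment coefficient from the model post-test ~ pretest + treatment is obtained by removing the pretest from both the post-test and the treatment indicator and regressing one set of residuals on the other (the Frisch-Waugh-Lovell result), which gives the same coefficient as fitting the two-predictor model directly.

```python
import random
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals of y after removing its linear dependence on x."""
    a, b = simple_ols(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def ancova_effect(pre, treat, post):
    """Treatment coefficient in post ~ 1 + pre + treat (pretest-adjusted)."""
    _, effect = simple_ols(residuals(pre, treat), residuals(pre, post))
    return effect


# Simulated (hypothetical) data with a nonrandom assignment rule:
# units with higher pretest scores are more likely to be treated.
rng = random.Random(0)
TRUE_EFFECT = 1.5
pre, treat, post = [], [], []
for _ in range(400):
    p = rng.gauss(0, 1)
    t = 1 if p + rng.gauss(0, 1) > 0 else 0
    pre.append(p)
    treat.append(t)
    post.append(2.0 + 0.8 * p + TRUE_EFFECT * t + rng.gauss(0, 1))

treated_post = [y for y, t in zip(post, treat) if t == 1]
control_post = [y for y, t in zip(post, treat) if t == 0]
raw_diff = mean(treated_post) - mean(control_post)   # confounded by the pretest
adjusted = ancova_effect(pre, treat, post)           # pretest-adjusted estimate
```

In this simulation the unadjusted post-test difference overstates the effect because treated units started out higher, while the adjusted estimate falls close to the value built into the data. The adjustment works here only because the simulation satisfies the model's assumptions (a linear pretest effect and a pretest measured without error); with real quasi-experimental data those assumptions would need to be defended.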
3. Measurement
As discussed under Sect. 1.1, measured variables in social psychology are assumed to be only imperfect indicators of the constructs that are of theoretical interest. Classic measurement theory has focused on random errors of measurement, assuming that variation in measured variables derives from two sources: true score variation and random error variation. According to this classic approach, the reliability of a variable is defined as the proportion of variation in the variable that is true score variation. A variety of procedures for assessing reliability have been developed from this classic model, the most widely used being coefficient alpha, which assesses the degree to which a set of items share true score variation.
More recent approaches to measurement take a more fine-grained approach, assuming that errors of measurement can be systematic as well as random. Accordingly, variation in any variable can be decomposed into three sources: variation due to the construct of interest; variation due to other constructs that represent sources of systematic error; and random error variation. Convergent validity assesses the contribution of the construct of interest; discriminant validity assesses the contribution of systematic sources of error variation; and reliability assesses the contribution of random errors of measurement. Procedures for disentangling these sources of error variation have been developed by collecting data according to the multitrait-multimethod matrix, where multiple measures of multiple constructs are assessed, with various measures sharing systematic sources of error, particularly due to measurement methods (Campbell and Fiske 1959).
In addition to these approaches to measurement, based on the presumption that measurement adequacy is best assessed by patterns of covariation across multiple alternative measures, some have argued for a more axiomatic approach to measurement (Dawes and Smith 1985). This approach is based on the presumption that regularities of individual responses should follow certain axiomatic principles deduced from the constructs that are assessed (e.g., transitivity). In general, this approach has not proved fruitful for empirical work since random error in responses often prohibits strong tests of these axiomatic principles.
Nevertheless, a few examples, such as Guttman and Thurstone scaling, are of historical importance in the domain of attitude measurement (see Judd and McClelland 1998 for a general treatment of measurement in social psychology).
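Coefficient alpha, mentioned above as the most widely used reliability index, can be computed from the item variances and the variance of the total score using the standard formula alpha = k/(k-1) × (1 − Σ item variances / variance of totals), where k is the number of items. The sketch below assumes complete data; the function name `cronbach_alpha` and the three-item responses are hypothetical and serve only to illustrate the calculation.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a set of items.

    items: list of k lists, one per item, each holding one score per
    respondent (same respondents, same order, no missing values).
    """
    k = len(items)
    if k < 2:
        raise ValueError("coefficient alpha needs at least two items")
    # Total score per respondent: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses of five people to three attitude items.
responses = [
    [4, 3, 5, 2, 4],   # item 1
    [5, 3, 4, 2, 5],   # item 2
    [4, 2, 5, 3, 4],   # item 3
]
alpha = cronbach_alpha(responses)
```

When the items covary strongly, the variance of the total score is large relative to the summed item variances and alpha approaches 1; when the items are unrelated, alpha approaches 0.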
4. Analysis
As discussed above, historically the dominant empirical approach in social psychology has involved the use of experimental designs, with independent variables orthogonally manipulated either within or between participants. Accordingly, the dominant analytic approach to data has been the analysis of variance. As researchers have increasingly extended their bag of research tools to include field studies and other sorts of research designs, more general approaches to data analysis, based on multiple regression and the general linear model, have become more prevalent. Increasingly, researchers are employing multiple regression as a general data analytic approach, incorporating analysis of variance through the use of contrast-coded predictors (Cohen and Cohen 1983, Judd and McClelland 1989).
Longitudinal data structures as well as more sophisticated measurement models have resulted in increasing use of latent variable structural equation modeling procedures (Bollen 1989, Kline 1998). In general these procedures allow researchers to examine and test models of data that incorporate both systematic and random measurement errors, estimating the linear relations among latent constructs that are each assessed with multiple indicators. Although sample size requirements and other restrictions have placed limits on the use of these procedures in social psychology, the general data analytic approach that they represent is particularly promising.
Data analysis strategies that allow for correlated errors of measurement are particularly promising in the subfield of social psychology devoted to the study of dyadic and small group interaction. Traditional inferential procedures normally assume that observations are independent of each other and this assumption is nearly always violated in dyadic and small group research where individual persons interact and are jointly influenced by situational factors. Accordingly, considerable work has been devoted in recent years to procedures for the analysis of dependent data structures. The most general of these approaches involve multilevel or hierarchical linear models, where individual observations are nested under factors whose levels are themselves treated as random observations. Traditional analysis of variance and regression procedures do not allow for this sort of hierarchical data structure with random factors (see Bryk and Raudenbush 1992, Kenny et al. 1998).
Beyond the domain of small group and dyadic research, these hierarchical models are also particularly useful in longitudinal data structures, where individual observations are repeatedly taken from participants over time and one is interested in modeling growth or change in response to various situational factors that also vary across time. Data from various quasi-experimental designs are particularly likely to exhibit this sort of structure.
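One common way to gauge how strongly observations within dyads or small groups depend on one another, the concern that motivates the analysis of dependent data structures, is the intraclass correlation from a one-way analysis of variance. The sketch below is an illustration under simplifying assumptions (equal group sizes, a single outcome, no missing data); the function name and the dyad scores are hypothetical.

```python
from statistics import mean


def intraclass_correlation(groups):
    """One-way ANOVA intraclass correlation for equal-sized groups.

    groups: list of lists, one inner list of scores per dyad or group.
    Values near 0 suggest members are no more alike than strangers;
    values near 1 suggest members of a group give nearly identical scores.
    Raises ZeroDivisionError if every score in the data is identical.
    """
    n = len(groups[0])                      # members per group
    if n < 2 or any(len(grp) != n for grp in groups):
        raise ValueError("all groups must share one size of at least 2")
    g = len(groups)                         # number of groups
    grand = mean([score for grp in groups for score in grp])
    ms_between = n * sum((mean(grp) - grand) ** 2 for grp in groups) / (g - 1)
    ms_within = sum(
        (score - mean(grp)) ** 2 for grp in groups for score in grp
    ) / (g * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)


# Hypothetical satisfaction scores for four dyads.
dyads = [[4, 5], [2, 2], [6, 5], [3, 4]]
icc = intraclass_correlation(dyads)
```

A clearly nonzero intraclass correlation is a signal that individuals should not be treated as independent observations and that a multilevel model, or an analysis with the dyad or group as the unit, is likely to be more appropriate.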
5. Conclusion
Although experimental laboratory research procedures have been the dominant approach to research in social psychology, researchers are increasingly using a wide range of research designs, measurement procedures, and data analytic techniques to address the multifaceted research questions of social psychology. Certainly there will remain an important role for laboratory experimental work in social psychology. However, the wide range of methods now used throughout social psychology offers tremendous potential for addressing the diversity of theoretical questions addressed by social psychology in very rich ways.
See also: Evolutionary Social Psychology; Experimental Design: Randomization and Social Experiments; Experimental Laboratories: Social/Psychological; Experimentation in Psychology, History of; Observational Methods in Human Development Research; Small-group Interaction and Gender; Social Psychology; Social Psychology: Sociological; Social Psychology, Theories of; Structural Equation Modeling
Bibliography
Bollen K A 1989 Structural Equations with Latent Variables. Wiley, New York
Brewer M B 2000 Research design and issues of validity. In: Reis H T, Judd C M (eds.) Handbook of Research Methods in Social and Personality Psychology. Cambridge University Press, New York
Bryk A S, Raudenbush S W 1992 Hierarchical Linear Models: Applications and Data Analysis Methods. Sage, Newbury Park, CA
Campbell D T, Fiske D W 1959 Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin 56: 81–105
Cohen J, Cohen P 1983 Applied Multiple Regression/Correlation for the Behavioral Sciences, 2nd edn. Erlbaum, Hillsdale, NJ
Cook T D, Campbell D T 1979 Quasi-experimentation: Design and Analysis Issues for Field Settings. Rand McNally College Pub. Co., Chicago
Dawes R M, Smith T L 1985 Attitude and opinion measurement. In: Lindzey G, Aronson E (eds.) The Handbook of Social Psychology, 3rd edn. Random House, New York
Judd C M, Kenny D A 1981 Estimating the Effects of Social Interventions. Cambridge University Press, New York
Judd C M, McClelland G H 1989 Data Analysis: A Model-Comparison Approach. Harcourt, Brace, Jovanovich, San Diego, CA
Judd C M, McClelland G H 1998 Measurement. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. McGraw-Hill, Boston
Kenny D A, Kashy D A, Bolger N 1998 Data analysis in social psychology. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. McGraw-Hill, Boston, MA
Kline R B 1998 Principles and Practice of Structural Equation Modeling. Guilford, New York
Smith E R 2000 Research design. In: Reis H T, Judd C M (eds.) Handbook of Research Methods in Social and Personality Psychology. Cambridge University Press, New York
West S G, Biesanz J C, Pitts S C 2000 Causal inference and generalization in field settings: Experimental and quasi-experimental designs. In: Reis H T, Judd C M (eds.) Handbook of Research Methods in Social and Personality Psychology. Cambridge University Press, New York
C. M. Judd

Social Psychology, Sociological
Social psychology is concerned with the interplay of society and individual. In this specification, ‘society’ is a gloss for all relatively stable social relationships and patterns of social interaction, and the close examination of microsociological processes is included in social psychology. The major contributors to this enterprise are psychologists and sociologists.
1. The Two Social Psychologies
In practice, there are two social psychologies. Psychological social psychology assumes that ‘in the beginning there is the individual’ and focuses on individuals’ social cognitions. Sociological social psychology assumes that ‘in the beginning there is society’; its distinguishing charge is to locate interactional processes in their social structural context. This difference leads to differences in problems studied, conceptualizations of problems, and research findings. Markus and Zajonc (1985) observe that cognitive social psychology starts and ends with persons’ cognitions; sociologists note that unless the rootedness of cognitions in social structure is recognized, more causal power than warranted is accorded cognitions. Cognitive social psychologists recognize that cognitions reflect experience, but experience does not enter theoretical formulations or research designs; for sociologists, how experience is organized is central to both theory and research. The former developed largely through experimental research; the latter, more eclectic in methods and exhibiting greater willingness now than earlier to use experimental methods, has preferred studying social behavior in situ using observational procedures or survey and questionnaire methods locating persons in social milieus. The two social psychologies share topical interests: e.g., attitudes, self, and identity. They also differ in topics, as a glance at textbooks for each will evidence.
2. Focusing on Sociological Social Psychology (SSP)
The concern here is with social psychological work reflecting sociological interests and contributing to sociology as an intellectual enterprise. This does not imply that psychological social psychology is ‘wrong’ or irrelevant to SSP; sociologists can learn much from their psychological counterparts. Yet, SSP has its relatively distinct approaches, interests, and objectives. Sociologists have offered versions of social psychology as alternatives to a sociology built on social structural concepts; Herbert Blumer’s (1969) advocacy of symbolic interactionism is an important example. Today, sociological social psychologists typically understand their work as a specialization within sociology, focusing on persons as social beings and their interaction that contributes to sociological analyses in general; this view is adopted here. Most sociologists doing social psychology share a set of core ideas, largely taken for granted. They also significantly differ in approaches taken. The variations are attended to first.
3. Variations in Perspectives within SSP
Three perspectives are considered (interactionist, group processes, and social structure–personality), encompassing most SSP; discussion of each is selective. These perspectives are not as independent as separate treatment suggests. Interactionists incorporate fundamentals of the social structure–personality frame by recognizing structural constraints on personal development and interaction. The social structure–personality formula recognizes the mediating role of interaction processes and microstructures. And group processes work builds upon the social definitional emphases of the interactional frame and the macrostructural contexts of small group structures and processes. Nevertheless, the classificatory scheme allows the recognition of significant variation in perspectives within SSP.
3.1 The Interactionist Perspective
Interactionism developed under the early influence of the Scottish moral philosophers, George Herbert Mead, Charles Horton Cooley, and W. I. Thomas. More recent influences include Blumer, Everett Hughes, Erving Goffman, Anselm Strauss, and Ralph Turner. Its basic principles draw on Mead. Subjectively held meanings are central to explaining or understanding social behavior. Meanings are products of persons’ communication as they seek solutions to collective problems, and of humans’ capacity for symbolization. Of particular importance are meanings attached to self, others with whom persons interact, and situations of interaction; these guide interaction, and are altered in interaction. Implied is that who persons interact with under what circumstances is critical to the development of, and changes in, meanings and thus to interaction processes.
The centrality of meanings underwrites a basic methodological principle of interactionism: actors’ definitions must be taken into account in examining their behavior. For many, this principle leads to viewing everyday life as the preferred arena of research; use of observational methods, ethnography, case histories, or intensive interviewing as preferred means of gathering data; and qualitative methods as preferred analysis procedures. Other interactionists, accepting the methodological principle involved, find these preferences unnecessarily limiting in their implied rejection of quantitative analyses of large-scale data sets for research stemming from interactionist premises.
Given these ideas, it is unsurprising that many interactionists attend to self, its dimensions or components, variability or stability, content under varying structural conditions, authenticity, efficacy, self-change, the social construction and interactional consequences of affect, the interactional sources and social behavioral resultants of self-esteem, interactional and symbolic processes in defense of self, the relation of self to physical and mental health, labeling, stigmatization, deviance, and a host of other topics. A related focus is on socialization, for the shaping of self is key to entrance into roles, normal and deviant. Attention is paid to socialization processes within institutional settings (family, school, work, etc.), as well as settings that are not fully institutionalized, like child and adolescent play, informal friendship of young and old, or developing social movements, and noninstitutional settings like homelessness. Responding to the insight that socialization is a lifelong process, attention is accorded life-course related socialization processes.
Interaction itself, of both intimates and nonintimates, is the focus of other interactionists, whose studies examine (among other topics) teasing among adolescents, language use both as and in action and interaction; variations in the implicit rules governing interaction and interaction strategies in various settings (Erving Goffman has been especially influential here and with reference to the preceding topic); role relationships within both formal and informal settings, and the impact of status and power inequalities on interaction. Such long-standing interests of interactionists continue. A comparatively recent turn is an emphasis on emotions. Present in varying degrees in earlier interactionist social psychology, the expanded interest emphasizes emotion as basic to all interaction, expressed for example in David Heise’s affect control theory; the role of affect in self-processes; the social construction and interactional consequences of emotion in general and particular emotions. Other recent developments can only be mentioned. Contemporary sociological interest in culture is seen in work on the generation of culture through interaction. A ‘new look’ in social movement research makes framing processes and collective identity formation of particular consequence and has reinvigorated interactionist interests in that topic. A vision of self as comprised of multiple group or network based identities has opened up work on the consequences of consonant or conflicting identities that carries theory and analyses beyond traditional role-conflict studies.
3.2 The Group Processes Perspective
Group processes work reflects diverse substantive interests with a common focus on interaction in social groups or social networks. The interaction processes attended to include: cooperation; competition; conflict; conflict resolution; social exchange; inequality; bargaining; power, status, and influence processes; procedural and distributive justice and injustice; the resolution of social dilemmas; the emergence of social structures from interaction; and the reproduction of interaction processes by social structures deriving from them. According to Karen Cook, a social interdependence theme integrates these, as do ideas about how work on group processes relates to the general sociological enterprise: group and network interaction processes provide the foundations of macrosociological theory; theories developed on interacting individuals apply to corporate or collective actors; group- and network-based interaction mediates the relation of individual to society. Also, group process students have a common lineage and hold similar beliefs about how to do research and build theory. That lineage includes Georg Simmel on implications of social forms for interaction; George Homans’ exchange theory built upon principles of economics and operant conditioning psychology; Robert F. Bales’s research on groups resolving tasks; ‘small groups’ research; theoretical as well as experimental economics work on bargaining and reward allocation; and game-theoretic inspired research in political science and psychology. Richard Emerson is especially important to this lineage, developing a social exchange theory escaping the limitations of Homans’ strict behaviorism and extending exchange theory to deal with exchange networks. This lineage led to experimentation and mathematical modeling of interaction processes.
The focus on experimentation enabled programmatic research, an ideal carried out more in practice by group process researchers than by other SSP researchers. Group process researchers have provided models of such research, building small theories to explain given facts, testing the theories experimentally, discovering flaws in the fit of theory, revising theory to accommodate new data, in an ongoing process of theory building and theory testing. Further attention is limited to two topics, social exchange and status structures, epitomizing theoretical themes of the group processes perspective and its programmatic research emphasis. Contemporary social exchange theory starts with actors’ interdependency: people need others for, and provide others with, things they value and so engage in social exchange. Exchange occurs in situations of mutual dependence; persons act to increase positively valued outcomes and decrease negatively valued outcomes, exchange relations with specific others are repeated, and satiation and marginal utility apply to valued
outcomes. Exchange networks (chains of exchange relations) are basic to sociological exchange theory and SSP because they bridge to larger social structures. From here, network theory and research proliferate, extending to corporate actors, the role of power in facilitating or impeding exchanges across networks, the impact on exchanges of negative (competitive) or positive (noncompetitive) network connections, the emergence of norms in exchange networks, etc. Current work on status structures builds on expectation states theory, developed by Joseph Berger and colleagues. This theory argues that power and prestige orderings in groups arise from members’ expected contributions to problem solutions. These expectations shape group interaction, confirming the expectations and stabilizing the orderings. The theory further argues that in the absence of information directly linking members’ abilities to tasks at hand, group members draw inferences about those abilities from the stratification system of the larger social context, thus reproducing that system. On this foundation, work on expectation states theory has expanded, systematically refining and extending the theory and examining its implications. Of particular interest under contemporary circumstances is work showing that new information about group members’ task-relevant abilities can negate the effects on group interaction of stereotypes drawn from the existing stratification order, potentially changing the stratification order of the larger social context. Also of interest is the way ideas underlying this theory parallel ideas fundamental to the interactionist perspective in SSP.
3.3 The Social Structure–Personality Perspective Described by Melvin Kohn, a major proponent, as the quintessentially sociological approach to social psychology, the social structure–personality perspective examines the reciprocal impact of macrosociological structure and person. It traces to the classical sociological work of Emile Durkheim and others, tying features of persons to types of societies. More proximately, it emerges from cultural anthropological work seeking to understand basic personality structures of society’s members through analyses of culture. Influenced by psychoanalysis, this work invoked family and kinship structures as vehicles sustaining and transmitting culture, evidencing the relevance of social structure to persons produced. Social structure replaces culture in a social structure–personality perspective, bringing to the fore the economy, stratification, and work institutions as basic to personality. An important part of social structure–personality work has been comparative, with nation as context, examining the hypothesis that similar work structures have similar psychological effects. Inkeles and Smith (1974), in a six-nation study, proposed that an
attitudinal-behavioral syndrome, ‘individual modernity,’ resulted from working in organizations produced by industrialization. This work was criticized on ideological and methodological grounds and for its failure to specify the empirical linkage of structure and person and the mechanisms accounting for that linkage, criticisms that affected subsequent work. The evolving research program initiated by Kohn provides evidence that complex jobs allowing self-direction link societal-level class and stratification structures to intellectual flexibility and self-directedness. Longitudinal analyses suggest the linkages are reciprocal. These propositions hold in comparative contexts, and under conditions of radical social change, arguing for generality of structural effects independent of cultural variation. The operative psychological mechanism underlying these relationships is taken to be learning generalization. The social structure–personality perspective exists in work on many topics filling the structural ‘space’ between nation and person, for example, status attainment and mobility processes, gender, work, the life course, and both physical and mental health. Further comment, necessarily brief, on two of these topics provides insight into both the utility of SSP for sociology and why SSP is not in itself an alternative sociology. Painted large, life course research focuses on the interaction of biography and historical social structures over time in order to understand adult developmental processes.
While the complexity of this enterprise defies easy description or summary, it involves a vision of persons living their lives in the context of a continuously changing society, experiencing important historical events at varying ages and points in their lives, and undergoing (or not undergoing) a series of important age-related transitions involving status and role changes; of people’s existence within multiple institutional contexts and multiple networks of social relationships, some of which are operative over lengthy portions of their lives; of the life course as implicating multiple developmental trajectories necessitating coordination. A consequence of the contextual complexity and multistrandedness of life is what Elder and O’Rand (1995) term a loose coupling between social transitions and age-gradings. Loose coupling opens up room for choices reflecting meanings and interpretations (including reinterpretations) of events and of prior experience. Persons are selected by societal allocation processes for entry into social positions and roles but also select themselves: they choose to marry, have children, take or leave jobs, divorce, etc., and these choices are consequential. In short, persons are active agents importantly constructing their own lives. Research from a social structure–personality perspective on status attainment processes also provides a contrary moral. That research indicates that structures of relationships and interaction built on differences in socioeconomic status, race, and gender impact
people’s skills, motivations, and aspirations for success in schools and work. It also shows that skills, motivations, and aspirations relate to educational and occupational success, i.e., to individual mobility. At the same time, research shows that the structure of schools, jobs, and labor markets can severely restrict the impact of personal characteristics on social mobility and that even voluminous individual mobility need not alter the overall character of a societal-level stratification system. In short, there are limits on the impact of agency on social structures.
4. Commonalities Underlying most work in SSP are ideas reflecting the assumption that ‘in the beginning there is society’: (a) Human experience is socially organized. Societies are not invariably coherent; they are, rather, typically complex and shifting mosaics of frequently conflicting (but sometimes not) role relationships, social networks, groups, organizations, institutions, communities, and strata, sometimes related but sometimes isolated from one another. What people experience is importantly a function of just what relationships, networks, groups, communities, and strata they enter, remain in, and leave, how they enter these social structures, and how these social structures relate to one another. (b) The forms and content of social life are social constructions, the product of the collective activities of persons as they join one another in the course of developing solutions to problems in their daily lives. To make this claim, it is neither necessary nor in accord with accumulated evidence of SSP and sociology itself to argue that processes of social construction are ephemeral or infinitely mutable, or that they occur without constraints imposed by the natural world or prior constructions. (c) Human beings are active agents who respond selectively to social and nonsocial environments impinging on them, who initiate transactions with, and sometimes alter, those environments. Implied by the idea of social construction, action produced at the initiative of an actor presupposes the symbolic and reflexive capacities of humans, in short, presupposes a ‘self,’ but does not deny the power of conditioning or of normative demands. In general, all social behavior is some blend of action and reaction. (d) The world on which humans act and to which they react is a symbolized world specified by meanings attached to the objects comprising it. (e) There is both constraint and choice in personal and social life.
Whether one, the other, or both are operative depends on particulars of the patterns of people’s relationships to others in social groups and networks, and on the patterned relationships of those groups and networks to one another. (f) A final idea within SSP can serve as a summary of the previous five and an appropriate concluding
statement: the basic responsibility of SSP is to deal with ways in which social structures impinge on people’s social behavior and the reciprocal impact of that behavior on social structures. Each of the three perspectives within SSP reviewed tends to emphasize different levels of structure: the interactionist perspective stresses proximate structures closest to interacting persons (e.g., families, friendship networks); the implicit structural contexts of interaction for the small group perspective are middle-level structures (e.g., work organizations); and the social structure–personality perspective focuses on large-scale structures (e.g., the nation, stratification systems). An encouraging development is the increasingly common recognition that structures on all levels are important to the social behaviors studied from each perspective. See also: Age, Sociology of; Conflict and Conflict Resolution, Social Psychology of; Emotions, Sociology of; Ethnomethodology: General; Exchange: Social; Goffman, Erving (1921–82); Group Processes, Social Psychology of; Groups, Sociology of; Interactionism: Symbolic; Life Course: Sociological Aspects; Mead, George Herbert (1863–1931); Personality and Conceptions of the Self; Personality and Social Behavior; Self: History of the Concept; Small-group Interaction and Gender; Social Psychology; Social Psychology, Theories of; Socialization, Sociology of; Status and Role, Social Psychology of; Symbolic Interaction: Methodology
Bibliography
Berger J, Fisek M H, Norman R Z, Zelditch M Jr 1977 Status Characteristics in Social Interaction: An Expectation States Approach. Elsevier, New York
Blumer H 1969 Symbolic Interactionism: Perspective and Method. Prentice-Hall, Englewood Cliffs, NJ
Bryson G 1945 Man and Society: The Scottish Inquiry of the Eighteenth Century. Princeton University Press, Princeton, NJ
Cook K S 1977 Social Exchange Theory. Sage, Newbury Park, CA
Cook K S, Fine G A, House J S (eds.) 1995 Sociological Perspectives on Social Psychology. Allyn and Bacon, Boston
Elder G H Jr, O’Rand A 1995 Adult lives in a changing society. In: Cook K S, Fine G A, House J S (eds.) Sociological Perspectives on Social Psychology. Allyn and Bacon, Boston, pp. 452–75
Emerson R M 1962 Power-dependence relations. American Sociological Review 27: 31–41
Gerth H, Mills C W 1953 Character and Social Structure. Harcourt Brace, New York
Goffman E 1963 Behavior in Public Places: Notes on the Social Organization of Gatherings. Free Press, New York
Homans G C 1961 Social Behavior: Its Elementary Forms. Harcourt, New York
Inkeles A, Smith D H 1974 Becoming Modern: Individual Change in Six Developing Countries. Harvard University Press, Cambridge, MA
Joas H 1985 G.H. Mead: A Contemporary Re-examination of his Thought. MIT Press, Cambridge, MA
Kerckhoff A C 1995 Social stratification and mobility processes: Interaction between individuals and social structures. In: Cook K S, Fine G A, House J S (eds.) Sociological Perspectives on Social Psychology. Allyn and Bacon, Boston, pp. 476–96
Kohn M L, Schooler C 1973 Occupational experience and psychological functioning: An assessment of reciprocal effects. American Sociological Review 38: 97–118
Markus H, Zajonc R B 1985 The cognitive perspective in social psychology. In: Lindzey G, Aronson E (eds.) The Handbook of Social Psychology. Random House, New York, pp. 137–230
Mead G H 1934 Mind, Self, and Society. University of Chicago Press, Chicago
Ridgeway C 1991 The social construction of status value. Social Forces 70: 367–86
Strauss A L 1978 Negotiations: Varieties, Contexts, Processes, and Social Order. Jossey-Bass, San Francisco
Stryker S 1980 Symbolic Interactionism: A Social Structural Version. Benjamin/Cummings, Menlo Park, CA
S. Stryker
Social Psychology, Theories of Social psychology studies how individual people’s thoughts, feelings, and behavior are influenced by other people (Allport 1954). Social psychology’s theories account for how, why, and when people affect other people. Social psychological theories provide networks of propositions, testable by scientific observations, to describe, explain, and predict systematic phenomena in people’s reactions to others. This article addresses mid-range theories, not general systems (e.g., Lewin’s entire field theory) and not single phenomena (e.g., the reliable finding that first impressions matter). Because social psychology specializes in mid-range theories, the review imposes a larger framework, providing mere pointers to the theories flagged (see the cited books reviewing theory for detail).
1. Organization of this Review Social psychology’s theories each tend to center on one of a few major types of social motivation, describing the social person as propelled by particular kinds of general needs and specific goals. Most reviews acknowledge these motivational roots by reference to broad traditions within general psychology or sociology: role theories, cognitive and gestalt theories, learning and reinforcement theories, and psychoanalytic or self-theories. As a variant, this review takes a more integrated but compatible premise, based on people’s evolution in a social niche: to survive and thrive, people need other people. Several core social motives arguably result
from this perspective. Over the twentieth century, social and personality psychologists frequently have identified the same five or so core social motives, which should enhance social survival (Stevens and Fiske 1995). Belonging reflects people’s motive to be with other people, especially to participate in groups. Understanding constitutes people’s motive for shared social accounts of themselves, others, and surroundings. Controlling describes people’s motive to function effectively, with reliable contingencies between actions and outcomes. Self-enhancing comprises people’s tendencies to affirm the self. Trusting concerns people’s motives to see others (at least own-group others) positively. While these motives are not absolute (other reviewers would generate other taxonomies), not invariant (people can survive without them), nor distinct (they overlap), they do arguably facilitate social life, and they serve the present expository purpose. Within each core social motive, distinct levels of analysis address social psychological processes primarily within the individual, between two individuals, and within groups.
2. Theories of Belonging: Role Theories People’s core social motive to participate in group life generates theories that emphasize the place of the individual in the group. Role theories define people’s behavior as a function of their roles, their position in social structures and the expectations attached to that position, all in the context of group norms, the unwritten, informal rules that govern group life.
2.1 Belonging, Within the Individual Emphatically sociological, role theories describe the effects of group roles and norms on individual self-presentation and self-understanding. In the theater of social life, so aptly described in the dramaturgical theory of Goffman, self-presenting people define the scene, take on roles, and follow cultural scripts. Here, symbolic interaction with generalized others forms the self (Mead), or others’ reflected appraisals construct the looking glass self (Cooley). People’s motivation to fit into groups sparks strategic impression management (Schlenker) and self-presentation (Jones). Wanting to operate successfully in groups provides goals to convey desired (but not necessarily positive) self-images that are personally beneficial and socially believable, for a given audience. For example, people may ingratiate, presenting themselves as likable; self-promote, presenting themselves as competent; or supplicate, presenting themselves as helpless. When people feel under public or private scrutiny, they become objectively self-aware, in Wicklund’s theory; this awareness motivates matching to internalized social standards. People and situations
differ in motivating people to self-monitor; that is, to adjust their behavior to the demands of the social group (Snyder). Belonging motivates not only selves, but also attitudes and values. Merton’s reference group theory explained how people derive their values from belonging to particular groups or desiring to belong to them. Through face-to-face interaction, reference groups theoretically serve a normative function, defining acceptable standards for group membership (Deutsch and Gerard, Kelley). Some attitudes thus serve a social adjustive function (Katz, Kelman, Smith, Bruner, and White); that is, attitudes and values facilitate belonging.
2.2 Belonging, Between Individuals Motives to belong and get along with other people appear in theories describing reciprocity: expecting something in return. Cialdini’s compliance principles include reciprocity; people help other people who help them (see Gouldner on reciprocity norms). Exchange theories (see Sect. 4.2) posit that people expect fair outcomes (outcomes proportional to inputs), but later theories proposed different norms, depending on the relationship. Communal-exchange theory (Clark and Mills) describes exchange relationships (equitable outcomes) versus communal relationships (need-based outcomes). A. Fiske’s social relations model theorizes four relationship types and norms: equality matching, communal sharing, authority ranking, and market pricing. People’s adherence to relationships over merely self-serving outcomes appears in the dual-concern model (Pruitt and Carnevale): bargaining elicits both own-concern and other-concern. People also evaluate outcomes by procedural justice; that is, whether the process itself was fair (Thibaut and Walker, Lind and Tyler). Institutions and authorities especially are evaluated according to procedural justice, thereby eliciting loyalty or anger. Belonging also appears in leadership theories. Milgram’s theoretical analysis of destructive obedience to authority includes sociocultural contexts of obedience, subtle and progressive commitment, and denial of personal responsibility, all in the service of the group. More benign forms of authority appear in the contingency theory of leadership (Fiedler), describing how leaders influence members most successfully through relationship-oriented, rather than merely task-oriented, leadership, though other contingency models (e.g., Vroom) note that effective leadership style depends on context. Transactional leadership (Burns) adheres to norms of leader–member reciprocity, whereas transformational leadership changes the demands of the followers.
Regardless, individuals gain from belonging. Social support theories explain how relationships benefit
health and well-being. Main effects models (e.g., attachment, see Sect. 6) suggest a monotonic association between relationships and health. Buffering theories (Cohen, House, Sarason) posit that social support undermines the ill effects of stress, if the support fits the stressor. 2.3 Belonging, In Groups Crowds turn into groups, in Turner’s emergent norm theory, when people cue each other’s behavior. Theoretically, extreme cases result in deindividuation (Diener, Zimbardo); the self lost in the group. Less radically, people join groups because they are attracted to them, according to group cohesion theory (Hogg); solidarity transforms aggregates into groups (see Moreno’s sociometry). Networks of interaction processes (Bales) define the group structure along task, social, and activity dimensions. Groups valuing cohesion over task effectiveness suffer Janis’s groupthink, faulty decision-making that results from conformity pressures. French and Raven’s referent power and Kelman’s identification both influence through desire to belong. Group-level influence processes depend on the strength, number, and immediacy of people attempting influence (Latané’s social impact theory; Tanford and Penrod’s social impact model). These theories elaborate Zajonc’s social facilitation theory, which describes the presence of others facilitating an individual’s dominant response, benefiting well-practiced skills and undermining less-developed performances. In organizations, effective performance meets reviewers’ standards, maintains working relationships, and satisfies individual needs, in Hackman’s normative model of group effectiveness. Relatedly, Steiner describes process losses that endanger group productivity, when coordination or motivation fails. Social loafing diminishes motivation in groups (Latané), but social compensation (Williams and Karau) can enhance motivation in groups.
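As a rough illustration of how the strength, number, and immediacy named in social impact theory might combine, the following sketch assumes a multiplicative form in which impact grows as a power function of the number of sources, so that each additional source adds less than the one before. The function name, the default exponent of 0.5, and all numeric values are illustrative assumptions, not estimates drawn from the literature reviewed here.

```python
def social_impact(strength: float, immediacy: float, number: int,
                  exponent: float = 0.5) -> float:
    """Illustrative impact score: strength * immediacy * number ** exponent.

    An exponent below 1 (an assumption here) yields diminishing marginal
    impact as more people attempt influence.
    """
    if number < 0:
        raise ValueError("number of sources cannot be negative")
    return strength * immediacy * number ** exponent


# Marginal impact of the second source versus the third source.
gain_second = social_impact(1.0, 1.0, 2) - social_impact(1.0, 1.0, 1)
gain_third = social_impact(1.0, 1.0, 3) - social_impact(1.0, 1.0, 2)
print(gain_second > gain_third)
```

Under these assumptions, doubling strength or immediacy doubles the score, whereas quadrupling the number of sources only doubles it.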
The predominantly positive effects of group belonging all pertain to own group (in-group). Nevertheless, in-group favoritism (rewarding one’s own), not out-group deprivation, poisons intergroup relations. Tajfel’s social identity theory describes how in-group favoritism stems from self-esteem and categorization, the latter expanded in Turner’s self-categorization theory. These theories hold that categorization into in-group and out-group differs from other kinds of categorization (see Sect. 3) because the perceiver belongs to one of the categories: Mere categorization into groups promotes conflict. In contrast, Sherif’s realistic group conflict theory blames intergroup hostility on competition over scarce resources. If single in-group belonging promotes intergroup conflict, multiple belonging diminishes it, in the theory of crossed categorization (Dovidio and Gaertner).
3. Theories of Understanding: Gestalt and Cognitive Theories People’s core social motive to hold a coherent, socially shared understanding has inspired theories that account for various social cognitive processes. Gestalt theories in social psychology emphasize harmonious fit among elements that constitute a coherent whole, hence their influence on consistency theories of attitudes and on schema theories of social perception (see Sect. 3.1). Their descendants, modern social cognition theories, most often focus on dual processes, one automatic and based on superficial, peripheral heuristics or categories, the other controlled and based on systematic, central deliberation or individuation (see Chaiken and Trope 1999). 3.1 Understanding, Within Individuals Derived from gestalt psychology, three types of theory focus on processes within the social perceiver: attribution, impression formation, and consistency theories. Other attitude theories and self theories build indirectly on these origins, but still emphasize understanding as primary. Heider’s theories of social perception focused on harmonious, coherent wholes: invariance in perceived personality. Heider’s social perceiver, portrayed as a naive scientist, searches for consistencies in behavior, to make coherent dispositional attributions (inferring stable, personal causes). Other attribution theories developed: Jones’s theory of correspondent inference describes how perceivers impute dispositions that fit an actor’s behavior, attributions increased by a behavior’s unique (‘noncommon’) effects and low social desirability. Kelley’s covariation theory likewise notes a behavior’s distinct (i.e., unique) target and its degree of consensus (i.e., desirability) across actors, but adds the observation of consistency over time and circumstances.
Extending fundamental theories, stage models (Quattrone, Gilbert, Trope) converged on automatic categorization or identification of behavior, followed by dispositional anchoring or characterization, followed by controlled situational correction or adjustment. These recent theories bring dual process perspectives to attribution theories that emphasized controlled processes. A second line of person perception theories also originated in gestalt ideas and eventuated in dual-process models. Asch proposed a holistic theory of impression formation, in which the parts (most often personality traits) interact and change meaning with context. The alternative, an algebraic model that merely summed the traits’ separate evaluations, matured in Anderson’s later averaging model of information integration. Asch’s most immediate heirs were modern schema theories, examining impression formation as a function of social categories that cue organized prior knowledge. Following Taylor and
Fiske’s cognitive miser perspective, social perceivers were viewed as taking various mental shortcuts (below), schemas among them. Eventually, a more balanced perspective emerged, describing perceivers as motivated tacticians, who sometimes use shortcuts and sometimes think more carefully, depending on their goals. One such dual-process model, the continuum model (Fiske and Neuberg), holds that perceivers begin with automatic categories (e.g., stereotypes), but with motivated attention to additional information, perceivers may individuate instead. Brewer’s dual-process model posits automatic identification processes and controlled categorization, individuation, and personalization. Other mental shortcuts emphasize relatively automatic processes, as defined by Bargh to include being unintentional, effortless, unconscious, and unstoppable. Illustrative theories address influences by stimuli that are arbitrarily salient in the environment (Taylor and Fiske) or accessible in mind (Higgins, Bargh). In Kahneman and Tversky’s heuristics, people estimate probabilities by irrelevant but convenient processes: ease of generating examples (availability), ease of generating scenarios (simulation), and ease of moving from an initial estimate (anchoring and adjustment). Norm theory (Kahneman and Miller) posits that people retrospectively estimate probabilities via the simulation heuristic, bringing to mind similar but counterfactual scenarios. Although not a theory, a catalog of inferential errors and biases follows from people’s limited capacity (Nisbett and Ross; see Sect. 4.1). A third line of theories follows from gestalt approaches. Heider’s balance theory posits that perceivers prefer similarly evaluated people and things also to belong together. This emphasis fits other consistency theories.
Most prominently, Festinger’s cognitive dissonance theory holds that people seek consistency among the cognitions relevant to their attitudes, including their cognitions about their own behavior. Attitudes change more easily than behavior, especially behavior based on counter-attitudinal advocacy, forced compliance, free choice, unjustified effort, and insufficient justification (see updates emphasizing self-esteem (Aronson), accepting responsibility for an aversive event (Cooper and Fazio’s new look), and self-affirmation (Steele)). Other consistency theories hold that people seek harmony within and between their attitudes. Some attitude theories focus on understanding dual processes, downplaying consistency motives. The elaboration likelihood model (Cacioppo and Petty) describes the object-appraisal (understanding and evaluating) function of attitudes, but depicts two modes, depending on motivation and capacity: The peripheral mode processes persuasive communications based on superficial, message-irrelevant cues (such as communicator, context, or format), whereas the central mode processes message content, generating
cognitive responses pro and con, which predict persuasion. Chaiken’s heuristic–systematic model similarly proposes a rapid, simple process that contrasts with a more deliberate, in-depth process of attitude change. Focusing on understanding via deliberate control, far from gestalt perspectives, two subjective expected utility models predict attitude–behavior relations. Fishbein and Ajzen’s theory of reasoned action posits that behavior results from intention, which in turn results from attitudes toward a behavior (evaluating the behavior’s consequences, weighted by likelihood) and from subjective norms. Ajzen’s updated theory of planned behavior adds a third component to predict intentions, namely perceived behavioral control. Like theories of attitudes and social perception, theories of self-perception emphasize coherence. Self-schema theory (Markus) describes a few core dimensions for efficiently organizing self-understanding. Self-concepts may be more or less elaborate, resulting in respectively more stable and moderate or volatile and extreme self-evaluations (Linville’s complexity–extremity theory). People learn about themselves partly by looking at others. Festinger’s social comparison theory describes how people strive to evaluate themselves accurately, by comparing their standing relative to similar others, on matters of ability or opinion. Schachter extended this work in exploring the fear-affiliation hypothesis, whereby people under threat affiliate with similar others, perhaps to gauge the appropriateness of their emotional reactions. From this came Schachter’s subsequent arousal-cognition theory of emotion, positing that unexplained physiological arousal elicits cognitive labels from the social context.
Although the outside world may anchor self-understanding, other theories describe how the self anchors understanding of the world outside, as in Sherif and Hovland’s social judgment theory, explaining how people assimilate nearby attitudes within a latitude of acceptance and contrast far-off attitudes within a latitude of rejection.
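The expected-utility structure of the theory of reasoned action described in this section (consequences evaluated and weighted by likelihood, then combined with subjective norms, with perceived behavioral control added in the theory of planned behavior) can be sketched as a weighted sum. This is a minimal sketch under stated assumptions: the `Belief` class, the function names, the equal default weights, and the sample numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Belief:
    """One believed consequence of performing the behavior (hypothetical structure)."""
    likelihood: float  # subjective probability of the consequence, 0.0 to 1.0
    evaluation: float  # how good (+) or bad (-) the consequence seems


def attitude_toward_behavior(beliefs: Iterable[Belief]) -> float:
    """Sum each consequence's evaluation weighted by its perceived likelihood."""
    return sum(b.likelihood * b.evaluation for b in beliefs)


def intention(attitude: float, subjective_norm: float,
              perceived_control: float = 0.0,
              w_attitude: float = 0.5, w_norm: float = 0.5,
              w_control: float = 0.0) -> float:
    """Weighted combination of the components; the weights are illustrative only.

    With w_control = 0 this mirrors the two-component reasoned-action form;
    a nonzero w_control adds the planned-behavior component.
    """
    return (w_attitude * attitude
            + w_norm * subjective_norm
            + w_control * perceived_control)


beliefs = [Belief(likelihood=0.5, evaluation=2.0),
           Belief(likelihood=0.25, evaluation=-2.0)]
att = attitude_toward_behavior(beliefs)
print(att, intention(att, subjective_norm=1.0))
```

In empirical applications the weights would be estimated from data rather than fixed in advance; the equal weights here simply keep the arithmetic easy to follow.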
3.2 Understanding, Between Individuals Moving to understanding in actual interactions between people: although generally these theories do not take an explicit dual-process approach, one set of theories focuses on the role of old, familiar, prepackaged information, whereas another set focuses on the role of newly processed information. One of the fundamental principles of attraction is familiarity. Zajonc’s mere exposure theory describes people favoring people and objects encountered frequently. Similarly, people are attracted to people in propinquity (close proximity) (Festinger, Schachter, and Back), perhaps because of exposure. Although explained in non-cognitive terms, familiarity effects fit the principle of liking those who seem comfortable,
Social Psychology, Theories of safe, and easily understood. Consistent with this view, Newcomb developed the similarity principle of attraction, derived from balance theory (above), to explain the clear attraction to others of shared background, attitudes, etc. Similar others also maximize social influence, in Cialdini’s compliance principles. Just as theories describe how familiar others are liked, perhaps because easily understood, other theories emphasize dislike of dissimilar others, as in theories of stigma offered separately by Goffman, Jones, and Katz. Category-based theories of impression formation (Sect. 3.1) describe how negative outgroup categories stimulate negative understanding. Moreover, perceivers behaviorally confirm their negative categories of out-groupers, who sometimes affirm the stereotype (see Merton’s theory of self-fulfilling prophecies, Rosenthal’s theory of nonverbally communicated expectancies, Snyder’s theory of behaioral confirmation, Hilton and Darley’s theory of expectancy-confirmation processes). Mere knowledge of the stereotypes about one’s group creates stereotype threat (Steele), undermining the performance of distracted, frustrated, targets of stereotyping. Although familiar information tends to reproduce itself, new information has an impact, according to many theories. In noting the informational influence of others, Kelley, as well as Deutsch and Gerard, noted that reference groups can serve a informatie or comparatie function defining objective accuracy from the group’s perspective. In Darley and Latane! ’s cognitie model of helping, people consult social cues at each step of the process of noticing, interpreting, taking responsibility, deciding, and implementing help. Weiner analyzes attributions of responsibility as stemming from information about degrees of control and its internal or external locus. People acquire new information through social learning (Sect. 
4.2) that encourages them to help (Berkowitz, Rushton, Staub) or to aggress (Bandura). As people learn by doing, they are socially influenced by self-consistency and commitment, according to another of Cialdini’s compliance principles.
3.3 Understanding, In Groups
In groups, the familiar prior knowledge about one’s own group and other groups is captured by social identity theory and self-categorization theory (Sect. 2.3). In addition, group-level schemas and categories at the societal level take the form of Moscovici’s social representations; shared social understandings make the unfamiliar into the familiar by reference to old socially shared concepts.
People in groups also acquire new information. Some individuals hold power by virtue of expertise or information (French and Raven), evoking internalization (privately held beliefs; Kelman) in those they influence. More generally, persuasive arguments theory (Burnstein) describes how groups polarize shared opinions, compared to individuals, when they receive novel supporting information. Moreover, minority influence (Moscovici) within groups operates partly through minorities’ conviction provoking majorities to systematically process their arguments.
4. Theories of Controlling: Reinforcement Theories
The core social motive to perceive some contingency between one’s actions and one’s outcomes, a sense of effectance (White’s term) or control (in other formulations), drives a set of social psychology theories concerned with people’s desire to effect certain outcomes. These theories often predict people’s behavior from the economics of reinforcements (rewards and costs). People aware of situational contingencies act to maximize or at least improve their outcomes, in this view.
4.1 Controlling, Within Individuals
Attribution theories that focused on rational understanding by the naive scientist (Sect. 3.1) gave way to theories and mini-theories documenting the shortcut errors and biases of the cognitive miser (Sect. 3.1). Several of these errors share a focus on belief in human agency. The fundamental attribution error (Ross), correspondence bias (Jones), and the observer part of the actor–observer effect (Jones and Nisbett) all suggest that people over-rely on personality or dispositional explanations for other people’s behavior. People making attributions about another’s behavior underestimate the role of social situations, a phenomenon first noted by Heider.
Theoretically, people explain their own behavior situationally, the actor part of the actor–observer effect. Nevertheless, other theories emphasize people’s need to believe in their own agency. For example, Brehm’s reactance theory describes people’s resistance to limits on their behavioral freedom and attraction to threatened or constrained behavioral choices. Bandura’s social cognitive theory and self-efficacy motivation emphasize individual human agency. Bem theorized that people are as balanced as outside observers; his radical behaviorist self-perception theory suggests that people observe the contingencies controlling their own behavior and infer their attitudes from the pattern of freely chosen behavior. Indeed, people’s intrinsic motivation (Deci and Ryan; Lepper, Greene, and Nisbett) depends on free choice, autonomy, and interest; it declines under extrinsic reward (the overjustification effect), that is, under excessive external control. When people lack a sense of control, theories hold that they may exhibit learned helplessness (Seligman), leading to depression, and mindlessness (Langer), leading to unintelligent behavior, but they may also increase control motivation (Pittman).
4.2 Controlling, Between Individuals
Various social exchange theories concern outcome-oriented standards for enacting and judging relationships. Homans’s distributive justice applied elementary principles of operant (Skinnerian) behaviorism to social interdependence, holding that individuals expect rewards proportionate to costs. From this came equity theory (Adams; Walster (later Hatfield), Walster, and Berscheid), which argued that people seek fair ratios of outcome to investment. Although couched in reward–cost terms, inequity theoretically relates to dissonance (Sect. 3.1), creating a drive to reduce it. The exchange idea in relationships developed into interdependence theory (Kelley and Thibaut), which posits that human interactions follow from degrees, symmetries, bases, and kinds of dependence. In Levinger’s stage theory of relationships, people begin with a cost–benefit analysis.
Whether close relationships switch (Sect. 2.2) or not (Sect. 4.2) to a nonexchange (i.e., communal) orientation, people in relationships do control their own and the other’s outcomes, addressed by theories of intent attribution (Sect. 3.1), emotion in relationships (Berscheid), and accommodation to disruption (Rusbult). Even outside close relationships, outcome dependence motivates individuation (Fiske), undercutting stereotypes. Control over one’s outcomes appears in a cost–reward model (Dovidio, Piliavin) of helping. Although motives to control one’s outcomes theoretically underlie most positive relationships, they theoretically also underlie aggression.
Excessive control over another person, to obtain desired outcomes, motivates instrumental aggression (Geen), following the frustration–aggression hypothesis (Dollard, Miller, Doob, Mowrer, and Sears). Other formulations carry an explicitly cognitive neoassociationist analysis (Berkowitz) of aggressive patterns. Lack of control, brought on by environmental stressors (Anderson), enhances aggression, as does social learning (Bandura), whereby people are socialized to imitate successful aggressive acts. Incentives motivated the earliest frameworks for persuasive communication (Hovland, Janis, and Kelley), as well as more recent principles of compliance involving scarcity (Cialdini) as a threat to control over outcomes.
4.3 Controlling, In Groups
Group members theoretically analyze costs and benefits at role transitions during group socialization (Moreland and Levine). Interdependence of outcomes defines group life (Deutsch, Kelley and Thibaut, Sherif). Some people control outcomes more than others, placing them in positions of power, defined as the capacity to influence others (French and Raven); although only some bases of power (reward, coercive) explicitly relate to reinforcements, even the others (Sects. 2.3 and 3.2) could be viewed as incentive-based. Pure incentive orientation, however, induces only compliance (public performance, not private belief) (Kelman).
Belief in the power of human agency to control outcomes underlies several theories of group-level world views. Lerner’s just world theory posits individual differences in the belief that people and groups get what they deserve. Sidanius and Pratto’s social dominance theory posits individual differences in the belief that group hierarchy is inevitable; individuals high in social dominance orientation seek hierarchy-maintaining careers and exhibit high levels of intergroup prejudice; societies high in social hierarchy exhibit highly controlling social practices, including violent oppression, toward subordinate groups. Evolutionary social psychology, coming out of sociobiology (Trivers, Wilson), holds that individuals are motivated to dominate other individuals, to enhance their own reproductive outcomes (Buss, Kenrick).
5. Enhancing Self: Psychoanalytic and Self-theories
People seem motivated to have a special affection for self, and in the West this often takes the form of self-enhancement, whereas in East Asia, it may take the form of self-improvement, but with a certain sympathy for the self’s difficulties. Theories of the special role of the self, especially defense of self-esteem, influenced social psychology first through psychoanalytic perspectives, especially from Freud, of course, but also Erikson, Maslow, and Sullivan. Subsequently, the importance of self-esteem became a non-Freudian truism driving a range of theories at individual, interpersonal, and group levels.
5.1 Self-enhancing, Within Individuals
The clearest influence of psychoanalytic theory, the authoritarian personality theory (Adorno, Frenkel-Brunswik, Levinson, and Sanford), proposed that rigid, punitive, status-conscious child-rearing practices reinforce unquestioning, duty-bound, hierarchically oriented obedience. To maintain safety, the child idealizes the parents, leaving no outlet for the unacceptable primitive impulses of sex and aggression. According to the theory, these are displaced onto those of lower status, notably outgroups, leading to lifelong prejudice. Altemeyer’s recent right-wing authoritarianism has revived interest in prejudice as a self-enhancing motivational construct, absent the Freudian backdrop.
More broadly, several Western theories pick up the theme that people think, interpret, and judge in ways that elevate the self. Taylor and Brown suggest that positive illusions about the self, possibly inaccurate, benefit mental and physical health. Kunda describes motivated reasoning that protects self-esteem. Even people with stigmas buffer self-esteem by taking advantage of attributional ambiguity (Crocker and Major), attributing negative outcomes to prejudice instead of personal failings. More generally, Steele’s self-affirmation theory posits that people need to feel worthy, so a threat to one aspect of self can be counteracted by affirming another self-aspect. Terror management theory (Greenberg, Pyszczynski, and Solomon) holds that people specifically feel threatened by their own mortality, so to allay their anxiety, they subscribe to meaningful world-views that allow them to feel enduring self-worth. Similarly, the ego-defensive function (Katz, Smith, Bruner, and White) of attitudes protects the self from threat, whereas the value-expressive function of attitudes (Katz) more broadly affirms the self-concept.
The self-discrepancy theory of Higgins argues that self-regulation aims to attain positive self-feelings and avoid negative self-feelings. Discrepancies between the actual self and the desired ideal self cause depression, at the lack of the positive outcome. Discrepancies between the actual self and the ought self cause anxiety, at the failure to live up to standards.
5.2 Self-enhancing, Between Individuals
Various theories argue that people use relationships to enhance the self. Aron and Aron’s theory of self-in-relationship posits that relationship and self-concept merge, presumably affecting self-esteem according to relationship quality. Not all positive features of a close other boost self-esteem.
According to Tesser’s self-evaluation maintenance theory, people feel threatened by the success of close others in areas of self-identified competence, but threat diminishes with decreased closeness or identification. Even interpersonal altruism has been hypothesized to result from egoistic motives, such as negative state relief (Cialdini) that puts an end to the perceiver’s own guilt or sadness at another’s distress. Certainly aggression has been hypothesized to result from self-protective motives. Indeed, Baumeister argues that inflated self-esteem underlies much aggression and violence. More broadly, affective aggression (vs. instrumental aggression, Sect. 4.2) theoretically results from perceived threats to self, as in attributions of hostile intent (Dodge).
5.3 Self-enhancing, In Groups
According to sociometer theory (Baumeister and Leary), people base their self-esteem on their standing in the group; from a broad perspective, this fits the belonging motive (Sect. 2), but the theory also specifically addresses the process of self-enhancing in the context of a group. People’s sense of positive identity theoretically results from group membership (Sect. 2.1). Nevertheless, people balance group identity and personal identity, according to Brewer’s optimal distinctiveness theory, to achieve a positive view of self. To the extent that people incorporate the self-in-the-group (Smith), people will experience emotions on behalf of the group, presumably including group-based self-esteem. Traditional theories of relative deprivation (Stouffer, Runciman, Crosby) distinguish egoistic deprivation, which assumes that people base their feelings about their own esteem on their standing compared to others, from fraternal deprivation, which provides the parallel for one’s group standing compared to other groups. In each case, a form of self-esteem depends on gaps within or from a group.
The group as a basis for self-esteem theoretically differs in more individualistic vs. collectivistic cultures (Triandis) or more independent or interdependent cultures (Markus and Kitayama). In relatively individualistic, independent cultures, the self is autonomous from the group, and self-esteem is personal; most theories of self-enhancement result from this North American and European cultural perspective. The autonomous self moves in and out of many groups, with multiple group identities all affecting self-esteem. In relatively collectivistic, interdependent cultures, the self is subordinate to the group, and strives to enhance the group by improving the less important self. The interdependent self remains loyal to few primary groups, which sustain the self.
6. Trusting
Just as people have a special positive feeling for the self, they also have a positive feeling for others, at least in the in-group, all else being equal. Bowlby’s attachment theory argues for the survival value of children having secure relationships, a principle social psychologists extend to adult relationships.
6.1 Trusting, Within Individuals
Matlin and Stang’s Pollyanna hypothesis holds that people perceive and expect positivity across a variety of domains. People especially perceive individual people more positively than comparable groups or entities, in the person-positivity hypothesis (Sears). Given this positive baseline, negative information theoretically differs in salience (Fiske), diagnosticity (Skowronski and Carlston), and motivated vigilance (Peeters and Czapinski). People are risk-averse, according to prospect theory (Kahneman and Tversky). Taylor’s mobilization-minimization hypothesis describes how people react strongly to negative events, but quickly readjust to a moderately positive baseline. Gilbert and Wilson’s durability bias describes people’s ignorance of their own readjustment to baseline after unusually negative (or positive) events.
6.2 Trusting, Between Individuals
All relationships, but especially close ones, depend on trust, according to appraisal process theory (Holmes). Attachment theory, as applied to close relationships (Hazan and Shaver), describes how people vary in trust for others and value for self, resulting in a typology of relationship orientations. Trust benefits relationship quality and longevity. Trust, in an exchange form (Sect. 4.2), constitutes reciprocity (Sect. 2.2); in a communal form (Sect. 2.2), trust constitutes response to another’s needs. When people vicariously experience each other’s emotions, a strong form of communal sharing, they are more likely to help others, in accord with the empathy–altruism hypothesis (Batson, Eisenberg).
In game theory (Luce and Raiffa), trust entails cooperative instead of competitive behavior in mixed-motive games, in which outcomes conflict but need not sum to zero. Structural features determine cooperation, according to Deutsch’s theory of trust and suspicion; when people can communicate intent, they trust more. Larger groups enable less trust, according to Yamagishi and Sato’s theory of trust.
6.3 Trusting, In Groups
Shared goals promote trust both within and between groups, in an interdependence theory (Sect. 4.3) analysis. But shared goals also contribute to the identity of the group (Sect. 2.3). Another theoretical account is that superordinate goals require collaborative effort (Sherif and Sherif) that creates trust and overcomes intergroup divisions. Under optimal conditions, Allport’s contact hypothesis specifies that equal status, cooperative, pleasant, sanctioned intergroup contact reduces prejudice and increases trust between groups. Members of distinct groups may come to view themselves as one-group (Gaertner and Dovidio).
7. Conclusion and Disclaimer
Theories flourish in social psychology, usually accompanied by rigorous programs of research. This review has not evaluated the empirical success of the noted theories, merely recorded some prominent ones. At this count, the 32-volume American Advances in Experimental Social Psychology, the 10-volume European Review of Social Psychology, the 4-volume Personality and Social Psychology Review, as well as social psychological theories appearing in Psychological Bulletin and Psychological Review (by computerized search) amount to just short of 600 separate theories, not to mention the ones appearing only in ‘empirical’ journals or books. The present article comprises a series of pointers to theories that reflect both past and current thinking, in a social evolutionary framework of five (plus or minus five) core social motives that facilitate social life.
See also: Attitude Change: Psychological; Attitude Formation: Function and Structure; Attitudes and Behavior; Attributional Processes: Psychological; Cognitive Dissonance; Conflict and Conflict Resolution, Social Psychology of; Group Decision Making, Social Psychology of; Heuristics in Social Cognition; Impression Management, Psychology of; Intergroup Relations, Social Psychology of; Justice: Social Psychological Perspectives; Leadership, Psychology of; Love and Intimacy, Psychology of; Mental Representation of Persons, Psychology of; Motivation and Actions, Psychology of; Obedience: Social Psychological Perspectives; Person Perception, Accuracy of; Reactance, Psychology of; Schemas, Social Psychology of; Self-evaluative Process, Psychology of; Small-group Interaction and Gender; Social Categorization, Psychology of; Social Cognition and Affect, Psychology of; Social Comparison, Psychology of; Social Dilemmas, Psychology of; Social Identity, Psychology of; Social Influence, Psychology of; Social Psychology; Social Relationships in Adulthood; Stereotypes, Social Psychology of
Bibliography
Allport G W 1954 The historical background of modern social psychology. In: Lindzey G (ed.) The Handbook of Social Psychology, 1st edn. Addison-Wesley, Reading, MA, Vol. 1
Berkowitz L (ed.) 1964/1989 Advances in Experimental Social Psychology, Vols. 1–22. Academic Press, New York
Chaiken S, Trope Y (eds.) 1999 Dual-process Theories in Social Psychology. Guilford, New York
Deutsch M, Krauss R M 1965 Theories in Social Psychology. Basic Books, New York
Gilbert D T, Fiske S T, Lindzey G (eds.) 1998 The Handbook of Social Psychology, 4th edn. McGraw-Hill, New York
Higgins E T, Kruglanski A W (eds.) 1996 Social Psychology: Handbook of Basic Principles. Guilford, New York
Jones E E 1985 Major developments in social psychology during the past five decades. In: Lindzey G, Aronson E (eds.) The Handbook of Social Psychology, 3rd edn. Random House, New York, Vol. 1
Jones E E, Gerard H B 1967 Foundations of Social Psychology. Wiley, New York
Manstead A S R, Hewstone M (eds.) 1995 The Blackwell Encyclopedia of Social Psychology. Blackwell, Oxford, UK
1997/2000 Personality and Social Psychology Review 1–4
Ross L R, Nisbett R E 1991 The Person and the Situation: Perspectives of Social Psychology. McGraw-Hill, New York
Shaw M E, Costanzo P R 1982 Theories of Social Psychology, 2nd edn. McGraw-Hill, New York
Stevens L E, Fiske S T 1995 Motivation and cognition in social life: A social survival perspective. Social Cognition 13: 189–214
Stroebe W, Hewstone M (eds.) 1990/1998 European Review of Social Psychology, Vols. 1–8. Wiley, Chichester, UK
Taylor S E 1998 The social being in social psychology. In: Gilbert D, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. McGraw-Hill, New York, Vol. 1
West S G, Wicklund R A 1980 A Primer of Social Psychological Theories. Brooks/Cole, Monterey, CA
Zanna M P (ed.) 1990/2000 Advances in Experimental Social Psychology, Vols. 23–32. Academic Press, New York
S. T. Fiske
Social Question: Impact on Social Thought
Modern social science originated not only within an intellectual project of widening rational knowledge, but also as a means to govern what Simmel (1908) called ‘the material of social processes.’ The making of what Comte (1848) described as ‘the preponderance of a social point of view’ implied that there were problems that no established scientific approach could successfully address. Even more, the individualistic premises of modern legal and economic sciences themselves opened new problems from the vantage point of social organization. If the major trend of modernity has been toward the legal emancipation of the individual and the economic emancipation of labor from the burdens of traditional society, the emergence of social thought was to make evident that this could not suffice to organize a stable modern society. There was a need for models of organization and social integration, i.e., for a ‘science of social order.’ For such a science, the first object of analysis was the ‘social question’, that is, the mass pauperism growing alongside the rise of modern liberalism that social thought regarded as a danger for social stability. This became particularly clear in France, where revolutions brought the problem of popular poverty to the forefront of the political arena.
1. The Experience of the Revolution
The idea that the French Revolution played an important role in the rise of sociology has been established by important sociologists such as Reinhard Bendix and Robert Nisbet. The very term ‘social science’ became official during the French Revolution through authors like Condorcet (Baker 1964, McDonald 1993). Less commonly acknowledged is that its development was linked to the question of poverty and the new political importance it assumed. The Revolution was more than an intellectual advance in rationalism; it was a dramatic experience cutting bridges with most of what was known and abruptly revealing the depth of the social crisis. The new awareness of the crisis, more than the rationalist critique, was crucial to social thought.
Nisbet (1943), stressing the impact of the French Revolution on the making of sociological concepts, suggested that the experience of the crisis focused them on the ‘social group’, as their own response to eighteenth-century social theory centered on the opposition between the state and the individual. Social thought developed by establishing the group as an object of science, not only with an ethical implication, but also with a primary theoretical value in the study of man. Why the group? Because the impact of the Revolution upon traditional social groups had driven the crisis to the point of destroying the whole institutional frame, eroding the very bases of social belonging. The ‘rational state,’ as Nisbet put it, which eighteenth-century social theory had regarded as the solution to the crisis of traditional bonds (religion, family, guilds, and associations) and the correlate of individuals’ emancipation, no longer seemed able to nourish a real sense of social belonging. The group raised a new problem that could not be faced in economic or in juridical terms: to rebuild society required new foundations for the social bond. The experience of the crisis and the condition of insecurity due to the destabilization of order opened the way to a science that would fulfill the task of laying down the proper conditions for a new social order.
2. The Autonomy of ‘Society’
The ‘social question’ was the most dramatic evidence for both the crisis and the need for a new social order. Poverty could no longer be regarded as a mere individual condition to be dealt with by charitable support and police repression, as in the Old Regime treatment of mendicancy and vagrancy. The political emancipation gained through the Revolution, as well as the economic emancipation guaranteed by free labor, made the increase of poverty among the urban popular masses an issue of public concern (Simmel 1908). How could the poor play a role in the new social order, be inscribed in the political arena of universal citizenship and in the liberalized market of productive labor? As Hannah Arendt (1963) stressed, the social question broke the social unanimity around the revolutionary project, and from then onwards it has been the stumbling block of any revolution.
The modern idea of rights, making the poor just citizens the same as everyone else, and the modern economic imperative to include them in productive labor, converged in giving a new visibility to the social question. In fact, poverty challenged both the achievement of individual autonomy and the generalization of free labor. It was therefore important to act against the social causes of poverty, and therefore to know them, which neither legal nor economic sciences could do.
The importance of the poverty question grew together with the political awareness that a stable society demanded an end to the revolution. Both social and political problems contributed to a realization that social change could not be understood through individual actions, but rather as the result of relatively impersonal processes. Society appeared to have autonomous conditions for its own development and an autonomous ability to resist change, both impossible to curb by law or through the market.
The social question was the first object of application where the autonomy of social thought’s premises, concepts, and methods was at stake. It was thus the arena for a gradual autonomization of social thought from economic and legal categories, founding the problem of order as specifically social (Procacci 1993). In this way, it represented a crucial step toward the construction of the very idea of society as a theoretical concept and at the same time a practical field of investigation.
3. The Science of Social Order
The analysis of the social question had a crucial impact on the making of positive sociology. With Saint-Simon and Comte, the search for a new social order was firmly located within the antisocial effects of industrial transformation: how was it possible to organize inequality in a way that would not prevent social stability? Their foundation concepts were marked by their source in the analysis of social inequality.
First of all, they carried a vision of society as an organic entity, essentially concerned with the cohesion of all its parts; hence the importance of the metaphor of social organism. Most importantly, this located sociology on the side of the whole social body, the common interest, the collective being, apart from the opposition opened by liberalism between individual and collective points of view.
Secondly, all dangers for social order were interpreted in the perspective of social pathology, calling for some kind of therapeutics. Poverty was a pathological element within the social body; the task of social thought was to identify the means for restoring a normal society. Hence the importance of medical knowledge, with the normative character intrinsic in the idea of restoring a normal society. Methods of inquiry were also drawn from medical practice: observation of symptoms, collection of cases, and statistical recurrences. This shaped concepts animated fundamentally by a will to reform. There was no longer a pure science; the programme of positive sociology was a science able to ground in positive knowledge the criteria for acting to improve human conditions.
Thirdly, the foundation concepts of positive sociology inherited the moral element that had characterized the philanthropic approach to poverty. Particularly with Comte, the problem of order became a question of the moral consistency of society, impossible to solve through legal or economic reforms. His main theoretical questions concerned the position of the bourgeoisie and the working class in the new social order (Elias 1978). For this reason, once the revolution exploded again in 1848 and the popular movement claimed political and economic reforms to solve the social question, he felt the need to engage in the public debate. His ‘Positivist Society’, created a few days after the February insurrection, had the explicit task of setting the positivist view of the social question against that of the socialists. Insufficient labor did not depend upon capitalist economy, but upon an irrational organization that scientific development would overcome. The emphasis was rather on society’s moral engagement to fight against poverty, a problem more difficult to solve: it required putting ‘social sentiment’ in the forefront, by promoting a sociability that the Comtean concept of ‘duty’ was to organize as the cement for new social bonds.
4. Democracy and Socialism
A different perspective is the one opened by Tocqueville’s analysis of democracy. Elected in 1848 to the parliament by the first universal manhood suffrage, he was directly involved in writing the new Constitution which was to put an end to the revolution. As such, he politically intervened against the reforms claimed by the popular movement. Political organization of democracy above all demanded fighting its tendency to ally with a centralized power. The social question implied such a danger, strengthening the state that was expected to solve it; therefore nothing could be granted, except an exclusively moral engagement of society to reduce poverty. Distinguishing moral duties (la morale) from legal rights (le droit) was strategic in preventing the state from becoming either an economic agent providing jobs or a regulator reviving old protectionism. A political solution could only come from organizing democracy: pushing people to act on collective issues and promoting public virtue instead of private freedom were the only means for undermining the spectre of state despotism. The perspective of self-government led to the rehabilitation of associations as the arena where people would exert their public responsibility and resist the state.
The 1848 revolution was a crucial moment for the consolidation of social thought through its analysis of the social question. The work of Lorenz von Stein (1850) provided important evidence and was a crucial source for Marx. Indeed, it would be difficult not to mention the influence of the social question on socialist social thought, as the revolution of 1848 exemplified. The experience of the French revolutions was also central for Marx’s historical concept of class, conceptualizing the fundamental inequality producing poverty in a capitalist economy. But class conflict was for Marx (Marx and Engels 1848) the key to development, and not a pathology to be cured; there could not be any other solution to it but the revolution. The strategic programme of Marxian social thought differed deeply from that of sociology: it was not a question of reanimating society through moral duties and intermediary associations. In this light, there is more continuity with Durkheim’s theory of social solidarity. In his study on socialism, Durkheim (1895) stressed its affinity with sociology, given the central role played by organization and a set of social rules controlling the economic sphere. Since societies autonomously tend toward solidarity, the antagonism between capital and labor was a danger, and could only result from disorganization. The social question was a problem of socialization; even unions might work as a means of social integration. The concept of organic solidarity combined the characteristics of positive sociology and the attempt to enhance democratic strategies: it provided criteria for techniques reorganizing social bonds, such as the division of labor and the resurgence of intermediary groups. In conclusion, the analysis of poverty contributed to strengthening a critique of economic and legal individualism, pointing towards a new science of social solidarity.
The impact of the social question tied the fate of social science to the development of an administrative state and drew it into the political arena, thus promoting the re-establishment within the social body of those groups that liberalism had strategically eliminated.
See also: Civil Rights; Democracy; Garbage Can Model of Behavior; Inequality; Institutions; Liberalism; Modernization, Sociological Theories of; Poverty, Sociology of; Revolutions, Sociology of; Socialism
Bibliography
Arendt H 1963 On Revolution. Viking Press, New York
Baker K 1964 The early history of the term ‘social science’. Annals of Science 20: 211–25
Comte A 1848/1957 The social aspect of positivism. In: A General View of Positivism [trans. by Bridges J H]. R. Speller, New York, Chap. 2
Durkheim E 1895 Le Socialisme, 1971. PUF, Paris
Elias N 1978 What is Sociology? [trans. by Mennell S, Morrissey G]. Hutchinson, London
Foucault M 1972 L’Histoire de la Folie à l’Âge Classique, Suivi de mon corps, ce papier, ce feu et la folie, l’absence d’œuvre, 2nd edn. Gallimard, Paris
Marx K, Engels F 1848/1964 The Communist Manifesto. Washington Square Press, New York
McDonald L 1993 The Early Origins of the Social Sciences. McGill-Queen’s University Press, Montreal
Nisbet R 1943 The French Revolution and the rise of sociology in France. American Journal of Sociology 49: 156–64
Procacci G 1993 Gouverner la Misère. La Question Sociale en France. Seuil, Paris
Simmel G 1908 Sociology, Untersuchungen über die Formen der Vergesellschaftung. Leipzig, Germany
von Stein L 1850/1921 Geschichte der Soziale Bewegung in Frankreich. Salomon G (ed.). Munich, Germany
Tocqueville A 1851/1978 Souvenirs. Gallimard, Paris [Recollections]
G. Procacci
Social Relationships in Adulthood
1. Basics About Relationships
1.1 What is a Relationship?
What constitutes a relationship? The answer, many relationship researchers will say, is simple. A relationship exists between two people when each person influences the other’s thoughts, feelings, and/or behavior. In other words, a relationship exists when people are at least minimally interdependent (Kelley et al. 1983). As the frequency, diversity, impact, and/or time span over which people influence one another increase, the relationship becomes increasingly interdependent or, according to Berscheid et al. (1989), closer.
1.2 Types of Relationships
Interdependence takes many forms, a fact reflected in the diversity of terms we use to refer to relationships—friendships, romantic partnerships, sibling relationships, business partnerships, and physician–patient relationships. Most past, and many current, relationship researchers have chosen one of these types of relationships to describe in detail. This has resulted in separate literatures on friendships, romantic relationships, marriages, and sibling relationships, and these literatures continue to expand. Increasingly, however, researchers are focusing upon conceptual differences between relationships.
Some have identified conceptual dimensions along which relationships vary. For instance, Wish et al. (1976) find evidence for four dimensions underlying people’s ratings of relationships: cooperative/friendly versus competitive/hostile; equal versus unequal; intense versus superficial; and socioemotional/informal versus task-oriented/formal. Blumstein and Kollock (1988) suggest that relationships vary according to whether they do or do not involve kin, sexual/romantic aspects, cohabitation, differences in status, and cross-sex participants. Still others have proposed taxonomies including sets of qualitatively distinct categories. For instance, Fiske (1992) posits that four basic rules direct individuals’ relations with others in all social interactions: people may share a common identity and treat one another equally (communal sharing); people may attend to status differences and give benefits accordingly (authority ranking); people may weigh the utility of their behavior in achieving desired outcomes for themselves (market pricing); and finally, people may follow a quid-pro-quo principle (equality matching). Clark and Mills (1979, 1993) distinguish between: communal relationships, wherein benefits are given, non-contingently, on the basis of the recipient’s needs; exchange relationships, wherein benefits are given with the expectation of receiving comparable benefits in return; and relationships in which people are simply concerned about their own self-interest. They also emphasize a quantitative dimension relevant to communal relationships—the degree of responsibility that one person feels toward another. The stronger the communal relationship, the greater the costs (in time, effort, money, and foregone opportunities) a person will incur to respond to the other’s needs.
Most recently, Bugental (2000) has proposed grouping relationships within an attachment domain, a hierarchical power domain, a coalitional group domain, or a reciprocity domain. In addition to thinking more about the conceptual dimensions and qualities characterizing different relationships, many researchers now emphasize understanding the interpersonal processes that influence relationships.
1.3 On What Types of Relationships is Most Psychological Research Currently Focused?
Despite recognition that a variety of conceptually distinct relationships exist, research on processes occurring within relationships is not evenly distributed among these types. Most research has been conducted on intimate, or potentially intimate, relationships (e.g., friendships, romantic partnerships, and marriages). These relationships tend to fall in attachment or reciprocity domains (Bugental 2000). Further, people expect intimate relationships to be cooperative, friendly, equal, intense, social-emotional (Wish et al. 1976), and communal in nature (Clark and Mills 1979, 1993). Increasingly these relationships are being studied longitudinally. A much smaller corpus of research exists on interpersonal processes relevant to competitive (Wish et al. 1976), market pricing (Fiske 1992), authority ranking (Fiske 1992), or hierarchical (Bugental 2000) relationships, and little of it is longitudinal in nature. Hence, just a few comments are devoted to research on competitive and hierarchical relationships here. More extensive comments are included relevant to interpersonal processes within friendships, romantic relationships, and family relationships.
2. Research on Non-intimate Relationships
2.1 Hierarchical Relationships
Many social relationships exist in which one member has power or authority over the other, and psychologists have reported some interesting findings relevant to such relationships. Fiske (1993), for instance, found that one’s position in a hierarchical relationship influences one’s tendency to view one’s partner in stereotypical terms. Those with less power tend to pay careful attention to information about powerful people, especially information that contradicts stereotypes of these individuals. In contrast, in thinking about their underlings, those in more powerful positions rely on stereotypes. Presumably underlings have more time to attend to details about those who have power over them and are motivated to do so by a desire to gain control over the situation. Those in power who oversee many people have less time and, given the control their power affords them, less need to attend to details regarding underlings. Other researchers have found disturbing links between some men’s ideas of power and their tendencies to view women as attractive. Bargh et al. (1995) used a questionnaire to measure males’ propensity to sexual aggression. Later, these men participated in a study with a woman. Some men were primed to think about power (by sitting behind a large desk), while others were not. Among those prone to sexual aggression, being primed to think about power was associated with seeing the woman as being significantly more attractive—a link of which these men were not consciously aware.
2.2 Relationships Between Negotiators
A great deal of work exists on interpersonal processes occurring between people who must negotiate with one another to come to a solution to a problem, most often involving a division of resources. Much early work focused on how individual differences in personality and gender influence negotiating, yet it yielded little in the way of findings.
Other early work on how situational factors influence negotiations was more fruitful. For example, the presence of observers has been shown to lead negotiators to express greater advocacy for previously announced positions, whether or not the observers have a vested interest in the outcome. Later work, falling within a ‘behavioral decision perspective,’ has explored how negotiators deviate from rationality in their decision-making. Findings suggest that negotiators are too influenced by whether risks are framed positively or negatively, rely too heavily on the most available information, and are too confident about the likelihood of attaining outcomes that favor themselves. The nature of preexisting relationships between negotiators has also been shown to influence the process of negotiation. For instance, negotiators prefer equal payoffs when the relationship is positive or neutral. In contrast, when the relationship is negative, they prefer inequality—favoring themselves. Other findings show that when relationships are close there is more information sharing, less coercive behavior, less demanding initial offers, and quicker agreements. Surprisingly, however, joint outcomes are not superior overall. (See Bazerman et al. 2000 for a summary of this and other work on relationships between negotiators.)
3. Research on Intimate Relationships
3.1 A Very Active Area With Some Well-established Findings
Psychological research on intimate relationships is flourishing. Twenty-five years ago, the relevant literature was small and limited almost entirely to studies of initial attraction between people. The research was primarily experimental and conducted within the confines of a laboratory. When ongoing relationships were examined, the dependent measure of interest was, most often, global satisfaction. Today the relationship literature focuses largely on naturally occurring, ongoing relationships and on specific interpersonal processes relevant to forming, building, repairing, maintaining, and (sometimes) ending relationships. Although some research remains experimental and laboratory-based, field research, often involving longitudinal surveys and diary studies of ongoing relationships, has taken center stage. The field has advanced sufficiently that there is clear consensus on some facts about close relationships. These facts are discussed next.
3.2 Having Relationships is Good for You
One fact that highlights the importance of research in this area is simply that, overall, having relationships is associated with good mental and physical health. Having friends and family members with whom one maintains regular contact, and/or a romantic relationship, is associated with greater overall reports of happiness and life satisfaction (e.g., Diener 1984, Myers 1992). Researchers using self-report measures have found that people report feeling best when they are with other people. Being socially involved (as indicated by such diverse indices as one’s network size, the density of that network, simply being married or not (especially for men), and perceiving that one has social support available if needed) is linked to better physical health and a lower risk of death (e.g., Berkman and Syme 1979, Cohen 1988). Indeed, many studies have related social involvement to lower susceptibility to the common cold (Cohen et al. 1997), cancer (Glanz and Lerman 1992), and cardiovascular disease (Berkman et al. 1993). That having social relationships is linked to better mental and physical health is clear. However, why this is so remains unclear. Personality attributes may predict both good relationships and good health. Thus, having relationships may not necessarily be causally linked to good health. However, most researchers in this area believe that it is. Perhaps our relationship partners encourage good health practices. Perhaps they help us cope with and reduce stress. Perhaps just knowing that our partners are there to support us lessens our anxiety (and its physiological concomitants). Such issues are the subject of current research.
3.3 We Can Predict Who Forms Relationships With Whom
People tend to form relationships with those who are physically or functionally proximal to them. However, we come into proximity with many others. With which of these people will we form relationships? One factor that influences us is what members of our already existing relationships think. We tend to form relationships with people of whom our present relationship partners approve (Parks et al. 1983). Another factor is similarity. The evidence is clear that ‘birds of a feather flock together.’ People form relationships with those who are similar in attitudes, age, race, and educational background. In the case of romantic relationships, we also tend to form relationships with others whose physical attractiveness matches our own. This may be because we are drawn to situations which attract similar others. Alternatively, we may actively seek out similar others. Evidence suggests we generally assume people are similar to us and that the discovery of dissimilarities may be what drives people apart (Rosenbaum 1986). With one notable exception, there is very little evidence for complementarity in attraction. Opposites do not seem to attract. Even work on
masculinity and femininity does not suggest that couples including a masculine male paired with a feminine female get along particularly well. Rather, it appears that compatibility is greater when each member of a couple possesses both masculine and feminine traits (Ickes and Barnes 1978, Zammichieli et al. 1988). Performance domains appear to be the one exception to the rule that ‘birds of a feather flock together.’ Specifically, there is evidence that when an area of performance is central to one’s self-definition—say, being good at academic subjects—a close other’s good performance in the same area can be threatening to one’s self-esteem. That, in turn, can drive the relationship apart. In contrast, partners excelling in different performance domains can promote the relationship. That way each person can bask in the other’s reflected glory, and neither suffers from social comparison with the other (Erber and Tesser 1994). Still another important determinant of initial attraction is whether a potential partner likes you. Backman and Secord (1959) have demonstrated this experimentally, and Kenny has demonstrated this in studies of ongoing relationships (Kenny and La Voie 1982). Finally, we note that there is ample evidence that physical attractiveness is a powerful determinant of initial attraction (e.g., Berscheid et al. 1971). Although both males and females seem to value attractiveness in a relationship partner, it seems especially important to males when judging females. Evolutionary theorists suggest that this has resulted from attractiveness being a signal of the health and presumed fertility of that potential mate (Buss 1989). Evolutionary theorists’ work on mate preferences also suggests that women especially value males with good resources (presumably because they can provide for offspring).
At the same time that these theorists have found evidence of women placing high value on resources in males and of males placing high value on attractiveness in females, they have also found that both males and females value such attributes as kindness and honesty in one another. Indeed, most people rate these attributes as being more important than resources or attractiveness.
3.4 Responsiveness to Needs is a Process that is Important to Good Relationship Functioning
Social psychologists have made considerable strides in identifying interpersonal processes that are important to good relationship functioning once relationships have been established. Responsiveness to one another’s needs is one such process. It has long been recognized, initially by reinforcement theorists, that receiving rewards from one’s partner is important to relationship satisfaction (e.g., Byrne and Clore 1970). However, recent theory and empirical research have demonstrated that the manner in which people meet
(or fail to meet) one another’s needs is very important to a person’s sense that his or her partner is truly concerned for his or her welfare. This work suggests that a person needs to receive more than benefits from another person to feel intimate with that person. The person needs to believe that the relationship partner understands, validates, and, ultimately, cares for him or her. This appears to emerge from such things as the other responding appropriately to one’s statements of needs, self-disclosures, and emotional expressions by listening carefully, responding with understanding and acceptance, and, ultimately, by actually helping (Reis and Patrick 1996) (see Love and Intimacy, Psychology of). It seems likely that these processes, in turn, are facilitated by having a partner who is high in empathic accuracy—that is, by having a partner who accurately perceives one’s thoughts and emotions (Ickes 1997). Feeling cared for also appears to be enhanced when partners do not seem to be giving benefits to get benefits. For instance, forgoing self-interest to benefit a partner enhances trust in the partner’s caring more than giving a benefit that aids both the recipient and the giver (Holmes and Rempel 1989). People led to desire a close relationship with another want that relationship to be characterized by noncontingent responsiveness to needs (Grote and Clark 1998). That is, they react more positively to benefits given without strings attached than to those given with the expectation of repayment (Clark and Mills 1979), and they do not wish to keep track of just who has contributed what to the relationship in order to allocate benefits (Clark 1984). Giving benefits with strings attached or in exchange for comparable benefits calls the fundamental caring component of relationships into question.
For relationships to be characterized by responsiveness to needs, not only is it important for each person to try to meet the other person’s needs, it is also important for each person to convey their needs to their partner. This greatly facilitates the partner’s ability to respond to those needs. Thus, willingness to self-disclose has been shown to promote relationships (Cohen et al. 1986). So, too, does it appear that freely expressing one’s emotions—which convey needs—is good for relationships (Feeney 1995). From a personality perspective, it is noteworthy that researchers have found that individual, chronic, ‘attachment styles’ are closely related to perceptions that others will meet one’s needs and hence to a wide variety of behaviors relevant to meeting needs in close relationships. People with a secure attachment style (presumably resulting from having had appropriately responsive caregivers early on) are more effective support-givers and -seekers than are those with insecure attachment styles (e.g., Simpson et al. 1992). Indeed, research on attachment styles has been the fastest-growing area of relationship research over the past two decades (Cassidy and Shaver 1999) (see Adult Psychological Development: Attachment).
3.5 Other Processes Associated With High- vs. Low-quality Close Relationships
Responsiveness to needs is not the only interpersonal process important for maintaining high-quality intimate relationships. A number of studies suggest that viewing one’s partner positively and reconstruing a partner’s faults as virtues is also beneficial for relationships (Murray and Holmes 1993). (Perhaps this aids in maintaining faith that one’s partner really does care about one’s needs.) These same researchers have also found evidence that when people who are satisfied with their dating relationships do see faults in those relationships, the negative thoughts tend to be linked in memory to positive thoughts about their partners in what they call a ‘yes but’ manner. For instance, a member of such a couple might say, ‘I think his laziness is actually kind of funny; it gives us a reason to laugh.’ Less-satisfied couples tend to link negative thoughts about the partner with other negative thoughts about the partner and the relationship (Murray and Holmes 1999). Seeing one’s partner in a more positive light than one’s partner sees him- or herself also seems good for one’s relationship and appears, over time, actually to result in one’s partner living up to one’s ideals (Murray et al. 1996). Reflection and social comparison are two more processes that appear to influence the quality of intimate relationships. Reflection refers to feeling that a partner’s performance reflects on you, either positively or negatively. Social comparison refers to the process of comparing one’s own performance in a given domain with a partner’s performance and discovering that one has performed better or worse than that partner. Each process can benefit or harm a relationship.
When a performance domain is irrelevant to one’s identity, it appears that a partner’s good performance in that domain enhances one’s own self-esteem (and helps the relationship), whereas a partner’s poor performance decreases one’s self-esteem (and harms the relationship). The closer the relationship, the greater the effects (Erber and Tesser 1994). Still other processes that appear to influence the quality of relationships have been identified. For example, research based on Aron and Aron’s self-expansion model (1996) has demonstrated that when we fall in love our conception of ourselves expands to include aspects of the other person and that doing exciting and novel activities with one another can increase the quality of a couple’s relationship (see Interpersonal Attraction, Psychology of). To give another example of processes that appear to increase the quality of relationships, Rusbult has reported research showing that highly committed couples are especially likely to respond constructively (i.e., to accommodate) in the face of a partner’s poor behavior and that such couples are also likely to engage in such relationship-protective processes as seeing attractive (and therefore threatening) alternative partners as
significantly less attractive than their less committed peers do (Johnson and Rusbult 1989; Rusbult and Van Lange 1996). Interdependence theory has been particularly useful in pinpointing factors independent of ongoing interpersonal processes occurring within the relationship that influence both satisfaction and commitment to a relationship. For instance, although interdependence theorists agree with reinforcement theorists and with those emphasizing responsiveness to needs that rewards received (and costs avoided) are important determinants of relationship satisfaction, they add the important point that the expectations people bring with them to any particular relationship will influence satisfaction. The higher a person’s expectations (comparison level) are for the relationship, the tougher their standards will be and, potentially, the lower the satisfaction with any given relationship. Interdependence theorists also emphasize that satisfaction is not the only, nor necessarily even the most important, determinant of the stability of a relationship. Instead, they have identified a separate variable—commitment to a relationship—as a stronger determinant of relationship stability. Satisfaction does contribute to higher commitment to a relationship, but there are other important determinants of commitment as well. One is ‘comparison level for alternatives.’ The better one’s alternatives, the less committed one will be to the current relationship (independent of satisfaction). This means that one could lack commitment to a relationship which is, in and of itself, satisfying or be committed to a relationship which is, in and of itself, unsatisfying (but more satisfying than one’s alternatives). Investments (anything put into the relationship that cannot be easily retrieved should one exit the relationship) also influence commitment. 
Having investments such as joint friends, joint memories, children, a joint house, well-established routines, and so forth all increase one’s sense of having invested in a relationship and, hence, one’s commitment to that relationship. Finally, outside prescriptives, either coming from oneself or one’s social circle, can influence commitment which, as we have already noted, is an important predictor of relationship stability as well as of partners’ willingness to accommodate one another (Rusbult and Van Lange 1996).
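The verbal account of interdependence theory given above can be summarized schematically. What follows is only a minimal illustrative sketch, not a formula stated in this entry: the symbols $O$ (outcomes, i.e., rewards received minus costs incurred), $CL$ (comparison level), $CL_{\text{alt}}$ (comparison level for alternatives), and $I$ (investments), together with the additive form and the positive weights $w_1, w_2, w_3$, are assumptions introduced here solely to make the direction of each relationship explicit. Outside prescriptives are left out because the text does not state the direction of their influence.

```latex
\begin{align*}
\text{Satisfaction} &= O - CL \\
\text{Commitment}   &= w_1\,\text{Satisfaction} \;-\; w_2\,CL_{\text{alt}} \;+\; w_3\,I,
\qquad w_1, w_2, w_3 > 0
\end{align*}
```

Read this way, a person can be satisfied ($O > CL$) yet weakly committed when $CL_{\text{alt}}$ is high, or dissatisfied ($O < CL$) yet committed when alternatives are poor and investments are large, which corresponds to the two cases described in the text.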
4. Future Directions
It is always difficult to predict the future. However, it seems likely that research on the interpersonal processes that occur in, and affect the quality of, friendships, romantic relationships, and family relationships will continue to occupy center stage for a while. We can also expect to see increasing collaboration between epidemiologists and health psychologists, who have demonstrated links between having relationships and health, and social and personality psychologists, who
have investigated interpersonal processes that contribute to high-quality intimate relationships. Many researchers currently are calling for such collaboration, in the hope that the result will be an understanding of how and why relationships are beneficial for mental and physical health. Further, with the increasing emphasis on studying relationships longitudinally, it may be expected that theories and empirical work on interpersonal processes occurring in relationships over time will become more frequent. Right now, we know a great deal about first impressions and something about the very initial stages of relationship deepening. We also know a bit about the dissolution of relationships. However, we know little about long-term trajectories of happy and unhappy relationships. We will know much more in 10 years. It also seems likely that relationships other than those emphasized in recent research (and hence in this entry) will become the focus of more research—relationships such as teacher–student and physician–patient relationships, relationships between co-workers, and relationships between siblings.
See also: Adult Psychological Development: Attachment; Friendship, Anthropology of; Personality and Marriage; Sibling Relationships, Psychology of; Social Support and Health
Bibliography
Aron A, Aron E 1996 Self and self expansion in relationships. In: Fletcher G J O, Fitness J (eds.) Knowledge Structures in Close Relationships: A Social Psychological Approach. Erlbaum, Hillsdale, NJ, pp. 325–44
Backman C W, Secord P F 1959 The effect of perceived liking on interpersonal attraction. Human Relations 12: 379–84
Bargh J A, Raymond P, Pryor J B, Strack F 1995 Attractiveness of the underling: An automatic power–sex association and its consequences for sexual harassment and aggression. Journal of Personality and Social Psychology 68: 768–81
Berkman L F, Syme S L 1979 Social networks, host resistance, and mortality: A nine year follow-up study of Alameda County residents. American Journal of Epidemiology 100: 186–204
Berkman L F, Vaccarino V, Seeman T 1993 Gender differences in cardiovascular morbidity and mortality: The contribution of social networks and support. Annals of Behavioral Medicine 1: 112–8
Berscheid E, Dion K, Walster E, Walster G W 1971 Physical attractiveness and dating choice: A test of the matching hypothesis. Journal of Experimental Social Psychology 7: 173–89
Berscheid E, Snyder M, Omoto A M 1989 The Relationship Closeness Inventory: Assessing the closeness of interpersonal relationships. Journal of Personality and Social Psychology 57: 792–807
Blumstein P, Kollock P 1988 Personal relationships. Annual Review of Sociology 14: 467–90
Bugental D B 2000 Acquisition of the algorithms of social life: A domain-based approach. Psychological Bulletin 126: 187–219
Buss D M 1989 Sex differences in human mate preferences: Evolutionary hypotheses tested in 37 cultures. Behavioral and Brain Sciences 12: 1–49
Byrne D, Clore G 1970 A reinforcement model of evaluative responses. Personality: An International Journal 1: 103–28
Cassidy J, Shaver P (eds.) 1999 Handbook of Attachment: Theory, Research, and Clinical Applications. Guilford Press, New York
Clark M S 1984 Record keeping in two types of relationships. Journal of Personality and Social Psychology 47: 549–57
Clark M S, Mills J 1979 Interpersonal attraction in exchange and communal relationships. Journal of Personality and Social Psychology 37: 12–24
Clark M S, Mills J 1993 The difference between communal and exchange relationships: What it is and is not. Personality and Social Psychology Bulletin 19: 684–91
Cohen S 1988 Psychosocial models of the role of social support in the etiology of physical disease. Health Psychology 7: 269–97
Cohen S, Doyle W J, Skoner D, Rabin B S, Gwaltney J M 1997 Social ties and susceptibility to the common cold. Journal of the American Medical Association 277: 1940–44
Cohen S, Sherrod D R, Clark M S 1986 Social skills and the stress-protective role of social support. Journal of Personality and Social Psychology 50: 963–73
Diener E 1984 Subjective well-being. Psychological Bulletin 95: 542–75
Erber R, Tesser A 1994 Self-evaluation maintenance: A social psychological approach to interpersonal relationships. In: Erber R, Gilmour R (eds.) Theoretical Frameworks for Personal Relationships. Erlbaum, Hillsdale, NJ, pp. 211–34
Feeney J A 1995 Adult attachment and emotional control. Personal Relationships 2: 143–60
Fiske A P 1992 The four elementary forms of sociality: Framework for a unified theory of social relations. Psychological Review 99: 689–723
Fiske S 1993 Controlling other people: The impact of power on stereotyping. American Psychologist 48: 621–8
Glanz K, Lerman C 1992 Psychosocial impact of breast cancer: A critical review. Annals of Behavioral Medicine 14: 204–12
Grote N, Clark M S 1998 Why aren’t indices of relationship costs always negatively related to indices of relationship quality? Personality and Social Psychology Review 2: 2–17
Holmes J G, Rempel J K 1989 Trust in close relationships. In: Hendrick C (ed.) Close Relationships: Review of Personality and Social Psychology. Sage, London, Vol. 10, pp. 187–220
Ickes W 1997 Empathic Accuracy. Guilford, New York
Ickes W, Barnes R D 1978 Boys and girls together and alienated: On enacting stereotyped sex roles in mixed sex dyads. Journal of Personality and Social Psychology 36: 669–83
Kelley H H, Berscheid E, Christensen A, Harvey J H, Huston T L, Levinger G, McClintock E, Peplau L A, Peterson D 1983 Close Relationships. Freeman, New York
Kelley H H, Thibaut J 1978 Interpersonal Relations: A Theory of Interdependence. Wiley, New York
Kenny D, La Voie L 1982 Reciprocity of attraction: A confirmed hypothesis. Social Psychology Quarterly 45: 54–8
Murray S L, Holmes J G 1993 Seeing virtues in faults: Negativity and the transformation of interpersonal narratives in close relationships. Journal of Personality and Social Psychology 65: 701–22
Murray S L, Holmes J G 1999 The (mental) ties that bind: Cognitive structures that predict relationship resilience. Journal of Personality and Social Psychology 77: 1228–44
Murray S L, Holmes J G, Griffin D W 1996 The benefits of positive illusions: Idealization and the construction of sat-
Myers D G 1992 The Pursuit of Happiness: Who is Happy—and Why? Morrow, New York
Myers D G, Diener E 1995 Who is happy? Psychological Science 6: 10–19
Parks M R, Stan C M, Eggert L L 1983 Romantic involvement and social network involvement. Social Psychology Quarterly 46: 116–31
Reis H T, Patrick B C 1996 Attachment and intimacy: Component processes. In: Higgins E T, Kruglanski A (eds.) Social Psychology: Handbook of Basic Principles. Guilford, New York, pp. 523–63
Rosenbaum M E 1986 The repulsion hypothesis: On the nondevelopment of relationships. Journal of Personality and Social Psychology 51: 1156–66
Simpson J A, Rholes W S, Nelligan J 1992 Support-seeking and support-giving within couples in an anxiety provoking situation: The role of attachment styles. Journal of Personality and Social Psychology 62: 434–46
Wish M, Deutsch M, Kaplan S J 1976 Perceived dimensions of interpersonal relations. Journal of Personality and Social Psychology 33: 409–20
Zammichieli M E, Gilroy F D, Sherman M F 1988 Relation between sex-role orientation and marital satisfaction. Personality and Social Psychology Bulletin 14: 747–54
M. S. Clark
Social Science, the Idea of

Can human studies, e.g., sociology and psychology, be properly scientific? Since the mid-nineteenth century, philosophers and social scientists have debated this question more or less continuously. It has periodically assumed considerable urgency, indeed become a focus of attention within the disciplines, as new influences on these subjects were felt or as new models of science itself were developed.

The complexity and potential obscurity of the issue are already evident. There is an implied equation, ‘human studies are/are not science,’ in at least two variables. There has been considerable debate, at least since the 1850s, about what kinds of role human studies might play. Likewise, what is required for a discipline to be scientific has also been a matter of continuing controversy. As Peter Manicas put it (1987, p. 168), ‘neither “science” nor, a fortiori, the branches of the social sciences are “natural kinds”.’ Notice, furthermore, that the variables are not entirely independent of one another. What human studies practitioners have aspired to has itself been influenced by what they think about the desirability and the possibility of making their enterprises genuinely scientific. Similarly, though perhaps to a lesser extent, what has counted as scientific has reflected the actual practices of, inter alia, students of sociology and psychology. To approach this question, a suitably sophisticated taxonomy of possibilities is therefore required.

1. Some Taxonomic Distinctions and a Definition

If human studies are to count as scientific, then what it takes to make a science must first be considered.

1.1 Modality

First, empirical and normative approaches to this question need to be distinguished: roughly, ‘What are the sciences like?’ and ‘What should the sciences be like?’ Of course, these two questions cannot be treated in isolation, especially in view of the widely accepted maxim that ‘Ought implies can.’ This maxim means, in this instance, that those subjects whose scientific status is being assessed cannot sensibly be required to attain some excellence unless, of course, paradigms of scientific achievement have themselves attained this excellence. With these points in mind, the question about human studies can be understood in this way: given what human studies are like, do they or can they conceivably measure up to whatever normative standards of excellence the best scientific disciplines do measure up to?

1.2 Focus

Of course, there are other issues about what it takes to make a science. The question of focus is perhaps the most important. Science and nonscience can be distinguished either in terms of their products or in terms of their processes. This fundamental distinction divides pre- and post-Kuhnian conceptions of science. Before Thomas Kuhn’s epochal work The Structure of Scientific Revolutions (1970), science was distinguished from nonscience by the kinds of statements the two discourses produced, or, more importantly, by the status of these statements relative to epistemic standards of confirmation, generality, or objectivity. After this work, science was increasingly distinguished from nonscience in terms of the activities of its practitioners.

Obviously, it will make a difference which of these criteria is deemed more appropriate. On the product conception, it might count against the scientific pretensions of sociology and psychology, for instance, that the more robust results in these subjects lack the kind of generality that robust results typically display in paradigm, ‘hard’ sciences such as physics and chemistry. But perhaps this lack of generality is irrelevant if a process criterion is adopted. Perhaps sociologists and psychologists obtain their less general findings by the same technical means and methods that natural scientists use in producing more general or more stable results.

1.3 Scope

Finally, there is the issue of scope. Roughly speaking, wide and narrow understandings of the sciences can be distinguished. On the narrow reading, it is the concrete and specific particularities of the paradigm sciences that provide criteria for judging other candidate sciences. Typically, the paradigm sciences are laboratory based, involve heavy reliance on statistical techniques of data analysis, and produce results in the form of highly confirmed differential equations relating measurable variables. On the narrow criterion, then, to count as a science, any other discipline must also exhibit these specific features. On the other hand, a wide approach to the issue requires of candidate sciences merely that they exhibit some abstract and high-level virtue which is shared by paradigm scientific achievements, for instance that they permit the attainment of objectivity. A wide approach to the issue is probably indicated; at the concrete and specific level of technique and theory, there is enough variation within the paradigm sciences to rule out any narrow approach.

In considering whether human studies are scientific, an approach which is normative, process oriented, and wide in its scope should therefore be adopted. The question will be, in other words, whether it is possible that human studies be conducted in such a way that they might achieve whatever levels of objectivity in their procedures are attainable in the paradigm sciences. (The issue is phrased in this way to permit the rather complicated judgment, likely to be favored by ‘postmodernists,’ that human studies can achieve the same degree of objectivity as paradigm sciences, but that neither can achieve the kind of ‘objectivity’ that was previously thought to be characteristic of paradigm, natural sciences such as physics.)
1.4 Objectivity

To assess the scientific status of disciplines such as sociology and psychology, the objectivity of their procedures must, on this account, be assessed. But what is objectivity? Of course, there may be no more determinacy on this matter than on some others which have already been alluded to. Just as ‘science’ does not designate any natural kind, neither does ‘objectivity’; it will be a matter of persistent debate what counts as objective. Some preliminary observations can nevertheless be offered.

First of all, transcendental and nontranscendental accounts of objectivity can be distinguished. In the transcendental account, objectivity in investigative activities has been attained when and only when all that is specific to the situation in which they are undertaken, including the personalities of the investigators, has been eliminated from them. Individuals who are otherwise dissimilar must, on this account, conduct themselves in some common way when they examine the objects of their investigations. (When scientists have managed to specify methods for the replication of results, some approximation to this ideal may have been achieved: they know how to tell others who differ from them how to negate the effects of these differences and thus achieve the same results as they have.) Objectivity, on this account, consists of transcending local circumstances by eliminating all factors which vary from one situation to another.

Of course, philosophers and others working under the rubric of ‘postmodernism’ have widely repudiated this ideal. On their view, true transcendence of local circumstances is not possible, it is not desirable, and, indeed, it is not even necessary for a form of objectivity (that is worthy of being pursued) to be attained. Objectivity requires not that local differences be eliminated, but, rather, that they be exploited in order to discover, where this is possible, a point of convergence of otherwise very different perspectives. According to this account, objectivity is attained when an individual is able to affirm from his or her own perspective a system of propositions that can likewise be affirmed from the otherwise different perspectives of such other individuals whose agreement matters (see Rorty 1989).
2. Why the Issue Matters

Why should it matter whether human studies are or are not scientific, especially if there is little agreement about the concepts in terms of which the issue is defined? Discussing the so-called ‘interpretive turn’ in human studies, Geertz (1993, p. 8) put the point this way: ‘Whether this [development] is making the social sciences less scientific or humanistic study more so (or, as I believe, altering our view, never very stable anyway, of what counts as science) is not altogether clear and perhaps not altogether important.’ There are, in fact, at least three reasons why it matters whether human studies are considered scientific.
2.1 Institutionally

The scientism of its public culture, i.e., the privilege that is granted to scientific disciplines and their methods and results, is one prominent characteristic of modernity. Science has immense, though perhaps no longer undisputed, cognitive authority, and plays an influential role in defining as well as creating the world which human beings inhabit. If human studies succeed in gaining recognition as scientific subjects, then they too could hope to exercise this kind of authority. Foucault, typically, puts the matter the other way around (1980, p. 85): ‘What types of knowledge do you want to disqualify in the very instant of your demand: “Is it a science?” Which speaking, discoursing subjects of experience and knowledge do you want to “diminish” when you say: “I who conduct this discourse am conducting a scientific discourse, and I am a scientist?” Which theoretical-political avant garde do you want to enthrone in order to isolate it from all the discontinuous forms of knowledge that circulate about it?’

2.2 Heuristically

Whether sociologists and psychologists see themselves and are seen by others as scientists will, of course, influence the conduct of their enquiries and the kinds of products they aspire to deliver. As Geertz (1993, p. 8) put it, the issue is ‘who borrows what from whom.’ Do human studies borrow techniques of observation, experimentation, and data manipulation from ‘hard’ sciences such as chemistry and biology, or do they, instead, look to other disciplines for their techniques or their standards of objectivity?

Indeed, this issue is part of a long-standing debate. In the late nineteenth century, the crucial questions were about the Geisteswissenschaften: are they concerned with understanding or with explanation? Can they articulate general psychic and/or social laws or must they, instead, confine themselves to the accurate description and contextualization of concrete and specific incidents (Windelband’s nomothetic vs. idiographic conceptions)? In the middle of the twentieth century, the key issue was the cognitive status of social scientific generalizations and, indeed, their very possibility in the face of innumerable obstacles and points of difference from paradigm ‘hard’ sciences. At the end of the twentieth century, the question is whether interpretation, with all its vicissitudes, constitutes a more central method for human studies than law-based explanation, and, indeed, whether interpretation itself is possible in a sufficiently ‘objective’ manner to permit the judgment that human studies are, perhaps in some extended sense, scientific.

2.3 Morally

How human studies are conceptualized may have profound moral significance. This point is dependent on the previous two. Suppose human studies are modeled on the ‘hard’ sciences and that success in doing so is conspicuous enough that these subjects gain the kind of cognitive authority that some of their practitioners have long aspired to. In this case, ideas about the human situation that are moral and cultural foundations of our longest-standing and most deeply embedded traditions are at risk of being undermined. Putnam (1978, p. 63) put it this way: ‘[S]uppose our functional organization became transparent to us. Suppose we had a theory of it, and we could actually use this theory in a significant class of cases. What would happen to us? … Would it be possible even to think of oneself as a person? … [T]he development of that sort of knowledge of ourselves and each other would modify our natures in ways that we cannot predict at all. Every institution we now have: arts, politics, religion – even science – would be changed beyond all recognition.’
2.4 Some Positions

Whether human studies can be scientific is therefore too important a question simply to ignore. And, of course, it has not been ignored. Philosophers, social scientists, and others have, in fact, taken a number of positions on this question. Before proceeding with more substantive matters, a threefold primary distinction among these positions should be considered.

In the middle of the twentieth century, the most common position, though not so designated until later, was naturalism. According to this doctrine, there is nothing methodologically or theoretically distinctive about human studies in relation to paradigm, natural sciences. In particular, social and natural scientists legitimately share both aspirations (to discover and confirm general laws covering specific phenomena) and techniques (operationalization of variables, controlled experimentation, and statistical analysis of its results).

Both earlier and later than this period, however, a very common, perhaps even the most common, position was interpretivism. According to this doctrine, human studies are distinguished from paradigm, ‘hard’ sciences both in their aspirations and in their techniques. Social scientists should not try to emulate the paradigm sciences narrowly construed (see Sect. 1.3); they should strive, instead, to interpret human social and psychic phenomena, and should employ the techniques of hermeneutics, broadly construed, to achieve this.

Finally, towards the end of the twentieth century, though also, to a lesser degree, in earlier periods, ‘scepticism’ about the social sciences has been an increasingly influential position. On this account, laws of human being cannot be discovered, nor can properly objective interpretations of what human beings do be provided. Although the results of sociological or psychological research are often presented in the form of statements of knowledge, they are not entitled to this status, and must be understood in some other, less exalted way. Sceptical positions may be either local or global. Global scepticism, often associated with the work of Jacques Derrida, repudiates all claims to determinate and methodologically adequate knowledge of human phenomena. Local scepticism adopts such a dismissive attitude only to certain claims, e.g., those which have been derived in some specific way or from some specific social position. (For instance, Marxists might be sceptical in this local sense about the claims produced by mainstream, and hence in their view ‘bourgeois,’ sociology or psychology. Similarly, feminists are often sceptical about the claims of mainstream, and hence in their view ‘patriarchal,’ social science.)
3. Impediments to Human Studies Being Scientific

That the legitimacy of human studies has so long been a matter of discussion is prima facie evidence that the investigation of human phenomena is not straightforward. What, then, are some of the reasons why it might not be possible to have a genuinely scientific approach to human social and psychic phenomena?
3.1 Preinterpretation

Theorists otherwise as diverse as Peter Winch, R. G. Collingwood, and Alfred Schütz have pointed, in distinguishing human studies from science per se, to the fact that students of human phenomena work, whereas paradigm scientists do not work, with materials that already have meaning and significance (see, e.g., Winch 1958, p. 87). There are strong and weak versions of this point. On the weak version, students of human phenomena must always rely, methodologically, on the objects of their research in ways in which paradigm scientists need not. Saying what is going on is preliminary to saying why it is happening, and hence to any scientific explanation. But what is going on, socially or psychically, must be understood, in Geertz’s phrase, ‘from the native’s point of view.’ How the subject understands what is going on determines the description of what is going on. On the strong version, once scientists have characterized ‘the native’s point of view,’ they have discovered all that they need to or can know about the objects of their investigation. In either case, what paradigm scientists do and what human studies practitioners do are very different. The latter seem forced to rely on the subjective perspectives of the objects of their research.
3.2 Instability

There are three interrelated factors which make it unlikely that sociologists and psychologists will be able to develop stable, objective accounts of the phenomena they take an interest in.

First, there is the historicity of both the subjects and the objects of human studies research. When Gadamer (1975, p. 204) asks ‘Is not the fact that consciousness is historically conditioned inevitably an insuperable barrier to reaching perfect fulfilment in historical knowledge?’ he articulates this fundamental difficulty: neither scientist nor subject ‘stands still,’ and hence what the one should say about the other changes as they themselves change. Paradigm, natural sciences, on the other hand, typically deal with statements of ‘timeless’ generality.

Second, in human studies relations of representation are subject to reflexivity in a way that is not common in the sciences per se. What practitioners say about human phenomena influences how their subjects enact or react to these phenomena, and hence often undermines the claim, necessary if objectivity is sought, that scientists have accurately represented the behavior of their subjects. (Practitioners have, instead, caused or persuaded subjects to act so that their behavior conforms to the practitioners’ descriptions of it.) To mid-twentieth century philosophers, this possibility was best known under the heading ‘reflexive predictions’ (self-defeating and self-confirming ‘prophecies’), but the crucial points have since been generalized extensively. Of course, as Lyotard (1984, p. 57) rightly points out, reflexivity must not be understood in too one-sided a way. It is not just that sociologists and psychologists can create the objects of their investigations; it is, as well, that these objects can, strategically, resist the characterizations which practitioners attempt to fit them with. This too is an impediment to the achievement of objectivity as it is usually conceived.

Finally, there is radical indeterminacy about the objects of investigation. This point is an ontological, not an epistemological, one. It is not that, in some or perhaps many cases, scientists cannot tell when they have characterized some phenomenon accurately. It is, rather, that there may be no sense, in these cases, to the very idea of accurate characterization. Typically, this means that there are, in the human domain, no fixed objects with well-defined characteristics for scientists to (try to) represent. In its more familiar forms, this point is, of course, an obvious development of Dilthey’s observations about the ‘hermeneutic circle.’ It has been much emphasized by Derrida (1978) and others of his school. Specifically, if interpretation takes place against and in terms of a background that is not ‘given’ but must itself be an object of interpretation, then every interpretation has the form, not of an objective (if possibly incorrect) description, but, rather, of a (performative) proposal to see the situation in one of the many ways in which it could (as) legitimately be seen. There is on this account insufficient interpretive stability to warrant attributions of objectivity.
3.3 Complexity

The economist Hayek (1978, p. 26f) has long emphasized the peculiar kind of complexity which human studies practitioners must try to deal with. Such ‘organized complexity’ resists reduction either to deterministic or to statistical models that might be appropriated from the ‘hard’ sciences. The behavior of a social aggregate often cannot, in other words, be reduced to the sum or product of the behavior of its individual parts, nor can it be adequately understood by characterizing its statistical properties (as the behavior of an aggregate of gas molecules might be understood, for instance).

Notice, furthermore, that there may be moral and prudential (as well as practical) reasons for refusing to reduce this complexity in ways standard in the ‘hard’ sciences, i.e., by ‘experimentalizing’ it. So, while scientists might be able to ‘standardize’ individual agents in an experimental setting so that their interactions there can be understood deterministically or statistically, there are moral objections to exporting these conditions to extra-experimental settings. Indeed, there may be prudential reasons for not using these techniques of standardization in ‘naturalistic’ settings, as is implied, of course, by Hayek’s own analysis of the market. In particular, adequate functioning of a complexly organized whole may depend precisely on its individual elements not being standardized. In either case, there are thus limits to the objectivity (actually the ‘ecological validity’) of the results that might be obtained in laboratory settings.
3.4 Interim Conclusion

All these impediments manifest themselves, of course, in a simple fact: there is, especially compared with paradigm ‘hard’ sciences, a conspicuous lack of progress in human studies. In particular, multiple and opposed paradigms persist in human studies, especially in any area of investigation which is likely to be of broadly ‘political’ interest and concern. As Geertz (1993, p. 4) has said, ‘calls for “a general theory” of just about anything social sound increasingly hollow, and claims to have one megalomanic. Whether this is because it is too soon to hope for unified science or too late to believe in it is, I suppose, debatable. But it has never seemed further away, harder to imagine, or less certainly desirable than it does now.’
4. Alternative Conceptions of Human Studies

It may not make sense to compare sociology and psychology to the ‘hard’ sciences. How, then, are they to be conceptualized … and hence institutionalized? There are a number of possibilities. (Manicas’s cautionary note still applies, of course; if ‘science’ is not a natural kind, then neither is ‘literature’ or any of the other disciplines with which human studies might be compared.)
4.1 Literature

If human studies cannot properly be compared to the ‘hard’ sciences, perhaps a comparison with literature and other forms of potentially edifying narrative or practical discourses is more appropriate. This, certainly, is the opinion of a wide range of late-twentieth century theorists, including Geertz, Charles Taylor, Gadamer, and others. Geertz, for instance, advocates (1993, p. 6) that we ‘turn from trying to explain social phenomena by weaving them into grand textures of cause and effect to trying to explain them by placing them in local frames of awareness … .’

Of course, it might seem, to practitioners, that such a heuristic reorientation (see Sect. 2.2) would reduce the likelihood that human studies might have significant institutional influence (see Sect. 2.1). However, it needs to be borne in mind that the institutional influence of sociology and psychology is rather patchy (this is made clear in utilization studies), and that literature still forms the basis for highly influential mass-media representations and analyses of the human situation.

Another related possibility is this. While sociology and psychology do not or cannot provide an objective account of human phenomena, they can and do provide a variety of potentially important perspectives on these phenomena, and thus enable human beings to orient themselves more productively to them, especially by giving them frameworks for understanding these phenomena. In this vein, Rorty (1982, p. 247) draws a contrast between scientistic goals of prediction and control, on the one hand, and, on the other, practical goals of developing ‘descriptions which help one decide what to do.’ Since deciding what to do depends on values and other normative elements, and since there is widespread and apparently irreducible diversity in relation to these elements, the kind of ‘practical knowledge’ that nonscientistic forms of psychology and sociology might provide will not, of course, satisfy the most demanding requirements of objectivity.
4.2 Critique

There is a subtle but important difference between the view that human studies offer edifying perspectives on or frameworks for the interpretation of human phenomena (see Sect. 4.1) and the view that they offer a basis for critique and the advocacy of change. In the latter half of the twentieth century, critical theorists (e.g., Habermas), genealogists (e.g., Foucault), feminists, and advocates for other marginalized groups (see Seidman 1994) proposed and in some cases made progress in achieving a ‘takeover’ of human studies activities and institutions. Across this spectrum, techniques which have some claim to being ‘antiscientific’ (Foucault 1980, p. 83) are widely employed. In particular, practitioners of critique shift the focus from showing how what is, socially, is an intelligible consequence of its causal antecedents to showing how what is, socially, represents the highly contingent realization of practices and structures to which there are clear, achievable, and desirable alternatives. As Foucault says (Rabinow 1984, p. 46): ‘[T]his critique will be genealogical in the sense that it will not deduce from the form of what we are, what is impossible for us to do and to know; but it will separate out, from the contingency that has made us what we are, the possibility of no longer being, doing, thinking, what we are, do, or think. It is not seeking to make possible a metaphysics that has finally become a science; it is seeking to give new impetus, as far and wide as possible, to the undefined work of freedom.’ It is not, then, a matter of objective representation, but, rather, of politically effective analysis of the possibilities for change. (This is, of course, an extension of Karl Marx’s 11th thesis on Feuerbach: ‘Philosophers have only interpreted the world in various ways; the point is to change it.’)
4.3 Disciplines or Ideologies

There are (see Sect. 3.3) moral and prudential disincentives to investigating human phenomena experimentally and, a fortiori, to ‘exporting’ into ‘naturalistic’ settings those conditions in which objective, reproducible experimental effects have been obtained. This has not, of course, stopped such things from happening and, indeed, it is the main conclusion of Foucault and others of his school that human studies have played a key role in this process, and, in particular, that they have contributed to the production of a ‘disciplinary society’ in which individuals are governed via their consent, i.e., through the ways in which they are made into ‘subjects.’ Although Jürgen Habermas disputed much that Foucault wrote and did, he was at one with him on this point. Habermas (1994, p. 54) says, for instance: ‘It is no accident that these sciences, especially clinical psychology, but also pedagogy, sociology, political science, and cultural anthropology, can, as it were, frictionlessly intermesh in the overall technology of power that finds its architectural expression in the closed institution. They are translated into therapies and social techniques, and so form the most effective medium of the new, disciplinary violence that dominates modernity.’

That human studies are ‘disciplines’ is one way of capturing their importance and their negative impact on the lives of ordinary people; that they have functioned ideologically is another way of making this point. Perhaps the most important issue here is the way in which human studies have been recruited by other powerful institutions to serve the specifically political interests of those institutions while seeming to provide a nonpolitical analysis, a value-neutral account of contemporary social conditions. The idea of expertise has played an important role in this development. Insofar as social and psychic problems can be modeled on purely technical problems, solutions can be offered and consumed on a purely technical basis, and the reality of their dependence on fundamentally political matters can remain inaccessible to consciousness.
5. Conclusion

Should human studies plausibly aspire to scientific status? This remains questionable, as it has long been. Can they realistically aspire to such a status? This too is questionable. Perhaps Rorty (see Sect. 4.1) is right to suggest that human studies have aspired to too much, that their two main goals are disjoint and hence cannot both be achieved under the same methodological or institutional auspices. In this case, practitioners would perhaps be wise to disentangle their various objectives, and adjust their ‘aspiration levels’ (Simon 1996, p. 30) to more realistic goals. Or perhaps an even more radical adjustment of aspirations and disciplinary cultures is called for. Certainly Toulmin thinks so. Toulmin (1992, p. 193) says: ‘The task is not to build new, more comprehensive systems of theory with universal and timeless relevance, but to limit the scope of even the best-framed theories, and fight the intellectual reductionism that became entrenched during the ascendancy of rationalism … [thus reinstating] respect for the pragmatic methods appropriate in dealing with concrete human problems.’ On this account, human studies are to be understood more on the model of ad hoc tools for a variety of perhaps incommensurable human purposes, and not, as in their ascendancy in the mid-twentieth century, as sciences with all the cognitive and institutional entitlements which are associated with that sometimes honorific term.

See also: Atomism and Holism: Philosophical Aspects; Causes and Laws: Philosophical Aspects; Economics, Philosophy of; Idealization, Abstraction, and Ideal Types; Individualism versus Collectivism: Philosophical Aspects; Interpretation and Translation: Philosophical Aspects; Methodological Individualism in Sociology; Methodological Individualism: Philosophical Aspects
Bibliography
Derrida J 1978 Writing and Difference. University of Chicago Press, Chicago
Foucault M 1980 Power/Knowledge. Pantheon, New York
Gadamer H-G 1975 Truth and Method. Seabury Press, New York
Geertz C 1993 Local Knowledge. Fontana Press, London
Habermas J 1994 The critique of reason as an unmasking of the human sciences. In: Kelly M (ed.) Critique and Power. MIT Press, Cambridge, MA
Hayek F A 1978 New Studies in Philosophy, Politics, Economics and the History of Ideas. Routledge and Kegan Paul, London
Kuhn T S 1970 The Structure of Scientific Revolutions. University of Chicago Press, Chicago
Lyotard J-F 1984 The Postmodern Condition. University of Minnesota Press, Minneapolis, MN
Manicas P T 1987 A History and Philosophy of the Social Sciences. Blackwell, Oxford, UK
Putnam H 1978 Meaning and the Moral Sciences. Routledge and Kegan Paul, London
Rabinow P 1984 The Foucault Reader. Penguin, London
Rorty R 1982 Consequences of Pragmatism. University of Minnesota Press, Minneapolis, MN
Rorty R 1989 Contingency, Irony, and Solidarity. Cambridge University Press, Cambridge, UK
Seidman S 1994 Contested Knowledge. Blackwell, Oxford, UK
Simon H A 1996 The Sciences of the Artificial. MIT Press, Cambridge, MA
Toulmin S 1992 Cosmopolis. University of Chicago Press, Chicago
Winch P 1958 The Idea of a Social Science. Routledge and Kegan Paul, London
F. D’Agostino
Social Security
‘Social Security’ denotes both a complex of public institutions to protect individuals against common life risks and the programmatic idea that a commonwealth should provide all of its members such protection that they may feel secure within that commonwealth. This concept is broader than ‘social insurance’ and narrower than ‘welfare state,’ but relates to similar problems.
1. The Emergence of the Concept
‘Security’ (Latin securitas) has a long-standing tradition in European political rhetoric, dating back to the Roman Empire. In early modern times it was often used together with ‘welfare’ and ‘felicity’ in order to paraphrase the duties of the prince to care for his subjects. Thomas Hobbes made ‘the safety of the people’ a central task for his Leviathan, and this idea became seminal for modern political theory and constitutional practice. Wilhelm von Humboldt (1792) was the first to distinguish clearly between ‘safety/security’ and ‘welfare’: he restricted the task of the state to the provision of ‘safety/security,’ i.e., on the one hand the protection of citizens’ freedom from external threats, and on the other hand the internal protection of citizens’ commitments and rights via the judiciary and police. ‘Welfare’ by contrast was now assigned to the private realm as the individual pursuit of happiness, with which the state was not to interfere. In the nineteenth century the idea of public safety and security became restricted to the defense and protection of individual rights, the latter becoming differentiated between security within the public realm and the reliability of legal protection. But in the twentieth century the idea of security became paramount and was related to psychology and psychoanalysis, to the theory of cognition, to industrial and technical safety as well as to social security.
The specific context for the emergence of the term ‘Social Security’ was the United States and the Great Depression. In American pragmatism ‘the quest for certainty’ was abandoned and superseded by ‘the search for security’ (John Dewey). Inspired by the ‘four wishes’ of William I. Thomas, the ‘wish for security’ became a basic assumption about human motives and ‘insecurity’ a commonplace attribute of the Zeitgeist. The invention of the expression ‘Social Security’ is commonly attributed to Franklin Delano Roosevelt or to his entourage. Roosevelt used it first in a Message to Congress on 30 September 1934 after having created a ‘Committee on Economic Security’ to prepare the ‘Social Security Act’ on 29 June 1934. However, a voluntary association promoting ‘Old Age Security’ renamed itself ‘American Association for Social Security’ in 1933 and changed the title of its journal from Old Age Security Herald to Social Security because the program of this association had been expanded from old-age pensions to all branches of social insurance (Rubinow 1934, pp. iii, 279). In any case the expression quickly gained currency: it first gave its name to the US ‘Social Security Act’ (SSA) of 1935. It then appeared in the Atlantic Charter of 1941, became the catchword for the Declaration of Philadelphia (1944) of the International Labour Organization (ILO) and found its way in 1948 into the Universal Declaration of Human Rights of the United Nations (Art. 22).
After the Second World War ‘Social Security’ had become a leading term internationally in the programs for the development of welfare states. Its specific meaning remained vague, however.
2. Institutional Developments
At the time when the SSA was drafted, a public commitment to income maintenance for people in need was by no means a new idea. At the end of the nineteenth century two models were competing in Europe: public subsidizing of mutual benefit associations (called the ‘System of Gent’), and social insurance along the lines of the German (‘Bismarck’) Model. The British social reforms brought two new principles: payment of old-age pensions for the deserving poor from the budget (1908), and flat-rate contributions and benefits in unemployment insurance (1911). In 1913 Sweden created the first universal system for securing income in old age. After the Russian Revolution the Bolsheviks drafted a comprehensive system of social protection against all major contingencies of life, following a concept which Lenin had already propagated in 1912. Starting in France (1932), hitherto voluntary child allowances of employers were made obligatory, and this ushered in a new model of administration: calling upon private bodies for the attainment of public social policy ends. Thus social insurance for dependent workers was already competing with other models of social protection. The victory of the term ‘social security’ over ‘social insurance’ was a consequence of its vague content which allowed the inclusion of all kinds of public social protection. In 1947 the ‘International Social Insurance Conference’ changed its name to ‘International Social Security Association’ and opened membership to state-administered systems (such as those of Britain and the US) which until then had been excluded.
Thus, the American SSA brought a new name but hardly new ideas. It had a problem, however, that was not present in Europe, namely federalism. The Supreme Court had long held attempts by the Federal Government to introduce measures of social protection to be unconstitutional. And it was only after a positive ruling by the Supreme Court on the SSA in 1937 that federal social policies took shape, including policies to combat the risks of the death of a breadwinner and of disability. Besides a federal system of Old-Age Insurance the SSA introduced a new federal instrument into American politics: matching grants to the states in order to induce them to abolish the old forms of the poor law, to introduce or to improve a system of unemployment compensation, and to provide better care to the blind as well as to mothers and children. Thus the hitherto strictly separated domains of the federal and state government became linked for the first time.
This remained a continual subject of controversy in the American polity with lasting consequences for the structure of social protection: there exists a two-tiered system consisting of ‘social security’ on the federal and of ‘welfare’ on the state level. The first program, protecting most of the employed population, is well administered, highly stable, and positively valued. The second, providing help as a last resort, has remained an ever-contested object of repeated reforms. It is unevenly administered at the local level and its recipients remain stigmatized.
That which was innovative in the concept of social security was developed first in a report to the British Government entitled ‘Social Insurance and Allied Services’ in 1942, by a commission chaired by William Beveridge. It is the program of a comprehensive and unified system, covering the whole population against all basic risks of life, an idea that Beveridge had already formulated in 1924. The Beveridge report met with unprecedented enthusiasm in the British population and became the blueprint for social legislation after the war.
The Beveridge report proved seminal also for the creation of the French ‘Sécurité Sociale’ (1946/48). But contrary to the case in the United Kingdom, the principle of administrative unification of the system met with strong resistance in France from various social groups which preferred to keep or create separate insurances meeting their specific needs and interests. In Germany only the term was adopted, without any change to the structure of the existing system. Moreover, a semantic distinction was made between the institutional (Soziale Sicherung) and the normative (Soziale Sicherheit) aspect of social security. In Scandinavia the term ‘Social Security’ was hardly adopted although the systems of social protection underwent substantial improvements.
Although many industrialized countries had developed measures of social protection against certain risks before World War II, their impact remained generally modest. Some groups, e.g., public servants or veterans, were privileged by public budgets, whereas industrial workers received only modest benefits out of contributions. With the exception of Scandinavia, the rural population remained mostly outside of existing systems. In most countries there existed a multitude of funds, each designed for specific risks, groups, and localities. Repeated economic crises culminating in the Great Depression often impaired the reliability of expected benefits.
After the Second World War the new ideal of ‘Social Security’ was received differently in different countries. Almost everywhere coverage provided by existing institutions was extended or new institutions were created to protect all or most of the population against certain risks. Most countries assimilated the regulatory frameworks of different systems which covered the same risks, or integrated the systems. Finally, the risks covered by schemes underwent an international standardization. Convention No. 102 of the ILO on ‘Minimum Standards of Social Security’ (1952) distinguishes nine forms of benefit: old-age benefits, survivor benefits, disability benefits, employment injury benefits, family allowances, unemployment benefits, sickness benefits, medical care, and maternity benefits. To be sure, most advanced industrial countries’ social protection institutions do not correspond one-for-one with that list of benefits, but today they do tend to cover all of these risks, albeit with varying degrees of coverage and benefits.
From a structural perspective the institutions of Social Security form a delimited, though not everywhere equally defined system. Issues of delimitation concern the forms of benefit (in cash or in kind), of administration (centralized or decentralized; public, parapublic, or private), and of entitlement (public provision, insurance, relief), as well as the groups included (privileged or underprivileged groups are sometimes excluded) and the risks covered (family allowances, for example, are often excluded from the concept).
3. Security and Social Security as Value-concepts
The administrative codification of Social Security as a complex of institutions which provide income replacement or benefits in kind should not obscure the fact that the success of the concept is due less to these institutions than to the value connotations of the concept. President Roosevelt evinced a keen sense of these connotations when he characterized social security as ‘freedom from want’ (6 January 1941), and when he described social security as a functional equivalent of earlier security deriving from ‘the interdependence of members of families upon each other and of the families within a small community upon each other’ (8 June 1934).
But what does security mean? And why did the concept become so prominent and pervasive in the most advanced industrial countries whose populations were more remote from serious want than any generations heretofore? Security relates to human attitudes towards the future and not the present. Insecurity does not mean an imminent danger but the possibility of future damages whose probability remains uncertain. The quest for security is a correlate to the growing complexity of society and the ensuing acceleration of social change. Thus security means not only protection, but also a subjective perception of the reliability of protection and the ensuing feeling of peace of mind. But in this psychological sense, security as self-assuredness can also mean the subjective capacity to bear the insecurity of risk. Hence on the individual level the concept of security remains highly ambivalent, and has indeed been abandoned by the discipline of psychology.
Security was also a longstanding pillar of political order which had already been incorporated into many constitutional documents and became transposed by Roosevelt into a new economic and social context.
Just as government had the duty to care for external and internal security it was now to provide a framework to guarantee the means of existence for all, through full employment, social insurance, or other forms of benefits or services. ‘Social Security’ thus became the umbrella word for economic, social, and cultural rights as expressed in Article 22 of the United Nations Universal Declaration of Human Rights: ‘Everyone as a member of society, has the right to social security and is entitled to realization through national effort and international co-operation and in accordance with the organization and resources of each state, of the economic, social and cultural rights indispensable for his dignity and the free development of his personality.’ These rights are then specified as the right to work (Art. 23), the right to recreation from work (Art. 24), the right to health, well-being, and social protection, with special emphasis on mothers and children (Art. 25), the right to education (Art. 26), and the right to participation in culture, science, and the arts (Art. 27).
Thus Social Security became another name for welfare in its political-philosophical sense. It became a key term in the international evolution of the welfare state after World War II. It was adopted differently on the national level, however. In most countries the institutional aspect has long dominated. In the current context of ‘crisis’ or ‘reconstruction’ of the welfare state ‘social security’ shows again its normative aspect: to what extent are politicians free to change the regulations that aim at stabilizing the future economic protection of the population or defined parts of it? ‘Social Security’ means the right of everybody who belongs to a certain commonwealth to be included in the basic provisions against destitution and want. And it suggests moreover that such provisions should take a coherent and clearly accessible form for everyone, so that people are not only legally protected but also have easy and reliable access to such protection.
4. Controversial Issues
4.1 General Problems
From the perspective of the Universal Declaration of Human Rights as well as that of William Beveridge or Pierre Laroque, the ‘father of the French sécurité sociale,’ Social Security was not restricted to publicly organized redistribution within populations at risk but also included policies of full employment and minimum wages. There was a clear awareness of the trade-off between full employment and the possibilities for sufficiently funding social protection. Social security as the goal of guaranteeing to the entire population the opportunity to participate in the economic and cultural life of its society implies education and employment as primary means and compensatory social protection only as a ‘second-best’ solution. The reification of ‘Social Security’ has obscured this original perspective and focuses public attention on processes of redistribution only. The current trend to substitute ‘work for welfare’ in the case of the able-bodied is quite in line with the original intention, provided that a decent minimum living standard is attainable at any rate. There is a strong divergence, however, with regard to whether the role of labor markets should be exclusive (as in the Anglo-Saxon world) or whether there should be a subsidiary public role (as in the Scandinavian and German-speaking countries) in employing the less productive parts of the active population.
From the perspective of the ‘Chicago School’ of economics, the existing systems of Social Security are overly costly, they disguise the relationship between an individual’s contributions and benefits, and they benefit the middle classes more than those who are truly in need. A negative income tax would better serve the poor, they maintain. Such a ‘rationalist’ economic perspective does not take into account the existing trust in public systems of reciprocal security in most countries. The ‘deal’ which pools the risks of the whole population in one scheme and thus makes basic economic security a matter of democratic politics seems to find high acceptance in most countries, although many people are aware of the likelihood of being among the net payers to the system.
Given the natural inequality of abilities and the social inequalities resulting from birth, education, labor markets, and competition in general, all political endeavors to foster social security (be it in the wider or in the narrower sense) face the problem of distributive effects. These relate not only to the amount of individual benefit and the forms of public financing but to the institutional arrangements as well. Strong political conflicts emerged in many countries around the issue of creating a single comprehensive system for certain provisions or maintaining a fragmented system differentiating among classes of populations at risk. A somewhat related problem concerns the issue of securing only minimal standards or advanced standards of benefits through public provision. Finally, there is an obvious political struggle in all countries concerning the level of redistribution and who should bear the costs of social protection.
Comparative research has focused mainly on the explanation of national differences in institutional design and distributive effects. Path dependency upon early national approaches seems to be an important explanatory factor for these variations, besides political power relations and cultural orientations. An interesting finding is that systems which provide high levels of protection tend to protect minimal standards better as well.
4.2 Sectoral Aspects
Most political controversies do not arise with regard to the system of Social Security as a whole but concerning particular issues.
As the functional organization of social protection is different from country to country, it remains difficult to generalize about this. One may identify some broad issues, however.
The paramount problem in the contemporary debate, particularly as a consequence of demographic aging, is the security of income in old age. A three-tiered system consisting of universal and state-administered protection of basic standards (first tier), employment-related (and often private) protection of advanced standards (second tier), and complementary forms of personal provision (third tier) seems to be a promising model to resist the demographic and economic challenges of the decades to come. In order to ensure income security in old age for all, some public regulation and supervision of the second and third tiers remains important, however.
The second complex of risks is related to illness, industrial diseases, and long-term disability. Here, benefits in cash and in kind are necessary, and coverage for these risks is organized quite differently in different countries. Essentially, there are two different institutional models: the National Health Service and protection through social insurance. Both systems face the problem of containing the explosion of costs which has resulted from the compounding of a wide range of medical, technological, demographic, and economic factors.
The third complex of Social Security is related to the protection of the family. Since child labor has been forbidden and education made compulsory, children are no longer an economic asset for their parents, but rather a strong source of social inequality. It is still contested to what extent parents and especially mothers should be supported by public means, beyond the generally accepted subsidy in cases of manifest poverty. The low birth rate in most European populations now gives more political weight to demands to improve the living conditions and social protection of persons who are raising children or nursing their permanently incapacitated parents.
The most contested part of Social Security is public provision for the unemployed. Here, the trade-offs with labor-market policies and full employment are obvious, and convictions vary about how to deal with this problem.
The issue of poverty cuts across such functional distinctions. The extent to which it emerges as a separate problem which must be addressed by specific measures depends to a large extent on the institutional approaches which a country has adopted in the above-mentioned realms of social security.
The issues discussed here are those generally attributed to the concept of social security. The concept of the welfare state covers additional services such as housing, education, and personal services.
See also: Insurance; Social Welfare Policy: Comparisons; Welfare; Welfare: Philosophical Aspects; Welfare Programs, Economics of; Welfare State; Welfare State, History of
Bibliography
Alber J 1982 Vom Armenhaus zum Wohlfahrtsstaat. Analysen zur Entwicklung von Sozialversicherung in Westeuropa (From the Poor-House to the Welfare State. The Emergence of Social Insurance in Western Europe). Campus, Frankfurt-am-Main, Germany
Arnold D R, Graetz M J, Munnell A H (eds.) 1998 Framing the Social Security Debate. Values, Politics, and Economics. National Academy of Social Insurance, Washington, DC
Atkinson A B 1989 Poverty and Social Security. Harvester Wheatsheaf, New York
Baldwin S, Falkingham J (eds.) 1994 Social Security and Social Change. New Challenges to the Beveridge Model. Harvester Wheatsheaf, New York
Burns E M 1956 Social Security and Public Policy. McGraw-Hill, New York
Cohen W J, Friedman M 1972 Social Security: Universal or Selective? American Enterprise Institute, Washington, DC
Committee on Economic Security 1937 Social Security in America. The Factual Background of the Social Security Act as Summarized from Staff Reports to the Committee on Economic Security. Published by the Social Security Board. Government Printing Office, Washington, DC
Galant H 1955 Histoire politique de la sécurité sociale française 1945–1952 (A Political History of French Social Security 1945–1952). Armand Colin, Paris
George V 1968 Social Security: Beveridge and After. Routledge & Kegan Paul, London
Heclo H 1974 Modern Social Politics in Britain and Sweden: From Relief to Income Maintenance. Yale University Press, New Haven, CT
Kaufmann F-X 1973 Sicherheit als soziologisches und sozialpolitisches Problem. Untersuchungen zu einer Wertidee hochdifferenzierter Gesellschaften (Security as a Problem for Sociology and Social Policy. An Inquiry into a Universal Value of Differentiated Societies), 2nd edn. Enke/Lucius u. Lucius, Stuttgart, Germany
Lowe R 1993 The Welfare State in Britain Since 1945. Macmillan, London
Noble C 1997 Welfare As We Knew It. A Political History of the American Welfare State. Oxford University Press, New York
Robbers G 1987 Sicherheit als Menschenrecht (Security as a Human Right). Nomos, Baden-Baden, Germany
Rubinow I M 1934 The Quest for Security. Henry Holt, New York
Social Insurance and Allied Services 1942 Report by Sir William Beveridge. HMSO, London
Weir M, Orloff A S, Skocpol T (eds.) 1988 The Politics of Social Policy in the United States. Princeton University Press, Princeton, NJ
F.-X. Kaufmann
Social Simulation: Computational Approaches
Computational social simulation involves the use of computer algorithms to model social processes. These algorithms require greater precision and logical rigor than natural language yet lack the generality of closed-form mathematical equations. The use of computer simulation by social scientists has increased in recent years, due to growing interest in modeling nonlinear dynamical processes using graphical interfaces that allow highly intuitive visual representations of the results. Social simulation has also undergone a qualitative change as the emphasis has shifted from prediction to exploration. This article reviews three ‘waves’ of innovation: dynamical systems, microsimulation, and adaptive agent models (Gilbert and Troitzsch 1999). These three approaches can be summarized as follows.
(1) 1960s: Dynamical systems simulations were empirically grounded, inductive, holistic, and functionalist.
(2) 1970s: Microsimulation introduced the use of individuals as the units of analysis but retained the earlier emphasis on empirically based macro-level prediction.
(3) 1980s: Adaptive agent models revolutionized computational social science by simulating actors not factors (system attributes). These ‘bottom-up’ models explored interactions among purposive decision-makers.
Although these three ‘waves’ overlap (e.g., economists continue to use holistic simulation of productive factors, microsimulation was invented in 1957, and nascent agent-based models appeared in the late 1960s), most of the recent excitement in computational modeling in the social and behavioral sciences has centered on agent-based simulation. So too has much of the controversy.
1. Dynamical Systems
The first wave of computer simulations in social science occurred in the 1960s (see, for example, Cyert and March 1963). These studies used computers to simulate dynamical systems, such as control and feedback processes in organizations, industries, cities, and even global populations. The models typically consist of sets of differential equations that describe changes in system attributes as a function of other systemic changes. Applications included the flow of raw materials in a factory, inventory control in a warehouse, urban traffic, military supply lines, demographic changes in a world system, and ecological limits to growth (Forrester 1971; Meadows et al. 1974).
Models of dynamical systems are functional and holistic (Gilbert and Troitzsch 1999). Functional means that theoretical interest centers on system equilibrium or balance among interdependent input–output units. Holistic means that the system is modeled as an irreducible and indivisible entity. Although these models allow nested organizational levels, at any given level, they assume that sets of related attributes (such as economic growth, suburban migration, and traffic congestion) are causally linked at the aggregate level.
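As a rough illustration of the aggregate, equation-based style described above, the following sketch integrates a single differential equation for one system attribute that grows toward a fixed limit. It is a toy under stated assumptions, not a reconstruction of any cited model: the logistic form dS/dt = r·S·(1 − S/K), the Euler step, and every name and parameter value (`simulate_stock`, `growth_rate`, `carrying_capacity`) are hypothetical choices made for this example.

```python
def simulate_stock(initial_stock, growth_rate, carrying_capacity, dt, steps):
    """Euler integration of dS/dt = r * S * (1 - S / K).

    The whole population is one aggregate variable S; there are no
    individual actors, which is what makes the model holistic.
    """
    stock = initial_stock
    trajectory = [stock]
    for _ in range(steps):
        # Rate of change depends only on the current aggregate state.
        flow = growth_rate * stock * (1.0 - stock / carrying_capacity)
        stock += flow * dt
        trajectory.append(stock)
    return trajectory


# Hypothetical parameter values chosen only for illustration.
trajectory = simulate_stock(
    initial_stock=10.0,
    growth_rate=0.5,
    carrying_capacity=1000.0,
    dt=0.1,
    steps=400,
)
print(round(trajectory[0], 1), round(trajectory[-1], 1))
```

The trajectory rises quickly at first and then levels off near the assumed limit, the kind of equilibrium-seeking behavior on which functional models focus. Real models of this generation coupled many such equations; one equation is used here only to keep the sketch short.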
2. Microsimulation
The second wave of computational modeling in the social sciences developed in the late 1970s. Known as microsimulation, it represents the first step in the progression from factors to actors. In striking contrast to the earlier holistic approach, microsimulation is a ‘bottom-up’ strategy for modeling the interacting behavior of decision-makers (such as individuals, families, and firms) within a larger system. This modeling strategy utilizes data on representative samples of decision-makers, along with equations and algorithms representing behavioral processes, to simulate the evolution through time of each decision-maker, and hence of the entire population of decision-makers. (Caldwell 1997)
For example, Caldwell and Keister use CORSIM, a large-scale dynamic microsimulation model of the US population of individuals and families, to integrate individual and family-level wealth behavior with aggregate-level stratification outcomes. This microanalytic approach allows researchers to avoid a serious limitation in the earlier generation of simulations: population homogeneity. For example, the relationship between wealth and age may not be uniform across ethnic or religious subcultures, geographic regions, or birth cohorts. Complex interactions across subpopulations are lost when the focus is on the correlation among aggregate factors. This precludes the ability to predict the effects of policy changes that impact only certain groups. Microsimulation solves this problem by making individuals, not populations, the unit of analysis. The database structure consists of individual records for a representative sample of a population, with a set of attributes measured at multiple points in time. These individuals are then ‘aged’ by updating their attributes, based on a set of empirically derived state-transition probabilities (e.g., an individual’s move from work to retirement at age 65). These probabilities are then used to predict new states of each individual, and these are then aggregated to make population projections.
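The ‘aging’ procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions and does not reproduce CORSIM or any published model: the record layout, the function names (`retirement_probability`, `age_one_year`, `project`), and the transition probabilities are all hypothetical, with the work-to-retirement move at age 65 taken from the example in the text.

```python
import random


def retirement_probability(age):
    """Hypothetical annual probability of moving from work to retirement."""
    if age < 60:
        return 0.0
    if age < 65:
        return 0.1
    return 1.0  # assumed: anyone still working retires on reaching 65


def age_one_year(person, rng):
    """Return an updated copy of one individual record after one year."""
    updated = dict(person, age=person["age"] + 1)
    if updated["status"] == "working":
        if rng.random() < retirement_probability(updated["age"]):
            updated["status"] = "retired"
    return updated


def project(sample, years, seed=0):
    """Age every record year by year and aggregate to a population share."""
    rng = random.Random(seed)
    shares = []
    for _ in range(years):
        sample = [age_one_year(person, rng) for person in sample]
        retired = sum(person["status"] == "retired" for person in sample)
        shares.append(retired / len(sample))
    return sample, shares


# Hypothetical 'representative sample': 300 workers aged 40 to 69.
sample = [{"age": 40 + (i % 30), "status": "working"} for i in range(300)]
final_sample, retired_share = project(sample, years=10)
print(retired_share)
```

Each record is updated from its own attributes only; the records never interact, which is the sense in which individuals in microsimulation remain passive carriers of attributes.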
3. Agent-based Models and Complex Adaptive Systems
The third wave, agent-based models, attracted widespread interest beginning in the 1980s. These models extend the microanalytical approach by deriving global states from individual behaviors, not individual attributes. In microsimulation models, individuals are inert in two ways: (a) the state of an individual is a numerical function of other traits, and (b) individuals do not interact (Gilbert and Troitzsch 1999, p. 8). Agent-based models assume an individual has intentions or goals and makes choices that affect other agents whose choices in turn affect that individual. These models impose three key assumptions.
(a) Agents interact with little or no central authority or direction. Global patterns emerge from the bottom up, determined not by a centralized authority but by local interactions among autonomous decision-makers.
(b) Decision-makers are adaptive rather than optimizing, with decisions based on heuristics, not on calculations of the most efficient action (Holland 1995, p. 43). These heuristics include norms, habits, protocols, rituals, conventions, customs, and routines. They evolve at two levels, the individual and the population. Individual learning alters the probability distribution of rules competing for attention, through processes like reinforcement, Bayesian updating, or the backpropagation of error in artificial neural networks. Population learning alters the frequency distribution of rules competing for reproduction through processes of selection, imitation, and social influence (Latané 1996). Genetic algorithms (Holland 1995) are widely used to model adaptation at the population level.
(c) Decision-makers are strategically interdependent. Strategic interdependence means that the consequences of each agent’s decisions depend in part on the choices of others. When strategically interdependent agents are also adaptive, the focal agent’s decisions influence the behavior of other agents who in turn influence the focal agent, generating a complex adaptive system (Holland 1995, p. 10).
Thomas Schelling’s (1971) ‘neighborhood segregation’ model was one of the earliest agent-based social simulations using cellular automata, a method invented by Von Neumann and Ulam in the 1940s. Consider a residential area that is highly segregated, such that the number of neighbors with different cultural markers (such as ethnicity) is at a minimum. If the aggregate pattern were assumed to reflect the attitudes of the constituent individuals, one might conclude from the distribution that the population was highly parochial and intolerant of diversity. Yet Schelling’s simulation shows that this need not be the case. His ‘tipping’ model shows that highly segregated neighborhoods can form even in a population that prefers diversity. The aggregate pattern of segregation is an emergent property that does not reflect the underlying attitudes of the constituent individuals.
Another example of emergence is provided by Latané’s (1996) ‘social impact model.’ Like Schelling, Latané also studies a cellular world populated by agents who live on a two-dimensional lattice.
However, rather than moving, these agents adapt to those around them, based on a rule to mimic one’s neighbors. From a random start, a population of mimics might be expected to converge inexorably on a single profile, leading to the conclusion that cultural diversity is imposed by factors that counteract the effects of conformist tendencies. However, the surprising result was that ‘the system achieved stable diversity. The minority was able to survive, contrary to the belief that social influence inexorably leads to uniformity’ (Latané 1996, p. 294). Other researchers, including Axelrod (1997), Carley (1991), and Kitts et al. (1999) have used computational models that couple homophily (likes attract) and social influence. If local similarity facilitates interaction which in turn facilitates imitation, it is indeed curious that the global outcome is diversity, not uniformity. However, this outcome depends on interactions that are locally transitive, as in a spatially distributed population.
Schelling and Latané modeled agents with identical and fixed behavioral rules. Many agent-based models
assume heterogeneous populations with the ability to learn new rules. The ‘genetic algorithm’ invented by John Holland is a simple and elegant way to model strategies that can progressively improve performance by building on partial solutions. Each strategy consists of a string of symbols that code behavioral instructions, analogous to a chromosome containing multiple genes. The string’s instructions affect the agent’s reproductive fitness and hence the probability that the strategy will propagate. Propagation occurs when two or more mated strategies recombine. If different rules are each effective, but in different ways, recombination allows them to create an entirely new strategy that may integrate the best abilities of each ‘parent’ and thus eventually displace the parent rules in the population of strategies.
Axelrod (1997, pp. 14–29) used a genetic algorithm to study the evolution of cooperation in a ‘prisoner’s dilemma,’ a game in which choices that are individually rational aggregate into collective outcomes that everyone would prefer to avoid. The results showed that strategies based on reciprocity (such as ‘tit for tat’) tend to be very successful. The secret of their success is that they perform well against copies of themselves. Axelrod’s result has been challenged by Nowak and Sigmund (1993), by Binmore (1998), and by Macy (1995). Working independently, these researchers found that ‘tit for tat,’ a strategy that teaches its partner a lesson, can be supplanted by ‘Pavlov’ (a.k.a. ‘win–stay, lose–shift’ and ‘tat for tit’), a strategy that learns (Binmore 1998). The ability to learn appears to be at least as important for emergent social order as the ability to teach.
Learning involves adaptation at the level of the individual rather than the population. Artificial neural networks are simple self-programmable devices that model agents who learn through reinforcement.
Like genetic algorithms, neural nets have a biological analog, in this case, the nervous systems of living organisms. The device consists of a web of neuron-like units (or neurodes) that fire when triggered by impulses of sufficient strength, and in turn stimulate other units when fired. The magnitude of an impulse depends on the strength of the connection (or synapse) between the two neurodes. The network learns by modifying these path coefficients in response to environmental feedback about its performance. Neural nets have been used to study the evolution of religion (Bainbridge 1995), kin altruism (Parisi et al. 1995), the emergence of status inequality (Vakas-Duong and Reilly 1995), group dynamics (Nowak and Vallacher 1997), and social deviance (Kitts et al. 1999).
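The contrast drawn in this section between a strategy that teaches (‘tit for tat’) and one that learns (‘Pavlov’) can be made concrete with a small sketch. The code is illustrative only: the payoff values, the single injected error, and all function names are assumptions and are not taken from Axelrod, Nowak and Sigmund, or the other studies cited. Each strategy plays against a copy of itself, and one move is flipped to show how the pair responds to a mistake.

```python
# Payoff to the first player in a prisoner's dilemma.
# The numbers are illustrative assumptions satisfying T > R > P > S.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(my_last, other_last):
    """Cooperate on the first move, then copy the partner's previous move."""
    return "C" if other_last is None else other_last


def pavlov(my_last, other_last):
    """Win-stay, lose-shift: repeat the last move after a high payoff, else switch."""
    if my_last is None:
        return "C"
    won = PAYOFF[(my_last, other_last)] >= 3  # reward or temptation counts as a win
    if won:
        return my_last
    return "D" if my_last == "C" else "C"


def play(strategy_a, strategy_b, rounds=10, error_round=2):
    """Repeated game in which player A's move is flipped once, at error_round."""
    a_last = b_last = None
    history = []
    for t in range(rounds):
        a = strategy_a(a_last, b_last)
        b = strategy_b(b_last, a_last)
        if t == error_round:
            a = "D" if a == "C" else "C"  # hypothetical one-off mistake
        history.append((a, b))
        a_last, b_last = a, b
    return history


def mutual_cooperation(history):
    """Number of rounds in which both players cooperated."""
    return sum(1 for moves in history if moves == ("C", "C"))


if __name__ == "__main__":
    print("tit for tat:", play(tit_for_tat, tit_for_tat))
    print("Pavlov:     ", play(pavlov, pavlov))
```

Under these assumptions, two tit-for-tat players lock into alternating retaliation after the error, whereas two Pavlov players pass through one round of mutual defection and then return to mutual cooperation.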
4. Criticisms of Simulation
Agent-based simulations have been criticized as unrealistic, tautological, and unreliable.
(a) Simulations, like mathematical models, tell us about a highly stylized and abstract world, not about the world in which we live. In particular, agent-based simulations rely on highly simplified models of rule-based human behavior that neglect the complexity of human cognition. (b) Like mathematical models, simulations cannot tell us anything that we do not already know, given the assumptions built into the model and the inability of a computer to do anything other than execute the instructions given to it by the program. (c) Unlike mathematical models, simulations are numerical and therefore cannot establish lawful regularities or generalizations.
The defense against these criticisms centers on the principles of complexity and emergence. How can simple agent-based models explain the rich complexity of social life? In complex systems, very simple rules of interaction can produce highly complex global patterns. ‘Human beings,’ Simon contends (1998, p. 53), ‘viewed as behaving systems, are quite simple.’ We follow rules, in the form of norms, conventions, protocols, moral and social habits, and heuristics. Although the rules may be quite simple, they can produce global patterns that may not be at all obvious and are very difficult to understand. Hence, ‘the apparent complexity of our behavior is largely a reflection of the complexity of the environment,’ including the complexity of interactions among strategically interdependent, adaptive agents. The simulation of artificial worlds allows us to explore the complexity of the social environment by removing the cognitive complexity (and idiosyncrasy) of constituent individuals.
4.1 The Exploration of Artificial Worlds
If Simon, Axelrod, and the complexity theorists are right, then the artificiality of agent-based models is a virtue, not a vice. When simulation is used to make predictions or for training personnel (e.g., flight simulators), the assumptions need to be highly realistic, which usually means they will also be highly complicated (Axelrod 1997, p. 5). ‘But if the goal is to deepen our understanding of some fundamental process,’ Axelrod continues, ‘then simplicity of the assumptions is important and realistic representation of all the details of a particular setting is not.’ As such, the purpose of these models is to generate hypotheses, not to test them (Prietula et al. 1998, p. xv). Holland (1995, p. 146) offers a classic example of the problem of building theories that too closely resemble actuality: Aristotle’s mistaken conclusion that all bodies come to rest, based on observation of a world that happens to be infested with friction. Had these observations been made in a frictionless world, Aristotle would have come to the same conclusion
reached by Newton and formulated as the principle that bodies in motion persist in that motion unless perturbed. Newton avoided Aristotle’s error by studying what happens in an artificial world in which friction had been assumed away. Ironically, ‘Aristotle’s model, though closer to everyday observations, clouded studies of the natural world for almost two millennia’ (Holland 1995, p. 146). More generally, it is often necessary to step back from our world in order to see it more clearly. For example, Schelling’s ‘neighborhood segregation’ model made no effort to imitate actuality or to resemble an observed city. Rather, the ‘residents’ live in a highly abstract cellular world without real streets, rivers, train tracks, zoning laws, red-lining, housing markets, etc. This artificial world shows that all neighborhoods in which residents are able to move have an underlying tendency towards segregation even when residents are moderately tolerant. This hypothesis can then be tested in observed neighborhoods, but even if observations do not confirm the predicted segregation, this does not detract from the value of the simulation in suggesting empirical conditions that were not present in the model that might account for the discrepancy with the observed outcome. In short, agent-based models ‘have a role similar to mathematical theory, shearing away detail and illuminating crucial features in a rigorous context’ (Holland 1995, p. 100). Nevertheless, Holland reminds us that we must be careful. Unlike laboratory experiments, computational ‘thought experiments’ are not constrained by physical actuality and thus ‘can be as fanciful as desired or accidentally permitted.’
4.2 Unwrapping and Emergence
A second criticism of agent-based modeling addresses a problem known as ‘unwrapping.’ Unwrapping occurs ‘when the ‘solution’ is explicitly built into the program [such that] the simulation reveals little that is new or unexpected’ (Holland 1995, p. 137). Critics charge that all computer simulations share this limitation, due to the inability of computers to act in any way other than how they were programmed to behave. This criticism overlooks the distinction between the micro-level programming of the agents and the macro-level patterns of interaction that emerge. These patterns are ‘built in’ from the outset, yet the exercise is valuable because the logical implications of behavioral assumptions are not always readily apparent. Indeed, a properly designed computational experiment can yield highly counter-intuitive and surprising results. This theoretical possibility is based on the principle of emergence in complex systems. Biological examples of emergence include life that emerges from non-living organic compounds and intelligence and consciousness that emerge from dense networks of very simple
switch-like neurons. Schelling’s (1971) neighborhood model provides a compelling example from social life: segregation is an emergent property of social interaction that is not reducible to individual intolerance.
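The claim that segregation can emerge without intolerant individuals can be explored with a minimal sketch in the spirit of Schelling’s model. It is not Schelling’s original specification: the torus-shaped grid, the eight-cell neighborhood, the 30 percent contentment threshold, the random relocation rule, and all names below are assumptions chosen for brevity.

```python
import random


def like_share(grid, size, r, c):
    """Share of occupied neighboring cells (8 neighbors, wrapping edges) of the same type."""
    me = grid[(r, c)]
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            other = grid[((r + dr) % size, (c + dc) % size)]
            if other is not None:
                total += 1
                same += other == me
    return same / total if total else 1.0  # an agent with no neighbors is content


def mean_like_share(grid, size):
    """Average like-neighbor share over all agents: a crude segregation index."""
    shares = [like_share(grid, size, r, c) for (r, c), v in grid.items() if v is not None]
    return sum(shares) / len(shares)


def run_schelling(size=20, empty_frac=0.1, threshold=0.3, sweeps=50, seed=0):
    """Return the segregation index before and after unhappy agents relocate."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size)]
    n_empty = int(empty_frac * len(cells))
    n_agents = len(cells) - n_empty
    values = [None] * n_empty + ["A"] * (n_agents // 2) + ["B"] * (n_agents - n_agents // 2)
    rng.shuffle(values)
    grid = dict(zip(cells, values))
    before = mean_like_share(grid, size)
    for _ in range(sweeps):
        # Agents are content even as a local minority, as long as at least
        # `threshold` of their neighbors share their type.
        unhappy = [pos for pos, v in grid.items()
                   if v is not None and like_share(grid, size, *pos) < threshold]
        if not unhappy:
            break
        rng.shuffle(unhappy)
        for pos in unhappy:
            empties = [p for p, v in grid.items() if v is None]
            target = rng.choice(empties)
            grid[target], grid[pos] = grid[pos], None  # move to a random empty cell
    return before, mean_like_share(grid, size)


if __name__ == "__main__":
    before, after = run_schelling()
    print(f"mean like-neighbor share: before={before:.2f}, after={after:.2f}")
```

Comparing `before` and `after` across a few seeds typically shows the average share of like neighbors rising well beyond the modest share each agent requires, which is the sense in which the aggregate pattern is not reducible to individual preferences.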
4.3 Rationality and Adaptive Behavior
A third criticism is that simulation models, unlike mathematical models, are numerical, not deductive, and therefore cannot be used to form generalizations. Worse still, simulations of evolutionary processes can be highly sensitive to the basin of attraction in which the parameters are arbitrarily initialized (Binmore 1998). However, game-theoretic mathematical models pay a high price for the ability to generate deductive conclusions: multiple equilibria that preclude a uniquely rational solution. Equilibrium selection requires constraints on the perfect rationality of the agents. Simon appreciates the paradox: ‘Game theory’s most valuable contribution has been to show that rationality is effectively undefinable when competitive actors have unlimited computational capabilities for outguessing each other, but that problem does not arise as acutely in a world, like the real world, of bounded rationality.’ (1998, p. 38) But bounded rationality often makes analytical solutions mathematically intractable. ‘When the agents use adaptive rather than optimizing strategies, deducing the consequences is often impossible: simulation becomes necessary.’ (Axelrod 1997, p. 4) In sum, simulation requires sensitivity analysis and tests of robustness, but properly implemented, it not only offers a solution to the problem of equilibrium selection but can also tell us something about how evolutionary systems move from one equilibrium to another.
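One simple form of the sensitivity analysis called for here is to rerun a stochastic model over many random seeds and several initial conditions, and to report how often a given outcome occurs. The sketch below does this for a deliberately simplified imitation model on a ring, used as a stand-in for the lattice models of social influence discussed earlier. The ring topology, the majority rule, the parameter values, and the function names are all assumptions for illustration, not a reproduction of any cited model.

```python
import random


def step(states):
    """Synchronous update: each agent adopts the majority of itself and its two ring neighbors."""
    n = len(states)
    return [1 if states[i - 1] + states[i] + states[(i + 1) % n] >= 2 else 0
            for i in range(n)]


def diversity_survives(n_agents, share, seed, steps=100):
    """Run one simulation and report whether both opinions are still present at the end."""
    rng = random.Random(seed)
    states = [1 if rng.random() < share else 0 for _ in range(n_agents)]
    for _ in range(steps):
        new = step(states)
        if new == states:  # fixed point reached
            break
        states = new
    return 0 < sum(states) < n_agents


def sensitivity_sweep(shares=(0.1, 0.3, 0.5), seeds=range(30), n_agents=60):
    """Fraction of seeds in which diversity survives, for each initial share of opinion 1."""
    return {
        share: sum(diversity_survives(n_agents, share, seed) for seed in seeds) / len(seeds)
        for share in shares
    }


if __name__ == "__main__":
    for share, fraction in sensitivity_sweep().items():
        print(f"initial share {share:.1f}: diversity survives in {fraction:.0%} of runs")
```

Reporting an outcome frequency across seeds and starting conditions, rather than a single run, is one way to check whether a result is robust or an accident of initialization.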
5. Conclusion
From a narrowly positivist perspective, agent-based artificial worlds might seem unrealistic, unfalsifiable, and unreliable, especially when contrasted with earlier approaches to simulation that were data driven, predictive, and highly realistic (e.g., flight simulators). However, agent-based models may be essential for the study of open and complex adaptive systems, whether social, psychological, or biological. When used as an exploratory tool, ‘bottom-up’ simulation does not test theory against observation. Rather, these thought experiments suggest possible mechanisms that may generate puzzling empirical patterns. A key theme that runs through these puzzles is that it is often very difficult to recognize the underlying causal mechanisms using conventional data-analytic methods. As Holland concludes (1995, p. 195), ‘I do not think we will understand morphogenesis, or the
emergence of organizations like Adam Smith’s pin factory, or the richness of interactions in a tropical forest, without the help of such models.’
See also: Artificial Social Agents; Functional Equations in Behavioral and Social Sciences; Markov Models and Social Analysis; Population Dynamics: Mathematic Models of Population, Development, and Natural Resources; Social Behavior (Emergent), Computer Models of; Social Network Models: Statistical
Bibliography
Axelrod R 1997 The Complexity of Cooperation. Princeton University Press, Princeton, NJ
Bainbridge W 1995 Neural network models of religious belief. Sociological Perspectives 38(4): 483–96
Binmore K 1998 Axelrod’s ‘The Complexity of Cooperation’. The Journal of Artificial Societies and Social Simulation 1: 1
Caldwell S 1997 Dynamic Microsimulation and the Corsim 3.0 Model. Strategic Forecasting, Ithaca, NY
Carley K 1991 A theory of group stability. American Sociological Review 56: 331–54
Cyert R, March J G 1963 A Behavioral Theory of the Firm. Prentice-Hall, Englewood Cliffs, NJ
Forrester J W 1971 World Dynamics. MIT Press, Cambridge, MA
Gilbert N, Troitzsch K 1999 Simulation for the Social Scientist. Open University Press, Buckingham, UK
Holland J 1995 Hidden Order: How Adaptation Builds Complexity. Perseus, Reading, MA
Kitts J, Macy M, Flache A 1999 Structural learning: Attraction and conformity in task-oriented groups. Computational and Mathematical Organization Theory 5: 129–45
Latané B 1996 Dynamic social impact: Robust predictions from simple theory. In: Hegselmann R, Mueller U, Troitzsch K (eds.) Modeling and Simulation in the Social Sciences from a Philosophy of Science Point of View. Kluwer, Dordrecht, Boston, pp. 287–310
Macy M 1995 Natural selection and social learning in prisoner’s dilemma: Co-adaptation with genetic algorithms and artificial neural networks. Sociological Methods and Research 25: 103–37
Meadows D L, Behrens W W III, Meadows D H, Naill R F, Randers J, Zahn E K 1974 The Dynamics of Growth in a Finite World. MIT Press, Cambridge, MA
Nowak A, Vallacher R 1997 Computational social psychology: Cellular automata and neural network models of interpersonal dynamics. In: Read S, Miller L (eds.) Connectionist Models of Social Reasoning and Social Behavior. Erlbaum, Mahwah, NJ
Nowak M, Sigmund K 1993 A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoners’ dilemma game. Nature 364: 56–7
Parisi D, Cecconi F, Cerini A 1995 Kin-directed altruism and attachment behaviour in an evolving population of neural networks. In: Gilbert N, Conte R (eds.) Artificial Societies. University College London Press, London
Prietula M, Carley K, Gasser L 1998 Simulating Organizations: Computational Models of Institutions and Groups. MIT Press, Cambridge, MA
Schelling T 1971 Dynamic models of segregation. Journal of Mathematical Sociology 1: 143–86
Simon H 1998 The Sciences of the Artificial. MIT Press, Cambridge, MA
Vakas-Duong D, Reilly K 1995 A system of IAC neural networks as the basis for self-organization in a sociological dynamical system simulation. Behavioral Science 40: 275–303
M. W. Macy
Social Stratification
In all complex societies, the total stock of valued goods is distributed unequally, with the most privileged individuals and families enjoying a disproportionate share of income, power, and other valued resources. The term ‘stratification system’ refers to the complex of social institutions that generate observed inequalities of this sort. The key components of such systems are (a) the institutional processes that define certain types of goods as valuable and desirable; (b) the rules of allocation that distribute these goods across various positions in the division of labor (e.g., doctor, farmer, ‘housewife’); and (c) the mobility mechanisms that link individuals to positions and thereby generate unequal control over valued resources. It follows that inequality is produced by two types of matching processes: The social positions in society are first matched to ‘reward packages’ of unequal value, and members of society are then allocated to the positions so defined and rewarded.
There are of course many types of rewards that come to be attached to social positions (see Table 1). Given this complexity, one might expect stratification scholars to adopt a multidimensional approach, with the objective being to describe and explain the full distribution of goods listed in Table 1. Although some scholars have indeed advocated an approach of this sort, most have instead opted to characterize stratification systems in terms of discrete classes or socioeconomic strata whose members are endowed with similar amounts and types of resources. The goal of stratification research has thus reduced to (a) describing the major forms of class inequality in human history, (b) specifying the structure of contemporary classes and strata, (c) modeling the processes by which individuals move between class and socioeconomic positions, and (d) examining the effects of race, ethnicity, and gender on such mobility processes.
Table 1 Types of assets, resources, and valued goods underlying stratification systems
| Asset group | Selected examples | Relevant scholars |
|---|---|---|
| 1. Economic | Ownership of land, farms, factories, professional practices, businesses, liquid assets, humans (i.e., slaves), labor power (e.g., serfs) | Karl Marx; Erik Wright |
| 2. Political | Household authority (e.g., head of household); workplace authority (e.g., manager); party and societal authority (e.g., legislator); charismatic leader | Max Weber; Ralf Dahrendorf |
| 3. Cultural | High-status consumption practices; ‘good manners’; privileged life-style | Pierre Bourdieu; Paul DiMaggio |
| 4. Social | Access to high-status social networks, social ties, associations and clubs, union memberships | W. Lloyd Warner; James Coleman |
| 5. Honorific | Prestige; ‘good reputation’; fame; deference and derogation; ethnic and religious purity | Edward Shils; Donald Treiman |
| 6. Civil | Rights of property, contract, franchise, and membership in elective assemblies; freedom of association and speech | T. H. Marshall; Rogers Brubaker |
| 7. Human | Skills; expertise; on-the-job training; experience; formal education; knowledge | Kaare Svalastoga; Gary Becker |

1. Basic Concepts
The foregoing lines of inquiry cannot be adequately reviewed without first defining some of the core concepts in the field. The following definitions are especially relevant for our purposes:
(a) The degree of inequality in a given reward or asset depends, of course, on its dispersion or concentration across the population. If some types of assets (e.g., civil rights) are distributed more equally than others (e.g., political power), then the level of inequality obviously cannot be adequately characterized with a single parameter.
(b) The rigidity of a stratification system is indexed by the continuity (over time) in the social standing of its members. The stratification system is said to be highly rigid, for example, if the current income, power, or prestige of individuals can be accurately predicted on the basis of their prior statuses or those of their parents.
(c) The stratification system rests on ascriptive processes to the extent that traits present at birth (e.g., sex, race, ethnicity, parental wealth) influence the subsequent social standing of individuals. In modern societies, ascription of all kinds is usually seen as undesirable or discriminatory, and much governmental policy is therefore directed toward fashioning a stratification system in which individuals acquire resources solely by virtue of their achievements.
(d) The degree of status crystallization is indexed by the correlations among the assets in Table 1. If these correlations are strong, then the same individuals (i.e., the ‘upper class’) will consistently appear at the top of all status hierarchies, while other individuals (i.e., the ‘lower class’) will consistently appear at the bottom of the stratification system.
These four variables can be used to characterize differences across societies in the underlying structure of stratification (see Table 2). As the following discussion shall reveal, there is much cross-societal variability not merely in the extent of inequality, but also in the underlying shape of the class structure and the processes by which classes are reproduced.
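Three of the concepts just defined lend themselves to simple summary statistics, and a short sketch may help fix ideas. The specific operationalizations below are assumptions chosen for illustration, not measures prescribed by the text: a Gini coefficient for the dispersion of one asset, a parent-child correlation for rigidity, and the mean pairwise correlation across asset dimensions for status crystallization. All names and any data passed in are hypothetical.

```python
from itertools import combinations
from math import sqrt


def gini(values):
    """Gini coefficient for non-negative values with a positive total.

    0 means the asset is spread perfectly evenly; values near 1 mean it is
    concentrated in very few hands.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))  # rank-weighted sum
    return (2 * weighted) / (n * total) - (n + 1) / n


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rigidity(parent_status, child_status):
    """Assumed index of rigidity: how well parental standing predicts a child's standing."""
    return pearson(parent_status, child_status)


def crystallization(assets):
    """Assumed index of crystallization: mean pairwise correlation among asset dimensions.

    `assets` maps a dimension name (e.g. 'economic', 'honorific') to a list of
    individual scores, with individuals in the same order in every list.
    """
    pairs = list(combinations(assets.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    income = [10, 20, 30, 40]      # hypothetical scores for four individuals
    prestige = [1, 2, 3, 5]
    authority = [2, 1, 4, 6]
    print("inequality (Gini):", round(gini(income), 3))
    print("crystallization:", round(
        crystallization({"economic": income, "honorific": prestige, "political": authority}), 3))
```

Because the text notes that different assets can be distributed with different degrees of equality, an index such as `gini` would be computed separately for each asset rather than once for the society as a whole.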
Table 2 Basic parameters of stratification for eight ideal-typical systems
| System (1) | Principal assets (2) | Major strata or classes (3) | Inequality (4) | Rigidity (5) | Crystallization (6) | Justifying ideology (7) |
|---|---|---|---|---|---|---|
| A. Hunting and gathering society | | | | | | |
| 1. Tribalism | Human (hunting and magic skills) | Chiefs, shamans, and other tribe members | Low | Low | High | Meritocratic selection |
| B. Horticultural and agrarian society | | | | | | |
| 2. Asiatic mode | Political (i.e., incumbency of state office) | Office-holders and peasants | High | Medium | High | Tradition and religious doctrine |
| 3. Feudalism | Economic (land and labor power) | Nobility, clergy, and commoners | High | Medium–high | High | Tradition and Roman Catholic doctrine |
| 4. Slavery | Economic (human property) | Slave owners, slaves, ‘free men’ | High | Medium–high | High | Doctrine of natural and social inferiority (of slaves) |
| 5. Caste society | Honorific and cultural (ethnic purity and ‘pure’ lifestyles) | Castes and subcastes | High | High | High | Tradition and Hindu religious doctrine |
| C. Industrial society | | | | | | |
| 6. Class system | Economic (means of production) | Capitalists and workers | Medium–high | Medium | High | Classical liberalism |
| 7. State socialism | Political (party and workplace authority) | Managers and managed | Low–medium | Low–medium | High | Marxism and Leninism |
| 8. ‘Advanced’ industrialism | Human (i.e., education, expertise) | Skill-based occupational groupings | Medium | Low–medium | Medium | Classical liberalism |

2. Forms of Stratification
It is useful to begin with the purely descriptive task of classifying the various types of stratification systems that have appeared in past and present societies. The first panel of Table 2 pertains to the ‘primitive’ tribal systems that dominated human society from the very beginning of human evolution until the Neolithic revolution of some 10,000 years ago. Although tribal societies have of course assumed various forms, the total size of the distributable surplus was in all cases quite limited; and this cap on the surplus placed corresponding limits on the overall level of economic inequality. Indeed, some observers have treated tribal societies as examples of ‘primitive communism,’ since the means of production (e.g., tools) were owned collectively and other types of property were typically distributed evenly among tribal members. Moreover, insofar as positions of power emerged (e.g., shamans), these were never inherited but instead were secured by demonstrating superior skills in the relevant tasks. While meritocratic criteria are often seen as prototypically modern, they were in fact present in incipient form at quite early stages of societal development, no doubt because the surplus was too small to permit the luxury of less adaptive forms of allocation.
With the emergence of agrarian forms of production, the economic surplus became large enough to support more complex and less meritocratic systems of stratification. The ‘Asiatic mode,’ which some commentators regard as a precursor of advanced agrarianism, is characterized by a poorly developed proprietary
class and a powerful state elite that extracted surplus agricultural production through rents and taxes (see line B2 in Table 2). This mode provides the conventional example of how a ‘dictatorship of officialdom’ can flourish in the absence of institutionalized private property. Whereas political assets were thus dominant in the Asiatic mode, the ruling class under Western feudalism was, by contrast, very much a propertied one. The distinctive feature of feudalism was that the nobility not only owned large estates or manors but also held legal title to the labor power of its serfs (see line B3). If a serf fled to the city, this was considered a form of theft: The serf was stealing that portion of his or her labor power owned by the lord. With this interpretation, the statuses of serf and slave differ only in degree, and slavery thereby constitutes a limiting case in which workers lose all control over their own labor power (see line B4; also see Slavery as Social Institution).
The historical record makes it clear that agrarian stratification systems were not always based on strictly hereditary forms of social closure. To be sure, the era of classical feudalism (i.e., post-twelfth century) was characterized by a rigid stratification of classes, but there was far greater permeability during the period prior to the institutionalization of the manorial system and the associated transformation of the nobility into a legal class. The most extreme example of agrarian closure can of course be found in caste societies (see line B5).
The Indian caste system, for example, is based on (a) a hierarchy of status groupings (i.e., castes) that are ranked by ethnic purity, wealth, and access to goods or services; (b) a corresponding set of ‘closure rules’ that restrict all forms of intercaste marriage or mobility and thereby make caste membership both hereditary and permanent; (c) a high degree of physical and occupational segregation enforced by elaborate rules and rituals governing intercaste contact; and (d) a justifying ideology (i.e., Hinduism) that induces the population to regard such extreme forms of inequality as legitimate and appropriate. What makes this system so distinctive, then, is not merely its well-developed closure rules but also the fundamentally honorific (and noneconomic) character of the underlying social hierarchy. The defining feature of the industrial era (see panel C) has been the emergence of egalitarian ideologies and the consequent ‘delegitimation’ of the extreme forms of stratification found in caste, feudal, and slave systems. This can be seen, for example, in the European revolutions of the eighteenth and nineteenth centuries that pitted the egalitarian ideals of the Enlightenment against the privileges of rank and the political power of the nobility. In the end, these struggles eliminated the last residue of feudal privilege, but they also made new types of inequality and stratification possible. Under the class system that ultimately emerged (see line C6), the estates of the feudal era were replaced by purely economic groups
(i.e., ‘classes’), and closure rules based on heredity were likewise supplanted by (formally) meritocratic processes. The resulting classes were neither legal entities nor closed status groupings, and the associated class-based inequalities could therefore be represented and justified as the natural outcome of competition among individuals with differing abilities, motivation, or moral character (i.e., ‘classical liberalism’). As indicated in line C6, the class structure of early industrialism had a clear economic base, so much so that Marx ([1894] 1972) defined classes in terms of their relationship to the means of economic production. The precise contours of the industrial class structure are nonetheless a matter of continuing debate (see below); for example, a simple Marxian model focuses on the cleavage between capitalists and workers, while more elaborate Marxian and neo-Marxian models identify additional intervening or ‘contradictory’ classes (e.g., Wright 1997), and yet other (non-Marxian) approaches represent the class structure as a continuous gradation of income, prestige, or socioeconomic status. Whatever the relative merits of these models might be, the ideology underlying the socialist revolutions of the nineteenth and twentieth centuries was of course explicitly Marxist (see line C7). The intellectual heritage of these revolutions and their legitimating ideologies can again be traced to the Enlightenment, but the rhetoric of equality that emerged in this period was now directed against the economic power of the capitalist class rather than the status and honorific privileges of the nobility. The evidence from Eastern Europe and elsewhere suggests that these egalitarian ideals were only partially realized.
In the immediate postrevolutionary period, factories and farms were indeed collectivized or socialized, and various fiscal and economic reforms were instituted for the express purpose of reducing income inequality and wage differentials among manual and nonmanual workers. Although these egalitarian policies were subsequently weakened through the reform efforts of Stalin and others, inequality on the scale of pre-revolutionary society was never re-established among rank-and-file workers (see Lenski 2001). There nonetheless remained substantial inequalities in power and authority; most notably, the socialization of production did not have the intended effect of empowering workers, as the capitalist class was replaced by a ‘new class’ of party officials and managers who continued to control the means of production and to allocate the resulting social surplus. This class has been variously identified with intellectuals or intelligentsia, bureaucrats or managers, and party officials or appointees (see Gouldner 1979). Regardless of the formulation adopted, the presumption is that the working class ultimately lost out in contemporary socialist revolutions, just as it did in the so-called bourgeois revolutions of the eighteenth and nineteenth centuries (e.g., see Postsocialist Societies).
Whereas the means of production were socialized in the revolutions of Eastern Europe and the former USSR, the capitalist class remained intact throughout the process of industrialization in the West, even as ownership and control separated and a distinct managerial class emerged (see Dahrendorf 1959). The capitalist class may nonetheless be weakened by the structural changes of postindustrialism, with the most important of these being the rise of a service economy and the consequent emergence of technical expertise, educational degrees, and training certificates as new forms of property (see line C8). By this formulation, a dominant class of cultural elites may be emerging in the West, much as the transition to state socialism (allegedly) generated a new class of intellectuals in the East. This is not to suggest that all theorists of advanced industrialism posit a grand divide between the cultural elite and an undifferentiated working mass. In fact, some commentators (e.g., Dahrendorf 1959) have argued that skill-based cleavages are crystallizing throughout the occupational structure, with the result being a finely differentiated class system made up of discrete occupations (Grusky and Sørensen 1998) or a continuous gradation of socioeconomic status (e.g., Hauser and Warren 1997).
3. The Structure of Contemporary Stratification
The history of stratification theory is in large part a history of such debates about the emerging structure of contemporary inequality. More so than usual, political and intellectual goals are conflated in this literature and, accordingly, the relevant research is infused with much scholarly contention. These debates are complex and wide-ranging, but it will suffice to review four approaches that have proven to be especially popular.
3.1 Reductionism
Among contemporary interpreters of inequality, the prevailing approach has long been to claim that only one of the ‘asset groups’ in Table 1 is truly fundamental, whereas all others are somehow epiphenomenal. There are nearly as many claims of this sort as there are dimensions in Table 1. To be sure, Marx is most commonly criticized (with some justification) for placing ‘almost exclusive emphasis on economic factors as determinants of social class’ (Lipset 1968, p. 300), but in fact much of what passes for stratification theorizing amounts to reductionism of one form or another. Among non-Marxist scholars, inequalities in honor or power are frequently regarded as the most fundamental sources of class formation, while the distribution of economic assets is seen as purely secondary. For example, Dahrendorf (1959) argues that ‘differential authority in associations is the
ultimate ‘cause’ of the formation of conflict groups’ (p. 172), while Shils (1968) suggests that ‘without the intervention of considerations of deference position the ... inequalities in the distribution of any particular facility or reward would not be grouped into a relatively small number of vaguely bounded strata’ (p. 130). These extreme forms of reductionism have been less popular of late; indeed, even neo-Marxian scholars now typically recognize several stratification dimensions, with the social classes of interest then being defined as particular combinations of scores on the selected variables (e.g., Wright 1997). 3.2 Synthesizing Approaches There is an equally long tradition of research based on synthetic measures that simultaneously tap a wide range of assets and resources. As noted above, many of the rewards in Table 1 (e.g., income, honor) are principally allocated through the jobs or social roles that individuals occupy, and one can therefore measure the standing of individuals by classifying them in terms of their social positions. In this context, Parkin (1971) has referred to the occupational structure as the ‘backbone of the entire reward system of modern Western society’ (p. 18), while Hauser and Featherman (1977) argue that studies ‘framed in terms of occupational mobility … yield information simultaneously (albeit, indirectly) on status power, economic power, and political power’ (p. 4). The most recent representatives of this position, Grusky and Sørensen (1998), have argued that detailed occupations are not only the main conduits through which valued goods are disbursed but are also deeply institutionalized categories that are salient to workers, constitute meaningful social communities and reference groups, and serve as fundamental units of collective action. 
Although occupations continue, then, to be the preferred classificatory tool within this tradition, other scholars have pursued the same synthesizing objective by simply asking community members to locate their peers in a hierarchy of social strata (e.g., Warner et al. 1949). Under the latter approach, a synthetic classification is no longer secured by ranking and sorting occupations in terms of the bundles of rewards attached to them, but rather by passing the raw data of inequality through the fulcrum of individual judgment. 3.3 Aggregation Exercises Regardless of whether a reductionist or synthesizing approach is taken, most scholars adopt the final simplifying step of defining a relatively small number of discrete classes (e.g., see Class: Social). For example, Parkin (1971) argues for six occupational classes with the principal ‘cleavage falling between the manual and non-manual categories’ (p. 25), whereas Dahrendorf (1959) argues for a two-class solution with a ‘clear line drawn between those who participate in the exercise [of authority] ... and those who are subject to the authoritative commands of others’ (p. 170).
While close variants of the Parkin scheme continue to be used, the emerging convention among quantitative stratification scholars is to apply either the 12-category neo-Marxian scheme fashioned by Wright (1997) or the 11-category neo-Weberian scheme devised by Erikson and Goldthorpe (1994). At the same time, new classification schemes continue to be regularly proposed, with the impetus for such efforts typically being the continuing expansion of the service sector and the associated growth of contingent work relations (e.g., Esping-Andersen 1999). The question that necessarily arises for all contemporary schemes is whether the constituent categories are purely nominal entities or are truly meaningful to the individuals involved. If the categories are intended to be meaningful, one would expect class members not only to be aware of their membership (i.e., ‘class awareness’) but also to identify with their class (i.e., ‘class identification’) and occasionally act in its behalf (i.e., ‘class action’).
3.4 Gradationalism The prior approaches involve mapping individuals or families into mutually exclusive and exhaustive categories. By contrast, the implicit claim underlying gradational approaches is that such ‘classes’ are largely the construction of overzealous sociologists, and that the underlying structure of modern stratification can, in fact, be more closely approximated with gradational measures of income, status, or prestige. Although there is some sociological precedent for treating income as a gradational indicator of class, most sociologists opt for either prestige scales based on popular evaluations of occupational standing or socioeconomic scales constructed as weighted averages of occupational income and education. In recent years, such gradationalism has been subjected to criticism on various fronts, with the main objections being that (a) the fluidity and openness of the stratification system are overstated by conventional prestige and socioeconomic scales (Hauser and Warren 1997); (b) the desirability of jobs is best indexed by directly measuring such job-level attributes as earnings, promotion opportunities, complexity, or autonomy (Jencks et al. 1988); and (c) the convention of converting discrete occupations into scales strips away precisely those gemeinschaftlich features of discrete occupations that make them analytically so powerful (Grusky and Sørensen 1998). There is of course no guarantee that any of these criticisms will take hold; indeed, because socioeconomic scales have dominated sociological research since the 1960s, inertial forces are quite strong and will not be easily overcome.
4. Generating Stratification The foregoing debates may be seen, then, as disputes over which data reduction techniques best reveal the fundamental features of the distribution of rewards. If one or more of these simplifying assumptions is applied and a mapping of inequality is thereby developed, it becomes possible to examine the flow of individuals between the various categories of this mapping. The language of stratification theory thus makes a sharp distinction between the distribution of social rewards and the distribution of opportunities for securing these rewards. The latter distribution has come to determine popular judgments about the legitimacy of stratification; that is, substantial inequalities in power, wealth, or honor are typically seen as tolerable (and even desirable) provided that the opportunities for securing these social goods are distributed equally. Whatever the wisdom of this popular logic might be, stratification researchers have long sought to explore its factual underpinnings by monitoring and describing the structure of mobility chances. The relevant literature is vast, but of course broad classes of inquiry can be distinguished, as indicated below.
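The socioeconomic scales mentioned in Section 3.4, which are built as weighted averages of occupational income and education, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the four occupations, their income and schooling values, the equal weights, and the function names are hypothetical, and published scales typically derive their weights from regression on prestige ratings rather than fixing them in advance.

```python
from statistics import mean, pstdev

# Hypothetical occupation-level averages (illustrative values, not real estimates).
occupations = {
    "physician": {"income": 180_000, "education": 20},
    "teacher": {"income": 60_000, "education": 17},
    "electrician": {"income": 58_000, "education": 13},
    "cashier": {"income": 26_000, "education": 12},
}


def zscores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


def socioeconomic_scale(occs, w_income=0.5, w_education=0.5):
    """Score each occupation as a weighted average of standardized
    occupational income and education. The equal weights are an assumption."""
    names = list(occs)
    z_income = zscores([occs[n]["income"] for n in names])
    z_education = zscores([occs[n]["education"] for n in names])
    return {
        n: w_income * zi + w_education * ze
        for n, zi, ze in zip(names, z_income, z_education)
    }


scores = socioeconomic_scale(occupations)
# Occupations ordered from highest to lowest score: a continuous gradation
# rather than a set of discrete classes.
ranking = sorted(scores, key=scores.get, reverse=True)
```

The result is a single continuous score per occupation, which is what distinguishes the gradational approach from the class schemes of Section 3.3; it also shows concretely what criticism (c) objects to, since the occupation labels themselves play no role once the scores are computed.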
4.1 Mobility Analyses The conventional starting point for mobility scholars has been to analyze bivariate ‘mobility tables’ formed by cross-classifying the class origins and destinations of individuals (e.g., see Social Mobility, History of). The tables constructed can be used to estimate densities of inheritance, to map the social distances between classes and their constituent occupations, and to examine differences across subpopulations in the amount and patterning of fluidity and opportunity. Moreover, when comparable mobility tables are assembled from several countries, it becomes possible to address fundamental debates about the underlying contours of cross-national variation in stratification systems (e.g., Erikson and Goldthorpe 1994). This long-standing line of analysis, while still underway, has nonetheless declined of late, perhaps because past research has been so definitive as to undercut further efforts. The focus has thus shifted to studies of income mobility, with the twofold impetus for this development being (a) concerns that poverty may be increasingly difficult to escape and that a permanent underclass may be forming, and (b) the obverse hypothesis that growing income inequality may be counterbalanced by increases in the rate of mobility between income groups. 4.2 The Process of Stratification It is by now a sociological truism that Blau and Duncan (1967) and their colleagues (e.g., Sewell et al.
1969) revolutionized the field with their formal ‘path models’ of stratification. These models were intended to represent, if only partially, the processes by which background advantages could be converted into socioeconomic status through the mediating variables of schooling, aspirations, and parental encouragement. Under formulations of this kind, the main sociological objective was to show that socioeconomic outcomes were structured not only by ability and family origins, but also by various intervening variables (e.g., schooling) that were themselves only partly determined by origins and other ascriptive forces. The picture of modern stratification that emerged suggested that market outcomes depend in large part on unmeasured career contingencies (i.e., ‘individual luck’) rather than influences of a more structural sort (Jencks et al. 1972). This line of research, which fell out of favor by the mid-1980s, has been recently reinvigorated as stratification scholars react to the controversial claim (i.e., Herrnstein and Murray 1994) that inherited intelligence is increasingly determinative of stratification outcomes (e.g., Fischer et al. 1996). In a related development, contemporary scholars have also turned their attention to the effects of family structure on attainment, given that new nontraditional family arrangements (e.g., female-headed households) may in some cases reduce the influence of biological parents and otherwise complicate the reproduction of class. 4.3 Structural Analyses The preceding models are frequently criticized for failing to attend to the social structural constraints that operate on the stratification process independently of individual-level traits. The structuralist accounts that ultimately emerged from these critiques initially amounted, in most cases, to refurbished versions of dual economy and market segmentation models that were introduced and popularized many decades ago by institutional economists. 
When these models were redeployed by sociologists in the early 1980s, the usual objective was to demonstrate that women and minorities were disadvantaged not merely by virtue of deficient human capital investments (e.g., inadequate schooling and experience), but also by their consignment to secondary labor markets that, on average, paid out lower wages and offered fewer opportunities for promotion or advancement. At the same time, more deeply sociological forms of structuralism have also appeared, both in the form of (a) mesolevel accounts of the effects of social networks and ‘social capital’ on attainment, and (b) macrolevel accounts of the effects of institutional context (e.g., welfare regimes) on mobility processes and outcomes. The history of these research traditions is arguably marked more by statistical and methodological signposts than substantive ones. Although there is, then, no grand theory that unifies seemingly disparate
models, the field has long relied on middle-range theorizing addressing such issues as the forces making for discrimination, the processes through which class-based advantage is reproduced, and the effects of industrialism, capitalism, and socialism on mobility and attainment. The main contenders, at present, for a ‘grand theory’ of mobility are various forms of rational action analysis that allow middle-range accounts to be recast in terms of individual-level incentives and purposive behavior (e.g., see Rational Choice Theory in Sociology). Indeed, just as the assumption of utility maximization underlies labor economics, so too a theory of purposive behavior might ultimately organize much, albeit not all, sociological theory on social mobility and attainment.
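As a concrete illustration of the bivariate mobility tables described in Section 4.1, the following minimal sketch cross-classifies class origins and destinations and derives two simple summaries. The three-class scheme, the ten respondent records, and the function names are hypothetical and chosen only for illustration; actual mobility research relies on large survey samples and log-linear models rather than raw rates.

```python
from collections import Counter

# Hypothetical (origin class, destination class) pairs, one per respondent.
records = [
    ("nonmanual", "nonmanual"), ("nonmanual", "nonmanual"), ("nonmanual", "manual"),
    ("manual", "nonmanual"), ("manual", "manual"), ("manual", "manual"),
    ("manual", "manual"),
    ("farm", "manual"), ("farm", "farm"), ("farm", "nonmanual"),
]
CLASSES = ["nonmanual", "manual", "farm"]


def mobility_table(pairs, classes):
    """Cross-classify origins (rows) by destinations (columns)."""
    counts = Counter(pairs)
    return [[counts[(origin, dest)] for dest in classes] for origin in classes]


def outflow_rates(table):
    """Row proportions: where people from each origin class end up."""
    return [
        [cell / sum(row) if sum(row) else 0.0 for cell in row]
        for row in table
    ]


def immobility_rate(table):
    """Share of all respondents on the main diagonal (class inheritance)."""
    total = sum(sum(row) for row in table)
    diagonal = sum(table[i][i] for i in range(len(table)))
    return diagonal / total


table = mobility_table(records, CLASSES)
rates = outflow_rates(table)
stayers = immobility_rate(table)
```

With these invented records, 6 of the 10 respondents remain in their class of origin. Comparing such tables across subpopulations or countries is the kind of exercise the text refers to, although raw rates of this sort confound differences in class structure with differences in fluidity.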
5. Ascriptive Solidarities To this point, the discussion has focused on class and socioeconomic inequalities, while considerations of race, ethnicity, and gender have been brought in only indirectly. This omission reflects, however improperly, the historical development of the field; that is, status groups were once regarded as secondary forms of affiliation, whereas class-based ties were seen as more fundamental and decisive determinants of social and political action. The first step in the intellectual breakdown of this approach was the fashioning of multidimensional models of stratification (e.g., Lipset and Bendix 1959) that moved beyond a strict class analytic emphasis on economic inequalities. The early multidimensionalists thus emphasized that social behavior could only be understood by taking into account all status group memberships and the complex ways in which these interacted with one another. There was much concern, in particular, that ‘inconsistent’ statuses (e.g., upper class and African American) might generate personal stress that would ultimately be resolved through extremist political action. Although this old variant of multidimensionalism has fallen out of fashion, new variants have been pursued on two intellectual fronts, neither of which similarly emphasizes the stressfulness of multiple or inconsistent statuses. In the postmodernist variant, the essentialism of conventional class analytic theorizing is again rejected forcefully, with the claim being that class affiliations are by no means fundamental and that various nonclass statuses are more salient in many situations. Under some postmodern formulations, even the categories of race, ethnicity, and gender have no privileged position, and instead the presumption is that individuals are simply a congeries of manifold, situationally invoked statuses. 
As the British sociologist Saunders (1989) puts it, ‘On holiday in Spain we feel British, waiting for a child outside the school gates we are parents, shopping in Marks and Spencer we are consumers, and answering questions, framed by sociologists with class on the brain, we are working class’ (pp. 4–5). This approach, which has not yet informed much empirical research, might be usefully contrasted to the ‘intersectionist’ accounts of Collins (1990), where the categories of race, class, and gender are seen as master statuses that form a new holy trinity of stratification analysis. The theoretical framework motivating this approach is not well-developed, but the implicit claim seems to be that the most fundamental subcultures in the modern context are defined by the intersection of race, class, and gender categories (e.g., black working-class women, white middle-class men). The social spaces defined by particular combinations of these statuses thus shape the experiences, lifestyles, and life chances of individuals and define the settings in which interests typically emerge. The obvious effect of this approach is to invert the traditional post-Weberian perspective on status groupings; that is, whereas orthodox multidimensionalists described the stress experienced by individuals in inconsistent statuses, these new multidimensionalists emphasize the shared interests and cultures generated within commonly encountered status sets. Among stratification scholars, the attention paid to issues of race, ethnicity, and gender has thus burgeoned of late, and entirely new literatures within these subfields have of course emerged (e.g., see Feminist Political Theory and Political Science, Ethnic Conflicts, Racial Relations). In organizing these literatures, one might usefully distinguish between (a) macro-level research addressing the structure of ascriptive solidarities and their relationship to class formation, and (b) attainment research exploring the effects of race, ethnicity, and gender on individual life chances. 
At the macrolevel, scholars examine such issues as the social processes by which ascriptive categories (e.g., ‘white,’ ‘black’) are constructed, the sources and causes of ethnic conflict and solidarity, and the relationship between patriarchy, racism, and class-based forms of organization. The microlevel research tradition emphasizes, by contrast, such topics as the size and causes of the gender gap in income, the sources of occupational sex segregation, and the effectiveness of various social interventions (e.g., affirmative action, comparable worth initiatives) in reducing ascriptive inequalities.
6. The Future of Stratification It is instructive to conclude by contrasting structural and cultural accounts of stratificatory change. Although there are many variants of structuralism, a common starting point is the claim that human and political capital are replacing economic capital as the principal stratifying forces in advanced industrial society. In the most extreme versions of this claim, the
old class of moneyed capital is represented as a dying force, and a new class of intellectuals, managers, or party bureaucrats is assumed to be on the road to power (e.g., Gouldner 1979). There is of course much criticism of ‘new class’ interpretations of this sort. The (orthodox) Marxist stance is that ‘news of the demise of the capitalist class is ... somewhat premature’ (Zeitlin 1982, p. 216), whereas the contrasting position taken by Bell (1973) is that neither the old capitalist class nor the so-called new class will have unfettered power in the postindustrial future. To be sure, there is widespread agreement among postindustrial theorists that human capital is becoming a dominant form of property, yet this need not imply that ‘the amorphous bloc designated as the knowledge stratum has sufficient community of interest to form a class’ (Bell 1987, p. 464). As is well known, Bell (1973) also argues that human capital (e.g., education) will become the main determinant of life chances, if only because the expansion of the professional and technical sectors serves to upgrade job skills and thus requires a well-trained workforce. Although the returns to education are indeed increasing as predicted, the occupational structure is not upgrading quite as straightforwardly as Bell (1973) suggested; and various ‘pessimistic versions’ of postindustrialism have accordingly emerged. In the US variant of such pessimism, the main concern is that postindustrialism leads to a ‘declining middle’ and consequent polarization, as manufacturing jobs are either rendered technologically obsolete or exported to less developed countries where labor costs are lower. These losses may be compensated by the predicted growth in the service sector, yet the types of service jobs that have emerged are quite often low skill, routinized, and accordingly less desirable than Bell imagined. 
In Europe, the same low-skill service jobs are less commonly found, with the resulting occupational structure more closely approximating the highly professionalized world that Bell envisaged. The European pessimists are nonetheless troubled by the rise of mass unemployment and the associated emergence of ‘outsider classes’ that bear disproportionately the burden of unemployment. In both the European and US cases, the less-skilled classes are therefore losing out in the market, either by virtue of unemployment and exclusion (i.e., Europe) or low pay and poor prospects for advancement (i.e., the US). The new pessimists thus anticipate a ‘resurgent proletarian underclass and, in its wake, a menacing set of new class correlates’ (Esping-Andersen 1999, p. 95). The driving force behind these accounts is, of course, structural change of the sort conventionally described by such terms as industrialism, postindustrialism, and post-Fordism. By contrast, cultural accounts of change tend to de-emphasize these forces or to cast them as epiphenomenal, with the focus thus shifting to the independent role of ideologies, social
movements, and cultural practices in changing stratification forms. This approach underlies, for example, all forms of postmodernism that seek to represent ‘new social movements’ (e.g., feminism, ethnic and peace movements, environmentalism) as the vanguard force behind future stratificatory change. As argued by Beck (1999), the labor movement can be seen as a fading enterprise rooted in the old conflicts of the workplace and industrial capitalism, whereas new social movements provide a more appealing call for collective action by virtue of their emphasis on issues of lifestyle, personal identity, and normative change. Under this formulation, the proletariat is stripped of its privileged status as a universal class, and cultural movements emerge as an alternative force ‘shaping the future of modern societies’ (Haferkamp and Smelser 1992, p. 17). Although no self-respecting postmodernist will offer up a fresh ‘grand narrative’ to replace that of discredited Marxism, new social movements are nonetheless represented within this subtradition as a potential source of change, albeit one that plays out in fundamentally unpredictable ways. The final, and more prosaic, question that might be posed is whether changes of the preceding sort presage a general decline in the field of stratification itself. It could well be argued that Marxian and neo-Marxian models of class will decline in popularity with the rise of postmodern stratification systems and the associated uncoupling of class from lifestyles, consumption patterns, and political behavior. 
This line of reasoning is not without merit, but it is worth noting that (a) past predictions of this sort have generated protracted debates that, if anything, have re-energized the field; (b) the massive facts of economic, political, and honorific inequality will still be with us even if narrowly conceived models of class ultimately lose out in such debates; and (c) the continuing diffusion of egalitarian values suggests that all departures from equality, no matter how small, will be the object of considerable interest among sociologists and the lay public alike. The latter sensitivity to all things unequal bodes well for the future of the field even in the (unlikely) event of a long-term secular movement toward diminishing inequality. See also: Class: Social; Democracy; Democratic Theory; Egalitarianism: Political; Egalitarianism, Sociology of; Equality and Inequality: Legal Aspects; Equality of Opportunity; Equality: Philosophical Aspects; Income Distribution; Income Distribution: Demographic Aspects; Inequality; Inequality: Comparative Aspects; Mobility: Social; Poverty: Measurement and Analysis; Poverty, Sociology of; Racism, Sociology of; Social Inequality in History (Stratification and Classes); Social Mobility, History of; Socialism; Socialism: Historical Aspects; Urban Geography; Urban Poverty in Neighborhoods; Wealth Distribution
Bibliography
Beck U 1999 World Risk Society. Polity Press, Cambridge, UK
Bell D 1973 The Coming of Post-industrial Society. Basic Books, New York
Bell D 1987 The new class: A muddled concept. In: Heller C S (ed.) Structured Social Inequality, 2nd edn. Macmillan, New York, pp. 455–68
Blau P M, Duncan O D 1967 The American Occupational Structure. Wiley, New York
Collins P H 1990 Black Feminist Thought: Knowledge, Consciousness, and the Politics of Empowerment. Unwin Hyman, Boston
Dahrendorf R 1959 Class and Class Conflict in Industrial Society. Stanford University Press, Stanford, CA
Erikson R, Goldthorpe J H 1994 The Constant Flux: A Study of Class Mobility in Industrial Societies. Clarendon Press, Oxford, UK
Esping-Andersen G 1999 Social Foundations of Postindustrial Economies. Oxford University Press, Oxford, UK
Fischer C S, Hout M, Sánchez Jankowski M, Lucas S R, Swidler A, Voss K 1996 Inequality by Design: Cracking the Bell Curve Myth. Princeton University Press, Princeton, NJ
Gouldner A 1979 The Future of Intellectuals and the Rise of the New Class. Seabury, New York
Grusky D B, Sørensen J B 1998 Can class analysis be salvaged? American Journal of Sociology 103: 1187–234
Haferkamp H, Smelser N J 1992 Introduction. In: Haferkamp H, Smelser N J (eds.) Social Change and Modernity. University of California Press, Berkeley, CA, pp. 1–23
Hauser R M, Featherman D L 1977 The Process of Stratification: Trends and Analyses. Academic Press, New York
Hauser R M, Warren J R 1997 Socioeconomic indexes of occupational status: A review, update, and critique. In: Raftery A (ed.) Sociological Methodology, 1997. Blackwell, Cambridge, MA, pp. 177–298
Herrnstein R J, Murray C 1994 The Bell Curve: Intelligence and Class Structure in American Life. Free Press, New York
Jencks C, Perman L, Rainwater L 1988 What is a good job? A new measure of labor-market success. American Journal of Sociology 93: 1322–57
Jencks C, Smith M, Acland H, Bane M J, Cohen D, Gintis H, Heyns B, Michelson S 1972 Inequality: A Reassessment of the Effect of Family and Schooling in America. Basic Books, New York
Lenski G E 2001 New light on old issues: The relevance of ‘really existing socialist societies’ for stratification theory. In: Grusky D B (ed.) Social Stratification: Class, Race, and Gender in Sociological Perspective, 2nd edn. Westview, Boulder, CO, pp. 77–84
Lipset S M 1968 Social class. In: Sills D L (ed.) International Encyclopedia of the Social Sciences. Macmillan, New York, pp. 296–316
Lipset S M, Bendix R 1959 Social Mobility in Industrial Society. University of California Press, Berkeley, CA
Marx K [1894] 1972 Capital, 3 vols. Lawrence and Wishart, London
Parkin F 1971 Class Inequality and Political Order: Social Stratification in Capitalist and Communist Societies. Praeger, New York
Saunders P R 1989 Left write in sociology. Network 44: 3–4
Sewell W H, Haller A O, Portes A 1969 The educational and early occupational attainment process. American Sociological Review 34: 82–92
Shils E 1968 Deference. In: Jackson J A (ed.) Social Stratification. Cambridge University Press, New York, pp. 104–32
Warner W L, Meeker M, Eells K 1949 Social Class in America. Science Research Associates, Chicago
Wright E O 1997 Class Counts: Comparative Studies in Class Analysis. Cambridge University Press, Cambridge, UK
Zeitlin M 1982 Corporate ownership and control: The large corporation and the capitalist class. In: Giddens A, Held D (eds.) Classes, Power, and Conflict. University of California Press, Berkeley, CA, pp. 196–223
D. B. Grusky
Social Support and Health The concept of social support as an important factor in health and illness is not new. ‘Friendship is a basic human need along with food, shelter, and clothing. We naturally desire to love other human beings and to be loved by them. A totally loveless life—a life without friends of any sort—is a life deprived of much needed good.’ These thoughts were expressed by Aristotle, 350 years BC (Alcalay 1983). During and toward the end of the Middle Ages the concept was even addressed as a possible remedy for illness. In 1599 Paracelsus, a pharmacist and natural scientist, actually prescribed ‘love as the best possible cure for several diseases.’ That love and affection ‘influence life and death in both animals and humans’ was further observed by a Russian ethologist (Kropotkin 1908). In this article some of the empirical evidence that ‘love and affection influence life and death’ will be reviewed. The distinctions between social networks and social support are explained; the definition, concepts, and functions of social support are further described; and their impact on health and disease are demonstrated and discussed. With a focus on coronary heart disease (CHD), the most common cause of death among men and women in the developed world, the hazards of lack of social support are disentangled, and differences and similarities in men and women are discussed. Finally, implications for therapeutic interventions are considered. Systematic and conclusive empirical studies of social support and health, however, did not begin until the 1970s, when this field emerged as a new area of research, making a bold link between people’s social ties and social networks on the one hand and population morbidity and mortality on the other. In the pioneering work by Berkman and Syme on residents of Alameda County, California, modern epidemiological tools were applied to demonstrate what had long been sensed to be true: social ties are protective against ill health. 
Soon a number of population-based studies mainly from the USA and Scandinavia confirmed the findings
of Berkman and Syme. These findings were systematically interpreted and discussed by House et al. (1988) who concluded that the impact of lack of social ties was similar to that of smoking. With such emphasis and visibility of the concept, a new research field was established. A general definition of social support was agreed upon, which was broad enough to incorporate most of the research traditions in the field: ‘Social support is the resource provided by other persons. By viewing social support in terms of resources—potentially useful information or things—we allow for the possibility that support may have negative as well as positive effects on health and well-being’ (Cohen and Syme 1985). Soon, however, researchers began to ask why and how social ties might influence mortality and morbidity; in particular investigators were focusing on the function and contents of social ties and their assessment (Cohen and Syme 1985). On the one hand, large epidemiological studies of populations typically would find that mortality increased with a decreasing number and frequency of social contacts. On the other hand, the more sophisticated measures of social context, social anchorage, social environment, and social support were limited to specific indicators of disease such as blood pressure in smaller selected groups of persons or patients. Thus two types of research traditions developed within this area—social network epidemiology (see Social Integration, Social Networks, and Health) and social support psychology. Attempts to unite these traditions have not always been successful. Where the first type of research mostly examined all causes of mortality or general well-being as end-points, the latter has looked more closely at specific disease end-points, such as cardiovascular disease. The latter has become a model for research on social support and disease, and will be the focus here, because heart diseases are common, develop gradually, and are found in all populations. 
Coronary heart disease (CHD) is the most common of cardiovascular diseases and the most common cause of death in highly developed and industrialized countries, and it is rapidly increasing in developing nations of the third world. In many countries it is also the most common cause of hospitalizations and an important factor in work absenteeism. The clinical manifestations of CHD are dependent on the extent of underlying atherosclerotic changes of the coronary arteries. With more advanced atherosclerosis, the risk of an acute myocardial infarction (AMI) or sudden cardiac death (SCD) is increased. The occurrence of both atherosclerosis and clinical manifestations of CHD increases with increasing age, in men as in women. Below the age of 65, clinical manifestations of CHD, including myocardial infarction and sudden death are rare in women. In particular before menopause at age 50, women seem to be protected. Women are also relatively spared in the following decade, between 50
and 60 years of age. By the age of 60, sex differences are beginning to vanish. This may be one of the reasons why research on CHD has been so remarkably focused on middle-aged men under age 65. Most research on coronary risk factors and factors that are protective against CHD has been carried out on middle-aged men. Therefore knowledge about heart disease in women is less evidence-based than that about men. This is particularly true of the psychosocial influences on heart disease and its risk factors.
1. The Social Environment and the Heart

Lifestyle and daily habits (the way we live in the broadest sense) strongly influence the atherosclerotic process. If we smoke, lead a sedentary life, or eat too much fat and too little fiber, the risk of premature atherosclerosis and early heart disease is increased. It is also well known that many social environments may enhance or prevent the improvement of lifestyle and behavioral risk factors. Therefore, psychosocial influences, including the concepts of stress and social support, should be recognized as relevant, both in cardiology and in general health care.

Support from social networks may help the individual to deal with stress in many ways. Network members may provide help to recognize and evaluate stressors in the environment, and they may also be helpful in identifying solutions to overcome the stressor (information/appraisal support). This means that they may provide tools to improve problem-oriented coping, and thereby enhance stress reduction. Network members may also provide emotional support that may counteract the individual’s feeling of loneliness and hopelessness, and thus help him or her to deal more efficiently with problems. Supportive actions may also relieve stress by reducing tension and anxiety, and counteracting depressive reactions (emotional support). Furthermore, supportive actions may also be helpful as they provide practical help (tangible support). Therefore, the empirical evidence of social support effects may be found at various stages of the pathogenic processes leading to heart disease. An epidemiological as well as a behavioral and psychophysiological view of the relationship is scientifically sound (see Social Support and Stress).
2. Social Support and Cardiovascular Health in Men

Large epidemiologic studies of social support and social networks have mainly focused on mortality as an endpoint. Scarce networks and few contacts are generally found to increase mortality risk. The incidence of acute cardiac events such as myocardial infarction or sudden cardiac death, however, has
rarely been investigated in relation to social support. Furthermore, the most commonly studied aspect of social support is marital status. In men, marriage or cohabiting seems to improve cardiovascular health and to decrease mortality (Case et al. 1992). In women, as we shall see below, the picture is more complex. Several large-scale epidemiological studies have been conducted in Sweden and Scandinavia. Scandinavian countries have a relatively homogeneous, easily assessed population, living under good material conditions, where poverty, in its concrete sense, is rare. This does not mean that loneliness and lack of social support are irrelevant. On the contrary, where the basic physical needs are met and a decent standard of living is provided by public or governmental services, social ties and relations with other people tend to become diminished. In addition, the cold and dark climate characteristic of these countries does not promote social interaction, but rather may prevent naturally occurring social encounters. In one of the early population-based studies of risk factors for CHD, the study of 50-year-old men born in Gothenburg (men born in 1913, in 1923, and in 1933), the impact of poor social networks and lack of social support was first assessed. Scarce social ties and social isolation increased both total and cardiovascular mortality (Welin et al. 1985). In further studies of men born in 1914 in the city of Malmö in south Sweden this observation was confirmed. The provision of satisfactory social ties and adequate social and psychological anchorage was essential to good health and longevity in men (Hanson 1989). In a subsequent assessment of the Gothenburg study group, a random sample of 776 men, who were found healthy on initial examination, was followed for incidence of AMI and death from CHD.
Lack of social support, as measured by the Interview Schedule for Social Interaction (ISSI), an instrument which was modified for population-based studies in our laboratory, was found to be an independent predictor of AMI and death from CHD, including sudden cardiac death. This instrument is also described in Rosengren et al. (1993). These effects were maintained even after controlling for the effects of standard cardiovascular risk factors (hypertension, hyperlipidemia, smoking, diabetes, lack of exercise, etc.). The effect of lack of support was comparable in magnitude to that of smoking (Rosengren et al. 1993). All other risk factors, including elevated cholesterol or fibrinogen levels, hypertension, and obesity, were weaker predictors. When smokers and non-smokers were analyzed separately, risk gradients between men with high and men with low support were similar in magnitude. Male smokers with little social support were at highest risk for CHD whereas male non-smokers with large networks and good support were at lowest risk (7.1 percent vs. 0.8 percent) (Rosengren et al. 1993). These and other studies have demonstrated that lack of social support and its various functions are
predictors of disease incidence, and exert an effect that is independent of physiological and lifestyle risk factors. However, once signs and symptoms of chronic disease have occurred one would expect them to override the impact of ‘soft’ risk factors such as social support. Therefore the study of patients suffering from chronic disease, such as coronary disease, is crucial. In an early landmark study of over two thousand male survivors of acute myocardial infarction, Ruberman et al. (1984) showed that, even in the presence of clinical coronary disease, social isolation and stress contributed significantly to the increase in mortality risk. These risk factors were more common in men with low education. This early finding of a social gradient in health and disease has subsequently been frequently discussed and confirmed in both population- and patient-based studies (Haan et al. 1987). In general, findings in men support the notion that one way in which the health hazards of social disadvantage are expressed is through the lack of adequate social networks and support. Men with high education and high job positions usually have a vast network of friends and acquaintances from professional and other organizations or groups, which they can draw upon for appraisal and tangible support in times of hardship. In low job positions, networks are known to be smaller and less effective; social activities and social interactions are also known to be more centered on family and family members. Therefore, the specific functions of information and concrete practical help, which are unevenly distributed across social strata and found to be crucial for maintenance of good health (Hanson 1989), may explain why cardiovascular health and health in general are better in higher social classes.
3. Social Support and Emotions

Social supports are provided to the individual by his/her social environment in an ongoing interactive process. This interactive process depends not only on the availability of supportive structures, but also on individual personal characteristics, such as the coronary prone behavior and/or emotional reactive patterns, such as depressive and anxious feelings. Whether the individual is able to receive and to provide social supports makes a difference for health outcomes. This includes the frequently studied coronary prone and hostile behaviors, which particularly in men were predictive of new incidence of coronary attacks (Brummett et al. 1998). If considered in the presence of supporting social structures some of these findings are intuitively easier to understand. In a Swedish study of 150 male cardiac patients who were followed for ten years, evidence of
such interactive effects on disease risk was found, in addition to clinical cardiac prognostic factors. The psychosocial factors assessed in these patients included social integration and social support as well as the Type A behavior pattern, using the original Structured Interview procedure. In accordance with other studies at the time, Type A behavior was not found to worsen prognosis in cardiac patients (Ragland and Brand 1988): 24 percent of men with Type A as compared to 22 percent of men with Type B died from CHD during follow-up. In contrast, social isolation was a prominent risk factor for poor prognosis. Men who said they practically never engaged in social activities with others for leisure were defined as socially isolated. They had about three times the mortality risk of men who said they were socially active at least once a month (Orth-Gomér et al. 1988). The worst prognosis, however, was found in Type A men who were socially isolated, their ten-year mortality risk being 69 percent, whereas, in socially active men, the behavior type made no difference (21 percent in Type B and 17 percent in Type A men) (Orth-Gomér et al. 1988). These results suggest that investigating either social environmental or individual behavioral factors alone is insufficient. Instead an attempt to integrate social and physical environment with individual psychological reactions appears to be more meaningful. Such models have been successfully applied and tested. An important example is the role of depressive reactions and their relation to outcome in patients with coronary disease. Several patient-based studies have shown that depressive symptoms and feelings increase mortality risk in patients post AMI (Barefoot et al. 1996, Frasure-Smith et al. 1995). These depressive reactions are tied into a social context that is much influenced by the individual and collective capacity for coping with stressors (Holahan et al. 1995).
Such coping capacity will in its turn depend on the provision of social supports. Therefore the more recent studies of psychosocial factors in women have been conceptualized taking these integrative aspects into consideration.
4. Social Support in Relation to CHD in Women

Women have rarely been studied in regard to psychosocial factors and CHD. This is true for both social contextual factors, such as stress and social support, and the more individually based aspects of depression, coping, mastery, etc. It is interesting to note that many studies of coronary risk profiles did include women, but usually in numbers too small to be conclusive about their risk profile or about gender differences. Nevertheless findings are often generalized to both men and women. This was one of several reasons for initiating the Stockholm Female Coronary Risk Study (FemCorRisk Study).

Figure 1. Percentage of women surviving free of cardiac events according to social support and depressive symptoms. *Hazard ratio (95 percent CI), adjusted for age; P for trend = 0.01.

The FemCorRisk Study is a community-based case-control study, comprising 300 women below age 65 who were admitted to coronary care units with acute coronary syndromes in greater Stockholm during a three-year period. Women with CHD were compared to a random sample of 300 age-matched women from the normal population, obtained via the official census registry of the same area. Psychosocial and biological risk factors were examined using interview, survey, psychosocial, and physical examination techniques. The core study objectives were: (a) to examine psychosocial risk factor effects on CHD in women; (b) to examine psychosocial determinants of lifestyle and the psychobiological mechanisms involved; (c) to describe protective factors and mechanisms which buffer against noxious psychosocial influences. Several different strategies for analyses of the results have been applied, including case-control comparison, cross-sectional analyses among healthy women, prospective follow-up of women with cardiac disease, etc. Lack of social support and social isolation were found to be among the major stressors identified in this study group. As the sick role and the biological disease process itself may influence both social activities and social support in most people, we relied on follow-up
rather than cross-sectional comparisons. Case-control comparisons were used to adjust for risk factors such as education, occupation, etc. Separate analyses of underlying coronary artery disease were undertaken, with patients being examined with quantitative coronary angiography (QCA). QCA was carried out in 131 of the original 292 patients, based on available space at the angiographic laboratory. Patients who did and did not undergo angiography were compared, but differences in coronary risk factor profile were not found. Measures of severity (presence and number of stenoses greater than 50 percent) and measures of extent (number of stenoses greater than 20 percent) of coronary artery disease were evaluated. After adjustment for age alone, lack of social support was associated with both measures. After adjustment for lifestyle and physiological risk factors, the risk ratio for stenosis greater than 50 percent in women with poor as compared to strong social support was 2.5 (Orth-Gomér et al. 1998). Women with poor social support also had a greater number of less hazardous stenoses, obstructing the coronary lumen by only 20 percent. After multivariate control for all other risk factors, however, this association was of only borderline significance. These associations were
not explained by severity of anginal symptoms, self-rated subjective health status, or other clinical indicators of poor prognosis. Subsequently, the impact of social isolation and depressed affect on recurrent events was examined in all women patients (Horsten et al. 2000). They were followed for a median of five years after admission for the acute coronary event. By means of the centralized healthcare system in Sweden, complete follow-up information for recurrent events (hospitalizations and deaths) was obtained (Alfredsson et al. 1997). A total of 81 patients (27 percent) either died from cardiovascular causes or had a recurrent AMI or a revascularization procedure. Women patients who at initial examination showed signs of depressed affect (two or more symptoms of depression as measured by a short instrument by Pearlin et al. 1981) had twice the risk of recurrent events as compared to women who were not depressed. When social isolation was added to the model a further increase in risk was noted. As shown in Fig. 1, 33 percent of those who showed both depressed affect and social isolation had new events as compared to 9 percent of those who were free from both these problems, an almost fourfold difference. Prognostic risk factors including age, menopausal status, diagnosis at index event, symptoms of heart failure, diabetes, systolic blood pressure, smoking, body mass index, exercise, total cholesterol, triglycerides, HDL, and therapy with beta-blockers or ACE inhibitors did not explain the effects of psychosocial determinants (Horsten et al. 2000).
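The ‘almost fourfold’ contrast quoted above is a crude ratio of the two reported event proportions. A minimal Python sketch of that arithmetic follows. The function name `risk_ratio` is illustrative, and the calculation uses only the two percentages given in the text, so it is unadjusted and is not the age-adjusted hazard ratio referred to in the caption of Fig. 1.

```python
def risk_ratio(risk_exposed: float, risk_reference: float) -> float:
    """Crude ratio of two event proportions (both expressed as fractions)."""
    if risk_reference <= 0:
        raise ValueError("reference risk must be positive")
    return risk_exposed / risk_reference


# Proportions of women with new cardiac events, as reported in the text.
both_problems = 0.33      # depressed affect and social isolation
neither_problem = 0.09    # free from both

ratio = risk_ratio(both_problems, neither_problem)
print(f"crude risk ratio: {ratio:.1f}")
```

The ratio comes out a little below four, which matches the wording in the text. A crude ratio of this kind ignores follow-up time and covariates, which is why the study itself reports adjusted estimates.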
5. Work Stress, Family Stress, and CHD in Women

Psychosocial stress at work has been consistently studied and shown to increase the risk of CHD (Karasek and Theorell 1990). Using the concept of high demand and low control presented by Karasek and Theorell, an increased CHD risk ratio of about two has been found in male workers. Although women have not been as extensively studied, it is suggested that they have about the same CHD incidence rates associated with work stress as men do (Schnall et al. 1994). When analyzing the contribution of high demand and low control in the case-control comparison of the FemCorRisk Study an odds ratio of about two was found. A further, and stronger, source of stress was family problems. Based on the concept presented by Frankenhaeuser et al. (1989), exposure to both stressors at work and those that originate from outside work was analyzed in a structured interview procedure. The predictive values of the two different work stress measures were almost identical. In the case-control comparison the odds ratio for work stress was 2.3 whereas that for family stress originating from a
dyadic relationship was 2.9. In addition, both work stress and family stress were associated with low SES. As expected, women with unskilled jobs had lower control over their jobs, but they also had increased family stress, in particular from separations. Furthermore, the effects of these stressors on event-free survival in women with CHD were examined. The risk of a recurrent event in women patients with severe marital stress as compared to no or little stress was 2.8 after control for age, menopausal status, family history of CHD, symptoms of chest pain and heart failure, diabetes, smoking, and levels of triglycerides and HDL cholesterol. The actuarial five-year probability of being free from cardiac recurrent events in women without marital stress was 85 percent. In women with severe marital problems it was 65 percent (Orth-Gomér et al. 2000). This effect was modified by individual capacity to cope with stressors, as assessed by a short instrument describing degree of mastery, obtained from Pearlin et al. (1981). This self-rated scale attempts to quantify the extent to which the individual perceives that he/she can deal with and master problems and stressors in everyday life. In women patients with poor mastery the experience of severe marital stress increased the risk of recurrent events by a factor of eight. In women with good mastery there was no significant effect of marital stress on risk of recurrences. Furthermore, the ability to master or cope with problems was related to socioeconomic status (women with low SES having poorer mastery scores), depressive symptoms, and social isolation (Wamala 2001). These findings may be summarized as follows: women patients who experienced marital stress, when assessed after an acute coronary event, had a worsened prognosis if they also had a low education or an unskilled job, if they had depressive symptoms, if they were socially isolated, and if they were poor at mastering difficulties and problems of everyday life.
Numbers in subgroups were too small to analyze risk ratios in these different categories, but it is probable that all these factors are interrelated and overlap. This knowledge can be used to implement and improve preventive measures aimed at secondary prevention in women. Such interventions are ongoing, both in the US (ENRICHD) and in Sweden (Healthier Female Hearts in Stockholm).
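The five-year event-free probabilities reported above for women without and with severe marital stress (85 percent and 65 percent) can be restated as cumulative event probabilities, which makes the size of the contrast easier to see. The following Python sketch is an illustration only: the helper name `cumulative_incidence` is hypothetical, and the resulting crude ratio is not expected to equal the covariate-adjusted figure of 2.8 quoted in the text.

```python
def cumulative_incidence(event_free_probability: float) -> float:
    """Convert a probability of remaining event-free into a probability of an event."""
    if not 0.0 <= event_free_probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return 1.0 - event_free_probability


# Five-year event-free probabilities, as reported in the text.
events_no_stress = cumulative_incidence(0.85)
events_severe_stress = cumulative_incidence(0.65)

crude_ratio = events_severe_stress / events_no_stress
absolute_difference = events_severe_stress - events_no_stress

print(f"crude ratio of event probabilities: {crude_ratio:.2f}")
print(f"absolute difference: {absolute_difference:.2f}")
```

Reporting the absolute difference alongside the ratio is a common design choice, since a ratio alone says nothing about how many patients are affected.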
6. Gender Differences in Cardiovascular Effects

To our knowledge, the harmful effects of family stress, depressed affect, poor mastery, and social isolation have not been investigated in a similar way in men. However, in an earlier study of Swedish men with CHD, the question was asked which life domain had generated the most severe stress experiences, work,
family, education, or other. A majority of men (80 percent) ascribed their worst stress periods to the work situation, and only 10 percent to the family and spousal problems. This apparent gender difference is further supported by recent data from two large surveys of men and women of the general population in Massachusetts (Shumaker and Hill 1991). When asked which person was the most important provider of social support, men were more than twice as likely to name their spouse, whereas women were most likely to name a friend or relative, usually female, as their primary supporter. In addition, women were more likely than men to report that they give more than they receive in dyadic relationships. In the FemCorRisk Study, 89 percent of the women said they were their spouse’s closest confidants, whereas only 75 percent of women named their spouse as their closest confidant. In contrast to previous studies in men (see Gender and Cardiovascular Health), being married or cohabiting did not in itself provide any extra protection from cardiovascular ill health. Instead, for these women, the strain from a problematic spousal relationship significantly contributed to a poor prognosis over and above the effect of clinical cardiac predictors. Therefore, the determinants, the protective health effects, and the mechanisms of social support may differ profoundly between men and women. Consequently they need to be separately conceptualized, examined, measured and analyzed, and implemented for use in clinical and community-based interventions and preventive efforts. In both men and women, however, the protective effects of social support and the health hazards of social isolation are strong and complex. Further research on processes and pathways as well as methods to implement this knowledge in preventive efforts is highly needed.
See also: Family Health; Gender and Cardiovascular Health; Gender and Physical Health; Illness: Dyadic and Collective Coping; Social Integration, Social Networks, and Health; Social Support and Recovery from Disease and Medical Procedures; Social Support and Stress; Social Support, Psychology of; Stress at Work; Support and Self-help Groups and Health
Bibliography

Alcalay R 1983. In: Alcalay R (ed.) Ethics. Elsevier, Amsterdam, pp. 71–88
Alfredsson L, Hammar N, Hodell A, Spetz C-L, Åkesson L-O, Kahan T, Ysberg A-S 1997 Värdering av diagnoskvaliteten för akut hjärtinfarkt i tre svenska län 1995 [Validation of the quality of diagnosis for acute myocardial infarction in three Swedish districts]. Socialstyrelsen 84–8
Barefoot J C, Helms M J, Mark D B, Blumenthal J A, Califf R M, Haney T L, O’Connor C M, Siegler I C, Williams R B 1996 Depression and long-term mortality risk in patients with coronary artery disease. American Journal of Cardiology 78: 613–17
Brummett B H, Babyak M A, Barefoot J C, Bosworth H B, Clapp-Channing N E, Siegler I C, Williams R B, Mark D B 1998 Social support and hostility as predictors of depressive symptoms in cardiac patients one month following hospitalization: A prospective study. Psychosomatic Medicine 60: 707–13
Case B R, Moss A J, Case N 1992 Living alone after myocardial infarction. Journal of the American Medical Association 267: 520–4
Cohen S, Syme S L 1985 Social Support and Health. Academic Press, New York
Frankenhaeuser M, Lundberg U, Fredriksson M, Melin B, Tuomisto M, Myrsten A-L, Hedman M, Bergman-Losman B, Wallin L 1989 Stress on and off the job as related to sex and occupational status in white-collar workers. Journal of Organizational Behavior 10: 321–46
Frasure-Smith N, Lesperance F, Talajic M 1995 Depression and 18-month prognosis after myocardial infarction. Circulation 91: 999–1005
Haan M, Kaplan G A, Camacho T 1987 Poverty and health: Prospective evidence from the Alameda County Study. American Journal of Epidemiology 125: 989–98
Hanson B S 1989 Social network and social support influence mortality in elderly men. The prospective population study of ‘Men born in 1914’ Malmö, Sweden. American Journal of Epidemiology 130(1): 100–11
Holahan C J, Holahan C K, Moos R H, Brennan P L 1995 Social support, coping, and depressive symptoms in a late-middle-aged sample of patients reporting cardiac illness. Health Psychology 14: 152–63
Horsten M, Mittleman M, Wamala S P, Schenck-Gustafsson K, Orth-Gomér K 2000 Social isolation and depression in relation to prognosis of coronary heart disease in women. European Heart Journal 21: 1072–80
House J S, Landis K R, Umberson D 1988 Social relationships and health. Science 241: 540–5
Karasek R, Theorell T 1990 Healthy Work: Stress, Productivity, and the Reconstruction of Working Life. Basic Books, New York
Kropotkin P 1908 Gegenseitige Hilfe in der Tier- und Menschenwelt. Thomas, Leipzig, Germany
Orth-Gomér K, Horsten M, Wamala S P, Mittleman M A, Kirkeeide R, Svane B, Rydén L, Schenck-Gustafsson K 1998 Social relations and extent and severity of coronary artery disease. The Stockholm Female Coronary Risk Study. European Heart Journal 19: 1648–56
Orth-Gomér K, Undén A-L, Edwards M-E 1988 Social isolation and mortality in ischemic heart disease. A 10-year follow-up study of 150 middle-aged men. Acta Medica Scandinavica 224: 205–15
Orth-Gomér K, Wamala S P, Horsten M, Schenck-Gustafsson K, Schneiderman N, Mittleman M A 2000 Marital stress worsens prognosis in women with coronary heart disease. The Stockholm Female Coronary Risk Study. Journal of the American Medical Association 284(23): 3008–14
Paracelsus [Theophrastus Bombastus von Hohenheim] 1599 Labyrinthus medicorum errantium. Hanoviae
Pearlin L L, Menagham E G, Lieberman M A, Mullan J T 1981 The stress process. Journal for Health Social Behavior 22: 337–56
Ragland D R, Brand R J 1988 Type A behavior and mortality from coronary heart disease. New England Journal of Medicine 318: 65–9
Rosengren A, Orth-Gomér K, Wedel H, Wilhelmsen L 1993 Stressful life events, social support, and mortality in men born in 1933. British Medical Journal 307: 1102–5
Ruberman W, Weinblatt E, Goldberg J D 1984 Psychosocial influences on mortality after myocardial infarction. New England Journal of Medicine 311: 552–9
Schnall P L, Landsbergis P A, Baker D 1994 Job strain and cardiovascular disease. Annual Review of Public Health 15: 381–411
Shumaker S A, Hill D R 1991 Gender differences in social support and physical health. Health Psychology 10: 102–11
Wamala S P 2001 Stora sociala skillnader bakom kvinnors risk för kranskärlssjukdom. Okvalificerat jobb och slitningar i familjen avgörande faktorer [Large social inequalities in coronary heart disease risk among women. Low occupational status and family stress are crucial factors]. Läkartidningen [Journal of the Swedish Medical Association] 98(3): 177–81
Welin L, Svärdsudd K, Ander-Perciva S, Tibblin G, Tibblin B, Larsson B, Wilhelmsen L 1985 Prospective study of social influences on mortality. The study of men born in 1913 and 1923. The Lancet I: 915–8
Williams R B, Barefoot J C, Califf R M, Haney T L, Saunders W B, Pryor D B, Hlatky M A, Siegler I C, Mark D B 1992 Prognostic importance of social and economic resources among medically treated patients with angiographically documented coronary artery disease. Journal of the American Medical Association 267: 520–4
K. Orth-Gomér
Social Support and Recovery from Disease and Medical Procedures

1. Social Support and Recovery from Disease

Evidence examining the effect of social support on recovery from illness comes from a variety of health areas. These include studies on infectious disease, as well as cancer, heart disease and other chronic illnesses. Under most circumstances, social support has been shown to moderate the negative effects of illness and improve recovery. The effect of social support, however, is dependent to a large degree on the characteristics of the illness and the match between the type of support offered and the needs of the patient. A recent study looked at the susceptibility to developing the common cold in volunteer subjects given nasal drops containing one of two cold viruses (Cohen et al. 1997). The results showed that those volunteers with more diverse social networks had a fourfold lower susceptibility to developing upper respiratory illness than those with a smaller number of social ties. In this study, social contact per se was not as important as the diversity of the individual’s social relationships. The more an individual had regular contact with a variety of social groups such as family, workmates, neighbors and other social networks, the
more resistant he or she was to developing a cold. Another study, examining the role of psychosocial factors in progression to AIDS, found stress and social support to be significant predictors of HIV disease advancement (Leserman et al. 1999). In a follow-up period of five-and-a-half years, those participants who were more satisfied with the level of social support offered by their friends and families at study entry were twice as likely to be free from AIDS as participants who were dissatisfied with their available level of social support. Beyond infectious diseases, social support has been shown to have a positive effect on prognosis and recovery in a range of other medical conditions and illnesses. A study of low-income pregnant women found social support was associated with fewer complications during pregnancy and a higher infant birth weight. Research has found that women who receive higher quality social support prior to delivery give birth to healthier babies and experience less post-partum depression. In chronic illnesses, Penninx et al. (1998) found positive effects on depression for having a partner and having many close relationships, although the strength of the effect differed across illnesses. Other work has shown social support to be associated with lower levels of disability and depression in diabetics, and higher levels of compliance in patients on renal dialysis. In the cardiovascular area, having a close confidant has been shown to have an important effect on future mortality (see Coronary Heart Disease (CHD), Coping with). Williams et al. (1992) followed 1,965 patients with coronary artery disease after angioplasty. After accounting for known medical prognostic factors such as cigarette smoking and family history of heart disease, the most important socioeconomic variable in predicting time to death was the presence or absence of a spouse or confidant.
This finding is consistent with other studies of heart attack patients which have found a spouse to play a protective role in outcome. A study of elderly myocardial infarction (MI) patients found those with low levels of social support were more likely to die in hospital and within six months of their heart attack even after controlling for severity of MI and other risk factors (Berkman et al. 1992). The available evidence suggests that the positive effect of social support is stronger for male than for female MI patients. For example, a 10-year follow-up study of MI patients admitted to Baltimore hospitals found higher survival rates in married compared to unmarried patients. The positive effect of marriage on both in-hospital and follow-up survival was stronger for males than females (Chandra et al. 1983). Recent data suggest that the more similarly the spouse and patient view the illness, the greater the likelihood of effective adjustment and recovery. In work with MI patients, beliefs about the cause of the heart attack
were associated with positive changes in health behavior six months following MI. The belief by patients and spouse that the MI was caused by a faulty lifestyle was significantly related to overall improvements in diet and to an increase in the frequency of strenuous exercise. Attributions about how the illness developed are very common following MI and show a high degree of agreement between patient and spouse. They provide important guideposts for patients and their spouses by directing coping towards controlling a future myocardial infarction. Patient and spousal beliefs that the MI was caused by a faulty lifestyle were precursors to making changes in the types of foods eaten at home and participation in regular exercise. However, the belief that the MI was caused by stress, which is the most common attribution made by patients, had no relationship with later lifestyle changes (Weinman et al. 2000). A large number of studies have examined the effect of social support on adjustment to cancer but fewer have studied the effect of social support on mortality. The literature shows social support interventions to generally have a positive effect on emotional adjustment to cancer. For example, a follow-up study of 294 people with breast, lung, or colorectal cancer found emotional support reduced distress and promoted adjustment to the disease (Ell et al. 1992). This study also found some survival benefits for social support in women with localized breast cancer. Some researchers have directly investigated the value of social support by using it in controlled trials with chronic illness populations. Two randomized trials seem to show some health benefits for social support interventions. Spiegel et al. (1989) randomized women with metastatic breast cancer to participate in a one-year support group experience or usual care.
There were no differences at four or eight months, but at one year the intervention group reported greater levels of adjustment than the control group. In a 10-year follow-up, the researchers reported that the women who participated in the support group had a significantly longer survival time. In another study of patients with Stage I and II malignant melanoma, patients were randomized to an intervention group that combined informational and educational support or to a control condition. At six years, the intervention group showed decreased recurrence and increased survival (Fawzy et al. 1993). While these findings are encouraging for social support having health benefits in some chronic illnesses, it is not clear from the research to date which parts of the interventions are the most effective or in which way they may influence health outcomes. It remains unclear whether social support per se is the active ingredient that results in the positive outcome from these trials. As well as providing emotional support, the research interventions also provided education and information. Moreover, some later studies that have included social support interventions
have not found the same effect on health outcomes (e.g., Cunningham et al. 1998). A popular and relatively recent source of social support for patients with chronic illness is the informal support group. Here, patients meet together to discuss their condition and the difficulties and successes they have experienced in coping with their illness. As well as sharing personal illness experiences, these groups offer the advantage of providing information about utilizing health services, normalizing experiences and concerns, and providing emotional support to group members. An analysis of social support groups found that the experience of physical illness was the most common reason for participation in such groups (Lieberman and Snowden 1993). Even after excluding substance abuse groups, this American survey found coping with physical illness made up 42 percent of the self-help group population. This study also suggested that these groups were more effectively accessed by middle class populations, who often have good access to other support services. One limit of such social support groups is that participants have to be well enough to be physically able to attend meetings. This pattern may be changing with the advent of the Internet. In the past five years there has been a dramatic increase in the use of the Internet in most Western households. This has been accompanied by an increasing number of discussion and support groups for chronic illness. Now, without leaving home, chronic illness sufferers can read postings from other sufferers and participate in discussions about aspects of their condition. Recent research on these illness newsgroups suggests that certain diseases have a higher number of postings than others and these rates differ markedly from their level of morbidity in the community (Davidson and Pennebaker 1997). 
For example, patients with breast cancer and chronic fatigue syndrome have a high Internet usage, while postings from heart disease and prostate cancer patients are relatively under-represented. The value of Internet support groups for patients with chronic illness has yet to be evaluated but the use of this type of support is likely to rise markedly in the future.
2. Social Support and Recovery from Medical Procedures
Although not as extensively studied as recovery from illness, there is considerable evidence that social support also has a positive influence on recovery from hospitalization and medical procedures. In a large study of 40,820 patients admitted to a University Hospital in Ohio, researchers investigated the effect of marital status on patient outcome (Gordon and Rosenthal 1995). They found that unmarried patients presented to hospital later than married patients. Unmarried surgical patients also had a higher risk of
dying in hospital than married surgical patients, even after controlling for severity of illness, diagnosis and other demographic factors. Furthermore, unmarried patients tended to stay longer in hospital than married patients. The differences found in this study were greater in patients who had never married when compared with patients who were widowed, divorced, or separated. This study suggests that being unmarried and, in particular, never having married, increases the risk of negative outcomes during hospitalization. The effect of different types of social support on recovery has been examined in the context of coronary artery surgery. A study examining the effect of different types of social support on recovery following cardiac surgery found that the perceived availability of social support was related to emotional and functional outcomes up to a year following the operation. In particular, esteem support, or feedback that one is valued and respected by others, was consistently related to outcome over the follow-up period (King et al. 1993). It is not clear why esteem-related social support is associated with positive outcome, but it is likely that such support may give the individual in this situation greater resilience to setbacks and help instill more optimistic beliefs that they have the ability to recover from the surgery. One fascinating aspect in the study of social support and recovery from medical procedures is the effect of the patient’s hospital roommate on recovery. Research in this area demonstrates that whether a patient has a pre- or postoperative roommate can influence outcome. Patients about to undergo surgery who are assigned a roommate who is also preoperative show greater levels of anxiety and a slower recovery than those assigned a roommate who has already been through surgery. 
As well as lower levels of anxiety prior to surgery, being placed with a roommate who had been through surgery resulted in getting out of bed more quickly following the operation and a faster discharge from hospital (Kulik et al. 1993). It seems likely then that the positive benefits resulting from a postoperative roommate are partly due to the natural relief of seeing someone who has survived the same operation that the preoperative patient is about to undergo, and partly due to normalizing the feelings and problems following the operation so these are less likely to be viewed as unusual or abnormal by the patient. Thus a roommate who has been through the operation or procedure can provide a practical role model for preparing patients for the post-operative pain, physical sensations and sequence of recovery following the operation. Actions by others designed to aid or assist adjustment to medical problems are not always seen as helpful by the recipient. Negative reactions to intended helpful actions are actually quite common. One study of elderly chronic illness patients found that half received support from their spouse that they did not need and 28 percent reported not receiving assistance for a daily
activity when needed. This mismatch between the need and provision of support has been the focus of study in the general social support area and is very relevant to the area of recovery from medical illness. Here, the evidence suggests that different types of support are differentially valued depending on the relationship the provider has with the patient. In an important study with cancer patients, Dakof and Taylor (1990) found emotional support was most likely to be valued when it came from intimate others, whereas informational support was more valued when it came from a physician or another cancer patient. This study found when there was a mismatch, such as when a patient needed emotional support from a spouse but received advice or information, then these actions were more likely to be seen as unhelpful. Unhelpful behaviors noted by patients were minimizing the problem, forced cheerfulness, medical care delivered without any emotional support, and being told not to worry. A later study using patients with the chronic non-life-threatening illnesses of irritable bowel syndrome and chronic headache, found these patients to put greater value on tangible and practical assistance and less on emotional support than the cancer patients in Dakof and Taylor’s sample (Martin et al. 1994). These studies highlight how the characteristics of the illness may strongly influence the type of social support needed by patients. In some life-threatening conditions, where patients are frightened or daunted by the demand of treatment and prognosis, emotional needs are more prominent. In other illnesses, where the illness forms a major hurdle to be met on a more routine daily basis without the threat of disfigurement or death, then more practical issues such as assistance with childcare and transport become more valued. A disparity in the intensity of social support can also cause difficulties for patients. 
Sometimes an aversion or fear of physical illness can cause family to avoid contact with the patient. While emotional support is the type of support most received by patients it is also the type of support likely to be perceived as most inadequate. However, overly intrusive offers of assistance and help can also cause a negative reaction and increased stress. These findings highlight the importance of finding a balance between the needs of the patient and the type and level of social support.
3. Summary
The available evidence suggests social support has a positive influence on individuals’ susceptibility to infectious disease and recovery and adjustment to a variety of chronic illnesses. In the cardiac area there seem to be particularly strong benefits for having a spouse or close confidant and these effects seem to be even stronger for men. Social support as an intervention in chronic illness seems to hold considerable
promise but further trials are needed to confirm early findings and to identify the necessary components of successful intervention programmes. With the rise of the Internet, electronic social support groups are going to provide an important source of information and peer support for patients with chronic illness. There is strong evidence that the lack of social support is a risk factor for negative outcomes following hospitalization. Patients low in social support are more likely to present with more advanced illness and more likely to die while in hospital. However, it should be noted that many of the studies in the area are correlational, and the mechanisms that link social support to recovery from illness are still largely unknown. Recovery from surgery is an example where action by others that conveys to patients that they are respected and valued has a positive effect on outcome. The type of roommate can also set up a more positive outcome from the operation by providing a constructive role model who minimizes catastrophizing and demonstrates effective coping strategies. A critical factor running through social support and the recovery from illness and medical procedures is the match between the needs of the patient and the type of social support provided. Different types of illness and procedures cause differing needs in terms of whether they require more emotional or tangible support. Emotional support is valued highest when it comes from a significant other. A mismatch in the type or intensity of support can lead to difficulties and hamper overall recovery.
See also: Chronic Illness, Psychosocial Coping with; Social Integration, Social Networks, and Health; Social Support and Health; Social Support and Stress; Social Support, Psychology of; Stressful Medical Procedures, Coping with
Bibliography
Berkman L F, Leo-Summers L, Horwitz R I 1992 Emotional support and survival after myocardial infarction: A prospective, population-based study of the elderly. Annals of Internal Medicine 117: 1003–9
Chandra V, Szklo M, Goldberg R, Tonascia J 1983 The impact of marital status on survival after an acute myocardial infarction: A population-based study. American Journal of Epidemiology 117: 320–5
Cohen S, Doyle W J, Skoner D P, Rabin B S, Gwaltney J M 1997 Social ties and susceptibility to the common cold. Journal of the American Medical Association 277: 1940–4
Cunningham A J, Edmonds C V I, Jenkins G P, Pollack H, Lockward G A, Ward D 1998 A randomized controlled trial of the effects of group psychological therapy on survival in women with metastatic breast cancer. Psycho-Oncology 7: 508–17
Dakof G A, Taylor S E 1990 Victims’ perceptions of social support: What is helpful from whom? Journal of Personality and Social Psychology 58: 80–9
Davidson K P, Pennebaker J W 1997 Virtual narratives: Illness representations in online support groups. In: Petrie K J, Weinman J A (eds.) Perceptions of Health and Illness: Current Research and Applications. Harwood Academic Publishers, Amsterdam, pp. 463–86
Ell K, Nishimoto R, Mediansky L, Mantell J E, Hamovitch M B 1992 Social relations, social support and survival among patients with cancer. Journal of Psychosomatic Research 36: 531–41
Fawzy F I, Fawzy N W, Hyun C S, Elashoff R, Guthrie D, Fahey J L, Morton D L 1993 Malignant melanoma: Effects of an early structured psychiatric intervention, coping, and affective state on recurrence and survival 6 years later. Archives of General Psychiatry 50: 681–9
Gordon H S, Rosenthal G E 1995 Impact of marital status on outcomes in hospitalized patients: Evidence from an academic medical center. Archives of Internal Medicine 155: 2465–71
King K B, Reis H T, Porter L A, Norsen L H 1993 Social support and long-term recovery from coronary artery surgery: Effects on patients and spouses. Health Psychology 12: 56–63
Kulik J A, Moore P J, Mahler H I M 1993 Stress and affiliation: Hospital roommate effects on preoperative anxiety and social interaction. Health Psychology 12: 118–24
Leserman J, Jackson E D, Petitto J M, Golden R N, Silva S G, Perkins D O, Cai J, Folds J D, Evans D L 1999 Progression to AIDS: The effects of stress, depressive symptoms and social support. Psychosomatic Medicine 61: 397–406
Lieberman M A, Snowden L R 1993 Problems in assessing prevalence and membership characteristics of self-help group participants. Journal of Applied Behavioral Science 29: 166–80
Martin R, Davis G M, Baron R S, Suls J, Blanchard E B 1994 Specificity in social support: Perceptions of helpful and unhelpful provider behaviors among irritable bowel, headache and cancer patients. Health Psychology 13: 432–9
Penninx B W J H, van Tilburg T, Boeke A J P, Deeg D J H, Kriegsman D M W, van Eijk J T M 1998 Effects of social support and personal coping resources on depressive symptoms: Different for different diseases? Health Psychology 17: 551–8
Spiegel D, Kraemer H C, Bloom J R, Gottheil E 1989 Effect of psychosocial treatment on survival of patients with metastatic breast cancer. Lancet 2: 888–91
Weinman J, Petrie K J, Sharpe N, Walker S 2000 Causal attributions in patients and spouses following a first-time myocardial infarction and subsequent lifestyle changes. British Journal of Health Psychology 5: 263–73
Williams R B, Barefoot J C, Califf R M, Haney T L, Saunders W B, Pryor D B, Hlatky M A, Siegler I C, Mark D B 1992 Prognostic importance of social and economic resources among medically treated patients with angiographically documented coronary artery disease. Journal of the American Medical Association 267: 520–4
K. J. Petrie
Social Support and Stress
Social support is a dynamic process of transactions between people whereby assistance is received, especially during periods of stressful demands. Assistance may take the form of advice, material aid, help performing tasks, or emotional support. Indeed, emotional support may be the most active ingredient in the support process. People who have a history of being supported by others when they need it tend to feel a strong sense of attachment to a group that is caring or loving and to which they feel they belong. This perception of being closely attached to others who may help when needed is a major aspect of why social support may have health benefits. That is, social support consists both of actual receipt of help and the perception that help will be forthcoming if needed.
1. History of the Study of the Beneficial Effects of Supportive Ties
Although the concept of social support was introduced in the social sciences in the 1970s, the idea that social relations may have a health-enhancing function is long-standing. Moreover, medicine has often spoken of the curative aspects of love, friendship, and positive social attachments. The influence of positive social interactions on mental health was first introduced in psychiatry by the French physician, Pinel, in the early nineteenth century and the English and American physicians, Conolly and Rush, in the late eighteenth to mid-nineteenth centuries. Having witnessed the madhouses of the time at Bedlam and the Bicêtre, they found that maniacs were treated with great cruelty. They offered a radical alternative, suggesting instead that the mentally ill be treated gently, with patience, and with the use of ‘kindness’ and ‘community’ as the active ingredients of treatment. Sociological interest in the importance of positive social attachments can be traced to the seminal work of Durkheim (1897), who found suicide to be related to social alienation from others. Later, the Chicago School of Sociology, represented by Park et al. (1925) in their classic book The City, proposed that health and social accomplishment vs. sickness and deviancy were consequences of the extent to which social groups were integrated into and supported by the broader community. Their theory was ahead of its time, and clearly challenged the then-current explanations that genetic or moral inadequacy was responsible for deviancy. Such genetic and morality explanations were the predominant psychological view of the time, held by such eminent scholars as Lewis Terman of Stanford and Robert Yerkes of Harvard, who argued that Jewish and other ‘brunette’ immigrant groups were genetically inferior and should be prevented from immigrating (Kamin 1974). 
Moving to more recent empirical investigations, researchers in the 1970s examined the relationship between social attachments and wellbeing. They found an association between socioenvironmental conditions and infant
mortality, tuberculosis, and mental illness. They noted that social disorganization and negative social conditions contributed to increased illness on psychological and physical levels. In a pioneering investigation, Berkman (1977) examined the social relationships and health of a random sample of adults in Alameda County, California. Those individuals who had ongoing social relationships, such as marriage, close friendships and relatives, church membership, and informal and formal group associations, had the lowest mortality rates. She interpreted her findings to mean that social isolation and lack of social and community ties disrupted people’s disease resistance and that the presence of positive social attachments, in contrast, acted as a resistance resource (see Social Integration, Social Networks, and Health). But how are social relationships translated into physical or emotional health? Sarason (1974) suggested that people obtained a psychological sense of community, and theorized that this sense of community contributed to people’s wellbeing. The healthy community guards its residents by facilitating healthy behavior (caring), offering people a positive sense of who they are (self-esteem), providing for safety and mutual support, and opening access to resources. He defined psychological sense of community as ‘the sense that one was part of a readily available, mutually supportive network of relationship upon which one could depend … (who are) part of the structure of one’s everyday living … (and) available to one in a “give and get” way.’ The work of Caplan (1974), a pioneering psychiatrist, was perhaps the first to identify the concept of social support and its importance in the stress process as the active ingredient that these other theorists were noting. 
Caplan identified the nature of the process by which social relationships provided psychological sustenance, acting ‘to mobilize … psychological resources and master … emotional burdens; share tasks; and supply (needed resources)’ (Caplan 1974). Caplan further emphasized that support provision supplied and activated resources, but also contributed to a positive sense of another critical stress resistance resource— sense of mastery.
2. Social Support’s Influence on Mental and Physical Health
When people encounter stressful conditions they may benefit from the assistance of others. The people on whom they might naturally rely are termed their social support network. The social support network consists of the constellation of individuals who have a relationship with one another that is characterized by helping. How strong the social support network is depends on how often social support occurs between
network members, the kinds of support that are communicated through the network, and the strength of the ties. Different kinds of social support networks also perform different functions. To find a job, a large, loosely connected network may be of most assistance, even though the relationships are not close. The greater number of contacts may maximize the chance of learning about job openings and obtaining interviews. In contrast, many studies have shown that when major life stressors occur, such as death, illness, or divorce, then a small intimate network might prove most helpful. In such a close network, people are more likely to share their sorrows and receive consistent, ongoing help and emotional succor (see Gender Role Stress and Health). Moreover, the help received from intimate others has greater personal meaning (Sarason et al. 1986). Social support contributes both to mental and physical health. Social support is related to lower psychological distress in the form of anxiety and depression. Comprehensive reviews of the literature found that social support contributes to mental health both through the positive feelings that are engendered by having close attachments with others and because help is mobilized in times of need (Cohen and Wills 1985, Schwarzer and Leppin 1989). The attachment aspects of social support are seen as contributing to wellbeing whether or not stressful events are present. This has been called support’s direct effect. In contrast, the contribution of the actual receipt of support and knowing support will be forthcoming if needed may prove most beneficial during high stress periods. In this regard, those who have access to social support do better in particular when challenged by stressful circumstances. This has been called support’s stress buffering effect. Research has also examined social support’s positive impact on physical health. 
Comprehensive reviews of this literature found that social support contributes to better clinical health outcomes (i.e., less disease) (House et al. 1988) and more positive stress responding on a physiological level that is subclinical (Uchino et al. 1996). Social support is related to lower morbidity and mortality across a wide array of diseases including heart disease, cancer, and the common cold. Those who are ill but who obtain social support also have been found to have less severe symptoms including less pain, less physical disability, and more rapid recovery. On a physiological level that underlies health, social support is related to beneficial influences on cardiovascular, endocrine, and immune systems. Social support may, in part, help by encouraging positive health behavior but this does not seem to be the major channel through which its positive influence occurs. Rather, social support has direct benefits on lowering cardiovascular reactivity, improved immune responding, and more physiologically adaptive responding of stress-related hormones. As in the mental health studies, support from intimates seems most important for sustaining physical health.
In the excitement and groundswell of interest in social support, researchers and theorists almost entirely forgot that interpersonal relationships could also be a critical source of stress. Rook (1984) shattered this honeymoon with the social support concept by reminding the field that interpersonal conflict was also one of the main contributors to stress. Moreover, it is notable that positive and negative social interactions often emanate from the same people. Rook also showed that when positive and negative interactions co-occur, the negative ones are more disruptive of wellbeing than the positive ones are health enhancing. In work with women during the Israel–Lebanon War, researchers further found that even positive social interactions could have a negative impact when people shared a crisis. They called this the pressure cooker effect, whereby stress contagion occurs among people who find their resources commonly drained and their problems accentuated by having to deal with the same severe problem. This finding is especially critical because many of life’s problems impact families or groups of people in a shared way, as is the case when there is a death in the family or in the face of disaster.
3. Social Support and the Exchange of Resources
The process and value of social support are placed in broader ecological context by Hobfoll’s (1998) Conservation of Resources (COR) theory. COR theory’s basic tenet is that people strive to obtain, retain, protect, and foster that which they value. These things that they value are termed resources. Resources are either valued in their own right or serve as a means of obtaining valued ends. There are four basic kinds of resources: objects, conditions, personal characteristics, and energies. Object resources are physical entities that are valued such as transportation, a house, or a diamond ring. Condition resources are social circumstances that give people access to other resources, such as love, money, status, or shelter. They include such conditions as marriage, tenure, and employment. Personal characteristics include skills or personality attributes that enable an individual to better withstand stressful conditions, achieve desired goals, or obtain other resources. They include personal attributes such as sense of mastery, self-esteem, and optimism and skills such as job skills or social skills. Finally, energy resources are resources that can be used to obtain other resources, but that may become valued in and of themselves. They include money, credit, and knowledge. COR theory proposes that stress occurs when people (a) are threatened with resource loss, (b) actually lose resources, or (c) fail to gain resources following resource investment. Further, people use resources in order to limit such losses or to gain resources. For example, people use self-esteem in order to bolster their self-confidence after doing poorly on
an examination. Similarly, people invest money to purchase insurance in order to offset potential financial loss. Because resources are often hard to obtain and maintain, resource loss according to COR theory is considered to be more salient and of greater impact than resource gain. Resource gain, in turn, becomes more important in the face of resource loss. COR theory helps explain the importance of social support. People have limited resources and, especially when under stress, they may find their resources inadequate. Through social support people can rely on others to offer the resources they lack, bolster their flagging resources, or remove them from the stressful circumstances so that they can regain resources or the ability to use their resources. This may take the form of offering object resources, such as shelter or transportation, and energy resources, such as money or information. It may also take the form of bolstering personal resources that the stressful conditions have depleted. For example, divorce may compromise individuals’ self-confidence, hope, and optimism. Emotional support can act to replenish these diminished resources by reminding individuals that they are loved, important, and likely to succeed in the future. COR theory also explains the potential costs of social support. First, the resources offered may be of poor quality. Supportive messages may contain submessages that are actually undermining, such as when people offer support but imply that they have the upper hand. Second, using social support means incurring social obligations that may be costly in the future. This matters because people are often reluctant to incur obligations, and because when they need to return the favors bestowed upon them they may have to risk valuable resources at an inopportune time. This may differ in communal vs. 
individualistic cultures, but even in communal cultures where the exchange of support is more balanced, incurring obligations has social costs. A third cost of obtaining social support is that other resources often have to be expended to call on and employ help. Reaching out to others requires time, a critical resource that may already be taxed during stressful conditions. Calling on help may also force people to admit to some inadequacy, which they may not wish to acknowledge. Particularly in Anglo-Saxon and Germanic cultures the concept of ‘standing on one’s own two feet’ is a central value. Admitting inadequacy may, in turn, diminish self-esteem, self-confidence, and sense of mastery. COR theory further emphasizes that social support is not just a matter of receiving resources from others when needed. Building a supportive social network requires extensive, long-term outlays of resources in order to build and maintain supportive ties. This means that to have support available requires a history of resource investment. In some relationships this may be one-sided, as in parent–child relations. Similarly, different relationships are characterized by varying
histories of give and take. These patterns will set the stage for the commerce of social support when a crisis occurs.
4. Fitting of Social Support to the Type of Stress
The help offered by social support may or may not fit the demands of any given stressful situation. This relationship between the nature of the demand and the adequacy of social support in meeting those demands has been termed social support fit. Emotional support has been found to fit many kinds of demands, and is the most robust, wide-reaching kind of support. However, even emotional support may not be appropriate for a given demand. When advice is needed it is important that the right kind of advice be available. Similarly, when support is needed for tasks some kinds of help may hinder more than they help. An important aspect of social support fit is highlighted by COR theory. Specifically, many stressful circumstances are chronic. COR theory suggests that resource reservoirs are finite and that even if initial support has excellent fit with demand, this does not mean that the right kinds of support will be available for the ongoing demands that ensue (Norris and Kaniasty 1996). People, however loving and helpful, may need to return to their own immediate family or may need to shelter their resources for the needs of other members of the family. How this is the case with money is readily understood, but love too takes time to communicate. Even a parent may have to withdraw resources from a needy child so as not to sacrifice the needs of other children at some point. Chronic stressors will deplete individuals’ resource reservoirs and those of their support system (see Stress and Coping Theories).
5. Future Research Directions Future research on social support is likely to take a number of directions. On the biological level, researchers are only beginning to uncover the relationships between social support and physiological reactivity and disease. Research will examine more specific diseases and the interaction of social support with the time element of disease. Hence, researchers will be examining social support’s impact on premorbid states, disease onset, and recovery. On the physiological level, research will be looking at how social support influences physiological reactivity that precedes disease or sustains health. On the more social psychological level, research will be investigating developmental aspects of social support in terms of how it comes about and how it is sustained. This will include evaluation of how culture
plays a role in social support interactions and the different meanings of social support in different cultures. Social support’s impact will continue to be examined in every imaginable stressful circumstance, from university examinations, to the workplace, to disasters. With an aging population, how social support impacts the process of adaptation among older adults will be an expanding area of interest. It is clear that social support is multifaceted and can be looked at on multiple levels, and interest in the topic is likely not only to continue but to increase in scope. See also: Chronic Pain: Models and Treatment Approaches; Social Integration, Social Networks, and Health; Social Support and Health; Social Support and Recovery from Disease and Medical Procedures; Social Support, Psychology of; Stress and Health Research
Bibliography
Berkman L F 1977 Social networks, host resistance and mortality: A follow-up study of Alameda county residents. Ph.D. thesis. University of California, Berkeley, CA
Caplan G 1974 Support Systems and Community Mental Health. Behavioral Publications, New York
Cohen S, Wills T A 1985 Stress, social support, and the buffering hypothesis. Psychological Bulletin 98: 310–57
Conolly J 1856 The Treatment of the Insane Without Mechanical Restraints. Smith, Elder, & Co, London
Durkheim E 1897/1952 Suicide, a Study in Sociology. Routledge & K. Paul, London
Hobfoll S E 1998 Stress, Culture, and Community: The Psychology and Philosophy of Stress. Plenum, New York
House J S, Landis K R, Umberson D 1988 Social relationships and health. Science 241: 540–5
Kamin L J 1974 The Science and Politics of I.Q. L. Erlbaum Associates, Potomac, MD
Norris F H, Kaniasty K 1996 Received and perceived social support in times of stress: A test of the social support deterioration deterrence model. Journal of Personality and Social Psychology 71: 498–511
Park R E, Burgess E W, McKenzie R D 1925 The City. University of Chicago Press, Chicago
Rook K S 1984 The negative side of social interaction: Impact on psychological well-being. Journal of Personality and Social Psychology 46: 1097–1108
Sarason I G, Sarason B R, Shearin E N 1986 Social support as an individual difference variable: Its stability, origins, and relational aspects. Journal of Personality and Social Psychology 50: 845–55
Sarason S B 1974 The Psychological Sense of Community: Prospects for a Community Psychology, 1st edn. Jossey-Bass, San Francisco, CA
Schwarzer R, Leppin A 1989 Social support and health: A meta-analysis. Psychology and Health 3: 1–15
Uchino B N, Cacioppo J T, Kiecolt-Glaser J K 1996 The relationship between social support and physiological processes: A review with emphasis on underlying mechanisms and implications for health. Psychological Bulletin 119: 488–531
S. E. Hobfoll
Social Support, Psychology of The concept of social support has both scientific and popular appeal. The topic has received attention as the important influence of psychological and interpersonal factors on the health and well-being of individuals has increasingly been recognized.
1. History Early research in this field was characterized by vague definitions, poor and inadequate samples, and inappropriate generalizations. Psychologists have long recognized the importance of the association between social support and health generally, and interpersonal factors and individual mental health specifically. It was only when the importance of these factors on physical health was documented that the research became of interest to a broader array of scientists, researchers, and clinicians. This turning point can perhaps be best identified with the early work of Berkman (Berkman and Syme 1979). Using the Alameda county longitudinal epidemiological data which followed a large representative sample of residents from the California county, Berkman and Syme were able to document that people reporting social ties with friends and organizations were more likely than people without such ties to be alive nine years later. Other studies essentially replicated these findings with respect to mortality in Tecumseh, Michigan (House et al. 1982) and Evans County, Georgia (Blazer 1982, see also Bowling and Grundy 1998 for a recent review). This began a series of studies documenting the association between social support and numerous aspects of physical and mental health, including depression, recovery from disease and medical procedures, cancer, heart disease, experience of pain, and adherence to a medical regimen (Antonucci 1990, Berkman 1985, Holahan et al. 1995).
2. Definitions While the term social support was initially used as a nonspecific, generalized term, experts have now come to recognize the importance of providing more detailed definitions and, more importantly, of distinguishing between different related concepts. We
propose the generalized term social relations to describe the broad array of factors and interpersonal interactions that characterize social exchanges and social support between people. Under this rubric we find it is useful to distinguish three critical terms: social network, social support, and support satisfaction. Social networks can best be understood as the objective characteristics that describe the people with whom an individual maintains interpersonal relations. Thus, one can describe one’s social network members in terms of age, gender, role relationship, years known, residential proximity, frequency of contact, and the like. Describing a social network tells you about the people with whom an individual has relationships but not about the exchange that takes place within those social relations. To describe this aspect of social relations, the term social support is used. Social support refers to the actual exchange of support. Social support has been described as an interpersonal transaction involving the exchange of one of three key elements: aid, affect, and affirmation (Kahn and Antonucci 1980). Aid refers to instrumental or tangible support such as lending money, helping with chores, or providing sick care. Affect refers to emotional support such as love, affection, and caring. Affirmation refers to agreement or acknowledgment of similarities or appropriateness of one’s values or point of view. Other terms have also been used, e.g. informational and appraisal support. The third term often used in this literature is evaluative support or support satisfaction. This term refers to the assessment individuals make about the support they receive. It is an evaluative reflection of a combination of both social network and social support characteristics as well as more idiosyncratic characteristics such as perceived and expected support.
Since individuals are fundamentally psychological beings, not every individual experiences the objective receipt of support in the same way. Thus, a support provider might say to a support recipient ‘I love you’, which might be heard by one person as a pledge of lifelong romantic love and by another as a passing sign of affection, an indication of platonic, brotherly love, or even a sarcastic statement of disdain. Support satisfaction refers to how individuals evaluate or assess the support they receive. They might feel support is adequate, that it meets their needs. Some individuals will be satisfied with the support they receive, i.e. positively evaluate the support they receive, and others will not, even when an observer would report that the two individuals have received the same support.
3. Measurement Recognition of these different concepts and the distinctly different definitions associated with them leads directly to the recognition of the problems associated with measuring these constructs. Measurement has evolved significantly over the last several years. Different types of measures have been used such as open-ended questions and structured interviews. Also interesting is the fact that many different methods have now been used to study social support specifically and social relations more generally. These include in-depth ethnographic studies, laboratory studies, as well as large representative, epidemiological studies. When assessing primary sources in this field it is important to be cognizant of the specific definitions and consequent measures which have been used to assess the social relations characteristics of interest since these facts fundamentally influence the kind of information obtained.
4. Theoretical Perspectives Several theoretical conceptualizations relating to social support have been offered such as attachment across the lifespan, socioemotional development, family systems, and developmental aspects of friendships (see Levitt 2000 for an overview). It is now generally recognized that a lifespan perspective provides a critical underlying base upon which to study social relations. The most important exchanges of social support and social relations usually involve long-lasting, close relationships such as those between parent and child, siblings, husband and wife. Interpersonal relationships are lifespan in nature either because they have existed across the lifespan or because they are built on interpersonal relationships that have been experienced over the individual’s lifetime. To understand how an individual experiences a relationship at any one point in time, it is most useful to understand the history of that specific relationship as well as other relationships in the person’s life. Thus, the study of a 50-year-old woman and her 70-year-old mother is best conceptualized as a relationship that has continued from when the 50-year-old was an infant through her early childhood, adolescence, young adulthood into the current state of middle age. Similarly, her mother was at one time a young 20-year-old mother who loved and cared for her child then and throughout the next 50 years. The resultant mother–child relationship is likely to be understood very differently depending on these lifetime-accumulated experiences. One might even argue that a specific negative experience could and would be forgiven because of a lifetime of positive experiences. Or, on the contrary, a contemporary supportive interaction may be unable to make up for a lifetime of disappointment or neglect.
4.1 Convoy Model The convoy model of social support is actually a model of social relations more generally (Antonucci 1990, Kahn and Antonucci 1980). It conceptualizes the longitudinal nature of these relations: the convoy is a structural concept shaped by personal (age, gender, personality) and situational (role expectations, resources, demands) factors which influence social support and support satisfaction. Changes in the convoy over time critically affect individual health and well-being.
4.2 Theoretical Work Recent theoretical work has focused on explaining how social support positively affects the health and well-being of the individual. Antonucci and Jackson (1987) offer the support/efficacy model. This model suggests that it is not only the exchange of specific support that accomplishes these effects but the cumulative expression by one or several individuals to another that communicates to the target person that he or she is an able, worthy, capable person. Assuming that the support recipient perceives the support provided as accurate and altruistically motivated, the support recipient will come to internalize this same belief communicated by the supportive other. Thus, with multiple and repeated exchanges of this type, the supported person accumulates a belief in his or her own ability which will enable that person to face and succeed in the multiple goals and challenges of life, i.e., to cope with problems across the lifespan. Not all people receive support from those who surround them and not all support is positive (Rook 1992). Thus, some people, instead of being told that they are competent, capable, and worthy, may be led to believe that they are incompetent, incapable, and unworthy. Just as the cumulative effect of positive exchanges can have a positive effect on an individual’s health and well-being, so the opposite can have a devastating and cumulative negative effect on health and well-being. Some support networks can have a negative effect by supporting or encouraging behaviors detrimental to health and well-being. Drug buddies, drinking buddies, eating buddies, and buddies who discourage exercise are all examples of people who help or ‘support’ behaviors that simply are not good for you. Other theoretical perspectives have been offered. For example, some scholars (e.g., Thoits 1995, Pearlin et al. 1996) consider social support a coping mechanism.
They argue that as people are confronted with stress, the receipt of social support from important close others is a critical and significant coping strategy. Sarason et al. (1990) have taken a more personality-based perspective, arguing that individuals with specific personality characteristics will seek and benefit from different types of social relations including specific types of social support exchanges. Carstensen et al. (1999) have suggested an interweaving of social relationships, emotional development, and lifespan experiences. Although not technically a theory of social support, Carstensen has offered a perspective which argues that at younger ages people are more likely to seek out and be advantaged by numerous exchanges and social contacts, whereas as they age people reduce the number of exchanges in which they engage to only those which are perceived as most close. Finally, two other perspectives on social relationships have focused on intergenerational solidarity (Bengtson et al. 1996) and adult friendships (Blieszner and Adams 1992). Bengtson has documented the unique perspective of each generation towards intergenerational support. Older people of the more senior generations are likely to report that they provide a great deal of instrumental and affective support to their younger, junior generational family members. Younger people, while often agreeing to feelings of solidarity with older generations, report fewer support exchanges than their senior generation relatives. Bengtson et al. attributed this to important differences in developmental stake in family, generational, and close social relationships. Blieszner and Adams, for their part, note the unique supportive nature of adult friendships, which serve an important function because, unlike family relationships, they are nonobligatory, most often among peers, and therefore on an equal or nonhierarchical basis.
5. Support Findings While there is still much to learn about the way in which social relations, and specifically social support, operate, there is quite a bit that is already known. There are both sociodemographic and cultural similarities and differences in social relations (Jackson and Antonucci 1992). It is known, for example, that most people report immediate family, mother, father, spouse, child, and siblings as their closest relationship (e.g., Antonucci and Akiyama 1995). It is known that people differentiate their relationships from very close to much less close. While all these relationships can be important, it is clear that the closest relationships have the biggest effects in most circumstances. However, it is worth noting that there can be special circumstances where a person generally not considered very close can have a significant effect on the individual. A supervisor not considered a friend but who goes out of his/her way to extend praise or deliver criticism is an example. It is also known that social relations vary by age, gender, socioeconomic status (SES), race, ethnicity, culture, and other sociodemographic characteristics. The number of close social relationships and amount of emotional support is relatively stable across the lifespan until very old age (Due et al. 1999), when a significant decline occurs (Baltes and Mayer 1999). This seems true in most countries for which we
currently have data. Although men and women seem to have the same general structure to their social networks and very little difference in the size of their networks, it does seem to be the case that men have fewer close relationships than women. Men seem to maintain a significant, close, and high-quality relationship with their spouse, while women maintain this type of relationship with several people. On the other hand, men seem to be less burdened by their relationships and to feel less responsible for solving the problems of people in their networks. They also have fewer people they feel they can turn to in times of trouble and feel less comfortable seeking and obtaining support from others. The data on SES indicate that the social networks, support exchanges, and support satisfaction of people from different SES levels vary considerably. To put it succinctly, people at lower SES levels have smaller networks which tend to be sex segregated and to consist mainly of family members. They exchange support with fewer people and are often, though not always, less satisfied with the support they receive. Race, ethnicity, and culture have been shown to affect social relationships although there appear to be some fundamental characteristics of social relationships that transcend group membership. Thus, for example, most support networks consist of family members but different groups have different customs and sometimes different characteristics, which influence support relationships. So, for example, the church is a formal group that provides significant support to most members of the African-American community. On the other hand, Hispanics place a particularly high value on family relationships which are characterized by large, densely knit support groups. In the United States, the daughter is often the most significant support provider and caregiver for older parents.
However, in Japan, daughters-in-law more often provide support than daughters because the traditional living arrangements in Japan require daughters-in-law to live with their husbands’ parents. It is nevertheless important to remember that structure and culture do not always predict perception, expectation, and/or quality of support. A recent study (Tilburg et al. 1998) comparing older people in the Netherlands who often live alone with older people in Italy who often live with their children, found the Italians reporting less social integration and more loneliness than the Dutch. It may be that Italians expect more, perceive less, and therefore are more likely to downgrade the quality of the support they receive. These are important distinctions of which we are aware but about which we understand little. An additional perspective is offered by research now available which demonstrates, at the psychoimmunological level, that people who are lonely or have more hostile interactions with others are much more likely to become ill, develop heart disease, cope less well with stress, and take longer to recover from health problems (Uchino et al. 1996).
6. Future Directions Future programs designed to improve social relations and social support need to be sensitive to individuals and yet not encourage or induce dependence (Silverstein et al. 1996). Social support can be most useful when it instills in the individual a feeling of being valued and competent, of being worthy and capable. We are just beginning to understand how this might happen. Acknowledged possibilities include the exchange of support, which involves both receiving and providing support, reciprocity in social relations, and the convergence of expectations, perceptions, and receipt of support. Simply stated, the task is to identify ways to maximize positive support and minimize negative support. The next important challenge in this area is finding ways to incorporate what we know about social support specifically and social relations more generally into prevention and intervention programs that will improve the physical and mental health of people of all ages. See also: Adult Psychological Development: Attachment; Coping across the Lifespan; Social Relationships in Adulthood; Social Support and Health; Social Support and Recovery from Disease and Medical Procedures; Social Support and Stress
Bibliography
Antonucci T C 1990 Social supports and social relationships. In: Binstock R H, George L K (eds.) The Handbook of Aging and the Social Sciences, 3rd edn. Academic Press, San Diego, CA, pp. 205–26
Antonucci T C, Akiyama H 1995 Convoys of social relations: Family and friendships within a lifespan context. In: Blieszner R, Bedford V H (eds.) Handbook of Aging and the Family. Greenwood Press, Westport, CT, pp. 355–72
Antonucci T C, Jackson J S 1987 Social support, interpersonal efficacy, and health. In: Carstensen L L, Edelstein B A (eds.) Handbook of Clinical Gerontology. Pergamon Press, New York, pp. 291–311
Baltes P B, Mayer K U (eds.) 1999 The Berlin Aging Study: Aging from 70 to 100. Cambridge University Press, New York
Bengtson V L, Rosenthal C, Burton L 1996 Paradoxes of families and aging. In: Binstock R H, George L K (eds.) Handbook of Aging and the Social Sciences. Academic Press, San Diego, CA, pp. 253–82
Berkman L F 1985 The relationship of social network and social support to morbidity and mortality. In: Cohen S, Syme S L (eds.) Social Support and Health. Academic Press, New York, pp. 241–62
Berkman L F, Syme S L 1979 Social networks, host resistance, and mortality: A nine-year follow-up study of Alameda county residents. American Journal of Epidemiology 109: 186–204
Blazer D G 1982 Social support and mortality in an elderly community population. American Journal of Epidemiology 115: 684–94
Blieszner R, Adams R G 1992 Adult Friendship. Sage Publications, Newbury Park, CA
Bowling A, Grundy E 1998 The association between social networks and mortality in later life. Reviews in Clinical Gerontology 8: 353–61
Carstensen L L, Isaacowitz D, Charles S T 1999 Taking time seriously: A theory of socioemotional selectivity. American Psychologist 54: 165–81
Due P, Holstein B, Lund R, Modvig J, Avlund K 1999 Social relations: Network, support and relational strain. Social Science and Medicine 48: 661–73
Holahan C J, Moos R H, Rudolf H, Holahan C K, Brennan P L 1995 Social support, coping and depressive symptoms in a late-middle-aged sample of patients reporting cardiac illness. Health Psychology 14: 152–63
House J S, Robbins C, Metzner H L 1982 The association of social relationships and activities with mortality: Prospective evidence from the Tecumseh community health study. American Journal of Epidemiology 116: 123–40
Jackson J S, Antonucci T C 1992 Social support processes in the health and effective functioning of the elderly. In: Wykle M L (ed.) Stress and Health Among the Elderly. Springer, New York, pp. 72–95
Kahn R L, Antonucci T C 1980 Convoys over the life course: Attachment, roles, and social support. In: Baltes P B, Brim O (eds.) Life Span Development and Behavior. Academic Press, New York, Vol. 3, pp. 253–86
Levitt M J 2000 Social relations across the life span: In search of unified models. International Journal of Aging and Human Development 51(1): 71–84
Pearlin L I, Aneshensel C S, Mullan J T, Whitlatch C J 1996 Caregiving and its social support. In: Binstock R H, George L K (eds.) Handbook of Aging and the Social Sciences. Academic Press, San Diego, CA, pp. 283–302
Rook K S 1992 Detrimental aspects of social relationships: Taking stock of an emerging literature. In: Rook K S (ed.) The Meaning and Measurement of Social Support. Hemisphere Publishing, New York, xiv, 327, pp. 157–69
Sarason B R, Sarason I G, Pierce G R 1990 Social Support: An Interactional View. John Wiley and Sons, New York
Silverstein M, Chen X, Heller K 1996 Too much of a good thing? Intergenerational social support and the psychological well-being of older parents. Journal of Marriage and the Family 58: 970–82
Thoits P A 1995 Stress, coping, and social support processes: Where are we? What next? Journal of Health and Social Behavior, Extra Issue: 53–79
Tilburg T, de Jong Gierveld J, Lecchini L, Marsiglia D 1998 Social integration and loneliness: A comparative study among older adults in the Netherlands and Tuscany, Italy. Journal of Social and Personal Relationships 15: 740–54
Uchino B N, Cacioppo J T, Kiecolt-Glaser J K 1996 The relationship between social support and physiological processes: A review with emphasis on underlying mechanisms and implications for health. Psychological Bulletin 119: 488–531
T. C. Antonucci
Social Survey, History of The social survey grew out of concerns about the social consequences in Western Europe and North America of rapid industrialization and urbanization, as well as from the desire to investigate society.
The origins of the investigation of the condition of the working or laboring classes in modern times may be traced to social investigations in the UK in the late eighteenth and early nineteenth century. Such social investigations were not surveys, but are part of the immediate prehistory of the social survey. The social survey as a tool of scientific inquiry is not much more than 110 years old.
1. The Survey Defined Several specific characteristics distinguish the social survey from the modes of social investigation that preceded it. A social survey involved field work, the collection of data at first hand by a social investigator rather than reliance upon reports by others or on preexisting data. Surveys attempted to achieve comprehensive rather than haphazard coverage, albeit initially within a local rather than a national area. The data in surveys related to individuals, families, and households rather than aggregates, and were analyzed accordingly. Survey research involved the attempt, however primitive, at counting and quantifying the phenomena with which it was concerned. And the social survey developed in close relationship with public policy and social reform. Throughout the nineteenth century there were investigations into public health, housing, family life, and employment, particularly as these affected the working classes. Some, such as the studies by Frédéric Le Play of working-class family budgets, had wider scientific aspirations. The investigations were carried out by private individuals (many of them, in the later nineteenth century, women of some social position), certain professions (associated with, in particular, medicine), members of voluntary associations concerned with social welfare, a few journalists and (by the end of the century) one or two academic scholars. Although the methods of inquiry varied, there was some common ground. Indeed, for a period in the USA what became known as the Social Survey Movement flourished, engaging significant numbers of private, government, and academic researchers. These inquiries signified increasing upper-class and middle-class interest in the condition of the working classes as well as a desire to intervene: a desire both to remedy want and disease through voluntary or state action and to achieve a greater degree of social control through the use of scientific expertise.
There was thus a dual involvement of both social scientists and social reformers in the history of the social survey, and an interplay between them. It is important to stress that before 1940 the social survey was not associated particularly closely with academic social science, which was in any case itself quite small in scale. The modern scientific sample survey, established by academics such as Rensis Likert and Paul Lazarsfeld, government statisticians such as P. C. Mahalanobis and Louis Moss, and market researchers such as Henry Durant and Mark Abrams, has developed since that time. Before World War II, a variety of individuals and groups, only a minority of them academics, contributed to the establishment of the social survey. Occasionally surveys have originated in an abstract desire for more knowledge about the structure and workings of society; more frequently, however, they have been carried out as an indispensable first step in measuring the dimensions of a social problem, ascertaining its causes, and then deciding upon remedial action.
2. Studies of Local Social Conditions The history of the social survey can be conveniently schematized into two periods, the early studies of local social conditions—particularly poverty—and the post-1940 probability sample survey. The former were mainly local, the latter frequently national and more recently international in scope and design. However, both types of survey shared features in common. A fundamental change in the concepts guiding the acquisition of social knowledge accompanied the rise of the social survey. A comparison between Henry Mayhew’s examination of the London poor in the middle of the nineteenth century and Charles Booth’s surveys three decades later is instructive. Booth conceptualized the problem differently, conducted a larger-scale inquiry, attempted to measure the phenomena with which he was concerned, and collected data from multiple observers rather than relying upon a single observer. The emergence of the interview, however, was a gradual process. The history of the social survey is usually dated in Britain to the 1880s when, as economic depression heightened social tensions, social and political groups that had competed in prosperity descended into conflict. One manifestation of these changing circumstances was sporadic social unrest. One particular question which exercised opinion in the middle and upper classes was the extent of poverty. In the 1880s, with more trying economic conditions, the problem of poverty became more insistent and the discrepancy between the numbers in receipt of poor relief and the proportion of the population having inadequate incomes due to underemployment, unemployment, family breakdown, ill-health, old age, and other misfortunes was more apparent. But how great was the difference and who would provide an answer to the question? The absence of an adequate institutional base from which to undertake such inquiries was a major obstacle. This base was not provided by the political parties. 
Government departments did little at this period to gather information on social conditions.
With the exception of one or two positions in economics, social science was not institutionalized in universities. Existing methods of collecting social data about poverty were patchy and often lacking in reliability. Henry Mayhew’s observational approach emphasized vivid insight and detailed description over extensive enumeration. Statistics compiled by central and local government, such as they were, related to administrative function and were neither designed nor collected by people with professional training. The interrogatory methods of parliamentary committees and Royal Commissions as devices for investigating problems were unlikely to lead to a satisfactory overall picture of social conditions, producing impressionistic results which varied between individuals. There was a marked lack of clarity in defining the phenomenon.
3. Charles Booth The pivotal figure of Charles Booth looms large in the history of poverty surveys. Booth was a wealthy Liverpool shipowner who was annoyed by the lack of a factual basis for contemporary accounts of urban poverty and by the lack of clarity about three questions to which he sought answers: How many were poor, why were they poor, and what should be done to alleviate poverty? Through careful empirical study Booth sought answers to the first two questions for the East End of London. The Life and Labour of the People in London, published in 17 volumes between 1889 and 1903, is the first great empirical study in the social survey tradition. The publication of its first volume in 1889 created a sensation. Booth’s originality lay in his methods of inquiry, not just in how he collected and analyzed his data, but in the way he conceptualized the problem of poverty. Booth attempted to introduce precision to the concept of the ‘poverty line’ where hitherto there had been vagueness, a characteristic which justifies including him as one of the founding fathers of social science. Booth sought to measure the numbers of the population above and below the poverty line with some precision, even though the classification categories that he used lacked adequate justification. This too differentiated his work from previous studies, and foreshadowed the later use of the social survey for estimating population values with some exactitude. Furthermore, although Booth’s study, like most of the early social surveys, was a local study, its coverage was comprehensive: he aimed to gather information on as large a proportion of the population as was possible and especially on all families with children in the parts of London covered by the survey. Booth was also unusual in using a team of investigators. First-hand data about the household circumstances of poor families were not collected directly by his research team but by what one of them, Beatrice Webb, called the method of the ‘wholesale interview.’ School board visitors, who entered the homes of children attending London schools, were interviewed by Booth to gather information about the circumstances of each family. Booth had printed data collection notebooks that categorized the information gained in interviews with the visitors. Great care was taken to achieve uniformity in the data collection between the different interviewers, and the information obtained was checked against other sources in an attempt to increase its reliability. The quantitative and qualitative data from the notebooks were analyzed by Booth and his staff. The published results were aggregations of the measurable and distillations of the unquantifiable. The findings had direct policy implications and contributed to movements toward social reform and old age pensions in particular. Booth’s research had important effects on the development of the social survey, independent of any influence his findings may have had, and popularized survey research. Several early US studies were derivative of Booth, and his work served as a template for a large number of early social researchers.
4. Seebohm Rowntree and A. L. Bowley The next major survey in Britain, Seebohm Rowntree’s study of York, was conducted in 1899 and published in 1901. This further refined both the conceptualization and measurement of the phenomenon of poverty. Most important, perhaps, it used ‘retail’ rather than ‘wholesale’ interviewing; that is to say, people were interviewed directly in their own homes by members of the research team, not indirectly through reports by middle-class professionals with knowledge of individual families. This transition from reliance upon informants who could report on the circumstances of working-class people to reliance upon the response of the people themselves was a most important one. Rowntree also refined the conceptual understanding of poverty by introducing the concept of the ‘life cycle,’ and by showing how poverty varied between different phases of the life cycle. There were a considerable number of other local poverty studies in Britain prior to 1940, but the original studies by Booth and Rowntree, and restudies conducted in York by Rowntree in 1936 and 1951 were the most influential. The first social surveyor to use sampling methods was Arthur Lyon Bowley in his studies of poverty in five towns, published as Livelihood and Poverty in 1915. This innovation, which was slow in being adopted, eventually undercut the local social survey, replacing it with the modern probability sample survey. The use of sampling in practice only emerged slowly and was first seriously taken up by US market
researchers between the wars. It was used in social survey research for government and academic research only during and after World War II.
5. Early Surveys in the United States Developments in the USA shared many features of this British experience. In both societies the social survey was shaped by the salience of voluntary effort, of liberal political traditions that restricted government initiatives, and of values promoting ‘responsible individualism.’ Thus we can speak of an Anglo-American tradition of social survey movements that shared many fundamental features. In both countries the settlement house movement provided a base for fostering social investigation. The influence of their ideas and the publicity they gave to the existence of social conditions created a more thoughtful climate for the reception of social surveys. At the same time, the US context contained challenges not found in the British milieu. More isolated from political power, unable to draw on traditional criticisms of industrialization, and less likely to work closely with organized labor, reformers had to make the most of their middle-class resources. They also stood in a different relationship to emerging social science. By the end of the century, there was a different relationship between professional expertise and the developing skills of social investigation than in the UK, one which also had repercussions for conceptions of objectivity and of what constituted social science. In these circumstances, the social survey became even more important to middle-class reformers. For the objective data provided by the surveys were uniquely effective in mobilizing ‘public opinion,’ that amorphous combination of middle-class and working-class attitudes that activated politicians and labor leaders alike within the US political environment. Surveys could get action (especially when linked with sensationalist press reports) even when ‘monster’ meetings and petitions could not.
The social survey in the USA in the period between 1880 and 1940, while resembling and differing from its British counterpart, directly imitated Booth’s work in certain respects. Florence Kelley’s work on the Hull-House Maps and Papers was one of the earliest US surveys and very much in the Booth tradition. Other studies followed. The role of the settlement house movement in publicizing social conditions was not confined to poverty. Indeed issues such as child labor, the treatment of juveniles in the court system, sweated labor and poor housing conditions loomed as large if not larger in their concerns. At the state level some notable innovations were effected, for example, the establishment of the Illinois Juvenile Court in 1899. The Philadelphia Negro (1899), a study by the young black sociologist W. E. B. DuBois, was notable for its thoroughness and detachment.
6. The Social Survey Movement The social survey movement, which was best expressed in the Pittsburgh Survey of 1906–9, was different. It emerged from the charity movement, and the belief among significant numbers of social workers that the roots of social problems were to be found in society rather than in the individual. The Pittsburgh Survey, funded by the newly created Russell Sage Foundation, was more in the nature of investigative journalism, with much less emphasis upon quantification. More emphasis was put on influencing public opinion. On publication, a great effort was made to feed back the results of the research to the locality in which it had been carried out, by means of publicity and exhibits. This gave surveys of this type a completely different character, for they became exercises in community self-study more than scientific examinations of social questions. Indeed, the success of the Pittsburgh Survey was followed by the establishment of the Department of Survey and Exhibits at the Russell Sage Foundation, which between 1912 and 1931 supported a very large number of local surveys. The US social survey movement had run out of steam by 1930 and disappeared shortly thereafter. There were other developments, such as ‘crime surveys’ as an aid to administrative reform, and the country life movement and the church survey movement, but these were of little lasting methodological significance. The geographically local social survey between 1880 and 1940 in the UK and the USA led to distinctive changes in social investigation. What distinguished Booth from Henry Mayhew, W. E. B. DuBois from Ray Stannard Baker, A. L. Bowley from R. H. Tawney, was the urge to present systematic and defensible numerical statements about the problem studied based on a large-scale data collection exercise derived from questioning individuals. Such an appeal to ‘objectivity’ and ‘science’ is one which became increasingly prominent as the twentieth century advanced.
The local survey was closer to social reform and amelioration, to political action and to nonacademic institutions.
7. The Probability Sample Survey After 1920, the social survey developed more rapidly and achieved more outstanding results in the USA than in the UK, and at the same time underwent a change in form. It is true that at the theoretical level, Britain maintained its preeminence for longer. The traffic to sit at the feet of statisticians Karl Pearson, R. A. Fisher, G. Udny Yule, and A. L. Bowley in the University of London continued until well into the 1930s. Leading US survey researchers such as Samuel Stouffer learned advanced inferential statistics from them and returned to apply those methods to good
effect in the USA. At a practical level, however, the main implementation of probability sampling methods which came gradually to be used, giving the survey as a method very much greater power, took place in the USA. After Kiaer and Bowley’s pioneering work, the application of sampling in national studies to provide a representative national picture may be traced to market research and opinion polling in the 1930s. From 1928, George Gallup had been successfully using probability sampling for marketing and political research in the USA. In 1936 Henry Durant established the British counterpart of Gallup’s organization, the British Institute of Public Opinion (BIPO). Gallup’s triumph over the Literary Digest in 1936 in correctly predicting Roosevelt’s victory from a much smaller sample was a clear marker of the future potential of the method. At the same time, efforts to improve social measurement were being made. During the 1920s, political scientists and psychologists, such as Harold Gosnell and L. L. Thurstone, made advances in the systematic measurement of voting and of social attitudes. Research units in government departments, one of them led by Rensis Likert, were established. Paul Lazarsfeld came to the USA from Austria as a Rockefeller Fellow, and remained there to commence survey research on mass media audiences in the mid-1930s. By the late 1930s, the basis for the development of the modern probability social survey was in place. With the outbreak of World War II and in particular the entry of the USA into the war in 1941, significant activity took place in government. The American Soldier study directed by Stouffer was important in its own right theoretically and methodologically, and for the stimulus it gave to postwar research at Harvard and Columbia. Likert’s group in the Department of Agriculture became the nucleus of the postwar development of the Survey Research Center at the University of Michigan. 
Lazarsfeld independently widened his interest in the media to voting and the effects of mass communications in elections. With the increases in federal government support for social science post-1945, survey research took off, becoming established in the academic world as well as in bureaucratic social survey organizations outside it. There was particularly strong cross-fertilization among the National Opinion Research Center at the University of Chicago, the Survey Research Center at the University of Michigan, and the Bureau of Applied Social Research at Columbia. In the UK, government was important, with the Government Social Survey remaining the main center for survey research. Other major survey research organizations and centers in Western Europe followed later. Since the mid-twentieth century, the intellectual leadership in modern sample survey research has lain in the USA, which has tended to remain in the forefront of intellectual fertility, analytic inventiveness, and methodological innovation.
See also: Archives and Historical Databases; Databases, Core: Sociology; Demographic Data Regimes; Demographic Techniques: Data Adjustment and Correction; Economic Panel Data; Longitudinal Data; Panel Surveys: Uses and Applications; Sample Surveys, History of; Survey Research: National Centers; Surveys and Polling: Ethical Aspects
Bibliography
Bulmer M, Sklar K K, Bales K (eds.) 1991 The Social Survey in Historical Perspective 1880–1940. Cambridge University Press, Cambridge, UK
Converse J 1987 Survey Research in the United States: Roots and Emergence, 1890–1960. University of California Press, Berkeley, CA
M. Bulmer
Social Trauma Social trauma is a new concept in the social sciences. It transfers a notion from the clinical psychology of individuals to the sociology of groups and cultural communities. On the social level it denotes a collectivity’s response to an event that is considered to be an overwhelming and unexpected threat to its cultural identity and social order.
1. History of the Concept Although the term trauma had been used in medical contexts before, its origin as a distinctly psychological concept is commonly attributed to Sigmund Freud. In their studies on hysteria Freud and Breuer characterized the memory of the psychic trauma as ‘a foreign body which long after its entry must continue to be regarded as an agent that is still at work’ (Freud and Breuer 1955 (1893–1895), p. 6). Very early Freud noticed the time delay and the possible disparity between the cause and the traumatic response: ‘In traumatic neurosis the operative cause of the illness is not the trifling physical injury but the effect of fright’ (Freud and Breuer 1955 (1893–1895), pp. 5–6). Freud did not yet conceive traumatic memory as a social or collective response. Instead, he refers to processes within the psychic system, in particular to the fundamental tension between instinctive drives and defensive controls. An overwhelming and threatening experience results in an indelible anxiety that is coped with by various defense mechanisms like denial, reversal of cause and effect, or projection of the evil to others. Psychoanalysis prepared the ground, but the rise of the notion to its current use is due to its relation to
Post-Traumatic Stress Disorders (PTSD) that were clinically studied and professionally treated after the First World War (shell shock) and, increasingly, after the Second World War (Holocaust survivors, war experiences, rape, violence). During the 50 years since the wars the range of possible traumatizing experiences as well as the symptomatology of trauma vastly expanded, turning ‘trauma’ into a key concept for the analysis of individual and collective phenomena that before had been studied under the conceptual labels of ‘crisis,’ ‘neurosis,’ ‘anomie,’ and ‘disorder.’
2. Social Trauma as Collective Response to the Breakdown of Order Genocide and the outburst of violence, military defeat and foreign occupation, sudden economic crisis and mass poverty, forced migration and expulsion from home (but also natural disaster and the spread of mortal plagues) are experienced as shocking events that not only cause great individual suffering, but may also result in the breakdown of social order, that is, in the collapse of the most basic ‘taken for granted’ social expectations. As these ‘evil’ events affect more than a few members of a social community, they cannot be considered as exceptional and individual misfortunes, but rather as events that affect the validity and stability of the social order itself. As a result, they erode basic trust in social institutions, in the solidarity that creates the boundaries of family, neighborhood, and country. The predictability of everyday life is suspended; ordinary time seems to stop (Neal 1998); and the social fabric is disrupted. Under the impact of the shocking event, feelings of hopelessness, apathy, fear, and disorientation spread in the community. Participation in general politics and public debates as well as economic investments and consumption decrease sharply, and crime rates increase. This sharp decline of trust and the rise of apathy and disorientation can be called a social trauma. In distinction to individual trauma, the social trauma even extends to those members of the community who, as individuals, did not suffer from the traumatizing condition. Because social trust typically extends beyond immediate others, the distrust produced by collective trauma tends to be copied by other members of the community who are devoid of a strong personal experience of breakdown and crisis. In this conception, social trauma is seen as a response to the sudden breakdown of legitimate social expectations and the rapid decay of social institutions and social structure.
The trauma is seen as an ‘adequate’ or natural response to the collapse of social order, and this response can occur without a time delay, immediately after the experience of the traumatizing condition. Although there are mediating institutions and cultural patterns that reinforce and support the traumatic experience, this experience is seen as caused by an external event or condition that is directly observable and objectively accessible. An event becomes a social trauma by its very catastrophic objectivity, which intrudes into ordinary life and resists any attempt to ignore it or to accommodate to it.
3. Social Trauma as a Construction of Collective Memory An alternative theoretical paradigm defines social trauma as the sudden collapse of the culturally generated web of meaning that supports individual or collective identity. Because the event that disrupts the web of meaning threatens the identity of an individual or of a collectivity, it may often pass unnoticed when it happens: the mind, individual or collective, is unable to comprehend the moment when its own death may occur. Therefore, social traumas are rarely witnessed directly and immediately, but are crystallized and become available for experience only in the reconstruction of the past. It is the individual or collective memory that strives, voluntarily or forced by others, to remember and to cope with an intrusive event that crushed the social and cultural basis of identity. This ruminating memory of a past that suddenly confronts a community with the abyss of incomprehensibility is called a social trauma. Because social traumas refer to incomprehensible or horrible occurrences, they are hard to put into narration or to represent by common images or patterns. As individual or collective identities do not refer to objects but to the subjectivity of agents, they can be named, but they escape general definitions and descriptions and so do the traumatic challenges and threats to these identities. In this respect the traumatic memory profoundly differs from the narrative memory that remembers past episodes of meaningful action. However, the narrative memory repeats the past event because it fits into the cultural patterns of the present; the ruminations of traumatic memory result from the very fact that it cannot be narrated and represented by the cultural forms at hand; it occupies our mind because it resists being integrated into our perspective on the world.
The strong disturbance and disruption of the individual or collective consciousness produces ambivalent responses and fosters hysteria—obsessive anxiety and repression by denial, the fear of conspiracies and the feeling of guilt, the persecution of scapegoats and hostility towards the victims (Smelser 2001).
4. Individual and Collective Trauma The culturally constructed trauma can appear on the level of the individual or on the level of the collective memory or on both. If individuals remember the same kind of personal traumatic memories they can, in principle at least, communicate their common or similar experiences and thus develop a collective traumatic memory as members of a generation, as a community of victims, etc. Thus the victims of rape and violence, of racist attacks or forced migration can develop a distinctive collective identity based on common traumatic experiences. But not all personal traumatic memories can be translated into the collective memory of a community. Some remain enclosed in the individual mind and body, because the individual is unable to talk about it, because no other person is expected to understand it, or because the individual trauma is framed by a collective triumph, like the victory of the nation that tends to forget the personal trauma of its soldiers. In contrast to these private and individual traumas, there are social traumas that are largely disconnected from individual experience and memory, but exist, after some time at least, only on the level of collective communication and public culture. Thus the traumatic origins of a nation, a religious community, a generation or an ethnic group can be remembered by public rituals, memorial days, and official monuments, although there may be no surviving witnesses who could claim a personal memory of the traumatizing event. Here, the continuation and reproduction of the trauma is shifted entirely from personal memory to the collective memory of a community, its social institutions, and cultural traditions. Sometimes the collective remembering of the trauma is even impeded by haunting personal memories that cannot be spoken of. After the social trauma is decoupled from personal memories, and transferred to social carriers devoid of these personal memories it can be remembered by public rituals and literature, and it has to be remembered in this way if it is not to be passed into oblivion.
Although there is hardly a traumatic memory that cannot refer to any individual suffering, the cultural construction of a traumatic past does not directly correspond to the amount of individual suffering and the actual number of victims. A victorious war may not be remembered as a trauma, although the number of casualties may exceed those produced by a war that resulted in a military defeat and that, therefore, is remembered as a social trauma. Viewed from a culturalist perspective, the social trauma is not the plain and natural result of individual suffering but its reconstruction in collective consciousness. Whoever partakes in the communicative reconstruction of the trauma can regard him- or herself as a member of the traumatized community, regardless of whether he or she can claim personal experience of the traumatizing event.
5. Modes of Coping with the Trauma Most analysts of social trauma agree that there are different modes of coping with traumatic memory, but not all consent to the idea that the traumatized community has to go through a fixed sequential order in coping with the memory. Following Freud we can distinguish between the period of latency and the period of speaking out and working through, or between different responses like denial and uncoupling, reversal, and projection. In the period of latency the community is unable to speak publicly about the trauma because the personal memories are still too vivid to be soothed by public rituals. The trauma is denied or silenced in public discourse or official social representations. If the issue cannot be avoided in communication, it is decoupled from the traumatized community: the time and circumstances of victimization are regarded as being radically different from today’s situation; the victims are even seen as partly responsible for their fate (reversal); the evil is projected on scapegoats or on a few unquestionably responsible individuals. Only when some time has passed, when a new generation has entered the stage, when a large and influential group of impartial but interested observers takes part in the public debates, can the trauma be publicly addressed by political representatives and even be ritually remembered by memorial days, monuments, and museums. Collective traumas like the Irish famine or the Holocaust of the European Jews, the slavery of African Americans, or the American civil war usually take more than the timespan of a generation to be publicly remembered and recognized. Even if the social trauma can be spoken of and is publicly recognized, its representation remains mostly an issue of contestation and conflict between different generations, political camps, religious or ethnic groups. Questions as to who is to blame for the trauma and who was the most important victim imply, even for succeeding generations, closeness or distance to the symbolic center of the community and are therefore a matter of identity and interest for collective actors within this community.
In parallel with these public debates and conflicts, the trauma can be transferred to specialized institutional arenas like historical research and museums, literature, or movies, where it is worked through according to the special logic of the respective institution. Finally, the trauma may be accepted by a majority of a community and become an integral part of its construction of collective identity and history. Not all communities remembering a social trauma will pass through all stages of the sequence and, frequently, different modes of coping coexist and are carried by different social groups and institutional arenas.
6. Victims, Perpetrators and the Public Audience In its common usage, the concept of social trauma refers to a community of victims. The trauma of the victims resulted from a profound and unexpected denial of their claims on dignity and autonomy as subjects and cohumans. Instead, they have been treated as objects that could be traded, used, deported, oppressed, and killed. Like objects, they were considered as cases of a category deprived of a name, a face, and a place in the community. Slavery and genocide, displacement, and mass violence are the paradigm cases of this dehumanizing treatment of victims. Immediately after the traumatizing event the surviving victims, because of shame and shock, can hardly report on their suffering; they are muted and sometimes they are even isolated from the regular members of their community in special camps or asylums. It is only from a temporal or social distance that the suffering of victims is remembered, that they are recognized as subjects and symbolically reintegrated into the community. This recognition, remembering, and reintegration cannot be achieved by the descendants of the victims alone. Instead, it requires the participation of an impartial public or of a noninvolved third party who consents to this process and agrees with the cultural self-definition of victims. If a nation or an ethnic group claims the identity of traumatized victim, it remains a contingent question as to whether other nations or groups recognize this claim in a public debate. In contrast to a vast range of research on the trauma of victims, relatively few studies center upon the trauma of perpetrators. The trauma of perpetrators results from the fact that they violated fundamental moral principles that ensure not only the victims’ status as subjects, but also their own status. Because of their shame and their ignorance, perpetrators tend to be silent and to deny the moral crisis that they have caused. In this case, even more than in the case of victims, a strong third party is required to publicly debate the victimization and insist on stigmatizing the perpetrators.
Particular institutions like courts of justice, or professional experts like medical doctors, can also take this position of a noninvolved third party. Only in exceptional cases, however, is the trauma of perpetrators accepted by those who were directly and actively involved in creating the traumatizing event. Instead, it is mainly the succeeding generation and other noninvolved members of the community who can speak of the collective trauma and admit publicly to the collective guilt. Like the social trauma of victims, the trauma of perpetrators also leads to public contestations and conflicts about the question of how the boundaries of the traumatized community of victims or of perpetrators will be demarcated. It also determines who has to be included, even if he or she was not directly concerned as an individual person. In the course of this debate the social trauma of victims and perpetrators is turned into the core of a public discourse about the collective identity of the nation or of the other communities involved. As such, it tends to replace the traditional public representation of national identity that showed the nation as a triumphant or tragic hero resisting its adversaries. This demise of heroism and the rise of social trauma as the core reference for national identity is reflected in public monuments, museums, and movies, but also by the rhetoric and the rituals of political representatives who remember the victims of the past in public.
See also: Collective Behavior, Sociology of; Collective Memory, Anthropology of; Collective Memory, Psychology of; Genocide: Anthropological Aspects; Genocide: Historical Aspects; Mass Killings and Genocide, Psychology of
Bibliography
Eyerman R 2001 Cultural Trauma. University of California Press, New York, CA
Amato J A 1990 Victims and Values. A History and a Theory of Suffering. Praeger, New York
Assmann A, Assmann J 2002 Pits and Pyramids. Dimensions and Dynamics of Cultural Memory. University of Chicago Press, Chicago
Caruth C 1996 Unclaimed Experiences. Trauma, Narrative and History. Johns Hopkins University Press, Baltimore, MD
Caruth C (ed.) 1995 Trauma. Explorations in Memory. Johns Hopkins University Press, Baltimore, MD
Erikson K 1995 Notes on trauma and community. In: Caruth C (ed.) Trauma. Explorations in Memory. Johns Hopkins University Press, Baltimore, MD, pp. 183–99
Freud S 1964 [1939 (1934–1938)] Moses and monotheism. In: Strachey J (ed.) The Standard Edition of the Complete Psychological Works of Sigmund Freud. Hogarth Press, London, Vol. XXIII, pp. 7–237
Freud S, Breuer J 1955 (1893–1895) Studies in hysteria. In: Strachey J (ed.) The Standard Edition of the Complete Psychological Works of Sigmund Freud. Hogarth Press, London, Vol. II
Giesen B 2001 Lost paradises, failed revolutions, remembered victims. In: Giesen B, Eyerman R, Smelser N, Sztompka P (eds.) Cultural Trauma. University of California Press, Berkeley, CA
Giesen B 2002 Triumph and Trauma. Cambridge University Press, New York
Giesen B 2000 National identity as trauma: The German case. In: Strath B (ed.) Myth and Memory in the Construction of Community. Historical Patterns in Europe and Beyond. P I E-P Lang, Brussels, Belgium, pp. 227–47
Kolk B A van der, McFarlane A C, Weisaeth L (eds.) 1996 Traumatic Stress: The Effects of Overwhelming Experience on Mind, Body, and Society. Guilford Press, New York
La Capra D 1998 History and Memory after Auschwitz. Cornell University Press, Ithaca, NY
La Capra D 1994 Representing the Holocaust. History, Theory, Trauma. Cornell University Press, Ithaca, NY
Neal A 1998 National Trauma and Collective Memory. Major Events in the American Century. M E Sharpe, New York
Smelser N J 2001 Psychological Trauma and Cultural Trauma. In: Giesen B, Smelser N, Eyerman R, Sztompka P (eds.) Cultural Trauma. Cambridge University Press, New York, CA
Vries de M 1996 Trauma in cultural perspective. In: Kolk B A van der, McFarlane A C, Weisaeth L (eds.) Traumatic Stress: The Effects of Overwhelming Experience on Mind, Body, and Society. The Guilford Press, New York, pp. 398–413
Weatherill R 1994 Cultural Collapse. Free Association Books, London
B. Giesen
Social Welfare Policies and Gender

In conceptualizing and defining social welfare policies, much discussion has revolved around the residual/institutional or the marginal/comprehensive distinction. This distinction is important in determining what policies are included in the term social welfare policies. Narrow definitions frequently confine social welfare policies to income maintenance programs: assistance and social insurance, and sometimes tax benefits. Broad definitions expand the policies to encompass provision of housing, healthcare, education, and an array of other social services.

The rise of the women’s movement in the late 1960s and the development of women’s studies prompted a re-examination of social welfare policies that focused on gender. In bringing gender into the analysis of social welfare policies, two strategies were pursued. First, the dearth of information about the impact of policies on women and differing outcomes for women and men led to an approach that primarily centered on women. A byproduct of this approach has been to make women visible in the analysis. The second strategy has stressed gender as a relational category where the inquiry explicitly deals with both women and men. Instead of making women visible, this strategy has shown how gender is involved in policies where it had previously been thought to be completely irrelevant.

A number of trends in research on gender and social welfare policies are discernible during the 1980s and 1990s. The initial emphasis on studying the impact of social welfare policies on women has been supplanted by a concern with how gender structures policies and how policies affect gender relations (O’Connor 1996). Generalizations based on the policies of a single country have given way to theory building grounded in comparative analyses (Lewis 1992; Orloff 1993; Sainsbury 1996; O’Connor et al. 1999).
The first wave of theorizing was informed largely by the social welfare policies of the English-speaking countries (e.g., Wilson 1977), while a second wave reflected the experiences of the Scandinavian countries (e.g., Hernes 1987), and in the 1990s the range of countries has further expanded (e.g., Sainsbury 1999). Comparative research has charted and attempted to explain major variations in policies across nations. The variations revealed by this research have challenged earlier generalizations derived from the experiences of one nation, especially those that held that specific features and outcomes were intrinsic to social welfare policies.

Views on the nature of welfare policies have also shifted. In the 1970s a widespread view was that welfare policies were mainly instruments of patriarchal oppression; eventually more consideration has been devoted to the emancipatory potential of policies. This reconsideration has focused on both the unintentional consequences of policies that have benefited women or have undermined traditional gender relations (Pateman 1988) and the implications of different policy constructions (Sainsbury 1996). A parallel shift has occurred in theorizing about women’s relationships to welfare policies. Previously women were regarded as the objects of policy, while in the 1990s research explored women’s agency in shaping welfare policies (Skocpol 1992; O’Connor et al. 1999; Sainsbury 1999). Finally, a weakness of many earlier studies was their exclusive preoccupation with gender. Gradually, however, more attention has been paid to how gender intersects with race, ethnicity, and class in determining social entitlements (e.g., Williams 1989; Quadagno 1994).

Employing the lens of gender, scholars have called in question the definition of social welfare policies. They have been critical of narrow definitions of social welfare policies that focus solely on income maintenance programs. Feminist researchers have objected to the pivotal position of paid work in the analysis of social welfare policies. The emphasis on paid work takes men as the norm, and it obscures the importance of unpaid labor. This focus has also downplayed services and care. The incorporation of gender into the analysis has also substantially altered the research agenda on social welfare policies.
Priority has been assigned to four areas of investigation: (a) the gender division of welfare, (b) the gendered construction of welfare policies, (c) welfare policies as an ordering force in gender relations, and (d) gender and explanations of welfare policies.
1. The Gender Division of Welfare

Social welfare policies, especially assistance and social insurance benefits, traditionally have been conceived as instruments of social protection and redistribution. At a minimum, social welfare policies should protect individuals from poverty and relative deprivation. More ambitiously, the policies aim to promote the general welfare and well being of the population. Mainstream studies evaluating the redistributive effects of social welfare policies largely have been gender blind. The units of analysis have been social class or occupational groups, households, generations, regions within a country, or nations. By contrast, feminist research has focused on the gender division of welfare and how the construction of policies has produced differences in the benefits received by women and men. The differences concern both access to benefits and benefit levels.

With this focus, studies have mapped out the gender composition of beneficiaries of social welfare policies. Early research found that a disproportionate number of women were beneficiaries, but these studies examined the English-speaking countries. When other countries were included in the examination, a different pattern emerged: men predominated as beneficiaries. In these countries social insurance schemes whose eligibility was based on labor market participation were central to social provision; men also collected family allowances as part of these schemes. Even when the allowance has been paid to the married mother, eligibility has been determined by the father’s rights as an insured employee. Although men predominate as beneficiaries in some countries, households headed by women are more reliant on social welfare benefits than male-headed households because men command a larger share of earnings. In many cases, especially when benefits are earnings related, the benefit levels of men are higher than those of women.

Studies have also documented disparities in the poverty rates of women and men. An initial finding was the feminization of poverty in the US in the mid-1970s. Although the poverty rates of women and men in the US had dropped substantially since the 1950s, the composition of the poor had changed dramatically so that women were overwhelmingly in the majority. In many countries the poverty rates of women are higher than those of men. However, among the industrialized countries, the poverty rate of women in the US ranks as highest, and the gender poverty gap is also the largest (Casper et al. 1994) (see Poverty and Gender in Affluent Nations). Nonetheless, there are countries, such as Italy, The Netherlands, and the Scandinavian countries (Sainsbury 1999), where differences in the gender poverty gap are small or women have lower poverty rates than men.
In summary, empirical research on the gender composition of beneficiaries of income maintenance programs in the US revealed an anomaly: women outnumbered men as beneficiaries, but their poverty rates were much higher than men’s. This anomaly, together with cross-national variations, spurred theorizing about the sources of gender inequalities in benefits and the gendered construction of social welfare policies.
2. The Gendered Construction of Social Welfare Policies

Much analysis has concentrated on how gender norms and meanings are encoded in social welfare policies. One line of theorizing emphasizes that the gendered construction of social welfare policies results in dual welfare. This involves a two-tiered system of social benefits, consisting of a ‘masculine’ set of programs geared to individuals who make claims as earned rights based on labor market participation and a ‘feminine’ set of programs oriented to households that claim benefits to compensate for family failures. Underlying the two-tier benefit system are gender norms that define the home as a female sphere and outside work as a male sphere. Women thus claim benefits on the basis of tasks in the home and their benefits are familialized, while men’s claims are primarily labor market related and individualized (Fraser 1989). Another variant of the thesis puts more weight on the dual labor market as the source of dual welfare (Nelson 1984). In any event, according to the dual welfare thesis, women and men are segregated into different types of social welfare programs: women rely more heavily on means-tested or income-tested benefits, and men on social insurance schemes. Assistance programs and social insurance are further differentiated in terms of benefit levels, political legitimacy and administrative intervention in private lives so that the quality of women’s and men’s social rights diverge greatly.

The importance of the dual welfare thesis is that it calls attention to the gender differentiation of social provision and specifies the differing implications for women and men. However, the thesis has been criticized on several grounds. The thesis is a generalization based primarily on social welfare policies in the US, but gender differentiation in entitlements varies in different ways across countries. Even with regard to the social welfare policies of the US the dual welfare thesis is misleading in one significant respect. The emphasis on the predominance of women claiming means-tested benefits overlooks the fact that many more women receive social insurance benefits on the basis of their husband’s entitlements in the US (Orloff 1993; O’Connor 1996). In studying the gendered construction of income maintenance policies, Orloff (1993) and O’Connor et al.
(1999) distinguish between gender differentiation and gender inequality. Gender differentiation is related to separate programs for labor market and family needs, shaping claims to benefits: women claim benefits as wives and caregivers, while men make their claims as family providers and earners. Gender inequality pertains to the differences in the benefit levels of women and men. The legislation of most countries ascribes less value to wifely and motherly labor than to paid work, resulting in gender inequalities in benefit levels. Spouse benefits and additional allowances for dependents have generally been a percentage, often around half, of the regular benefits received by men. Furthermore, the traditional division of labor between the sexes assigns domestic work and care to women, which has limited their ability to qualify for more lucrative benefits as earners.

Another important starting point has been an examination of the inscription of the male breadwinner family model in social security legislation, taxation, and employment policies. As distinct from the discussions on dual welfare and gender differentiation, which look at income maintenance schemes, the male breadwinner model has been applied to a wider range of policies. Lewis and Ostner (1991) argue that this model has been built into social provision in varying degrees, and they classify countries as strong, moderate, and weak male breadwinner states. Subsequently, the weak male breadwinner category has more appropriately been termed the dual breadwinner model (Lewis 1992) or the dual earner model. A major contribution of this typology was to emphasize that the strength of the male breadwinner model as reflected in social welfare policies varied across countries. Nonetheless, the typology does assume a single basic underlying dimension, and that social welfare policies are informed by the male breadwinner model. A major criticism of this analytic construct is that it is based on two principles of entitlement: (a) as breadwinner or earner and (b) as the dependent of the breadwinner. Thus, it overlooks other bases of entitlement and it conflates women’s entitlements as wives and mothers (Sainsbury 1996). The problematic nature of the typology is also revealed by its inadequacy in analyzing the entitlements of single mothers. Hobson (1994) has documented substantial variation across strong breadwinner states, such as The Netherlands, Germany, and Britain, in terms of social welfare policies providing mother-only families with an acceptable standard of living.

Jenson (1997) has criticized Lewis’s emphasis on the nexus between paid work, unpaid labor, and welfare, stressing that unpaid work is not a synonym for care. Jenson calls for making care central in theorizing about the gendered construction of welfare policies. More specifically, this would involve an analysis of ‘the gender division of labor among caregivers, gender differences in the capacity or need to pay, and the gender consequences of different institutional arrangements for provision’ (1997, p. 187).
Sainsbury (1996) has developed a framework based on two ideal types: the male breadwinner model and the individual model. This framework formalizes and expands the dimensions of variation implicit in Lewis’s typology. The dimensions are: the nature of the familial ideology, the basis and unit of entitlement, the taxation system, employment and wage policies, and the organization of care work. It also clarifies an alternative policy model to the male breadwinner model. A major drawback of this analytic scheme is that it limits the models to two contrasting ideal types, but its dimensions of variation have proved to be a useful point of departure for empirical analysis and theory building.

A final development has been to theorize the gendered construction of social welfare policies in terms of gender regimes or gender policy regimes. More precisely, a regime consists of values, norms and rules, thus providing a normative or regulatory framework that influences behavior. A policy regime redirects the analysis from behavior to policies. Analogous to the notion of a social policy regime that emphasizes that a given state–economy organization provides a logic of its own, a gender policy regime is a given organization of gender relations associated with a specific policy logic. The organization of gender relations is determined by principles and norms (gender ideologies and practices) that specify the tasks, obligations, and rights of the two sexes (e.g., Sainsbury 1999).

All of these frameworks have furnished insights into the gendered construction of social welfare policies. Studies employing the frameworks have often challenged earlier classifications of countries’ policies that neglected gender, leading to revisions and new understandings. Analyzing Britain, The Netherlands, Sweden, and the US, Sainsbury (1996) found strong divergences in the policy patterns of the four countries, and that the clustering of countries was quite different from how countries are bracketed together in analyses using mainstream typologies. In several mainstream studies Sweden and The Netherlands have been grouped together as countries with generous social welfare policies, but the gendered construction of Dutch and Swedish policies represented opposite poles, especially prior to the mid-1980s. Similarly, a study comparing the inscription of gender in the social welfare policies of Denmark, Finland, Norway, and Sweden revealed sharp differences between Norway and Sweden, the two countries that most closely approximate the social democratic welfare regime in mainstream literature (Sainsbury 1999). Orloff’s examination of income maintenance policies in Australia, Britain, Canada, and the US led her to conclude that the way gender norms are encoded in these programs undermines the coherence of the liberal welfare regime (O’Connor et al. 1999).
3. The Impact of Social Welfare Policies on Gender Relations

Conceiving of social welfare policies as an ordering force in gender relations rivets attention on the implications of policies for reinforcing or undermining the traditional division of labor among women and men in the family and society. Theorizing on how traditional gender relations have been inscribed in policy frameworks has underlined that policies and benefit structures perpetuate existing relations and the subordination of the female sex. Whether policies actually have these effects, however, is a matter for empirical research. In other words, it is necessary to distinguish between the gendered framework of policies and policy outcomes. Several discussions have conflated policy models and behavior. Women’s labor market participation rates have been taken as evidence of a particular policy model but without examining whether policies are in
place to enable or limit women’s employment. For example, married women in the US have entered the labor force in massive numbers, despite income tax policy and social security measures that are biased heavily in favor of the traditional family with a single earner.

Both the principles underlying entitlement and the redistributive effects of policies have major implications for gender relations. The basis of entitlement is a crucial aspect in determining the impact of welfare policies on gender relations and whether policies buttress women’s dependency on men or enhance their autonomy (Sainsbury 1996). At one extreme, women’s entitlements as wives and means-tested benefits reinforce married women’s dependency. If women are entitled to benefits and services as wives, i.e., their entitlement derives from their husband’s rights, their dependency within the family is exacerbated. In this situation a woman’s rights are jeopardized by divorce or may be contingent upon a marriage test, that is, a specified number of years of marriage that guarantees entitlement. In means-tested programs, the family is usually the unit of benefit, and need is established on the basis of family income. In these instances, benefits are familialized.

Benefits based on the principle of care occupy a middle position. Although these benefits may buttress traditional gender relations, they recognize the importance of care and thus alter notions of deservingness. The principle of care also erodes the importance of marriage. Compared to entitlements as wives, benefits based on the principle of care enhance autonomy. Benefits whose eligibility is based on labor market participation or citizenship/residence have greater potential to increase personal autonomy.
Even if women workers are likely to receive lower benefits compared to men workers, benefits based on labor market participation are an independent source of income and they can lessen married women’s dependency on benefits based on their husband’s rights. Finally, benefits based on citizenship or residence accord equal social rights to women and men. This basis of entitlement neutralizes marital status as a criterion of eligibility. It not only equalizes benefits between women and men but also between unmarried and married persons, and between husband and wife. Likewise, citizenship/residence as a basis of entitlement makes no distinction between paid and unpaid work, which enhances the social rights of persons with weak attachments to the labor market.

Regardless of the basis of entitlement, welfare policies have redistributive effects that can alter the resources of women vis-à-vis men. In turn, changing resources between the sexes can affect gender relations. In countries where married women receive their own retirement pension, even though entitlement was based on their husband’s rights, wives’ contribution to family earnings has on average been higher compared to other couples. An even more striking example is that income maintenance programs, public provision of affordable child care services, and/or tax benefits enable women to leave an unhappy relationship or marriage and strike out on their own. In most industrialized countries lone parent families are increasing, and social welfare policies help to make this family form a viable option (see Lone Mothers in Affluent Nations).

Policies that promote women’s access to paid work are also vital to changing gender relations (Orloff 1993; O’Connor et al. 1999; Sainsbury 1999). Policy measures enabling employment provide both earnings and individualized social benefits, weakening or eliminating dependence on the male family provider. Women with high earnings have substantial bargaining power within the family, and this leverage can alter how partners divide unpaid domestic work among themselves.

While most analysts have empirically investigated the outcomes of policies on gender relations, Fraser (1994) adopts a normative stance and asks how social welfare policies can promote gender equity. She presents three policy models and evaluates the strengths and weaknesses of each model in terms of achieving gender equity. The first is the universal breadwinner model where policies promote women’s employment so that the breadwinner role becomes universal and is no longer reserved for men. In the caregiver parity model, public policies support informal care so that the role of caregiver is on a par with the role of breadwinner. The policies of the earner–carer model support both roles but simultaneously aim at eliminating gender differentiation. This involves policies that aid women in becoming earners and men in becoming carers (cf. Sainsbury 1996, 1999).
4. Gender and Explanations of Social Welfare Policies

In studies emphasizing the gendered construction of policies, gender relations themselves or the prevailing gender ideology are often a primary explanation of the particular form that social welfare policies take. Kessler-Harris (2001), e.g., conceives of gender as a historical agent, as an ideational force that configures and sustains systems of cultural representation. She has examined how gender mediates policy discussions between interest groups in providing or denying legitimacy to their proposals in the areas of social insurance, taxation and labor legislation.

Other studies have emphasized the role of women’s political mobilization and the institutional setting in shaping social welfare policies. In attempting to explain the late introduction of social insurance schemes and the early adoption of maternal welfare measures in the US, Skocpol (1992) underlines the importance of the state in politicizing gender identities that lead groups to mobilize, the correspondence between the form of political organizing and the structure of the state that allows or thwarts a group’s influence on policies, and policy legacies. Along similar lines, O’Connor et al. (1999) suggest an explanatory framework made up of the following elements: social movement mobilization and orientation, political party configuration, the political opportunity structure, and the institutional context and legacy. The movement’s orientation involves its political strategy to achieve gender equality, specifically the movement’s attitude toward the state and its gender ideology, whether emphasis is on sameness or differences between the sexes. Political party configuration refers to the strength of left, center and right parties. The political opportunity structure is defined as access to the state institutions, the stability or fluidity of political alignments and relationships to allies and support groups. The institutional context and legacies encompass the structure of the state, bureaucratic capacity, and the industrial relations framework (see also Sainsbury 1999).

In short, many scholars have abandoned the view that women are simply the objects of policy, or ‘policy takers’ (Hernes 1987, p. 35). Instead research has focused on, first, women’s power resources (organizational forms, discourses and strategies) in a variety of national and historical contexts and, second, how these contexts limit or promote women’s capacities to influence social welfare policies.

See also: Lone Mothers in Affluent Nations; Motherhood: Economic Aspects; Poverty and Gender in Affluent Nations; Poverty, Sociology of; Sex Differences in Pay; Sex Segregation at Work; Social Welfare Policy: Comparisons; Underclass; Welfare Programs, Economics of; Welfare State, History of

Bibliography
Casper L M, McLanahan S S, Garfinkel I 1994 The gender-poverty gap: what we can learn from other countries. American Sociological Review 59: 594–605
Fraser N 1989 Unruly Practices: Power, Discourses and Gender in Contemporary Social Theory. University of Minnesota Press, Minneapolis, MN
Fraser N 1994 After the family wage: gender equity and the welfare state. Political Theory 22: 591–618
Hernes H M 1987 Welfare State and Woman Power. Norwegian University Press, Oslo, Norway
Hobson B 1994 Solo mothers, social policy regimes, and the logics of gender. In: Sainsbury D (ed.) Gendering Welfare States. Sage, London
Jenson J 1997 Who cares? Gender and welfare regimes. Social Politics 4: 182–7
Kessler-Harris A 2001 The Pursuit of Equity: How Gender Shaped American Economic Citizenship. Oxford University Press, New York
Lewis J 1992 Gender and the development of welfare regimes. Journal of European Social Policy 2: 159–73
Lewis J, Ostner I 1991 Gender and the evolution of European social policies. Paper presented at the CES workshop on Emergent Supranational Social Policy, Center for European Studies, Harvard University, Cambridge, MA
Nelson B 1984 Women’s poverty and women’s citizenship. Signs 10: 209–31
O’Connor J S 1996 From women in the welfare state to gendering welfare state regimes. Current Sociology 44(3): 1–130
O’Connor J S, Orloff A S, Shaver S 1999 States, Markets, Families: Gender, Liberalism and Social Policies in Australia, Canada, Great Britain and the United States. Cambridge University Press, Cambridge, UK
Orloff A S 1993 Gender and the social rights of citizenship: the comparative analysis of gender relations and welfare states. American Sociological Review 58: 303–28
Pateman C 1988 The patriarchal welfare state. In: Gutmann A (ed.) Democracy and the Welfare State. Princeton University Press, Princeton, NJ
Quadagno J S 1994 The Color of Welfare. Oxford University Press, New York
Sainsbury D 1996 Gender, Equality and Welfare States. Cambridge University Press, Cambridge, UK
Sainsbury D (ed.) 1999 Gender and Welfare State Regimes. Oxford University Press, New York
Skocpol T 1992 Protecting Soldiers and Mothers: The Political Origins of Social Policy in the United States. The Belknap Press of Harvard University Press, Cambridge, MA
Williams F 1989 Social Policy. Polity Press, Cambridge, UK
Wilson E 1977 Women and the Welfare State. Tavistock, London
D. Sainsbury
Social Welfare Policy: Comparisons

Welfare policy comparisons have been a prominent feature of macro-level social science research since the 1970s. Dominated by political scientists and sociologists, most studies focus on the advanced industrial democracies. The literature is guided by two overarching questions: One, how do we explain the rise of welfare states and their diversity? Two, how do variations in policy design affect welfare distributions? The overview that follows will treat the two questions separately. It has been difficult to arrive at clear-cut answers to either question because of statistical overdetermination: the number of rival explanations often exceeds the number of cases that can be studied (see also Quantitative Cross-national Research Methods).

There are, moreover, conceptual difficulties because scholars do not always compare and explain the same phenomenon. Some compare social policies (say pensions), others welfare states, and still others welfare regimes, and the difference is rarely made clear. The presence of a set of social policies does not automatically imply a welfare state which, following Marshall (1950), implies social entitlements as a matter of citizenship, comprehensive risk coverage, commitments to full employment, and active reduction of inequalities. The welfare state is also an ideological construct, a means to inculcate broad solidarities and national integration (Briggs 1961, Flora 1986). Rather than speaking of more-or-less welfare statism, the literature has moved towards typologies of kinds of welfare states. Many scholars use the concept of welfare regimes to denote the broader complex, or mix, of social welfare provision between state, market, and families (for a discussion, see Esping-Andersen 1999).
1. Conceptualizing Welfare Policy Models

Early quantitative comparisons, such as Cutright (1965) and Wilensky (1975), compared welfare policy differences through national social spending levels. This was challenged on validity grounds (spending may not reflect welfare effort, and it tells us little about what kind of welfare is being produced and for whom). As a result, research in the 1980s and 1990s has been greatly concerned with the precise meaning and content of social policies and the welfare state. Most conceptualizations begin with Marshall’s (1950) emphasis on the rights of social citizenship, and programs are then compared in terms of how strong rights are, to what extent they lessen market dependency (‘decommodification’), and to what degree they segment or unify citizens (Flora and Alber 1981, Korpi 1983, Baldwin 1990, Esping-Andersen 1990). More recently, reflecting the rise of feminist scholarship, new dimensions, such as the inherent gender bias, or the degree of women ‘friendliness,’ have been added (Hernes 1987, Sainsbury 1996, Orloff 1993, O’Connor et al. 1999).

The greater conceptual attention to policy content and objectives has led to a keener appreciation of the qualitative differences between national welfare systems. Drawing upon Titmuss (1958, 1974), there emerged a voluminous literature on welfare state typologies, initially focused on welfare state models, such as in Furniss and Tilton (1979). As research expanded to the interplay between state, market, and family welfare provision, classifications of welfare regimes emerged, such as the liberal, conservative, and social democratic trichotomy in Esping-Andersen (1990), and further refinements, such as Castles’ (1993) notion of a unique Antipodean ‘wage earners’ model’ and the identification of a specific, unusually familialistic and patronage-based Mediterranean model (Castles 1995, 1998, Esping-Andersen 1999).
Some feminist writings have proposed typologies based on the strength of the conventional male breadwinner assumption (Lewis 1993). With the exception of the latter, most typologies classify nations rather similarly: a Nordic group, distinguished by its universalism, comprehensive coverage, generosity, and a strong pro-women bias; an Anglo-Saxon group, characterized by its residual public provision, its targeting of benefits, and its encouragement of private welfare; a Continental European group, which stands out for its corporativistic status-segmented social insurance; and, more controversially, a Mediterranean (and, possibly, East Asian) group primarily distinguished by its strong accent on familial welfare provision. As Castles (1993) points out, these typologies resemble ‘families of nations,’ nation-clusters with a deep common historical tradition.
2. Explaining Policy Divergence

The great vitality of comparative welfare state research comes from the way it addresses central theoretical issues in the social sciences. Existing reviews of the literature usually distinguish two broad explanatory thrusts: the convergence thesis that derives from theories of industrialism and modernization, and the ‘politics matter’ thesis that derives from theories of conflict and power mobilization (Shalev 1983, Esping-Andersen and van Kersbergen 1992, O’Connor and Olsen 1998). In addition, some theorists emphasize the process of nation building and democratization (Flora 1986, Flora and Alber 1981), nations’ position in, and response to, the world economy (Cameron 1978, Katzenstein 1985), and also the policy-making role that state bureaucracies exert (Skocpol and Amenta 1986).
2.1 The Modernization Thesis

The earliest systematic welfare state comparisons were chiefly inspired by the theory that industrialization (accompanied by urbanization, the rise of efficient bureaucracy, and demographic aging) erodes traditional social protection mechanisms, such as the family, and makes public responsibility necessary. As nations industrialize, they will also converge because social risks become more similar, and because the functional necessity of public social security becomes pressing. The main exponents of this thesis include Rimlinger’s (1971) as yet unrivaled historiography of the roots of welfare policy in Europe, America, and Russia, and Wilensky’s (1975) quantitative demonstration that rich countries gradually converge in terms of the share of gross domestic product (GDP) spent on welfare. Wilensky provided no direct measure of political variables, and his conclusions were therefore challenged by adherents of the ‘politics matter’ position.
2.2 The ‘Politics Matter’ Thesis

The resurgence of conflict theory in the 1970s provoked two antitheses to the modernization paradigm.
One was Marxian and quite functionalist, arguing that the modern welfare state was not an egalitarian response to new needs, but rather a mechanism, however contradictory, of sustaining the legitimacy of capitalist economies (Gough 1979, Offe 1972, O’Connor 1973). This perspective has produced little systematic empirical comparison. The second antithesis argues explicitly that ‘politics does matter’ in the sense that the nature, scope, and objectives of a welfare state depend primarily on the balance of political power between ‘left’ and ‘right.’ Since it is well established that the roots of welfare policy are generally conservative, as in the case of Bismarck, this position is mainly concerned with welfare state variations in mature industrial democracies. The premise is that the collective resources of wage earners can, through parliamentary democratic means, be mobilized so as to establish a kind of welfare state that effectively circumscribes the distributional logic of markets and the decisional power of capital (Korpi 1983, Stephens 1979, Castles 1982, Hicks 2000; for an overview, see O’Connor and Olsen 1998). This thesis lies close to classical social democratic theory and its postulate that a parliamentarian transition from capitalism to democratic socialism is possible (Stephens 1979, Esping-Andersen 1990). For this reason, it is often labeled the ‘social democratic’ model of the welfare state. Since Sweden is the epitome of social democratic power and achievement, the thesis has been criticized for its inherent ‘Swedo-centrism’ (Shalev 1983). The power mobilization hypothesis has given rise to a very large but also conceptually and methodologically heterogenous literature which, cumulatively, makes a strong case in favor of the underlying thesis (the implications of methodological design are examined in Janoski and Hicks 1994). 
Whether one studies individual social programs, such as pensions, sickness benefits, or labor market policies (Myles 1984, Palme 1990, Korpi 1989, 1991, Kangas 1991), or, more broadly, ‘welfare states’ (Stephens 1979, Korpi 1983, Esping-Andersen 1990, Huber et al. 1993, Hicks 2000), the impact of the political strength of the left is quite robust. Still, the ‘left power’ explanation is questioned by several authors working within the ‘politics matter’ tradition. Some argue against a simple, linear left-power effect, on the grounds that real politics are premised on enduring party coalitions. Postwar Christian Democracy, for example, has been seen as a functional equivalent to labor movements, producing similarly generous welfare states (Wilensky 1981). Esping-Andersen (1990, 1999), Huber et al. (1993), Castles (1994) and, especially, van Kersbergen (1995) demonstrate, however, that Christian Democracy produces welfare states that may be generous, but which are also strongly corporativistic in design, very ‘familialistic,’ and transfer-biased. Baldwin (1990) also questions the simple left-power thesis because what are usually imputed to be social democratic achievements (such as universalism) arose from entirely different political quarters (agrarians in particular). In a unique synthesis of the modernization and politics-matter theses, Pampel and Williamson (1989) suggest that all governments, left or right, will favor welfare expansion because of population and, thus, median-voter aging.

2.3 The Impact of Globalization

Since there is a strong coincidence between a country’s size, its degree of economic openness, and the scope of its welfare state, the correlation between left power and welfare statism may be causally spurious (Cameron 1978, 1984, Katzenstein 1985). The point is that small, export-dependent economies are prone to international shocks and need to be adaptive. This favors the emergence of centralized, consensual social pacts between the social partners, but also strong social guarantees to workers in order to assure their compliance. Put in this way, the causal link between power and welfare state outcomes becomes ambiguous. It may be that powerful labor movements and generous welfare states both derive from global economic pressures. It is difficult to sort out the real causal order because the number of countries we can compare is so small.

2.4 Nation-building and Democratization

The puzzle that any ‘politics-matter’ explanation faces is that the first social reformist steps were usually taken by conservative, sometimes anti-democratic forces, often with the explicit intent of containing working class mobilization (Rimlinger 1971, Flora and Alber 1981). Social policy per se cannot therefore be a robust indicator of left-political achievements. The issue is really one of historical sequence. Most would agree that the historical roots of social policy are closely connected to late nineteenth-century nation building, a means to sustain the established order in the face of emerging social, linguistic, ethnic, or religious cleavages.
The ‘politics-matter’ paradigm, in contrast, is concerned with the epoch of actual welfare state construction in the postwar era of full democracy.

2.5 The State-centered Thesis

Modernization theory stresses the functional necessity of efficient bureaucracy for welfare state development because, without it, effective taxation and administration of complex distributional programs are simply impossible. Bureaucracy, once erected, may also come to play an active role in decision-making and thus determine policy evolution. Heclo (1974) maintained that Swedish and British welfare policy, although originally rather similar, eventually parted ways because of inherently different bureaucratic policymaking styles and traditions. The independent powers of public officials in framing key decisions have also been emphasized, especially by Skocpol (1992). This so-called state-centered thesis has proven difficult to test empirically against rival arguments. As a result, empirical applications are limited to select case studies (such as Skocpol 1992) that are difficult to generalize, and a handful of more systematic, quantitative studies (Korpi 1989).
3. The Consequences of Welfare States

The literature on the consequences of welfare state differences is possibly as large as that on their causes, but the protagonists tend to be different. For one, there is a longstanding tradition in economics of studying the micro-economic effects of social policies on work incentives as well as their macro-economic effects on inflation, savings, unemployment, or economic growth. This literature, which, in any case, is rarely comparative, will not be reviewed here (for an overview, see Danziger et al. 1981, Atkinson and Mogensen 1993, and Barr 1993). Closer to our concerns are studies that examine the impact of welfare state design on distributional outcomes. Most scholars agree that the focus should be on life chance dynamics, but this has been thwarted by data scarcity. The literature has therefore concentrated on cross-sectional analyses of income inequalities, poverty and, in recent years, also on gender equality.

3.1 Poverty and Income Distribution

The conventional assumption has been that equality is positively and linearly related to welfare state generosity. This was implicit in Wilensky (1975), and some early attempts at empirical verification produced positive and significant correlations (Hewitt 1977). Until the mid-1980s, however, robust comparisons were hampered by severe problems of cross-national data comparability. Only with the advent of the Luxembourg Income Study (LIS) and, later, European Union data collection, did genuinely comparable income data become available. Most studies of the redistributive effects of welfare states (or single social programs) are based on a comparison of pre- and post-transfer/tax distributions (a representative example is Smeeding et al. 1990; an extensive review of the literature is provided by Gottschalk and Smeeding 1997). A few studies attempt to study more explicitly the impact of types of welfare states or social policies on distributional outcomes (Ringen 1987, McFate et al. 1995).
Quantitative comparisons come to a surprisingly consensual view: the larger the welfare state, the greater the degree of equality in net disposable income, and the lower the rates of poverty. A less obvious finding is that welfare states targeted to the truly needy do not result in more poverty reduction than does the universalistic approach to social protection (for a review, see Gottschalk and Smeeding 1997). Reflecting the paucity of comparable panel data, studies on poverty and inequality dynamics are, so far, few. Preliminary evidence does, however, suggest that comprehensive, generous welfare states are substantially more able to guarantee against long-term entrapment in poverty or welfare dependency (Burkhauser and Poupore 1993, Duncan et al. 1993). Most studies conclude that targeted, means-tested policies create strong poverty traps. The finding that comprehensive, universalistic welfare states are more redistributive than those favoring heavy targeting has provoked debate about the precise mechanisms involved. Some argue that income tests create stigma and lower the take-up rate. With their notion of the ‘paradox of redistribution,’ Korpi and Palme (1998) argue that the basic reason lies in the crowding out of private welfare that ensues from comprehensive, universalistic policy: limited welfare states nurture private welfare plans which are extremely inequality producing.
3.2 Gender Inequality

The study of welfare policies and gender inequalities is primarily focused on family-related social policies, such as caring services and leave schemes; what Bradshaw et al. (1993) call the family benefit package. Specific policy variations are linked to either welfare state typologies (as in the case of Gornick et al. 1997, Gustafsson 1994), or to male-breadwinner typologies (Lewis 1993, McLanahan et al. 1995, Sainsbury 1996), and welfare systems are often ranked in terms of their degree of ‘women friendliness’ (for an overview, see O’Connor 1996). A number of rather clear-cut empirical conclusions have emerged. One, day care provision is a powerful precondition for women’s economic independence, for their capacity to harmonize careers and family, and also for safeguarding against poverty in single parent families (Gornick et al. 1997, Gustafsson 1994). A second finding is that welfare systems that grant women independent entitlements both lessen dependency and augment gender equality (McLanahan et al. 1995, Sainsbury 1996, O’Connor et al. 1999).

See also: Policy History: State and Economy; Social Security; Social Welfare Policies and Gender; Welfare; Welfare: Philosophical Aspects; Welfare Programs, Economics of; Welfare State; Welfare State, Geography of; Welfare State, History of
Bibliography

Atkinson A, Mogensen G V (eds.) 1993 Welfare and Work Incentives. Clarendon Press, Oxford, UK
Baldwin P 1990 The Politics of Social Solidarity. Class Bases of the European Welfare State 1875–1975. Cambridge University Press, Cambridge, UK
Barr N 1993 The Economics of the Welfare State. Stanford University Press, Palo Alto, CA
Briggs A 1961 The welfare state in historical perspective. Archives Européennes de Sociologie 2: 221–58
Burkhauser R, Poupore J 1993 A cross-national comparison of permanent inequality in the United States and Germany. Cross-National Studies in Ageing. Program Project Paper no 10. State University of New York, Syracuse, NY
Cameron D 1978 The expansion of the public economy: A comparative analysis. American Political Science Review 72: 1243–61
Castles F G 1982 The impact of parties on public expenditure. In: Castles F G (ed.) The Impact of Parties. Sage, London, pp. 21–96
Castles F G 1993 Families of Nations: Patterns of Public Policy in Western Democracies. Aldershot, Dartmouth, UK
Castles F G 1994 On religion and public policy: Does Catholicism make a difference? European Journal of Political Research 25: 19–40
Castles F G 1995 Welfare state development in Southern Europe. West European Politics 18: 291–313
Castles F G 1998 Comparative Public Policy. Elgar, Cheltenham, UK
Cutright P 1965 Political structure, economic development and national social security programs. American Journal of Sociology 70: 537–50
Danziger S, Haveman R, Plotnik R 1981 How income transfers affect work, savings, and income distribution. Journal of Economic Literature 19
Duncan G, Gustavsson B, Hauser R 1993 Poverty dynamics in eight countries. Journal of Population Economics 6: 215–34
Esping-Andersen G 1990 The Three Worlds of Welfare Capitalism. Polity Press, Cambridge, UK
Esping-Andersen G 1999 Social Foundations of Postindustrial Economies. Oxford University Press, Oxford, UK
Esping-Andersen G, van Kersbergen K 1992 Contemporary research on social democracy. Annual Review of Sociology 18: 187–208
Flora P 1986 Introduction. In: Flora P (ed.) Growth to Limits, Vol. 1. De Gruyter, Berlin
Flora P, Alber J 1981 Modernization, democratization and the development of welfare states in Western Europe. In: Flora P, Heidenheimer A J (eds.) The Development of Welfare States in Europe and America. Transaction Books, New Brunswick, NJ
Furniss N, Tilton T 1979 The Case for the Welfare State. Indiana University Press, Bloomington, IN
Gauthier A 1996 The State and the Family. Clarendon Press, Oxford, UK
Gornick A, Meyers M, Ross K E 1997 Supporting the employment of mothers. Journal of European Social Policy 7: 45–70
Gottschalk P, Smeeding T 1997 Cross-national comparisons of earnings and income inequality. Journal of Economic Literature 35: 633–81
Gough I 1979 The Political Economy of the Welfare State. Macmillan, London
Gustafsson S 1994 Childcare and types of welfare states. In: Sainsbury D (ed.) Gendering Welfare States. Sage, London
Heclo H 1974 Modern Social Policies in Britain and Sweden. Yale University Press, New Haven, CT
Hernes H 1987 The Welfare State and Women Power. Norwegian University Press, Oslo, Norway
Hewitt C 1977 The effect of political democracy and social democracy on equality in industrial societies. American Sociological Review 42
Hicks A 2000 Social Democracy and Welfare Capitalism. Cornell University Press, Ithaca, NY
Hicks A, Swank D 1992 Politics, institutions, and welfare spending, 1960–1982. American Political Science Review 86(3): 658–74
Huber E, Stephens J, Ragin C 1993 Social democracy, christian democracy, constitutional structure and the welfare state. American Journal of Sociology 99: 711–49
Janoski T, Hicks A (eds.) 1994 The Political Economy of the Welfare State. Cambridge University Press, Cambridge, UK
Kangas O 1991 The Politics of Social Rights. Swedish Institute for Social Research, Stockholm, Sweden
Katzenstein P 1985 Small States in World Markets. Cornell University Press, Ithaca, NY
Korpi W 1983 The Democratic Class Struggle. Routledge, London
Korpi W 1989 Power, politics and state autonomy in the development of social citizenship. American Sociological Review 54: 309–28
Korpi W 1991 Political and economic explanations for unemployment. British Journal of Political Science 21: 315–48
Korpi W, Palme J 1998 Redistribution and strategies of equality in Western countries. American Sociological Review 5: 661–87
Lewis J 1993 Women and Social Policies in Europe. Elgar, Aldershot, UK
McFate K, Lawson R, Wilson W J 1995 Poverty, Inequality, and the Future of Social Policy. Russell Sage, New York
McLanahan S, Casper L, Sorensen A M 1995 Women’s roles and women’s poverty. In: Oppenheimer V, Jensen A (eds.) Gender and Family Change in Industrialized Countries. Clarendon Press, Oxford, UK
Marshall T H 1950 Citizenship and Social Class. Oxford University Press, Oxford, UK
Myles J 1984 Old Age in the Welfare State. Little, Brown, Boston
O’Connor J 1973 The Fiscal Crisis of the State. St Martin’s Press, New York
O’Connor J 1996 From women in the welfare state to gendering welfare state regimes. Current Sociology (special issue) 44
O’Connor J, Olsen G 1998 Power Resources Theory and the Welfare State. University of Toronto Press, Toronto
O’Connor J, Orloff A, Shaver S 1999 States, Markets, Families. Cambridge University Press, Cambridge, UK
Offe C 1972 Advanced capitalism and the welfare state. Politics and Society 4: 479–88
Orloff A 1993 Gender and the social rights of citizenship. American Sociological Review 58: 303–28
Palme J 1990 Pension Rights in Welfare Capitalism. Swedish Institute for Social Research, Stockholm, Sweden
Pampel F, Williamson J 1989 Age, Class, Politics and the Welfare State. Cambridge University Press, Cambridge, UK
Rein M, Rainwater L 1986 The Public-Private Mix in Social Protection. M E Sharpe, Armonk, New York
Rimlinger G 1971 Welfare and Industrialization in Europe, America, and Russia. Wiley, New York
Ringen S 1987 The Politics of Possibility: A Study in the Political Economy of the Welfare State. Clarendon Press, Oxford, UK
Sainsbury D 1996 Gender, Equality and Welfare States. Cambridge University Press, Cambridge, UK
Shalev M 1983 The social democratic model and beyond: Two generations of comparative research on welfare states. Comparative Social Research 6: 315–51
Skocpol T 1992 Protecting Soldiers and Mothers. Harvard University Press, Cambridge, MA
Skocpol T, Amenta E 1986 States and social policies. Annual Review of Sociology 12
Smeeding T, O’Higgins M, Rainwater L 1990 Poverty, Inequality and Income Distribution in Comparative Perspective. Harvester/Wheatsheaf, New York
Stephens J 1979 The Transition from Capitalism to Socialism. Macmillan, London
Titmuss R 1958 Essays on the Welfare State. Allen and Unwin, London
Titmuss R 1974 Social Policy. Allen and Unwin, London
Van Kersbergen K 1995 Social Capitalism. Routledge, London
Wilensky H 1975 The Welfare State and Equality. University of California Press, Berkeley, CA
Wilensky H 1981 Leftism, catholicism and democratic corporatism. In: Flora P, Heidenheimer A (eds.) The Development of Welfare States in Europe and America. Transaction Press, New Brunswick, NJ
G. Esping-Andersen
Socialism

A workable definition of Socialism, subject to the qualifications offered below, would be this: ‘Socialism is an economic organization of society in which the material means of production are owned by the whole community and operated by organizations that are representative of, and responsible to, the community according to a general economic plan, all members of the community being entitled to benefit from the results of such socialized planned production on the basis of equal rights’ (Lichtheim 1971, p. 113). Leszek Kolakowski’s survey (1978) calls attention to ‘the surprising diversity of views expressed by Marxists in regard to Marx’s so-called historical determinism.’ However, what is ‘surprising’ is not this diversity, which becomes much wider if we examine not Marxism in particular, but socialism in general. The surprising thing is Kolakowski’s pontifical belief that faced with it he could ‘schematize with precision the trends of twentieth century Marxism’ (1978, Vol. I, pp. 6–7). His failure to attain such precision serves as a salutary warning to anyone attempting to characterize the broader and much more ‘diverse’ terrain of socialism, of which Marxism (nineteenth and twentieth century Marxism) is itself but a ‘main current.’ In what follows, which will (in Wittgenstein’s words) ‘strive not after exactness, but after a synoptic view,’ attention will be paid to Cole’s admonition: that ‘the most that can be attempted is the discovery of some central core of meaning, present with varying additions in all or most of the manifold uses of the word (socialism)’ (Cole 1967, Vol. I, p. 1). This entails sensitivity to the shifts in meaning socialism has undergone since its inception; only beneath these shifts can a substratum of meaning emerge.
1. Capitalism, Socialism and Democracy

The word ‘socialism’ first appeared in Robert Owen’s Co-operative Magazine in 1827, ‘le socialisme’ in the Saint-Simonian journal Le Globe in 1832. It meant a projected alternative sociopolitical system (this was still offered as a definition of ‘socialism’ in the 1933 OED), not a political movement or demand. Early socialist doctrine contained little about the proletariat, the class system, labor, or revolution. While John Stuart Mill’s Principles of Political Economy accordingly distinguished socialism and communism as theories, not political movements, Marx and Engels’s Manifesto of the Communist Party of the same year (1848) identified communism as a movement that would render socialism, so defined, beside the point because it was merely a theory: utopian, that is, because it was impracticable. It was communism, not socialism, that carried with it the idea of revolutionary struggle and human agency; a new and better society could not be wished or legislated into existence by benevolent doctrinaires, but had to await the advent of a politically conscious, class-based labor movement. This was a key shift in the coordinates, not just of communism, but of socialism. It too had now to be, not just thinkable, but feasible. After the failure of the 1848 revolutions, communism ceased for many years to be practical politics, yet some time was to pass before Marx and Engels consented to having their cause described as ‘socialist’ (Lichtheim 1969, p. 209). They did so because socialism had become not just a doctrine but a movement. Thereafter, ‘Marxism created that distinctively German socialism which was soon to assume an ideological dominance over most of the continent, driving the older forms of socialism before it as chaff before the wind’ (Cole 1967, Vol. I, p. 222).
The second key shift in the coordinates of European socialism occurred in 1917, when, against all the odds, communism in its Leninist form arose as a clear alternative to democratic socialism, the same democratic socialism that had itself, against all previous odds, survived World War One. Both authoritarian communism and democratic socialism laid claim to a lineage deriving from Marx; but they had little else in common. Twentieth-century communists showed no reluctance to clothe their particular doctrines in the broader mantle of socialism. The locutions, ‘Union of Soviet Socialist Republics,’ ‘socialism in one country,’ ‘socialist realism,’ ‘socialist internationalism’ (in its twentieth-century Soviet version, which meant subservience to the foreign policy of the USSR), the fact that after World War Two the East German Communist Party was officially called the ‘Socialist Unity Party’: all these are instances of synecdoche, if not the sheerest verbal trickery. We should not allow them to addle us into thinking, languidly, that all twentieth-century communists were socialists, whereas not all socialists were communists. There were in the twentieth century not just non- but also anti-communist socialists aplenty, socialists actively and vociferously opposed to communism, from the very moment when such opposition first became possible, at the Berne International in February 1919, when adherence to democracy was reasserted as a prerequisite for socialism (Cole 1967, vol. IV, pp. 292–9). The socialism that survived World War One to become part of the European political landscape had had as its prerequisite not democracy but capitalism. Socialists had long been in the forefront of the struggle for democracy. (Indeed, socialism was weakest in those countries, the US and the UK, where it was a less powerful force in the struggle for democracy than it was on the European continent.) In Cole’s words, Liebknecht, Bebel, and Kautsky in Germany, Guesde in France, indeed, most socialist stalwarts in Europe before World War One ‘felt that they had no right to make the revolution without the backing, or at least the assent, of the people. They thought of the proletariat as being, or as in the process of becoming, the majority of the people; and they envisaged the mass conversion of the proletariat to the socialist cause as a necessary preliminary to the revolution’ (Cole 1967, Vol. III, pp. 946–7). This democratic perspective was not necessarily reformist or purely parliamentary; democratization was also meant to encompass work relations and trade union structures, and could scarcely have avoided doing so.
What the Second International socialists rightly feared was war; the solitary success of Russian Bolshevism in 1917 was as little expected or foreseen as its eventual collapse was to be in 1989.
2. Marx, Marxism and Socialism

Marx casts a longer shadow over the history of socialism than anyone else, not because every socialist agreed (or was even very familiar) with what he said, but because, all the same, his presence proved impossible to ignore. It is in this sense that Marxism can be called the lingua franca of socialism (Cole 1967, vol. I, p. 379). Without Marx, we would still have had revolution as a word and as a concept, as the French and the Americans had put revolutionary change on the political agenda during the eighteenth century, and we would have had capitalism, socialism, communism, and (thanks to Babeuf) the proletariat too. The celebrated opening line of the Manifesto of the Communist Party was designed to call the reader’s attention to ‘the specter haunting Europe,’ to invoke, not invent, communism. It is no doubt easier to imagine a world without Marx than a world without revolution, capitalism, communism, and socialism. But in the world we actually inhabit, all these still have to be seen through Marx. He may not have coined these terms, but he decisively set his seal on them. They still cannot properly be discussed without bringing him in. Marx contributed enormously to the vocabulary of socialism (class, class struggle, class war, class consciousness, ideology, alienation, the commodity, forces, relations, and modes of production, exploitation, contradiction, and crisis). Even though ‘socialist and communist movements (took) up some aspects of Marx’s theory and discarded others, even this selectivity (was) carried out in Marx’s name’ (Gilbert 1981, p. 270). Yet much that was carried out ‘in Marx’s name’ was attained without much knowledge of what he had written. The revisionism debate (at the turn of the twentieth century) that almost tore apart German Social Democracy and the Second International, was about Marxist ‘orthodoxy,’ but was conducted in an astounding ignorance of Marx’s writings. While practicing politicians commonly regard themselves as having more immediately pressing concerns than questions of textual fidelity, these same questions were never ignored among Marxists, although the theoretical corpus to which socialist stalwarts appealed was not necessarily made up of Marx’s writings. The problem was not that a canon of basic texts failed to emerge. (To the contrary, textual fidelity was adhered to tirelessly and, at times, tiresomely.) It was that the canon as it developed was not so much derived from what Marx had written as it was constructed around what little of Marx was available in published form.
Historical materialism (it still amazes people when they learn that this term was not Marx’s own) had its basic texts long before the first appearance in any language (during the 1930s) of so basic a text as Marx and Engels’s The German Ideology. These texts included Franz Mehring’s On Historical Materialism (1893), G. V. Plekhanov’s The Development of the Monist View of History (1895), Antonio Labriola’s Essays on the Materialist Interpretation of History (1896) and Karl Kautsky’s The Materialist Interpretation of History (1927). Even though Marx in some sense inspired these books and others like them, what he had to say eventually needed to be disentangled from their arguments. It could in no way be directly encountered through them or explained by them (Thomas 1991, pp. 32–3).
3. The Varieties of Socialist Experience

Even if Marx was more known about than known, the fact remains that by the end of the nineteenth century a politically-oriented, class-conscious European socialist movement had emerged in his name; and by this time the word socialist meant what it still means today, a social and political doctrine and a political movement or system. In view of this latitude of meaning it is scarcely surprising that ‘socialism’ has come down to us in social science literature as a noun qualified by an adjective, as in ‘utopian socialism,’ ‘scientific socialism,’ ‘state socialism,’ ‘revolutionary socialism,’ ‘evolutionary socialism,’ or (until recently) as ‘actually existing socialism,’ or again (today) as ‘market socialism.’ Similarly, the adjective ‘socialist’ can be employed as a pendant qualifying or characterizing an abstract noun, as in ‘socialist internationalism,’ ‘socialist economics,’ ‘socialist realism,’ or ‘socialist feminism.’ The above-listed collocations (the inventory is illustrative, not exhaustive) are not just convenient subdivisions, but serial attempts at spot-welding that can readily be arrayed in historical sequence. ‘Utopian socialism,’ as we have seen, dates from the 1830s and 1840s, ‘Fabian’ or ‘evolutionary socialism’ dates from the late nineteenth and early twentieth centuries, ‘Guild socialism’ dates from the period between the two World Wars, and ‘actually existing socialism’ dates from the latter days of Soviet communism. ‘Socialist internationalism’ peaked, then bottomed out dramatically, in the period before 1914; ‘socialist realism’ (like ‘socialism in one country’) belongs to the Stalinist period in the Soviet Union; ‘socialist feminism’ to Western scholarship of a later date. To array these composite terms historically in order to arrive at the meaning of their least common denominator, socialism, is not to pigeonhole them arbitrarily, nor yet to proceed reductively, if it is borne in mind throughout that each instance of spot-welding affected the coordinates of its successors.
More specifically, anarchism, syndicalism and their offshoots (whether revolutionary or not) are kinds of socialism that were defined and elaborated against state socialism (Thomas 1980); Engels in 1875 contrasted scientific with utopian socialism (even if the latter was largely a spent force, the former was to enjoy some shelf-life well into the twentieth century); Lenin defined Bolshevism, then communism as he understood the term, against the parliamentary, reformist socialism of the pre-1914 German SPD (and by extension the Second International); this reformist socialism had itself been elaborated in contradistinction to the revolutionary socialism that appeared to have had its comeuppance at the time of the Paris Commune (1871). Lenin’s successors proceeded to counterpose socialism in one country against the socialist internationalism the First and Second Internationals had espoused. Overall, these developments do not add up to a tidy, formulaic picture. Yet there is a subtle patterning in the carpet. Positions and tendencies throughout the history of socialism were debated, articulated and defined in relation to other, pre-existing socialist positions and tendencies.
See also: Marx, Karl (1818–89); Marxian Economic Thought; Marxism and Law; Marxism/Leninism; Marxist Social Thought, History of; Social Democracy; Social Security; Social Welfare Policy: Comparisons; Socialism: Historical Aspects
Bibliography
Bottomore T B 1993 Socialism. In: Outhwaite W, Bottomore T B (eds.) The Blackwell Dictionary of Twentieth Century Political Thought. Blackwell, Oxford, pp. 617–9
Braunthal J 1966 History of the International 1864–1914, 2 Vols. Collins H, Mitchell K (trans.). Nelson, London
Cole G D H 1967 A History of Socialist Thought, 5 Vols. Macmillan, London
Gilbert A 1981 Marx’s Politics. Communists and Citizens. Rutgers University Press, New Brunswick, NJ
Kolakowski L 1978 Main Currents of Marxism, 3 Vols. Falla P S (trans.). Clarendon Press, Oxford, UK
Landauer C 1959 European Socialism: A History of Ideas and Movements from the Industrial Revolution to Hitler’s Seizure of Power, 2 Vols. University of California Press, Berkeley, CA
Lichtheim G 1961 Marxism: An Historical and Critical Study. Praeger, New York
Lichtheim G 1969 The Origins of Socialism. Praeger, New York
Lichtheim G 1971 A Short History of Socialism. Praeger, New York
Schorske C E 1955 German Social Democracy 1905–1917. The Development of the Great Schism. Harvard University Press, Cambridge, MA
Thomas P 1980 Karl Marx and the Anarchists. Routledge and Kegan Paul, London
Thomas P 1991 Critical reception: Marx then and now. In: Carver T (ed.) The Cambridge Companion to Marx. Cambridge University Press, Cambridge, UK
P. Thomas
Socialism: Historical Aspects
1. The Meaning of a Group of Ideas
The term socialism designates a set of doctrines which originated in the nineteenth century and were intended to criticize the existing social organization and values (capitalism, bourgeois society, individualism) in terms of equality. Socialist ideologies contributed to the formation of a cultural world, linked mainly with the lower classes, composed of political parties and other associations which had as their goals projects of future societies and programs calling for collective action. In fact, the term socialism must encompass a certain number of doctrines, political opinions, ideological vocations, and feelings of rebellion and emancipation. Today, the way of considering this problem is far from the acceptance of a hierarchy and of a teleological
vision, which was established essentially by the Marxist tradition and defined the history of socialism as a successive progression from ‘precursors’ and ‘utopians’ to a more and more ‘mature’ conception of socialism. Socialism, in its origins, both from a linguistic and conceptual point of view, was part of that rich patrimony of ideas which originated with the French revolution and at the same time was linked to the social experiences evolving from the industrial revolution. During the nineteenth century, in continental Europe, the term was also used to define social and economic doctrines advocating an intervention of the State in the economy and/or a change in the regime of property from the individual to the collective property of the means of production. In the twentieth century, the socialist movement was widely permeated by political conflicts on how to achieve a socialist society (reforms or revolution) and also regarding what kind of model such a society should be based on. Marxism was the most influential doctrine in the constellation of socialism, both for the role it played in the history of socialist culture and for the contribution it made to the creation of socialist political movements. The Russian revolution claimed to have created ‘socialism,’ and in spite of the criticism regarding the lack of democracy in this realized socialism, the idea continued to spread from Europe to other continents. Between 1917 and 1989, the term was also used to designate an actual social and economic system. According to the American Marxist Paul M. Sweezy (1949), ‘the term “socialism” means a social system which is differentiated from other social systems by the character of its property relations. Thus socialism properly belongs to the same species as capitalism and feudalism.’ There is general agreement regarding the complexity of the term socialism and the difficulty of defining it.
This depends on the fact that the word has always designated an overall group of very complex ideas, that these ideas have been subjected to changes in time and space, and that the word, standing at a crossroads of intense ideological polemics, has often changed its meaning. From the word used by Pierre Leroux in France during the 1830s to define an anti-individualistic political and social doctrine close to Saint-Simonism, up until the socialist ideology of the New Left in the 1970s, the road this term and concept have followed has been long and tortuous.
2. The Origins of Socialism
When, between 1847 and 1848, Marx and Engels wrote the Communist Manifesto, the majority of their contemporaries would not have been able to define precisely the difference between socialism, communism and democracy. Even though it was democracy which smelled ‘of blood and barricades’ (Rosenberg 1962), the upper classes perceived the different socialist
‘schools’ as a threat to property and order. These ‘schools’ had a common root in the political thought and in the social experience derived from the French and the industrial revolutions. At the same time, however, they were the most severe critics of those revolutions. In France and England, the term socialism was used for the first time in the 1820s and the 1830s essentially to designate economic orientations contrary to individualism. In the period preceding 1848 various political and cultural centers appeared which could be traced to similar trends of thought. In France, the first equalitarian ideas after the Revolution came from the Jacobin left and are those of F. N. Babeuf, which are known to us above all through the literary works of F. Buonarroti. Babeuf tried to achieve his ideals (elimination of private property and the establishment of a society based on community property) by means of an attempted insurrection which failed. Later on, in the midst of political reflections after the French revolution and the experiences of the Napoleonic era, C. H. de Saint-Simon imagined a future society based on science and work. His ideals influenced utopian French socialism, affirming an optimistic confidence in technical and scientific progress. Fourier, who was also a critic of contemporary society, proposed the construction of a harmonious society organized around the Phalanstères, which were kinds of self-supporting communities promoted by free and happy men, who were able to achieve equality between the sexes and to satisfy any material necessity. In England, unlike in France, the new doctrines were diffused in a framework of relative legitimacy. Robert Owen, who was at the same time a utopian socialist and a social reformer, tried to build communities based on the collective property of the means of production and on noncompetitive social relationships.
He also succeeded in creating workers’ organizations based on the same ideals, which contributed to the formation of cooperatives and unions. It was the first time (and this experience would eventually have consequences in the Chartist movement) that socialist ideas became integrated with real workers. Both in France and in England, the 1830s marked a change in the political climate and in the panorama of social conflicts. The artisans, impoverished by the processes of industrialization, and the alienated wage workers living on the borderline of a survival level were sensitive to the ideas of socialism which had circulated before the revolutions of 1848. It was a type of socialism containing strong elements of solidarity, cooperation, emancipation, and social justice. The experiences of the previous years, thanks to the influence of these first groups of organized workers, were able to put together a first set of common ideas: unjust profit, equal salary, the presence of two worlds in opposition or, as Fourier said, of an ‘upside-down world’ in which a small number of privileged
people were opposed to a multitude of those condemned to unrewarding work. The spontaneous struggles of the workers, which in France both preceded and followed the July Monarchy, led to a politicization which drew on the socialist theories. Auguste Blanqui was a central figure in this context. Influenced by the theories of Saint-Simon, Fourier, and Babeuf, Blanqui was above all a conspirator: he not only created several secret societies of republican-socialist inspiration, but he was also a tireless revolutionary activist. From certain points of view, Blanqui embodies the prototype of the ‘professional revolutionary’ who believed that the victory of the proletariat could be reached by a determined minority ready to seize power and use dictatorial methods. At the same time Proudhon was one of the first to underline the libertarian aspects in the criticism of capitalism, and in his essay What is Property? (1840), he condemned in particular the oppressive function of the state and private property. In England, between 1837 and 1848, the Chartist movement developed as a reaction to the electoral reform of 1832 and the enforcement of the Poor Law, which had further worsened the living standards of the working classes. The People’s Charter, which constituted the basic document of Chartism, claimed the right to universal male suffrage and considered a just electoral reform as the principal instrument for the achievement of social reforms in favor of the workers. Though Chartism was harshly repressed and completely disappeared after 1848, it constituted an important experience because of the connection which was established between social struggles and politics in that decade. This connection had its most profound roots in the changes which took place in the lives of the lower classes between the end of the eighteenth and the beginning of the nineteenth century.
In France during the Revolution the sans-culottes were mostly concerned with the problems of subsistence; now, instead, attention was mainly focused on the social dimensions of work, on wages, and on the opposition between the bourgeoisie and the proletariat. At the same time in England the social changes brought about by the industrial revolution stimulated a connection with the radical traditions of the past in the new social reformers. In both cases ideas of emancipation, equalitarianism, and restoration of a usurped legality were diffused among the lower classes; these popular ideas coexisted and at times integrated extremely well with the works of intellectuals and political leaders. However, socialism in its entire existence was not able to offer a unanimous solution to the problem regarding the ways through which a more just society could be achieved. In fact, socialists in the period preceding 1848 were often romantic and bizarre figures (Hobsbawm) who relied more on the intrinsic rationality of their ideas than on collective action to bring about the realization of their projects.
3. Marx and Engels
The revolutions of 1848 constituted a very important chronological fracture in the history of socialism. In the first place, this was because the so-called ‘social question’ was brought to the attention of the public due to the intense participation of socialist workers and artisans in the social struggles in the spring of that year, and particularly in the ‘June Days’ in France. This situation contributed to transforming the social question into a political one and it began a process in which socialists and communists became more and more different from democrats and republicans. In the second place, a shift began which later would be called a swing of the center of gravity in the development of socialism from England and France towards Germany. In preceding years the German culture had not been completely absent from the discussions regarding socialism; however, unlike in other countries, intellectual reflections were isolated from the experiences of the working class. Now, between 1847 and 1848, along a tortuous road which brought them from the study of German classical philosophy to the Bund der Kommunisten, Marx and Engels published a brief pamphlet which would come to represent a milestone in the history of socialism, the Communist Manifesto. In this pamphlet, political activism was combined with an analysis of the social relationships created by the development of capitalism. The Manifesto defined the ways through which the bourgeoisie and proletariat became social classes irremediably opposed not only on the basis of moral choices, but above all on the basis of economic and social laws which dominated the history and the existence of mankind. Communists were gifted with the scientific awareness of the development of human society and were able to guide the proletarians in acquiring their class consciousness and in achieving a classless society in which the exploitation of man by man would be eliminated.
Marx and Engels not only affirmed a ‘scientific’ conception of socialism, they also intended to close the doors on that world of socialist ideas which had preceded them. In the third part of the Manifesto, they made a distinction between reactionary socialism, conservative or bourgeois socialism, and critical-utopian socialism and communism. What Marx and Engels accomplished was to arrange the other nuclei of the socialist culture in a historical perspective in which their ‘scientific’ doctrine superseded the preceding socialist traditions, and they ended up passing a negative judgment on half a century of socialist culture. Instead, the importance of the old socialism actually consisted in its capacity to draw on other cultural and political trends circulating among the lower classes: anticonformism, dissent, radical Jacobinism, and democracy. Without this cultural atmosphere, Marx and Engels probably would not have had such a large public following. If it is true, according to an intuition of
Durkheim, that socialism was not only a ‘miniature sociology’ but also a ‘cry of pain,’ it is just as true that its roots should be sought not only in the traditions of ‘scientific socialism’ but also in the culture (e.g. in the beliefs, in the feelings, in the religion) of the lower classes and of the rank and file militants. In the 20 or 30 years following 1848, the doctrine of Marx and Engels became ‘Marxism.’ Marx published the first volume of Capital in 1867, and both Engels and he were engaged in an intense journalistic activity which accompanied the development of a first international nucleus of an organized workers’ movement. This was the so-called First International, inside or alongside which new groups and doctrines arose in conflict with those of Marx and Engels. One of these groups was the anarchists, followers of Bakunin in the rural areas of Latin and Slavic Europe; another was the Russian populists. Both showed the extent of the influence exerted by the International Association of Workers, and their activism also showed the richness and complexity of socialism. In March 1871, groups of internationalists with different political origins gave birth to the Paris Commune, which Marx and Engels considered a first experience of the proletarian dictatorship. Around that date, even though a bloody repression concluded the experience of the Paris Commune, nuclei of political subjects were formed in all of Europe which, through a complicated game of conflicts and unification, would eventually give rise to the different socialist parties. Principally in France and Germany the working class movements would become efficient bearers of the socialist culture and would stimulate emulation in groups and organizations in other countries.
4. The Diffusion of Socialism
Beginning in 1889, the process of the formation and expansion of socialist parties, as well as of an incredibly varied body of organizations, associations, unions, and cooperatives which accompanied them, assumed a rapid and intense rhythm. A type of socialism developed in which Marxism consolidated its central role. However, it was a type of Marxism remote from the cold reasoning of Marx and Engels. Though Karl Kautsky, the principal exponent in this phase of the expansion of Marxism, contributed to defining it as ‘scientific socialism,’ it was an ideology not far from the mechanistic features of positivistic common sense. The idea of ‘natural necessity’ (Naturnotwendigkeit), so widely present in the Marxist texts and documents which exalted the automatism in the transformation of the means of production, influenced the mentality of the rank and file, giving them the certainty of the breakdown of the capitalistic system and the birth of socialism. The ‘scientific’ aspect of this conception of the past, present, and future of the society ended up by creating a certain ‘rationality of history.’ Thanks to
this faith in the future, the rank and file, according to Gramsci, think that they are only ‘momentarily defeated,’ but that ‘the force of things will work for them in the long run.’ Socialism became the ideology of the socialist parties and the mixture of its components became even more enriched. According to H.-J. Steinberg, along with Marxism other components must be considered as a part of the socialist ideology of the German Social Democracy: these were pragmatism, anti-intellectualism, social Darwinism, the idea of the predictability of the final victory, and other ideas which could be found on uncertain grounds between the common sense of the lower classes and the vulgarization of Marxism. Socialism ended up by assuming a worldwide dimension. The Socialist International (which Lenin later called the Second International, when he was creating the Third) was not only the association where general political questions were discussed or agreed upon (participation or nonparticipation in left-wing bourgeois government coalitions, to call or not to call for a general strike, etc.), but it was also the place where other aspects of the socialist mentality and culture such as internationalism, pacifism, and solidarity were expressed. It is difficult to distinguish between the diffusion of socialist ideas and the force of attraction exerted by the major socialist parties. Certainly the socialist or social democratic parties were the most efficient means for the circulation of socialist ideas in the period of the Second International. But the ways in which other political cultures and ideologies reacted against socialism might also have played a role in the process of the diffusion of socialism and Marxism. It was in this period that the integration of socialist parties in the European political systems took place and at the same time a complex cultural effort aimed at legitimizing the socialist presence developed.
Precisely because the historical dimension of the socialist culture was an integral part of its way of confronting the present, it produced a noteworthy effort directed towards the ennoblement of its past in some traditions of the European culture (from Plato to Campanella, from Thomas More to Thomas Müntzer). More in general, the study of the past of the working class movement and socialism was inextricably connected with the political passion in the works of Jaurès and Mehring, of Compère-Morel and Grünberg. Socialism influenced the historical culture also in the academic field and at the end of the nineteenth century there were many interpretations regarding the end of slavery in the Ancient world or the class struggles in medieval towns which echoed Marxist canons. The legitimation of the socialist parties provoked tensions and oppositions. However, in general at the beginning of the twentieth century it became an accomplished fact. In reality, besides the brilliant electoral successes which consolidated the idea that the final victory was guaranteed by a slow and continuous increase of
votes, socialism had at the same time stimulated and won over the intellectuals. From the American populist writers at the beginning of the century (the so-called muckrakers) such as Jack London or Upton Sinclair to writers such as the Russian Maksim Gorki, there were many artists who looked towards socialism with sympathy: the old Emile Zola or the young Italian painter Carlo Carrà, whose first work in 1901 was a portrait of Engels. But above all, socialism involved intellectuals and scholars in the field of the social sciences. They were challenged to interpret this key aspect in the life of contemporary societies and sometimes looked at it with sympathy. From Vilfredo Pareto to Benedetto Croce, from Werner Sombart to Emile Durkheim, each had to take a stand regarding socialism, and this meant that, in a period of increasing literacy, newspapers and public opinion attributed remarkable importance to this problem. It is at this time that the political and cultural panorama of socialism became more complicated. The more socialist parties became actors in the political systems, the more intense became the discussion within the party regarding the use of parliament, the role of social reform, and the ways of passage toward a socialist society. In general, during the period of the Second International, socialist thought tried to define the principal problems of the contemporary society. It radically criticized liberalism, unwillingly supported parliamentary democracy, tried to interpret imperialism, organized and coordinated the social protest, and created or unified complex labor organizations. The socialist world, however, was only partially accepted as a legitimate subject in the society: it had developed an attitude of ‘negative integration,’ and produced a ‘subculture’ closed within itself (Roth 1963).
At the beginning of the century, nonsocialist public opinion let out a sigh of relief upon seeing the relative willingness of moderate socialists to enter and take part in coalition governments oriented toward social reform. However, even the most hopeful among them could not have predicted that in August 1914 internationalism, one of the milestones of the socialist culture, would find itself in the midst of a crisis because of the outbreak of the war and that this crisis would create a new irremediable rift setting socialists against socialists.
5. Socialisms in the Twentieth Century
During the war, the conferences of Zimmerwald and Kienthal sanctioned the formation of socialist groups which criticized the failure of internationalism. Among them, the Bolsheviks played a particularly active role. After the outbreak of the October revolution the Bolsheviks promoted the formation of a new international organization which they called the Communist International. From that time on, the relationship between socialists and communists became in certain extreme situations one of violent confrontation. Paradoxically, right when the Bolsheviks were building ‘socialism,’ they identified ‘socialists’ as their principal adversaries. The development of ‘socialism in one country’ accompanied a progressive erosion of democracy in Russia and caused social conflicts which were resolved only with the use of repression. In the period of the ‘entre deux guerres’ socialists and social democrats (the two names coexist, the first in Latin Europe, the second in Germany and Scandinavia) strongly criticized communism, affirming the refusal of violence and emphasizing the role of democratic institutions. After the experience of fighting together against fascism, socialists and communists split again during the cold war, when the formation of ‘socialist’ or ‘popular’ democracies accompanied once again the implementation of repressive policies in Eastern Europe. In the East and West two opposing socialist worlds were formed: in the West, the German social democracy modified its identity by eliminating Marxism from its own political culture; in the East, Marxism was subjected to further dogmatic changes and was definitively transformed into Marxism–Leninism. After the Second World War the geography of socialism underwent a complex transformation. The crisis of colonialism and the anticolonialist tradition of the socialist movement influenced the formation of currents of socialist thought in several non-European countries. The Communist International had already exerted a worldwide influence outside the European boundaries, from Mexico to China, from Peru to Bengal, but it was a simple center–periphery relationship. Starting with the beginning of the 1950s, autonomous socialist nuclei appeared, especially in the Near East and in some colonies of sub-Saharan Africa.
One of the examples was the so-called ‘African socialism,’ which had as its essential component the idea of a noncompetitive and nonindividualistic development with the objective of integrating religious, ethnic, and racial differences. In the 1960s and 1970s, the two different models of socialism continued to clash: on the one hand, democratic socialism, represented by political parties and experiences in coalition governments in Western Europe; on the other hand, the Soviet model. At the end of the 1980s, following an internal economic, political, and social crisis of enormous proportions, ‘realized socialism’ in the countries of eastern Europe ceased to exist, and with it the whole idea of socialism in the rest of the world underwent a crisis. Just as it was difficult at the beginning of the nineteenth century, it is difficult to offer a satisfying definition of socialism two centuries later. Anti-authoritarian opinions, nonindustrialist conceptions of nature, and the desire to defend great humanitarian causes and civil rights have come back from a remote past to be seen again in the foreground, while the passion for equalitarianism and social justice shows continuity with the socialist traditions. The
perspective from which these traditions were looked at has long been fairly ambiguous: often, in fact, the historians of socialism were also militant socialists, and political options have influenced the reconstruction of the past. The integration of the history of socialism in the academic environment has allowed scholars to look back at this subject with a certain detachment and has directed a special interest towards the place which socialist ideas have had in the culture of the lower classes and towards the role socialist doctrines and movements have played in the formation of contemporary political systems. Today an international socialist organization exists, socialist parties exist, and policies considered socialist exist. Their identity, however, no longer consists of a general conception of the world and of history, nor of allegiances or beliefs. Instead it consists of general cultural orientations aimed at controlling public expenditures in complex societies and directing resources towards citizens with lower income; in other words, at extending the welfare state. Pragmatic options have replaced secular religions.
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
See also: Capitalism; Communism; Communism, History of; Democracy, History of; Dictatorship; Eastern European Studies: History; Eastern European Studies: Politics; Labor Markets, Labor Movements, and Gender in Developing Nations; Labor Movements and Gender; Labor Movements, History of; Marx, Karl (1818–89); Psychohistory; Right-wing Movements in the United States: Women and Gender; Russian Revolution, The; Sociability in History; Social Democracy; Social Movements and Gender; Social Movements, History of: General; Socialist Societies: Anthropological Aspects; Soviet Studies: Politics; Working Classes, History of
Bibliography
Cole G D H 1953–60 A History of Socialist Thought. 5 vols. St. Martin’s Press, New York
Haupt G 1986 Aspects of International Socialism, 1871–1914. Cambridge University Press, Cambridge, UK
Histoire Générale du Socialisme 1972–78 [Publiée sous la direction de Jacques Droz.] 4 vols. Presses Universitaires de France, Paris
Hobsbawm E, Haupt G, Marek F et al. (eds.) 1978–82 Storia del marxismo. 4 vols. Einaudi, Torino, Italy
Rosenberg A 1962 (1939) Demokratie und Sozialismus. Zur politischen Geschichte der letzten 150 Jahre. Europäische Verlagsanstalt, Frankfurt am Main, Germany
Roth G 1963 The Social Democrats in Imperial Germany; a Study in Working-Class Isolation and National Integration. [Pref. by R. Bendix.] Bedminster Press, Totowa, NJ
Steinberg H-J 1976 Sozialismus und deutsche Sozialdemokratie. Zur Ideologie der Partei vor dem ersten Weltkrieg. Dietz, Berlin
Sweezy P M 1949 Socialism. McGraw-Hill, New York
F. Andreucci
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Socialist Law
Socialist law, as an actual legal order, was a major legal family whose members possessed similar characteristics. The common features of the legal systems of the Soviet Union and communist East-Central Europe resulted from the use of socialist law in maintaining communist-party dominance over state and society. Yet the law in the various socialist states differed, too, not only because of modifications in the forms of domination but also because Soviet legal solutions were imposed on diverse cultures and societies. The ideological legacy of Marxism–Leninism is expressed in the concept of socialist law. According to Marx, law was the will of the ruling class provided with statutory validity. Certain socialist (socialistic) theories of law have emphasized the importance of law in class domination. These theories served to legitimate communist rule. Other socialist (mostly Western) theories envisioned a normative system that reflected social equality and solidarity, that is, a theory of law based on social justice. Socialist law as the positive legal system in communist countries was understood by its proponents as the expression of the will of the working class (or, when class struggle was less emphasized, of the ‘toilers’), and it was an instrument of class struggle as well as a tool for building communist society. In the early years of the Bolshevik Revolution, workers’ revolutionary legal consciousness was deemed the source of applicable law. Given the totalitarian roots of the legal system, to what extent the norms of the communist state fit into any concept of law remains a fundamental problem. If law is a social phenomenon or institution that differs from its environment, is socialist law to any extent autonomous? And, is it law at all?
1. Stages in the Development of Socialist Law
Socialist law changed over time. It served totalitarian regimes whose efforts were not always successful. State socialism was totalitarian in the sense that it attempted to construct all-embracing state control over every sphere of social life in the service of a single goal. In the early years of the socialist revolution and during certain periods of Stalin’s rule, physical control and repression through brute force played the crucial role in this attempt to achieve total social control. It followed that law as a tool of social management or one that limits government action in accordance with its own specific logic could not exist. Law was not a serious consideration for the revolutionary mindset. This disregard was facilitated by Marx’s belief that there was no place for state and law in the classless society. But Marx also believed that the transition leading to the classless society would be long; and he expected that law, in socialism, would contribute to
the extension of equality under the law. Lenin argued for a dictatorship of the proletariat ‘based directly upon force and unrestricted by any laws.’ In the postrevolutionary period, the communist-party position was uncertain regarding the future of law. Yevgeny Pashukanis, the leading authority on communist legal theory, wrote in 1928 that law was a temporary system dictated by commodity relations. Once socialism had eradicated commodity relations, legal forms would also disappear. In the 1930s, law (not only criminal law) was structured as an unrestrained authorization, and even a command, to intervene in anything and everything. Such calls to uncircumscribed, coercive intervention transgressed the limits of legal forms. The 1930 joint decree of the Central Executive Committee and the Council of People’s Commissars of the USSR, which empowered local authorities to ‘take all necessary measures … to fight kulaks,’ would hardly satisfy any definition of law. But legal forms continued to exist even after the planned economy was fully developed. And even during the oppressive Stalinist regime, legal forms were employed at least for show-trial and propaganda purposes. Law remained an important internal and external tool of legitimation as exemplified in the Soviet Constitution of 1936 (known as the Stalin Constitution). Law served total repression, in the 1930s, in the sense that it reminded people that they could be brutally handled for no apparent reason and without any protection from other state institutions. As communist-party oppression neared anarchy, unlimited power turned against the elite and their supporters. Totalitarian disregard for the specific needs of the economy and other areas of social activities drifted into dysfunction. 
Stalin himself ‘brought law back in.’ But the promised legal stability had ‘nothing to do with the legal restraint on power, and it coincided with the period of greatest repression.’ The system moved ‘from lawless repression to repressive law’ (Krygier 1994). Officially socialist law was proclaimed the most developed form of law in history as it served the most developed social formation (communism). It provided socialist legality, a qualitatively superior alternative to bourgeois rule of law. In reality it often served show-trial and other propaganda purposes only, as exemplified by the Stalin Constitution. Laws enabled the direct repression of Stalin’s Great Terror, although the mass-scale extermination (the purges, the gulag, etc.) was based on action that presupposed a disregard of promulgated legal norms (trial-procedure rules, evidence). This is not to claim that the ‘disregard’ was not officially sanctioned and ordered: disregard of the then-existing official law was a systemic component of the totalitarian use of power. The resulting destruction of social and individual resistance to communist authority, the irrevocable stamp of fear in almost every citizen, served in the
remaining half of the twentieth century as the foundation of the repressive order, which could afford to make use of less coercive laws. After Stalin’s death, in 1953, the communist elite reverted to a certain level of legal formalism, by and large for their own protection. This was achieved by a duplication, of sorts, of the normative system. The legal authorities had strictly observed secret rules regarding actions affecting members of the communist nomenklatura: to proceed against one of its members required authorization from the proper body within the nomenklatura. The nomenklatura followed its own ‘rule of law’, which did not include the right to a hearing of the concerned. The legalistic turn had a certain spillover effect on the rest of the population. Increasingly, law became less directly coercive and punitive and, at least in some of the satellite states, in matters of secondary importance it no longer depended fully on politics. On the other hand, law was recognized as a system that provided structure to a bureaucratic state capable of economic centralization. The copying of Soviet solutions resulted in less damage to legal forms in the East-Central European states than in the Soviet Union because the use of brute, Stalinist force lasted for shorter periods in these states and because Western legal traditions existed in some of them. The relative autonomy of law was officially recognized, beginning in the 1970s. For reasons of regime legitimation it was overemphasized to disguise a poorer reality of limited autonomy. Nevertheless, the formal characteristics of modern law (abstractness, general nature of commands, etc.) survived, although in simplified and distorted forms. The instrumental use of law served a bureaucratically ossified institutional politics. The official doctrine emphasized that the autonomy of law could be relative only vis-à-vis the economy, which determined all forms of social life.
‘Legislation, whether political or civil, never does more than proclaim, express in words, the will of economic relations’ (Marx 1976). In reality neither the economy nor other social relations were respected, and only specific concerns of power and domination mattered, as understood by the communist leadership. Law remained a tool of the planned economy, and it served the deliberate changes of the planning system. Law was declared the engineering device that helped organize people in their building of communism. Revolutionary law, though an instrument of class struggle, had an element of internal legitimacy as it served social (class) justice (through murderous revenge, expropriation, and redistribution). In the instrumental understanding of socialist law moral considerations became irrelevant. Instrumentalism was also distorted, however, as ‘relative autonomy’ allowed only very limited respect for the characteristics of the tool. Legal dogmatics or consistency requirements were easily skirted. Yet there were actual elements of independence in law simply because the
political system could not generate a unified will. There was competition within the communist political elite, which contributed to at least superficial respect for legal forms. Moreover, law-enforcement organizations had a vested interest in legalism, and to the limited extent that they played a role in party politics, law became a point of consideration. To the extent that the communists needed efficient state machinery they needed law as an instrument of control over the state bureaucracy (see Sect. 4). Law was important insofar as centralization required uniformity and unity of action.
2. The Structure of Socialist Law
Socialist law was understood as an ideological superstructure, and its provisions were subject to ideological scrutiny. Law (its provisions and understanding) had to be politically correct as understood by whatever ideological line the communist party touted. The ideological roots and functions of socialist law led many commentators to the conclusion that it was a mutation of natural law. It was argued that it originated in a set of doctrines or expectations that were above the law. Even the vulgar Marxist claims that law as superstructure reflects economic realities might have been interpreted as being in the natural-law tradition: it was the mystical economy with its ‘objective law’ that determined law. Socialistic traditions of social justice also show structural elements of natural law. Max Weber criticized these socialistic tendencies to material justice as detrimental to the formal, rational qualities of law. Socialist legal systems, as they existed in state-socialist regimes, were positivistic and functioned as positive legal orders. The prevailing definition of law (legal system) was that it was the hierarchically structured sum total of state-made norms. The profound departure from promulgated norms and principles occurred not because of the observance of some kind of higher law. On the contrary, the departure was the result of changing secret orders, commands, and expectations emanating from the communist-party leadership and secret services. Of course, once the external political intervention becomes routine or even institutionalized, one can argue that there is no more law in the sense of normative, predictable expectations; hence socialist law (at its worst) does not fit into any notion of law. Socialist legal systems, in principle, were strictly hierarchic. Statutes (parliamentary enactments) were of absolute supremacy, while constitutions offered only guiding perspective and basic rules of state organization.
Law had to originate from the people’s representatives; therefore, judicial lawmaking was ruled out as ideologically unacceptable, after the early revolutionary period. Statutes were supreme because they were enacted by parliament, which was held to be
the supreme state body. The rigid position on legal origins reflected the dominant nineteenth-century German and French views and a simplification of Rousseau. Each socialist regime institutionalized some kind of small and reliable body within parliament (often called a presidium). The presidium had full power to act on behalf of parliament, which convened infrequently. The presidium had the power to amend existing laws. Decrees were subordinated to statutes. In theory, decrees were to serve exclusively for the execution of statutes. But in reality, substantive issues were often determined at the level of executive decrees. Officially courts were restricted to pure application of the law. However, a number of practical issues were determined by guiding principles of supreme courts. Although stare decisis was unknown in this system, lower courts were bound to follow interpretations of higher courts. This was imperative given the informal control over the judiciary (see Sect. 4).
3. Substantive and Procedural Institutions
The centralized socialist management of the economy and social life was suspicious of any expression of autonomy. The priority of ‘social interests’ (or ‘society’s interests’) defined the legal system. No autonomy (in the sense of self-determination) was permitted, and legal structures enabling or protecting autonomy were pruned even from private law. The traditional continental divide between public and private law was refuted as bourgeois legality. As Lenin stressed: ‘We acknowledge nothing as ‘‘private.’’ For us everything in the province of economics is in the domain of public law and not of private law … . Hence our task is … to broaden the right of the state to abrogate ‘‘private’’ contracts.’ In practical terms this meant not only the denial of privacy and sanctity of contract but also the primacy and special protection of state property. Depending on the period, private property and private initiative were either prosecuted, prohibited, or greatly restricted, at best. Property consisting of the ‘means of production’ was restricted even where it was permitted (as in the German Democratic Republic (GDR), or, as far as land was concerned, in Poland). ‘Social interests,’ determined by the state, were favored above all other claims. Restrictions on personal freedom went beyond etatist concerns. Freedom of movement and residence was limited, and, at least in the Soviet Union, where the czarist tradition survived, freedom of internal movement was subject to administrative approval. Speech and press freedoms were permitted only in the interest of workers, which resulted in institutionalized or informal but total censorship. Personal choices and alternatives were limited or prohibited (including in education, health, profession, sexual identity, and the like). State control over the individual
was extended to the intimate, private sphere. Decisions regarding family relations were subject to intervention by the Prokuratura (State Attorney’s Office). Abortion was not an issue of self-determination but a matter of demographic or workforce politics, although certain social compromises were observed in this context. In Hungary, the nearly unlimited access to state-funded abortion was the result of an official repudiation of coerced demographic policies. In other countries, like Russia, abortion as the primary form of family planning derived from a total disregard for human dignity. The legal restrictions imposed on the general population went hand in hand with the award of privileges for supporters of the regime, although only certain privileges were codified (e.g., access to higher education to family members of committed regime supporters). Privileges were not absolute. They were the reward for loyalty to the system. But the ordinary, loyal citizen (contrary to the privileges of the nomenklatura) had no title or claim to privileges, which, when bestowed, were an act of grace and goodwill. The discretionary distributive powers of the public administration were used to favor the politically loyal (or, at least, to punish the suspect). This was the case, for instance, in the assignment of public housing. During the less-dictatorial periods of state socialism, these discretionary methods of allocating goods and services were the primary tools of social control. Social control through the discretionary application of law put society in a condition of permanent dependence.
4. Administration of Justice
The contribution of the administration of justice to the restriction of liberty varied historically. During the Stalinist terror, courts served as theaters of repression and injustice. In less dramatic times, the administration of justice promoted etatism and protected the interests of the communist party. Most administrative decisions could not be appealed in court, although in Hungary and Poland, as a rule, there was the possibility of appeal against administrative decisions, except those involving the more sensitive issues of military and police administration. As to judges, in the formative revolutionary years of socialist power, the legally trained personnel of the previous regime were replaced with incompetent but politically trustworthy laymen. The basis of revolutionary justice was the revolutionary consciousness of worker-judges. After World War II, in the Soviet-dominated satellite states the ‘democratic attitude’ of lay assessors played a role in politically charged procedures. In more settled times, judges received formal legal training but worked in personal and organizational dependence on court presidents. Courts were technically under the control of the
minister of justice. These courts were not structured to be independent, and the existing political control further increased the likelihood of ‘telephone justice’. Socialist legal theory emphasized the key role of the Prokuratura in the administration of justice. The organizational design of the Soviet Prokuratura was inherited from the czarist administration. It was a select, centralized body; procurators served in total subordination to their superiors, in what was a quasi-military organization. They were much better paid than judges. The Prokuratura was in charge of criminal prosecution, representing the state in court, but it also routinely supervised the activities of the state administration. Lenin considered the Prokuratura the bulwark of socialist legality. He knew that a centralized state power could not afford any particularism, and all local, particular normative systems represented serious deviance. Normative particularism was especially dangerous if it originated in state bodies. As Lenin wrote to Stalin, ‘legality cannot be legality of Kaluga or the legality of Kazan but must be one single all-Russian legality.’ Thus the task of the Prokuratura was to ‘see to the establishment of a truly uniform understanding of legality.’ The supervisory powers of the Prokuratura included the authority to ask for ordinary and extraordinary review of court decisions, even when it was not a party in the case it sought to review. Ironically, the fear of localism resulted in minimal protection for the population: the central bodies had an interest in mobilizing local populations against local authorities, and they encouraged denunciation of local (nonpolitical) abuses to the extent that they deviated from centrally authorized injustice. The dependence of citizens on local services was attenuated in the name of central law.
Rules of procedure aimed to protect the material and political interests of the state, and allowed for paternalistic intervention in private litigation. The procedures followed simplified continental models that had traditionally existed in the given country. The inquisitorial nature of the traditional continental criminal procedure was further aggravated by Soviet solutions regarding the admissibility of police-gathered evidence. In civil matters judges were required to establish ‘objective justice’ and were not bound by parties’ motions. The paternalistic nature of continental civil procedure dramatically increased, resulting in further dependence, as far as the parties were concerned. On the other hand, the simplifications and limited access to court yielded cheap and relatively swift administration of justice (where courts were available at all). Civil litigation rates were stable, and the most common type of litigation was divorce related. Criminal litigation (except when criminal justice was used as a tool of class struggle, in the 1930s and 1950s) stabilized at a low level. Criminality was limited because of the efficient social control exercised by the oppressive state mechanism.
See also: Law and Society: Sociolegal Studies; Marxism/Leninism; Marxist Social Thought, History of; Socialism; Socialism: Historical Aspects
Bibliography
Collins H 1982 Marxism and Law. Oxford University Press, Oxford, UK
David R, Brierley J E C 1985 Major Legal Systems in the World Today: An Introduction to the Comparative Study of Law. Stevens and Sons, London
Eörsi Gy 1979 Comparative Civil (Private) Law. Akadémiai Kiadó, Budapest, Hungary
Feifer G 1964 Justice in Moscow. Simon and Schuster, New York
Hazard J N 1969 Communists and Their Law. University of Chicago Press, Chicago
Krygier M 1994 Marxism, ideology and law. In: Krygier M (ed.) Marxism and Communism: Posthumous Reflections on Politics, Society, and Law. Rodopi, Amsterdam
Marcuse H 1971 Soviet Marxism: A Critical Analysis. Penguin, Harmondsworth, UK
Marx K, Engels F 1976 The Poverty of Philosophy, Collected Works. Lawrence and Wishart, London, Vol. VI
Podgorecki A, Olgiati V (eds.) 1996 Totalitarian and Posttotalitarian Law. Dartmouth, Aldershot, UK
Thompson E P 1977 Whigs and Hunters: The Origin of the Black Act. Penguin, Harmondsworth, UK
Vyshinskii A I A (ed.) 1948 The Law of the Soviet State. Macmillan, New York
Zweigert K, Kötz H 1987 Introduction to Comparative Law. Clarendon Press, Oxford, UK
A. Sajó
Socialist Societies: Anthropological Aspects
Socialist (also known as communist) societies constitute a class of twentieth-century societies sharing two distinctive features: the political dominance of a revolutionary (usually a Communist) Party, and widespread nationalization of means of production, with consequent preponderance of state and collective property. This definition excludes societies governed by socialist or social-democratic parties in multiparty systems, such as the Scandinavian welfare states. It includes the Soviet Union, the People’s Republic of China, and (for some part of their histories) the East European countries, Mongolia, North Korea, Vietnam, South Yemen, Cuba, Nicaragua, Ethiopia, and Mozambique. Socialist societies came into being with the 1917 Bolshevik Revolution and the establishment of the Soviet Union. Others founded later by various means (revolution, conquest, or annexation) all reflected the
influence, both negative and positive, of the Soviet model created under Lenin and Stalin. All employed extensive coercion and central control, yet they increased living standards, access to housing, employment, education, and medical care. Although they instituted new hierarchies, they also promoted gender equality and improved life-chances for many disadvantaged people.
1. Research on Socialist Societies
For much of the twentieth century, social science understanding of these societies was politically circumscribed. Their own scholars were under strict Party control; nonetheless, invaluable knowledge came from dissidents (e.g., Bahro 1978, Casals 1980) and from scholars in countries with strong reform movements (e.g., Kornai 1980, Wesolowski 1966). In Western Europe and the US, political and economic hostility to communist principles distorted thinking about socialist societies as epitomized in the concept of totalitarianism, which overstressed the importance of terror and top-down control. In all Western social science disciplines, impediments to on-site fieldwork hindered understanding. This situation improved in the late 1960s, for Eastern Europe, and the 1980s, for the Soviet Union and China, as détente and the impetus for technology transfer produced an infrastructure for intellectual exchange. The result was analyses that displaced totalitarian models to explore the societies’ inner workings, with greater attention to phenomena such as kinship, ritual, the mobilization of labor, and the exchange of gifts and favors, as well as more subtle treatments of political economy (e.g., Humphrey 1983).
2. Political Economy and Social Organization
Although they varied widely in economic development (from industrialized Czechoslovakia, to agricultural China and Cuba, to pastoral Mongolia), most societies that became socialist lagged behind the advanced industrial countries economically; all embarked on programs of state-led development through central planning. The following description of their political economies emphasizes ‘classic’ socialism prior to the reforms introduced at various times, and draws on research in both agricultural and industrial settings in China, the Soviet Union, and Eastern Europe.
2.1 Redistributive Bureaucracies and Reciprocal Exchanges
Owing to the role of exchange in legitimizing socialism, it is appropriate to begin with Karl Polanyi’s three basic modes of exchange: reciprocity, redistribution, and markets. Socialist systems suppressed the market
principle, instituting redistribution as the chief exchange modality and the system’s raison d’être: everyone would have the right to work, and the wealth they produced would be redistributed as welfare for all. Centralized redistribution required social ownership and control of productive resources, together with centralized appropriation of the product for reallocation along lines set by the Party. The instrument Party leaders developed to orchestrate production and distribution was the plan. Fulfilling it, however, depended on relations of reciprocity, both clientelistic and egalitarian. Although these were highly bureaucratized societies, they were not Max Weber’s impersonal modern bureaucracies dedicated to procedural rationality. Dense social networks sustained reciprocal exchanges that, far from constituting resistance to bureaucratic control, were intrinsic to its functioning: horizontal exchanges enabled socialist managers to do their jobs, and clientelism facilitated controlling both labor and Party cadres in seemingly noncoercive ways. Personalistic ties extended far beyond the bureaucracy as well, linking ordinary citizens with one another and with bureaucrats. The warp and woof of socialist societies, then, consisted of vertical and horizontal relations of patronage, loyalty, and exchanges of goods, favors, and gifts. Although the forms of these relations differed according to the various societies’ prerevolutionary histories and cultures (e.g., gift relations in China, compadrazgo in Cuba or Nicaragua, or specific kinship structures of Soviet minorities), they were widely prevalent in all. In each socialist society words indicating ‘connections’ (Russian blat, Chinese guanxi, Romanian pile, Hungarian protekció, Cuban compañerismo, Polish dojscie) appeared often in daily speech.
2.2 Organized Shortage
Personal relations were integral to socialism in part because of certain pervasive features of planned economies: tension between Party planners and firm managers, bargaining and soft budget constraints, and barter amid shortage. Socialist property could be productive only if the center’s ownership/control rights were disaggregated and devolved on to lower-level managers charged with realizing production. Because planning could never foresee all contingencies, these managers had considerable discretionary power. Expected to achieve Party directives with inadequate wage funds and to exceed their plan targets with inadequate raw materials, they bore responsibility far greater than the means allotted them for fulfilling it. Therefore, they strove constantly to expand their local power base and their effective capacities. In doing so, they posed a threat to their superiors, who thus were motivated to encourage subordinates’ loyalty with favors and patronage.
In redistributive systems governed not by demand for products but by the Party’s planned allocation, materials for production could not simply be bought on a market; their availability depended on the supplies budgeted in plans and on often-inefficient central distribution. Managers, therefore, requested more supplies than they needed, hoping to obtain enough materials to fulfill and exceed their targets. Because the planning mechanism required firms to produce regardless of profitability (they operated under soft budget constraints and were rescued rather than bankrupted if they lost money), local managers could with impunity overstate their needs for materials and investments and then hoard any excess. They also strove to bargain their plan targets downward, making it easier to fill these and have goods left over. Comparable processes occurred in both industry and agriculture, as cadres everywhere manipulated information, under-reported production, and engaged in illicit trade to benefit their firm or locale. The result of bargaining and hoarding was endemic scarcity of the materials necessary for production; thus, classic socialist societies were economies of shortage (Kornai 1980). Shortage caused competition among firms but also widespread exchanges, managers supplying from their hoard today the materials needed by others who would return the favor tomorrow. Thus, firms hoarded materials not only to cover emergencies in their own production but to backstop the supplies needed by others in their network. Shortage affected materials for production and also consumption goods, generating the queues characteristic of many socialist societies. It also affected two other crucial resources: information and labor. Party planners could implement top-down planning only if they had detailed and accurate information.
This happened rarely, for lower-level cadres had many incentives to falsify information so as to build careers by appearing successful, evade responsibility for disasters, inflate their materials requests, provide cushions against production shortfalls, overfulfill their targets to receive bonuses, etc. From these discrepant needs came a perpetual struggle over information. Higher authorities hoarded information as a source of power and engaged in continual information gathering at all levels of society (Horváth and Szakolczai 1992). Censorship and careful management of information at the top provoked information hunger below, generating a culture of rumor, gossip, selective secrecy, and conspiratorial explanations of events, as well as producing a citizenry skilled at reading between the lines. Classic socialist societies were thus a special kind of information society.
2.3 Labor Shortage and Rights in People
An additional resource subject to shortage was labor. Relatively underdeveloped socialist economies had initial problems mobilizing skilled workers where there
was effectively no ‘working class,’ and they lacked the resources for either paying labor well or substituting capital-intensive methods. Moreover, the shortage economy encouraged firms to employ excess workers so they could complete monthly plans once sufficient materials were on hand (in a frenzied effort known as ‘storming’); managers, therefore, padded their workforce, exacerbating shortage. Labor shortage, thus, had both macro and micro dimensions. Means for stabilizing the labor supply included making workplaces the locus of benefits such as daycare, housing, vacation permits, pensions, medical care, etc. Certain cities were closed to immigration and construction of urban infrastructures was limited (Szelényi 1983), thus compelling millions to become village-based ‘peasant-workers.’ Within each industrial and agricultural workplace, managers had to ensure a labor supply adequate to both normal production and periods of ‘storming.’ Workers, for their part, might bargain for better conditions by withholding labor, through intentional slowdowns or time off for household tasks and moonlighting; that is, labor shortage gave workers structural leverage. Conflicting demands for labor put a premium on ways of accumulating rights in people. For managers, these included extending patronage and favors (e.g., access to schooling, houses, or building sites), overlooking petty illegalities, and securing loose plans that enabled them to produce a surplus they might use to create labor-securing debt and exchange relations (cf. Humphrey 1983). Both managers and others needing labor might appeal to kinship idioms, emphasize ethnic identities, participate in special rituals, and expand networks of reciprocity through gift-giving. Thus, the quest for labor further encouraged personal ties and reinforced particularistic identities.
2.4 Second Economies and Socialist Reform
Believing they alone could best determine how social wealth should be redistributed, Party leaders initially opposed economic activity not encompassed within plans. The inability of planning to cover social needs at the given level of technological endowment, however, compelled officials to permit and even legalize some small-scale private effort, known as the ‘second’ (or informal, unofficial, or shadow) economy. Among its forms were food production on small plots, after-hours repair work or construction, typing, tutoring, unofficial taxi services, etc. Because these activities overlapped with semilegal and illegal ones, their situation was everywhere precarious, with authorities persecuting them more in some times and places than in others. The second economy was largest in Hungary after 1968, for example, small in Cuba until Castro’s ‘Rectification’ of 1986, harassed in Romania throughout the 1980s; it burgeoned in post-1978 China; extreme forms are reported for the Soviet Union,
where entire factories ran illegal production after hours. Crucial to understanding the second economy is that it nearly always utilized materials from the first (or formal, official) economy; its much-noted high rates of productivity were subsidized, then, by state firms. The prevalence of second-economic activity both indicated popular resistance to the Party’s definition of needs and helped to fill those needs by voluntarily lengthening the working day. Pressure from the second economy was but one of many signs of difficulty in socialist planning that led to repeated efforts at reform, initiated from both within and outside the Party. Beginning with Lenin’s New Economic Policy, every socialist society experienced cycles of reform and retrenchment, devolution and recentralization (cf. Skinner and Winckler 1969, Nee and Stark 1989). Most exhibited a trend toward less stringent planning and the introduction of market mechanisms, heightened material incentives, and mixed property forms. System-wide experimentation began with Khrushchev’s 1956 ‘Secret Speech’ criticizing Stalin and increased as each society moved from ‘extensive’ development (mobilizing resources) to the ‘intensive’ phase (attention to productivity). Hungary and Yugoslavia introduced the most durable early reforms; those in the Soviet Union ended in the collapse of the Soviet bloc, while comparable reforms continued in China, North Vietnam, and Cuba. As they reformed, socialist societies increasingly diverged not only from the Stalinist model but from one another, introducing path-dependent differences that became ever more marked.
Despite these differences, the reforms everywhere redefined basic units of activity (e.g., revitalizing villages or households at the expense of collective farms, teams, or brigades); altered gender relations (usually in favor of men and patriarchal authority) and increased other inequalities; affected networks of reciprocity (expanding horizontal over vertical connections); dismantled at least some socialist property (as with China’s decollectivization, begun in 1978); shifted the locus of authority (usually downward, provoking reactions from higher-level bureaucrats); and entailed new, more intimate forms of state penetration (implied, e.g., in Chinese rituals that no longer imagined gods or ancestors as inhabiting a nether world but found them immediately present).
3. Social Engineering and Resistant Personhood
All Communist Parties promoted massive projects of social engineering. They accorded knowledge and expertise a privileged place, under Party monopoly. Convinced of the creative power of language, Party leaders both used and modified language in commanding ways. They spoke of things that had yet to be created—the working class, the proletarian dictatorship, the new socialist man—as if those things already existed, and they developed hieratic speech with
reduced vocabularies, clusters of noun phrases, and few (often passive) verbs, creating a limited, static verbal world. Party cadres intervened exhaustively in all facets of life, aspiring to create new moralities and to control populations in myriad ways. They attacked religion as superstition, instituted new socialist rituals, and expanded educational access to create enlightened citizens indifferent to religious belief. They filled the air with moral exhortations aimed at instilling a new, puritan, socialist ethic. Toward better management of resources, they introduced massive population programs, from Romania’s enforced pronatalism to China’s stringent birth control, insinuating themselves into people’s most intimate lives, often to devastating effect (e.g., Kligman 1998).
Dedicated to building an alternative to the world of bourgeois capitalism, communist leaders announced an assault on all forms of inequality, including those based in kinship status, gender, class, and ethnonational difference. Society would be homogenized, lineage-based kinship structures broken up, gender differences minimized and authority relations in families altered, classes and private property abolished, national minorities given equal chances. New social entities emerged (state and collective farms, teams, and brigades, for instance) to take the place of kin groups and village structures. When pre-existing social relations proved difficult to abolish, they would be coopted and turned to the Party’s purposes. From all these efforts would result the ‘new socialist man,’ a new kind of person, and the ‘people-as-One,’ a new kind of human community, bearing an exalted socialist consciousness. The results of these interventions were decidedly mixed.
To begin with, the Party failed to speak with one voice, as ministries vied for resources rather than cooperating to remold society, and as cadres built their careers by under- or over-executing central directives and engaging in rampant corruption. In addition, the annexation of pre-existing forms inevitably modified the purposes they were intended to serve. Exhibiting greater capacity to formulate goals than to implement them, socialist governments were plagued continually by the unforeseen consequences of their policies. To solve conflicts among nationalities they institutionalized ethnonational difference in ways that strengthened it; they embraced nationalist idioms that then overtook socialist ones (Verdery 1991, 1996). Parties created working classes only to find themselves opposed by their creation, as with the Polish authorities and Solidarity. Beyond this, resistance to social engineering proliferated even as people adopted the new forms. Constant surveillance politicized behavior: small acts of joke telling became willful anarchy in the eyes of both authorities and joke tellers. Dissidents sought support internally and abroad for their critiques of the system, launching projects to build ‘civil society’ against the state. Less visibly, peasant refusal to accept starvation from China’s Great Leap Forward or Eastern Europe’s delivery quotas ultimately forced the authorities to renounce these policies (Yang 1996, Rév 1987), while popular insistence on adequate living standards pressed the parties toward reform.
Highly complex political effects centered on the making of persons and political subjects. Redistribution and socialist paternalism tended to create subjects disposed to dependency and passive expectation; early campaigns taught the poor to see themselves as victims of oppression (rather than simply of fate) and offered them the Party’s instrument of vengeance, in the form of denunciations. Thus arousing people’s complicity, the Party also instilled self-denigration. At the same time, however, many adopted dissimulation as a mode of being: apparent compliance covered inner resistance. The resultant ‘social schizophrenia,’ ‘doubling,’ duplicity, or split self is described for every socialist society. It accompanied a pervasive dichotomization of the world into ‘Us’ (the people) and ‘Them’ (the authorities). More significantly, it encouraged subtle forms of self-making in people’s own terms: defiantly they consumed forbidden Western goods, created self-respect through diligent second-economy work while loafing on their formal jobs, participated in ethnic- or kin-based identities and rituals constitutive of self, and gave gifts not just to secure advantage but to confirm their sociality as persons and human beings.
The collapse of the Soviet bloc hinders a balanced assessment of socialism, which once represented great hope for countless people. Undeniably, socialist societies diminished certain forms of oppression, yet they created others. In many of these societies, an early élan for the expressed goals gave way to cynicism and a feeling of betrayal, as the means used to achieve them sabotaged their fulfillment.
Admirable gains in education, medical care, and social well-being counted for less against the devastating purges, labor camps, Party-induced famines, environmental degradation, and abuses of human rights. Top-down engineering foundered as socialism’s leaders resisted their citizens’ efforts to educate them about social relationships, self and human feeling, the value of ritual practices, and the limits to imposed change. The lessons of these citizens’ refusals, their silent revolutions from below, should guide future attempts to create alternatives to the world of global capitalism.
See also: Mao Zedong; People’s Republic of China; Soviet Union; Stalin; Cold War, The; Communism; Communism, History of; Informal and Underground Economics; Marxism/Leninism; Socialism; Socialism: Historical Aspects; Soviet Studies: Culture; Totalitarianism
Bibliography
Bahro R 1978 The Alternative in Eastern Europe. Verso, London
Casals F G 1980 The Syncretic Society. M. E. Sharpe, White Plains, NY
Horváth A, Szakolczai A 1992 The Dissolution of Communist Power: The Case of Hungary. Routledge, New York
Humphrey C 1983 Karl Marx Collective: Economy, Society and Religion in a Siberian Collective Farm. Cambridge University Press, Cambridge, UK
Judd E R 1994 Gender and Power in Rural North China. Stanford University Press, Stanford, CA
Kligman G 1998 The Politics of Duplicity: Controlling Reproduction in Ceauşescu’s Romania. University of California Press, Berkeley, CA
Kornai J 1980 Economics of Shortage. North-Holland, Amsterdam
Kornai J 1992 The Socialist System. Princeton University Press, Princeton, NJ
Nee V, Stark D (eds.) 1989 Remaking the Economic Institutions of Socialism: China and Eastern Europe. Stanford University Press, Stanford, CA
Rév I 1987 The advantages of being atomized. Dissent 34: 335–50
Skinner G W, Winckler E A 1969 Compliance succession in rural Communist China: A cyclical theory. In: Etzioni A (ed.) A Sociological Reader on Complex Organizations, 2nd edn. Holt, Rinehart and Winston, New York
Szelényi I 1983 Urban Inequalities under State Socialism. Oxford University Press, Oxford, UK
Verdery K 1991 National Ideology under Socialism: Identity and Cultural Politics in Ceauşescu’s Romania. University of California Press, Berkeley, CA
Verdery K 1996 What Was Socialism, and What Comes Next? Princeton University Press, Princeton, NJ
Walder A G 1986 Communist Neo-traditionalism: Work and Authority in Chinese Industry. University of California Press, Berkeley, CA
Wesolowski W 1966 Classes, Strata, and Power. Routledge & Kegan Paul, London
Yang D 1996 Calamity and Reform in China: State, Rural Society, and Institutional Change since the Great Leap Famine. Stanford University Press, Stanford, CA
K. Verdery
Sociality: Anthropological Aspects
1. The Problem of Culture
Sociality is the capacity for social behavior, a capacity particularly marked among human beings, as evidenced by the intensity, complexity, and endless variety of human social life. This definition echoes the earlier use of sociality in philosophy and psychology, where it is used to denote the fact of human social interdependence and to correct a tendency in those disciplines to explain human behavior in individualistic terms alone. Sociality is used differently among evolutionary biologists, where it denotes the actually observed social behavior of a species, such as its mating system or dominance hierarchy; sociality so described is then subjected to an explanation from natural selection (Wilson 1975). But human social behavior is so varied and so open-ended that no such simple description or explanation can be given. We must, therefore, describe human sociality as a potential, one which results in many different social systems.
To understand human sociality we need to turn first to the concept of culture. For it is culture which has borne most of the weight of anthropological efforts to think about human nature in general and about human potential, and indeed one could regard ideas about culture as anthropology’s first approximation to a picture of human nature. This was sketched in by theorists and fieldworkers before World War II and confirmed in the great expansion of fieldwork and theory after that war. In this first approximation, human beings are dependent upon the possession of some particular culture, both for their survival and for their recognizably human character. For culture comprises everything that we might consider knowledge, from knowledge of language and other practical techniques, through knowledge about the immediate natural world and the larger cosmos, to knowledge of social customs and social relations. Geertz captured the encompassing import of culture when he remarked that ‘man’s nervous system does not merely enable him to acquire culture, it positively demands that he do so if it is going to function at all’ (Geertz 1973, p. 73). On this view, human beings have evolved to have culture, and each individual human being begins as a more or less blank slate upon which the knowledge of culture can be written, or as a new computer into which a program must be loaded. Moreover, the saliency of culture, as opposed to raw genetic inheritance, in human nature is certified by the tremendous variety in ways of life across the world. Humanity presents a patchwork of sharply discrete societies, each with its own social organization, its own language, and its own view of the world.
It is the nature of these cultures, according to the first approximation, to be deeply traditional and resistant both to change and to mutual understanding. Yet anthropologists’ confidence in this first approximation has given way since the 1980s to sharp doubts. For such a conception of human nature is still, for all its ambition and universal scope, a narrow and partial one. It misses out the change, incompleteness, ambiguity and potential (that is, the historicity) that are so prominent in human society and experience (Peel 1987, Fox 1985). Wolf (1982) showed that there have existed no peoples beyond the reach of historicity, change, and interchange between peoples, no primary and untouched cultures. Human beings do create new roles and new institutions and ways of doing things, such as the joint stock company, which has played such a great part in world history in recent centuries. And human beings do, as a matter of course, make effective contact and create enduring social relationships beyond the apparent limitations of our natal culture, as evidenced in networks of trade, in long-distance marriage networks (such as those of India), in the practice of ethnographic fieldwork, and in all the other occasions, such as migration for work, for safety, and even for conquest, that create a continual sifting and reblending of people and cultures across the globe. (For a sense of relatedness beyond ostensible boundaries, see Hannerz 1996.) Anthropologists have begun to look to human sociality to fashion a second approximation of human nature that will allow for such flexibility (see Ingold 1986, Carrithers 1992).
2. Sociality as Intersubjectivity and Social Intelligence
It is perhaps easiest to capture sociality at work in infancy, when the infant has not yet acquired the tools of a particular culture. Human infants already show a responsiveness to others which goes well beyond that found among any other species, including the great apes. Even a newborn can imitate facial expressions and respond to the emotions of others. This is primary intersubjectivity, in which the baby already has a dual ‘self+other’ orientation (Trevarthen and Logotheti 1989). By the age of one year, infants have developed the ability to share purposes and intentions with others, even if they cannot yet formulate these clearly in speech. So the infant possesses an innate and growing social intelligence that operates with the feelings and doings of others. Upon the basis of primary intersubjectivity, for example, the infant will be able to go on in the first two years of life to achieve joint attention with others, following a carer’s gaze or gesture to an object, and will learn to draw a carer’s attention to an object. The intellectual achievements of intersubjectivity and joint attention form what Bruner (1983) has called the Language Acquisition Support System, the scaffolding of abilities upon which the joint capacity of speech, as well as the other achievements of culture, will be constructed. Later, at the age of three years or so, the child will begin to differentiate what they know and feel from what others around know and feel; they will then begin to simulate the knowledge and feelings of others, and on that basis will be able to experiment with a growing variety of social relationships, both in play and in earnest. Already at this age, that is, the child has begun to ‘take the role of the other,’ in George Herbert Mead’s famous phrase, the facility upon which all distinctively human socializing is built.
It is useful to think of sociality as social intelligence, and indeed it seems quite likely that our technical and symbolic intelligence is an evolutionary by-product of the formidable abilities developed in social interaction. But there is one implication of the term ‘intelligence’ that must be stripped away, and that is the assumption that sociality is a strictly individual capacity. On the contrary, it comprises an intellectual, cognitive, and emotional openness to others which runs throughout human experience, in childhood or adulthood. The child matures, but into a matured form of intersubjectivity, of self+other orientation, not into sealed isolation. This means, for example, that most human abilities and knowledge are distributed, a term drawn from computing to indicate the distribution of a computing problem across more than one computer (or central processing unit). One sees this on a smaller scale in the coordination of persons and sharing of information in subsistence activities among hunter–gatherers, or on a larger scale in the division of labor that brings this encyclopedia to the reader: at no point need, or can, all the necessary skills and information reside in one individual; they must be distributed across all participants. The knowledge and skills exist only with the scaffolding of persons-in-relationship who exercise the knowledge. To return to Geertz’s dictum: it is not that the individual nervous system acquires culture, but that the socially intelligent nervous system completes itself with others to acquire culture jointly with them. It is a good deal more difficult for researchers to keep this more complex, more nuanced image in mind, but it is far more faithful to the character of human life.
2.1 Sociality as Mind-reading in Talk
So one achievement of sociality is to make human knowledge and skills much more than the sum of individuals, and to that extent, more powerful. But there is a cost as well, for the continual self+other orientation of people requires constant attention and a great deal of energy. We are socially calculating beings. We must be able to decide not only what to do, but to calculate the consequences of our actions on others, their probable reaction to those consequences, and the significance of those reactions for ourselves. The calculating ability thus required can be called higher-order intentionality, the ability to think complexly about one’s own and others’ states of mind and reactions. To put this abstractly: one represents to oneself others’ states of mind (first-order intentionality), or one represents to oneself others’ representations of one’s own state of mind (second-order intentionality), or one even represents to oneself others’ representations of one’s own representations of their state of mind (third-order intentionality). Here is an example of third-order intentionality: I believe that Amy recognizes that I accept what she is saying. One can go further: I believe that you recognize that I think you understand the general idea of higher-order intentionality. In thus spelling out the idea of higher-order intentionality, we can see how complicated it is, and therefore how much intelligence it requires. Yet these are also the everyday, unremarked, and swiftly deployed abilities that a mature human being needs to conduct even the most workaday talk. Whiten (1991) has called this ability ‘mind-reading,’ but of course it is not mind-reading as some mystical skill of seeing directly into the little grey cells, but is rather based on interpretation of the clues of glance, tone, timing, and nuance that people emanate continually in their interactions.
To weigh the significance of these central abilities in human sociality, we need first to note the cumulative effect of people’s ability to conduct such everyday talk. For through such talk, we are able to create and sustain complex skeins of joint action, including the joint action of provisioning and sustaining ourselves. As Maurice Godelier remarked, ‘human beings … do not just live in society, they produce society in order to live’ (Godelier 1986, p. 1). On that account, sociality is a resounding practical success, opening creatively onto many forms of joint social action. But such practical success, leading to some reliability and permanence in social relations, is only partial. For, as one can already see from the nature of mind-reading, the imputation of beliefs and attitudes to others can hardly be infallible. Slips are frequent, misunderstandings are sometimes repaired and sometimes not, and the interpretation of what one says can never fully be determined by the words themselves, as discourse and conversation analysts have so meticulously documented (see e.g., Moerman 1988). The cumulative direction of conversation cannot be controlled fully by any single participant, and all participants’ interpretations and interests contribute to the process through implicit or explicit negotiation (Heritage 1984). So the seeming sureness of everyday talk is actually a laborious and uncertain business, a matter of minute-by-minute, indeed second-by-second, interpretation, with all the fallibility, but also creativity, inherent in the idea of interpretation.
Max Weber’s famous phrase, ‘unintended consequences,’ is usually used of large social phenomena, but unintended consequences begin in the fine grain of everyday life.
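The nesting that defines higher-order intentionality can be made concrete with a short sketch. The Python below is an illustration only and is not part of the original analysis: it treats each attitude (‘believe,’ ‘recognizes,’ ‘accept’) as one layer wrapped around the next and counts the layers, following the counting used in the passage above. The names Layer, intentionality_order, and render are hypothetical.

```python
from typing import List, Tuple

# One layer of representation: (who holds the attitude, the attitude verb).
# Layers are listed from the outermost thinker inward.
Layer = Tuple[str, str]


def intentionality_order(layers: List[Layer]) -> int:
    """Count the nested attitudes, following the counting used in the text."""
    return len(layers)


def render(layers: List[Layer], content: str) -> str:
    """Spell the nested attitudes out as an English sentence."""
    clause = " that ".join(f"{agent} {verb}" for agent, verb in layers)
    return f"{clause} {content}"


# The article's own example of third-order intentionality.
third_order = [("I", "believe"), ("Amy", "recognizes"), ("I", "accept")]
print(render(third_order, "what she is saying"))
# prints: I believe that Amy recognizes that I accept what she is saying
print(intentionality_order(third_order))
# prints: 3
```

Each added layer lengthens the sentence by only a few words, yet it is one more state of mind to keep track of, which is why the calculation becomes demanding so quickly.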
2.2 Sociality in Narrative Thought
So human sociality involves a noisy eventfulness that resembles that of other primate groups, but which also has notably human features. We often do things, not only because of what others do, but also because of what they might think or feel: I may apologise because she was angry, I explain myself because someone might think I am inconsiderate. So an action can be not only practical, but also have an interactive meaning with social effect: I built that fence to keep out stray cattle, but also to spite my neighbor. Moreover people can, and do, weave such actions and reactions together over a long series, linking cause and effect, action and reaction. This may be called a plot line or narrative, and indeed this larger scale mutual awareness may be called narrative thought (Bruner 1986, Carrithers 1992). Narrative thought is richly evidenced in gossip, story, history, and myth, that is, in all the ways in which people are able to represent complex skeins of action to one another. To understand the story of Oedipus, for example, the audience must be able to bear in mind the relationships between Jocasta, Tiresias, and Oedipus as they develop through a complex play of events and changes in the characters’ states of mind. Moreover people are able, through narrative thought, to come to understand the underlying propensities, attitudes, beliefs (in a word, the character) of those involved in the skein of action. But narrative thought is much more than a passive competence, because it is only through understanding one’s place in the various skeins of action and characters that comprise one’s own life that one is able to act sensibly and appropriately to others. Societies vary greatly in their descriptions of characters and their notions of plots, but the basic theme of narrative thought, that of placing oneself with others in mutual awareness of a larger time frame and a larger society, going beyond immediate events and those immediately present, is characteristic of all human life.
Moreover, this larger view over events concerns not only the past, but the future as well. Goody (1995) has written of ‘anticipatory interactive planning’ as a characteristic of human sociality, and Cole (1996) has shown how people use their narrative understanding of the past to project the future. He calls this prolepsis, ‘looking ahead.’ He uses the example (drawn from the observations of an obstetrician) of a mother who said, as she held her newborn daughter in the hospital for the first time, ‘I shall be worried to death when she’s 18.’ The remark is a sort of miniature of the complexities of human sociality.
It shows the engagement of the mother with the daughter, and looks forward to a projected, predicted life cycle for the daughter. And of course the remark looks back as well, to the experience of the mother and her peers and forebears: the mother is (re)collecting her own and others’ past to predict her daughter’s future.
3. Sociality and Culture
In regarding human nature from the point of view of sociality, we bring human agency, human intersubjectivity, and historicity to the fore, linking human interaction and human creativity. What then of culture? It is difficult to see how culture-as-a-computer-program could fit here, but there is a fruitful alternative, namely to regard culture as a collection of artefacts. Through such artefacts human beings mediate interactions between each other, and between themselves and the natural world (Cole 1996). Actual tools are of course included, and we can see a good deal of how they mediate between people when we think of writing, books, telephones, or computers. But language itself, with all its panoply of different tools, is also a collection of artefacts for mediating between people. Thus, for example, South Asian languages tend to possess an extensive set of second person pronouns, each denoting a degree of rank difference between the speaker and the interlocutor. Such tools allow people to set a relationship … or to dispute it, by choosing a pronoun different from the one the interlocutor may wish to hear. Kinship systems are another set of such tools. There are also corporeal artefacts, such as the gesture of obeisance in South Asia known as offering puja or worship, which establishes a relationship of great respect to some real or spiritual other. Other artefacts include what have been called ‘scripts,’ the little slices of action through which people accomplish routine tasks. There are scripts, for example, for greeting people politely throughout the world, politeness being a marked feature of human sociality (Brown and Levinson 1987, Strecker 1988). These scripts may be played practically, to get the work of setting a relationship or conversation going, but also strategically, to attain some end of the speaker or interlocutor. And there are also stories, or story templates, that guide people over longer stretches of action. Many have noted, for example, how the story of the good stranger arriving in town to deal with evil-doers has often influenced American foreign policy.
It is characteristic of human ingenuity that these artefacts may not only be used strategically, but changed to achieve new ends or to meet new circumstances. Perhaps the most spectacular examples of this are the creation of writing from speech, and later books from writing, thus creating new, mediated, forms of interaction, ‘exploded interactions’ (Carrithers 1992), between people separated in time and space. On the first approximation of human nature, it was difficult to see how cultures could change, or indeed how the great variety of cultures had come into being.
But by regarding culture as a collection of artefacts and humans as possessing sociality beyond any prescriptions of culture, we can begin to see that any culture itself has a history, and that no one culture circumscribes finally the potential of those who use it. In this second approximation of human nature, both historicity and the creation of new cultural forms are already woven into the human fabric.
Bibliography
Brown P, Levinson S C 1987 Politeness: Some Universals in Language Usage. Cambridge University Press, Cambridge, UK
Bruner J S 1983 Child’s Talk: Learning to Use Language. W. W. Norton, New York
Carrithers M 1992 Why Humans Have Cultures: Explaining Anthropology and Social Diversity. Oxford University Press, Oxford, UK
Cole M 1996 Cultural Psychology: A Once and Future Discipline. Harvard University Press, Cambridge, MA
Fox R G 1985 Lions of the Punjab: Culture in the Making. University of California Press, Berkeley, CA
Geertz C 1973 The Interpretation of Cultures: Selected Essays. Basic Books, New York
Godelier M 1986 The Mental and the Material: Thought, Economy, and Society. Verso, London
Goody E N (ed.) 1995 Social Intelligence and Interaction: Expressions and Implications of the Social Bias in Human Intelligence. Cambridge University Press, Cambridge, UK
Hannerz U 1996 Transnational Connections: Culture, People, Places. Routledge, London
Heritage J 1984 Garfinkel and Ethnomethodology. Polity Press, Cambridge, UK
Ingold T 1986 Evolution and Social Life. Cambridge University Press, Cambridge, UK
Moerman M 1988 Talking Culture: Ethnography and Conversation Analysis. University of Pennsylvania Press, Philadelphia, PA
Peel J D Y 1987 History, culture, and the comparative method. In: Holy L (ed.) Comparative Anthropology. Blackwell, Oxford, UK
Strecker I A 1988 The Social Practice of Symbolisation: An Anthropological Analysis. Athlone, London
Trevarthen C, Logotheti K 1989 Child in society, society in children. In: Howell S, Willis R (eds.) Societies at Peace: Anthropological Perspectives. Routledge, London
Whiten A (ed.) 1991 Natural Theories of Mind: Evolution, Development and Simulation of Everyday Mindreading. Blackwell, Oxford, UK
Wilson E O 1975 Sociobiology: The New Synthesis. Harvard University Press, Cambridge, MA
Wolf E R 1982 Europe and the People without History. University of California Press, Berkeley, CA
ever, artificial constructs of humans. As such, they are more or less useful in answering specific questions, rather than more or less true and realistic.
M. Carrithers
Sociality involves multiple individuals of the same species engaging in a common task. Such tasks may involve any sort of behavior, but most research on the evolution of sociality has focused on the evolution of social cooperation in the production and rearing of offspring. The main goal of this research program is to explain and predict the occurrence of different social systems of multiple individuals engaged in reproduction, from aspects of the genetic, phenotypic, and ecological traits present in different species or populations. Pursuing this goal requires definition and classification of social systems.
Taxa exhibiting communal social systems include some bees, beetles, mites, spiders, birds, rodents, and primates. By contrast, cooperative breeding has been described in many birds, some bees and wasps, and various mammals (especially carnivores, viverrids, and rodents). Taxa exhibiting eusociality include many Hymenoptera (ants, bees and wasps), some gall-inducing aphids and thrips, sponge-dwelling, shrimp, termites, encyrtid wasps, one species of bark beetle, naked mole rats, and damaraland mole rats (Choe and Crespi 1997). The challenge for students of social evolution is to make sense of the sporadic and idiosyncratic taxonomic distributions of these social systems and to find the commonalities and differences between groups that are responsible for their convergent and divergent social systems. The main determinants of social system evolution include genetic relatedness and aspects of ecology.
1. Defining Social Systems
2. Genetic Relatedness and Altruism
The purpose of defining social systems is twofold: clear communication between researchers, and classificatory structure for comparative evolutionary analysis of the causes of social evolution. To the extent that social system classifications are not based solely on phylogenetic relationships of species, they are, how-
Much cooperation is mutualistic in that all of the interacting individuals gain from the behavior. However, many forms of cooperation involve altruistic behavior, whereby one individual provides benefits to another at a cost to itself. The presence of such behavior appears paradoxical to evolutionary theory
Sociality, Evolution of
14504
1.1 Social System Classification Reproductive social systems can be classified into three categories, based on (1) the presence and absence of behaviorally distinct sets of individuals within a social group, one of which specializes in reproduction and one of which specializes in helping the morereproductive individuals, and (2) permanence of these behaviorally distinct sets, such that in societies with permanently distinct groups (called ‘castes’) some individuals become irreversibly phenotypically distinct prior to adulthood (Choe and Crespi 1997). ‘Communal’ societies exhibit cooperation in reproduction, but they lack behavioral specialization into ‘reproductives’ and ‘helpers’. ‘Cooperatively breeding’ societies show such behavioral specialization, but it is not irreversible: some individuals can and do switch between the reproductive and helper roles as adults. ‘Eusocial’ (‘truly social’) societies exhibit castes. These three categories refer to societies rather than to species, because some species exhibit social variability across geographic regions or across the life cycle. 1.2 Social Taxa
Table 1 The three mechanisms leading to social cooperation

1. Benefits gained by cooperators
   a. mutualism or altruism – all benefit equally from cooperation, no incentive to cheat (i.e., exploit resources created by cooperators)
   b. mutualism or altruism – unequal benefits from cooperation, incentive to cheat, success of cheaters limited in some way (e.g., is frequency-dependent)
2. Costs imposed on non-cooperators
   a. manipulation – dominant reduces independent breeding prospects of subordinate(s), favoring helping
   b. punishment – dominant individual reduces fitness of non-cooperators
   c. policing – less-reproductive individuals prevent one another from engaging in reproductive cheating
3. Forced cooperation: dominant individual takes control of behavior away, such that subordinate individual must cooperate
because reduced fitness cannot evolve. Hamilton (1964) proposed a resolution to this paradox: that genes for altruism can increase in frequency and be maintained because they may also be present in the recipient of the benefits. Thus, by helping genetic relatives, an individual can maximize its ‘inclusive fitness,’ the sum of its own reproduction (descendant kin), plus its reproduction via its positive effects on collateral kin devalued by its genetic relatedness to them. Altruistic behavior can be selected for under such ‘kin selection’ if rb − c > 0, where r is the relatedness of actor to recipient (the probability that a focal gene in the actor is present in the recipient), b is the benefit to the recipient, and c is the cost to the actor. Inclusive fitness theory predicts that genetic relatedness should be high among individuals in social systems with altruism. In accordance with this prediction, substantial levels of genetic relatedness have been found between interacting individuals in the overwhelming majority of cooperatively breeding and eusocial species thus far studied (Choe and Crespi 1997). By contrast, the few communal species for which such data are available exhibit colonies composed of nonrelatives or distantly related individuals. All told, inclusive fitness theory has gained tremendous empirical support and is one of the cornerstones of social evolution theory.

Altruism can also evolve if it is reciprocal, such that the altruistic act of one individual to another is later reciprocated (Trivers 1971) (Table 1). By this mechanism, individuals need not be relatives, but there must be some means whereby cheating is thwarted lest the system break down. Such means may include recognition of individuals or the building of ‘reputations,’ and as such reciprocal altruism may be limited to species with highly developed cognitive abilities, such as primates.
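As a worked illustration of the kin-selection condition in this section (altruism is favored when relatedness times benefit exceeds cost), the following minimal Python sketch evaluates the rule for two cases. The function names and the numerical values of b and c are hypothetical and chosen only for illustration; r = 0.5 is the standard relatedness between full siblings in a diploid species.

```python
def inclusive_fitness_effect(r: float, b: float, c: float) -> float:
    """Net inclusive-fitness effect of an altruistic act: r*b - c.

    r: relatedness of actor to recipient (0..1)
    b: benefit to the recipient
    c: cost to the actor
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("relatedness r must lie in [0, 1]")
    return r * b - c


def altruism_favored(r: float, b: float, c: float) -> bool:
    """Hamilton's rule: altruism can be selected for when r*b - c > 0."""
    return inclusive_fitness_effect(r, b, c) > 0


if __name__ == "__main__":
    # Hypothetical benefit and cost values, for illustration only.
    # Helping a full sibling (r = 0.5): 0.5 * 3 - 1 = 0.5 > 0, so favored.
    print(altruism_favored(r=0.5, b=3.0, c=1.0))
    # Helping a nonrelative (r = 0): 0 * 3 - 1 = -1 < 0, so not favored
    # by kin selection alone.
    print(altruism_favored(r=0.0, b=3.0, c=1.0))
```

The second case is consistent with the text's point that cooperation among nonrelatives, as in communal species, has to be explained by mutualism or reciprocity rather than by kin selection.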
3. Ecology

The ecology of a species comprises its habitat, food sources, demography, competitors, predators, parasites, and mutualists. The ecological niche of a species, and the nature of its interspecific and intraspecific interactions, form the selective context and selective agents for the evolution of social behavior (Evans 1977, Strassmann and Queller 1989).

3.1 Habitat

Cooperation in reproduction is concentrated in habitats that select for juveniles to stay at home after reaching adulthood, because of benefits to philopatry and costs to dispersal and independent reproduction. Among social vertebrates, the preponderance of social species are territorial and live in environments where independent breeding opportunities appear more limited than in related non-social taxa (Brown 1974, Emlen 1982). Indeed, many studies have shown a positive association between the degree of ‘ecological constraint’ on independent reproduction (measured in terms of some aspect of habitat availability) and the extent to which the offspring of a set of parents stay at home to serve as ‘helpers,’ forming a cooperative-breeding social system. By staying at home, such offspring may be able to increase their inclusive fitness via altruism towards relatives, they may eventually inherit the territory, or they may simply gain a favorable environment in which to wait for a breeding opportunity to arise elsewhere.

Among social invertebrates, sociality is more or less restricted to species that live in improvable, expansible, or defensible microhabitats that serve either as nurseries or as combined nurseries and food-rich sites (Alexander et al. 1991). For example, social aphids and thrips live in galls (hollow modified leaves or stems), social encyrtid wasps live in the body cavities of their hosts, social bark beetles live in the heartwood of trees, social snapping shrimp live in sponges, and most social Hymenoptera live in nests in the ground, in stems, or in wood (Choe and Crespi 1997). These habitats are especially valuable and tend to select for offspring philopatry and helping behavior, in the same way as do the territories of social vertebrates (Strassmann and Queller 1989).
The two species of social mole rat provide a striking link between the invertebrates and vertebrates: both live in colonies underground in expansible burrows like bees, ants, or termites, and, like social thrips, aphids, shrimp, and some termites, their food is available at the nest site.

Although cooperation is clearly associated with particular sorts of habitat in both vertebrates and invertebrates, few studies have yet compared social taxa to closely related non-social taxa with regard to the specific characteristics of habitats believed to be important. Moreover, researchers are only beginning to link differences in habitat characteristics to differences in social system, to discern what aspects of habitat tend to select for communal, cooperative breeding, or eusocial behavior. The most obvious pattern to date is that virtually all species living in habitats that combine food and shelter, such that the colony members need not leave their domicile during the period of rearing offspring, exhibit eusociality, if they also have predators and the ability to defend against them (Crespi 1994).

3.2 Natural Enemies

Because the habitats of social animals are so rich in resources such as food, helpless juveniles, and shelter, they are prime targets for attack by a multitude of natural enemies, including predators, competitors, and parasites (Evans 1977, Strassmann and Queller 1989). These enemies represent strong selective forces for cooperation because they can decimate entire social groups of invertebrates and because defense against them often involves the evolution of behavioral and morphological specializations that trade off with reproduction. Almost all social species have some means of defense, such as the stings of Hymenoptera, the mouth parts or modified forelegs of soldier aphids, the armed forelegs of soldier thrips, the incisive teeth of mole rats, the claws of shrimp, or the diversity of chemical and physical defenses of soldier termites (Choe and Crespi 1997).
Some of these defensive traits apparently served as pre-adaptations for the origin of sociality (e.g., stings, which evolved in a non-social context and may have facilitated the origin of sociality), whereas other traits, such as those exhibited by the soldiers of termites, clearly evolved in the context of social defense. In contrast to invertebrates, vertebrates have few of the types of natural enemies that select for sociality, except in some taxa (e.g., mongooses) that are highly vulnerable to attack by other vertebrates (Alexander et al. 1991). For most vertebrates, intraspecific competition for resources, both between and within social groups, appears to be a more important selective force for cooperation.

3.3 Demography

Social behavior evolves within the context of life history, and aspects of demography, such as birth and death schedules and lifespans, have come under increasing scrutiny as factors that may influence social systems and be influenced by them. One demographic benefit of sociality is that an individual may help to rear dependent relatives when it is young, before it would normally reproduce on its own. If mortality rates of adults are sufficiently high, relative to the duration of offspring dependence, then such inclusive fitness ‘head starts’ may greatly increase the benefits of helping (Queller 1989). Moreover, extrinsic mortality rates of helpers are expected to be higher than those of reproductives, which results in faster senescence of helpers, extended lifespans of the reproductives, and an accelerated divergence of castes in behavior and morphology (Alexander et al. 1991). Indeed, in many Hymenoptera species, reproductives can live for many years, while helpers only survive for a few weeks or months.

In some invertebrates (i.e., gall thrips and gall aphids), sociality appears to be associated with a longer duration of the habitat (the gall) than in closely related non-social forms (Choe and Crespi 1997). Moreover, cooperatively breeding bird species appear to be longer lived than their non-social relatives (Arnold and Owens 1998). In both cases, the direction of causation is unclear. In birds, cooperation may be favored by longer lifespans leading to stronger ecological constraints, or longer lifespans may result from delayed breeding. Similarly, longer-lived galls may be more vulnerable to attack, leading to the evolution of soldier castes, or having soldiers may allow for an extended gall duration. Because demography is so closely tied to fitness, assessment of its role in social evolution is one of the main imperatives of current research.
4. Mechanisms of Cooperation

Social behavior can arise not only via benefits gained by cooperators, through mutualism, altruism, or reciprocity, but also via costs imposed on non-cooperators, through manipulation, punishment, or policing, or via dominant individuals forcing others to cooperate (Table 1). Determining under what ecological and genetic-relatedness circumstances these different mechanisms evolve is critical to further progress in the study of social evolution.
5. Synthesis of Relatedness, Ecology and Mechanism

Explaining and predicting the evolution of social systems requires the development of models that combine relatedness, ecology, and mechanism into a common framework. Models of ‘reproductive skew’ (i.e., variation in reproductive success) parameterize aspects of ecology into two components: (1) the degree of ecological constraint on independent reproduction, and (2) the productivity benefits of cooperation (i.e., the benefits of cooperation and division of labor in terms of offspring production) (Vehrencamp 1983, Johnstone 2000). These two variables, plus genetic relatedness, enter into models that seek to predict under what genetic-relatedness and ecological conditions a pair of individuals will form a stable cooperative group and to what degree reproduction in the group is skewed in favor of the dominant. A critical assumption of most skew models is the mechanism of reproductive control: the dominant individual controls the extent of reproduction of the subordinate and can thus concede reproduction to the subordinate to motivate it to stay and help, rather than leave and attempt independent reproduction. The degree to which the assumption of dominant control over reproduction is satisfied in nature is currently a matter of considerable dispute (Johnstone 2000).

Skew models have predicted a number of observed broad-scale patterns, such as low relatedness and low ecological constraint in communal social systems versus high relatedness and constraint under cooperative breeding. However, their usefulness is still limited, because analysis of their assumptions, the ecological and genetic variables involved, and their predictions is empirically so challenging. Future development of skew models requires (1) expansion of their domain of applicability to include both intersexual and intrasexual interactions, (2) enlarging their range of assumptions regarding control of reproduction, (3) consideration of additional mechanisms of cooperation, such as manipulation (Alexander 1974), policing, eviction, and punishment (Table 1), (4) incorporation of demography and an explicit element of time into the models, and (5) feedback between the findings of empirical studies and the assumptions of the models.
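To make the logic of dominant-controlled skew concrete, the following is a minimal sketch of a two-individual concession calculation. It is not the specific model of Vehrencamp (1983) or Johnstone (2000); the parameterization (x, k, and a lone dominant producing one unit of offspring) is a simplifying assumption stated in the docstring, and the function name is hypothetical. The sketch computes the smallest share of group reproduction the dominant must concede so that the subordinate's inclusive fitness from staying at least equals that from leaving.

```python
def staying_incentive(r: float, x: float, k: float) -> float:
    """Smallest reproductive share p that keeps a subordinate in the group.

    Illustrative assumptions (not taken from the article):
      - a lone dominant produces 1 unit of offspring;
      - a subordinate that leaves produces x units on its own
        (low x corresponds to strong ecological constraint);
      - a two-member group produces k units in total
        (k reflects the productivity benefit of cooperation);
      - r is the symmetric relatedness between the two, with r < 1.

    Subordinate's inclusive fitness if it stays:  p*k + r*(1 - p)*k
    Subordinate's inclusive fitness if it leaves: x + r*1
    Setting staying >= leaving and solving for p gives
        p >= (x - r*(k - 1)) / (k*(1 - r)).
    The result is clamped to [0, 1]; a value of 1 means that no
    feasible concession can retain the subordinate.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("r must lie in [0, 1)")
    if k <= 0.0:
        raise ValueError("k must be positive")
    p = (x - r * (k - 1.0)) / (k * (1.0 - r))
    return min(1.0, max(0.0, p))


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for r, x in [(0.0, 0.9), (0.0, 0.5), (0.5, 0.5)]:
        p = staying_incentive(r=r, x=x, k=2.0)
        print(f"r={r}, x={x}: subordinate share >= {p:.2f}, "
              f"dominant share <= {1.0 - p:.2f}")
```

Under these assumptions, and when grouping is productive (k greater than 1 + x), the required concession shrinks as relatedness rises or as independent breeding prospects fall, so skew in favor of the dominant increases. This matches the broad-scale pattern described in the text of low skew with low relatedness and weak constraint versus high skew with high relatedness and strong constraint.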
Cooperation can arise in a wide variety of ways, and a hierarchical set of skew models, encompassing this diversity of mechanisms, will spur further progress in understanding how sociality evolves.

See also: Altruism and Prosocial Behavior, Sociology of; Collective Behavior, Sociology of; Evolutionary Social Psychology; Group Processes, Social Psychology of; Groups, Sociology of; Human Behavioral Ecology; Primate Socioecology; Primates, Social Behavior of; Social Evolution, Sociology of; Social Psychology; Social Psychology, Theories of; Sociality: Anthropological Aspects
Bibliography

Alexander R D 1974 The evolution of social behavior. Annual Review of Ecology and Systematics 5: 325–83
Alexander R D, Noonan K M, Crespi B J 1991 The evolution of eusociality. In: Sherman P W, Jarvis J U M, Alexander R D (eds.) The Biology of the Naked Mole Rat. Princeton University Press, Princeton NJ, pp. 3–44
Arnold K E, Owens I P F 1998 Cooperative breeding in birds: A comparative test of the life history hypothesis. Proceedings of the Royal Society of London Series B 265: 739–45
Brown J L 1974 Alternative routes to sociality in jays – with a theory for the evolution of altruism and communal breeding. American Zoologist 14: 63–80
Choe J C, Crespi B J 1997 The Evolution of Social Behavior in Insects and Arachnids. Cambridge University Press, Cambridge UK
Crespi B J 1994 Three conditions for the evolution of eusociality: are they sufficient? Insectes Sociaux 41: 395–400
Emlen S T 1982 The evolution of helping. I. An ecological constraints model. American Naturalist 119: 29–39
Evans H E 1977 Extrinsic versus intrinsic factors in the evolution of insect sociality. BioScience 27: 613–7
Hamilton W D 1964 The genetical evolution of social behavior, I and II. Journal of Theoretical Biology 7: 1–52
Johnstone R A 2000 Models of reproductive skew: a review and synthesis. Ethology 106: 5–26
Queller D C 1989 The evolution of eusociality: Reproductive head starts of workers. Proceedings of the National Academy of Sciences of the USA 86: 3224–6
Strassmann J E, Queller D C 1989 Ecological determinants of social evolution. In: Breed M D, Page R E Jr (eds.) The Genetics of Social Evolution. Westview Press, Boulder CO, pp. 81–101
Trivers R L 1971 The evolution of reciprocal altruism. Quarterly Review of Biology 46: 35–57
Vehrencamp S L 1983 A model for the evolution of despotic versus egalitarian societies. Animal Behavior 31: 667–82
B. Crespi
Socialization and Education: Theoretical Perspectives

Socialization refers to all processes by which an individual learns the ways, ideas, beliefs, values, and norms of his particular culture and adapts them as part of his own personality in order to function in his given culture. Education, on the other hand, deals explicitly with intentional attempts at influencing an individual’s developmental course. In the following, some core anthropological assumptions guiding the thinking and theorizing about socialization and education will first be described. Second, major units of analysis, in particular phases of individual development, topics of socialization, socializing institutions, and cultural variations, will be briefly addressed. Third, an overview of three groups of theoretical approaches to socialization and educational processes, i.e., biological, psychological, and sociological theories, will be presented. Finally, some elements of an integrative theoretical framework for studying socialization and educational processes will be outlined.
1. Core Anthropological Assumptions

At the heart of all theorizing about socialization and education are central prototheoretical beliefs concerning the nature and development of human beings. In fact, as Montada (1998) has pointed out, there are at least four distinguishable models of individual development, depending on whether the individual and/or his environment are conceived of as playing an active or passive part. Besides endogenistic, exogenistic, and self-creation theories, there exists a fourth group of theories which views both person and environment as active agents. The latter group of theories rests on the common assumption that individual development is a dynamic interactional process which evolves in a sequence of reciprocally patterned person–environment transactions. Besides this basic assumption, there are at least six additional aspects that are central to a transactional conceptualization of human development, i.e., (a) learning capacity, (b) social dependence, (c) experience formation, (d) internal representations of experiences, (e) action competence, and (f) freedom of choice. Action competence and freedom of choice, in particular, are indispensable prerequisites for innovation and change concerning individual and sociocultural conditions of life.
2. Perspectives in Studying Socialization and Education

When looking at socialization and educational processes from an empirical point of view, there are at least four different, albeit interrelated, approaches that can be distinguished, i.e., a chronological, topical, institutional, and cultural approach. In a chronological perspective, the course of individual development can be divided into different age-graded phases such as infancy, childhood, adolescence, and early, middle, and late adulthood. The sequence of phases and corresponding transitions is structured by individual developmental tasks (Havighurst 1953) which, on the one hand, are based on physical and social cognitive maturation and, on the other, on normative expectations prevalent in a given society. From a topical perspective, socialization processes have a strong impact on certain life domains such as (a) physical growth and health, (b) cognitive, social, moral, and spiritual development, (c) sex and gender differences, and (d) self and personality development. From the vantage point of institutions, socialization takes place in different settings which, in some instances, are especially designed to advance the development of a broad spectrum of individual dispositions and competencies, either by formal or informal learning processes, i.e., families, schools, peer groups, work settings, and so on. Finally, from a cultural point of view, an individual’s development depends a great deal on the values and beliefs that are typical of the culture to which he or she belongs. It has been shown, for example, that the prevalence of an independent–autonomous conception of the self within individualistic cultures vs. an interdependent–relational self within collectivistic cultures is closely related to distinct socialization goals and practices stressing either autonomy or relatedness (Triandis 1995).
3. Major Theoretical Approaches to Socialization and Education

Although in the present context it is not possible to give a full account of the vast array of theories dealing with the structural and dynamic aspects of socialization and education, three major classes of theories, i.e., biological, psychological, and sociological, will be briefly reviewed.

3.1 Biological Theories

3.1.1 Sociobiology and evolutionary psychology. Based on Darwin’s groundbreaking ideas concerning the phylogenetic evolution of the species, sociobiology and, somewhat later, evolutionary psychology contended that ultimate explanations of human behavior must take into account the superordinate goals of securing and expanding the successful reproduction of the human species. Mate selection and parental investment, in particular, have been associated with universally evolved psychological mechanisms that are rooted in the human genotype as a consequence of phylogenetic adaptation. For example, Belsky et al. (1991) have proposed, and also presented some research evidence for, an evolutionary theory of socialization relating different kinds of reproductive strategies to specific childhood experiences, which in turn entail specific outcomes of an individual’s psychological and somatic development. More specifically, these authors argue for two distinct developmental pathways of reproductive strategy. One is characterized by stressful contextual and rearing environments, insecure attachments, early pubertal development, and onset of sexual activity within the context of short-term, unstable pair bonds, which in turn predispose to higher fertility rates and limited parental investment. In contrast, the other developmental trajectory shows the opposite pattern of rearing conditions and interpersonal relationships.

3.1.2 Behavior genetics.
While evolutionary theories focus on universal adaptive strategies, behavior genetic theories concentrate on explaining the variability of individual differences that can be accounted for by genetic and/or environmental influences (Plomin 1994). Behavior geneticists, in particular, have shown that individual differences in cognitive and socio-emotional dispositions are not only substantially influenced by the individuals’ genotype but also by genetic components of the environment (e.g., via the parents’ genotype impacting on their children’s environment). Moreover, besides the conceptually and empirically useful distinction between shared and non-shared environments, various kinds of genotype–environment covariation and interaction have been distinguished to account for genetically informed processes of individual development. In this vein, empirical analyses of cross-sectional and longitudinal data have led some authors to challenge the environmentalist position according to which, for example, familial influences such as parenting style or family climate have a strong impact on the development of individual differences in the filial generation.

3.2 Psychological Theories

There is a wide range of psychological theories that have been applied to describing, explaining, and controlling socialization and educational processes. In fact, all theories dealing with personality and social development have much to contribute to this goal. In the following, four theoretical approaches will be briefly described with a special emphasis on the processes of socialization and education (Schneewind and Pekrun 1994).

3.2.1 Psychoanalytical theories. In classical Freudian psychoanalytical theory the operation of defense mechanisms such as identification or sublimation plays a central role in socializing the growing individual’s initially untamed libidinal and aggressive drives.
In particular, coping with the oedipal situation via identification with the aggressor is supposed to be a developmental milestone, because it allows the young boy or girl to enormously expand his or her learning experiences, thus leading to a gradually more realistic and differentiated satisfaction of formerly more primitive expressions of his or her drive impulses. Later on, sublimation is the most elaborate and mature defense mechanism, which provides the individual with new ways in his private or professional life to follow through with his otherwise unsatisfied libidinal and aggressive needs at a socially acceptable level. Fixation is another important psychoanalytical concept that has been invoked to explain character formation at different stages of an individual’s psychosexual development, depending on either a lack, inconsistency, or abundance of gratifications at a particular developmental stage. Thus, distinct patterns of socialization practices at different psychosexual stages have been related to particular outcomes of children’s personality development. Cultural anthropologists, in particular, have followed this line of reasoning in contrasting more lenient vs. strict socialization practices in various cultures to demonstrate, albeit with only partial success, their impact on character formation (e.g., Whiting and Child 1953).

While the classical Freudian psychoanalytical model is based on the transformation of primitive into more differentiated ways of need satisfaction, neoanalytic theories, especially object relations theories, approach the issue of socialization and personality development quite differently. In object relations theories the central dynamic principle is a person’s need for interpersonal relatedness (Greenberg and Mitchell 1983). In a developmental perspective, the individual acquires internal representations of the self, the other, and the relationship between self and other, which tend to become more and more differentiated. Thus, the quality of relationships that an individual is engaged in appears to be vitally important with respect to his internal representations of self and other. Empirical studies, however, are scarce and have mostly been limited to the pathology of object relations.

3.2.2 Attachment theory. Although object relations theories bear some resemblance to attachment theory, the empirical status of the latter is much more elaborated. Attachment theory, which combines concepts drawn from ethology, psychoanalysis, cognitive psychology, and systems theory, has sparked off a multitude of theoretical, methodological, and empirical refinements (Cassidy and Shaver 1999). Of particular importance is the discovery of various attachment types (i.e., secure, resistant, avoidant, and disorganized/disoriented attachment in infancy; secure, preoccupied, anxious, and dismissing attachment in adulthood) and corresponding internal working models.
Moreover, it has been shown that specific aspects of socialization are related to different attachment types in theoretically meaningful ways. For example, several aspects of caregiving that promote secure mother–infant attachments have been found. These include characteristics such as sensitivity, positive attitude, synchrony, mutuality, support, and stimulation. Similarly, a number of distinct interpersonal behaviors such as mutual need for intimacy, exchange of positive affect, and constructive conflict resolution have been observed in adult relationships belonging to the secure type.

3.2.3 Epigenetic theories. The common feature of epigenetic theories is that individual development is conceived as a sequence of qualitatively distinct phases concerning the structure of experience and behavior. A prominent example is Piaget’s (1954) theory of cognitive development, comprising four major periods from the sensorimotor period to that of formal operations. These are associated with specific cognitive schemes that are built on the interplay of assimilation and accommodation processes. The epigenetic principle has influenced a series of theoretical conceptualizations concerning other developmental domains including moral reasoning, distributive justice, social perspective taking, or ego development. While Piaget emphasized the epigenetic sequence of development as an inherent process of self-creation, Vygotsky (1978) put a stronger emphasis on the cultural influences on development. In particular, his concept of the zone of proximal development clearly stresses the importance of socialization and education in the progression of epigenetically patterned levels of development.

3.2.4 Learning theories. This class of theories covers a wide range of theoretical approaches spanning from the orthodox behaviorism characteristic of Watson and Skinner to social–cognitive and action–theoretic conceptualizations of learning. Although the principles of classical and operant conditioning are by no means outdated, the so-called cognitive revolution in psychology led to a paradigmatic shift in learning theories. One of the earliest contributions to this new type of theory was Rotter’s (1972) social learning theory of personality, which combines basic principles of learning within an expectancy–value framework. In particular, Rotter’s concept of generalized expectancies for internal vs. external control of reinforcement and the social conditions shaping these expectancies, especially within the family and in the school, stimulated a tremendous amount of research, presumably because of the affinity of this construct to other socially relevant concepts such as autonomy, personal control, helplessness, or alienation stemming from quite different theoretical positions.
More recently, and in a similar vein, Bandura’s (1999) social cognitive learning theory of personality has contributed much to the analysis and theory-guided application of specific procedures fostering the development of the self system by socialization and self-directed action. Besides his influential theorizing and research on the processes of observational learning and modeling, Bandura has focused on the construct of self-efficacy, i.e., an individual’s belief that he or she possesses the necessary competencies to attain specific goals. In addition, Bandura has shown that the concept of self-efficacy can also be extended to collective efficacy beliefs of groups or even societies.

3.3 Sociological Theories

As in psychology, quite a few theoretical approaches have been created within sociology to account for the processes of socialization and education. In the following, three will be briefly described because of their historical and contemporary importance.

3.3.1 Structural functionalism. The theory of structural functionalism is closely associated with the work of Parsons, who has written more than 150 articles and books on this topic. The main question that Parsons addresses in his theory is how individuals become members of a given society in order to guarantee the survival and maintenance of the social system (see Parsons and Bales 1955). According to Parsons, society is a global social system which is based on an integrated value system. The individual person participates in the social system by interacting with others in accordance with the various roles and positions he or she holds in that system. The global social system itself consists of hierarchically ordered subsystems which are characterized by corresponding institutionalized norms. On the one hand, these norms are supposed to be congruent with society’s integrated value system and, on the other hand, they determine the expectations and rules attached to specific positions and roles. These are further specified with respect to a set of pattern variables comprising, for example, particularistic vs. universalistic values. In this respect, the family plays a particularly salient role (Parsons and Bales 1955). Drawing heavily on Freudian psychoanalytic theory, Parsons contends that via identification with their same-sex parents boys and girls become socialized into specific male and female role patterns. More precisely, gender-related role specialization is characterized by instrumental or task-oriented roles for males and expressive or person-oriented roles for females. Again, these gender-specific complementary roles are supposed to maintain the equilibrium of the family system as well as the social system as a whole.

3.3.2 Symbolic interactionism.
While Parsons’ theory of structural functionalism focused on the conditions ensuring the conservation and stability of the social system, symbolic interactionism called into question whether the expectations and behaviors associated with the enactment of roles can be determined as precisely as Parsons suggested. Instead, Mead (1934), who is credited as the originator of the symbolic interaction framework, maintained that social roles are acquired by imparting and sharing the meaning of symbol systems, particularly that provided by language, in order to ensure an appropriate coordination of the actions of two or more interacting persons. The central vehicle by which this can be achieved is the process of communication which makes role-taking a necessary prerequisite. In essence, the shared meaning of symbols is a coconstructive process which is an outgrowth of interpersonal communication. Therefore, the notion of
socialization is crucial in symbolic interactionism although Mead himself never used this term. Instead he talked about the ‘importation’ of social symbols into a person’s mind. For the developing child the process of importation is marked by two stages, i.e., a play and a game stage. In the play stage, the child is an actor with his own needs and interpretation of the situation (which in Mead’s terms represents the self as subject or ‘I’) while in the game stage the child is an actor who is confronted with the needs and social expectations of other actors (which refers to Mead’s self as object or ‘Me’). In certain times and situations, however, balancing the I and Me aspects of the self may become quite conflictual, thus calling for appropriate means of intra- and interpersonal conflict resolution.
3.3.3 Conflict theories. Generally speaking, conflict within and between social groups is inevitable because the needs and interests of individuals and ingroups are more or less incompatible with the needs and interests of other individuals or groups. At the core of conflict theories is the phenomenon of power, or more precisely the imbalance of power, in particular when the powerful determine the lives and resources of the underprivileged, and at the same time ignore to a great extent their needs and interests. Conflict theories such as Marx’s and Engels’ dialectical theory of history try to understand the conditions leading to social inequality and injustice. However, Marxism not only provides an explanatory framework for the understanding of social inequality and conflict; it also has an emancipatory element inasmuch as it urges political action to dissolve inequality by endowing the underprivileged with more power and control to allow for a more self-determined conduct of life. Marx and Engels used their theory to explain the differences and conflict between social classes. In principle, however, Marxist theorizing can be applied to all social constellations where inequality and lack of justice cause conflict as, for instance, in feminist theories that deal with social gender inequalities. Unresolved conflict and the plea for emancipation call for appropriate means of conflict resolution which cover a wide range from physical aggression to negotiation between conflict parties. In order to ensure that basic human rights are not violated, socialization for negotiation and constructive conflict resolution is a vital prerequisite.
At a philosophical level, Habermas (1981), a prominent contemporary proponent of critical theory, has suggested developing the art of ‘communicative competence’ to handle conflicts in a so-called ‘ideal discourse situation.’ This means that all participants in discourse are granted equal opportunity to express their own point of view and to comment critically on the contributions of their fellow participants, keeping in mind that the ultimate goal is to find a solution that all participants can live with. At a more practical level, specific procedures have been worked out to enhance constructive communication and conflict resolution skills, particularly for the family, school, and work context.
4. Elements of an Integrative Framework for Studying Socialization and Education Based on the core anthropological assumptions and theoretical approaches described in the preceding sections, an integrative framework for the study of socialization and educational processes must at least cover the following five aspects.
4.1 Units of Environmental Influence Since the developing person is a part of different and progressively expanding environments, the units of analysis should comprise all settings that may exert direct or indirect influences on the individual. Bronfenbrenner’s (1979) ecological systems approach to human development provides an excellent framework for this purpose. By distinguishing between four systemically interrelated levels of developmental contexts, i.e., the micro-, meso-, exo-, and macrosystem, the scope of the framework is sufficiently broad to encompass all possible socializing influences that an individual might have to deal with. More specifically, Super and Harkness (1994) have introduced the concept of ‘developmental niche’ to describe the conditions that constitute the developmental context of a particular person, especially a particular child. According to these authors, there are three major components of the developmental niche, i.e., (a) the physical and social contexts of everyday life, (b) culturally regulated customs of child care and child rearing, and (c) particular psychological characteristics of a child’s parents. Together these three components constitute the proximal context which determines the child’s experiential field. On the whole, broader ecological systems as well as more specific aspects of the developmental niche form the opportunity structures of the developing person.
4.2 Characteristics of the Individuals to be Socialized Within systems of interacting persons the characteristics of the individuals to be socialized play an important role in shaping the processes and outcome of socialization. Particularly salient individual characteristics that moderate and mediate socializing influences are (a) basic personal characteristics (e.g., genotype, health, and maturational status of cognitive, motivational, and temperamental dispositions), (b) characteristic adaptational strategies (e.g., information processing, coping and defense mechanisms, and volitional and emotion regulation processes), and (c) typical forms of experiencing the self and the world (e.g., self-concepts, experience of personal identity and coherence, and internal representations of relationships, situations, and environments).
4.3 Proximal Processes To combine the characteristics of a person and his or her environment in a meaningful way, proximal processes, i.e., reciprocal interactions between the human organism and the persons, objects, and symbols in its environment (Bronfenbrenner 1995), are presumed to produce specific socialization outcomes. More specifically, Caspi (1997) has called attention to three kinds of person–environment transactions explaining developmental diversity and continuity. They are (a) reactive person–environment transactions which occur when different individuals express different forms of experience and behavior when exposed to the same situation; (b) evocative person–environment transactions, i.e., when an individual’s specific personality characteristics elicit particular responses from other persons; and (c) proactive person–environment transactions which refer to the selection and creation of environments by the individuals themselves.
4.4 Goals and Values In the behavioral and social sciences there is disagreement as to whether or not goals and values governing socialization and education should be incorporated as normative statements in corresponding theories, as is clearly the case in conflict theories of socialization. Although goals and values are undoubtedly objects of socialization theories and research, this controversy remains unresolved. However, even if the originators of socialization theories deliberately restrict themselves to descriptive and explanatory statements, such theories can nevertheless contribute to a great extent to the clarification of socialization goals, and to a better understanding of their antecedents and consequences. Thus, theories of socialization, and, particularly, corresponding research findings, provide valuable information which can be used to make informed choices with respect to one’s own beliefs.
4.5 Self-socialization Based on the anthropological assumptions presented above, socialization consists not only, as Mead (1934) contended, of the ‘importation’ of social symbols into an individual’s mind, but deals also with the capacity to select and create personally and socially meaningful goals, due to the individual’s progressively increasing action competence and freedom of choice. The latter process refers to the principle of self-socialization within the limits of personal and socio-cultural opportunity structures. In this context, human beings are creating themselves and the culture in which they live. Moreover, self-socialization enables them to work on the life-long project of designing their lives in a personally and communally rewarding way. Thus, theories of socialization and education are not only an outgrowth of humans’ cultural activity but serve also as a critical guide for making well-informed choices to optimize the process of self-socialization. See also: Behavioral Genetics: Psychological Perspectives; Cultural Diversity, Human Development, and Education; Educational Sociology; Learning and Instruction: Social-cognitive Perspectives; Learning Theories and Educational Paradigms; Social Inequality and Schooling; Socialization in Adolescence; Socialization, Sociology of
Bibliography
Bandura A 1999 Social cognitive theory of personality. In: Cervone D, Shoda Y (eds.) The Coherence of Personality. Guilford Press, New York
Belsky J, Steinberg L, Draper P 1991 Childhood experience, interpersonal development, and reproductive strategy: An evolutionary theory of socialization. Child Development 62: 647–70
Bronfenbrenner U 1979 The Ecology of Human Development. Harvard University Press, Cambridge, MA
Bronfenbrenner U 1995 Developmental ecology through space and time: A future perspective. In: Moen P, Elder G H Jr, Lüscher K (eds.) Examining Lives in Context. American Psychological Association, Washington, DC
Caspi A 1997 Personality development across the life course. In: Eisenberg N (ed.) Handbook of Child Psychology: Social, Emotional, and Personality Development, 5th edn. Wiley, New York, Vol. 3
Cassidy J, Shaver R P (eds.) 1999 Handbook of Attachment: Theory, Research, and Clinical Applications. Guilford Press, New York
Greenberg J R, Mitchell S A 1983 Object Relations in Psychoanalytic Theory. Harvard University Press, Cambridge, MA
Habermas J 1981 Theorie des kommunikativen Handelns. Suhrkamp, Frankfurt
Havighurst R J 1953 Human Development and Education. Longmans Green, New York
Mead G H 1934 Mind, Self and Society. University of Chicago Press, Chicago
Montada L 1998 Fragen, Konzepte, Perspektiven. In: Oerter R, Montada L (eds.) Entwicklungspsychologie. Psychologie Verlags Union, Weinheim
Parsons T, Bales R F 1955 Family Socialization and Interaction Process. Free Press, New York
Piaget J 1954 The Construction of Reality in the Child. Basic, New York
Plomin R 1994 Genetics and Experience: The Interplay Between Nature and Nurture. Sage, Thousand Oaks, CA
Rotter J B 1972 An introduction to social learning theory. In: Rotter J B, Chance J E, Phares E J (eds.) Application of a Social Learning Theory of Personality. Holt, Rinehart and Winston, New York
Schneewind K A, Pekrun R 1994 Theorien der Erziehungs- und Sozialisationspsychologie. In: Schneewind K A (ed.) Psychologie der Erziehung und Sozialisation. Hogrefe, Göttingen
Super C M, Harkness S 1994 The developmental niche. In: Lonner W J, Malpass R (eds.) Psychology and Culture. Allyn and Bacon, Boston
Triandis H C 1995 Individualism and Collectivism. Westview, Boulder, CO
Vygotsky L S 1978 Mind in Society: The Development of Higher Psychological Processes. Harvard University Press, Cambridge, MA
Whiting J W M, Child I L 1953 Child Training and Personality: A Cross-cultural Study. Yale University Press, New Haven, CT
K. A. Schneewind
Socialization in Adolescence Socialization is the process through which individuals’ behavior, attitudes, and values are shaped by the individuals and social institutions to which they are exposed. Because adolescence is a transitional time during which individuals are prepared for the roles they will occupy as adults, this period of development has received a great deal of attention from researchers interested in socialization. Among the issues most often researched are influences on academic achievement, problem behavior, and social competence. Although we often think of the family as the main agent of socialization in children’s development, young people are influenced by forces outside the family as well, including peers, schools, and broader cultural institutions, such as the mass media. The influence of nonfamilial agents of socialization is especially important in adolescence, as individuals begin to spend increasingly more time away from their parents and exposed to the potential influence of other socialization agents. Not surprisingly, much attention has been devoted in the study of socialization in adolescence to questions concerning the ability of parents to influence their children in the face of these other forces. Research on socialization in adolescence has focused mainly on three broad questions. First, to what extent are adolescents influenced by different agents of socialization such as the family or peer group? Second, which aspects of adolescent development (e.g., school
performance, drug use, prosocial behavior) are most influenced by these various socialization agents? Finally, what characteristics of individuals (e.g., age, gender, upbringing) make them more or less susceptible to different sources of influence?
1. A Brief History of Research on Adolescent Socialization Early studies of socialization in adolescence concentrated on the influence of parents. Building on the prior work of scholars who had studied socialization during early and middle childhood, students of adolescent socialization examined different forms of parental discipline and their impact on the adolescent’s character development. The dramatic increase in the relative size of the adolescent population during the 1960s, however, as well as growing concerns about the segregation of adolescents from adults created by the growth of secondary and postsecondary education, fueled scholarly interest in the study of adolescent peer groups and the potential influence of adolescents on each other. The publication of James Coleman’s book, The Adolescent Society (1961) popularized the notion that young people were growing up in a separate world in which they were exposed to attitudes and values not shared by adults. Warnings about the development of a ‘counterculture’ provided the impetus for much of the socialization research conducted during the 1960s and 1970s, on such topics as peer pressure and the generation gap. The spread of popular culture during the latter part of the twentieth century broadened the study of adolescent socialization further, as society began to question the impact of such forces as television, popular music, and film. The mass media were seen as responsible for a variety of social ills afflicting adolescents, such as sexual promiscuity, drug use, and violence. As was the case in earlier studies of the peer group, the mass media were cast as evil influences on adolescent behavior, attitudes, and values, and parents were portrayed as struggling to overcome this insidious force.
2. General Challenges in Socialization Research Although the goal of understanding how adolescents are influenced by the individuals and institutions to which they are exposed is straightforward, answering questions about the strength of different socialization agents has proven very difficult, for a number of reasons. One substantial challenge, particularly in the study of socialization in the family, involves separating genetic from environmental influences on the adolescent. Adolescents may behave in ways that reflect their parents’ interests and values, but similarity between children and their parents can be due both to the influence of parents on children and to the fact that parents and children share certain genes. Toward the end of the twentieth century, advances in the study of behavioral genetics challenged socialization researchers to demonstrate that what they had described as parental influence was actually an effect of socialization, and not merely a reflection of shared genetic background (Plomin 1990). Although today this issue remains far from resolved, it is clear that genetic factors play a far more important role in adolescent personality and social development than had been recognized, and that much of what had been viewed as parental influence is attributable to heredity rather than socialization (Reiss et al. 2000). A different but no less formidable challenge faces researchers interested in the influence of peers, mass media, and other nonfamilial socialization agents (Harris 1995). Because adolescents choose their friends and select the media they are exposed to, it is difficult to know whether the apparent impact of these agents on adolescent development is actually due to their influence, rather than merely indicative of the choices that adolescents make about how they spend their time. Thus, for example, similarity between adolescents and their friends could be due to the impact that friends have on each other, but it may also be due to the fact that adolescents choose as their friends individuals who share their interests. Separating out selection from socialization can only be accomplished through experimental research or longitudinal studies that track individuals over time.
3. Adolescent Socialization in the Family More is known about adolescent socialization in the family than about socialization in other contexts, although far more is known about the influence of parents than about other family members, such as siblings. Research on parental socialization has traditionally examined the links between variations in adolescent functioning and variations in parenting practices (the specific things that parents do to influence their child in one way or another, such as close monitoring of the child’s whereabouts) or parenting styles (the general affective climate of the parent–child relationship). Over the past 25 years, several broad conclusions about socialization in the family context have emerged (Steinberg 1990). First, most research indicates that adolescents fare better developmentally when their parents combine firm and consistent discipline with affection and involvement in the adolescent’s life. This combination of ‘demandingness’ and ‘responsiveness’ is referred to as authoritative parenting. In many studies, adolescents raised by authoritative parents have been compared with their peers who come from homes that are authoritarian (demanding but relatively less responsive), indulgent (responsive but relatively less demanding), or neglectful (neither demanding nor responsive). Generally speaking, adolescents raised in authoritative homes score higher on measures of psychological adjustment and academic achievement and lower on measures of psychological and behavior problems than adolescents from other households. Adolescents from neglectful homes, in contrast, score lowest on measures of adjustment and highest on measures of maladjustment. Adolescents from authoritarian or indulgent homes generally score somewhere between the two extremes. Second, research indicates that adolescent development is more closely linked to the nature of their family relationships than to the composition of their household. Studies of adolescents from single-parent homes or stepfamilies, for example, find only small differences between adolescents from these sorts of families and those who live with their two biological parents. Similarly small differences emerge in studies that compare adolescents whose mothers work outside the home with those whose mothers do not. For the most part, the benefits of authoritative parenting transcend variations in household composition. Most, but not all, research indicates that the benefits of authoritative parenting also cut across ethnic and socioeconomic groups. Finally, research on socialization in the family indicates that the broader context within which the family lives affects the ways in which parents behave and, as a consequence, the impact that parents have on their adolescent’s development. Authoritative parenting is more likely to occur in families that have greater financial and social resources and that live in neighborhoods populated with other well-functioning families. In contrast, studies of families living under stressful conditions, such as poverty, neighborhood deterioration, or high rates of crime, indicate that such conditions tend to foster parenting that is hostile, inconsistent, or uninvolved.
As a result of these variations in the broader context of socialization, adolescents growing up under different circumstances often exhibit different levels of psychological adjustment. A different line of research has examined changes in family relationships over the course of adolescence. This work has pointed to the importance of puberty as a period of significant transformation in family dynamics, although it is not clear whether these changes in relationships are triggered by the physical or hormonal changes of puberty per se or by other, co-occurring events (e.g., the transition out of elementary school, the development of abstract thinking). Whatever the causes, during puberty, adolescents often become more assertive toward their parents, and parent–child conflict, especially over issues concerning the adolescent’s independence, tends to increase. Most families report that this phase is temporary, however, with relationships becoming closer and less conflicted toward the middle adolescent years (Steinberg 1990).
4. Adolescent Socialization in the Peer Group Researchers have studied the socialization influence of peers at multiple levels of analysis, including close friends and their impact on one another, cliques and social networks, and crowds, the large amorphous groups that define the social terrain of most high schools. Scholars of adolescent peer relations maintain that it is important to distinguish among friends, cliques, and crowds, because these different levels of peer influence operate in different ways (Brown 1990). It is also important to distinguish among different arenas of socialization; peers are more influential over day-to-day activities than over enduring values, for example. As a general rule, adolescents tend to be similar to the peers with whom they affiliate, a phenomenon known as ‘homophily.’ Although much of this similarity is due merely to the tendency for like-minded individuals to select each other as friends, there have been several longitudinal studies that indicate that friends do in fact socialize each other. The most widely documented effects have been in the areas of drug and alcohol use, delinquency, and academic achievement. Less is known about the influence of cliques (groups of between six and 10 close friends) on adolescent development, but a number of studies have examined changes in individuals’ susceptibility to peer pressure over the course of adolescence. This research shows that children’s susceptibility to peer influence increases steadily between the ages of 9 and 14, then gradually declines as adolescents move into the later years of high school. Boys, on average, are more susceptible to peer pressure than girls, especially when the pressure is to engage in antisocial behavior. It is important to keep in mind, however, that most peer pressure is not antisocial, and that adolescents’ cliques often pressure their members to avoid drugs and alcohol and perform well in school. 
Studies of adolescents in Western industrialized cultures have also examined the role of crowds in adolescent socialization. Crowds are large, reputation-based groups that are used to locate adolescents within the school’s social world; most schools have been found to have different crowds defined by athletics, academic commitment, antisocial activities, or social adeptness. By setting norms and establishing standards for such things as attire, tastes in popular culture, and leisure pursuits, crowds socialize their members to adopt some behaviors and attitudes and to eschew others. Contemporary studies of socialization in the peer group have, like studies of the family, adopted a more ecological perspective on adolescent development. Instead of asking whether adolescents are more influenced by peers than by parents, for example, recent research has examined how parental and peer influence interact. Although there are exceptions to the rule, most studies in this vein indicate that parental
and peer influence are synergistic, not antagonistic. More often than not, adolescents select friends whose values and attitudes reinforce, rather than undermine, the values and attitudes of the family. In addition, parents may influence the process of peer socialization by directing their children toward some peer groups and away from others.
5. Adolescent Socialization and Other Institutions Although most work on adolescent socialization has focused on the family and the peer group, some research has examined the influence of social institutions, such as school, the workplace, the neighborhood, and the mass media. One problem in drawing conclusions from this body of work is that there is considerable variability among schools, work settings, neighborhoods, and the media to which adolescents are exposed. Thus, it is difficult to draw conclusions about the impact of these institutions on adolescent socialization without examining the specific features of the institution to which an individual adolescent is exposed. Unfortunately, research on adolescent socialization in these contexts has lagged behind that on the family and peer group in both quantity and sophistication. The study of schools mainly has focused, as one would expect, on characteristics of schools or teachers that foster classroom engagement, achievement motivation, and scholastic success, and special attention has been paid to the need to make schools more sensitive and responsive to the particular psychological needs of adolescents. Much of this research indicates that effective schools and teachers share much in common with effective parents. Specifically, students are more strongly oriented to school and interested in achievement when they attend schools that are orderly and consistent in their goals, and staffed by teachers and administrators who are firm but supportive. These characteristics are reminiscent of the attributes associated with authoritative parenting described earlier (Entwisle 1990). Research on school programs designed to socialize certain attitudes or behaviors among adolescents (e.g., programs designed to prevent drug use, aggression, or unsafe sexual activity) has not been especially encouraging, however. 
Most evaluations of such deliberate educational efforts indicate that they are modestly successful in transmitting information but largely unsuccessful in changing students’ behavior. Most experts agree that school-based efforts of this sort need to be accompanied by programs aimed at parents, peers, and other community institutions, and that classroom programs in isolation from other sorts of interventions have little lasting influence over adolescents’ socialization. Several investigators have examined the role of the workplace in the socialization of adolescents, especially among adolescents who hold part-time jobs during the school year (Steinberg and Cauffman 1995). Because rates of student employment are so much higher in the United States than in other countries, almost all of this research has involved American samples. We do not know if these findings are generalizable to adolescents in other countries, because the nature of work and its relation to schooling varies considerably from country to country. The results of research on student employment in the United States have indicated that the impact of work on adolescent socialization is to a great extent dependent on the amount of time adolescents spend on the job. In moderate amounts (10 or fewer hours per week), a part-time job can be a useful means of exposing adolescents to the world of work and helping them learn to manage their time more responsibly. When adolescents work longer hours, however, the costs of employment may overshadow the benefits. More specifically, studies indicate that employment during the school year in excess of 20 hours per week is associated with diminished performance in and commitment to school, and increased drug and alcohol use, and some research links long work hours to increased delinquency. These negative correlates of working are thought to accrue because long work hours interfere with school and diminish parents’ ability to monitor their teenagers, both of which have been shown to be associated with poor school performance and increased problem behavior. In addition, adolescents who work long hours have large amounts of discretionary income, which may pull them away from school and toward more recreational pursuits. Research on the impact of neighborhoods on adolescent socialization is relatively new.
This work has examined how adolescents’ attitudes and behavior are shaped by variations in the communities in which they live along such dimensions as the neighborhood’s income, employment rate, crime, and demography (Brooks-Gunn et al. 1997). Most research to date has found relatively few direct effects of neighborhood conditions on adolescent socialization, however. Rather, studies suggest that variations in neighborhoods affect adolescent development primarily through their impact on the family and peer group. For example, adolescents living in disorganized neighborhoods are themselves more likely to be involved in antisocial activity because neighborhood disorganization is associated with ineffective parenting and with insufficient monitoring of peer group activities. This sort of finding is consistent with the general view that socialization is mediated primarily through interactions with significant others. Despite widespread popular belief in the deleterious effects of sex and violence in the mass media on the socialization of adolescents, scientific evidence on the impact of exposure to violent or sexual imagery (in television, film, music, or computer games) is sparse and inconsistent. There is some research indicating that exposure to televised violence may increase aggression among younger children, but this issue has not been adequately studied among adolescents. The study of the influence of other media (e.g., computer games, music) or of other types of media content (e.g., sexual imagery) on adolescent socialization is virtually nonexistent. See also: Adolescence, Sociology of; Adolescent Behavior: Demographic; Adolescent Development, Theories of; Identity in Childhood and Adolescence; Social Learning, Cognition, and Personality Development; Socialization and Education: Theoretical Perspectives; Socialization in Infancy and Childhood; Socialization, Sociology of; Substance Abuse in Adolescents, Prevention of
Bibliography
Brooks-Gunn J, Duncan G J, Aber J L (eds.) 1997 Neighborhood Poverty. Russell Sage Foundation, New York
Brown B 1990 Peer groups. In: Feldman S S, Elliott G R (eds.) At the Threshold: The Developing Adolescent. Harvard University Press, Cambridge, MA, pp. 171–96
Coleman J S 1961 The Adolescent Society: the Social Life of the Teenager and its Impact on Education. Free Press of Glencoe, New York
Entwisle D 1990 Schools and the adolescent. In: Feldman S S, Elliott G R (eds.) At the Threshold: The Developing Adolescent. Harvard University Press, Cambridge, MA, pp. 197–224
Harris J R 1995 Where is the child’s environment? A group socialization theory of development. Psychological Review 102: 458–89
Plomin R 1990 Nature and Nurture: An Introduction to Human Behavioral Genetics. Brooks Cole, Pacific Grove, CA
Reiss D et al. 2000 The Relationship Code: Deciphering Genetic and Social Patterns in Adolescent Development. Harvard University Press, Cambridge, MA
Steinberg L 1990 Autonomy, conflict, and harmony in the family relationship. In: Feldman S S, Elliott G R (eds.) At the Threshold: The Developing Adolescent. Harvard University Press, Cambridge, MA, pp. 255–76
Steinberg L, Cauffman E 1995 The impact of employment on adolescent development. In: Vasta R (ed.) Annals of Child Development. Jessica Kingsley Publishers, London, Vol. 11, pp. 131–66
L. Steinberg
Socialization in Infancy and Childhood
Socialization is the process whereby an individual’s standards, skills, motives, attitudes, and behaviors change to conform to those regarded as desirable and appropriate for his or her present and future role in any particular society. Many agents play a role in the socialization process, including family, peers, schools, and the media. It is assumed that these various agents function together rather than independently. Families have been recognized as an early, pervasive, and highly influential context for socialization. Children are dependent on families for nurturance and support from an early age, which accounts, in part, for the prominence of families as a socialization agent.
1. Contemporary Perspectives on Family Socialization Theory
Several themes are evident in current theoretical approaches to socialization. First, the rise of systems theory (Sameroff 1994) has transformed the study of socialization from a parent–child focus to an emphasis on the family as a social system. To understand fully the nature of family relationships, it is necessary to recognize the interdependence among the roles and functions of all family members. It is increasingly recognized that families are best viewed as social systems. Consequently, to understand the behavior of one member of a family, the complementary behaviors of other members also need to be recognized and assessed. For example, as men’s roles in families shift, changes in women’s roles in families must also be monitored. A closely related theme involves the recognition of the importance of the historical time period in which the family interaction is taking place (Elder 1998). Historical time periods provide the social conditions for individual and family transitions: examples include the 1930s (the Great Depression) and the 1980s (the Farm Belt Depression). Across these historical time periods, family interaction may, in fact, be quite different due to the unique conditions of the particular era. In order to understand the nature of parent–child relationships within families, a multilevel and dynamic approach is required. Multiple levels of analysis are necessary in order to capture the individual, dyadic, and family unit aspects of operation within the family itself as well as to reflect the embeddedness of families within a variety of extrafamilial social systems. The dynamic quality reflects the multiple developmental trajectories that warrant consideration in understanding the nature of families in infancy.
Distinctions among different developmental trajectories, as well as social change and historical period effects, are important because these different forms of change do not always harmonize. For example, a family event such as the birth of a child may have more effects on a man who has just begun a career than on one who has advanced to a stable occupational position. Moreover, individual and family developmental trajectories are embedded within both the social conditions and the values of the historical time in which they exist. The role of parents as socialization agents is responsive to such fluctuations.
Consistent with a family systems viewpoint, recent research has focused on a variety of subsystems, including parent–child, marital, and sib–sib systems. We employ a tripartite model to understand the parent–child subsystem and the relation between this parent–child subsystem and children’s social adaptation. The first and most common pathway is the parent–child relationship. Parents may influence their children through a second pathway, namely, in their role as direct instructor, educator, or consultant. In this role parents may explicitly set out to educate their children concerning appropriate norms, rules, and mores of the culture. In a third role, parents function as managers of their children’s social lives and serve as regulators of opportunities for social contacts and cognitive experiences (Parke et al. 1994). In this article, we will focus on each of these pathways, as well as on the marital and sibling subsystems as contexts for socialization. Finally, we will examine the determinants of parental socialization strategies.
2. Quantitative and Qualitative Assessments of Mother and Father Involvement in Intact Families
In spite of current shifts in cultural attitudes concerning the appropriateness and desirability of shared roles and equal levels of participation in routine caregiving and interaction for mothers and fathers, the shifts are more apparent than real in the majority of intact families. Fathers spend less time with their infants, children, and adolescents than mothers do, not only in the USA but also in other countries such as Great Britain, Australia, France, and Belgium (see Lamb 1987). Fathers participate less than mothers in caregiving but spend a greater percentage of the time available for interaction in play activities than mothers do. The quality of play across mothers and fathers differs too. With infants and toddlers, fathers play more physically arousing games than mothers. In contrast, mothers play more conventional motor games or toy-mediated activities, and are more verbal and didactic (Parke 1996). Fathers in several other cultures, such as Sweden, India, and Central Africa, however, do not show this physical play style. As children develop, fathers become more involved in physical/outdoor play interactions and fixing things around the house and garden, whereas mothers are more actively involved in caregiving and household tasks, in school work, reading, playing with toys, and helping with arts and crafts. In adolescence, the quality of maternal and paternal involvement continues to differ. Just as in earlier developmental periods, mothers and fathers may complement each other and provide models that reflect the tasks of adolescence: connectedness and separateness. Recent evidence suggests that fathers may help adolescents develop their own sense of identity and autonomy by being more ‘peer-like’ and more playful (joking and teasing), which is likely to promote more equal and egalitarian exchanges (Larson and Richards 1994). Why do mothers and fathers play differently? Both biological and environmental factors probably play a role. Experience with infants, the amount of time spent with infants, and the usual kinds of responsibilities that a parent assumes all influence the parents’ style of play. The fact that fathers spend less time with infants and children than mothers may contribute as well. Fathers may use their distinctive arousing style as a way to increase their salience in spite of more limited time. Biological factors cannot be ignored in light of the fact that male monkeys show the same rough and tumble physical style of play as American human fathers. Predisposing biological differences between males and females may play a role in the play patterns of mothers and fathers. At the same time, the cross-cultural data underscore the ways in which cultural and environmental contexts shape play patterns of mothers and fathers and remind us of the high degree of plasticity of human social behaviors.
3. Assessing Parent–Child Interaction: Two Approaches
Two approaches to the impact of parent–child interaction on children’s socialization outcomes have been used: (a) a typological approach, which focuses on styles of child-rearing practices, and (b) a social interaction approach, which focuses on the nature of the interchanges between parent and child.
3.1 The Typological Approach
The most influential typology has been offered by Baumrind (1973), who distinguished among three parental child-rearing styles: authoritative, authoritarian, and permissive. Authoritative parents were not intrusive and did permit their children considerable freedom within reasonable limits, but were firm and willing to impose restrictions in areas in which they had greater knowledge or insight. In general, high warmth and moderate restrictiveness are associated with the development of self-esteem, adaptability, and social competence. In contrast, the authoritarian parents were rigid, power-assertive, harsh, and unresponsive to the children’s needs. This results in the unhappy, conflicted, neurotic behavior often found in these children. Finally, in spite of the permissive parents’ reasonably affectionate relationship with their children, their excessively lax and inconsistent discipline and encouragement of the free expression of their children’s impulses were associated with the development of uncontrolled, impulsive behavior in their children.
Baumrind has followed these types of parents and their children from the preschool period through adolescence (Baumrind 1991). She found that authoritative parenting continued to be associated with positive outcomes for adolescents as with younger children and that responsive, firm parent–child relationships were especially important in the development of competence in sons. Moreover, authoritarian childrearing had more negative long-term outcomes for boys than for girls. Maccoby and Martin (1983) extended the Baumrind typology and included a fourth type of parenting style, which is characterized by neglect and lack of involvement. These are disengaged parents who are ‘motivated to do whatever is necessary to minimize the costs in time and effort of interaction with the child’ (Maccoby and Martin 1983). In infants such a lack of parental involvement is associated with disruptions in attachment; in older children it is associated with impulsivity, aggression, noncompliance, and low self-esteem. The approach has been challenged on several fronts. First, questions remain concerning the processes that contribute to the relative effectiveness of these different styles. Second, it is unclear whether parenting styles are, in part, a response to the child’s behavior. Placed in a transactional framework, the typology work would imply that children with certain temperaments and/or behavioral characteristics would determine the nature of the parental style. A third concern is the universality of the typological scheme. Recent studies have raised serious questions about the generalizability of these styles across either socioeconomic status (SES) or ethnic/cultural groups. In lower SES families, parents are more likely to use an authoritarian as opposed to an authoritative style, but this style is often an adaptation to ecological conditions, such as increased danger and threat, that may characterize the lives of poor families (Furstenberg 1993).
A second challenge to the presumed universal advantage of authoritative child-rearing styles comes from cross-ethnic studies. In Chinese families, authoritarian styles of child-rearing are more common and some (Chao 1994) have argued that the application of these stylistic categories to Chinese parents may be ‘ethnocentric and misleading’ (Chao 1994, p. 1111) since these child-rearing types represent an American perspective which emphasizes an individualistic view of childhood socialization and development. Contextual and cultural considerations need to be given more attention in typological approaches to child-rearing.
3.2 The Parent–Child Interactional Approach to Socialization
Research in this tradition is based on the assumption that face-to-face interaction with parents may provide the opportunity to learn, rehearse, and refine social skills that are common to successful social interaction with other social partners. The style of the interaction between parent and child is linked to a variety of social outcomes including aggression, achievement, and moral development. Parents who are responsive, warm, and engaging are more likely to have children who are more socially competent. In contrast, parents who are hostile and controlling have children who experience more difficulty with age-mates. Moreover, these findings are evident in the preschool period, middle childhood, and adolescence (Parke and Buriel 1998). Although there is an overlap between mothers and fathers, fathers make a unique and independent contribution to their children’s social development. Although father involvement is quantitatively less than mother involvement, fathers have an important impact on their offspring’s development. Quality rather than quantity of parent–child interaction is the important predictor of cognitive and social development.
4. Parental Instruction, Advice Giving, and Consultation
Learning about relationships through interaction with parents can be viewed as an indirect pathway since the goal is often not explicitly to influence children’s social relationships with extrafamilial partners such as peers. In contrast, parents may influence children’s relationships directly in their role as a direct instructor, educator, or advisor. In this role, parents may explicitly set out to educate their children concerning appropriate ways of initiating and maintaining social relationships and learning social and moral rules. The quality of advice that mothers provided their children prior to entry into an ongoing play dyad varied as a function of children’s sociometric status (Russell and Finnie 1990). Mothers of well-accepted children were more specific and helpful in the quality of advice that they provided. In contrast, mothers of poorly accepted children provided relatively ineffective kinds of verbal guidance, such as ‘have fun’ or ‘stay out of trouble.’ The advice was too general to be of value to the children in their subsequent interactions. As children develop, the forms of management shift from direct involvement or supervision of the ongoing activities of children and their peers to a less public form of management, involving advice or consultation concerning appropriate ways of handling social problems.
5. Parents as Managers of Children’s Opportunities
Parents influence their children’s social relationships not only through their direct interactions with their children but also by functioning as managers of their children’s social lives and serving as regulators of opportunities for social contact with extrafamilial social partners. This parental role is of theoretical importance in light of the recent claims that parents’ impact on children’s development is limited and that peer-group-level processes account for major socialization outcomes. In contrast to this view, we conceptualize the parental management of access to peers as a further pathway through which parents influence their children’s development (Parke et al. 1994). From infancy through middle childhood, mothers are more likely to assume the managerial role than fathers. In infancy, this means setting boundaries for play, taking the child to the doctor, or arranging daycare. In middle childhood, mothers continue to assume more managerial responsibility (e.g., directing the child to have a bath, to eat a meal, or to put away toys).
6. Beyond the Parent–Child Dyad: The Marital Subsystem as a Contributor to Children’s Socialization
Children’s experiences in families extend beyond their interactions with parents. Evidence suggests that children’s understanding of relationships is also shaped through their active participation in other family subsystems (e.g., child–sibling) as well as through exposure to the interactions of other dyadic subsystems (e.g., parent–parent).
6.1 Influence of Marital Discord on Child Outcomes
Considerable evidence indicates that marital functioning is related to children’s short-term coping and long-term adjustment. Children exposed to marital discord are likely to have poorer-quality interpersonal relationships, internalizing and externalizing behavior problems, and changes in cognitions, emotions, and physiology (see Fincham 1998). Two alternative, but not mutually exclusive, models have been proposed to account for the impact of marital relations on children’s developmental outcomes. One theoretical framework conceptualizes marital discord as an indirect influence on children’s adjustment that operates through its effect on family functioning and the quality of parenting. A second model focuses on the direct effects of witnessed marital conflict on children’s outcomes rather than on the indirect effects. Both of these models have received empirical support (see Marital Interaction: Effects on Child Development).
7. The Sibling System as a Contributor to Children’s Socialization
Siblings play a critical role in the socialization of children. Most children are likely to spend more time in direct interaction with siblings than with parents, and this array of interactions between siblings has been found to be typified by greater emotional intensity than the behavioral exchanges that characterize other relationships. Sibling relationships contribute to children’s socialization in a number of significant ways. Through their interactions with siblings, children develop specific interaction patterns and social understanding skills that generalize to relationships with other children. Relationships with siblings may also provide a context in which children can practice the skills and interaction styles that have been learned from parents or others. Older siblings function as tutors, managers, or supervisors for their younger siblings. Also, paralleling the indirect influence that the observation of parent–parent interaction has on children, a second avenue of influence on children’s development is their observation of parents interacting with siblings. These interactions may serve as an important context in which children deal with issues of differential treatment and learn about complex social emotions such as rivalry and jealousy (Dunn 1993).
8. Determinants of Family Socialization Strategies
One of the major advances in the field has been recognition of the importance of understanding the determinants of parenting behavior. Belsky (1984) proposed a three-domain model of the determinants of parenting, which included personal resources of the parents, characteristics of the child, and contextual sources of stress and support.
8.1 Child Characteristics
Child characteristics take two forms: universal predispositions that are shared by all children and individual differences in particular characteristics. Infants are biologically prepared for social, cognitive, and perceptual challenges and these prepared responses play a significant role in facilitating children’s adaptation to their environment. Under the influence of recent advances in behavior genetics, there is increasing recognition of the role of individual differences in temperament on parenting behavior (Rothbart and Bates 1998). Although debates about the relative contributions of genetic and experiential factors to the emergence of individual differences in temperament continue, temperament clearly is a determinant of parental socialization tactics. For example, infants with difficult temperaments elicit more arousal and distress from caregivers than less difficult infants. The impact of these individual differences on parental socialization behavior is not independent of environmental conditions. Crockenberg (1981) showed that the impact of a difficult infant temperament on the parent–infant attachment relationship varied as a function of the degree of social support available to the mother, which underscores the potential modifiability of temperament-based influences.
8.2 Personal Resources
A variety of studies support the prediction that personal resources (conceptualized as knowledge, ability, and motivation to be a responsible caregiver) alter parenting behaviors (Belsky 1984). Particularly striking are recent studies of how parental psychopathology such as depression alters parenting behavior. When interacting with their infants, depressed mothers show flat affect and provide less contingent stimulation than nondepressed mothers; in turn, their infants show less attentiveness, more fussiness, and lower activity levels. Differences are particularly evident when depression is protracted and not merely transient (Campbell et al. 1995).
8.3 Social Support and Family Socialization Strategies
Recognition of the role of the community and community agents as modifiers of family interaction is necessary for an adequate theory of socialization. Recognition of the embeddedness of families in a set of broader social systems such as community and culture is only a first step. The next task is to articulate the ways in which these other levels of social organization affect family functioning and explore the ways in which these influence processes take place. Community influence can be either positive or negative, since a high degree of connectedness with community resources is not necessarily positive. In addition, the relationship between communities and families is bidirectional and varies across development. Moreover, the influence of support systems on families may be either direct or indirect in its effects. Finally, both availability and utilization need to be separately considered. Families may have friends, relatives, and neighbors available, but fail to utilize these members of their informal social network in times of stress or crisis or on a day-to-day basis.
8.4 Socioeconomic Status as a Determinant of Family Socialization Strategies
There is a long history of research concerning the links between socioeconomic status and/or social class and parenting beliefs and practices. In contrast to traditional assumptions that SES is a static state, SES is a dynamic concept. Over the course of childhood and adolescence, families change social class, and change is greatest in the youngest ages. Over 50 percent of American children change social class prior to entering school. In spite of the controversies surrounding this variable, there are SES differences in parental socialization practices and beliefs. First, lower SES parents are more authoritarian and more punitive and controlling than higher SES parents. Second, there are more SES differences on language measures than on nonverbal measures, with higher SES mothers being more verbal than low SES mothers. Some SES differences are independent of race and poverty. In China, where there are relatively small differences in income across groups who vary in terms of education, less educated parents used more imperatives with their toddlers than better educated mothers (Tardif 1993). Similarly, studies of cognitive socialization found clear SES differences in African-American lower class and middle class families (Parke and Buriel 1998).
8.5 Ethnicity as a Determinant of Family Socialization Strategies
Recent cross-cultural and intracultural theories have emphasized the importance of socialization goals, values, and beliefs as organizing principles for understanding cultural variations (Harkness and Super 1995). In contrast to the older cultural deficit models of socialization, the more recent models emphasize how ecological demands shape values and goals. In the past, cultural deficit models were popular explanations for the socialization and child outcome differences observed between ethnic minorities and Euro-Americans. The focus on ethnic minority families has shifted away from majority–minority differences in developmental outcomes and more toward an understanding of the adaptive strategies ethnic minorities develop in response to both majority and minority cultural influences on their development. The parents’ individual history of interaction with the larger sociocultural context, including their awareness of their ethnic group’s history within the larger society, affects the manner in which they socialize their children. An important dimension of socialization in ethnic minority families is teaching children how to interact effectively in dual cultural contexts: the context of their ethnic group and the context of the larger Euro-American society. Harrison et al. (1990) have adopted an ecological orientation to explain the diverse environmental influences that contribute to the socialization of ethnic minority children. They conceptualize the socialization of ethnic minority children in terms of the interconnectedness between the status of ethnic minority families, adaptive strategies, socialization goals, and child outcomes. Emerging out of the adaptive strategies of adults are the socialization goals that they endeavor to inculcate in children to help
them meet the ecological challenges they will face as ethnic minorities in a class- and race-conscious society. Ethnic pride and interdependence are two important socialization goals that enable ethnic minority children to function competently as members of both their minority culture and the larger society (see Parke and Buriel 1998). Research needs to take into account the acculturation level of parents and children in recent immigrant families and the effects it has on family processes and child outcomes. Intergenerational differences in acculturation can create role strains between parents and children that have implications for childrearing styles, disciplinary practices, and overall parent–child relations. Together with acculturation, recognition of biculturalism as both an adaptation strategy and socialization goal is important. The effects of prejudice and discrimination on ethnic minorities, in such areas as social and emotional development, ethnic identity, and achievement motivation, deserve more attention. Language development research should also give greater attention to second language acquisition (usually English) and bilingualism and its relation to cognitive development and school achievement. More attention must also be given to the role of ethnic minority fathers, grandparents, and extended family members in the socialization of children (see also Parenting in Ethnic Minority Families: United States).
9. The Impact of Social Change on Family Socialization
Families are not static but dynamic and are continuously confronted by challenges, changes, and opportunities. A number of society-wide changes have produced a variety of shifts in the nature of family relationships. Fertility rates and family size have decreased, the percentage of women in the workforce has increased, the timing of onset of parenthood has shifted, divorce rates have risen, and the number of single-parent families has increased (Hernandez 1993). These social trends provide an opportunity to explore how families adapt and change in response to these shifting circumstances and represent ‘natural experiments’ in family coping and adaptation. Moreover, they challenge our traditional assumptions that families can be studied at a single point in historical time since the historical contexts are constantly shifting. Our task is to establish how socialization processes operate similarly or differently under varying historical circumstances. In both the United States and other parts of the world a variety of changes, including the timing of parenthood, increases in women’s employment, and increases in rates of divorce and remarriage, have taken place. These social changes have a major impact on children’s socialization (Parke and Buriel 1998). To date, societal changes such as shifts in the timing of parenting, work participation, or divorce have been treated relatively independently, but, in fact, these events co-occur rather than operate in any singular fashion. Multivariate designs which capture the simultaneous impact of multiple events on family socialization strategies are necessary.
10. Conclusion
Socialization is a multiply determined process; while our focus has been on families as central socialization agents, it is critical to recognize the roles of other social agents such as schools, peers, and the mass media in this process. A fuller understanding of socialization will come from more attention to the interplay among these diverse agents and to how cultural and ethnic backgrounds influence socialization practices and outcomes.
See also: Attachment Theory: Psychological; Child Care and Child Development; Divorce and Children’s Social Development; Identity in Childhood and Adolescence; Infant and Child Development, Theories of; Marital Interaction: Effects on Child Development; Parenting: Attitudes and Beliefs; Parenting in Ethnic Minority Families: United States; Self-development in Childhood; Sibling Relationships, Psychology of; Social Learning, Cognition, and Personality Development; Socialization in Adolescence; Socialization, Sociology of
Bibliography
Baumrind D 1973 The development of instrumental competence through socialization. In: Pick A D (ed.) Minnesota Symposium on Child Psychology. University of Minnesota Press, Minneapolis, MN, Vol. 7, pp. 3–46
Baumrind D 1991 Parenting styles and adolescent development. In: Brooks-Gunn J, Lerner R M, Peterson A C (eds.) The Encyclopedia of Adolescence. Garland, New York, pp. 746–58
Belsky J 1984 The determinants of parenting: A process model. Child Development 55: 83–96
Campbell S B, Cohn J F, Meyers T 1995 Depression in first time mothers: Mother-infant interaction and depression chronicity. Developmental Psychology 31: 349–57
Chao R K 1994 Beyond parental control and authoritarian parenting style: Understanding Chinese parenting through the cultural notion of training. Child Development 65: 1111–19
Crockenberg S B 1981 Infant irritability, mother responsiveness, and social support influences on the security of infant–mother attachment. Child Development 52: 857–65
Dunn J 1993 Young Children’s Close Relationships: Beyond Attachment. Sage, Newbury Park, CA
Elder G H 1998 The life-course and human development. In: Damon W (ed.) Handbook of Child Psychology. Wiley, New York, Vol. I
Fincham F 1998 Child development and marital relations. Child Development 69: 543–74
Furstenberg F F 1993 How families manage risk and opportunity in dangerous neighborhoods. In: Wilson W J (ed.) Sociology and the Public Agenda. Sage, Newbury Park, CA, pp. 231–58
Furstenberg F F, Harris K M 1993 When and why fathers matter: Impacts of father involvement on children of adolescent mothers. In: Lerman R I, Ooms T (eds.) Young Unwed Fathers: Changing Roles and Emerging Policies. Temple University Press, Philadelphia, PA, pp. 117–38
Harkness S, Super C M 1995 Culture and parenting. In: Bornstein M (ed.) Handbook of Parenting. Erlbaum, Mahwah, NJ, Vol. 2
Harrison A O, Wilson M N, Pine C J, Chan S Q, Buriel R 1990 Family ecologies of ethnic minority children. Child Development 61: 347–62
Hernandez D J (ed.) 1993 America’s Children: Resources from Family, Government, and the Economy. Russell Sage Foundation, New York
Lamb M E (ed.) 1987 The Father’s Role: Cross-Cultural Perspectives. Erlbaum, Hillsdale, NJ
Larson R, Richards M H 1994 Divergent Realities: The Emotional Lives of Mothers, Fathers and Adolescents. Basic Books, New York
Maccoby E E, Martin J A 1983 Socialization in the context of the family: Parent–child interaction. In: Hetherington E M (ed.) Handbook of Child Psychology. Wiley, New York, Vol. 4
Parke R D 1996 Fatherhood. Harvard University Press, Cambridge, MA
Parke R D, Buriel R 1998 Socialization in the family: Ecological and ethnic perspectives. In: Damon W (ed.) Handbook of Child Psychology. Wiley, New York, Vol. 3, pp. 463–552
Parke R D, Burks V, Carson J, Neville B, Boyum L 1994 Family–peer relationships: A tripartite model. In: Parke R D, Kellam S G (eds.) Advances in Family Research, Vol. 4: Exploring Family Relationships with Other Social Contexts. Erlbaum, Hillsdale, NJ, pp. 115–45
Rothbart M K, Bates J E 1998 Temperament. In: Damon W (ed.) Handbook of Child Psychology. Wiley, New York, Vol. 3, pp. 105–76
Russell A, Finnie V 1990 Preschool children’s social status and maternal instructions to assist group entry. Developmental Psychology 26(4): 603–11
Sameroff A J 1994 Developmental systems and family functioning. In: Parke R D, Kellam S G (eds.) Exploring Family Relationships with Other Social Contexts. Erlbaum, Hillsdale, NJ, pp. 199–214
Tardif T 1993 Adult-To-Child Speech and Language Acquisition in Mandarin Chinese. Yale University Press, New Haven, CT
R. D. Parke and R. Buriel
Socialization: Political
In its broadest sense political socialization refers to the content, processes, and dynamics of political learning across time and space. Related, more restricted terms include civic education and citizenship training. Because of the importance commonly attached to the consequences of early socialization, scholarship in this area has focused on the pre-adult years though not to the exclusion of later periods as well. Political socialization figures heavily in discussions of political culture, the relation of the individual to the political system, and political system stability and change. This article covers the major foci of the field, methodological approaches, and the field’s evolution and development since the 1950s.
1. Major Topics and Questions
Some scholars, political scientists in particular, are concerned about the implications of political socialization for how political institutions function. Other scholars focus more on what socialization means for individuals and how they cope with their political environment. Scholars have tended to borrow from other disciplines rather than developing original perspectives. Applications of social learning and cognitive growth theories predominate. Learning about politics occurs as a result of intentional and direct efforts as well as unintentional and indirect efforts. The range extends from broad topics such as national identity and consensus political norms to narrower ones such as partisanship and issue preferences. The broader topics are especially important in terms of how political culture is preserved, while the narrower ones help explain individual-level behavior and attitudes. The ‘agents of socialization’ are among the most explored topics, with the role of the family, schools, and political events receiving the heaviest scrutiny. Complex research designs are often employed here so that the individuals being socialized can be matched up with the agents providing the socialization cues. There are intracountry variations. Political socialization may proceed in different ways and generate different outcomes according to individual and contextual characteristics. Most attention has been paid to the role of social class, gender, race, region, religion, and birth cohort. Cross-country variations also exist. Commonalities based on individual developmental processes and differences based on institutional arrangements have been observed. Some of these conclusions are based on systematic, multicountry investigations whereas others are derived from a comparison of independent, one-country studies.
Although much emphasis has been placed on transitions from early childhood up through young adulthood, the study of political socialization increasingly is being viewed from a lifelong perspective in order to comprehend the residues of earlier socialization as well as the impact of adult life stage transitions. For many nonconsensus political beliefs and practices the formative years appear to be young adulthood, after which beliefs and practices tend to solidify. For beliefs and values of a more integral and widely shared nature, the formative years occur earlier. Of interest to a wide range of scholars, generational and historical effects may reflect either gradual
changes in the social composition of succeeding cohorts or sharper discontinuities due to secular events. Rising education levels and affluence exemplify generational effects due to social composition. Wars, depressions, revolutions, and student protest movements are examples of more abrupt, event-driven effects.
2. Methodology
The majority of the literature rests on a foundation of empirical methodology. Attention has focused more on individuals than on institutions and systems as units of analysis. There are various modes of data gathering. Surveys of probability and nonprobability samples compose the major mode. Three variants have been used. One consists of self-administered questionnaires of elementary and secondary school students. The use of such instruments for young children has been questioned. A second variant consists of personal interviews, either face to face or by telephone. Some of these surveys employ a structured format while others are semi-structured. Investigators using either mode typically employ statistical analysis. Other modes of data collection include intensive interviews, sometimes combined with participant observation and projective techniques, with a small number of subjects (Coles 1987). Analysis in this instance is usually qualitative in nature. Although not typically classified under the heading of political socialization, analytic biographies of political figures, which employ a variety of data sources, also constitute a mode of data gathering. Studies based on experiments are relatively rare. Most investigations are cross-sectional, that is, the subjects were studied only once. However, a few over-time studies of the same individuals exist, thereby enabling more insight into developmental patterns. Some studies stretch from pre-adulthood to full adulthood and involve either special populations or more general ones. Other longitudinal projects, especially those assessing the impact of educational institutions and political events, cover a shorter period of time. Replicated surveys of the same populations provide historical trend data. Longitudinal studies of preadult populations are now available in a number of countries via national and international archives. Similarly, replicated surveys of adult birth cohorts are widely available.
These surveys facilitate analyses devoted to assessing the probable impact of preadult experiences as cohorts age.
3. Initial Development and Ascendance of the Field
The phenomenon of political socialization has a long history. The preparation of individuals for their roles in the political world is as old as political life itself. Both political authorities and ordinary citizens have been subject to the practices and outcomes of socialization regardless of political regime type. By contrast, the systematic study of political socialization has a relatively short history. Around the turn of the twentieth century educators in the United States conducted studies of school children that included choices of exemplars, national figures most admired by the children (Greenstein 1965). These ‘surveys’ and occasional other investigations in a variety of settings marked the first empirical studies of what would later be called political socialization. It was not until the post-World War II era that more sustained, disciplined research emerged. The term itself surfaced in the mid-1950s. By the early 1960s the label had achieved wide academic currency among political scientists in the United States and quickly spread to other western countries. Early work in the United States, the initial center of the empirical investigations, focused on pre-adolescents and emphasized the positive and relatively benign processes and outcomes of political socialization. Although more realistic perceptions and understandings set in with advancing age and cognitive development, the overall effect was viewed as salutary both for the successful functioning of the political system and the citizenry (Easton and Dennis 1969). Following in the wake of these pioneering studies came a flood of projects in the United States and elsewhere. These inquiries tended to elaborate upon, modify, and in some instances reject the conclusions of the earlier work. Variations according to social class, race, and location were identified. Political institutions conditioned the content, processes, and outcomes in fundamental ways. Considerable attention was devoted to the agents of socialization.
Although the family appeared to be the key agent, it became clear that many other forces—including individuals themselves—also contribute to the formation and maintenance of political orientations. In general the more concrete, affect-laden, and reinforced the attribute in question, the more likely it was to be reproduced in the person being socialized and to persist over time.
4. Waning Interest
Scholarly inquiry in this area began to decline after the mid-1970s, especially in the United States. What had been a bull market turned bearish. Two factors appeared to be at work beyond the fading effect often associated with scholarly trends. One source lay in the dwindling novelty of studying school children. Political scientists in particular were accustomed to dealing with the ‘real world’ of politics, one occupied and ruled by adults. A second likely source rested in the appearance of youth movements throughout much
of the western world in the late 1960s and early 1970s. This development not only shifted attention away from pre-adults but also appeared to undermine the proposition that the trusting orientations observed in many of the earlier studies would persist over time. Indeed, the question of persistence posed a serious threat to the field. Ironically, even as fresh research efforts tailed off, the concept and the term itself became embedded in any number of subfields, including public opinion, electoral behavior, the politics of race and gender, political culture, and regime stability. Political socialization came to be assumed, rather than observed. Still, research continued on in a number of countries (Percheron 1993, Ichilov 1990) and longitudinal studies began to appear (Alwin et al. 1991, Jennings and Niemi 1981).
5. Resurgence and Future Prospects
After a prolonged lull, scholarly inquiry began to increase in the 1990s and has continued onward. Three developments helped lead to a resurgence of interest. One development was the crumbling of the Soviet Union and the appearance of transitional regimes and new democracies around the globe. These changes generated numerous surveys of adults that emphasize generational differences and studies of how and with what effect pre-adults are being socialized under the new regimes (e.g., Flanagan and Sherrod 1998). New, large-scale immigration patterns also presented opportunities for studies of adult resocialization. Thus, dramatic events in the political world have provided a natural laboratory for examining the processes and outcomes of political socialization. A second development springs from the decline in social capital and civic virtue among upcoming cohorts said to characterize most Western countries. In an effort to understand what is happening and to propose possible solutions, a variety of institutions and scholars have turned to the question of education and training of the young. National and cross-national assessments dealing with the character and impact of curriculum and school characteristics are being employed (Smith 1999). Much of this work is being carried out under the banner of civic education inquiries (Torney-Purta et al. in press). A third development consists of renewed interest in the question of persistence and the dynamics of learning. Results from the long-term panel studies form part of this renewal as do the shorter-term studies of children and adolescents. The accumulation of replicated surveys as represented by such projects as national election studies, the General Social Surveys, and the Eurobarometer Surveys is also relevant because they permit the tracing of birth cohorts over extended periods of time. In doing so, they enable
informed interpretations drawing upon the conditions under which pre-adult socialization occurred and how these cohorts respond to later developments. These new developments augur well for continued scholarly inquiry. At the same time, the prospects remain bleak with respect to the study of young children, a major focus of the early research efforts.
See also: Citizen Participation; Citizenship: Political; Civic Culture; Educational Sociology; Generations: Political; Opinion Formation; Participation: Political; Political Communication; Political Culture; Political Discourse; Political Elites: Recruitment and Careers; Political Protest and Civil Disobedience; Socialization and Education: Theoretical Perspectives; Socialization, Sociology of
Bibliography
Alwin D F, Cohen R L, Newcomb T M 1991 Political Attitudes Over the Life Span: The Bennington Women after Fifty Years. University of Wisconsin Press, Madison, WI
Coles R 1987 The Political Lives of Children. Houghton Mifflin, Boston
Easton D, Dennis J 1969 Children in the Political System. McGraw Hill, New York
Flanagan C A, Sherrod L R (eds.) 1998 Political development: Youth growing up in a global community. Journal of Social Issues 54: 447–627
Greenstein F I 1965 Children and Politics. Yale University Press, New Haven, CT
Ichilov O (ed.) 1990 Political Socialization, Citizenship Education, and Democracy. Columbia University, New York
Jennings M K, Niemi R G 1981 Generations and Politics: A Panel Study of Young Adults and their Parents. Princeton University Press, Princeton, NJ
Percheron A 1993 La Socialisation Politique (Political Socialization). Armand Colin, Paris
Sears D O, Valentino N A 1997 Politics matters: Political events as catalysts for preadult socialization. American Political Science Review 91: 45–65
Smith E S 1999 The effects of investments in the social capital of youth on political and civic behavior in young adulthood: A longitudinal analysis. Political Psychology 20: 553–80
Torney-Purta J, Lehmann R, Oswald H, Schulz W in press Citizenship and Education in Twenty-eight Countries: The Civic Knowledge and Engagement of Adolescents. International Association for the Evaluation of Educational Achievement, Amsterdam
M. K. Jennings
Socialization, Sociology of
Socialization generally refers to the process of social influence through which a person acquires the culture or subculture of his or her group, and in the course of
acquiring these cultural elements the individual’s self and personality are shaped. Socialization, therefore, addresses two important problems of social life: the problem of societal continuity and the problem of human development. Because of its broad scope and importance, socialization is claimed as a major process by most social science disciplines. Different disciplines, however, have emphasized different aspects of this process. Anthropology tends to view socialization primarily as cultural transmission from one generation to the next. Anthropological interest in socialization coincides with the emergence of the culture and personality orientation in the late 1920s and 1930s. Psychologists are more likely to emphasize various aspects of the individual’s development. For developmental psychologists, socialization is largely a matter of cognitive development that typically is viewed as a combination of social influences and maturation. For behavioral psychologists, socialization is synonymous with learning patterns of behavior, primarily through environmental reinforcement or through vicarious processes such as modeling. For clinical psychologists, socialization is viewed as the establishment of personality traits, usually in the context of early childhood experiences. The subfield of child development is associated most closely with the topic of socialization within psychology, where much of the focus is on child rearing (see Socialization in Infancy and Childhood; Socialization in Adolescence; Socialization and Education: Theoretical Perspectives).
1. Sociological Perspectives on Socialization
Within sociology there have been two major orientations to socialization and several minor ones. One of the oldest, associated primarily with structure-functionalism, considers socialization mainly as the learning of social roles. Parsons (1955), a major architect of this theoretical perspective, maintained that individuals become integrated members of society by learning and internalizing the relevant roles and statuses of the groups to which they belong. Since roles are important elements of social and cultural systems, comprising normative and value components, socialization into roles becomes a major means of integrating individuals into social systems and, thereby, enabling the continuity of these systems. This integration is viewed as functional for individuals and for society, although Parsons’s interest was more in the latter than the former. Merton (1957), another key figure in American functionalism, also viewed socialization as the learning of social roles, but conceived of the process as more problematic. His concept of ‘latent function’ points to the possibility of unintended and undesirable consequences of normative and institutional behavior, such as the perpetuation of social inequalities or the development of individual
dysfunction. Functionalism was the dominant theoretical perspective in sociology in the 1950s and 1960s. Its view of socialization as the learning of social roles largely characterized sociology’s contribution to ‘socialization’ in the previous edition of The International Encyclopedia of the Social Sciences (Brim 1968). But functionalism has declined since then. Role learning is still an important focus of sociological inquiry into socialization, but it is no longer the most important focus, and no longer from a primarily functionalist perspective. The other, and more prevalent, sociological orientation views socialization mainly as self-concept formation. This view of socialization is associated closely with the symbolic interactionist perspective, a pragmatically based constructivist theory of self and society that places major emphasis on the construction of meanings through symbolic (mostly linguistic) interaction. In Mead’s (1934) influential writings, the self is a reflexive phenomenon (i.e., both subject and object) that emerges out of symbolic interaction. Language enables the development of role taking whereby the individual is able to view himself or herself from the perspective of another. Role taking is the means by which culture (in the form of values, role identities, norms, etc.) becomes internalized as elements of the self-concept. This internalization, in turn, reproduces society. From the interactionist perspective, both self and society depend on the same process of social interaction through which realities are created, maintained, and negotiated. For contemporary interactionists as well, socialization is distinguished from other types of learning and other forms of social influence by its relevance for self-conceptions. As such, socialization is not merely the process of learning rules or norms or behavior patterns.
It involves learning these things, to be sure, but only to the extent that they become part of the way in which we think of ourselves, because the mark of successful socialization is the transformation of social control (via agents of socialization) into self-control (in the form of conscience and other self-regulatory mechanisms). This is accomplished largely through the development of identities, the various labels and characteristics attributed to the self, and guided by various self-evaluative processes (especially self-esteem, self-efficacy, and sense of authenticity). Commitment to identities, especially those based on important roles and values, is a source of motivation for individuals to act in accordance with the behavioral expectations implied by these identities (Gecas 1986, Stryker 1980). The focus on identity also emphasizes the membership component of socialization: to be socialized is to belong to some group. Socialization as self-concept formation occurs through various processes of learning and social influence, such as modeling, reinforcement, role-playing, social comparisons, direct instruction, and reflected appraisals. For sociologists, the most important of these is reflected appraisals. Based on
Cooley’s (1902) concept of the ‘looking-glass self’ and Mead’s (1934) ideas on role taking, the term reflected appraisals refers to our perceptions of how others see and evaluate us. The reflected appraisals proposition suggests that we come to see ourselves as we think others see us, especially significant others. Much of the research on the development of self-esteem, as well as programs to enhance self-esteem, is based on this idea. An important variant of this proposition is labeling theory, which was the dominant sociological theory of deviance in the 1960s and 1970s. The theory maintained that a key causal factor in the etiology of deviance is the imposition of a (usually negative or stigmatized) identity or label on a person. Individuals may come to think of themselves in terms of the imposed label, even if there was little or no initial basis for it, because of the responses of others toward them (Schur 1971). It is a form of self-fulfilling prophecy, used not only to explain the development of deviant or criminal careers but also to address classroom dynamics regarding the consequences of teachers’ differential expectations of students. The reflected appraisals process, in its several forms, is important for self-concept formation, but it has been subject to overstatement and uncritical application. Research has found only modest support, at best, for the proposition that our self-conceptions are a function of how we think others see us (Felson 1985). Some of this is due to the effects of situational contingencies (e.g., lack of specification of the nature of the relationship between self and other, the nature of the feedback given) and methodological problems (poor measures, cross-sectional research designs). But an even more important factor, increasingly recognized as such, is the agency and influence of the self in affecting its own socialization. The self does not passively receive input from the environment.
Rather, it is actively engaged in ‘selecting, distorting, ignoring, influencing others’ (Rosenberg 1979). Even very young children try to shape and manipulate their social environments by, for example, trying to influence their parents or resisting influence attempts from their parents. Self-motives are an important source of the self’s agency. By virtue of having a self, an individual is motivated to evaluate it favorably (self-esteem motive), perceive it as efficacious (self-efficacy motive), and experience it as meaningful or real (authenticity motive). It is uncertain how much of the self’s manipulations, distortions, and attempts to control its environment are due to these self-motives, but probably quite a bit, and more so for some people than for others. They are a source of resistance and distortion in the reflected appraisals process. But they also have broader implications for socialization. Within socialization contexts (as well as other settings), individuals will try to maximize their self-esteem, self-efficacy, and sense of authenticity. To the extent that agents of socialization can tap into one or more of these self-motives and utilize them toward socialization goals, the socialization effort will be more effective. Conversely, these same self-motives can serve as major sources of resistance to socialization pressures (Gecas 1986). In other words, the most effective socialization is self-activated, in which the individual is a willing accomplice in his or her own socialization. The picture of socialization that emerges from the symbolic interactionist perspective is of a highly reciprocal process between agents and subjects of socialization, in which self-conceptions are affected by reflected appraisals and other forms of social influence, but in which selves also exert an influence in their own right on the process and its outcomes. Society and culture do provide the stage and a general script for the novice, but typically there is plenty of room for interpretation and creativity.
2. Research on Socialization
Much of the sociological research on socialization, past and present, is context specific, that is, it studies socialization processes within specific contexts of interaction, such as family, peers, workplace, day care, boot camp, etc. (Gecas 1981). Some of the studies guided by symbolic interaction theory use ethnographic or field methods to record in rich detail the patterns of interaction, negotiation of meanings, the normative constraints of social settings, and their socializing consequences for the participants. Becker’s (1961) study of the processes by which medical students become doctors, Fine’s (1987) field studies of the development of gender identities among preadolescent boys on a Little League team, and Corsaro’s (1997) observations on the emergence of peer group subcultures among preschoolers in day care are excellent examples of this genre. While these studies provide rich and insightful observations of socialization processes as they occur in their natural settings, they are relatively rare because of the time and effort involved in this methodology. Most of the research on socialization, by symbolic interactionists and other sociologists, utilizes survey methods (e.g., interviews, questionnaires). This research tends to be quantitative and narrower in focus, for example, focusing on the effects of a socializing agent’s behavior on some aspect of the socializee’s development.
2.1 The Family Context
Even though socialization is not limited to experiences in childhood but occurs throughout one’s life, most of the research attention has been on children (although less so in sociology than psychology). Therefore, the most important context of socialization is the family, where the initial or primary socialization of children takes place. Studies of the consequences of parent–child interaction for the development of children are numerous, particularly along the dimensions of parental support and control. Research indicates that parents are most effective as agents of socialization when they express high levels of support or nurturance combined with the use of inductive control. Under these conditions, children are most likely to identify with parents, internalize parental values and expectations, use parents as role-models, be receptive to parents’ influence attempts, and develop positive self-conceptions and a strong conscience. Conversely, low parental support, combined with reliance on coercive control, is associated with unfavorable socialization outcomes (Peterson and Rollins 1987). These associations typically are due, to some extent, to reciprocal influences: not only are these child outcomes affected by these parental behaviors, but the characteristics of the child (e.g., compliance vs. rebellion, high self-esteem vs. low self-esteem) may elicit differential parental behavior. After all, children do affect parents as well as the reverse. Besides parental support and control, there are many other important variables in the family setting affecting socialization processes and outcomes, such as political and religious values expressed by parents, performance expectations, family structure regarding number of parents and children, birth order and sibling interaction, and personality characteristics of parent and child. Sociological interest in the effects of family structure, especially single-parent vs. two-parent families, on child socialization has been increasing, as the prevalence of single-parent families continues to increase in America and other western countries. This is one of the more hotly contested and politicized issues in the family socialization area.
While the research evidence seems to suggest that child socialization in single-parent families is disadvantageous for children, compared to socialization in two-parent families (McLanahan 1999), there are numerous factors that affect this relationship (such as economic level, quality of husband–wife relationship), leaving room for ambiguity and need for further research.
2.2 The School Context
The school as a context of child socialization has generated almost as much public concern as has the family and almost as much research. Like the family, the school is a social institution directly involved in socializing children, although its mandate is more narrowly defined as providing formal education and developing children’s knowledge and skills. In this sense, schools are less involved in primary socialization (i.e., the development of basic values, motivations, and conceptions of self) and more involved in secondary socialization (i.e., the development of knowledge and skills). This is not a very precise distinction, however, since a good deal of primary socialization occurs in schools as well, involving values, motivations, and self-concepts. Many school activities have
implications for the child’s self-concept. One of the most important involves evaluation of the student’s academic performance, and this evaluation is more public than evaluations by parents or peers. Success in academic performance is good for self-esteem and self-efficacy. But failure is not, and public failure is worse. Besides their implications for self-esteem, performance evaluations result in the categorization or labeling of students, by teachers as well as classmates, as ‘smart,’ ‘dumb,’ ‘lazy,’ etc. Such labels affect the way others respond to the person and, thereby, shape the person’s self-concept. But students, like other socializees, are not passive recipients of the pressures they experience. Some resist negative labels, some try to change them by working harder, and some adopt failure-avoiding strategies (such as withdrawal or procrastination) which may enable maintenance of self-esteem but at the expense of academic performance (Covington and Beery 1976). American public schools, concerned with this problem, began to develop strategies to enhance children’s self-esteem. By the mid-1980s, self-esteem enhancement became part of the curriculum in many public (especially elementary) schools. Grading and public evaluations of students were discouraged, as was negative feedback of any kind, while positive feedback and affirmations of self-worth were encouraged. This emphasis on self-esteem in the school curriculum had its own set of undesirable socialization consequences (Hewitt 1998) and began to decline by the 1990s.
2.3 Peer Groups
The third most important context for the socialization of children is the peer group. In terms of structure and function, the peer group is very different from family and school as a context of socialization. Peer groups are voluntary associations of status equals and are based on friendship bonds.
Children’s peer groups typically are segregated by sex and differ in organizational patterns: girls’ peer groups tend to be closely knit and egalitarian friendships, whereas boys’ peer groups tend to be loosely knit, larger groups, with status hierarchies (Thorne 1990). An important socialization consequence of intensive association with same-sex peers and involvement in sex-typed activities is that it strongly reinforces identification and sense of belonging with members of the same sex. It also contributes to the development of stereotypical attitudes concerning members of the opposite sex. The development of gender identities and much of sexual socialization during childhood occur in the context of peer rather than family associations. Peer interactions have a wide range of consequences for the child’s developing sense of self. In his field studies of pre-adolescent boys, Fine (1987) observed that friendships are especially appropriate for the mastering of self-presentation and impression-management skills, since inept displays usually will be ignored or corrected without severe loss of face. Friendships characterized by mutual trust and tolerance are important for exploring the boundaries of the self. A wider latitude of behavior is allowed, and even encouraged, since friends, unlike parents, are not responsible for changing or shaping the child’s behavior. Mutual understanding, role taking, and empathy, all important to the development of self in relation to others, develop best in the context of friendship bonds. Peers provide an alternate reference group for children, as well as an alternate source of self-esteem and identity, and they provide a context for the exercise of independence from adult control. As such, the peer group has been described as a socialization context in which subcultures of opposition or resistance to the dominant culture develop, such as the adolescent subcultures described by Coleman (1961) and in much of the literature on juvenile delinquency. The existence of adolescent subcultures, with values, norms, and tastes distinct from and often in opposition to the mainstream culture, has tended to polarize parents and peers as agents of socialization and make them competitors for the youth’s attention and loyalty (see Youth Culture, Sociology of).
2.4 Adult Socialization
The socialization experienced by adults generally falls into the category of secondary socialization, building on the socialization experiences of childhood. Much of this is role specific (Brim 1968), that is, learning the knowledge and skills required for the performance of specific adult roles (e.g., occupation, marriage, parenthood). As individuals become committed to the roles they play, they come to identify themselves and think of themselves in terms of these role identities (Stryker 1980).
Since work is a dominant activity and context for most men and women in modern societies, much of adult socialization involves preparation for an occupation, which usually takes place in specialized schools or training programs (e.g., medical school, college). The work setting itself can have a substantial socializing effect on workers, affecting more than just their knowledge and skills. Much of the socialization that occurs in work settings is incidental to the main activities on the job and often unintentional, involving the workers’ adaptations to the conditions of work. These work conditions vary considerably with regard to structural features, power relations, presence of coworkers, degree of complexity and autonomy, etc., and are consequential for many aspects of self and personality. Kohn and Schooler (1983) found that certain work conditions (e.g., complexity, autonomy, routinization) affect the development of workers’ values and personality. Kanter (1977) found that the nature of work relations, especially the structure of
opportunity on the job, affects the workers’ attitudes and behavior. Workers’ adaptations to their work conditions do not lead necessarily to commitment to the job or self-investment in the occupational role-identity. On the contrary, a prevalent theme in the sociological literature on work, especially from a Marxist perspective, deals with the alienating consequences of work in capitalist societies (Hochschild 1983).
2.5 Resocialization Contexts Many other contexts have socializing consequences for adults: family, church, recreation settings, and various voluntary organizations. The socialization that takes place in these contexts typically is a continuation and expansion of past socialization experiences. Resocialization, however, refers to socialization experiences constituting a more radical change in the person. Resocialization contexts (such as mental hospitals, reform schools, political indoctrination camps, religious conversion settings, some therapy groups) have as their explicit goal the transformation of the individual, because the individual is perceived by self and/or others to be fundamentally dysfunctional, misguided, sinful, or deficient. The key task in resocialization is the replacement of the person’s previous set of values, beliefs, and self-conceptions with a new set based on a new ideology or worldview, that is, the ‘death’ of the old self and the birth of a new self. Typically, this is accomplished through intense small group interaction and pressure to change, in which the physical and symbolic environments are highly controlled by the agents of socialization. The actual techniques used to bring about change may vary considerably in resocialization contexts depending, for example, on whether the socializee is there voluntarily or not (Ofshe 1992). Resocialization contexts have received more attention from the popular press than they have from social scientists. They deserve much more research attention, since they constitute an important and more drastic type of adult socialization. Also, the prevalence of resocialization contexts in the form of religious and therapeutic groups seems to be increasing in modern societies.
3. Current Trends and Future Directions Interest in socialization tends to increase in troubled times, when social institutions (especially family systems and public schools) are perceived as breaking down or working badly, or when the rate of social change creates stress, difficulties in self-maintenance, and anxiety about the future. These are such times in most modern societies. Consequently, current academic and public interest in socialization is high. It is also marked by controversy. Much of the controversy
centers on family issues and family trends: increasing divorce rates, number of single-parent families, births to unmarried women, use of day-care centers, dual-career families, etc. It is the consequences of these trends for child socialization and well-being that are at issue. Current opinion (academic and public) is divided strongly on the implications of these changes for child socialization. More research is needed and will be forthcoming, since funding agencies are very interested in these issues as well. Socialization theory and research typically lag behind societal changes and developments, and are slow in acknowledging and addressing social change. What effect will the rapidly evolving computer technology have on the socialization of children and adults? The ubiquity of computers in homes, classrooms, workplaces, and recreation centers is quite obvious in modern societies. Children are becoming involved with this technology at very early ages; even preschoolers are becoming computer-literate. Unlike television, computers are an interactive technology, more likely to engage and develop the user’s cognitive and motor skills. As such, there is reason to hope that the influence of computers on children’s socialization will be beneficial, as some scholars have suggested (Turkle 1995). Computers and information technologies are transforming modern societies. Surely they are also affecting processes and outcomes of socialization. But just how, when, and what these processes and outcomes might be remains to be studied. Socialization theory in sociology has long had a primarily cognitive emphasis, especially evident in symbolic interactionism. The interactionist conception of the self as a reflexive process that develops out of role taking (a cognitive process) and language left little room for emotions. This started to change in the mid-1970s.
Sociologists and social psychologists interested in socialization rediscovered emotions and have been trying to include them in theorizing about socialization. Special attention is given to self-relevant or reflexive emotions, such as shame and guilt, and to their importance in the development of the moral self (Shott 1979). Research on emotions has lagged far behind these theoretical developments, but is increasing and will continue to increase in the sociology of socialization. A perennial issue regarding human development is the relative influence of biological factors vs. social factors (i.e., nature vs. nurture) in the development of mind, self, and personality. Sociology has long been squarely on the ‘nurture’ side of the issue, resisting any claims of biological determinism of any part of the socialization domain. There are signs that a more conciliatory stance is developing between sociology and biology. One reason is sociology’s renewed interest in emotions, which increases the relevance of physiological factors. Another reason is the dramatic advances in biology (e.g., genetics, cell biology, physiology of the brain). These have potential implications
for a wide range of socialization concerns. How do biological factors interact with environmental influences to produce socialization outcomes? Research at the intersection of biology and sociology could be very important for the development of socialization theory.
See also: Adult Education and Training: Cognitive Aspects; Child Care and Child Development; Childhood: Anthropological Aspects; Coleman, James Samuel (1926–95); Erikson, Erik Homburger (1902–94); Family as Institution; Family Processes; Freud, Sigmund (1856–1939); Identity: Social; Integration: Social; Mead, George Herbert (1863–1931); Self-regulation in Adulthood; Self-regulation in Childhood; Socialization: Political; Status and Role, Social Psychology of
Bibliography
Becker H S 1961 Boys in White: Student Culture in Medical School. University of Chicago Press, Chicago
Brim Jr. O G 1968 Adult socialization. In: Sills D L (eds.) International Encyclopedia of the Social Sciences. Macmillan, New York
Coleman J S 1961 The Adolescent Society. Free Press of Glencoe, New York
Cooley C H 1902 Human Nature and the Social Order. Scribner’s Sons, New York
Corsaro W A 1997 The Sociology of Childhood. Pine Forge Press, Thousand Oaks, CA
Corsaro W A, Eder D 1990 Children’s peer cultures. Annual Review of Sociology 16: 197–220
Covington M V, Beery R G 1976 Self-worth and School Learning. Holt, Rinehart and Winston, New York
Felson R B 1985 Reflected appraisals and the development of the self. Social Psychology Quarterly 48: 71–8
Fine G A 1987 With the Boys: Little League Baseball and Preadolescent Culture. University of Chicago Press, Chicago
Gecas V 1981 Contexts of socialization. In: Rosenberg M, Turner R H (eds.) Social Psychology: Sociological Perspectives. Basic Books, New York
Gecas V 1986 The motivational significance of self-concept for socialization theory. In: Lawler E J (eds.) Advances in Group Process. JAI Press, Greenwich, CT, Vol. 3
Hewitt J P 1998 The Myth of Self-esteem. St. Martin’s Press, New York
Hochschild A R 1983 The Managed Heart: Commercialization of Human Feeling. University of California Press, Berkeley, CA
Kanter R M 1977 Men and Women of the Corporation. Basic Books, New York
Kohn M L, Schooler C 1983 Work and Personality: An Inquiry into the Impact of Social Stratification. Ablex, Norwood, NJ
McLanahan S S 1999 Father absence and the welfare of children. In: Hetherington E M (eds.) Coping with Divorce, Single Parenting, and Remarriage. Erlbaum, Mahwah, NJ
Mead G H 1934 Mind, Self, and Society. University of Chicago Press, Chicago
Merton R K 1957 Social Theory and Social Structure. Free Press, Glencoe, IL
Ofshe R 1992 Coercive persuasion and attitude change. In: Borgatta E F, Borgatta M L (eds.) Encyclopedia of Sociology. Macmillan, New York
Parsons T 1955 Family structure and the socialization of the child. In: Parsons T, Bales R F (eds.) Family, Socialization, and Interaction Process. Free Press, Glencoe, IL
Peterson G W, Rollins B C 1987 Parent-child socialization: A review of research and applications of symbolic interaction concepts. In: Sussman M B, Steinmetz S K (eds.) Handbook of Marriage and the Family. Plenum Press, New York
Rosenberg M 1979 Conceiving the Self. Basic Books, New York
Schur E M 1971 Labeling Deviant Behavior. Harper & Row, New York
Shott S 1979 Emotions and social life: A symbolic interactionist analysis. American Journal of Sociology 84: 1317–34
Stryker S 1980 Symbolic Interactionism: A Social Structural Version. Benjamin/Cummings, Menlo Park, CA
Thorne B 1990 Children and gender: Construction of difference. In: Rhode D L (eds.) Theoretical Perspectives on Sex Differences. Yale University Press, New Haven, CT
Turkle S 1995 Life on the Screen: Identity in the Age of the Internet. Simon and Schuster, New York
V. Gecas
Societies, Types of While the concept of society is to be found in most sociological writings, it remains ambiguous and relatively ill-defined. Like most of the scientific concepts that are also used in common speech, that of society seems to need no introduction and to reflect reality in a rather straightforward, transparent way. Yet although it is one of the commonest, this notion is also one of the most ambiguous; the question has recently been raised whether it should not be considered obsolete or even harmful (Ingold 1990). It is not the purpose here to go that far; nevertheless, its ambiguities cannot be ignored, and it can be argued that the concept of society has probably been more a means of constructing reality than a representation of it. This is also the case with typologies that have been constructed around it, and that can be considered as useful only within quite well-defined boundaries.
1. From Society to Societies One of the first ambiguities of the concept of society lies in the fact that it is used, with quite different meanings, in the plural as well as in the singular. In sociology, the use of the singular refers to social reality in a general way, which can be both abstract and empirical. In social anthropology, on the other hand, it is more commonly used in the plural, and it is very widely used. Like most anthropological concepts, it is
rather loosely defined, and is often presented almost as a material entity. In other words, it is used as if it had empirical status: a society is something that exists and can be clearly observed. Consequently, social anthropologists have used the concept of society indiscriminately, and have transformed any little village into a society. The theoretical, and indeed the political, consequences of such a use are important, as shall be seen below. At this stage, it needs only to be noted that the omnipresence of the concept of society, in sociology as well as in social anthropology, does not prevent its being a rather blurred category, sometimes meaningless or at least fluid. A quick glance at different textbooks and dictionaries suffices to indicate that, although it is referred to on almost every page and seems pivotal, it does not always merit its own entry. It might, at first glance, seem useful, as there is no term more appropriate for referring to a well-knit social group differing from the more political and neighboring concepts of state, nation, or ethnic group. But this advantage soon turns out to be a handicap as one does not really know to what it precisely refers; yet it subsumes the social classes and oppositions that characterize any social reality. It is undoubtedly legitimate, and probably inevitable, to classify social realities into types. Yet one must avoid the temptation of considering societies as well-defined objects. On the contrary, it is not the concept of society that entails comparison, but comparison that entails the concept of society; in other words, the need for typologies creates the elements to be classified (Amselle 1999, p. 21). A typology can only be provisional and of limited use. That is surely the reason why no classification can be considered as definitive.
That is particularly true of the perpetually recurrent dichotomies such as those that oppose ‘modern’ and ‘traditional’ societies, which are inevitable sources of confusion. A ‘type’ is always the outcome of a process of classification, and can never be taken as revealing an ‘essence’ (Boudon 1998, p. 104).
2. A Dichotomic Outlook If humans are naturally social beings, predisposed to social life, the task of sociology is to understand the mechanisms of life in society. Nevertheless, sociologists very soon stressed the limitations of a purely abstract conception of society, and attempted to distinguish between various fundamental types of society. To some extent, it could be said that the very idea of society automatically opens up into the discussion of its diversity. Furthermore, society is hardly conceivable apart from its specific instantiations. One of the most influential definitions was coined by the German sociologist Ferdinand Tönnies (1855–1936). It is interesting to point out that his definition
includes a comparative perspective, since, according to him, the concept of society (Gesellschaft) can only be understood in reference to the concept of community (Gemeinschaft): community is characterized by the proximity, both affective and spatial, between individuals. Within the community, the ‘we’ prevails over the ‘I’; the individual does not exist outside the collective. Society, on the other hand, characterizes the more modern forms of sociability and is founded on individual interests. These latter ground social relations, which are oriented toward the market and the contract. One finds traces of German Romanticism in Tönnies’ conception of society, which can be considered as a theory of Heimat. Tönnies’ opposition between Gesellschaft and Gemeinschaft was to be taken up by Durkheim, according to whom two types of solidarities can be distinguished: mechanical solidarity and organic solidarity. The first expresses the similarity and the community of feelings that unite the members of a group. The second, on the contrary, consists of interdependence, in a division of labor, as each member contributes to the whole according to his or her capabilities. Although it has sometimes remained unnoticed, the place of the individual is central to Durkheim’s conception of solidarity: according to him, the crucial element in mechanical solidarity is that individuals do not exist as such, they are completely subsumed in the collectivity (Durkheim 1978, pp. 100, 170). Organic solidarity, on the other hand, presupposes the existence of the individual; here, all individuals have a ‘sphere of action’ and their own personality. The individuals here are necessarily different from one another (p. 101).
Durkheim also considers that mechanical solidarity is more present in ‘inferior societies’ and, more generally, he relates the type of solidarity to the degree of division of labor: the more rudimentary the division of labor, the more mechanical the society and, in such circumstances, the individuals of a group look very much alike. Their character lacks any kind of originality; there is no psychological individuality. Conversely, among civilized people, the division of labor is at its peak, and individuals are very different from one another. Organic solidarity replaces, little by little, mechanical solidarity, which is more present in clan-based societies, which are made up of similar entities. As society becomes more complex, solidarity becomes organic (Durkheim 1978, p. 159). Certainly, traces of mechanical solidarity remain here and there, but there is a global process of substitution of the one type for the other. In La Division du travail social, Durkheim seems to realize that he has validated a liberal conception of social life in modern societies, and he therefore nuances his thought by reasserting that altruism is essential to any kind of social life: in order to be able to live together, people must make mutual sacrifices (Durkheim 1978, pp. 207–9). He then suggests that mechanical solidarity never completely disappears. Later on, he asks for more social justice and more equality (p. 311) and he reminds us that society cannot be conceived without some solidarity. Durkheim did not really draw any conclusion from these last remarks that might have led him to reduce the gap between ‘inferior’ and ‘civilized’ societies. If he sees the necessity of some collective feeling in industrial societies, he never doubts the idea that the individual is basically absent from mechanical solidarity. Yet one of the main difficulties of the opposition between Gemeinschaft and Gesellschaft, and indeed of other dichotomic approaches, lies in the ‘annihilation’ of individuals, and in the equally problematic sense of communality that is meant to be typical of preindustrial societies. Such a radical opposition between ‘us’ and ‘them’ is by no means limited to these classical examples. It is also to be found in more contemporary writers such as Lévi-Strauss, who opposes ‘cold’ and ‘hot’ societies on the grounds of their degree of ‘historicity’ (Lévi-Strauss 1973, p. 40). The French anthropologist was wise enough to recognize that such a distinction is purely theoretical and does not correspond to any concrete entity; however, one can wonder whether his warning is not purely rhetorical, as he (and others) continued to use the same kind of dichotomous categories based on the relative impact of history and change. It is no exaggeration to say that such views have been very influential and that they have directed the way we conceptualize preindustrial societies. It is, then, important to point out that those hunting and gathering groups that have been taken as living examples of preneolithic societies do not live in autarky, and have been in contact with more ‘advanced’ groups for a long time; while others in this category have in fact adopted hunting quite recently. Furthermore, as has been suggested by Lévi-Strauss himself, the passage from the type to the reality remains delicate, and it is often difficult to decide which group belongs to one or the other pole. Those typologies often seem to derive from theoretical prejudices, such as the deeply grounded ideas either that some peoples are resistant to change or that the individual is absent from some groups. This latter idea was also at the basis of the opposition between holism and individualism, which, in Dumont’s theory, amounts to two types of societies opposing each other. Whereas Dumont sometimes considers holism and individualism as ‘ideologies,’ he does not take this specification into account when he argues that they also represent two ‘types of societies’: ‘Where the individual is the supreme value, I speak of individualism; in the opposite case, where the value lies in the society as a whole, I speak of holism’ (Dumont 1983, p. 37). Elsewhere, he assimilates this dichotomy to the opposition between ‘traditional’ and ‘modern’ societies (Dumont 1966, p. 23). In his view, India then becomes the archetype of traditional societies, which might look somewhat paradoxical.
If the distinction between individualism and holism is debatable in itself, it becomes even more contestable as soon as one looks for some empirical data to exemplify it. Indian studies, for instance, are increasingly ill at ease with the idea that the individual is absent from the Indian social system (see, e.g., Mines 1994). To consider the ‘renouncer’ (the ascetic) as the sole agent of social change in India (Dumont 1983) amounts to reducing India to a purely religious order in which there is no room for the military, the merchant, the king, or even the householder: in other words, a world without politics or economics. More generally, the same criticism applies to any dichotomous representation of the world. The main raison d’être of typologies of this kind lies precisely in their deficiency: they soon appear as alternatives, radical and exclusive oppositions that reveal an ‘essence.’ They are also linked to a kind of sociological illusion that consists of dissolving individuals into a system by denying them any kind of action. They also transform the world into blocs that are said to be homogeneous (e.g., East and West, North and South). What is the relevance of this kind of dichotomy, which seems to impoverish our analyses? Recent changes in the contemporary world make this kind of view even more hazardous: globalization, migration, and the new technologies make any search for essence even more dubious. Furthermore, modern societies are far from having given up any kind of ‘holism,’ as is shown by some modern movements of ethnic renaissance or religious fundamentalism, or indeed in US multiculturalism, all of which promote the need for collective recognition at the expense of individual differences. Conversely, the absence of a division of labor does not entail any psychological conformity among individuals, and no structural constraint has ever succeeded in inhibiting every kind of individuality.
In short, the classical distinction between ‘them’ and ‘us’ is always much less radical than is suggested by some classifications.
3. Economic Categories Beyond this rather Manichean use of typologies, one can find some more elaborate attempts that base themselves on a variety of criteria in order to compare economic, kinship, or religious systems. Their main interest is allowing the possibility of comparison. However, these systems also fail to avoid the trap of reducing social realities, complex by nature, to a single criterion. One sees categories such as ‘matrilineal societies’ or ‘polyandric societies,’ which seem to imply that a specific feature can qualify a complex whole. One could argue that these classes can only be useful within very precise limits, in this case a comparison between ‘marriage systems’; however, an examination of their use in practice shows that they have inevitably been used in a much broader sense, even though the
selected criterion had little relevance to the questions at issue. That is what Leach deplored: ‘Comparison is a matter of butterfly collecting, of classification, of the arrangement of things according to their types and subtypes’ (Leach 1968, p. 2). The comparative anthropologist, Leach complains, is a mere butterfly collector, and the types and subtypes proposed in these systems are so problematic that they lead to nothing. This is particularly the case with categories based on unilineal descent, which might be without any sociological significance: ‘We must be prepared to consider the possibility that these type of categories have no sociological significance whatsoever. It may be that to create a class labelled matrilineal societies is as irrelevant for our understanding of social structure as a class of blue butterflies is irrelevant for the understanding of the anatomical structure of lepidoptera’ (Leach 1968, p. 4). The classification of societies according to economic criteria is one of the commonest forms, perhaps because it gives the impression of being based on objective elements, in a materialistic perspective that has always been influential in the social sciences. Lewis Henry Morgan was among the first to take technical inventions as the foundation for his evolutionary stages. According to him, the first two stages of humankind (savagery and barbarism) were subdivided into inferior, middle, and superior degrees, whereas the third stage (civilization) started with the invention of writing. The first degree of savagery is characterized by fruit-gathering, and the middle degree by fishing, whereas hunting only appears at the superior level. Barbarism starts with pottery, proceeds to the domestication of animals, and ends with metallurgy. There is no need to dwell upon the deficiencies of such a model, which does not even take agriculture as constituting a significant stage.
In an evolutionary perspective, comparison was above all diachronic, and aimed at stressing the sense of history. What is being compared is not ‘social systems’ but different stages in the development of humankind. In some ways, it is more a philosophy of history than a typology that is at stake here. Morgan had a deep influence on Marx and Engels, who also considered humankind in an evolutionary perspective: social wholes are defined from the relations of production between the various social classes, at a given level of technology. One calls a ‘mode of production’ a special arrangement of those elements at a specific moment of history. In a strict sense, the mode of production denotes a few specific types of relations of production. In the 1960s and 1970s, however, the concept took on a more elaborate meaning, especially among those scholars who followed Althusser in an attempt to combine Marxism and structuralism. The mode of production was then conceived as a model or structure that encompassed the totality of the social relations. It consisted of a special arrangement within the ‘economic infrastructure’ (the ‘base’) that determined the superstructure (politics, law, and ideology). One could say that the superstructure necessarily reflects the infrastructure; in other words, the economic base determines all social and jural relations. The mode of production became a key concept of Marxist theory, and numerous efforts were mobilized toward the delineation of the most characteristic modes of production: besides the classical Marxian major types (Asiatic, feudal, and capitalist modes of production), Marxists invented new types, such as the domestic mode of production or the lineage mode of production. The weakening of Marxism and structuralism put an end to this line of research, which had not brought any convincing results. One of its major shortcomings was to consider social realities as systems that are closed and self-sufficient. It was quite legitimate to stress the importance of economic realities, but the overall principle of determination was somewhat simplistic, and apart from a few remarks about the ‘relative autonomy’ of the superstructure, it was unable to account for the influence of ideas and values in social formations. The very existence of the socialist ‘democracies,’ with their Orwellian omnipresence of the state, could serve to illustrate the limitations of a purely economic determinism. With the decline of Marxism in the 1980s, the mode of production became a rather obsolete concept. Nevertheless, the type of production remained one of the major criteria used to differentiate societies. Numerous anthropological studies attempted to characterize societies on the basis of hunting, pastoralism, and agriculture. Hunting societies have always been a major source of fascination among anthropologists, partly because they can be seen as survivors of the Palaeolithic era. The first anthropologists took hunting societies as living fossils surviving from prehistoric times.
Such a position is hardly acceptable today since we know that most contemporary hunting societies have been in contact with more ‘advanced’ groups for a long time while others have only taken up hunting quite recently. Furthermore, few peoples have lived exclusively on hunting. If the contemporary hunter can no longer be considered as typical of the first stages of humankind, hunting remains a special case of the relationship between humans and nature. In Femmes, greniers et capitaux, Meillassoux (1974), a French anthropologist, showed that hunting and agriculture entailed different types of social relations. According to him, there was no rupture between production and reproduction: thus kinship relations are also determined by productive forces. One can distinguish between two economic types according to whether land is an ‘object’ or a ‘means’ of labor. Hunting groups use land as an object of labor: it is used directly, without any preliminary investment of human energy. Therefore one can speak of an economy of ponction (withdrawal of capital). Here, the
cycle of reproduction of the labor force (human energy) is short: there is no possible accumulation of wealth, and what is being produced must be consumed. Social relations are affected by such an organization: the producing groups are precarious, and prevent the appearance of any form of authority. The membership of a horde is constantly changing, and people move freely from one to another. This free mobility of adults between the groups is, according to Meillassoux, the main mechanism of social reproduction. According to him, the horde is not made up of kinship relations but of ‘adhesion.’ In other words, the social status of individuals is not fixed at birth through descent rules, but depends on the voluntary choice of the individual. These productive constraints determine a special type of social relationship. However, it is difficult to follow Meillassoux when he argues that kinship ties are reduced to a minimum in the hunting band: we have no example to support his claim that children are easily given up at an early age, that marriages are short-lived, or even that weddings and funeral ceremonies are less elaborate than the rituals welcoming new members. The classical example of the !Kung goes against Meillassoux’s assertions. His mistake was to consider that relations of adhesion necessarily contradicted kinship ties. It is now well known that the elementary family is common among most social groups, whatever their degree of complexity (see for instance Kuper 1994, p. 178). Yet, if some hunting groups have a quite elaborate social organization, the majority of them have almost no internal differentiations, except those based on age and sex. They have no state organization and have an economy that is largely based on exchange. The concept of ‘peasant society’ is another interesting attempt to analyze the differences between hunters and agriculturalists. It was also a means of considering the latter in a comparative perspective.
The works of Eric Wolf, Robert Redfield, or Marshall Sahlins (who spoke of a ‘domestic mode of production’) all tend in that direction. The problems of the ‘peasant society’ as a concept can be seen at first glance, since it comprehends such a vast and disparate ensemble. One could wonder what all agriculturists have in common, apart from working on the land. Yet the concept was quite fecund, and gave rise to interesting developments: Robert Redfield (1956), for instance, stressed the difference between the great and little traditions, and saw peasant communities as ‘partial societies’ that belong to larger entities. In France, Henri Mendras (1995) also coined a concept of peasant society based on five main features: the relative autonomy of the community, the importance of the domestic group, a relative self-sufficiency, the existence of a local collectivity, and the role of mediators between the local group and the main society. Sociologists have always been preoccupied with the different forms of contemporary societies, and Max Weber was undoubtedly a precursor in this field.
According to Weber, the rational organization of free labor is one of the main features of Western capitalism. Raymond Aron was much inspired by Weber in his definition of industrial society by five main criteria: separation between family and industry, an original type of production and division of labor, the accumulation of capital, rational economic calculation, and the concentration of the workers in the factory. Two main types of industrial societies could then be distinguished according to the role of the state in the distribution of resources and the type of property: the socialist and the capitalist society. Aron considered this distinction as typological in the sense that it had a merely instrumental nature, and therefore did not match reality. It was above all a tool. This is surely true of all typologies, which are useful inasmuch as they remain ‘ideal-types,’ in the Weberian sense of the term.
4. Political Systems

Political organization is another common means of classifying societies. Here, too, the first attempts tried to oppose ‘modern’ and ‘traditional’ societies, as the works of Rousseau or Hobbes show. In the political context, this opposition took the shape of a dichotomy between state and stateless societies. In Primitive Society, Robert Lowie argued that there is no radical difference between the two categories: some sort of political organization is present in any kind of group, and it is therefore possible to study its growing manifestations. However, the culturalist and relativist foundation of modern anthropology was more prone to stress the differences than the continuities, and anthropologists increasingly emphasized the gap between ‘them’ and ‘us.’ One of the most influential books in the field was undoubtedly African Political Systems, edited by Edward Evans-Pritchard and Meyer Fortes. Relying on their African examples, the contributors to this volume made a distinction between societies that have autonomous political institutions and those that have not. In the latter case, politics is expressed through kinship and the lineage system. The African examples were taken as having a universal value and social anthropologists saw lineages everywhere, and went as far as speaking of ‘lineage societies’. In a classical study, The Nuer, Evans-Pritchard (1940) asked himself how a ‘society’ could function in the absence of any political institutions and state organization. He discovered that while the Nuer have no state, they have a segmentary structure, in other words a division into segments of a similar nature, which merge or fight according to the circumstances. Evans-Pritchard’s study became very influential, not only in social anthropology, but also in the political sciences, where The Nuer is considered as a case study of preindustrial society. In recent years, however,
social anthropologists became increasingly critical of the picture constructed by Evans-Pritchard. The model was undoubtedly elegant, but it contrasted with a rather chaotic reality on the ground. It also appeared that social organization was not solely characterized by the ties of kinship but that, on the contrary, the bands included many people who were not related by blood ties. In other words, the contrast between the Nuer and more advanced societies was perhaps less striking than was stated by the anthropologist. This is probably typical of the tendency to insist on difference: in the Australian context, for instance, anthropologists have labeled as ‘Kariera’ and ‘Aranda systems’ societies that are divided into four or eight sections. Yet those very peoples also in fact live in nuclear families. Clastres’s opposition between society and the state is not very convincing either and reminds us of the dichotomies that were mentioned above. According to Clastres, primitive societies are stateless societies, and differ radically from state societies. According to him, the real revolution in history was not economic (the invention of agriculture), but political. The advent of the state, of power and domination, radically altered the mode of existence of societies and provoked a breach in human history (Clastres 1974, p. 172). The appearance of the state meant the death of ‘society,’ in the narrow sense of the term. Clastres was attacking Marxist economism, but he fell into a simplistic eulogy of primitivity, and amalgamated different realities under a single category. Social anthropologists then sought to improve their typologies and spoke of bands or chiefdoms, or tribes without rulers. These categories may be useful, though only to a certain extent.
5. From Structure to Relativism

In sociology, the work of Talcott Parsons became one of the most interesting attempts to classify social groups according to their ‘structure.’ In his structural-functionalist perspective, he identified four models of society. Radcliffe-Brown took up a similar viewpoint, and defined social anthropology as a comparative sociology, the aim of which was to compare social structures. In social anthropology, however, his influence was undermined by a very strong reluctance to search for general laws. The studies of Ruth Benedict, for instance, put the stress on the irreducibility of every culture. By considering participant observation (or experience) as the main source of knowledge, social anthropology encouraged a kind of scepticism. Although she entitled her first book Patterns of Culture, Benedict spoke of the plasticity of human nature, and marveled at the infinite diversity of cultures. As for society, culture became plural. The US culturalists
themselves never became hostile to comparison; but they had sown the seeds of a relativist ideology that invaded the social sciences and rejected any kind of generalization. Relativism itself is an ideology that is historically determined. In a sense, it is true that all groups are different from one another (as indeed are all individuals), but the social sciences will have to go beyond this if they wish to prolong their existence. In that case, they will have to resume their elaboration of models and types, since classification and comparison remain elementary scientific activities. At the same time, typologies will never succeed in replacing analyses, and will always remain intellectually unsatisfactory. Moreover, when applied to societies, they also tend to reinforce the idea of homogeneous, well-defined entities, an idea that is problematic, both scientifically and politically.

See also: Community/Society: History of the Concept; Industrial Society/Post-industrial Society: History of the Concept; Peasants and Rural Societies in History (Agricultural History); Society/People: History of the Concept; Sociology: Overview; Structure: Social; System: Social
Bibliography

Amselle J-L 1999 Logiques métisses: anthropologie de l’identité en Afrique et ailleurs. Payot et Rivages, Paris
Aron R 1962 Dix-huit leçons sur la société industrielle. Gallimard, Paris
Boudon R 1998 La Place du désordre: critique des théories du changement social. Presses Universitaires de France, Paris
Clastres P 1974 La Société contre l’État. Minuit, Paris
Dumont L 1966 Homo Hierarchicus: essai sur le système des castes. Gallimard, Paris
Dumont L 1983 Essais sur l’individualisme: une perspective anthropologique sur l’idéologie moderne. Le Seuil, Paris
Durkheim E [1930] 1978 De la division du travail social. Presses Universitaires de France, Paris
Evans-Pritchard E 1940 The Nuer: A Description of the Modes of Livelihood and Political Institutions of a Nilotic People. Clarendon Press, Oxford, UK
Ingold T (ed.) 1990 The Concept of Society is Theoretically Obsolete. Group for Debates in Anthropological Theory, London
Kuper A 1994 The Chosen Primate: Human Nature and Cultural Diversity. Harvard University Press, Cambridge, MA
Leach E 1968 Rethinking Anthropology. The Athlone Press, London
Lévi-Strauss C 1973 Anthropologie structurale deux. Plon, Paris
Lévy-Bruhl L 1951 Les Fonctions mentales dans les sociétés inférieures. Presses Universitaires de France, Paris
Meillassoux C 1974 Femmes, greniers et capitaux. Maspéro, Paris
Mendras H 1995 Les Sociétés paysannes. Gallimard, Paris
Mines M 1994 Public Faces, Private Voices: Community and Individuality in South India. University of California Press, Berkeley, CA
Morgan L H [1877] 1971 La Société archaïque. Anthropos, Paris
Parsons T 1951 The Social System. The Free Press, New York
Redfield R 1956 Peasant Society and Culture: An Anthropological Approach to Civilization. Chicago University Press, Chicago
Schnapper D 1999 La Compréhension sociologique: démarche de l’analyse typologique. Presses Universitaires de France, Paris
Service E 1966 The Hunters. Prentice-Hall, Englewood Cliffs, NJ
Tönnies F [1887] 1977 Communauté et société: catégories fondamentales de la sociologie pure. Retz, Paris
R. Deliège
Society/People: History of the Concept

‘People’ is a concept that has come to acquire two related, yet different meanings. On the one hand, there is the democratic, political conception of ‘the people’ (the people as demos), and on the other there is the ethnic, cultural conception of ‘a people’ (the people as ethnos). In either sense of the word, the crystallization of the modern concept of the people can be traced back to the latter part of the eighteenth century, when the idea of the ‘sovereignty of the people’ became central to democratic political theory, and the nation-state emerged as the paradigmatic form of state institution, informed by the principle that ideally each state contained one nation/people, and that each nation/people constituted one state. In this article each meaning will be discussed, as well as their relationship to each other.
1. The ‘People’ and the Democratic and National Imaginaries

Prior to the eighteenth century the concept of the people was of relatively minor importance. Present from the outset were two basic meanings of the term, a general one, closely related to ‘nation,’ ‘community,’ and ‘society,’ and a particular one, which opposed the people to the elite. Take for example Samuel Johnson’s A Dictionary of the English Language from 1755, which juxtaposes the following definitions: (a) ‘A Nation, those who compose a community,’ (b) ‘The Vulgar,’ and (c) ‘The Commonality; not the princes or nobles.’ Similarly, Abbé Furetière’s Dictionnaire universel from 1690 lists both a positive, general meaning (‘the mass of persons who live in one country, who compose a nation’) and a negative one, defined ‘by opposition to those who are noble, wealthy, and educated.’ During the eighteenth century the concept, which until then had carried largely negative connotations, associated with the vulgar, the lowly, and the beastly (what the Romans referred to as plebs), underwent a dramatic change, and ‘the people’ was valorized and reformulated in distinctly positive terms,
as the source of moral virtue, political legitimacy, and cultural authenticity, a sentiment captured in the increasingly common saying, ‘the voice of the people is the voice of God.’ By around 1800 the ‘people’ was thus constituted as what Reinhart Koselleck calls a ‘key-concept’ (see Brunner et al. 1972–92). In dictionaries published after 1800 a veritable explosion of neologisms derived from the root words ‘people,’ ‘Volk,’ ‘peuple,’ ‘narod,’ ‘folk,’ and so on, can be observed, reflecting what has been called the ‘discovery of the people’ (Burke 1994). This production of new words had two origins. On the one hand there were the terms derived from the vocabulary of the English, French, and American revolutions, for which the principle of the ‘sovereignty of the people’ was the central tenet. In opposition to the Old Regime, with its feudal, hierarchical society with the sovereign king at the top of the social and legal pyramid, a new idiom was elaborated that spoke in terms of citizenship, civil liberties, and legal equality. In England the primacy of Parliament and especially the House of Commons gave expression to this new political reality, which also brought with it a new sense of common, popular English nationality. In France the Third Estate, which contained the now celebrated people, became synonymous with the nation, while the nobility was excluded from such membership, associated, as Sieyès put it, with ‘incurable idleness,’ ‘collapse of morality,’ and a variety of exclusive and insulting social and political rights, privileges, and prerogatives. And in the USA, the Declaration of Independence spoke of the right of ‘the people’ to assert its sovereignty against the tyrannical and illegitimate powers of the English king.
On the other hand, there were Romantic, ethnographic terms inspired by the German philosopher Johann Gottfried von Herder and his followers, who ventured into the countryside to explore the very popular culture that hitherto had been the object of scorn and contempt. In 1774 Herder coined the word Volkslied, or ‘folksong,’ as the title of a collection of songs he published, and a host of similar words pertaining to ‘folklore’ and popular culture soon became commonplace in most European languages. As in the realm of politics, ‘the people’ in this ethnographic sense was now to be rescued from oblivion and condescension and instead to be presented as the authentic bearer of national culture. True poetry, Herder suggested, gives expression to the ‘soul of the people,’ and according to Jacob Grimm such authentic, national poems ‘belong to the whole people.’ For Herder, the people was first and foremost a matter of history and culture, and he harbored an organic vision of the concept. Each people was thus conceived as a collective individual that possessed its own national character, its own language, its own mores, its own morality, and its own sense of self. In contrast to the political definition of the people as demos, Herder thus stood in the forefront of an
elaboration of the people as ethnos. Indeed, Herder tended to perceive the emerging modern state with a great deal of suspicion, as a ‘machine,’ an ‘artificial form of society’ that threatens the organic, authentic culture of a people. From a wider perspective these new terms belonged to a more general phenomenon of linguistic innovation during this period. As Lynn Hunt has observed in the context of the French Revolution, ‘words came in torrents’ as the old political, social, and economic order broke down and attempts were made to semantically capture a new one (Hunt 1984). As the rate of change accelerated, experience could no longer guide action in the present. A new interpretative language was required to make sense of the world. Thus attempts had to be made to formulate a vocabulary that no longer simply referred to what Koselleck calls ‘the space of experience,’ but instead reached into the future, creating a ‘horizon of expectation,’ terms through which prognostic political goals could be hatched (Koselleck 1985). In other words, the concept of ‘the people’ acquired a utopian character. As such it came to occupy a central position in the ‘democratic imaginary’ (Joyce 1994), as a concept that did not merely serve to describe an existing state of affairs, but also prefigured the future. By elaborating visions of popular sovereignty, a politically potent expectation was created that has come to define the context for politics ever since.
Similarly, the people imagined as a form of community (the nation) was a vision that came to inform European and global politics as nationalism became a fundamental political force, erupting in the revolutions of 1848 (‘the springtime of the peoples’), in the unification of Italy and Germany, in the collapse of the Hapsburg Empire after World War I in the shadow of President Wilson’s ideology of the right to national self-determination, and in the politics of anticolonial national liberation movements during most of the twentieth century.
2. Structural Relations: Who Belongs to the People?

In structural terms the difference between these two conceptions is one of how membership in ‘the people’ is delimited. In the one case the inclusion/exclusion is made in inside/outside terms. Here the emphasis is on setting limits against other peoples by defining the peculiar character of one’s own people and culture in relation to those of others. In its most extreme and lethal form, such as the one practiced by the National Socialists in Hitler’s Germany, the people may be defined in terms of race and blood, and citizenship be determined according to the strictures of jus sanguinis. Alternatively, membership may be determined through participation in what Rousseau called ‘civil religion,’ allegiance to a national constitution and to the law, a process by which individuals become active
citizens, devoted to the nation, and to the practice of civic virtue. In this context, ethnicity tends to play a less defining role, and rights to citizenship may, as in France or the USA, be conferred automatically on anyone born in the country (jus soli) (Brubaker 1992). However, in either case, citizenship and nationhood are by definition simultaneously a matter of inclusion and exclusion, however benign and civic the criteria for membership in the national community may be. Thus the imagined community of the people/nation is always both sovereign and limited, surrounded by and defined against other ‘peoples’ who likewise are bounded and sovereign within those boundaries (Anderson 1983). This emphasis on the people as a community built on affective ties of solidarity and comradeship furthermore requires a minimum of cultural if not ethnic homogeneity, and thus nation-states have consistently engaged in efforts at ‘nationalizing the masses’ through public institutions such as the schools and the military, creating a nationally minded people where there previously existed diverse and culturally heterogeneous populations, divided by all manner of differences in language, custom, and law, turning ‘peasants into Frenchmen,’ as the historian Eugen Weber memorably put it in the case of France (Hobsbawm 1983, Mosse 1975, Weber 1976). In the other case, associated with the democratic definition of the people, the delimitation is made along a vertical axis where ‘the people’ is set against segments of the population at large that are defined as not belonging to the people. This can be done by excluding groups either at the top or the bottom. So, for example, the people can be defined as the entire population except the government, or the broad segments of the population vs. the ruling elite. It has also been done by excluding groups at the bottom.
Thus, to take the German case, the working class could be seen as vaterlandslose Gesellen whose primary identification was with the international proletariat and not with the German nation. The words Pöbel (from French peuple) and Masse were deployed in similar ways, to distinguish the ‘real,’ respectable Volk from the degenerate, urban underclass. Indeed, the delimitation could be made in both directions simultaneously; the Social Democrats, for example, could claim the noble title of the Volk defined against both the capitalist elites above and the Lumpenproletariat below. In the socialist vocabulary, the concept of the people had to compete with related notions that figured prominently in the Marxist idiom, particularly ‘class’ and ‘proletariat,’ especially since it was often closely linked to the concept of the ‘nation,’ an idea that sat uneasily within Marxist theory. Eventually, however, ‘the people’ retained a largely positive status, as evidenced in the common usage of the concept of ‘the people’s republic’ in referring to a variety of communist regimes. In the context of the liberal-democratic political system of the West, the utopian
impulse at the heart of the democratic definition of the people, one that envisions not only the sovereignty of the people but also that all individuals within a sovereign state should belong to the people/demos, has led, at least in theory, to an increasing democratization of society and to the abolition of traditional obstacles to citizenship on the basis of categories such as class, creed, race, or gender.
3. National Variations: Neighboring Concepts and Political Ramifications

While the concept of the people has played a central role throughout the Western world and beyond, the impact of this key concept has varied considerably from country to country. This is true both when it comes to the specific political impact and how the ‘people’ came to be understood in relation to neighboring key concepts such as state, society, community, class, proletariat, and individual rights and liberties. In the French case the ideas of the people and the nation became strongly linked to the state and the political notion of the general will. Thus the French nineteenth-century historian Jules Michelet, in his influential book Le Peuple (1846), focused on the role of the Revolution and democratic politics as constitutive of the people/nation, and similarly Ernest Renan, in his essay, ‘What is a Nation’ (1882), emphasized that the nation should not be seen as first and foremost a matter of blood, but as an expression of political will, ‘a daily plebiscite.’ As Renan put it, when it comes to nationality, ‘the people’s wish is after all the only justifiable criterion, to which we must always come back.’ Here Renan was significantly arguing against fellow German historians over the status of Alsace-Lorraine, the provinces that lay in the borderland separating Germany from France, an area annexed by the Germans in the wake of the Franco-Prussian war of 1870. By contrast, the German conception of the Volk and the nation was, pace Herder, at heart a cultural conception that at best saw the state as a protective shell safeguarding the Volk/Nation, or, as in the case of Herder himself, saw the state as an artificial and even destructive form of society. At the same time, the French and the German conceptualizations of the people were similar in that they both were collectivist.
Le peuple and das Volk were both formulations that suggested a collective, be it a national, ethnically/culturally defined community, or a collectivity that expressed the democratic, general will of the people. In England the idea of the people came, as in France, to be strongly associated with political institutions and practices. However, whereas in France the idea of the people/nation tended to be associated with the collective, communal notion of the general will, which gave priority to the nation and to the social over the rights of the individual, it was in England more
strongly associated with the assertion of parliamentary power, individual liberty, and the limitation of state power. Thus English national identity itself came to be linked to political as well as economic liberalism, and to a conception of the general good that took as the basic unit the freeborn individual Englishman. These tendencies were even more fully realized in the USA, where the separation of powers and the protection of individual rights were enshrined in the constitution itself. The beginning point was the stress on the American colonists’ entitlement to ‘English liberties’ and civil rights, rights that, it was claimed, had been trampled upon. The next step was to separate from this close association to Englishness and instead to refer, as was done in the Declaration of Independence, to the ‘inalienable rights’ ultimately grounded in universal Natural Law and ‘the rights of man.’ However, in spite of this emphasis on universality, the Constitution, and the rights and liberties it guaranteed, soon became the basis for the elaboration of an American civic nationalism, based on the idea that the American people, formed by its liberal-democratic institutions, was exceptional in its commitment to individual liberty and equality. Thus in the USA, the idea of ‘We the People’ as the ultimate source of legitimacy and sovereignty was informed by a pervasive individualism and an enduring suspicion of centralized government. If in the French case the primacy of the people as a unified collectivity went hand in hand with strong central government, national standards, and a distaste for local difference, it was conversely true that in the US case the primacy of the people as a collection of free individuals informed an opposite tendency.
At the same time, Herderian notions of national character also came to infuse US self-understanding of the essential character of the American people, which revolved around the idea that America as a people and a nation was more mature and evolved than other peoples by virtue of its deep-seated capacity for democratic self-government. A similar fusion of an ethnic and a civic notion of the people occurred in Sweden, where leading Social Democrats after World War I came to embrace the idea that Swedes were endowed with a peculiar propensity for democracy and freedom, amounting to what one of them referred to as having ‘democracy in the blood.’ Citing a long, unbroken legacy of self-government that linked modern-day Swedes to the ancient yeoman peasants, they claimed the existence of a tradition in which ‘democracy is rooted not merely in the constitution, but also in our traditions and in the disposition of the folk,’ as the Swedish prime minister Per Albin Hansson put it in 1935. On this basis, Swedish Social Democrats conjured up a vision of a folkhem, a people’s home, which can usefully be contrasted with the contemporary German slogan, Volksgemeinschaft, or ‘people’s community,’ championed by Hitler and the National
Socialists. While both concepts celebrate national community, one prefigured the Nazi racial state, while the other became the organizing slogan for the Swedish welfare state. In the Swedish case, the concept of the folk connoted democratic sentiment rooted in a particular, common history; in the German one, the corresponding concept of the Volk referred to a racial community. This is particularly clear in the adjectival forms of folk/Volk, both equally difficult to translate into English. The German word, völkisch, means ‘national’ but with a distinct emphasis on race, and specifically with an anti-Semitic thrust. The Swedish word, folklig, conversely, is better translated as ‘popular/populist/democratic,’ with an association to the democratic legitimacy of claims made by those who credibly speak in the name of ‘the people,’ first and foremost the nonelite, that is, workers and peasants. Thus the precise relationship between the concept of the people and such neighboring concepts as state, society, culture, race, and the individual becomes absolutely central if we are to understand the political implications of the concept of the people. In one case the proximity of Volk/nation/culture/race/collectivism is associated with racist nationalism; in another the linkage between peuple/nation/state/society connotes a collectivist, if democratic, conception of popular sovereignty; in a third the association with the people/individual liberty suggests a liberal and individual rights-based political order; while in a fourth one the correlation of folk/home/solidarity/state informs the vision of a welfare state.
See also: Citizenship, Historical Development of; Citizenship: Sociological Aspects; Civilization, Concept and History of; Civilizations; Community/Society: History of the Concept; Democracy; Democratic Theory; Ethnic Identity, Psychology of; Identity and Identification: Philosophical Aspects; Identity in Anthropology; Identity Movements; Identity: Social; Individual/Society: History of the Concept; Multiculturalism and Identity Politics: Cultural Concerns; Nationalism: General; Nationalism, Sociology of; Nations and Nation-states in History; Personal Identity: Philosophical Aspects; Social History; Social Identity, Psychology of; Societies, Types of; State and Society; State, History of; State, Sociology of the
Bibliography

Anderson B 1983 Imagined Communities. Verso, New York
Brubaker R 1992 Nationhood and Citizenship in Germany and France. Harvard University Press, Cambridge, MA
Brunner O, Conze W, Koselleck R (eds.) 1972–92 Geschichtliche Grundbegriffe, Vols. 1–7. Klett, Stuttgart, Germany
Burke P 1994 Popular Culture in Early Modern Europe. Scolar, Aldershot, UK
Hobsbawm E, Ranger T (eds.) 1983 The Invention of Tradition. Cambridge University Press, Cambridge, UK
Hunt L 1984 Politics, Culture, and Class in the French Revolution. University of California Press, Berkeley, CA
Joyce P 1994 Democratic Subjects. Cambridge
Koselleck R 1985 Futures Past. MIT Press, Cambridge, MA
Mosse G 1975 The Nationalization of the Masses. Cornell University Press, Ithaca, NY
Weber E 1976 Peasants into Frenchmen. Stanford University Press, Stanford, CA
L. Trägårdh
Sociobiology: Overview

‘Sociobiology’ refers to the study of social phenomena within the conceptual framework of biological science. More specifically, it usually refers to an explicitly neo-Darwinian approach that grew out of the theoretical contributions of W. D. Hamilton (1964) and G. C. Williams (1966). The hallmark of this approach is the interpretation of social behavior, and its causal mechanisms and processes, as products of the Darwinian processes of natural and sexual selection (see Darwinism), and hence in terms of adaptive function or ‘design’ (Dawkins 1986). The term ‘sociobiology’ was apparently coined by J. P. Scott, a student of the genetics and development of mammalian social behavior, who began organizing paper sessions on animal behavior and sociobiology at scientific meetings around 1950. In 1956, the Ecological Society of America established a new section of animal behavior and sociobiology, the forerunner of today’s Animal Behavior Society, with Scott as its chairman (Dewsbury 2000). However, the word ‘sociobiology’ was not widely used before 1975, when Harvard entomologist Edward O. Wilson made it the title of an ambitious volume reviewing and synthesizing information and ideas about the evolution of social behavior across the animal kingdom. Wilson (1975, p. 4) defined sociobiology as ‘the systematic study of the biological basis of all social behavior,’ but although this definition is frequently quoted, it does not convey the word’s meaning in actual usage. In the broadest sense, since ‘biology’ is ‘the science of life and life processes, including the structure, functioning, growth, origin, evolution, and distribution of living organisms’ (the American Heritage Dictionary), and since only living organisms have ‘social behavior’ in any interesting sense of the word ‘social,’ Wilson’s definition could be construed as inclusive of all social science.
However, ‘sociobiology’ is obviously not used as a synonym for ‘social science’ and most social scientists do not consider themselves biologists. A narrower construction of the meaning of ‘biology’ does not help Wilson’s definition conform to usage: for most scientists, the phrase ‘biological basis’ connotes the genetic, endocrine, and brain mechanisms underlying social behavior, but these ‘proximate causal’ factors are not the primary focus of sociobiologists, and most work in these areas would not be considered sociobiological. In practice, both Wilson’s tome and subsequent sociobiological work have been primarily concerned with applying theories from population biology, genetics, and ecology to the problem of explaining the evolution of social phenomena and their diversity both among and within animal species (see Comparative Method, in Evolutionary Studies), in much the same way as one might endeavor to understand the evolution and diversity of anatomical structures.
Like anatomists and physiologists, whose research is guided by their interpretations of the component parts and subsystems of organisms as devices ‘for’ digestion, respiration, vision, circulation of the blood, and so forth, sociobiologists are ‘adaptationists’ (see Adaptation, Fitness, and Evolution). That is, sociobiologists assume that social phenomena (such as mating preferences, offspring sex ratios, warning calls, systematically discriminative treatment of young by parents, situationally contingent gregariousness, seasonality of territorial defense, etc.) have a functional significance for the behaving animal that can be illuminated by appropriate hypothesis testing. Moreover, sociobiologists are ‘selectionists’ (see Natural Selection). That is, their hypotheses about the adaptive functions of social phenomena generally entail the assumption that the social attributes under consideration have been shaped by a history of Darwinian selection to be effectively fitness-promoting in ancestral environments (see Adaptation, Fitness, and Evolution).
Indeed, effective contribution to fitness (the replicative success of the focal individual’s genes relative to their alleles, and hence genetic posterity) is considered the superordinate criterion of adaptive function; more immediate aims, such as foraging efficiently, attracting mates, and protecting one’s young from predators, are interpreted as having evolved as means to the end of fitness.
Sociobiological theory and research began to flourish only after the fallacy of naïve group-level adaptationism or ‘greater goodism’ (Cronin 1991) was dispelled. For over a hundred years, Darwin’s (1859) concept of ‘natural selection’ was widely misunderstood, even by evolutionary biologists, who often uncritically interpreted evolved attributes as having been selected to promote ‘the reproduction of the species’ (see Evolutionary Selection, Levels of: Group versus Individual). Williams (1966) dissected and demolished this fallacy, persuading a majority of evolutionists that the capacity of Darwinian selection to create adaptive organization resides primarily in the differential reproductive success of alternative ‘designs’ within species, that the attributes that evolve therefore tend to be functionally organized to help individual organisms to outreproduce conspecific rivals, and that any ensuing benefits or harms to a whole population or species are incidental byproducts of organismic designs rather than what the designs are organized to achieve. This critique inspired a view of organisms as ‘evolved reproductive strategists’ whose attributes must be understood as tactics for effective reproductive competition against others of their own kind. (It is important to note that the ‘strategy’ metaphor carries no implication of foresight or intent. Indeed, the concept applies to brainless organisms as fully as to those with complex cognition. Thus, a sociobiologist might refer to a plant’s ‘reproductive strategy,’ which is analyzed as consisting of numerous special-purpose ‘tactics,’ such as germinating in response to some degree of abrasion of the seed husk, flowering in response to a threshold air temperature, avoiding self-fertilization by asynchronous maturation of male and female flower parts, and so forth.)
The second main conceptual development leading to the flowering of sociobiological theory and research in the final third of the twentieth century was Hamilton’s (1964) formulation of ‘inclusive fitness’ (or ‘kin selection’) theory (see Adaptation, Fitness, and Evolution). Classical Darwinism had construed the fitness that selection tends to maximize as personal reproductive success, but Hamilton noted that selection will favor any phenotype (see Genotype and Phenotype) that effectively promotes the replicative success of copies of the focal organism’s genes, relative to their alleles (alternative forms of the same gene), regardless of whether those genes reside in descendants or in collateral relatives. Thus, for example, the complex brood care and colony maintenance behaviors of sterile worker ants can be favorably selected if these acts promote the reproduction of a queen who is closely related to the workers.
Hamilton proposed that the quantity that selection tends to maximize is not personal reproduction (sometimes called ‘direct fitness’), but the focal individual’s ‘inclusive fitness effect,’ which is its total impact on the proliferation of its alleles via both direct fitness and any effects that the focal individual may have on the reproductive success of collateral kin (‘indirect fitness’). Hamilton’s analysis implied that organisms are not just evolved ‘reproductive strategists,’ but evolved ‘nepotistic strategists’ (see Social Evolution: Overview). In this context, ‘nepotism’ means directing benefits to one’s genetic relatives, and the implication of Hamilton’s revision of the fitness concept was, to quote Alexander (1979, p. 46), that living creatures ‘should have evolved to be exceedingly effective nepotists, and we should have evolved to be nothing else at all.’ (Note that Alexander, like most evolutionary biologists, uses the word ‘nepotism’ to encompass acts that promote both the direct and indirect components of fitness; confusingly, however, some authors use this term to refer only to acts that promote indirect fitness, excluding parental care from its domain. Even more confusing is the common misuse of the term ‘kin selection’ as a substitute for ‘nepotism,’ such that an individual who is helping relatives is said to be exhibiting or engaging in kin selection. Used correctly, ‘kin selection’ refers to a population genetical process whose expected phenotypic consequence, on an evolutionary time scale, is nepotism, which is often but not necessarily manifested as selectivity in social response. Many sociobiologists, however, including Hamilton, have eschewed this popular phrase, both because it has so often been misused and because even its correct usage is often misinterpreted as logically parallel to, and theoretically alternative to, ‘individual selection’ or ‘group selection’; see Evolutionary Selection, Levels of: Group versus Individual.)
By the late 1980s, the word ‘sociobiology’ had begun to fall from favor. Most practitioners now refer to their field as ‘behavioral ecology,’ a term with slightly broader scope, in that it encompasses the adaptationist, selectionist approach not only to social phenomena but to asocial behavior as well, especially foraging. Thus, although one might suppose from its title that the journal Behavioral Ecology and Sociobiology, established in 1976, would cover a broader domain than that of Behavioral Ecology, established in 1990, the scope of the two journals is identical. Declining use of the word ‘sociobiology’ should not be mistaken for a decline in the paradigm’s popularity. Indeed, during the 1980s, the sociobiological approach came to dominate the study of animal social behavior, as it still does (Alcock 2000).
1. The ‘Problem of Altruism’
A central concern of sociobiological theory and research has been the ‘problem of altruism’: if Darwinian selection is a process that favors and refines only those attributes that contribute to their bearers’ fitness relative to the fitness of conspecific rivals, then why is nature not more consistently ‘red in tooth and claw’? In other words, how can such prosocial phenomena as restraint in aggressive conflict or active cooperation have evolved? In this connection, sociobiologists define ‘altruism’ as action whose average consequence is a reduction in the actor’s direct fitness (expected reproductive success) and an increase in the direct fitness of someone else. (It should be noted that although this definition encompasses much of what might be deemed ‘altruistic’ in an ordinary English sense, since its essential feature is self-sacrifice that benefits others, it differs from ordinary usage in the fact that other-regarding intent is irrelevant.) At first glance, it would appear that selection must always penalize action that is altruistic in this technical sense, and yet animals often perform acts that clearly qualify, such as working to help raise others’ young, or paying some energetic or predation risk cost in order to warn neighbors of danger. How these phenomena came into existence and are able to persist in a Darwinian world was a mystery before the 1960s, although the unexamined greater-goodism of many biologists kept them from recognizing that there was something in need of explanation.
Hamilton’s inclusive fitness theory provided the most important answer. If the beneficiary of an altruistic act is a relative of the actor, and if that relative’s gain is sufficiently large relative to the actor’s sacrifice, then the net inclusive fitness effect for the actor can be positive and the altruistic tendency can increase in its relative frequency. More specifically, Hamilton theorized that the necessary condition for a heritable altruistic tendency to increase in prevalence under selection (often called ‘Hamilton’s rule’) is c/b < r, where c is the cost (in units of expected direct fitness) incurred by the actor, b is the benefit (in the same currency) enjoyed by the altruistic act’s beneficiary, and r is a ‘coefficient of relatedness’ between the actor and the beneficiary, indicative of the extent to which the two share identical genes by virtue of descent from recent common ancestors. Subsequently, more complex population genetical models confirmed the robustness and generality of Hamilton’s rule, at least as a close approximation, making it a cornerstone of sociobiological theory (see, e.g., Frank 1998).
A great deal of sociobiological research has been concerned with testing whether altruistic behavior in nature can be understood as nepotism in accordance with Hamilton’s rule. Supportive evidence has mostly taken the form of demonstrations that animals behave differently as a function of their relatedness to their interactants or neighbors; animals such as colonial ground squirrels and prairie dogs, for example, have been found to call out warnings of predator presence facultatively, according to the potential caller’s degrees of relatedness to others in the vicinity.
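As an illustration only, Hamilton’s rule (c/b < r, with c, b, and r as defined above) can be checked numerically. The sketch below is not drawn from the literature cited here: the function names are hypothetical, the cost and benefit figures are invented for the example, and the relatedness values (0.5 for full siblings, 0.125 for first cousins) are the usual textbook figures for diploid species.

```python
def hamilton_rule_satisfied(cost: float, benefit: float, relatedness: float) -> bool:
    """Return True when c/b < r, the condition under which a heritable
    altruistic tendency is expected to increase under selection."""
    if benefit <= 0:
        raise ValueError("benefit must be positive for the ratio c/b to be meaningful")
    return cost / benefit < relatedness


def net_inclusive_fitness_effect(cost: float, benefit: float, relatedness: float) -> float:
    """Equivalent form of the rule: the act is favored when r*b - c > 0."""
    return relatedness * benefit - cost


# Hypothetical act: costs the actor 1 unit of direct fitness and
# gives the beneficiary 3 units (illustrative numbers, not data).
COST, BENEFIT = 1.0, 3.0

favored_for_sibling = hamilton_rule_satisfied(COST, BENEFIT, relatedness=0.5)
favored_for_cousin = hamilton_rule_satisfied(COST, BENEFIT, relatedness=0.125)

print(favored_for_sibling)  # True: 1/3 is less than 0.5
print(favored_for_cousin)   # False: 1/3 is not less than 0.125
```

With these invented numbers the same act is favored when directed at a full sibling but not when directed at a first cousin, which is the kind of relatedness-dependent behavior the field studies described above look for.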
Progress in genetic methods of kinship assessment, especially ‘DNA fingerprinting,’ has fuelled an explosion of research on the ways in which social behavior in the field is affected by relatedness. There has also been considerable interest in elucidating how animals ‘recognize’ kin when they behave nepotistically; in some cases, they respond to cues that are under more or less direct genotypic control, such as certain odors, and in others to reliable circumstantial cues (e.g., those who share my mother’s womb with me must be close kin), while in still other cases there is no evident discrimination among one’s actual interactants at all but a ‘rule of thumb’ that does not require true recognition (e.g., behave cooperatively until I emigrate at maturity, at which time my average level of relatedness to those that I encounter will plummet).
The direct fitness costs incurred by altruists or nepotists and the fitness benefits that they bestow are usually harder to measure than relatedness, since they often depend on small changes in the risk of infrequent outcomes such as death by predation. However, one area in which relatively precise measurement and quantitative hypothesis testing have been substantial is that of ‘cooperative breeding,’ where individuals other than a parent or pair of parents participate in the care and rearing of dependent young. The two taxonomic groups that have been most studied are the social insects, which manifest complex consequences of some unusual twists on genetic relatedness (Crozier and Pamilo 1996), and cooperatively breeding birds, in which helpers are usually, but not always, close kin of those they assist (Stacey and Koenig 1990). Recent theory and research on ‘cooperative breeding’ have focused especially on factors affecting the degree to which reproduction is monopolized or shared (e.g., Reeve 2000).
Theoretical sociobiologists have also sought nonnepotistic solutions to the problem of altruism, with particular interest in the circumstances that permit reciprocity (‘you scratch my back and I’ll scratch yours’) to be evolutionarily stable. This issue has been addressed primarily by game theoretical models (see Game Theory), which have highlighted the necessity that nonreciprocators (‘cheaters’) be detected and denied the benefits provided to reciprocators, as well as the difficulty of establishing the requisite punishment of cheaters in multi-party games, when punishing is itself a sort of altruism (i.e., the provision of a ‘public good’). Empirical investigations of nonnepotistic reciprocity in nonhuman animals have been few, however, perhaps indicating that the complex social cognition that is necessary for all but the most limited forms of reciprocation-in-kind is restricted to human beings and our near relatives.
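The point that cheaters must be detected and denied benefits can be illustrated with a toy two-player repeated game. This is a minimal sketch, not a model taken from the works cited above: it assumes the iterated prisoner’s dilemma as the game, the conventional illustrative payoffs (5, 3, 1, 0), and three hypothetical strategies, one of which (‘tit for tat’) reciprocates the partner’s previous move.

```python
from typing import Callable, List, Tuple

# Payoff to the first player for (own move, partner's move).
# "C" = cooperate, "D" = defect. The values are conventional
# illustrative numbers, not empirical estimates.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# A strategy maps the partner's past moves to the next move.
Strategy = Callable[[List[str]], str]


def tit_for_tat(partner_history: List[str]) -> str:
    """Reciprocator: cooperate first, then copy the partner's last move."""
    return "C" if not partner_history else partner_history[-1]


def always_defect(partner_history: List[str]) -> str:
    """Cheater: never reciprocates."""
    return "D"


def always_cooperate(partner_history: List[str]) -> str:
    """Unconditional cooperator: never withholds benefits from cheaters."""
    return "C"


def play(a: Strategy, b: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Play a repeated game and return the total payoffs (to a, to b)."""
    moves_a: List[str] = []
    moves_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = a(moves_b), b(moves_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        moves_a.append(move_a)
        moves_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))         # (30, 30)
print(play(tit_for_tat, always_defect))       # (9, 14)
print(play(always_cooperate, always_defect))  # (0, 50)
```

Under these assumptions the cheater earns 50 against an unconditional cooperator but only 14 against a reciprocator, less than the 30 that two reciprocators earn from each other. The sketch covers only the two-party case; it does not address the multi-party punishment problem mentioned above.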
2. Sexual Selection and Sexual Conflict
Another major focus of sociobiological theory and research has been sexual reproduction. Problems include when and why organisms reproduce sexually (i.e., biparentally) rather than uniparentally, and, in the event of ‘dioecy’ (the existence of two sexes with distinct adaptations), exactly how and why females and males differ. In 1871, Darwin supplemented his 1859 theory of evolution under the influence of ‘natural selection,’ by proposing a distinct evolutionary force that he called ‘sexual selection’ (see Darwinism). Whereas natural selection referred to the nonrandom differential reproduction of types as a result of differential success in coping with such exigencies as finding food, avoiding predators, and enduring climatic factors, sexual selection referred to the nonrandom differential reproduction of types as a result of differential access to mates and their reproductive labors. Darwin distinguished the two because they could sometimes act in opposition, as for example when sexual selection favors the evolution of conspicuous courtship structures, like the peacock’s tail, that enhance mating success but apparently also increase mortality risk.
Sexual selection was the object of scant theoretical or empirical work for a century after Darwin proposed the concept. Williams (1966) and Trivers (1972) resurrected it as a focal topic in sociobiology, proposing that the extent to which sexual selection operates differently on females and males, and thereby induces the evolution of sex differences in anatomy, physiology, psychology, and behavior, was determined by the extent to which the reproductive efforts (especially parental care) provided by one sex constituted the principal resource limiting the fitness of the less investing sex (usually, but by no means always, males), and hence the object of competition among members of the less investing sex. More recent work has focused on the criteria and consequences of mating preferences, with particular interest in differentiating among theories that attribute the evolution of such preferences primarily: (a) to their utility for recruiting ‘good genes’ for one’s offspring; or (b) to their utility for garnering material resources and/or minimizing such costs of mating as lost time or energy and elevated disease transmission risk; or (c) to ‘sensory biases’ whose primary function resides in contexts other than mate choice; or (d) to a self-reinforcing dynamic (‘Fisherian runaway selection’) that tends to produce display features and preferences that are arbitrary with respect to all criteria of functionality outside the mating context (see Cronin 1991, Andersson 1994).
Recent theory and research have also focused on conflicts of interest between mates (or potential mates) as well as those between same-sex rivals. Work in this area is related to problems in the evolution of communicative signals between agents whose commonality of interests is only partial (see Game Theory).
3. ‘Selfish Genes’ and Intragenomic Conflict
An area of great contemporary interest concerns the evolutionary consequences of the fact that selection can operate differentially on different components of the genome. Because the male sex-determining Y-chromosome is transmitted only from father to son in such animals as fruit flies and human beings, for example, mutations on the Y that have the effect of biasing a progeny towards male production have been observed to increase in prevalence over generations, even when they reduce the inclusive fitness of the host individual (see Y-chromosomes and Evolution). Conversely, mitochondrial DNA and other cytoplasmic inheritance factors are transmitted only through daughters, and mutations in these factors that bias sex allocation towards females may similarly proliferate despite negative inclusive fitness effects (see Mitochondria and Evolution). The consequence is ‘intragenomic conflict’: simultaneous selection at different genetic loci, within the same population, in favor of alleles with antagonistic phenotypic effects (Haig 1997).
The recent discovery of ‘genetic imprinting,’ a process whereby certain genes are differentially activated according to whether they were inherited from one’s father or mother, has greatly extended the known potential for such intragenomic conflict, as for example when genes of paternal origin are selected to produce fetal and infantile phenotypes that effectively ‘discount’ the value of the mother’s potential future offspring (who may have distinct paternity) more steeply than do genes of maternal origin, leading to the evolution of directly antagonistic phenotypic effects of maternally and paternally imprinted genes (Haig 1993).
In a sense, these discoveries challenge the orthodox sociobiological view of organisms as nepotistic strategists with integrity of purpose subsidiary to the function of maximizing inclusive fitness, since selection clearly can and does favor the proliferation of genes that promote their own replication at the expense of ‘their’ organisms’ inclusive fitness. However, the individual-level focus of sociobiological research continues to be productive (Alcock 2001), and is unlikely to be abandoned soon.
See also: Adaptation, Fitness, and Evolution; Comparative Method, in Evolutionary Studies; Darwin, Charles Robert (1809–82); Darwinism; Evolution and Language: Overview; Evolution, History of; Evolution, Natural and Social: Philosophical Aspects; Evolution: Self-organization Theory; Evolutionary Epistemology; Evolutionary Selection, Levels of: Group versus Individual; Evolutionary Social Psychology; Game Theory; Genotype and Phenotype; Mitochondria and Evolution; Natural Selection; Social Evolution: Overview; Social Evolution, Sociology of; Sociality, Evolution of; Sociobiology: Philosophical Aspects
Bibliography
Alcock J 2001 The Triumph of Sociobiology. Oxford University Press, New York
Alexander R D 1979 Darwinism and Human Affairs. University of Washington Press, Seattle, WA
Andersson M 1994 Sexual Selection. Princeton University Press, Princeton, NJ
Cronin H 1991 The Ant and the Peacock: Altruism and Sexual Selection from Darwin to Today. Cambridge University Press, Cambridge, UK
Crozier R H, Pamilo P 1996 Evolution of Social Insect Colonies: Sex Allocation and Kin Selection. Oxford University Press, Oxford, UK
Darwin C 1859 On the Origin of Species by Means of Natural Selection. John Murray, London
Dawkins R 1986 The Blind Watchmaker. Longman, Harlow, UK
Dewsbury D 2000 Obituary: John Paul Scott (1909–2000). Animal Behavior Society Newsletter 45(2): 3–5
Frank S A 1998 Foundations of Social Evolution. Princeton University Press, Princeton, NJ
Haig D 1993 Genetic conflicts in human pregnancy. Quarterly Review of Biology 68: 495–532
Haig D 1997 The social gene. In: Krebs J R, Davies N B (eds.) Behavioural Ecology, 4th edn. Blackwell Scientific, Oxford, UK, pp. 284–304
Hamilton W D 1964 The genetical evolution of social behaviour, I and II. Journal of Theoretical Biology 7: 1–16, 17–52
Reeve H K 2000 A transactional theory of within-group conflict. American Naturalist 155: 365–82
Stacey P W, Koenig W D 1990 Cooperative Breeding in Birds. Cambridge University Press, Cambridge, UK
Trivers R L 1972 Parental investment and sexual selection. In: Campbell B (ed.) Sexual Selection and the Descent of Man 1871–1971. Aldine, Chicago, pp. 136–79
Williams G C 1966 Adaptation and Natural Selection: a Critique of Some Current Evolutionary Thought. Princeton University Press, Princeton, NJ
Wilson E O 1975 Sociobiology: The New Synthesis. Belknap Press, Cambridge, MA
M. Daly and M. Wilson
Sociobiology: Philosophical Aspects
Wilson (1975) published an overview of the evolution and adaptive significance of social behavior. It focused on the social behavior of nonhuman animals. But the final chapter applied evolutionary reasoning to human action, arguing that certain common patterns in human behavior were adaptations. That is, they exist because they conferred fitness advantages on our ancestors. Indeed, our ancestors were ancestors in part because they did, and others did not, manifest these behavioral dispositions. As a consequence, ‘sociobiology’ is often used narrowly, as a term for the evolutionary analysis of human social behavior, and that will be the focus of this article.
Human sociobiology has been a very controversial enterprise (Kitcher 1985, Hull and Ruse 1998). Yet on the face of it, that is puzzling. We are evolved organisms. Human behavior has an evolutionary history, a history that must at least partially explain that behavior. None of sociobiology’s chief critics deny this basic fact. Instead, they have developed a minimization strategy, accepting the basic fact of human evolution, but arguing that human evolutionary history is largely irrelevant to an explanation of specific features of human social behavior. One idea is that human evolutionary history can explain only what is individually fixed or innate, and what is common across cultures. But the most striking features of human experience are individual developmental plasticity and cultural variability. Thus, within the social sciences it is common to suppose that our evolved ‘human nature’ places only the broadest constraints on our cultural life. If humans reproduced asexually, or if human infants were independent at 18 months of age, human cultures would be very different. But human evolutionary history makes possible many human cultures and social organizations. Since every human group has a similar set of biological resources, the massive differences between groups must be explained in terms of differing cultural resources. Difference explains difference. Human evolutionary history explains how hominids developed the ability to transmit culture and the plasticity to be shaped by that culture. But that is all it explains (Lewontin et al. 1984, Levins and Lewontin 1985).
Sociobiologists resist this line of argument. In particular, Tooby and Cosmides argue that human cultural diversity and individual developmental plasticity are more bounded than the minimization strategy suggests. And they argue that diversity itself may have an evolutionary explanation. Organisms are adapted to behave differently in different circumstances. The differences between castes of social insects are often profound. Yet these differences are different developmental expressions of a single adaptation. Equally, a single mechanism of mate choice might generate one behavior in a New York singles bar, and quite a different one in an Amish church (Tooby and Cosmides 1992).
Sects. 1 and 2 outline the development of human sociobiology since 1975. The article then concludes with three short sections introducing key conceptual and methodological issues raised by contemporary sociobiology.
1. The Wilson Program
Wilson and his co-workers began with the insight that behavioral differences are just as subject to selection as any other feature of an organism. Animals within a population are as apt to differ in behavior as they are in any other aspect of their phenotype. These differences make a difference to fitness. Moreover, there is good reason to expect that behavioral differences are heritable. Evolutionary histories reconstructed from behavioral traits agree well with those from morphological and genetic ones (see, e.g., Kennedy et al. 1996). So selection can work. The mound building behavior of the orange-footed scrubfowl, a bird that lays its eggs in a mound that functions as an incubator, is just as likely to be an adaptation as its large powerful feet. Wilson simply extended this idea from animal to human behaviors. He had in mind such behaviors as incest avoidance, male sexual promiscuity and female coyness, infanticide, rape, and hostility to strangers. Just as the scrubfowl has been selected to build, guard, and monitor the temperature of its mound, perhaps humans have been selected to act with hostility towards strangers or to use infanticide to manipulate the sex ratio of their offspring (Wilson 1975, Wilson 1978).
Adaptationist analyses of behavior can seem quite plausible. Consider sex role differentiation. In some bird species, the male takes responsibility for brooding the eggs and caring for the chicks. But, in general, females, and especially female mammals, have a higher initial investment than males in any reproductive act. First, eggs are much larger, and hence more biologically expensive, than sperm. Second, the costs of pregnancy and postparental care are very significant. By accepting a sexual partner, a female mammal commits a serious fraction of her total life reproductive resources. That need not be true of males: they do not bear the costs of pregnancy and lactation. Sex is not without cost to males. Copulation exposes the male to predators, parasites, and pathogens. But the female also bears these costs together with those of pregnancy. So mating costs do not fall evenly on the sexes, and this suggests different optimal strategies for males and females. Males are likely to attempt to be more promiscuous; females, more coy. We might also expect some gender differentiation in caring decisions. If her young are to live, a female mammal has no choice but to engage in childcare. Not so the male, who is unconstrained by physiology. For him, all options are possible. These range from outright desertion of the pregnant female, through diversion of some of his resources to other mates and their young, to full participation in childcare.
Adaptive hypotheses of this kind about human behavior are all very controversial. This is not surprising, for they are hard to test. A standing problem for human sociobiology is to move beyond plausibility considerations to clear empirical tests, and it is not clear how that is to be done.
However, the most fundamental problem with the Wilson program is its grain of analysis. Lewontin has pointed out that an adaptive explanation of a trait presupposes that the trait can, over evolutionary time, vary independently of others (Lewontin 1978). It is, thus, very unlikely that there is any selective explanation for the color vision properties of the human left eye. The color vision capacities of the left retina do not vary (heritably) independently of those of the right retina. Its receptor sensitivities are not adaptive or evolutionary atoms. Human skin color, on the other hand, almost certainly is such an atom. Behavioral traits are sometimes atoms. The scrubfowl’s mound building and mound maintaining behavior probably has evolved independently, and operates independently, of the scrubfowl’s feeding, roosting, and predator avoiding activities. The human behavioral repertoire, however, is not an aggregation of independent units. The realization that human action depends on a small number of interacting psychological mechanisms propelled psychology from behaviorism to cognitivism. Our behavior is produced by mental mechanisms that play a role in many different behaviors. Some of the mental mechanisms used in hunting are also used in storytelling. Rape, xenophobia, child abuse, homosexuality, and other human behaviors very likely do not vary independently of other human actions. They could only be altered by altering the underlying mental mechanisms, and since these mechanisms are used for many different purposes, any change would have many other consequences. Particular behavior patterns are unlikely to have independent selective histories.
2. Darwinian Psychology
The idea that we should not be looking for adaptive hypotheses about specific behaviors has gradually become part of the accepted wisdom of human sociobiology. The psychological mechanisms that generate behavior are the proper focus of evolutionary theorizing. Sociobiologists have turned from the idea that each behavior evolved because it was selected to the idea that many different behaviors are caused by a relatively small number of cognitive mechanisms. It is these mechanisms that have evolved (Barkow et al. 1992, Plotkin 1997).
Evolutionary psychologists defend a general schema of the evolved nature of the human mind, and they have developed a discovery procedure for filling out that schema. They reject the metaphor of human psychology as a general-purpose computer programmed differently by different cultures. They replace this vision with a modular theory of mind. The mind is a cluster of evolved information-processing mechanisms. The main empirical program of evolutionary psychology is to characterize these ‘Darwinian algorithms.’ For example, Buss and Symons think that there are Darwinian algorithms of sexual attraction. These result in the tendency of human males to find those females attractive that bear the cultural marks of youth, and of women to find attractive those that bear the cultural marks of high status (Symons 1979, Buss 1994). Baron-Cohen and Sperber have argued that we have a specific mental mechanism, a ‘mind reading’ device, designed to enable us to anticipate others’ behavior (Baron-Cohen 1995, Sperber 1996).
Darwinian algorithms are supposed to be mental ‘modules’ in the sense defined by Fodor: domain-specific, mandatory, opaque, and informationally encapsulated (Fodor 1983). They are domain-specific because they deal with a specific class of situations in which the organism finds itself. The mind reader is not used to predict the outcome of physical interactions. Darwinian algorithms are mandatory. We do not choose to interpret a particular smile as false. Rather, our mechanisms of emotion recognition are engaged automatically once we recognize a particular stimulus as a face. Darwinian algorithms are opaque because their internal processes are not consciously accessible. If Cosmides and Tooby are right, we have cognitive mechanisms specifically adapted to detect cheating in social exchange (Cosmides and Tooby 1992). But we cannot tell by introspection whether we have such mechanisms or how they operate. Finally, Darwinian algorithms are informationally encapsulated because they do not make use of the information stored elsewhere in the cognitive system.
Chomsky looms large in the development of evolutionary psychology. If he is right, the differences between human languages are not profound. There are many very important features common to all human languages. The class of possible human languages is constrained quite tightly by the nature of an innate language acquisition device. Moreover, that device contains conditional elements whose different settings explain many of the differences between languages. Language thus illustrates two themes of evolutionary psychology. Diversity is less profound than it appears, and it can be explained by a single mechanism, one operating differently in different circumstances.
Evolutionary psychologists take the human mind to be a complex of modules, each adapted to solve a specific problem or cluster of problems that was central to the life of our ancestors. But how are we to identify and characterize these modules? They hope to solve this problem by the strategy of ‘adaptive thinking.’ Adaptive thinking infers the solution, the adaptation, from the problem, the ecological context in which the organism evolved. So the first task of the evolutionary psychologist is to identify the adaptive problems our ancestors confronted in their environments. For example, in complex and unstable social groups, our ancestors would have needed social expertise; they would have needed to be good judges of others’ likely actions (Whiten and Byrne 1997).
The second task is to discover the stable correlations between those aspects of the environment humans are equipped to sense, and those aspects they need to know about. We can expect natural selection to engineer into task-specific devices implicit knowledge of these correlations. So if, for example, aggressive behavior or defecting from an agreement is usually preceded by a characteristic stance or facial expression, the evolutionary psychologist would expect this to be engineered into those mechanisms specialized for managing social interaction. The third task of the evolutionary psychologist is to construct an information-processing design that could solve the adaptive problem using the available cues. The fourth and final task is to experimentally test for the existence of the hypothesized mechanisms. For many potentially useful adaptations will not actually be engineered into us.
Human sociobiology has developed from a theory of the adaptive significance of particular behavior patterns to a theory of the architecture of the human mind, and of the adaptive significance of the components of that architecture. Much remains controversial about this program, both in its detail and its overall orientation. So this article will finish by outlining three critical issues that are still open.
3. Testing Sociobiological Hypotheses
One central problem for sociobiology is to move beyond plausibility stories to develop clear empirical tests of specific sociobiological hypotheses. It is not obvious that this problem can be solved at all. The resources of comparative biology offer one of the most powerful methods of testing adaptationist hypotheses (Harvey and Pagel 1991). For example, suppose we suspect that a certain leaf shape is an adaptation to arid conditions. If we show that within a particular tree family (e.g., the eucalypts) there is a strong covariation between long, narrow, waxy leaves and distribution through an arid habitat, we could reasonably regard that hypothesis as well confirmed. It would be confirmed especially well if the ancestor of our (putative) arid-adapted trees did not have long and narrow leaves, for we could then rule out the hypothesis that the leaf shape was no more than an inheritance from a common ancestor. If we found the same pattern in other tree species, our confidence in our adaptive hypothesis would be stronger still.
Unfortunately, we cannot apply the methods of comparative biology to test the most interesting adaptive hypotheses about human psychological mechanisms. Our closest relatives are extinct. So many of these mechanisms are found only in our own species. Even where they are shared with our closest living relatives (e.g., the facial expression of emotion) they exist in a different form. Moreover, the human lineage—and even the human and great ape lineage combined—is not bushy; the eucalypt, in contrast, boasts hundreds of species. The comparative method can tell us nothing at all about the adaptive significance of language and very little about tool making, true imitation, and many other important human cognitive adaptations.
The main alternative to comparative biology is the development of quantitative evolutionary models.
If there is a good quantitative fit between model and data, we can regard the model as passing an important test (Orzack and Sober 1994, 1996). But here too the prospects are dim. First, it is often hard to actually measure the impact of behavior on human fitness. Economic resources typically are used as a measure of fitness benefits. But even in hunter–gatherer societies this is probably too crude. For fitness probably does not vary as a linear function of economic resources. Second, in developing an evolutionary model, we need to define a space of possibilities. For the adaptationist hypothesis is that the fittest behavior within that space will become established by selection in the population. Yet how do we develop a possibility space for human behavior? To recycle an example from Kitcher, what were the realistic alternatives of, say, a man in an avunculate society? Could he defect from the avunculate, directing his resources to his wife’s children, rather than to his nieces and nephews (Alexander 1979, Kitcher 1985)? Finally, and most importantly, we face the problem that humans now act in a very different world from that in which human psychological mechanisms evolved. So measuring the impact of behavior on current fitness seems beside the point. Yet, of course, we have no access to the fitness consequences of behaviors in those ancestral environments. So the problem of testing sociobiological hypotheses is very serious.
The shift in theoretical interest from behavior to the mechanism that produces that behavior in one way makes the problem even more intractable. The Wilson program took the unit of adaptationist analysis to be a behavioral pattern. Such patterns are observable, though if they are patterns in sexual behavior, violence to outsiders, or infanticide, they will not be easy to observe. The program of evolutionary psychology adds to the existing methodological problems yet another: inferring cognitive mechanism from behavior. This is known to be difficult. But in one important respect, evolutionary psychology improves the prospects for testing sociobiological hypotheses. To the extent that evolutionary psychology makes specific architectural claims about human cognition, these are subject to direct empirical test. For example, Frank has argued that emotions function as commitment devices; they bind agents to future courses of action that will not then be in the agent’s interest, and thus tend to maximise the agent’s overall long run welfare (Frank 1988).
It is a consequence of this theory both that emotions are not under central control, and that other agents are good at using emotional signals to predict future behavior in commitment situations. These consequences are directly testable by experimental psychology. However, a positive test does not directly vindicate the adaptive hypothesis that is alleged to explain the existence of the proximate mechanism. Moreover, the relationship between adaptive function and proximate mechanism is often less direct than this example suggests. Sober and Wilson point this out in their discussion of altruism. They argue that there are good reasons to believe that humans are a group-selected species, and hence that human behavior will often be altruistic in the evolutionary sense. Its function will often be to promote the fitness of the group at some cost to the agent’s fitness. But Sober and Wilson point out that while this constrains the proximate mechanism, it does not fix it. Altruistic agents might act altruistically because of irreducibly altruistic motivations, or because selection has built in internal rewards. Altruistic behavior could be engineered either way (Sober and Wilson 1998). In short, then, despite the shift from adaptive theories of behavior to adaptive theories of psychological mechanism, the problem of the testability of sociobiology remains very serious.
4. Evolutionary Psychology and Rational Agency
Suppose the basic hypothesis of evolutionary psychology is right. Human minds are ensembles of adapted modules. What follows for folk psychology, i.e., belief/desire theories of human behavior? Many models in the social sciences depend on folk psychology, either directly or on its formalized cousin, the view that humans are rational agents that maximize their expected utility. Does evolutionary psychology drive out rational choice theory? The idea that humans are rational agents is compatible with a partially modular theory of mind, such as one along the lines of Fodor’s 1983 model. For Fodor retains a central processor with the critical role of maintaining an overall model of the agent’s environment, and using that model to guide rational action. A Fodorian agent could well be a rational maximizer; indeed, a rational economic maximizer. But evolutionary psychology in its standard formulation seems inconsistent with rational choice theory understood as a descriptive claim about human behavior. If there is no general-purpose device—if there is nothing like a central processor—it is hard to see how a rational actor model could accurately describe the pattern of human action. When humans respond to problems that were important in the environments in which our cognitive mechanisms evolved, and when they do so in environments relevantly like those ancestral environments, and when fitness consequences correlate well with individual welfare, then of course human behavior might maximize expected utility. But much human behavior is not directed at problems that were critical to our ancestors, and humans often act in environments very different from those in which psychologically modern humans evolved.
One defender of evolutionary psychology to confront this problem is Sperber. He defends the view that the mind is wholly an ensemble of modules.
But he argues that one of these modules, the metarepresentation module, can act as a surrogate general-purpose device. The function of the metarepresentation module is to guide our behavior with respect to other humans. For we can estimate their actions best by estimating their representation of the world. Since the proper function of the metarepresentation module is to represent representations, we have a derived, second hand capacity to represent anything, or at least anything that anyone else can represent (Sperber 1996). However, this suggestion does not address the issue of integration and the effective use of information to guide behavior with respect to novel aspects of our
physical environment. So it is not clear that the behavioral abilities of a Sperber agent would match those of a Fodor agent. So we may have to choose between a wholly modular theory of the mind and rational agent psychology. Perhaps, though, evolutionary psychologists should retreat from their commitment to a wholly modular theory of mental organization. For there are good reasons for thinking that a commitment to a modular view of the mind is premature. Adaptive thinking suggests modular mechanisms have a downside: they are vulnerable to exploitation in a malign world. For if they are encapsulated, they use only a fraction of the information potentially available to an agent. Yet, if our minds are the result of an arms race, they evolved in a hostile world, not merely an indifferent one. Game theoretic models of the evolution of language have a strong cooperative element; they are close to one end of a mathematical spectrum from games of pure cooperation to games of pure conflict (Skyrms 1996). Both parties in a linguistic interaction benefit from successfully communicating their intended meaning. Even if they have further, incompatible agendas, neither will succeed unless the utterance is understood. A module for resource sharing might well be manipulated to gain a better share of resources. Similarly, if there are mate choice modules that work on inflexible, prewired cues, these too will be vulnerable to mimics.
Perceptual systems exemplify a surprising truth about human mentality. The information-processing tasks implicit in much human action are much more complex and difficult than one would intuitively expect. The outputs of the visual systems are reliable representations of what is seen, yet the signals that reach the retina are typically fragmentary and equivocal. That supports the view that those tasks could only be carried out by mental organs specifically adapted for those very tasks.
This realization has generated a series of ‘poverty of the stimulus’ arguments in many branches of cognitive psychology. These attempt to show that human cognitive skills are too complex, and develop from too narrow an information basis, to be directed by a general learning mechanism. However, the ‘poverty of the stimulus’ argument does not sustain some of the central hypotheses of evolutionary psychology. For example, Cosmides and Tooby argue that we have a module of social exchange (Cosmides 1989, Cosmides and Tooby 1989, 1992). But their reasoning is the inverse of ‘poverty of the stimulus’ reasoning; it depends on the fact that we usually find certain computationally trivial reasoning tasks extraordinarily difficult, but when those tasks concern social exchange we find them easier. So humans do an easy task in one domain with less difficulty than the same task in other domains. A poverty of the stimulus argument applies in the reverse situations, those where we find a computationally complex task effortless. Determining motion from changes in apparent shape and size is computationally extraordinarily complex, yet we do it easily.
Moreover, many important problems cannot be solved by modular mechanisms. Fodor has argued convincingly that the pragmatics of language cannot be handled by a specialist device (Fodor 1983). It is one thing to know what a sentence means; it is another to know the intentions that lie behind its utterance. The latter problem is not solvable by shortcuts from a restricted database, i.e., by an encapsulated device. Everything the hearer knows is potentially relevant and potentially used in decoding the speaker’s intent. The same problem seems to arise for many of the domains of interest to evolutionary psychology. Could an encapsulated mechanism deliver reliable judgments about the probability of a prospective partner’s cheating? It is not at all obvious that there are cues that are reliable and are stable across generations. Both conditions must be met, if selection is to build a specialist mechanism to solve such problems. So the most plausible version of evolutionary psychology might be somewhat closer to Fodorian architecture; a complex of domain general and domain specific devices. If that is so, there may be ways of integrating evolutionary psychology and rational choice theory.
5. Individual Behavior and Collective Action
One central problem in the social sciences concerns the relationship between the explanation of individual action, and the explanation of collectives: of the nature and activity of cultures, classes, institutions, and the like. It is uncontroversial that there is a minimum constraint of the individual on the collective. Social facts supervene on individual facts, for social facts cannot change unless individual facts do, but not vice versa. A somewhat stronger claim is also relatively uncontroversial: social or collective causation requires individual microfoundations. So, for example, a theory of working-class collective action that depends on class consciousness owes us an account of the thoughts and actions of the individual agents whose pattern of individual action constitutes that collective action (Elster 1989, Wright et al. 1992). So one important issue about evolutionary psychology is concerned with the microfoundations it makes possible, and those that it rules out.
Here the single most central issue concerns cooperation. Human life is pervasively cooperative, and this is a fact that is notoriously hard to explain consistently with the assumption that humans are rational economic maximizers. For the prisoner’s dilemma and its relatives threaten to subvert cooperation between such agents, even though all would be better off under cooperative regimes. An array of solutions to this problem have been developed around reciprocation and iterated interaction (Axelrod 1984). These retain the assumption that individual agents act to maximize their own economic welfare, but it is far from obvious that these models are robust or realistic. For example, the famous ‘tit for tat’ strategy is by no means robust against error. If either player mistakes cooperation for defection, cooperation collapses (Sigmund 1993). So one important role for sociobiology might be to expand the space of microfoundational options for the social sciences. If Frank or Skyrms is right, in some circumstances evolved individual agents might reach the cooperate–cooperate option, even though agents that maximize their own expected utility would not (Frank 1988, Skyrms 1996). If Sober and Wilson are right, a proper regard for evolution does not commit us to the view that individual agents are designed to maximize their individual fitness (Sober and Wilson 1998). A theory of individual agency based on evolutionary psychology might, therefore, be significantly different from microfoundations based on an economic version of rational agency. However, the nature and importance of these differences remain to be elucidated.
See also: Evolution, Natural and Social: Philosophical Aspects; Evolutionary Social Psychology; Evolutionary Theory, Structure of; Evolutionism, Including Social Darwinism; Psychological Development: Ethological and Evolutionary Approaches; Sociobiology: Overview
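The fragility of ‘tit for tat’ under error, noted in Section 5, can be illustrated with a small simulation. What follows is only a sketch under stated assumptions and is not taken from Sigmund (1993) or Axelrod (1984): both players use tit for tat, each player misperceives the other's last move with a fixed probability, and the names flip, mutual_cooperation_rate, and error_rate are hypothetical.

```python
import random


def flip(move):
    """Return the opposite move: "C" (cooperate) becomes "D" (defect) and vice versa."""
    return "D" if move == "C" else "C"


def mutual_cooperation_rate(rounds, error_rate, seed=0):
    """Fraction of rounds in which two tit-for-tat players both cooperate.

    Assumption for illustration: after every round each player perceives the
    other's move wrongly with probability error_rate.
    """
    rng = random.Random(seed)
    # Both players start out believing that the other has cooperated.
    seen_by_a = seen_by_b = "C"
    both_cooperate = 0
    for _ in range(rounds):
        # Tit for tat: repeat whatever you believe the other player did last.
        move_a, move_b = seen_by_a, seen_by_b
        if move_a == "C" and move_b == "C":
            both_cooperate += 1
        # Perception step, possibly corrupted by error.
        seen_by_a = flip(move_b) if rng.random() < error_rate else move_b
        seen_by_b = flip(move_a) if rng.random() < error_rate else move_a
    return both_cooperate / rounds


if __name__ == "__main__":
    for rate in (0.0, 0.01, 0.05):
        print(rate, mutual_cooperation_rate(10_000, rate))
```

Without error the two players cooperate in every round. Under this symmetric error model a single misperception starts a chain of retaliation that tit for tat cannot repair by itself, so over a long run mutual cooperation becomes the exception rather than the rule.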
Bibliography
Alexander R D 1979 Darwinism and Human Affairs. Washington University Press, Seattle, WA
Axelrod R 1984 The Evolution of Cooperation. Basic Books, New York
Barkow J H, Cosmides L, Tooby J (eds.) 1992 The Adapted Mind: Evolutionary Psychology and the Generation of Culture. Oxford University Press, Oxford, UK
Baron-Cohen S 1995 Mindblindness: An Essay on Autism and the Theory of Mind. MIT Press, Cambridge, MA
Buss D M 1994 The Evolution of Desire: Strategies of Human Mating. Basic Books, New York
Cosmides L 1989 The logic of social exchange: has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition 31: 187–276
Cosmides L, Tooby J 1989 Evolutionary theory and the generation of culture, part II. Case study: A computational theory of social exchange. Ethology and Sociobiology 10: 51–97
Cosmides L, Tooby J 1992 Cognitive adaptations for social exchange. In: Barkow J H, Cosmides L, Tooby J (eds.) The Adapted Mind. Oxford University Press, Oxford, UK, pp. 163–227
Elster J 1989 Marxism, functionalism and game theory: The case for methodological individualism. In: Callinicos A (ed.) Marxist Theory. Oxford University Press, Oxford, UK
Fodor J A 1983 The Modularity of Mind. MIT Press, Cambridge, MA
Frank P 1988 Passion Within Reason: The Strategic Role of the Emotions. W W Norton, New York
Harvey P H, Pagel M D 1991 The Comparative Method in Evolutionary Biology. Oxford University Press, Oxford, UK
Hull D L, Ruse M (eds.) 1998 Philosophy of Biology. Oxford University Press, Oxford, UK
Kennedy M, Spencer H G, Gray R D 1996 Hop, step, and gape: Do the social displays of the Pelecaniformes reflect phylogeny? Animal Behavior 51: 273–91
Kitcher P 1985 Vaulting Ambition: Sociobiology and the Quest for Human Nature. MIT Press, Cambridge, MA
Levins R, Lewontin R C 1985 The Dialectical Biologist. Harvard University Press, Cambridge, MA
Lewontin R C 1978 Adaptation. Scientific American 239: 156–69
Lewontin R C, Rose S, Kamin L J 1984 Not In Our Genes. 1st ed. Penguin, London
Orzack S H, Sober E 1994 Optimality models and the test of adaptationism. American Naturalist 143: 361–80
Orzack S H, Sober E 1996 How to formulate and test adaptationism. The American Naturalist 148: 202–10
Plotkin H 1997 Evolution in Mind: An Introduction to Evolutionary Psychology. Penguin, London
Sigmund K 1993 Games of Life: Explorations in Ecology, Evolution and Behavior. Oxford University Press, Oxford, UK
Skyrms B 1996 The Evolution of the Social Contract. Cambridge University Press, Cambridge, UK
Sober E, Wilson D S 1998 Unto Others: The Evolution and Psychology of Unselfish Behavior. Harvard University Press, Cambridge, MA
Sperber D 1996 Explaining Culture: A Naturalistic Approach. Blackwell, Oxford, UK
Symons D 1979 The Evolution of Human Sexuality. Oxford University Press, Oxford, UK
Tooby J, Cosmides L 1992 The psychological foundations of culture. In: Barkow J H, Cosmides L, Tooby J (eds.) The Adapted Mind. Oxford University Press, Oxford, UK, pp. 19–136
Whiten A W, Byrne R W (eds.) 1997 Machiavellian Intelligence II: Extensions and Evaluations. Cambridge University Press, Cambridge, UK
Wilson E O 1975 Sociobiology: The New Synthesis. Belknap Press of Harvard University Press, Cambridge, MA
Wilson E O 1978 On Human Nature. Bantam Books, Toronto, ON
Wright E O, Levine A, Sober E 1992 Reconstructing Marxism. Verso Press, London
K. Sterelny
Sociocybernetics
The term cybernetics derives from the Greek word for steersman. It can be translated roughly as the art of steering, and refers here to a set of related approaches, including especially general systems theory, and first- and second-order cybernetics. It can offer some interesting concepts and models to social science, and may also in that sense have a steering function. Sociocybernetics consists of the applications of first-order and especially second-order cybernetics to the social sciences, and the further development of the latter (Buckley 1988). First-order cybernetics was developed in the 1940s, outside of the social sciences, whereas second-order cybernetics was developed to remedy the shortcomings of first-order cybernetics when applied to biology and the social sciences. For reasons of space, the essentials of first-order cybernetics and its main concepts—system boundaries; the distinction between systems, subsystems, and suprasystems; circular causality; positive and negative feedback loops; simulation—cannot be dealt with here (see Geyer and van der Zouwen 1991).
1. Modern (Second-order) Cybernetics
Second-order cybernetics originated in the early 1970s. The term was coined by Heinz von Foerster (1970) in ‘Cybernetics of cybernetics.’ He defined first-order cybernetics as the cybernetics of observed systems, and second-order cybernetics as the cybernetics of observing systems. Indeed, second-order cybernetics explicitly includes the observer(s) in the systems to be studied. These are generally living systems, ranging from simple cells to human beings, rather than the control systems for inanimate technological devices studied by most first-order cybernetics. It could be said to have a biological approach, or at least a biological basis. Apart from von Foerster, several other authors have given concise definitions of the differences between first-order and second-order cybernetics: the purpose of a model vs. the purpose of the modeler (Pask), controlled systems vs. autonomous systems (Varela), the interaction among variables in a system vs. interaction between the observer and the system observed (Umpleby), and theories of social systems vs. theories of the interaction between ideas and society (Umpleby). This latter difference reflects Parsons’s approach as that of a first-order systems theorist interested in system stability and system maintenance, and Luhmann’s as that of a second-order cybernetician interested more in change and morphogenesis.
2. The Usefulness of Second-order Cybernetic Concepts For Sociology
The main concepts of second-order cybernetics—self-organization, self-reference, self-steering, autocatalysis, and autopoiesis—all start with ‘self’ (or ‘auto’), possibly reflecting accelerating social change, whereas first-order cybernetics concepts do not point specifically to either the system or its environment.
2.1 Self-organization
Contemporary scientific interest in many disciplines centers on the emergent, self-organizing properties of complex aggregates, especially in biology and social science, and has stimulated second-order cybernetics. First-order cybernetics, though certainly aware of self-organizing systems, did not pursue this line of inquiry. Ashby (1952, 1956), for example, also stressed self-organization. Recent developments in cognitive science—an interdisciplinary effort of neuroscience, artificial intelligence (AI), linguistics, philosophy, and cognitive psychology—demonstrate the importance of self-organization. Though the science only began in the 1970s, one can already distinguish two schools: cognitivism and connectionism.
Cognitivism and cognitive technology furthered the development of AI. Cognitivism conceived mental processes as manipulations according to the rules of deductive logic, as localized and serial information processing of symbols supposedly representing external reality (Varela et al. 1993, pp. 4–5). Thus, it seems to be close to first-order cybernetics. It was the mainstream paradigm in cognitive science until its drawbacks became more apparent in the 1970s.
(a) At some stage, symbolic information processing encounters the ‘Von Neumann bottleneck’: sequentially applied rules become problematic in case of large numbers of sequential operations, e.g., weather forecasting or image analysis.
(b) Symbolic information processing is not distributed but localized. Local malfunctioning of some symbols or rules therefore tends to result in overall malfunctioning, not having the resilience towards disturbances of distributed processing.
Connectionism, contrary to cognitivism, incorporates many second-order cybernetics views: It is explicitly bottom-up rather than top-down, eschews abstract symbolic descriptions, stresses emergence and self-organization, and assumes simple, ‘stupid,’ neuron-like components that develop cognitive capacities when appropriately connected. The intelligence is in the structure, in the connections made—hence connectionism—and not in the components. Hebb’s 1949 rule occupies a central place.
It states that learning is based on postulated changes in the brain resulting from correlated activity between neurons. If two neurons are active together, their connection is strengthened; if not, it is weakened. Thus, history—always a history of transformations—is incorporated. Simulations of a simple network of neurons according to Hebb’s rule demonstrate that pattern recognition is possible, after a learning phase in which some of the connections are strengthened or weakened, provided the number of patterns presented does not exceed about 15 percent of the participating neurons. The components of neural networks do not need an external processing unit which guides their operations. They work locally according to local rules, but global cooperation emerges when the states of all participating neurons reach a mutually satisfactory state. This passage from local rules to global coherence is the essence of self-organization—also denoted as emergent or global properties, network dynamics, nonlinear networks, complex systems, and so forth. Once one looks for it, emergence abounds. In different domains, like vortices and lasers, chemical oscillations, genetic networks, population genetics, immune networks, ecology and geophysics, self-organizing networks give rise to new properties.
The implications of these developments in cognitive science are potentially interesting for sociology, because of the possible analogies with self-organization in human societies:
(a) Autonomous systems, though computer-simulated in connectionism, give meaning to their interactions on the basis of their own history, rather than on the basis of the intentions of the programmer—or should one say a manipulating environment?
(b) Neural networks, and possibly human networks as well, produce emergent phenomena resulting from both simultaneous processes (the emergent pattern itself arises as a whole) and sequential ones (participating components have to engage in back-and-forth activity to produce the pattern).
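The kind of simulation described above, in which a network trained by Hebb's rule recognizes patterns through purely local updates, can be sketched in a few lines. This is a minimal illustration and not a reconstruction of any particular published simulation: it assumes a Hopfield-style network of +1/-1 units, and the function names train_hebbian and recall, the network size, and the noise level are all hypothetical choices. The number of stored patterns (5 for 100 units) is kept well below the roughly 15 percent limit mentioned in the text.

```python
import numpy as np


def train_hebbian(patterns):
    """Hebb's rule: the weight between two units grows when they agree across
    the stored patterns and shrinks when they disagree."""
    n_units = patterns.shape[1]
    weights = patterns.T @ patterns / n_units
    np.fill_diagonal(weights, 0.0)  # no unit is connected to itself
    return weights


def recall(weights, state, sweeps=10):
    """Update units one at a time; each unit only looks at its own weighted inputs."""
    state = state.copy()
    for _ in range(sweeps):
        for i in range(len(state)):
            state[i] = 1 if weights[i] @ state >= 0 else -1
    return state


rng = np.random.default_rng(0)
n_units, n_patterns = 100, 5
patterns = rng.choice([-1, 1], size=(n_patterns, n_units))
weights = train_hebbian(patterns)

# Corrupt the first stored pattern by flipping 10 of its 100 units.
noisy = patterns[0].copy()
flipped = rng.choice(n_units, size=10, replace=False)
noisy[flipped] *= -1

restored = recall(weights, noisy)
# Overlap of 1.0 means the stored pattern was recovered exactly.
overlap = float(restored @ patterns[0]) / n_units
print("overlap with stored pattern:", overlap)
```

No central controller appears anywhere in the recall loop: every unit applies the same local rule, and the recovered pattern is a property of the network as a whole, which is the sense of self-organization at issue in this section.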
2.2 Self-reference
Self-reference is typical of human beings, and possibly apes, both on the individual and the group level. It is not a concept in first-order cybernetics, which—as Norbert Wiener ([1948] 1961, 1956) stressed—concerns itself with the commonalities rather than the differences between humans, animals, and machines. Self-reference is defined here in its second-order cybernetics meaning, where the (individual or social) system collects information about its own functioning, which in turn can influence that functioning. Minimal requirements are self-observation, self-reflection, and some degree of freedom of action. One of the main characteristics of individual and social systems is indeed their potential for self-referentiality in this sense. The self-knowledge accumulated by the system not only affects the structure and operation of that system, but also implies that (self-referential) feedback loops exist between parts of external reality and models and theories about these parts of reality. Whenever social scientists systematically accumulate and publish new knowledge about the structure and functions of their society or subgroups within it—in principle available also for those to whom that knowledge pertains—such knowledge will often be invalidated, because the research subjects may react to this knowledge in ways that falsify the analyses or forecasts made. In this respect, social systems are different from many other systems, including most biological ones: There is a clearly two-sided relationship between the system’s self-knowledge and its behavior and structure. While biological systems also show goal-oriented behavior of actors, self-organization, self-reproduction, adaptation, and learning, only psychological and social systems arrive systematically, through experiment and reflection, at knowledge about their own structure and operating procedures, with the obvious aim to improve these—both on the microlevel of the individual, as in psychoanalysis or other self-referential activities, and on the macrolevel of world society, as in planning international trade or optimal distribution of available resources.
The usual social science examples of self-referential behavior consist of self-fulfilling and self-defeating prophecies. Henshel (1990), for example, has studied serial self-fulfilling prophecies, where the accuracy of earlier predictions, themselves influenced by the self-fulfilling mechanism, impacts upon the accuracy of the subsequent predictions. However, in most empirical research, self-referential behavior does not loom large, perhaps because it is largely survey research, which offers less opportunity for self-referential behavior than depth interviews. These concentrate more on the ‘why’ than the ‘what’ of people’s opinions, and thus tend to elicit self-referential remarks.
2.3 Self-steering
The futility of large-scale planning efforts has led to the realization that individuals and organizations are largely self-steering. While recognizing the usefulness of efforts to steer societies, a cost–benefit analysis of intensive large-scale steering efforts often turns out to be negative: Intensive steering implies intensive social change, i.e., a long period of increased uncertainty, with also an increased chance for changing planning preferences and for conflicts between different emerging planning paradigms during such a period. Aulin (1982, 1988) tried to answer two basic questions in this respect.
(a) Should one opt for the ‘katascopic’ or the ‘anascopic’ view of society, for top-down planning of individual and group behavior to maximize societal survival chances, or should the insight of actors at every level, including the bottom one, be increased and therewith their competence to handle their environment more effectively and engage more successfully in goal-seeking behavior?
(b) What should be the role of science, especially the social sciences, in view of this choice: should it deliver useful knowledge for an improved centralized steering of the behavior of social systems and individuals, or should it strive to improve the competence of actors at the grass-roots level, so that these actors can steer themselves and their own environment with better results?
Aulin followed a cybernetic line of reasoning that argues for nonhierarchical forms of steering. Ashby’s Law of Requisite Variety indeed implies a law of requisite hierarchy if only the survival of the system is considered, i.e., if the regulatory ability of the regulators is assumed constant. However, the need for hierarchy decreases if this regulatory ability itself improves—which is indeed the case in advanced industrial societies, with their well-developed productive forces and correspondingly advanced distribution apparatus (the market mechanism). Since human societies are not simply self-regulating systems, but self-steering systems aiming at an enlargement of their domain of self-steering, advanced industrial societies have the possibility nowadays to combine societal governability with ever less control, less centralized planning, and less concentration of power. When moving from a work-dominated society to an information-dominated one, less centralized planning is even a prerequisite. The intellectual processes dealing with information are self-steering, not only self-regulating, and cannot be externally steered by definition. Consequently, there should be no excessive top-down planning, and science should help individuals in their self-steering efforts, and certainly should not get involved in the maintenance of hierarchical power systems.
Of course, there are types of systems within a society that can indeed be planned, governed, and steered, simply because such systems have been specifically designed to exemplify the concept of the control paradigm of first-order cybernetics. However, even in first-order cybernetics control does not necessarily imply hierarchy, as even the simple case of the thermostat demonstrates. Thus, although there certainly still is a limited place for organizations that exemplify a more or less hierarchical control paradigm, modern, complex, multigroup society in its entirety, conceptualized as a matrix in which such systems grow and thrive, can never be of this type. If one investigates a certain system with a research methodology based on the control paradigm, the results necessarily have a conservative bias. Changes of the system as such are almost prevented by definition. De Zeeuw (1988) considers a different methodological paradigm necessary if one wants to support fundamental social change and prevent ‘postsolution’ problems.
It is based on a multiple-actor design, does not strive towards isolation of the phenomena to be studied, and likewise does not demand a separation between a value-dependent and a value-independent part of the research outcomes.
2.4 Autocatalysis and Cross-catalysis
Laszlo (1987, 1988) describes two varieties of (chemical) catalytic cycles: auto-catalytic cycles, in which the product of a reaction catalyzes its own synthesis, and cross-catalytic cycles, where two different products or groups of products catalyze each other’s synthesis. An example of a model of a cross-catalytic cycle, developed by Prigogine and colleagues, is the Brusselator:
(1) $A \rightarrow X$
(2) $B + X \rightarrow Y + D$
(3) $2X + Y \rightarrow 3X$
(4) $X \rightarrow E$
With X and Y as intermediate molecules, there is an overall sequence in which A and B become D and E. Step 3 can be seen as autocatalysis, while steps 2 and 3 in combination describe cross-catalysis: X helps to produce Y, while Y helps to produce more X. Autocatalytic sets ($A \rightarrow B \rightarrow C \rightarrow \ldots \rightarrow Z \rightarrow A$) bootstrap their own evolution, provided the complexity of interactions is rich enough. The system then changes from a subcritical to a supercritical state, and autocatalysis follows. Kauffman (1991, 1992) even uses the concept to explain the origins of life from a ‘primal soup’ of simple chemical elements as an inevitable production of order, rather than as a unique and extremely unlikely historical accident. Simple chemical laws coupled with the presence of a sufficient number of frequently interacting elements produce ever more complex elements. These have new characteristics, and often turn out to be part of new catalytic processes at higher levels of molecular complexity, processes which in turn boost the emergence of still higher levels of complexity. Along similar lines, Swenson (1991) maintains that, in spite of the Second Law of Thermodynamics, ‘the world is in the order production business.’ The economist Arthur (1990), collaborating closely with Kauffman at the Santa Fe Institute, applied the concept of autocatalytic sets to the economy, which also bootstraps its own evolution as it grows more complex over time. Beyond a certain critical threshold, phase transitions occur. Stagnant developing countries can enter the take-off stage when their economy has diversified sufficiently; increased trade between two countries in a subcritical state can similarly produce a more complex and interwoven economy which becomes supercritical and explodes outward. Catalytic effects may also operate in ‘negative’ phase transitions, where critical thresholds of violence are passed, as in Northern Ireland or Bosnia.
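The four Brusselator steps in Section 2.4 can be turned into rate equations and integrated numerically. The following is a minimal sketch, not part of the original article. It assumes mass-action kinetics with every rate constant set to 1 and with the concentrations of A and B held fixed, which gives $dX/dt = A + X^2 Y - (B+1)X$ and $dY/dt = BX - X^2 Y$. The function names, the step size, and the parameter values are illustrative choices only.

```python
"""Toy integration of the Brusselator rate equations (illustrative sketch only)."""


def brusselator_rhs(x, y, a, b):
    """Mass-action rates for X and Y, ASSUMING all rate constants equal 1."""
    dx = a + x * x * y - (b + 1.0) * x  # steps 1 and 3 add X; steps 2 and 4 remove X
    dy = b * x - x * x * y              # step 2 adds Y; step 3 removes Y
    return dx, dy


def simulate(a, b, x0, y0, dt=0.01, steps=5000):
    """Integrate with the classical fourth-order Runge-Kutta scheme."""
    xs, ys = [x0], [y0]
    x, y = x0, y0
    for _ in range(steps):
        k1x, k1y = brusselator_rhs(x, y, a, b)
        k2x, k2y = brusselator_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, a, b)
        k3x, k3y = brusselator_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, a, b)
        k4x, k4y = brusselator_rhs(x + dt * k3x, y + dt * k3y, a, b)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        xs.append(x)
        ys.append(y)
    return xs, ys


def late_amplitude(xs):
    """Peak-to-trough swing of X over the second half of the run."""
    tail = xs[len(xs) // 2:]
    return max(tail) - min(tail)


if __name__ == "__main__":
    # Hypothetical parameter values: A = 1 with a low and a high value of B.
    for b in (1.5, 3.0):
        xs, _ = simulate(a=1.0, b=b, x0=1.2, y0=b)
        print(f"B = {b}: late swing in X = {late_amplitude(xs):.3f}")
```

Under these assumptions the steady state is $X = A$, $Y = B/A$. For $B < 1 + A^2$ a small disturbance dies out, while for $B > 1 + A^2$ the concentrations settle into sustained oscillations, which is one simple way to see a system pass from a subcritical to a supercritical regime.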
2.5 Autopoiesis
Autopoiesis, or ‘self-production,’ is a concept introduced in the 1970s by the biologists Maturana and Varela to differentiate the living from the nonliving. An autopoietic system was defined as a network of inter-related component-producing processes such that the components in interaction generate the same network that produced them. Although they considered the concept applicable only in biology, and not in the social sciences, an interesting ‘theory transfer’ was made by Luhmann (1988). He defended the quite novel thesis that, while social systems are self-organizing and self-reproducing systems, they do not consist of individuals (or roles or even acts, as commonly conceptualized) but exclusively of communications. When generalizing the usages of autopoiesis, developed while studying biological systems, to make it also applicable to social systems, the biology-based theory of autopoiesis should therefore be expanded into a more general theory of self-referential autopoietic systems. Social and psychic systems are based upon another type of autopoietic organization than living systems: namely on communication and consciousness, respectively, as modes of meaning-based reproduction. While Luhmann thus views communications instead of actions as the elementary units of social systems, the concept of action admittedly remains necessary to ascribe certain communications to certain actors. Thus, the chain of communications can be viewed as a chain of actions. This enables social systems to communicate about their own communications and to choose their new communications, i.e., to be active in an autopoietic way. Such a general theory of autopoiesis has important consequences for the epistemology of the social sciences: it draws a clear distinction between autopoiesis and observation, but also acknowledges that observing systems are themselves autopoietic systems, subject to the same conditions of autopoietic self-reproduction as the systems they are studying. The theory of autopoiesis thus belongs to the class of global theories, i.e., theories that point to a collection of objects to which they themselves belong. Classical logic cannot really deal with this problem, and it will therefore be the task of a new systems-oriented epistemology to develop and combine two fundamental distinctions: between autopoiesis and observation, and between external and internal (self-)observation. Classical epistemology searches for the conditions under which external observers arrive at the same results, and does not deal with self-observation. Consequently, societies cannot be viewed, in this autopoietic perspective, as either observing or observable. Within a society, all observations are by definition self-observations.
3. The Science of Complexity: A Convergence of Paradigms
Compared with simpler societies, complex modern ones are highly dynamic and interactive. Consequently, they change at accelerated rates, and are generally in a far-from-equilibrium situation. According to Prigogine and Stengers (1984) nonlinear relationships obtain in systems that are far from equilibrium, where relatively small inputs can trigger massive consequences. At such ‘revolutionary moments’ or bifurcation points, chance influences determinism, and the direction of change is inherently impossible to predict: a disintegration into chaos, or a ‘spontaneous’ leap to a higher level of order or organization—a so-called ‘dissipative structure,’ which requires more energy to sustain it, as compared with the simpler structure it replaces. In stressing this possibility for self-organization, for ‘order out of chaos,’ Prigogine comes close to the
concept of autopoiesis. In modern societies, the mechanistic and deterministic Newtonian world view, with its emphasis on stability, order, uniformity, equilibrium, and linear relationships between or within closed systems, is being replaced by a new paradigm, more in line with today’s accelerated social change, which stresses disorder, instability, diversity, disequilibrium, nonlinear relationships between open systems, morphogenesis, and temporality. Social scientists, often still thinking in terms of linear causality, would be well advised to try out the explanatory powers of Prigogine’s conceptual vocabulary (fluctuations, feedback amplification, dissipative structures, bifurcations, (ir)reversibility, auto- and cross-catalysis, self-organization, and so forth) on the phenomena they study, as well as the concepts and methods of second-order cybernetics in general. It is reassuring in this novel and therefore risky field of research that there seems to be a convergence of paradigms, subsumed here under the name of second-order cybernetics. However, it might also be designated by other names, such as cognitive science, general systems theory, artificial intelligence, artificial life, or perhaps most aptly, as Prigogine suggested, ‘the emerging sciences of complexity.’
4. Methodological Problems of Empirical Research in First- and Second-order Cybernetics: A Bridge Too Far?
If one realizes that the social sciences mainly study self-organizing, self-referential, autopoietic systems, with their own strategies and expectations, with intertwining processes of emergence and adaptation, then one is confronted with one of the core problems of sociology, economics, and other social sciences: how to make a science out of studying imperfectly smart agents exploring their way into an essentially infinite space of possibilities of which they, let alone the social scientists researching them, are not even fully aware. There is indeed quite a methodological problem here. It is already very difficult to apply the principles and methods (e.g., feedbacks and nonlinearities) of first-order cybernetics to empirical social research. It is nearly impossible to incorporate a second-order cybernetics approach in one’s research design. Second-order cybernetics may be a bridge too far, given the research methodology and the mathematics presently available. Applying first-order cybernetics principles in empirical research already poses heavy demands on the data sets and the methods of analysis: every feedback ($X_t \rightarrow Y \rightarrow X_{t+1}$), every interaction between variables [$Z \rightarrow (X \rightarrow Y)$], and every nonlinear equation ($Y = cX^2 + bX + a$), let alone nonlinear differential equation ($Y' = cY^2 + bY + a$), demands extra parameters to be
estimated, and quickly exhausts the information embedded in the data set. Admitting on top of that the second-order notions that research subjects can change through being investigated, and may even reorganize themselves on the basis of knowledge they acquire during the research, exceeds the powers of analysis of even the most sophisticated methodologists: it equals the effort to solve an equation with at least three unknowns. How can one obtain reliable data within such a framework, where nothing is constant and everything is on the move, let alone formulate policy-relevant decisions? How can one still forecast developments when at best retrospective analysis of how a new level of complexity has emerged seems possible? These problems are far from solved, and much work lies ahead before hypotheses derivable from second-order cybernetics will be fully testable, at least in a classical Popperian sense. However, testing hypotheses pertaining to the nonlinear dynamical regimes stressed by complexity theory and chaos studies may demand different criteria, invalidating the classical concept of a theory as a mathematical model yielding stable, consistent, and complete predictions of a system’s behavior. Nevertheless, the opportunities offered by the second-order cybernetic paradigm to present a truly realistic analysis of the complex adaptive behavior of interacting groups of agents seem too good to pass up. There is certainly a challenge here, for theorists and methodologists alike.
See also: Functionalism, History of; Functionalism in Sociology; Structure: Social; System: Social; Theory: Sociological
Bibliography
Arthur W B 1990 Positive feedbacks in the economy. Scientific American 262: 92–9
Ashby W R 1952 Design for a Brain: The Origin of Adaptive Behavior. Wiley, New York
Ashby W R 1956 An Introduction to Cybernetics. Chapman & Hall, London
Aulin A 1982 The Cybernetic Laws of Social Progress: Towards a Critical Social Philosophy and a Criticism of Marxism. Pergamon, Oxford, UK
Aulin A 1988 Notes on the concept of self-steering. In: Geyer F, van der Zouwen J (eds.) Sociocybernetic Paradoxes: Observation, Control and Evolution of Self-steering Systems. Sage, London, pp. 100–18
Buckley W 1988 Society: A Complex Adaptive System. Gordon and Breach, London
De Zeeuw G 1988 Social change and the design of enquiry. In: Geyer F, van der Zouwen J (eds.) Sociocybernetic Paradoxes: Observation, Control and Evolution of Self-steering Systems. Sage, London, pp. 131–44
Geyer F, van der Zouwen J 1991 Cybernetics and social science: Theories and research in sociocybernetics. Kybernetes 20(6): 81–92
Hebb D O 1949 Organization of Behavior. Wiley, New York
Henshel R L 1990 Credibility and confidence loops in social prediction. In: Geyer F, van der Zouwen J (eds.) Self-referencing in Social Systems. Intersystems Publications, Salinas, CA, pp. 31–58
Kauffman S A 1991 Antichaos and adaptation. Scientific American 265: 78–84
Kauffman S A 1992 Origins of Order: Self-organization and Selection in Evolution. Oxford University Press, Oxford, UK
Laszlo E 1987 Evolution: The Grand Synthesis. Shambala, Boston
Laszlo E 1988 Systems and societies: The basic cybernetics of social evolution. In: Geyer F, van der Zouwen J (eds.) Sociocybernetic Paradoxes: Observation, Control and Evolution of Self-steering Systems. Sage, London, pp. 145–72
Luhmann N 1988 The autopoiesis of social systems. In: Geyer F, van der Zouwen J (eds.) Sociocybernetic Paradoxes: Observation, Control and Evolution of Self-steering Systems. Sage, London, pp. 172–92
Prigogine I, Stengers I 1984 Order out of Chaos: Man’s New Dialogue with Nature. Flamingo, London
Swenson R 1991 End-directed physics and evolutionary ordering: Obviating the problem of the population of one. In: Geyer F (ed.) The Cybernetics of Complex Systems: Self-Organization, Evolution, and Social Change. Intersystems Publications, Salinas, CA, pp. 41–60
Varela F, Thompson E, Rosch E 1993 The Embodied Mind: Cognitive Science and Human Experience, 3rd edn. MIT Press, Cambridge, MA
Von Foerster H 1970 Cybernetics of cybernetics. Paper delivered at 1970 annual meeting of the American Society for Cybernetics. Washington, DC
Von Foerster H 1974 On constructing a reality. In: von Foerster H (ed.) Observing Systems. Intersystems Publications, Salinas, CA
Wiener N 1956 The Human Use of Human Beings: Cybernetics and Society. Doubleday, Garden City, NY
Wiener N [1948] 1961 Cybernetics, or Control and Communication in the Animal and the Machine. MIT Press, Cambridge, MA
F. Geyer
Socioeconomic Status and Health
Observations of an inverse association between socioeconomic status (SES) and risk of poor health and death are not new, extending back to the twelfth century (Antonovsky 1967). In medieval Europe, Paracelsus noted unusually high rates of disease in miners. In 1790, Johann Peter Frank, one of the founders of the public health movement, wrote ‘The people’s misery, mother of disease.’ Daniel Defoe in his Journal of the Plague Year echoed similar sentiments when he opined that ‘… the misery of that time lay upon the poor who, being infected, had neither food or physic, neither physician or apothecary to assist them, or nurse to attend.’ By the nineteenth century, systematic investigations were being conducted by Villermé into the relationship
between rent levels of areas and mortality in Paris (Susser et al. 1985). In 1848, Virchow reported on the relationship between poor living conditions and typhus in Upper Silesia (Rather 1988). In England, Farr examined differences in mortality by occupation (Rosen 1993), while Engels (1848) deplored the impact of new working conditions on the health of the poor as a result of the industrial revolution in England. Since 1850, a rich scientific literature has developed that further extends these earlier observations that link socioeconomic position with health, and in recent years there has been an exponential increase in scientific work on this topic. As a prelude to summarizing this literature, it is first necessary to briefly discuss the ways in which socioeconomic status or position has been conceptualized in this research tradition. A variety of terms have been used in the epidemiologic literature in this area, including social class, social stratification, social inequality, social status, and socioeconomic status (Krieger et al. 1997). To a large extent these terms reflect different historical, conceptual, and disciplinary roots. More extensive discussions of the details and the potential advantages and disadvantages of various approaches to measurement have been presented in reviews by various authors (cf. Krieger et al. 1997). It should be noted that in Europe, and particularly the United Kingdom, occupational measures of social class have predominated, whereas in North America there has been more focus on individual or aggregate measures of socioeconomic status derived from education, income, or occupation. In what follows, the term ‘socioeconomic position’ (SEP) will be used generically to refer to the various measures that have been used.
1. The Relationship Between Socioeconomic Position and Health Figure 1 shows the association between relative risk of death and household income, after adjustment for age, sex, and household size (Wolfson et al. 1999) in a large sample of households in the United States. The figure also shows the relative distribution of levels of household income in the population. There are several important aspects of this relationship. First, risk of death, relative to mean household income, decreases monotonically with increasing income. Second, while risk of death increases monotonically with decreasing income, the greatest elevations of risk relative to mean income are found in approximately the bottom third of the household income distribution. This monotonic but nonlinear relationship between socioeconomic position and mortality risk or life expectancy has been found using many different measures of SEP and in many different countries. While a great deal of attention has been devoted to the monotonic nature of this risk relationship, much less attention has been devoted to the striking nonlinearity in this association.
Figure 1 Plot of risk relative to mean income (modified from Wolfson et al. 1999)
Figure 2 Age-adjusted death rates for selected causes for adults 25–64 years of age, by education level and sex: US 1995. Note: Injuries include homicides, suicides, unintentional injuries and deaths resulting from adverse effects of medical procedures. Source: Pamuk et al. 1998
The association between SEP and health outcomes is not restricted to any particular age group, although some evidence suggests it is somewhat attenuated during late adolescence, early adulthood, and later old age. For example, Ericson et al. (1984) found in Sweden that lower parental SEP was associated with higher rates of perinatal mortality, prematurity, low birth weight, and being born small for gestational age; and rates of ‘stunting’ (below the 5th percentile in
height) were elevated in children born in families living below the poverty level in the United States (USDHHS 1986). Wolfson et al. (1993), using Canadian data, showed important differences in survival after age 65 based on income levels just before retirement, and the effects of educational level on other health outcomes continue into old age (Guralnik et al. 1993). Lower SEP is associated with increased risk for most diseases and health outcomes. Figure 2 shows the relationship between level of education and rates of death from chronic diseases, injuries, and communicable disease for men and women in the United States (Pamuk et al. 1998). In their compilation of findings from the Registrar General’s Decennial Supplement for England and Wales, Susser et al. (1985) noted that lower occupational class was related to higher rates of death from malignant neoplasms, endocrine, nutritional and metabolic diseases, diseases of the blood-forming organs, infective and parasitic diseases, diseases of the nervous system and sense organs, diseases of the circulatory system, diseases of the respiratory system, diseases of the digestive system, diseases of the genitourinary tract, diseases of the musculoskeletal system and connective tissue, congenital anomalies, and accidents, poisoning, and violence. Even for those outcomes where there is a positive relationship between SEP and disease risk, such as breast cancer, survival is generally worse for those at lower levels of SEP. The association between SEP and health outcomes is not fixed in time, with some evidence for increasing stratification of health by SEP (Black et al. 1980, Pappas et al. 1993). For some health outcomes, there have also been changes in the direction of the association (Morgenstern 1980). These observations of changes in the magnitude and direction of the SEP–health association, as well as the substantial international variation (Kunst et al. 1998), are important because they suggest that informed interventions could possibly reduce socioeconomic inequalities in health.
2. Explanations of Inequalities in Health
Typical explanations for SEP-related variations in health status focus on the role of medical care, behavioral risk factors such as smoking or alcohol consumption, material standards of living, or the impact of poorer health on SEP (Black et al. 1980). While there are likely to be variations in access to and quality of medical care by SEP, with the magnitude of the difference varying widely by country, medical care is not likely to be a major cause of variations in health by SEP. The argument in support of this assertion is based, among other things, on observations that the introduction of widespread availability of medical care in some countries has not lowered SEP variation in health and that variations in health status by SEP are found for conditions that are not ‘amenable to medical care.’ The differential distribution of high-risk behaviors among lower SEP populations also does not seem to fully explain the SEP–health association. Studies that have information on such individual-level risk factors have generally been unsuccessful in completely explaining SEP gradients in health (Marmot et al. 1978).
Furthermore, even if statistical adjustment for individual high-risk behaviors did reduce or eliminate SEP-related gradients in health, SEP-related factors are likely to be antecedent, causal determinants of high-risk behavior. Some have interpreted SEP-related gradients in health as reflecting the impact of poor health on income and wealth. In this view, people end up less wealthy because they were, first, less healthy. While it is undoubtedly true that chronic illness is in most contexts associated with declines in income and wealth, most researchers have concluded that these selection effects account for a relatively small amount of the association between SEP and health. For example, downward selection effects cannot easily account for the pervasive impact of level of education on health outcomes that occur long after the completion of education, nor for the impact of the SEP of one family member on the health of another, nor for Wolfson’s finding of strong variations in postretirement survival by preretirement income in Canadian men who had no declines in income during the years before they retired (Wolfson et al. 1993). SEP is generally related to material standard of living, although the magnitude of the association clearly varies between countries and over time. Lower SEP is generally associated with poorer housing, less nutritious diet, greater exposure to adverse environmental substances, poorer working conditions, greater exposure to crime and dangerous environments, and a wide variety of other ‘material’ and psychosocial factors. It seems likely that any comprehensive understanding of the relationship between lower SEP and poorer health will have to simultaneously consider a wide variety of these pathways, but the dominant analytic strategy has been to focus on the primacy of one pathway or another.
3. Future Directions It is unlikely that any single set of explanations will account for the observed relationships between SEP and health across time, places, and health outcomes. Undoubtedly, any full understanding will involve many of the above pathways, and perhaps others. Current efforts are focused on a number of activities aimed at clarifying our understanding of this complex area of investigation.
3.1 Individual vs. Area SEP
Many studies that lack individual measures of SEP have used measures that reflect the SEP of the area in which these individuals live, e.g., median level of income, percent in poverty, or area deprivation scores. In many cases the motivation is to use these as proxy measures of individual information. However, it is increasingly being recognized that such measures reflect properties of the social and economic environment in which people live and that these measures are not identical to individual measures of SEP. What is more, a number of studies indicate that such area measures are importantly associated with health outcomes beyond individual measures of SEP (Macintyre et al. 1993). Considerable work remains to clarify under what situations these area measures are related to health outcomes, their overlap with neighborhood and community phenomena, and the economic, social, and behavioral pathways that link area characteristics with health outcomes in individuals.
3.2 Inequality in Income Distribution and Health
A growing literature indicates that inequalities in income distribution are importantly related to population health outcomes beyond individual incomes (Wilkinson 1996). For example, in the United States, the share of income received by the least well-off half of the population was correlated with age-adjusted mortality rates for states and metropolitan areas, independent of measures of level of income (Kaplan et al. 1996). Greater equality in the distribution of income was associated with lower mortality rates and with other health outcomes. While controversial, these findings have led to a lively debate concerning the relative role of individual income, psychosocial pathways, and material conditions in individual and population health (Lynch et al. 2000, Wilkinson 1996).
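The inequality measure mentioned in Section 3.2, the share of total income received by the least well-off half of the population, can be computed as in the minimal sketch below. This is an illustration only and is not taken from Kaplan et al. (1996): the function name, the handling of an odd number of units (the middle unit is split evenly between the two halves), and the example incomes are all assumptions made here.

```python
def bottom_half_share(incomes):
    """Share of total income held by the poorer half of the units.

    ASSUMPTION (not from the source): with an odd number of units, the
    middle unit's income is split evenly between the two halves.
    """
    if not incomes:
        raise ValueError("incomes must not be empty")
    total = sum(incomes)
    if total <= 0:
        raise ValueError("total income must be positive")
    ordered = sorted(incomes)           # poorest first
    half = len(ordered) // 2
    bottom = sum(ordered[:half])        # income of the poorer half
    if len(ordered) % 2 == 1:
        bottom += 0.5 * ordered[half]   # split the middle unit
    return bottom / total


if __name__ == "__main__":
    # Hypothetical household incomes, invented purely for illustration.
    equal_area = [30, 30, 30, 30]
    unequal_area = [10, 20, 30, 60]
    print(bottom_half_share(equal_area))    # perfect equality gives 0.5
    print(bottom_half_share(unequal_area))  # a lower share means more inequality
```

A value of 0.5 corresponds to complete equality; the further the share falls below 0.5, the more unequal the income distribution in that area.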
3.3 Life Course Effects
Most studies have examined SEP at only one time and few have considered variability in SEP over either the short or the long term. However, there is considerable variation beginning in childhood, based on macroeconomic and individual pathways, on parental SEP and adult educational, occupational, or income trajectories. Some evidence suggests that income trajectories are important (Blane et al. 1996), while others suggest that adult destinations are more important (Lynch et al. 1994). A growing literature is devoted to documenting life-course effects, many of which involve variations in SEP on adult health outcomes (Kuh and Ben-Schlomo 1997).
3.4 Biological Pathways
Evidence is rapidly accumulating on the proximal biological pathways that mediate the association between SEP and health outcomes, with much of the evidence relating to cardiovascular disease (Brunner 1997). For example, measures of SEP have been found to be associated with the prevalence and progression of subclinical atherosclerosis (Lynch et al. 1997), with levels of cholesterol and other lipids, blood pressure, and hemostatic factors (Kaplan and Keil 1993), and with the clustering of cardiovascular risk factors (Helmert et al. 1989). A growing focus is on the cumulative physiologic effects of chronic stress and adaptive efforts and their link with SEP (McEwen 1998).
4. Conclusion
The pervasive associations observed between various measures of SEP and many health outcomes, and the risk factors for these outcomes, support the fundamental role of SEP as a determinant of health. Understanding these associations requires elucidating the variety of multilevel determinants and pathways that link SEP and individual health outcomes as well as determining the societal and individual forces that stratify these determinants and pathways along socioeconomic dimensions. An appreciation of the breadth and magnitude of the imprint of SEP on the health of individuals and populations suggests that this is a scientific area with potentially great importance for public policy.
See also: Ethnicity, Race, and Health; Health Care Delivery Services; Health Care, Rationing of; Health Care Systems: Comparative; Health Economics; Minority Access to Health Services: United States; Public Health; Public Health as a Social Science; Socioeconomic Status and Health Care
Bibliography
Antonovsky A 1967 Social class, life expectancy, and overall mortality. Milbank Memorial Fund Quarterly 45: 31–73
Black D, Morris J, Smith C, Townsend P 1980 Inequalities in Health: Report of a Research Working Group. Department of Health and Social Security, London
Blane D, Hart C L, Davey Smith G, Gillis C R, Hole D J, Hawthorne V M 1996 Association of cardiovascular disease risk factors with socioeconomic position during childhood and during adulthood. British Medical Journal 313: 1434–8
Brunner E 1997 Stress and the biology of inequality. British Medical Journal 314(7092): 1472–6
Davey Smith G, Neaton J D, Wentworth D, Stamler R, Stamler J 1996 Socioeconomic differentials in mortality risk among men screened for the Multiple Risk Factor Intervention Trial: 1. White men. American Journal of Public Health 86: 486–96
Ericson A, Eriksson M, Westerholm P, Zetterstrom R 1984 Pregnancy outcome and social indicators in Sweden. Acta Paediatrica Scandinavica 73(1): 69–74
Guralnik J M, Land K C, Blazer D, Fillenbaum G G, Branch L G 1993 Educational status and active life expectancy among older blacks and whites [see comments]. New England Journal of Medicine 392: 110–16
Helmert U, Herman B, Joeckel K H, Greiser E, Madans J 1989 Social class and risk factors for coronary heart disease in the Federal Republic of Germany. Results of the baseline survey
14557
of the German Cardiovascular Prevention Study (GCP). Journal of Epidemiology and Community Health 43(1): 37–42
Kaplan G A, Keil J E 1993 Socioeconomic factors and cardiovascular disease: a review of the literature. Circulation 88(4): 1973–98
Kaplan G A, Pamuk E, Lynch J W, Cohen R D, Balfour J L 1996 Inequality in income and mortality in the United States: analysis of mortality and potential pathways. British Medical Journal 312: 999–1003
Krieger N, Williams D R, Moss N 1997 Measuring social class in U.S. public health research: concepts, methodologies, and guidelines. Annual Review of Public Health 13: 341–78
Kuh D, Ben-Shlomo Y 1997 A Life Course Approach to Chronic Disease Epidemiology. Oxford University Press, Oxford, UK
Kunst A E, Groenhof F, Mackenbach J P, Health E W 1998 Occupational class and cause specific mortality in middle aged men in 11 European countries: comparison of population-based studies. EU Working Group on Socioeconomic Inequalities in Health. British Medical Journal 316(7145): 1636–42
Lynch J W, Kaplan G A, Cohen R D, Kauhanen J, Wilson T W, Smith N L, Salonen J T 1994 Childhood and adult socioeconomic status as predictors of mortality in Finland. Lancet 343: 524–7
Lynch J W, Kaplan G A, Salonen R, Salonen J T 1997 Socioeconomic status and progression of carotid atherosclerosis: prospective evidence from the Kuopio Ischemic Heart Disease Risk Factor Study. Arteriosclerosis, Thrombosis and Vascular Biology 17: 513–9
Lynch J W, Davey Smith G, Kaplan G A, House J S 2000 Income inequality and mortality: importance to health of individual income, psychosocial environment, or material conditions. British Medical Journal 320: 1200–4
Macintyre S, MacIver S, Sooman A 1993 Area, class and health: should we be focusing on places or people? Journal of Social Policy 22: 213–34
Marmot M G, Rose G, Shipley M, Hamilton P J S 1978 Employment grade and coronary heart disease in British civil servants. Journal of Epidemiology and Community Health 32: 244–9
McEwen B S 1998 Protective and damaging effects of stress mediators. New England Journal of Medicine 338(3): 171–9
Morgenstern H 1980 The changing association between social status and coronary heart disease in a rural population. Social Science and Medicine [Medical Psychology and Medical Sociology] 14A(3): 191–201
Pamuk E, Makuc D, Heck K, Reuben C, Lochner K 1998 Socioeconomic Status and Health Chartbook. Health, United States, 1998. National Center for Health Statistics, Hyattsville, MD
Pappas G, Queen S, Hadden W, Fisher G 1993 The increasing disparity in mortality between socioeconomic groups in the United States, 1960 and 1986 (published erratum appears in New England Journal of Medicine 329(15)). New England Journal of Medicine 329(2): 103–9
Rather L J 1988 Report on the typhus epidemic in Upper Silesia. In: Rather L J (ed.) Rudolph Virchow: Collected Essays on Public Health and Epidemiology. Science History, Canton, MA, Vol. 1, pp. 205–20
Rosen G 1993 A History of Public Health, expanded edn. Johns Hopkins University Press, Baltimore, MD
Susser M, Watson W, Hopper K 1985 Sociology in Medicine. Oxford University Press, New York
Wilkinson R G 1996 Unhealthy Societies: The Afflictions of Inequality. Routledge, New York
Wolfson M, Kaplan G A, Lynch J W, Ross N, Backlund E 1999 Relation between income inequality and mortality: empirical demonstration. British Medical Journal 319: 953–7
Wolfson M, Rowe G, Gentleman J F, Tomiak M 1993 Career earnings and death: a longitudinal analysis of older Canadian men. Journal of Gerontology, Social Sciences 48: S167–79
G. A. Kaplan
Socioeconomic Status and Health Care
Although many factors determine the availability and use of health care, there is substantial evidence across countries that markers of socioeconomic status (SES), including income, education, and occupational status, are significantly associated with health care utilization. This includes health insurance coverage and the use of personal health services such as visits to primary care providers and specialists, hospitalizations, dental care, vision care, mental health services, prescription-drug use, and home health care. Although there is ample evidence that people of higher socioeconomic position use more health care services and tend to receive higher-quality care, the reasons behind these disparities are not fully understood. Below, the relationship between SES and health services utilization is described, and explanations for this relationship are summarized.
1. The Relationship between SES and Health Care Utilization
Numerous studies have found a positive graded relationship between SES and most types of health-services utilization, measured either by whether a service type was used in the past year or by the mean number of encounters/visits for the service during a specified time period (National Center for Health Statistics 1998, Fiscella et al. 2000, Whitehead 1999). Although there are some exceptions to these general findings, the bulk of the empirical findings to date suggest that the association is positive; that is, higher SES is associated with a higher rate of health-care utilization (see Table 1 for some examples from the US). Higher SES is also associated with lower rates of untreated disease and unmet medical needs. That SES has a positive impact on health care use has been observed for both men and women, and for children and adults.
Table 1 Health care utilization by poverty status for selected measures, United States 1994–1995. Data from National Center for Health Statistics (1998)
Health service                              Poor     Near poor   Middle income   High income
No health insurance
  Children (under 18 years)                 22.0%    22.8%       8.6%            4.2%
  Adults (18–64 years)                      44.1%    35.3%       13.8%           6.1%
No physician contact in the past year
  Children (under 6 years)                  11.2%    10.1%       6.9%            4.1%
  Adults (18–64 years)                      21.2%    19.1%       14.9%           11.4%
Unmet need for health care
  Adults (18–64 years)                      32.5%    29.0%       16.0%           7.0%

Health service                              Poor     Near poor   Middle/high income
Number of physician contacts in past year, age-adjusted by health status and gender
  Good to excellent health, males           3.9      3.8         4.6
  Good to excellent health, females         5.4      5.0         5.8
  Fair or poor health, males                13.7     14.7        16.3
  Fair or poor health, females              16.2     17.3        22.4
The relationship between SES and health care is especially apparent in countries without universal health insurance, where there is a strong relationship between SES and health insurance status (Bongers et al. 1997, Whitehead 1999). Those with lower levels of
income and education are more likely to be uninsured, or to have insurance that covers only catastrophic health events. In turn, those who are uninsured are less likely to use health services, and more likely to delay or forego care for serious medical symptoms (Fiscella et al. 2000, Brown et al. 1998). Importantly, SES is also related to health care use in countries with universal health insurance (Black et al. 1982, Whitehead 1999, Kephart et al. 1998). Thus, even in countries with a long history of universal coverage, SES disparities in health-care utilization are strong and apparent.
While the association between SES and health-care utilization is positive and graded in most studies, other patterns have been documented as well. An inverse association between SES and use of physician services was noted in Canada, along with the hypothesis that excess use by those of lower income and education levels is attributable to SES differences in health status (Kephart et al. 1998). In the US, those without health insurance, those living in low-income areas, and racial/ethnic minorities are significantly more likely to experience potentially avoidable hospitalizations and medical procedures (Pappas et al. 1997, Fiscella et al. 2000).
In summary, people from lower economic and educational strata tend to have fewer physician visits for both primary and specialty care, and to use fewer mental health, dental, vision, and preventive services. They are also less likely to have a usual source of care, and have significantly higher rates of unmet health needs. In some analyses, lower SES is associated with greater physician and hospital use (especially for avoidable causes of injury or illness), although it appears that these findings are primarily driven by higher rates of poor health status or medical needs in disadvantaged populations.
SES and its proxies are not the only factors related to health care use. Utilization is also patterned by gender, geography, age, culture, and other factors. While it is difficult to sort out the independent effects of SES and race/ethnicity on health care, it appears that minority status has an important independent effect. In the US, certain racial and ethnic minorities (most notably, African Americans, Native Americans, and Hispanics) have been found to have lower rates of utilization in general, to receive less aggressive care for certain diseases, and to report less satisfaction with health care and more experiences of discrimination in it (Fiscella et al. 2000). Even so, SES appears to be a major social predictor of health-care use. The relationship is strong across different types of health services, and is robust across a variety of study designs from a number of different countries.
2. Explanations for the Relationship between SES and Health Care Utilization
While the evidence for a relationship between SES and health care use is strong and compelling, there are many methodological challenges in measuring this association and its causes (Brown et al. 1998). First, the relationship is confounded by health status, which is related to both SES and the need for services. Thus, it is important to measure health-services utilization within the context of medical need. Second, it is difficult to have all necessary variables in a single population-based dataset, including valid and reliable measures of SES, health insurance, medical need, and other confounding factors. Third, there are many different types of health services, and relationships are not uniform in magnitude or pattern across these different types. Fourth, simple measures of utilization do not provide a very deep understanding of differences that may exist in the quality, content, or continuity of care.
Despite the methodological challenges for this type of research, several factors have been investigated to explain why people of lower SES generally use fewer health services than those of higher social position, and why their level of use does not appear to match their need. Dutton (1978) reported that explanations for this phenomenon generally fall into three categories: (a) explanations that focus on financial access or affordability differences (e.g., the poor use fewer services because they have fewer financial means); (b) explanations that focus on knowledge and attitudinal differences regarding care-seeking (e.g., the poor use fewer services because they value and/or understand the benefits of health care less than other people); and (c) explanations that focus on health care system factors (e.g., the poor use fewer services because of how the health care system is organized or responds to them). This set of explanations still reflects the primary reasons offered for the strong association between SES and health care utilization.
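The first challenge, confounding by health status, can be made concrete with the physician-contact figures in Table 1. The sketch below is illustrative only: the contact rates are taken from Table 1, but the shares of each income group in fair or poor health are hypothetical values chosen to show the mechanism, and the names `income_ratio`, `share_fair_or_poor`, and `crude_mean` are assumptions introduced for this example.

```python
# Mean physician contacts in the past year, from Table 1
# (health status -> sex -> income group).
contacts = {
    "good_to_excellent": {
        "males": {"poor": 3.9, "near_poor": 3.8, "middle_high": 4.6},
        "females": {"poor": 5.4, "near_poor": 5.0, "middle_high": 5.8},
    },
    "fair_or_poor": {
        "males": {"poor": 13.7, "near_poor": 14.7, "middle_high": 16.3},
        "females": {"poor": 16.2, "near_poor": 17.3, "middle_high": 22.4},
    },
}


def income_ratio(rates):
    """Middle/high-income contacts divided by poor contacts within one stratum."""
    return rates["middle_high"] / rates["poor"]


# Comparison within each health-status stratum (need held roughly constant).
stratum_ratios = {
    (health, sex): round(income_ratio(rates), 2)
    for health, by_sex in contacts.items()
    for sex, rates in by_sex.items()
}

# HYPOTHETICAL shares of each income group in fair or poor health.
# These values are NOT from the article; they only illustrate confounding.
share_fair_or_poor = {"poor": 0.30, "middle_high": 0.10}


def crude_mean(sex, group):
    """Pooled mean contacts for one sex and income group, ignoring health status."""
    p = share_fair_or_poor[group]
    good = contacts["good_to_excellent"][sex][group]
    sick = contacts["fair_or_poor"][sex][group]
    return (1 - p) * good + p * sick


crude_males = {g: round(crude_mean("males", g), 2) for g in share_fair_or_poor}

print(stratum_ratios)
print(crude_males)
```

Within every health-status stratum the middle/high-income group has more contacts than the poor group (all ratios exceed 1). Under the hypothetical health shares, however, the pooled figure for poor men is higher than for middle/high-income men, because a larger share of the poor group falls in the high-need stratum. This is one way an unadjusted analysis can show an inverse association of the kind reported for Canada, and why utilization is best measured in the context of medical need.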
2.1 Financial or Affordability Explanations
Since being uninsured is strongly correlated with income and education, it is often assumed that the strong relationship between SES and health-care use operates primarily through the more proximate variable of health insurance. However, the conclusion of numerous studies is that health insurance explains only a small part of the socioeconomic differences in the use of most types of health services (Bongers et al. 1997). In addition, even among people with health insurance, lower levels of income and education are associated with lower overall health-care use (Fiscella et al. 2000). Although access to health care is often equated with financial access, access is in fact multidimensional, and other aspects of access (such as the availability, convenience, and acceptability of providers and services) are of equal importance (Gold 1998). Also, as mentioned above, there is ample evidence from countries with universal health insurance that significant SES differences in utilization exist even within systems where there are few or no financial access barriers to services. Thus, SES differences in health care use are not fully explained by economic or financial factors.
2.2 Knowledge, Attitudinal, and Cultural Explanations
Another type of explanation for the lower use of health services among disadvantaged populations is that a lack of attention to health is part of a larger way of life or subculture related to poverty. Underuse of health care services is viewed as stemming in part from a lack of understanding regarding appropriate care-seeking behavior for specific symptoms or conditions. In addition, lower use of health services has been viewed as derivative of a larger set of attitudes that tend to accompany poverty and deprivation. These include a ‘crisis-oriented’ approach to life, a tendency to overvalue immediate gains and rewards and discount longer-term outcomes, decreased self-reliance and self-efficacy, and increased skepticism regarding the potential benefits of medical care (Riessman 1994).
Although culture-of-poverty explanations for SES differences in health care use have been criticized as ‘classist’ and pejorative, this type of explanation continues to be used explicitly and implicitly by many researchers (Riessman 1994, Dutton 1978). Such explanations focus on social class differences in knowledge, attitudes, and psychosocial predispositions; as a result, programmatic and policy interventions that focus on these individual traits are not uncommon (Adler et al. 1993, Reilly et al. 1998). While these individual characteristics have been shown to be related to health-care use, they are most often taken into account outside of the social and cultural contexts in which they were produced and the system context in which health care is provided. Thus, they provide a limited or incomplete explanation for observed SES disparities in health care use (Dutton 1978, Riessman 1994, Lantz et al. 1997).
2.3 Health Care System Explanations
Empirical evidence from a number of countries suggests that health care system factors, especially nonfinancial components of access, offer the greatest explanatory power regarding SES differences in utilization. It appears that deficits, inequities, and inefficiencies in the health care system, rather than patient economic or psychosocial characteristics, are essential to understanding SES differences in the use of health services (Dutton 1978, Riessman 1994, Pappas et al. 1997). Research has shown that SES differences in health-care utilization are driven much more by structural elements of the health-care system than by individual factors such as health insurance or other financial variables (Rosenbach et al. 1999). These system-level factors include cultural deterrents to care, the unwieldy nature of large medical bureaucracies, fragmentation in care, racism and other forms of provider bias, transportation barriers, and nonfinancial dimensions of access.
There are many ways in which the design of health-care systems leads people to feel estranged, dissatisfied, or distrustful, especially people from populations marginalized because of social class, race, or ethnicity (Baker 1999). Quality of care and patient perceptions of care experiences are important in system explanations for SES disparities in health-services use. Not only is SES related to health-care utilization, it is also related to health-care quality (Fiscella et al. 2000). For example, the uninsured are more likely to receive substandard medical care and to suffer a medical injury as a result (Burstin et al. 1992).
While individual characteristics (such as financial means or attitudes regarding care) clearly play a role, it is important to view these individual characteristics within the broader context or system of how health-care services are organized, delivered, and financed. The policy implications of such findings are important, for they suggest that providing universal coverage or offering free health services to the poor is not a panacea for reducing social inequalities in health-care utilization. In the absence of significant changes in the ways in which health care is organized and delivered, improvements regarding financial access and personal beliefs will likely not do much to reduce SES disparities in health-services utilization.
3. The Role of Health Care in Explaining SES Disparities in Health Status
There are many explanations for the strong relationship between SES and health status, as outlined in the accompanying chapter on this topic (see Socioeconomic Status and Health). These include explanations that focus on different material resources, of which health care is viewed as an important type (MacIntyre 1997). Research from a number of countries suggests that, for specific health conditions, the receipt of clinical preventive services and/or timely therapeutic care plays an important role in health outcomes, and thus may account for some portion of socioeconomic disparities in health status. Therefore, it has been argued that efforts to increase the use of health services among the socioeconomically disadvantaged, primarily by reducing financial barriers to health care or health insurance, have great potential for reducing socioeconomic disparities in health status (Andrulis 1998, Fiscella et al. 2000).
Research at the end of the twentieth century, however, also suggests that the contribution of SES differences in health-care use to overall SES disparities in health status is likely to be small (Adler et al. 1993, MacIntyre 1997). Evidence for this comes from a variety of sources, including, as described above, countries with universal health insurance. For example, the Black Report demonstrated that socioeconomic inequalities in mortality in England and Wales increased rather than attenuated in the decades after the British National Health Service was established (Black et al. 1982). It has also been shown that, in developed countries, there is no relationship between health indicators (such as life expectancy) and the proportion of economic resources spent on health care (Roberts and House 2000). In addition, a cross-national study found that health-care expenditures did not explain variation in infant mortality rates above and beyond the contribution of socioeconomic resources (Kim and Moody 1992). While it is important not to overgeneralize from these ecological studies, a wide variety of research strongly suggests that differential access to and use of personal health services is but one of myriad factors contributing to social inequalities in health status (Roberts and House 2000, MacIntyre 1997).
With respect to developing countries, it is clear that economic development, growth in formal education, and public-health measures such as clean water and basic nutrition are of greater significance to population health than the delivery of health care. While the importance of personal health services (such as immunizations or oral rehydration therapy) in less-developed countries should not be underestimated, broad public health and economic development advances are believed to afford the most potential for making significant improvements in health.
This is not to suggest that health care has no impact on individual health or social disparities in health status. There are indeed some diseases and conditions for which health services can significantly reduce morbidity and mortality, and improve physical functioning and quality of life, as demonstrated in clinical trials and other types of studies (Andrulis 1998, Reilly et al. 1998, Fiscella et al. 2000). The bulk of the knowledge generated to date, however, suggests that while reducing SES disparities in health care may help to reduce socioeconomic and racial disparities for some specific diseases or health issues, significant inequalities in health status would be likely to remain. Environmental, community, socioeconomic, and psychosocial exposures to both health risks and health-promoting resources would still exist, creating disparities in the incidence and prevalence of a number of health problems. Furthermore, many health conditions and problems are not amenable to medical interventions or to prevention practices delivered through health-care providers.
It is also important to emphasize that socioeconomic disparities in health care are not immutable (Fiscella et al. 2000). Thus, many researchers, clinicians, advocates, and policymakers alike believe that efforts should continue to be made across and within countries to reduce SES differentials in health-care access and utilization by focusing on the financial, attitudinal, and, most importantly, health system barriers that contribute to these significant disparities. The policy goal should be a health care delivery system in which all dimensions of access and the content and quality of care delivered are the same across social strata.
See also: Health Insurance: Economic and Risk Aspects; Socioeconomic Status and Health; Unemployment and Mental Health
Bibliography
Adler N E, Boyce W T, Chesney M A, Folkman S, Syme L 1993 Socioeconomic inequalities in health: No easy solution. Journal of the American Medical Association 269: 3140–5
Andrulis D P 1998 Access to care is the centerpiece in the elimination of socioeconomic disparities in health. Annals of Internal Medicine 129: 412–6
Baker R 1999 Minority distrust of medicine: a historical perspective. The Mount Sinai Journal of Medicine 66: 212–22
Black D, Morris J N, Smith C, Townsend P 1982 Inequalities in Health: The Black Report. Penguin, New York
Bongers J M B, Van der Meer J B W, Van den Bos J, Mackenbach J P 1997 Socio-economic differences in general practitioner and outpatient specialist care in the Netherlands: A matter of health insurance? Social Science and Medicine 44: 1161–8
Brown M E, Bindman A B, Lurie N 1998 Monitoring the consequences of uninsurance: A review of methodologies. Medical Care Research and Review 55: 177–210
Burstin H R, Lipsitz S R, Brennan T A 1992 Socioeconomic status and risk for substandard medical care. Journal of the American Medical Association 268: 2383–7
Dutton D B 1978 Explaining low use of health services by the poor: Costs, attitudes, or delivery systems? American Sociological Review 43: 348–68
Fiscella K, Franks P, Gold M R, Clancy C M 2000 Inequality in quality: addressing socioeconomic, racial, and ethnic disparities in health care. Journal of the American Medical Association 283: 2579–84
Gold M 1998 Beyond coverage and supply: measuring access to healthcare in today’s market. Health Services Research 33: 625–52
Kephart G, Thomas V S, MacLean D R 1998 Socioeconomic differences in the use of physician services in Nova Scotia. American Journal of Public Health 88: 800–3
Kim K K, Moody P M 1992 More resources better health? A cross-national perspective. Social Science and Medicine 34: 837–42
Lantz P M, Weigers M E, House J S 1997 Education and income differentials in breast and cervical cancer screening. Medical Care 35: 219–36
MacIntyre S 1997 The Black Report and beyond: What are the issues? Social Science and Medicine 44: 723–45
National Center for Health Statistics 1998 Health, United States, 1998 With Socioeconomic Status and Health Chartbook. US Department of Health and Human Services, Hyattsville, Maryland
Pappas G, Hadden W C, Kozak L J, Fischer G F 1997 Potentially avoidable hospitalizations: Inequalities in rates between US socioeconomic groups. American Journal of Public Health 87: 811–16
Reilly B M, Schiff G, Conway T 1998 Part II: Primary care for the medically underserved: Challenges and opportunities. Disease-A-Month 44: 320–46
Riessman C K 1994 Improving the health experience of low income patients. In: Conrad P, Kern R (eds.) The Sociology of Health and Illness. St Martin’s Press, New York, pp. 434–37
Roberts S A, House J S 2000 Socioeconomic inequalities in health: an enduring sociological problem. In: Bird C E, Conrad P, Fremont A M (eds.) Handbook of Medical Sociology, 5th edn. Prentice Hall, Upper Saddle River, NJ, pp. 79–97
Rosenbach M L, Irvin C, Coulam R F 1999 Access for low-income children: Is health insurance enough? Pediatrics 103: 1167–74
Whitehead M M 1999 Where do we stand? Research and policy issues concerning inequalities in health and in healthcare. Acta Oncologica 38: 41–50
P. M. Lantz
Sociolinguistics
Sociolinguistics is the study of the relationship between language and society. The field of sociolinguistics is arguably the best known and most firmly established of the various hyphenated varieties of linguistics (e.g., psycholinguistics, neurolinguistics, sociolinguistics) that emerged during the second half of the twentieth century. Its roots can be located in the field research of American anthropological linguists such as Dell Hymes, who noted the interesting range of ways of talking and the variety of functions of talk among Native American tribes whose languages they described, and of dialectologists, such as Bill Bright, who drew attention to the social bases of much of the linguistic diversity they documented in multilingual speech communities.
Although it is a well-established field, there is still some controversy about what counts as sociolinguistics. William Labov, a pioneering and very influential American social dialectologist, has argued since the 1960s that his research constitutes core linguistics, and contributes directly to the development of linguistic theory. In the 1970s, Peter Trudgill, a leading British social dialectologist, noted a distinction between research in social dialectology, which he argued constituted the core of sociolinguistics, and research in areas such as multilingualism and language shift, which Joshua Fishman had labeled ‘the sociology of language,’ an area Trudgill proposed to exclude from sociolinguistics proper. While that debate has effectively been resolved (the current leading sociolinguistics journals include research in both these areas), discussion over the precise boundaries of the term sociolinguistics continues.
This article takes an inclusive approach. The sections which follow provide a brief description of the concerns of areas which uncontroversially constitute core sociolinguistics, namely the sociology of language and social dialectology, as well as areas which some would regard as more peripheral, such as discourse analysis (DA) and stylistics, language awareness, and applied sociolinguistics.
1. Multilingualism; Diglossia; Bilingualism
Sociolinguistic research in the area of multilingualism is concerned with identifying the relationship between social factors and choice of language or ‘code’ in specific situations. In Zaire, for example, people often use five or six languages, or more accurately ‘codes’ or ‘varieties,’ in their everyday interactions. The terms ‘code’ and ‘variety’ refer to any set of linguistic forms which are socially patterned, including different accents, different linguistic styles, different dialects, and different languages which contrast with each other for social reasons. The linguistic repertoire of a typical Zairean teenager might include three varieties of Swahili (standard Zairean, local Swahili, and Indoubil, a teenage slang variety), and two varieties of a tribal language such as Shi (a formal and a casual style). How the young person draws on and makes use of this repertoire in interaction is a matter of prime sociolinguistic interest (Goyvaerts 1988).
Sociolinguistic research aims to establish the range of social factors relevant to code choice in such multilingual contexts, and their relative weight. The use of one code rather than another in any specific situation tends to reflect such factors as the relationship between speaker and addressee(s), the formality of the context, and the function of the interaction. In Paraguay, for example, the indigenous language, Guaraní, is used in interactions with friends in the pub, while Spanish is the appropriate language to use with your doctor and the bureaucrats (Rubin 1985). Code choice may also indicate more abstract influences such as orientation to an urban rather than a rural lifestyle. More recently, researchers have begun to analyze code choice as a more dynamic activity, examining ways in which speakers change the emphasis in an interaction through linguistic initiatives, and how code choice actively contributes to the construction of particular types of interaction (see Code Switching: Linguistic).
Diglossia is a term used to describe a particular kind of societal bilingualism. In his classic article on the topic, Ferguson (1959) defined diglossia as a situation where two varieties of the same language, a H(igh) variety and a L(ow) variety, are used in a community for complementary functions, a linguistic division of labor. The H variety is restricted to formal, religious, symbolic, and literary functions. The L variety is used for everyday conversations in the home and in all informal contexts. The term proved so useful that it was extended to cover a much wider variety of bilingual and multilingual situations where different languages were associated with different domains or social contexts. The term ‘polyglossia’ was coined to take account of communities such as Singapore with more than one H language as well as several L varieties and even some M(id) varieties falling between the H and L in terms of prestige (Platt 1977). Thus, sociolinguists have steadily modified the scope of the diglossia concept to handle more complex patterns of societal multilingualism. Moreover, further research on the four classic cases Ferguson used for exemplification (German-speaking Switzerland, Haiti, Greece, and Arabic-speaking countries) suggests that even these do not fit the original framework as neatly as was once thought.
1.1 Code-switching
A related area of sociolinguistic research is the study of code-switching. Situational code switches have been defined as responses to changes in observable features of the context, such as the participants or the setting, and they typically occur at clearly demarcated discourse boundaries. Recent research has increasingly focused on interpreting the social meaning of conversational code-switching, sometimes called code-mixing, the much less predictable rapid code-switching which typically occurs among bilinguals in informal situations (e.g., Auer 1998). The development of theoretical models that can account for the linguistic constraints on switches has been one focus of this research. Another has been the social and pragmatic meanings of switches. Analyzing code choice in Kenya, for instance, Myers-Scotton (1997) discusses the sophisticated affective messages conveyed through complex and rapid alternations between codes, often within a very short exchange. So, for example, through subtle switching of codes, rather than any explicit refusal, a store-keeper politely rejects his sister’s attempts to obtain cut-price goods from him, thus preserving their relationship. This strand of research considers how switches actively contribute to constructing a particular type of interaction, and to the dynamics of identity construction. Code-switching is one means by which bilinguals and multilinguals make creative use of their linguistic repertoires to convey social meaning.
2. Language Maintenance and Shift; Language Endangerment/Revival
While diglossia is defined as a society-wide phenomenon, individual bilingualism is often a transitional phase for new immigrants in a society as they shift from the ethnic language to the dominant language of the wider society. This process of language shift is typically complete in three generations. Fishman’s research dominates the area of language maintenance and shift, with studies of patterns of language use among ethnic minorities in the USA dating back to the early 1960s. A range of theoretical models has been developed to account for the relative survival rates of different ethnic languages in different countries, including measures of ‘ethnolinguistic vitality’ (Giles 1977), and a scale describing the steps involved in taking a language from near language death through to language revival (Fishman 2000).
A language which is spoken only in a restricted geographical area and which is under threat from a colonial or world language is typically regarded as endangered. The relative chances of survival of different endangered languages have been the topic of a good deal of sociolinguistic analysis. Research on the numbers of users, social contexts of use, and linguistic attitudes has been undertaken on languages as diverse as Welsh, Cree, Maori, Dyirbal, and Basque, with some disagreement between experts on the crucial conditions for preventing language death.
3. Language Contact; Pidgins and Creoles
While the development of bilingualism, or a shift to the majority language, are possible responses to language contact, another common development is the emergence of a pidgin language. Pidgins typically develop in situations where a lingua franca is needed for communication between groups who speak different languages, for example, for trade purposes, or between workers from different tribes on a crop plantation. They are structurally very simple and functionally extremely restricted. They are always ancillary languages developed for a specific and restricted purpose. Creoles develop from pidgins when they begin to be used as first languages and therefore expand functionally and structurally into adequate means of communication for the group using them. Typically they draw on the vocabulary of the language of the culturally dominant group, while their grammar is based on local vernacular languages which contribute the substrate component. A number of official and standard languages such as Swahili, Afrikaans, and Tok Pisin had creole origins.
Current debates among creolists center mainly around the issue of the origins and development of creole languages. Do they develop according to universal linguistic principles and according to universal linguistic constraints, or are the remarkable similarities between creoles due to the fact that they share sources of input from European languages such as English and Portuguese? Another current area of interest is the decreolization process (whereby creoles gradually develop in the direction of the standard variety) which is occurring in places such as Guyana (Romaine 1994).
4. Social Dialectology
Social dialectologists are concerned with tracing the relationships between linguistic variation and social factors. The pioneering research of William Labov determined the direction of subsequent research in the area for several decades, both in terms of the research questions he considered worth addressing, and in relation to the innovative methodology and quantitative (‘variationist’) analytical techniques he developed. His influential survey of the Lower East Side of New York (Labov 1972a) demonstrated that linguistic variation was not random or ‘free,’ as previous dialectologists had tended to assume, but rather patterned systematically according to factors such as the social class, gender, age, and ethnicity of the speakers, and the formality of the style of speech they were using (see Language and Social Class; Language and Ethnicity).
The basic patterns which Labov identified in New York were replicated in many other urban speech communities, including cities in Canada, England, Australia, and New Zealand. In Europe too, variationist approaches gradually displaced those of traditional regional dialectologists, and were adopted in urban dialectology research with similar results. Researchers found, for instance, that stable variables, which typically included grammatical variables, such as multiple negation, and phonological variables, such as final -ing (e.g., he don’t know nothin’) in English, tended to divide the community clearly into two major social groups. Linguistic variables involved in change, however, were typically more finely stratified, with a gradient pattern reflecting a continuum from the lowest to the highest social groups. In the great majority of surveys, women lead sound change, whether towards the standard or towards the vernacular variety. In New Zealand, for example, women are leading the rapid spread of the high rising terminal, as well as the merging of the vowels in words such as EAR and AIR.
Recent variationist studies indicate that in specific speech communities variables such as class, ethnicity, social role, and style often interact to modify such general patterns. Labov formalized the idea that linguistic change could be studied in progress; he analyzed ‘apparent time’ data demonstrating the spread of postvocalic (r) in words like car and card through different generations of the New York speech community. This approach to the study of language change has since been used by sociolinguists in speech communities worldwide. Labov also used quantitative methods to collect people’s subjective evaluations of different social accents he played to them, a method which has been widely used since in language attitude research. He demonstrated that, despite wide variations in their linguistic production, there was a startling uniformity of evaluative norms and attitudes to language among members of the same speech community. The New York accent, for example, was universally disparaged, a pattern replicated in many other urban speech communities. On the basis of this analysis, Labov suggested that the most useful defining characteristic of a speech community was precisely such shared linguistic norms. Others have suggested shared rules of use, density of patterns of interaction, or a shared pattern of variation. More recently, some sociolinguists have begun to explore the ‘community of practice’ as a useful concept in sociolinguistic research, one which moves the focus to the ways in which communities are constructed through shared goals, activities, and verbal resources (see Holmes and Meyerhoff 1999).
Labov set the methodological standard in social dialectology, influencing all subsequent research. He used an extensive interview schedule which included word lists, reading passages, questions skilfully designed to elicit casual speech, and innovative attitude elicitation techniques. Later refinements include the use of a network approach by Lesley Milroy and James Milroy in the study of working class Belfast. Instead of using social survey results and consequently interviewing strangers, they used networking (the ‘friend of a friend’ approach) to contact potential contributors. They also operationalized the network concept in terms of density and multiplexity, demonstrating that those who were most closely integrated into the local community used more vernacular forms (Milroy and Margrain 1980). Current variationist research can be found in the journal Language Variation and Change, with material ranging from the finest vowel and consonant distinctions to the broadest features of interactional discourse.
For instance, the use of quotative be like (as in so she was like you aren’t really going out wearing that thing and he was like why not) provides one example of an interesting, relatively new discourse marker which is being well documented as it spreads from American English into other varieties (e.g., Tagliamonte and Hudson 1999).
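The network measures of density and multiplexity mentioned above can be illustrated with a small calculation. The sketch below is a simplified illustration only, not the scoring procedure used in the Belfast study: it assumes density is the share of possible pairwise ties that actually exist, and multiplexity is the share of existing ties that involve more than one kind of relationship. The member labels, tie types, and function names are all hypothetical.

```python
def network_density(members, ties):
    """Share of all possible pairwise ties that actually exist."""
    possible = len(members) * (len(members) - 1) / 2
    return len(ties) / possible if possible else 0.0


def multiplexity(ties):
    """Share of existing ties that combine more than one kind of relationship."""
    if not ties:
        return 0.0
    multiplex = sum(1 for kinds in ties.values() if len(kinds) > 1)
    return multiplex / len(ties)


# Hypothetical four-person neighbourhood network (illustrative data only).
members = ["A", "B", "C", "D"]
ties = {
    frozenset(("A", "B")): {"kin", "neighbour"},
    frozenset(("A", "C")): {"workmate"},
    frozenset(("B", "C")): {"kin", "workmate", "neighbour"},
}

density = network_density(members, ties)   # 3 of 6 possible ties
multiplex_share = multiplexity(ties)       # 2 of 3 ties are multiplex
print(f"density={density:.2f}, multiplexity={multiplex_share:.2f}")
```

On these definitions, a speaker embedded in a dense, multiplex network would score high on both measures, which is the kind of close local integration the network approach associates with greater use of vernacular forms.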
5. Social Constructionist Approaches
Another important development over the last decade has been the influence of social constructionism on sociolinguistic research. Rather than regarding social categories, such as social class, ethnicity, and gender, as primary, with linguistic behavior reflecting membership, a social constructionist approach shifts the emphasis to language as a dynamic resource used to construct particular aspects of social identity at different points in an interaction. Social categories are not fixed but are subject to constant change; talk itself actively creates different styles and constructs different social contexts and social identities as it proceeds. So, for example, at certain times during a romantic dinner, a woman may select linguistic forms which contribute to the construction of a more ‘feminine’ identity, while in a Board meeting she is chairing she may linguistically construct a statusful and powerful identity. Interacting with her children or perhaps with certain subordinates, she may linguistically construct a ‘maternal’ identity. Even within one interaction she may use language to perform a range of social identities as she responds to the development of the discourse in context.
6. Discourse and Style
6.1 Style
The influence of social constructionism is also evident in recent research on the sociolinguistics of style. As mentioned above, Labov elicited a range of styles within the interview context by varying the interviewee’s task. Subsequent research focused on the influence of the addressee (e.g., Giles and Smith 1979) and/or the ‘audience’ (Bell 1984). Speech Accommodation Theory, later relabeled Communication Accommodation Theory (CAT), focuses on the mutual influence of participants on a wide range of aspects of communication: so, for example, people may ‘accommodate’ to each other’s nonverbal behavior, speech volume, rate of speaking, prosodic patterns, vowel realizations, discourse markers, and even language. Collecting social dialect data, for instance, Trudgill (1988) found he converged towards his interviewees’ levels of final glottal stops for /t/ in words such as not. Similarly in New Zealand, convergence in levels of the use of the discourse marker eh is common (Meyerhoff 1994). Divergence also occurs when interlocutors wish to differentiate themselves. Giles and Smith (1979) describe, for example, a switch to Welsh as one response to outsiders’ disapproval of efforts to maintain the language, and, in Australia, Eisikovits (1989) explains adolescent boys’ use of vernacular forms in an interview by suggesting they are deliberately diverging from the standard speech of their middle-class female interviewer. More recently there has been some exploration of Rampton’s concept of ‘crossing,’ where speakers apparently use out-group linguistic styles for specific stylistic effects, or code switch into varieties not generally associated with their group in order to challenge or problematize group boundaries (Rampton 1999).
Currently, then, many sociolinguists argue that style is not so much a matter of self-monitoring (as Labov suggested) or of responding to others (as CAT theorists argue), but rather a matter of the way an individual chooses to present herself in everyday interaction. Style emerges from ‘practice,’ the individual’s ongoing performance of her complex of social identities in particular sociocultural contexts. Eckert’s analysis of American adolescents’ stylistic ‘performance’ in the school context, for instance, demonstrates that style is not just a matter of speech, but also involves ‘dress, attitude and action’ (2000, pp. 211ff.). Their choice of linguistic variables, however, affects every utterance. They are constantly manifesting a particular style (e.g., ‘flamboyant’ or ‘conservative’), just as they are continually involved in the performance of their particular social identity in the school (e.g., as ‘jock’ or ‘burn-out’). Few would argue against the inclusion of such research within sociolinguistics. However, turning to the closely related area of pragmatics, the borderlines become much less clear.
6.2 Pragmatics
Pragmatics is concerned with the analysis of language in use. At the more linguistic end of pragmatics, researchers analyze how people disambiguate utterances such as I watched them standing up. Social context becomes increasingly relevant in analyzing the complex ways in which meaning is conveyed through such an apparently simple utterance as you didn’t get a paper then? which can be interpreted as a request for information, a complaint, or even a rebuke, depending on the social relationship between participants. Similarly, I can hear someone talking can convey very different meanings depending on the relevant social context and participant relationships: consider, for instance, a classroom vs. police officers investigating an ‘empty’ house. Closely related is sociopragmatic research on politeness, with analysis of the ways in which people express rapport and solidarity (positive politeness), as well as respect and social distance (negative politeness) (see Brown and Levinson 1987, Thomas 1995).
Linguistic politeness devices include choice of pronoun (e.g., tu/vous, du/Sie, etc.), endearment terms (e.g., love, honey, dear), hedges (e.g., tag questions, epistemic modal verbs, modal particles, such as perhaps, possibly), pragmatic particles (e.g., you know, of course, sort of, I mean), and many more, some of which are very subtle. Sociolinguists examine the patterning of such pragmatic devices in the usage of different social groups. Gender differences have been identified, for instance (e.g., Coates 1996), as well as ethnic and class differences (e.g., Goodwin 1980). More recently, researchers have examined the ways in which people draw on these devices in constructing different aspects of their social identity. Such devices are important resources in constructing ethnic identity, for example, or a ‘feminine,’ compliant, accommodating identity (e.g., Holmes 1997). Sociopragmatic research has also been developed by applied sociolinguists who have examined crosscultural differences in the way politeness is expressed, and in the realizations of such speech functions as apologies, complaints, compliments, directives, refusals, and disagreements (e.g., Gass and Neu 1996). The applied value of such work in second language teaching and learning is obvious.
6.3 Discourse Analysis
Not all sociolinguists would regard discourse analysis (DA) as appropriately included within sociolinguistics. Introductory courses and textbooks vary in whether they have sections introducing DA. Just a few approaches to DA are outlined here. In its broadest sense DA includes any analysis of spoken or written discourse. In practice, sociolinguists who adopt a DA approach to their materials generally do so in order to identify the ways in which social phenomena emerge from or are instantiated in spoken interaction. The way an oral narrative is constructed, for instance, may provide evidence of the salience for the interlocutors of a particular social or ethnic or gender identity (e.g., Labov 1972b, Cheshire 2000). Discourse markers, such as end tags (and stuff, and everything), can be used in ways which reflect or express social group membership.
Conversation analysis (CA) is one distinctive approach to the analysis of discourse, based on a model of communication as jointly organized activity, like dancing or joint musical performance (Sacks 1984); it rejects the typical linguistic model of communication as sending and receiving messages. CA is concerned with how participants work together to jointly achieve conversational closings, storytelling, disputes, medical diagnoses, the mutually dependent roles of interviewer and interviewee, and so on. CA focuses on how interaction unfolds as a sequence of actions by different participants, with the significance of an utterance or gesture highly dependent on its position in a sequence, as well as being jointly negotiated. Others put more emphasis on the social context in which interaction takes place, and the relevant backgrounds, knowledge, and experience of the participants.
This more ethnographic approach, generally identified as interactional sociolinguistics, is associated with John Gumperz and his associates. One example of recent work in this area is the exploration of workplace interactions between managers and workers, with a particular focus on areas of miscommunication and potential misunderstanding (e.g., Roberts et al. 1991). Another approach, generally known as critical discourse analysis (CDA), has developed largely within the European tradition. CDA is distinguished by its critical focus, its broad scope, and its ‘overtly political agenda’ (Kress 1990, pp. 84–5). CD analysts aim to reveal connections between language, power, and ideology, and to describe the way power and dominance are produced and reproduced in the discourse structures of generally unremarkable interactions.
7. Language, Culture, and Ideology
Another important strand of sociolinguistic research which can be traced to the influence of American anthropological linguists is the quest for a solution to the conundrum of the relationship between language, culture, and thought. Edward Sapir and his pupil Benjamin Lee Whorf developed the hypothesis that language influences thought rather than the reverse. The strong form of the Sapir–Whorf hypothesis claims that people from different cultures think differently because of differences in their languages. So, native speakers of Hopi perceive reality differently from native speakers of English because they use different languages, Whorf claimed. Few sociolinguists would accept such a strong claim, but most accept the weaker claim of linguistic relativity, that language influences perceptions, thought, and, at least potentially, behavior.
More recently, sociolinguists such as Fairclough (1995) and applied linguists such as Pennycook (1994) have in quite different ways developed this approach. Fairclough’s research uses a CDA approach to identify and expose the ways in which ideology and power are constantly instantiated and enacted in the familiar discourse of the media and of everyday interactions. Others have examined such issues as the continuing and increasingly subtle ways in which sexism, in the form of the derogation of women, or racist suppression or misrepresentation of ethnic minorities, are sustained through the ways language is used in the media or in public documents. Pennycook’s controversial research also emphasizes critical awareness; he argues that the imperialist expansion of the English language has been at the cost of local languages in many countries. Not surprisingly, his views have been contested by those who regard the development of a wide variety of Englishes around the world as evidence that English no longer belongs to the English, nor even to the Americans.
8. Applied Sociolinguistics
The implications and applications of sociolinguistic research have always concerned sociolinguistic researchers. Indeed, many began their research efforts in an attempt to answer a ‘real world’ question such as:
(a) does language play a role in accounting for differences in the educational performance of different sociocultural groups?
(b) what role can education play in the maintenance of a minority language?
(c) what is the role of the sociolinguist in the area of language policy?
(d) how can sociolinguistic information improve the workings of the legal system?
(e) what sociolinguistic competence is required by immigrants in the workplace?
As the breadth of these examples suggests, this is an extensive area of research and no more than a few illustrative examples can be provided. Sociolinguistic research has always had important social repercussions in the area of education. The early work of the late British sociologist Basil Bernstein (1971, [1973]) was hugely influential, especially in Germany, where deficit theories of various sorts dominated some areas of sociolinguistic research for decades. Bernstein suggested that a child’s social class determined their access to linguistic ‘codes,’ with potential cognitive consequences. His term ‘restricted code’ and the linguistically naive assumptions underlying some of his claims were challenged by social dialectologists, leading to the extended debates of the ‘deficit–difference’ controversy (e.g., Labov 1969). Its most recent manifestations range from arguments about the place of Ebonics or African American Vernacular English in the American school curriculum to debates on the linguistic rights of ethnic minorities such as Aboriginal children in Australia, British Black children in England, and the various Gastarbeiter communities in European countries.
Another area where sociolinguistic advice is often sought is language policy and language planning. Sociolinguistic descriptions provide crucial information for planning educational policy in multilingual contexts. In Vanuatu, for example, with more than 80 vernacular languages spoken by no more than 200,000 people in total, the government’s proposed policy of education in the vernacular requires extensive preparatory sociolinguistic and linguistic research (Crowley 2000). Similarly, attempts to revive a ‘dying’ language must start from an accurate sociolinguistic description of current numbers of users and contexts of use.
Bilingual education is more likely to succeed if the sociolinguistic context in which the program is to be developed is taken into account in selecting an appropriate model. Accurate information about the way people talk can usefully counter misleading gender stereotypes which may restrict people’s job opportunities. Plans for sexist language reform also benefit from accurate information about the strength of attitudes in the relevant speech community.
Forensic linguistics provides a final example where applied sociolinguistics has made a contribution. Sociolinguists have demonstrated ways in which the legal system discriminates against those who use nonstandard dialects, and those whose sociocultural rules for language use differ from those of the dominant group. In communities where Aboriginal English is used, for instance, direct questions are an inappropriate way of eliciting information (Eades 1996). Culturally conditioned responses to a lawyer’s direct questions (such as silence or a formulaic I don’t remember) may severely disadvantage Aboriginal defendants. The assumption that the norms of the Western legal system are universal regularly results in misinterpretation of the linguistic behavior of those from different sociocultural backgrounds. Many applied sociolinguists orient their research to highlighting such areas of potential injustice.
9. Conclusion
This article has necessarily been selective. The aim has been to provide a broad overview within which the range of activities of current sociolinguists can be located. Many of the topics are discussed in more detail in separate articles. Recent introductory textbooks (e.g., Romaine 1994, Wardhaugh 1998, Holmes 2001) and collections of readings in sociolinguistics (e.g., Coupland and Jaworski 1997) provide further resources for those wishing to pursue the subject further. How will sociolinguistics develop in the twenty-first century? The search for robust and explanatory sociolinguistic theories currently occupies many sociolinguists and seems likely to continue for some decades. The steady integration of macrosociolinguistic (‘sociology of language’) approaches with variationist and microsociolinguistic (interactional/DA) approaches seems a likely direction of development. Increasingly, researchers using different theoretical approaches or from different disciplinary backgrounds are attempting to communicate with each other in theory development. Finally, it is safe to predict that sound descriptive sociolinguistic research will continue to provide the crucial underpinning for language policy developments in areas such as minority groups, immigration, education, and second language teaching.
See also: Language and Gender; Language and Social Class; Language and Social Inferences; Language and Society: Cultural Concerns; Language Policy: Linguistic Perspectives; Language Policy: Public Policy Perspectives; Pidgin and Creole Languages; Politeness and Language
Bibliography
Auer P 1998 Code-Switching in Conversation. Routledge, London
Bell A 1984 Language style as audience design. Language in Society 13(2): 145–204
Bernstein B 1971 [1973] Class, Codes and Control, Vols. 1 and 2: Theoretical Studies Towards a Sociology of Language. Routledge and Kegan Paul, London
Brown P, Levinson S C 1987 Politeness: Some Universals in Language Usage. Cambridge University Press, Cambridge, UK
Cheshire J 2000 The telling or the tale? Narratives and gender in adolescent friendship networks. Journal of Sociolinguistics 4(2): 234–62
Coates J 1996 Women Talk. Blackwell, Oxford, UK
Coupland N, Jaworski A (eds.) 1997 Sociolinguistics: A Reader and Coursebook. Macmillan, London
Crowley T 2000 Vernaculars in Education in Vanuatu. Report to World Bank and Vanuatu Ministry of Education, Youth and Sport
Eades D 1996 Legal recognition of cultural differences in communication: The case of Robyn Kina. Language and Communication 16(3): 215–27
Eckert P 2000 Linguistic Variation as Social Practice. Blackwell, Oxford, UK
Eisikovits E 1989 Girl-talk/boy-talk: Sex differences in adolescent speech. In: Collins P, Blair D (eds.) Australian English. University of Queensland Press, St Lucia, Qld, pp. 35–54
Fairclough N L 1995 Critical Discourse Analysis: Papers in the Critical Study of Language. Longman, London
Ferguson C A 1959 Diglossia. Word 15: 325–40
Fishman J A (ed.) 2001 Can Threatened Languages be Saved? Reversing Language Shift, Revisited: A 21st Century Perspective. Multilingual Matters, Clevedon, UK
Gass S, Neu J (eds.) 1996 Speech Acts Across Cultures: Challenges to Communication in a Second Language. Mouton de Gruyter, Berlin
Giles H (ed.) 1977 Language, Ethnicity and Intergroup Relations. Academic Press, London and New York
Giles H, Smith P 1979 Accommodation theory: Optimal levels of convergence. In: Giles H, St. Clair R (eds.) Language and Social Psychology. Blackwell, Oxford, UK, pp. 45–65
Goodwin M H 1980 Directive-response speech sequences in girls’ and boys’ task activities. In: McConnell-Ginet S, Borker R, Furman N (eds.) Women and Language in Literature and Society. Praeger, New York, pp. 157–73
Goyvaerts D L 1988 Indoubil: A Swahili hybrid in Bukavu. Language in Society 17(2): 231–42
Holmes J 1997 Women, language and identity. Journal of Sociolinguistics 2(1): 195–223
Holmes J 2001 Introduction to Sociolinguistics, 2nd edn. Longman, London
Holmes J, Meyerhoff M 1999 The community of practice: theories and methodologies in language and gender research. Language in Society: Special Issue: Communities of Practice in Language and Gender Research 28(2): 173–83
Kress G 1990 Critical discourse analysis. Annual Review of Applied Linguistics 11: 84–99
Labov W 1969 The logic of non-standard English. Georgetown Monographs on Language and Linguistics. Georgetown University Press, Georgetown, Vol. 22
Labov W 1972a Sociolinguistic Patterns. University of Pennsylvania Press, Philadelphia, PA
Labov W 1972b The transformation of experience in narrative syntax. In: Labov W (ed.) Language in the Inner City. University of Pennsylvania Press, Philadelphia, PA, pp. 354–96
Meyerhoff M 1994 Sounds pretty ethnic eh? A pragmatic particle in New Zealand English. Language in Society 23(3): 367–89
Milroy L, Margrain S 1980 Vernacular language loyalty and social network. Language in Society 3: 43–70
Myers-Scotton C 1997 Duelling Languages: Grammatical Structure in Code-Switching. Clarendon Press, Oxford, UK
Pennycook A 1994 The Cultural Politics of English as an International Language. Longman, London
Platt J 1977 A model for polyglossia and multilingualism (with special reference to Singapore and Malaysia). Language in Society 6(3): 361–78
Rampton B 1999 Journal of Sociolinguistics 4(3): 421–585
Roberts C, Jupp T, Davies E 1991 Language and Discrimination. Longman, London
Romaine S 1994 Language in Society. Oxford University Press, Oxford, UK
Rubin J 1985 The special relation of Guaraní and Spanish in Paraguay. In: Wolfson N, Manes J (eds.) Language of Inequality. Mouton, Berlin, pp. 111–20
Sacks H 1984 Notes on methodology. In: Maxwell Atkinson J, Heritage J (eds.) Structures of Social Action: Studies in Conversation Analysis. Cambridge University Press, Cambridge, UK, pp. 21–7
Tagliamonte S, Hudson R 1999 Be like et al. beyond America: The quotative system in British and Canadian youth. Journal of Sociolinguistics 3(2): 147–72
Thomas J 1995 Meaning in Interaction. Longman, London
Trudgill P 1988 Norwich revisited: Recent linguistic changes in an English urban dialect. English World-Wide 9(1): 33–49
Wardhaugh R 1998 An Introduction to Sociolinguistics, 3rd edn. Blackwell, Oxford, UK
J. Holmes
Sociology, Epistemology of
The word ‘epistemology’ will be taken here in its etymological acceptation, which is the one kept in the Romance languages, namely as philosophy of science. Like any other special branch of the philosophy of science, the epistemology of sociology may be construed as having three main components: ontology (or metaphysics), epistemology in the narrow sense (theory of knowledge), and methodology (or normative epistemology). This order is not arbitrary, for every research strategy depends upon the type of knowledge being sought, which in turn depends on the nature of the thing studied. Thus, if the object of study is concrete, like a photon or a social group, and if what is sought is objective knowledge of it, then the scientific method will be adopted. If, by contrast, the object of knowledge is assumed to be spiritual, and if things spiritual are deemed to be beyond the reach of science, then an alternative method will be preferred, such as revelation, intuition, interpretation (Verstehen), unchecked speculation, or narrative. In this article the ontology, epistemology and methodology of sociology will be glanced at.
1. Ontology: Nature of the Social
1.1 The Idealism–Materialism Dilemma
Most sociologists adopt the vulgar view that the human being is a composite of body and mind (or spirit or soul), and do not ask how these two components are kept together. Hence they do not
characterize social facts as either material or spiritual, and do not worry about the place of sociology in the system of human knowledge: they are dualists. But some sociologists do worry about these questions, and argue for or against one of two monistic worldviews: idealism or materialism. Idealism is the view that everything is either ideal or ultimately ruled by ideas, whereas materialism holds that all existents are material or concrete. In social studies, the idealism/materialism cleavage appears notably in the place and relative weight assigned to the so-called ideal (cultural and political) superstructure vis-à-vis the material (biological and economic) infrastructure. This very distinction betrays a dualistic stand, even if the pre-eminence or priority of either ‘structure’ (system) is postulated. Consistent idealists will assert that everything social is spiritual (or cultural), whence sociology is a Geisteswissenschaft (science of the spiritual). This is the view of hermeneutics, symbolic interactionism, phenomenological sociology, and ethnomethodology. By contrast, materialists assert that persons and social groups are material, though not necessarily confined to the physical or even the biological level. And, unlike idealists, they claim that all cultural and political activities are influenced or even determined by environment, reproduction, and material production. While it is undeniable that the reproductive patterns and means of subsistence condition culture and politics, it is no less true that the latter in turn help shape the former. In a systemic perspective, changes in the environment and in any of the subsystems of society (the biological, economic, political and cultural) may initiate processes altering any of the others. People may work or fight, live or die, for King and country, no less than for food and family.
There is no ‘last analysis.’
Marxism is the oldest, and still the most influential, of the materialist schools in social science. However, given its proposed division between the material infrastructure and the ideal superstructure, it is questionable that Marxism is consistent. Moreover, philosophical Marxism is nowadays split into a large number of different schools, none of which is flourishing. By contrast, some of Marx’s contributions to social science, particularly his emphasis on production, on the economic roots of the power elites, and on the economic causes of many an international conflict, have survived. Even some of Weber’s studies are materialist, particularly those on the congruence between Hinduism and the Indian caste system, and on the causes of the decline of slavery in ancient Rome. Likewise the Annales historical school owes much to Marx.
Sociobiology is the newest and most popular materialist school in social studies. Sociobiologists hold that everything social is ultimately biological, having evolved as an adaptation to the natural environment. This view overlooks all the types of self-destructive and antisocial behavior, from suicide to military aggression to environmental degradation to gullibility and ideological zealotry. And it ignores all social inventions (like formal education) and social revolutions (like the information revolution); hence it is basically conservative. The sociobiological speculations on the origin of mental faculties are even more far-fetched and they lack empirical support. Still, something does remain of the attempt to reduce social science to biology, namely, the reminder that sociologists cannot afford to ignore human physiology and ecology. For example, they should not ignore that malnutrition stunts development and limits productivity; that plagues, foreign invasions and wars are bound to eliminate some genes and change the immune system; and that a few more centuries of uncontrolled environmental degradation are likely to make our planet uninhabitable.
1.2 The Individualism–Holism–Systemism Trilemma
There are essentially three consistent views on social structure: that it does not exist, that it is self-existing, and that it is a property of a system composed of individuals. The first, or individualism, identifies ‘society’ with ‘crowd.’ The second view, or holism, regards society as an organism, and the individual as wholly subordinated to it. The third, or systemism, views society as a system made up of individuals linked to one another by ties of different kinds and strengths. Individualism stresses agency at the expense of structure, and tends to explain action in terms of ‘rational’ (self-interested) decisions. Holism emphasizes collective traits, which it claims to be unexplainable, and it explains individuals by wholes. Lastly, systemism admits the novelty of systemic properties, as well as the possibility of explaining them in terms of interactions among individuals and systems.
Individualists rightly see in individual action the root of everything social. And, even if they acknowledge the importance of the social context or situation, they resist the very idea of a social system (family, village, firm, etc.). Radical individualism cannot work in sociology if only because the idea of an individualist social science makes as much sense as the idea of a collectivist psychology. After all, sociology studies social facts and, although these result from individual actions, they have nonindividual traits. Families, businesses, schools, armies, religious congregations, and other social systems are structured groups, and a structure is a set of relations, among them the bonds holding the system together. Likewise, social processes, such as trade, governance, and war, are impersonal affairs. This is why there are no radical individualist sociologists, with the possible exception of Homans. Many who call themselves methodological individualists, from Weber and Pareto to Coleman and Boudon, admit the peculiarity of the social, and only attempt to explain it as emerging from individual actions: they behave more like systemists than like individualists.
Holists rightly emphasize the centrality of connections, but they exaggerate in claiming that social structure precedes and dominates individuals. This view is logically untenable, because there are no relations without relata. In reality there are only interrelated individuals: relations, like their relata, are only the products of abstraction. For instance, there is no such thing as ‘married’ in itself, any more than there are unrelated spouses. Hence relations cannot act upon individuals, but interrelated individuals can alter or even sever some relationships, though never all of them at the same time except through suicide.
Systemism, often confused with holism, is a sort of synthesis of the latter with individualism. It holds that, because individuals are interrelated, individual behavior can only be understood in terms of both intentions and social constraints and stimuli. Thus, agency and structure are mutually inseparable. And social change, though brought about by individual action, consists in structural change, that is, change in the nature and strength of the bonds that hold social systems together. A social system may be characterized by its composition, environment (natural and social), structure, and mechanism. A change in any of these features may modify the other three. However, some features may resist changes in the others. For example, social order may survive demographic turnover and environmental catastrophe. Still, all four aspects coexist and, in the long run, important changes in any are bound to alter the other three. Yet individualism ignores structure and underrates environment, whereas holism ignores composition and mechanism, and underrates the natural environment.
Only systemism accounts for all four, and is thus likely to favor realistic models of social groups and effective social policies.
2.
Epistemology: Knowledge of the Social
There are as many views on the nature of social studies as general views on the nature of knowledge: skepticism, apriorism, empiricism, and realism.
2.1
Skepticism
Skepticism comes in two strengths: moderate (or methodological) and radical (or systematic). The moderate skeptic doubts a thesis as long as no solid evidence for it is available. By contrast, the radical skeptic questions the very possibility of knowledge of some kind. Whereas moderate skepticism is part of the
scientific attitude, radical skepticism is destructive and even self-destructive for, if everything is doubtful, why not skepticism itself? The variety of radical skepticism most common concerning the social is cognitive relativism, or the view that social facts cannot be known objectively. This view has been falsified by the growing pile of reliable social data and approximately true sociological hypotheses. Since every research project sets out to find out the truth about something, it presupposes the ability to find it.
2.2 Apriorism
Apriorism is speculation without reality checks. The ‘grand social theories’ of old were apriorist. So are the sociobiological theories on the alleged genetic roots of all social traits, from the incest taboo to social stratification to ideals of human beauty. However, nowadays the most popular a priori theories in social studies are the rational choice models. All of these assume that we always act in self-interest and moreover ‘rationally,’ that is, striving to maximize our expected utilities. These theories are a priori because they are seldom if ever checked against empirical data. In particular, they postulate that agents have certain preference rankings (or utility functions), that are seldom checked. Moreover, some rational choice models are untestable. Thus, if agents behave in an obviously irrational fashion, they are said to be subjectively rational—as when they play lottery. And if people are altruistic, it will be argued that altruism is nothing but enlightened self-interest. Being impregnable to empirical data, such models are unscientific.
2.3 Empiricism
Empiricism is the view that only empirical data count: that they are the source, content, and final judges of ideas. Note that the emphasis is on data (reports on observable facts), not on objective facts. So much so, that the neopositivists disallow all talk of objective facts as metaphysical, in going beyond experience. However, empiricism is not limited to positivist philosophers. Behaviorism is a prime example of empiricism, since it deliberately avoids the neural springs of behavior and it eschews theorizing. Ethnomethodology, though ostensibly tributary to the antiempiricist philosophies of Dilthey and Husserl, is another instance of empiricism, since all its practitioners do is to record overt everyday behavior. Empiricism condemns students to the shallowness characteristic of common sense, because the most interesting facts of any kind are invisible, whence they must be guessed. For example, we can see a status indicator but not status, let alone class; and we can hear political talks, but not the interests that they manifest or disguise. Only sociological theory can suggest the search for social data that indicate social process, in particular the mechanisms behind appearances. By banning theory, or restricting it to correlating commonsense data, empiricism curtails experience. Moreover, its limitation has elicited some of the postmodern attacks upon science.
2.4 Scientific realism
Scientific realism, or objectivism, has two components: the ontological thesis that the world outside the knower exists on its own, and the epistemological thesis that it can be known. As applied to sociology, scientific realism is the view that social facts are just as real as physical facts, and that they can be known by blending theory with observation, measurement, and (whenever possible) experiment. Such knowledge is objective, but partial and gradual. In the case of social facts, research must include not only the facts themselves but also the way the actors perceive them—for example, the fact that people gauge social inequalities, and react to them, by reference to the adjoining social group. Although scientific realism demands that theories be ultimately justified by data, it does not stifle the sociological imagination the way empiricism does. A realist concedes that, just as some data trigger the curiosity of the theorist, at other times untested theories motivate the search for pertinent empirical data. And, on still other occasions, advances come from the confrontation of either rival theories or mutually incompatible data. In short, according to scientific realism, theorizing and empirical research should go hand in hand, now matching, now challenging one another. On the critical side, scientific realists will ask interpretivists and other armchair students to exhibit empirical evidence for their ‘interpretations’ (hypotheses). They will ask rational choice theorists to specify and justify the utility functions they impute to people. They will remind Marxists and other ‘economic imperialists’ that real people engage in some activities that are economically unproductive or even counterproductive, and that many a social change—such as the diffusion of the computer and the revival of religious fundamentalism—has come about with little or no class struggle.
3.
Methodology: Social Research Strategy
Methodology is the normative or prescriptive branch of epistemology: it attempts to lay down, justify, refine, or criticize research rules and procedures. Philosophers are interested only in general or cross-disciplinary strategic principles: they leave special techniques to specialists. Here is a tiny random sample of the class of general methodological problems: How are causation and statistical correlation related? What is a social indicator? How to infer intention from behavior? Can every social feature be quantified, or are there irreducibly qualitative properties? Is an idiographic (or particularistic) science possible? Here we shall only review three rival research strategies.
3.1 Dataism: Collect Data
In the Anglo-Saxon sociological tradition, research is equated with data hunting, gathering, and processing, whereas theory is regarded as mere data compression (as in curve fitting), or even as parasitic upon data collection. To be sure, there is no factual (or empirical) science without empirical data. But data alone a science does not make, because they are what calls for explanation. To understand a bit of the world is to explain it. And explaining involves guessing (and then checking) mechanisms—causal, random, intentional, or hybrid. Hence, theorizing is just as important as data collecting, whether in sociology or in any other factual science. Moreover, only theory, by suggesting interesting variables, can spare us the barrenness and boredom of mindless data gathering.
3.2 Interpretivism: Impute Intention
The interpretivist school holds, rightly, that we can explain an agent’s behavior if we know his intentions. But it overlooks the fact that we only have direct access to behavior, not to intentions. That is, the interpretivist is faced with the inverse problem of guessing intentions from actions. (Typically, inverse problems are insoluble or have multiple solutions. Think, for example, of the problem of decomposing a given even number into prime numbers, as opposed to the problem of adding prime numbers.) The interpretivist is entitled to guess anything, but not to present his guesses (hypotheses) as true unless they have passed the suitable empirical tests. The imputation of intentions is not only dicey: it is also insufficient, because intentions, even if known, may explain success but not failure. The social constraints upon agency, and the reactions of other people to one’s actions, must be envisaged too. That is, individuals must be placed in their systems.
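The contrast between direct and inverse problems mentioned in the parenthesis can be illustrated with a minimal sketch. It assumes, for simplicity, that ‘decomposing’ means writing an even number as a sum of exactly two primes; the function names `is_prime` and `prime_pair_decompositions` are hypothetical and chosen only for this illustration.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality check; adequate for small illustrative numbers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def prime_pair_decompositions(n: int) -> list[tuple[int, int]]:
    """Inverse problem: list every unordered pair of primes (p, q) with p + q == n."""
    return [(p, n - p) for p in range(2, n // 2 + 1)
            if is_prime(p) and is_prime(n - p)]


# Direct problem: adding two given primes has exactly one answer.
print(3 + 17)

# Inverse problem: the same total admits more than one decomposition.
print(prime_pair_decompositions(20))
```

The direct problem yields a single total, whereas the inverse problem returns two candidate pairs for the total 20. In the same way, a single observed action may be compatible with several imputed intentions, which is why a guess about intention needs an independent empirical test.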
3.3 Scientism: Gather Data and Craft Theory Consistent with Data
Scientism is the thesis that the scientific approach is the best of all approaches to cognitive problems of all kinds in all fields. Scientism should not be mistaken for naturalism, or the thesis that the social world is a
fragment of nature, and consequently social research should be conducted in exactly the same manner as the study of nature. One may admit that everything social is made, while holding that, once made, a social item is out there, along with rivers and butterflies. That is, social reality is constructed, but not by the knower. And it is changeable, but not because the knower changes their viewpoint. Scientism does not involve the denial of the specificity and irreducibility of the social. It only holds that the scientific method is applicable across all the disciplinary barriers, hence that it is advisable to trespass them whenever they are artificial. In the case of sociology, scientism favors close relations with all its sister disciplines, particularly economics, politology, and historiography. Scientism also promotes the proliferation of social indicators, the production and statistical processing of data, and the crafting of empirically testable theories—in particular mathematical models. Hundreds of social indicators have been invented, measured and used since the mid-1960s. However, their very nature is still somewhat unclear, due to insufficient philosophical investigation. A social indicator bridges the observed to the unobserved, effects to causes, symptoms to mechanisms, and data to hypotheses. However, it does so ambiguously unless it can be shown that no other unobserved trait or event could possibly account for the data. For example, the popularity of low-cost merchandise may measure either consumer satisfaction or poverty; and high church attendance may measure either religiosity or the need for social integration (or for respectability). Ambiguity is typical of empirical indicators. By contrast, the indicators used in the more advanced sciences are unambiguous because they are backed by theories. For example, pH is a reliable measure of acidity because it is an index of proton concentration, and theory defines acids as proton donors. 
The moral for sociology is that the reliability of a social indicator can be increased either by grounding it in a theory, or by using it jointly with other indicators of the same hidden trait. Values and morals would seem to defy scientific realism. After all, Weber held that science ought to be value-free and morally neutral because he wished to preserve objectivity as well as the difference between science and ideology. Now, it is certainly correct to distinguish value judgments from factual statements. But not all values are subjective: some, such as peace, welfare, and truth, are not a matter of taste. Moreover, it is possible to investigate scientifically how values and moral norms emerge, and why some of them have become obsolete. It may even be argued that there are moral facts, such as misery and torture, and consequently that some moral judgments can be justified scientifically. Example: ‘Anomie is wrong because it is personally and socially destructive.’ It is also correct to distinguish science from ideology. Social scientists endeavor to understand social reality, not to alter it.
However, some of their findings can be used by social activists, politicians and managers to alter social conditions in a rational fashion rather than through costly improvisation. That is, a scientific ideology—one based on social science and empirically testable—is not out of the question.
4.
The Sociology–Epistemology Connection
The traditional view of the relation between science and philosophy is that they are disjoint or even mutually incompatible. This view is mistaken, because science teems with philosophical concepts, such as those of reality, law, and knowledge, and because scientific research presupposes the reality and knowability of the world—a philosophical thesis. In short, philosophy and science, in particular social science, overlap partially. This overlap is only partial because only philosophers analyze philosophical constructs, and only scientists are equipped to study social facts. However, not all philosophies are congenial with science. Whereas some philosophies are extraneous to science, others are relevant to it. For example, phenomenology is extraneous to science because it rejects the scientific method and consists in egology, or the self-study of the self. Existentialism is even more alien, because it rejects rationality and does not deal at all with problems of knowledge. The philosophies congenial with science examine, among others, the problems, concepts, hypotheses, and methods common to science and epistemology. Hence, they can be judged to be adequate to the extent that they propose true accounts of actual scientific practice and help advance science. For example, Popper’s philosophy is congenial with science because of its rationalism and objectivism. But his devaluation of confirmation is incompatible with the search for truth, since a hypothesis can only be said to be true to some extent if it has been well confirmed.
5.
Sociology of Epistemology
The sociology of philosophy is only embryonic, and mostly speculative. In particular, little is known about the social roots of the various epistemologies of sociology. For one thing, it is not true that they are motivated by class interests. Nor is it true that all idealists are reactionary and all materialists progressive. For example, Hegel’s followers split into a left-wing and a right-wing. The second Comte inspired both the Mexican conservatives led by Porfirio Díaz, and their contemporary Brazilian progressives. What is true is that, once crafted for whatever reasons, some philosophical views favor certain social groups rather than others. For instance, radical skepticism, even while revolting against the establishment, objectively favors the latter in that it discourages scientific social research and therefore
reasoned and well-organized opposition. Individualist philosophies favor anarchism and liberalism. By contrast, holistic philosophies jibe with communitarianism and totalitarianism. In short, some philosophies, particularly if fuzzy, can be craftily interpreted to suit anyone’s social interests. But the more abstract philosophical ideas—in particular the logical and semantical ones—are socially neutral. Thus, neither the logical calculi nor the theories of (semantic) meaning and truth refer to social facts, whence they cannot support any social ideology. However, they are useful to examine the cogency of any views whatsoever, and to this extent they have a subversive potential.
See also: Action, Theories of Social; Empiricism, History of; Grounded Theory: Methodology and Theory Construction; Interpretive Methods: Macromethods; Interpretive Methods: Micromethods; Macrosociology–Microsociology; Marx, Karl (1818–89); Methodological Individualism in Sociology; Methodological Individualism: Philosophical Aspects; Neutrality: Axiological; Phenomenology in Sociology; Positivism: Sociological; Rational Choice Theory in Sociology; Sociobiology: Overview; Sociology, History of; Structure: Social; System: Social; Theory: Sociological; Weber, Max (1864–1920)
Bibliography
Albert H 1994 Kritik der reinen Hermeneutik: Der Antirealismus und das Problem des Verstehens. Mohr, Tübingen
Barnes B 1977 Interests and the Growth of Knowledge. Routledge & Kegan Paul, London
Boudon R 1980 The Crisis in Sociology. Columbia University Press, New York
Boudon R 1994 The Art of Self-Persuasion. Polity, Cambridge
Bunge M A 1996 Finding Philosophy in Social Science. Yale University Press, New Haven, CT
Bunge M A 1998 Social Science under Debate. University of Toronto Press, Toronto
Bunge M A 1999 The Sociology–Philosophy Connection. Transaction Publishers, New Brunswick, NJ
Caplan A L 1978 The Sociobiology Debate. Harper & Row, New York
Dilthey W 1959 Einleitung in die Geisteswissenschaften. In: Gesammelte Schriften, Vol. 1. Teubner, Stuttgart
Durkheim E 1895 Les règles de la méthode sociologique. Flammarion, Paris
Fay B 1996 Contemporary Philosophy of Social Science. Blackwell, Oxford
Fox R 1989 The Search for Society: Quest for Biosocial Science and Morality. Rutgers University Press, New Brunswick, NJ
Geertz C 1973 The Interpretation of Cultures. Basic Books, New York
Gellner E 1973 Cause and Meaning in the Social Sciences. Routledge & Kegan Paul, London
Menger C 1968 Untersuchungen über die Methode der Sozialwissenschaften, und der politischen Oekonomie insbesondere. In: Gesammelte Werke, Vol. 2, 2nd edn., Mohr, Tübingen
Merton R K 1968 Social Theory and Social Structure, rev. and enl. edn., The Free Press, New York
O’Neill J (ed.) 1973 Modes of Individualism and Collectivism. Heinemann, London
Pareto V 1963 A Treatise on General Sociology, 4 Vols. Dover, New York
Popper K R 1957 The Poverty of Historicism. Routledge & Kegan Paul, London
Popper K R 1945 The Open Society and its Enemies, 2 Vols. Routledge & Kegan Paul, London
Weber M 1968 Gesammelte Aufsätze zur Wissenschaftslehre. Mohr, Tübingen
Winch P 1958 The Idea of a Social Science. Routledge & Kegan Paul, London
M. Bunge
Sociology, History of
The standard way in which surveys of the history of sociology are written and taught still lags behind actual developments in the field itself. The writing and teaching of the history of sociology continue to be strongly dominated by the tradition of the history of ideas. Only recently have some major attempts been made to view the history of sociology from a sociological angle, using sociological concepts, methods, and theories. In the absence of a standard sociological survey of the development of sociology, we shall in this outline mainly follow the familiar textbook trajectory. This approach inevitably implies distortions. It suggests a pattern of Whig history, in which the history of sociology is made into a neat succession of successes. To balance that bias, the reader should bear in mind that the development of sociology itself has been a social and cultural process, consisting of a multitude of short-term planned actions and interactions with aggregate results that in the long term have not been planned by any single individual. This process took place, moreover, as a part of wider social and cultural developments. The context was primarily European and, since the second half of the nineteenth century, North American; yet, as Europe and North America belonged to a more encompassing global constellation, we should also consider the relevance of the processes of colonization and decolonization affecting nineteenth and twentieth century society and culture all over the world (Connell 1997, Collins 1997). Sociology, furthermore, was one of the social sciences amidst other intellectual disciplines: academic disciplines ranging from physics and biology to history and philosophy, as well as nonacademic disciplines including journalism, political debate, and literature. Sociology took on an articulate form in continuous dialog with those other disciplines, sometimes borrowing ideas from them, sometimes opposing them.
Most of the controversies within sociology itself reflected its relations to the larger field of intellectual activities.
This would be true of the moral or political views informing sociological theories as well as of the preference for either quantitative, ‘scientific’ methods or a qualitative, ‘hermeneutic’ approach (Lepenies 1988). In addition to the differentiation into (a) theoretical and methodological orientations, sociology has also become differentiated according to (b) empirical specializations, and (c) national traditions. These various forms of differentiation are respectively related to (a) the prevailing orientation to other intellectual disciplines including science, philosophy, literature, and journalism; (b) the general process of differentiation of functions in society at large; and (c) international relations. While the variety of empirical specializations and national traditions is easily visible, this is not so clearly the case with the way the national traditions are ‘nested’ in an evolving international constellation characterized by specific hierarchies, loyalties, and corresponding intellectual interests. We shall not continuously refer to the more general conditions affecting sociology in the following account which, because of its encyclopedic format, is written in accordance with standard practice. The reader should realize, however, that these conditions are relevant to each of the five major stages into which the development of sociology can conveniently be divided:
(a) A long ‘predisciplinary’ stage up to 1830, marked by what in retrospect may be recognized as ‘protosociologies.’
(b) The formation of an intellectual discipline, 1830–1890.
(c) The formation of an academic discipline with diverging national traditions, 1890–1930.
(d) The establishment as a fully fledged academic discipline with autonomous degrees, departments, and research facilities and an emerging international hierarchy, 1930–1970.
(e) A period of crisis, fragmentation, and attempts at new synthesis, 1970–2000.
In discussing the successive stages we shall have to observe a basic rule of ‘phaseology’ (Goudsblom 1996): when a later phase begins, new elements are added to the previous constellation; but most of the elements which were already present in the earlier phase will continue to exist, albeit in a modified guise.
1.
The Predisciplinary Stage
By a discipline we mean a ‘unit of teaching, research and professional organization’ (Heilbron 1995). There is a body of knowledge, recognized as such under a generally accepted name, set down in textbooks and discussed in professional journals by practitioners. All of these elements were absent from the field of sociology prior to the nineteenth century. And yet we can list many names of people who may, arguably, be regarded as predecessors, ‘protosociologists’: people who reflected about social phenomena, collected data
about social life, or tried to find patterns in human history. Some individual contributors to these traditions have gained great renown in their own cultures. Thus, from ancient Greece, the names of such men as Herodotus, Plato, Thucydides, and Aristotle stand out. Although they have been ‘appropriated’ as founding fathers by the practitioners of the established academic disciplines of philosophy and history, many of the problems they addressed lay in the area of what was later to become sociology. Chinese, Indian, and Arab culture have similar intellectual traditions; here, too, a few individuals have become the famous exponents of those traditions (Collins 1998). In this brief survey, we shall be able to pursue only the Western line leading up to the emergence of sociology as an intellectual discipline in Western Europe and North America in the nineteenth and twentieth centuries. Since the Middle Ages, there was, on the one hand, a strong tradition of philosophizing about human social life, mainly in theological and moral-legal terms, and, on the other hand, an increasingly strong tradition of collecting facts, most often related to state matters and focused on knowledge of the population and its resources, both at home and abroad (as reported by explorers and travellers). By and large, the two traditions stood apart. The transition from the predisciplinary stage to the actual conception of sociology as an intellectual discipline in its own right occurred in the eighteenth and nineteenth centuries. The shift consisted partly in conscious attempts to bridge the gap between theory and practice, between philosophical reflection and the collection of facts about social life. The leading intellectual model, furthermore, was sought no longer in theology or metaphysics but in empirical science.
And, in so far as the focus had been either on the political realm or on more ‘private’ moral issues, it now moved to the structure and development of ‘society’ at large. This threefold shift in orientation was pioneered by the groups which have become known as the French philosophes and the Scottish moral philosophers. Their work was developed in different directions during the last decades of the eighteenth and the first decades of the nineteenth century. Among the philosophes, Montesquieu was one of the first and most influential. He demonstrated that types of government are to be understood in relation to their moral and physical infrastructure. For an understanding of such connections, a recognition of the ‘general spirit’ of a nation is central. This ‘spirit’ or ‘common character’ results from the combined effect of various causes, including climate, religion, and commerce. Instead of drawing deductions from an original principle such as a political or social contract, Montesquieu thus started to unravel the complex interdependencies of human society. While Montesquieu pioneered the study of what Auguste Comte later was to call ‘social statics,’ his younger contemporary, the future statesman Turgot,
pointed to the possibility of also viewing human society from a developmental perspective, in other words, in terms of ‘social dynamics.’ That idea was further elaborated in theories of successive stages or phases by such writers as Condorcet and Henri de Saint-Simon. In a kindred spirit, the Scottish moral philosopher David Hume proposed introducing ‘the experimental method of reasoning into moral subjects.’ Hume and his younger friend Adam Smith rejected theories based on an assumed state of nature. Such reasoning had characterized the natural law tradition—the predominant framework for early modern theories of government and moral obligation (Grotius, Hobbes, Locke, Pufendorf ). For Hume, the state of nature was a hypothetical construct, incompatible with the precepts of empirical science. To consider states as the result of an original voluntary act of agreement was illusory and contrary to scientific procedure. Social institutions, in Adam Ferguson’s famous phrase, ‘are the result of human actions, but not the execution of any human design.’ From this perspective, morality would have to be studied historically and comparatively, tracing its development through the successive stages of hunting, shepherding, agriculture, and commerce. As the idea of a general social science gained ground, various attempts were made to connect that idea to existing scholarly disciplines. Some groups, particularly in France, pursued the ideal of a social science modeled after the natural sciences. Thus, by applying probability theory to voting procedures and judicial decision making, mathematicians such as Condorcet and Laplace proposed a ‘social mathematics.’ British liberal thinkers such as Jeremy Bentham and James Mill somewhat similarly tried to build a social and political theory upon the idea of a moral arithmetic. Individuals and governments alike should promote the amount of happiness and reduce the amount of pain. 
Their ‘felicific calculus,’ aimed at the greatest happiness of the greatest number, provided the means for assessing the utility of public institutions. Others, such as Saint-Simon, couched their analysis of early industrial society in a medical idiom, advocating a ‘social physiology.’ Conservative social theorists such as Edmund Burke, Joseph de Maistre and Louis de Bonald, by contrast, while recognizing the reality of ‘social relations’ and ‘society,’ opposed the notion of a natural science of society, and rather sought inspiration in theology, philosophy, and history.
2. The Formation of an Intellectual Discipline, 1830–90
In the course of the nineteenth century empirical social research became a regular and organized endeavor. Most of it was centered on the ‘social question’ and intimately linked with governmental agencies and
reform movements. Besides a multitude of local initiatives, national associations arose for the coordination of these efforts: the French Académie des sciences morales et politiques (1832), the British National Association for the Promotion of Social Science (1857), the US Social Science Association (1867), the German Verein für Socialpolitik (1873). More theoretically informed work made use of the research findings, but was carried out mainly by individual scholars who either possessed private means or made a living by writing and lecturing. Thus Auguste Comte, John Stuart Mill and Herbert Spencer were for most of their professional lives publicists, men of letters with a strong interest in the sciences. All three contributed significantly to the formation of sociology as an emerging discipline, but none of them was a regular academic or was actively involved in social research. Other writers, such as Alexis de Tocqueville, Karl Marx and Adolphe Quetelet, shared similar interests, but consciously avoided the term sociology. Marx and Tocqueville have been incorporated in the canon of the discipline only after 1945. While Tocqueville called for a ‘new political science for a new world,’ Marx’s critique of political economy resulted in a theory of capitalism and class domination. The Belgian astronomer and statistical entrepreneur Quetelet continued the efforts of Condorcet and Laplace. He considered the ever growing body of statistics to be the proper basis for a ‘social physics.’ When Comte introduced the term ‘sociology,’ he was likewise concerned with the idea of a fundamental science of human society. At the time, Comte had no interest in applied knowledge, rejected premature specialization, and wanted to construct an appropriate foundation for social science. The term sociology captured all of these ambitions quite well.
It conveyed the idea of a positive science, suggesting both a high level of generality and a great degree of independence from the natural sciences. For Comte sociology was to become an uncompromising positive science which, however, should not be conceived of in terms of either physics or physiology. Sociology was to biology what biology was to physics: a relatively autonomous science with its own subject matter and methods. In the Cours de philosophie positive (1830–42) Comte explained that different methods prevail in the various sciences: the experimental method in physics, the comparative method in biology, the historical method in sociology. Sociology had to be founded on a historical law. Since human beings specifically have the capacity to learn, the development of knowledge is the core of human development, and Comte considered his law of the three stages in the development of knowledge as the proper beginning of sociology. Comte’s plea for sociology as the general social science was well received in the UK, first in the circle of John Stuart Mill, then by Herbert Spencer. Mill rejected Comte’s criticisms of psychology and political
economy, but admired his philosophy of science and embraced the idea of sociology. Yet, like the French followers of Comte, Mill never wrote a properly sociological study. One of the first sociological treatises appeared in the 1870s and was written by Spencer—the most widely read sociologist of the nineteenth century. For Spencer, progressive change was the common denominator of all natural processes. From the maturation of an embryo to the evolution of life in general, all living things evolve from the simple to the complex through successive differentiation and integration. Evolution is the natural process of change from incoherent homogeneity to coherent heterogeneity. Human societies are no exception: they too develop through differentiation, a process steadily accompanied by higher levels of integration and coordination. Spencer’s view of evolution was thus broader in scope than both Comte’s sociological and Darwin’s biological theory. Combining a laissez-faire view borrowed from economics with an embryological model of growth, Spencer gave the evolutionary principle the status of a universal law and made it the core of his all-embracing system of synthetic philosophy. As stated in his influential essay ‘The Social Organism’ (1860), Spencer’s sociology was cast in the organic idiom, while carefully maintaining an individualist perspective. In Principles of Sociology (1876–97) he elaborated his analysis, emphasizing the passage from ‘militant’ to ‘industrial society.’ In the latter, social integration is no longer imposed from a controlling center, but the spontaneous result of individuals who cooperate on the basis of a division of labor. Spencer strongly favored a laissez-faire stance, opposing state politics (in matters of welfare as well as in colonial expansion), and inspired what was—inaccurately—called social Darwinism. His position as a freelance writer and lecturer allowed Spencer to make virtually unlimited claims for the scope of sociology. 
While he was most outspoken and influential in articulating the idea of sociology as an intellectual superstructure sustained by an evolutionary perspective and overarching all the other social sciences, that idea was shared by several other leading advocates of sociology in Europe and the USA.
3. The Formation of an Academic Discipline, 1890–1930
Sociology as an academic discipline emerged between 1890 and 1930. The first courses and chairs were established, sociological journals appeared as the primary outlet for research, and sociological associations were founded for furthering intellectual exchange and professional interest. France, the UK, Germany, and the USA were the core countries. In all
of these countries, distinct national traditions took shape, with the curious effect that some of the academic pioneers, most notably Durkheim and Weber, never referred to each other. Emile Durkheim was the pivotal figure in French sociology. He taught his first course in 1887, wrote classic works on the division of labor, suicide, and the rules of sociological method in the 1890s, and organized a network of highly productive scholars around his journal, the Année sociologique (1898–1912). By proclaiming that social facts had to be explained by other social facts, Durkheim broke away from the predominant biologistic and purely psychological approaches. His explanation for variations in the suicide rate was a paradigmatic example. The social realm was a reality sui generis with its own structure and regularities. Systems of collective representations are an essential part of social life, not merely when studying religion or morality, but also for understanding the economy and political institutions. Following this line of argument, Durkheim and his collaborators (Marcel Mauss, Henri Hubert, Marcel Granet, Maurice Halbwachs, François Simiand) worked on a wide range of topics, contributing to many fields outside of sociology proper. In the UK sociology did not gain a proper entry into the university system until well after World War II. The rich tradition of factual inquiry was continued by government officials and wealthy individuals (Charles Booth, B. S. Rowntree), but remained almost entirely separated from academia. Despite the existence of a sociological journal and a professional association, the British academic establishment did not show much interest in sociology. For a long time the only chair was the one established in 1907 at the London School of Economics for L. T. Hobhouse, who, in 1929, was succeeded by his pupil Morris Ginsberg.
Both men maintained a cautious evolutionary perspective but neither one built up a network of productive scholars. Whereas British academics showed little interest in studying their own society, they had no such inhibition about studying the exotic regions of the British Empire. Social anthropology became a respected academic discipline and partly served as a substitute for sociological concerns. Sociology in Germany emerged later than in France and England; a professional association was founded in 1910, a first chair in 1917, a journal in 1921. The academic pioneers in Germany, men like Ferdinand Tönnies, Georg Simmel, and Max Weber, faced two powerful indigenous traditions. The oldest was that of the Staatswissenschaften. The proponents of these disciplines (law, economics, administration) consistently subsumed civil society to the various functions of the state. This mode of thinking had its roots in the eighteenth-century system of centralized political administration and economic policy known as cameralism and was refashioned in the nineteenth century by idealist philosophers such as Hegel. After the
founding of the German Empire, the members of the Verein für Socialpolitik maintained much of this conception of social science. They were opposed by liberal economists like Carl Menger during the first Methodenstreit, and by Max Weber and others during the second Methodenstreit. An important aim in Weber’s argument about the value-freedom of scientific inquiry was to create a greater distance from the practice of the state sciences. The other tradition with which German sociologists grappled radically opposed the natural to the cultural sciences: the first were supposed to be concerned with causal explanation, the second with interpretative understanding. Early sociologists such as Comte and Spencer had often been rejected in Germany as representatives of an inadequately naturalistic conception of the human sciences. Reflecting on these epistemological issues, Max Weber proposed to transcend the dichotomy by ingeniously defining sociology as the science of social action, which by interpretative understanding seeks to explain its course and consequences. Weber was nominated for a chair in sociology shortly before his death in 1920. His achievements and those of others of his generation (Simmel, Alfred Vierkandt, Werner Sombart, Franz Oppenheimer) contributed to the recognition of sociology as an academic discipline during the Weimar Republic (1919–33). Sociology expanded rapidly, opening opportunities for a generation of young scholars including Max Horkheimer, Theodor Adorno, Theodor Geiger, Karl Mannheim, and Norbert Elias. Although these men had initially been trained in different academic fields, they were ready for brilliant careers in sociology. However, the seizure of state power by the National Socialist Party in 1933 put an abrupt end to the favorable outlook for German sociology, forcing all these men, along with many of their colleagues, into exile. Sociology’s most rapid expansion took place in the USA.
The first courses were taught in the 1870s and 1880s, while the expansion of higher education in the 1890s produced rapid institutionalization. Within two decades sociology was taught in most colleges and universities; textbooks standardized the concepts and methods of the new discipline. US sociology developed primarily on the basis of empirical research into social problems. The main topics were the issues which were widely debated by politicians, religious leaders, and social reformers: massive immigration, urban problems, industrialization, local communities, social (dis)integration, ethnic and race relations. Empirical studies proliferated, based either on statistical material or on actual fieldwork. By the 1920s the sociology department in Chicago, founded by Albion Small in 1893, had become the predominant center. Small also played a key role in the American Journal of Sociology (1895) and, together with George E. Vincent, wrote the most widely
used textbook, An Introduction to the Study of Society (1894). The work of the Chicago School was founded on qualitative case studies, using observations, interviews, and life histories. Exemplary were William I. Thomas and Florian Znaniecki’s The Polish Peasant (1918) and the collective volume on The City (1925).
4. Establishment as an Academic Discipline, 1930–70
Between 1930 and 1970, sociology became established as an academic field at most universities in the Western world. The number of practitioners and students greatly increased, and there was a proliferation of specialisms in research and teaching. Far from being an even process of growth all over, however, these trends were marked by far-reaching structural changes: a forcible break in the development of sociology in Europe, and a concomitant shift of intellectual leadership in the field from Europe to the USA. That shift, which strongly affected the development of sociology, was illustrated strikingly by two historical introductions to the discipline written toward the end of this period: The Sociological Tradition by Robert A. Nisbet (1966) and The Origins of Scientific Sociology by John Madge (1962). Nisbet’s book celebrated the formative era of the second and third stages, which he designated as ‘the golden age of sociology.’ Here was a US scholar revering the ‘titans’ of nineteenth-century sociology: a small pantheon of authors whose highly personal visions provided the basic ideas and concepts for their more pedestrian successors in the twentieth century. Almost inevitably, all the heroes of Nisbet’s great sociological tradition were Europeans. Madge’s book radiated a very different atmosphere. Madge was an Englishman, proudly hailing the achievements of scientific sociology in the twentieth century. He discussed 12 pieces of empirical research; with the sole exception of Durkheim’s Suicide, they all belonged to the fourth stage in the development of sociology. Many of the studies selected by Madge were published under joint authorship, and, again except for Suicide, they were all carried out in the USA. The leaders of the first generation of US academic sociologists all acknowledged an intellectual debt to European masters.
Even in the early 1920s, it was still not uncommon for promising US social scientists to start their career with a European apprenticeship, as did Talcott Parsons, who went to London and Heidelberg. At the time, however, the westward exodus of sociologists from Eastern Europe was already underway, as the communist regime in the USSR and the fascist regime in Hungary tightened their grip, greatly restricting the margins for sociological inquiry and reflection. 1933 brought the decisive rupture: the seizure of power by the Nazis in Germany put an end to the flourishing sociology of the Weimar Republic, and forced a great number of prominent and promising sociologists into exile, mostly to the USA. In contrast to Europe, sociology in the USA suffered no drastic interruptions in the years between 1933 and 1945. It remained a subject in most college curricula, with textbooks such as Park and Burgess’s Introduction to the Science of Society (1924) continuing to supply an authoritative framework of concepts and ideas. At the same time, practically oriented research in areas of social problems and social policy was stimulated by increasingly generous funding from both government and private foundations (Platt 1996). Empirical research received a powerful boost after 1941, when social scientists were recruited for the war effort and commissioned to study factors fostering the morale of the Armed Forces. After 1945, the USA emerged as the major political, economic, and cultural power in the world. In the social sciences, including sociology, US examples were virtually unrivaled in setting an international standard of competence in method and theory. An impoverished Europe lacked the resources for challenging the US ascendancy. The relatively autonomous sociological traditions in Germany and Italy were virtually destroyed. Sociology in the UK had only a minor position at the London School of Economics. In France the Durkheimian tradition continued to exert considerable intellectual influence among anthropologists and historians, but the actual institutional basis for sociology as an academic discipline was very weak. Thus the scene was set for US dominance. The sociology departments at Harvard and Columbia University in particular gained international renown.
At Harvard, Talcott Parsons developed a new synthesis of sociological theory, mainly derived from European predecessors, which he presented as a ‘general theory of action’—although in its worldwide reception it became known as ‘structural functionalism’ or simply ‘functionalism.’ Still, Parsons’ generally recognized status as the world’s leading sociological theorist did not imply that his ideas put a mark on the entire discipline. There were some influential rival theorists, including his Harvard colleague George Homans who advocated a strictly scientistic ‘exchange theory’ inspired by the model of homo economicus. At Columbia, Parsons’ former student Robert Merton propagated a more modest ideal of ‘theories of the middle range,’ which proved to be very well compatible with the work of his colleague Paul Lazarsfeld, an immigrant from Austria who specialized in the organizational and methodological aspects of social research. The years between 1955 and 1975 in particular spelled a boom period for American sociology (Turner and Turner 1990). The successful launching of Sputnik
by the USSR in 1957 inaugurated an era of unprecedentedly high government funding of scientific research in the United States, which also benefited sociology. In the 1960s, a combination of demographic trends (the postwar baby boom) and political discontent made for a dramatic increase in student enrollments. Professional self-confidence, which had been clearly mounting since the early 1940s, surged to new heights. The auspices were favorable for manifold activities in the realm of academic sociology: systematic elaboration of theoretical ideas in the Parsonian fashion; exploration of the possibilities of rival theories presented under such labels as exchange theory or symbolic interactionism; empirical research, conducted in many forms from intensive ‘qualitative’ case studies to extensive quantitative surveys, and applied to a broad range of social phenomena. Sociologists in most other capitalist-democratic countries outside the USA adopted similarly diverse programs for teaching and research. As the meetings of the International Sociological Association testified, by the end of the 1960s sociology was a blooming pluralistic discipline, with a tacitly accepted hierarchy of prestige headed by US sociologists whose ideas set the agenda for the discipline at large. Unintentionally, the publication of the International Encyclopedia of the Social Sciences in 1968 spelled the ending of that era.
5. Crisis, Fragmentation and Attempts at New Synthesis, from 1970
If the years around 1970 formed a watershed in the development of sociology, the tendencies that then became manifest had been at work for some time. By the time that readers all over the world pored over Alvin Gouldner’s (1970) The Coming Crisis of Western Sociology, that crisis actually was in full swing. As Gouldner pointed out, mainstream academic sociology as represented by the combination of functionalist theory and survey research was rapidly losing its appeal to a new and vociferously critical generation. With much programmatic gusto, new ‘paradigms’ were launched, highly diverse in their implications, but all radiating the same radical spirit. Inspired by political protest movements, a culture of dissent arose in sociology, criticizing the intellectual canon and the professional stance of the discipline. The short-term result of this turmoil was the demise of functionalism as the leading theoretical perspective, and a flurry of controversies about a mixture of ideological, epistemological, and methodological issues. In the longer term, after the controversies receded, a state of fragmentation ensued to which most practitioners tacitly accommodated. Fragmentation took many forms. The most obvious and least contested form was the proliferation of
subdisciplines around an unsystematic array of specific research areas such as the sociology of the family, religion, organizations, etc. Then, equally noticeable but far more controversial, there was a plurality of theoretical orientations, often with corresponding methodological presumptions. Some theories were primarily attuned to specific forms of microsociology, marked by an affinity to various philosophic schools ranging from phenomenology (embraced by ethnomethodologists), through pragmatism (preferred by symbolic interactionists), to neopositivism (for rational choice theorists). At the level of macrosociology, functionalism as a perspective gave way to new approaches influenced by Marxism and emphasizing social conflict and long-term social transformations. Fragmentation was further enhanced by a weakening of the hierarchy in the international sociological community and an increased articulation of national traditions. None of these changes came out of the blue, but together they produced an almost revolutionary shock to the world of sociology. Whereas the most famous sociologists of the previous generation had all been from the USA, now European names again came to the fore: Jürgen Habermas, Pierre Bourdieu, Anthony Giddens. During the early 1970s, while many older sociologists deplored the patent end to the illusion of consensus in the discipline, there was also a widespread sense of excitement over new intellectual possibilities and career opportunities. Ideological debates around many current topics, from race relations to the reform of psychiatry, were saturated with sociological references. In his critique of sociology, Alvin Gouldner (1970) did not present himself as a prophet of doom, but rather as a pioneer in ‘reflexive sociology’ who was able to point at new ways of fulfilling what C. Wright Mills earlier called ‘the promise of sociology.’ The later 1970s saw a dramatic reversal in this mood of confidence.
General public interest turned away from sociology, student enrollments dropped rapidly, research funds shrank; the only sediment of the earlier high tide that remained was internal fragmentation, pluralism. The general stagnation at first reinforced the tendencies toward fragmentation, as established subdisciplines tried to maintain their position in the face of diminishing institutional support for, and intellectual vigor of, the sociological discipline as a whole. However, the 1980s and 1990s also brought new attempts towards a synthesis. Thus, in historical and comparative sociology the orthodox Marxist approach tended to be softened by combining it with a more liberal Weberian perspective. ‘World systems’ theory as originally developed by Andre Gunder Frank and Immanuel Wallerstein helped to break away from the implicit nation-centeredness that characterized most sociological work during the fourth stage. Making the international order, conceived either as a world system, or alternatively as ‘transnational society,’ or in terms of a process of
globalization, into the centerpiece for sociological study gave counterweight to the tendencies toward national parochialism. Other ventures in historical and comparative sociology sought synthesis in a different direction. During the 1980s and 1990s, Randall Collins (1997) valiantly subjected a wide range of long-term cultural trends (including religion and philosophy as well as technology) to sociological analysis, while Jack Goldstone and Charles Tilly brought sophisticated techniques of statistical analysis to bear on huge sets of historical data. Meanwhile Norbert Elias’s work, aiming at a thorough integration of sociology with history and psychoanalysis, was gaining influence inside and beyond the discipline. These promising signs cannot hide the fact, however, that after the decades of rapid growth in the 1960s and 1970s, sociology had entered a period of far less favorable circumstances. The welfare state came under siege in many Western countries, and neoliberal policies in favor of deregulating and extending markets gained the upper hand. Even if the collapse of the USSR and the termination of the Cold War have not spelled ‘the end of history,’ these events have reduced interest in ideological matters and theories of society. A theoretical perspective that commands much public interest seems to be sociobiology, which, as the anthropologist Marvin Harris observes, may well constitute the most provocative challenge to the social sciences. At the universities, the established social science disciplines (economics, sociology, anthropology, social science) have faced increasing competition from new, often multidisciplinary fields of study which are more immediately focused on policy needs (organizational studies, management, business) or the demands of particular student groups (‘Black studies’, gender studies).
The strains of the present situation are also conducive to a revival of the old polarity between ‘scientistic’ and ‘humanistic’ approaches. The scientistic posture has given a great deal of prestige to ‘rational choice’ models by means of which some sociologists are applying assumptions and methods derived from economics to a wide array of noneconomic relations. Others fear that the economic models thus adopted may have the effect of a Trojan horse, undermining sociology at its very center. Ironically, the humanistic approach is posing a similar risk: embracing culturalistic and postmodern ideas, drawn mainly from current anthropology, literary studies, and hermeneutic philosophy, may also distract sociologists from the core of their discipline, the study of social processes and social structures. The opposing social currents at work in the world today, as well as the more directly felt institutional pressures and intellectual challenges, are confronting sociology with a variety of contradictory claims. Recognizing the many sources of confusion is a first requisite in order to arrive at a balanced response to
this situation. Equally important is a continuous reconsideration of the history of the discipline. See also: Comte, Auguste (1798–1857); Disciplines, History of, in the Social Sciences; Durkheim, Emile (1858–1917); Marx, Karl (1818–89); Paradigms in the Social Sciences; Sociology: Overview; Spencer, Herbert (1820–1903); Universities, in the History of the Social Sciences; Utilitarian Social Thought, History of; Weber, Max (1864–1920)
Bibliography
Bottomore T, Nisbet R (eds.) 1978 A History of Sociological Analysis. Basic Books, New York
Boudon R, Cherkaoui M, Alexander J (eds.) 1997 The Classical Tradition in Sociology, 8 Vols. Sage, London
Collins R 1997 A sociological guilt trip: Comment on Connell. American Journal of Sociology 102(6): 1558–64
Collins R 1998 The Sociology of Philosophies. Belknap Press of Harvard University Press, Cambridge, MA
Connell R W 1997 Why is classical theory classical? American Journal of Sociology 102: 1511–57
Goudsblom J 1977 Sociology in the Balance. Blackwell, Oxford, UK
Goudsblom J 1996 Human history and long-term social processes: towards a synthesis of chronology and ‘phaseology’. In: Goudsblom J, Jones E, Mennell S The Course of Human History. M. E. Sharpe, Armonk, NY
Goudsblom J, Jones E, Mennell S 1996 The Course of Human History. M. E. Sharpe, Armonk, NY
Gouldner A W 1970 The Coming Crisis of Western Sociology. Basic Books, New York
Heilbron J 1995 The Rise of Social Theory. Polity Press, Cambridge, UK
Kilminster R 1998 The Sociological Revolution. From the Enlightenment to the Golden Age. Routledge, London
Lepenies W (ed.) 1981 Geschichte der Soziologie, 4 Vols. Suhrkamp, Frankfurt, Germany
Lepenies W 1988 Between Literature and Science: The Rise of Sociology. Cambridge University Press, Cambridge, UK
Levine D N 1995 Visions of the Sociological Tradition. Chicago University Press, Chicago
Madge J 1962 The Origins of Scientific Sociology. Free Press of Glencoe, New York
Mohan R P, Wilke A S (eds.) 1994 International Handbook of Contemporary Developments in Sociology. Greenwood Press, Westport, CT
Nisbet R A 1966 The Sociological Tradition. Basic Books, New York
Platt J 1996 A History of Sociological Research Methods in America 1920–1960. Cambridge University Press, Cambridge, UK
Turner S P, Turner J H 1990 The Impossible Science. An Institutional Analysis of American Sociology. Sage, Newbury Park, CA
Wagner P 1990 Sozialwissenschaften und Staat: Frankreich, Italien, Deutschland 1870–1980. Campus Verlag, Frankfurt, Germany
J. Goudsblom and J. Heilbron
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Sociology: Overview
The institutionalization of sociology has followed a long and nonlinear path. The word sociology was created by Auguste Comte and became established once it was institutionalized in periodicals. Meanwhile, the existence of sociology was recognized in an implicit fashion. When Tocqueville ([1856], 1986) wrote in the very first sentence of his book on The Old Regime and the Revolution ‘This book is not a book of history,’ he meant that it dealt with what we would call today ‘comparative sociology.’ But he could not use a word which was not yet common, and he probably considered Comte an eccentric figure. Max Weber expressed doubts as to whether his work should be qualified as belonging to sociology. The institutionalization of the word and of the discipline owes much to Durkheim and later to the Chicago school of sociology.
1. Sociology: A Unique or Multiple Discipline?
Sociology, commonly defined as the scientific study of social relations, social institutions, and societies, has nevertheless displayed a great diversity of ways of conceiving its objectives, uses, styles, and methods. Durkheim never quoted Weber, nor Weber Durkheim, although they were contemporaries and could read one another. Nisbet (1966) maintains, though, that beyond the differences a common inspiration can be detected in the works notably of Tocqueville, Marx, Simmel, Weber, Pareto, or Durkheim. All would have seen that the rational utopian theory of society developed by the Enlightenment was unacceptable. For this reason, they all insisted on the social functions of religion, of the hierarchical distinctions between social statuses, on the role of traditions, or on the distinction between ‘community’ and ‘society.’ Describing the identity of classical, and still more of modern, sociology can be harder than Nisbet states, though, for both classical and modern sociologists endorse diverse views as to the uses of sociology and the ways it should be conducted.
1.1 Three Types of Objectives
A first type of sociologist sees sociology as having a mainly informative function: Its role would consist in producing data and proposing analyses implicitly and explicitly oriented toward decision makers. Thus, Le Play in his Les ouvriers européens (1855) produced a survey of the conditions of life of European workers, with the final objective of improving the social policies of the time. While workers themselves know the way they live, this local knowledge is not directly available
at the level of societal decision makers. Collecting statistical data on crime is another illustration of this informative function: Such data are crucial to the determination of crime policy. Tarde became a sociologist because, as a judge and later as the head of the French Bureau of Statistics on Crime, he was concerned with the increase of crime rates. Schumpeter qualified this informative dimension of the social sciences as ‘cameral’ (Kameralwissenschaften). In the 1960s, Lazarsfeld et al. (1967) produced an overview of the production of this ‘cameral’ sociology in their Uses of Sociology. A second type of sociologist sees the function of sociology rather as critical: Sociologists should mainly aim at identifying the defects of actual societies and devising remedies to correct them. Saint-Simon, Proudhon, or Owen can be considered as examples among the pioneers of this type of sociology. The members of the Frankfurt School provide a modern illustration of ‘critical’ sociology. A third type of sociologist sees sociology as having the same main objective as other scientific disciplines: explaining social phenomena which appear at first glance as enigmatic. Thus, Max Weber wonders why peasants appear more attracted by polytheistic than by monotheistic religions, and Durkheim why all religions include the notion of ‘soul’ in one form or another. In such cases, the interests of sociologists are essentially of a cognitive character. These three basic orientations can appear in combination. Marx is concerned both with explaining social mechanisms and with changing society. Durkheim gathered data on suicide with the objective of explaining the variations of suicide rates, but he considered them also as indicators of the general state of modern society.
But Tocqueville, Weber, or Durkheim have reached the status of ‘classics’ essentially because they have proposed scientifically valid explanations of puzzling social phenomena, rather than by their diagnoses of the society of their time. Durkheim considered that sociology could be socially useful only if it was scientifically solid, in other words that good criticism can derive only from valid explanation. By contrast with Durkheim or Weber, the Frankfurt School is known more for its criticism of modern societies than for having proposed convincing explanations of puzzling phenomena. The three orientations characterize modern and contemporary as well as classical sociology. Foucault’s Supervise and Punish is generally recognized as a ‘critical’ rather than a ‘scientific’ contribution (Foucault 1975). The ‘labeling theory’ teaches that some features of individual A can be interpreted as a ‘label’ by observer B and can structure B’s behavior and expectations toward A. This has long been known, though, so the theory does not bring additional knowledge. Its ‘critical’ function is less open to dispute than its cognitive function. So, the ‘labeling theory’ is not a theory in the scientific sense. But, like Foucault’s
Sociology: Oeriew theory, it notably had an influence on our views and on the views of public officials as to the way prisoners should be considered and treated.
1.2 Several Types of Orientations
But sociology is perceived as a discipline with a weak identity, not only because of the existence of these three basic distinct goals, but also because the sociologists who consider that the main goal of sociology is to produce valid new knowledge on social phenomena endorse a variety of methodological and theoretical orientations. Thus, Weber’s work is considered as illustrating a particular style of sociology, ‘comprehensive’ sociology, which is commonly opposed to the ‘positivist’ conception of sociology. The former proposes to analyze social phenomena as the outcome of comprehensible individual actions; the latter to analyze the relations between social phenomena. Weber himself saw his own version of ‘comprehensive’ sociology as very different from others, and commonly spoke of ‘comprehensive sociology in my sense (in meinem Sinne).’ In the 1960s, ‘quantitative’ and ‘qualitative’ sociology were perceived as irreconcilable competitors: the former saw collecting standardized data able to be subjected to statistical analysis as the most effective research procedure, while the latter insisted that it would produce biased findings and proposed to use much more flexible methods. Merton’s ‘middle range theory,’ that aimed at explaining specific social phenomena, was seen as the proper alternative to the type of general theory illustrated by Parsons, which C. Wright Mills had polemically called ‘Grand Theory,’ the object of which was ‘society’ as a whole. These notions label some of the main conceptions of sociology that can be distinguished and appear as more or less permanent. This diversity in the basic theoretical orientations of sociology is also a lasting feature of sociology, observable in contemporary as well as classical sociology.
At the beginning of the twentieth century, a first ‘methodological quarrel’ opposed those who thought that sociology should follow the path of other scientific disciplines, to those who, like Rickert, maintained that, while the goal of natural sciences is ‘explanation,’ the goal of the social sciences would be ‘interpretation.’ In the 1960s, the quarrel was reopened, opposing the ‘positivists,’ as their challengers called them, to those (the ‘hermeneuticians’) who, once again, saw sociology as in charge of disentangling the ‘meaning’ of social phenomena, of ‘interpreting’ them, rather than identifying their causes, i.e. ‘explaining’ them. Adorno and Popper were respectively the main protagonists of the ‘antipositivist’ and of the ‘positivist’ group (Adorno et al. [1969] 1972). In various forms, this ‘quarrel’ reappears almost permanently. Thus, not long ago, ethnomethodologists claimed that the idea of producing an objective social knowledge would be
self-contradictory, for social phenomena would be understandable only from the inside of a social group: one could interpret, not explain them. Weber’s sociology is particularly important, for it defines a third way between ‘positivism’ and ‘antipositivism.’ Explaining a phenomenon means, to Weber, finding its causes. The causes of any social phenomenon reside in the set of individual actions, beliefs, or attitudes of individual people responsible for it. So the sociologist has to make these individual actions, beliefs, or attitudes ‘understandable.’ But understanding an individual action means finding out the meaning of the action for the actor, in other words the motivations and reasons that have inspired it. For Weber this operation amounts to building a theory of the motivations of social actors subject to the ordinary criteria any scientific theory should meet. Thus, when I see somebody cutting wood in his yard, I would reject the theory according to which he would cut wood because he is cold, if this action takes place on a hot summer day. On the whole, to Weber, ‘understanding’ is a crucial moment in the ‘explanation’ of any social phenomenon. So, the sociologists mainly interested in the ‘cognitive’ function of their discipline, namely in its capacity to produce valid new knowledge, can be classified along a first main dimension, according to whether they see it as able to produce ‘explanations,’ as in other scientific disciplines, or as rather entitled to produce ‘interpretations’ with an unavoidably subjective character. But a secondary distinction should be introduced. It opposes those who, like the Durkheim of the Rules, believe that the subjectivity of social actors should be ignored by sociologists (Durkheim 1895) and those who, like Weber (1951), consider on the contrary that it is essential to sociological knowledge and in no way incompatible with a scientific ambition.
The latter distinction is correlated with another one: to some sociologists, individuals are considered as the atoms of any sociological analysis, while, to others, sociology should ignore individuals and rather be concerned with analyzing the associations between social phenomena and variables. The former follow the principles that Schumpeter called ‘methodological individualism,’ the latter ‘methodological holism.’ Weber declared explicitly that, to him, individuals are the atoms of any sociological analysis and that sociology should be ‘individualistic as far as its method is concerned’ (Mommsen 1965). Durkheim, by contrast, saw sociology rather as devoted to the analysis of the associations between observable social variables. He defends this viewpoint theoretically in his Rules (1895) and practically, for instance, in his Suicide (Durkheim 1897). It suffices to evoke a few modern sociological schools to see that the above distinctions, along which classical sociological works can be classified, can also be used in the case of modern works. Thus, ‘structuralism’ can be ranked on the side of positivism: It proposes to identify the stable relations prevailing between social variables; it claims to be hard science; on the other hand, it is holistic, since it treats individuals as being merely the products or reflections of ‘structures.’ ‘Rational choice theorists’ for their part propose to analyze social phenomena as the effects of individual actions; they believe firmly that sociology can be as objective and scientific as any other discipline, but under the condition that it respects the postulate that social phenomena are the outcome of individual actions led by individual interests. ‘Ethnomethodologists’ believe that sociology is an interpretative rather than explanatory discipline, but are mainly concerned with the interpretations given by individuals of the social setting they are embedded in. Holistic sociologists of history, such as Elias or Wallerstein, following in the footsteps of a Burckhardt or a Braudel, are concerned with detecting diffuse orientations in the course of historical and social processes. They are concerned with interpreting rather than explaining social phenomena and are interested mainly in identifying global macroscopic trends. The fact that the various conceptions of sociology emerging from the above typologies are lasting explains why sociology appears as a less unitary discipline than others. Moreover, the relative popularity of each of these conceptions tends to follow an ‘oscillatory’ process, as Pareto would have said. In many circles, structuralism was considered in the 1960s as the orientation which would make the social sciences genuinely scientific. But it was sharply criticized and became obsolete. Being ‘positivist’ and ‘holistic,’ it was replaced by antipositivist and individualistic currents, such as ethnomethodology or ‘phenomenology’ in the sociological sense.
2. A Few Genuine Achievements
The most effective way to refute the skepticism toward sociology that develops as a consequence of its diversity is to mention a few genuine achievements in some of the branches of the typologies outlined above. Durkheim’s Suicide (1897) is a classical example of the holistic-positivist type of sociology. The work consists of a systematic analysis of correlations between rates of suicides and a number of independent variables, such as sex, living in cities vs. living in the countryside, religious affiliation (Protestant, Catholic, Jew), and so forth. Thus, Durkheim found that, when anomie grows, rates of suicide become higher. Other things being equal, professionals are for instance more exposed to anomie than peasants, so that the rates of suicide of the former are higher. Anomie is greater in periods of economic boom, for expectations tend to be overoptimistic then, opening the way to disillusion and making the rates of suicide higher. Durkheim’s Suicide provides a methodological model which has
inspired many writers. Cusson (1990) wonders for instance why the rates of criminality are characterized by similar trends in most Western countries, with the two exceptions of Japan and Switzerland. He maintains that these exceptions are due to the fact that primary social control is more developed in these two countries than in others, because, even in large cities, Japanese and Swiss societies have, for historical reasons, maintained networks of interpersonal relations characteristic of traditional societies. It should be noted, however, that, while such analyses are explicitly holistic, they implicitly consider individual motivations. Thus, in his Suicide, Durkheim contends that periods of political crises should be associated with a decline in the rates of suicide and that, other things being equal, they were indeed lower during the major political crises in France in the last two decades of the nineteenth century, as well as during the crisis between Prussia and Austria in the mid-1860s. As elsewhere in Suicide, the theory predicts an association between two variables defined at the societal level, and is in that sense holistic. However, the theory rests implicitly on the ‘psychological’ assumption that in periods of serious political crises, people cannot feel exclusively concerned with their own private worries. The same could be said of most analyses developed in Durkheim’s Suicide. Thus, the activity of professionals is much less routinized than the activity of peasants; they cannot be content with following day after day the same rules of behavior, and must on the contrary always devise new rules, objectives, procedures, or ideas. This higher uncertainty generates a higher anxiety.
So, implicitly, in spite of his official opposition to any psychological explanation of social facts and of the recommendation formulated in the Rules always to ‘explain social facts by social facts,’ ‘understanding’ in Weber’s sense is implicitly conducted in the analyses of Suicide. Thus, in the nineteenth century, women had lower suicide rates than men because, being more exposed to social control, they were, according to Durkheim, more guided by routinized rules and norms and thus less exposed to the anxiety generated by uncertainty. So, the contrast between the antipsychological holistic sociology of Durkheim and Weber’s individualistic approach is less marked than is sometimes contended. Weber is as interested as Durkheim in macroscopic relations. Thus, in his Essays in the Sociology of Religion, he notes that Mithraism developed essentially among Roman civil servants before reaching other strata (Weber 1920–1). His explanation is that Mithraism was more palatable to civil servants than the old traditional Roman religion, because it could be perceived as a symbolic expression of the political philosophy which grounded the Roman bureaucracy. Roman civil servants were hierarchized in a number of grades associated with specific functions; admission to the bureaucracy and promotion from one grade to the higher was determined by
impersonal procedures; the whole system was placed under the authority of the Emperor, but the latter was held rather as the incarnation of the impersonal rules codified in Roman law. Now, Mithraism was a religion placed under the auspices of a God symbolizing impersonal rules; as in the case of Freemasonry, the believers were admitted and promoted through a scale of grades with the help of impersonal procedures. Here, Weber solves a macroscopic question (why are such and such strata more receptive toward a religious doctrine?) in an explicitly ‘individualistic’ fashion, by showing that Roman civil servants had strong reasons for perceiving Mithraism as more attractive than the old Roman religion, which was born in the context of a purely agrarian society. For the same type of reasons, Freemasonry in Prussia attracted civil servants more than other strata. The Weberian vein is noticeable in many modern analyses. Thus, Merton’s classical analysis of the anti-Black attitudes of the white American workers in the time of the Great Depression of the 1930s rests upon the assumption that white unionists had strong reasons to believe that black workers, coming from the paternalist big farms of the South, would be less prepared to be faithful to the unions than the white workers from the North (Merton 1949). Bergeron (1999) wondered why methadone programs were rejected in France, while they were adopted in the Netherlands and Switzerland. His answer is that Lacanian theories, very influential among French psychiatrists, led psychiatrists to the diagnosis that drug addiction results from a disruption of social relations. Hence they drew the conclusion that drug addicts should be cured by a methodical restoration of their social relations. By contrast, the Swiss or Dutch psychiatrists did not think they had at their disposal a theory from which effective remedies against drug addiction could be drawn.
As they did not feel able to cure drug addiction, they believed that their main duty was to protect people, notably against the threat of AIDS involved in drug addiction. The French had strong reasons to stress that their role was to cure; the Dutch and Swiss to believe that their role was to protect, since they did not know how to cure. In an analysis such as this one, one recognizes the very flexible theory of rationality used by individualist sociologists following the Weberian tradition.
2.1 A Particular Version of the Individualistic Tradition: The Rational Choice Model
Contemporary sociology has developed a particular version of the individualistic tradition. Instead of the Weberian open theory of rationality, it proposes to use the theory of rationality developed by economics. According to this theory, social actors are supposed to be moved essentially by their interests and to be able to pick the best means to pursue their ends. This model, called the ‘rational choice model,’ the inspiration of which goes back to Bentham, is very useful. It has produced many fruitful theories in several fields: notably the sociology of crime, of education, of organizations. These achievements have motivated some sociologists, such as James Coleman, to contend that the ‘rational choice model’ could play in sociology the role of a general theory able to integrate heterogeneous findings. But this model cannot be considered as a realistic theory of behavior in all circumstances. Olson (1965) has stressed that his own theory of collective action, which applies literally the postulates of neoclassical economics, can explain that collective action does not occur as long as its axioms can be considered as realistic. But, he adds, in many circumstances, they cannot, simply because people can be moved and mobilized, not only by their interests, but also by their beliefs. Now, beliefs can seldom be satisfactorily explained merely by the interests of social actors. On the whole, the ‘rational choice model’ has produced many convincing theories, but it cannot be considered as a general theory, neither actually nor potentially. The styles of research illustrated by the previous examples have in common that they provide explanations of social phenomena that follow the criteria generally used in all scientific disciplines. In that sense, they belong to the ‘positivist’ category of the previous typology.
3. Interpretation vs. Explanation
In his book on the methodology of the social sciences, Problems of the Philosophy of History, Simmel (1892) made the important point that history, as well as the social sciences, raises naturally two types of questions which respectively can and cannot be given a unique scientifically controlled answer. In Rickert’s vocabulary, some questions raised by sociologists aim at ‘explaining,’ others at ‘interpreting’ social phenomena. Thus, Simmel (1900) himself proposed in his Philosophy of Money an explanatory theory of the decline of feudalism: From the moment when the peasants of the Middle Ages could pay their rents in money rather than in kind, they could decide what they would grow, they started to be in a position to create and sell surpluses, and were eager to participate in collective actions against landowners, so that on the whole this apparently minor institutional innovation generated a chain reaction which powerfully eroded the feudal organization of society. But other questions call for ‘interpretation’ (or ‘description’) rather than ‘explanation’: what are the main features of modernity? Is the twenty-first century going to be characterized by a religious revival? What does it mean to be poor in an affluent society? How do
people live in poor suburban areas? Of course, such questions cannot be answered only in one way. There are many ways of describing the conditions of life in a poor suburban area. In other words, the answers to such questions are doomed, by the very nature of the questions, to be in part subjective. Actually, as Simmel states, such questions can be answered in an interesting fashion only if analysts are able to inject their own sensibility in their description or interpretation. Answering such questions requires on the whole a type of ability that makes sociologists close to artists. A good biographer, for instance, cannot follow exclusively the criteria defining a scientific work. He or she must also be able to give a unity to the numerous observations which have been collected by witnesses on the biography’s subject, to that person’s behavior and decisions in innumerable circumstances. A number of sociological works are faced with the same type of questions. Giddens’s (1984) work on postmodern society, as earlier Burckhardt’s work on the Kultur der Renaissance in Italien ([1860] 1952), are examples of works aiming at characterizing the main features of historical eras or identifying broad historical trends. Other works dealing, not with societies as wholes, but with local phenomena, belong also to the interpretative-descriptive category rather than to the explanatory one; though they deal with social reality, they evoke the procedures of novelists. Oscar Lewis’ The Children of Sanchez (1961) is a classical illustration of this ‘literary’ sociology. The anthropologist Clifford Geertz (1973) has coined the expression ‘thick description’ to characterize the methodology of this type of work. Runciman (1983–97) has developed methodological considerations which can be considered as elaborating on Simmel’s insights.
4. A Complex Identity
In summary, all the types of sociology which can be derived by the combination of the three main ideal-typical objectives (explaining, informing, criticizing) and of the four types generated by the combination of the binary variables holism vs. individualism and ‘positivism’ vs. ‘hermeneutics’ can be illustrated as easily by examples taken from contemporary as from classical sociology. This shows perhaps that all the types are legitimate and fruitful. The cost of this heterogeneity lies in the difficulty of perceiving the identity of sociology, its benefit in the fact that it makes sociology able to fill the gaps that other social sciences, such as economics, more devoted to a single paradigm, are unable to fill.
See also: Action, Theories of Social; Critical Theory: Contemporary; Critical Theory: Frankfurt School; Durkheim, Emile (1858–1917); Hermeneutics, Including Critical Theory; Historical Sociology; Marx, Karl (1818–89); Methodological Individualism in Sociology; Parsons, Talcott (1902–79); Phenomenology in Sociology; Positivism: Sociological; Rational Choice Theory in Sociology; Sociology, Epistemology of; Sociology, History of; System: Social; Theory: Sociological; Weber, Max (1864–1920)
Bibliography
Adorno T W, Dahrendorf R, Albert H, Popper K R [1969] 1972 Der Positivismusstreit in der deutschen Soziologie. Luchterhand, Neuwied, Germany
Bergeron H 1999 Soigner ou prendre soin des Toxicomanes: Anatomie d’une Croyance Collective. Presses Universitaires de France, Paris
Burckhardt J [1860] 1952 Die Kultur der Renaissance in Italien. Kröner, Stuttgart, Germany
Cusson M 1990 Croissance et Décroissance du Crime. Presses Universitaires de France, Paris
Durkheim E 1895 Les Règles de la Méthode Sociologique. Alcan, Paris
Durkheim E 1897 Le Suicide, Étude Sociologique. Alcan, Paris
Foucault M 1975 Surveiller et Punir; Naissance de la Prison. Gallimard, Paris
Geertz C 1973 The Interpretation of Cultures. Basic Books, New York
Giddens A 1984 The Constitution of Society. Polity Press, Cambridge, UK
Lazarsfeld P F, Sewell W H, Wilensky H L (eds.) 1967 The Uses of Sociology. Basic Books, New York
Le Play M F 1855 Les Ouvriers Européens, 2nd edn. Imprimerie Imperiale, Paris
Lewis O 1961 The Children of Sanchez, Autobiography of a Mexican Family. Random House, New York
Merton R K 1949 Social Theory and Social Structure. Free Press, Glencoe, IL
Mommsen W 1965 Max Weber’s political sociology and his philosophy of world history. International Social Science Journal 17(1): 44
Nisbet R A 1966 The Sociological Tradition. Basic Books, New York
Olson M Jr 1965 The Logic of Collective Action. Harvard University Press, Cambridge, MA
Runciman W G 1983–97 A Treatise on Social Theory. Cambridge University Press, Cambridge, UK
Simmel G 1892 Die Probleme der Geschichtsphilosophie. Duncker & Humblot, Leipzig, Germany
Simmel G 1900 Philosophie des Geldes. Duncker & Humblot, Leipzig, Germany
Tocqueville A de [1856] 1986 L’Ancien Régime et la Révolution. Levy, Paris
Weber M 1920–1 Gesammelte Aufsätze zur Religionssoziologie. Mohr, Tübingen, Germany
Weber M 1951 Aufsätze zur Wissenschaftslehre. Mohr, Tübingen, Germany
R. Boudon
Solidarity: History of the Concept
The concept of solidarity epitomizes the total process through which sociology established itself among the social sciences. An originally juridical concept, pertaining to the character of legal obligations, was given a new social and political meaning which made it tantamount to the theoretical conditions of possibility for the constitution of sociology as such. In order to do this, the concept of solidarity had to take a distance from other close concepts, such as compassion and fraternity, and to develop the specific features which would ensure its role. Throughout the nineteenth century, it expressed the search for a new articulation between individualism and collectivism which was at the origins of crucial transformations, in social theory as well as in social policies.
1. Compassion, Fraternity, Solidarity
Compassion had inspired the treatment of the poor in the Old Regime society, through private charity mixed with police repression. The inadequacy of such a treatment to a modern liberal order, from the vantage point of equality of rights as well as of a new way of production requiring the active participation of everybody, became clear in the early stages of the French Revolution (Procacci 1993). Against the fortuitous nature of private charity, solidarity openly marked its own public vocation. Instead of relying upon an occasional feeling or the practice of a religious piety, it was to become the principle of a secular organization framing the competition of individual interests according to the needs of the common good. ‘Fraternity’ was one of the key-words of the revolutionary tradition in France, together with equality and liberty, but proved to be much more difficult to fix in normative terms. Moreover, during the new 1848 revolution, fraternity inspired popular claims for a social and economic reform, engaging with the socialist tradition. Its revolutionary implications made more and more clear that, although fraternity was an integral part of the new legal system, it could not offer guidelines for public action, a role rather played by solidarity (Borgetto 1992). Solidarity presented the advantage of being a more ‘scientific,’ positive notion, interpreting more effectively a concern about social order. Fraternity had offered the first foundation to solidarity, well before the latter, toward the end of the nineteenth century, met the idea of social interdependence developed by sociology. Nevertheless, in French public law it eventually became restricted to the link among members of the nation, while solidarity was rather entrusted to regulate a system of mutual rights and duties organized by the state.
As a matter of fact, the concept of solidarity inherited the same political space covered by fraternity, having to respond to the need of governing the persistence of inequality within a society of equals. Solidarity took its social meaning within a general framework deeply marked by the establishment of modern individual rights, that is, the process of liberal individualism. Individual rights entering into the new social reality, through a twofold process of emancipation, juridical and economic, did not offer criteria for organizing the life of the collectivity. Freed from the revolutionary echo resounding through fraternity, solidarity pointed at that sort of third way between liberal individualism and socialist collectivism that offered the proper room for the rise of sociology. The concept promised to reconcile individual independence and social cohesion, and on this promise built its own fortune.
2. Rights and Duties
Robert Nisbet (1966) correctly highlighted the contribution of conservative antirevolutionary authors to the social concept of solidarity, through the emphasis they put upon communitarian ties. De Bonald, for instance, denounced the revolution for having destroyed any collective order and therefore any possibility for solidarity. While sharing his insistence on the need for a systematic doctrine about the bases of social order, Saint-Simon fundamentally disagreed with his attempt to ground solidarity within religious thought. The recovery of social unity in a modern society required a new attitude, and foundations were to be searched within science. In fact, with Saint-Simon and Comte the reflection about solidarity clearly engaged in a criticism of the difficulty that legal and economic individualism opened to solidarity. From this point of view, the rise of sociology marked a discontinuity with respect to patterns of solidarity belonging to the past. For Auguste Comte ([1830–42] 1975) the specific task of a social science was to oppose to the destructive strength of individualism a principle of social organization based on the authority of science. The revolution was the result of individualism’s incapacity to provide criteria useful to organize any collective order; the only way to stop the revolution was to be able to remove its causes. To problems raised by the emancipation of individual rights, sociology must respond by providing concrete principles for social reorganization. Comte proposed to replace the emphasis on individual rights with a system of mutual duties, linking everybody to each other, therefore able to promote the solidarity of all parts within the social system. The concept of duty appeared to him directly social, in other words an organizer: It put at the forefront ‘the intimate feeling of social solidarity,’ the priority of common interest over individual ones. 
It was thus to become the key for a new politics, focused upon the search for the best way to combine the efforts of everybody, rather than upon the competition of interests. This was the meaning of ‘order and liberty,’ the motto of the Société Positiviste that Comte created as a response to the 1848 revolution. The imperative was to reorganize society, the essence of social
phenomena was organic; the concept of duty was to become the principle of an active morality, overcoming the opposition between individual and collective. Along these lines, throughout the nineteenth century solidarity deepened its distance from fraternity. By the end of the century, the sociological definition of solidarity found the support of scientific discoveries: Microbiology and microphysics confirmed the interdependence of all parts of an organism under the form of a pathological contamination, giving to an old idea the new strength of scientific evidence. If each single element depends upon the others, then their solidarity with the whole becomes a fact. Solidarity only refers to the whole, it concerns the community, and does not distinguish among categories of members. As such, solidarity designates a twofold bond, through the space within the community, and over time between generations.
3. Organic Solidarity
After the dramatic resurgence of revolution with the 1871 Commune, the concept of solidarity was at the core of the Third Republic political response to the conflict between two opposite views of the role of the state proposed by liberalism and socialism (Donzelot 1984). In the name of market economy, liberals opposed any intervention of the state in the solution of social problems, even at the risk of an objective alliance with conservatives’ defense of the role of family and traditional groups. Socialists, in turn, regarded state intervention as necessary. Republicans stayed in between, but without any clear theory about the role of the state were forced to a negative position; solidarity offered them a positive principle for their politics, to the point of becoming an ‘official philosophy’ (Scott 1951). It expressed a peculiar conception of social relations at the basis of a new social contract founded not merely on individual rights, but on the need for a collective reparation of social injustice. Alfred Fouillée (1885), the founder of the so-called solidarist philosophy, insisted on the Comtian description of society as a system of mutual obligations, as the source for an ‘active repairing justice’ more suitable to a democratic society than a liberal negative justice. His idea of a contractual organism found its theoretical foundation in the concept of solidarity and at the same time the key for a new set of concrete political measures transforming the system of social protection (Bec 1998). Solidarity was not only the source of a philosophy, but became the core of scientific sociology with the seminal work of Emile Durkheim ([1893] 1978) on the division of labor. His attempt was to give a scientific foundation to the specific nature of social phenomena; solidarity conceptualized the principle of cohesion which alone could transform a set of individuals into
something like a society. Individual consensus could not explain society; it depended upon an objective cause underlying the characters of the social environment. Through his distinction between a mechanical, undifferentiated solidarity, and an organic solidarity, Durkheim defined the bases for a social organization where individual rights and state’s ends did not exclude each other. From one model to another, the nature of solidarity changed, but it still worked as the foundational law of any society. A modern democratic society depended upon the level of division of labor: Individual functions must be specialized, but conversely their cooperation for a final end ought to be ensured. The division of labor was the pattern of the only kind of solidarity suitable to a differentiated society. Far deeper than individual freedom of contract, the organic solidarity organized functional interdependence of all parts with respect to the whole. Durkheim was interested in the moral value of the division of labor and what consequences this would have from the point of view of society and individuals. It increased social dynamics and mutual dependence, but would it produce solidarity? If labor was badly distributed, then the priority of the whole over the individual would not be ensured. The intensity of social conflicts depended upon a lack of perception of solidarity: The problem thus was not in the structure, but in the representation of internal social bonds, and so was ultimately a political problem. This is the role of the modern democratic state: to organize solidarity through a right balance of social cohesion and individual liberty, constantly enhancing its perception. The theory of solidarity established that societies live and change according to their own laws; they could neither be frozen in their past forms, nor be violated in their natural course. To acknowledge these laws became thus an indispensable step for politics, while promoting sociology’s role.
As such, solidarity offered the basis for grounding political reforms on the scientific knowledge of these laws, intentionally adopted by a politics of solidarity. From the ‘natural fact’ of interdependence, the principle of solidarity implied that everybody had to be protected against certain common risks, and reciprocally that all must do their part in financing common protection according to their means. On the idea of shared social risks, a whole set of solidarity techniques was developed, shaped above all on an insurance model (Ewald 1986). Welfare and pension laws followed at the beginning of the twentieth century in most European countries, based on the idea of a social debt toward the less privileged and their corresponding right to support. Techniques of solidarity were at the origins of modern welfare systems and of more general political strategies of socialization. The very nature of solidarity, a concept born at the crossing point between liberal individualism and socialism, makes it particularly relevant today (Chevallier 1992), when we again face the need to define a new balance between those two principles, much as was the case more than a century ago, when the concept of solidarity took shape. The current debate within the social sciences on the future of welfare bears this out.
See also: Collectivism: Cultural Concerns; Communitarianism: Political Theory; Communitarianism, Sociology of; Comte, Auguste (1798–1857); Durkheim, Emile (1858–1917); Individualism versus Collectivism: Philosophical Aspects; Integration: Social; Liberalism: Historical Aspects; Marxist Social Thought, History of; Political Thought, History of; Socialism: Historical Aspects; Solidarity, Sociology of
Bibliography
Bec C 1998 L’assistance en démocratie. Belin, Paris
Borgetto M 1992 La notion de fraternité en droit public français. Librairie générale de droit, Paris
Chevallier J (ed.) 1992 La solidarité: un sentiment républicain? Presses Universitaires de France, Paris
Comte A [1830–42] 1975 In: Enthoven J-P (ed.) Physique sociale. Cours de philosophie positive, 2 Vols. Hermann, Paris
Donzelot J 1984 L’invention du social. Grasset, Paris
Durkheim E [1893] 1978 De la division du travail social, 10th edn. Presses Universitaires de France, Paris
Ewald F 1986 L’Etat providence. Fayard, Paris
Fouillée A 1885 La science sociale contemporaine, 2nd edn. Hachette, Paris
Nisbet R 1966 The Sociological Tradition. Basic Books, New York
Procacci G 1993 Gouverner la misère. The social question in France. Seuil, Paris
Scott J A 1951 Republican Ideas and the Liberal Tradition in France, 1870–1914. Columbia University Press, New York
G. Procacci
Solidarity, Sociology of
Solidarity refers to the binding, or melding, of individuals into a cohesive group or collectivity. As such, it is an emergent attribute of groups that facilitates social coordination and social order, and is a precondition of all nonspontaneous collective action. Since social order and collective action are the alpha and omega of classical sociological theory, solidarity has always been a central concept in sociology. The concept came to the fore following the changes wrought by the development of industry and the rise of market economies. Although they were unjustified in doing so, most classical social theorists regarded the cohesion of agrarian communities as unproblematic. In their stylized view, such communities were
technologically and demographically stagnant and uninvolved in long-distance trade. Given these assumptions and barring exceptional circumstances (such as those resulting from wars or massive epidemics), social mobility would be minimal and most children would be destined to recapitulate the social lives of their parents. As agrarian communities tended to offer relatively little scope for individualism, the attainment of solidarity was viewed as relatively unproblematic. According to Ferdinand Tönnies (1887), such communities (which he termed Gemeinschaften) were breeding grounds for social relations based on deep emotional, quasi-familial commitments. Emile Durkheim (1893) referred to the solidarity characteristic of this kind of society as ‘mechanical,’ implying thereby that it had a certain automatic quality. Mechanical solidarity rests on a set of common values, internalized norms, and beliefs (a conscience collective) engendered by physical co-presence, a common focus of attention, and a sense of mutual awareness. This kind of solidarity is often expressed in a religious credo. Mechanical solidarity appeared to be threatened, however, by the rise of industry and the expansion of market economies in Western Europe, North America, Australasia, and Japan beginning in the late eighteenth century. Among their many other effects, these transformations increased the size and scope of social networks, thereby affording many individuals greater options in their daily lives, ranging from marriage partners to occupations. The growth of individualism and the concomitant decline of a conscience collective were the inevitable result (Simmel 1922). It was far from evident how groups, communities, and societies could maintain mechanical solidarity in the wake of this increasing individualism.
Durkheim, the foremost early theorist of solidarity, initially argued that industrial societies were held together by individuals’ mutual functional interdependence, a form of solidarity that he termed ‘organic.’ At the end of his life, however, he came to regard common values and norms as the basis for solidarity in all kinds of groups and societies (Parsons 1937). Durkheim’s insistence that solidarity requires compliance with a set of collective obligations based on particular values and norms has greatly influenced subsequent research. For example, a religious group is solidary only to the degree that its adherents comply fully with its obligations, both sacred (rites) and secular (tithing). Likewise, a working class is solidary only to the degree that individual workers regard themselves as members of a collectivity (e.g., the working class) with its own obligations, and refrain from engaging in activities, such as strikebreaking, that compromise or countermand these obligations. Yet compliance with collective obligations can stem from causes other than solidarity. Consider the mobilization of labor in a profit-making firm. To attain maximum profit, firms demand that their employees perform to
the best of their ability and refrain from shirking. Employees usually comply with the firm’s obligations, not out of a sense of solidarity but because they are compensated for doing so in wages or salaries. In contrast, the members of a religious group or a working class are not given pecuniary compensation to comply with their corporate obligations. On this account, it is likely that compliance in such groups is due to solidarity. Since compliance with corporate obligations can be attained via mechanisms other than solidarity, conducting empirical research on solidarity is inherently difficult. This difficulty has not inhibited theorists, however. Theoretical perspectives on solidarity may usefully be divided into three distinct types. Normative perspectives suggest that solidarity arises from common values and norms, structural perspectives argue that it arises from common material interests, and rational choice perspectives hold that it arises from mutual interdependence.
1. The Normative Perspective
Normative theorists claim that solidarity is unlikely to result merely from the interaction of rational, self-interested individuals (Parsons 1951). This is because it is never in the interest of rational egoists to adhere to collective obligations if they can profit by ignoring them. In a world of rational egoists, the only means of assuring the requisite compliance is by sanctioning. Normative theorists hold that sanctioning is both too costly and too unreliable a means of assuring high levels of compliance, especially among large numbers of individuals. For them, values and internalized norms make individuals fit for social life by removing the conflict between the individual’s interest and that of the relevant collectivity. If individuals adhere to the values and norms that underlie corporate obligations, they will tend to engage in the obligatory actions without the carrot of rewards or the stick of sanctions. According to normative theorists, socialization is the key to the attainment of solidarity. Solidarity therefore is most likely to emerge in socially homogeneous collectivities. It should be noted that the emergence of values and norms is exogenous to the theory, as are socialization processes. Moreover, although casual observation confirms that compliance with values and norms is never automatic, normative theory has been mute about the conditions under which the requisite compliance is likely to occur.
2. The Structuralist Perspective
Structural theorists typically, if implicitly, regard solidarity as an outcome of individually rational action. On their view, rational individuals form solidary collectivities principally by virtue of sharing common material interests. They discover their common interests in the course of mutual interaction, especially when they perceive themselves to be threatened by powerful antagonists (Marx 1972 [1845–46], p. 179). When mutual interaction occurs repeatedly, this may set the stage for the development of sentiments, or affective feelings, that further strengthen compliance to collective obligations (Collins and Hanneman 1998). However, the weaknesses of structural theories of solidarity are notable. Both commonality of interest and repeated interaction are insufficient causes of solidarity. Solidarity varies widely among workers and married couples, for example. Sometimes workers refuse to support striking colleagues, and the incidence of divorce has risen dramatically in nearly all industrial societies. Neither of these trends is easily accounted for by structural theories of solidarity. Further, if the individuals who share interests are rational egoists, as most structural theorists contend, then why would they ever join forces to pursue their common ends?
3. The Rational Choice Perspective
This question is the point of departure for rational choice perspectives on solidarity. Truly rational actors will not join a group to pursue common ends when, without participating, they can reap the benefit of other people’s activity in obtaining them. Instead of contributing to the common weal, rational actors will free ride (Olson 1965) and behave individualistically when it is expedient. The upshot is minimal solidarity despite an extensive commonality of interest. The rational choice theory of solidarity (Hechter 1987) hinges on precluding free riding. In this theory, rational individuals form groups to obtain goods that they either cannot produce on their own, or that they cannot attain as economically. The ultimate motivation for group formation is held to be the individual’s desire to gain access to jointly produced goods. These goods can only be produced if prospective members comply with the rules and obligations that are necessary to ensure production of the given joint goods. The more dependent individuals are on the group for access to valued goods, the greater its potential power over them. This dependence increases when the given goods are not readily available elsewhere, when members lack information about alternatives, when the costs of leaving the group are high, and when personal ties to other members are strong. Moreover, the greater the dependence of members on a group, the greater their willingness to comply with highly extensive obligations. Hence, dependence creates conditions not just for the emergence of corporate obligations, but for obligations that guide and regulate behavior to a high degree.
Yet dependence is an insufficient cause of compliance. Children tend to be highly dependent on their parents, but this is no guarantee that they will routinely perform their chores, or study diligently in school. Dependence by itself is unlikely to resolve the free rider problem. Free riding can only be precluded when the given group has a high control capacity. This control capacity, in turn, is a function of monitoring and sanctioning. Monitoring is the process of detecting noncompliance with corporate obligations; sanctioning is the use of rewards and punishments to induce compliance. The theory therefore holds that compliance with corporate obligations (an indirect measure of solidarity in noncompensatory groups) is jointly a function of dependence and control mechanisms. That said, this theory remains incomplete because the sources of individual dependence are exogenous to it.
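The free-rider argument and the role of control capacity can be illustrated with a minimal sketch based on a standard linear public goods game. This is an illustration only, not a formalization taken from Hechter (1987) or Olson (1965): the function names, the endowment, the multiplier, and the monitoring and sanctioning values are all hypothetical, and dependence on the group is not modeled.

```python
def expected_payoff(contributes, others_contributing, group_size,
                    endowment=10.0, multiplier=1.6,
                    detection_prob=0.0, sanction=0.0):
    """Expected payoff of one member in a linear public goods game.

    Contributions are pooled, multiplied, and shared equally by all members.
    detection_prob stands in for monitoring and sanction for sanctioning;
    both are hypothetical parameters chosen for illustration.
    """
    contributors = others_contributing + (1 if contributes else 0)
    share = multiplier * endowment * contributors / group_size
    kept = 0.0 if contributes else endowment
    # Only noncompliance can be detected and punished.
    expected_sanction = 0.0 if contributes else detection_prob * sanction
    return kept + share - expected_sanction


def free_riding_pays(others_contributing, group_size, **control):
    """True if shirking yields a higher expected payoff than complying."""
    shirk = expected_payoff(False, others_contributing, group_size, **control)
    comply = expected_payoff(True, others_contributing, group_size, **control)
    return shirk > comply


if __name__ == "__main__":
    # No monitoring or sanctioning: the rational egoist free rides.
    print(free_riding_pays(4, 5))
    # High control capacity: the expected sanction outweighs the gain.
    print(free_riding_pays(4, 5, detection_prob=0.8, sanction=10.0))
```

Under these assumed numbers every member is better off when all contribute than when none does, yet each individual gains by shirking unless the expected sanction exceeds the private gain from free riding, which is the point at which monitoring and sanctioning matter.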
4. Future Research
The empirical study of solidarity has been hamstrung by a lack of agreement on definitions of the concept. Defining solidarity as an intersubjective state (as normative theorists do) poses a severe challenge for empirical research. This is because there are as yet no consensually accepted measures of individual values and internalized norms in social science; direct measures of the sentiments or feelings that may hold within collectivities perforce are unavailable. Most empirical research on solidarity therefore focuses on explaining variable compliance with corporate obligations. When this research fails to pay sufficient heed to the pitfalls of inferring an intersubjective state from behavior (see above), its results are questionable. Future research on solidarity is likely to proceed along both methodological and substantive fronts. Methodologically, new measurement instruments some day may make it possible to gauge solidarity directly as an intersubjective state, rather than using behavioral compliance as a proxy. Substantively, there is much to learn about the interaction of solidary groups in wider social systems. Since solidarity is built around a set of normative obligations and enforcement mechanisms, it follows that the normative content of any given solidary group may be more or less congruent with that of its global social environment. Thus minority ethnic groups often espouse norms at variance with their host society. Adolescent peer groups evolve norms that are mutually inconsistent, and generally inconsistent with school norms. Factory workers can embrace output restriction norms that are at variance with their employers’ expectations, and inner city gangs and organized crime groups establish norms that are at variance with those of the society at large. The effects of solidarity in such countercultural groups are a matter of dispute among both social scientists and political decision makers (see Social Capital).
Further research is needed to clarify the mechanisms that are responsible for the emergence of such countercultural groups and the consequences these groups have for social order. After a long period of dormancy, interest in solidarity has increased in several social science disciplines. The rational choice theory of solidarity has been used to explain variations in legislative party voting, the membership rules of friendly societies and mutual benefit associations, the survival of US intentional communities, cross-national crime rates (Hechter and Kanazawa 1993), social order in Japan and the United States (Miller and Kanazawa 2000), and the prevalence of nationalism in world history (Hechter 2000). Whether group solidarity incites or inhibits intergroup conflict is now the subject of lively debate (Bhavnani and Backer 2000, Fearon and Laitin 1996, Gould 1999). Legal theorists (Ellickson 1991, Posner 2000) are exploring group solidarity as an institutional alternative to law as a means of dispute resolution. Finally, a variety of mathematical models of solidarity recently have been proposed by a group of social and physical scientists (Doreian and Fararo 1998).
See also: Action, Collective; Communities of Practice; Community, Expression of; Community, Social Contexts of; Community Sociology; Community Studies: Anthropological; Durkheim, Emile (1858–1917); Integration: Social; Networks and Linkages: Cultural Aspects; Rational Choice Theory in Sociology; Societies, Types of; Solidarity: History of the Concept; Theory: Sociological
Bibliography
Bhavnani R, Backer D 2000 Localized ethnic conflict and genocide: Accounting for differences in Rwanda and Burundi. Journal of Conflict Resolution 44: 283–306
Collins R, Hanneman R 1998 Modelling the interaction ritual theory of solidarity. In: Doreian P, Fararo T J (eds.) The Problem of Solidarity: Theories and Models. Gordon and Breach, Amsterdam, pp. 213–38
Doreian P, Fararo T J (eds.) 1998 The Problem of Solidarity: Theories and Models. Gordon and Breach, Amsterdam
Durkheim E 1893 The Division of Labor in Society. Free Press, Glencoe, IL
Ellickson R C 1991 Order Without Law: How Neighbors Settle Disputes. Harvard University Press, Cambridge, MA
Fearon J D, Laitin D D 1996 Explaining interethnic cooperation. American Political Science Review 90: 715–35
Gould R V 1999 Collective violence and group solidarity: Evidence from a feuding society. American Sociological Review 64: 356–80
Hechter M 1987 Principles of Group Solidarity. University of California Press, Berkeley, CA
Hechter M 2000 Containing Nationalism. Oxford University Press, Oxford, UK
Hechter M, Kanazawa S 1993 Group solidarity and social order in Japan. Journal of Theoretical Politics 3: 455–93
Marx K 1845–46, 1978 The German ideology. In: Tucker R (ed.) The Marx–Engels Reader, 2nd edn. W W Norton, NY, pp. 146–200
Miller A S, Kanazawa S 2000 Order by Accident: The Origins and Consequences of Conformity in Contemporary Japan. Westview Press, Boulder, CO
Olson M 1965 The Logic of Collective Action. Harvard University Press, Cambridge, MA
Parsons T 1937 The Structure of Social Action. McGraw-Hill, New York
Parsons T 1951 The Social System. Free Press, Glencoe, IL
Posner E A 2000 Law and Social Norms. Harvard University Press, Cambridge, MA
Simmel G 1922 The web of group affiliations. In: Simmel G (ed.) Conflict and the Web of Group Affiliations. Free Press, New York, pp. 125–95
Tönnies F 1887 Community and Society. Transaction Books, New Brunswick, NJ
M. Hechter
Somatoform Disorders
The history of somatization phenomena, the ‘borderland between medicine and psychiatry’ (Lipowski 1988), or in terms of neuroscience the problem of the ‘body–brain interface’ (Guggenheim 2000), can be traced back to ancient Greek medicine, where the term ‘hysteria’ was used to describe patients’ somatic complaints without the presence of somatic disease. Hysteria (from hystera = womb) was understood as a psychological problem in women, whose intrapsychic conflicts, resulting from the unfulfilled wish to bear a child, led to somatic complaints (see Hysteria). Similar clinical observations by different European clinicians since the mid-nineteenth century (Briquet, Charcot, Janet, Breuer) were antecedents of the Freudian concept of conversion disorder. The psychodynamic etiology and psychoanalytic treatment approach to patients with conversion disorders, psychosomatic complaints, and hysteria was most influential until the 1970s. Nowadays, the term somatoform disorders replaces the traditionally used but imprecisely defined terms hysteria, ‘functional symptoms,’ ‘masked depression,’ etc., which were mainly confounded with etiological (psychoanalytical) assumptions. The current concept of somatoform disorders encompasses a heterogeneous group of psychiatric disorders that are characterized by four core features: (a) patients suffer from enduring or recurrent bodily complaints, symptoms and/or pain (e.g., palpitations, headache, fatigue, and dizziness), which are experienced as indicators of somatic illness, (b) these complaints are for the most part not due to physical or organ dysfunction, (c) psychological and social factors (usually rejected as influences by the patients) play a predominant role in the development and course of these
disorders, and (d) patients experience a general preoccupation with bodily symptoms and health concerns. This article describes the various somatoform disorders, their prevalence, and etiology. Critical aspects regarding diagnosis, treatment, and management of somatoform disorders in healthcare systems will be discussed.
1. Classification and Epidemiology
From a nosological perspective, somatoform disorders were grouped together for the first time in the third edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-III) in 1980. However, this clustering was primarily based on clinical criteria and not on empirical or laboratory findings. The tenth edition of the International Classification of Diseases (ICD-10) includes most of the somatoform disorders of the recent DSM-IV (American Psychiatric Association 1994), except for conversion disorders, which are grouped with the dissociative disorders. The main feature of these disorders is the ‘repeated presentation of physical symptoms with persistent request for medical investigations, in spite of repeated negative findings and reassurances by doctors that these symptoms have no physical basis. If any somatic disorders are present, they do not explain the nature and the extent of the symptoms or the distress and preoccupations of the patient’ (WHO 1991). For the diagnosis of all somatoform disorders the patients’ symptoms have to be of sufficient severity to warrant clinical attention. The ICD-10 differentiates between four main nosological entities (seven different diagnoses) which are characterized by different diagnostic criteria (see Table 1). Recurrent, multiple somatic complaints that are not associated with any physical disorder are the essential feature of somatization disorder. The diagnosis requires a history of many bodily symptoms of several years’ duration with onset before the age of 30. In somatoform autonomic dysfunction, symptoms of autonomic arousal (e.g., palpitations, sweating, and stomach discomfort) are the main complaints of the patients. Chronic pain syndromes (duration of at least 6 months) like tension headache, lower back pain, and atypical facial pain are labeled as persistent somatoform pain disorder when emotional conflicts or psychosocial problems are involved as main influences.
Table 1 Somatoform disorders and their characteristics
Subtypes, with characteristics and (selected) main criteria (ICD-10)
Somatization disorder
• a history of at least 2 years of multiple and variable complaints which cannot be better explained by somatic factors or other psychiatric disorders (A)
• preoccupation with the symptoms causes persistent distress and leads the patient to seek repeated medical consultations (B)
• persistent refusal to accept medical reassurance that there is no adequate physical cause (C)
• a total of six or more enduring autonomous symptoms from different groups (gastrointestinal, cardiovascular, genitourinary system, skin and pain symptoms) (D)
Hypochondriasis
• a persistent belief of the presence of a serious physical disease (A1)
• a persistent preoccupation with a presumed deformity or disfigurement (body dysmorphic disorder) (A2)
Somatoform autonomic dysfunction
• symptoms of autonomic arousal attributed by the patient to a physical disorder (heart and cardiovascular system/upper and lower gastrointestinal tract/respiratory or genitourinary system) (A)
• two or more autonomic symptoms like palpitations, sweating, dry mouth, flushing or blushing, epigastric discomfort, etc. (B)
Persistent somatoform pain disorder
• persistence of severe and distressing pain for at least 6 months which cannot be explained adequately by a physiological process or physical disorder (A)

Psychological factors may either play a major role within a general medical condition (e.g., rheumatoid arthritis) or be the only determinants of pain (e.g., lower back pain not otherwise specified). Pain in one or more anatomical sites is the main focus of clinical presentation. Conversion disorder is mostly a transient disturbance of physical functions (typically occurring in stressful situations) which does not conform to current concepts of the anatomy and physiology of the nervous system. Many of these disturbances appear to be neurological disorders (e.g., motor disorders or disturbances of sensory function), but do not show the usual pathological signs. Hypochondriasis is not exclusively a secondary phenomenon within affective, anxiety, or psychotic disorders but a diagnostic subgroup of its own. Patients are preoccupied with the belief that they are already suffering from a severe illness or with the fear of becoming ill with a life-threatening disease (e.g., cancer, stroke, or AIDS). The diagnostic classification of body dysmorphic disorder, which is rarely seen as a full-blown syndrome, differs between the nosological systems. In the ICD-10 it is a subgroup of hypochondriasis, with the preoccupation centered on a presumed deformity or body abnormality. Patients subjectively feel unattractive, ugly, and disfigured despite a lack of evidence for major defects in appearance or stigmata. Positive feedback from significant others about normality in appearance has no corrective effects. Somatoform disorders show high prevalence rates in the community, especially in the primary health care system. In industrialized societies an estimated 4–12 percent of all people suffer from somatoform disorders (Escobar et al. 1987, Simon and von Korff 1991). In medical outpatient and inpatient treatment, patients with the probable diagnosis of somatoform disorders are estimated at up to 25 to 30 percent. The onset is in adolescence or young adulthood, and the course in untreated cases is usually chronic with a waxing and waning of symptoms. Women, persons from lower social classes, and ethnic subgroups are more likely to
develop somatoform disorder. Furthermore, somatization is more often seen after separation and divorce. Somatoform disorders are often associated with other psychiatric disorders, especially depression, anxiety disorders, substance abuse disorders, and personality disorders (see Depression, Anxiety and Anxiety Disorders, and Personality Disorders). Patients with somatoform disorders are often not correctly recognized by their doctors and are therefore insufficiently treated. Also, the latency between onset of the disorder, the appropriate diagnosis, and the referral to adequate treatment is usually long. These factors lead to greatly increased costs for medical and psychosocial services (costs of care nine times the median costs for all patients).
2. Etiology and Course
No single factor can explain the pathogenesis of somatoform disorders. As with other psychiatric disorders, biological, psychological, and social factors interact in complex ways (see Mental Illness, Etiology of). Their impact probably varies over time within the same individual and between different patients, resulting in a common final pathway. There is no general etiological model for somatization phenomena (Pilowsky 1997). However, interesting preliminary data demonstrate that patients with somatization disorder may have an abnormality in cortical and neuropsychological functioning, e.g., bifrontal impairment of the hemispheres, abnormal auditory-evoked potentials, and elevated levels of cortisol (Guggenheim 2000, Rief and Nanke 1999).
Figure 1 Risk model for somatoform disorders (adapted from Rief and Hiller 1998)
Several etiological conditions play a prominent role, especially in maintaining the process, and can be considered as empirically derived risk factors for somatoform disorders (see Fig. 1).
(a) Genetic risks, e.g., higher rates of somatoform disorders for monozygotic than for dizygotic twins, and in first-degree relatives.
(b) Higher physiological reactivity, which can lead to pervasive interoception of physiological signs. Unspecific bodily sensations are perceived and interpreted as symptoms of disease. The consequence is the intensification of the complaints and elevated attention to them.
(c) Biographic vulnerability: physical and sexual abuse (within the family) and traumatic experiences like war and natural catastrophes are over-represented in these patients. Furthermore, chronically ill parents or siblings, and learning by model within the family, may influence the localization and types of somatoform symptoms.
(d) Personality traits, e.g., difficulties in the perception and expression of emotions, with an inverse relationship between reduced emotional expression and heightened psychophysiological arousal in stressful situations.
(e) Unrealistic health beliefs with a low tolerance for normal complaints, ‘catastrophizing’ ideas about physiological processes, and exaggerated expectations regarding the effects of modern medicine on all kinds of complaints.
(f) Chronification factors like multiple diagnostic procedures, the overvaluation of normal variants by the treating physician (iatrogenic factors), and positive (e.g., attention from others) or negative reinforcement (e.g., defense mechanism for negative emotions, advantages of the sick role, withdrawal from responsibilities).
Which of these potential factors are important, and how much impact they have in the individual patient, has to be determined within the diagnostic and psychotherapeutic process (see Mental Illness, Etiology of).
3. Diagnostic Process
The diagnostic process follows a sequence starting with a careful monitoring of all presented bodily complaints and ending with a reliable and valid diagnosis of a specific subtype of somatoform disorder and its severity. Knowledge of base rates for somatoform disorders and of the potential medical disorders which have to be ruled out, the age of the patients and their presentation style, a history of other bodily complaints, and excessive visits to other doctors may all point toward the diagnosis. The physician is confronted with the dilemma of possible false decisions: (a) they might diagnose somatoform disorder although the symptoms are better explained by a somatic disease which has to be treated according to medical guidelines (type-1 error, ‘false positive’); (b) the diagnosis of a somatic disease/dysfunction is assumed although the patient suffers from a somatoform disorder (type-2 error, ‘false negative’). In most cases doctors and patients judge the risk of missing the correct somatic diagnosis as more adverse and therefore try to minimize it. However, this is one of the major factors maintaining somatoform disorders. A pragmatic rule for diagnosis is ‘to pursue somatic diagnostic efforts as much as necessary but as little as possible.’ Thorough, but not excessive or indefinite, medical examinations have to rule out organic disease and dysfunctions that might be causing the symptoms (e.g., hyperparathyroidism, multiple sclerosis, or brain tumor). This has to be supported by adequate and plausible procedures (e.g., physical and neurological examination, laboratory examination). Results and documents from other doctors should be reviewed in order to prevent repeated examinations without relevance. A precise mental health examination is necessary to obtain a complete overview of previous severe psychiatric disorders and minor complaints and of the current psychopathological status.
In individual cases it can be difficult to decide whether the symptoms can better be explained by affective, anxiety, and personality disorders or whether they are genuine comorbid states. The sequence over time and the severity of symptoms are helpful in finding the correct hierarchy. Symptoms, pathological illness behavior, course, and consequences of the complaints will allow one to find the adequate nosological diagnosis according to the diagnostic criteria of DSM-IV or ICD-10. This diagnosis within the group of somatoform disorders has the status of a working hypothesis and is helpful for further treatment planning. If the patient visits a mental health specialist, the specialist will gather specific information regarding the nosological diagnosis. A sophisticated analysis of all relevant information is indispensable; it covers individual development, learning and predisposing factors, the onset and further course of the symptoms (based on a life-chart interview), and current psychosocial conditions, conflicts, and traumata. Of course it partly
depends on the theoretical assumptions and models of the different psychotherapeutic schools (see Differential Diagnosis in Psychiatry).
4. Treatment
Since psychological factors are most influential in the development and course of somatoform disorders, they have to be addressed in every effective treatment. Of course the setting, intensity, and content of psychological interventions can vary depending on factors like severity of symptoms, comorbidity, insight and motivation of the patient, and availability of psychotherapy. In a step-by-step approach the primary care physician should start with short-term verbal interventions. Acute and isolated somatoform symptoms may respond to psycho-education and counseling within the usual medical visits. Counseling and advice on general and easy-to-administer methods of symptom monitoring and management can be helpful. The physician should know the pitfalls in the treatment of patients with somatoform disorders and avoid procedures that might reinforce the patients’ pathological illness behaviors (e.g., no medical tests or laboratory examinations without clear indication). Information about and motivation for more specific forms of psychotherapy are essential if these basic interventions within the primary care system fail. For the more severely affected patients a referral to mental health experts is needed; for chronic and therapy-refractory cases even psychiatric or psychosomatic inpatient treatment may be necessary. Unspecific factors that are important in any psychotherapy (e.g., therapeutic alliance, expectations of being helped, enhancing the sense of mastery, etc.) can be differentiated from specific therapeutic techniques and be adapted to relevant pathological processes in somatoform disorders (see below). Multimodal and multidisciplinary treatment approaches seem to be more effective than single treatment approaches. A validating, accepting, and (especially in the beginning) nonconfronting patient–therapist relationship is a prerequisite for active participation in therapy. 
It is also the basis for the development of an alternative model of illness and the motivation to change. Self-monitoring diaries aim at a precise identification of symptoms in relation to external or interoceptive stimuli, events, and their positive and negative consequences in concrete situations. Psychoeducation and cognitive therapy are essential for a critical discussion and revision of inadequate concepts about health and disease and a better understanding of ongoing mechanisms. For example, the hypochondriac patient may realize the effects of a feedback loop leading to a vicious circle as is the case in increased autonomic arousal through anxious thoughts and worries about illness. In a collaborative way the therapist tries to convey a more realistic and helpful
paradigm of body–mind interactions which aims at a better understanding and mastery of symptoms. Relaxation training is provided as a method for an active reduction of tension, arousal, and apprehension. The patient is motivated to reduce maladaptive illness behaviors like body scanning and checking, excessive medical visits, seeking reassurance, and social withdrawal. Cooperation with physicians and family members is warranted so that they can avoid inadequate reactions to the patients’ illness behaviors. Alternative behaviors (e.g., exercise, sports, and positive activities) have to replace the symptomatic ones and can help the patient to re-experience bodily functions in a more positive way. Stress management, social competence, and problem-solving interventions can be used optionally to improve coping with stress, conflicts, and interpersonal problems. Furthermore, therapy has to prevent relapses and promote a more balanced life style. As some of the symptoms might resist the described interventions, or the interventions may take a longer time to be effective, patients have to learn how to ‘live with the residual complaints as best as possible’ and to regain a better quality of life instead of waiting for complete remission. Up to now there is no empirically validated form of specific pharmacological therapy. Clinical experience shows that in some cases tricyclic antidepressants and selective serotonin reuptake inhibitors might be helpful, especially in patients with comorbid depression, anxiety disorders, and pain. However, without concomitant psychotherapy the risk of relapse is usually high. Benzodiazepines and low-potency neuroleptics should not be given because of the risks of long-term intake, especially abuse and dependency with benzodiazepines and tardive dyskinesia with neuroleptics. Compared to other psychiatric disorders the empirical evidence for effective treatments is low in somatoform disorders. 
But there has been a growing number of controlled outcome studies since the syndromes were clearly defined in 1980. These studies have to face the fact that most of these patients are not treated within mental health services but in the primary care system or in consultation and liaison services. Additional difficulties arise from the complexity of recent approaches, which can only be evaluated as ‘total treatment packages’ (see Antidepressant Drugs; Behavior Therapy: Psychiatric Aspects).
5. Summary and Outlook
The clear operationalization of diagnoses within the DSM and ICD classification systems stimulated many empirical studies concerning the epidemiology, etiology, pathogenesis, and treatment of somatoform disorders in the 1980s and 1990s. Medical and mental health services have realized the clinical relevance of these disorders and the need for early recognition and multidisciplinary intervention programs. Clinical
observations and outcome studies have challenged the earlier claim that patients with somatoform disorders in general have a poor prognosis. The acceptance of psychotherapeutic and psychiatric interventions can be enhanced if their positive effects are transparent to the patient. Convincing models for a better explanation and understanding of symptoms, the definition of limited and concrete goals, and the provision of successful experiences in symptom reduction and improved coping abilities may have the function of a ‘positive vicious circle’ and convey a sense of mastery. However, patients and their caregivers have to overcome a series of obstacles: (a) The first step is the correct recognition and early diagnosis by the primary care physician. (b) Divergent health belief systems of patient and physician may burden the relationship and cooperation between them, so that it is difficult to implement psychological interventions. (c) Although psychological treatment is indicated, physicians often avoid time-consuming verbal interventions or are not adequately trained for their application. However, it is not clear whether psychotherapeutic interventions that have been proven within a psychotherapeutic context can be adopted by general practitioners without diminishing their efficacy. (d) If physicians try to refer to mental health experts, many patients refuse this referral or are not motivated for psychotherapy. (e) Since empirical validation of different treatment approaches for somatoform disorders is still rare, the selection of therapeutic methods mainly depends on local availability and personal preferences rather than on evidence-based criteria. Each of these items is a challenge for clinical and research efforts as well as for a better dissemination of existing knowledge and competence to the different health professionals who are involved in the treatment of somatoform patients. 
There is also an ongoing debate concerning the psychiatric nosological grouping of somatoform and related disorders and their diagnostic criteria (Rief and Hiller 1999). High rates of mutual comorbidity between anxiety, affective, substance abuse, and somatoform disorders have challenged the usefulness of the concept of discrete and stable nosological entities with different treatment needs. Some clinicians propose a concept of spectrum disorders with more permeable boundaries between different, but obviously linked, clusters of psychopathology and their courses in a long-term perspective. Specifically, the relevance of subclinical forms of disorders as precursors and as remaining psychopathological states after partial remission is stressed. Furthermore, the grouping within the somatoform disorders is questioned, e.g., for hypochondriasis, where parallels to anxiety disorders are obvious. Etiologic and pathogenetic models of panic and obsessive compulsive disorder also seem to be of value for a better understanding and effective modification of health anxieties in somatization syndromes. Some new candidates are thought to be potentially affiliated as psychiatric diseases with apparent parallels to somatoform disorders. Chronic fatigue syndrome, fibromyalgia, multiple chemical sensitivity, etc. are syndromes with bodily complaints of unknown etiology and pathogenesis. Their treatment might profit from adopting interventions that have been successfully applied in somatoform disorders.
See also: Fear: Psychological and Neural Aspects; Freud, Sigmund (1856–1939); Hysteria; Janet, Pierre (1859–1947); Mental Health and Normality; Psychoanalysis in Clinical Psychology; Psychoanalysis: Overview; Psychosomatic Medicine; Stress and Health Research; Stress, Neural Basis of
Bibliography
American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders: DSM-IV, 4th edn. American Psychiatric Association, Washington, DC
Barsky A J, Klerman G L 1983 Overview: Hypochondriasis, bodily complaints and somatic styles. American Journal of Psychiatry 140: 273–81
Bass C M, Murphy M R 1990 Somatization disorder: Critique of the concept and suggestions for future research. In: Bass C (ed.) Somatization: Physical Symptoms and Psychological Illness. Blackwell, Oxford, UK
Escobar J I, Burnam M A, Karno M, Forsythe A, Golding J M 1987 Somatization in the community. Archives of General Psychiatry 44: 713–8
Guggenheim F G 2000 Somatoform disorders. In: Saddock B J, Saddock V A (eds.) Comprehensive Textbook of Psychiatry. Lippincott Williams and Wilkins, Philadelphia, PA, pp. 1504–32
Kellner R 1985 Functional somatic symptoms and hypochondriasis. Archives of General Psychiatry 42: 821–33
Lipowski Z J 1988 Somatization. The concept and its clinical application. American Journal of Psychiatry 145: 1358–68
Mayou R, Creed F, Sharpe M 1995 Treatment of Functional Physical Symptoms. Oxford University Press, Oxford, UK
Pilowsky I 1997 Abnormal Illness Behaviour. Wiley, Chichester, UK
Rief W, Hiller W 1998 Somatisierungsstörung und Hypochondrie [Somatization Disorder and Hypochondriasis]. Hogrefe, Göttingen
Rief W, Hiller W 1999 Toward empirically based criteria for the classification of somatoform disorders. Journal of Psychosomatic Research 46: 507–18
Rief W, Nanke A 1999 Somatization disorder from a cognitive-psychobiological perspective. Current Opinion in Psychiatry 12: 733–8
Simon G E, von Korff M 1991 Somatization and psychiatric disorder in the NIMH Epidemiologic Catchment Area study. American Journal of Psychiatry 148: 1494–1500
Warwick H M C, Salkovskis P M 1980 Hypochondriasis. Behavioral Research and Therapy 28: 105–17
World Health Organization 1991 International Classification of Diseases (ICD-10). WHO, Geneva
J. Angenendt and M. Härter
Somatosensation
The somatosensory system provides both some of our most immediate and compelling sensations (e.g., toothache, orgasm, and the urge to urinate) and those which, almost unnoticed, guide our movements as we write a letter, button a shirt, knead bread dough, use a screwdriver, or simply walk across the room. Under the rubric of somatosensation we include not a single modality, but several distinct submodalities: touch, proprioception, thermoception, and nociception, each served by characteristic receptor types, and conveyed to the brain by distinct neural pathways. Unlike those for vision or hearing, receptors for somatosensation are not localized in one place, like the retina or cochlea, but scattered throughout the body: in skin, muscle, and viscera (Gardner et al. 2000). The sensitivity of those receptor systems is astonishing. Using active touch of the fingertip, we can distinguish a raised dot 0.006 mm high and 0.04 mm wide (smaller than Braille) from a smooth surface. This high degree of spatial resolution is matched by a temporal resolution that enables a Braille reader to process up to 100 characters per minute (Phillips et al. 1990). Similarly, as a probe is moved stepwise across the skin, we can identify, with great precision, not just the location of the probe, but the direction and speed of its movement and the duration of its contact. Likewise, variations in the angle of the knee joint can be detected with a resolution of 0.5°. Note that all of these feats of discriminative somatosensation involve movement: either movement of a limb with respect to the body, movement of a body surface with respect to the stimulus, or of a stimulus with respect to the body surface. These examples highlight several key features of the somatosensory system: its multimodal nature, its close association with movement (in proprioception and active touch), and the topographic (i.e., somatotopic) mapping of sensory inputs with respect to the body image. 
Although the somatosensory system processes both pain and temperature, this review will focus on its role in mechanoreception. Progress in this area over the past few decades has involved a continuing interaction between studies of somatosensory psychophysics in humans and animals, especially primates (Talbot et al. 1968). The development of methods for single-unit recording and controlled presentation of tactile stimuli to awake primate preparations has made it possible to study, at the cellular level, neuronal activity correlated with performance on somatosensory discrimination tasks (Gardner and Kandel 2000). As with any sensory system, we begin with the initial elements of the system (the receptors and primary afferent fibers), characterizing their receptive field properties and considering their putative contribution to the encoding of specific sensations or sensory attributes. We identify the ascending neural pathways that convey the encoded information to the spinal and cranial divisions of the CNS (central nervous system), and thence to the thalamus and somatosensory cortex. The somatotopic and physiological organization of somatosensory cortical regions is reviewed, with a focus on the manner in which information about modality and topography is integrated at the cortical level to provide functionally significant outputs. In the final section we examine processing within somatosensory cortex and between the ‘primary’ sensory areas and adjacent cortical ‘association’ and ‘motor’ areas involved in the generation of ‘percepts’ and the control of movement.
1. Receptors, Receptive Field Properties and Encoding of Somatosensory Properties
Human skin contains several classes of mechanoreceptors (Merkel disks and Meissner’s, Ruffini, and Pacinian corpuscles) whose differential stimulus selectivity is associated with a distinctive morphology. All four are capable of transducing stimulus energy into peripheral neural activity via the generation of receptor potentials. Their associated afferent fibers may be classified with respect to receptor location (cutaneous vs. deep), receptive field size (small vs. large), and adaptation to maintained stimulation (slowly vs. rapidly adapting). In humans, a given receptor type is uniquely associated with a specific type of conscious sensation (e.g., light touch-pressure, flutter-vibration, stimulus movement). In addition to these tactile receptors, muscles and joints are endowed with a variety of proprioceptors (e.g., muscle spindles, Golgi tendon organs) activated by limb displacement. This ensemble of mechanoreceptors and their associated afferents mediates the extraordinarily rich array of sensations that accompany contact with the skin and movement of the limbs. By combining analysis of single primary afferent responses to controlled mechanical stimulation in monkeys with psychophysical measurements in humans, it has been possible to assess the differential contribution of different receptor classes to the encoding of tactile stimulus properties. Such studies suggest that spatial properties of tactile stimuli (form, texture) are best preserved in the discharge of slowly adapting cutaneous receptors, and their temporal properties (flutter, vibration, slip) in the discharge of rapidly adapting afferents. However, natural stimuli do not engage a
single receptor, but a population of receptors of different types. Thus, complex movements like grasping, lifting, and manipulating objects are guided by a continuous flow of afference from slowly and rapidly adapting receptors of several types, while information about the static position and movements of limbs involves a combination of muscle, joint, and cutaneous inputs (Johnson and Hsiao 1992, Valbo et al. 1984).
2. Ascending Neural Pathways: Modality Segregation and Somatotopic Organization in the Lemniscal System
Each of the receptor types discussed above is associated with a primary afferent fiber arising from a peripheral ganglion cell in the spinal or cranial divisions of the somatosensory system and conveying inputs from the body or head, respectively, to higher levels of the system. The ascending somatosensory pathways are organized according to the principle of modality segregation, which ensures that information from each type of receptor is conveyed along a separate ‘labeled’ line, to terminate in a unique collection of cells in the CNS. This ensures that, at least at early stages of processing, information about, e.g., light touch, vibration, or proprioception is kept separate and processed in parallel within a ‘lemniscal’ pathway. Moreover, pain and temperature inputs are conveyed and processed by a distinctly different, anterolateral, pathway. Information from the different mechanoreceptor types projects to ‘higher’ brain regions via a series of sensory ‘relay’ nuclei at brainstem and thalamic levels. The organization of this ‘lemniscal’ projection reflects not only modality segregation but an initial dissociation of inputs from the body (spinal) and head (trigeminal) divisions into dorsal column-lemniscal and trigeminal lemniscal pathways, respectively. At each level of the pathway, the representations of head and body within a nucleus are somatotopically organized. For example, at lower brainstem levels (dorsal column nuclei: gracilis and cuneatus) the body representation (which is ipsilateral) is mapped so as to form a ‘homunculus,’ with the lower trunk, legs, and feet medial in the medulla, and the upper trunk, hands, and arms lateral. 
The body and head representations travel in the medial and trigeminal lemnisci, which cross at medullary levels so that in the thalamic ‘homunculus’ the contralateral head and body are represented in the medial and lateral posterior subnuclei (VPM, VPL), respectively. The ventrobasal complex, in turn, relays this information to two cortical somatosensory areas: SI, located on the postcentral gyrus of the parietal lobe, and SII, on the banks of the lateral fissure. In addition to sharing a thalamic input, the two regions are linked by reciprocal connections within and between the two hemispheres (Gardner and Kandel 2000).
3. Somatosensory Cortex: Brain Maps and Sensory Function. Anatomical, Physiological and Functional Organization
SI occupies a cortical region divisible, by cytoarchitectonic criteria, into Brodmann’s areas 3a, 3b, 1, and 2, and lying along an antero-posterior axis bounded rostrally by motor cortex and caudally by posterior parietal regions. Within the area defined as SI there are four somatotopically similar but physiologically distinct representations of the body surface (Nelson et al. 1980). These somatosensory maps are ‘functional’ rather than ‘veridical.’ Not all parts of the body are equally represented, since the representation of a body part reflects its significance for the animal, not the actual space it occupies. The ‘significance’ of a body part is highly correlated with the density of its receptor innervation. Thus the tail of the spider monkey, the whiskers of the rat, and the hand and tongue of humans occupy disproportionate amounts of the cortical map of those species. Moreover, the cortical map preserves the modality segregation inherent in the ‘labeled line’ processing of receptor inputs. The representation in each of the four areas is dominated by input from one mechanoreceptor submodality: in 3a it is primarily muscle stretch receptors; in 3b, cutaneous receptors; in area 2, deep pressure receptors; and in area 1, rapidly adapting cutaneous receptors. Despite these differences in their predominant inputs, the various somatosensory representations within SI share a common physiological organization, evident in the results of single-unit studies. As an electrode advances downward from the cortical surface along a radial traverse through successive layers, all the neurons encountered in its path will respond to the same type of stimuli (cutaneous or deep, rapidly or slowly adapting, muscle or skin) originating from the same part of the body. 
This principle of ‘columnar organization,’ first encountered in SI, has proven to be an important principle of cortical organization across other sensory systems. For the somatosensory system, the mapping process results in an orthogonal relationship between modality and topography, such that a recording from any point on the map will generate information about both the nature of the peripheral sensory message and the part of the body from which it originates.
4. Somatosensory Plasticity and the Phantom Limb Syndrome
The complexity and stereotypy of cortical maps suggested that such maps are a fixed and relatively immutable property of mammalian somatosensory cortex. However, it is now evident that the somatotopic maps defined by recording studies are modifiable, even in adult primates, by a variety of sensory and experiential manipulations, including deafferentation and behavioral training. Differential use of specific digits or portions of a single digit (in monkeys, or human Braille readers) may result in expansion of the map area devoted to the stimulated region (Recanzone et al. 1992). Deafferentation of specific portions of the map of the hand in primates produces a sequence of stages, in which the denervated area may initially be unresponsive, but then becomes gradually responsive to inputs, not from the originally innervated region, but from regions adjacent to it. It is as if the areas devoted to the adjacent regions expand into the region deprived of its original input by the deafferentation. This interpretation is supported by an experiment in which removal of a single middle digit (D3) is followed by expansion into its representation by the adjacent digits (D4 and D2). Similarly, after surgical fusion of two contiguous digits (D3, D4) the receptive fields of the two digits become continuous across the area of fusion, and this continuity remains even after the fusion is removed (Clark et al. 1988, Jones 2000). However, the most striking examples of such somatosensory ‘plasticity’ are seen after limb amputations, which result in expansion of the adjacent face representation into the area originally activated by inputs from the limb. Stimulation of the face in such animals activated neurons, not only in the face area, but also in the denervated hand or arm areas. Anatomical tracing experiments have shown that these new patterns of excitation are correlated with the emergence (over time) of a widespread expansion of lateral connections in the affected cortical regions. These findings have prompted a reexamination of the ‘phantom limb’ phenomenon of human amputees, and the suggestion that it may reflect reorganization processes similar to those seen in denervated monkeys (Florence et al. 1997, Kaas et al. 1983, Ramachadran 1993).
5. From ‘Somatosensation’ to Somatosensorimotor Functions
Ablation studies of SI and SII have demonstrated a rough correlation between the physiological properties of anatomically discrete somatosensory cortical areas and the differential function of those subdivisions. Ablation studies of SI in primates are consistent with the physiological data. The differential effects of lesions of SI subdivisions upon tactile discriminations (texture, size, and shape) are correlated with the stimulus properties processed by that area. Lesions of SII, in addition to their effects upon texture and size discriminations, also disrupt the ability to transfer a tactile discrimination task from one hand to the other, an observation consistent with the presence, in SII, of a bilateral somatosensory representation, mediated by callosal connections (Carlson 1980). However, the parallel processing of peripheral inputs to SI may represent only a first stage in the transition from ‘sensation’ to perception and movement, i.e., from the identification of specific sensory attributes to the recognition of, or interaction with, objects. The possibility of a serial, possibly ‘hierarchical’ mode of processing is suggested both by the intracortical connectivity patterns within SI and the increasing complexity of the receptive field properties of neurons encountered in penetrations along the anterior-posterior axis of the postcentral gyrus. The initial thalamic input to SI terminates primarily within layers III and IV of area 3b, which project, in turn, upon layer IV of more caudal areas 1 and 2. This connectivity pattern is accompanied by an increase in both the size and complexity of somatosensory receptive fields along the anterior-posterior axis. Thus neurons recorded in area 3b tend to have relatively small, punctate fields involving only a single digit, and comparable with those recorded from primary afferent fibers. The receptive fields of posterior neurons (areas 2 and 1) are responsive across larger areas, including the palm and several digits, and respond to moving stimuli with distinct preferences for the direction or orientation of movement (Gardner et al. 1989). Such neurons would be relatively unresponsive to passive contact by small probe tips, or active contact with a smooth sheet of paper, but show increased discharge rates to edges and three-dimensional shapes. Such observations suggest a special role for areas 1 and 2 in the digital coordination involved in grasping and manipulation of objects. Reversible (pharmacological) disruptions of cortical activity centered on area 2 produced a marked impairment in such manipulative abilities in primates. Interestingly, ablations of specific finger regions of areas 3a and 3b deactivate homologous digit areas in area 1, but not the reverse (Iwamura et al. 1991). 
Finally, lesion data also suggest that SI lesions which disrupt complex discriminative functions (such as categorizing the relative speed of a tactile stimulus) do not impair a simple stimulus detection task. These findings are consistent with serial processing mechanisms involving the convergence of processed inputs from areas 3b, 3a, and 1 upon area 2. Evidence for an additional level of serial processing involving SII comes from observations that neurons in this region may have still more complex receptive fields, including bilateral representations, involvement in active touch, and modulation (enhancement or suppression) of neuronal activity by attentional processes (Burton et al. 1995). The participation of such processes should not be surprising. In nature, animals seek out, attend to, and actively engage objects, using ‘mobile sensors,’ such as the human fingers, to generate movement patterns directed at surfaces and three-dimensional objects (Lederman and Klatzky 1987). Such ‘haptic’ scanning patterns are among the most complex movements mediated by the somatosensorimotor system. Their generation, sensory control, and contribution to object recognition are likely to involve the interaction of SI and SII with contiguous
frontal and parietal ‘association’ cortex (Gardner and Kandel 2000). See also: Audition and Hearing; Olfactory System; Pain, Neural Basis of; Phantom Limbs; Vestibular System
Bibliography
Burton H, Fabri M, Alloway K D 1995 Cortical areas within the lateral sulcus connected to cutaneous representations in areas 3b and 1: A revised interpretation of the second somatosensory area in macaque monkeys. The Journal of Comparative Neurology 355: 539–62
Clark S A, Allard T, Jenkins W, Merzenich M M 1988 Receptive fields in the body surface map in adult cortex defined by temporally correlated inputs. Nature 332: 444–5
Carlson M 1980 Characteristics of sensory deficits following lesions of Brodmann’s areas 1 and 2 in the postcentral gyrus of Macaca mulatta. Brain Research 204: 424–30
Florence S, Jain N, Kaas J 1997 Plasticity of somatosensory cortex in primates. Seminars in Neuroscience 9: 3–12
Gardner E P, Kandel E 2000 Touch. In: Kandel E, Schwartz J, Jessell T (eds.) Principles of Neural Science. McGraw-Hill, New York, Ch. 23, pp. 451–71
Gardner E P, Martin J H, Jessell T 2000 The bodily senses. In: Kandel E, Schwartz J, Jessell T (eds.) Principles of Neural Science. McGraw-Hill, New York, Ch. 22, pp. 430–50
Gardner E P, Hamalainen H A, Palmer C I, Warren S 1989 Touching the outside world: Representation of motion and direction within primary somatosensory cortex. In: Lund J S (ed.) Sensory Processing in the Mammalian Brain: Neural Substrates and Experimental Strategies. Oxford University Press, New York, pp. 49–96
Iwamura Y, Tanaka M 1991 Organization of the first somatosensory cortex for manipulation of objects: an analysis of behavioral changes induced by muscimol injection into identified cortical loci of awake monkeys. In: Franzen O, Westman J (eds.) Information Processing in the Somatosensory System. Macmillan, London, pp. 371–80
Johnson K O, Hsiao S S 1992 Neural mechanisms of tactual form and texture perception. Annual Review of Neuroscience 15: 227–50
Jones E G 2000 Cortical and subcortical contributions to activity-dependent plasticity in primate somatosensory cortex. Annual Review of Neuroscience 23: 1–37
Kaas J, Merzenich M, Killackey H 1983 The reorganization of somatosensory cortex following peripheral nerve damage in adult and developing animals. Annual Review of Neuroscience 6: 325–56
Lederman S J, Klatzky R 1987 Hand movements: a window into haptic object recognition. Applied Cognitive Psychology 19: 342–68
Nelson R J, Sur M, Fellerman D J, Kaas J H 1980 Representations of the body surface in postcentral parietal cortex of Macaca fascicularis. The Journal of Comparative Neurology 611–43
Phillips J R, Johansson R S, Johnson K O 1990 Representation of Braille characters in human nerve fibers. Experimental Brain Research 81: 589–92
Ramachadran V S 1993 Behavioral and magnetoencephalographic correlates of plasticity in the adult human brain. Proceedings of the National Academy of Sciences 90: 10413–20
Recanzone G H, Merzenich M M, Schreiner C E 1992 Changes in the distributed temporal response properties of S1 cortical neurons reflect improvements in performance on a temporally based tactile discrimination task. Journal of Clinical and Experimental Neurophysiology 67: 1071–91
Talbot W H, Darian-Smith I, Kornhuber H H, Mountcastle V B 1968 The sense of flutter vibration: Comparison of the human capacity with the response patterns of mechanoreceptive afferents from the monkey hand. Journal of Clinical and Experimental Neurophysiology 31: 301–34
Valbo A B, Olsson K A, Wetsberg K G, Clark F J 1984 Microstimulation of single tactile afferents from the human hand: Sensory attributes related to unit type and properties of receptive fields. Brain 107: 727–49
P. Zeigler
Somatotherapy, History of

1. Definition and Roots

In psychiatry, somatotherapy refers to a variety of treatments of mental illness by physical or chemical means. It also includes psychosurgery and operations practiced in the late nineteenth and early twentieth century, such as castration, or removal of the clitoris, tonsils, teeth, etc. (Kalinowsky and Hippius 1969). The roots of physical therapies are in bloodletting and in the use of emetics and purgatives. The roots of chemical therapies are in soporifics, which have been in use since ancient medicine. Poppy (opium) and Indian hemp (Cannabis indica) were known to the Egyptians, and mandrake (Atropa mandragora) to the Babylonians and Hebrews. Henbane (Hyoscyamus) and dewtry (Datura stramonium) were used by the Orientals and the Greeks. The roots of psychosurgery are in ‘decompressive trephining,’ which had been practiced by the neolithic Gauls and the Bohemians (Garrison 1929).

2. Pharmacotherapy: Behavior Control

The developments that led to pharmacotherapy in psychiatry began with the isolation of morphine from opium in Germany by Serturner in 1806; the separation of bromine from the ashes of seaweed in France by Balard in 1826; and the synthesis of chloral hydrate in Germany by Liebig in 1832. Morphine, for controlling agitation and aggression, and potassium bromide, for relieving restlessness and anxiety, were introduced for clinical use in the late 1850s. Chloral hydrate for controlling insomnia became available about 10 years later. The judicious use of these three drugs provided the necessary means for day- and night-time sedation. It also allowed the replacement of physical restraint by pharmacological control (Lehmann and Ban 1970).

Isolation of hyoscine (scopolamine) from hyoscyamine and the preparation of paraldehyde during the 1880s provided alternative treatments. Subcutaneously administered combinations of morphine (or apomorphine) and scopolamine (hyoscine) became the prevailing treatment modality of excitement and agitation in psychiatry by the dawn of the twentieth century (Ban 1969). The synthesis of malonylurea (barbituric acid) by von Baeyer in 1864 opened the path for the development of barbital (Veronal), the first barbiturate hypnotic, introduced in 1903. Barbital, together with many other long-, intermediate-, short-, and ultrashort-acting barbiturates, dominated the treatment of insomnia for about 50 years (Lehmann 1993).

3. Beyond Behavioral Control

The first treatments in psychiatry which aimed beyond behavioral control were fever therapy and prolonged sleep (Shorter 1997).
3.1 Fever Therapy

The development that led to fever therapy began in the mid-1890s. It was triggered by Wagner-Jauregg’s observation that one of his female patients, who contracted erysipelas, recovered from her psychosis after she developed a high fever. Instrumental to further development was the preparation of the tuberculin vaccine in 1890 by Robert Koch and the demonstration of the presence of Treponema (Spirocheta) pallidum in the brains of patients with syphilitic general paresis in 1905 by Moore. The recognition of the heat sensitivity of spirochetes was also important. To follow up his observation, Wagner-Jauregg induced fever with the tuberculin vaccine in several of his patients with general paresis at the University of Vienna. In spite of the favorable results, in 1917 he replaced the vaccine with blood from patients with tertian malarial attacks, because malaria could be reliably terminated by quinine. Treatment with malaria-induced fever was an overwhelming success. Six of the first nine patients treated with ‘fever therapy’ improved; three fully recovered (Wagner-Jauregg 1919). Wagner-Jauregg received the Nobel Prize in 1927 for introducing the first effective treatment for neurosyphilis.

Fever therapy had also been tried in psychiatric disorders other than general paresis, e.g., depression and schizophrenia; and fever had been induced by means other than the tuberculin vaccine and malarial blood, e.g., typhoid vaccine, killed Escherichia coli suspension, sodium nucleinate, sulfur in oil, and hot baths. But only in general paresis did ‘malaria therapy’ have a demonstrable therapeutic effect. Even in general paresis, it yielded ‘satisfactory’ results in only one-third of the patients (Pichot 1983).

3.2 Sleep Therapy

Sleep therapy can be defined as pharmacologically induced sleep for about 20 hours a day for an extended period, e.g., two weeks. It was first introduced in Shanghai with the employment of bromides (‘bromide sleep’) by Macleod for mania in 1900. Sleep therapy was also used by Wolff, with the employment of methylsulfonal, one year later (‘Trionalkur’). Sleep therapy was reintroduced as ‘prolonged sleep’ (‘l’ipnosi prolongata’) with long-acting barbiturates by Epifaneo in 1915 at the University of Turin. It was adopted by Klaesi for the treatment of schizophrenia in 1922 at the Burgholzli psychiatric clinic in Zurich. Klaesi referred to sleep therapy (prolonged sleep) as ‘continuous narcosis’ (Dauernarkose) (Klaesi 1922). For about three decades, during the 1930s, 1940s, and 1950s, ‘sleep therapy’ was used in conditions ranging from manic and catatonic excitement, through severe anxiety and depression, to various forms of schizophrenia.

Various drug combinations had been employed to induce and sustain sleep. Klaesi himself used Somnifen, a barbiturate combination that contains diethyl and allylisopropyl barbituric acid in equal amounts. The Cloetta mixture was another frequently used combination. It contains paraldehyde, amylhydrate, chloral hydrate, alcohol, isopropyl-allyl-barbituric acid, digalene, and ephedrine.

Sleep therapy was of marginal benefit, but of considerable risk. In Klaesi’s original sample of 26 schizophrenic patients, the improvement rate was about 30 percent, that is, about 10 percentage points higher than the 20 percent spontaneous remission rate in a similar population. This relatively small gain was offset by a fatality rate of more than 5 percent from pneumonia and/or circulatory collapse. Nevertheless, ‘sleep therapy’ lingered on, even after the introduction of new psychotropics (with the incorporation of drugs like chlorpromazine in some of the ‘sleep mixtures’), before fading away in the 1960s.

In 1971, Pflug and Tolle found sleep deprivation therapeutic in endogenous depression (Pflug and Tolle 1971). The findings are in keeping with neuropharmacological theory about the action mechanism of tricyclic antidepressants. Since the early 1970s sleep deprivation has been employed with success in the facilitation of the therapeutic effect of these drugs (see Schizophrenia, Treatment of).

4. Physical Therapies

Insulin-induced coma and electrically induced convulsions were the first ‘physical therapies’ of schizophrenia with demonstrable therapeutic effectiveness without considerable risk (Sargant and Slater 1954).

4.1 Insulin Coma

Insulin was isolated from the pancreas by Banting and Best in 1921. Insulin is the hormone that regulates sugar metabolism, and since 1922 it has been used as replacement therapy in diabetes mellitus, a disease caused by its deficiency. The first report on the successful treatment of symptomatic depression with insulin in diabetic patients was published in 1923. Insulin was introduced into psychiatry for stimulation of appetite and sedation in the late 1920s. Its indications were extended to the alleviation of the discomfort in morphine withdrawal in the early 1930s. Manfred Sakel’s report in 1930 was one of the first on the therapeutic effect of insulin in morphine withdrawal.

The origin of insulin coma therapy in schizophrenia was in Sakel’s observation that some of his restless and agitated addicts who inadvertently slipped into coma became tranquil and accessible. By following up in schizophrenics, at the neuropsychiatric clinic of the University of Vienna, the observations he had made in addicts at the Lichterfelde Sanatorium in the suburb of Berlin, Sakel established the effectiveness of insulin coma therapy in schizophrenia. He published his findings in a series of 13 papers in the Wiener medizinische Wochenschrift between November 1934 and February 1935 (Sakel 1935).

Insulin coma therapy compared favorably with all other therapies used in the treatment of schizophrenia in the 1940s. It was especially effective in recent cases. The spontaneous remission rate was doubled if the treatment was used within the first year of the illness. The risks with insulin coma therapy were low. Of the 90 deaths reported out of the 12,000 patients treated with insulin coma therapy in the United States, only half could be attributed to the treatment. One British expert had only one fatality in over 1,000 cases (Mayer-Gross et al. 1960). Nevertheless, insulin coma therapy was short-lived in the Western world. It was promptly abandoned after the introduction of new psychotropic drugs. This was not the case in the Soviet Union and China, where it remained one of the major forms of treatment of schizophrenia until the late 1980s.
4.2 Electroshock

The developments that led to the introduction of electroconvulsive therapy in psychiatry began in the late 1920s with Nyiro and Jablonsky’s clinical observation that epileptic patients improved if they developed schizophrenia. They continued with Laszlo Meduna’s findings in autopsies that the neuroglia, which is the supporting tissue of the nerve cells, was very thick in epilepsy and very thin in schizophrenia, and with his postulation of an antagonism between epilepsy and schizophrenia. To pursue matters further, Meduna induced convulsions for the first time in a schizophrenic patient by the intramuscular administration of camphor on January 23, 1934 at the National Psychiatric Hospital for Nervous and Mental Diseases in Budapest. Adopting the treatment schedule used in malarial therapy, he repeated the injection every third day. After the fifth injection he found the patient markedly improved. In spite of the favorable findings in several patients, Meduna replaced camphor with intravenously administered Cardiazol (Metrazol, pentetrazol) because of the pain caused by the intramuscular injection, the delay in the onset of the convulsions, and the agonizing fear patients experienced during the delay (Meduna 1937).

Pharmacologically induced convulsions were replaced by electrically induced convulsions by Ugo Cerletti and Lucio Bini in the late 1930s. Electroshock (EST), originally referred to as electroconvulsive therapy (ECT), was first given to a schizophrenic patient on April 18, 1938 at the neuropsychiatric clinic of the University of Rome. It was administered through electrodes placed on both temples, a placement that Bini had found safe in dogs. Cerletti and Bini reported their findings with EST to the Italian Academy of Medicine in Rome on May 28, 1938 (Cerletti and Bini 1938). The treatment was therapeutically effective and safe; its only drawbacks were fractures and memory disturbance.

The risk of memory disturbance with EST was greatly reduced by the use of ‘brief-pulse currents’ and by moving the path of the current in the brain away from the memory centers, locating the electrodes on one side or over the very front of the head. An added advantage of bifrontal electrode placement is that seizures are elicited at near threshold frequencies. The risk of fractures was greatly reduced by pretreatment with a neuromuscular blocker. Bennett in the United States was the first, in 1940, to employ curare to relax the muscles (Bennett 1940). Curare, a substance that competes with acetylcholine at the neuromuscular junction, was replaced by succinylcholine, a safer drug, which prevents the action of acetylcholine by depolarizing the muscle end-plates. To prevent the panic caused by not being able to breathe, succinylcholine administration is preceded by the administration of a short-acting barbiturate. ECT, administered under general anesthesia with continuous oxygenation and monitoring of the body’s physiology during treatment, is the safest among the somatic therapies today. It offers specific advantages for the elderly, in pregnancy, and for patients with cardiac disease (Fink 2000, Kaplan and Sadock 1988).

The indications of ECT have been extended over the years from schizophrenia to include mania and depression. However, in mania and schizophrenia, it is used only in treatment failures or in patients with specific contraindications to drugs. In depression ECT is also used as primary treatment in delusional-psychotic patients and in patients perceived as suicidal risks (Abrams 1997, Fink 1979).

In the mid-1990s a new technique, high-energy transcranial magnetic stimulation (rTMS) of the brain, was introduced. Expectations were high that it could eventually replace ECT. This, however, was not to be the case. Compared with ECT, the physiological effects of rTMS are rather modest.
5. Psychosurgery

Development of psychosurgery began in the late nineteenth century in Switzerland with the removal of parts of the brain (topectomy, craniotomy) in six psychiatric patients by Burckhardt. The first operations were of questionable success; one patient died, one improved, two became ‘quieter,’ and two remained unchanged (Burckhardt 1891). Progress in psychosurgery received a great impetus in 1935 from Jacobsen and Fulton’s demonstration that removal (ablation) of the frontal lobe has a calming effect in the monkey. It was their report which prompted Egas Moniz, a Portuguese neurologist, to develop transcortical leukotomy, a method in which the white matter between the frontal lobes and the thalamus in the brain is severed through burr-holes cut in the cranium. The first transcortical leukotomies were conducted in 1935 by Moniz and Almeida Lima, a neurosurgeon, in schizophrenic patients from the Bombarda asylum, at the Santa Naria Hospital in Lisbon. The operation was a success. It reduced the severity of tension and psychotic symptoms in over 60 percent of the patients (Moniz 1937). Moniz received the Nobel Prize in 1949 for his pioneering work in the new field and for his role in the development of psychosurgical techniques.

Psychosurgery was introduced in the United States by Walter Freeman, a Washington neurologist, in collaboration with James Watts, a neurosurgeon. The first prefrontal lobotomy was performed in Washington on a patient with agitated depression at George Washington University in 1936 (Freeman and Watts 1936). About 10 years later, in 1946, Freeman introduced transorbital lobotomy by approaching the brain through the eye socket with an ice pick.

Psychosurgery fell into disrepute during the 1950s and 1960s, because it deprived patients of their judgment and social skills. Demonstration of its effectiveness and safety with the introduction of improved stereotactic techniques, however, led to its revival in the 1970s and 1980s. The new techniques allow discrete lesions to be placed in the brain, and the lesions themselves to be made by radioactive implants, cryoprobes, electrical coagulation, proton beams, and ultrasound waves (Lehmann and Ban 1994). Today, psychosurgery covers a wide range of operations which aim to reduce the severity of psychopathology by destroying specific regions (lobotomy) or connecting tracts (leukotomy, tractotomy) in the brain. Among the most frequently employed operations are cingulectomy in the treatment of refractory obsessive compulsives, and thalamotomy in the treatment of refractory depressions.
6. Pharmacotherapy in the 1930s

There was a steadily growing interest in pharmacotherapy during the 1930s. One of the targets was catatonic inhibition, which by the mid-1930s had been transiently removed in some patients by the inhalation of a mixture of 30 percent carbon dioxide and 70 percent oxygen, and by the administration of sodium amobarbital and apomorphine (Loevenhart et al. 1929). Racemic amphetamine, a powerful stimulant, was introduced by Prinzmetal and Bloomberg in the treatment of narcolepsy in 1935 (Prinzmetal and Bloomberg 1935). Two years later, dextroamphetamine was first used by Charles Bradley for calming hyperexcitable–hyperactive children. These were the first neuropsychiatric applications of the amphetamines. The amphetamines are substances with a phenylpropylamine structure synthesized by Edeleano in 1887. Amphetamines were also tried in the treatment of depression, but with little success. Although treatment with hematoporphyrin, a photosensitizing substance, and dinitrile succinate seemed to offer some promise in the treatment of depression, tincture of opium, introduced during the first quarter of the century, remained the prevailing pharmacological treatment modality in depressive illness during the 1930s, 1940s, and early 1950s (Ban 1981).

7. Causal Treatments: The 1940s

The developments which led to causal treatments in psychiatry during the 1940s began with the isolation of nicotinic acid (vitamin B-3) by Funk in 1911, and the demonstration of its therapeutic effect in pellagra by Fouts and his associates in 1937. Pellagra is a disease which may cause delirium in the acute, depression in the subacute, and dementia in the chronic stages of the illness (Ban 1971). They continued with the discovery of penicillin by Fleming in 1929, and the establishment of its therapeutic effect in syphilitic general paralysis by Stokes and his associates in 1944. They culminated with the recognition of the link between the selective memory disturbance of Wernicke’s encephalopathy, first described in 1881, and thiamine deficiency by de Wardener and Lennox in 1947.

The diagnostic distribution of patients in psychiatric hospitals changed drastically with the introduction of nicotinic acid, penicillin, and thiamine into psychiatry during the 1940s. Psychoses due to cerebral pellagra and dementia due to syphilitic general paralysis had virtually disappeared by the 1950s in the western world, and the prevalence of dysmnesias had markedly decreased. This was the state of the art in the somatotherapy of mental illness immediately prior to the introduction of new psychotropic drugs.

8. Psychotropic Drugs: 1950–2000

Introduction of therapeutically effective psychotropic drugs in the treatment of mental illness began in 1949 and has continued since. The term psychotropic was introduced by Ralph Gerard, an American neurophysiologist, in 1957 for drugs with an effect on mental activity and human behavior.

The history of psychotropic drugs can be separated into three distinct periods. The first corresponds with the introduction of prototype drugs during the 1950s. The second corresponds with the introduction, during the 1960s and 1970s, of the first-generation psychotropics, identified on the basis of structural and/or pharmacological similarities to the prototypes. The third, ongoing period corresponds with the introduction of the second generation of psychotropics, developed with consideration of theories about the pathomechanism of mental illness, derived from studies on the action mechanism of therapeutically effective drugs.

The first set of psychotropic drugs included lithium, chlorpromazine, reserpine, meprobamate, iproniazid, and imipramine. In a broad sense it also included chlorprothixene, haloperidol, and chlordiazepoxide. It consisted of four antipsychotics (neuroleptics, major tranquilizers), i.e., chlorpromazine, reserpine, chlorprothixene, and haloperidol; two antidepressants, i.e., imipramine and iproniazid; two anxiolytic sedatives, i.e., meprobamate and chlordiazepoxide; and one mood stabilizer, i.e., lithium. Antipsychotics are effective in controlling excitement and psychomotor agitation, in the treatment of schizophrenia and mania, and in the management of psychosis. Antidepressants are effective in the treatment of depressive illness and also in panic disorder and obsessive compulsive disease. Mood stabilizers are effective in both the manic and the depressive phases of manic depressive illness, and in the acute, maintenance, and prophylactic treatment of bipolar disease. Anxiolytic sedatives are effective in the treatment of generalized anxiety disorder. Anxiolytics alleviate anxiety and relieve tension.

The first set of psychotropic drugs was introduced within a period of about 10 years. Lithium, in 1949
was first, followed by chlorpromazine in 1952, reserpine in 1954, meprobamate in 1956, imipramine and iproniazid in 1957, chlorprothixene in 1958, haloperidol in 1959, and chlordiazepoxide in 1960 (see Antidepressant Drugs).

The introduction of each of these drugs into psychiatry has its own story. The discovery of the therapeutic effect of lithium in mania was the result of John Cade’s recognition that the substance protected guinea pigs from the lethal effects of urea, responsible for the toxicity of manic urine. The discovery of the therapeutic effect of iproniazid in depression was triggered by clinical observations on the euphorizing effect of the substance when used in the treatment of tuberculosis. The origin of chlorpromazine, the first antipsychotic phenothiazine, was in research with antihistaminics. Its introduction into the treatment of schizophrenia and mania by Delay, Deniker, and Harl followed within two years of its synthesis by Paul Charpentier in 1950. The origin of reserpine was in research with Rauwolfia alkaloids. Its introduction in the treatment of psychosis by Delay, Deniker, Tardieu, and Lemperiere followed within two years of its isolation from the ‘snake-root’ plant by Muller, Schlittler, and Bein in 1952. Meprobamate, the first widely used propanediol in the treatment of anxiety, originated in Frank Berger’s research, which originally aimed at developing antibiotics for Gram-negative bacteria. Haloperidol, the first widely used antipsychotic butyrophenone, originated in Paul Janssen’s research, which originally aimed at developing analgesics more potent than pethidine (meperidine). Imipramine was the result of the search for drugs structurally related to chlorpromazine at Geigy; chlordiazepoxide, of the screening for drugs pharmacologically related to meprobamate at Roche; and chlorprothixene, of the effort to synthesize drugs structurally and pharmacologically related to chlorpromazine at Lundbeck (Ayd and Blackwell 1970, Ban et al. 1998, Ban and Ray 1996, Healy 1996, 1998).

Some of the prototype drugs had a major impact on psychotropic drug development, and by the end of the 1970s the number of psychotropic drugs available for clinical use in the treatment of mental illness had increased to about 50. Today there are well over 20 antipsychotics, antidepressants, and anxiolytic sedatives, and at least three mood stabilizers employed in the treatment of mental illness. In addition, there are also a couple of cholinesterase inhibitors in clinical use for the treatment of Alzheimer type of senile dementia (see Alzheimer’s Disease: Antidementia Drugs).
9. Neuropsychopharmacology in Perspective

The introduction of therapeutically effective psychotropic drugs and the spectrophotofluorimeter during the 1950s led to the development of neuropsychopharmacology.
The spectrophotofluorimeter is an instrument with the capability (resolution power) to measure the concentration of cerebral monoamines, and their metabolites, involved in neuronal transmission at the synaptic cleft. Prior to its introduction, research dealing with central nervous system acting drugs was restricted to behavioral pharmacology and neurophysiological measures. Spectrophotofluorimetry provided direct access to the detection of biochemical changes responsible for behavioral effects.

Employment of spectrophotofluorimetry in the mid-1950s led to the demonstration that iproniazid increased, whereas reserpine decreased, cerebral monoamines, i.e., serotonin (5-HT) and norepinephrine (NE). Since iproniazid, a monoamine oxidase inhibitor (MAOI), induced euphoria in some tubercular patients, and reserpine, dysphoria in some hypertensives, the possibility was raised that the mood changes were mediated by the changes in 5-HT and NE concentrations in the brain (Brodie et al. 1956). The postulation of a relationship between reserpine-induced depletion of monoamines and depression led to the introduction of the reserpine reversal test in the screening for potential antidepressants which, in turn, led to a substantial increase in imipramine-type antidepressants by the end of the 1970s. It also provided the rationale for the research which, with the employment of radioactive isotope labeling techniques, led to the finding that tricyclic antidepressants block the neuronal reuptake of NE and 5-HT (Glowinski and Axelrod 1964).

Employment of spectrophotofluorimetry in the early 1960s led to the demonstration that nialamid, an MAOI, induced an accumulation of the 3-O-methylated metabolites of dopamine and NE in chlorpromazine and haloperidol treated rats.
The increase in the accumulation of 3-methoxytyramine and normetanephrine could be explained by a compensatory increase of tyrosine hydroxylase activity to the drug-induced blockade of catecholamine receptors (Carlsson and Lindquist 1963). Postulation of a relationship between dopamine receptor blockade and antipsychotic effect led to the introduction of tests based on dopamine antagonism in the screening for potential antipsychotics which in turn led to a substantial increase in the number of chlorpromazine-type antipsychotics by the end of the 1970s. It also provided the rationale for the research which, with the employment of X-ray crystallography led to the demonstration of dopamine receptor blockade with antipsychotics (Creese et al. 1975). Development of receptor binding assays during the 1970s opened the path for the determination of NE and 5-HT reuptake blocking potency of antidepressants. Identification of receptor subtypes during the 1980s led to the determination of the dopamine-D2 and serotonin-5-HT2A blocking potency of antipsychotics, and to the delineation of the receptor profile of psychotropic drugs.
Separation of receptor subtypes was further refined during the 1980s by the introduction of genetic technology. With the employment of receptor cloning, by the late 1990s five subtypes of the dopamine receptor, 14 subtypes of the serotonin receptor, and five subtypes of the muscarinic-cholinergic receptor had been identified (Lucas and Hen 1995, Richelson 1999). The new genetic technology rendered the tailoring of psychotropic drugs to receptor affinities a distinct possibility. It opened up a new perspective in the somatotherapy of mental illness.

See also: Peptides and Psychiatry; Psychopharmacotherapy: Side Effects
Bibliography

Abrams R 1997 Electroconvulsive Therapy. 3rd edn. Oxford University Press, New York
Ayd Jr F J, Blackwell B (eds.) 1970 Discoveries in Biological Psychiatry. Lippincott, Philadelphia
Ban T A 1969 Psychopharmacology. Williams & Wilkins, Baltimore, MD
Ban T A 1971 Nicotinic acid and psychiatry. Canadian Psychiatric Association Journal 16: 413–31
Ban T A 1981 Psychopharmacology of Depression. Karger, Basel, Switzerland Munchen, Paris, London, New York, Sydney Australia
Ban T A, Ray O S (eds.) 1996 History of CINP. JM Productions, Brentwood, TN
Ban T A, Healy D, Shorter E 1998 The Rise of Psychopharmacology and the Story of CINP. Animula, Budapest, Hungary
Bennett A E 1940 Preventing traumatic complications in shock therapy by curare. Journal of the American Medical Association 141: 322–4
Brodie B B, Shore P A, Pletscher A 1956 Serotonin-releasing activity limited to Rauwolfia alkaloids with tranquilizing action. Science 123: 992–3
Burckhardt G 1891 Ueber Rindenexcisionen, als Beitrag zur operativen Therapie der Psychosen. Allgemeine Zeitschrift fur Psychiatrie 47: 463–58
Carlsson A, Lindquist M 1963 Effect of chlorpromazine or haloperidol on formation of 3-methoxytyramine and normetanephrine in mouse brain. Acta Pharmacology Toxicology 20: 140–4
Cerletti U, Bini L 1938 L’Elettroshock. Archivio Generale Di Neurologia 19: 266–8
Creese I, Burt D R, Snyder S H 1975 Dopamine receptor binding: Differentiation of agonist and antagonist states with 3H-dopamine and 3H-haloperidol. Life Sciences 17: 993–1001
Fink M 1979 Convulsive Therapy: Theory and Practice. Raven Press, New York
Fink M 2000 Electroshock revisited. American Scientist 88: 162–7
Freeman W, Watts J 1936 Prefrontal lobotomy in agitated depression: report of a case. Medical Annals District of Columbia 5: 326–8
Garrison F H 1929 An Introduction to the History of Medicine, 4th rev. and enl. edn. WB Saunders Company, Philadelphia
Glowinski J, Axelrod J 1964 Inhibition of uptake of tritiated noradrenaline in the intact rat brain by imipramine and structurally related compounds. Nature 204: 1318–9
Healy D 1996 The Psychopharmacologists. Chapman and Hall, London
Healy D 1998 The Psychopharmacologists II. Chapman and Hall, London
Kalinowsky L B, Hippius H 1969 Pharmacological, Convulsive and Other Somatic Treatments in Psychiatry. Grune & Stratton, New York
Kaplan H I, Sadock B J 1988 Synopsis of Psychiatry, 5th edn. Williams and Wilkins, Baltimore, MD, pp. 528–9
Klaesi J 1922 Ueber die therapeutische Anwendung der ‘Dauernarkose’ mittels Somnifens bei Schizophrenen. Zeitschrift fur die gesamte Neurologie und Psychiatrie 74: 557–92
Lehmann H E 1993 Before they called it psychopharmacology. Neuropsychopharmacology 8: 291–303
Lehmann H E, Ban T A 1970 Pharmacotherapy of Tension and Anxiety. Thomas, Springfield, IL
Lehmann H E, Ban T A 1994 Psychiatric research in North America. In: Oldham J M, Riba M B (eds.) Review of Psychiatry. American Psychiatric Press, Washington, DC, pp. 27–54
Loevenhart A S, Lorenz W F, Waters R M 1929 Cerebral stimulation. Journal of the American Medical Association 92: 880–3
Lucas J J, Hen R 1995 New players in the 5-HT receptor field; genes and knockouts. Trends in Pharmacological Sciences 16: 246–52
Mayer-Gross W, Slater E, Roth M 1960 Clinical Psychiatry, 2nd edn. Cassell and Company Ltd., London, pp. 296–7
Meduna L 1937 Die Konvulsionstherapie der Schizophrenie. Marhold, Halle, Germany
Moniz E 1937 Prefrontal leukotomy in the treatment of mental disorders. American Psychiatric Association Journal 93: 1379–85
Pflug B, Tolle R 1971 Disturbance of the 24-hour rhythm in endogenous depression and the treatment of endogenous depression by sleep deprivation. International Pharmacopsychiatry 6: 187–96
Pichot P 1983 A Century of Psychiatry. Roger Dacosta, Paris
Prinzmetal M, Bloomberg W 1935 The use of benzedrine in the treatment of narcolepsy. Journal of the American Medical Association 105: 2051–3
Richelson E 1999 Receptor pharmacology of neuroleptics: relation to clinical effects. Journal of Clinical Psychiatry 60(10): 5–14
Sakel M 1935 Neue Behandlung der Schizophrenie. Perles, Vienna
Sargant W, Slater E 1954 An Introduction to Physical Methods of Treatment. Livingstone, Edinburgh, UK
Shorter E 1997 History of Psychiatry. Wiley, New York
Wagner-Jauregg J 1919 Ueber die Einwirkung der Malaria auf die progressive Paralyse. Psychiatrisch-Neurologische Wochenschrift 20: 251–5
T. A. Ban

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
South America: Sociocultural Aspects

South America, the world’s fourth largest continent, is composed of 12 nation-states and one territory. The significant differences among these 13 countries are matched by great internal sociocultural and ethnic diversity. Contemporary neoliberal and authoritarian governments strive to centralize and homogenize their populations. Internal cultural, social, and political movements heighten diversity. The military often plays a strong role in centralizing efforts. The Roman Catholicism of European colonialism historically and culturally shapes pan-national and internal cultural unity of the 10 Iberian-derived nations. Diversity is especially strengthened by the striking polarization between the small wealthy minority and the large poor majority; by indigenous and Afro-Latin American movements of cultural and ethnic self-affirmation; by the presence of significant ethnic blocs resulting from European migrations since the two World Wars; by the struggles among leftist, rightist, and populist political parties; by the challenge of the twentieth-century spread of Protestantism; and by the liberal Catholicism of recent reforms.
1. Human and Topographic Diversity
The rapidly expanding population contains as wide a spectrum of phenotypic features as is found anywhere in the world. Its modern nation-states include Venezuela, Colombia, Ecuador, Peru, Bolivia, Chile, Uruguay, Paraguay, and Argentina, where Spanish is the official language; Brazil, where Portuguese dominates; Guyana, where English is the national language; Suriname, where English, Dutch, and Sranan are prevalent; and Guyane, an administrative district (département) of France, where French is dominant. Many other languages are spoken in most of these countries.
Western South America is dominated by the mighty Andes mountains, with snow-coned active volcanoes that rise to 20,000 feet. They extend southward from the Caribbean island of Trinidad to terminate in southern Argentina at Tierra del Fuego. In this region Venezuela, Colombia, and Ecuador are known as the northern Andes (and the former two as the Spanish Main); Peru and Bolivia constitute the central Andes; and Chile, Paraguay, Uruguay, and Argentina are called the Southern Cone.
Eastern South America features the great Amazon Basin, epitomized by the Amazon itself, the ‘Ocean Sea.’ Toward the headwaters of the huge feeder rivers of the Amazon lie the mountainous Guiana Shield in the north, the Brazilian Shield in the south, and the Andean foothills or montaña, rimmed by the ceja, ‘eyebrow,’ in the west. Northwest and northeast of the Amazon regions, and west of the Guiana Shield, are the plains (llanos) of Colombia and Venezuela, which phase into the Orinoco Basin, and west of the Brazilian Shield are the llanos de Mojos of Bolivia. A subregion that combines elements of Amazonian and Andean systems is that of the North Pacific coast. From northern Ecuador through western Panama an Amazonian ecosystem prevails. From northern Peru southward, high aridity is characteristic.
2. Precolumbian People: Their Lands, Products, Societies, and Languages
Native South American cultures prior to 1492 represented a dynamic mosaic of languages, cultures, social systems, rituals, and power relationships. Long-distance trade, sometimes extending for thousands of miles, combined with bi- and multilingualism and multiculturality. Such trade traversed some of the most rugged landscapes in the world, creating networks of native knowledge and imagery crosscutting great geographical differences.
The antiquity of South American native peoples is highly controversial. In the east, and thence up the Amazon and Orinoco (the latter has its headwaters in the Parima and Pacarima mountains of the Guiana Shield and flows west, then north, then east out into the Atlantic), evidence suggests an antiquity of 12,000 years or more, but the routes of arrival are obscure. Neither transatlantic nor transpacific routes seemed credible to scholars in 1999, though sporadic contacts in South America by the first Americans with Asian-Pacific and Euro-African peoples remain a possibility. Plant domestication seems to begin in Amazonia and in the Andes by about 8,000 years ago. The period beginning about 6,000 years ago and continuing until the European conquest and colonization marks the florescence of fine ceramics in Amazonia, and the development of textiles, gold, and silver work in the Andes and adjacent coastal regions. At the mouth of the Amazon, ceramics manufactured on Marajó Island, an area the size of Connecticut in the USA, rivaled the quality of any ceramic wares found anywhere in the world in terms of their beauty of form and decoration.
By 4,500 years ago great migrations from the lower and central Amazon region were underway, and regional consolidation in the Andes was occurring. People from coastal, Andean, and Amazonian regions were in contact with one another, and with the people of the Caribbean Islands, Central America, and Mesoamerica. People who belong to three language families (Arawak, Carib, and later Tupi) spread from the eastern and central Amazon to the Orinoco drainage and on to the Caribbean (Arawak), and to the base along the rim of the montaña. Other Amazonian language families of prominence include Tukano, Tukuna, Jívaro, Pano, and Gê; these languages spread in a general arc from northwest of the Amazon around the central Amazonian rim to the southwest. Andean languages were also highly diverse.
The Inca empire, radiating north and south out of its capital of Cuzco, in what is now Andean Peru, emerged in the fifteenth century to severely alter the cultural mosaic in the Andes, and set up political-economic and cultural shock waves that radiated eastward. This empire was in full conquest fervor by the time word arrived from the Caribbean and Central America of the pending Spanish advances down the Pacific Coast and up into the Andes. At the height of their expansion, the Inca controlled all Andean systems from central Chile to southern Colombia, and their north-south 3,416-mile parallel cobblestone roadway (one coastal, one Andean) through this vast territory (385,000 square miles) amazed the conquering Spaniards. Although the Inca never dominated the adjacent indigenous people of Upper Amazonia, there is ample evidence of extensive trade between Amazonia and the Inca-dominated Andes and Coast.
Quechua (a variant of which is Quichua) was the lingua franca used by the Inca in their conquest, and became the language of 12 million people throughout the empire and beyond. Other prominent Andean languages include Chibcha in the north Andes, Aymara in the central Andes, and Mapuche in the southern cone.
In the greater Amazonian areas east of the Andes people depended on maize and manioc cultivation for caloric intake, and on intensive fishing and reptile and amphibian collecting for protein and amino-acid balance. Hunting in the various ecosystems of the flooded forest, the upland forest, and the flood plains and riverine systems complemented the balance of fishing and horticulture. As time progressed manioc (Manihot esculenta Crantz) came to be the staple of the agricultural system, with maize sporadically and regionally grown and consumed. Fishing in rivers, estuaries, streams, and coast was of paramount importance, as were techniques for corralling turtles and other amphibians. Polities varied from well-developed chiefdoms and relatively small-scale states to more dispersed systems. Intermarriage between peoples at war with one another, and speaking different languages, seems to have been fairly common.
In the Andean regions, the potato was the supreme crop, with beans, squash, maize, quinoa, and other grains providing a balanced diet when combined with animal protein. Approximately 240 species of potato were grown, harvested, and prepared in many ways. Llama (one of three Andean camelids), guinea pig, and much later the hairless dog, were the primary animals raised for food, and the llama was used as a beast of burden. Polities ranged from small-scale systems to states and the expansive Inca Empire.
3. The European Conquest of Native Lands and People
A new cultural system based on conquest and capitalism was violently created by the overseas empires of the Spanish and Portuguese after 1492 as they sought huge profits in gold, silver, spices, and slaves. In the Treaty of Tordesillas, signed in 1494, the Spanish and the Portuguese divided up their colonies long before they knew the vast extent of their respective territorial ‘holdings.’ From the terms of this treaty the Portuguese would lay claim to Brazil, and Spain would take the rest, including all of the Pacific regions from South America to the Philippines and Japan, all of the Andes, and parts of Amazonia, a territory itself nearly the size of the USA. Portugal’s interests first lay in the route around the horn of Africa to India, and only a bit later in its vast new colonies. In 1499, sailing (and/or writing) for Spain, Amerigo Vespucci signed his map of the ‘New World,’ thereby creating the currently favored name for the entire continent.
During the early to mid 1500s Spain set out to locate, conquer, and colonize its Amazonian, Andean, and coastal regions, and Brazil set out on the conquest of its Amazonian, coastal, and mountainous regions. The Spanish first brought sugar to the Americas and the Portuguese first brought the banana from India. Both of these overseas empires introduced cattle, sheep, goats, pigs, and chickens. Everywhere they went the conquerors encountered native people. They subdued them by enslavement, trickery, and trade, and they spread European and African diseases for which native resistance was low or nonexistent. Where possible they enlisted them in slaving activities, and they killed indigenous people with firearms, swords, spears, and arrows with steel points. In areas feasible for such warfare they sent their warhorses and war dogs to maim and kill them in their radical recolonization of the subcontinent.
This brutal transformation resulted in 90 percent or greater depopulation of the native inhabitants throughout the continent. Trade and warfare, now with an escalating colonial need for indigenous chattel slaves in the lowlands, and indigenous landed slaves in the highlands, created a persistent poverty that endures into the twenty-first century. Nonetheless, cultural creativity endured with local-level technical competence promoting subsistence and adaptations.
4. Blackness
The Spanish and Portuguese brought Black slaves from Africa and from Iberia with them. Many such people escaped to mountainous, forested, inaccessible areas, and created a myriad of free, fortified sites where Native American and African American people often collaborated, intermarried, and established powerful political systems on the fringe of the dominant Spanish and Portuguese territories. From these refuge sites, often called palenques in Spanish and quilombos in Portuguese, self-liberated residents threatened the Crown political economies. In some regions such as western Ecuador, southwestern Colombia, western Venezuela, and the north of Brazil, active trade between runaway people, often called cimarrones, and the colonial planters ensued.
5. Mestizaje, Cultural Practices and Ideologies of Racial Blending
The Spanish initiated an ideology of ethnic blending, called mestizaje (miscegenation), that excluded both indigenous people and people of African descent. Spanish administrators from the republic of Spaniards demarcated a republic of ‘indios,’ so named by Christopher Columbus, who thought he was in India. There was no legal space for Black people, except what they themselves created. Those excluded from the Spanish-Mestizaje realm were sometimes referred to as la mancha, ‘the (racial) stain.’ From this viewpoint the dominant, European concept of hybridity prevailed in this hegemonic model of racial mixing: the White, civilizing genes and cultures were to mix with the savage or barbarian ‘Indian’ and ‘Black’ to form such intermediate categories as mestizo and mulato in Spanish. In Portuguese, mestiço and mulato referred to White-Black hybridity. Those people with hybridized ‘racial’ features, according to European and later Creole (New World born) ideologues, ranged up and down the scale from white-civilized-Iberian to dark-civilizing-Creole. These diverse mestizos were known as the castas. At some times and in some places the number of their racialized categories ranged from a dozen named ‘types’ to hundreds. Indeed, there could be five or more such ‘types’ among the sons and daughters of one couple. They were constituted as those people somewhere betwixt and between the savage and the civilized, and legal and colloquial labels for some of them included such racialized concepts as ‘wolf,’ ‘coyote,’ ‘throw back,’ ‘up in the air,’ and even ‘there you are!’
Black people (including slave, bozal, and free, ladino) and native people (usually divided into tame, manso, and fierce or wild, bravo) fell outside of the scheme of castas. Their own intermixture, often rendered in Spanish by the term zambo, was (and is) a category filled with racist ideas about danger and power, for here was a mixture without the civilizing effects of European-sponsored hybridity. An indigenous artist using European techniques in 1599 did the first oil painting in South America. His subject was that of the ‘Zambos’ of western Ecuador who, adorned in European princely clothes, wearing indigenous gold jewelry, and carrying spears with iron tips characteristic of Africa and Europe, walked from their free Republic on the coast of Ecuador to Quito to swear alliance to the Spanish Crown in the face of English encroachments.
Before the rise of nation-states, movements of self-liberation by Black people and indigenous people were common, and some were quite successful. Early revolts in the sixteenth century resulted in ethnic regions that endured into the era of nation-state emergence. In 1579 the Quijo and the Jívaro of the Ecuadorian montaña revolted in attempts to unite Amazonian and Andean people. In Peru and Bolivia the revolts of Tupac Amaru and Tupac Katari later did the same. In the forested and mountainous coastal zone of Alagoas, Pernambuco, Brazil, Black and indigenous people formed the state of Palmares in the 1640s, and fought Portuguese and Dutch soldiers for a century. Black people in Suriname revolted in the mid-seventeenth century, and initiated a 100-year war against the Dutch and British, winning independence for six nations (Saramaka, Paramaka, Kwinti, Ndjuka, Matawai, and Aluku) in the interior of this region.
In neighboring Venezuela, the Carib-speaking Yekuana expelled the Spanish in 1767, and maintained their complete independence for 100 years until a brutal nationalist regime militarized the area for the purposes of exploiting wild rubber.
6. The Transformation from Colonies to Nation-states
Wars of nationalist independence from Spanish and Portuguese colonial rule began in 1810 in areas of vast distances from one another, from what is now Mexico through northern South America to the southern cone. Those who achieved positions at the pinnacle of power in South America were criollos, ‘Creoles,’ who outnumbered those of Spanish descent. In turn, they were vastly outnumbered by their darker compatriots, those especially of African descent who fought in wars led by Simón Bolívar in the north, and by José San Martín in the south. As the Creole nationalist wars moved through slave territories, slavery was abolished, but blackness and indigenousness were kept under rigid control by fears that the revolution would become an ‘ethnic’ one. The ethnically marginalized continued to be excluded from dominions of power.
Both Bolívar and San Martín envisioned a united Spanish-speaking Creole nation, and lightness of phenotype within the ideology of mestizaje was clearly ascendant. Such a continental nation did not materialize and, one by one, contemporary nations rapidly emerged from their previously liberated colonial regimes. By 1822 Spain had lost all of its territories in the Americas, and the Portuguese King lost his colony to a new Brazilian monarchy with its own New World Empire. By 1830 all of the republics of Spanish-speaking South America had emerged, only to begin border wars, geopolitical vestiges of which last to the present day. Peru, for example, invaded Ecuador in 1941, claiming half the latter’s territory, and these countries fought a war over a very small sector of disputed territory as recently as 1981 and 1995, both of them in indigenous Jivaroan-occupied sectors long serving as refuge areas due to the rugged, mountainous, Upper Amazonian topography.
African chattel slavery continued in the new states side by side with Black freedom, as various systems of oppression of indigenous people continued side by side with native revolts. The last country to abolish slavery, Brazil, did not do so until 1888, and in some states illegal slavery existed side by side with manumission legalities. Not until 1966 (Guyana) and 1975 (Suriname) would two of the three Guianas gain nation status from the British and the Dutch, respectively.
7. The Fusion of Cultures
The powers of Spanish and Portuguese Catholicism came to South America in the heyday of the Spanish Inquisition. Powerful priests and others sought out indigenous and African systems of thought and knowledge to be stigmatized as idolatry and Devil worship. An underlying fear of converted Jews permeated some middle ranks of conquest and colonialism. Those of humble Iberian origin sought alternative sources of inspiration for religion and magic from indigenous and Black people, thereby heightening the symbolic powers of the oppressed. Fusions of Iberian, African, and Indigenous belief systems and ways of knowing and ‘seeing’ became legion. All this occurred at micro-levels in macrosystems of Iberian Catholic missionization that gives almost all of South America its special ‘Latin American’ sociocultural character.
In the area of language, pidgins emerged out of African, Indigenous, and European imperfect syntheses, to become Creole languages. In medicine the humoral system from Mediterranean classical antiquity fused with African curing programs and indigenous shamanism. The result was hundreds, perhaps thousands, of highly regional systems of health and curing which, in turn, served as vehicles of communication with people in other regions who had experienced similar cultural fusions. Religious syncretisms became the modus operandi of colonizing curates and of indigenous and African specialists in esoteric and numinous thought and reflection.
Even horrors of the conquest were syncretized into indigenous and Afro-Latin American cultures. For example, the battle cry of the Spanish conquistadors, ‘Santiago!’ came from Santiago de Matamoros (Santiago the Moor Killer) as the Moors were defeated during the Castilian reconquista that was concluded in 1492. It was used to mark Spanish territories in the Americas. After the reading of the ‘Requirement’ document informing all ‘Indians’ of their loss of rights to the Spanish, the conquerors yelled ‘Santiago!’ and attacked with firearms, war horses, and war dogs, often feeding parts of the dead natives to the dogs, and rendering fat from the same dead natives to rub on wounds of dogs and horses. Santiago underwent a ritual reversal in the Central Andes where this Saint became the patron icon of indigenous, Quechua-speaking peoples, and was called forth to resist the Spanish. Also in this region, and in parts of Amazonia, indigenous people say that some outsiders, whom they call pishtaco in Peru, seek ‘Indian fat’ for nefarious purposes. This may well be a modern reference to practices of the Spanish conquerors (an indigenous dimension of historical consciousness), though it is usually treated as an item of strange and curious native lore.

8. Cultural Continuities
In the face of radical transformations and syncretisms, continuities were also abundant. Amazonian languages such as Tupi, Arawak, Tukano, Tukuna, Jívaro, and Pano continued in their respective indigenous areas, with trade between different people also continuing, in spite of the efforts of the Catholic curates, first the Jesuits and Franciscans, and then others, to ‘reduce’ the populations to nucleated areas. Tupi itself, as Neenghatú, became the Lengua Geral of rural Brazil until replaced by Portuguese in the late nineteenth century, and the same language, as Guaraní, became the national language of Paraguay, where citizens spoke it along with Spanish. Quechua spread in the Andes to Argentina and in Upper Amazonia where it may have served as a language of trade. Today at least 13 million people speak Quechua, making it by far the greatest indigenous language family in the Americas. Many languages that defy classification with other known indigenous languages continue to be spoken today, especially in the Andean nation-states with substantial Amazonian regions and in the eastern countries of Brazil and the Guianas.
Rich realms of storytelling, often called mythology, are legion among native peoples, Afro-Latin American peoples, and people of ‘mixed’ descent, where they constitute sources of spirituality and historicity embedded in more than 100 languages. They convey rich information ranging from medicinal plants to healing procedures, to ways of thinking about humanity and society that confront the models of hierarchically organized and exploitative nation-state bureaucracies. Such telling of pasts motivates symbolically the development of political movements such as that of Pachakutik in Ecuador in the mid-1990s, and the Ecuadorian indigenous uprisings in that nation during the same decade.
Within the socioeconomic classes oriented toward national politics, which have long been articulated to the global economy, the figure of the caudillo has loomed large since the end of the colonial era. A caudillo is a powerful figure, usually a man but sometimes a woman, such as the late Eva Perón in Argentina, who emerges in a time of political-economic or sociocultural crisis and attracts followers from many sectors of national life, including radically opposed political parties and classes. One of the greatest caudillos of all time was the liberator Simón Bolívar. The caudillo rallies such followers to his or her own personal benefit, thereby often losing considerable influence, and even office, when the crises are over. It is to the benefit of the caudillo, then, to maintain an aura of crisis, which in South America today, and back through its tumultuous history, is not difficult to do. The caudillo is matched in the indigenous sector by the cacique (sometimes called curaca), who shares the qualities just described. (Cacique is also used for leaders in organized crime.)
Both caudillos and caciques, as leaders in peripheral nations, depend in large part on economic gains to come from industrialized, wealthy nations. The cultural form of dependence of South American countries is closely tied to the developmental policies of the center nations, whose funding for externally designed projects often supports caudillismo and caciquismo.
9. Modernities and Alternative Modernities
Modernity among the elite and middle classes of South America is tied to urbanity, whiteness, and Euro-North Americanized consumer culture. The doctrine of mestizaje serves these interests in the sense that the identity of mestizo underscores the processes of social whitening (blanqueamiento). Similarly, the representations as mestizo of those below the elite or middle classes refer to the darker complexions and stigmatized and racialized vulgarity of the common people, who in many nations constitute the vast majority. For people of indigenous and Afro-Latin American descent, and for others classed by the elite as ‘mestizos,’ modernity consists of passing back and forth among different systems of thought and behavior. Passages themselves give rise to new dimensions of power politics in South American nations, just as they gave rise to the Creole imagined nationalities of the nineteenth century.
The alternative modernities in the Euro-American dominated global economy that spawned earlier state nationalism are now sowing the seeds of ethnic nationalism, manifest in many countries by the self-conscious rise of new nationalities out of the symbolic transformations and continuities of syncretic formations. Public displays of collective selfhood by indigenous people tap significant financial resources through Non-Governmental Organizations (NGOs) of Europe and North America, giving such social formations and movements a new articulation to economic resources. Black selfhood, however, does not attract significant financial resources.
The nations of South America vary greatly one from the other, and the salient characteristics of their respective modernities also vary. Twentieth-century migration has changed the complexion of many countries, as Europeans (Jewish, Catholic, and Protestant), Middle Easterners (Jewish, Christian, and Islamic), and Asians (Chinese, Japanese, Korean, and Vietnamese) have become prominent actors in various class sectors. In the southern cone nations of Argentina, Chile, and Paraguay live significant populations of German immigrants. The ex-President of Peru, Alberto Fujimori, is the son of Japanese immigrants, and the Ecuadorian ex-President, Jamil Mahuad Witt, is the son of Lebanese and German immigrants. South American nationality is often expressed through public performances as crystallizations of ‘authentic’ local-level or regional cultures.
Concepts of cultural hybridity and alternative modernity fold ethnic and cultural diversity into a unified national pride. Festive centralization is often matched by local-level and regional divergence, as people in many walks of life appropriate nationalist stereotypes about authenticity into their own self-presentations. For example, indigenous people of Andean Ecuador have long performed individualized constructions of the European festival of Corpus Christi. Cultural images of rain-forest inhabitants are evoked in these performances. In response to both the politicized indigenous movements and the craving of foreign tourists for ‘authentic’ Incaic performative reproductions, some of these festivals have been renamed ‘Inti Raymi,’ solstice of the sun. Performance of them becomes transformed to a nationalist, ‘mestizo’ rendition of Incaic authenticity and also an indigenous rendition of their increasing cultural spaces in the modern nation state.
In the late twentieth century, nationalist governments use their expanding military powers to stress centralization and homogenization, and pan-Roman Catholicism continues to be a centripetal force at the beginning of the twenty-first century. Nonetheless, indigenous and Black movements of South America, together with labor movements, feminist movements, and various cultural emphases promoting counterhegemonic diversity, work against the exclusive imagery of ethnic blending in their respective nation states and across national boundaries. New and dynamic cultural mosaics are constantly created out of the interplay of secular-sacred centralization and ethnic-cultural diversity. The dynamic, transformative balance between centralization and diversity that began in early colonial times continues into the beginning of the twenty-first century.
See also: Central America: Sociocultural Aspects; Historiography and Historical Thought: Indigenous Cultures in the Americas; Latin American Studies: Economics; Latin American Studies: Geography; Latin American Studies: Politics; Latin American Studies: Religion; Latin American Studies: Society
Bibliography
Ades D 1989 Art in Latin America: The Modern Era, 1820–1980. Yale University Press, New Haven, CT
Anderson B 1991 (1983) Imagined Communities, 2nd edn. Verso, London
Collier S, Blakemore H, Skidmore T E (eds.) 1985 The Cambridge Encyclopedia of Latin America and the Caribbean. Cambridge University Press, Cambridge, UK
Denevan W M (ed.) 1976 The Native Population of the Americas in 1492. University of Wisconsin Press, Madison, WI
Dillehay T D 2000 The Settlement of the Americas: A New Prehistory. Basic Books, New York
Fuentes C 1992 The Buried Mirror: Reflections on Spain and the New World. Houghton Mifflin, New York
Goodman E J 1992 (1972) The Explorers of South America. University of Oklahoma Press, Norman, OK
Guss D M 1999 The Festive State: Race, Ethnicity, and Nationalism as Cultural Performance. University of California Press, Berkeley, CA
Hill J (ed.) 1988 Rethinking History and Myth: Indigenous South American Perspectives on the Past. University of Illinois Press, Urbana, IL
Richardson J B III 1994 People of the Andes. St Remy Press, Montreal, Canada and Smithsonian Press, Washington, DC
Stern S J (ed.) 1987 Resistance, Rebellion, and Consciousness in the Andean Peasant World. University of Wisconsin Press, Madison, WI
Van Cott D L (ed.) 1995 Indigenous Peoples and Democracy in Latin America. St Martin’s Press, New York
Whitten N E Jr, Torres A (eds.) 1998 Blackness in Latin America and the Caribbean: Social Dynamics and Cultural Transformations, 2 Vols. Indiana University Press, Bloomington, IN
Williams J M, Lewis R E (eds.) 1993 Early Images of the Americas: Transfer and Invention. University of Arizona Press, Tucson, AZ

N. E. Whitten, Jr

South Asia, Archaeology of
The archaeology of South Asia, or the Indian subcontinent, is a very substantial topic, since there is a deep history in this land of diverse peoples and cultural traditions. Moreover, archaeological fieldwork is conducted on a substantial scale, with a dozen or more excavations each year, and widespread exploration. Since it is impossible to be comprehensive in an article such as this, the article will focus on four topics (the Palaeolithic, Early Village Farming Communities, the Indus Civilization, and the Early Iron Age) that are at least representative of the principal problems on which contemporary archaeology has focused.

1. Palaeolithic
Unexpectedly early stone tools were found between 1983 and 1988 at Riwat, in the Soan Valley, about 6.2 miles (10 km) south-east of Rawalpindi (Allchin 1995) (see Fig. 1). Twenty-five pieces of flaked quartzite were recovered. Of these, only three are undeniably artifacts; that is, objects shaped by humans. Other pieces from Riwat might be the products of human activity, but they are so simple that this cannot be determined with certainty. Two of these artifacts are cores, the third is a flake. They are generally similar to the Oldowan industry of Africa, which dates to between two and three million years ago. The geological deposits of the artifacts at Riwat have reversed polarity and are therefore associated with the Matuyama Chron, between 780,000 and 2.60 million years ago. Volcanic ash immediately above the Riwat artifacts is dated to c. 1.8 million years ago. The artifacts are therefore older than that; how much older is somewhat problematic. An excavation was undertaken in 1990 in the Pabbi Hills, to the south-east of Riwat. This excavation yielded tools in a deposit that was palaeomagnetically dated to 1.2 to 1.4 million years ago. The Lower Palaeolithic (c. 1.5 million to 200,000 years ago) is represented in virtually all regions of India, but not Sri Lanka. It is a period of two major traditions of early tool making, the western core biface (hand axe/cleaver) tradition and the eastern chopper/chopping tool tradition. The line defined by Professor Hallam Movius that follows the foothills of the Himalaya mountains down the Ganges Valley, dividing these two traditions, is a concept that has returned to favor. The distribution of these tool-making traditions
14612
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
South America: Sociocultural Aspects centralization and homogenization, and pan-Roman Catholicism continues to be a centripetal force at the beginning of the twenty-first century. Nonetheless, indigenous and Black movements of South America together with labor movements, feminist movements, and various cultural emphases promoting counterhegemonic diversity work against the exclusive imagery of ethnic blending in their respective nation states and across national boundaries. New and dynamic cultural mosaics are constantly created out of the interplay of secular-sacred centralization, and ethniccultural diversity. The dynamic, transformative balance between centralization and diversity that began in early colonial times continues into the beginning of the twenty-first century. See also: Central America: Sociocultural Aspects; Historiography and Historical Thought: Indigenous Cultures in the Americas; Latin American Studies: Economics; Latin American Studies: Geography; Latin American Studies: Politics; Latin American Studies: Religion; Latin American Studies: Society
Bibliography
Ades D 1989 Art in Latin America: The Modern Era, 1820–1980. Yale University Press, New Haven, CT
Anderson B 1991 (1983) Imagined Communities, 2nd edn. Verso, London
Collier S, Blakemore H, Skidmore T E (eds.) 1985 The Cambridge Encyclopedia of Latin America and the Caribbean. Cambridge University Press, Cambridge, UK
Denevan W M (ed.) 1976 The Native Population of the Americas in 1492. University of Wisconsin Press, Madison, WI
Dillehay T D 2000 The Settlement of the Americas: A New Prehistory. Basic Books, New York
Fuentes C 1992 The Buried Mirror: Reflections on Spain and the New World. Houghton Mifflin, New York
Goodman E J 1992 (1972) The Explorers of South America. University of Oklahoma Press, Norman, OK
Guss D M 1999 The Festive State: Race, Ethnicity, and Nationalism as Cultural Performance. University of California Press, Berkeley, CA
Hill J (ed.) 1988 Rethinking History and Myth: Indigenous South American Perspectives on the Past. University of Illinois Press, Urbana, IL
Richardson J B III 1994 People of the Andes. St Remy Press, Montreal, Canada and Smithsonian Press, Washington, DC
Stern S J (ed.) 1987 Resistance, Rebellion, and Consciousness in the Andean Peasant World. University of Wisconsin Press, Madison, WI
Van Cott D L (ed.) 1995 Indigenous Peoples and Democracy in Latin America. St Martin’s Press, New York
Whitten N E Jr, Torres A (eds.) 1998 Blackness in Latin America and the Caribbean: Social Dynamics and Cultural Transformations, 2 Vols. Indiana University Press, Bloomington, IN
Williams J M, Lewis R E (eds.) 1993 Early Images of the Americas: Transfer and Invention. University of Arizona Press, Tucson, AZ

N. E. Whitten, Jr

South Asia, Archaeology of
The archaeology of South Asia, or the Indian subcontinent, is a very substantial topic, since there is a deep history in this land of diverse peoples and cultural traditions. Moreover, archaeological fieldwork is conducted on a substantial scale, with a dozen or more excavations each year, and widespread exploration. Since it is impossible to be comprehensive in an article such as this, the article will focus on four topics (the Palaeolithic, Early Village Farming Communities, the Indus Civilization, and the Early Iron Age) that are at least representative of the principal problems on which contemporary archaeology has focused.

1. Palaeolithic
Unexpectedly early stone tools were found between 1983 and 1988 at Riwat, in the Soan Valley, about 6.2 miles (10 km) south-east of Rawalpindi (Allchin 1995) (see Fig. 1). Twenty-five pieces of flaked quartzite were recovered. Of these, only three are undeniably artifacts; that is, objects shaped by humans. Other pieces from Riwat might be the products of human activity, but they are so simple that this cannot be determined with certainty. Two of these artifacts are cores; the third is a flake. They are generally similar to the Oldowan industry of Africa, which dates to between two and three million years ago. The geological deposits containing the artifacts at Riwat have reversed polarity and are therefore associated with the Matuyama Chron, between 780,000 and 2.60 million years ago. Volcanic ash immediately above the Riwat artifacts is dated to c. 1.8 million years ago. The artifacts are therefore older than that; how much older remains uncertain. An excavation was undertaken in 1990 in the Pabbi Hills, to the south-east of Riwat. This excavation yielded tools in a deposit that was palaeomagnetically dated to 1.2 to 1.4 million years ago.

Figure 1 Sites and places mentioned in the text

The Lower Palaeolithic (c. 1.5 million to 200,000 years ago) is represented in virtually all regions of India, but not Sri Lanka. It is a period of two major traditions of early tool making: the western core biface (hand axe/cleaver) tradition and the eastern chopper/chopping tool tradition. The line defined by Professor Hallam Movius, which follows the foothills of the Himalaya mountains down the Ganges Valley and divides these two traditions, is a concept that has returned to favor. The distribution of these tool-making traditions seems to be coincident with the distribution of bamboo, which could have been a major source of tools in the Lower Palaeolithic.

A Middle Palaeolithic (250,000–30,000 years ago) stone-flake tool industry is widely distributed in India, Pakistan, and Afghanistan. Some sites have evidence of the ‘Levallois’ technique. The cave of Darra-i Kur in north-eastern Afghanistan produced a fossil of an anatomically modern Homo sapiens dated to c. 30,000 BP (Kennedy 2000).

Sites with Upper Palaeolithic blade and burin tools are not well represented in the Indian subcontinent. Establishing a chronology for such a tool-making tradition, and documenting its distribution and relationship with the Middle Palaeolithic and the development of microlithic technology, is one of the major challenges in the prehistory of South Asia.

The oldest fossil hominid in South Asia is an archaic Homo sapiens from Hathnora on the Narmada River, near the famous palaeolithic site at Hoshangabad (Kennedy 2000, pp. 1173–80). This find is not associated directly with stone tools, but the deposits from which it was derived are Upper Pleistocene and date from about 200,000 to 250,000 years ago. Representatives of anatomically modern Homo sapiens have been found in Sri Lanka in the caves of Fa Hien, Batadomba Lena, and Beli Lena Kitulgala (Kennedy 2000, pp. 180–8). Radiocarbon dates have shown that the Fa Hien specimens date to c. 33,000 BP and are associated with a non-geometric microlithic tool industry. The fossil humans from Batadomba Lena date to c. 28,000 BP and are associated with microliths, bone tools, and faunal and floral remains.
2. Early Village Farming Communities
Insights into the beginnings of food production in South Asia have been found at Mehrgarh (Jarrige et al. 1995), on the Kachi Plain of the Indus Valley. Period I is aceramic, with small rectangular houses and interesting multiroomed buildings thought to have been storehouses or granaries. A new periodization of Mehrgarh emerged from the 1998 excavations there. Period I now has three phases: IA and IB, with habitation, and IC, during which the excavated area was used as a cemetery (Meadow 1998, p. 13). The date for Period I at Mehrgarh has to be inferred, somewhat unsatisfactorily, from eight radiocarbon samples that range from c. 4000 BC to older than 8000 BC. The early ceramic Neolithic of Period IIA is reasonably well dated to c. 5300–4700 BC by seven calibrated determinations; a beginning for Mehrgarh IA at 7000 BC is therefore a reasonable estimate, given the 11 building levels there.

More than 90 percent of the food grains recovered from Period I are naked six-row barley, along with domestic hulled six-row barley, and wild and domestic hulled two-row barley (Meadow 1998). The naked barley is probably cultivated and on its way toward full domestication in Period I. There are also three forms of domesticated wheat, including emmer, einkorn, and a third, free-threshing, form.

The animals from Period I include wild forms of sheep, goats, and gazelle, along with other ungulates that were never domesticated. Two burials of Period IC each contain the remains of five kids, which is taken to indicate that goats were domesticated. This inference is corroborated by the remains of relatively small subadult or adult goats in trash deposits. After gazelle, goats were found to be the most common animal in Period IA. Over the course of Period I, sheep and cattle rise in proportion to other animals, and by the end of Period IB cattle bones comprise over 50 percent of the remains. It is thought that increasing representation and decreasing body size ‘strongly supports a hypothesis of local domestication. In addition, both osteological and figurine evidence indicate that zebu (humped) cattle (Bos indicus) were present and likely to have been the dominant form’ (Meadow 1998, p. 16).

It is clear from the Mehrgarh data that cattle, sheep, and goats were domesticated locally. Wild barley has been found in the hills above Mehrgarh, and it too would therefore seem to have been part of the same process. The one remaining resource, wheat, has never been found in the wild on the eastern edge of the Iranian Plateau, but suggesting that it was ‘imported’ as part of an otherwise local process of domestication creates an untenable awkwardness in the interpretation of the Mehrgarh sequence. It is also apparent that Mehrgarh does not document the beginnings of domestication, since both the plants and animals in the earliest levels there are either domesticated or in the process of becoming so. The early beginnings of this process should be in the hills above Kachi, where the sheep, goats, and cattle as well as barley (and wheat?) would be found in the wild. Pursuing this line of inference is one of the most exciting challenges of contemporary South Asian archaeology. It was sites like Mehrgarh and their predecessors that laid the foundations in subsistence productivity that led to the beginnings of urbanization in the Indian subcontinent.
3. The Indus Civilization

3.1 Introduction
South Asia’s earliest cities were a part of the Indus, or Harappan, Civilization, which arose on the plains of the Greater Indus Valley of Pakistan and north-western India in the middle of the third millennium BC (c. 2500–1900 BC). The Harappan peoples were, for the most part, farmers, herders, and skilled craftspeople who lived in the cities of Mohenjo-daro, Harappa, and Ganweriwala. The Harappan Civilization covers about one million square kilometers (Possehl 1999, Allchin and Allchin 1997; see Possehl 1999, pp. 38–154 for the story of its discovery).

The Indus Civilization can be defined by its ideology, which emerges as a grand act of creation. Some parts of older traditions survived, since it is clear that there was little change in subsistence, and in the sociocultural, economic, and political balances between farmers and pastoralists. But Harappan ideology was based in nihilism. The new social order was founded in urbanization and sociocultural complexity. Water and ‘cleanliness’ were powerful forces in the Mature Harappan ideology. The Mature Harappans were technological innovators.

3.2 Geography
The ancient cities of the Indus were built in riverine settings: on the lower Indus River, the Ravi of the Punjab, and the ancient Sarasvati. The land was quite different in the third millennium, since the now-dry Sarasvati, or Ghaggar-Hakra, was an active stream, and the rolling uplands between the rivers of the Punjab were unirrigated scrub grasslands. Harappan peoples are evidenced from the Iranian border to the Aravalli Mountains, and from the Arabian Sea to the Himalayas. There is a Harappan outlier site on the Amu Darya in northern Afghanistan known as Shortughai. In an effort to deal with the cultural geography of the Indus Civilization, its territory has been divided into ‘domains’: geographical and cultural regions, possibly with political significance (see Fig. 2).

Figure 2 Domains of the Harappan civilization

3.3 Urban Origins
The beginnings of the Harappan Civilization are still not well understood. There was an Early Harappan period (c. 3200–2600 BC) of four roughly contemporaneous phases (Kot Dijian, Amri-Nal, Damb Sadaat, and Sothi-Siswal) without large settlements or strong signs of social stratification. Craft specialization during the Early Harappan was not developed to a marked degree. There was a Transitional Stage between the Early Harappan and Mature Harappan at about 2600–2500 BC, during which most of the complex sociocultural institutions of the Indus Civilization came together (Possehl 1999, pp. 713–23).

Figure 3 Plan of Mohenjo-daro
3.4 Major Settlements
The major excavated settlements of the Harappan Civilization are Mohenjo-daro and Harappa, each of about 100 hectares, with estimated ancient populations of about 20,000. The third city, Ganweriwala, in Cholistan, is unexcavated.

Mohenjo-daro is the best preserved of the Indus cities (see Fig. 3). It was a planned settlement from the start, and was probably built in the middle centuries of the third millennium BC. The majority of the buildings were constructed of burnt brick, which has served to preserve the site in an extraordinary way. A Lower Town was situated to the east, laid out in a grid, with major thoroughfares running north–south and east–west. Multiroom, and probably multistoried, buildings line these streets. The private houses were generally provided with bathing facilities, which were connected to an integrated civic drainage system. These bathing facilities were not privies. To the west of the Lower Town was the Mound of the Great Bath, an artificial hill with a series of public edifices on it. The most important of these were a Great Bath, probably used by some portion of the elite population as a place of ritual ablution, and a warehouse, once thought of as a State Granary.

Other major sites are:
Kalibangan, a regional center of the Indus Civilization near the present border between India and Pakistan on the ancient Sarasvati. It documents the succession from the Early to the Mature Harappan (Thapar 1973).
Chanhu-daro, a former Harappan town, located 70 miles south of Mohenjo-daro. It is important for the presence of a bead- and seal-making workshop (Mackay 1943).
Dholavira, located in Kutch, midway between Chanhu-daro and Lothal. This was an important way station between major Mature Harappan regions, and was 60 hectares in size. It was fortified, with important water storage and drainage facilities. There is a stratigraphic succession from the Early Harappan to the Mature Harappan and succeeding cultures (Bisht 1999).
Lothal, located in Gujarat at the head of the Gulf of Cambay. It provides an understanding of Mature Harappan trade and craftsmanship, and of the stratigraphic relationship between the Mature Harappan and the Posturban Harappan in the region. The large brick-lined enclosure at the site, once described as a dockyard, was probably a water storage facility (Rao 1979, 1985).

3.5 Writing
The Indus peoples mastered the art of writing, which was rendered in pictographic form on a range of physical objects. The best known of these are the Harappan stamp seals with writing and an animal device (see Fig. 4). In spite of many claims to the contrary, the Indus script remains undeciphered (Possehl 1996), and the names we use for their settlements are not likely to be those of the Indus peoples themselves.
3.6 Subsistence
Most of the Harappan peoples were farmers and herders who also hunted and fished, with some gathering mixed in. The chief food grains were barley and wheat. They also cultivated at least two forms of gram, the chickpea, field pea, mustard, sesame, cotton, and dates. The evidence for the cultivation of rice during the Mature Harappan is ambiguous, though cultivation is possible. The Harappans used grapes.

The peoples of the Indus Civilization were cattle keepers. Cattle remains are consistently above 50 percent of the faunal assemblage, and often much more. Cattle imagery is also prominent, and makes it clear that this was the most important animal in their culture as well as probably the principal form of wealth. The Harappans also kept water buffalo, sheep, goats, and pigs. Fish, both marine and freshwater, were traded over wide areas.
3.7 Craft and Trade
The Indus craftsmen plied their trades with great skill and had a mastery of pyrotechnology that was a model for its age. Their pottery was fashioned using a number of forming techniques, often in multiple parts that were welded together prior to firing. Bead making was an Indus forte, especially their long, barrel-shaped carnelian examples, sometimes etched. They were accomplished smiths, who smelted copper and used tin, as well as arsenic, to make bronze. They used gold, silver, electrum, lead, and antimony. Marine shells, especially the conch, were formed into bangles, ladles, inlays, and other objects. Bangles and other objects were formed also from stoneware and faience, another of the Harappan specialties.

The Mature Harappan peoples were participants in the Middle Asian Interaction Sphere, which linked the regions from the Indus to the Mediterranean and from Bactria/Central Asia to the Arabian Gulf. This interaction sphere has deep roots and began much earlier than the Mature Harappan, but was at its peak during the second half of the third millennium BC. Maritime trade with Mesopotamia is well documented, both textually and archaeologically. Mature Harappan peoples were also in the Arabian Gulf. The merchandise that the Indus peoples supplied to Mesopotamia, according to the cuneiform texts, included carnelian, lapis lazuli, pearls, woods, fresh dates, a bird, a form of dog, cats, copper, and gold. The products traded to Meluhha, the name in the cuneiform texts generally taken to refer to the Indus region, are not as clearly documented, but included food products, oils, cloth, and the like. There are many Harappan artifacts in Mesopotamia, including seals, etched carnelian beads, and ceramics. The Mesopotamian products found in Harappan contexts at Indus sites are very few, and there is a considerable disparity in the archaeological record of this subject.

Figure 4 A Harappan stamp seal
3.8 Religion
The central theme of Harappan religion is a male deity usually shown with bull or water-buffalo horns (see Fig. 5), matched by a number of different female images (Possehl in press). At times, notions of gender may be merged. The plant and animal imagery of the Indus Civilization can be seen as specific aspects of the great duality of the Harappan Great Tradition. The multiheaded animals, ‘unicorns’ with elephant trunks, perhaps ‘unicorns’ themselves, as in the terracotta figurines from Chanhu-daro, are all themes appropriate to zoolatry. So are the tigers with bull horns and the half-human, half-quadruped figures seen on the seals, such as the cylinder from Kalibangan (see Thapar 1973, p. 29) (see Fig. 6).

There is widespread evidence for the place of animals in Harappan ideology. Two Indus sealings, in fact, show small animal figures carried above the crowd in a procession. This is a further, if minor, observation supporting the idea that animals were a source of wealth for these people and entered their ideological world as well. There are many portrayals of composite animals: some with three heads on one body, tigers entwined, ‘minotaur-like’ human torsos on four-legged bodies, tigers with horns, ‘unicorns’ with elephant trunks, and ‘unicorns’ growing out of trees. There is no good evidence for fire-worship.

Two iconographic themes that appear on Harappan seals have parallels in Mesopotamian mythology, both related to the Gilgamesh epic. The first is shown on a seal from Mohenjo-daro with a half-human, half-bull female monster attacking a horned tiger (see Fig. 7). This Harappan seal could be seen as portraying the story of the Mesopotamian goddess Aruru, who created Ebani, or Enkidu, as a bull–man monster to do combat with Gilgamesh; Enkidu ultimately became his ally and fought with Gilgamesh against the wild animals. The second motif is the well-documented Mesopotamian combat scene, with Gilgamesh fighting off rampant animals on either side.

F. R. Allchin sees in a seal from Chanhu-daro a representation of Heaven (the Bull), who is at once the consort and the father of Earth; and of Earth, who is at the same time the consort of the Bull (Heaven) and the mother of the Bull, her calf. He holds that these themes can be understood by reference to the Creation myths found in the Rgveda and Atharvaveda (Allchin 1985, p. 381). This is perhaps our best evidence for a Bronze Age belief being carried forward into historical India.

Figure 5 The great Harappan male god, once called the ‘Proto-Shiva’

Figure 6 Kalibangan cylinder seal

Figure 7 A possible representation of the Mesopotamian goddess Aruru in combat with a tiger

3.9 The Eclipse of the Ancient Cities of the Indus
Mohenjo-daro was largely abandoned at the end of the Mature Harappan (c. 1900 BC). Similar evidence is available at Harappa, although there is a small occupation there of the so-called ‘Cemetery H people.’ The situation at Ganweriwala is less clear, since there has been no excavation there, but surface prospecting indicates that the occupation was limited to Mature Harappan times.

Older theories that hold that the cities and civilization were destroyed by invading Aryan tribes, as depicted in the Rig Veda, make very little sense. This is in part because there is no evidence for the sacking of any of the Mature Harappan settlements. Nor is there good chronological agreement between the date of the Vedic texts and the changes seen so graphically at Mohenjo-daro and Harappa. The proposition that a natural dam formed across the Indus River in Sindh and flooded out the civilization has been critiqued widely and is not viable (Raikes 1964, Wasson 1987).

The evidence from the three urban centers, as well as regional surveys, indicates that Sindh and West Punjab experienced the widespread abandonment of Mature Harappan settlements at the opening of the second millennium BC. A summary of the substantive data comparing Mature Harappan and Posturban archaeological assemblages by region is given in Table 1. These figures indicate that the ‘eclipse of the Harappan Civilization’ might hold for Sindh and Baluchistan, but in other regions, notably in the east and to a lesser degree in Gujarat, the course of culture change was different. In these areas there were strong lines of continuity through the early centuries of the second millennium with little, if any, of the ‘trauma’ that affected Sindh. The stark image one has for Baluchistan in the second millennium BC represents a clear challenge for field archaeology, because it would not seem reasonable to presume that the entire area was deserted at that time.
Excavation and exploration are needed to give a more realistic sense of the culture history at that time (Possehl 1997).

Archaeological documentation supports the notion that the locus of culture change for the transformation of the Indus Civilization was urban: those physical settings and institutions most closely associated with sociocultural complexity and the Mature Harappan ideology. In some places thought of as a part of the Harappan Civilization, this transformation went almost undocumented, as in Saurashtra and the Eastern Domain. There are also strong lines of cultural continuity that link the Harappan Civilization with the early Iron Age and historical South Asia. The reasons behind the transformation of the Harappan Civilization are still not understood, but it is quite possible that the ideology on which it was based had lost its strength. This might be documented archaeologically by the abandonment of the Great Bath at Mohenjo-daro 200, or even 300, years prior to the abandonment of the city by 1900 BC. It should also be noted that the locus of change during the transformation was urban, and thus centered on Mature Harappan ideology.

Table 1 Site counts and settled area for the Mature and Posturban Harappan

Region and period            Site count   Average site size (hectares)   Settled area (hectares)
Sindh
  Mature Harappan                    86                            8.0                       688
  Posturban                           6                            5.6                        34
Cholistan
  Mature Harappan                   174                            5.6                       974
  Posturban                          41                            5.1                       209
Baluchistan
  Kulli-Quetta/Harappan             129                            5.8                       748
  Posturban                           0                            0                           0
Saurashtra
  Sorath Harappan                   310                            5.4                      1674
  Posturban                         198                            4.3                       815
East
  Mature Harappan                   218                           13.5                      2943
  Posturban                         853                            3.5                      2985
4. The Early Iron Age
Four archaeological assemblages document the Early Iron Age in South Asia: the Painted Grey Ware Assemblage, the Gandharan Grave Culture, the Pirak Assemblage, and the Megalithic Complex (Possehl and Gullapalli 1999).

4.1 The Painted Grey Ware Assemblage
The Painted Grey Ware Assemblage is quite different from that of the Harappan and Posturban Harappan that preceded it in Northern India and Pakistan. This difference, and the association with iron, tended to set the Painted Grey Ware apart, causing some archaeologists to think of it as intrusive, even foreign to the Indian subcontinent. But archaeologists from the Archaeological Survey of India have done much to dispel this impression through their explorations and excavations in the Indian Punjab. Their excavations at Bhagwanpura and Dadheri have found evidence for Posturban Harappan occupations followed by an overlap between the Posturban Harappan and the Painted Grey Ware (Joshi 1993). Thus the culture-historical gap that once separated the Early Iron Age from the preceding Bronze Age in Northern India and Pakistan has now been closed, and there is good evidence for cultural continuity between the two periods.

Radiocarbon dates indicate that Painted Grey Ware begins around 1000 BC and persists to the middle of the first millennium BC, possibly to 300 BC. Iron is widely associated with Painted Grey Ware, at times in considerable quantity. At Atranjikhera in the Ganges Valley, iron objects occurred in profusion, along with furnaces, slag, and tools used by smiths, suggesting that iron was smelted and iron goods were manufactured at the site.

About 500 Painted Grey Ware sites have been discovered. They extend from the Bahawalpur region of Pakistan east across the Punjab into Uttar Pradesh in India. The largest Painted Grey Ware site is Satwadi in Pakistani Cholistan at 13.7 hectares, far from what one would expect for a city. Most of these sites are in the one- to two-hectare range. Since iron is not mentioned in the Rig Veda, the earliest text of South Asia, there is little or no chance that the peoples who occupied the Painted Grey Ware sites were the Vedic Aryans.
4.2 The Gandharan Grave Culture
A series of cemeteries and associated habitation sites in northern Pakistan have produced evidence for Late Bronze Age and Early Iron Age life in the eastern portion of ancient Gandhara. Most of the excavations have been conducted at the cemetery site of Timargarha and the settlement of Balambat in northern Pakistan, and in Swat at places like Aligrama, Loebanr I, and Katelai. Other sites are distributed through the southern valleys of the Hindu Kush, and possibly Ladakh.

The earliest of the graves, Period I of the Gandharan Grave Culture, can be assigned to the Late Bronze Age (c. 1500–1000 BC) and consist of pits lined and covered by stones. Flexed burials with hand-made red and gray pottery and bronze artifacts were found at Timargarha, followed by evidence for cremation in Period II (1000–800 BC). The latest series of graves, Period III of the Gandharan Grave Culture (800–500 BC), contains iron as well as artifacts of other metals. At Timargarha, seven objects of iron are reported, including a spoon, a spear head, and a nail, and, interestingly enough, an iron cheek piece from a snaffle bit dated to between the tenth and the sixth centuries BC.

4.3 The Pirak Assemblage
The site of Pirak is on the Kachi Plain 20 kilometers east of Mehrgarh. There are only four sites of this assemblage, with a few other small sherd scatters in Kachi. Excavations at Pirak took place between 1968 and 1974 (Jarrige and Santoni 1979, Enault 1979).

Period I (c. 1750–1200 BC). The site has the characteristic Pirak pottery and well-built houses. Stray sherds of Wet Ware, Faiz Mohammad Grey Ware, and other third millennium wares mark some continuity with the Harappan cultural tradition. A compartmented copper stamp seal of Period II does the same. The subsistence regime was based on rice, millets, and other cultivars. There is evidence for the Bactrian camel and an equid, possibly the domesticated horse, in the form of figurines.

Period II (c. 1200–1100 BC) has continuity with Period I, and is defined on the basis of new ceramic shapes and fabrics. Copper/bronze tools continue along with the original subsistence regime.

Iron is introduced in Period III (1100–900 BC). Most of the iron was found in a large, basically quadrangular set of buildings constructed directly upon the remains of Period II.
The iron artifacts, mostly points and ‘bars,’ are complemented by the appearance of crucibles and iron slag, as well as copper/bronze, lead, silver, and gold artifacts. The appearance of certain gray wares in Period III is important in the definition of this phase of occupation.

4.4 The Megalithic Complex
The Megaliths of South Asia are an immense field of study, far too large to be covered adequately in this short survey. Burial monuments incorporating large stone constructions, associated with habitation sites, are known from prehistoric times in many regions of the Indian subcontinent, principally the north-western mountainous areas of Kashmir and Almora, the Vindhyas, the Deccan Plateau, and the deep south of Peninsular India. Aspects of this tradition may still be present in the subcontinent, as among the Nagas and Khasis of the north-east, and other tribes of the Chota Nagpur and Bastar regions. The connections have not yet been sorted out for this complex, deep set of historical relationships.

Archaeological stratigraphy, radiocarbon dates, and the study of associated materials indicate that the beginning of Megaliths in South Asia lies in the very late second millennium BC and comes with the mass production of iron. The earliest iron implements associated with the Peninsular Indian Megalithic are simple things like arrow heads, daggers, and domestic vessels. They are also found in the transitional times between this cultural complex and the preceding South Indian Neolithic.

Megalithic burial monuments in South India include dolmenoid cists, slabbed cists, shallow-pit burials, deep-pit burials, umbrella-stones and hat-stones, hood-stones, multiple hood-stones, and menhirs. In excess of 1400 Megalithic sites are known in the Indian subcontinent, with 1116 of these in Peninsular India, about 240 of which have been excavated in some way. Most of these monuments contain human remains and associated burial furniture, sometimes in great quantity. Horses and vehicles are present, along with pottery and metal artifacts including tools, horse trappings, Roman coins, and other artifacts. Iron occurs in abundance. Human remains from the burials have been studied by Kennedy (2000, pp. 326–57).

See also: Hunter–Gatherer Societies, Archaeology of; South Asian Studies: Culture
Bibliography Allchin B 1995 The Potwar Project 1981 to 1993: A concluding report on the British Archaeological Mission to Pakistan’s investigations into hominid and early Human cultures and environments in the northern Punjab, Pakistan. South Asian Studies 11: 149–56 Allchin B, Allchin R 1997 Origins of a Ciilization: The Prehistory and Early Archaeology of South Asia. Viking, New Delhi Allchin F R 1985 The interpretation of a seal from Chanhu-daro and its significance for the religion of the Indus Valley. In: Schotsmans J, Taddei M (eds.) South Asian Archaeology 1983. Instituto Universitario Orientale, Dipartimento di Studi Asiatici, Naples; Italy, Series Minor 23, pp. 369–84 Bisht R S 1999 Dholavira and Banawali: Two different paradigms of the Harappan urbis forma. Puratatta 29: 14–37 Enault J-F 1979 Fouilles du Pirak, 2 Vols. Fouilles du Pakistan 2(2). Publications de la Commission des Fouilles Archaeologique, Paris Jarrige J-F, Santoni M 1979 Fouilles de Pirak, 2 Vols. Fouilles du Pakistan 2(1). Publications de la Commission des Fouilles Archaeologique, Paris Jarrige C, Jarrige J-F, Meadow R H, Quivron G 1995 Mehrgarh: Field Reports 1974–1985, from Neolithic Times to the Indus Ciilization. Department of Culture and Tourism of Sindh, Pakistan, Department of Archaeology and Museums, French Ministry of Foreign Affairs, Karachi, Pakistan
Joshi J P 1993 Excavation at Bhagwanpura 1975–76 and Other Explorations and Excavations 1975–81 in Haryana, Jammu & Kashmir and Punjab. Memoirs of the Archaeological Survey of India, 89, Delhi
Kennedy K A R 2000 God-Apes and Fossil Men. University of Michigan Press, Ann Arbor, MI
Mackay E J H 1943 Chanhu-daro Excavations 1935–36. American Oriental Series 20. American Oriental Society, New Haven, CT
Meadow R H 1998 Pre- and proto-historic agricultural and pastoral transformations in northwestern South Asia. Review of Archaeology 19(2): 12–21
Possehl G L 1996 Indus Age: The Writing System. University of Pennsylvania Press, Philadelphia, PA
Possehl G L 1997 The transformation of the Indus Civilization. Journal of World Prehistory 11(4): 425–72
Possehl G L 1999 Indus Age: The Beginnings. University of Pennsylvania Press, Philadelphia, PA
Possehl G L in press The religion of the Indus civilization. In: Hinnells J R (ed.) Penguin Encyclopaedia of Ancient Religions. Penguin Books, London
Possehl G L, Gullapalli P 1999 The Early Iron Age in South Asia. In: Pigott V (ed.) The Archaeometallurgy of the Asian Old World. University Museum Monograph 89, MASCA Research Papers in Science and Archaeology Vol. 16. The University Museum, University of Pennsylvania, Philadelphia, PA, pp. 153–75
Raikes R L 1964 The end of the ancient cities of the Indus. American Anthropologist 66(2): 284–99
Rao S R 1979 Lothal: A Harappan Port Town, 1955–62. Memoirs of the Archaeological Survey of India 78(1), Delhi
Rao S R 1985 Lothal: A Harappan Port Town, 1955–62. Memoirs of the Archaeological Survey of India 78(2), Delhi
Thappar B K 1973 New traits of the Indus Civilization at Kalibangan: An appraisal. In: Hammond N (ed.) South Asian Archaeology. Noyes Press, Park Ridge, NJ, pp. 85–104
Wasson R J 1987 The sedimentological basis of the Mohenjodaro flood hypothesis – a further comment. Man and Environment 11: 122–23
G. L. Possehl
South Asia: Sociocultural Aspects 1. Area and Population The term ‘South Asia’ has come to stand for a group of seven countries: India, Pakistan, Bangladesh, Nepal, Bhutan, Sri Lanka, and the Maldives; and it is in that sense that it will be used in this article. These countries differ greatly in area and population, and in geographical aspect and location. India, with over a billion inhabitants, has the second largest population in the world; the Maldives have a population of only a few hundred thousand people. Nepal and Bhutan are entirely landlocked; Sri Lanka is an island in the Indian Ocean; and the Maldives comprise a chain of more than 1000 islands, most of them uninhabited.
The people of these countries have a shared past by which many of their present social and cultural characteristics have been shaped. Demographically, economically, and politically, India, Pakistan, and Bangladesh are the three most important countries of the region. Until 1947 they were one country. It was partitioned in that year, when India and Pakistan became separate and independent nations. Pakistan was in turn partitioned in 1971, leading to the creation of Bangladesh as a separate country. The first partition was occasioned by differences of religion, and the second by differences of language. In each case the partition was attended by considerable movements of population. The Indian subcontinent, comprising India, Pakistan, and Bangladesh, was under British rule for a long time, as was also Sri Lanka. The two landlocked kingdoms of Nepal and Bhutan never came directly under British rule and remained relatively isolated from the economic and political forces released by it. Although many social and cultural changes have taken place in the region since the end of colonial rule in the middle of the twentieth century, large parts of it still remain economically backward. The subcontinent is the home of a very ancient and distinct civilization that may be described as the ‘pan-Indian’ or the ‘Indic’ civilization. This civilization had its origin in the Indo–Gangetic plain in the second millennium BC, and it antedates the birth of both Christianity and Islam. Its characteristic bearers today are the Hindus. It has undergone many changes, and its influence has not only permeated throughout India and its neighboring countries but also extended to lands beyond the seas such as Indonesia, Cambodia, and Thailand. India is not only the home of Hinduism, it also gave birth to Buddhism which is the official religion of Bhutan and the predominant religion of Sri Lanka.
2. Religion and Language
Although Hinduism has had the longest and most pervasive influence on society and culture in the subcontinent, Islam too has a strong presence in it, including present-day India. Pakistan is an Islamic country; Bangladesh has a predominantly Muslim population; and although the Muslims are a minority in India, they are a very large minority. The medieval phase of Indian history was dominated by Muslim sultanates and empires; and the Muslim population of Bangladesh, Pakistan, and India taken together make up a considerable part of the Muslim population of the world as a whole. In South Asia, the divisions of religion cut across those of state and nation, and the region as a whole may be viewed as a mosaic of religious communities, sects, and denominations. India provides an extreme example of co-existence of diverse religious faiths and practices. Although the Hindus predominate in the
country as a whole, there are districts in which Sikhs, Muslims, or Christians predominate. Again, Nepal is a predominantly Hindu country, but there are areas in which Buddhists predominate; in Sri Lanka it is the other way around. South Asia is the home not only of many religions but also of innumerable languages, and, whether on the plane of cultural values or of social identity, language is as important as religion. Like the divisions of religion, those of language also cut across the boundaries of state and nation. The Punjabi-speaking community is divided between India and Pakistan; the Bengali-speaking community between India and Bangladesh; and the Tamil-speaking community between India and Sri Lanka. Sometimes, as in Sri Lanka, where the Sinhala-speakers are predominantly Buddhist and the Tamil-speakers predominantly Hindu, the divisions of language and religion follow the same line. But this is not always the case. It is common in the subcontinent, if not characteristic, for the divisions of language and religion to cut across each other; there are Muslims, Hindus, and Sikhs among the Punjabi-speakers in India and Pakistan, and Hindus and Muslims among the Bengali-speakers in India and Bangladesh. Because of historical and demographic reasons India has the largest variety of languages among South Asian countries: it is indeed the most polyglot country in the world. There are more than a dozen literary languages, divided between the Indo–Aryan and the Dravidian families of language. In addition, there are numerous tribal languages and dialects belonging to the Austro–Asiatic, the Tibeto–Burman, and other families. In India, as in the subcontinent generally, the divisions of language correspond more closely to geographical divisions than do the divisions of religion; but here too, population movements through the centuries have led to a certain dispersal of language communities.
If the civilization of South Asia had a design, it was most clearly visible in the Indian subcontinent until its first partition in 1947. Visitors to the area from far and near have been for centuries struck by the distinctness and continuity of that design, and have commented on its influence on society and culture in the different countries of the region. Despite the changes and upheavals of the twentieth century, its distinctive features are still visible, most plainly in India, but elsewhere as well.
3. Accommodation of Diversity A brief sketch of the design of the traditional civilization of the subcontinent will be provided as an aid to the understanding of the multitude of beliefs, practices, habits, customs, and manners prevalent in the region to this day. The first striking feature of this design is the accommodation of diversity. The rich
ethnographic record available from the middle of the nineteenth century onward shows an endless, almost inexhaustible, variety in material culture, social arrangements, and religious beliefs and practices. Some of the variation in material traits, in matters such as food, dress, and habitation, may be explained by variations in geographical environment and natural resources. But the cultural variation is far in excess of what would be required by variations of natural environment. For instance, even where its basic ingredients are the same, food might be prepared and served differently among the different castes and communities in the same village. Then there is the great diversity of social arrangements in matters relating to family, marriage, descent, inheritance, residence, and so on. What needs to be stressed is not simply the diversity of practices but also the diversity of rules; they were the despair of the colonial administrators of the nineteenth century who sought to codify the laws and customs of the subcontinent. There is finally the profusion of religious beliefs, doctrines, rituals, customs, and usages which have been mentioned. The presence of diversity has been more than a matter of mere existence. In Indian society in particular and South Asian societies in general, it has been sustained by law, custom, and popular religion. Accommodation without assimilation has been a core value of Indian civilization until modern times. It has led to the co-existence of a large multiplicity of beliefs and practices as well as of social groups. Adding new components has not meant discarding old ones, and new and old components of the most heterogeneous kinds have existed cheek-by-jowl to a greater extent than in other civilizations. The Indian subcontinent has been and still largely remains a vast mosaic of tribes, clans, castes, sects, and communities, each with its own identity.
4. Collective Identities The second distinctive feature of South Asian societies is the predominance of collective identities. In the past, what counted was not so much the individual as the group of which he was a member by birth, and this continues to a large extent even now. It has often been said that traditional Indian society was a society not of individuals but of castes and communities. As has been noted, there was a great multiplicity of groups, and this multiplicity was a feature even of small local communities. The members of a single village might observe different religious practices, speak more than one language, and belong to a dozen or more different castes and subcastes. Collective identities were maintained by fidelity to the customs of one’s ancestors and by rules for the regulation of marriage. In course of time, subtle differences developed in craft practices, dress, and
ornamentation, marriage rules and ritual observances between groups having the same origin. Each group and subgroup maintained its distinctive style of life and transmitted it from generation to generation. The group preserved a sense of its own identity despite changes in its social position. The high value placed on the accommodation of diversity allowed and indeed encouraged each group to retain a strong sense of its own identity. The persistence of collective identities may be illustrated with two examples from the subcontinent. The first relates to what has been called ‘the Hindu method of tribal absorption.’ In the past, tribal groups which were relatively isolated from the wider society sometimes established relations with that society, discarding some old practices and acquiring some new ones. In this way, a tribe was often transformed into a caste but, despite a change of name and to some extent also of social position, it preserved its old collective identity. The second example relates to the survival of caste identities despite changes of religion. The census reports of British India show that in the Punjab, on both sides of the present international frontier, the same castes, such as Ahir, Jat, and Gujjar were present among Hindus, Muslims, and Sikhs. When a section of a Hindu caste embraced Islam or Sikhism, it adopted a new religion but did not abandon its old and distinct collective identity.
5. Hierarchical Organization The accommodation of diversity did not mean that the tribes, clans, castes, sects, and communities that were allowed or even encouraged to maintain their distinctive identities were all placed on the same plane of equality. Diversity was organized on the basis of hierarchy and not of equality, and that is the third distinctive feature of the traditional civilization of South Asia. It is true that all communities were allowed or even encouraged to maintain their distinctive styles of life; but they were not all equally esteemed. While all premodern civilizations were hierarchical to a greater or lesser extent, the principle and practice of hierarchy were carried furthest on the Indian subcontinent, particularly among the Hindus. Hierarchy is not simply a matter of inequality in the distribution of material resources or of life chances; it is above all a system of values according to which the division of the world into social groups deemed to be of unequal worth is a part of the natural scheme of things. The hierarchical conception of society found its fullest expression in the classical Hindu texts known as the Dharmashastra which provided the framework for Hindu law. Although the law has changed radically in the direction of equality, the bias of custom is still often in favor of hierarchy. The Hindu caste system has been represented as the prototype of a hierarchical social order. But hierarchy
extends beyond caste and permeates social relations and social institutions of the most diverse kinds among Hindus, Muslims, Buddhists, and others throughout the region. Despite the doctrine of equality characteristic of Islam, agrarian relations and gender relations are cast in a deeply hierarchical mould in Pakistan and to some extent also in Bangladesh.
6. The Village Colonial administrators throughout the nineteenth century saw the subcontinent as essentially a land of villages. They have left behind memorable accounts of the unity, the self-sufficiency, and the continuity of the Indian village. A large number of village studies have been conducted in India by professional anthropologists and others since the country became independent in 1947. Similar studies have been conducted in Sri Lanka, Pakistan, Bangladesh, and Nepal. In the paragraphs below there follows a brief account of the broad social characteristics of the Indian village, many of which may also be found in villages in the other countries just mentioned. In the past, the Indian village had a distinct physical identity which became particularly prominent during floods, famines, and other natural calamities. Most villages were nucleated although there were also dispersed villages, depending in part on the terrain. The life of the village was based on an economy of land and grain, and its rhythm was, and to a large extent still is, governed by the annual cycle of agricultural activities. The village also has an annual cycle of religious ceremonies and festivals, and the two cycles are related closely. With the development of transport and communication in the twentieth century, and particularly since the 1950s, the isolation of the village is becoming progressively reduced, although not at the same pace in all parts of the country or the region. The less isolated villages are now becoming urbanized, though in a highly uneven manner. The social morphology of the village shows considerable variation in terms of size and internal differentiation. Irrigated villages in river valleys and close to the ancient and medieval centers of civilization tend to have an elaborate structure; ‘dry’ villages in the remote areas tend to be smaller and simpler in their structure. Only the latter may be described as communities of peasants. 
The former are often highly differentiated and stratified. They include noncultivating landowners, owner-cultivators, cultivating tenants, sharecroppers, and agricultural laborers, in addition to groups of families associated with various specialized crafts and services, usually on a hereditary or quasi-hereditary basis. Agrarian reform and the expansion of the market are changing some of this, although many of the older features remain. Its relative isolation and self-sufficiency and its well-knit division of labor between agriculture, crafts, and
services gave the Indian village a certain unity. In the nineteenth century, and even later, it was often described as a ‘little republic.’ But it was not a republic of equals. In the larger irrigated villages the social boundaries between the superior, intermediate, and inferior sections of the community (landowners, tenants, agricultural laborers, priests, potters, and scavengers) were clearly maintained. If there was a certain unity in such villages, as indeed there was, it is best described as a ‘vertical unity.’ The unity of the village has been weakened in recent decades, but that does not mean that the inequalities have all disappeared. While it is true that South Asia is a land of villages, one cannot ignore the presence of towns and cities in it. The subcontinent has cities going back to ancient and medieval times, and India had urban centers before most of Europe did. In the past, towns and cities grew around places of worship, the courts of kings and princes, and commercial and trading centers. Colonial rule led to the creation of a new kind of city, mainly along the coast, such as Calcutta, Madras, Colombo, Bombay, and Karachi. In the years since the end of colonial rule, industrialization has added to the population of existing cities and led to the creation of new ones. The urban middle class and the industrial working class are important factors in the economic and political life of contemporary India. Returning briefly to the Indian village, a prominent place was occupied in it by caste. Indeed, caste may be viewed as the social basis of the division of labor in it, and the differentiation and stratification in the Indian village cannot be understood without taking caste into account.
7. Caste Caste is an institution of great antiquity, and it has shown resilience in the face of the economic and political changes of the twentieth century. In its origin and fullest development, it is an institution of the Hindus, but near analogs of the Hindu caste system may be found among Sikhs, Muslims, and Christians on the subcontinent and among Buddhists in Sri Lanka; and it is of course very important in the predominantly Hindu population of Nepal. On the doctrinal plane, Hinduism, Buddhism, and Islam are different from each other; on the plane of everyday life at the local level, social divisions among Hindus, Muslims, and Buddhists act in broadly similar ways in most parts of South Asia. The caste, the subcaste, and even the subsubcaste, to each of which the term jati may be applied, maintained its identity and continuity through the regulation of marriage. The most general rule was the rule of endogamy, and among Hindus this was supplemented by the rule of hypergamy. In the past, Hindu law prohibited intercaste marriage except under very
specific conditions; the law has now changed, but the bias of custom is still in favor of endogamy among the Hindus, and to some extent also among the others. Every caste or jati was in principle associated with a specific occupation, although agriculture was open to most if not all castes. The association between caste and occupation was stronger among the Hindus than among the others, but from the end of the nineteenth century, it began to weaken even among the former. A very important factor behind the weakening of the association has been the emergence, since the middle of the nineteenth century, of a large and increasing number of occupations that are ‘caste-free’ in the sense that they have grown outside the framework of the old division of labor based on caste. The different castes are socially ranked; indeed, it is in the caste system that the principle of social ranking finds its most extreme and elaborate expression. In South Asia, castes are ranked among all religious groups, but the ranking is not as rigid or as exhaustive among the non-Hindus as it is among the Hindus. Caste ranking among the Hindus was expressed in the idiom of purity and pollution, but that idiom is now less pervasive than before, and new criteria of social ranking, having little to do with caste in the traditional sense, are becoming increasingly salient, particularly in the urban areas.
8. Family, Kinship, and Marriage Given the great significance of the rules regulating marriage and in particular the rule of endogamy, it is not surprising that there is a close association between caste and kinship. Through repeated intermarriage from generation to generation, all the members of the local section of a caste or subcaste become related to each other by real or putative ties of kinship. This is generally true of all the major religious and linguistic divisions of South Asian society. Much has been written about the prevalence of the joint family in the subcontinent and the changes it is undergoing as a result of urbanization, industrialization, and modernization. The Hindu undivided family had certain legal characteristics peculiar to itself, and some of these are still recognized by the law in India. One important feature of the family in the subcontinent, whether among Hindus, Sikhs, Muslims, or Christians, is that marriages are still to a large extent arranged by parents and older relatives, and the new couple generally lives in the parental home immediately after marriage even though it may set up a separate household after one or more children are born. Reference has already been made to wide variations in the rules of descent, inheritance, and marriage across the many and diverse communities that inhabit the region. But through all these variations, kinship has a significant place in social life everywhere and
among all communities. Relatives are numerous and various, and the obligations of kinship are cherished and cannot be denied easily. The rules of endogamy act against the formation of real ties of kinship between members of different castes; but they do not prevent the formation of fictive ties of kinship between them. The idiom of kinship is used extensively in interpersonal relations across castes and communities in villages throughout the region. Changes are undoubtedly taking place in the size and composition of the household, particularly in the rapidly growing cities. But even where nuclear households outnumber extended ones, the matrix of kinship within which they operate gives them a different character from their counterparts in the West. The boundaries between households are porous, and no household is secure against intrusion by cousins, uncles, aunts, nephews, nieces, and grandparents. In discussing caste, kinship, and family it is impossible to ignore the subject of gender, although only a passing reference to it is possible here. Despite variations according to language, religion, and caste, and between patrilineal and matrilineal communities, women have since time immemorial occupied a subordinate position in South Asian societies. Polygamy was sanctioned by law among both Hindus and Muslims. In India it is now prohibited by law among Hindus but not among Muslims; it is permitted by law in Pakistan, and attempts to regulate polygamy by law have had little success among Muslims. The position of women is changing in all countries and among all communities in South Asia, although the change is highly uneven. Here, more important than the variations of language, religion and caste are those that arise from education, occupation, and income.
9. New Institutions and Classes The first impediment against attempting a concise account of society and culture in South Asia arises from the problem of variation; the second impediment arises from the problem of change. Two important developments in the subcontinent since the middle of the nineteenth century have to be noted. These are, first, the growth of open and secular institutions such as schools, colleges, universities, libraries, laboratories, hospitals, law courts, and newspapers; and, second, the emergence of a new middle class based on a new educational system and a new occupational system. These two related developments have taken place everywhere, but most conspicuously in India where they are now a significant feature of the social and cultural landscape. Open and secular institutions are sustained by norms and values that are very different from the norms and values by which the traditional institutions of South Asian societies have been sustained. The new institutions are often at odds with the old ones which
exert various kinds of pressures on them and sometimes threaten their very survival. At the same time, and particularly in a country like India, they have grown to an extent that makes it difficult to dispense with them or to devise alternatives to them that are fully in harmony with the institutions of the past. As a result, the conflict of norms and values has become endemic in these societies. The new middle class is both the creation and the creator of the open and secular institutions just referred to. The origins of the Indian middle class go back to the first half of the nineteenth century. Its first beginnings took place in the presidency capitals of Calcutta, Bombay, and Madras in each of which a new university was established in 1857. The graduates of these universities and of the others which came up in course of time became administrators, managers, lawyers, doctors, scientists, teachers, and clerks. The Indian middle class is now very large, at well over 100 million persons. Although it is still a relatively small proportion of the population of the country, it is among the largest and the most differentiated of the middle classes in the world. The emergence of a new and increasingly dynamic middle class has not led to the effacement of the old identities and distinctions based on language, religion, and caste. Those still remain, although they are now coming to be organized in a different way. The middle class in turn is differentiated not only in terms of occupation, income, and education, but also in terms of language, religion, and caste. Caste and class reinforce each other in some respects and cut across each other in others. While there is some individual mobility, it does not lead to the equal distribution of all castes and communities among the different economic strata. The social mosaic is, thus, becoming in some sense even more complex than in the past. 
See also: Area and International Studies: Development in South Asia; Caste; Colonization and Colonialism, History of; Communalism; South Asian Studies: Culture; South Asian Studies: Politics
A. Béteille
South Asian Studies: Culture This article discusses studies of culture in South Asia, from colonial times to the present, focusing on the changing meanings of the term, the contribution of various disciplines to research in culture, and recent developments in cultural studies proper. In the South Asian context, culture can mean the cultural heritage of the nation–states of the region, cultural practices or ways of life, community identities, as well as culture in the aesthetic sense, including literature and the arts.
1. Colonial Beginnings The first elaboration of the idea of an Indian culture occurs in the colonial era, through the combined efforts of European Indologists and Indian nationalists. The Indologists’ construction of an Indian tradition, based largely on textual sources, was picked up by the nationalists, who divided the cultural realm into ‘two domains, the material and the spiritual,’ conceding the West’s superiority in the former, while claiming sovereignty over the ‘spiritual’ domain, which bore the ‘essential’ marks of ‘cultural identity’ (Chatterjee 1993). This realm was to be out of bounds for colonial reformers, but at the same time, as Chatterjee has argued, the nationalists had their own project ‘to fashion a ‘‘modern’’ national culture that is nevertheless not Western.’ At this stage, however, terms like tradition and civilization were more prevalent, and the meaning of culture was not fixed. In 1910, during the era of swadeshi (the movement for the promotion of native industry), Ananda K. Coomaraswamy, a cultural nationalist and art historian who included even present-day Sri Lanka in his map of Indian culture, describes culture as a ‘capacity for immediate and instinctive discrimination between good and bad workmanship’ and a ‘view of life essentially balanced.’ Here culture is understood as a historically developed human attribute, an assimilated refinement of taste that goes with a certain settled, rooted way of life (‘restlessness is essentially uncultured’), recalling the ideas of William Morris (Coomaraswamy 1994). It is this meaning that seems to have prevailed when the nationalists translated culture as sanskriti, a word now used in most of the major Indian languages, while Rabindranath Tagore’s suggestion of krishti, which is
closer to the English word in its derivation from krishi or cultivation, never gained acceptance even in his native Bengal, in spite of his being the most influential cultural figure of his time. In its substantive definition, the national culture included the classical heritage in the arts, traditions of education (the guru-sishya parampara), family structure (the joint family was celebrated as quintessentially Indian), and the deep-rooted customs and practices of village India. The twentieth century witnessed a widespread campaign for the reform, rediscovery, and revival of the classical arts. Thus a traditional temple dance around which a system of ‘courtesanship’ had developed was taken and ‘purified’ to create ‘Bharatanatyam,’ one of the currently widely practiced national dance forms. The textual tradition was revisited in the light of Orientalist scholarship and selectively annexed to the national cause, with the Bhagavadgita emerging as a sort of national text, embodying the spiritual distinction of Indian civilization. The search for a ‘living tradition’ that would supplement the classical heritage led to the celebration and appropriation of folk arts and village crafts (Guha-Thakurta 1992). Early nationalist constructions of India’s cultural heritage tended to focus exclusively on Hindu achievements, ignoring the Islamic heritage, on the basis of an ideological negation of Muslim rule as the cause of decline of Hindu civilization. Intellectuals like Rabindranath Tagore and the progressives in the national movement on the other hand tried to construct a more inclusive cultural history, locating themselves in the modern present and acknowledging the irreversible re-making of Indian culture and society by colonial intervention. Jawaharlal Nehru, who became India’s first prime minister, was a key figure in this project but it was a sociologist, D. P.
Mukherjee, who produced the first extended reflection on the idea of a ‘modern Indian culture.’ In the 1940s, when Mukherjee wrote his book, the cultural climate seems not to have been very hospitable to such an idea, given the widespread preoccupation with the revival and preservation of the disappearing cultural heritage of the nation. Mukherjee’s project in the book is to reflect on the contemporary cultural situation in the soon-to-be-independent nation, to inventory the cultural, social, and intellectual heritage and its effectiveness in the present, as well as to produce a concept of the present moment as constituted by a diversity of forces, traditions, and processes. Uncharacteristically for his time, Mukherjee, a partisan of a socialist future for India, distances himself from any approach to culture that privileges nationalism, and insists on a sociological account of culture as ‘the whole social process’ (Mukherjee 1979). Rejecting the idea of culture as heritage, he locates modern culture in a society marked by ‘the artifice of an unreal class-structure.’ He attacks the idea of India as a land prone to the mystic and the
spiritual, and is, throughout, preoccupied with the most pressing issue of the time: that of the coexistence of Hindus and Muslims, and other minorities, within a modern nation–state.
2. Tradition and Modernity

The 1940s and 1950s are a crucial period for the emergence of culture as an object of study. In this period we see the triumph of social anthropology over sociology as the disciplinary home of culture studies, and in consequence, the decline of Mukherjee’s idea of a modern Indian culture as defined by contemporary struggles between social forces, whether traditional, entrenched, emergent, or imposed. The overwhelming sense of the contemporary, which favored a strictly sociological approach, was soon replaced by a more historicist approach, as the dualism of tradition and modernity, by far the most influential paradigm in South Asian cultural studies, took hold. The key figures in this shift were M. N. Srinivas and Milton Singer, an associate of Robert Redfield. In 1952, Srinivas published Religion and Society Among the Coorgs of South India, a work which, Singer asserts, demonstrated how the social anthropological method could be applied to a Great Tradition. It was Redfield who proposed, in his project for the study of civilizations at Chicago (Srinivas had been in California), the fundamental distinction between Great and Little Traditions, roughly equivalent to ‘higher’ and ‘lower’ orders of cultural practice, the former more reflective, more systematic, and textually elaborated, while the latter is considered to be more spontaneous, fragmented, primitive. Until then the anthropological method had only been employed in the study of the so-called primitive societies, but Redfield was proposing a research project of global sweep to study the cultural heritage of humanity. Singer undertook the Indian portion of the study, concentrating on the south Indian city of Madras, later published under the title When a Great Tradition Modernizes: An Anthropological Approach to Indian Civilization, with a foreword by M. N. Srinivas.
Singer (1972) argued that the Indian Great Tradition (the use of the singular was to attract criticism) ‘was culturally continuous with the Little Traditions to be found in the diverse regions, villages, castes and tribes’ and that therefore, ‘even the acceptance of “modernizing” and “progress” ideologies does not result in linear forms of social and cultural change but may result in the “traditionalizing” of apparently “modern” innovations’ (Singer 1972). The most significant element in this formulation is the suggestion that a civilization with a tradition evolved over the longue durée acquired the strength to assimilate ideas and changes coming from outside, and to convert them into organic elements of its own make-up.
One of the most influential and controversial concepts in this new disciplinary thrust was Srinivas’s ‘Sanskritization,’ which, together with ‘Westernization,’ served to explain social change in modern India. Sanskritization, a process by which the lower orders of traditional caste society aspire to a higher social status by adopting the customs and manners of the upper castes, was seen as one of the ways in which the continuity of Little and Great Traditions was maintained. Singer’s concept of ‘cultural performance’ illustrates both the notion of cultural continuity between Great and Little Traditions and that of the absorption of modern influences. Singer defines a cultural performance in the broadest possible manner, including within its ambit plays, concerts, and lectures, and the cinema and radio, as well as ‘prayers, ritual readings and recitations, rites and ceremonies, festivals, and all those things we usually classify under religion and ritual rather than with the cultural and artistic’ (Singer 1972). In other words, the anthropologist’s, the Indologist’s, and the aesthetician’s definitions of culture have here been fused into one, to constitute a seamless continuum of culture, the object of the new social anthropology. The disruptions and displacements brought about by colonial modernity, which were foregrounded in Mukherjee’s sociology of the present, are now located as challenges which the Great Tradition takes in its stride. A. K. Ramanujan, a pioneer of Indian folklore studies, also based in Chicago, rejected the Great and Little Tradition dichotomy and asserted that ‘cultural traditions in India are indissolubly plural’ and organized according to the principles of context-sensitivity and reflexivity (Ramanujan 1999).
3. Culture and Development

Around the 1960s, the dualism ‘culture and development/modernization’ begins to vie for space with its elder cousin, Tradition/Modernity. Culture and development acquired wide currency at a more grassroots level, as development projects, undertaken by the nation–state and international agencies, began to transform the territory. While the Tradition/Modernity paradigm was prevalent among Indologists, anthropologists, and the nationalist intelligentsia, ‘culture and development’ rallies a wide range of social science disciplines including economics, political science, sociology, and gender studies, as well as environmentalists and other grassroots activists and NGOs. ‘Culture and development’ is the decolonizing, modernizing nation–state’s version of Tradition/Modernity. Culture in this framework can be a hindrance to development, as in Amartya Sen’s famous formulation about the ‘missing women,’ who are the victims of cultural constraints on women’s access to food. Superstitions and prejudices nourished by entrenched cultural practices can come in the way of educating illiterate people in family planning, health, education,
hygiene, and other developmental concerns. Culture can also be a resource: traditional cultural forms can be usefully employed to spread developmental messages. Ecological debates have thrown up notions such as ‘masculinist forestry’ and turned to women as good agents in preserving the environment (Dietrich in Menon 1999). Thirdly, there is also the question of ‘cultural survival,’ the cultural rights of minorities, tribes, and other groups, which come under threat from development’s blind onward march and the imposition, from above, of Western models of linear progress and development. The most sustained critique of development and modernization from a point of view that affirmed the validity and continuity of Indian traditions was undertaken by Ashis Nandy. Recouping a Gandhian ‘critical traditionalism,’ Nandy attacked the deracinating effects of Western rationality, individualism, and other ideologies adopted by the Nehruvian state and the middle-class intelligentsia in its developmental campaign. Against the Western tendency to emphasize rupture as the precondition of change, Nandy, following Gandhi, emphasizes continuity. Against the rigid separation between male and female, between individual and individual, Nandy argues that fluid identities and ambiguous selves are more characteristically Indian. The imperative of cultural difference and diversity, of multiple rationalities, also gives rise to a critique of Western science and its hegemonic universalism, in the work of Nandy, Tariq Banuri, and others. The assertion of cultural specificity occurs in other fields of knowledge as well. Thus Sudhir Kakar, a psychoanalyst, has argued for a culture-sensitive psychoanalytic practice and has elaborated his own analytic picture of the Hindu psyche.
There is a substantial body of psychoanalytic readings of South Asian culture, including the work of Girindrashekhar Bose, the first Indian analyst, Philip Spratt, Erik Erikson, Gananath Obeyesekere, and Nandy (see Vaidyanathan and Kripal 1999). There is sometimes a tendency to culturalist reduction in these writings, producing a domesticated psychoanalysis from which the fundamental alienation that psychoanalysis posits at the threshold of human subjectivity is wished away. More recently, Lacanian psychoanalysis has found favor among film studies scholars. Nandy’s rise to eminence as an ideologue of decolonization roughly coincides with the emergence of ‘postcolonial studies,’ galvanized by the publication of Edward Said’s Orientalism in 1978. In India, Saidian postcolonial studies as well as studies of development undertaken in the social sciences were influenced by Nandy’s discourse. Postcolonial theorists like Gayatri Chakravorty Spivak and Homi Bhabha also shared this space of the critique of colonial reason. Apart from this, another important development was the emergence of Subaltern Studies, a series of volumes to which mainly historians contributed, in which colonial and nationalist historiography was critiqued and a
‘subaltern history’ of ordinary people, and of tribal and lower-caste groups, during colonial rule and after, was undertaken. Culture plays a very important role in this project, especially when there is an emphasis on the spontaneity of ‘peasant insurgency’ and tribal uprising, and the question of the subaltern mentality. Postcolonial studies, the Subaltern Studies project, and the critique of development and modernization by Nandy and others together constitute the legacy of cultural studies in the 1970s and 1980s.
4. Globalization and Local Cultures

In the 1990s, the terms shift again, as globalization arrives on the scene. In ‘globalization and local cultures’ we have the third and most recent version of the Tradition/Modernity paradigm, where again the emphasis is on questions of cultural survival in the face of globalization, the resilience of local cultures, and their ability to ‘consume modernity’ on their own terms. Arjun Appadurai has offered a comprehensive theory of globalization and the emergence of what he terms ‘public culture.’ Appadurai treats culture as ‘the dimension of difference,’ of identity based on difference, emerging after the rupture of globalization, in a world where he considers nation–states to be on their last legs. Global relations and movements, or ‘flows,’ have a more decisive bearing on human lives today than national identification. Others are less sanguine about the effects of globalization, and more skeptical about the nation–state’s imminent demise. Appadurai is confident about the ability of societies to assimilate modernity, which is what globalization transports: although we have traveled far, we are still within Singer’s paradigm where traditional societies respond to and assimilate modernity in their own ways. The three variants of a paradigm that have been examined so far share one thing: they approach the question of Indian culture on an international plane. In each case, one term in the opposition refers to a force or process (modernity, development/modernization, globalization) which is of extraneous provenance, while the other term indicates the culture that is at the receiving end. None of them was developed with specific reference to India, which is only one of the sites to which they are applied. The concept of culture employed in all three variants is also predominantly anthropologically defined.
5. National Culture

Within India, other paradigms of cultural analysis have devoted themselves to reading and analyzing the stuff of national culture. Some of them emphasize a distinctive native culture with its own rationality, its sense of self, and strategies of survival. There are also attempts to forge an indigenous conceptual series for
the study of Indian culture, to reconnect with indigenous intellectual traditions after the ‘amnesia’ of colonialism. Beginning in the mid-1980s and throughout the 1990s, as questions of globalization assumed importance, the domestic politics of communalism, which resulted in popular mobilization and widespread violence against minorities, led to a fresh attempt to reexamine the past. As the Hindu nationalists put forth versions of history that supported their political activities, liberal and left intellectuals, particularly historians, revisited the past to reassert the plurality and ineluctable syncretism and hybridity of India’s cultural heritage (Thapar 2000). The question of multiculturalism, and of cultural rights, also became more urgent as cultural and ethnic conflict was seen as threatening the secular-democratic fabric of the nation–state. Social anthropologists have noted the difficulty, in the Indian instance, of separating a sphere of culture from those of religion, community, caste, etc. Caste has been a central category for understanding Indian society, from Dumont’s notion of homo hierarchicus to more recent studies of caste conflict in a contemporary setting, where caste has more to do with cultural and political identity than with social position. This shift to caste as identity brings it into the realm of culture, and various studies have asserted, against the idea of a universal dynamic of Sanskritization, that caste groups enjoy a measure of cultural autonomy and strive to maintain their cultural identity through the formation of networks across regional and language barriers.
The struggles for social justice of Dalits (literally, ‘the oppressed,’ a term now used for the lowest castes in the caste hierarchy, especially the ‘untouchables’) have included literary movements and other forms of cultural expression, as well as attempts, under the broad rubric of folklore studies, to record and study the traditional cultural forms prevalent among these groups. Feminist scholarship has engaged with questions of culture at many levels (see Menon 1999, Thapan 1997). For nationalists, woman was the guarantor of cultural identity and continuity. In the confrontations between the nationalists and the colonial government, and after independence, in the confrontations that arose between religious groups and the nation–state, women became the object of reformist attention and patriarchal protectionism. Two recent debates arising out of events in the 1980s that have had a lasting impact on the character of the national polity have posed a challenge to feminist scholarship, raising questions of the competing claims of women and minority cultural rights and the law, and of female agency in traditional practices that are offensive to a modern, secular outlook. These relate to an incident of sati (self-immolation by a widow on her husband’s funeral pyre, a practice that was thought to have died out) in a village of Rajasthan, and the case of Shah Bano, a divorced
Muslim woman who filed a suit for maintenance from her former husband, leading to a debate about the competing claims of the state and the community’s own institutions to handle such disputes. Using these incidents from the 1980s and the legislative moves that followed each, the anthropologist Veena Das has investigated how ‘a web of creative or destructive tensions in the matter of cultural rights’ determines relations between communities, the state, and the individual (in Menon 1999). Feminist scholarship has shown how communities deploy culture as a means to assert the supremacy of community rights over women’s rights as citizens in a democratic polity. At the same time, on the Uniform Civil Code issue, which has been intensely debated in recent years, feminists have become more sensitive to community laws which, in certain instances, may be more beneficial to women, in opposition to the Hindu nationalist deployment of the discourse of citizenship to press for a civil code that will prevail over all other culturally specific laws (see Menon 1999). The realm of Indian politics is also continuous with that of contemporary popular culture, in particular the culture of popular cinema. South Indian cinema culture has been the breeding ground of some of the most powerful political leaders to emerge in the states of Tamil Nadu, Andhra Pradesh, and Karnataka. Sociologists have studied the ways in which the unique film cultures of these regions have functioned as a platform for electoral politics. The fan clubs of these popular film stars are today an important feature of everyday political and social life, in some cases emerging as militant frontrunners in campaigns for a linguistic national identity. 
Meanwhile, the impact of the Birmingham school of cultural studies and a reconstructed anthropology’s search for new objects of research have spurred projects for cultural studies in India, notably studies of communities of reception for the popular film, popular cultures of photography, music, etc. Film has come to be seen as the emblematic cultural institution of modern India, and in recent years film studies have acquired legitimacy as a discipline, with a multidisciplinary resource base. The importance of globalization notwithstanding, there is a distinct national sphere of cultural studies, which draws upon many of the ideas and paradigms mentioned above, but locates its concerns within the national space. One instance is the study of aesthetic modernism. Postcolonial societies attract development and modernization approaches, where cultural issues are subsumed under sociological, economic, and political questions. Studies of modernism, on the other hand, are concerned with the emergence of artistic practices and ideologies that participate in the international modernist movement while speaking from within (though not only about) their own national space. Modern literature, cinema, theatre, architecture, painting, and other fine arts have been the object of this critical appreciation
(Kapur 2000), with the Journal of Arts and Ideas serving as an important forum for the theorization of an Indian modernism. In the contemporary field of cultural studies, the traditional disciplines like anthropology are joined by new discourses like postcolonial studies, feminism, psychoanalysis, semiotics, post-structuralism, and postmodernism (see Niranjana et al. 1993, Spivak 1988). Ethnographies of cultural communities, including traditional ones as well as newer consumerist types of community, feminist critiques of gendered cultural forms and expressions, and of patriarchal ideologies, film and television studies, and studies of other modern, technology-dependent cultural forms, caste politics, popular religion, histories of modern and popular art, and political cultures are some of the types of cultural study being currently undertaken in India, by scholars belonging to a range of social science and humanities disciplines (see Manuel 1993, Niranjana et al. 1993, Thapan 1997, Vasudevan 2000). The concept of culture has undergone a definite shift of emphasis from ancient heritage and primitive ways of life to the more unstable and complex practices and processes of contemporary existence. The past coexists with the present, as it does in any social formation, but not in a historicist time-space. Cultural studies increasingly regard the present moment as a synchronic dimension, where all constituent elements, whether ancient or recent, foreign or indigenous, constitute a symbolic network with its own unique properties, making it radically contemporary (Dhareshwar 1995). Forging the tools for analyzing this complex cultural space is the task of the future.

See also: Area and International Studies: Development in Southeast Asia; Buddhism; Buddhism and Gender; South Asia: Sociocultural Aspects
Bibliography

Appadurai A 1997 Modernity at Large: Cultural Dimensions of Globalization. Oxford University Press, Delhi, India
Chatterjee P 1993 The Nation and its Fragments: Colonial and Post-colonial Histories. Oxford University Press, Delhi, India
Coomaraswamy A K 1994 Art and Swadeshi. Munshiram Manoharlal, New Delhi, India
Dhareshwar V 1995 Our time: History, sovereignty and politics. Economic and Political Weekly 30(6)
Guha-Thakurta T 1992 The Making of a New ‘Indian’ Art: Artists, Aesthetics and Nationalism in Bengal, c. 1850–1920. Cambridge University Press, Cambridge, UK
Kapur G 2000 When was Modernism: Essays on Contemporary Cultural Practice in India. Tulika, New Delhi, India
Manuel P 1993 Cassette Culture: Popular Music and Technology in North India. University of Chicago Press, Chicago
Menon N (ed.) 1999 Gender and Politics in India. Oxford University Press, Delhi, India
Mukherjee D P 1979 [1948] Modern Indian Culture, 2nd edn. [Alternative title Sociology of Indian Culture] Rawat Publications, Jaipur, India
Nandy A 1998 Exiled at Home. Oxford University Press, Delhi, India
Niranjana T, Sudhir P, Dhareshwar V (eds.) 1993 Interrogating Modernity: Culture and Colonialism in India. Seagull Books, Calcutta, India
Obeyesekere G 1981 Medusa’s Hair: An Essay on Personal Symbols and Religious Experience. University of Chicago Press, Chicago
Ramanujan A K 1999 The Collected Essays of A. K. Ramanujan. Oxford University Press, Delhi, India
Said E 1978 Orientalism. Routledge, London
Singer M 1972 When a Great Tradition Modernizes: An Anthropological Approach to Indian Civilization. Praeger, New York
Spivak G C 1988 In Other Worlds: Essays in Cultural Politics. Routledge, New York
Srinivas M N 1952 Religion and Society Among the Coorgs of South India. Asia Publishing House, Bombay, India
Srinivas M N 1966 Social Change in Modern India. University of California Press, Berkeley, CA
Thapan M 1997 Embodiment: Essays on Gender and Identity. Oxford University Press, Delhi, India
Thapar R 2000 History and Beyond. Oxford University Press, Delhi, India
Vaidyanathan T G, Kripal J J (eds.) 1999 Vishnu on Freud’s Desk: A Reader in Psychoanalysis and Hinduism. Oxford University Press, Delhi, India
Vasudevan R (ed.) 2000 Making Meaning in Indian Cinema. Oxford University Press, Delhi, India
M. M. Prasad
South Asian Studies: Economics

South Asia is the sister that was all dressed up, but did not make it to the ball. At the time of independence at the close of the first half of the twentieth century, India, Pakistan, and Sri Lanka (Ceylon at the time), the larger countries of the region, all seemed poised for successful development. Sri Lanka’s per capita income was much higher than Korea’s in 1950, and India and Pakistan were at income levels comparable to those of Korea. These South Asian countries had inherited a reasonable infrastructure, both physical and administrative, and, freed of colonial shackles, all truly seemed ready for the ‘tryst with destiny’ that Jawaharlal Nehru articulated for India. Fifty years later, South Asia is far behind East and Southeast Asia in standards of living. While the region has made progress, the East Asian ‘miracle’ puts it to shame. The damning comparison is not just with the most successful developing countries; South Asia is fast emerging as the poorest, the most illiterate, the most malnourished, the least gender-sensitive, indeed, the most deprived region in the world (Haq 1997, p. 2).
What went wrong? This piece examines the economic development experience of South Asia and the lost
opportunities for the region’s people. It touches on long-standing issues of growth vs. development, state vs. market, and agriculture vs. industry, as well as the more recent themes of political economy, decentralization, liberalization, and reform. It discusses the role of economic ideas and of political ideologies and compulsions in shaping the course of economic policies and of the region’s economies. The experience of the region’s economies stands as a mirror of those ideas and ideologies: South Asia has been a laboratory for economic research on development.
1. India

We begin with objectives. Newly independent nations unambiguously wanted to catch up with the West that had ruled over them, to develop, to modernize, to industrialize: three somewhat different goals. Industrialization was the clear winner among these. In India, the government was to occupy the ‘commanding heights’ of the economy, and engineer economic growth led by import-substituting heavy industrialization. The Soviet Union seemed an attractive model at the time, and formal planning exercises were launched in India and in Pakistan soon after independence. A Marxist view of an imperialist world economy, Fabian socialism, a colonial-trained elite, and an economy shaped by wartime controls all met on common ground to define how India proceeded. The initial intellectual framework for Indian planning came from non-economists, but, at the time, the general approach of a vast, centralized resource optimization exercise was a common one in global thinking on economic development. Views dissenting from this model, which was accepted by most development researchers at the time, were rare, though not absent. Through the 1950s, this model was reasonably successful. There were no purges, and no mass starvation (as economist Amartya Sen has suggested, these facts are connected by the constraints of a democratic polity). India did not break apart. And the economy grew steadily, at a rate much faster than in colonial times, 3.5 percent a year (later caricatured as ‘the Hindu rate of growth’). Agriculture was not neglected, if not particularly favored. Land reform was legislated, but not really implemented, though it seemed to be called for on grounds of efficiency as well as equity. Agricultural pricing and food distribution also received attention from academics and policymakers, with the goals of increasing marketable surpluses and providing adequate food to the poor.
Innumerable studies were performed on the productivity of Indian agriculture, using data collected through extensive surveys. What really wrought lasting change in agriculture, however, was the introduction of high-yielding varieties of wheat and rice. This ‘green revolution’ first took place in Punjab, where land holdings were relatively equal, and where the
state government had created an infrastructure of market and rural roads. Central government irrigation projects also played a key role. The Indian planning exercise was a peculiar one, in that the mechanisms for implementation were weak: this was not Soviet-style command planning. At the same time, the functioning of markets as resource allocation mechanisms was incredibly constrained, almost stifled: prices (for internal and external trade) as well as investment and production decisions were bureaucratically controlled and greatly distorted. Theoretically, these distortions were meant to promote equity (e.g., rationing basic foodstuffs) and efficiency (where social costs and benefits supposedly diverged from private ones). In practice, they did a little for the former, and impeded the latter. The government’s ‘license-permit-quota raj’ epitomized the ‘rent-seeking society,’ where incentives for activities aimed at capturing a share of the current economic pie dominated those that would increase the pie itself. In fact, the experience of India seemed to inspire the coinage of the term ‘rent-seeking society’ by an economist, Anne Krueger (1974). Compromise and consensus were viewed as necessary for India’s continued unity, and this worsened the inefficiencies of the system. The nostalgic Gandhian ideal of a village-based Ram Rajya that rejected industrialization was accommodated through subsidies and protections for handlooms and small-scale industries in general: ‘big business’ could not encroach on their domain. A related populism that grew in the 1960s, as political competition greatly intensified, led to further restrictive policies. War, drought, and commodity price shocks worsened the performance of an economy hobbled by policies that led to inefficient capital use, despite increasing rates of savings and investment. Academic economists—a scattered few through the 1970s (Jagdish Bhagwati, Padma Desai, and T. N.
Srinivasan being among the most prominent), then many more in the 1980s—pressed for economic reform, but ideology and vested interests in the economy and polity (as elegantly analyzed by Bardhan, 1984) engendered delay. A more pragmatic approach to economic policy-making made a halting entrance in the 1980s, and Indian economic growth picked up, to 5 percent along with a documented increase in productivity growth (Ahluwalia 1985, 1991, Goldar 1986). However, this took place in a manner that made India more vulnerable to external shocks. The balance of payments crisis of 1991 therefore ushered in a new wave of reform. The collapse of the Soviet Union and the success of East Asia demolished ideological barriers, and India considerably liberalized government controls on domestic and trade-related economic activities. The result has been a further improvement in India’s economic growth rate. Despite its relative economic failure, India has succeeded in building resilient political structures
during the five decades after independence. Its constitution was drafted expeditiously, and has stood the test of time. Fiscal and monetary institutions such as government budget processes and the central bank have remained transparent, accountable, and reasonably effective. In particular, the central bank has excelled in monetary management within its given constraints. Quasi-government bodies such as the Finance Commission that governs central-state fiscal transfers have developed expertise and reputation, and a slew of new regulatory bodies, such as the Securities and Exchange Board of India, promise to improve the quality of economic governance. Gandhian-inspired decentralization, after many half-hearted experiments that met with limited success, has taken root in a constitutional amendment strengthening local governments. If properly implemented, this decentralization may at last be the vehicle for effectively delivering health, education, and social services to the bulk of India’s poor, something that the planning process ultimately failed to do. The institutional developments in India again parallel broader research on the economic role of governance institutions (e.g., Williamson 1996): ideas help shape policy.
2. Pakistan

If consensus to the point of gridlock has often seemed to characterize India’s development path, the word that describes Pakistan might be ‘contradictions.’ In many respects, policies followed in Pakistan were quite similar to those in India: import-substituting industrialization, controlled markets and prices, relative neglect of broad-based agricultural development and of health and education for the rural masses, and half-hearted attempts at land reform. This was part of the same economic development research consensus that shaped India’s post-independence policies. At the same time, Pakistan diverged from India in significant ways. On the economic front, Pakistan was earlier in allowing the private sector freedom for industrial growth. It was able to squeeze resources for industrialization from the agricultural sector more effectively than India. It relied much more heavily on foreign aid and private remittances from overseas workers, and its domestic savings rate has been much lower than India’s. Overall, as researchers would agree, the result was effectively a more efficient use of capital, and, up to the last decade or so, a higher growth rate than in India. The starkest contrast with India, according to many analyses of Pakistan’s economic performance, has been in the country’s governance. Whereas India has evolved a robust, if still flawed, democratic system, Pakistan has essentially been an authoritarian state, where interludes of populism with the trappings of democracy cannot disguise the essential nature of the system as a contest between three overlapping and
related governing elites: the bureaucracy, the military, and the large landowners, with a small number of wealthy industrialist families acting as a background influence. Furthermore, ethnic divisions have sharpened this contest. While this sounds similar to Pranab Bardhan’s (1984) analysis of India, the degrees of inequality and rent capture, and the lack of political checks and balances in Pakistan, have differentiated the two countries. When Pakistan was created in two parts, East and West, the East was more populous, but the West contained the bulk of the elite. Differences in language, culture and economic structure all stood against religion as the only common point. This cleavage heavily influenced the nation’s political and economic development. Historians of the political economy of Pakistan, such as Omar Noman (1990), argue that real democracy was impossible for the Western elites to accept. Policies that squeezed the East’s agriculture to benefit the West’s industrial development exacerbated the divide. From this perspective, the country’s split in 1971 was almost inevitable. After the creation of Bangladesh from East Pakistan, the remainder of Pakistan faced further internal conflicts, reflecting a mix of class and ethnic divisions. The loss of the eastern wing sharpened the focus of these western conflicts. Society was further destabilized by the Soviet invasion of Afghanistan in 1979. This made Pakistan strategically more important, and led to greater foreign assistance, economic and military. This helped the economy in terms of overall growth, but exacerbated problems of governance, with a huge influx of arms into the region. When the Soviets withdrew, the problems for Pakistan remained, and persist. Around the same time, the decade-long oil boom in the Persian Gulf wound down, curtailing the remittances from Pakistanis working there, and having a significant negative impact on the economy.
If we have described Pakistan’s development path by its relation to that of India, this only reflects the consciousness of the nation. While those who govern India and determine its policies certainly are aware of Pakistan, India’s larger neighbor to the north, China, is often more of a mental presence. The most popular view, typical for such a large nation, has probably been just one of insularity: ‘India is different.’ For Pakistan, however, India is the large neighbor that helped to dismember it in 1971, and that, even after three wars and numerous skirmishes, maintains a firm grip on Muslim-majority Kashmir. Pakistan spends a much larger proportion of its GNP on defense than does India (as much as a third of its budget) and always must be aware of its giant neighbor. However, Pakistan does not necessarily follow India in policy-making. Import substitution, bureaucratic control, and heavy industrialization at the expense of agriculture, all were chosen in Pakistan and in India because of their common history and a common understanding of the history and the possibilities of
development. Pakistan embarked on a path of economic reform shortly before India, in 1990, because of external pressures resulting from economic crisis, partly caused by external shocks, but ultimately reflecting the failure of the previous direction of policy. However, Pakistan may have a harder road ahead than India: its lead in per capita income has disappeared over the last few years, and it lags dramatically in human development indicators such as literacy and school enrollment, particularly for girls. Its population, too, is growing much more rapidly. Large inequalities and poor governance structures exacerbate the tensions inherent in its situation. Economists and political scientists would agree that this is not a promising situation for further economic development and growth.
3. Bangladesh
If Pakistan’s prospects have dimmed recently, those of Bangladesh have grown brighter. When the nation was born, it had suffered from close to 25 years of economic stagnation as East Pakistan, and a devastating war of independence. Some observers dismissed it as a basket case. It underwent the same cycles of military rule and authoritarian populism as Pakistan did after 1971. It also struggled with government-led development via Five Year Plans. While Bangladesh still trails India and Pakistan in measures of income, poverty and human development, it has shown progress in recent years, and has begun its demographic transition to a lower population growth rate, at income levels much lower than had been predicted by demographic research at that time. Some researchers suggest that what distinguishes Bangladesh from its larger South Asian cousins is its relative societal homogeneity. Therefore, after initial conflicts among the elite, the nation seems to have settled into an approach that is delivering some broader-based human development. Agricultural production has improved substantially since independence, and the countryside has proved to be an impressive laboratory for nongovernmental organizations such as the Grameen Bank (rural microcredit) and the Bangladesh Rural Advancement Committee (community development for the poorest). This approach, drawing on the strengths of the nation’s civil society, contrasts favorably with more centralized, government-led approaches in India and Pakistan. Of course each of these countries does have its own success stories, and building on them in the future will be important for sustained development. Not surprisingly, theoretical and empirical analyses of rural microcredit and targeted community development in South Asia have been very common (as documented, e.g., in the World Bank’s annual World Development Reports).
Bangladesh also shares institutional problems with India and Pakistan: lack of systems that support
widespread entrepreneurial attitudes, grossly inefficient formal financial sectors, and outmoded systems of taxation and tax administration. In each of these, Bangladesh lags behind the other two nations, and because these institutional weaknesses are critical, its economic reforms of the last few years can seem like pushing on a string, according to economists such as M. G. Quibria (1997). One thing that can potentially help Bangladesh is India’s external liberalization: international trade within the region has been very low in general, despite the absence of the hostility that has limited trade between India and Pakistan. Bangladesh may also be in a position to move into export niches vacated by Southeast and East Asian countries moving up to more sophisticated manufacture. This possibility, and the importance of international trade in general, has been widely recognized by economists. In agriculture, also, India may matter: Bangladesh is a vast delta for rivers that flow from India. Controlling and using that water requires international cooperation that (as documented and analyzed by Crow and Singh, 2000) has only recently begun to develop.
4. The Himalayan Nations
International trade and water resource development also loom large for India’s Himalayan neighbors, Nepal and Bhutan. The economic development of these small, land-locked nations has been, and will be, heavily influenced by India’s own economic success, the openness of its economic regime, and its strategic concerns with respect to China. Nepal has had its own economic experiments with five year plans, and its own struggles with developing democracy. Its infrastructure is poor, but, in a more open regional economy, the potential of its natural resources is high.
5. Sri Lanka
Sri Lanka and the tiny Maldives are the island nations of South Asia. They are outliers in other respects too, with higher incomes and higher human development indices than their continental cousins. Yet Sri Lanka has undergone some of the same problems as the others. At independence, its economy was somewhat depleted by the costs imposed by World War II, but it was still relatively rich, with a strong base in exported cash crops. Post-independence policies that nurtured the population in terms of health, education and access to food gave the country a persistent lead over the continental region in broader human welfare, and all would agree with the position emphasized by Amartya Sen (1981) that these are significant achievements. Yet growth has been slow, and the lack of adequate economic growth has cost Sri Lanka heavily, by exacerbating ethnic conflict that has, proportionate to the nation’s size, been the worst in the region.
The reasons for Sri Lanka’s relative failure, as in the case of India and Pakistan, involve some themes that are familiar in analyses of the development of the region: populism, government policies overly restrictive of private entrepreneurship, direct government control through nationalization, failure to attract enough foreign investment bringing the right mix of capital and technological know-how, and severe distortions in the foreign trade regime. Both in the failure of its economic policies and of management of ethnic tensions, Sri Lanka’s most striking comparison and contrast is with Malaysia, rather than the larger nations of South Asia. The economic policy mistakes of the 1950s and 1960s fed into austerity measures in the 1970s and the explosion of conflict between Tamils and Sinhalese in the 1980s. This meant that, while reforms began earlier than in India or Pakistan, they have not been undertaken in the best conditions. Sri Lanka has managed to grow somewhat more rapidly recently, and to reduce the role of primary products by establishing labor-intensive manufacturing exports as a significant part of the economy, but the story remains one of lost time and lost opportunity.
6. Prospects
If all the major nations of South Asia, despite numerous differences in initial conditions and trajectories, made similar economic policy mistakes over the last five decades, can we lay the blame on common causes? The discipline of economics and its practitioners share some responsibility. The understanding of the workings of markets and entrepreneurship, and the roles played by information, incentives and innovation, was certainly incomplete when these countries embarked on their attempts at guided development. Economists who were skeptical of the abilities of government to be a guide or director of the economy were in a minority initially. Yet by the time economists began to be more forceful in their criticisms, policies had created entrenched interests, and reform has not come easily.
This is not to argue that the state has no role. Currently, economists agree that the issue is not simply ‘state vs. market,’ or the degree of state intervention along some one-dimensional conceptual continuum. There are many dimensions. The state does some things better than others. It may do nothing very well, but it may still be better than the market in some spheres of the economy. And, even if it does not ‘govern the market’ (to use Robert Wade’s (1990) characterization of East Asia’s experience), it certainly ‘enables the market,’ through the maintenance of law and order, and through a judicial system that enforces contracts. The state also provides safety nets for the unfortunate, physical infrastructure, and, sometimes, social-order-preserving redistribution, all through a system of taxation and expenditures. In all this, the state is itself the agent of interest groups and constituencies. Part of the lesson of South Asia’s experiences of economic development is therefore a practical one: what does and does not work in economic policy, and the proper roles of state and market.
Returning to common causes for South Asia’s relative failure, ideology, society and history all have their place, though not the catchall scapegoat of culture. The ideology of Marxism and its Soviet incarnation cast a long shadow, especially over India. Societal divisions have certainly made the tasks of governance and development enormously more difficult throughout the region, right from the partition that gave birth to Pakistan. Vertical cleavages have been as pernicious as horizontal ones, and the neglect of human development in South Asia may ultimately, perhaps, be blamed on the rule of an elite very different from any that exists in the social structures of Southeast and East Asia. Finally, history always matters, whether one begins with partition, or colonization, or even further back. While we have presented what might be termed the consensus of research on particular aspects of South Asia’s experience, it is difficult to argue that overall agreement exists on the relative importance of the different factors that have shaped the region’s development in the past half-century.
While history is always important, though, it sometimes becomes a fondly remembered golden past. For nationalistic politicians, ‘In Search of Lost Time’ has that Proustian relevance. In this case, however, the lost time is really that of midnight’s children, the generations of South Asians who failed to realize the full promise of independence. Luckily, though, the ball does not end at midnight, and midnight’s grandchildren may do much better than the previous generation. Perhaps economists have learned enough in the past few decades to help them.
Bibliography
Ahluwalia I J 1985 Industrial Growth in India. Oxford University Press, New Delhi, India
Ahluwalia I J 1991 Productivity and Growth in Indian Manufacturing. Oxford University Press, New Delhi, India
Bardhan P 1984 The Political Economy of Development in India. Blackwell, Oxford, UK
Bhagwati J 1993 India in Transition: Freeing the Economy. Oxford University Press, New York
Bhagwati J N, Desai P 1970 India: Planning For Industrialization: Industrialization and Trade Policies Since 1951. Oxford University Press, New York
Bhagwati J, Srinivasan T N 1975 Foreign Trade Regimes and Economic Development: India. National Bureau of Economic Research, New York
Crow B, Singh N 2000 Impediments and innovation in international rivers: The waters of South Asia. World Development 28: 1907–25
Drèze J, Sen A 1995 India: Economic Development and Social Opportunity. Oxford University Press, Delhi, India
Goldar B N 1986 Productivity Growth in Indian Industry. Allied Publishers, New Delhi, India
Haq M 1997 Human Development in South Asia 1997. Oxford University Press, Karachi, Pakistan
Hossain M, Islam I, Kibria R 1999 South Asian Economic Development: Transformation, Opportunities and Challenges. Routledge, London
Husain I 1999 Pakistan: The Economy of an Elitist State. Oxford University Press, Karachi, Pakistan
Karunatilake H N S 1987 The Economy of Sri Lanka, 1st edn. Centre for Demographic and Socio-economic Studies, Colombo, Sri Lanka
Khan A R, Hossain M 1989 The Strategy of Development in Bangladesh. Macmillan, Basingstoke, UK
Krueger A O 1974 The political economy of the rent seeking society. American Economic Review 64: 291–303
Noman O 1990 Pakistan: A Political and Economic History Since 1947. Kegan Paul, London
Quibria M G 1997 The Bangladesh Economy in Transition. Oxford University Press, Delhi, India
Sen A K 1981 Poverty and Famines: An Essay on Entitlement and Deprivation. Oxford University Press, New York
Wade R 1990 Governing the Market: Economic Theory and the Role of Government in East Asian Industrialization. Princeton University Press, Princeton, NJ
Williamson O E 1996 The Mechanisms of Governance. Oxford University Press, New York
N. Singh
South Asian Studies: Environment
Cultural traditions of ecological prudence evolve in sedentary societies inhabiting stable environments, when populations are close to resource saturation and technology is stagnant (Gadgil and Guha 1992). Accordingly, the emergence, maintenance, erosion, and restoration of ecologically prudent behavior in the south Asian region is examined here. Following sections sketch colonization of the subcontinent, evolution of agricultural society and resource partitioning amongst occupation-based caste groups, and environmental erosion triggered by the colonial conquest and the resultant industrial society. The concluding section discusses the implications of ongoing globalization and emergent environmental conflicts. Differing research perspectives on important issues, both within and beyond the subcontinent, are also examined.
1. Colonization
1.1 Waves of Immigration
The first major migration into south Asia was that of Austric-speaking, i.e., Australasian, peoples from the northeast about 65,000 years ago (Mountain et al. 1995). Subsequent numerous waves of immigrations were propelled by technological innovations outside the subcontinent, as underlined by Western-trained subcontinental researchers (Gadgil and Guha 1992).
The farming of wheat, barley, cattle, and pigs in the Middle East around 10,000 years ago propelled several immigrations into south Asia of Dravidian-speakers approximately 6,000 years ago. Similarly, the farming of rice, buffalo, etc. in southeast Asia 8,000 years ago drove immigrations of Sino-Tibetan-speakers into the south Asian subcontinent about 6,000 years ago. Indo-Europeans dominated the latest immigrations around 4,000 years ago, following domestication of the horse in central Asia 6,000 years ago. Lastly, Europeans, pushed by resource scarcity following the ‘Little Ice Age’ and armed with superior ships and cannons, invaded India in the sixteenth century to control the spice trade. These manifold invasions have shaped, genetically and culturally, the most diverse society of the world, as Western researchers have revealed (Mountain et al. 1995). This persistence of over 50,000 tribe-like endogamous groups, characterized by hunting–gathering/shifting cultivation cultures, in a complex agrarian and later industrial society, is very unusual. This situation has resulted from the peculiar regional tradition of subjugation and isolation, in contrast with the worldwide practice of elimination or assimilation of subordinated communities by dominant groups. This diversity is accompanied by extreme specialization of occupation, as subcontinental researchers emphasize (Gadgil and Guha 1992). The diversity of environmental regimes throughout the subcontinent, ranging from warm coastal mudflats to snow-clad mountain peaks, also nurtured the cultural diversity.
1.2 Conservation Ethic
In the relatively predictable tropical and subtropical environments, the primitive hunter–gatherer society evolved largely into sedentary, territorial tribes with populations close to carrying capacity. This promoted the evolution of cultural practices of ecological prudence, rationalized within a religious framework.
Such traditions appear to have changed little when hunter–gatherers advanced to the level of shifting cultivators, as is evident from the age-old practice of maintaining sacred groves (tree clumps protected against harvests by local people because of their religious significance), still followed in shifting cultivation tracts such as those of the Himalayas. Harvests from such relict forest patches are prohibited or rigorously regulated by the local community. Even these communities showed some further occupational specialization by age and sex, such as shamans and traders of honey, shells, feathers, etc.
2. Agricultural Civilization
2.1 Agricultural Evolution
A sedentary agricultural civilization evolved gradually in favorable situations, employing paddy cultivation
in the river valleys and maintaining large cattle herds. The resultant surplus promoted the evolution of a predatory warrior social group, specializing in physical protection of the surplus and its usurpation as tax, as well as the expansion of territories. This accompanied the so-called Aryan invasion from west and central Asia, with horse chariots and iron axes easily clearing forest tracts for cultivation. Consequent deforestation in the Gangetic plains began around 1100 BC, proceeding to the southeastern end of the subcontinent by 600 BC (Gadgil and Guha 1992). This new agricultural–pastoral society organized into sedentary, largely self-sufficient village communities seemingly nurtured many tribal practices of ecological prudence. The generation of a large surplus by agriculturists and herdsmen promoted the expansion of trade, including that in luxury goods such as spices, fine cloth and precious metal, which were exported to other continents. This strengthened the caste of traders, while tribal shamans evolved into the priestly caste claiming a monopoly over codified knowledge. The peasants and artisans constituted the lowest groups.
2.2 Religious Evolution
The hierarchical religion then practiced called for extensive sacrifice of animals, especially oxen. Such sacrificial ceremonies also necessitated extensive tree cutting for the holy fire. As the Gangetic plains became saturated with cultivation, woodlands and pastures shrank and oxen became a critical source of agricultural power and trade. As a result there was a strong protest from the peasants and traders against animal sacrifice and tree cutting, reflected in two new religions, Buddhism and Jainism, by the sixth century BC (Gadgil and Guha 1992). Respect for all forms of life and nonviolence was taken to extremes by Jainism, where one sect wears no clothes but for gauze over the nose to avoid trapping and killing insects. Jainism is strongest in the northwestern subcontinent, a region with a particularly fragile environment, where the dangers of overexploitation are acute.
2.3 Resource Impoverishment
Kautilya’s Arthashastra, a manual of statecraft dating from the third century BC, near the peak of imperial expansion in the region, provides insights into the ecologically prudent practices of the rulers, such as maintaining royal hunting preserves (Gadgil and Guha 1992). Poaching wild elephants from forests along the boundary maintained for military purposes attracted the death penalty. Large empires began to disintegrate after the seventh century AD, probably due to the gradual decline of agricultural fertility over a millennium. This eroded the dominance of traders
and their main religions Buddhism and Jainism by the tenth century. This resulted partly from a revival of Hinduism, through transformations such as the incorporation of several tribal deities, and the abandonment of ritual animal sacrifice and consequent meat consumption. This revival consolidated Hindu caste society, which now manifested rigid barriers of endogamy such as those found among the tribes.
2.4 Resource Partitioning by Castes
However, unlike the tribespeople, the Hindu castes do not occupy exclusive territories and have a specialized mode of subsistence, which used to be hereditary. In a small region, one caste specialized in catching fish, a second in herding buffaloes, a third in making salt from sea water. A fourth caste may have tapped liquor from palm trees, a fifth may have cultivated paddy, and so on. Thus, about 10 to 50 sedentary castes would coexist nearly independently within an area of about 1,000 square kilometers. The same area would also be visited by dozens of other nomadic castes of artisans, pastoralists, and entertainers. These nomadic castes also ranged over fixed areas. Traditionally, all these castes carried out barter with one another. The most remarkable feature of the caste society was the regulation of competition for limited resources through the diversification of resource use (Gadgil and Guha 1992), which from a biological perspective resembles the partitioning of food resources among organisms with differing ecological roles in an ecosystem. Resource regulations included intracaste and intercaste territoriality manifested in land ownership, livestock category, or grazing seasons. Besides resource use diversification, this caste society also exhibited traditions of regulated harvests, some of which continue even today. An example is the setting up of sacred ponds upstream to protect fish breeding pools and confining extraction to a particular season, so ensuring good harvests. Besides religious motivation, villagers are also often aware of the direct benefits of sustainability practices, such as the utility of bird guano from protected village heronries as farm manure. Such understanding has extended folk conservation practices beyond focal habitats to species protected by specific communities.
For instance, the Bishnois, a sect of Hindus in the semiarid northwestern subcontinent, sacrificed several lives by hugging their sacred trees to prevent the soldiers of the local king from cutting down the trees for a lime kiln for the king’s palace. Such strict religious protection has meant that even today Bishnoi villages are green islands with freely roaming peacocks, blackbucks, and blue bulls, amidst an otherwise desolate landscape. Incidentally, while the sacred trees are very useful to the Bishnois, their animals are protected and fed with stored grain despite the crop depredation they inflict.
Conservation practices amongst rulers continued, such as the moratorium on felling green trees, imposed by King Shivaji in western India in the seventeenth century. This king’s navy established the first plantations of teak trees long before the celebrated British plantations in Malabar two centuries later. Conservation traditions were not confined to tribal peoples or Hindus, as evident from sacred ponds around Muslim shrines in Bangladesh (Gadgil and Guha 1992).
3. Colonial Conquest
While the subcontinental society responded to environmental constraints by resource specialization and cultural prudence, Europeans responded to resource scarcity by exploring new lands and technologies around the fifteenth century. The ensuing industrial revolution tilted the balance of power from landlords to traders and industrialists. The British government gradually whisked away the common property resources in the subcontinent and actively eroded the traditions of prudent communal management such as the sacred groves, condemned by Westerners as superstitious or obstacles to state control (Brandis 1884). Subsequent to the British invasion, reckless hunting and timber harvesting took place. This denuded the forests and drove animals such as the Asiatic lion and cheetah nearly to extinction. The realization of the harm gradually promoted so-called scientific forestry, which nevertheless had to hire continental foresters, as Britain had no tradition of scientific forest management. Eventually the British granted protection to royal hunting preserves and established some new game preserves. Meanwhile the continued breakdown of the traditional economy and the solidarity of village communities forced the indiscriminate use of common property resources, now owned by the government but with usufruct privileges to local people. Indiscriminate grazing, tree cutting, fishing, hunting, and other destructive activities increased, further eroding the impoverished resource base. The forests reserved for the exclusive use of the government, for example to provide timber for the emerging railway network, did not fare well either as local people agitated for cultivation on this forest land and failed to cooperate in its management. Thus, beyond the decimation of the environment, the British period marked an erosion of the conditions that had fostered an ethos of ecological prudence.
4. Industrialize and Perish?
4.1 Subsidized Profligacy
The British looked at the subcontinent initially as a source of cheap raw material and, later, as a market
for manufactured goods. In consequence, they deliberately destroyed traditional industries such as handicrafts and prevented industrialization from gathering pace. When several countries in the region became independent after World War II, they vigorously promoted mechanized industrialization as a remedy for the maladies of an impoverished resource base and growing population. The philosophy ‘industrialize or perish’ was quickly embraced by the urban elite, while farmers and artisans could hardly participate. Unbridled industrialization opened the gates to mass exploitation of the natural resources at throwaway prices to industry. Highly subsidized land, water, and power, and unregulated pollution soon severely eroded the environment and resources. However, the resultant phenomenal profits posted by the industries far exceeded the output of the agricultural sector, allowing industries to shift the resource base. Such immunity promoted ecological profligacy and prudence was thrown out of the window. Concomitant elitist conservation initiatives were confined to the conversion and expansion of hunting reserves into wildlife sanctuaries and national parks, further cornering the resources of the dependent masses, triggering fresh ecological conflicts that haunt most large projects such as irrigation and power schemes even today.
4.2 Grassroots Conservation
This profligacy of the social groups that have a wide resource base and import goods from all over the world, termed ‘biosphere people,’ contrasts with the prudence inherited by the masses, termed ‘ecosystem people’ (because they are dependent on the consumption and trade of immediately available resources from predominantly natural ecosystems). The latter group is dependent on protecting its neighboring resource catchment (Gadgil and Guha 1995). This is contradictory to the Western, Malthusian approach which attributes environmental degradation to population explosion of poor people (Ehrlich 1969).
Regional researchers retort that the much higher resource consumption levels of the Western elite cause greater environmental degradation. Imposing the costs not only of environmental degradation but also of conservation by the elite on the masses was soon met with fierce resistance. Ensuing social conflicts prompted many fresh initiatives of ecological prudence on the part of the masses, who were left with no option but to protect their resource base. For instance, northern Himalayan people were agitated to discover that besides encouraging ecological insecurity, the government was starving their cooperative forest-based industry for the benefit of distant private sector industries. This led to a tree-hugging protest by local people along the lines of the Bishnoi, mentioned above, with women in the forefront, against industrial tree felling. It became famous as the Chipko (‘hugging’) movement.
Chipko triggered unprecedented environmental awareness and mass conservation movements. The repercussions are socializing the government-sponsored afforestation drive, though enhancing the stake of villagers in managing existing forests is proving much slower. Fishermen along the coast are agitating to prevent mechanized fishing trawlers from overfishing and destroying the spawning beds, ultimately eroding the fish stocks. This causes immense harm to traditional fishermen who avoid fishing during the breeding season and use nets with a large mesh size so that small fish can escape. The traditional fishermen are also opposing the aquaculture industry, as it promotes prawn cultures by landlords with heavy chemical inputs that ultimately pollute and destroy the estuarine fisheries which sustain small fishermen. In places, these conflicts have even assumed bloody dimensions. Often, multinational corporations (MNC) provide the finance for such environmentally damaging and socially inequitable projects.
5. Globalization
The global dimensions of even localized conservation conflicts became more evident after the recent ban by the USA of marine food imports from India. The USA has appealed to the World Trade Organization (WTO) to declare a multilateral ban. The ban was justified on the grounds that it would foster conservation, as the Indian trawlers lack expensive turtle-excluding devices, primarily used and sold by the developed countries. The WTO has ruled out a multilateral ban, acceding to India’s request that environmental disputes be addressed at more appropriate platforms such as the international Convention on Biological Diversity (CBD). The threat to mass livelihoods from the global elite still looms large under the guise of environmental protection, with developed countries unsuccessfully trying to impose environment-related trade restrictions. This trend was seen at the Seattle millennium conference of the WTO in 2000. If the trawler ban were ever implemented, it would not only affect Indian trawler owners but also conservation-minded masses such as the traditional fishermen who eke out a living from fish exports. Another example of an environmental conflict with the global elite is the transgenic cotton controversy in southern India (Gadgil and Utkarsh 1999). Here, cotton farmers were faced with disaster after crop failure due to extensive pest attacks in 1998. Not long afterwards, a multinational agro-corporation announced that it was carrying out ongoing field experiments prior to the likely release of a pest-resistant transgenic cotton variety. This prompted widespread protest from environmentalists and farmers, who feared not only that the transgenic strain might pose environmental hazards, but that if the MNC had a monopoly over the strain they might end up dependent
upon it. Further, such transgenic crops might accelerate the erosion of traditional cultivars, thus undermining environmental efficiency and farmers’ rights to their seeds. The controversy about large dams on the River Narmada and consequent unfair widespread displacement is a rare case where unrelenting protest by the masses resulted in the withdrawal of an international investor, the World Bank. However, transnational investors do not generally display such restraint, especially if they are wooed by local governments unmindful of the environmental hazards and related protests by local people.
6. The Future Future environmental conflicts may be glimpsed from the ongoing legislative reforms taking place in the south Asian region (Gadgil and Utkarsh 1999). For instance, the Indian biological diversity bill and patent act amendments intend to compel even foreign agencies to pay royalties for commercial applications based on Indian biological diversity and related folk knowledge. This might prevent cases like patenting of innovations based on Indian biodiversity and knowledge abroad, as in the infamous cases of Basmati rice and Neem-based pesticide where patents were obtained abroad on improved rice variety and process to stabilize volatile Neem extract, without sharing any profits with the original South Asian farmers. The bill proposes spending such funds on incentives for promoting grassroots conservation. It also buttresses people’s resource rights, in theory, by requiring the consent of relevant village councils for any development or conservation project. Implementation of such intentions is awaited, but many national and international projects are already encouraging villagers’ involvement in sustainable management of even government property, enthused by the success of village forest protection committees in Nepal and India or water users’ associations in India. Other initiatives drawing on substantial information industry strengths in the region include registration of people’s knowledge of biodiversity use, to prevent unfair patenting. While social systems within the region are slowly adapting to resource conflicts, new international conflicts are emerging, such as that over the CBD biosafety protocol. This has implications for the competition between local cultivars and the largely foreign-controlled genetically modified organisms (GMOs). GMOs are feared for their likely environmental and social hazards. 
Debate between the countries of the region and developed countries also concerns country contributions to the causes and mitigation of global climate change. While developed country agencies fix the blame on deforestation and rice cultivation in the region (Anon 1992), regional researchers claim the opposite (Achanta 1993). Regional cooperation unfortunately seems to be lacking, unlike the case in other tropical regions. For instance, SAARC (the South Asian Association for Regional Cooperation) is hardly cohesive and vocal in pushing its case at the CBD and climate change negotiations, whereas its east Asian, African, and South American counterparts are very vocal and united, probably due to people’s pressure in those regions. The partitioning of the people’s stake in the environment forms the focus of ongoing environmental research debate in the south Asian subcontinent. Those claiming to represent grassroots communities shun all that is modern and Western as environmentally destructive (Shiva 1993). Other Western-influenced regional researchers reject such categorization of conservation efforts (Gadgil and Utkarsh 1999), and advocate strategies to cope with globalization rather than to denounce it and presume that it is avoidable.
7. Conclusion In future, conservation and livelihood conflicts might escalate both nationally and internationally. The evolution of cultural practices of ecological prudence has characterized much of the subcontinental society until now (Kothari et al. 2000). However, the emerging dominant social groups such as corporate industries are unlikely to inherit this traditional ecological prudence or to evolve an alternative paradigm. This stems from their ability to harness resources from many alternative locations, employing evolving technologies and measuring benefits solely in monetary terms. This profligate behavior is rationalized in the framework of economic growth and technological progress. The article emphasizes that ecological prudence might be maintained only through an innovative blend of the emergent corporate culture and the masses that inherit the conservation traditions. See also: Caste; Colonization and Colonialism, History of; Environmentalism: Preservation and Conservation; Globalization: Geographical Aspects; South Asia: Sociocultural Aspects; South Asian Studies: Culture; South Asian Studies: Economics; South Asian Studies: Geography; South Asian Studies: Politics
Bibliography
Achanta A N (ed.) 1993 The Climate Change Agenda: An Indian Perspective. Tata Energy Research Institute, New Delhi, India
Anonymous 1992 World Resources 1992–93. World Resources Institute, Washington/Oxford University Press, New York
Brandis D 1884 Progress of Forestry in India. McFarlane and Erskine, Edinburgh, Scotland
Ehrlich P 1969 The Population Bomb. Ballantine Books, New York
Gadgil M, Guha R 1992 This Fissured Land: An Ecological History of India. Oxford University Press, New Delhi, India/University of California Press, Berkeley, CA
Gadgil M, Guha R 1995 Ecology and Equity: Use and Abuse of Nature in Contemporary India. Penguin Books, New Delhi, India/Routledge, London
Gadgil M, Utkarsh G 1999 Intellectual property rights and agricultural technology: Linking the micro- and the macroscales. Indian Journal of Agricultural Economics 54(3): 191–210
Kothari A, Pathak N, Vania F 2000 Where Communities Care: Community Based Wildlife and Ecosystem Management in South Asia. International Institute for Environment and Development, London/Kalpavriksh, Pune, India
Mountain J L, Hebert J M, Bhattacharya S, Underhill P A, Ottolenghi C, Gadgil M, Cavalli-Sforza L L 1995 Demographic history of India and mtDNA-sequence diversity. American Journal of Human Genetics 56: 979–92
Shiva V 1993 Monocultures of the Mind: Perspectives on Biodiversity and Biotechnology. Third World Network, Penang, Malaysia
G. Utkarsh
South Asian Studies: Geography The landscape of south Asia contains tremendous natural and cultural diversity, and the geographic study of the region exhibits an equally wide range of themes. During the last quarter of the twentieth century, geographers shifted focus from a regional studies approach, which provides general overviews of key geographic characteristics of the region, to a problem-oriented approach that highlights contemporary social and environmental issues confronting it. This article briefly introduces the regional geography of south Asia and then examines in detail the themes that inform contemporary geographic study of the region.
1. Regional Geography of South Asia Many of the early regional geographies of Asia contain sections on south Asia and provide useful summaries of background information about society and the physical environment across the Indian subcontinent. An exclusive focus on the region developed in Asian regional geography after the 1947 partition of the subcontinent and the emergence of the newly independent countries of India and Pakistan. The early regional studies offered comprehensive overviews with selected coverage of specific geographic details, notably population and economy. In the 1960s, important studies of individual countries in the region appeared, which provided greater depth and attention to local scale geography (Karan 1963).
The regional approach to south Asian geography gained in sophistication during the 1970s, when geographers blended the study of regional patterns with a critical focus on process and on changes in society and the landscape. A study of the historical evolution of south Asian geopolitics appeared in the form of the monumental A Historical Atlas of South Asia, which offered new insights into the ebb and flow of imperial power and the design of regional landscapes (Schwartzberg 1978). The historical geography approach not only highlighted the magnificent periods of civilization in the region, but also showed how the contemporary organization of society and land owes its spatial origin in great part to the political past. The connection between history and modern times occupies a nexus of geographic study about south Asia. The historical perspective in regional geography was advanced further in the 1970s with studies that interpreted the spatial organization of the subcontinental economy according to the impact of colonialism. This work called specific attention to the role of the British in fundamentally altering not only the globalization of the regional economy, but also the administration of south Asian society and the usage of land and other natural resources. The regional geographies, including atlases of the subcontinent that have appeared since the 1970s, blend the comprehensive and historical approaches (Farmer 1983). Much of the contemporary study of south Asian geography, however, favors a problem-oriented approach over the regional perspective.
2. Contemporary Themes of South Asian Geography The general issues of modern society drive the current study of south Asian geography. Prominent among these are the place of tradition in modern life, the function of the peasant economy amid globalization, the dilemma of development, territorial movements and ethnic conflict, and environmental change. The coverage of these thematic areas is distributed at varying scales across the region, from village-level study to international inquiry. 2.1 The Place of Tradition in Modern Society The wealth of tradition in south Asia is one of its defining features and the subject of much study on the cultural geography of the region. The geography of India, for example, is commonly defined according to cultural practices, and the spatial organization of the landscape results in part from such complex traditions of culture as pilgrimage and sacred places (Bhardwaj 1973). The maintenance of traditional practices in contemporary life reflects their continued importance even while the material conditions of life rapidly change. Among the various roles of culture in the
region is the importance of tradition in determining how people use the land to obtain their basic needs. Environmental outlooks, organized according to indigenous knowledge systems, remain significant features for the geographic study of local conservation strategies and people-based environmental movements. Modern agriculture and forestry initiatives in south Asia owe a great deal to the traditional systems of environmental management that obtain in specific localities, and, indeed, the latter define in many cases the prospectus for sustainable development in the region. 2.2 The Function of the Peasant Economy The geographic study of peasant agriculture illuminates the place of culture in nature, as well as the social structures that organize village production systems. Farming systems research throughout southern Asia informs our knowledge about such farmer practices as crop type and rotation strategies, gender divisions of work, and the application of technologies in village economic systems. The harmonization of village agriculture with the vagaries of climate is the focus of much study, notably on the impacts of severe weather events and flooding on rural production systems in places such as Bangladesh, where natural events combine with human settlements to create potential hazards. Droughts and famine compelled early geographical studies on the nature of village farming systems, but the persistence of poverty within the peasant economy of south Asia is what drives much contemporary geographical inquiry into the function of the peasant economy. The contradictory nature of the green revolution and the political economic structures that determine how farmers participate in agricultural development programs are the focus of some important geographical studies that employ a social theory perspective. 
Geographers working in the cultural ecology and political ecology traditions have successfully utilized a systems perspective to examine the linkages between environment and society within the peasant economy. In this perspective, the household is the nexus for material and energy exchanges between forests, pastures, and farm fields. The main contribution of this approach to peasant studies is its ability to analyze the interplay of a diverse set of environmental and societal arrangements, which collectively define the function of the village economy. Generally, the systems approach is utilized at the micro or village level, but it also has been applied successfully to the regional setting, for example in the Punjab, with the application of agricultural land-use models. 2.3 The Dilemma of Development Geographical inquiry into south Asian development processes has a strong backing among both the early
regional geographies and the more recent thematic studies. In the former case, the various development indices (both economic and social) are applied on a country basis to map patterns of development, but generally such efforts are flawed because too little attention is given to explaining the geographic inequities that appear on the maps. The more recent development studies in south Asian geography take a very different tack. They examine foremost the processes that contribute to the overall low level of development in the region and to the unequal distribution of development successes within it. The efforts of British geographers working with the Overseas Development Group at the University of East Anglia have led the way toward a political–economic explication of poverty in Nepal (Blaikie et al. 1980). Their work, guided by center–periphery relations theory, has contributed a great deal to our understanding of underdevelopment not only in the Himalayan kingdom of Nepal but throughout the subcontinent. The role of infrastructure in the development of spatial economies, the significance of land tenure relations for resource entitlement, and the geographic variation in wealth and poverty related to structural problems within national economies are the major foci for such studies. Most recent south Asian development studies by geographers have adopted a critical perspective that views poverty as a social construction rather than a product of ineffective economic policy. It is argued in this case that poverty is a result of social histories wherein some segments of society (or indeed entire countries) are made dependent upon other societal elements (or countries). In this vein, the role of foreign assistance as a development vehicle is challenged on the basis that it mainly serves Western interests and simply promotes further dependency among the poor nations of south Asia (Shrestha 1997).
Furthermore, the geography of development in south Asia is shown to reflect inequities in the distribution of foreign assistance as a result of poor planning, fiscal mismanagement, and official corruption. The social theory and critical perspectives found among recent development geographies of the region lend sophistication to the conventional, neoliberal interpretations of development that characterize most early regional geographies of the region. Development geography in south Asia pays considerable attention as well to urban and regional planning. Detailed analyses of urban systems show the role of cities as growth nodes for regional economic development and modernization. Micro-scale studies of south Asian cities examine such phenomena as urban migration and the development of fringe squatter settlements, the role of the informal economic sector in meeting the needs of the urban poor, and the environmental impacts of urban growth (Dutt 1994). In rural south Asia, especially in the northern mountains, geographers have examined closely infrastructures associated with regional development, mainly road alignments, and have shown how these make formerly remote areas more accessible to the influences of globalization and modernity. The rural development studies also consider the geographic role of tourism, which is shown to transform regions widely throughout south Asia through its impacts on society and the natural environment (Zurick 1995).
2.4 Territorial Movements and Ethnic Conflict The fractious nature of the south Asian landscape derives in part from the tremendous cultural diversity of the region and the competing claims on territory by ethnic groups. The study of ethno-geographic zones and territorial aspirations of ethnic groups contributes to an understanding of peace and conflict in the region. Ongoing separatist movements are found throughout the subcontinent, from Kashmir in the northwest to Assam in the northeast, through the Indian plains where Sikh, Gorkhali, adivasi, and other ethnic groups claim regional autonomy or independence, and on to Sri Lanka, where Tamils and Sinhalese are embroiled in a civil war. The conflict potential of ethnic groups is viewed to be a function of such motivating conditions as government encroachment onto ethnic territories and socioeconomic discrimination, and such enabling conditions as group cohesiveness, population expansion, and access to infrastructure, foreign support, and resources. Geographical analyses of territory-based ethnic conflict consider these factors as well as other historical circumstances of the conflict regions.
2.5 Environmental Change The causes and consequences of environmental approaches used to study environmental change are diverse and include geo-technical monitoring in the Eastern Ghats of southern India, landscape assessment in Rajasthan, irrigation and water resource development in the Gangetic plain, and studies of the relations among subsistence economies, the natural environment, and political economy in the Himalaya (Zurick and Karan 1999). The geographic research on environmental trends generally subscribes to a fieldwork-based research model and is theoretically informed by cultural ecology, ecosystems research, and political ecology. Much environmental geography in south Asia centers on the Himalaya, where international research efforts span the mountains from Afghanistan to Burma. Environmental research in the Himalaya is informed in part by a proposed ‘Theory of Himalayan Environmental Degradation’, which links land degradation to population growth and overuse of resources, and by various theories of global environmental and
social change. The work of German geographers in the upper Indus river valley in Pakistan follows the geoecology tradition, which centers upon the human exploitation of altitudinal-based environmental zones, and examines human alterations of the environment for subsistence purposes as well as for modernization along the Karakoram Highway. The UNESCO Man and Biosphere Project contains the work of geographers in Nepal who have examined such issues as land use change, hill slope stability, and natural hazards in the middle mountains region. Much of the contemporary geography in the Himalaya seeks to understand the human underpinnings of environmental change, distinguishing societal causes from natural ones, and to resolve environmental problems according to human needs (Ives and Messerli 1989). The applied aspects of environmental geography in south Asia include such issues as forest regeneration, sustainable agriculture, geographic information systems (GIS)-based monitoring of environmental trends, and the reconciliation of societal aspirations with mountain carrying capacities. A great deal of applied geography is conducted under the auspices of the International Center for Integrated Mountain Development (ICIMOD) based in Kathmandu, Nepal. ICIMOD’s critical program areas include the impacts of globalization on fragile environments, livelihood security, and natural hazards management. These efforts apply broadly across south Asia, linking the environmental and social conditions in the mountains to those in the densely settled plains. Future geographic studies on the region will continue to engage scholars in the pressing issues of society and the natural environment. The regional perspective will benefit by considering the Indian subcontinent in terms of globalization trends and in respect to its potential role in the geographic alliances emerging within Asia.
Among a set of related topics are the impacts of ethnic fragmentation on the evolution of national boundaries and on international relations among the south Asian countries. Attention on culture and environment will continue to consider the place of tradition as a guiding force toward sustainable practices related to economic development and the use of the natural environment. The overall economic transformation of the region will necessitate new geographic appraisals of the peasant economy, societal equity, and land management, linking these within new conceptions of environmental conservation, cultural survival, and sustainable development. See also: Colonialism, Anthropology of; Ethnic Conflicts; Peasants and Rural Societies in History (Agricultural History); Peasants in Anthropology; Regional Geography; South Asia, Archaeology of; South Asian Studies: Culture; South Asian Studies: Economics; South Asian Studies: Environment; Tradition, Anthropology of.
Bibliography
Bhardwaj S M 1973 Hindu Places of Pilgrimage in India. University of California Press, Berkeley, Los Angeles, and London
Blaikie P, Cameron J, Seddon D 1980 Nepal in Crisis: Growth and Stagnation in the Periphery. Oxford University Press, Oxford
Dutt A 1994 The Asian City. Kluwer Academic Publishers, Dordrecht, The Netherlands
Farmer B H 1983 An Introduction to South Asia. Methuen, London/New York
Ives J D, Messerli B 1989 The Himalayan Dilemma. Routledge, London/New York
Karan P P 1963 The Himalayan Kingdoms: Bhutan, Sikkim, and Nepal. Van Nostrand, Princeton, NJ
Schwartzberg J 1978 A Historical Atlas of South Asia. University of Chicago Press, Chicago/London
Shrestha N R 1997 In the Name of Development. University Press of America, Lanham, MD
Zurick D 1995 Errant Journeys: Adventure Travel in a Modern Age. University of Texas Press, Austin, TX
Zurick D, Karan P P 1999 Himalaya: Life on the Edge of the World. Johns Hopkins University Press, Baltimore/London
D. N. Zurick
South Asian Studies: Health 1. Background 1.1 Common Heritage For a proper appreciation of the status of the study of health in south Asia, it is important to underline certain facts pertaining to the character of the region and to health as a field of study. The seven countries in the region share a common cultural heritage as well as political history. Three of these countries—Bangladesh, India, and Pakistan—accounting for nearly 97 percent of the population of the region, were one country until 1947. The cultural and religious ties between India, Nepal, Bhutan, and Sri Lanka predate recorded history. This affinity and closeness have not only produced a similar mindset but also a largely common history of medicine and health. Further, the story of health research in this region is almost as long as human history itself. This history is divided into four periods, each of which is marked by distinct characteristics regarding the nature and extent of health research.
2. Study of Health 2.1 The Pre-Islamic Period The history of health in South Asia starts with the Atharvaveda, the last of the four Vedas, the ancient collection of sacred Hindu hymns, pages, and liturgical formulae. The Atharvaveda, which is said to date back to the second millennium BC, is the seminal source of Indian medicine. From it emerged the better known ayurveda, credited to Dhanvantari who supposedly received it from Brahma, the God of creation in the Hindu Trinity. The period between 800 BC and AD 1000 is regarded as the golden age of Indian medicine. During this period there was great advancement in knowledge about diseases, techniques of surgery, and the concept of health. Two very well-known medical treatises, Charak Samhita by Charak, a physician, and Susruta Samhita by a surgeon, Susruta, were also produced during this period. Although the exact dates of their writing are debatable, both these works are considered far older than any counterpart work produced in Greek medicine, giving rise to a claim for the priority of Indian scientific medicine. During this era, a sophisticated concept of health emerged, postulating interactive unity among soma, psyche, and soul (not too different from the modern concept of holistic medicine), an identification of water, food, physical and occupational environment, sanitation, hygiene and lifestyle, biological and karmic inheritance, and sociopolitical order as primary determinants of health. Because of religious taboos on cutting bodies and killing animals for experimentation, deductive reasoning drove the study of medicine to a large extent. In this pursuit, the thinking about human anatomy and physiology was greatly influenced by the prevalent concept of universe; the human body was perceived as a reflection of the cosmos. However, there is some evidence to indicate a knowledge of germs and bacteria, disease etiology, and the use of empirical observations to determine the effectiveness of various treatments.
There was detailed knowledge about a large number of diseases; the extensive materia medica included at least 760 medicinal plants, herbs, and spices, as well as a large number of minerals and animal remedies based on the milk of various animals, bones, dung, and urine. The list of surgical instruments invented and in use was equally impressive (Kutumbiah 1962, Zimmer 1948). But in later years there was a noticeable slowdown in the growth of medicine in this region, probably due to increased emphasis on the theory of ‘karma’ and wider prevalence of a medieval ethos characterized by feudatory attitudes and parochial concerns.
2.2 Islamic Period The arrival of the Turks (around AD 1000) brought Indian medicine to a standstill because of the deterioration of traditional patronage. With the strengthening of Muslim hegemony, unani medicine (a mixture of Greek, Arab, Turkish, and Iranian medicines) gained the favor of the ruling class, and spread
with the Muslim conquests. However, little documentary evidence is available regarding the state of medicine and health during this period, beyond the prevalence of epidemics and famines. Scholarship during this period seems to be concerned primarily with the importation of medical know-how from Iran, Turkey, and Arab countries. 2.3 The Colonial Period The colonial period started on November 24, 1510, when Alfonso de Albuquerque captured Goa, and ended in 1948 when the British withdrew from Sri Lanka, after having withdrawn from India, Pakistan, and Bangladesh a year earlier. During this period, the government by and large limited its attention to the healthcare of its employees, the military, and plantation workers. For this purpose, government health clinics and hospitals were built at most of the district headquarters, major cantonments, primary railway depots, and on tea and coffee plantations. (Due to the greater concentration of tea plantations in Sri Lanka, a proportionately larger health infrastructure was instituted there.) Christian missionaries also built hospitals in major cities and established clinics in the tribal and backward areas. To staff these facilities doctors and nurses were initially imported from Britain, but later several medical colleges (the first of which was established in Madras in 1835) were set up to train local manpower. In 1921 a School of Tropical Medicine was established in Calcutta to train personnel for public health work. In 1932 it was upgraded to become the All India Institute of Hygiene and Public Health (Kumar 1998). A system of licensing the practitioners of allopathic medicine was introduced. However, the private practitioners of ayurvedic and unani medicine, who were not licensed, provided most of the medical services.
Scholarship on health was largely limited to reports based on the decennial census introduced in 1872, and on vital statistics, although the latter were inaccurate due to a widespread failure to register births, deaths, and other vital events. Most of these studies, produced by Westerners, were part of larger sociopolitical concerns. These included official reports, travelogs, diaries, and the occasional book. (Mother India is an example of such a book, written by an American, Katherine Mayo, which became highly controversial due to its political overtones.) In 1943, towards the end of the British rule in the region, an official committee under the chairmanship of Sir Joseph Bhore was appointed to produce a detailed report on health conditions in India (Pakistan and Bangladesh were then parts of India) and to make recommendations for improvement. The committee was called the Health Survey and Development Committee, and is popularly known as the Bhore Committee. This committee submitted its report in 1946. The report is considered a seminal work on healthcare in this region insofar as it laid the basic foundation for scholarship as well as for political actions regarding the importance of health and the role of government in the provision of health services.
2.4 The Postcolonial Period The basic character of the health system in South Asia was formed during colonial rule: the supremacy of allopathic medicine; an ambivalent concept of healthcare (human service vs. economic necessity); a high degree of dependence on the West for basic knowledge and know-how in all matters pertaining to health; a largely reactive role for the government, and a dependence on the government for health statistics. Further, the deep-rooted cultural tendency to rely on deductive logic for problem-solving continued to be the hallmark of scholarship, placing high reliance on creative thinking over empirical proof. This is reflected in the fact that until recently the primary landmarks in the history of scholarship on healthcare in this region were a series of expert reports, starting with the Bhore Report. The Bhore Committee, reversing the minimalist approach of the British government in India to the provision of healthcare, advocated that the government should take responsibility for ensuring universal access to preventive and curative health services because health was too important to be left to the vagaries of circumstances; it recommended the development of a nationwide network of primary health centers, the provision of integrated preventive and curative health services to all regardless of ability to pay, extensive health education to modernize the knowledge, beliefs and practices of the people, the training of needed health manpower, and the construction of necessary health facilities. These recommendations reflected the mood of the time: the success of the Soviet Union in improving the health status of its population through central planning, the release of the Beveridge report in 1942 in Great Britain (recommending universal healthcare) which led to the passage of the National Health Services Act in 1946, and the high emphasis placed on social welfare by the leaders of the independence movement in the South Asian countries. 
The Bhore Committee recommendations were incorporated in the first Five Year Plan of India (1951–6), and as a result a foundation was laid for the development of a nationwide network of primary health centers, as well as for the expansion of the training facilities to provide the necessary manpower. This foundation was built upon in each of the successive Five Year Plans, guided by a series of other expert committee reports: Health Survey and Development Committee (chaired by A. L. Mudaliar) 1961; committees on the integration of health services and family planning, 1963, 1968, and 1970; committee
on multipurpose health workers, 1973; Joint Committee of Indian Council of Social Science Research and Indian Council of Medical Research, 1978, and so on. Other countries in the region also followed more or less a similar path and during this period expert committee reports remained the dominant focus of scholarship pertaining to health.
3. Recent Developments

3.1 Health Research

Since the early 1970s the study of health in this region has received increasing attention from international organizations (WHO, UNICEF, World Bank and USAID), other donor organizations, and American and British universities. This heightened attention is a result of a significant flow of foreign assistance to contain the burgeoning population and high-salience health problems. However, in-region scholarship is still in an early stage of development. Earlier, it was noted that in this region deductive reasoning is a dominant scholarly tradition. This tradition persists despite the fact that the educational system is Western. But unlike those in the West, the institutions of higher learning in South Asia do not emphasize and reward empirical research. Instead, importance is given to the writing of treatises. This is reflected in the fact that a Medline search covering the years 1980–2000 produced only 483 articles (around 25 articles per year, on average), the majority of which reported biomedical research, advocated some policy posture, or were descriptive in nature, whereas Sage India’s summer 2000 list of books (covering the period 1998–2000) included 24 titles on various aspects of health and an additional 26 titles related to health. There are very few refereed journals published in South Asia that are concerned primarily with the study of health systems. The National Medical Journal of India of the All India Institute of Medical Sciences, New Delhi, has been published since 1987. In recent years, three new journals have been established: Journal of Health and Population in Developing Countries (originally published in India by the University of North Carolina, but now moved to the USA), Journal of Health Management (published by the Indian Institute of Health Management and Research, Jaipur) and Journal of Health, Population and Nutrition (published by the International Centre for Diarrheal Diseases Research, Bangladesh). Funds for research also tend to be very scarce. The major sources of these funds are donor organizations, but they fund only those studies that meet their own needs. As a result, researcher-initiated studies tend to be limited to doctoral dissertations, few of which are published. The studies on health may be grouped as follows.
Table 1 Disability adjusted life expectancy (DALE) in South Asian countries, 1997

Country       DALE (years)   Ranking (184 countries)
Bangladesh    49.9           140
Bhutan        51.8           138
India         53.2           134
Maldives      53.9           130
Nepal         49.5           142
Pakistan      55.9           124
Sri Lanka     62.8           76

Source: WHO The World Health Report 2000, Annex Table 5 (for DALE scores); Annex Table 1 (for international ranking)
Note: DALE is calculated by subtracting the average number of expected years of disability due to illness and injury throughout life from the average expected years of life at birth
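The subtraction described in the note to Table 1 can be illustrated with a minimal Python sketch. The DALE scores are copied from Table 1; the function name `dale` and the inputs used in the usage note are hypothetical and serve only to show the arithmetic. This is not the WHO's estimation procedure, which derives the disability component from survey and burden-of-disease data.

```python
# DALE scores (years) for 1997, copied from Table 1.
DALE_1997 = {
    "Bangladesh": 49.9,
    "Bhutan": 51.8,
    "India": 53.2,
    "Maldives": 53.9,
    "Nepal": 49.5,
    "Pakistan": 55.9,
    "Sri Lanka": 62.8,
}


def dale(life_expectancy_at_birth, expected_years_of_disability):
    """Apply the definition in the note to Table 1: life expectancy at
    birth minus the expected years lived with disability.

    Hypothetical helper for illustration only.
    """
    return life_expectancy_at_birth - expected_years_of_disability


# Order the seven countries from highest to lowest DALE.
countries_by_dale = sorted(DALE_1997, key=DALE_1997.get, reverse=True)

for country in countries_by_dale:
    print(f"{country:<12}{DALE_1997[country]:>6.1f}")
```

As a usage example with purely illustrative inputs, `dale(60.0, 7.5)` gives 52.5: a population with a life expectancy at birth of 60 years and an expected 7.5 years of disability would have a DALE of 52.5 years.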
3.2 Health System Achievements

After 50 years of independence, the goal of universal access to preventive and curative health services in South Asia is far from fulfilled. This perception has been a catalyst for a number of studies typified by Qadeer and Sen (1998) and Wysocki et al. (1990). A large number of studies prepared by the governments in this region describe progress made. For example, various reports produced by the Ministry of Health in India point out that a network of health centers, each serving 5,000–10,000 persons and staffed by auxiliary health workers, covers the country. In 1947, there were 26 medical colleges in India; by 2000 there were 146. Further, India is largely self-sufficient in drugs and medical supplies. A significant research infrastructure has been developed (some of the more prominent bodies include the All India Institute of Medical Sciences, National Institute of Communicable Diseases, National Institute of Health and Family Welfare, Haffkine Institute, Pasteur Institute, National Institute of Virology, and International Institute of Population Sciences). Two major social insurance programs provide medical care to targeted populations: the Employees State Insurance Corporation, established in 1948, provides healthcare to the employees of private industry and their dependents (nearly 3.5% of India’s population); and the Central Government Health Insurance Scheme, which covers the employees of the central government and their families (more than 3 million people). India’s average life expectancy at birth rose from 41 years in 1951 to more than 60 years by 1995, and the infant mortality rate fell from 146 infant deaths per 1,000 live births in 1951 to 71 per 1,000 live births in 1997. Other countries publish similar reports of progress. These reports by the governments are often incorporated in the reports of international and other donor organizations; very few of them collect independent data.
However, in 1993 the World Bank departed from this tradition by publishing a report, World Development Report: Investing in Health, based largely on independently collected data. This initiative has now been carried forward by the WHO in The World Health Report 2000, in which countries are rated for their various achievements. According to this report, the South Asia region fares poorly both on the health status of its population (see Table 1) and on the overall performance of health systems (see Table 2).

Table 2 Ranking of South Asian countries for health system performance, 1997 (191 countries)

Country       Performance on health level   Overall performance
Bangladesh    103                           88
Bhutan        73                            124
India         118                           112
Maldives      147                           147
Nepal         98                            150
Pakistan      85                            122
Sri Lanka     66                            76

Source: WHO The World Health Report 2000, Annex Table 10
Note: Performance on health level is calculated by adjusting the DALE score for the efficiency of the health systems. Overall performance is a composite measure of achievement in the level of health, the distribution of health, responsiveness to people’s concerns, fairness of financial contributions and managerial efficiency. For details, see The World Health Report 2000, explanatory notes, pp. 141–51

3.3 Factors Hindering Progress

Increasing efforts are being made to understand the reasons for the poor performance of health systems in South Asia. The donors, mainly the WHO, the World Bank, UNICEF, and USAID, lead these efforts. The WHO has asserted that ‘health is fundamentally different from other things that people want’ (WHO 2000, p. 4) and, citing Jonathan Miller (Miller 1978) on its importance, has argued that health requires priority not only in funding decisions but also in ensuring effective access to the services. However, its analysis indicates that this argument seems to have had little effect on resource allocation decisions in South Asia. Countries in this region rank among the lowest for average per capita total expenditure on health as well as for public expenditure (Table 3). It is argued that the low priority given to health in this region is typical for developing countries. Generally, there is a strong positive correlation between the wealth of countries and the amount spent on health. The poorer the country, the lower the proportion of its wealth devoted to health (World Bank 1993, WHO 2000). It may be because the poor, after spending their income on food and shelter, are left with little for other things. But it may also be because the governments in the developing countries, even after the end of colonial rule, continue to perceive spending on health as ‘expenditure,’ not as ‘investment.’
Table 3 Health expenditure in South Asian countries, 1997

Country       (1)    (2)    (3)     (4)     (5)
Bangladesh    70     4.9    46.0    9.1     144
Bhutan        82     7.0    46.2    10.1    135
India         84     5.2    13.0    3.9     133
Maldives      284    8.2    63.9    11.0    76
Nepal         41     3.2    26.0    5.3     170
Pakistan      71     4.0    22.9    2.9     142
Sri Lanka     77     3.0    45.4    5.2     138

Columns: (1) Total per capita expenditure in international dollars; (2) Total expenditure as % of GDP; (3) Public expenditure as % of total expenditure on health; (4) Public expenditure on health as % of total public expenditure; (5) Ranking for per capita total expenditure on health (184 countries)
Source: WHO The World Health Report 2000, Annex Table 8 for columns 1–4, and Annex Table 1 for column 5
Table 4 Ranking of South Asian countries for equality of access to health services and fairness of financial contributions (191 countries)

Country       Equality of access   Fairness of financial contributions
Bangladesh    125                  51
Bhutan        158                  89
India         153                  42
Maldives      134                  51
Nepal         161                  186
Pakistan      183                  62
Sri Lanka     80                   76

Source: WHO The World Health Report 2000, Annex Table 5 for equality of access; Annex Table 7 for fairness of financial contributions
Several studies have focused on the issues of access to health services and financial burden (see Table 4), and have argued that the prevalent inequality is caused by uneven distribution of services between urban and rural areas, poor means of transportation and communication, gender bias, and a high prevalence of for-profit health services (reliance on the private sector ranges from 87% in India and 79% in Pakistan to 54% in Bangladesh). Further, it is observed that the poor carry a disproportionately heavy financial burden for obtaining health services because of their lack of political power: not having effective access to state-supported health services, they are forced to purchase them. Even after paying for them, the services they get tend to be of poor quality. According to one estimate, more than 75 percent of all healthcare in India was delivered by ‘private’ physicians, most of whom lacked proper training in either allopathic or other healing traditions (Garrett 2000). The situation is not very different in most of the other countries in the region. Other factors for the poor performance that have been identified include the scholarly tradition, the role
of donors, prevalent work ethics, and the unreliability of government-generated data. It is argued (Jain et al. 1999) that the dominance of scholarship based on deductive reasoning generates interesting debates, but little action. Most of the energy in such debates is dissipated in derecognizing those from other groups, a notable characteristic of highly populated societies in which people face acute competition for survival and advancement. This not only makes decision-making difficult but, more importantly, it frustrates implementation. The countries in this region are known for the loftiness of their goals, progressive policies and legislation, and noticeably poor implementation due to the absence of what is described variously as the 3Cs (concern with the issues, commitment to resolve them, and competence needed to resolve them) and as ‘organizational capital’ (Brynjolfsson et al. 2000). The scholarly tradition is only partly to be blamed for poor implementation. The prevalent work culture and work ethics have been found to be equally powerful factors. The dominant work culture is feudal and colonial; it denigrates ‘work’ and has a high tolerance for avoidance of accountability. As a result, ‘delinquent’ (absentee) and ‘ghost’ (fictitious) employees, toothless mandatory program evaluation, and noticeable non-performance have become important features of the health systems in these countries (Jain et al. 1995). In combination with pervasive corruption (Chadda 2000, World Bank 1998, Jain et al. 1996, Jain and Nath 1996, Jain et al. 1997), the work culture saps the energy of the health system in many different ways.

3.4 The Role of Foreign Assistance

Table 5 Foreign assistance for health, 1990

Country       Total aid (million $)   Aid per capita ($)   Aid as a percentage of total health expenditure
Bangladesh    128                     1.2                  17.9
Bhutan        NA                      NA                   NA
India         286                     0.3                  1.6
Maldives      NA                      NA                   NA
Nepal         33                      1.8                  23.6
Pakistan      76                      0.7                  5.4
Sri Lanka     26                      1.5                  7.4

Source: World Bank World Development Report 1993, Table A-9

Foreign donors, including international organizations, bilateral bodies, foundations, non-governmental organizations (NGOs), churches, and charities, make significant financial contributions to the health budgets of the countries in the region (Table 5). But this contribution is perceived increasingly as a mixed blessing, and as a result, a new body of literature is emerging which identifies the pros and cons of foreign assistance and proposes new models for making this assistance more effective. Most of this literature is not health-specific, but the findings are generally applicable to this field. Following the very popular book The Ugly American (Lederer and Burdick 1958), The Ugly Indian, a column written by the well-known Indian journalist Kuldip Nayar (May 2000), highlighted the excesses of Indian aid to Nepal. Dennis Merrill in his book Bread and the Ballot (1990) reviewed US assistance for economic development during 1947–63. Choudhary (1991) produced a similar study on Canadian aid to countries in South Asia. Ghai (1991) examined the role of the International Monetary Fund in developing countries; and Rao (1999) and Priya and Qadeer (1998) examined the impact of the World Bank’s conditional assistance for healthcare programs in South Asia. A number of other treatises also appeared (Ghosh 1984, Kanbur et al. 1999) examining prevalent models of developmental assistance and proposing new ones. In essence, these studies observed that while the countries received much-needed resources, the process of foreign assistance compromised their freedom of action, especially in those countries in which foreign assistance accounts for large proportions of the health budgets. This happened in several different ways. Each donor is led by its values, priorities, style, needs and power position. Consequently donors, whether individually or as a group, are rarely interested in the overall improvement in the health system of the country, but only in those parts that interest them. For instance, the WHO has been promoting primary healthcare under the flag of ‘Health for All by Year 2000’ (Wysocki et al.
1990); the US Agency for International Development (USAID) is interested mainly in the provision of family planning services through NGOs (Misra 2000); the World Bank is focused on ‘structural adjustments,’ privatization, and cost recovery (World Bank 1993, Rao 1999); and UNICEF on immunization of children. Each donor has different rules for project development, implementation and monitoring, and therefore makes different demands on the governments. More importantly, using their power position, the larger donors have tended to usurp from the countries the primary initiative for the stewardship of their health systems. As a result, governments have tended to spend a significant amount of their energy in addressing the interests and concerns of the donors, instead of providing the needed leadership to their health systems. This incapacity for effective leadership is further compounded by the fact that the top health leadership in the governments tends to suffer from a high rate of turnover. One study found that the average tenure of health ministers and health secretaries was less than two years, while the senior staff of the donor organizations stayed in their posts an average of four years (Jain et al. 1995). While the operating style of the donor organizations and their representatives has hurt the morale of the health leadership in the region, it has also contributed to the spread of corruption. However, this issue has not received focal attention in the studies on foreign assistance in South Asia, because until recently ‘corruption’ was a taboo term. Although everybody talked about it in informal settings, nobody wrote about it. Recourse was taken to euphemisms such as ‘work ethics’ and ‘unhealthy conduct’ when referring to corruption. This inhibition has prevented serious study of this grave problem. In this connection, special attention needs to be given to the staff evaluation policies of donor organizations. Most donor organizations, in evaluating the performance of their staff, place heavy emphasis on their ability to disburse aid funds. This policy puts pressure on staff to resort to an ‘end-justifies-means’ approach (Jain et al. 1996, Jain and Nath 1996, Chadda 2000).
3.5 Health Statistics

South Asia is known for the superiority of its know-how in mathematics, statistics, demography, and computer sciences. Several world-class institutions such as the Indian Statistical Institute and International Institute of Population Sciences are located here. It has a long history of good-quality censuses. Despite all these positives, the overall quality of health statistics in the region is fair to poor. This phenomenon has attracted the attention of major donors, who have a large appetite for good data. They have therefore been probing this issue to find the causes so that they can facilitate the needed correction. Most of these studies are in the form of consultancy reports, and they have identified a diverse group of factors
responsible for the poor quality of health statistics in the region. These include the following: health systems in these countries are dominated by physicians with little training in data management (Jain 2000); health data are generated primarily for the donors; health managers in these countries are more inclined to base their decisions on intuition and situational expediency than on data (Jain et al. 1999); the overall system of health data management is so out of date and inefficient that the data needed for decision-making are often not readily available; most program managers do not know how to use data in their work; target-oriented program planning generates pressure for falsification; and corruption needs the darkness of non-working information systems. To compensate for the poor reliability of official health statistics, USAID supports an independent sample survey to monitor progress in health and population programs, the National Family Health Survey (NFHS), which is conducted periodically and covers several countries in this region as well as many other developing countries. In India, the autonomous National Sample Survey Organization (NSSO) was established in 1950. This organization, which has a reputation for producing high-quality data, was originally intended to collect data on the state of the economy but now also gives attention to health and other social sectors. However, the health statistics produced by NSSO and NFHS are fairly limited in scope. Improving health requires improved stewardship of the health system, and this, in turn, requires a sound information system. There is a strong need for active research to improve the quality of health statistics in the region.
3.6 Emerging Issues

To improve the health of their people, the countries in South Asia need to surmount the problems of high rate of population growth, infant mortality, maternal mortality, work injuries, and road accidents; contain malaria, TB and other old diseases as well as the new threat of HIV/AIDS; address the problems of a rapidly growing elderly population, rampant urbanization, and deteriorating environment; improve access to and quality of health services; deal with new pressures caused by new international forces such as the World Trade Organization (WTO) and Intellectual Property Rights (IPR); and get ready for new battles over linking trade to child labor and the environment, and foreign aid to privatization, structural adjustments, and corruption (Priya and Qadeer 1998). This poses a huge agenda of research for the social scientists interested in South Asian health.

See also: AIDS, Geography of; Area and International Studies: Development in South Asia; Health: Anthropological Aspects; South Asian Studies: Politics.
Bibliography

Brynjolfsson E, Hitt L, Yang S 2000 Intangible assets: how use of information technology and organizational structure affects market valuations. MIT Working Paper, July 2000
Chadda M 2000 Building Democracy in South Asia. Sage, New Delhi, India
Choudhary N K 1991 Canada and South Asian Development. Brill Academic, New York
Garrett L 2000 Betrayal of Trust: The Collapse of Global Public Health. Hyperion, New York
Ghai D P 1991 The IMF and the South. St. Martin’s Press, New York
Ghosh P K 1984 Foreign Aid and Third World Development. Greenwood Press, Westport, CT
Health Survey and Development Committee, India, 1946 Report of the Health Survey and Development Committee. Manager of Publications, New Delhi, India
Jain S C, Gupta S D, Srinivasan P 1995 Improving Health System Management in Rajasthan. Carolina Consulting Corporation, Chapel Hill, NC
Jain S C, Nath L M, Ahmed J S 1996 End of Project Evaluation: Health and Family Welfare Area Project, Rajasthan, 1989–96. UNFPA, New Delhi, India
Jain S C, Nath L M 1996 End of Project Evaluation: Family Welfare Area Project II, Himachal Pradesh, 1990–96. UNFPA, New Delhi, India
Jain S C, Nath L M, Assifi N 1997 End of Project Evaluation: Family Welfare Area Project II, Maharashtra, 1990–96. UNFPA, New Delhi, India
Jain S C, Barkat A, Lundeen K, Faisel A J, Ahmed T, Islam M S 1999 Improving family planning program performance through management training: the 3Cs paradigm. Journal of Health and Population in Developing Countries 2: 1
Jain S C 2000 Education and training: capacity building. Journal of Health and Population in Developing Countries 3: 1 (Special Issue on WHO (SEARO) Regional Conference on Public Health in south-east Asia in the twenty-first century)
Kanbur R S M, Sandler T, Morrison K M 1999 The Future of Development Assistance. Overseas Development Council, Washington, DC
Kumar A 1998 Medicine and the Raj: British Medical Policy in India 1835–1911. Sage, New Delhi, India
Kutumbiah P 1962 Ancient Indian Medicine. Orient Longman, Bombay, India
Lederer W, Burdick E 1958 The Ugly American. Norton, New York
Lele U J, Nebi I 1991 Transition in Development. ICS Press, Oakland, CA
Merrill D 1990 Bread and the Ballot. University of North Carolina Press, Chapel Hill, NC
Miller J 1978 The Body in Question. Random House, New York
Misra S 2000 Voluntary Action in Health and Population. Sage, New Delhi, India
Priya R, Qadeer I 1998 Impact of structural adjustment policies on health in south Asia. The National Medical Journal of India 11(3): 140–2
Qadeer I, Sen K 1998 Public health debacle in south Asia: a reflection of the crisis in welfarism. Journal of Public Health Medicine 20: 1
Rao M 1999 Disinvesting in Health: The World Bank Prescriptions for Health. Sage, New Delhi, India
World Bank 1993 World Development Report: Investing in Health. Oxford University Press, New York
World Bank 1998 Assessing Aid: What Works, What Doesn’t and Why. Oxford University Press, New York
World Health Organization 2000 The World Health Report 2000, Health Systems: Improving Performance. WHO, Geneva, Switzerland
Wysocki M J, Krishnamurthi C R, Orzeszyna S 1990 Monitoring the progress of health-for-all strategies: the situation in south-East Asia. World Health Statistics Quarterly 43:
Zimmer H R 1948 Hindu Medicine. Johns Hopkins Press, Baltimore, MD
S. C. Jain and N. Johri
South Asian Studies: Politics

The countries of South Asia that matter most have a colonial legacy in common. The British gradually established universities teaching social sciences all over the Indian subcontinent. This policy prepared the ground for the development of political science in most South Asian countries. The pattern is different in Nepal, which had eluded the British Raj. In this Himalayan kingdom social sciences remained centered on the sole department of political science, at Tribhuvan University, which was founded in 1961. It gained momentum gradually, not only because of the democratization process but also because Katmandu became the center for South Asian associations, including NGOs (nongovernmental organizations) and journals, dealing with education or human rights issues, such as Himal and the South Asia Forum for Human Rights. The most common pattern, however, based on British-built institutions, is found in India, Pakistan, and Sri Lanka. In the latter country, Colombo is the most active place, not only because of the university but also because of research centers such as the Regional Centre for Strategic Studies. Like Nepal, Sri Lanka can use its relatively neutral position vis-à-vis the Indo-Pakistani conflict for developing South Asian studies. However, political science remains rather underdeveloped. Interestingly, at the University of Colombo, political scientists are affiliated to the Department of History and Political Science (which was created in 1963). Besides, scholars tend to focus on one political issue, that is, ethnic conflict. (In 1982, an International Centre for Ethnic Studies was launched in Colombo.) Often, the scholars who approached the national question either were not political scientists (Tambiah 1986, an anthropologist by training) or did not stay in Sri Lanka (Wilson 1974, 1988, who first occupied the Chair of Political Science at the University of Ceylon but was appointed Professor of Political Science at the University of New Brunswick in 1972).
This ‘extraversion’ is one of the specific features of the political science of India and Pakistan, the two countries on which we shall focus in this article.
1. The Institutional Bases: Building Upon the British Legacy

1.1 The Indian Efflorescence

Political science has flourished in India, whether in the examination of the political system per se and its links to society or in the study of international relations. The British contributed to these postcolonial developments in two ways. First, they enabled Indian students to study politics in England, notably at the London School of Economics. Second, they introduced courses in political science at Indian universities, most notably in Calcutta (where U. Goshal became lecturer in Comparative Politics in the 1920s) and in Allahabad, where Beni Prasad (who was to found the Indian Association for Political Science and the Indian Journal for Political Science) was Reader in Civics and Politics, also in the 1920s. However, these two cities gradually receded into the background as political institutions were transferred to Delhi. With its twin educational and research institutions, the Jawaharlal Nehru University (JNU) and the Delhi University, India’s capital has quite logically been at the heart of the whole process. The School of Social Sciences and the School of International Relations at the JNU and the Faculty of Political Sciences at the University of Delhi were both founded in 1955. Partly because of the influence of Marxist thought, the historiographical tradition was strong until the late 1980s, particularly at the JNU, where the study of politics was above all seen in interlinked terms of class and political economy. In spite of the dominant presence of these two rival institutions, some other research centers based in Delhi, such as the Centre for the Study of Developing Societies (CSDS), founded in 1963, and the Institute for Defense Studies and Analyses (IDSA), founded in 1965, have managed to grow outside their fold and have been able to contribute new approaches in their own, very specialized, domains.
The Centre for Policy Research (CPR), initially established in 1973 as an informal think-tank, has also grown into a major center for studying public policies as well as foreign affairs and the political sociology of India. In the same manner, political science departments have blossomed throughout India and can be found in most, if not all, of the state universities. To name but a few, Calcutta, Chandigarh, Hyderabad, and Jaipur, but also Banaras and even smaller places like Meerut, have universities with their own departments of political science and in some cases also of international relations, where research and teaching are being conducted with a focus on regional and local issues. This has in turn led to the development of journals and associations that attest to the vitality of the subject. The Indian Association for Political Science, which brings together large contingents of political scientists to the conferences of the International Political Science Association, edits the Indian Journal of Political Science. The Indian Political Science Review is another such journal, based at Meerut University.
1.2 Outward-looking Pakistan

Political science is not as well established in Pakistan as in India, but international relations tends to be studied by a comparatively larger number of institutions. The gap may be explained from three points of view. First, the political context was not very conducive to the development of this discipline: the prolonged phases of authoritarianism under Ayub Khan (1959–70), Zia ul Haq (1977–88), and Pervez Musharraf (1999–) did not allow much freedom of investigation, and the instability of the regime did not help in conceptualizing the Pakistani political system. The phases of civil rule were not conducive to research work either. For instance, G. W. Chaudhury, who had published two reference books in the 1950s–1960s, was forced into exile under Zulfikar Ali Bhutto because of his account of the Bangladesh war. Second, political scientists were dominated by history, the most influential social science, because they took refuge in history for the reasons mentioned above and, as in India, the historiographical tradition overpowered other disciplines. Third, political science, like many other social sciences, suffered much from the weaknesses of the university system: the textbooks are substandard and the teaching of the discipline is especially bad because in most universities and colleges it is taught in Urdu, a language in which there are no quality standard textbooks (Shafqat 1989). In stark contrast to the situation prevailing in India, there is no association of political scientists and no ‘Journal of Political Science’ in Pakistan. Certainly there are departments of political science in many Pakistani universities, including Karachi and Lahore, but not in some of the largest ones. On the other hand, international relations is taught in many places, including the departments of international relations of the Quaid-i-Azam University (Islamabad), founded in 1973, the University of Karachi (1958), and the University of Sindh (1972). Besides these departments, there are many more specific bodies dealing with international affairs, such as the Institute of Regional Studies (1982) and the Institute of Strategic Studies (1973), or with area studies, such as the Centre for South Asian Studies of the University of the Punjab (1973) and the Central Asia Centre of the University of Peshawar (1976). This list reflects the traditional outward-looking tendency of Pakistan, a country at the crossroads of South Asia, Central Asia, and the Middle East and affiliated to associations of regional cooperation and defense since the 1950s. However, India and Pakistan have more in common as far as the relationship of their political scientists with the outside intellectual world is concerned.

1.3 The Study of South Asian Politics Abroad
The command of English by South Asian political scientists partly explains their sometimes-strong affinities with Anglo-Saxon specialists in the field. First, many of them studied abroad and settled down there. Among the founding fathers of ‘Pakistani political science,’ one finds several expatriates including Anwar Hussain Syed, Khalid Bin Sayeed, and S. M. M. Qureshi. On the Indian side, the list is much longer, especially in the United States where, possibly because of political correctness, the universities have recruited an ever-larger number of Indian scholars who found the American environment increasingly attractive in terms of its intellectual climate as well as its working conditions. Second, American and British universities have well-established traditions of South Asian studies in political science. In England, while Oxford and Cambridge tended to specialize in history, the School of Oriental and African Studies, the London School of Economics, and the University of Sussex developed an early interest in South Asian politics. They were joined subsequently by the Institute of Development Studies (Brighton), Hull University, and Bristol University. While the other European countries have been less active, with the possible exception of Sciences-Po (Paris), the Südasien-Institut (Heidelberg), and the Amsterdam School of Social Science Research, American universities have played a major role in this field. The University of Pennsylvania’s Department of South Asia Regional Studies (Philadelphia) was founded in 1948 by W. Norman Brown. Brown was not a political scientist. He had become professor of Sanskrit at Pennsylvania in 1926 (and retired in 1966 in this capacity), but he was interested in Indian current affairs and trained political scientists such as Richard Park, who in 1951 became the first director of the Berkeley South Asia Program and then of the Center for Southern Asian Studies in Michigan (Crane 1985).
Nicholas Dirks points out that Norman Brown’s life ‘helps to explain why area studies at the University of Pennsylvania, and elsewhere, privileged a combination of classical Indological scholarship and modern political and economic concern in the early history of the field’ (Dirks in press). The University of Pennsylvania, where Francine Frankel developed the Center for the Advanced Studies of Asia in the 1990s, continued to play a leading role in addition to Berkeley and Michigan. In Chicago, the study of South Asia was dominated by anthropology (Milton Singer played a major part there in the 1950s) but political science asserted itself in the 1960s with Susanne Hoeber Rudolph and Lloyd Rudolph. The Massachusetts Institute of Technology, where Myron Weiner was the moving spirit of South Asian Studies between 1961 and his death in 1999, was one of the other key institutions. Every year, American social scientists, including political scientists, meet at the University of Wisconsin (Madison) for the Annual Conference on South Asia, an event which is incomparably more important for political scientists than the biyearly meeting of the European Association for the Study of Modern South Asia, which is still in its infancy. The American institutional base, the presence of many South Asian political scientists in the United States, and the strong influence of this country over political science (a young discipline in which there was no well-established European tradition) largely explain why the field was initially dominated by American-born methodologies and theories (Brass 1995).
2. Approaches in the Study of South Asian Politics
2.1 From Developmentalism to Political Sociology
The first research into the politics of South Asia was developed after World War II within the then most influential framework of the theories of development and modernization. Myron Weiner, whose pioneering work, Party Politics in India, was published in 1957, played a leading role in this enterprise. He ‘represented’ India, for instance, in ‘classics’ of this Princeton University-dominated school of thought, such as Almond and Coleman (1960) and Binder et al. (1971). His books, more especially The Indian National Congress (1967), however, did not suffer from the limitations of this developmentalist paradigm because they relied on intensive fieldwork and an empiricist methodology. Contemporaries of Weiner, including the Rudolphs and Morris-Jones, however, initiated a different approach. They rejected the theories that postulated the replication in South Asia of the political trajectories that had occurred in the West. W. H. Morris-Jones underlined the specificities of the Indian political cultures in his The Government and Politics of India (1964) and the Rudolphs made a similar argument by focusing on the case of Gandhi in The Modernity of Tradition (1967). Both trends (Weiner’s empiricism cum developmentalism and the school emphasizing the indigenous dimension of the modernization process) had an echo in India thanks to the towering figure of Rajni Kothari, the real founder of political science in India. In the 1960s, Kothari developed the notion of ‘the Congress system’ in an often-quoted article to describe the political domination of the Congress party along lines that were very similar to Morris-Jones’ analysis (Kothari 1964b, 1970a). Secondly, he took up the issue of modernization in a famous article (also published in an Anglo-Saxon journal, Kothari 1964a) that criticized the approach of the Rudolphs but pointed in the same direction. Thirdly, Kothari worked with M. Weiner (Kothari and Weiner 1965). He drew some of his inspiration from the developmentalist approach, as is evident from the name of the research institution he founded, the Centre for the Study of Developing Societies (CSDS), and was favorably inclined towards empiricism. As a result, the CSDS played a pioneering role in electoral studies; its Data Unit organized the first opinion polls in 1967 and 1971 under the guidance of Bashiruddin Ahmed, who published with Samuel Eldersveld a book entitled Citizens and Politics. Mass Political Behavior in India (1978). Right from the beginning, R. Kothari and his center introduced a modus operandi combining two major features: openness to the Anglo-Saxon approaches (resulting in collaborative works) and, at the same time, an attempt at Indianizing the concepts of analysis (this is most obvious in Kothari’s 1970b study of the functional role of caste in the Indian democracy).
While these two characteristics virtually contradicted each other, the contradiction disappeared when the American brand of developmentalism receded into the background and let the empiricist approach result in research works paying more attention to Indian political cultures and practices. In Sons of the Soil (1978), a book he wrote with Mary Katzenstein, M. Weiner himself challenged the ‘modernization’ theory that predicted the erosion of religious, caste, linguistic, and tribal identities as developing countries modernized. Paul Brass’ initial work exemplifies this approach. Brass’ initial interest in the politics of Uttar Pradesh had led him to question the relevance of parties as the basic unit of Indian state politics; in his book, Factional Politics in an Indian State. The Congress Party in Uttar Pradesh (1965), he magisterially demonstrated that factions, partly related to castes, played a more important role. His approach prepared the ground for a political anthropology, which materialized gradually with the growing interest in politics of anthropologists such as Harold Gould. However, Brass is best known for his study of ethnic politics. In his seminal work, Language, Religion and Politics (1974), he analyzed ‘communal’ movements in Punjab, Uttar Pradesh, and Bihar in an instrumentalist perspective, explaining that such identity politics results from the conflicts of interest of ethnic groups. This anti-primordialist paradigm was well in tune with the approach of a large number of Indian scholars.
2.2 Political Economy and the Marxist School
Besides the initial, and sustained, interaction and cross-fertilization between Indian and Western (mostly American) political scientists, another characteristic of the study of South Asian politics lay in the strong emphasis on political economy. The main object of study here is the State and the issue at stake is: why did the State fail in its efforts at implementing its socialist agenda during the Nehru years and after? F. Frankel argues that the dependence of the Congress Party on privileged classes, landlords, capitalists, and other notables delivering votes or funds prevented the government from pursuing its intended policies (1978). Marxist scholars, like C. P. Bhambri, offer a similar analysis but emphasize the weakness of class-consciousness among the poor, especially the peasantry. However, the Rudolphs (1978) and Atul Kohli (1990) suggest that the State has gained some autonomy. Indian Marxists have often drawn their inspiration from Gramsci and his theories about social hegemony and passive revolution. Pranab Bardhan has used this approach to analyze Indian society’s economic as well as political relations. In The Political Economy of Development in India (1984) he shows how three ‘proprietary classes,’ the political establishment, the landlords, and the capitalists, share power in India, the first group making concessions to the others in order to remain in power: the Congress party gave up substantial land reform in order to get the support of rural notables at the time of elections and did not implement its socialist agenda fully, so that the business community gave financial support to the party. Parallel to this class-oriented approach to social tensions, political scientists have more and more emphasized the role of caste.
Eleanor Zelliot (1992) and Owen Lynch (1969) focused on the mobilization of the Untouchables (and more especially on the movement of B. R. Ambedkar). In the wake of these pioneering works, Marc Galanter (1984) analyzed the impact of the affirmative action programs on the lower castes, Gail Omvedt the rise of Dalit politics (1994), and F. Frankel and M. S. A. Rao the ‘decline of a social order’ that was based on the traditional caste system (1989). The issue of caste, class, and power, originating from the 1960s works of the Marxist scholars, has remained central to the study of Indian politics not only because the State has failed in its endeavor to bring the benefits of development to the masses but also because of the interaction between economic, social, and political power in India.
2.3 New Developments in ‘Domestic Political Science’
In the 1990s, the development of communalism and the increased political use of religion that India was witnessing led many authors to scrutinize a subject that had become a non-issue: secularism. In an attempt to understand the polarization of communities along religious lines, the real impact of secularism on Indian society was reexamined by sociologists as well as political scientists (Nandy 1985, Bhargava 1996, 1998). The State was also a central focus of the research undertaken, the question being why it had failed to anchor the secularist agenda in Indian society (Basu and Kohli 1998). The historical and the Marxist traditions again played an important, encompassing part in delimiting the approach. This can be seen in the works of Asghar Ali Engineer, for instance, whose interpretation of communalism as the result of economic rivalry (1984) had a large scientific impact. It is also evident in the large share that prominent historians like Gyanendra Pandey have had in the study of communalism from the late 1980s onwards (1993). Running parallel to the study of communalism was the new examination of the political system to account for the rise of the parties relying on religious identities for support. Bruce Graham’s work on the Bharatiya Jana Sangh covered both the study of opposition politics (organization, support bases, etc.) and that of the political use of religion (1990). Generally speaking, however, the study of the political parties has been one of the declining themes. While the Congress, when it was a ruling party, had been the central focus of works by Kothari, Morris-Jones, Weiner, Brass, and Stanley Kochanek (1968), its decline has not been studied very closely. Similarly, the communist parties, the socialist movement, and the peasant-based parties such as the Lok Dal have remained largely ignored.
The State institutions, which were scrutinized when they were set up after independence (Austin 1966), did not generate the same amount of work in the 1990s. Scholars like James Manor (who edited a book on the Indian Prime Minister), however, argue that institutional decay is not an irreversible trend and that the resurgence of the judiciary and the Election Commission, for instance, is a good reflection of this tendency (1994). Similarly, State politics has not continued to attract the same interest. In the 1960s–70s Myron Weiner and Iqbal Narain edited volumes studying State politics, almost state by state (1974–7). Paradoxically, such comparative studies have become less and less numerous (a notable exception being Zoya Hasan’s 1998 book on Uttar Pradesh), precisely at the moment when the dismantlement of the one-party system favors the emergence of an even larger number of regional parties. In the 1990s, the regionalization of politics resulted in the multiplication of political parties in Parliament. There were about 40 in the late 1990s. On the other hand, election studies gained momentum. They have always been a very active sector of political science in, or about, India. Myron Weiner, once again, published several volumes entitled India at the Polls. Before him, R. Park and S. Kogekar had covered the first general elections (1956) and Iqbal Narain had coordinated similar surveys (Varma and Narain 1968). In a manner very different from the early years of Indian political science, the political assertion of new actors (including low castes, communal forces, and regional parties) and the tremendous transformations of the party system it triggered are now also being assessed through empirical studies of voting patterns, including exit polls. The Centre for the Study of Developing Societies again plays an essential part. Its Data Unit, under the direction of Yogendra Yadav, has conceived and conducted nationwide polls before and after most elections since 1993. Focusing on socioeconomic criteria, these polls provide an exceptional corpus of material for the understanding of Indian political behavior. Yogendra Yadav’s articles on the 1993–5 elections have cleared the ground for the analysis of post-Congress politics. The current revival of Indian political science is driven by the analysis of the major and somewhat brutal changes undergone by Indian society since the late 1980s. One of the consequences is the relative decline of the historical tradition and the assertion of a more ‘political’ school of analysis. The main English-medium scientific press, the Oxford University Press, launched a new series in 1997 that could be seen as an indication of the change taking place.
This ‘Themes in Politics’ series, co-edited by Rajiv Bhargava and Partha Chatterjee, explored three subjects: the State, secularism, and gender, all three having been at the core of the evolution of Indian society at the time. The series shifted in 2000 to a new publishing house, Permanent Black, founded by Rukun Advani, a former senior commissioning editor with OUP.
2.4 International Relations: Nonalignment and Indian Neighbors
In the early 1960s, the study of international relations ran parallel with the setting of independent India’s foreign policy and focused chiefly on the expectations arising from the Nonaligned Movement and the part India played in it. From the 1970s onwards, the development of strong links with the USSR provided a new focus of research. Soviet foreign policy became one of the main themes of research, in particular in relation to the USA and Eastern European countries. It therefore had a strong regional flavor, while conceptual issues were less prominent. While it focused on countries with which India was having close interactions, Indian political science hardly covered countries with which there were few links. The case of the Chinese school of study provides an interesting example of the close relationship between research and India’s foreign policy. With the 1962 war between India and China, the good relationship the two countries had enjoyed until the mid-1950s collapsed completely. With diplomatic relations down to the minimum, little scientific research was encouraged either on China’s foreign policy or, later, on the Chinese way to economic development. The Institute of Chinese Studies started publishing China Report in 1964, but it was only later that China’s evolution into a major player on the international scene made India realize its lack of knowledge of its chief neighbor. Western Europe and the USA were not central to the study of international relations in India, and were chiefly investigated through their links to the USSR. Central Asia, in spite of its proximity to India, was also hardly studied at all. The same goes for East and Southeast Asia. The new dynamism of research on the region owes a lot to India’s ‘Look East policy’ since the early 1990s and the growing economic links between India and the various Southeast Asian countries. Apart from the interest in regional studies, Indian schools of international relations also developed strategic analysis. Partly due to the country’s size and its tendency to look inward, strategic culture had eluded India. With India’s growth on the South Asian regional scene and its claims to being a major international player, a school of strategic thinking started to emerge. The Institute for Defence Studies and Analysis (IDSA) played an essential part in this new school of thought.
The center was founded in 1965 and has since focused on India’s national security and orientations to defense. Through its long-established influence on the Indian government, the IDSA has been instrumental in establishing India’s naval and regional policy. The Centre’s main publications are the journals Strategic Analysis and Strategic Digest, and the yearly Asian Strategic Review. The latter has been conceived as a reply to the Stockholm International Peace Research Institute’s Yearbook and London’s International Institute for Strategic Studies’ Military Balance. Arguing against the Western perception that India’s military policy was a risk to regional stability, the IDSA has argued in favor of nuclear deterrence and of the necessity of developing a security establishment of adequate size. Its former director, K. Subrahmanyam, and its director in 1999, Jasjit Singh, wield considerable influence over the Indian government as far as security issues are concerned.
Indicating the change of scale in current Indian foreign policy, the relation to China, the nuclear doctrine, and the question of ‘multipolarity’ are nowadays central issues in international relations studies. The IDSA and the Delhi-based Centre for Policy Research concentrate on those subjects, which can be seen as a major shift in Indian strategic thinking.
2.5 How to Study the Politics of Pakistan?
The study of Pakistan’s politics, which is much less developed than its Indian counterpart, has been dominated by two themes, the question of the regime and the issue of national unity. Historically, the latter came first: Pakistan was created to give a nation-state to the Muslims of British India but Islam did not turn out to be a sufficiently strong cementing force, with East Pakistan fighting successfully for the establishment of a separate country (Bangladesh), the Pathans showing signs of irredentism vis-à-vis Afghanistan, the Baluchis displaying centrifugal tendencies too, and Sindh gradually becoming the arena of a major conflict between Mohajirs and Sindhis. Hence there are a large number of studies dealing with ethnic issues. Some authors focus on language as the main line of cleavage (Rahman 1998), while others (Amin 1988) look at ethnic groups (with their other characteristics such as religious affiliations). But most of the authors reject the primordial paradigm. They rather interpret ethnic conflicts as the outcome of socioeconomic tensions or political rivalries. However, emphasizing that cultural features are instrumentalized for promoting collective interests does not mean that culture is ignored. This reading applies to studies of religious groups too. For instance, Vali Nasr underlines that the Jama’at-i-Islami had some affinities with Sufi orders (1994). The question of the regime has been looked at very closely since the first military coup terminated Pakistan’s first attempt at establishing democratic institutions. In his ‘classic,’ The Political System of Pakistan, Khalid B. Sayeed coined the expression ‘constitutional autocracy’ (1967) to describe Ayub Khan’s regime. Pakistan’s incapacity to escape the cyclical alternation of civil rule and praetorian politics is also central to another classic by Mohammad Waseem, Politics and the State in Pakistan (1994).
Unsurprisingly, the periods of military rule have been studied in greater detail. Ayesha Jalal and Mohammad Waseem have scrutinized the reigns of Ayub Khan and Zia-ul-Haq respectively, in two complementary perspectives. While the former emphasized the political economy of defense (1990), the latter stressed the authoritarian overtone of the political culture of Pakistan (the viceregal legacy of colonial rule) and its affinities with the ethos of the army (1987).
Pakistani political scientists, however, often point out that the army’s domination over society and politics (even after 1988, when military personnel continued to exert a strong influence directly or via the President) is not in tune with the country’s political culture. Saeed Shafqat refers to ‘a case of disharmony between democratic creed and autocratic reality’ (1998).
Political science in and about South Asia shares some features irrespective of the country one looks at. On the one hand, in most cases it is derived from the British tradition and this starting point generated rather strong affinities with the Anglo-Saxon academic world. On the other hand, India was especially studied by political scientists (either in India or abroad) from the point of view of its domestic scene whereas scholars working on Pakistan tended to pay more attention to its relations with neighbors and other countries. This differentiation is evident from a clear unevenness in terms of institutional developments. This contrast can be explained by the size of India (a world in itself) and its unique, democratic trajectory, on the one hand, and by the extraversion of Pakistan, a country where political surveys have never been easy, especially when it was under military rule, on the other hand. The alternation of civil governments and military coups probably explains the rather limited development of political science in Bangladesh, the only country of this area we have not mentioned yet. However, the universities of Dhaka, Chittagong, and Jahangirnagar have departments of political science and the resumption of the democratization process may contribute to their consolidation.
Bibliography
Ahmed B, Eldersveld S 1978 Citizens and Politics. Mass Political Behavior in India. Chicago, IL
Almond G A, Coleman J S (eds.) 1960 The Politics of the Developing Areas. Princeton, NJ
Amin T 1988 Ethno-national Movements of Pakistan. Domestic and International Factors. Institute of Policy Studies, Islamabad, Pakistan
Austin G 1966 The Indian Constitution: Cornerstone of a Nation. Clarendon Press, Oxford, UK
Bardhan P 1984 The Political Economy of Development in India. Delhi, India
Basu A, Kohli A (eds.) 1998 Community and the State in South Asia. Oxford University Press
Bhargava R 1996 How not to defend secularism. In: Bidwai P, Mukhia H, Vanaik A (eds.) Religion, Religiosity and Communalism. Manohar, Delhi, India
Bhargava R (ed.) 1998 Secularism and Its Critics. Oxford University Press, Delhi, India
Binder L et al. 1971 Crises and Sequences in Political Development. Princeton, NJ
Bose S 1994 States, Nations, Sovereignty. Sri Lanka, India and the Tamil Eelam Movement. Sage, New Delhi, India
Brass P 1965 Factional Politics in an Indian State. The Congress Party in Uttar Pradesh. Berkeley, CA
Brass P 1974 Language, Religion and Politics. Cambridge
Brass P 1995 American Political Science and South Asian Studies: Virtue unrewarded. Economic and Political Weekly Sept.: 2257–62
Chaudhury G W 1959 Constitutional Development in Pakistan. London
Chaudhury G W 1963 Democracy in Pakistan. Dacca, Bangladesh
Chaudhury G W 1974 The Last Days of Pakistan. Bloomington, IN
Crane R 1985 Preface on Richard L. Park. In: Wallace P (ed.) Region and Nation in India. IBH. Publishing and Co., New Delhi, India
Dirks N B in press South Asian Studies: Futures Past
Engineer A A 1984 Communal Riots in Post-Independence India. Delhi, India
Frankel F 1978 India’s Political Economy, 1947–1977. Princeton University Press, Princeton, NJ
Frankel F, Rao M S A (eds.) 1989 Dominance and State Power in Modern India, 2 Vols. Oxford University Press
Galanter M 1984 Competing Equalities. Law and the Backward Classes in India. Oxford University Press, Delhi, India
Graham B D 1990 Hindu Nationalism and Indian Politics: The Origins and Development of the Bharatiya Jana Sangh. Cambridge University Press, Cambridge, UK
Hasan Z 1998 Quest for Power. Oppositional Movements and Post-Congress Politics in Uttar Pradesh. Oxford University Press, Delhi, India
Jalal A 1990 The State of Martial Rule. The Origins of Pakistan’s Political Economy of Defence. Cambridge University Press, Cambridge, UK
Kochanek S 1968 The Congress Party of India. Princeton University Press, Princeton, NJ
Kogekar S V, Park R L (eds.) 1956 Reports on the Indian General Elections. 1951–52. Popular Book Depot, Bombay, India
Kohli A 1990 Democracy and Discontent: India’s Growing Crisis of Governability. Cambridge
Kothari R 1964a Tradition and Modernity Revisited. Government and Opposition Summer
Kothari R 1964b The Congress System. Asian Survey (Dec.): 116–73
Kothari R 1970a Politics in India. Orient Longmans, Hyderabad, India
Kothari R (ed.) 1970b Caste in Indian Politics. Orient Longman, New Delhi, India
Kothari R, Weiner M 1965 Indian Voting Behavior. Calcutta, India
Lynch O 1969 The Politics of Untouchability. Columbia University Press, New York
Manor J 1994 Nehru to the Nineties. The Changing Office of Prime Minister in India. Viking, Delhi, India
Morris-Jones W H 1964 The Government and Politics of India. London
Nandy A 1985 An anti-secular manifesto. Seminar Oct.
Nasr S V R 1994 The Vanguard of the Islamic Revolution. The Jama’at-i Islami of Pakistan. University of California Press, Berkeley, CA
Omvedt G 1994 Dalits and the Democratic Revolution. Sage, New Delhi, India
Panandikar V A P, S 1983 A Changing Political Representation in India. Uppal, New Delhi, India
Panandikar V A P, S 1992 Integration of Population, Environment and Development Policies. CPR, New Delhi, India
Pandey G (ed.) 1993 Hindus and Others. Viking, Delhi, India
Rahman T 1998 Language and Politics in Pakistan. Oxford University Press, Karachi, Pakistan
Rudolph L I, Rudolph S H 1967 The Modernity of Tradition. Chicago, IL
Rudolph L I, Rudolph S H 1978 In Pursuit of Lakshmi: The Political Economy of the Indian State. Chicago, IL
Sayeed K B 1967 The Political System of Pakistan. Houghton Mifflin, Boston, MA
Shafqat S 1989 Political science problems, prospects and scope in Pakistan. In: Hashimi S H (ed.) The State of Social Sciences in Pakistan, Quaid-i-Azam University, Islamabad, Pakistan, pp. 159–68
Shafqat S 1998 Political culture of Pakistan: A case of disharmony between democratic creed and autocratic reality. In: Shafqat S (ed.) Contemporary Issues in Pakistan Studies, Azad, Lahore, Pakistan
Tambiah S J 1986 Sri Lanka: Ethnic Fratricide and the Dismantling of Democracy. Oxford University Press, Delhi, India
Varma S P, Narain I (eds.) 1968 Fourth General Elections in India. Orient Longmans, Bombay, India
Waseem M 1987 Pakistan Under Martial Law, 1977–1985. Vanguard, Lahore, Pakistan
Waseem M 1994 Politics and the State in Pakistan. Islamabad, Pakistan
Weiner M 1957 Party Politics in India. Princeton, NJ
Weiner M 1967 The Indian National Congress. Chicago, IL
Weiner M, Field O J (eds.) 1974–77 Electoral Politics in the Indian States. 4 Vols. Manohar, Delhi, India
Weiner M, Katzenstein M 1978 Sons of the Soil. Princeton, NJ
Wilson A J 1974 Politics in Sri Lanka, 1947–1973.
Wilson A J 1988 The Break-up of Sri Lanka. Hurst, London
Yadav Y Reconfiguration in Indian Politics. State Assembly Elections, 1993–1995. Economic and Political Weekly
Zelliot E 1992 From Untouchable to Dalit. Essays on Ambedkar Movement. Manohar, Delhi, India
C. Jaffrelot and J. Zerinini-Brotel
Southeast Asia, Archaeology of
Southeast Asia, a term which entered into common use and academic discourse during and after World War II, refers to the region covered by the modern nation-states of Burma, Thailand, Laos, Cambodia, Vietnam, Malaysia, Singapore, Indonesia, East Timor, Brunei, and the Philippines (although archaeologists also include southern China and Taiwan in studying aspects of the region’s prehistory). Divided into mainland and island areas, this region extends across tropical latitudes for approximately 5,000 km southeastwards from Burma to eastern Indonesia. Scholarly traditions of archaeological research in Southeast Asia are generally divided according to their focus on either prehistoric or historic periods. Prehistory begins with the Pleistocene artifactual record of early hominids and ends with the appearance of written records, the timing of which varies by area (e.g., in Indonesia stone inscriptions in Sanskrit appear in the fourth century AD whereas extant written accounts of the Philippines date from the tenth century AD onward). Historical archaeology is subdivided into, for example, early, classical, Islamic, and protohistoric, depending upon the area. While the archaeology of European colonialism and the spread of capitalism are often used to define historical archaeology elsewhere in the world, in Southeast Asia these constitute more recent topics of research. European colonialism of Southeast Asia, however, was the context within which archaeology developed, and its legacy continues to affect contemporary archaeological practice. This history and the current understanding of Southeast Asia’s prehistoric sequence (as based on English-language sources and primarily the better known archaeological records, at least for certain periods, of Thailand, Vietnam, Malaysia, Indonesia, and the Philippines) are summarized here, followed by a discussion of some of the problems and future directions in the archaeology of this region.
1. Development of Southeast Asian Archaeology
The study of archaeological remains in Southeast Asia has a long history, predating even that conducted by Europeans who are credited with introducing archaeology. Thai and Burmese rulers, for example, had previously initiated studies of stone inscriptions to reconstruct political-economic aspects of earlier states. The practice of archaeology as a Western scientific discipline began with European colonial expansion into Southeast Asia. Following an initial period of antiquarian interest, which in places such as Indonesia lasted for more than a century, during which the Dutch and British recorded monumental sites such as Prambanan, collected artifacts (including specifically for museums), and published descriptions, the first archaeological institutions were established. In Burma, a research officer in archaeology was appointed in 1881 under the Indian Archaeology Department established by the British; in Indonesia, the Dutch established an archaeological society in 1885 to coordinate scientific exploration of ancient monuments in Java; in Indochina (Cambodia, Laos, Vietnam), the French established the École Française d’Extrême-Orient in 1898, which directed excavations, recorded and restored sites, and published reports; in Thailand, the Siam Society was founded in 1904 as a forum for those interested in the country’s past, followed by the establishment of the Archaeological Service of Siam in 1924; in the Philippines, systematic explorations and collections were initiated in 1881, with archaeology established during the US colonial period; and in Malaysia, archaeological excavations began in the early 1930s conducted by the Raffles Museum in Singapore. Early investigations tended to focus on monumental sites, often in association with the study of the growing body of stone inscriptions, where these existed; however, excavations of prehistoric sites were also undertaken. Unlike Thailand, where from the
beginning archaeological and art historical research was conducted by Thais, in the colonies this was not the case; thus, for example, Vietnamese, Khmer, and Laotians were neither trained in archaeology nor participated as professionals until after independence. Beginning in the 1960s, there was a significant rise in archaeological research focused on prehistoric sites (although this did not happen everywhere; for example, no archaeological research was conducted in Laos between 1945 and 1975). Some were joint projects between foreign and local institutions, which included training of local students (a first in some countries such as Thailand). One outcome of the various research projects, some of which claimed early developments of agriculture and metallurgy, was that by the 1970s Southeast Asian archaeology was no longer viewed as a branch of Indian, Chinese, or Far Eastern studies, nor as a marginal zone in prehistory that was a backward appendage of the more advanced cultures of India and China. At the same time, diffusion and migration explanations of culture change began to be replaced by frameworks emphasizing local processes of innovation and adaptation. Also since the 1970s, the majority of archaeological research in the region has been conducted by Southeast Asian archaeologists (Miksic 1995) employing a range of perspectives from culture-historical to Marxist.
2. From Homo erectus to Early States
2.1 Early Hominid Occupation of Southeast Asia
The earliest known hominid occupation of Southeast Asia was by Homo erectus. It occurred by at least one million years ago, although redating of fossil-bearing deposits in Java suggests it may have been as early as 1.8 million years ago (Swisher et al. 1994). The majority of evidence is paleontological and primarily from sites on Java, although Homo erectus fossils are also known from Burma and Vietnam. Stone tool assemblages, composed of large pebble and flake tools (which differ from the contemporary Acheulian hand ax industries of Africa, Europe, and India: see Sub-Saharan Africa, Archaeology of), and generally not from primary contexts, have not yet been found in association with hominid fossils. The interpretation of much of the archaeological record assigned Middle Pleistocene dates thus continues to be the subject of debate. Homo erectus’s technical achievements may have included maritime technology, as suggested by the recovery of lithic artifacts from excavated sites on Flores dated to the Early Pleistocene (c. 800,000 years ago). Occupation of this island would have entailed crossing deep sea channels since it was not part of Sundaland (the land mass connecting the mainland with Sumatra, Java, Bali, Borneo, and Palawan, exposed at various times during the Pleistocene) and suggests Homo erectus possessed voyaging skills and watercraft (Morwood et al. 1999).
2.2 Late Pleistocene–Early Holocene Hunter-gatherers
The oldest evidence for occupation by anatomically modern humans dates to approximately 40,000 years ago. Only a few archaeological sites provide evidence of Late Pleistocene occupation: Lang Rongrien rock shelter, Lang Kamnan cave, Son Vi sites, Niah Cave, Tabon Cave, and Leang Burung 2 (see, for example, Anderson 1990, Ha Van Tan 1997, Shoocongdej 2000). The hunter-gatherer groups who occupied these caves and rock shelters are known primarily from their flake and pebble/cobble tool assemblages, although hearths and remains of hunted and collected animals (e.g., land tortoises, rodents, wild pig, monkeys, orangutan, shells) are also found. The majority of their material culture was probably made of bamboo and wood, as suggested by the generalized nature of the stone tools and their probable use in manufacturing activities. By the early Holocene, hunter-gatherers appear to have occupied much of Southeast Asia, with some areas, such as the Malay Peninsula, reoccupied following the expansion of the postglacial equatorial rainforest. Occupation of the upland areas of the mainland is better known in contrast to the lowlands, where rising sea levels and subsequent landscape changes after the last glacial maximum destroyed or buried most sites. Holocene hunter-gatherers are known primarily from their stone tool assemblages, which are dominated by pebble tools. Other types of tool included grinding stones, bone points, and antler wedges, although the majority of their material culture was presumably made of perishable materials. Caves and rock shelters were used for occupation as well as for burial; individuals were buried in either flexed or extended positions and accompanied by stone artifacts, animal bones, and red ochre. Coastal areas were also occupied/exploited, as indicated by shell midden sites.
Mobility strategies may have varied with the wet and dry seasons, entailing different types of settlement (including in open areas) and exploitation of different food resources (Shoocongdej 2000). Diet varied by location and included a wide variety of terrestrial (e.g., pig, deer, macaque, rhinoceros), riverine and marine shell species. Holocene hunter-gatherers also exploited a wide range of plants; their role in plant domestication, a topic of long-standing interest in Southeast Asia, remains little known. In part, this reflects current emphasis on rice—the subsistence base of later complex societies—especially the timing of its domestication, adoption and spread. (See Hunter–Gatherer Societies, Archaeology of.)
2.3 Early Rice-cultivating Communities
The earliest known domesticated rice comes from middle Yangtzi valley sites and dates to c. 6000 BC
(see Food Production, Origins of); Southeast Asian sites have not yet yielded direct evidence of rice this early. Consequently, this evidence, in conjunction with linguistic data, underlies modern models (e.g., Bellwood 1997, Higham 1996) which propose that in the third millennium BC early rice agriculture was introduced into mainland Southeast Asia by expanding Austroasiatic-speaking rice-farming communities that originated in the Yangtzi river area, and into island Southeast Asia by expanding Austronesian-speaking communities from Taiwan or southern China (see Pacific Islands, Archaeology of). Archaeologically, early mainland rice-farming communities appear in northern Vietnam in the third millennium BC. Village settlements with associated cemeteries (of extended burials and infant-jar burials) were established in inland areas, and were engaged in rice cultivation, the raising of domesticated animals (dogs, cattle, pigs), the manufacture of decorated pottery, and the use of stone ornaments. In northeastern and central Thailand, farming villages are known from habitation remains and primarily from cemeteries of extended burials; some mortuary evidence suggests social status differences based on achievements in fine pottery crafting and exchange activities (Higham 1996). These contexts have yielded decorated pottery (some tempered with rice chaff), polished stone adzes, marine shell beads and bracelets, stone reaping knives, bone tools, spindle whorls, and remains of domesticated cattle, pigs, chickens, and dogs. These mainland communities also continued to hunt and collect wild resources, and they engaged in regional systems of exchange involving stone and marine shell. Interestingly, in northern coastal Vietnam, shell middens and other sites dated from the fifth millennium BC have yielded pottery and polished stone adzes, material culture often associated elsewhere with ‘Neolithic’ communities.
Similarly, in Thailand, pottery (appearing as early as 6000 BC) and stone adzes, often from cave burials, predate evidence considered indicative of rice-farming villages. Whether such evidence indicates ‘affluent foragers’ or early farming activities of rice or other plants remains unknown. Within the islands and the Malay Peninsula, early rice-farming communities are known almost entirely from occupation and burial remains in caves and rock shelters, along with a few shell midden sites (Bellwood 1997). In the Philippines, by the early third millennium BC, such sites contain various items including pottery, stone and shell adzes, spindle whorls, stone and shell jewelry, and tattooing chisels. Within the Indonesian archipelago, pollen evidence suggests minor forest clearance on Sumatra around 4500 BC and permanent clearance after 1000 BC, presumably for agriculture; other islands (again primarily from cave and rock shelter sites) provide evidence of pottery and rice around 2500 BC, and the human introduction (via watercraft) of pigs and nondomesticated animals
between 2500 and 2000 BC. During the second millennium BC, caves along the Malay Peninsula were used for burial rather than habitation, and contain graves of extended individuals accompanied by items such as stone adzes, bone points, shell beads and bracelets, and pottery.
2.4 Early Metal Crafting
By the early to mid-second millennium BC, autonomous village communities on the mainland (known primarily from mortuary deposits) became involved in the production and consumption of copper/bronze craft goods, with stone and pottery crafting also continuing (see Intensification and Specialization, Archaeology of). This technological change may not have been accompanied by changes in sociopolitical complexity, unlike Bronze Age periods elsewhere in the world. Bronze objects appear in sites in northern Vietnam prior to 1500 BC, in association with exotic jades from China. Sites in central and northeast Thailand, with initial use in the early to mid-second millennium BC, provide the earliest evidence for the mining, crushing, smelting, and casting of copper into ingots (e.g., Pigott and Natapintu 1988), activities which appear not to have required full-time specialists (White and Pigott 1996). After 1500 BC, bronze-casting technology, which involved the use of ceramic crucibles and stone or ceramic molds, appears to have spread quickly along riverine and coastal exchange routes (Higham 1996). Copper ingots entered into existing exchange networks, thus enabling communities situated far from ore sources to engage in bronze casting. Prior to production of iron craft goods (c. 500 BC), bronzes were cast into ornaments (bracelets, ear ornaments) and utilitarian objects (e.g., socketed ax heads and other implements, chisels, arrow and spear points, fishhooks, sickles). Bronze objects and copper ingots (along with other items such as pottery vessels, animal remains, stone tools, and shell ornaments) served as grave accompaniments, although whether their value in mortuary contexts related to distinguishing status differences and/or other social identities remains unknown.
2.5 The Development of Chiefly Polities and Early States
Around the mid-first millennium BC, significant social changes entailing systems of hereditary ranking and hierarchical organization appear to have been under way, which archaeologists interpret as indicative of complex societies such as chiefdoms (see Chiefdoms, Archaeology of). In mainland Southeast Asia, larger settlements (some with moats, particularly in northeastern Thailand) integrated into regional hierarchies appeared, as well as urban settlements (Co Loa in
northern Vietnam and Beikthano in Burma). Other changes, occurring variably within the region, included the development of iron production (of, for example, ornaments and tools in localized forms), with smelting and forging first appearing in the Chao Phraya and middle to lower Mekong river valleys; increased scale of production and variety of bronzes crafted (e.g., various vessel shapes, bells, large drums, axes, spearheads, jewelry), as exemplified by Dong Son bronzes; greater use of bronzes in mortuary ritual; glass working (in southern Vietnam); intensification of wet rice cultivation using water buffalo and iron plows; participation in maritime and long-distance exchange relations between the mainland and island areas (Bellwood 1997) and India (the earliest evidence dating to the fourth century BC; see, for example, Glover 1996), with exotic goods appearing in mortuary contexts; and the incorporation of the Dong Son polity (northern Vietnam) into the Han empire in 111 BC. The complex societies that were developing during the last few centuries BC in areas of island Southeast Asia acquired bronze prestige/status items (such as Dong Son drums) and iron goods via exchange with the mainland, which was soon followed by the establishment of local metalworking centers. Island polities also engaged in exchange networks involving China and India, with Bali yielding the earliest dated Indian materials at around two thousand years ago (Ardika and Bellwood 1991). As with the mainland, the majority of information on island societies assigned to an ‘early metal phase’ (Bellwood 1997) derives from burials, in open sites and caves, including the distinctive jar burial, slab grave (Sumatra, Java and Peninsular Malaysia), and carved stone sarcophagi (Java and Bali) forms of interment.
While these sites have yielded a wealth of material culture (e.g., decorated pottery; bronze socketed tools, bowls, drums; glass and carnelian beads; iron tools) and information on mortuary rituals, much remains to be learned of the social, political, economic, and ideological aspects of this period, which preceded the formation of state polities in various areas of the Indo-Malaysian archipelago. States in Southeast Asia are traditionally thought to have appeared on the mainland during the early centuries AD. This dating reflects, in part, the privileging of written sources in the identification of state-level polities, and is exemplified by the case of ‘Funan,’ which is generally viewed as the earliest state. Written accounts, the earliest dating to the third century AD by visiting Chinese emissaries, describe Funan’s origin story, dynastic lineage, raiding activities, vassal polities, the walled settlements which contained a palace, the taxation system (paid in gold, silver, perfumes, and pearls), and the presence of a representative of the Indian Murunda king, among other things. Oc Eo, a 450-hectare rectangular site (enclosed by several ramparts and moats) of approximately the second to sixth
century, is located in the lower Mekong valley, the area in which ‘Funan’ is thought to have been situated. It has yielded evidence of participation in extensive exchange networks involving Iranian, Roman, Indian (some engraved in Indian script), and Chinese goods from the second century AD onwards; and of glass, stone (precious and semiprecious), and metalcraft production (Higham 1996). An extensive system of canals (not yet dated) connects Oc Eo with several other large sites including Ta Keo (the probable port) and Angkor Borei. The latter was continuously occupied from the late centuries BC onwards, and was contemporary with and shared similarities in material culture with Oc Eo, which suggests the two were part of the same polity (Stark et al. 1999). The lower Mekong delta state, however, was not an isolated polity in its early years, as other contemporary maritime-oriented states appear to have existed on the west coast of the Malay Peninsula and in west Java (Wisseman 1995). From the mid-first millennium AD, more historic Southeast Asian states are known (e.g., Chenla, Dvaravati, Champa, Kedah, Srivijaya), and these exhibit a shared incorporation of Indian legal, political, and religious ideas and institutions, including the use of Sanskrit names by rulers, as seen in stone inscriptions (first in south Indian and then indigenous scripts), and in the layout and styles of religious architecture and carvings. While ‘Indianization’ (the view that statecraft was brought to Southeast Asia by Indians) is no longer accepted, relatively little is known about indigenous sociopolitical processes in polities which immediately preceded the early historical states, including whether they already represented emerging states. State polities may have existed earlier in northern Vietnam, prior to Han expansion, and this is an issue which requires further consideration and investigation.
Several research projects aimed at understanding state development on the mainland have been initiated in central coastal Vietnam, the area of the Cham state (Glover and Yamagata 1995); the lower Mekong delta, an area associated with ‘Funan’ (Stark et al. 1999); and northeastern Thailand (Higham and Thosarat 2000). Throughout Southeast Asia, fundamental theoretical and empirical issues related to state formation remain, including the definition of a ‘state,’ the relevance of Western theoretical conceptualizations of states and of indigenous political theory, the dating of the earliest states, and the interplay of ideological, social, economic, and other factors in the transformation to states (see States and Civilizations, Archaeology of).
3. Issues of Culture and Chronology
‘Diversity’ is often used to characterize Southeast Asia (e.g., linguistically, ethnically, religiously), and this certainly applies to the premodern and prehistoric periods, in which, for example, at any point in time during the last two thousand plus years there existed various social formations from small, mobile hunter-gatherer groups to large, regional state polities. Consequently, the application of cultural frameworks derived from elsewhere in the world has been criticized, and attempts have been made to develop locally specific ones (e.g., Bayard 1984). Nevertheless, the challenge remains to develop cultural-chronological frameworks that do not rely on assumptions of simultaneous ‘progressive’ regional transformations. Related to this problem is the paucity of well-dated regional archaeological sequences, and the continued reliance on a few ‘key’ sites or artifact ‘types’ for dating, interpreting, and explaining Southeast Asia’s pre- and early history; both reflect a need for more institutional and economic resources for archaeology in this region.
4. Future Research Directions of Southeast Asian and Western Archaeologists
The establishment of chronometrically anchored local sequences will continue to be an aim of much of the archaeological research conducted in the near future. Within the community of Southeast Asian archaeologists, future directions (as expressed at the first International Colloquium on Archaeology in Southeast Asia in the 3rd Millennium, 1998) include more collaborative projects and comparative research; development of theoretical- and problem-oriented research; theoretical contributions to world archaeology; the application of anthropological (also in combination with historical) approaches to archaeological investigation and interpretation; regional as opposed to single-site investigations; and analysis of entire assemblages (particularly in the case of stone tool studies). In addition, Southeast Asian and Western archaeologists will continue research on issues now beginning to be raised, such as local processes of domestication and plant cultivation; the impact and contesting of European colonialism by various social groups; pre- and early historic gender systems and ethnicity; the role of foreign religions (Hinduism, Buddhism) in indigenous social processes; and how nationalism and colonialism have impacted archaeological practice in Southeast Asia.
See also: Area and International Studies: Development in Southeast Asia; Southeast Asian Studies: Geography
Bibliography
Anderson D 1990 Lang Rongrien Rockshelter: A Pleistocene–Early Holocene Archaeological Site from Krabi, Southwestern Thailand. University of Pennsylvania Museum, Philadelphia
Ardika I W, Bellwood P 1991 Sembiran: The beginnings of Indian contact with Bali. Antiquity 65: 221–32
Bacus E A, Lucero L J (eds.) 1999 Complex Polities in the Ancient Tropical World. American Anthropological Association, Arlington, VA
Bacus E A 2000 World Archaeology (Archaeology in Southeast Asia) 32(1) (special issue)
Bayard D 1984 A tentative regional phase chronology for Northeast Thailand. In: Bayard D (ed.) Southeast Asian Archaeology at the XV Pacific Science Congress. Department of Anthropology, University of Otago, Dunedin, pp. 161–8
Bellwood P 1997 Prehistory of the Indo-Malaysian Archipelago. University of Hawaii Press, Honolulu, HI
Glover I 1996 Recent archaeological evidence for early maritime contacts between India and Southeast Asia. In: Ray H P, Salles J F (eds.) Tradition and Archaeology: Early Maritime Contacts in the Indian Ocean. Manohar Publishers, New Delhi, India, pp. 129–58
Glover I, Yamagata M 1995 The origins of Cham civilization: Indigenous, Chinese and Indian influences in central Vietnam as revealed by excavations at Tra Kieu, Vietnam, 1990 and 1995. In: Yeung C, Li H-L (eds.) Conference Papers on Archaeology in Southeast Asia (1995: Hong Kong). HsiangKang ta hsueh mei shu po wu kwan, Hsiang-Kang
Ha Van Tan 1997 The Hoabinhian and before. In: Bellwood P, Tillotson D, Soejono R P, Welch D (eds.) Indo-Pacific Prehistory: The Chiang Mai Papers Vol. 3, Bulletin of the Indo-Pacific Prehistory Association 16: 35–41
Higham C 1989 The Archaeology of Mainland Southeast Asia. Cambridge University Press, Cambridge, UK
Higham C 1996 The Bronze Age of Southeast Asia. Cambridge University Press, Cambridge, UK
Higham C, Thosarat R 2000 The origins of the civilization of Angkor. Antiquity 74: 27–8
Junker L L 1999 Raiding, Trading, and Feasting: The Political Economy of Philippine Chiefdoms. University of Hawaii Press, Honolulu, HI
Miksic J N 1995 Evolving archaeological perspectives on Southeast Asia, 1970–95. Journal of Southeast Asian Studies 26: 46–62
Morwood M J, Aziz F, O’Sullivan P, Nasruddin, Hobbs D R, Raza A 1999 Archaeological and palaeontological research in central Flores, east Indonesia: Results of fieldwork 1997–98. Antiquity 73: 273–86
Pigott V C, Natapintu S 1988 Archaeological investigations into prehistoric copper production: The Thailand archaeometallurgy project 1984–1986. In: Maddin R (ed.) The Beginning of the Use of Metals and Alloys. MIT Press, Cambridge, MA, pp. 156–62
Shoocongdej R 2000 Forager mobility organization in seasonal tropical environments of western Thailand. World Archaeology 32: 14–40
Stark M T, Griffin P B, Chuch P, Ledgerwood J, Dega M, Mortland C, Dowling N, Bayman J M, Bong S, Tea V, Chhan C, Latinis K 1999 Results of the 1995–1996 archaeological field investigations at Angkor Borei, Cambodia. Asian Perspectives 38: 7–36
Stark M T, Allen S J (eds.) 1998 International Journal of Historical Archaeology (The Transition to History in Southeast Asia) 2(3 and 4)
Swisher C C, Curtis G H, Jacob T, Getty A G, Suprijo A, Widiasmoro 1994 Age of the earliest known hominids in Java, Indonesia. Science 263: 1118–21
White J C, Pigott V C 1996 From community craft to regional specialization: Intensification of copper production in pre-state Thailand. In: Wailes B (ed.) Craft Specialization and Social Evolution: In Memory of V. Gordon Childe. The University Museum of Archaeology and Anthropology, University of Pennsylvania Press, Philadelphia, PA, pp. 151–75
Wisseman C J 1995 State formation in early maritime Southeast Asia. Bijdragen 151: 235–88
E. A. Bacus
Southeast Asia: Sociocultural Aspects
Southeast Asia includes the ASEAN states of Brunei, Indonesia, Malaysia, the Philippines, Singapore, Myanmar, Thailand, Cambodia, Laos, Vietnam, and the territory of East Timor. Despite its extreme diversity in terms of population density, ethnic groupings, religion, politics, and economic development, the area has common characteristics, many of them constructed by social scientists and politicians, that are increasingly recognized. In addition to the nation-building efforts of individual governments, a common Southeast Asian identity has emerged, at least in the urban middle classes.
1. The Sociocultural Construction of Southeast Asia
1.1 Conceptualizing Southeast Asia
Southeast Asia has undergone a multitude of crises, economic, political, cultural, and social. Most of these have resulted in a fresh look at what course Southeast Asian societies have taken, and what concepts and theories can be used to analyze new developments. As a matter of fact, some of the new concepts and paradigms have been exported to other world regions, utilized and generalized and entered into the realm of general theory, while their procreators and their region of origin have often been forgotten. Dualism, plural societies, involution, strategic groups, or loosely structured social systems are by now standard terms in social science research, though originally they were empirically-based concepts used to describe Southeast Asian social conditions. What we now call Southeast Asia has gone through a cycle of the social and political construction of itself. As Said (1978, p. 5) has rightly asserted ‘geographical and cultural entities … are man-made.’ Localities are not given, but socially constructed. To mention just a few terms that over time have signified the different external and internal efforts to construct a more or less unified image of the area: the ‘Golden Chersonese’ of Ptolemaeus, Further-India, Hinterindien, Extrême Orient, Nusantara, the Greater co-prosperity sphere, Pacific Asia, Southeast Asia, and, more recently, ASEAN. Though empirical social science research has been carried out on many divergent topics, a limited number of theoretical concepts have emerged that show a certain consistency over time. The following table (Table 1) proposes a rough guide to the flow of sociocultural research on Southeast Asian societies. It also provides a framework in which to locate major concepts used in anthropological and sociological research on Southeast Asia. Some of these concepts will be explained below.
Table 1 Conceptualizing Southeast Asia
Early concepts | Contemporary concepts | Future directed concepts
Social structure | Deconstruction | Transformation
Indianization, islamization, dualism | Moral economy of the peasant and of trade | Transnational business networks
Plural societies, loosely structured social systems, and involution | Deconstructing village society and ethnicity | Social transformation
Big men and theater states | Deconstructing class and the state: strategic group formation | Democratization and civil society
Modernization | Modernization of modernity | Globalization
Religious revivalism | Postmodernism, life styling, consumerism, and the new middle classes | Epistemic culture
Urbanization | Globalization and locality | Cultural construction of space
1.2 Indianization and Islamization
The term Southeast Asia came into common use only during and after World War II. Before that, the area was seen as culturally colonized by its large neighbors India and China. Coedès (1968, pp. 15–16) described the classical states as États hindouisés or, in translation, as ‘indianized states.’ ‘Indianization must be understood as the expansion of an organized culture that was founded upon the Indian conception of royalty, was characterized by Hinduist or Buddhist cults, the mythology of the Purāṇas, the observance of the Dharmaśāstras, and expressed itself in the Sanskrit language. Far from being destroyed by the conquerors, the native peoples of Southeast Asia found in Indian society, transplanted and modified, a framework within which their society could be integrated and developed.’ This marriage of Austronesian and other local cultures with Indian patterns of sociocultural organization has remained a common background, on which later Chinese, Muslim, and European influences have been grafted. The Balinese and several smaller societies, such as the Tengger, maintain the Hindu tradition.
Islam came to Southeast Asia probably as early as the eighth century AD, but became a major force of social and political change only from the fourteenth century onwards. The creation first of a trading and religious, then of a political elite through conversion introduced a new element into the coastal states of Southeast Asia along the straits of Malacca, the Northern coast of Java, Sulawesi, and the Moluccas. The process of Islamization is still ongoing. Islamic revivalism is strong in Malaysia and is gaining strength in Indonesia, which has become the largest Muslim country worldwide.
1.3 Religious Diversity and Religious Revivalism
In addition to animistic beliefs and related belief systems, most major world religions are found in Southeast Asia. Indonesia, Brunei, and Malaysia together have the largest Muslim population worldwide. The majority of the population of Myanmar, Thailand, Cambodia, and Laos adheres to Theravada Buddhism, the Philippines has a majority of Christians, and Singapore and Vietnam are characterized by a multitude of religious beliefs often cutting across family and kinship lines, though Confucian ideas are also widespread. The Buddhist monkhood in Thailand (Tambiah 1976), Islam in Indonesia (Geertz 1960), and in Malaysia have been studied mainly in relation to local religious forms. During the second half of the twentieth century, religious revivalist movements have swept through most of Southeast Asia. In Malaysia a missionary movement (dakwah) has led to a resurgence of Islamic practices, especially in the east coast states of Kelantan and Trengganu. In Thailand, middle-class-based religious organizations have preached a return to Buddhist values. The Javanese mystical tradition has influenced Muslim brotherhoods or led to the foundation of mystical sects: aliran kebatinan, Subud, and others (Geels 1997). The Singapore government has propagated Confucian values, while Christian sects have proliferated and found new adherents among students and young upwardly mobile employees.
1.4 Ethnic Diversity
Boundaries between ethnic groups are blurred, and there are differences between self-interpretation and ethnic classifications imposed by others. In Singapore, the population officially is classified into Chinese, Indians, Malays, and others (usually identified as Eurasians and Caucasians), though there are many more self-ascribed categories. The Malaysian government distinguishes between bumiputra (sons of the soil) and immigrants, but there are also Orang Asli (aborigines), Indians, and Chinese. People of Arab descent no longer are seen officially as a separate group, but in Indonesia the culture, legal system, and language of larger ethnic groups such as the Javanese, Balinese, Batak, Minangkabau, and many others are recognized and expressed in cultural, though not political terms. In Thailand, Laos, Myanmar, and Vietnam the lowland population is distinguished from ‘hill tribes,’ an ethnically and linguistically diverse population (Kunstadter 1967). The difference between a lowland, rice-growing peasant society and nonsedentary ‘tribal groups’ practicing shifting agriculture has been emphasized but is disappearing slowly due to assimilation or replacement. The peoples of Southeast Asia have also been classified along linguistic lines. Despite recent criticism, the classification into Sino-Tibetan, Austro-Asiatic, Tai-Kadai, and Malayo-Polynesian is still widely used (LeBar et al. 1964, 1972). In most countries or regions we can distinguish one dominant lowland and several upland satellite societies.
Anthropological and sociological research has enriched our knowledge of the major societies that have been politically dominant in the formation of pre- and postcolonial states, but also of the large number of smaller societies from hunter and gatherer societies, like the Orang Asli groups in Malaysia, to larger groups, like the Shan and Kachin in Myanmar (Leach 1954), the various ‘hill tribes’ of Thailand, Laos, and Vietnam (Kunstadter 1967), or the matrilineal Minangkabau of West Sumatra (Kahn 1993).
1.5 Plural Societies
The in-migration of Arab, Chinese, and South Asian migrants in the wake of the expansion of a colonial economy has created what Furnivall termed a plural society, ‘with different sections of the community living side by side, but separately, within the same political unit’ (Furnivall 1948, p. 304). This ethnic pluralism was transcribed into colonial and postcolonial administrative practice, giving rise to a
classification of the population into Chinese, Malay, Indian, and others in colonial and contemporary Singapore or into indigenous Malays as ‘sons of the soil’ (bumiputra) and migrant communities in Malaysia. Thailand, in contrast, has followed a policy of assimilation, in which all minorities were classified as Thai, whereas in Indonesia the Javanese dominance was balanced precariously by allowing the construction of ethnic, but not political, pluralism in the form of a cultural symbolism of local art forms and dances. Except for Singapore, one common national language has been introduced and enforced with varying degrees of acceptance and use in all Southeast Asian states.
1.6 Moral Economy
Geertz and others have underestimated the dynamics of Southeast Asian peasant societies. Rebellions and peasant uprisings have continued well into the twenty-first century (East Timor, Aceh). The ‘moral economy of the peasant,’ illustrated by Scott (1976) with reference to Vietnamese and Burmese societies, indicates that a subsistence ethic governs peasant behavior. Only if the self-defined subsistence level is threatened have peasants rebelled. The moral economy also governs petty trade in the informal sector of most of Southeast Asia. The moral obligation of sharing work (gotong royong in Indonesia) and profits puts a burden on small traders, mostly women, who find it difficult to accumulate profits to sustain and expand their trading enterprises. There are, however, solutions to this ‘trader’s dilemma’ (Evers and Shrader 1994): where indigenous traders fail, immigrant-trading minorities like the Chinese take over the business. Otherwise, the indigenous traders themselves improve their moral status by forming or joining religious fundamentalist movements, like the Santri traders of Indonesia (Geertz 1965), to escape the pressure of moral obligations.
Often trade is carried out at a subsistence level, as by female petty traders, where the obligation to share meagre profits does not arise because no cash is accumulated. The informal sector of small handicraft production, petty trade, and other services is widespread in Southeast Asia; it has increased in Vietnam during the transformation from a planned socialist economy to a more market-oriented economy, but has declined in Singapore and Malaysia. This sector is playing an important role in cushioning the effects of land shortage in the rice-growing areas of Southeast Asia, of rural–urban migration, and of a general lack of social security provisions.
1.7 Big Men and Theater States
The Austronesian/Melanesian/Polynesian content of Southeast Asian culture becomes more evident the further one proceeds southeast through Indonesia and the
Philippines. Wolters (1982) has drawn attention to the ‘big man’ phenomenon, which he locates in Polynesia. The current emphasis on personal prowess, on the power of oratory, and on the admiration for superiors and leaders is put into this context. The state and its aristocratic bureaucracy are admired as much in Thailand as in Malaysia or Indonesia. The state in these countries is highly ritualized and has been named ‘theatre state,’ following Geertz’s analysis of the precolonial Balinese states (Geertz 1980). The state was largely a performance with little administrative or military power. It is debatable whether this analysis withstands the test of historical data, but at least the theme of vagueness, looseness, involution, and theatricality in some Southeast Asian societies pervades the anthropological literature.
1.8 Loosely Structured Social Systems and Involution
John Embree once classified Thai society as ‘loosely structured,’ thus producing a long-drawn controversy among social scientists (Evers 1980, pp. 163–98). Thai society has a cognatic kinship system, a shallow recognition of descent, and a high degree of individualism, but a highly regulated ‘bureaucratic polity’ and a tendency to form cliques and patronage. Another, equally unconvincing attempt at using single cultural traits or selected ethnographic details has stressed dualism as a common characteristic of Southeast Asian societies. There was a greater acceptance of the work of Geertz, who classified Javanese society as ‘involuted,’ by ‘overdriving of an established form in such a way that it becomes rigid through inward over elaboration of detail’ (Geertz 1963, p. 82).
This process applied mainly to wet rice agriculture and its tenancy system, but was seen at work by other scholars also in other regions like the Philippines or Thailand, or in the relatively slow process of Southeast Asian urbanization up to the middle of the twentieth century (Evers and Korff 2000).
2. Contemporary Southeast Asian Studies
2.1 Diasporas and Business Networks
Migrant communities, like the Overseas Chinese originating from Southern China, the South Indian Chettiar, or the Hadramaut Arabs, have been classified as trading minorities in view of their economic position in Southeast Asian societies. Others have migrated to Southeast Asia in large numbers at the beginning of the twentieth century to work in plantations, mines, and the construction sector. Chinese business has been studied in great detail. Singapore has become the centre of the Chinese business community (Chan and Chiang 1994, Menkhoff and Gerke
2001). Confucian ethics, patrilineal kinship-based networks, and the patriarchal organization of family firms have been held responsible for the economic success and predominance of the Nanyang Chinese. Persecution, discrimination, or conflict have strained interethnic relations in Malaysia, Indonesia, and Vietnam.
2.2 Social Class and Strategic Group Formation
Colonial Southeast Asian societies were ‘moulded on racial principles: belonging to the dominant white upper caste provided one with prestige and power largely independent of one’s personal capabilities. A strict ritual was introduced and maintained, by force when necessary, to preserve the white caste from contacts with Asiatics on a basis of equality and to maintain the former’s prestige as the dominant group’ (Wertheim 1968, p. 432). In the course of decolonization and modernization Southeast Asian societies have undergone a process of class formation (Evers 1973, pp. 108–31). The colonial upper classes were replaced by ethnically differentiated elites, in which indigenous aristocrats, Chinese and European business people, and the upper ranks of a fast-growing bureaucracy and military officers played a major role. The concept of the strategic group can be used to analyze this new situation more adequately. The pattern of strategic group formation differed from country to country. In Thailand, which had escaped the fate of colonialism, the growing bureaucracy, backed by sections of the military, revolted against the aristocracy and replaced the absolute monarchy in 1932 with a new regime, in which the strategic group of the military gained the upper hand, until a growing group of professionals, university lecturers, and students managed to establish a democratic government. Strategic group conflict continued, mediated by the electoral process, and put big business into power in 2001. Similar strategic group conflicts, rather than class struggle, characterized the situation in most countries, except perhaps in the socialist states of Vietnam and Laos, where party cadres, another strategic group, maintained power after the communist victory in the Vietnam War.
The partial reprivatisation of land and the deregulation of the economy introduced, however, new social processes from rural–urban migration to corruption of government officials and the formation of a group of newly rich businessmen. Class formation moved ahead and the growth of a middle class became a much researched topic in Southeast Asia (Robison and Goodman 1996, Chua 2000). The growth of a new middle class was seen as a basic requirement for democratization and the building of a civil society. Whereas in Thailand, the Philippines, and in Indonesia a civil society with a proliferation of nongovernment organizations (NGOs) developed and pressed for reformation and
the introduction of democracy, the prodemocracy movement was crushed in Myanmar, while the strategic group of military officers stayed in power. Middle classes have developed a special life-style of their own that blends Western consumerism with elements of Asian modes of living. Hawker centers, food courts, and markets exist alongside big department stores, pizza parlors, and hamburger outlets. The anthropology of Southeast Asian food has to be concerned with syncretism and life-styling as much as religious symbolism. Along with the demand for consumer goods, the demand for knowledge also increased. In fact, the ASEAN region has undergone a tremendous expansion of higher education. In the core states of Thailand, Malaysia, Singapore, the Philippines, and Indonesia the number of students and teachers, and of secondary schools, colleges, and universities, has risen dramatically. Malaysia and Singapore, in particular, are on the way to becoming knowledge societies, where economic and social development is increasingly knowledge driven. Most ASEAN leaders and governments have developed visions of the future that target a knowledge society as a way to achieve parity with the Western world. This vision is but one in a long line of ideas on how Southeast Asia should be seen, interpreted, and eventually constructed.
2.3 Family and Gender Relations
The small nuclear family is widespread, or has become the norm more recently, as in Vietnam (Pham Van Bich 1999) or among Singapore Chinese. The position of women has been regarded as strong throughout Southeast Asia, in particular in matrilineal societies like the Minangkabau (Kahn 1993). Demographic changes and the decline of fertility due to family planning have strengthened the position of women in Thailand, Vietnam, and Indonesia, giving them greater control over their own lives (Gerke 1992).
Industrialization and the employment of female factory workers, especially in the garment and electronics industries, have led to migration and chances for educational attainment but also to exploitation and harassment of women (Heyzer 1985, Sen and Stivens 1998).
2.4 Globalization, Urbanization, and Locality
Right into the 1980s Southeast Asia, except the city state of Singapore, showed only relatively low rates of urbanization. Since then urbanization has taken off, creating large urban conglomerates like the Greater Bangkok area, Jabotabek (Jakarta, Bogor, Tangerang, Bekasi), the Manila Metropolitan Area, and the Klang Valley in Malaysia including the federal territory of Kuala Lumpur, the new administrative city of Putrajaya, and the high-technology center
Cyberjaya. Southeast Asian urbanism has its roots in the Chinese-influenced walled city form, like Bangkok or Mandalay, the Malay or Muslim ports like Malacca, Banten, or Makassar, the court cities like Yogyakarta, and the colonial cities like Yangon, Singapore, or Manila (Evers and Korff 2000). Next to the city centers with their culturally globalized population, we find localities of migrant groups, still maintaining a sociocultural flavor of their own. The various Chinatowns, ‘Little India’ in Singapore (Siddique and Shotam 1982), or localities in Greater Manila (Berner 1997) should not be seen as remnants of traditional culture, but as newly constructed forms of urban society.
2.5 Cultural Analysis
Southeast Asian cultures as systems of beliefs, values, symbols, and presentations are as diverse as the ethnic plurality of the local and migrant population. This diversity was analyzed in detail in earlier anthropological studies (Koentjaraningrat 1985, Boon 1977, Lombard 1990, and many others). There are, however, some underlying trends that have been used to construct national cultures or an ASEAN identity. The family and its values, reaching decisions through consensus rather than majority vote, and a religiously grounded work ethic have been stressed in a debate on Asian values. Social scientists have taken up the cultural analysis of modernity (Hooker 1995). In his study of Thai, Indonesian, and Philippine middle-class culture, Mulder (1992) has stressed the reinterpretation of classical concepts for the construction of modernity. Knowledge of Southeast Asian societies is increasing fast (Halib and Huxley 1996). Most of the relevant books and articles are still produced by foreign scholars affiliated to universities or research institutions around the globe, but an increasing proportion of works are written by Southeast Asian nationals or by scholars attached to Southeast Asian universities.
An epistemic culture of local Southeast Asianists is emerging, contributing to a further construction of Southeast Asia as a sociocultural unit.
See also: Big Man, Anthropology of; Diaspora; Ethnic Groups/Ethnicity: Historical Aspects; Ethnicity: Anthropological Aspects; Plural Societies; Southeast Asian Studies: Culture; Southeast Asian Studies: Economics; Southeast Asian Studies: Gender; Southeast Asian Studies: Geography; Southeast Asian Studies: Politics; Southeast Asian Studies: Religion; Southeast Asian Studies: Society
Bibliography
Barry K (ed.) 1996 Vietnam’s Women in Transition. St. Martin’s Press, New York
Berner E 1997 Defending a Place in the City. Localities and the Struggle for Urban Land in Metro Manila. Ateneo de Manila University Press, Quezon City
Boon J A 1977 The Anthropological Romance of Bali 1597–1972. Dynamic Perspectives in Marriage and Caste, Politics and Religion. Cambridge University Press, Cambridge, UK
Chan K B, Chiang C 1994 Stepping Out. The Making of Chinese Entrepreneurs. Prentice Hall, New York
Chua B-H 1995 Communitarian Ideology and Democracy in Singapore. Routledge, London
Chua B-H (ed.) 2000 Consumption in Asia. Lifestyles and Identities. Routledge, London
Coedès G 1968 The Indianized States of Southeast Asia. East-West Center Press, Honolulu
Condominas G 1977 We Have Eaten the Forest. The Story of a Montagnard Village in the Central Highlands of Vietnam. Penguin, London
Evers H-D (ed.) 1973 Modernization in Southeast Asia. Oxford University Press, Singapore
Evers H-D (ed.) 1980 Sociology of Southeast Asia. Oxford University Press, Singapore
Evers H-D, Korff R 2000 Southeast Asian Urbanism. The Meaning and Power of Social Space. LIT Verlag, Münster; St. Martin’s Press, New York
Evers H-D, Schrader H (eds.) 1994 The Moral Economy of Trade. Routledge, London
Furnivall J S 1948 Colonial Policy and Practice. Cambridge University Press, London
Geels A 1997 Subud and the Javanese Mystical Tradition. Curzon, Richmond
Geertz C 1960 The Religion of Java. Free Press, New York
Geertz C 1963 Agricultural Involution. University of California Press, Berkeley, CA
Geertz C 1965 The Social History of an Indonesian Town. MIT Press, Cambridge, MA
Geertz C 1980 Negara, the Theatre State in Nineteenth-century Bali. Princeton University Press, Princeton, NJ
Gerke S 1992 Social Change and Life Planning of Javanese Women. Breitenbach, Saarbrücken and Fort Lauderdale, FL
Gomez E T 1999 Chinese Business in Malaysia. Curzon, Richmond
Halib M, Huxley T (eds.) 1996 An Introduction to Southeast Asian Studies. Tauris, London
Heyzer N 1985 Working Women in Southeast Asia: Development, Subordination and Emancipation. The Open University Press, Reading, UK
Hooker V M (ed.) 1995 Culture and Society in New Order Indonesia. Oxford University Press, Kuala Lumpur
Kahn J 1993 Constituting the Minangkabau. Peasants, Culture, and Modernity in Colonial Indonesia. Berg, Oxford, UK
Koentjaraningrat R M 1985 Javanese Culture. Oxford University Press, Singapore
Kunstadter P (ed.) 1967 Southeast Asian Tribes, Minorities, and Nations. 2 vols. Princeton University Press, Princeton, NJ
Leach E 1954 Political Systems of Highland Burma. London School of Economics, London
LeBar F M, Gerald H G, Musgrave J K (eds.) 1964 Ethnic Groups of Mainland Southeast Asia. HRAF, New Haven, CT
LeBar F M, Gerald H G, Musgrave J K (eds.) 1972 Ethnic Groups of Insular Southeast Asia. HRAF, New Haven, CT
Lombard D 1990 Le Carrefour javanais: essai d’histoire globale (3 vols.). École des Hautes Études en Sciences Sociales, Paris
Menkhoff T, Gerke S 2000 Chinese Entrepreneurship and Asian Business Networks. Curzon, Richmond
Mulder N 1992 Inside Southeast Asia: Thai, Javanese and Filipino Interpretation of Everyday Life. Editions Duang Kamol, Bangkok
Pham Van B 1999 The Vietnamese Family in Change. The Case of the Red River Delta. Curzon, Richmond
Robison R, Goodman D S G (eds.) 1996 The New Rich in Asia. Mobile Phones, McDonald’s and Middle-class Revolution. Routledge, London
Said E W 1978 Orientalism. Pantheon, New York
Scott J 1976 The Moral Economy of the Peasant. Yale University Press, New Haven, CT
Sen K, Stivens M (eds.) 1998 Gender and Power in Affluent Asia. Routledge, London
Siddique S, Shotam N P 1982 Singapore’s Little India: Past, Present and Future. ISEAS, Singapore
Tambiah S J 1976 World Conqueror and World Renouncer: A Study of Buddhism and Polity in Thailand Against a Historical Background. Cambridge University Press, Cambridge, UK
Wertheim W F 1968 Asian society: Southeast Asia. In: Sills D L (ed.) International Encyclopedia of the Social Sciences. Macmillan and The Free Press, New York, Vol. 1, pp. 423–38
Wolters O W 1982 History, Culture, and Region in Southeast Asian Perspectives. ISEAS, Singapore
H.-D. Evers
Southeast Asian Studies: Culture
1. Southeast Asia as a Cultural Region
The study of culture in Southeast Asia was from the beginning influenced by debates among Western scholars as to just what comprised the region. Before and after World War II, there was a tendency to merge the study of Southeast Asian culture with that of India, as evident in references to the region as ‘greater India.’ This approach highlighted the extensive borrowing that had taken place between parts of the region and India in the pre-modern era. But the characterization was not helpful for understanding the many local populations less affected by these exchanges, including Muslims, Southeast Asian Chinese, and nonstate peoples living in the region’s vast hinterlands. The characterization also failed to do justice to the selective ways in which pre-modern Indian influences have been integrated into the mosaic of modern Southeast Asia. In the 1950s and 1960s, the institutionalization of Southeast Asian studies in Western universities led some scholars to attempt to devise a less derivative list of cultural traits that might allow researchers to visualize Southeast Asia as a culture area in its own right. Early iterations of this approach identified a number of traits as prototypically Southeast Asian: (a) the high status of women, especially by contrast with China and India; (b) court ceremony and high arts, both with strong Indian influences;
(c) the resilience of ancestral and spirit cult traditions alongside world religions; (d) the prevalence of nuclear families and cognatic (or bilateral) kinship as opposed to extended families and unilineal descent systems; and (e) a marked contrast between the egalitarian cultures of upland dry farmers and the hierarchical ways of the wet-rice lowlands. Although this effort to bundle traits into a convincing cultural typology helped to distinguish the region from India and China, it also suffered from conceptual blind spots. Some Southeast Asian societies display few of these traits. Equally serious, typological approaches of this sort provide few clues as to how the elements that comprise any specific culture are integrated into a meaningful whole. In the 1970s, then, bundled typologies gave way to two new approaches to the study of Southeast Asian cultures: historical-dynamic approaches that examine the diffusion and integration of diverse elements into local cultures over time, and the ‘thick description’ of individual cultures in specific social settings.
1.1 Historical-dynamic Approaches to Cultural Variation
Dynamic approaches to Southeast Asian culture recognized that Southeast Asia lacked the cultural integration characteristic of China and India, but attempted to demonstrate that there was a unity to Southeast Asia in the way in which different populations integrated incoming elements into a unique cultural whole. In religion, for example, Southeast Asia today is home to large numbers of Muslims, Buddhists, and Christians, as well as small populations of Hindus (immigrant and indigenous). Alongside each of these religious communities, there are also strong traditions of shamanism, ancestral veneration, monistic mysticism, and guardian-spirit veneration. Rather than seeing this religious diversity as evidence of cultural incoherence, historical-dynamic approaches to Southeast Asian culture emphasize that local populations have selectively appropriated external cultural influences in a manner that reflects their own exigencies and constraints. This selective appropriation of cultural elements appears to go far back in time. Recent studies have demonstrated, for example, that from the first centuries of the Common Era to the thirteenth and fourteenth centuries, Saivite Hinduism was the predominant cultic tradition at courts across Southeast Asia. Just as Greco-Roman civilization provided a model of cultural excellence for courts and high-status people in the pre-Christian Mediterranean, Hindu civilization provided the symbols of distinction in law, literature, the performing arts, state ceremony, and religion. Even today, the aesthetic traditions and
ceremonial customs of many Southeast Asian societies show strong ‘Indic’ (Indian-influenced) inflections. Vietnam, the Philippines, and Southeast Asia’s nonstate hinterlands were not much involved in this cultural exchange. However, even in regions exposed to the full force of Hindu influence, there was clear selectivity in the appropriation of foreign elements. As with Hinduism in contemporary Bali ( Warren 1993), bits and pieces of India’s caste traditions were incorporated into religious ceremony and the social hierarchy, but no population in Southeast Asia ever implemented the full range of regulations operative among Indian Hindus in matters of diet, marriage, and everyday commensality. Similarly, the restrictions on women’s autonomy seen among Indian Hindus (or, when making the comparison for Islam, Middle Eastern Muslims) exercised less influence in Southeast Asia, where women’s public roles remained relatively high. Rather than ‘failed’ Hinduization, this resistance to caste and restrictive women’s roles showed the continuing influence of Southeast Asian notions of commensality and female autonomy. The dynamic approach to the study of Southeast Asian culture has also been used to examine key aspects of political and economic culture in Southeast Asia. In the 1960s and 1970s, political scientists and anthropologists used their understandings of local culture to assess the impact of nationalism and nation building on Southeast Asian societies. As with Clifford Geertz’s (1973) reflections on the ‘integrative revolution’ in postcolonial Indonesia, this research was initially concerned with the prospects for peace and integration in an unstable region. The nationalist project in Southeast Asia was seen as particularly daunting owing to the deep ethnic and religious divides in most countries. The concern for nation building has remained a key theme in analysis of Southeast Asian politics to this day. 
Rather than linking their research to questions of national stability, however, recent studies have examined the cultural terms through which the nation is ‘imagined,’ and highlighted the role of elites in manipulating the ‘invention’ of national tradition for their own ends (Keyes et al. 1994, Tonneson and Antlov 1996, Winichakul 1994). As these examples illustrate, one of the distinctive features of cultural analysis in Southeast Asia has long been that it is applied, not just to the usual domains of religion and the arts, but also to politics and economics. In Thailand, the political scientist Fred W. Riggs (1966) drew on anthropological studies of Thai society in assessing the prospects for sustained economic growth. Riggs characterized the Thai state as a ‘bureaucratic polity,’ suggesting that it was so thoroughly preoccupied with parasitic patronage that any sustained economic initiative was unlikely. Bureaucrats regarded Sino-Thai businesspeople as ‘pariah entrepreneurs,’ and restricted their operations accordingly. As Ruth McVey (1992) has observed, Riggs was right to emphasize the tensions between
political and economic culture in Thai society, and to imply that corruption might remain a brake on economic progress. However, Riggs failed to anticipate the speed with which capitalist culture took hold in Thai society in the 1970s and 1980s. Popular society and bureaucratic culture both proved more open to new styles of production and new modes of conspicuous consumption than cultural analysts had earlier predicted. In a related study of indigenous enterprise in Java and Bali, the anthropologist Clifford Geertz (1963) reached a different conclusion from Riggs. Geertz argued that there were extensive cultural resources for market development in both of these Indonesian societies. In Java, Geertz observed, a culture of enterprise was taking shape among Muslim entrepreneurs marginal to the local (court-based) status hierarchy. In Bali, by contrast, the carriers of the market revolution were aristocrats intent on displaying their distinction in market enterprise to neutralize challenges from upstart rivals. Although both applied cultural analysis to the study of economic change, Riggs’s and Geertz’s conclusions were, curiously, inverse images of each other. Riggs attributed too great a hold to culture, and thus failed to anticipate the flexibility Thai society would show in the face of new economic initiatives. By the 1980s, capitalist culture in Thailand showed an ‘increasing intimacy and equality between business and political leadership’ (McVey 1992, p. 22). If Riggs’s preoccupation with politics led him to exaggerate the conservatism of Thai culture, Geertz was too optimistic precisely because he was insufficiently attentive to the political. His gaze focused on local entrepreneurs, Geertz failed to notice the looming structural impediments to indigenous enterprise. Already in the late 1950s, the Indonesian military had become a major player in the national economy.
Concerned that an independent business class might challenge their power, after 1965 military and civilian elites channeled contracts and concessions away from indigenous entrepreneurs toward wealthy Chinese and cronies reluctant to protest against mandatory kick-backs and sweetheart deals. The native entrepreneurial class that Geertz had lionized declined precipitously. The Thai and Indonesian examples remind us of the dangers of seeing culture in too-static terms, or isolating it from the social and political worlds that constrain its development. Recent studies of the culture of capitalism in Southeast Asia have adopted just such an interactive approach to culture and sociopolitical structure in their efforts to understand the economic boom of the 1980s and early 1990s. Although attentive to the legacy of ethnic and religious culture, these studies also emphasize that culture is not a finished script or program, but an open body of knowledge continually reconstructed in the face of changing social and political realities (Hefner 1997, Li 1989).
Historical-dynamic approaches remain a key instrument in the study of Southeast Asian culture; they also serve to counteract the tendency to think of culture in purely local terms. Two historians, Denys Lombard (1990) and Anthony Reid (1988 and 1993), have made especially significant contributions to the study of Southeast Asian culture on such a grand scale. Tracing cultural developments in the arts, economy, politics, and religion since the early modern era, both authors show that Southeast Asian culture has long been affected by globalizing forces and transnational flows. Historical linguists (Collins 1996) have recently used a similar methodology to examine the diffusion of the Malay language across vast stretches of island Southeast Asia in the early modern era. Malay’s spread, these studies demonstrate, was intertwined with the expansion of Islam and Malay-based commerce. In these and other examples, historical studies reveal that the cultural variation seen in Southeast Asia is the result, not of slumbering societies lost in time, but of vibrant regional networks and globalizations with deep historical precedents. Southeast Asian culture was ‘globalized’ well before the late twentieth century.
1.2 Thick Descriptive Culture
Dynamic approaches to the study of culture arose through the efforts of historians and others to explain Southeast Asia’s diversity and to illuminate the creativity of Southeast Asians in appropriating external influences. Locally oriented, ‘thick’ descriptions of culture, by contrast, had less to do with Southeast Asian studies as such than with a methodological shift in the human sciences in the 1970s and 1980s. The shift was premised on a new skepticism toward general theory, especially the modernization variant, and a heightened interest in local lifeworlds and meanings.
The intellectual biography of Clifford Geertz, the single most influential writer on Southeast Asian culture, illustrates this subtle shift in the terms of cultural analysis. In his early years, Geertz had been a student of modernization theory, especially that version promoted by his two mentors, Talcott Parsons and Edward Shils. As in his Religion of Java (1960) and his The Interpretation of Cultures (1973), Geertz’s concept of culture at first owed much to the Parsonian model of culture as a system of meanings in ongoing interaction with institutions and patterned action (the ‘social system’). According to this model, rapid social change causes serious instability when it runs so far ahead of the cultural system as to make difficult the development of new systems of meaning capable of orienting collective behavior. In his analysis of religion in Java, Geertz identified three subtraditions (an Islamic, a syncretic, and an elite-aristocratic) and speculated that Indonesian social structure was changing more rapidly than these cultural traditions could accommodate. The resulting discontinuity between
structure and meaning, Geertz argued, was likely to give rise to serious religious conflict and political instability. Even in his early works, however, Geertz used other methodologies that showed great sensitivity to the nuances of local context and meaning. In the late 1960s and early 1970s, as interest in modernization theory waned, Geertz shifted his attention away from general theory and broad analyses of modernization toward the ‘thick description’ of local lifeworlds, performances, and meanings. If commented on at all, regional movements and transcultural flows were distant concerns at best. This was to be a hallmark of cultural research in Southeast Asia for the next 20 years. One salutary effect of this heightened skepticism toward general models and macrohistorical generalizations was that students of Southeast Asia were able to explore a range of topics neglected in earlier theories. There was, for example, an efflorescence of research on the interaction between local culture and great religious traditions. With its split gaze on Buddhism and indigenous religion, S. J. Tambiah’s (1970) study of Buddhism and spirit cults in northeast Thailand represents an early and paradigmatic example of this new approach. Rather than portraying local religion as a ‘little’ tradition playing catch-up to modernized forms, the study treated the ever-unfinished interaction of the two traditions as the key to religion’s vitality. Students of religion in Muslim Southeast Asia were slower to apply a similarly bifocal approach to Islam. As John Bowen (1993) has observed, they often neglected to examine the textual and normative bases of Islam, concentrating instead on syncretisms and exotic local customs. This neglect of the normative reflected the lack of training in Arabic and Islamic textual traditions among Southeast Asian specialists of Islam.
Some argue that it also reflected the conviction among certain scholars that, unlike Buddhism, Islam was only a nominal influence on Southeast Asian culture as a whole. In the 1990s, however, as thick description began to be complemented by cross-cultural comparison, researchers made the balanced examination of transregional Islam and local tradition a hallmark of revitalized Islamic studies (Bowen 1993). Another area of cultural research that showed new vitality in the aftermath of modernization theory’s demise was the study of women and gender. Because of the unusually public status of women even in traditional societies, gender had from early on been a focus of attention in Southeast Asian cultural studies, as illustrated in the pioneering works of Hildred Geertz (1961) and Rosemary Firth (1966). Influenced by feminist studies and social constructionist understandings of gender in anthropology, however, in the 1980s and 1990s gender issues moved into a far more central position in Southeast Asian studies as a whole. Whereas earlier studies had examined gender from the perspective of the household and family, the new
generation used gender as a point of entry to a broader and often more political array of concerns. Aihwa Ong (1987) looked at female factory workers to examine the dynamics of gender, capitalism, and Islamic resurgence in Malaysia. Pham Van Bich (1999) used changes in family and gender roles to provide a unique perspective on the achievements and shortcomings of the Vietnamese revolution. More generally, the study of gender in Southeast Asia has been used to challenge Western-derived understandings of the public and private spheres (Peletz 1996). With this heightened sensitivity to gender, and with the growing influence of subaltern studies in the new field of cultural studies, there was growing interest in the early 1990s in the study of marginal groups and in the cultural construction of ‘marginality’ itself (Tsing 1993). Some proponents of this research opposed all forms of comparative and translocal research, arguing that broad comparisons only did violence to local life worlds. As this research moved forward, however, it inevitably gave rise to its own comparative questions. Gender inequalities in one society raise questions concerning the conditions that facilitate gender equity in others. Marginality raises questions concerning citizenship and the processes by which local actors might control the terms of their participation in the public sphere. As these examples hint, the fine-grained thick description of the 1980s had, by the late 1990s, given rise to a renewed interest in cultural comparison, translocal communication, and globalization.
2. The Return of the Translocal
The effort to apply in-depth local cultural research to translocal and comparative issues is not a new one in Southeast Asian studies. As noted above, it was always a feature of cultural history in the region. Similarly, literary research has long shown a lively interest in the changing uses and meanings of texts as they move across time and social space (Sweeney 1987). In the late 1970s, one anthropologist attempted to carry out comparative ethnography, examining reformist Muslim psychology in Malaysia, Singapore, and Indonesia (Peacock 1978). It would be years, however, before other anthropologists would follow his model. Finally, and very important, the study of Chinese culture in Southeast Asia has always been a great exception to the tendency to ignore comparison or focus simply on the local. In both anthropology and history, studies of Southeast Asian Chinese have always examined the movement of people and ideas across national borders and sought to understand the way in which local environments inflect Chinese culture. Recent research shows that this concern remains as vibrant as ever (Dobbin 1996, Reid 1996).
Another important precedent for comparative and translocal research in Southeast Asian studies has been the study of the specifically cultural dimensions of nationalism. With its attention to the impact of vernacular education, colonial culture, religious reform, and new forms of public association on Malay political identity, William R. Roff’s (1967) study of the origins of Malay nationalism was an early and exemplary example of just such a line of research. Anthony Milner’s (1994) more recent study begins from a similar historical base, but shows an even more deliberate concern with the question of whether Southeast Asian modernity is unique or reflects political challenges general to the modern world. With its comparisons to Western debates on liberalism and communitarianism, Beng-Huat Chua’s (1995) study of culture and politics in Singapore shows a similarly refined balance between the study of local cultures and comparative reflection on the modern condition as a whole.
As researchers look to the future, it seems inevitable that efforts like these to balance local research with regional and cross-cultural comparison will become more common. Because many Southeast Asian societies have always been deeply multicultural, it has been difficult here to ignore comparisons or, conversely, dissolve specific cultures into a general analytic brew. At the same time, Southeast Asia’s position as a locus of civilizational interaction has helped to sensitize scholars to the fact that cultural ‘globalization’ is by no means a new phenomenon. Whether in the study of politics, gender, or economic culture, Southeast Asia also shows that trans-regional flows and globalization are not necessarily part of a long march toward a drearily homogeneous modernity. In this crossroads of civilizations, the most enduring cultural globalizations have always been those able to take root and ‘localize.’
Bibliography
Bich P V 1999 The Vietnamese Family in Change: The Case of the Red River Delta. Curzon, Richmond, UK
Bowen J R 1993 Muslims Through Discourse: Religion and Ritual in Gayo Society. Princeton University Press, Princeton, NJ
Chua B-H 1995 Communitarian Ideology and Democracy in Singapore. Routledge, London
Collins J T 1996 Malay, World Language: A Short History. Dewan Bahasa dan Pustaka, Kuala Lumpur, Malaysia
Dobbin C 1996 Asian Entrepreneurial Minorities: Conjoint Communities in the Making of the World Economy, 1570–1940. Curzon, Richmond, UK
Firth R 1966 Housekeeping Among Malay Peasants, 2nd edn. Athlone Press, London
Geertz C 1960 Religion of Java. Free Press, New York
Geertz C 1963 Peddlers and Princes: Social Development and Economic Change in Two Indonesian Towns. University of Chicago Press, Chicago
Geertz C 1973 The Interpretation of Cultures. Basic Books, New York
Geertz H 1961 The Javanese Family: A Study of Kinship and Socialization. Free Press, New York
Hefner R W (ed.) 1997 Market Cultures: Society and Morality in the New Asian Capitalisms. Westview, Boulder, CO
Keyes C F, Kendall L, Hardacre H 1994 Asian Visions of Authority: Religion and the Modern States of East and Southeast Asia. University of Hawaii Press, Honolulu, HI
Li T 1989 Malays in Singapore: Culture, Economy, Ideology. Oxford University Press, Singapore
Lombard D 1990 Le carrefour javanais: Essai d’histoire globale, 3 Vols. Editions de l’Ecole des Hautes Etudes en Sciences Sociales, Paris
McVey R (ed.) 1992 Southeast Asian Capitalists. Southeast Asia Program, Cornell University, Ithaca, NY
Mead M 1949 Male and Female: A Study of Sexes in a Changing World. Morrow, New York
Milner A 1994 The Invention of Politics in Colonial Malaya: Contesting Nationalism and the Expansion of the Public Sphere. Cambridge University Press, Cambridge, UK
Ong A 1987 Spirits of Resistance and Capitalist Discipline: Factory Women in Malaysia. State University of New York Press, Albany, NY
Peacock J L 1978 Muslim Puritans: Reformist Psychology in Southeast Asian Islam. University of California Press, Berkeley, CA
Peletz M G 1996 Reason and Passion: Representations of Gender in a Malay Society. University of California Press, Berkeley, CA
Reid A 1988 and 1993 Southeast Asia in the Age of Commerce, 1450–1680, 2 Vols. Yale University Press, New Haven, CT
Reid A 1996 Sojourners and Settlers: Histories of Southeast Asia and the Chinese. Allen & Unwin, St. Leonard’s, Australia
Riggs F W 1966 Thailand: The Modernization of a Bureaucratic Polity. East-West Center Press, Honolulu, HI
Roff W R 1967 The Origins of Malay Nationalism. Oxford University Press, Kuala Lumpur, Malaysia
Sweeney A 1987 A Full Hearing: Orality and Literacy in the Malay World. University of California Press, Berkeley, CA
Tambiah S J 1970 Buddhism and the Spirit Cults in North-east Thailand. Cambridge University Press, Cambridge, UK
Tonneson S, Antlov H 1996 Asian Forms of the Nation. Curzon, Richmond, UK
Tsing A L 1993 In the Realm of the Diamond Queen: Marginality in an Out-of-the-way Place. Princeton University Press, Princeton, NJ
Warren C 1993 Adat and Dinas: Balinese Communities in the Indonesian State. Oxford University Press, Kuala Lumpur, Malaysia
Winichakul T 1994 Siam Mapped: A History of the Geo-body of a Nation. University of Hawaii Press, Honolulu, HI
R. W. Hefner
Southeast Asian Studies: Economics
The ten countries of Southeast Asia (Brunei, Cambodia, Indonesia, Laos, Malaysia, Myanmar (also still known as Burma), the Philippines, Singapore, Thailand, and Vietnam), soon to become eleven when East Timor obtains its independence, have a population of approximately 503 million persons, and an aggregate gross domestic product (GDP) of some $578 billion.
Their population is similar to that of the entire Latin American and Caribbean (LAC) region, about 80 percent of that of Sub-Saharan Africa (SSA), and 80 percent larger than that of the Middle East and North Africa (MENA). Their combined economies are similar to totals for South Asia and MENA, about 90 percent larger than SSA, about 60 percent of that of China, and about 30 percent of the LAC total. With an average per capita GDP of $1,150 in 1998, the region is quite poor: about 50 percent above that of China, and almost treble that of India, but only about one-quarter of that of the large Latin American countries, Brazil and Mexico. The average, of course, conceals enormous intraregional differences. Indeed, apart from geographic proximity and membership of a regional association (ASEAN, the Association of Southeast Asian Nations), the countries share very few common characteristics. For example:
(a) They include countries both very rich (Singapore, with a per capita income of more than $30,000) and extremely poor (the four mainland Southeast Asian states, other than Thailand, have per capita incomes of less than $400, as does East Timor).
(b) They include economies which have been consistently among the most open in the world (Singapore, and to a lesser extent, Malaysia and Thailand) and those which, until quite recently, have been among the most closed in the world (the four mainland states again).
(c) There are mineral-rich economies, most notably the oil sultanate of Brunei, and also Indonesia and Malaysia, alongside the city state of Singapore, whose only real natural resource is its strategic location.
(d) Four of the ten countries have achieved rapid rates of economic growth since the 1960s, for at least three decades: Indonesia, Malaysia, Singapore, and Thailand. Three have generally performed very poorly: Cambodia, Laos, and Myanmar. The Philippines and Vietnam have an indifferent record over this period, taken as a whole, though for certain subperiods their performance has been quite good.
(e) The impact of the East Asian economic crisis, 1997–8, has been very uneven: Indonesia was very badly affected, as were Malaysia and Thailand, whereas the others, although affected to varying degrees, avoided economic contraction.
(f) One country is very large (Indonesia has a population of about 210 million people, the fourth largest in the world); three are sizable (the Philippines, Thailand, and Vietnam each have 60–80 million people); while two (Brunei and East Timor) have fewer than one million inhabitants.
(g) There are pronounced subnational differences in all the larger countries: Java and the ‘Outer Islands’, north and south Vietnam, and East and West Malaysia. Indonesia and the Philippines are the world’s largest archipelagic states. Bangkok is often mentioned as one of the most significant examples of a primate city.
This article draws attention to some of the stylized facts of the Southeast Asian economies, the lessons learned from past experiences, and some major ongoing analytical and policy debates.
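As a rough cross-check on the regional aggregates quoted at the start of this article, the following sketch divides the stated aggregate GDP (some $578 billion) by the stated population (approximately 503 million persons). The function and variable names are illustrative assumptions, not part of the original article, and the inputs are the article's rounded figures, so the result is only approximate.

```python
# Rounded regional aggregates as quoted in the article's opening paragraph.
POPULATION = 503e6   # persons (approximate)
GDP_USD = 578e9      # aggregate GDP in US dollars (approximate)


def per_capita(gdp: float, people: float) -> float:
    """Return GDP per person, in the same currency unit as `gdp`."""
    if people <= 0:
        raise ValueError("population must be positive")
    return gdp / people


# Hypothetical variable name; the value is simply GDP divided by population.
regional_average = per_capita(GDP_USD, POPULATION)
print(f"Regional per capita GDP: ${regional_average:,.0f}")
```

The quotient comes to roughly $1,149, which agrees with the average per capita GDP of $1,150 cited for 1998.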
1. The Record, Lessons Learnt
Economic performance and policy regimes in these countries have been episodic in varying degrees. Even high-growth Singapore experienced a mild recession in the mid-1980s, and a slowdown in 1998. The Philippines performed quite well until the late 1970s and during the mid-1990s but, like much of Latin America (with which it is often compared), it experienced a ‘lost decade’ during the 1980s. In the late 1980s Vietnam instituted significant liberalizing economic reforms under its ‘Doi Moi’ (renovation) banner, and in consequence its economic performance improved markedly. Indonesia swung from Sukarno’s dirigisme, nationalizations and hyperinflation in the first half of the 1960s to economic orthodoxy for most of the Suharto rule (1966–98). Even Myanmar effectively renounced its one-time ‘Burmese Road to Socialism’ and isolation in the late 1980s, although, owing to its poor human rights record, it remains an international pariah state. In such a heterogeneous region, generalizations are risky. Nevertheless, the following observations apply to many of the countries most of the time.
First, rapid economic growth has been associated with profound structural change. As incomes have risen, the share of agriculture in output and employment has declined, at first as a proportion of the aggregate and at higher incomes in absolute terms. In Malaysia and Thailand, agriculture now contributes only about one-tenth of GDP, whereas in Cambodia and Laos its share is still about one-half. The slower the growth, the slower the rate of structural change. In the Philippines, for example, the share of manufacturing in GDP has not risen appreciably since the late 1960s.
Second, sustained economic growth is associated almost invariably with improved living standards and reduced poverty.
In all the high-growth economies, there have been pronounced social improvements: education enrolments and life expectancy have risen, infant mortality and poverty incidence have fallen. Nevertheless, the key to poverty alleviation has been ‘growth plus’. While there is less agreement over just what constitutes the ‘plus,’ the core elements almost certainly include the provision of basic public goods (e.g., primary education and health), an ‘inclusive’ labor-intensive growth path (especially one which combines outward orientation), and a reasonably equitable distribution of wealth and social opportunity.
There are, of course, marked differences. The Philippines had a head start in educational standards, but over time the gap has narrowed. Regional differences also persist: Indonesia’s eastern provinces are mostly poorer and have grown more slowly; Thailand has been concerned about its lagging northeast region; the northeastern states of West Malaysia are poorer than the more prosperous west and south; the southern regions have been more dynamic than the north in Vietnam. Particularly in Indonesia and Malaysia, much of the focus on distribution has an ethnic dimension, an issue to which we return below.
Third, we have at least a proximate understanding of the causes of economic growth, in Southeast Asia and elsewhere. Economic growth has been associated positively with (and by implication caused by) at least the following factors:
(a) macroeconomic orthodoxy, resulting from an appropriate mix of monetary, fiscal and exchange rate policies, has achieved a predictable and low inflation environment;
(b) openness to foreign trade, investment, and technology (and in some cases, labor);
(c) financial development, including effective prudential supervision;
(d) reasonable distribution outcomes, and a community perception that the benefits of growth have been ‘fair’;
(e) a set of political and legal institutions which, while not necessarily enshrining well-developed democratic principles, has at least ensured some measure of responsiveness to community concerns, including judicial redress and protection of property rights; and
(f) the public provision, or the coordinated private provision of, physical infrastructure at world competitive prices.
The particular ‘fit’ of these propositions to each country has, of course, varied considerably. But the higher a country has ‘scored’ according to these criteria, the better its long-term performance has been.
Naturally, these factors do not take account of major insurrections (which plagued much of Indo-China until the late 1970s), serious political disturbances (a factor in Indonesia in 1965–6 and 1997–8 and the Philippines in the decade 1982–92), major seismological events (the Philippines again), or the systematic destruction of physical infrastructure (East Timor in late 1999).
Fourth, to varying degrees, these have been flexible economies, both in their policy regimes and their factor markets. Adverse exogenous shocks, such as a deterioration in the terms of trade, have resulted in compensating exchange rate and/or labor market adjustments. Indonesia in the early and mid-1980s is a good example of this proposition. When a strategy has not worked, or has been overdone (e.g., Singapore’s high-wage policy in the early 1980s, Malaysia’s high-tech ambitions in the 1980s), governments have introduced remedial measures. The presence of large numbers of migrant workers, often illegal, has facilitated labor market adjustment in Malaysia, Singapore, and Thailand. However, it is possible that some of these economies are becoming less flexible over time, as was evident in the crisis of 1997–8. Long-lived regimes in Indonesia and Malaysia were either unable or unwilling to introduce some of the bold reforms required by the international donor community. A similar problem afflicted the late Marcos period in the Philippines. Labor market regulations in Indonesia and the Philippines in the 1990s resulted in less flexibility. Vietnam’s reforms seemed to have run out of steam by the mid–late 1990s. However, these economies generally remain more open and flexible than those of Latin America and South Asia.
Fifth, globalization, regional integration, and national economic integration have all proceeded more or less hand-in-hand throughout the region, to varying degrees, since at least the late 1980s, and a good deal earlier in some countries. Each has been propelled by different sets of factors. More open policy regimes have been introduced as suspicion towards markets and fear of Western commercial domination have waned. Initial reform successes have spawned coalitions favoring continued reforms. International institutions such as the IMF and the WTO have to some degree underpinned the process. Differences within the region remain very significant, however. Singapore has removed practically all obstacles to merchandise trade, but not services. Malaysia too has a very open goods sector, though in 1998 it introduced controversial controls on short-term capital movements. By contrast, the four poor economies of mainland Southeast Asia have barely established formal tariff codification provisions, nontariff barriers are often ill-defined (and are sometimes applied arbitrarily), and foreign investors have limited judicial protection. Regional economic cooperation has been consistent with this general thrust toward increasing openness.
ASEAN has wisely avoided EU-style bureaucratic concentration in its headquarters, and discriminatory trade policy liberalization. The Association was, however, unable to play an effective role during the region’s recent economic crisis (or, for that matter, East Timor’s political crisis of 1999).
Finally, there are numerous caveats which might be attached to the growth record, but none invalidates fundamentally the propositions that growth is good for a country’s citizens, and that people are generally better off in richer than in poorer countries. Admittedly, the quality of urban amenities in Southeast Asia generally has deteriorated over the past three decades. But, through careful planning, ample resources, and strict controls over private automobile use, rich Singapore has one of the world’s highest-quality urban environments. (Although, as illustrated dramatically during Indonesia’s forest fires of 1997, it cannot control imported pollution.) Moreover, the most serious urban amenity problems seem to be evident where rapid population growth and slow economic growth intersect, as in the case of Manila. It is also probably the case that the quality of rural environments has deteriorated, as topsoils have been eroded, forests poorly (and corruptly) managed, mine wastes disposed of recklessly, and waterways polluted. Here also, economic growth may initially be a negative influence, but ultimately a positive factor, as incomes rise, and with them the resources to improve environmental standards. In addition, some exercises in ‘green accounting’ suggest that the economic growth in Indonesia and Malaysia may have been overstated by a percentage point or two once account is taken of the exploitation of nonrenewable natural resources, particularly oil and gas. Putting aside the heroic assumptions underpinning such calculations, here too it needs to be emphasized that economic growth in these countries was not explained primarily by resource depletion. Indeed, from the mid-1980s, growth accelerated in both Indonesia and Malaysia even as the share of the natural resource sector in the economy declined sharply.
2. Some Debates and Puzzles
While the general picture of economic development and its policy underpinnings may be clear enough, major analytical and policy debates abound, and there remain significant lacunae in our knowledge of the development process. The following are illustrative examples, by no means complete, of some of these gaps and intellectual challenges.
2.1 The Crisis of 1997–8: Origins and Implications
This crisis led to an unprecedented decline in output, of about 14 percent in Indonesia, 10 percent in Thailand, and 7 percent in Malaysia in 1998. It caused much social dislocation and hardship, it was instrumental in toppling a seemingly impregnable regime (Suharto’s Indonesia), and it shook the region’s self-confidence. It also caught governments, international development agencies, investors, and scholars off guard. Just when these groups were variously trumpeting the virtues of ‘Asian style’ development, making personal fortunes, or attempting to develop an intellectual framework for understanding the unprecedented high levels of growth, there was a sudden collapse. The crisis was not foreseen by anybody, notwithstanding much debate concerning the need for reform and the ills of corruption. There were some early warning indicators in Thailand, manifested in flight out of the capital market and the currency. Perhaps the biggest analytical conundrum was Indonesia: its economy remained buoyant right through to the general onset of the crisis in July 1997, and yet it suffered the most dramatic reversal of fortune in the
region, from a precrisis growth rate of about 8 percent to a contraction of 14 percent. There is still no consensus among the development policy community regarding the causes of the crisis and, in its wake, there are differences of emphasis concerning future policy directions. On the causes, technical explanations focus on: (a) the dangers of a fixed but adjustable exchange rate regime which, in the presence of marked differentials between domestic and international interest rates, led to massive unhedged international borrowings; (b) rapidly rising short-term external debt and, more generally, a large stock of highly mobile capital; and (c) a poorly regulated and supervised domestic financial sector, especially in the presence of an open international capital account. Political and institutional explanations emphasize the widespread corruption and cronyism found in much of the region, and the weak state of the legal system, and ‘civil society’ more generally. A third set of explanations draws attention to international factors, including the rapid build-up of volatile, short-maturity international capital flows, the weak state of the Japanese economy throughout the 1990s, and the heavy-handed conditionality imposed by the International Monetary Fund (IMF) once the crisis had hit. Extending the latter point, a fourth dimension to the literature seeks an explanation not so much in the precrisis ‘vulnerability’ indicators as in mismanagement following its onset: apart from inappropriate IMF medicine, this includes governments intent on protecting vested interests rather than reform (Indonesia and Thailand), inflammatory attacks on the international financial industry (Malaysia’s prime minister), and loss of control of monetary policy (Indonesia). Future policy directions are also the subject of much debate. Should countries allow their currencies to float? Will currency boards become attractive, as they were briefly to Indonesia’s Suharto in early 1998?
Or is Malaysia’s fixed exchange rate a preferred option? The latter will normally entail some sort of supervision over short-term capital controls, which Malaysia instituted for 12 months in September 1998, and which continue to find favor among some intellectual circles in the region. Should the financial sector be open to foreign banks, and can central banks be trusted to supervise this sector? How can effective legal systems be developed in countries with poorly paid and demoralized civil services?
2.2 The Philippines: Why an Outlier?
Just as Indonesia’s sudden economic collapse in 1998 is difficult to explain, so too is the Philippines’ poor economic performance longer-term. In the 1950s, this was the country of hope in the region. It was richer than Korea and Thailand. It had the best education record in East Asia after Japan, and a large English-speaking community. It had a functioning democracy, an independent judiciary, and a vigorous press. Its special relationship with the USA conferred upon it both security protection and privileged access to the US market. Yet, first the NIEs, then Thailand, and even Indonesia for a period, overtook it. Its real per capita income in 2000 was about the same as it was in 1980. What went wrong? As with the crisis, there is no simple or single explanation, and a smorgasbord of offerings is available. These include:
(a) Technical explanations, which focus particularly on poor macroeconomic management (manifested in consistently above-average inflation rates) and inward-looking trade policies, including high levels of protection and a disequilibrium exchange rate.
(b) Cultural factors, among them the stereotype of a ‘Latin American country located in East Asia.’
(c) Corruption and the ‘Marcos factor,’ particularly the nature of his regime in the last 6–8 years of his 20-year rule (1966–86); sometimes these arguments are extended to include the control by the alleged ‘20’ (or ‘40’, or some other number) ruling families, many of whom are viewed as nondevelopmental in some sense.
(d) Persistent insurrections, especially in the southern island of Mindanao, and the breakdown of law and order, resulting in frequent kidnappings of Filipino Chinese and foreign businessmen.
(e) Bad luck: In the mid-1970s, Marcos’s strategy of external borrowing during OPEC I, sanctioned by the World Bank, came unstuck at the end of that decade, not only because the funds were used unwisely, but also because of the unfortunate conjunction of rising world interest rates, further increases in oil prices, and collapsing export prices (first sugar, then coconuts). In the late 1980s, the precarious post-Marcos recovery was jeopardized by a succession of coups and natural disasters, and serious power shortages.
Ultimately, all these factors contribute to an explanation, but the puzzle remains, especially as some of these negative factors were also present in better-performing neighboring countries.
2.3 State vs. Market
Attempts to explain East Asia’s economic success, beginning with Japan from the late 1940s, have to confront the fact that rapid growth has been associated with extensive, micro-level government intervention. There continues to be a major debate over whether, and if so, how, this intervention has been growth promoting. The issue here is not concerned with government investments in physical infrastructure and the provision of social goods, which have usually generated high rates of return and promoted social cohesion. (Although large infrastructure projects have often been corruption-prone.) Rather, it is whether governments have demonstrated the capacity to ‘pick winners,’ to anticipate better than markets the industries of the future, and to manipulate interindustry (and sometimes interfirm) incentives accordingly. The empirical evidence in support of this proposition in Southeast Asia is weak. The rates of return on expensive high-tech programs, commencing in the region mostly around the early 1980s, have generally been low, or even negative, even allowing for long gestation periods and substantial learning exercises. The externalities (broader economy-wide benefits) have been minimal. This applies to virtually all such initiatives: Indonesia’s strategic industries (especially former President Habibie’s aircraft plant), Malaysia’s heavy industry program, and the Philippines’ major industrial projects. The problem with these programs was not so much corruption, though there was plenty of that, but a failure to appreciate the evolutionary nature of technological innovation and industrialization, and the fact that ‘great leaps forward’ in a bureaucratically-driven environment rarely succeed. More generally, there is very little support for the proposition that selective industry assistance (via tariffs, fiscal incentives, credit subsidies, and so on) has paid off. Indeed, very often, the most assisted industries have performed the worst. This is not to say that all forms of selective intervention in Southeast Asia have been unsuccessful. In Singapore, with its very open economy, well-paid bureaucracy, and fierce anti-corruption campaigns, it may well be that certain selective interventions have paid off, such as the goal of establishing the country as a major electronics base, and pushing the industry to upgrade quickly. Agricultural extension programs in several countries have induced farmers to innovate successfully, especially in the adoption of higher yielding varieties. The real challenge for the policy and research community in the region is to identify genuine cases of market failure, and to develop programs which do not exceed domestic bureaucratic capacities.
The weaker these capacities (and most in Southeast Asia at the micro level are weak), the more likely the problem of ‘government failure’.
2.4 Managing Ethnic and Regional Disparities
Southeast Asia is a region of great diversity, and one of its challenges is managing this diversity while maintaining the development momentum. Two dimensions are especially noteworthy, both longstanding but also painfully evident in Indonesia since 1997. One is ethnic diversity. One of the colonial legacies was that disproportionate wealth was in the hands of ethnic minorities, particularly in Indonesia and Malaysia where, because of religious differences, ethnic assimilation has been very limited. In both countries, the minority ethnic Chinese community (less than 30 percent of Malaysia’s population, and about 3 percent in Indonesia) traditionally has dominated the private, domestic, ‘modern sector’ economy.
Both countries have sought to maintain economic growth while experimenting with various affirmative action programs. Malaysia’s approach, following the serious ethnic disturbances of May 1969, was to develop explicit goals for asset redistribution, employment, and university entrance. Indonesia adopted a more informal style, which included the promotion of state enterprises (especially during the 1970s oil boom), a preference for pribumi (indigenous) groups in government procurement, credit, and cooperatives, and occasional presidential cajoling of Chinese businesses to redistribute some of their assets. The results have been mixed. There has been a major redistribution of assets in Malaysia, although the rising bumiputera share has been achieved mainly at the expense of foreign firms. Indonesia under Suharto witnessed the rise of several large pribumi conglomerates, mostly ‘palace-connected,’ and an expansion of pribumi managerial employment. It is sometimes argued that Malaysia avoided Indonesia’s traumatic ethnic tensions of 1997–8 precisely because it had earlier and comprehensively instituted these programs. But there have been costs: the well-connected have benefited disproportionately, a new class of rent-seekers has emerged in both countries, and it is not obvious that the above approaches have been the most effective policy instruments. Good quality, widely accessible public education is probably the most effective long-term redistributional strategy. Here, both countries score well, though Indonesia much less so on quality grounds. Much more could be done in the way of enforcing a more progressive tax structure, again especially in Indonesia. The fierce resentment toward Indonesia’s Chinese community during the recent crisis underlines the fact that governments need a strategy, and need to be seen to be doing something, but there are no ‘quick fixes’ in remedying such deep-seated disparities.
The second dimension, regional diversity, is equally complex. The record of regional development in the resource-rich countries of Indonesia and Malaysia has been fairly good, in the sense that some quite effective programs have been introduced to ameliorate sharp regional disparities. But in an era where nation states are under challenge as never before, regions with a rich natural resource base are asking why they have to surrender such a large proportion of revenue to the central government, especially when their own social indicators are sometimes lagging behind the national average. The problem has been particularly serious in cases where there is already a long-standing grievance against the central government, as in the Indonesian provinces of Aceh and Irian Jaya (the latter recently renamed Papua Barat, in an attempt to mollify local sentiment). East Timor was a special case, given its unhappy history under both Portuguese colonialism and heavy-handed Indonesian rule. But it is quite likely that other secessionist attempts will be mounted in Indonesia in coming years. Meanwhile, there will be
continuing tension through the region between central governments’ desire to maintain territorial integrity and national unity, and pressure from below for decentralization and greater autonomy.
2.5 Economic Growth and Political/Institutional Development A major puzzle is why and how some Southeast Asian countries (especially, but not only, Indonesia) were able to grow so fast for so long in the presence of extensive corruption, much bureaucratic incompetence, and a weak legal system. As a corollary, the literature extolling the virtues of a strong and able civil service, and invoking it as a key explanatory factor in East Asian growth, has been of little relevance to Southeast Asia, with the exception of Singapore, and to some extent Malaysia. Part of the explanation is that the high-growth Southeast Asian economies with these problems had competent macroeconomic management, at least until the first half of the 1990s. If governments get a few key policy variables more or less ‘right’ (fiscal, monetary, and exchange rate policy; reasonably open trade and investment regimes; a basic national education program; and a predictable policy environment), economies can ‘carry’ major deficiencies elsewhere. Another part of the explanation is that corruption is as much a distributional as a growth issue. Most civil service salaries in poorer Southeast Asian countries are below those in the private sector, especially for middle and senior officials, and thus small-scale corruption is a means, however morally reprehensible, of redressing the imbalance. Even grander corruption (for example, much of Indonesia’s huge forestry royalties have tended to accrue to well-connected individuals rather than the state) may not necessarily retard growth if the funds are largely recycled through the domestic economy, as in the main they were. (Of course, in consequence, public programs were starved of resources, with short-term distributional and long-term growth implications.)
Indeed, there is even a school of thought which, in seeking to explain the pronounced Indian–Indonesian growth differentials in the presence of similar levels of corruption, maintains that the latter’s corruption was more conducive to growth because the recipients were more likely to ‘deliver.’ But it is clear now that the past regime and practices are no longer sustainable. Perhaps corruption crosses some sort of ‘threshold level,’ when it so sours public support for a government (for example, the late Marcos and Suharto periods), that there are macro political ramifications. The presence of highly mobile international capital, which can suddenly leave the country if a political crisis develops, imposes a constraint on these governments which was hardly present in the 1980s and before. Continued IMF financial support for the crisis economies is also conditional upon major institutional reform. Moreover, weak legal and financial institutions may not be a problem so long as the growth momentum is maintained. But they become a problem in times of crisis: bankruptcy laws have to function in order that debt workouts may proceed, for example. Wide-ranging institutional reform is a complex and lengthy process. Democratization may hasten the process but it is not a simple cure-all. If indeed substantial reform is a necessary prerequisite for a durable recovery, the poorer countries of Southeast Asia may be in for a prolonged period of slower growth.
3. Summing Up The Southeast Asian countries are important because they are large in aggregate, strategically located, exceptionally diverse, and intellectually challenging. Some analytical and policy issues in the region, as elsewhere, are reasonably settled: social progress follows economic development, and a few factors are absolutely critical to successful development. But region-wide generalizations for Southeast Asia beyond these basic propositions are hazardous. Rich, high-tech Singapore is a world away from the grinding poverty of East Timor and Myanmar. Some countries in the transition from communist to capitalist development have yet to establish the institutions which underpin a modern state that participates in the global economy. Indeed, some of these states, particularly those on mainland Southeast Asia, are unsure how, and how far, to embrace globalization. The region’s largest state, Indonesia, is now grappling with its territorial integrity and possibly even its survival. For 30 years, ASEAN was the envy of the developing world. Now, quite suddenly, its economy is again growing slowly, there are serious social problems, and the region’s seemingly powerful cohesion has faltered. The age of uncertainty has returned to Southeast Asia, as illustrated not only by its short-term economic prospects but also by the diversity of opinion on the many important social and economic challenges it now faces.
4. Further Reading There is a huge literature on most aspects of the Southeast Asian economies. The bibliography contains some recent titles, which also include references to many earlier volumes. A 4-volume collection of leading articles on the subject will appear in Hill (2001). See also: Area and International Studies: Development in Southeast Asia; Colonialism: Political Aspects; Colonization and Colonialism, History of; Development, Economics of; Development: Rural Development Strategies; Development: Social-anthropological Aspects; Quantification in History; Southeast Asian Studies: Geography; Southeast Asian Studies: Politics; Southeast Asian Studies: Society
Bibliography
Booth A 1998 The Indonesian Economy in the Nineteenth and Twentieth Centuries: A History of Missed Opportunities. Macmillan, London
Gomez E T, Jomo K S 1997 Malaysia’s Political Economy. Cambridge University Press, Cambridge, UK
Hill H 2000 The Indonesian Economy, 2nd edn. Cambridge University Press, Cambridge, UK
Hill H (ed.) 2001 Southeast Asian Economic Development. Edward Elgar, Cheltenham, UK
Huff W G 1994 The Economic Growth of Singapore: Trade and Development in the Twentieth Century. Cambridge University Press, Cambridge, UK
Kyi K M et al. 2000 Economic Development of Burma: A Vision and a Strategy. Singapore University Press, Singapore
Leung S (ed.) 1999 Vietnam and the East Asia Crisis. Edward Elgar, Cheltenham, UK
Lucas R E B, Verry D 1999 Restructuring the Malaysian Economy: Development and Human Resources. Macmillan, London
McLeod R H, Garnaut R (eds.) 1998 East Asia in Crisis: From Being a Miracle to Needing One? Routledge, London
Phongpaichit P, Baker C 1995 Thailand: Economy and Politics. Oxford University Press, Kuala Lumpur, Malaysia
Vos R, Yap J T 1996 The Philippine Economy: East Asia’s Stray Cat? Structure, Finance and Adjustment. Macmillan, London
Warr P (ed.) 1994 The Thai Economy in Transition. Cambridge University Press, Cambridge, UK
Woo W T, Sachs J D, Schwab K (eds.) The Asian Financial Crisis: Lessons for a Resilient Asia. MIT Press, Cambridge, MA
H. Hill
Southeast Asian Studies: Gender Social science research on Southeast Asia only began to make its mark after World War II. The study of gender (the cultural practices and symbols which construct and assign certain roles for men and women) is very recent, and reflects its origins in Women’s Studies and Women in Development programs.
1. Southeast Asia and the High Status of Women The term ‘Southeast Asia’ (including Myanmar, Thailand, Laos, Cambodia, Vietnam, the Philippines, Singapore, Brunei, Malaysia, and Indonesia) can be dated to World War II, when it was used to refer to the theater of operations between China and India. Even in very early times, travelers saw this region as distinct from its larger neighbors, despite its cultural diversity. In 1944, in his Histoire ancienne des états hindouisés d’Extrême-Orient, the French scholar Georges Cœdès (1886–1969) identified a shared corpus of cultural features, among which was ‘the importance of the role conferred on women and of relationships in the maternal line.’ Since the post-war development of Southeast Asian studies, the high status of women has been frequently invoked as a unifying characteristic of the region. A number of reasons have been proposed to explain the relative equality of male–female relations in Southeast Asia. Among those most commonly cited are the importance of women in food production, especially rice cultivation; their prominence in marketing and other economic activities; complementary gender roles in ritual, and female prominence as healers and spirit mediums; the prevalence of bilateral kinship patterns, matrilocal residence and bridewealth; the fact that women commonly inherit family property and maintain their own source of income; low population densities prior to the nineteenth century, which placed a high value on women’s work and female fertility; the absence of a strong state structure and a low level of urbanization. Individuals who incorporated both female and male elements were also important in indigenous belief systems, although this aspect of gender has received less scholarly attention.
2. Gender in the Social Sciences Before World War II The foundations of social science research in Southeast Asia lie in the late eighteenth and early nineteenth century, when Europeans began to compile ‘scientific’ studies of the region. Apart from Thailand, all Southeast Asia was colonized by Europeans, whose research naturally focused on the areas under their control. By the late nineteenth century work on contemporary Myanmar, Singapore, Malaysia, and Brunei was almost solely the preserve of British scholars, with the Dutch concentrating on Indonesia, the French on Indochina (Laos, Vietnam, and Cambodia); and the Spanish and, after 1900, the Americans on the Philippines. Produced primarily by missionaries and colonial administrators, this research has provided valuable data for understanding gender relations, although that was rarely its focus. Colonial governments, by contrast, did produce periodic reports related to women, notably in regard to community health, literacy, and sex ratios, especially in countries that had sizeable migrant populations. The development of the new field of anthropology in the latter part of the nineteenth century prompted greater interest in local cultures and beliefs. Other social science fields considered the effects of capitalism and colonialism on indigenous society, but provided little room for the discussion of male and female relationships. Until the 1930s ethnographic literature on Southeast Asia was largely confined to studies of smaller
societies or tribal areas, but although the egalitarian nature of male–female relations was usually noted, gender issues were addressed only in passing. This began to change with the growing influence of structural anthropology. In the Netherlands, F. A. E. van Wouden’s 1936 dissertation on social structure in eastern Indonesia gave gender relationships a prominent place because of his focus on cross-cousin marriage, the exchange of women, and the connections between matrilineal and patrilineal descent groups. In mainland Southeast Asia Edmund Leach’s pre-war research on highland groups in Myanmar (published in 1954), strongly influenced by the work of Claude Lévi-Strauss, discussed the influence of cross-cousin marriages on constructions of hierarchy and equality in social relationships. In the 1930s the expanding fields of social and cultural anthropology also coincided with a new generation of academic women whose research focused on the family and the domestic economy. Working in Bali, Margaret Mead, one of the foremost anthropologists of her time, gave specific attention to child-rearing, education, adolescence, and sexuality. Another American anthropologist, Jane Belo, also analyzed family relationships within Balinese culture. A different approach was taken by Rosemary Firth in her investigation of Malay village housekeeping in 1939–40. Her socioeconomic study emphasized what has become a truism in descriptions of Southeast Asian societies: women are dominant in marketing and small trading, have a strong entrepreneurial tradition, and control domestic finances.
3. Gender and Southeast Asian Studies After World War II World War II represents a divide in the development of Southeast Asian studies. Not only did it witness the end of the colonial regimes; it also saw the escalation of Cold War hostilities, which enhanced the strategic importance of Southeast Asia. In the United States the government injected considerable funds into area studies, which led to a marked rise in social science research, especially on Thailand. Usually community-focused, this research encompassed a range of topics (nutritional intake, agricultural practices, child-rearing, the practice of magic, cultural values, etc.), which all necessitated an understanding of gender roles. Although comments on the relative equality between men and women in village societies became standard, specialists were cautious about generalizing beyond specific linguistic or ethnic groups. At most, they attempted to identify cultural features considered to be emblematic of a particular society (patron-client relationships in the Philippines, Thailand’s loosely structured social system), but these were rarely broken down in terms of gender.
During the 1950s and 1960s the disciplinary study of Southeast Asia was also introduced to (largely American) universities, but an interest in women’s issues was not evident until the 1970s. As the region became more prosperous, local governments began to appreciate the economic importance of Southeast Asian women as well as the need to involve them in rural development programs. In the West social science work on Southeast Asia was influenced by the women’s liberation movement and the development of feminist anthropology. The foundation issue of a new feminist journal included an article which identified the major social science research areas relevant to women in Southeast Asia (Papanek 1975), and it is worth noting that Michelle Rosaldo, co-editor of the classic collection, Woman, Culture and Society (1974) was a specialist on the Philippines. The United Nations Decade of Women in 1975–85 was a catalyst for work on women. The combination of development agendas, feminist-inspired research and international concern with the effects of economic change in the Third World gave rise to a new academic field, Women in Development (WID). Pioneering studies focused on Asian women (Ward 1963, Boserup 1970) reasserted the economic autonomy and the high status of Southeast Asian women. The demand for work with practical applications for social policy and human resource management legitimized new research and teaching programs on women in Southeast Asia itself. While primarily concerned with empirical observation and evaluation, the work of Southeast Asian sociologists and developmental theorists has been a salutary reminder that analytical frameworks developed in the West do not necessarily apply elsewhere.
4. Contemporary Trends Contemporary work on women and gender in Southeast Asia is unevenly distributed. The bulk of research has been carried out in Indonesia, where anthropology has traditionally been strong, and in Thailand, the Philippines, Malaysia, and Singapore. Work elsewhere is limited, although research on Vietnam is expanding. The persistence of country and local studies and the lack of comparative material make generalizations about trends in scholarship difficult. In moving away from macro theories of gender to concentrate on specific groups, social scientists are demonstrating the need to place all research in historicized contexts that take into account factors like ethnicity and class. Historians have also warned against the tendency to place gender relations in a context that sets an exploitative present against an unduly idealized past. By pointing to similarities between Southeast Asian societies and those of southern India and southern China, scholars have questioned the received wisdom that contrasts an essentialized Southeast Asia with
China and India. A recognition that the notion of ‘Southeast Asian women’ is a chimera has led on to the understanding that women’s speech does not necessarily reveal a woman’s point of view and that women do not always speak from the gender identity of women. The shift to the analysis of gender has made it clear that in order to understand women it is necessary to think about men in different kinds of ways, and be alert to the strategies by which masculinity is culturally created. Research on gay male identity in Thailand and the Philippines, for instance, has shown that male homosexuality, far from being an imported Western borrowing imposed on traditional sexual or gender norms, involves redefinitions that are located within a specific sex/gender system. A dominant paradigm in gender theory since the 1970s, the effects of globalized industrialization on the sexual division of labor, has been particularly relevant in Southeast Asia. In order to establish a manufacturing base Southeast Asian governments have used the lure of a cheap and docile work force to attract capital from the industrialized West. Much of the factory labor, especially in textiles, clothing, and electronics, has been provided by women, usually young and unmarried. From the 1980s a growing body of research, focused primarily on Indonesia, Thailand, Malaysia, and the Philippines, has explored the effects of economic globalization on women’s work. Social scientists have not reached a consensus regarding the extent to which Southeast Asian women have benefited from the new employment opportunities. Some have argued that although the nature of work has changed, gender hierarchies have been transferred to the factory floor and social norms which make women subject to family and patriarchal controls have been retained. On the other hand, analyses that depict women as victims of patriarchy and capitalism have been challenged by research that places greater stress on female agency.
A common theme has been patterns of resistance that assert female worker solidarity against a male management, and often result in small workplace triumphs. Less researched, however, are the expanding numbers of female entrepreneurs and employers among the middle class. The study of gender in contemporary Southeast Asian societies has reaffirmed the problem of drawing a divide between the domestic economy and the formal work sector. As in the past, the average woman maintains her own finances independent of her husband, and indigenous economic networks mean that work is often an extension of home-based activities. Again, no Southeast Asian model has been established, in part because of the economic variation across the region, even within countries. In considering gender issues and industrialization, for example, it may be more useful to compare contemporary developments on Java with Gambia or Taiwan than with another part of Indonesia where economic development has not occurred.
Widespread industrialization and rural migration to cities have drawn attention to urban poverty and the tendency of unskilled and unemployed women to drift into prostitution. But in Southeast Asia, as elsewhere, social scientists have argued that the presumed universality of a concept like prostitution requires interrogation if it is to be used in transcultural contexts. Women involved in the sex trade can certainly be seen as victims, but it is also possible that poor women see prostitution as a path for advancement, especially when the customers are white men or there is the possibility of becoming ‘temporary wives.’ The phenomenon of ‘mail-order brides’ and the international recruitment of female workers, especially from the Philippines and Indonesia, have also attracted research interest, often in an effort to garner public and official support for greater legal protection of victimized women. In affluent and urbanized Singapore, by contrast, sociological research on gender is concerned less with poverty than with examining how different ethnic groups deal with smaller families and the care of the aged. The rapid pace of economic change in combination with government goals for development has also stimulated studies of the effects of modernization on rural society, most notably in Indonesia, Malaysia, the Philippines, and Thailand. The literature can be roughly divided into officially sponsored projects of obviously practical application, and more critical and theoretical work, most commonly by anthropologists. Both categories have generated studies of women’s responses to family planning and population control, and examinations of the extent to which women’s lives have been changed by health and education.
In rural Southeast Asia, where it is generally accepted that premodern gender relations were relatively equal, some scholars have contended that increased mechanization and rural development programs have privileged men and disadvantaged women. Similarly, it has been alleged that the increasing value placed on formal educational qualifications and the spread of state health services undermines traditionally female areas such as midwifery and spirit healing. The argument that globalization has adversely affected women’s traditional status has led to some re-examination of cultural constructs of gender, especially in regard to religious teaching. The acknowledged place of sexuality and the male–female relationship in indigenous belief systems has encouraged greater attention to the effects of the male-dominated world religions on gender relations in Southeast Asian societies, and the extent to which imported ideas were localized to suit existing gender relationships. The debate has been most pronounced in Thailand, where textual statements denigrating women’s status in Buddhism can be contrasted with the prominence of women in the actual practice of religion. In Muslim societies the emergence of a more fundamental form of Islam that seeks a stricter application of shari’ah law
has obvious implications for gender relations, especially in Malaysia where Islam is the state religion. In the context of Islam’s adaptation to local custom, the combination of matrilineality, economic dynamism and Islamic commitment found among the Minangkabau in Sumatra and Negeri Sembilan in the Malay peninsula continues to attract researchers. Historians of Southeast Asia are now giving greater attention to gender considerations and are providing social scientists with a stronger basis for discussions of change. Research on the nineteenth and early twentieth century, for example, has explored the interactions between race and gender in the hierarchies created by the colonial state. Another line of inquiry concerns the position of women in Southeast Asian nationalist movements prior to World War II, since it is evident that the calls for female education and marriage reform reiterated by women’s organizations were rarely accorded priority by male nationalist leaders. These issues were not resolved with independence, and it is evident that the colonial heritage often has a direct bearing on the way modern governments conceptualize gender roles. State involvement in education and the economy has perpetuated notions of gender difference and hierarchy by socializing men and women into different roles. For instance, there has been little encouragement for women to enter political life, except in support of male activities. Though the activities of women’s branches of political parties have attracted some research (e.g., in Malaysia), this work has made no impact on mainstream descriptions of political systems. By contrast, the involvement of women in non-governmental organizations (NGOs) holds out the promise of theoretically stronger work, as researchers attempt to discern the extent to which NGOs replicate existing patterns of social hierarchy and gender, or act as agents of change.
Despite a general acceptance of gender complementarity in Southeast Asian societies, gender studies is still developing, and despite a growing interest in constructions of masculinity and gay identity, is currently dominated by work on women. Because of the array of cultures, languages, economies, religions, and modes of government, scholars of Southeast Asia will need to judge the extent to which they can move from case studies to larger generalizations. Specialists recognize that it is necessary to understand how gender operates within a particular social and cultural context before regional and global connections can be drawn. The similarities and contradictions that such investigations uncover may themselves be revealing, and should be actively identified rather than ignored. See also: Fertility Transition: Southeast Asia; Islam and Gender; Land Rights and Gender; Southeast Asian Studies: Culture; Southeast Asian Studies: Politics; Southeast Asian Studies: Religion; Southeast Asian Studies: Society
Bibliography
Andaya L Y 2000 The Bissu: A Study of a Third Gender in Indonesia. In: Andaya B W (ed.) Other Pasts: Women, Gender and History in Early Modern Southeast Asia. Center for Southeast Asian Studies, University of Hawaii, Honolulu, HI
Atkinson J M, Errington S (eds.) 1990 Power and Difference: Gender in Island Southeast Asia. Stanford University Press, Stanford, CA
Boserup E 1970 Woman’s Role in Economic Development. St. Martin’s Press, New York
Eberhardt N (ed.) 1988 Gender, Power and the Construction of the Moral Order: Studies from the Thai Periphery. Center for Southeast Asian Studies, University of Wisconsin, Madison, WI
Firth R 1943 Housekeeping among Malay Peasants. Athlone Press, London
Hicks D 1984 A Maternal Religion: The Role of Women in Tetum Myth and Ritual. Center for Southeast Asian Studies, Northern Illinois University, DeKalb, IL
Karim W J (ed.) 1995 ‘Male’ and ‘Female’ in Developing Southeast Asia. Berg, Oxford, UK
Keyes C F 1984 Mother or mistress but never a monk: Buddhist notions of female gender in rural Thailand. American Ethnologist 11(2): 223–41
Manderson L, Jolly M (eds.) 1997 Sites of Desire, Economies of Pleasure: Sexualities in Asia and the Pacific. University of Chicago Press, Chicago
Murray A J 1991 No Money, No Honey: A Study of Street Traders and Prostitutes in Jakarta. Oxford University Press, Singapore
Ong A, Peletz M G (eds.) 1995 Bewitching Women, Pious Men: Gender and Body Politics in Southeast Asia. University of California Press, Berkeley, CA
Papanek H 1975 Women in South and Southeast Asia: Issues and Research. SIGNS 1(1): 193–214
Rosaldo M Z, Lamphere L (eds.) 1974 Woman, Culture and Society. Stanford University Press, Stanford, CA
Van Esterik P (ed.) 1996 Women of Southeast Asia. Center for Southeast Asian Studies, Northern Illinois University, DeKalb, IL
Ward B E (ed.) 1963 Women in the New Asia. UNESCO, Paris
B. W. Andaya
Southeast Asian Studies: Geography As with the classical roots of the discipline of geography, geographical interest in Southeast Asia (in the Western intellectual tradition) can be traced back to the era of discovery and exploration when the mystical Orient first became exposed to European curiosity and conquest. From the beginning of the sixteenth century the expanding European presence in the region was paralleled by a growing need for information about the terra incognita of the Orient, and erstwhile geographers were at the forefront of the collection and classification of such information: indeed, to all intents and purposes their activities at the time were geography. The functional need for knowledge and understanding to support colonial
commerce and administration constituted what has been called the ‘encyclopedic trend’ in classical geography. Towards the end of the nineteenth century geography started to emerge as an educational and academic discipline, and this led to the systematic study by professional academics of regions such as Southeast Asia both in Europe and within the colonies as part of the colonial educational system. Thus, well before Southeast Asia emerged as a recognized region with the creation of the South-East Asia Command during World War II, the geographical study of phenomena and processes therein was already well established. Southeast Asian geography may be found at several points along a continuum between area studies and ‘pure’ geography, reflecting scholars’ degree of areal specialization. Thus the field includes Southeast Asianists who happen also to be geographers, scholars in mainstream geography departments who take a specialist interest in Southeast Asia, and geographers whose work (e.g., on globalization, industrialization, environmental degradation) includes Southeast Asia. Whilst the inclusion of such a range of scholars makes any categorical overview of Southeast Asian geography difficult, it is equally unhelpful to exclude people simply on the basis of either their institutional location or their degree of disciplinary/area specialization. Before World War II there would have been less of a dilemma in this regard. Up until the early 1960s the study of ‘regions’ and ‘areal differentiation’ constituted an important cornerstone of the geographer’s approach, and thus regional geographers with a specialist interest in Southeast Asia were in many regards ‘mainstream’ geographers. Furthermore, until approximately this time there were no specialist area studies centers in which Southeast Asian geographers plied their trade (see Halib and Huxley 1996, pp. 2–5).
In the United States, for instance, the first Southeast Asia program was initiated at Cornell University (and also at Yale University) in 1950 by an endowment from the Rockefeller Foundation, but geography was not a prominent discipline in this program (as is still the case today). Similarly, although the School of Oriental Studies was established in London in 1917 (becoming the School of Oriental and African Studies (SOAS) in 1938), it was not until Charles Fisher was appointed the first Professor of Geography with reference to Asia at SOAS, in 1965, that Southeast Asian geography became established there. Similar situations appertained in continental Europe, with geography being either absent from or peripheral to such prominent Asia-focused institutions as the Koninklijk Instituut voor Taal-, Land- en Volkenkunde (KITLV) in Leiden (specializing in linguistics and anthropology), the École Française d’Extrême-Orient in Paris (anthropology and history), and the Institut für Asienkunde in Hamburg (political, economic, and social developments in Asia).
In the early period, some of the most Southeast Asia-focused geographers were those who were associated with the colonial presence in the region and, later, the strategic imperatives of the Pacific War. A number of Dutch geographers (e.g., S. van Valkenberg, H. Th. Verstappen, F. J. Ormeling) received their first exposure to the region as colonial officers. The French geographer Robert Garry was originally a member of the Colonial Service in French Indo-China. Robert Pendleton was attached to the Philippine Bureau of Plant Industry from 1923 to 1935 during the period of US colonial administration, and later went to the Royal Department of Agriculture and Fisheries in Siam. Joseph Spencer joined the US Office of Strategic Services, and was later assigned to the Far Eastern Division as Assistant Chief. Karl Pelzer became part of a large US team concerned with the development of overseas territories. Some of the people who were to become leading scholars of the geography of Southeast Asia first became associated with the region as staff members of local universities, including the University of Rangoon, Raffles College Singapore (later to form the University of Malaya), the University of Hanoi, and the University of Phnom Penh. From newly established geography departments within these institutions geographers embarked on fieldwork, sometimes in conjunction with undergraduate students, or otherwise developed their research interests in their host territory. Prominent scholars in this regard included Pierre Gourou, Dudley Stamp, O. H. K. Spate, E. H. G. Dobby, and B. W. Hodder. Dutch geographical scholarship on the East Indies was a little more restricted by virtue of the relatively late (postwar) establishment of departments of geography in universities on Java.
Meanwhile, despite lacking a colonial presence in the region, German scholarship on Southeast Asia (e.g., Wilhelm Credner, Albert Kolb) dates back to the late nineteenth century and is distinctive for its ability to transcend national boundaries because of its lack of association with colonial territorialization. Another German scholar (later a naturalized American), Karl Pelzer, was one of the first to use the designation Sudestasien (Southeast Asia) in his work, some time before it became more popularly used to describe Mountbatten’s South-East Asia Command. In line with the requirements of war and colonial administration, and reflecting the prevailing orthodoxy in the discipline of geography, a central preoccupation of Southeast Asian geographers at the time was with gathering and presenting information, set within a defined regional or areal context and typically employing various cartographic techniques. A further preoccupation was the drawing together of the physical and human realms of life. Among the ‘area specialists,’ two particularly stood out. The American geographer Joseph Spencer specialized in the cultural geography of the region, and was strongly influenced by the classical American geographer, Carl Sauer. Spencer also had wide experience of Asia, and this clearly influenced his identity and approach. The distinguished British Southeast Asian geographer, Charles Fisher, also combined an engagement with contemporary concepts and processes with the insight and ‘feel’ of the area specialist. He wrote several articles on matters appertaining to political development and international relations within Pacific Asia, but his most significant contribution was his classic text South-East Asia: A Social, Economic and Political Geography (1964). Fisher had the ability to cover in equal strength matters appertaining to history, physical geography, politics, society, and economy, a talent few of his more specialized successors possess.
From the 1960s the discipline of geography entered a positivist era as it underwent a ‘quantitative revolution’ and remodeled itself as a ‘spatial science.’ Regional geography was pushed to the periphery of the discipline, although it was not so much the emphasis on area, space, and place that was responsible for its marginalization as the predominant use of descriptive and unsophisticated methods. Regional geographers and area specialists continued to be dogged by the tag of ‘chorology’ (the identification and description of physical forms and human activities): regional geography was no longer seen as the highest form of the geographer’s art. During the 1970s spatial science itself became problematic, not least because of the relegation of context and human values in geographical analysis. Accordingly, in the 1980s a ‘new regional geography’ emerged which recognized the importance of area, space, place, and locale in the understanding of problems and processes in contemporary society. A new approach to the study of regions emerged which was much more broadly based in the social sciences and in social theory.
One of the champions of this new regional geography, Nigel Thrift, was at the time engaged in research on Southeast Asia (especially Vietnam). Southeast Asian geographers have since made a full contribution to this new regional geography, and at the same time have regained a place and respectability within the discipline. This has been aided to a considerable extent by the recent transformation of Southeast Asia, which has increased its prominence within the global system and, at the same time, the level of academic interest. Geographers have been at the forefront of recent analyses of such processes as globalization and the global-local interface, resource exploitation and environmental degradation, regional integration, urbanization, migration, postcolonialism, and cultural change. Southeast Asia thus provides a rich domain for what Doreen Massey has described as the drawing into geography of the traditional realms of other disciplines. Rebecca Elmhirst’s article in Political Geography (1999) on transmigration in Indonesia is a good example of how geographers are drawing on, and helping to advance, concepts and ideas that have their roots in other social science disciplines.
The large majority of Southeast Asian geographers today can be found in mainstream geography departments, with many fewer in multidisciplinary area studies centers dedicated to the study of Southeast Asia. Most Southeast Asian geographers therefore have thematic specialisms that draw in Southeast Asia to varying degrees. By and large these are geographers rather than area specialists, the latter being identified by their greater degree of immersion in the region, linguistic capability, cultural awareness, post-Eurocentrism, etc., and their typical location in area studies departments. A perusal of the World Wide Web will reveal a large number of geographers in mainstream departments whose research and teaching draw on Southeast Asian examples; there are by comparison appreciably fewer geographers in dedicated Southeast Asian studies centers, typically seldom more than one, and quite frequently none. There are also a small number of distinctive ‘clusters’ of Southeast Asian geographers in disciplinary departments in several parts of the world. The following survey aims to convey a sense of the global structure of Southeast Asian geography, but is of necessity selective and thus far from exhaustive.
In Europe, the largest number of Southeast Asian geographers (approximately 30) is found in the United Kingdom, although only a handful are located in the country’s two Southeast Asian area studies institutes (SOAS and Hull). The remainder are scattered across the length and breadth of the country, with small clusters in Manchester, Liverpool, and the University of London as a whole. In general, the Higher Education Funding Council for England (HEFCE) provides funding for these posts, whilst the majority of funding for research derives from the Economic and Social Research Council (ESRC) and the Natural Environment Research Council (NERC).
The British Academy Committee for South-East Asian Studies also provides dedicated funding for research in and on the region. Elsewhere in Europe the main loci for Southeast Asian geographers (mirroring the situation almost a century ago) are in the Netherlands (with a major cluster in the University of Utrecht, and individuals in the Universities of Nijmegen, Amsterdam, and Eindhoven), France (with a major cluster in the Centre National de la Recherche Scientifique), and Germany (Hamburg). There has also been a recent growth in Southeast Asian geography in Scandinavia, with the field being particularly well represented in the Universities of Lund and Göteborg in Sweden, and Copenhagen University in Denmark. Within Europe there are also a significant number of Southeast Asia-focused scholars whose work in, for example, demography, urban studies, planning, tourism, and ecology clearly overlaps with that of mainstream geographers. For more information on Southeast Asian geography in Europe, see van Dijk (1998).
The situation in North America is similar to that in Europe, with the vast majority of Southeast Asian geographers being found in disciplinary departments and with negligible representation in some of the long-established Southeast Asian studies programs (e.g., Cornell, Northern Illinois, the East-West Center). Southeast Asian geographers are quite dispersed in the United States (e.g., small numbers at the University of California at Berkeley, the University of Michigan), but there are larger clusters in Canada, most notably the University of Victoria and the University of British Columbia, underpinned by strong financial support for research on Southeast Asia by the Canadian government. In Australasia there are significant clusters of Southeast Asian geographers at the University of Sydney (where there is a strong emphasis on Australian government-funded research on the Mekong subregion), Flinders University, and Monash University. Meanwhile, the principal area studies center, the Research School of Pacific and Asian Studies (together with the Department of Geography) at the Australian National University, contains one of the world’s most prominent groupings of geographers specializing in Southeast Asia. In New Zealand there are a number of Southeast Asian geographers in the universities of Canterbury, Wellington, Victoria, Massey, and Waikato. One of the most important clusters of Southeast Asian geographers is found in the region itself, and it is likely to become even more prominent in the future as a result of changes that are currently taking place in the scholarly world, particularly in connection with the study of ‘other’ societies. A Southeast Asia Geography Association was formed in 1990 in order to create a forum for academic collaboration and integration within the region.
There are several quite large geography departments in the region (e.g., in the University of Malaya, Universiti Kebangsaan Malaysia, Universitas Gadjah Mada, Universiti Brunei Darussalam, University of the Philippines, and the National University of Singapore), and research-active staff members most typically engage in research work in and on the Southeast Asian region. Historically, much of the earlier work undertaken by Southeast Asian scholars was somewhat ‘imitative of the Western academic tradition’ (McGee 1995), largely because they received their initial training from the Western scholars who staffed the region’s main university geography departments until as recently as the 1970s in many cases, and still today in the case of Singapore. However, in recent years, in the National University of Singapore in particular, there has been a strong resurgence of innovative and stimulating indigenous scholarship by local staff who have looked critically at the nature and consequences of, for instance, ‘globalization’ and ‘global-local interaction.’ Meanwhile, postcolonial tendencies in contemporary scholarship in the humanities and social sciences are forcing Western scholars to look critically at the stifling and arrogant Eurocentrism that has characterized a significant amount of work on Southeast Asia in the past (an accusation from which geographers are far from absolved) and are currently leading to a reappraisal of established scholarship and the search for indigenous theories and concepts. Work by geographers on urbanization, rural-urban interaction, regional integration, and cultural resilience/resurgence has been particularly prominent in this regard, with several Southeast Asian scholars being at the forefront of recent developments (see Savage et al. 1993, Forbes 1997, McGee 1995, Rigg 1998). A 1999 joint publication by Singaporean and Western scholars on the contested territories of globalization in Pacific Asia (Olds et al. 1999) is indicative both of the current direction of geographical scholarship on Southeast Asia and of the fusing of Asian and Western geographical minds.
Southeast Asian geography has therefore experienced fluctuating fortunes as the region itself has waxed and waned in its global prominence over the last century or so. From being at the disciplinary forefront during the colonial era, courtesy of the imperatives of colonial administration and economic transformation, it experienced a largely self-inflicted hiatus during the immediate postindependence era. The development boom of the 1980s, and associated processes of global/regional integration and environmental degradation, rekindled geographers’ interest in the region, and at the same time provided a powerful boost for local scholarship and institutions. The post-1997 economic crisis extended the post-boom shelf life of geographical research in and on Southeast Asia, and continuing processes of rapid economic, social, political, and environmental change should provide plenty of ammunition for future geographical enquiry.
Whether the enquiring geographical minds will be those of ‘area specialists’ or ‘area incidentalists’ is difficult to anticipate: the disciplinary mainstream is once again disparaging the parochialism of area studies. Meanwhile, educational reforms in a neoliberal environment are further limiting the space and scope for ‘areal dedication.’
Bibliography
Dijk K van 1998 European Directory of South-East Asian Studies. IIAS, Leiden, The Netherlands
Elmhirst R 1999 Space, identity politics and resource control in Indonesia’s transmigration programme. Political Geography 18(7): 813–35
Fisher C A 1964 South-East Asia: A Social, Economic and Political Geography. Methuen, London
Forbes D 1997 Metropolis and megaurban region in Pacific Asia. Tijdschrift voor Economische en Sociale Geografie 88(5): 457–68
Halib M, Huxley T (eds.) 1996 An Introduction to Southeast Asian Studies. Tauris, London
McGee T 1995 Eurocentrism and geography: Reflections on Asian urbanization. In: Crush J (ed.) Power of Development. Routledge, London, pp. 192–210
Olds K, Dicken P, Kelly P F, Kong L, Yeung H W (eds.) 1999 Globalisation and the Asia-Pacific: Contested Territories. Warwick Studies on Globalisation Series, No. 1, Routledge, London
Rigg J 1998 Southeast Asia: The Human Landscape of Modernization and Development. Routledge, London
Savage V T, Kong L, Yeoh B S A 1993 The human geography of Southeast Asia: An analysis of post-war developments. Singapore Journal of Tropical Geography 14(2): 229–49
M. J. G. Parnwell
Southeast Asian Studies: Health
Scientific medicine (biomedicine) has had a predominantly mechanistic, individual-oriented perspective, especially since the development of germ theory at the end of the nineteenth century. Health and illness are defined in terms of the presence or absence of pathogens in the body, and treatment relies on specific curative agents. An alternative approach, with an affinity to public health concerns, relates health and illness to the wider social and economic environment. It has its origin in the emergence of the science of political economy at the end of the eighteenth century and the ‘consideration of disease as a political and economic problem of social collectivities’ (Foucault 1980, p. 166). The periodized discussion below of health in Southeast Asia draws upon multidisciplinary scholarship that reflects this social health perspective.
1. Precolonial Period
Anthony Reid (1988) argues that the health status of people in the precolonial period was generally better than that of their contemporaries in the West, and attributes their good health to good diet and hygiene (due to the abundance of water and habit of frequent bathing, and to the open and elevated houses). The region was by no means free of the afflictions of disease: leprosy, syphilis, and smallpox were especially feared. However, he suggests there was only ‘a mild epidemic cycle.’ From the fifteenth to the seventeenth centuries, Southeast Asia was very open to world commerce but the habit of washing and the pattern of dispersed village-type dwellings probably limited the spread of the worst urban infections of Europe and India (e.g., plague and typhoid). Southeast Asian theories of disease causation (etiologies), diagnosis, and healing were characterized by their syncretism. Etiologies were either personalistic (disease caused by the intervention of supernatural beings) or naturalistic (disease caused by metaphysical imbalance) or a complex amalgam of both. Personalistic theories derived from a prehistoric and indigenous substratum of animistic beliefs. Some naturalistic theories were also indigenous, such as the belief in the need to balance ‘hot’ and ‘cold’ substances. Others were adapted by Southeast Asians from Chinese, Indian (Ayurvedic) or Arabic (Unanic) humoral medicine (Owen 1987). Modes of healing were similarly diverse and there was a great array of healers (astrologers, herbalists, bonesetters, masseurs and masseuses, midwives, exorcists, mediums, etc.). In seeking cures, Southeast Asians were pragmatic, using different modes of treatment in sequence or even concurrently (Owen 1987). How effective were these treatments? Certainly, Europeans were impressed by the variety and efficacy of herbal medicines and the relief provided by massage for rheumatism and soreness. Reid claims that the ubiquitous chewing of betel contributed to the prevention of tooth decay, indigestion, and dysentery (Reid 1988). Despite the relatively good health of Southeast Asians in this period, population was very low, with an estimated total of a little more than 22 million people in the year 1600 and a population density of only 5.5 persons per square kilometer. Reid’s theory (Reid 1988) is that this was due to endemic warfare caused by dynastic instability and succession wars. Battle casualties were actually minimal but crop destruction (resulting in famine) and the transportation of war captives led to great loss of life. Fertility was also probably low due to the constant need for flight during wars, with avoidance of further births until older children were able to run by themselves.
2. Colonial Period
In the nineteenth century, colonial governments became increasingly concerned about the heavy toll of death and disease in the colonies. However, initially this concern was limited largely to the high mortality and morbidity of the military. With the expansion of the colonial empires in the second half of the century medical services (including public health measures) were extended to the growing number of colonial administrators and other European expatriates. As colonial economic interests expanded (in the form of plantations, mines, and railways) concern grew for the health of the native populations, which was in large measure due to the realization that poor health affected the productivity of native workers. Thus, the government of British Malaya justified public health measures to control malaria on rubber estates in terms of the economic cost of the disease to planters. Also, the introduction of maternal and child health services in Malaya in the early twentieth century was based on an awareness of ‘the importance of biological reproduction to ensure the supply of labor’ (Manderson 1989). However, the extension of medical and health services to rural areas had ideological as well as economic functions. The apparent benefits of European ‘scientific’ medicine and public health served to legitimize the colonial state and to give it moral authority (Manderson 1996). In Europe, the development of public health accorded doctors a powerful role in society as hygienists and led to a number of authoritarian medical interventions (Foucault 1980). Medical policing was no less evident in the European colonies of Southeast Asia, especially in the twentieth century. In Indonesia, Dutch health authorities adopted draconian measures to fight outbreaks of the plague in Java in the first three decades of the century (Hull 1987). In Malaya, public health surveillance intruded increasingly into the domestic and personal domains to search for vermin, whitewash walls, and disinfect premises as well as to check infant feeding and hygiene (Manderson 1989, 1996). A noteworthy feature of the period of colonial rule was the rapid and prolonged population growth. The demographic growth began in about the mid eighteenth century and accelerated in the nineteenth century, reaching rates of natural increase of 1–2 per cent per annum. One theory is that the colonial pax imperica established an uneasy peace in the region, put a stop to internecine warfare and thereby reduced mortality. Other theories posit an increase in fertility arising from colonial demands for labor (in Java) and an increased demand for children. However, the evidence for these theories is inconclusive. Population growth in this period was definitely not matched by improved nutrition (there is some evidence of an actual decline in per capita food consumption) or reduced morbidity (Owen 1987). Morbidity and mortality were very high in colonial cities in the nineteenth century (at least for the predominantly non-European populations) (see Abeyesekere 1987 on Batavia, and Manderson 1996 on Kuala Lumpur).
Only in the twentieth century did the health situation improve noticeably. Despite some major plague and influenza epidemics, the overall frequency of epidemics declined. Public health measures (e.g., water supply, sewerage, swamp drainage) led to a reduction in cholera, typhoid, and malaria. Also, improved transportation enabled governments to ship food to areas hit by crop failures and epidemics (Owen 1987).
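To give a sense of scale to the rates of natural increase of 1–2 per cent per annum cited above, the following minimal Python sketch computes the implied population doubling time. It assumes a constant compound annual growth rate, which is a simplification introduced here for illustration rather than a claim made in the sources cited; the function name doubling_time is likewise hypothetical.

```python
import math


def doubling_time(rate_per_year: float) -> float:
    """Years needed for a population to double at a constant annual growth rate.

    rate_per_year is a fraction, e.g. 0.01 for 1 per cent per annum.
    """
    return math.log(2) / math.log(1 + rate_per_year)


# The two ends of the 1-2 per cent range quoted in the text.
for pct in (1, 2):
    years = doubling_time(pct / 100)
    print(f"{pct} per cent per annum -> doubling in about {years:.0f} years")
```

Under this assumption the population doubles in roughly 70 years at 1 per cent and roughly 35 years at 2 per cent, which illustrates why growth at these rates, sustained over a century, counts as rapid and prolonged.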
3. Postcolonial Period
By the late 1950s almost all the countries of Southeast Asia had achieved independence. Most of these countries became closely allied with the West. Postwar theories of modernization (e.g., Rostow’s Stages of Economic Growth 1960) attributed the underdevelopment of Third World countries to the lack of diffusion of Western values, capital, skills, knowledge, and technology. Vicente Navarro (1982), writing from the perspective of Marxist-oriented ‘dependency theory,’ has argued that the underdevelopment of Latin American countries, on the contrary, was caused by excessive cultural and economic dependence on the West due to the alliances between national urban elites and their Western metropolitan counterparts. From the health perspective this translates into self-serving, urban and elite-centered ‘enclave’ medicine that is reliant on Western medical education and technology. In Southeast Asia, Thailand provides an apt example of such dependency. Donaldson (1982) has highlighted the key role of the Rockefeller Foundation in the dominance of Western scientific medical education in this country and the way it stifled attempts in the 1930s to introduce a low-cost, rural-oriented Junior Doctor scheme. Others (e.g., Cohen 1989) have traced the subsequent growth of an elitist, urban-centered and entrepreneurial medical profession that has accompanied the expansion of the capital sector in Thailand.
Table 1 Child (under-5) mortality (deaths per 1,000 live births)
Country                    Per capita GNP ($US) (1995)    1960    1980    1997
Higher income countries
  Singapore                27,730                         40      13      6
  Brunei                   11,990a                        87      n.a.    12
  Malaysia                 3,890                          105     42      21
  Thailand                 2,740                          146     61      36
  Philippines              1,050                          102     70      42
  Indonesia                980                            216     128     59
Lower income countries
  Myanmar (Burma)          220a                           237     146     90
  Vietnam                  240                            219     105     51
  Laos                     350                            233     190     140
  Cambodia                 270                            217     330     131

Australia                  18,720                         24      13      8
Japan                      39,640                         40      11      6
a 1992
Source: UNICEF (1994); World Health Report (1998)
Socialism provided the philosophical basis for alternative health policies. In Vietnam, radical changes to health care accompanied the overthrow of French colonial rule in 1954, land reform, and agricultural collectivization. A socialist health policy was instituted based on principles of preventative medicine, combined use of Western and traditional medicine, and popular participation. The health system was highly decentralized and focused on the commune health center. Decentralization was consistent with socialist principles but was also a practical response to three decades of war (1945–75), including heavy US bombing in the last decade. It was the success of Vietnamese and Chinese innovatory public health policies that served as models for the World Health Organization’s primary health
care (PHC) approach and goal of ‘Health for All by Year 2000’ enunciated in the Alma Ata Declaration of 1978. Since then all Southeast Asian countries have instituted some kind of PHC system, and have given at least formal support to PHC principles of equity, popular participation, and affordable and culturally appropriate health care. This support was influenced in part by political unrest during the 1970s and by demands to redistribute wealth and income to satisfy the ‘basic needs’ of the poor (including improved living conditions and health). However, the level of political commitment to PHC throughout the region has been weak, except in Vietnam. Nevertheless, there has been a trend toward better health in Southeast Asia (except for a period in Cambodia, which experienced economic decline and massive loss of life during the rule of Pol Pot between 1975 and 1978). At the most basic level, this general improvement in health has been a consequence of economic growth. This is demonstrated in Table 1, which relates per capita gross national product to child mortality (the principal indicator used by UNICEF to measure human and economic progress). In the lower income countries, Vietnam has achieved faster reductions in child mortality despite its low level of income, a consequence of its strong political commitment to PHC and very successful PHC initiatives. The indicators also highlight the fact that there remain very significant differences in health between countries in the region, with Singapore and Brunei matching the industrialized nations of Australia and Japan, and with Myanmar, Laos, and Cambodia comparable in health status to the poor countries of Africa. However, national statistics for the higher income countries may disguise regions within countries with health status well below the national average, e.g., West Nusa Tenggara (Lombok, Sumbawa) in Indonesia, and Northeast Thailand.
The health status of urban areas in Southeast Asia is generally better than that of rural areas, in stark contrast to the pathogenic cities of the nineteenth century. In the postcolonial period, urban health has improved more rapidly than that of rural areas due to better access to safe water, sanitation, and health services. However, national statistics that compare urban and rural areas obscure sizeable pockets of urban poverty, and the existence of urban slums and squatter settlements. Here, appalling living conditions are ideal for the spread of mosquito-borne malaria and dengue fever, and for high levels of infectious and communicable diseases. The growth of urban slums and squatter settlements has been particularly pronounced in Malaysia and Thailand, which have experienced rapid economic growth based on urban manufacturing and rural–urban migration (see Cohen and Purcal 1995).
There are two diseases that have created serious health problems in Southeast Asia in recent years and threaten to reverse the general trend toward improved health in the region. These are malaria and HIV/AIDS; the one is resurgent and the other is emergent. Both are closely connected to poverty and human population movements. In the 1950s and 1960s malaria was in decline in the region due to DDT spraying and highly effective drugs such as chloroquine. However, since then there has been a significant resurgence of the disease. For example, in 1998, Thailand had 125,013 confirmed cases of malaria. This resurgence in Southeast Asia can be partly attributed to insecticide-resistant mosquito vectors and drug-resistant strains of malaria parasites.
Other causes can be found in the changing political economy of the region; for example, population pressure on land resources has forced land-poor and nonimmune families into circular migration in search of supplementary income in forested areas where they are exposed to malaria vectors (Singhanetra-Renard 1993). The first case of HIV infection in the region was reported in Thailand in 1984. Within a few years the disease had reached epidemic proportions, fueled by international and domestic sex industries (exploiting young women from the poor northern and northeastern regions) and by a large population of injecting drug users. By 1997, 780,000 people were HIV infected in Thailand. Here there is some evidence that the rate of growth had been slowed down by 1994 by government campaigns promoting public awareness and condom use. By this time, however, the disease was becoming rampant in other Southeast Asian countries, especially in Myanmar (440,000 cases of HIV infection) and in Cambodia (130,000 cases). The massive scale of population movement in the region, due to the integrative effect of global economic forces, has contributed greatly to the spread of the disease. In 1996 some 2.6 million migrant workers were circulating within the Asian region. Growth zones have encouraged the flow of people, goods, and ideas across international frontiers. For example, the Economic Quadrangle has established commercial and transport links between parts of Thailand, Laos, China, and Myanmar, and has led to an unprecedented movement across borders of poor people who are vulnerable to HIV infection (commercial sex workers, truckers, migrant workers, and ethnic minorities) in a way that greatly increases the chances of HIV transmission (Porter and Bennoun 1997).
Against a backdrop of general trends in health in Southeast Asia, some positive and some negative, I have presented an overview of health studies from a broad range of disciplines (history, anthropology, sociology, demography, and geography). What these studies have in common is a social health perspective. Theoretically, the work of some of these scholars can be identified with Marxist or Foucaultian political economy; others are less obviously associated with a particular theoretical approach but, nevertheless, relate health to political and economic changes in the region.
See also: AIDS (Acquired Immune-deficiency Syndrome); AIDS, Geography of; Colonialism, Anthropology of; Colonization and Colonialism, History of; Health: Anthropological Aspects; Public Health; Southeast Asian Studies: Society
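As a rough check on the comparison drawn from Table 1 in Section 3, namely that Vietnam reduced child mortality faster than the other lower income countries, the sketch below computes the percentage decline in under-5 mortality between 1960 and 1997. The figures are copied from Table 1; the choice of percentage decline as the yardstick, and the names UNDER5 and percent_decline, are assumptions made here for illustration and are not the author's stated method.

```python
# Under-5 mortality (deaths per 1,000 live births) in 1960 and 1997,
# copied from Table 1 for the lower income countries.
UNDER5 = {
    "Myanmar (Burma)": (237, 90),
    "Vietnam": (219, 51),
    "Laos": (233, 140),
    "Cambodia": (217, 131),
}


def percent_decline(start: float, end: float) -> float:
    """Percentage fall from start to end (positive means mortality fell)."""
    return 100 * (start - end) / start


declines = {
    country: percent_decline(y1960, y1997)
    for country, (y1960, y1997) in UNDER5.items()
}

# List countries from the largest to the smallest proportional decline.
for country, decline in sorted(declines.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country}: {decline:.0f} per cent decline, 1960-1997")
```

On this measure Vietnam's decline (about 77 per cent) is the largest in the group, ahead of Myanmar (about 62 per cent) and well ahead of Laos and Cambodia (about 40 per cent each), which is consistent with the reading of the table given in the text.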
Bibliography
Abeyesekere S 1987 Death and disease in nineteenth century Batavia. In: Owen N (ed.) Death and Disease in Southeast Asia: Explorations in Social, Medical and Demographic History. Oxford University Press, Oxford, UK
Cohen P 1989 The politics of primary health care in Thailand, with special reference to non-government organizations. In: Cohen P, Purcal J (eds.) The Political Economy of Primary Health Care in Southeast Asia. Australian Development Studies Network/ASEAN Training Centre for Primary Health Care Development, Canberra
Cohen P T, Purcal J T 1995 Urbanisation and health in Southeast Asia. In: Cohen P, Purcal J (eds.) Health and Development in Southeast Asia. Australian Development Studies Network, Australian National University, Canberra
Donaldson P J 1982 Foreign intervention in medical education: A case study of the Rockefeller Foundation’s involvement in a Thai medical school. In: Navarro V (ed.) Imperialism, Health and Medicine. Pluto Press, London
Foucault M 1980 The politics of health in the eighteenth century. In: Gordon C (ed.) Power/Knowledge: Selected Interviews and Other Writings 1972–1977. Harvester Press, Brighton, UK
Hull T 1987 The plague in Java. In: Owen N (ed.) Death and Disease in Southeast Asia: Explorations in Social, Medical and Demographic History. Oxford University Press, Oxford, UK
Manderson L 1989 Political economy and the politics of gender: Maternal and child health in colonial Malaya. In: Cohen P, Purcal J (eds.) The Political Economy of Primary Health Care in Southeast Asia. Australian Development Studies Network/ASEAN Training Centre for Primary Health Care Development, Australian National University, Canberra
Manderson L 1996 Sickness and the State: Health and Illness in Colonial Malaya. Cambridge University Press, Cambridge, UK
Navarro V 1982 The underdevelopment of health or the health of underdevelopment: An analysis of the distribution of human health resources in Latin America. In: Navarro V (ed.) Imperialism, Health and Medicine. Pluto Press, London
Owen N 1987 Towards a history of health in Southeast Asia. In: Owen N (ed.) Death and Disease in Southeast Asia: Explorations in Social, Medical and Demographic History. Oxford University Press, Oxford, UK
Porter D, Bennoun R 1997 On the borders of research, policy and practice: Outline of an agenda. In: Linge G, Porter D (eds.) No Place for Borders: The HIV/AIDS Epidemic and Development in Asia and the Pacific. St. Martin’s Press, New York
Reid A 1987 Low population growth and its causes in precolonial Southeast Asia. In: Owen N G (ed.) Death and Disease in Southeast Asia: Explorations in Social, Medical and Demographic History. Oxford University Press, Oxford, UK
Reid A 1988 Southeast Asia in the Age of Commerce, 1450–1680. Yale University Press, New Haven, CT
Rostow W W 1960 The Stages of Economic Growth. Cambridge University Press, Cambridge, UK
Singhanetra-Renard A 1993 Malaria and mobility in Thailand. Social Science & Medicine 37(9): 1147–54
UNICEF (United Nations Children’s Fund) 1999 The State of the World’s Children. Oxford University Press, Oxford, UK
World Health Report 1998 Life in the 21st Century: A Vision for All. World Health Organization, Geneva, Switzerland
P. T. Cohen
Southeast Asian Studies: Politics
The study of the politics of Southeast Asia, a subfield of comparative politics, developed as an area of academic research after the former colonies of the region regained their independence following World War II. Driven initially by a desire among outsiders to understand the political dynamics of societies with cultural and historical origins different from those of the West, the study has now become embedded in some national academies, with more work being pursued by local scholars wishing to apprehend the political forces at work in their own and neighboring polities. The development of the discipline has often been driven and shaped by the global security and economic interests of foreign scholars and governments, but more recently local concerns for democratization and the advancement of human liberties have emerged as a catalyst to research. Though this is a field of comparative politics, little of the published work has been truly comparative; rather, it has concentrated on political power, authoritative institutions, and constitutional issues within particular countries. The great political heterogeneity of the region, combined with its religious, cultural, and modern historical diversity, has posed severe constraints on any attempt to develop a truly comparative political theory of Southeast Asia. In several countries, the discipline has not developed to any meaningful extent because of the political constraints that are placed on independent analysis and public debate of governmental actions and political behavior.
1. Nationalism
The rise of nationalism as the colonial powers of Southeast Asia lost their desire and capacity to continue to rule in British-controlled Burma, Malaya, and Singapore, French-run Vietnam, Laos, and Cambodia, the American Philippines, and the Dutch East Indies (Indonesia) stimulated an awareness that these societies had internal political dynamics which were helping to shape the postwar international scene. The existence of powerful left-wing political forces in most of these nationalist movements, emerging at the same time as the United States and its European allies became concerned about the establishment of communist states beyond the Soviet Union, provided a political as well as an academic incentive for the development of Southeast Asian political studies. In the early 1950s, the first American university-based area studies centers devoted to the region began to study the origins and development of nationalism and communism. These early studies were predominantly the result of research and publication by American and American-trained scholars. In most cases, though, they turned for their historical and cultural understanding of these new polities to the research done by colonial officials and scholars of the pre-war era. The notion that the apparently unique and distinctive cultures of these foreign lands were overwhelmingly important in shaping their politics was very influential at this time. Earlier research summarizing the cosmological basis of Southeast Asian kingship, such as Robert Heine-Geldern’s (1942) essay Conceptions of State and Kingship in Southeast Asia, was frequently cited as a source of theories about the nature of the pre-colonial state which, by analogy, had applicability to the then-current period. Though nationalism was the predominant theme of these early studies, such as George McT. Kahin’s path-breaking Nationalism and Revolution in Indonesia (1952), religion and ethnicity were seen as key components of that political phenomenon.
Anthropological studies of Southeast Asian societies, which also began to emerge at about this time, had a significant impact on early political studies because of their insights into peasant political values and attitudes. Culture became the leitmotif of most political studies and the discipline as applied to the region tended to develop away from the mainstream political science topics of industrialized societies. The fact that almost all Southeast Asian economies in the 1950s and 1960s were predominantly agrarian in nature, and that the peasantry formed the largest social category, provided a material basis for this view. The American historical sympathy with nationalism, following from the dominant understanding of their own country’s historical evolution and morally justified struggle against colonialism, made these studies naturally sympathetic to the nationalist cause. In those countries where the left, and especially communist political parties, had a major influence, studies attempted to differentiate between the indigenous, anti-colonial roots of nationalism and the allegedly foreign inspiration and guidance of left-wing nationalist movements. Nationalism was not initially analyzed in broader ideological terms, and there was little attempt to distinguish among the often-contradictory origins and purposes of rival organizations and ideological predispositions within nationalist movements. The religious, ethnic, and geographical diversity of Southeast Asian societies, divisions that frequently paralleled economic and administrative boundaries, was often obscured in the national-level analyses that drove the nationalist paradigm. This paradigm was shared not only by most foreign scholars but also by national governments and capital-city based researchers. The findings of the postwar pioneers of Southeast Asian political research were summarized in the first, and still useful, major comprehensive review of the politics of Southeast Asia, the multi-author volume edited by George McT. Kahin, Government and Politics of Southeast Asia (1959, 1964). This highly influential work, with each country of the region discretely analyzed with no attempt at any comparative analysis with other states either in or out of Southeast Asia, became the model for most of the next two decades.
2. Behaviorism and the Collapse of Constitutional Democracy
As the states of the region regained their independence between 1945 and 1965, each initially established a form of constitutional democracy, often modeled on the political institutions of the former colonial power. These political systems quickly failed to achieve the idealistic intentions of their formulators. The failure of constitutional democracy in Southeast Asia thus became an early issue in political research. The rise of military and authoritarian regimes, even those which maintained the democratic facade of elections, became the next major research issue for students of the region. Herbert Feith’s The Decline of Constitutional Democracy in Indonesia (1962) was amongst the most influential of these works. The political failure of constitutional democracy was often
attributed to the belief that somehow the cultures of these countries were incompatible with the give and take of democratic politics. In peasant-based societies organized through strong patron-client ties controlling limited resources, the tolerance of criticism and debate that is part of democratic politics was argued to be missing but for cultural, rather than structural, reasons. Thus while political studies in Western societies tended to focus on interests and institutions of power and authority, political studies of Southeast Asia generally looked to cultural forces and historical patterns. The very variety and complexity of Southeast Asian societies, which themselves posed severe issues of political unity and control on the often uncertain and weak postcolonial governments of the region, were consequently somewhat overlooked or dismissed. The postcolonial states were attempting to hold together what in many cases were artificially created colonial entities containing sharply contrasting communities with diverse and not necessarily coherent interests with fewer financial, administrative, and military resources than the former colonial administrations could have mobilized. The development of authoritarian, and usually army-led or army-backed, governments in many nations in Asia, Africa, and Latin America during the height of the Cold War in the 1960s, led to a spate of studies on the role of the military in politics. Thailand, Indonesia, Burma, and South Vietnam became cases that were sometimes cited in the wider comparative literature at this time. Such studies tended to be highly descriptive and revealed little of the internal political dynamics of the societies in question. The divorce between Southeast Asian political studies and the dominant trends in the discipline in the United States and Europe during the 1960s was enhanced by the emergence of behaviorism as an increasingly dominant mode of analysis in Western political science. 
During that period many American, European, and Asian scholars were trained in these methods that, however, had little applicability to conducting research in Southeast Asian societies. The quantitative methods of behaviorism were difficult to apply to societies with different cultures, little studied histories and institutions, almost non-existent social statistics, and limited political rights and freedom of expression. The failure of behaviorism in Southeast Asian conditions inevitably drove the discipline toward other issues after the early dominance of nationalism as an analytical theme.
3. New Themes in Southeast Asian Political Research
The research interests and themes of scholars of Southeast Asian politics have broadened since the 1960s. The political roles of the bureaucracy, leadership, ideology, and political parties, as well as nongovernmental organizations, economic policies and forces, the law, and electoral behavior have emerged as themes which have helped to bring the study of Southeast Asian politics into a more comparative position with the study of politics and political economy elsewhere, particularly other areas of Asia. Most of this research has, however, remained country specific, with explanatory theories often rooted in broad concepts of ‘modernization’ or political culture and development. Some of the new themes in Southeast Asian political research began to point to theories that were thought to be applicable to other states in the region. One of the earliest and most influential was the concept of Thailand as a ‘bureaucratic polity.’ Fred W. Riggs (1966), in his Thailand: The Modernization of a Bureaucratic Polity, argued that the bureaucracy, including the army in that country, rather than being an instrument of the government as assumed in American and European studies, was the government. The fact that his hypothesis could be traced back to the pre-modern paradigms of the early Thai monarchy produced ideas of an indigenous Southeast Asian political form which might be applicable to other authoritarian states in the region. The development of research on patron-client relations, showing the political dependence of peasants and poorer urban elements on powerful ‘big’ men, and in turn the dependence of the patrons on maintaining the support of their clients in order to amass political power, reinforced many of the themes in Riggs’ ‘bureaucratic polity’ model. What was becoming increasingly clear in research of this kind across the region was that political power in Southeast Asian societies was usually derived from wealth, status, and office, rather than from the meritocratic, democratic, or egalitarian assumptions of most Western political theories. Benedict J.
Kerkvliet’s The Huk Rebellion: A Study of Peasant Revolt in the Philippines (1977) and James C. Scott’s Weapons of the Weak: Everyday Forms of Peasant Resistance (1985) were particularly influential in this genre.
3.1 Political Economy
Two developments within the economies of some of the countries of the region led to the emergence of theories of politics derived from political economy following the rapid, if partial, industrialization of first Singapore, and then Malaysia, Thailand, and Indonesia from the 1970s. Not only did this economic growth draw people from the countryside into the cities, but it also created stronger political, economic, and cultural linkages with the advanced industrial West as well as Japan, South Korea, and Taiwan. These studies were often criticized, as were earlier explanatory theories, for applying Western-derived,
neo-classical assumptions to situations that were still in many ways pre-market, as the bureaucratic dominance of Southeast Asian states distorted ‘normal’ political forces. A more fruitful line of research, however, was developed by those scholars who began to study the relationship between economic resources and political power. Anek Laothamatas (1992), for example, in his Business Associations and the New Political Economy of Thailand: From bureaucratic polity to liberal corporatism, demonstrated how the rise of commerce and industry in the Thai economy was undermining the earlier dominance of the bureaucracy in Thai politics. Elite transformations, as a consequence of economic development, became a theme for research as illustrated in The Philippine State and the Marcos Regime by Gary Hawes (1987). Considerable interest was attracted in the late 1980s and early 1990s, before the Asian financial crisis of 1997, on the so-called ‘developmental state’ theories which grew from attempts to understand the rapid industrialization of several East Asian economies. Some arguments of political economists drew inspiration from neo-Marxist dependency theories while others looked to varieties of Wallerstein’s ‘world system’ model. To most students of Southeast Asian politics, however, these theories were unacceptable because of their unwillingness to recognize the dynamics of domestic political forces as well as their Euro-centrism. The rise of arguments of political economy was fueled not only by changes in the region itself, but by intellectual trends in much of social science at this time to explain the ‘Asian economic miracle.’ The fact that this phenomenon affected only a minority of the polities of the region meant that the study of states with the least developed economies tended to stagnate. 
This was paralleled by a relative lack of interest in their politics as well as strong political inhibitions against independent research in the poorest Southeast Asian societies. Thus while the indigenous study of Southeast Asian politics grew stronger in Singapore, Thailand, Malaysia and, to a lesser extent, Indonesia, the discipline remained stagnant, if not nonexistent, in Burma, Laos, Cambodia, Vietnam, and Brunei.
3.2 Democratization
The symbiotic relationship between political and economic change and the growth of an indigenous political science can be seen in the development of research on the requirements for the establishment of democratic regimes in more recent Southeast Asian political studies. Just as the global political and intellectual preoccupations of Western students of Southeast Asian politics in the 1950s and 1960s shaped the early development of the discipline, the re-emergence of democracy as a global political value in the eyes of the West has shaped recent political research. Some of this research has been shaped by externally generated optimistic assessments of the speed at which societies can adapt to new political forms and expectations subsequent to earlier economic and social structural changes. At the same time, the existence of a larger core of local students of politics has ensured that external themes and preoccupations have not driven research away from local realities as happened in earlier years. The democratization theme has often concentrated on creating dichotomies of political forms (military/civilian, democratic/authoritarian, free market/planned economy) that can tend to obscure an analysis of the actual political forces and interests at work in a society. The volume The Politics of Elections in Southeast Asia (Taylor 1995) brought together a large amount of research which demonstrated the contradiction between the widespread practice of elections in Southeast Asia since independence, if not earlier, and the ability of governments to use democratic institutions to assert a powerful hold by the state over the population and other semidependent institutions.
3.3 Public Policy and National Security
Because of the threat of criticism of the political behavior of rulers which is often implicit in studies which examine how the politics of societies actually work, even in the most open of Southeast Asian polities, political scientists have often been forced to shape their findings to fit with what is politically acceptable. While very good research is often conducted, such as the useful volumes on Singapore (Quah et al. 1985, Government and Politics of Singapore) and Thailand (Xuto 1987, Government and Politics of Thailand), much such work tends to be historically oriented and essentially descriptive of institutions.
Official documents and speeches by government leaders are often accepted at face value as the justification of particular policies and courses of action. The structural bases of politics are largely ignored and there is little discussion about where power lies in society and with what consequences. The first American studies of Southeast Asian politics emerged from area studies centers established at major universities with government and foundation support in order to improve the understanding of the region’s politics so as to develop more effective internal and external security policy options. Similarly, there has developed in Southeast Asia since the 1960s a set of national political ‘think tanks’ from which much political research has been conducted and published. The first such institution to be established was the Institute of Southeast Asian Studies (ISEAS) in Singapore. Though largely funded and led in its first
years by Western foundations and scholars, its role in studying the security issues of Singapore and the region has brought it more closely in contact with government interests. During the Cold War, much of the political research of ISEAS and similar centers that were subsequently established in Kuala Lumpur, Jakarta, and Bangkok centered on issues of relations with China, other communist powers, and internal security topics. With the ending of the Cold War, and the rapid industrialization of some of the region’s economies, economic policy issues have also come under scrutiny by students of politics in the region. In the style of the Association of Southeast Asian Nations (ASEAN), the region’s leading international organization, the scholarship emanating from these institutions refrains from criticizing the policies and politics of neighboring countries. The tensions in regional relations arising from the Asian economic crisis of 1997, as well as the establishment of governments with regard for human rights in the Philippines and Thailand in the latter half of the 1990s, placed further strains on this shared policy of self-restraint in research and publication. With the expansion of ASEAN to include all ten countries of Southeast Asia, similar think tanks have been established in Vietnam, Cambodia, Laos, and Myanmar, but the political constraints on their activities mean that no genuinely independent research by indigenous scholars on these polities has been published within their countries.
3.4 International Politics
International politics and international relations studies have not been central to the development of Southeast Asian political studies.
From the earliest post-independence period, most studies of the region have centered around the themes of the foreign policies of particular states, the relations of one or more of the former superpowers with the region as a whole, or the development of regionalism with the formation of ASEAN as a long-standing regional body. Most studies of each country’s foreign policy making and purposes have relied heavily on an analysis of the motivations and interests of national leaders. Foreign policy has been seen less as a national political process, or as the pursuit of national interests, and more often as an extension of the personality of one of the region’s long-dominant leaders.
4. Discourse Analysis
The diversity of Southeast Asia has made the development of a coherent discipline of Southeast Asian political studies impossible. The field is not a large one and there are few scholars. In the approximately 50 years since the first postcolonial studies began to emerge, however, much progress has been made. While the early studies tended to be highly descriptive and to be the work of scholars who saw themselves as clearly outside their object of study, more recent and increasingly sophisticated studies by both foreign and indigenous scholars have begun to analyze the languages of politics and their diverse meanings in particular circumstances. The depth of cultural and linguistic knowledge required for such work has required Southeast Asian political studies to become in many ways multidisciplinary. Historians, anthropologists, and students of the region’s various literatures have contributed as much to this mode of political apprehension as have political scientists.
5. Conclusion
Southeast Asian political studies are still in their infancy despite 50 years of development. From the early descriptive and largely derivative studies of the late 1940s, to the more in-depth, locally aware studies of the 1990s, the field has moved away from a vague and amorphous concept of culture as a catchall for political explanation to an understanding which grows out of the region itself. As the field has become increasingly rooted in Southeast Asia, it has become increasingly a field of study with implications for the development of the discipline more widely.
Bibliography
Feith H 1962 The Decline of Constitutional Democracy in Indonesia. Cornell University Press, Ithaca, NY
Hawes G 1987 The Philippine State and the Marcos Regime. University of Hawaii Press, Honolulu, HI
Heine-Geldern R 1942 Conceptions of state and kingship in Southeast Asia. Far Eastern Quarterly 2(1): 15–39
Kahin G McT 1952 Nationalism and Revolution in Indonesia. Cornell University Press, Ithaca, NY
Kahin G M (ed.) 1964 Government and Politics in Southeast Asia, 2nd edn. Cornell University Press, Ithaca, NY
Kerkvliet B J 1977 The Huk Rebellion: A Study of Peasant Revolt in the Philippines. University of California Press, Berkeley, CA
Laothamatas A 1992 Business Associations and the New Political Economy of Thailand: From Bureaucratic Polity to Liberal Corporatism. Westview Press, Boulder, CO
Quah J S T, Chee C H, Meow S C 1985 Government and Politics of Singapore. Oxford University Press, Singapore
Riggs F W 1966 Thailand: The Modernization of a Bureaucratic Polity. East-West Center Press, Honolulu
Scott J C 1985 Weapons of the Weak: Everyday Forms of Peasant Resistance. Yale University Press, New Haven, CT
Taylor R H 1992 Political science and South East Asian studies. South East Asia Research 1(1): 5–26
Taylor R H 1995 The Politics of Elections in Southeast Asia. Cambridge University Press, New York
Uchia T (ed.) 1984 Political Science in Asia and the Pacific: State Reports on Teaching and Research in Ten Countries. UNESCO Regional Office for Education in Asia and the Pacific, Bangkok, Thailand
Xuto S (ed.) 1987 Government and Politics of Thailand. Oxford University Press, Singapore
R. H. Taylor
Southeast Asian Studies: Religion
From their first appearances in Southeast Asia, the world religions have been intimately linked to states and statecraft, and so they remain. This organizing thesis will be used to demonstrate different disciplinary approaches to the study of religion. Archaeological and historical research on the world religions in this region reveals, at the same time, the dawn here of state-level polities. Historical research shows the modern nation-states of the region emerging in the context of a colonial order in which the politics of religion were often formative. The predominant approach to the study of religion in contemporary Southeast Asia, however, combines ethnographic with textual studies. Scholars in religious studies, who once might have focused mostly on textual exegesis, increasingly couple this with contextual participation and observations. Anthropologists, who once considered participant observation the only method they needed to master, now realize that they have to be conversant, too, in the textual legacy of a particular faith if they hope to understand people’s religious practice. The renowned diversity of Southeast Asia owes a great deal to religious differences. On the mainland portion, Buddhism predominates, divided between the Mahayana form found in Vietnam (where it both blends and vies with a Confucian tradition), and the Theravada form practiced in Burma, Thailand, Laos, and Cambodia. Insular Southeast Asia is home to the world’s largest Muslim country, Indonesia, and Islam holds a commanding presence in Malaysia and the southern Philippines as well. Christianity is the majority faith in the Philippines, and many minorities throughout Southeast Asia, especially those located in the mountainous uplands, also follow this faith.
Nowhere in the region have these world religions completely displaced older traces of Hindu cosmology or religious practices, older still, based on the veneration of ancestors and the propitiation of local spirits. The relationship between states and world religions has been reciprocally advantageous: states or rulers gained legitimacy by association with religious symbols; religious institutions benefited materially from the state’s largess, gaining the resources for additional propagation or reproduction of the faith. The world religions in Southeast Asia have comprised the primary criterion of civilization, their absence always marking the terrain of the primitive. Textually based, these world religions demanded literacy among at least some members of a society and entailed systems of formal education. The distribution of the world religions has thus signaled both the limits of state power and of cultural domestication. The animistic religions of upland or relatively isolated peoples have, therefore, been in retreat for two millennia, as states have grown increasingly capable of penetrating daily life at their peripheries as well as in their centers. Religion in Southeast Asia surely does not reduce to politics, but the patchwork distribution of religions in this region and the dynamics of religious change in the present are hardly comprehensible apart from issues of power, especially state power.
1. Theocratic Formations in Early Southeast Asian States
The climatic conditions over most of Southeast Asia hamper archaeological preservation. Despite that obstacle to methods based on excavation, inferences from stone architecture and sculptures, as well as inscriptions associated with these, reveal a close relationship between politics and piety (Fontein 1990). One of the most innovative recent attempts to mine architectural evidence is the exhaustive and exacting measurements undertaken at Angkor Wat, measurements which confirm the close relationship between gods, kings, and cosmology in the classic Khmer world (Mannikka 1996). The earliest polities deserving the term ‘state’ in this region began to take shape around the first century CE. They have been termed Indic, owing to the predominance of Indian influence in their architecture and religious statuary, yet this was not an instance of conquest or political colonization. Rather, it appears that kingdoms such as Srivijaya, Champa, Khmer, Pagan, Majapahit, and others appropriated Hindu and Buddhist symbolism via contacts that grew along the trade routes that crossed Southeast Asia linking the circum-Mediterranean world to India and China. Religious monuments such as Angkor Wat, the Borobudur, and Prambanan represent only the most spectacular architectural traces of Indianization. Smaller temples and ritual complexes, such as those scattered throughout East and Central Java and along the Vietnam littoral, bear mute witness as well to the spread of these world faiths. Rulers sometimes adopted one variant in contrast to the religions of neighboring or past regimes, a devotion to Shiva, Vishnu, or the Buddha characterizing different Khmer rulers, for example. At other times, even Buddhism and Hinduism were regarded as non-exclusive. Both permitted avenues for deifying the ruler as a god-king.
These early states have been termed mandalas or galactic polities, connoting a focus on an exemplary center where humanly constructed spaces and rituals mirrored cosmological and astronomical orders. Angkor Wat’s highest peak represents the axis mundi, and its various measurements inside and out encode astronomical cycles, mythical events, and cosmological relationships, sometimes simultaneously (Mannikka 1996). Although measurements such as these would have been esoteric, the bas reliefs and statuary, plainly representational and didactic, would have instructed even illiterate persons who wanted to learn or remember stories about the gods. Amassing the materials and especially the labor to accomplish these massive constructions indexed at once a ruler’s wealth and power as well as his devotion and piety. More than mere devotees of the gods, however, kings united mystically with them, becoming themselves earthly manifestations of Vishnu or Shiva. Buddhist rulers harkened back to King Asoka as exemplar, an Indian monarch (reigned 265–238 BC) whose support of the monastic order and construction of reliquaries in the form of stupas set the standard for monarchs building their legitimacy around Buddhism (Swearer 1995, pp. 64–8). Conceiving of the king’s position as one accruing from past merit, Buddhists in Pagan (present-day Burma) saw the king also as a bodhisatta, one who helps others achieve salvation through his patronage of the religion and through the righteousness of his rule. As such, he was both devaraja and dhammaraja, god-king and just-king (Aung-Thwin 1985). Breaking this pattern of fusing ruler and god, Islam filtered into Southeast Asia via India as well as the Arab world, following paths of transoceanic commerce in the tenth century to take root in coastal locations along the region’s archipelago and peninsular portions (Chaudhuri 1985). Since Muslims largely shun religious representational art, the archaeology of Islam’s advance derives mostly from gravestone inscriptions that reveal the deceased as Muslim (Drewes 1968). Conversion would have facilitated relationships with the Muslim traders who began to dominate the Indian Ocean trade in this era.
To have equated ruler with God would have compromised Islam’s insistent monotheism, yet the mutually beneficial relationship between state and religion continued to characterize the Muslim sultanates and rajadoms of the Malay-speaking world of insular Southeast Asia. Malay rajas were often described as the ‘Shadow of God on Earth,’ or his deputy, and legends sometimes relate that when a ruler converted he commanded his followers to do likewise (Milner 1985). Given the Hindu-Buddhist environment, it is not surprising that Islamic Sufism, a mystical tradition, initially had the greatest success in this region (Woodward 1989).
2. Colonizing the Spirit: Historical and Ethnographic Qualifications
Spain and Portugal, in competition with each other to gain access to trade with the East, sacralized their exploration and conquests of new worlds as a religious competition against Protestants (Chadwick 1972, p. 321). In the Philippines, the Church and the colonial state operated through ‘an interpenetration of functions’ (Steinberg 1982, p. 64) whereby the colonial church enjoyed virtually full financial support from the government, and Spanish monarchs approved or disapproved the placement of religious personnel. With its vast land holdings, ownership of banks, demands for local labor, and increasingly racist limitations on recruiting clergy, the Church was targeted by the mestizo revolutionaries who first came to imagine themselves as independent Filipinos. While nationalism became a secular alternative to religion for some leaders, others sought to indigenize or nationalize the Church. Today, ‘the Roman Catholic experience’ remains important to a sense of Philippine identity, raising a conundrum for the primarily Muslim inhabitants of the southern islands, peoples still termed, as the Spanish labeled them, ‘Moros.’ Militant Moro separatism has smoldered in the southern islands for decades. However, historical approaches have recently qualified this model of conversion and conquest in the Philippines. Tagalog speakers ‘translated’ Christianity to leave largely intact a distinctly Filipino worldview (Rafael 1988). Fenella Cannell (1999), an anthropologist, found a similar process in Bicol. The records most accessible to the historian, those of the Church and state, are perhaps the most likely to elide this insight. But by including myth as part of historical consciousness, and by ethnographic attention to beliefs about magic and spirits, Aguilar’s (1998) history of labor conflicts on sugar haciendas shows how we might read between the lines of the institutional records to discover how Filipinos localized Christianity and resisted colonial domination.
3. Religion and Nation
The trading companies of the Dutch and British that eventually displaced Spain and Portugal in the rest of insular Southeast Asia did not perceive their mission to be one of converting the natives. Toward the end of the eighteenth century, as these mercantile bodies died and European states began to engage directly in colonization, independent Protestant missionary organizations also emerged, following European flags wherever these were planted. Protestant missionaries found Islam and Buddhism, deeply entrenched with institutions of governance and leadership, their major competitors throughout much of the region. Missionaries targeted, instead, the tribal peoples who had remained autonomous from the lowland court centers and resistant to their religions. In some cases, colonial governments limited missionaries to working among tribal peoples, cautious of irritating the majority population and its leaders. Indeed, early nationalistic resistance throughout the region drew strength from
Islamic and Buddhist organizations (von der Mehden 1963). But perhaps because in the Dutch and British colonies the state was ostensibly (if not always in practice) neutral toward religion, nationalist opposition to these regimes was far less anticlerical than in the Philippines, moving increasingly toward secular-socialist forms. Thailand was the only country to have remained free from colonial rule, although it was forced to concede considerable territory to France and substantial commercial rights to Britain. The idea of Thailand as a modern nation, an equal player in a world of nations, indicated a shift from conceptualizing the polity as an exemplary center, the edges of which were of little import, to one where mapped boundaries defined precisely the compass of the state’s reach (Tongchai 1994). The Chakri dynasty, selecting Bangkok for its capital, led Thailand during its transition into nationhood. The fourth of the Chakri kings, Mongkut, lived as a Buddhist monk for 27 years before ascending the throne, and advocated reforms to rationalize and purify the sangha, the clerical body of Buddhism. His son, Chulalongkorn, carried out extensive governmental and social reforms, but it was the sixth Chakri ruler, Vajiravudh, who brought the nation, the monarchy, and Buddhism together in the basic form that they still hold to this day:
Vajiravudh’s Thai nation (and he was the first to popularize this phrase) was founded on the basic triad of ‘nation-religion-monarch’ (chat-satsana-phramahakasat), a trinitarian mystery in which all three elements were inextricably bound together. Allegiance to any one of the three meant loyalty to all three; disloyalty or disobedience or disrespect toward one meant disrespect toward all. (Some sophistries were engaged in to secure the allegiance of Siam’s considerable Muslim and Christian minorities.) (Wyatt 1984, p. 229)
Precisely because the state has a vested interest in melding Buddhism with the nation, scholars have not readily seen the extent to which Thai thought and practice is not Buddhist, or perhaps is nominally Buddhist but substantially indigenous. Nicola Tannenbaum, an anthropologist, discovered this in her research on the Shan (1995, 1999). Like other anthropologists, she looked less at Buddhism as textual tradition than at Buddhists and their practices. In contrast, the methods of scholars in religious studies tend to focus more on the canonical aspects of religion than on the everyday ‘heresies’ of actual practitioners. This methodological difference between religious studies and anthropology is less and less firm, however. Donald Swearer, for example, a scholar in religious studies who specializes in Buddhism in Southeast Asia, approaches his subject ethnographically as well as historically (Swearer 1995). As Thai monarchs in the past made merit by building or renovating temples and stupas, so the
current monarch, and the Thai state, remain prominent patrons of Buddhism and the sangha. One of the more recent royal projects of this type is the large and well-manicured park, Buddha Monthon, located on the edge of Bangkok. Dominated by a gigantic standing image of the Buddha and a large golden stupa, the park includes a lake, fountains, and quiet trails: a serene and inspirational space for public celebrations and high-level conferences or meetings. During the 1970s, Thailand allied itself with the United States in attempting to halt the spread of communism in Southeast Asia. Fearing its own communist insurgency from within, the government of Thailand utilized the Village Scout Movement to mobilize sentiment and patriotism in support of nation–religion–monarch. Initiation into this movement, which attracted mainly teenagers and young adults, began with isolating the initiates for several days in a camp-like setting, then depriving them of sleep while they participated in lectures, skits, songs, and other group activities. At the ritual’s climax, when initiates received the scarves that officially symbolized their membership in the Scouts, many were sobbing, some so overcome with patriotic emotion they had to be removed on stretchers (Bowie 1997). At the same moment, Thailand’s neighbor, Laos, was taking a diametrically opposite path. With the ascendancy of the Lao People’s Revolutionary Party (LPRP) in 1975, Laos attempted to forget its monarchy and to harness Buddhism for socialist ends (Evans 1998). Forming the Lao United Buddhist Association, the LPRP hoped to purge the religion of ‘superstition’ while supporting the state’s reforms and programs:
For many monks and for many laymen this political reorientation of the Lao sangha was unacceptable.
As a result some left the order and returned to ordinary life, some fled to Thailand, and among the flood of Lao refugees to Thailand from 1975 until the early 1980s, many cited dissatisfaction with changes to Lao Buddhism among their reasons for leaving (Evans 1998, pp. 62–3).
With the reforms that have swept the communist world since 1989, Laos has also instituted a more market-based economy. What socialism and Lao identity mean in this new era have come under new scrutiny, and ‘the regime has turned increasingly to Buddhism in its search of new ideologies of legitimation, and of a reformulated Lao nationalism’ (Evans 1998, p. 67). Members of the ageing politburo who have died in recent years have been buried with full Buddhist funeral rites, and the upgraded constitution of 1991 gives more explicit attention to Buddhism than its predecessor did. Islamic Southeast Asia presents examples of two different solutions to the question of religion and nation. When Britain granted independence to Malaysia, Chinese Malaysians, who made up some 30
percent of the population, controlled most of the urban economy. The indigenous Malays, most of whom lived by agriculture or fishing, comprised the scant majority of the population but were granted constitutional privileges of governorship in the new nation. Added to this ethnic mix was a South Asian population comprising about 10 percent of the population, and a small scattering of tribal aborigines in the mountains. Ethnic tensions erupted in 1969 during two weeks of rioting that caused widespread property damage and hundreds of casualties, mostly of Chinese and Indian Malaysians. Responding to this tragedy, the legislature passed economic policies that included affirmative action for Malays in education and business. These politics of ethnicity have much to do with Islam’s position in Malaysia (Shamsul 1994). The constitution, while providing citizens the right to practice and profess any religion, also establishes that ‘Islam is the religion of the nation’ (Bunge 1985, p. 239). Specifically, Islam is the religion of the Malays. Islam is so centrally part of Malay identity that, historically, throughout much of the Malay-speaking world, converting to Islam was often termed ‘becoming Malay’ (masuk Melayu). The government’s promoting Islam in this context of ethnic politics, therefore, promotes ‘Malayness.’ This explains why so much public architecture in Malaysia, such as the striking railway station in downtown Kuala Lumpur, looks distinctly Islamic. The relationship of Islam and nation in Indonesia has been quite different from the situation in Malaysia. Numerically speaking, Islam appears to be even more predominant here than in Malaysia, where but 53 percent were Muslim in 1980. Muslims comprise 88 percent of Indonesia’s huge population, yet Indonesia’s constitution does not name Islam as the religion of the nation. 
When the new republic was founded, some voices did call for the formation of an Islamic state, but the adherents of minority faiths, as well as those Muslims who feared the power of the state to enforce an orthodoxy they did not want to practice, opposed this concept. To mollify proponents of an Islamic state, however, the deliberative body drawing up the lineaments of a new nation determined at least to avoid a purely secular stance. Thus, the first of the five ideological ‘pillars’ (Pancasila) underlying the nation is the belief in the Almighty God. Significantly, too, the religiously neutral term Tuhan (Lord) is used to denote God in this statement rather than the term Allah. A government ministry to support religion was established at the same time, with different branches to promote the interests of each of five officially recognized religions: Islam, Hinduism, Buddhism, Protestant, and Catholic Christianity (Boland 1971). Because Suharto’s New Order came to power as the result of crushing an attempted coup by the Indonesian Communist Party, communism remained the state’s ‘enemy number one’ long after an estimated 500,000
were killed in recrimination for the coup, a bloodbath that effectively ended organized communism in the country. In this political climate, religion remained a significant tool in Indonesian politics (Kipp 1993). Religious affiliation became a cloak of respectability that shielded one from suspicion in that time of terror that ushered in the New Order, and later, that vouchsafed one’s claim to be a solid and progressive citizen. Adherence to one of these five official religions was required of all government employees, such as teachers and bureaucrats, as proof of one’s immunity to communism. Tribal peoples whose animistic religions were not accorded protection and respect found themselves consigned to categories such as ‘backward,’ or ‘isolated.’ While adherents of the five approved religions were protected by law from proselytization, the practitioners of animism were opened to Christian and Muslim missions. Some of these tribal peoples yielded to this pressure by converting; others reconstituted their traditional practices as ‘Hinduism,’ allying themselves with the Balinese, whose religious beliefs and practices bear a family resemblance to Indian religion (Geertz 1972). While Islam was not set up as the religion of state, and Indonesians have been free to be religious in one of five approved ways, they have been far less free to claim allegiance to other kinds of religions, or to claim no religion at all. The ethnographic study of Islam in Indonesia was long hampered by the assumption that the rituals and practices of Muslims were universal in the Muslim world, and thus already known (Bowen 1993). Until recently, anthropologists working in Muslim Indonesia attended to those religious practices and beliefs that they presumed were non-Islamic or pre-Islamic, and in the process, underestimated the depth of Islamic faith and Muslim identities (Woodward 1989).
Now scholars recognize that Islamic practice, everywhere the heart of this faith, deserves careful study, often leading us to the debates and differences that animate and divide Muslim communities. Indonesian Muslims whose practices incorporate substantial amounts of local precedent, following revered teachers, are termed Traditionalists. They have been gradually losing discursive ground to those who promulgate more modernist, standardized forms of Islam (Bowen 1993). The political economy of this shift can be attributed to the different positions of power occupied by those who espouse the Modernist variants as compared to the largely rural and less well educated Traditionalists. The state has also helped to tip the balance in this discourse. Religious instruction is a mandated part of the curriculum in all public schools through the high-school level, and the Department of Religion has established a nationwide system of teacher-training schools to prepare teachers of religion. Thus, schoolchildren receive a standardized, textbook version of Islam that may contradict the particulars of religious practice in their communities (Hefner 1994).
4. Both More and Less than an Instrument of State
As rulers in Southeast Asia from ancient times to the present have attempted to harness the political power of religion, certain forms of religion acquire the privileges of official recognition and financial patronage, while others are at best ignored if not censured. When ‘modern state Buddhism’ developed under the patronage of the Chakri dynasty in the late nineteenth and early twentieth centuries, it overpowered many regional monastic traditions (Kamala 1997). A tradition of wandering asceticism, among other local Buddhist traditions, has been all but elided from history. Official Thai Buddhism was shaped by the wish to rationalize the faith in relation to a textual canon, whereas localized Thai Buddhism had been rooted in devotional practices, such as meditation, rather than in textual exegesis, and relied more on firsthand relationships between teachers and students than on the literate transmission of knowledge. The official Thai Buddhism also elevated some peculiarly local elements of central Thai experience as the national standard:
The system, still in use today, rests on degrees, examinations, and ranks in the sangha hierarchy. It defines the ideal Buddhist monk as one who observes strict monastic rules, has mastered Wachirayan’s printed texts, teaches in Bangkok Thai, carries out administrative duties, observes holy days, and performs religious rituals based on Bangkok customs (Kamala 1997, p. 9).
The considerable numbers of Thai who speak Lao or some other language may thus feel less than comfortable with the orthodoxy deriving from Central Thai traditions. Religious faith and expression always go well beyond the designs of people in high places. The deepening Islamization of Indonesia and Malaysia in the closing decades of the twentieth century was not wholly an engineered artifact of the Cold War and internal ethnic politics. Muslims worldwide were experiencing a resurgence of faith and confidence, and as travel and communication have reduced the world’s distances, Southeast Asia’s Muslims easily participate in discourse within the worldwide ummah, the community of Islam. What Indonesians and Malaysians do with or make of their faith is not fully under any government’s control. A small handful of Muslim leaders in Indonesia, authorities worry, inspire a ‘fanaticism’ among their followers that could threaten the peace and security of the regime. Other prominent Muslim leaders want to depoliticize Islam. They have become wary of the government’s patronage of Islam, seeing there an avenue for political cooptation and control. They disavow the goal of an Islamic state, calling instead for the deep transformation of society through individuals’ adherence to the practices of Islam: a revolution of the spirit rather than of the political structure.
Buddhism in Southeast Asia has experienced similar revivals and reinventions, some of which go well beyond what the state intends. In Singapore, for example, with an explicitly secular constitution and state ideology, Buddhism replaced Taoism as the leading religion according to the 1990 census. Over one-third of Singaporeans profess Buddhism, and it attracts especially university-educated adherents who attend lectures and youth camps, and who utilize the Internet to study Buddhist philosophy. Likewise, urban and educated Thai have become increasingly skeptical of the Buddhism purveyed as a civil religion. New, unorthodox religious movements based on Buddhism have attracted many in the Thai middle class, and, what is more, some monks have dared to oppose openly, and at the risk of arrest, the state’s development policies that destroy the environment (Taylor 1993). Archaeology and art history reveal that the world religions were tied intimately to statecraft in the region’s first states. Historical research tells us that religious imperialism was implicated in colonial expansion, and that the dawn of nationalistic movements sometimes drew energy from religious identities forged in opposition to the colonials. Modern research on the world religions in Southeast Asia, whether undertaken by those in religious studies, anthropology, or history, takes seriously both the literate tradition of a faith and the imperative to find out what the faithful actually do and say. These hybrid methods add nuance to histories of religion, showing us, for example, how people translate religious concepts into locally meaningful forms, resisting religious and political imperialism in the process.
At the same time, such hybrid approaches demand that anthropologists investigate the supposedly universal rituals of the world faiths and then place their knowledge of local practices against a more expansive background of regional history and religious literature.
See also: East Asia, Religions of; India, Religions of; Religion: Definition and Explanation; Religion: Evolution and Development; Religion: Family and Kinship; Religion, History of; Religion: Nationalism and Identity; Religion, Sociology of; Southeast Asian Studies: Culture; Southeast Asian Studies: Politics; Southeast Asian Studies: Society
Bibliography
Aguilar F 1998 Clash of Spirits: History of Power and Sugar Planter Hegemony on a Visayan Island. University of Hawaii Press, Honolulu, HI
Aung-Thwin M 1985 Pagan: The Origins of Modern Burma. University of Hawaii Press, Honolulu, HI
Boland B J 1971 The Struggle of Islam in Modern Indonesia. Nijhoff, The Hague, The Netherlands
Bowen J R 1993 Muslims Through Discourse. Princeton University Press, Princeton, NJ
Bowie K A 1997 Rituals of National Loyalty: An Anthropology of the State and the Village Scout Movement in Thailand. Columbia University Press, New York
Bunge F M 1985 Malaysia: A Country Study. United States Government, Washington, DC
Cannell F 1999 Power and Intimacy in the Christian Philippines. Cambridge University Press, Cambridge, UK
Chadwick O 1972 The Reformation. Penguin Books, Harmondsworth, UK
Chaudhuri K N 1985 Trade and Civilization in the Indian Ocean: An Economic History from the Rise of Islam to 1750. Cambridge University Press, Cambridge, UK
Drewes G W J 1968 New light on the coming of Islam to Indonesia? Bijdragen tot de Taal-, Land- en Volkenkunde 124(4): 433–59
Evans G 1998 The Politics of Ritual and Remembrance: Laos Since 1975. Silkworm Books, Chiangmai, Thailand
Fontein J 1990 The Sculpture of Indonesia. National Gallery of Art, Washington, DC
Geertz C 1972 ‘Internal Conversion’ in contemporary Bali. In: Geertz C (ed.) The Interpretation of Cultures. Basic Books, New York
Hefner R W 1994 Reimagined community: A social history of Muslim education in Pasuruan, East Java. In: Keyes C F, Kendall L, Hardacre H (eds.) Asian Visions of Authority: Religion and the Modern States of East and Southeast Asia. University of Hawaii Press, Honolulu, HI
Kamala T 1997 Forest Recollections: Wandering Monks in Twentieth-century Thailand. University of Hawaii Press, Honolulu, HI
Kipp R S 1993 Dissociated Identities: Ethnicity, Religion, and Class in an Indonesian Society. University of Michigan Press, Ann Arbor, MI
Mannikka E 1996 Angkor Wat: Time, Space, and Kingship. University of Hawaii Press, Honolulu, HI
Milner A R 1985 Islam and Malay kinship. In: Ibrahim A, Siddique S, Hussain Y (eds.) Readings on Islam in Southeast Asia. Institute of Southeast Asian Studies, Singapore
Rafael V 1988 Contracting Colonialism: Translation and Christian Conversion in Tagalog Society Under Early Spanish Rule. Ateneo de Manila University Press, Manila, The Philippines
Shamsul A B 1994 Religion and ethnic politics in Malaysia: The significance of the Islamic resurgence phenomenon. In: Keyes C F, Kendall L, Hardacre H (eds.) Asian Visions of Authority: Religion and the Modern States of East and Southeast Asia. University of Hawaii Press, Honolulu, HI
Steinberg D J 1982 The Philippines: A Singular and a Plural Place. Westview Press, Boulder, CO
Swearer D K 1995 The Buddhist World of Southeast Asia. State University of New York Press, Albany, NY
Tannenbaum N 1995 Who Can Compete Against the World? Power-protection in Shan Worldview. Association for Asian Studies, Inc., Ann Arbor, MI
Tannenbaum N 1999 Buddhism, prostitution, and sex: Limits on the academic discourse on gender in Thailand. In: Jackson P, Cook N (eds.) Genders and Sexualities in Modern Thailand. Silkworm Books, Chiangmai, Thailand
Taylor J 1993 Buddhist revitalization, modernization, and social change in contemporary Thailand. Sojourn 8(1): 62–91
Tongchai W 1994 Siam Mapped: A History of the Geo-body of a Nation. University of Hawaii Press, Honolulu, HI
Von der Mehden F R 1963 Religion and Nationalism in Southeast Asia. University of Wisconsin Press, Madison, WI
Woodward M R 1989 Islam in Java: Normative Piety and Mysticism in the Sultanate of Yogyakarta. University of Arizona Press, Tucson, AZ
Wyatt D K 1984 Thailand: A Short History. Silkworm Books, Chiangmai, Thailand
R. S. Kipp
Southeast Asian Studies: Society
1. Analyzing Southeast Asia as a Form of Knowledge
Society is both real and imagined. It is real through face-to-face contact and imagined when the idea of its existence is mediated through media such as printed materials and electronic images. So, the term society refers simultaneously to a microunit that we could observe and to a macro one that we could only partially engage with. We, therefore, have observable ‘societies’ within a macro imagined ‘society,’ so to speak. Southeast Asia, like other regions in the world, has both. But it is the way that both of these components have been woven into an enduring complex whole, which seems to have made Southeast Asia and Southeast Asians thrive and survive even under adverse conditions, such as the recent financial–economic crisis, that has become the source of endless intellectual attraction and academic inquiry to both scholars and others, hence the birth, growth, and flourishing of Southeast Asian studies. Thus, Southeast Asian studies, dominated by humanities and the social sciences, have been about the study of the ‘society’ and ‘societies’ in the region, in their various dimensions, in the past and at present. The complex plurality of these ‘society’ and ‘societies,’ or societal forms, that do indeed co-exist, endure, and enjoy some functional stability, has made it imperative for researchers to apply an equally diverse set of approaches, some discipline-based (anthropology, sociology, geography, history, political science, etc.) and others thematically-oriented (development studies, gender studies, cultural studies, etc.) in studying Southeast Asian society. In some cases, it even involved disciplines from the natural as well as applied sciences.
The greatest challenge in Southeast Asian studies, and to its experts, has been to keep pace with the major changes that have affected the ‘society’ and/or ‘societies’ and then narrate, explain, and analyze these changes and present the analysis in a way that is accessible to everyone within and outside the region. Therefore, framing the analysis is critical in understanding how Southeast Asian studies constitute
and reproduce through the study of ‘society’ and ‘societies’ within Southeast Asia. The ‘knowledge baseline’ approach is useful in making sense of the said framing process.
2. The ‘Knowledge Baseline’ in Southeast Asian Studies
Social scientific knowledge (humanities included) on Southeast Asia has a clear knowledge baseline, meaning a continuous and interrelated intellectual-cum-conceptual basis, which emerged from its own history and has, in turn, inspired the construction, organization, and consumption process of this knowledge. The two popular concepts that have been used frequently to characterize Southeast Asia are ‘plurality’ and ‘plural society,’ both of which are social scientific constructs that emerged from empirical studies conducted within Southeast Asia by scholars from outside the region. In historical terms, ‘plurality’ characterizes Southeast Asia before the Europeans came and, subsequently, divided the region into a community of ‘plural societies.’ Plurality here signifies a free-flowing, natural process not only articulated through the process of migration but also through cultural borrowings and adaptations. Politically speaking, polity was the society’s political order of the day, a flexible nonbureaucratic style of management focusing on management and ceremony by a demonstrative ruler. States, governments, and nation–states, which constitute an elaborate system of bureaucratic institutions, did not really exist until Europeans came and dismantled the traditional polities of Southeast Asia and subsequently installed their systems of governance, using ‘colonial knowledge,’ which gave rise to the plural society complex. Historically, therefore, plural society signifies both ‘coercion’ and ‘difference.’ It also signifies the introduction of knowledge, social constructs, vocabulary, idioms, and institutions hitherto unknown to the indigenous population (such as maps, census, museums, and ethnic categories), the introduction of market-oriented economy and systematized hegemonic politics. Modern nation–states or state–nations in Southeast Asia have emerged from this plural society context.
It is not difficult to show that the production of social scientific knowledge on Southeast Asia has moved along this plurality-plural society continuum. When scholars conduct research and write on pre-European Southeast Asia they are compelled to respond to the reality of Southeast Asian plurality during that period; a period which saw the region as the meeting place of world civilizations and cultures, where different winds and currents converged bringing together people from all over the world who were interested in ‘God, gold and glory’ and where groups
of indigenes moved in various circuits within the region to seek their fortunes. As a result, we have had, in Java, a Hindu king with an Arabic name entertaining European traders. In Champa, we had a Malay raja ruling a predominantly Buddhist populace trading with India, China, and the Malay archipelago. Whether we employ the orientalist approach or not, we cannot avoid writing about that period within a plurality framework, thus emphasizing the region’s rich diversity and colorful traditions. In other words, the social reality of the region to a large extent dictates our analytical framework. However, once colonial rule was established and the plural society was installed in the region, followed later by the formation of nation–states, the analytical frame, too, changed. Not only did analysts have to address the reality of the plural society but also the subsequent developments generated by the existence of a community of plural societies in the region. We began to narrow our analytical frame to nation–state, ethnic group, inter-nation-state relations, intra-nation-state problems, nationalism, and so on. This gave rise to what could be called ‘methodological nationalism,’ a way of constructing and using knowledge based mainly on the ‘territoriality’ of the nation–state and not on the notion that social life is a universal and borderless phenomenon, hence the creation of ‘Indonesian studies,’ ‘Malaysian Studies,’ ‘Thai Studies,’ and so on. With the advent of the Cold War and the modernization effort, analysts further narrowed their frame of reference. They began to talk of poverty and basic needs in the rural areas of a particular nation, also focusing on resistance and warfare, slums in urban areas, and economic growth of smallholder farmers.
The interests of particular disciplines, such as anthropology, became narrower still when they focused only on particular communities in remote areas, a particular battle in a mountain area, a failed irrigation project in a delta, or gender identity of an ethnic minority in a market town. In fact, in numerical terms, the number of studies produced on Southeast Asia in the plural society context exceeds many times over the number produced on Southeast Asia in the plurality context. Admittedly, social scientific studies about Southeast Asia developed much more rapidly after the Second World War. However, the focus became increasingly narrow and compartmentalized not only by academic disciplines but also in accordance with the boundaries of modern postcolonial nations. Hence, social scientific knowledge on Southeast Asia became, to borrow a Javanese term, kratonized, or compartmentalized. It is inevitable that a substantial amount of social scientific knowledge about Southeast Asia itself, paradigmatically, has been generated, produced, and contextualized within the plural society framework, because ‘nation–state’ as an analytical category matters more than, say, the plurality perception of the
Penans of Central Borneo, who, like their ancestors centuries ago, move freely between Indonesia and Malaysia to eke out a living along with other tribal groups and outside traders, ignoring the existence of the political boundaries. In fact, anthropologists seem to have found it convenient, for analytical, scientific, and academic expedience, to separate the Indonesian Penans from those of Malaysia when, in reality, they are one and the same people. Therefore, the plurality–plural society continuum is not only a ‘knowledge baseline’ but also a real-life social construct that was endowed with a set of ideas and vocabulary, within which people exist day-to-day in Southeast Asia.
3. Constituting and Reproducing the Knowledge on Southeast Asia
There are at least four major axes along which the construction, organization, and reproduction of social scientific knowledge about Southeast Asia and its societies have taken place. The first axis is that of discipline/area studies. There is an ongoing debate between those who prefer to approach the study of Southeast Asia from a disciplinary perspective, on the one hand, and those who believe that it should be approached from an area studies dimension, employing an interdisciplinary approach, on the other. The former prefer to start clearly on a disciplinary footing and treat Southeast Asia as a case study or the site for the application of a particular set of theories that could also be applied elsewhere globally. The aim of such an approach is to understand social phenomena found in Southeast Asia and to make comparisons with similar phenomena elsewhere. Those preferring the latter approach see Southeast Asia as possessing particular characteristics and internal dynamics that have to be examined in detail using all available disciplinary approaches with the intention of unraveling and recognizing the indigenous knowledge without necessarily making any comparison with other regions of the world. The bureaucratic implications of these two approaches can perhaps be discerned clearly in the way social scientific knowledge about Southeast Asia is reproduced through research and teaching. This brings us to the second axis, namely, the undergraduate/graduate studies axis. Those who favor area studies often believe that Southeast Asian studies can be taught at the undergraduate level, hence the establishment of Southeast Asian studies departments or programs in a number of universities in Southeast Asia, combining basic skills of various disciplines to examine the internal dynamics of societies within the region. Acquiring proficiency in one or two languages from the region is
a must in this case. The problem with this bureaucratic strategy is that these departments have to be located in a particular faculty, say, in the arts, humanities, or social science faculty. This denies, for instance, those with a background in the natural sciences the opportunity to study Southeast Asia in depth. Therefore, those discipline-inclined observers would argue that Southeast Asian studies should be taught at the graduate level to allow those grounded in the various disciplines, whether in the social or natural sciences or in other fields of study, to have an opportunity to specialize in Southeast Asian studies. A geologist or an engineer who, for instance, is interested in the soil and irrigation systems of Southeast Asia could then examine not only the physical make-up of Southeast Asia but also the human–environment relationship. This is particularly relevant at the present time since environmental and ecological issues have become global concerns. This has made many individuals, institutions, and governments think carefully about how they should invest their precious time and money when they are requested to support the setting up of, say, a program, center, or institute of Southeast Asian studies. They often ask whether universities should continue to have the prerogative over the teaching, research, and dissemination of knowledge about anything connected with Southeast Asia and its societies. Why not in nonuniversity institutions?
This takes us to the third axis, namely, the university/nonuniversity one. For many years, we imagined that only at the university could we acquire and reproduce knowledge about Southeast Asia, whether approached from the disciplinary or area studies perspective.
However, many governments and international funding bodies felt that to obtain knowledge about Southeast Asia one need not go to university, but could acquire it through nonacademic but research-oriented institutions established outside the university structure to serve particular purposes. National research bodies such as LIPI (Indonesian Institute of Sciences) in Jakarta and ISEAS (Institute of Southeast Asian Studies) in Singapore have been playing that role. ‘Think-tanks,’ such as the Centre for Strategic and International Studies (CSIS), Jakarta, or the Institute of Strategic and International Studies (ISIS), Malaysia, have also played the role of producer and reproducer of knowledge on societies in Southeast Asia outside the university framework. However, there seems to be a division of labor, based on differences in research orientation, in the task of producing and reproducing knowledge between the academic and nonacademic institutions.
The final axis is an academic/policy-oriented research one. While academic endeavors pursued within the context of Southeast Asian studies in the universities are motivated by interest in basic research, which is by definition scholarly, those pursued outside the universities are often perceived as not being
scholarly enough because they are essentially applied or policy-oriented in nature and serve the rather narrow, often political, interests of the powers that be in Southeast Asia. It is argued that the critical difference between these two approaches is that the academic one is always open to stringent peer-group evaluation as a form of quality control, but that the applied one is not always assessed academically. In fact, the latter is often highly confidential and political in nature, which prevents it from being vetted by the peer group, hence its perceived inferior scholarly quality. The basic research-based academic endeavors are, therefore, seen as highly scholarly, whereas the nonacademic ones are perceived as highly suspect as scholarly works and not considered to contribute to the accumulation of knowledge on Southeast Asian societies. However, research institutes like ISEAS in Singapore would argue that, even though it is essentially a policy-oriented research institute mainly serving the interests of the Singapore government, it still produces scholarly work of high quality and encourages basic research to be conducted by its research fellows either on an individual or a group basis. In other words, a nonuniversity research institute of Southeast Asian studies, such as ISEAS, could simultaneously conduct applied and basic research without sacrificing the academic and scholarly qualities of its final product; or, put another way, it is ‘policy-oriented yet scholarly.’ The moot question is who are really the consumers of knowledge on Southeast Asian societies, hence of Southeast Asian studies: the Southeast Asians or outsiders?
4. Consuming the Knowledge on Southeast Asia
It could be argued that social scientific knowledge about Southeast Asia and its societies is a commodity with a market value. Often the ‘market rationale,’ and not the ‘intellectual rationale,’ prevails in matters such as the setting-up of a Southeast Asian studies program, center, or institute, even in the government-funded academic institutions. Indeed, the funding of research on Southeast Asian studies has often been dictated not by idealistic, philanthropic motives but by quite crass utilitarian desires, mainly political or economic ones. There are at least three important ‘sectors’ within which knowledge on Southeast Asian societies has been consumed: the public, the private, and the intellectual sectors.
Since the governments in Southeast Asia have been the biggest public sector investors in education, through public-funded educational institutions, they have been the largest employment providers. They have set their own preferences and priorities, in accordance with their general framework of manpower planning, in deciding what type of graduates and in which fields of
specialization they want to employ them. The pattern in Southeast Asian countries has been well established, that is, there is a higher demand for science graduates than for graduates in the social sciences and the humanities. But amongst the latter there is no clear, expressed demand for Southeast Asian studies graduates. However, there seems to be a significant demand for the inclusion of Southeast Asian studies content in all the nonnatural science courses at the undergraduate level in most of the government-funded academic institutions in Southeast Asia. This is not unrelated to the fact that awareness of ASEAN as a community has now become more generalized amongst the public, hence the need for a more informed description of the different countries and societies within ASEAN (read Southeast Asia). Outside Southeast Asia, such as in Japan and the United States of America, specialization in Southeast Asian studies, or in components of it, has very rarely been considered highly desirable in the job market of the public sector. Perhaps having a graduate-level qualification in Southeast Asian studies is more marketable in the public sector, especially in government or semigovernment bodies that deal with diplomatic relations or intelligence.
In the private sector, the demand for Southeast Asian studies as a form of knowledge and the demand for a potential employee who possesses that knowledge are both limited and rather specific. However, the number could increase depending on how large an investment and production outfit a particular company has in Southeast Asia. Since some of the demand for the knowledge is rather short-term, often specific but detailed, and therefore has to be customized to the needs of a company, ‘think-tanks’ or ‘consultant companies’ have often become the main suppliers of such tailored knowledge.
Many such organizations are actually dependent on ‘freelance’ Southeast Asianists or academics who do such jobs on a part-time, unofficial basis. It has been observed that the Japanese seem to be regular consumers of knowledge on Southeast Asia. This is hardly surprising because they have massive investments in Southeast Asia. There is, therefore, a constant need to know what is happening in the region. Research foundations from Japan, in particular the Toyota Foundation, have been very active, since the 1990s, in promoting ‘Southeast Asian studies for Southeast Asians,’ and supporting other research and exchange programs. Taiwan and Korea are the two other Asian countries having their own Southeast Asian studies research centers, along with the United States, United Kingdom, France, and The Netherlands, former colonial powers in Southeast Asia. A more generalized demand for knowledge on Southeast Asian societies relates to marketing, and this trend must not be underrated given the recent expansion of the middle class in the region. As the markets and clients in Southeast Asia become more
sophisticated, the need for in-depth knowledge on sectors of Southeast Asian societies has increased. This in turn has increased the demand for graduates who have followed courses related to Southeast Asian studies.
In the intellectual sector, knowledge on Southeast Asia has been consumed generally by the NGOs, both those that are nationally based and those that have regional networks. Because most of the NGOs are issue-specific interest groups, concerned with matters such as environmental protection, abused housewives, social justice, and the like, and often seek funds for their activities from the governments and NGOs in developed countries, they find it more advantageous to operate on a regional basis because they get more attention and funding from those sources. The strength and success of their operation is very much dependent on the amount of knowledge they have about Southeast Asian societies in general as well as about the specific issue that they are focusing on as a cause in their struggle. With the popularity of the Internet and its increased usage around the world and within Southeast Asia, it has now become an important medium through which academic and popular knowledge on Southeast Asian societies has become available. The sources of the knowledge could be located outside or within the region but are now much more accessible for commercial and noncommercial purposes.
In conclusion, it could be said that Southeast Asian studies and what it constitutes is, first and foremost, a knowledge construct that represents only part of the region’s social reality. In spite of this, it is the most important element, amongst the many, that gives Southeast Asia, the geo-physical region as well as its people and environment, its history, territory, and society.
Because of the co-existence of different societal forms in the region, and hence the unevenness of the tempo of social life, the speed of social change also differs from one community to another and from one area within the region to another. Only a multidisciplinary approach could capture these complexities embedded in the societies of Southeast Asia. As the importance of the region increases in the globalizing world, both generalist and specialist knowledge about Southeast Asia become critical to the world and the region itself. In that sense, Southeast Asian studies as a knowledge construct transforms itself into a lived reality, especially for the Southeast Asians themselves. This knowledge, therefore, becomes indispensable both to those who study Southeast Asia and its society and to the Southeast Asians themselves.
See also: Colonialism, Anthropology of; Colonization and Colonialism, History of; Modernization, Sociological Theories of; Plural Societies; Southeast Asian Studies: Culture; Southeast Asian Studies: Economics; Southeast Asian Studies: Politics
Bibliography
Bellwood P 1985 Prehistory of Indo-Malayan Archipelago. Academic Press, Sydney
Brown D 1994 The State and Ethnic Politics in South-East Asia. Routledge, London & New York
Collins J T 1996 Malay, World Language: A Short History. Dewan Bahasa & Pustaka, Kuala Lumpur
Evers H-D (ed.) 1980 Sociology of South-East Asia: Reading on Social Change and Development. Oxford University Press, Kuala Lumpur
Kemp H 1998 Bibliographies on Southeast Asia. KITLV Press, Leiden
Reid A 1988 Southeast Asia in the Age of Commerce 1450–1680; Volume 1: The Lands Below the Winds. Yale University Press, New Haven, CT
Reid A 1993 Southeast Asia in the Age of Commerce 1450–1680; Volume 2: Expansion and Crisis. Yale University Press, New Haven, CT
Steinberg J (ed.) 1987 In Search of Southeast Asia: A Modern History, rev. edn. Allen & Unwin, Sydney and Wellington
Tarling N (ed.) 1992 The Cambridge History of Southeast Asia. Volume 1: From Early Times to c. 1800; Volume 2: The Nineteenth and Twentieth Centuries. Cambridge University Press, Cambridge, UK
Wallace A R 1869 The Malay Archipelago, the Land of the Orang-Utan and the Bird of Paradise: A Narrative of Travel with Studies of Man and Nature. Macmillan, London
A. B. Shamsul
Southern Africa: Sociocultural Aspects
The southern African region comprises South Africa, Lesotho, Swaziland, Botswana, Namibia, Mozambique, Malawi, Zimbabwe, and Zambia. The recent history of the region, during the last half of the twentieth century, was dominated by the decolonization process and its aftermath. With a few exceptions, countries have struggled to establish viable political systems and bureaucratic states, to overcome ethnic, racial, and regional conflict, to diversify and stabilize economies often dominated by only one or two industries (such as copper mines in Zambia, tobacco in Zimbabwe), and to achieve economic development. The small countries of Lesotho, Botswana, and Swaziland are, for the most part, monoethnic states with little ethnic conflict, but have had significantly different internal political arrangements and economic fortunes. South Africa is the most economically and ethnically diverse country in the region, with the strongest state and institutions of civil society. It dominates the region’s economy. Lesotho is entirely surrounded by South Africa and almost completely dependent on it. All other countries, especially land-locked Zimbabwe, Zambia, and Malawi, depend to a large degree on South African transport services and harbors. South African investments and companies increasingly dominate the economies of its regional neighbors.
The countries of the southern African region share similar and inter-related colonial histories, postcolonial interactions, and the broad outlines of the cultural histories of the majority of their populations. Close cooperation among governments of the region, fostered by a half-century of political struggle, has created tight political integration of the whole. The largest trading partners still remain outside of the region, primarily in Western Europe. At the beginning of the twenty-first century, all countries had begun to seek increasing intraregional cooperation in order to reverse the effects of conflict, isolation, economic decline, and political weakness of state structures. This was articulated by Thabo Mbeki, President of South Africa, as the search for an African Renaissance.
1. History
Archaeological evidence shows that southern Africa was ‘the cradle of mankind.’ The earliest prehumans and Homo sapiens evolved here. There is, however, no evidence that would link these early humans any more strongly to contemporary southern African populations than to the rest of humanity. All languages in the region derive from only two of the six major language families of Africa (Greenberg 1966). The entire population of speakers of the Khoisan language group have lived in the region, and only here, from at least 20,000 years ago. Beginning around 2,000 years ago, Bantu-speaking peoples appear to have moved into the region quite rapidly from north of the Congo forest zone, in the region of today’s Cameroon. They possessed technology to smelt iron and make durable fired pottery, and raised cattle, goats, and a mixture of crops from the Sudanic zone (millets and sorghum) and from the Indian Ocean region (sweet potatoes, bananas). At some time they acquired the New World crop, maize, which became central to the diet of the entire region well before European incursions.
Before the arrival of the agricultural and pastoral Bantu, most of southern Africa was sparsely populated by the Khoisan people. The ‘Khoisan’ language family includes two major branches, the ‘Bushman’ or ‘San’ languages and the KhoiKhoi. The primary distinction between these two groups, however, is social and economic. The Bushmen were hunters and gatherers, living in small mobile bands of no more than 35–40 people. The Khoi were cattle and sheep-herders, living in larger tribal groupings. Early European settlers at the Cape of Good Hope began to trade with the Khoi peoples (then known as ‘Hottentots’) in the seventeenth century, but as trade relationships broke down and the European farmers (‘Boers’) began to move inland in the eighteenth century, the Khoi were
gradually absorbed as clients and laborers by the Boers. Bushmen were killed in skirmishes, displaced to remote deserts, or incorporated as servants and wives by European people moving inland from the south, and by Bantu-speaking Africans moving southwards. By the nineteenth century Khoisan speakers remained only in the dry areas inland from the west coasts of South Africa, Namibia, and Angola, and in the Kgalagadi (Kalahari) desert of Botswana. There is still a relatively large Nama (Khoi) speaking group in Namibia. Indeed, the name ‘Namibia’ means ‘land of the Nama.’ In South Africa, where ‘Colored’ people preserve much of the San and Khoi genetic heritage, Bantu languages such as Xhosa and Zulu preserve their linguistic heritage in the form of some vocabulary and the characteristic ‘click’ sounds found nowhere else among the world’s languages. Their art, painted on rocks throughout the region, bears testament to a rich culture.
Although ‘Bantu’ refers strictly to a linguistic category, people who speak Bantu languages also share broadly similar cultures. With the exception of their spread to Indian Ocean islands such as the Comores and Zanzibar, they are restricted to tropical and subtropical continental southern, central, and eastern Africa. The culture and political economy of early Bantu speakers was characterized by residence in small dispersed agricultural and/or pastoral settlements led by a ‘chief’ who delegated authority to subchiefs or headmen, and to a council of senior or royal males in a form of informal democracy that usually excluded women. Precolonial religious beliefs focused on reverence for the spirits of (male) ancestors who were propitiated during regular ceremonies by individuals, families, clans, or larger tribal or village groupings.
While there were generally no ‘priests’ in these traditional religious systems, all societies relied on diviners, healers, and prophets who acted either as mediums for spirits during healing rituals or occasionally interpreted the will of the spirits in a political context. The institution of religious prophecy continues to have significant political and social effects. The role of traditional prophets and spirit mediums in the Chimurenga or ‘war of liberation’ fought in Rhodesia/Zimbabwe was especially important, but resistance movements led by traditional prophets appeared in all southern African countries during colonial times as well. Strong beliefs in the existence of witches (individuals who caused evil or illness, often unbeknown to themselves) and beliefs in the existence of animal familiars of witches or other kinds of spiritual powers inherent in plants and animals are also widespread. Today, majorities in all countries are at least nominally Christian, with small numbers of Muslims such as the Yao of Mozambique, and the Swahili-speakers in Mozambique, Malawi, and eastern Zambia. Many aspects of precolonial belief have been incorporated into indigenous Christian beliefs and practices, while minorities of people continue to
practice traditional religion in preference to, and sometimes in combination with, Christianity or Islam.
The rapid establishment after 1816 of the Zulu kingdom under the warrior king Shaka, together with the British colonial advance in the latter half of the nineteenth century, provides the historical skeleton that unites the region’s cultural diversity. Shaka was a member of the small Nguni-speaking ‘Zulu’ clan. He is credited with introducing a number of military innovations that eventually led to the establishment of his hegemony over a large conquest state in what is now northern KwaZulu-Natal Province of South Africa. Repercussions were felt as far north as Malawi and eastern Zambia, southern Tanzania and Mozambique, where groups of people fleeing Shaka’s power established temporary hegemony. These people were known as the ‘Ngoni’ in the north, or Nguni in South Africa. The rise of the Zulu Kingdom, together with movements inland of the Portuguese from the east coast, and the Boers from the Cape of Good Hope, led to general disruption of settled life in south central Africa. This, in turn, created opportunities for widespread incursions, first by African and Arab slave and ivory traders from the east coast, and later by European adventurers, hunters, traders, settlers, and missionaries. The general lack of strong centralized states or other large-scale political organizations in the region contributed to the advance of colonial rule and expropriation of land by Europeans for farms and mines. European Christian missionaries established mission stations in Nyasaland (now Malawi) in response to the devastation and social dislocation caused by Ngoni incursions and led the way for the British government to declare Nyasaland a Protectorate. Zulu offshoots established the Gaza kingdom in the southern central region of Portuguese Mozambique over the Tsonga-speaking people and settled in large numbers in southern Zimbabwe as the ‘Ndebele’ people.
(The Ndebele of South Africa derive from an earlier Nguni expansion into Sotho-Tswana areas, and are historically and culturally distinct from the Zimbabwe Ndebele.) The Kololo people, a Tswana-speaking group originally deriving from the Hurutshe in what is now Botswana, fled north in response to Zulu incursions and established temporary hegemony over the Lozi people in western Zambia. They managed to impose much of their language and culture on the previously independent populations. In South Africa, the Zulu expansion and its repercussions had the effect of disrupting virtually all previous settlements of people during the Mfecane (or Difeqane in Sotho, ‘the crushing’), a period of vast population movements and reorganization lasting through the middle of the nineteenth century. Out of this emerged several strongly integrated ‘tribes’ as followers of powerful chiefs: the Sotho of Basutoland (Lesotho), the Swazi of Swaziland, and the Tswana tribes of
Bechuanaland (Botswana). It also created an opportunity for white Afrikaans-speaking farmers, the Boers, to move north out of the Cape Province, a British Colony since 1802. After the Great Trek of 1835 they established several independent republics, including the Orange Free State and the Suid-Afrikaanse Republiek (‘South African Republic,’ known as ‘The Transvaal’).
2. Colonialism and Independence Struggle
In the second half of the nineteenth century, Swaziland, Basutoland (now Lesotho), and Bechuanaland (Botswana) were declared British Protectorates after treaties were signed by their chiefs, partly in order to secure them as labor reserves to provide labor for European farmers and the mines, and partly to keep them from the hegemony of the Zulu on the one hand and the Afrikaner (Boer) states on the other. The Zulu, unwilling to enter into negotiations, were eventually defeated in 1879 by British forces. Their king was exiled briefly but eventually returned as king of the Zulu people under the authority of the Colony of Natal. The hyphenated name of the current province of South Africa, ‘KwaZulu-Natal,’ still reflects the regional power of the Zulu kingdom.
Mozambique was colonized by Portugal in the fifteenth century but only came under effective control in the late nineteenth century. Namibia was brought under German control as a result of the partition of Africa at the Berlin Conference in 1884. The remainder eventually came under the direct or indirect control of Britain. Zimbabwe and Zambia were given as concessions to the British South Africa Company under Cecil John Rhodes as Southern and Northern Rhodesia. In 1924, the British South Africa Company was dissolved, and Northern Rhodesia and Southern Rhodesia became Crown Colonies under local white settler control. Together with the Protectorate of Nyasaland, they were joined in the Central African Federation from 1953 until independence was achieved by Northern Rhodesia and Nyasaland in 1964. The colony of Northern Rhodesia was renamed Zambia, while Nyasaland became Malawi.
Military conflicts in the later decades of the twentieth century severely limited the pace of economic and political development throughout the region.
Zimbabwe/Rhodesia, Namibia/South West Africa, and South Africa witnessed protracted guerrilla struggles between white settler populations and black African nationalist organizations that spanned most of the half-century and generated large cross-border refugee movements. Southern Rhodesia continued to be ruled by its white minority settler population as Rhodesia, and declared its independence from Britain in 1965. Though unconstitutional and condemned by Britain and the United Nations, the largely white government continued to rule Rhodesia until 1980, despite massive trade and political sanctions. After the Zimbabwe African National Union (ZANU), with an ethnic base among the Shona, and the largely Ndebele Zimbabwe African People’s Union (ZAPU) succeeded in defeating the white settler regime, Rhodesia became Zimbabwe under Robert Mugabe, the leader of ZANU. Independence was followed by a period of repression by ZANU against the Ndebele followers of ZAPU in the middle of the 1980s, though peace was eventually restored. Mugabe continued to rule Zimbabwe with an increasingly heavy hand. Twenty years after independence, he again declared ‘war’ on the country’s white farmers, who still retained about half of the arable land. At the turn of the millennium, Zimbabwe was on the brink of economic and political collapse.
The defeat of Smith’s Rhodesian government was made possible by close cooperation between ZANU and African nationalist forces in Mozambique. These had begun by the late 1960s to wage a war of liberation, and offered safe havens for ZANU forces. Operating from Tanzania, FRELIMO had succeeded in gaining control of northern Mozambique in the early 1970s. In 1974, the Portuguese government fell, due in large measure to the cost of its colonial wars in Angola and Mozambique. Mozambique was granted independence in the following year but suffered years of destructive civil war from shortly after the Portuguese withdrawal.
To the west, the increasing success of the South West Africa People’s Organisation (SWAPO) against the forces of the South African Apartheid government had also helped ZANU to achieve its aims in Zimbabwe. South West Africa had been wrested from German control by South African forces acting under British authority during World War I. It was then ceded to South Africa as a Trust Territory under the terms of the Armistice, and in accordance with a mandate of the League of Nations. After a protracted liberation struggle, it became independent Namibia.
Despite its Portuguese colonial heritage, Mozambique is clearly part of the Southern African region culturally and historically. The Tsonga-speaking people of southern Mozambique straddle the border with South Africa. Mozambicans have long been intensively involved in the South African economy, first as labor migrants to white farms from as early as the eighteenth century, and later as laborers on South African mines from the late nineteenth century until the present. South African capital has flowed strongly into Mozambique since the end of its protracted civil war. The African National Congress came to power in South Africa in 1994 after the release in 1990 of Nelson Mandela from 27 years of imprisonment. With the end of colonialism, in all countries except South Africa, large numbers of white people withdrew, either moving back to European countries or into South Africa. Today, well over 95 percent of the population of all countries except South Africa is black African.
3. States and Politics
State institutions in southern Africa are all relatively weak. The colonial states of Mozambique and Northern and Southern Rhodesia were never very strong, focusing mainly on controlling black labor migrations and political movements, and on delivering security and services to white farmers, miners, and business interests. The activity of liberation forces also removed large areas from effective government control. Parts of these countries remained effectively beyond the reach of the state, neither paying much in taxes nor receiving many services from the central authorities. At the end of the twentieth century, all except Swaziland had democratic governments elected under universal franchise. The benefits of the postcolonial transition to democratic politics have been limited, however, by long liberation struggles in some cases (Zimbabwe, South Africa, Namibia), postcolonial civil war (Mozambique), rampant corruption, mismanagement (Zambia, Malawi), and, for most, increasing poverty and environmental damage.
Throughout the last half of the twentieth century the political process was dominated by relatively few persons and parties. All countries have been ruled for long periods by one party or by only one or two presidents. Free political expression has often been suppressed. In South Africa the National Party, dominated by white Afrikaners, ruled from 1948 until 1994, one of the longest periods of rule by a single party anywhere in the postwar world. The National Party implemented a set of laws and practices known as Apartheid (‘separateness’) that attempted to maintain exclusive political and economic power by whites. Racism was institutionalized, and black opposition parties were banned. Their members were either jailed or forced into exile. Only whites were allowed to vote. Within these limits, however, the political process was orderly, with presidents stepping down after their terms of office had expired.
Following this trend, Nelson Mandela, the first president of a free South Africa and leader of the African National Congress, stepped down after serving one five-year term to make way for Thabo Mbeki, his elected successor.
The problem of how to integrate ‘traditional’ African politics of the rural areas with the ‘modern,’ urban-centered political systems deriving from Europe (either liberal-democratic or centralist-Marxist) has persisted to the present. While Botswana, Lesotho, South Africa, and Zimbabwe maintained ‘advisory’ constitutional roles for ‘traditional authorities’ (that is, predominantly rural tribal ‘chiefs’), others such as Mozambique abolished chiefship altogether. Tribal groups such as the Himba in Namibia and the Lozi of Zambia maintain chiefs or kings, but without recognition by the state’s constitution. With the exception of Swaziland, all possessed more or less liberal democratic parliamentary political systems by the end of the twentieth century. Swaziland alone possessed an
absolute monarchy based on precolonial African traditions of chiefship. Botswana institutionalized traditional rulers who have a role in government through the House of Traditional Authorities. In white-ruled Zimbabwe and South Africa, chiefs were brought under government control and seen as part of the system of oppression. In South Africa, during Apartheid, the internal ‘homeland’ republics also utilized the institution of ‘modernized’ chiefships (salaried and provided with minor bureaucratic support) that were based on traditional institutions. Nevertheless, the beginning of the twenty-first century saw a renaissance of chiefship in South Africa and Mozambique, with remnants of precolonial traditional political systems often playing public roles in most other countries of the region.
4. Livelihood
In the later decades of the twentieth century, all countries in the region in one period or other suffered increasing isolation from world markets as the result of political sanctions (Rhodesia/Zimbabwe and South Africa), distance from markets (Lesotho, Malawi, Zambia), warfare (Mozambique), or declining market prices for their primary products (Zambia). Worldwide political and economic sanctions were applied first to Rhodesia (Zimbabwe) throughout the 1970s during the period of white settler rule, and then to South Africa during the final decades of the struggle against Apartheid. A polarity between the ‘modern’ urban centers and the surrounding ‘traditional’ rural regions has characterized the broad politico-economic structure of the region since colonial times. A long-term, occasionally rapid, migration of rural agriculturalists to urban areas has, in the past, skewed gender balances as males moved to urban areas in search of wage labor, while women tended to stay in the villages. Damage to family structures, unemployment, urban crowding, severely limited police capacity, and poor urban planning have often led to high levels of social insecurity and criminal violence. Rural–urban migration had stabilized by the end of the millennium, but high levels of mobility between city and country remain, with kin groups attempting to maintain bases in both. In Zambia, urbanization proceeded rapidly during the colonial period, with miners and work-seekers clustering in the ‘Copper belt’ in north-central Zambia. With collapsing copper prices during the 1980s and mismanagement of mines, many people returned to rural areas. By the beginning of the twenty-first century, over 95 percent of Whites and Indians in South Africa were living in urban areas. Black South Africans are nearly equally divided between urban and rural.
Rural areas are characterized by dispersed residence, subsistence agriculture and pastoralism, producing no significant economic surplus for trade.
Urban areas throughout the region are weakly centralized and sprawling. All large urban areas derive from colonial patterns of segregated ‘suburbs’ loosely grouped around multiple trading and industrial centers. Single houses (or complexes of houses, outbuildings, and shacks) surrounded by walled ‘gardens’ or yards comprise the formal urban settlement pattern, while large informal shack settlements surround them. The colonial economy based on mining, large-scale farming, and labor migration contributed to disruption of family life and systems of social control, especially in rural areas of Zambia, Zimbabwe, and South Africa. While mining and agriculture now account for less than 10 percent of the GDP in South Africa, these two sectors dominate the economies of other southern African countries as they did in South Africa in the early twentieth century. Lesotho and Malawi, lacking natural resources except water, have been largely labor-exporting countries, relying on remittances sent home by miners who work mainly in South Africa, but also Zambia and Zimbabwe. The bulk of Botswana’s GDP is derived from diamond mines, while Zambia relies on exports of copper, cobalt, and other minerals. Namibia is mostly arid, and depends on mineral exports, especially uranium, for foreign exchange earnings. All countries, however, participate in sending labor to the mines located throughout the region but dominated by South African mining houses. At the beginning of the twenty-first century, economic and social prospects are dimmed by the threat of the HIV/AIDS pandemic. Southern Africa emerged as the hardest hit region in the world. As a consequence, the rate of population growth is predicted to fall below replacement rates in the first decade of the twenty-first century.
5. Language and Communication
The process of urbanization has resulted in the emergence of distinctive urban cultures with small manufacturing, trade, mining, and transport industries initially attracting large numbers of formerly rural people to urban areas where different languages and cultures came into contact. They blended to form new urban-based languages and cultures. Examples include Isicamtu, ‘talking’ or tsotsitaal, ‘gangster language,’ in South Africa, based on Zulu grammar but blending English, Afrikaans, and Sotho or Tswana words; or Fanagalo, meaning ‘Do-like-this,’ a pidgin based on Zulu used on the mines and farms throughout southern Africa. New forms of music, dance, and religion blending African traditions and Western forms also arose. No indigenous African language has achieved status or acceptance as a single national language or lingua franca anywhere in southern Africa. South Africa has 11 official languages, of which those with the largest
number of speakers are Zulu (8.7 m speakers), Xhosa (6.8 m), Afrikaans (6.2 m), Pedi (‘Northern Sotho’; 3.8 m), English (3.5 m), Sotho (Southern Sotho; 2.7 m), and Tswana (2.8 m) in addition to the smaller linguistic communities of Tsonga (‘Shangaan’), Venda, Ndebele, and Swati (‘Swazi’). Other languages spoken include Gujerati, Urdu, Hindi, Tamil, and Bhojpuri from India. Portuguese, Greek, and German are spoken by smaller communities from Europe. Chinese is spoken by recent immigrants from China and Taiwan, although the small community of Chinese who arrived in the nineteenth century speak only English. French is spoken primarily by central African refugees and migrants in South Africa. The linguistic diversity is even greater in Zambia where 41 languages are spoken, including Bemba, Tonga, Nyanja (Chewa), and Lozi. Thirty-three languages are spoken in Mozambique and 28 in Namibia. Although over 70 percent of the population of Botswana speak Tswana, there are many small, nearly extinct languages spoken by Khoisan people, and by refugees from surrounding countries. There is some mutual intelligibility among some of these languages. For instance, Sotho, Pedi and Tswana are mutually intelligible, as are Zulu, Xhosa and Swazi in South Africa, Botswana, Swaziland and Lesotho. Several languages have become lingua francas within limited areas of these countries—for instance Zulu in South Africa, or Bemba and Nyanja/Chewa in Zambia and Malawi. Almost all people speak several regional languages. As a consequence of diversity, with the exception of Mozambique, English has increasingly become the dominant public language of government and the media. It is also the language of trade and diplomacy. While Mozambicans use Portuguese as the language of government and education, they are increasingly turning to English, too, as they develop deeper political and economic links with other nations.
Nevertheless, the large majority of all people in the region continue to rely on local Bantu languages in their homes. Until 1994, South Africa treated both English and Afrikaans as ‘official,’ but Afrikaans now figures as only one of the 11 official languages. Afrikaans is indigenous to South Africa, where it evolved from dialects of Dutch in the Dutch colonial settlements at the Cape of Good Hope. Today it is spoken throughout South Africa and Namibia, with small pockets in other southern African countries. In South Africa, Afrikaans is spoken by a larger number of people than English as a home language. Within the region, South African society is the most complex with respect to color, ‘race,’ religion, language and dialect, region, and class. Ethnic, political, and other kinds of identities—such as preference for certain sports, or foods—are based on these differences, and also help to create and sustain them. None of these primary indicators of social difference, however, tends to define a consistent or bounded category
of people. In other words, people of many different colors or races speak the same language or set of languages, while people of the same region or class vary widely in what religion they follow, or in their political allegiances. People sharing the same religion and language—which is often a combination that defines a fairly closely-knit ethnic or national community in Europe, or Asia—may span widely different colors, classes, and regions in southern Africa. For instance, while the majority of South Africa’s ‘Colored’ people speak Afrikaans and belong, like most White Afrikaners, to some denomination of the Reformed Church, until recently they considered each other to be eternally divided by their ‘race.’ Surprisingly, while virtually all schools and public places were rigidly divided by ‘race’ in the 1980s, by the end of the century, schools, shopping centers, and other public places had quietly integrated with no more than a handful of violent or other racial incidents. On the other hand, as the business center of Johannesburg gradually became ‘black,’ the entire economic geography of this city of six million people shifted, with largely white-owned businesses moving north into newer, dispersed business centers within the last few years of the 1990s. Despite the previously enforced divisions and centuries of conflict across many different cultures and language groups, South Africa exhibits a degree of cultural, economic, and political unity that is perhaps unique in Africa. Virtually all of its people manage to communicate with one another in one or several common languages, shop in the same stores, watch the same television channels, furnish their houses in similar ways, and—ironically—share the belief that they are much more different from one another than they actually are in terms of many objective measures of cultural or social difference.
See also: Apartheid; Central Africa: Sociocultural Aspects; Colonialism, Anthropology of; Colonization and Colonialism, History of; East Africa: Sociocultural Aspects; Historiography and Historical Thought: Sub-Saharan Africa; Imperialism, History of; Sub-Saharan Africa, Archaeology of; West Africa: Sociocultural Aspects
Bibliography
Barnard A 1992 Hunters and Herders of Southern Africa: A Comparative Ethnography of the Khoisan People. Cambridge University Press, Cambridge, UK
Colson E, Gluckman M (eds.) 1951 Seven Tribes of British Central Africa. Manchester University Press, Manchester, UK
Freund B 1984 The Making of Contemporary Africa: The Development of African Society Since 1800. Indiana University Press, Bloomington, IN
Greenberg J 1966 The Languages of Africa. Indiana University Press, Bloomington, IN
Grimes B F (ed.) 1996 Ethnologue, 13th edn. Summer Institute of Linguistics [http://www.sil.org/ethnologue/]
Hall M 1990 Farmers, Kings, and Traders: The People of Southern Africa, 200–1860. The University of Chicago Press, Chicago
Hammond-Tooke W D 1974 The Bantu-speaking Peoples of Southern Africa. Routledge, London
Hammond-Tooke W D 1993 The Roots of Black South Africa. Jonathan Ball Publishers, Johannesburg
R. J. Thornton
Sovereignty: Political
Sovereignty is a set of principles or rules that define appropriate governance structures. These rules were first explicated with the greatest clarity by Bodin and Hobbes in the sixteenth and seventeenth centuries, and elaborated by later thinkers including Jeremy Bentham and John Austin. The defining principle of these initial efforts to clarify the meaning of sovereignty was that there could be one, and only one, source of the law, and that this source, the sovereign, was either in practice or in theory not subject to any higher authority. A corollary is that each sovereign state is independent, subject to no external authority. In the contemporary world, the meaning of sovereignty has become less certain. Sovereignty has become associated with several different rules and practices, rather than with some single notion of supreme authority within a given territory. The realization of any pure version of sovereignty of the sort that Bodin, and especially Hobbes, embraced is precluded by three factors. First, the relationship between order and a single hierarchy of authority has been belied by actual practice, especially the experience of modern democratic states in which authority is fragmented and divided. Second, power asymmetries and the absence of any final authority at the international level have meant that, for some states, juridical autonomy has been an empty or half-empty shell because their domestic political structures have been infiltrated or organized by other, more powerful, states. Third, the sovereign right of states to enter freely into any agreement means that political leaders may in some cases choose to compromise the domestic autonomy of their own polities; one element of sovereignty, the right to make treaties, confounds another, the independence of domestic political structures.
1. Origins of Sovereignty
The concept of sovereignty was associated initially with the domestic ordering of authority and control. At its core was the notion that there can be only one supreme authority. Even if divine authority or natural law existed (which even Hobbes admitted) in practice, only the sovereign could make laws. There could not
be any earthly authority above the sovereign, because then the sovereign would not be sovereign. There could also be no external authority, no international or transnational actor, that could negate the will of the sovereign. A political order based on sovereignty or the sovereign state can be contrasted with the medieval world, in which there were overlapping authority claims within the same territory, and the traditional Sinocentric world in which China alone was supreme; other political entities, such as Korea, were subordinate or tributary states yet not part of the imperial core. In Europe, a political world in which states with defined territories and some degree of autonomy from external, especially transnational authority (most notably the Catholic Church), evolved over several hundred years as a result of material factors, including long-distance trade and technological changes in warfare that favored states over other political entities such as city leagues, the Holy Roman Empire, or the Papacy, and ideational ones, most notably the Protestant Reformation (Tilly 1990, Spruyt 1994, Skinner 1978). It was, however, the European religious wars which provided the motivation to develop a doctrine that could legitimate a political order populated with one, and only one, final source of authority within a given territory. The antagonisms bred by sectarian beliefs precipitated bloody civil conflicts in the major polities of Western Europe. Bodin, who was almost killed in religious riots in France in 1572, wrote his Six Books of the Commonwealth in response to the Huguenot rebellion and the Huguenot writings that justified rebellion against tyranny. For Bodin, the fundamental objective of any government must be to ensure order, and not to protect liberty.
Sovereignty, which for Bodin was intimately associated with the right to give laws and not just to make judgments, had to be lodged somewhere, and he argued that monarchy was preferable to either an aristocracy in which sovereignty would rest with a few, or a democracy in which it would rest with the many. Bodin, like Machiavelli, dwelt on the frailty of political order. He acknowledged that a sovereign who became a tyrant might justly be killed by a foreigner, but he rejected any rationale for domestic resistance (Skinner 1978, pp. 284–8, Hinsley 1986, p. 120). Bodin did not, however, regard the sovereign as unconstrained. The sovereign was bound by the laws of God and nature, and by custom. Because natural law dictated that promises ought to be kept, the sovereign was obligated to honor treaties made with other sovereigns, and agreements made with his own subjects. Taxation required the consent of the governed, because arbitrary confiscation of property would violate the laws of nature. In France, Bodin averred that the king could not change the Salic law, a customary practice, which limited succession to males (Skinner 1978, pp. 273, 289, Bodin 1992, Bk. I, Chap. 8).
Hobbes, writing 80 years later, was more emphatic in his defense of the power of the sovereign. Hobbes rejected even the modest constraints that Bodin had imposed, arguing that it is the sovereign who interprets the laws of God and nature. Once individuals had entered into a social contract they could not or ought not challenge the sovereign, although Hobbes acknowledged that individuals had a right to run away in the face of violent death. The alternative to the Leviathan, the sovereign, was the state of nature in which life was ‘nasty, brutish and short.’ The line of argument focusing on sovereignty as an attribute of domestic political structures was developed further by legal positivists, such as Austin, who asserted that the law properly understood emanated from one single source of authority untrammeled by any constraints. Bentham, for instance, argued that natural law was a nonsensical concept.
2. Sovereignty in Practice
While in theory sovereignty has often been associated with the concept of a single lawgiver untrammeled by any higher source of authority, except perhaps God, in practice the actual exercise of authority has been much more complicated. The kind of state envisioned by Bodin, and especially Hobbes, never came into existence. In Britain itself, the strife of the seventeenth century led to a political system in which authority was divided between king and Parliament. In the United States, the Founding Fathers established a constitutional order with a division of power among the executive, legislative, and judicial branches, as well as between the individual states and the federal government. In the twentieth century, centralized hierarchical systems such as Nazi Germany and the Soviet Union provided neither prosperity nor domestic tranquillity for their citizens. The classic view which associated political order with a single hierarchy of authority is irrelevant in the contemporary world.

2.1 Sovereignty in the International System
The notion of sovereignty, however, continues to be invoked at the international level, where it is defined by two principles. First, the right to domestic autonomy or independence: regardless of their internal structure of authority (unitary, federal, democratic, autocratic, etc.), sovereign states exclude both de jure and de facto external sources of authority. Second, the right to make treaties: sovereign states can enter freely into contracts with other entities. In practice, the first of these principles, autonomy, has been contested persistently, and the second, the right to enter freely into contracts, can result in agreements that contradict the first. Although a few states have approximated the sovereign ideal of actual, and not just formal juridical autonomy, the United States being the most important
example, there has never been an historical moment during which the world could be described as divided among such states, each enjoying exclusive authority within its own territory. Even if states have had a monopoly over the legitimate use of force within their own borders, the purposes toward which such force might be used frequently have been influenced or determined by external actors.

2.2 Dilemmas of Sovereignty
There are two factors that preclude the full realization of any idealized form of sovereignty. First, because there is no authority above states, powerful states can violate the autonomy of their weaker neighbors. Second, because sovereignty implies that states have the right to make any agreements that they please, there is nothing to prevent them from entering into treaties that might subject them to external authority structures. The most obvious violations of the principle of autonomy involve military force. Major powers have not been reluctant to dictate the domestic political structures of their weaker counterparts, especially within their own spheres of influence. The United States has intervened frequently in the smaller states of the Caribbean and Central America because of concerns about security, financial stability, human rights, and illegal drugs. After World War II the Soviet Union imposed communist regimes on its satellite states in central Europe. In 1999, the member states of NATO conducted an extensive bombing campaign over Yugoslavia designed to coerce the Yugoslav government into changing its practices toward Kosovo Albanians while at the same time continuing to recognize Kosovo as a part of Yugoslavia. External influence on the domestic political structures of states has not, however, been the result of direct military action alone.
All of the major peace treaties, beginning with the Peace of Westphalia of 1648, have to some extent legitimated external involvement in domestic authority structures, sometimes as a result of fully voluntary and mutually beneficial accords, but at others as an outcome of asymmetrical bargaining power. The Peace of Westphalia, often described as a critical moment in the development of the modern state system, did delegitimate the Catholic Church as a transnational source of authority and initiate a long period in European history in which the secular balance of power rather than some notion of a unified Christendom defined the way in which leaders conceived of and practiced international politics. But at the same time, the specific provisions of the Westphalian Peace established a new constitution for the Holy Roman Empire including religious toleration within Germany, a position supported by the victors in the Thirty Years’ War, France and Sweden, and accepted only reluctantly by the Habsburg monarch, Ferdinand III. The Peace did not create a world of
autonomous political entities; the political units within the Empire (principalities, electorates, free cities) were still embedded within the larger institutional arrangements of the Empire which were, as a result of the Westphalian Peace, now subject to international guidance. The Vienna settlement following the Napoleonic Wars guaranteed the rights of Catholics in The Netherlands. During the nineteenth century the European powers, fearing that ethnic conflict in the Balkans would destabilize Europe (which, indeed, happened in 1914), insisted on religious toleration as a condition for recognizing all of the successor states of the Ottoman Empire, beginning with Greece in 1832 and ending with Albania in 1913. Perhaps most clearly, the peace settlements following World War I included extensive provisions for the protection of minority rights. Poland, for instance, as a condition of recognition and admission to the League of Nations, agreed to bilingual education in some primary schools, and to refrain from holding elections on Saturday because such balloting would have violated the Jewish Sabbath. While a few states, such as Hungary, which had a small domestic minority population but many coethnics living in other states, welcomed these minority rights arrangements, most of the 30 or more states that were affected regarded them as a necessary evil, accepted only to secure their international status. The interwar efforts at international constraints on domestic practices were a dismal, albeit internationally legitimated, failure, and after World War II, human rather than minority rights became the focus of attention. The United Nations Charter, for instance, endorsed both human rights and the classic sovereignty principle of nonintervention. Disheartening developments associated with the disintegration of Yugoslavia in the 1990s, however, revived earlier concerns with minority rights.
Recognition of the Yugoslavian successor states was conditioned on their acceptance of constitutional provisions guaranteeing minority rights, and in Bosnia the Council of Europe established a set of external authority structures including a Human Rights Commission, a majority of whose members were appointed by the Council. Political leaders may also use their sovereign right to enter freely into treaties to compromise voluntarily the autonomy of their own polity. In the contemporary environment, the clearest example of the tension between the right to contract and domestic autonomy is the European Union. The European Commission, the European Court of Justice and the European Parliament are supranational authority structures that can, in some issue areas, make decisions that are binding on member states, even states that have not agreed to them explicitly. The rulings of the Court, for instance, have a direct effect in national judicial systems and supremacy over domestic law, doctrines that emerged through judicial practice rather than any explicit treaty. The Single European Act adopted in 1987, and the Maastricht Treaty ratified in 1993
provided for majority or supramajority, but not unanimous, approval for decisions in many issue areas. Mutual recognition, which obligates member states to recognize each other’s national regulatory measures, creates extraterritorial authority because legislation in one state governs the activities of enterprises in others. The European Union is a particularly dramatic example of the tension between sovereignty understood as the right to enter freely into contracts, and sovereignty understood as autonomy or independence from external authority structures, but it is not unique. The member states of the International Monetary Fund have accepted conditionality arrangements covering not only specific policy targets such as the level of the public debt, but also domestic authority structures such as the establishment of an independent central bank or the creation of a corruption commission.
3. The Significance of Sovereignty
The tensions between different attributes of sovereignty, the coercive and legitimated efforts by external actors to influence authority structures within specific states, and the abandonment of the notion that sovereignty requires a single source of domestic authority do not mean that the concept is inconsequential. Sovereignty remains a widely understood script, comprehensible not only to political leaders but to wider populations as well. Sovereignty is a rallying cry for dispossessed populations even in situations where the best possible outcome would be a state whose authority structures would not be of its own making as is, for instance, the situation in East Timor. The salience of sovereignty may make alternative arrangements more difficult, especially in situations where compromises reached by political leaders could be undermined by counter elites deploying the language of sovereignty. Other forms of political order require either coercion, as in the former Yugoslavia, or explicit agreement, as has occurred in Europe. Innovative structures, such as the relationship that China established with Hong Kong in the late 1990s which included the participation of a foreign judge on the Court of Final Appeal and a separate Hong Kong passport, have been explained in terms of the modern language of sovereignty even when they are more reminiscent of different ordering principles, such as the traditional Sinocentric tributary state system. Sovereignty has become a set of principles more associated with the nature of political entities in the international environment (states as opposed to, for instance, protectorates or empires) and their relationship with each other (nonintervention), than with the organization of domestic political authority, which was the primary concern of most important early writers on sovereignty.
But the absence of any ultimate source of authority in the international system, the conflicting norms embraced by political leaders and
their followers, and clashes over issues of both security and material welfare mean that while sovereignty remains a consequential idea influencing political choices, it will also be persistently contested.

See also: Authoritarianism; Autonomy, Philosophy of; Bentham, Jeremy (1748–1832); Contractarianism; Democracy; Governments; Hobbes, Thomas (1588–1679); International Law and Treaties; Law: Overview; Machiavelli, Niccolo (1469–1527); Nations and Nation-states in History; Natural Law; Power: Political
Bibliography
Austin J 1954 [1832] The Province of Jurisprudence Determined and the Uses of the Study of Jurisprudence. Noonday Press, New York
Bentham J 1970 Of Laws in General. Hart H L A (ed.) Athlone Press, London
Bodin J 1992 On Sovereignty: Four Chapters from The Six Books of the Commonwealth. [Ed. and trans. J H Franklin]. Cambridge University Press, Cambridge, UK
Bull H 1977 The Anarchical Society. Columbia University Press, New York
Deudney D H 1995 The Philadelphian system: sovereignty, arms control, and balance of power in the American state-union, circa 1787–1861. International Organization 49(2): 191–228
Fowler M R, Bunck J M 1995 Law, Power and the Sovereign State: The Evolution and Application of the Concept of Sovereignty. Pennsylvania State University Press, University Park, PA
Hinsley F H 1986 Sovereignty, 2nd edn. Cambridge University Press, Cambridge, UK
Hobbes T 1996 Leviathan. [Ed. R Tuck]. Cambridge University Press, Cambridge, UK
Jackson R H 1990 Quasi-States: Sovereignty, International Relations and the Third World. Cambridge University Press, Cambridge, UK
Krasner S D 1999 Sovereignty: Organized Hypocrisy. Princeton University Press, Princeton, NJ
Oppenheim L 1992 Oppenheim’s International Law, 9th edn. [Eds. R Jennings, A Watts]. Longman, Harlow, UK
Ruggie J G 1998 Constructing the World Polity: Essays on International Institutionalization. Routledge, London
Skinner Q 1978 The Foundations of Modern Political Thought: Vol. 2. The Age of Reformation. Cambridge University Press, Cambridge, UK
Spruyt H 1994 The Sovereign State and Its Competitors. Princeton University Press, Princeton, NJ
Strang D 1991 Anomaly and commonplace in European political expansion: Realist and institutional accounts. International Organization 45: 143–62
Tilly C 1990 Coercion, Capital, and European States, AD 990–1990. Basil Blackwell, Cambridge, MA
Wendt A 1999 Social Theory of International Politics. Cambridge University Press, Cambridge, UK
Vattel Emer de 1852 The Law of Nations; or, Principles of the Law of Nature, applied to the Conduct and Affairs of Nations and Sovereigns. [New edn. (trans. J. Chitty)]. T. and J. W. Johnson, Law Booksellers, Philadelphia
S. D. Krasner
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Soviet Studies: Culture
Soviet studies, like its other regional studies counterparts, is primarily a European and North American academic response to post-World War II geopolitics. It overlaps with Slavic Studies but also includes research on non-Slavic cultures living on the territory of the Soviet Union, and makes reference to the history and culture of the Russian Empire before 1917. Soviet Studies grew rapidly during the Cold War, and was generously supported by the US government in its effort to ‘understand the enemy.’ It integrated a number of disciplines including history, economics, literature, linguistics, sociology, and ethnology, but gave primacy to political science. From the outset, such study took the supranational unity of the Soviet Union as the master frame of reference for the study of languages, cultures, and social institutions, however diverse. The status of culture in Soviet studies is complex and changing. Interest in the political origins of the Soviet system underwrote a long-term emphasis within Sovietology (the Cold War title for Soviet Studies) on political culture. Forms of governance, the nature of domination and subordination, the mechanisms of legitimation of the regime, and the emergence of a self-selected ruling elite were among the principal concerns of this type of research. At the other end of the spectrum was research on artistic culture, which was treated as expressive of national culture in general, and its producers, the Soviet intelligentsia. The ‘cultural turn’ of the mid-1980s in Western social science stimulated new development in Soviet studies. Increasingly, studies of Soviet society directed attention to social discourses, practices, and everyday life. This anthropologically informed idea of culture as a realm of socially shared meanings and a system of representations now proliferates in this regional research.
This article traces the development of Soviet and post-Soviet studies with an emphasis on how culture was used as an analytical framework. It also considers the diverse lines of influence that run between Western (post) Soviet studies and internal studies of culture in the Soviet Union and Russia. Throughout, the term ‘Soviet culture’ refers to cultural developments of the socialist period from 1917 to 1991. ‘Post-Soviet culture’ defines the contemporary stage of cultural development in the territory of the former Soviet Union.
1. Culture in Russian Society and Russian Social Sciences
The concept of culture has a complex provenance in the Soviet and Russian contexts. One sense of the term refers to education and the level of self-possession of
an individual—kul’turnost’. The second meaning symbolizes people’s unity based on language, tradition, and heritage, and reflects the profound role of culture in nationalist discourse. Finally, culture in its expressive forms, and literature in particular, constitutes a relatively independent domain of meaning. The term ‘culture’ did not enter widespread use in Russian society until the 1880s. Before then, occasional use of the word referred either to ‘literary culture’ or to the broader attainments of education, enlightenment, and civilization. As the process of national self-definition gathered speed in the late nineteenth century, a series of cultural dualisms emerged. Perhaps most significant was the opposition between East and West, which became central to discourse on national identity. As the twentieth century began, cultural dualism of this kind informed increasingly powerful notions of the ‘Russian way of life’ and ‘Russian culture.’ At the same time, non-Russian ethnic groups were ascribed particular cultures. ‘Culture’ became, in this context, the container of difference, and as such underwrote a variety of inquiries into the material culture and ‘vanishing traditions’ of ‘traditional peoples,’ including folklore studies. This interest became more pronounced in the Soviet era, reflecting a perceived need to study the effects of socialist modernization on the ‘hinterlands’ (Dunn and Dunn 1974). Another formative distinction within Russian/Soviet culture is that between ‘high’ and ‘low.’ Originally the subject of nineteenth-century debates between Slavophiles and Westernizers, high–low distinctions became a major concern of communist rule. Communist party ideologues attempted to do away with the notion of a distinction between elitist and popular cultures by forging a ‘culture for the masses.’ New cultural initiatives were meant to speak to the peasants and the workers.
Beginning in the 1920s, kul’tura was carefully planned and designed to eliminate cultural divides. In Stalin’s time the word ‘culture’ acquired a distinctive suffix—kul’turnost’ (‘cultured-ness’). The suffix signified the inclusiveness of the new cultural order— embracing the new artistic canon (Socialist realism), manners, and styles of life (Boym 1994). High art, for its part, would be democratized through museums, education, and cultural associations. Although greatly influenced by political constraints and censorship, this set of cultural policies produced a number of progressive trends in art and literary criticism that profoundly influenced Western cultural and linguistic theory. The school of linguistics known as Russian formalism was one of these. It developed in the mid-1910s around Viktor Shklovsky and Yury Tynyanov, and in the 1920s merged with the Moscow Linguistic Circle, led by Roman Jakobson. Russian formalist theory tried to overcome the dualism of form and content by emphasizing the study of poetic language as such: the interplay of verbal forms, tropes, sounds, syntactic
constructions, etc. Their method of ‘functional poetics’ sought the underlying structure of works through the relationships between their internal elements (Erlich 1965). Mikhail Bakhtin and Vladimir Voloshinov formed another group situated between the formalists and the official Marxist line on culture. Their work became an extensive study of what Bakhtin called the dialogic dimension of language—the construction of meaning in a linguistic field shaped by competing discourses such as official and popular cultural forms. Prohibited in the Soviet Union, these theorists found an audience among Western literary and cultural analysts, particularly in the areas of structuralism and semiotics (Holquist 1990). Other movements were important within Soviet scholarship. During the 1960s, 1970s, and 1980s, the Tartu School associated with Yurii Lotman became prominent. These scholars developed a sophisticated approach to cultural history that emphasized cultural wholeness and totality in its search for the key to Russia’s duality. They accorded culture a near metaphysical status—culture as an abstract force that reasserted itself in national patterns of life (Lotman 1984). As a result of political constraints and censorship, more analytical attention was paid to literary and art criticism than to other forms of cultural analysis. In response, literary and art criticism became a common means of articulating broader critiques of the Soviet system. This was the case with Boris Groys’ The Total Art of Stalinism and Vladimir Papernyi’s Kul’tura ‘Dva.’ Indigenous Soviet scholarship produced an extensive body of knowledge of Russian culture— though one that remained independent of Western theoretical influences until the 1990s.
2. The Development of Soviet Studies in North America
Although culture held a marginal position in Western Soviet studies until quite recently, the immediate postwar period between 1946 and the early 1950s provides a brief exception. For a time, linguists, sociologists, anthropologists, and even psychologists became interested in the study of Soviet society. They were encouraged in their work by the wave of war refugees who could provide first-hand knowledge about the life of the region. This effort collapsed shortly after it began. The comprehensive Soviet control of information thwarted Western inquiries until Perestroika in 1985. The inaccessibility of Soviet society to independent observers put severe methodological constraints on most forms of social science inquiry. Moreover, political considerations from the 1950s to mid-1980s played a major role in prioritizing certain kinds of research. High politics and economics
were deemed crucially important and the Soviet political system became the focus of most attention. The ‘ordinary man’ nearly disappeared as a subject of inquiry. Political scientists and in some cases historians came to dominate this research agenda. Sociologists and anthropologists were more or less removed from the field. One result of this reconfiguration was the so-called ‘totalitarian model,’ best represented by the line of political theory running from Hannah Arendt’s The Origins of Totalitarianism (1953) to Martin Malia’s The Soviet Tragedy (1994). The totalitarian model focused on the one-party political system and its methods of control: primarily terror and the destruction of meaningful social bonds outside the Party. This scholarship tended to suffer from several related problems. (a) It tended to take state ideologies at face value, reproducing Cold War prejudices about the diametrical opposition between capitalist and socialist systems, at the political as well as the economic level. (b) It took the indoctrination of the Soviet ‘masses’— another self-proclaimed success of the Communist Party—as empirically proven. Change came in the late 1970s and early 1980s, as debates arose between advocates of the totalitarian model and those who challenged totalitarian thinkers for overemphasizing the power of the Soviet political system. ‘Revisionists’ raised political questions about the nature of popular support and argued for evidence of more complex negotiation between the regime and the society. This is the moment in which ‘culture’ came to wider attention. Class identity, gender issues, social movements, and nationalism were introduced as necessary complements, and often as alternatives, to the political models. On their side, revisionists exposed themselves to the criticism of ‘whitewashing’ communist repression and purges. The fall of the Soviet regime in 1991 transformed studies of culture on both sides.
On the one hand, it provided access to Russia for Western scholars. Archives were opened and sociological studies of the general public became possible. On the other hand, it put Russian scholars, with their knowledge of the language and culture, in contact with the West. This new wealth of information and intellectual exchange was accompanied by subtler shifts in approach. In reaction to the revisionist effort to ‘de-ideologize’ the Soviet regime, a new generation of scholars returned to the question of how ideology created new and durable social structures. They did so, however, while retaining the focus on culture and the organization of everyday life. The following section considers the major directions in research on culture in Soviet and post-Soviet studies that emerged around questions of social identities, everyday life, and consumer and popular cultures. It pays less attention to political science that continues to view culture as limited to ‘political culture’—the expressions of economic and political processes.
3. Areas of Research in Soviet and Post-Soviet Studies
Questions of ‘modernity’ and ‘modernization’ are common denominators for much of Soviet and post-Soviet studies, from the official communist ideology of socioeconomic development to a broader trajectory of rationalization implicit in notions of ‘modernity.’ Against this backdrop, a number of cultural phenomena have attracted analytical attention.
3.1 The ‘Soviet Individual’
The relationship between the regime and the society and the way it shapes individual identity is among the ‘oldest’ concerns in Soviet studies. From the ‘totalitarian model’ to the contemporary challenges of creating a civil society, many scholars have been preoccupied with what it meant to be a ‘Soviet Individual.’ Soviet modernization specified the need for an ‘all-round personality’ appropriate to a ‘nonexploitative society,’ and the state and the Party took responsibility for creating it. Facilitated by rapid urbanization, this ‘socialization’ proceeded through the imposition of cultural regulations and norms, altering the existing structures of identity (Volkov 2000). The state initiatives to reform individuals were not without limitations. Recent studies of socialist individuation show that a difference between public adherence and private life allowed for complex strategies of avoidance of Party control. Thus emerged the ‘dissimulating subject’ (Kharkhordin 1999). As the Soviet system calcified and as socialist prosperity became an increasingly remote fiction, endorsement of the official ideology became a means of securing a private existence out of reach of the intrusive power of the state (Yurchak 1997). Alexei Yurchak suggests that although the ‘objective conditions of socialism’ were seen as false, Soviet individuals continued to act as if they were true believers. This collective ‘pretense of misrecognition’ strengthened social solidarity and allowed for meaningful life in other areas.
3.2 Class and Stratification
Through the creation of the Soviet individual, the Soviet state aimed to remove distinctions among social groups. ‘Class’ became a focus of social policies and, as a result, of political studies. In Soviet studies, the Soviet Union provided a contrasting case for Western economic upward mobility. While Soviet officials insisted on ‘giving power to a class,’ Western Sovietologists viewed mobility primarily in individual terms, as resulting from individual opportunities to improve social standing. Other critiques treated the Soviet claim of the disappearance of a class society as a necessary fiction of the regime. Discussions of
mobility and stratification were enriched in the 1980s by culturalist accounts of class consciousness as a formed and learned identity, drawing especially on E. P. Thompson’s work on nineteenth century Britain. These directly challenged the dominant Soviet conception of class as an ascribed characteristic that had little connection to socioeconomic position (Fitzpatrick 1993). They were central to emerging work on a range of largely unexplored types of social behavior—especially the patterns of dissimulation, unmasking, and stigmatization that structured social identity and the relationship between individuals and the regime. Studies of class have been far less frequent in post-1991 scholarship, however, reflecting the rise of interest in other registers of identity.
3.3 National Identity
Socialist modernization also made claims to cultural progress in ethnic relations while preserving the national distinctiveness of Soviet society. The fate of non-Russian people under Soviet rule was the subject of numerous studies in the Soviet era, including historical studies of Central Asia (Eickelman 1993) and Siberia (Balzer 1997). Western Sovietologists also produced studies of non-Russian people—although these were, with rare exceptions, heavily influenced by cultural evolutionary paradigms. Modern anthropological approaches emerged only later (e.g., Caroline Humphrey’s ethnography of collective farming in Buryatia (1983)). Because nationalism was officially represented as an obstacle to communist internationalism and ostensibly suppressed, studies of post-Soviet ethnic relations reproduce a ‘repressive hypothesis.’ When, the argument goes, ‘political opportunities’ arose in the late 1980s, ethnic conflicts resulted from the discontent of the Russian ‘core’ and the intensifying local nationalisms of the non-Russian periphery (Kaiser 1994). Yet, as Yuri Slezkin has argued, it may be more instructive to consider how socialism cleared the way for contemporary nationalism (Slezkin 1994). Despite opposition between socialism and nationalism on many levels, the principle of national difference was officially sustained in the federated states. The individual ‘nation’ was the only permitted form of social association other than the Communist party itself. This provided an opportunity for group mobilization along nationalist lines when the opportunity arose. How this opportunity was translated into processes of cultural formation after the 1991 break-up is a complex story that is only beginning to be told: identity formation in the Ukraine (Wanner 1998) draws on different resources than that of ‘traditional’ peoples in Siberia (Grant and English 1999).
The prominence of new concepts such as the Russian diaspora and the near abroad—designating Russians living in the republics—also reflects the new, post-Soviet realities of
national co-habitation. Research on population diasporas on the territory of the former Soviet Union offers new conceptual questions: what does it mean to become an ethnic and linguistic minority without having emigrated in any conventional sense (Shlapentokh et al. 1994)? David Laitin (1998) discusses new national identities in the post-Soviet space as outcomes of a ‘tipping game’—a rational choice to learn new languages and cultural norms if one wishes to enjoy the benefits of citizenship. Laitin argues that under post-Soviet conditions, cultural assimilation depends neither on cultural nor linguistic similarity, nor on ethnic closeness, but rather on the degree to which the structural conditions of economic, political, and social life stimulate a desire to choose some and not other forms of identification (Laitin 1998).
3.4 History and Collective Memory
The interpretation of national history was also important for Soviet modernization. Soviet culture, although future-oriented, was deeply rooted in a particular historical vision. Peter the Great stood as a symbol of modernization, linked first to Bolsheviks and then to Mikhail Gorbachev; the French Revolution was a predecessor of the Russian Revolution, and the victory over Germany in World War II completed the genealogy of Russian military glory (Clark 1994). Stories of redemptive suffering in World War II, especially, sponsored belief in the unique position of Russia in world history. They provided a narrative of ‘all-national struggle’ and a precedent for the patriotic mobilization of a ‘cynical and apathetic populace’ (Tumarkin 1994, p. 277). Especially under Brezhnev, these narratives counterbalanced the numerous tensions that strained national unity while giving priority of place to ethnic Russians as the leaders of the Soviet people. After 1991, socialism itself became a form of narrative and came under scrutiny as a subject of historical interpretation.
3.5 Gender and Sexuality
One of the goals of socialist modernization was to solve the ‘woman question.’ Western Sovietology tended either to accept communist claims that gender inequality had no place in the new society—Ella Winter’s (1933) Red Virtue and William Mandel’s (1975) Soviet Women took this line—or to severely criticize communists for indulging in mostly empty rhetoric on these issues, as Manya Gordon did in Workers Before and After Lenin (1941). The latter perspective reflected the tension between the official policy of women’s emancipation and gender equality, and de facto sexism and inequality in everyday life (Lapidus 1978). The post-Soviet period has, by most accounts, seen further deterioration of this situation,
as measured by discrepancies in male and female employment, sexual harassment in the workplace, and sexism in the media. Though some have argued that these phenomena are products of the rapid restructuring of society, most scholars of gender in Russia see connections between the formal promotion of women’s rights under socialism and the current regression. Many have argued that the Soviet constitutional definition of women as workers and mothers is partly responsible for discrimination directed against women in practice. In addition, the imposition of feminist ideology ‘from above’ did not reflect popular support ‘from below’ (Muller and Funk 1993). As the post-Soviet state fell into crisis, it has been unable to defend the full range of formal rights for women. One of the consequences, Mary Buckley (1992) notes, is that women have been increasingly driven away from participation in the public sphere. Sexuality, as distinct from gender, was rarely examined in Soviet studies. Because studies of sexuality are methodologically demanding and complicated, they were severely restrained by Soviet censorship and rigid cultural norms. Peter Stanford’s 1967 book, Sexual Behavior in the Communist World, presented an interesting but fragmented view of sexual practices under socialism. Most work on sexuality in Western Soviet studies was based on the analysis of available legal or literary texts and other cultural artifacts (Dunham 1961, Levitt and Toporkov 1999). Only in the 1990s has more theoretical, fieldwork-based research developed—partly in response to the dramatic reorganization of sexual norms in Russia. Laurie Essig’s research on homosexuality in Russia is exemplary in this regard, offering not only rich empirical material but a critical comparative analysis of the distinctive formation of sexual subjectivity in Russia (Essig 1999).
3.6 Popular and Consumer Culture
During the Soviet years, the ‘shortage economy’ placed a premium on the hoarding, rather than consumption, of goods by individuals. Yet consumer goods were vitally important to socialist ideology. Images of communist abundance were crucial in securing the regime’s legitimacy—especially in the context of the current sacrifices. In reality, the shortage of goods resulted in their informal distribution through a ‘second economy.’ Access to goods was a coveted and largely political privilege: those who possessed it were in some respects analogous to the Western moneyed and propertied class (Osokina 1998). As new goods flow into the post-Soviet space and as the new consumer culture develops, new questions arise. Do the new relations of capitalism dissolve the fabric of pre-existing, socialist-era social relations? Does the international flow of goods homogenize cultures? How are we to understand the capitalist
‘break’: is there a link between the Soviet and post-Soviet consumer cultures (Humphrey 1995)? Russian scholars and Western scholars of Russia alike have shown tremendous interest in the recent flowering of popular culture—a new feature of post-communism. Youth culture, night life, and new media are just a few areas of recent scholarly interest. As more cultural artifacts are imported to Russia from the West, the post-Soviet cultural landscape changes further. The study of popular culture’s contemporary forms, however, requires careful consideration of the peculiar trajectory of the concept in the Soviet and Russian contexts (Barker 1999).
4. New Social Realities and the Future of Post-Soviet Studies
The demise of the Soviet Union in 1991 opened a new chapter in the studies of the region. It also brought considerable confusion to the field. The breakup of the old territorial unit challenged the geographical coherence of the former Soviet Studies. It yielded new geographical groupings. Central Asia, for instance, could be plausibly integrated into Middle Eastern studies, while the Baltic republics, with their effort to join the European Union, could be approached in the context of Western Europe. Yet the shared socialist past underlies considerable similarity in the social, economic, and political development of these countries and precludes easy redefinition of the region. The fall of the Soviet regime made it clear that certain perceptions of the Soviet Union were an integral part of the national fabric in Europe and the United States. Political scientists bore the most criticism for letting these perceptions cloud their understanding of fundamental social developments in the Soviet Union, leading to their collective failure to predict the end of communism. Despite this fact, revisions of totalitarian and modernization theories continue to be produced. New Russia-centered accounts of capitalism’s inevitability—part of a body of ‘transition theory’ modeled on European and North American capitalism—have come to the fore (Burawoy and Verdery 1999). Opposed to such visions of predetermined change are diverse efforts on the part of younger scholars to understand how family, work, and community refashion themselves in creative and resistive processes of change. In spite of the accumulation of empirical research, few Cold War-era studies of Soviet culture were theoretically innovative.
The opening of post-Soviet society and its ongoing transformation pose new challenges for Western cultural theory and studies of culture, from theories of capitalist development to nationalism and models of historical rupture. If taken critically, this challenge presents remarkable opportunities, both for the new Russian and regional studies, and for the study of culture in general.
See also: National Traditions in the Social Sciences; Soviet Studies: Gender; Soviet Studies: Politics; Soviet Studies: Society; Totalitarianism
Bibliography
Balzer M M (ed.) 1997 Shamanic Worlds: Rituals and Lore of Siberia and Central Asia. North Castle Books, Armonk, NY
Barker A (ed.) 1999 Consuming Russia. Duke University Press, Durham, NC
Boym S 1994 Common Places. Harvard University Press, Cambridge, MA; London
Buckley M (ed.) 1992 Perestroika and Soviet Women. Cambridge University Press, Cambridge, UK; New York
Burawoy M, Verdery K (eds.) 1999 Uncertain Transition: Ethnographies of Change in the Post-socialist World. Rowman & Littlefield, Lanham, MD
Clark K 1994 Changing historical paradigm in Soviet culture. In: Lahusen T (ed.) Late Soviet Culture. Duke University Press, Durham, NC
Dunham V 1961 Sex: from free love to puritanism. In: Inkeles A, Kent G (eds.) Soviet Society: A Book of Readings. Houghton Mifflin, Boston, MA
Dunn S, Dunn E (eds.) 1974 Introduction to Soviet Ethnography. Highgate Road Social Science Research Station, Berkeley, CA
Eickelman D (ed.) 1993 Russia’s Muslim Frontiers: New Directions in Cross-cultural Analysis. Indiana University Press, Bloomington, IN
Erlich V 1965/1981 Russian Formalism: History, Doctrine, 3rd edn. Yale University Press, New Haven, CT
Essig L 1999 Queer in Russia. Duke University Press, Durham, NC
Fitzpatrick S 1993 Ascribing class: The construction of social identity. Journal of Modern History 75(4): 745–70
Fitzpatrick S (ed.) 2000 Stalinism: New Directions. Routledge, New York
Gal S, Kligman G 2000 The Politics of Gender After Socialism: A Comparative Historical Essay. Princeton University Press, Princeton, NJ
Grant B, English (ed.) 1999 Neotraditionalism. In: Pika A (ed.) The Russian North: Indigenous Peoples and the Legacy of Perestroika. Canadian Circumpolar Institute, Edmonton and University of Washington Press, Seattle, WA
Holquist M 1990 Dialogism: Bakhtin and his World. Routledge, London; New York
Humphrey C 1983 Karl Marx Collective: Economy, Society, and Religion in a Siberian Collective Farm. Cambridge University Press, Cambridge, UK
Humphrey C 1995 Creating a culture of disillusionment. In: Miller D (ed.) Worlds Apart. Routledge, London and New York, pp. 43–68
Kaiser R 1994 The Geography of Nationalism in Russia and the USSR. Princeton University Press, Princeton, NJ
Kharkhordin O 1999 The Collective and the Individual in Russia. University of California Press, Berkeley, CA
Laitin D 1998 Identity in Formation. Cornell University Press, Ithaca, NY
Lapidus G 1978 Women in Soviet Society. University of California Press, Berkeley, CA
Levitt M, Toporkov A (eds.) 1999 Eros and Pornography in Russian Culture. Ladomir, Moscow
Lotman Iu U 1984 In: Shukman A (ed.) The Semiotics of Russian Culture. Dept. of Slavic Languages and Literature, University of Michigan, Ann Arbor, MI
Muller M, Funk N (eds.) 1993 Gender Politics and Postcommunism: Reflections from Eastern Europe and the Former Soviet Union
Naiman E 1997 Sex in Public. Princeton University Press, Princeton, NJ
Osokina E 1998 Za Fasadom ‘Stalinskogo Izobiliia’: raspredelenie i rynok snabzhenii naseleniia gody industrializatsii. ROSSPEN, Moscow [Forthcoming translation and edition: Our Daily Bread. Edited and translated by Kate Transchel and Greta Bucher. M. E. Sharpe, Armonk, NY]
Shlapentokh V, Sendich M, Payin E (eds.) 1994 The New Russian Diaspora: Russian Minorities in the Former Soviet Republics. M. E. Sharpe, Armonk, NY
Slezkin Y 1994 The Soviet Union as a communal apartment. Slavic Review 53(2): 415–52
Tumarkin N 1994 The Living & The Dead. Basic Books, New York
Volkov V 2000 The concept of Kul’turnost’. In: Fitzpatrick S (ed.) Stalinism: New Directions. Routledge, New York, pp. 210–30
Wanner C 1998 Burden of Dreams: History and Identity in Post-Soviet Ukraine. Pennsylvania State University Press, University Park, PA
Yurchak A 1997 The cynical reason of the late socialism. Public Culture 9(2): 161–88
O. Sezneva
Soviet Studies: Economics
With growing intensity during the years following World War II, Western economists studied the economy of the Soviet Union and similar East European variants bound loosely in the West by the paradigm of Soviet studies. Systematic analysis of the Soviet economy after World War II was motivated by a variety of factors. The dominant pragmatic motivation for studying the Soviet economy was the emergence of the Soviet Union, functioning in a Marxist–Leninist framework, as a major political and potential economic power, and the related Cold War typology of East–West relations as the dominant geopolitical framework in which the competing economic and social systems of East and West would interact. Beyond the obvious interests of national security in the West, analysis of the Soviet economy and similar systems of Eastern Europe was driven by a variety of other forces. Economists understood that the economic system and related economic policies of the Soviet Union were fundamentally different from those found in Western market capitalist economies. These differences were arguably very important in determining how resources were allocated in these different economic systems, the core of the field of comparative economic systems. The Soviet Union became the dominant example of what would later be termed the
administrative command economy. Moreover, economists realized that it would be a challenge to apply the research methods and traditional tools of neoclassical economic analysis to the peculiar organizational arrangements of the Soviet economic system, especially in a setting of Soviet secrecy. In addition to issues of national interest and the inherent scholarly interest in the command economy, study of the Soviet Union was driven by pragmatic factors. The Soviet Union had, by most indicators, grown rapidly in the 1930s, a decade critical to understanding the subsequent Soviet economic experience. Moreover, the Soviet Union emerged from World War II to experience rapid economic growth in the 1950s, a decade when the subsequent pervasive growth slowdown was not evident. In an era when Soviet economic prospects looked bright, the issue of economic performance, and especially economic growth, was viewed as important. In addition, there was growing interest in pragmatic issues such as major sectors and regions, defense spending, foreign trade, and technology transfer for a largely closed economy functioning in an increasingly global economic environment. How would Western market-based firms interact with their Soviet counterparts, the Soviet firms operating in an isolated, secretive, and fundamentally different economic environment? Ironically, similar pragmatic interest would motivate later Western interest in other sectors of the Soviet economy, for example Soviet agriculture, the poor performance of which led to growing Soviet activity in foreign grain markets during the 1960s and thereafter. The growing world demand for energy has sustained interest in energy rich countries, the Soviet Union being a major example. These factors led to a growth of interest in the Soviet economy and recognition of the need for a cadre of scholars whose research interests would focus on understanding resource allocation in the very different Soviet setting. 
This need to know led to the creation of area studies programs, most of which combined the usual postgraduate training in the discipline (economics) with associated area training in disciplines such as geography, political science, languages, and the like. Courses devoted to the Soviet economy and the more general issues of comparative economic systems became common in most colleges and universities. While economists understood the need to apply the tools of economic analysis to the Soviet case, they also understood the importance of a broader perspective. For such a broad scholarly objective, group identity in area studies programs and research centers was attractive, and yet there were costs associated with the pursuit of these objectives. Study of the Soviet Union was a difficult task. Scholars were required to learn Russian and possibly other languages. Moreover, access to all source materials was difficult, and for economists, much effort was required to assemble and to utilize
even minimal amounts of data, so critical for even basic statistical analysis. Soviet isolation from world agencies meant that data sources standard for any Western researcher were unavailable. Survey research was not generally possible, and Soviet official statistical publications (The National Economy of the USSR) contained limited information often of unknown origin and definition. Statistical and other information on the Soviet economy was limited, uneven, and often of questionable reliability. Archival research presented a major challenge notwithstanding the later availability of excellent guides to the information potentially available (Grimstead 1988). Field experience in the Soviet Union was essential, though the scholarly benefits of travel and work in the Soviet Union were difficult to realize in a setting of pervasive secrecy and limited access to information. Finally, in the Western academic setting, it was not always easy to combine the professional demands of the discipline (economics) with the important but peripheral pursuits of area studies. Put differently, the postwar era witnessed major strides in the development of neoclassical economic analysis. In addition to debate over the appropriate paradigm within which to analyze the very different Soviet experience, major areas of development in neoclassical economics, for example, econometric theory and practice, and the development of computer technology for data analysis, were of only limited applicability in our study of the Soviet economy, given the limited availability of meaningful disaggregated data on the Soviet economy. In this context, there was often a pervasive gap between analysis of issues in Western economies and the analysis of comparable issues in Soviet-type economies.
For example, as the availability of data and expanded computer access led to important new developments in the analysis of labor allocation in the West, analysis of labor allocation in the Soviet context remained difficult.
1. Soviet Studies: The Framework in Economics The content of Soviet studies (economics) is very broad, probably best understood from a chronological perspective. Although it is arbitrary to identify specific time periods, the early postwar years were dominated by a major effort to assemble meaningful data series, and to begin an analysis of how the Soviet system actually worked (plan formulation and plan execution), an understanding essential to any meaningful assessment of performance. Thereafter, there was growing interest in sectoral analysis of the Soviet economy and issues of economic reform, especially following the reform attempts of the late 1950s and 1960s. Through the 1970s, economic analysis of the Soviet economy improved, though unevenly. For some sectors such as foreign trade, data availability facilitated more sophisticated analysis, while for other sectors such as labor, data limitations constrained the level of analysis that could be done. In general, data on money and price issues were almost nonexistent. However, through the 1980s, our analysis of the Soviet economy grew in both depth and breadth, with significant attention paid to important subjects such as consumption, investment, income distribution, and technological change. More information became available, new topics were investigated, and new methodological approaches applied, expanded by the new era of perestroika. There is, however, an important sense in which a chronological approach is imperfect. During the transition era of the 1990s, the importance of the Soviet past for understanding the transition continued to be recognized. Likewise, throughout the postwar era, analysts of the Soviet economy recognized the fact that much of the Soviet system as we knew it emerged during the 1930s. Thus research interest in the early years of the Soviet economic experience has persisted, and much research on these early years has helped analysts to understand later contemporary events (Davies 1980, Davies 1990, Nove 1982, Zaleski 1980). In yet another dimension, the breadth of Soviet Studies is evident. While a chronological survey of the postwar era correctly identifies performance as a broad but key focus of research, scholars also developed and sustained an interest in issues well beyond performance narrowly defined. Thus in a country so large, naturally wealthy and diverse, a substantial literature emerged on a broad range of issues such as regional economic development, population, urbanization, and the like. As economists moved beyond the immediate frontiers of the discipline of economics, the richness of Soviet studies became evident. Moreover, there was often a close relationship among disciplines; geographers who studied regional issues often did so with methods familiar to economists; the work of political scientists was important for our understanding of bureaucracy.
2. The Content of Soviet Studies: Economics Although considerable work on the Soviet economy was done prior to World War II, the dominant era of Soviet studies as an identifiable paradigm was the period 1950 through 1991. While intelligence analysis of the Soviet economy was done during World War II, the decade following the end of the war witnessed the growth of interest in the Soviet economy in the United States, Great Britain, France, and Germany in several major directions. First, the Soviet economic system was dominated by state ownership with resource allocation conducted through a national economic plan, a document formulated by central planning authorities and implemented through ministries by industrial enterprises, agricultural enterprises, and other organizations. It was
evident from analysis of the early years of the command economy that if this system was to be understood and its progress assessed as an alternative to market arrangements, it would be necessary to understand how it worked and how well it worked. To achieve these tasks, it would be necessary to assemble meaningful data on national income and its components. The focus on data collection and analysis dominated analysis of the Soviet economy and was complicated by Soviet methods of data collection and dissemination, official Soviet secrecy, and the use of a Marxian (conceptual) framework by the Soviet Union. The initial work on national income (Bergson 1961) became the basis for early assessment of Soviet economic growth and subsequently Soviet productivity, key elements for understanding the broader issues of Soviet economic performance and economic growth. In addition to the development of meaningful data for analysis, it was necessary to understand both plan formulation and plan execution. Early efforts to understand plan formulation were cast primarily within the framework of input-output analysis, though subsequent work would also use mathematical modeling to understand the planning process. Plan implementation was the focus of early work (Granick 1954, Berliner 1957) combined with extensions into important sectors of the Soviet economy such as monetary and fiscal issues, capital accumulation, labor allocation, consumption, investment, and thereafter foreign trade. From the 1960s through the 1970s, there was increasing attention paid to the Soviet economy, especially a growing number of in-depth sectoral studies, for example, in agriculture, and the growing use of econometric methods for statistical analysis and modeling (Green and Higgins 1977). Attention to the gathering of data took a major step forward in the late 1970s when the Soviet Interview Project (Millar 1987) resulted in a major database on the Soviet household.
At the same time, interest in Soviet economic reform continued, albeit with a recognition that reform in the Soviet Union was having very limited impact on performance (Hewett 1988) even during perestroika, the last major reform of the Soviet era. Western analysis of the Soviet economy continued to evolve through the 1980s with continuing emphasis on economic growth and productivity (Ofer 1987). In an era of declining performance, new themes emerged, while the analysis of basic issues was sustained with important extensions. For example, while the theory of planning was in a sense the macroeconomics of the Soviet economy, the pervasive shortages and growing purchasing power of citizens led to interest in issues such as monetary overhang, arguably the beginnings of new thinking about the Soviet macro-economy. Interest in sectoral allocation continued with growing interest in capital allocation and especially the issue of technological change. Moreover, analysis of
the household permitted greater depth in the analysis of labor allocation. This depth was important, because it brought Western analysis of the Soviet economic system much closer to levels of sophistication present in the West and facilitated better understanding of the issues underlying Soviet economic growth and especially the continuing growth slowdown. The latter years of the Soviet era witnessed the development and application of new theoretical and statistical methods. For example, the theory of the Soviet firm, so critical to understanding Soviet plan implementation, had been developed by Western economists virtually in isolation from Western analysis of the firm. While variations of profit maximization dominated Western analysis of the firm in market settings, the analysis of the Soviet firm was thought to be more readily understood within the context of organization theory. This distinction was eroded as neoclassical theory placed new emphasis on property rights, agency theory, and information economics. Western analysis of the Soviet micro-economy assumed new dimensions, important for understanding the pervasive problems of the Soviet economy such as incentives, information problems, shortages, poor plan fulfillment, lack of technological change, and the like. During the 1980s, disequilibrium economics was applied with increasing frequency to analyze the peculiar allocation arrangements of the Soviet economy, and especially to understand the emergence of the second or informal economy. The development of the concept of the shortage economy (Kornai 1992) provided a broad theoretical underpinning to facilitate statistical investigation of the socialist economic system. Finally, throughout the 1970s and 1980s, interest in the Soviet economy and similar examples elsewhere grew. Emphasis continued to be placed upon developing improved databases.
While the Central Intelligence Agency continued to develop and disseminate data, other agencies provided important sectoral information. For example, the US Department of Agriculture was a major source of data and analysis on agriculture and the Soviet rural economy, while the Foreign Demographic Analysis Division of the Department of Commerce provided and analyzed population and labor force data. During these years, the publications of the Joint Economic Committee on China, Eastern Europe, and the Soviet Union became a basic resource for analysis of these economies. In addition to a growing volume of work on key economic aspects of the Soviet economy, important collections of scholarly work (Bergson and Kuznets 1963, Bergson and Levine 1983) became guideposts to the field of Soviet economics. Throughout the postwar era, the Soviet economic experience was chronicled in a variety of additional sources. In addition to general guides (Konn 1992), attention to data problems (Marer et al. 1992, Treml and Hardt 1972) continued. There are a number of basic texts devoted to the Soviet economic experience
(Gregory and Stuart 2000, Nove 1961, Lavigne 1974, Campbell 1974, Sherman 1969).
3. Soviet Studies: The post-Soviet era Understandably, the collapse of the Soviet empire fundamentally changed the dimensions of Western interest in Soviet-type command economies. Although there has been sustained interest in different economic systems from a scholarly perspective, interest in transition has assumed a central focus, with the demise of the Soviet Union as the primary example of the administrative command economy and the basic changes in the geopolitical framework of East–West relations. Our current interest in the Soviet economic experience is in part as a model for understanding the continuing issue of resource allocation in differing economic systems, interpreting the Soviet era and especially the impact of that era on the contemporary economies of Russia and the newly independent states of the region. See also: Socialist Societies: Anthropological Aspects; Soviet Studies: Geography; Soviet Studies: Politics; Soviet Studies: Society
Bibliography
Bergson A 1961 The Real National Income of Soviet Russia Since 1928. Harvard University Press, Cambridge, MA
Bergson A, Kuznets S 1963 Economic Trends in the Soviet Union. Harvard University Press, Cambridge, MA
Bergson A, Levine H S 1983 The Soviet Economy: Toward the Year 2000. George Allen & Unwin, London
Berliner J 1957 Factory and Manager in the USSR. Harvard University Press, Cambridge, MA
Campbell R 1974 The Soviet-Type Economies: Performance and Evolution, 3rd edn. Houghton Mifflin, Boston
Davies R W 1980 The Industrialization of Soviet Russia, 3 vols. Harvard University Press, Cambridge, MA
Granick D 1954 Management of the Industrial Firm in the USSR. Columbia University Press, New York
Green D W, Higgins C I 1977 Sovmod I: A Macroeconometric Model of the Soviet Union. Academic Press, New York
Gregory P R, Stuart R C 2000 Russian and Soviet Economic Performance and Structure, 7th edn. Addison Wesley Longman, Reading, MA
Grimsted P K 1988 Archives and Manuscript Repositories in the USSR, Ukraine and Moldavia. Princeton University Press, Princeton, NJ
Hewett E A 1988 Restoring the Soviet Economy. Brookings Institution, Washington, DC
Konn T (ed.) 1992 Soviet Studies Guide. Bowker-Saur, London
Kornai J 1992 The Socialist Economy: The Political Economy of Communism. Princeton University Press, Princeton, NJ
Lavigne M 1974 The Socialist Economies of the Soviet Union and Europe. Martin Robertson, London
Marer P, Arvay J, O’Connor J, Schrenk M, Swanson D 1992 Historically Planned Economies: A Guide to the Data. The World Bank, Washington, DC
Millar J R (ed.) 1987 Politics, Work, and Daily Life in the USSR. Cambridge University Press, New York
Nove A 1961 The Soviet Economy: An Introduction. Praeger, New York
Nove A 1982 An Economic History of the USSR. Penguin, New York
Ofer G 1987 Soviet economic growth, 1928–1985. Journal of Economic Literature 25(4): 1767–833
Sherman H J 1969 The Soviet Economy. Little Brown, Boston
The National Economy of the USSR (Narodnoe khoziaistvo SSSR). Annual since 1956. Statistika, Moscow
Treml V G, Hardt J P 1972 Soviet Economic Statistics. University of North Carolina Press, Durham, NC
Zaleski E 1980 Stalinist Planning for Economic Growth, 1933–1952. University of North Carolina Press, Chapel Hill, NC
R. C. Stuart
Soviet Studies: Education The vast majority of Soviet and Western studies of Soviet education since the 1930s have focused in positive ways on the curricular and instructional rigor of the educational system, and its contribution to Soviet social and economic development. The official line that dominated Stalinist and postwar Soviet research argued that the history of Soviet education was essentially unproblematic, with the Communist party–state leading the people upwards into enlightenment through literacy, science and technology, and militant materialism. Perhaps ironically, much of this was echoed in contemporary Western research and in modernization theory, which also considered the contribution of Soviet education to social and economic development to be direct and unproblematic. The only real point of contention was over the ultimate effects of rising literacy and the growth of an increasingly educated mass society. While Soviet scholars argued that these developments would lead to the realization of true Communism, Western scholars tended to argue that they would result in inevitable social differentiation and cultural liberalization on the Western model. Neither group seemed to consider the possibility that the Soviet educational system could be maladaptive or dysfunctional in its ability to contribute to sustainable social and economic development. This unanimity in almost all studies of Soviet education since the 1930s also obscured the distinct historical phases and sharp policy conflicts that actually characterized the history of Soviet education. In fact, this history is much more contentious and significant for comparative international developments than the official Soviet line would suggest. As will be detailed below, the years after the Russian revolutions of 1917 were characterized by an official
Soviet policy of radical progressive education that was very much influenced by, and contributed to, international trends in educational policy. This was followed by increasingly conservative educational policies in the Stalinist 1930s and afterwards, when the enduring foundations of the Soviet educational system were created. In fact, while obscured by Soviet social and behavioral science, which always claimed that Soviet developments were utterly unique, this traditionalist restoration was actually also part of a broader international turn away from Deweyan progressive education in many nations during the middle decades of the twentieth century. Many of these historical shifts and policy changes, and the political and professional struggles that shaped them, have long been obscured by official Soviet historiography. The limits and failures of the Soviet educational system in terms of actual access and achievement have also long been obscured by opaque official Soviet statistics. More recent post-Soviet research, both Western and Russian, has begun to detail how, throughout its history, Soviet education was successful in many ways, and yet also heavily politicized, narrowly vocational, bureaucratically inflexible, and subordinated to the Soviet state and the Communist Party of the Soviet Union. Ultimately, it could be argued that the inflexibility of the system, and its difficulty in adapting to the needs of postwar social and economic development, caused Soviet education to become a serious obstacle to the viability of Soviet socialism. During the post-Stalinist period, from the 1950s through the 1980s, various reforms were attempted in an effort to ‘humanize’ and differentiate the highly bureaucratized and standardized system. The alleged failure of these reform efforts in the 1980s then led, in the post-Soviet years after 1991, to attempted radical reforms, questioning of the entire Soviet legacy, and deep systemic crisis in the 1990s.
1. The Soviet Educational Experiment and Radical Progressive Education The models for what became Soviet education were sketched out in the decades before the Russian revolutions of 1917 in response to the need for rapid social and economic development, and especially in response to the poverty and social dislocations caused by World War One. The fundamental principles of Soviet progressive education included expanding the conception of education and culture to include a broad array of programs for adults, working people, and children of all ages. These policies also entailed the provision of extensive public health and welfare services through educational institutions, and a radical rethinking of the nature of curriculum and instruction to bring learning ‘closer to life’ and industrial work. Finally, educational access and promotion were to be radically expanded to make education open to all
regardless of ethnic identity, class, or gender (Pinkevitch 1929). These early Soviet educational policies were inspired by European and American progressive education, and were hailed as revolutionary by Western radicals and educators. However, Soviet education then went far beyond Western influences in its emphasis on including previously disadvantaged populations, as well as in its increasingly dogmatic emphasis on political indoctrination (Harper 1929). Soviet educational experiments in the 1920s were, in many ways, far ahead of their time (on this and what follows, see Kalashnikov and Epshtein 1927–1929). Soviet educators sought to synthesize general and vocational education for all students in a common curriculum that was known as polytechnicalism, although this rapidly proved impractical, and some vocational tracking was reintroduced. The educational experiments of the 1920s also included the use of interdisciplinary and thematic curricula, and sophisticated developmental instruction that was targeted to stages of students’ psychological and intellectual development. Other innovations included student and faculty self-government, a focus on experiential learning and community service, and the curtailing of most formal assessments and criteria for promotion in the name of equality and universal access to education. Early Soviet education was also characterized by the extensive use of Western-style child study and psychological testing, which was known as pedology in Soviet practice. Resources were also invested in the mass provision of new media, libraries, and extracurricular activities for all ages. Finally, the 1920s also witnessed the development of written languages, new alphabets, primers, and comprehensive educational services for many indigenous peoples and small ethnic minorities. 
Almost all accounts in the 1920s, both Soviet and Western, were extremely positive about these early progressive experiments, although some Western observers such as John Dewey were also critical of the Soviet educators’ emphasis on ideological indoctrination and their philosophical dogmatism (Dewey 1929). For all of the early Soviet educators’ idealism, however, there were clearly also many internal contradictions and serious problems with the progressive educational experiments of the 1920s. For example, while expanding access to previously disadvantaged populations, the regime also sought to deny educational opportunity to members of the prerevolutionary elites, or to imagined political or class enemies. This created serious social tensions, and drove out of the new system many of those who could have become leading teachers and students. Public education was often funded on a ‘residual’ basis, which meant that it was financed only after other budgetary obligations were met, even as the Soviet regime struggled to prevent private financing or tutoring from compensating for inadequate state provision. Nonetheless, early Soviet progressive education, for all its utopianism and politicization, was undeniably an unprecedented and truly revolutionary experiment in the provision of mass education and culture for all.
2. Stalinist Traditionalism and System Building However, as detailed in subsequent Soviet and Western research, the many problems with these early experiments effectively discredited them, and provoked a sharp reaction in the 1930s. Soviet accounts during the Stalinist period tended to blame the progressive educators for their allegedly ‘anti-Leninist’ errors, and often accused them of deliberate sabotage and counter-revolutionary ‘wrecking’ (Svadkovskii 1938). In fact, many of the progressive educators were purged in the 1930s, and their lives and work erased from the historical record, to be partially rediscovered only when the Soviet archives began to open up in the late 1980s (Fradkin 1990). Later Soviet accounts glossed over the distinctive policies of the 1920s, obscured the sharp conflicts of the time, and blamed the progressive educators for their impracticality and adherence to inappropriate ‘bourgeois’ theories (Kuzin et al. 1980). Most Western analyses simply argued that the progressive policies were not conducive to true social and economic modernization, and that it was virtually inevitable that more ‘stable’ and traditional curricula and instruction be restored to fit the Soviet educational system to the needs of modern industrial society (Anweiler 1964). Other Western analyses of the failure of the progressive educators focused on problems such as the acute lack of funding that crippled the attempted reforms in the 1920s, or the disruptions caused by conflicting party–state policies. Later Western authors focused on the popular and political backlash against the progressive experiments, and the broad social support for the upward mobility opened up by Stalinist educational policies (Fitzpatrick 1979).
Almost all accounts since the 1930s agree that the progressive experiments became increasingly unpopular as they were accused of undermining the authority of teachers and parents, weakening curricular integrity and instructional rigor, and failing to produce the highly trained technical specialists and skilled workers needed to ‘build socialism.’ From a broader and more critical perspective, with the coming of Stalinism in the 1930s, the task became less to mobilize and empower all citizens through progressive education, and more to discipline and train subjects for state-directed development. Struggles over educational policy raged on in the 1930s, and resulted in the thorough repudiation of progressivism and the forging of a new and intensely conservative model of Soviet education. This also entailed the rejection of all child-centered or developmental schooling in the name of the struggle against ‘pedological deviations,’ and a return to a
school-centered system and highly ritualistic formal ‘didactics’ (Kairov 1939). This Stalinist model was intensely traditional in curriculum and instruction; thoroughly politicized, especially in the social sciences; heavily dominated by the physical sciences and technology; and rigidly bureaucratized in its governance. From a broader perspective, it could be argued that the system of rigid party–state control over Soviet intellectual and professional life that resulted from these struggles became, over the following decades, debilitating for Soviet education as a whole. The fundamental principles of this Stalinist or traditionalist restoration are well known through the international attention they received in the postwar era, especially after the launch of the first satellite (Sputnik) in 1957. This system of Stalinist education entailed curricula organized along the lines of the traditional subject disciplines, and instruction that focused on the content of knowledge and its reproduction. The system was built upon a canon of standardized textbooks, lesson plans, and teaching guides; and emphasized lecturing, memorization, and rigorous oral examination. A central goal throughout was to reinforce the authority of the teacher or professor over their students, and of school directors and administrators over their staff. Formal assessments and rigid criteria for promotion were elaborated, with harsh sanctions for failure, lack of discipline, or dissidence. Many of the programs that sought to cultivate the languages and cultures of indigenous peoples and national minorities were curtailed, and many ethnic groups subjected to involuntary linguistic Russification and cultural Sovietization, often enforced through police terror. Overall, Soviet education in the 1930s and after became increasingly rigorous, hierarchical, and formal (Medynskii 1951). 
Stalinist education was permeated with a spirit of discipline and order, and was closely coordinated with the Soviet planned economy and national security state. There were, undeniably, real successes for Soviet education in the Stalinist period, especially in raising literacy rates and building a comprehensive public education system. It also seems that the traditionalist restoration was quite popular with the new Soviet elites, many professional educators and teachers, and many culturally conservative parents. The system did produce unprecedented numbers of technical specialists and skilled and semiskilled workers, albeit in increasingly narrow vocational specializations. The system also proved successful at research and development, especially in mathematics, the physical sciences, and military technologies. As the Stalinist regime embraced conservative educational practices and cultural values, it also fostered significant compensatory legitimacy for Soviet socialism, as demonstrated by both Soviet and émigré memoirs. Yet for all its claims to universality and unanimity, unofficial cultural and educational traditions did survive, especially among religious and national minorities, as
well as among the Soviet intelligentsia in the form of unsanctioned teaching practices, and private and family tutoring (Kline 1957). Official Soviet research in this period was, by definition, triumphal about the onward and upward march of literacy and enlightenment, and the indispensable leadership of the Soviet party–state (Konstantinov and Medynskii 1948). There was also a distinct tone of xenophobia, with Soviet developments depicted as both indigenous and superior to all comparable international trends. The emphasis on the role of state power tended to obscure the more complex relations between the party–state and admittedly dependent elements within the profession, as well as the role and interests of the Soviet public. While the disillusioned Western progressive George S. Counts was quite harsh in his judgment of the repressive and dogmatic aspects of Soviet education, he also acknowledged its rigor and utility for a forced-march version of modernization (Counts 1957). Almost all other Western and Soviet émigré accounts were positive in their judgment of the system’s pedagogical principles and effectiveness, even if they sometimes acknowledged its rigidity and limitations. However, alongside the successes and popularity of the conservative policies, there were severe problems as well. The human cost of Stalinist terror in education and culture was devastating, and resulted in the repression of many talented researchers, educational leaders, and teachers, especially among the national minorities. While adhering to the basic principles of modernization theory, some Western authors did draw attention to the more ambiguous effects of standardized Soviet education for the development of the non-Russian peoples of the Soviet Union (Bereday et al. 1971). Soviet education in this period became increasingly dogmatic and dominated by rigid official orthodoxy, with especially devastating effects in the social sciences and humanities.
The system became top heavy and driven by administrative interests, with little space for professional autonomy or innovation. Research was often separated from teaching, and there was a clear assumption that theory and content should drive instructional practice. Furthermore, educational institutions and professional networks were isolated from one another through bureaucratic parallelism, as each economic sector and industry created its own educational and research hierarchies. Educational institutions and even individual teachers were closely supervised, even as many contrived elaborate and creative ways to subvert party–state controls. Most importantly, the entire system was structured into narrow vocational specializations that limited the development of the job skills and professional talents that were needed for sustainable postwar social and economic growth. These more critical perspectives only began to coalesce in the 1990s as some post-Soviet and Western authors began to focus on the more maladaptive or dysfunctional aspects of Stalinist
educational policies. In conclusion, and taken as a whole, the Stalinist reforms in Soviet education successfully built a mass public education system, but at the price of bureaucratic rigidity, intellectual stagnation, and vocational obsolescence.
3. The Potential and Limits of Postwar Reform Ironically, even as Western researchers were popularizing the curricular and instructional rigor of Soviet education in the post-Stalinist period after 1953, Soviet educators began to debate the inflexibility and limits of their own system. Attempted postwar reforms encompassed both more ‘progressive’ as well as quite ‘traditional’ ideas. For example, an attempted reform associated with Communist Party leader Nikita Khrushchev in 1958 sought to restore the polytechnical principles of the system, only this time as more rigorous and involuntary vocational tracking. This attempted reform sought to ‘force down’ the educational aspirations of the Soviet public through a requirement that all students entering higher education have at least two years of work or production experience (Goncharov and Korolev 1960). Significantly, this reform was essentially blocked by the combined resistance of enterprise managers, parents concerned about their children’s upward mobility, and more traditionalist educators resistant to such experiments. The regime also attempted, throughout the post-Stalinist period, to strengthen proper ‘civic–political’ socialization in response to what was seen as growing juvenile delinquency and alienation from official ideological values. Other reforms in the post-Stalinist period reflected the progressive innovations of the 1920s, especially the tradition of psychological research pioneered by Lev S. Vygotskii (1896–1934, see Vygotskii 1991). Vygotskian cultural psychology and developmental instruction, which also received significant attention from Western researchers in the 1980s and 1990s, may well prove to be the most internationally significant and enduring contribution of Soviet education. Other attempted reforms included greater curricular differentiation, the introduction of electives for students, and the development of programs for elite education in the arts, sciences, and sports (Matthews 1982).
There was also a relaxation of linguistic Russification, with relatively more flexibility offered to educational institutions in the national minority regions. Yet for all the potential of these reforms, they were often thwarted by the combined power of the party–state regime and the conservative educational establishment, especially in the various republic-level Ministries of Education and the Academy of Pedagogical Sciences.
Taken as a whole, the reform movements in Soviet education from the 1950s through the early 1980s did have the effect of consolidating the system, while also diversifying and enriching curricular and instructional practices. Perhaps most importantly, the Soviet educational system by the 1970s had trained a vast new body of professional teachers, and essentially achieved universal literacy and nearly universal secondary education. Not surprisingly, official Soviet accounts focus on these achievements with little or no acknowledgement of any persistent problems or struggles over policy or practice (Kolmakova et al. 1986). Yet these achievements were, to a significant degree, gained at the expense of individual and professional freedom, creativity, and systemic flexibility. For all of the system’s rhetoric of equality and universality, severe inequities persisted, especially between urban and rural areas, and literacy and educational provision lagged behind in many national minority areas. The Soviet educational system was also slow to introduce new instructional technologies, and was especially weak in the provision of nonpolitical continuing education and vocational retraining. It seems that little Western attention was paid to these systemic problems, perhaps in part because the significant Western research on Soviet education, which began in the 1950s, had narrowed considerably by the late 1970s. It also seems that international cooperation in this area was constrained by the conservatism of the neo-Stalinist educators and officials who continued to dominate the Soviet system into the early 1980s.
4. The Crisis of Soviet Education and its Ambiguous Legacies
The final phase of Soviet education in the 1980s and early 1990s witnessed both radical reforms and severe systemic crises. Several attempted reforms in the mid-1980s sought to tighten vocational tracking and increase party–state controls, but in a very real sense, Soviet educated society as a whole, as well as leading elements within the educational profession, were no longer willing to tolerate such reactionary policies. There was a growing sense that Soviet public education was in crisis, which led to the emergence of an ‘innovation movement’ among educators and teachers. This movement sought to humanize and democratize Soviet education; to loosen party–state controls over curriculum, instruction, and governance; to recover the spirit of the child-centered progressive experiments of the 1920s; and to expand contacts with international researchers and practitioners (Eklof and Dneprov 1993).
After the collapse of the Soviet Union in 1991, the newly independent republics embraced an array of educational policies. Some tried, unsuccessfully, to simply preserve the old Soviet system, as in Belarus and parts of Central Asia. Others struggled to radically reform the old system through attempted privatization and decentralization, as in the Baltic states and the Russian Federation (on Russia, see Dneprov et al. 1991). The reforms of the 1990s have, undeniably, unleashed a veritable pedagogical revolution of innovation in curriculum and instruction, and finally joined together the long isolated Soviet educational community with its international counterparts. Much of this reintegration into the world community of educational research and practice has been facilitated by an unprecedented level of international assistance provided to post-Soviet educators and students in the form of research grants and exchanges. Yet all of the Soviet successor states have experienced severe if not catastrophic crises in the funding and provision of public education, and both excellence and equity have suffered. In other words, now that post-Soviet educators have finally rejoined the world community from which they have been isolated since the 1920s, many lack the financial resources to remain in the profession or to take full advantage of the new opportunities for international or even domestic professional cooperation and development. Furthermore, the economic and demographic crises that have gripped the entire region threaten the health and readiness to learn of virtually an entire generation of children and young adults.
5. Conclusions
In conclusion, Soviet education both shaped and constrained the building of Soviet socialism. While Soviet and Western studies of Soviet education diverged sharply on some points of interpretation, almost all agreed that the Soviet educational system, as built in the 1930s and as endured into the 1980s, was fundamentally functional and conducive to modernization. While later Russian and Western authors have expressed some skepticism about the viability and functionality of the Soviet educational system, it did undeniably contribute, if in complex ways, to the development of a modern and literate mass society. Finally, the most rigid and repressive aspects of Soviet education were repudiated in the 1990s, and there have been some real successes in recent reform efforts. However, these years may ultimately be judged a historical tragedy of monumental proportions if the legacy of secular, free, and public education for all that was pioneered by Soviet educators is lost as well.
Bibliography
Anweiler O 1964 Geschichte der Schule und Pädagogik in Russland, vom Ende des Zarenreiches bis zum Beginn der Stalin-Ära. Quelle & Meyer, Heidelberg, Germany
Bereday G Z F, Pennar J, Bakalo I 1971 Modernization and Diversity in Soviet Education. Praeger, New York
Counts G S 1957 The Challenge of Soviet Education. McGraw Hill, New York
Dewey J 1929 Impressions of Soviet Russia and the Revolutionary World, Mexico–China–Turkey. New Republic, New York
Dneprov E D, Lazarev V S, Sobkin V S (eds.) 1991 Rossiiskoe obrazovanie v perekhodnoi period: programma stabilizatsii i razvitiia [Russian Education in the Transition Period: Program of Stabilization and Development]. Ministry of Education of the RSFSR, Moscow
Eklof A B, Dneprov E D 1993 Democracy in the Russian School: The Reform Movement in Education since 1984. Westview, Boulder, CO
Fitzpatrick S 1979 Education and Social Mobility in the Soviet Union, 1921–1934. Cambridge University Press, New York
Fradkin F A 1990 A Search in Pedagogics: Discussions of the 1920s and 1930s. Progress, Moscow
Goncharov N K, Korolev F F (eds.) 1960 Novaia sistema narodnogo obrazovaniia SSSR: Sbornik dokumentov i statei [The New System of Public Education in the USSR: Collection of Documents and Articles]. Academy of Pedagogical Sciences of the RSFSR, Moscow
Harper S N 1929 Civic Training in Soviet Russia. University of Chicago Press, Chicago
Kairov I A 1939 Pedagogika [Pedagogy]. Uchpedgiz, Moscow
Kalashnikov A G, Epshtein M S (eds.) 1927–1929 Pedagogicheskaia entsiklopediia [Pedagogical Encyclopedia], 3 vols. Rabotnik prosveshcheniia, Moscow
Kline G (ed.) 1957 Soviet Education. Columbia University Press, New York
Kolmakova M N, Gellershtein L S, Ravkin Z I (eds.) 1986 Ocherki istorii pedagogicheskoi nauki v SSSR (1917–1980 gg.) [Notes on the History of Pedagogical Science in the USSR, 1917–1980]. Pedagogika, Moscow
Konstantinov N A, Medynskii E N 1948 Ocherki po istorii sovetskoi shkoly RSFSR za 30 let [Notes on the History of the Soviet School in the Russian Soviet Federated Socialist Republic at 30 Years]. Gosuchpedgiz, Moscow
Kuzin N P, Kolmakova M N, Ravkin Z I (eds.) 1980 Ocherki po istorii shkoly i pedagogicheskoi mysli narodov SSSR, 1917–1941 gg. [Notes on the History of the School and Pedagogical Thought of the Peoples of the USSR, 1917–1941]. Pedagogika, Moscow
Matthews M 1982 Education in the Soviet Union: Policies and Institutions Since Stalin. Allen and Unwin, London
Medynskii E N 1951 Public Education in the USSR, 2nd edn. Foreign Languages Publishing House, Moscow
Pinkevitch A P 1929 The New Education in the Soviet Republic. John Day Company, New York
Svadkovskii I F (ed.) 1938 Protiv pedologicheskikh izvrashchenii. Sbornik statei [Against Pedological Deviations. Collection of Articles]. Gosuchpedgiz, Moscow-Leningrad
Vygotskii L S 1991 Pedagogicheskaia psikhologiia [Pedagogical Psychology] (Davidov V V, ed.). Pedagogika, Moscow
M. S. Johnson
Soviet Studies: Environment
Environmental studies are by nature interdisciplinary, and this holds for Soviet studies of the environment. Yet approaches to the study of the Soviet environment were, like many topics, politicized within the USSR. Secrecy and ideological constraints made it difficult for Soviet biologists, geographers, and economists to discuss the full extent of pollution or the structural factors responsible for environmental waste and degradation. Political aspects of pollution were especially sensitive. Social scientists could at best only obliquely criticize the central planning, bureaucratic socialism, and militarized economy that were responsible for many of the country’s environmental ills. Western experts had to rely on fragmented and incomplete data, yet they managed to construct reasonably accurate accounts of Soviet environmental practice.
The Soviet approach to the natural environment encapsulated, in extremis, all the elements that James C. Scott (1998) has identified as necessary for disastrous twentieth-century social engineering: the administrative ordering of nature and society; a high-modernist ideology relying on science and technology for problem solving; an authoritarian state fully willing to use its coercive powers for social experimentation; and a prostrate civil society unable to resist the authoritarian state’s transformative plans. Seven decades of rapid, forced industrialization, ill-conceived agricultural experiments, botched industrial and irrigation projects, waste and mismanagement of natural resources, and disregard for the human consequences of development left a legacy of environmental destruction unrivaled in Europe or North America. With the collapse of communism the obstacles to serious environmental analysis have been removed; however, economic decline and political paralysis in the postcommunist era have severely constrained the former USSR’s ability to address its environmental crisis.
1. Ideology and Attitudes
Soviet practices and policies toward the natural environment mirrored the political system’s raison d’être. In ideology, the Soviets turned the environmental determinism of Marxism on its head, glorifying the radical transformation of nature through sheer political will. Soviet leaders formulated socialism into a doctrine that equated anthropogenic transformation of the environment with progress. As political scientist Joan DeBardeleben noted in The Environment and Marxism–Leninism (1985), ideology provided a conceptual framework that shaped and limited policy discourse. The Soviet dictatorship showed respect neither for nature nor for humankind. Both were subordinated to the regime goal of constantly expanding material production, without regard for the wellbeing of the population, their quality of life, or the natural environment. From an environmental perspective, the Soviet fixation with production raised the question: production to what end? However, Soviet scholars were not allowed to address such questions: while environmental economists in the West discussed capitalist production in terms of the external costs associated with industry, Soviet economists had to echo the Party position that externalities were not possible in a socialist economy.
Soviet environmental studies, policies, and practices were molded and distorted by communist ideology. In the Communist Manifesto, Marx and Engels had praised capitalism’s success in subjecting nature to man and machinery; Soviet communism’s goal was to surpass this record of accomplishment. In addition, Marxism’s labor theory of value relegated natural resources to a position of secondary importance in the economy. This attitude was compounded by the belief that the USSR, by far the largest nation on earth, had illimitable stores of coal, land, timber, natural gas, and minerals. Only late in the history of the USSR did the regime acknowledge limits to exploiting nature. Even then, scholars focused on narrow technical issues of resource use and pollution, to avoid placing blame on government policies (Gerasimov et al. 1971).
2. Industrial Development and Pollution
Vladimir Lenin was favorably inclined toward conservation, including the system of nature preserves (zapovedniki) that predated the revolution. His primary emphasis, however, was on modernizing Russia in order to lay the foundations for communism. Joseph Stalin’s comprehensive program of forced industrialization and agricultural collectivization embodied a strong anti-environment bias. Soviet developmental policies resulted in an all-out assault on nature by planners and workers. This determination to transform the natural world was comparable to the high modernist ideology found in late nineteenth- and early twentieth-century industrializing nations; layered on top, however, was the Communist Party’s goal of subordinating all social and productive forces to its conscious guidance, and the insulation of the governing elites behind the regime’s authoritarian political structure. The focus on industrial growth as the system’s preeminent goal made continuous, linear progress a vital component of regime legitimacy, particularly after Marxist–Leninist ideology lost its appeal.
Maintaining conservation values was problematic in the highly charged political atmosphere of Soviet politics. Some scientists urged caution in exploiting the natural environment, even during the most repressive period of Stalinism, at great personal risk, and small, relatively independent social movements dedicated to protecting the environment persisted into the Stalin era. Historian Douglas Weiner describes these subtle forms of resistance in his book A Little Corner of Freedom (1999). But all of Soviet science, including ecology and conservation, was adjusted to support the regime’s political goals. Weiner’s earlier Models of Nature (1988) details the impact of Stalinism on conservation in the Soviet Union. For example, acclimatization (the introduction of non-native species into certain environments in order to transform nature and enhance productivity) was advocated by the biologist Trofim Lysenko during the Stalin and Khrushchev eras. Constrained by a poor understanding of science and arrogance toward nature, the regime’s acclimatization efforts proved disastrous for both the introduced species and the local ecology. Soviet biologist Zhores Medvedev chronicled how ideology, politics, and careerism affected Soviet genetics in The Rise and Fall of T.D. Lysenko (1969). More conventional genetics replaced Lysenkoism by 1966, after Party First Secretary Nikita Khrushchev had been ousted.
Geographers and economists wrote the earliest Western studies of the Soviet environment. Philip Pryde’s Conservation in the Soviet Union (1972) and economist Marshall Goldman’s The Spoils of Progress (1972) laid the groundwork for later studies. These scholars found the major polluters of the environment to be heavy industry, particularly the energy, chemical, and metallurgical sectors favored by Soviet planners. Large urban areas suffered air and water pollution caused by dense concentrations of steel plants, oil refineries, ore smelters, and pulp and paper mills. Automobile pollution was generally less of a problem, since few people owned cars, but poorly maintained buses and trucks spewed tons of pollutants into the air. Contamination from radioactive waste, military installations, and nuclear power plants posed another serious hazard to public health. Agricultural runoff and inadequately treated industrial wastes polluted much of the Soviet Union’s surface and subsurface water sources.
3. The Late Soviet Era
Public awareness of Soviet environmental issues emerged in the late 1960s, roughly parallel with the growth of environmentalism in the developed, industrialized democracies. The Soviet regime’s focus on industrial development, official secrecy toward all systemic failings, and Communist Party assertions that environmental degradation characteristic of capitalist economies was not possible under socialism made it dangerous to criticize government policies openly. But the threatened pollution of Lake Baikal, the largest body of fresh water on the planet, led to open controversy over the building of two cellulose plants on its shores. Grigorii Galazii, Director of the Limnological Institute of the Siberian Academy of Sciences, led a crusade in the controlled press against the danger to the lake. Blunter criticism was expressed in an underground (samizdat) publication, The Destruction of Nature, published in the West by a pseudonymous Boris Komarov (Ze’ev Wolfson) (1978).
Carefully managed public discussions on the environment did take place during the Brezhnev era, largely to highlight the regime’s ‘commitment to environmental protection’ and publicize ineffective factory managers, bureaucrats, or poachers. Genuine discussion of environmental problems, and serious efforts to combat pollution and environmental degradation, however, were constrained by the authoritarian character of the system and the emphasis on covering up failings. A major theme in Komarov’s manuscript was the pernicious effect of censorship and secrecy. The Soviet military frequently invoked national security to justify a range of environmentally destructive practices, such as locating cellulose plants to produce cord for aircraft tires on Lake Baikal, dumping nuclear wastes underground and in the oceans, and poaching in zapovedniki. But environmental destruction took its toll on the military, as it did on civilians. Recruits were often unfit for military duty because of poor health. Unhygienic living conditions, including contaminated drinking water and air pollution, led to a high incidence of dysentery, skin disease, and respiratory illnesses within the military (Feshbach and Friendly 1992).
The Soviet experience forced environmentalists and economists to rethink many of their assumptions. Economic theory had held pollution to be an externalized cost to society resulting from production by private enterprise. Theoretically, social ownership internalized these costs, and a socialist economy therefore should pollute less than a capitalist economy. However, natural resources were not treated as scarce in the Soviet economy, which resulted in tremendous waste and inefficiency. A dominant theme in studies by Western economists (Goldman 1972), geographers (Pryde 1972, 1991), and political scientists (Ziegler 1987) was that concentrating power in the Communist Party and state bureaucracy inhibited the formation and implementation of responsible environmental policies.
Socialist, centrally planned economies did not fully internalize external costs; rather, the absence of clearly identifiable property owners encouraged waste and abuse of the environment.
Soviet agriculture was possibly more environmentally destructive and more wasteful of natural resources than industry. Millions of hectares of land were impoverished or eroded by unscientific attempts to extract higher yields. Lysenko’s ill-considered experiments in agrobiology caused significant damage to Soviet agriculture, as Medvedev (1969) documents. Khrushchev’s Virgin Lands project and his determination to plant maize on unsuitable land constituted additional examples of gross agricultural mismanagement. Chemical fertilizers and pesticides were overused and applied haphazardly on the collective and state farms through ignorance, carelessness, and a skewed incentive structure. In Central Asia, diversion of the Amu Darya and Syr Darya rivers for cotton irrigation depleted the Aral Sea, saturated the land with salt and chemicals, and altered the region’s microclimate. Structurally, socialized ownership of agricultural land generated a psychology in which farm workers and administration valued only short-term production and ignored long-term damage to the earth. Farmers were more careful with their private plots (and these were far more productive), but only 3 percent of land was set aside for private agriculture.
Western political scientists naturally argued that politics played an important role in Soviet environmental degradation, claiming authoritarian political systems were less responsive to environmental concerns than democratic, pluralist systems. Charles Ziegler, in Environmental Policy in the USSR (1987), found that the lack of accountability to voters and the absence of environmental interest groups in the USSR ensured that decision makers could safely ignore public opinion. Until Gorbachev’s glasnost eased controls, scholars within the USSR could publicly criticize neither the general policies nor the specific decisions of high Party and government officials.
Weak adherence to the rule of law also contributed to environmental degradation. The Soviet government adopted impressive environmental and conservation legislation, but these laws were largely symbolic and were consistently ignored by factories and ministries tasked with meeting production quotas. Enforcement of environmental regulations by government agencies was sporadic, and environmentalists could not utilize lawsuits, an effective tool of their Western counterparts.
Soviet scientific studies and government pronouncements consistently maintained that socialist economic planning, the national ownership of land and resources, and centralized regulation ensured a superior approach to environmental protection that could not be realized under capitalism. But in the Brezhnev period cautious scientists admitted that, in practice, natural resource use was inefficient and wasteful, and that erosion and water pollution were serious problems (Gerasimov et al. 1971).
Studies by Western scholars suggested that Soviet environmental performance was no better than that of capitalist states, and often much worse, due to the distorting influence of bureaucratic state socialism. Using a comparative approach, political scientist Barbara Jancar found that political–economic factors in one-party systems played a greater role in defining environmental management than did regulation, regardless of whether the state was centralized, as was the USSR, or decentralized, as was Yugoslavia (Jancar 1987).
Perhaps unique to the Soviet Union was the environmental impact of the militarized Soviet economy. Military-related production accounted for approximately one-quarter of the total gross domestic product. Environmental concerns were routinely sacrificed in the interests of national security and, at least until glasnost, questioning military imperatives was forbidden. Issues like global warming might be treated skilfully, as they were in The Night After … Climatic and Biological Consequences of a Nuclear War (1985), an edited collection of articles by prominent Soviet scientists. Nonetheless, these essays were published through an official government outlet and served a Soviet national security priority; namely, mobilizing opinion against the Reagan administration’s Strategic Defense Initiative.
Given official secrecy and the politicizing of environmental issues, simple descriptive studies were an important contribution to the field, and a damning indictment of the Soviet system. Both Feshbach and Friendly (1992) and Komarov (1978) described how military bases contaminated towns and coastlines with fuel and lubricants, dumped garbage and scrap equipment indiscriminately, and polluted groundwater supplies. Troop exercises destroyed forests and ruined agricultural land. More pollution was generated by the production and storage of chemical weapons, while the production and testing of Soviet nuclear weapons contributed to radioactive pollution at the Semipalatinsk range in Kazakhstan and Novaya Zemlya in the Arctic. The Soviet navy dumped 135 reactors from decommissioned nuclear submarines in the waters around the Barents and Kara seas, and disposed of liquid nuclear waste from the Russian Far East by discharging it into the Sea of Japan. In postcommunist Russia the military continued to obfuscate and deny responsibility for environmental destruction, and officers were prosecuted for exposing the military’s role in pollution.
Environmental legislation meant little in a nation not based on constitutional principles. Impressive-sounding resolutions on air quality were adopted by the Council of Ministers as early as 1949; in the following decades additional laws were passed to protect water, forests, wildlife, and agricultural land. Scholars and officials could point to such progressive legislation as examples of the regime’s commitment to the environment (see Galeeva and Kurok 1979, for the range of legislation).
In reality, Soviet legislation consisted of grand pronouncements and largely unenforceable standards. Producers routinely violated environmental laws with impunity. Enforcement was entirely in the hands of government bureaucracy and, given the regime’s emphasis on rapid and continuous economic growth, the enforcement of environmental laws that might interfere with production was actively or implicitly discouraged (Ziegler 1987).
4. Reform and Collapse
Environmental issues played a key role in the liberalization of the Soviet system during the Gorbachev era. By officially acknowledging the scope of environmental degradation, however grudgingly, the Communist Party admitted failing in its promise to provide a better life for its people, undermining its claim to legitimacy. Guarded criticism had been voiced by scientists, journalists, and literary figures in the 1960s and 1970s on the issues of air and water contamination, the Aral Sea disaster, and planned diversion of Siberian rivers southward to Central Asia. With the easing of censorship under glasnost, and widespread fears of government deception and incompetence after the April 1986 explosion at the Chernobyl nuclear reactor in Ukraine, environmental activism mushroomed and scholars turned their attention to the new environmental politics (Ziegler 1991, Pryde 1991, pp. 246–65).
Public opinion surveys discovered that an overwhelming majority of the population was concerned about environmental problems and the threat to public health. Environmental groups dominated the new ‘informals’ (neformaly), nascent interest groups that emerged in the late 1980s. Women, concerned as they were with pollution-related birth defects and the health of their children and families, figured prominently in these movements.
Many of these spontaneously formed environmental groups linked environmental with national and separatist demands, especially in the more nationalistic Caucasian and Baltic republics. Under Soviet federalism Moscow treated the non-Russian republics like colonies, exploiting their natural resources, degrading the land, locating polluting industries on their territory, and in the process jeopardizing the health of the population. Soviet development strategies also led to an influx of ethnic Russians, who were concentrated in vital economic and political positions within the republics and who diluted the ethnic homogeneity of titular nationalities. Once political controls were relaxed, national discontent coalesced around the more obvious sources of Russian dominance: the Ignalina nuclear power plant in Lithuania, chemical plants in Armenia, nuclear testing in Kazakhstan, and oil pollution along Azerbaijan’s coastline.
5. Environmental Studies and the Transition
Postcommunist Russia no longer promotes a transformationist ideology. Full representative democracy has proved elusive, but the government is no longer authoritarian. Environmental information is easily available, and there are few formal restrictions on environmental activism. The Transboundary Environmental Information Agency, for example, regularly publishes an electronic Russian Environmental Digest of newspaper and magazine articles. Drawing in part on studies of the Soviet environment, political scientists have theorized that democratic political systems should be more responsive to public demands for environmental protection (Payne 1995). But transitional Russians have become more concerned with acquiring material goods than with cleaning up the environment, notwithstanding the serious health implications of the legacy of Soviet pollution. These problems (which include higher morbidity and declining longevity rates, widespread birth defects, and outbreaks of such infectious diseases as diphtheria, cholera, and typhoid) have prompted a demographer–journalist team (Feshbach and Friendly 1992) to propose the Soviet Union’s epitaph as death by ecocide.
Russia’s natural environment may have improved in the 1990s as a result of the precipitous decline in industrial production, but the evidence is mixed. Viktor Danilov-Danilyan, head of Russia’s State Committee for Environmental Protection, claimed air pollution from industrial sources at the beginning of 1999 was down 35 percent from 1991, while water pollution had decreased by 15–18 percent (Wren 1999). But Boris Porfiriev (1997), a systems analyst with the Russian Academy of Sciences, estimated that in 1995, 41 percent of Russia’s population lived in cities where the maximum permissible concentrations of pollutants were exceeded, compared with only 18 percent in the USSR in 1989. While approximately 25 percent of Soviet waste water went untreated in 1989, 40 percent was untreated in Russia in 1995. Overall, investments for environmental protection measures decreased 20 percent between 1990 and 1995 (Porfiriev 1997, pp. 147, 152).
Environmental protection is guaranteed in the 1993 Constitution, but legal provisions remain inadequate. Resource-exploiting companies and ministries occupy advantageous, monopolistic positions relative to environmental agencies and lobby groups. Large budget deficits constrain Moscow’s ability to address a range of social problems, of which the environment is only one. Local and national priorities often clash over development and environmental protection issues. Bureaucratic arrogance and ignorance, political neglect, entrepreneurial exploitation, and public indifference relegated environmental protection to a low priority on Russia’s political agenda during the first postcommunist decade.
And in a move disturbingly reminiscent of Soviet practices, in 2000 President Vladimir Putin dissolved Russia’s State Environmental Committee, distributing its oversight functions to the Ministry of Natural Resources, an agency tasked with exploiting the environment.
With political restrictions largely gone, and institution-building on the agenda, a proliferation of environmental studies by social scientists might have been expected. This has not been the case. Western studies of Russia’s transition have focused on privatization, the development of a market economy, ethnic questions, and democratization, with few according much attention to the natural environment. Studies by conservationists Michael Wells and Margaret Williams (1998) and Swedish economist Patrik Soderholm (1999) found that the absence of legal, administrative, and regulatory institutions in transitional Russia complicated environmental protection measures. Russian natural scientists have continued to pursue environmental studies, but social scientists have not devoted much attention to environmental issues. Several articles in the 1998 and 1999 issues of the Russian journal Military Thought addressed various aspects of military-induced pollution, and Boris Porfiriev (1997) has dealt with legal and organizational issues. However, there is still a great deal of work to be done on the Russian environment by researchers both inside and outside the country.
See also: Communism; Economic Transformation: From Central Planning to Market Economy; Environment Regulation: Legal Aspects; Environmental Challenges in Organizations; Environmental Change and State Response; Environmental Policy; Environmental Risk and Hazards; Environmental Vulnerability; Environmentalism, Politics of; Environmentalism: Preservation and Conservation; Soviet Studies: Culture; Soviet Studies: Economics; Soviet Studies: Geography; Soviet Studies: Politics; Soviet Studies: Society
Bibliography
DeBardeleben J 1985 The Environment and Marxism–Leninism: The Soviet and East German Experience. Westview Press, Boulder
Feshbach M 1995 Ecological Disaster: Cleaning Up the Hidden Legacy of the Soviet Regime. Twentieth Century Fund Press, New York
Feshbach M, Friendly A Jr 1992 Ecocide in the USSR: Health and Nature Under Siege. Basic Books, New York
Galeeva A M, Kurok M L (eds.) 1979 Ob okhrane okruzhaiushchei sredy: Sbornik dokumentov partii i pravitel’stva (On The Natural Environment: A Collection of Party and Government Documents). Politizdat, Moscow
Gerasimov A P, Armand D L, Yefron K M (eds.) 1971 Natural Resources of the Soviet Union: Their Use and Renewal. WH Freeman, San Francisco
Goldman M I 1972 The Spoils of Progress: Environmental Pollution in the Soviet Union. MIT Press, Cambridge, MA
Jancar B 1987 Environmental Management in the Soviet Union and Yugoslavia: Structure and Regulation in Federal Communist States. Duke University Press, Durham, NC
Komarov B (Wolfson Z) 1978 Unichtozhenie prirody. Possev, Frankfurt, Germany [1980 The Destruction of Nature in the Soviet Union. M E Sharpe, White Plains, NY]
Medvedev Z 1969 The Rise and Fall of TD Lysenko. Columbia University Press, New York
Payne R A 1995 Freedom and the environment. Journal of Democracy 6: 41–55
Porfiriev B 1997 Environmental policy in Russia: Economic, legal, and organizational issues. Environmental Management 21: 147–57
Pryde P R 1972 Conservation in the Soviet Union. Cambridge University Press, Cambridge, UK
Pryde P R 1991 Environmental Management in the Soviet Union. Cambridge University Press, Cambridge, UK
Scott J C 1998 Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed. Yale University Press, New Haven, CT
Soderholm P 1999 Pollution charges in a transition economy: The case of Russia. Journal of Economic Issues 33: 403–10
Velikhov Y 1985 The Night After … Climatic and Biological Consequences of a Nuclear War. Mir, Moscow
Weiner D R 1988 Models of Nature: Ecology, Conservation, and Cultural Revolution in Soviet Russia. Indiana University Press, Bloomington, IN
Weiner D R 1999 A Little Corner of Freedom: Nature Protection from Stalin to Gorbachev. University of California Press, Berkeley, CA
Wells M P, Williams M D 1998 Russia’s protected areas in transition: The impacts of perestroika, economic reform and the move towards democracy. Ambio 27: 198–206
Wren C S 1999 World briefing. New York Times 19 January: 9
Ziegler C E 1987 Environmental Policy in the USSR. University of Massachusetts Press, Amherst, MA
Ziegler C E 1991 Environmental politics and policy under perestroika. In: Sedaitis J B, Butterfield J (eds.) Perestroika from Below: Social Movements in the Soviet Union. Westview Press, Boulder
C. E. Ziegler
Soviet Studies: Gender
This article will discuss the meaning of the term ‘gender,’ the changing pattern of gender roles throughout Soviet history, and the ways in which Western and Soviet writers have tackled gender issues in the Soviet Union from the time of the Revolution in 1917 up to the country’s collapse in 1991.
1. Clarification of Terms
‘Gender’ refers to the personality traits and behavior associated with either men or women that are not rooted in biology but are determined by culture. Western feminists have been careful to distinguish between the terms ‘gender’ and ‘sex,’ the latter referring to the biological differences between men and women. However, Soviet writers and thinkers only acknowledged this distinction shortly before the disintegration of the Soviet Union, when, in 1990, the Centre for Gender Studies was founded in Moscow. The Soviet authorities always insisted, throughout the 70 years of the Soviet Union’s existence, on their commitment to complete equality for men and women, but no attempt was made to analyze or understand the extent to which differences between men and women were biologically or culturally produced. Alexandra Kollontai, the country’s principal theoretician on the ‘woman question’ at the time of the Revolution, did come close to addressing the concept of gender (though without actually using the term), but dismissed it as an irrelevancy, indeed, as a bourgeois preoccupation. What was important, she argued, was that all people, male and female, developed their personal potential. In her words:
Leaving it to the bourgeois scholars to absorb themselves in discussion of the question of the superiority of one sex over the other, or the weighing of brains and the comparing of the psychological structure of men and women, the followers of historical materialism fully accept the natural specificities of each sex and demand only that each person, whether man or woman, has a real opportunity for the fullest and freest self-determination, and the widest scope for the development and application of all natural inclinations (Kollontai 1977, p. 581).
Yet despite their refusal to address the issue of gender openly, Soviet politicians were in reality very much concerned with it, and made determined attempts at different times either to eradicate or strengthen the cultural differences between men and women.
2. Historical Overview
In its first decade the Soviet Union concerned itself with challenging the traditional understandings of appropriate male and female behavior and social roles. At a time when the West was promoting and preserving separate male and female spheres, Soviet propagandists were urging women to leave that female bastion, the home, and enter the public world of work and politics, until then a largely male environment. Women were assured that State institutions would take over their domestic tasks. In any case, the family would become less important as people got used to living communally. Some theorists went so far as to argue that the family would serve no purpose at all in a communist society and in due course would simply ‘wither away.’ However, there was no attempt to rethink both male and female roles. Women were simply to be incorporated into the hitherto male sphere of public service. Nor were women to be truly free of domestic responsibilities. Articles on the organization of communes made it clear that women would still be responsible for the cooking, laundry, child care, etc., but they would now take on this work collectively, for the commune as a whole, rather than individually, for their own families. There was also no question of them choosing not to have children. Rearing the next generation of workers was a patriotic duty, not a choice. Widespread unemployment in the 1920s meant that few women were actually drawn into social production. This began to change in the late 1920s, when Stalin came to power. In 1929 he launched a huge industrialization drive. All workers were now needed, women as well as men, and by 1932 the number of women workers was almost double what it had been in 1928. At the same time, the pledge to free women from their traditional domestic duties was quietly abandoned.
All resources were now required for industry, and if women were still willing to perform those traditional functions for free, the State would hardly fight for the right to fund them. Predictions about the demise of the family also ceased. On the contrary, the family was now held up as the ‘basic cell’ of Soviet society. There are a number of reasons for this change in approach. The first Five-Year Plan had produced an enormous social upheaval, one of the results of
which was an alarming drop in the birthrate, and it was hoped that rehabilitating the family might help reverse this trend. The familiarity of traditional family life might also provide some stability in people’s lives in the face of such radical change. Furthermore, the family now took on a symbolic function; Soviet society as a whole was depicted as one large family, with Stalin the ultimate patriarch. Far from being an outdated institution that had no place in a communist society, the family assumed a new significance. Women now found themselves laboring under a ‘double burden.’ They were required to go out to work: there were simply not enough men to bring about industrialization at the speed or to the extent required by Stalin; but they also had to perform their traditional domestic and nurturing functions. They were to do this with no assistance from men, nor, given the concentration on heavy rather than consumer industry, from any services or products that could relieve the domestic workload. Yet this combination of functions was ironically hailed as evidence of women’s equality, and the basis of their self-fulfillment. The ‘woman question’ was declared solved. When Khrushchev came to power and launched his ‘destalinization’ program, the ‘woman question’ was partially reopened once again. The press on the whole continued to applaud the country’s successful establishment of women’s equality, but some authors now acknowledged that the real situation was more complex, and that women fell behind men in both professional and political hierarchies. At the same time, sociologists began conducting ‘time budget studies’ which made it clear that women devoted an enormous amount of time to domestic work, and suggested that this was a significant factor in their inferior standing in public life. There were now some suggestions, albeit tentative, that husbands might be persuaded to ease their wives’ burden by ‘helping’ them in the home.
From the 1970s, with Brezhnev now in charge, there was another shift in policy. The notion that women’s sense of self-fulfillment rested on their combining the traditionally female roles in the family with the traditionally male roles in politics and social production began to be questioned, largely because of the seemingly intractable demographic implications: the birthrate, at least in the European republics of the Soviet Union, had remained stubbornly low. At the same time, women workers were essential in the Soviet Union’s labor-intensive economy, and it was not possible simply to remove them from the workforce. Instead, there was an attempt to alter the balance between work and family in their lives, to persuade them to see the family as paramount and work as secondary. This meant that they would continue to work, but would devote less time and energy to trying to improve their qualifications and climb the professional ladder. Instead, they would see raising a family as their main source of satisfaction. Accordingly, a media campaign and a new school course were launched to raise the status of the family, persuade women to have more children, and encourage boys and girls to develop more traditional gender-specific traits. If having children had never been a choice for women, the attempted manipulation of their reproductive patterns now reached a new height. Gender, though it was still not named as such, had become a major issue. Theorists now insisted that ‘the study of psychological differences between members of the male and female sex is an important scientific and practical question, no less important than the continuing study of their biological and physical differences’ (Khripkova and Kolesov 1981, p. 74). In the West there was also concern at this time with studying ‘psychological differences between members of the male and female sex,’ but in order to understand how they came about, the extent to which they challenged equality, and what could be done to overcome them. In the Soviet Union, the primary concern was to reinforce them. When Gorbachev came to power in 1985, his program of economic reform introduced the prospect, for the first time since the 1920s, of widespread unemployment. The high level of female employment that had characterized the Soviet Union since the beginning of the Stalin era was no longer necessary. Women could now be withdrawn from the workforce, and this could result in two positive outcomes: an increase in the birthrate, and a reduction in the unemployment figure. This was presented to women themselves as a positive development: at last they had the ‘luxury’ of staying home with their children. Yet the new openness (glasnost) of the Gorbachev era also meant that criticism of government policy was possible. Articles heatedly defending women’s right to work began to appear in the press.
This extended even to the Communist Party’s own theoretical journal, Kommunist, in which three female scholars from the Institute of Socio-Economic Problems challenged what they saw as a clear trend to improve economic performance by shedding the less effective elements of the work force: ‘and that means, in the first place, women.’ This, they argued, lay behind the introduction of what seemed to be new benefits for women:
In order to facilitate this female “exodus” from social production, it has been suggested that they be granted a series of privileges which will result in the reduction of their work load: the work day will be shortened, the period of paid leave to look after a new-born baby lengthened, and so on (Zakharova et al. 1989, p. 58).
Within two years of this article appearing its authors had founded the Centre for Gender Studies in Moscow and, as well as publishing material hitherto unimaginable in the Soviet Union, were mounting university courses and hosting conferences on gender issues in Soviet society.
3. Research on Gender Issues in the Soviet Union
Since the 1970s, Western research on Russia and the Soviet Union, in virtually all fields (literature, history, and the social sciences), has paid considerable attention to the experience of women. Feminist scholars were initially attracted to the study of Russian and Soviet women, as the historian Linda Edmondson explains, because:
it seemed (from a Western perspective) that a degree of equality and recognition had been won by women in the Soviet Union that had not been achieved in the West … they were impressed by Soviet claims not only to have legislated complete sexual equality, but also to have provided the conditions for its full realization; if not in the present, then at least in the foreseeable future (Edmondson 1992, p. 1).
They soon realized their mistake. However, it still seemed paradoxical that despite the Soviet Union’s professed commitment to women’s equality there was such apparent disinterest on the part of its scholars in discovering how it had supposedly been achieved, and that Western researchers working in the field ‘were obliged to provide almost all the answers to their own questions’ (Edmondson 1992, p. 2). Accordingly, the creation of the Centre for Gender Studies and the emergence of an indigenous feminist movement were greeted with considerable interest and enthusiasm. Yet when Russian scholars did start providing their own answers, they made somewhat uncomfortable reading for their Western counterparts. The experience of living under state socialism, perhaps inevitably, led them to an understanding of equality and women’s rights that did not accord with that of Western feminism. While feminist theory in the West had drawn attention to the distinction between the private and the public, Soviet women saw the main dichotomy as that between the family and the state. Western feminists had been largely critical of the patriarchal nature of the family, but for women in the Soviet Union it was seen as a refuge from the state-dominated bureaucracies that controlled the worlds of work and politics. While Western feminists tended to promote paid work as the essential route to women’s financial independence and self-realization, women from the Soviet Union had experienced it as a form of exploitation which, in the words of Nanette Funk and Magda Mueller, reduced them ‘to a cheap source of labor that could be plugged in anywhere without regard to specific capabilities, prerequisites, and ambitions’ (Funk and Mueller 1993, p. 87). Western feminists had pointed to men’s vested interest in keeping women in their place, but women in the Soviet Union were more inclined to see men as fellow sufferers under the yoke of state oppression. 
Indeed, some Soviet feminists now argued that men had had a harder time than women, since the latter were able to use their family responsibilities as an excuse to withdraw as far as possible from the public realm, where state control was at its greatest.
The supposed successes of the Soviet Union in securing women’s equality—the high level of female involvement in paid work, the relatively strong representation of women at least in the lower echelons of politics, the extensive, if still inadequate, provision of pre-school childcare institutions—were also looked at with a critical eye by the new Soviet feminists. State socialism, it was argued, had been so obsessed with the idea of work that it had reduced women’s equality to nothing more than their mass entry into social production. Being equal meant being treated like men, except that they also had the major responsibility for childrearing and domestic work. Many Soviet women began to define liberation in a surprising new way: as the right not to work, and to be treated as innately different to men. This essentialist position, which saw women as naturally oriented towards the home, was anathema to many Western feminists, but it was an understandable response to the conditions in which women had lived under what they now described as ‘really existing socialism.’ The differences which emerged in women’s attitudes in East and West have not precluded a fruitful dialogue between them, but this has had to be conducted with considerable sensitivity and understanding.
Bibliography
Atkinson D, Dallin A, Lapidus G W (eds.) 1978 Women in Russia. Harvester, Hassocks, Sussex, UK
Attwood L 1990 The New Soviet Man and Woman: Sex Role Socialization in the USSR. Macmillan, London
Buckley M 1989 Women and Ideology in the Soviet Union. University of Michigan Press, Ann Arbor, MI
Edmondson L (ed.) 1992 Women and Society in Russia and the Soviet Union. Cambridge University Press, Cambridge, UK
Evans Clements B, Engel B, Worobec C D (eds.) 1991 Russia’s Women. University of California Press, Berkeley, CA
Farnsworth B, Kollontai A 1980 Socialism, Feminism, and the Bolshevik Revolution. Stanford University Press, Stanford, CA
Funk N, Mueller M (eds.) 1993 Gender Politics and Postcommunism: Reflections from Eastern Europe and the Former Soviet Union. Routledge, London
Gasiorowska X 1968 Women in Soviet Fiction 1917–1964. University of Wisconsin Press, Madison, WI
Goldman W Z 1993 Women, the State and Revolution: Soviet Family Policy and Social Life 1917–1936. Cambridge University Press, Cambridge, UK
Ilic M 1999 Women Workers in the Soviet Inter-war Economy. Macmillan, London
Khripkova A G, Kolesov D V 1981 Devochka–podrostok–devushka [Girl–teenager–young woman]. Prosveshchenie, Moscow
Kollontai A 1977 Selected Writings (translated, and with commentary by Alix Holt). Allison & Busby, London
Kon I, Riordan J (eds.) 1993 Sex and Russian Society. Indiana University Press, Bloomington, IN
Lapidus G W 1979 Women in Soviet Society: Equality, Development and Social Change. University of California Press, Berkeley, CA
Liljestrom M, Muntysaari B, Rosenham A (eds.) 1993 Gender Restructuring in Russian Studies. Tampere University Press, Tampere, Finland
Stites R 1978 The Women’s Liberation Movement in Russia. Princeton University Press, Princeton, NJ
Zakharova N, Posadskaya A, Rimashevskaya N 1989 Kak my reshaem zhenskii vopros [How we are resolving the Woman Question]. Kommunist [Communist] No. 4, pp. 56–65
L. Attwood
Soviet Studies: Geography
The field of Soviet geography has undergone tremendous change. As a result of the significant social, economic, and political changes of the post-1985 era, many practitioners of Soviet geography found the physical borders of their inquiry changed. More often than not, the region of focus shifted from the Soviet Union to one or more of the former republics. In addition, research themes and theoretical approaches to scholarship expanded. Accordingly, the subject of Soviet geography has two distinct periods. Geographic research about the Soviet Union before perestroyka, that is, before the mid-1980s, generally focused on amassing information and analyzing spatial patterns related to the natural environment, demographic patterns, and economic activity (see Pryde et al. 1989 for an exhaustive review of research during this period). The post-Soviet period has continued previous traditions and added many new themes such as political change and the social and economic consequences of transition (see Engelmann et al. 2001 for a review of research themes and issues during this period).
1. Soviet Geography
During the Soviet period, the scope of geographical research often coincided with the strategic needs of policy makers. Because of Soviet secrecy regarding information about the natural environment and the location of natural resources, the location of economic activity and population dynamics, much of the work outside of the Soviet Union focused on amassing and interpreting information at a subnational scale in order to satisfy the demand for such knowledge among the policy-making community. As a result, geography was one of the few disciplines within Soviet studies to focus on regional and local issues throughout the Soviet period. Despite limited interaction among Soviet colleagues and their counterparts elsewhere during the Cold War era, work by geographers in the Soviet Union became accessible to outsiders through various translation projects such as the journal Soviet Geography and Translation, founded by Theodore Shabad and renamed Soviet Geography, and a volume by Fuchs and Demko (1974).
Geographers in the Soviet Union and elsewhere published widely on the location of natural resources and the changing distribution of natural resource extraction, particularly on strategic resources like oil and natural gas. Several important volumes (e.g., Shabad 1969, Pryde 1972, Jensen et al. 1983) document the location of natural resources and discuss the wider implications such as economic development and conservation. Additional work by geographers brought to light important environmental issues related to renewable natural resources like the drying up of the Aral Sea and the resulting regional scale environmental degradation (e.g., Micklin 1988). Numerous geographers traced the progress of various large-scale Soviet industrial projects related to natural resources and industrial development. Soviet officials perceived these projects as critical to national welfare, and other governments perceived understanding them as critical to their own national security. For example, geographers followed projects like the diversion of major Siberian rivers to provide water resources to Soviet Central Asia and the construction of a new railway line across Siberia, the Baykal–Amur main-line, to facilitate the transportation of natural resources. Many debated the strategic and economic consequences of constructing the new line. The journal Soviet Geography provided current information on these and many other projects related to natural resources and industrial development in its ten annual issues. The location of economic activity and demographic dynamics comprised two other significant and interrelated research topics in Soviet geography. Geographers inside the Soviet Union and elsewhere analyzed urbanization patterns and the Soviet policy of city size limitations (e.g., Khorev 1971).
In the United States, Harris (1970) initially promoted the study of Soviet urbanization using approaches and techniques that were standard at that time in Western urban geography (e.g., central-place theory) and functionally categorized Soviet urban places. These works along with others (e.g., Lewis and Rowland 1979) evaluated aspects of Soviet central policy, like city size limits, and suggested that such policies did not appear to change overall urban growth rates. Other works attempted to demystify the Soviet central planning system and show how regional and spatial planning occurred under this system and affected spatial development (e.g., Pallot and Shaw 1981). Shabad, in his News Notes section of the journal Soviet Geography and elsewhere (Shabad 1984), provided data and analytical information about the location of economic activity and population. For example, he showed how the Soviet urban classification system according to population size and function could be used to reveal locations where central planners located economic activity or changed its nature. Liebowitz (1987) contributed to the ideologically generated debate over the effectiveness of
Soviet regional equalization policies. Using an oblast-level scale of analysis, he concluded that the Soviet government had a strong commitment to reducing spatial inequality. Investigative works such as those cited above distributed information to the scholarly and public policy communities about a mysterious place and system. They filled an important information gap about a significant part of the world. Salient questions in the field revolved around identifying what happens where and for what reasons. Other geographers often labeled the work as overly descriptive or devoid of theoretical or conceptual content. In reality, research during this period was implicitly theoretical and generally fell into two underlying and opposing frameworks: (a) Soviet exceptionalism due to the combination of communism, totalitarianism, and central planning; and (b) universalism due to the belief that human behavior is fundamentally the same irrespective of social, cultural, or economic context. Both perspectives mirrored broader trends in social sciences such as cold war dogma and modernization theory. Even before the breakup of the Soviet Union, the field of Soviet geography began to change. New and expanded sources of information reduced the demand for amassing and publishing subnational data. Soviet geographers began to shift focus away from identifying spatial trends to interpreting and conceptualizing them. A new political imperative arose to document and interpret the spatial implications of ‘democratization’ and marketization. A modified set of supply and demand issues faced the scholarly community and produced a substantially increased set of research themes and theoretical perspectives.
2. Post-Soviet Geography
As a result of the break up, the geographical scope of research has generally narrowed to Russia as many Soviet specialists interested in other former Soviet states have aligned with regional specialists interested in Europe, the Middle East, and Asia. The research emphasis in the subdiscipline continues on regional and local issues in Russia with several changes. Among the most important changes is the increased collaboration with and access to colleagues in Russia. A review of articles in Post-Soviet Geography and Economics (formerly the journals Soviet Geography and Post-Soviet Geography) shows much greater collaboration among international scholars and among non-regional specialists. Also important are the increased amount of information available (usually through purchase) and the ability to conduct original field research (including ethnographic and survey studies). New ethical considerations have arisen. For example, we have begun to consider how non-Russian academics profit from interactions with Russian colleagues and informants.
Many also consider the advantages to Russian colleagues who participate in joint research. In addition, the need to purchase information brings up issues related to the accessibility and dissemination of scholarly information as well as quality. The possibility of generating new forms of scholarly evidence leads to the ability to expand theoretical and methodological perspectives in Russian geography. Scholars within the field are not only in the business of amassing and interpreting information, but also of producing that information. While these changes have wrought considerable excitement among practitioners and new entrants into the field, the cohesiveness of the Soviet period seems to have disappeared. Traditional research themes remain with a new emphasis on political issues. For example, while geographers still provide information on the location of natural resource extraction and the state of the natural environment, new themes related to the broader subdiscipline of nature–society relations have emerged. New work using ethnographic research methods includes a focus on the issue of the rights of indigenous people (e.g., Fondahl 1998). Research continues to document the nature of demographic change during this transition period. Several authors note important trends such as reversals of traditional demographic patterns, including population losses and deurbanization in the Russian East, which had traditionally been an urban migration destination (e.g., Heleniak 1997 and Rowland 1996). Other researchers place a new emphasis on the political dimensions of demographic change and ethnic identity politics (e.g., Kaiser 1994). These works continue the tradition of providing information about the region while explicitly showing a change in the intellectual content of the field by incorporating methods and topics that speak to the broader community of social scientists as well as to policy makers. 
While documenting the location of economic processes still predominates, research during this period looks at the political aspects of economic relations and activity. Using theoretical approaches to study local governance in other countries, Mitchneck (1998) cites the example of the use of historical imagery for economic development to show that Russian local governments increased their participation in promoting local economic development. A focus on political relations also permeates the work of economic geography of Russia. Bradshaw and Lynn (1994) argue for the integration of new theoretical approaches into our study of economic relations in Russia. They use the example of world systems theory to show how we can enrich both the study of transition and social sciences at large by bringing analysis of Russia and the former Soviet Union into broader scholarly debates. The direct examination of politics as a geographical process has also entered into the lexicon of Russian geography. Craumer and Clem produced a series of articles on electoral processes in Russia at various
scales of analysis (e.g., Clem and Craumer 1997), and O’Loughlin et al. (1996) also studied electoral processes. Also new to Russian geography is the analysis of the impact of broader social and economic change on the individual, especially differences across gender, and the introduction of social theory to the study of the transition period in Russia (see Bell 1997 and Pavlovskaya 1998). In many ways, thematically and theoretically, Russian geography appears more integrated into the overall field of geography than was Soviet geography. Research themes, methods, and theoretical approaches mirror the discipline at large more explicitly. The future of the discipline, however, is still quite uncertain. The productivity of scholars both in Russia and abroad seems somewhat sidetracked due to the need to restructure. In addition, the political imperatives that generated both scholarly interest and immense funding for research appear to have waned. A promising area of future research for geographers studying Russia and other former Soviet countries would be contributing to transition theory and studying the impact of marketization on both the natural and built landscapes. Relatively few works to date have begun to deal directly with such issues (for exceptions see Bradshaw 1997, Ioffe and Nefedova 1998, Bradshaw and Hanson 1998, Smith 1999), in part because of the limited time that has elapsed. But in this ever-evolving field, we have learned to expect the unexpected.
See also: Communism; Geography; Postsocialist Societies; Regional Geography; Socialist Societies: Anthropological Aspects; Soviet Studies: Economics; Soviet Studies: Environment; Soviet Studies: Politics; Soviet Studies: Society
Bibliography
Bell J E 1997 A place for community? Urban social movements and the struggle over the space of the public in Moscow. Dissertation, University of Washington
Bradshaw M J (ed.) 1997 Geography and Transition in the Post-Soviet Republics. Wiley, Chichester, UK
Bradshaw M, Hanson P 1998 Understanding regional patterns of economic change in Russia: an introduction. Communist Economies and Economic Transformation 10: 285–304
Bradshaw M J, Lynn N 1994 After the Soviet Union: the post-Soviet states in the world system. The Professional Geographer 46: 439–49
Clem R S, Craumer P R 1997 Urban-rural voting differences in Russian elections, 1995–1996: a Rayon-level analysis. Post-Soviet Geography and Economics 38: 379–96
Engelmann K, Bell J, Mitchneck B, Kaiser R 2001 Russia, Central Eurasia, East Europe. In: Gaile G L, Willmott C J (eds.) Geography at the Dawn of the 21st Century. Oxford University Press, Cambridge, MA
Fondahl G 1998 Gaining Ground? Evenkis, Land and Reform in Southeastern Siberia. Allyn and Bacon, Wilton, CT
Fuchs R, Demko G 1974 Geographical Perspectives in the Soviet Union. Ohio State University Press, Columbus, OH
Harris C 1970 Cities of the Soviet Union: Studies in their Functions, Size, Density, and Growth. Rand McNally, Chicago
Heleniak T 1997 Internal migration in Russia during the economic transition, (1989 to end-1996). Post-Soviet Geography and Economics 38: 81–105
Ioffe G, Nefedova T 1998 Environs of Russian cities: A case study of Moscow. Europe–Asia Studies 50: 43–64
Jensen R G, Shabad T, Wright A W (eds.) 1983 Soviet Natural Resources in the World Economy. University of Chicago Press, Chicago
Kaiser R J 1994 The Geography of Nationalism in Russia and the USSR. Princeton University Press, Princeton, NJ
Khorev B S 1971 Problemy Gorodov. Mysl’, Moscow
Lewis R A, Rowland R H 1979 Population Redistribution in the USSR, Impact on Society, 1897–1977. Praeger, New York
Liebowitz R 1987 Soviet investment strategy: a further test of the ‘equalization hypothesis’. Annals of the Association of American Geographers 77: 396–407
Micklin P P 1988 Desiccation of the Aral Sea: A water management disaster in the Soviet Union. Science 241: 1170–76
Mitchneck B 1998 The heritage industry Russian style: the case of Yaroslavl. Urban Affairs Review 34: 28–51
O’Loughlin J, Shin M, Talbot P 1996 Political geographies and cleavages in the Russian parliamentary elections. Post-Soviet Geography and Economics 37: 355–87
Pallot J, Shaw D J B 1981 Planning in the Soviet Union. Croom Helm, London
Pavlovskaya M E 1998 Everyday life and social transition: gender, class, and change in the city of Moscow. Dissertation, Clark University
Pryde P R 1972 Conservation in the Soviet Union. Cambridge University Press, London
Pryde P R, Berentsen W H, Stebelsky I 1989 Soviet and East Europe. In: Gaile G L, Willmott C J (eds.) Geography in America. Merrill, Columbus, OH
Rowland R H 1996 Russia’s disappearing towns: new evidence of urban decline, 1979–1994. Post-Soviet Geography and Economics 37: 63–89
Shabad T 1969 Basic Industrial Resources of the USSR. Columbia University Press, New York
Shabad T 1984 The urban administrative system of the USSR and its relevance to geographic research. In: Demko G J, Fuchs R J (eds.) Geographical Studies on the Soviet Union (Department of Geography Research Paper No. 211). University of Chicago, Chicago, IL
Smith G 1999 The Post-Soviet States: Mapping the Politics of Transition. Arnold, London
B. Mitchneck
Copyright © 2001 Elsevier Science Ltd. All rights reserved. International Encyclopedia of the Social & Behavioral Sciences, ISBN: 0-08-043076-7

Soviet Studies: Politics
The study of Soviet politics was a special area dealing with the analysis of political developments in the former Soviet Union: its political history and traditions, its political system and regime, the evolution of its political leadership, its social structure, and the cultural roots of Communist totalitarianism. With the fall of the Soviet Union and Communism in 1991, Soviet studies ceased to exist as an area of research. Today, a review of Soviet studies is primarily of historical interest.

It is notable that, in its traditional form, Soviet studies was, first and foremost, the product of an exclusively Western community of researchers and politicians. Moreover, this brand of social science was largely an outcome of the Cold War between the West and the USSR, and served as the ideological and theoretical basis for that war. It was distinguished by a descriptive, narrative, and heavily ideological nature. The strong ideological undercurrent present in Soviet studies is understandable, given that its focus of research was the totalitarian Soviet regime, the very opposite of liberal democracy. Soviet studies was a closed area of research, removed from mainstream social science, and it effectively ignored modern research methods (e.g., comparative analysis). The field was a typical instance of area studies in that researchers felt little need to search for analogies and parallels beyond its confines. Indeed, Soviet studies (particularly in the US) was dominated by émigrés who, in many cases, had left Communist countries as a result of persecution. This could not help but tinge their work with a certain amount of bitterness and strong emotion that were hardly conducive to objective analysis. Soviet studies, with its zeal and passionate condemnations, was not unlike Soviet social science and its attacks on the Western democratic system. In any case, Soviet studies was unable to explain Gorbachev’s perestroika or the collapse of the USSR, which, incidentally, was not foreseen by any of the scholars in the field. This gives some indication of the standard of scientific analysis in the area.
Still, a number of Sovietologists did provide the West with unbiased assessments of specific problems in Soviet development and of the processes that followed the collapse of Communism. Among these the most notable are Alex Dallin, Robert Legvold, Adam Ulam, George Breslauer, Arnold Horelick, Robert Tucker, Robert Daniels, Stephen Cohen, Gail Lapidus, and Angela Stent.

After the fall of Communism, traditional Soviet studies petered out, not only in the US but also throughout the West. Effective research into the developments that shaped post-Communist Russia and other successor states of the USSR adopted new approaches developed by Western social inquiry. A complex analysis of post-Communism began to develop on the basis of transformation studies across all dimensions: politics, economics, culture, and the social sphere. The greatest progress in understanding post-Communism was achieved once researchers borrowed models from transition and comparative studies and started to explore both general and specific features of Russia’s transformation from the perspective of other transition societies. At this stage, Juan Linz, Alfred Stepan, Terry Karl, Philippe Schmitter, and Valerie Bunce made significant contributions to the understanding of post-Communist Russia. Among the books that discuss developments in the former USSR from a new perspective are Bunce (1999), Elster et al. (1998), and Brown (1996).

Social science as practiced in the USSR prior to the fall of Communism was a discipline encumbered by ideological clichés. It was only in the perestroika years that research began to delve into the nature of the Soviet system, its evolution, and its relationship with historical tradition and national characteristics. With the breakup of the USSR and the fall of Communism, Russia’s research community was faced with the need to reconsider its research methods completely and turn to international literature on social development. This has not been an easy process for Russian scholars and many have not been able to break free of Marxist philosophy and social science. Still, the 1990s have seen a gradual emergence of select social science disciplines that make use of internationally accepted research tools to study various social dimensions. Emerging areas of research include comparative political science, geopolitics, international relations, and transition studies. However, such research is largely concentrated in large university centers (such as Moscow, St. Petersburg, and Yekaterinburg) and around key magazines and journals (most importantly Polis, Pro et Contra, Vlast, and Voprosy Philosophii). Additionally, new communities of philosophers and political and social scientists have sprung up. In short, Russian social science has largely completed the initial phase of organizational and institutional development.
It has done so with valuable support from the Western research community and financial assistance from certain Western funds and institutions, including the MacArthur, Carnegie, Ford, Soros, and some European foundations. Core works of Western social science have been translated into Russian and translations of contemporary research are readily available. Social science learning at Russian universities conforms to international standards to an ever-increasing extent. A new crop of researchers has done much to investigate various developments in Russian society. Among these are German Diligensky, Mikhail Ilyin, Andrey Melvil, Andranik Migranian, Igor Klyamkin, Vladimir Pastukhov, Oleg Bogomolov, Alexander Nekipelov, Nikolay Shmelev, and Yuri Levada.

Nevertheless, Russian social science is still in a transitional phase and its development remains difficult. There is a clear lack of stable avenues of interaction between various schools and perspectives, while research efforts are poorly differentiated and integrated. Inflows of new academic talent are limited. Sadly, some Russian observers consider Russian social science to be mired in crisis and apply this conclusion to all key social disciplines: political science, law,
history, social and cultural studies, etc. Russian research is relegated to the periphery of international research. Its investigation of major trends in social development is empirically weak and theoretically flawed. Many research projects still rely on an oddly eclectic combination of outdated Marxist research methods and modern Western techniques, without due regard for the overall research context. This is a result of the parochial nature of Russian social sciences, their estrangement from international research, the belief in a unique Russian experience, and the reluctance of many scholars to study Russia in the context of other transformations and transitions from authoritarian rule. Knowledge of superior international works on the development of post-Communist Russia and other successor states of the USSR remains inadequate.

How can the weakness of Russian social science be explained? The first reason for it is the continuing crisis in Russian society and the indifference of authorities and the public to the development of science in general. Also part of the problem is the adverse tradition of excessively ideologizing research, the habit of some researchers of working for the authorities and the conformist attitudes that this engenders, and the lack of objective analyses, particularly when contemporary Russian developments are in question. Scientific prospects are blighted by an inadequate consolidation of the scholarly community, a lack of funding, the concentration of research effort in a handful of large cities, and the often-inadequate knowledge of foreign languages (particularly English) on the part of researchers, which prevents Russian science from breaking its parochial confines.
The coming phase in the development of Russia’s social science, i.e., its integration into the global framework, largely hinges on society’s ability to overcome the current period of crisis and stagnation and acquire new perspectives and hopes. One of the major goals of the newly founded (1998) Russian Political Association, which unites a major part of Russia’s political scientists, is the creation of a new framework for discussion and intellectual exchange. Thus, in 2001 Russian political studies have a complicated agenda. On the one hand, their goal is to integrate Russian studies into the general mainstream of world political science using its newest accomplishments, instruments, and approaches; on the other hand, their goal is to preserve the emphasis on area studies. The new step in the development of Russian political studies will demand more effort in analysis of the major problems of political history and traditions; of the political regimes; of the correlation between economic and political reform; of continuity and change; of stability and instability; and of the interaction between domestic and foreign policy. The success of this endeavor will depend on the ability of scientists to cooperate and share the results of their studies, and on the attempts to organize the collective
effort of analysts from different fields in order to get a systemic and deeper understanding of the nature and puzzles of the post-Communist transformation.

See also: Communism; Communist Parties; Economic Transformation: From Central Planning to Market Economy; Socialist Societies: Anthropological Aspects; Soviet Studies: Economics; Soviet Studies: Society
Bibliography
Bonnell V, Breslauer G (eds.) 2000 Russia in the New Century: Stability or Disorder? Westview Press, Boulder, CO
Brown A 1996 The Gorbachev Factor. Oxford University Press, Oxford, UK
Bunce V 1999 Subversive Institutions. The Design and Destruction of Socialism and the State. Cambridge University Press, Cambridge, UK
Colton T, Legvold R (eds.) 1992 After the Soviet Union. W. W. Norton and Co, New York
Dallin A, Lapidus G (eds.) 1995 The Soviet System: From Crisis to Collapse. Westview Press, Boulder, CO
Elster J, Offe C, Preuss U K 1998 Institutional Design in Postcommunist Societies. Rebuilding the Ship at Sea. Cambridge University Press, Cambridge, UK
Linz J, Stepan A 1996 Problems of Democratic Transition: Southern Europe, South America, Post-communist Europe. Johns Hopkins University Press, Baltimore, MD
L. Shevtsova
Soviet Studies: Society
‘Soviet society’ refers fundamentally to the society of the former Soviet Union (the Union of Soviet Socialist Republics). More broadly, scholars refer to a ‘Soviet type of society,’ i.e., a society structurally similar to that of the Soviet Union.
1. Main Perspectives
Social scientists have approached the study of Soviet society from several main perspectives. For some, Soviet society is of particular interest for what it can reveal about one of the world’s superpowers. Those taking this view tend to observe and analyze all aspects of Soviet society through a political lens. They stress societal features potentially influencing or especially affected by its political system and policies. At another extreme, for those studying societies comparatively, Soviet society may be just another data point like American, British, Chinese, French, Indian, or Turkish society. Those adopting this perspective seek to discern relationships among societal characteristics rather than the inner workings of Soviet society per se. For those with yet another perspective, Soviet society is a distinctive type of society; they think that studying its distinctive features can give a deeper understanding of how political, economic, and social structures are linked, and how history and culture impact the linkages among these structures.

These various perspectives have not received equal attention. The first was dominant during much of the twentieth century, especially in periods when the Soviet Union was regarded as a major military threat to the West. The second almost never leads to publications manifestly on Soviet society and has not predominated at any time; it is not covered in this article. The third, more sociological perspective has become prevalent since the Soviet Union collapsed.
2. Soviet Society as a Distinctive Societal Type
As a societal type, Soviet society was historically unique because its basic institutions were designed with the intent of implementing a particular theory of societal development and ideals, namely, that of Karl Marx. Although other societies were later constructed to realize Marx’s vision, the Soviet Union’s society was regarded as the prototype and the model to be followed.

According to Marx’s theory of history, feudalism was succeeded by capitalism, and capitalism would be followed by communism. Communism would usher in a communist society, a superior stage of societal development. Although Marx wrote prolifically, his work mainly dealt with broad changes in economic and political systems and with his critical understanding of capitalism and capitalist society. Ironically, his writings furnish only a few general ideas about the nature of a communist society.

In the picture of a communist society painted by Marx, private capital and exploitation would be eliminated as the bases of society. There would be no social classes and no inherited privileges. People would be free to choose their activities; consequently, they would not be alienated from their work. Social relations would be based on cooperation rather than competition and exploitation; hence people would not be alienated from one another. Communist society would initiate a truly human and humane history. Communism would not only foster humane social relationships but also be economically more productive than other systems.

As evidenced in general by the actual societies designed to be communist and in particular by Soviet society, Marx’s utopian vision of communism and of communist society proved to be wishful thinking. Soviet leaders pursued policies of rapid industrialization and urbanization, and the country indeed achieved a relatively high level of economic output and became a largely urban
society. Further, it succeeded in reducing (if not in eliminating) social inequalities and inheritance of social privileges. But productivity and economic growth were not as great as in capitalist countries, and the widespread use of force and repression to control people’s lives gainsaid the claim that Soviet society was humane and had ended alienation. Marx’s contemporaries as well as succeeding generations of social thinkers challenged his theory and many of his claims, but few offered a different view of human history and human societies that was as broad and as encompassing as Marx’s. In his monumental work, A Study of History, Arnold Toynbee (1947), was among the first to propose a vision of societal development that was not based on social classes. Like most scholars who have attempted to classify societal types, including Marx, he thought that at each stage of societal development, a certain mode of production was the driving force behind economic and social relations in a society. In Toynbee’s scheme, pastoral societies were followed by agricultural societies, which were in turn succeeded by industrial societies. Later Daniel Bell (1973) argued that a new type of society came into being in the twentieth century: postindustrial society. Other scholars subsequently suggested particular kinds of postindustrial societies (e.g., information, service-based, postmodern). Soviet society does not fit very well into the usual classifications of societal types, however, because it differed sharply from industrialized, urbanized societies that had market economies and democratic polities. Its structures for the organization of productive activities and for the allocation of economic outputs were so distinctive that the industrial basis of its economy provides comparatively little help in understanding Soviet society. 
The failure of the concept of ‘industrial society’ to identify fundamental differences between Soviet society and industrial societies in North America and West Europe has motivated scholars to study Soviet society as a distinctive type—to seek to identify its basic features that distinguish it from industrial societies with market economies and democratic polities.
3. Origin and Key Elements of Soviet Society
Communist ideology promised a better way of life to the many people living in industrializing societies who were relatively (and sometimes absolutely) deprived. In the nineteenth and early twentieth century, it was hard not to attribute much deprivation to the social upheavals that resulted when small numbers of entrepreneurs and other ‘captains of industry’ built large factories and other enterprises employing great numbers of men, women, and children who toiled very long hours at very low wages. Arguably even more deprived were the jobless and those without land to till. Not surprisingly, the communist vision was attractive to
many workers, peasants, intellectuals, and members of suppressed nationalities and ethnic groups. Some of them were mobilized by this vision and organized to change society.

According to Marx, social revolution is the only way that a working class exploited by the capitalist class can gain power in order to build a new communist society. In 1871 the Paris Commune tried to create the first communist society, but its efforts failed. Other attempted communist revolutions were sparked by the heavy losses of life and material destruction during World War I and by the hardships in core European countries in this war’s aftermath.

Surprisingly, the first successful revolution of self-proclaimed communists occurred, not in a highly developed capitalist country, as Marx and Engels (1848/1962) had expected, but in Russia in 1917, where industrialization had then barely begun. A small group of intellectuals headed by Vladimir Lenin (né Ulyanov) was able to snatch power from Russia’s ineffectual tsarist elite with comparative ease. They then managed to fight and win a three-year civil war (1918–1921) and to suppress dissenting factions in the communist movement. As a result, the Union of Soviet Socialist Republics (USSR) emerged as the world’s first communist state.

During the early stages of the Soviet Union, its communist leaders, first Lenin and then his successor, Joseph Stalin, eliminated much of Russia’s previous elite, not only its nobility but also its few industrialists. Numerous intellectuals with different ideas about societal goals were also removed. Well-to-do peasants, most of whom were suspicious of communism, were also gotten rid of.
The Soviet Union’s leaders justified the massive use of force to implement communist doctrine partly by the country’s backwardness: its largely peasant population was illiterate and had short time-horizons, and peasants were too disorganized, except on the village level, to create a new countrywide social system. In addition, the ideology itself excused the use of force by communist leaders. Social revolutionaries believe that their destiny is to force society to change in order to bring about a better society. The notorious ruthlessness of Russia’s tsarist elite made this rationale more acceptable to many people.

Thus, it came about that Soviet society was built and maintained through the use of excessive force. Power and decision-making were concentrated at the very top. The real head of the Soviet Union was never elected, not even by the Central Committee of the Communist Party of the Soviet Union, which was officially the Party’s leading body. The wishes and will of the general population were neglected. In the Soviet Union’s command economy, the state owned and controlled all productive resources.

The Soviet system of extreme concentration of control over resources could be very effective. For example, it permitted this formerly backward country with its predominantly peasant population to industrialize
with surprising speed, eliminate illiteracy and educate the general population, win World War II despite its initial unpreparedness, develop the first hydrogen bomb, launch the first satellite to orbit the earth, put the first cosmonaut in space, and become one of the world’s two main superpowers during most of the twentieth century. It also achieved a considerable degree of social equality, though at a lower average level than in the United States and West Europe. While not overlooking the negative features of Soviet society, one should acknowledge its accomplishments. Although the Soviet Union began as a dictatorship of the head of the Communist Party, the political system gradually evolved. By the 1960s, the apparatus of the Communist Party had been merged with the state apparatus, creating a party-state. During the final decades of the USSR, the party-state was the only actor in Soviet society that no one could legitimately contest; its authority was supreme. The party-state’s monopoly on public action greatly limited the circle of people able to influence how society developed. Individual initiative was condoned only in a person’s private life. This depressed the populace’s motivation to act in the interest of themselves personally or of the general public. It led people to shun responsibility in public life. Since individuals depended on the state for their basic sustenance, and since the economic system offered few incentives for hard work and achievements, ordinary people sensibly chose to focus on their own private, personal lives: their families, a few close friends, and their personal hobbies. Because material wellbeing was secure (though low), the average person living in the Soviet Union was able to concentrate on his or her personal interests and have a private life offering considerable personal satisfaction. Soviet society did not, however, allow a civil society to come into being. 
It lacked a level of social organization above family and friends (a level of voluntary organizations and private clubs) that many scholars (e.g., Putnam 1993) regard as a major foundation of a democratic society. The party-state controlled not only all productive resources in the economy but also all social organizations linking individuals who were neither family members nor close personal friends (e.g., even choral groups and clubs of birdwatchers). Private individuals were not allowed to meet and form associations, even ones intended to foster private and personal goals.

The Soviet system successfully indoctrinated its citizens in Marxist ideology and established a public mentality oriented to the collectivity. In this ideology, a person’s life had value only as a tool for the attainment of collective societal goals.

In the early decades of the Soviet Union, its leaders experimented even with ways to change family life. To emancipate women, marriage and divorce were made very easy. To dilute the influence of families on new generations, state organizations for tending young
children and for educating and training older children were instituted. Some of these experiments had unintended adverse consequences and were later abandoned. For example, it was discovered that in practice easy divorce meant a huge upsurge in unmarried mothers trying to raise children without the father’s assistance. Because it was in the state’s interest for fathers to support their children, the law was changed again to make divorce in the Soviet Union almost impossible. Nevertheless, despite various planned interventions of the party-state into family life, on the whole, family life and other personal aspects of individuals’ lives in Soviet society seem to have been determined largely by the same kinds of factors as in noncommunist industrialized societies. Empirical research on this topic is, however, still scanty.

Two key institutions in modern capitalist societies, money and law, were not very important in Soviet society. The Soviet party-state did not need money to motivate people or to act as the medium of economic exchanges. Nor did it rely on laws to control people’s behavior or to regulate economic activities. These ends were accomplished primarily through administrative orders and, when necessary, the threat of punishment. Indeed, both money and law were seen as potentially limiting the absolute power of the party-state and therefore as best avoided.
4. One Society or Many?
The Soviet Union comprised many different nations (ethnic groups) having different cultures, traditions, and histories. Within its borders, hundreds of different languages were spoken. While all national groups were predominantly rural before being incorporated into the Soviet Union, not all of them fit the prototypical image of Russian peasants living in villages. In Central Asia, there were nomadic herders; in Siberia, there were small tribal groups who survived by hunting. In some of the Baltic countries and in certain parts of Russia, there were independent farmers, some of whom produced food for world markets.

Forging its diverse national groups into a Soviet people and creating a ‘homo Sovieticus’ (Soviet man/woman) was an important objective of Soviet leaders. They used a wide variety of means, ranging from overt repression, to economic incentives, to formal indoctrination in schools and in compulsory lectures given at work sites, to more subtle efforts in songs, films, plays, and various literary works. After 70 years, these efforts could claim a moderate degree of success. Even during the harshest years of repression under Stalin, however, people’s ways of living were noticeably different in the various regions of the Soviet Union. Although the party-state institutional organization penetrated every part of the Soviet Union, every
institution operated somewhat differently in the various major regions of the country. Not surprisingly, similarities were greatest in state institutions, where there were definite efforts to impose uniformity. For example, schools throughout the Soviet Union taught pupils of a given age the same subjects using essentially the same books. Remnants of traditional cultural patterns were most evident in family life and in patterns of interpersonal relations and friendships.

The last region conquered by the Russian Empire in the tsarist period was Central Asia. This enormous area has never been densely populated. Its native inhabitants are descended from many different ethnic groups that followed Islam and were tribally organized. However, it was not the Islamic religion as much as the traditional clan-based communities that made this part of Soviet society distinctive. Uprisings in this region began during World War I. The Soviet state succeeded in restoring calm only after a series of mass slaughters in the 1930s. A colonial strategy of ‘divide and conquer’ was used to control it. Uzbeks, Kazakhs, Tajiks, Turkmens, and Kyrgyz were given their own Soviet republics to establish the dominance of new nations in Central Asia.

Toward the end of the Soviet era, the Central Asian part of the Soviet Union was still predominantly rural and inhabited by non-Slavic peoples. But it also had capital cities and a few other urban centers with large industrial complexes inhabited and staffed by immigrant Slavic laborers. To a large extent, these urban areas were separate social oases whose residents had a way of life very different from that of those living in the other parts of these republics. The collapse of the Soviet Union left a large Russian diaspora in Central Asian cities. The titular nations who assumed the rule of their own countries have largely driven out the Russian diaspora.
There remains a tiny, Soviet-educated native elite that has a say about who rules in this region, but the influence of this tiny native elite seems unlikely to endure. Leading groups, once again based on clans, are already being formed, and Central Asian societies formerly part of the Soviet Union are becoming more similar to other Central Asian societies, such as Afghanistan.

A second very distinctive part of the Soviet Empire was the Caucasus region in the south. The people of this region had distinctive religions and cultures. In the North Caucasus, some nations, such as the Avars and Chechens, followed Islam; others were part of Orthodox Christianity; still others had their own distinctive and ancient national religions. The native peoples of this region were also organized in clans, but many had developed a more elaborate culture than was found in Central Asia. To a considerable extent, the various peoples of the Caucasus region developed a national identity in the process of confrontations with Russians who sought to control their homelands. During the Soviet era, members of the national minorities in the Caucasus
were deeply involved in the Soviet power structure. Native elites sought to accommodate Moscow in some matters while at the same time acquiring more autonomy in handling affairs in their own traditional homelands. The top elite in the Caucasus in the Soviet era was politically very competent and has successfully protected its own turf. Evidence for its success is found in the fact that First Secretaries of the Communist Party in the former republics in this region have become the heads of some of the newly independent countries in this region (e.g., E. Shevardnadze in Georgia, G. Alijev in Azerbaijan). A third tiny but clearly distinctive group lived in the western part of the Soviet Union, the Baltic peoples. The Baltic countries enjoyed independent statehood between World Wars I and II. In 1940 they were occupied by Soviet troops but were brought fully under Soviet control only after World War II ended. Soviet leaders used migration of nontitular peoples into the Baltic region as a method of sovietization. This policy was most successful in Latvia, where the 1989 Soviet Census revealed roughly equal numbers of Latvians and non-Latvians (most of whom were Russians or other Slavs). It was least successful in Lithuania. According to this same census, Lithuanians were still four-fifths of the population in 1989, and they stubbornly remembered the distant past when a Lithuanian Empire extended to the Black Sea. Since the process of sovietization of the Baltic peoples and their societies was shorter than elsewhere in the Soviet Union, it never succeeded in erasing an anti-Soviet mentality. Opposition to the Soviet regime in the Baltic region was silent but overwhelming. At the first opportunity presented by perestroika, mass movements organized in popular fronts along with native political leaders began to push for greater autonomy in the Baltic republics. 
Their aim was eventual restoration of independent Baltic countries, a goal achieved in 1991, the year when the USSR collapsed. At the core of Soviet society were the major Slavic nations: Russians, Ukrainians, and Belarusians. In tsarist times, their languages were distinctive but mutually intelligible, and most of them were followers of Russian Orthodoxy. They also shared a largely similar heritage based on the life of peasants living in small villages loosely linked by towns functioning as trading posts. Stalin’s program of collectivization destroyed village life and pushed many former peasants into new Soviet factories in cities. The destruction of the old way of life was as important as other direct methods in creating a new Soviet person. The sovietization of Slavs reached its culmination during World War II, the Great Patriotic War. This war brought the greatest hardships in modern Russian history. Victory over Hitler’s military forces was achieved only through millions of people sacrificing their lives and by millions of others toiling to support the Soviet war effort. This victory in the face of
tremendous odds and losses bolstered the legitimacy of communist ideology and its collectivistic orientation. It made Soviet citizens proud and united them, especially those of Slavic origin. In the postwar period until the 1960s, the Soviet Union experienced rapid economic development and achieved important ‘firsts’ in the race to space. Soviet Slavic citizens’ pride grew as their country acquired the status of a world superpower and leader in many arenas (e.g., science, sports). Once the peoples of these regions were largely peasant Slavs, many of whom were descendants of former serfs. In the Soviet era, a majority of Slavs became city dwellers who lived in state-owned apartment complexes while working on factory assembly lines or doing various white collar jobs. They experienced great intergenerational mobility: geographical (rural to urban), educational (illiterate to a secondary education or higher), and social ( peasant to manual or white collar worker). This great mobility reshaped people’s personalities and mentalities and obliterated many memories of and allegiance to traditional ways of life. When a period of high growth ends (which inevitably it does), educated people usually begin to wonder about their collective as well as personal future. After the Soviet Union crushed the Prague Spring in Czechoslovakia in the late 1960s, there came dreary years of ritualistic procedures of the Kremlin gerontocracy and little economic progress. Indeed, there were signs of economic retrenchment. Outside economic observers would label this as a shift from extensive to intensive economic development, but Soviet citizens of Slavic origin realized only that their lives were not getting better. The educated core of the Slavic people living in large cities (and not only the few public dissidents like Andrei Sakharov and Aleksandr Solzhenitsyn) began to speculate about the Soviet Union’s future. Faith in the Soviet leadership, and even in a communist future, began to wither. 
A growing and widespread but diffuse view of Soviet society as collectively ‘treading water’ set the stage for the political and economic system to collapse from within.
5. Studies of Soviet Society
5.1 Studies During the Soviet Era
During most of the Soviet Union’s existence, it was hard for observers to acquire objective information about Soviet society. In this situation, the study of Soviet society became very specialized and was not closely linked to other areas of social science research. Sovietology was the special field of social studies that addressed all aspects of Soviet society, though it focused especially on its political system and economy.
This special field arose as part of the political and ideological confrontation between communist and capitalist countries during the Cold War era. Specialists sought to reconstruct the reality of social life in Soviet society. Scholars were successful primarily in analyzing official Soviet documents and writings that gave evidence about the declared and actual policy goals of the Soviet Union. Sovietological types of studies helped to manage the ideological and political confrontation between communist and Western democratic systems, but they were less valuable in providing a social scientific understanding of how Soviet society actually functioned. The basic bottleneck for Western social scientists was the dearth of high-quality data on a wide array of topics. During the Cold War era, social scientists conducted only a few sizable empirical projects that were intended to gather in-depth information about Soviet society per se. One of the best known was the so-called Soviet Émigré Project conducted by scholars at Harvard University in the late 1970s (cf. Millar 1987). In this project, émigrés from the Soviet Union were interviewed about their beliefs and their life experiences in the Soviet Union. These interviews were regarded as the best available substitute for direct surveys of random samples of individuals within the Soviet Union. This project indeed produced a richer and more complex picture of Soviet society than had been put together from scrutiny of the Soviet mass media and official public documents. There were also careful and in-depth studies of the Soviet Censuses of Population and other official Soviet statistics (e.g., Clem 1986). A major conclusion was that the factual basis of official sources of Soviet data was highly suspect. At that time, it was widely believed that most errors in official data were intentional, designed to hide the true state of affairs in the Soviet Union. 
This was sometimes the case; however, mistakes often resulted from the same low level of quality control that was endemic in all nonmilitary production in the Soviet Union. In a society designed to implement a particular view of history and societal development, there was no thirst for independent evidence that might reveal a picture different from the one offered by the party-state’s leadership, or even worse, that might contradict the claims of the official view and expose actual flaws in the supposedly socialist society. Nevertheless, scholars within the Soviet Union made some useful studies of Soviet society. In the 1960s, Yuri Levada initiated some theoretical analyses of the Soviet type of society. His analyses contained criticisms of how it worked, which led him to be barred from sociological work for the next two decades. In the 1970s, Boris Grushin began the Taganrog project that investigated the limits of the influence of mass media on society. In this same period, Aleksandr Yadov worked to develop middle-range theories justifying empirical sociological research outside an ideological framework, and Igor Kon applied Western sociological thought to various topics in Soviet society. Tatiana Zaslavskaya founded a program of sociological research that provided high-quality data on work behavior in Soviet industries and agriculture. Mikk Titma initiated longitudinal surveys designed to compare life careers of youths and young adults in various regions of the Soviet Union. The works of these and other Soviet scholars were published almost solely in Russian, and very few are available in English translation.
5.2 Studies in the Post-Soviet Era
With the demise of the Soviet Union, Soviet society is no longer of great political interest. As a major social experiment with important influences on the history of the twentieth century, Soviet society is now primarily of historical and social scientific interest. Huge amounts of information about Soviet society and the Soviet system were released in the last years of the Soviet Union and during the first years of the new Russian Federation. Formerly secret archives, documents, and statistics became available to social researchers to study and analyze using the research methodologies and theoretical questions developed in the West. It also became much easier for social scientists to collect new data through interviews, social surveys, and various other kinds of field work. The first few years of this new openness allowed the quality of research on Soviet society and post-Soviet societies to reach the normal standards of the top journals in sociology, demography, economics, and political science in the United States and West Europe.
5.2.1 New insights into Soviet society. Sovietological studies in the Cold War had usually depicted the Soviet system and society as not only highly centralized but monolithic and exceedingly stable. An important conclusion supported by new archival information is the remarkable evolution of the Soviet system during the seven decades of its existence. It developed from a crude personal dictatorship into a totalitarian society and then into a party-state with a complex system of institutions staffed by relatively educated officials. The Soviet Union began as a relatively uncomplicated state with a top-down command system operated by the Communist Party apparatus, with terror used to guarantee that commands were obeyed. With a growing economy and a huge external empire in East Europe after World War II, the Soviet Union needed to expand the state and to develop a state apparatus having a legal basis to operate. A large new stratum, an intelligentsia with higher education, was
formed to serve in the state apparatus. Secondary education spread quickly and had become compulsory by the 1980s. University education expanded during the same postwar period. By the 1960s, the number of university graduates was already close to that in the United States. In the 1980s, white-collar workers had become a third of the labor force, an especially astonishing transformation when one considers that the Soviet Union was essentially a country of peasants in 1917. It has become apparent that the Soviet Union’s educated workforce transformed the state and Communist Party apparatus. The Constitution was changed to make the Communist Party the ruling party officially and as a matter of law. The Soviet Union developed a state bureaucracy that functioned relatively well. State institutions began to operate like typical bureaucracies pursuing their own organizational interests. Major Soviet institutions contested for power, largely in bureaucratic ways. Power became more widely dispersed among different state institutions and parts of the state apparatus. Even the KGB and other parts of the security apparatus were brought under the control of the party-state in the postwar era. In the final years of the Soviet Union, most members of the Communist Party’s Politburo were leading representatives of key economic ministries or of important republics within the Soviet Union. The Politburo, the Soviet Union’s top executive organ, came to be comprised of the country’s top executives; it was no longer just a group of Communist Party leaders. Under Brezhnev’s mediocre rule, internal dissatisfaction with a command system that limited initiative and restricted independent action even by loyal Soviet citizens soared. Cronyism and loyalty to a person’s superior started to be the operative principle guiding behavior within the party-state apparatus. 
As repressiveness diminished and the threat of terror receded, criticism of the Soviet system became increasingly widespread, although initially not public. Gorbachev’s policies of perestroika and glasnost brought these general feelings into the open. Though the top leaders of the Soviet state and Communist Party realized that change was needed, they had no clear idea what changes might have the desired effects of increasing efficiency while maintaining the basic power structure and its ideological foundation. Attempts to reform the command system failed. The finale came when the top ranks of the party-state hesitated to use force to reestablish their complete control. They realized that they had no more positive alternative than the past and that they had lost the support of even the state bureaucracy, not to speak of the intelligentsia, which was actively opposing their leadership. The collapse of the Soviet Empire and the party-state is a subject of special interest. There continue to be multiple explanations. The heavy burden of competition with the United States in the arms race
certainly surpassed the capacity of the Soviet economy, but it cannot explain the regime’s internal collapse. In 1991, the Soviet state could still marshal sufficient resources to have survived for some years. Challenges from the republics with strong nations were clearly a factor in the disintegration but could have been met and managed by applying sufficient force, as had been done in the past. Openness to the rest of the world during perestroika gave the Soviet elite a clear understanding that the Soviet Union offered a lower standard of living than the Western world. This realization eroded many people’s faith in the communist ideology and system. A crucial factor was the paralysis of the party-state apparatus as a result of the loss of this faith in the long-term viability of the communist system, which thereby limited the capacity of the party-state’s elite to govern.
5.2.2 Post-Soviet society and the transition process. In many instances, Soviet society and market societies are now seen as exemplifying very different ways in which an industrialized society can be organized. Rapid changes in postsocialist countries, but along different paths, permit rich comparisons between various postsocialist countries as well as between them and the well-developed societies in capitalist, democratic countries. It is often presupposed that all postsocialist societies are moving on a path toward a Western type of society, that is, a democratic society with a well-developed market economy. Although some post-Soviet countries have unambiguously moved in this direction, it is too soon to be sure about some others. The transformation of post-Soviet states from the previous system has raised a number of basic problems: the development of the state; strategies for developing democratic and market institutions; the basic principles of citizenship (ethnic or civic); the transformation of enterprises from the state sector to the private sector; factors promoting entrepreneurial activities; the emergence of greater and different kinds of social inequalities; the legacy of privileged positions in the Soviet era; the demographic crisis (a plummeting birth rate coupled with a rising mortality rate). The Soviet Union’s collapse has had an especially strong impact on studies of ethnicity and nationality. The emergence of fifteen post-Soviet states plus nearly a dozen other new nation states in the Soviet Union’s ‘outer empire’ (i.e., the Soviet bloc in East Europe), most with sizable minority populations and many with unexpectedly high levels of ethnic hostilities, has stimulated social scientific studies of interethnic relations, nation building, and the role of nation states in the development of civil societies. 
Interethnic relations may be an issue in a multiethnic state, or between a majority and minority ethnic group within a single state, or between neighboring states each having an ethnic diaspora in the other country. National self-determination is a different issue that develops when a cultural identity turns into a political identity. In a democratic regime, the outcome can be a peaceful divorce as occurred in the emergence of separate Czech and Slovak states. There is a clear difference between ethnic rights for cultural, social, economic, and political self-determination in the historic homeland of an ethnic group on the one hand and civic principles of interethnic relations in a modern state on the other hand. A civic society is only established with great difficulty when there are historical animosities between nationalities. Communist regimes suppressed ethnic divisions along with other features of a civil society. In contrast, a modern market-based society is premised upon civic principles. These civic principles allow ethnic tensions and hostilities to grow and offer no clear means to check them. There has also been particular interest in the gains and losses from economic transformation for individuals, groups, and society as a whole. In terms of the last, the main issue has been the advantages and disadvantages of economic shock therapy rather than a gradual transformation to a market economy. Available data on various postsocialist countries suggest that shock therapy does not always lead to quick improvements in economic and social life and that a gradual transition, as in China, tends to be more effective. Evidence also suggests that a macroeconomic, top-down approach to development excludes most people from participation in building a market economy. In contrast, a bottom-up approach motivates many people to seek and seize opportunities that markets provide. It allows a majority to participate, although naturally not everyone succeeds in the new economic competition. Indeed, a bottom-up process is how market societies emerged historically. 
Both theory and comparative research have also debated the role of elites in the transformation of socialist societies. In post-Soviet and East European countries, the issue was initially politicized and often presented as the old (communist = bad) elite versus a new (noncommunist = good) elite. Sociological research has, however, focused mainly on whether the old elite retained their advantages in postsocialist societies, whether position in the old elite (e.g., political versus managerial elite) mattered, and whether members of the old elite tended to gain or lose relative to members of the non-elite who were otherwise comparable (e.g., in educational achievement). A related issue has been the value of social connections as contrasted with personal know-how. The old elite tended to be advantaged in both, and even in the level of formal education. To date, there have been no definitive conclusions about issues pertaining to the role of the former elites. Clearly, however, many members of the former elite have been able to retain an elite status, although keeping an elite status has often involved moving into a different kind of social role (e.g., a former party functionary might become an entrepreneur or manager of an enterprise).
5.2.3 Major research projects. Information about Soviet society, and especially the societal transitions in post-Soviet states, has provided Western sociologists, demographers, political scientists, and economists with opportunities to gather new data and to test a wide array of hypotheses. In the post-Soviet era, a basic problem is the quality of data rather than its simple availability. Direct control over data collection is necessary to monitor and evaluate data quality, and Western researchers often lack direct control. Major empirical research projects focusing on post-Soviet and East European countries include Hout and Wright’s study of social stratification (see Hout et al. 1992, Gerber 1999); I. Szelényi and Treiman’s study of postcommunist elites, begun in 1994 (see Eyal et al. 1998); Kohn’s surveys of class consciousness (see Kohn et al. 1997); Burawoy’s case studies of the political economy of the transition from socialism (e.g., see Burawoy and Krotov 1992); and the longitudinal study of health and related issues by social scientists at the University of North Carolina (e.g., Zohoori et al. 1998, Lokshin and Popkin 1999). In the former Soviet Union per se, the longitudinal study of the life careers of secondary school graduates begun by Titma in 1983 has had two follow-ups in the post-Soviet period, one in 1993–94 and the other in 1997–98 (see Titma and Tuma 1995, Titma 1997). O’Brien and Patsiorkovski (1999) have collected data over time on households in Russian villages. Many post-Soviet countries, especially Russia and the larger East European countries once in its sway, have been included in large cross-national studies. These include Inglehart’s study of world values (see Inglehart and Baker 2000, World Value Survey Study Group 1994); the Eurobarometer surveys (see Reif et al. 1996); and numerous studies of elections (e.g., White et al. 1997) and public opinion (e.g., Clarke 1993). 
These various projects have enormously enlarged the amount of data relevant to many fields of sociology that come from postsocialist countries in general and post-Soviet countries more specifically. A basic consequence of the greatly expanded base of data on Soviet and post-Soviet societies is that sociological research on the Soviet Union and postsocialist states has entered the mainstream of Western social science. It is no longer an area of specialization with strong connections to the foreign policies of Western states.
See also: Bureaucracy and Bureaucratization; Collectivism: Cultural Concerns; Communism; Communism, History of; Dictatorship; Elites: Sociological Aspects; Marx, Karl (1818–89); Marxism/Leninism; Marxist Social Thought, History of; Peasants and Rural Societies in History (Agricultural History); Revolutions of 1989–90 in Eastern Central Europe; Soviet Studies: Politics; Totalitarianism: Impact on Social Thought; Utopias: Social
Bibliography
Bell D 1973 The Coming of Post-industrial Society. Basic Books, New York
Bobak M, Pikhart H, Hertzman C, Rose R, Marmot M 1998 Socioeconomic factors, perceived control and self-reported health in Russia. A cross-sectional survey. Social Science and Medicine 47(2): 269
Burawoy M, Krotov P 1992 The Soviet transition from socialism to capitalism: worker control and economic bargaining in the wood industry. American Sociological Review 57(1): 16–38
Clarke S 1993 Popular attitudes to the transition to a market economy in the Soviet Union on the eve of reform. The Sociological Review 41 (November): 619–52
Clem R S (ed.) 1986 Research Guide to the Russian and Soviet Censuses. Cornell University Press, Ithaca, NY
Eyal G, Szelényi I, Townsley E 1998 Making Capitalism Without Capitalists: Class Formation and Elite Struggles in Postcommunist Central Europe. Verso, London
Gerber T P 1999 Survey of employment, income, and attitudes in Russia (SEIAR), January–March 1998 [Computer file]. ICPSR version. All-Russian Center for Public Opinion and Market Research (VTsIOM), Moscow [producer], 1998. Inter-university Consortium for Political and Social Research, Ann Arbor, MI [distributor]
Hout M, Wright E O, Sánchez-Jankowski M 1992 Class Structure and Class Consciousness in Russia (MRDF). Survey Research Center, Berkeley, CA
Inglehart R, Baker W E 2000 Modernization, cultural change, and the persistence of traditional values. American Sociological Review 65 (February): 19–51
Kohn M L, Slomczynski K M, Janicka K, Khmelko V, Mach B W, Paniotto V, Zaborowski W, Gutierrez R, Heyman C 1997 Social structure and personality under conditions of radical social change: A comparative analysis of Poland and Ukraine. American Sociological Review 62 (August): 614–38
Lokshin M, Popkin B M 1999 The emerging underclass in the Russian Federation: Income dynamics 1992–96. Economic Development and Cultural Change 47: 803–29
Marx K, Engels F 1962 (1848) Manifesto of the Communist Party. Foreign Languages Publishing House, Moscow, Russia
Millar J R (ed.) 1987 Politics, Work, and Daily Life in the USSR: A Survey of Former Soviet Citizens. Cambridge University Press, Cambridge, UK
O’Brien D J, Patsiorkovski V V 1999 Russian village household panel surveys, 1995–1997 (Computer file). D J O’Brien, University of Missouri, Department of Rural Sociology, Columbia, MO/V V Patsiorkovski, Russian Academy of Sciences, Institute for the Socio-Economic Studies of Population, Moscow (producers). Inter-university Consortium for Political and Social Research, Ann Arbor, MI (distributor)
Putnam R D 1993 Making Democracy Work: Civic Traditions in Modern Italy. Princeton University Press, Princeton, NJ
Reif K, Cunningham G, Kuzma M 1998 Central and Eastern Eurobarometer 6: Economic and Political Trends, October–November 1995 (Computer file). ICPSR version. GfK EUROPE Ad hoc Research, Brussels, Belgium (producer), 1996. Zentralarchiv für Empirische Sozialforschung, Köln, Germany/Inter-university Consortium for Political and Social Research, Ann Arbor, MI (distributors), 1998
Titma M (ed.) 1997 Sotsial’noe rassloenie vozrastnoi kogorty [Vypuskniki 80-x v postsovetskom prostranstve. Proiekt ‘Puti pokoleniia’] (Social Stratification in an Age Cohort Graduates of the 1980s in Post-Soviet Countries. Project ‘Paths of a Generation’). Institute of Sociology of the Russian Academy of Sciences, Moscow, and the Center for Social Research in Eastern Europe, Tallinn, Estonia
Titma M, Tuma N B 1995 Paths of a Generation: A Comparative Longitudinal Study of Young Adults in the Former Soviet Union. Technical Report, Department of Sociology, Stanford University, Stanford, CA, and Center for Social Research in East Europe, Tallinn, Estonia
Toynbee A J 1947 A Study of History. Oxford University Press, New York
White S, Rose R, McAllister I 1997 How Russia Votes. Chatham House Publishers, Chatham, NJ
World Value Survey Study Group 1994 World Value Survey, 1981–1984 and 1990–1993 (Computer file). ICPSR version. Institute for Social Research, Ann Arbor, MI (producer), Inter-University Consortium for Political and Social Research, Ann Arbor, MI (distributor)
Zohoori N, Kozyreva P, Kosolapov M, Swafford M, Popkin B M, Glinskaya E, Lokshin M, Mancini D 1998 Monitoring the economic transition in the Russian Federation and its implications for the demographic crisis: The Russian longitudinal monitoring survey. World Development 26(11): 1977–93
M. Titma and N. B. Tuma
Space and Social Theory in Geography
Human geography’s engagement with social theory has been enormously influential, leaving no part of the discipline untouched. ‘Social theory’ encompasses a heterogeneity of postpositivist philosophies, including, but not limited to, Marxist political economy, humanistic thought, structuration theory, feminism, postmodernism, and postcolonialism.
1. Marxist Political Economy
Starting with radical geography in the 1960s, Marxist geography stressed the analytical centrality of labor and the production process in the creation and transformation of economic landscapes (see Marxist Geography). Through labor, human beings enter into social relations, materialize ideas, and transform nature. Marxists emphasized the powerful role that class plays in the social and spatial division of labor, as a vehicle through which social resources are distributed, as a central institution in shaping labor and housing markets, as a defining characteristic of everyday life, and as a fundamental dimension of political struggle. Underpinning this view is the labor theory of value and its implication that class relations rest upon the extraction of surplus value from the working classes, a process pregnant with politics. The Marxist perspective moved beyond simplistic dichotomies (e.g., base–superstructure) to incorporate the multiplicity of class relations over space and time, gender, the state, and ideology and culture. Marxism is thus not just an economic analysis, but a social, political, and moral one as well. Marxist geography allowed landscapes to be understood as the creation of modes of production, most notably that of capitalism. Harvey (1982, 1989b) initiated a fascination with uneven development, in which regions are created, transformed, and annihilated by flows of capital over space in search of maximum rates of profit. Capital creates a geography of the built environment, a ‘spatial fix’ that must be recreated periodically over time. The uneven development of cores and peripheries was analyzed at several spatial scales, i.e., within the metropolitan region, within nation-states, and internationally. Competition forces innovation and induces an acceleration in the turnover rate of capital that rhythmically reconfigures the landscape of commodity production (Harvey 1982). Periodic crises emanate from the overaccumulation of surplus value, leading to broad restructurings of production and location. Viewed systemically, crises are necessary for capitalism’s existence; commodity production is not only crisis-ridden, but crisis-dependent. Marxist geography diversified in several ways. Marxist analyses of the state, or government, arose primarily from French structuralism. 
In contrast to neoclassical views of the ‘public sector,’ social theory highlighted social origins and the multiple ways in which the state shapes the territorial reproduction of social relations. Long dominated by a focus on capital, Marxists have examined the geography of labor, including, for example, the spatial dimensions of class struggle and unions. In the 1980s, economic geography established linkages to French regulation theory, which theorized the multiple forms of capitalism over time and space (see Regulation Theory in Geography). In the latest of many historical restructurings, late twentieth-century capitalism underwent a profound transformation from Fordism, characterized by large-scale production, economies of scale, vertically integrated firms, oligopolies, homogeneous products, mass consumption, assembly lines, interchangeable parts, and Taylorism, to a post-Fordist regime characterized by small-scale production, networks of cooperative firms, niche markets, and flexible (computerized) production methods clustered in large urban agglomerations. When local production complexes were linked to the global economy, Marxism offered a nuanced understanding of how regions are constructed around industries (see Technology Districts). Geography’s engagement with political economy led to an important retheorization of space. Improved transportation and communication systems create rounds of time–space compression, or in Marx’s apocryphal words, the ‘annihilation of space by time’ (see Time–Space in Geography). Massey (1984) reincorporated the local uniqueness of places within wider divisions of labor, holding that the reproduction of uneven development occurs through the layering of different rounds of accumulation upon the physical and social landscapes of individual locales. The localities project recovered local places as fundamental to the understanding of how broad processes play out in specific contexts. The enormous time–space compression of late twentieth-century capitalism, particularly through telecommunications, has embedded places within a worldwide ‘space of flows’ (Castells 1996). In the context of hypermobile, post-Fordist, postmodern capitalism, such processes are central to an increasingly service-dominated, globalized economy dominated by streams of money and information (see Finance, Geography of).
2. World-systems Theory
Political economy’s engagement with geography extended to the origins and structure of the world-system, including colonialism and imperialism, the geopolitics of nation-states, multinational corporations, and the possibilities of socio-economic development. Widespread dissatisfaction with modernization theory led dependency theorists to reject neoclassical views of the international division of labor (see Development Theory in Geography). Third World poverty—indeed, the construction of the developing world itself—did not, in this view, simply ‘happen,’ but was the product of centuries of colonialism and continued neocolonial exploitation through transnational corporations, i.e., the development of underdevelopment. Dependency theory represented the first application of Marxist geographical thought on a global scale, locating the cause of poverty in the materialist, external workings of the world economy, not the ideological, internal order of societies, as did modernization theory. Dependency theory rejected the promise that imitation of the West would bring prosperity and liberty to developing nations. Rather, it argued, development and underdevelopment constitute two sides of global capital accumulation, i.e., the wealth of the developed world was, and is, derived from the extraction of surplus value from underdeveloped places. Every moment in the penetration of commodity relations into non-capitalist social formations, therefore, was viewed as inevitably an act of exploitation.
Wallerstein (1979) advanced understanding of the political economy of global geographies with world-systems theory. The capitalist world-system, in contrast to earlier world empires dominated by a single political structure, consists of numerous, independent nation-states, which collectively exert no effective control over the global market. The political geography of capitalism is the interstate system, not the nation-state per se. World-systems views divide nations into three broad categories: the core, i.e., the world’s leading powers (typically dominated by a single hegemon); the periphery, which comprises large numbers of weak, impoverished states; and the semiperiphery, consisting of wealthier members of the developing world and less developed members of the West. Unlike dependency theory, which takes the nation-state as its unit of analysis, world-systems theory insists upon viewing the entire planet as an indivisible totality, overcoming the dualism between the global and the local. The possibilities of economic development are seen as more open-ended than in dependency theory. The success of the East Asian Newly Industrialized Countries (NICs) since World War II boosted the appeal of world-systems views considerably. Political geography witnessed sophisticated interpretations of hegemony and territoriality at several spatial scales (see Political Geography). Agnew and Corbridge (1995), for example, dissect the ‘power container’ of the nation-state and its mounting ‘leakages’ to and from the world-system. O’Tuathail (1996) offers a poststructuralist critical geopolitics that emphasizes the social origins of statecraft, foreign policy, and geographical knowledge.
3. Humanistic Geography
A parallel moment in geography’s embrace of social theory occurred with the adoption of humanistic philosophies, particularly phenomenology. Humanists insisted emphatically on the central role of human consciousness, and sought to return purposeful human agents to the center of spatial analysis, as did Renaissance humanists. Philosophically, this line of thought drew upon Husserl, Heidegger, and, more recently, the semiotics of perception, cognition, and language (Buttimer 1976). Humanistic thought successfully refuted the assumption of naturalism, i.e., that there are no substantive differences between the human and natural sciences, revealing that social science is qualitatively different from the physical or natural sciences, so that metaphors drawn from those disciplines are inappropriate to the study of society (Barnes 1996). Because the human subject was poorly theorized in structural Marxism, humanists criticized its economic determinism and teleological reading of history (Duncan and Ley 1982).
Humanistic geography, with roots in the French Vidalian tradition, made numerous contributions to the analysis of place, including the roles of understanding and hermeneutics, the body, experience, and intentionality, and everyday life. Its form of understanding emphasized thought rather than behavior: ‘truth,’ in this view, could never be grounded simply in objective facts, but in the subjective interpretations that people give to them. This school gave cultural geography a badly needed platform to appropriate hitherto unavailable sources of data, such as literature, poetry, art, and mythology. Tuan’s (1977) ‘sense of place’ studies elaborated upon places as abstract, intangible webs of symbolic meanings produced by and for people, emphasizing the notion of landscapes as authored by agents, for a purpose. Humanism also legitimized later geographical explorations of ethics, morality, and justice (e.g., Smith 1994). Finally, the emphasis on human consciousness was an important antidote to the teleology of structural Marxism; humanistic thought, emphasizing choice and contingency, revealed that history never rolls along a predetermined track. However, humanistic geography also suffered from serious flaws. Most importantly, lacking any conception of social relations and the social construction of the individual subject, it lapsed into the atomistic ontologies of positivism and neoclassical economics. Unable to appropriate for itself the conceptual foundations necessary to comprehend production, class, the state, power, and struggle, humanism serves largely as a critique but not as an alternative to more structurally-oriented forms of social theory.
4. Structuration and Realism
Giddens’s (1984) structuration theory effectively overcame the long-standing micro–macro division between social structures that act of their own accord on the one hand and ahistorical individuals on the other. Structuration theory began with the recognition that culture is what is taken-for-granted in daily life, i.e., common sense, the matrix of ideologies that people use to negotiate their way through their everyday worlds. Unlike the ahistorical subject of humanism, culture is acquired through a lifelong process of socialization: individuals never live in a social vacuum, but are socially produced from cradle to grave. Viewing culture as social practice upstaged the prevailing superorganic view in cultural geography, which reified culture as a force suspended above people’s heads (Duncan 1980). By emphasizing that the deeply internalized rules and resources that people draw upon every day simultaneously enable and constrain ways of thinking and acting, structuration theory rescued the living subject from the twin abysses of structural overdetermination and individualistic voluntarism. Social formations both produce, and in turn are produced by, their inhabitants (see Time-geography). The counterpart to the unacknowledged preconditions of everyday life is the unintended consequences, the ways in which social relations continually escape the intentions of their creators. Drawing on Marx’s famous aphorism that ‘Men (sic) make history, but not by means of their own choosing,’ structuration theory holds that social and spatial change occurs through intentional action, but not intentionally: in forming their biographies every day, people unintentionally reproduce and transform their social worlds; individuals are simultaneously produced by, and producers of, history and geography. Thus, everyday thought and behavior do not simply mirror the world, they constitute it (Sayer 1984). Societies and regions are neither reducible to, nor independent of, the everyday lives of their inhabitants. This view linked the macrostructures of social relations to the microstructures of everyday life (Thrift 1983, Pred 1984). Epistemologically, structuration theory was complemented by the philosophy of realism (Sayer 1984) (see Critical Realism in Geography). As with structuralism, realism holds that the interrelations among observations are ontologically real, even if they cannot be measured directly; social relations are thus ‘real’ even if they are not directly ‘observed.’ Explanation in realism centers on the distinction between necessary and contingent relations. Necessary relations, identified through theoretical abstraction, concern the essential mechanisms that produce change, not events or observations per se; contingent relations, in contrast, are specified empirically in concrete contexts. Whereas necessary conditions must exist in every social explanation, whether the causal properties they describe are activated or not empirically is contingent and indeterminate.
The parallels here to structure and agency in structuration theory are clear. This distinction ruptures the unity of generalization and explanation: because different causal powers may generate similar empirical patterns, regularities are not sufficient for explanation; because necessary properties are contingently activated and may occur in only one instance, generalizations are not necessary for explanation either.
5. Feminism
Feminist scholarship arose from and contributed to dramatic changes in the status and lives of women, particularly since the women’s movement of the 1960s. Feminism is not one unified school of thought, but a series of overlapping discourses, including liberal, Marxist, ecofeminist, radical, and postmodern variants (see Gender and Feminist Studies in Geography). Differentiating between the biological category of sex
and the social category of gender, feminism stressed gender as a form of resource organization, power relation, and matrix of deeply sedimented presuppositions that underpin daily life. To ignore gender is to assume that men’s lives are ‘the norm,’ that there is no fundamental difference in the ways in which men and women experience and are constrained by social relations. Feminism served a critical role in asserting gender as the first nonclass-based form of social determination. Central to feminist critiques is the dissection of patriarchy, the domination and oppression of women by men, which is, if not a universal feature of social relations, extremely widespread historically and geographically. Are patriarchy and capitalism necessarily, and not contingently, related? As a political project, feminism is committed to the eradication of patriarchy, of the difference that gender makes in the life opportunities of men and women, and of gender as a relation of power. It does not follow that feminism is about eradicating gender differences per se; as Hanson (1992) argued, the task is to theorize difference without power. Feminist scholarship revealed how space is constructed around gender relations, including housing, work, and commuting patterns, access to public and private resources, and the meanings and behaviors that typically favor men and disadvantage women (Bowlby et al. 1990). While early feminist work focused simply on the ‘geography of women,’ later scholarship explored how gender is intertwined with relations of power, culture, ethnicity, and ideology. Feminists played a major role in gendering economic geography, including spatial divisions of labor (Massey 1984) and gender differences in commuting patterns (Hanson and Pratt 1995). Feminists exposed as patriarchal the traditional equation of male roles with public spaces and female roles with private ones, which mirrors the broader schism between production and reproduction.
Feminists also plunged into epistemology, showing how the construction of knowledge itself is pervasively gendered. Rose (1993) describes how geography’s very maleness long colored its construction, arguing for an emancipatory geography that empowers women. Gender, by definition, includes males. To equate ‘gender’ with women is to assume men have no gender, that it is not socially constructed or geographically instantiated. Like femininity, masculinity in its multiple forms is reproduced spatially through specific sites, practices, and discourses. Lastly, while gender and sexuality are far from synonymous, recent social theory has also explored the role of sexuality in the construction of personal spaces (see Sexuality and Geography). As geography embraced political economy, it expanded the ‘economy’ to include the culture of the firm (Schoenberger 1997), the gendered nature of knowledge (Gibson-Graham 1996), and the role of
language and metaphor (Barnes 1996). These works emphasized the role of human actors rather than mechanistic institutions in the unfolding of economic processes, focusing on trust, knowledge, reputation, uncertainty, indeterminacy, and contingency rather than modes of production, class struggle, or technological change. The result has been a more thoroughly ‘humanized’ economic geography sensitive not only to time and space but to the complexities of social life.
6. Postmodernism, Representations, Identity, and Postcolonialism
Following the ‘linguistic turn’ of many social sciences, geographers exhibited a sustained interest in the politics of knowledge, ideology, discourse, language, symbols, and representations (Barnes and Duncan 1992) as well as the unity of knowledge and power. Far from constituting a web of self-propelling ideas, all knowledge arises from and in turn contributes to power relations, either by naturalizing or challenging them. Every discourse and representation, whether self-consciously or not, is politically laden, serving some interests and not others; every theory, therefore, is not only an explanation but a legitimation of a particular interest. Postmodernism contrasts sharply with the Enlightenment quest for universal metanarratives and the modernist assumption that reality is fundamentally ordered; postmodernism emphasizes complexity, randomness, disorder, and chaos. The postmodern picture of reality consists of a multitextured kaleidoscope that can never be adequately captured in a single theory: reality is more complex than any language can describe. Postmodernism celebrates heterogeneity, not commonalities; it accepts uncertainty and ambiguity; it emphasizes ephemerality rather than permanence. However, the simple dichotomy between modern and postmodern fails to do justice to the complexity of both. Marxists conceptualize postmodernism as a cultural response to the material and social transformation of post-Fordist capitalism (Harvey 1989a). This view destabilized many received categories of geographic thought (Benko and Strohmayer 1997) (see Postmodernism in Geography). Postmodernism’s engagement with feminism problematized the unified metanarrative of gender that underpinned feminist work. Postmodernism and Marxism share an uneasy relationship (Harvey 1989a, Soja 1989).
In the context of the localities debate, postmodernism holds that theory must be adapted to the temporal and spatial specifics of places. Postmodernism also paved the way for the injection of semiotics into geographic thought and a wide-ranging debate about the nature of representations of space. Orthodox views, grounded in the correspondence theory of truth, portray representations as
claims to truth, not attempts at persuasion; in this view, objectivity is both possible and necessary, and the act of interpretation is unproblematic. Under postmodernism, this perspective gave way to deconstructionist interpretations that emphasize that every representation is a simplification, inevitably filled with silences (Barnes and Duncan 1992). Like landscapes, representations are always authored, situated in a context, inescapably partial and biased; thus, representations can only be understood intertextually, within the context of other representations. Further, the manner in which representations are interpreted/consumed is not necessarily how they are intended/produced: meanings of texts and landscape generally escape their authors. This line of thought is insistent that representations are social constructions: while discourses often become taken-for-granted as ‘natural,’ acquiring an asocial ‘objective’ status, there can be no value-free representations. Rather, representations of place are inescapably linked to power relations. Thus, they have social consequences (though not always intended ones), becoming part of the social reality that they describe: word-making is also world-making. Cartographers, for example, revealed maps to be discourses, laden with power relations, that constrain perceptions of space along some avenues and not others (Wood 1992, Harley 1989). Pickles (1995) argues that Geographic Information Systems have become implicated in the discursive rewriting of space, essentially underpinning a resurgence of positivist thought. Human geography has also exhibited an interest in personal and social identity, the processes that shape human experience, establish the individual’s boundaries of the self, and define their place in the world.
Building on a long tradition of phenomenology, psychotherapy, and symbolic interactionism, this genre stressed the social construction of the individual self and sutured the formation and reproduction of individual and collective identities to place and space (Pile and Thrift 1995). Identities are power relations that reflect and contest relations of normality and marginality. In this light, identities are constructed by establishing difference, by the drawing of boundaries, by defining what they are not: there is always an Other, and ‘othering’ is a power relation. In contrast to the Enlightenment assumption of the unified human subject, identities can be contradictory bundles, constantly in flux, and inevitably interwoven with power relations, a view suited to the realities of late twentieth century, post-Fordist, hypermobile capitalism, in which the electronic mediation of experience is widespread, rapid time-space compression commonplace, and hyperreality the norm. Social theory in geography only recently turned to race and ethnicity (Jackson and Penrose 1994), to the complex ways in which it is constructed and negotiated in everyday existence, how it interacts contingently with class and gender, and the complicated manner in
which it plays out over space. Racial and ethnic identity is fundamentally a power relation that typically reflects, reinforces, and naturalizes existing lines of inequality, although it may challenge hegemonic norms as well. The racialized geographies of urban poverty and the underclass have received attention as the products of restructuring, state policy, and local discrimination. To equate ‘ethnicity’ with nonwhites, however, presumes that whiteness is the norm, or ‘default option;’ for this reason, geographers have turned to the social construction of whiteness and its multiple political connotations. The geographical understanding of ethnicity, discourse, and identity was greatly advanced by Said’s (1978) Orientalism, which opened the door to an imaginative reconceptualization of the politics of spatial representation and identity formation at the worldwide level, revealing colonialism to be as much an ideological process as an economic or political one. Gregory (1994) laid bare geography’s complicity in the power relations of global modernity (see Postcolonial Geography): by framing space within ‘ocularcentric’ worldviews of the Enlightenment—essentially, the assumption that the world is knowable, that it can be subjected to dissection in terms of an all-seeing observer—modernity discursively constructed images, ideas, and identities that deepened the Western penetration of its Other, the non-West.
7. Concluding Comments
Paradoxically, contemporary geography illustrates both greater unity and greater divisions than perhaps at any time in its history. While enormous differences of opinion remain, a broad consensus has emerged that takes seriously political economy and social relations of knowledge and power as fundamental to the understanding of space. Social theory raised class, gender, ethnicity, and power as central foci of spatial analysis, including both their historically constituted material forms on the landscape and their ideological, taken-for-granted dimensions that inform daily life, discourse, and individual and social identities. Social reproduction has been put on a par with the dynamics of production as a key moment in the dissection of spatialities. In the process, geography became highly sensitized to issues of difference, to the politics of knowledge and the diverse ways in which places, and the people who inhabit them, are represented to one another and to observers. The ways in which social and spatial relations are explained, justified, and described in politically-laden terms—in short, the manner in which discourse does not simply reflect the social world, but constitutes it—have also figured prominently. As a result of its intercourse with social theory, geography—long a discipline that lagged behind other fields in theoretical vigor—was catapulted from a naïve, atheoretical discipline into a
theoretically informed and informative (i.e., generative) one.
Bibliography
Agnew J, Corbridge S 1995 Mastering Space: Hegemony, Territory and International Political Economy. Routledge, London
Barnes T J 1996 Logics of Dislocation: Models, Metaphors, and Meanings of Economic Space. Guilford, New York
Barnes T J, Duncan J S (eds.) 1992 Writing Worlds: Discourse, Text and Metaphor in the Representation of Landscape. Routledge, London
Benko G, Strohmayer U (eds.) 1997 Space and Social Theory: Interpreting Modernity and Postmodernity. Blackwell, Oxford, UK
Bowlby S, Lewis J, McDowell L, Foord J 1990 The geography of gender. In: Peet R, Thrift N (eds.) New Models in Geography. Unwin Hyman, London, Vol. 2
Buttimer A 1976 Grasping the dynamism of life-world. Annals of the Association of American Geographers 66: 277–92
Castells M 1996 The Rise of the Network Society. Blackwell, Oxford, UK
Duncan J S 1980 The superorganic in American cultural geography. Annals of the Association of American Geographers 70: 181–98
Duncan J, Ley D 1982 Structural Marxism and human geography. Annals of the Association of American Geographers 72: 30–59
Gibson-Graham J K 1996 The End of Capitalism (As We Knew It). Blackwell, Oxford, UK
Giddens A 1984 The Constitution of Society: Outline of the Theory of Structuration. University of California Press, Berkeley, CA
Gregory D 1994 Geographical Imaginations. Blackwell, Oxford, UK
Hanson S 1992 Geography and feminism: worlds in collision? Annals of the Association of American Geographers 82: 569–86
Hanson S, Pratt G 1995 Gender, Work and Space. Routledge, London
Harley B 1989 Deconstructing the Map. Cartographica 26: 1–20
Harvey D 1982 The Limits to Capital. University of Chicago Press, Chicago
Harvey D 1989a The Condition of Postmodernity. Blackwell, Oxford, UK
Harvey D 1989b The Urban Experience. Blackwell, Oxford, UK
Harvey D 1996 Justice, Nature and the Geography of Difference. Blackwell, Oxford, UK
Jackson P, Penrose J (eds.) 1994 Constructions of Race, Place and Nation. University of Minnesota Press, Minneapolis, MN
Massey D 1984 Spatial Divisions of Labor. Methuen, New York
O’Tuathail G 1996 Critical Geopolitics: The Politics of Writing Global Space. University of Minnesota Press, Minneapolis, MN
Pickles J (ed.) 1995 Ground Truth: The Social Implications of Geographic Information Systems. Guilford Press, New York
Pile S, Thrift N (eds.) 1995 Mapping the Subject: Geographies of Cultural Transformation. Routledge, London
Pred A 1984 Place as historically contingent process: Structuration and the time-geography of becoming places. Annals of the Association of American Geographers 74: 279–97
Rose G 1993 Feminism and Geography: The Limits of Geographical Knowledge. University of Minnesota Press, Minneapolis, MN
Said E 1978 Orientalism, 1st edn. Pantheon Books, New York
Sayer A 1984 Method in Social Science: A Realist Approach. Hutchinson, London
Schoenberger E 1997 The Cultural Crisis of the Firm. Blackwell, Oxford, UK
Smith D M 1994 Geography and Social Justice. Blackwell, Oxford, UK
Soja E W 1989 Postmodern Geographies. Verso, London
Thrift N J 1983 On the determination of social action in space and time. Environment and Planning D: Society and Space 1: 23–57
Tuan Y-F 1977 Space and Place: The Perspective of Experience. Edward Arnold, London
Wallerstein I 1979 The Capitalist World Economy: Essays. Cambridge University Press, Cambridge, UK
Wood D 1992 The Power of Maps. Guilford, New York
B. Warf
Space: Linguistic Expression
Spatial cognition is at the heart of much human thinking, as evidenced for example in the explanatory power of spatial metaphors and diagrams, and the explication of spatial concepts has thus played a key role in the development of Western mathematics and philosophy over two and a half millennia (see Jammer 1954). It is therefore easy to assume that spatial notions form a robust, universal core of human cognition. But naïve human spatial language turns out to vary significantly across languages, both in the way it is semantically organized and the way in which it is coded. This has major implications for how we should think about universals in human thought and language. The Western intellectual tradition has provided an elaborate metalanguage for spatial concepts, but the conceptual underpinnings of naïve human spatial language utilize only a small subset of these concepts. Moreover, there are many aspects of our spatial perception and motor control which utilize much richer and more precise representations of space than are coded in language. The study of the representation of space in language, and its relation to cognition, must therefore be treated as an empirical enterprise, and this article reports on the still quite limited cross-linguistic information about how different languages structure the spatial domain (see Levinson and Wilkins in preparation).
1. Space as a Semantic Domain
Languages tend not to treat space as a single, coherent semantic field in the same way that they treat e.g., color (q.v.), kinship (q.v.), or ethnobotany—rather, space is structured as a large semantic field with many distinct subdomains, each of which is closely structured (and often varies considerably across languages).
Nearly all languages have ‘where’-questions which cover the entire field (that is, with the same or morphologically related forms for ‘where-to,’ ‘where-from,’ ‘where-at’) and describe the location of one object primarily with respect to another. But beyond these general properties, spatial expressions are structured in distinct subdomains, notably topology, frames of reference, and motion, with place names and spatial deixis as complementary areas. These subdomains cross-cut in certain ways (e.g., frames of reference may play a role in motion description), but because they have their own internal organization they need to be kept analytically distinct. Although the philosophical and scientific literature gives us many different ways to talk about space, natural languages tend to structure spatial description more in accord with the ideas of Leibniz than of Newton. That is, location is thought about as the place of one thing relative to another (rather than, as in Newtonian thinking, as specifications in the coordinates of an infinite, abstract three-dimensional ‘box’). We will call the object to be located the ‘figure,’ and the object with respect to which the location is specified ‘the ground’ (other equivalent terms in the literature are ‘theme’ vs. ‘relatum’ or ‘trajector’ vs. ‘landmark’). As Talmy (1983) noted, a good ground or landmark object is larger than and less mobile than the figure object to be located. The problems of location specification then largely boil down to the following functional considerations:
(a) Where figure F and ground G are contiguous or coincident, a static F may be said to be ‘at G.’ Where G is large, it can be subdivided, so that F can be said to be ‘at the X-part of G.’ This is, roughly, the subdomain of topology.
(b) Where F and G are separated in space, the problem is to specify an angle or direction from G in which F can be found (the search domain). This involves a coordinate system of some kind, a ‘frame of reference,’ of which there are three main types found in languages.
(c) Where F is in motion, two kinds of ground are especially relevant: the source or the goal of the motion. To indicate the progression of motion (i.e., the increase or decrease of distance from a ground), motion can be specified as ‘F is moving towards goal G’ or ‘F is moving from source S.’ In addition, motion can be thought of as bringing about, or destroying, a topological relation, as in ‘F moves into G.’ But to indicate a direction or vector of motion, both S and G must be specified, or a coordinate system of the kind mentioned in (b) utilized. (Of course, motion can also be specified as taking place within a location, in which case the systems in (a) and (b) can be utilized.)
These functional distinctions give us a number of cleavages: stasis (or location) vs. kinesis (or motion), non-angular topological relationships vs. angular relations specified in frames of reference, with potentially cross-cutting domains as in Fig. 1.
Figure 1 Cross-cutting subdomains of spatial language
Some further details in Fig. 1, like the three frames of reference, will be described below. But two subfields which will not receive further discussion need to be mentioned here. One is deixis, which concerns the way in which the location of the speech participants can constitute a special ground or landmark (as in here), which then makes interpretation dependent on determining that location (as in He used to come here). Languages universally seem to provide a structured set of deictic elements with distinctions that are often spatial (demonstratives like ‘this’ vs. ‘that,’ adverbs like ‘here’ vs. ‘there’). Deictic anchoring of spatial expressions is, however, often implicit or pragmatic: the local pub implies a pub close to some reference point, which unless specified is taken to be the place of speaking. A second subfield not treated further here is toponymy, or systems of place names. Place names have been much investigated from an historical point of view: since they are highly conservative they reveal much about the locations of ancient languages and peoples. However, the theory and cross-linguistic typology of place names—e.g., why some are descriptive, others not—is hardly developed at all (see e.g., Hunn 1993).
2. Universals and Variation in the Semantic Parameters in Spatial Subdomains
In this section, the semantic parameters involved in spatial description in two major subdomains are described to illustrate the different ways in which languages structure the spatial domain.
2.1 Topology
This subdomain, as mentioned, concerns contiguous relations between figure and ground. The term is potentially misleading—topology proper is a branch
of geometry concerned with constancies under continuous distortions such as stretching, but is used in studies of spatial language following Piaget to describe the sort of spatial relations covered by the English prepositions on, in, at (Herskovits 1986). These spatial terms are amongst the earliest learnt by Western children, and it has therefore been supposed that they form a universal innate core of spatial notions. However, cross-linguistic studies show that universal semantic concepts are only to be found at a more abstract level. Take, for example, the spatial relations covered by English on, as in The cup is on the table, The picture is on the wall, or The fresco is on the ceiling—clearly the term covers a wide area, which is already subdivided in the adpositions (prepositions or postpositions) of many other European languages. Some languages (like Japanese) conflate ‘on’-relations with ‘over’-relations (as in The light is over the table) in their adpositions; others (like Yukatek) even include ‘under’-relations as well (as in The ball is under the table). A detailed analysis of a dozen languages from different families suggests that universal semantic concepts in this domain lie at a more atomic level (Levinson and Wilkins, in preparation). If we think of the core prototype of English on as consisting of the notion of vertical superposition of figure over ground, with a horizontal ground supporting a non-attached figure, then the kind of notions that look universal are at the level of ‘vertical superposition,’ ‘contact,’ ‘horizontal supporting surface,’ ‘(non-)attachment.’ Clearly these more abstract atomic concepts can be rearranged in many different ways to constitute the semantic relations expressible in an individual language. Topological relations are coded in the European languages primarily in adpositions or case, but as we shall see these are not the only possibilities—some languages code topological relations entirely in nouns or verbs.
2.2 Frames of Reference
Linguists have presumed that there are two main frames of reference, which are often called ‘deictic’ and ‘intrinsic,’ and which can be illustrated by the ambiguity of The boy is behind the bus: either the bus is between the boy and us (‘deictic’) or he is at the rear of the bus (‘intrinsic’). Psychologists have long known, however, that on the vertical dimension our perceptual systems use three frames of reference against which to measure uprightness: an object-centered frame, a viewer-centered frame, and an environmental frame. Cross-linguistic research shows that in fact these three frames of reference, which we will call the intrinsic, the relative, and the absolute frame respectively, may be used on the horizontal plane too (Levinson 1996). These three frames appear to be universally available, yet not all languages make systematic use of them all: some use the absolute frame, others the relative
frame, while most use the intrinsic frame, and some use all three (the main universal constraint appears to be that the relative frame is dependent on the possession of the intrinsic frame).
The intrinsic frame of reference, as its name implies, has to do with the intrinsic parts of ground objects. Recall that frames of reference have to do with specifying angles or directions from a landmark. One way to achieve this is to designate a side or facet of an object, so that the figure can be said to be lying on an axis drawn from the center of the object through the designated side. Most languages provide a vocabulary for parts of objects, but the ways in which they partition objects can be very variable. In English, for example, the criteria for assigning the front of an object involve a complex mix of functional and orientational information: the front of a bus is the direction in which it canonically moves, the front of a building is the side on which it is normally entered, and the front of a television the side designed to be watched. In English the top of a bottle remains the top even when tipped over, but in other languages (like Zapotec) a more strictly orientational approach may assign the ‘top’ to whatever side is currently uppermost. Alternatively (as in Tzeltal), the assignment may rely entirely on the internal axial geometry of the object, so that, e.g., the ‘face’ of an object may be a flat side on a secondary axis, even if normally this is undermost. What all these systems have in common is that they provide a system of designated sides which is independent of the position of the viewer, and (apart from the vertical axis) independent of an environmental orientation; how they achieve this differs not only in detail but in fundamentals.
Another way of obtaining angular distinctions on the horizontal plane is to use the body-axes of the viewer.
We can then say that something is, for example, to the ‘left’ of something else in the visual field. Full versions of such systems use a left/right axis and an orthogonal front/back axis, but again the way in which they map bodily axes onto the ground object is very variable. Although these systems are often called ‘deictic’ in the literature, this is misleading: they typically have a deictic center but they need not (as in He kicked the ball to the left of the goal), and the other two frames of reference can also have deictic centers. Hence these systems are here said to be ‘relative’ frames of reference, because they are relative to the body-axes of some observer. Some languages translate the bodily coordinates onto the ground object without rotation: in such languages (like Hausa) The boy is to the left of the tree means just what it does in English, but The boy is in front of the tree means what in English is described as The boy is behind the tree. In contrast, some languages rotate the axes, so that the front of the tree is towards us, but the left of the tree is what we would in English call its right (this is probably a minority option, but is found in some dialects of Tamil). Some languages reflect the axes, as in English,
so that left and right remain in the same directions as the viewer’s body, but front and back are reversed or ‘flipped’ over. Notice that terms like ‘front’ and ‘back’ and ‘left’ and ‘right’ have intrinsic interpretations (as in He’s sitting at the chairman’s left), and for that reason ambiguities arise of the kind illustrated above with The boy is behind the bus. Relative systems are based on the intrinsic sides of humans, and probably arise as extensions of an intrinsic system to objects (like trees or balls) which resist the assignment of named sides. Hence the implicational relation: all relative systems are associated with intrinsic systems, but the possession of an intrinsic system does not necessarily imply the use of a relative one. Relative systems had been thought to be universal, but in fact it turns out that perhaps as many as a third of the world’s languages do not employ them, instead making use of absolute systems. Absolute systems rely on environmental gradients or ‘geocentric’ coordinates. Such systems use absolute directions, or fixed bearings, a bit like English North, but unlike in English, they are often used on all scales (as in The cup to the north of yours is mine, or even There’s a spider on your northern leg), completely replacing the need for relative systems. There are many different varieties of such systems. Some of them (as in Australian Aboriginal languages) are fully abstract cardinal direction systems, with terms that may gloss ‘North,’ ‘South,’ etc., but which denote quadrants or arcs of specific angle, and are often based on bearings rotated from our cardinal directions. To use a system of this kind, speakers must always orient themselves precisely, using multiple environmental clues (wind directions, solar angles, sidereal ecliptics, landscape features, and the like). 
Other systems are based more directly on one major environmental cue, like prevailing winds (as in languages of the Torres Straits), river drainage (as in Alaska), or prevailing slopes in mountainous terrain (as in the Himalayas). For example, Tenejapan Tzeltal uses an ‘uphill,’ ‘downhill,’ ‘across’ system, based on the general lie of the land in the territory of Tenejapa (Chiapas, Mexico), but the system has been abstracted from the terrain to provide fixed orientation, so that the bearings remain constant outside the territory. Many systems use fixed orthogonal axes, but not all: for example, the Austronesian languages in the Pacific islands typically use the monsoon winds as the basis for one axis, but an inland-sea opposition as the basis for another axis—as one moves around the island, the angles between the axes constantly change.
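The three ways of projecting the viewer's body axes onto a ground object in a relative frame (translation, rotation, reflection) can be made concrete with a small geometric sketch. Everything in it is an illustrative assumption rather than part of the linguistic analysis: the viewer is placed at the origin looking along +y towards the ground object, directions are unit vectors, and the names `VIEWER_AXES`, `project_axes`, and `describe` are hypothetical.

```python
# Hypothetical set-up: the viewer looks along +y towards the ground object
# (e.g., a tree). Directions are (x, y) unit vectors in the horizontal plane.
VIEWER_AXES = {
    "front": (0, 1),    # away from the viewer, along the line of sight
    "back": (0, -1),    # towards the viewer
    "left": (-1, 0),
    "right": (1, 0),
}


def project_axes(strategy):
    """Return the directions assigned to the ground object under a strategy."""
    if strategy == "translation":     # axes shifted onto the object unchanged
        transform = lambda v: (v[0], v[1])
    elif strategy == "rotation":      # axes turned through 180 degrees
        transform = lambda v: (-v[0], -v[1])
    elif strategy == "reflection":    # front/back flipped, left/right kept
        transform = lambda v: (v[0], -v[1])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {term: transform(vec) for term, vec in VIEWER_AXES.items()}


def describe(figure_offset, strategy):
    """Name the term whose projected direction matches the figure's offset
    from the ground object (offset given as a unit vector)."""
    for term, vec in project_axes(strategy).items():
        if vec == figure_offset:
            return term
    return None
```

With a boy standing beyond the tree from the viewer (offset `(0, 1)`), `describe((0, 1), "translation")` gives `"front"` (the Hausa-type pattern) while `describe((0, 1), "reflection")` gives `"back"` (the English-type pattern). A boy on the viewer's left of the tree (offset `(-1, 0)`) is `"right"` under `"rotation"` and `"left"` under `"reflection"`.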
3. How Spatial Distinctions are Coded in Language Forms
Much of the literature gives the impression that spatial distinctions are primarily coded in adpositions, as in
the prepositions of English. In fact, in most languages spatial information is distributed throughout the clause. Some languages have a score or more local cases (nominal suffixes) covering roughly the same distinctions as English prepositions (e.g., Finnish or the NE Caucasian language Avar). Other languages use a combination of case and adposition (as in Tamil), or case and spatial nominal (as in many Australian languages). Special classes of spatial nominals do a lot of the work in isolating languages like Chinese, and they play special roles in other languages (cf. front in the English complex preposition in front of, and North in the adverbial use He went North). Many spatial oppositions are made constructionally, as when German prepositions like auf take the dative to indicate location, and the accusative to indicate motion towards. Finally, verbs play a crucial role in spatial description. In the case of motion, there is often a special form-class of motion verbs, and languages tend to conflate motion with manner (as in English crawl) or direction or ‘path’ (as in Spanish salir ‘to go out’), but not generally both (Talmy 1983). In many languages (e.g., Dutch), there is a small set of contrasting verbs (often derived from posture verbs) used to describe static location, encoding properties of the figure, or ground, or both. Some languages have a larger set (as in Tzeltal), and in this case most of the semantic load is taken by the verbs, which may encode such specific details as ‘be located in a bowl-shaped container.’
In conclusion, it is clear that spatial concepts form a complex semantic field made up of structured linguistic subdomains. The semantic parameters involved may ultimately have a universal base, but the ways in which the semantic subdomains are organized display considerable cross-linguistic variation.
There is also considerable cross-linguistic variation in the way in which such distinctions are encoded in the form-classes of languages. This linguistic variation is especially interesting given the centrality of spatial concepts in human thinking—and indeed it can be shown that linguistic differences in frames of reference make a difference to how humans memorize and reason non-linguistically (Levinson 1996, in press, Pederson et al. 1998). See also: Spatial Cognition
Bibliography
Bloom P, Peterson M, Nadel L, Garrett M (eds.) 1996 Language and Space. MIT Press, Cambridge, MA
Herskovits A 1986 Language and Spatial Cognition. Cambridge University Press, Cambridge, UK
Hunn E 1993 Columbia Plateau Indian place names: What can they teach us? Journal of Linguistic Anthropology 6(1): 3–26
Jammer M 1954 Concepts of Space: The History of Theories of Space in Physics. Harvard University Press, Cambridge, MA
Landau B, Jackendoff R 1993 ‘What’ and ‘where’ in spatial language and spatial cognition. BBS 16(2): 217–65
Levinson S C 1996 Frames of reference and Molyneux’s question: Crosslinguistic evidence. In: Bloom P, Peterson M, Nadel L, Garrett M (eds.) Language and Space. MIT Press, Cambridge, MA, pp. 109–70
Levinson S C in press Space in Language and Cognition: Explorations in Linguistic Diversity. Cambridge University Press, Cambridge, UK
Levinson S C, Wilkins D (eds.) in preparation Grammars of Space: Towards a Semantic Typology.
Lyons J 1977 Semantics. Cambridge University Press, Cambridge, UK
Miller G, Johnson-Laird P 1976 Language and Perception. Cambridge University Press, Cambridge, UK
Pederson E, Danziger E, Wilkins D, Levinson S C, Kita S, Senft G 1998 Semantic typology and spatial conceptualization. Language 74: 557–89
Piaget J, Inhelder B 1956 The Child’s Conception of Space. Routledge, London
Svorou S 1993 The Grammar of Space. Benjamins, Amsterdam
Talmy L 1983 How language structures space. In: Pick H, Acredolo L (eds.) Spatial Orientation: Theory, Research and Application. Plenum Press, New York
Talmy L 2000 Toward a Cognitive Semantics. MIT Press, Cambridge, MA, Vols. 1, 2
S. C. Levinson
Spatial Analysis in Geography
The proliferation and dissemination of digital spatial databases, coupled with the ever wider use of geographic information systems (GIS), is stimulating increasing interest in spatial analysis from outside the spatial sciences. The recognition of the spatial dimension in social science research sometimes yields different and more meaningful results than analysis which ignores it. Spatial analysis is a research paradigm that provides a unique set of techniques and methods for analysing events, in a very general sense, that are located in geographical space (see Table 1). Spatial analysis involves spatial modeling, which includes models of location-allocation, spatial interaction, spatial choice and search, spatial optimization, and space-time. Other entries in the encyclopedia take up these models (e.g., see Location Theory; Spatial Interaction Models; Spatial Optimization Models; Spatial-temporal Modeling); this article concentrates on spatial data analysis, the heart of spatial analysis.
1. Spatial Data and the Tyranny of Data
Spatial data analysis focuses on detecting patterns and exploring and modeling relationships between such patterns in order to understand processes responsible for observed patterns. In this way, spatial data analysis emphasizes the role of space as a potentially important explicator of socioeconomic systems, and attempts to enhance understanding of the working and representation of space, spatial patterns, and processes.

Table 1 Popular techniques and methods in spatial data analysis
Object data: point pattern
  Exploratory spatial data analysis: quadrat methods; kernel density estimation; nearest neighbor methods; K-function analysis
  Model-driven spatial data analysis: homogeneous and heterogeneous Poisson process models and multivariate extensions
Object data: area data
  Exploratory spatial data analysis: global measures of spatial association (Moran’s I, Geary’s c); local measures of spatial association (Gi and Gi* statistics, Moran’s scatter plot)
  Model-driven spatial data analysis: spatial regression models; regression models with spatially autocorrelated residuals
Field data
  Exploratory spatial data analysis: variogram and covariogram; kernel density estimation; Thiessen polygons
  Model-driven spatial data analysis: trend surface models; spatial prediction and kriging; spatial general linear modeling
Spatial interaction data
  Exploratory spatial data analysis: exploratory techniques for representing such data; techniques to uncover evidence of hierarchical structure in the data, such as graph-theoretic and regionalization techniques
  Model-driven spatial data analysis: spatial interaction models; location-allocation models; spatial choice and search models; modeling paths and flows through a network
1.1 Spatial Data and Data Types
Empirical studies in the spatial sciences routinely employ data for which locational attributes are an important source of information. Such data characteristically consist of one or a few cross-sections of observations for either micro-units such as individuals (households, firms) at specific points in space, or aggregate spatial entities such as census tracts, electoral districts, regions, provinces, or even countries. Observations such as these, for which the absolute location and/or relative positioning (spatial arrangement) is explicitly taken into account, are termed spatial data (e.g., see Spatial Data). In the socioeconomic realm, points, lines, and areal units are the fundamental entities for representing spatial phenomena. This form of spatial referencing is also a salient feature of GIS (e.g., see Geographic Information Systems; Spatial Data Infrastructure). Three broad classes of spatial data can be distinguished: (a) object data where the objects are either points (spatial point patterns or locational data, i.e., point locations at which events of interest have occurred) or
areas (area or lattice data, defined as discrete variations of attributes over space), (b) field data (also termed geostatistical or spatially continuous data), that is, observations associated with a continuous variation over space, given values at fixed sampling points, and (c) spatial interaction data (sometimes called link or flow data) consisting of measurements each of which is associated with a link or pair of locations representing points or areas. The analysis of spatial interaction data has a long and distinguished history in the study of a wide range of human activities, such as transportation movements, migration, and the transmission of information (see Spatial Interaction; Spatial Interaction Models). Field data play an important role in the environmental sciences, but are less important in the social sciences. This article, therefore, focuses on object data, the most appropriate perspective for spatial analysis applications in the social sciences. Object data include observations for micro-units at specific points in space, that is, spatial point patterns, and/or observations for aggregate spatial entities, that is, area data. Of note is that point data can be converted to area data, and area data can be represented by point reference. Areas may be regularly shaped such as pixels in remote sensing or irregularly shaped such as statistical reporting units. When divorced from their spatial context such data lose value and meaning. They may be viewed as single realizations of a spatial stochastic process, similar to the approach taken in the analysis of time series.
1.2 The Tyranny of Data
Analyzing and modeling spatial data present a series of problems. Solutions to many of them are obvious; others require extraordinary effort. Data exercise a power that can lead to misinterpretation and meaningless results; therein lies the tyranny of data. Quantitative analysis crucially depends on data quality. Good data are reliable, contain few or no mistakes, and can be used with confidence. Unfortunately, nearly all spatial data are flawed to some degree (e.g., see Spatial Data). Errors may arise in measuring both the location and attribute properties, but may also be associated with computerized processes responsible for storing, retrieving, and manipulating spatial data. The solution to the data quality problem is to take the necessary steps to keep faulty data from determining research results. The particular form (i.e., size, shape, and configuration) of the spatial aggregates can affect the results of the analysis to a varying, usually unknown, degree as evidenced in various types of analysis (see, e.g., Openshaw and Taylor 1979; Baumann et al. 1983). This problem has become generally known as the modifiable areal units problem (MAUP), the term stemming from the fact that areal units are not ‘natural’ but usually arbitrary constructs. For reasons of confidentiality, social science data (e.g., census data) are not often released for the primary units of observation (individuals), but only for a set of rather arbitrary areal aggregations (enumeration districts or census tracts). The problem arises whenever area data are analyzed or modeled and involves two effects: one derives from selecting different areal boundaries while holding the overall size and the number of areal units constant (the zoning effect). The other derives from reducing the number but increasing the size of the areal units (the scale effect).
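The zoning and scale effects can be seen in a toy calculation. The sketch below is purely illustrative: the eight basic units, their values, and the two zonings are invented for the demonstration, and the function names are hypothetical. It aggregates the same unit-level data into four zones in two different ways and compares the resulting correlation coefficients.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def aggregate(values, zones):
    """Mean of the basic-unit values falling in each zone."""
    return [sum(values[i] for i in zone) / len(zone) for zone in zones]


# Invented data for eight basic units arranged along a line.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

# Two zonings with the same number of zones but different boundaries.
zoning_a = [[0, 1], [2, 3], [4, 5], [6, 7]]
zoning_b = [[0], [1, 2], [3, 4], [5, 6, 7]]

r_units = pearson(x, y)                                         # basic units
r_a = pearson(aggregate(x, zoning_a), aggregate(y, zoning_a))   # zoning A
r_b = pearson(aggregate(x, zoning_b), aggregate(y, zoning_b))   # zoning B
```

Here the unit-level correlation is 38/42 (about 0.90). Moving from eight units to four zones raises it (a scale effect), and the two four-zone systems disagree with each other (a zoning effect): zoning A yields a correlation of 1, zoning B about 0.99.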
There is no analytical solution to the MAUP, but questions of the following kind have to be considered in constructing an areal system for analysis: What are the basic spatial entities for defining areas? What theory guides the choice of the spatial scale? Should the definition process follow strictly statistical criteria and merge basic spatial entities to form larger areas using some regionalization algorithms (see Wise et al. 1996)? These questions pose daunting challenges. In addition, boundary and frame effects (i.e., the geometric structure of the study area) may affect spatial analysis and the interpretation of results. These problems are considerably more complex than in time series. Although several techniques, such as refined K-function analysis, take the effect of boundaries into account, there is a need to study boundary effects more systematically. An issue that has been receiving increasing attention relates to the suitability of data. If the data, for example, are available only at the level of spatial
aggregates, but the research question is at the individual respondent level, then the ecological fallacy (ecological bias) problem arises. Using area-based data to draw inferences about underlying individual-level processes and relationships poses considerable risks. This problem relates to the MAUP through the concept of spatial autocorrelation. Spatial autocorrelation (also referred to as spatial dependence or spatial association) in the data can be a serious problem (e.g., see Spatial Autocorrelation), rendering conventional statistical analysis unsafe and requiring specialized spatial analytical tools. This problem refers to situations where the observations are nonindependent over space. That is, nearby spatial units are associated in some way. Sometimes, this association is due to a poor match between the spatial extent of the phenomenon of interest (e.g., a labor or housing market) and the administrative units for which data are available. Sometimes, it is due to a spatial spillover effect. The complications are similar to those found in time-series analysis, but are exacerbated by the multidirectional, two-dimensional nature of dependence in space rather than the unidirectional nature in time. Avoiding the pitfalls arising from spatially correlated data is crucial to good spatial data analysis, whether exploratory or confirmatory. Several scholars even argue that the notion of spatial autocorrelation is at the core of spatial analysis (see, e.g., Tobler 1979). No doubt, much of current interest in spatial analysis is directly derived from the monograph of Cliff and Ord (1973) on spatial autocorrelation that opened the door to modern spatial analysis.
2. Pattern Detection and Exploratory Analysis
Exploratory data analysis is concerned with the search for data characteristics such as trends, patterns, and outliers. This is especially important when the data are of poor quality or genuine a priori hypotheses are lacking. Many such techniques emphasize graphical views of the data that are designed to highlight particular features and allow the analyst to detect patterns, relationships, outliers, etc. Exploratory spatial data analysis (ESDA), an extension of exploratory data analysis (EDA) (Haining 1990, Cressie 1993), is geared especially to dealing with the spatial aspects of data.
2.1 Exploratory Techniques for Spatial Point Patterns
Point patterns arise when the important variable to be analysed is the location of events. At the most basic level, the data comprise only the spatial coordinates of events. They might represent a wide variety of spatial phenomena, such as cases of disease, crime incidents,
pollution sources, or locations of stores (e.g., see Spatial Pattern, Analysis of). Research typically concentrates on whether the proximity of particular point events, their location in relation to each other, represents a significant (i.e., nonrandom) pattern. Exploratory spatial point pattern analysis is concerned with exploring the first and second order properties of spatial point pattern processes. First order effects relate to variation in the mean value of the process (a large scale trend), while second order effects result from the spatial correlation structure or the spatial dependence in the process. Three types of methods are important: quadrat methods, kernel estimation of the intensity of a point pattern, and distance methods. Quadrat methods involve collecting counts of the number of events in subsets of the study region. Traditionally, these subsets are rectangular (thus, the name quadrat), although any shape is possible. The reduction of complex point patterns to counts of the number of events in quadrats and to one-dimensional indices is a considerable loss of information. There is no consideration of quadrat locations or of the relative positions of events within quadrats. Thus, most of the spatial information in the data is lost. Quadrat counts destroy spatial information, but they give a global idea of subregions with high or low numbers of events per area. For small quadrats more spatial information is retained, but the picture degenerates into a mosaic with many empty quadrats. Estimating the intensity of a spatial point pattern is very like estimating a bivariate probability density, and bivariate kernel estimation can be adapted easily to give an estimate of intensity. Choice of the specific functional form of the kernel presents little practical difficulty. For most reasonable choices of possible probability distributions the kernel estimate will be very similar, for a given bandwidth.
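The two first-order summaries just described, quadrat counts and a kernel estimate of intensity, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the study region is taken to be the rectangle [0, width] x [0, height], the kernel is Gaussian (one reasonable choice among many), no edge correction is applied, and the function names and sample events are hypothetical.

```python
from math import exp, pi


def quadrat_counts(points, width, height, nx, ny):
    """Count events in an nx-by-ny grid of rectangular quadrats.

    Returns a list of rows; counts[row][col] is the number of events
    in that quadrat. Events on the far edges fall in the last cell.
    """
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    return counts


def kernel_intensity(points, at, bandwidth):
    """Gaussian-kernel estimate of intensity (events per unit area) at `at`.

    Each event contributes a bivariate normal density with standard
    deviation `bandwidth`. Without edge correction, estimates close to
    the boundary of the study region are biased downwards.
    """
    sx, sy = at
    h2 = bandwidth ** 2
    total = 0.0
    for x, y in points:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        total += exp(-d2 / (2 * h2)) / (2 * pi * h2)
    return total


# Invented events in the unit square: three close together, one isolated.
events = [(0.1, 0.1), (0.2, 0.15), (0.15, 0.3), (0.8, 0.9)]
counts = quadrat_counts(events, 1.0, 1.0, 2, 2)   # [[3, 0], [0, 1]]
```

A 2 x 2 grid puts three events in one quadrat and one in the opposite corner, while `kernel_intensity(events, (0.15, 0.2), 0.1)` is far larger than `kernel_intensity(events, (0.8, 0.2), 0.1)`; a larger bandwidth would flatten that contrast.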
The bandwidth determines the amount of smoothing. There are techniques that attempt to optimize the bandwidth given the observed pattern of event location. A risk underlying the use of quadrats is that any spatial pattern detected may be dependent upon the size of the quadrat. In contrast, distance methods make use of precise information on the locations of events and have the advantage of not depending on arbitrary choices of quadrat size or shape. Nearest neighbor methods reduce point patterns to one-dimensional nearest neighbor summary statistics (see Dacey 1960, Getis 1964). But only the smallest scales of patterns are considered. Information on larger scales of patterns is unavailable. These statistics indicate merely the direction of departure from complete spatial randomness (CSR). The empirical K function, a reduced second-moment measure of the observed process, provides a vast improvement over nearest neighbor indices (see Ripley 1977, Getis 1984). It uses the precise location of events and includes all event–event distances, not just nearest neighbor distances, in its estimation. Care must be taken to correct for edge effects. K-function analysis can be used not only to explore spatial dependence, but also to suggest specific models to represent it and to estimate the parameters of such models. The concept of K-functions can be extended to the multivariate case of a marked point process (i.e., locations of events and associated measurements or marks) and to the time-space case.
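The contrast between a nearest neighbor summary and the K function can be illustrated as follows. The sketch makes several simplifying assumptions that should be read as such: it uses the naive K estimator (area divided by n squared, times the number of ordered event pairs no further apart than d) and omits the edge correction the text warns about; the CSR benchmarks used for comparison are the standard ones that ignore edge effects, namely an expected mean nearest neighbor distance of 1 / (2 * sqrt(n / area)) and K(d) = pi * d^2. Function names and the sample pattern are hypothetical.

```python
from math import dist, pi, sqrt


def mean_nn_distance(points):
    """Mean distance from each event to its nearest neighbouring event."""
    return sum(
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / len(points)


def expected_nn_distance_csr(n, area):
    """Expected mean nearest neighbour distance under CSR (no edge effects)."""
    return 1.0 / (2.0 * sqrt(n / area))


def k_function(points, area, d):
    """Naive K estimate at distance d, without edge correction.

    Uses all ordered event-event pairs, not just nearest neighbours.
    """
    n = len(points)
    pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and dist(p, q) <= d
    )
    return area * pairs / (n * n)


# Invented pattern: four events in a tight cluster inside the unit square.
cluster = [(0.50, 0.50), (0.51, 0.50), (0.50, 0.51), (0.51, 0.51)]
observed_nn = mean_nn_distance(cluster)          # about 0.01
benchmark_nn = expected_nn_distance_csr(4, 1.0)  # 0.25 under CSR
k_hat = k_function(cluster, 1.0, 0.05)           # 0.75, against pi * 0.05**2
```

The nearest neighbor ratio (observed over expected) is well below 1, which signals only the direction of the departure from CSR. Evaluating `k_function` over a range of distances shows at which scales the excess of close pairs occurs.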
2.2 Exploratory Analysis of Area Data
Exploratory analysis of area data is concerned with identifying and describing different forms of spatial variation in the data. Special attention is given to measuring spatial association between observations for one or several variables. Spatial association can be identified in a number of ways, rigorously by using an appropriate spatial autocorrelation statistic (Cliff and Ord 1981), or more informally, for example, by using a scatter-plot and plotting each value against the mean of neighboring areas (Haining 1990). In the rigorous approach to spatial autocorrelation the overall pattern of dependence in the data is summarized in a single indicator, such as Moran’s I and Geary’s c. While Moran’s I is based on cross-products to measure value association, Geary’s c employs squared differences. Both require the choice of a spatial weights or contiguity matrix that represents the topology or spatial arrangement of the data and represents our understanding of spatial association. Getis (1991) has shown that these indicators are special cases of a general formulation (called gamma) defined by a matrix representing possible spatial associations (the spatial weights matrix) among all areal units, multiplied by a matrix representing some specified nonspatial association among the areas. The nonspatial association may be a social, economic, or other relationship. When the elements of these matrices are similar, high positive autocorrelation arises. Spatial association specified in terms of covariances leads to Moran’s I; specified in terms of differences, it leads to Geary’s c. These global measures of spatial association can be used to assess spatial interaction in the data and can be visualized easily by means of a spatial variogram, a series of spatial autocorrelation measures for different orders of contiguity.
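A short sketch may help to show how the two global indicators use the same weights matrix in different ways. It assumes the usual textbook definitions, which the article does not spell out: with deviations z_i from the mean and S0 the sum of all weights, Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2 and Geary's c = ((n - 1) / (2 * S0)) * sum_ij w_ij (x_i - x_j)^2 / sum_i z_i^2. The function names, the binary contiguity matrix, and the data are invented for illustration.

```python
def morans_i(values, weights):
    """Moran's I: cross-products of deviations from the mean, weighted by W."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


def gearys_c(values, weights):
    """Geary's c: squared differences between values, weighted by W."""
    n = len(values)
    mean = sum(values) / n
    s0 = sum(sum(row) for row in weights)
    sq_diff = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    return ((n - 1) / (2 * s0)) * sq_diff / sum((v - mean) ** 2 for v in values)


# Four areas in a row; neighbours share a boundary (binary contiguity).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = [1, 2, 3, 4]            # similar values sit next to each other
i_stat = morans_i(trend, w)     # 1/3
c_stat = gearys_c(trend, w)     # 0.3
```

Both statistics point the same way for this smooth gradient: I lies above its expectation of -1 / (n - 1) under no autocorrelation, and c lies below 1. Swapping in an alternating series such as `[1, 4, 1, 4]` reverses both.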
A major drawback of global statistics of spatial autocorrelation is that they are based on the assumption of spatial stationarity, which implies inter alia a constant mean (no spatial drift) and constant variance (no outliers) across space. This was useful in the analysis of small data sets characteristic of pre-GIS times but is not very meaningful in the context of thousands or even millions of spatial units that characterize current, data-rich environments. In view of increasingly data-rich environments, a focus on local patterns of association (‘hot spots’) and an allowance for local instabilities in overall spatial
association has recently been suggested as a more appropriate approach. Examples of techniques that reflect this perspective are the various geographical analysis machines developed by Openshaw and coworkers (see, e.g., Openshaw et al. 1990), the Moran scatter plot (Anselin 1996), and the distance-based Gi and Gi* statistics of Getis and Ord (1993). This last has gained wide acceptance. These G-indicators can be calculated for each location i in the data set as the ratio of the sum of values in neighboring locations (defined to be within a given distance or order of contiguity) to the sum over all the values. The two statistics differ with respect to the inclusion of the value observed at i in the calculation (included in Gi*, not included in Gi). They can be mapped easily and used in an exploratory analysis to detect the existence of pockets of local nonstationarity, to identify distances beyond which no discernible association arises, and to find the appropriate spatial scale for further analysis (e.g., see Spatial Association, Measures of). No doubt, ESDA provides useful means to generate insights into global and local patterns and associations in spatial data sets. The use of ESDA techniques, however, generally is restricted to expert users interacting with the data displays and statistical diagnostics to explore spatial information and to fairly simple low-dimensional data sets. In view of these limitations, there is a need for novel exploration tools sufficiently automated and powerful to cope with the data-richness-related complexity of exploratory analysis in spatial data environments (see, e.g., Openshaw and Fischer 1994).
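The ratio form of the G-indicators described above can be sketched directly. The details below are assumptions made for illustration: neighbours are defined by a distance threshold d with binary weights, the values are taken to be positive, the denominator of Gi is assumed to leave out the value at i (in line with the exclusion of i from that statistic), and only the raw ratios are computed, not the standardized versions normally used for significance testing. The function name and the data are hypothetical.

```python
from math import dist


def g_statistics(coords, values, i, d):
    """Return the raw (Gi, Gi*) ratios for location i and distance d.

    Gi  : sum of values within d of i, excluding i itself, divided by
          the sum of all values except the one at i.
    Gi* : the same sum with the value at i included, divided by the
          sum of all values.
    """
    neighbours = [
        j for j in range(len(coords))
        if j != i and dist(coords[i], coords[j]) <= d
    ]
    nearby = sum(values[j] for j in neighbours)
    total = sum(values)
    gi = nearby / (total - values[i])
    gi_star = (nearby + values[i]) / total
    return gi, gi_star


# Invented data: three high values close together, one low value far away.
coords = [(0, 0), (1, 0), (2, 0), (10, 0)]
values = [10, 8, 9, 1]
gi, gi_star = g_statistics(coords, values, i=1, d=1.5)   # 0.95 and 27/28
```

For the middle location of the group almost all of the total value lies within the chosen distance, which is what a local ‘hot spot’ looks like in these terms. The isolated location has no neighbours at that distance, so its Gi is 0.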
3. Model-Driven Spatial Data Analysis
ESDA is a preliminary step in spatial analysis to more formal modeling approaches. Model-driven analysis of spatial data relies on testing hypotheses about patterns and relationships, utilizing a range of techniques and methodologies for hypothesis testing, the determination of confidence intervals, estimation of spatial models, simulation, prediction, and assessment of model fit. Getis and Boots (1978), Cliff and Ord (1981), Upton and Fingleton (1985), Anselin (1988), Griffith (1988), Haining (1990), Cressie (1993), and Bailey and Gatrell (1995) have helped to make model-driven spatial data analysis accessible to a wide audience in the spatial sciences.
3.1 Modeling Spatial Point Patterns
Spatial point pattern analysis grew out of the hypothesis-testing tradition rather than the pattern-recognition tradition. The spatial pattern analyst tests hypotheses about the spatial characteristics of point patterns. Typically, complete spatial randomness (CSR) represents the null hypothesis against which to assess whether observed point patterns are regular, clustered, or random. The
standard model for CSR is that events follow a homogeneous Poisson process over the study region, that is, events are independently and uniformly distributed over space, equally likely to occur anywhere in the study region and not interacting with each other. Various statistics for testing CSR are available. Nearest neighbor tests have their place in distinguishing CSR from spatially regular or clustered patterns. But little is known about their behavior when CSR does not hold. The K-function may suggest a way of fitting alternative models. Correcting for edge effects, however, might present some difficulty. The distribution theory for complicated functions of the data can be intractable even under the null hypothesis of CSR. Monte Carlo tests are a way around this problem. If the null hypothesis of CSR is rejected, the next obvious step in model-driven spatial pattern analysis is to fit some alternative (parametric) model to the data. Departure from CSR is typically toward regularity or clustering of events. Clustering can be modeled through a heterogeneous Poisson process, a doubly stochastic point process, or a Poisson cluster process arising from the explicit incorporation of a spatial clustering mechanism. Simple inhibition processes can be utilized to model regular point patterns. Markov point processes can incorporate both elements through large-scale clustering and small-scale regularity. After a model has been fitted (usually via maximum likelihood or least squares using the K-function), diagnostic tests have to be performed to assess its goodness-of-fit. Inference for the estimated parameters is often needed in response to a specific research question. The necessary distribution theory for the estimates can be difficult to obtain, in which case approximations may be necessary.
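A Monte Carlo test of CSR, as mentioned above, can be sketched as follows. The choices here are illustrative assumptions rather than the article's prescription: the test statistic is the mean nearest neighbor distance, the study region is a rectangle with its corner at the origin, CSR is simulated conditional on the observed number of events (n independent uniform points), the test is one-sided against clustering, and the function names, the seed, and the sample pattern are hypothetical.

```python
import random
from math import dist


def mean_nn_distance(points):
    """Mean distance from each event to its nearest neighbouring event."""
    return sum(
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / len(points)


def simulate_csr(n, width, height, rng):
    """n independent, uniformly distributed events in the study rectangle."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]


def monte_carlo_csr_test(points, width, height, n_sim=99, seed=0):
    """One-sided Monte Carlo p-value for clustering.

    The observed statistic is ranked among statistics from n_sim CSR
    simulations in the same region, so edge effects influence observed
    and simulated values alike and no distribution theory is required.
    """
    rng = random.Random(seed)
    observed = mean_nn_distance(points)
    simulated = [
        mean_nn_distance(simulate_csr(len(points), width, height, rng))
        for _ in range(n_sim)
    ]
    # Small mean nearest neighbour distances indicate clustering.
    n_as_small = sum(1 for s in simulated if s <= observed)
    return (n_as_small + 1) / (n_sim + 1)


# Invented pattern: 20 events packed into a tiny part of the unit square.
clustered = [(0.5 + 0.001 * i, 0.5 + 0.001 * (i % 5)) for i in range(20)]
p_value = monte_carlo_csr_test(clustered, 1.0, 1.0)
```

With 99 simulations the smallest attainable p-value is 0.01. For a two-sided test, or for a statistic such as the K function evaluated over a range of distances, the ranking step would change but the logic of comparing the observed pattern with simulated CSR patterns stays the same.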
If, for example, clustering is found, one may be interested in the question of whether particular spatial aggregations, or clusters, are associated with proximity to particular sources of some other factor. This leads to multivariate point pattern analysis, a special case of marked spatial point process analysis. For further details see Cressie (1993).
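The Monte Carlo approach to testing CSR described in this section can be illustrated with a minimal sketch. It assumes a unit-square study region, uses the mean nearest-neighbor distance as the test statistic, and ignores edge effects; the function names, the number of simulations, and the clustered example pattern are all hypothetical choices made for illustration, not a procedure prescribed by the sources cited here.

```python
import math
import random


def mean_nn_distance(points):
    """Mean distance from each event to its nearest neighbouring event."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)


def csr_pattern(n, rng):
    """n events placed independently and uniformly on the unit square (CSR)."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def monte_carlo_csr_test(points, n_sim=199, seed=0):
    """Rank the observed statistic among values simulated under CSR.

    Returns the observed mean nearest-neighbour distance and a one-sided
    Monte Carlo p-value against the alternative of clustering
    (clustering produces unusually small nearest-neighbour distances).
    """
    rng = random.Random(seed)
    observed = mean_nn_distance(points)
    simulated = [
        mean_nn_distance(csr_pattern(len(points), rng)) for _ in range(n_sim)
    ]
    n_as_small = sum(1 for s in simulated if s <= observed)
    return observed, (n_as_small + 1) / (n_sim + 1)


# Hypothetical clustered pattern: 25 events scattered tightly around each of
# two centres (an assumption for illustration only).
rng = random.Random(42)
clustered = [
    (min(max(cx + rng.gauss(0, 0.02), 0.0), 1.0),
     min(max(cy + rng.gauss(0, 0.02), 0.0), 1.0))
    for cx, cy in [(0.3, 0.3), (0.7, 0.7)]
    for _ in range(25)
]
obs, p_clustered = monte_carlo_csr_test(clustered)
print(f"mean NN distance = {obs:.4f}, Monte Carlo p-value = {p_clustered:.3f}")
```

Because the reference distribution is built by simulation, no analytical distribution theory for the statistic is needed; the same scheme would apply to other statistics, such as an estimated K-function, provided the simulated patterns are generated over the same study region.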
3.2 Modeling Area Data
Linear regression models constitute the leading modeling approach for analyzing social and economic phenomena. But conventional regression analysis does not take into account problems associated with possible cross-sectional correlations among observational units caused by spatial dependence. Two forms of spatial dependence among observations may invalidate regression results: spatial error dependence and spatial lag dependence.
Spatial error dependence might follow from measurement errors such as a poor match between the spatial units of observation and the spatial scale of the phenomenon of interest. Presence of this form of spatial dependence does not cause ordinary least squares estimates to be biased, but it affects their efficiency. The variance estimator is biased downwards, thus inflating the R². It also affects the t- and F-statistics for tests of significance and a number of standard misspecification tests, such as tests for heteroskedasticity and structural stability (Anselin and Griffith 1988). To protect against such difficulties, one should use diagnostic statistics to test for spatial dependence among error terms and, if necessary, take action to properly specify the spatially autocorrelated residuals. Typically, dependence in the error term is specified as a spatial autoregressive or as a spatial moving average process. Such regression models require nonlinear maximum likelihood estimation of the parameters (Cliff and Ord 1981, Anselin 1988).
In the second form, spatial lag dependence, spatial autocorrelation is attributable to spatial interactions in the data. This form may be caused, for example, by significant spatial externalities of a socioeconomic process under study. Spatial lag dependence yields biased and inconsistent parameter estimates. To specify a regression model involving spatial interaction, one must incorporate the spatial dependence into the covariance structure either explicitly or implicitly by means of an autoregressive and/or moving-average interaction structure. This constitutes the model identification problem, which is usually addressed using the correlogram and partial correlogram. A number of spatial regression models, that is, regression models with spatially lagged dependent variables (spatial autoregressive models), have been developed that include one or more spatial weight matrices which describe the many spatial associations in the data.
The models incorporate either a simple general stochastic autocorrelation parameter or a series of autocorrelation parameters, one for each order of contiguity (see Cliff and Ord 1981, Anselin 1988). Maximum likelihood procedures are fundamental to spatial regression model estimation, but data screening and filtering can simplify estimation. Tests and estimators are clearly sensitive not only to the MAUP, but also to the specification of the spatial interaction structure represented by the spatial weights matrix. Recent advances in computation-intensive approaches to estimation and inference in econometrics and statistical modeling may yield new ways to tackle this specification issue. In practice, it is often difficult to choose between regression model specifications with spatially autocorrelated errors and regression models with spatially lagged dependent variables, though the ‘common factor’ approach (Bivand 1984) can be applied if the spatial lags are neatly nested.
Unlike linear regression, for which a large set of techniques for model specification and estimation now exists, the incorporation of spatial effects into nonlinear models in general, and into models with limited dependent variables or count data (such as log-linear, logit and tobit models) in particular, is still in its infancy. The hybrid log-linear models of Aufhauser and Fischer (1985) are among the few exceptions. The same is true of the design of models that combine cross-sectional and time series data for areal units. See Hordijk and Nijkamp (1977) for dynamic spatial diffusion models.
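The diagnostic step recommended in this section, testing for spatial dependence among regression error terms, can be sketched with Moran's I and a random permutation test. This is an illustration under stated assumptions rather than the formal test discussed by Cliff and Ord (1981): the 5 × 5 grid, the rook-contiguity weights, the residual values, and all function names are hypothetical, and treating regression residuals as exchangeable under permutation is itself only an approximation.

```python
import random


def rook_weights(n_rows, n_cols):
    """Binary contiguity matrix for a regular grid: 1 if two cells share an edge."""
    n = n_rows * n_cols
    w = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i][rr * n_cols + cc] = 1
    return w


def morans_i(values, w):
    """Moran's I: scaled cross-product of deviations from the mean over i != j."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(w[i][j] for i, j in pairs)
    cross = sum(w[i][j] * dev[i] * dev[j] for i, j in pairs)
    return (n / s0) * cross / sum(d * d for d in dev)


def permutation_p_value(values, w, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)
    observed = morans_i(values, w)
    shuffled = list(values)
    n_ge = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign the values to locations at random
        if morans_i(shuffled, w) >= observed:
            n_ge += 1
    return observed, (n_ge + 1) / (n_perm + 1)


# Hypothetical residuals with a north-south gradient, the kind of pattern
# an omitted regional effect might leave behind.
w = rook_weights(5, 5)
residuals = [float(r - 2) for r in range(5) for _ in range(5)]
i_obs, p_value = permutation_p_value(residuals, w)
print(f"Moran's I = {i_obs:.2f}, pseudo p-value = {p_value:.3f}")
```

A small p-value here would suggest that the error terms are not spatially random, so that a specification with spatially autocorrelated errors or a spatially lagged dependent variable deserves consideration.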
4. Toward Intelligent Spatial Analysis
Spatial analysis is currently entering a period of rapid change leading to what is termed intelligent spatial analysis (sometimes referred to as geocomputation). The driving forces are a combination of huge amounts of digital spatial data from the GIS data revolution (with 100,000 to millions of observations), the availability of attractive soft computing tools, the rapid growth in computational power, and the new emphasis on exploratory data analysis and modeling.
Intelligent spatial analysis has the following properties. It exhibits computational adaptivity (i.e., an ability to adjust local parameters and/or global configurations in response to changes in the environment); computational fault tolerance in dealing with incomplete, inaccurate, distorted, missing, noisy and confusing data, information rules, and constraints; speed approaching human-like turnaround; and error rates that approximate human performance. The use of the term ‘intelligent’ is, therefore, closer to that in computational intelligence than in artificial intelligence. The distinction between artificial and computational intelligence is important because our semantic descriptions of models and techniques, their properties, and our expectations of their performance should be tempered by the kind of systems we want and the ones we can build (Bezdek 1994).
Much of the recent interest in intelligent spatial analysis stems from the growing realization of the limitations of conventional spatial analysis tools as vehicles for exploring patterns in data-rich GI (geographic information) and RS (remote sensing) environments and from the consequent hope that these limitations may be overcome by judicious use of computational intelligence technologies such as evolutionary computation (genetic algorithms, evolutionary programming, and evolutionary strategies) (see Openshaw and Fischer 1994) and neural network modeling (see Fischer 1998).
Neural network models may be viewed as nonlinear extensions of conventional statistical models that are applicable to two major domains: first, as universal approximators to areas such as spatial regression, spatial interaction, spatial choice, and space-time series analysis (see, e.g., Fischer and Gopal 1994); and second, as pattern recognizers and classifiers to allow the user to sift through the data intelligently, reduce dimensionality, and find patterns of interest in data-rich environments (see, e.g., Fischer et al. 1997).

See also: Geographic Information Systems; Location Theory; Mathematical Models in Geography; Regional Science; Remote Sensing; Spatial Association, Measures of; Spatial Autocorrelation; Spatial Choice Models; Spatial Data; Spatial Data Infrastructure; Spatial Interaction; Spatial Interaction Models; Spatial Optimization Models; Spatial Pattern, Analysis of; Spatial Sampling; Spatial-temporal Modeling; Spectral Analysis; Urban Geography; Urban System in Geography
Bibliography
Anselin L 1988 Spatial Econometrics: Methods and Models. Kluwer Academic Publishers, Dordrecht
Anselin L 1996 The Moran scatterplot as an ESDA tool to assess local instability in spatial association. In: Fischer M M, Scholten H J, Unwin D (eds.) Spatial Analytical Perspectives on GIS. Taylor & Francis, London
Anselin L, Florax R J G M (eds.) 1995 New Directions in Spatial Econometrics. Springer, Berlin
Anselin L, Griffith D A 1988 Do spatial effects really matter in regression analysis? Papers, Regional Science Association 65: 11–34
Aufhauser E, Fischer M M 1985 Log-linear modelling and spatial analysis. Environment and Planning A 17: 931–51
Bailey T C, Gatrell A C 1995 Interactive Spatial Data Analysis. Longman, Harlow, UK
Baumann J H, Fischer M M, Schubert U 1983 A multiregional labor supply model for Austria: The effects of different regionalizations in multiregional labour market modelling. Papers, Regional Science Association 52: 53–83
Bezdek J C 1994 What’s computational intelligence. In: Zurada J M, Marks II R J, Robinson C J (eds.) Computational Intelligence: Imitating Life. Institute of Electrical and Electronics Engineers, New York
Bivand R S 1984 Regression modeling with spatial dependence: An application of some class selection and estimation methods. Geographical Analysis 16: 25–37
Cliff A D, Ord J K 1973 Spatial Autocorrelation. Pion, London
Cliff A D, Ord J K 1981 Spatial Processes, Models & Applications. Pion, London
Cressie N A C 1993 Statistics for Spatial Data, rev. edn. Wiley, New York
Dacey M F 1960 A note on the derivation of the nearest neighbour distances. Journal of Regional Science 2: 81–7
Fischer M M 1998 Computational neural networks: a new paradigm for spatial analysis. Environment and Planning A 30(10): 1873–91
Fischer M M, Getis A (eds.) 1997 Recent Developments in Spatial Analysis: Spatial Statistics, Behavioural Modelling, and Computational Intelligence. Springer, Berlin
Fischer M M, Gopal S 1994 Artificial neural networks. A new approach to modelling interregional telecommunication flows. Journal of Regional Science 34: 503–27
Fischer M M, Gopal S, Staufer P, Steinnocher K 1997 Evaluation of neural pattern classifiers for a remote sensing application. Geographical Systems 4: 195–223, 233–4
Fischer M M, Scholten H J, Unwin D (eds.) 1996 Spatial Analytical Perspectives on GIS. Taylor & Francis, London
Fotheringham S, Rogerson P (eds.) 1994 Spatial Analysis and GIS. Taylor & Francis, London
Getis A 1964 Temporal land-use pattern analysis with the use of nearest neighbor and quadrat methods. Annals of the Association of American Geographers 54: 391–9
Getis A 1984 Interaction modelling using second-order analysis. Environment and Planning A 16: 173–83
Getis A 1991 Spatial interaction and spatial autocorrelation: A cross-product approach. Papers, Regional Science Association 69: 69–81
Getis A, Boots B 1978 Models of Spatial Processes. Cambridge University Press, Cambridge, UK
Getis A, Ord J K 1993 The analysis of spatial association by use of distance statistics. Geographical Analysis 25(3): 276
Griffith D A 1988 Advanced Spatial Statistics: Special Topics in the Exploration of Quantitative Spatial Data Series. Kluwer Academic, Dordrecht
Haining R 1990 Spatial Data Analysis in the Social Sciences. Cambridge University Press, Cambridge
Hordijk L, Nijkamp P 1977 Dynamic models of spatial autocorrelation. Environment and Planning A 9: 505–19
Longley P, Batty M (eds.) 1996 Spatial Analysis: Modelling in a GIS Environment. GeoInformation International, Cambridge, UK
Openshaw S, Cross A, Charlton M 1990 Building a prototype geographical correlates exploration machine. International Journal of Geographical Information Systems 4: 297–312
Openshaw S, Fischer M M 1994 A framework for research on spatial analysis relevant to geostatistical information systems in Europe. Geographical Systems 2(4): 325–37
Openshaw S, Taylor P 1979 A million or so correlation coefficients: Three experiments on the modifiable areal unit problem. In: Wrigley N (ed.) Statistical Applications in the Spatial Sciences. Pion, London
Ripley B D 1977 Modelling spatial patterns. Journal of the Royal Statistical Society 39: 172–212
Tobler W 1979 Cellular geography. In: Gale S, Olsson G (eds.) Philosophy in Geography. Reidel, Dordrecht
Upton J G, Fingleton B 1985 Spatial Data Analysis by Example. Wiley, New York
Wise S, Haining R, Ma J 1996 Regionalisation tools for the exploratory spatial analysis of health data. In: Fischer M M, Getis A (eds.) Recent Developments in Spatial Analysis: Spatial Statistics, Behavioural Modelling and Computational Intelligence. Springer, Berlin
M. M. Fischer
Spatial Association, Measures of
In the context of the following discussion, association means connectedness or relationship. If spatial association exists, the connectedness is over space (the earth’s surface), and it is mappable. If variables are somehow held together tightly or loosely in space, as if a magnet were pulling them into a pattern reminiscent of the well-known iron filings demonstration, one can say that the variables are spatially associated.
1. Introduction
Spatial association can characterize a single variable or many variables simultaneously. A single variable may be spatially autocorrelated; that is, values of the variable are somehow connected or related spatially (see Spatial Autocorrelation). Many variables (say, environmental variables such as air quality, water quality, certain species location, and human activities) may be associated one with another at one or more sites. In addition, spatial association and spatial interaction are related (see Spatial Interaction). If there is spatial interaction, for example, the movement of commuters to workplaces, there is also spatial association since home places are spatially connected to work places.
The scientist interested in the study of spatial association is motivated by the need to understand the spatial relationships between variables at specific sites or how variables of interest are spread across the earth’s surface. Many disciplines are particularly concerned with the study of spatial association. Among them are geography, regional science, ecology, epidemiology, regional and urban economics, geology, hydrology and geomorphology. Since this is a broad and popular interest area, many methods and techniques of analysis have been created to confirm notions of spatial association between and among variables. Scientists test or theorize about variables to determine whether spatial association, either observed or expected, actually can be confirmed. As a result of this interest, statistical procedures have been developed for identifying and measuring the existence of spatial association. After a short discussion of maps depicting spatial association, the remainder of this entry will outline the characteristics of measurement procedures.
Figure 1 The study area represented by the black square indicates that there may be an in situ relationship among the three variables shown by different shading
Figure 2 The width of the arrows indicates the magnitude of the interaction
2. Maps Showing Spatial Association
Maps depicting spatial association fall into several categories: in situ, flows, time series, distance decay, and residuals. In situ maps reveal by means of overlays the possible association of two or more variables in a region (see Fig. 1). Flow maps show the interaction between sites (see Fig. 2). Time series usually are shown as a series of map views over time (days, years, etc.). These represent spatial relationships as they evolve or change over time (see Fig. 3a, b, c). Distance decay maps emphasize Tobler’s First Law of Geography that ‘everything is related to everything else, [but] near things are more related than distant things’. A distance decay map might show that the effect of an earthquake declines with distance from its epicenter or that the influence of a retail store declines with distance from its site (see Fig. 4). Residual maps show the remaining ‘unexplained’ variation after a model has been used to statistically explain a relationship between two or more variables (see Fig. 5). The unexplained spatial variation represents that part of whatever total variation exists in a mathematical model that is not accounted for by the variables included in the model. Residual maps help researchers identify new variables that might belong in a more complete model, such as a transportation cost, a regional effect, or a previously unidentified physical or human anomaly.

Figure 3 Over time {(a), (b), (c)}, there is an increase in the number of events at the same time as spacing between events is reduced

Figure 4 On the plane, as distance increases from A to B the effect on A of the variable indicated by the vertical triangle declines

Figure 5 In standard deviational units, the map shows residuals from a regression model. It remains to be determined whether or not there is a spatially statistically significant association among the residuals
Figure 6 The spatial pattern shown in (a) is reduced to a W matrix of spatial associations (b) and a Y matrix (c) made up of absolute differences between values associated with each region
3. Spatial Association in Matrix Terms
In order to answer scientific questions about spatial association, one can reduce the complexities of what one sees on a map to a mathematically manipulable set of numbers (Hubert et al. 1981, Getis 1991). The following simple cross-product formula represents spatial association:

$$\Gamma = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, y_{ij} \qquad (1)$$

Γ is the measure of spatial association, and the $w_{ij}$ are contained in a matrix, W, of values that represent the relationship of each location of interest to all other sites (see Fig. 6a and b). Figure 6a is a hypothetical map of ten regions. The numbers in parentheses represent the value that a hypothetical variable, Y, takes in each of the regions. Suppose it is decided that regions that are contiguous to one another are associated. The W matrix (Fig. 6b) indicates the spatial relationship of each region to every other. For Fig. 6b, the convention is adopted of placing a 1 in the matrix if a link (contiguity) exists and zero otherwise. The Y matrix (see Fig. 6c) contains numbers that are the absolute difference between values for all combinations of the regions. For example, the absolute difference between values in regions C and E is shown as an element of the matrix in row C and column E, that is, $|6 - 7| = 1$. The two matrices, when summed as a cross-product as shown in Eqn. 1 (see Fig. 7), give the value of Γ. If the values in the Y matrix are small for those same rows and columns that correspond to the $w_{ij} = 1$ values in the W matrix, the value of Γ will be small. For the example shown, Γ equals 66, an average of 2.2 for each of the 30 links between contiguous regions. If the number 1 appeared in all the W cells, the total Γ would have been 300, or a mean of 3.3 for each of the hypothetical 90 links. The implication here is that the contiguous neighbors that characterize W give a smaller Γ value than one might expect if the values $y_{ij}$ were randomly placed in the 10 regions. Γ indicates that there may be a spatial association among the Y values. We use the term ‘may be’ to indicate that we have not yet used a statistical test to confirm that Y is in fact a spatially associated variable.

Figure 7 The value of Γ is made up of the sum of numbers shown in bold

3.1 The W Matrix
In the preceding paragraph, we outlined a simple form of a W matrix, one that was constructed based on contiguous spatial relationships between regions. The W matrix embodies our preconceived or derived understanding of spatial relationships. If we believe, or if theory tells us, that a particular spatial relationship is distance dependent, then the W matrix should reflect that supposition. For example, if it is assumed that a spatial relationship declines in strength as distance increases from any given site, then the W matrix will show that nearby areas are weighted more highly than sites that are far from one another. Various distance-decay formulations theorized or derived for such phenomena as travel behavior, economic interaction, or disease transmission would require the elements within the W matrix to reflect these effects. Thus, a typical W matrix might contain elements of the form $w_{ij} = 1/d_{ij}^{\alpha}$ (α greater than 1), so that short-distance relationships would be weighted more heavily than long-distance ones.

3.2 The Y Matrix
The nonspatial matrix, Y, gives us a view of how a variable(s) whose elements are georeferenced, that is, values of the variable are indexed at certain sites, might be associated with one another. The elements of the Y matrix, $y_{ij}$, represent the interaction of values of Y. Values can interact in a number of ways: addition, subtraction, multiplication, division, or combinations of these. An important type of interaction is that of covariance, that is, a value at one site may vary from the mean of all sites (the variation) and this difference is multiplied by the variation at another site. Taken over all sites, the covariation is

$$\sum_{i=1}^{n} \sum_{j=1}^{n} (y_i - \bar{y})(y_j - \bar{y})$$

and the covariance is this sum divided by the sample size, n.

3.3 Γ
When the matrix W and the matrix Y are similarly structured, one can conclude that Y values interact very much as the individual regions are spatially associated. Thus, Γ reveals the spatial association in the Y variable. In our previous example, we purposely created a W matrix that represented a relationship between contiguous regions. The fact that the values of the Y variable tend to be similar to one another when regions are contiguous allows us to conclude that the two matrices are similarly structured.
In the literature on spatial association, much attention has been given to the structure of W and Y. Each, in a sense, represents a related area of research. Those concerned with the W matrix have devised increasingly complex representations of spatial relations among spatial units. From the simple contiguity structure presented above, empiricists and theorists have constructed W matrices that represent supposed migration distances, disease transmission distances, assumed hierarchical spatial relations, political boundary effects, psychological distances, barriers to communication effects, economic effects of distance, social distances, and so on. In experimental work, the matrix W has also been used to represent concentric bands measured from various spatial foci or as representative of cumulative distances. These latter matrices help researchers to identify ‘critical’ distances, that is, distances at which association reaches some peak.
Research on the structure of the Y matrix usually is a response to a particular type of possible spatial association. For example, the researcher may wonder whether the association is that of positive covariance. Y is structured in such a way that statistical tests can be carried out on the research hypothesis that statistically significant spatial association exists. Several of these formulations are given below. Each is an alternative method of identifying spatial association.
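The cross-product form of Eqn. 1 can be sketched in a few lines. The example below does not reproduce Fig. 6; it assumes a hypothetical chain of four regions A, B, C, D with values 2, 3, 7, 8, in which only adjacent regions are contiguous, and all function names are illustrative. It builds a contiguity W, an inverse-distance W of the form 1/d^α, an absolute-difference Y, and a covariation Y, and then evaluates Γ for several combinations.

```python
import math


def gamma(w, y):
    """Cross-product statistic of Eqn. 1: sum over i and j of w[i][j] * y[i][j]."""
    n = len(w)
    return sum(w[i][j] * y[i][j] for i in range(n) for j in range(n))


def abs_difference_matrix(values):
    """Y matrix of absolute differences between the values of every pair of regions."""
    return [[abs(a - b) for b in values] for a in values]


def covariation_matrix(values):
    """Y matrix of products of deviations from the mean."""
    mean = sum(values) / len(values)
    return [[(a - mean) * (b - mean) for b in values] for a in values]


def inverse_distance_weights(coords, alpha):
    """W matrix with elements 1 / d_ij ** alpha and zeros on the diagonal."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j]) ** alpha
    return w


# Hypothetical data: four regions in a row, A-B-C-D (an assumption, not Fig. 6).
values = [2, 3, 7, 8]
w_contig = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
w_all = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

y_abs = abs_difference_matrix(values)
gamma_contig = gamma(w_contig, y_abs)   # contiguous links only
gamma_all = gamma(w_all, y_abs)         # every pair of regions linked
mean_contig = gamma_contig / 6          # 6 ordered contiguous links
mean_all = gamma_all / 12               # 12 ordered links in total

# Covariation form of Y: a positive value suggests that neighbours deviate
# from the mean in the same direction.
gamma_cov = gamma(w_contig, covariation_matrix(values))

# Distance-decay weights for region centres spaced one unit apart.
w_dist = inverse_distance_weights([(0, 0), (1, 0), (2, 0), (3, 0)], alpha=2)

print(gamma_contig, gamma_all, round(mean_contig, 2), round(mean_all, 2), gamma_cov)
```

As in the worked example of the text, the mean absolute difference over contiguous links is smaller than the mean over all possible links, which hints at spatial association without yet amounting to a statistical test.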
4. Spatial Association as Covariance
Typical formula:

$$I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \sum_{i=1}^{n} (y_i - \bar{y})^2}, \quad i \neq j \qquad (2)$$

This is the well-known Moran’s I statistic, explicated by Cliff and Ord (1981), and used widely as a measure of spatial association for spatial residuals from regression analysis. When the data used for testing purposes are georeferenced, the basic form of the linear regression model requires that the error term be spatially randomly distributed. Moran’s I is used as a test for the randomness of residuals from regression. The elements of the W matrix are represented as $w_{ij}$. The Y matrix is the covariance and the remainder of the equation is a scaling device used to give the statistic useful testing properties.

5. Spatial Association as Difference
Formulas:

$$\gamma(h) = \frac{\sum_{i=1}^{n-h} \sum_{j=1+h}^{n} w_{ij} (y_i - y_j)^2}{2\, n(h)} \qquad (3)$$

$$c = \frac{(n-1) \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (y_i - y_j)^2}{2 \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \sum_{i=1}^{n} (y_i - \bar{y})^2}, \quad i \neq j \qquad (4)$$

The better known of these formulas is the first, called the semivariance or variogram. Widely used among geoscientists, this measure helps to identify the decline in spatial association as distance (h) increases away from the sites in question. It is the backbone for the large research area that yields continuous map surfaces common in studies of meteorology, geography, and geology (Cressie 1991). W is again represented by $w_{ij}$. Y is made up of $(y_i - y_j)^2$ and the remainder of the equation is used to scale the semivariance in terms of the variance. Equation 4 is known as Geary’s c.

6. Spatial Association as Clusters
Formula:

$$K(d) = \left[ \frac{A \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}(d)}{\pi\, n (n-1)} \right]^{1/2} \qquad (5)$$

This measure, called the K-function, helps to identify spatial clustering (Ripley 1981) within the region A. It is used primarily in ecological and epidemiological studies (Diggle 1983). The W matrix is made up of the $w_{ij}$ elements, represented as ones, up to an arbitrary distance d. The Y matrix is invisible but is assumed to have each element equal to one. In practice, K values are found for a series of distances. Tests are usually carried out by simulating alternative patterns of points (see Spatial Pattern, Analysis of).

7. Focused Spatial Association
Formula:

$$G_i^*(d) = \frac{\sum_{j=1}^{n} w_{ij}(d)\, y_j}{\sum_{j=1}^{n} y_j} \qquad (6)$$

For each type of measure shown above there is a counterpart (usually with some variation) that allows one to determine whether or not spatial association exists within the proximal region surrounding a georeferenced point. A popular measure of this type is the Getis–Ord statistic, $G_i^*(d)$, for which, in the Y matrix, each georeferenced point i is weighted (Getis and Ord 1992). As distance increases from the site in question, values of $G_i^*$ increase if clustering is in evidence; otherwise they decrease. Anselin’s Local Indicators of Spatial Association, or LISA statistics, are designed for focusing Moran’s and Geary’s statistics (Anselin 1995).

8. Spatial Association as Interaction
The basic formula:

$$T_{ij} = \frac{y_i\, y_j}{w_{ij}^{-\beta}} \qquad (7)$$

There are a number of interaction models; the one shown above is the simplest. These widely used models enable transportation and economic researchers to estimate the flow of automobiles, aircraft, migrants, goods and services, and so on, between places (Haynes and Fotheringham 1984). The W matrix usually represents a distance decay function while Y is usually a multiplicative interaction matrix.
All of these models have the Γ form. All can be used to test hypotheses about or to measure the extent of spatial association. Recent summaries on the measurement of spatial association can be found in Getis (1996, 1999) and Bailey and Gatrell (1995).

See also: Mathematical Models in Geography; Spatial Analysis in Geography; Spatial Interaction Models

Bibliography
Anselin L 1995 Local indicators of spatial association: LISA. Geographical Analysis 27: 93–115
Bailey T C, Gatrell A C 1995 Interactive Spatial Data Analysis. Longman Scientific and Technical, Harlow, Essex, UK
Cliff A D, Ord J K 1981 Spatial Processes: Models and Applications. Pion Press, London
Cressie N A C 1991 Statistics for Spatial Data. Wiley, New York
Diggle P J 1983 Statistical Analysis of Spatial Point Patterns. Academic Press, London
Getis A 1991 Spatial interaction and spatial autocorrelation: a cross-product approach. Environment and Planning A 23: 1269–77
Getis A 1996 Local spatial statistics. In: Longley P, Batty M (eds.) Spatial Analysis: Modelling in a GIS Environment, Geoinformation International, Cambridge, UK
Getis A 1999 Spatial statistics. In: Longley P A, Goodchild M F, Maguire D J, Rhind D W (eds.) Geographical Information Systems, 2nd edn., Wiley, New York, Vol. 1
Getis A, Ord J K 1992 The analysis of spatial association by use of distance statistics. Geographical Analysis 24: 189–206
Haynes K E, Fotheringham A S 1984 Gravity and Spatial Interaction Models. Sage, Beverly Hills, CA
Hubert L J, Golledge R G, Costanzo C M 1981 Generalized procedures for evaluating spatial autocorrelation. Geographical Analysis 13: 224–33
Ripley B D 1981 Spatial Statistics. Wiley, New York
A. Getis
Spatial Autocorrelation
Spatial autocorrelation is the term used to describe the presence of systematic spatial variation in a variable. The variable can assume values either (a) at any point on a continuous surface (such as land use type or annual precipitation levels in a region); (b) at a set of fixed sites located within a region (such as prices at a set of retail outlets); or (c) across a set of areas that subdivide a region (such as the count or proportion of households with two or more cars in a set of Census tracts that divide an urban region). Positive spatial autocorrelation is the tendency for sites or areas that are close to one another to have similar values of the variable (i.e., both high or both low). Negative spatial autocorrelation is the tendency for adjacent values to be dissimilar (high values next to low values), which can arise in cases (b) and (c).
The presence of spatial autocorrelation in map data has often been taken as indicative that there is something of interest in the map that calls for further investigation in order to understand the reasons behind the observed spatial variation (Cliff and Ord 1981). Whether identified formally through some statistic (see Sect. 3 below) or otherwise, if geographic phenomena display systematic spatial variability then this may provide the starting point for theorizing about the spatial outcomes of processes that may have their roots in, for example, economic, political, or sociological theory. If spatial data display spatial autocorrelation, this has important implications for the application of
statistical methods to such data. The term spatial autocorrelation arises in several different data analysis contexts. For example it arises in the fields of spatial statistics (Ripley 1981, Cressie 1991, Berliner), spatial econometrics (Anselin 1988) and geostatistics (Cressie 1991, Isaaks and Srivastava 1989, Berliner). If the values of a variable are known for a set of locations (or areas) then if there is spatial autocorrelation each value provides some information on the values of the same variable at nearby sites or for adjacent areas. One implication of this is that within the spatial data set there is information redundancy and for the purposes of hypothesis testing the size of the sample is not the same as the number of independent observations of the process. It is with the recognition of this property that statistical methods for analyzing spatial data part company with methods commonly employed in traditional statistics, sharing ground instead with the sorts of methods used to analyze time series data. The following sections will describe how spatial autocorrelation arises; how it is measured and its presence tested for; and the implications of spatial autocorrelation for spatial sampling, exploratory spatial data analysis, map interpolation, and confirmatory analysis including correlation and regression.
1. How Does Spatial Autocorrelation Arise?
Spatial autocorrelation can be a property of data that are conceptualized as either the outcome of a deterministic process or a stochastic (random) process. In the case of a deterministic process there is conceptually only one set of values (namely those that are observed) and it is the arrangement of values across the region that is defined as autocorrelated or not. The reference point for deciding whether such a distribution of values is autocorrelated is the geographic distribution of values that would have been obtained had those values been allocated at random across the region.
In the case of a stochastic process, the observed values are but one set of realized values from an infinite set of possible realizations (a ‘superpopulation’) that can be generated by a spatial process or probability model. The variable may display spatial autocorrelation because it is a realization of a model where the random variables are spatially dependent. An example of such a model is the following, which is a particular case of the simultaneous autoregressive model (Whittle 1954). Let:

$$y(i) = \rho \sum_{j} w_{i,j}\, y(j) + e(i) \qquad (1)$$

where y(i) is the value of the variable for area i, ρ is a parameter, $w_{i,j}$ is nonzero if area j (j ≠ i) is a neighbor of area i (for example, area j shares a common boundary with area i) and is zero otherwise, and e(i) is independent and identically distributed and drawn from a normal distribution with mean zero and variance σ². It can be seen from (1) that if ρ > 0 then the set of realized values of the variable Y will be positively spatially autocorrelated because the value at i will be a function of values in the neighboring areas. For other examples of these types of models see Besag (1974, 1975), and for reviews see Haining (1990), Cressie (1991), and Richardson (1992).
There are other generators for spatially dependent variables, and such models may derive from models that incorporate both space and time into their specification (see Berliner). In yet other contexts it is parameters (rather than the variables themselves) that are realizations of a spatially dependent probability model. This spatial dependency is transmitted to or inherited by the variables (Marshall 1991, Mollie 1996). In the context of a stochastic process, the term spatial autocorrelation has a closely prescribed meaning and is associated with the second-order moment (correlation) properties of the model.
When thinking about actual problems or processes, spatial autocorrelation in a variable can arise in a number of different ways. Here are some examples:
(a) Many geographic phenomena display spatial continuity. If the spatial framework used to record the occurrence or the level of the phenomenon is at a smaller scale than the scale at which the phenomenon varies across the region, then values in adjacent areas of the spatial framework will tend to be similar. Remote sensing provides an illustration of this point. If land use remains the same over large tracts of land, and if a grid of small pixels records land use, then neighboring pixels will tend to record the same value of the variable. The pixel values are spatially autocorrelated.
(b) Even if the scale of the spatial framework used for recording data values is the same as the scale at which the phenomena vary across the landscape, measurement error can introduce spatial autocorrelation. This can occur with certain types of land use survey where recorders are given blocks of land to survey so that any operator-specific biases in the recording process will be similar within blocks and may therefore induce a form of spatial autocorrelation in the errors (Milne 1959). Spatial autocorrelation in errors also arises with remotely sensed data due to the nature of the recording device on the satellite. Due to light scattering, what is recorded by the sensor in one pixel, which supposedly refers to one parcel of land on the ground, is actually a function in part of terrain characteristics in neighboring parcels (Foster 1980).
(c) Variable values may show spatial autocorrelation because they depend on an underlying process in which spatial relationships affect the way the process unfolds. There are a number of examples. Diffusion processes are processes where some attribute, like an infectious disease, spreads through a fixed population and the probability of a new individual becoming infected depends on spatial proximity to an already
infected individual (Cliff and Ord 1981). Spillover processes are processes where the attribute at some location is a function not only of conditions at that location but also of conditions in neighboring areas, for example, house prices as a function of local neighborhood conditions (Anas and Eum 1984), or city income levels as a function of income levels in neighboring cities and rural areas arising because of expenditure spillovers (Haining 1987). Interaction processes are where actions at one location influence actions at another location, such as price setting at one retail outlet influencing the prices set at other competitor sites (Haining 1983, 1984). Dispersal processes are where individuals themselves disperse (like seeds around a parent plant or new settlers colonizing a region) so that population counts are autocorrelated (Cliff and Ord 1981). In all these cases, at any point in time or when the process reaches some steady state then the variable may show evidence of spatial autocorrelation. (d) The measured variable may not itself be the outcome of any of the foregoing situations but may be causally linked to another variable which is. In this case the spatial structure of the dependent variable inherits the spatial structure of the independent variable to which it is related. Spatial autocorrelation in a variable may also arise because of spatial autocorrelation in underlying parameters. The incidence of a noninfectious disease like a respiratory condition may be autocorrelated because the relative risk of the disease (which is a parameter and is itself a function of environmental characteristics such as air quality) is autocorrelated. (e) Spatial autocorrelation can be a property of a derived set of values such as the residuals from a regression model. In this case spatial autocorrelation may have arisen because the regression model failed to include an important covariate in the model which had a spatially autocorrelated structure. 
The spatial structure of the omitted independent variable is then inherited by the regression residuals.
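The stochastic-process view behind equation (1) can be made concrete with a short simulation. The sketch below is illustrative only: it assumes a regular lattice with rook (shared-edge) neighbors and row-standardized weights, neither of which is specified in the text, and the helper names `rook_weights` and `simulate_sar` are hypothetical. Writing (1) in matrix form as y = ρWy + e gives y = (I − ρW)⁻¹e, which is what the code solves.

```python
import numpy as np

def rook_weights(nrows, ncols):
    """Row-standardized rook-adjacency weights for an nrows x ncols lattice.

    Assumption: areas sharing an edge are neighbors, and each row of W is
    scaled to sum to 1. The text only requires w[i, j] != 0 for neighbors.
    """
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    W[i, rr * ncols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate_sar(W, rho, sigma=1.0, rng=None):
    """Draw one realization of y = rho * W y + e with e ~ N(0, sigma^2)."""
    rng = np.random.default_rng() if rng is None else rng
    n = W.shape[0]
    e = rng.normal(0.0, sigma, size=n)
    # Solve (I - rho W) y = e rather than forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - rho * W, e)

W = rook_weights(10, 10)
y = simulate_sar(W, rho=0.8, rng=np.random.default_rng(0))

# With rho > 0 each value should tend to resemble the average of its neighbors.
neighbor_mean = W @ y
print("correlation between y and its neighbor average:",
      np.corrcoef(y, neighbor_mean)[0, 1])
```

Setting `rho=0.0` returns the independent errors unchanged, which gives a convenient no-autocorrelation baseline for comparison.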
2. Measuring and Testing for Spatial Autocorrelation 2.1 Autocorrelation Tests There are several statistics for measuring and testing for spatial autocorrelation. In the case of nominal data, the Join-Count test developed by Krishna Iyer (1949) is often used. The statistic is based on counting the number of joins of a particular color or colors on a map. Suppose each area is colored either black (B) or white (W). The Join-Count test is based on counting the number of BB, WW or BW adjacencies on the map. If a map is positively spatially autocorrelated then the number of BB and WW joins should be large and the number of BW joins small. If a map is
negatively spatially autocorrelated then the number of BW joins should be large and the number of BB and WW joins small (Haining 1990). The terms ‘large’ and ‘small’ are taken with reference to the number of such counts that would be expected if the black and white colors were allocated at random across the areas. This statistic can be generalized to handle maps where areas are classified into three or more colors (Dacey 1965). In the case of ordinal and interval level data the two tests most commonly used are the Geary test and the Moran test (Geary 1954, Moran 1948). The Geary test is based on differences between the values of adjacent areas, $(y(i) - y(j))^2$, where $y(i)$ denotes the value observed in area $i$ and $y(j)$ denotes the value observed in area $j$. If areas $i$ and $j$ are adjacent and if the map is positively autocorrelated then $(y(i) - y(j))^2$ should be small. The Geary statistic is based on summing these squared differences across the map:

$$\sum_{i} \sum_{j} \delta_{i,j}\, (y(i) - y(j))^2 \qquad (2)$$
where $\delta_{i,j}$ is 1 if areas $i$ and $j$ are adjacent neighbors and zero otherwise ($\delta_{i,i} = 0$). The measure (2) will be ‘small’ in the case of positive spatial autocorrelation relative to the case of a random distribution of values. On the other hand (2) will be ‘large’ in the case of negative spatial autocorrelation. The Moran coefficient is based on cross products for adjacent areas:

$$\sum_{i} \sum_{j} \delta_{i,j}\, (y(i) - \bar{y})(y(j) - \bar{y}) \qquad (3)$$
where $\bar{y}$ is the mean of the observations on Y. In the case of a positively autocorrelated map pattern, (3) will be positive since usually both terms in the cross product will be positive or both will be negative. In the case of a negatively autocorrelated map pattern (3) will be negative, because if one of the terms in the cross product is positive the other will usually be negative. These statistics are appropriate to regular lattices but less appropriate for situations where the areal framework is irregular. Both (2) and (3) are invariant under certain alterations to the areal system but for any given area $i$ some adjacencies are likely to be more important than others and should arguably therefore have greater weight in the calculation of any spatial autocorrelation statistic. In this case the binary weight ($\delta_{i,j}$) is replaced by a more general weight ($w_{i,j}$). The weights may be chosen to reflect some hypothesis about the degree of contact between areas (Cliff and Ord 1981). For example $w_{i,j}$ can be defined to reflect the length of the shared boundary between the two areas $i$ and $j$, or the distance between the geographic centres of the two areas, or to reflect some measure of the intensity of interaction between the two areas in terms of, say, traffic flow. The inference theories for the Geary and Moran statistics have been developed under two different
conceptualizations of the null hypothesis of randomness or no spatial autocorrelation. In one case (the randomization hypothesis) the means and variances of the statistics are obtained assuming that the observed values are distributed at random across the region. In the other case (the normality assumption) the means and variances of the statistics are obtained assuming that the observed values are independent drawings from a normal probability distribution with the same mean and variance for each area. The assumption of equal variance is a strong assumption and often questioned in the case of irregular areal partitions where population counts vary substantially between areas and where the population count appears in the denominator of the variable (Waldhor 1996).
There are other techniques for exploring spatial autocorrelation. Perhaps the simplest technique is the scatterplot of paired data values $\{y(i), W^{*}y(i)\}$ where $y(i)$ is the value observed in area $i$ and $W^{*}y(i)$ is the value in the $i$th position of the matrix product of $W^{*}$ with $y$. $W^{*}$ is the matrix of general weights $\{w_{i,j}\}$ where each row has been standardized to sum to 1. So $W^{*}y(i)$ is a weighted average of the local values of $y$ around $i$. If there is positive spatial autocorrelation present the scatterplot will have a distribution of points sloping upward from the bottom left of the graph. If there is negative spatial autocorrelation the scatterplot will have a distribution of points sloping downward from the top left of the graph. If the distribution of values is random the scatterplot will have no discernible trend. Cressie (1984) describes other graphical methods for inspecting for spatial structure.
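As a rough illustration of the statistics (2) and (3) and of the randomization hypothesis, the following sketch computes scaled versions of the Moran and Geary statistics and a permutation-based pseudo p-value. The scaling constants are the conventional ones for Moran's I and Geary's c and are an addition here, since the text gives only the unscaled sums; the chain-shaped map, the data values, and the function names are hypothetical.

```python
import numpy as np

def chain_weights(n):
    """Binary adjacency for n areas arranged in a line (illustrative map)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W

def moran_i(y, W):
    """Cross-product sum (3), scaled by n / sum(W) and the sum of squares."""
    z = y - y.mean()
    return (len(y) / W.sum()) * (z @ W @ z) / (z @ z)

def geary_c(y, W):
    """Squared-difference sum (2), scaled by (n - 1) / (2 sum(W))."""
    z = y - y.mean()
    diff2 = (y[:, None] - y[None, :]) ** 2
    return ((len(y) - 1) / (2.0 * W.sum())) * (W * diff2).sum() / (z @ z)

def permutation_pvalue(stat, y, W, n_perm=999, rng=None):
    """Randomization test: reallocate the observed values at random across
    areas and count how often the statistic is at least as large as observed."""
    rng = np.random.default_rng() if rng is None else rng
    observed = stat(y, W)
    sims = np.array([stat(rng.permutation(y), W) for _ in range(n_perm)])
    return observed, (1 + np.sum(sims >= observed)) / (n_perm + 1)

W = chain_weights(6)
trend = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])            # similar neighbors
alternating = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])   # dissimilar neighbors

print("trend:       I =", moran_i(trend, W), " c =", geary_c(trend, W))
print("alternating: I =", moran_i(alternating, W), " c =", geary_c(alternating, W))
print("permutation test on trend:",
      permutation_pvalue(moran_i, trend, W, rng=np.random.default_rng(0)))
```

The smooth trend yields a positive Moran value and a Geary value below 1, while the alternating pattern yields the reverse, matching the ‘small’ and ‘large’ behavior described for (2) and (3).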
2.2 Spatial Correlograms and Semi-variograms
The tests for spatial autocorrelation described above can be implemented at any number of spatial scales. That is, the neighbors of any area $i$ need not be restricted to the adjacent neighbors but can be extended to include more remote neighbors. It may be informative to examine spatial structure at a range of distances simultaneously. Spatial correlograms (and semi-variograms) provide plots of spatial structure at a range of different distances. The correlogram at distance $g$ has similarities with the Moran statistic. It is based on the expression:

$$\sum_{i} \sum_{j} w_{i,j}(g)\, (y(i) - \bar{y})(y(j) - \bar{y})$$

where $w_{i,j}(g)$ identifies all areas ($j$) that are a distance $g$ from the area $i$. Distance $g$ can be defined in terms of a metric or in terms of a shortest path on a graph that connects adjacent areas (Haining 1990). The spatial correlogram plot ($g = 1, 2, 3, \ldots, G$) thus provides a description of how spatial structure varies with distance.
The semi-variogram has similarities with the Geary statistic and is based on the expression:

$$\sum_{i} \sum_{j} w_{i,j}(g)\, (y(i) - y(j))^2$$
This, like the correlogram, provides a plot showing how spatial structure varies with distance (Isaaks and Srivastava 1989, Berliner). Both the correlogram and the semi-variogram may be used to suggest spatial models for the distribution of values across the map.
2.3 A Test for Local Autocorrelation
The statistics in Sect. 2.1 are often referred to as ‘global’ or ‘whole map’ tests of spatial autocorrelation because they compute an average measure for the entire map. It is recognized, however, that spatial autocorrelation might be a property of just a subset of the data. There might be a local tendency for similar values to cluster together without this being a general or ‘global’ property of the data. A local test for spatial autocorrelation has been developed which computes the probability of getting the observed collection of values in a subset of the region if all the data values had been distributed at random over the area. The local Moran statistic is given by (Anselin 1995):

$$I(i) = (y(i) - \bar{y}) \sum_{j} w_{i,j}\, (y(j) - \bar{y}), \qquad i = 1, \ldots, n$$
where n is the number of areas so there are as many tests as there are cases. A high positive value of I(i) signals a local spatial clustering of similar values, whilst a high negative value signals a local spatial clustering of dissimilar values.
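A minimal sketch of the local Moran statistic follows, using the unscaled form given above. The conditional permutation scheme (holding the value at area i fixed and shuffling the remaining values) is one common way of attaching a pseudo p-value to each I(i); it is offered here as an assumption, not as the procedure prescribed by the text, and the map, data, and function names are hypothetical.

```python
import numpy as np

def local_moran(y, W):
    """I(i) = (y(i) - ybar) * sum_j w[i, j] * (y(j) - ybar), one value per area."""
    z = y - y.mean()
    return z * (W @ z)

def local_moran_pvalues(y, W, n_perm=499, rng=None):
    """Two-sided pseudo p-values from conditional permutations.

    Assumes W has a zero diagonal. For each area i the value at i is held
    fixed and the other n - 1 values are reallocated at random.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = y - y.mean()
    observed = z * (W @ z)
    pvals = np.empty(len(y))
    for i in range(len(y)):
        others = np.delete(z, i)     # values available to the neighbors of i
        w_i = np.delete(W[i], i)     # weights from i to every other area
        sims = np.array([z[i] * (w_i @ rng.permutation(others))
                         for _ in range(n_perm)])
        extreme = np.sum(np.abs(sims) >= abs(observed[i]))
        pvals[i] = (extreme + 1) / (n_perm + 1)
    return observed, pvals

# Eight areas in a line: a cluster of high values followed by low values.
n = 8
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = 1.0
W[idx + 1, idx] = 1.0
y = np.array([9.0, 10.0, 9.0, 1.0, 2.0, 1.0, 2.0, 1.0])

I_local, p_local = local_moran_pvalues(y, W, rng=np.random.default_rng(0))
for i in range(n):
    print(f"area {i}: I(i) = {I_local[i]:8.3f}, pseudo p = {p_local[i]:.3f}")
```

Summing the local values recovers the global cross-product sum in (3) with the same weights, which is a useful consistency check on any implementation.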
3. Spatial Autocorrelation and Spatial Data Analysis
The presence of spatial autocorrelation in a variable has implications for a number of different areas of data collection and data analysis. These include spatial sampling, map interpolation, exploratory spatial data analysis, and confirmatory analysis including modeling. Each of these areas is briefly discussed in turn.
3.1 Spatial Sampling
Spatial sampling is the process of taking a subset of data drawn according to some specified rule and on the basis of this subset making inferences about the spatial population from which the data have been drawn. Consider the case where the aim is to estimate the proportion of an area covered by a particular type
of land use. In this case we do not adopt a ‘superpopulation’ view of the problem but rather focus on the ‘here and now’ of the land use geography. A conventional estimator for the quantity is the sample mean:

$$\bar{y} = \frac{1}{n} \sum_{i} y(i)$$

where $n$ is the size of the sample and $y(i)$ is 1 if sample site $i$ is occupied by the particular land use, zero otherwise. The problem is to choose the most efficient sample design from random sampling, stratified random sampling, and systematic sampling. The spatial autocorrelation in the land use surface has an important influence on the relative efficiency of these three designs because, as noted above, the presence of spatial autocorrelation signals that data values recorded at one location carry information about data values at nearby locations. The stronger the spatial autocorrelation on the surface the more important it is to select sample sites that ensure that each sample point is not duplicating information contained in the rest of the sample. Random sampling is the least efficient sample design because there is no guarantee that sample points will not be clustered in particular areas of the map. Stratified random sampling overcomes the worst features of random sampling by ensuring coverage but sample sites can still be adjacent to one another on either side of a stratum boundary. Systematic sampling is frequently the best because it ensures a minimum separation distance between sample sites (Dunn and Harrison 1993).
3.2 Map Interpolation
A range of estimators have been constructed to estimate values at specified locations where as yet data have not been collected. One of the more widely used methods, the kriging estimator, has the form:

$$\hat{y}(k) = \sum_{j} \lambda_{k,j}\, y(j)$$
where $\hat{y}(k)$ is the value to be estimated at location $k$ where currently no data value is available (Berliner). The values $\{y(j)\}$ are known and $\{\lambda_{k,j}\}$ are coefficients that are a function of the spatial autocorrelation between locations $k$ and $j$ (Isaaks and Srivastava 1989). In this case the presence of spatial autocorrelation can be used to construct estimates of missing or unobserved values and in addition confidence intervals can be attached to the estimates. This confidence interval is influenced by how much information about the surface is available in the vicinity of the site. Confidence intervals will be large if the site $k$ is far from sample points for which information is available, small if there are sample points with known values nearby.
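The dependence of the coefficients and of the interval width on nearby information can be sketched with one simple variant of kriging. The code below assumes simple kriging with a known mean of zero and an exponential covariance function with hypothetical parameters `sill` and `corr_range`; these choices are illustrative assumptions and are not specified in the text. Under them the coefficients solve C λ = c, where C holds covariances among sample sites and c the covariances between the sample sites and the target location.

```python
import numpy as np

def exp_cov(h, sill=1.0, corr_range=2.0):
    """Assumed exponential covariance: sill * exp(-h / corr_range)."""
    return sill * np.exp(-h / corr_range)

def simple_krige(coords, y, target, sill=1.0, corr_range=2.0):
    """Return (estimate, kriging variance, coefficients) at one target site.

    Assumes the surface has known mean zero, so the estimator is exactly
    the weighted sum  y_hat(k) = sum_j lambda[k, j] * y(j).
    """
    d_obs = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d_tgt = np.linalg.norm(coords - target, axis=1)
    C = exp_cov(d_obs, sill, corr_range)    # covariances among sample sites
    c0 = exp_cov(d_tgt, sill, corr_range)   # covariances with the target
    lam = np.linalg.solve(C, c0)
    estimate = lam @ y
    variance = sill - c0 @ lam
    return estimate, variance, lam

coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 1.5, 2.5])   # hypothetical deviations from the mean

est_near, var_near, _ = simple_krige(coords, y, np.array([0.5, 0.5]))
est_far, var_far, _ = simple_krige(coords, y, np.array([5.0, 5.0]))

print("near the samples: estimate", est_near, "variance", var_near)
print("far from samples: estimate", est_far, "variance", var_far)
```

The kriging variance is smaller for the target surrounded by sample points than for the distant one, which is the behavior of the confidence intervals described above.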
3.3 Exploratory Spatial Data Analysis
Disease atlases are constructed to represent the geographic variability of a disease. Maps of standardized mortality rates (SMRs), for example, display the geography of relative risk for a particular disease and may be used to provide an indication of where the disease risk might be highest, perhaps as a preliminary to exploring for particular risk factors. Maps constructed using small geographic areas suffer from a number of problems including that the estimated rates tend to be unstable with the most extreme rates often associated with areas with the smallest populations. Bayes adjustment methods are often used to adjust such standardized rates. These methods incorporate information from the rest of the map in order to eliminate some of the random variation that is present in each area’s SMR. A subset of these methods do not just use the SMRs from the rest of the map but place greatest emphasis on those estimates closest to the area to be adjusted. For this adjustment, spatial process models are used and incorporate the spatial autocorrelation in the relative risk rates into the adjustment process (Mollie 1996). Other, simpler smoothing methods may also be employed to provide evidence on broad patterns in a spatial data set, for example, mean and median smoothers (Haining et al. 1998). These techniques all implicitly exploit some aspect of the concept of spatial autocorrelation: that places near to an area, $i$, provide information on the variable value at $i$. Similar methods have been used to ‘clean’ remotely sensed data where noise has blurred the image (Green 1996).
3.4 Information Loss in Statistical Tests
Spatial autocorrelation can undermine the use of classical statistical techniques. The correlation coefficient ($r$) is a common statistic for measuring the linear relationship between two variables (X and Y).
The Pearson correlation coefficient varies between $-1$ and $+1$ with $+1$ signifying a perfect positive relationship between X and Y (as X increases, Y increases). The inference theory for the correlation coefficient is based on:

$$(n - 2)^{1/2}\, |\hat{r}|\, (1 - \hat{r}^{2})^{-1/2} \qquad (4)$$

where $n$ is the sample size and $\hat{r}$ is the estimated correlation coefficient. Under the null hypothesis that the correlation in the population is zero, (4) is t distributed with $(n - 2)$ degrees of freedom. However the amount of statistical information carried by the $n$ observations when the data are (positively) spatially autocorrelated is less than would be the case were the $n$ observations to be independent. In the terminology of Clifford and Richardson (1985) the ‘effective sample size’ is less than the actual sample size $n$, so the appropriate degrees of freedom for the test are fewer than $(n - 2)$. The solution to the problem is to compute this effective sample size
($N'$) which is obtained from the spatial correlogram (see Sect. 2.2). Then in (4) $N'$ replaces $n$ and it is t distributed with $(N' - 2)$ degrees of freedom.
3.5 Modeling Spatial Variation
One of the statistical assumptions underlying the application of the classical regression model is that the errors are independent. If this assumption is violated, tests of significance on the parameters of the independent variables are affected (the risk of a type I error is increased) and the coefficient of multiple determination measure of goodness of fit is inflated. This assumption has to be tested via the regression residuals (which are estimates of the errors). In the case of spatial data, an adaptation of the Moran test is often used although other tests are available (Anselin 1988, Cliff and Ord 1981). The inference theory of the Moran test has to be adapted in this context because the mean of the residuals is zero by construction. If the null hypothesis of no spatial autocorrelation is rejected then remedial action is needed to avoid mis-specification error. One approach is to seek out other independent variables that, because they were originally (and wrongly) excluded and because they were spatially autocorrelated, caused the residual autocorrelation. Other forms of model specification may also need to be considered depending on the analyst’s conceptualization of the process underlying the behavior of the dependent variable. Thus there may be a need to include interaction terms, diffusion process terms or spillover terms (see Sect. 1). A respecified regression model might therefore include spatially lagged terms of the dependent variable or of one or more of the independent variables, or take the form of a spatial error model. (Examples of these and other models are described in Ord 1975, Anselin 1988, Haining 1990, Cressie 1991.)
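The effect of replacing n by the effective sample size in statistic (4) can be seen in a small worked calculation. The numbers below are hypothetical, and the effective sample size is simply supplied as an input; the sketch does not attempt to estimate it from a correlogram.

```python
import math

def corr_t_statistic(r, n):
    """Statistic (4): (n - 2)^(1/2) * |r| * (1 - r^2)^(-1/2)."""
    return math.sqrt(n - 2) * abs(r) / math.sqrt(1.0 - r * r)

r_hat = 0.40         # hypothetical estimated correlation
n_actual = 50        # number of areas observed
n_effective = 20     # hypothetical effective sample size N'

t_naive = corr_t_statistic(r_hat, n_actual)        # treats all n as independent
t_adjusted = corr_t_statistic(r_hat, n_effective)  # n replaced by N'

print(f"t with n  = {n_actual}: {t_naive:.3f} on {n_actual - 2} d.f.")
print(f"t with N' = {n_effective}: {t_adjusted:.3f} on {n_effective - 2} d.f.")
```

In this illustration the unadjusted statistic exceeds the two-sided 5 percent critical value of the t distribution (roughly 2.0 for 48 degrees of freedom), whereas the adjusted statistic falls below the corresponding value for 18 degrees of freedom (about 2.10), so the apparent significance disappears once the information redundancy is allowed for.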
Bibliography
Anas A, Eum S 1984 Hedonic analysis of a housing market in disequilibrium. Journal of Urban Economics 15: 87–106
Anselin L 1988 Spatial Econometrics: Methods and Models. Kluwer Academic Publishers, Dordrecht, The Netherlands
Anselin L 1995 Local indicators of spatial association—LISA. Geographical Analysis 27: 93–115
Besag J 1974 Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society B 36: 192–236
Besag J 1975 Statistical analysis of non-lattice data. The Statistician 24: 179–96
Cliff A, Ord J 1981 Spatial Processes: Models and Applications. Pion, London
Clifford P, Richardson S 1985 Testing the association between two spatial processes. Statistics and Decisions 2(supplement): 155–60
Cressie N 1984 Towards resistant geostatistics. In: Verly G et al. (eds.) Geostatistics for Natural Resources Characterization. Reidel, Dordrecht, The Netherlands, pp. 21–44
Cressie N 1991 Statistics for Spatial Data. Wiley, New York
Dacey M 1965 A review of measures of contiguity for two and k-color maps. In: Berry B, Marble D (eds.) Spatial Analysis: A Reader in Statistical Geography. Englewood Cliffs, NJ, pp. 479–95
Dunn R, Harrison A 1993 Two dimensional systematic sampling of land use. Applied Statistics 42: 585–601
Foster B 1980 Urban residential ground cover using Landsat digital data. Photogrammetric Engineering and Remote Sensing 46: 547–58
Geary R 1954 The contiguity ratio and statistical mapping. The Incorporated Statistician 5: 115–45
Green P 1996 MCMC in image analysis. In: Gilks G, Richardson S, Spiegelhalter D (eds.) Markov Chain Monte Carlo in Practice. Chapman & Hall, London, pp. 381–99
Haining R 1983 Anatomy of a price war. Nature 304: 679–80
Haining R 1984 Testing a spatial interacting-markets hypothesis. Review of Economics and Statistics 66: 576–83
Haining R 1987 Small area aggregate income models: theory and methods with an application to urban and rural income data for Pennsylvania. Regional Studies 21: 519–30
Haining R 1990 Spatial Data Analysis in the Social and Environmental Sciences. Cambridge University Press, Cambridge, UK
Haining R, Wise S, Ma J 1998 Exploratory spatial data analysis in a GIS environment. The Statistician 47: 457–69
Isaaks E, Srivastava R 1989 An Introduction to Applied Geostatistics. Oxford University Press, Oxford, UK
Krishna Iyer P 1949 The first and second moments of some probability distributions arising from points on a lattice, and their application. Biometrika 36: 135–41
Marshall R 1991 A review of methods for the statistical analysis of spatial patterns of disease. Journal of the Royal Statistical Society, Series A 154: 421–41
Milne A 1959 The centric systematic area sample treated as a random sample. Biometrics 15: 270–97
Mollie A 1996 Bayesian mapping of disease. In: Gilks G, Richardson S, Spiegelhalter D (eds.) Markov Chain Monte Carlo in Practice. Chapman & Hall, London, pp. 359–79
Moran P 1948 The interpretation of statistical maps. Journal of the Royal Statistical Society, Series B 10: 243–51
Ord J 1975 Estimation methods for models of spatial interaction. Journal of the American Statistical Association 70: 120–6
Richardson S 1992 Statistical methods for geographical correlation studies. In: Elliott P, Cuzick J, English D, Stern R (eds.) Geographical and Environmental Epidemiology: Methods for Small Area Studies. Oxford University Press, Oxford, UK, pp. 181–204
Ripley B 1981 Spatial Statistics. Wiley, New York
Whittle P 1954 On stationary processes in the plane. Biometrika 41: 434–49
Waldhor T 1996 The spatial autocorrelation coefficient Moran’s I under heteroscedasticity. Statistics in Medicine 15: 887–92
R. P. Haining
Spatial Choice Models
The interrelationship between spatial structure and individual choice behavior has traditionally been at the forefront of geographic research. Geographers have realized that spatially varying consumer demand constitutes the basis for the evolution of spatial structures. In turn, the supply of facilities across space provides opportunities for individuals and households to conduct their activities, but at the same time also sets the space-time constraints that restrict behavior. Over the years, many different attempts at modeling spatial choice behavior have been reported in the literature. These models predict the probability that an individual will choose a choice alternative as a function of its locational and non-locational attributes. In this article, the background, theoretical underpinnings, and basic methodology of these modeling approaches are examined. From the outset, it should be clear that limited available space prevents us from discussing modeling approaches in considerable detail.
1. Gravity Models
The modeling of spatial choice behavior gained considerable momentum when Wilson (1971) introduced a family of spatial interaction models, based on the principle of gravity, to describe and predict social phenomena. This principle states that the interaction between two bodies is proportional to their masses, and inversely proportional to the square of distance. Wilson elaborated this principle in a number of ways. First, he suggested that any distance decay function that reflects the notion that, ceteris paribus, the intensity of interaction decreases with increasing distance may be used to replace straight distance. Secondly, he argued that in spatial problems often the number of departures from particular origins and/or the number of arrivals at particular destinations is known, implying that so-called balancing factors have to be introduced in the model to satisfy these constraints. Wilson distinguished an unconstrained spatial interaction model (departures and arrivals are not known), a production-constrained spatial interaction model (number of departures per zone is known), an attractions-constrained spatial interaction model (number of arrivals per zone is known), and the doubly constrained spatial interaction model (both number of departures and number of arrivals per zone are known). In the context of spatial choice behavior, the production-constrained spatial interaction model is most interesting as it predicts which destination will be chosen. It can be expressed as:

$$p_{ij} = A_i O_i W_j f(d_{ij}) \qquad (1)$$

where $p_{ij}$ is the probability that an individual in zone $i$ will choose destination $j$; $A_i$ is a balancing factor, guaranteeing that the number of predicted interactions from zone $i$ is equal to its number of observed interactions; $O_i$ is the number of observed interactions from zone $i$; $W_j$ is a vector of variables, representing
the attractiveness of destination $j$; $f(d_{ij})$ is some function of the distance between origin $i$ and destination $j$. Although much empirical work has been conducted on spatial interaction models since this seminal work, most of the relevant literature has been concerned with operational problems. Theoretical progress has been limited. For example, different measures of distance (straight-line distance, Euclidean distance, city-block metric, and travel time) and attractiveness have been suggested. Likewise, different functions for representing the distance decay effect have been proposed (power, exponential, and more complex functions). Finally, an exponent has been added to the specification of the attractiveness term to allow for the fact that the larger destinations tend to have extra attraction beyond their size because of the benefits of economies of scale. The estimation of spatial interaction models is typically based upon aggregate, zonal data of interaction flows (shopping, recreation, migration, traffic, etc.). To that effect, the study area is delineated carefully to avoid large external flows, and divided into a number of smaller homogeneous zones. The spatial interaction model is calibrated using observations about interactions between these zones as input. As a result, the parameters of the spatial interaction model are highly influenced by the geometry of the study area. This caused a search for improvements of the model, and for alternative modeling approaches. In this context, some authors suggested the development of models that estimate the effects of attraction variables endogenously. Another way of dealing with this problem was to incorporate some measure of spatial structure into a spatial interaction model (Fotheringham 1983). The simplest approach to the specification of this additional term is the use of a Hansen-type accessibility measure. This so-called competing destinations model may be expressed as:

$$p_{ij} = A_i O_i W_j C_j^{\delta}\, d_{ij}^{-\beta} \qquad (2)$$

where

$$A_i = \Bigl[ \sum_{j} W_j C_j^{\delta}\, d_{ij}^{-\beta} \Bigr]^{-1} \qquad (3)$$

and

$$C_j = \sum_{k \neq j} W_k\, d_{jk}^{-\gamma} \qquad (4)$$
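The production-constrained model (1) and the competing destinations model (2) to (4) can be sketched numerically as follows. The zone data, the parameter values, and the power form of the distance decay function in the first model are hypothetical choices made for illustration, and the accessibility term sums the attractiveness of the other destinations k, as in a Hansen-type measure.

```python
import numpy as np

def production_constrained(O, W, d, beta):
    """Model (1) with f(d) = d^-beta and A_i = 1 / sum_j W_j d_ij^-beta."""
    pull = W[None, :] * d ** (-beta)
    A = 1.0 / pull.sum(axis=1)
    return A[:, None] * O[:, None] * pull

def competing_destinations(O, W, d, d_dest, beta, gamma, delta):
    """Models (2)-(4): attractiveness is modified by C_j^delta, where C_j is
    the accessibility of destination j to all other destinations k != j."""
    off_diag = ~np.eye(len(W), dtype=bool)
    safe = np.where(off_diag, d_dest, 1.0)          # avoid 0 ** negative power
    C = np.where(off_diag, W[None, :] * safe ** (-gamma), 0.0).sum(axis=1)
    pull = W[None, :] * C[None, :] ** delta * d ** (-beta)
    A = 1.0 / pull.sum(axis=1)                      # balancing factor (3)
    return A[:, None] * O[:, None] * pull

# Hypothetical data: 2 origin zones, 3 destinations.
O = np.array([100.0, 200.0])                 # observed interactions per origin
W = np.array([10.0, 20.0, 30.0])             # destination attractiveness
d = np.array([[1.0, 2.0, 3.0],               # origin-to-destination distances
              [3.0, 2.0, 1.0]])
d_dest = np.array([[0.0, 1.0, 2.0],          # destination-to-destination distances
                   [1.0, 0.0, 1.0],
                   [2.0, 1.0, 0.0]])

T_pc = production_constrained(O, W, d, beta=2.0)
T_cd = competing_destinations(O, W, d, d_dest, beta=2.0, gamma=1.0, delta=-0.5)

print("production-constrained flows:\n", T_pc)
print("competing-destinations flows:\n", T_cd)
print("choice shares per origin (flows / O_i):\n", T_cd / O[:, None])
```

Because of the balancing factor, the predicted interactions from each origin sum to the observed total for that origin, and dividing each row by that total gives destination shares that sum to one. With `delta=0` the competing destinations model reduces to the production-constrained model.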
2. The Revealed Preference Model
The fact that the spatial interaction model is heavily influenced by the geometry of the study area led Rushton (1969) to develop a preference-scaling model. The
purpose of the model was to derive rules of spatial choice behavior. To that effect, choice alternatives were classified into so-called locational types, which are a combination of an attractiveness category and a distance category. Spatial choice behavior is assumed to reflect a trade-off between attractiveness and distance separation. Pairwise choice data are then used to calculate the proportion of times a particular locational type is chosen, given that both are present. Based on this data, the locational types are positioned on a unidimensional preference scale using a nonmetric multidimensional scaling algorithm. Over the years, this basic model has been elaborated in a number of important ways. First, Rushton (1974) showed how graphical methods, trend surface analysis, or conjoint analysis might be used to decompose the preference scale into the contributions of the two basic variables. Second, Girt (1976) suggested linking the preference function to overt behavior by relating distances on the preference scale to choice probabilities. Third, Timmermans (1979) suggested deriving the choice sets of individuals from data on information fields and consumer attitudes, and using these to scale the alternatives. Although Rushton’s preference scaling model does have considerable appeal, it never received any major following. One of the reasons might be that the model is based on two explanatory variables only, while heterogeneity in preferences is not accounted for. The model has also been criticized because of its conceptual basis. It has been suggested that the model may not represent preferences at all, but rather inconsistencies in individual choice behavior.
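The first step of the preference-scaling approach, computing pairwise choice proportions, can be illustrated with a few lines of code. The observations below are invented, the labels A, B and C stand for hypothetical locational types, and the proportion is taken here as the share of occasions on which one type was chosen out of all occasions on which either of the two was chosen while both were available; this reading of the text is an assumption. The subsequent nonmetric multidimensional scaling step is not shown.

```python
from collections import defaultdict

# Hypothetical choice occasions: (locational types available, type chosen).
observations = [
    ({"A", "B", "C"}, "A"),
    ({"A", "B"}, "A"),
    ({"B", "C"}, "B"),
    ({"A", "C"}, "C"),
    ({"A", "B", "C"}, "B"),
]

def pairwise_proportions(observations):
    """Proportion of times type a is chosen over type b, given both present."""
    wins = defaultdict(int)
    for available, chosen in observations:
        for other in available:
            if other != chosen:
                wins[(chosen, other)] += 1   # chosen was preferred to other
    proportions = {}
    for (a, b), count in list(wins.items()):
        total = count + wins.get((b, a), 0)
        proportions[(a, b)] = count / total
    return proportions

props = pairwise_proportions(observations)
for (a, b), p in sorted(props.items()):
    print(f"P({a} chosen over {b}) = {p:.2f}")
```

A matrix of such proportions is the input from which the locational types would then be positioned on the preference scale.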
3. Discrete Choice Models
The spatial interaction model has been criticized for its reliance on aggregate, zonal interaction flows. This criticism has led to the development of disaggregate discrete choice models that are based on individual choice behavior. Discrete choice models may be derived from at least two formal theories: Luce’s strict utility theory and Thurstone’s random utility theory. In particular, Luce assumed that the probability of choosing a choice alternative is equal to the ratio of the utility associated with that alternative to the sum of the utilities for all the alternatives in the choice set. Luce thus assumed deterministic preference structures and postulated a constant-ratio decision rule. In contrast, random utility theory is based on stochastic preferences in that an individual is assumed to draw a utility function at random on each choice occasion. An individual’s utility for a choice alternative is assumed to consist of a deterministic component and a random utility component. Given the principle of utility maximizing behavior, the probability of choosing a choice alternative is then equal to the
Spatial Choice Models probability that its utility exceeds that of all other choice alternatives in the choice set. The specification of the model depends on the assumptions regarding the distributions of the random utility components. The best-known model is the multinomial logit (MNL) model. It can be derived by assuming that the random utility components are independently, identically Type I extreme value distributed. It can be expressed as: pik l
exp [U(Xk, Si)] exp [U(Xj, Si)]
(5)
j
where U(X_k, S_i) is the deterministic part of the utility of choice alternative k of individual i with socioeconomic characteristics S_i. One of the most important criticisms that have been expressed against the multinomial logit model concerns the fact that the utility of a choice alternative is independent of the existence and the attributes of other alternatives in the choice set (the Independence from Irrelevant Alternatives or IIA property). It predicts that the introduction of a new, similar choice alternative will reduce market shares in direct proportion to the utility of the existing alternatives, which is counterintuitive in the case of a high degree of similarity between particular alternatives. Therefore, various alternative models have been developed to relax the IIA assumption (Timmermans and Golledge 1990). Similar work in this regard has been conducted by Williams (1977). The best-known model that circumvents the IIA property is the nested logit model. Choice alternatives that are assumed to be correlated are grouped into the same nest. Each nest is represented by an aggregate alternative with a composite utility, consisting of the so-called inclusive value, and a parameter to be estimated. To be consistent with utility-maximizing behavior, the parameters of the inclusive values should lie in the range between 0 and 1, and their values should change consistently from lower levels to higher levels of the hierarchy.
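The IIA property discussed above can be made concrete with a small numerical sketch of the multinomial logit formula. The alternatives and utility values below are hypothetical and chosen only for illustration.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities.

    utilities: {alternative: deterministic utility} for one individual's
    choice set.
    """
    # Subtracting the maximum before exponentiating avoids overflow and
    # leaves the resulting probabilities unchanged.
    u_max = max(utilities.values())
    exp_u = {k: math.exp(u - u_max) for k, u in utilities.items()}
    total = sum(exp_u.values())
    return {k: e / total for k, e in exp_u.items()}

# Hypothetical choice set: two shopping centres with equal utility.
before = mnl_probabilities({"centre_A": 1.0, "centre_B": 1.0})

# Add a near-duplicate of centre_B with the same utility.
after = mnl_probabilities(
    {"centre_A": 1.0, "centre_B": 1.0, "centre_B_copy": 1.0}
)

# Under IIA the ratio P(A) / P(B) is unaffected by the new alternative.
ratio_before = before["centre_A"] / before["centre_B"]
ratio_after = after["centre_A"] / after["centre_B"]
```

In this sketch the share of centre_A falls from one half to one third, although intuition suggests that a near-duplicate of centre_B should mainly draw custom from centre_B. This is the kind of counterintuitive prediction that motivates the nested logit model.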
4. Conjoint Preference and Choice Models Unlike discrete choice models, the parameters of the conjoint preference and choice models are not derived from real-world data, but from experimental design data. Conjoint models involve the following steps when applied to spatial choice problems (e.g., Timmermans 1984). First, each choice alternative is described by its position in a set of attributes, considered relevant for the problem at hand. Next, a series of levels for each attribute is chosen. Having defined the attributes and
their levels, the next step involves creating an experimental design to generate a set of hypothetical choice alternatives. This design should be created such that the necessary and sufficient conditions to estimate the model of interest are satisfied. Respondents are then required to evaluate the resulting attribute profiles. Their responses are decomposed into the utility contribution of the attribute levels. Once partworth utilities have been estimated, choice behavior can be predicted by assuming that an individual will choose the alternative with the highest overall evaluation. Alternatively, different probabilistic rules can be postulated. To avoid such ad hoc rules, Louviere and Woodworth (1983) suggested estimating the preference function and choice model simultaneously. To that end, the attribute profiles should be placed into choice sets. Again, the composition of the choice sets depends on the properties of the choice model. Over the years, the basic conjoint preference and choice models have been elaborated in many different ways. To avoid the problem of information overload and reduce respondent demand, Louviere (1984) suggested the Hierarchical Information Integration method. It assumed that individuals first group the attributes into higher-order decision constructs. Separate designs are constructed for each decision construct, and a bridging experiment is designed for measuring the tradeoff between the evaluations of the decision constructs. Oppewal et al. (1994) suggested an improved methodology by using multiple-choice experiments to test the implied hierarchical decision structure. Other recent advances include the development of group preference models, the development of models of portfolio choice, the inclusion of contextual effects and constraints, and the combined use of experimental design and revealed preference data, to name a few.
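The basic conjoint steps described above (define attributes and levels, build a design, collect evaluations, decompose them into partworth utilities, predict choice) can be sketched as follows. The attributes, levels, and ratings are hypothetical, and the decomposition assumes a balanced full factorial design with an additive main-effects model, in which each partworth is simply the mean rating at a level minus the grand mean.

```python
from itertools import product
from statistics import mean

# Hypothetical attributes and levels for a shopping-centre choice problem.
attributes = {
    "distance": ["near", "far"],
    "size": ["small", "large"],
}

# Full factorial design: every combination of levels is one profile.
names = list(attributes)
profiles = [dict(zip(names, combo)) for combo in product(*attributes.values())]

# Hypothetical ratings from one respondent, in the same order as `profiles`.
ratings = [6.0, 9.0, 2.0, 5.0]

def part_worths(profiles, ratings, attributes):
    """Main-effects partworths for a balanced design."""
    grand = mean(ratings)
    worths = {}
    for attr, levels in attributes.items():
        for level in levels:
            at_level = [r for p, r in zip(profiles, ratings) if p[attr] == level]
            worths[(attr, level)] = mean(at_level) - grand
    return grand, worths

def predicted_evaluation(profile, grand, worths):
    """Overall evaluation as grand mean plus the partworths of its levels."""
    return grand + sum(worths[(attr, level)] for attr, level in profile.items())

grand, worths = part_worths(profiles, ratings, attributes)

# Deterministic rule: choose the profile with the highest overall evaluation.
best = max(profiles, key=lambda p: predicted_evaluation(p, grand, worths))
```

The last line applies the deterministic choice rule mentioned in the text; a probabilistic rule could be substituted at that point.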
5. Complex Spatial Choice Behavior The previous models are all typically based on single-purpose, single-stop behavior. Over the years, however, it became clear that an increasing proportion of trips involved multistop behavior. Moreover, Hägerstrand’s time geography had convincingly argued that behavior does not reflect preferences only, but also constraints. Thus, various attempts have been made to develop models of trip chaining and activity-travel patterns. Originally, most models relied on semi-Markov process models or Monte Carlo simulations. More recently, utility-maximizing models have dominated the field. An important contribution in this regard was made by Kitamura (1984), who introduced the concept of prospective utility. It states that the utility of a destination is not only a function of its inherent attributes and the distance to that destination, but also
of the utility of continuing the trip from that destination. Dellaert et al. (1998) generalized Kitamura’s approach to account for multipurpose aspects of the trip chain. Trip chaining is only one aspect of multiday activity-travel patterns. Several models have recently been suggested to predict more comprehensive activity patterns. These models can be differentiated into the older constraints-based models (e.g., Lenntorp 1976), utility-maximizing models (e.g., Recker et al. 1986), and rule-based or computational process models (e.g., Golledge et al. 1994; ALBATROSS, Arentze and Timmermans 2000).
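The idea of prospective utility introduced earlier in this section can be illustrated with a deliberately simplified sketch. It is not Kitamura's specification: the linear utility function, the distance parameter `beta`, the one-step look-ahead, and the use of the best continuation rather than an expected value are all assumptions made for illustration, as are the place names and coordinates.

```python
import math

def direct_utility(attractiveness, distance, beta=0.5):
    """Utility from a destination's own attributes and the distance to it."""
    return attractiveness - beta * distance

def prospective_utility(origin, dest, places, home, beta=0.5):
    """Direct utility of `dest` plus the value of the best continuation.

    places: {name: (x, y, attractiveness)}; origin and home: (x, y).
    """
    x, y, a = places[dest]
    own = direct_utility(a, math.dist(origin, (x, y)), beta)
    # Continuation 1: return home directly from the destination.
    options = [-beta * math.dist((x, y), home)]
    # Continuation 2: make one further stop and then return home.
    for other, (ox, oy, oa) in places.items():
        if other != dest:
            onward = direct_utility(oa, math.dist((x, y), (ox, oy)), beta)
            options.append(onward - beta * math.dist((ox, oy), home))
    return own + max(options)

home = (0.0, 0.0)
places = {
    "P": (-2.0, 0.0, 3.0),  # close to home but isolated
    "Q": (6.0, 0.0, 4.5),   # farther away, next to R
    "R": (7.0, 0.0, 4.0),
}

direct = {
    name: direct_utility(a, math.dist(home, (x, y)))
    for name, (x, y, a) in places.items()
}
prospective = {
    name: prospective_utility(home, name, places, home) for name in places
}
```

With these hypothetical values, P has the highest direct utility, but Q ranks above P once the possibility of continuing to the nearby R is taken into account.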
6. Conclusions This article has given a very brief overview of spatial choice modeling. Due to the limited available space, we have focused on some of the key modeling approaches, disregarding alternatives (e.g., dynamic models) altogether. Examining the accomplishments, one cannot escape the conclusion that the models developed over the years have been increasing in complexity. The theoretical underpinnings of the recent activity-based models are much richer than the Markov or gravity models of the 1960s. Moreover, the brief overview indicates that geographers have been using alternative data, such as experimental design data and activity diaries, in addition to the more conventional survey data. See also: Spatial Analysis in Geography; Spatial Decision Support Systems; Spatial Interaction; Spatial Interaction Models; Spatial Optimization Models; Spatial Pattern, Analysis of; Spatial Statistical Methods; Spatial Thinking in the Social Sciences, History of
Bibliography
Arentze T A, Timmermans H J P 2000 ALBATROSS: a Learning-based Transportation-oriented Simulation System. EIRASS, Eindhoven University of Technology, The Netherlands
Dellaert B, Arentze T A, Borgers A W J, Timmermans H J P, Bierlaire M 1998 A conjoint based multi-purpose multi-stop model of consumer shopping centre choice. Journal of Marketing Research 35: 177–88
Ettema D F, Timmermans H J P 1997 Activity-based Approaches to Travel Analysis. Elsevier Science, Oxford, UK
Fotheringham A S 1983 A new set of spatial interaction models: The theory of competing destinations. Environment and Planning, A 15: 1121–32
Girt J L 1976 Some extensions to Rushton’s spatial preference scaling model. Geographical Analysis 8: 137–56
Golledge R G, Kwan M P, Gärling T 1994 Computational process modeling of travel decisions using a geographical information system. Papers in Regional Science 73: 99–117
Kitamura R 1984 Incorporating trip chaining into analysis of destination choice. Transportation Research 18B: 67–81
Lenntorp B 1976 Paths in space-time environments. Lund Studies in Geography, Series B, no. 44
Louviere J J 1984 Hierarchical information integration: a new method for the design and analysis of complex multiattribute judgment problems. In: Kinnear T C (ed.) Advances in Consumer Research 11, ACRA, Provo, UT, pp. 148–55
Louviere J J, Woodworth G 1983 Design and analysis of simulated consumer choice or allocation experiments: An approach based on aggregate data. Journal of Marketing Research 20: 350–67
Oppewal H, Louviere J J, Timmermans H J P 1994 Modeling hierarchical information integration processes with integrated conjoint choice experiments. Journal of Marketing Research 31: 92–105
Recker W W, McNally M G, Root G S 1986 A model of complex travel behavior: part I – theoretical development. Transportation Research, A 20A: 307–318
Rushton G 1969 Analysis of spatial behavior by revealed space preference. Annals of the Association of American Geographers 59: 391–400
Rushton G 1974 Decomposition of space preference structures. In: Golledge R G, Rushton G (eds.) Spatial Choice and Spatial Behavior. Ohio State University Press, Columbus, OH, pp. 119–33
Timmermans H J P 1979 A spatial preference model of regional shopping behaviour. Tijdschrift voor Economische en Sociale Geografie 70: 45–8
Timmermans H J P 1984 Decompositional multiattribute preference models in spatial choice analysis: a review of some recent developments. Progress in Human Geography 8: 189–221
Timmermans H J P, Golledge R G 1990 Applications of behavioural research on spatial choice problems II: preference and choice. Progress in Human Geography 14: 311–54
Williams H C W L 1977 On the formation of travel demand models and economic evaluation measures of user benefit. Environment and Planning, A 9: 285–344
Wilson A G 1971 A family of spatial interaction models and associated developments. Environment and Planning, A 3: 1–32

H. Timmermans
Spatial Cognition Spatial cognition concerns the study of knowledge and beliefs about spatial properties of objects and events in the world. Cognition is about knowledge: its acquisition, storage and retrieval, manipulation, and use by humans, nonhuman animals, and intelligent machines. Broadly construed, cognitive systems include sensation and perception, thinking, imagery, memory, learning, language, reasoning, and problem-solving. In humans, cognitive structures and processes are part of the mind, which emerges from a brain and nervous system inside of a body that exists in a social and physical world. Spatial properties include location, size, distance, direction, separation and connection, shape, pattern, and movement.
1. The Geographic Study of Spatial Cognition Geographers study spatial cognition for three reasons. First and foremost, spatial cognition provides questions of geographic interest in their own right, given that cognition about space and place is an expression of human-environment or human-earth relations. Second, geographers have hoped that spatial cognition would improve their understanding of traditional phenomena of interest; for example, where people choose to shop should depend in part on their beliefs about distances and road connections. Finally, there has been an interest in improving the use and design of maps and other geographic information products which depend in part on human understanding of depicted spatial relations. In addition to geographers and cartographers, researchers from many different disciplines study spatial cognition, including psychologists, architects and planners, anthropologists, linguists, biologists, philosophers, and computer scientists. While overlapping and interacting with these disciplines, the geographer’s interest in spatial cognition is distinct in some ways. Geographers are interested in the earth as the home of humanity. Their concern is with the cognition of humans, not the cognition of barn swallows or undersea robots. Geographers focus on spatial scales relevant to human activities on the earth’s surface, such as temporary travel, migration, settlement, and economic activities; a geographer wants to understand human conceptualizations of the layout of a city or a region, not of an atom or the solar system. Finally, the geographer ultimately desires to describe, predict, and explain external human-environment relationships, and focuses on internal mental and neuropsychological structures and processes only insofar as they help in the understanding of external relationships.
1.1 History of Spatial Cognition in Geography The notion that a subjective understanding of space partially mediates human relationships with an objective physical world is undoubtedly quite old. Within the modern discipline of geography, the study of spatial cognition first appeared with work on geographic education early in the twentieth century. However, interest in spatial cognition flowered during the 1950s and 1960s. The subdiscipline of experimental cartography emerged with the insight that maps could be improved by understanding how humans perceive and think about information depicted in map symbols. In an attempt to explain human responses to floods and other natural hazards, the subdiscipline of environmental perception began to study people’s subjective beliefs about the occurrence and consequences of hazard events. Most importantly for the study of spatial cognition, behavioral geographers began developing theories and models of the human reasoning
and decision-making involved in spatial behavior, e.g., migration, vacationing, and trip scheduling. This approach gained insight from the work of Kevin Lynch (1960), a planner who argued that ‘images’ of cities guide people’s behavior and experiences of those cities. All of these historical developments constituted a new focus on disaggregate or microscale approaches within human geography, approaches that focused on the geographic behavior of individuals as an accompaniment to traditional approaches that focused on human settlements and larger aggregations of people.
2. Topics in the Geographic Study of Spatial Cognition The study of spatial cognition in geography sheds light on questions such as how spatial knowledge and beliefs are acquired and develop over time; the nature of spatial knowledge structures and processes; how people navigate and stay oriented in space; how people use language to communicate with each other about space; and how aspects of spatial knowledge and reasoning are similar or different among individuals or groups.
2.1 Acquisition and Development Humans acquire spatial knowledge and beliefs directly via sensorimotor systems that operate as they move about the world. People also acquire spatial knowledge indirectly via static and dynamic symbolic media such as maps and images, 3-D models, and language. Geographers are interested in how different media have consequences for the nature of acquired knowledge. Spatial knowledge changes over time, through processes of learning and development. Both changes in the child’s spatial knowledge and reasoning, and that of an adult visiting a new place for the first time, are of interest to geographers. Both physical maturation and experience influence development. A person’s activity space, the set of paths, places, and regions traveled on a regular basis, is an important example of spatial experience that influences people’s knowledge of space and place, no matter what their age. Most people know the areas around their homes or work places most thoroughly, for example (as in the anchor-point theory of Couclelis et al. 1987). A widely accepted model of spatial learning suggests that spatial knowledge of places develops in a sequence of three stages or elements. First is landmark knowledge, unique features or views that identify a place. Second is route knowledge, based on travel routines that connect ordered sequences of landmarks. The final element is survey knowledge, knowledge of two-dimensional layouts that includes the simultaneous
interrelations of locations; survey knowledge would support detouring and shortcutting. Not everyone has detailed or comprehensive survey knowledge of their surrounds, and exposure to maps of places clearly advances the development of such knowledge.
2.2 Spatial Knowledge Structures and Processes The metaphor of a cognitive map was coined by Tolman in a 1948 paper to refer to internally represented spatial models of the environment. The cognitive (or mental) map includes knowledge of landmarks, route connections, and distance and direction relations; nonspatial attributes and emotional associations are stored as well. However, in many ways, the cognitive map is not like a cartographic ‘map in the head.’ It is not a unitary integrated representation, but consists of stored discrete pieces including landmarks, route segments, and regions. The separate pieces are partially linked or associated, frequently so as to represent hierarchies such as the location of a place inside of a larger region. Furthermore, spatial knowledge is not well modeled by metric geometries such as the Euclidean geometry of high school math. The ways in which spatial knowledge is internally represented in human minds lead to certain patterns of distortions in the way people answer spatial questions. For example, people often believe the distance from place A to B is different than from B to A. Turns are frequently adjusted in memory to be more like straight lines or right angles. At larger scales, most people have somewhat distorted ideas about the sizes and locations of continental land masses on the earth; for example, South America is thought to be due south of North America when it is actually rather far to the east as well.
2.3 Navigation and Orientation Humans travel over the earth’s surface to reach destinations. This typically requires planning and the ability to stay oriented while moving. Navigation is this coordinated and goal-directed travel through space. It consists of two components, locomotion and wayfinding. Locomotion refers to the guidance of oneself through space in response to local sensorimotor information in the immediate surrounds, and includes such tasks as identifying surfaces of support, avoiding obstacles, and moving toward visible landmarks. Locomotion generally occurs without the need for an internal model or cognitive map of the environment. Wayfinding refers to the planning and decision-making that allows one to reach a destination that is not in the immediate sensory field, and includes such tasks as choosing efficient routes, scheduling destination sequences, orienting to nonlocal features, and interpreting verbal route directions. Wayfinding tasks generally do involve a cognitive map of the environment. Various locomotion and wayfinding tasks vary greatly in their demand on attentional capacity. Following a familiar route is not very demanding on attention; finding one’s way in a new city may be quite demanding. Orientation is knowing ‘where you are,’ although the precision and comprehensiveness of this knowing varies greatly in different situations and for different people. Two fundamental processes are involved in orientation during navigation. The first process is pilotage, the recognition of external features or landmarks. In some cases, the recognized landmark is the goal, but more commonly the landmark aids orientation by serving as a key to knowledge of spatial relations stored in an internal cognitive or external cartographic map. The second process is dead reckoning, updating a sense of orientation by integrating information about movement speed, direction, and/or acceleration without reference to recognized features. People also use symbolic media such as maps to navigate and stay oriented. Maps used for navigation show robust alignment effects, a common confusion in using maps when the top of the map is not the direction in the surrounds that one is facing when looking at the map. It results from the fact that spatial perception is ‘orientation-dependent’; objects and pictures have a top and a bottom in the perceptual field. In the case of road or trail maps, many people turn the map to bring about alignment. Unfortunately, this option is not available with ubiquitous ‘You-Are-Here’ maps, which are surprisingly often placed in a misaligned way with respect to the viewer’s perspective. The resulting alignment effect is reflected by the fact that misaligned maps are interpreted more slowly and with greater error by most people (Levine et al. 1984). While an annoyance in many situations, misaligned emergency maps placed in public buildings are potentially disastrous.
2.4 Spatial Communication via Language Spatial information is often communicated verbally, in natural language such as English. People give and receive verbal route directions, read spatial descriptions contained in stories, and increasingly interact with computer systems via verbal queries. There are two notable characteristics of the way language expresses spatial information. One is that language expresses mostly nonquantitative or imprecise (‘fuzzy’) quantitative information about space. Statements about connections and approximate location are more important than precise statements. For example, people say ‘turn left at the gas station’ rather than ‘turn 80° after you go 0.6 miles.’ Great precision is typically unnecessary or even confusing. A second characteristic is that various aspects of the context of
communication are critical in interpreting spatial language. Context is provided by knowledge of who is speaking, where they are, physical features in the situation, the previous topic of conversation, and so on. ‘The bicycle is near the church’ is generally understood differently than ‘Wisconsin is near Michigan.’
2.5 Individual and Group Similarities and Differences No two individuals know exactly the same things or reason in exactly the same way about space and place. Some people are better at tasks such as wayfinding, learning spatial layouts, or reading maps. In some cases, there may be different ways to think about such problems, all of which may be effective. In such cases, one might speak of ‘stylistic differences’ in spatial cognition rather than ‘ability differences.’ Geographers are interested in measuring and explaining such individual differences. Traditional psychometric paper-and-pencil tests of spatial abilities tend not to measure these differences very well (Allen et al. 1996). There are many factors that may contribute to variations in spatial cognition: body size, age, education, expertise, sex, social status, language, culture, and more. A first goal for geographic research is to measure and document ways these factors might covary with spatial cognition. Females and males appear to differ in some ways in their spatial abilities and styles, for example, but not in others (Montello et al. 1999). In addition to describing such covariations, geographers have the research goal of distinguishing their underlying causes. For example, the two sexes likely differ in their spatial cognition in part because of different travel experiences afforded them by their socialized roles. Explanations for such covariations are generally quite difficult to determine, however, as one cannot readily do randomized experiments on person characteristics such as age, sex, culture, and activity preferences.
3. Technologies and the Future of Spatial Cognition Research A variety of technologies are poised to have a large impact on the questions and methods of spatial cognition research in geography. Global positioning system (GPS) receivers have recently become small and inexpensive, and are increasingly used by sailors, hikers, and other outdoor enthusiasts. A GPS receiver provides information about one’s location by geometrically combining distance signals it picks up from several satellites constantly orbiting the earth. This system is also part of the technology of automated navigation systems built into automobiles to provide drivers up-to-date and locally relevant
information about routes and destinations. Several research issues concern how locational information is best displayed or communicated, and how the availability of such information might change people’s experiences and behaviors in space and place. Other computer technologies will lead to new research and thinking in spatial cognition. There are a host of spatial cognition questions inspired by efforts to improve the effectiveness and efficiency of geographic information systems (GISs). How can complex geographical information be depicted to promote comprehension and effective decision-making, whether through maps, graphs, verbal descriptions, or animations? How does exposure to new geographic information technologies alter human ways of perceiving and thinking about the world? There are also research issues concerning the way that people do or do not spatialize their understanding of other types of information networks such as the World Wide Web; one often speaks metaphorically of ‘navigating’ the Web. However, undoubtedly the most dramatic technological development in spatial cognition will be the advent of computer-simulated worlds known as virtual environments or virtual reality. See also: Cartography; Cognitive Maps; Geographic Education; Geographic Learning in Children; Knowledge (Explicit and Implicit): Philosophical Aspects; Space: Linguistic Expression; Spatial Memory Loss of Normal Aging: Animal Models and Neural Mechanisms; Spatial Thinking in the Social Sciences, History of
Bibliography
Allen G L, Kirasic K C, Dobson S H, Long R G, Beck S 1996 Predicting environmental learning from spatial abilities: An indirect route. Intelligence 22: 327–55
Couclelis H, Golledge R G, Gale N, Tobler W 1987 Exploring the anchor-point hypothesis of spatial cognition. Journal of Environmental Psychology 7: 99–122
Downs R M, Stea D (eds.) 1973 Image and Environment. Aldine, Chicago
Golledge R G 1987 Environmental cognition. In: Stokols D, Altman I (eds.) Handbook of Environmental Psychology, Wiley, New York
Levine M, Marchon I, Hanley G 1984 The placement and misplacement of you-are-here maps. Environment & Behavior 16: 139–57
Liben L S, Patterson A H, Newcombe N 1981 Spatial Representation and Behavior Across the Life Span. Academic Press, New York
Lloyd R 1997 Spatial Cognition: Geographic Environments. Kluwer Academic, Dordrecht, The Netherlands
Lynch K 1960 The Image of the City. MIT, Cambridge, MA
Mark D M, Frank A U (eds.) 1991 Cognitive and Linguistic Aspects of Geographic Space. Kluwer Academic, Dordrecht, The Netherlands
Montello D R, Lovelace K L, Golledge R G, Self C M 1999 Sex-related differences and similarities in geographic and environmental spatial abilities. Annals of the Association of American Geographers 89(3): 515–34
Tolman E C 1948 Cognitive maps in rats and men. Psychological Review 55: 189–208
Tversky B 1992 Distortions in cognitive maps. Geoforum 23: 131–8

D. R. Montello
Spatial Data Spatial data are the sum of our interpretations of geographic phenomena. Taken literally, spatial data could refer to any piece of information that associates an object with a location—from stars in the heavens to tumors in the human body. Typically, however, when used in a geographical context, spatial data is restricted to describing phenomena on or near the earth’s surface. In digital form, the data are the primary information needed by geographic information systems (GIS), the software tools used for spatial-data analysis.
1. What Are Spatial Data? ‘Spatial data’ is used as an all-encompassing term that includes general-purpose maps, remotely sensed images, and census-tract descriptions, as well as more specialized data sets such as seismic profiles, distribution of relics in an archeological site, or migration statistics. Information about buildings, bridges, roads, streams, grassland, and counties are other examples of types of spatial data. Broadly, spatial data refers to any piece of information that has an absolute or relative location (see Location: Absolute/Relative). More commonly, spatial data are associated with information shown on maps and images of natural and man-made features that range in size from 1 m to 10 km. Such data are often referred to as geospatial data (see Geographic Information Systems; Spatial Data Infrastructure; Remote Sensing; Landsat Imagery in Geography; Remote Sensing and Geographic Information Systems Analysis; Global Positioning System for examples of the varying types and usages of spatial data). Spatial data sets are collections of locational and nonlocational information about selected geographic phenomena. This collection of information forms the heart of a spatial database. Once the information is in the database, computers provide the capabilities to handle geographic data in ways that were previously impossible.
2. Data Models A digital spatial data set represents a certain model of geographic reality. These models describe both the locational and nonlocational aspects of the phenomena of interest. For users to make sense of the data they are receiving, they must understand the conceptual model underlying the data. The locational aspect of a spatial data set is usually described by one of two types of models, either grid-cell (sometimes called raster) data models, or vector data models. Grid-cell data models use collections of fixed units of space, for example, cells of land 50 m on a side, to describe the location and extent of a geographic phenomenon. Digital remotely sensed images are collected in this form. Vector data models use graphic elements (points, lines, and polygons) to describe the location and extent of the phenomenon under study. Most digital map data are in this form. The spatial data also must include all of the nonlocational information used to describe the features in the database. In the case of digital-image data, the amount of nonlocational data is relatively small, usually a set of values measuring characteristics such as the greenness of vegetation. With vector data the amount of nonlocational, or attribute, data can be quite large, even exceeding the amount of spatial data. This includes information about the various features included in the database (for example, the number of lanes of a highway) and their relationships to each other (for example, that the railroad tracks lie between the stream and the highway). In the context of GIS, features are the sum of our interpretations of geographic objects or phenomena. Buildings, bridges, roads, streams, grassland, and counties are examples of features.

Table 1 Terminology for describing geographic features

Feature type      Example            Digital database term (and synonyms)
Point feature     Benchmark          Node (vertex, 0-cell)
Linear feature    Road centerline    Arc (edge, 1-cell)
Areal feature     Lake/pond          Polygon (face, 2-cell)

2.1 Feature Data Structures The real world of geographical variation is infinitely complex and often uncertain, but must be represented digitally in a discrete, deterministic manner. In one method of describing geographic reality, characterized by the information shown on maps, we think of the world as a space populated by features of various kinds: points, lines, and areas.
This is commonly referred to as the vector or feature data model. Features have attributes that serve to distinguish them from each other. Any point in the space may be empty, or occupied by one or more features. These features may exist in one combined database, or may be separated according to a theme or variable into a number of layers or classes. The terminology used to describe these features is given in Table 1.
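A minimal sketch of the feature data model, using the terms of Table 1, might look as follows. The class name, field names, layer names, and coordinates are hypothetical and serve only to show features carrying both locational data and attributes, grouped into layers.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    kind: str       # "node", "arc", or "polygon" (terms from Table 1)
    coords: list    # [(x, y), ...]; a node has a single coordinate pair
    attributes: dict = field(default_factory=dict)
    layer: str = "default"

features = [
    Feature("node", [(10.0, 20.0)], {"type": "benchmark"}, layer="control"),
    Feature(
        "arc",
        [(0.0, 0.0), (5.0, 0.0), (9.0, 3.0)],
        {"type": "road centerline", "lanes": 2},
        layer="transport",
    ),
    Feature(
        "polygon",
        [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)],
        {"type": "lake"},
        layer="hydrography",
    ),
]

def by_layer(features):
    """Separate a combined feature collection into thematic layers."""
    layers = {}
    for feature in features:
        layers.setdefault(feature.layer, []).append(feature)
    return layers

layers = by_layer(features)
```

Here the same collection can be held as one combined database (`features`) or split by theme (`layers`), as described above.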
The relationships between the nodes, arcs, and polygons are called topological relationships. In simple terms, the topological relationships are that arcs start and end only at nodes, and that polygons are enclosed by a series of arcs. Attention is given to neighboring geographical entities when coding arcs. For instance, the topological description of each arc includes not only the starting and ending bounding nodes but also the left and right designations of the cobounding regions. In this way an arc is directed, that is, the first coordinate is that of the start node and the last coordinate that of the end node. Each polygon record might also include the identification of any interior islands or, if the polygon is itself an interior island, the identification of the single exterior polygon. This data structure not only accommodates the natural pattern of mapped features efficiently, but also provides the information needed to facilitate automated editing and manipulation of the database.
2.2 Raster Data Structures Instead of describing discrete geographic objects, geographical variation can be represented by recording the locational pattern of one variable over the study area. When the spatial interval of this sampling is regular (e.g., rectangular), the resulting matrix of observations is called a raster data structure. If there are n variables, then n separate items of information are available for each and every point in the area. Digital terrain model data and digital images are typically stored in this structure. Other types of information such as categories of land-cover could also be encoded in a raster data structure. The cells of raster data (particularly raster data of images) are commonly referred to as pixels (derived from the term picture element).
2.3 Modeling Solids So far our discussions have dealt with data on a surface. However, certain types of spatial data require a more complete modeling of the third dimension.
Solid bodies, such as geologic structures (ore bodies, petroleum reservoirs, etc.), need to be encoded not only with x, y locations, but also with z or depth information. The surface models mentioned earlier can encode z values, but only one z value is permitted for each location. This results in what is termed a 2.5D model. True solid or 3-D modeling requires that multiple z values can exist at a given x, y location. One mechanism to model such data is to extend the raster data model into an additional dimension. Thus the 2-D squares become 3-D cubes. The cubes are often termed voxels (for volume elements). Each voxel is encoded with attribute data (such as rock type). The geolocation of any given voxel is implied by its location in the x, y, z array and the size of the cubes. The voxel representation allows us to model such things as hollow spheres, with a varying thickness of the shell, or solids with holes through them.
Another way to model solids is to use a technique called boundary representation. This is an extension of the vector data model to 3-D. A solid object is defined by its bounding surface. Just as one-dimensional lines bound two-dimensional areas, 2-D planar areas are located in 3-D space and are used to bound a 3-D solid. The various planar areas bound solids, much as facets bound the shape of a jewel.
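The way a voxel's position follows from its array index and the cube size can be sketched in a few lines of Python. The class name VoxelGrid, its fields, and the sample coordinates and rock types are assumptions made for this illustration; they are not taken from any particular GIS.

```python
from dataclasses import dataclass, field
from math import floor


@dataclass
class VoxelGrid:
    """A regular grid of cubic voxels (illustrative names, not a real API)."""
    origin: tuple   # (x0, y0, z0): minimum corner of the grid
    size: float     # edge length of one cubic voxel
    shape: tuple    # number of voxels along x, y, z
    attributes: dict = field(default_factory=dict)  # (i, j, k) -> attribute

    def centre(self, i, j, k):
        """Coordinates of a voxel centre, implied by its index and the cube size."""
        x0, y0, z0 = self.origin
        return (x0 + (i + 0.5) * self.size,
                y0 + (j + 0.5) * self.size,
                z0 + (k + 0.5) * self.size)

    def index_of(self, x, y, z):
        """Index of the voxel that contains a point."""
        x0, y0, z0 = self.origin
        idx = (floor((x - x0) / self.size),
               floor((y - y0) / self.size),
               floor((z - z0) / self.size))
        if not all(0 <= n < limit for n, limit in zip(idx, self.shape)):
            raise ValueError("point lies outside the grid")
        return idx


# Hypothetical grid: 10 m voxels, 4 x 4 x 5 cells.
grid = VoxelGrid(origin=(1000.0, 2000.0, -50.0), size=10.0, shape=(4, 4, 5))
grid.attributes[(1, 2, 0)] = "granite"
grid.attributes[(1, 2, 3)] = "sandstone"

# All attribute values recorded in the vertical column at i=1, j=2.
column = {k: v for (i, j, k), v in grid.attributes.items() if (i, j) == (1, 2)}
```

Because attributes are keyed by (i, j, k), a single x, y column can carry several z values, which is what separates this structure from a 2.5D surface.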
3. Describing Geographic Features As noted earlier, spatial data models, in addition to providing locational data, must handle all of the nonlocational information used to describe the features in a database. For users to understand the nature of the data there must be a complete specification of the features being described. A comprehensive data specification includes the definition of the features as well as definitions of their attributes and attribute values. Other information required for a complete understanding includes delineation guidelines, data extraction rules, and representation rules. Understanding how the locational component of a feature is represented is of little use unless the user understands the nature of what is being described. What is really meant by terms such as ‘forest,’ ‘marshlands,’ ‘residential land,’ or ‘bridge’? For example, one could declare a ‘bridge’ to be a feature. A bridge is one of a number of built-up structures on or near the surface of the earth. Its definition may be further refined to a structure erected to carry traffic over a depression or obstacle. Thus, the feature ‘bridge’ is an element of a set of phenomena (an ‘erected structure’), with the common attributes of function (‘to carry traffic’) and location. It also has the common relationship of spanning another feature (‘over a depression or obstacle’). The definitions and specifications of the data producers hopefully match to some degree the expectations of the user. To help solve this data-definition problem a feature data specification can be used to provide a detailed definition of each feature and how it is modeled. This technique is used by national mapping agencies in the USA, France, and Germany. The feature data specifications define the domain of features that may be represented in a spatial database. For each feature in the feature data specification, a domain of attributes is defined. 
Each attribute is assigned a value, taken from the domain of values specified by the attribute definition authority. In addition to the definitions, guidance is given on how a given feature is delineated, when it is collected, and how it is represented in the spatial data model. Specific relationships, termed semantic relationships, are used to express interactions that occur between features that are otherwise difficult to represent with locational data, topological relationships, or feature descriptions. Such relationships can include the following: to flow, the relationship that occurs among the two-dimensional features composing a river system; to connect, the relationship that occurs (at junctions) among the two-dimensional transportation features composing a road network; to bound or to be bounded, the relationship that occurs between the perimeter of a city and its city limits; to be composed of, the relationship that occurs between a designated route of travel and the transportation features that compose it, or between a watercourse and the stream and pond features that make it up; and to be above, the relationship that occurs between the features in an overpass or underpass of a freeway interchange. The intent of all of these relationships is to model the interactions between features sufficiently to support GIS applications and automated cartographic product-generation requirements.
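A feature data specification of the kind described above, with a domain of values for each attribute and a store of semantic relationships, might be sketched as follows. This is a minimal illustration only: the class names, the attribute names and domains given for 'bridge', and the feature identifiers are all hypothetical and do not reproduce any mapping agency's specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeDefinition:
    name: str
    domain: frozenset      # the permitted values for this attribute


@dataclass(frozen=True)
class FeatureDefinition:
    name: str
    definition: str
    attributes: tuple      # AttributeDefinition entries allowed for the feature

    def check(self, values):
        """Return a list of problems found in a proposed set of attribute values."""
        allowed = {a.name: a.domain for a in self.attributes}
        problems = []
        for key, value in values.items():
            if key not in allowed:
                problems.append(f"{key}: not an attribute of {self.name}")
            elif value not in allowed[key]:
                problems.append(f"{key}: {value!r} is outside the domain")
        return problems


# Hypothetical specification entry based on the 'bridge' example in the text.
BRIDGE = FeatureDefinition(
    name="bridge",
    definition="structure erected to carry traffic over a depression or obstacle",
    attributes=(
        AttributeDefinition("traffic", frozenset({"road", "rail", "pedestrian"})),
        AttributeDefinition("spans", frozenset({"river", "road", "ravine"})),
    ),
)

# Semantic relationships stored as (subject, relationship, object) triples.
relationships = [
    ("bridge_17", "is_above", "river_3"),
    ("route_9", "is_composed_of", "road_segment_204"),
]


def related(subject, relationship):
    """List the objects linked to a subject by the named relationship."""
    return [o for s, r, o in relationships if s == subject and r == relationship]
```

Keeping the relationships separate from geometry and attributes mirrors the point made in the text: they record interactions that neither coordinates nor topology express directly.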
4. Modeling Time Just as maps change over time and must be revised, so do spatial databases. There are two basic ways to handle the temporal dimension associated with spatial data. The simplest way is to maintain multiple versions of a database, akin to keeping a historical map file. When changes occur a new edition of the database is produced, released with the current data, and the previous edition is placed in a historical archive. A more comprehensive, yet more complex, strategy is to maintain one database, where every database feature (and potentially its attributes) has time information associated with it. This is fairly straightforward for information in a raster form. For each cell, a variable-length list of attributes is maintained, noting any change in attribution and the time at which the change occurred. For feature-based vector data sets, the process is more complicated. Information is encoded into the database on when the feature was first incorporated into the database, if it presently exists, when it ceased to exist (e.g., a pond was filled in), and when it was recreated, perhaps in a different form (e.g., a road was rerouted). In this fashion the existence and modification of any given feature can be traced.
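The second strategy can be illustrated with a short Python sketch. It assumes, for simplicity, that changes are recorded in chronological order and that time is given as a year; the class and method names are invented for this example.

```python
from bisect import bisect_right


class TemporalCell:
    """A raster cell holding a variable-length list of (time, value) changes."""

    def __init__(self):
        self._times = []
        self._values = []

    def record(self, time, value):
        # Assumption: changes arrive in chronological order.
        if self._times and time < self._times[-1]:
            raise ValueError("changes must be recorded in time order")
        self._times.append(time)
        self._values.append(value)

    def value_at(self, time):
        """Attribute in force at the given time, or None before the first record."""
        pos = bisect_right(self._times, time)
        return self._values[pos - 1] if pos else None


class TemporalFeature:
    """A vector feature with one [start, end] interval per period of existence."""

    def __init__(self, feature_id):
        self.feature_id = feature_id
        self.lifespans = []          # end is None while the feature still exists

    def create(self, time):
        self.lifespans.append([time, None])

    def retire(self, time):
        self.lifespans[-1][1] = time

    def exists_at(self, time):
        return any(start <= time and (end is None or time < end)
                   for start, end in self.lifespans)


cell = TemporalCell()
cell.record(1990, "forest")
cell.record(2005, "residential")

pond = TemporalFeature("pond_12")
pond.create(1950)    # first incorporated into the database
pond.retire(1980)    # filled in
pond.create(1995)    # recreated, perhaps in a different form
```

Here cell.value_at(2000) yields the attribute in force in that year, and pond.exists_at(1985) reports whether the pond existed then, so the history of either kind of object can be traced from a single database.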
5. Data Quality The utility of spatial data in GIS depends directly on the quality of the spatial data used in the process. Geospatial data quality is measured along at least three different axes: space (location, places), time, and theme (content, characteristics). The space axis characterizes positional accuracy; the time axis indicates temporal accuracy and completeness; and the theme axis reflects semantic accuracy, attribute accuracy, and completeness. These qualities are not
independent, however. Indeed some data-quality aspects, such as logical consistency, directly refer to the interrelationships of these data-quality elements. Not all of these quality elements have well-defined quality metrics associated with them. Metrics of semantic accuracy, for example, are not well understood. And some of the measures used for other quality elements may be inadequate. Also needed are linkages between the quality information present in the data and the GIS-processing algorithms, so that software can produce error estimates on the end results. The results of GIS analyses have often been presented as definitive to decision makers (e.g., this is the best site for the municipal water well), with little or no qualification. One would expect that decision makers would insist on qualifying statements on any results presented to them. The methods for doing this are not well understood. For example, due to the cumulative effects of error, it has been conjectured that as the number of data sets used in a GIS analysis increases, the accuracy of the result decreases. This runs counter to common practice, where multiple data sets are used with little regard for whether they are contributing more to the information content or to errors in the result.
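One simple way to see why the conjecture is plausible is to assume, purely for illustration, that each data set is correct at a given location with some probability and that errors in different data sets are independent. Under that assumption the chance that every layer in an overlay is correct at the location is the product of the individual probabilities. The function name and the 0.9 figure below are hypothetical, and real errors are rarely independent.

```python
from math import prod


def overlay_accuracy(layer_accuracies):
    """Probability that all layers are correct at a location.

    Assumes errors in the layers are statistically independent, which is a
    simplification made for this sketch rather than a general property.
    """
    return prod(layer_accuracies)


# Combined accuracy for 1, 2, 4, and 8 layers that are each 90 percent correct.
combined = {n: overlay_accuracy([0.9] * n) for n in (1, 2, 4, 8)}
```

Under these assumptions the combined figure falls as layers are added, which is the effect the conjecture describes.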
6. Metadata As technology provides faster and more efficient ways to transmit and process spatial data, data producers and users are realizing the potential value of exchanging and sharing spatial data. The ability to use existing spatial data is important to organizations that are trying to share data and to increase benefits from their data collections. This ability depends not only on being able to find data, but also on understanding the characteristics and quality of the data. Metadata, ‘data about data,’ provide information such as the characteristics of a data set, the history of a data set, and organizations to contact to obtain a copy of a data set. Standardized metadata elements provide a means to document data sets within an organization, to contribute to catalogs of data that help persons find and use existing data, and to help users understand the contents of data sets that they receive from others. Metadata support activities include determining existing data, determining the appropriate data set for a particular use, providing information to access data sets, and providing information to transfer data sets for processing and use.
While the benefits of metadata are manifest, several impediments hinder uniform adoption and use. These center on two topics: agreement on what metadata elements to collect, and the level of effort required to collect whatever metadata is required. Obviously the two items are linked. From the user perspective, detailed metadata is highly desirable. More information will help the user determine the suitability of a given data set. From the data producer’s perspective,
metadata costs money; metadata elements should be constrained to a cost-effective set. Significantly more resources may be required to collect metadata about ‘legacy’ data holdings than about data sets collected using techniques that capture the metadata as a part of the entire data-collection process. Agreement on standardized terminology and definitions would not only provide benefits in the automation of the collection of metadata, but would also be exceedingly valuable in searching data catalogs.
Several metadata ‘standards’ are in widespread use around the world. In general they have data elements that have been selected to provide information about four topics:
Availability: data needed to determine the sets of data that exist for a geographic location;
Fitness for use: data needed to determine if a set of data meets a specified need;
Access: data needed to acquire an identified set of data; and
Transfer: data needed to process and use a set of data.
These elements form a continuum where a user cascades through a pyramid of choices when determining what data are available, evaluating the fitness of data for use, accessing the data, and transferring and receiving the data. The exact nature of these data elements continues to be the subject of much debate and discussion.
A collection of metadata records combined with data management and search tools forms a data catalog. The architecture of these data catalogs can vary from tightly held internal data indexes to widely accessible distributed catalogs spread across the Internet. Establishing a data catalog can be a major project.
Among the items that need to be accomplished are to:
(a) determine which programs or individuals produce, manage, or disseminate data that are to be included in the catalog;
(b) identify what geospatial data an organization holds, where data are located, and their condition;
(c) identify priorities for documenting previously collected or produced data based on demand, geographic extent, uniqueness, or other criteria to be determined by the organization;
(d) create the metadata for new spatial data;
(e) ensure that the metadata comply with applicable standards, to the extent possible for the organization;
(f) provide training and technical assistance to metadata producers, quality-assurance personnel, and end users;
(g) update and maintain metadata as necessary;
(h) specify, acquire, and manage the hardware, software, and telecommunications services so that the metadata are electronically accessible;
(i) determine how to link or cross-reference spatial data and metadata to ensure consistency, maintainability, easy query or search, and easy access; and
(j) respond to questions about or orders for spatial data.
Clearly the catalog function can go far beyond the basic listing of content. Extensions that allow for direct ordering or download of data, user comment or feedback, as well as requests for new data collections are all possible. Metadata and data catalogs are the means by which data users can more economically find, share, and use spatial data.
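The idea of a catalog as metadata records plus a search tool can be sketched in Python. The record below carries one illustrative element for each of the four topics named earlier (availability, fitness for use, access, transfer). The field names, the sample records, and the contact addresses are hypothetical and are not drawn from any published metadata standard.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    title: str              # availability: what the data set is
    bbox: tuple             # availability: (west, south, east, north) in degrees
    accuracy_m: float       # fitness for use: stated positional accuracy
    contact: str            # access: whom to ask for a copy
    transfer_format: str    # transfer: how the data are delivered


def overlaps(a, b):
    """True when two (west, south, east, north) boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalog, area, max_error_m=None):
    """Return records covering the area, optionally filtered by accuracy."""
    hits = [r for r in catalog if overlaps(r.bbox, area)]
    if max_error_m is not None:
        hits = [r for r in hits if r.accuracy_m <= max_error_m]
    return hits


# Hypothetical catalog entries.
catalog = [
    MetadataRecord("County roads", (-78.0, 38.0, -77.0, 39.0), 5.0,
                   "roads@example.org", "shapefile"),
    MetadataRecord("State hydrography", (-80.0, 37.0, -75.0, 40.0), 25.0,
                   "water@example.org", "GML"),
    MetadataRecord("Coastal imagery", (-76.0, 36.0, -75.0, 37.0), 2.0,
                   "coast@example.org", "GeoTIFF"),
]

study_area = (-77.5, 38.2, -77.2, 38.6)
found = [r.title for r in search(catalog, study_area)]
precise = [r.title for r in search(catalog, study_area, max_error_m=10.0)]
```

The two queries follow the cascade described in the text: the first establishes what is available for the area, and the second narrows the result by fitness for use before the access and transfer elements come into play.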
Bibliography
Guptill S C, Morrison J L (eds.) 1997 Elements of Spatial Data Quality. Pergamon, Oxford, UK
Langran G 1992 Time in Geographic Information Systems. Taylor and Francis, Philadelphia, PA
Longley P A, Goodchild M F, Maguire D J, Rhind D W (eds.) 1999 Geographical Information Systems, 2nd edn. John Wiley, New York
Robinson A H, Morrison J L, Muehrcke P C, Kimerling A J, Guptill S C 1995 Elements of Cartography, 6th edn. John Wiley, New York
S. C. Guptill
Spatial Data Infrastructure
1. The Context The nature of Geographical Information Systems (GIS) is that the basic principles are simple, sometimes apparently even trivial, but the rate of change in all aspects of the field is great and good practice is sometimes fiercely demanding intellectually and in other ways. This article reflects that situation; the continuously evolving characteristics of GIS and Spatial Data Infrastructure make this a snapshot of the situation in the early part of the twenty-first century (see Geographic Information Systems). The history of GIS has been summarized by Coppock and Rhind (1991) and Foresman (1997). What began as a hugely expensive (and thus rare) mainframe computer tool for inventory and mapping purposes in the mid-1960s blossomed in the 1990s. The statistics suggest that there are now about two million users of GIS worldwide and this figure is growing at about 20 percent per annum. Many of the active users of GIS are carrying out tasks on behalf of large numbers of citizens, notably through the work of governments. In addition, there are many everyday users of services that include GIS elements, sometimes without this being obvious (e.g., in many Internet searches). The annual commercial revenues generated from sales of GIS software are now over $1 billion worldwide. Typically this leverages a ratio of about 1:5 in terms of expenditures on hardware, staff training, data collection, and operations. Thus the total annual GIS expenditure is now likely, insofar as it can be estimated, to be of the order of $15 to $20 billion. This figure is supported by estimates of expenditure by known users, e.g., cadastral and mapping agencies. The great bulk of the software revenues are earned by a small number of US-based firms but other GIS expenditures are typically more local. One surrogate for the growth of GIS is the number of people attending conferences: attendances at the most popular GIS software conference have grown from 23 in 1981 to over 10,000 in 2001 (Fig. 1). All this is not surprising, for the range of applications of GIS is now extraordinarily wide (Longley et al. 2001). GIS has thus moved in 30 years from an esoteric side issue, based on the creation of software by academics and government, to a fully fledged global business and scientific endeavor.
Figure 1 Attendance at the annual ESRI User Conference since 1981, a surrogate for interest in GIS as a whole. Source: Environmental Systems Research Institute
2. The Problem That said, views of what GIS is and why it is significant expanded greatly in the late 1990s. Issues such as the reliability of results from GIS-based analyses of multiple data sets, the impact of the technology on privacy, debates on public policies on access to
information and its ownership, plus the international interactions of national policies, all involved and influenced thinking on GIS. The result was to reengineer GIS as a science and as a political, managerial, and business issue as well as a technological one. This has a series of consequences, such as calling into question the surprisingly homogeneous GIS educational infrastructure existing in over 2,000 universities worldwide (Longley et al. 2001). Paradoxically, it has also led to a recognition that the ‘island-like state’ of GIS in respects other than education needs to be addressed: to date, most applications and uses of GIS have been project-based. If GIS is to become totally ubiquitous (a reasonable aim given the ubiquity of use of geographical information in most decision making), then a number of major problems have to be overcome. These have been summarized by the US Federal Geographic Data Committee (1997) as follows:
In the United States, geographic data collection is a multibillion-dollar business. In many cases, however, data are duplicated. For a given piece of geography, such as a state or a watershed, there may be many organizations and individuals collecting the same data. Networked telecommunications technologies, in theory, permit data to be shared, but sharing data is difficult. Data created for one application may not be easily translated into another application. The problems are not just technical: institutions are not accustomed to working together. The best data may be collected
on the local level, but they are unavailable to state and federal government planners. State governments and federal agencies may not be willing to share data with one another or with local governments. If sharing data among organizations were easier, millions could be saved annually, and governments and businesses could become more efficient and effective. Public access to data is also a concern. Many government agencies have public access mandates. Private companies and some state and local governments see public access as a way to generate a revenue stream or to recover the costs of data collection. While geographic data have been successfully provided to the public through the Internet, current approaches suffer from invisibility. In an ocean of unrelated and poorly organized digital flotsam, the occasional site offering valuable geographic data to the public cannot easily be found. Once found, digital data may be incomplete or incompatible, but the user may not know this because many data sets are poorly documented. The lack of metadata or information on the “who, what, when, where, why, and how” of databases inhibits one’s ability to find and use data, and consequently, makes data sharing among organizations harder … If finding and sharing geographic data were easier and more widespread, the economic benefits to the nation could be enormous.
This statement actually understates the scale of the problems encountered in typical GIS projects or programs. These normally involve dealing with the assembly of data from multiple sources and coping with scarce staff skills, ‘state of the art’ technology, and frequently contested outcomes. Making effective and safe use of data encoded by different people to different levels of resolution and accuracy, collected at different times and without certification of quality, is self-evidently non-trivial. These are some of the problems that Spatial Data Infrastructures are designed to resolve.
3. What is a Spatial Data Infrastructure? It seems (Table 1) that Spatial Data Infrastructures (SDIs) are already commonplace. Harris (1997) and Masser (1998) have described various policy aspects relating to Geographic Information and to the early stages of SDIs but the situation is constantly evolving. The use of terminology, however, conceals many different entities and states of development. In Australia the local manifestation is termed ‘the Australian Spatial Data Structure;’ in Canada ‘the Canadian Geospatial Data Infrastructure;’ in the Asia-Pacific group there is ‘the Permanent Committee for GIS Infrastructure in Asia and the Pacific;’ and in the UK it is the ‘National Geospatial Data Infrastructure.’ SDIs are not necessarily national: a number of US states have set one up and a global version is under discussion.
Table 1 Countries which have assembled or are assembling National Spatial Data Infrastructures as of mid-2000. Source: Henry Tom, Oracle Corporation
Argentina, Australia, Canada, Colombia, Cyprus, Finland, France, Germany, Greece, Hungary, India, Indonesia, Japan, Kiribati, Macau, Malaysia, Mexico, Netherlands, New Zealand, Northern Ireland, Norway, Pakistan, Poland, Russian Federation, South Africa, Sweden, United Kingdom, USA
The most commonly used terms in this arena are ‘spatial data,’ ‘geospatial data,’ and ‘geographic(al) information.’ The phraseologies can be regarded as synonymous in some uses but not in others (see, for instance, Longley et al.’s (2001) discussions of data, information, evidence, and knowledge). Broadly speaking, ‘spatial data’ is the most widely used term except in Europe where ‘Geographic Information (GI)’ is widely used. Formal definitions of SDIs vary significantly around the world. Two definitions of the US National Spatial Data Infrastructure (NSDI) are shown below, together with that of the putative Global Spatial Data Infrastructure (GSDI):
[The National Spatial Data Infrastructure is] ‘the comprehensive and coordinated environment for the production, management, dissemination, and use of geospatial data.
[It subsumes] the totality of the policies, technology, institutions, data and individuals that were producing and using geospatial data or GI within the United States.’ Source: 1993 and 1994 reports of the Mapping Science Committee (MSC) of the United States National Research Council
[NSDI is] ‘the technology, policies, standards, and human resources necessary to acquire, process, store, distribute, and improve utilization of geospatial data.’ Source: Presidential Executive Order 12906, 1994
‘The GSDI is the broad policy, organizational, technical and financial arrangements necessary to support global access to geographic information.’ Source: Steering Committee of the Global Spatial Data Infrastructure (see http://www.gsdi.org)
The differences in these are subtle but important. For instance, the financial underpinnings of NSDIs were largely ignored in the early years, as were the commercial and private sector contributions. Many originally saw the entire NSDI enterprise being driven by new technologies, especially Geographical Information Systems; others saw the key issues as the information production process. The progressive acceptance of Geographical Information Science has now broadened the generally accepted concept of NSDIs to include a stronger research and education agenda. In the USA in particular, the NSDI was originally driven by federal government agencies and the Federal Geographic Data Committee in particular; this has led to much debate on how to create leadership in an enterprise that can or should not—according to Internet enthusiasts—be led.
4. The US Experience with an SDI The President’s Executive Order and the FGDC identified three primary areas to promote development of the NSDI. The first was the development of standards, the second the improvement of access to and sharing of data by developing a National Geospatial Data Clearinghouse, and the third the development of the National Digital Geospatial Data Framework. All of these efforts were to be carried out through partnerships among federal, state, and local agencies, the private and academic sectors, and nonprofit organizations. Executive Orders in the USA are only enforceable in regard to federal agencies: such cross-sectoral adherence can only be achieved by influence and through the leverage of federal funding. The Federal Geographic Data Committee operates as a bureaucracy through a series of subcommittees based on different themes of geospatial data (e.g., soils, transportation, cadastral), each chaired by a different federal agency. Figure 2 illustrates the original subcommittees and which federal agency led each one. Several working groups have also been formed to address issues on which there is a desire among agencies to coordinate and which cross subcommittee interests (e.g., Clearinghouse, Standards, Natural Resource Inventories).
Figure 2 The original structure of the Federal Geographic Data Committee sub-committees and lead federal agency responsibilities for them. [Panels: Working Groups; Thematic Subcommittees]
4.1 Standards Many of these groups have developed standards for data collection and content, classifications, data presentation, and data management to facilitate data sharing. For example, the Standards Working Group developed the metadata standard, which was formally adopted by the FGDC in mid-1994 and has since been adapted in the light of international and other national developments. All of the FGDC-developed standards undergo an extensive public review process. This includes nationally advertised comment and testing phases plus solicitation of comments from state and local government agencies, private sector firms, and professional societies. The most recent NSDI standards at the time of writing (late 2000) are for utility data content, hydrography, geological maps, and a profile (i.e., a subset) of the shoreline metadata standard. The Executive Order mandated that all federal agencies must use all FGDC-adopted standards.
Figure 3 Growth in use of the US NSDI Clearinghouse. Source: Henry Tom, Oracle Corporation
4.2 The Clearinghouse The second activity area is intended to facilitate access to data, minimize duplication, and assist partnerships for data production where common needs exist. This has been done by helping to advertise the availability of data through development of a National Geospatial Data Clearinghouse. The strategy is that agencies producing Geographical Information describe its existence with metadata. They then serve those metadata on the Internet in such a way that they can be accessed by commonly used Internet search and query tools. The FGDC-adopted metadata standard describes the content and characteristics of geospatial data sets. President Clinton’s Executive Order stipulated that agencies make these metadata accessible electronically. This has been rendered practicable by the development of metadata-creating software within GIS by the larger vendors. As a result of all this, nearly all federal agencies, as well as most States and numerous local jurisdictions, have become active users of the Internet for disseminating geospatial data. This model does not necessarily assume that GI will be distributed for free. Obtaining some of the data sets requires the payment of a fee; others are free. The Clearinghouse can also be used to help find partners for database development by advertising interest in or needs for GI. The growth in use of the Clearinghouse is shown in Fig. 3.
4.3 The Geographical Framework
The third NSDI activity area is the conceptualization and development of a US digital geographical (or geospatial) framework data set. Framework data are those which underpin all GIS operations, providing the geographical context and structure within which spatially manifested processes operate and on which other (‘area coverage’) data are assembled. There are two main reasons why a spatial framework is essential in almost all GIS operations. The framework provides both spatial context and spatial structure. In dealing with a satellite image, for example, the incorporation of place names facilitates linkage to the real world; moreover, the framework may be used to rectify distortions in, or subset and reorientate, the imagery. The second reason for the importance of the framework is that geographic information is typically collected by many different agencies, usually within the boundaries of national territories; and most of these agencies have historically collected data primarily to suit their own local purposes. Thus, if the benefits of added value are to be obtained from data
integration within a GIS, there must be a mechanism (or spatial ‘key’) to register the different data sets together in their correct relative positions and, beyond this, to ensure that the component objects within each data set are properly related. In general, only the geographical framework can provide this ubiquitous data linkage key through the use of coordinates or implicit geographical coding. What constitutes the framework varies in different countries. In the USA, the Federal Geographic Data Committee (FGDC 1995a) has defined the information content of the US framework as consisting of:
• geodetic control
• digital orthoimagery
• elevation data
• transportation
• hydrography
• the geography of governmental units; and
• public land cadastral information.
Similar definitions (including place names) would in practice occur in many countries. Most typically, the framework is manifested in user terms through topographic mapping (the ‘topographic template’), which may contain all this information but partly in implicit form (e.g., geodetic control). Hence the US framework was intended to form the foundation for the collection of other data, minimize data redundancy, and facilitate the integration and use of geospatial data.
It should be pointed out that the USA is generally less well endowed in relation to a consistent geographical framework than are many other nations. The federal nature and physical extent of the country has ensured a multilevel and inconsistent hierarchy of geographical information is produced by the federal, state, and local governments, with local updating and tailoring by agencies and private sector bodies. This contrasts with Britain and many other countries where a single, consistent national framework has long been in existence and is kept consistently up to date.
The disadvantages of a multiplicity of frameworks are evident: the creation of multiple but different framework data sets will guarantee the creation of artifacts in any computer analysis using combinations of data sets related to each of the frameworks. Putting a value on such externalities is difficult, especially since the costs are visited on those at the end of supply chains and are often only discovered long after the data have led to mistaken conclusions. Given the huge number of players involved, getting agreement on a multipurpose framework has been difficult in the USA. The Executive Order directed the FGDC to develop a plan for completing initial implementation of the framework by the year 2000 but this has proved impossible. A popular, partial solution has been to create one-meter resolution geometrically correct orthophotomaps in place of a conventional digital map framework. Attractions of this approach are that it could be automated to a great degree and
could also be financed jointly by states and the federal government and implemented via the private sector. The collaboration to create or refine the NSDI has led to various tensions. These have occurred between federal agencies, between them and the states and local governments, and (to a more limited extent because of their more modest involvement until the late 1990s) with the private sector. These tensions arise partly because the NSDI permits the questioning of traditional, legacy roles for government bodies: for instance, the concept of assembling national databases by sewing together detailed ones held by states directly undermines the traditional role of some federal agencies. In addition, the NSDI has made public some latent demands which have ramifications for budget holders: in many cases the beneficiaries of spending are not those called on to divert resources from other activities. Most recently, a US national report on NSDI pointed out the consequences of seeking a private/public sector partnership in developing the National Spatial Data Infrastructure. These consequences include an obvious requirement for the private sector to generate profits from trading in information and in financing infrastructure. These would necessitate, it was argued, a more cohesive and industry-friendly policy on information ownership and massive investment by government to convert its data into more coherent, consistent, and object-oriented form, which could then be marketed. Partnerships of this kind, as opposed to normal contractual arrangements with the private sector working for the government, are difficult to make work.
5. The Global Spatial Data Infrastructure
Most GI has been collected, assembled, and used hitherto in a national context. Yet there is a growing number of requirements for pan-national, even global data. For instance, Agenda 21 was designed to facilitate sustainable development. The manifesto was agreed in Rio in 1992 at the ‘Earth Summit’ and was supported by some 175 of the governments of the world. It is noteworthy that eight chapters of the Agenda 21 plan dealt with the need to provide geographic information. In particular, Chapter 40 aimed at decreasing the gap in availability, quality, standardization, and accessibility of data between nations. This was reinforced by the Special Session of the United Nations General Assembly on the Implementation of Agenda 21 held in June 1997. The report of this session includes specific mention of the need for global mapping, stressing the importance of public access to information and international cooperation in making it available. The Global Spatial Data Infrastructure (GSDI) is an attempt to enhance the availability and accessibility of global GI. It is the result of a voluntary coming together of national mapping agencies and a variety of other individuals and bodies. The organization has sought to define and facilitate creation of a GSDI by learning from national experiences. As part of this process, it funded a scoping study carried out by Australian consultants. They outlined in early 2000 their conceptual view of how the inputs to decision making were linked to an SDI and the potential benefits that would arise from having one. They pointed out, however, that proof of the reality of the predicted benefits was difficult to find. The fundamental question for any GSDI business case is: what exactly is the product that governments and other agencies will be asked to fund? The consultants said bluntly that ‘The broad definitions of GSDI are not sufficiently operational (or practical) to build a clear business case around.’ They went on to raise some key questions which they recognized as being difficult to answer but which had to be tackled to give credibility to a formal business case for GSDI. These included:
• Is GSDI a real global SDI or a federation of national or regional SDIs?
• Is GSDI really about encouraging developing countries (or at least those that do not currently place emphasis on geospatial data) to do so? In this case, the product may be a suite of policy recommendations and technical assistance approaches that development agencies can apply in their client countries.
• Is GSDI a process, a general framework, or is it a particular product such as a world map or a comprehensive database?
• Who are the stakeholders?
• What about the demand side of GSDI? How do you isolate demand which arises from decision-making processes?
Given all this, and not surprisingly in view of the lack of any global government and the complications in building a national, let alone pan-national, SDI, progress on a GSDI has been slow to date.
The primary reason is not inadequate technology but the requirement to clarify real need, obtain sources of finance, and secure political support.
6. Spatial Data Infrastructures and the Social Sciences
The diversity of the multiple strands in the social sciences revealed by this encyclopedia complicates any brief attempt to link the book’s subject matter to GIS and SDIs. The relationship nevertheless manifestly exists. Examples of factors which underpin the multiple and sometimes subtle relationships include:
• The increasingly ‘taken for granted’ nature of the role of GIS in supporting tactical (and sometimes strategic) decision making worldwide.
• The growth of problem-oriented approaches within government as well as in business. These necessitate crossing traditional disciplinary boundaries and using information drawn from many sources, e.g., relating to the natural environment as well as the social sciences, with all the ownership, epistemological, and other problems that entails.
• The methodological implications of conflating information on people, places, institutions, and policies which have different conceptual bases and other characteristics.
• Policy conflicts between different nations or other governed units such as US states (e.g., on the ‘proper’ role of the state, as in the free distribution of or charging for information).
• Institutional and interinstitutional politics and financing struggles within government (still the largest source of Geographical Information). This can be encapsulated in the ‘who leads’ question, which is difficult to answer given the permeation of SDI and GIS through many activities and responsibilities in different sectors.
• The interrelationship of the state and business. This is more complex than in many other cases, for governments are both providers (in some cases competitive with the private sector) and consumers of Geographic Information. The supply of some GI is a natural monopoly. There is a geography to this business/government relationship which is much more complex than national information policies would suggest (Longley et al. 2001).
• The legal issues involved in SDIs and GIS, which relate to liability, privacy (e.g., Curry 1999 and Longley et al. 2001), and ‘fair dealing’ as well as information ownership and protection. Again, there is a geography of the law related to SDIs.
In summary, SDIs are predicated on worthy aims. But they raise fundamental theoretical and practical questions of the role of the state and of intersectoral and organizational partnerships on a scale hitherto unknown in many social sciences. Success in creating SDIs may both benefit and disadvantage the individual (e.g., in providing better and cheaper access to information but also in undermining privacy). It is also not clear how to assess success in an environment where competition and partnerships within governments and between them and the private sector are both inevitable. SDIs also raise a series of practical managerial questions. Ultimately, they impact on the lives and work of everyone. As a consequence of all of the above, SDIs provide a fertile laboratory for the social sciences.
See also: Geographic Information Systems: Critical Approaches; Geography; Remote Sensing; Remote Sensing and Geographic Information Systems Analysis; Spatial Analysis in Geography; Spatial Data
Spatial Decision Support Systems
Bibliography
Coppock J T, Rhind D W 1991 The history of GIS. In: Maguire D J, Goodchild M F, Rhind D W (eds.) Geographical Information Systems. Longman Scientific and Technical, Harlow, UK pp. 21–43
Curry M R 1999 Rethinking privacy in a geocoded world. In: Longley P A, Goodchild M F, Maguire D J, Rhind D W (eds.) Geographical Information Systems: Principles, Techniques, Management and Applications, 2nd edn. Wiley, New York pp. 757–66
Executive Order 12906 1994 Co-ordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure. Washington, DC
FGDC 1995a Development of a National Digital Geospatial Data Framework. Federal Geographic Data Committee, Washington, DC
FGDC 1995b The National Geospatial Data Clearinghouse. Federal Geographic Data Committee, Washington, DC
FGDC 1997 A Strategy for the National Spatial Data Infrastructure. Federal Geographic Data Committee, Washington, DC
Foresman T W (ed.) 1997 The History of GIS. Prentice-Hall, Upper Saddle River, NJ
Harris R 1997 Earth Observation Data Policy. Wiley, Chichester, UK
Longley P A, Goodchild M F, Maguire D J, Rhind D W 2001 Geographical Information Systems and Science: A Student Companion. Wiley, New York
Masser I 1998 Governments and Geographic Information. Taylor & Francis, London
National Research Council (US) Mapping Science Committee 1993 Toward a Co-ordinated Spatial Data Infrastructure for the Nation. Mapping Science Committee, National Research Council, National Academy Press, Washington, DC
National Research Council (US) Mapping Science Committee 1994 Promoting the National Spatial Data Infrastructure Through Partnerships. Mapping Science Committee, National Research Council, National Academy Press, Washington, DC
National Research Council (US) Mapping Science Committee 1997 The Future of Spatial Data and Society. Mapping Science Committee, National Research Council, National Academy Press, Washington, DC
Rhind D W 1999 National and international geospatial data policies. In: Longley P A, Goodchild M F, Maguire D J, Rhind D W (eds.) Geographical Information Systems: Principles, Techniques, Management and Applications. Wiley, New York pp. 767–87
D. Rhind
Spatial Decision Support Systems
People frequently make choices that involve evaluating geographic position or geographic context. Whether selecting routes, delineating service areas, or defining environmentally hazardous areas, they use ‘multiple spatial criteria’ in making such choices and apply them to complex geographic data. In social science, such complex decision-making problems are often called ‘ill-structured,’ and a set of theories, methods, and techniques has been developed for a wide variety of problem-solving contexts. It has become common to link these together in computer software that operates on spatial data. Such software systems are called spatial decision support systems (SDSS) and are increasingly being used in both the business and public sectors today.
1. Spatial Decision Support for Ill-structured Problems
A number of problems may occur when making decisions for ill-structured problems. First, it may not be clear to the decision maker how much of the attainment of one spatial criterion should be sacrificed in order to attain a given amount of another. The decision maker often wants to explore alternative solutions and to reach a decision about the best solution only after extensive exploration of the spatial and the attribute solution space. That is, their preferences for a particular solution may be influenced by what can be attained in a particular context. Although they may have preferences for abstract entities, the implementation of these preferences is mediated by knowing what can be attained in the given context. Decision support systems are designed to facilitate the exploration of these spaces and thus to assist decision makers in their search for the best solutions to their problems. A second problem is that the spatial databases containing relevant information may not be complete either with respect to all the variables that are relevant or to the accuracy with which they are measured. During the process of searching for a solution, decision makers will sometimes want to add their unique spatial knowledge about an area or decide to search for additional facts about a location or area before making a decision. A third problem is that though decision makers may want to view the results of models of processes, they may choose not to accept these results over concerns about their validity for the given situation. All three problems lead many decision makers to ask for computer-based, interactive support systems with which to explore their problem and search for a solution.
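The first problem, not knowing in advance how much of one criterion to give up for another, can be illustrated with a small sketch. It assumes a handful of hypothetical candidate sites scored on two criteria (the names and scores are invented, and higher is taken to be better on both). Rather than fixing weights, it lists the non-dominated alternatives, which is one simple way an exploratory system might show a decision maker what can actually be attained before preferences are settled.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate sites scored on two spatial criteria (higher is better).
# All names and scores are invented for illustration.
alternatives: Dict[str, Tuple[float, float]] = {
    "site_a": (0.9, 0.2),
    "site_b": (0.7, 0.6),
    "site_c": (0.4, 0.8),
    "site_d": (0.5, 0.5),
    "site_e": (0.3, 0.3),
}


def dominates(x: Tuple[float, ...], y: Tuple[float, ...]) -> bool:
    """True if x is at least as good as y on every criterion and better on one."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


def pareto_front(options: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the names of alternatives that no other alternative dominates."""
    return sorted(
        name
        for name, score in options.items()
        if not any(dominates(other, score) for other in options.values())
    )


front = pareto_front(alternatives)
print(front)
```

Choosing among the alternatives that remain requires a genuine trade-off, and that is the judgment the text leaves to the decision maker.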
1.1 Solution Spaces in SDSS
The solutions to all SDSS problems exist in both geographic and attribute space. As the geographic space of the solution to any problem changes, so its attribute space changes too. For example, in making districts from which voters will elect representatives, the geographic solution space is the map of the boundaries of the constituencies and the areas they contain. The corresponding attribute space describes the numbers of eligible voters in each district, their party affiliations, their past voting record, and any attribute that a politician may view as relating to their likely voting behavior on a range of relevant issues in an upcoming election. Such attributes might be their ethnicity, income, age structure, religious preferences, etc. Thus, in an exercise to re-draw electoral districts following a decennial census, any participant in the process may want to see how any changes in the boundaries are accompanied by changes in any or all of these attributes. A system for simultaneously computing alternative regionalizations of territory would constitute a spatial decision support system for electoral redistricting. In this example, ‘regionalizing’ is the spatial function for which the support system is developed. Defined characteristics of the areas and the constraints for their definition specified by a particular political system would be the attributes for which the SDSS would be developed.
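The link between the two solution spaces in the redistricting example can be sketched in a few lines. The sketch assumes four hypothetical census units with invented voter counts and a single invented affiliation attribute (`party_x`). A districting plan is the geographic solution, the per-district totals are the attribute solution, and moving units between districts changes both.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical census units: id -> attributes. All figures are invented.
units: Dict[str, Dict[str, int]] = {
    "u1": {"voters": 1000, "party_x": 600},
    "u2": {"voters": 800, "party_x": 300},
    "u3": {"voters": 1200, "party_x": 500},
    "u4": {"voters": 900, "party_x": 700},
}


def summarize(plan: Dict[str, str]) -> Dict[str, Dict[str, float]]:
    """Aggregate unit attributes into district totals for one districting plan."""
    totals = defaultdict(lambda: {"voters": 0, "party_x": 0})
    for unit_id, district in plan.items():
        for key, value in units[unit_id].items():
            totals[district][key] += value
    # Add the share of party_x supporters as a derived attribute.
    return {
        district: dict(t, share_x=round(t["party_x"] / t["voters"], 3))
        for district, t in sorted(totals.items())
    }


# Two alternative geographic solutions (unit -> district).
plan_1 = {"u1": "A", "u2": "A", "u3": "B", "u4": "B"}
plan_2 = {"u1": "A", "u2": "B", "u3": "B", "u4": "A"}

for label, plan in (("plan_1", plan_1), ("plan_2", plan_2)):
    print(label, summarize(plan))
```

A real redistricting system would also enforce contiguity and population constraints, which this sketch leaves out.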
2. Types of SDSS
Different types of SDSS can be defined according to how geographic and attribute spaces are generated. In some SDSS, these spaces are generated interactively by the user. The capabilities of the SDSS then consist of the tools for manipulating the geographic information and reporting the characteristic attributes of the outcomes. Consequently, the simplest SDSS incorporate the capacity to convey information from the geographic to the attribute space or from the attribute space to the geographic space. These are key capabilities of geographic information systems (GIS). GIS are simple spatial decision support systems. They permit users to query locations and learn about attributes at the locations. They also permit users to define attributes, often through database queries such as sort or find, and then to see their geographic locations quickly. Good examples can be found in Birkin et al. (1996). Crossland et al. (1995) tested the efficacy of using GIS as SDSS for making location decisions. In their experiment, students with access to a GIS made better and quicker decisions than those without it. The term SDSS is often reserved for systems that combine the functions of GIS with multicriteria decision analysis and optimizing models. These are developed for problem contexts that involve not only the manipulation of geographic information in GIS, but also the generation of preferences, the modeling of choices, and the search for locations or areas that best meet a user’s defined needs. For such problems, GIS need to be integrated with spatial models. Integration is accomplished by visualization of related geographic and attribute spaces and by linking the analysis systems so that users can move easily between them using a unified user interface (Densham 1994). All SDSS link together a set of fundamental elements described below (see Fig. 1).
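The two directions of query attributed to GIS in this section, from locations to attributes and from attributes to locations, can be sketched without any GIS library. The feature names, coordinates, and attributes below are invented, and a real system would use a spatial index instead of a linear scan.

```python
from typing import Callable, Dict, Tuple

# Hypothetical point features: name -> (x, y, attributes). Invented data.
features: Dict[str, Tuple[float, float, dict]] = {
    "clinic_1": (2.0, 3.0, {"type": "clinic", "beds": 10}),
    "clinic_2": (8.0, 1.0, {"type": "clinic", "beds": 25}),
    "school_1": (2.5, 3.5, {"type": "school", "beds": 0}),
}


def attributes_near(x: float, y: float, radius: float) -> Dict[str, dict]:
    """Geographic -> attribute space: what lies within `radius` of (x, y)?"""
    return {
        name: attrs
        for name, (fx, fy, attrs) in features.items()
        if (fx - x) ** 2 + (fy - y) ** 2 <= radius ** 2
    }


def locations_where(predicate: Callable[[dict], bool]) -> Dict[str, Tuple[float, float]]:
    """Attribute -> geographic space: where are the features matching `predicate`?"""
    return {
        name: (fx, fy)
        for name, (fx, fy, attrs) in features.items()
        if predicate(attrs)
    }


print(attributes_near(2.0, 3.0, 1.0))
print(locations_where(lambda a: a["type"] == "clinic" and a["beds"] >= 20))
```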
2.1 Preference, Choice and Multicriteria Decision Models
Many spatial decisions involve correct anticipation of how people will make selections from alternative locations when presented with any given set of alternatives. Subjective analysis of maps of existing or hypothetical future alternatives does not lead to accurate predictions of how many people will select one alternative over another. For this purpose spatial interaction or spatial choice models have been developed. When added to GIS with appropriate measures of the variables that affect choice, these models can provide invaluable assistance to people charged with making decisions about future facility locations, for example. These ‘choice models’ (Timmermans et al. 1984) examine the social and economic characteristics of small areas and then, based on results from samples of the population, estimate how many of each population group will go to one facility and how many to another. When this is repeated for each small area (Goodchild 1984), the sum of all the estimates is the aggregate estimate of the use of each facility. In some cases, instead of modeling the choices of individuals, the user of the SDSS directly participates in the choice evaluation process. Jankowski and Ewart (1996), for example, used the spatial query capability of GIS to select plausible locations for rural health practitioners within an area. Their SDSS asked users to compare pairs of locations on given criteria, and then, using a multicriteria evaluation technique, to rank order the locations for this purpose. The GIS displayed the locations of the highest ranked alternatives.
2.2 Location Optimizing Algorithms
Many decision makers want to select locations to optimize some defined outcome. A large literature describes models for finding optimal locations, routes, etc.
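One standard member of this family is the p-median problem: choose p sites so that the total population-weighted distance from each demand point to its nearest chosen site is as small as possible. The sketch below is a brute-force illustration under stated assumptions, not the method of any particular SDSS. The demand points and candidate sites are invented, distances are straight-line, and exhaustive enumeration is practical only for very small instances.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical demand points as (x, y, population). Invented data.
demand: List[Tuple[float, float, int]] = [
    (0, 0, 50), (1, 0, 30), (5, 5, 40), (6, 5, 20), (10, 0, 10),
]
# Candidate facility sites; here simply the demand locations themselves.
candidates: List[Point] = [(x, y) for x, y, _ in demand]


def distance(a: Point, b: Point) -> float:
    """Straight-line distance between two points."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def total_weighted_distance(sites: Sequence[Point]) -> float:
    """Sum over demand points of population times distance to the nearest site."""
    return sum(
        pop * min(distance((x, y), s) for s in sites) for x, y, pop in demand
    )


def p_median(p: int) -> Tuple[Point, ...]:
    """Exhaustively search all p-subsets of candidates for the lowest total."""
    return min(combinations(candidates, p), key=total_weighted_distance)


best = p_median(2)
print(best, round(total_weighted_distance(best), 2))
```

The flexible analysis systems discussed in this section would typically rely on heuristic or mathematical programming solvers, since enumeration grows combinatorially with the number of candidates.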
SDSS often integrate such models with GIS so that, from a common user interface, the user can assemble the spatial data, solve the problem using a model selected from the interface, and then display the result. Ghosh and Rushton (1987) describe the evolution of location-allocation models from tools to find optimal locations for well-defined objectives to tools that can be incorporated in flexible analysis systems. In such systems, decision makers explore alternative location patterns and use optimizing algorithms to compute alternatives that they can evaluate against criteria of interest, some of which are highly qualitative in character. Other SDSS are designed to select geographic routes for the delivery of goods or services (Keenan 1998). These are often used in a GIS environment supported by inputs from mobile units that are monitored by global positioning systems (GPS) that report their locations at regular intervals to headquarters. This information is used in subsequent spatial assignment tasks. The principal spatial task of some SDSS is to form regions (Densham and Rushton 1996). Strapp and Keller (1996) developed a spatial decision support system to support the consolidation of agricultural land in the Czech Republic. Their link with GIS permitted the dynamic computation of spatial relationships between small pieces of territory and the farmsteads into which they are grouped at any point in the region formation process. Because of the ill-structured nature of this problem, they designed a spatial decision support system to permit users to work interactively with provisional regions to improve upon them.
Figure 1 Fundamental elements of spatial decision support systems
2.3 Group Decision Support Systems
Spatial decisions are often made by groups of people, involving them in collaborative efforts. Nyerges and Jankowski (1997) discuss using adaptive structuration theory in a GIS environment to assist these efforts for
situations in which groups are involved in public land planning controversies such as hazardous waste cleanup and habitat redevelopment. Some SDSS incorporate methods for reconciling different views. Couclelis and Monmonier (1995) discuss this problem and offer a computer-supported system to help promote the resolution of NIMBY-type (‘not in my back yard’) planning controversies. Their system uses GIS capabilities to present spatial details of the planning situation but uses the concept of the graphic script as a device for encouraging dialogue and reconciliation of opposing views. Armstrong (1993) describes how computer supported cooperative work (CSCW) computing environments can enhance SDSS by enabling group members to observe, manipulate, and evaluate alternative plans collectively and then provide mechanisms for conflict resolution.
3. Knowledge-based SDSS
In many situations, the assistance required by users of an SDSS lies in the knowledge of experts who are not available in person. Much current research in SDSS entails capturing the knowledge of experts and incorporating it in the analysis system so that this knowledge becomes available at the appropriate time to the user. An early paper by Armstrong et al. (1990) outlined this approach. Examples of such systems are beginning to appear (Leung et al. 1997). In fully developed knowledge-based SDSS, many of the activities of the user interface are bypassed (see Fig. 1) and are automated in the knowledge base. Some of the knowledge there is about the algorithms and their links to the decision task. Other knowledge is substantive knowledge about the processes that need to be modeled.
4. Implementations of SDSS
Many examples of prototype implementations of SDSS can be found in the academic literature. Commercial SDSS applications are usually domain-specific and are developed to support decision making for specific problems that are complex, yet occur with sufficient frequency to justify the overhead cost of developing and testing them. In many countries, each census is followed by a flurry of political redistricting activity. Increasingly, decision makers rely on SDSS for this task.
5. Conclusions
Currently, most SDSS are tailored for specific problem domains. Users often will not recognize that SDSS in quite different application domains have similar architectures and combine similar elements. The applications use the attribute language of the domain, even though the spatial data inputs, manipulations, and outcomes may be identical. SDSS represent the integration of ideas, methods, and techniques from many of the social, behavioral, and management sciences. Their usefulness in the policy and planning sciences is based on the degree to which they successfully incorporate the findings and approaches of these sciences. Their focus on the spatial and the geographic leads to a very important link with the evolving field of geographic information science. The rapid development of national spatial data infrastructures provides a rich and expanding database to which attribute information can be attached. SDSS, at their inception, were a derivative science that built upon the successful development of models of behavioral, social, and political processes. However, with the expanding use of SDSS in a wide variety of decision situations, theories and models of these processes will have to account for the use of SDSS. When normal decision-making patterns involve SDSS use, they will become an important element in social science theories and models to explain decisions in these areas. This evolution from the use of social and behavioral models as exogenous units in SDSS to their incorporation as endogenous units is an important development in the social sciences. This development will mark the more explicit incorporation of geographic variables in social science theory and method.
See also: Geographic Information Systems; Locational Conflict (NIMBY); Spatial Choice Models; Spatial Equity; Spatial Optimization Models
Bibliography
Armstrong M P 1993 Perspectives on the development of group decision support systems for locational problem-solving. Geographical Systems 1: 69–81
Armstrong M P, De S, Densham P J, Lolonis P, Rushton G, Tewari V K 1990 A knowledge-based approach for supporting locational decisionmaking. Environment and Planning B: Planning and Design 17: 341–64
Birkin M, Clarke G, Clarke M, Wilson A 1996 Intelligent GIS: Location Decisions and Strategic Planning. Geoinformation International, Cambridge, UK
Couclelis H, Monmonier M 1995 Using SUSS to resolve NIMBY: How spatial understanding support systems can help with the ‘Not in my backyard’ syndrome. Geographical Systems 2: 83–101
Crossland M D, Wynne B E, Perkins W C 1995 Spatial decision support systems: an overview of technology and a test of efficacy. Decision Support Systems 14: 219–35
Densham P J 1994 Integrating GIS and spatial modeling: visual interactive modeling and location selection. Geographical Systems 1: 203–21
Densham P J, Rushton G 1996 Providing spatial decision support for rural public service facilities that require a minimum work load. Environment and Planning B: Planning and Design 23: 553–74
Ghosh A, Rushton G (eds.) 1987 Spatial Analysis and Location-allocation Models. Van Nostrand, New York
Goodchild M F 1984 ILACS: A location-allocation model for retail site selection. Journal of Retailing 60: 84–100
Jankowski P, Ewart G 1996 Spatial decision support system for health practitioners: selecting a location for rural health practice. Geographical Systems 3: 279–99
Keenan P B 1998 Spatial decision support systems for vehicle routing. Decision Support Systems 22: 65–71
Leung Y, Leung K S, Shao Z, Lau C K 1997 A development shell for intelligent spatial decision support systems: 2. An application in flood simulation and damage assessment. Geographical Systems 4: 39–57
Nyerges T L, Jankowski P 1997 Enhanced adaptive structuration theory: a theory of GIS-supported collaborative decision making. Geographical Systems 4: 225–59
Strapp J D, Keller C P 1996 Towards an interactive spatial decision support system for agricultural land consolidation. Geographical Systems 3: 15–31
Timmermans H J P, van der Heyden R, Westerveld H 1984 Decisionmaking between multi-attribute choice alternatives: a model of spatial shopping behaviour using conjoint measurements. Environment and Planning A 16: 377–87
G. Rushton
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Spatial Equity
Although most social scientists may not know the term ‘spatial equity,’ almost everyone experiences it in everyday life. ‘Spatial’ implies that most of the experienced world, physical and social, takes up room; this has important social consequences. Elements of the natural and social world always have a location in space or a place, just as they do in time. Phenomena also have relationships of distance, spatial interaction, and accessibility with one another, and these attributes shape the causes and consequences of things. ‘Space’ has structure, that is, institutions and relationships which constrain what and who can be where; in other words, everything and everyone’s place on the landscape. ‘Equity’ is a social construct, formed by individuals, groups, or societies, as to the degrees and kinds of inequalities that are acceptable or ‘fair’ and those that are unacceptable or ‘unfair.’ Thus, the measure of equity is inescapably bound up with the individual’s or group’s position with respect to the basic human dilemma of individual and collective rights and responsibilities. There is no measure of where ‘fair’ ends and ‘unfair’ begins; society redefines the line as needed according to custom and law. Classic works include Perin’s Everything in Its Place (1977), Rawls’ A Theory of Justice (1971), and Harvey’s Social Justice and the City (1973). The latter was recently commemorated in Merrifield and Swyngedouw’s The Urbanization of Injustice (1997). On Rawls, see also Clark (1985).
1. The Meaning of Spatial Equity
Webster defines equity as ‘a legal system which supplements statutes or laws to correct, if necessary, its failure to account for what is morally just and fair.’ Societies may deal with questions of equity and fairness in the justice system or via regulations and legislated standards. Thus a 13-year-old may not claim an equity issue in not being allowed to drive, but a healthy 80-year-old can. The very fact of space implies that the notion of spatial equality (that all parts of space are of the same quality or value) is without meaning; every bit of space is different from every other. For this reason, equity is of heightened importance and difficult to obtain. In economic theory an efficient equilibrium should be equitable as well. Structurally, monopoly and inheritance, for example, lead to inefficient and inequitable outcomes. But space itself may make a structurally efficient pattern geographically inequitable. Consider a common situation of an uneven distribution of people, far denser on one side of a jurisdiction than the other. The most efficient, travel-minimizing location for a public facility will be strongly pulled to the densest core, leaving those in the low-density edge in a highly unequal position (Morrill 1973). The physical environment is incredibly diverse, and utterly unequal from place to place and over time in its utility or value to human production, residence, exchange, or enjoyment. For physical, cultural, historic, and political reasons, individuals, groups, corporations, institutions, and societies are similarly diverse, and highly unequal in physical and mental capacity, experience, and mobilization of resources. They are profoundly unequal in their power to structure the landscape, to alter the environment, or to control more valuable locations. Individuals, groups, and societies are continually aware of their unequal positions in space, which often leads to perceptions of unfairness on the part of the less territorially advantaged (those on the ‘wrong side of the tracks’), and on the part of moralists, ethicists, and liberals offended by or concerned with territorial inequality.
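The facility-location example in this section can be made concrete with a small numerical sketch. It assumes five hypothetical settlements along a single road, dense at one end and sparse at the other (positions and populations are invented). The travel-minimizing site is compared with a site chosen to minimize the longest trip, which is only one of several possible ways to operationalize a ‘fairer’ location.

```python
from typing import List, Tuple

# Hypothetical settlements along one road: (position in km, population).
# Invented data: dense near km 0-2, sparse toward km 20.
settlements: List[Tuple[int, int]] = [(0, 100), (1, 80), (2, 60), (10, 10), (20, 5)]


def total_travel(site: int) -> int:
    """Efficiency criterion: total person-kilometres travelled to the site."""
    return sum(pop * abs(pos - site) for pos, pop in settlements)


def worst_trip(site: int) -> int:
    """One possible equity criterion: the longest trip any settlement faces."""
    return max(abs(pos - site) for pos, _ in settlements)


sites = [pos for pos, _ in settlements]
efficient_site = min(sites, key=total_travel)
equitable_site = min(sites, key=worst_trip)

for label, site in (("efficient", efficient_site), ("min-max", equitable_site)):
    print(label, "site at km", site,
          "| total travel:", total_travel(site),
          "| worst trip:", worst_trip(site))
```

In this invented case the travel-minimizing site sits in the dense core and leaves the most remote settlement 19 km away, while the min-max site roughly halves the worst trip at several times the total travel.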
1.1 Spatial Equity and Social Inequality
Inequality of power and of rewards itself may lead to resentment and a profound sense of unfairness of the social system. At the societal level, the equity issue is the social system itself, and the battles are more structural than spatial: for example, calls for socialism or communism as the solution for unequal power and wealth, or battles over minimum wages and tax rates. An exception is perhaps the demand for political self-determination of colonies, struggles that were more territorial and cultural than structural. But within a nation and social system, as in the USA, although there may be many who want to change the system, the majority of citizens accept the competitive and individual system and understand that unequal power and wealth are inherent to the system, that the rich and powerful can and will preempt the ‘best places.’ Thus a class-segregated society as such is not a spatial equity issue. After all, that poor person may win the lottery! Rather, issues of spatial equity arise when people feel unfairly treated, although they’re playing within the rules of the game; when they feel that the rich and powerful manipulate the law itself to get more than the share their competitiveness and wealth warrant. An example is the conviction of poverty area residents that since they pay taxes, too, the potholes on their streets should have no lower priority than the potholes on the streets of the rich. There are many examples of spatial inequality that raise equity issues. Starting at the global scale are historical colonialism and imperialism and contemporary dependency theory, hypothesizing that unfair power relations exploit and impoverish the Third World. Less technologically advanced cultures distrust and fear the power of Western and especially of US culture. Smith (1977, 1994) provides extensive treatment of inequality from the global to the local scale. See also Knox (1975). Within nations, prominent bases for raising issues of equity include (a) regional inequality, the sense of exploitation of the periphery by the core, or of poor regions by rich regions; (b) unequal access to public facilities and resources; (c) unequal or unfair treatment of areas because of their poverty, wealth, or power; (d) spatial manifestations of the sports analogy that a competitive society should have a ‘level playing field;’ and (e) spatial aspects of unequal treatment of categories of people, because of race, ethnicity, language, religion, lifestyle, and gender.
1.2 Regional Inequality
Within the European Union, a dominant part of the budget and a large amount of research and rhetoric are devoted to the issue of lagging regions, built on the perception that it is somehow ‘unfair’ that some regions are prosperous and growing while others are poor and languishing; that it shows the failure of the system, rather than of the region. In US economic history, class struggles often seemed secondary to the resentment of the south and west against the hegemony of the northeast. In a generally private economy, the US has a long tradition of regional redistribution via public programs; for example, the Reclamation Act subsidizing irrigation development in the West, federal defense outlays and massive highway programs, TVA and the Bonneville Power Administration, and the Regional Development Commissions, were all designed to improve the competitive position of ‘disadvantaged’ parts of the country (Johnston 1980). Harrington’s classic The Other America (1962) raised the nation’s consciousness of poverty and helped lead to the War on Poverty. Efficient market decisions frequently take the form of the closure of factories and consequent community decline, leading to forced out-migration. In these cases, ‘pure market’ solutions implying the ‘shipment’ of the surplus labor to growing regions confront the immense nonmarket value of places to their long-term residents, and a strong sense of unfairness arises, that the decline is not the fault of local people (Clark 1983). Societies often institute a variety of programs to reduce the social and economic costs, as through subsidies, community purchase of the closed plant, tax forgiveness, or retraining and relocation assistance. The same principle is applied at the local level through federal or local programs such as slum clearance, public housing, and ‘enterprise or empowerment zones,’ reflecting a political consensus that the landscape had become unfairly unequal, presumably because of forces beyond local control.
1.3 Voting and Representation
In the USA, perceptions of unfairness in representation led to a series of Supreme Court decisions outlawing malapportionment (unequal population per district) and racial gerrymandering which disenfranchised minorities (initial decisions ruled against ‘diluting’ and ‘packing’ minority voters, but later rulings forbade gerrymandering to maximize minority representation). Ironically, representation in the US Senate intentionally violated the equal population principle, since every state has two senators, but this inequality was defended on grounds of avoiding unfair domination of smaller by larger states.
1.4 Overcoming the ‘Unfairness’ of Nature
As communities develop and compete with other places, perceptions of being treated ‘unfairly by nature’ may arise. At the international scale, such perceptions may be used to justify invasions and wars. Within countries, governments at all levels spend huge sums to overcome natural limits, so they may ‘compete on that more level playing field.’ Examples are bridges, tunnels, and artificial harbors, built to improve the accessibility of a place to national and global markets.
1.5 Access to Public Facilities and Resources This is the arena for which the word equity is most likely to be invoked, and for which legal remedies may be sought, on grounds of ‘equal protection’ under law (Pinch 1985). In the USA, until as recently as the 1960s it was the perception and probably the reality that weaker, poorer places among a set of jurisdictions, and weaker and poorer areas within a jurisdiction, were unequally and unfairly treated. Examples of inequitable access to public services and facilities include the maintenance and quality of utilities and infrastructure, like roads; the location and quality of parks, playgrounds and pools, or of libraries; and the timeliness of police response to calls, as if action were a function not of citizenship but of wealth, which it was. As a consequence of a series of court decisions over many years, cities have become far more attentive to more equal treatment across all parts of their jurisdictions, although huge historic inequities remain. The construction and operation of public transportation systems have similarly been subject to scrutiny, as to the costs and benefits of subsidies to different parts of a region; poor, more transit-dependent districts no longer meekly accept inferior service and equipment. In most of the USA, school districts are a distinct set of governments. District populations are astoundingly unequal in income, and districts in tax bases. In many but not all states, courts
have interpreted state constitutions to mean that all students should have ‘equal access’ to education, wherever they live. This has resulted in substantial redistribution of resources from richer to poorer districts, as well as to the subsidy of ‘remote but necessary’ small schools in very isolated communities.
Although men and women are usually residentially integrated, the history of gender inequality resulted in unequal access to opportunities and resources, and in insecure built environments. And because of this history the very subject of inequitably gendered landscapes has only recently been studied (Pratt and Hanson 1994).
1.6 The Location of Undesirable Facilities: Environmental Racism
A different issue of equity arises as to the differential ability of stronger and richer or weaker and poorer jurisdictions or parts of cities to ward off undesirable private and public activities. Less advantaged places and areas widely perceive that they have long been and remain dumping grounds for unattractive and even damaging facilities: prisons, halfway houses, incinerators and landfills, and services to the ‘less desirable’ folks (drug addicts, alcoholics, the homeless); adult shops and theaters, and polluting and noisy factories. If the obnoxious facilities or activities are in segregated racial minority areas, charges of ‘environmental racism’ may be made. Poorer and minority areas perceive also that they are treated as tolerance zones for illicit activities and inadequate law enforcement. More generally, these areas are characterized by loss of jobs, retailing and services, and middle-class housing, leading to spatial mismatch between a declining inner city and more affluent suburbs. In more liberal cities and states, a consensus may arise for some degree of ‘sharing’ of the ‘burden’ of housing the less affluent, as through requirements for ‘fair shares’ of subsidized affordable housing, even in suburbs. In a very few metropolitan areas, where extreme political fragmentation results in highly inequitable resources for jurisdictions, a sense of injustice and unfairness has resulted in a degree of tax-base sharing, as for police and sewerage services.
2. Reactions to Spatial Inequity
As most of these examples show, the achievement of greater spatial equity seems limited to those aspects of everyday life in which public resources or facilities are involved. Social intervention in the form of redistribution of resources toward weaker places and areas seems to occur when a social consensus of ‘market failure’ arises, that is, that the ‘playing field’ has become too uneven, within the prevailing social order. Within the social system, it proves easier also to address inequities in some arenas of life than in others. Not surprisingly it is less threatening to those on top, usually the holders of wealth, to accept equalizing measures with respect to political participation and voting and social dimensions (race, language and gender, although these remain highly contested), and even, if rarely, as to urban housing, than with respect to the truly key economic arena of income, taxes and inheritance. In the USA, issues of equity and redistribution were fought mainly at the national level for most of the post-world war period. Since the mid-1980s and national shifts toward the right (that is, toward greater individual than collective responsibility), the arena of action has moved to local and state levels, where coalitions of activist groups have sometimes been able to gain enough power to enact redistributive plans unimaginable at the national level, for example, high minimum wages, rent controls, or mandates for affordable housing, in a process that might be called ‘pluralist democracy’ (see also Postmodern Urbanism). In other areas and arenas conservative coalitions stress faith in the market, privileging its efficiency over concerns for equity. Indeed, the concepts of efficiency and equity are the very symbols of differing ideological stances toward inequality. The treatment of equity (including this discussion) is usually based on a liberal assumption of people’s similarity in needs and values, ideas going back at least to John Stuart Mill.
The new politics of identity and difference may complicate the meaning and evaluation of equity (Keith and Pile 1993, Young 1990).
1.7 Urban Inequality: Unequal treatment of categories of people Many books focus on spatial inequality in US and European cities, e.g. Badcock (1984), Gottdiener (1985) and Katznelson (1981). An influential segment of the intellectual and planning community advocates fairly radical efforts to break down the high degree of racial and class segregation, as through stronger zoning control of settlement. The failed programs of mandatory bussing for school desegregation were similarly based on a sense of injustice and on the premise that mixing in classrooms would lead to mixing in neighborhoods. As with schools, current efforts toward class mixing seem unlikely to be a match for the stronger societal forces for segregation.
2.1 The Analysis of Spatial Equity Maps, and now GIS (geographic information systems), have proven highly effective in revealing the structure of inequality in the landscape: patterns of segregation, poverty, concentration of pollution, and unequal access. They have also proven essential in convincing legislative bodies and courts to address equity issues. A variety of both statistical and mathematical models have been used to measure inequality and its costs, and to assess the impacts of policy changes. Awareness of the human costs of spatial inequality may be increased through more ethnographic studies of the situation of individual households.
See also: Access: Geographical; Equity; Location: Absolute/Relative; Location Theory; Locational Conflict (NIMBY); Postmodern Urbanism; Racism, Sociology of; Residential Segregation: Geographic Aspects; Spatial Interaction Models; Spatial Pattern, Analysis of; Urban Poverty in Neighborhoods
Bibliography
Badcock B 1984 Unfairly Structured Cities. Basil Blackwell, Oxford, UK
Clark G 1983 Interregional Migration, National Policy and Social Justice. Rowman and Allanheld, Totowa, NJ
Clark G 1985 Making moral landscapes: John Rawls’ original position. Political Geography Quarterly 5: 147–62
Gottdiener M 1985 The Social Production of Space. University of Texas Press, Austin, TX
Harrington M 1962 The Other America. Macmillan, New York
Harvey D 1973 Social Justice and the City. Johns Hopkins University Press, Baltimore, MD
Jackson P, Smith S 1984 Exploring Social Geography. Allen and Unwin, London
Johnston R 1980 Geography of Federal Spending in the United States. Wiley, Chichester, UK
Katznelson I 1981 City Trenches. Pantheon, New York
Keith M, Pile S (eds.) 1993 Place and the Politics of Identity. Routledge, London
Knox P 1975 Social Well-being: A Spatial Perspective. Oxford University Press, London
Kodras J, Jones J P 1995 Geographic Dimensions of US Social Policy. Edward Arnold, New York
Merrifield A, Swyngedouw E 1997 Urbanization of Injustice. New York University Press, New York
Morrill R 1973 Efficiency and equity aspects of optimum location models. Geographical Analysis 9: 215–26
Perin C 1977 Everything in its Place. Princeton University Press, Princeton, NJ
Pinch S 1985 Cities and Services. Routledge, London
Pratt G, Hanson S 1994 Geography and the construction of difference. Gender, Place and Culture 1: 5–30
Rawls J 1971 A Theory of Justice. Harvard University Press, Cambridge, MA
Smith D 1977 Human Geography: A Welfare Approach. Edward Arnold, London
Smith D 1994 Geography and Social Justice. Blackwell, Cambridge, UK
Walzer M 1983 Spheres of Justice. Basic Books, New York
Young I 1990 Justice and the Politics of Difference. Princeton University Press, Princeton, NJ
R. Morrill
Spatial Interaction The study of spatial interaction has been carried on actively for many years in the field of geography, with its emphasis on the spatial aspects of cities and regions. Spatial interaction studies in geography may be divided into those dealing with spatial structure and those dealing with attempts to understand and model these structures. Spatial structure includes the study of: linkages, the connections between centers; nodes, the centers themselves; hinterlands, the tributary areas surrounding the nodes; and hierarchies, the systems composed of hinterlands, nodes, and linkages. Each of these will be examined briefly, with a particular emphasis on Chicago, as an example of a large, centrally located focus of spatial interaction. Then, some models of spatial interaction will be discussed as examples of attempts to describe and explain the structural systems of linkages, nodes, hinterlands, and hierarchies.
1. Structural Systems of Spatial Interaction 1.1 Linkages and Nodes Linkages are a basic part of spatial interaction and may be studied at several different geographic scales. At the local scale, dense and complex networks of commuting linkages surround large metropolitan centers connecting city and suburban residential areas to employment areas. In the case of Chicago, certain corridors have been traced out by the flow of commuting traffic. These corridors developed initially in connection with commuter railroads and were later massively reinforced and ultimately replaced by expressways which both compete with and complement the rail linkages. Shopping and recreational flows form quite different sets of linkages with a more dispersed and multicentered pattern. In Chicago and other large metropolitan centers, beltways in the form of high-capacity peripheral expressways encircling the city have become important additional linkages connecting suburbs and connecting with the major radial corridors leading into the central city. At the national scale, certain primary corridors have developed in the network of linkages. Chicago is at the western end of a major corridor extending through the Midwest to connect with the Megalopolis corridor extending from Washington to Boston. In air transportation, quite a different national pattern of linkages has developed, dominated by the traffic flows between large cities so that Chicago’s strongest linkages are with New York and Los Angeles rather than with nearby cities.
Nodes within networks represent another structural aspect of spatial interaction study. The relative importance of a node is determined by the size and nature of its linkages. At the local level important nodes are located along the major radial corridors and near the beltways. Nodes located in areas where beltways intersect radials have particularly high accessibility to the rest of the metropolitan area. At the national level, the development of nodality has been closely linked to technological change. Chicago’s national-scale nodality in the railroad era was unmatched by any other large US city. As the US highway system and the Interstate Highway System emerged, however, Chicago’s nodality was no greater than that of a number of other large metropolitan areas. With the coming of high-speed, high-capacity jet airline service in the 1960s, Chicago again served as a major focal point, ranking with New York and Los Angeles in number of high-density linkages. When national communication networks are considered, however, a different pattern of spatial interaction emerges. For television, entertainment, and publishing, there is a strong focus on the New York and Los Angeles nodes.
1.2 Hinterlands and Hierarchies Spatial interaction has also been studied in terms of hinterlands and hierarchies. Systems of linkages and nodes give rise to groups of tributary areas, or hinterlands which, in turn, may be organized into hierarchical networks. Some of the earliest geographic studies of spatial interaction dealt with the delimitation of local hinterlands. These could be based on commuting, retail market areas, social trips, recreational trips, newspaper circulation, or shipments of goods. Chicago’s large metropolitan area, which is mainly defined by its commuting linkages, also contains within it a number of small cities each with its own commuting and retail hinterlands. The result is a complex, fine-grained hierarchical structure of central places. At the national scale, Chicago itself is embedded into several hierarchical systems. In air passenger traffic, for example, New York has become the single most dominant center. By the 1990s, it was the leading passenger generator not only for a group of adjacent cities but also for such large more distant cities as Chicago, Atlanta, and Los Angeles. In hierarchical fashion, each of these centers dominates its own surrounding hinterland. In terms of scheduled airline traffic as opposed to air-passenger origin and destination, a hierarchical hub-and-spoke system has developed. Short-haul connections for a particular airline focus on a single hub where passengers transfer to connecting flights to other hub centers. Communications networks, such as those developed in the film and television industries, have a clearly bipolar structure focused on New York and Los Angeles. Other large metropolitan centers, such as
Chicago, occupy far less central positions in the system. The rise of the Internet is in the process of creating still a different structure of spatial interaction. To date, the system has been a dispersed one in which it is difficult to detect major nodes such as those marking hierarchical or bipolar systems. At the global scale, the early pattern of world trade routes focused on Britain and a number of strategically located ports has given way to an extensive system focused on major world cities such as London, New York, and Tokyo. There is also some evidence of a hub-and-spoke system, with the major world cities serving as regional centers, distributing international traffic to their surrounding areas. London serves European cities; New York, US cities; Miami, Latin American cities; and Tokyo, Hong Kong, and Singapore serve Asian cities.
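The statement in Section 1.1 that a node's relative importance follows from the size of its linkages can be illustrated with a minimal sketch. The link volumes and node labels below are invented for illustration only, and `nodal_strength` is a hypothetical helper rather than a measure defined in this article.

```python
from collections import defaultdict

# Invented link volumes between four nodes; "H" plays the role of a hub.
links = [
    ("H", "A", 90),
    ("H", "B", 70),
    ("H", "C", 15),
    ("A", "B", 40),
]


def nodal_strength(links):
    """Sum the volume of every linkage that touches each node."""
    strength = defaultdict(int)
    for a, b, volume in links:
        strength[a] += volume
        strength[b] += volume
    return dict(strength)


strength = nodal_strength(links)
# Nodes ordered from most to least heavily linked.
ranking = sorted(strength, key=strength.get, reverse=True)
print(ranking)
```

Summing link volumes is only one possible reading of 'size'; the 'nature' of linkages (mode, direction, type of traffic) would need additional attributes on each link.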
2. Spatial Interaction Models Models have ranged from the descriptive through the explanatory, predictive, and normative. Useful descriptions of the comparative accessibility of given networks, or of given nodes within a network, have been provided by graph-theoretic models. Hubs may be identified, and the differing accessibility of industrial cities may be measured and compared. Such measures, both for nodes and for entire networks, permit planners to evaluate the effects of proposed new linkages on nodal and network accessibility. More explanatory models evolved from earlier descriptive efforts as attempts were made to identify some of the underlying forces shaping systems of spatial interaction. There is a large literature dealing with the observed tendency for interaction between any two cities to increase as the distance between them decreases. This relationship was first described as a gravity model: interaction is directly proportional to the product of the two cities’ populations and inversely proportional to the distance between them. It soon became apparent, however, that spatial interaction was influenced by many other factors. Distance had different effects for different kinds of interaction. Highway traffic, for example, was much more sensitive to distance than was air traffic. Other characteristics of cities such as per-capita income, economic complementarity, and community of interest had to be considered. The spatial structure of the regions surrounding the cities was also found to have effects in the form of intervening opportunities for spatial interaction and of the proximity of competing origin or destination cities. As the explanatory models continued to ramify, they evolved into a series of traffic demand equations incorporating a number of variables that had been shown empirically to affect spatial interactions. Intra-urban spatial interaction, that is, the flow of traffic within large cities, has come in for special attention in recent years.
Initially, aggregate explanatory models were used by planners and engineers as a basis for predicting and improving urban transportation. These models were codified and used in large traffic surveys and analyses. Cities were divided up into analysis zones and detailed surveys were carried out. Demand equations were used to estimate trip generation by analysis zones and the trips were then allocated to existing transport network facilities and used as the basis for future planning. Later, these aggregate models generally were replaced by models designed to facilitate an understanding of individual travel behavior. Data were collected and analyzed at an individual or household level rather than at an aggregate traffic-zone level. These behavioral studies addressed questions of travel choice at an individual level. How did such attributes as comfort, reliability, safety, waiting time, or privacy relate to time and cost? How did the attributes interact with such travel characteristics as time and cost? What trade-offs were involved, as travelers would choose between such present or planned types of service as auto, bus, express bus, park-n-ride, car pools, high-occupancy traffic lanes, light rail, and others? Spatial interaction was also modeled in a normative fashion. The most efficient patterns of linkages between a given set of supply nodes and a given set of demand nodes could be ascertained with programming models. Constraints could then be introduced to account for such things as bottlenecks, capacity limitations, or particular market characteristics. Also, if it were desirable to maximize profits rather than minimize transport costs, a more complex set of economic functions would be used at the various cities. In addition to optimal linkage patterns, programming models have been used to identify optimal hinterland or tributary area patterns.
An optimal set of school districts, for example, can be obtained by minimizing the travel times to a given set of schools with varying capacities. Similar studies have been made for tributary areas of supermarkets, rural hospitals, and others. Normative models have also been used for purposes other than the establishment of an optimal system of linkages or hinterlands. Sensitivity analysis permits the planner to view the consequences of a number of alternative proposed changes in the transport facilities. Each proposed change can be run through the model and the minimum costs identified. For example, in a school-district case, decisions as to the optimum location for adding or eliminating a school could be aided by running the model for each alternative. Trade-offs could then be evaluated. The cost in increased travel time of certain socially desirable alternatives, such as adding to the ethnic or racial diversity of each school, could be ascertained and evaluated in the light of other possible policy alternatives. Thus, there is a gradation from descriptive models to explanatory, predictive, and normative models, all of which may provide improved understanding of
spatial interaction as it is expressed in structural systems of linkages, nodes, hinterlands, and hierarchies. See also: Access: Geographical; Communication: Geographic Aspects; Market Areas; Spatial Choice Models; Spatial Data Infrastructure; Spatial Interaction Models; Spatial Labor Markets; Spatial Optimization Models; Transportation Geography; Urban Activity Patterns
Bibliography
Fleming D, Hayuth Y 1994 Spatial characteristics of transport hubs: Centrality and immediacy. Journal of Transport Geography 2: 3–18
Fotheringham A S, O’Kelly M E 1989 Spatial Interaction Models: Formulations and Applications. Kluwer Academic Publishers, Dordrecht
Golledge R G, Rushton G (eds.) 1976 Spatial Choice and Spatial Behavior. The Ohio State University Press, Columbus
Haggett P, Chorley R J 1969 Network Analysis in Geography. Edward Arnold, London
Hanson S (ed.) 1995 The Geography of Urban Transport (2nd edn.). Guilford Press, New York
Hoyle B S, Knowles R D (eds.) 1992 Modern Transport Geography. Belhaven Press, London
Janelle D 1969 Spatial reorganization: A model and a concept. Annals of the Association of American Geographers 5: 348–54
Pred A R 1966 The Spatial Dynamics of U.S. Urban-Industrial Growth, 1800–1914. MIT Press, Cambridge
Taaffe E J, Gauthier H, O’Kelly M E 1996 The Geography of Transportation, 2nd edn. Prentice-Hall, Upper Saddle River, New Jersey
Ullman E 1956 The role of transportation and the bases for interaction. In: Thomas W L (ed.) Man’s Role in Changing the Face of the Earth. University of Chicago Press, Chicago
Vance J E Jr 1980 Capturing the Horizon: The Historical Geography of Transportation. Harper and Row, New York
E. J. Taaffe
Spatial Interaction Models 1. Definitions and Uses Spatial interaction is the general term for any movement of people, goods, or information over space that results from a decision-making process. Specific examples include movements such as migration, shopping trips, commuting, trips for recreational purposes, trips for educational purposes, freight flows, the spatial pattern of telephone calls, emails and world-wide web connections, and the use of healthcare facilities. Consequently, spatial interaction is a very broad topic, and spatial interaction models have a wide range of applications (see also Spatial Interaction). Olsson (1970, p. 233) states:
The concept of spatial interaction is central for everyone concerned with theoretical geography and regional science … Under the umbrella of spatial interaction and distance decay, it has been possible to accommodate most model work in transportation, migration, commuting, and diffusion, as well as significant parts of location theory.
Spatial interaction models are mathematical descriptions of spatial flows (see Mathematical Models in Geography). The models serve two purposes. One is to predict flows where they are unknown. For example, spatial interaction models have been used to predict the impact on traffic patterns of a new development (O’Kelly 1986), to predict the impact a new store will have on trade at existing stores (Fotheringham and O’Kelly 1989), and to predict the optimal spatial arrangement of healthcare facilities (Mayhew et al. 1986). The other is to yield information on the determinants of the flow system being analyzed. For example, spatial interaction models are commonly used to determine the influence of various store attributes on the choice of stores by consumers (Fotheringham and Trew 1993) or to examine the characteristics of places which affect in and out migration (Boyle et al. 1998, Flowerdew and Salt 1979). Some questions that can be answered with the judicious use of spatial interaction models include: (a) Is there sufficient demand between two cities to justify direct air service and what frequency of air service will be sufficient to meet the demand? (b) Where is the optimal location for a new supermarket within a metropolitan area? (c) What will the effect on traffic patterns be of building a new bridge or tunnel across a river? (d) What can a store do to attract new customers— what features of the store are most likely to attract or deter customers? (e) Are migrants becoming less constrained by distance over time? (f ) What influence does a political or administrative boundary have on students’ choices of a university? The remainder of this article examines some of the basic elements of spatial interaction modeling. It then describes four common types of spatial interaction models and outlines a typical application of each type of model. 
Brief discussions of model calibration and local forms of spatial interaction models are then given followed by a description of the various underlying theoretical structures that have been proposed to substantiate spatial interaction models.
2. The Basic Elements of Spatial Interaction Models The starting point for spatial interaction modeling is a matrix of flows between a set of origins, each of which
is usually labeled i, and a set of destinations, each of which is usually labeled j. The task is to represent the number of trips between any i and j, Tij, as a function of origin characteristics, destination characteristics, and the separation between origins and destinations. Hence spatial interaction models contain four basic elements: (a) a measurement of the spatial interaction (b) a measurement of the propulsiveness of each origin (that is, a measure of how many flows begin at each origin) (c) a measurement of the attractiveness of each destination (that is, how many flows terminate at each destination) (d) a measurement of the spatial separation between origins and destinations. Each of these elements can be measured in different ways depending on the type of spatial interaction being modeled. Usually, distances between origins and destinations are used to measure spatial separation although cost and time can also be used. Regardless of how spatial separation is measured, it is often the most important determinant of spatial interaction patterns: individuals tend to be highly constrained by space for most types of spatial interactions. However, spatial separation is not the only determinant of spatial interaction patterns. For example, while an individual is likely to patronize a supermarket close to home, it may not always be the supermarket closest to home that is the most attractive: a supermarket farther away but offering a wider range of products or cheaper prices might be more attractive. This trade-off between proximity and other destination attributes is the essence of spatial interaction modeling. Sometimes, the number of trips leaving each origin or the number of trips terminating at each destination is known from surveys or from some other exogenous source. In such cases, this information is used directly in the model. 
When such information is not known, the model has to contain surrogates of origin propulsiveness or destination attractiveness or both. Table 1 gives some ways in which origin propulsiveness and destination attractiveness might be measured for different types of spatial interaction. Different forms of spatial interaction model arise from having different amounts of information on the number of trips leaving each origin and terminating at each destination. These different types of spatial interaction model and how they arise are now described.
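The four basic elements listed in this section can be laid out concretely. The sketch below is a minimal illustration under stated assumptions: the flow matrix and the coordinates are invented, and straight-line distance stands in for spatial separation (cost or time could be substituted). Origin and destination totals are simply read off the observed flows, as would be done when they are known from a survey.

```python
import math

# (a) Invented observed flows T[i][j]: 3 origins (rows) to 2 destinations (columns).
T = [
    [30, 10],
    [5, 25],
    [20, 20],
]

# Invented planar coordinates (km) for the origins and the destinations.
origins = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
destinations = [(1.0, 0.0), (4.0, 4.0)]

# (b) Trips beginning at each origin i (row sums).
O = [sum(row) for row in T]

# (c) Trips terminating at each destination j (column sums).
D = [sum(col) for col in zip(*T)]

# (d) Spatial separation d[i][j], here Euclidean distance.
d = [[math.dist(o, dest) for dest in destinations] for o in origins]

print(O, D)
```

When totals such as `O` and `D` are not available, surrogate measures of the kind listed in Table 1 would take their place.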
Table 1 Possible measures of origin propulsiveness and destination attractiveness
Type of spatial interaction | Origin propulsiveness | Destination attractiveness
Shopping trips | Population; household income | Store size; number of parking spaces; variety of products; price levels
Commuting flows | Population; household income; car ownership rate | Number of jobs
Migration | Population; crime rate; unemployment rate; wage rate | Population; crime rate; unemployment rate; wage rate; climate
3. A ‘Family’ of Spatial Interaction Models It was established above that in order to predict the volume of interaction between places, it is necessary to have three types of information about those places: a measure of origin propulsiveness, a measure of destination attractiveness, and a measure of spatial separation. The last is readily available as distances between locations can easily be calculated, especially if the locations are stored digitally. This produces four possible situations in which different levels of information are available on origin propulsiveness and destination attractiveness. In turn, these situations give rise to four different forms of spatial interaction model which are collectively referred to as a ‘family of spatial interaction models.’ Fotheringham and Dignan (1984) demonstrate that they are actually four extreme points on a continuum of spatial interaction models.
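The four situations can be enumerated by asking two yes/no questions: are the trip totals leaving each origin known, and are the trip totals arriving at each destination known? The sketch below is a hypothetical helper (`model_form` is not a name from the literature). The label used when both totals are known, 'doubly constrained', is an assumption here, since this section does not name that case.

```python
def model_form(origin_totals_known: bool, destination_totals_known: bool) -> str:
    """Name the model form implied by which trip totals are known."""
    if origin_totals_known and destination_totals_known:
        return "doubly constrained"  # assumed label; not named in this section
    if origin_totals_known:
        return "production-constrained"
    if destination_totals_known:
        return "attraction-constrained"
    return "unconstrained"


# Enumerate all four combinations of available information.
forms = {
    (o, d): model_form(o, d)
    for o in (False, True)
    for d in (False, True)
}
for known, name in forms.items():
    print(known, name)
```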
3.1 The Unconstrained Model Suppose a measure of spatial separation is available but there is no information on how many trips originate at each origin and how many trips terminate at each destination. These latter two values need to be estimated within the spatial interaction model by surrogate variables such as those described in Table 1. The appropriate spatial interaction model in such a situation is known as an unconstrained model with the following general form:
$$T_{ij} = k \, V_{i1}^{\alpha_1} V_{i2}^{\alpha_2} \cdots V_{ip}^{\alpha_p} \, W_{j1}^{\lambda_1} W_{j2}^{\lambda_2} \cdots W_{jq}^{\lambda_q} \, d_{ij}^{\beta} \tag{1}$$
where Tij is the number of trips that begin at origin i and end at destination j, Vi represents an origin attribute that is a surrogate for the number of trips beginning at i, Wj represents a destination attribute that is a surrogate for the number of trips ending at j, and dij represents the spatial separation (assumed here to be measured by distance) between origin i and destination j. The number of origin and destination variables included in the model depends on the type of spatial interaction being modeled and the information available. Here the number of origin variables is denoted by p and the number of destination variables is denoted by q. In some situations there might only be one origin attribute and one destination attribute in the model. The parameters α, λ, and β reflect the influence of each of the exogenous variables in the model on the volume of interaction and the values are obtained in the calibration of the model (see below). For example, if Vi were population, it would be expected that α1 would be positive: as the population of an origin increased, a greater volume of interaction would emanate from it, everything else being equal. Similarly, β would normally be negative: as the distance between any two places increases, the volume of interaction between them decreases, everything else being equal. The exact rate at which interaction decreases or ‘decays’ as distance increases is a matter for empirical evaluation. One of the fascinating aspects about spatial interaction modeling is the ability to estimate these parameters in different systems and to compare spatial behavior in different systems and over time. For instance, the value of the distance-decay parameter, β, has been shown to vary across different economic systems, over time, by income group, and by gender (Southworth 1983). Generally, the deterrence of distance for interaction is less in more developed economies and there is a gradual reduction in distance decay over time as transportation facilities improve and as knowledge about different locations increases. Finally, the parameter k in the model is a scale factor that simply balances the units on both sides of the equation and is of no great behavioral importance.
3.2 The Production-constrained Model Suppose data on spatial separation and on the numbers of trips emanating from each origin are available but there are no data on the number of trips that terminate at each destination. The appropriate spatial
interaction model for this situation is known as a production-constrained model and it has the form:

$$T_{ij} = O_i \, \frac{W_{j1}^{\lambda_1} W_{j2}^{\lambda_2} \cdots W_{jq}^{\lambda_q} \, d_{ij}^{\beta}}{\sum_j W_{j1}^{\lambda_1} W_{j2}^{\lambda_2} \cdots W_{jq}^{\lambda_q} \, d_{ij}^{\beta}} \tag{2}$$

where $O_i$ is the total number of trips beginning at origin $i$ and it replaces the surrogate origin propulsiveness variables in the unconstrained version. The model essentially distributes the number of trips from an origin in proportion to the relative attractiveness of each destination as measured by the numerator of Eqn. (2). The denominator acts as a constraint on the predicted trips so that

$$\sum_j T^{*}_{ij} = O_i \quad \text{for every origin} \tag{3}$$

where $T^{*}_{ij}$ is the predicted value of $T_{ij}$ from the model. The model is sometimes represented as:

$$T_{ij} = A_i O_i \, W_{j1}^{\lambda_1} W_{j2}^{\lambda_2} \cdots W_{jq}^{\lambda_q} \, d_{ij}^{\beta} \tag{4}$$

where

$$A_i = \Big[ \sum_j W_{j1}^{\lambda_1} W_{j2}^{\lambda_2} \cdots W_{jq}^{\lambda_q} \, d_{ij}^{\beta} \Big]^{-1} \tag{5}$$

Production-constrained forms of spatial interaction models are encountered very frequently, particularly in retailing where estimates of the number of trips leaving each origin can be obtained from census population data. The models are used to discover the characteristics of stores that attract consumers and to predict the proportion of people in each residential neighborhood shopping at each store. Once such a model is calibrated, it can be used to estimate a new store’s potential market and its impact on the sales at existing stores. The model can be used to estimate sales at a variety of possible locations to aid in the optimal location of a store. It can also help retailers decide on the optimal strategy to attract more customers to a store. For example, would more customers be attracted to a store if the number of parking spaces were doubled or if price levels were reduced slightly?

3.3 The Attraction-constrained Model

Suppose data on spatial separation and on the numbers of trips terminating at each destination are available but there are no data on the number of trips beginning at each origin. In this case, an attraction-constrained spatial interaction model is the most appropriate form because it uses the known inflow totals into each destination and a set of surrogate variables to represent the total number of trips beginning in each origin. The model has the general form:

$$T_{ij} = D_j \, \frac{V_{i1}^{\alpha_1} V_{i2}^{\alpha_2} \cdots V_{ip}^{\alpha_p} \, d_{ij}^{\beta}}{\sum_i V_{i1}^{\alpha_1} V_{i2}^{\alpha_2} \cdots V_{ip}^{\alpha_p} \, d_{ij}^{\beta}} \tag{6}$$

where $D_j$ represents the total number of trips terminating at destination $j$ and this replaces the destination attractiveness surrogates in the unconstrained model. The attraction-constrained model is the complement of the production-constrained model. In this case the model allocates the number of trips terminating at a destination across the various origins in proportion to the relative propulsiveness of each origin as measured by the numerator of Eqn. (6). The denominator again acts as a constraint on the predicted trips but in this case, the constraint is on the number of predicted trips into each destination so that

$$\sum_i T^{*}_{ij} = D_j \quad \text{for every destination} \tag{7}$$

where $T^{*}_{ij}$ again represents the predicted value of $T_{ij}$ from the model. The model can also be represented as:

$$T_{ij} = B_j D_j \, V_{i1}^{\alpha_1} V_{i2}^{\alpha_2} \cdots V_{ip}^{\alpha_p} \, d_{ij}^{\beta} \tag{8}$$

where

$$B_j = \Big[ \sum_i V_{i1}^{\alpha_1} V_{i2}^{\alpha_2} \cdots V_{ip}^{\alpha_p} \, d_{ij}^{\beta} \Big]^{-1} \tag{9}$$
The terms $A_i$ and $B_j$ in the production-constrained and attraction-constrained models are sometimes referred to as ‘balancing factors’ since they constrain the row and column totals, respectively, of the predicted flow matrix. Attraction-constrained spatial interaction models are encountered less frequently than production-constrained models but they can be used to estimate the demand for housing in different residential neighborhoods due to the location of a new enterprise. They can also be used to analyze geographic patterns in student enrollments where the total number of students attending a university is known. The attraction-constrained model can then be used to determine the characteristics of places that affect the number of students attending from different origins. The model can also be used in a normative way to derive ‘expected’ numbers of students enrolling from each origin (the predicted trips from the model) and to compare these expected values with the actual enrollments. This comparison can help identify areas from which the university is drawing more students than expected, given the characteristics of these areas, and areas where it is drawing fewer than expected. The latter could be targeted to increase enrollments. Obviously, this type of application could easily be extended to retailing or any other area where there is competition to attract individuals.
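The role of the balancing factor in the production-constrained model of Eqns. (4) and (5) can be illustrated with a short sketch. It assumes a single destination attribute, and the function name, the data, and the parameter values are hypothetical, chosen only for illustration.

```python
def production_constrained(origin_totals, attractiveness, distance, lam, beta):
    """Predict flows T_ij = A_i * O_i * W_j**lam * d_ij**beta (one attribute W_j)."""
    flows = []
    for O_i, d_row in zip(origin_totals, distance):
        # Numerator terms W_j**lam * d_ij**beta for every destination j.
        pull = [W_j ** lam * d_ij ** beta for W_j, d_ij in zip(attractiveness, d_row)]
        # Balancing factor A_i: inverse of the sum of the numerator terms.
        A_i = 1.0 / sum(pull)
        flows.append([A_i * O_i * p for p in pull])
    return flows


# Hypothetical system: 2 origins, 3 destinations.
O = [100.0, 200.0]                       # known trips leaving each origin
W = [50.0, 30.0, 20.0]                   # destination attractiveness
d = [[1.0, 2.0, 4.0], [3.0, 1.0, 2.0]]   # distances d_ij
T = production_constrained(O, W, d, lam=1.0, beta=-2.0)
row_totals = [sum(row) for row in T]     # reproduce O up to rounding error
```

Because each row is scaled by its own balancing factor, the predicted row totals match the known outflows whatever values are chosen for the exponents. The attraction-constrained model of Eqns. (8) and (9) mirrors this: the same logic applies with origins and destinations exchanged, so that column totals are reproduced instead.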
3.4 The Production-attraction Constrained Model (or Doubly Constrained Model)

Suppose data are available on spatial separation, the total number of trips beginning at each origin, and the total number of trips terminating at each destination. The appropriate spatial interaction model is then the production-attraction constrained, or doubly constrained, model. This has the form:

$$T_{ij} = A_i O_i B_j D_j \, d_{ij}^{\beta} \tag{10}$$

where

$$A_i = \Big[ \sum_j B_j D_j \, d_{ij}^{\beta} \Big]^{-1} \tag{11}$$

and

$$B_j = \Big[ \sum_i A_i O_i \, d_{ij}^{\beta} \Big]^{-1} \tag{12}$$

The $A_i$ and $B_j$ terms have to be estimated iteratively and they ensure that the constraints in Eqns. (3) and (7) are met. The doubly constrained model allocates trips to the cells of the flow matrix (the origin-destination links) given row and column totals. It does this solely on the basis of the distance between origins and destinations. Its use has mainly been confined to transportation applications where the total numbers of commuters leaving each residential neighborhood are known and the total numbers of employment opportunities in each destination are also known. What is unknown, and can be predicted from the model, is the distribution of trips between origins and destinations.

4. A Note on Model Calibration

Although a detailed discussion of the nature and methods of calibrating spatial interaction models is beyond the scope of this article, it is worth mentioning very briefly how the parameters in a spatial interaction model might be estimated. For a fuller discussion of the calibration of spatial interaction models, the reader is referred to Fotheringham and O’Kelly (1989). In order to use a spatial interaction model to answer questions such as those posed in Sect. 1, it is necessary to obtain values for the model’s parameters. If the purpose of the model is to obtain information on the way either origin or destination characteristics affect spatial interaction patterns, the model must be calibrated with some known flow data. This is done by finding the values of the parameters that maximize some criterion, such as the accuracy with which the model replicates the known flow pattern. Parameter values that are far from their ‘true’ values will tend to produce very poor estimates of the flows, while parameters close to their ‘true’ values will tend to produce more accurate estimates. It is sometimes useful to determine how sensitive a model’s predictions are to variations in the parameters used to derive the predictions. If flow data are not available, model calibration as described above is impossible and ‘educated guesses’ have to be made for parameter values. This could be done, for example, by surveying the results of previous calibrations of similar models. However, in practice this is fraught with difficulties and should be avoided where possible, as it is rare to find exactly the same spatial interaction model with the same combinations of explanatory variables being used in different applications. Also, it appears that the parameters of spatial interaction models are quite context dependent (Meyer and Eagle 1982).

5. Local Forms of Spatial Interaction Models
The forms of spatial interaction models described above are referred to as ‘global’ models because they provide a single set of parameter estimates for the whole flow system being analyzed. For instance, in each of the above models, a single estimate of distance decay is obtained when the model is calibrated. In most circumstances, it would be useful to obtain more detailed information on the flow system and to obtain separate distance-decay parameter estimates for each origin in the system or for each destination. This is quite easy to achieve by calibrating what are known as origin-specific models and destination-specific models. An origin-specific form of the production-constrained model given in Eqn. (4) is:

$$T_{ij} = A_i O_i \, W_{j1}^{\lambda_1(i)} W_{j2}^{\lambda_2(i)} \cdots W_{jq}^{\lambda_q(i)} \, d_{ij}^{\beta(i)} \tag{13}$$

where

$$A_i = \Big[ \sum_j W_{j1}^{\lambda_1(i)} W_{j2}^{\lambda_2(i)} \cdots W_{jq}^{\lambda_q(i)} \, d_{ij}^{\beta(i)} \Big]^{-1} \tag{14}$$
In this model, each parameter is specific to origin $i$ so that the spatial distribution of a parameter can be mapped to examine any spatial patterns in the behavior represented by that parameter. Interestingly, there is a long history of estimated origin-specific distance-decay parameters exhibiting spatial patterns in which centrally located origins have less negative parameter estimates than peripheral ones (Fotheringham 1981). The interpretation of such a pattern has provoked intense debate at times, but it appears that the distribution is due to a mis-specification bias which can easily be eliminated by the addition of a variable measuring the proximity of a destination to all the other destinations (Fotheringham et al. 2000, chap. 9). The models that result are termed ‘competing destinations models’ and have been shown to remove the previously observed spatial pattern of origin-specific distance-decay parameters (Fotheringham and O’Kelly 1989). It is interesting to note that the mis-specification of spatial interaction models became evident only when local forms of the models were calibrated.

For completeness, a destination-specific form of the attraction-constrained model in Eqn. (8) is:

$$T_{ij} = B_j D_j \, V_{i1}^{\alpha_1(j)} V_{i2}^{\alpha_2(j)} \cdots V_{ip}^{\alpha_p(j)} \, d_{ij}^{\beta(j)} \tag{15}$$

where

$$B_j = \Big[ \sum_i V_{i1}^{\alpha_1(j)} V_{i2}^{\alpha_2(j)} \cdots V_{ip}^{\alpha_p(j)} \, d_{ij}^{\beta(j)} \Big]^{-1} \tag{16}$$

In this model, parameters are obtained that are specific to each destination $j$.
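One simple way to obtain origin-specific distance-decay estimates is sketched below. It is an illustration under stated assumptions, not the calibration procedure of the cited works: a single destination attribute $W_j$ is used, its exponent is fixed at 1, and each $\beta(i)$ is estimated by ordinary least squares on the logarithmic form of Eqn. (13), $\ln(T_{ij}/W_j) = \ln(A_i O_i) + \beta(i) \ln d_{ij}$. The function names and data are hypothetical, and the flows are synthetic so that the known parameters can be recovered.

```python
import math


def origin_specific_beta(flows_row, attractiveness, distance_row):
    """OLS slope of ln(T_ij / W_j) on ln(d_ij) for one origin i.

    Assumes all flows are strictly positive (zero flows have no logarithm).
    """
    x = [math.log(d) for d in distance_row]
    y = [math.log(t / w) for t, w in zip(flows_row, attractiveness)]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def synthetic_flows(O_i, attractiveness, distance_row, beta):
    """Generate noise-free production-constrained flows for one origin."""
    pull = [w * d ** beta for w, d in zip(attractiveness, distance_row)]
    total = sum(pull)
    return [O_i * p / total for p in pull]


# Hypothetical system: 2 origins with different 'true' distance-decay values.
W = [50.0, 30.0, 20.0, 40.0]
d = [[1.0, 2.0, 4.0, 3.0], [3.0, 1.0, 2.0, 5.0]]
true_betas = [-1.0, -2.0]
T = [synthetic_flows(O_i, W, row, b)
     for O_i, row, b in zip([100.0, 200.0], d, true_betas)]

# One regression per origin gives one estimate of beta(i) per origin.
betas = [origin_specific_beta(t_row, W, d_row) for t_row, d_row in zip(T, d)]
```

With noise-free data the estimates coincide with the values used to generate the flows. With observed flows they would differ, and the resulting set of $\beta(i)$ values is what would be mapped to look for spatial patterns.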
6. Theoretical Foundations of Spatial Interaction Models Although spatial interaction models generally perform well empirically and are generally assumed to be reasonable statements of spatial behavior, the models have been criticized for not having a sound theoretical base. The ‘Holy Grail’ of spatial interaction modeling has therefore been to provide a commonly acceptable theoretical derivation of spatial interaction models. Four main frameworks have been postulated for this purpose and in chronological order these are: (a) spatial interaction as social physics; (b) spatial interaction as statistical mechanics; (c) spatial interaction as aspatial information processing; and (d) spatial interaction as spatial information processing. It is not the place to describe each of these theoretical frameworks in detail (see Fotheringham et al. 2000, chap. 9, for such a description). However, a very brief discussion is given to demonstrate the progress that has been made in this area and how the modeling framework has moved far away from its original reliance on an analogy to laws of physical gravitation. Attempts at understanding regularities in patterns of spatial flows began with the observation that the movement of people between cities was analogous to the gravitational attraction between solid bodies. That is, greater numbers of migrants were observed to move between larger cities than smaller ones, ceteris paribus, and between cities that were closer together than between cities that were farther apart, ceteris paribus. This led to the proposal of a simple mathematical model to predict migration flows between origins and destinations based on the Newtonian gravity model. However, there was no theoretical justification for this form of model until Wilson derived the family of spatial interaction models from entropy-maximizing principles (Wilson 1967). 
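The entropy-maximizing argument can be outlined in a few lines. This is a compressed sketch rather than Wilson’s full derivation. The travel cost $c_{ij}$, the total cost $C$, the total number of trips $T$, and the multipliers $\theta_i$, $\gamma_j$, and $\mu$ are notation introduced here for illustration.

```latex
\begin{aligned}
&\text{Maximize } \ln \Omega = \ln \frac{T!}{\prod_{i,j} T_{ij}!}
 \quad \text{subject to} \quad
 \sum_j T_{ij} = O_i, \qquad \sum_i T_{ij} = D_j, \qquad \sum_{i,j} T_{ij}\, c_{ij} = C. \\[4pt]
&\text{With Stirling's approximation } \ln T_{ij}! \approx T_{ij} \ln T_{ij} - T_{ij},
 \text{ the first-order condition for each } T_{ij} \text{ is} \\
&\qquad -\ln T_{ij} - \theta_i - \gamma_j - \mu\, c_{ij} = 0 . \\[4pt]
&\text{Hence } T_{ij} = e^{-\theta_i}\, e^{-\gamma_j}\, e^{-\mu c_{ij}}
 = A_i O_i\, B_j D_j\, e^{-\mu c_{ij}},
 \qquad A_i O_i = e^{-\theta_i}, \quad B_j D_j = e^{-\gamma_j}.
\end{aligned}
```

If cost is taken to be logarithmic in distance, $c_{ij} = \ln d_{ij}$, the deterrence term becomes $d_{ij}^{-\mu}$, which has the form of the doubly constrained model in Eqn. (10) with $\beta = -\mu$.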
However, this framework, although a catalyst for a great volume of research in spatial interaction modeling, was behaviorally difficult to justify and was superseded by McFadden’s random utility maximizing principle (1974). This linked spatial interaction modeling with the process of spatial choice (see Spatial Choice Models), although the choice framework it depended on was developed for aspatial choice processes and did not always fit well into a spatial context (Fotheringham and O’Kelly 1989). Consequently, the latest theoretical framework in which spatial interaction models have been embedded is that of spatial information processing (Fotheringham et al. 2000, chap. 9; Fotheringham and Curtis 1992). This has led to the development of the new forms of spatial interaction models mentioned in Sect. 5, termed ‘competing destinations models.’ Spatial interaction models are the products of a spatial theoretical framework that incorporates concepts from economics, such as random utility maximization, and psychology, such as spatial cognition and information processing (see Spatial Cognition; Behavioral Geography). However, the models describe the constraints of space on important aspects of human behavior and hence are quintessentially geographic.

See also: Location Theory; Market Areas; Spatial Optimization Models; Transportation Geography
Bibliography

Boyle P J, Flowerdew R, Shen J F 1998 Modelling inter-ward migration in Hereford and Worcester: the importance of housing growth and tenure. Regional Studies 32: 113–32
Flowerdew R, Salt J 1979 Migration between labour market areas in Great Britain, 1970–1971. Regional Studies 13: 211–31
Fotheringham A S 1981 Spatial structure and distance-decay parameters. Annals of the Association of American Geographers 71: 425–36
Fotheringham A S, Brunsdon C, Charlton M 2000 Quantitative Geography: Perspectives on Spatial Data Analysis. Sage, London
Fotheringham A S, Curtis A 1992 Encoding spatial information: the evidence for hierarchical processing. In: Frank A U, Campari I, Formentini U (eds.) Theories and Methods of Spatio-Temporal Reasoning in Geographic Space, Lecture Notes in Computer Science. Springer-Verlag, Berlin, pp. 269–87
Fotheringham A S, Dignan T 1984 Further contributions to a general theory of movement. Annals of the Association of American Geographers 74: 620–33
Fotheringham A S, O’Kelly M E 1989 Spatial Interaction Models: Formulations and Applications. Kluwer Academic Publishers, Dordrecht, The Netherlands
Fotheringham A S, Trew R 1993 Chain image and store choice modeling: the effects of income and race. Environment and Planning A 25: 179–96
Mayhew L D, Gibberd R W, Hall H 1986 Predicting patient flows and hospital case-mix. Environment and Planning A 18: 619–38
McFadden D 1974 Conditional logit analysis of qualitative choice behavior. In: Zarembka P (ed.) Frontiers in Econometrics. Academic Press, New York, pp. 105–142
Meyer R J, Eagle T C 1982 Context-induced parameter instability in a disaggregate-stochastic model of store choice. Journal of Marketing Research 19: 62–71
O’Kelly M E 1986 Activity levels at hub facilities in interaction networks. Geographical Analysis 18: 343–56
Olsson G 1970 Explanation, prediction and meaning variance: an assessment of distance interaction models. Economic Geography 46: 223–33
Southworth F 1983 Temporal versus other impacts upon trip distribution model parameter values. Regional Studies 17: 41–7
Wilson A G 1967 Statistical theory of spatial trip distribution models. Transportation Research 1: 253–69
A. S. Fotheringham
Spatial Labor Markets

Everyone has seen headlines that report declining or increasing unemployment. If the articles that ensue do not report the unemployment data for distinct regions, the headlines are misleading. The idea that processes leading to unemployment are similar in all regions may be assumed, but it is not tenable. The labor market of a state cannot be thought of as a uniform entity; rather it comprises different spatial entities called spatial labor markets. The concept of a spatial labor market reflects the realities that mobility incurs costs, that information about a national labor market is not equally available everywhere, and that people are not always willing to give up their home in order to take up a job where they are needed.
1. Definition A spatial labor market is defined as an imaginary and sometimes also actual meeting place for the supply of and demand for labor. Thus, a spatial labor market is both a spatial entity and a mediation agency for paid work. Or, put another way: a spatial labor market is always both a territorial entity and a social institution that serves to distribute labor across the labor supply. The dual nature of a spatial labor market can still be seen in many markets that emerge from time to time, as well as in more permanent markets. Markets are both actual meeting places for those offering and seeking goods and institutions that serve to set prices and distribute goods separated from the local and regional context.
2. How a Spatial Labor Market Works

From a functional point of view, a spatial labor market thus corresponds to the general model of labor markets, limited by the spatial range over which supply and demand can intersect and try to satisfy their requirements. It is the friction of distance that creates local and regional labor markets by limiting the spatial range over which supply and demand can intersect.
Nevertheless, the components and the mechanism of the spatial labor market correspond to those of the nonspatial model. The labor supply offers its readiness to work and its qualifications while seeking to achieve high wages, job security, and career expectations. The labor demand, i.e., business people, entrepreneurs, public bodies, and representatives of all kinds of associations, seeks this readiness to work and specific qualifications and endeavors on grounds of cost to pay low wages in return, while promising job security and career expectations only to certain employees. This searching process is accompanied on both ‘sides’ by organizations that have, through collective agreements, created the framework conditions which markedly shorten the individual negotiation process. For most workers the room for individual negotiations is narrowly restricted by wage agreements arrived at by collective bargaining, minimum wages, and labor laws.

The extent of labor supply and demand is not fixed. The number of people looking for work and in demand depends on many factors. On the supply side, the number of employable people of working age (usually between 15 and 60 years of age) is determined by the size of the population living in the region and its age structure, regardless of whether the people thus defined as ‘employable’ are in fact seeking work. The number of employable persons is changed directly by permanent, temporary, and pendular migration and by natural population trends (fertility and mortality). The proportion of employable people actually seeking work (the workforce), and thus calculable as part of the labor supply, varies greatly. Factors determining whether people are seeking work include educational and professional qualifications, family status, gender, the regional infrastructure, and sociocultural norms. The proportion of people seeking work can be measured by the participation rate. This is obtained by dividing the number of persons in the workforce (employed or unemployed) by the size of the population. Finally, those seeking work offer their labor for a particular period of time, which is dependent on personal preferences and legal norms. By multiplying the labor supply by the period of work offered, it is possible to calculate the total volume of labor offered.

Set against this is the volume of labor demand. This can be calculated from the number of jobs available in a region. This is heavily dependent on the economic situation, with an increase in the demand for goods and services tending to lead to an increase in the demand for labor, unless production is increased by new ways of organizing work and technical advances, which lead to a leveling-off in the volume of labor demand. The volume of labor demand is also dependent on working hours, with a reduction in hours, given constant flexibility in the framework times, clearly tending to increase demand. The extent of the supply of and demand for labor is not fixed. On a spatial labor market the volume of labor supply and demand can fluctuate more strongly
than on a national labor market. As well as those mentioned above, other factors that come into play are the effects of the varied accessibility of the spatial labor market. Thus, the construction of roads and of rail-borne means of mass transport can affect the accessibility of a spatial labor market and, thus, the labor supply, just as the establishment or transfer of a firm can increase or decrease the demand (Fig. 1).

Figure 1 Supply and demand at the labor market (after Fassmann and Meusburger 1997)
3. Size and Demarcation of a Spatial Labor Market

There is a consensus that a spatial labor market is based on distance, which creates local and regional markets and leads to geographical subdivisions of a nonspatial labor market. To this extent the definition is clear. But it is still not clear how large a spatial labor market is and by what yardsticks and criteria it is demarcated. Two different demarcation approaches are possible.

The first approach involves using given administrative units (or aggregations of administrative units). This means that spatial labor markets can be limited by administrative boundaries. If one asks what the labor market situation in Munich is, the answer will contain information on unemployment and employment in the administrative area of the city, regardless of the fact that the functional area is much broader. The same is true if, for example, the labor market of old industrial areas, the rural labor market, or the urban labor market is the center of interest. The space is always a homogeneous and not a functional unit.

This contrasts with the prominent position of functional integration in the second approach to the demarcation of labor markets. Here the demarcation of a regional labor market depends on the commuter field (Eckey et al. 1990) and thus on the accessibility and integration of a labor market center with its own hinterland. This is a common method for defining regional labor markets in the literature and is considered the ‘best’ approach. Self-contained labor market areas, functionally demarcated on the basis of commuter data, would be an ideal model for taking stock of regional labor markets (Bröcker 1988).

Both approaches, the spatial labor market as an administrative (homogeneous) unit and as a functional unit, overlook important points. Theoretical and empirical research has come to the conclusion that the idea of a demarcation of a spatial labor market which is valid for all societal groups and for all times is wrong (Peck 1989, Hanson and Pratt 1995, Fassmann and Meusburger 1997). Spatial labor markets are always time-specific and constructed. Spatial labor markets do not exist as a priori territorial entities which require only a suitable analysis to discover, but are socially constructed spatial entities depending on gender, occupation, and sociocultural norms. The functional labor market for the male population is significantly different from that for the female population, and the societal acceptance of commuting to work on a daily or weekly basis over long distances varies from country to country and region to region. An administrative boundary can also change, just like the catchment area of a central location. The transport infrastructure is as significant in the formation of spatial labor markets as are company location decisions and the effect of institutions. The establishment of a large new company or the closure of an existing one can, depending on the analytical criteria, alter decisively the flow of commuters. Similarly, the opening up of previously closed borders and currency fluctuations can quickly redirect commuter flows in border regions. Labor market regions thus defined must therefore be re-examined every few years and possibly redemarcated.

The following is a further principal argument against the method of functional demarcation of a labor market region by the flow of commuters. Commuter data always underestimate the real extent of a regional labor market because commuter data are based on figures for the actual places of residence and work. The fact that someone lives in the commuter catchment area and looks for a job there does not mean, of course, that the person concerned has always lived there. It is equally possible that they migrated from a long distance in order to take up a job.
The demarcation of a spatial labor market on the basis of commuter data is, however, based only on the current place of residence and ignores previous migration. This is of major significance in assessing the extent of a regional labor market. Another problem in defining a spatial labor market by using commuting data lies in the precision it suggests, which is not in fact guaranteed. Especially at the margins of a spatial labor market, it is not clear whether a small spatial unit belongs to one or the other spatial labor market. For example, when 20 percent of the workforce commutes into one center and 20 percent into another, the classification ceases to be clear.

A continuing and as yet unsolved problem is the question of the size of a region integrated as a result of spatial interactions. Depending on the research question and the purpose chosen, there exist local, regional, and national labor market centers with corresponding catchment areas. A national labor market can be broken down into regional labor markets and the latter can be further split up into ‘subregional’ or local labor markets. Precise boundaries demarcating where a local labor market becomes a regional one and the yardsticks suitable for defining the regional labor market are lacking. Those that believe that universally valid demarcation criteria exist are, however, mistaken.
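The ambiguity at the margins of a commuter-based demarcation can be made concrete with a small sketch. The rule below assigns a spatial unit to the center that receives its largest share of commuters, but only if that share is large enough and clearly ahead of the runner-up. The function name and both thresholds are hypothetical choices for illustration and are not taken from the literature cited here.

```python
def assign_to_center(commute_shares, min_share=0.15, min_lead=0.05):
    """Assign a spatial unit to a labor market center from commuter shares.

    commute_shares maps a center name to the share of the unit's workforce
    commuting there. Returns (center, status) where status is one of
    "assigned", "ambiguous", or "unattached".
    """
    ranked = sorted(commute_shares.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked or ranked[0][1] < min_share:
        # No center draws a meaningful share of the workforce.
        return None, "unattached"
    top_center, top_share = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if top_share - runner_up < min_lead:
        # Two centers draw similar shares: the classification is unclear.
        return None, "ambiguous"
    return top_center, "assigned"


# The case from the text: 20 percent commute to each of two centers.
tie = assign_to_center({"Center A": 0.20, "Center B": 0.20})
clear = assign_to_center({"Center A": 0.35, "Center B": 0.10})
weak = assign_to_center({"Center A": 0.05})
```

Any such rule makes the demarcation depend on the thresholds chosen, which is one way of seeing why the precision suggested by commuter-based boundaries is not guaranteed.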
4. Empirical Findings

The difficult question of demarcation should not obstruct the main finding here. Those that would like to distance themselves from the abstract model of the labor market of the economy and move towards a more realistic position are unavoidably approaching the model of a spatial labor market, because labor markets are always spatial phenomena and research cannot therefore avoid taking the spatial dimension into consideration. In any case, empirical surveys have drawn attention to the systematic differences between specific regional labor markets and different types of spatial labor markets. As an example, one can ask how spatial labor markets change when one compares large metropolises with medium-sized centers and rural areas. Some results are presented here (Fassmann 1993, Fassmann and Meusburger 1997).

Analogous to and very closely connected with the idea of central place theory is the distribution of highly qualified and leading positions in the secondary and tertiary sectors. Large towns offer the highest-ranking central services and connected with them a specific range of jobs. Public sector institutions such as ministries, the highest courts, museums, universities, and the head offices of major companies, together with all the service industries connected with them, are located in the respective ‘primate cities’ of the national
network of cities. This phenomenon is being strengthened by the decline of manufacturing jobs. Industrial cities are being transformed into information and services centers with a consequent change in the demand for labor (Table 1).

Table 1 Geographical features of labor markets in relation to the settlement system

Features | Rural areas | Central places | Metropoles
Size of labour | Severe job shortage; negative commuter flow | Job surplus; balanced commuter flow | Large job surplus; positive commuter flow
Unemployment | Moderate to small and seasonal peaks | Tremendous spatial variation (old industrial areas vs. new growth centers) | Relatively high but more polarized
Structure of labour, supply and demand | Low demand, concentration on a few sectors | Varying demand, according to structure of central locations | Varying and polarized demand in manufacturing and services sectors
 | Moderate labor supply; especially low female supply | Structure-specific labor supply | High labor supply; through women and immigrants
 | Low labor qualifications (industrial laborers, agriculture) | Medium-level qualifications; skilled workers; middle-level white-collar staff | Polarized qualifications (specialists and low-qualified service workers)

Source: Fassmann 1993

The wide and attractive range of jobs offered in metropolises has led to a marked increase in employment. In metropolises the proportion of those seeking and taking up work is higher than on other labor markets. Women in particular are attracted by the demand from service industry companies. They benefit particularly from the expansion of the tertiary and quaternary sectors and the transfer and reduction of manufacturing jobs. With the decline in manufacturing jobs and the dominance by now of highly qualified tertiary and quaternary jobs in central locations, the labor markets of large towns and cities experience an increase in well-paid and highly qualified workers. Their purchasing power and consumer behavior stimulate the growth of less qualified service industry jobs at the other end of the labor market. Laundries, restaurants, and retail shops with long opening hours make possible new lifestyles and family structures that are growing in number. These phenomena are not found to the same extent in all towns and cities, and depend on the political regulation of the labor market and on the accessibility of cheap and mostly immigrant labor. The tendency appears to be in this direction, however, and was also dealt with thoroughly in the discussion of global cities (Sassen 1994).

The generally attractive range of jobs in large towns does not prevent high unemployment but, paradoxically, contributes to it. Unlike on the labor markets of small towns, old industrial areas, and rural areas, the attractive supply of jobs leads to immigration, commuter migration, and to high levels of employability among the population which persist even in times of temporary unemployment. Unemployment does not lead to employable people retreating into the reserve army. They continue to actively seek work, but are registered as unemployed. Characteristic of the urban labor market, therefore, is the wide range of opportunities. For both the highly qualified and the low-qualified, immigrants and natives, men and women, old and young, the cities offer the best chance of finding work. In addition to this, moreover, the chances of changing jobs, of mobility, and of social promotion are also greater. In short, the urban labor market is a career labor market. To achieve certain positions on the labor market it is almost essential to commute or to move to the urban centers. The ‘reward’ for this is a labor market open-ended at the top. If there were any truth in the myth that everybody could reach every social stratum, this would apply in the first instance to the urban labor market.

As one goes further down the settlement structure to smaller and medium-sized towns, the structural features of the labor market change characteristically. The number of high-ranking services declines, together with the proportion of highly qualified and leadership jobs. The employment spectrum becomes narrower and is restricted to a middle-ranking sector. In particular, in those industrial towns not yet affected by internationalization and company transfers, traditional skilled jobs dominate the job structure at the middle level. Compared with the polarized job spectrum of the large towns, that of the small and medium-sized towns is markedly more homogeneous, with fewer highly qualified jobs but also fewer jobs for the low-qualified in the industrial trade and service sectors. It is women who are affected primarily by the increasing concentration in the number and quality of jobs.
They are less often in work and more often drop out of paid work at certain stages of their lives (e.g., after marriage or after childbirth). One result is that they reduce the labor supply and prevent a marked rise in unemployment, although there is tremendous variation in this phenomenon across countries. The tendency to leave the labor market after marriage or after childbirth varies according to sociocultural norms and stages of modernization.
A further step down the settlement hierarchy finally leads one to the rural area, which no longer offers any central services for the surrounding area. As a result, the job spectrum is further restricted to just a few sectors: agriculture, the construction industry, tourism, and industrial jobs in wood processing, the production of building materials, textiles, and food. On the whole these tend to be unskilled and poorly paid jobs. Only a small proportion of the population of working age has a job on the local labor market. The locations lack the conditions for higher-ranking and highly qualified jobs. Service industry companies and modern industrial companies that amount to more than just extended workbenches require transport links and the advantages of an agglomeration.
The lack of jobs does not, as might be expected in theory, lead to a constantly high level of unemployment but, on the contrary, to average and in some regions even markedly low levels of unemployment. Part of the labor supply withdraws from the labor market and at first does not even try to find paid work. Women who cannot undertake long commuter journeys, or any journeys at all, because of the lack of childcare facilities give up looking for work. This decision is easier when wage levels are low and compare unfavorably with the costs of commuting and childcare. In addition, women take on minor and part-time activities or withdraw into unpaid housework.
Those wishing to have a career, who attend a higher-level school or university and then look for a suitable job at the end of their education, must commute or leave the area, because in rural areas qualified jobs are in short supply. The establishment of school centers in rural areas has changed this situation only to a limited extent and sometimes has even been counterproductive. Those graduates not finding a job teaching in their hometown have to move or be satisfied with a job below their qualifications. Establishing school centers without creating new jobs has therefore often led to an erosion of qualifications in rural areas and can thus be seen as a failure of regional policy.
5. Conclusion
The demarcation and analysis of spatial labor markets lead to a differentiated and more realistic picture of the dynamics and structures of a national labor market. Trends observed in recent years on the national labor market differ fundamentally from those on the labor markets in metropolises, old industrial areas, rural areas, peripheral border areas, and in the suburbs. In each case there are considerable differences regarding which population groups were unemployed, and to what extent, and which groups found work or were forced out of active life into the silent reserve army. This draws attention to the need for a regional labor market policy that takes account of the specific causes of unemployment, underemployment, and wage differentials among regions, while seeking solutions that are spatially acceptable.
See also: Employment and Labor, Regulation of; Labor Markets, Labor Movements, and Gender in Developing Nations; Labor Supply; Social Stratification; Unemployment: Structural; Uneven Development, Geography of; Urban Activity Patterns
Bibliography
Bröcker J 1988 Regionale Arbeitsmarktbilanzen 1978–1984. Methoden und Ergebnisse. Raumforschung und Raumordnung 3: 87–97
Eckey H-F, Horn K, Klemmer P 1990 Abgrenzung von regionalen Diagnoseeinheiten für die Zwecke der regionalen Wirtschaftspolitik. Gutachten im Auftrag des Untersuchungsausschusses der Gemeinschaftsaufgabe ‘Verbesserung der Regionalen Wirtschaftsstruktur.’ N. Brockmeyer, Bochum, Germany
Fassmann H 1993 Arbeitsmarktsegmentation und Berufslaufbahnen. Ein Beitrag zur Arbeitsmarktgeographie Österreichs. Beiträge zur Stadt- und Regionalforschung 11. Verlag der Österreichischen Akademie der Wissenschaften, Vienna
Fassmann H, Meusburger P 1997 Arbeitsmarktgeographie. Teubner, Stuttgart, Germany
Fischer M M, Nijkamp P 1987 Spatial labour market analysis: relevance and scope. In: Fischer M M, Nijkamp P (eds.) Regional Labour Markets. Contribution to Economic Analysis. Elsevier Science, Amsterdam, New York, Oxford, Tokyo, pp. 1–26
Hanson S, Pratt G 1995 Gender, Work and Space. Routledge, London
Pearce D (ed.) 1986 The MIT Dictionary of Modern Economics. MIT Press, Cambridge, MA, pp. 233–34, 238
Peck J A 1989 Reconceptualizing the local labour market: space, segmentation and the state. Progress in Human Geography 3: 42–61
Sassen S 1994 Cities in a World Economy. Pine Forge Press, Thousand Oaks, CA
H. Fassmann
Spatial Memory Loss of Normal Aging: Animal Models and Neural Mechanisms
This article will consider the effects of normal, age-related changes on certain forms of memory. The term ‘normal aging’ is used to refer to developmental processes that are expected to occur in organisms near the end of their lifespan. Alzheimer’s disease and other neuropathological conditions in humans, for example, are not ‘normal’ or inevitable parts of the aging process, and thus are not considered here. The focus here will be on the identification of brain changes responsible for age-related learning and memory changes, which can be understood by examining any number of mammalian species. The motivation for this approach is ultimately the development of methods that can attenuate age-associated cognitive decline, thereby improving quality of life for the elderly.
1. The Hippocampus as a Model Brain Structure in which to Study Age-related Memory Deficits
The way in which new events are learned and remembered changes over the lifespan. A great deal of research on learning and memory has focused on the hippocampus (Fig. 1), a structure implicated in these processes (see Memory and Aging, Neural Basis of). O’Keefe and Nadel (1978) proposed that the hippocampus functions as a cognitive map, and, in all mammals studied to date, optimal functioning of this structure is required for acquisition of the spatial knowledge required for effective navigation. The hippocampus is likely to participate in more than just spatial memory, but its role in spatial memory is emphasized here to constrain the topic of review (Squire 1992). It is certainly the case that in either humans or rats, extensive hippocampal damage prevents formation of new spatial maps that allow accurate large-scale navigation. It is these types of observations that suggest that the computation the hippocampal system performs is likely to be conserved across mammals.
Figure 1 Coronal section through the rat hippocampus showing CA1 and CA3 pyramidal cell layers as well as the granule cell layer of the fascia dentata (FD)
1.1 Changes in Navigational Behaviors in Normal Aging
The acquisition of new spatial information is defective in old humans, old monkeys, and old rats. For example, Newman and Kaszniak (2000) tested healthy younger and older adults’ ability to accurately locate a pole in relation to multiple cues present in a large tent-like environment. The older adults were not as accurate as were younger adults in placing the pole in the correct position, suggesting a difference in spatial knowledge between groups. Rapp et al. (1997) tested young and old monkeys in an eight-choice foraging task in which reward could be obtained only once at any location by the monkey. When the spatial memory demands of the task were made more difficult by imposing greater delays between choices, older monkeys became increasingly impaired. For rats, the circular platform task has been used to test spatial navigation ability. The object of the task is to navigate to the one place where it is possible to enter a dark chamber away from the brightly lit open surface of the platform. The consistent finding is that old rats do not remember this preferred dark chamber location from day to day as well as do younger rats (Barnes 1979). Taken together, the observations suggest that older mammals remember spatial information less accurately than do their young counterparts, and most often these spatial memory problems can be separated from other age-related sensory changes that might have a negative impact on performance.
2. Changes in Hippocampal Connectivity and Functional Responsiveness
Because normal aging results in hippocampal-dependent memory changes, changes in hippocampal structure and function over the lifespan might also be expected. In general, however, there appears to be rather little change in cell numbers, and only rather subtle and region-specific changes in cell function.
2.1 Preservation of Hippocampal Cell Numbers and Function
In contrast with some earlier anatomical studies, modern stereological cell counting methods have revealed no significant numerical differences in the major cell types within the hippocampus across the lifespan of humans (West 1993), monkeys, or rats. Similarly, studies of basic cellular neurophysiology suggest that the biophysical properties of hippocampal cells are largely preserved across the lifespan. This includes measurements such as membrane potential, cell input resistance, action potential heights, and membrane time constants measured in vitro, as well as average spontaneous cell firing rates measured in the awake, behaving state. Thus, normal aging does not result in widespread cell loss, or deterioration in cell viability and function (Barnes 1994).
2.2 Specific Age-related Hippocampal Structural and Functional Changes
The age-related changes that do occur appear to be in connectivity and functional responsiveness, and the pattern of change is specific to the region examined. For example, from anatomy we know that there are fewer synaptic contacts from the main source of input to granule cells of the hippocampus (Fig. 1) from the entorhinal cortex (Geinisman et al. 1995, Fig. 2). From electrophysiology we know that there is pruning of these axons; however, the synaptic contacts that remain in old rats are more powerful (Barnes 1994, Fig. 2). This change in synaptic strength appears to help keep the probability of granule cell output relatively constant, in spite of the extensive deafferentation, and suggests a compensatory process at this synapse during aging. By contrast, in the hippocampal regions adjacent to the granule cells (CA3 and CA1 in Fig. 1) there is no pruning of CA3 pyramidal cell axons that project to the pyramidal cells of CA1 (Fig. 2). The anatomy and physiology suggest that there are, however, fewer functional contacts per axonal branch in old rats. Interestingly, the efficacy of these individual synaptic contacts in CA1 does not change with age (Fig. 2), suggesting that functional compensation is not an obligatory process in response to age-related deteriorative change (Barnes 1994), but occurs selectively at only certain connections.
3. How Does Aging Affect the Plasticity Characteristics of the Hippocampus?
Figure 2 Schematic of age-related changes in granule cells of the fascia dentata (FD) and pyramidal cells in CA1 of rat hippocampus. Old rats show fewer axon collaterals and synapses, but those synapses that remain are more powerful in FD. In CA1 old rats show fewer synapses, but no change in synaptic strength
The changes in connectivity and cellular function during aging discussed above could impact the way in which information is stored in the hippocampal network. Hebb (1949) was among the earliest to propose that synaptic modification must take place during the acquisition and storage of associative memories. Modification of this type was described at the perforant path–granule cell synapse in the hippocampus by Bliss and Lomo (1973), and McNaughton et al. (1978) verified that it was, indeed, an associative process. Since that time an enormous body of empirical work has arisen describing the physiology, pharmacology, and molecular biology underlying this process, known as long-term potentiation (or LTP). Although there is no direct proof that LTP induced artificially at synapses in the brain reflects the process the nervous system normally uses to store information, there is a wide range of evidence consistent with this hypothesis. Thus, it has been important to examine the effect of normal aging on this process.
3.1 Hippocampal Plasticity Deficits in Aging
Although LTP can be robustly induced at synapses in the aging hippocampus (Landfield et al. 1977), such modification is more difficult to achieve in older rats. The factors contributing to these age-related changes include synapse sparsity, reduced glutamate receptor numbers, and reduced temporal summation of incoming input. Even if equivalent LTP induction levels are achieved, LTP can decay faster in old rats (Barnes 1979). Furthermore, processes that decrease synaptic efficacy, such as LTP reversal and long-term depression, are more effective in older animals (Norris et al. 1996). Importantly, the decay rate of LTP in the hippocampus of rats correlates with spatial memory accuracy on the circular platform task; that is, old rats with the fastest decaying LTP also show the poorest retention of spatial information. These data suggest that the most promising candidate mechanism for information storage in the nervous system is compromised in old rats. Furthermore, the data suggest that if we understood the mechanism of faster decay of LTP, we might better understand why older animals show faster forgetting. In fact, a number of laboratories are investigating pharmacological interventions for slowing LTP decay, in an effort to modify forgetting rates in aging.
4. How Do Changes in Plasticity Mechanisms Impact the Information Processing Characteristics of Hippocampal Ensembles?
It has recently become possible to record significant numbers of single hippocampal cells simultaneously when rats are awake and free to behave (Wilson and McNaughton 1993). With this type of electrophysiological recording, the firing patterns of ensembles of cells are monitored without the use of artificial stimulation. Rather, cell firing occurs in response to the animal’s behavior and sensory input, and thus provides a complementary window for viewing experience-dependent changes in brain function. Single cells in the hippocampus increase their firing rates when an animal is in a particular place within an environment, which is why O’Keefe first called these cells place cells (O’Keefe and Dostrovsky 1971). The area of space over which a given cell fires is called the cell’s place field, and the firing pattern of many cells over a given environment is hypothesized to represent a cognitive map of that environment (O’Keefe and Nadel 1978). In fact, it is possible to reconstruct the animal’s location, simply from knowing the pattern of hippocampal cell activity (Wilson and McNaughton 1993). These place field distributions are commonly referred to as maps. With repeated exposure to an environment, it is hypothesized that these maps can become stabilized and bound to features of the environment as a consequence of LTP at the synapses conveying external sensory information to the hippocampus. Such binding could help ensure accurate map retrieval. Two main findings from age comparisons using ensemble recording methods are discussed below, both suggesting ways in which faulty LTP mechanisms may impact the information processing characteristics of hippocampal ensembles.
4.1 Experience-dependent Place Field Expansion Deficits in Aging
There is a plastic change in the pattern of place cell discharge characteristics that occurs as a consequence of experience, strongly suggesting that synaptic modification is taking place in the system. This empirical observation was predicted by theoretical considerations of route learning, the fundamental idea for which goes back to Hebb’s ideas on phase sequences, or linked cell assemblies. The LTP mechanism has an intrinsic asymmetry that predicts that an expansion of place fields should occur with repeated traversal of a given route. Such place field expansion, and a shift in the center of mass of place fields towards the origin of a route, has been verified experimentally (Mehta et al. 1997). Moreover, glutamate receptor blockade that blocks LTP induction also blocks this experience-dependent place field expansion effect. Because LTP mechanisms are disrupted in old rats, it was predicted that normal aging may alter place field expansion. This was confirmed by Shen et al. (1997), who showed that the expansion of place field size that occurs in young rats after a few traversals of a route is much less robust in old rats. The deficit in experience-dependent place field expansion in old rats is consistent with the likelihood of a failure of effective LTP mechanisms in these old animals.
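The two quantities discussed in this section, place field size and the center of mass of a place field along a route, can be illustrated with a minimal sketch. It assumes a one-dimensional route divided into equal position bins, and it defines the field as every bin whose firing rate is at least 20% of the peak rate; that threshold, the function name, and the firing rates below are illustrative assumptions, not the procedure or data of Mehta et al. (1997) or Shen et al. (1997).

```python
def place_field_stats(positions, rates, threshold_fraction=0.2):
    """Return (field_size, center_of_mass) for a 1-D place field.

    positions: equally spaced bin centers along the route, ordered from
               the origin of the route (hypothetical units: cm)
    rates:     mean firing rate in each bin (Hz)
    threshold_fraction: assumed convention; bins firing at or above this
               fraction of the peak rate count as part of the field
    """
    peak = max(rates)
    if peak <= 0:
        raise ValueError("cell is silent; no place field to measure")
    cutoff = threshold_fraction * peak
    in_field = [(x, r) for x, r in zip(positions, rates) if r >= cutoff]
    bin_width = positions[1] - positions[0]
    field_size = len(in_field) * bin_width
    # Rate-weighted mean position of the in-field bins
    total_rate = sum(r for _, r in in_field)
    center_of_mass = sum(x * r for x, r in in_field) / total_rate
    return field_size, center_of_mass


# Invented firing rates for one cell on an early and a later traversal
positions = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
first_lap = [0, 0, 0, 0, 2, 8, 2, 0, 0, 0]
late_lap = [0, 0, 2, 4, 6, 8, 2, 0, 0, 0]

size_first, com_first = place_field_stats(positions, first_lap)
size_late, com_late = place_field_stats(positions, late_lap)
```

In this invented example the field grows from 30 to 50 cm and its center of mass moves toward the start of the route, which is the direction of change the text describes for young rats; a smaller change between the two laps would correspond to the weaker expansion reported for old rats.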
4.2 Map Retrieval Deficits in Aging
Another example of how deficits in hippocampal synaptic plasticity may impact spatial cognition can be illustrated by a comparison of hippocampal map retrieval in young and old rats. When ensembles of hippocampal cells are recorded in young rats, different place cells fire selectively in the environment through which the rat moves. If the young rat is removed from a familiar environment, and then brought back later to the same room, the same place field distribution, or same map, is retrieved at this later time (Fig. 3). On about two-thirds of occasions when old rats are removed and returned to the same familiar room, they retrieve the same map, just like young rats. About one-third of the time, however, a different place field distribution appears, as though the old rat had failed to retrieve the original map (Fig. 3). Thus, possibly because of a deficient binding to environmental cues or weak associative connections within the map itself, when old rats come back to the same situation a completely different pattern of cell activity can emerge (Barnes et al. 1997). This may well be associated with a place recognition failure. In addition to the map retrieval deficits described above, Tanila et al. (1997) have also shown that memory-impaired old rats do not create new maps as readily as do young rats when features are changed in an environment. Thus, it seems possible that altered LTP-like plasticity mechanisms may be partly responsible for the greater tendency of old organisms to become lost.
Figure 3 The firing of different hippocampal cells is shown in different colors; only a few of the simultaneously recorded cells are plotted for clarity. This illustrates hippocampal map retrieval accuracy on a triangular maze in a young rat on two different recording sessions, and a map retrieval error between sessions in an old rat
5. Conclusions
At the cellular level, normal aging leads to changes in hippocampal connectivity and plasticity mechanisms that are specific to different subregions, but leaves many biophysical parameters of neurons relatively unaffected. At the system level, aging also leads to changes in hippocampal network dynamics that result in nonlinear failures in information retrieval. These age-related changes in brain structure and function are likely to contribute to moderate, but clear changes in spatial learning and memory observed during normal aging.
See also: Aging Mind: Facets and Levels of Analysis; Brain Aging (Normal): Behavioral, Cognitive, and Personality Consequences; Education in Old Age, Psychology of; Lifespan Theories of Cognitive Development; Memory and Aging, Cognitive Psychology of; Memory and Aging, Neural Basis of
Bibliography
Barnes C A 1979 Journal of Comparative and Physiological Psychology 93: 74–104
Barnes C A 1994 Trends in Neurosciences 17: 13–8
Barnes C A, Suster M S, Shen J, McNaughton B L 1997 Nature 388: 272–5
Bliss T V P, Lomo T 1973 Journal of Physiology 232: 331–56
Geinisman Y, de Toledo-Morrell L, Morrell F, Heller R E 1995 Progress in Neurobiology 45: 223–52
Landfield P W, Rose G, Wohlstadter T C, Lynch G 1977 Journal of Gerontology 32: 523–33
McNaughton B L, Douglas R M, Goddard G V 1978 Brain Research 157: 277–93
Mehta M R, Barnes C A, McNaughton B L 1997 Proceedings of the National Academy of Sciences USA 94: 8918–21
Newman M C, Kaszniak A W 2000 Aging Neuropsychology and Cognition 7: 86–93
Norris C M, Korol D L, Foster T C 1996 Journal of Neuroscience 16: 5382–92
O’Keefe J, Dostrovsky J 1971 Brain Research 34: 171–5
O’Keefe J, Nadel L 1978 The Hippocampus as a Cognitive Map. Clarendon Press, Oxford, UK
Rapp P R, Kansky M T, Roberts J A 1997 NeuroReport 8: 1923–8
Shen J, Barnes C A, McNaughton B L, Skaggs W E, Weaver K L 1997 Journal of Neuroscience 17: 6769–82
Squire L R 1992 Psychological Review 99: 195–231
Tanila H, Shapiro M, Gallagher M, Eichenbaum H 1997 Journal of Neuroscience 17: 5155–66
West M J 1993 Neurobiology of Aging 14: 287–93
Wilson M A, McNaughton B L 1993 Science 261: 1055–8
C. A. Barnes
Spatial Mismatch
Spatial mismatch describes a broad set of geographical barriers to employment that stem from a disparity, or mismatch, between where people live and where appropriate job opportunities are located. A more specific form of spatial mismatch, the spatial mismatch hypothesis (SMH), argues that the combined effects of residential segregation and economic restructuring limit geographical access to employment opportunities for black, inner-city residents. This article emphasizes geographical dimensions of the SMH, highlighting the changing spatial distributions of residences and job opportunities in cities, the measurement of geographical separation between the two, the sociospatial processes that underpin spatial mismatch, and policy implications.
Spatial mismatch is a historically specific concept that emerged in the 1960s amid widespread concern about racial discrimination and urban poverty. Although he never used the term ‘spatial mismatch,’ John Kain (1968) introduced the concept in analyzing the combined effects of residential segregation in housing markets and the postwar suburbanization of jobs on black unemployment. Poor geographical access to jobs, especially jobs in manufacturing, was said ‘to reduce employment opportunities of Negroes who are already handicapped by employer discrimination and low levels of education’ (p. 196). Although some subsequent researchers have defined spatial mismatch in narrow geographical terms, as long distances to jobs or small numbers of locally available jobs, Kain’s research was notable in exploring the intersections between spatial and social barriers to employment. He considered how transportation access, employer discrimination, skills, and social networks affect workers’ access to employment. These themes continue to the present in the SMH literature.
Kain’s research stimulated discussion and debate during the early 1970s. After that, the field languished for a decade until William Julius Wilson’s research on the urban underclass brought spatial mismatch back to the forefront (Wilson 1987). Since then the SMH has attracted research attention in a wide range of social science disciplines.
1. Evidence of Spatial Mismatch
The great appeal of the SMH is that it connects two prominent features of postwar American cities: the strong segregation of residential areas by race and ethnicity and the movement of jobs, especially jobs in manufacturing, to the suburbs and overseas. But do these processes necessarily result in spatial mismatch? Many central cities are losing employment, but they are also losing population (Stoll 1999). Thus, the balance of jobs to resident workers has remained relatively constant. Although the SMH emphasizes city–suburban differences, finer-grained patterns of residential and work location are much more relevant to job access. Cities and suburbs are not homogeneous; within each, African-Americans and other minorities are concentrated in a few areas.
Research using diverse indicators of geographical access to employment, in a variety of US cities, finds support for the SMH at the localized, intra-urban scale. Within central cities, black neighborhoods generally have a lower ratio of jobs to resident workers than do white neighborhoods, though the findings are highly sensitive to the geographical areas considered (Holzer 1991). A better approach is to examine the availability of jobs within a reasonable commuting radius. Research in cities like Chicago and San Francisco finds that blacks are less likely to work in their local neighborhoods than are comparable whites, and that the density of job opportunities is lower in and near predominantly black neighborhoods (Ihlanfeldt and Sjoquist 1998). In the 1990s researchers started to combine the power of geographic information systems with detailed urban databases to compute new measures of employment accessibility (Ong and Blumenberg 1998). Transportation routes and physical and social barriers to movement can be layered together to describe the influences on travel and interaction patterns.
Spatial mismatch is also apparent in commuting times and flows. African-American workers have longer commute times on average than do comparable white workers, and the differences persist after controlling for wages, mode of transportation, and personal and household characteristics (McLafferty and Preston 1996). Black workers are also more likely to have ‘constrained’ work trips that involve long travel times to low-wage jobs (Johnston-Anumonwo 1997). The propensity to work in one’s local neighborhood (local working) is also low in black neighborhoods (Immergluck 1998). Spatial mismatch, then, is evident in long work trips to jobs that provide little economic return.
Although the time and cost of commuting are important in their own right, they have several limitations as measures of spatial mismatch. By definition, the measures are based only on a selected group, namely, employed workers.
This ‘selection bias’ implies that commuting indicators may not accurately represent geographical barriers to employment for the unemployed (Cooke and Ross 1999). One way to address this problem is to analyze changes through time as people and jobs move away from central city locations. Case studies reveal that when firms move to the suburbs, black workers are more likely to quit and less likely to relocate in order to gain access to a new suburban work site than are comparable white workers (Ihlanfeldt and Sjoquist 1998). Black workers are more fixed in their residential locations, indicating the long-lasting effects of racial segregation and discrimination.
There are also problems in interpreting commuting patterns. Do long work trips reflect individual choices to travel further for high-wage, high-quality employment, or do geographical constraints compel long work trips? Many commuting decisions result from a complex mix of choice and constraints, as people juggle personal and family needs within the spatial and social bounds of daily life (Gilbert 1998, Hanson and Pratt 1995). These choices and constraints vary between men and women, highlighting the importance of gender in the SMH.
Attention is turning to the role of place in spatial mismatch and the important variations in spatial mismatch within and among cities (Holloway 1998). In the USA, spatial mismatch is primarily a large-city phenomenon that is most evident in cities like New York, Chicago, Atlanta, and Los Angeles. It is also more apparent in the ‘rustbelt’ cities of the Northeast and Midwest where economic restructuring and suburbanization are in full force (Ihlanfeldt and Sjoquist 1998). Little has been written about spatial mismatch in racially and ethnically diverse cities outside the USA, although there are hints of mismatch in cities like London and Berlin where ethnic minority populations cluster in areas of limited job opportunity.
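One indicator discussed in this section, the number of jobs within a reasonable commuting radius relative to the number of resident workers, can be sketched in a few lines. The sketch is a deliberate simplification: it uses straight-line distance, whereas the text stresses that transportation routes and physical and social barriers shape real access, and it ignores competition from workers living elsewhere. The function names, coordinates, job counts, and the 6 km radius are all hypothetical and are not taken from any of the studies cited.

```python
import math


def jobs_within_radius(home, job_sites, radius_km):
    """Total jobs at sites no farther than radius_km from home.

    home:      (x, y) location in km on an arbitrary grid
    job_sites: iterable of (x, y, number_of_jobs)
    Distance is straight-line, an assumption made for brevity.
    """
    hx, hy = home
    return sum(
        jobs
        for x, y, jobs in job_sites
        if math.hypot(x - hx, y - hy) <= radius_km
    )


def jobs_per_resident_worker(neighborhood, job_sites, radius_km):
    """Ratio of reachable jobs to workers living in the neighborhood."""
    reachable = jobs_within_radius(
        neighborhood["location"], job_sites, radius_km
    )
    return reachable / neighborhood["resident_workers"]


# Invented city: a small central cluster of jobs and a large suburban one
job_sites = [(0, 0, 500), (3, 4, 1000), (20, 0, 4000)]
inner_city = {"location": (0, 0), "resident_workers": 3000}
suburb = {"location": (18, 0), "resident_workers": 2000}

inner_ratio = jobs_per_resident_worker(inner_city, job_sites, radius_km=6)
suburb_ratio = jobs_per_resident_worker(suburb, job_sites, radius_km=6)
```

With these made-up numbers the inner-city neighborhood has 0.5 reachable jobs per resident worker and the suburb has 2.0. Changing the radius or the neighborhood boundaries changes the ratios, which echoes the caution in the text that such findings are highly sensitive to the geographical areas considered.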
2. Spatial Mismatch and Labor Market Processes
Spatial mismatch is rooted in the social, economic, and spatial processes that shape urban labor markets. These influence where people and firms locate, the channeling of workers into certain types of occupations, workers’ ability to search for and commute to job opportunities, and firms’ willingness to hire workers. Connecting spatial mismatch to employment outcomes requires an understanding of labor market processes, as well as an understanding of the changing geographical patterns of firms and people.
Urban labor markets are strongly differentiated according to gender, race, ethnicity, and class. This labor market segmentation concentrates blacks and Latinos in a narrow set of occupations, and the particular occupations vary with gender. Historically black men worked as laborers, in transportation, manufacturing, and certain service occupations, whereas black women were concentrated in domestic service. Although these occupations have changed over time, there continues to be strong segmentation by race and gender. Segmentation implies that groups of workers face different sets of potential job opportunities that are unevenly distributed within metropolitan areas. Thus, the spatial mismatch may vary by gender, race, or class. Researchers have only begun to tease out the interactions between segmentation and spatial mismatch (Wyly 1996). For example, despite their different occupations, black men and women face similar spatial mismatch barriers in urban labor markets (McLafferty and Preston 1996).
Some labor market segmentation results from differences in education and skills that coincide, in some cases, with racial and ethnic divisions. The skills mismatch refers to the gap between the education and skill levels of inner-city residents and the requirements of available jobs. The transformation from a goods-producing to a service-producing economy is causing a loss of low-skill, manufacturing jobs from central cities, while the emerging ‘information economy’ demands a highly educated workforce. Poverty, poor quality schools, and social dislocations limit educational achievement in the inner city, resulting in the skills mismatch. The spatial and skills mismatches position workers and jobs in intersecting social and geographical spaces to limit employment opportunities for inner-city residents.
Transportation is a crucial factor in spatial mismatch, influencing the ease and cost of overcoming distance in traveling to work. American cities are built around the automobile, the most flexible mode of transportation and the one that offers access to the widest geographical range of job opportunities. In contrast, travel by public transit is slow and limited to fixed routes and schedules. For inner-city residents, the hub and spoke design of most mass transit systems is poorly structured for travel to suburban job centers. Because blacks and Latinos have low rates of car ownership and licensing, they are much more dependent on public transit than are their white counterparts. Transit dependence affects all aspects of employment including job search and access to suburban jobs, and it is the single most important determinant of minority workers’ long commute times. Some argue that this ‘automobile mismatch’ is a more serious problem than the spatial mismatch (Taylor and Ong 1995).
The process of spatial job search is also important to the SMH. People often search for jobs near their homes. As distance from the home increases, information about nearby job opportunities decreases and the costs of job search rise.
Research indicates that blacks’ and Latinos’ job search patterns are centered upon their homes, and they tend to search in fewer areas and areas with weaker employment growth than do comparable whites (Stoll and Raphael 2000). Social networks play a key part in spatial job search. Racial and ethnic minorities rely heavily on family and friends for finding employment. For low-income, low-skill workers this may lead to job search in areas of low job growth and poor quality jobs, thus aggravating the spatial mismatch.
3. Impacts of Spatial Mismatch
Spatial mismatch affects people’s lives in diverse ways. The SMH emphasizes the effects on poverty, unemployment, and low incomes, but the empirical record is mixed. Perhaps the strongest support for the SMH comes from research on minority youth. For this group, unemployment is strongly correlated with the average travel time to the types of jobs that youth normally hold and youths’ proximity to areas of employment growth (Ihlanfeldt 1992). Minority youth who face long travel times or distances to areas of employment opportunity are much more likely to be unemployed. Looking beyond youth, research among vulnerable populations such as welfare recipients reveals that unemployment is highly sensitive to the concentration of low-wage jobs nearby (Ong and Blumenberg 1998). A 1998 review of the spatial mismatch literature concludes that the vast majority of recent studies support the SMH, and ‘the lack of geographical access to employment is an important factor in explaining labor market outcomes’ (Ihlanfeldt and Sjoquist 1998, p. 881).
Others follow Ellwood (1986) in arguing that ‘race, not space’ (p. 181) is the main determinant of poor labor market outcomes. Racial discrimination permeates urban labor markets in the United States, reducing minority employment irrespective of geographical access to jobs. Racial discrimination is often rooted in place, as firms avoid hiring workers from inner-city, minority communities, or use far-flung, racially biased social networks in attracting employees (Kasinitz and Rosenberg 1996). Critics of the SMH point to examples where high unemployment and dense job opportunities exist side by side. The challenge in examining ‘race versus space’ is that the two are deeply intertwined: Race is space in the USA. Kain (1968) recognized this in the original formulation of the SMH, noting the impacts of racial discrimination by employers and discrimination in housing markets on geographical access to employment. Attempts at disentangling the effects of race and space find that both have large and significant effects on employment outcomes (Stoll 1999).
The implications of spatial mismatch beyond labor markets are discussed in Wilson’s path-breaking work on the urban underclass (Wilson 1987, 1996). 
Wilson starts with spatial mismatch and economic restructuring and weaves a rich tapestry that includes the effects on male unemployment, marriage, household structure, and the implications for an array of social issues such as crime, migration, local institutions, and educational achievement. As jobs move away and incomes decline, the ties that bind communities, households, and institutions start to disintegrate. Spatial mismatch is not the only cause of these changes, but it is an important piece of the puzzle.
4. Spatial Mismatch and Urban Policy
Despite differences of opinion about the SMH, it has played an important role in urban policy formulation for several decades. Policies to address spatial mismatch involve efforts to bring people and jobs closer
together or to improve the connections between them. Three general types of policy have been suggested. First, ‘economic development’ policies aim at creating jobs in inner-city areas. These place-based policies include public works programs and the provision of subsidies, training programs, and infrastructure to attract new industries to inner-city areas. The policies go by various names—the current incarnation is ‘enterprise zones’—and they have had limited success. Many have failed to generate sustained, long-term employment because they did not alter the profitability of inner-city locations and benefits leaked out from inner-city neighborhoods to other areas. Second, ‘dispersal’ policies aim at facilitating the movement of people to job-rich suburban areas. These strategies focus on the housing market and include the vigorous enforcement of fair housing laws, the development of public housing in suburban areas, and the provision of subsidies to help inner-city residents move to the suburbs. A good example of the latter is the Gautreaux program in Chicago, which offered housing subsidies to residents of public housing who wanted to move to the suburbs. A careful comparison of movers and stayers showed significant gains in employment for those who moved to the suburbs (Popkin et al. 1993). Third, ‘mobility’ strategies emphasize improving inner-city residents’ access to suburban job opportunities, rather than changing the locations of jobs or residences. Included here are improvements in public transit to facilitate reverse commuting, creation of car pools and van pools, employer-sponsored transportation from inner-city areas to job sites, and efforts to improve car ownership among the poor. Such policies have immediate and obvious benefits; however, they have never been implemented on a large scale. Also, the most effective mix of policies varies among cities depending on the density of urban development and the availability of transportation infrastructure. 
After decades of experimentation, most analysts agree on the need for multipronged approaches that are sensitive to the particularities of place and context. Furthermore, there is general agreement that spatial mismatch cannot be isolated from the constellation of forces that affect poverty and unemployment. Breaking down the barriers of racial discrimination in job and housing markets, and improving health, housing, and education in disadvantaged communities could make spatial mismatch irrelevant in the long run. Looking ahead, the historical specificity of the SMH highlights the fact that spatial mismatch is not a characteristic of particular locations but of particular places where people are disadvantaged based on a combination of race, ethnicity, class, and poor geographical access to work. Changing patterns of urban development, including the gentrification of inner-city neighborhoods, the growth of poverty in the suburbs, and evolving patterns of employment opportunity
mean that spatial mismatch is perpetually recast. Today’s suburbs may become tomorrow’s ‘inner cities’ as urbanization redefines spatial mismatch.
See also: Environmental Determinism; Labor Markets, Labor Movements, and Gender in Developing Nations; Labor Supply; Local Economic Development; Migration, Economics of; Rural Geography; Social Stratification; Spatial Interaction; Spatial Labor Markets; Transportation Geography; Transportation Planning; Urban Geography; Urban Growth Models; Urban Poverty in Neighborhoods
Bibliography
Cooke T J, Ross S L 1999 Sample selection bias in models of commuting time. Urban Studies 36: 1599–613
Ellwood D 1986 The spatial mismatch hypothesis: Are there teenage jobs missing in the ghetto? In: Freeman R B, Holzer H T (eds.) The Black Youth Employment Crisis. University of Chicago Press, Chicago, pp. 147–85
Gilbert M R 1998 “Race”, space and power: The survival strategies of working poor women. Annals of the Association of American Geographers 88: 595–621
Hanson S, Pratt G 1995 Gender, Work, and Space. Routledge, New York
Holloway S R 1998 Male youth activities and metropolitan context. Environment and Planning 30: 385–99
Holzer H J 1991 The spatial mismatch hypothesis: What has the evidence shown? Urban Studies 28: 105–22
Ihlanfeldt K R 1992 Job Accessibility and the Employment and School Enrollment of Teenagers. W. E. Upjohn Institute for Employment Research, Kalamazoo, MI
Ihlanfeldt K R, Sjoquist D V 1998 The spatial mismatch hypothesis: A review of recent studies and their implications for welfare reform. Housing Policy Debate 9: 849–92
Immergluck D 1998 Job proximity and the urban employment problem: Do suitable nearby jobs improve neighborhood employment rates? Urban Studies 35: 7–23
Johnston-Anumonwo I 1997 Gender, race and constrained work trips in Buffalo, 1990. The Professional Geographer 49: 306–17
Kain J 1968 Housing segregation, negro employment, and metropolitan decentralization. The Quarterly Journal of Economics 82: 175–97
Kasinitz P, Rosenberg J 1996 ‘Missing the connection’: Social isolation and employment on the Brooklyn waterfront. Social Problems 43: 180–96
McLafferty S, Preston V 1996 Spatial mismatch and employment in a decade of restructuring. Professional Geographer 48: 420–31
Ong P, Blumenberg E 1998 Job access, commute and travel burden among welfare recipients. Urban Studies 35: 77–93
Popkin S J, Rosenbaum J E, Meaden P M 1993 Labor market experiences of low-income, black women in middle-class suburbs. Evidence from a survey of Gautreaux program participants. Journal of Policy Analysis and Management 12: 556–73
Stoll M A 1999 Race, Space and Youth Labor Markets. Garland Publishing, New York
Stoll M A, Raphael S 2000 Racial differences in spatial job search: Exploring the causes and consequences. Economic Geography 76: 201–23
Taylor B D, Ong P M 1995 Spatial mismatch or automobile mismatch? An examination of race, residence and commuting in US metropolitan areas. Urban Studies 32: 1453–73
Wilson W J 1987 The Truly Disadvantaged: The Inner City, The Underclass, and Public Policy. University of Chicago Press, Chicago
Wilson W J 1996 When Work Disappears: The World of the New Urban Poor. Knopf, New York
Wyly E K 1996 Race, gender, and spatial segmentation in the Twin Cities. Professional Geographer 48: 431–45

S. McLafferty
Spatial Optimization Models
The concept of spatial optimization is quite simple. This topic involves identifying how land use and activities should be allocated and arranged spatially in order to optimize efficiency or some other measure of goodness. For example, how should fire protection resources be distributed across a city in order to maximize protection? How do we design a system of reserve lands that can ensure the survival of an endangered species? How should an area be divided into school attendance zones? These are but a few examples of spatial design problems. For each type of design problem there can be numerous, if not infinitely many, alternative configurations or solutions. Optimization involves the attempt to identify the best of the alternatives. Simply put, spatial optimization is the science of optimal arrangement. The use of spatial optimization is growing due to the existence of better spatial data, the emergence of powerful geographic information systems, decreasing costs of computation, and better computer graphics and solution algorithms. At first we might think that such a topic is futuristic, cutting edge. Actually, spatial optimization is a geographical problem that is as old as mankind. For example, hunters and gatherers had to decide how to allocate their efforts in collecting foodstuffs, moving encampments, etc. Such societies needed to be efficient in order to survive. Anthropologists and geographers have used optimization to analyze the efficiency of these early societies (see, for example, Van Reidhead (1979), Keene (1979), Church and Bell (1988)). Even the ancient Chinese art of Feng Shui deals with the arrangement of activities in order to be in a harmonious state. Just as we are interested in how spatial arrangements evolved in the past, we are equally interested in how future patterns of activities should be designed. For example, just where should that new store be located? Where should a dealer
provide service facilities for their products? How should a network of warehouses be designed and operated in order to supply a number of retail locations? Such questions are but the tip of the ‘iceberg’ of possible spatial location/allocation problems. As with all geographic problems, the temporal domain is important. For example, between 1990 and 1998 Broward County, FL, located 15 new schools and operated up to 2,000 portable classrooms in order to cope with a growing school population. This means that school boundaries have been frequently adjusted and school capacities are often expanded with portable classrooms in order to cope with increasing enrollment. One of the major objectives in planning new schools and creating attendance areas is to plan an expansion that involves the least disruption of students and school reassignments (Lemberg and Church 2000). In a different example, to meet the fluctuations in daily emergency ‘911’ call demand in San Diego County, CA, American Medical Response operates between two and 24 ambulances during each day. Posting ambulances at various locations and dispatching them over time is a complicated logistical/spatial optimization problem. Rest assured that the patterns of deployment vary over the day in order to reduce response times, but just where should such posts be located to get the most service out of a limited number of vehicles? Since it costs over $500,000 per year to keep an ambulance in operation, it is important to find an efficient service plan. Geographic scale is an equally important issue in spatial optimization. There are typically three basic scales in spatial optimization problems: micro, meso, and macro. The macroscaled problem involves the highest level or area size (i.e., smallest map scale). At such a level, we might ask: ‘in which cities should we have factories or warehouses?’ We generally use no specific site information in making such a selection. 
Interactions between customers and facilities are modeled at an aggregated level. The objective is to identify those areas in which it makes the most sense to open a facility or activity. Such a decision might be considered ‘strategic.’ If we have a strategic plan to locate a regional warehouse in Reno, NV, the ‘meso’ level question is ‘where in Reno should it be?’ Such a problem would involve identifying the best available site, considering a wide range of criteria. This is a type of ‘tactical’ decision making and is often performed to support a strategic decision. Once we have a site or a place, the microscaled problem involves carving up that space and allocating it into specific activities. For example, we may allocate store space to specific retail items or a shop floor to machine tools. Even malls are arranged spatially in order to take advantage of anchor stores and walking patterns (Bean et al. 1988). How people use such facilities and interact with their environment is an especially important aspect of this problem (Underhill 1999). In the
industrial engineering literature, this type of problem is often referred to as a layout problem. Microscaled problems are akin to an ‘operational level’ problem.
1. Historical Roots of Spatial Optimization
Efficient land use activity was first discussed by Von Thunen (1842) when he described how agricultural activities including wood lots should be optimally arranged around an isolated village. Primary industries such as forestry now use spatial optimization to identify how each stand (tree type and age class) is to be managed over time. These management plans include the building of roads, the provision of environmental standards, and the return on investment (Church et al. 1998). In Chile, forest companies optimize nearly every feature of their industry, including the scheduling and routing of vehicles for log transport (so that logs arrive just when needed at loading facilities such as ports), the placement of harvest towers and cable ways for steep slope harvesting, and the minimization of land devoted to unproductive elements (e.g., roads). The US Forest Service has utilized a spatial optimization model called Forplan (and its successor SPECTRUM) for the last three decades in developing management plans (Kent et al. 1991) for virtually every national forest in the US. Alfred Weber proposed one of the earliest industrial location models in 1908 (Wesolowsky 1993). As a simple example, he described the location of a manufacturing point between two localized raw materials and a market. He considered that the three points could be plotted on a Cartesian plane as shown in Fig. 1. Transport distances between the points were assumed to be Euclidean. Weber discussed how, under certain assumptions, the ideal location was the one that minimized transport costs of the raw materials to the manufacturing point and the product to market. Weber showed how this model could be solved geometrically for certain problems. The first optimal solution for the more generalized Weber model involving many markets and raw materials was developed in 1937 by Weiszfeld and reinvented by Kuhn and Kuenne and Cooper in the 1960s (Wesolowsky 1993). 
Cooper subsequently proposed an extended model involving the location of multiple facilities. The multifacility Weber problem can now be solved optimally only for problems involving fewer than 100 points. In 1933 Walter Christaller (1966) wrote: ‘Just as there are laws which determine the life of the economy, there are special economic-geographic laws determining the arrangement of towns.’ Christaller attempted to find the laws that determined the number, size, and distribution of towns in a region of Southern Germany. Since that time geographers, archaeologists, and anthropologists have searched for Christaller’s central place patterns in other regions and in other applications, and have developed spatial optimization models that solve for the optimal spatial arrangement of towns and markets (Kuby 1989, Storbeck 1990). Virtually all early location theories have some type of spatial optimization counterpart today.

Figure 1 The classic Weber location triangle
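The iterative scheme usually attributed to Weiszfeld for the generalized Weber problem can be sketched in a few lines. This is a minimal illustration rather than a reproduction of any published implementation: the function names, the three example points, and the equal weights are all hypothetical, and the handling of an iterate that lands exactly on a demand point is a deliberate simplification.

```python
import math

def weiszfeld(points, weights, iterations=200, eps=1e-12):
    """Approximate the location minimizing the weighted sum of Euclidean distances."""
    total_w = sum(weights)
    # Start from the weighted centroid of the fixed points.
    x = sum(w * px for (px, _), w in zip(points, weights)) / total_w
    y = sum(w * py for (_, py), w in zip(points, weights)) / total_w
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for (px, py), w in zip(points, weights):
            d = math.hypot(x - px, y - py)
            if d < eps:
                # Simplification: stop if the iterate coincides with a fixed point.
                return (px, py)
            num_x += w * px / d
            num_y += w * py / d
            denom += w / d
        # New location: average of the points, each weighted by weight / distance.
        x, y = num_x / denom, num_y / denom
    return (x, y)

def transport_cost(location, points, weights):
    """Weighted sum of Euclidean distances from a location to all fixed points."""
    return sum(w * math.hypot(location[0] - px, location[1] - py)
               for (px, py), w in zip(points, weights))

# Hypothetical Weber triangle: two raw material sources and one market.
points = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
weights = [1.0, 1.0, 1.0]
best = weiszfeld(points, weights)
print(best, transport_cost(best, points, weights))
```

With equal weights and a triangle whose angles are all below 120 degrees, the iteration settles on the interior point at which the three connecting lines meet at equal angles, which is the geometric solution Weber described for the three-point case.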
2. Selected Examples of Spatial Optimization
A large number of spatial design/arrangement/allocation problems have been approached using some form of optimization modeling. Techniques used to solve optimization models are typically found in the operations research (OR) literature. The field of OR traces its roots to World War Two and the need to solve large-scale military logistics problems. One of the first OR models that was developed and can be solved optimally was the classical transportation problem. This problem involves a distributed set of supply points (called sources) and a distributed set of demand locations (called sinks). Each source has a finite capacity to supply a good and each sink has a specified amount of demand for that good that needs to be met. Between each unique pair of one source and one sink we can identify the most efficient route and the cost to ship one unit of that good between the source-sink pair. The objective of the classical transportation problem is to allocate the supplies to the demands so that the total transport cost is minimized and the demand at each sink is met. Garrison (1959) was the first geographer to discuss the utility of the transportation problem as a model to analyze spatial structure. Yeates (1963) and others have since used this spatial optimization model to analyze efficiency of an existing arrangement or to develop an optimal transportation plan.

Table 1 Examples of spatial optimization problems

| Problem name | Problem definition | General reference |
| --- | --- | --- |
| Corridor location problem | Across a landscape, identify the optimal route for a right-of-way for a facility such as a pipeline, highway, or power transmission line that connects an origin and destination | Smith et al. 1989; Goodchild 1977; Lombard and Church 1993 |
| Vehicle routing problem | Optimally route a vehicle from a starting point to visit a number of pre-specified locations and then return to the starting point | Bodin 1990; Weigel and Cao 1999 |
| Classical transportation problem | Allocate limited supplies at different locations to a number of spatially distributed demand points in such a manner as to minimize transportation costs of serving all demand | Yeates 1963; Camm et al. 1997 |
| Simple plant location problem | Locate a set of facilities and allocate their supplies to demand points in such a manner as to minimize facility and transportation costs | Love et al. 1988 |
| p-Median location problem | Locate p facilities so that the total weighted distance of serving all demand is minimized | ReVelle and Swain 1970 |
| Maximal covering location problem | Locate p facilities so that the largest proportion of demand is covered or served within some maximal distance or time | Church and ReVelle 1974; Schilling et al. 1993 |
| p-Hub location problem | Locate p transportation hubs and allocate demand to specific hubs in such a manner as to minimize total transportation costs | O’Kelly and Miller 1994 |
| Site configuration/design problem | Identify the best set of parcels or land units that optimizes costs and meets shape, contiguity, size, and suitability constraints | Wright et al. 1983; Cova and Church 2000 |
| Districting problem | Partition spatial units into a set of districts that optimize one or more criteria; examples: school districts, political districts, etc. | Woodall et al. 1980; Lemberg and Church 2000 |
| Spatial allocation problem | Allocate specific activities to defined spatial units | Hof and Bevers 1998 |
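In symbols, the classical transportation problem described in this section is commonly written as a linear program. The notation below is introduced here only for illustration and does not appear in the original: $s_i$ is the capacity of source $i$, $r_j$ is the demand required at sink $j$, $c_{ij}$ is the cost of shipping one unit from source $i$ to sink $j$ along the most efficient route, and $x_{ij}$ is the amount shipped.

```latex
\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{m} \sum_{j=1}^{k} c_{ij}\, x_{ij} \\
\text{subject to} \quad & \sum_{j=1}^{k} x_{ij} \le s_i && \text{for each source } i = 1, \dots, m \\
& \sum_{i=1}^{m} x_{ij} \ge r_j && \text{for each sink } j = 1, \dots, k \\
& x_{ij} \ge 0 && \text{for each source } i \text{ and sink } j
\end{aligned}
```

Because every variable is continuous and every constraint is linear, this sketch needs no integer restrictions, which is consistent with the problem being solvable optimally by standard linear programming.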
Table 1 gives a representative sample of spatial optimization problems that can be found in the literature, including the classical transportation problem. The list includes vehicle routing, districting, transportation, facility location, site search, and space allocation. For each of the 10 problems listed, the name, definition, and a few references are given. For each of these generic problems, a number of models have been developed. The references cite the original work, a review, or a recent work that contains an up-to-date list of references. All of the problems listed in Table 1, with the exception of the corridor location problem and the classical transportation problem, have a problem complexity that is called NP-hard (NP stands for Nondeterministic Polynomial). NP-hard problems are those for which it is widely believed that no procedure will be found that can solve all instances optimally within a reasonable computational time on a computer. One of the reasons for this is that these problems require the use of integer programming, an area of optimization that can be both computationally intensive and limited in terms of the size of an individual application that can be solved optimally. Because of this fact, it is common to see that different techniques are used for solving specific models and that heuristics are needed for most moderate to large sized problems. Heuristics are specialized solution procedures that are designed to approach problem solving using either a set of rules or some search strategy. An example of a heuristic is given in the next section.
3. A Detailed View of a Spatial Optimization Problem
Hakimi (1965) proposed a network location model called the p-median problem. Consider the network depicted in Fig. 2. Each node is connected to other nodes by arcs of given distances. Without any loss of generality, consider that each node represents a potential facility site as well as a point of demand. The measure of demand is represented as a demand weight (in units of people, tons, trips, or some other measure of need to be supplied by a facility). Assume that each demand will be supplied or served by its closest facility. The transportation cost is assumed to be directly proportional to the weight of a demand multiplied by the distance to the closest facility (note: this is the same assumption used in the Weber model). The p-median problem involves selecting the locations of p facilities so that the total weighted distance for all demand is minimized. Hakimi proved that at least one optimal solution to the p-median problem consists entirely of nodes of the network. Thus, the search for an optimal configuration can be limited to just the nodes of the network.

Figure 2 An example road network

Searching for the best nodal configuration, and hence an optimal configuration, seems simple. Why not enumerate all of the possible facility patterns, and pick the configuration with the lowest weighted distance? Unfortunately, it isn’t that simple. Consider, for example, that we wish to locate 10 facilities on a network of 100 nodes. Such an enumeration would involve generating and evaluating the following number of combinations:

$$\binom{100}{10} = \frac{100!}{10!\,(100-10)!} = \frac{100 \cdot 99 \cdot 98 \cdot 97 \cdots 1}{(10 \cdot 9 \cdot 8 \cdots 1)(90 \cdot 89 \cdot 88 \cdots 1)} = \frac{100 \cdot 99 \cdot 98 \cdot 97 \cdot 96 \cdots 91}{10 \cdot 9 \cdot 8 \cdot 7 \cdots 1} \approx 17{,}300{,}000{,}000{,}000$$

configurations or alternatives.
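The enumeration idea, and the reason it breaks down, can be illustrated with a short sketch. The four-node path network, its demand weights, and the function names are hypothetical and chosen only to keep the example small; the count for 100 nodes and 10 facilities uses the same binomial coefficient as the text.

```python
from itertools import combinations
from math import comb

def weighted_distance(facilities, demand, dist):
    """Total weighted distance when each node is served by its closest facility."""
    return sum(a * min(dist[i][j] for j in facilities)
               for i, a in enumerate(demand))

def enumerate_p_median(demand, dist, p):
    """Evaluate every set of p nodes and keep the best (feasible only for tiny networks)."""
    n = len(demand)
    return min(combinations(range(n), p),
               key=lambda f: weighted_distance(f, demand, dist))

# Hypothetical path network 0-1-2-3 with unit arc lengths (shortest-path distances).
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
demand = [2, 1, 1, 5]

best = enumerate_p_median(demand, dist, 2)
count_small = comb(4, 2)      # number of patterns examined for the toy network
count_large = comb(100, 10)   # patterns for 10 facilities on 100 nodes, about 1.73e13

print(best, weighted_distance(best, demand, dist), count_small, count_large)
```

The toy network needs only six evaluations, whereas the 100-node case needs roughly 17 trillion, which is why exhaustive enumeration is abandoned in favor of optimization models and heuristics.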
If a very fast computer could analyze 1,000 solutions each second, it would take 548 such computers working simultaneously to finish the task in a year. Thus, for many problems enumeration is impossible. In such cases, it is reasonable to consider optimization. ReVelle and Swain (1970) proposed the first optimal approach to solving the p-median problem. The basis of their approach is to formulate the p-median model as an integer-programming problem. Consider a connected transportation network comprised of $n$ nodes and at least $n - 1$ arcs (note: it requires at least $n - 1$ arcs to be connected). An example of a 12-node network is given in Fig. 2. In this review we will consider the most general form of the p-median problem where each node represents both a place of demand and a potential facility site. Given any pair of nodes on the network, a shortest path exists between the pair of nodes, and the distance of that path is easily calculated by a variety of efficient techniques. Consider the following notation:
$n$ = the number of nodes in the network
$i, j$ = indices used to reference nodes, where the nodes are numbered from 1 to $n$
$a_i$ = the amount of demand at node $i$
$d_{ij}$ = the distance from node $i$ to node $j$
$x_{ij}$ = 1 if demand at node $i$ assigns to a facility at node $j$; 0 otherwise
$x_{jj}$ = 1 if node $j$ self-assigns, indicating that a facility is located at node $j$; 0 otherwise

Using the above notation, we can define the optimization model in Table 2 (ReVelle and Swain 1970).

Table 2 Optimization model
Objective function:
$$\text{Minimize} \quad \sum_{i=1}^{n} \sum_{j=1}^{n} a_i\, d_{ij}\, x_{ij}$$
Subject to:
(1) Each demand must be served by either placing a facility at its node or by assigning to a facility elsewhere:
$$\sum_{j=1}^{n} x_{ij} = 1 \quad \text{for each node } i$$
(2) Assignment is possible only to open facilities:
$$x_{ij} \le x_{jj} \quad \text{for each node } i \text{ and } j \text{ where } i \ne j$$
(3) Exactly p facilities are located:
$$\sum_{j=1}^{n} x_{jj} = p$$
(4) Integer restrictions on variables:
$$x_{ij} \in \{0, 1\} \quad \text{for each node } i \text{ and } j$$

The model in Table 2 can be solved by the use of integer programming techniques, most notably, linear programming with branch and bound (LP/BB). In a relatively high percentage of problems, the optimal linear programming solution is integer optimal as well and the branch and bound routine is not used. This property is called ‘integer friendly.’ Since branch and bound algorithms may take considerable computational time to resolve noninteger solutions, finding an integer optimum without BB usually means relatively fast computational times for a given problem size. Unfortunately, the above model is large in terms of the number of variables and constraints ($n^2$ and $n^2 + 1$, respectively). Consequently there is a limit as to how large a problem can be solved by this specific model formulation and LP/BB. It is possible to formulate the p-median problem with a slightly different constraint set. Consider the following form of the second constraint (Table 3): the constraints that restrict demand assignments to only those nodes selected for a facility have been aggregated into one constraint for each facility. Consequently, this second model is more compact in that it contains only $2n + 1$ constraints rather than $n^2 + 1$ constraints.

Table 3 Optimization model: modified second constraint
(2) Assignments are possible to a facility only if it has been selected:
$$\sum_{i=1}^{n} x_{ij} \le n\, x_{jj} \quad \text{for each node } j$$
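The two forms of constraint (2) can be compared on small examples. The sketch below is only a checker, not a solver: the function names and both 4-node assignment matrices are hypothetical. It shows that the two forms agree on a 0-1 solution, while the aggregated form also admits a fractional point that the pairwise form rejects, which is one commonly given explanation for the weaker linear programming behavior of the compact model.

```python
def satisfies_pairwise(x):
    """Table 2 form: x[i][j] <= x[j][j] for every pair of nodes with i != j."""
    n = len(x)
    return all(x[i][j] <= x[j][j]
               for i in range(n) for j in range(n) if i != j)

def satisfies_aggregated(x):
    """Table 3 form: the sum over i of x[i][j] <= n * x[j][j] for every node j."""
    n = len(x)
    return all(sum(x[i][j] for i in range(n)) <= n * x[j][j]
               for j in range(n))

# Hypothetical 0-1 solution with p = 2: facilities at nodes 0 and 2.
integer_x = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]

# Hypothetical fractional point: every row sums to 1 and the diagonal sums to 2,
# but node 3 assigns fully to node 0, which is only "half open".
fractional_x = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]

print(satisfies_pairwise(integer_x), satisfies_aggregated(integer_x))
print(satisfies_pairwise(fractional_x), satisfies_aggregated(fractional_x))
```

Both forms accept the same 0-1 solutions, so the two models have the same integer optimum; the difference lies only in which fractional points the linear programming relaxation can return.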
As models are typically easier to solve when they have fewer constraints, this would at first seem to be an improvement. Unfortunately, this second model form is not ‘integer friendly’ and consequently the use of LP/BB typically relies on the BB algorithm and resulting solution times can be very large. The lesson here is that there are always alternate ways in which to formulate any given spatial optimization model, and the difficulty in solving a given problem can be in part due to the mathematical form of the problem. There are practical limits, however, as to the size of a problem that can be solved optimally. For large problems it is simply not possible to find an optimal solution with certainty. Because of this there has been a great reliance on heuristic solution procedures. Heuristics are often designed to exploit some type of search strategy. For example, one of the best known methods for solving the p-median problem is the ‘swap’ heuristic of Teitz and Bart (Church and Sorensen 1996). The ‘swap’ heuristic starts with a randomly generated solution. Given this starting solution, consider a candidate node as a possible replacement site for each of the currently selected nodes. If any swap involving this candidate node yields an improvement, the best swap with that candidate is made, creating a new and improved solution. Swaps are tested until no improvements can be made involving any node of the network. Repeating this procedure a few times has a good chance of finding a good, if not optimal, solution. Other strategies are based upon ‘survival-of-the-fittest’ genetic algorithms, simulated annealing and statistical mechanics, Monte Carlo sampling, etc. All heuristics are rated for their capabilities in finding good solutions.
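The swap procedure described above can be sketched as follows. This is a minimal reading of the verbal description, not the published Teitz and Bart code: the function names, the seed parameter, and the four-node path network are hypothetical, and the objective is recomputed from scratch for every trial swap, which a practical implementation would avoid.

```python
import random

def weighted_distance(facilities, demand, dist):
    """Total weighted distance when each node is served by its closest facility."""
    return sum(a * min(dist[i][j] for j in facilities)
               for i, a in enumerate(demand))

def swap_heuristic(demand, dist, p, seed=0):
    """Vertex-substitution search starting from a random set of p nodes."""
    n = len(demand)
    rng = random.Random(seed)
    current = set(rng.sample(range(n), p))
    best_cost = weighted_distance(current, demand, dist)
    improved = True
    while improved:                      # stop after a full pass with no improving swap
        improved = False
        for candidate in range(n):
            if candidate in current:
                continue
            # Try the candidate in place of each selected node; keep the best trial.
            trials = [(weighted_distance((current - {out}) | {candidate}, demand, dist), out)
                      for out in current]
            cost, out = min(trials)
            if cost < best_cost:
                current = (current - {out}) | {candidate}
                best_cost = cost
                improved = True
    return sorted(current), best_cost

# Hypothetical path network 0-1-2-3 with unit arc lengths (shortest-path distances).
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
demand = [2, 1, 1, 5]

solution, cost = swap_heuristic(demand, dist, p=2)
print(solution, cost)
```

Because each accepted swap strictly lowers the objective, the loop must terminate, but only at a local optimum; rerunning with different seeds corresponds to the repeated restarts mentioned in the text.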
4. Research Needs in Spatial Optimization
Research in spatial optimization has expanded over the years. This expansion is due in part to increasing computational power and availability of computers, better general-purpose commercial products like GIS and optimization software, fourth-generation computer language tools like Visual Studio and C++, and high-resolution color monitors. Some spatial optimization models are now a part of general-purpose GIS software. There are, however, a number of research directions that must be taken in order for such modeling tools to be commonly available. To make this a reality, research is needed in the following areas:
(a) develop better and more capable solution algorithms and heuristics. This is especially important in improving the capability to solve large problems of the simple plant location problem where the facilities have finite capacities;
(b) expand existing models and develop more comprehensive spatial optimization models, especially in ecological reserve design and land use planning;
(c) develop methods to better estimate the impact of data error and its propagation on model results. This is especially important since all geographic data sets of realistic size contain error;
(d) expand data model definitions to better link spatial optimization models with GIS. One application of considerable importance is the optimal location of cellular phone towers;
(e) develop better visualization methods along with graphical interface approaches that can support collaborative decision making processes;
(f) explore methods of data aggregation so that it may be possible to construct small problem representations that are easier to solve, but still retain all of the important problem characteristics.
Many industries and government agencies have been pushed to become more efficient. Increasing productivity has become a mantra of stockholder and citizen alike. In many settings new spatial optimization models are being proposed to help increase efficiency, promote equity, reduce environmental damages, and reduce energy needs. 
For example, new spatial/temporal optimization models that are being used for emergency ambulance operations can create plans that improve service response without additional investment, or keep service levels constant and cut costs. The economic incentive to improve services and reduce costs will expand the application base, which is already very large.
See also: Environmental and Resource Management; Industrial Ecology; Industrial Geography; Local Economic Development; Space and Social Theory in Geography; Spatial Equity; Spatial Pattern, Analysis of; Transportation Geography; Urban Places, Planning for the Use of: Design Guide
R. L. Church
Spatial Pattern, Analysis of
Spatial data consist of measurements or observations (data values) taken at specific locations (data sites) in a geographic space. Typically, such data are presented visually in the form of a map, although they may also be represented in other analog forms such as aerial photographs and remotely sensed images. Since most processes operate over both space and time, such representations may be considered as cross-sectional snapshots of reality. The relative locations of the data values in the geographic space define a spatial pattern. Spatial pattern analysis involves describing and explaining such patterns. In general, this is done by comparing characteristics of an empirical pattern with those of null models representative of spatial randomness (i.e., no discernible spatial order). Once identified, the distinctive characteristics of the empirical pattern can be used as evidence to gain a better understanding of the processes which generated the pattern. A classic example of this is provided by the nineteenth-century physician John Snow, who identified the cause of a cholera epidemic in London by observing that deaths were concentrated around a particular public water pump.
Methods of spatial pattern analysis can be classified in terms of the kinds of spatial data analyzed. Three types of spatial data are usually recognized: point data; geostatistical data; and lattice data. For each type, further classification can be achieved by considering if the data values are qualitative (categorical) or quantitative, and if they are univariate or multivariate. In addition, spatial pattern analysis originated in distinct academic communities. Point pattern techniques were developed in plant ecology, economic geography, and epidemiology; geostatistics in mining engineering and geology; and lattice data methods in econometrics. As a result, there are literally hundreds of different spatial pattern analysis techniques and almost as many texts describing them. Consequently, this essay will emphasize conceptual issues common to all techniques and illustrate them by outlining a widely used procedure for analyzing univariate, quantitative spatial data of each type. To simplify the presentation, it is assumed that the behavior of the process responsible for the pattern being analyzed is the same throughout the area under study. This condition is known as ‘stationarity.’ For more comprehensive treatments of spatial pattern analysis see Upton and Fingleton (1985), Cressie (1993), and Bailey and Gatrell (1995).
1. Point Data
Here the data sites consist of a finite number of point locations which themselves are the object of interest (e.g., the towns shown in Fig. 1). More formally, let $(s_1, s_2, \ldots, s_n)$, where $s_i = (s_{i1}, s_{i2})$, be the locations of $n$ events in a bounded study region $A$ of size $|A|$. Such patterns are typically analyzed by examining characteristics of either the number of events in subareas of $A$ or interevent distances $\|s_i - s_j\|$. The most popular method of analysis, K function analysis, developed by the statistician Brian Ripley (Ripley 1981), combines a consideration of both characteristics. The procedure analyzes the average number of events that lie within a distance $h$ of a randomly chosen event. For a given empirical pattern, the K function, $K(h)$, may be estimated using

$$\hat{K}(h) = \hat{\lambda}^{-1} \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \neq i}}^{n} w(s_i, s_j)^{-1} \, I(\|s_i - s_j\| \le h) \, / \, n, \qquad h > 0$$
Figure 1 Bus service centers in south-central England, 1955
Figure 2 Observed and expected $\hat{L}(h)$ for bus service centers shown in Fig. 1.
where $\hat{\lambda} = n/|A|$ is an estimate of $\lambda$, the intensity (mean number of points per unit area) of the point process, $I(\cdot)$ is an indicator function, equal to 1 when its argument holds and 0 otherwise, so that the double sum counts the events $s_j$ within distance $h$ of each event $s_i$, and $w(s_i, s_j)$ is a correction for edge effects. The last is required since a disk of radius $h$ may extend outside the boundary of $A$. Usually, $w(s_i, s_j)$ is set equal to the proportion of the circumference of a circle centered at $s_i$, passing through $s_j$, that is inside $A$. In order to evaluate $\hat{K}(h)$, it is compared to the $K(h)$ which would result if the events were located independently in a completely homogeneous environment. Such a pattern is known as complete spatial randomness (CSR). More formally, CSR is the realization of a homogeneous planar Poisson process. This process has two conditions: every location in the study region has an equal chance of receiving an event (uniformity); and the selection of a location for an event in no way influences the selection of locations for any other events (independence). Under CSR,

$$K(h) = \pi h^2$$

To facilitate the comparison with CSR, the statistic $L(h)$ is often used in place of $K(h)$, where

$$L(h) = \sqrt{K(h)/\pi} - h$$

The expected value of $L(h)$ under CSR is 0. When the events in an empirical pattern are more dispersed over $A$ than they would be in CSR (spatial segregation), the estimated value of $L(h)$, $\hat{L}(h)$, tends to be negative, whereas if the events are more clustered than expected under CSR (spatial clustering), $\hat{L}(h)$ tends to be positive. A test of the statistical significance of the difference between an empirical pattern and CSR can be made using a Monte Carlo simulation procedure. This involves simulating a set of $x$ CSR patterns, each with the same number of events as the empirical pattern, in a study area identical to that of the empirical pattern. A confidence envelope is then defined by the extreme maximum and minimum values of $\hat{L}(h)$ for the $x$ simulations. If $\hat{L}(h)$ for the empirical pattern extends outside of the confidence envelope, the null hypothesis of CSR is rejected (at a $1/(x+1)$ significance level) in favor of one of segregation or clustering, depending on the direction of the departure.
To illustrate the use of this technique, consider Fig. 1. This shows the locations of 74 central places in south-central England which provided bus services to surrounding rural areas in the 1950s. Figure 2 shows $\hat{L}(h)$ for this pattern together with a confidence envelope for CSR (based on 99 simulations). Note that for small values of $h$ (approximately two miles or less), $\hat{L}(h)$ is negative and extends outside the CSR envelope. This indicates that the bus service centers have an inhibitive effect on each other since they are significantly more dispersed (at the 99 percent confidence level) than would be expected if they were independent.
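The K function procedure of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used for Fig. 2: all edge weights $w(s_i, s_j)$ are set to 1 (no edge correction), the study region is assumed to be a rectangle, and the function names and the gridded example pattern are hypothetical. Because the same uncorrected estimator is applied to the observed and the simulated patterns, the envelope comparison remains like-for-like.

```python
import math
import random


def k_hat(points, h, area):
    """Estimate K(h) with every edge weight w(s_i, s_j) set to 1 (an assumption)."""
    n = len(points)
    intensity = n / area  # lambda-hat = n / |A|
    pairs = 0  # ordered pairs (i, j), i != j, no more than h apart
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= h:
                pairs += 1
    return pairs / (intensity * n)


def l_hat(points, h, area):
    """L(h) = sqrt(K(h) / pi) - h, whose expected value under CSR is 0."""
    return math.sqrt(k_hat(points, h, area) / math.pi) - h


def csr_envelope(n, h, width, height, sims=99, seed=0):
    """Minimum and maximum of L-hat(h) over simulated CSR patterns of n events."""
    rng = random.Random(seed)
    values = []
    for _ in range(sims):
        pattern = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        values.append(l_hat(pattern, h, width * height))
    return min(values), max(values)


# Hypothetical pattern: a regular 5 x 5 grid of events in a 10 x 10 region.
grid = [(1 + 2 * i, 1 + 2 * j) for i in range(5) for j in range(5)]
observed = l_hat(grid, 1.5, 100.0)
low, high = csr_envelope(len(grid), 1.5, 10.0, 10.0)
print(observed, low, high)
```

In the gridded example no two events are closer than 2 units, so the observed value at $h = 1.5$ falls below the simulated envelope, the same kind of departure (segregation) reported for the bus service centers at small $h$.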
2. Geostatistical Data
Geostatistical data consist of a set of $n$ data sites, $(s_1, s_2, \ldots, s_n)$, in a bounded study region $A$, at which the values, $(y_1, y_2, \ldots, y_n)$, of some spatially continuous variable $Y$ have been measured, i.e., the data are assumed to represent sample observations from a surface defined by $(s, y)$ (e.g., the snowfall data in Fig. 3). Geostatistical data can be analyzed by examining characteristics of the association between pairs of data values as some function of their relative location (e.g., the distance separating them). The most commonly used procedure, the semi-variogram, $\gamma(h)$, focuses on the variance of the data values. For empirical data, the semi-variogram is estimated for equidistant classes $h$ using the sample semi-variogram

$$\hat{\gamma}(h) = \frac{1}{2 n_h} \sum_{\|s_i - s_j\| = h} (y_i - y_j)^2$$

where the summation is over all pairs of data sites separated by distance $h$, and $n_h$ is the number of such pairs. In practice, if the data sites are irregularly spaced in $A$ (as in Fig. 3), there may only be one pair of data sites for which $\|s_i - s_j\| = h$. In this case, a distance interval $h \pm \Delta$ can be used in place of $h$. The maximum value of $\hat{\gamma}(h)$ is often referred to as the ‘sill’ and the distance at which this occurs is known as the ‘range.’ The range defines the distance at which the data values cease to be related. In theory, $\gamma(h) = 0$ when $h = 0$, since the underlying surface is assumed to be continuous in $A$. However, in empirical data it is sometimes seen that $\hat{\gamma}(h)$ does not approach 0 as $h \to 0$. This is known as the ‘nugget effect’ and is considered to represent the inherent error induced by measurement errors and by micro-scale environmental variability. If there is no spatial dependence in the data, the sample semi-variogram will consist of just a nugget effect, i.e., it is horizontal.
One of several parametric models can be fitted to the sample semi-variogram to provide a smooth, continuous description of the variance structure in the data. Such models can be used to predict the values of $Y$ for non-sampled locations using a procedure called ‘kriging’ (Cressie 1993). Figure 4 shows the sample semi-variogram for the data mapped in Fig. 3. For these data the value of the sill is 2,100, reached at a range of 480 kilometers, and there is no nugget effect. An exponential model of the form

$$\gamma(h) = \sigma^2 \left(1 - \exp(-3h/r)\right)$$

where $\sigma^2$ is the sill and $r$ is the range, has been fitted to the empirical data. The analysis indicates that while there is a clear spatial dependence in snowfall values at distances of up to about 500 kilometers, this is very marked for distances of up to 200 kilometers.

Figure 3 Thirty year averages (1961–90) of annual snowfall in centimeters at meteorological stations in western Canada, latitude 48.75–60 degrees north, longitude 94–119.5 degrees west. To simplify visualization of the data, the individual values are grouped into three classes.
Figure 4 Empirical semi-variogram (crosses) and model semi-variogram (line) for snowfall data in Fig. 3.

3. Lattice Data
In lattice data, the values, $(y_1, y_2, \ldots, y_n)$, of some variable $Y$ are recorded for each member of a set of $n$ areal units $(a_1, a_2, \ldots, a_n)$ which, taken collectively, cover the study region $A$, i.e., $a_1 \cup a_2 \cup \cdots \cup a_n = A$. The set of data values is considered to represent one possible realization of a spatial process operating over $A$. Lattice data are analyzed by examining characteristics of the association between pairs of data values as some function of their spatial association. This is referred to as ‘spatial autocorrelation.’ Since the data sites are areas, there are many different ways in which the spatial association $w_{ij}$ between any two data sites $a_i$, $a_j$ may be modeled. By far the most frequently used method is to set $w_{ij} = 1$ if $a_i$, $a_j$ share a common boundary, and $w_{ij} = 0$ if they do not. Moran’s I, the most widely used measure of spatial autocorrelation, is given by

$$I = \left( \frac{n}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}} \right) \left( \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \right)$$

where $\bar{y} = n^{-1} \sum_{i=1}^{n} y_i$. The first term on the right-hand side of this equation is a normalizing factor, so that the range of $I$ is approximately $-1$ to $1$. The expected value of Moran’s I under the assumption of independence in the spatial pattern of values of $Y$ (no spatial autocorrelation) is

$$E(I) = -\frac{1}{n - 1}$$

and the variance is

$$\mathrm{Var}(I) = \frac{n^2 S_1 - n S_2 + 3W^2}{W^2 (n^2 - 1)} - \frac{1}{(n - 1)^2}$$

where

$$W = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}; \quad S_1 = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (w_{ij} + w_{ji})^2; \quad S_2 = \sum_{i=1}^{n} (w_{i \cdot} + w_{\cdot i})^2; \quad w_{i \cdot} = \sum_{j=1}^{n} w_{ij}; \quad w_{\cdot j} = \sum_{i=1}^{n} w_{ij}.$$

An observed value of $I$ which is larger than its expectation under the assumption of spatial independence indicates positive spatial autocorrelation (i.e., similar values of $y_i$ are found in spatial juxtaposition). In contrast, negative spatial autocorrelation occurs when neighboring values of $y_i$ are mutually dissimilar, indicated by an observed value of Moran’s I which is smaller than its expectation. For data sets with $n$ of about 50 or more, $I$ is approximately normally distributed so that the standard score

$$z = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}}$$

may be used to determine if the empirical pattern displays significant spatial autocorrelation.
As an example of this approach, consider Fig. 5, which shows the percentage of the 1996 population whose mother tongue was German for each census tract in the twin cities of Kitchener-Waterloo in Ontario, Canada. For this data set, $n = 56$, $E(I) = -0.018$ and $\mathrm{Var}(I) = 0.0057$. While the observed value of $I = 0.115$ suggests that there is a tendency for census tracts with similar proportions of German speakers to cluster in space, the standard score of $z = 1.77$ indicates that the pattern of spatial association is not significantly different (at the 95 percent confidence level) from that which could arise by chance (i.e., no spatial autocorrelation). The measurement of spatial autocorrelation can be extended to higher-order spatial neighbors where the spatially associated areas are $(k - 1)$ intervening areas apart. A plot of the values of $I$ for different spatial lags $k$, the spatial correlogram, is useful for detecting scale variations in the spatial pattern (Upton and Fingleton 1985).

Figure 5 Percentage of census tract population with German mother tongue (Kitchener-Waterloo 1996)

4. Changing Emphases and Recent Developments in Spatial Pattern Analysis
Most recent developments in spatial pattern analysis have come in response to changes in two areas. The first involves characteristics of the typical spatial data set to be analyzed. Proliferation of automated monitoring devices, such as orbiting earth satellites, has meant that large (i.e., hundreds of thousands or more data sites) and often frequently updated data sets have become increasingly available. The second change is in computing; in particular, increases in computing power and memory capacity, coupled with lower hardware costs. One result of this is the appearance of more sophisticated geographical information systems capable of visualizing spatial data in a variety of ways.
The need to analyze large data sets has resulted in an increasing emphasis on spatial data exploration which, in turn, has led to a shift away from global methods of spatial pattern analysis to local ones (Getis and Ord 1996). While global methods attempt to summarize the entire spatial data set, local methods focus on searching for localized patterns within the data set by such means as identifying atypical locations (spatial outliers), clusters of abnormally high (hot spots) or low (cold spots) data values, and the presence of different patterns in sub-areas (spatial regimes). Since it is possible to generate a local statistic for each data site, the results of such analyses are best appreciated using some form of visualization (Hearnshaw and Unwin 1994).
For local statistics, the null models of spatial randomness, such as that of CSR described in Sect. 1, used to evaluate global statistics may not be appropriate. Instead, null hypotheses which take into account some form of pre-specified spatial structure may be more appropriate. These may include, for example, allowing for some degree of significant spatial autocorrelation in the overall pattern or permitting the characteristics of the causal process to vary over subareas within the study area. Alternatively, a null hypothesis might be specified to evaluate if events cluster around one or more focal locations (e.g., disease incidences around toxic waste sites). While the characteristics of the simpler null models can be derived analytically, this is often not possible for the newer models, resulting in their properties being estimated from repeated simulations of the model processes.
Frequent updating of data sets has spurred an interest in real-time surveillance of spatial patterns. This involves analyzing new data, as they become available, to detect any significant changes that have occurred in the characteristics of an existing spatial pattern (Rogerson 1997). Since all of these innovations in spatial pattern analysis impose heavy computational demands, they have only become practical with the developments in computing previously noted. Indeed, in some situations it is possible that the amount of collected data may exceed human analytical ability. This has stimulated interest in the creation of automated procedures for spatial pattern analysis (Openshaw 1998). In particular, a new research field called spatial data mining, which integrates aspects of machine learning, database systems, data visualization, information theory, and spatial statistics, has evolved with the aim of developing efficient methods for identifying the characteristics of very large spatial data sets.
See also: Spatial data; Spatial autocorrelation; Spatial association, measures of; GIS, critical approaches; Spatial analysis, in geography; Spatial sampling; Spatial statistical methods
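As a concrete companion to the global statistic of Sect. 3, against which the local methods of Sect. 4 are contrasted, the following Python sketch computes Moran's I together with its expectation, variance, and standard score under the independence assumption. The function names and the four-unit chain of areal units are hypothetical and chosen only so the arithmetic can be checked by hand; with so few units the normal approximation would not be appropriate in practice.

```python
import math


def morans_i(y, w):
    """Moran's I for values y and an n x n spatial weights matrix w."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    w_total = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / w_total) * (cross / sum(d * d for d in dev))


def moran_moments(w):
    """E(I) and Var(I) under the assumption of no spatial autocorrelation."""
    n = len(w)
    w_total = sum(sum(row) for row in w)
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    # Row sum plus column sum for each areal unit, squared and added up.
    s2 = sum((sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n))
    expected = -1.0 / (n - 1)
    variance = ((n * n * s1 - n * s2 + 3 * w_total ** 2)
                / (w_total ** 2 * (n * n - 1))) - expected ** 2
    return expected, variance


# Hypothetical lattice: four areal units in a row, neighbors share a boundary.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
y = [1.0, 2.0, 3.0, 4.0]

i_obs = morans_i(y, w)
e_i, var_i = moran_moments(w)
z = (i_obs - e_i) / math.sqrt(var_i)
print(i_obs, e_i, var_i, z)
```

Applying the same standard-score formula to the rounded values quoted for Fig. 5, $(0.115 + 0.018)/\sqrt{0.0057}$, gives about 1.76, in line with the reported 1.77 once rounding of the inputs is allowed for.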
Bibliography
Bailey T C, Gatrell A C 1995 Interactive Spatial Data Analysis. Longman Scientific and Technical, Harlow, UK
Cressie N 1993 Statistics for Spatial Data, rev. edn. John Wiley and Sons, Inc., New York
Getis A, Ord J K 1996 Local spatial statistics: an overview. In: Longley P, Batty M (eds.) Spatial Analysis: Modelling in a GIS Environment. GeoInformation International, Cambridge, UK
Hearnshaw H M, Unwin D J (eds.) 1994 Visualization in Geographical Information Systems. John Wiley and Sons, Inc., Chichester, UK
Openshaw S 1998 Building automated geographical analysis and explanation machines. In: Longley P A, Brooks S M, McDonnell R, Macmillan B (eds.) Geocomputation: a Primer. John Wiley and Sons, Inc., Chichester, UK
Ripley B D 1981 Spatial Statistics. John Wiley and Sons, Inc., New York
Rogerson P 1997 Surveillance systems for monitoring the development of spatial patterns. Statistics in Medicine 16: 2081–93
Upton G J G, Fingleton B 1985 Spatial Data Analysis by Example, Vol. 1: Point Pattern and Quantitative Data. John Wiley and Sons, Inc., Chichester, UK
B. Boots
Spatial Sampling
Sampling is undertaken in order to estimate an attribute of a variable of interest in a population. Sampling is used instead of undertaking a complete census for many reasons. For example, undertaking a complete census may be impossible because the population is very, perhaps infinitely, large, or it may be impractical because of expense. Some census data, for example household employment data in the UK, are only made available as a 10 percent sample from the 100 percent enumeration, partly to help preserve confidentiality but also because the data require manual coding, which is expensive. In many situations a census is unnecessary because a value for the attribute is only required, or can only be measured, up to some predetermined level of precision, and a sample size can be specified that will achieve that level of precision. Sampling is also undertaken to improve the accuracy of census data because some methods of recording are known to produce undercounts (e.g., in the case of crimes, not all events are reported to the police) and some methods of enumeration miss certain groups in the population (e.g., the homeless in the case of national censuses).
The term spatial sampling is used to describe the activity of sampling when the population is spatially or geographically distributed. All the individuals in the population have a georeference or geocode that identifies their location with respect to some local or global frame of reference. The population might be a surface of land-use values or ground-level air pollution values. In this case the population is the set of all points in the region and is infinitely large. The population might be the set of all households in a city, in which case the population, whilst not infinite, is nonetheless very large. The population might refer to spatial aggregates.
For example, the population of households might be bundled into census units so that the population is the set of such units for which aggregate quantities are of interest. Spatial sampling is usually undertaken by recording the data value at a point location, such as a position on a line or a surface, or by recording some aggregate quantity for a polygon such as a pixel, a quadrat, or a census tract. In some spatial sampling a household is treated as a point (is the head of household unemployed?), in others as an area-based quantity (number of persons per room).
Designing a spatial sample to estimate an attribute involves selection of the required sample size ($n$) and specification of the sample locations $(s_1, \ldots, s_n)$. The latter is called the sampling plan. The sampled data $z = \{z(s_1), \ldots, z(s_n)\}$ are used to estimate the attribute, which might be a ‘nonspatial’ attribute such as the mean value of the variable $Z$, the proportion of the population with a given set of values on $Z$, or the maximum value of $Z$. The attribute might be a ‘spatial’ attribute such as the autocorrelation function, semivariogram, or a map of the variable constructed using optimal interpolation. It is necessary to choose an appropriate estimator of the quantity and specify a measure of closeness (with respect to the attribute) that the sampling design seeks to minimize. A measure of closeness might, for example, be the sampling variance.
There may be other issues that affect spatial sampling designs. There may, for reasons associated with cost constraints, be a trade-off that needs to be recognized between the greater levels of precision obtainable by increasing the sample size and the costs associated with collecting larger volumes of data. Samples designed to deliver acceptable levels of sampling error for a national survey will not achieve equivalent levels at smaller state or county-level scales without a larger sample and without stratification. Hence there needs to be some assessment of the benefits of having information at the smaller scale set against the costs of a larger sample. One solution to this problem, which can be used where some estimate is available at the local level, is to adapt methods based on ‘borrowing strength.’ National surveys yield precise estimates of national rates, but these are biased estimators of local area rates because local areas have different rates. Estimators that ‘borrow strength’ combine these two sources of information by using weights that reflect the relative precision of the two estimators (Ghosh and Rao 1994, Longford 1999).
In disease mapping, ‘borrowing strength’ methods have been further adapted so that instead of borrowing strength from the national (or ‘region wide’) estimate, it is only borrowed from the geographically adjacent region (Elliott et al. 1992). There may be particular problems of accessibility to particular sample areas as well as the collection costs associated with adopting a dispersed distribution of sampling locations versus a more clustered distribution. Other issues may also need to be considered, such as whether the sample design is required to undertake long-term monitoring and whether there are to be several measured attributes recorded at each site so that some compromise is required between the different estimators (Haining 1990). The form of the sampling problem may be to decide how best to add to, or conversely rationalize, an existing sampling network where there may be costs associated with any reconfiguration (Arbia and Lafratta 1997).
There are situations where some spatial object such as a polygon or a line is to be estimated, in which case the sample design is selected in order to estimate or represent characteristics of the object, such as the area of a polygon or the location of a line. In geographic information system databases the representation and resolution of boundary lines between polygons is the outcome of a sampling process. The precision with which any boundary or shape can be represented is a function of the spacing of the sample points (Griffith 1989).
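The ‘borrowing strength’ idea described earlier in this section, combining a noisy local estimate with a precise but locally biased national estimate, can be illustrated with a short Python sketch. The particular weight used here, the between-area variance divided by the sum of the between-area variance and the local sampling variance, is an assumption chosen as one common way of reflecting relative precision; it is not taken from the works cited, and the function name and numbers are hypothetical.

```python
def composite_estimate(local_rate, local_var, national_rate, between_area_var):
    """Weighted combination of a local and a national estimate.

    Assumed weighting: the local estimate receives weight
    between_area_var / (between_area_var + local_var), so a noisy local
    estimate is pulled strongly toward the national rate.
    """
    weight = between_area_var / (between_area_var + local_var)
    combined = weight * local_rate + (1.0 - weight) * national_rate
    return combined, weight


# Hypothetical figures: a small area with an imprecise local rate of 0.10,
# a national rate of 0.06, and modest variation between areas.
estimate, weight = composite_estimate(
    local_rate=0.10, local_var=0.0004,
    national_rate=0.06, between_area_var=0.0001,
)
print(estimate, weight)
```

With these figures the local estimate receives a weight of about 0.2, so the combined rate lies much closer to the national rate; a larger local sample (smaller `local_var`) would shift the weight back toward the local estimate.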
1. Design and Model Based Approaches to Spatial Sampling
Suppose a region $R$ is divided into a population of $N$ units, where $N$ might be very large. In the design-based approach it is assumed that $z(s_i)$ is the value of $Z$ observed at the point $s_i$, which is one of the $N$ points in $R$. The value of $Z$ at each of the $N$ units is considered fixed. We let $m(R)$ denote the mean of the population of these $N$ values. The problem is to estimate the parameter $m(R)$, and in the design-based approach $Z$ is observed at $n$ sample points where these sites have been selected according to some sampling plan in which units have a known probability of being selected. Repeated sampling according to this scheme generates a distribution of values of the vector $z$. This distribution together with an estimator of $m(R)$ defines a (design-based) sampling strategy. This estimator is used to make inferences about $m(R)$ because, as it is based on $z$, it too will have a distribution of values under repeated sampling.
The model-based approach assumes an underlying random process with an assumed probability distribution that is generating the observed set of values in the $N$ units of $R$. (See Spatial Autocorrelation; Spatial Statistics.) So the value $z(s_i)$ is the value of a random variable: it is only one of a distribution of values that could have been realized at location $s_i$. It is usual to specify properties of the model generating the $N$ data values, including the first-order (mean, $\mu$) and second-order (variance-covariance, $\Sigma$) properties. Now $m(R)$ is not a parameter but a random variable because it is a function of $N$ random variables. To make the distinction clear, this mean will be labeled as $M(R)$ in the model-based case. It could also be of interest to estimate parameters such as the mean, $\mu$, of the underlying model.
In the model-based approach the estimator of $\mu$ and the estimator (strictly speaking the predictor) of $M(R)$ both incorporate the spatial model generating the data explicitly in order to produce good estimates (e.g., Haining 1988, Cressie 1991). For example, the best linear unbiased predictor of $M(R)$ is of the form:

$$\sum_{i=1}^{n} a(i) \, z(s_i) \qquad (1)$$

where the $\{a(i)\}$ depend on the model and are chosen to minimize the mean square prediction error. In this approach the sampling plan is of secondary interest.
By contrast, in the design-based approach, the sample mean ($a(i) = 1/n$ in Eqn. (1)) is used to estimate $m(R)$, having been shown to have good properties (Bellhouse 1977). Attention focuses here on the sampling plan in order to identify which one will have the lowest sampling variance or sampling error. The advantage of this approach is that the estimator is easy to compute and it does not depend on specifying the underlying model. The model-based approach assumes a hypothetical super-population of possible (but unrealized) maps. The design-based approach is only concerned with the here and now of what has occurred. Perhaps more importantly from a conceptual perspective, the model-based approach can be used to estimate the parameters of the underlying signal (the parameters of the map generator, like $\mu$) rather than the particular (noisy) map. For example, in relative risk estimation (also called disease mapping), it is a quantity like $\mu$ rather than $M(R)$ that is usually of interest (Elliott et al. 1992).
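The design-based reasoning of this section, in which the population values are fixed and the only randomness comes from the sampling plan, can be sketched in Python as follows. The population of 100 values, the sample size, and the function names are hypothetical; the sketch draws repeated simple random samples, applies the linear form of Eqn. (1) with equal weights $a(i) = 1/n$, and summarizes the resulting distribution of estimates.

```python
import random
import statistics


def linear_estimate(sample, weights=None):
    """Eqn. (1): sum of a(i) * z(s_i); equal weights 1/n give the sample mean."""
    n = len(sample)
    if weights is None:
        weights = [1.0 / n] * n
    return sum(a * z for a, z in zip(weights, sample))


def repeated_sampling(population, n, repeats, seed=0):
    """Estimates of m(R) from repeated simple random samples of size n."""
    rng = random.Random(seed)
    return [linear_estimate(rng.sample(population, n)) for _ in range(repeats)]


# Hypothetical fixed population: N = 100 units with values 1, 2, ..., 100.
population = [float(v) for v in range(1, 101)]
m_r = statistics.fmean(population)  # the fixed population mean m(R)

estimates = repeated_sampling(population, n=10, repeats=2000)
print(m_r, statistics.fmean(estimates), statistics.pvariance(estimates))
```

The average of the repeated estimates should sit close to $m(R)$, and their spread is the sampling variance that a design-based comparison of sampling plans seeks to minimize. A model-based predictor would instead pass model-derived weights to `linear_estimate`.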
2. Sampling Plans
There are a number of different ways to select the sites to be sampled. The main classes of sampling plan are random, stratified random, and systematic (Ripley 1981). We consider each of these in turn before turning to some other schemes.
In the case of random sampling, the n sample sites are selected so that each member of the population has an equal and independent chance of selection. Figure 1(a) shows a random sampling plan on a surface where each point on the surface had an equal chance of being selected. As has been commented before, a random sample often gives the appearance of some clustering in the distribution of selected sites.
Figure 1(b) shows a stratified random sampling plan on the same surface, which has been divided into nine strata. Within each stratum, one sample site has been selected according to the rule for a random sampling plan, but more than one could be selected.
Figure 1(c) shows one type of systematic sampling: the centric systematic sampling plan. The region has been divided into square strata and the central point in each stratum selected. This method of sampling contains no randomization. However, randomization can be introduced by, for example, randomizing the location of the first sample point (the random start method). The other (n − 1) sample points occupy the same relative locations in the remaining (n − 1) strata. There are a number of variants of the systematic sampling scheme (Koop 1990).
Stratified random and systematic sampling both raise questions about the size and shape of the strata to use. Shapes such as the hexagon or the triangle could be used for the strata. An understanding of the process generating the quantities of interest may help in designing these parameters of the sampling plan (Cressie 1991). A pilot survey may be important in helping to decide on the size of the strata and hence what density of points is needed.
[Figure 1]
[Figure 2]
In addition to these sampling plans there are others that may be appropriate in particular circumstances, although there are sometimes special problems associated with computing sampling variances and hence interpreting results. Cluster sampling involves grouping the population into clusters (Lyberg and Cassell). These clusters are often defined in such a way that the subsets of individuals comprising the population are spatially close together within any one cluster. These clusters are sampled and then the sample is drawn by selecting, usually at random, individuals from within the selected clusters. This sampling method is often
cheaper and administratively easier to implement. Kahn and Sempos (1989) give the example of a health survey involving a large number of economically similar but geographically dispersed villages. Instead of drawing the sample of size n from all the villages (which could involve visiting a large number of them, incurring considerable time and travel costs), a subsample of villages is drawn and the sample of size n is drawn from within this subset of villages. Nested sampling may be used where variation in the quantity of interest is associated with different spatial scales, for example, different administrative tiers of a political system. Nested sampling may be employed, again to help keep down the time and cost of the
sampling exercise. The area is subdivided at the first (highest) tier of the system and units are chosen at random (similar to cluster sampling). Then these selected areas are subdivided at the second tier and these in turn are sampled, and so on until the lowest tier is reached. Within the selected areas at the lowest tier, individuals are sampled (Fig. 2(a)). As can be seen from Fig. 2(a), this method does not control for distance between the sample points, so that if scales of variation vary continuously (rather than in terms of discrete hierarchical levels), as might be likely to arise with a surface of values, then this method of sampling may obscure the different scales of variation. Fixed interval point sampling, as illustrated in Fig. 2(b), avoids this problem. The distance between sample points is fixed at those intervals the investigator wishes to study closely (Oliver and Webster 1986).
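The three main plans of Fig. 1 can be sketched in a few lines of code. The following is a minimal illustration rather than a definitive implementation: it assumes the region is the unit square divided into a k × k grid of square strata, and the function names are hypothetical.

```python
import random

def random_plan(n, rng):
    """Simple random sampling: n independent uniform points in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def stratified_random_plan(k, rng):
    """One uniformly placed point in each cell of a k x k grid of square strata."""
    w = 1.0 / k  # side length of one stratum
    return [((i + rng.random()) * w, (j + rng.random()) * w)
            for i in range(k) for j in range(k)]

def systematic_plan(k, rng=None):
    """Systematic sampling on a k x k grid of square strata.

    With rng=None the point is the stratum centre (centric systematic plan).
    With an rng, one random offset is drawn and reused in every stratum
    (random start method).
    """
    w = 1.0 / k
    u, v = (0.5, 0.5) if rng is None else (rng.random(), rng.random())
    return [((i + u) * w, (j + v) * w) for i in range(k) for j in range(k)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
plans = {
    "random": random_plan(9, rng),
    "stratified random": stratified_random_plan(3, rng),
    "centric systematic": systematic_plan(3),
    "random start systematic": systematic_plan(3, rng),
}
for name, points in plans.items():
    print(name, [(round(x, 2), round(y, 2)) for x, y in points])
```

Each plan returns nine sites, matching the nine strata of Fig. 1(b). Only the random plan can place two sites in the same stratum or leave a stratum empty.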
3. Spatial Variation and Sampling Plans
Spatial autocorrelation is the term used to describe the presence of systematic spatial variation in a variable, and positive spatial autocorrelation, which is most often encountered in practical situations, is the tendency for areas or sites that are close together to have similar values. The presence of spatial autocorrelation in a population to be sampled has important implications for the sampling variance of the estimator. It is, however, necessary to distinguish between estimating surface properties in the case of the design-based approach (for example, estimating m(R)) and estimating underlying parameters in the case of the model-based approach (for example, estimating µ). We consider the design-based approach first.
If the purpose is to estimate m(R) then the presence of positive spatial autocorrelation implies that each sample point carries information not only about its own sample site but also some information about neighboring site values. In the case of random sampling, the sampling variance of the sample mean (as the estimator of m(R)) for any given sample size will be lower than that arising in the case of a population of independent individuals (Griffith et al. 1994). However, this does raise the question as to whether other sampling plans might do better. Consider the three sampling plans described in Fig. 1. If the surface of values is positively spatially autocorrelated, one of the problems with random sampling is that, because adjacent values will be similar, there will be information redundancy in the case of those sample points that happen to be close together. It is also evident from Fig. 1(a) that the random sampling plan leaves areas unsampled. Both these problems with the random sampling plan are reduced to some extent by the introduction of strata, but as is clear in Fig. 1(b) there are still some sample points that are close together, resulting in information redundancy, and there are still some areas of the map not sampled, albeit smaller in extent. Systematic sampling, especially using the centric or the random start methods, reduces both effects. Intuitively we therefore expect that, for given sample size n, systematic sampling ought to produce estimates with lower sampling variances of a quantity such as m(R). This, by and large, proves to be the case because separating the sample sites improves the information content of the sample. However, while the gains of moving from random to either stratified random or systematic sampling are considerable, the gains of systematic sampling over stratified random sampling are generally much smaller and seem to depend quite critically on the specific form of the underlying spatial autocorrelation (Ripley 1981, Matern 1986, Dunn and Harrison 1993).
The sampling variance is estimated in the case of random sampling using:
$$\sum_{i=1}^{n} [z(s_i) - \bar{Z}]^2 / (n - 1) \qquad (2)$$
where
$$\bar{Z} = \frac{1}{n} \sum_{i=1}^{n} z(s_i)$$
Equation (2) provides an unbiased estimator in this case and Ripley (1981) discusses related methods for estimating the sampling variance under stratified random and systematic sampling.
The presence of spatial autocorrelation has implications for the other sampling plans. In the case of cluster sampling a critical factor is between-cluster and within-cluster variation. If the clusters are just random assemblages of individuals in the population then cluster sampling (with random selection of individuals from within each cluster) ought not to produce sampling variances any greater than, say, random sampling. If all the clusters differ greatly then selecting a sample of them will not lead to good population estimates. If clusters have internal positive spatial autocorrelation (so that they are more homogeneous than the population as a whole) then cluster-sampling variance is likely to be bigger than that of (simple) random sampling. In the case of (hierarchical) nested sampling, if the population shows spatial autocorrelation then, again, there will be information redundancy because some sample sites are close together. This effect may be somewhat less of a problem in the case of fixed interval point sampling as a result of keeping sample sites apart.
In the model-based approach, if the purpose is to estimate µ (of a spatially autocorrelated model) using the sample mean, then this estimator’s sampling variance is inflated relative to its sampling variance under the assumption of a population that is independent. If Eqn. (2) is used as the estimator of
the sampling variance it can severely underestimate the true sampling variance of the sample mean. Although the sample mean is an unbiased estimator of µ, the best linear unbiased estimator (BLUE) is usually preferred, although it requires the correct specification of the underlying spatial model (see Eqn. (1)). As noted above, the choice of sampling plan tends to be of secondary interest in this area of analysis, but careful specification of the required sample size is needed if parameters are to be estimated to the desired level of precision (Haining 1988).
In addition to undertaking sampling to estimate non-spatial properties of an attribute, sampling may be needed in order to estimate spatial properties (the correlogram or semi-variogram) or to construct maps based on optimal interpolation. Evidence suggests that, particularly for optimal interpolation and for constructing contour maps for a variable, centric systematic sampling grids are best, with a density chosen to achieve required levels of precision and with a rectangular grid selected for practical reasons (Webster and Burgess 1981). Interpolation error will vary across the map as a function of the density of sample points and the effect of the survey boundary. If anisotropy is present (directional differences in the autocorrelation structure) then both alignment and grid specification for the systematic sampling will need to be adjusted (McBratney et al. 1981).
See also: Sample Surveys, History of; Sample Surveys: Methods; Sample Surveys: Model-based Approaches; Sample Surveys: Nonprobability Sampling; Spatial Analysis in Geography; Spatial Association, Measures of; Spatial Autocorrelation; Spatial Data; Spatial Data Infrastructure; Spatial Pattern, Analysis of; Spatial Search Models; Spatial Statistical Methods; Spatial-temporal Modeling
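The design-based argument of Sect. 3, that strata reduce the sampling variance of the sample mean when nearby values are similar, can be checked by simulation. The sketch below is illustrative only. The surface z(x, y) = x + y on the unit square is a hypothetical stand-in for a surface whose nearby values are similar, and all function names are assumptions.

```python
import random
import statistics

def surface(x, y):
    """Hypothetical smooth surface on the unit square; nearby values are similar."""
    return x + y

def sample_mean(points):
    """Design-based estimator of the surface mean m(R)."""
    return sum(surface(x, y) for x, y in points) / len(points)

def random_plan(n, rng):
    """Simple random sampling of n sites."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def stratified_plan(k, rng):
    """One random site in each of k x k square strata."""
    w = 1.0 / k
    return [((i + rng.random()) * w, (j + rng.random()) * w)
            for i in range(k) for j in range(k)]

def design_variance(plan, reps, rng):
    """Variance of the sample mean over repeated draws of the sampling plan."""
    return statistics.pvariance([sample_mean(plan(rng)) for _ in range(reps)])

rng = random.Random(42)  # fixed seed for reproducibility
var_random = design_variance(lambda r: random_plan(9, r), 5000, rng)
var_stratified = design_variance(lambda r: stratified_plan(3, r), 5000, rng)
print("random:", var_random, "stratified random:", var_stratified)
```

For this particular surface the two design variances can also be worked out by hand: 1/54 for nine random sites and 1/486 for one site in each of nine strata, so the simulated values should be close to those figures. Other surfaces will give different ratios.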
Bibliography
Arbia G, Lafratta G 1997 Evaluating and updating the sample design in repeated environmental surveys: Monitoring air quality in Padua. Journal of Agricultural, Biological and Environmental Statistics 2: 451–66
Bellhouse D R 1977 Some optimal designs for sampling in two dimensions. Biometrika 64: 605–11
Cressie N 1991 Statistics for Spatial Data. Wiley, New York
Dunn R, Harrison A R 1993 Two-dimensional systematic sampling of land use. Applied Statistics 42: 585–601
Elliott P, Cuzick J, English D, Stern R 1992 Geographical and Environmental Epidemiology: Methods for Small Area Studies. Oxford University Press, Oxford, UK
Ghosh M, Rao J N K 1994 Small area estimation: An appraisal. Statistical Science 9: 55–93
Griffith D A 1989 Distance calculations and errors in geographic databases. In: Goodchild M, Gopal S (eds.) Accuracy of Spatial Databases. Taylor and Francis, New York, pp. 81–90
Griffith D A, Haining R P, Arbia G 1994 Heterogeneity in attribute sampling error in spatial data sets. Geographical Analysis 26: 300–20
Haining R P 1988 Estimating spatial means with an application to remotely sensed data. Communications in Statistics: Theory and Methods 17: 573–97
Haining R P 1990 Spatial Data Analysis in the Social and Environmental Sciences. Cambridge University Press, Cambridge, UK
Kahn H A, Sempos C T 1989 Statistical Methods in Epidemiology. Oxford University Press, Oxford, UK
Koop J C 1990 Systematic sampling of two-dimensional surfaces and related problems. Communications in Statistics: Theory and Methods 19: 1701–50
Longford N T 1999 Multivariate shrinkage estimation of small area means and proportions. Journal of the Royal Statistical Society, A 162: 227–45
Matern B 1986 Spatial Variation, 2nd edn. Springer Verlag, Berlin, Germany
McBratney A B, Webster R, Burgess T M 1981 The design of optimal sampling schemes for local estimation and mapping of regionalized variables: I. Theory and method. Computers and Geosciences 7: 331–4
Oliver M A, Webster R 1986 Semi-variograms for modeling the spatial pattern of landform and soil properties. Earth Surface Processes and Landforms 11: 491–504
Ripley B 1981 Spatial Statistics. Wiley, New York
Webster R, Burgess T M 1981 Optimal interpolation and isarithmic mapping of soil properties: III. Changing drift and universal kriging. Journal of Soil Sciences 32: 505–24
R. P. Haining
Spatial Search Models
Spatial search is recognized as an important construct within behavioral geography because it is the preliminary, information acquisition phase of most locational decision processes. Spatial search is the task of identifying and investigating spatially distributed choice alternatives that serve as the basis for a locational decision. Spatial search processes range from purely mechanical, space covering procedures characteristic of remote sensing methods through complex, multi-stage searches drawing on indirect information sources as well as direct experience (the search for a new residence or a new place of employment, for example). No less important is the ongoing acquisition of geographic information which occurs as people move about their environment, consciously and unconsciously creating a practical knowledge base which they draw upon to inform the myriad locational decisions made during the normal course of a day (way finding and shopping decisions, for example); for a detailed review of the spatial search literature and for further references see Golledge and Stimson (1997). Whether searching for a new place to live or shopping for a new pair of shoes, the set of locations visited during the search process describes a search pattern with predictable spatial characteristics and
important locational consequences. Four spatial regularities in the search patterns are particularly important: search patterns are limited in geographic scope and search effort is spatially persistent; the search patterns reflect the underlying spatial distribution of choice alternatives within the geographically limited search domain; the search pattern reflects the spatial biases in the information sources used to identify potential choice alternatives; and the search pattern is oriented around key locations or anchor points in the decision-maker’s activity space. By implication, the spatial regularities and biases observed in residential mobility patterns, local labor mobility patterns, and in shopping patterns are reflections of systematic spatial regularities in the underlying search processes and in their associated search patterns.
Spatial search emerged as a major theoretical construct in behavioral geography in the 1960s as attention shifted from descriptive questions relating to the patterns of geographic activity to an interest in understanding the spatial choice and locational decision-making processes generating the geographic pattern. The early conceptualizations of the locational decision-making process by Wolpert (1965) and Brown and Moore (1970) emphasized the importance of understanding how search decisions affect both the probability of success and the geographic distribution of the locational alternatives included in the decision maker’s choice set, thereby affecting the location of the final choice. They argued that the first key decision in a locational decision process is the decision to begin active search for attractive alternatives. The decision to begin active search implies that: the decision maker is dissatisfied with the current situation; better alternatives are believed to exist within the decision maker’s prospective search domain; and the expected returns to search are greater than the expected search costs.
Subsequent work on the decision to begin the search process emphasizes the context dependent nature of the decision and the need to consider the prior knowledge and experience which the decision-maker brings to the spatial search decision. Three search questions are triggered by the decision to begin a spatial search; the decision-maker must decide: which information sources should be used; where search effort should be focused; and when to terminate the search. Subsequent theoretical and empirical work on spatial search has been primarily concerned with the locational consequences of the decisions made in response to one or more of these three questions. Each of these questions has a geographic consequence and each decision is affected by the interaction between the locational preferences of the individual and the decision-making context. The delineation of a geographically bounded search domain reflects the locational preferences of the decision-maker but is also closely tied to the choice of information source or sources used to identify search
possibilities. The primary information source affecting the delineation of the search domain is the practical knowledge base that the individual brings to the search task. If an area is familiar to a person, search is more likely in that area; the search costs are likely to be lower as is the risk of making a bad choice. The search domain tends to be defined in terms of the individual’s activity and the awareness space that varies in geographic scope depending upon the age, class, and ethnicity of the individual. Similar observations pertain for personalized information sources such as friends and relatives. Households relying on personalized sources of information are found to have much smaller search domains than individuals relying on impersonal sources such as newspapers and directories or professional sources such as real estate agents or employment agencies (Clark and Smith 1979). In these instances, the search domain is dictated by the geographic reach of the information source. The role of information channels as sources of bias in the search process has received considerable attention in the racial segregation literature (Courant 1978). These investigations have focused specifically on racial steering and the potential role of real estate agents in creating racially homogeneous neighborhoods through selective filtering of housing information to black and white households. It is important to understand how geographic information is produced because whether systematic and mechanical or selective and subjective, spatial search processes produce representations of the external environment which are necessarily incomplete and potentially distorted both as a result of the geographically selective knowledge base and because information channels also selectively represent or filter the locational information available to the decision-maker. 
Most locational decisions and the search process leading up to the actual choice are bounded geographically in that the active investigation of choice alternatives is preceded by a decision to focus the search effort within a particular geographically delineated market. Once search begins within a particular market area, there is a strong tendency to continue searching within that same market area. Once the search for a new residence begins within a local housing market, for example, it is highly likely that search will continue within that same housing market. Similar patterns of spatial persistence are observed in job searches within local labor markets and in shopping decisions within local market areas. In other words, people continue to focus search effort within a targeted market area as a spatial satisfying strategy that discounts the potential acceptability of choice alternatives located outside the targeted market area. There is also evidence that area based search strategies continue to apply at more micro scales within the larger market area. In the case of residential search, for example, households tend to focus their search effort within specific community areas within
larger geographically defined housing markets (Huff 1986). Decision makers decide where to focus their search effort based upon their locational preferences and upon what they know about the chances of being able to satisfy locational and nonlocational preferences by targeting search within a given area. Often there are pre-existing anchor points that have locational significance for the decision-maker and serve to focus or anchor the search effort (Golledge 1978). When households search for a new residence, for example, the job locations of the wage earners in the household and the location of the previous residence often serve as strong anchor points in the search process (Huff 1986). In the case of job search, the location of the residence often ties the search to a particular labor market and may result in a highly focused job search in the immediate proximity of the home (Hanson and Pratt 1988). In the case of search associated with shopping, home and work locations frequently are identified as key anchor points in the search process and, at a more micro level, the strategic locations of anchor stores in shopping centers reflect an awareness of how these stores anchor the search behavior of shoppers within the mall. How people decide when to conclude the search process has received a great deal of attention within the economics literature on job search (Lippman and McCall 1976). Conceptually, the answer to the stopping-rule question is quite simple: search will continue until the expected cost of further search exceeds the expected returns to further search. On the other hand, the empirical validation of stopping rule models has proven to be quite difficult. In the case of spatial search, both the expected costs and the expected returns to search are expected to vary from one geographic area to another (Smith et al.
1979) which implies that the length of search and the chance of successfully completing the search vary from one geographic area to another (Rogerson 1982). Understanding the path dependent nature of the spatial search process is of vital importance with respect to the stopping rule question as well as the information use and the search location questions. In the stopping-rule case, the order in which choice alternatives are visited during the search process is found to affect the outcome of the search process. Different information sources with very different content and spatial coverage tend to be used during different phases of the search process. During the actual search process, the path dependencies often are spatial in nature. Spatial persistence in the decision to continue searching in the same area and the spatial clustering of choice alternatives visited during search highlight the existence of strong spatial dependencies in the search sequence. In the case of multi-purpose shopping behavior, spatial search often requires the resolution of both a routing or way finding problem and a stopping rule problem (Miller 1993).
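The stopping rule described above (continue while the expected return to one more search exceeds its expected cost) can be illustrated with a simple reservation-value calculation. This is a sketch under strong assumptions that are not taken from the article: the values of alternatives are independent draws from Uniform(0, 1), each further visit has a constant cost, there is no discounting, and earlier alternatives can be recalled. The function names are hypothetical.

```python
def expected_gain(r, steps=2000):
    """E[max(X - r, 0)] for X ~ Uniform(0, 1), by midpoint integration.

    This is the expected improvement from one more visit when the best
    alternative found so far has value r.
    """
    h = 1.0 / steps
    return sum(max((i + 0.5) * h - r, 0.0) for i in range(steps)) * h

def reservation_value(cost, tol=1e-9):
    """Bisection for the value r at which expected gain equals search cost."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_gain(mid) > cost:
            lo = mid  # another visit still pays at this level: raise the threshold
        else:
            hi = mid
    return (lo + hi) / 2

# Two hypothetical search areas that differ only in the cost of a visit.
for area, cost in [("low-cost area", 0.02), ("high-cost area", 0.08)]:
    print(area, round(reservation_value(cost), 3))
```

Under these assumptions the searcher stops at the first alternative whose value is at least the reservation value, and the threshold has the closed form 1 − sqrt(2c) for a cost c below 0.5. A higher search cost lowers the threshold, which is one way to read the observation that the length of search varies from one geographic area to another.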
Bibliography
Brown L A, Moore E A 1970 The intra-urban migration process: A perspective. Geografiska Annaler 52B: 1–13
Clark W A V, Smith T R 1979 Modelling information use in a spatial context. Annals of the Association of American Geographers 69: 575–88
Courant P 1978 Racial prejudice in a search model of the urban housing market. Journal of Urban Economics 5: 329–45
Golledge R G, Stimson R J 1997 Spatial Behavior: A Geographic Perspective. Guilford Press, New York
Hanson S, Pratt G 1988 Reconceptualizing the links between home and work in urban geography. Economic Geography 64: 299–321
Huff J O 1986 Geographic regularities in residential search behavior. Annals of the Association of American Geographers 76: 208–27
Lippman S A, McCall J J 1976 The economics of job search: A survey. Economic Inquiry 14: 155–89
Miller H 1993 Modeling strategies for the spatial search problem. Papers in Regional Science 72: 63–85
Rogerson P 1982 Spatial models of search. Geographical Analysis 14: 217–28
Smith T R, Clark W A V, Huff J O, Shapiro P 1979 A decision making and search model for intraurban migration. Geographical Analysis 11: 1–22
Wolpert J 1965 Behavioral aspects of the decision to migrate. Papers and Proceedings of the Regional Science Association 15: 159–69
J. O. Huff
Spatial Statistical Methods
Spatial statistics is characterized by modeling and data analysis for processes displaying spatially-indexed dependence structures. Adjustment for spatial dependence has long been mandated in many disciplines. The very nature of geography is spatial. F. Stephan (1934) noted ‘Data of geographic units are tied together … in dealing with social data, we know that by virtue of their very social character, persons, groups, and their characteristics are interrelated and not independent.’ In the context of agricultural field experiments, Fisher (1935) described the basic intuition underlying much of spatial statistics: ‘After choosing the area we usually have no guidance beyond the widely verifiable fact that patches in close proximity are commonly more alike, as judged by the yield of crops, than those which are far apart.’ Though Fisher developed randomized experimental designs to avoid modeling spatial dependence explicitly, agricultural applications are an important motivation for spatial statistics (Besag 1991). In the early 1950s the South African mining engineer D. G. Krige pursued methods for the spatial interpolation of data with the intent of predicting the distribution of ore (e.g., gold) in space. G. Matheron
(France) extended that work, formalizing mining geostatistics, and coined the term ‘kriging’ to refer to a class of spatial prediction methods. In parallel, L. S. Gandin (Soviet Union) developed essentially the same procedures for applications in meteorology, where they are known as optimal interpolation and objective analysis. While basic geostatistical methods remain very useful, the methodology was limited in terms of (a) applicability, since the analyses were historically confined to linear models, and (b) volume of data analyzed. Technological advances regarding both data collection methods and computation have led to expansion of applications and methods of spatial statistics. Specifically, image analysis is and will continue to be a critical research area. Example problems include satellite remote sensing of the Earth’s environment and ecology, including humans and their behavior and impacts; analysis of astronomical images from optical, radio-wave, infrared, etc., instruments; magnetic resonance imagery; and computer vision. Methods for image restoration (filtering out noise), classification, and object identification are sought. Similarly, the development of geographic information systems is a watershed of geographically referenced data, though the efficient extraction of information from such massive datasets offers new statistical challenges. Spatial statistics also includes modeling and analysis of mapped point processes, in which the locations of observations are themselves random. For example, observations may be the locations and characteristics of trees in a forest (Stoyan and Penttinen 2000), or positions and types of galaxies in space (Feigelson and Babu 1992). For applications in archaeology, see Hodder and Orton (1976); see also Stoyan (1988). Finally, an introduction to spatial modeling of random objects can be found in Cressie (1993); see also Grenander (1993).
1. Perspectives and Comparisons
Let D denote a collection of sites or locations denoted by s, distributed in space. (Formally, D is taken to be a subset of d-dimensional Euclidean space. This permits mathematical consideration of the differences and distances between sites.) In geostatistics, D is a collection of sites on the surface of the Earth, indexed in two dimensions either by s = (latitude, longitude) or by Cartesian coordinates s = (x, y), or in three dimensions with the addition of altitude or depth. In image analysis, s typically represents a pixel located on a discrete lattice. For each s in D, there exists some random quantity, Z(s), of interest. The stochastic process {Z(s): s ∈ D} is a random field or spatial random process. Z could be a continuous variable (e.g., temperature); a discrete variable (e.g., a binary variate indicating the presence or absence of some characteristic, or a multivalued discrete variable of gray
levels, especially in image analysis); a mixed variable (e.g., rainfall may be zero with positive probability at a location, yet continuous if some rain occurs). Z may also be a multivariate variable. Goals in spatial statistical analyses include distributional modeling of spatial random processes; estimation and inference for aspects of the spatial model based on observational data; prediction at sites where observations are not available; and filtering or noise reduction. Two approaches to spatial process modeling dominate current practice. In the first, jointly or simultaneously specified models for the process {Z(s): s ∈ D} are developed. For example, in much of geostatistics, the variable Z(s) is decomposed into ‘mean’ plus ‘error’ components. A critical challenge is the development of joint distributional information for the errors across space. A second strategy is to build conditionally specified models by modeling local dependence structures. This track builds directly on the intuition that variables in close spatial proximity tend to be related. This is quantified by formulating distributional features of Z(s), conditional on the rest of the Z process, to depend only on the values of Z at locations in a neighborhood of s. Practical modeling often focuses on specification of conditional means and variances. Two important issues arise. First, to enable spatial prediction accompanied by meaningful measures of uncertainty, a conditionally specified model should correspond to a proper joint distributional model. To that end, conditionally specified Markov random fields are often used since there is a well-developed theory regarding the existence of implied joint models (e.g., Cressie 1993, Chap. 6). The second issue involves dealing with variables located near the boundary of D, since this region typically dictates local dependence structures that are by nature different from those in the interior of D.
Indeed, boundary problems are among the issues that generally distinguish spatial statistics from other applied statistics areas. While geostatistics and imaging appear to have developed separately, they share some common formulations and goals. However, modeling and analysis methods are responsive to the typical types of data available. Geostatistical settings often feature small to moderately sized datasets, and data that are irregularly distributed in space, leading to difficulties in model estimation. In many imaging problems datasets are relatively large and are often regularly distributed in space. One impact of this difference is that treatment of measurement error is more accessible and common in imaging compared to its haphazard treatment in classical geostatistics. The primary goal, particularly in geostatistics, is spatial prediction: based on a vector of n observations, Z ≡ (Z(s₁), …, Z(sₙ))′, we seek predictions of Z(s₀), where s₀ is an arbitrary site in D. Depending upon the level of assumptions made, two essential principles guide the predictive analysis. First, in analyses based
on assumptions asserting only the trend and covariance structures, most predictions are derived as best linear unbiased predictions (BLUP). ‘Best’ refers to the property of having smallest prediction variance among all linear, unbiased predictors. In analyses which also include distributional assumptions, predictive analyses are based on the implied conditional distribution of Z(s₀) given the observed data Z. For example, assuming mean squared prediction error as the criterion for assessing the quality of prediction, the optimal predictor is Ẑ(s₀) = E(Z(s₀) | Z); that is, the conditional mean of Z(s₀) given the data. Both Bayesian and non-Bayesian viewpoints are applied in spatial statistics, though the use of these terms varies with the application areas. Since most strategies view the primary variable of interest Z as random, the principle that spatial prediction is based on the conditional distribution, or at least its moments, of the unobserved variables given the data is clearly tied to the Bayesian view in both interpretation and computational techniques. This is particularly evident in image analysis, where a potentially nonrandom, ‘true image’ is modeled as a realization of a stochastic spatial process (e.g., Geman and Geman 1984, Besag 1989). On the other hand, despite similar stochastic modeling of the primary variables of interest, the origins of geostatistics and spatial point process analysis appear to be primarily non-Bayesian in language and spirit. In particular, early approaches to model fitting and parameter estimation were primarily non-Bayesian, though their interpretation as empirical Bayesian is quite natural. With the advent of powerful Bayesian computational strategies (i.e., Markov chain Monte Carlo), the Bayesian approach is undeniably on the rise in applications of all areas of spatial statistics.
It offers opportunities for effective analyses in complex spatial problems, including multivariate and inhomogeneous datasets and data types, as well as highly nonlinear and non-Gaussian datasets, not readily amenable to competing viewpoints.
2. Spatial Random Fields
Assuming all quantities exist, let $\mu(s)$, $s \in D$, be the mean (expected value) $\mu(s) = E(Z(s))$, and let $\mathrm{cov}(Z(s), Z(s'))$, defined for all pairs of sites $s$ and $s'$ in $D$, be the covariance function of the process. The process is (second-order or weakly) stationary if it has constant mean $\mu(s) \equiv \mu$ and
$$\mathrm{cov}(Z(s), Z(s')) = C(h), \quad \text{for all } s \in D,\ s' \in D \tag{1}$$
where $h = s - s'$. That is, the covariance between $Z(s)$ and $Z(s')$ depends on the locations only through their difference. Note that this implies that the variance of $Z(s)$, $\sigma^2(s) \equiv \sigma^2$, is constant. The function $C(h)$ is known as the covariogram. Strict or strong stationarity involves similar requirements applied to the joint distributions of $\{Z(s): s \in D\}$ as opposed merely to first and second moments (Cressie 1993, Sect. 2.3). Alternatively, again assuming that $\mu(s) \equiv \mu$, the process is intrinsically stationary if
$$\mathrm{var}(Z(s) - Z(s')) = 2\gamma(h), \quad \text{for all } s \in D,\ s' \in D \tag{2}$$
The function $2\gamma(h)$ is known as the variogram; $\gamma(h)$ is the semivariogram. If the process is stationary, then calculation implies
$$0.5\,\mathrm{var}(Z(s) - Z(s')) = C(0) - C(h) = \gamma(h) \tag{3}$$
Hence, stationarity implies intrinsic stationarity. However, variograms satisfying (2) may exist in some cases for which covariograms do not exist. For stationary processes, (3) allows one to view modeling covariograms and variograms as being equivalent, though some have argued that estimation based on data is enhanced for the variogram. An intrinsically stationary process is isotropic if $\gamma(s - s') = \gamma(h)$ for all pairs of sites, where $h = \|s - s'\|$ is the distance between them. That is, the variogram depends only on the distance between sites (i.e., the length of the vector $s - s'$), but not their relative directions. Anisotropy (the violation of isotropy) arises for processes that develop differentially in different directions of space (Cressie 1993, Sect. 2.3). For example, temperature values, or variables depending on temperature (e.g., type of agriculture, ecology, etc.), are moderated roughly by latitude. Hence, north–south covariance structures might well be different from their east–west counterparts. An example of an isotropic semivariogram is the exponential model:
$$\gamma(h) = \begin{cases} 0 & \text{if } h = 0 \\ \gamma_0 + \gamma_1 (1 - e^{-h/a}) & \text{if } h > 0 \end{cases} \tag{4}$$
where $a > 0$. Note that as $h \to 0$, $\gamma(h) \to \gamma_0$, rather than 0. This discontinuous behavior, known as the nugget effect, occurs in a variety of commonly used semivariograms. Physically, $\gamma_0$ has been interpreted as arising from (a) very small-scale fluctuations in the $Z$ process, and/or (b) measurement error. However, if the process modeled is a sufficiently smooth function of space, the only mathematically valid explanation for the nugget effect is measurement error. The exponential covariogram corresponding to Eqn. (4) is
$$C(h) = \begin{cases} \gamma_0 + \gamma_1 & \text{if } h = 0 \\ \gamma_1 e^{-h/a} & \text{if } h > 0 \end{cases} \tag{5}$$
Another covariogram sometimes used is the Gaussian model:
$$C(h) = \begin{cases} \gamma_0 + \gamma_1 & \text{if } h = 0 \\ \gamma_1 e^{-h^2/b} & \text{if } h > 0 \end{cases} \tag{6}$$
where $b > 0$. (Use of the modifiers ‘exponential’ and ‘Gaussian’ accrues from similarities to the exponential and Gaussian probability density functions, rather than distributional assumptions for $Z$.) A very rich class of isotropic covariograms is the Matérn class, written here without a nugget for comparison:
$$C(h) = \frac{2\sigma^2}{\Gamma(\nu)} \left(\frac{h\sqrt{\nu}}{a}\right)^{\nu} K_\nu\!\left(\frac{2h\sqrt{\nu}}{a}\right) \tag{7}$$
where $\nu > 0$, $\sigma^2$ is the variance of $Z(s)$, $\Gamma(\cdot)$ is the usual gamma function, and $K_\nu(\cdot)$ is the modified Bessel function of the second kind.
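The nugget models above can be checked numerically. The following is a minimal Python sketch, assuming the piecewise forms of Eqns. (4), (5), and (6); the function names and the parameter values are illustrative choices, not part of the article. It compares the exponential semivariogram with $C(0) - C(h)$, as relation (3) requires.

```python
import math

def exp_semivariogram(h, gamma0, gamma1, a):
    """Exponential semivariogram of Eq. (4); h is a nonnegative distance."""
    if h == 0:
        return 0.0
    return gamma0 + gamma1 * (1.0 - math.exp(-h / a))

def exp_covariogram(h, gamma0, gamma1, a):
    """Exponential covariogram of Eq. (5)."""
    if h == 0:
        return gamma0 + gamma1
    return gamma1 * math.exp(-h / a)

def gauss_covariogram(h, gamma0, gamma1, b):
    """Gaussian covariogram of Eq. (6)."""
    if h == 0:
        return gamma0 + gamma1
    return gamma1 * math.exp(-h * h / b)

# Illustrative parameter values (assumptions, not taken from the article).
gamma0, gamma1, a = 0.2, 1.0, 3.0
for h in (0.0, 0.01, 1.0, 5.0):
    from_eq4 = exp_semivariogram(h, gamma0, gamma1, a)
    from_eq3 = (exp_covariogram(0.0, gamma0, gamma1, a)
                - exp_covariogram(h, gamma0, gamma1, a))
    print(h, round(from_eq4, 6), round(from_eq3, 6))
```

The two printed columns agree at every distance, and the value at a very small positive distance stays near the nugget rather than dropping to zero.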
3. Spatial Prediction
A standard spatial model takes the form
$$Z(s) = \mu(s) + \varepsilon(s) \tag{8}$$
where $\mu(s)$ is a spatial mean model and $\varepsilon(s)$ is a zero-mean spatial error process. Critical issues involve the choices of models regarding both the mean or trend (also known as drift) $\mu(s)$ and the error process $\{\varepsilon(s): s \in D\}$. Such modeling is challenging in that both components explain aspects of the spatial structure present in $Z$. The conventional guideline is that the trend reflects large-scale spatial variation, while the error process explains residual small-scale spatial dependence. Implementation of this guideline and interpretation of results vary with the context as well as with the goals and knowledge of the analyst. As noted by Cressie (1993, p. 25), ‘What is one person’s (spatial) covariance structure may be another person’s mean structure.’ Also note that in modern statistical approaches, trend models based on both spatial location and covariates are recommended. Trend models can also be incorporated in conditionally specified models, either via conditionally specified mean models, or with globally specified mean models and conditionally specified error models.
3.1 The Basics of Kriging
Ordinary kriging asserts a constant trend model. Linear predictors take the form
$$\hat{Z}(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i) \tag{9}$$
The unbiasedness condition, $E(\hat{Z}(s_0)) = \mu$, adds the side condition that $\sum_{i=1}^{n} \lambda_i = 1$. Minimization of the predictive variance (or mean squared prediction error) $E(Z(s_0) - \hat{Z}(s_0))^2$ subject to the side condition yields simple formulas for the optimal values of the $\lambda_i$ as appropriate functions of the covariance function. Alternatively, the formula can be written in terms of the variogram, enabling spatial prediction for processes that are intrinsically stationary (Cressie 1993, pp. 121–3). The achieved optimal value of the predictive variance is then used to measure the precision of the prediction. The kriging formulas are derived under the assumption that the variogram or covariance function is known. Typically, data-based estimates of these functions are simply plugged into the kriging formulas. A variety of methods for estimation of the variogram (or covariogram) have been suggested. A simple procedure is to apply a method-of-moments notion: an empirical version of the semivariogram is
$$\hat{\gamma}(h) = \frac{1}{2\,|n(h)|} \sum_{n(h)} (Z(s_i) - Z(s_{i'}))^2 \tag{10}$$
where the sum is taken over those data points satisfying $s_i - s_{i'} = h$ and $|n(h)|$ is the number of such pairs of data points. A simple estimate of the covariogram is
$$\hat{C}(h) = \frac{1}{|n(h)|} \sum_{n(h)} (Z(s_i) - \bar{Z})(Z(s_{i'}) - \bar{Z}) \tag{11}$$
where $\bar{Z}$ is the simple mean of the observed values. However, since there are only $n$ observations, these estimates cannot provide estimates for all $h$. Practical estimates are typically constructed based on binning procedures. For example, under the additional assumption of isotropy, consider a collection of $M$ nonoverlapping intervals
$$I_m = [h_m - \delta_m,\ h_m + \delta_m]; \quad m = 1, \ldots, M$$
For each interval the above empirical sums and counts of pairs are taken over all pairs separated by distances in $I_m$. The resulting $M$ values can then be smoothed to produce a function estimate. The above estimates suffer from a variety of deficiencies, especially if the sample size $n$ is small. A variety of alternatives have been suggested. One general approach is to parameterize the covariogram (or variogram), specifying it to be a member of some class (the Matérn class (7) is a popular choice). Methods for parameter estimation include least squares, weighted least squares, and maximum likelihood. A second avenue for estimation involves the use of nonparametric function estimation, especially in the context of estimating the spectral representation of the covariogram. Also, robust estimates with respect to outlying observations have been developed. See Cressie (1993) and Stein (1999) for further discussion of classes of variograms and estimation procedures. Sampson and Guttorp (1992) consider estimation in the nonstationary case.
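The binning procedure described in this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: isotropy, two-dimensional sites, half-open bins given by their edges (rather than the closed intervals $I_m$), and the semivariogram convention of half the average squared difference. The function name and the toy data are hypothetical.

```python
import math

def binned_semivariogram(sites, values, bin_edges):
    """Method-of-moments semivariogram estimate on distance bins.

    sites: list of (x, y) locations; values: list of observed Z(s_i);
    bin_edges: increasing distances defining bins [e_b, e_{b+1}).
    Returns a list of (bin midpoint, estimate or None, number of pairs).
    """
    m = len(bin_edges) - 1
    sums = [0.0] * m
    counts = [0] * m
    n = len(sites)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(sites[i][0] - sites[j][0], sites[i][1] - sites[j][1])
            for b in range(m):
                if bin_edges[b] <= d < bin_edges[b + 1]:
                    sums[b] += (values[i] - values[j]) ** 2
                    counts[b] += 1
                    break
    result = []
    for b in range(m):
        mid = 0.5 * (bin_edges[b] + bin_edges[b + 1])
        # Half the mean squared difference; None marks an empty bin.
        est = sums[b] / (2.0 * counts[b]) if counts[b] else None
        result.append((mid, est, counts[b]))
    return result

# Toy data on a line (made-up values for illustration only).
toy_sites = [(0, 0), (1, 0), (2, 0), (3, 0)]
toy_values = [1.0, 2.0, 4.0, 3.0]
for row in binned_semivariogram(toy_sites, toy_values, [0.5, 1.5, 2.5, 3.5]):
    print(row)
```

With only four sites the largest bin contains a single pair, which illustrates why such estimates are unstable when $n$ is small and why parametric fitting or smoothing is usually applied afterwards.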
3.2 Trend Models and Their Estimation
Universal kriging refers to spatial prediction based on trend models of the form
$$\mu(s) = \sum_{j=0}^{p} f_j(s)\,\beta_j \tag{12}$$
where the $f_j(\cdot)$ are $p+1$ specified functions of space and $\beta = (\beta_0, \ldots, \beta_p)'$ is a vector of $p+1$ unknown parameters. In vector notation, let $f(s) = (f_0(s), \ldots, f_p(s))'$, so the model is written compactly as $\mu(s) = f'(s)\beta$. Ordinary kriging is included by assuming $f_0(s) = 1$ and $p = 0$. A common example is a quadratic trend surface model: Writing locations in Cartesian coordinates $s = (x, y)$, assume that
$$\mu(s) = \beta_0 + \beta_1 x + \beta_2 y + \beta_3 x^2 + \beta_4 xy + \beta_5 y^2$$
In addition to higher-order polynomial trends, nonpolynomial models, such as local polynomials, have been considered (Venables and Ripley 1999). The use of splines is closely related; see Wahba (1990) and Cressie (1993, pp. 180–3). Representation of the kriging model in matrix form provides a better understanding and enables parallels to familiar statistical linear regression modeling. The notation $X \sim (m, C)$ is read ‘the random vector $X$ has a multivariate distribution with mean $m$ and covariance matrix $C$.’ The notation $X \sim N(m, C)$ adds the assumption that the distribution is multivariate normal. The standard universal kriging model is then
F
Z(s ) !
G
E
fh β ! Fβ
l H
F
G
E
j H
F
ε(s ) ! ε
G
(13) H
where E
F
ε(s ) ! ε
G
E
E
H
" F
F
0 0
G
E
H
, F
σ#(s ) ! σ
σh Σ
G
G
H
H
(14)
where $\boldsymbol{\sigma}$ is the $n \times 1$ vector of covariances between errors at site $s_0$ and the observation sites; $F$ is the $n \times (p+1)$ matrix dictated by the functions $f_j(\cdot)$ evaluated at the observation locations; and $\boldsymbol{\varepsilon}$ is the vector of errors at observation sites. These errors are assumed to have zero means and covariance matrix $\Sigma$ whose entries are the covariances $\mathrm{cov}(\varepsilon(s_i), \varepsilon(s_{i'}))$ for pairs of data points. Formulas for the optimal linear predictor or BLUP are readily derived (Cressie 1993, pp. 172–80):
$$\hat{Z}(s_0) = f'(s_0)\beta + \boldsymbol{\sigma}'\Sigma^{-1}(\mathbf{Z} - F\beta) \tag{15}$$
These formulas depend on appropriate values of the covariance function and $\beta$, requiring estimation of both quantities. Note that use of (15) does not require spatial stationarity of the error process, though it is often assumed in estimation. Estimation of $\beta$ is typically accomplished via generalized least squares. Since the observations follow the model
$$\mathbf{Z} = F\beta + \boldsymbol{\varepsilon} \tag{16}$$
the generalized least squares estimate of $\beta$ is
$$\hat{\beta} = (F'\Sigma^{-1}F)^{-1}F'\Sigma^{-1}\mathbf{Z} \tag{17}$$
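To make (15) and (17) concrete, the following Python sketch treats the simplest case, $f_0(s) = 1$ and $p = 0$ (a constant mean), so that $F$ is a column of ones and $\hat{\beta}$ is a scalar. It is an illustration under assumptions, not the article's procedure: the covariance function is taken as known and exponential without a nugget, the parameter values are made up, and the small `solve` helper is a hypothetical stand-in for a linear algebra library.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def cov(s, t, sill=1.0, a=2.0):
    """Assumed exponential covariogram without nugget (illustrative parameters)."""
    return sill * math.exp(-math.hypot(s[0] - t[0], s[1] - t[1]) / a)

def krige_constant_mean(sites, z, s0):
    """Plug the GLS estimate (17) into the predictor (15) when F is a column of ones."""
    n = len(sites)
    big_sigma = [[cov(sites[i], sites[j]) for j in range(n)] for i in range(n)]
    sigma0 = [cov(s0, sites[i]) for i in range(n)]
    w = solve(big_sigma, [1.0] * n)                 # Sigma^{-1} 1
    beta_hat = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    resid = [zi - beta_hat for zi in z]             # Z - F beta_hat
    v = solve(big_sigma, resid)                     # Sigma^{-1} (Z - F beta_hat)
    return beta_hat + sum(si * vi for si, vi in zip(sigma0, v))

# Made-up observations at four sites.
obs_sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
obs_z = [1.0, 2.0, 0.5, 3.0]
print(krige_constant_mean(obs_sites, obs_z, (1.0, 1.0)))
print(krige_constant_mean(obs_sites, obs_z, obs_sites[2]))
```

Without a nugget the predictor reproduces the data at an observed site, so the second call returns the observed value there up to rounding error.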
Estimation of the covariogram of the $\varepsilon$ process proceeds as discussed earlier, though now the data used are the residuals from the fitted trend model: $\mathbf{Z} - F\hat{\beta}$. However, there is a clear difficulty. The very residuals needed to estimate the covariogram depend on the covariogram, since $\hat{\beta}$ is a function of $\Sigma$. A variety of approaches to this problem have been suggested. One is to use some preliminary guess at $\Sigma$ in (17), but base prediction formulas on the estimated covariogram of the resulting residuals. With the increase of readily available computation, the more modern approach iterates between generalized least squares estimation of $\beta$ and covariogram estimation until some stability is achieved (Venables and Ripley 1999). Note that the extra uncertainty accruing from the use of data-based estimates of $\beta$ and the covariance function is seldom accounted for in classical kriging.
Treatment of measurement error can be included in the modeling. Rather than observing $\mathbf{Z}$ directly, the observed data values, denoted by $Y$, may be modeled as $Y(s_i) = Z(s_i) + e(s_i)$; $i = 1, \ldots, n$. Let $\mathbf{Y}$ denote the $n \times 1$ vector of observations. In matrix form we write
$$\mathbf{Y} \sim (\mathbf{Z}, D_y) \tag{18}$$
where $D_y$ is the covariance matrix of the measurement errors $\mathbf{e} = (e(s_1), \ldots, e(s_n))'$. A common, but not unequivocal, assumption is that $D_y = \sigma_y^2 I$ (i.e., the measurement errors are mutually uncorrelated with common variance). Also, assume $\mathbf{e}$ is uncorrelated with $\mathbf{Z}$. If $Z$ is a smooth function of space, $\sigma_y^2$ is a quantification of the nugget effect. To complete the formalization, interpret (18) as a specification conditional on $\mathbf{Z}$. The model (16) for $\mathbf{Z}$ implies that the unconditional model for $\mathbf{Y}$ is
$$\mathbf{Y} \sim (F\beta,\ \Sigma + \sigma_y^2 I) \tag{19}$$
(Compare this covariance matrix to the piecewise definitions of the covariograms in (5) and (6).) Without replicate observations at some of the observation locations, $\sigma_y^2$ is not directly estimable in classical kriging. Typically, estimates are obtained by extrapolation of covariogram estimates for small distances. Note that the goals of the analysis now include not only prediction of $Z(s_0)$, but also prediction of $\mathbf{Z}$. Predicting $\mathbf{Z}$ based on $\mathbf{Y}$ provides spatial smoothing, as opposed to spatial interpolation when $\sigma_y^2$ is assumed to be zero.
3.3 Multivariate Methods
In many applications, several variables may be of interest, suggesting the consideration of a random vector, $\mathbf{Z}(s) = (Z_1(s), \ldots, Z_K(s))'$. In mining, the data might consist of measurements of amounts of $K$ minerals at each observed site. In meteorology, data on temperature, pressure, and precipitation are often collected. In satellite imaging, multichannel sensors collect intensities in various bandwidths. The classical geostatistical treatment of multivariate spatial data is known as cokriging. The modeling strategy is a direct generalization of the univariate case. For each coordinate $Z_k(s)$ of $\mathbf{Z}(s)$, consider a mean or trend model (see (12))
$$\mu_k(s) = f_k'(s)\beta_k, \quad k = 1, \ldots, K \tag{20}$$
Note that the spatial regression functions comprising the various $f_k$ need not be the same for all $k$; indeed, the dimensions of the $\beta_k$ may vary with $k$. The key second-order parameters are the covariance matrices,
$$\mathrm{cov}(\mathbf{Z}(s), \mathbf{Z}(s')) = C(s, s'), \quad s \in D,\ s' \in D \tag{21}$$
The principles and basic formulas for best linear unbiased prediction remain essentially the same as for the univariate case. Since the notation is cumbersome, the reader is referred to Wackernagel (1995) for details. Practical implementation of cokriging can be difficult. First, specification and subsequent estimation of $C(s, s')$ is problematic, in that it involves specification not merely of $K$ spatial covariance functions (one for each variable) but also $0.5K(K-1)$ cross-covariance functions, one for each pair $Z_k, Z_{k'}$, $k < k'$. The number of observations critically limits the ability to estimate the required covariance functions. Indeed, sample size typically mandates the need for extremely simple, often implausible, covariance models and/or a reduction in $K$. Further, the sampling pattern (e.g., observations of only some coordinates of $\mathbf{Z}$ at some sites) can moderate the ability to estimate parameters and predict.
An alternative modeling strategy is to build multivariate models hierarchically, namely as a sequence of conditional models. A precursor to this view from the geostatistics literature is kriging with external drift. For example, suppose that spatial prediction of near-surface ozone is of interest. It is well accepted that temperature is predictive for ozone. Hence, spatial prediction of ozone could be enhanced if appropriate temperature data were available and included in the model. An elementary approach is to view temperature as a (fixed) covariate and to include it in the formulation of a universal kriging model for ozone. Next, suppose temperature observations are unavailable at sites at which ozone predictions are sought. One could essentially form a kriging model for temperature, and rely on temperature predictions to produce ozone predictions. These notions can be made rigorous statistically. Specification of a complete spatial model for ozone conditional on temperature combined with a marginal spatial model for temperature does imply a joint model for both, as might be formed in a cokriging approach. However, the parameterizations of the models can be different. In particular, rather than direct specification and estimation of the cross-covariance function for ozone and temperature, one specifies a conditional mean relationship between the two. For further discussion and examples, see Royle and Berliner (1999).
4. Bayesian Spatial Statistics
4.1 Bayesian Geostatistics
Models for the $Z$ process, such as the basic model (13) along with a measurement error stage as in (18), can readily be posed in the Bayesian framework. Process models for $Z$ are viewed as priors; these priors may be specified simultaneously or conditionally. Prior distributions on model parameters, such as the trend model coefficients $\beta$, the measurement error variance, and parameters of the modeled covariance function, are also specified. Predictive inference is then based on the posterior distribution of unknown quantities, given the data. (A general review of Bayesian analysis is given in Bernardo and Smith 1994.) Direct analogs to conventional kriging often rely on multivariate normal distributions in both (13) and (18). This imposition of distributional assumptions has been criticized. However, neither standard kriging nor Bayesian analyses based on normal distribution assumptions are recommended for highly non-normal data and processes. The Bayesian approach readily offers valid measures of uncertainty (e.g., Le and Zidek 1992, Handcock and Stein 1993), unlike those arising from the ‘plug-in’ estimation procedures that dominate non-Bayesian kriging. With the advent of Bayesian computing advances, non-normal models can be analyzed. For example, settings in which the $Z$ variable is discrete, such as in some imaging problems (Geman and Geman 1984), or is count data modeled via Poisson or related distributions (Kaiser and Cressie 1997, Wolpert and Ickstadt 1998), have been considered. See also Diggle et al. (1998).
4.2 Hierarchical Models
Hierarchical Bayesian models for collections of variables are combinations of component models or stages. This strategy is particularly useful in complex settings. Motivations for hierarchical modeling include: (a) individual stages are often formulated more easily than is the full joint model directly; (b) scientific understanding of the relationships among variables is often readily available for components of the model; and (c) combining datasets consisting of very different types of data is often enhanced. The first stage of a Bayesian hierarchical model is usually a measurement error model for the observations, analogous to (18). A variety of higher-level stages, interrelating spatial processes and model parameters, can then be formulated. Finally, a prior probability distribution for all model parameters is specified. See Royle et al. (1998) and Berliner (2000) for examples and discussion.
5. Spatial Point Processes
A spatial point process arises when the locations of events occur randomly in $D$. Basic data take the form $(s_1, \ldots, s_N)'$, where the number of locations, $N$, is a random variable. The probability theory for point processes typically formulates a model for random points in $\mathbb{R}^2$. The model then implies a distribution for $N$, the number of events in the study region $D$, a bounded subset of $\mathbb{R}^2$, as well as the joint distribution of their locations. Generally, the treatment of edge effects then becomes a challenge (Ripley 1988, Chap. 2). A standard point process model is the homogeneous Poisson spatial process with intensity $\lambda$: For any $k$ and any selection of disjoint subsets $A_1, \ldots, A_k$ of $\mathbb{R}^2$, assume that the numbers of events, $N(A_i)$, $i = 1, \ldots, k$, are independent random variables each with probability
$$P(N(A_i) = n) = \frac{(\lambda|A_i|)^n \exp(-\lambda|A_i|)}{n!}, \quad n = 0, 1, \ldots \tag{22}$$
where $|A_i|$ is the area of $A_i$. Note that for any set $A$, the mean $E(N(A)) = \lambda|A|$. The Poisson process is viewed as completely spatially random, since counts in disjoint sets are independent and the probabilities in (22) depend on $A_i$ only through its area and not its shape or position. This model is often used as a null hypothesis. Rejection of the Poisson model suggests that the process has some spatially dependent structure, or dependence among separated regions.
Numerous generalizations have been considered. The Poisson cluster process allows cluster locations to be distributed according to one Poisson spatial process, while events in clusters are distributed according to another Poisson model. Extensions to alternative measures on sets beyond simple area, including measures which vary in space, leading to inhomogeneous processes, have been considered. A Cox process is generated as a Poisson process, but the measures associated with sets are themselves random.
As in geostatistics and related areas, selected moments of a spatial process play roles in both theory and data analysis. Define the intensity function of a point process as
$$\lambda(s) = \lim_{|ds| \to 0} \frac{P(N(ds) > 0)}{|ds|} \tag{23}$$
where $ds$ is a small square centered at $s$. If only one event may occur at a location, note that
$$\lambda(s) = \lim_{|ds| \to 0} \frac{E(N(ds))}{|ds|}$$
The second-order product density of a point process is
$$\lambda_2(s, s') = \lim_{|ds| \to 0,\ |ds'| \to 0} \frac{E(N(ds)\,N(ds'))}{|ds|\,|ds'|} \tag{24}$$
If $\lambda_2(s, s')$ depends only on $h = s - s'$, the process is stationary; if $\lambda_2(s, s')$ depends only on the distance $h = \|s - s'\|$ between the points, it is isotropic. In the isotropic case Ripley (1977) developed a theory for the $K$ function,
$$K(h) = E(\text{number of other points within distance } h \text{ of an event})/\lambda \tag{25}$$
and its relationship to $\lambda_2(h)$. A Poisson process is isotropic with $K(h) = \pi h^2$. The $K$ function plays a critical role in theory, because it can be computed readily for a variety of important point processes, and in practice, because it can be estimated readily from data. Simple data-based estimates take the form
$$\hat{K}(h) = \frac{1}{\lambda N} \sum_{i=1}^{N} (\text{number of other points within distance } h \text{ of } s_i)$$
($\lambda$ is often estimated as $N/|D|$.) Note that this estimate is biased negatively, since events outside the data collection domain $D$ are not included. The intuition behind the $K$ function relates generally to statistical distance methods and specifically, nearest-neighbor methods. Distance methods involve statistics which are computed from the set of all distances between events, rather than the actual locations. Substantial theory regarding the distributions of distance-based statistics is available.
Extensions of point processes include marked point processes for which a (possibly multivariate) variable $Z$ is associated with each event, leading to data of the form $\mathbf{Z} = (Z(s_1), \ldots, Z(s_N))'$. A related multivariate class of problems involves the modeling and data analysis for multiple point processes.
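The Poisson probabilities in (22) and the simple $K$ estimate can be illustrated with a short Python sketch. It assumes points in the plane, estimates $\lambda$ by $N/|D|$, and applies no edge correction, so it inherits the negative bias noted above. The function names and the four-point pattern are hypothetical.

```python
import math

def poisson_count_prob(n, lam, area):
    """P(N(A) = n) for a homogeneous Poisson process, as in Eq. (22)."""
    mean = lam * area
    return mean ** n * math.exp(-mean) / math.factorial(n)

def ripley_k_estimate(points, h, area):
    """Uncorrected K estimate: sum over events of the number of other
    events within distance h, divided by (lambda_hat * N)."""
    n = len(points)
    lam_hat = n / area  # lambda estimated as N / |D|
    total = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.hypot(points[i][0] - points[j][0],
                               points[i][1] - points[j][1])
                if d <= h:
                    total += 1
    return total / (lam_hat * n)

# A made-up pattern in a 10 x 10 region: three clustered events and one isolated.
pattern = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
k_hat = ripley_k_estimate(pattern, 1.5, 100.0)
print(k_hat, math.pi * 1.5 ** 2)  # estimate next to the Poisson value pi * h^2
```

For this clustered toy pattern the estimate exceeds $\pi h^2$, which is the direction expected when events attract one another at short distances.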
6. Spatial–Temporal Statistics
Very complex model structures may arise when both spatial dependences and temporal dependences, and their interactions, are present. Early attempts in this area often viewed time as if it were another spatial dimension, leading to relatively simple treatments of spatial–temporal data via spatial methods. However, the special role of time (i.e., the direction of its flow) often renders such approaches ineffective. Alternatively, the Kalman filter adapted to space–time problems has had wide application in the sciences. The basic method is analogous to kriging in temporal settings. Spatial–temporal statistics is at the forefront of current statistical research. See Kyriakidis and Journel (1999) for a review in geostatistics. For recent Bayesian approaches and references, see Wikle et al. (1998) and Wikle (2000).
See also: Bayesian Statistics; Multivariate Analysis: Overview; Spatial Autocorrelation; Spatial Sampling
Bibliography
Berliner L M 2000 Hierarchical Bayesian modeling in the environmental sciences. Allgemeines Statistisches Archiv (Journal of the German Statistical Society) 84: 141–53
Bernardo J M, Smith A F M 1994 Bayesian Theory. Wiley, New York
Besag J E 1989 Towards Bayesian image analysis. Journal of Applied Statistics 16: 395–407
Besag J E 1991 Spatial statistics in the analysis of agricultural field experiments. In: Spatial Statistics and Digital Image Analysis. National Academy Press, Washington, DC
Cliff A, Ord J K 1981 Spatial Processes: Models and Applications. Pion, London
Cressie N A C 1993 Statistics for Spatial Data, rev. edn. Wiley, New York
Cressie N 1995 Bayesian smoothing of rates in small geographic areas. Journal of Regional Science 35: 659–73
Diggle P J, Tawn J A, Moyeed R A 1998 Model based geostatistics (with discussion). Applied Statistics 47: 299–350
Feigelson E D, Babu G J (eds.) 1992 Statistical Challenges in Modern Astronomy. Springer-Verlag, New York
Fisher R A 1935 The Design of Experiments. Oliver and Boyd, Edinburgh, UK
Geman S, Geman D 1984 Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-6: 720–41
Grenander U 1993 General Pattern Theory: A Mathematical Study of Regular Structures. Clarendon Press, Oxford, UK
Handcock M S, Stein M L 1993 A Bayesian analysis of kriging. Technometrics 35: 403–10
Hodder I, Orton C 1976 Spatial Analysis in Archaeology. Cambridge University Press, London
Kaiser M S, Cressie N 1997 Modeling Poisson variables with positive spatial dependence. Statistics and Probability Letters 35: 423–32
Kyriakidis P C, Journel A G 1999 Geostatistical space–time models: A review. Mathematical Geology 31: 651–84
Le N D, Zidek J V 1992 Interpolation with uncertain spatial covariance: A Bayesian alternative to kriging. Journal of Multivariate Analysis 43: 351–74
Ripley B D 1977 Modelling spatial patterns. Journal of the Royal Statistical Society, Series B 39: 172–92
Ripley B D 1988 Statistical Inference for Spatial Processes. Cambridge University Press, Cambridge, UK
Royle J A, Berliner L M 1999 A hierarchical approach to multivariate spatial modeling and prediction. Journal of Agricultural, Biological, and Environmental Statistics 4: 1–28
Royle J A, Berliner L M, Wikle C K, Milliff R 1998 A hierarchical spatial model for constructing wind fields from scatterometer data in the Labrador Sea. In: Gatsonis et al. (eds.) Case Studies in Bayesian Statistics. Springer-Verlag, New York, pp. 367–82
Sampson P D, Guttorp P 1992 Nonparametric estimation of nonstationary spatial covariance structure. Journal of the American Statistical Association 87: 108–19
Stein M L 1999 Statistical Interpolation of Spatial Data: Some Theory for Kriging. Springer, New York
Stephan F 1934 Sampling errors and interpretations of social data ordered in time and space. In: Ross F A (ed.) Proceedings of the American Statistical Journal, New Series No. 185A. Journal of the American Statistical Association, 29 Suppl: 165–6
Stoyan D 1988 Thinnings of point processes and their use in statistical analysis of a settlement pattern with deserted villages. Statistics 19: 45–56
Stoyan D, Penttinen A 2000 Recent applications of point process methods in forestry statistics. Statistical Science 15: 61–78
Wackernagel H 1995 Multivariate Geostatistics. Springer-Verlag, New York
Wahba G 1990 Spline Models for Observational Data. SIAM, Philadelphia, PA
Wikle C K 2000 Hierarchical space–time dynamic models. In: Berliner L M et al. (eds.) Studies in the Atmospheric Sciences. Lecture Notes in Statistics, 144, Springer-Verlag, New York, 45–64
Wikle C K, Berliner L M, Cressie N 1998 Hierarchical Bayesian space–time models. Journal of Environmental and Ecological Statistics 5: 117–54
Wolpert R L, Ickstadt K 1998 Poisson gamma random fields for spatial statistics. Biometrika 85: 251–67
L. M. Berliner
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Spatial-temporal Modeling
‘Spatial Temporal Models’ (STMs) portray the variation of phenomena over time and across space. In the sections that follow, first the STMs are defined. Then the ‘canonical’ and ‘compartment’ types of STMs are described. The construction of STMs is discussed next. A number of trends that have influenced the evolution of spatial temporal modeling are touched upon in a ‘concluding comments’ section.
1. Introduction
A mathematical model qualifies as an STM if at least some of its variables are referenced spatially and temporally. Spatial temporal referencing means that the values of these variables are linked to points or intervals in time and simultaneously to entities in geographical space such as points, or areal units, or arcs or nodes of a network. STMs are also likely to include purely spatial and purely temporal variables. Examples of spatial variables are distances, directions, spatial coordinates, locational quotients, measures of spatial inequality, measures of phenomena that vary over space or index geographic environments, and spatial dummies. Examples of purely temporal variables are time and functions of time, dummies associated with time intervals and points in time, and measures of phenomena that vary over time or index temporal environments. The STMs primarily are concerned with three classes of phenomena: spatial flows, spatial diffusion, and spatial temporal change. The latter includes spatial variation of growth rates (for instance, of product per capita), qualitative spatial change (for instance, in land use), and change in spatial distributions (for instance, of crime events or of settlements). A systematic introduction to the spatial temporal modeling of several classes of substantive phenomena can be found in Haggett et al. (1977). In the social sciences, much of the conceptual work that precedes or is intertwined with mathematical modeling is based on notions from system theory. Books dealing with the nexus between spatial temporal modeling and spatial systems include Bennett and Chorley (1978), Bertuglia et al. (1987), Huggett (1980), and Klassen et al. (1979). The STMs figure prominently in the edited collections by Anselin and Florax (1995), Fischer and Getis (1997), Griffith and Mackinnon (1981), Griffith and Lea (1983), Griffith and Haining (1986), and Wrigley (1979).
The STMs encompass a huge body of knowledge crisscrossed by many substantive and analytical literatures. In order to convey a feel for spatial temporal modeling, the two most common types of STMs are singled out for a brief treatment in the next two sections. A subsequent section discusses an approach to the construction of STMs. In the concluding comments some trends that have influenced the evolution of the STMs are touched upon.
2. Canonical STMs
‘Canonical single equation STMs’ are models consisting of an equation with spatially referenced rates of change over time as the dependent variable, and in which at least some of the independent variables are spatial or spatially referenced. The ‘canonical multiple equations STMs’ instead are systems of equations, some or all of which qualify as canonical single equation STMs. In the social sciences, the most common STMs are the ones here referred to as canonical. Examples of canonical single equation STMs can be found everywhere in the social sciences. For instance, they are the backbone of the empirical literature on the determinants of economic growth. Consider as a specific ‘case’ the regression model in Barro (1997). It has growth rates of GDP per capita as the dependent variable, and GDP per capita, male schooling, fertility rates, a democracy index, inflation rates, and others as independent variables. Barro estimated it using a sample of 114 countries, and growth rates calculated from several time intervals. In fact, the STMs in which the dependent variable is the rate of change of variables such as income, unemployment, population, welfare rolls, crime rates, for countries, counties, or regions, and in which the independent variables include indices of relevant characteristics of these geographical entities, are ubiquitous. Also very common are models in which rates of change are functions of distances or of spatial coordinates. An example of canonical multi-equation STMs is Greenwood’s 1981 model of metropolitan growth and migrations. The model consists of 14 equations with 14 endogenous and 16 exogenous variables. It focuses on the rates of change of income per capita, unemployment, and migrations into or out of metropolitan and nonmetropolitan areas. A number of multi-equation canonical STMs are described in Putman (1979).
3. Spatial Compartments STMs ’Compartment models’ describe systems made of subsystems interconnected by flows (De Palma and Lefevre 1987). In a number of instances the dynamics of a system and of its subsystems and the flows among these are influenced by controls or from policy instruments (Bennett and Chorley 1978). The term ‘spatial compartment models’ refers to spatial systems with subsystems corresponding to geographical entities such as regions, countries, or urban centers. For 14837
instance, a world system may consist of spatial compartments made of aggregates of countries (‘regions’). A system may be defined in terms of one or more ‘elements.’ For instance, population may be the only element in a system, and a four-element system may be defined in terms of population, product, capital, and technology. An example of spatial compartment STMs is the one employed in a 1977 UN study of the future of the world economy (Leontief 1977). The model is of the inter-regional input–output type. It has 48 producing and consuming sectors and 15 regions. The regions are aggregates of countries. A sector inputs from other sectors, and outputs to other sectors. The goods produced by sectors are classified as ‘domestic’ or ‘internationally traded.’ The internationally traded goods are exported to or drawn from international sectoral pools, and each region exports/imports a fixed share of the aggregate world exports/imports of any given internationally traded commodity. The model is made of a total of 2625 equations grouped into 15 regional sets. Each regional set describes the inter-relations within a region of production, consumption, imports, and exports, of goods and natural resources. Each set involves 269 variables, of which 229 are region specific, and the remaining 40 represent the imports/exports pools of internationally traded goods and the balance/imbalance of the regions’ international financial transactions. Thus, the model contains more variables than equations. To solve it, enough variables are set to fixed values to render the number of the remaining variables equal to the number of the equations. This model was used to produce eight development paths of the world economy from the base year 1970 through 1980 and 1990 to the year 2000. These paths are based primarily on combinations of low/medium/high population growth scenarios, and low/high GDP per capita growth scenarios.
In short, the questions asked focus upon what the world economy has to be like in 1980, 1990, and 2000 if the population and product per capita follow specified scenarios that unfold within realistic technical and physical constraints. Spatial compartment STMs can be constructed by combining input–output models with econometric models. A review of pertinent issues and literatures is contained in Rey (1998). In the ‘computable general equilibrium models’ the demand and supply functions of products and of factors of production are specified and estimated. In the spatial compartment variants of these models the mechanisms affecting the spatial flows of products and factors are also brought forth and the equations describing these mechanisms are specified and estimated. Regional computable general equilibrium modeling has been surveyed by Partridge and Rickman (1998). Isard et al. (1998) review applied inter-regional equilibrium modeling. A discussion of the theory and estimation of multicountry trade
models is contained in the book by Deardorff and Stern (1986). This ‘Michigan’ model represents a massive undertaking, and was developed for the purpose of evaluating the impacts of changes in tariffs on international trade flows. Linked macroeconomic models are discussed in Ball (1973) and in Fair (1984). A spatial compartment follow-up to the ‘Limits to Growth’ model (Meadows et al. 1972) is described in Mesarovic and Pestel (1974a) and is documented in the six volumes edited by the same authors (1974b). Their work builds upon the ‘Limits to Growth’ thesis that if the current population growth, economic growth, pollution increases, and natural resources depletion continue, the world will face a number of catastrophes. However, Mesarovic and Pestel maintained that before the middle of the twenty-first century we shall be confronted by regional catastrophes rather than by a collapse of the world system. These catastrophes will materialize in different regions for different reasons and at different times, and will be profoundly felt throughout the entire world (Mesarovic and Pestel 1974a). Most of the spatial compartment models referred to in this section are constructed to produce forecasts, simulations, or scenarios. It should be noted, however, that there are extensive literatures dealing with spatial compartment STMs constructed for the purpose of investigating systemic optima. Intertemporal linear and nonlinear programming models based on spatial compartments are discussed in Takayama and Judge (1971) and in Judge and Takayama (1973). Optimal control spatial compartment STMs are reviewed in Tan and Bennett (1984).
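The closure step mentioned earlier in this section for the input–output model (fixing enough variables so that the remaining unknowns equal the number of equations) can be sketched on a toy system. The two-sector economy, the coefficient matrix, and the final demand figures below are assumptions chosen for illustration; they are not taken from the UN study.

```python
def solve_2x2(m, rhs):
    """Solve a 2x2 linear system m @ x = rhs by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular system")
    return [(rhs[0] * d - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]

def close_and_solve(coeffs, final_demand):
    """Treat final demand as fixed, then solve x = A x + d for the outputs x."""
    identity_minus_a = [
        [1.0 - coeffs[0][0], -coeffs[0][1]],
        [-coeffs[1][0], 1.0 - coeffs[1][1]],
    ]
    return solve_2x2(identity_minus_a, final_demand)

# Hypothetical input coefficients: coeffs[i][j] is the input from sector i
# needed per unit of output of sector j.
coeffs = [[0.2, 0.3],
          [0.4, 0.1]]

# The balance equations x = A x + d contain four variables (two outputs and
# two final demands) but only two equations. Fixing the two final demands
# leaves two unknowns, so the system can be solved.
final_demand = [10.0, 20.0]
outputs = close_and_solve(coeffs, final_demand)
print(outputs)
```

Choosing a different set of variables to fix (for instance, outputs instead of final demands) gives a different closure of the same equations, which is how alternative scenarios can be generated from one model.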
4. STMs Generated by Expansions
In many cases, STMs in the social sciences have been imported from nonsocial science disciplines. For instance, some of the diffusion models in geography and economics originated from their counterparts in physics and engineering (Banks 1994). In general, the analogies between very different phenomena may trigger interdisciplinary transfers of methodologies and models. The notions of systems and of system dynamics and the schools of thought linked to them have played a major role in the construction of classes of STMs based on cross-disciplinary analogies. However, in many circumstances the need may arise to construct STMs from purely spatial or purely temporal models, or to extend existing STMs to encompass relevant contexts that are outside their original scope. An approach to the incremental construction of mathematical models that can be easily applied to building STMs is the ‘expansion method’ (Casetti 1997). In a nutshell, the expansion method involves widening the scope of a simpler ‘initial model’ by redefining some or all of its parameters into
functions of expansion variables or variates that index substantively relevant contexts of the model. The ‘terminal model’ thus produced encompasses both the initial model and a specification of its contextual variation. To demonstrate the operation of the method, let us discuss its application to adding a time dimension to a purely spatial model of urban population densities. This demonstration is based on a variant of the model presented in Casetti (1973). The urban studies literature has identified two spatial patterns of urban population densities that are often referred to as ‘density cones’ and ‘density craters.’ In the cones, the population densities decline exponentially with distance from city centers. Instead, a density crater type of trend is characterized by urban population densities that increase at first and then decline as the distance from the city center increases. Let $D$, $s$, and $\varepsilon$ denote, respectively, the natural logarithm of urban population density, distance from a city center, and an error term. Let

$$D = \alpha_0 + \alpha_1 s + \alpha_2 s^2 + \varepsilon \qquad (1)$$

be an initial model subsuming both a density cone and a density crater as special cases. In fact, for $\alpha_1 < 0$ and $\alpha_2 = 0$ this model corresponds to a density cone, while for $\alpha_1 > 0$ and $\alpha_2 < 0$ it yields a density crater. The literature has also identified distinctive patterns of temporal change of these spatial trends. They include the rising and flattening of the density cones, the widening of the density craters, and the gradual transformation of cones into craters. In order to extend the initial purely spatial model (1) so that it will encompass the spatial temporal trends recognized by the literature, the $\alpha$ parameters in it are expanded by the stochastic expansion equations

$$\alpha_i = \alpha_{i0} + \alpha_{i1} t + \theta_i, \qquad i = 0, 1, 2 \qquad (2)$$

where the $\theta$s are error terms. The terminal models

$$D = \alpha_{00} + \alpha_{01} t + \alpha_{10} s + \alpha_{11} s t + \alpha_{20} s^2 + \alpha_{21} s^2 t + u \qquad (3)$$

$$u = \varepsilon + \theta_0 + \theta_1 s + \theta_2 s^2 \qquad (4)$$

are obtained by replacing the $\alpha$s in the initial model with the right hand sides of the expansion equations. Upon estimation, these terminal models can be used to test for the occurrence of specific spatial temporal trends, and to portray these trends if they occur. The model appearing in this demonstration is intended for estimation and for the testing of statistical hypotheses. It should be noted, though, that the expansion method can be used to construct STMs of any kind. Any model that has parameters can be expanded, irrespective of whether it is deterministic, stochastic, or mixed, and irrespective of whether it is intended for solving, estimating, or optimizing.
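The mechanics of the expansion can be sketched for the deterministic part of the urban density demonstration. The sketch assumes the quadratic initial model in distance and linear expansions in time, omits all error terms, and uses invented parameter values; the function names are hypothetical.

```python
def expand(params, t):
    """Expansion equations: alpha_i(t) = alpha_i0 + alpha_i1 * t for i = 0, 1, 2."""
    return [base + trend * t for base, trend in params]

def log_density(params, s, t):
    """Initial model evaluated with time-expanded parameters."""
    a0, a1, a2 = expand(params, t)
    return a0 + a1 * s + a2 * s ** 2

def terminal_regressors(s, t):
    """Regressors of the terminal model: 1, t, s, s*t, s^2, s^2*t."""
    return [1.0, t, s, s * t, s ** 2, s ** 2 * t]

def terminal_value(params, s, t):
    """Terminal model written directly in its six coefficients."""
    coeffs = [c for pair in params for c in pair]  # a00, a01, a10, a11, a20, a21
    return sum(c * r for c, r in zip(coeffs, terminal_regressors(s, t)))

def pattern(params, t):
    """Classify the spatial trend at time t as a cone, a crater, or neither."""
    _, a1, a2 = expand(params, t)
    if a1 < 0 and a2 == 0:
        return "cone"
    if a1 > 0 and a2 < 0:
        return "crater"
    return "other"

# Invented values: (alpha_i0, alpha_i1) for i = 0, 1, 2.
params = [(5.0, 0.0), (-0.5, 0.1), (0.0, -0.01)]

print(pattern(params, 0), pattern(params, 10))
print(log_density(params, 2.0, 3.0), terminal_value(params, 2.0, 3.0))
```

With these values the trend is a cone at t = 0 and a crater at t = 10, so a single terminal model portrays the transformation of cones into craters. The last line checks that substituting the expansions into the initial model and evaluating the terminal model give the same value up to rounding.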
5. Concluding Comments
Most STMs in the social sciences came into existence during or after the 1930s, and have been influenced or shaped by trends that kept altering the landscape of the quantitative analytical social sciences during this time horizon. The comments that follow highlight some such trends and influences. Mathematical models in general, and STMs in particular, have been acquiring an important role in government and private sector activities as aids to decision making, to planning, to policy development and evaluation, and for forecasting. In the process they have been driven to address the complexities of the real world to an ever-increasing degree. The tendency toward more complex models is especially apparent in the canonical and compartment STMs, and in turn the proliferation of these models is a reflection of their practical usefulness. The movement toward the ‘computational’ counterparts of STMs that originated as purely mathematical formalisms is another aspect of these tendencies. Increasingly complex STMs have also played an important role in the precise articulation of the theories and conceptual frameworks that constitute a substantial component of the social sciences. In the more theoretical applications this complexification has been intertwined with conceptual and technical advances imported from mathematics and the nonsocial sciences. The STMs with nonlinear dynamics, and the ones centering on optimal control techniques are cases in point. The entire evolution of STMs and of mathematical modeling at large has been permeated by the growth of stochastic modeling, and by the penetration of stochastic elements into models born deterministic. For instance, deterministic canonical STMs have been converted into ‘mixed’ or stochastic models by the addition of stochastic errors and/or of temporally or spatially autoregressive terms.
An adequate stochastic specification and its attendant statistical econometric testing have increasingly become a must for empirical analyses to be credible. See also: International Trade: Geographic Aspects; Location Theory; Spatial Decision Support Systems; Spatial Interaction Models; Spatial Optimization Models; Spatial Pattern, Analysis of; Spatial Search Models; Transportation Geography
Bibliography
Anselin L, Florax R J G M (eds.) 1995 New Directions in Spatial Econometrics. Springer, Berlin
Ball R J (ed.) 1973 The International Linkage of National Economic Models. North Holland, Amsterdam
Banks R B 1994 Growth and Diffusion Phenomena: Mathematical Frameworks and Applications. Springer, Berlin
Barro R J 1997 Determinants of Economic Growth: A Cross-country Empirical Study. MIT Press, Cambridge, MA
Bennett R J, Chorley R J 1978 Environmental Systems: Philosophy, Analysis, and Control. Princeton University Press, Princeton, NJ
Bertuglia C S et al. (eds.) 1987 Urban Systems: Contemporary Approaches to Modelling. Croom Helm, London
Casetti E 1973 Testing for spatial-temporal trends: An application to urban population density trends using the expansion method. Canadian Geographer 17(2): 127–37
Casetti E 1997 The expansion method, mathematical modeling, and spatial econometrics. International Regional Science Review 20(1/2): 9–33
Deardorff A V, Stern R M 1986 The Michigan Model of World Production and Trade. MIT Press, Cambridge, MA
De Palma A, Lefevre C 1987 The theory of deterministic and stochastic compartmental models and its applications. In: Bertuglia C S, Leonardi G, Occelli S, Rabino G A, Tadei R, Wilson A G (eds.) Urban Systems: Contemporary Approaches to Modelling. Croom Helm, London
Fair R C 1984 Specification, Estimation, and Analysis of Macroeconometric Models. Harvard University Press, Cambridge, MA
Fischer M M, Getis A (eds.) 1997 Recent Developments in Spatial Analysis. Springer, Berlin
Greenwood M J 1981 Migration and Economic Growth in the United States: National, Regional, and Metropolitan Perspectives. Academic Press, New York
Griffith D A, Haining R P (eds.) 1986 Transformations Through Space and Time: An Analysis of Nonlinear Structures, Bifurcation Points, and Autoregressive Dependencies. Mt Nijhoff, Dordrecht, The Netherlands
Griffith D A, Lea A C (eds.) 1983 Evolving Geographical Structures: Mathematical Models and Theories for Space-time Processes. M Nijhoff, The Hague, The Netherlands
Griffith D A, Mackinnon R (eds.) 1981 Dynamic Spatial Models. Sijthoff & Noordhoff, Alphen aan den Rijn, The Netherlands
Haggett P, Cliff A D, Frey A 1977 Locational Analysis in Human Geography, 2nd edn. Wiley, New York
Hotelling H 1978 A mathematical theory of migration. Environment and Planning: A 10: 1223–39
Huggett R J 1980 Systems Analysis in Geography. Clarendon Press, Oxford, UK
Isard W, Azis I J, Drennan M P, Miller R E, Saltzman S, Thorbecke E 1998 Methods of Interregional and Regional Analysis. Ashgate, Aldershot, UK
Judge G G, Takayama T (eds.) 1973 Studies in Economic Planning Over Space and Time. North Holland/American Elsevier, Amsterdam
Klassen L H, Paelinck J H P, Wagenaar S 1979 Spatial Systems. Saxon House, Farnborough, UK
Leontief W 1977 The Future of the World Economy: A United Nations Study. Oxford University Press, New York
Meadows D H, Meadows D L, Randers J, Behrens III W W 1972 The Limits to Growth. Universe Books, New York
Mesarovic M, Pestel E 1974a Mankind at the Turning Point: The Second Report to The Club of Rome, 1st edn. Dutton, New York
Mesarovic M, Pestel E (eds.) 1974b Multilevel Computer Model of World Development System. International Institute for Applied Systems Analysis, Laxenburg, Austria, vols. 1–6
Partridge M D, Rickman D S 1998 Regional computable general equilibrium modeling: a survey and critical appraisal. International Regional Science Review 21(3): 205–48
Putman S H 1979 Urban Residential Location Models. Nijhoff, Boston
Rey S J 1998 The performance of alternative integration strategies for combining regional econometric and input-output models. International Regional Science Review 21(1): 1–35
Takayama T, Judge G G 1971 Spatial and Temporal Price and Allocation Models. North Holland, Amsterdam
Tan K C, Bennett R J 1984 Optimal Control of Spatial Systems. Allen & Unwin, London
Wrigley N (ed.) 1979 Statistical Applications in the Spatial Sciences. Pion, London
E. Casetti
Spatial Thinking in the Social Sciences, History of
The objects upon which social sciences work, either people or artefacts, take up some room and are located in specific places. Cognition implies the building and mobilization of spatial skills, like orientation. Social scientists are generally more interested in particular spatial categories like direction, distance, place, area, landscape, village, town, city than in space: the distribution of things and people, the behaviors and practices of everyday life, or the plans developed for the future deal with concrete geographical realities rather than the abstract idea of space. Even geography was traditionally more oriented towards places, regions, or zones than towards space: the idea of geography as a spatial discipline did not develop before the 1950s, and was soon criticized. Among the other social sciences, economics was the only one with a specialty branch in the field: spatial economics, which began to flourish in the 1820s. Among the cognitive sciences, the psychology of orientation developed before Jean Piaget described the stages of space construction by children. As a result, spatial thinking in the social sciences often appears split into different fields: urban studies, rural studies, landscape studies, the significance of the regional approach and of place, the idea of environment, the problem of orientation, etc. Spatial thinking refers to five types of intellectual operations: (a) using geographical observations as a starting point for social analysis, (b) explaining, generally within a functional perspective, the role of different aspects of space in the life of social groups, (c) understanding the way people invest symbolic meanings in their environment, (d) wondering about the significance given to space in the definition of concepts like society, economics, culture, language, polity, etc.,
or (e) deciphering the ways in which the human mind deals with space and builds and uses spatial categories. The first family was present from the earliest developments of the social sciences. The second family flourished mainly in the late nineteenth and most of the twentieth centuries. The third family is more characteristic of the rise of phenomenological approaches, since the 1930s or 1940s. The fourth family is mainly linked to the development of critical perspectives on the social sciences since the 1970s. The development of an interest in space by the cognitive sciences remained until recently independent from the other approaches.
1. Spatial Distributions as a Point of Departure for Social Thinking
We live in social systems and are so deeply embedded in them that we generally think they are an expression of the natural order. In order to build a science of social realities, the social scientist has to introduce a distance from the object. Travel narratives and geographical descriptions played a central role in this operation. Montaigne started to ponder the relativity of social institutions when reading about the American cannibals. From Jean Bodin to Montesquieu, the diversity of political and social institutions nurtured the reflection on social and political life and was interpreted through a simple model, the influence of climate. The first steps of economics were linked with geographic evidence: the role of transport in the distribution of economic activities, the complementarities between cities and rural areas, and the economic circuit served as bases for the development of a more conceptual vision, which started with the analysis of market mechanisms by Turgot and Adam Smith’s reflection on the division of labor. With the development of travels and the improvement of their accounts, geographical comparisons became more sophisticated: they allowed for the development of new disciplines, like ethnography. The change of scenery could be provided by travel through time as well as space. Immanuel Kant emphasized the similarities between history and geography: spatial and temporal distributions were the first features to confront social scientists in their study of the World. He opposed spatio-temporal descriptions to systematic analyses. Partly under the influence of Kant, the idea that geography was essentially a description of spatial distributions developed and remained alive well into the twentieth century. It was during the first decades of the nineteenth century that spatial comparisons played their most important role in the development of science. This was true of the natural sciences: Charles Darwin developed his evolutionary theory when traveling on the Beagle. In the social sciences, Alexis de Tocqueville built his analysis of democracy by comparing the US, where it permeated all forms of social life, and Old Europe, where aristocrats were still influential. Karl Marx explored the opposition between medieval cities and rural areas, the Asiatic social and economic organization, and the industrial revolution in Britain before building the notion of mode of production. By the middle of the nineteenth century, the cartographic model was increasingly replaced by models each with a logic dictated by the dynamic, immediate, interactive phenomena under study: the new specialized disciplines broke their earlier links with spatial thinking.
2. The Functional Approach to the Relations Between Space, Social Life, and Thought
When social and behavioral sciences severed their former links with distributional approaches, they generally considered space as a neutral stand and ceased to use it as a significant factor for explaining what they observed. Many scholars thought, however, that space mattered: they had the idea that social life differed according to its physical setting, the fact that people lived in empty or overpopulated areas, and were more or less distant from each other. For Emile Durkheim the foundation of sociology lay in social morphology, but he was unable to provide a clear picture of it. The only way to take it into account was to adopt a functional perspective: space could be conceived either as a collection of environments or a set of distances (see Functionalism, History of ).
2.1 Space as Environment
In the environmental perspective, the emphasis was given to the local connections between people, artefacts, and place. The first environmental approaches were developed by the sensualist psychology of the eighteenth century. Man was a product of whatever his senses impressed upon his mind. His aptitudes and personality reflected his environment. If it was disorderly or unsound, social life had many reasons to be stressed. By the end of the eighteenth century, social scientists were well aware of the social and political implications of such a conception. Jeremy Bentham conceived his Panopticon in order to provide criminals with an environment which would teach them safer forms of behavior. More generally, the idea that to reform society, you had to change its material setting was central to the nineteenth century utopian socialism and still present in the conception of urbanism developed by the International Movement of Modern Architecture in the 1930s. Johann Gottfried Herder was more sensitive to the harmony which developed between peoples and the country they lived in than to the sensual influence of
these surroundings on their mind and behavior. The early development of geography in Germany was linked to the combined influence of Kant (evident in Alexander von Humboldt’s conception of geographical description) and Herder (who was closer to Carl Ritter). Such a conception was important for all the social sciences in Germany, since it was conducive to a keener interest in peoples than in individuals or social classes. For many German geographers, the harmony between a people and the land they lived in was expressed through its landscapes (see Romanticism: Impact on Social Thought). The Herderian environmentalism was a literary and poetic one. Natural evolutionism had a more direct impact on the social sciences, since it was a scientific theory. The early forms of evolutionism as developed in the eighteenth and early nineteenth centuries had no direct impact on the then emerging social sciences. The publication of The Origin of Species occurred at a time when history and the idea of progress had become popular in the public opinion, and original forms of evolutionism had developed in sociology: hence the impact of its theses in the social sciences. For sensualist psychology or Herderian philosophy, the influence of things upon the human mind was a direct one. For scientific evolutionism, the relations were mediated through biological mechanisms (see Evolutionism, Including Social Darwinism). Did environments completely shape the destiny of humankind and of any of its groups? Hot discussions developed between environmentalists and the scientists who believed in human liberty. In order to depict the relations between organisms and their environment, a new discipline, ecology, appeared. Named by Haeckel in 1872, its first scientific achievements occurred at the end of the nineteenth century; it became mature in the 1930s and 1940s.
Geographers did not wait for the development of ecology to propose a new conception of their discipline: Anthropogeographie, or human geography, was named and shaped by Friedrich Ratzel in the 1880s and 1890s. For him and for the first human geographers in France, a strict environmentalism was an unbearable position, but physical constraints did exist and shaped at least partly human behavior and history: geographers pleaded for possibilism, which gave room to physical and biological constraints, but acknowledged the role of other variables. By 1920, it had become evident that environmentalism was not a very fruitful way of conceiving the role of space in social functioning.
2.2 Space as Distance
Economics had developed another approach to analyze the role of space in the functioning of society. In the early nineteenth century, some economists were unhappy with the elimination of space from mainstream theory. Hence, the development of spatial economics as a sideline. It was centered on two problems: the uneven distribution of natural factors and the role of distance. The name of David Ricardo is associated with the first orientation and the ensuing theory of international specialization and trade. Johann Heinrich von Thünen explained how distances to the market determined farming specialization even in a perfectly uniform setting (1826–52). The progress of spatial economics was a slow one: people had to wait until Alfred Weber for a coherent theory of industrial location (1909), and Walter Christaller and August Lösch for an understanding of the location of service activities and the nature of urban networks (early 1930s). At the end of the nineteenth and the beginning of the twentieth centuries, some anthropologists also considered distance as a major factor of cultural differentiation. It soon appeared that diffusionism could not serve as an overall explanatory theory. Diffusion was, however, a significant cultural process (see Anthropology, History of ). When geographers, tired of the hopelessly descriptive character of the possibilist approach, tried in the late 1950s and 1960s to develop a more systematic view of their scientific enterprise, many of them decided that geography should become a discipline of distances: they incorporated what spatial economics and anthropology could offer and went further through the emphasis they gave to communication processes. They used what a new scientific field, the study of communication, was developing. The functional approach to social problems through the analysis of distance was fruitful. Its role in planning theory and practice was essential during the 1960s and 1970s. It allowed for the design of well organized cities or suburbs. It did not prevent, however, the development of urban pathologies. In order to understand them, other approaches to the role of space in social life had to be developed.
3. Space as Experience
Functional approaches were specific to the social sciences. They did not offer many possibilities to the cognitive sciences. The situation is different for those who try to analyze space as experience.
3.1 Environmental Psychology
The curiosity of psychologists for spatial problems was initially focused on the problems of orientation and place recognition. Their interest changed with the genetic approach developed by Jean Piaget: his problem was to understand the way in which children explored their environment and built behavioral
schemes and mental categories out of their experience and inborn capabilities. The relations between man and environment are mediated through the senses: what are the specific contributions of sight, hearing, smell, touch, taste, and kinesthetic sensation to the individual experience of space? Michel Foucault, for instance, was fascinated by the place given to sight by Western civilizations and its role in social control. Because of its emphasis on experimental methods, mainstream psychology ignored the social dimension of spatial learning and experience. It was brought back by phenomenologists, Gaston Bachelard or Maurice Merleau-Ponty, for instance. Environmental psychology introduced the theme of proxemics.
3.2 The Psychological, Moral, and Religious Significance of Space
Through phenomenology and proxemics, space started to be conceived as a necessary component of communication systems: hence the contribution of linguistics and semiotics to spatial thinking. Landscapes ceased to be analyzed in a functional perspective. They were conceived as supports for messages. People experience different forms of connivance with them. A difference exists between analytical and symbolic communication: in the first case, the problem is to transfer complex sets of information from individuals to individuals; in the second one, it is to stir simultaneously many persons through a unique and short message. Symbolism thus appears as a way of mastering distance. Hence, the interest of geographers in this form of communication. Space has other social meanings. Normative thinking relies on perspectives from the outside: people have to imagine Beyonds, either within beings or things (when the way of immanence is chosen) or outside them, generally in another World (when the way of transcendence is selected). These choices are important since they introduce a qualitative differentiation of space: places where the Beyonds touch the surface of our World are loaded with sacrality, in contrast to the surrounding profane areas. The building of Utopias, the belief in the Golden Age, or in a Land without Evil are ways to use history or terrestrial space to build other types of Beyonds. The dreams they nurture are important in the shaping of rural or urban areas. The new approaches to geography developed since the 1970s incorporate this interest in the human experience of space.
3.3 Social Experience as a Localized Experience
People cannot develop an experience of society outside specific places. The Time Geography developed by Torsten Hägerstrand stresses the fundamental diversity of spatial and social experiences developed by people along their life trajectories: it means that social experiences differ according to persons. In the field of communication, for instance, people participate in different intersubjectivity circles depending on the places they live or have lived. Such an observation is fundamental for all the social sciences: it reminds us that societies, cultures, polities, and languages do not exist in a spaceless social realm. They differ from place to place, because of the existence of locales, as expressed by Anthony Giddens. The social sciences of today can no longer ignore spatial thinking: it is one of the main lessons of the postmodernist epistemological revolution.
Bibliography
Bachelard G 1957 La Poétique de l’Espace. Presses Universitaires de France, Paris
Dagognet F 1977 Une Épistémologie de l’Espace Concret. Vrin, Paris
Eliade M 1957 Das Heilige und das Profane. Rowohlt, Hamburg
Giddens A 1984 The Constitution of Society. Polity Press, Cambridge, UK
Gould P, White R 1974 Mental Maps. Penguin, Harmondsworth, UK
Lee T 1976 Psychology and the Environment. Methuen, London
Livingstone D N 1993 The Geographical Tradition. Blackwell, Oxford, UK
Merleau-Ponty M 1945 Phénoménologie de la Perception. Gallimard, Paris
Piaget J 1926 La Représentation du Monde Chez l’Enfant. Alcan, Paris
Ponsard C 1984 History of Spatial Economic Theory. Springer-Verlag, Berlin
Proshansky H M, Ittelson W H, Rivlin L G (eds.) 1970 Environmental Psychology. Holt, Rinehart and Winston, New York
P. Claval
Spatially Explicit Analysis in Environmental Studies
Problems of the human–environment condition, including its change, are increasingly examined in terms of their spatial configuration and scale of analysis. It could be argued that much, if not all, of the field of geography studies these problems in this way, and there are numerous articles in this Encyclopedia dedicated to geographical problem solving and approaches. Spatial analysis of the human–environment condition (i.e., environmental and resources problems) has expanded beyond geography, however, becoming a multidisciplinary if not interdisciplinary
Spatially Explicit Analysis in Enironmental Studies subfield of study. This multidisciplinary focus on spatially explicit problem solving, where the spatial aspect of the human behavior and\or the natural environment makes a crucial difference in the analysis and policy response to the problem, is treated here. Throughout, the emphasis is on geographical space (metric and relative distance), not on such social science concepts as ‘social distance’ (i.e., between social or economic classes), which are treated elsewhere.
1. Space in the Sciences
The natural sciences are further advanced than the social sciences in their theoretical development of models of spatial processes. Landscape ecology and population biology are exemplary. In both of these fields, the spatial arrangement of the geophysical landscape, as well as of the flora and fauna on the landscape, is the major focus of analysis. The advances in spatial analysis in the natural sciences, compared with the social sciences, are fairly straightforward to explain. There is more developed physical theory of behavior over space, such as gravity and watershed impacts, or species interaction over space, than there is theory of spatial human behavior and individual preferences for spatial configuration. Also, until recently, there has been much more spatially explicit natural science data than social science data available for modeling and hypothesis testing, resulting in a delayed response in the social sciences to spatially explicit problem solving (NRC 1997). In geographical spatial models employed in the social sciences, space appears as a constraint on the system under consideration. For example, geographical space is reduced to distance in a transportation cost model of location. However, space cannot be considered only as a constraint or cost to be paid; it must also be considered in terms of spatial processes and must be modeled as such at the appropriate scale of analysis. In order to explain and predict these spatial processes, theoretical and empirical models must be developed to address where, when, and why these processes happen (Bockstael 1996). The other social and behavioral sciences have lagged behind geography in developing theories and models of human behavior and choices over space. One of the main reasons for this lag is likely the paucity of disaggregate spatially explicit social science data. These data are necessary to test empirically the hypotheses derived from the theoretical constructs of individual human behavior. In addition to the increase in spatial data, advances in the availability and ease of use of geographical information systems (GIS) technology have increased the interest in spatial issues among the social sciences beyond geography. Also, as spatial data have become more available, so too have the statistical tools to use these data.
2. Topical Areas
The use of many renewable and nonrenewable natural resources involves spatial effects, in which the location and configuration of the resource affect human behavior and its consequences for the environment. Examples include stationary resources, such as energy resources, minerals, and forests; nonstationary resources, such as fisheries and other wildlife; and uses of resources that can lead to spatial environmental problems. The examples considered in this article are fisheries and other wildlife, forestry, and urban land use, with an emphasis on cases where spatially explicit problem solving makes a critical difference in the analysis of, and policy response to, the interaction between human systems and natural systems.
2.1 Fisheries and Wildlife Management
Historically, the primary focus of fisheries management models has been the common property feature of most fisheries: no one person, industry, or even country has control over the stock and its location, and, owing to mismanagement, this often leads to open access problems that result in overfishing and depletion of stocks. Policy responses to these problems have included shortened fishing seasons and the development of tradable permits. Sanchirico and Wilen (1999) argue that these traditional models assume a spatially homogeneous distribution of the resource, and that if this is not the case, the policy implications concerning the associated human behavior will likely be incorrect. They have developed a model that includes the spatial configuration of the resource, focusing on the patchy and heterogeneous distribution of fish populations, and especially on the interconnections among and between patches. The main result is that, once different forms of these dynamic spatial relationships are incorporated, whether a patch acts as a ‘source’ or a ‘sink’ of the living resource can depend on both biological and socioeconomic parameters. Therefore, spatially differentiated management techniques, such as refuges or rotating harvesting zones, that do not consider the spatial interaction of human behavior and the resource could lead to suboptimal policies. Similar to the spatial fishery models are spatial terrestrial wildlife models that analyze different wildlife management schemes. An example of the effects of the spatial interaction between beaver populations and forestry management is found in Bhat et al. (1996). As beaver populations move over space, management of the beavers on one property can have unintended spillovers on contiguous properties. By taking these interactions into account, the optimal beaver control strategy is derived for each location and time period, leading to a more efficient management scheme than would be possible without taking into account the spatial dynamics of the system.
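The dependence of ‘source’ and ‘sink’ status on harvesting can be illustrated with a toy simulation. What follows is a minimal sketch under stated assumptions, not the model of Sanchirico and Wilen (1999): it assumes two patches with discrete-time logistic growth, dispersal proportional to the difference in stocks, and a constant proportional harvest rate in each patch. The function name and all parameter values are hypothetical.

```python
def simulate_patches(x1, x2, r=0.5, k=100.0, d=0.1, h1=0.0, h2=0.0, steps=200):
    """Toy two-patch stock model (illustrative assumptions only).

    x1, x2 : initial stocks in patch 1 and patch 2
    r, k   : logistic growth rate and carrying capacity (same in both patches)
    d      : dispersal rate; net flow goes from the fuller to the emptier patch
    h1, h2 : proportional harvest rates applied in each patch
    Returns the stocks after `steps` periods.
    """
    for _ in range(steps):
        growth1 = r * x1 * (1.0 - x1 / k)
        growth2 = r * x2 * (1.0 - x2 / k)
        flow = d * (x1 - x2)  # net movement from patch 1 to patch 2
        x1 = max(x1 + growth1 - flow - h1 * x1, 0.0)
        x2 = max(x2 + growth2 + flow - h2 * x2, 0.0)
    return x1, x2


# Identical patches, no harvest: both settle at carrying capacity.
unharvested = simulate_patches(50.0, 50.0)

# Harvest only in patch 2: patch 1 ends up exporting stock to patch 2.
harvested = simulate_patches(50.0, 50.0, h2=0.3)
```

In this sketch the two patches are biologically identical, so any net flow between them arises only from the harvest rates, which is one simple way to see how an economic parameter can turn a patch into a net exporter of stock.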
2.2 Forestry: Timber Management and Deforestation The original emphasis of timber management forestry models was to maximize the value of the timber products deriving from a stand, and these stands were assumed to be spatially homogenous. These models assumed that the only benefit from the forest was the value of the timber, and that there were no off-site impacts from the growing forest or from timber harvesting. Swallow and colleagues (1997) argue that while these models are important, the lack of stand interactions in them can lead to faulty management and policy prescriptions, as the spatial relationships that lead to many ecosystem functions are ignored, such as the relationship between forest fragmentation and wildlife populations. By including these spatially explicit ecological interdependencies, Swallow and colleagues show that the optimal forestry management scheme can differ substantially from using a nonspatial modeling approach, including different harvesting periods for the different stands, and different amounts of the total benefits of recreation and forage availability to wildlife. Similarly, a spatially explicit cellular landscape ecology model that incorporates spatial ecological interdependencies derives the optimal spatial pattern of harvesting over time, a result that could not be obtained from a nonspatial approach (Hof et al. 1994). These results have important policy implications. For example, this approach can be used by public land managers who are managing the forest for multiple use and have to take into consideration nontimber outputs, such as wildlife, that are located on adjacent land that is not under their jurisdiction. Tropical forest management issues have an important global policy context, owing to the linkages between tropical deforestation and the loss of biodiversity, and the decrease in carbon sequestration potential and those effects on global warming. 
The major uses of tropical forests include timber activities as in the temperate forest context, but to a greater degree than exists in temperate forests, they also include gathering of fuelwood, agroforestry and land clearing for cultivation. In a tropical forest management context, Albers (1996) develops a theoretical spatial model that incorporates the different potential land uses and their spatial interaction across plots. Tropical forest fragmentation and its spatial pattern have important effects on biodiversity and the ability of the forest to regenerate. The simulation model optimizes the pattern of land uses under different spatial and management scenarios, and investigates how these characteristics affect forest fragmentation and the associated ecological effects of the different scenarios.
Albers concludes that the interaction between land-use patterns and forest recovery generates flexibility in spatial planning for forest recovery, which can generate income for land managers in the short run while maintaining the option of large preserved areas in the long run. For reasons not entirely clear, there has been much more work on spatially explicit empirical modeling of deforestation in tropical than in temperate forestry (e.g., Mertens and Lambin 1997, Veldkamp and Fresco 1997). Included in these models are variables that can be derived from satellite imagery and used with geographical information systems to create spatial variables, such as distance to roads and urban centers, as well as existing spatial geophysical data. As these models do not include any model of human behavior, the spatial pattern is not an outcome of a behavioral spatial process. The only simulations that can be performed are changes imposed on the explicit features of the landscape; therefore, only simple policy analyses, such as a moratorium on road-building or on certain land uses, can be performed. In contrast, models that develop an underlying structural model seeking to explain the human behavior that generates land-use patterns include Chomitz and Gray (1996) and Pfaff (1999). These models advance the literature by correctly including aggregate economic variables in a behavioral model. This makes them policy-relevant, as behavioral responses to changes in policies can be predicted. However, although spatially explicit tropical deforestation models have made advances in controlling for the spatial heterogeneity of the landscape, they all have a very simple spatial behavioral process, in which distances to roads and markets are the main spatial elements linked with deforestation. 
More sophisticated spatial processes are needed for models of tropical deforestation where, in addition to distances to roads and markets, distances to neighboring land uses and the patterns of those neighboring land uses are considered in the probability of deforestation (e.g., Geoghegan et al. 2000).
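The kind of specification described here, in which the probability of deforestation depends on distances to roads and markets and on neighboring land uses, can be sketched as a simple logit function. This is an illustrative assumption, not the specification of any of the studies cited: the functional form, the variable names, and every coefficient value below are hypothetical and have not been estimated from data.

```python
import math


def deforestation_probability(dist_road_km, dist_market_km, cleared_neighbor_share,
                              b0=0.5, b_road=-0.3, b_market=-0.05, b_neighbor=2.0):
    """Logit probability that a forested cell is cleared (hypothetical coefficients).

    dist_road_km           : distance from the cell to the nearest road
    dist_market_km         : distance from the cell to the nearest market
    cleared_neighbor_share : share (0 to 1) of adjacent cells already cleared
    """
    z = (b0
         + b_road * dist_road_km
         + b_market * dist_market_km
         + b_neighbor * cleared_neighbor_share)
    return 1.0 / (1.0 + math.exp(-z))


# Same location, different neighborhoods: only the neighbor term changes.
p_isolated = deforestation_probability(5.0, 20.0, 0.0)
p_surrounded = deforestation_probability(5.0, 20.0, 0.75)
```

With only the distance terms, two cells equally far from roads and markets would receive the same probability; the neighbor term is what lets the surrounding pattern of land use enter the prediction.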
2.3 Urban Land Use
The location, distribution, and pattern of urban land uses and urban fringe development have major effects on the environment. It is the pattern of development that determines the amount and type of nonpoint source pollution into water bodies; the loss of farmland and other open-space amenities; the amount of commuting, with its associated air pollution and congestion; and ecological effects, including hydrological disturbances and habitat fragmentation. There is a rich social science literature of spatial models of urban land use. The classic monocentric city model is an equilibrium model of urban spatial structure, where the distribution of land uses on a featureless plain around a central business district is the result of an equilibrium between the declining land price gradient and increasing transportation costs. In its simplest form, concentric rings of residential development around the urban center describe the resulting equilibrium pattern of land use and decreasing residential density as distance from the urban center increases. More sophisticated versions of the model have been developed, but nonetheless, the model’s ability to explain observed land-use patterns is weak. When its predictions are compared with actual land-use patterns, the model fails to explain the complexity of the spatial and temporal patterns of urban growth. This limitation is partly due to the treatment of space, which is assumed to be a ‘featureless plain’ and is reduced to a simple measure of distance from the urban center. Some research has tried to overcome the ‘featureless plain’ assumption by including some of the heterogeneous landscape features that exist, for example, by focusing on how individuals value different landscape arrangements. In a study of individual preferences concerning agricultural land abandonment and subsequent spontaneous reforestation, Hunziker and Kienast (1999) use photographs of different stages of reforestation to elicit respondents’ most preferred pattern of land uses and find that there are particular arrangements of the landscape that are preferred by individuals. In order to test the hypothesis that individuals value the pattern of land uses surrounding their homes, Geoghegan and colleagues (1997) have created spatial indices of land-use fragmentation and diversity measured at different scales, and include these variables in a spatially explicit model of residential land values. 
Results demonstrate that individuals prefer to have a homogeneous distribution of land use in the immediate area surrounding their homes, but a more heterogeneous distribution of land uses on a larger scale. The spatial interdependencies among land use, zoning regulations, and ecosystems are modeled in Bockstael and Bell (1997). A spatially explicit modeling approach is necessary in order to understand how these regulations affect development decisions across the landscape and, in turn, how the location of nutrients from point sources, such as sewage treatment plants, and nonpoint sources, such as septic fields, affects water quality. They find that differential zoning across regions deflects development from one region to another and that the increase in nitrogen loadings from a constant amount of new development varies from 4 to 12 percent, depending on the degree of difference across zoning regulations. In the urban land-use models introduced above, the theoretical structure assumes a very simple concept of spatial human behavior: that individuals respond only to exogenously determined locations of landscape features. Other models allow for potential spatial interactions among individuals. An approach that can be used to model this interaction explicitly is a cellular automata model, where the behavior of a system is generated by a set of deterministic or probabilistic rules that determine the discrete state of a cell based on the states of neighboring cells. Despite the simplicity of the transition rules, these models, when simulated over many time periods, often yield complex and highly structured patterns. Researchers have used cellular automata models to analyze the process of urban growth (e.g., Clarke et al. 1997, White et al. 1997). In contrast to the standard urban economic models of urban structure introduced above, in which complex patterns are generated by imposing external conditions, these models demonstrate how complex structure arises internally from the interaction among individual cells. While these models fit the spatial process and the spatial evolution of urban land uses when compared with urban forms in the USA, because they impose the land-use transition rules there is very little policy-relevant context to them. This limitation is overcome in a cellular automata approach used by Irwin and Bockstael (1999) to predict patterns of land-use change in an urbanizing area, in which the transition probabilities are estimated as functions of a variety of exogenous variables and an interaction term that captures the effect of neighboring land-use conversions. They then use a cellular automata model, with estimated parameters of land conversion, to demonstrate that a model in which there are spatial interactions between the land-use decisions of individuals results in the evolution of a fragmented land-use pattern that is qualitatively much more similar to the observed pattern of development than the pattern that is predicted by models that ignore these spatial interactions. 
Models that include improved understanding of the spatial evolution of urban land use, when linked with a spatial ecosystem model, will be able to make much better predictions of the effects of government policies concerning land use on ecosystems.
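The cellular automata idea discussed in this section, in which the state of each cell is updated by a rule that depends on the states of its neighbors, can be sketched in a few lines. This is a minimal illustration under stated assumptions and does not reproduce the models of Clarke et al. (1997), White et al. (1997), or Irwin and Bockstael (1999): cells are either undeveloped (0) or developed (1), development is irreversible, and the conversion probability is a logit function of the share of developed neighbors. The function names and the `base` and `interaction` parameters are hypothetical, and their sign and size are arbitrary rather than estimated.

```python
import math
import random


def neighbor_share(grid, i, j):
    """Share of the (up to eight) adjacent cells that are developed."""
    n_rows, n_cols = len(grid), len(grid[0])
    total = developed = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            a, b = i + di, j + dj
            if 0 <= a < n_rows and 0 <= b < n_cols:
                total += 1
                developed += grid[a][b]
    return developed / total if total else 0.0


def step(grid, rng, base=-4.0, interaction=6.0):
    """One synchronous update: undeveloped cells convert with a logit probability."""
    new_grid = [row[:] for row in grid]
    for i in range(len(grid)):
        for j in range(len(grid[0])):
            if grid[i][j] == 0:
                z = base + interaction * neighbor_share(grid, i, j)
                if rng.random() < 1.0 / (1.0 + math.exp(-z)):
                    new_grid[i][j] = 1
    return new_grid


def simulate(n=20, steps=30, seed=0):
    """Grow from a single developed cell; return the final grid and the
    number of developed cells after each step."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(n)]
    grid[n // 2][n // 2] = 1
    counts = [1]
    for _ in range(steps):
        grid = step(grid, rng)
        counts.append(sum(sum(row) for row in grid))
    return grid, counts
```

Here the transition rule is imposed rather than estimated, which is the limitation the text attributes to the earlier urban growth models; replacing `base` and `interaction` with coefficients estimated from observed conversions would be the step toward the estimated-probability approach described above.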
3. Conclusions
In order to improve the understanding and modeling of spatially explicit human behavior in relationship to the environment, improved datasets, methods, and theory need to be developed (Irwin and Geoghegan 2000). All three of these issues are closely linked. As argued above, the historical paucity of spatially explicit social science data constrained spatial modeling of human behavior. As a result, interest in spatial issues in the social sciences, beyond the field of geography, has been sparse. But now that spatial social science data are becoming available, there has been a renewed interest in spatial issues. In a recent discussion of future research issues in natural resource and environmental economics, Deacon and colleagues (1998) comment, ‘the spatial dimension of resource use may turn out to be as important as the exhaustively studied temporal dimension in many contexts’, a conclusion that the regional and spatial science of the 1960s and 1970s argued previously. Therefore, having spatially explicit social science data should result in more sophisticated modeling of spatial human behavior and hypothesis testing. However, this necessitates improved empirical methods, such as those being developed within the field of spatial statistics and econometrics. A better understanding of human behavior over space, coupled with better data and improved tools, is necessary for advances in spatially explicit environmental problem solving.
See also: Ecological Economics; Economic Geography; Environment and Common Property Institutions; Environmental and Resource Management; Environmental Economics; Exhaustible Resources, Economics of; Land Use and Cover Change; Location Theory; Regional Science; Resource Geography; Spatial Analysis in Geography; Spatial Autocorrelation; Spatial Choice Models; Spatial Data; Spatial Pattern, Analysis of.
Bibliography
Albers H J 1996 Modeling ecological constraints on tropical forest management: spatial interdependence, irreversibility, and uncertainty. Journal of Environmental Economics and Management 30: 73–94
Bhat M G, Huffaker R G, Lenhart S M 1996 Controlling transboundary wildlife damage: modeling under alternative management scenarios. Ecological Modelling 92: 215–24
Bockstael N E 1996 Modeling economics and ecology: the importance of a spatial perspective. American Journal of Agricultural Economics 78: 1168–80
Bockstael N E, Bell K P 1997 Land use patterns and water quality: the effect of differential land management controls. In: Just R E, Netanyahu S (eds.) International Water and Resource Economics Consortium, Conflict and Cooperation on Trans-boundary Water Resources. Kluwer Publishers, The Netherlands, pp. 169–91
Chomitz K M, Gray D A 1996 Roads, land use and deforestation: a spatial model applied to Belize. World Bank Economic Review 10: 487–512
Clarke K C, Hoppen S, Gaydos L 1997 A self-modifying cellular automaton model of historical urbanization in the San Francisco Bay Area. Environment and Planning B: Planning and Design 24: 247–61
Deacon R T, Brookshire D S, Fisher A C, Kneese A, Kolstad C, Scrogin D, Smith V K, Ward M, Wilen J 1998 Research trends and opportunities in environmental and natural resource economics. Environmental and Resource Economics 11: 383–97
Geoghegan J, Wainger L A, Bockstael N E 1997 Spatial landscape indices in a hedonic framework: an ecological economics analysis using GIS. Ecological Economics 23: 251–64
Geoghegan J, Cortina Villar S, Klepeis P, Macario Mendoza P, Ogneva-Himmelberger Y, Roy Chowdhury R, Turner II B L, Vance C 2000 Modeling tropical deforestation in the southern Yucatan peninsular region: comparing survey and satellite data. In: Agriculture, Ecosystems, and Environment (under consideration)
Hof J, Bevers M, Joyce L, Kent B 1994 An integer programming approach for spatially and temporally optimizing wildlife populations. Forest Science 40: 177–91
Hunziker M, Kienast F 1999 Potential impacts of changing agricultural activities on scenic beauty: A prototypical technique for automated rapid assessment. Landscape Ecology 14: 161–76
Irwin E G, Bockstael N E 1999 Interacting agents, spatial externalities, and the endogenous evolution of residential land use pattern. Department of Agricultural and Resource Economics, University of Maryland, College Park, Working Paper 98–24
Irwin E G, Geoghegan J 2000 Theory, data, methods: developing spatially-explicit economic models of land use change. In: Agriculture, Ecosystems, and Environment (under review)
Mertens B, Lambin E F 1997 Spatial modeling of deforestation in southern Cameroon. Applied Geography 17: 143–62
National Research Council 1997 The Future of Spatial Data and Society: A Summary of a Workshop. National Academy Press, Washington DC
Pfaff A S P 1999 What drives deforestation in the Brazilian Amazon? Journal of Environmental Economics and Management 37: 26–43
Sanchirico J N, Wilen J E 1999 Bioeconomics of spatial exploitation in a patchy environment. Journal of Environmental Economics and Management 37: 129–50
Swallow S K, Talukdar P, Wear D N 1997 Spatial and temporal specialization in forest ecosystem management under sole ownership. American Journal of Agricultural Economics 79: 311–26
Veldkamp A, Fresco L 1997 Exploring land use scenarios: an alternative approach based on actual land use. Agricultural Systems 55: 1–17
White R, Engelen G, Uljee I 1997 The use of constrained cellular automata for high-resolution modelling of urban land-use dynamics. Environment and Planning B: Planning and Design 24: 323–43
J. Geoghegan
Special Education in the United States: Legal History
People with disabilities in most parts of the world have often been excluded from many of the usual privileges and benefits of the societies in which they lived. This has been as true of schools as of any other social institution. Although the rhetoric of public education, in particular, has often emphasized equality of opportunity, the practice of education well into the twentieth century belied this rhetoric. A recent report on educational practices in China, for example, indicates that students with disabilities are routinely rejected by universities regardless of how intelligent or hard working they are. As a result, most of China’s 20 million young people with physical disabilities stand little chance of receiving a higher education (Mooney 2000). Institutionalized inequalities in schools have been most evident in the ways that children of racial and ethnic minorities have been excluded from equitable learning opportunities, sometimes under the rationale that they could be provided a ‘separate but equal’ education. Less known is the history of exclusion of children with disabilities through legal and administrative practices. In 1893, a child was expelled from the Cambridge Public Schools in the USA because he was claimed to be ‘weak in mind.’ The expulsion was upheld by the Massachusetts Supreme Court (Watson v. City of Cambridge 1893). The extent to which school decisions leading to the exclusion of children with disabilities were based on factors other than a child’s academic ability is also illustrated by the wording of another court decision. In 1919, the Supreme Court of Wisconsin heard the case of a child with cerebral palsy who had the academic ability and physical capacity to be in school, but who drooled, had impaired speech, and had frequent facial muscle contractions. The court ruled that the child could be denied continuation in public school because he ‘produces a depressing and nauseating effect upon the teachers and school children’ (Beattie v. State Board of Education, City of Antigo 1919). During the first half of the twentieth century, exclusion of children with disabilities from schooling was common and legal. A striking example of the justification given for segregated education for children with disabilities is the statement of the Superintendent of the Baltimore Public Schools in 1908. 
James Van Sickle said: If it were not for the fact that the presence of mentally defective children in a school room interfered with the proper training of the capable children, their (separate) education would appeal less powerfully … But the presence in a class of one or two mentally or morally defective children so absorbs the energies of the teacher and makes so imperative a claim upon her attention that she cannot under these circumstances properly instruct the number commonly enrolled in a class. School authorities must therefore, greatly reduce this number, employ many more teachers, and build many more school rooms to accommodate a given number of pupils, or else they must withdraw into small classes these unfortunates who impede the regular progress of normal children. The plan of segregation is now fairly well established in large cities, and superintendents and teachers are working on the problem of classification, so that they may make the best of this imperfect material (Van Sickle 1908, pp. 102-3).
The idea that children with disabilities have a right to an education, and that they should be educated in integrated settings, somewhat haltingly followed the racial integration implementation efforts that were put into motion following the Brown v. Board of Education decision in the USA. Although there were increasing allocations of funds in the 1960s for training special education teachers and supporting special classes, these were usually incentive programs that school districts could choose to participate in, or not. The question of a right to an education for children with disabilities was not addressed for decades after the racial integration of schools was mandated. In 1975, interest in and commitment to special education in the US Congress had evolved to the point that legislation was passed that provided the basic foundation for the right to education for students with disabilities that exists today. That legislation, entitled the Education of All Handicapped Children Act of 1975 (PL 94-142 1975), required that appropriate educational services be provided to all students with disabilities. It provided funding for the implementation of these services. It also mandated the manner in which these services should be planned and provided, and it included safeguards for students and their families. The statement in the law that most directly addressed the issue of integrated education for students with disabilities is one that provides for what has come to be known as the least restrictive environment: ‘Each public agency shall insure that special classes, separate schooling, or other removal of handicapped children from the regular educational environment occurs only when the nature or severity of the handicap is such that education in regular classes with the use of supplementary aids and services cannot be achieved satisfactorily’ (Public Law 94-142 1975). In 1986 the law was amended by PL 99-457. This legislation required a free appropriate education for preschool children aged three to five with disabilities. It also provided funding incentives to encourage the development of early intervention programs for infants and toddlers with disabilities, and those at risk of developing disabilities. PL 94-142 was further amended in 1990. 
Among other modifications, this amendment changed the name of the existing law to the Individuals with Disabilities Education Act or, as it is now commonly called, IDEA. The IDEA included language that was very explicit in its call for including children with disabilities in regular school programs. It stated that schools must have a continuum of placements available in order to appropriately meet the needs of these students. It emphasized, however, that unless a child’s individualized education program specifies some other necessary arrangement, the student must be educated in the school he or she would attend if he or she did not have a disability. Thus the IDEA reiterated the requirement that children with disabilities and those without disabilities be educated together to the maximum extent appropriate. The law also underscored the requirement that special classes, separate schooling, or other forms of segregation of students with disabilities occur only when the education of these students with the use of supplementary aids and services cannot be achieved satisfactorily in regular classrooms (Individuals with Disabilities Education Act 1990).
1. The Institutionalizing of Special Education: A Conceptual History
It is ironic that PL 94-142 has been referred to as ‘the mainstreaming law.’ Mainstreaming, as both a concept and a term, has been used widely but defined very imprecisely. Like other educational terms, it has been used by professionals, parents, and politicians as if there were a common understanding of what was being communicated. In fact, that has often not been the case. Rather, the term has meant different things to different individuals and groups. One early attempt to offer a model of mainstreaming outlined three elements that should characterize it: a continuum of services for students with disabilities, a reduction in the number of children ‘pulled out’ of regular classes, and increased provision of special services within regular classrooms rather than outside of those classrooms (Berry 1972). The impetus for mainstreaming as it was defined by Berry, and as it was interpreted elsewhere under the term least restrictive environment, can perhaps be seen most clearly in Lloyd Dunn’s 1968 article entitled ‘Special Education for the Mildly Retarded: Is Much of It Justifiable?’ In the article, Dunn called for special educators to consider carefully the efficacy research then available indicating that children with disabilities made greater academic progress in regular classrooms than in special classes. He urged special educators to resist ‘being pressured into a continuing and expanding program (special classes) that we know now to be undesirable for many of the children we are dedicated to serve’ (Dunn 1968, p. 5). He also argued that the labeling of children for placement in special classes led to a stigma that was very destructive to their self-concepts. Dunn asserted that ‘removing a child from the regular grades for special education probably contributes significantly to his feelings of inferiority and problems of acceptance’ (Dunn 1968, p. 9). 
Following the publication of Dunn’s article, there were others who called for the activating of Dunn’s concerns in educational policy and practice. A ‘zero reject model’ was advocated that suggested that no child with mental retardation should be ‘rejected’ from a regular class and placed in a special class (Lilly 1970). The Council for Exceptional Children Policies Commission adopted a policy that specified that all students with disabilities ‘should spend only as much time outside of regular classroom settings as is necessary to control learning variables’ (Council for Exceptional Children Policies Commission 1973, p. 494). Even though it was variously defined and understood, the concept of mainstreaming grew in importance and in its acceptance among professionals
and the public during the 1970s and 1980s. However, the actual practice of mainstreaming was not, according to some critics, as common as the rhetoric that endorsed it. In 1986 a new call for the integration of children with disabilities into regular education programs was issued from the Assistant Secretary for Special Education and Rehabilitative Services of the US Department of Education. Secretary Madeline Will proposed what she called the Regular Education Initiative (REI). The REI proposed the restructuring of American education into a single system for the delivery of services to all children. Will argued that by merging special and regular education, a ‘shared responsibility’ would be created that would serve children without the stigma of diagnostic labels or segregated classrooms (Will 1986). Will’s proposal for the REI generated a great deal of controversy and concern in the special education profession. In 1987, for example, a group of leaders in the field met to assess the implications of the proposal. They praised the intent of Secretary Will’s call for restructuring but cautioned that it ‘is important to acknowledge that special education cannot seek institutional solutions to individual problems without changing the nature of the institution … school organization is what fundamentally must change’ (Heller and Schilit 1987). This idea, that special education can change in fundamental ways only if the institution of the public school changes, is to be found repeatedly in the literature on special education reform that has been written in recent years (McLaughlin and Warren 1992). The critics of the REI voiced strong concern over the implications of the proposal. Kauffman, for example, expressed fears that the special services needed by children with disabilities might be diluted or eliminated if the REI became a reality. 
He also argued that most regular classroom teachers do not have the training and/or the inclination to work with many of the students who are served by special education. In language reflective of his strong beliefs on this issue, Kauffman stated, ‘The belief systems represented by the REI are a peculiar case in which both conservative ideology (for example, focus on excellence, federal disengagement) and liberal rhetoric (for example, nonlabeling, integration) are combined to support the diminution or dissolution of a support system for handicapped students’ (Kauffman 1989, p. 273). Supporters of the REI asserted that the response to Lloyd Dunn’s challenge in 1968 to the efficacy of special classes had not been genuine or widespread. They questioned the essential validity of what had been tried and accomplished during the 1970s and 1980s in the name of mainstreaming. REI proponents maintained that only through a merger of special education into the regular education structure could all children be given the educational services they need without the stigma that had become associated with special education (Robinson 1990).
The interest in a merger of special and regular education as it was advocated by REI supporters proved, however, to be a one-dimensional interest. REI advocates were almost exclusively special educators. Ironically, regular education showed very little interest in the REI. This lack of interest in a merger of structures and efforts between regular and special education inspired an early critic to write: ‘We have thrown a wedding and neglected to invite the bride. If this is an invitation to holy matrimony, it was clearly written by the groom (special education), for the groom, and the groom’s family. In fact, the bride (general education) wasn’t even asked. She was selected’ (Lieberman 1985, p. 513).
2. Inclusion and the Future of Special Education
The most recent descriptor for the effort to create greater integration of children with disabilities into school programs is the term inclusion. For many educators, the term is viewed as a more positive description of efforts to include children with disabilities in genuine and comprehensive ways in the total life of schools. Whereas mainstreaming might have meant the same thing for many people, it could also be taken as having as its primary focus the physical presence of a child with disabilities in a regular classroom. Mainstreaming in this form could be only the appearance of integration. The REI was seen by some people, even some of its supporters, as meaning the dismantling of special education. Inclusion, however, can mean that the goal of education for children with disabilities is the genuine involvement of each child in the total life of the school. Genuine inclusion means welcoming children with disabilities into the curriculum, environment, social interaction, and self-concept of the school. Inclusion, of course, can (and already does) mean different things to different people. Some people interpret it to be just a new way of talking about mainstreaming. For others it may be seen as the REI with a new label. Some have even used the term inclusion as a banner for calling for ‘full inclusion’ or ‘uncompromising inclusion,’ meaning the abolishment of special education (Fuchs and Fuchs 1994). Debates over the nuances and fine points of difference between mainstreaming, the REI, inclusion, full inclusion, and whatever the next term or concept may be are surely not reflective of the grassroots concerns of parents, teachers, and school leaders who are earnestly trying to do their best for children with disabilities.
Coupled with the other vagaries and demands that are placed on parents and educators in our increasingly complex society, the confusion that exists over the rights and responsibilities of the various constituencies in the education arena must seem overwhelming.
The most sound basis for providing an inclusive education for students with disabilities in public schools must ultimately consist of the legislative foundations that create services for these students, and the interpretation that courts give to questions about this foundation. The legislative foundation for inclusion in the USA, for example, is IDEA. It is clear that this statute stresses two things: the policy of the least restrictive environment and the requirement that a continuum of services must be available so that an appropriate education may be provided to every child who has the need for special education. Court decisions about this foundation are a fitting topic for the close of this discussion of the institutional aspects of special education. Although the cases and decisions of US courts concerning inclusion have varied, the principles derived from this litigation have been clear and compelling. Yell (1995) has shown that school districts must make every good-faith effort to maintain students with disabilities in general class placements with whatever appropriate aids and services might be needed to support a student in that environment. School districts that have not done so have consistently been found by courts to be in violation of the law. On the other hand, it is apparent from judicial review and decisions that some children may be so disruptive to general classrooms that these settings are not in their best interests or those of their nondisabled peers. The following suggestions to educators and others interested in the inclusion of children with disabilities in the total school environment are based on Yell’s study (1995) of court decisions concerning the issue of the least restrictive environments for the education of students with disabilities: (a) Determination of the LRE (least restrictive environment) must be based on the individual needs of the child. (b) Good-faith efforts must be made to keep students in integrated settings.
The beginning assumption should always be that a student belongs in a general classroom. Exceptions may be made to this assumption based on individual needs. (c) A complete continuum of alternative placements and services must be available. (d) In making LRE/inclusion decisions, the needs of the student’s peers should be considered. (e) When students are placed appropriately in more restrictive settings, they must still be integrated into general classes and other school activities to the maximum extent feasible and beneficial to the student. (f) The LRE/inclusion decision-making process must be thoroughly documented. The achievement of an individualized and appropriate educational program for any student is a dynamic process. Decisions regarding the most effective blend of special services and inclusive practice must be continually reviewed and changed in the best interests of the student.
See also: Disability, Demography of; Disability: Psychological and Social Aspects; Disability: Sociological Aspects; Law and People with Disabilities; Mental Retardation: Clinical Aspects; Mental Retardation: Cognitive Aspects
Bibliography
Beattie v. State Board of Education, City of Antigo 1919 169 Wisc. 231, 172 N.W. 153
Berry K 1972 Models for Mainstreaming. Dimension, San Rafael, CA
Council for Exceptional Children Policies Commission 1973 Proposed CEC policy statement on the organization and administration of special education. Exceptional Children 39: 493–97
Dunn L M 1968 Special education for the mildly retarded: Is much of it justifiable? Exceptional Children 34: 5–22
Fuchs D, Fuchs L 1994 Inclusive schools movement and the radicalization of special education reform. Exceptional Children 60: 294–309
Heller H W, Schilit J 1987 The regular education initiative: A concerned response. Focus on Exceptional Children 20: 1–7
Individuals with Disabilities Education Act 1990 (PL 101-476) 20 USC: Ch. 33, Sec. 1400–1485
Kauffman J M 1989 The regular education initiative as Reagan–Bush education policy: A trickle-down theory of education of the hard-to-teach. Journal of Special Education 23: 256–78
Lieberman J M 1985 Special education and regular education: A merger made in heaven? Exceptional Children 51: 513–16
Lilly M S 1970 Special education: A teapot in a tempest. Exceptional Children 37: 43–48
McLaughlin M, Warren S 1992 Issues and Options in Restructuring Schools and Special Education Programs. Center for Policy Options in Special Education/University of Maryland, College Park, MD
Mooney P 2000 A college for disabled students meets indifference in Beijing. The Chronicle of Higher Education 46: 1–6
PL 94-142 1975 The Education of All Handicapped Children Act of 1975 Public Law 94-142 34 C.F.R. 300.550 (b) (2)
Robinson V F 1990 Regular education initiative: Debate on the current state and future promise of a new approach to educating children with disabilities. Counterpoint 5: 3–5
Van Sickle J H 1908 Provision for exceptional children in the public schools. Psychological Clinic 2: 102–11
Watson v. City of Cambridge 1893 157 Mass. 561, 32 N.E. 864
Will M C 1986 Educating children with learning problems: A shared responsibility. Exceptional Children 52: 411–15
Yell M L 1995 Least restrictive environment, inclusion, and students with disabilities: A legal analysis. Journal of Special Education 28: 389–404
J. D. Smith
Specialization and Recombination of Specialties in the Social Sciences
Specialization is an old phenomenon, observed as early as Auguste Comte. Recombination of specialties is also an old trend: Robert Merton noticed it in the early 1960s: ‘the interstices between specialties become
gradually filled with new cross-disciplinary specialties’ (Merton 1963, p. 253). The multiplication of scientific activities and the growing complexity of knowledge have brought a division of labor, visible in increasing expertise in all domains. The history of scientific advancement is a history of concatenated specialization. In spite of this evidence, ‘even since the social sciences began to develop, scholars have repeatedly expressed apprehension about the increasing fragmentation of knowledge through specialization’ (Smelser 1967, p. 38). Fragmentation and specialization are the two faces of the same coin. This double process is a condition of scientific advancement. The term ‘discipline’ is inherited from the vocabulary of the nineteenth century and is understood as a branch of instruction for the transmission of knowledge and as a convenient mapping of academic administration. In all universities, teaching, recruitment, promotion, peer review, and financing are still organized along disciplinary lines, but the great scientific laboratories and the most creative research centers are anchored on specialties, which in many cases cross the borders of formal disciplines, or are located at the interstices between disciplines. In the sociology of science the term ‘discipline’ is defined as a ‘cluster of specialties’ (Crane and Small 1992, p. 198), the specialty being the main source of academic recognition and of professional legitimation (Zuckerman 1988, p. 539). In the natural sciences, specialization is celebrated as a mark of competence. Each contemporary natural science from astronomy to zoology is a conglomerate of dozens of specialties. Only in some social sciences, particularly in sociology, political science, psychology, and anthropology, does one hear complaints and lamentations about the fragmentation of the discipline into specialties. 
The word ‘crisis’ appears in the title of many books and articles about the identity and the contours of these disciplines (Horowitz 1993, S. Turner and J. Turner 1990). In reality, such ‘crises’ are symptoms of growth and expansion.
1. Specialization
Analysis means breaking things into parts. All sciences have made progress, from the sixteenth century on, by internal differentiation and cross-stimulation among emergent specialties. Each specialty developed a patrimony of knowledge. With the growth of these patrimonies specialization became a necessity. Increasing specialization has led to the creation of subdisciplines. A specialty requires a high qualification in a delimited domain. The process of specialization has tended to disjoin activities which had previously been united, and to separate scholars belonging to the same formal discipline, but who are interested in different fields. Specialties are considered as parts of a larger discipline but ‘the modern scholar normally does not attempt to
master his discipline in its entirety. Instead he limits his attention to one field, or perhaps two’ (Somit and Tanenhaus 1964, p. 49). Each formal discipline gradually becomes unknown in its entirety. No theory can any longer encompass the whole territory of a discipline. Talcott Parsons was the last sociologist to attempt such a theory. As a discipline grows, its practitioners become specialized and must neglect other areas of the discipline. It is for this reason that scissiparity, the amoeba-like division of a discipline into two, is a common process of fragmentation. Specialization provides researchers with methods so that they need not develop them anew. Refinements in methods are more easily transmitted among specialists. Specialization within disciplines is visible in national and international professional meetings. Everyone who has participated at a gathering of several thousand people has noticed the absence of coherence: 20–30 panels may run simultaneously, most of them mobilizing only a handful of people. The plenary sessions attract only a small minority, with most participants uninterested in issues encompassing the entire discipline. Disciplines fragment along substantive, epistemological, methodological, theoretical, and ideological lines. To those in the field, the theoretical and ideological divisions are likely to seem more important than to others. Ralph Turner gives a description of this process in sociology, based on his experience as Editor of an important journal: ‘In the 1930s and 1940s the aspiration to be a general sociologist was still realistic. There was a sufficiently common body of core concepts and a small enough body of accumulated research in most fields of sociology that a scholar might make significant contributions to several, and speak authoritatively about the field in general.
It is difficult to imagine the genius necessary for such accomplishments today’ (R. Turner 1991, p. 70). The division of disciplines into specialties should be distinguished from their fragmentation into schools and sects. The term school refers to a group of scholars ‘who stress a particular aspect’. The proponents of a school are emotionally committed to their approach. A school is simultaneously a subdivision of a field and a species of cult or sect (Smelser 1967, pp. 6–7). While the multiplication of specialties is related to the maturation of the discipline, ‘the presence of numerous schools in a discipline generally betokens a relative scientific immaturity’ (Smelser 1967, pp. 6–7). The same diagnosis is formulated by Gabriel Almond who, by crossing two dimensions, an ideological one and a methodological one, described four types of schools in political science (Almond 1990, p. 15). These types are not determined by the process of specialization. Various authors stress the importance of the patrimony’s expansion for subsequent fragmentation. As disciplines grow, they fragment and most parts become the patrimonies of individual subfields; a rare few are passed down in the patrimony of several formal
disciplines and become the classics. For Randall Collins, the increasing specialization in sociology could be explained by the growth in scale in the profession during the late twentieth century: ‘How does one make oneself visible when the sheer number of competitors increases? The field differentiates into various specialties. Instead of seeking recognition in the larger field, one opts for a smaller arena within which the discourse and the search for originality and for allies can go on’ (Collins 1986, p. 1340). A process of differentiation without integration is visible across disciplinary borders. Specialists ‘seem scarcely to recognize the names of the eminent practitioners in specialties other than their own’ (Collins 1986, p. 1340). Another scholar who emphasizes the importance of specialties in the organization of scientific communities is Harriet Zuckerman: ‘Much historical and sociological evidence suggests that specialties have been prime audiences for many scientists: they are the explicit and tacit audiences—the reference groups—to which they address their work, just as they are the prime sources of wherewithal and rewards for that work’ (Zuckerman 1988, p. 539). As in some cathedrals, there are more ceremonies in the chapels than under the principal nave.
2. Specialties Versus Disciplines
Analyzing the relationships between specialties within disciplines and between specialties across disciplines, two kinds of disciplines can be perceived: ‘restricted disciplines, such as most physical sciences, which would be expected to exhibit a high degree of linkage between different research areas within the discipline, but less linkage to other disciplines,’ and unrestricted sciences, such as most social sciences, which would be likely to exhibit relatively diffuse links among research areas both within and outside the disciplines (Crane and Small 1992, p. 200). Depending on the definition we adopt, there are in the social sciences between 12 and 15 formal disciplines, but dozens of specialties, sectors, fields, and subfields. In sociology, for instance, there are some 50 specialties, as indicated by the list of research committees of the International Sociological Association. There are as many in the International Political Science Association. Most of these groups cooperate to a limited extent within their respective discipline. There is more communication between specialties belonging to different disciplines than between specialties within the same discipline. Sociometric studies show that many specialists are more in touch with colleagues who belong officially to other disciplines than with colleagues in their own discipline. The ‘invisible college’ described by Robert Merton and other sociologists of science is an eminently interdisciplinary institution because it ensures communication
not only from one university to another and across all national borders, but also, and above all, between specialists attached administratively to different disciplines. The networks of cross-disciplinary influence are such that they are obliterating the old classification of the social sciences. The division of disciplines into subfields tends to be institutionalized, as can be seen in the organization of large departments in many American and European universities. Another good indication of the fragmentation of the discipline is the increasing number of specialized journals. In the last two decades several hundred specialized journals in English relevant to social sciences have been launched. Most of these new journals cross the borders of two or three disciplines. Some other new journals have appeared in French and German. European unification has had an impact on the development of cross-national journals focusing on special fields. Specialization has consequences on the role of national professional associations and on the general journals. The size of specialties varies greatly: Egyptology and biosociology are small specialties; socialization and clientelism are very large. A segment which drifts away from the main body of the discipline has to reach a certain critical size before being recognized. Some of them enjoy a relative autonomy within the discipline. A distinction has to be drawn between specialization within a formal discipline and specialization at the intersection of several monodisciplinary subfields. The latter can occur only after the former has become fully developed.
3. Recombination of Specialties Across Disciplines
Specialization within disciplines, accompanied by their fragmentation, is the first process. Recombination of specialties across disciplinary borders is the second. In the history of science a twofold process can be seen: on the one hand, a fragmentation of formal disciplines and, on the other, a cross-disciplinary recombination of the specialties resulting from fragmentation. The new field may become independent, like social psychology, or may continue to claim a dual allegiance, like historical sociology. In the latter case, librarians are not sure whether to place a work in the category of history or of sociology. Such a recombination has been called hybridization (Dogan and Pahre 1990, p. 63). A hybrid is a combination of two branches of knowledge. Jean Piaget suggested a different analogy: genetic recombination (Piaget 1970, p. 524). The reasons for hybridization are clear enough: specialization leaves gaps between disciplines; specialties fill the gaps between two existing fields. Many of the new specialties are to a large degree hybrid specialties. The phenomenon is
even more obvious in the natural sciences, e.g., in the awards of the Nobel prizes. Some scholars recommend an ‘interdisciplinary’ approach. Such a recommendation is not realistic because it overlooks the process of specialization. Research enlisting several disciplines involves in fact a combination of segments of disciplines, of specialties, and not whole disciplines. The fruitful point of contact is established between sectors, and not along disciplinary boundaries. Considering the trends in the social sciences, the word ‘interdisciplinarity’ appears inadequate since it carries a hint of superficiality and dilettantism, and consequently should be avoided and replaced by hybridization of specialties. Different disciplines may proceed from different foci to examine the same phenomenon. This implies a division of territories between disciplines. In contrast, hybridization implies an overlapping of segments of disciplines, a recombination of knowledge in new specialized fields. Innovation in each discipline depends largely on exchanges with other fields belonging to other disciplines. When old fields grow they accumulate such a mass of material in their patrimony that they split up. Each fragment of the discipline then confronts the fragments of other fields across disciplinary boundaries, losing contact with its siblings in the old discipline. A sociologist specialized in urbanization has less in common with a sociologist studying elite recruitment than with a geographer interested in the distribution of cities; the sociologist studying social stratification has more in common with his colleague in economics analyzing income inequality than he does with a fellow sociologist specialized in organizations; psychologists studying child development are much more likely to be interested in developmental physiology or in the literature on language acquisition than they are in other branches of psychology. 
A political scientist researching political socialization reads more sociological literature on the agents of socialization (family, church, school, street-corner society, cultural pluralism, etc.) than literature in political science on the Supreme Court, legislative processes, party leadership, or the recruitment of higher civil servants. Those working on the subfield of international relations in the nuclear age have little recourse to the classical literature in political science, but rather to economics, technology, military strategy, diplomatic history, game theory, nuclear physics, and engineering. Most hybrid specialties and domains recognize their genealogical roots: political economy, political sociology, social geography, historical sociology, genetic demography, psycholinguistics, social anthropology, social ecology, biogeography, and many others. Some hybrid sciences do not indicate their filiation in their name: cognitive science, paleoarcheology. Eight social sciences also have roots in the natural sciences: demography, geography, psychology, anthropology, linguistics, archeology, criminology, and cognitive
science. The hybrid specialties branch out in turn giving rise, at the second generation, to an even larger number of hybrids. Recombination of specialties appears in the borrowing of concepts, theories and methods (Dogan 1996). Concepts are vehicles for communication between specialties belonging formally to different disciplines. Concepts are not equally important in all specialties. For instance, alienation is more frequently used in social psychology than in political economy, segregation more in urban studies than in comparative politics, and charisma more in political sociology than in social ecology. There are very few general theories accepted by the majority of scholars in any discipline; the best known theories are specialized theories: they are what Robert Merton calls middle-range theories, which do not encompass an entire discipline, but only a part of it. A specialized theory can cross several disciplines with a minimal effort of adaptation. For instance ‘rational choice’—considered by some as a theory and by others as a school or sect—is, except for some stylistic features, basically the same in political science, in sociology, and in economics. The theory of methodological individualism does not change on moving from one discipline to another. Two to three dozen theories peregrinate freely among formal disciplines. The high interdisciplinary mobility of specialized theories and of specialized concepts, and also of research methods, gives some centripetal orientation to the wide range of social sciences. The cross-disciplinary diffusion of concepts and theories is in fact a diffusion among specialties belonging to different disciplines. Methodology divides some disciplines as much as ideologies, and at the same time builds bridges between specialties across the disciplinary borders.
The scholars who practice factor analysis understand each other well no matter what their official disciplinary affiliation, but most of them encounter some difficulties in communicating, within their discipline, with their colleagues who do not practice quantitative methods. In some departments methodological controversies generate factions, which may or may not coincide with divisions between specialties. The citation patterns in academic publications permit us to measure empirically the degree of coherence of a discipline, the relations between specialties within disciplines, and the interferences of disciplines. If specialists in a given subdiscipline tend to cite mostly, or exclusively, specialists of the same subdiscipline, and if relatively few authors cite across the borders of their subdiscipline, then the formal discipline presents a low degree of internal coherence. In this case the real loci of research are the specialties. If, on the contrary, a significant proportion of authors cross the borders of specialties, then the discipline as a whole appears more or less as an integrated territory (Dogan 1997). Political scientists have borrowed in the 1930s from law, in the 1950s from sociology and law,
and in the 1970s from sociology, philosophy, economics, history, and psychology. According to Craig Calhoun, ‘Sociology is neither the most open nor the most insular of social sciences’ (Calhoun 1992, p. 143). A study of patterns of citation during the period 1936–75 has found that sociologists cited 58 percent of the time articles in sociology journals, political scientists cited 41 percent of the time scholars from their own discipline, anthropologists referred 51 percent of the time to their colleagues, psychologists 73 percent to their own kin, and 79 percent of the economists did the same (Rigney and Barnes 1980, p. 117). In each social science a significant proportion of theoretical, methodological, and substantive communication has been with other disciplines, the most open being, in the late twentieth century, political science and the most autarchic economics. In an analysis of journals identified as belonging to sociology and economics, a significant shift has been found from sociology to ‘interdisciplinary sociology’ and from economics to ‘interdisciplinary economics,’ between 1972 and 1987. The criterion for trade between disciplines was the proportion of citing references in journals of the respective discipline (Crane and Small 1992, pp. 204–5). An analysis by the same authors in terms of clusters of references shows a clear increase in interdisciplinary relationships. Most social scientists are specialists contributing primarily to their own subdiscipline. For instance, the Handbook of Sociology edited by Neil Smelser (1988) is divided into major specialties, without an overarching structure. Smelser is aware of this heterogeneity. Noting that Talcott Parsons had ‘exaggerated the internal unity of the discipline,’ which ‘if applied today would appear almost ludicrous’ (p.
12), he writes: ‘there appears to be no present evidence of an overarching effort at theoretical synthesis … and little reason to believe that such an effort is on the horizon,’ adding that ‘we have by now gone too far down the road of specialization and diversification’ (p. 12). The ship of sociology is already built with watertight compartments. If the borders between specialties within disciplines are hermetic, in exchange, paradoxically enough, the frontiers between formal disciplines are open. A distinction is needed between interdisciplinary amalgamation and hybridization by recombination of specialties belonging to different disciplines. A ‘unified social science’ was visible only in the early phase of the development of social sciences, when disciplines were not yet differentiated. Hybridization of specialties came much later, after the maturation of the process of internal fragmentation of disciplines. The word interdisciplinary is appropriate only for a certain phase of social sciences at the beginning of the twentieth century; it becomes misleading when it is used to describe contemporary trends, because at the end of the twentieth century only specialties are overlapping, not entire disciplines.
Some may confound recombination and synthesis. A synthesis brings a new interpretation, a personal or stylistic achievement. The difference is most clear in history. Arnold J. Toynbee’s theory of history and Hobsbawm’s The Age of Empire are syntheses. Fernand Braudel’s The Mediterranean World is a recombination of segments of social sciences, largely history and geography. Perry Anderson’s Lineages of the Absolutist State is largely synthesis, whereas Karl Wittfogel’s Oriental Despotism is largely recombination. The recombinations are varied. Child development includes developmental physiology, language acquisition, and socialization. Indo-European studies encompass segments of historical linguistics, archeology, prehistory, and botany. Scholars in criminology come from subfields in law, sociology of deviance, social psychology, endocrinology, urban studies, social economics, and ethnopolitics. The study of artificial intelligence encompasses formal logic from philosophy, grammar and syntax from linguistics, and computer programming from computer science. Folklore studies includes sectors from historical linguistics, cultural anthropology, social history, and comparative literature. In one generation the number of social scientists has increased 10-fold. Their status in the profession depends mostly on their contribution to knowledge, and much less on their performance as teachers. As teachers they are generalists, as researchers, specialists. A strong feeling of diminishing marginal returns is perceived among the mass of social scientists. This phenomenon of saturation and overcrowding has been called the ‘paradox of density’: the tendency of densely populated disciplines to produce less innovation per head in spite of the greater effort applied in these overcrowded domains (Dogan and Pahre 1990, p. 32).
What is perceived by some scholars as ‘dispersion’ and ‘disintegration’ should be considered as manifestations of the growth of specialties, resulting in the weakening of the saturated disciplinary core, a process engendered by centrifugal and autonomous tendencies of the specialties across disciplines. The recombination of specialties across disciplines requires a remodeling of the intellectual map of the behavioral social sciences.
Bibliography
Almond G A 1990 Discipline Divided, Schools and Sects in Political Science. Sage, Newbury Park, CA
Calhoun C 1992 Sociology, other disciplines and the project of a general understanding of social life. In: Halliday T C, Janowitz M (eds.) Sociology and its Publics: The Forms and Fates of Disciplinary Organization. University of Chicago Press, Chicago, pp. 137–96
Collins R 1986 Is 1980s sociology in the doldrums? American Journal of Sociology 91: 1336–55
Crane D, Small H 1992 American sociology since the seventies: the emerging identity crisis in the discipline. In: Halliday T C, Janowitz M (eds.) Sociology and its Publics: The Forms and Fates of Disciplinary Organization. University of Chicago Press, Chicago, pp. 197–234
Dogan M 1996 Political science and the other social sciences. In: Goodin R, Klingemann H D (eds.) A New Handbook of Political Science. Oxford University Press, Oxford, UK, pp. 97–130
Dogan M 1997 Les nouvelles sciences sociales: fractures des murailles disciplinaires. Revue Internationale des Sciences Sociales 153: 467–82
Dogan M, Pahre R 1990 Creative Marginality: Innovation at the Intersections of Social Sciences. Westview Press, Boulder, CO
Horowitz I L 1993 The Decomposition of Sociology. Oxford University Press, New York
Merton R 1963 The mosaic of the behavioral sciences. In: Merton R K, Berelson B (eds.) The Behavioral Sciences Today. Basic Books, New York, pp. 247–72
Merton R 1968 On sociological theories of the middle range. In: Social Theory and Social Structure. Free Press, New York, pp. 39–53
Piaget J 1970 General problems of interdisciplinary research in UNESCO. In: Piaget J (ed.) Main Trends of Research in the Social and Human Sciences. Mouton, Paris, pp. 467–528
Rigney D, Barnes D 1980 Patterns of interdisciplinary citation in the social sciences. Social Science Quarterly 61: 114–27
Smelser N J 1967 Sociology and the other social sciences. In: Lazarsfeld P, Sewell W, Wilensky H (eds.) The Uses of Sociology. Basic Books, New York, pp. 3–44
Smelser N J (ed.) 1988 Handbook of Sociology. Sage, Newbury Park, CA
Somit A, Tanenhaus J 1964 American Political Science, a Profile of a Discipline. Atherton, New York
Turner R 1991 The many faces of American sociology. A discipline in search of identity. In: Easton D, Schelling C S (eds.) Divided Knowledge Across Disciplines, Across Cultures. Sage, London, pp. 59–85
Turner S P, Turner J H 1990 The Impossible Science: An Institutional Analysis of American Sociology. Sage, Newbury Park, CA
Zuckerman H 1988 Sociology of science. In: Smelser N J (ed.) Handbook of Sociology. Sage, Newbury Park, pp. 511–74
M. Dogan
Species and Speciation
At first glance, ‘species’ should be a relatively easy term to define: a group of organisms sharing a common biological lineage. Yet biologists have argued over the details of the definition since around 1900. This is in large part because the definition inevitably reflects concepts about processes resulting in the formation of species (or speciation), which lie at the heart of Darwinian evolution. Those speciation processes also underlie observed patterns of biodiversity in both space and time.
1. Species Concepts
Species concepts proposed since around 1950 may be broadly grouped into three categories: typological (based on shared traits), biological (based on breeding habits), and evolutionary (based on lineage) concepts. In the first case, the goal is usually a definition that follows taxonomic practice in species descriptions. The second and third categories reflect ideas about the speciation process.
1.1 Typological Species Concepts
Typological species concepts are based on the premise that a group of individuals of one ‘type,’ sharing a list of traits, constitute a species. This set of species concepts includes the phenetic species concept (Sokal and Crovello 1970) and the genotypic cluster definition (Mallet 1995). In these cases, species are individuals sharing well-defined clusters of traits, with few to no individuals with intermediate traits. Some versions of what are usually referred to as phylogenetic species concepts (e.g., Cracraft 1983) also fall into the typological category, as species are defined by clusters of shared traits, although individuals are also grouped by population or lineage (for asexual organisms). Traits may be genotypes or phenotypes; that is, the genes possessed by an organism or the characters resulting from the expression of those genes. However, as its name implies, the genotypic cluster definition focuses on genotypes. Phenotypic traits are commonly measured as length, width, etc., of some body part, or as proteins separated and visualized on a gel. Genotypic traits tend to be quantified as DNA or RNA base sequences. For example, bacterial species that cannot be grown in the laboratory are often described based on their DNA differences and similarities. Taxonomists routinely use these typological species concepts when they identify individual organisms to species. The definitions have the advantage that they can be applied to almost any organism, dead or alive, fossil or recent.
All one needs to be able to do is to make measurements of the phenotype and/or the genotype of the organism in question. However, this is also the disadvantage: one has to choose appropriate traits to measure, which will highlight the differences and similarities among individuals.
1.2 The Biological Species Concept and Its Relatives
The biological species concept states that species are populations of interbreeding organisms. This concept, strongly espoused by Mayr (1942), was dominant until the 1980s. It continues to be widely cited. The initial formulation of the biological species concept focused on reproductive isolation and hence restriction of gene exchange between species. The inverse of inter-specific reproductive isolation, use of
shared mate recognition and fertilization systems within a species, was later put forward by Paterson (1985). For example, two species of frogs could be distinguished based on their inability to produce viable offspring, or alternatively by shared mating calls and ability to produce viable offspring within each species. Some relatively obvious difficulties arise in applying this species definition to individual cases. First, the requirement for interbreeding makes the application to asexual or clonal organisms impossible. Second, many populations are geographically separated and it is not known whether they could interbreed or not if they were brought together in the same place. Third, application to the fossil record is nearly impossible, as the probability of finding two fossil organisms interbreeding is exceedingly low, and even if that were to occur, the question of viable offspring remains. Nonetheless, the concept is useful as an embodiment of the necessity for restricted gene exchange between species.
1.3 The Evolutionary Species Concept and Its Relatives
Evolutionary species concepts place weight on defining species based on organisms’ historical or evolutionary lineage. These concepts arose initially out of the recognition that species do occasionally interbreed. Interbreeding results in movement of genes from individuals of one species into the gene pool of another (introgression), but in some cases, species maintain their morphological and ecological distinctness even in the face of gene introgression. Definitions by Simpson (1951) and Wiley (1978) focused on species as lineages that have their own evolutionary histories and fates. Most phylogenetic species concepts also fall under this heading. A species is defined as those organisms within a lineage between the initial splitting off of the species (cladogenesis) and either extinction of the lineage or further cladogenesis (e.g., Hennig 1966).
The idea has been refined further to assert that such lineages must be monophyletic (derived from one common ancestor). However, the observation of gene introgression suggests that while individual traits in a species may be monophyletic, it is unlikely that all traits will be. In practice, application of these species concepts requires knowledge of evolutionary history, which at present is obtained through comparisons of traits and constructions of cladograms (or genealogical trees) representing phylogenetic hypotheses. The data used are thus similar to those described in Sect. 1.1.
1.4 Pluralistic Views of the Species Concept
While broader conceptualizations of species may be unwieldy in terms of the data required to define a set of
individuals as a species, such concepts have been put forward as more realistic and defensible. Templeton (1989) offered a cohesion species concept, which defines species based on a combination of factors promoting genetic divergence between populations along with those promoting genetic similarity within populations, and includes factors that limit the spread of variation through genetic drift or natural selection. This concept combines biological species and recognition concepts outlined above with a more explicit incorporation of evolutionary processes. Identification of species starts with identification of an evolutionary lineage (as in evolutionary species concepts), followed by tests of reproductive isolation or mate recognition systems, and culminating with examination of the uniformity of selection regimes or ecological adaptations within groups.
1.5 Remaining Problems with Species Concepts
Most of these discussions of species assume that species are now formed as well-defined entities. This is clearly not the case for any of the definitions or concepts presented in Sects. 1.1–1.4. We can identify races and sub-species, which eventually may form species, using the typological approach; hybrid zones between species challenge the biological species approach; and gene introgression poses problems for monophyly. As a result, further refinement of the species concept is likely in the future.
2. Speciation Mechanisms
In practice, speciation occurs through formation of barriers to gene exchange between groups of organisms, concurrent with or followed by genetic differentiation of the groups.
2.1 Pre- and Post-zygotic Barriers to Gene Exchange
The barriers to gene flow are often lumped into two classes: pre-zygotic and post-zygotic mechanisms, named depending on whether they occur before or after the uniting of the gametes to form a zygote. Pre-zygotic mechanisms include all the factors associated with mate location, courtship, and mating, either behavioral or morphological. They also include the process of gamete union to form a zygote. Post-zygotic mechanisms include hybrid inviability or sterility. In general it is believed that even though post-zygotic mechanisms may initially isolate two species, selection should favor the development of pre-zygotic mechanisms whenever variation allowing that development occurs. This is because the investment by individuals in a hybrid offspring is higher if only post-zygotic mechanisms are in place.
2.2 Specific Barriers to Gene Exchange
Specific barriers to gene exchange include morphology, behavior, and genetic and cytoplasmic incompatibility. The first, morphology, has been well documented in a number of insect-pollinated plants. The location of the anthers (pollen bearing structures) relative to the nectaries combined with the body size and shape of the insects determines whether visiting insects actually brush against the anthers and hence transport pollen to other flowers. Speciation may also involve behavioral divergence between two groups. Behavioral changes involved in speciation are usually associated with courtship and mating rituals. For example, species’ differences in mating calls occur in frogs and crickets. Pheromones, or chemicals used to communicate between individuals, may also change. Moths, butterflies, and flies, for instance, use pheromones as integral parts of mate attraction or courtship. The chemical composition of the pheromone mix is generally species-specific. Species may also be isolated by differences in the timing of courtship and mating either within the year or within the day. Seventeen year and 13 year cicadas are an extreme example of such temporal isolation. These cicadas develop underground, emerging as adults to mate at the end of a 17 or 13 year lifespan, depending on the species. Each of three species of 17 year cicadas in the USA has a corresponding very similar 13 year cicada species. Yet, 17 and 13 year cicadas will only overlap as adults once every 221 years. Finally, species may be effectively isolated if they have different courtship or mating locations. The fruit fly, Rhagoletis pomonella, is a classic example. Courtship and mating occur on the fruit tree species where the female will later lay eggs. Initially in the USA, hawthorn fruits were used; a new race of the fly moved onto apple in the nineteenth century, and then onto sour cherry in the mid-twentieth century.
Rhagoletis races on each of these hosts respond to specific odors and color, shape and size cues, effectively isolating them (summarized in Bush 1998). Genetic incompatibility between species arises in several ways. For example, egg and sperm, or pollen and stigma can possess surface proteins that prevent either fusion of the egg and sperm into a zygote, or pollen tube growth allowing fertilization of the plant ovum. A modification of these incompatibilities is seen in patterns of sperm or pollen precedence within a species, which limit gene exchange and could precede speciation (Howard 1999). Alternatively, once a hybrid zygote is formed, it may have low viability or be sterile. Mating between a horse and a donkey, for example, produces mules, which are sterile. Thus genes from the hybrid mating are not transferred beyond one generation. Empirically, the sex for which survival or fertility is more affected by hybridization tends to be the heterogametic sex (or the sex possessing dissimilar sex chromosomes, commonly called ‘x’ and
‘y’ chromosomes), in a phenomenon labeled ‘Haldane’s Rule’ (Haldane 1922). This suggests that genes affecting hybrid sterility might be localized to the sex chromosomes. However, the genetics underlying hybrid sterility have been explored in Drosophila, which does show stronger male than female sterility in hybrid crosses. Genes affecting either male or female sterility are spread throughout the genome, and the number involved in male sterility alone sometimes exceeds 100 genes (summarized by Wu and Hollocher 1998). Genetic barriers may also arise through changes in the number of chromosomes in new species (Ramsey and Schemske 1998). This is particularly common in plants. Gametes (in this case, pollen and ova) generally have a single copy of each chromosome (termed haploid), uniting to give a zygote with two copies of each chromosome (termed diploid). In the case of hybrids between two species, each of the two copies will come from different species, and gamete formation may be impeded because the cells cannot properly divide. Occasionally, however, errors in cell division occur, and diploid gametes are produced. If these form within hybrid individuals, then the resulting tetraploid zygote is completely fertile, as it now contains a complete set of the chromosomes from each of the original parent species. Subsequent gamete formation can proceed normally and the offspring are interfertile, but cannot mate successfully with the parental species. Such polyploidy has been demonstrated for example in the history of both agricultural tobacco and wheat. Other sorts of errors can produce individuals with other chromosome copy numbers; anything greater than diploid is generally called polyploid. No matter what their origin, if polyploid individuals are viable, the gametes they produce cannot usually produce viable offspring by uniting with gametes from their diploid parents.
Cytoplasmic incompatibility occurs when factors in the cytoplasm of the two gametes are not compatible, resulting in the death of the zygote. Wolbachia bacteria in insects, which are passed with the cytoplasm from mother to offspring, are one example. Currently, it is believed that males infected with Wolbachia produce sperm which must be ‘rescued’ by a factor in eggs produced by females that are also infected, before viable zygotes are produced. Whether this mechanism has been involved in speciation in the past, or is ‘simply’ a factor enhancing reproductive isolation at present is not yet established (see Werren 1998 for a review).
2.3 Differentiation: Selection and Chance
Both the development of reproductive isolation and the differentiation of other characters identifying organisms as members of a species could be influenced by selection or by genetic drift. In the former case, adaptation to a particular set of environmental conditions drives speciation. In the latter case, chance
events, such as the genetic diversity present in individuals that become geographically isolated, or that happens to be included when a polyploid individual is formed, influence the results of speciation. The relative influence of these two forces has been the subject of debate; like many polarized debates in biology, most cases probably fall along the continuum between the two ends.
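The 221-year figure quoted for periodical cicadas in Sect. 2.2 follows from simple arithmetic: two broods that emerge together will next coincide after the least common multiple of their cycle lengths. The following is a minimal Python sketch, assuming both broods emerge as adults in the same starting year; the function name is illustrative and does not come from the text.

```python
from math import gcd

def years_between_co_emergences(period_a: int, period_b: int) -> int:
    """Least common multiple of two life-cycle lengths, in years."""
    return period_a * period_b // gcd(period_a, period_b)

# 13 and 17 share no common factor, so the interval is their product.
print(years_between_co_emergences(13, 17))  # 221
```

For comparison, hypothetical 12 and 18 year cycles would coincide every 36 years; cycle lengths with no common factor give the longest possible interval between overlaps.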
3. Biogeography of the Speciation Process
Evolutionary biologists have historically categorized speciation based on the geographic relationship of the newly evolved species. This classification reflects differences of opinion as to the relative role of geography in reducing gene exchange between putative species. Nonetheless, the actual evolutionary processes underlying speciation may be the same in each of these types of speciation.
3.1 Allopatric Speciation
‘Allopatric’ derives from ‘allo’ meaning ‘separate’ and ‘patria’ meaning ‘country.’ Allopatric speciation is thus speciation that occurs through geographic isolation of two populations or groups of populations. In this scenario, a population is split by some geographic barrier, such as a mountain range or river for terrestrial organisms, or a land mass for aquatic organisms. Over time, individuals in populations on each side of the barrier differentiate. This may be due to a founder effect, in which only part of the genetic variability originally present is represented in a new population; to chance loss of genetic variation through accidents of who happens to reproduce or not; or to adaptation to environmental differences on each side of the barrier. This differentiation between the divided populations is extensive enough to include the evolution of differences in reproductive biology, such that when the two populations are re-united by the removal of the barrier, they are no longer able to inter-breed and are viewed as separate species. Support for the allopatric speciation model increased with the discovery of continental drift and plate tectonics, and the subsequent increased understanding of geologic history. Such continental rearrangement at a deep time level, and geographic changes associated with Pleistocene ice ages in more recent times, provide the geographic variation needed for allopatric speciation.
Enhanced understanding of geologic events also provided the opportunity to test expected distributions of species under allopatric speciation through the study of vicariance biogeography, or the geographic distribution of species as a result of subdivision of ancestral species’ ranges. Allopatric speciation has been the dominant geographic model for speciation processes since around 1950, although other models are currently gaining support. This dominance derived from the argument
that development of genetic isolation would most likely occur if incipient species were physically isolated from each other for a time period, allowing the evolution of reproductive isolating mechanisms.
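The founder effect invoked in the allopatric model can be made concrete with a small calculation. As a hedged sketch, assume founders are drawn at random from a large diploid source population in which an allele has frequency p; the chance that the allele is missing from all 2N gene copies carried by N founders is then (1 − p)^(2N). The function name and the example numbers below are illustrative assumptions, not values from the text.

```python
def prob_allele_absent(freq: float, n_founders: int) -> float:
    """Probability that none of the 2N gene copies carried by N diploid
    founders is the allele, assuming random sampling from a large population."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie between 0 and 1")
    if n_founders < 0:
        raise ValueError("n_founders must be non-negative")
    return (1.0 - freq) ** (2 * n_founders)

# How likely is an allele at 5% frequency to be lost in founder groups
# of different sizes? (Illustrative values only.)
for n in (2, 5, 20, 100):
    print(n, round(prob_allele_absent(0.05, n), 4))
```

Under these assumptions, the smaller the founding group, the more likely it is that rare alleles are left behind, which is one way a newly isolated population can start out genetically different from its source.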
3.2 Parapatric Speciation
‘Parapatric’ derives from ‘para’ meaning ‘near’ and ‘patria’ meaning ‘country.’ Parapatric speciation thus occurs when a smaller population is isolated, usually at the periphery of a larger group, and becomes differentiated to the point of becoming a new species. Gene flow may remain possible between the two populations during the speciation process, and hybrid zones may be observed at the interface between the two populations as a result. The best-known example of incipient parapatric speciation occurs in populations of the grass Agrostis tenuis which span mine tailings and normal soils. Individuals that are tolerant to heavy metals, a heritable trait, survive well on contaminated soil, but poorly on non-contaminated soil. The reverse occurs for intolerant populations. Gene flow occurs between sub-populations on and off mine tailings, but hybridization is inhibited by slight differences in flowering time between the two locations (McNeilly and Antonovics 1968, McNeilly and Bradshaw 1968).
3.3 Sympatric Speciation
‘Sympatric’ derives from ‘sym’ meaning ‘together with’ and ‘patria’ meaning ‘country.’ Sympatric speciation thus refers to speciation occurring within a single population. Because of the lack of geographic isolation, sympatric speciation was historically regarded by many biologists as being impossible. However, evidence is accumulating that sympatric speciation does indeed occur. The most obvious mechanism that can result in sympatric speciation is speciation through polyploidy (Ramsey and Schemske 1998). A chance doubling of the chromosome number can lead to reproductive isolation through zygote inviability, without a preceding need for geographic isolation. Given that about half of all flowering plants are polyploid, this form of speciation may have played a large role in their formation. Hybridization between two species, followed by rapid changes in the hybrid to enhance reproductive isolation from the parents, has been held up as a form of sympatric speciation, particularly in plants. While solid examples of this speciation mechanism are few, it does occur (Dowling and Secor 1997, Rieseberg 1997). Genetic introgression through mechanisms other than hybridization, or genetic rearrangements (‘jumping genes’) are a potential source of variability
within a population that could lead to speciation, but are essentially unexplored at present. Small shifts in the ecological niche of individuals within a population are also argued to result in a form of sympatric speciation. The host races of Rhagoletis described above are a case in point. Another example is the radiation of species of cichlid fish within small crater lakes in Cameroon (Schliewen et al. 1994). Genetic analysis indicates that the species in each lake derive from a single ancestor. Although species differ slightly in habitat preference, the lakes are small enough that the different fish species are sympatric. In order for this mechanism to be effective, natural and sexual selection would need to act in the same direction. For example, if individuals diverge in resource use, and individuals with similar resource use prefer each other as mates, resource use races will evolve, which may eventually become full species. Note that this form of speciation has conceptual affinities both to the biological species concept (reproductive isolation) and to the recognition species concept (species cohesion).
4. Space: Patterns of Biodiversity Resulting From Speciation
The different modes of speciation outlined above each result in a different spatial pattern of biodiversity. Hybrid zones between newly formed species are the predicted outcome of allopatric speciation, if the geographically isolated populations have not developed full reproductive isolation prior to removal of the geographic barrier. Further, the establishment of a geographic barrier is expected to divide ancestral populations of many species within the flora and fauna of a region. Removal of that barrier and recontact of newly divergent flora and fauna can result in multiple individual hybrid zones that overlap, or suture zones. Such suture zones mark boundaries between regional floral and faunal assemblages. Remington (1968) outlined six major suture zones and a number of minor suture zones in the Nearctic region, including a compendium of known hybridizing pairs of species within each suture zone and an analysis of the likely geographic barriers whose removal resulted in the now-observable suture zone pattern. Ring species also result from the interaction of speciation with a geographic barrier, but with a twist. As populations of a species expand into new areas, moving around a geographic barrier, they can form a ring around that barrier. While the immediately adjacent populations at each point around the ring interbreed, the terminal populations are reproductively isolated and cannot interbreed. Ring species thus can be thought of as a result of geographically extended parapatric speciation. Irwin et al. (2001) present an outstanding example of a ring species in the case of the greenish warbler (Phylloscopus trochiloides), whose six sub-species circle the Tibetan Plateau. The sub-species that are sympatric at the ends of the ring in central Siberia do not interbreed. The structure of songs involved in mate choice has diverged due to selection pressures favoring increased song complexity as the species expanded from the southwest around on the eastern and western sides of the plateau, eventually contacting again north of the plateau. Interestingly, there is now a gap in the ring in central China, which is probably due to habitat destruction by humans. The recentness of this gap has not allowed time for song divergence across the gap, but one could predict speciation across the gap in the future. Sympatric speciation can generate a different pattern of biodiversity: hot spots of endemic species. Endemic species have very localized distributions, being found in one (usually small) area. Such species are not necessarily rare; the localized populations may be quite large. The cichlid fish species assemblages found in the crater lakes of Cameroon and cited above, along with the much richer cichlid diversity in the rift lakes of east Africa, are classic examples of hot spots of endemic species. In Lake Victoria, for example, over 300 species are found, of which very few are found anywhere else. Overall biogeographic patterns of diversity within a region should therefore reflect the relative dominance of allopatric, parapatric, and sympatric speciation in shaping the flora and fauna. However, these biogeographic patterns are all too easily disturbed by human activities, which include habitat destruction resulting in species extinction and species transport resulting in new mixing of floras and faunas. This disturbance lends urgency to attempts to understand the causes of biogeographic patterns at the beginning of the twenty-first century, before the evidence is muddied by human activity.
5. Time: Speciation Speed
Based on the mechanisms of speciation outlined in Sects. 2 and 3, it is clear that speciation takes variable amounts of time to occur. Speciation via polyploidy, for example, can be nearly instantaneous. Speciation via changes in sexual signals involved in mate choice may be a much more gradual event. From a lineage perspective, there are two models of speciation rates. First, speciation may occur through gradual change within the lineage, such that over time species ‘a’ is transformed into species ‘b.’ Second, lineages may split by either budding off a new lineage while the ancestral lineage remains unchanged, or by branching into two new lineages with the loss of the ancestral lineage. Thus, in the first model, change is slow; in the second, speciation events are rapid and interspersed among periods of no change or stasis. The first model has been dubbed ‘gradualism’ and the
second, ‘punctuated equilibrium’ (Eldredge and Gould 1972). Large debates about the relative importance of gradualism and punctuated equilibrium raged among paleontologists throughout the 1970s and 1980s. The debate had a bearing on the role of natural selection and adaptation in speciation, as contrasted with genetic drift or chance founder effects. Supporters of the gradualist position usually argued that natural selection is of overwhelming importance, while supporters of punctuated equilibrium, initially at least, postulated a strong role for founder effects resulting in sub-sampling of the variation present in an ancestral species in a newly geographically isolated population, enhancing the speed of evolution of new species in those isolated populations. Support for both gradualism and punctuation exists in the fossil record, making resolution of the debate difficult. In addition, the time scale of the fossil record is very coarse, making it difficult to determine exactly how saltatorial a lineage might be.
See also: Adaptation, Fitness, and Evolution; Evolution, History of; Evolutionary Selection, Levels of: Group versus Individual; Evolutionary Theory, Structure of; Extinction of Species; Natural Selection; Primates, Evolution of
Bibliography
Bush G L 1998 The conceptual radicalization of an evolutionary biologist. In: Howard D J, Berlocher S H (eds.) Endless Forms. Species and Speciation. Oxford University Press, New York
Cracraft J 1983 Species concepts and speciation analysis. Current Ornithology 1: 159–87
Dowling T E, Secor C L 1997 The role of hybridization and introgression in the diversification of animals. Annual Review of Ecology and Systematics 28: 593–619
Eldredge N, Gould S J 1972 Punctuated equilibrium: An alternative to phyletic gradualism. In: Schopf T J M (ed.) Models in Paleontology. Freeman, Cooper, San Francisco
Haldane J B S 1922 Sex ratio and unisexual sterility in animals. Journal of Genetics 12: 101–9
Hennig W 1966 Phylogenetic Systematics. University of Illinois Press, Urbana, IL
Howard D J 1999 Conspecific sperm and pollen precedence and speciation. Annual Review of Ecology and Systematics 30: 109–32
Howard D J, Berlocher S H (eds.) 1998 Endless Forms. Species and Speciation. Oxford University Press, New York
Irwin D E, Bensch S, Price T D 2001 Speciation in a ring. Nature 409: 333–7
Mallet J 1995 A species definition for the modern synthesis. Trends in Ecology and Evolution 10: 294–9
Mayr E 1942 Systematics and the Origin of Species. Columbia University Press, New York
McNeilly T, Antonovics J 1968 Evolution in closely adjacent plant populations. IV. Barriers to gene flow. Heredity 23: 205–18
McNeilly T, Bradshaw A D 1968 Evolutionary processes in populations of copper tolerant Agrostis tenuis Sibth. Evolution 22: 108–18
Otte D, Endler J A (eds.) 1989 Speciation and its Consequences. Sinauer Associates, Sunderland, MA
Paterson H E H 1985 The recognition concept of species. In: Vrba E S (ed.) Species and Speciation. Transvaal Museum, Pretoria, South Africa
Ramsey J, Schemske D W 1998 Pathways, mechanisms, and rates of polyploid formation in flowering plants. Annual Review of Ecology and Systematics 29: 467–501
Remington C L 1968 Suture-zones of hybrid interaction between recently joined biotas. In: Dobzhansky T, Hecht M K, Steere W C (eds.) Evolutionary Biology 2: 321–428
Rieseberg L H 1997 Hybrid origins of plant species. Annual Review of Ecology and Systematics 28: 359–89
Schliewen U K, Tautz D, Paabo S 1994 Sympatric speciation suggested by monophyly of crater lake cichlids. Nature 368: 629–32
Simpson G G 1951 The species concept. Evolution 5: 285–98
Sokal R R, Crovello T J 1970 The biological species concept: A critical evaluation. American Naturalist 104: 127–53
Templeton A R 1989 The meaning of species and speciation. A genetic perspective. In: Otte D, Endler J A (eds.) Speciation and its Consequences. Sinauer Associates, Sunderland, MA
Werren J H 1998 Wolbachia and speciation. In: Howard D J, Berlocher S H (eds.) Endless Forms. Species and Speciation. Oxford University Press, New York
Wiley E O 1978 The evolutionary species concept reconsidered. Systematic Zoology 27: 17–26
Wu C-I, Hollocher H 1998 Subtle is nature. The genetics of species differentiation and speciation. In: Howard D J, Berlocher S H (eds.) Endless Forms. Species and Speciation. Oxford University Press, New York
C. L. Boggs
Spectral Analysis
Spectral analysis is one of several statistical techniques necessary for characterizing and analyzing sequenced data. Sequenced data are observations that have been taken in one-, two-, or three-dimensional space and/or time. Examples might be observations of population density along a road, of rainfall over an area, or of daily births at a hospital. One important limitation is that the observations must be equally spaced for the analysis to proceed efficiently. Spectral analysis refers to the decomposition of a sequence into oscillations of different lengths or scales. By this process, the observations in what is called the data domain are converted into the spectral domain. The reasons for doing this are that: (a) some forms of manipulation are easier in the spectral domain; and (b) the revealed scales are necessary statistical descriptors of the data and may suggest important factors that affect or produce such data. The following will provide brief descriptions of: (a) Fourier analysis and its use in manipulating data that are assumed to be periodic; (b) relevant statistics; and (c) one approach to spectral analysis of nonperiodic data, including an example.
1. Fourier or Harmonic Analysis
Here data are represented by y, where the position in the sequence is labeled by j, i.e., $y[j]$, where j varies from 0 to $(n-1)$. Because trigonometric functions require radians for their calculation, the label j is converted to radians with $2\pi j/n$. What Fourier showed in 1822 was that almost any equally spaced sequence $y[j]$ could be fitted exactly by a set of cosine and sine curves. In addition, since no information was lost, these could be returned to the original data. There was one important assumption: that the data were periodic, i.e., that they repeated themselves both forwards from $(n-1)$ and backwards from 0 indefinitely. A sinusoidal curve, $y[j] = A[k]\cos\{(2\pi k j)/n - \Phi[k]\}$, as displayed by curve (a) in Fig. 1, is characterized by its frequency label, k, its amplitude, $A[k]$, and phase, $\Phi[k]$. The frequency, k, is the number of complete oscillations that occur over the total observations, n. For curve (a) this is 2. For n observations k may have integer values of 0 to $+n/2$ (n even). Frequency may also be expressed in observational units as $f = k/(n\Delta t)$, where $\Delta t$ is the interval between observations. In the example, if y is observed daily for a year, the oscillation at $k = 2$ has a frequency of $f = 2/(365 \times 1\ \text{day}) = 1/182.5$ of a cycle per day $= 2$ cycles per year. It also has a wavelength (peak-to-peak distance) of $1/f = 182.5$ days $= 1/2$ year. The highest frequency that may be calculated is $f_{\max} = 1/(2\Delta t)$. The amplitude, $A[2]$, of curve (a) is 0.5. This also provides the variance of the curve, which is half the square of its amplitude, $A^2[2]/2 = 0.125$. The phase $\Phi[k]$, when it is divided by k to give the phase shift, gives the distance of the first peak from the origin. In this case $\Phi[2]/2 = 0.46$. Curve (a) can also be represented by the sum of a simple cosine curve ($y[j] = a[k]\cos(2\pi k j/n)$), displayed as curve (b), and a simple sine curve ($y[j] = b[k]\sin(2\pi k j/n)$), given by curve (c).
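The arithmetic for curve (a) can be checked numerically. The following is a minimal Python sketch, not part of the original article: the function names are illustrative, and the phase value of 0.92 is an assumption chosen only so that the phase divided by k equals the quoted 0.46.

```python
import math

def sinusoid(n, k, amplitude, phase):
    """Return y[j] = A cos(2*pi*k*j/n - phase) for j = 0 .. n-1."""
    return [amplitude * math.cos(2 * math.pi * k * j / n - phase)
            for j in range(n)]

def population_variance(values):
    """Mean squared deviation from the mean (divisor n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

n = 365            # one year of daily observations
delta_t = 1.0      # interval between observations, in days
k = 2              # two complete oscillations over the record
amplitude = 0.5
phase = 0.92       # assumed value, chosen so that phase / k = 0.46

y = sinusoid(n, k, amplitude, phase)

frequency = k / (n * delta_t)            # cycles per day
wavelength = 1.0 / frequency             # days per cycle
highest_frequency = 1.0 / (2 * delta_t)  # highest calculable frequency
curve_variance = population_variance(y)  # should equal A^2 / 2

print(f"frequency  = {frequency:.6f} cycles per day")
print(f"wavelength = {wavelength:.1f} days")
print(f"f_max      = {highest_frequency} cycles per day")
print(f"variance   = {curve_variance:.6f}")
```

For an integer k between 0 and n/2 the sampled variance agrees with half the squared amplitude up to rounding, whatever the phase.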
Figure 1 Examples of (a) sinusoidal, (b) simple cosine, and (c) simple sine curves for periodic data

From trigonometry we can show that $A[k] = (a^2[k] + b^2[k])^{1/2}$ and $\Phi[k] = \arctan(b[k]/a[k])$. The coefficients $a[k]$ and $b[k]$ may be obtained from

$$a[k] = \frac{2}{n}\sum_{j=0}^{n-1} y[j]\cos\left(\frac{2\pi k j}{n}\right), \qquad (1)$$

$$b[k] = \frac{2}{n}\sum_{j=0}^{n-1} y[j]\sin\left(\frac{2\pi k j}{n}\right), \qquad (2)$$
where, for $k = 0$ and for $k = n/2$ when n is even, only Eqn. (1) exists and the factor 2 is replaced by 1. $a[0]$ is the mean of the series. A real set of data will be represented by the sum of the whole range of possible frequencies,

$$y[j] = \sum_{k=0}^{n/2}\left\{a[k]\cos\left(\frac{2\pi k j}{n}\right) + b[k]\sin\left(\frac{2\pi k j}{n}\right)\right\}. \qquad (3)$$
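Eqns. (1) to (3) can be illustrated with a short Python sketch. It is a direct, unoptimized transcription under stated assumptions: the function names and the eight-value test series are invented for illustration, and `atan2` is used in place of a plain arctangent so that the phase lands in the correct quadrant.

```python
import math

def fourier_coefficients(y):
    """Return lists a[k], b[k] for k = 0 .. n//2, following Eqns. (1) and (2)."""
    n = len(y)
    half = n // 2
    a, b = [], []
    for k in range(half + 1):
        # For k = 0, and for k = n/2 when n is even, only the cosine
        # term exists and the factor 2 is replaced by 1.
        single = (k == 0) or (n % 2 == 0 and k == half)
        factor = 1.0 if single else 2.0
        cos_sum = sum(y[j] * math.cos(2 * math.pi * k * j / n) for j in range(n))
        sin_sum = sum(y[j] * math.sin(2 * math.pi * k * j / n) for j in range(n))
        a.append(factor * cos_sum / n)
        b.append(0.0 if single else factor * sin_sum / n)
    return a, b

def synthesize(a, b, n):
    """Rebuild y[j] from the coefficients, following Eqn. (3)."""
    return [
        sum(a[k] * math.cos(2 * math.pi * k * j / n)
            + b[k] * math.sin(2 * math.pi * k * j / n)
            for k in range(len(a)))
        for j in range(n)
    ]

def amplitude_phase(a_k, b_k):
    """Amplitude A[k] and phase Phi[k] of one frequency."""
    return math.hypot(a_k, b_k), math.atan2(b_k, a_k)

y = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # illustrative data only
a, b = fourier_coefficients(y)
rebuilt = synthesize(a, b, len(y))
largest_error = max(abs(u - v) for u, v in zip(y, rebuilt))
print("a[0] (mean of the series):", a[0])
print("largest reconstruction error:", largest_error)
```

The reconstruction error is at the level of floating-point rounding, which is the numerical counterpart of the statement that no information is lost in moving between the two domains.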
Note that there were n values of $y[j]$ and a total of n values of $a[k]$ and $b[k]$. Thus the information present in the data domain, $y[j]$, is equivalent to the information in the spectral domain, $a[k]$ and $b[k]$. It is just arranged differently. A variable arranged according to k is said to be the spectrum of that variable. As an example, Fig. 4 represents the spectrum of the percentage variance. All of these formulas refer to one dimension (along a road or through time). Similar equations may be written for spatial observations. The procedure is nothing more than curve or surface fitting, like fitting a polynomial to observed data.

One problem to be considered is how well the beginning of the data matches the end of the data. If the beginning and ending points have very different magnitudes, the periodic assumption causes a big jump at those points. As a result, many frequencies will be required to account for it, and these amplitudes may overwhelm the amplitudes within the data. Consequently, it is advisable to smooth off the ends of the data so that they meet at approximately the same magnitude. By this technique some information about the beginning and the end will be unavoidably lost.

What the Fourier series fitting does is to estimate any truly periodic elements present. In addition, it permits analytical procedures to be applied to the data even if they are not periodic. Often manipulations that are complex in the data domain are very simple in the spectral domain. For example, obtaining the degree and azimuth of slope is a simple process of differentiation of Eqn. (3). The amplitudes are calculated by the process previously described, each amplitude is multiplied by its respective frequency, and the new amplitudes are converted back to the data domain, where they produce slope and orientation (Rayner 1972). As another example, pattern recognition is a simple multiplication process in the spectral domain (Rayner 1971). Yet another use is in the description or summarization of shape. Manipulation of digitized shape data, such as national or other boundaries, by these Fourier techniques can reduce the number of original observations to less than one third without significant loss of information (Moellering and Rayner 1982).
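The slope calculation described above can be sketched for the one-dimensional case. This Python fragment is an illustration under assumptions, not the procedure of Rayner (1972): the function name is invented, the slope is expressed per observation interval, and the highest-frequency term for even n is simply dropped because its derivative is not determined by the samples.

```python
import math

def spectral_slope(y):
    """Slope of the fitted Fourier series at each observation point.

    Differentiating a cos(w j) + b sin(w j) with respect to j gives
    w * (b cos(w j) - a sin(w j)): each amplitude is multiplied by its
    frequency and the cosine and sine roles are exchanged.
    """
    n = len(y)
    half = n // 2
    slope = [0.0] * n
    for k in range(1, half + 1):
        if n % 2 == 0 and k == half:
            continue  # assumption: omit the k = n/2 term
        w = 2 * math.pi * k / n
        a = 2.0 / n * sum(y[j] * math.cos(w * j) for j in range(n))
        b = 2.0 / n * sum(y[j] * math.sin(w * j) for j in range(n))
        for j in range(n):
            slope[j] += w * (b * math.cos(w * j) - a * math.sin(w * j))
    return slope

# Illustrative periodic profile with a known derivative.
n = 32
w0 = 2 * math.pi / n
profile = [math.sin(3 * w0 * j) + 0.5 * math.cos(5 * w0 * j) for j in range(n)]
exact = [3 * w0 * math.cos(3 * w0 * j) - 2.5 * w0 * math.sin(5 * w0 * j)
         for j in range(n)]
estimated = spectral_slope(profile)
worst = max(abs(u - v) for u, v in zip(exact, estimated))
print("largest difference from the exact derivative:", worst)
```

For a series that really is periodic and contains only a few frequencies, as here, the spectral result matches the analytical derivative to rounding error; for data with mismatched ends the jump discussed above would dominate the slope estimate.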
2. Statistical Description
One use of statistics is to summarize large amounts of data. For example, Fig. 2 displays one year of daily birth data at one hospital. A first step is usually to calculate its probability distribution. Under ideal conditions that distribution will approach the ‘normal probability distribution’ or ‘Gaussian probability distribution’ form. Then the mean ($\mu_1$) and variance ($\sigma^2$ or $\mu_2$) are all that are needed to describe the data. Fig. 3 shows the probability distribution of the birth data, which have a mean of 8.7 and a variance of 9.5. Superimposed is the normal curve. If the distribution is ‘skewed’, moments higher than the second moment (variance) will be required. Multimodal distributions will have to be subject to additional processing. In cases where more than one set of observations are being compared or related, ‘covariance’ and ‘regression’ techniques are applied.

Figure 2 Daily birth data for one hospital during 1998

Figure 3 Probability distribution of birth data

The true or population characteristics of distributions are called ‘parameters.’ Since only samples of data are available, only estimates of these characteristics can be obtained. These estimates are called statistics and are usually denoted by ‘hatted’ symbols ($\hat{\sigma}^2$ or $\hat{\mu}_2$). The particular formula used to calculate a statistic is called an estimator so, for example, different variance estimators are produced by dividing the sums of squares of deviations from the mean either by n or by $(n-1)$, where n is the number of observations in the sample. Two important measures of the quality of an estimator are bias, which is the deviation of the mean magnitude of the estimate from the population magnitude, and the variance of the sample estimates. Unfortunately, estimators that have small bias typically have large variance, and vice versa. Another quality measure is consistency, in which both bias and variance should decrease as n increases. Spectral estimates are typically not consistent.

Statistical estimates are assumed to come from an ensemble of observations. That is, the sample is taken from all possible occurrences of the phenomenon. In fact, typically only one set of observations is available through time or in space. To be useful, therefore, the data from the series must be assumed to be representative of the ensemble; that is, the data are assumed to be ergodic. This also assumes that the particular segment of time series is representative of the whole continuum. Then the statistics from the different segments should be equivalent. In other words, the data are said to be stationary.

Now to return to the description of observations: if the data are independent and one observation is not affected by the previous or next one, the probability of observing a particular magnitude is dictated by the normal distribution and is not related to the sequence of magnitudes observed. Then the mean and variance are sufficient descriptors.
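The difference between the two variance estimators mentioned above can be made concrete by exhaustive enumeration. The sketch below is illustrative only: the two-value population and the sample size of 3 are assumptions chosen so that every possible sample (drawn independently, with replacement) can be listed, which lets the bias be computed exactly without any random simulation.

```python
from itertools import product

def variance_estimates(sample):
    """Return the two estimators: sum of squares / n and / (n - 1)."""
    n = len(sample)
    mean = sum(sample) / n
    squares = sum((x - mean) ** 2 for x in sample)
    return squares / n, squares / (n - 1)

population = [0.0, 2.0]        # hypothetical population, mean 1
population_variance = 1.0      # squared deviations are 1 and 1

sample_size = 3
samples = list(product(population, repeat=sample_size))  # all 8 equally likely samples

mean_by_n = sum(variance_estimates(s)[0] for s in samples) / len(samples)
mean_by_n_minus_1 = sum(variance_estimates(s)[1] for s in samples) / len(samples)

print("average of the divide-by-n estimator:    ", mean_by_n)
print("average of the divide-by-(n-1) estimator:", mean_by_n_minus_1)
print("population variance:                     ", population_variance)
```

Averaged over all samples, dividing by n underestimates the population variance by the factor (n - 1)/n, while dividing by (n - 1) recovers it; this is the bias referred to in the text, and it applies to independent observations.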
When this assumption is not fulfilled (e.g., the population density one kilometer away from a high density within an urban area is also likely to be high and not given by the normal distribution), this influence may be specified by the autocovariance function (i.e., the covariance of each observation with its next neighbor, with its neighbor two observations away, with its neighbor three observations away, etc.),

$$yy[p] = \frac{1}{n-|p|}\sum_{j=0}^{n-1-|p|} y[j]\,y[j+p], \qquad (4)$$

where p is the neighbor distance, which is often called the lag. The $y[j]$ are pre-processed to have a mean of zero. Because the possible number of summed pairs is reduced as the lag increases, there is a practical limit to p, which is typically taken to be of the order of $n/10 = m$. Note that this is a symmetrical function around $p = 0$ having 2m terms, and it is not necessary to calculate for negative p. Eqn. (4) produces a new set of numbers that typically have a maximum at $p = 0$, decline to zero, and then oscillate around zero. How may this function be interpreted? Because, like the original data, these numbers are not independent, that is not easy. However, if this sequence is transformed with an orthogonal function, the new sequence, which is independent, is interpretable. The function selected is the Fourier transform, which is a limiting form of Fourier analysis. This produces a spectrum of the variance: an allocation of the variance into different scales, wavelengths, or frequencies of fluctuations.

In Sect. 1 it was shown that harmonic analysis requires that the data be periodic. The autocovariance function, $yy[p]$, is truncated at $(m-1)$ and $-(m-1)$ and is certainly not periodic. The $yy[p]$ are really all that can be calculated from what may be seen through a blurred window of the original observations, which theoretically extend from $-\infty$ to $+\infty$. How can this window be described, or what can be said about what is not seen? The oldest approach, also known as the classical approach (Priestley 1997), assumes that the ends of the series are smoothed off and the unseen $y[j]$ are zero. Alternative approaches have searched for other estimators which use knowledge of the data to obtain, for example, more realistic models of what is unseen. These use autoregression, moving average, and maximum entropy techniques, individually and in combination. They occupy a large literature and cannot all be adequately summarized here, where the classical technique will be used to illustrate the basic principles. For the other approaches reference should be made to the books by Box and Jenkins (1976) and Kay (1988), which contains computer programs for empirical work.
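The autocovariance function of Eqn. (4) is straightforward to compute for non-negative lags. The following Python sketch assumes lags 0 to m − 1 and removes the mean inside the function; the function name and the five-value series are illustrative only.

```python
def autocovariance(y, m):
    """Autocovariance yy[p] for lags p = 0 .. m-1, following Eqn. (4).

    The mean is removed first; each lag divides by the number of
    available pairs, n - p. Negative lags are not computed because
    the function is symmetrical about p = 0.
    """
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]
    return [
        sum(z[j] * z[j + p] for j in range(n - p)) / (n - p)
        for p in range(m)
    ]

series = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative data only
acov = autocovariance(series, 3)
print(acov)
```

The value at lag 0 is the variance of the series computed with divisor n; as the lag grows, fewer pairs contribute to each sum, which is why the text limits the lag to roughly n/10.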
3. Classical Spectral Analysis
As zeroes are added to the series, the number of frequencies that may be calculated is increased but their magnitudes are decreased. For example, if n is doubled, the label for an equivalent frequency is doubled and its magnitude is halved. What this shows is that the individual amplitudes, which refer to integer k frequencies in harmonic analysis, become average amplitudes over a frequency band centered at k as $n \to \infty$. To distinguish them, the frequency counter is changed from k to f. The practical equations essentially stay the same but the interpretation changes. Also, now a and b are typically defined for two-sided functions with $a[f] = a[-f] = (1/2)\,a[k]$ and $b[f] = -b[-f] = (1/2)\,b[k]$, and Eqns. (1) and (2) become
$$a[f] = \frac{1}{n}\sum_{j=0}^{n-1} y[j]\cos\left(\frac{2\pi k j}{n}\right), \qquad (5)$$

$$b[f] = \frac{1}{n}\sum_{j=0}^{n-1} y[j]\sin\left(\frac{2\pi k j}{n}\right). \qquad (6)$$

The calculation with the new interpretation of a and b is called a Fourier transform. Because the autocovariance function is a symmetrical function, only Eqn. (5) is required for its transformation,

$$YY[f] = \frac{1}{2m}\sum_{p=-(m-1)}^{m-1} yy[p]\cos\left(\frac{2\pi k p}{m}\right). \qquad (7)$$

This still leaves the shape of the window to be taken into account. Various windows have been proposed. Their effect in the spectral domain is to perform a weighted average over adjacent frequencies, i.e.,

$$\mathrm{ave}(YY[f]) = \sum_{f_1=-(m-1)}^{+(m-1)} YY[f_1]\,H[f-f_1], \qquad (8)$$

where, for the ‘hanning’ window, $H[f]$ is equal to 1/4, 1/2, 1/4. The above process is the same as a Fourier transform of the original $y[j]$ series that has had the mean removed, been windowed by the smoothing off of the ends, and had some zeroes added. Then, for the variance, the $(a^2[f] + b^2[f])$ values are summed over equally sized groups of f values.

As an example of such an analysis, the spectrum of the percentage variance of the birth data is given in Fig. 4 with $m = 40$ and $f = q/(2m\Delta t)$, where $0 \le q \le m$. One might expect the number of births to be random in time, in which case the spectrum should be flat (approximately constant variance over all frequencies). This is essentially what Fig. 4 reveals, with one exception: a large peak at $q = 11$, at approximately one week wavelength. One may speculate on the cause. The application of other techniques of spectral analysis, such as maximum likelihood, might tease out other interesting periods. The next step might be a regression analysis relating births to other factors. If the other factors are sequenced, then this calls for cross spectral analysis (Rayner 1971). It allocates the variance to other factors as a function of scale. It also provides an estimate at each frequency of any delay between births and the influencing factor. Of course, statistical techniques cannot assign a cause. They can only suggest interesting phenomena and apparent relationships for investigation by other means. For more complex distributions, where higher statistical moments and products are required, higher order spectra (polyspectra) provide valuable insights (Rayner in press).

Figure 4 Spectrum of percentage variance of birth data
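The classical procedure can be sketched end to end in Python on synthetic data. Everything about the data is an assumption made for illustration: the series is not the hospital record but a made-up daily sequence with a mean of 8.7, a seven-day cycle, and seeded random noise. The cosine transform is parameterized by q using the definition f = q/(2mΔt) quoted in the text, so the cosine argument is 2πfpΔt = πqp/m; the end tapering of the series is omitted for brevity.

```python
import math
import random

def autocovariance(y, m):
    """Autocovariance for lags 0 .. m-1 after removing the mean."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]
    return [sum(z[j] * z[j + p] for j in range(n - p)) / (n - p)
            for p in range(m)]

def cosine_transform(acov):
    """Raw spectral estimates for q = 0 .. m at frequencies q / (2 m dt)."""
    m = len(acov)
    spectrum = []
    for q in range(m + 1):
        total = acov[0]
        for p in range(1, m):
            # The function is symmetrical, so lags +p and -p contribute equally.
            total += 2.0 * acov[p] * math.cos(math.pi * q * p / m)
        spectrum.append(total / (2 * m))
    return spectrum

def hanning(spectrum):
    """Weighted average over adjacent frequencies with weights 1/4, 1/2, 1/4."""
    last = len(spectrum) - 1
    smoothed = []
    for q in range(last + 1):
        # The spectrum is symmetrical about q = 0 and q = m, so the
        # missing neighbors at the ends are taken by reflection.
        left = spectrum[q - 1] if q > 0 else spectrum[1]
        right = spectrum[q + 1] if q < last else spectrum[last - 1]
        smoothed.append(0.25 * left + 0.5 * spectrum[q] + 0.25 * right)
    return smoothed

rng = random.Random(1998)          # fixed seed so the run is repeatable
n, m, delta_t = 365, 40, 1.0
y = [8.7 + 3.0 * math.cos(2 * math.pi * j / 7) + rng.gauss(0.0, 1.0)
     for j in range(n)]

estimate = hanning(cosine_transform(autocovariance(y, m)))
peak_q = max(range(1, m + 1), key=lambda q: estimate[q])
peak_wavelength = 2 * m * delta_t / peak_q
print("peak at q =", peak_q, "wavelength =", round(peak_wavelength, 2), "days")
```

With m = 40 a seven-day cycle corresponds to q = 2m/7, about 11.4, so the peak should fall at one of the neighboring integer values, in line with the position of the weekly peak reported for Fig. 4.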
See also: Sequential Statistical Methods; Spatial Autocorrelation; Spatial Choice Models; Spatial Interaction Models; Spatial Pattern, Analysis of; Spatial Sampling; Spatial Search Models; Spatial Statistical Methods; Spatial-temporal Modeling; Statistical Pattern Recognition; Statistical Systems: Censuses of Population; Statistical Systems: Crime; Time Series: Advanced Methods; Time Series: Cycles; Time-use: Research Methods
Bibliography Box G E P, Jenkins G M 1976 Time Series Analysis, Forecasting and Control, rev. edn. Holden-Day, San Francisco, CA Kay S M 1988 Modern Spectral Estimation: Theory and Application. Prentice-Hall, Englewood Cliffs, NJ Moellering H, Rayner J N 1982 The dual axis Fourier shape analysis of closed cartographic forms. The Cartographic Journal 19: 53–89 Priestley M B 1997 A short history of time series. In: Rao T S, Priestley M B, Lessi O (eds.) Applications of Time Series in Astronomy and Meteorology, 1st edn. Chapman and Hall, London, pp. 3–23 Rayner J N 1971 An Introduction to Spectral Analysis. Pion, London Rayner J N 1972 The application of harmonic and spectral analysis to the study of terrain. In: Chorley R J (ed.) Spatial Analysis in Geomorphology. Methuen, London, pp. 283–302 Rayner J N (in press) The Basis of Dynamic Climatology. Blackwell, Oxford, UK
J. N. Rayner
Speech Errors, Psychology of
The generation of spoken language is a complex task normally accomplished with high accuracy. Human beings are, on average, consummate ‘talkers.’ But speakers do err occasionally, and in ways that reflect the mental processes underlying the act of speaking. That is the topic pursued here: What are the speech error patterns of normal language use, and what do they tell us about language processing? Various departures from the ideal of perfectly articulated, fluent speech occur in ordinary conversation: hesitations, false starts, changes of mind in mid-sentence, and misarticulations. In this context, the term ‘speech error’ has come to be used in a semitechnical fashion to pick out particular errors
from the welter of everyday language. The focus is on fluently uttered forms in which some element or elements of the intended target utterance are missing, mis-selected or misordered. Errors of knowledge (i.e., those arising from ignorance of a word or of ‘proper usage’) and various nonfluencies are distinguished from speech error in this way of proceeding.
1. A Methodological Note
How are error data acquired? Several investigators have collected extensive records of errors heard during everyday language use. A classic such effort was that of Rudolf Merringer (Merringer and Meyer 1895) in the late nineteenth century. Cutler and Fay (1978) describe his methods in their introductory essay to a reissue of Merringer’s work. The practices they describe are characteristic of later diary studies. Typically, a written record of an error is made soon after observation (normally, a few seconds to a few minutes), with annotations regarding pronunciation and setting. Such observation has the strong appeal of face validity for processes of language formulation. However, influences of perceptual filtering and memory distortion must be carefully weighed. It is clear that minor phonetic deviations may be missed, and only changes clearly discernible with respect to the speaker’s target are well represented in such error corpora. See Cutler (1982) for detailed discussion of these and related issues.

Diary studies exist for several languages. It may be said with some assurance that whether the target language is Chinese, Dutch, English, German, Italian, Spanish, Turkish, or other, the same general types of phenomena arise. Specific language features may afford options for error (e.g., Chinese tone errors) not available in English or related languages. Nevertheless, the general behavior of errors involves similar mechanisms across languages. Discussion here focuses on English, with some commentary on representative language differences. Error examples are taken from papers in the bibliography.
2. Types of Error
In the popular literature, ‘speech error’ is synonymous with ‘Spoonerisms’, named after the Dean of New College, Oxford, Rev. W. A. Spooner (1844–1930), who was notorious for his misplacements of speech sounds, sometimes to extraordinary effect. For example, in lecturing a wayward student, he is alleged to have said: ‘You have hissed all my mystery lectures ... and tasted two worms’ (‘missed all my history lectures ... wasted two terms’). But Spoonerisms are special in several respects. They suggest that sound errors usually make words, and that they make a kind of wild and humorous sense. Neither condition is
typical. Further, Spoonerisms primarily exploit sound error, and the relevant phenomena are much more general. No account of Rev. Spooner’s performance is given here; for comment, see, e.g., Robbins (1966). Notwithstanding the exclusion of Rev. Spooner’s legacy, there is no denying the charm of suitably chosen errors as a way to display the hidden structure in language. The errors are intricate, and they illustrate essential features of language processing. Examples (a)–(n) are representative.
(a) ‘… sat down on a sot hoddering iron’ (… hot soldering …)
(b) ‘I won’t hit the bard hall’ (… ball hard)
(c) ‘I have a case of lay grode flu.’ (… low grade …)
(d) ‘… so I can start the stapeback up’ (… the tape …)
(e) ‘The currenth month is …’ (The current …)
(f) ‘… an anguage lacquisition device …’ (… a language acquisition …)
(g) ‘Is your purry kitting’ (… kitty purring)
(h) ‘Make it so the apple has less trees.’ (… so the tree has less apples.)
(i) ‘She sings everything she writes.’ (… writes everything she sings)
(j) ‘Bensinger takes to that left-field wall like water takes to a duck.’ (… like a duck takes to water)
(k) ‘I’ve got whipped cream on my mushroom!’ (… on my mustache!)
(l) ‘You put your nose … your finger on it.’ (… put your finger on it)
(m) ‘My! So that’s the Porche symblem.’ (symbol/emblem)
(n) ‘It doesn’t care.’ (It doesn’t matter/I don’t care)
A rough task analysis sets the stage for error interpretation. It is evident that systems for speaking include mechanisms that select from tens of thousands of available words just those appropriate to an intended message (‘lexical retrieval’), mechanisms that integrate the words into sentence structures (‘syntactic construction’), and mechanisms that orchestrate these decisions into control structures (‘phonological and prosodic integration’) that drive motor outputs. Psychological explanation must reconstruct this complex of mental events.
Descriptively, in Examples (a)–(g), sounds move around in various ways: they are exchanged, anticipated, perseverated, and shifted. Important features of sound error include the following: moved elements are phonetically similar, they correspond in syllabic position, and their immediate phonetic environments are similar (Boomer and Laver 1968, Dell 1984, Nooteboom 1969, Caldognetti et al. 1987, del Viso et al. 1991, McKay 1970, Vousen et al. 2000). The fact that error likelihood is so strongly influenced by structural similarity suggests that errors arise, in part, out of mechanisms that associate independently specified processing elements (sounds, in this example) with a planning framework (‘syllable or word frames’, in this instance). These errors and related phenomena indicate how sounds that make up words enter planning structures for sentences. In the task analysis, this reflects phonological and prosodic integration processes.

Examples (h)–(j) are ‘word movement’ errors. Some are of special interest because they reflect an analysis of words into stem and affix. In Example (h), ‘apples’ is decomposed into ‘apple + s’, or more abstractly [‘apple’ + plural marker]. The technical term in linguistic science for such a decomposition is morphemic analysis. Notice that the two types of morpheme (stem and affix) behave in very different ways. The stem moves, but the affix stays behind. One can also describe this error in terms of the notion of structural frame. By this hypothesis, the affix would be considered part of a phrasal frame that stem morphemes are assigned to as a sentence develops. This idea proves applicable to a wide range of error observations, and is strikingly manifest in languages with richly developed morphological structures. (Related facts are discussed in Sect. 3.4.) In the task analysis, these errors implicate syntactic integration.

Examples (k) and (l) provide another perspective on errors. Descriptively, these are word substitutions. Several points are relevant. First, (k) displays a phonetic similarity between target and intruding element; the phonetic similarities between ‘mustache’ and ‘mushroom’ are representative. Moreover, both words are nouns. Thus, the syntactic structure of the intended utterance is preserved. One may plausibly treat this as a failure, perhaps triggered by phonetic similarity, of mechanisms that retrieve elements from an inventory of word forms. But what about errors like (l)? Grammatical category is preserved, but phonetic similarity is not apparent. And, unlike (k), an obvious meaning relation holds between target and intrusion. Section 3.5 discusses this contrast.
In the task analysis, these errors implicate lexical selection processes, and they encompass both form and meaning relations.

Examples (m) and (n) are blend errors. In morpheme, word, and phrase blends, two processing elements compete for a single output slot. Speakers are often aware of the competition, and can report experiencing it. The competing expressions are, with rare exception, of the same grammatical class and are semantically equivalent for the intended message. This phenomenon is a prima facie demonstration of some parallelism in sentence production. Such errors implicate lexical selection processes and aspects of phrasal and phonological composition.
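The structural-frame description of Example (h) given earlier in this section can be made concrete with a toy sketch. It is purely illustrative and is not a processing model from the literature: the frame representation, the slot labels, and the function name are all hypothetical. The only point it encodes is that affixes belong to the frame while stems are assigned to slots, so that misassigning two stems leaves the plural marker in place.

```python
def realise(frame, stems):
    """Fill each open slot with the next stem; affixes stay with the frame.

    A frame item is either a fixed string or a (slot_label, affix) pair.
    """
    remaining = iter(stems)
    words = []
    for item in frame:
        if isinstance(item, str):
            words.append(item)                    # fixed material in the frame
        else:
            _label, affix = item
            words.append(next(remaining) + affix)  # stem plus stranded affix
    return " ".join(words)

# Hypothetical phrasal frame for '... the tree has less apples'.
FRAME = ["the", ("noun-1", ""), "has less", ("noun-2", "s")]

intended = realise(FRAME, ["tree", "apple"])
exchange = realise(FRAME, ["apple", "tree"])   # the two stems are misassigned

print(intended)   # the tree has less apples
print(exchange)   # the apple has less trees
```

Swapping the stems while keeping the frame fixed reproduces the observed error form, with the plural marker attached to whichever stem lands in the second slot.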
3. Interpretations of Error Phenomena
Speech errors have many variants. The challenge is to explain their similarities and differences in ways that link them to accounts of production mechanisms at the necessary levels of representation (phrases, words, morphemes, syllables, sounds). A robust generalization across the error types is relevant: Like elements interact, i.e., sounds interact with sounds, words with words, and phrases with phrases. These interactions indicate vocabularies of ‘units’, and the error data show they are linked in an architecture of processing stages discussed in Sects. 3.1–3.4.

3.1 Accommodations in Error Processes
Errors move elements of the intended utterance to ‘new’ positions. Because languages display many local dependencies of form, such displacements could violate these. But, typically they do not. Example (f) illustrates: ‘a language acquisition device’ became ‘an anguage lacquisition device.’ The indefinite article changed from ‘a’ to ‘an’, and so accommodated to the vowel onset (‘anguage’) created by the error. In Example (o), the bound morpheme deriving ‘easily’ from ‘easy’ shifts. But, the isolated stem ‘easy’ was correctly pronounced as /izi/ rather than with a reduced final vowel (as would have been appropriate had the affix remained in place).
(o) ‘... easy enoughly ...’ (easily enough)
Many such variations triggered by changes in phonetic context appear in accommodations. By these observations, processes that fix the surface phonetic forms of words are ordered later than the error-generating structures: First, an abstract morphological and phonological representation is constructed, then regular processes flesh out its phonetic detail. This is a familiar notion in linguistics, and its expression in error phenomena is not greatly surprising (see Fromkin 1971). But, bear in mind that it could have turned out otherwise. The speaker of Example (f) might have said ‘a anguage’ but did not; the speaker of Example (o) might have said ‘easuh enoughly’, but did not; and so on. Error processes on this evidence are tightly bound to the structural systems that govern error-free speech.
3.2 Other Evidence for Ordering of Processes: Sound Exchanges and Word Exchanges
These error classes reveal syntactic and phonological planning structures. Sound interactions, and particularly exchanges as in (a)–(c), typically involve words in the same phrase. The relevant structure is prosodic (see Boomer and Laver 1968, Fromkin 1971, Garrett 1980). Prosodic grouping reflects syntax, but is also motivated by intonation and the phonological structure of the words chosen to express a given syntactic form. Treating the domain of sound exchange as prosodic may also explain some cross-language differences. For example, sound exchanges in English are predominantly between words, but in Spanish, within words (del Viso et al. 1992). The greater morphological complexity and length of Spanish words, as compared to English, would require exchanges between words to cross prosodic boundaries more often for the former than the latter.

Word and phrase exchanges sharply differ from sound exchanges. As Examples (i), (j), and (p)–(r) suggest, they arise for elements in distinct phrases that correspond in grammatical category.
(p) ‘... I left my briefcase in the cigar.’ (... my cigar in the briefcase)
(q) ‘What we want to do is train its tongue to move the cat.’ (... the cat to move its tongue)
(r) ‘As you reap, Roger, so shall you sow.’ (As you sow ... so shall you reap)
Nouns exchange with nouns, verbs with verbs, adjectives with adjectives (Bierwisch 1970, Garrett 1975, Nooteboom 1967). On this evidence, operations at the point of error involve the grammatical roles of words (or minimal phrasal units). Sound exchanges differ in both respects. Recall also that sound exchanges implicate many detailed features of phonological structure; word exchanges are minimally so affected (see Dell and Reich 1981, del Viso et al. 1992). These differences in the role of phrasal domain, grammatical category, and phonological description reveal two processing systems. Syntactic constituents are built by a multiphrasal processor that does not represent the sound structure of its elements (word stems, morphemes, simple phrases); failures yield word and phrase exchanges. A subsequent process elaborates sound sequences for words within a phrase group; failures yield sound exchanges. Multiple phrasal units are present at the syntactic level, and a narrower planning window develops sound structures and pronunciation codes in prosodic groups.

3.3 Stranding Errors and Multiple Levels of Processing
The exchanges in Examples (g) and (h) decompose target words into their constituent morphemes and strand affixes. One finds stranding of a variety of productive structural markers, predominantly those of inflectional device (e.g., number, tense, and aspect markers, gerundives, comparatives).
Evidence indicates that stranding arises at two points: within phrase (re prosodic groups for sound exchanges), and between phrase (re structures for word exchanges) (see Garrett 1993). These ideas extend the earlier discussion of accommodation. That discussion described it as a phonologically conditioned response to error movements. But, consider Examples (s) and (t).
(s) ‘… that I would hear one if I knew it.’ (… that I would know one if I heard it)
(t) ‘… drei Wochen im Tag gearbeitet.’ (drei Tage in der Woche gearbeitet)
Here there is accommodation, but, strikingly, it involves irregular forms of inflected words. In Example (s), the appropriate irregular verb forms for ‘know’ and ‘hear’ appear. The German example (Bierwisch 1970) shows several accommodations.
Singular ‘Woche’ takes plural marker ‘en’; plural ‘Tage’ has the singular form ‘Tag’. Further, German nouns bear grammatical gender: ‘Woche’ is feminine, ‘Tag’ is masculine. The articles required thus differ, and may interact with prepositions. Here, number and gender are appropriate, including interaction with the preposition (‘im’ is a contracted form of ‘in dem’). What accounts for this impressive range of accommodations? They are lexically dependent and so not derivable from phonological environments (as described in Sect. 3.2). In fact, lexically dependent accommodation is restricted to between phrase stranding exchanges for elements of corresponding grammatical category. In the account in Sect. 3.2, such errors occur at the level of abstract phrasal construction, prior to the phonological interpretation of content words. Development of the irregular forms follows from affiliation of the switched stems with stranded inflectional features.

3.4 Generalizing the Observations About Processing Frames
Stranding shows that inflectional elements are separately represented when stems are inserted into phrasal frames. This can be expressed by assuming that inflectional elements are parts of structural frameworks. If exchanges arise when lexical elements are misassigned to structural frames, it follows that parts of frames should not exchange. This expectation is borne out in error corpora; inflectional exchanges are rare (Garrett 1975, Stemberger 1984). Another class of observations (morpheme shifts), not discussed here, may also be connected to this picture (see Garrett 1980). There is some basis for treating all functional elements, both bound and free, in this way.
The claim is that functional elements (inflections, determiners, complementizers, etc.; the minor grammatical categories) are derivatives of structure and are thus recruited for incorporation into sentences by different mechanisms than those responsible for the selection of content items (nouns, verbs, prepositions, adjectives, adverbs; the major grammatical categories). This contrast has been discussed in the processing literature under the label of ‘open class’ and ‘closed class’ vocabularies. It might be construed as a distinction between those elements individually selected under conceptual control (‘content elements’) and those that are reflexes of the logic and syntax of the sentence (Levelt 1989, Bock and Levelt 1994). See Myers-Scotton and Jake (2001) for a related account based on ‘code-switching’ (a bilingual language practice that fluently combines elements from two languages into a single utterance). Note, however, that some accounts of speech error link profiles for these vocabulary types to their relative frequency of occurrence in the language (Stemberger and McWhinney 1986, Dell 1991).
On this view, very high frequency rates, rather than structural roles, govern the error behavior of functional elements.
3.5 Lexical Selection Processes: Word Substitutions
Word substitutions provide a data class distinct from movement errors. These reflect failures in lexical retrieval; Examples (u)–(y) illustrate. (u) ‘Are you left-handed? No, I’m amphibian, no, ambivalent... arghh, ambidextrous!’ (v) ‘If you can find a garlic around the house, I’ll take it.’ (gargle) (w) ‘All I want is something for my shoulders.’ (elbows) (x) ‘He rode his bike to school tomorrow.’ (yesterday) (y) ‘Yumm, are we having lobsters tonight?’ (oysters) A dissociation of errors with meaning similarities and those with form-based similarities is immediately apparent. Fay and Cutler (1977) in a seminal paper described these contrasts and proposed two selection processes, rooted respectively in content (semantic representations) and form (morphological and phonological representations). Depending on which process is compromised, errors should display quite different similarity metrics. Fay and Cutler’s analysis, and subsequent studies, revealed multiple systematic form relations between target and error words that lacked a meaning relation, as in Examples (k), (u), and (v), while errors with meaning relations, as in Examples (m), (w), and (x), lack such phonological resemblances. These substitutions also interact with sentence structure: grammatical category correspondence between target and intrusion powerfully constrains word substitutions, whether semantically or form driven. This links to the movement error analysis in Sects. 3.1–3.4. That analysis indicates staged processing in which phonologically uninterpreted elements are first associated with syntactic environments, followed by assignment of phonological representations to structural frameworks. Word substitution patterns fit into those processes.
The distinction parallels the discussion of lemmas in Speech Production, Psychology of. Meaning-guided retrieval selects lexical representations (phonologically uninterpreted: lemmas) for input to the multiphrasal construction process that is the locus of word exchanges. Following this, word form retrieval (mapping from lemmas to word forms) provides inputs to the prosodic groups that are the locus of sound exchange errors. The ordering is clear, but details of the sequencing less so. The differences between meaning-based and form-based substitutions have been a matter of sustained debate. A way to put this is to ask whether semantic selection is accomplished without feedback from form to meaning levels. Dell (1986) and others have argued that the process is to some degree interactive, and includes such feedback, yielding semantically related
errors that also display a form relation, as illustrated in Example (y). Such errors are a minority, but occur with greater than chance likelihood. See Dell et al. (1997) for a recent review; and Levelt et al. (1999) for contrasting discussion.
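The stranding exchanges discussed above (Example (s)) can be illustrated with a toy sketch. It is not a model from the literature: the frame representation, the feature labels `BASE` and `PAST`, and the two-entry irregular lexicon are all assumptions made for illustration. The point it shows is only that, if inflectional features belong to frame slots and stems are inserted separately, a stem exchange followed by ordinary spell-out yields the accommodated irregular forms.

```python
# Toy illustration (hypothetical representation, not a published model):
# stems and inflectional features are stored separately, so exchanging
# stems leaves the features attached to their frame slots.

# Hypothetical mini-lexicon of irregular past forms; other verbs take "-ed".
IRREGULAR_PAST = {"know": "knew", "hear": "heard"}


def spell_out(stem, feature):
    """Map a stem plus the feature of its frame slot onto a surface form."""
    if feature == "PAST":
        return IRREGULAR_PAST.get(stem, stem + "ed")
    return stem  # "BASE": bare stem


def realize(frame):
    """Spell out every slot of a frame given as (stem, feature) pairs."""
    return [spell_out(stem, feature) for stem, feature in frame]


def exchange_stems(frame, i, j):
    """Swap the stems in slots i and j; the features stay with their slots."""
    swapped = list(frame)
    swapped[i] = (frame[j][0], frame[i][1])
    swapped[j] = (frame[i][0], frame[j][1])
    return swapped


# Target: "... I would know one if I heard it"
target = [("know", "BASE"), ("hear", "PAST")]
error = exchange_stems(target, 0, 1)

print(realize(target))  # ['know', 'heard']
print(realize(error))   # ['hear', 'knew']
```

Because spell-out applies after the exchange, the error surfaces as ‘hear ... knew’ rather than as the phonologically stranded ‘hear ... knowed’ or ‘heard ... know’.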
4. What Causes Error? Several causal factors have been identified. Broadly speaking, they are either ‘noise’ or ‘competition’ based. See Butterworth (1982) for a helpful exposition. He distinguishes causes of error in terms of ‘sentence internal’ and ‘sentence external’ processes, with emphasis on ‘competing plans’ as error sources. Fluctuations in efficiency of the neural systems that underlie language processes could clearly cause error. But such accounts require explicit linkage to production mechanisms. This step is well represented in work by Dell (1986) and others using network models to simulate error patterns; see also Levelt et al. (1999). Error caused by competing sentence formulations is also readily identifiable in blends—as in Examples (m) and (n). These give strong evidence for multiple encoding paths within a sentence. It is easy to generalize such conditions to rationalize certain word substitutions and sound errors as well. ‘Sentence external’ processes provide additional sources of competing expressions. Such may be discourse-relevant variations on sentence form or discourse external representations for an independent line of thought. This latter may also include interaction of linguistic processing with attentional mechanisms that filter automatic responses to the environment. A ‘filtering breakdown’ may lead to incorporation of environmental material into speech (see e.g., Harley 1990). Sigmund Freud (1901) is identified with accounts of errors motivated by thoughts related to unconscious wishes and desires. His account links those thoughts to the meaning of the words or phrases resulting from error. That requirement suggests that Freudian mechanisms are not a major factor in the generation of movement errors. A robust feature of virtually all such errors is their indifference to the meaning of error products (Bierwisch 1970). 
An alternative focus for Freud’s claims may lie in editorial processes—i.e., accounts of self-monitoring and its relation to the mechanisms of sentence production (see Motley et al. 1982).
5. The Broader Picture: Building Models of the Language Production Process Speech error analysis provides a rich source of information about the real-time processes that underlie human abilities to generate fluent, coherent, oral discourse. But, rich as it is, there are sharp limitations on its utility for certain questions about language
production. Issues of semantic and discourse structure are awkwardly represented or absent from error data, and multiple questions about the temporal relations of lexical and phrasal integration are undetermined by the patterns of naturally occurring speech error. At precisely this point one turns to experimental studies of production performance. Bock (1987, 1990, 1995) and collaborators have used picture description tasks and related techniques to uncover detailed relations between sentence generation and the underlying conceptual structures that sentences encode. Models of lexical retrieval have been extensively investigated via picture naming and related paradigms (see Speech Production, Psychology of). A family of experiments that focuses on TOT (tip of the tongue) states provides complementary information on lexical staging for syntactic and phonological representations (see Tip-of-the-tongue, Psychology of). Studies using error induction techniques also provide points of contact with spontaneous error data. See Shattuck-Hufnagel (1987) for a comprehensive series of error induction studies relating sound error to lexical and phrasal structure, including stress factors. Work by Baars et al. (1975) led to a series of investigations showing lexical influences on the probability of sound exchanges. Some aspects of the experimentation point to monitoring processes as the source of such effects (see Levelt et al. 1999 for review). An extensive experimental literature on induced agreement errors has developed from original work by Bock and Miller (1991) on English number agreement, and has been applied to agreement issues in several additional languages (see, e.g., Vigliocco et al. 1999). See Bock and Levelt (1994) for a recent synthesis of several strands of production experimentation. See also: Aphasia; Speech Production, Neural Basis of; Speech Production, Psychology of; Tip-of-the-tongue, Psychology of
Bibliography
Baars B, Motley M, McKay D 1975 Output editing for lexical status from artificially elicited slips of the tongue. Journal of Verbal Learning and Verbal Behavior 14: 382–91
Bierwisch M 1970/1982 Linguistics and language error. In: Cutler A (ed.) Slips of the Tongue. Mouton, Amsterdam
Bock J K 1987 Coordinating words and syntax in speech plans. In: Ellis A (ed.) Progress in the Psychology of Language, Vol. 3. Erlbaum, London, pp. 337–90
Bock J K 1990 Structure in language: Creating form in talk. American Psychologist 45: 1221–36
Bock K, Levelt W J M 1994 Language production: Grammatical encoding. In: Gernsbacher M A (ed.) Handbook of Psycholinguistics. Academic Press, London, pp. 945–84
Bock J K, Loebell H, Morey R 1992 From conceptual roles to structural relations: Bridging the syntactic cleft. Psychological Review 99: 150–71
Bock J K, Miller C A 1991 Broken agreement. Cognitive Psychology 23: 45–93
Boomer D, Laver J 1968 Slips of the tongue. British Journal of Disorders of Communication 3: 2–12
Butterworth B 1982 Speech errors: Old data in search of new theories. In: Cutler A (ed.) Slips of the Tongue. Mouton, Amsterdam
Caldognetto E, Tonelli L, Vagges K, Cosi P 1987 The organization of constraints on phonological speech errors. Proceedings of the XIth International Congress of Phonetic Sciences, Tallinn, U.S.S.R. Vol. 3, 50.4.1–50.4.4
Cutler A 1982 The reliability of speech error data. In: Cutler A (ed.) Slips of the Tongue. Mouton, Amsterdam
Cutler A, Fay D 1978 Introduction (ix–xxxv). In: New edn. Versprechen und Verlesen: Eine Psychologisch-Linguistische Studie. John Benjamins, Amsterdam
del Viso S, Igoa J, Garcia-Albea J 1991 On the autonomy of phonological encoding: Evidence from slips of the tongue in Spanish. Journal of Psycholinguistic Research 20: 161–85
Dell G 1984 The representation of serial order in speech: Evidence from the repeated phoneme effect in speech errors. Journal of Experimental Psychology: Learning, Memory & Cognition 10: 222–33
Dell G 1986 A spreading activation theory of retrieval in sentence production. Psychological Review 93: 283–321
Dell G 1991 Effects of frequency and vocabulary type on phonological speech errors. Language and Cognitive Processes 5: 313–49
Dell G, Reich P 1981 Stages in sentence production: An analysis of speech error data. Journal of Verbal Learning and Verbal Behavior 20: 611–29
Dell G S, Schwartz M F, Martin N, Saffran E M, Gagnon D A 1997 Lexical access in aphasic and nonaphasic speakers. Psychological Review 104(4): 801–38
Fay D, Cutler A 1977 Malapropisms and the structure of the mental lexicon. Linguistic Inquiry 8: 505–20
Freud S 1901 The psychopathology of everyday life. Monatsschrift für Psychiatrie und Neurologie 10: 1–13
Fromkin V 1971 The non-anomalous nature of anomalous utterances. Language 47: 27–52
Garrett M F 1975 The analysis of sentence production. In: Bower G (ed.) The Psychology of Learning and Motivation: Advances in Research and Theory. Academic Press, New York, Vol. 9
Garrett M 1980 Levels of processing in sentence production. In: Butterworth B (ed.) Language Production: Vol 1. Speech and Talk. Academic Press, London, pp. 177–220
Garrett M 1993 Errors and their relevance for models of language production. In: Dittman J, Blanken G (eds.) Handbooks of Linguistics and Communication Science: Linguistic Disorders and Pathologies. Walter De Gruyter, New York
Harley T 1990 Environmental contamination of normal speech. Applied Psycholinguistics 11: 45–72
Levelt W J M 1989 Speaking: From Intention to Articulation. MIT Press, Cambridge, MA
Levelt W J M, Roelofs A, Meyer A 1999 A theory of lexical access in speech production. Behavioral and Brain Sciences 22: 1–75
McKay D G 1970 Spoonerisms: The structure of error in the serial order of speech. Neuropsychologia 8: 323–50
McKay D G 1972 The structure of words and syllables: Evidence from errors in speech. Cognitive Psychology 3: 210–27
Meringer R, Meyer K 1895/1978 Versprechen und Verlesen: Eine Psychologisch-Linguistische Studie. Stuttgart: Goschensche Verlag. In: Cutler A, Fay D (eds.) John Benjamins, Amsterdam
Myers-Scotton C, Jake J 2001 Explaining aspects of code switching and their implications. In: Nicol J (ed.) One Mind, Two Languages: Bilingual Language Processing. Blackwell, Oxford, pp. 91–125
Motley M, Camden C, Baars B 1982 Covert formulation and editing of anomalies in speech production. Journal of Verbal Learning and Verbal Behavior 21: 578–94
Shattuck-Hufnagel S 1987 The role of word onset consonants in speech production planning: New evidence from speech error patterns. In: Kellar E, Gopnik M (eds.) Motor and Sensory Processing in Language. Erlbaum Press, Hillsdale, NJ
Stemberger J 1985 An interactive activation model of language production. In: Ellis A (ed.) Progress in the Psychology of Language. Erlbaum, London, Vol. 1
Stemberger J, McWhinney B 1986 Form-oriented inflectional errors in language processing. Cognitive Psychology 18: 329–54
Vousden J I, Brown G D A, Harley T A 2000 Serial control of phonology in speech production: a hierarchical model. Cognitive Psychology 41: 101–75
M. Garrett
Speech Perception It is a commonplace to have the impression that foreign languages are spoken much more rapidly than our own, and without silent periods between the words and sentences. Our own language, however, is perceived at a normal pace (or even too slowly at times) with clear periods of silence between the words and sentences. In fact, languages are spoken at approximately the same rate, and these experienced differences are solely due to the memory structures and psychological processes involved in speech perception. Thus we define speech perception as the process of imposing a meaningful perceptual experience on an otherwise meaningless speech input. The empirical and theoretical investigation of speech perception has blossomed into an active interdisciplinary endeavor, including the fields of psychophysics, neurophysiology, sensory perception, psycholinguistics, linguistics, artificial intelligence, and sociolinguistics.
1. Psychophysics of Speech Perception In any domain of perception, one goal is to determine the stimulus properties responsible for perception and recognition of the objects in that domain (see Psychophysics). The study of speech perception promises to be even more challenging than other domains of perception because there appears to be a discrepancy between the stimulus and the perceiver’s experience of it. For speech, we perceive mostly a discrete auditory
message composed of words, phrases, and sentences. The stimulus input for this experience, however, is a continuous stream of sound (and facial and gestural movements in face-to-face communication) produced by the speech production process. Somehow, this continuous input is transformed into a more or less meaningful sequence of discrete events (see Psychophysics). Although we have made great progress on this front, there is still active controversy over the ecological properties of the speech input that are actually functional in speech perception. One issue, revived by recent findings, is whether the functional properties in the signal are static or dynamic (changing with time). Traditionally, static cues have been shown to be effective in influencing speech perception. These include the location of formants (bands of energy in the acoustic signal related to vocal tract configuration), the distribution of spectral noise as in the onset of saw and shawl, and the mouth shape at the onset of a segment. Dynamic cues such as the transition of energy between a consonant and the following vowel have also been shown to be important. For example, recent research has shown that the second formant (F2) transition, defined as the change between the F2 value at the onset of a consonant–vowel (CV) transition and the F2 value in the middle of the following vowel, is a reliable predictor of the place of articulation category (Sussman et al. 1998). Controversy arises when research is carried out to argue for one type of cue rather than another. For example, investigators recently isolated short segments of the speech signal and reversed the order of the speech within each segment (Saberi and Perrott 1999). In this procedure, a sentence is divided into a sequence of successive segments of a fixed duration such as 50 ms. Each segment is time-reversed and these new segments are recombined in their original order, without smoothing the transition borders between the segments.
In this fashion, the sentence could be described as globally contiguous but locally time-reversed. The authors claimed that the speech was still intelligible when the reversed segments were relatively short (1/20th to 15th of a second). They concluded that speech perception depends primarily on higher-order dynamic properties rather than on the short static cues normally assumed by most current theories. This type of study and logic follows a tradition of attempting to find a single explanation of, or influence on, some psychological phenomenon. However, most successful research in psychology is better framed within the more general framework of ceteris paribus (all other things being equal). There is good evidence that perceivers exploit many different cues in speech perception, and attempting to isolate a single functionally sufficient cue is futile. There is now a large body of evidence indicating that multiple sources of information are available to
support the perception, identification, and interpretation of spoken language. There is an ideal experimental paradigm that allows us to determine which of the many potentially functional cues are actually used by human observers, and how these cues are combined to achieve speech perception (Massaro 1998). The systematic variation of the properties of the speech signal and quantitative tests of models of speech perception allow the investigator to interpret the psychological validity of different cues. This paradigm has already proven to be effective in the study of audible, visible, and bimodal speech perception (Massaro 1998). Thus, this research strategy addresses how different sources of information are evaluated and integrated, and can identify the sources of information that are actually used.
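The local time-reversal manipulation described above (Saberi and Perrott 1999) can be sketched in a few lines. This is a minimal illustration on a plain list of samples, not the authors' implementation; the function name and parameters are hypothetical, and real stimuli would be sampled audio rather than integers.

```python
def locally_time_reverse(samples, sample_rate_hz, segment_ms):
    """Reverse each successive fixed-duration segment, keeping segment order."""
    segment_len = max(1, round(sample_rate_hz * segment_ms / 1000))
    out = []
    for start in range(0, len(samples), segment_len):
        segment = samples[start:start + segment_len]
        out.extend(reversed(segment))  # no smoothing at segment borders
    return out


# Toy "signal": 10 samples at 1000 Hz, reversed in 4 ms (4-sample) segments.
signal = list(range(10))
print(locally_time_reverse(signal, sample_rate_hz=1000, segment_ms=4))
# [3, 2, 1, 0, 7, 6, 5, 4, 9, 8]
```

The output is globally contiguous (segments stay in their original order) but locally time-reversed. Applying the function twice with the same segment duration restores the original signal.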
2. The Demise of Categorical Perception
Too often, behavioral scientists allow their phenomenal impressions to spill over into their theoretical constructs. One experience in speech perception is that of categorical perception. Using synthetic speech, it is possible to make a continuum of different sounds varying in small steps between two alternatives. Listening to a synthetic speech continuum between /ba/ and /pa/ provides an impressive demonstration: students and colleagues usually agree that their percept changes qualitatively from one category to the other in a single step or two with very little fuzziness in-between. Consistent with this impression, there is no denying that we experience discrete categories in speech perception. This experience of discrete perception was developed as a description of the speech perception process and called categorical perception (CP), or the perceived equality of instances within a category. The CP of phonemes has been a central concept in the experimental and theoretical investigation of speech perception and has also spilled over into other domains such as face processing (Beale and Keil 1995, Etcoff and Magee 1992). CP was operationalized in terms of discrimination performance being limited by identification performance. Researchers at Haskins Laboratories (Liberman et al. 1957) used synthetic speech to generate a series of 14 consonant–vowel syllables going from /be/ to /de/ to /ge/ (/e/ as in gate). The onset frequency of the second formant transition of the initial consonant was changed in equal steps to produce the continuum. In the identification task, observers identified random presentations of the sounds as /b/, /d/, or /g/. The discrimination task used the ABX paradigm. Three stimuli were presented in the order ABX; A and B always differed and X was identical to either A or B. Observers were instructed to indicate whether X was equal to A or B.
This judgment was supposedly based on auditory discrimination in that observers were instructed to use whatever auditory differences they could perceive.
The experiment was designed to test the hypothesis that listeners can discriminate the syllables only to the extent that they can recognize them as different phoneme categories. The CP hypothesis was quantified in order to predict discrimination performance from the identification judgments. The authors concluded that discrimination performance was fairly well predicted by identification. This rough correspondence between identification and discrimination has provided the major source of support for CP. There have been several notable limitations in these studies. Discrimination performance is consistently better than that predicted by identification; it is possible that participants are making their discrimination judgments on the basis of identification rather than on their auditory discrimination; and it is likely that any alternative theory would have described the results equally well (Massaro 1987). In many areas of inquiry, a new experimental paradigm enlightens our understanding by helping to resolve theoretical controversies. For CP, rating experiments were used to determine if perceivers indeed have information about the degree of category membership. Rather than ask for categorical decisions, perceivers are asked to rate the stimulus along a continuum between two categories. A detailed quantitative analysis of the results indicated that perceivers have reliable information about the degree of category membership, contrary to the tenets of CP (Massaro and Cohen 1983). Although communication forces us to partition the inputs into discrete categories for understanding, this property in no way implies that speech perception is categorical. To retrieve a toy upon request, a child might have to decide between ball and doll; however, she can certainly have information about the degree to which each toy was requested. Although CP has been discredited, it is often reinvented under new guises.
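The idea of predicting ABX discrimination from identification can be made concrete with a small sketch. The text does not give the formula the original authors used, and their task had three response categories; the version below is a simplified two-category reading, stated here as an assumption: the listener covertly labels A, B, and X, answers from the labels when A and B receive different labels, and guesses otherwise. The function names are hypothetical.

```python
from itertools import product


def abx_predicted_accuracy(p_a, p_b):
    """Predicted ABX accuracy if listeners rely only on covert category labels.

    p_a, p_b: probability that stimulus A (or B) is identified as category 1
    in a two-category identification task.
    """
    def p_label(p, label):
        return p if label == 1 else 1.0 - p

    total = 0.0
    for x_is_a in (True, False):  # X is A or B equally often
        p_x = p_a if x_is_a else p_b
        for la, lb, lx in product((1, 2), repeat=3):
            prob = p_label(p_a, la) * p_label(p_b, lb) * p_label(p_x, lx)
            if la == lb:
                p_correct = 0.5  # labels do not distinguish A from B: guess
            elif x_is_a:
                p_correct = 1.0 if lx == la else 0.0
            else:
                p_correct = 1.0 if lx == lb else 0.0
            total += 0.5 * prob * p_correct
    return total


def abx_closed_form(p_a, p_b):
    """Closed form of the same prediction: 0.5 * (1 + (p_a - p_b) ** 2)."""
    return 0.5 * (1.0 + (p_a - p_b) ** 2)


# A pair straddling a category boundary versus a pair within one category.
print(round(abx_predicted_accuracy(0.9, 0.2), 3))   # 0.745
print(round(abx_predicted_accuracy(0.9, 0.95), 3))  # 0.501
```

Under this reading, predicted discrimination stays near chance (0.5) whenever the two stimuli are identified alike, which is why observed discrimination that is consistently better than predicted counts against the CP hypothesis.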
Most recently, the perceptual-magnet effect (PME) has had a tremendous impact on the field, and has generated a great deal of research (Kuhl 1991). The critical idea is that the discriminability of a speech segment is inversely related to its category goodness. Ideal instances of a category are supposedly very difficult to distinguish from one another relative to poor instances of the category. If we understand that poor instances of one category will often tend to be at the boundary between two categories, then the PME is more or less a reformulation of prototypical CP. That is, discrimination is predicted to be more accurate between categories than within categories. In demonstrating its viability, the PME faces the same barriers that have been difficult to eliminate in CP research. In standard CP research, it is necessary to show how discrimination is directly predicted by identification performance. In the PME framework, it is also necessary to show how discrimination is directly predicted by a measure of category goodness. We can expect category goodness
to be related to identification performance. Good category instances will tend to be identified equivalently, whereas poor instances will likely be identified as instances of different categories. Lotto et al. (1998) found that discriminability was not poorer for vowels with high category goodness, in contrast to the predictions of the PME. The authors also observed that category goodness ratings were highly context sensitive, which is a problem for the PME. If category goodness is functional in discrimination, it should be relatively stable across different contexts. Notwithstanding the three decades of misinterpreting the relationship between identification and discrimination of auditory speech, we must conclude that it is perceived continuously and not categorically. Recent research reveals conclusively that both visible and bimodal speech are perceived continuously (Massaro 1987, 1998). This observation pulls the carpet from under current views of language acquisition that attribute to the infant and child discrete speech categories (Eimas 1985, Gleitman and Wanner 1982). Most importantly, the case for the specialization of speech is weakened considerably because of the central role that the assumption of CP has played (Liberman and Mattingly 1985). Finally, several neural network theories such as single-layer perceptrons, recurrent network models, and interactive activation have been developed to predict CP (Damper 1994): its nonexistence poses great problems for these models. Given the existence of multiple sources of information in speech perception, each perceived continuously, a new type of theory is needed. The theory must describe how each of the many sources of information is evaluated, how the many sources are combined or integrated, and how decisions are made.
The development of a promising theory has evolved from sophisticated experimental designs and quantitative model testing to understand speech perception and pattern recognition more generally. A wide variety of results have been accurately described within the Fuzzy Logical Model of Perception (FLMP).
3. The Fuzzy Logical Model of Perception (FLMP)
The three processes involved in perceptual recognition are illustrated in Fig. 1 and include evaluation, integration, and decision. These processes make use of prototypes stored in long-term memory. The evaluation process transforms these sources of information into psychological values, which are then integrated to give an overall degree of support for each speech alternative. The decision operation maps the outputs of integration into some response alternative. The response can take the form of a discrete decision or a rating of the degree to which the alternative is likely. The assumptions central to the model are: (a) each source of information is evaluated to determine the continuous degree to which that source specifies various alternatives; (b) the sources of information are evaluated independently of one another; (c) the sources are integrated to provide an overall continuous degree of support for each alternative; and (d) perceptual identification and interpretation follows the relative degree of support among the alternatives. In the course of our research, we have found the FLMP to be a universal principle of perceptual cognitive performance that accurately models human pattern recognition. People are influenced by multiple sources of information in a diverse set of situations. In many cases, these sources of information are ambiguous and any particular source alone does not usually specify completely the appropriate interpretation.
Figure 1 Schematic representation of the three processes involved in perceptual recognition. The three processes are shown to proceed left to right in time to illustrate their necessarily successive but overlapping processing. These processes make use of prototypes stored in long-term memory. The sources of information are represented by upper-case letters. Auditory information is represented by $A_i$ and visual information by $V_j$. The evaluation process transforms these sources of information into psychological values (indicated by lower-case letters $a_i$ and $v_j$). These sources are then integrated to give an overall degree of support, $s_k$, for each speech alternative $k$. The decision operation maps the outputs of integration into some response alternative, $R_k$. The response can take the form of a discrete decision or a rating of the degree to which the alternative is likely
4. Multimodal Speech Perception
Speech perception has traditionally been viewed as a unimodal process, but in fact appears to be a prototypical case of multimodal perception. This is best seen in face-to-face communication. Experiments have revealed conclusively that our perception and understanding are influenced by a speaker’s face and accompanying gestures, as well as the actual sound of the speech (Massaro 1998). Consider a simple syllable identification task. Synthetic visible speech and natural audible speech were used to generate the consonant–vowel (CV) syllables /ba/, /va/, /ða/, and /da/. Using an expanded factorial design, the four syllables
were presented auditorily, visually, and bimodally. Each syllable was presented alone in each modality for 4 × 2 = 8 unimodal trials. For the bimodal presentation, each audible syllable was presented with each visible syllable for a total of 4 × 4 or 16 unique trials. Thus, there were 24 types of trials. Twelve of the bimodal syllables had inconsistent auditory and visual information. The 20 participants in the experiment were instructed to watch and listen to the talking head and to indicate the syllable that was spoken. Accuracy is given in Fig. 2 for unimodal and bimodal trials when the two syllables are consistent with one another. Performance was more accurate given two consistent sources of information than given either one presented alone. Consistent auditory information improved visual performance about as much as consistent visual information improved auditory performance. Given inconsistent information from the two sources, performance was poorer than observed in the unimodal conditions. These results show a large influence of both modalities on performance with a larger influence from the auditory than the visual source of information.
Figure 2 Probability correct identification of the unimodal (AU = Auditory, VI = Visual) and bimodal (BI) consistent trials for the four test syllables
Although the results demonstrate that perceivers use both auditory and visible speech in perception, they do not indicate how the two sources are used together. There are many possible ways the two sources might be used. We first consider the predictions of the FLMP. In a two-alternative task with /ba/ and /da/ alternatives, the degree of auditory support for /da/ can be represented by $a_i$, and the support for /ba/ by $(1 - a_i)$. Similarly, the degree of visual support for /da/ can be represented by $v_j$, and the support for /ba/ by $(1 - v_j)$. The probability of a response to the unimodal stimulus is simply equal to the feature value. For bimodal trials, the predicted probability of a response, $P(/\mathrm{da}/)$, is equal to
$$P(/\mathrm{da}/) = \frac{a_i v_j}{a_i v_j + (1 - a_i)(1 - v_j)} \tag{1}$$
In previous work, the FLMP has been contrasted against several alternative models such as a weighted averaging model (WTAV), which is an inefficient algorithm for combining the auditory and visual sources. For bimodal trials, the predicted probability of a response, $P(/\mathrm{da}/)$, is equal to
$$P(/\mathrm{da}/) = \frac{w_1 a_i + w_2 v_j}{w_1 + w_2} = w a_i + (1 - w) v_j \tag{2}$$
where $w = w_1 / (w_1 + w_2)$.
The WTAV predicts that two sources can never be more informative than one. In direct contrasts, the FLMP has consistently and significantly outperformed the WTAV (Massaro 1998). More generally, research has shown that the results are well described by the FLMP, an optimal integration of the two sources of information (Massaro and Stork 1998). A perceiver’s recognition of an auditory-visual syllable reflects the contribution of both sound and sight. For example, if the ambiguous auditory sentence, My bab pop me poo brive, is paired with the visible sentence, My gag kok me koo grive, the perceiver is likely to hear, My dad taught me to drive. Two ambiguous sources of information are combined to create a meaningful interpretation (Massaro and Stork 1998). Recent findings show that speech reading, or the ability to obtain speech information from the face, is not compromised by oblique views, partial obstruction, or visual distance. Humans are fairly good at speech reading even if they are not looking directly at the talker’s lips. Furthermore, accuracy is not dramatically reduced when the facial image is blurred (because of poor vision, for example), when the face is viewed from above, below, or in profile, or when there is a large distance between the talker and the viewer (Massaro 1998).
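The contrast between the two integration rules can be checked numerically. The sketch below assumes the two-alternative reading of Eqs. 1 and 2: the FLMP multiplies the auditory and visual support for /da/ and normalizes by the total support for both alternatives, while the WTAV takes a weighted average of the two support values. The function names and the example support values are illustrative assumptions, not data from the experiment.

```python
def flmp(a, v):
    """Eq. 1: multiplicative integration, normalized over /da/ and /ba/.

    Undefined when the two sources fully contradict each other
    (a = 1 and v = 0, or a = 0 and v = 1), since the denominator is zero.
    """
    return (a * v) / (a * v + (1.0 - a) * (1.0 - v))


def wtav(a, v, w=0.5):
    """Eq. 2: weighted average of auditory and visual support, 0 <= w <= 1."""
    return w * a + (1.0 - w) * v


a, v = 0.7, 0.8  # illustrative support for /da/ from each modality
print(round(flmp(a, v), 3))  # 0.903
print(round(wtav(a, v), 3))  # 0.75
```

With two consistent sources, the FLMP prediction (about 0.90) exceeds either unimodal value, whereas the WTAV prediction always lies between the two inputs. This is the sense in which the WTAV predicts that two sources can never be more informative than one.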
5. Higher-order or Top-down Influences

Consistent with the framework being developed, there is now a substantial body of research illustrating that speech perception is influenced by a variety of contextual sources of information. Bottom-up sources correspond to those sources that have a direct mapping between the sensory input and the representational unit in question. Top-down sources or contextual information come from constraints that are not directly mapped onto the unit in question. As an example, a bottom-up source would be the stimulus presentation of a test word after the presentation of a top-down source, a sentence context. A critical question for both integration and autonomous (modularity) models is how bottom-up and top-down sources of information work together to achieve word recognition. For example, an important question is how early contextual information can be integrated with acoustic-phonetic information. A large body of research shows that several bottom-up sources are
Figure 3 Observed (points) and FLMP’s predictions (lines) of /g/ identifications for FT and S contexts as a function of the speech information of the initial consonant. Results from Pitt’s (1995) Experiment 3a
evaluated in parallel and integrated to achieve recognition (Massaro 1987, 1994). An important question is whether top-down and bottom-up sources are processed in the same manner. A critical characteristic of autonomous models might be described as the language user’s inability to integrate bottom-up and top-down information. An autonomous model must necessarily predict no perceptual integration of top-down with bottom-up information.

Pitt (1995) studied the joint influence of phonological information and lexical context in an experimental paradigm developed by Ganong (1980). A speech continuum is made between two alternatives, and the contextual information supports one alternative or the other. The initial consonant of the CVC syllable was varied in six steps between /g/ and /k/. The following context was either /ɪft/ or /ɪs/. The context /ɪft/ favors or supports initial /g/ because gift is a word whereas kift is not. Similarly, the context /ɪs/ favors or supports initial /k/ because kiss is a word whereas giss is not. Pitt improved on earlier studies by collecting enough observations to allow a subject-by-subject evaluation of the ability of specific models of language processing to account for the results. Previous tests of models using this task have depended primarily on group averages, which may not be representative of the individuals who make up the averages.

The points in Fig. 3 give the observed results for each of the 12 subjects in the task. For most of the subjects, the individual results tend to resemble the average results reported by Pitt and earlier investigators. Ten of the 12 subjects were influenced by lexical context in the appropriate direction. Subject 1 gave an inverse context effect and Subject 7 was not influenced by context. The FLMP was applied to the identification results of the 12 individual subjects in Pitt’s Experiment 3a, for which the greatest number of observations (104) were obtained for each data point for each subject. According to the FLMP, both the bottom-up information from the initial speech segment and the top-down context are evaluated and integrated. If $s_i$ is the degree of support for the voiced alternative given by the initial segment and $c_j$ is the support given by the following context, the total support for the voiced alternative is given by

$$S(\text{voiced} \mid S_i C_j) = s_i \times c_j \tag{3}$$

The support for the voiceless alternative would be

$$S(\text{voiceless} \mid S_i C_j) = (1 - s_i) \times (1 - c_j) \tag{4}$$

The predicted probability of a voiced response is simply

$$P(\text{voiced} \mid S_i C_j) = \frac{S(\text{voiced} \mid S_i C_j)}{S(\text{voiced} \mid S_i C_j) + S(\text{voiceless} \mid S_i C_j)} \tag{5}$$
In producing predictions for the FLMP, it is necessary to estimate parameter values for each level of each experimental factor. The initial consonant was varied along six steps between /g/ and /k/ and the following context was either /ɪft/ or /ɪs/. Thus, there were six levels of bottom-up phonological information $s_i$ and two contexts $c_j$. A free parameter is necessary for each level of bottom-up information, but it is reasonable to assume that the contextual support given by /ɪs/ is one minus the lexical support given by /ɪft/, so that only one value of $c_j$ needs to be estimated. Thus, seven free parameters are used to predict the 12 independent points.

The lines in Fig. 3 also give the predictions of the FLMP. As can be seen in the figure, the model generally provides a good description of the results of this study. The root mean squared deviation (RMSD) between predicted and observed values averages 0.017 across all 12 independent fits. For the 10 subjects showing appropriate context effects, the RMSD ranges from 0.003 to 0.045 with a median of 0.007. Thus, for each of these individuals, the model captures the observed interaction between phonological information and lexical context: the effect of context was greater to the extent that the phonological information was ambiguous. This yields a pattern of curves in the shape of an American football, which is a trademark of the FLMP.

The model tests have established that perceivers integrate top-down and bottom-up information in language processing, as described by the FLMP. This result means that sensory information and context are integrated in the same manner as several sources of bottom-up information. These results pose problems for autonomous models of language processing.
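The parameterization just described (six bottom-up values, one context value, and its complement) can be sketched in Python as follows. All parameter values below are hypothetical and chosen only to illustrate the shape of the predictions; they are not Pitt's (1995) estimates, and the function names `flmp_voiced` and `rmsd` are illustrative assumptions.

```python
import math


def flmp_voiced(s_i: float, c_j: float) -> float:
    """Predicted P(voiced) from segment support s_i and context support c_j."""
    voiced = s_i * c_j                        # support for the voiced alternative
    voiceless = (1.0 - s_i) * (1.0 - c_j)     # support for the voiceless alternative
    return voiced / (voiced + voiceless)      # relative goodness of the voiced alternative


def rmsd(predicted, observed):
    """Root mean squared deviation between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical parameters: six levels of bottom-up support for /g/ ...
s = [0.95, 0.85, 0.60, 0.40, 0.15, 0.05]
# ... and one free context parameter; the other context is its complement.
c_ift = 0.70
c_is = 1.0 - c_ift

pred_ift = [flmp_voiced(s_i, c_ift) for s_i in s]
pred_is = [flmp_voiced(s_i, c_is) for s_i in s]

# Size of the context effect at each step of the continuum.
context_effect = [p_ift - p_is for p_ift, p_is in zip(pred_ift, pred_is)]
```

In this sketch the context effect is largest for the ambiguous middle steps of the continuum and smallest at the endpoints, which is the football-shaped pattern described above. In an actual model test, the seven parameters would be adjusted to minimize `rmsd` between the 12 predicted and 12 observed proportions for each subject.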
See also: Audition and Hearing; Speech Production, Neural Basis of; Speech Production, Psychology of; Speech Recognition and Production by Machines

Bibliography

Beale J M, Keil F C 1995 Categorical effects in the perception of faces. Cognition 57(3): 217–39
Damper R I 1994 Connectionist models of categorical perception of speech. Proceedings of the IEEE International Symposium on Speech, Image Processing and Neural Networks, Vol. 1, pp. 101–4
Eimas P D 1985 The perception of speech in early infancy. Scientific American 252: 46–52
Etcoff N L, McGee J J 1992 Categorical perception of facial expressions. Cognition 44: 227–40
Fitch H L, Halwes T, Erickson D M, Liberman A M 1980 Perceptual equivalence of two acoustic cues for stop-consonant manner. Perception & Psychophysics 27: 343–50
Ganong III W F 1980 Phonetic categorization in auditory word recognition. Journal of Experimental Psychology: Human Perception and Performance 6: 110–25
Gleitman L R, Wanner E 1982 Language acquisition: The state of the state of the art. In: Wanner E, Gleitman L R (eds.) Language Acquisition: The State of the Art. Cambridge University Press, Cambridge, UK
Kuhl P K 1991 Human adults and human infants show a ‘perceptual magnet effect’ for the prototypes of speech categories, monkeys do not. Perception & Psychophysics 50: 93–107
Liberman A M, Harris K S, Hoffman H S, Griffith B C 1957 The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology 54: 358–68, 753–71
Liberman A M, Mattingly I G 1985 The motor theory of speech perception revised. Cognition 21: 1–36
Lotto A J, Kluender K R, Holt L L 1998 Depolarizing the perceptual magnet effect. Journal of the Acoustical Society of America 103: 3648–55
Massaro D W 1987 Speech Perception by Ear and Eye: A Paradigm for Psychological Inquiry. Erlbaum Associates, Hillsdale, NJ
Massaro D W 1994 Psychological aspects of speech perception: Implications for research and theory. In: Gernsbacher M (ed.) Handbook of Psycholinguistics. Academic Press, New York, pp. 219–63
Massaro D W 1998 Perceiving Talking Faces: From Speech Perception to a Behavioral Principle. MIT Press, Cambridge, MA
Massaro D W, Cohen M M 1983 Categorical or continuous speech perception: A new test. Speech Communication 2: 15–35
Massaro D W, Stork D G 1998 Sensory integration and speechreading by humans and machines. American Scientist 86: 236–44
Pitt M A 1995 The locus of the lexical shift in phoneme identification. Journal of Experimental Psychology: Learning, Memory, & Cognition 21: 1037–52
Saberi K, Perrott D R 1999 Cognitive restoration of reversed speech. Nature 398: 760
Sussman H M, Fruchter D, Hilbert J, Sirosh J 1998 Linear correlates in the speech signal: The orderly output constraint. Behavioral & Brain Sciences 21(2): 241–99

D. W. Massaro
Speech Production, Neural Basis of

1. Brain Regions Involved in Speech Production

Over a century ago, the French neurologist Paul Broca demonstrated that speech mechanisms could be localized in the human brain. He did this by interviewing a patient with a severe speech production disorder with output limited to the recurring utterance, ‘tan.’ Upon the patient’s death, Broca examined the brain and concluded that the patient’s inability to speak was due to a lesion in the inferior part of the frontal lobe (Broca 1861b). A second patient also had severely reduced speech and was subsequently found to have a similar cortical lesion (Broca 1861a). Since the time of Broca, scientists have found that lesions to Broca’s area alone are not enough to produce lasting speech deficits (e.g., Alexander et al. 1989; Dronkers et al. 2000; Mohr 1976). Many have attempted to diagram and map additional areas that might also subserve speech and language functions.

This chapter will focus on evidence related to brain mechanisms of speech production. Specifically, evidence will be reviewed from lesion studies, electrocortical stimulation, and functional neuroimaging, all of which are helping to elucidate the neural basis of the planning and execution of articulatory movements. The involvement of these neural structures can best be understood in the framework of a working model that follows the utterance from its conception to its ultimate articulation. Numerous studies have confirmed that the temporal lobe is important for the translation of concepts into linguistic representations. Once the utterance has been linguistically formulated, it must be transferred to the speech mechanisms that will carry out its production. Articulation itself requires planning, initiation, modification, and execution. In addition to Broca’s area, it has become clear that many areas are involved in these aspects of speech production. New research has suggested a role for the anterior insula in articulatory planning, the supplementary motor area in the initiation of sequential speech movements, the basal ganglia and the cerebellum in the modification of pitch, loudness and rate, and the primary motor face cortex and pre-motor cortex in the execution of articulatory movements. Each of these areas will be discussed below in terms of the lesion studies that have helped to identify them and the functional neuroimaging research that has corroborated these findings. This chapter will begin by discussing the structures that are believed to convey linguistic information to frontal lobe speech mechanisms from posterior language cortex and end with a description of those mechanisms that execute the articulatory plan.

Superior longitudinal and arcuate fasciculi.
The arcuate fasciculus is a bundle of fibers that connects the temporal and frontal lobes. As it travels through the parietal lobe, it becomes part of the superior longitudinal fasciculus and incorporates parietal fibers as well. The arcuate fasciculus was identified by Norman Geschwind as the tract that could potentially connect Wernicke’s language area to Broca’s speech area as described in Wernicke’s original model (see Geschwind 1970; Wernicke 1874). The Wernicke–Geschwind model predicts that disruption of this tract results in isolated repetition deficits. However, recently a more comprehensive disorder has been reported. Dronkers and colleagues found that lesions that sever the arcuate/superior longitudinal fasciculus result in a severe production deficit with a complete loss of propositional speech (see Dronkers et al. 2000). Only automatized utterances remain. These patients have lost more than just repetition skills: they are unable to transfer any information from temporal lobe language areas to anterior speech mechanisms. Broca’s two original patients were also noted to have this same production disorder, which Broca
termed ‘aphemia’ (Broca 1861a, 1861b). Dronkers et al. (submitted) recently had the opportunity to scan the postmortem brains of Broca’s two original patients. They found that deep lesions also existed in these brains, in addition to the lesions Broca noted in the inferior frontal lobe. These deep lesions involved the superior longitudinal fasciculus in both cases, suggesting that this lesion also played a role in these historic patients’ production deficits.

Insular cortex. The insula is a region of neocortex hidden beneath the intersection of the frontal, parietal, and temporal lobes. In the past, the insula has been implicated only vaguely in speech production (e.g., Ojemann and Whitaker 1978; Tognola and Vignolo 1980). More recently, Dronkers (1996) linked lesions in a specific region of the insula to an inability to plan and coordinate the appropriate movements necessary for articulation. She found that all patients studied with this disorder, known as ‘apraxia of speech,’ had lesions in the superior tip of the precentral gyrus of the insula while those without speech apraxia did not. Dronkers concluded that this small portion of the insula is critical for coordinating articulatory movements. In keeping with this conclusion, recent neuroimaging studies have reported activation in the insula with tasks such as articulation of single words, word reading, picture naming, and word generation (see Indefrey and Levelt 2000). Interestingly, the insula appears to be activated only when tasks involve articulation of non-repeated and phonologically complex words (e.g., Wise et al. 1999), but not when only automatic or simple articulatory patterns are produced (e.g., Murphy et al. 1997).

Supplementary motor cortex. The supplementary motor area is located in the superior frontal gyrus and extends medially between the hemispheres.
Penfield and others described the participation of the supplementary motor area in speech by documenting the effects of electrocortical stimulation on this region in patients undergoing surgical removal of epileptic tissue (see Penfield and Roberts 1959). Stimulation to the supplementary motor area either caused patients to make involuntary vocalizations or interrupted their ability to speak. Infarctions of the supplementary motor area often result in a transcortical motor aphasia (See Rapcsak and Rubens 1994 for a review). Spontaneous speech is initially dysfluent, but comprehension, repetition, and reading remain relatively intact. Similarly, complete excision of the supplementary motor area causes a transient aphasia with deficits in articulation that resolve quickly (Penfield and Roberts 1959). A review of the lesion literature suggests that the supplementary motor area is involved in the initiation of sequential, voluntary movements, including those for speech (e.g., Ziegler et al. 1997). This also seems to be true for the right supplementary motor area, as demonstrated by several case studies as well as
functional imaging data (e.g., Indefrey and Levelt 2000; Laplane et al. 1977). Other functional neuroimaging tasks that elicit supplementary motor cortex activation include the control of breathing for speech and vocalization (Murphy et al. 1997) and automatic speech (e.g., reciting months of the year; Ackermann et al. 1998).

Basal ganglia. The basal ganglia are subcortical nuclei comprising the caudate, putamen, and globus pallidus. These structures interact with the cortex and with a number of other subcortical structures in a series of feedback loops that help to maintain motor activity (see Love and Webb 1996). Lesions to the basal ganglia can result in several types of movement disorders including Parkinson’s and Huntington’s disease, which each have their own speech characteristics (Duffy 1995). Patients with Parkinson’s disease (which results from a lack of dopaminergic innervation of the caudate/putamen) show hypokinetic speech with decreased intensity and little modulation of pitch or loudness. In contrast, patients with Huntington’s disease (which results from a loss of GABAergic, inhibitory neurons in the caudate) exhibit hyperkinetic speech with erratic control of pitch and loudness. These data suggest a role for the basal ganglia in the modulation of articulation and phonation. Similar disorders can be seen in patients with focal lesions in the basal ganglia. For example, Pickett et al. (1998) reported a case study of a patient with bilateral lesions in the putamen and caudate. Her speech was initially characterized as dysarthric and was very difficult to understand, with poor articulatory and phonatory control. Similarly, Fabbro and colleagues reported at least two cases in which lesions in the basal ganglia resulted in hypophonia and a reduction of speech initiation and output (Fabbro et al. 1996).
In functional neuroimaging studies, caudate activation has been noted in only a small number of speech production studies (see Indefrey and Levelt 2000), but this may be due to the fact that it is not generally treated as a region of interest in imaging studies of speech.

Cerebellum. The cerebellum has traditionally been thought to be involved in fine coordination of motor acts including speech. Evidence for this comes from patients with focal cerebellar lesions. Ataxic dysarthria following cerebellar injury is a motor speech disorder that appears to be due to a discoordination in muscular control of the speech apparatus, as opposed to loss of initiation or execution (Duffy 1995). Mutism has also been associated with midline lesions in the cerebellum, in both children and adults (Coplin et al. 1997). However, this mutism is often transient and can resolve into ataxic dysarthric speech. A broader theory of cerebellar function suggests that the cerebellum acts as a timing mechanism. That is, the cerebellum’s role in motor coordination (for speech and other coordinated motor acts) is based on
a more general mechanism that allows for the precise temporal control of such movements (Ivry and Keele 1989). Consistent with this hypothesis, Gandour and Dardarananda (1984) and others have shown that cerebellar patients have abnormal voice onset time distributions, indicating a timing disorder. In the late 1990s, the cerebellum was implicated in speech production as a store for verbal short-term memory. Silveri et al. (1998) studied a patient following right cerebellar hemispherectomy and found that the patient had poor verbal short-term memory due to an impaired phonological output buffer. Functional neuroimaging data have also suggested that the cerebellum may play a role in verbal short-term memory (Ivry and Fiez 2000). The role of the cerebellum in speech is probably multi-faceted, based on the functional heterogeneity of the cerebellum in motor control. Functional neuroimaging studies may help to tease out the functions of the various subregions of the cerebellum and their roles in speech and language.

Broca’s area. Traditionally, the literature has emphasized Broca’s area as a major structure for speech production. However, the function of Broca’s area has been re-evaluated. Data from stroke patients and from neurosurgical excision have revealed that removal of or injury to Broca’s area results in a transient mutism that resolves in 3–6 weeks (Dronkers et al. 2000; Penfield and Roberts 1959). Also, focal lesions to Broca’s area do not result in a persisting Broca’s aphasia, and Broca’s aphasia may result from lesions outside of Broca’s area (Dronkers et al. 2000). With the increasing acknowledgment of the contribution of other brain areas to speech production (discussed in this chapter), Broca’s area is no longer considered to be as crucial to speech as once thought. Though clearly involved in articulation, Broca’s area functions within a network of brain regions that support speech production.
Indeed, neuroimaging studies have reported networks of activation that include Broca’s area in tasks involving phonological encoding and articulation, such as picture naming, word generation, and reading (see Indefrey and Levelt 2000).

Motor face and pre-motor cortex. The muscles of the vocal mechanism are innervated by cranial nerves V, VII, IX, X, XI, and XII, which receive impulses from the cortex via the corticobulbar tract. The motor face area of primary motor cortex and pre-motor cortex are believed to be the sources of these projections (Duffy 1995). Lesions to these areas result in a pure motor speech disorder with slow, effortful speech and impaired articulation but intact language (Alexander et al. 1989). Such lesions are presumed to disrupt the corticobulbar fibers that activate the cranial nerves necessary for speech. Penfield and colleagues originally showed that a number of regions in primary motor and pre-motor cortex were involved in speech output, as electrocortical stimulation of these sites resulted in speech arrest (see Penfield and Roberts 1959). Similarly, Ojemann and Mateer (1979) showed that, in four cases, speech arrest followed stimulation of inferior motor and premotor regions in the left frontal cortex. Activation of primary motor cortex has been reported in fMRI studies of speech production. Wildgruber et al. (1996) found that tongue movements elicited comparable right and left activation in the inferior region of the motor strip. Monotone recitation of the months of the year activated more left than right motor cortex, while singing activated more right than left motor cortex. Ackermann et al. (1998) also found left motor cortex activation for silent repetition of months of the year.
2. Conclusions

Much has been learned about the neural processes involved in speech based on studies that have examined the specific deficits associated with lesions in particular brain regions. While language mechanisms cluster in posterior neocortical areas, speech production is generally supported by anterior cortical regions. Broca’s area, motor face area, supplementary motor area, and the insula all play important roles in aspects of speech production, as evidenced by the effects of lesions in these areas. The superior longitudinal fasciculus provides the mechanism by which language information is transferred to these anterior speech regions. The basal ganglia and cerebellum also assist in speech production through modulation, coordination, and timing of speech movements. Functional neuroimaging studies have corroborated these findings and have allowed the visualization of the networks involved in speech production on-line.

See also: Language Development, Neural Basis of; Speech Perception; Speech Production, Psychology of; Syntactic Aspects of Language, Neural Basis of
Bibliography

Ackermann H, Wildgruber D, Daum I, Grodd W 1998 Does the cerebellum contribute to cognitive aspects of speech production? A functional magnetic resonance imaging (fMRI) study in humans. Neuroscience Letters 247: 187–90
Alexander M P, Benson D F, Stuss D T 1989 Frontal lobes and language. Brain and Language 37: 656–91
Broca P 1861a Nouvelle observation d’aphémie produite par une lésion de la troisième circonvolution frontale [New observations of aphemia produced by a lesion of the third frontal convolution]. Bulletins de la Société d’Anatomie (Paris), 2e série 6: 398–407
Broca P 1861b Remarques sur le siège de la faculté du langage articulé, suivies d’une observation d’aphémie (perte de la parole) [Remarks on the location of the faculty of articulate language, followed by an observation of aphemia (loss of speech)]. Bulletins de la Société d’Anatomie (Paris), 2e série 6: 330–57
Coplin W M, Kim D K, Kliot M, Bird T D 1997 Mutism in an adult following hypertensive cerebellar hemorrhage: nosological discussion and illustrative case. Brain and Language 59: 473–93
Dronkers N F 1996 A new brain region for coordinating speech articulation. Nature 384: 159–61
Dronkers N F, Plaisant O, Iba-Zizen M T, Cabanis E A (submitted) Magnetic resonance imaging of the brains of Paul Broca’s historic cases
Dronkers N F, Redfern B B, Knight R T 2000 The neural architecture of language disorders. In: Gazzaniga M S (ed.) The New Cognitive Neurosciences. MIT Press, Cambridge, MA
Duffy J R 1995 Motor Speech Disorders: Substrates, Differential Diagnosis and Management. C. V. Mosby-Year Book, St Louis, MO
Fabbro F, Clarici A, Bava A 1996 Effects of left basal ganglia lesions on language production. Perceptual and Motor Skills 82: 1291–8
Gandour J, Dardarananda R 1984 Voice onset time in aphasia: Thai. 2. Production. Brain and Language 23: 177–205
Geschwind N 1970 The organization of language and the brain. Science 170: 940–4
Indefrey P, Levelt W J 2000 The neural correlates of language production. In: Gazzaniga M S (ed.) The New Cognitive Neurosciences. MIT Press, Cambridge, MA
Ivry R B, Fiez J A 2000 Cerebellar contributions to cognition and imagery. In: Gazzaniga M S (ed.) The New Cognitive Neurosciences. MIT Press, Cambridge, MA
Ivry R B, Keele S W 1989 Timing functions of the cerebellum. Journal of Cognitive Neuroscience 1: 136–52
Laplane D, Talairach J, Meininger V, Bancaud J, Orgogozo J M 1977 Clinical consequences of corticectomies involving the supplementary motor area in man. Journal of the Neurological Sciences 34: 301–14
Love R J, Webb W G 1996 Neurology for the Speech-language Pathologist, 3rd edn. Butterworth-Heinemann, Boston, MA
Mohr J P 1976 Broca’s area and Broca’s aphasia. In: Whitaker H A, Whitaker H (eds.) Studies in Neurolinguistics, Vol. 1. Academic Press, New York
Murphy K, Corfield D R, Guz A, Fink G R, Wise R J, Harrison J, Adams L 1997 Cerebral areas associated with motor control of speech in humans. Journal of Applied Physiology 83: 1438–47
Ojemann G, Mateer C 1979 Human language cortex: Localization of memory, syntax, and sequential motor-phoneme identification systems. Science 205: 1401–3
Ojemann G A, Whitaker H A 1978 Language localization and variability. Brain and Language 6: 239–60
Penfield W, Roberts L 1959 Speech and Brain Mechanisms. Princeton University Press, Princeton, NJ
Pickett E R, Kuniholm E, Protopapas A, Friedman J, Lieberman P 1998 Selective speech motor, syntax and cognitive deficits associated with bilateral damage to the putamen and the head of the caudate nucleus: A case study. Neuropsychologia 36: 173–88
Rapcsak S Z, Rubens A B 1994 Localization of lesions in transcortical aphasia. In: Kertesz A (ed.) Localization and Neuroimaging in Neuropsychology. Academic Press, San Diego, CA
Silveri M C, Di Betta A M, Filippini V, Leggio M G, Molinari M 1998 Verbal short-term store-rehearsal system and the cerebellum: Evidence from a patient with a right cerebellar lesion. Brain 121: 2175–7
Tognola G, Vignolo L A 1980 Brain lesions associated with oral apraxia in stroke patients: A clinico-neuroradiological investigation with the CT scan. Neuropsychologia 18: 257–72
Wernicke C 1874 Der Aphasische Symptomencomplex [The aphasia symptom complex]. Cohn and Weigert, Breslau
Wildgruber D, Ackermann H, Klose U, Kardatzki B, Grodd W 1996 Functional lateralization of speech production at primary motor cortex: A fMRI study. NeuroReport 7: 2791–5
Wise R J S, Greene J, Buchel C, Scott S K 1999 Brain regions involved in articulation. Lancet 353: 1057–61
Ziegler W, Kilian B, Deger K 1997 The role of the left mesial frontal cortex in fluent speech: Evidence from a case of left supplementary motor area hemorrhage. Neuropsychologia 35: 1197–208
N. F. Dronkers and J. V. Baldo
Speech Production, Psychology of
Figure 1 Levels of processing in speech production. Left part refers to stored representations, right part refers to processes integrating these representations into wellformed linguistic utterances
Speech production refers to the cognitive processes engaged in going from mind to mouth (Bock 1995), that is, the processes transforming a nonlinguistic conceptual structure representing a communicative intention into a linguistically well-formed utterance. Within cognitive psychology, research concerning speech production has taken various forms such as: research concerning the communicative aspect of speaking; research concerning the phonetics of the produced speech; and research concerning the details of the cognitive processing machinery that translates conceptual structures into well-formed linguistic utterances. We focus on the latter. Native adult speakers produce on average two to three words per second. These words are retrieved from a lexicon of approximately 30,000 (productively used) words. This is no small feat: producing connected speech not only entails retrieving words from memory, but further entails combining this information into well-formed sentences. Considering the complexity of all the encoding processes involved, it is impressive that we produce speech at such a fast rate while at the same time remaining highly accurate in our production. Bock (1991) estimated that slips of the tongue occur in speech approximately every 1,000 words despite the ample opportunities for errors. In the following sections the cognitive processes subserving this ability will be discussed.
1. Methodological Issues

The prototypical empirical approach in cognitive psychology consists of systematically varying some properties of a stimulus (input) and measuring a corresponding behavior. Systematic relations between input and behavior are then used to infer the cognitive processes mediating between them. For example, researchers interested in language comprehension may systematically vary properties of a linguistic input such as the syntactic complexity of sentences or the frequency of words, and measure a corresponding behavior, such as reading times. Differences in the latter are taken as an indication of differences in the processes engaged in building an interpretation of a sentence (the output). Thus, the input is directly observable and completely accessible to systematic manipulation, while the output and the cognitive processes leading to it are inferred from the behavior.

The situation is different for research in speech production. The input (the conceptual structure to be expressed) is not directly observable and not readily accessible for experimental manipulation. By contrast, the output (i.e., the spoken utterance) is directly observable. Given this situation, research on speech production started by focusing on properties of the behavior. Most prominent in this respect are analyses of speech errors, and of hesitations and pauses as they occur in spontaneous speech. These analyses motivated the general theoretical framework for speech production (Fig. 1), which is described in detail below. More recently, researchers have started to make speech production accessible to standard experimental approaches from cognitive psychology (see Bock 1996).

2. Levels and Processes

Figure 1 provides an outline of the levels of processing involved in speech production. These levels are shared by most current psycholinguistic models (e.g., Dell 1986, Garrett 1988, Levelt 1989). Conceptual preparation refers to the processes that transform thoughts into verbalizable messages. Grammatical encoding refers to the processes that turn the messages into syntactically well-formed skeletons for sentences. Phonological encoding refers to the processes that provide flesh to the syntactic skeletons, realizing the phonological content of the sentences. At these latter two processing levels, stored lexical information is retrieved from a long-term memory store, the mental lexicon. In particular, during grammatical encoding it is assumed that abstract lexical representations, specifying the syntactic properties of the words, but not their phonological content, are retrieved. During phonological encoding, word form information is retrieved. Finally, articulation refers to the processes that implement the phonetic content and turn it into motor commands for speech.

These processing levels were originally inferred from properties of speech errors (see Speech Errors, Psychology of). Speech errors can be analyzed with respect to the type of linguistic unit involved (words, morphemes, phonemes), and with respect to constraints that hold for errors concerning a given type of unit. For example, word exchanges (e.g., ‘on the room to my door’ instead of ‘on the door to my room’) are constrained by syntactic factors (like the syntactic category of the exchanged words) whereas phoneme exchanges (e.g., ‘heft lemisphere’ instead of ‘left hemisphere’) are constrained by phonological factors (like their position as onset, nucleus, or coda of a syllable). These and related results suggest a distinction between a processing level that is primarily concerned with syntactic processing (grammatical encoding), and another level primarily concerned with phonological processing (phonological encoding).
The fact that phonemes are relevant units in speech errors also shows that the phonological form of a word is not stored and retrieved as a whole, but built from smaller units, namely the phonemes. Planning of a sentence is not completed on one processing level and then sent to the subsequent level. For example, phonological encoding starts as soon as the syntactic structure of some part of a sentence has been determined, while the grammatical encoding of the remainder of the sentence continues. This piece-by-piece generation of a sentence is referred to as incremental production. Although there is no consensus on the size of the relevant planning units at the different processing levels, it is generally agreed that the size of the planning units becomes smaller as one moves from conceptual preparation through grammatical encoding to phonological encoding and articulation. Incrementality provides a natural account of fluency in speech production. Furthermore, incremental production reduces the burden of storage of intermediate results of the production process at each level. Although there is general agreement on the levels of processing, theories differ with respect to how the flow
of information between levels is characterized. According to modular views, each level of processing first produces a complete output representation for a given planning unit. Only then is this output representation passed on to the next level (e.g., Levelt 1989). According to interactive views, every processing stage can pass partial results to the next processing level, before the output representation at the higher level has been completely specified, and lower levels of processing can provide feedback to higher levels of processing. As a consequence, processes at the phonological level can affect processes at the grammatical level (e.g., Dell 1986). In the following sections, we discuss the different processing steps in more detail.
2.1 Conceptual Preparation
A first important goal of conceptual preparation is to establish which parts of the conceptually available information are going to be encoded, and in what order (see Levelt 1989, for an overview). A second goal is to convert the conceptual information into a format that is suitable for the linguistic formulation processes. One important open issue concerning conceptual preparation is how to characterize the mapping between conceptual and grammatical encoding. First is the question of whether conceptual preparation takes language-specific properties into account (see Language and Thought: The Modern Whorfian Hypothesis). Languages differ in which conceptual or formal properties need to be realized as a detail of the sentential form. For example, in English the word ‘friend’ does not carry information concerning the sex of the friend. In Spanish the corresponding word is differentially inflected for a man (‘amigo’) or a woman (‘amiga’). In English, adjectives used as predicates (e.g., ‘tall’ in ‘The friend of Luis is tall’) do not agree in gender with the noun; in Spanish they do (e.g., ‘El amigo de Luis es alto’ or ‘La amiga de Luis es alta’).
Thus, in these two languages, conceptual information concerning natural gender must (Spanish) or need not (English) be conveyed by a sentence. Second is the question of whether conceptual information permeates the processes occurring during grammatical encoding beyond providing its input (Vigliocco and Franck 1999).
2.2 Grammatical Encoding
Grammatical encoding refers to the processes involved in developing a syntactically well-formed sentence. It comprises first, those processes that map the relationships among the participants in a conceptual representation (e.g., agent, patient, etc.) onto functional syntactic relations between the words of a sentence (e.g., subject, direct object, etc.). Next, on the basis of the resulting hierarchically organized syntactic frame for the sentence, the words in the sentence are linearized in a manner allowed by the language being
spoken. The first step is also referred to as functional level processing, the second as positional level processing (Garrett 1988). Distinguishing between building hierarchical and linear frames provides a solution to an important problem that the language production system faces. Speech production has to be incremental to allow for fluent utterances, but at the same time the resulting utterance has to obey language-specific constraints that force the use of only certain word orders. Assuming incremental conceptualization, however, the order in which parts of a conceptual message are processed does not necessarily correspond to a word order allowed in the speaker’s language. This problem can be solved by separating the construction of hierarchical structures from the serial ordering of the words. In this way, hierarchical structures can be built in an incremental manner as soon as lexical elements are available; these can then be mapped to permissible linearly ordered positions (Vigliocco and Nicol 1998). Evidence compatible with such a separation comes from studies showing that the linear position of words can be primed by the previous presentation of the same linear order, even if the hierarchical structure differs (Hartsuiker et al. 1999). An important open question with respect to grammatical encoding concerns whether the encoding at this level can be influenced by feedback from phonological encoding (Bock 1986). Research on the process of computing number agreement between a sentence subject and a verb suggests that this grammatical process is sensitive to whether number information in the subject noun phrase is overtly realized in the phonological form of the noun, suggesting the possibility of feedback from phonological encoding to grammatical encoding (e.g., Vigliocco et al. 1995; but see Sect. 2.5 for an alternative interpretation of the results).
2.3 Phonological Encoding
Phonological encoding refers to the processes that are responsible for determining the phonological word forms and prosodic content of the sentence (e.g., Levelt 1999). First, the phonemes of words are retrieved from the mental lexicon, together with a metrical frame which specifies stress pattern and number of syllables in the word. Following this retrieval process, the resulting sequence of phonemes is syllabified according to a language-specific set of syllabification rules. The domain of syllabification is assumed to be the phonological word, which can but need not coincide with a lexical word. It should be noted that in this view, the syllabic structure of an utterance is computed on-line. The reason for this assumption is that the actual syllabification of a word in running speech depends on the context in which it appears. For example, the word ‘deceive’ in isolation is syllabified as ‘de-ceive,’ but in the context of the utterance ‘deceive us,’ the syllable structure becomes
‘de-cei-veus.’ This observation also provides a functional reason why the phonological form of a word is not stored and retrieved as one entity: such a representation would have to be broken up into its constituent parts whenever the stored syllabification of a word does not agree with its syllabification in the context of running speech. The representation formed by phonological encoding, a syllabified phonological code, forms the input for the articulatory processes which realize this code as overt speech. It is an open issue whether the transition from phonological encoding to articulation involves accessing a ‘syllabary,’ i.e., memory representations specifying the motor tasks that have to be performed to generate each syllable (Levelt 1999).
2.4 Lexical Access
Grammatical and phonological encoding make use of stored representations of words. Parallel to the distinction between grammatical encoding and phonological encoding, two levels of stored representation are distinguished. On one level, words are represented as abstract syntactic (and semantic) entities (technically also referred to as lemmas), and on the other level their phonological form is specified. Evidence for two levels of representation in the mental lexicon comes from tip-of-the-tongue (TOT) states (see Tip-of-the-tongue, Psychology of). It has been demonstrated repeatedly that speakers in a TOT state can report grammatical properties of a word (e.g., the grammatical gender of a noun) above chance level even when they are unable to access the phonological form of the word. It is unresolved whether, within the lexicon, the flow of information from meaning to form can be characterized as discrete, cascading, or fully interactive. Discrete serial models of lexical access (e.g., Levelt 1999) assume that, first, on the basis of some fragment of a conceptual structure, a set of lemmas becomes activated. This set includes the target lemma as well as semantically related lemmas.
Phonological encoding of the word is initiated only after the target lemma has been selected. Hence, the phonological forms of words which are semantically related to the target word should not become activated. Cascading models (e.g., Peterson and Savoy 1998) assume that every activated lemma sends some activation to its corresponding word form, even before the actual target lemma has been selected. Thus, the phonological forms of semantic competitors of the target word should also receive some activation. Finally, interactive models (e.g., Dell 1986) allow for feedback from the phonological level to the lemma level. Thus not only will the phonological forms of semantic competitors receive activation, but the phonological forms will also send activation back to all lemmas with similar phonological forms. The question of whether the phonological forms of semantic competitors do receive activation, as predicted by cascading models and feedback models, or not, as predicted by discrete serial models, is still an empirically unresolved issue. Phonological activation of semantic competitors has been demonstrated for the case of near-synonyms like ‘couch’ and ‘sofa’ (e.g., Jescheniak and Schriefers 1998, Peterson and Savoy 1998) and has been interpreted in favor of cascading of activation. However, as the evidence is presently restricted to this specific semantic relation, it has been suggested that near-synonyms might be a special case (e.g., Levelt 1999). Concerning the question whether there is feedback from lower levels to higher levels in lexical processing, the lexical bias effect has been cited as evidence for feedback. The lexical bias effect concerns the observation that phoneme errors lead to existing words rather than nonwords more often than would be expected by chance. This finding can be explained by feedback from the level of phonological segments to some higher level representing words as one integral unit (e.g., Dell 1986). According to discrete serial models, by contrast, the chance of incorrect selection of a phoneme should be independent of whether the resulting string of phonemes forms an existing word or not (but see Sect. 2.5).
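The different predictions of discrete, cascading, and interactive accounts can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the activation values, the single spreading rate, the use of a shared word onset as a stand-in for phonological similarity, and the function name `spread` are hypothetical and are not taken from any of the models cited above.

```python
# Toy illustration only: values and update rule are assumptions,
# not parameters of any published model.
LEMMA_ACTIVATION = {"couch": 1.0, "sofa": 0.6, "cow": 0.0}  # target, near-synonym, unrelated
ONSET = {"couch": "k", "sofa": "s", "cow": "k"}             # crude proxy for form similarity


def spread(mode, rate=0.5):
    """Return (lemma activations, word-form activations) after one spreading step."""
    if mode not in {"discrete", "cascading", "interactive"}:
        raise ValueError(f"unknown mode: {mode}")
    lemmas = dict(LEMMA_ACTIVATION)
    selected = max(lemmas, key=lemmas.get)  # the target lemma wins selection

    if mode == "discrete":
        # Only the selected lemma passes activation on to its word form.
        forms = {w: (rate * a if w == selected else 0.0) for w, a in lemmas.items()}
    else:
        # Every active lemma passes some activation on to its word form.
        forms = {w: rate * a for w, a in lemmas.items()}

    if mode == "interactive":
        # Active word forms feed activation back to lemmas with a similar form.
        for word, form_activation in forms.items():
            for other in lemmas:
                if other != word and ONSET[other] == ONSET[word]:
                    lemmas[other] += rate * form_activation
    return lemmas, forms


for mode in ("discrete", "cascading", "interactive"):
    lemmas, forms = spread(mode)
    print(mode, "form(sofa) =", forms["sofa"], "lemma(cow) =", lemmas["cow"])
```

In this sketch the word form of the semantic competitor ‘sofa’ stays inactive only in the discrete mode, and the form-related lemma ‘cow’ gains activation only in the interactive mode, which mirrors the qualitative contrast drawn in the text.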
2.5 Self-monitoring and Repair
Speakers not only produce speech, but also monitor their own speech output, and this monitoring may occur on overt speech as well as on ‘internal speech.’ Assuming an internal monitor implies that not all malfunctions of the speech production system will become visible in the eventual output. Rather, some will be discovered and repaired before they are actually articulated. Such covert repairs can play an important role in the interpretation of speech error results. For example, the lexical bias effect in phoneme errors has also been explained as the result of a pre-articulatory covert repair mechanism. This covert repair mechanism might be more likely to detect and correct phoneme errors leading to nonwords than those leading to words before the resulting error is actually produced. Hence, the lexical bias effect would not imply feedback of activation from lower to higher levels. It should be noted that related arguments have been put forward with respect to other empirical findings which appear to speak against discrete serial models, like the potential influence of phonology on grammatical encoding processes. It is a matter for future investigation to design empirical tests that allow us to differentiate between feedback and monitoring explanations of the relevant empirical phenomena.
See also: Aphasia; Dyslexia (Acquired) and Agraphia; Dyslexia, Developmental; Dyslexia, Genetics of; Speech Errors, Psychology of; Speech Production, Neural Basis of
Bibliography
Bock J K 1986 Meaning, sound and syntax: lexical priming in sentence production. Journal of Experimental Psychology: Learning, Memory and Cognition 12: 575–86
Bock J K 1991 A sketchbook of production problems. Journal of Psycholinguistic Research 20: 141–60
Bock J K 1995 Sentence production. From mind to mouth. In: Miller J L, Eimas P D (eds.) Handbook of Perception and Cognition. Speech, Language, and Communication, 2nd edn. Academic Press, San Diego, CA, Vol. 11, pp. 181–216
Bock J K 1996 Language production: Methods and methodologies. Psychonomic Bulletin and Review 3: 395–421
Dell G S 1986 A spreading-activation model of retrieval in sentence production. Psychological Review 91: 283–321
Garrett M F 1988 Processes in language production. In: Newmeyer F J (ed.) Linguistics: The Cambridge Survey. Biological and Psychological Aspects of Language. Cambridge University Press, Cambridge, UK, Vol. 3, pp. 69–96
Hartsuiker R J, Kolk H H J, Huiskamp P 1999 Priming word order in sentence production. Quarterly Journal of Experimental Psychology 52A: 129–47
Jescheniak J D, Schriefers H 1998 Serial discrete versus cascaded processing in lexical access in speech production: Further evidence from the co-activation of near-synonyms. Journal of Experimental Psychology: Learning, Memory, and Cognition 24: 1256–74
Levelt W J M 1989 Speaking. From intention to articulation. MIT Press, Cambridge, MA
Levelt W J M 1999 Models of word production. Trends in Cognitive Sciences 3: 223–32
Peterson R R, Savoy P 1998 Lexical selection and phonological encoding during language production: Evidence for cascaded processing. Journal of Experimental Psychology: Learning, Memory, and Cognition 24: 539–57
Vigliocco G, Butterworth B, Semenza C 1995 Constructing subject–verb agreement in speech: The role of semantic and morphological factors. Journal of Memory and Language 34: 186–215
Vigliocco G, Franck J 1999 When sex and syntax go hand in hand: Gender agreement in language production. Journal of Memory and Language 40: 455–78
Vigliocco G, Nicol J 1998 Separating hierarchical relations and word order in language production: is proximity concord syntactic or linear? Cognition 68: B13–B29
H. Schriefers and G. Vigliocco
Speech Recognition and Production by Machines
The fields of speech recognition and speech production (text-to-speech or speech synthesis) have made great progress since the early 1990s. The technologies are now at the point of becoming commercially viable, and a number of products are currently available. Recent progress in both technologies has been characterized by rapid incremental improvement rather than any theoretical breakthroughs. Both technologies are based on stochastic models and have benefited greatly from increased computing and data resources. This article will describe the techniques used for current state-of-the-art speech recognition and synthesis systems. It will give only a brief overview of the field. For further reading, see Jurafsky and Martin (2000) and Jelinek (1997).
1. Speech Recognition
This section will focus on large vocabulary continuous speech recognition (LVCSR), which is the most general form of the problem. There are other paradigms that are used to simplify the problem for some specific tasks. In isolated word recognition, as opposed to continuous recognition, the user is required to pause between words when speaking. This simplifies the problem by making it easier to find the word boundaries, so higher accuracies can be obtained. If the system only needs to detect a small set of words, and a small footprint is required, a technique called word spotting can be used. Word spotting attempts to find a small set of words in the input rather than trying to decode all of the speech. Both of these techniques can be viewed as special cases of LVCSR. The general process of speech recognition is shown in Fig. 1. A signal processing component converts the analog speech signal into a sequence of discrete feature vectors. A decoding search process then matches internal acoustic models against the sequence of feature vectors to produce a sequence of words.
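To make the conversion from a sampled signal to a sequence of feature vectors more concrete, the following minimal sketch works out the framing arithmetic. It assumes the typical values quoted in Sect. 1.1 (16 kHz sampling, a 25 ms analysis window, one feature vector every 10 ms); the function name `frame_layout` is hypothetical, and the sketch only locates the analysis windows rather than computing any features.

```python
def frame_layout(num_samples, sample_rate_hz=16_000, window_ms=25, hop_ms=10):
    """Return (window length, hop length, window start indices), all in samples."""
    window = sample_rate_hz * window_ms // 1000  # samples per analysis window
    hop = sample_rate_hz * hop_ms // 1000        # samples between successive windows
    if num_samples < window:
        return window, hop, []                   # signal too short for one full window
    starts = list(range(0, num_samples - window + 1, hop))
    return window, hop, starts


# One second of speech sampled at 16 kHz.
window, hop, starts = frame_layout(16_000)
print(window, hop, len(starts))
```

Under these assumptions each window covers 400 samples, successive windows start 160 samples apart, and one second of speech yields 98 complete windows, that is, roughly one feature vector per 10 ms.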
1.1 Signal Processing
While the exact signal processing configuration varies from system to system, the following description represents typical values. First, the analog signal is low-pass filtered with an 8 kHz cutoff and sampled at 16 kHz. A fast Fourier transform (FFT) is calculated using a 25 ms window, and the resulting amplitude spectrum is subjected to log compression. This log amplitude spectrum is then passed through another FFT resulting in a representation referred to as the cepstrum. Linear predictive coding (LPC) is used to calculate the coefficients for the cepstrum. In this scheme, each speech sample is represented as a linear combination of the N previous samples plus an additive term. The elements of the feature vector are the first N coefficients (for the linear combination) produced by the LPC calculation. A typical value for N is 12, and normally one feature vector is produced every 10 ms.
1.2 The Matching Process
The most common form of speech recognition uses a maximum likelihood search. Given acoustic data $A$ (as represented by the sequence of feature vectors), a maximum likelihood search attempts to find the word string $W$ such that $P(W \mid A)$, the probability of the word sequence given the acoustic data, is maximized. This search evaluates hypotheses by applying Bayes’ rule:
$$P(W \mid A) = \frac{P(A \mid W) \times P(W)}{P(A)}$$
where $P(A \mid W)$ is the probability of acoustic data $A$ given word string $W$, $P(W)$ is the a priori probability of word string $W$, and $P(A)$ is the a priori probability of acoustic input $A$. $P(A \mid W)$ is estimated using acoustic models and $P(W)$ is estimated using language models. Typically, $P(A)$ is not estimated, since it is difficult to estimate and it does not change the rank ordering of hypotheses: it is the same for all competing hypotheses.
Figure 1 The speech recognition process
Figure 2 Hidden Markov models of speech units
1.3 Acoustic Models
Acoustic models are a stochastic mechanism for estimating $P(A \mid W)$, the probability of the observed acoustic feature sequence given a specified word sequence. The most common form of acoustic model
used in speech recognition is the hidden Markov model (HMM). A hidden Markov model consists of a set of states S, a set of transition probabilities from each state to other states, and a set of observation probabilities for each state. This model is illustrated in Fig. 2. In order to apply the model to speech decoding, HMMs are associated with units of speech, such as phones, to model their acoustic realization. A word level model is generated by concatenating the HMMs for the phones comprising the word. In practice, more specific models are used which model phones in their left and right context. These models are referred to as triphones. To create the models, the parameters of the HMMs, the transition and observation probabilities, must be estimated from training data. Recorded speech is transcribed, and the HMMs are aligned against the speech vectors using a procedure called the forward–backward algorithm. There is no deterministic procedure for estimating the parameter values; rather, an iterative procedure, Baum–Welch reestimation, is used to refine the parameters until a stopping criterion is achieved. While estimation of the transition probabilities is straightforward, estimation of the observation probabilities is less so since they are represented as a multidimensional space. For each state, the observation probability is the probability of observing the input feature vector given the state. There are two principal techniques for estimating the observation probabilities, Gaussian mixtures and neural networks. In the Gaussian mixtures technique, the observation probability is estimated as a weighted sum of Gaussian probability density functions (PDFs). Every state has
an associated set of multidimensional PDFs, each represented by a mean and covariance matrix. The feature vector is evaluated with respect to each of the distributions and the observation probability is a weighted sum of the values. An alternate way of estimating the observation probabilities is to use neural networks to directly estimate the probability of the feature vectors. A recurrent neural network is associated with each Markov model state and gives the probability of the observation, given the state. These networks take a long time to train, but decode quickly.
1.4 Language Models
A language model is a stochastic mechanism for estimating $P(W)$, the a priori probability of a specified word sequence. $P(W)$ is normally decomposed to be applied in a left-to-right fashion. Using ! to represent the start of sentence, $P(W)$ is represented as:
$$P(W) = P(w_1 \mid !) \times P(w_2 \mid !\,w_1) \times P(w_3 \mid !\,w_1 w_2) \times \cdots \times P(w_n \mid !\,w_1 w_2 \ldots w_{n-1})$$
In this form, the probability of extending a sequence by the next word is conditioned on all the words in the sequence up to that point. In practice, it is not feasible to represent all possible histories, $!\,w_1 w_2 \ldots w_{n-1}$. The prevailing solution is to group all histories whose last $N-1$ words are the same, and thus to condition the probability of a word on the last $N-1$ words. These models are referred to as N-gram models. For $N = 3$, referred to as a trigram, $P(w_n \mid !\,w_1 w_2 \ldots w_{n-1})$ is estimated as $P(w_n \mid w_{n-2}\,w_{n-1})$. N-gram models are also trained from training data and estimated as:
$$P(w_n \mid w_{n-2}, w_{n-1}) = \frac{C(w_{n-2}, w_{n-1}, w_n)}{C(w_{n-2}, w_{n-1})}$$
where $C(w_{n-2}, w_{n-1}, w_n)$ is the count of the number of times that the sequence $w_{n-2}, w_{n-1}, w_n$ was observed in the training data.
Figure 3 The speech decoding search
1.5 The Decoding Search
The general structure of the acoustic decoding process is shown in Fig. 3. It uses a language model, a dictionary, and an inventory of acoustic models to decode speech input. The recognizer processes input from left to right. It generates a language model probability for each word in the lexicon, given the start of utterance, and matches these words against the start of the input. The system pursues hypotheses in parallel and extends them forward by assigning language model probabilities that each word will extend each hypothesis and matching that word against the corresponding portion of the speech input. In order to match a word against the digitized speech input, the system looks the word up in its pronunciation dictionary to find the sequence of phones describing the
word pronunciation. The phones in the dictionary are context-independent, that is, the representation of a phone does not change depending on its context, the phones to its left and right. Since the acoustic realization of a phone depends very much on its phonetic context, context-dependent models are actually used. The context-independent phones in the pronunciation are mapped to context-dependent phones; that is, notation is added to reflect the context of the phone to the left and to the right of the phone used. These context-dependent phone units are referred to as triphones. The system has a pretrained inventory of HMM acoustic models for the triphones. The triphone models are concatenated together to form a word model, which is matched against the speech input. A dynamic programming search, the Viterbi algorithm, is used to align the speech feature vectors with the HMM triphone models. This search produces the sequence of Markov states that has the highest probability of generating the observed feature vectors. In reality the search does not generate real probabilities for hypotheses, but scores intended to approximate probabilities. The score for a hypothesis is a weighted combination of the acoustic match score $P(A \mid W)$ and the language model score $P(W)$. The match process is a large stochastic search to match models for word sequences against the speech input. In principle, all possible sequences of words in the
vocabulary are matched against the speech input. An exhaustive search clearly is not computationally viable, so pruning must be applied. The language model scores are very valuable in helping prune the search to look for the more likely word sequences. Most often, beam pruning is used. Hypotheses with scores more than a threshold value worse than the current best score are pruned.
1.6 Performance
The performance of a speech recognition system depends on many factors:
The time–accuracy tradeoff: since the recognition process is a pruned search, it has a time vs. accuracy tradeoff. Less pruning allows the system to explore more hypotheses and thus achieve higher accuracy, but also requires more run time. Research systems may run in many times real time, but systems interacting with users must run in very near real time. Increases in machine power can therefore translate into increased recognition accuracy by allowing less pruning while still running in real time.
Amount of training data: since the acoustic and language model parameters are estimated from training data, the amount and appropriateness of the training data critically affect the quality of the model.
Degree to which usage conditions are represented in training data: the weakest feature of current recognition technology is its lack of robustness. The systems generally perform well as long as the usage conditions are represented well by the training data. As the conditions under which the systems are used move away from the data on which they were trained, the performance degrades rapidly. If a system were trained only on male voices, it would not work well for most females. If it were trained on native English speakers, it would not work well for someone with a German accent (or almost any other accent). If trained with a close-talking microphone, it would not work well over a telephone.
So-called speaker-independent systems just use a training database containing as many different speakers as possible to get a reasonably broad coverage of speaker variability. The commercial dictation systems that are currently available generally require a short enrollment process where the user is required to speak some prompted sentences into the system. This is to allow the system to adapt its general models to the specific user and ensure at least a small amount of appropriate training data. Currently, commercial dictation systems use about 30,000 words and report average word error rates of approximately 5 percent.
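The hypothesis scoring and beam pruning described in Sect. 1.5, and the time–accuracy tradeoff they create, can be sketched as follows. This is a minimal illustration under stated assumptions: scores are combined in the log domain, the language model weight is a free parameter, and the hypotheses and their scores are invented for the example; none of these details come from a particular recognizer.

```python
def combined_score(acoustic_logprob, lm_logprob, lm_weight=1.0):
    """Weighted combination of acoustic and language model scores (log domain, assumed)."""
    return acoustic_logprob + lm_weight * lm_logprob


def beam_prune(scored_hypotheses, beam_width):
    """Keep only hypotheses whose score is within beam_width of the current best."""
    best = max(scored_hypotheses.values())
    return {
        words: score
        for words, score in scored_hypotheses.items()
        if best - score <= beam_width
    }


# Hypothetical partial hypotheses with made-up (acoustic, language model) log scores.
raw_scores = {
    ("recognize", "speech"): (-10.0, -2.0),
    ("wreck", "a", "nice", "beach"): (-9.5, -6.0),
    ("wreck", "an", "ice", "peach"): (-14.0, -9.0),
}
scored = {words: combined_score(a, lm) for words, (a, lm) in raw_scores.items()}

wide = beam_prune(scored, beam_width=5.0)    # less pruning, more hypotheses kept
narrow = beam_prune(scored, beam_width=2.0)  # more pruning, faster but riskier
print(len(wide), len(narrow))
```

Widening the beam keeps more hypotheses alive at each step, which costs run time but lowers the risk of discarding the hypothesis that would eventually score best.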
2. Speech Production
Currently, the various approaches to speech synthesis can be grouped into three basic categories: articulator models, source-filter models, and concatenative synthesis. An excellent brief discussion of the details of these approaches can be found in Pellom (1998). This section will present only a short overview.
2.1 Articulator Model Synthesis
Articulator model-based systems attempt to explicitly model the physical aspects of the human speech production system. They produce detailed models of the articulators and the airflow in the vocal tract. These models are generally based on detailed measurements of human speech production. Models of this type are not widely used because it is difficult to get the sensor data and because the models are computationally intensive.
2.2 Source-filter Based Synthesis
Source-filter-based systems use an abstract model of the speech production system (Fant 1960). Speech production is modeled as an excitation source that is passed through a linear digital filter. The excitation source represents either voiced or unvoiced speech, and the filter models the effect produced by the vocal tract on the signal. Within the general class of source-filter models, there are two basic synthesis techniques: formant synthesis and linear predictive synthesis. Formant synthesis models the vocal tract as a digital filter with resonators and antiresonators (Klatt 1980). These systems use a low-pass filtered periodic pulse train as a source for voiced signals and a random noise generator as an unvoiced source. A mixture of the two sources can also be used for speech units that have both voiced and unvoiced properties. Rules are created to specify the time-varying values for the control parameters for the filter and excitation. This is the technique used by DECTalk (Bruckert et al. 1983), probably the most used commercial synthesis system. Linear predictive synthesis uses linear predictive coding to represent the output signal. Each speech sample is represented as a linear combination of the N previous samples plus an additive excitation term.
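The linear predictive recursion can be sketched in a few lines. The sketch assumes predictor coefficients are already available (a real system would estimate them from recorded speech) and uses an unfiltered pulse train as voiced excitation; the function names `lpc_synthesize` and `pulse_train` and all numeric values are hypothetical.

```python
def pulse_train(length, period):
    """Periodic unit pulses, a crude stand-in for voiced excitation."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def lpc_synthesize(coeffs, excitation):
    """Each output sample is a weighted sum of the previous len(coeffs) outputs plus excitation."""
    output = []
    for n, e in enumerate(excitation):
        predicted = sum(
            a * output[n - k]
            for k, a in enumerate(coeffs, start=1)
            if n - k >= 0  # samples before the start of the signal are treated as zero
        )
        output.append(predicted + e)
    return output


# First-order example: a single pulse decays geometrically through the filter.
print(lpc_synthesize([0.5], [1.0, 0.0, 0.0, 0.0]))

# Voiced-like excitation: one pulse every 80 samples (200 Hz at 16 kHz).
voiced = lpc_synthesize([0.5], pulse_train(160, period=80))
```

Replacing the pulse train with random noise in the same recursion gives the unvoiced case.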
As in formant synthesis, the excitation term uses a pulse train for a voiced signal and noise for an unvoiced one. While the source-filter-based systems are capable of producing quite intelligible speech, they have a distinct mechanical sound, and would not be mistaken for a human. This quality arises from the simplifying assumptions made by the model and therefore is not easily remedied.
2.3 Concatenative Synthesis
In recent years, much research has been focused on techniques for concatenative synthesis (Taylor et al. 1998). This technique uses an inventory of prestored units that are derived from real recorded speech. A database of speech is recorded from a single speaker
and is segmented to create an inventory of speech units. These units are concatenated together to create the synthesized output. The speech units differ between systems. One popular unit is the diphone, which consists of parts of two phones, including the transition between them (Peterson et al. 1958). The beginning and end of the diphone are positioned to be in stationary portions of the two phones. This reduces the problem of smoothing the junctions of concatenated units. There are also a manageable number of units, at most 1600 (40 × 40). The actual number is somewhat smaller, since not all possible pairs of phones occur in English. Context-dependent phones, clustered by acoustic similarity, are also used. A search algorithm is used to find the optimal sequence of units (Hunt and Black 1996). An extension of the small unit concatenative synthesis technique is to use variable-sized units for concatenation (Yi and Glass 1998). In this method, task-specific sentences, phrases, and words are recorded from a single speaker. A search process is used to concatenate the largest units available in generating the output. In some cases, an entire prerecorded sentence or prompt can be used. Much of the time phrase-level and word-level units will be used. If a novel word is encountered, it is synthesized by concatenating phone-sized units. This technique sounds very natural if word-sized or larger units are being concatenated. The problem with this approach is that good coverage with large units is only possible in limited domains, and a new database must be recorded for each new domain. Work is in progress to unify the small and variable unit approaches to give a more general system.
See also: Speech Perception; Speech Production, Psychology of
Bibliography
Bruckert E, Minow M, Tetschner W 1983 Three-tiered software and VLSI aid developmental system to read text aloud. Electronics 56: 133–8
Fant C G M 1960 Acoustic Theory of Speech Production. Mouton, The Hague, The Netherlands
Hunt A, Black A 1996 Unit selection in a concatenative speech synthesis system using a large speech database. Proc. ICASSP96 1: 373–6
Jelinek F 1997 Statistical Methods For Speech Recognition. MIT Press, Cambridge, MA
Jurafsky D, Martin J H 2000 Speech and Language Processing. Prentice-Hall, Englewood Cliffs, NJ
Klatt D H 1980 Software for a cascade/parallel formant synthesizer. Journal of the Acoustical Society of America 3: 971–95
Pellom B 1998 Enhancement, segmentation, and synthesis of speech with application to robust speaker recognition. PhD thesis, Duke University, Durham, NC
Peterson G E, Wang W, Sivertsen E 1958 Segmentation techniques in speech synthesis. Journal of the Acoustical Society of America 8: 739–42
Taylor P, Black A, Caley R 1998 The architecture of the Festival speech synthesis system. 3rd ESCA Workshop on Speech Synthesis. pp. 147–51
Yi J R W, Glass J R 1998 Natural-sounding speech synthesis using variable-length units. Proc. ICSLP-98 4: 1167–70
W. Ward
Speed Reading
Modern society requires individuals to process an increasing amount of written information on a daily basis. The rising stacks of to-be-read books, articles, and documents often induce people to turn toward the commercially available courses that claim to radically increase reading speed. Most of these speed-reading courses claim that reading speed can be increased to 1,000 words per minute (wpm) with improved comprehension. This rate is over three times normal reading speeds (200–300 wpm), and some speed-readers have claimed to read up to 100,000 wpm. Typical speed-reading training involves instructing readers to focus on fewer sections of the printed material, to refrain from rereading words and from lip movement or inner speech, to attempt to perceive meanings at a glance, and simply to concentrate on reading faster. While there is an abundance of anecdotal claims from individuals who report that they are able to read at amazing rates, the scientific evidence is not as convincing. The empirical studies that have supported speed-reading claims have been fraught with methodological problems. Those studies that are better designed have generally found that speed-reading considerably inhibits comprehension and is comparable to scanning or skimming. This article reviews the history of speed-reading, the validity of speed-reading claims, evidence that refutes speed-reading claims, and some potentially positive contributions of speed-reading methods.
1. Brief History of Speed-reading
Increased fascination with anecdotes of speed-reading emerged in the popular press of the late nineteenth century. This interest led to the first known speed-reading course, which was offered at Syracuse University in 1925. After a brief hiatus during World War II, machines became popular as a means of increasing reading speed by modifying readers’ eye movements.
Normal reading involves rapid, jerky eye movements, called saccades. Words are only perceived during fixations between saccades. Ordinarily, readers make forward saccades; however, backward saccades called regressions are also made to fixate previous words. The number of letters or words that are processed during a single fixation is called the perceptual span (see Eye Movements in Reading). Poor readers tend to make more regressions and more frequent fixations (see Reading Skills). Hence, speed-reading advocates assumed that reducing regressions and fixations (with increased perceptual spans) would improve reading rate and, consequently, reading ability.
The focus after World War II on using mechanical devices to change readers’ eye movements emerged from the success of using tachistoscopes (machines that flash images at controlled rates on a screen) to train pilots to identify aircraft more quickly and accurately. Similarly, tachistoscopes were used to train readers to reduce the number of fixations and widen the perceptual span. The results from these methods were initially promising. However, it eventually became evident that the effects of training were temporary and did not generalize to normal reading situations.
In the late 1950s, Evelyn Wood popularized speed-reading with her hand-pacing method, which involved training the eyes to follow a zigzag motion down each page. Combining this with other reading and learning techniques, she introduced Reading Dynamics in 1959, which ultimately led to Evelyn Wood Reading Dynamics Institutes worldwide. The variety of speed-reading courses increased in the 1960s, along with an abundance of anecdotal claims of their success.
2. The Validity of Speed-reading Claims
Are there individuals who can read at 1,000 wpm, or faster, and still comprehend at least 75 percent of the information? As speed-reading grew in popularity in the 1960s, many researchers began to challenge the astounding claims made by its advocates. Although many individuals have claimed to possess this ability, the empirical evidence was scant. Two principal areas of debate emerged in the literature. The first area of debate concerns the possibility of modifying readers’ eye movements and perceptual span. The second issue concerns the methodology employed by speed-reading studies. The empirical research in support of speed-reading has tended to be fraught with methodological weaknesses, generally with respect to the measurement of comprehension during speed-reading tests.
2.1 Eye Movements
Many speed-reading courses attempt to modify readers’ eye movements by increasing the perceptual span and reducing the number of fixations and regressions. However, researchers have argued that eye anatomy constrains the perceptual span to approximately three words from the fixation point; and more specifically, 3–4 letter spaces to the left and 14 letter spaces to the right of the center of an advanced reader’s fixation point (Rayner 1986). The physiological boundaries of the perceptual span imply that reading rates exceeding 600–800 wpm are physiologically impossible if the reader is attempting to read all of the words (Taylor 1965). For example, a rate of 10,000 wpm would allow about 2.5 seconds per page of text (roughly 420 words at that rate) with 7–8 fixations. Thus, each fixation would have to include a span of about 55 words, which is impossible.
In response, speed-reading advocates have claimed that it is unnecessary to read every word of a passage, and that readers can learn to fixate only important words or sentences. In support of that claim, research has indicated that speed-readers make inferences between parts of the text by using prior knowledge (Carver 1983). That is, knowledge allows readers to develop an understanding of the text, despite not having read all of the words. This process is more successful when readers have more knowledge about the text, but virtually impossible without relevant knowledge. Indeed, most speed-reading courses acknowledge that speed-reading is only effective for familiar material. Nevertheless, there is no guarantee that knowledge-based inferences will be correct, and evidence does not support the assumption that speed-readers learn to read only the most important words or sentences of a passage (Just and Carpenter 1987).
Many speed-reading courses also encourage readers to eliminate the tendency to make regressive eye movements. However, research has shown that regressions are virtually unavoidable with unfamiliar or complex material, and are adaptive when the segments that are reread are relevant to the reader’s goals. Moreover, the relationship between eye movements and reading efficiency remains controversial, and there is no available evidence that eye movement training improves reading ability. Although speed-reading training may alter eye movement patterns, these patterns revert to normal when comprehension is emphasized.
2.2 Comprehension
The second area of debate has centered on the comprehension of rapidly read material.
There is little doubt that readers can turn pages quickly and scan printed material. The question is whether speed-readers can do so and maintain comprehension of the information. Speed-reading advocates have claimed that comprehension is not only maintained but improved. Others have claimed that speed-reading results in large comprehension decrements. The answer to this debate has been clouded by methodological problems (Carver 1990). Much of the evidence in support of speed-reading has been anecdotal. In these cases, neither documentation nor comparisons to normal readers are generally provided. In addition, many empirical studies of speed-reading are flawed because comprehension was not assessed, or comprehension and rate were measured on separate passages. Without corresponding evidence of comprehension and reading rate, a speed-reader is apt to ‘read’ rapidly when rate is measured (with low comprehension of the material) and read more slowly when comprehension is tested.
A second weakness of speed-reading research has stemmed from the use of comprehension questions that can be answered with reasonable accuracy without reading the passage. Generally, passages used to measure reading rate cover relatively familiar information. Consequently, passage-independent questions that can be answered on the basis of reasoning or prior knowledge inflate estimates of reading comprehension. For example, Carver (1972) demonstrated that previous estimates of speed-readers’ comprehension (see Wood 1963) dropped from 68 percent to 17 percent for fictional material after correcting for passage-independent questions. Hence, tests showing that speed-readers maintain comprehension may not account for questions that can be answered without reading the material.
Similarly, the common use of multiple-choice questions to assess comprehension allows readers to guess the correct answers among the alternatives. For example, if two readers answer 20 of 30 questions correctly, but one reader provides answers for only 20 (missing 0, but skipping 10) and the other provides answers for all 30 questions (missing 10), the latter reader is showing greater evidence of guessing. Without correcting for guessing (see Carver 1990), the readers seem statistically equivalent in terms of their comprehension levels. This issue is particularly important because speed-readers often answer more questions, and guess more, than normal readers.
An additional weakness of speed-reading research concerns practice effects on the tests. When the same test or similar tests are used to assess comprehension before and after speed-reading training, the speed-reader may improve from having completed the test previously and not from training.
Likewise, similar tests during and after training can lead to improvement on the post-training tests. These testing problems are most important for studies lacking a control group. In summary, typical shortcomings of speed-reading research include: the lack of a comprehension test for the material used to measure reading rate, the use of passage-independent comprehension questions, not correcting comprehension performance for guessing, and the lack of a control group to measure practice effects on the tests.
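The guessing example in this section can be made concrete with the conventional correction-for-guessing formula, in which a fraction of the wrong answers is subtracted from the number right. The sketch below is illustrative only. The article does not say which correction Carver (1990) applies or how many alternatives each question offered, so the choice of formula, the function name `corrected_score`, and the four-alternative default are all assumptions.

```python
def corrected_score(right: int, wrong: int, alternatives: int = 4) -> float:
    """Conventional correction for guessing: right - wrong / (alternatives - 1).

    Skipped questions are neither rewarded nor penalized.
    The default of 4 alternatives is an assumption, not a figure from the article.
    """
    if alternatives < 2:
        raise ValueError("a multiple-choice item needs at least two alternatives")
    return right - wrong / (alternatives - 1)


# Reader A: 20 right, 0 wrong, 10 skipped.
# Reader B: 20 right, 10 wrong, 0 skipped.
reader_a = corrected_score(right=20, wrong=0)
reader_b = corrected_score(right=20, wrong=10)
print(reader_a, round(reader_b, 2))
```

Under these assumptions the reader who skipped ten questions keeps a score of 20, while the reader who answered all thirty drops to about 16.7, so the two are no longer treated as equivalent.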
3. Evidence that Refutes Speed-reading Claims
The bulk of empirical research has led to the conclusion that reading rapidly and maintaining comprehension is not possible. For example, Homa (1983) assessed two graduates of the American Speed-reading Academy who had reported reading rates above 100,000 wpm. He administered two visual identification tasks (letters and words) and a comprehension
task. The speed-readers performed comparably to six undergraduates on the visual identification tasks but showed a greater tendency to guess the answers than the undergraduates. The speed-readers averaged 15,000 and 30,000 wpm on the comprehension task, but clearly understood little of what they had read, even after three attempts with the same material and the same tests. Homa concluded, ‘the only noteworthy skill exhibited by the two speed-readers was a remarkable dexterity in page-turning.’
Carver (1985) examined the reading rate and comprehension of four groups of superior readers: speed-readers, professionals, high-skilled college students, and people who scored exceptionally high on tests such as the SAT. Carver found that speed-readers were similar in ability to other superior readers, except that they typically chose to skim and accept the lowered comprehension at rates higher than 1,000 wpm. The evidence strongly suggested that reading rates exceeding 600 wpm, with adequate comprehension, were not possible.
Just and Carpenter (1987) compared the reading abilities of 11 graduates of a Reading Dynamics course, 12 skimmers, and 13 normal readers. On two passages, the speed-readers, skimmers, and normal readers averaged 700 wpm, 600 wpm, and 240 wpm, respectively. The speed-readers’ and skimmers’ accuracy on comprehension questions was comparable, though speed-readers more accurately answered questions concerning the gist of an easier Reader’s Digest passage. Both groups showed consistent comprehension decrements (7–15 percent) compared to the normal readers. Just and Carpenter concluded that reading speed can increase, but not without sacrificing comprehension.
4. Potential Advantages of Speed-reading
Although reading rates exceeding 600 wpm clearly impede comprehension, research has also indicated that many individuals read more slowly than necessary and that moderate increases in reading rate may improve comprehension. In addition, skilled readers show greater flexibility and efficiency by adjusting reading rate to their purpose and to the difficulty or familiarity of the material. However, it is debatable whether readers can be trained to be more flexible and to focus solely on pertinent information. Nevertheless, knowing when to read rapidly is considered an important skill in most reading programs.
5. Conclusion
The available research leads to the conclusion that speed-reading does not allow the reader to adequately understand the reading material. A distinction should be made between speed-reading, which assumes that normal comprehension is maintained, and scanning and skimming, which accept a considerable decline in comprehension levels. Many researchers claim that speed-reading is essentially skimming. Nevertheless, proponents of speed-reading remain steadfast in their claims that speed-reading is effective. Research to support these claims would have to carefully control comprehension measurements and show equivalent or improved understanding of material read at rates exceeding 1,000 wpm. At this juncture, however, evidence suggests that you get what you read, and reading takes time. See also: Eye Movements in Reading; Readability, Applied Psychology of; Reading, Neural Basis of; Reading Skills
Bibliography
Brozo W G, Johns J L 1986 A content and critical analysis of forty speed-reading books. EDRS (ED 270 724)
Carver R P 1972 Comparisons among normal readers, speed-readers, and clairvoyant readers. Yearbook of the National Reading Conference 21: 150–5
Carver R P 1983 Is reading rate constant or flexible? Reading Research Quarterly 18: 190–215
Carver R P 1985 How good are some of the world’s best readers? Reading Research Quarterly 20: 398–419
Carver R P 1990 Reading Rate. Academic Press, San Diego, CA
Harris A J, Sipay E R 1985 How to Increase Reading Ability, 8th edn. Longman, New York
Homa D 1983 An assessment of two extraordinary speed-readers. Bulletin of the Psychonomic Society 21: 123–6
Just M A, Carpenter P A 1987 The Psychology of Reading and Language Comprehension. Allyn & Bacon, Boston
Rayner K 1986 Eye movements and the perceptual span in beginning and skilled readers. Journal of Experimental Child Psychology 41: 211–36
Taylor S E 1965 Eye movements in speed-reading: Facts and fallacies. American Educational Research Journal 2: 187–202
Thompson M E 1985 Dimensions of speed reading: A review of research literature. EDRS (ED 263 530)
Wood E N 1960 A breakthrough in reading. Reading Teacher 14: 115–17
Wood E N 1963 Opinions differ on speed-reading. NEA Journal 46: 44–6
D. S. McNamara
Spelling
In an alphabetic writing system, spelling refers to the norms or conventions by which the characters of the alphabet are arranged to form words. The term ‘orthography’ is also used to refer to a spelling system, especially within linguistics and literary studies. The principle of alphabetic writing means that the characters used for writing correspond in some way, perhaps only roughly, to the individual phonemes of the language represented. In theory, it seems, spelling could be a simple matter of representing each phoneme by its corresponding alphabetic character. There are several reasons why this is not so, however.
(a) Spelling systems often incorporate morphophonemic and/or grammatical information as well as phonological. They are rarely if ever simply representations of sounds.
(b) Spelling systems often are adaptations of spelling systems originally designed for other languages. This may result in some phonemic distinctions going unrepresented (or, conversely, subphonemic distinctions being indicated in spelling).
(c) The existence of a ‘spelling system’ implies a degree, often a high degree, of standardization. However, in every language, the written language represents an abstraction away from spoken varieties with potential variation along social, regional, and possibly other dimensions. There is always some mismatch between spoken words (even in a standard variety) and their written form.
(d) Over time, languages undergo grammatical and phonological change, but the spelling of the written language often remains conservative. Once established and conventionalized, a spelling system does not have to change at the same rate as the language it represents.
(e) Spelling systems, like other symbolic systems, are not purely a technical device but are subject to elaboration as part of the culture of their users. Spelling may communicate various kinds of social and ideological meaning as well as phonological information.
Thus, it is very rare for spelling systems to be wholly phonemic (or ‘phonetic’ as this term is often used outside linguistics); rather, they represent a complex of phonological, grammatical, historical, and cultural information, only some of which may be apparent to their users on the occasion of use.
1. The Underlying Principles of Spelling Systems
Fundamental to spelling systems is an alphabetic principle, that is to say, that a finite, small number of symbols (typically fewer than 50) or ‘letters of the alphabet’ are used in combination to write linguistic units (morphemes or words). Alphabetic writing systems normally have discrete symbols to represent vowels and consonants (though some, like Arabic and Hebrew, do not represent vowels), and the alphabet typically has a fixed order (A, B, C, etc.).
Closely associated with the alphabetic principle, at least historically, is the phonemic principle, which prescribes that there should be a correspondence between graphemes (letters or letter combinations) and phonemes of the language. While some words in some languages may be spelt more or less phonemically, it is rare for languages to be strictly phonemic in their spelling systems. However, it is usual that, with a few exceptions, each letter in isolation has at least one phoneme of the language associated with it. In combination with other letters, it may keep its association with that phoneme or be associated with others. In English, for example, the letter ⟨c⟩ in isolation may be associated with /k/ or /s/ (although it cannot be pronounced without combining it with a vowel). In combination with ⟨h⟩ as ⟨ch⟩ it is associated with /k/ (as in ‘ache’) or /č/ (as in ‘cherry’).
It is common for alphabetic spelling systems to observe a morphophonemic principle which preserves the spelling of specific morphemes even when they undergo conditioned phonological changes. Thus in English, the past tense ending is spelt ⟨ed⟩ even where, as in ‘sailed,’ it is pronounced [d], or, as in ‘hoped,’ it is pronounced [t]. Similarly, the unity of the German Bild [bɪlt] ‘picture’ and Bilder [bɪldəR] ‘pictures’ is preserved by observing a morphophonemic rather than a phonemic principle, as the latter would require the singular to be written as Bilt. As the change from /d/ to /t/ is predictable (because word-final plosives are always unvoiced) it is not reflected in the spelling.
An important cause of deviation from the phonemic principle in spelling systems is the operation of a historical principle, which in some cases might be called simply orthographic conservatism. The initial ⟨k⟩ in ‘knee,’ for example, has not been pronounced in most dialects of English for several centuries. Spelling systems are notoriously resistant to change, while phonological systems, by comparison, are not.
Another potential source of apparent inconsistencies is the etymological principle, typically applied to loan words from another language which shares the same alphabet.
Thus in Dutch, the French loan-word cadeau (‘gift’) may be spelt ⟨cadeau⟩ (respecting the etymological principle) or ⟨kado⟩, which respects the phonemic principles inherent in the conventional spelling of native Dutch words. Likewise, considerations of etymology in English led to a class of words, coined on the basis of ancient Greek, in which /f/ is spelt as ⟨ph⟩: telephone, philosophy.
The historical principle and etymological principle together account for a phenomenon which is particularly apparent in English, though observable in many languages: the hybrid nature of the spelling system. English, through its inclusion of Anglo-Saxon, French, and Classical Greek and Latin words in the written language, with each set of lexis following the spelling tradition of the language in which it originated (the etymological principle), has become a hybrid system preserving (by the historical principle) a mixture of orthographic traditions, in particular the Anglo-Saxon and Norman French traditions (see Scragg 1974 for the history of English spelling). The longer the written tradition of a language, the more likely it is that the spelling system will be a hybrid showing the influence of different traditions. A further consequence of the hybrid nature of a spelling system
like English is that it may become in part logographic, i.e., words which sound alike are distinguished by their ‘shapes’ or spellings (for example ‘sole,’ ‘soul’).
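The morphophonemic principle described earlier in this section, with its ‘sailed’ and ‘hoped’ examples, can be illustrated with a small sketch: the written suffix stays constant while its pronunciation is predicted from the last sound of the stem. This is a simplified illustration and not a full account of English. The function name `past_tense_forms`, the short list of voiceless consonants, and the crude spelling rule (no consonant doubling, no y-to-i change) are assumptions made for the example, and the [ɪd] case after /t/ or /d/ is added from general knowledge of English rather than from the article.

```python
# Voiceless consonants that trigger [t]; a deliberately short, illustrative list.
VOICELESS = {"p", "k", "f", "s", "ʃ", "tʃ", "θ"}


def past_tense_forms(stem_spelling: str, final_phoneme: str) -> tuple[str, str]:
    """Return (written form, pronunciation of the suffix) for a regular past tense.

    The written suffix is always <ed> (or <d> after a final <e>),
    whatever the pronunciation turns out to be.
    """
    if final_phoneme in {"t", "d"}:
        sound = "ɪd"   # e.g. 'wanted'
    elif final_phoneme in VOICELESS:
        sound = "t"    # e.g. 'hoped'
    else:
        sound = "d"    # e.g. 'sailed'
    suffix = "d" if stem_spelling.endswith("e") else "ed"
    return stem_spelling + suffix, sound


for stem, last_sound in [("sail", "l"), ("hope", "p"), ("want", "t")]:
    print(past_tense_forms(stem, last_sound))
```

The three stems receive the same written ending but three different pronunciations, which is the sense in which the spelling encodes the morpheme and not its surface sound.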
2. Reading, Writing, and Spelling
As pointed out by Augst (1986a, p. 26), ‘no alphabetic script in the world is a representation of sounds and thus identical to a transcription.’ Only the most naïve accounts of spelling see it simply as a matter of letters representing sounds. Many accounts, however, see spelling as a system of correspondences between phonemes (i.e., the potentially contrastive sound units of a language) and graphemes (written units, which could include single letters or letter combinations like ⟨sh⟩ in English representing the phoneme /š/). Such models assume that the processes of both reading and writing require access to the phonological level of representation of the word or morpheme.
Augst suggests that a better model would be a trilateral sign, consisting of a representation of meaning, a phonemic schema, and a graphemic schema. The reader could go directly from the graphemic schema (i.e., the written word on the page) to the meaning by recognizing the visual image, or vice versa in the case of writing. For words where the visual image was not familiar (or not familiar enough), the reader/writer would be able to access the phonemic schema: in lay terms, to ‘sound out’ the word by relying on the known correspondences between phonemes and graphemes. A common word like ‘cat’ might be read without accessing the phonemic schema, even though the phoneme-grapheme correspondence is exact; but an unusual word like ‘tesla’ might be read by matching graphemes to phonemes, and might be written ‘phonetically’ (and thus perhaps misspelt, e.g., ‘tessler’ or ‘tezla’) by a user who had heard the word but had not seen it in print.
The precise relationship between reading, writing, and spelling remains controversial. Which theoretical model is adopted clearly has consequences not just within linguistics, but for pedagogical theories as well, since in some countries pupils expend much time and effort to learn to spell (see Sect. 6).
3. The Historical Origins of Spelling Systems
The first written texts in a language are usually the work of writers who already know and use the written form of another language, one which already has a literate tradition. The conventions of writing are typically introduced from one culture to another through the mediation of a class of literate bilinguals. In terms of spelling, this means that the conventions associated with writing one language are transferred to a new language, for which they may or may not be well suited at the time. Spelling traditions therefore tend to reflect social and historical movements and power relationships. For example, the contemporary orthography of Manx (Sebba 1998a) reflects its origins in religious texts produced by a class of bilingual clergy who were the first writers of the language. More generally, as the vernaculars of Europe were transformed into written languages, the spelling conventions of Latin, in various forms, became part of the spelling traditions of the newly written languages.
During the era of imperial and colonial expansion, literacy was introduced to many languages using spelling systems based on the language of the colonial power. In many cases, the agents of literacy were missionaries or other clergy, or colonial administrators. Lepsius’s (1863/1981) work set out to provide a ‘standard alphabet’ suitable for all languages, whereby the same symbol consistently represented the same sound (e.g., the symbol ⟨y⟩ for the [j] sound of English ‘yes’ and German ja). His work was the forerunner of phonemic orthographies introduced to many languages by trained linguists, both secular and missionary.
One issue debated by linguists providing ‘new’ orthographies for indigenous languages is whether such systems should reflect the spelling conventions of the local official language (e.g., Spanish in the case of languages of Central America), enabling an easier transition to the dominant language of literacy, or more ‘universal’ principles, which may foster more independence for the language. See Barros (1995) for the history of this debate.
Languages with relatively short histories of literacy are often unfortunate enough to have several competing spelling systems, in spite of a dearth of printed literature. This may be the result of competing groups of missionaries (sometimes representing different linguistic philosophies or colonial powers) or competing political interests, or both (see Sect. 5).
4. Spelling and Language Standardization
The existence of a spelling system does not by itself imply standardization, but for languages which have gone down the road of grammatical and lexical standardization, the standardization of spelling seems to be an inevitable part of the process. In the histories of the main European languages of literacy, printers, publishers, and lexicographers have had a major part in this. While English was successfully used as a written language for centuries even though the spelling system was unstandardized, Stubbs (1980, pp. 44–5) points out how ‘good spelling’ has become socially valued, while even the concept of a ‘spelling mistake’ in English is as recent as the last two centuries (Strang 1970, p. 107), following the publication of major dictionaries (Johnson’s in England, Webster’s in North America).
Publishers and printers have their own economic motives for wishing to standardize the forms of written words, and for English, had achieved this goal by about 1650, a century or so before spellings in private writing became standardized (Strang 1970, p. 107). It is not clear, however, whether the standardization of spelling produces gains in terms of reading efficiency. If reading is largely recognition of a visual image (see Sect. 2), then having just one image corresponding to each word might produce a gain in efficiency, as fewer images will have to be stored and accessed. Standardization of spelling might then be expected to go hand in hand with mass literacy. This remains to be demonstrated, however; and it does not fit with the fact that literacy in England was ‘moderately widespread’ in the two centuries before the industrial revolution (Strang 1970, p. 106), and then declined in percentage terms just as standardization approached completion.
Advocates of particular spelling systems (including the existing ones) rarely bother to justify the need for a standardized spelling. It is overwhelmingly taken for granted. This in itself is a good reason to ask why spelling standardization is necessary, and why such a high social value is put on being able to spell correctly.
5. Spelling and Ideological Debate and Conflict
Spelling systems, as a type of symbolic system, inevitably acquire social and cultural meanings which go far beyond their function as a means of representing a language. The circumstances of the introduction of a spelling system and the circumstances surrounding its use give rise to this cultural symbolism, which may in turn lead to ideological contestation and conflict, with the spelling system as its focus. Some recent examples include the debate over an appropriate spelling system for Haitian Creole (Schieffelin and Doucet 1994), where a key issue was whether to use a ‘gallicising’ orthography, where certain letters had the same value as in French, or a more international one, where they did not. The choice between letters such as ⟨k⟩ and ⟨c⟩ was reinterpreted in the context of the debate by some protagonists as representing ‘Anglo-Saxon imperialism’ (⟨k⟩) vs. ‘colonial subservience’ (⟨c⟩). A similar type of conflict exists in Galicia (Alvarez-Caccamo and Herrero Valeiro 1996), where the choice of orthography for Galician reflects a writer’s view on whether Galician is a variety of Portuguese or an independent language. Those favoring ‘reintegrationism’ (with Portuguese) use mainly Portuguese spelling conventions; those favoring ‘differentialism’ use conventions closer to Spanish. In less extreme cases, particular spelling systems may be unpopular because of their associations: for example, one orthography for Breton is disliked for its associations with Nazi collaborators during World War II (McDonald 1989, p. 131ff.).
Even where not codified, orthographic practices may represent ideological positions, especially a desire to differentiate. Thus Caribbean writers may write kool, tuff to distance Creole from English, although there is no significant pronunciation difference (Sebba 1998b). Within an Anarchist tradition in Spain, words such as Okupación (‘Occupation (of buildings)’) are spelt with ⟨k⟩ instead of ⟨c⟩ in graffiti, in defiance of the official spelling.
6. Spelling and Educational Policy
Most languages with a written tradition now have highly standardized spelling. Thus spelling is usually seen as one of the literacy skills which have to be acquired through formal schooling, with ‘good spelling’ taken as a sign that the writer is well educated. Education authorities differ, however, in the emphasis placed on the accomplishment of ‘correct’ spelling, as measured by the amount of school time devoted to it, and the penalties for failing to achieve the required standard. There are differences between Britain and the US, for example, in this respect, with Britain usually being seen as having a more tolerant attitude, though there have been recent policy changes. An insistence on ‘correctness’ in spelling and substantial penalties for underachieving are usually associated with political conservatism, with a more relaxed approach being seen as ‘liberal,’ in keeping with prescriptive approaches to other aspects of language.
In a few places, such as Norway, there are alternative spelling standards which take into account the dialectal preferences and competences of the local population (see Wiggen 1986). This is unusual; learners are much more likely to be subject to a national or supra-national spelling standard.
While there is almost universal acceptance of the notion that ‘correct spelling’ is a necessary educational outcome, for some languages there is a feeling that spelling is too difficult, and thus presents a barrier to educational achievement. This provides a major impetus for spelling reform, discussed in Sect. 7.
7. Spelling Reform Spelling reform, usually meaning an adjustment to an existing system rather than a root-and-branch rethink, is invariably controversial. Proposals for spelling reform are usually put forward when the current conventions are seen to be causing a ‘problem,’ typically an educational one. A commonly identified source of problems is language change, which results in the phonemic principle applying over a decreasing part of the lexicon; words where the spelling is out of step with the pronunciation are seen as causing difficulties for learners and leading to ‘illiteracy.’
For English, the widespread perception of the spelling system as ‘complex’ and ‘illogical’ has led to many proposals for reform over the years, most of which have had no impact on actual spelling practices (see Carney 1994 for discussion). In England, the banner of spelling reform is carried by, among others, the Simplified Spelling Society. A few languages, such as Norwegian, have regular and fairly uncontentious reviews of spelling practices, carried out by a central authority. Other languages have implemented reforms, though seldom without resistance from the general public. Recent examples of contentious reforms are those of Dutch (Jacobs 1997), Portuguese (Garcez 1995), and German, each of which required international negotiation. Reforms often founder as a result of conflicting interests: the general public (which nearly always seems to be opposed to changes), educationists (who are more likely to be in favor), publishers and lexicographers (who have a vested interest in certain types of change, which may produce increased demand from consumers), authors (who sometimes oppose change on the grounds that it will make earlier literary works unavailable to readers who do not know the pre-reform spelling), and national governments, who have to try to balance these opposing parties and promote their own interests. The outcome is often a set of reforms which satisfies no one, and the road to reform is often extremely rocky. The German reforms of the 1990s, for example, in spite of affecting only 0.05 percent of words in running text, have been subject to successful challenges in some regional courts, although these have been overturned by the Federal Constitutional Court (Johnson 2000).
See also: Dyslexia, Developmental; Dyslexia: Diagnosis and Training; Dyslexia, Genetics of; Reading Nonalphabetic Scripts; Reading Skills; Writing, Evolution of; Writing Instruction; Writing Systems; Writing Systems, Psychology of
Bibliography
Alvarez-Caccamo C, Herrero Valeiro M J 1996 O continuum das normas escritas na Galiza: do espanhol ao português. Agália: Revista da Associaçom Galega da Língua, pp. 143–56
Augst G 1986a Descriptively and explanatorily adequate models for orthography. In: Augst G (ed.) New Trends in Graphemics and Orthography. Mouton De Gruyter, Berlin, pp. 25–42
Augst G (ed.) 1986b New Trends in Graphemics and Orthography. Mouton De Gruyter, Berlin
Barros C D M 1995 The missionary presence in literacy campaigns in the indigenous languages of Latin America. International Journal of Educational Development 15(3): 277–87
Carney E 1994 A Survey of English Spelling. Routledge, London
Garcez P M 1995 The debatable 1990 Luso-Brazilian orthographic accord. Language Problems and Language Planning 19: 181–78
Jacobs D 1997 Alliance and betrayal in the Dutch orthography debate. Language Problems and Language Planning 21(2): 103–118
Johnson S 2000 The cultural politics of the 1998 reform of German orthography. German Life and Letters 53(1): 106–25
Lepsius R 1981 Standard Alphabet for Reducing Unwritten Languages and Foreign Graphic Systems to a Uniform Orthography in European Letters. 2nd revised edn. (London 1863), Kemp J A (ed.) John Benjamins, Amsterdam
McDonald M 1989 ‘We are not French!’ Language, Culture and Identity in Brittany. Routledge, London
Schieffelin B B, Doucet R C 1994 The ‘real’ Haitian Creole: Ideology, metalinguistics and orthographic choice. American Ethnologist 21: 176–200
Scragg D G 1974 A History of English Spelling. Manchester University Press, Manchester, UK
Sebba M 1998a Orthography as practice and ideology: The case of Manx. Lancaster University Department of Linguistics and Modern English Language, Working Papers of the Centre for Language in Social Life, No. 102
Sebba M 1998b Phonology meets ideology: The meaning of orthographic practices in British Creole. Language Problems and Language Planning 22(1): 19–47
Strang B M H 1970 A History of English. Methuen, London
Stubbs M 1980 Language and Literacy: The Sociolinguistics of Reading and Writing. Routledge and Kegan Paul, London
Wiggen G 1986 The role of the affective filter on the level of orthography. In: Augst G (ed.) New Trends in Graphemics and Orthography. Mouton De Gruyter, Berlin, pp. 395–412
M. Sebba
Spencer, Herbert (1820–1903) Herbert Spencer was born on April 27, 1820 in Derby, UK, and after a long and distinguished career as a major intellectual, he died on December 8, 1903 in Brighton, UK. During Spencer’s childhood, he was privately tutored, first by his father and then, after the age of 13, by his uncle in Bath. Thus, Spencer received little formal education outside of his family, although his in-depth tutelage in mathematics and science was sufficient for him to contemplate as a relatively young man a grand Synthetic Philosophy in which a few general or ‘first principles’ (1862) could be used to explain the operative dynamics of physical science, biology (1864–67), psychology (1855), sociology (1874–96), and ethics (1892–98). He also was a moral crusader, writing important philosophical statements on such topics as education (1860) and social wellbeing (1850, 1892–98). Spencer’s sociology comes late in his career, beginning with a methodological treatise on sources of bias (Spencer 1873), continuing with an effort to systematically array cross-cultural and comparative data (Spencer 1873–1934), and culminating with the use of data to illustrate the basic laws of sociology (Spencer 1898b [1874–96]). In this article, the goal is to see how Spencer’s grand scheme developed and how it evolved into a profoundly important sociology. [For a comprehensive listing of primary works by Spencer and of works on Spencer, see Perrin (1993).]
1. Biographical Influences on Spencer’s Sociology Several aspects of Spencer’s biography are important in examining his impact on social science. First, he never held an academic position, and hence, he did not have any students to carry forth his work after his death, although the wealth inherited from his uncle and from his book sales did allow him to commission other academics to perform research and, even after his death, to continue to completion the volumes of his Descriptive Sociology (Spencer 1873–1934). Second, he lived an extremely spartan life, and yet at the same time, he interacted in London clubs and various assemblies with the elite scientists and intellectuals of his time. Indeed, it is from these scholars (figures such as Thomas Huxley, Joseph Hooker, John Tyndall, and even Charles Darwin) that Spencer absorbed much of his knowledge about current discoveries in science. Third, Spencer was immensely popular in his time. His works generally came out in serial form in magazines and private subscriptions, later being packaged as whole books which sold more than 400,000 copies during Spencer’s lifetime. Yet, despite his enormous popularity, Spencer was an enigma because he had no formal academic affiliation and because he lived a very private and, it appears, rather neurotic life punctuated by hypochondria. Yet, in all his works, several key influences are evident.
Reading Thomas Malthus, Karl Ernst von Baer, and Charles Darwin gave Spencer a view of the universe as evolving from simple to more complex through competition; extensive reading of the physical sciences gave Spencer a post-Newtonian conception of science, a view emphasizing that universal laws could explain all domains of the universe; and reading of Auguste Comte also gave Spencer a view of sociology as the search for general laws as well as a conception of social organisms as having structures devoted to meeting basic needs and of societal evolution as involving an increasing complexity of structures meeting these needs. Spencer (1968 [1864]) was, however, always rather defensive about perceptions of Comte’s influence on him. Out of this mix of influences came Spencer’s grand Synthetic Philosophy that was to unify physical science, psychology, biology, sociology, and ethics under first principles borrowed from the physics of Spencer’s time.
2. The Stigma of Spencer’s Moral Philosophy Spencer’s moral philosophy is one reason that his intellectual star began to fade in the early twentieth century. In his first major work, Social Statics (1850), almost a decade before Darwin published On the Origin of Species, Spencer coined the phrase ‘the survival of the fittest’ that was to haunt him for the rest of his career. The more general philosophy in this work simply argued that individuals should be free to pursue their happiness, as long as they do not abridge the rights of others to do the same, although there was a hard edge to this philosophy emphasizing that the less fit should not be supported by the state. Spencer (1904) complained bitterly in his autobiography about how he had become typecast as a result of this first book, but it is hard to feel sorry for him because he made the same ‘libertarian’ (in today’s vocabulary) statements in his last work, The Principles of Ethics. For a discipline such as sociology, which is liberal in the twentieth-century sense rather than in the nineteenth-century sense that emphasizes freedom of the individual and lack of governmental activism, this libertarian-sounding ideology has always worked to obscure the important contributions of Spencer to sociology.
3. The ‘First’ or ‘Cardinal’ Principles Spencer’s use of the physics of his time to formulate his ‘cardinal principles’ is now rather dated. Spencer postulated three basic principles: (a) the indestructibility of matter, (b) the continuity of motion in a given direction, and (c) the persistence of force behind movement of matter. From these general principles, three corollaries were derived: (d) the transferability of force from one type of matter and motion to another, (e) the tendency of motion to pass along the line of least resistance, and (f) the rhythmic nature of motion. From these principles, Spencer (1880 [1862], p. 286) formulated his famous law of evolution as ‘change from a less coherent form to a more coherent form, consequently on the dissipation of motion and integration of matter.’ This law was presumed to apply to the evolution of celestial bodies, chemical compounds, organisms, and superorganic bodies (organization of organisms). Spencer was, therefore, the world’s first general systems theorist. The first or cardinal principles are less interesting than what Spencer did with them as he applied them to organic and superorganic evolution. In Principles of Biology, Spencer used the first principles to formulate the cornerstone of his biology and sociology: increasing size of an organic or superorganic mass requires an exponential differentiation of the structure necessary to support the larger mass. This principle is still used in biology, although its origins in Spencer are long forgotten; the principle is employed in sociology to study differentiation at all levels of social organization, with only a vague sense that this idea comes from Spencer. In many ways, this theoretical cornerstone of Spencer’s Synthetic Philosophy actually comes more from his training and brief practice as an engineer than it does from his first principles; for within engineering it is well established that larger structures require more complex systems of support. Spencer’s star faded long ago in biology, and there is only a hint of his influence in education and philosophy today. Where Spencer still remains a figure deserving of at least mention, often in a somewhat derisive way, is in sociology. In order to appreciate what Spencer’s legacy is today, we must review his contributions to sociology, where he still lives on as a marginal figure in sociology’s pantheon of great masters that includes such giants as Karl Marx, Max Weber, Émile Durkheim, and George Herbert Mead.
4. Spencer’s Functionalism Spencer is famous for his ‘organismic analogy’ in which he lists in a more formal manner than Comte before him the parallel principles of organization between organic and superorganic bodies. But this analogy is often given too much emphasis in commentaries on Spencer because it hides a rather sophisticated functionalism. For Spencer, all superorganic systems composed of organic units must meet three basic requisites if they are to survive: (a) operation or (i) production of life-sustaining substances and (ii) reproduction of units; (b) regulation or coordination and control of units; and (c) distribution or movement of necessary information and substances among units. Populations which cannot meet these requisites will experience ‘dissolution,’ a notion first developed in First Principles (1880 [1862]). Thus, Spencer did not view evolution as an inevitable process of movement from simple to complex as populations sought to resolve problems of operation, regulation, and distribution; on the contrary, he recognized that societies often disintegrate. In this discussion of functional requisites, Spencer introduces two conceptions of ‘selection.’ On the one hand, Spencer argued that evolution of societies tends to increase complexity because, as societies come into conflict, the more complex and better organized generally will prevail, thereby ratcheting up the overall level of societal complexity. This argument is one of the first ‘group selectionist’ statements presented in either biology or the social sciences. On the other hand, there is another kind of selection which is at the core of all functionalist arguments. This selection occurs alongside Darwinian selection and revolves around problems and crises in production, reproduction, regulation, and distribution. When problems in any of these spheres emerge, they set into motion efforts to search for solutions by goal-seeking social units. 
Thus, social selection in superorganic systems is not blind because the very units subject to selection can change their structure and thereby avoid being selected out. Spencer argues that most selection in sociocultural systems is of this second type, where problems of adaptation activate efforts to resolve crises with respect to production, reproduction, regulation, and distribution. Since these requisites are the essential problems of survival in all superorganic systems, including both human and nonhuman societies, differentiation of structure during the course of evolution will revolve around these four axes. First will come differentiation of productive structures, followed by differentiation of reproductive units. This increased differentiation along these axes will generate selection pressures for their coordination and control, leading to the elaboration of political systems. As differentiation of operative and regulatory units occurs, new systems of distribution become necessary if dissolution is to be avoided. As these new systems of distribution emerge, e.g., markets, communication, and transportation infrastructures, they create new selection pressures for their regulation while at the same time providing positive feedback for further differentiation of production and, later, reproduction. Thus, movement of superorganic systems from simple to complex forms revolves around selection pressures along the axes of operation, regulation, and distribution as well as around the mutual feedforward and feedback effects of differentiation along one axis on the other axes. This kind of analysis is far more sophisticated than the typical commentary on functionalism acknowledges. Spencer’s substantive sociology also offers some very important models of system dynamics.
One element of these models follows from the basic law: as a population grows, the increasing size generates escalated logistical loads for operation, regulation, and distribution which, in turn, set off selection pressures in the manner described above. Spencer’s political sociology should be viewed in this context. For Spencer, once power is concentrated, this initial concentration will be used to gain even more regulatory control. As power becomes more concentrated, inequalities will escalate as those with power extract an ever-greater proportion of a population’s resources. As inequalities increase, disintegrative pressures mount, leading to the concentration of more power to suppress the potential conflict that always inheres in inequality; and as this process is iterated, system dissolution is likely. Geopolitics are also involved in these dynamics, in several senses. First, elites in the regulatory sector often engage in war to secure the resources of other populations, which only increases the concentration of power and the scale and scope of inequalities. Second, such war-making efforts often distract subordinates in a system of inequality, at least for a time, from their true interests in changing the system of power generating inequality. Third, when power becomes too concentrated, resistance to this power inevitably increases, often leading to protest movements and other actions that lead to some deconcentration of power. Yet, the longer deconcentrated power lasts, the more problems of coordination and control escalate, creating selection pressures for reconsolidation of power. Societies thus cycle between concentrated profiles of power (what Spencer termed the ‘militant’) and deconcentrated power (what Spencer called ‘industrial’ profiles), a line of analysis that predates Vilfredo Pareto’s discussion of the circulation of elites.
5. The Data Behind Spencer’s Sociology These lines of argument have a very contemporary ring to them, and they can be found throughout Spencer’s Principles of Sociology. Moreover, they are rather easily converted into laws or principles since this is how Spencer thought (see Turner 1985, for an effort to extract the principles from Spencer’s work). One of the criticisms of Spencer’s functionalism is that it is so speculative; while this may be so, Spencer also amassed enormous amounts of data to support his theoretical speculations. Indeed, the reason that Spencer’s Principles of Sociology is such a long work is that it is filled with data. In fact, many scholars complained at its first publication in serial form and later as a three-volume book that the analytical ideas get lost in all the data. Thus, Spencer assembled data far in excess of almost any other sociologist in the history of the discipline. He hired professional academics to gather historical and ethnographic information, ordering it into categories that followed from his general functionalist scheme. There are many tragedies in the contemporary view of Spencer, but none is greater than the loss of what Spencer began in the 1860s and first published in 1873 as the inaugural volume of his Descriptive Sociology (the last volume was published in 1934). Spencer was the first to execute what was later done by those assembling the Human Relations Area Files (HRAF) (e.g., Murdock 1953). Spencer recognized that in order to make generalizations about human societies, a large comparative database was necessary. But these data had to be arrayed so that they could be compared, leading Spencer to develop a category system for recording the data being assembled by professional academics. The result was the multivolume Descriptive Sociology (1873–1934) which arrays data historically for those societies that have a written history and cross-sectionally for those less developed societies without a written history.
The publication of these oversized volumes has been completely lost to modern social science, but if Spencer’s lead had been followed by others in the nineteenth century, all social science would have a much firmer empirical base than is provided by HRAF. At the very time Spencer was assembling available information on ‘primitive’ societies, these societies were being changed or destroyed by colonial forces, with the result that by the time anthropologists began to study them in detail in the 1930s, their pristine character had long been compromised. Descriptive Sociology could have been Spencer’s greatest legacy, but like so much of his work, the contemporary world remains largely ignorant of his efforts to create a database for assessing the plausibility of his theories.
6. What Might have Been, and What Can Still Be By the time of Spencer’s death in 1903, his star had already begun to fade in intellectual circles. One reason for this decline can be attributed to Spencer as a lone scholar working outside of academia. Without students to carry on his work and without a school of thought institutionalized in academia, there was no one and no place to carry forward Spencerian sociology. Work still continued on Descriptive Sociology but this was much narrower in scope than Spencer’s philosophy or even his sociology, and moreover, most of this work was contracted to scholars who were not committed to Spencer’s grand scheme. So, when Talcott Parsons (1937) asked ‘who now reads Spencer,’ the answer was not very many scholars inside or outside of sociology. Spencer’s star had fallen at the same time that evolutionary thinking came under attack in anthropology, and with no one to defend Spencer’s sociology, he fell into obscurity. Yet, let us imagine for a moment what might have happened if Spencer had been defended by students or had established a school entrenched in academia like Durkheim’s Année school in France. In conducting this thought experiment, we can come to appreciate not only what might have been, but more fundamentally, what is still available in Spencer’s legacy. The first legacy would have been a Human Relations Area Files that came earlier, before colonialism completely destroyed indigenous populations. Moreover, sociology would have had a database that was comparative in several senses: the data would have been arrayed in common categories that are more sophisticated than George P. Murdock’s (1953) HRAF; the data would have been both historical and ethnographic and would have allowed systematic comparisons of different types of societies at any point in time.
Only in the last three decades have comparative-historical and ethnographic data been reintroduced into sociology and, ironically, this reintroduction has been part of a movement to rekindle interest in the very evolutionary thinking that was used to discredit Spencer. The second legacy is that the evolutionary paradigm would have survived and avoided the 50-year hiatus. Stage models have utility in sociological analysis, as has been demonstrated in the work of Gerhard Lenski (1966), Stephen K. Sanderson (1995), Jonathan
Turner (1984, 1997), and others in recent years. Moreover, adopting Spencer’s views on evolution would have avoided the view of evolution as inexorable, since dissolution would be a prominent point of analysis as would dialectical cycles within either evolution or dissolution. A third legacy would have been a sophisticated functionalism that, unlike mid-twentieth century approaches, sought to articulate theoretical principles about the dynamics of social processes rather than simply cross-tabulate lists of functional requisites with lists of social processes. Functionalism fell into disuse not so much because it was dead wrong but because it was not done well. A reading of Spencer would have led the way to a functionalism that does not build conceptual schemes as much as highlight fundamental forces whose operative dynamics can be captured in theoretical principles. A fourth legacy would have been a sociology that did not lose sight of its status as a science in which the goal is to generate theoretical principles that explain the operative dynamics of the social universe. Spencer’s methodological treatise and his articulation of actual laws of human organization would have sustained the discipline better than Marx, Weber, Simmel, or Durkheim could, as various antiscience and excessively relativistic movements have come and gone for the whole of the twentieth century in sociology. A fifth legacy would have been the actual substance of Spencerian sociology. Imagine if sociologists had mined The Principles of Sociology for insights into human organization with the same intensity as we have delved into Marx, Durkheim, Weber, or Mead. 
Spencer provided not only highly abstract theoretical principles but also many principles about the operation of stratification systems, polity, kinship, religion, economy, markets, ceremonies, professions, organizations, education, and many other key structures, and he documented these generalizations with copious amounts of historical, ethnographic, and for his time, contemporary data. The sociology of major institutions (e.g., family, education, religion, etc.) would be less parochial and descriptive than it is today because it would examine the dynamics of the institution with ethnographic as well as historical-comparative data organized by basic theoretical principles. A sixth legacy would have been the continued connection between biology and sociology. This connection has only recently been revived along all the fronts found in Spencer’s work. There is the ecological approach emphasizing Darwinian dynamics that Spencer saw before Darwin and that are now used to examine the relations between resource niches and a wide variety of social processes. Indeed, there is the revival of selection arguments in general (e.g., Runciman 1998), and moreover, an effort to move beyond Darwinian selection to considerations of group selection (e.g., Sober and Wilson 1998) and Lamarckian processes so evident in Spencerian sociology (Turner 1995). A seventh legacy is a political sociology that connects stratification processes within society to geopolitical dynamics in world systems. At the core of Spencer’s sociology is a concern with power that is the equal of that to be found in either Marx or Weber. One of Spencer’s main explanatory variables is how the concentration of power alters all other social processes, especially stratification and institutional systems, and his analysis of how war and commerce in the world system are implicated in power dynamics would not only have supplemented Marx and Weber but would have led to the emergence of world systems theory at the beginning rather than at the three-quarters mark of the twentieth century (e.g., Wallerstein 1974). An eighth legacy is the emphasis on the fundamental relation between size, differentiation, and modes of integration. Much of this legacy was rediscovered in the mid-twentieth century in organizational studies on administrative intensity, but the basic dynamic operates at all levels of society. If sociologists had pursued Spencer’s lead and focused in 1910 on this fundamental relationship in the social universe, not only would organizational studies have been moved forward at an earlier point in history but also other social units might well have been analyzed for how these relations among size, differentiation, and integration play themselves out. These eight lines of thinking that are part of Spencerian sociology were available to all sociologists by the early 1890s. In a sense we have squandered this incredible legacy, but it is not too late. The above thought experiment can be more than an intellectual exercise in the history of ideas. A reading of Spencer today can still inform sociology. There are gems of insight to be discovered and used to analyze societies in almost everything Spencer wrote.
They simply sit there on library shelves, waiting to be used by scholars in the twenty-first century. There is probably little more to be gleaned from poring over the texts of others in sociology’s early pantheon, but the neglect of Spencer for the whole of the twentieth century provides a rich resource for the twenty-first century.
See also: Conflict: Anthropological Aspects; Conflict: Organizational; Conflict Sociology; Cross-cultural Research Methods; Cultural Evolution: Theory and Models; Darwinism; Development: Organizational; Development: Social-anthropological Aspects; Elites, Anthropology of; Elites: Sociological Aspects; Ethics and Values; Evolution, History of; Evolution, Natural and Social: Philosophical Aspects; Evolution: Optimization; Evolution: Self-organization Theory; Evolutionary Selection, Levels of: Group versus Individual; Functionalism, History of; Functionalism in Sociology; Groups, Sociology of; Hierarchies and Markets; Historical Change and Human Development; Individual/Society: History of the Concept; Motivational Development, Systems Theory of; Organizational Culture; Public Opinion: Political Aspects; Regulation, Economic Theory of; Religion: Definition and Explanation; Religion: Family and Kinship; Religion: Morality and Social Control; Social Evolution: Overview; Sociality, Evolution of; World Systems Theory
Bibliography
Lenski G E 1966 Power and Privilege: A Theory of Social Stratification. McGraw-Hill, New York
Murdock G P 1953 Social structure. In: Tax S, Eisely L, Rouse I, Voegelin C (eds.) An Appraisal of Anthropology Today. University of Chicago Press, Chicago
Parsons T 1937 The Structure of Social Action. McGraw-Hill, New York
Perrin R G 1993 Herbert Spencer: A Primary and Secondary Bibliography. Garland Publishers, New York
Runciman W G 1998 The selectionist paradigm and its implications for sociology. Sociology 32: 163–88
Sanderson S K 1995 Social Transformations: A General History of Historical Development. Blackwell, Cambridge, MA
Sober E D, Wilson S 1998 Unto Others: The Evolution and Psychology of Unselfish Behavior. Harvard University Press, Cambridge, MA
Spencer H 1873 The Study of Sociology. Kegan Paul, London
Spencer H 1873–1934 Descriptive Sociology, or Groups of Sociological Facts. D. Appleton, New York
Spencer H 1880 [1862] First Principles. A L Burt, New York
Spencer H 1887 The Principles of Biology, 2 Vols. D. Appleton, New York, originally published 1864–67
Spencer H 1888 Education: Intellectual, Moral, and Physical. D. Appleton, New York, originally published in 1861
Spencer H 1892 Social Statics. D. Appleton, New York, originally published in 1850
Spencer H 1892–98 The Principles of Ethics. D. Appleton, New York
Spencer H 1898a The Principles of Psychology. 2 Vols. D. Appleton, New York, originally published in 1855
Spencer H 1898b [1874–96] The Principles of Sociology. 3 Vols. D. Appleton, New York, originally published between 1874–96
Spencer H 1904 An Autobiography. 2 Vols. Appleton, New York
Spencer H 1968 [1864] Reasons for Dissenting from The Philosophy of M. Comte and Other Essays. Glendessary Press, Berkeley, CA, originally published in 1864
Turner J H 1984 Societal Stratification: A Theoretical Analysis. Columbia University Press, New York
Turner J H 1985 Herbert Spencer: A Renewed Appreciation. Sage, Beverly Hills, CA
Turner J H 1995 Macrodynamics: Toward a Theory on the Organization of Human Populations. Rutgers University Press, New Brunswick, NJ
Turner J H 1997 The Institutional Order. Longman, New York
Wallerstein I 1974 The Modern World System. Academic Press, New York, Vol. 1
J. H. Turner
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Sperry, Roger Walcott (1913–94)
Roger W. Sperry, scientist, teacher, philosopher, and humanitarian, was born in Hartford, Connecticut, USA on August 20, 1913. He married Norma Dupree in 1949; they had two children. He died on April 17, 1994. The diversity of Sperry’s interests is reflected not only in his distinguished academic career, but also in his avocations. In addition to his major contributions to our understanding of brain structure and function, he was an unusually talented artist, sculptor, and paleontologist. His talents across many areas attracted a constant stream of graduate students, postdoctoral fellows, and visiting scientists to his Psychobiology Unit at Caltech for four decades.

Sperry’s scientific contributions centered on four major areas, the first of which began in the early 1940s, while he was still a graduate student in Paul Weiss’ laboratory at the University of Chicago. Weiss, whose views dominated the field of neural development at the time, held that the leading tips of growing neurons responded solely to local mechanical stimuli, and were in no way influenced by distant chemical cues. His numerous tissue culture studies had essentially closed the door on any possibility of chemical influences on neuronal growth. Weiss also held that the normal function of the nervous system was in no way dependent upon the precise patterns of neuronal connections. One example of this, according to him, was that when motor nerves were transplanted into any foreign muscle, normal movements would eventually occur after a period of training. Sperry, as was to become his custom, challenged these ideas, and designed a typically simple but ingenious experiment to test whether or not specific nerve connections are critical for normal function. Using rats, he transplanted motor nerves from leg muscles of one side to muscles of the opposite side, then tested their movements after the new nerve-muscle connections were established.
Function was found to be severely impaired, and no amount of training could correct it (Sperry 1945). These results also had important implications for humans. Neurosurgeons at the time were transplanting nerves in wounded war veterans in attempts to help them regain digit and limb movements, which had been lost as a result of war injuries. In spite of extensive post-surgical training, these transplants met with little success. During his military service with the Office of Scientific Research and Development (1942–45), Sperry, along with Weiss, showed that surgical transplantation of nerves to antagonistic muscle groups followed by intensive training simply did not work because transplanted nerves were inherently different from the nerves which normally innervated those muscles. This finding resulted in significant modifications of treatment procedures, thus saving a great deal of time, frustration, and money.
During this same period it was reported that if the optic nerve were sectioned in amphibians, optic axons would regenerate into the brain, and normal function would be restored. Sperry immediately recognized that this finding offered the perfect opportunity to test his ideas regarding functional specificity and the possible role of chemical influences on neuronal growth patterns. He first confirmed the above results (Sperry 1950), then once again designed a brilliantly simple study, the now classic eye rotation experiment in the frog, to test his ideas. A hungry frog will accurately strike out with its tongue to obtain a fly held in front of it. Sperry sectioned the optic nerves of normal frogs, and rotated the eyeballs by 180 degrees. After regeneration had taken place and connections were reestablished in that part of the brain which is known to receive visual fiber input (an area called the optic tectum), he tested the frog’s ability to capture a fly. The premise was that if the retinal axons had regrown into their normal sites of termination (in spite of the fact that the eye had been left-right reversed and turned upside-down), the frog should misdirect its strike by 180 degrees in any direction. If, however, the regenerating optic fibers disregarded their original target sites and established new central connections corresponding to the rotated eyes, the frog would be capable of striking accurately at the target fly. The answer was unequivocal. When the fly was held 90 degrees to the frog’s right, it struck out 90 degrees to its left, 180 degrees out of phase. The same result was obtained for all directions: left, right, up, down. The strike was always 180 degrees off, and no amount of retraining could correct it.
The conclusion from both of the above lines of evidence was clear: peripheral nerves cannot be ‘forced’ to establish connections with foreign muscles, and centrally regenerating nerves will reconnect to their original sites, in spite of radical alterations of their starting positions. A number of other rerouting studies were carried out, each of them supporting the basic premise, which Sperry termed ‘the chemoaffinity hypothesis.’ Briefly stated, the growth and development of a pre-patterned neural network is determined by a genetically controlled chemical code, and the ‘hard-wired’ nature of this network results in a specific, highly adaptive range of behavior which is unalterable by training. Other studies over the next 20 years, using neuroanatomical and neurophysiological approaches, provided confirmation of this hypothesis. Attardi and Sperry (1963) provided further, histological confirmation by showing that regenerating retinal axons in the goldfish would always regrow to their original terminal sites in the brain. Two major theoretical papers were published by Sperry in the mid-1960s (see Voneida 1997, Sperry 1965a), in which he precisely described his chemoaffinity theory. In spite of numerous attempts over the next three decades to disprove the theory, it has prevailed with only slight variations and elaboration of the basic premise. It was primarily this work, begun as a predoctoral student in 1938 and pursued through the early 1960s, that resulted in his election to the National Academy of Sciences.

A second major area of Sperry’s work deals with cortical organization related to higher cognitive function in mammals. He joined the renowned psychobiologist Karl Lashley at Harvard for his postdoctoral work, where he began a new series of studies on the role of electrical fields in neocortical function. He was not comfortable with the idea that electrical fields or waves acting in a volume conductor (the brain) are critical for neocortical processing. His first study to challenge this was published in the late 1940s (see Sperry 1947). Here he demonstrated that motor coordination in monkeys remained virtually unaffected after multiple transections of sensorimotor cortex. He later confirmed this by demonstrating that neither cortical cuts nor the insertion of numerous short-circuiting wires or insulating plates into the cortex had any adverse effect on cortical function. These studies demonstrated that perception is dependent upon vertically oriented afferent and efferent cortical axons, predating the discovery of vertically oriented cortical columns. Sperry’s premise was based on his keen understanding of neuroanatomy and neurophysiology, including the work of Santiago Ramón y Cajal and Lorente de Nó, both of whom had demonstrated the importance of radial cortical connections to cortical function. Thus, in a few carefully conducted experiments, Sperry had once again upset two major theories of brain function: the Gestalt electric field theory of perception and the reduplicated interference pattern hypothesis.
Indeed, when the renowned neuroembryologist Viktor Hamburger presented Sperry with the Ralph Gerard Award from the Society for Neuroscience in 1979, he proclaimed, ‘I know of nobody else who has disposed of cherished ideas of both his doctoral and his postdoctoral sponsor, both at that time the acknowledged leaders in their fields.’

Sperry’s third major contribution relates to the function of neuronal connections between the two brain halves (called commissures), interhemispheric communication, and lateralization of cognitive abilities in humans. Most of this work was done at Caltech, where he served as Hixon Professor of Psychobiology from 1954 to 1994. It was these studies, carried out from the mid-1950s through the 1970s (see Sperry 1981), which resulted in the Nobel Prize in Physiology or Medicine (shared with Hubel and Wiesel) in 1981. The early studies, carried out by Sperry and his students at Caltech from the mid-1950s through to the 1960s, elegantly elucidated some of the major functions of interhemispheric connections in memory transfer and eye-hand coordination. Restriction of sensory input to one brain hemisphere in animals having undergone surgical separation of the two brain halves was shown to limit the learning of various tasks to the ‘input’ hemisphere; the opposite side was capable of learning, but remained naive to those tasks until trained. Learning curves for each brain half were virtually identical; it was as if two separate brains were housed within a single cranial vault.

It was during this period that he also began his work on the lateralization of cerebral function in humans. Neurosurgeons Philip Vogel and Joseph Bogen had performed a surgical section of the major pathway connecting the two brain halves on a World War II veteran who was suffering from progressively worsening epileptic seizures. Working in collaboration with Vogel and Bogen, Sperry and his students began a series of tests directed at understanding the effects of this surgical hemispheric separation on human perception, speech, and motor control. The work involved comparisons of cognitive abilities between the two separated brain halves, demonstrating differences theretofore unrecognized. For the next 20 years, the work of Sperry and his collaborators revolutionized our understanding of brain function. Prior to this time, the right hemisphere had been virtually neglected by neuroscientists. At best, it was felt to play only a minor role in human cognition, with all major cognitive functions emanating from the left, or dominant, half. As a result of Sperry’s research, the right hemisphere gained its long overdue recognition as a full partner to the left, playing a major role in holistic, parallel, and spatial abilities. The left brain half was found to be superior to the right in tasks involving analytical, sequential, and linguistic processing. Perhaps most importantly, however, Sperry demonstrated that the combined effects of bihemispheric activity amounted to much more than the simple additive effects of the two separate hemispheres. This finding led him into the area of his fourth and final major contribution, namely, the relationship between brain and mind, science and values.
Sperry began thinking and writing about the subjective aspects of brain function as early as the late 1950s. In 1959, for example, as part of the discussion at the Josiah Macy Conference on the Central Nervous System and Behavior, he stated that he had never been entirely satisfied with the behaviorists’ interpretation of brain function in purely objective terms, with no reference whatsoever to subjective experience. He pointed out that this does not mean that we should abandon the objective approach; it simply means that we should also include subjective experience in our working models of the brain. Again, in his 1965 paper, Merging Mind, Brain, and Human Values (see Sperry 1965b), he reinforced this thesis in his declaration that conscious forces must be included in any model or description of brain function, stating that in this model, ‘… conscious mental psychic forces are recognized to be the crowning achievement of some five hundred million years or more of evolution.’

Sperry continued to think and write about the mind, consciousness, and their relationship to human values right up to the end of his life. He postulated that the mind is an emergent property of brain function and, like most emergent properties, bears little resemblance to the hundreds of millions of neurons in the central nervous system. He emphasized that while the mind is not the same as the brain, it is dependent upon it. This point is an important one, for it separates Sperry’s views from the philosophical concept of dualism. Thus brain and mind are by no means separate entities, but function rather as coexistent partners through the processes of emergence and downward causation, or feedback. Taken together, these two properties result in a continuous, highly dynamic process of change from one moment to the next, drawing constantly on previous experiences by way of learning and memory, whether they occurred in the distant past or only moments ago. He argued strongly against reductionism, which dominated psychological thinking during the first half of the twentieth century. He believed that reducing consciousness to its separate components obliterates the emergent properties of mind, with ‘… all its great power and uniqueness’ (see Sperry 1993). Sperry then elevated his concept of mind as an emergent property of brain activity to a global level, stating that: ‘The new paradigm affirms that the world we live in is driven not solely by mindless physical forces but more crucially, by subjective human values. Human values become the underlying key to world change’ (see Voneida 1997).

He urged his scientific colleagues to use the unique and powerful forces of mind and consciousness toward improving and preserving the quality of life on our planet, rather than the reverse. His urgent appeal finally began to be heard when his long-term friend and colleague, Professor Rita Levi-Montalcini, herself a Nobel Laureate, convened the International Conference of Human Duties at the University of Trieste in November 1992 to discuss these ideas in greater detail. Regrettably, Sperry’s illness prevented him from attending the first meeting, which included 10 Nobel Laureates and numerous others, representing such widespread disciplines as neurobiology, chemistry, physics, economics, and theology. This conference led to the formation of The International Council of Human Duties, which formulated ‘The Declaration of Human Duties’ in 1994. The document has been published (Trieste Declaration 1997) and forwarded to The United Nations, for its consideration as a corollary to the ‘Declaration of Human Rights.’ Sperry’s thinking about subjective experience, consciousness, the mind, and human values was crystallized in his paper ‘The Impact and Promise of the Cognitive Revolution’ (Sperry 1993), delivered at the American Psychological Association’s centennial meeting in 1993. It expressed his great hope and sincere belief that if we humans could simply be persuaded to put our collective minds together and to
use the enormous emergent power that they are capable of generating, we might very well begin the new millennium by not only improving the quality of life on the planet, but also by significantly increasing our chances for survival. See also: Brain Asymmetry; Cognitive Neuroscience; Mind–Body Dualism; Neuromuscular System
Bibliography
Attardi D G, Sperry R W 1963 Preferential selection of central pathways by regenerating optic fibers. Experimental Neurology 7: 46–64
Sperry R W 1945 The problem of central nervous reorganization after nerve regeneration and muscle transposition. Quarterly Review of Biology 20: 311–69
Sperry R W 1947 Cerebral regulation of motor coordination in monkeys following multiple transection of sensorimotor cortex. Journal of Neurophysiology 10: 275–94
Sperry R W 1950 Neural basis of the spontaneous optokinetic response produced by visual inversion. Journal of Comparative and Physiological Psychology 43: 482–9
Sperry R W 1952 Neurology and the mind-brain problem. American Scientist 40: 291–312
Sperry R W 1965a Embryogenesis and behavioral nerve nets. In: Dehaan R L, Ursprung H (eds.) Organogenesis. Holt, Rinehart and Winston, New York, pp. 161–85
Sperry R W 1965b Mind, brain and humanist values. In: Platt J R (ed.) New Views of the Nature of Man. University of Chicago Press, Chicago, pp. 71–92
Sperry R W 1981 Some effects of disconnecting the cerebral hemispheres. (Nobel lecture.) Almqvist and Wiksell, Stockholm
Sperry R W 1983 Science and Moral Priority: Merging Mind, Brain, and Human Values. Columbia University Press, New York
Sperry R W 1993 The impact and promise of the cognitive revolution. American Psychologist 48(3): 878–85
Trieste Declaration of Human Duties 1997 A Code of Ethics of Shared Responsibilities. Trieste University Press, Trieste, Italy
Voneida T J 1997 Roger Walcott Sperry, 1913–1994. The National Academy Press, Washington, DC
T. J. Voneida
Spirit Possession, Anthropology of

Since ancient times possession has been a transcendental human experience found all over the world. Bourguignon (1973) found that in 90 percent of her sample of 488 societies, one or several forms of possession or, as she called it, altered state of consciousness, existed. Even in Western societies, despite secularization, belief in spirits is regaining importance owing to new religious movements and the sweeping success of fundamentalist Christian churches. This worldwide appeal calls for a rethinking of possession in relation to modernity and the disenchantment of the world.
1. Understanding Possession

The wide variety of beliefs and cultural traditions makes it difficult to define what possession is, and what it means for different people. A minimal definition is advanced by Boddy, who defines spirit possession as ‘the hold exerted over a human being by external forces or entities more powerful than she’ (Boddy 1994, p. 407). In most possession cults these external forces are considered as spirits, god(s), or ancestor heroes. In Africa, possessing spirits are generically thought of as breath, air, or wind. Their individuality is produced in an interactive process, in which the medium, the afflicted person, and the public take part. Spirits may incarnate ancestors, cultural heroes, or former kings as well as specific ethnic or social groups, animals, material things, abstract principles, and even emotional states. They are refractions of social life and express the whole range of a society’s cultural archive. Despite this objectification they are thought of as distinct personalities with special tastes for food, clothes, jewelry, colors, rhythms, and music.

The spirits’ interaction with humans varies considerably. It is conceived of either as a form of communication through dreams, images, or voices, or as a violent attack in which illness or misfortune is inflicted on an individual, or as both. During possession, humans are no longer themselves, but ‘horses,’ ‘vessels,’ or ‘containers’ of their spirits. They are not responsible for their acts or speeches, since they are dominated by forces alien to them. Lienhardt (1961) has called this mental stage passiones, a form of being acted upon, instead of being an actor. In order to determine the identity of the afflicting spirit, ritual performances are staged which serve as diagnosis and therapy at the same time. The process of initiation is more or less formal and in some cases dependent on public accountability.

Complete unification with the spirit is achieved by way of trance or ecstasy, which may be generated through music, dance, medicine, and drugs or, as Eliade (1951) has pointed out, by way of archaic techniques such as concentration, meditation, or rapture. Trance and ecstasy are used synonymously by many scholars. Others think they represent different bodily expressions of an altered state of consciousness. Trance, a sleep-like state and form of mental dissociation, may change into ecstasy: a form of frenzy, expressed in high-pitched cries, hiccups, convulsive fits, and speaking in tongues or unknown languages, known as glossolalia. The identification with the spirit engenders a change in personhood and reverses former relations of subordination into relations of reciprocity. Spirits are accommodated and integrated into the medium’s life and that of his or her family, since most rituals include wider kin. De Heusch (1962) called this process adorcism and differentiated it from exorcism, which implies the removal of evil spirits or ghosts and is typical of demonic possession.
2. Possession, Gender, and Empowerment

Possession is not a singular event but denotes a lifelong relationship whose meaning is culturally determined, socially transmitted, and individually learned and implemented. The great majority of the possessed are women. Their dominance in possession cults has been the object of many enquiries and the answers given have differed fundamentally. Lewis (1989) discussed possession as a means for subordinate women and (marginalized) men to gain material benefits and improve their social status. Others have pointed out the close link between possession, femininity, and female identity that are playfully re-enacted, modified, or contested by transgressing moral boundaries. The heterosexual nature of most spirits rivals long-established marriages and deconstructs male sexual hegemonies. It is this subversive nature of possession that is one reason for its ambivalent and ambiguous reputation.

Possession offers a wide range of opportunities for women and men to explore new identities, negotiate power relations between the sexes, and open up new spaces for social action. These are not limited to symbolic forms of behavior but refer to very mundane forms of praxis. Women and men competitively try to benefit from the professionalization and commercialization of healing, which is more or less pronounced in different cults. They are engaged in ecological protection as ‘guardians of the land’ and exercise social and political power as cult leaders. Since colonial times spirit mediums have been feared by administrators as subversive, because they play(ed) eminent roles in wars of liberation and in the subsequent purification of war traumata. Due to the close association between possession, individual stress, and social change, possession cults are not limited to rural areas, but occur also in towns.
In Malaysia female factory workers are struck by spirit possession (see Ong 1987) as a form of resistance against capitalist exploitation, while Kapferer describes exorcism in Sri Lanka as ‘metacommunicative of the dynamics of class’ (1991, p. 51). The transcending of class boundaries is also reported for Brazil and the Caribbean where middle-class people form a substantial number of cult members (Csordas 1987). In African towns a variety of possession cults compete for adherents belonging to all walks of life, including women as well as men. Although many cults are remarkably stable over time and extend over vast regions, a frequent redrawing of boundaries between town and village is quite typical.
3. Discourses on Possession

Since the beginning of the twentieth century the fascination with the subject on the part of scholars has generated a large body of literature. Whereas European case studies played an important role in earlier publications, examples from other continents began to dominate in the twentieth century. Due to the complexity of the subject all main theoretical debates are represented in the interpretations of spirit possession. Three of the most important approaches are:

3.1 Psychological Approaches

An important issue in early studies was the truth or falseness in the experience of possession, which de Heusch (1962) classified as authentic or inauthentic possession. As a consequence, psychological interpretations loomed large from the very beginning, associating possession with multiple personalities and psychiatric symptoms. It was described as a form of pathology, due to personal stress or social change. Experiences of possession were predominantly discussed in metaphors of illness and healing which were dissociated from religious beliefs. Significant for this form of medicalization (Csordas 1987) was the neglect of its cultural and cosmological context. This reductionism was first challenged by more sophisticated case studies (Crapanzano and Garrison 1977, Obeyesekere 1981) which related psychological or psychoanalytical insights to the cultural construction of healing. A synthesis was achieved by Janzen (1992), who convincingly demonstrated that healing and religion are inseparable for the understanding of possession.

3.2 Sociological Approaches

Firth’s (1964) distinction between spirit possession and spirit mediumship opened a sociological path for interpretation. He discussed spirit possession as a form of individual affliction whereas spirit mediumship was characterized ‘as serving as an intermediary between spirit and men’ (1964, p. 247). Beattie and Middleton (1969) elaborated this differentiation in their pioneering study on spirit mediums, whose exercise of power they related to different forms of social organization. Arguing from a functionalist background, they were concerned with the sociological and political functions of spirit mediumship and stressed its closeness to social change. Lewis (1989) took up their distinction and equated central possession cults with societies’ main moral order, represented by men of social standing, while peripheral possession cults were the domain of amoral spirits and powerless women. Treating them as a weapon of the subordinate, he described these cults as a ‘thinly disguised protest movement directed against the dominant sex’ (1989, p. 26). Another important discussion in his comparative sociology of ecstasy was his fervent plea for the identification of spirit possession with shamanism. His arguments were directed against Eliade and de Heusch, who had conceived the two as inverse processes. Whereas shamans rival the gods and ascend to them, spirit mediums incarnate the spirits who descend and attack humans. Despite the impact of Lewis’ book, the instrumentalist portrayal of possession by most functionalists came increasingly under attack in the 1980s and led to another shift in methodology.
3.3 Hermeneutic Approaches

Due to the influence of Schütz and Geertz on anthropological theory during the 1980s, many authors favored a phenomenological or hermeneutic approach for a deeper understanding of the complexity of meanings acted out in these cults. Lambek, who had earlier discussed curing ceremonies as performed ‘texts,’ introduced a new topic by treating possession as a means of knowledge (1993, p. 305f). He not only differentiated between objectified knowledge in Islam and embodied knowledge in possession, but also stressed the interdependence of objectification and embodiment for each type of knowledge. These dialectics explain, among other factors, as Boddy has pointed out, the counter-hegemonic praxis of possession without inhibiting the many forms of syncretism between the two. In her view possession beliefs constitute a metalanguage, ‘a metaphoric construction on quotidian reality’ (1989, p. 270). She describes how the world of humans is paralleled by the world of the spirits, whose behavior, however, ridicules, satirizes, or reinterprets human values.

In an insightful analysis Leiris (1958) had already drawn attention to the playful and theatrical character of possession dances as arenas for negotiating identities. Fritz Kramer (1987) explored this avenue further by interpreting ‘performances of alien spirits as mimetic ethnographies’ (Behrend and Luig 1999, p. 17). The distancing gaze on the other not only helps to recognize one’s own self but also is a means of understanding past and present events. At the same time, this symbolic construction of the world is part of an ongoing appropriation of modernity that may be modified or even rejected through the intervention of the spirits. It is this creative aspect of the rituals of possession that allows for the regional or even international mediation of local practices and so produces spirits of a cosmopolitan nature. Spirit possession is thus in no way an archaic practice that runs counter to a disenchanted world, but rather acts as a metacommentary on its modernity.

See also: Mysticism; Myth in Religion; Ritual; Ritual and Symbolism, Archaeology of; Shamanism
Bibliography
Beattie J, Middleton J (eds.) 1969 Spirit Mediumship and Society in Africa. Routledge and Kegan Paul, London
Behrend H, Luig U (eds.) 1999 Spirit Possession, Modernity and Power in Africa. University of Wisconsin Press, Madison, WI
Boddy J P 1989 Wombs and Alien Spirits. Women, Men and the Zar-Cult in Northern Sudan. University of Wisconsin Press, Madison, WI
Boddy J P 1994 Spirit possession revisited: Beyond instrumentality. Annual Review of Anthropology 23: 407–34
Bourguignon E 1973 Religion, Altered States of Consciousness and Social Change. Ohio State University, Columbus, OH
Crapanzano V, Garrison V (eds.) 1977 Case Studies in Spirit Possession. Wiley, New York
Csordas T J 1987 Health and the holy in African and Afro-American spirit possession. Social Science and Medicine 24(1): 1–11
de Heusch L 1962 Cultes de possession et religions initiatiques de salut en Afrique. Annales du Centre d’Etudes des Religions
Eliade M 1951 Le chamanisme et les techniques archaïques de l’extase. Editions Payot, Paris
Firth R W 1964 Essays on Social Organization and Values. University of London, London
Janzen J M 1992 Ngoma: Discourses of Healing in Central and Southern Africa. University of California Press, Berkeley, CA
Kapferer B 1991 A Celebration of Demons. Berg Publishers, Oxford, UK
Kramer F 1987 Der rote Fes. Über Besessenheit und Kunst in Afrika. Athenäum, Frankfurt, Germany
Lambek M 1993 Knowledge and Practice in Mayotte, Local Discourse of Islam, Sorcery, and Spirit Possession. University of Toronto Press, Toronto, ON
Leiris M 1958 La possession et ses aspects théâtraux chez les Ethiopiens de Gondar. In: L’Homme: Cahiers d’Ethnologie, de Géographie et de Linguistique. Plon, Paris
Lewis I M 1989 Ecstatic Religion. An Anthropological Study of Shamanism and Spirit Possession, 2nd edn. Routledge, London
Lienhardt R-G 1961 Divinity and Experience. The Religion of the Dinka. Clarendon Press, Oxford, UK
Obeyesekere G 1981 Medusa’s Hair. An Essay on Personal Symbols and Religious Experience. Chicago University Press, Chicago
Ong A 1987 Spirits of Resistance and Capitalist Discipline: Factory Women in Malaysia. State University of New York Press, Albany, NY
U. Luig
Split Brain

Before the 1940s, the functions of the corpus callosum, estimated to have about 1,000 million axons, were unknown (Akelaitis 1944). Early in the 1950s, Roger Sperry at the University of Chicago, who was researching the anatomy of visual form perception in the cerebral cortex, proposed a surgical investigation that was to yield new insights into this tract and the brain systems of consciousness (Evarts 1990, pp. xix–xxi). Ronald Myers, a research student with Sperry, demonstrated by cutting the corpus callosum that a cat’s visual experience was divided in two, and could be transferred by it between the hemispheres (Myers and Sperry 1953). In the 1950s and 1960s Sperry’s research group, now at the California Institute of Technology (Caltech), exploited what was now called the ‘split brain’ to study visual and tactile perceptions, memories, and motor control in cats or monkeys (Sperry 1961). Studies with monkeys showed that, while ordinary pattern discrimination was completely split, less figurative visual experiences were not completely duplicated after split-brain surgery. This proved the importance of ‘motor sets’ or ‘intentions to move’ that ‘anticipate’ sensory information and that invoke undivided systems of the brain stem and cerebellum.

In 1962, the effects of commissurotomy, undertaken to arrest life-threatening epilepsy, were investigated in a human patient (Bogen 1985). Sperry directed psychological tests of consciousness, learning, and various skills after the operation (Sperry 1967). Commissurotomy to treat multifocal epilepsy in other patients made it possible to test rational consciousness in the left and right hemispheres, their similarities and differences, their convergent influence over acts, ideas, and feelings of the whole person, and the role of language (Sperry 1967, 1974, Sperry et al. 1969). Commissurotomy research confirmed Sperry’s theory of the causal potency of motor intentions, and of self-consistent conscious ideas and beliefs, and moral impulses (Sperry 1977, 1983). It stimulated the neuropsychological analysis of hemispheric differences unique to humans, and, by clarifying the special modes of processing in cortical territories, supported the ‘cognitive revolution’ of the 1960s (Sperry 1993).
Sperry’s research on the cognitive processes of the cerebral hemispheres of human beings gained him a Nobel prize in 1981 (Sperry 1982). Since the classical period of split-brain research in the 1960s and 1970s, interest in hemispheric differences has flourished, and the cognitive processes, language processes and memory of commissurotomy patients have been studied in new, detailed ways. Nevertheless, the overall picture of cerebral functions has not significantly changed (Benson and Zaidel 1985, Nebes 1990, Trevarthen 1990a, Corballis 1991, Gazzaniga 1995). Though commissurotomy was effective in limiting seizures, the operation carries risks, and is now rarely performed.
1. Split-brain Experiments with Animals
Myers applied the stereo-microscope surgery developed by Sperry to cut the crossover of visual nerves in the optic chiasm under a cat’s brain, so each eye would
Figure 1 Anatomy of the cerebral commissures. Sagittal sections of the brains of cat, monkey (rhesus macaque) and human adult, showing the corpus callosum (c), anterior commissure (a) and optic chiasm (oc). The estimated numbers of nerve axons in the corpus callosum of each species are indicated. This illustrates the enormous expansion of cortical functions in human beings, and their bilateral integration through the corpus callosum
lead to only one cerebral hemisphere (Figs. 1 and 2). In 1953, he and Sperry reported the transfer of visual memory between the hemispheres in cats with the chiasm divided, and the abolition of this transfer when the corpus callosum was cut (Myers and Sperry 1953). Outside carefully controlled test situations, the animals moved normally. Monkeys also experience independent visual perception and memory in the two brain halves after the optic chiasm and forebrain commissures (corpus callosum and anterior commissure) are cut (Fig. 1). Again, the operated animals retained normal coordination of movements and attention when free to orient, correcting for loss of awareness due to the surgery. Split-brain monkeys learn two, mutually contradictory, visual discrimination habits in the right and left half-brains, showing that they possess double and separate realms of consciousness (Trevarthen 1962, Sperry 1977). In the cat, each disconnected hemisphere directed movements of the whole body, but with monkeys, a hand on the same side as one seeing hemisphere was, at least in the immediate postsurgical period, less willing to respond and, if forced into action, was clumsy, as if blind. Crossed pathways linking each cortex to the opposite hand were superior for guiding fine, purposeful movements of the fingers. Related split-brain projects demonstrated shifts of attention between the two separated halves of the cortex and the effects of preparatory sets to move in particular ways. The results revealed the global design of the mammalian brain for awareness, learning, and voluntary action (Sperry 1961, Berlucchi 1990).
What a chiasm-callosum sectioned (split-brain) monkey sees and remembers, i.e., in which hemisphere visual perception and learning occurs when both hemispheres are presented with stimuli, depends on which hand the animal chooses for response. This is evidence that intention can instruct experience, that activities of the neocortex are directed intrinsically by purposeful impulses generated in the brain as a whole, and not just by processing the input of stimulus information at the cortical level. When a monkey learns a complex two-handed skill, the memory of how to move is located more in the hemisphere opposite the dominant hand, i.e., the hand that the individual spontaneously prefers to use for fine, consciously regulated manipulations. A split-brain monkey’s visual awareness can be turned on or off in one side of the brain by changing which hand the animal uses to manipulate the test apparatus to gain a reward. When using the right hand the monkey becomes conscious on the left side of its cerebrum and is unable to see with the right cortex. This causes the monkey to be inattentive to things in the left half of its experience, and makes it likely to imagine things that did not exist there. This clarifies ‘perceptual neglect’ and ‘illusory completion,’ phenomena known from clinical neuropsychological research on the consciousness of patients who have lost function in parts of their cerebral cortex (Trevarthen 1990b). Tests of vision in the divided brain led to the theory that the sensory systems are linked to a hierarchy of motor coordinative systems, which either gear body movements to the layout of the surrounding world, or direct the subject to select objects that have life-relevant qualities. Two active visual systems were distinguished: wide-field and part-conscious ambient vision informs whole-body locomotion and orientation; focal vision, employing the foveae to see within only a few degrees of visual angle and with precise intermittent fixation of gaze, inspects detail and identifies objects (Trevarthen 1968). The former, involving many subcortical centers, as well as parts of the parietal cerebral cortex, acts as motivator and map-reader for attention in the latter, more conscious vision of shapes, colors and patterns. The split-brain operation divides focal vision because its essential processes are confined to the cortex, but leaves ambient vision largely intact.
Figure 2 Human brain showing asymmetric functions in the cerebral hemispheres. The hemispheres are separated after commissurotomy (the ‘split-brain’ operation). Above, the two hands in the visual field, with the subject’s attention fixated on the tip of the paintbrush. (L = left half of visual field; R = right half of visual field; o.c. = optic chiasm)
2. Human Commissurotomy Patients
In 1960, Joseph Bogen, a Los Angeles neurosurgeon, saw a patient who was having progressive, life-threatening seizures as a result of head injuries sustained as a paratrooper. In the light of the research with monkeys, which showed that commissurotomy could leave essentially normal control of whole-body action and the distribution of awareness, it was decided that cutting the corpus callosum and other
interhemispheric commissures, ‘total commissurotomy,’ would reduce the epilepsy. This operation was carried out in 1962, and the epilepsy improved significantly. Sperry and Bogen, with Michael Gazzaniga, began systematic psychological tests at Caltech on the effects on perception, speech, and motor control. Instead of relying on standard neurological procedures, original tests were devised based on principles learned from split-brain animals (Sperry 1967, 1974, Sperry et al. 1969). The first report, with clinical observations made by Bogen, described the resultant left–right division of human consciousness (Gazzaniga et al. 1962). Norman Geschwind and Edith Kaplan, in Boston, described comparable ‘deconnection’ phenomena in patients with lesions that interrupted cortico-cortical links (Geschwind and Kaplan 1962). Tests confirmed that awareness of objects seen by either half of the brain (in left and right halves of the visual field), or felt in the left or right hand, was essentially normal, but, as in the earlier animal studies, when free orientation was prevented, the left and right mental realms were as separate as if in separate heads (Fig. 2). Only when the subject shifted visual fixation or passed an object from one hand to the other did the one hemisphere know what the other was perceiving. As with the monkeys, human commissurotomy subjects experienced unified ambient visual functions, but split focal vision (Trevarthen and Sperry 1973, Trevarthen 1990b). Though both hemispheres were conscious, only the left could speak or write. Surprisingly, however, given that left hemisphere lesions can produce total aphasia, the right hemisphere, though speechless, could nevertheless comprehend speech (Gazzaniga and Sperry 1967, Zaidel 1990b). It picked up information quickly from utterances made by the left half-brain, could read common words flashed to the left visual field, and could follow verbal instructions.
It was, however, conspicuously weak at calculating and in comprehending logical propositions. Jerre Levy showed that in the nonspeaking right hemisphere, classically referred to as the ‘minor’ hemisphere, mental faculties that imagine spatial relations and transformations in space were more highly developed than on the left ‘dominant’ side of the brain. Two different, complementary kinds of cognitive strategy, two kinds of consciousness, were demonstrated (Levy-Agresti and Sperry 1968) (Fig. 2). The surgically separated left hemisphere commands fluent speech and can generate and retain phonological images of speech sounds. It performs serial logical propositions such as are expressed in spoken, gestural or written language, and it is adept at the kind of combinatorial and systematic logic required for formal mathematics or score-based musical thought. The right hemisphere, in contrast, has superior mastery of the immediate ‘appositional’ experience (Bogen 1969), that is, of forms by sight, sound, or touch, and,
according to recent research, of odors. It can remember these experiences better, can distinguish and remember faces better, has richer reactions to colors, and to affective tones of the voice and its prosodic quality. It perceives more easily resemblances in topological but not Euclidean geometry. Commissurotomy patients have a color anomia, an inability to name colors, because the disconnected right hemisphere has a superior but inarticulate ability to perceive color. There is evidence that an undivided brain-stem color vision defines a dichromatic distinction between blue space and the reddish-brown solids, part of the natural layering of the conscious mind to serve different forms of action in the world (Trevarthen and Sperry 1973). Tests with commissurotomy patients support findings with brain-damaged and normal subjects showing the right half of the brain to be involved in emotional communication and the mediation of direct social interactions. The main hemispheric differences are summarized in Sperry’s Nobel Address (Sperry 1982). Jerre Levy and Colwyn Trevarthen used ‘stimulus chimeras,’ pictures showing different half objects on either side of the midline, to observe biases in cognitive activities of the hemispheres induced by what a commissurotomy patient intended to do, or by what he or she had been told to think about (Levy and Trevarthen 1976). They called the motive sets that were capable of turning on or off one or other hemispheric cognitive mechanism, sometimes independently of the different cognitive aptitudes on the two sides, ‘metacontrol.’ Similar habitual arousal biases in normal subjects cause individual differences in cognitive aptitudes.
3. Theoretical and Philosophical Implications
Commissurotomy research stimulated both the neuropsychological investigation of effects of unilateral brain injury, and tests of normal perceptual, cognitive, and motoric asymmetries. The findings have been invoked to explain why children with similar opportunities for learning may be different in educational performance, and why adults show different intellectual aptitudes and job preferences. It is likely that hemispheric biases in cognition and intellectual differences between persons are products of inherent styles in purposes and emotions, as well as of different experiences and opportunities for practice and for communication with others. In the normal brain the hemispheres work in tight partnership. The intuitive response to experience and the evocation of imagery linked to phenomenal reality by metaphor, on the one hand, and verbal analysis and prescription of aesthetic judgments, on the other, are two natural brain systems that develop as complements in the making and communicating of language, technology and art. There are large differences between individuals. Findings with commissurotomy patients have been used to support theories of ‘modular intelligence’ derived from neuropsychological and cognitive science fields and attempts have been made to apply these theories to evolutionary changes in hominid behavior and the different kinds of cognitive processing inferred. However, such models of the mind are incomplete if they do not incorporate explanations of the motivations that generate both action and awareness, and the communicable emotions that evaluate them. Cognitive operations resist efforts to locate them in parts of the brain. The skills, concepts and memories of language are stored over wide areas. Parallel-distributed processes contribute to the comprehension of language, as for facial recognition, spatial awareness and categorical understanding of objects, or for executive planning and motor sequencing. The development of cortical units involved in all these functions depends on associative cortico-cortical links, both within each hemisphere and through the corpus callosum. These supplement equally massive connections to and from cell populations beneath the cortex. A majority of humans are ‘right-handed,’ and have been since Neolithic times, preferring to take conscious control in the right hand, especially for communicating by gesture and for carefully directing implements. About 15 percent prefer the left hand. Most of us tend to move the right half of our mouths more, or earlier, in speaking, and look right as we speak. These elaborated learned skills originate in asymmetries implanted in subcortical systems before birth. It has been clear for more than 100 years that language functions are more strongly represented in the left hemisphere. Right hemisphere lesions selectively impair object and event recognition by sight or hearing and may have a devastating effect on the capacity to recognize who another person is and what their emotion may be.
Right posterior lesions can also weaken a person’s discrimination of their own expression of feeling, and lower self-awareness generally. The cognitive and performance functions of the right half of the brain are important to a person’s cultural life in a community, and to the acquisition of many skills that language may find hard to describe, and they serve to support the language-mediating functions in the left hemisphere. There has been much speculation about how recurring asymmetries in works of art might be related to the anatomy of the brain, and our common brain asymmetry does appear to have been unknowingly exploited by many painters to generate aesthetic effects. There is a school of art teachers who draw on split-brain research to guide pupils to make stronger use of the right hemisphere, claiming that this generates more dynamic, coherent and aesthetically pleasing forms, and some require right-handers to sketch unthinkingly with their left hands. We have to learn how to talk about what is beautiful in art, and it is
inherently difficult to recall what we say. Language offers a powerful tool for introspecting, for planning actions and for manipulating information in ‘phenomenal absence’ from its source, but it is limited in representational imagination. It is particularly difficult to translate back from word to picture. The work with commissurotomy patients convinced Sperry that brain-generated awareness and evaluation have a determining role over the experience obtained by learning. He concluded that science has to give consciousness an interactive, causal role in brain function. Subjective mental states explain conscious behavior and its changes, and determine what a person is and does. This view gives the higher ‘macro,’ mental, vital and social forces equal importance, in scientific explanations of life, with physics and chemistry. Sperry’s thinking about human consciousness and the importance of values and emotion is summarized in Nobel Prize Conversations (Sperry et al. 1985).
See also: Brain Asymmetry; Brain Damage: Neuropsychological Rehabilitation; Conscious and Unconscious Processes in Cognition; Consciousness and Sensation: Philosophical Aspects; Consciousness, Cognitive Psychology of; Consciousness, Neural Basis of; Epilepsy
Bibliography
Akelaitis A J 1944 A study of gnosis, praxis, and language following section of the corpus callosum and anterior commissure. Journal of Neurosurgery 1: 94–102
Benson D F, Zaidel E 1985 The Dual Brain. Guilford Press, New York
Berlucchi G 1990 Commissurotomy studies in animals. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology, Vol. 4. Elsevier, Amsterdam, New York, Oxford
Bogen J E 1969 The other side of the brain, II. An appositional mind. Bull. Los Angeles Neurol. Soc. 34: 135–62
Bogen J E 1985 The callosal syndromes. In: Heilman K M, Valenstein R (eds.) Clinical Neuropsychology, 2nd edn. Oxford University Press, New York
Corballis M C 1991 The Lopsided Ape: Evolution of the Generative Mind. Oxford University Press, New York
Edwards B 1979 Drawing on the Right Side of the Brain. J-P. Tarcher, Los Angeles
Evarts E V 1990 Foreword: Coordination of movement as a key to higher brain function: Roger W. Sperry’s contributions from 1939 to 1952. In: Trevarthen C (ed.) Brain Circuits and Functions of the Mind: Essays in Honor of Roger W. Sperry. Cambridge University Press, New York
Gardner H 1982 Art, Mind and Brain. Basic Books, New York
Gazzaniga M S 1970 The Bisected Brain. Appleton-Century-Crofts, New York
Gazzaniga M S 1989 Organization of the human brain. Science 245: 947–52
Gazzaniga M S 1995 Principles of human brain organization derived from split-brain studies. Neuron 14: 217–28
Gazzaniga M S, Bogen J E, Sperry R W 1962 Some functional effects of sectioning the cerebral commissures in man. Proceedings of the National Academy of Sciences USA 48: 1765–9
Gazzaniga M S, Sperry R W 1967 Language after section of the cerebral commissures. Brain 90: 131–48
Geschwind N, Kaplan E 1962 A human cerebral deconnection syndrome. Neurology 12: 675–85
Innocenti G M 1986 General organization of the callosal connections in the cerebral cortex. In: Jones E G, Peters A (eds.) Cerebral Cortex, Vol. 5. Plenum, New York
Levy J 1988 Cerebral asymmetry and aesthetic experience. In: Rentschler I, Herzberger B, Epstein D (eds.) Beauty and the Brain. Biological Aspects of Aesthetics. Birkhauser Verlag, Basel, Boston, Berlin
Levy J, Trevarthen C 1976 Metacontrol of hemispheric function in human split-brain patients. Journal of Experimental Psychology: Human Perception and Performance 2: 299–312
Levy-Agresti J, Sperry R W 1968 Differential perceptual capacities in major and minor hemispheres. Proceedings of the National Academy of Sciences USA 61: 1151
Myers R E, Sperry R W 1953 Interocular transfer of a visual form discrimination habit in cats after section of the optic chiasm and corpus callosum. The Anatomical Record 115: 351–2
Nebes R D (ed.) 1990 The commissurotomised brain. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology, Vol. 4 Sect. 7. Elsevier, Amsterdam, New York, Oxford
Sperry R W 1961 Cerebral organization and behavior. Science 133: 1749–57
Sperry R W 1967 Mental unity following surgical disconnection of the cerebral hemispheres. The Harvey Lectures 62: 293–323
Sperry R W 1974 Lateral specialization in the surgically separated hemispheres. In: Schmitt F O, Worden F G (eds.) The Neurosciences. Third Study Program. MIT Press, Cambridge, MA
Sperry R W 1977 Forebrain commissurotomy and conscious awareness. The Journal of Medicine and Philosophy 2(2): 101–26
Sperry R W 1982 Some effects of disconnecting the cerebral hemispheres (Nobel lecture). Science 217: 1223–6
Sperry R W 1983 Science and Moral Priority. Columbia University Press, New York
Sperry R W 1993 The impact and promise of the cognitive revolution. American Psychologist 48(3): 878–85
Sperry R W, Eccles J, Prigogine I, Josephson B, Cousins N 1985 Nobel Prize Conversations. Saybrook, Dallas, TX
Sperry R W, Gazzaniga M S, Bogen J E 1969 Interhemispheric relationships: The neocortical commissures; syndromes of hemisphere disconnection. In: Vinken P J, Bruyn G W (eds.) Handbook of Clinical Neurology, Vol. 4. North Holland, Amsterdam
Stephan M 1990 A Transformational Theory of Aesthetics. Routledge, London
Trevarthen C 1962 Double visual learning in split-brain monkeys. Science 136: 258–9
Trevarthen C 1968 Two mechanisms of vision in primates. Psychologische Forschung 31: 299–337
Trevarthen C 1987 Split brain and the mind. In: Gregory R L, Zangwill O L (eds.) Oxford Companion to the Mind. Oxford University Press, Oxford, New York
Trevarthen C 1990a Integrative functions of the cerebral commissures. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology, Vol. 4. Elsevier (Biomedical Division), Amsterdam
Trevarthen C (ed.) 1990b Brain Circuits and Functions of the Mind: Essays in Honor of Roger W. Sperry. Cambridge University Press, New York
Trevarthen C 1999 Roger W. Sperry. In: Wilson R, Keil F (eds.) The MIT Encyclopedia of Cognitive Sciences. MIT Press, Cambridge, MA
Trevarthen C, Sperry R W 1973 Perceptual unity of the ambient visual field in human commissurotomy patients. Brain 96: 547–70
Zaidel E 1990a Language functions in the two hemispheres following complete commissurotomy and hemispherectomy. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology, Vol. 4. Elsevier, Amsterdam, New York, Oxford
Zaidel D 1990b Memory and spatial cognition following commissurotomy. In: Boller F, Grafman J (eds.) Handbook of Neuropsychology, Vol. 4. Elsevier, Amsterdam, New York, Oxford
C. Trevarthen
Sport and Exercise Psychology: Overview
Though the beginnings of sport psychological research can be traced back to the end of the nineteenth century, it was not until the 1960s that sport psychology emerged as an important and recognized academic and applied discipline (cf. Singer et al. 1993, ch. 1). The foundation of the International Society of Sport Psychology (ISSP) in 1965 and of several national organizations in Europe and in North America marks this period as the start of systematic scientific research and practice in sport psychology. As elite sport grew in importance after World War II, systematic scientific research on it and support for it gained significance. It became obvious that psychological processes were important aspects of athletic peak performance. Therefore, sport psychology was in this period mainly concerned with research on elite athletes and on performance enhancement in competitive sport. A distinction between the academic or research-oriented discipline of sport psychology and the discipline of applied sport psychology emerged in the 1980s, when the Association for the Advancement of Applied Sport Psychology (AAASP) was formed in 1985 and a growing number of books and journals dedicated to helping elite athletes reach their full potential were published worldwide. At the beginning of the twenty-first century it seems clear that academic and applied sport psychology are interdependent and profit from each other. Scientifically based methods of intervention are used in applied settings and research is stimulated by suggestions from practitioners. During the 1980s another important development emerged. More and more researchers became involved in exercise psychology. In contrast to sport psychology, which is focused on performance enhancement and competitive sports, exercise psychology is concerned with psychological aspects of primarily noncompetitive leisure, fitness, and health-oriented physical activity. This development is reflected in the renaming of the Journal of Sport Psychology, first published in 1979, as the Journal of Sport and Exercise Psychology in 1988. Sport and exercise psychology, as it is called nowadays, deals with the psychological factors that influence sport performance and exercise behavior, including motor learning and motor development, as well as with the impact of sport and exercise on psychological growth and development, such as personality, health, and well-being. The present overview will focus on individual, social, and motivational influences on sport and exercise behavior as well as on short- and long-term effects of sport and exercise. The topics of performance enhancement and of developmental sport psychology are dealt with in several other articles (see Sport Psychology: Performance Enhancement; Developmental Sport Psychology).
1. Individual, Social, and Motivational Influences on Sport and Exercise Behavior
1.1 Individual Determinants of Sport and Exercise
Research on personality differences in sport was greatly influenced by the paradigms of differential psychology. The trait approach was most popular until the 1970s. Numerous studies looked for differences in personality variables between athletes and nonathletes as well as between athletes of different performance levels. Talent detection via personality tests and performance enhancement through personality training were the main impetus of this line of research. Taken together the results are rather inconclusive and show only correlational but no causal relationships. Several methodological problems led to these disappointing findings, such as scales that were not adapted to the sports context and research designs that were suboptimal for drawing causal conclusions. Even the well-known finding from the 1960s showing a healthier personality profile of successful elite athletes than of nonathletes did not prove valid in the long run (Rowley et al. 1995). The interactional approach that followed is more promising and is still popular. Scales were developed within the sports context that distinguish between state and trait aspects of variables such as competitive anxiety, sport confidence, or cognitive strategies. Personality research in sport at the beginning of the twenty-first century is more concerned with either performance enhancement in sport or with effects of exercise on personality variables such as self-esteem (cf. Sect. 2.2).
1.2 Social Processes in Sport and Exercise
The beginnings of research on productivity in sport teams and about spectators’ influence on performance can be traced back to the end of the nineteenth
century. Sport and exercise are often performed in groups or teams, and therefore social processes are deemed highly significant (Carron and Hausenblas 1998). Team cohesion, leadership, and the effects of group composition on individual and group outcomes are the most important topics. Whereas in the beginning cohesion was regarded as interpersonal attraction between team members, i.e., social cohesion, at the beginning of the twenty-first century cohesion in sport teams is conceptualized and assessed as social and task cohesion. The positive relationship of group cohesion and performance has been well-documented for sport teams and is mainly based on task cohesion. This relationship is quite robust, and exists irrespective of the group task. For example, highly cohesive handball, rowing, or golf teams alike are more successful than low cohesive teams. Less is known about the causal paths of the relationship. There is convincing evidence that success fosters cohesion more than vice versa (Mullen and Copper 1994). Looking at the relationship of cohesion and adherence to sport and exercise, reliable differences can be found between dropouts, who perceive lower cohesion, and adherers, who perceive higher cohesion, in exercise groups and in sport teams. Perceived high cohesion may lead to more satisfaction of the group members and their overall attraction to the group, whereas perceived low cohesion reduces the motivation to stay in the group. This result is of high importance for any setting, be it exercise, recreational, or competitive sport. The predominant approaches to leadership behavior of coaches in sport are the mediational model of Smoll and Smith (1989) and the multidimensional model of Chelladurai (see Chelladurai and Riemer 1998). The mediational model was developed in youth sport and postulates that the effects of the coaches’ behaviors are mediated by the meaning that players attribute to them.
Smoll and Smith (1989) could demonstrate that coaches who give general technical instruction, who prefer positive feedback, and who give mistake-contingent instruction and encouragement are evaluated most highly by their athletes. Training in these skills led to more positive evaluation of youth sport coaches, to fewer dropouts from their groups, and to an increase in the athletes’ self-esteem and in team cohesion compared to nontrained youth sport coaches. However, the coaches’ behaviors had no direct impact on the athletes’ performance. The multidimensional model of leadership (MDL) by Packianathan Chelladurai postulates that individual and group outcomes (satisfaction and performance) are dependent on the congruence among required, actual, and preferred leader behavior. Preferred and perceived leader behavior is measured with a five-dimensional questionnaire called the Leadership Scale for Sports (LSS). This instrument has been tested and validated in quite a few studies with athletes and coaches in various sport contexts and countries, in North America, Asia, and Europe. Research was
primarily focused on the perception of coach behavior by adult and high-level athletes. The relationship to outcome measures is inconclusive. Future research has to focus on this relationship between coaches’ behavior and outcome measures such as performance and satisfaction in the group.
1.3 Motivation
Considering motivation as a term for explaining intensity and direction of an individual’s effort, motivation research in the field of sport and exercise is on the one hand concerned with explaining why and how certain people are attracted to which kind of sport or exercise and why others are not. On the other hand, strategies to enhance motivation, such as mastery training, attributional retraining, or goal setting, have been developed and tested (Roberts 1992). With regard to the first issue, achievement and competition have been a traditional focus of motivation research in sport. Three theoretical perspectives were most influential: need achievement theory, attribution theory and social-cognitive approaches, mainly goal orientation theory. Starting with need achievement theory in the 1950s and 1960s, high and low achievers in sport were found to differ in their dispositional orientations (motive to achieve success or to avoid failure), their task choice, and their emotional reactions to success and failure, as was predicted by the theory. The attributional model of achievement motivation of Bernard Weiner has led to a considerable number of studies since the 1970s (Singer et al. 1993, ch. 19). They showed that the initial attributional framework of ability, task, effort, and luck attributions needs to be extended to more sport-specific attributions such as opponents’ quality or audience support. Self-serving biases in attributions were also found in the sports context, whereas gender differences were rarely found. Reattribution training for low achievers was successfully applied in sport settings such as physical education, elite sport, and exercise. Another important topic was the relationship of attributions and emotions in sport. Though an attributional model would predict that emotions after success and failure depend on attributions, emotions of athletes seem to depend more on outcome than on attributions (Willimczik and Rethorst 1995).
Since the 1980s social cognitive approaches have gained attention in sport psychology. For instance, the goal perspective approach of John Nicholls was adapted to sport settings by Joan Duda and tested in numerous studies (cf. Duda 1992). Goal perspective theory assumes that in achievement contexts like sport and exercise individuals are motivated because they want to demonstrate ability or competence. This can be done with two types of goal, which are conceptualized as dispositions. The ego or outcome orientation is easily aroused in competitive situations like sport. Success or failure is dependent on the performance of others and perceptions of ability are referenced to the ability of others. In contrast, the task goal or mastery goal focuses on individual comparison. Perceptions of ability are self-referenced. Task orientation is typically aroused in situations where the focus is on learning or mastery. Results show that task orientation is correlated with intrinsic motivation, with more enjoyment in sport, with persistence in the face of failure, and with the selection of moderately difficult tasks. Task-oriented individuals are more like high achievers. In contrast, ego-oriented individuals, especially when low in perceived competence, are more at risk to drop out of sports, they are more prone to develop fear of failure, and they emphasize fairness in sport less than task-oriented individuals. Fostering task orientation more than ego orientation is therefore highly recommended. Goal perspective theory has also focused on socialization contexts and their impact on the development of one or the other kind of goal. It is assumed that the socialization context induces a motivational climate where either competition or mastery is emphasized. A competitive climate may enhance ego orientation, whereas a mastery climate enhances task orientation. Though achievement motivation is clearly important it is not the sole reason for sport and exercise participation. For many youngsters and adults affiliation motivation, pleasure seeking, physical challenge, adventure, and risk motivation, as well as the wish to shape an attractive body and to keep one’s ‘fitness’ by means of exercise play an equally or even more important role. Sport and exercise become integrated into a person’s lifestyle, and the wish to express a certain lifestyle is an important motivation to practice exercise and sport. A lifestyle theoretical perspective in sport and exercise psychology (Dishman 1994, Abele et al.
1997) posits that sport is an important aspect of a person’s general way of living. It is a means both to pursue different motives (affiliation, achievement, risk, fun, challenge, health), and to express certain attitudes (toward youth, health, physical attractiveness) at the same time. With increasing knowledge about the physical and mental health benefits of exercise (see Sect. 2) psychology becomes increasingly concerned with exercise adherence (Dishman 1994, Abele et al. 1997). Knowing that more than half of the adults in Western countries do not exercise enough from a health-oriented point of view and knowing that more than half of the participants who have started an exercise program for health reasons quit it again after a short time points to the importance of research on the determinants of exercise adherence and on interventions aimed at enhancing adherence and preventing dropout. Whereas early models have stated that perceived vulnerability or health protection motivation are most important, research has shown that perceived vulnerability does not lead to long-term
adherence. Rather, positive factors like exercise-specific self-efficacy beliefs, social support in the person’s immediate environment, perceived group cohesion (see Sect. 1.2), and a suitable program and program climate are important. Participation motivation has to be understood not as a state but rather as a process undergoing at least three phases (inactivity, intention to participate and begin participation, adherence) that need specific motivational support.
2. Short- and Long-term Effects of Sport and Exercise
2.1 Exercise and Physical Health
Because physical health and psychological well-being are interrelated, it is appropriate to briefly consider the effects of sport and exercise on physical health (cf. Abele et al. 1997). These may be differentiated into cardiovascular (reduction of heart rate, decrease of diastolic blood pressure, improvement of blood circulation); hemodynamic (improvement of the blood flow characteristics); metabolic (stronger and more efficient musculature, improvement of the bone substance); and endocrinological effects (increase of catecholamines and cortisol, etc.). Immunological effects also exist; they are, however, only understood with respect to elite sports, but not yet with respect to exercise. Another distinction relates to exercise-induced changes in endurance, vigor, and coordination skills. All these changes also have an impact on psychological well-being. However, besides studying the beneficial effects of sport and exercise on physical and psychological health, sport and exercise psychology is also concerned with high-risk sports, their determinants, and their possibly dangerous effects.
2.2 Exercise and Psychological Well-being
Studies on exercise and psychological well-being have been conducted both with nonclinical and with clinical groups. Questions addressed were the role of exercise in the reduction of anxiety and depression, exercise and mood changes, positive and negative addiction to exercise, and the relation among exercise, personality changes, and cognitive functioning. Usually the impact of aerobic exercise like walking or running or fitness training, but sometimes also the impact of more competitive games like tennis, volleyball, or soccer, was analyzed. Several studies show that participants experience both short-term anxiety reduction and long-term depression reduction if they exercise regularly at least twice a week and for a time period of not less than six weeks. The longer they continue the more these negative psychological states are reduced. Running, for instance, can be an inexpensive, time-efficient adjunct to traditional psychotherapies (Weinberg and Gould 1999). Nonclinical participants usually show the ‘iceberg profile,’ which means that compared to their preexercise mood their postexercise mood is characterized by elevated vigor, activity, and elation and reduced levels of anxiety, depression, confusion, fatigue, and anger. It has been argued that fitness activities versus sport games may serve different functions in the regulation of mood. Fitness activities may primarily lead to equilibration effects. A negatively experienced tension state in advance of the exercise is transformed into a positively experienced relaxation state after it. More competitive games, on the other hand, may lead to disequilibration effects. A positively experienced ‘prestart’ tension state is transformed into a positively experienced postgame relaxation state. Both short-term equilibration and disequilibration effects serve the participants’ long-term well-being (Abele and Brehm 1993). The causes of these mood changes are multifaceted and relate both to biophysiological processes (hormonal changes, increased blood circulation, muscle strengthening, and relaxation; see above) and to psychological processes. Important psychological processes may be distraction and time-out. During sport and exercise the participant is distracted from stress and urgent problems, and the time-out may lead to free-floating attention and to remote associations which, in turn, may help to find new and creative problem-solving strategies. Exercise can also become an addiction. As a positive addiction it is a regular activity integrated as one important and valuable aspect into the person’s life. As a negative addiction it gradually becomes overwhelming and the person cannot stop doing it at the cost of neglecting other important life tasks. Negative exercise addiction needs psychological help and intervention.
Personality changes and changes in cognitive functioning have also been studied. Physical self-concept and self-esteem are positively influenced by regular exercise (Alfermann and Stoll 2000, Sonstroem 1997). It has also been posited that cognitive functioning may be enhanced by physical exercise. Though the idea that motor development is an important determinant of intellectual development in childhood is appealing, the research findings in adulthood show a correlational but not necessarily a causal relationship between exercise and cognitive functioning (Etnier et al. 1997).
3. Conclusions
Since the early 1960s sport and exercise psychology has undergone a rapid development both with respect to its theories and methods and with respect to its aims and topics. Regarding theoretical models, instead of directly taking over models from general psychology
sport and exercise psychology now adapts and modifies these models contingent on the sport context and also builds new and more specific models. Such a context-specificity of theory construction leads to more specific hypotheses and may also help to refine more general theoretical models in psychology. Regarding methodological advances, similar developments can be observed. The measures used become more refined and more specific to the respective context. The general methodological approaches both in experimental and in field research become more refined. Most important, sport and exercise psychology has broadened its topics and perspective from a predominant concern with performance of elite athletes to a concern with sport and exercise in everyday life, too. The psychological analysis of sport and exercise in schools, in recreational settings, in health settings, and in further sports-related institutions has become as important as the analysis of psychological processes in elite sports. Exercise adherence as a topic, for instance, becomes more relevant the more people acquire health problems because of lack of exercise. Research on sports and lifestyle, as another example, must receive increased attention, not only because ‘trend sports’ are an important economic factor in Western societies, but also because people on average have more leisure time, and sport is one of the most important leisure activities. Research on psychological processes in physical education as a school subject, as a final example, may be directed at the impact of sport and exercise in schools on an individual’s lifetime exercise behavior, as well as at its impact on tension reduction and aggressive behaviors in schools.
See also: Developmental Sport Psychology; Health Behavior: Psychosocial Theories; Health Behaviors; Health Behaviors, Assessment of; Health Psychology; Sport, Anthropology of; Sport Psychology: Training and Supervision; Sport, Sociology of; Sports as Expertise, Psychology of; Youth Sports, Psychology of
Bibliography
Abele A, Brehm W 1993 Mood effects of exercise versus sport games: Findings and implications for well-being and health. International Review of Health Psychology 2: 53–80
Abele A E, Brehm W, Pahmeier I 1997 Sportliche Aktivität als gesundheitsbezogenes Handeln. In: Schwarzer R (ed.) Gesundheitpsychologie. Hogrefe, Göttingen, Germany, pp. 117–50
Alfermann D, Stoll O 2000 Effects of physical exercise on self-concept and well-being. International Journal of Sport Psychology 31: 47–65
Carron A V, Hausenblas H A 1998 Group Dynamics in Sport, 2nd edn. Fitness Information Technology, Morgantown, WV
Chelladurai P, Riemer H A 1998 Measurement of leadership in sport. In: Duda J L (ed.) Advances in Sport and Exercise Psychology Measurement. Fitness Information Technology, Morgantown, WV, pp. 227–53
Dishman R K (ed.) 1994 Advances in Exercise Adherence. Human Kinetics, Champaign, IL
Duda J L 1992 Motivation in sport settings. In: Roberts G C (ed.) Motivation in Sport and Exercise. Human Kinetics, Champaign, IL, pp. 57–91
Etnier J L, Salazar W, Landers D M, Petruzzello S J, Han M, Nowell P 1997 The influence of physical fitness and exercise upon cognitive functioning. Journal of Sport and Exercise Psychology 19: 249–77
Mullen B, Copper C 1994 The relation between group cohesiveness and performance: An integration. Psychological Bulletin 115: 210–27
Roberts G C (ed.) 1992 Motivation in Sport and Exercise. Human Kinetics, Champaign, IL
Rowley A J, Landers D M, Kyllo L B, Etnier J L 1995 Does the iceberg profile discriminate between successful and less successful athletes? A meta-analysis. Journal of Sport and Exercise Psychology 17: 185–99
Singer R N, Murphey M, Tennant L K (eds.) 1993 Handbook of Research on Sport Psychology. Macmillan, New York
Smoll F L, Smith R E 1989 Leadership behaviors in sport. Journal of Applied Psychology 18: 1522–51
Sonstroem R 1997 The physical self-system: a mediator of exercise and self-esteem. In: Fox K R (ed.) The Physical Self. From Motivation to Well-Being. Human Kinetics, Champaign, IL, pp. 3–26
Weinberg R S, Gould D 1999 Foundations of Sport and Exercise Psychology, 2nd edn. Human Kinetics, Champaign, IL
Willimczik K, Rethorst S 1995 Cognitions and emotions in sport achievement situations. In: Biddle S H (ed.) European Perspectives on Exercise and Sport Psychology. Human Kinetics, Champaign, IL, pp. 218–44
A. E. Abele and D. Alfermann
Sport, Anthropology of
Theory production in anthropology has been characterized by the absence of sport as a field of analysis for understanding rituals, complex processes of identity construction, rites de passage, and bodily performances. For instance, Mauss’ (1935) early piece on body techniques does not mention sport arenas and performances as privileged loci for the development of an anthropology of the body. Several reasons can be advanced as an explanation for this inattention to sport. The main explanation lies perhaps in the accepted idea that sports and games, because they produce asymmetry between winners and losers, flourish only in competitive industrial societies. In rituals, asymmetry is postulated in advance (initiated and uninitiated, or sacred and profane) and the ‘game’ consists in making all the participants pass to the winning side by means of ritual participation (Lévi-Strauss 1966, p. 32). Therefore, almost by definition, anthropology, as the science of the nonmodern and ‘primitive,’ has in the past excluded the study of sports and competitive games because they were correctly perceived as central features of modernity.
The attention to the importance of sport in social theory is to a large extent related to the influence of the late Norbert Elias. In his historical sociological approach, sport was defined as a key area in the civilizing process of European societies. This process was characterized by continuous state control of the legitimate use of force, the development of social organizations to reduce open conflict among groups, and the elaboration of codes for social behavior oriented towards the exercise of individual self-control. Sport became a typical activity of leisure that, historically, fulfilled some of the required functions for the consolidation of the civilizing process. Clearly the exercise of self-control is central to sport. For Elias, one of the essential problems of many sports is how to reconcile two contradictory functions: ‘the pleasurable de-controlling of human feelings and the full evocation of enjoyable excitement on the one hand, and on the other the maintenance of a set of checks to keep the pleasantly de-controlled emotions under control’ (Elias 1986, p. 49). Modern sport is regulated by rules that assure equality among the contenders, make possible the maximization of inner pleasures and the relaxation of individual tensions, and contain the exercise of physical violence. Through Elias, sport becomes a privileged field for the analysis of individual and social tensions in modern societies. In the critical tradition of social theory, best represented by Bourdieu (1984), the practice of sports has been seen as an efficient state, and bourgeois, mechanism to indoctrinate youth with values of sexism, nationalism, fanaticism, irrational violence, the cult of performance and competition, the cult of idols, and the uncritical acceptance of the values of capitalism. Bourdieu maintains that one of the perverse effects of sport is that it transforms the spectators (the consumers of the performances of the athletes) into a caricature of militancy.
This is because their participation is imaginary, and their knowledge has been appropriated by the sports coaches, journalists, and bureaucrats (Bourdieu 1984, p. 185). It is clear that the anthropological perspective of Lévi-Strauss, the critical tradition of social theory of Bourdieu, and the historical sociology of Elias have something in common: sport is perceived as an important field of analysis for achieving a better understanding of the appropriate or contradictory functioning of modern societies. In the arena of sport, individual self-fulfillment is related to games producing winners and losers, to exalted nationalism in an age of international competition, to passivity and ideological domination, and sometimes to an abnormal quest for excitement and violence. No one will deny that sport is associated with these practices and values, but recent anthropological research has shown that it is much more. In the new type of research, the key assumption is that sport is a ritual and a game at the same time, and as such a cultural construction that makes symbolic communication among its participants possible. The content of the communication may vary according to the degree of formality, rigidity, concentration of meaning, and redundancy. However, the ritual is also a performance in the sense that saying something is also doing something; hence ritual action makes possible a connection between the meanings and values mobilized by the participants. In every ritual, various types of participants can be distinguished: the experts in knowledge, the central participants or players, and the peripheral participants or audience. This approximation permits the consideration of a range of discourses, practices, and identities produced by various kinds of participants. A historical analysis of the impact of sports situates these narratives and practices in specific periods and places, and enables us to follow their transformations. The empirical analysis of sport can contribute to general theories of the meaning of sport in identity construction or of the significance of bodily exploits for expanding the scope of theories of ritual and cultural performances. MacAloon’s (1984) analysis of spectacle, ritual, festival, and games in Olympic competitions exemplifies a longstanding anthropological concern with ritual. In this direction, the anthropological analysis of sport is not a reflection of society, but a means of reflecting on society. Sport, understood as a central and not marginal activity, is a fruitful entrance for capturing important cultural, historical, and social processes. Sports represent a complex space for the display of identities as well as an arena for challenging dominant social and moral codes, contrary to what Bourdieu maintained.
1. Gender and Nationhood: Sports and Politics
It has been argued that a nation is an ‘imagined political community’: its members share a sovereign boundary and have a strong feeling of communion. Hence, the ideology of nationalism should be integrated in social practices that can create an image of ‘the people’ having ‘something’ in common. DaMatta (1982) has argued that Brazil is a society articulated by the sharp division between the ‘home’ and the ‘street,’ and between the family (a system of hierarchical social relations and persons) and the market and free individuals. The role of soccer is privileged because the personalized social world of the home and the impersonal universe of the street are combined in a public ritual. Impersonal rules regulating the game make possible the expression of individual qualities: ‘soccer, in the Brazilian society, is a source of individualization … much more than an instrument of collectivization at the personal level’ (DaMatta 1982, p. 27). The male players escape from the fate of class or race and construct their own successful biographies in
an open arena. Soccer makes it possible to experience equality and freedom, important values of nationhood, in hierarchical contexts. In order to triumph, soccer players (like samba dancers) must have jogo de cintura, the capacity to use the body to provoke confusion and fascination in the public and in their adversaries. The European identification of a Brazilian style of playing relating football and samba, manifested in the expression ‘samba-football,’ is therefore not an arbitrary creation: it is rooted in Brazilian self-imagery (Leite Lopes 1997). This identification establishes important cultural differences: the existence and development of European styles of playing are not linked to music and dance. The case of Brazil is not unique. In Cuba the ritual of playing baseball, the national sport, has been associated with dancing: ‘each baseball match culminated in a magnificent dinner and dancing, for which orchestras were hired for playing danzones’ (González Echeverría 1994, p. 74). As in the case of soccer, baseball was perceived as a democratic and modern game that made it possible for young male players of modest origins to experience social mobility. In Argentina, the arrival in the 1900s of millions of European immigrants in Buenos Aires and its hinterlands made possible the rapid expansion of football and tango. Leisure was transformed; the practice of different sports was accompanied by the crystallization of the choreography of tango and the formation of new orchestras. Tango was first exported to Europe in the 1910s, then to the rest of Latin America and North America, becoming a ‘universal’ dance to express eroticism and modernity. Argentinian soccer players were exported to Europe as early as the 1920s. Argentinian performing bodies (dancers, musicians, and soccer players) became highly visible in the world arena of leisure (Archetti 1999).
It is interesting that both baseball and soccer, two sports originating outside the Latin American countries, were integrated into the construction of the ‘national’ masculinity, as was cricket in the West Indies, Pakistan, or India. Rules, esthetics, and style in cricket have been transformed into something that is no longer typically English because the hegemonic system of values its practice evokes is questioned (Appadurai 1995). The rich ethnography of Alter (1992) shows that wrestling as a genuine Indian sport provides a context within which concerns with fitness, health, and strength articulate with various forms of nationalism. In the vision of many wrestlers, the various forms of self-discipline associated with the sport provide the moral coordinates around which a new India is imagined and embodied. Thus, cricket and wrestling enter into a complex relation between modernity, masculinity, decolonization, and nationalism. The practice of sports is ‘gendered’ and should be understood as expressing and articulating gender
differences. MacClancy’s (1996) analysis of female bullfighting in Spain is a case in point, showing how anthropologists have unproblematically accepted bullfighting as exclusively male, despite the fact that for over two centuries women have been bullfighters, taking the same risk and bearing the same scars as the male matadores. Brownell (1995) has demonstrated that the international success of Chinese sportswomen has allowed them to represent the nation in a way that is unlike Western women’s representations of their nations. The ability to endure pain and sacrifice as feminine values is part of the national discourse transforming Chinese women into national models. Sánchez León (1993) has analyzed the gendered meaning of the two most popular sports in Peru. Football, a male practice, is related to freedom and improvisation, whereas volleyball, a female practice, displays the sense of responsibility and discipline of Peruvian women.
2. The Production and Adoration of the Heroes
The world of sport heroes is a world of creative enchantments because momentarily, like flashes of intense light, athletes become mythical icons representing mastery over mortality. The heroic sports figures must be seen in their cultural context in order to understand their communal impact (Holt et al. 1996). Oriard (1993) has shown that the athlete-hero in America embodies the ‘land of opportunity’ and is therefore the most attractive self-made man. The hero sustains the American Dream, personifies the democratic ideal of open accessibility to prestige and allows all citizens to share his glory. In relation to American football, Oriard has demonstrated that in the historical production of a restricted heroic masculinity certain competing qualities were at stake. On the one side, simple, unambiguous physical force was emphasized, what he calls an antimodern value. On the other side, the required ‘manly’ qualities were temperance, patience, self-denial, and self-control, which lie behind the modern and technocratic aspects of modernity (Oriard 1993, p. 192). Oriard asserts that football’s cultural power derived in large part from this collision of the modern and the antimodern, one aspect of which was this dialectical embrace of competing notions of manliness. In many cases, sporting heroes are a source of collective pride in both national and supranational settings. The case of Maradona, an Argentinian soccer player, is a clear illustration of ‘transnational’ heroism. He was a god-like hero in Naples while he was playing for the Naples Football Club and in Argentina while playing on the national team. His heroism was related to mythological figures in the respective local traditions: in Naples he was identified with the mythical figure of the scugnizzo and in Argentina with the pibe. These prototypes have something in common: they are human beings endowed with grace, elegance, creativity, and freedom (Archetti 1997). The cult of sport heroes provides a social model to perceive paradoxes and dramas in society, and to recognize and question key values. When Garrincha, the extraordinary right winger of the Brazilian soccer team which won the World Cup in 1958 and 1962, died an alcoholic and in extreme poverty in 1983, the entire nation was shocked. His funeral became a part of national history. The crowds following his casket in silence in the streets of Rio de Janeiro were participants in a social drama, showing the most profound respect to a son of the working class who had suffered from injustice, maladjustment, and prejudice. The nation wondered how it had been possible to abandon so heartlessly the player defined as ‘the joy of the people’ (Leite Lopes and Maresca 1992).
3. Engaged Audiences and Supporters
Bourdieu’s image of passive supporters and alienated audiences has been contradicted by ethnographic findings. Neves Flores (1982) has demonstrated that in the arena of soccer the supporters ‘narrate’ social stories and even produce what he calls ‘ideologies of transformation.’ The popular clubs, such as Flamengo in Rio de Janeiro and Corinthians in São Paulo, are seen as courageous, with lots of stamina and will to win. The ‘elite’ clubs, such as Fluminense in Rio de Janeiro, are machista and show discipline and fair play. The clubs where sons of immigrants are most represented are usually perceived as less national than the others. Neves Flores argues that the organized activity of the supporters displays the class and ethnic divisions in society and, in this sense, acts against the dominant values in Brazilian society of national equality, populist integration, and political paternalism. Bromberger (1995) has shown that the narrative of the supporters of AC Torino, a soccer club from northern Italy, emphasizes local suffering and extreme loyalty, while the narrative of the supporters of Juventus, the ‘other’ club in the city, is based on continuous victory, international accomplishments, discipline, and efficiency. The history of Juventus is at the same time the history of Fiat, the Italian company owning it. The fans assert, in a creative way, the ‘right’ to identity and difference in modern societies. With his rich ethnography, Armstrong (1998) challenges the assumption that violence is wholly central to the match-day experience of soccer supporters in England. Rather, the creation of identity is at the roots of hooliganism, with all the cultural values and rituals, codes of honor and shame, and communal patterns of behavior and consumption that accompany it. Klein (1991) has argued that baseball is a privileged field for the analysis of cultural hegemony and resistance in the Dominican Republic.
Dominican supporters use the very form and symbol of American domination, baseball, to promote resistance. Adopting the sport of baseball has also influenced cultural movements. ‘Dominicans infused the game with their own raucous, melodramatic style, marked by a highly individualistic way of playing, and easy-going attitude toward the game both on the field and on the stands, music and dancing, and crowds by turn temperamental and tranquil’ (Klein 1991, p. 152). Klein has shown something unusual in an era of global influences: working-class supporters in the Dominican Republic prefer wearing the shirts of national clubs rather than American ones.
Bibliography
Alter J S 1992 The Wrestler’s Body: Identity and Ideology in North India. University of California Press, Berkeley, CA
Appadurai A 1995 Playing with modernity: The decolonization of Indian cricket. In: Breckenridge C A (ed.) Consuming Modernity: Public Culture in a South Asian World. University of Minnesota Press, Minneapolis, MN
Archetti E P 1997 ‘And give joy to my heart’: ideology and emotions in the Argentinian Cult of Maradona. In: Armstrong G, Giulianotti R (eds.) Entering the Field. New Perspectives on World Football. Berg, Oxford, UK
Archetti E P 1999 Masculinities. Football, Polo and the Tango in Argentina. Berg, Oxford, UK
Armstrong G 1998 Football Hooligans. Knowing the Score. Berg, Oxford, UK
Bourdieu P 1984 Questions de Sociologie. Minuit, Paris
Bromberger C 1995 Le Match du Football. Ethnologie d’une Passion Partisane à Marseille, Naples et Turin. Editions de la Maison des Sciences de l’Homme, Paris
Brownell S 1995 Training the Body for China: Sports in the Moral Order of the People’s Republic. University of Chicago Press, Chicago
DaMatta R 1982 Esporte na sociedade: um ensaio sobre o futebol Brasileiro. In: DaMatta R, et al (eds.) Universo do Futebol: Esporte e Sociedade Brasileira. Edições Pinakotheke, Rio de Janeiro, Brazil
Elias N 1986 Introduction. In: Elias N, Dunning E (eds.) Quest for Excitement. Sport and Leisure in the Civilising Process. B. Blackwell, Oxford, UK
González Echeverría R 1994 Literatura, baile y béisbol en el (último) fin de siglo Cubano. In: Ludmer J (ed.) Las Culturas de Fin de Siglo en América Latina. Beatriz Viterbo Editora, Rosario, Brazil
Holt R, Mangan J A, Lanfranchi P (eds.) 1996 European Heroes. Myth, Identity, Sport. F. Cass, London
Klein A M 1991 Sugarball. The American Game, the Dominican Dream. Yale University Press, New Haven, CT
Leite Lopes J S 1997 Successes and contradictions in ‘multiracial’ Brazilian football. In: Armstrong G, Giulianotti R (eds.) Entering the Field. New Perspectives on World Football. Berg, Oxford, UK
Leite Lopes J S, Maresca S 1989 La disparition de la ‘joie du peuple’. Actes de la Recherche en Sciences Sociales 79: 21–36
Lévi-Strauss C 1966 The Savage Mind. Weidenfeld & Nicolson, London
MacAloon J J 1984 Olympic Games and the theory of spectacle in modern societies. In: MacAloon J J (ed.) Rite, Drama, Festival, Spectacle: Rehearsals Toward a Theory of Cultural Performance. Institute for the Study of Human Issues, Philadelphia
MacClancy J 1996 Female bullfighting, gender stereotyping and the state. In: MacClancy J (ed.) Sport, Identity and Ethnicity. Berg, Oxford, UK
Neves Flores L F B 1982 Na zona de agriao. Algumas mensagens ideológicas do futebol. In: DaMatta R et al (eds.) Universo do Futebol: Esporte e Sociedade Brasileira. Edições Pinakotheke, Rio de Janeiro, Brazil
Oriard M 1993 Reading Football. How The Popular Press Created an American Spectacle. University of North Carolina Press, Chapel Hill, NC
Sánchez León A 1993 La Balada del Gol Perdido. Ediciones Noviembre Trece, Lima, Peru
E. P. Archetti
14916
Sport, Geography of
The sports central to geography’s concern are institutionalized contests, frequently professional in their highest form and attractive to spectators, involving the use of vigorous physical exertion between individuals or teams of human beings. The geography of sport is a recent development, largely cultural-historical in approach. It maps and explains the changing distribution patterns of individual sports and packages of national, major, and minor sports at levels ranging from the world to district (Bale 1982). It relates these changes to the character and form of stadiums and landscape ensembles. It also deals with the quality of theater (interactions between players and spectators) in different types of stadiums (Bowden 1995). New directions of research focus on sport in the formation of working-class culture, the effects of globalization and televised sports on the distribution of sports, and the map of rapidly emerging women’s sports.
1. Modern Sports: Origins and Dispersals
The distribution of modern sports in the twentieth century is explained in a model of (a) primary origins in southeast England; (b) dispersal to the British Isles, Anglo-America, and the British Empire; (c) secondary origins in northern Britain and in northeastern Anglo-America; and (d) dispersal from the latter mainly to the rest of North America and the Caribbean Basin, and from Britain to the rest of the world. The birthplace of modern sports in Metropolitan England is attributed by some to the development of the mathematical-empirical world view after 1660. Marxists correlate sport with the birth of industrial capitalism in the 1700s (Guttmann 1994, Holt 1989). Others point to Thomas Hughes’s Tom Brown’s Schooldays (1857), which glorified Thomas Arnold’s education system at Rugby School but maintained that the moral development and character training inculcated by cricket and football outranked intellect in the curriculum. Hughes’s transformation of Arnold, abetted by the ruling classes, inspired the cult of athleticism in the elite schools and universities, which built principally on the first modern game, cricket, codified by the aristocracy in the eighteenth century (see James 1963). It was joined in the public schools in the middle third of the nineteenth century by hockey and athletics and by the Rugby School and dribbling games. Neither of these football games was played after the 1870s, but seven modern forms of football developed from them, and one of these forms has become the major team sport played by men in at least one season in practically every country in the world, and by women in most First World countries and in some Third World countries. Research in the 1990s goes far to explain the diffusion and distribution of the different types of football 1870–2001 as a classic case in the geography of sport.
2. Football Exceptions to Soccer’s World Dominance
Modern association football originated in England. By the 1880s it was the national game in England and Scotland, and diffused thereafter to become the most popular team sport throughout the world. The exceptions are seven English-speaking countries: Wales and Ireland, and five countries settled substantially from Britain: the USA, Canada, Australia, South Africa, and New Zealand. The explanation lies in (a) the preference of the majority of English public schools for the more manly rugby compared to the dribbling game played by the minority, and its consequence: rugby’s greater popularity among the upper and middle classes in Britain and in a majority of regions in the British Isles 1848–78, and (b) a developing English upper and middle class prejudice against the dribbling game (association football after 1863), increasingly associated in the 1870s with working class industrial regions (Murray 1994, Russell 1997). All of this ensured that the predominantly middle class colonists and their public-school and Oxbridge-educated leaders in New Zealand and Australia and in English-speaking Canada and South Africa in this midcentury period chose to carry with them Britain’s most popular football game of the time, rugby (Holt 1989). That they did so to the exclusion of dribbling soccer is explained by Foster’s law of simplification, under which, in the survival phase of early settlement, colonists choose only one of the many cultural forms in the diverse homeland, usually the form found in the region whence most of the first settlers came: rugby in southern England before 1878 (see Foster 1960). The Rugby School game (the rugby current before the Rugby Union’s regularization of the rules in 1871, which banned the violent ‘hacking’) provided the framework for the nationalist forms of football: Australian Rules, played in Australia outside Queensland and New South Wales (Hibbins 1992), and Canadian and American football, derived from the Harvard and McGill universities’ encounters in 1874. The peaks of British settlement occurred in eastern Australia, New Zealand, and South Africa late enough that the Rugby Union code became established there in the 1870s. This was challenged in the 1900s in eastern Australia and New Zealand by the new 13-a-side Rugby League professional code developed in northern England. Before 1880 rugby was preferred over dribbling soccer in Ireland and Wales. The continued preference for rugby over the new passing soccer after 1880 results from cultural nationalism. In Ireland the Gaelic Athletic Association modernized Gaelic football to be the national game, and its ban on English games worked selectively against ‘new’ soccer relative to well-established rugby. In Wales the under-the-table payment of Welsh ‘amateurs’ pre-empted professional soccer’s attractiveness to Welsh athletes and gave them the chance to beat the middle class English at their own game (Murray 1994, Holt 1989, Wagg 1995). In the 1860s and 1870s in Argentina, Belgium, Sweden, and France both rugby and soccer were adopted. The attractiveness of the new soccer in the early 1880s relegated rugby thereafter to the status of a minor sport in Argentina, and replaced it in Sweden and Belgium.
But in France rugby union parried the counterattack of new soccer after 1880, as did rugby league in southwest France after 1900, and both contributed to a long delay in soccer’s elevation as the national team sport (Guttmann 1994, LaBlanc and Henshaw 1994).
3. Diffusion of Soccer: The World’s Sport
The football adopted in every other country in the world after 1880 was soccer. The explanation for this lies in the revolutionary transformation of soccer from the dribbling game played by unspecialized players into a simple game of high skill emphasizing accurate long- and short-ball passing, wing-play, the crossing of the ball to forwards specialized in shooting and heading, and positional specialization. The revolution began in northern Britain, produced two professional leagues (England 1888; Scotland 1891) and swept through Great Britain, leaving rugby union more popular only in southwest England and the Scottish border (marginal rural regions with small towns and no large cities) and in south Wales (Russell 1997, Holt 1989, Murray 1994).
4. Diffusion and Adoption
During the first period of diffusion of soccer to independent countries, 1890–1940, the adoption of soccer in each country followed a standard pattern. It began with a gradual transmission of the game to young members of the local elite from: (a) British resident businessmen and agents of trading houses in the large port(s); (b) members of the British colony in the capital; (c) expatriate British teachers in schools and colleges for British residents and the local elite; and (d) elite students returning with a soccer ball from educational sojourns in Britain. Often towards the end of this gradual phase of adoption and formation of teams by denizens, members of the middle class picked up the game from the elite and also, in the large inland industrial and mining towns, from British mining and railway engineers, textile mill owners, and skilled workers. The initial 15–20 year phase of gradual adoption by the elite was followed by a filtering-down process and an explosive 10–15 years of rapid adoption by the middle and working classes in urban areas. The three phases of the S-curve pattern of the diffusion of innovations are completed with the 10–15 year slackening of the rate of adoption by the laggards: the market-townsmen and agricultural workers/peasants of the thinly populated countryside (Guttmann 1994, Russell 1997). The critical signifier of the establishment of soccer as a country’s major form of football (national team sport) is the date of creation by the national football association of the national club championship (data from LaBlanc and Henshaw 1994, Wagg 1995).
The plotting of these dates of founding of national championships establishes three discrete models of the spread of soccer worldwide, variants of the classic patterns of spatial waves and S-curve time-series of diffusion (see Rogers 1983) in men’s soccer to: (a) 35 independent nations 1890–1945, peaking in the 1920s; (b) 65 newly independent nations 1946–96, peaking in the 1960s; and, in women’s soccer, to (c) 65 countries 1957–2000, peaking in the 1970s–1980s.
5. Traditional Men’s Soccer: Europe, the Middle East, and Latin America
In the first diffusion-wave model, beachheads developed in Denmark, The Netherlands, Switzerland, Italy, and Argentina in the 1890s. In the 1900s the second wave pushed out to Norway–Sweden; Belgium–Germany; Austria–Hungary–Bohemia; Spain; Uruguay–Chile–Paraguay, and to a new beachhead in Mexico. The third wave of the 1910s, blocked in Europe by World War I and by continued Czarist and Ottoman imperial suppressions, broke only on the shores of Iceland, France, Portugal, Algeria–Morocco, Haiti, and Brazil. The fourth wave completed the conquest of the European outer margins (the three Baltic countries and Poland, Soviet Union, and three Balkan countries and Greece), swept into the Middle East and North Africa (Turkey, Israel/Palestine, Egypt, Tunisia), broke in northern Latin America (Bolivia–Peru–Ecuador–Colombia, Surinam, Costa Rica–Guatemala), and reached East Asia (Thailand, Japan/Korea). The last countries reached by the outer wave of the 1930s were Albania, El Salvador, and Honduras.
6. Postcolonial Men’s Soccer: Asia, Africa, and Islands
Most of the countries that adopted modern soccer after 1945 were former colonies of European powers. Soccer was the team sport offered in missionary schools, in private schools mainly for Europeans, and to the local elite in the main cities. Mine-owners and railways used it as a means for social control of workers. Lack of investment by the colonial powers in the provision of social institutions for the vast majority ensured that the game did not spread widely beyond these two groups in the large towns. The filtering-down process often extended over 40 years before emergent nationalism and independence accelerated the process and united the masses behind national soccer teams. Soccer was established as a national sport earliest in the former British, Dutch, and Belgian colonies because of soccer’s earlier acceptance by the three colonial powers compared to France and Portugal and because Britain, Holland, and Belgium established the European national association/championship model in each colony’s development. By contrast the imperial family model developed in the French and Portuguese colonies funneled the better colonial players into the French and Portuguese leagues, thereby stunting the development of colonial ones. Thus, in the post-World War II period the establishment of soccer as the national team sport took place first in the 1950s in the British colonies in the Mediterranean (Malta, Cyprus), North Africa (Ghana, Sudan), and East Asia (Burma, Malaysia), the Dutch colonies in the East and West Indies and in Congo–Kinshasa (formerly Belgian). Most of the 38 countries that formed national associations and club championships in an unprecedented decade of rapid establishment of soccer in the 1960s were the larger British Caribbean islands, former British colonies in eastern and southern Africa, and the coastal countries of Francophone Africa.
In the long third phase of slackening adoption from 1970 onwards, the laggards were land-locked Francophone countries, war-torn Angola and Mozambique, the smallest British Caribbean islands and island nations in the Atlantic, Indian, and Pacific oceans, their numbers boosted in the 1970s by the new Gulf States (Murray 1994, Wagg 1995).
7. Postmodern Women’s Soccer: The First World
After decades of being rebuffed by national football associations and FIFA, women’s soccer began its international phase of growth in the 1950s in its heartland, Germany and England, spreading slowly to seven neighboring European countries by 1969, when the English Football Association became the first to recognize women’s soccer officially. The Women’s Football Association founded there in 1971 was soon followed by other European national football associations, of which Germany’s established the first national championship in 1974. The European-only period of gradual adoption to 1969 gave way 1970–89 (with a prospective Women’s World Cup in 1991) to an explosive adoption of women’s soccer in another 40 countries, mainly in Europe and East Asia, but significantly in the new postmodern football cultures of Japan as well as USA, Canada, Australia, New Zealand, and South Africa where women are choosing soccer over the more physical and ‘manly’ forms of football favored by the majority of men. In the First World and in the northwest European heartland, until recently the unchallenged preserve of women’s field hockey, it is the young middle class women of metropolitan suburbia with their greater freedom within the realms of lifestyle and leisure politics who dominate women’s soccer. However, a small but significant minority of club and international players hail from working-class male strongholds (also a source of soccer-playing women in Brazil). Conspicuously under-represented on the world map of women’s soccer are Latin American countries where working class men and macho values inhibit women’s soccer. Absent are the countries of sub-Saharan Africa where female soccer is culturally proscribed, and those of Western Asia and North Africa where Muslims and Hindus demand sharp cultural divisions between male and female (Giulianotti 1999, LaBlanc and Henshaw 1994).
See also: Diffusion: Geographical Aspects; Diffusion, Sociology of; Entertainment; Play, Anthropology of; Sport, Anthropology of; Sport, History of; Sport, Sociology of; Sports, Economics of
Bibliography
Bale J 1982 Sport and Place: A Geography of Sport in England, Scotland and Wales. C Hurst & Co., London
Bowden M J 1995 Soccer. In: Raitz K (ed.) The Theater of Sport. The Johns Hopkins University Press, Baltimore, MD
Foster G M 1960 Culture and Conquest. Quadrangle Books, Chicago
Giulianotti R 1999 Football: A Sociology of the Global Game. Polity Press, Cambridge, UK
Guttmann A 1994 Games and Empires: Modern Sports and Cultural Imperialism. Columbia University Press, New York
Hibbins G M 1992 The Cambridge connection: The English origins of Australian Rules Football. In: Mangan J (ed.) The Cultural Bond: Sport, Empire, Society. Frank Cass, London
Holt R 1989 Sport and the British: A Modern History. Oxford University Press, London
James C L R 1963 Beyond a Boundary. Stanley Paul, London
LaBlanc M, Henshaw R 1994 The World Encyclopedia of Soccer. Visible Ink Press, Detroit, MI
Murray W J 1994 Football: A History of the World Game. Scolar Press, Aldershot, UK
Rogers E 1983 Diffusion of Innovations, 3rd edn. Free Press, New York
Russell D 1997 Football and the English. Carnegie Publishing, Preston, UK
Wagg S (ed.) 1995 Giving the Game Away: Football, Politics, and Culture on Five Continents. Leicester University Press, London
M. J. Bowden
Sport, History of
1. Two Definitions of ‘Sport’
The English word ‘sport’ is etymologically derived from the late Latin ‘deportare.’ Originally it meant ‘to divert yourself,’ ‘have fun,’ ‘amuse yourself’; in short, everything which you would do ‘in sport’ or ‘for love.’ In the narrow sense of the word it meant the hunt for ‘game,’ a semantic connection derived from mediaeval aristocratic practices, a term which is still used in English. Horse-racing was also described as ‘sport’ in early-modern times. The word was first taken into everyday English usage, however, in the mid-nineteenth century when new games and pastimes became popular amongst the middle classes. In this period ‘sport’ became an umbrella term for ball games (cricket, football, croquet, and hockey), martial arts (boxing, fencing, and wrestling, etc.), heterogeneous outdoor pastimes (boating, riding, cycling, skating, swimming, and mountain-climbing) as well as for so-called ‘athletics’ (running, jumping, javelin throwing). This list of disciplines can be found in a general article on English sporting life published in 1899 (Aflalo 1899). One hundred years later a sociological survey for the city of Hamburg came to the conclusion that there were more than 240 different sporting disciplines, from BMX-cycling to bungee jumping and free climbing on artificial mountain walls (Heinemann and Schubert 1994). The increasing variety of games and pastimes can have different forms of organization (neighborhood and circles of friends, clubs, schools and universities, the armed forces, political parties, or the State) and types of income (admission charges, private sponsoring, public grants). Some disciplines are popular, others are not. Some are followed with the support of modern mass media, others have retained a more private character. Nonetheless, these games and pastimes all have one common feature: they are competitive. Therefore, they encompass only a narrow understanding of the term ‘sport.’ For, in the course of the development of competitive sports into a mass phenomenon and their growing significance in education, the armed forces, the economy, and culture, there has been an increasing tendency in many languages to use the word ‘sport’ (or its international variants, as in ‘deporte’ (Spanish), ‘desporto’ (Portuguese), ‘spor’ (Turkish), ‘спорт’ (Russian)) also for noncompetitive activities and institutions, for which there are many other, and mostly more precise, terms. Amongst these may be counted gymnastics and physical education, dancing, forms of active tourism such as rafting, and even rehabilitation. In this broad usage ‘sport’ today has become a collective term for many and diverse expressions of the general physical culture of mankind, including ‘dead’ cultures like the physical culture of the ancient Orient or the ἀγών (agon) of the ancient Greeks. Both usages of the word ‘sport’ have given rise to different directions in the study of the history of sporting activity.
2. Sport History as the History of the Physical Culture of Mankind
The direction based on the broader usage of the word, ‘sport history’ as the history of physical culture, is the older of the two. Amongst the founders of the discipline in this sense can be counted the chronicler of the Greek ἀγών, Pausanias (c. 115–180), the Enlightenment educationalist Johann Christoph Friedrich GutsMuths (1759–1839) and the ‘father of gymnastics’ Friedrich Ludwig Jahn (1778–1852). Their nineteenth and twentieth century descendants were physical education teachers, club and association officials, doctors, hygiene experts, military officers, journalists, and writers who have researched themes like mediaeval bathing cultures, physical bearing in the Renaissance, or the history of their local gymnastic clubs. The great majority of these authors were, and still are, not professional historians but ‘amateurs’ and part-time writers. Thus, they have little experience in dealing with sources, do not know how to arrange their results in a more comprehensive context, and tend to indulge in off-the-cuff philosophizing about allegedly anthropological constants in human physical activities. Methodical studies are of little interest to these scholars; indeed, a good many of them used their research into sport history to legitimize their own political standpoint. This is particularly conspicuous in publications of the 1920s and 1930s, when continental Europe aired conflicts between so-called ‘bourgeois’ and working-class sport associations, and in later writings between 1950 and 1970 which were influenced by cold-war ideologies. They often yielded more information about the political and moral values of the respective authors than about the subject that they had researched. The largest fund of knowledge in the field of sport history, however, is produced by academics who have made a name for themselves in university departments for physical education (Institute für Leibesübungen), most particularly in German-speaking countries from the 1920s onwards and again since the 1960s. With the exception of some experts in ancient sport who are trained classical historians, a great many of these sport academics have been educated mainly in the natural sciences. Therefore, they do not make it their ambition to write elaborate historical studies. Nevertheless, they must be credited for their assiduous work in collecting and documenting sources and literature about sporting incidents, persons, institutions, and events in history. All the more so because very early on they cast their gaze beyond national boundaries, as documented by representative volumes of collected essays like Geschichte des Sports aller Völker und Zeiten (Bogeng 1926) or Geschichte der Leibesübungen (Ueberhorst 1972/1989). Academic exchanges have occurred since the 1970s with the aid of societies like the International Committee for the History of Physical Education and Sport (ICOSH, founded in Prague 1967) and the International Association for the History of Physical Education and Sport (HISPA, founded in Zurich 1973); these two societies, representing the East and the West, amalgamated into the International Society for the History of Physical Education and Sport (ISHPES) in 1989. In addition, there are more specialized societies that promote exchanges of information on North America, Australia, and Europe.
3. Sport History as the History of Competitive Games and Pastimes
A second direction in sport history, which primarily regards modern competitive sport as its theme (i.e., is orientated towards ‘sport’ in the narrow sense of the word), has been adopted by professionals who have moved into sport history after a rigorous academic training in another academic discipline. Most of these scholars are historians, whilst the majority of the rest are sociologists, anthropologists, political scientists, and economists. A few are geographers, architects, or media specialists. Sport academics are only included in this group when they can be counted, as in France and the US, as trained social scientists. This second category of sport historians is comparatively new. It sprang up in the 1970s in Great Britain, Australia, and the US, where sporting competition traditionally has been a part of the educational system. In the second half of the 1980s it also found adherents on the European continent, and today it is in the process of spreading to South America, Africa, and Asia. Given the lack of local experts, however, studies on these regions of the earth have up to now mostly been conducted by European and American scholars. International exchanges take place at conferences and in publications like Sporting Traditions (founded in 1977 in Australia) or the British Journal of Sports History, which in 1987 became The International Journal of the History of Sport; further at research centers such as the Center for Olympic Studies at the University of Western Ontario, the University of Neuchâtel’s Centre International d’Étude du Sport, or, in the UK, the International Research Centre for Sport and Socialisation in Society at the University of Strathclyde and the International Centre for Sports History and Culture at De Montfort University, Leicester (this institute employs an international staff). In contrast to the first category of sport history, the international cooperation amongst these younger scholars is marked by a clear orientation towards problem-solving, and a certain pleasure in interdisciplinary experimentation, which in different ways enables themes to be looked at from completely heterogeneous perspectives. To give but one example: the phenomenon of spectators rioting in football stadiums, which has been observed since the nineteenth century, was the object of comprehensive studies in the 1990s by historians, not to speak of sociologists, social workers, architects, and media specialists (for an overview see Williams 1991). A great many amateur historians are also active in the area of competitive sport, not least amongst them the oft-derided collectors of ‘sportifacts,’ who punctiliously generate lists of trophy winners, game scores and scorers, season and career averages, chronological events of team and club performances.
Since these data, when put into context, have some value as source material, professional historians have a considerable interest in cooperating with the amateurs by supervising their work and giving them the chance of being published in special journals like Sporting Heritage. The cooperation with academic sport scientists, however, is not as intense as it could be. One reason is that the two factions stress different subject matters; another is that their scholars mainly belong to different generations. The leading sport historians in sport science were born before World War II, whereas the social science faction, with few exceptions, were all born in the 1950s and after. Whereas the older generation grew up at a time when educationalists and politicians were interested in promoting virtues like self-discipline and strength in sport, the younger generation of research workers, which incidentally also includes women, has been imbued with a different understanding of sport. Massive state subventions during the cold-war period, expansion of educational opportunities and an increase in leisure since the 1960s, and above all the breakthrough of television and the concomitant commercialization tendencies, have not only transformed sport into a mass phenomenon but also helped to wipe out the old virtues and produce a mentality of hedonistic sporting consumption. While many of their older counterparts are irritated by, and even disgusted at, this development, it is well known to the younger generation, and tracing back the traditions of this mentality is a matter of course for them.
4. Sport History, Social History, and the Social Sciences
The formation of this second category in sport history around the turn of the 1980s is remarkable because both professional historians and social scientists had tended to ignore sport up to this time. Therefore, alongside the generation-specific interest of young academics in sport, there needed to be a change in the disciplines of history and social sciences themselves in order to embrace sport history as a natural object of study. The decisive change came with the rapid development of social history and historical sociology that began in the 1960s (see Social History). This boom was accompanied by an intensive reception of modernization theories, which made a fundamental contribution to the interest of scholars in the social form of modern sports, i.e., the rules, constraints, and organization of sporting competition (see Modernization and Modernity in History). For the features of this social form fitted very well into the categories of Max Weber and Talcott Parsons. ‘Equality of opportunity to compete and in the conditions of competition,’ ‘specialization of roles,’ ‘rationalization,’ ‘bureaucratic organization,’ ‘quantification,’ and ‘quest for records’: these were the crucial characteristics of modern sport as established by Guttmann (1978). Whereas historians and social scientists generally felt they had no responsibility for bodily movements and behavior in human beings (something which the first category of scholars from sport studies had no hesitation in writing about), the above definition seemed to imply that there might be a connection between the social formation of modern sport and the formation of modern Western society. Which political, economic, and social conditions were favorable to organized and thoroughly regulated competition? Who were the actors? What motives connected them with sport? What were the effects of their passion for sport? These and other questions were raised.
Significantly, the one modernization theorist still living at the time, the civilization theorist Elias, dedicated his work to sport research for a time. One of his hypotheses was that the emergence of organized sport as a form of relatively nonviolent physical combat was connected with a period in English history when Parliament became a battleground for resolving and defusing interests and conflicts (Elias and Dunning 1986; see also Elias 1971). Exciting as scholars’ attempts to emphasize the context of the development of sport were, their theoretical ambitions began to evaporate as soon as they began exploring sources in more detail. During their researches they all discovered that modernization and other ‘great’ theories about society provide in many ways an unsuitable framework in which to interpret sport history. One reason was the focus of ‘great theories’ on industrial society. This focus did not allow for the fact that competitive sport, at least in its ‘mother country’ England, had sprung up in pre-industrial times. There the most protracted impulse for development had been the early formation of a market society and the commercialization of everyday life, which explains why seemingly modern professional sport has a longer tradition in Britain than amateur sport (see Consumption, History of). On the other hand it was discovered that, precisely because of the rationality of its social form, sport could be effective as a transmitter of ‘premodern’ attitudes. Thus, in its land of origin, the notion of ‘fairness,’ for example, evolved from informal aristocratic and popular cultures that can be traced back to the Middle Ages. There were also some other elements which, with the help of sport, bound ‘chivalrous’ gentlemen into modern civil society. Modernization theorists had not foreseen such paradoxes and even Elias, who was well acquainted with the sources, seemed at a loss to know what to do with them.
Further communication problems with interpretations based on modernization theories resulted from the increasing confirmation of research results which showed that, everywhere apart from Great Britain, modern sport was a cultural import, and that it was disseminated by international elites whose identities and loyalties were anchored not only in, but also across states, a factor which was partly responsible for disseminating a uniform, global image. Established historians and social scientists were unable to say much about this phenomenon because their studies were traditionally fixed on the modern nation state. Hence, the method of international comparison which they recommended was of some, if insufficient, help; and those sport historians who studied themes like the establishment of sporting culture in the British Empire or its diffusion in Europe as a side-effect of tourism, business contacts, and technology exchanges (Mangan 1988, Lanfranchi 1991, Guttmann 1994, Eisenberg 1997) became of necessity pioneers of cultural transfer studies in a global framework. The significance of cultural transfer in the development of modern sport puts a different perspective on such research methods that, like oral history (see Oral History) or thick descriptions, were increasingly asserting themselves in social history in order to be able to reconstruct the development in small communities more accurately. True, such methods were used by sport historians, especially for Ph.D. theses (e.g., to research the social basis and the symbolic content of certain sports), and in individual cases (e.g., research work into how competitive sport was operated in totalitarian regimes) they brought to light highly surprising results. However, when it came to explaining social and sporting changes, this form of research contributed very little. Furthermore, completely different questions had to be addressed, such as the influence of single persons, governments, and military officers, especially in both world wars, the impact of rule changes by world sporting authorities, or media imposition of patterns of perception. These findings gradually have led to a loosening of the relationship of sport historians to their ‘parental’ disciplines, historical science and historical sociology (see Historical Sociology), despite their fundamental ties. That this tension has not been expressed openly, nor led to an antitheoretical offensive, as happened some years ago with representatives of modern cultural history, can be attributed to the priority given to empirical research, and also to the fact that sport historians have been lucky enough to be able to pursue their interests elsewhere. The rejection of ‘great theories’ occurred at a period of intensive reception of ‘smaller’ problem-related ‘theories of the middle range’ (Robert Merton), which were being developed by colleagues in sociology and economics (Neale 1964, Loy and Kenyon 1969). Some historians such as Vamplew contributed actively to this discussion (Vamplew 1988). This direction in research grew out of the awareness that social relations in sport present in many ways a different picture to those in ‘real life.’ A commonly given example is the nature of sporting competition.
Although most businesses would regard a monopoly as the ideal market situation, in sport this would be an economic disaster: it is of little commercial value to be a world champion if you have no challengers. Another example is the financial actions of clubs and sports firms, where ambitions for success often override profit considerations. Furthermore, the borders of class, gender, and ethnicity often run differently in sport from those in society which, in addition, are considerably less fluid. Such observations fascinated sport historians, not the least because they raised the value of the object under review. For they demonstrated that modern sport was in no way a ‘mirror of society,’ as was keenly asserted. It was much rather a relatively autonomous social subsystem, which functioned according to its own rules on the basis of its ‘built-in’ game character, and which in certain situations produced its own dynamic of development. Where this subsystem interacted with its social and economic environment, it would not only reproduce this environment, but also impose its own mark on this
environment. It could contribute to the erosion of outdated traditions like premodern value systems or social inequality; and conversely it could help to extend the life span of such anachronisms. Sport was an active, yet selective, impulse of social change.
5. The Future of Sport History Reconstructing this capacity of sport from individual cases will be a central task for the future of sport history as a branch of social history. For this task to be fulfilled satisfactorily, sport historians, social historians, and social scientists must cooperate. Alas, the reception of the results of sport history research amongst the scientific community has left a lot to be desired. To write a synthesis of the scattered findings of the last few years is, therefore, not only a desideratum but an urgent necessity. In addition, sport historians should target and close gaps in research. Given the fact that recent research work has almost exclusively covered the period from the eighteenth century to World War I this means, first of all, an extension of the chronological scope. A far broader look at the premodern phase of sport would be most fitting, especially a study of the effects in Great Britain of the Renaissance, which is a key epoch for questions of continuity and discontinuity between ancient and modern ideals in sport. This question, which can only be answered in cooperation with interested classical historians (cf. the journal Nikephoros), has been completely neglected to date. It is however far more important for sport history to turn its attention to the later decades of the twentieth century. What is needed here is, first, the compilation of a comprehensive data collection covering the social basis of sport, the biographies of influential actors, the development of organizations on a national and international level, financing methods, etc., as has been done in the last few years for the nineteenth century. Second, the substantial entanglement of sporting associations and totalitarian regimes (e.g., Europe between the world wars, Eastern Europe, South America, and China after 1945) must be made the subject of research studies. 
In this respect it would be of great help if a special branch of political science could be evolved, analogous to those of sport economy and sport sociology in recent years. For there are some signs which show that relationships of dominance and power in sport take on different shapes from those in other areas of social life. Furthermore, it is desirable for social science oriented sport historians to broaden their thematic range, i.e., extend their research activities to cover themes of the physical culture, which to date have mostly been left to amateurs and sport scientists: the movements of the human body, sexuality, education, gender-fixing, etc. Such studies should obviously avoid making naive statements about anthropological constants in humanity. They should rather reveal how regulated and
organized modern competitive sport emanates into other areas of social life, what effects it has, and what consequences result from this. In other words, scholars should research subjects like the correlation between doping and contemporary medical consumption (the first studies on the subject have already been conducted by Hoberman 1992), between the physical appearance of athletes and modern bodily ideals in the fashion industry and on the labor market in the present-day service economy, and between certain easygoing manners and expressive forms of communication in sporting arenas and in other areas of modern social life. Because this is only possible in close cooperation with academic disciplines like media studies, which in many countries are still in the process of coming into being, there is a danger that sport history will lose its own profile and independence. But if, by doing so, it can become an established part of a methodically reflected project of writing social history in the twenty-first century, this would only be to its own advantage. See also: Mass Media and Sports; Sport, Anthropology of; Sport, Geography of; Sport, Sociology of; Sports as Expertise, Psychology of; Sports, Economics of
Bibliography
Aflalo F G 1899 The sportsman’s library. A note on the books of 1899. Fortnightly Review 72: 968–76
Bogeng G A E (ed.) 1926 Geschichte des Sports aller Völker und Zeiten, 2 Vols. Seemann, Leipzig, Germany
Eisenberg C (ed.) 1997 Fußball, soccer, calcio. Ein englischer Sport auf seinem Weg um die Welt. Deutscher Taschenbuch Verlag, Munich, Germany
Elias N 1971 The genesis of sport as a sociological problem. In: Dunning E (ed.) The Sociology of Sport. Cass, London
Elias N, Dunning E 1986 The quest for excitement in leisure. In: Elias N, Dunning E (eds.) Quest for Excitement: Sport and Leisure in the Civilizing Process. Blackwell, Oxford, UK, pp. 63–90
Guttmann A 1978 From Ritual to Record. The Nature of Modern Sports. Columbia University Press, New York
Guttmann A 1994 Games and Empires. Modern Sports and Cultural Imperialism. Columbia University Press, New York
Heinemann K, Schubert M 1994 Der Sportverein. Ergebnisse einer repräsentativen Untersuchung. Hofmann, Schorndorf, Germany
Hoberman J M 1992 Mortal Engines. The Science of Performance and the Dehumanization of Sport. Free Press, New York
Holt R 1989 Sport and the British. A Modern History. Oxford University Press, Oxford, UK
Holt R 1992 Amateurism and its interpretations. The social origins of British sport. Innovation 5: 19–31
Lanfranchi P 1991 Fußball in Europa 1920–1938. Die Entwicklung eines internationalen Netzwerkes. In: Horak R, Reiter W (eds.) Die Kanten des runden Leders. Beiträge zur europäischen Fußballkultur. Promedia, Vienna, pp. 163–72
Loy J W Jr., Kenyon G S (eds.) 1969 Sport, Culture, and Society. A Reader on the Sociology of Sport. Macmillan, London
Mangan J A (ed.) 1988 Pleasure, Profit, Proselytism. British Culture and Sport at Home and Abroad 1700–1914. Cass, London
Neale W C 1964 The peculiar economics of professional sport. Quarterly Journal of Economics 78: 1–14
Nikephoros. Zeitschrift für Sport und Kultur im Altertum ff. Weidmannsche Verlagsbuchhandlung, Hildesheim, Germany
Poliakoff M B 1987 Combat Sports in the Ancient World: Competition, Violence and Culture. Yale University Press, New Haven, CT
Ueberhorst H (ed.) 1972/1989 Geschichte der Leibesübungen, 6 Vols. Bartels und Wernitz, Berlin, Germany
Vamplew W 1988 Pay Up and Play the Game. Professional Sport in Britain 1875–1914. Cambridge University Press, Cambridge, UK
Williams J 1991 Having an away day: English football spectators and the hooligan debate. In: Williams J, Wagg S (eds.) British Football and Social Change. Getting into Europe. Leicester University Press, Leicester, UK, pp. 160–84
C. Eisenberg
Sport Psychology: Performance Enhancement Enhancing sport performance is part of the discipline of psychology known as sport psychology. Sport psychology is the scientific study of people in sport and exercise contexts. There are two main objectives of sport psychology: (a) learning how psychological factors affect an individual’s physical performance, and (b) understanding how participation in sport affects a person’s psychological development, health, and well-being (Weinberg and Gould 1995). Although traditionally the emphasis has been on the first objective, both objectives will be considered. The rationale for this decision is that of the millions of individuals, especially children and adolescents, who play sports, only a tiny fraction will parlay those activities directly into a career in sport. For the rest, growing up means further defining their identity, discovering other skills and interests and, potentially, applying some of the principles learned during sport participation to their adult pursuits. Sport psychology, then, involves the use of sport to enhance competence and promote human development throughout the life span, and practitioners should be as concerned about ‘life’ development as they are about athletic development (Danish and Nellen 1997).
1. Why Study Sports? The impact of sports on US society is pervasive. US culture places a tremendously high value on sport. It is a major source of entertainment for both young and old. It has been reported that between 20 and 35
million 5–18 year old youth in the USA participate in one or more nonschool sports and another 10 million 14–18 year old youth participate in school sports yearly. Only family, school, and television involve children’s time more than sport. In sum, sport is a cultural phenomenon that permeates all of US society. Sport is not only a relevant topic for study because of its importance in society; it is a significant factor in the development of a child’s self-esteem and identity. However, whether it is a positive factor has been debated widely without a clear answer. Danish et al. (1990) concluded, following a review of the literature, that for sport to be a positive developmental experience, it must be designed specifically for the purpose of enhancing competence.
2. A Brief History of Sport Psychology Sport psychology dates back to the turn of the twentieth century. Although Norman Triplett, a psychologist from Indiana University, is credited with conducting the first study on athletic performance in 1898, Coleman Griffith is known as the father of sport psychology. A psychologist at the University of Illinois who also worked in athletics, he developed the first sport psychology laboratory, was involved in the first US coaching school, and wrote two groundbreaking books, The Psychology of Coaching (Griffith 1926) and Psychology and Athletics (Griffith 1928). He also conducted a series of studies with the Chicago Cubs baseball team and the University of Illinois, and consulted with a number of well-known athletes of his time. As sport psychology evolved over the twentieth century, two different kinds of sport psychologists emerged. Clinical sport psychologists are trained primarily in applied areas of psychology such as abnormal, clinical, counseling, and personality psychology and are usually licensed psychologists. They tend to be less well trained in the sport sciences. Educational sport psychologists are usually not licensed psychologists. Their training is in exercise and sport science, physical education, kinesiology, and the psychology of human movement, especially as it is related to the context of sport. They often have additional training in counseling. They see themselves either as researchers or as ‘mental coaches.’ For some time there has been conflict between these two groups, based on who is best trained to provide services that enhance sport performance. The targets of these services are primarily elite and professional athletes, many of whom see little need for such services or who sometimes seek the help of individuals not trained in either area.
More recently, a rapprochement between these two groups has taken place with the formation of the Association for the Advancement of Applied Sport Psychology, an organization of approximately 1,000 members, almost equally divided
between members trained in traditional areas of psychology and members trained in the sport sciences.
3. A Framework for the Development of Performance Excellence Gould and Damarjian (1998) have proposed a psychological pyramid of peak performance (see Fig. 1). They consider three sets of factors that interact to
produce peak performance. At the base of the pyramid is the personality, motivational, and philosophical foundation of the individual. This foundation consists of those enduring individual differences that influence one’s motivational disposition to learn the psychological skills that enhance performance. Among these factors are goal orientation, self-confidence, anxiety, and attentional style. On one side of the pyramid are the psychological strategies used to enhance peak performance. Among
Figure 1 The Psychological Pyramid of Peak Performance (adapted from Gould and Damarjian 1998).
the skills taught are: goal setting, relaxation techniques, imagery and mental rehearsal, self-talk, the development of pre-performance and performance routines, and association/disassociation strategies. However, to achieve and maintain excellence, peak performers must learn how to deal with the adversity associated with excellence. Adversity coping strategies represent the other side of the pyramid. Stressors may come from an individual’s own expectations, others’ expectations, time demands, injuries, and general life concerns. As a result, learning effective coping strategies is essential. Although some of the strategies needed to cope may be the same as those used to enhance peak performance, the way they are used may be markedly different. Athletes do not live, train, and compete in a vacuum. There are social, psychological, physical, and organizational factors in their environment that can have facilitative or debilitative effects on their performance. Gould et al. (1999) studied eight teams from the 1996 Atlanta Olympics. Four of these teams met or exceeded performance expectations and four failed to meet performance predictions. Through interviews and focus groups with the athletes and coaches, the authors found that teams that met or exceeded expectations participated in resident training programs, experienced family or friend support, employed mental preparation strategies, and were highly focused and committed. Teams that failed to meet expectations experienced planning and team cohesion problems, lacked experience, faced travel problems, encountered coaching problems, and had problems related to focus and commitment.
4. Building the Foundation of a Peak Performer: Developing the Base B. S. Bloom (1985) and his colleagues followed 120 individuals who had achieved world-class success in sport (tennis and swimming), esthetics (music and the arts), and cognitive or intellectual (research mathematicians and research neurologists) endeavors. They found that the development of these individuals was divided into three phases: romance, precision, and integration. During the romance phase, development was multifaceted and characterized by play, exploration, and fun. It was the time when youth developed a love for learning and learned that hard work and fun were not incompatible. They learned the importance of having high expectations of themselves. The need to have high expectations was communicated through the expectations that parents, teachers, and the community in general had for the individual. Parents and teachers played a crucial role in the development of excellence. During this first phase motivation and effort counted more than any special gifts or qualities the child had. It was during this period that qualities
such as character, emotional intelligence, and life skills were taught and acquired. The romance phase was followed by a focus on precision. This was a period of systematic learning and skill development. The teachers/coaches of the individuals became more demanding and the children became more committed to the activity. Motivation became more internal than external. Skills were learned but not necessarily demonstrated very effectively. It was not until the integration phase that individuals experienced the behavior as part of themselves. This phase was characterized by the perfection of their talent. The individuals became increasingly responsible for their own motivation. Much of their time was spent in their area of talent. Bloom and his colleagues noted that by age 11 or 12 only about 10 percent of these individuals had progressed sufficiently to predict their later level of achievement. Among the special qualities that characterized these talented individuals were: a strong commitment to their particular talent field, a desire to reach a high level of attainment in their chosen field, an ability to learn quickly and well, and a willingness to invest great amounts of time and effort to reach high levels of achievement in their chosen field. For each field there were specific qualities that were important. Obviously, coordination, reflexes, endurance, and strength are needed for athletic excellence, but in contrast to what is believed by many, these conditions, while necessary, are clearly not sufficient.
5. Teaching a Psychological Skills Training Program: Developing the Sides Athletes are active individuals. Their life experiences suggest they learn best by doing rather than by talking. They are accustomed to being ‘coached’ and expect concrete suggestions, feedback, and tasks. The process of teaching psychological skills should take advantage of their preferred learning style and expectations. As a result, although many of the specific strategies used to develop peak performance are based on traditional psychological approaches, they are best implemented using a skill-based, psychoeducational format (Danish et al. 1993). As such, these strategies may be quite similar to what Seligman and Csikszentmihalyi (2000) have referred to as ‘positive psychology’. A psychological or mental skills training program for developing all kinds of talent uses a psychoeducational approach as opposed to a traditional psychotherapeutic approach. The traditional psychotherapy model is problem- or illness-focused. The illness or problem is diagnosed, a treatment prescribed, therapy is undertaken and, if successful, a cure takes place. A psychoeducational model, on the other hand, begins with the individual’s dissatisfaction or ambition.
Goals are set and then the individual learns the skills to achieve the goal. The difference between the two approaches is both in the goals of the intervention and in the methods of delivery. In terms of the goals, it is the difference between developmental and remedial help, between pathology and health promotion. Traditional psychotherapy focuses on identifying deficits and fixing them so the individual can return to ‘psychological homeostasis’. The focus of psychoeducational approaches is on identifying assets and how to enhance them. Learning the skills serves as a catalyst for attaining personal and performance excellence. One psychoeducational approach is life development intervention (LDI; Danish and D’Augelli 1983). The central strategy of LDI involves teaching individuals how to set a goal, develop a plan to reach the goal, and overcome the roadblocks to goal attainment. Goals are actions undertaken to reach some desired end, not the end itself. Goals are different from results in that they are actions over which the participant has control. For example, for an athlete preparing for a competition, a goal might be to decide to learn how to develop a precompetition routine that will help focus on positive thoughts. To want to improve a time or score is not a goal, but a result. Individuals have control over their goals and only partial control over the results they want. In addition to having control over one’s goals, three other elements are necessary for effective goal setting: (a) setting goals one is motivated to attain, (b) making sure that the goal is stated in positive terms, and (c) having a goal that is defined specifically and can be measured. Being able to set a goal is one thing; being able to reach or attain the goal is another. There may be roadblocks that interfere with reaching a goal, regardless of whether it is of an athletic, career, or personal nature.
These roadblocks are: a lack of knowledge, a lack of skill, being afraid to take risks in order to reach the goal, and/or a lack of adequate social support to work toward the goal. A lack of knowledge means that the individual is missing some information necessary for goal attainment. A lack of skills may refer to either physical or mental skills. A skill deficiency exists when the athlete does not know ‘how to’ attain a needed skill level. When individuals lack social support they need to know what support they need, who can provide them with the support (and who cannot), and how to ask for the support. Support can be of either a tangible or an emotional nature. The final roadblock relates to risk taking. Many individuals play and live their lives in a ‘comfort zone’ because they are afraid of failing, or possibly even succeeding. They may believe it is better not to try to extend themselves to reach a goal than to try and fail. Or they may feel that if they reach a goal, more will be expected of them on an ongoing basis. Risks are defined as the perceived benefits of an action minus the
perceived costs. If the perceived costs outweigh the perceived benefits, efforts to reach a goal may be abandoned; if the perceived benefits outweigh the perceived costs, individuals may choose to work toward the goal. To optimize risk taking the benefits of an action must be increased and the costs decreased. As part of the intervention, the roadblock(s) must be removed so that the individual may work toward the goal. Thus, teaching the necessary knowledge, skills, means of developing social support, and/or risk taking strategies to overcome the roadblocks to a goal is a major component of how the LDI approach is implemented. The timing of when the intervention is implemented affects how it is delivered. For example, if goal setting were being taught, it could be taught to help an athlete prepare for the competition (enhancement), facilitate the athlete’s ability to fine tune the goal during competition (support), or cope with an injury resulting from competition by helping develop a rehabilitation plan (counseling). Regardless of when the intervention is implemented, the building of rapport and the development of an alliance to work together toward the attainment of an individual’s goals are essential.
6. Developing Excellence in Sport and Life Not every individual can be a peak performer, but most can improve through performance enhancement strategies. Improving one’s performance in sport or in other life domains begins with a decision to invest time and energy in deliberate practice. Bloom (1985) describes how a single-minded focus is an integral part of performance excellence. However, Marcia (1966) and Petitpas and Champagne (1988) point out that if individuals invest all their energies in sport, or in any single area, and achieve a level of success, they may decide to forego searching for a personal identity and commit to only that activity. By not exploring other areas, the individual may gain a sense of security at the expense of a search for identity. Baltes and Baltes (1980) refer to a single-minded focus as selective optimization. When individuals have a limited amount of time and energy, based on an assessment of environmental demands, their motivation, skills, and biological capacities, individuals select a pathway upon which to focus. It should be pointed out that identity foreclosure and selective optimization are not in themselves harmful. These behaviors only become problematic when they permanently prevent the exploration of alternatives. It is through this process of exploration that individuals learn more about themselves. They receive feedback about their strengths and weaknesses by testing themselves in a variety of situations. They learn life skills (Danish 1995). For sport to serve as an effective model for learning life skills, the sport experience must be designed with this goal in mind.
Promoting personal and performance excellence simultaneously is not incompatible. However, it must be a planned outcome of sports participation. It occurs when athletes compete against themselves—more specifically, when that competition is focused on maximizing their own potential and achieving their own goals. It occurs when knowing oneself becomes as important as proving oneself. De Coubertin ([1918] 1966), the founder of the Olympic movement, best described the dilemma faced by athletes who are so intent on proving themselves that they lack an awareness of what they have learned through sport: Sport plants in the body seeds of physio-psychological qualities such as coolness, confidence, decision, etc. These qualities may remain localized around the exercise which brought them into being; this often happens—it even happens most often. How many daredevil cyclists there are who once they leave their machines are hesitant at every crossroads of existence, how many swimmers who are brave in the water but frightened by the waves of human existence, how many fencers who cannot apply to life’s battles the quick eye and nice timing which they show on the boards! The educator’s task is to make the seed bear fruit throughout the organism, to transpose it from a particular circumstance to a whole array of circumstances, from a special category of activities to all the individual’s actions (De Coubertin 1918).
Whether the purpose is to enhance sport performance or to improve other areas of one’s life, the process is the same. What becomes essential is to ensure that what is learned can be generalized to other areas of the individual’s life so that the game of life is a success.
7. Concluding Comments The future of psychological interventions in sport lies in teaching mental skills to the masses, especially youth, and not just working with elite athletes. Sport participation among youth will likely continue to expand. However, it will not be taught in schools, where mandatory physical education and intramurals have all but disappeared. Instead, sport will be almost exclusively an after-school activity. As a result, the coaches will probably be youth workers and volunteers who may have little or no experience in teaching/coaching the different sports and will need special training. Traditional programs will probably not be effective. The coaches at these youth-serving agencies tend to be underpaid and poorly trained, the volunteers are hard to recruit and retain, and the goals of these new coaches may not be the same as school-based coaches. Consequently, new programs must be developed that are modular, flexible, cost and time sensitive, multimedia in format, transportable, easily taught, and scalable.
There will be challenges and opportunities in developing the materials, training the new coaches, and expanding the sport experience to more youth. One of the biggest challenges is how to help youth develop both athletically and personally. It will require integrating the physical and mental aspects of the game. If we are successful, sport will become a vehicle for learning and applying some of life’s most critical lessons. See also: Sport and Exercise Psychology: Overview; Sport Psychology: Training and Supervision; Sports as Expertise, Psychology of; Sports Performance, Self-regulation of
Bibliography
Baltes P B, Baltes M M 1980 Plasticity and variability in psychological aging: Methodological and theoretical issues. In: Gurski G (ed.) Determining the Effects of Aging on the Central Nervous System. Shering, Berlin, Germany, pp. 41–60
Bloom B S (ed.) 1985 Developing Talent in Young People. Ballantine, New York
Danish S J 1995 Reflections on the status and future of community psychology. Community Psychologist 28(3): 16–8
Danish S J, D’Augelli A R, Laquatin I 1983 Helping Skills II: Life Development Intervention. Human Sciences, New York
Danish S J, Nellen V C 1997 New roles for sport psychologists: Teaching life skills through sport to at-risk youth. Quest 49(1): 100–13
Danish S J, Petitpas A J, Hale B D 1990 Sport as a context for developing competence. In: Gullotta T, Adams G, Monteymar R (eds.) Developing Social Competency in Adolescence. Sage, Newbury Park, CA, Vol. 3, pp. 169–94
Danish S J, Petitpas A J, Hale B D 1993 Life development intervention for athletes: Life skills through sports. Counseling Psychologist 21(3): 352–85
De Coubertin P [1918] 1966 What we can now ask of sport. In: Dixon J [trans.] The Olympic Idea: Discourses and Essays. Carl Diem Institut, Köln, Germany
Gould D, Damarjian N 1998 Mental skills training in sport. In: Elliott B (ed.) Training in Sport: Applying Sport Science. Wiley, Chichester, UK, pp. 69–116
Gould D, Guinan D, Greenleaf C, Medbery R, Peterson K 1999 Factors affecting Olympic performance: Perceptions of athletes and coaches from more and less successful teams. Sport Psychologist 13: 371–94
Griffith C R 1926 Psychology of Coaching: A Study of Coaching Methods from the Point of View of Psychology. Scribners, New York
Griffith C R 1928 Psychology and Athletics, a general survey for athletes and coaches. Scribners, New York
Marcia J E 1966 Development and validation of ego-identity status. Journal of Personality and Social Psychology 3: 551–8
Petitpas A, Champagne D E 1988 Developmental programming for intercollegiate athletes. Journal of College Student Development 29(5): 454–60
Seligman M E P, Csikszentmihalyi M 2000 Positive psychology – an introduction. American Psychologist 55(1): 5–14
Weinberg R S, Gould D 1995 Foundations of Sport and Exercise Psychology. Human Kinetics, Champaign, IL
S. J. Danish
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Sport Psychology: Training and Supervision
Sport psychology is a hybrid discipline more associated with physical education than with mainstream psychology. Debate over the training of sport psychologists is exemplified in the title of Brown’s (1982) paper ‘Are Sport Psychologists Really Psychologists?’ Models of training and supervision will vary depending on how one answers that question. Sport psychologists and sport psychology organizations in different countries around the world have come up with different answers, and thus different models of training and supervision have developed.
People pursue advanced degrees in sport psychology for reasons as varied as wanting to improve their coaching, preparing themselves for university academic positions, acquiring skills for clinical or counseling careers, and obtaining the training necessary to work in rehabilitation and sports medicine clinics (McCullagh and Noble 1996). Sport psychology as an academic discipline is almost always found in physical education/exercise science departments (other names include human movement studies and kinesiology) around the world (cf. Fujita 1987), but in many countries, those who practice come from psychology backgrounds. For example, in Flanders, virtually all of the members of the Flemish Society of Sport Psychology who practice have psychology degrees (Wylleman et al. 1999). In Australia, academic sport psychology is in human movement departments, but the current training of practitioners is in psychology departments, making graduates eligible for registration as psychologists (Morris 1995). This dual parenting of sport psychology has led to some lively debate about training.
The United States Olympic Committee (USOC 1983) stepped into the training debate when it began its Sport Psychology Registry (only those with doctoral degrees could apply). It divided the registry into three sections for three ‘different’ types of sport psychology service providers. ‘Research sport psychologists’ could apply for that category if they wished to conduct research with Olympic athletes on a variety of psychological topics. ‘Educational sport psychologists’ could deliver psychoeducational training (e.g., relaxation, goal setting, imagery, positive self-talk) to individuals and groups. ‘Clinical sport psychologists’ could deliver psychotherapeutic services for athletes with psychological problems (e.g., eating disorders, panic attacks). This tripartite scheme was an attempt to accommodate individuals trained either in exercise science (many of the applications to the research and educational groups) or in psychology (most of the applications to the clinical group). The USOC was trying to be ‘inclusive’ of exercise science trained individuals who could not really use the protected title ‘psychologist,’ or be eligible for licensure, but who nevertheless often had extensive training in mental skills for performance enhancement. Also, one could apply for more than one designation. This classification scheme sparked some debate (e.g., Clarke 1984, Heyman 1984), and it also had a large influence on how sport psychologists talked about themselves. Even today, many practitioners specifically call themselves ‘educational sport psychologists,’ although there appears to be a trend away from using these labels, which are in many cases somewhat restricting.
By the early to mid-1980s, the question of training in sport psychology had become: training for what? Simplifying the USOC categories into two broadly defined realms of research (academic posts and research center positions) and applied practice (performance enhancement, counseling, consultation, and clinical work) will help illustrate the basic paths for training that have evolved.
1. Training
The majority of people in academic research positions in sport psychology are still housed in exercise science departments. Sport psychology positions in psychology departments are relatively rare, and training for academic positions in sport psychology differs markedly around the world. In North America, a Ph.D. program would involve two to three years of extensive course work, involvement in a variety of research projects, and the completion of a substantial research dissertation, with many programs offering applied practica as part of the degree. In the UK and other British Commonwealth nations (e.g., Australia, New Zealand), a Ph.D. in sport psychology is essentially a three-year program of research (with little or no required course work), the product of which is often a series of three or four conceptually linked studies of a particular topic in the field. Thus, one system of training (e.g., the UK) has the doctoral student deeply immersed in the research process, and another system (e.g., the US) has extensive course work along with a major research project.
Because the majority of academic research position training occurs in North America (nearly 100 universities offer graduate-level programs), the majority of professional discussions on the training of sport psychologists have come from academics in those schools. There is general agreement among sport psychologists that training should be interdisciplinary, covering both psychology and the exercise sciences (e.g., Dishman 1983). Feltz (1987), however, noted problems related to the integration of physical education/exercise science and psychology and suggested that expertise in psychology would be most appropriate to the applied aspects of the field. In the past decade many programs in North America, Australia, and the UK have made opportunities for exercise science and psychology students to take classes in each other’s departments. As sport psychology grows, it appears that training is becoming more and more interdisciplinary.
Consensus on what training should be for sport psychologists, however, has not been reached. The field of sport psychology has no equivalent of the ‘Boulder Model’ in clinical psychology, but many programs describe themselves in their promotional material as espousing a scientist-practitioner model of training (see Training in Clinical Psychology in the United States: Scientist–Practitioner Model). The organization that comes closest to setting training guidelines for those who wish to become practitioners is the Association for the Advancement of Applied Sport Psychology (AAASP), with its criteria for certification as a sport psychology consultant (see Sachs et al. 1998). Both the exercise sciences and scientific and applied psychology are well represented in the course work and training needed to become a certified consultant. The AAASP criteria also contain recommendations for supervised experience (at least 400 hours of supervised practicum working with athletes and coaches).
AAASP requires a doctoral degree for certification, but in many places around the world, a doctoral degree is not required to practice psychology. For example, in Australia sport psychologists are registered (the equivalent of ‘licensure’ in the US) at the master’s degree level. Currently, Australia probably has the most integrated training and registration model for sport psychologists in the world. The Australian Psychological Society (APS) has several ‘Colleges’ (the equivalent of Divisions in the American Psychological Association) within its structure, and one of them is the College of Sport Psychologists. The APS and the College of Sport Psychologists are the bodies that accredit master of applied psychology programs that emphasize working with sports people. Graduates of accredited programs are eligible for registration with State Psychologists Registration Boards. This integration of applied sport psychology training, program accreditation, and registration into mainstream applied psychology represents the latest level of professionalization in the field.
2. Supervision
The topic of supervision of sport psychology practica and internships, along with collegial supervision, arrived relatively recently in debates about the professional practice of sport psychology. In 1993, the International Society of Sport Psychology published a short article on supervising trainees (Van Raalte and Andersen 1993), and supervision received a brief discussion in a book chapter on ethics (Sachs 1993). In 1994, however, two articles were published in the major journals of the field. The article by Andersen (1994) was a discussion piece on ethics and the supervision of applied sport psychology graduate students that offered recommendations for future training and supervision. The data-based study on assessing the skills of sport psychology supervisors (Andersen et al. 1994) revealed the state of supervision in applied sport psychology training. Well over half the supervisors surveyed had never received any supervision of their own work with athletes. The median number of supervised practicum hours for students was just over 100. The opportunities for supervised experience in sport psychology graduate programs appeared seriously restricted, and the quality of the supervision could be questioned given the limited experience of many of the supervisors. Andersen et al. recommended that practitioners and trainers of future sport psychologists pay more attention to supervision issues because it appeared that the training and supervision sport psychology graduates were receiving was, in many cases, inadequate.
In 1996, two more journal articles on supervision appeared. Andersen and Williams-Rice (1996) discussed, in a sport psychology service delivery context, the potential benefits of using psychodynamic, phenomenological, behavioral, cognitive-behavioral, and developmental models of supervision. They suggested that the cognitive-behavioral supervision model of Kurpius and Morran (1988) might be the most comfortable and appropriate for sport psychologists because it contains many features that sport psychologists use in practice (e.g., cognitive restructuring, mental rehearsal, cognitive self-instruction, covert modeling). Thus, a supervisor might have a supervisee mentally image and rehearse an upcoming encounter with an athlete, or supervisors might help supervisees overcome their own irrational beliefs about service delivery by using cognitive restructuring techniques. This supervision model has the potential to be the ‘parallel process’ model for sport psychology in that the interactions of the supervisor and the supervisee reflect similar processes that are going on with the supervisee and the athlete. Andersen and Williams-Rice also discussed transference and countertransference, along with peer and meta-supervision (supervision of supervision), misalliances, boundaries, and dual roles in supervision.
From the evidence in Andersen et al. (1994), there appeared to be very little training in how to be a sport psychology supervisor. Even in many clinical and counseling psychology programs, the main training one receives for becoming a supervisor is having been in supervision oneself. This state of affairs is similar to the wishful thinking that if one is well coached, one will become a good coach. Admittedly, current clinical and counseling programs are offering more training in supervision than they have in the past. In sport psychology education, however, training in how to supervise is quite rare.
Barney et al. (1996) presented a model for supervision training in sport psychology, borrowing and adapting Hart’s (1982) three-stage developmental model of supervision. The Barney et al. model was actually a multitiered unified program of supervision and training. The model begins with background education in the sport sciences, physiology, physical education, and psychology. The training begins with a year-long course in sport psychology theory and research. Towards the end of that year students would begin their practica counseling athletes. Those beginning practicum students would be supervised by advanced students undergoing a ‘supervision practicum’ where they are being trained to become good supervisors.
Barney et al. described three major phases for the supervisor-in-training that are interwoven. Working on the acquisition of supervisory skills (e.g., how to do a supervision intake), especially for a beginning supervisor, may take a considerable amount of time. Supervision may swing back and forth between skill acquisition and personal growth issues. The meta-supervisor helps supervisors-in-training to explore what they bring to the supervisor–supervisee relationship, their needs, their countertransferential thoughts and behaviors, and so forth. This self-exploration and personal growth are in the service of producing a more aware sport psychologist who will competently deliver service. The integration phase for the supervisor-in-training is combining new skills and personal growth for building dynamic helping relationships with athletes. Barney et al. placed great emphasis on the study of relationships in sport psychology (e.g., meta-supervisor–supervisor-in-training, supervisor–supervisee, psychologist–athlete) and how relationships are the key to successful sport psychology service delivery.
This emphasis on relationships is something new in sport psychology, and this trend is exemplified in another recent article on training. Petitpas et al. (1999) have presented arguments and recommendations for making the examination of relationships one of the central features in the training of future applied sport psychologists. The recognition of the importance of relationships is nothing new in mainstream counseling and clinical psychology, but because sport psychology’s original ‘home’ was in physical education, such counseling and clinical considerations did not arise until recently. The new emphasis on relationships represents the continuing ‘psychologization’ of sport psychology.
The debate on training in sport psychology has been with the field since the early 1980s, but the discussion of supervision is a relative newcomer. Supervision and training are currently becoming widely debated topics. For example, at the 1998 AAASP annual conference, there were over 20 presentations that were concerned with training and/or supervision. The study of sport psychology, even though its roots go back to Triplett (1898) and Griffith (1928), is a young but maturing field. As applied sport psychology matures, interest in quality training and supervision will probably grow even more.
See also: Developmental Sport Psychology; Sport and Exercise Psychology: Overview; Sport Psychology: Performance Enhancement; Sports Performance, Age Differences in; Teamwork and Team Training; Youth Sports, Psychology of
Bibliography
Andersen M B 1994 Ethical considerations in the supervision of applied sport psychology graduate students. Journal of Applied Sport Psychology 6: 152–67
Andersen M B, Van Raalte J L, Brewer B W 1994 Assessing the skills of sport psychology supervisors. The Sport Psychologist 8: 238–47
Andersen M B, Williams-Rice B T 1996 Supervision in the education and training of sport psychology service providers. The Sport Psychologist 10: 278–90
Barney S T, Andersen M B, Riggs C A 1996 Supervision in sport psychology: Some recommendations for practicum training. Journal of Applied Sport Psychology 8: 200–17
Brown J M 1982 Are sport psychologists really psychologists? Journal of Sport Psychology 4: 13–17
Clarke K S 1984 The USOC Sports Psychology Registry: A clarification. Journal of Sport Psychology 6: 365–6
Dishman R K 1983 Identity crisis in North American sport psychology. Journal of Sport Psychology 5: 123–34
Feltz D 1987 The future of graduate education in sport and exercise science: A sport psychology perspective. Quest 39: 217–23
Fujita A H 1987 The development of sport psychology in Japan. The Sport Psychologist 1: 69–73
Griffith C R 1928 The Psychology of Athletics. Scribner, New York
Hart G M 1982 The Process of Clinical Supervision. University Park Press, Baltimore, MD
Heyman S R 1984 The development of models for sport psychology: Examining the USOC guidelines. Journal of Sport Psychology 6: 125–32
Kurpius D J, Morran D K 1988 Cognitive-behavioral techniques and interventions for application in counselor supervision. Counselor Education and Supervision 27: 368–76
McCullagh P, Noble J M 1996 Education and training in sport and exercise psychology. In: Van Raalte J L, Brewer B W (eds.) Exploring Sport and Exercise Psychology, 1st edn. American Psychological Association, Washington, DC, pp. 377–94
Morris T 1995 Sport psychology in Australia: A profession established. Australian Psychologist 30: 128–35
Petitpas A J, Giges B, Danish S J 1999 The sport psychologist-athlete relationship: Implications for training. The Sport Psychologist 13: 344–57
Sachs M L 1993 Professional ethics in sport psychology. In: Singer R N, Murphey M, Tennant L K (eds.) Handbook of Research on Sport Psychology. MacMillan, New York, pp. 921–32
Sachs M L, Burke K L, Gomer S 1998 The Directory of Graduate Programs in Applied Sport Psychology, 5th edn. Fitness Information Technology, Morgantown, WV
Triplett N 1898 The dynamogenic factors in pacemaking and competition. American Journal of Psychology 9: 507–33
USOC 1983 US Olympic committee establishes guidelines for sport psychology services. Journal of Sport Psychology 5: 4–7
Van Raalte J L, Andersen M B 1993 Special problems in sport psychology: Supervising the trainee. In: Serpa S, Alves J, Ferreira V, Paulo-Brito A (eds.) Proceedings: VIII World Congress of Sport Psychology. International Society of Sport Psychology, Lisbon, pp. 773–6
Wylleman P, De Knop P, Delhoux J, Vanden Auweele Y 1999 Current status and future issues of sport psychology consultation in Flanders. The Sport Psychologist 13: 99–106
M. B. Andersen
Sport, Sociology of
Sport, once considered a marginal area of interest (a subset of the sociology of leisure or sociology of culture at best), is now a distinct area of concern for sociologists. Since the 1960s, a significant body of sociological research has been produced, constituting the area of sport sociology. The growing interest in sport as a site of research is the result of three factors: the increasing pervasiveness of competitive sport in everyday life, the growing acceptance of critical examination of sporting practices, and sport’s distinctive and relatively autonomous structure.
Sport sociologists are particularly interested in modern sport, a relatively new cultural form dating from the late 1800s. Modern sport shares attributes with leisure/recreation/play and with drama, but is distinct from both. It distinguishes itself from other forms of leisure/recreation/play in that it is physical, competitive, and institutionalized. It can be distinguished from other forms of entertainment (e.g., drama, music) in that it is not in and of itself expressive (Gadamer 1975). The movement of a sport contest is governed by its rules. Athletes, if they are giving an ‘honest effort,’ are trying to win a game, not to communicate an idea or emotion. Thus, much of the social meaning of sport comes from outside of the actual playing of the game.
The relationship between spectators and sporting practices also distinguishes sport from other forms of entertainment. Sport spectators see themselves as giving support to athletes. Where other forms of entertainment attempt to evoke an emotional response in the audience, the sport audience attempts to evoke an emotional response in athletes in an effort to affect the outcome of a contest. The fan/athlete bond, then, is rooted in the norms and obligations of reciprocity (Mauss 1954) that form the foundation of the intense, sometimes fanatical, relationship between fans and sport.
Despite the rule-bound and relatively autonomous space of sport, the structure of a sport (i.e., its rules, amount of physical contact, who is allowed to compete, who governs the sport, etc.) and the meanings associated with a sport are a product of a particular culture. The social significance of a sport, argues Bourdieu (1988), cannot be apprehended in the absence of a much broader understanding of the structure of the society as a whole. For many sociologists, sport is a site through which questions related to social and political power, domination, and ideology are considered. Since the 1960s, sport sociology can be viewed as having taken two approaches: an institutional approach that emphasizes the unique and semi-isolated nature of sporting practices, and a cultural orientation that views sport as a cultural product or practice reflective of society as a whole.
1. Sport as a Unique Institution
The institutional approach to the study of sport has its roots in physical education. Physical education researchers in the 1960s and 1970s applied their empirical methods to sport (Kenyon and Loy 1965). For the most part, early researchers were looking for empirical proof for popular beliefs about sport’s positive and character-building attributes, or, as Coakley (1993) later put it, doing ‘socialization-as-internalization’ research. The results, by and large, were inconclusive (Fine 1987).
At about the same time, sport and physical education became embroiled in the social movements of the 1960s and 1970s. In the United States, scholar and political activist Jack Scott published his manifesto, The Athletic Revolution (1971), and Harry Edwards documented the protests and threatened boycott he led in 1968 in The Revolt of the Black Athlete (1969). In France, Jean-Marie Brohm (1978), influenced by the student movement, argued that sport was alienated play and encouraged an anti-Olympic movement.
Beginning in the 1970s, empirical research on sport as an institution also took a critical turn, documenting structural inequalities within sport and challenging popular ideologies. One line of research following from these studies focuses on positional racial segregation on the field of play, a phenomenon dubbed ‘stacking’ (Curtis and Loy 1978). A related field of inquiry is the documentation of racial and gender inequalities in the hiring practices of teams and leagues. Acosta and Carpenter (1994), for example, document gender inequalities in hiring practices, salaries, and advancement of women in intercollegiate sport in the United States. More recently, sociologists have focused on disparities in media coverage. Socialization research also took a critical turn as sociologists attempted to demonstrate a link between athletic participation and sexual assaults and other criminal or deviant behavior.
Another line of critical inquiry in sport has roots in a social problems perspective. The social problems perspective in sport tends to focus on contradictions within sport. Researchers frame sport as paradoxical (i.e., sport promotes both health and injury, and both encourages and discourages drug use) or highlight positive deviance or over-conformity (i.e., a strong emphasis on winning encourages drug use or injury). Other research frames the problems of sport in the context of social-historical shifts. Debates about performance-enhancing drug use, for example, are framed in light of changes in approaches to medical treatment. Commonly accepted medical interventions, such as cold remedies and therapeutic treatments for slowing down the aging process or preventing depression, are marketed with the express intent to enhance performance in everyday life. As the standard of acceptable medicine changes, and the idea of what is a treatable ailment shifts, so will athletes’ sense of fairness and the prohibitions set by sports organizations.
Independent of the burgeoning field of sport sociology, qualitative researchers in the 1960s, influenced by the Chicago School, began producing ethnographies of marginalized groups within the sports world (Scott 1968, Polsky 1969). Ethnographers would find the relatively autonomous world of sport fertile ground for understanding common, everyday experiences. For example, Chambliss (1989) theorizes excellence through elite swimming, Adler and Adler (1991) detail the experience and consequences of role engulfment through college basketball, and Fine (1987) explores preadolescent male culture in his study of youth baseball.
2. Cultural Orientation to Sport
In the 1970s, Gruneau and other scholars associated with the Massachusetts School attempted to re-center the sociological study of sport within an overriding concern for political economy (Ingham and Donnelly 1997). The works of these scholars formed the basis of the cultural orientation in the United States and Canada. They found theoretical grounding in Gramsci’s notion of hegemony. As a social practice, sports are produced by athletes and organizers, but within the context of historically specific social constraints. This theoretical formulation provides the basis for Gruneau’s analysis of class relations, reproductions and transformation of Canadian sport (1983). By extension, modern sport is viewed as a celebration of modern values of competition and winning and organized around principles of the division of labor, capitalism, and professionalism. Because sport offers so little social significance in and of itself (e.g., it has a veneer of neutrality), it becomes a powerful resource for constructing meaning, particularly for the benefit of the power elite.
Central to the Massachusetts School is the Gramscian idea of counter-hegemony or resistance. Sport, they note, could also be used by marginalized groups to challenge dominant ideologies and structures. From this perspective, sport is a site in which meaning is constructed, challenged, and reconstructed. Put slightly differently, sport is viewed as ‘contested terrain.’
At about the same time in Europe, Eric Dunning, a student of Norbert Elias, cultivated a somewhat different cultural orientation. Dunning incorporated Elias’ grand theory of social development into a figurational approach to sport sociology (Elias and Dunning 1986, Dunning 1999). The figurational approach, which focuses above all on social processes and interdependencies, frames a vast range of works, from hooliganism to athlete migrations and male bonding. The figurational approach distinguishes itself from the Massachusetts School in its focus on a total social structure, taking into account a wide variety of social processes that hold groups together, such as emotions, gender, race, and communication patterns.
In the 1980s, feminist sport theorists incorporated aspects of the figurational approach and the Massachusetts School emphasis on hegemony and resistance and began to theorize about the changes in sport due, in large part, to the feminist movement. The influence of Dunning and Sheard’s (1973) essay on rugby as a male preserve on the Massachusetts School is evident in the works of Theberge (1985) and Birrell (Birrell and Richter 1987). Arguing that sport is a male preserve that contributes to the oppression of women physically and sexually, they assert that sport nonetheless offers women the potential for subverting male dominance by enabling women to experience their bodies as strong and powerful or to restructure sport to fit feminist principles. Messner (1988), while discounting the simplistic notion that increasing participation by women signals ideological shifts, argues that images of athletic women are forged, interpreted, and contested in sport and may partially contradict masculine hegemony. Feminist insights of contested terrain in sport were applied to a variety of concerns, including bodybuilding and masculinity (Klein 1993) and inequalities and Texas high school football (Foley 1990).
These sport sociologists (often called critical sport sociologists) see sport as a site through which questions related to social and political power, ideology, and transformative possibilities are considered. The bulk of the research that follows from this conceptualization of sport is far less optimistic than the theory that inspired it. Bryson (1987), for example, found that masculine hegemony is both more resilient and more complex than had been previously theorized. Bryson documents the political tactics of men in Australia in response to women’s athletic prowess that reframe their accomplishments to support masculine hegemony. In many ways, the more recent trends in sport follow from the discovery of sport’s rather limited transformative potential.
3. Recent Trends, Future Challenges
Foucault, cultural studies, and postmodern theory have, to varying degrees, influenced critical sport sociology. One of the great difficulties for the sociology of sport is how to understand the physicality of sport. Most classical sociological theory ignores or diminishes the significance of the body. Yet the empowerment and transformation of the body or body skills is at the heart of sport’s transformative possibility for critical sport sociologists. Foucault and feminist theory enabled sport scholars to problematize the body, that is, to see it as a site of domination and one through which power is exercised and enjoyed. Bodies are not simply empowered through sport, but they are constrained, shaped, and consumed, and are thus a central ideological resource (Cole 1994).
Cultural studies encourages sport sociologists to make a critical analysis of the social meanings of sporting events. A growing literature offers critical readings of the discourse surrounding a sporting event or celebrity, both to expose the multiple meanings of sport and to analyze how these narratives do ideological work which reproduces and justifies race, class, and gender inequalities (Birrell and McDonald 2000). For example, images of black athletes, even those purporting to offer positive images of African-Americans in order to subvert racism, refer to and reinforce racial stereotypes associating African-Americans with crime, drugs, and poverty.
The postmodern perspectives suggest that we see sport as part of broad and fundamental social changes. One of the distinguishing features of contemporary post-industrial societies is the use of images and spectacles to enculturate individuals into a media and consumer culture. Social values, political debates, and everyday concerns are not garnered through active participation, but through the passive consumption of spectacles. Post-industrial media culture implodes sport into media spectacle. Celebrity athletes become images that are dynamic and complex, but because they are disconnected from social life, they can be interpreted in a variety of ways. Postmodern perspectives encourage a shift in sociological concerns away from the active producers of sport and culture to an exploration of the passive consumption of sport spectacles (Kellner 1996).
Somewhat in response to the postmodern turn, another line of inquiry is the re-examination of the institutions of sport and their structural intersections with political power. These examinations of the organizational practices within the sport industry attempt to ground the discussion of power and domination in and through sport in a structural framework. The narratives surrounding sport become ancillary to the exercise of power (particularly global political and economic power) and how this shapes the institutions of sport.
One of the future challenges of sport sociology will be to find a synthesis between the institutional and the cultural critiques. While the concerns of the two approaches appear similar, at a very basic level they are very different. The cultural critique relies essentially on the success and celebrity of people of color, women, and other marginalized groups to demonstrate the resiliency of ideological domination, while the institutional critique emphasizes barriers and limitations impeding the progress of these very same groups. Nonetheless, the synthesis of these two perspectives seems to hold the promise of greater sociological understanding of sport.
See also: Critical Theory: Contemporary; Elias, Norbert (1897–1990); Entertainment; Hegemony: Cultural; Regulation: Empirical Analysis; Sport, Anthropology of; Sport, Geography of; Sports, Economics of
Bibliography

Acosta R V, Carpenter L J 1994 The status of women in intercollegiate athletics. In: Birrell S, Cole C L (eds.) Women, Sport, and Culture. Human Kinetics, Champaign, IL, pp. 111–18
Adler P A, Adler P 1991 Backboards and Blackboards: College Athletics and Role Engulfment. Columbia University Press, New York
Birrell S, McDonald M G (eds.) 2000 Reading Sport: Critical Essays on Power and Representation. Northeastern University Press, Boston
Birrell S, Richter D M 1987 Is a diamond forever? Feminist transformations of sport. Women’s Studies International Forum 10(4): 395–409
Bourdieu P 1988 Program for sport sociology. Sociology of Sport Journal 5(2): 153–61
Brohm J-M 1978 Sport: A Prison of Measured Time. Ink Links, London
Bryson L 1987 Sport and the maintenance of masculine hegemony. Women’s Studies International Forum 6(2): 135–53
Chambliss D 1989 The mundanity of excellence. Sociological Theory 7(1): 70–86
Coakley J 1993 Sport and socialization. Exercise and Sport Sciences Reviews 21: 169–200
Cole C L 1994 Resisting the canon: Feminist cultural studies, sport and technologies of the body. In: Birrell S, Cole C L (eds.) Women, Sport, and Culture. Human Kinetics, Champaign, IL, pp. 5–29
Curtis J, Loy J 1978 Positional segregation in professional baseball: Replications, trends, data and critical observations. International Review of Sport Sociology 4: 5–21
Dunning E 1999 Sport Matters: Sociological Studies of Sport, Violence and Civilization. Routledge, London
Dunning E, Sheard K 1973 The rugby football club as a type of male preserve. International Review of Sport Sociology 8: 5–24
Edwards H 1969 The Revolt of the Black Athlete. Free Press, New York
Elias N, Dunning E 1986 Quest for Excitement. Basil Blackwell, Oxford
Fine G A 1987 With the Boys: Little League Baseball and Preadolescent Culture. University of Chicago Press, Chicago
Foley D 1990 The great American football ritual: Reproducing race class and gender inequality. Sociology of Sport Journal 7(2): 111–35
Gadamer H G 1975 Truth and Method. Sheed and Ward, London
Gruneau R 1983 Class, Sport and Social Development. University of Massachusetts, Amherst, MA
Ingham A G, Donnelly P 1997 A sociology of North American sociology of sport. Sociology of Sport Journal 14(4): 362–418
Kellner D 1996 Sport media culture and race: Some reflections on Michael Jordan. Sociology of Sport Journal 13(4): 458–67
Kenyon G S, Loy L W 1965 Toward a sociology of sport: A plea for the study of physical activity as a sociological and social psychological phenomenon. Journal of Health, Physical Education and Recreation 38: 24, 68–69
Klein A 1993 Little Big Men: Bodybuilding Subculture and Gender Construction. State University of New York Press, Albany, NY
Mauss M 1954 The Gift: Forms and Reason for Exchange in Archaic Societies. W W Norton, New York
Messner M 1988 Sports and male domination: The female athlete as contested ideological terrain. Sociology of Sport Journal 5(3): 197–211
Polsky N 1969 Hustlers, Beats and Others. Anchor Doubleday, New York
Scott J 1971 The Athletic Revolution. Free Press, New York
Scott M 1968 The Racing Game. Aldine, Chicago
Theberge N 1985 Toward a feminist alternative to sport as a male preserve. Quest 37(2): 193–202
T. W. Crosset
Sports as Expertise, Psychology of

Elite athletes represent a small and specific subset of the human population within whom the mastery of movement (motor) skill acquisition is at its most pronounced. The emerging field of sport expertise concerns itself with the detailed study of exceptional sports performers and the mechanisms underpinning the overt power, strength, grace, beauty, speed, accuracy, and apparent ease, efficiency, and effectiveness of their movement skills. Comparisons with less skilled individuals, along with detailed examinations of the developmental life histories of exceptional athletes, are used in an attempt to develop a principled understanding of the nature and limits of human perceptual–motor learning.
1. Theoretical and Applied Significance of Sport Expertise Research

Studying expert performance in sport has the potential to contribute both important theoretical and applied knowledge to the social and behavioral sciences. In a theoretical context, studying sport expertise is important because of its potential to provide valuable and unique insights into the nature of perceptual–motor skill learning. The attainment of expert performance in a whole range of domains (sport being no exception) appears to take at least 10 years, or approximately 10,000 hours, of daily, effortful practice, during which time literally millions of trials of practice are undertaken (Ericsson et al. 1993). Studying the expert therefore provides a window into learning and skill optimization that conventional, laboratory-based learning studies of untrained participants simply cannot replicate. Studies of sports experts in particular are uniquely valuable in permitting broad assessment of the explanatory efficacy of theories and models of human skill acquisition and expertise (typically developed in domains other than sport) and, in conjunction with information from the field of kinesiology or human movement science, in helping isolate the limiting factors to performance in a range of different sport tasks.

In an applied context, sport expertise research offers promise as a means of providing information of practical significance to coaches and athletes regarding the selection of the most appropriate forms of practice and training, the assessment of skill progression, and the optimal design of sports systems and talent identification. For example, knowing what factors do, and equally importantly do not, discriminate expert and novice performers can provide a principled basis for guiding emphasis within practice. Knowing the specific information that must be learned in order to become an expert on a given task has direct implications for the structure of instruction and the provision of feedback. Knowing the early developmental experiences that adult experts in particular sports share can provide direct guidance with respect to the appropriate emphases to apply to the design of junior sport systems.
2. Historical Trends in the Study of Sport Expertise

While a strong case can be made for the potential importance of sport expertise research, historically it has not been a favored topic of study in either sport psychology or ‘mainstream’ cognitive psychology. Although a considerable number of studies comparing skilled athletes with less skilled people on a range of isolated ‘abilities,’ such as vision, reaction time, balance, movement speed, and agility, were undertaken from the 1930s to the 1970s, this research work was, almost without exception, descriptive, atheoretical, and nonprogrammatic. Consequently, little progress was made during this period toward the development of a consolidated body of knowledge, or toward the development of conceptual models or theories of sport expertise. In large part this was due to the concerted emphasis in the motor learning field at this same time on research examining the limitations and capacities of putative information-processing stages. Such work proceeded typically by studying untrained participants as they performed simple, but novel, laboratory tasks.

The foundations for a consolidated body of research on sport expertise were laid largely in the 1980s. During this period paradigms developed to study expertise in cognitive tasks (such as playing chess or solving mathematical problems) were applied progressively to the study of sport situations. Particularly prominent were the pattern-recognition paradigm developed by de Groot (1966), and Chase and Simon (1973), and the knowledge-based approaches to skill acquisition of Chi (1981), Anderson (1982) and others. Studies using the pattern-recognition paradigm typically demonstrate superior pattern recognition and recall by experts for material that is domain-specific; however, this advantage disappears for unstructured material and for patterns derived from beyond the expert’s domain. This observation holds true for sport tasks as well as more traditional cognitive tasks, although there is some controversy as to whether the expert’s advantage on such tasks is a cause or merely a by-product of their expertise. The knowledge-based approaches have revealed systematically superior knowledge content and organizational structure for experts, again regardless of whether the experts are drawn from the cognitive or the motor domain (Thomas and Thomas 1994).

Studies derived from the pattern-recognition and knowledge-based approaches were invaluable in providing the impetus for the development of the sport expertise field, although on occasions the assumptions, strengths, and limitations in the use of these imported paradigms were probably not fully appreciated (Abernethy et al. 1993, 1994).
For example, the verbal self-report method of data collection used within most of the knowledge-based approaches to cognitive expertise may have reduced veracity when used with sports experts. In sport settings, direct, conscious access to the mechanisms controlling highly automated, rapidly performed motor acts may simply not be possible, and any data derived from verbal self-report may need to be treated with caution.

The scholarly study of sports expertise has gained considerable momentum since the beginning of the 1980s. A number of factors have contributed to this. These include the growing interest in expert systems within cognitive science, the development of improved technology for the in situ experimental study of sport tasks, and a rejuvenated interest in learning as a pivotal issue within the broader motor-control field. Two additional factors are particularly noteworthy. The first is the emergence of a theoretical perspective (ecological psychology) which challenges many of the fundamental premises of cognitive psychology. Ecological psychology is influenced heavily by the work of James Gibson on perception (e.g., Gibson 1979), and by Nicholas Bernstein on action (e.g., Bernstein 1967). Its orientation toward functional couplings between perception and action places great emphasis on the study of natural actions as a central vehicle toward understanding skill learning, and such an orientation is compatible with a focus on sport tasks, and experts within such tasks (Abernethy et al. 1994). The second noteworthy factor is the renewed interest in the developmental bases and social support mechanisms for expert performance, a topic first presented in the work of Benjamin Bloom (Bloom 1985). The effective fusion of qualitative, social approaches to expertise development of the type developed by Bloom with more traditional, quantitative, empirical studies of experts represents one of the key challenges for sport expertise in the modern era.
3. Current Emphases in Sport Expertise Research

Many of the distinguishing characteristics of expert athletes surmised by both early skill-acquisition theorists and lay commentators have now been confirmed empirically (Abernethy et al. 1998). In comparison to less skilled individuals, exceptional athletes are demonstrably advantaged on a range of perceptual, cognitive, and motor parameters. Perceptually, they are faster and more accurate in recognizing patterns from their sport, superior in terms of both their speed and accuracy in anticipating the actions of an opponent, and better able to pick up essential biomechanical information from observation of the movement patterns of others. Cognitively, they have superior knowledge of both factual (declarative) and procedural matters related to their sport, their knowledge is organized in a deeper, more richly structured form, they possess superior knowledge of situational probabilities for their sport, and they are better able to plan their own actions in advance. In terms of movement per se, expert athletes produce movements with greater consistency and adaptability, greater efficiency and effectiveness, and greater speed and accuracy than can less skilled performers. In addition, expert athletes have also been shown to be able to produce movement more automatically, with reduced attention demand and yet with superior ongoing monitoring, than can less skilled individuals. An outstanding feature of expert performance is therefore the capacity to satisfy simultaneously competing task demands (such as speed versus accuracy, consistency versus adaptability, or attention versus monitoring) in a manner that optimizes task performance.

Across each of the elements of expert sports performance, the challenge for researchers is to move beyond the existing descriptive knowledge about expertise toward the more fundamental goals of explaining and understanding how this expert advantage develops and is achieved. This challenge is being taken up at the time of writing in at least three major areas of focus for contemporary sports expertise research.

One major current research focus is the determination of the limiting factors to sports performance. While much is now known about factors that differentiate reliably the expert athlete from everyone else, an equally important body of knowledge is that accrued on factors which do not appear to be important discriminators. The accumulating evidence showing the inability of generalized perceptual, cognitive, and motor tests (such as simple visual tests, and generic measures of reaction time) to differentiate experts from others suggests that the attributes being measured by these tests are not the limiting factors to sports performance (Abernethy et al. 1998). Approaches which seek to improve sports skill by training these attributes are unlikely to be effective. The evidence on other aspects, such as whether or not experts and novices differ systematically in their sport-specific eye movement patterns, is less clear-cut, leaving issues as to the efficacy of training learners to model the scan patterns of experts unresolved at this time (Williams et al. 1998).

A second major current research focus is on the determination of possible differences in information usage and dependence between experts and novices. Determining the specific information (or cues) used by experts and novices in sport tasks is a challenging undertaking. To date, the main approaches that have been used involve selective temporal and/or spatial masking of potential information sources, manipulation of kinematic display information using computer digitization, and the recording of eye movement (or visual search) patterns.
Such approaches indicate some systematic differences in the specific cues used by experts and less skilled people: experts, in time-constrained sports at least, are able to pick up information from early-occurring cues that novices cannot utilize. More precise determination of the specific information sources guiding the selection and production of movement by experts in particular sport settings may, in the future, provide a more principled basis for training, and for instructional and feedback augmentation, than is currently possible.

A third major contemporary research focus is on the critical question of how learning to be an expert takes place. The approaches that describe sports expertise in terms of component skills and limiting factors, and in terms of differences in information usage, provide a valuable starting point for knowing the ‘end product’ differences between experts and novices; but they are silent with respect to the types and amount of practice, the critical role models, developmental experiences and social support structures that are necessary (if not sufficient) for expertise to be achieved. One perspective, advanced by Anders Ericsson and others (e.g., Ericsson et al. 1993), is that the sheer quantity of intense, goal-directed (deliberate) practice is the single best predictor of the attainment of expertise. This perspective is stimulating significant research activity as others attempt to test the validity and generality of this proposition across sports and attempt to determine the role played by the type of practice and by other contextual factors (such as parental support levels and relative age in first entering organized sport) in the development of expertise (e.g., see Singer and Janelle 1999, Starkes et al. 1996). An emerging view, though as yet supported by only a modest evidence base, is that relatively early exposure to competition with adults, opportunities to model and learn by watching from an early age, and the existence of broad diversity in practice conditions in the formative years, especially through settings that encourage creativity and experimentation without excess emphasis on outcomes, may all be important environmental ingredients for the nurturing of expert performers.
4. Methodological Issues and Challenges

Studying sports experts and understanding the basis of sports expertise poses some unique, and difficult, methodological challenges (Abernethy et al. 1993). Exceptional performers in any particular sport are, by definition, relatively few in number. A consequence of this is that many, if not most, studies of sports expertise suffer from either limited statistical power (because of the difficulty in assembling a large sample of genuine experts), or a dilution of the degree of ‘expertise’ within the expert group. With considerable variability evident in the stringency with which experts are defined, it is not uncommon for the expert sample in one study to qualify as the less skilled control sample in another. This absence of agreed, universal criteria or even convention as to what minimally constitutes an expert clearly works against the development of a consolidated knowledge base on sports expertise.

A further, vexing methodological problem relates to the dissociation of expertise effects from effects due not to expertise per se but simply to experience. In the majority of the existing studies on sport expertise, key characteristics of the expert group are compared directly with those of a control group, usually consisting of task novices. As a consequence, the groups being compared differ not only in terms of their level of task expertise but also in their level of task experience. It is frequently difficult, therefore, to ascertain whether any observed differences are related genuinely to expertise or are simply a product of different levels of exposure to the task. Using additional groups differing in performance level but matched in terms of years of playing experience can go some way toward addressing this issue, although even with such controls the quality and quantity of experience are so tightly interrelated with skill level that the confounding effect of experience cannot be completely disentangled from the issues of expertise.

Sport expertise is clearly not unidimensional, yet many of the prevailing methodological approaches used in an attempt to understand sport expertise have been unidimensional. As sport tasks provide the potential for minor deficiencies in some skill components to be offset by strengths in others, a raft of dependent measures tapping different facets of performance, rather than unitary measures, is needed to capture fully the essence and individual variability that go toward the making of an expert. More consistent use of multiple dependent measures is to be expected in future research endeavors in this field.

Unlike many other fields of study, where the key variable(s) of interest can be mapped effectively over relatively short time periods, the cumulative and uncertain nature of the development of expertise makes prospective studies, desirable as they may be, logistically very difficult. The accumulation of data on the lifespan development of experts therefore relies heavily on retrospective self-reports by athletes, yet such methods need considerable confirmation from other sources to avoid potential biases and inaccuracies. With the growing emphasis on mapping of the individual developmental histories of athletes, a key challenge for the future is to continue to modify and improve the reliability and validity of the methods used for acquiring and comparing lifespan data.
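The statistical power problem raised at the start of this section can be illustrated numerically. The following is a minimal sketch, not a method taken from the cited studies: it assumes a two-sided, two-sample z-test under a normal approximation, and the effect size (0.8 standard deviations) and group sizes are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the (hypothetical) standardized expert-novice
    difference; n_per_group is the number of athletes in each group.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical group sizes, from a handful of elite athletes upward.
    for n in (5, 10, 20, 50):
        print(n, round(approx_power(0.8, n), 2))
```

Under these assumptions, power rises steeply with group size, which is why small samples of genuine experts constrain what a single study can detect.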
5. Future Directions

The major theoretical challenge for the sport expertise field is to understand, in a much more coherent way than is the case at the time of writing, the mechanisms responsible for the observable expert–novice differences in overall performance and in specific components of performance. Closer linkage of the research thrust in sports expertise with the present parallel theory development in motor control and learning, and in expert systems in cognitive science, appears imperative. The major challenge from an applied psychology perspective is to translate the knowledge being acquired about sport expertise into useful practical directions for the guidance of sports training and instruction. Understanding more about the respective roles of implicit and explicit learning in expert performance, and utilizing fully the opportunities created through the new technologies (see Simulation and Training in Work Settings; Virtual Reality, Psychology of), will undoubtedly become increasingly critical in the future.

See also: Developmental Sport Psychology; Expert Systems in Cognitive Science; Expertise, Acquisition of; Sport and Exercise Psychology: Overview; Sport, Anthropology of; Sport, Geography of; Sport, History of; Sport Psychology: Training and Supervision
Bibliography

Abernethy B, Burgess-Limerick R J, Parks S 1994 Contrasting approaches to the study of motor expertise. Quest 46: 186–98
Abernethy B, Thomas K T, Thomas J R 1993 Strategies for improving understanding of motor expertise. In: Starkes J L, Allard F (eds.) Cognitive Issues in Motor Expertise. Elsevier, Amsterdam, The Netherlands
Abernethy B, Wann J P, Parks S 1998 Training perceptual-motor skills for sport. In: Elliott B C (ed.) Training in Sport: Applying Sport Science. Wiley, Chichester, UK
Anderson J R 1982 Acquisition of cognitive skill. Psychological Review 89: 369–406
Bernstein N 1967 The Co-ordination and Regulation of Movements. Pergamon Press, Oxford
Bloom B S 1985 Developing Talent in Young People. Ballantine, New York
Chase W G, Simon H A 1973 The mind’s eye in chess. In: Chase W G (ed.) Visual Information Processing. Academic Press, New York
Chi M T H 1981 Knowledge development and memory performance. In: Friedman M P, Das J P, O’Connor N (eds.) Intelligence and Learning. Plenum Press, New York
de Groot A D 1966 Perception and memory versus thought. In: Kleinmuntz B (ed.) Problem Solving Research, Methods and Theory. Wiley, New York
Ericsson K A, Krampe R T, Tesch-Römer C 1993 The role of deliberate practice in the acquisition of expert performance. Psychological Review 100: 363–406
Gibson J J 1979 The Ecological Approach to Visual Perception. Houghton-Mifflin, Boston
Singer R N, Janelle C M 1999 Determining sport expertise: From genes to supremes. International Journal of Sport Psychology 30: 117–50
Starkes J L, Deakin J M, Allard F, Hodges N, Hayes A 1996 Deliberate practice in sports: What is it anyway? In: Ericsson K A (ed.) The Road to Excellence: The Acquisition of Expert Performance in the Arts and Sciences, Sports and Games. Erlbaum, Mahwah, NJ
Thomas K T, Thomas J R 1994 Developing expertise in sport: The relation of knowledge and performance. International Journal of Sport Psychology 25: 295–312
Williams A M, Davids K, Williams J 1998 Visual Perception and Action in Sport. E & F N Spon, London
B. Abernethy
Sports, Economics of

1. Introduction

Sport is organized among individuals and among teams, and is professional or amateur. Most of the interesting questions on the economics of sport relate to professional team sport, which is the focus here. For most of their histories, and in some leagues until very recently, players have been restricted from moving between clubs within a league. Club owners have claimed that reservation of players or the imposition of transfer fees was necessary to maintain the financial soundness of small clubs and to preserve competitive balance within leagues. In this article, pay and performance under player reservation and under free agency are evaluated, along with issues relating to competitive balance. Under player reservation, player pay is less than the players’ contribution to club incremental revenue. Under free agency, the players, ultimately, receive all of the incremental revenue. It is shown that the intrateam salary structure, with its wide pay differential between starting and backup players, is rational and wealth-maximizing. Competitive balance is also value-maximizing. The effects of free agency, revenue-sharing among clubs, team salary caps, and syndicate ownership on competitive balance are evaluated.
2. Players’ Market

2.1 Player Achievement: The Hot Hand and Sequences

In part, the appeal of professional team sport is the extraordinary playing ability of the athletes. This is demonstrated repeatedly on television, and commented on widely. Such feats are value-maximizing for the leagues, since they promote interest in the sport. A case can be made that baseball was resuscitated by the record-breaking home run duel between Mark McGwire and Sammy Sosa in the 1998 season. While there are many ways of measuring player achievement, none excites more discussion than ‘hot hands.’ Mere mortals seek a causative explanation for the player who achieves a long sequence. But the hot hand is a myth, and the observed runs are no different from what one expects from a (possibly unbalanced) coin toss (successive heads in a row, when the probability of a head is some fixed number p). Even premier players in basketball do not make much more than half of their field goals. In the course of a season, such players would be expected to have one run of seven made shots in a row, and in the course of a career a sequence of 10 in a row. Naturally, players with higher shooting averages have longer sequences. Measured over the 50-odd year history of the National Basketball Association, the coin toss model predicts that at least one 0.5 shooter will make 16 shots in a row. Research on sequences in sports (for basketball, see Gilovich et al. 1985, Camerer 1989) reveals only one exception to the coin toss model (Gould 1991). In 1941, Joe DiMaggio hit safely in 56 consecutive games. He hit 0.357 that year, and the coin toss model predicts that by chance one would see a sequence of that length once in 10 trillion trillion games. What an achievement!

2.2 Player Pay under Player Reservation and under Free Agency

Prior to the mid-1970s, baseball and basketball players could not negotiate freely with more than one team, and this restriction applied in football and in hockey until more recent times. In European football (soccer), until 1995, player movement was restricted by the substantial transfer fee that had to be paid to the club owning the player’s contract. The result of player reservation in professional team sport was that players were exploited economically by the owners in the sense that they were paid less than the incremental revenue that their performance brought to the club (Scully 1974). The athletes in professional sports were by no means poor, since they were paid on average several times a working man’s wage and the best of them received perhaps 10 times the pay of the less gifted performers. But their pay was not equal to what they could have earned if they had been free to offer their talent to the highest bidding club.

Veteran free agency has led to an enormous explosion in player salaries and in the share of club revenue captured by the players. Average player salaries are 50 or more times their level in the late 1960s, and player payroll costs are three or more times their share of club revenues (Scully 1989, 1995). While a superstar player then could expect to earn 10 times or so the pay of a rookie player, now a factor of 50 or more is not extraordinary. On the whole, then, the introduction of veteran free agency has eliminated monopsonistic exploitation of players in professional team sports. Nevertheless, pockets of exploitation exist, particularly among promising rookies, journeymen players, and first-round draft choices from the amateur ranks. This is so because there is not complete freedom in the players’ market, such as would occur with teams competing for all players by offering one-year or multiyear contracts for their services. New entrants into the professional ranks, if they survive the intense competition for their positions, must wait several years before they are free to negotiate with other clubs.
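Returning briefly to the coin toss model of Sect. 2.1, the chance of seeing a run of a given length can be computed exactly. The following is a minimal sketch, assuming independent attempts with a fixed success probability p (the coin toss model itself); the function name and the number of attempts per season used in the example call are hypothetical and are not figures from the cited studies.

```python
def prob_run_at_least(n_trials: int, run_length: int, p: float) -> float:
    """Probability of at least one run of `run_length` consecutive
    successes in `n_trials` independent trials with success chance `p`."""
    # state[j]: probability that no qualifying run has occurred so far
    # and the current streak of successes has length j.
    state = [0.0] * run_length
    state[0] = 1.0
    for _ in range(n_trials):
        new_state = [0.0] * run_length
        for streak, prob in enumerate(state):
            new_state[0] += prob * (1 - p)  # a miss resets the streak
            if streak + 1 < run_length:
                new_state[streak + 1] += prob * p  # a make extends it
            # Otherwise the run is completed and this probability mass
            # leaves the "no run yet" states for good.
        state = new_state
    return 1.0 - sum(state)


if __name__ == "__main__":
    # Hypothetical: a 0.5 shooter taking 1,000 shots in a season.
    for k in (7, 10, 16):
        print(k, prob_run_at_least(1000, k, 0.5))
```

The calculation makes no appeal to a hot hand: long runs arise from chance alone once the number of attempts is large, and they become more likely as p rises.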
2.3 Pay Structure in Professional Team Sports

Payment for player performance is not a piece rate: a player who is twice as good as another, measured by some performance metric, does not receive twice the pay. Rather, relatively small differences in player performance often are associated with large differences in player compensation. The pay structure in professional team sport is hierarchical, as in a rank-order tournament (Lazear and Rosen 1981). In professional sports, owners of clubs or sponsors of events, such as in tennis and golf, want all athletes to develop their skills to their maximum potential and to expend maximum effort per contest. This desire is profit motivated: contests among highly skilled players who expend the maximum effort at winning are more valuable, and it is costly to monitor effort (prevent shirking). These facts imply that there will exist an optimal hierarchy of payments (an optimal prize structure in golf and tennis, an optimal intraclub salary structure in professional team sport) that aligns player behavior (investments in developing playing skills and expending effort) with the owner’s or sponsor’s objective function. Individual sports, such as tennis and golf, are rank-order tournaments where the reward structure is predetermined and a competitor achieves the next higher payment by advancing one position in the rank of competitors. In professional team sports, there is a team rank-order tournament in which players compete for starting positions that are rewarded greatly and backup positions that pay a great deal less (Porter and Scully 1996). Starting players are always threatened by the performance of the lower-ranked backup players. This threat makes the performance of the starter like a serially repeated contest in the individual sports.

2.4 Pay and Performance Profiles

Player careers share several features. Most players enter the professional ranks as untested rookies, having spent a number of years in the minor leagues or in the amateur ranks. During that training period, pay is low. Players who get a contract with a professional team often play infrequently at first and may be traded to another club. Only if their play is judged superior will they play frequently. Player performance tends to improve over a certain portion of the playing career, reaches a peak, and then declines as age takes its toll on reflexes and physical ability. The rise in performance may be due in part to learning by doing, but mainly it is due to player and club investments in honing playing skills. The age at which the decline in performance sets in, and perhaps its rate, is related to player investments in skill and conditioning. Thus, the life cycle of performance and that of player pay are the result of a team rank-order tournament.
Players invest endogenously in developing their skills, which determines their chance of securing a starting position. Under free agency, pay for starters is considerably more than for irregular players. The high rewards for starting players imply that every effort (training, conditioning, diet) will be made to maintain peak or near-peak performance for as long as possible.

Empirical evidence on the life cycle of performance and pay is measured most readily for baseball players (Scully 1995). During the monopsony era (sample period 1964–75), it was found that the peak in baseball player performance occurred at 1,596 games (about 10.6 years of play) and that the peak in salary occurred at 2,328 games (15.5 years). During the post-free agency era (sample period 1977–92), the peak in performance occurred much earlier (1,347 games or 9 years), and the peak in pay a great deal earlier (1,523 games or 10.2 years). The sharp increase in the hierarchy of earnings brought about by veteran free agency induced greater player investment in developing playing skills more rapidly. As such, baseball players reach their maximum performance potential about 1.5 years earlier under free agency. Moreover, a comparison of the profiles of performance and pay in the monopsony and the free agency periods shows the following. Player productivity exceeds player pay up until 1,550 games (10.3 years) in the monopsony period, which is an indication of exploitation. In the free agency period, player pay exceeds player performance after 547 games (3.6 years).
3. Competitive Balance

3.1 Competitive Balance within Professional Team Sports Leagues

An important dimension of playing quality is the relative strengths of the teams on the playing field (competitive balance). Fans want their favorite team to win under uncertainty. If the top club always beat the next ranked club, and so on, the games would be exhibitions, not contests, and they would have little market value. Some degree of uncertainty of outcome is necessary in sports, and this uncertainty is greater the more equal the playing strengths of the competitors. Greater equality of playing strengths among the clubs is wealth maximizing. Attendance at games and viewership on television are higher the closer the standings of the clubs are over a season. Competitive balance depends mainly on the comparative financial strengths of the clubs. Teams earn revenue primarily from ticket sales and from broadcast fees. The main cost is player payroll. Attendance is determined chiefly by market (city) size and the club’s record (Noll 1974). Since teams are located in cities of markedly different size, the prospect of fielding a competitive team depends in part on where the club is located, but also on how revenues are divided up among the clubs. In basketball and hockey, the home team receives all of the ticket revenue. In baseball, the gate split is about 85:15. In football, the gate division is 60:40. All national broadcast revenues are divided evenly among the clubs in all professional team sports. The difference in the formula for dividing up the gate receipts corresponds roughly to the dispersion in team financial strengths across the various leagues. The highest revenue-producing basketball team tends to have about 2.7 times the revenue of the lowest revenue-producing club, while the figure is about 2.9 times in hockey. The top club in baseball has about 2.5 times the revenue of the poorest drawing team.
In football, the top team tends to have about 70 percent more revenue than the lowest revenue-producing team. With clubs distributed in unequal size markets and with a ban on unilateral relocation of clubs to the more lucrative markets, the more unequal the division of revenue among clubs the less will be the competitive balance within a league.
There are several ways of measuring competitive balance. The most obvious is to compare the actual number of championships a club has won with the theoretical number (the number of years the club has existed divided by the number of clubs). A second is to compare the historical average win records among the clubs. A third, and more precise, way is to compute the standard deviation among the win percentages of the clubs (Scully 1989). To make comparisons of competitive balance among the leagues, differences in the number of games played in a season need to be taken into account, since the standard deviation of the win percent is inversely related to the number of games. Competitive balance is measured as the ratio of the actual standard deviation to the theoretical standard deviation under equality of playing strengths ($0.5/\sqrt{g}$, where $g$ is the number of games). Using this procedure, Quirk and Fort (1992) found for the period 1980–90 that football had the greatest competitive balance with a ratio ($\sigma/(0.5/\sqrt{g})$) of 1.53, while basketball had the least competitive balance with a ratio of 2.73. Baseball, with a ratio of 1.67, and hockey, with 1.91, had relative playing strengths between football and basketball. Measured over the history of these leagues, the rankings of competitive balance remain unchanged. Football has always been the most competitively balanced, and basketball the least. Moreover, since the 1960s competitive balance has improved significantly in baseball and somewhat in football, but has remained substantially unchanged in basketball and hockey. Nevertheless, since the actual standard deviation of win percents greatly exceeds the theoretical standard deviation, one cannot conclude that competitive balance exists in any professional team sports league. With the advent of veteran free agency, owners claimed that competitive balance would be destroyed.
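The ratio measure described above can be sketched in a few lines. This is a minimal illustration, not the procedure used by Scully or by Quirk and Fort: the function names, the four-club league, and the choice of the population (rather than sample) standard deviation are all assumptions made for the example.

```python
import math
from statistics import pstdev


def idealized_std(games_per_season):
    """Standard deviation of win percent when every game is a coin flip: 0.5 / sqrt(g)."""
    return 0.5 / math.sqrt(games_per_season)


def balance_ratio(win_percents, games_per_season):
    """Actual standard deviation of win percents divided by the idealized one.

    A ratio near 1 indicates a balanced league; larger values indicate
    less balance. Using the population standard deviation is an assumption.
    """
    return pstdev(win_percents) / idealized_std(games_per_season)


# Hypothetical four-club league playing a 16-game season (illustrative only).
records = [0.750, 0.500, 0.500, 0.250]
ratio = balance_ratio(records, 16)
print(f"idealized sd = {idealized_std(16):.3f}, ratio = {ratio:.2f}")
```

Because the idealized standard deviation shrinks with the square root of the number of games, dividing by it is what makes a 16-game league comparable with a 162-game league.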
There is no difference in competitive balance in baseball in the period of veteran free agency compared to the player reservation period. While the evidence is less clear cut, because freedom of player movement proceeded in steps in basketball, competitive balance has not changed in basketball in the era of freedom of player movement.

3.2 Freedom of Player Movement and Competitive Balance

Club owners, particularly in American baseball and European football, have always claimed that restrictions on player transfers are necessary to ensure competitive playing balance within a league and to protect the investments in the smaller city franchises. Economists, since Rottenberg (1956), have been very skeptical of this claim and there is no empirical evidence to support it. While it is true that there is a tendency for the best players to migrate to the larger city franchises, freedom of player movement or restrictions on player-initiated movement has no bearing on
Figure 1 Freedom of player movement and competitive balance
the distribution of playing talent within a league (El-Hodiri and Quirk 1971). Under free agency, players will migrate to the best paying clubs. The franchises that pay the most are those that expect the largest incremental revenue from those players’ performances. Since an increment in the win percent raises revenue more in a large city than in a small city, the best players tend to migrate to the big city franchises. At issue is whether, in a market where player-initiated movement is forbidden (player reservation, transfer fees), a small city club would keep a star player that it had acquired through a draft or an assignment. On the roster of that club, the player is worth, say, $1 million in incremental revenue. On a large city franchise, that player’s performance is worth, say, $3 million. Without a scrupulously enforced league ban on cash sales of player contracts, the franchise in the small city has an incentive to sell the player’s contract to the big city club, and capture some portion of that $2 million difference in incremental value. Therefore, players are allocated across clubs based on their highest incremental revenue. Whether the player or the club owns the contract (property right) is irrelevant to the allocation of playing talent within a league. That the efficient allocation of a resource is independent of who owns the property right is, of course, the famous Coase (1960) invariance theorem, but the essence of that theorem was discovered independently by Rottenberg (1956) in his earlier paper on the baseball players’ market. While the distribution of playing talent is independent of the degree of freedom in the market for players, the distribution of the rents between owners and players is not. The players obtain a much higher share of the rents arising from their scarce talent under free agency than would be so under league restrictions on player-initiated transfers.
The invariance of the allocation of playing talent to the degree of freedom in the players’ market is illustrated in Fig. 1 for a two-club league. On the
vertical axis is club revenue and player cost per point in the win percent. On the assumption that revenue per point in the win percent declines as the win percent increases, marginal revenue (MR) declines with the win percent. MC is the cost per point in the win percent under free agency. Since top-playing talent is rarer than mediocre talent, its cost per unit is higher. Equilibrium, in the sense that both clubs maximize profits, occurs at E(1) for the large city club (designated as team 1) with its higher marginal revenue, and where marginal revenue equals the marginal cost of playing talent ($MR(1) = MC$), and at E(2) for the small city club (team 2), where $MR(2) = MC$. Thus, in equilibrium, $MR(1) = MR(2) = MC$. With $MR(1) > MR(2)$ at any given win percent, the profit-maximizing distribution of playing talent is such that team 1 has a win record of 0.600 and team 2 has 0.400. Now, assume that a reverse order of finish amateur draft, under which the club with the worst record gets to draft first, leaves each team with equal talent levels (each club will have a 0.500 record). Under this condition, the market is in disequilibrium. For club 1 (cut point G), at a 0.500 win percent, marginal revenue exceeds marginal cost (the triangle GDE(1) represents potential profit), while, for club 2 (cut point D), marginal cost exceeds marginal revenue (the triangle CDE(2) represents a loss). Under free agency, club 1 will bid away playing talent from club 2 to the point at which a profit-maximizing equilibrium is reestablished at E(1) and E(2), with club records of 0.600 and 0.400. Thus, under free agency, an initial equalizing redistribution of playing talent through a reverse order of finish drafting mechanism will be ineffective in maintaining equality of play within a league. The better players will migrate to the better paying, large city clubs, where their marginal revenue is higher.
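The two-club equilibrium can be reproduced numerically under a simple assumed functional form. The sketch below is hypothetical: it takes linear marginal revenue curves MR_i(w) = a_i - b·w with intercepts chosen so that the large city club ends up at 0.600, and it treats the common marginal revenue at equilibrium as the market price of talent. None of these numbers come from the article or its figure.

```python
def marginal_revenue(a, b, win_percent):
    """Assumed linear marginal revenue curve: MR(w) = a - b * w."""
    return a - b * win_percent


def equilibrium_win_percents(a1, a2, b):
    """Solve MR1(w1) = MR2(w2) subject to w1 + w2 = 1 (two-club league).

    a1 - b*w1 = a2 - b*(1 - w1)  =>  w1 = 0.5 + (a1 - a2) / (2*b)
    """
    w1 = 0.5 + (a1 - a2) / (2.0 * b)
    return w1, 1.0 - w1


# Hypothetical parameters: club 1 (large city) has the higher intercept.
A1, A2, B = 5.0, 4.0, 5.0
w1, w2 = equilibrium_win_percents(A1, A2, B)
price_of_talent = marginal_revenue(A1, B, w1)

# After an equalizing draft (0.500 each) the market is out of equilibrium:
mr1_at_half = marginal_revenue(A1, B, 0.5)  # above the price of talent
mr2_at_half = marginal_revenue(A2, B, 0.5)  # below the price of talent
print(w1, w2, price_of_talent, mr1_at_half, mr2_at_half)
```

With these assumed parameters the clubs settle at 0.600 and 0.400, and at an equal split club 1 values an extra point of win percent above its price while club 2 values it below, which is the incentive for talent to move described in the text.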
Under player reservation, the player comes to terms with the club that owns their contract, or does not play. This condition leads to pay less than the incremental value associated with the player’s contribution. The marginal cost per point in the win percent under monopsony is labeled $MC'$ in Fig. 1. Under monopsony, with an equal distribution of playing talent among the clubs (each club having a 0.500 record), team 2 has a level of playing talent such that marginal revenue equals marginal cost (point C in the figure), but club 1 has marginal revenue in excess of marginal cost (point G in the figure). There is more profit to be had on club 1, if it can purchase higher quality playing talent from club 2. It is in the interest of club 2 to sell that playing talent for cash, since by lowering its win percent profits increase. Equilibrium in the distribution of playing talent occurs when both clubs earn an equal profit from the distribution of playing talent (win percents). This occurs with a record of 0.400 for club 2, with $MR(2) > MC'$ by BE(2), and 0.600 for club 1, with $MR(1) > MC'$ by AE(1), and with BE(2) = AE(1). Observe that under monopsony the profit-maximizing win percents remain as they
Figure 2 Revenue sharing and competitive balance
were under free agency. The gaps AE(1) and BE(2), in the figure, represent the monopsony rent transferred from the players to the owners under a player reservation rule. Thus, under monopsony, the distribution of playing talent remains the same as under free agency, but the big city club has cross-subsidized the small city team through a cash-sale purchase of player contracts. Equalization of playing talent is not possible under free agency or under player reservation with cash sales of player contracts. The only way in which reverse-order drafting will lead to equalization of playing talent among clubs that play in different size cities is under player reservation with a ban on cash sales of player contracts. But note that under such a ban profit maximization is not achieved. Club 2 earns more profit with a win percent of 0.400 than with a record of 0.500, and club 1 earns more with a record of 0.600. Even if a formal ban on cash sales is imposed by a league, these profit opportunities are a powerful incentive to devise mechanisms for player reallocations (lopsided player trades, a player trade for compensation and a player named later) that will restore the distribution of win percents among the clubs toward their profit-maximizing distribution.

3.3 Revenue Sharing and Competitive Balance

If the degree of competitiveness in the players’ market does not affect competitive balance, will it be affected by the division of revenue among the clubs? Revenue division tends to moderate the distribution of revenue among teams within a league, but has no effect on competitive balance. The conclusion is illustrated in Fig. 2. Without revenue sharing, equilibrium is at E(1) for team 1 and E(2) for club 2. Profit maximization leads to 0.600 and 0.400 win records. Now, let revenue sharing occur such that each club has the same marginal revenue. The effect of revenue sharing is to
lower the marginal revenue for both clubs from $MR(1)$ to $MR(1)'$ for team 1 and from $MR(2)$ to $MR(2)'$ for club 2. Algebraic proof: under revenue sharing $R_i' = sR_i + (1-s)R_j$, $i = 1, 2$, $j = 1, 2$, $i \neq j$, with $s$ as the revenue share. In a two-team league, an increase in the number of games won (or, the win percent) for one club means a corresponding decrease in wins for the other club. Then, the change in revenue is $MR_i' = sMR_i - (1-s)MR_j$. The profit-maximizing equilibrium under revenue sharing is at $E(1)'$ for club 1 and $E(2)'$ for club 2, where $MR(1)' = MR(2)' = MC_d$. Note that the marginal cost of playing talent is proportionately lower under revenue division than in its absence ($MC_d < MC$). This is so because the worth of playing talent to the clubs is inversely proportional to the amount of revenue sharing. The distribution of playing strengths among the clubs is unaffected by the revenue division rule. While revenue sharing reduces the incentive of clubs to bid for scarce playing talent, it also reduces the financial incentive to win.
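The invariance result can be checked numerically. The sketch below applies the sharing rule MR_i' = s·MR_i - (1 - s)·MR_j from the text to assumed linear marginal revenue curves (hypothetical intercepts and slope, chosen so that the no-sharing equilibrium is 0.600 and 0.400) and searches for the win percent at which the two shared marginal revenues are equal.

```python
def marginal_revenue(a, b, win_percent):
    """Assumed linear marginal revenue curve: MR(w) = a - b * w."""
    return a - b * win_percent


def shared_marginal_revenue(s, mr_own, mr_other):
    """Sharing rule from the text: MR_i' = s * MR_i - (1 - s) * MR_j."""
    return s * mr_own - (1.0 - s) * mr_other


def equilibrium_w1(a1, a2, b, s, tol=1e-12):
    """Bisection on club 1's win percent until MR1' equals MR2'."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mr1 = marginal_revenue(a1, b, mid)
        mr2 = marginal_revenue(a2, b, 1.0 - mid)
        gap = shared_marginal_revenue(s, mr1, mr2) - shared_marginal_revenue(s, mr2, mr1)
        if gap > 0:      # club 1 still values talent more: it buys more wins
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical parameters; s = 1.0 means no revenue sharing.
A1, A2, B = 5.0, 4.0, 5.0
results = {}
for s in (1.0, 0.8, 0.6):
    w1 = equilibrium_w1(A1, A2, B, s)
    value = shared_marginal_revenue(
        s, marginal_revenue(A1, B, w1), marginal_revenue(A2, B, 1.0 - w1))
    results[s] = (w1, value)
    print(f"s = {s:.1f}: club 1 win percent = {w1:.3f}, value of talent = {value:.2f}")
```

Under these assumptions the win percent of club 1 stays at 0.600 for every share, because MR(1)' = MR(2)' reduces algebraically to MR(1) = MR(2), while the value of a point of win percent falls by the factor (2s - 1).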
3.4 Team Salary Limits and Competitive Balance

Since the advent of veteran free agency in professional team sport, owners have sought mechanisms to constrain the competition for playing talent that they perceive as financially ruinous. One mechanism, invented by basketball, in 1980, is the team salary cap, a device that in theory imposes a maximum salary as a percent of gross revenue (and a minimum percentage that applies to small city clubs). Football, since 1993, has a team salary cap, and baseball wants one, but has been blocked in its ambition by the players’ union. Can a salary cap equalize playing strengths within a league? Suppose that a binding salary cap is imposed that fixes player salaries at a certain percentage of league revenues. The marginal cost of playing talent falls under the salary cap from its level under free agency. If scrupulously enforced, all clubs have the same team salary costs, and if there are no information asymmetries about expected playing talent and no differences in coaching quality, all clubs will have 0.500 records. Like the solution of player reservation with a ban on cash sales of player contracts, the distribution of win percents brought about by a hard salary cap is stable, but it is not profit maximizing, nor does it maximize league revenue. A large city team can earn more profit by acquiring better playing talent and a small market club can earn more by lowering its level of playing talent. In basketball, the team salary constraint is relaxed significantly by the so-called Larry Bird exemption. Under the exemption, a team may re-sign a veteran free agent at whatever cost, and not have that player’s salary count in the salary cap. This has led to most clubs having player payrolls well in excess of the cap, and this transforms the team salary limit concept into a soft or nonbinding constraint. The Larry Bird
exemption was the main point of dispute between the basketball players and the owners, prior to the 1998–9 season. Salary caps have led, also, to creative financial arrangements such as large player signing bonuses relative to annual salary, back-end loaded long-term player salary contracts, deferred player compensation, and the shifting of revenue from team operations to other, nonteam entities. Sufficient erosion in the constraint of the salary cap, as with a player reservation system with player transfers for cash, can lead to the same distribution of playing talent as is attained with unrestricted free agency.
3.5 Syndicate Ownership and Competitive Balance

Major league soccer in the US is organized in an entirely different way than the main professional team sports leagues. The latter are composed of clubs that are privately owned and that are subject to various agreements with their respective leagues. In soccer, players contract with the league, not a club within the league, a fixed team salary is enforced, investors operate but do not own individual clubs, and all revenues are divided equally among the clubs. Thus, in theory, marginal revenue is equalized across all clubs at a win record of 0.500. Put differently, there is no extra money to be made by a club from winning a championship. Nor is there any financial penalty for finishing in the cellar. While syndication will bring about equality of playing strengths, it raises very serious questions about club independence and competitive integrity. Baseball’s early experiment with syndication failed. Interleague wars weakened many clubs and the aid from the stronger to the weaker clubs led to cross-ownership. In 1901, the surviving National League clubs formed a syndicate. Fans were not impressed, and believed the outcomes of the games to be suspect. ‘Trust’ or cross-ownership in the National League was ended in 1902.
Bibliography

Camerer C 1989 Does the basketball market believe in the ‘Hot Hand’? American Economic Review 79: 1257–61
Coase R 1960 The problem of social cost. Journal of Law and Economics 3: 1–44
El-Hodiri M, Quirk J 1971 An economic model of a professional sports league. Journal of Political Economy 79: 1302–19
Gilovich T, Vallone R, Tversky A 1985 The Hot Hand in basketball: On the misperception of random sequences. Cognitive Psychology 17: 295–314
Gould S J 1991 The Streak of Streaks. Bully for Brontosaurus: Reflections on Natural History. Norton, New York
Lazear E P, Rosen S 1981 Rank-order tournaments as optimum labor contracts. Journal of Political Economy 89: 841–64
Noll R 1974 Attendance and price setting. In: Noll R (ed.) Government and the Sports Business. Brookings, Washington, DC
Porter P K, Scully G W 1996 The distribution of earnings and the rules of the game. Southern Economic Journal 63: 149–62
Quirk J, Fort R 1992 Pay Dirt: The Business of Professional Team Sports. Princeton University Press, Princeton, NJ
Rottenberg S 1956 The baseball players’-labor market. Journal of Political Economy 64: 242–58
Scully G 1974 Pay and performance in major league baseball. American Economic Review 64: 915–30
Scully G 1989 The Business of Major League Baseball. University of Chicago Press, Chicago
Scully G 1995 The Market Structure of Sports. University of Chicago Press, Chicago
G. W. Scully
Sports Performance, Age Differences in ‘Sports,’ wrote veteran journalist Frayne (1990) in his memoir, are just ‘children’s games that seem to enchant millions of grownups.’ Frayne’s irony is fitting. Although dictionaries define sports as pleasant pastimes, to the athletes, supporters, and wider society, sport is so much more. From classical times to the modern era, sports waxed and waned in their significance for religion, militarism, business, and occupation (McIntosh 1963). Innumerable people devoted their lives to sport, and far too many lost life because of sport. At least one war was blamed on sport, and at least one war halted because of sport (Morris 1981). Nowadays, television has made sport a truly international drama, enchanting, enraging, dismaying, or even boring audiences of tens of millions. Social and behavioral scientists study sport for many reasons. Social scientists study societal influences on sports and of sports. Behavioral scientists study sports to gain knowledge about expertise in physical performance, including its developmental progression across the life span. This article examines the attributes of expert performance in different sports in relation to the aging of expert performers.
1. Expertise and Performance

1.1 Identification of Expertise

Who is an expert? A simple answer might be someone with acknowledged skill. This interpretation equates expertise with competence. Although most people are competent at skills practiced routinely in everyday life (e.g., driving), the appellation of expertise usually connotes a higher degree of exclusivity. Experts are not people who are simply competent, but those who exhibit exclusive competence consistently (Salthouse 1991). There are two types of experts. First, experts who are normatively exclusive lie at the upper extreme of
normal distributions for everyday skills (e.g., walking, running, cycling). However, normative expertise also depends upon the population providing the distribution. Running 1,500 meters in 6 minutes certainly denotes expertise within the overall adult population, falling within the top few percentile points. But compared to other runners, such a performance might be merely average, unless the runner was a male aged 80 years or a female aged 65 years, in which case the performance would approximate a world record. Most sports accommodate normative relativism by including age and gender distinctions within the competitive categories. Second, other experts have exclusive skills not possessed by the majority of the population. Only seasoned race walkers can walk 1,500 meters in 6 minutes because none but race walkers have mastered that effective but peculiar style of walking. Most race walkers, before the 1950s, were of European ancestry. Similarly, equestrians might be considered experts because few people in relatively few countries ride horses competitively. These examples illustrate cultural and historical relativism in exclusive skill, with both pedestrian and equestrian pursuits being less popular today than in previous centuries. Athletes at the upper levels of expertise are termed elite athletes. Elite athletes include those who break world records, win international championships, or compete at the highest levels within their country. Comparison of age differences in performances by elite athletes can clarify the limits to which life-span developmental processes impact on physical performance.

1.2 Paradigms for the Study of Expertise

Research on expertise gained impetus from studies of games like chess. Chase and Simon’s (1973) research with chess players of different proficiencies elucidated cognitive processes contributing to superior performance.
The same dominant paradigm of comparing people at different levels of proficiency found application in studies of occupation (e.g., surgical skill, typing, computer programming), recreation (e.g., bridge, video games), and sport (e.g., badminton, soccer). Spirduso (1980) pioneered a variant of this paradigm by comparing athletes and nonathletes at different ages to identify fast reaction time as an ability retained by athletes. Other approaches to the study of expertise derived from ability, ecological, informational, and training models (Allard 1993). These paradigms respectively seek to identify abilities common to expert performers, environmental cues that trigger skilled performance, sequencing within performance, and the limits to which expertise can be enhanced by training. A recent development is the application of compensation theory to performances by aging athletes (Stones and Kozma 1995).
Statistical modeling of age trends in athletic performance has emerged as an influential paradigm in the study of sports expertise (Moore 1975, Stones and Kozma 1986). Age-class records in individual sports such as singles rowing, swimming, track and field, and weight lifting provide invaluable sources of data about age changes in physical performance. Not only is performance measured as meticulously as in any laboratory study, but the record holders (being the best in their country, or the world, at their respective age) can reasonably be assumed to be fit and healthy, with their performance relatively unaffected by chronic disease. Although sport for veterans became organized only during the 1970s, the numbers of competitors are now very large in some sports. The biennial championship of the World Association of Veteran Athletes regularly attracts 5,000 or more track and field athletes from around the world, and the US Masters swimming organization has more than 37,000 members. It is more difficult to evaluate performances by older athletes in team sports. Changes in team tactics, the composition of a team, the overall performance by a team, and transfer to another team are all factors that can affect the contributions of individual athletes. It is difficult to disentangle such extraneous effects from those of changing skill. However, two kinds of data enable insight into the sources of age differences in performance. First, aggregated longitudinal data show age trends within a sport. Examples include Stones and Kozma (1995), who analyzed aggregated data from ice hockey, suggesting that players compensate for late-career decreases in point scoring by aggressive play (i.e., penalty minutes). Also, Langford’s (1987) research showed that home runs and batting averages in baseball peaked around the age of 27 years. Second, the ages of elite athletes can be compared across different sports.
Limitations to this kind of comparison include changing historical trends, such that ages at peak performance and retirement in the 1900s may not mirror those of today. 1.3 Components of Expertise The components of motor expertise include knowledge, skill, experience, and motivation. These components are hypothesized to change in relative importance over the course of athletes’ careers (Abernethy et al. 1993). Knowledge becomes less important after an early acquisition phase; motor skills decrease in significance, particularly in later life; experience increases in importance over the life span; motivation may decrease or reflect changing priorities (see Sports as Expertise, Psychology of ). Accumulated experience contributes to performance in two ways. First, athletes become more adept in the use of strategy. Second, experience helps athletes to choose among different modes of compensation if performance falls below expectations. Examples of
compensation include attempts to improve skill, the substitution of an alternative skill, or modification of expectations (Stones and Kozma 1995). Motivation can be differentiated into intrinsic and extrinsic components. Intrinsic motivations include desires to achieve, retain competence, and remain healthy. Extrinsic motivations include praise and payment. Most older athletes are long-time competitors who cite intrinsic benefits to health and competence as reasons for participation.

1.4 Categorization of Sports

Sports have been categorized on three dimensions other than individual vs. team sports. The first is between closed and open skills. Closed skill sports are performed in an unvarying environment, with the performance evaluated by the quality of movement patterns. Examples include individual sports such as figure skating, diving, and gymnastics, and team sports like synchronized swimming. Open skill sports are performed in a varying environment. The sources of variation include changes produced by an opponent (e.g., boxing, baseball, soccer), as well as in physical topography (golf). In contrast to closed skill sports, performance in open skill sports is evaluated by the outcome of movement rather than the motor pattern itself. The second distinction refers to physical effort, the main components of which are power and endurance. Power events fall into two categories, both of which are ‘fueled’ by anaerobic sources that are depleted rapidly. Sprinting involves a continuous expenditure of high power over a short period. Peak power events involve an explosive use of power (e.g., jumping, throwing, pole vaulting, weight lifting). Events of longer duration require endurance rather than power (e.g., distance running, race walking). Such activity draws mainly on aerobic sources that provide for relatively low power expenditures over extended periods of time. Not all sports require much physical effort.
Frayne (1990) wrote the following about baseball: ‘Except for the pitcher and catcher, nobody does much of anything for long periods but stand around and wait.’ Cricket is similar in this respect to baseball, and contrasts with sports like rugby and football that intermittently tax both power and endurance. The final distinction refers to the use of strategy. Open skill sports such as motor racing, snooker, and tennis all are high strategy sports. Low strategy activities include most closed skill sports, but also open skill sports such as darts, swimming, and archery.
2. Age Differences in Different Sports

The preceding categorization suggests that sports can be grouped into four main categories. Closed skill sports are different from other sports because performance is evaluated by the quality of movement rather than the outcome of movement. Although such sports vary in the physical effort required (e.g., divers expend less effort than gymnasts or skaters), practice is more important to performance than strategy. Open skill sports can be divided into three groups: sports requiring (a) effort rather than strategy, (b) strategy rather than effort, and (c) both strategy and effort. Differences in the ages of the elite performers are as follows.
[Figure 1 plot. Vertical axis: time relative to world record (1 to 2.5). Horizontal axis: age group in years (30–34 to 80–84). Four curves, numbered 1 to 4.]
2.1 Closed Skill Sports Athletes in closed skill sports typically attain peak performance at a young age. Examples include Fu Mingxia, who had never practiced a complete set of dives before she became part of the Chinese juniors’ diving team at age 11 years. Within 3 years she was both world champion and the youngest Olympic champion to that time. At the same 1992 Barcelona Olympics, another Chinese diver became a men’s champion at the age of 13 years. Elite gymnasts are also youthful performers, with 14-year-olds regularly appearing in elite competition. In figure skating, Tara Lipinski of the US became the Olympic champion in 1998 at age 15 years, and the youngest ever champion in an individual event. Another figure skater, Sonja Henie, previously held that distinction. Although approximately 15 percent of Olympic triple gold medal winners (i.e., winners of the same event in three successive games) competed in closed skill sports, few such athletes participate in elite competition beyond age 30 years. Many elite skaters become professional entertainers during retirement. Many gymnasts, including former Olympic champion Olga Korbut, become coaches. These examples illustrate compensation, wherein athletes substitute skills related to those formerly practiced. 2.2 Open Skill Sports Requiring Effort Rather than Strategy Although swimmers may break world records in their teens, the recent history of track and field overturns traditional beliefs that athletes in power events both peak and fade at earlier ages than athletes in endurance events (Spirduso 1995). With the exception of the race walker Wan Yan, the youngest and oldest world record holders and Olympic champions all competed in power events. Statistical modeling of age differences among veteran athletes suggests that performance decreases proportionate to the demands on power and endurance (Stones and Kozma 1986, 1996). Figure 1 illustrates these trends with swimming records. 
Figure 1 is based on 1999 US Masters swimming records at different ages expressed as a ratio of 1999 world records. The ratios were combined for males and females in the 50 m and 200 m backstroke and
(4) (3)
Backstroke 50 m (2) Butterfly 50 m (1)
Backstroke 200 m Butterfly 200 m
Figure 1 Age group records relative to world records in 50 m and 200 m backstroke and butterfly swimming events
butterfly. The butterfly requires more power than the backstroke (Astrand and Rodahl 1977), and the 200 m event taxes endurance more than the 50 m. The findings show performance times to increase exponentially with age, with greater age differences in events requiring higher power (butterfly) and more endurance (200 m). Comparable trends can be obtained from analysis of track and field records (Spirduso 1995, Stones and Kozma 1996).
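The ratio measure and the exponential age trend described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the article: the ratio is read as age-group record time divided by world record time, the trend is fitted as a straight line on the logarithm of that ratio, and the times are hypothetical placeholders, not the 1999 records behind Figure 1. The function names `time_ratio` and `fit_exponential` are likewise illustrative.

```python
import math


def time_ratio(age_group_time_s, world_record_s):
    """Age-group record time divided by the world record time (assumed definition)."""
    return age_group_time_s / world_record_s


def fit_exponential(ages, ratios):
    """Least-squares fit of ratio = a * exp(b * age), as a line on log(ratio)."""
    n = len(ages)
    logs = [math.log(r) for r in ratios]
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    b = sxy / sxx                      # growth rate per year of age
    a = math.exp(mean_y - b * mean_x)  # fitted ratio extrapolated to age 0
    return a, b


# Hypothetical times in seconds, for illustration only (NOT real records).
WORLD_RECORD_S = 25.0
AGE_GROUP_TIMES_S = {30: 27.0, 40: 28.5, 50: 30.5, 60: 33.5, 70: 38.0}

ages = sorted(AGE_GROUP_TIMES_S)
ratios = [time_ratio(AGE_GROUP_TIMES_S[age], WORLD_RECORD_S) for age in ages]
a, b = fit_exponential(ages, ratios)
print(f"ratio ~ {a:.3f} * exp({b:.4f} * age)")
```

Under these assumptions, a larger fitted `b` for one event than for another would correspond to the steeper age differences the text reports for the butterfly and for the 200 m distance.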
2.3 Open Skill Sports Requiring Strategy Rather than Effort

Archery, baseball, cricket, equestrianism, golf, fencing, motor racing, and yachting all belong to this category. Although the athletes may earn elite status in their teens, their careers include the longest in any sport. Archer Galen Spencer won an Olympic gold medal at age 64 years. The youngest and oldest major league baseball players in the twentieth century were respectively aged 15 and 59 years. Seven Pakistani cricketers represented their country at 16 years or younger, and four cricketers from various countries competed internationally when aged over 50 years. In equestrian sport, the age range for the youngest and oldest qualifiers for the National Finals Rodeo in the US is the same as for baseball, and the career of Lester Piggott (England’s most famous jockey) spanned 47 years (i.e., from his first win, at age 12 years, to his retirement at age 59 years). Ivan Osiier competed in Olympic fencing competitions over a 40-year span. The youngest and oldest golfers who won a national championship (Sri Lanka) were aged 12 and 54 years, respectively. Although the youngest Grand Prix motor racing winners were in their twenties, the oldest winners and competitors were aged in their fifties. Magnus Konow competed in Olympic yachting over a 40-year period. Experience undoubtedly contributes
to length of careers in these sports, with the practiced motor skills being resistant to age-related physical decline.

2.4 Open Skill Sports Requiring Strategy and Effort

Such sports include basketball, various forms of football (e.g., soccer), ice hockey, and tennis. The youngest players to compete at elite levels are in their mid-to-late teens, and the oldest generally in their forties. Examples include elite basketball players (youngest, 18 years; oldest, 43 years), Wimbledon tennis champions (youngest, 15 years; oldest, 43 years), and international soccer players (youngest, 17 years; oldest, 42 years). Within domestic soccer, Stones and Kozma (1995) reported that more than half of the oldest professional soccer players in the top European nations were goalkeepers, a position that requires the least physical effort. This finding suggests that careers may be longer if the demands on effort are lower. Very few athletes in this category compete at elite levels in their fifties, though notable exceptions include Sir Stanley Matthews in soccer and Gordie Howe in ice hockey.
3. Conclusions

Aging lowers the potential to perform at elite levels in sports that require power, endurance, and closed patterns of skill. Aging has less effect on performance in high strategy sports, and particularly those that require low effort.

See also: Aging, Theories of; Developmental Sport Psychology; Intelligence, Prior Knowledge, and Learning; Sport and Exercise Psychology: Overview; Sport Psychology: Performance Enhancement; Sports as Expertise, Psychology of
Bibliography

Abernethy B, Thomas K, Thomas J T 1993 Strategies for improving understanding of motor expertise (or mistakes we have made and things we have learned!). In: Starkes J L, Allard F (eds.) Cognitive Issues in Motor Expertise. Elsevier, North-Holland
Allard F 1993 Cognition, expertise, and motor performance. In: Starkes J L, Allard F (eds.) Cognitive Issues in Motor Expertise. Elsevier, North-Holland
Astrand P-O, Rodahl K 1977 Textbook of Work Physiology: Physiological Bases of Exercise, 2nd edn. McGraw-Hill, New York
Chase W G, Simon H A 1973 The mind’s eye in chess. In: Chase W G (ed.) Symposium on Cognition (8th; 1972; Carnegie-Mellon University): Visual Information Processing. Academic Press, New York
Frayne T 1990 The Tales of an Athletic Supporter: A Memoir. McClelland & Stewart, Toronto
Langford W M 1987 Legends of Baseball: An Oral History of the Game’s Golden Age. Diamond Communications, South Bend, IN
McIntosh P C 1963 Sport in Society. Watts, London
Moore D H 1975 A study of age group track and field records to relate age and running speed. Nature 253: 264–5
Morris D 1981 The Soccer Tribe. Cape, London
Salthouse T A 1991 Expertise as the circumvention of human processing limitations. In: Ericsson K A, Smith J (eds.) Toward a General Theory of Expertise. Cambridge University Press, Cambridge, UK
Spirduso W W 1980 Physical fitness, aging, and psychomotor speed: A review. Journal of Gerontology 35: 850–65
Spirduso W W 1995 Physical Dimensions of Aging. Human Kinetics, Champaign, IL
Stones M J, Kozma A 1986 Age trends in maximal physical performance: Comparison and evaluation of models. Experimental Aging Research 12: 207–15
Stones M J, Kozma A 1995 Compensation in aging athletes. In: Dixon R, Bäckman L (eds.) Compensating for Psychological Deficits and Declines: Managing Losses and Promoting Gains. Lawrence Erlbaum, Mahwah, NJ
Stones M J, Kozma A 1996 Activity, exercise and behavior. In: Birren J, Schaie K W (eds.) Handbook of the Psychology of Aging, 4th edn. Academic Press, San Diego, CA
M. J. Stones, J. Farrell, and J. Taylor
Sports Performance, Self-regulation of

Self-regulation refers to a regulation of the psychological state of an individual by the individual her- or himself. Self-regulation becomes necessary when internal or external barriers threaten the efficient realization of an intended action. Self-regulation consists of auxiliary processes that are based on metaknowledge (metacognitive or metamotivational knowledge). These auxiliary processes support basic action processes like attention, motivation, etc., if those are not sufficient for optimum goal attainment. Within the last decades it has been widely acknowledged that the so-called mental side is a decisive factor in athletic performance. As physical training has become more and more optimized for all athletes participating in major competitions, the winning edge is often furnished by an athlete’s self-regulatory ability. Of two equally talented and equally well-prepared athletes, the one with better psychological skills is likely to win. Without such skills athletes may become their own worst enemy during competition, imposing psychological barriers upon themselves. Self-regulation helps to overcome these barriers.
1. A Brief History of Sport Psychology and Self-regulation

Self-regulation has a comparatively short history within the not very long history of sport psychology. Although it is sometimes claimed that sport psychology has a history that dates back to ancient Greece, its scientific origins are not older than about 100 years. From the late nineteenth century on, the first decades of this discipline were mainly focused on the scientific study of athletic behavior. The first book with an applied perspective, The Psychology of Coaching, was published in 1926 by Griffith in the USA. During the 1960s and 1970s there was a considerable increase in the application of psychology in the field of athletics. However, self-regulation was mainly restricted to the use of relaxation techniques. At the same time self-regulation had become an evolving approach within behavior therapy. In the late 1960s, behavior therapists began to fill the black box, the organism variable, within the S-O-R model of behaviorism with cognitive processes individuals could actually use to control their own behavior (e.g., Kanfer 1970). In the late 1970s behavior therapists took the models and techniques of self-regulation developed in clinical psychology to the field of athletic behavior (e.g., Mahoney and Avener 1977). These self-regulation techniques became the core elements in ‘mental training’ or ‘psychological skill training’ for athletes which today is frequently seen as the major training content of applied sports psychology.
2. Naive Self-regulation vs. Psychological Skill Techniques

Individuals use self-regulation in everyday life, e.g., to resist the temptations of delicious food when they have committed themselves to a diet. Most of the self-regulation techniques used in daily situations are referred to as ‘naive’ because they are not derived from scientific knowledge and their effectiveness has not been systematically evaluated. Usually, naive self-regulatory techniques are highly personalized coping strategies for situations constituting a challenge or a threat for the person. These techniques have developed through the person’s experience with that situation and they are based on the person’s subjective criteria. Some of these techniques are rites that have developed and some of them are simply instances of superstitious behaviors. These ‘naive’ self-regulation techniques can have positive effects. They may, for example, reduce athletes’ fear of the uncontrollable, increase self-confidence, and induce a ‘winning feeling.’ However, because they do not systematically address disadvantageous mental performance conditions, they are often bound to fail. In contrast to the ‘naive’ self-regulation techniques, those imparted by sport psychologists are ‘a formal, structured application of psychological skill techniques to the enhancement of sport performances’ (Morris and Bull 1991, p. 4). Sport psychologists train athletes in the use of these psychological skills. The objectives of such psychological skill training are manifold. A study by Mahoney et al. (1987) found sport psychologists and athletes to agree that the most basic skills to be acquired by athletes are anxiety management, concentration, motivation, self-confidence, and mental preparation.
2.1 Self-regulation and Self-control

In general, the approach to develop self-regulation skills in athletes is based on the premise that the right psychological climate helps mobilize physiological reactions that are essential to performing at one’s best. A negative psychological climate such as feelings of frustration, fear, anger, and worry, or negative thinking in general, typically does the opposite. Loehr (1984) emphasized that neither negative nor positive psychological states just happen. Athletes can identify their own ideal performance state, and they can learn, intentionally or subconsciously, to create and maintain this state voluntarily so that their talents and physical skills thrive. This is the basic credo of self-regulation. The mental skills needed to trigger and sustain one’s ideal performance state are learned through practice just as physical skills are learned. Some gifted athletes seem to be able to develop efficient ‘naive self-regulation’ techniques, but as mentioned above, these may have their flaws. Thus, most athletes need to be taught specific mental skill techniques. Psychology has systematically studied and developed such techniques, so that they are now available in sport psychology.

Self-regulation and self-control are key terms in motivation and action. They refer especially to volitional abilities that are needed to deal with adverse conditions during the execution of an intention. Kuhl and Beckmann (1994) subsume self-regulation and self-control processes under the more general conception of action control. Action control refers to the correspondence of the actual behavior with the intended action or, more specifically, the correspondence of the actual performance with the intended performance. Under adverse performance conditions, action control processes, i.e., self-regulation and self-control, are required to achieve a high degree of such correspondence.
In general, there are three objectives to be achieved through self-regulation:
(a) positive self-related control beliefs (e.g., self-efficacy)
(b) positive self-related emotions (high self-esteem)
(c) clear-cut distinctions between self and nonself (self-discrimination).
Most self-regulation techniques that are employed in sports address the first two objectives. The third objective is a more general developmental goal. It is based on the assumption that only individuals with access to their own needs and values (i.e., their integrated self) can perform up to their potential.
There are two different routes to master adverse conditions, i.e., to cope with threats to action control. One route is the intention-driven inhibition of the counterintentional processes. The other route is the promotion of conflict-reducing processes through supporting the facilitation of intention-relevant processes. Kuhl and Beckmann (1994) refer to the former as self-control and to the latter as self-regulation. Self-control is the less sophisticated action control strategy. It is based on metacognitive and metamotivational knowledge to a much lesser degree than self-regulation. Nevertheless, in some sport disciplines self-control can actually mediate top performance. Beckmann and Kazen (1994) found that athletes who primarily use self-control excel in sport disciplines requiring a short-lived, high-energy input, like shot-putting and the sprint and jump disciplines in track and field. But self-control failed in sport events demanding energy regulation, for example, long distance running. In these disciplines athletes with good self-regulatory skills showed the best performance. Different tasks or sport disciplines require various kinds of self-regulatory abilities. In endurance sport, for example, it is important to stick to one’s competition plan. In disciplines like sprint, shot-putting, or weight-lifting, maximum energy mobilization combined with high concentration is required, but only for a few seconds. In decathlon it is crucial to mobilize energy for a specific discipline but relax and regenerate between the heats.
3. Self-regulation Techniques

Several self-regulation techniques have been developed which an athlete can acquire through mental training. There is no unified terminology for psychological skills training. Seiler (1992, p. 31) speaks of ‘a certain randomness of the classification, depending on the author’s initial conception of what is important in sport and competition and the construction method itself.’ As a result, different mental training programs stress different elements. Most programs share a focus on the following elements: activation regulation (relaxation and energization), attention regulation (concentration), self-talk (positive thinking, thought control), imagery (visualization, mental rehearsal). In the numerous mental training programs the different techniques usually blend in with one another. For example, a lot of techniques make use of images to support self-regulation. Only the major techniques of self-regulation will briefly be described in this article.
3.1 Activation Regulation

Self-regulation in sports was often equated with relaxation, especially in the early days of sport psychological interventions. The goal was to generate a state of psychovegetative functioning, which was considered optimal for the performance of a certain task. The guiding idea behind this intervention was the so-called Yerkes–Dodson law of the relationship between arousal and performance. In 1908, Yerkes and Dodson found an inverted U-shaped relationship between arousal and performance in a learning experiment. In this experiment, mice learned slowest when they were highly aroused, or when their arousal was low. Their learning was fastest when arousal was at a medium level. As a consequence of this finding generations of coaches and athletes were taught that medium arousal was optimal for performance. Because the majority of athletes show a high level of arousal before a competition, relaxation was promoted as a major technique of mental preparation for a competition. But contradictory empirical findings show that the relationship between activation and performance is more complicated than postulated in the Yerkes–Dodson law. Neiss (1988) assumed that performance impairment found in highly aroused individuals is mediated through task-irrelevant cognitions (worry) rather than through the arousal component itself. Supporting this assumption, Dienstbier (1989) reported numerous studies that show a general positive relationship between physiological arousal and performance. Beckmann and Rolstad (1997) conclude that thorough relaxation may not be the optimal preparation immediately before starting a competition. Rather, athletes should aim at increasing their precompetition activation through so-called ‘psyching up.’ But this should be combined with the self-regulation technique of thought control (avoiding negative thoughts and worries). Although relaxation techniques may not be adequate immediately prior to or during a competition, they still play an important role in psychological skills training. They are necessary as a first step in training other self-regulation techniques.
On their own, they are also an important measure for personality development (developing an even temper, the ability to ‘let go,’ supporting recovery, higher tolerance for unexpected and uncontrollable events). Relaxation methods taught in mental training are progressive muscle relaxation (PMR) and autogenic training (AT). These rather extensive relaxation programs intend to reduce bodily activation through focusing on relaxing muscles (PMR) or autosuggestive formulas like ‘Heart beats calmly and steadily’ (AT). A very basic technique is ‘controlled breathing.’ The breathing is conducted using the diaphragm rather than the chest. Athletes concentrate on ‘feeling the air enter the lungs and then slowly leave’ (Smith 1998, p. 42). ‘Controlled breathing’ has proven to be a good means to ‘let go,’ which means to get rid of disturbing, negative thoughts. An advantage of ‘controlled breathing’ compared to other relaxation techniques is that it does not necessarily reduce athletes’ readiness for competition. Consequently, it is a relaxation technique that has its place in the preparation for competition.

3.2 Attention Control

For the majority of sport disciplines, attention is crucial for performance. Attention is threatened by various sources, especially during a competition. Through distractions athletes can lose the focus which allows them to execute their tasks effectively. Therefore, the ability to refocus in the face of distractions, to manage to concentrate on the task, is a critical self-regulatory skill. Distractions come from a variety of sources, outside and inside athletes: the crowd, the competition conditions, an aircraft taking off while an athlete tries to concentrate on her high jump, thoughts about the finishing place, or thoughts about a specifically difficult and dangerous section of a downhill ski run while still gliding through an earlier section. Nideffer (1980) made techniques of attention control popular among coaches and athletes. These techniques are used to focus or refocus attention more effectively during sport performance. One of these techniques is called ‘centering’: this is a technique used in a stressful situation to regain concentration. It involves taking a deep abdominal breath, with the individual focusing attention on the breath and then back out to a positive external cue. Inducing positive affect (emotion control) is a frequently suggested aid in attention control. The advice is to focus on doing what will help keep the person positive and in control. The idea is that a strong positive focus protects against distractions. Usually a cognitive component is added: the athlete should develop specific, task-related goals that can capture attention; a focus that connects or reconnects to the task. This can be supported by images. Athletes can, for example, imagine being in an invisible force field.
The slalom skier imagines that the slalom course is inside a tunnel of glass that keeps all possible distractions outside. Also, athletes can imagine a guiding line that carries them through their competition; e.g., the downhill skier imagines a railroad track laid out along the ideal line of the downhill run. Another, more general strategy involves relaxation training, in this case as a form of personality development. Through relaxation training (PMR or AT), which is combined with other training routines on a regular basis, athletes can learn to ‘let go.’ They can learn not to carry mistakes with them and practice getting back on track quickly.
3.3 Self-talk and Thought Control

Another major route of self-regulatory skills for athletes became popular during the 1970s. These mental strategies are based on approaches in cognitive behavior modification which address the altering of ‘private’ cognitive processes. These techniques aim at stopping negative and destructive thoughts and promoting positive thoughts that support performance. Gallwey (1976) has pointed out that tennis players lose their games in their heads first. When the ‘inner dialogue’ changes from a positive, confident focus to a negative one, the game is lost. Gallwey developed the so-called ‘inner game’ technique which became popular especially in tennis and golf. The inner game addresses the inner dialogue between what Gallwey (1976) calls Self 1, the analytical problem solver, and Self 2, the intuitive and emotional self. The inner dialogue can become critical, when Self 1 primarily articulates self-doubts or anxieties and thus threatens the maintenance of positive self-related control beliefs and positive self-related emotions. Strategies either have the task-related Self 2 completely occupy attentional capacity or alter the focus of Self 1 toward more positive, supportive thoughts. The importance of having a positive inner dialogue is underlined by a study by Mahoney and Avener (1977). They found that the self-talk of US gymnasts who did not qualify for the Olympics was dominated by self-doubts and thoughts about failures. In contrast, the self-talk of those who qualified was characterized by self-confidence. A simple technique that is often employed to stop negative thoughts is directly taken from behavior therapy: stop rules. Athletes are trained to shout ‘stop’ along with, for example, imagining a large stop sign to interrupt a negative thought sequence. But stopping the negative thoughts alone is not sufficient to get back into competition. After stopping the negative thoughts, it is important to refocus on an appropriate performance-related cue or positive thought. However, this technique involves an interruption of task performance or at least a momentary loss of concentration. It is more efficient to prevent intrusive thoughts or negative thoughts from occurring beforehand. Beckmann (1993) used a thought-stopping technique to prevent negative thoughts, which has proven to effectively stop thoughts in numerous experiments in cognitive psychology. He had downhill skiers count numbers backwards beginning with 999 during a gliding section of a downhill run. It is impossible to think of anything else while counting the numbers. Thus, this technique prevented intrusive thoughts during that section. Because it is also quite easy to quit this task, it allows athletes to switch to task-relevant thoughts when approaching the next difficult section of the run.

3.4 Visualization

There are two applications of the visualization or mental imagery technique. One of these is used in
combination with relaxation techniques to induce a positive, relaxed state of mind. The athletes visualize themselves in a favorite relaxing scene. This can be a past successful performance, which will reinstate the positive affect, the feelings of competence, or in other words, the ‘winning feeling’ that was experienced then. The second application of visualization is to mentally rehearse performance. In this regard, visualization can be used to assist the learning of a new technique, continue practice during a period of injury, or to prepare for a competition. Neurophysiological data show that the visualization of a physical activity is not restricted to brain processes, but actually involves small amounts of ‘neuromuscular activity’ in the muscles that would be used if the activity were actually performed (MacKay 1981). The mental rehearsal of performance yields the best results, however, when visualization is used in combination with actual performance. Eberspächer (1995) points out that ultimately athletes should be able to imagine themselves perfectly performing a task from an inner perspective, the perspective of the actor him- or herself (this is referred to as ideomotor training in a specific sense). However, it seems necessary to develop and practice this inner perspective step by step. Otherwise, an athlete may not have sufficient control over the images that come up. In visualization that is not systematically developed, unconscious fears may intrude. For example, in their ‘naive’ visualizations slalom skiers may see themselves missing a gate that they considered critical during the inspection of the run. This would, of course, not constitute a good preparation for the competition.
To avoid this, Eberspächer (1995) suggests three steps during the acquisition of visualization skills:
(a) subvocal training: write down the course of performance and then recite this course in an inner dialogue;
(b) covered perception training: see oneself perfectly perform the activity from an outside perspective, like seeing oneself on a video (observer perspective);
(c) ideomotor training: visualize the course of performance from an inner perspective (actor perspective).
To obtain the best results, and prevent negative thoughts and fears from intruding, it is important to execute a relaxation technique before beginning with the visualization.
3.5 Goal-setting

Numerous studies have shown that goal-setting is a self-regulation technique that works (Locke and Latham 1984). Specifically, if individuals commit themselves to specific, hard, but still realistic goals, they show better performance than when they only have a ‘do your best’ orientation. This technique
supports task motivation, but also helps to focus attention on the task. It is often important to help athletes to make their goals realistic. This means that they learn to define their goals in terms of what they can actually achieve given the state of their condition and their skills. This involves differentiating goals into short-term goals (upcoming competition), mid-term goals (standing at half-time of the season), and long-term goals (standing at the end of the season, or what one ultimately wants to achieve). In this perspective, it is helpful to introduce the image of a ladder that one intends to climb up. Take one step after another (short-term goals). You cannot expect to get to the top (long-term goal) directly from the bottom. Stay positive and confident while climbing up. You may slip down from a step from time to time (which means: don’t expect to win every time), but stay oriented toward your goal, and keep climbing up steadily.
4. Conditions for the Application of Psychological Skills Training

The mental side plays an important role in athletic performance. Athletes can learn to control the mental side through the acquisition of self-regulation techniques. It should be noted that there are other psychological factors that influence performance and can be addressed through psychological interventions. Among these are, for example, communication skills (between coach and athletes, and among team members) and team spirit. The focus of the present article was on how athletes can regulate themselves in their preparation for a competition (including technique acquisition) as well as during a competition. Numerous self-regulation techniques have been developed by sport psychologists. Only a small sample of them could be described here. These self-regulation techniques are acquired through psychological skills training. It should be pointed out that the techniques cannot be learned overnight, as some athletes, coaches, and officials seem to assume. Continued training for some period of time is necessary before these techniques will actually start to work. The acquisition of the basic techniques requires at least six months of regular practice. Self-regulation techniques in sport are most effective when they do not have to be activated consciously. They work most effectively when they have become an automatically activated element in an athlete’s behavioral repertoire. Ultimately, these techniques work tacitly and smoothly without interrupting performance. They help to achieve the optimum state of flow (Csikszentmihalyi 1975). For best effectiveness, psychological skills training should be integrated into the normal training routines.
5. Future Perspectives

During the last two decades quite a number of mental skills training programs have been developed. They provide sets of self-regulation techniques that athletes can learn in order to improve their performance. Although these techniques seem to work, it has also become obvious for practitioners that training the skills is not enough. The athletes the sport psychology consultant deals with show large individual differences. They have different intellectual capacities, different attitudes and needs, as well as different knowledge. Based on the existing set of self-regulation techniques, mental skills training has to be tailor-made for the individual athletes with their highly specific problem situation. Furthermore, short-term programs are not sufficient. Psychological training should accompany or, better, become integrated into physical training routines from adolescence on. Personality development is essential and should become a major focus of sport psychological training programs. The self is the core of self-regulation. Without a conflict-free self it becomes difficult to self-regulate. An integrated self may lead to the aspired maximizing of performance and achievement and at the same time to responsibility, improved decision-making, and happiness.

See also: Action Theory: Psychological; Activity Theory: Psychological; Health: Self-regulation; Self-regulation in Adulthood; Self-regulation in Childhood; Sport and Exercise Psychology: Overview; Sport Psychology: Performance Enhancement; Sport Psychology: Training and Supervision; Sports as Expertise, Psychology of; Virtual Reality, Psychology of
Bibliography

Beckmann J 1993 Aufgaben und Probleme bei der psychologischen Betreuung der deutschen alpinen Skimannschaft bei den Olympischen Winterspielen 1992. Leistungssport 23(1): 23–5 [Responsibilities and difficulties of psychological coaching of the German Alpine Ski Team at 1992 Winter Olympics]
Beckmann J, Kazen M 1994 Action and state orientation and the performance of top athletes. In: Kuhl J, Beckmann J (eds.) Volition and Personality. Action Versus State Orientation. Hogrefe and Huber Publishers, Seattle, WA, pp. 439–51
Beckmann J, Rolstad K 1997 Aktivierung, Selbstregulation und Leistung. Gibt es so etwas wie Übermotivation? Sportwissenschaft 27: 23–37 [Activation, self-regulation, and performance. Is there something like overmotivation?]
Csikszentmihalyi M 1975 Beyond Boredom and Anxiety, 1st edn. Jossey-Bass, San Francisco
Dienstbier R A 1989 Arousal and physiological toughness. Implications for mental and physical health. Psychological Review 96: 84–100
Eberspächer H 1995 Mentale Trainingsformen in der Praxis. Sportinform, München [Forms of Mental Training in Practice]
Gallwey W T 1976 Inner Tennis. Playing the Game, 1st edn. Random House, New York
Griffith C R 1926 Psychology of Coaching. Scribner’s, New York
Kanfer F H 1970 Self-regulation: Research, issues, and speculations. In: Neuringer C, Michael J L (eds.) Behavior Modification in Clinical Psychology. Appleton-Century-Crofts, New York, pp. 178–220
Kuhl J, Beckmann J (eds.) 1994 Volition and Personality. Action and State Orientation. Hogrefe and Huber Publishers, Seattle, Göttingen
Locke E A, Latham G P 1984 Goal-setting: A Motivational Technique that Works! Prentice Hall, Englewood Cliffs, NJ
Loehr J E 1984 How to overcome stress and play at your peak all the time. Tennis (March): 66–76
MacKay D 1981 The problem of rehearsal or mental practice. Journal of Motor Behavior 13: 274–85
Mahoney M J, Avener M 1977 Psychology of the elite athletes: An exploration study. Cognitive Therapy and Research 1: 135–41
Mahoney M J, Gabriel T J, Perkus T S 1987 Psychological skills and exceptional athletic performances. The Sport Psychologist 1: 181–99
Morris T, Bull S J 1991 Mental Training in Sport: An Overview (BASS Monograph). British Association of Sports Sciences and National Coaching Foundation, Leeds, UK
Neiss R 1988 Reconceptualizing arousal: Psychobiological states in motor performance. Psychological Bulletin 103: 345–66
Nideffer R M 1980 The role of attention in optimal athletic performance. In: Klavova P, Daniel J (eds.) Coach, Athlete and the Sport Psychologist. University of Toronto Press, Toronto, ON, pp. 92–112
Seiler R 1992 Performance enhancement. A psychological approach. Sport Science Review 1(2): 29–45
Smith M 1998 Mental Skills for the Artistic Sports. Johnson Gorman Publishers, Alberta
Yerkes R M, Dodson J D 1908 The relation of strength of stimuli to rapidity of habit-formation. Journal of Comparative Neurology and Psychology 18: 459–82
J. Beckmann
Standpoint Theory, in Science According to standpoint theorists, there are social positions from which people can obtain privileged perspectives on social relations, because those who are oppressed are in a good position to understand the social relations that constrain them. Recent work in Science and Technology Studies has shown that scientific knowledge is irreducibly social (e.g., see Science, Sociology of). Social relations are embedded in scientific knowledge and in different ways form topics and resources across the sciences, from physics and mathematics to primatology and psychology. Therefore standpoint theory could be relevant even in the natural sciences.
1. Standpoint Theory’s Central Argument With respect to science, standpoint theory has been most comprehensively worked out by feminist researchers. For this reason this article focuses primarily on feminist contexts. But the arguments and conclusions can be extended to other types of standpoints: there are contexts in which the standpoints of non-Europeans or, returning to the roots of the theory, proletarians, might provide privileged epistemic positions within the sciences. The central arguments for privileged perspectives stem from Hegel’s ‘master–slave’ dialectic, Marx’s discussions of ideology, and particularly from Georg Lukács’ discussion of the privileged standpoint of the proletariat. In his 1922 essay ‘Reification and the Consciousness of the Proletariat’ (in Lukács 1971), Lukács argues that the proletariat potentially has a privileged standpoint from which to understand the rationalized structure of commodities and capital. Under capitalism, different classes perceive rationalization in different ways. The bourgeois, says Lukács, tend to accept their position as natural, just and even necessary, because it was created by them; they can retain an illusion of subjectivity, of themselves as actors. Through attempts to break out of the rationalized structure, the proletariat, on the other hand, can understand the constraining nature of social relations, and can also question the naturalness or necessity of those relations. As a result they can see that the system that removes much of their subjectivity is a specific historical product. Therefore particular social locations have the potential to provide insights when people in those locations run up against arbitrary structures. In essence, then, standpoint theory is an account of the blinding power of ideology, and how it can be overcome (see Ideology, Sociology of). If its arguments are right, it is relevant to science to the extent that scientific knowledge participates in such ideologies.
Within feminist discussions, standpoint theory has been developed in divergent directions, for example by Nancy Hartsock (1983), Patricia Hill Collins (1986), Dorothy Smith (1987), Sandra Harding (1991), and Rosemary Hennessy (1993). As do the rationalizations of capitalism, gender reifies social relations. Women have a distinctive experience, in the experience of discrimination, from which to see gender relations. And by working against discrimination they come to understand the unfairness and lack of necessity of many things normally taken for granted.
2. Criticisms and Elaborations Because it has been taken up by many scholars and activists, the meaning of standpoint theory can be confusing. Its central and most controversial claim is that a particular social position potentially provides a privileged perspective, not merely another perspective. Claims that styles of thought are correlated with gender (e.g., Keller 1985) are sometimes grouped with feminist standpoint theories, but they do not contain the same sort of general argument for privilege. Even
setting aside positions that do not recognize privilege, we can still differentiate between two versions of standpoint theory as it relates to scientific knowledge. A strong reading of the arguments of standpoint theorists entails that the social location of oppressed groups is a definite and positive resource for increasing the objectivity of science. A weaker reading entails only that objectivity tends to increase as more and more diverse groups are represented in science. The difference can be seen in Sandra Harding’s notion of ‘strong objectivity’ (Harding 1991) and Helen Longino’s more procedural notion of objectivity (Longino 1990). (See also Objectivity: Philosophical Aspects.) The theory is often taken to imply that there is one correct feminist perspective, which will eventually reveal the true nature of gender relations. Setting aside good questions about whether the idea of a single complete theory is coherent, the claim can be challenged as overly essentialist. Essentialist assumptions of commonality in women’s experiences or situations have been discredited by feminist researchers steadily finding diversity where once was seen uniformity. However, there is nothing necessarily essentialist about standpoint theory. Harding’s formulation, based on ‘thinking from women’s lives’ (Harding 1991), suggests that diversity can be built into the theory. It is not women’s experiences alone that standpoint theory privileges, but experiences of working against discrimination, including in the production of knowledge. And there is no implied essentialism about women or their experiences when standpoint theory is framed, as it often is, in negative terms: interests and positions tend to blind people to features of their worlds. The call to think from women’s lives does not exclude the possibility that men can acquire and develop feminist standpoints.
By acting to oppose sexism, and by understanding obstacles faced by women, men can gain experience and knowledge of the social structures that maintain gender as oppressive. Claims of privilege are limited by the fact that the oppressed typically have reduced resources with which to study their environments. Thus, even if the central arguments for standpoints are correct, they do not provide any cut-and-dried formula for correlating standpoints with more objective knowledge. Theorists representing oppressed groups may have better knowledge of the relevant social structures, but then again they may not. Thus standpoint theory’s arguments are best given the weaker reading mentioned above. The standpoint theory that emerges from these criticisms is modest and pragmatic. Without the determinism sometimes associated with standpoints (without the claim that, say, women necessarily understand some phenomena better than do men), standpoint theory is primarily an argument for the representation of diverse groups in the sciences. Those groups will be apt to take their research in different directions, to discover different phenomena and see different strengths and weaknesses in arguments. Multiple standpoints can contribute to both the breadth and solidity of scientific thinking. But the theory does not claim that some people are simply to be believed because they command cognitively superior standpoints. The metaphor of ‘privilege’ is cashed out in terms of possibilities for understanding: genuinely inclusive scientific communities are likely to produce more balanced knowledge, being attuned to a broader array of social facts.
3. The Domains of Standpoints in Science There are a number of contexts in the sciences for which even a cautious standpoint theory can be relevant. Three examples may be offered of broad contexts in which feminist and other standpoints can be applicable. The feminist examples easily could be modified for other standpoints.
3.1 Gender and its Construction Social relations in themselves are topics of inquiry in the humanities, social, medical, and biological sciences. Research has shown ways in which law, literature, divisions of labor, and many other cultural products participate in the construction of important social categories. Many of these findings are due to the influx of feminist, Afro-American, post-colonial, and other critical theorists into academic social sciences and humanities departments, which has disrupted established patterns of thinking and added to various disciplines’ understanding of their traditional as well as new subject matters. Studies of sex differences, classifications of diseases, behavior patterns, and explanations of all of these in terms of evolutionary history also participate in the construction of the social world, for example in the construction of gender. Feminist biologists have found areas in which sexist ideology affects research, and have seen how scientists can reproduce, explain, and amplify social relations that they see in society (for a survey, see Schiebinger 1999). Standpoint theorists would claim that people who understand everyday social biases as the product of cultural formations are in a good position to see attempts to reify those formations and naturalize them, to see, for example, attempts to justify sexism by characterizing it as part of human biological heritage.
3.2 Problem Choices Ideology affects the topics that are studied. For example, social scientists have looked to the structure of the household (usually taken for granted as a relatively fixed and uncontroversial unit) and have found conflicts, differences in interest, and social structures that had remained hidden when inquiry focused on the male world alone. Problem choices like these are common in the humanities, social sciences, and biological sciences, all of which self-consciously study social relations, though presumably such choices are not directly relevant in the physical sciences. However, some areas of research may directly affect women, because the knowledge they produce can be used to reduce or exacerbate inequities (see Gender and Technology). Every area of inquiry has at least some instrumental uses, and many forms of applied knowledge have some relevance to the gendered structure of society. Topics within so-called ‘pure’ physics are connected to military uses, and are funded precisely because of those uses (see Science, Technology, and the Military). Women tend not to participate in benefits coming from the high status of military prowess, yet suffer many of the consequences of war. A feminist standpoint, then, is more likely to recognize and see as important the connections of certain areas of research to their consequences.
3.3 Social Structures within Disciplines To varying degrees, sexism is a normal part of the working environment in all of the sciences. The allocation of funds, the structure of the tenure system, and old-boys networks can work against women. To recognize and study sexism within the sciences, however, is not just a specialized project of feminist sociology of science. To be a scientist means having knowledge (often tacit) of the norms and structures of one’s discipline (see Norms in Science). For example, to convince colleagues to read an article, it is as important to know about form and rhetoric as it is to know something about ‘subject matter.’ Therefore, if feminist standpoint theory is right, feminist scientists will be in a better position than men to notice sexist aspects of their own discipline, aspects of their relations with other members of the field that perpetuate oppression.
4. Conclusions Science and technology studies have pointed to the local and contextual origins of scientific knowledge (e.g., Knorr-Cetina 1981). A consequence of those origins is that scientific knowledge, even when it is deemed universal, may carry with it parochial assumptions about what is natural. As a result, science taken up outside its contexts of production needs to be corrected to take account of its own social assumptions and the knowledge of nonscientists (e.g., Wynne 1996). In this type of insight we might see the value of standpoint theory, showing how social location can be a resource for decontextualizing scientific knowledge.
Standpoint theory’s arguments establish some grounds for privileged perspectives: people for whom social constraints are oppressive can more easily understand those constraints than can others. Recent studies of science and technology show that those grounds can be relevant in all branches of science: science is a thoroughly social process, and perspectives that tend to expose some of the effects of social structure are thus valuable throughout the sciences. In particular, feminist scholars have argued that because scientific communities have lacked diversity, they have typically lacked some of the resources to better enable them to see aspects of sexism and androcentrism in scientific work. Definite social locations, then, are relevant across the disciplines, though the range of particular situations in which they are relevant varies greatly in different contexts, sometimes affecting problem choices or norms of behavior, and sometimes directly affecting substantive claims. Therefore standpoint theory, together with the recognition of the social character of knowledge, shows that to increase objectivity, communities of research and inquiry should be diverse, representative, and democratic.
Bibliography
Collins P H 1986 Learning from the outsider within: The sociological significance of black feminist thought. Social Problems 33: 14–32
Harding S 1991 Whose Science? Whose Knowledge?: Thinking from Women’s Lives. Cornell University Press, Ithaca, NY
Hartsock N C M 1983 The feminist standpoint: Developing a ground for a specifically feminist historical materialism. In: Harding S, Hintikka M (eds.) Discovering Reality: Feminist Perspectives on Epistemology, Metaphysics, Methodology and Philosophy of Science. Reidel, Dordrecht, The Netherlands
Hennessy R 1993 Women’s lives/feminist knowledge: Feminist standpoint as ideology critique. Hypatia 8: 14–34
Keller E F 1985 Reflections on Gender and Science. Yale University Press, New Haven, CT
Knorr-Cetina K 1981 The Manufacture of Knowledge: An Essay on the Constructivist and Contextual Nature of Science. Pergamon, Oxford, UK
Longino H 1990 Science as Social Knowledge: Values and Objectivity in Scientific Inquiry. Princeton University Press, Princeton, NJ
Lukács G 1971 History and Class Consciousness; Studies in Marxist Dialectics. MIT Press, Cambridge, MA
Schiebinger L 1999 Has Feminism Changed Science? Harvard University Press, Cambridge, MA
Smith D 1987 The Everyday World as Problematic: A Feminist Sociology. Northeastern University Press, Boston
Tuana N (ed.) 1989 Feminism and Science. Indiana University Press, Bloomington, IN
Wynne B 1996 May the sheep safely graze: A reflexive view of the expert–lay knowledge divide. In: Lash S, Szerszynski B, Wynne B (eds.) Risk, Environment and Modernity. Sage, London
S. Sismondo
State and Local Taxation State and local taxation comprises those taxes that are collected at the subfederal government levels in order to finance state and local public services, with discretion over the rates and bases of these taxes assigned to subfederal governments. Consider a nation with different layers of government that serve only administrative purposes. Each government level merely distributes the public services, and collects the taxes, that the nation’s citizens have agreed upon at the national level. In such a country, no state or local taxation is necessary since the level of public goods and services is decided upon uniformly for the whole country. This leaves citizens in local jurisdictions that prefer more public services than the national average as dissatisfied as citizens in local jurisdictions that prefer less. It is possible to provide public services, the benefits of which are limited geographically and which are not national in scope, at different levels in different subfederal jurisdictions. Enabling citizens to consume publicly provided goods and services at different levels according to their preferences makes everyone better off without making anyone worse off. In order to let people who want to consume more pay a higher tax price for those state and local services, taxes must be differentiated accordingly between states and local jurisdictions. In such a world with a differentiated supply of publicly provided goods, each individual can reside in a jurisdiction where a certain level of public services is provided at adequate tax prices. The art of state and local taxation is the assignment of different kinds of taxes to government levels such that an invisible hand properly guides a nation in an ideal world to an optimal multiunit fiscal system. This is not necessarily the case in the real world, and there are several explanations for why fiscal systems are not optimal.
1. The Rationale of Tax Assignment to the State and Local Levels The basic idea of a power to tax for states and local jurisdictions is that their public goods and services should be financed by those citizens benefiting from them and deciding on their quantity and quality. In this case, individuals choose their place of residence according to their preferences and their dislike for congestion. A set of fiscal jurisdictions consisting of equal taste communities then results from a process of ‘voting by feet’ (Tiebout 1956). Olson (1969) calls the basic condition for such an outcome the ‘principle of fiscal equivalence.’ It requires deciding which public goods and services should be provided at which level of government. The guideline for this decision is the geographical incidence of those benefits and the costs borne by citizens in order to finance them. Just as some public goods and services cannot reasonably be provided at the local level, some kinds of taxes should, for good reasons, not be assigned to the local level either. There are numerous reasons why the proper assignment of tasks and the power to tax to the subfederal levels are crucial to avoid suboptimal outcomes. For example, take the Scala in Milan, the famous Italian opera house. As it is partly financed by user charges and partly by subsidies paid from general taxes of the city, it will be congested because residents living in the suburban towns are able to visit it paying only the user charges and thus a lower price than the city’s citizens. Nonresidents receive some of the benefits from the public service without paying an adequate tax price. This spillover is one of the externalities that Gordon (1983) summarized in his treatment of an optimal tax system for subfederal jurisdictions. Another externality exists if nonresidents pay some of the taxes of state or local jurisdictions. Taxes are exported to nonresidents and state and local governments thus have an incentive to provide public services at a higher level than their residents prefer. An example of tax exporting can be found in the taxation of tourist trade or of natural resources in some US states. For the year 1962, McLure (1967) estimated for the US states that overall tax export rates were between 19 and 28 percent in the short run. In the long run, only 10 states exported more than 24 percent of their taxes or less than 16 percent. The estimates are corroborated in a more recent study by Metcalf (1993). Mieszkowski (1983) estimated that the ‘total waste’ in Alaska at the beginning of the 1980s, which accrued from excessive public revenues and expenditures made possible by high natural resource taxes, amounted to about $500 million or 13 percent of estimated petroleum revenues. Tax exporting is not limited to commodity taxes or natural resource taxes. It may also occur if individuals, firms, or capital are mobile.
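The incentive created by tax exporting can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that residents bear exactly the fraction of each tax dollar that is not exported; the function name resident_tax_price and this simple linear relation are assumptions of the example and are not taken from McLure's estimation method. The two export rates are the short-run bounds quoted above.

```python
def resident_tax_price(export_rate: float) -> float:
    """Return the share of each dollar of tax revenue borne by residents.

    Illustrative assumption: the exported share falls entirely on
    nonresidents, so residents bear the remainder.
    """
    if not 0.0 <= export_rate <= 1.0:
        raise ValueError("export_rate must lie between 0 and 1")
    return 1.0 - export_rate


# Short-run bounds of the overall tax export rates reported for 1962.
for rate in (0.19, 0.28):
    price = resident_tax_price(rate)
    print(f"export rate {rate:.0%}: residents bear {price:.2f} per dollar of revenue")
```

Under this assumption, a jurisdiction that exports a larger share of its taxes faces a lower perceived price for public services, which is the incentive to provide them at a higher level than residents would choose if they bore the full cost.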
If factor mobility is allowed for, the problems of subfederal taxation become even greater. If a state or community levies a higher personal or corporate income tax than its neighboring jurisdictions, its mobile citizens or firms emigrate (or move their capital) to those jurisdictions in order to enjoy a lower tax burden. Such a situation is characterized as tax competition. In doing so, mobile factors reduce the tax burden of residents and firms in the state or community they move to and increase the tax burden in those jurisdictions they leave. These changes in tax burdens usually are not considered by public authorities in these jurisdictions when deciding on the level of public goods and services such that these are provided at a lower than ‘optimal’ level. Since there is a local loss from taxation that does not correspond to a social loss, the cost of public services is overstated and jurisdictions will tend to underspend on state and local public services. Buchanan and Goetz (1972) call this effect a fiscal externality. The effects of tax competition among states and local jurisdictions are similar to those of international tax competition and they vary according to different degrees of mobility of individuals and firms, to the number of mobile factors, to the number of jurisdictions involved in that competition, to the size of the jurisdictions involved, and to the number of tax instruments available to tax authorities. The assessment of the usefulness of these effects becomes rather complicated if competition using tax instruments is complemented by competition using public expenditures, like subsidies or specific infrastructural services, but also tax expenditures like tax holidays to attract mobile production factors. The impact of tax competition on the possibilities of state and local governments to redistribute income by tax-transfer schemes is relatively clear-cut. Stigler (1957) already stated that ‘redistribution is intrinsically a national policy.’ Suppose that a region adopts a progressive income tax program designed to achieve a significantly more egalitarian distribution of income than exists in neighboring jurisdictions. If rich and poor households are mobile, such a program would create strong incentives for the wealthy to move out to neighboring jurisdictions. In this case, local redistribution induces sorting of the population, with the richest households residing in the communities that redistribute the least by income taxes. A more equal distribution of income would result, but it would be caused largely by an outflow of the rich and a consequent fall in the level of per capita income in the jurisdiction under consideration (Oates 1972). At the central level, this problem does not occur, because mobility decreases as the size of a jurisdiction increases. There are not many theoretical arguments against this line of reasoning. Most of them rely on imperfect mobility of individuals. But the more mobile people become, the better this mechanism should work and the less feasible state and local redistribution by a progressive income tax should be.
However, according to Buchanan (1975), all individuals have an incentive at the constitutional stage to agree to income redistribution because they are fundamentally uncertain about their future positions in income, health, and employment. At the postconstitutional level the rich might agree to income redistribution by the government because they, first, are interested in a public insurance scheme against fundamental privately uninsurable risks for themselves and their children, and, second, against exploitation by the majority of the poor residents and increasing crime rates. Particularly the second reason is important for the rich: they pay a premium for obtaining social peace. Janeba and Raff (1997) argue that individuals will agree to decentralized income redistribution because it is a credible commitment of the poor majority to the rich minority that the latter is not exploited at the postconstitutional stage. The rich allow for the agreed extent of redistribution because a constitutional competence for income redistribution at the federal level leads to a larger public sector with more income redistribution.
From this perspective, decentralized redistribution might be possible and even more efficient since redistribution is closer to citizens’ preferences.
2. Guidelines for a Reasonable Tax Assignment This reasoning leads to some guidelines for the assignment of different taxes to the state and local levels. The bottom line is that state and local jurisdictions should finance the services they provide by charges or quasicharges to the beneficiaries (Musgrave 1983) or they should use those tax bases which have low interregional mobility (Wildasin 1980). Land and natural resource taxes and to a lesser degree also real estate taxes are thus especially suited for the purpose of state or local taxation. A property tax, for example, is a good candidate for being assigned to the local level at least as far as residential property instead of business property is taxed. A property tax is basically a tax on land and the construction built on it, and it can thus be considered to be a tax on a relatively immobile tax base. Consumption taxes are inappropriate at the local level because interjurisdictional mobility leads to cross-border shopping or tax exporting, but they become suitable for the state level if the region is large enough. However, technological development, in particular Internet shopping, reduces the possibility for state governments to levy a tax upon consumption. If individuals, firms, and financial capital are fully mobile, a comprehensive income or profits tax should not be assigned to the state and local levels because, for such taxation to be optimal, a jurisdiction would have to be able to tax all income that its residents earn in other jurisdictions. This is certainly not a realistic case. Beyond this, taxation of wage income tends to reduce this tax base less than the taxation of capital income and it could thus be more easily taxed at the state or even local level unless tax authorities cooperate and exchange information about taxpayers’ incomes. And even profits taxes can be levied at the state or local levels if investment is specific to the locality such that a firm cannot easily relocate.
The latter is particularly the case for not-in-my-back-yard (NIMBY) firms like waste incinerators or nuclear power plants. These firms have difficulties finding a location and assigning local jurisdictions the power to tax their profits gives these communities an incentive to provide such firms with a location. Following from this reasoning is another rule for tax assignment: Progressive taxation, designed to secure redistributional objectives, should be assigned to the central level. A progressive income tax should thus be a federal tax. Moreover, as the case of Alaska illustrates, a large natural resource tax base permits local jurisdictions to provide public services at a low tax price leading to an attraction of individuals and
firms at an inefficient scale. On these grounds, a case for central taxation of natural resources may be made whereby the central natural resource tax may apply to an excess base only, while leaving the average base for subfederal use. All in all, these basic considerations led Musgrave (1983) to present the following tax assignment to government levels: The local level should rely on property taxes and payroll taxes, the state level on a resident income tax, a consumption tax, and a natural resource tax, and the federal level on an integrated progressive income tax and a natural resource tax for large natural resource bases. All government levels should additionally rely on fees and user charges as far as public services are attributable to particular users or certain groups of users. This possibility is probably reduced from the local to the federal level. According to Musgrave, corporate source income ideally would be included in the integrated base of the federal income tax. These propositions become substantially more complicated, if interjurisdictional tax agreements providing for tax credits or deductibility are considered. Federal tax deductibility of state and local taxes, for example, may mainly serve taxpayers with higher income and is thus regressive.
3. Tax Assignment in Some OECD Countries A first look at real tax systems in OECD countries quickly reveals that tax assignment is not necessarily Musgravian (Pola et al. 1996; Pommerehne and Ress 1996). The US seems to come closest to those rules of tax assignment. The US states are prohibited by the US constitution from levying taxes on imports or exports but otherwise they may adopt any type of tax that does not egregiously discriminate against citizens of other states. US states have a lot of freedom in defining their tax bases, in setting tax rates, and in offering tax abatements to induce businesses to locate within their state. Finally, the states grant tax authority to differing degrees to local jurisdictions, too. While there are only minimal restrictions on state power to tax in the US, the federal level nevertheless relies mainly on the individual income and corporate income taxes, the states on sales taxes and natural resource taxes, and the most important source of revenue at the US local level is the property tax. Although state tax revenue from individual income and corporate income taxes is nonnegligible, it is only about one fifth of the respective federal revenue. Such a favorable judgment about the US tax system is, however, only justified at first glance. According to Feldstein and Metcalf (1987), for example, the federal tax deductibility of personal income tax liabilities led many states to rely more heavily on deductible personal income taxes than on other types of revenue. Apart from the US, for example, in Canada and Australia and particularly in Europe, the assignment of taxes to the state and local levels differs considerably from the Musgravian model and varies widely across countries. In Canada, personal and corporate income taxes are the most important revenue sources of the federal level. The most visible of taxes in Canada, the personal income tax, is, however, also the most important revenue source of the provinces. On the one hand, there are fairly significant income tax rate differences between provinces. On the other hand, arrangements between the Canadian federal and provincial governments on sharing the bases for the personal and corporate income taxes have a long history going back at least to the Second World War. There is also a large variation, from Newfoundland to Alberta, in the rates of the sales tax, the second most important revenue source of the provinces. Canadian local jurisdictions, like those in the US, mainly rely on property taxes. In Australia, the federal government exclusively levies taxes on personal and company income, sales taxes, and (nearly exclusively) excise taxes. The bulk of a state’s tax revenues are derived from payroll taxes, stamp duties and associated taxes on financial transactions, so-called franchise (i.e., business license) fees on alcohol, tobacco, and petroleum products that do not differ much from excise taxes, and many smaller revenue sources. Local governments, on the other hand, derive their revenues almost entirely from property taxes. Australian states also obtain revenue from the central government on the basis of a revenue sharing arrangement. The Australian tax system is not so different from the German and Austrian systems of state and local taxation. In particular, Germany has established an extensive revenue sharing arrangement for personal and corporate income taxes and the value added tax, the most important taxes, which account for about three quarters of total tax revenue.
These are really shared taxes in the sense that German state governments decide on them together with the federal government. The federal government has at its own discretion only the gasoline tax and a surcharge on income taxes, while the states have no power to tax. Local government relies on property taxes and local business taxes.

The less federalist European countries usually have taxing authority at the local level. In the UK, however, not only is there no regional or state level of government, but local authorities have little power to tax: they have only some power to determine the rates of the local tax on real property. The formerly existing local business tax in the UK was abolished by the ‘Thatcher’ administration, which cut the contribution of local taxes to local revenue sharply after the unsuccessful attempt to introduce the poll tax in the 1980s. At the end of the 1990s, changes in the UK established regional authorities in Scotland, Wales, and Northern Ireland, providing at least the Scottish government with some power to tax. The UK thus appears to be following the
trend in Europe to a more decentralized government structure with a power to tax for subfederal governments as well.

Since the 1970s, systems of regional government have been given an enhanced role in taxation in France, Italy, and Spain. In France, there are three levels of subnational government: the local level with about 36,000 cities and communities, about 100 départements, and 22 regions. Since 1986, the importance of these subnational governments has increased. All three levels may levy a property tax on real property (land and the buildings on it), a local business tax (‘taxe professionnelle’), and something similar to a residents’ income tax (‘taxe d’habitation’). All three taxes are subject to central regulation, but there is nevertheless a nonnegligible variation in tax rates. In Belgium, there is a movement towards fiscal federalism, since it has been agreed that the sales and property taxes will become exclusive taxes for the regional and local levels. Apart from smaller taxes, the three Belgian regions can additionally levy a supplementary rate on the personal income tax, which may be positive (a surcharge) or negative (a tax reduction). Although there is some regulation of the local and regional power to tax income, the variation of rates is considerable. In contrast, local governments in the Netherlands can only levy property taxes.

An interesting pattern of local taxation can be observed in the Nordic countries. Basically, Denmark, Finland, Norway, and Sweden are unitary countries with only two government levels. According to the studies in Rattsø (1998), local governments have considerable power to tax in Denmark and Finland, less so in Sweden, and only to a limited extent in Norway. In Denmark and Finland, local governments can rely on taxes on property and personal income with (nearly) no restrictions on setting tax rates, and rates accordingly vary considerably.
In Sweden, local taxation consists mainly of personal income taxes with less variation in rates due to the central regulation of local governments. In Norway, most tax revenue originates from income and wealth taxes shared with the central government, giving local authorities taxing power only within a narrow band. Since 1979, all local governments have applied the maximum rate. The property tax is not available to all local governments, but is restricted to urban areas and certain facilities (notably power plants). In addition, local property tax rates are also limited to a narrow band.

In Switzerland, states (cantons) have the basic power to tax personal and corporate income as well as capital. The local jurisdictions can levy a surcharge on cantonal direct taxes and raise their own property taxes. The central government relies mainly on indirect (proportional) taxes, the value added tax, and specific consumption taxes like the mineral oil tax. It can also rely on a source tax on income from interest. There is, finally, a small but highly progressive federal income tax, which, together with revenue from the source tax on interest income, amounted to 34 percent of total federal tax revenue in 1995, while the cantons and municipalities rely on income and property taxes for about 50 percent of their total revenue and 95 percent of their tax revenue. In addition, personal and corporate income taxes in Switzerland vary considerably between the cantons. Taking the value of the (weighted) average for Switzerland as 100, the index of the tax burden of personal income and property taxes varied from 55.6 in the canton of Zug to 135.8 in Fribourg in 1995, while the index of corporate income and capital taxes varied from 57.6, again in Zug, to 145.6 in Nidwalden.

As the overview of these selected OECD countries in Table 1 indicates, there is no clear picture of which kinds of taxes are assigned to the state and local levels. Local jurisdictions regularly levy property taxes as well as fees and user charges. Some power to tax personal income, at least labor income, is attributed to state and local governments in the US, Canada, France, Belgium, the Nordic countries, and Switzerland. In some cases, for example, Germany, France, and Switzerland, there are even local business taxes at the discretion of subfederal jurisdictions. But on the whole, the assignment of taxes does not follow Musgravian lines, and there is considerable variation in federal regulation of subfederal taxing power.

Table 1 State and local tax assignment in selected OECD countries

| Country | States | Local authorities |
| --- | --- | --- |
| USA | Income tax; Sales tax; Natural resource tax | Property tax; Sales tax |
| Canada | Personal and corporate income tax; Sales tax | Property tax |
| Australia | Payroll taxes; Stamp duties; Franchise fees | Property taxes |
| Germany | No taxing power | Property tax; Business tax |
| UK | No state level | Property tax |
| France | Property tax; Business tax; Residents’ income tax | Property tax; Business tax; Residents’ income tax |
| Belgium | Personal income tax surcharge; Property tax | Personal income tax surcharge; Property tax |
| Switzerland | Personal and corporate income tax; Inheritance tax; Wealth tax | Personal and corporate income tax surcharge; Property tax |
| Norway | No state level | Personal income tax; Property tax |
| Denmark | No state level | Personal income tax; Property tax |
| Musgravian ‘optimal’ assignment | Residents’ income tax; Consumption tax; Natural resource tax | Property taxes; Payroll taxes |
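
The cantonal tax burden index reported for Switzerland in this section can be read as a simple normalization. As a sketch only, and assuming notation that does not appear in the source (a burden measure b_c and a weight w_c for each canton c; the text does not say which burden measure or weighting scheme was used), the index sets the weighted national average to 100:

```latex
% Hypothetical notation, not taken from the source:
%   b_c : measured tax burden in canton c
%   w_c : weight of canton c, with \sum_c w_c = 1 (weighting scheme assumed)
\bar{b} = \sum_{c} w_c \, b_c ,
\qquad
I_c = 100 \cdot \frac{b_c}{\bar{b}}
```

Under this reading, the value of 55.6 for Zug means that the measured burden there is 55.6 percent of the weighted Swiss average, and the value of 135.8 for Fribourg means a burden 35.8 percent above that average.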
4. The Political Economy of State and Local Taxation

Given these differences in tax assignment to state and local authorities, the question arises which institutional and political factors explain the tax structure in different countries. An early contribution on the choice of taxation in a political economy framework is that of Schneider and Pommerehne (1980). They develop a model of the Australian tax and expenditure structure along with a model of selfish government behavior, assuming that the re-election constraint is not always binding. In choosing among
tax instruments, a government in power is only urged to follow the preferences of a majority of voters if an election approaches and the re-election constraint is binding due to low government popularity. Between elections, the government can follow its ideological preferences. This modeling of government behavior proved clearly superior to the traditional models of government behavior in ex-post predictions of important variables (government spending, GDP, etc.).

In contrast, Hettich and Winer (1999) assume that the governing party or coalition always maximizes expected votes, which themselves depend on the consequences of policies for the economic welfare of voters. In making fiscal choices, political parties thus take into account economic, political, and administrative factors that determine the nature of electoral support and opposition associated with alternative ways of taxing. The authors present empirical evidence on the reliance on income taxation and the income tax structure of the US states in 1985 and 1986 supporting their hypotheses. Inman (1989) finds that redistributive concerns strongly influenced the decision of the 41 largest US cities between 1961 and 1986 to rely on property taxes, sales taxes, or user charges. The higher the demand of local voters for redistributive services, the less city governments rely on fees and user charges as well as sales taxes and the more on property taxes. Hunter and Nelson (1989) present evidence for the impact of interest groups on the local revenue structure. The higher the share of wealthy homeowners in local jurisdictions in Louisiana, the lower the share of property tax revenue and the higher the share of charges.

Besides vote-maximizing behavior of politicians and interest-group influence for redistribution in their favor, tax competition is a third factor determining the structure of taxation. However, the expected impact is ambiguous.
On the one hand, tax competition for mobile capital and workers may lead to a convergence of tax rates to a low level. On the other hand, tax competition may lead to increases in tax rates if a state’s geographic neighbors follow suit when it raises its tax rates. Such behavior might occur due to yardstick competition between states and local jurisdictions (Besley and Case 1995). Because their political information is incomplete, voters evaluate the fiscal performance of representatives in their jurisdiction by comparing it to the fiscal performance of neighboring jurisdictions. If a state’s neighbor increases tax rates, this state has more scope to raise its taxes as well. Arkansas State Senator Doug Brandon once said: ‘We do everything everyone else does.’

Finally, the structure of state and local taxation, like federal taxation, is influenced by the degree of fiscal illusion among taxpayers. The more complex a tax system, the less accurate voters’ perceptions of their true tax burden and thus the weaker their tax resistance, and the
more easily representatives can increase taxes in order to finance new spending. Pommerehne and Schneider (1978) present evidence for Swiss local jurisdictions that the fiscal illusion attributable to the complexity of the tax system is less pronounced the more voters can determine taxes and spending directly in fiscal referenda. Moreover, Matsusaka (1995) finds that US states which allow for the use of initiatives in their constitutions rely more on fees and user charges than on broad-based taxes, increasing the visibility of contributions to the public sector and the equivalence between public services and tax prices.
5. Concluding Remarks

Although it is still too early for a comprehensive assessment, state and local taxation appears to be the outcome of political decisions shaped by constitutional provisions and influenced by economic and administrative restrictions. Voters, representatives, bureaucrats, and interest groups decide upon taxation under the decision-making provisions of their constitutions, taking into account the mobility of tax bases and the accompanying tax competition and tax exporting; together they determine to what extent state and local jurisdictions rely on property, income, and sales taxes, or on user charges. The differences in these political, institutional, and economic factors explain why there are so many differences among countries in their reliance on tax sources at the state and local levels.

See also: Urban Government: Europe
Bibliography

Besley T, Case A 1995 Incumbent behavior: vote-seeking, tax-setting, and yardstick competition. American Economic Review 85: 25–45
Buchanan J M 1975 The Limits of Liberty. University of Chicago Press, Chicago
Buchanan J M, Goetz C J 1972 Efficiency limits of fiscal mobility: an assessment of the Tiebout model. Journal of Public Economics 1: 25–43
Feldstein M S, Metcalf G E 1987 The effect of federal tax deductibility on state and local taxes and spending. Journal of Political Economy 95: 710–36
Gordon R H 1983 An optimal taxation approach to fiscal federalism. Quarterly Journal of Economics 98: 567–86
Hettich W, Winer L 1999 Democratic Choice and Taxation: A Theoretical and Empirical Analysis. Cambridge University Press, Cambridge, UK
Hunter W J, Nelson M A 1989 Interest group demand for taxation. Public Choice 62: 41–61
Inman R P 1989 The local decision to tax: evidence from large US cities. Regional Science and Urban Economics 19: 455–91
Janeba E, Raff H 1997 Should the power to redistribute income be (de-) centralized? An example. International Tax and Public Finance 4: 453–61
Matsusaka J G 1995 Fiscal effects of the voter initiative: Evidence from the last 30 years. Journal of Political Economy 103: 587–623
McLure C E 1967 The interstate exporting of state and local taxes: estimates for 1962. National Tax Journal 20: 49–77
Metcalf G E 1993 Tax exporting, federal deductibility, and state tax structure. Journal of Policy Analysis and Management 12: 109–26
Mieszkowski P 1983 Energy policy, taxation of natural resources, and fiscal federalism. In: McLure C E (ed.) Tax Assignment in Federal Countries. Centre for Research on Federal Financial Relations, Canberra, Australia
Musgrave R A 1983 Who should tax, where and what? In: McLure C E (ed.) Tax Assignment in Federal Countries. Centre for Research on Federal Financial Relations, Canberra, Australia
Oates W E 1972 Fiscal Federalism. Harcourt Brace Jovanovich, New York
Olson M 1969 The principle of ‘Fiscal Equivalence’: the division of responsibilities among different levels of government. American Economic Review, Papers and Proceedings 59: 479–87
Pola G, France G, Levaggi R (eds.) 1996 Developments in Local Government Finance: Theory and Policy. Edward Elgar, Cheltenham, UK
Pommerehne W W, Ress G (eds.) 1996 Finanzverfassung im Spannungsfeld zwischen Zentralstaat und Gliedstaaten. Nomos, Baden-Baden, Germany
Pommerehne W W, Schneider F 1978 Fiscal illusion, political institutions, and local public spending. Kyklos 31: 381–408
Rattsø J (ed.) 1998 Fiscal Federalism and State-Local Finance: The Scandinavian Perspective. Edward Elgar, Cheltenham, UK
Schneider F, Pommerehne W W 1980 Politico-economic interactions in Australia: Some empirical evidence. Economic Record 56: 113–31
Stigler G J 1957 The tenable range of functions of local government. In: US Congress, Joint Economic Committee (ed.) Federal Expenditure Policy for Economic Growth and Stability. US Government Printing Office, Washington, DC
Tiebout C M 1956 A pure theory of local expenditures. Journal of Political Economy 64: 416–24
Wildasin D E 1980 Locational efficiency in a federal system. Regional Science and Urban Economics 10: 453–71
L. P. Feld and F. Schneider
State and Society

The conceptual couple ‘state/(civil) society’ is a product of the protracted effort through which, starting with the Enlightenment, a few generations of European intellectuals sought to characterize the advance of modernization. The juxtaposition of ‘state’ to ‘(civil) society’ emphasizes the growing differentiation between, on the one hand, the structures and events directly connected with public authority, with the constitution and management of the polity, and, on the other, the structures and events which individuals generate as they pursue their private interests, express their emotions, and acquire and exercise their capacity for judgment. Some institutional outcomes of this process can still be recognized, according to many authors, in the shape and dynamics of contemporary societies.
1. An Unsettled Distinction

Speaking of a ‘juxtaposition’ between two concepts suggests that they lie on the same plane and are unproblematically complementary to one another. But this is not the only relation in which those concepts have stood to one another. In an earlier relation, the two concepts instead closely overlapped. In a philosophical and juridical vocabulary that lasted from Aristotle to Kant, the expression ‘civil society’ designates a condition in which humans have left the state of nature by bringing about, and subjecting themselves to, political authority; one wrote of societas civilis sive res publica. Here, what made a society civil was the very fact of its being policed and governed.

Hegel (1953) breaks with this fusion of the two concepts in the Philosophy of Right of 1821. Civil society is understood as the set of individuals engaged in the competitive pursuit of their private interests (chiefly economic). The state is understood as an institutional site where identified collective interests, such as the increase in the state’s security or the maintenance of public order, are pursued through expressly political activities. But this juxtaposition is loaded with tensions, and Hegel presents conceptual entities (the family, the corporation, or the ‘police,’ which here means the administration) intended to mediate those tensions.

Further, the juxtaposition may be perceived as a contrast between state and society, leading thinkers to take sides with the one or the other. Hegel himself opts for the state, but already Lorenz von Stein, who places the paired concepts state/society at the center of his main work, resolutely opts for society. In a sense, so does Karl Marx, who like Hegel views civil society as the locus of the formation and confrontation of private interests of an economic nature.
However, he endows it with a powerful, self-sustaining dynamic, grounded on means and relations of production, with respect to which the state itself loses the creative, autonomous power to drive historical development that Hegel attributed to it.

Even in the course of the twentieth century, some authors prioritize one of the two concepts and subsume the other under it. Oliver Williamson’s (1975) theory of markets and hierarchies, for instance, posits economic exchange as the primordial social relation, and derives from ‘market failures’ the emergence of hierarchical, command-based relations, and thus of organizations. Implicitly, in this view too, society, conceived in the first instance as the locus of market relations, precedes and in a sense justifies the state (if one sees in it a particularly significant organization). This priority of society, naturally, imposes tight
limitations on what the state can and should do (see Karlson 1993).

Such arguments imply that state/society is not a proper dichotomy, for one concept can be made to encompass and posit the other instead of accepting it as its peer. But there are also arguments to the effect not that one concept is sufficient, but rather that more than two are needed. Thus, one may object to the Hegelian/Marxian reduction of civil society to the political economy or the market, and suggest for it an alternative referent. A model of a social realm focused on intellectual pursuits, and constituted and regulated without political mediation and power relations, can be found in the eighteenth-century Imperium Litterarum, the international cosmopolis of savants and literati. Or one may seek to comprise within civil society the phenomena of culture and religion, or the institutions concerned with play, emotions, aesthetic expression, and other inter-individual relations not reducible to exchange. The market itself may either be made to adjust itself to the presence, under the conceptual umbrella of ‘civil society,’ of these new and rather different referents, or may be relocated to a third realm all of its own, as if to acknowledge its peculiarly powerful and demanding dynamic. For instance, according to Cohen and Arato (1992), in advanced societies there are three fundamental institutional complexes: the state, the market, and civil society. The latter is conceived primarily as the site not of competition between private interests, but rather of the formation of collective identities, of solidarity, and of the formation and transformation of institutions by means of social movements. In this perspective, tertium datur. Politics and Markets (Lindblom 1977) do not exhaust the social space.
2. Historical Roots of the Distinction

One reason why, at the purely conceptual level, the relations between the two terms are so unsettled may be that ‘state’ and ‘(civil) society’ should not be conceived as universal categories. Whether we conceive them as juxtaposed or counterposed to one another, both expressions refer to historically unique ways of establishing and regulating both political and economic processes (and sociocultural ones, if a tripartition is accepted as more plausible). The state/society distinction presupposes, and profoundly modifies, arrangements which had emerged in various European countries in the late Middle Ages, and which are often characterized by the German expression Ständestaat (literally, ‘state of the estates’). But the expression is potentially misleading, for it may lead one to think of its referent as a ‘state’ in the modern sense of the term. To avoid this misunderstanding the Austrian historian Otto Brunner (1968) has proposed replacing
that expression with that of ständische Gesellschaft. By this is meant a context where constituted estates, through arrangements partly traditional and partly agreed, take an active part in the government of a territory. However, they do so chiefly in the exercise of their own privileges, not on a mandate from the central government; and in those privileges political aspects are tightly fused with both economic ones and status ones, relating to honor and rank. Their activities of government are differentiated, not as articulations of an overriding political function, but rather on the basis of each estate’s claims and responsibilities.

Furthermore, typically the component units of each estate are not individuals, but collective entities with a pronounced hierarchical structure. The dominant unit is the ‘house,’ understood as the seat and patrimony of an aristocratic dynasty. ‘The lord of the house (Hausherr), at the same time that he performed the function, which today we consider private in nature, of head of the family, also exercised an authentic political power (jurisdictional, administrative, representative) over the members of the house subjected to him,’ which tended to include the bulk of the local rural population (Schiera 1983). Furthermore, each ‘house’ tended also to be a fortified place, an assemblage of autonomous military resources, which the members of the family (again, including the rural dependants) would employ at the behest and under the command of the lord. In that capacity the house normally took part in the ruler’s military undertakings, but it could equally protect the house’s rights, arms in hand, from encroachments by other houses, including the ruler’s own. Finally, the typical house constituted an oikos, which is/was the seat of a complex plurality of activities intended to make it economically self-sufficient, as much as possible without recourse to exchange.
The monitoring and organization of those activities (including, critically, the periodic delivery to the lord’s household of unpaid produce and labor by the rural dependants) engaged the lord’s faculties of jurisdiction and coercion.
3. The Crisis of the ständische Gesellschaft

The ‘decline and fall’ of these arrangements was caused by a critical aspect of state formation, the gathering at the center of a country of the faculties and facilities of organized violence, culminating, according to Max Weber’s well-known formula, in a monopoly of legitimate violence. Exactly because the aristocratic houses originally possessed an autonomous military capacity, that process, when it does not dissolve them outright, turns them into subordinate components of an unchallengeable public order, centered on the sovereign, with their councils, standing army, and fiscal, administrative, and judicial apparatus. The political autonomy of the houses, and thus of the estates they
collectively constitute, is lost. The sovereign’s superiority, while it does not suppress the houses’ internal hierarchical arrangements (including, of course, ‘patriarchy’), relativizes them, flattens them. Seen from the summit of sovereign power even the most exalted members of the realm begin to appear relatively similar to one another, relatively powerless.

At the same time that centralized political activities become the prerogative of a sovereign power that performs them in a more unitary and rational manner, economic activities become decentralized. An increasing number of subjects decide what goods and services to produce, and in what ways, by interacting according to their own interests and capacities. The provision for the needs of individuals depends increasingly on their capacity to enter into traffics with one another on the market. In the countryside, commercialization reduces the part played in the totality of the economic process both by the aristocratic patrimonies and by the traditional rural units constituted by villages, which had shared the aristocrats’ preference for self-sufficiency and their hostility to the market. It also diminishes the significance of city-based productive units regulated by guilds, which typically opposed the market’s tendency toward innovation and the free combination and recombination of resources.

To sum up: while state-building dissolves the ständische Gesellschaft by taking from it tasks, resources, and legitimacy pertaining to the political sphere, the advance of an economy based on traffics pushes into the market the units let loose by that dissolution. Against an expressly political organization with a strong vertical gradient, concerned with all relations grounded on imperium, there stands an open-ended set of increasingly numerous social units which, each in pursuit of its own interest, establish with one another horizontal, or apparently horizontal, relations.
4. The Political Prerequisites of the Civil Society

This can only take place, however, insofar as the process of state building reflects two peculiar principles: first, the political center takes on all expressly political concerns and resources but only those; second, it acknowledges the existence of private subjects who, while excluded from the political sphere, act autonomously in others. In both ways, political power expressly limits itself, through complex and delicate institutional arrangements: for instance, the progressive denial of political and juridical significance to the religious interests of individuals, or the acknowledgment and guarantee of private property and of contract. Heinrich Popitz (1986) highlights the historical rarity of such arrangements: ‘Only rarely has it been
possible even just to pose the question of how to constrain institutionalized violence, in a systematic and effective manner. This has only taken place with the Greek polis, republican Rome, a few city-states, and the constitutional state of the modern era. The answers have been surprisingly similar: the principle of the primacy of the law and of the equality of all before the law (isonomia); the idea of setting boundaries to all legislation (basic rights), norms of competence (separation of powers, federalism), procedural norms (decisions to be taken by organs, their publicity, appeal to higher organs), norms concerning the distribution of political responsibilities (turn-taking, elections), public norms (freedom of opinion and of assembly).’

Within this small set of experiences of limited political power, not all counterpose two realms we might call ‘state’ and ‘(civil) society.’ For instance, in the Greek polis individuals exist chiefly as soldiers and citizens, and the private dimension of their life is little valued and expressed. Only vis-à-vis the modern constitutional state does (civil) society emerge, through the processes indicated above. In many other civilizations the tendency of political power to aggrandize itself, to possess itself of all resources (beginning with land and labor power), suppresses all autonomy of the society, and may even reduce all individuals to subjects or slaves.

In the modern West, by acknowledging and protecting private property in productive resources, the political center allows at least some subjects to stand against it as autonomous units. This, according to North and Thomas (1973), greatly favors the advance of the division of labor and the enlargement of the traffic economy. But it also has distinctive political effects, best perceived a contrario in the ‘decline and fall’ of empires which fail to acknowledge and protect property rights (Eisenstadt 1969). The significance of private property has two implications.
First, it justifies to an extent the Hegelian/Marxian understanding of civil society primarily as the site of traffics. Second, it entails that civil society should not be envisaged as a ‘flat’ space, harboring no power relations; for private property, especially in the means of production, engenders a power difference between those who do and those who do not possess it. The state/society difference does not mean that power is institutionalized only in the former, but that the latter is structured (among other things) by power relations of its own, foremost among which are those established by property.
5. State and Society in the Twentieth Century

The complex interactions between the power forms that lie on each side of the divide shape, to a large extent, the dynamics of modern societies. In the twentieth century the prevailing trend could be called
‘statism,’ for it saw a considerable enlargement of the scope of state action. In totalitarian systems, this trend suppressed all institutional limitations upon that action, and threatened the very existence of society as a separate realm. In other systems, it pushed back considerably the boundaries of that realm, but did so partly in response to demands originating from it. For instance, economically underprivileged strata sought to maximize the import of citizenship and to reduce that of their market position; many economically powerful groups sought the state’s assistance in securing their profits and stabilizing or correcting the market itself, which for various reasons had lost its self-equilibrating capacity. A further component of the trend consisted in the strategies of groups (sections of the ‘political class,’ public employees, members of the so-called ‘new professions’) whose livelihood and whose opportunities for a greater sway on policy depended on an expansive definition of the state’s role.

Whatever its protagonists, in nontotalitarian countries that definition was much assisted by the existence of a ‘public sphere’ and by the institutionalization of adversary politics, which gave voice to broader social strata, and found its most recurrent theme in the respective scope of the state and the market. Political parties played the central role in this dynamic, sometimes opposing but more often supporting the ‘statist’ trend. This expressed itself, among other things, in the formation of bodies and agencies that sat astride the traditional divide between the public and the private realm, as with the Italian parastato or the British quango.

In the last decades of the twentieth century the trend was challenged by social forces expressing themselves politically through conservative and/or right-wing parties.
These sought to at least slow down ‘statism,’ and proclaimed the superior rationality of the market, which in their discourse, but to an extent also in practice, reasserted itself as the chief, if not the only, institutional site of civil society. This late revisitation of the economy-centered, Hegelian/Marxian conception of civil society acknowledges that in contemporary society even those human potentialities and social processes of a cultural, communicative, symbolic, apparently noneconomic nature, which other formulations treat as a tertium with respect to both state and market, are in fact targeted and controlled to a large extent by great concentrations of economic power. This is achieved through new material and social technologies, and subjected to the logic of accumulation, which is increasingly focused on informational and financial resources. Mostly those technologies are developed and operate transnationally, and thus increasingly escape the reach of the political competences and facilities vested in the states. This lends a new edge to the attempt to reduce, after decades of ‘statism,’ the significance of public and political action, and to enhance that of the market.
The end of the Cold War also operated in that sense, for it reduced the necessity for strong political/military tutelage of the capitalist order. It also loosened the constraints of the Keynesian compact which the dominant strata had previously made with the lower orders and implemented largely through state action. Both dominant trends (the explicit or implicit reduction of civil society to the market, and the slogan prevalent in the advanced countries at the end of the twentieth century: market, si, state, no) are potentially worrisome. The market itself remains the site of the formation and interaction of large economic forces, endowed with great power; and these are increasingly operating in a transnational space. In contrast the state, still wedded to the territory, can no longer perform even its most traditional functions, jurisdiction and police, let alone those functions of guidance and intervention in the economic realm it had increasingly claimed during most of the twentieth century. These recent phenomena compromise the institutional tenability of the couple state/society nearly as much as it had been threatened, earlier, by the totalitarian state or by central economic planning. A seriously disempowered state cannot establish and monitor an authentic civil society complementary to itself. In this sense, the Aristotelian and medieval vision, according to which only a politically organized society can truly call itself ‘civil,’ deserves to be revisited.
Bibliography
Brunner O 1968 Neue Wege der Verfassungs- und Sozialgeschichte. Vandenhoeck & Ruprecht, Göttingen, Germany
Cohen J L, Arato A 1992 Civil Society and Political Theory. MIT Press, Cambridge, MA
Eisenstadt S N 1969 The Political Systems of Empires. Free Press, New York
Hegel G W F 1953 Philosophy of Right. Clarendon Press, Oxford, UK
Karlson N 1993 The State of State: An Inquiry Concerning the Role of Invisible Hands in Politics and Civil Society. Almkvist & Wiksell, Stockholm, Sweden
Lindblom C E 1977 Politics and Markets: The World’s Political Economic Systems. Basic Books, NY
Marx K 1904 A Contribution to the Critique of Political Economy. Kerr, Chicago, IL
North D C, Thomas R P 1973 The Rise of the Western World: A New Economic History. Cambridge University Press, London
Popitz H 1986 Phänomene der Macht. Mohr (Siebeck), Tübingen, Germany
Schiera P 1983 Società per ceti. In: Bobbio N, Matteucci N, Pasquino G (eds.) Dizionario di Politica. UTET, Turin, Italy
Williamson O E 1975 Markets and Hierarchies. Analysis and Antitrust Implications: A Study in the Economics of Internal Organization. Free Press, New York
G. Poggi
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
State: Anthropological Aspects
The ‘state’ refers to a form of polity that links people and territory in ‘societies’ and thereby demarcates itself from other states. Although archeologists date states back to ancient Egypt and Mesopotamia (about 2500 BC), this article deals only with anthropological aspects of the modern state. It examines the implications of linking people to territory for subject formation, the administration of populations and their cultures, and the practices of religiosity, kinship, and commerce, as well as of anthropology itself.
1. The Concept of Modern Statehood
Modern statehood is a mode of political organization of ‘society’ in which the state is the primary agent charged with building the representation of society aware of itself and in harmony with itself. Modern law is the principal means by which the state attempts to create this harmony, an ideal state in which there is a coincidence between behavior and rules, and where the state can anticipate the conduct of its subjects (citizens) and develop normative frameworks (policies) to direct change (Nader 1990). States are only one of many rule-making institutions with administrative capacities. International and supranational bodies also increasingly make law, but states largely retain for themselves the power of legitimate enforcement. They remain the only institution in most parts of the contemporary world with an effective ideological and institutional apparatus that claims to represent the entire people within a particular territorial jurisdiction. Statehood became the central principle asserted in international law in 1648, with the signing of the Treaty of Westphalia ending the Thirty Years’ War in North-Central Europe. Driven by industrial capitalism as a superior mode of production and accumulation, the state expanded its functions beyond the conduct of war, the administration of justice, and the regulation of commerce, to massive projects in the colonization of populations in the Americas, Asia, Africa, and the Middle East (Dirks 1993, Stoler 1995). Western European states, in particular, formulated projects of internal colonization and they began to usurp religious authority’s control over the definitions of transcendence, the bodies of their subjects, and the institutionalization of their life courses.
Initially employed as a political program in the French Revolution and subsequently written into the United Nations Charter, the nation form grew out of transformed empires and tribes during the process of European state formation (Tilly 1990). At the same time, in a process of reciprocal copying, states in the Americas also emulated the nation form. This transformation of diverse peoples into unified nations
appropriated local discourses of consanguinity and affinity and, in turn, reconstructed kinship. Unification was rarely accomplished without intermittent purges or other kinds of homogenizing processes. From the late fifteenth to the early seventeenth centuries, all Western European states engaged, with often-disastrous economic consequences, in variant forms of ethnoreligious ‘cleansing’ leading to population resettlement and the formation of victim groups. In the twentieth century, national consolidations and the creation of new types of victim groups reached new heights, especially with the advent of fascist and racist ideologies originating in Europe but also employed elsewhere (Lefort 1986). The peace treaties concluding World War I introduced into Eastern and Southern Europe the two conditions of West European nation-states—homogeneity of population and rootedness in the soil. In Eastern Europe the implementation of these principles was largely elided (though tacitly encouraged) during the Cold War with the establishment of socialist states based on universal brotherhood and allied with the Soviet Union—a bloc that dissolved following the so-called ‘velvet revolutions’ of 1989. Western European states emerged weakened from World War II and unable to resist independence movements in their colonized states, primarily in Africa and Asia. These postcolonial states in turn nearly universally adopted the nation form to integrate diverse groups within territorial boundaries already drawn, arbitrarily, by European states. Nation-states have since become the mode by which all of the earth’s polities are organized into formally comparable and sovereign governing units. Appropriation of the nation form effectively broadened the scope of the state to include the planned change of culture, variously defined, and the generation of forms of subjectivity in populations (Borneman 1992).
2. Subject Formation through Linking Territory to People
The 1933 Montevideo Convention defined the criteria for statehood as having: (a) a permanent population; (b) a defined territory; (c) government; and (d) the capacity to enter into relations with other states (Article 1, League of Nations Treaty Series No. 881), with only the first two criteria consistently upheld. Between 1945 and 1999, the number of states recognized by international law as sovereign members of the United Nations increased from 51 to 189, with two nonmember states (the Holy See and Switzerland) also recognized. Defining statehood this way has numerous effects on subject formation. First, it produces pressures to unify ethnically mixed and territorially dispersed populations by centralizing legal and education systems and standardizing the life course. States uniformize by means of both bureaucratic technologies (the use of, e.g., interviews, surveys, and maps) and psychological appeals to the ‘shared’ symbolic universes they create (through the standardization of grammars and linguistic use, and the shaping of experience in ritual events and everyday routines). Second, it creates populations with mixed loyalties, settlers or migrants who live in a place among peoples separate from the place and peoples with whom they primarily identify. Third, it generates categories of automatic exclusion (e.g., resident stateless peoples, resident internal ‘minorities,’ nonresident external others), which are then used frequently to justify war against external or internal enemies. Fourth, the dependence of modern political representation on territorial attachment reinforces identifications with territory and the desire for statehood. These identifications exacerbate territorial conflicts and often lead to wars between potential antagonists who tend to receive international recognition only when their disagreements are articulated in terms of border violations. Despite the ubiquity of the nation-state, different historical experiences of populations have produced a variety of ways of classifying states, some corresponding to the intent of the ruler, others to the relation of ruler to ruled, and still others to their cultural form. In the twentieth century alone, classifications of state forms include, for example, monarchy, oligarchy, theocracy, democracy, totalitarian, liberal, democratic, communist, fascist, socialist, authoritarian, welfare, theater, tributary, cephalic, bicephalic, romantic, satiric, colonial, postcolonial. In accounting for this variety of state forms and the historical conditions of their production, anthropologists have focused on the organization of public authority and forms of domestic control. The historian Foucault (1977, 1978) and the political scientist Anderson (1983) have inspired much recent scholarship.
The powers unleashed by science since the Enlightenment have generated what Foucault calls ‘rituals of truth,’ including disciplinary technologies that refigure space and the arts of governance charged with producing ‘society’ and ‘culture’ as objects of knowledge and identification. Moreover, the deployment of new procedures of power in the nineteenth century transformed state authority, from a sovereign’s ability to ‘take life or let live’ (to order or require someone’s death), to an investment in administering life by means of disciplining the body and regulating populations. States institutionalize particular cultural forms through projects of reform, education, development, incarceration, war, welfare, and family planning, along with the use of scientific techniques such as surveys, censuses, questionnaires, actuarial tables, economic forecasting, spatial mapping, case studies, and bureaucratic interviews. Anderson focuses on the historical conditions in which the ‘imagined community’ of the nation comes into being and achieves global currency, particularly in its
relation to states following the collapse of empires. One of these conditions, print capitalism, enables advances in techniques of memory and collective narration through the proliferation of historicist myths and standardized accounts of experiences of national belonging across space and irrespective of local temporalities. Because modern governing necessitates delineating a population and a territory, rulers deploy ethnographic assumptions about how to define a people. As well, through laws and policies states produce and circulate information about belonging (through families, schools, and the media), positively and negatively sanction prescriptive and proscriptive behaviors (about sexuality, gender, age, race, ethnicity, and work), and build libraries, monuments, and museums to display particular interpretations of the nation. Hence, modern states are distinctive in that they produce both laws and people; they not only administer and regulate already-existing cultural patterns but also produce new domains of objects and forms of subjectivity.
3. The Discipline of Anthropology and the State
Early anthropological work on the state, especially among archaeologists, was inspired above all by Friedrich Engels’s study of the origin of the state and its relation to private property, Lewis Henry Morgan’s evolutionary scheme, and Sir Henry Maine’s theory of the progressive movement from status to contract in the development of state law (Service 1975). Without being fully conscious of it, fieldwork-based anthropologists, inspired by Bronislaw Malinowski and Franz Boas, initially participated in this expansion of the state by studying the generation, elimination, transformation, or preservation of indigenous peoples, tribes, peasants, and clans through the creation of encompassing, and simultaneously exclusionary, national forms of belonging. One of the major anthropological aspects of states has been, therefore, the deployment of anthropology: ways of defining and categorizing human populations and of systematizing knowledge about them (Ferguson 1990, Pemberton 1994). The discipline of anthropology has closely followed and defined its objects in dialogue with state projects, alternately supportive and oppositional: itemizing culture traits and social organizations of ‘discrete’ peoples, devising evolutionary schemes, tracing efforts at internal and external colonization, mapping the settling and resettling of territory and the transformation of property, and tracking diasporic populations, cultural diffusion, and global trade routes (Moore 1986). By and large, however, anthropologists traditionally have focused on the production of the cultural or the social, or on forms of exchange among people
without states (Sahlins 1981). A preference for fieldwork in villages or small-scale communities also made it difficult for anthropologists to connect their subjects with states or the large-scale. Further, long-term observational fieldwork requires an intimate access that is difficult to obtain from rulers or those in the inner circles of power. Anthropologists have, therefore, often assumed an essential sociocultural opposition of the people to the state. This seemed credible, if not ethically imperative, when, as in the early twentieth century, the objects of study were so-called ‘primitives’ and ‘indigenous peoples,’ victimized and threatened with extinction, frequently through state-supported genocide, or, alternatively, when the state was primarily a puppet of a colonial power. Nonetheless, the opposition is misleading, as society (Gesellschaft) vs. community (Gemeinschaft) originates as a project of the modern state. Anthropology has in the meantime become a global discipline of alterity, no longer restricted to questions of origin or to particular peoples or places. Also, the relation of indigenous people to states has changed considerably, with many states now guaranteeing the ‘preservation of indigenous cultures’ and prioritizing their rights over those of more recent settlers. Moreover, the United Nations along with some nongovernmental organizations and a global tourist industry positively sanction states that protect and support what are called ‘first peoples,’ a movement that has reversed the trend toward extinction and strengthened claims of cultural authenticity or temporal priority of settlement.
In any case, the notion of ‘kin-based’ or ‘acephalic’ polities as a pre-evolutionary stage to a ‘state-based’ polity, or of the replacement of status by contract, has become moot for ethnographic research as the entire planet is organized territorially by states unevenly integrated in global market exchanges, mutually regulated through legal contracts. The interaction of people with states is an unavoidable condition of modern life. This interaction took three meta-narrative forms in the second half of the twentieth century: people living within the bipolar Cold War structure of socialist or communist states (planned economies, single Party rule, so-called ‘Second World’ of totalitarian states) vs. liberal states (free market economies, multiparty democracies, so-called ‘First World’) and a contested zone of people living in postcolonial states (mixed economies with diverse political forms liberated from empires, so-called ‘Third World’). Until the late 1970s, anthropologists had little access to people within totalitarian states, and in other parts of the world they devoted much research to reconstructing the precolonial state. In the mid-1970s, a global trend toward democratization of authoritarian states began in southern Europe, was picked up in Latin America and parts of Asia in the 1980s, and spread to sub-Saharan Africa, Eastern Europe, and the Soviet Union in the late 1980s and 1990s, making the relation of these
people to transforming states a subject of anthropological research. In effect, the space of the Second World has disappeared and the spaces of the Third and First Worlds are mixed, existing within each other, though in radically different proportions. The vast majority of states now call themselves democracies and are engaged in market reforms that open internal economic and cultural relations to global markets. Of note at the beginning of the twenty-first century is the distinctive disjuncture between the global integration of financial markets and shifts in state strategies with respect to sovereignty over national economies, the proliferation of ethnic affirmation within states, and the denationalization of much military conflict (a process also driven by fear of the use of nuclear weapons). Whether states drive, resist, or submit to globalization, and hence thereby curtail or strengthen their sovereignty, varies greatly depending on size, wealth, location in the international system, and domestic legitimacy. Local cultural forms unable to resist pressures of cultural or economic globalization are nonetheless not homogenized, a fact apparent both in the local variations in ‘society’s’ relation to the state, specifically to democratic form, and the continued diversity of everyday routines and rituals. By the beginning of the twenty-first century, democracy has become the dominant form of internally organizing states, functioning as a mode of ruling, a device of power, and a strategy of self-representation. Furthermore, democracy requires systems of personal and public accountability, usually embodied in principles of ‘the rule of law,’ or it begins to degenerate into forms of despotism (Borneman 1997). 
As a mode of ruling, democracy means the introduction of regularized elections, the creation of ruling majorities and minorities, the institutionalization of an opposition, and the strengthening of principles of accountability that circumscribe the exercise of power. As a device of power, democracy is as dependent on the externalization or creation of negative others as on the internal dynamics of citizen formation. Practices of such externalization range from state-tolerated social discrimination to legal exclusion of refugees to war. As a strategy of self-representation, each democracy receives international recognition to the extent it is able to claim governance by ‘rule of the people’ and conformity to international norms and conventions. Anthropologists charged with listening to others and often asked to give ‘voice’ to the people studied are involved in all three aspects of democracy. Hence, the discipline of anthropology unavoidably intervenes in local political struggles about defining ‘the people.’ Given the increasingly global nature of ‘news’ and accelerated information flows across state borders, local struggles about representation quickly enter inter- or supranational arenas, subjecting the state to pressure from authorities external to it and explicitly
drawing attention to the ways in which it exercises sovereignty. Such global flows also challenge state sovereignty by bringing to the world’s attention local and subregional actors and events that might be in opposition to dominant groups within their own states. The study of conflicts in democratizing states includes research on social and legal citizenship and on the relation of state policy to demands for representation and accountability. Along these lines, four basic contemporary conflicts about representation and states are observable. They involve: excluded people who want more social, economic, cultural, or legal inclusion into the states in which they reside; diasporic people who want representation in both the place in which they reside and the place from which they are displaced; numerical minorities (usually represented as a ‘culture’ or a ‘nation’) who do not have their own state but want one; and citizens or social movements that demand more accountability from their rulers.
4. Legitimacy and the Modern State
Although the conception and functions of the modern state keep changing, its legitimacy is always based both on some notion of the ‘consent’ of its own citizens to rule and on the recognition of other states. This was not the case when rule was justified without recourse to ‘the people,’ in other words, rule based on coercion alone, principles of heredity or wealth, the ability to wage war and collect taxes, or the administration of justice. To obtain this consent and recognition, the state must usually go beyond Max Weber’s famous dictum that its power rests on a monopoly on the legitimate use of violence. The state does not, in fact, hold this monopoly in many parts of the world. And where the state has dissolved or weakened, the most vulnerable citizens tend to suffer the most from the lack of any protections. Indeed, today weak states are just as likely to lead to violent conflict as were strong imperial states in the last several centuries. In many places, the state’s own abuse of this supposed monopoly (through genocide, strategic terrorism, the suppression of dissent, cynical uses of war) has traumatic effects on its citizens, leading to repeated and delayed suffering of past events. Irrespective of whether they initiated or were victims of the violence, states instrumentalize this trauma, not only by building monuments to the fallen soldiers but also through the construction of nationalist historiographies and support for a heritage industry that especially honors the idea of national sacrifice and organizes the process of mourning for losses after wars. So long as states continue to draw legitimacy from principles of territorial sovereignty and peoplehood, they will be motivated to generate, exacerbate, and institutionalize differences with neighbors or
neighboring states. Anthropology is concerned less with the conduct of these conflicts and wars than with their effects on the organization of populations and communities (Tambiah 1992). Also, the collapse of the Cold War’s bipolar state system has resulted in the relatively unrestricted proliferation of arms, much of it driven by state-supported arms industries, and in a market for new and affordable technologies of death outside the control of states. Where the sources and agents of violence outside state control are increasing (a wide-ranging phenomenon found, for example, in Colombia due to drug trafficking, in Sri Lanka and much of Africa due to civil war, and in the United States or Mexico due to poverty and class conflict), people employ private police forces. Alternately, many states now lend their militaries for ‘peace-keeping’ efforts organized by the United Nations, and still others, such as states within the European Union, are relinquishing military sovereignty to continental or transcontinental military alliances. There is also a growing body of legal doctrine (and an international court established in 1999) holding states accountable to their own norms and to international covenants and agreements on issues such as human rights, environmental and maritime law, and commercial law. The state’s sovereignty and legitimacy, then, are not granted as natural but conferred by its citizens and the recognition of other states and new international regimes. The long history of linking the state to realization of the Idea of the people (Hegel), representation of the social (Durkheim), technical and bureaucratic rationality (Weber), or exploitation and class dominance (Marx), must now be supplemented by acknowledging that most states still resort to magical practices and cultivate religiosity in order to legitimate rule (Taussig 1997).
State policies and edicts administered through ‘law’ often work through processes similar to magic: belief in the power of the utterance of the speaker, the performativity of speech, the linguistic re-articulation of group suffering and instrumentalization of memory, nationalistic narrativization of macro-political events, ad hoc rationalization of failures and successes. Even in the domain of criminal law in the most rational of states (Germany or Sweden, for example) most crimes are never solved and most criminals go free. Yet criminal law there still ‘works’ (that is, contributes to the legitimation of the state) largely through belief in the principles of justice that judicial inquiries, prosecutions, and trials efficaciously represent. Likewise with respect to religion, states most successful in ‘neutralizing’ religion are those which recognize the significance of religiosity and incorporate religious traditions and symbolic forms into the secular rituals of political parties, elections, bureaucratic routines, and life-course rites de passage (Gauchet 1997, Herzfeld 1992). Such practices are
forms of group worship and they crystallize in patterns of national belonging when citizen and state narratives are brought into harmony. The struggle between the great world religions and the state to occupy and shape the symbolic forms of everyday life has not quieted, partly due to the development and ubiquity of new technologies of communication that make it possible for both instances to react independently and quickly to everyday events. Nonetheless, there has been a dispersion of the power over the control of religious expression, and the state is one of many beneficiaries of this process. In the last quarter of the twentieth century in most places of what is known as the First World, entertainment (especially involving sport and music) emerged as the field dominating both the management of experiences of transcendence and of people’s affective attachments and identifications. State rituals (especially electoral campaigns, legislative debates, memorialization and mourning ceremonies, and political speeches) have, therefore, increasingly incorporated models and techniques of authority drawn from entertainment. This leads to a fusion of actors (‘personalities’) in the two fields and an incorporation of magic and infusion of ‘charisma’ in the rationalized arena of state activity. With the end of the Cold War, First World states increasingly structure their relations to new and Third World states by promoting NGOs (‘nongovernmental organizations’). Acting in a negative symbiosis with the state, such NGOs also incorporate magic into their missionary projects of rational reform and economic and political development. In much of the world they have replaced organized religion as competitors to the state in providing care (welfare) and promoting both local and global accountability in the exercise of power. Often, however, they serve primarily as instruments for the transfer of wealth or power between supranational elites.
Notwithstanding liberal mistrust of the modern state, reaffirmed by the horrors of totalitarian experiments, the nation-state form still provides the possibility for local control of people’s own destinies in a manner impossible for prenational communities. Stateless people are indeed the most defenseless actors in the contemporary world. Hence, despite the wars, violent revolutions, and massive repressions of the nation-state, as the sociologist Lepsius (1988) argues, this form of polity has offered particular advantages to West European societies, including the institutionalization of peaceful conflict-solving through the rule of law, guarantees of individual freedom, the organization of interests through parliamentary democracies, and the integration of national economic development in the world economy. In each of these measures, autocratic and totalitarian states elsewhere have been at a permanent disadvantage, incapable of recognizing the subjects they seek to control for who they are, and unable to organize power except through coercion. Such states destroy sociality and build false representations of the people, thereby creating a basis for the illusion of society against state.
See also: Nation: Sociological Aspects; Postcoloniality; Socialist Societies: Anthropological Aspects; State Formation; State, History of; State, Sociology of the; States and Civilizations, Archaeology of
Bibliography
Anderson B R O’G 1983 Imagined Communities: Reflections on the Origin and Spread of Nationalism. Verso, London
Borneman J 1992 Belonging in the Two Berlins: Kin, State, Nation. Cambridge University Press, Cambridge, UK
Borneman J 1997 Settling Accounts: Violence, Justice, and Accountability in Postsocialist States. Princeton University Press, Princeton, NJ
Dirks N B 1993 The Hollow Crown: Ethnohistory of an Indian Kingdom. University of Michigan Press, Ann Arbor, MI
Ferguson J 1990 The Anti-Politics Machine: ‘Development,’ Depoliticization, and Bureaucratic Power in Lesotho. Cambridge University Press, Cambridge, UK
Foucault M 1977 Discipline and Punish: The Birth of the Prison [trans. Sheridan A]. Vintage, New York
Foucault M 1978 The History of Sexuality [trans. Hurley R]. Penguin, London
Geertz C 1980 Negara: The Theatre State in Nineteenth-century Bali. Princeton University Press, Princeton, NJ
Gauchet M 1997 The Disenchantment of the World: A Political History of Religion [trans. Burge O]. Princeton University Press, Princeton, NJ
Herzfeld M 1992 The Social Production of Indifference: The Symbolic Roots of Western Bureaucracy. Berg, New York
Lefort C 1986 The Political Forms of Modern Society: Bureaucracy, Democracy, Totalitarianism. MIT Press, Cambridge, MA
Lepsius M R 1988 Interessen, Ideen und Institutionen. Westdeutscher Verlag, Opladen
Moore S F 1986 Social Facts and Fabrications: ‘Customary Law’ on Kilimanjaro, 1880–1980. Cambridge University Press, Cambridge, UK
Nader L 1990 Harmony Ideology: Justice and Control in a Zapotec Mountain Village. Stanford University Press, Stanford, CA
Pemberton J 1994 On the Subject of ‘Java’. Cornell University Press, Ithaca, NY
Sahlins M D 1981 Historical Metaphors and Mythical Realities: Structure in the Early History of the Sandwich Islands Kingdom. University of Michigan Press, Ann Arbor, MI
Service E R 1975 Origins of the State and Civilization: The Process of Cultural Evolution. Norton, New York
Stoler A L 1995 Race and the Education of Desire: Foucault’s History of Sexuality and the Colonial Order of Things. Duke University Press, Durham, NC
Tambiah S J 1992 Buddhism Betrayed? Religion, Politics, and Violence in Sri Lanka. University of Chicago Press, Chicago
Taussig M 1997 The Magic of the State. Routledge, New York
Tilly C 1990 Coercion, Capital and European States AD 990–1990. Blackwell, Oxford, MA
J. Borneman
State Formation A state consists of a set of institutions whose functions are social control and authoritative decision-making and implementation processes. In addition, modern states seek autonomy in their relations with other states. Beyond these very general criteria, the concept of the state quickly becomes multidimensional with respect to its historical development, its relevant legal considerations, and its theoretical debates. The institutions of a state are subject to conflict and disagreement over their structure and the role they play. Institutions change over time, and the changes both shape and reflect differing degrees of autonomy and strengths. In the twenty-first century there is considerable interest not only in the historical trajectory of state formation and the more proximate formation of new states—dozens of new states have been created in the second half of the twentieth century—but also in the concept of state capacity. Political scientists and political economists who measure and assess the strength and relative autonomy (sovereignty) of states in national and international arenas concern themselves with the proper role of states in the domestic and international political economy. States are in constant competition with other institutions such as intergovernmental institutions and for-profit and nonprofit institutions. Political scientists study and observe the role and function of each of these various national and international institutions. Not surprisingly, the proper role, the function, as well as the capacity of today’s states has become a lively and contentious issue.
1. The Problematic of the State Many prominent classical theorists studied state formation and the multifaceted issues surrounding the role and function of states. Theorists such as Machiavelli (The Prince 1988), Rousseau (The Social Contract 1987), Marx and Engels (The Communist Manifesto 1998), and Lenin (State and Revolution 1993) considered the formation and role of the state to be a theoretically interesting issue. More recent students of the formation of the modern state have focused their attention on such other topics as the absolutist states of Western Europe (Anderson 1979) and states and revolution (Skocpol 1979). Contemporary writers address issues such as the state and industrialization, nationalism, colonialism, and the formation of the
state, in addition to many other topics concerned with communism and the postcommunist state. Other ongoing themes among students of state formation are the state and the economy, power and legitimacy, and the relationship of states to other states. Twenty-first-century analyses of the forces of globalization and their impact on the state are also frequently discussed in the context of the historical evolution of state formation.
2. History of the State The modern state is a Western European creation which emerged in the fifteenth and sixteenth centuries and became more firmly grounded in the seventeenth century. The modern state came into existence in the same time frame and in geographical locations that were the wellsprings of modern capitalism, modern science, and philosophy. Many scholars link the evolution of the modern state to Christianity in its Protestant form. It is generally agreed that the emergence of a new political form—the modern state—was intimately linked to the economic, intellectual, and religious developments of the time. The first modern states to arise were Italy, the UK, France, and Spain. The principal technical tools of modern state building were the development of permanent bureaucracies and a standing army, but it was the availability of money that enabled rulers to acquire them. Once a ruler was able to control the professional bureaucracy and the army, the nobility of medieval times lost its political role and power. The principal producers of cash were the commercial and industrial middle class and the peasantry. By the late eighteenth and early nineteenth centuries the bourgeoisie was able to curtail and even abolish the privileged status of the aristocracy and the authority of the monarch. The bourgeoisie acquired increasing control over the state. Their victory, signaled by the French Revolution, transformed the dynastic state into the modern state. However, to define the modern state as ‘bourgeois,’ meaning ‘civic,’ is to signify the civil and legal equality of all subjects regardless of status by birth or occupation. The abolition of the old feudal privileges did not mean that the propertied and educated classes ceased to have privileges in the modern state.
Modern theorists of state formation like Charles Tilly (Tilly 1975) and many Marxist theorists like Miliband (1969), Poulantzas (1969), and Offe (1975) view the state as ‘organized coercion’ orchestrated by an elite. Other social scientists view the state as the arena of legitimate authority embodied in the rules of politics and in governmental leadership and policies. Some thinkers, especially Marxists, tend to treat the state as an analytic construct by which to explain the modes of production, class relations, and societal struggles in general. Others, including Theda Skocpol
(1979), argue that states are ‘actual organizations’ that control territories and peoples. Skocpol echoes the legal definition of a state much more closely than does Tilly, for example, by pointing out that while the state is anchored in class-divided socio-economic structures it also needs to be seen as part of the international system. As a practical matter, in order for a state to engage in international relations, other states need to recognize it as a fellow state. Modern sociologists call attention to the fact that a state is a human community that successfully claims the monopoly of legitimate use of physical force within a given territory. Whether a state is considered legitimate, and on what basis its legitimacy is acquired and honored in the human community within the territorial boundaries of the state and in the larger international community of fellow states, is, to say the least, historically and theoretically very complex.
3. The Modern State The modern state and the modern state system developed simultaneously. It required coercive capacity for the state to sustain itself both domestically and internationally. The state system of the nineteenth and twentieth centuries led to two world wars and the Cold War. These wars were significant in shaping contemporary states and the international system of states. However, contemporary concern and interest in the state is not confined to questions of the legitimacy of states or even their coercive capacity. At issue in the twenty-first century is the role of states as actors, competing with other actors, to determine what the ongoing proper role and function of a state ought to be with respect to its political, social, cultural, and economic obligations to its citizens. State formation has never been simply a ‘founding.’ Quite the contrary, state formation is a process of formation and reformation based on the changing nature of societies within the state and the larger international system of states. The institutions and processes of a state and of states collectively are not static. To say that a state requires the capacity to carry out the proper functions of a state is a truism, but the capacities required at the turn of the twentieth century vary greatly from state capacities required at the turn of the twenty-first century. Furthermore, states compete with other national and international institutions. For example, states vie with interstate organizations, such as the European Union, the United Nations (UN), the World Trade Organization (WTO), and with nonprofit organizations such as Greenpeace and Oxfam, and for-profit corporations such as Microsoft and CNN. Financial capacity, knowledge capacity, monetary capacity, and even legal capacities of nonstate actors all challenge state capacity. This is not to suggest that states have become defunct but merely that what it means to be a state today and what
capacities a state is expected to possess by its citizens and in the larger community of states continues to go through a process of formation and reformation.
4. The Future of the State As the twenty-first century begins there are almost 200 so-called sovereign states in the world. It is useful to think of contemporary states in their subgroups of differing state types: first, the ‘old states,’ such as the UK, France, Spain, the US, China, and Japan, which have developed reasonably stable nation states over the centuries. Some of these states, like the US for example, are multinational and multiethnic. Others, such as Japan, are more homogeneous. What these states have in common is a citizenry that views the national government as legitimate, and these states, on the whole, share stable political economies. Second, there are the ‘new’ states. Many of these states, for example, most of the sub-Saharan African states and many states in Asia and the Middle East, emerged out of colonialism during the twentieth century. These new states tend to meet the legal criteria for statehood, yet many lack legitimacy as states in the eyes of their citizens. Most often such states are also multiethnic and have yet to achieve a sense of nationhood—a shared belief on the part of citizens that they belong to the nation-state. Some of these states, for example, Somalia or the Congo, are, at the beginning of the twenty-first century, in a state of civil conflict. States which are not able to carry out the basic functions of a state are frequently referred to as ‘failed’ states. Such states have the legal trappings of a state but the nation-state formation has broken down and the state’s capacity to govern its territory has failed. In the category of failed or failing states are all those for which the historical process of state formation has not yet succeeded. One may observe that the historical timeframe during which state formation takes place will determine both the internal and external factors which have a bearing on the process, as well as the outcome, of state formation.
In the post-Cold War era, for example, state formation and reformation—in the case of former communist states—is taking place at a time when the dominant paradigm is a democratic form of governance within the context of free market capitalism. Accordingly, state formation in the new states and postcommunist states takes place in an international setting which exerts a distinctive global leverage on the type of state formation for which political and economic support and acknowledgement are rendered. However, today state formation and the reformation of all states, not just new or failing states, take place in the context of globalization. Globalization brings with it active participation in the national and international life of states by nonstate actors and a new degree of interstate and inter-actor interdependence. Today global and regional institutions, along with states, participate in rule making, rule enforcing, knowledge creation, and in fields such as commerce, banking technologies, medicine, genetic engineering, and in defining the standards for human rights. The globalized world of the twenty-first century is reshaping the idea and the reality of the sovereign state. Yet states are still extremely important in the lives of most people and, hence, state formation continues to be an important ongoing process, a process which is not merely a multifaceted historical process but a multidimensional work in progress.
See also: Regional Government; Regional Integration; State: Anthropological Aspects; State, History of; State, Sociology of the
Bibliography Habermas J 1976 Legitimation Crisis. Heinemann, London Lenin V I 1993 State and Revolution. Penguin USA, New York Machiavelli N 1988 The Prince. Cambridge University Press, Cambridge, UK Marx K 1983 The eighteenth Brumaire. In: Held O (ed.) States and Societies. New York University Press, New York Marx K, Engels F 1998 The Communist Manifesto. Signet Classics, New York Miliband R 1969 The State in Capitalist Society. Weidenfeld and Nicolson, London Nozick R 1974 Anarchy, State, and Utopia. Basic Books, New York Offe C 1975 The theory of capitalist state and the problem of policy formation. In: Lindberg L N, Alford R, Crouch C, Offe C (eds.) Stress and Contradiction in Modern Capitalism. Lexington Books, Lexington, MA Poulantzas N 1969 Classes in Contemporary Capitalism. Weidenfeld and Nicolson, London Rousseau J 1987 The Social Contract. Penguin USA, New York Skocpol T 1979 States and Social Revolutions. Cambridge University Press, Cambridge, UK Strange S 1996 The Retreat of the State. Cambridge University Press, Cambridge, UK Tilly C (ed.) 1975 The Formation of National States in Western Europe. Princeton University Press, Princeton, NJ
I. V. Gruhn
State, History of 1. Introduction: Problems of Definition At present there are about 200 communities in the world that are considered sovereign nation states, according to standards developed during European history. Officially, alternative political models no longer exist. Therefore, it seems reasonable to adopt the narrow definition of modern statehood created by German professors of law (Jellinek 1959). According to them, a state is constituted by a homogeneous
population inhabiting a contiguous territory under one single government which is characterized by complete independence from any outside authority (sovereignty) and holding the monopoly of jurisdiction and the legitimate use of violence internally. History has shown that this kind of state need not be a nation and even less a constitutional democracy based on human rights. But today most states claim to be nations under the rule of law and a formal constitution. Nevertheless, these are recent achievements and rather exceptional in world history. History abounds with different kinds of political organization that may also be called ‘states.’ This is not quite correct from the historian’s point of view, because in most European languages the term ‘state’ was originally used to designate social status or real estate. But between the seventeenth and nineteenth centuries it became the name of a particular type of European political organization, achieving perfection during that same period. Before then, pre-state communities can be roughly classified into: (a) tribes without rulers, (b) chiefdoms, (c) city-states, and (d) empires (van Creveld 1999). Historically as well as typologically, humankind was first organized in face-to-face communities, either acephalous societies without rulers based upon kin groups, or chiefdoms based upon force, wealth, and consequent social stratification. In addition, many chiefs were distinguished by religious qualities or supported by a privileged group of priests. All this is even truer of empires, which through religious ideology, a standing army, a kind of bureaucracy, and a tax system solved the problem of permanent control of large territories. This was a necessary precondition of future state-building, but not the sufficient one, which was to be the achievement of some city-states of Antiquity.
Whereas everywhere else the distinction between ruler and owner of a polity did not exist or was at least blurred, in Greece and Rome for the first time citizens appointed certain persons to govern on behalf of the community and not for their own purposes, not as rulers, but as magistrates. These were the roots of the state as a transpersonal, abstract institution with an abstract legal personality (van Creveld 1999). Therefore, the history of the state is first of all the history of the transformation of the loosely structured ‘chiefdoms’ of European aristocrats and ‘empires’ of European kings into modern states between the high Middle Ages and the nineteenth century.
2. Theory European state-formation was a multidimensional process, but most theories of state-building are still one-dimensional. Therefore, a multifactorial three-level theory of state-building which combines (a) the micro-level of individuals and groups, (b) the meso-level of the political system, and (c) the macro-level of
society is a more promising suggestion (Reinhard 1992). State-building starts on the micro-level with the self-seeking lust for power of individuals, often with the competitive advantage of holding a kingship. Before the existence of the state as an abstract institution, the necessary transpersonal continuity was provided by the dynasty. Dynastic state-building consisted of the elimination or at least control of the rival holders of autonomous power dating from the pre-state phase of history—the nobility, the church, urban and rural communities—in order to establish a monopoly of power. To succeed, dynasties needed the assistance of power elites, which in their own interest made the growth of state power their own cause. In the long run, lawyers of bourgeois origin proved better qualified for that role than members of the church or the nobility because, in contrast to the latter, lawyers owed all their status and power to the service of monarchs. Far-reaching change on the meso-level of the political system resulted from successful exploitation of war, religion, and patriotism for the purpose of expanding dynastic power. The pre-existing rivalry of European monarchs inevitably grew with their power, because it became essential to outstrip their neighbors, to grow at their expense, and to protect themselves against the same objectives in turn. Therefore, they needed continuously growing armies and money in ever-increasing quantity to pay them. In its decisive phase of growth the modern state was a war state, which expanded its taxation, administration, and coercive apparatus mainly to wage war. This resulted in a circular process, the coercion–extraction cycle (Finer 1997), and finally in the internal and external monopoly of violence. Ultimately, only states wage war. Private wars, such as vendettas or feuds, noble or popular revolts, were no longer legitimate under the powerful war and police state.
‘Necessity’ in the service of the common good served as key argument to legitimate this growth of state power. But when the competing ‘confessional’ churches lost most of their autonomy to the state after the Protestant Reformation—the price to be paid for political protection—religion became an instrument of emotional identification of the subjects with their country. ‘Catholic’ and ‘Bavarian,’ ‘Polish,’ or ‘Spanish’ became almost synonyms on the one hand, as well as ‘Protestant’ and ‘English,’ ‘Prussian,’ or ‘Swedish’ on the other hand. Essential contributions originated from the societal and cultural environment on the macro-level. First, the geo-historical plurality of Europe was the incentive to the growth of state power through the coercion– extraction cycle. The result was a stable pluralism of internally strictly unitary states, an exceptional case worldwide. Universal empires never had a chance in Europe; the Holy Roman Empire of the Germans was at most a first among equals. But internal unity was
not realized before the late eighteenth, the nineteenth, and in some cases even the twentieth century. For a long time most monarchies were composed of several parts of unequal status such as Castile and Aragon or Polonia and Lithuania. Everywhere monarchs had to cope with a powerful system of autonomous local noble rule on the one hand, with a countrywide network of partly autonomous urban and rural communities on the other, once again European particularities. In addition, before the Reformation the Church considered itself an independent community, in some respect even a state before the state. This singular European dualism of spiritual and temporal, combined with the equally singular political pluralism turned out to be a precondition of political freedom, although neither Church nor state, neither noble landlords nor urban oligarchies were in favor of any liberty except their own. Finally, the strong position of the Church derived from its role as keeper of Latin culture. Roman law, to some extent transformed into the canon law of the Church, directly and indirectly proved fundamental not only to monarchic state-building, but also to individual liberty and property.
3. The Age of Monarchy (Middle Ages to Eighteenth Century) European kings of the Middle Ages and the early modern period claimed to be emperors of their kingdoms, which is to say that like the pope in the church they held unrestricted power (plenitudo potestatis). For the first time in history the competence of a king was defined by the French constitution of 1791. Before that date royal power was limited only in the sense that there were some things a king could not do. The English king could not create statute law or raise taxes without consent of the realm, the French king could not change the succession to the throne or give away parts of the crown domain. For the rest, every monarchy was ‘absolute,’ that is to say the king’s prerogative included everything necessary to fulfill his fundamental duties as fountain of justice and keeper of peace. In that sense even Poland’s ‘republic of nobles’ and England’s parliamentary system were called ‘absolute’ by contemporaries. In Britain, where no written constitution defines powers, in theory the monarch today is still entitled to declare war and to conclude peace without consulting Parliament. In reality, however, since the later Middle Ages this was possible only if the necessary funds were raised with the help of representatives of the subjects. The monarchies of the European Ancien Régime were limited not so much by legal restraint as by small resources, traditional rules, and social conventions. Under these circumstances it was decisive for the process of dynastic state formation that the crowns of Castile, England, and France
became hereditary whereas the Empire and to some extent Denmark, Sweden, Poland, and Hungary remained elective monarchies. Compared with the universal competence of the modern state, including the right to extend this competence at discretion, the prerogatives of early modern monarchs were extremely modest. State activity expanded, when during the early modern period the so-called ‘police’ ordinances (‘Policeyordnungen’) started to cover new fields such as health and public welfare, moral order and communication, economic behavior, and labor relations. This was no longer mere jurisdiction, reactive government by rescript as in the Middle Ages, but an expanding administration actively interfering with larger parts of the subjects’ lives. But it was still not ‘absolutism,’ an anachronistic term of the nineteenth century which should be dropped (Henshall 1992), because the kind of monarchy without any legal restriction it designates never existed except in Denmark 1665–1834 and in Sweden for a few decades after 1693. Similar to most monarchies of the world, European ones had a sacred character, practiced elaborate symbolic rituals, and were legitimized ‘by the grace of God.’ The coronation ceremony together with the coronation oath of the king and the corresponding oath of allegiance of the people were central in this respect. But there were varieties. In France political rituals and ideology constituted a true ‘royal religion’ (‘religion royale’), and court life with its palaces, ceremonies, and festivities became an all-embracing total work of art. Some of its elements recur in England and elsewhere, especially the ritual of the royal touch, when French and English monarchs touched hundreds of people suffering from scrofula, the tubercular ‘royal disease’ (‘morbus regius’), to cure them.
Monarchs practiced this ritual with particular intensity when their monarchy needed additional emotional support: Henry IV and Louis XIII of France after the religious civil wars, Elizabeth I of England because of her succession problems, and Charles II after the restoration. Other monarchies were more prosaic, including the Holy Roman Empire. And in spite of his politico-religious sanctuary, the Escorial, Philip II of Spain never wore a crown; there was not even a coronation ceremony in Castile. This was a way of demonstrating independence from the Church, but even in France political sacrality implied royal control of the Church and not the opposite. Because of the Church’s monopoly of culture, Europe’s early political discourse was theological. The state was considered a consequence of the fall of man. With the reception of Aristotle natural man became a political being, a concept which was used to create an autonomous discourse of legitimation for the monarchy, erroneously, because ‘political being’ referred to the face-to-face community of the city and
not to the state. But from Dante Alighieri down to John Locke, political theory remained essentially a theory of monarchical state-building, at most harmonizing limited monarchy with aristocracy and democracy into a mixed constitution, supposedly like the ancient Roman republic. Thomas Hobbes was the first to speak of the state instead of the monarchy, but this was not common practice before the eighteenth century, with Montesquieu and Jean-Jacques Rousseau. This change coincided with the development of central institutions, which under the impact of enlightened rationality became more effective. Administrators were no longer the king’s personal servants but transformed into professional officials of an abstract state, into modern civil servants (‘Beamte’) as described by Max Weber. The early medieval prince was assisted in his governmental functions by members of his court and council (‘curia regis’), for written legal work by a chancellor, often an ecclesiastic. Most central institutions of European monarchies were created by segmentation of that curia. The court was reduced to a mere household, the chancery became a separate body. Professional supreme courts and treasuries were established. To produce political consensus and financial aid an enlarged council of important subjects was created to become an assembly of estates or a parliament. What was left over of the ancient council was transformed from a multifunctional decisionmaking instrument such as the German aulic council (‘Hofrat’) into a differentiated system of governing bodies with a small group of confidants of the ruler at the core (‘Privy Council,’ ‘Geheimer Rat,’ ‘Staatsrat,’ etc.). From the sixteenth to the eighteenth century the conciliar principle of collegiality was common practice, to accumulate competence, to balance interests, to prevent disloyalty by mutual control, and to provide positions for as many members of the power elite as possible. 
Only a few very dedicated rulers such as Philip II of Spain, Louis XIV of France, and Frederic II of Prussia were still able to handle this complicated system of central government in person. Instead, many princes relied on a personal confidant as prime minister, such as Wolsey, Richelieu, Olivares, or used the new office of a secretary of state for coordination. In England, parliamentary management was a central task of royal confidants. The development of modern cabinet government was the consequence. In France, Castile, Rome, and elsewhere many public offices were officially for sale, just another way to tap private accumulation in favor of public finance. The growth of state power was first of all a credit problem, because tax income had to be anticipated by credit to make the large sums of money which were necessary to wage war available on short term. Tax farming was another widespread technique used for that purpose. Direct taxes were based on the land, because movables and income were still not under administrative control. As usual, indirect taxes on
mass consumption proved more comfortable, whereas the rich were involved as creditors. But credit was still the personal credit of a prince, not of the state as institution. Therefore, repayment of loans was expected and interest rates were as high as risks. Under these circumstances, bankruptcy was the regular consequence of power policy until the Dutch and English invention of low-interest long-term bonds with parliamentary guarantee, a decisive contribution to the further growth of the modern state.
4. Premodern Alternatives and Their Political Heritage Traditionally, Europe respected man (and to some extent woman) as proprietor of his (her) person and goods. Therefore, in principle taxation was only possible with the consent of the taxed or at least their lords and guardians. For this purpose, most princes created structurally similar assemblies of important subjects or representatives of corporations, such as cities, called assemblies of estates, beginning in 1188 in León. By the seventeenth century, however, most princes were strong enough to raise taxes without consent and to ignore their estates. This decline of estates has been considered the very essence of ‘absolutism’ by former research. In contrast, we know today that to some extent estates remained alive in most countries (with the exception of Denmark) and princes still had to bargain with them. But only in England did seventeenth-century conflicts with the crown end with victories of parliament and the definite establishment of financial control by that body. This system, imitated in most European countries after the French Revolution, finally led to parliamentary government and democracy. In contrast to other parts of the world, in European political culture the relationship between ruler and subjects was considered a contractual one, based on legal mutuality. In case of a supposed breach of contract by the ruler, subjects felt entitled to armed resistance, because their legitimate lord had degenerated into an illegitimate tyrant. Therefore, early modern discourse and practice of the right of resistance through a series of European revolutions laid the foundations of modern liberty and democracy. In most countries, the nobility dominated assemblies of estates, as holders of local power the true lords of the land.
Not before the eighteenth century did the administrative institutions of the state start to penetrate this autonomous local power structure and not before the nineteenth century was noble rule to a large extent replaced by public administration. But beside or below noble rule existed innumerable more or less self-governing urban and rural communities, from independent republics like Venice, quasi-republics like German imperial towns, privileged English boroughs, down to villages of Polish serfs who
were still entitled to regulate certain common affairs on their own. This is the ‘Europe of thousand republics,’ the Europe of ‘communalism’ (Blickle 2000), a world where members of face-to-face communities decided on their immediate concerns. These are the roots of European republicanism and of the tradition of European self-administration. When assemblies of estates flourished, ecclesiastical power competing with princes was already declining. But the Roman church had been Europe’s first state with the pope as the first absolute monarch and the priest as the first nonhereditary professional public servant. The Church practiced territoriality (‘Flächenstaat’) at a time when empires were still mainly based upon personal dependency (‘Personenverbandsstaat’). And the church established the rules of political religion with the consequence that to the present day states remain transcendent entities in need of believers. In this sense the state is the last church because European political culture is a culture of believers. In contrast, professional law courts such as the French ‘parlements’ or the English ‘common law and equity courts’ were not originally autonomous alternatives to state-building monarchies like aristocratic rule, communal self-government, or the Church but creations of the expanding state power. Nevertheless, because of the loose structure of early modern polities giving room for the development of specific group identity and interests, these judges under certain conditions could transform themselves into opponents of their king’s policy and initiate resistance or revolution, as happened in England and France in the seventeenth and eighteenth centuries. But in this way the principles of rule of law, of an independent judiciary, and even of a judicial review of legislation were introduced into political culture.
In the end, however, the fully developed modern state replaced noble rule with its officials, reduced local self-administration to a commission of the central government, and established complete control of the churches and the legal apparatus. The principle of justice existing beyond, above, before the state was largely replaced by the arbitrary use of the state’s monopoly of legislation. But nevertheless the heritage of political alternatives remained alive: liberty and democracy, rule of law and independent judges, the claim to self-government, and to have a state one could believe in.
5. The Age of Equality (Nineteenth and Twentieth Centuries)
The modern European state, so far developed by state-building monarchies, reached maturity with the French Revolution and the political modernization of Europe which followed. Modernity means unity, unity of territory, people, government, sovereignty, and violence. The revolutionary process took the last step to modernity and replaced the hierarchy of orders and the plurality of privilege which had characterized the political society of the Ancien Régime with a society where stratification became a private distinction because politically all subjects were now equal before the state and its law. From now on, not families or corporations were the basic components of the state, but individuals. But mobilizing individuals emotionally in favor of the state into some kind of participation (not necessarily a democratic one) resulted in an ultimate increase of state power with incredible dimensions: the total state. With the quasi-natural order of estates and corporations gone, the modern state of equal individuals needed a new organization. With the exception of Britain and Sweden (until 1975) this was provided from now on by a constitution. But in spite of the American and French models of 1787 and 1791 nineteenth-century European constitutions established rule of law, but neither democracy nor fundamental rights. Their first purpose was equal integration and not equal participation. Most countries remained monarchies, but deprived of sacrality since the period of enlightenment, monarchy lost unquestionable legitimacy. Step by step monarchies were transformed into democratic republics or quasi-republics. But in most countries during the nineteenth century democracy was still far away. Not only informally as usual, but even legally, in their right to vote, all subjects were equal, but some were more equal than others (Orwell 1945), mostly because of their wealth. In 1830 the following percentage of the total population was entitled to vote: in Austria, Denmark, Prussia, and Spain 0 percent, France 0.5 percent, Belgium 1.1 percent, Britain 2.1 percent. In 1910 the numbers were: in Denmark 17 percent, Britain 18 percent, Austria, Belgium, Germany and Spain 21–23 percent, and France 29 percent (Reinhard 1999).
Percentages in the twenties indicate universal suffrage of adult males. Until the twentieth century, state-building and politics remained almost exclusively a business for men. There were some outstanding queens and empresses in European history: Isabella of Castile, Elizabeth I of England, Christina of Sweden, Maria Theresa of Austria, Catherine II of Russia, and others. But some of these great women had their difficulties with male political culture. And there were no female ministers and officials. Even the enlightened philosopher Immanuel Kant considered women political nonentities because they lacked economic independence. After Wyoming 1869 and New Zealand 1893 female suffrage was introduced in Finland in 1906, Austria, Germany and Poland in 1918, the USA in 1920, Britain in 1928, Spain in 1931, France in 1944, and Switzerland in 1971. But in the late twentieth century female heads of state or government, ministers, parliamentarians, and even female civil servants were still a minority.
Originally, humans identify themselves mainly with their face-to-face community. ‘Patria,’ the land of the fathers, and ‘Natio,’ the community of birth, even during the early modern period quite often designated rather small territorial entities. However, through state-building people learned to apply them to their country in the large. Finally, after the French Revolution self-defined linguistic or cultural communities, peoples in search of a state considered themselves nationalities, that is, nations to be. In contrast to ideology, nations are not natural, but invented entities. This does not imply complete artificiality and arbitrariness. There is always some kind of basis in reality, such as common descent, language, religion, or territory. Most important is a common history and the consequent option for its continuity. Nationalism as a system of symbols constitutes a community as a nation, integrates its members, and defines the difference from other communities. For radical nationalists the nation is the highest value and the ultimate source of meaning; nationalism becomes a religion and as such a way to tap the material and emotional resources of subjects, especially in war. Therefore, nation-building became the final phase of European state-building, whereas new states in Europe and worldwide tried to constitute themselves as nation-states right from their foundation. Expanding political participation and emotional mobilization of subjects correspond to a further extension of government, of state power. A common denominator is a high degree of social discipline in the subject. Church and State had been working on this since the late Middle Ages, but achieved this goal only in the nineteenth and twentieth centuries with the assistance of armies, schools, and factories (Weber 1964). The citizens of a fully developed modern state enjoy the legal protection of their person and property and some political participation. 
In return, a citizen is subject to three fundamental duties, compulsory school attendance, compulsory military service, and general liability to taxation. Based upon the fiction of the sovereignty of the people the state apparatus is able to extend its competence at its own discretion and to transform itself into a totalitarian system as happened under the impact of nationalist, racist, and communist ideologies in the twentieth century. Some of them proclaimed the end of the state, but that proved impossible. In contrast, the totalitarian state turned out to be the final stage of development of the modern state. And at a closer look, even the modern welfare state proves to be a soft variety of totalitarianism. Beginning with limited social security programs before World War I (1883–89 creation of an insurance system for workers in Germany) the welfare state exploded after World War II. Between 1950 and 1983 the West-European average of social security expenses rose from 9 percent of the GNP to 25 percent. But the total care for the citizens means also total
regulation of their lives. Even the neo-liberal reaction is based upon large-scale government action in the service of a general economic and social policy; goals, not methods, differ from former socialist welfare policy. Six hundred years of European expansion made the modern European state the world’s standard model of political organization. The colonial regimes of Spanish and British America established in the sixteenth and seventeenth centuries adopted premodern forms of European statehood and political culture. Therefore, most Latin American states remain authoritarian but weak states even today, whereas traditional self-government in a comparatively egalitarian society made the USA a successful nation, although it lacked certain traits of the modern state such as belief in the state as a value in itself, in bureaucratic centralism, in a strong professional army, and in an established church. From 1839 to 1935 Britain’s white settler colonies became sovereign states, mostly combining the British and US models, whereas in the nineteenth and early twentieth centuries East Asian and Muslim empires tried to transform themselves into modern states to cope with the weakness of their traditional regimes when confronted with the West, in the case of Japan with tremendous success. Finally, during several waves of decolonization after World War II, the lot of quite artificially created colonies became equally artificial nation states. For the first time a global community of international law and moderately effective international organizations such as the United Nations (UN) became reality. Quantitatively and qualitatively the modern state reached its culminating point during the twentieth century.
6. The Disintegration of States and Possible New Forms of Political Life
In the last third of the twentieth century, culmination of the modern state turned into decline. To begin with, most ‘new’ states worldwide suffer from serious shortcomings, such as lack of democracy and efficiency, general corruption, and political disintegration. The export of European state models was a success only in three specific constellations: (a) where it was accompanied by the export of European civil society and political culture as in the cases of the USA and the British dominions; (b) where it met a polity and society compatible with its own structure as in the case of Japan; and (c) where colonial rule lasted long enough to create a Western-style civil society as in the case of India. For the rest, the new states of the former colonial and the former communist worlds tend to lose their sovereignty internally to the para-statehood of old or new chiefdoms or ethnic communities, externally to international organizations like the UN or the International Monetary Fund (IMF). They become mere ‘quasi-states’ (Jackson 1990).
At the same time European states were also, to some extent, disintegrating into ‘tribes,’ were losing sovereignty to ‘chiefs,’ pressure groups, and international organizations, and were no longer able to provide complete social security. As a consequence of totalitarianism and mass participation, the modern state turned self-destructive. The excesses of nationalism, racism, and communism destroyed the necessary faith in the state. Social discipline was questioned by the common desire of self-realization. New ethnic, regional, and social movements started to compete with the state for the loyalty of the subjects, some on the basis of democratic self-determination, some with the use of violence. Withdrawal of loyalty led to an underground economy expanding at the expense of the tax and welfare systems. The necessity to win elections by bargaining with interest groups and by providing additional welfare benefits for the electorate ended in a debt-trap with almost the whole budget allocated in advance. In addition, economic globalization allows important tax-payers to move to less restrictive countries with lower wages and taxes. However, for economic and security reasons most states lose more of their sovereignty to international organizations. The European Union has not yet become a federal state and suffers from deficient democracy and social policy, but nevertheless common economic interest has released remarkable binding forces. The right to declare and to wage war, once the very essence of sovereignty, has become obsolete. Instead of interstate wars and peace treaties we have hundreds of ‘low intensity conflicts’ (van Creveld 1991) which tend to respect neither non-combatant status nor national borders. Nevertheless, the national state is still there. But it has definitely lost the monopoly of power and policy. In power processes, it tends to become one competitor among others.
Unity, the constituent principle of modernity, the monocratic polity, is being replaced by a postmodern plurality of multiple political units coordinated by the necessity to cooperate and not by state power. Certainly, this pluralism might end in chaos. But it might also prove a new way for humankind to seek the political pursuit of happiness.
See also: Absolutism, History of; Aristocracy/Nobility/Gentry, History of; Balance of Power, History of; Bureaucratization and Bureaucracy, History of; Citizenship, Historical Development of; Citizenship: Political; Citizenship: Sociological Aspects; Communism, History of; Conservatism: Historical Aspects; Contemporary History; Democracy; Ethnic Groups/Ethnicity: Historical Aspects; Frontiers in History; Global History: Universal and World; Historiography and Historical Thought: Current Trends; History and Memory; Human Rights, History of; Interest Groups, History of; International Relations, History of; Liberalism; Military History; Modernization and Modernity in History; Nation: Sociological Aspects; National Liberation Movements; National Socialism and Fascism; Nationalism: General; Nationalism, Historical Aspects of: The West; Nations and Nation-states in History; Parliaments, History of; Peacemaking in History; Play, Anthropology of; Political History (History of Politics); Political Parties, History of; Psychology: Historical and Cultural Perspectives; Racism, History of; Social Movements, History of: General; Socialism; State and Society; State: Anthropological Aspects; State Formation; State, Sociology of the; Urban Anthropology; Violence, History of; Warfare in History; Welfare State
Bibliography
Anderson M S 1993 The Rise of Modern Diplomacy, 1450–1919. Longman, London
Bayart J-F 1989 L’État en Afrique. La Politique du Ventre. Fayard, Paris [English translation 1996 The State in Africa. The Politics of the Belly. Longman, London]
Berman H J 1983 Law and Revolution. The Formation of the Western Legal Tradition. Harvard University Press, Cambridge, MA [1991 Recht und Revolution. Die Bildung der westlichen Rechtstradition. Suhrkamp, Frankfurt, Germany]
Blickle P (ed.) 1997 Resistance, Representation, and Community. Clarendon Press, Oxford, UK [1998 Résistance, Représentation et Communauté. Presses Universitaires de France, Paris]
Blickle P 2000 Kommunalismus. Geschichte einer gesellschaftlichen Organisationsform. 2 Vols., Oldenbourg, Munich, Germany
Bonney R (ed.) 1995 Economic Systems and State Finance. Clarendon Press, Oxford, UK [1996 Systèmes Économiques et Finances Publiques. Presses Universitaires de France, Paris]
Coleman J (ed.) 1996 The Individual in Political Theory and Practice. Clarendon Press, Oxford, UK [1996 L’individu Dans La Théorie Politique et Dans la Pratique. Presses Universitaires de France, Paris]
Contamine P (ed.) 2000 War and Competition Between States. Clarendon Press, Oxford, UK
Creveld M v 1991 The Transformation of War. Free Press, New York [1998 Die Zukunft des Krieges. Gerling, Munich, Germany]
Creveld M v 1999 The Rise and Decline of the State. University Press, Cambridge, UK [1999 Aufstieg und Untergang des Staates. Gerling, Munich, Germany]
Edelman M 1972 Politics as Symbolic Action. University of Illinois Press, Urbana, IL [1976 Politik als Ritual. Die symbolische Funktion staatlicher Institutionen und politischen Handelns. Campus, Frankfurt, Germany]
Edelman M 1974 The Symbolic Uses of Politics. Markham, Chicago, IL [1976 Politik als Ritual. Die symbolische Funktion staatlicher Institutionen und politischen Handelns. Campus, Frankfurt, Germany]
Ellenius A (ed.) 1998 Iconography, Propaganda, and Legitimation. Clarendon Press, Oxford, UK
Finer S 1997 History of Government From the Earliest Times, 3 vols. University Press, Oxford, UK
Hamilton K, Langhorne R 1995 The Practice of Diplomacy. Its Evolution, Theory and Administration. Routledge, London
Henshall N 1992 The Myth of Absolutism. Change and Continuity in Early Modern European Monarchy. Longman, London
Hobsbawm E J 1990 Nations and Nationalism since 1780: Programme, Myth, Reality. University Press, Cambridge, UK [1991 Nationen und Nationalismus. Mythos und Realität seit 1780. Campus, Frankfurt, Germany; 1997 Nations et Nationalisme Depuis 1780: Programme, Mythe, Réalité. Gallimard, Paris]
Hoffman J 1995 Beyond the State. Polity Press, Oxford, UK
Jackson R H 1990 Quasi-states. Sovereignty, International Relations, and the Third World. University Press, Cambridge, UK
Jellinek G 1959 Allgemeine Staatslehre, 3rd edn. Wissenschaftliche Buchgesellschaft, Darmstadt, Germany [1st edn. 1900]
Jouvenel B D 1945 Du Pouvoir. Histoire Naturelle de sa Croissance. Bourquin, Geneva, Switzerland [1972 Über die Staatsgewalt. Naturgeschichte ihres Wachstums. Rombach, Freiburg, Germany; 1993 On power. The Natural History of its Growth. Liberty Fund, Letchworth, UK]
Koenigsberger H G (ed.) 1988 Republiken und Republikanismus im Europa der Frühen Neuzeit. Oldenbourg, Munich, Germany
Mann M 1986–1993 The Sources of Social Power, 2 Vols. Cambridge University Press, Cambridge, UK
1995–2000 The Origins of the Modern State in Europe, 13th–18th Centuries. 7 vols. Clarendon Press, Oxford, UK [1996–1998. Les Origines de l’État Moderne en Europe, XIIIe–XVIIIe Siècle. 7 vols. Presses Universitaires de France, Paris; 1998 sq. Los Orígenes del Estado Moderno en Europa, Siglos XIII a XVIII. 7 vols. Fondo de Cultura Económica, Madrid, Spain]
Orwell G 1945 Animal Farm. Secker & Warburg, London
Padoa-Schioppa A 1997 Legislation and Justice. Clarendon Press, Oxford, UK
Reinhard W 1992 Das Wachstum der Staatsgewalt. Historische Reflexionen. Der Staat 31: 59–75
Reinhard W 1996 Power Elites and State Building. Clarendon Press, Oxford, UK [1996 Les Élites du Pouvoir et la Construction de l’État en Europe. Presses Universitaires de France, Paris; 1997 Las Élites del Poder y la Construcción del Estado. Fondo de Cultura Económica, Madrid, Spain]
Reinhard W 1999 Geschichte der Staatsgewalt. Eine Vergleichende Verfassungsgeschichte Europas von den Anfängen bis zur Gegenwart. Beck, Munich, Germany
Reinhard W (ed.) 1999 Verstaatlichung der Welt? Europäische Staatsmodelle und außereuropäische Machtprozesse. Oldenbourg, Munich, Germany
Ritter G A 1991 Der Sozialstaat. Entstehung und Entwicklung im internationalen Vergleich, 2nd edn. Oldenbourg, München, Germany
Weber M 1964 Wirtschaft und Gesellschaft. Studienausgabe. Kiepenheuer und Witsch, Köln, Berlin, Germany [1st edn. 1921, 1978 Economy and Society. University of California Press, Berkeley, CA]
W. Reinhard
State, Sociology of the
Few terms are as ubiquitous in the social sciences as that of the ‘strong state,’ yet there is no agreement on how to define its scope. The strong state crops up in the writings of economists, political scientists, and sociologists. It has been labeled both a destroyer and creator of wealth. It has been connected both to chronic underdevelopment and, just as frequently, to high levels of economic performance. If everyone knows what a strong state is, why do they disagree over its role?
1. Strong and Weak States
In the conventional, ‘neoclassical’ view of the state and economic development, two types of government exist, strong and weak, with the latter usually given better grades for promoting economic performance (Klitgaard 1991). In this view, strong states typically choose interventionist strategies for development. This notion is largely based on the strong states of Latin America, where state interventions in economic affairs include populism, import substitution, and regulatory manipulation of markets, often designed to obtain substantial and, typically, unsustainable redistributive goals (see Dornbusch and Edwards 1991). The usual justification for developing countries to adopt strong activist states has been that the private sector would not provide the massive public investments needed for economic growth. In reality, however, strong states often ended up pursuing power for the sake of their own leaders; elitism, corruption, and misallocation of resources resulted. Disillusionment with the performance of state-led growth strategies caused critics to champion weak states, which foster growth by leaving markets alone (Friedman and Friedman 1980). The minimalist state is touted for doing no more than providing a stable, credible currency and a strong legal system designed to enforce private property rights and contract law. On the other hand, such weak states have historically been unable to prevent the capture and distortion of economic policy by oligopolies and rent-seeking elites. The conventional language of strong vs. weak thus fails to explain a government’s ability to foster or hinder economic performance. East Asia’s high-performing economies (South Korea, Malaysia, Taiwan, Hong Kong, and Singapore) dramatically demonstrated the inadequacy of the traditional ‘neoclassical’ categories. In this region strong, not weak, states enabled the private sector to make a major contribution to growth. Contrary to expectation, the weakest states in East Asia (Indonesia and the Philippines) did not nurture strong private sectors; in fact, they were much less successful than the others in terms of generating home demand and accumulating capital and skills. In response to the economic successes of these activist East Asian governments, a new generation of statist scholars has emerged to advocate the role of a strong state in promoting economic development (see Campos et al. 1994, Johnson 1982, Wade 1990a, 1990b, Evans 1992). The result is a strong polarity in the literature. In one camp, a strong state is viewed as an obstacle to economic development; in the other, state activism is touted as an essential component in the most celebrated cases of economic growth since the end of World War II. Both of these views contain important elements of truth, but both are too limited to account for the varied performance of developing nations.
2. Weak States
A weak government is handicapped by the inability to enforce property rights, or even many of its own laws. Because of the legal inadequacy of a weak state, its interventions are based not on rules, but on administrative or executive discretion; this results in corruption and opportunism rather than supervision. In fact, insufficient legal clarity is often intentional, thus facilitating random interventions by political authorities. For example, to compensate for rampant tax evasion, governments in developing countries devise tax regimes that allow considerable official discretion in assigning rates. But the capricious application of expansive tax powers drives private firms into the informal sector. Deprived of these tax revenues, the government cannot provide basic services such as law enforcement, and few benefits remain for those firms that do disclose profits. When transparency becomes a risk, firms will have difficulty attracting capital, and the government will become weaker still. When states are too weak to control their own officials, agents of the state may act independently of one another. Thus, a policy of reform may be announced by one part of the government, while another part blocks its implementation. The inability to enforce rules reduces private business activity unless some important actor of the state, such as the president’s family or inner circle, intervenes (for example, Indonesia under Suharto, the Philippines under Marcos, or Russia under Yeltsin). Political power is needed to defend property rights when the impartiality of the legal system is compromised or when the rules are themselves inadequate. When countervailing institutions such as an independent legislature or judiciary are weak, the leader can provide market access to chosen favorites in return for their loyalty, disabling the discipline of market forces.
Bureaucrats in a weak state learn to use their power and access to information to extract value from proposed economic activity rather than to promote that activity. With corruption flourishing at all levels of government (Shleifer and Vishny 1993), the deal-making authority of leadership becomes essential. In this regard, a weak state acquires the same market-destroying tendencies of strong discretionary states, in which property rights enforcement is at the discretion of a handful of individuals.
Although weak states respond to interest groups, they often cannot enforce bargains made with important constituents. Interventionist rules typically cannot be consistently enforced. As a result, reform measures, property rights, and contracts are not durable because the state lacks the means to implement and maintain them. Unable to enforce these minimal conditions, weak states can rarely sustain competitive markets. Instead, networks of individuals with special connections to the political regime are needed to sustain trade. Inevitably, officials of weak states are likely to collude with select, but powerful, private groups to extract the nation’s resources. A weak state easily becomes a predatory state. Many of the postcolonial African states fall into this category. Publicly disinterested bureaucracies in which pay and promotion were linked to performance did not emerge, compromised by governments that disbursed employment in exchange for loyalty. Dependency on foreign capital, including aid, discouraged accountability to citizens while eroding national sovereignty. Ethnic, religious, and regional fragmentation gave rise to political parties that enjoyed only narrow social support. Excessive centralization was often pursued to overcome these forces of national disintegration. However, highly centralized governments allowed bureaucracy to be replaced by a single-party or no-party system with national planning bodies controlling public goods and services. Local governments that were allowed neither independent judicial authority nor the resources to develop managerial and technical capacity failed to deliver basic services. The legislature and judiciary were dominated by the executive branch, eliminating independent oversight of budgetary allocations. The result of this excessive centralization was often a weak state dominated by the unbridled discretion of the executive, degenerating into personal rule and kleptocracy.
Support for government collapsed, and many African states became incapable of providing basic public order.
3. Strong and Unlimited States
Strong and unlimited states, too, can act in a discretionary fashion. Lacking constitutionally defined limits, they are likely to be both interventionist and confiscatory. Able to alter rights and markets at will, they are incapable of making a credible commitment to private sector development. (Contemporary examples include Vietnam, Syria, Ethiopia, Iraq, and the USSR.) To induce participation in markets, strong and unlimited states must offer protection in the form of monopoly rights, preferential tariffs, or guild or trade union privileges. The exercise of unlimited political discretion allows them to promise excessive benefits to win the support of particular groups. This, in turn, creates political risk, reducing economic investments and undermining the possible outcomes of reform. The irony is that due to their unlimited power, no contract with the government or of interest to the government is secure. A strong and unlimited state has much in common with a weak state; both are likely to become predators of private sector wealth.
4. Strong but Limited States
A third type of state, which arose in Western Europe, is not only strong enough to establish and maintain property rights, but is constitutionally prevented from violating these rights. This state requires an administrative structure that keeps the economic and political activities of regime officials separate. Such a state must be strong enough not only to adopt rules that can sustain a competitive private sector, but also to prevent itself from responding to the political forces that inevitably arise to monopolize access to the marketplace. Only this type of state will ensure competitive access to open markets. Strong but limited states are often mistaken for minimal states, since in comparison with interventionist states, they directly provide a relatively small number of goods and services. But the term minimal gives the impression of weakness, which is misleading. These states must instead be strong enough to withstand the political pressure on government to intervene in markets. Hong Kong under British rule was perhaps a perfect example of such a state in which ‘positive noninterventionism’ protected the market from pressure groups, rent seekers, and unscrupulous officials. Like the governments of Western Europe and North America, East Asia’s other high-performing economies (Japan, Malaysia, South Korea, Singapore, and Taiwan) all accepted limits on their discretion over private sector profits. All adopted mechanisms to separate the political and economic functions of government and to protect property and contract rights.
5. Implications
In distinguishing strong but limited government from the conventional perspective that identifies only weak and strong government, the advantages of each and the contradictions between them can be understood. In the conventional view, the two types of strong states are not distinguished. A state that carries out and maintains highly interventionist policies, such as import substitution or subsidizing a highly unionized urban workforce, is confused with a strong but limited state that plays a neutral role in the allocation of rents and subsidies. Federalism in the USA, for example, offers an institutional foundation for a strong but limited state which is able to foster equitable growth (see Weingast 1995, 1997).
In many emerging states, technical and practical knowledge is unevenly distributed. Often a small group of capitalists dominates local industry, agriculture, and access to the external market. The superior technical knowledge of these small groups forces the government to depend on them to build the market, which allows representatives from that group to gain control over the government’s regulatory apparatus and ultimately over the state. When economic power is concentrated in a few governmental ministries, it is more vulnerable to capture than when decision-making powers are dispersed among institutions that have veto power over each other. Strong but unlimited states often promote corruption and rent-seeking, exacerbating social and regional inequality. As noted earlier, both strong unlimited and weak states are prone to plunder the markets. What ‘neoclassical’ wisdom calls a weak state can often more aptly be described as a strong but limited state that can maintain the market, providing the necessary public goods while resisting capture by interest groups. While the champions of state activism raise reasonable doubts about the earlier neoclassical view, they do not explain why some strong states promote economic development while others pursue policies that hinder it. After all, the majority of countries that pursued activist policies experienced negative growth along with exacerbated social inequality. Even if the arguments advocated by statists—that these states have made a direct and significant contribution to economic growth—are taken at face value, they have done no more than show that this role for the state is feasible, something long known. But why have some strong states promoted growth rather than other political goals? The conventional view offers little help.
By contrast, the example of East Asia’s successful regimes suggests that states are strong when critical and durable limits channel government behavior into activities compatible with economic development.
6. Achieving Strong but Limited Government
The development of strong but limited government depends on the existence of institutions that make it in the interests of political officials to abide by limits on their power. In many reform environments, the desire to create strong but limited government often drives leadership to increase the executive’s discretionary power. Both Mikhail Gorbachev in the former USSR and Boris Yeltsin in Russia succumbed to this approach. Many Latin American leaders have also been attracted to it. Reform by mechanisms such as presidential decree is a two-edged sword, however. On the benefit side, the reform-minded leader gains the power to design and implement a reform package by eliminating many of the veto points blocking reform.
On the cost side, all such reform programs can also be undone by decree and hence are only as good as the paper on which they are written. For economic reform to be durable, it must be costly for political authorities to renege on promises once made. As the US federalists realized, durability cannot depend upon altruism. No government should expect public service to attract only godlike figures who always place the needs of the public before those of their family or friends. Institutions must be designed so that political actors have incentives to maintain and preserve the system, while carefully delineating where their power to intervene in private economic activity begins and ends. A government’s commitment to market-led growth requires the design of appropriate government institutions that ensure clear boundaries between private and public, and that protect both property rights and agreements with the business sector. The creation of such boundaries will shift the investment horizons of investors toward the long-term so that growth can be sustained. Are boundaries that protect private wealth from government plunder more likely to be erected in weak or strong states? In their effort to establish competitive market economies, weak states often falter because narrow interest groups obtain control over the market, over the judiciary, and ultimately over the polity. Oligopolies often arise to overcome systemic market risk caused by the weak enforcement of laws and contracts. In countries as diverse as Indonesia, the Philippines, and Russia, elites have enough resources to block necessary reforms and to prevent transparent governmental practices from being implemented. Strong and unlimited states will falter because their leaders may refuse to implement reforms that dilute centralized power and allow a decentralized private sector to grow.
7. The Asian Miracle and the Sociology of the State
The successful economies of East Asia offer examples of strong states that encouraged economic growth by institutionalizing the following set of limits on executive discretion: (a) channels of representation for businesses to induce them to make long-term investments, (b) wealth-sharing schemes to elicit the support of the wider population for development, and (c) a competent bureaucracy encouraging credible government commitment to long-term, growth-enhancing policies. The significant improvements in income distribution that accompanied sustained economic growth in the high-performing East Asian economies resulted from effective use of public and private sector institutions and organizations. Common throughout East Asia were deliberation councils, which, although limited to particular sectors, offered input into the policy process and had veto power over the outcome. The councils provided greater predictability, improved transparency, and procedural guidelines and precedents for the exchange of views between leadership and the private sector. The leadership of East Asia generally avoided undertaking major policy directives without first consulting with the major players targeted by the regulation. The councils made information about who gets what from government a matter of record, thereby limiting the opportunities for preferred firms to engage in behind-the-scenes favor seeking or private appropriation of gains from insider information. Ultimately, deliberation councils served as commitment devices, helping to reduce the tendency of government to renege on contracts with private holders of wealth. A range of institutional innovations promoting broad-based health, education, and housing were designed to make credible the regime’s promise of shared wealth and growth. These efforts enabled leadership to gain the cooperation of the citizenry. Businesses invested in fixed capital because the potential risks of social unrest were mitigated. Encouraged by anti-inflationary macroeconomic policies, the wider population saved a considerable portion of their income and invested in education, confident of being able to enjoy the future gains. The introduction of institutions that encouraged equitable growth was an explicit policy choice by government that contributed directly to an improved investment environment. Sharp contrasts between the successful and less successful bureaucracies in Asia are instructive. The bureaucracies of Japan, Korea, Singapore, and Taiwan, and some bureaus in Malaysia and Thailand, enjoyed merit-based recruitment and promotion, well-defined career paths, competitive salaries relative to the private sector, pension systems, and the possibility of retirement to a board seat in a large private or public enterprise.
All but Thailand and Indonesia upheld harsh penalties for corruption that generated (relative) competence and integrity among bureaucrats. Even in the less competent bureaucracies in Indonesia and Thailand, explicit spending limits and a binding commitment to monetary stability limited discretion. Improving the productivity of the civil service so that it could enforce rules and ensure standards was a first step in the encouragement of private sector confidence among all of East Asia’s high performers. The civil service was rewarded for its conformity with agreed-upon rules that circumscribed arbitrary and capricious interventions in its oversight of the economy. Merit supplanted personal patronage so that government would be more responsive to general interests. Although direct political participation was weak, nonelites in all of the high performers shared the fruits of growth, as evidenced by declining poverty and a reduction in income disparity. In fact, the countries that grew fastest had the most income equality. Wealth-sharing mechanisms created opportunities for the nonelites to pursue productive rather than politically destabilizing activities. Moreover, growth-promoting policies contributed to regime legitimacy, helping to encourage broad social support for the government. While each country determined its own specific pattern of institutional mechanisms and the purposes to which they were devoted, their experience shows that progress toward sustained, self-reliant, and more equitable development depends, in part, on the strength and quality of a country’s institutional and organizational capacity. The creation and utilization of predictable procedures distinguish effective development leadership. The difference between development based on political will and that based on institution building is essentially that the latter is sustainable, while the former is not. As indicated by the success of East Asia, Western Europe, and North America, strong but limited states have historically had the best record for economic performance. Having defined the character of institutions that promote growth, there remains the conundrum of why only a small minority of world leaders have based their political survival on the competent selection of economic policies. East Asia’s success firmly supports the generalization that governments will realize the goals of economic efficiency only when leadership depends on broad social foundations. The erection of such foundations continues to be the most elusive component of economic growth around the world.
See also: Citizenship, Historical Development of; Civil Society/Public Sphere, History of the Concept; Development and the State; Nation: Sociological Aspects; Nationalism: General; Nationalism, Historical Aspects of: The West; Nations and Nation-states in History; Policy History: State and Economy; Power in Society; Power: Political; State and Society; State: Anthropological Aspects; State Formation; State, History of; Women’s Suffrage
Bibliography
Campos E, Levi M, Sherman R 1994 Rationalized bureaucracy and rational compliance, Working Paper 105, Institutional Reform and the Informal Sector. University of Maryland, College Park, MD
Dornbusch R, Edwards S 1991 The Macroeconomics of Populism in Latin America. University of Chicago Press, Chicago
Evans P 1992 The state as problem and solution: Predation, embedded autonomy, and structural change. In: Haggard S, Kaufman R (eds.) The Politics of Economic Adjustment. Princeton University Press, Princeton, NJ, pp. 139–81
Friedman M, Friedman R 1980 Free to Choose: A Personal Statement. Harcourt Brace Jovanovich, New York
Johnson C 1982 MITI and the Japanese Miracle. Stanford University Press, Stanford, CA
Klitgaard R 1991 Adjusting to Reality: Beyond ‘State versus Market.’ ICS Press, San Francisco
Shleifer A, Vishny R W 1993 Corruption. Quarterly Journal of Economics 108: 559–617
Wade R 1990a Governing the Market: Economic Theory and the Role of Government in East Asian Industrialization. Princeton University Press, Princeton, NJ
Wade R 1990b Industrial policy in East Asia: Does it lead or follow the market? In: Gereffi G, Wyman D (eds.) Manufacturing Miracles: Paths of Industrialization in Latin America and East Asia. Princeton University Press, Princeton, NJ, pp. 231–66
Weingast B R 1995 The economic role of political institutions: Market-preserving federalism and economic development. Journal of Law, Economics and Organization 1(April): 1–31
Weingast B R 1997 The political foundations of democracy and the rule of law. American Political Science Review 91(June): 245–63
H. L. Root
States and Civilizations, Archaeology of
1. Civilizations
‘Civilization’ has had two meanings in reference to ancient societies, both of which require critical review. First, civilization has designated complex prehistoric cultures with productive economies, class stratification, and striking intellectual and artistic achievements. Civilization in this sense has contrasted with ‘primitive’ or ‘tribal’ societies, i.e., societies with modest subsistence economies, egalitarian social relations, and less extravagant art and science. Civilizations have been equated with progress and cultural superiority, a view promoted by European elites of the early modern era. Using ideas of civilization and progress, these elites assumed the role of civilizing agents to legitimate their domination of the working class at home and colonial peoples abroad. Modern anthropologists reject narratives of civilization and progress, preferring less ethnocentric accounts of non-Western cultures. Anthropologists now use more neutral terms to refer to the complex cultures of the ancient world, calling them complex societies or archaic states.
Civilization has also referred to the enduring cultural traditions of specific regions of the world. Distinctive cultural principles characterized large geographical areas of the world for long periods of time, outliving the expansion and collapse of individual polities. For example, the idea of cyclical time was a fundamental principle among the Olmec, Maya, and Aztec people of ancient Mesoamerica, and it is an idea that persists today among indigenous peoples in Mexico and Guatemala. While contemporary anthropologists recognize the persistence of such orienting concepts, they reject the possible inference that these long-lived ideas imply cultural stasis. In the past as in the present, old concepts were reinterpreted in the face of new situations. Nor do these enduring principles imply uncontested social harmony. People of different social standing might share a set of cultural principles, but still arrive at different conclusions regarding appropriate behavior.
2. States
State has been a more useful term for analyzing complex societies. States are powerful, specialized institutions for public decision-making and control. State government was the most important element of complex ancient societies. The state created a surplus economy that supported many of the other institutions characteristic of complex societies: infrastructural investments, administrative bureaucracies, full-time craft specialization, long distance trade, market exchange, law courts, armies, and priesthoods. Every archaic state had among its key institutions: (a) some form of surplus extraction, e.g., corvée, tribute, or taxation; (b) a bureaucracy that supervised the upward flow of tribute payments and the downward flow of tribute demands; and (c) coercive institutions, including armies, law courts, and police, to ensure tribute payment. Archaeologists once characterized archaic states as having relatively simple systems of class stratification: nobles and commoners, or tribute-receivers and tribute-payers. But more recently, archaeologists have emphasized the diversity of groups within the state, their complex interactions, and the state’s chronic instability. Archaic states were constantly threatened by both internal and external forces. High-ranking nobles led factions that competed for positions of rule. Provincial leaders stood ready to declare their independence from state control. Lineages and ethnic groups resisted the state’s appropriation of their collective assets. Households tried to evade the state’s demands on their goods and labor. The history of states and civilizations is the history of strategy and counter-strategy deployed by oppositional groups, leading to the development of complex societies and their dissolution.
3. Key Characteristics of Archaic States—Economics
Economic activity in archaic states was guided by the principles of efficiency, security, and control. All economic actors took into account the efficiency (e.g., the cost:return ratio) of economic alternatives. But peasants and household-based craft specialists often sacrificed efficiency for security; that is, they avoided economic alternatives that placed their annual subsistence at risk. States, on the other hand, often sacrificed efficiency for control; that is, they organized some forms of production so as to monopolize output rather than to maximize returns. Peasants, too, were sometimes guided by their desire for control; they were inclined to produce goods whose attributes (e.g., perishability) made state appropriation less likely.
3.1 Agricultural and Pastoral Systems
Archaic states were characterized by highly productive agricultural systems. The vast majority of these states were dependent upon agriculture augmented by some form of water-control, terracing, and/or fertilizer. Mexico is famous for its chinampas (raised fields), Peru for its terrace systems, Mesopotamia for its irrigation canals, and China for its rice paddies. Such systems enhanced both agricultural productivity and reliability. At one time, archaeologists believed that states came into existence in order to construct and manage intensified agricultural systems as a response to the problem of population pressure (this is the ‘hydraulic hypothesis’ of state formation). But periods of state formation do not coincide with periods of population pressure, and the study of contemporary irrigation systems in tribal societies suggests that such systems can be built and managed without centralized forms of government. States did have good reason to construct agricultural facilities. First, such systems provided income to the state. Agricultural facilities constructed with corvée labor represented prime agricultural land that the state could claim as its own without alienating existing property owners. Second, agricultural facilities attracted tax-paying settlers. Peasants were drawn to newly available, high quality land, and they were willing to pay for access to it. Third, agricultural facilities expanded food production within the immediate hinterlands of political capitals, making it possible to feed the urban populace. The desire for more land, greater crop security, and larger quantities of marketable food also led peasant households to invest in agricultural facilities on their own. State-sponsored agricultural investment can be distinguished from peasant investment through careful study: uniformity in the timing and techniques used in facility construction is the best indicator of state sponsorship.
The long-term environmental consequences of agricultural development were a potentially destabilizing factor in all ancient states. In Mesopotamia, irrigation led to the salinization of the soil, first causing agriculturalists to switch from wheat to more salt-tolerant barley and then to abandon salinized areas altogether.
3.2 Tribute, Taxes, and Private Estates
All archaic states needed wealth to fund their centralized power. Taxes could be paid in labor, goods, or currency. A tribute in labor was least likely to provoke peasant resistance, since labor could be ‘reciprocated’ with food and drink, presented to workers by the state in a party-like atmosphere. This form of taxation may characterize the earliest states (i.e., states that still had to contend with kinship groups and kinship norms) or states that legitimated their rule with ideologies of collective wellbeing. A tribute in goods may have its origins in the spoils of war, tribute representing spoils paid on a sustained-yield basis. Agricultural staples and woven cloth were the most common forms of tribute goods, but metals, precious stones, and other exotic valuables were also collected. Tribute foodstuffs were used to feed state personnel. Tribute luxury goods enabled rulers to reward loyal servants and to control the formation of social identities. Because of their bulk, tribute goods were often stored in provincial warehouses, potentially funding revolts by provincial leaders. Substituting a monetary tax for tribute goods reduced transport costs and provided more centralized control over wealth. But monetary taxation occurred only in conjunction with market systems, where money could be converted into needed goods and services. Tribute demands often transformed rural landscapes and households. Greek farm families under Roman rule abandoned their rural farmsteads and moved to towns where economic diversification could supplement their incomes and provide the necessary cash for taxes. In the Valley of Oaxaca, the state’s demand for maize led to the cultivation of more distant fields which, in turn, altered routines of food preparation. More dry foods (i.e., tortillas) were made so that meals could be carried and eaten away from home. Peasant households sometimes tried to escape tribute payments.
In Mexico under the Aztecs, maguey cultivation expanded, perhaps because its perishable sap was not easily appropriated by the state. In Mesopotamia, peasant families wishing to avoid taxation sometimes fled to the mountains, becoming nomadic pastoralists.
3.3 Urbanism
Archaic states provided the setting for the earliest cities, i.e., large, socially heterogeneous, permanent settlements. These cities were religious and political centers; their economic functions were much less important than is the case for modern cities. Cities housed the administrative machinery of the state, usually in a multi-roomed palace complex. High-ranking nobles, bureaucrats, and military personnel resided in the capital so that their activities could be directed and monitored by the ruler. Most state capitals also contained shrines, temples, and the residences of high-ranking priests. The concentration of political elites in the capital provided a clientele for enterprising craftspeople, merchants, and other service providers, who swelled the urban populace. Early cities did not supply many of the tools or meager furnishings used by rural peasants; these were usually produced in hinterland towns and villages, by part-time craft specialists. In several cases, the growth of ancient cities was accompanied by the emptying out of the countryside. Both Uruk (Mesopotamia) and Teotihuacan (Mexico) grew explosively at the expense of rural settlements. In these cities, peasant farmers and farm laborers made up as much as 75 percent of the urban population. Both push and pull factors may have moved farmers from the country to the city. The security, sanctity, and commercial opportunities of the city provided powerful draws; at the same time, rural people may have been forcibly removed to the cities in order to facilitate supervision by the state and to loosen their claims on the land. Cities developed only minimally in early Egypt. In Egypt, the concentration of population along the banks of the Nile, and the ease of waterborne communication up and down the river, may have made it unnecessary to concentrate elites and rituals in urban centers.
3.4 Markets
Markets often appear in conjunction with cities. Anthropologists are divided on the issue of whether markets developed from the ‘top down’ or from the ‘bottom up.’ ‘Top down’ theorists postulate that markets developed from the demands of urban elites for food and status goods. ‘Bottom up’ theorists view markets as developing from the efforts of peasants to enhance the efficiency and/or economic security of their households by participating in specialized production and exchange. Regardless of their origins, markets attracted the attention of political leaders. Market taxes were an important source of revenue to rulers, and markets provided an efficient means of provisioning cities. States provided both direct and indirect stimuli to market growth. Rulers promoted markets by offering merchants protection against thieves and bandits and maintaining market order. Indirectly, states stimulated market participation by peasant households. Tribute demands forced peasants to become more efficient producers, engaging in specialized production and market exchange. In addition, the social hierarchies that accompanied state formation sometimes prompted peasants to consume specialist-produced goods to enhance their social standing.
In the long run, markets weakened the state’s control over the economy. Merchant guilds, formed during periods of state collapse, resisted state control of the economy. The wide array of goods offered for market sale made it difficult to enforce the state’s sumptuary rules limiting certain forms of food and dress to state-designated groups. Buying and selling conferred economic power upon individuals who were not state functionaries.
3.5 Full-time Craft Specialists
Many forms of craft specialization antedate the origins of states. In tribal societies, it is not unusual for households or entire villages to specialize in ceramic production, stone tool production, or even metallurgy. This craft activity is carried out on a part-time basis in conjunction with basic subsistence activities. Craft products are distributed in reciprocal exchanges to strengthen social ties and/or to acquire goods from other specialists. Rural, part-time specialization of this sort was also present in archaic states, often enhanced by the opportunities for exchange provided by urban markets. In situations of incipient social inequality, some craft specialists produced very elaborate goods that required heavy expenditures of labor. These ‘hypertrophic’ goods (such as embroidered textiles, jade earspools, stone sculptures, heavy bronze vessels, and gold jewelry) were commissioned by aspiring leaders to legitimate their claims to social worth. The craftsmen filling these commissions were part-time specialists either living in their own households or as members of the leader’s high-ranking family. The earliest full-time craft specialists in archaic states were the dependents of state rulers. Rulers supported craft specialists to maintain exclusive control over their output. Controlling sumptuary goods enabled rulers to consolidate their power by rewarding loyal servants and controlling social identities. Thus, for example, the Inka ruler supported a certain number of yanakuna and aqllakuna, full-time specialists in metallurgy and elaborate textile production. Gold and silver jewelry and embroidered textiles were used by the ruler to denote statuses within the state hierarchy. Maya rulers sponsored elite kinsmen who specialized in the production of elaborate painted ceramic vessels. Used in royal funerary rituals, these vessels confirmed the noble status of families.
Egypt’s pharaohs supported entire towns of craftsmen dedicated to the preparation of royal tombs. Royal patronage accounts for the high levels of skill and artistry that characterize the finest works of archaic states. Full-time craft specialization also emerged in more commercial contexts when a high demand for craft goods (usually for export trade) combined with developed transportation networks and a steady supply of marketed food. These conditions were met most fully in the Mediterranean world where the waterborne transport of food staples from one broad region to another underwrote really large-scale production and exchange. In late precolonial India and China, reliable agricultural systems combined with riverine and oxcart transport yielded similarly high levels of commercial production and exchange.
4. Key Characteristics of Archaic States—Social Organization
Social heterogeneity is the defining characteristic of complex societies. In addition to the occupational diversity described above, the intersection of class, gender, and ethnicity created an array of social statuses and life histories in the ancient world.
4.1 Class Stratification
In the earliest states, the division between classes was based on political rather than economic status. A key division within society was between those who paid tribute and those who received it. Tribute was paid to the ruler who distributed it to servants of the state (usually close relatives). Key personnel were endowed with their own streams of tribute flow. Thus, the tribute-collecting class in archaic states comprised the ruler and his close relatives, high-ranking military personnel, government administrators, priests, and surviving elements of the pre-state nobility. For the governing class, wealth, power, and prestige coincided and were mutually reinforcing. Conversely, tribute-payers suffered a lower standard of living, a lack of formal power, and low social status. Archaeological and historical research reveals significant variability within the commoner class. For example, Aztec households varied by size, composition, wealth, and ethnic affiliation. Rural households chose different combinations of agricultural production and craft activity. This variability complicates the traditional view that commoners were a homogeneous class. It suggests that commoners were divided by demographic, economic, and social factors. Slaves constituted a class below commoners. Slaves were captured in warfare, purchased through the slave trade, or sentenced to servitude as punishment for crimes or failed debts. Among Aztecs and Mayas, slaves were used as household servants, but they were never very numerous. In Mesopotamia, large numbers of female war captives worked at textile production, and this may have set a precedent for the use of slave labor in commodity production. The use of slave labor in commercial enterprises was extremely important in the highly commercialized economies of ancient Greece and Rome. Private property sometimes developed out of warfare-based economic systems. In China and Aztec Mexico, lands that were seized in warfare and redistributed to loyal soldiers constituted an early form of private property. In Mesopotamia, private property developed long after states arose, through the gradual sale of lands belonging to kinship units to secular or religious elites.
4.2 Gender
In The Origin of the Family, Private Property, and the State (1972 [1884]), Frederick Engels argued that the development of private property brought about the state (to maintain wealth differences) and the patriarchal family (to ensure the legitimacy of inheriting offspring). In the process, women’s status declined from a position of approximate equality with men to an inferior position. This was the ‘world historic defeat of the female sex.’ Modern historic and archaeological scholarship suggests that institutions such as private property and the patriarchal family are not universal characteristics of ancient states, nor is the subordination of women. Nevertheless, gender relations and ideologies were frequently renegotiated during the process of state formation, with results that were detrimental to women. The status of women in states was undermined by several factors. First, states were commonly created and maintained through warfare, and war was usually men’s work. Through warfare, men gained access to spoils, a form of wealth that was not household-dependent. In addition, rulers often glorified male warriors and rewarded them with administrative positions in order to maintain their loyalty. Thus, warfare provided avenues of wealth, prestige, and power for men that were not available to women. Furthermore, rulers dismantled kinship groups, which were nodes of resistance to state rule. These kinship groups often provided the primary rationale and defense of women’s rights in ancient societies, and when they were destroyed, women were left without social recourse. It is not surprising, therefore, that many archaic states favored men. Ancient Rome and ancient China both awarded great authority to male household heads. The Classic Maya and the Aztecs conferred public prestige upon male warriors, and men dominated these state governments. However, the subordination of women to men was not universal.
States that legitimated their rule with ideologies of collective wellbeing did not emphasize warfare and did not stress gender difference. These states might have included Teotihuacan (Mexico), the Harappan states of the Indus Valley, and Egypt under the pharaohs. In the Andes, a long tradition of parallel descent favored gender equality. Even in states where patriarchy was the norm, high-ranking women were wealthier and more powerful than lower class men (i.e., class trumped gender). In peasant families, women’s contribution to the household economy often translated into approximate gender equality. In addition, women found forums for resistance outside of the state. Female oracles provided social critique in Oaxaca, Mexico, ancient Greece, and the Andes.
4.3 Ethnicity
Ethnicity is an identity based upon a presumption of shared history and common cultural inheritance. Ethnic identity is shaped by both ethnic affiliation and ethnic attribution. Ethnic affiliation refers to individuals’ own sense of group membership and the characteristics of the group as defined by its members. Ethnic attribution concerns the characteristics of the group as defined by outsiders. States acted opportunistically and inconsistently in dealing with ethnicity. Sometimes they suppressed ethnic affiliation to weaken the resistance of subject groups to the state. At other times they encouraged ethnic affiliation to accentuate division within the commoner class. States also used derogatory ethnic stereotypes to legitimize their exploitation of subject peoples. For example, Aztecs depicted their Otomí subjects as lazy, improvident, untrained blockheads and gaudy dressers. Stereotyping and ethnic prejudice generally heightened ethnic consciousness and perpetuated ethnic heterogeneity within ancient states.
5. Key Characteristics of Archaic States—Religion and World Views
5.1 State Religion
All archaic states had an official religion with standardized temples and full-time priests. Rulers’ patronage of religion may have sprung from either piety or cynical efforts to manipulate the subordinate class. Either way, religion provided a powerful political resource that was actively exploited by ancient rulers. Many archaeologists argue that state religion was used to forge a unified polity out of distinctive subject peoples and to legitimize the ruler’s authority. For example, Chinese rulers tried to force cultural uniformity upon all the groups that they conquered, and Confucian philosophers declared that just rulers held the ‘Mandate of Heaven.’ The Incas’ effort to impose cultural uniformity on their subjects is archaeologically visible in provincial capitals where large plazas were constructed using Inca architectural forms. Local people gathered in these plazas to attend state-sponsored rituals. In contrast, Aztec state religion was not geared to produce a uniform world view within the empire, nor to sanctify Aztec rulers in the eyes of their subjects. Rather, Aztec religion was intended to win the loyalty of a relatively small target group, the young men who formed the core of the Aztec army, by imbuing their military exploits with cosmological significance. In the Aztec hinterlands, distinctive forms of ceramic figurines suggest that commoners did not accept the state’s ideology; on the contrary, commoners successfully maintained their own world view. Distinctive commoner ideologies may have been present in all archaic states, challenging the official point of view.
5.2 Monumental Architecture
The earliest monuments were megalithic tombs and earthworks built by groups that were still organized using kinship principles. These monuments were probably constructed during communal rituals, when aspiring leaders sought social prominence by sponsoring group festivities. Under state rule, monumental construction increased in scale, creating some of the most striking icons of the ancient world: Egypt’s pyramids, Mesopotamia’s ziggurats, the Colossus of Rhodes. It is estimated that the Great Pyramid of Khufu (Cheops) required 13,440,000 person-days of labor. Some monuments may have been constructed by a willing populace, motivated by civic pride and religious devotion. This would have been most likely in states that legitimated their rule with ideologies of collective wellbeing. At Teotihuacan, for example, the Temple of the Sun and the Temple of the Moon may have been constructed in a surge of popular enthusiasm. In states relying upon coercive rule, monument construction expressed submission to the state. Aztec rulers invited subject communities to participate in the construction of the Great Temple in Tenochtitlan, and refusals to participate were considered acts of rebellion. Through their size, monumental structures symbolized the ruler’s exclusive link to supernatural forces. Monumental structures also expressed the overwhelming power of the state and the fact of commoners’ subordination. Large buildings and massive sculptures imposed themselves on social space and human psyches. In the words of Antonio Gilman (1996), they showed ‘who was boss.’
5.3 Writing
Writing developed in archaic states for many different reasons. The earliest texts, from Mesopotamia at 3300 BCE, recorded the economic transactions of large temple estates. In ancient Egypt and Mesoamerica, writing was first used to record the exploits of rulers. The earliest Chinese inscriptions were questions carved on turtle carapaces and the shoulder blades of cattle for the purposes of divination. In all cases, writing was an elite activity. Writing can be regarded as the product of elite efforts to establish and to preserve a single, authoritative record of events and to make the elite's voice stronger than the spoken words of its subjects.
6. The Rise and Fall of Archaic States
In the 1970s, archaeologists believed that states arose to manage environmental problems. Since states are powerful administrative institutions, state officials can function as highly effective problem solvers. They can construct large-scale irrigation networks, insure populations against crop failure, administer markets, organize warfare, and sponsor long-distance trade. Archaeological evidence confirms that archaic states did oversee some of these activities. However, the evidence for these activities usually dates to a few centuries after state formation; thus, the problem-solving capacity of states does not appear to account for their origins. More recently, archaeologists have begun to account for state origins by examining how individuals acquire social power. In one model, dynamic, ambitious leaders assemble coalitions of followers by sponsoring feasts and distributing gifts. Each leader with his followers then competes against other leaders and followers to establish dominance, first on the local, then on the regional, level. In these systems, personal prestige is emphasized, and dominance may be maintained through naked force. In a second model, leaders attract a broader following by appealing to ideologies of collective wellbeing. State formation is more ‘volunteeristic,’ resembling a religious movement, and there is broad-based participation in collective rituals and public monument-building. Leaders are more anonymous, and when force is used, it is used on behalf of society as a whole. It has been suggested that the flamboyant Maya and the Shang rulers might exemplify the outcome of the first (individualizing) strategy, whereas the Teotihuacan (Mexico) and Harappan states might be products of the second (corporate) strategy. Archaic states were complex, often fragile, social structures. Social dominance required balancing the needs and power of diverse social segments. Sudden shocks and long-term problems threatened the delicate balance. Over-population, excessive tax burdens, climatic deterioration, or invading barbarians all threatened to topple archaic states at one time or another. Egypt’s Old Kingdom collapsed when provincial officials became increasingly independent of the pharaoh’s rule; the Middle Kingdom fell because bureaucrats at the center absorbed too much of the state’s wealth. Competition among the independent Classic Maya kingdoms led the Maya to over-exploit their resources and, finally, to abandon their homes. The surprise is not that archaic states eventually collapsed, but that ruling elites could devise institutional structures that enabled states to endure, sometimes for centuries at a time.
See also: Agricultural Change Theory; Civilization, Concept and History of; Civilizational Analysis, History of; Civilizations; Intensification and Specialization, Archaeology of; Pastoralism in Anthropology; State, History of; Trade and Exchange, Archaeology of
Bibliography
Alcock S E, D’Altroy T N, Morrison K D, Sinopoli C M (eds.) 2001 Empires: Perspectives from Archaeology and History. Cambridge University Press, Cambridge, UK
Blanton R E, Feinman G M, Kowalewski S A, Peregrine P N 1996 A dual-processual theory for the evolution of Mesoamerican civilization. Current Anthropology 37: 1–14
Childe V G 1950 The urban revolution. Town Planning Review 21: 3–17
Claessen H J M, Skalník P (eds.) 1978 The Early State. Mouton, The Hague
DeMarrais E, Castillo L J, Earle T 1996 Ideology, materialization, and power strategies. Current Anthropology 37: 15–31
Engels F 1972 [1884] Origin of the Family, Private Property, and the State. Leacock E (ed.). International Publishers, New York
Feinman G M, Marcus J (eds.) 1998 Archaic States. School of American Research, Santa Fe, NM
Gilman A 1996 Comments on ‘Ideology, materialization, and power strategies.’ Current Anthropology 37: 56–7
Gledhill J, Bender B, Larsen M T (eds.) 1988 State and Society: The Emergence and Development of Social Hierarchy and Political Centralization. Unwin Hyman, London
Patterson T C 1997 Inventing Western Civilization. Monthly Review Press, New York
Richards J, Van Buren M (eds.) 2000 Order, Legitimacy and Wealth in Ancient States. Cambridge University Press, Cambridge, UK
Sanders W T, Webster D 1988 The Mesoamerican urban tradition. American Anthropologist 90: 521–46
Silverblatt I 1988 Women in States. Annual Review of Anthropology 17: 427–60
Stein G J 1998 Heterogeneity, power, and political economy: Some current research issues in the archaeology of Old World complex societies. Journal of Archaeological Research 6: 1–44
Yoffee N, Cowgill G L (eds.) 1988 The Collapse of Ancient States and Civilizations. University of Arizona Press, Tucson, AZ
E. M. Brumfiel
Statistical Analysis: Multilevel Methods
Multilevel, or hierarchical, data are blocked or clustered. Children within families; individuals within organizational units; longitudinal data on individuals; repeated measures on individuals; organizational units within larger organizations; multistage sample surveys; and time series of cross-sections exemplify multilevel or hierarchical data structures. When data are clustered, each observation counts for less than it would if the data points were independent. If clustering is ignored, statistical inference can be faulty. With clustering, regression methods that assume independent observations provide inefficient and possibly biased coefficient estimates, and associated confidence intervals that are too narrow.
In the context of regression modeling, any method that attempts to take account of clustering is multilevel. Several generalized regression methods are commonly used in the analysis of hierarchical data. Hierarchical or random coefficient models are multilevel. So too are fixed effects models, marginal models, classical single-level estimation followed by regression coefficient covariance matrix adjustment, and hybrid approaches. The choice of an approach for a particular research problem must depend on the goals of the research and assessment of the design and measurement deficiencies presented by a particular body of data, and is often consequential. Multilevel modeling approaches emphasize different aspects of the data, vary in the nature and stringency of their assumptions, and support different kinds of descriptions and interpretations. Poor choice of method can lead to the application of an incorrectly specified model, and thereby to good estimates of the wrong quantities.
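The statement that each clustered observation counts for less than an independent one can be made concrete with the classical design effect for equal-sized clusters, 1 + (m - 1)ρ. This is a standard survey-sampling approximation and is not a formula given in this article; the cluster size m, the intraclass correlation ρ, and the function names below are illustrative assumptions.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for a mean under equal-sized clusters (assumed setup)."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations carrying the same information."""
    return n / design_effect(cluster_size, icc)


# Hypothetical survey: 1,000 respondents in clusters of 10, intraclass correlation 0.1.
deff = design_effect(10, 0.1)
n_eff = effective_sample_size(1000, 10, 0.1)
print(deff, n_eff)
```

Under these assumed values the nominal sample behaves like a much smaller independent one, which is why confidence intervals that ignore clustering come out too narrow.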
1. Random Coefficient Models
The random coefficient model offers a compelling formulation that is consistent with the social scientific goal of understanding how units at one level affect, and are affected by, units at other levels. Consider a design with two levels, and suppose that there is a single level-1 regressor (X), a single level-2 regressor (G) that is by definition constant over all level-1 observations within a context, and that the dependent variable (Y) is Gaussian distributed conditional on X and G. The random coefficient model can be expressed as
$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + \varepsilon_{ij} \tag{1}$$
with
$$\begin{aligned} \beta_{0j} &= \eta_{00} + \eta_{01} G_j + \alpha_{0j} \\ \beta_{1j} &= \eta_{10} + \eta_{11} G_j + \alpha_{1j} \end{aligned} \tag{2}$$
where $j = 1, 2, \ldots, J$ denotes contexts (level-2 units), and $i = 1, 2, \ldots, n_j$ denotes individuals within contexts (level-1 units). The level-2 equations (Eqn. (2)) assert that the level-1 intercepts and slopes vary over contexts as linear functions of G. There is often justification for supposing that the within-context coefficients depend on contextual characteristics. It is appealing that the dependence includes random error components ($\alpha_{0j}$ and $\alpha_{1j}$).
Assumptions about the error structure of this two-level model are that, for each context, the level-1 errors follow the Gaussian distribution, are centered on zero,
have constant variance, and are independent. The error variances may vary over contexts. The level-2 errors are also Gaussian distributed, centered on zero, independent within equations, and have constant variances. The level-2 errors may be correlated across equations. It is further assumed that the level-1 errors are orthogonal to the level-2 errors, and that all error terms are orthogonal to X and G. The orthogonality assumptions are needed to secure consistent estimates of the η’s.
Many generalizations of this model are possible: coefficients can be constrained to be nonrandom; the dependent variable may be discrete, counted, truncated, ordinal, or time to event; there may be more regressors at any level as well as more than two levels; the nesting of levels may be cross-classified; different assumptions about the functional forms of the distributions of the errors can be made; and the assumption of independence of errors within level-2 (or higher) units can be relaxed. Nevertheless, the assumptions of orthogonality between regressors and errors, and of cross-level orthogonality between error components, remain important. Until that is no longer so, the credibility of an application of a random coefficient model rests on the plausibility of these assumptions, which should be thought of as inextricably linked to the systematic components of the model, and not as the fine print of a contract that is never read, much less enforced.
However compelling it may be on substantive grounds to specify the random coefficient multilevel model as Eqns. (1) and (2), substitution of Eqn. (2) into Eqn. (1) results in the equivalent mixed-model representation
$$Y_{ij} = \eta_{00} + \eta_{01} G_j + \eta_{10} X_{ij} + \eta_{11} X_{ij} G_j + (\alpha_{0j} + \alpha_{1j} X_{ij} + \varepsilon_{ij}) \tag{3}$$
which demonstrates that treating intercepts and slopes as dependent on G is the same as supposing that Y depends on X, G, and their cross-level (product) interaction.
Random coefficient models with Gaussian errors are commonly estimated by maximum likelihood or restricted maximum likelihood. For binary response the prevalent methods are marginal and penalized quasi-likelihood, although Bayesian methods based on Markov Chain Monte Carlo (MCMC) estimation are also beginning to be used. A consensus on preferred methods of estimation of multilevel binary response and other nonlinear (non-Gaussian) models has yet to emerge. Estimation of multilevel models is iterative and computationally intensive, more so for non-Gaussian cases. Especially for these cases, researchers are well advised to check numerical estimates across algorithms, methods, and software. Other problems, such as substantial design imbalance in combination with context size, also need to be evaluated for their potential impact on numerical estimates (see Estimation: Point and Interval; Hierarchical Models: Random and Fixed Effects; Meta-analysis: Overview; Markov Chain Monte Carlo Methods).
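The substitution of Eqn. (2) into Eqn. (1) can be written out with the intermediate line made explicit. The block below only restates the article's own equations; the labels under the braces are added for readability and are not the article's terminology.

```latex
\begin{align*}
Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + \varepsilon_{ij} \\
       &= (\eta_{00} + \eta_{01} G_j + \alpha_{0j})
          + (\eta_{10} + \eta_{11} G_j + \alpha_{1j})\, X_{ij} + \varepsilon_{ij} \\
       &= \underbrace{\eta_{00} + \eta_{01} G_j + \eta_{10} X_{ij} + \eta_{11} X_{ij} G_j}_{\text{systematic part}}
          + \underbrace{(\alpha_{0j} + \alpha_{1j} X_{ij} + \varepsilon_{ij})}_{\text{composite error}}
\end{align*}
```

Two observations in the same context share $\alpha_{0j}$ and $\alpha_{1j}$, so their composite errors are correlated, and because the composite error contains $\alpha_{1j} X_{ij}$ its variance changes with $X_{ij}$.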
2. Fixed Effects Models
The fixed effects model does not require the assumption of orthogonality between X and $\alpha_0$ inherent to the random coefficient model. It can be written as
$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \beta_2 X_{ij} G_j + \delta_j + u_{ij} \tag{4}$$
where the $\delta_j$ are between-context contrasts, and the $u_{ij}$ are assumed to be classical Gaussian errors. Relative to the random coefficient model, the additive term in G has been dropped, and the intercept and slope are no longer allowed to vary randomly over contexts, although each context is allowed its own intercept and the cross-level interaction is retained. The $\delta_j$ absorb the additive component of all possible G-variables, not just the finite number that can be included in Eqn. (3). Additive components for any level-1 regressors that do not vary within contexts are also absorbed by the $\delta_j$.
In the fixed effects approach, ‘fixing’ the between-context contrasts means that the contexts are themselves assumed to be fixed, in the sense that statistical inference is conditional on the particular selection of contexts present in the data. Because the contrasts can be defined as the coefficients of a set of indicator variables that assign individuals to contexts, there is no need to assume that context membership is orthogonal to X. This is a critical difference between the fixed effect and random coefficient models. If the assumption of orthogonality between X and $\alpha_0$ is implausible in the random coefficient model, and it can be when the list of G-variables is incomplete, then the fixed effects model provides an alternative that solves the problem.
In the Gaussian case, the fixed effects model is a conventional regression model. When J is large (and often when it is not), there may be little interest in describing estimated values of the $\delta_j$, in which case estimation of the other covariate coefficients can be performed after within-context centering of the variables.
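Within-context centering can be sketched in a few lines. The example is a toy under stated assumptions, not the article's data: a single regressor, no cross-level interaction, noiseless outcomes, and context intercepts that rise with the context mean of X. The function names and the data are hypothetical.

```python
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs (with intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def center_within(values, contexts):
    """Subtract each context's own mean from its observations."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, contexts):
        totals[c] += v
        counts[c] += 1
    return [v - totals[c] / counts[c] for v, c in zip(values, contexts)]


# Toy data: 3 contexts of 4 units; the true within-context slope is 2, and the
# context intercept delta_j = 10 * j rises with the context mean of x.
contexts, x, y = [], [], []
for j in range(3):
    delta_j = 10.0 * j
    for i in range(4):
        xi = 5.0 * j + i
        contexts.append(j)
        x.append(xi)
        y.append(delta_j + 2.0 * xi)

pooled_slope = ols_slope(x, y)  # ignores contexts, absorbs the delta_j into the slope
within_slope = ols_slope(center_within(x, contexts), center_within(y, contexts))
print(pooled_slope, within_slope)
```

Centering removes every additive context difference, measured or not, which is why the within-context slope is recovered here while the pooled slope is not. The same operation removes any regressor that is constant within contexts.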
For a binary response variable, conditional logistic regression can be used to estimate fixed effect models (Chamberlain 1980). In this case, the context contrasts are not estimated, although additive context differences are controlled. Software for fixed effects estimation is widely available. If it is desired to obtain estimates of the additive component of the contextual variables, then the fixed effects approach is not the method of choice. Also, in the binary response case, conditional logistic regression does not necessarily use all of the data, and in this sense may be inefficient (see Economic Panel Data; Linear Hypothesis: Regression (Basics); Multivariate Analysis: Discrete Variables (Logistic Regression)).
3. Coefficient Variance Estimates Adjusted for Clustering
The adjusted coefficient variance estimator in common use for multilevel analysis is a robust estimator. In actual applications, care must be taken to ensure that the robust estimator selected includes an extension of the algorithm for the general robust estimator that accommodates clustering. With the robust estimator, it is possible to specify a model whose systematic component is identical to that of Eqn. (3), and whose error component need not be independently and identically distributed, Gaussian distributed, or orthogonal to the regressors. The robust equivalent of Eqn. (3) can be expressed as:
$$Y_{ij} = b_0 + b_1 G_j + b_2 X_{ij} + b_3 X_{ij} G_j + e_{ij} \tag{5}$$
Least-squares computation can be used to produce coefficients whose estimated variances and covariances are subsequently adjusted for context membership. The coefficients will be unbiased estimates of the corresponding terms in Eqn. (3) if the orthogonality assumption is satisfied. The standard errors will be adjusted for clustering and other sources of heteroscedasticity regardless of whether this is so. When Eqn. (3) is true, confidence intervals for its coefficients obtained by prevailing methods will tend to be smaller than confidence intervals based on the robust clustering adjustment associated with Eqn. (5).
The robust approach is also an alternative to the fixed effects approach. The latter achieves orthogonality between the regressors and the error component by including context membership in the systematic component of the model, but is unable to include main effects of contextual variables even though it controls for all possible contextual variables. In the robust approach coefficients of contextual regressors are estimated, but all possible contextual variables are not controlled. Coefficient standard errors are adjusted for clustering, but the coefficient estimates are not efficient.
The robust, variance adjustment approach to multilevel analysis has been extended to models of discrete and counted outcomes, and to survival problems, and can be used where a fixed effects approach is not available. Nevertheless, where it can be justified, explicit incorporation of clustering into model specification is preferable to post hoc adjustments of estimators that ignore clustering (see Sample Surveys: The Field; Sample Surveys: Survey Design Issues and Strategies; Sample Surveys: Model-based Approaches).
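The two-step logic of this section (least-squares coefficients first, variance adjustment for context membership second) can be sketched for the simplest case of one regressor. This is a minimal illustration assuming the usual cluster "sandwich" form with no finite-sample correction; it is not the specific algorithm of any package, and the function name and toy data are hypothetical.

```python
import math
from collections import defaultdict


def ols_with_cluster_se(x, y, contexts):
    """Simple regression slope with classical and cluster-adjusted standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical standard error: assumes independent, equal-variance errors.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)

    # Cluster adjustment: sum (x - x_bar) * residual within each context first,
    # then square, so that within-context error correlation is retained.
    score = defaultdict(float)
    for d, e, c in zip(dx, resid, contexts):
        score[c] += d * e
    se_cluster = math.sqrt(sum(s * s for s in score.values())) / sxx
    return slope, se_classical, se_cluster


# Toy data: 6 contexts of 4 units, a contextual regressor, and one shared shock
# per context, so errors are perfectly correlated within contexts.
shocks = [1.0, -2.0, 1.0, 1.0, -2.0, 1.0]
x, y, contexts = [], [], []
for j, u in enumerate(shocks):
    for _ in range(4):
        x.append(float(j))
        y.append(1.0 + 0.5 * j + u)
        contexts.append(j)

slope, se_classical, se_cluster = ols_with_cluster_se(x, y, contexts)
print(slope, se_classical, se_cluster)
```

The slope is the same under both treatments; only its standard error changes, and in this toy case the cluster-adjusted value is the larger of the two.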
4. Marginal Models
When Y is a discrete or count response variable, it is possible to specify a multilevel model in which the within-context correlatedness of observations is incorporated as a side condition on the model, rather than as a consequence of a random intercept. For Y binary and $\omega_{ij}$, the logit of ‘success’ for the ijth observation, the multilevel dependence of Y on regressors can be written as
$$\omega_{ij} = \beta_0 + \beta_1 G_j + \beta_2 X_{ij} + \beta_3 X_{ij} G_j \tag{6}$$
with an additional specification indicating the pattern of correlation between observations within a context ($\mathrm{corr}(Y_{ij}, Y_{i'j})$, $i \neq i'$). Specifying that all pairs of observations within a context are equally correlated corresponds to the assumption of correlatedness induced by a random coefficient model, but virtually any pattern can be specified.
Models that characterize the marginal distribution of Y as a function of regressors are said to be marginal models. The model of Eqn. (6) and its side condition on correlated observations is a member of this class. The generalized linear model (McCullagh and Nelder 1989), a unifying theoretical framework for the linear model and discrete and count response variables, has been extended to allow for the analysis of multilevel data (Liang and Zeger 1986, Zeger and Liang 1986). This extension, known as the generalized estimating equations approach (GEE), provides a marginal model (population-average) alternative to random coefficient models when Y is discrete or a count. Equation (6) can be estimated using the GEE method.
Gaussian random effects models sustain a marginal model interpretation of regression coefficients. Random effects models for discrete and count response variables do not. Marginal models are appropriate when the analytic goal is to infer how Y depends on covariates for a population of level-1 units averaged over contexts. Random effects models are appropriate when within-context estimates are desired. For the logistic response case, the estimated coefficients of marginal models tend to be smaller in absolute value than the corresponding coefficients in random effects models (Diggle et al. 1994, p. 141).
Marginal models for Y discrete or a count can be estimated for a broad range of structures of potential within-context correlation; random coefficient models for discrete and count response variables currently cannot. Computation is fast (iterative generalized least squares), and estimation is relatively straightforward compared to binary response random coefficient models, although issues surrounding optimal characterization of within-context correlation and estimation of coefficient standard errors continue to attract attention.
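The difference between within-context and population-average coefficients in the logistic case can be checked numerically. The sketch below assumes a random-intercept logit with a Gaussian intercept and one binary regressor, and averages the within-context probabilities over the intercept distribution on a fixed grid. The parameter values and function names are hypothetical and are unrelated to the article's examples.

```python
import math


def expit(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def population_average_prob(eta: float, sigma: float, steps: int = 4001) -> float:
    """Average expit(eta + a) over a ~ N(0, sigma^2) using a fixed grid."""
    if sigma == 0.0:
        return expit(eta)
    lo, hi = -8.0, 8.0  # grid limits in standard-normal units
    h = (hi - lo) / (steps - 1)
    total = 0.0
    weight_sum = 0.0
    for k in range(steps):
        z = lo + k * h
        w = math.exp(-0.5 * z * z)  # unnormalized Gaussian weight
        total += w * expit(eta + sigma * z)
        weight_sum += w
    return total / weight_sum


# Assumed within-context model: logit = beta0 + beta1 * x + alpha_j, with
# alpha_j ~ N(0, sigma^2). All values are illustrative.
beta0, beta1, sigma = 0.0, 1.0, 2.0
p0 = population_average_prob(beta0, sigma)
p1 = population_average_prob(beta0 + beta1, sigma)
marginal_beta1 = logit(p1) - logit(p0)
print(beta1, marginal_beta1)
```

With a nonzero intercept variance the population-average log odds ratio is closer to zero than the within-context coefficient, and the two coincide when the variance is zero, in line with the statement cited from Diggle et al. (1994).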
5. Omitted Variables
Omitted variables in random coefficient models are of particular concern when they are endogenous, that is, correlated with Y and the included regressors, because they induce coefficient bias. The fixed effect approach provides one solution to this problem when the omitted variables are contextual. There are others: measurement of omitted variables, random assignment to contexts, and instrumental variables estimation.
Measuring the omitted variables is not an option in secondary analysis or in the midst of primary analysis: by then the data have been collected and returning to the field is generally infeasible. This should not, however, preclude future attempts to extend the range of measurement.
Random assignment to contexts presumes that an experimental design is feasible in the study of a particular phenomenon. Often it is not. Even when randomization is possible, it is not necessarily well suited to answering certain kinds of questions (see Experimental Design: Randomization and Social Experiments; Quasi-Experimental Designs; Experimental Design: Large-scale Social Experimentation).
Instrumental variables estimation is often used in an effort to solve an endogeneity problem. When an instrumental variables approach is used, aspects of the clustering problem may be sacrificed to solve what is perceived as a much more important one: selection into contexts. In Eqn. (3) replace X by T, suppose that the underlying data structure is quasi-experimental, and let $T_{ij} = 1$ if an individual is treated (0 otherwise). If Y is a health outcome and contexts are hospitals, and if assignment to treatment is not random, then ε potentially contains unmeasured variables correlated with T and also with Y. In quasi- and nonexperimental data, whether a patient receives a particular kind of treatment can be nonrandom with some causes known, others not. An instrumental variables approach to this problem attempts to find one or more covariates W correlated with T and Y, where Y and W controlling T are uncorrelated, and W and ε are uncorrelated. It is often difficult to find plausible instruments (W), although contextual variables are sometimes used. Thus, instrumental variables estimation cannot be employed with the automaticity of any of the other approaches to clustered data.
6. Examples
An example of a Gaussian application, and another of a binary response application, illustrate the kinds of differences in findings that can occur using alternative multilevel approaches. The data are from a one percent clustered sample of the 1990 Census of China, and are specific to Hainan province. Clustering is at the enumeration district level, and within clusters enumeration is 100 percent.
6.1 Gaussian Example
In the Gaussian example, there are 17,662 men in the data, distributed over 332 clusters, the smallest having 114 men and the largest having 3,735. The occupational socioeconomic status (SEI) of men ages 20–65 is a function of age and years of schooling at the individual level, type of place of residence (rural vs. urban), and the square of a contextualized measure of education. Type of place of residence is a characterization of place, and constant within contexts. Substantive theory and exploratory data analysis lead to this random effects model:
$$\begin{aligned} \mathrm{SEI} &= \beta_{0j} + \beta_{1j}\,\mathrm{Age} + \beta_{2j}\,\mathrm{Ed} + \varepsilon \\ \beta_{0j} &= \eta_{00} + \eta_{01}\,\mathrm{Rural} + \alpha_{0j} \\ \beta_{1j} &= \eta_{10} + \eta_{11}\,\mathrm{CED}^2 + \alpha_{1j} \\ \beta_{2j} &= \eta_{20} + \eta_{21}\,\mathrm{CED}^2 + \alpha_{2j} \end{aligned} \tag{7}$$
Table 1 Alternative multilevel estimates, occupational socioeconomic status of men
| Covariates | Ordinary least squares: Unadjusted/Robust/Cluster (1)/(2)/(3) | Fixed effect (4) | Random coefficient (5) |
| --- | --- | --- | --- |
| Age | 0.09 (7.78)/(7.92)/(1.32) | −0.01 (−0.91) | 0.02 (0.50) |
| Education | 0.34 (7.29)/(6.00)/(1.00) | 0.36 (6.96) | 0.43 (2.36) |
| Rural | −4.92 (−18.47)/(−13.50)/(−1.01) | | −0.72 (−0.35) |
| Age:CED² | 0.31 (13.16)/(13.03)/(1.72) | 0.66 (17.83) | 0.55 (5.75) |
| Ed:CED² | 2.67 (25.51)/(21.78)/(3.06) | 2.75 (20.81) | 2.38 (4.72) |
| Intercept | 15.50 (33.08)/(32.15)/(4.63) | 11.48 (29.42) | 12.89 (6.71) |
| R² | 0.45 | 0.57 | |
Source: One percent sample, males ages 20–65, Hainan Province, 1990 Census of China. N = 17,662. Quantities in parentheses are ratios of coefficients to standard errors. The coefficients of Regressions (1)–(3), estimated by ordinary least squares, are identical. The standard errors of (1) are unadjusted; those of (2) are adjusted, but without taking explicit account of clustering; those of (3) are adjusted for cluster membership. The context contrasts in the fixed effects regression (4) are not shown. The level-2 error variances in the random coefficient regression (5), not shown, are significant.
Table 1 reports coefficient estimates and ratios of coefficients to standard errors (‘t-ratios’) for several approaches. Regressions 1–3 are the same ordinary least squares (OLS) regression, differentiated to keep track of the type of standard error. When clustering is ignored, the t-ratios are far larger (regressions 1 and 2). Unstructured adjustment of standard errors makes little difference. Adjusting OLS standard errors for clustering radically reduces the t-ratios (regression 3). The fixed effects coefficients (regression 4) are roughly the same as the OLS coefficients, with larger t-ratios than obtained with the clustering adjustment. Only one term is ‘significant’ when the clustering adjustment is used, whereas three of the four estimable coefficients are highly ‘significant’ in the fixed effects model. The coefficients of the random effects model (regression 5) are comparable with those for the fixed effects model (where present), but the t-ratios are smaller. For those terms present in both the fixed effects and random
coefficient models, coefficients are either significant in both models, or not significant in both. In computations not displayed here, the random coefficient specification was computed twice within two different programs, once using maximum likelihood and once using restricted maximum likelihood. There was agreement across programs and methods. Although the basic conclusion is the same for the fixed effects and random coefficient models and differs somewhat from that provided by the clustering adjustment, the variation in coefficient magnitudes and t-ratios across the models is cautionary. Even in the Gaussian case, it can be misleading to estimate a model using only one approach.
6.2 Binary Response Example
In the binary response example, data from the 1990 Hainan sample are again used. Infants and toddlers born in the 18 months prior to the starting date of the Census enumeration are the level-1 units of observation, and enumeration districts are level-2 observations. The available data do not record child mortality, which has been inferred from maternal information elicited in the Census questionnaire. The response variable is whether an imputed death to a child born after 1988 occurred prior to July 1, 1990. There are 43 imputed deaths out of 2,572 mother–child pairs. A majority of enumeration districts have no imputed deaths. For this reason, only the within-context intercept is treated as random. For ω, the logit of a death within the first 18 months of life, the specification treats the probability of death as a function of maternal years of schooling, the sex of the child, and the interaction between maternal education and sex of child. The intercept is treated as a function of type of place of residence (the same for all residents of an enumeration district) and a random error:
$$\begin{aligned} \omega_{ij} &= \beta_{0j} + \beta_1\,\mathrm{Ed} + \beta_2\,\mathrm{Male} + \beta_3\,\mathrm{Ed{:}Male} \\ \beta_{0j} &= \eta_{00} + \eta_{01}\,\mathrm{Rural} + \alpha_{0j} \end{aligned} \tag{8}$$
Table 2 Binary response example, imputed infant mortality
| Variables | Classical cluster adj. (1) | GEE (2) | Fixed effects (3) | Random coefficients: PQL (4) | Random coefficients: MCMC (5) |
| --- | --- | --- | --- | --- | --- |
| Maternal Ed | −0.25 (−2.62) | −0.284 (−2.93) | −0.27 (−2.77) | −0.29 (−3.19) | −0.32 (−3.56) |
| Male | −0.96 (−1.26) | −1.18 (−1.95) | −1.25 (−2.13) | −1.20 (−2.07) | −1.43 (−2.55) |
| Ed:Male | 0.24 (1.73) | 0.29 (2.48) | 0.30 (2.63) | 0.29 (2.60) | 0.32 (2.84) |
| Rural | 2.07 (2.17) | 1.99 (1.48) | | 1.81 (2.03) | 1.38 (1.38) |
| Intercept | −4.74 (−4.71) | −4.74 (−3.50) | | −4.75 (−5.07) | −4.74 (−4.24) |
| Level-2 Var. | | | | 0.85 (1.78) | 2.05 (1.53) |
Source: One percent sample, infants to age 18 months paired with mothers, Hainan Province, 1990 Census of China. N = 2,572. Quantities in parentheses are ratios of coefficients to standard errors.
Table 2 presents a selection of regressions based on alternative multilevel approaches and algorithms. Regression 1 is the result of classical maximum likelihood estimation followed by an adjustment for clustering. Using this approach, neither the sex of the child nor the sex–education interaction is ‘significant.’ Using the GEE approach, the coefficient for sex is at the cusp of conventional significance and the sex–education interaction is significant (regression 2). The fixed effects results, using conditional logistic regression, indicate significance for all included terms (regression 3). Penalized quasi-likelihood estimates are provided by regression 4. These are consistent with the fixed effects results for those terms present across the fixed and random effects specifications. Regression 5 presents Bayesian results based on MCMC. The type of place of residence coefficient is smallest and not significant in the Bayesian regression. Application of a clustering adjustment to the GEE results (not shown) also substantially affects conclusions about statistical significance. It is not unusual to encounter sparse data. This binary response example is perhaps even more cautionary than the Gaussian example. Researchers are not yet routinely able to determine whether convergence is to a local or a global maximum, whether sufficient iterations have been calculated using MCMC, whether all of the assumptions required for a particular method are valid in a given instance, or whether software is performing as intended.
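Because Table 2 reports ratios of coefficients to standard errors rather than the standard errors themselves, the implied standard errors and approximate intervals have to be recovered by division. The sketch below does this for the coefficient on Male, using the values reported in Table 2. The 1.96 cutoff and the normal-approximation interval are assumptions added here, and applying them to the MCMC column is only a rough analogy.

```python
Z_CRIT = 1.96  # assumed two-sided 5 percent normal cutoff

# (coefficient, ratio of coefficient to standard error) for Male, from Table 2.
male_estimates = {
    "classical, cluster-adjusted": (-0.96, -1.26),
    "GEE": (-1.18, -1.95),
    "fixed effects": (-1.25, -2.13),
    "PQL": (-1.20, -2.07),
    "MCMC": (-1.43, -2.55),
}


def summarize(coef: float, ratio: float) -> dict:
    """Recover the standard error and a normal-approximation interval."""
    se = coef / ratio
    return {
        "se": se,
        "lower": coef - Z_CRIT * se,
        "upper": coef + Z_CRIT * se,
        "significant": abs(ratio) >= Z_CRIT,
    }


summary = {name: summarize(c, r) for name, (c, r) in male_estimates.items()}
for name, row in summary.items():
    print(f"{name}: se={row['se']:.3f}, "
          f"interval=({row['lower']:.2f}, {row['upper']:.2f}), "
          f"significant={row['significant']}")
```

The GEE ratio falls just short of the assumed cutoff, which matches the text's description of the sex coefficient as being at the cusp of conventional significance.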
7. Conclusions
Analysis of multilevel data structures using methods that ignore the clustering of observations in the data is unsatisfactory because it can lead to confidence intervals that are too narrow, and to coefficient bias. Random coefficient models, along with their specializations and generalizations, are attractive tools for the analysis of multilevel data. When there is selection into contexts, the random coefficient approach is subject to coefficient bias, as it is more generally when there are omitted regressors at any level. The fixed effects approach provides one kind of solution to the omitted variables problem. Instrumental variable estimation offers another. A posterior adjustment to accommodate clustering, applied to the estimated variances of coefficients, is robust but inefficient when the random effects model is correct. Although this adjustment does not require orthogonality between errors and covariates, robust standard errors for biased coefficients are far from universally desirable. For discrete and count responses, marginal models estimated using GEE provide an alternative to random effects models. Marginal models are attractive when there is little interest in within-context estimates, or when the pattern of within-context correlation is not simple. The choice of a multilevel modeling approach is ideally governed by research goals, the study design, and the nature of the data. There is also value in cross-checking results across methods and software.
8. Further Reading

For an introduction to random coefficient models (terminology often used interchangeably with random effects models and hierarchical models), see Snijders and Bosker (1999). The level of exposition in Bryk and Raudenbush (1992) ranges from introductory to advanced, as is also true for Goldstein (1995). The exposition in Longford (1993) is technically demanding. Littell et al. (1996) survey mixed models in general, and include material on hierarchical models. Gilks et al. (1998) and Gelman et al. (1995) are useful for Bayesian random coefficient models. Draper (2002) provides a detailed introduction to random coefficient models from the Bayesian perspective. Fixed effects modeling is treated in econometrics texts (e.g., Greene 2000), and is often discussed as one approach to the analysis of panel data (Baltagi 1995). For the binary response case, see Pendergast et al. (1996), who provide a review with broad scope. Robust adjustments of standard errors for clustering (e.g., Rogers 1993) are an extension of the methodology of Huber (1967) and White (1980, 1982). The GEE approach is well exposited in Diggle et al. (1994), who provide detailed comparisons with random coefficient and other models. In this same vein, Allison’s (1999) introduction should not be missed. Instrumental variables estimation is a standard subject in econometrics texts (e.g., Greene 2000). For an illuminating use of random coefficient models in the study of neighborhoods see Sampson et al. (1997). Geronimus and Korenman (1992) use a fixed effects approach in the study of teen pregnancy. Hamilton and Hamilton (1997) use a fixed effects approach in the analysis of surgery outcomes. For the same kind of problem, but using a different data structure, McClellan et al. (1994) employ an instrumental variables approach. Applications of marginal models are found largely in the epidemiological and biomedical literature. Multilevel analysis has long been mainstream in economics, and will soon become so in the rest of the social sciences.
9. Software

Dedicated random coefficient packages include HLM5 (Raudenbush et al. 2000) and MLwiN (Goldstein et al. 1998). HLM5 additionally offers GEE estimation; MLwiN includes Bayesian MCMC estimation. Most major statistical packages include facilities for all of the approaches discussed here, as integral components or contributed macros. Software for multilevel analysis is in continuous development. The multilevel models mail list is an essential portal for current information (multilevel@jiscmail.ac.uk), as is the website hosted by the Multilevel Models Project (http://www.ioe.ac.uk/multilevel/).
Bibliography

Allison P D 1999 Logistic Regression Using the SAS System. SAS Institute, Cary, NC
Baltagi B H 1995 Econometric Analysis of Panel Data. Wiley, New York
Bryk A S, Raudenbush S W 1992 Hierarchical Linear Models. Sage, Newbury Park, CA
Chamberlain G 1980 Analysis of covariance with qualitative data. Review of Economic Studies 47: 225–38
Diggle P J, Liang K-Y, Zeger S L 1994 Analysis of Longitudinal Data. Oxford University Press, Oxford, UK
Draper D 2002 Bayesian Hierarchical Modeling. Springer-Verlag, New York
Gelman A, Carlin J B, Stern H S, Rubin D B 1995 Bayesian Data Analysis. CRC Press, Boca Raton, FL
Geronimus A T, Korenman S 1992 The socioeconomic consequences of teen childbearing reconsidered. Quarterly Journal of Economics 107: 1187–214
Gilks W R, Richardson S, Spiegelhalter D J 1998 Markov Chain Monte Carlo in Practice. Chapman & Hall, London
Goldstein H 1995 Multilevel Statistical Models, 2nd edn. Halstead, New York
Goldstein H, Rasbash J, Plewis I, Draper D, Browne W, Yang M, Woodhouse G, Healy M 1998 A User’s Guide to MLwiN. Multilevel Models Project, Institute of Education, University of London, London
Greene W H 2000 Econometric Analysis, 4th edn. Prentice-Hall, Upper Saddle River, NJ
Hamilton B H, Hamilton V H 1997 Estimating surgical volume-outcome relationships applying survival models: Accounting for frailty and hospital fixed effects. Health Economics 6: 383–95
Huber P J 1967 The behavior of maximum likelihood estimators under nonstandard conditions. In: LeCam L M, Neyman J (eds.) Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, Berkeley, CA, Vol. 1, pp. 221–33
Liang K-Y, Zeger S L 1986 Longitudinal data analysis using generalized linear models. Biometrika 73: 13–22
Littell R C, Milliken G A, Stroup W W, Wolfinger R D 1996 SAS System for Mixed Models. SAS Institute, Cary, NC
Longford N T 1993 Random Coefficient Models. Oxford University Press, New York
McClellan M, McNeil B J, Newhouse J P 1994 Does more intensive treatment of acute myocardial infarction in the elderly reduce mortality? Journal of the American Medical Association 272: 859–66
McCullagh P, Nelder J A 1989 Generalized Linear Models. Chapman and Hall, New York
Pendergast J F, Gange S J, Newton M A, Lindstrom M J, Palta M, Fisher M R 1996 A survey of methods for analyzing clustered binary response data. International Statistical Review 64: 89–118
Raudenbush S, Bryk A, Cheong Y F, Congdon R 2000 HLM 5: Hierarchical Linear and Nonlinear Modeling. Scientific Software International, Lincolnwood, IL
Rogers W H 1993 Regression standard errors in clustered samples. Stata Technical Bulletin 13: 19–23
Sampson R J, Raudenbush S W, Earls F 1997 Neighborhoods and violent crime: A multilevel study of collective efficacy. Science 277: 918–24
Snijders T A B, Bosker R 1999 Multilevel Analysis. Sage Publications, London
White H 1980 A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–38
White H 1982 Maximum likelihood estimation of misspecified models. Econometrica 50: 1–25
Zeger S L, Liang K-Y 1986 Longitudinal data analysis for discrete and continuous outcomes. Biometrics 42: 121–30
W. M. Mason

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Statistical Analysis, Special Problems of: Grouped Observations

Often data are available to social and behavioral scientists only in grouped form, i.e., presented in tables or through display techniques such as frequency polygons or histograms. The data may be available only in grouped form because they are confidential, because access to the individual observations is too costly (e.g., data suitable for an international comparison might be found in tabular form in the United Nations Demographic Yearbook, whereas collecting the original data, even if they are available, would be costly and time consuming), or because the data sets are vast and grouping them economizes on data handling and computation. In the last two cases the researcher must weigh the cost of obtaining the original data against the loss of precision in estimating from grouped, relative to ungrouped, data.

The literature on grouped observations can be divided into three broad categories. The first is inference made from a sample of grouped observations about the ungrouped population. The second involves the relationship between moments calculated from observations before and after grouping, and the third is that of optimal grouping. Since the latter two were discussed quite intensively in Haitovsky (1983), most of the present entry is devoted to the first, in the cases which are of major interest to the social and behavioral scientist. They are:

(a) Estimation of linear models (or regression models) from observations cross-classified by all the explanatory variables in the model, in which only the cell means and frequencies are reported; or when the observations are differently grouped by subsets of the explanatory variables and only the corresponding marginal means and their frequencies are available to the researcher.

(b) The estimation of distribution curves when only mean values and frequencies are available. A case in point is the estimation of income or wealth distributions, when the individual data points are not available either because of confidentiality or because of their sheer volume. Historically, this has been the typical situation in the estimation of income or wealth distributions. A related curve is the so-called Lorenz Curve or Relative Concentration Curve, used for the representation of the size distribution of income or wealth and for inequality measures based on it, such as the Gini coefficient.

It is interesting to note that, in spite of a history of over a hundred years of analyzing the relationships between various statistics estimated from grouped observations and their ungrouped counterparts, and of making statistical inference from grouped observations about the parent, ungrouped, distribution, a rigorous theory about the legitimacy of the use of grouped observations for statistical inference was not developed until very recently. A unified theory of data coarsening, of which grouping is a special case, with the emphasis on coarsening at random (CAR), has just started to evolve in the last five years.
1. Definition and Notation

Data grouping is the process by which any variable $W$ with a given distribution function $F(w)$, continuous or discrete, is condensed into a discrete distribution function, i.e.,

$$p_i = \int_{c_{i-1}}^{c_i} dF(w), \qquad i = 1, \ldots, m \qquad (1)$$

where $W \in [c_0, c_m]$ is partitioned by $c_0 < c_1 < \cdots < c_m$ into $m$ disjoint and exhaustive groups, and the $c_i$'s are set in advance. The $c_i$'s are called interval limits or boundaries and $(c_{i-1}, c_i)$ is the $i$th interval or class. Thus, $p_i$ is the probability of an observation falling in the $i$th class. Hence, grouping is a transformation of one distribution function into a multinomial distribution function. The number of cases within the $i$th class is denoted by $n_i$, with $\sum_{i=1}^{m} n_i = n$, and is called the $i$th group frequency. The condensations are usually into the interval mid-point $k_i = (c_{i-1} + c_i)/2$ or into the interval mean or centroid $\bar{w}_i$. The latter is the conditional mean of $W$, given $W \in (c_{i-1}, c_i)$, i.e., $\bar{w}_i = \int_{c_{i-1}}^{c_i} w\,dF(w)/p_i$. If all the intervals are equally spaced, i.e., are of the same width $h = c_i - c_{i-1}$ for all $i$, we will refer to them as equidistant.

It is useful to represent the grouping by an $m \times n$ grouping matrix $G = N^{-1}A$, where $A$ is an $m \times n$ aggregation matrix with elements 0 or 1; the 1's in the $i$th row indicate the units to be aggregated into the $i$th class, and $N$ is an $m \times m$ diagonal matrix with the class frequencies along its main diagonal. Hence, $N^{-1}$ is an averaging matrix. Thus, the $i$th row of $A$ sums to $n_i$ ($\sum_i n_i = n$) due to the exhaustiveness of the classification, and each column sums to 1, due to the disjointedness of the classification. Note that, irrespective of how the observations are arranged,

$$GG' = \begin{pmatrix} 1/n_1 & 0 & \cdots & 0 \\ 0 & 1/n_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/n_m \end{pmatrix} \qquad (2)$$
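The grouping matrix can be made concrete with a small numerical sketch. The observations, the interval limits, and the helper name `grouping_matrix` below are hypothetical, and the classes are taken as closed on the left (an assumption; the text does not fix the treatment of boundary values). The sketch builds $A$ and $G = N^{-1}A$ and checks the diagonal form of $GG'$ in eqn. (2).

```python
import numpy as np

def grouping_matrix(values, limits):
    """Build the aggregation matrix A and grouping matrix G = N^{-1} A.

    `limits` holds c_0 < c_1 < ... < c_m; classes are treated as
    [c_{i-1}, c_i), which is an assumption of this sketch.
    Every class must contain at least one observation.
    """
    values = np.asarray(values, dtype=float)
    limits = np.asarray(limits, dtype=float)
    m, n = len(limits) - 1, len(values)
    # Class index (0..m-1) of each observation, using interior limits only.
    cls = np.digitize(values, limits[1:-1])
    A = np.zeros((m, n))
    A[cls, np.arange(n)] = 1.0          # one 1 per column
    counts = A.sum(axis=1)              # class frequencies n_i
    G = A / counts[:, None]             # N^{-1} A: each row averages its class
    return G, A, counts

# Hypothetical data: eight observations, three classes.
w = np.array([1.0, 2.5, 3.0, 4.2, 5.5, 7.1, 8.0, 9.9])
limits = np.array([0.0, 3.0, 6.0, 10.0])
G, A, counts = grouping_matrix(w, limits)
centroids = G @ w                       # class means (centroids)
GGt = G @ G.T                           # diagonal with entries 1/n_i
```

Applying `G` to any column of data returns the class centroids of that column, which is exactly the information a published table of class means carries.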
The transformation of ungrouped to grouped data is obtained by $\bar{W} = GW$. Cross-classifications can be presented in a similar manner: the two (or more) different groupings can be performed by different grouping matrices, say, $G_i$ ($i = 1, 2$), and the two matrices, which necessarily have the same number of columns but not necessarily the same number of rows, are stacked one upon the other to produce the piled matrix $G_p$, yielding

$$G_p G_p' = \begin{pmatrix} N_1^{-1} & N_1^{-1} Q N_2^{-1} \\ N_2^{-1} Q' N_1^{-1} & N_2^{-1} \end{pmatrix} \qquad (3)$$

where $N_i$ ($i = 1, 2$) are diagonal matrices with the group frequencies of the $i$th classification along their main diagonal and $Q$ is the matrix with the joint frequencies as entries. For a simple numerical example the reader is referred to Haitovsky (1973, pp. 3–5). A $k$-way classification will be represented by stacking $k$ matrices of type $G$.
2. The Linear (Regression) Model

Consider the linear (regression) model

$$Y = X\beta + \varepsilon \qquad (4)$$

where $Y$ is an $n \times 1$ vector of the original (ungrouped) observations on the dependent variable, $X$ is an $n \times p$ matrix of $n$ original observations on $p$ explanatory variables, $\varepsilon$ is an $n \times 1$ vector of disturbances, and $\beta$ is the $p \times 1$ vector of unknown parameters to be estimated. If instead only observations grouped into $m$ classes are available, the model can be transformed into:

$$\bar{Y} = \bar{X}\beta + \bar{\varepsilon}, \qquad (4a)$$

where $\bar{Y} = GY$, $\bar{X} = GX$ and $\bar{\varepsilon} = G\varepsilon$, and $G$ is a grouping matrix of either type $G$ or $G_p$ from above. Furthermore, if it is assumed that

$$E\{\varepsilon\} = 0 \quad \text{and} \quad E\{\varepsilon\varepsilon'\} = \sigma^2 I_n \qquad (5)$$

then by grouping the observations we group the corresponding disturbances, resulting in:

$$E\{\bar{\varepsilon}\} = G\,E\{\varepsilon\} = 0 \quad \text{and} \quad E\{\bar{\varepsilon}\bar{\varepsilon}'\} = G\,E\{\varepsilon\varepsilon'\}\,G' = \sigma^2 GG'; \qquad (5a)$$

that is, unless the grouping scheme is such that the same number of observations appear in each class of each classification, the transformation matrix destroys the homoscedastic (equal-variance) properties of the regression disturbances. In order to restore homoscedasticity, we use generalized least-squares estimation, resulting in

$$\bar{\beta} = \{\bar{X}'(GG')^{-1}\bar{X}\}^{-1}\bar{X}'(GG')^{-1}\bar{Y} \qquad (6)$$

$$V(\bar{\beta}) = \sigma^2\{\bar{X}'(GG')^{-1}\bar{X}\}^{-1} \qquad (7)$$

$$(\bar{Y} - \bar{X}\bar{\beta})'(GG')^{-1}(\bar{Y} - \bar{X}\bar{\beta})/(m - p) \qquad (8)$$

is an unbiased estimate of $\sigma^2$, and the coefficient of determination $R^2$, which serves as a measure of the goodness with which the model fits the data, is computed as

$$R^2 = \frac{\bar{\beta}'\bar{X}'(GG')^{-1}\bar{Y}}{\bar{Y}'(GG')^{-1}\bar{Y}} \qquad (9)$$

Finally, under the assumption of the normality of the disturbances $\varepsilon$, the analysis-of-variance table can be constructed as shown in Table 1. For testing the null hypothesis $\beta = 0$ one uses the central F-statistic, where

$$F_{(p,\,m-p)} = \frac{\bar{\beta}'\bar{X}'(GG')^{-1}\bar{Y}}{\bar{Y}'(GG')^{-1}\bar{Y} - \bar{\beta}'\bar{X}'(GG')^{-1}\bar{Y}} \cdot \frac{m - p}{p}, \qquad (10)$$

where the subscripts of $F$ denote the numbers of degrees of freedom of the F-distribution.

Table 1
Source of variation | Degrees of freedom | Sums of squares | E{sum of squares}
Total ungrouped | $n$ | $Y'Y$ |
Due to grouping | $n - m$ | $Y'(I - G'(GG')^{-1}G)Y$ |
Total after grouping | $m$ | $\bar{Y}'(GG')^{-1}\bar{Y}$ |
Due to grouped regression | $p$ | $\bar{\beta}'\bar{X}'(GG')^{-1}\bar{Y}$ | $\beta'\bar{X}'(GG')^{-1}\bar{X}\beta + p\sigma^2$
Residual after grouped regression | $m - p$ | $\bar{Y}'(GG')^{-1}\bar{Y} - \bar{\beta}'\bar{X}'(GG')^{-1}\bar{Y}$ | $(m - p)\sigma^2$

The intuitive explanation of the proposed procedure is that, in the absence of direct information on all individuals in the sample, the best one can do is to assign the class centroids to all individuals falling within that class. The weighting has the effect of repeating the values recorded in the tables as many times as the number of individuals comprising the category, without of course attributing more degrees of freedom in the calculation of the variance. That is, each sum, sum of squares, and sum of cross-products is weighted by the corresponding frequencies of the ungrouped data in each category. The typical elements in the cross-product matrices obtained by so doing will be $\sum_j n_j x_{ij} x_{sj}$ and $\sum_j n_j x_{ij} y_j$ ($j = 1, \ldots, m$; $i, s = 1, \ldots, p$).
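Eqns. (6) to (9) can be traced numerically. The following is a minimal sketch on simulated data: the data-generating values, the class limits, and all variable names are hypothetical choices made for illustration, not taken from the sources cited here. For a single classification $(GG')^{-1}$ is the diagonal matrix of class frequencies, so the generalized least-squares estimate coincides with least squares on the class centroids weighted by $n_j$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ungrouped data: y = 2 + 0.5 x + noise.
n = 200
x = rng.uniform(0.0, 10.0, n)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
p = X.shape[1]

# Group by x into m = 4 classes (limits chosen arbitrarily).
cls = np.digitize(x, [2.5, 5.0, 7.5])
m = 4
A = np.zeros((m, n))
A[cls, np.arange(n)] = 1.0
counts = A.sum(axis=1)
G = A / counts[:, None]

# Grouped data: class centroids of every variable in the model.
Xbar, Ybar = G @ X, G @ y

# Generalized least squares on the grouped model, eqns. (6)-(9).
V_inv = np.linalg.inv(G @ G.T)               # equals diag(n_1, ..., n_m)
XtVi = Xbar.T @ V_inv
beta = np.linalg.solve(XtVi @ Xbar, XtVi @ Ybar)        # eqn. (6)
resid = Ybar - Xbar @ beta
s2 = resid @ V_inv @ resid / (m - p)                    # eqn. (8)
cov_beta = s2 * np.linalg.inv(XtVi @ Xbar)              # eqn. (7), sigma^2 estimated
r2 = (beta @ XtVi @ Ybar) / (Ybar @ V_inv @ Ybar)       # eqn. (9)

# The same estimate from frequency-weighted least squares on the centroids.
sw = np.sqrt(counts)
beta_wls, *_ = np.linalg.lstsq(Xbar * sw[:, None], Ybar * sw, rcond=None)
```

Only `Xbar`, `Ybar`, and `counts` enter the estimation, which mirrors the situation of a researcher who has the published table but not the individual records.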
2.1 How to Group

There still remains the question: how should the original observations be grouped to assure estimates as close as possible to those obtained from the original ungrouped observations? The question was addressed by Cramer (1964) for the case of a model with one explanatory variable and generalized to the multivariate case by Haitovsky (1973, pp. 10–11), who showed that the observations should be cross-classified by all explanatory variables included in the model, and that each cell in the cross-classified table should include the sums (or centroids) of all the variables in the model, explanatory and dependent, falling within that cell, together with the number of observations making up the cell, namely, the joint frequencies. We refer to such grouping as ‘full cross-classification.’ Fully cross-classified data maximize the correlation between the grouped and the corresponding ungrouped estimates, or alternatively maximize their respective relative efficiencies. In this sense full cross-classification assures the closeness of the respective estimates. The reader is referred to Haitovsky (1973) for numerical examples that show how bad estimates not based on fully cross-classified data can get, even though they are unbiased.
2.1.1 Estimation with differently grouped observations. The problem is that tables with data cross-classified by more than two variables are rarely available, while models with more than two explanatory variables are very prevalent in the social and behavioral sciences. What is often available are one-way-classification tables classified by each one of the model explanatory variables, where all the variables in the model and their frequencies are reported in each column of the table. In other words, the observations are differently grouped by all explanatory variables in the model, constituting the marginal means and frequencies of the fully cross-classified tables. For this data setup Houthakker (reported by Haitovsky 1966) devised generalized-least-squares-type estimators of the form (6)–(10), except that here $\bar{X}$ and $\bar{Y}$ are the stacked $\bar{X}$ matrix and $\bar{Y}$ vector from the different one-way-classification tables, the number of degrees of freedom for the estimators of the model variance is the number of independent rows of the stacked $\bar{X}$ minus the number of estimated parameters required for the estimation of the model variance, and $GG'$ is replaced by $G_p G_p'$. (Note that care must be taken before inverting $G_p G_p'$ since it is a singular matrix: either the appropriate rows and the corresponding columns must be deleted (Haitovsky 1973, p. 15) or generalized inverses must be applied (see Farebrother 1979 and the correction by Baksalary 1982).)

An apparent shortcoming of Houthakker’s method is that there is no reason to believe that, if the full cross-classification of the necessary variables is not available, the corresponding joint frequencies will be available. However, there are quite a few good and relatively easy ways to construct the joint frequencies from the marginal frequencies. The way to estimate the joint frequencies from the marginal frequencies varies according to the researcher’s assumption about the nature of the cross-classification. Farebrother (1979) considers the case where the grouping by the explanatory variables is independent and therefore the joint frequencies can be approximated by $n_{ij} = n_{i\cdot}\,n_{\cdot j} / \sum_{i}^{m_1}\sum_{j}^{m_2} n_{ij}$, where $n_{i\cdot}$ ($i = 1, \ldots, m_1$) are the row sums and $n_{\cdot j}$ ($j = 1, \ldots, m_2$) are the column sums of the joint frequency matrix $N$ (the denominator being the total sample size $n$). Bewley (1989) showed, however, that under the independence-of-grouping assumption Farebrother’s method can be very closely mimicked by stacking the marginal $\bar{X}$ matrices and $\bar{Y}$ vectors from the different tables and then applying to them the method of weighted least squares, where the weights are the corresponding marginal frequencies (see also Haitovsky 1973, footnote p. 59). Thus there is no need to construct the joint frequencies.

Fukushige and Hatanaka (1991) suggested an improvement upon Farebrother’s method by using Goodman’s (1985) RC correlation method for estimating the joint frequencies from the marginals, which entails the calculation of correction factors based on the standardized scores of the rows and columns of the marginal frequencies and the correlation between them, applied to $n_{ij} = n_{i\cdot}\,n_{\cdot j} / \sum_i \sum_j n_{ij}$. Fukushige and Hatanaka (1991) also suggested a Bayesian estimation of the joint frequencies, considering them as being jointly generated by a multinomial density, subject to $\sum_i \sum_j n_{ij} = n$, with a Dirichlet density as a natural conjugate prior. They then went further and estimated the model parameters by the ordinary Bayes method as applied to the linear model, given the joint frequencies estimated either by the Bayesian or by Goodman’s method.
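The independence approximation of the joint frequencies is easy to illustrate. The sketch below uses hypothetical marginal frequencies (the numbers are invented for illustration) and forms $n_{i\cdot}\,n_{\cdot j}/n$ as an outer product; it does not implement the RC correlation or Bayesian refinements mentioned above.

```python
import numpy as np

# Hypothetical marginal frequencies of two one-way classifications
# of the same n = 100 observations.
row_freq = np.array([20.0, 30.0, 50.0])          # n_i. , i = 1..m1
col_freq = np.array([10.0, 40.0, 25.0, 25.0])    # n_.j , j = 1..m2
n = row_freq.sum()

# Joint frequencies under independence of the two groupings:
# n_ij is approximated by n_i. * n_.j / n.
joint = np.outer(row_freq, col_freq) / n

# The approximation reproduces both sets of marginals exactly.
row_check = joint.sum(axis=1)
col_check = joint.sum(axis=0)
```

The resulting matrix can stand in for the off-diagonal block of joint frequencies in $G_p G_p'$ when the true cross-tabulation was never published; how well it works depends on how close the two groupings really are to independence.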
2.1.2 Haitovsky’s methods. Estimators not based on Houthakker’s method are Haitovsky’s (1966, 1973, Chap. 4) unbiased estimators and extensions (1973, pp. 35–36), shown by Farebrother (1979) to be instrumental variables estimators (he also corrected an error in Haitovsky’s calculation of the variance of the estimator), and Haitovsky’s ‘Efficient Estimator’ (1968, 1973, Chap. 5), which also requires iterations.

Haitovsky’s (1973) unbiased estimators rest on the property that only estimates from observations grouped by all explanatory variables featuring in the model, whether cross-classified or differently classified, are credible. The method works as follows. Suppose that the researcher estimates only $\hat{\beta}_j$ ($j = 1, \ldots, p$) from the table grouped by the $j$th variable, using the least squares formula for the estimation of a simple model with one explanatory variable, namely, $\sum_i^{(j)} y_i x_{ji} / \sum_i^{(j)} x_{ji}^2$, where $x_{ji}$ and $y_i$ are measured as deviations from their respective weighted means, the superscripts designate the table used for the estimation, and the weighting by the marginal frequencies is subsumed under the summation sign. But since the appropriate model is given by eqn. (4a), upon substituting it in the last formula and replacing the $\beta_j$ by $\hat{\beta}_j$ one obtains

$$\frac{\sum_i^{(j)} y_i x_{ji}}{\sum_i^{(j)} x_{ji}^2} = \hat{\beta}_j + \sum_{l \neq j}^{p} \hat{\beta}_l \frac{\sum_i^{(j)} x_{ji} x_{li}}{\sum_i^{(j)} x_{ji}^2} \qquad (j = 1, \ldots, p) \qquad (11)$$

which is a system of $p$ equations linear in the $p$ unknown $\hat{\beta}_j$’s. Multiplying both sides by $\sum_i^{(j)} x_{ji}^2$ (assuming they are all strictly positive), one obtains a normal-equation-like system, in which the $j$th equation is calculated from the table classified by the $j$th variable. If there are as many one-way-classification tables as there are explanatory variables, where each one of the tables is classified by each of the explanatory variables, and in each of these tables all the explanatory variables appear as data (centroids), the $\hat{\beta}_j$’s can be derived, yielding unbiased, or conditionally unbiased, estimates depending on whether the $x$ variables are random or fixed. This method is the ‘unbiased method.’ However, if the dependent variable and only one explanatory variable are classified by the latter in a one-way-classification table, and in addition each one of the explanatory variables is classified by each of the others, and both the classified and classifying variables appear in all the tables, eqn. (11) can be solved to yield unbiased (or conditionally unbiased) estimates. The advantage of the latter option is that the different tables need not be grouped by the same number of classes, nor with the same interval limits.

As for Haitovsky’s ‘efficient method’ the reader is referred to Haitovsky (1968).

2.1.3 A numerical example. To complete this section, it will be instructive to present some numerical results. The data set originally used by Houthakker (see Haitovsky 1973, Appendix 1) has been extensively used to examine the performance of various methods by comparing them with the estimates obtained from the ungrouped or fully cross-classified data. As can be seen from Table 2, all the regression coefficients based on cross-classified data or on properly weighted data, classified in a one-way-classification table, are remarkably close to the ungrouped estimates, provided that there are tables classified by all explanatory variables. This is true even in cases where the information about the ungrouped observations is substantially reduced by grouping.

Table 2 Effect of different grouping and estimation methods on the estimates of a model with two explanatory variables using Houthakker’s data
Model | Intercept | Coefficient of $X_1$ | Coefficient of $X_2$ | $\hat{\sigma}^2$
Ungrouped data (n = 1218) | 17.10 (7.30)^a | 0.7578 (0.1398) | −0.1778 (0.0367) | 5250
Fully cross classified data (n = 56) | 16.47 (10.32) | 0.7473 (0.1203) | −0.1624 (0.0323) | 3914
Classified by $X_1$ (n = 7) | 10.86 (5.78) | 0.5505 (1.6139) | 0.0382 (1.8752) | 9027
Classified by $X_2$ (n = 8) | 73.74 (11.26) | −0.6532 (2.5391) | −0.0931 (0.1572) | 1345
Haitovsky’s ‘unbiased method’ (n = 15) | 18.0335 (6.6065)^b | 0.7271 (0.1033) | −0.1718 (0.0282) | 4337
Haitovsky’s ‘unbiased method,’ explanatory variables are differently grouped^c (n = 15) | 17.8659 | 0.7319 | −0.1723 |
Haitovsky’s ‘2 stage efficient method’ (n = 15) | 16.7025 (8.9025) | 0.7560 (0.2006) | −0.1725 (0.0187) |
Weighted regression applied to stacked data^d (n = 14) | 18.5065 (5.2765) | 0.7133 (0.1320) | −0.1698 (0.0355) | 4331
Estimation based on Houthakker’s method: | | | |
Joint frequencies are available (n = 14) | 18.08 (5.87) | 0.7263 (0.1259) | −0.1719 (0.0338) | 4285
Joint frequencies are available, using generalized inverse (n = 15) | 18.0745 (5.8691) | 0.7263 (0.1259) | −0.1719 (0.0338) | 4286
Joint frequencies are estimated by RC correlation method (n = 15) | 18.491 | 0.7147 | −0.1704 | 4322
Bayes estimates, joint frequencies are available^e | 18.542 | 0.7174 | −0.1728 | 5419
Bayes estimates, joint frequencies are estimated by RC correlation method^e | 18.581 | 0.7120 | −0.1700 | 5238

Source: Haitovsky (1973), lines 1–5, 7–9; Farebrother (1979), line 10; Fukushige & Hatanaka (1991), lines 11–13.
a Figures in parentheses are the estimated standard errors of the coefficients above them.
b The standard error is a correction suggested by Farebrother (1979).
c Here the tables of $Y$ and $X_1$ classified by $X_1$ and of $Y$ and $X_2$ classified by $X_2$ have 7 and 8 classes as in the previous entry in Table 1, but the $X$’s are cross classified in four classes in each of $X_1$ and $X_2$.
d The estimates are identical to those obtained by Houthakker’s method where the joint frequencies are constructed from the marginal frequencies under the assumption of independence of the cross classification. See Bewley (1989).
e The estimates reported here are based on 10 000 replicates and on the highest level of precision imposed on the hyperparameter in the Bayesian estimates. The differences between the extreme values for the different levels of precision are 4 %, 1 %, and 4 % for the estimates of the intercept, the coefficient of $X_1$, and the coefficient of $X_2$.
3. Estimation of Distribution Functions

Another important application of grouped observations is in the estimation of distribution functions. A case in point is the estimation of income or wealth distributions. Typically, confidentiality disallows the reporting of the original observations by, say, the US Internal Revenue Service. Instead only the grouped observations, including the class boundaries, the class frequencies, and often the centroids, are available. Another case is data measured on a discrete, rather than continuous, scale and grouped into classes. The latter often happens in the estimation of survival curves, where a subject is inspected for failure only at prespecified time intervals. Note that in this case the midpoints rather than the centroids are available, except for open-ended intervals; the latter are to be treated differently.

3.1 Estimation of Income or Wealth Distributions

Income and wealth distributions are often assumed to follow a parametric distribution, such as the Pareto or Log-Normal distribution, and the grouped data are applied in the estimation of the parameters that characterize these distributions. Alternatively, a nonparametric approach is taken, namely, the underlying data generation process is ignored and a distribution-free curve is fitted to the available income points. For the latter, we quote Mazzarino (1986) for an interesting experiment investigating the effect of the number of groups on the estimated curve. A natural starting point for the two approaches is to consider the group frequencies as a random sample from a multinomial distribution and write the likelihood function $L(\theta)$ of the vector of parameters $\theta$ as proportional to $\prod_{i=1}^{m} p_i^{n_i}$, $p_i$ being the probability of an observation falling in the $i$th group, as defined in Sect. 1 (see Kulldorff 1961). In addition to the maximum likelihood estimate of Pareto’s scale parameter, Aigner and Goldberger (1970) apply a regression framework to the estimation of the scale parameter, making use of the mean, variance, and covariance of the multinomial distribution and applying the least squares principle to the differences between the observed relative group frequencies $n_i/n$ and their corresponding probabilities $p_i$. Alternatively, least squares can be applied to the differences between the observed cumulative ‘greater than’ relative frequencies and the corresponding probabilities, or to the log cumulative frequencies relative to the corresponding log probabilities. Aigner and Goldberger (1970) found that, although the last of these is the easiest to compute, its asymptotic efficiency relative to the maximum likelihood and generalized least squares estimates, the two most (asymptotically) efficient estimators they considered, deteriorates fast as the scale parameter grows beyond 1.5.
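The multinomial likelihood for grouped data can be sketched for a one-parameter Pareto law. Everything specific below is an assumption made for illustration: the parameterization $F(w) = 1 - (w_0/w)^{\alpha}$ with a known lower limit $w_0$, the class limits, the use of exact expected frequencies as "data," and the crude grid search that stands in for a proper optimizer.

```python
import numpy as np

def pareto_cdf(w, alpha, w0):
    """Pareto distribution function F(w) = 1 - (w0 / w)**alpha for w >= w0."""
    return 1.0 - (w0 / np.asarray(w, dtype=float)) ** alpha

def class_probs(alpha, finite_limits, w0):
    """p_i = F(c_i) - F(c_{i-1}); the last class is open-ended, so F = 1 there."""
    cdf = np.append(pareto_cdf(finite_limits, alpha, w0), 1.0)
    return np.diff(cdf)

def grouped_loglik(alpha, finite_limits, freq, w0):
    """Log of prod_i p_i**n_i, up to an additive constant."""
    return float(np.sum(freq * np.log(class_probs(alpha, finite_limits, w0))))

# Hypothetical grouped income data: limits c_0..c_4, fifth class open-ended.
w0 = 1.0
finite_limits = np.array([1.0, 1.5, 2.0, 3.0, 5.0])
freq = 1000 * class_probs(2.0, finite_limits, w0)   # expected counts at alpha = 2

# Grid search over alpha.
alphas = np.linspace(0.5, 4.0, 351)
loglik = np.array([grouped_loglik(a, finite_limits, freq, w0) for a in alphas])
alpha_hat = alphas[np.argmax(loglik)]
```

Because the frequencies here are the exact expected counts under $\alpha = 2$, the grid maximum falls at (or next to) that value; with real tabulated frequencies the same function would simply be handed to a numerical optimizer.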
3.2 Estimation of Lorenz Curve

The Lorenz curve (also called relative concentration curve) is widely used for the representation and analysis of the size distribution of income (or wealth). It relates the cumulative income (wealth) fraction earned by various proportions of the population to the proportion receiving this income, when the units are arranged in increasing order of their income (wealth). Thus, if income is given by the probability density function $f(u)$, the proportion of units having income less than $w$ is

$$F(w) = \int_0^w f(u)\,du$$

while their mean income is

$$\bar{w} = \int_0^w u f(u)\,du,$$

the latter often being called the incomplete first moment. Given that $\mu$, the average of all the units receiving income, exists, $L(w) = \bar{w}/\mu$ is the cumulative relative income earned by those whose income is less than $w$. The Lorenz curve is obtained by relating $L(w)$ to $F(w)$ and plotting its values in a unit square with $L(w)$ as the ordinate and $F(w)$ as the abscissa, or as was defined by Lorenz:

plot along one axis the cumulated percent of the population from the poorest to the richest, and along the other axis the percent of the total wealth held by these percents of the populations. (1905)

For grouped data we define here

$$\hat{p}_i = \sum_{j=1}^{i} n_j \Big/ \sum_{j=1}^{m} n_j,$$

the relative population frequencies of the first $i$ groups, and, with $\bar{w}_j$ the average income within group $j$, $L(w)$ is estimated by

$$\hat{L}(\hat{p}_i) = \sum_{j=1}^{i} n_j \bar{w}_j \Big/ \sum_{j=1}^{m} n_j \bar{w}_j$$

and the fraction of the population earning less than $c_i$ is given by $\hat{F}^{-1}(\hat{p}_i)$. For estimating the Lorenz curve, Kakwani and Podder (1973, 1976) assume income to follow Pareto’s law, from which the Lorenz curve is easily derived, and then follow Aigner and Goldberger’s (1970) framework to estimate the Lorenz curve. The formula derived from Pareto’s law suggests two extensions (Kakwani and Podder 1973) which seem to fit the data better. Another extension was proposed by Ortega et al. (1991), who also list a few more functional forms previously suggested for the estimation of Lorenz curves. Another interesting formulation, which takes account of some mathematical restrictions on the Lorenz curve and hence improves its estimation, was proposed by Kakwani and Podder (1976). A non-parametric approach is suggested by Gastwirth and Glauberman (1976), who use Hermite interpolation for smoothing the curve. An alternative approach to distribution-free methods was suggested by Brown and Mazzanino (1984). Still another approach was taken by Gastwirth (1975), Gastwirth and Krieger (1975), and Krieger (1979). They put lower and upper bounds on the Lorenz curve and on its moments.

A bivariate version of the Lorenz curve has been applied to measure the progressivity of taxation (Suits 1977) and to the design of welfare dominating tax policies (Yitzhaki and Slemrod 1991). Another application is in the field of finance, where the second degree of stochastic dominance criterion (Levy 1992) is applied to ordering risky prospects in accordance with maximization of expected utility, when the shape of the utility function is not known (Yitzhaki 1982), and for the evaluation of rising prospects (Shalit and Yitzhaki 1994).
3.3 Estimation of Inequality Measures Based on Lorenz Curves Perfect equality is represented by the 45° line in the Lorenz curve unit box, connecting the origin with the opposite corner, hence called the egalitarian line. The larger the area between that line and the Lorenz curve, the less egalitarian is the society represented by the latter.
Figure 1 Lorenz curve and the estimation of Gini coefficient
There are numerous ways to calculate the Gini coefficient, but most of them are not suitable for its calculation from grouped data because they ignore the intra-class inequality and thus bias the measure downward (see Lerman and Yitzhaki 1989). A simple way to estimate the Gini coefficient non-parametrically is by
$$G = 1 - 2\sum_{i=1}^{m} \Delta p_i Q_i \qquad (12)$$
where
$$Q_1 = L_p(p_1)/2, \qquad Q_i = L_p(p_{i-1}) + \frac{n_i \bar{w}_i}{2\sum_{j=1}^{m} n_j \bar{w}_j}, \quad i = 2, \ldots, m,$$
$$\Delta p_1 = p_1, \qquad \Delta p_i = p_i - p_{i-1}, \quad i = 2, \ldots, m,$$
and $\sum_{i=1}^{m} \Delta p_i Q_i$ is the area of the rectangles and triangles in Fig. 1. Note that $\sum_{i=1}^{m} \Delta p_i Q_i$ overestimates the area to the right of the Lorenz curve in the typical strictly convex Lorenz curve and hence G underestimates the Gini coefficient, but this bias decreases rapidly as the number of groups increases. The Gini coefficient, as estimated by G, lends itself to various types of decompositions, such as decomposition by income source or by population class, e.g., urban/rural division, where both can be further decomposed by, say, regions, resulting in a two-tier decomposition. Another possible decomposition is by intra-class, inter-class, and overlapped components. The latter arises from the fact that poor people in a high-income class may be worse off than rich people in a low-income class (Yao 1999). Bounds on the Gini coefficient were proposed by Gastwirth (1975) and by Gastwirth and Krieger (1975). Parametric estimation can be derived from the parametric estimation of
Lorenz curves by the method of moments or maximum likelihood (Slottje 1990). Other parametric estimators were proposed by Kakwani and Podder (1976) and by Kakwani (1976) to overcome the downward bias inherent in most derivations of the Gini coefficient from grouped observations.
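The non-parametric estimate of Eqn. (12) can be sketched in a few lines of Python. This is only an illustration: the function name `grouped_gini` and the example class counts and class means are made up here, and the classes are assumed to be supplied in ascending order of income.

```python
def grouped_gini(counts, means):
    """Estimate the Gini coefficient from grouped data as in Eqn. (12).

    counts[i] is n_i, the number of units in income class i, and
    means[i] is the mean income of that class. Classes are assumed
    to be ordered from lowest to highest income.
    """
    total_n = sum(counts)
    total_income = sum(n * w for n, w in zip(counts, means))
    area = 0.0         # running value of sum(delta_p_i * Q_i)
    lorenz_prev = 0.0  # L_p(p_{i-1}), with L_p(p_0) = 0
    for n, w in zip(counts, means):
        delta_p = n / total_n              # population share of the class
        share = n * w / total_income       # income share of the class
        q = lorenz_prev + share / 2.0      # Q_i: mid-height of the trapezoid
        area += delta_p * q
        lorenz_prev += share
    return 1.0 - 2.0 * area


# Hypothetical grouped data: three income classes.
example = grouped_gini([50, 30, 20], [10, 30, 80])
```

For these invented figures the function returns 13/30, about 0.433, and it returns 0 when all class means are equal. Because the chords lie above a convex Lorenz curve, the value should be read as a lower-biased estimate, as noted in the text.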
4. Coarsening at Random A question that has only recently been investigated concerns the effect of the coarsening mechanism (grouping, in the context of this article) on statistical inference about the ungrouped distribution from grouped observations. Put differently: when can the grouping mechanism be ignored in such inference? Gill et al. (1997) and Gill and Robins (1997) investigated this question in terms of the conditional distribution of the grouped observations given the corresponding ungrouped observations. This question is of particular interest in cases of statistical inference when the above-mentioned conditional distribution is being used, e.g., in the calculation of score functions, in using the EM algorithm (the E-step), or in going through the Gibbs sampler. The intuition in the grouped observations case is straightforward: if we know, or can safely assume, that the grouping mechanism is independent of the ungrouped observations, the grouped observations can be legitimately used for statistical inference without the explicit introduction of the grouping mechanism into the analysis. To put it more formally, does the marginal distribution of the grouped observations, indexed by the parameters that characterize the ungrouped distribution and the particular mechanism applied in the grouping process respectively (i.e., the joint likelihood for these two parameters) factor into the conditional distribution of the grouped observations, given the ungrouped observations, and the distribution of the ungrouped observations, indexed by its parameter? If such factorization is possible, the grouping mechanism can be ignored in making the inference from the grouped to the ungrouped distribution.
In the grouped observations case, if the grouping is done by intervals set in advance and independent of the ungrouped observations and the statistics used in the inference are calculated from the observations grouped this way, the above factorization is possible and the statistics calculated from the grouped observations can be legitimately used for the desired inference. The good news is that indeed most grouping mechanisms obey this rule.
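For grouping into intervals fixed in advance, the factorization can be written out explicitly. The following is a sketch in notation that is not the article's own: $x$ is an ungrouped observation with density $f(x;\theta)$ and distribution function $F(x;\theta)$, $y$ is the interval in which it is reported, $g$ is the conditional distribution of $y$ given $x$, $\gamma$ indexes the grouping mechanism, and $c_0 < c_1 < \cdots < c_m$ are assumed interval boundaries.

```latex
% Joint likelihood of (\theta, \gamma) from one grouped observation y
L(\theta, \gamma; y) = \int g(y \mid x; \gamma)\, f(x; \theta)\, dx .

% Fixed intervals I_k = (c_{k-1}, c_k]: g(y = I_k \mid x) = 1\{x \in I_k\},
% which involves no unknown \gamma and is constant in x within I_k, so
L(\theta; y = I_k) = \int_{I_k} f(x; \theta)\, dx
                   = F(c_k; \theta) - F(c_{k-1}; \theta) .

% With n_k observations falling in interval k, k = 1, \dots, m,
L(\theta) \propto \prod_{k=1}^{m}
    \bigl[ F(c_k; \theta) - F(c_{k-1}; \theta) \bigr]^{n_k} .
```

Under these assumptions the grouping mechanism contributes nothing that depends on $\theta$, which is the sense in which it can be ignored.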
5. Conclusion Although it is evident from the discussion above that methods are available for drawing inference from grouped observations with little loss of precision, the
cost of storing, handling and analyzing large data sets has gone down with the fast growing speed and capacity of computers and the parallel development of the necessary software for its utilization. It seems that inference from grouped observations will be mainly needed for cases in which access to the individual data is too costly or impossible, e.g., because of data confidentiality, and therefore the use of grouped observations will be cost efficient.
Bibliography
Aigner D J, Goldberger A C 1970 Estimation of Pareto’s law from grouped observations. Journal of the American Statistical Association 65: 712–23
Baksalary J K 1982 An invariance property of Farebrother’s procedure for estimation with aggregated data. Journal of Econometrics 22: 317–22
Bewley R 1989 Weighted least squares estimation from grouped observations. Review of Economics and Statistics 71: 187–8
Brown J A C, Mazzarino G 1984 Drawing the Lorenz curve and calculating the Gini concentration index from grouped data by computer. Oxford Bulletin of Economics and Statistics 46: 273–78
Cramer J S 1964 Efficient grouping, regression and correlation in Engel curve analysis. Journal of the American Statistical Association 59: 233–50
Farebrother R W 1979 Estimation with aggregated data. Journal of Econometrics 10: 43–55
Fukushige M, Hatanaka M 1991 Estimation of a regression model on two or more sets of differently grouped data. Journal of Econometrics 47: 207–26
Gastwirth J L 1975 The estimation of a family of measures of economic inequality. Journal of Econometrics 3: 61–70
Gastwirth J L, Glauberman M 1976 The interpolation of the Lorenz curve and Gini index from grouped data. Econometrica 40: 479–83
Gastwirth J L, Krieger A M 1975 On bounding moments from grouped data. Journal of the American Statistical Association 70: 468–71
Gill R D, van der Laan M J, Robins J M 1997 Coarsening at random: Characterizations, conjectures, counter-examples. In: Lin D Y, Fleming T R (eds.) Proceedings of the First Seattle Symposium in Biostatistics: Survival Analysis. Springer-Verlag, pp. 255–94
Gill R D, Robins J M 1997 Sequential models for coarsening and missingness. In: Lin D Y, Fleming T R (eds.) Proceedings of the First Seattle Symposium in Biostatistics: Survival Analysis. Springer-Verlag, pp. 295–305
Goodman L A 1985 The analysis of cross-classified data having ordered and/or unordered categories: Association models, correlation models, and symmetry models for contingency tables with or without missing entries. Annals of Statistics 13: 10–69
Haitovsky Y 1966 Unbiased multiple regression coefficients estimated from one-way-classification tables when the cross classifications are unknown. Journal of the American Statistical Association 61: 720–8
Haitovsky Y 1968 Regression analysis of grouped observations when the cross classifications are unknown. Review of Economic Studies 35: 77–89
Haitovsky Y 1973 Regression Estimation from Grouped Observations. Griffin, London
Haitovsky Y 1983 Grouped data. In: Johnson N L, Kotz S, Read C B (eds.) Encyclopedia of Statistical Sciences. Wiley, New York, Vol. 3, pp. 527–37
Heitjan D F, Rubin D B 1991 Ignorability and coarse data. Annals of Statistics 19: 2244–53
Kakwani N 1976 On the estimation of income inequality measures from grouped observations. The Review of Economic Studies 43: 483–93
Kakwani N C, Podder N 1973 On the estimation of Lorenz curves from grouped observations. International Economic Review 14: 278–91
Kakwani N C, Podder N 1976 Efficient estimation of the Lorenz curve and associated inequality measures from grouped observations. Econometrica 44: 137–48
Krieger A M 1979 Bounding moments, the Gini index and Lorenz curve from grouped data for unimodal density functions. Journal of the American Statistical Association 74: 375–8
Kulldorff G 1961 Contributions to the Theory of Estimation from Grouped and Partially Grouped Samples. Almquist and Wiksell, Stockholm, Sweden
Lerman R, Yitzhaki S 1989 Improving the accuracy of estimates of Gini coefficient. Journal of Econometrics 42: 43–7
Levy H 1992 Stochastic dominance and expected utility: Survey and analysis. Management Science 83: 555–93
Lorenz M O 1905 Method for measuring of wealth. Journal of the American Statistical Association 9, New series No. 70, 209–19
Mazzarino G 1986 Fitting of distribution curves to grouped data. Oxford Bulletin of Economics and Statistics 48: 189–200
Ortega P, Martin G, Fernandez A, Ladoux M, Garcia A 1991 A new functional form for estimating Lorenz curves. Review of Income and Wealth 37: 447–52
Shalit H, Yitzhaki S 1994 Marginal conditional stochastic dominance. Management Science 40: 670–84
Slottje D J 1990 Using grouped data for constructing inequality indices. Economics Letters 32: 193–7
Suits D B 1977 Measurement of tax progressivity. American Economic Review 67: 747–52
Yao S 1999 On the decomposition of Gini coefficients by population class and income source: A spreadsheet approach and application. Applied Economics 31: 1249–64
Yitzhaki S 1982 Stochastic dominance, mean variance, and Gini’s mean difference. American Economic Review 72: 178–85
Yitzhaki S, Slemrod J 1991 Welfare dominance: An application to commodity taxation. American Economic Review 81: 480–96
Y. Haitovsky
Statistical Analysis, Special Problems of: Outliers
Anyone analyzing data may expect sometimes to encounter one or more observed values which appear to be inconsistent or out of line with the rest of the data. Such a value is called an outlying observation or outlier. Judged subjectively, it looks discrepant or surprising. There are two main ways of dealing with outliers, discordancy testing (Sects. 2, 3) and accommodation (Sect. 4). In simpler types of data set, the presence of an outlier is evident by inspection. In more complex data situations such as designed experiments, an outlying observation may disrupt the expected pattern of results, though its presence is not obvious by inspection and a statistical procedure is needed to detect it (Sect. 5).
1. Two Simple Examples
1.1 A Disputed Paternity Case In 1921 Mrs Kathleen Gaskill was sued for divorce in the British courts by her husband, a serving soldier, because she had given birth to a child 331 days after he had left the country to serve abroad. He claimed that the child could not be his, since 331 days was not credible as a duration of pregnancy. She denied adultery and claimed, with medical support, that the length of pregnancy, though unusual, was genuine. Viewed as a duration of pregnancy, 331 days is an outlier. In the legal case (which the court decided in favor of the wife), the only evidence of the alleged adultery was the abnormal length of this duration. This exemplifies the situation where the judgment of an outlier depends purely on its position in relation to a main data set. Table 1 gives the durations of pregnancy for 1669 women who gave birth in a maternity hospital in the north of England in 1954 (summarized from Newell 1964). Clearly 331 days, exceeded certainly by 1 case and possibly by 2 out of 1669 in this data set, cannot be regarded as an incredible duration.
1.2 Predicting College Grades Figure 1 shows a scatter plot of points (x, y) for a small sample of 22 college students; x is the student’s score in an aptitude test, used as a predictor of college performance, and y is the student’s freshman year grade-point average. There are two outlying points A $(x_A, y_A)$ and B $(x_B, y_B)$; the desired predictive relationship of y from x could be modeled satisfactorily by a linear regression omitting these two points. Should they be included in the relationship or not? This is discussed in Sect. 5. See Linear Hypothesis: Regression (Basics).
Table 1 Durations of pregnancy for 1669 women
Duration (weeks)   Numbers
11–20                    4
21–25                   10
26–30                   23
31–35                  113
36–38                  333
39–41                  962
42                     144
43                      53
44                      12
45                      10
46                       3
47                       1
56                       1
Figure 1 Scatter plot of grade-point average y vs. predictive test score x for 22 college students
2. Outliers in Univariate Samples The nature of outliers and the range of issues in dealing with them are first discussed in the simple situation of a univariate sample of n observations $x_1, x_2, \ldots, x_n$ of a variable X. Denote the observations arranged in ascending order by $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$, so that $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$. If, say, the upper extreme observation $x_{(n)}$ appears (to an analyst or investigator examining the data) to be not only higher than the rest of the sample but surprisingly high, it is declared an outlier. This is a subjective judgment. Is the outlier credible as a value of X? This depends on the form the investigator is assuming for the distribution of X, e.g., whether it is, say, normal or on the other hand a skew distribution such as exponential in which a high value $x_{(n)}$ is less surprising. Call this assumed model for the distribution of X the basic model and denote it by F; $x_{(n)}$ is an outlier relative to model F (e.g., normal), but may not be an outlier relative to a different model (e.g., exponential). Table 2, with dotplot Fig. 2, gives the scores by 30 children in a reading accuracy test. There is an upper outlier $x_{(30)} = 62$ and possibly a second one $x_{(29)} = 53$. The variable X is here the reading accuracy test score obtained by a child in a population of children of the relevant age group and educational characteristics. A reasonable working assumption is that its distribution is normal; this is supported by a plot of ordered values against normal order scores, which is roughly linear.
2.1 Causes of Outliers
No statistical issue arises if there is a tangible explanation for the presence of an outlier, e.g., if in the above example the score was misrecorded, or component subscores were wrongly totaled, or if the child tested was known to be unrepresentative of the population under study. The outlying value would simply be removed from the data set, or corrected where appropriate, e.g., in the case of wrong totaling. Otherwise, a statistical judgment is required for interpreting the outlier and deciding how to deal with it, based on the apparent discrepancy between it and the pattern the investigator has in mind for the data—the basic model F. This judgment requires a test of the null hypothesis or working hypothesis H that all n observations belong to F. On this hypothesis, the upper outlier x(n) is a value from the distribution of a variable X(n), the highest observation in a random sample of size n from F. If H is rejected by a statistical test at some level (e.g., 5 percent), x(n) is judged to be not credible (at the level of the test) as the highest value in a sample of size n from F. It is then said to be a discordant outlier (relative to the basic model F); the test of H is a discordancy test. (See Hypothesis Testing in Statistics; Significance, Tests of. See also Sect. 2.3.) In a Bayesian approach, there is no direct analogue of a test of discordancy. Its place is taken by procedures designed to obtain a posterior probability assessment of the ‘degree of contamination’ of the data. This means the degree to which the data contain observations (‘contaminants’) generated not by F but by some other model G; a basic model is used which contains additional parameters relating to a possible G (see Bayesian Statistics).
Table 2 Reading accuracy test scores (in ascending order) for 30 children
 2   4   5   8   9  12  12  12  12  14
16  16  18  19  20  21  23  23  23  26
27  27  28  30  34  38  40  42  53  62
Figure 2 Dotplot of the 30 test scores in Table 2
2.2 Treatment of Outliers: Retention, Identification, Rejection, Incorporation If the result of the test is to accept H and find $x_{(n)}$ nondiscordant, all the data can be taken to come from F, including the outlier. Although it has been thought to look somewhat unusual, it is consistent with having arisen purely by chance under the basic model and can be retained in the data set. When on the other hand the outlier is adjudged discordant, there are three possibilities. If the focus of interest is the outlier itself (as in the disputed paternity case), it is identified as a positive item of information, inviting interpretation or further examination; it might reveal, for instance, a novel teaching method causing an exceptional test score. The other two possibilities, rejection and incorporation, arise when the investigator’s concern is with the main data set, and the aim of the research is to study characteristics of the underlying population, e.g., to estimate the mean of a distribution of examination scores. The outlier is only of concern insofar as it affects these inferences about the underlying population. It is not of interest in itself; for example, the students giving rise to points A and B in Fig. 1 are effectively unknown! Either the model F is firmly adopted without challenge, and estimates or tests of the population parameters are to be based on F; or there is the option of changing the model. If the model is not under question and inferences are to be based on it, the outlier, being adjudged inconsistent with F, is a nuisance item and should be rejected. Historically, outlying values have been studied since the mid eighteenth century, often in the context of astronomical observations. Until the 1870s rejection was often the only procedure envisaged for dealing with outliers, as typified by such publications as Peirce (1852) ‘Criterion for the rejection of doubtful observations’ and Stone (1868) ‘On the rejection of discordant observations.’ The fact that an outlier is omitted for purposes of analyzing a data set does not necessarily mean that it is erroneous or undesirable. Consider, for example, Fig. 3.
The ordinates are annual figures for the population of the UK, in millions, published in each year’s Annual Abstract of Statistics. Those for 1985 through 1990 are estimates based on the 1981 census, while the outlier at 1991, the true value from the 1991 census, is the only genuine data point!
Figure 3 Some annual figures for the UK population
The alternative possibility, when the outlier is adjudged discordant relative to F, is that all the observations, including the outlier, belong to the same distribution, but this distribution is not satisfactorily represented by the initial basic model F. The analysis then needs to start afresh with a more appropriate choice of model F* (e.g., a skew distribution instead of a symmetrical F), in relation to which $x_{(n)}$ is not discordant. It was an outlier relative to F, but is not an outlier relative to F*, which satisfactorily models the entire data set. It has been incorporated in the data set in a non-discordant fashion, and the outlier problem disappears.
2.3 Other Meanings of the Word ‘Outlier’ In this entry, an observation which appears inconsistent with the main data set is called an outlier; this is a subjective judgment. If a statistical test provides evidence that it cannot reasonably be regarded as consistent with the main data set, it is a discordant outlier. It is important to bear in mind, however, that some writers use the word ‘outlier’ for a discordant outlier, and refer to an observation which looks anomalous but may or may not be discordant as a ‘suspicious’ or ‘suspect’ value. Note also the different meaning given to the word ‘outlier’ in graphical displays in exploratory data analysis (EDA), such as boxplots. With $Q_1$ and $Q_3$ denoting (essentially) the lower and upper quartiles in the sample, observations greater than $Q_3 + k(Q_3 - Q_1)$ or less than $Q_1 - k(Q_3 - Q_1)$ are flagged as outliers. These values are sometimes outliers and sometimes not. With the typical value of 1.5 for k, a normal sample of size 100 has more than 50 percent chance of containing one or more of these ‘outliers’! (See Exploratory Data Analysis: Univariate Methods.)
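The boxplot rule is easy to try on the reading scores of Table 2. The sketch below uses Python's standard `statistics` module. The function names are illustrative, the quartile convention is the module's default (one of several in use, so borderline values may be flagged differently by other software), and the probability calculation is an approximation that places the fences at the population quartiles of a normal distribution rather than at the sample quartiles.

```python
from statistics import NormalDist, quantiles


def boxplot_flags(values, k=1.5):
    """Return observations beyond Q1 - k(Q3 - Q1) or Q3 + k(Q3 - Q1)."""
    q1, _, q3 = quantiles(values, n=4)  # default ('exclusive') convention
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < lower or v > upper]


def prob_any_flag_normal(n, k=1.5):
    """Approximate chance that a normal sample of size n has a flagged value.

    Assumption: fences are taken at the population quartiles of N(0, 1).
    """
    z = NormalDist()
    q3 = z.inv_cdf(0.75)
    fence = q3 + k * (2 * q3)            # population IQR of N(0, 1) is 2 * q3
    p_single = 2 * (1 - z.cdf(fence))    # both tails
    return 1 - (1 - p_single) ** n


# Reading accuracy scores from Table 2.
scores = [2, 4, 5, 8, 9, 12, 12, 12, 12, 14,
          16, 16, 18, 19, 20, 21, 23, 23, 23, 26,
          27, 27, 28, 30, 34, 38, 40, 42, 53, 62]

flagged = boxplot_flags(scores)
p100 = prob_any_flag_normal(100)
```

With this quartile convention only the score 62 is flagged, since 53 falls just inside the upper fence of 53.25. The approximate probability for n = 100 comes out slightly above 0.5, in line with the statement in the text.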
3. Multiple Outliers: Block and Consecutive Tests Any particular discordancy test (taking univariate samples as context) is specified by three items: the particular outlier configuration (e.g., a lower outlier; an upper and lower outlier pair), the distribution of the data on the working hypothesis (e.g., normal with unknown mean and variance), and the test statistic. Taking Table 2 for illustration, there is a discordancy test for $x_{(30)} = 62$, in a normal sample, based on its difference from the sample mean $\bar{x} = 22.53$ in relation to the sample standard deviation $s = 14.13$, with test statistic $T = (x_{(n)} - \bar{x})/s = (62 - 22.53)/14.13 = 2.79$. For n = 30 the tabulated 5 percent point of T on hypothesis H is 2.74; 62 is declared discordant (just) at the 5 percent level. The observation $x_{(29)} = 53$, which is also high, can now be tested for discordancy in relation to the other 28 values; this is a consecutive test procedure for 2 outliers. The value of T is $(53 - 21.17)/12.22 = 2.61$, $P \approx 8$ percent; the evidence of discordancy is weak. Alternatively, 62 and 53 could be considered together as an outlying pair in relation to the other 28 observations. This can be tested, in a block test procedure for 2 outliers, by the statistic $(x_{(n-1)} + x_{(n)} - 2\bar{x})/s$, which here $= (53 + 62 - 2 \times 22.53)/14.13 = 4.95$, to be compared with the tabulated 1 percent point 4.92. So, on this block test, there is quite strong evidence that (62, 53) are a discordant pair.
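The three statistics quoted above can be reproduced directly from the Table 2 scores. The following sketch uses only the standard library; the function names are illustrative, and the critical values (2.74 and 4.92) still have to be looked up in published tables.

```python
from statistics import mean, stdev


def upper_t(sample):
    """T = (x_(n) - mean) / s for the largest observation in the sample."""
    return (max(sample) - mean(sample)) / stdev(sample)


def upper_pair_block(sample):
    """(x_(n-1) + x_(n) - 2 * mean) / s for the two largest observations."""
    ordered = sorted(sample)
    return (ordered[-2] + ordered[-1] - 2 * mean(sample)) / stdev(sample)


# Reading accuracy scores from Table 2.
scores = [2, 4, 5, 8, 9, 12, 12, 12, 12, 14,
          16, 16, 18, 19, 20, 21, 23, 23, 23, 26,
          27, 27, 28, 30, 34, 38, 40, 42, 53, 62]

t_62 = upper_t(scores)               # test of 62 in the full sample
t_53 = upper_t(sorted(scores)[:-1])  # consecutive step: 62 removed, test 53
block = upper_pair_block(scores)     # block test of the pair (53, 62)
```

Rounded to two decimals these give 2.79, 2.61, and 4.95, matching the values in the text.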
3.1 Masking and Swamping If $x_{(29)}$ were nearer to $x_{(28)} = 42$, the T-value for $x_{(30)}$ would be higher than 2.79, and the evidence for the discordancy of $x_{(30)}$ would be stronger. This is called the masking effect. A discordancy test of the most extreme observation is rendered insensitive by the nearness of the next most extreme observation; the second observation has masked the first. Conversely, a block test could be insensitive if the second observation $x_{(n-1)}$ is close to $x_{(n-2)}$ and not outlying. The pair $x_{(n)}, x_{(n-1)}$ might not reach discordancy level when tested jointly, even though $x_{(n)}$ on its own is discordant in relation to the other n − 1 observations. This is called the swamping effect of $x_{(n-1)}$. If $x_{(n-1)}$ is unduly high, there is the risk of masking; if it is unduly low, there is the risk of swamping.
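The masking effect can be seen numerically by altering the Table 2 data. In the sketch below the second-highest score is moved from 53 down to 43, just above 42. This altered sample is hypothetical and serves only to show the direction of the change in T.

```python
from statistics import mean, stdev


def upper_t(sample):
    """T = (x_(n) - mean) / s for the largest observation in the sample."""
    return (max(sample) - mean(sample)) / stdev(sample)


# Reading accuracy scores from Table 2.
scores = [2, 4, 5, 8, 9, 12, 12, 12, 12, 14,
          16, 16, 18, 19, 20, 21, 23, 23, 23, 26,
          27, 27, 28, 30, 34, 38, 40, 42, 53, 62]

# Hypothetical variant: the second-highest score lowered from 53 to 43.
variant = [43 if v == 53 else v for v in scores]

t_actual = upper_t(scores)    # about 2.79, as in the text
t_variant = upper_t(variant)  # larger, since 53 no longer inflates mean and s
```

In the altered sample T for the score 62 rises to about 2.95, so the observed 53 is partly masking the discordancy of 62.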
4. Accommodation of Outliers When the aim is to make inferences about a population of values of a variable X by analyzing a sample of observations of X, there is an alternative to detecting outliers and testing them for discordancy. This is to use procedures (testing or estimation) which are relatively unaffected by possible outliers, by giving extreme values less weight than the rest of the observations, whether they are discordant or not. Such a procedure is called robust against the presence of outliers (see Robustness in Statistics). There is a great range of procedures available in the literature. The approach is well illustrated by the following simple methods for robust estimation of the mean and variance of a univariate sample $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$.
4.1 Trimming The mean is calculated after one or more extreme values have been omitted from the sample. Omission of the r (lowest) and the s (highest) values gives the (r, s)-fold trimmed mean $(x_{(r+1)} + x_{(r+2)} + \cdots + x_{(n-s)})/(n - r - s)$.
4.2 Winsorizing Instead of omitting the r (lowest) values $x_{(1)}, \ldots, x_{(r)}$, they are each replaced by the nearest value which is retained unchanged, i.e., $x_{(r+1)}$; similarly for the s (highest) values. The mean of the resulting modified sample, still of size n, is the (r, s)-fold Winsorized mean $((r + 1)x_{(r+1)} + x_{(r+2)} + \cdots + x_{(n-s-1)} + (s + 1)x_{(n-s)})/n$. The choice of r and s depends on various factors such as the form of the distribution of X (e.g., symmetrical or asymmetrical) and the level of robustness required. The greatest possible amount of trimming or Winsorizing, with r = s, gives the sample median. Robust estimates of variance can be based in various ways on the deviations $x - m$, where m is a trimmed or Winsorized mean, instead of the usual $x - \bar{x}$. For an extensive account of accommodation issues and procedures, see Barnett and Lewis (1994).
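Both estimators are short to write down in code. The sketch below follows the two formulas above; the function names are illustrative, and the choice r = 0, s = 2 for the Table 2 scores is made here only as an example.

```python
def trimmed_mean(sample, r, s):
    """(r, s)-fold trimmed mean: drop the r lowest and s highest values."""
    ordered = sorted(sample)
    n = len(ordered)
    if r < 0 or s < 0 or r + s >= n:
        raise ValueError("need r, s >= 0 and r + s < n")
    kept = ordered[r:n - s]
    return sum(kept) / len(kept)


def winsorized_mean(sample, r, s):
    """(r, s)-fold Winsorized mean: replace the r lowest values by x_(r+1)
    and the s highest by x_(n-s), then average all n values."""
    ordered = sorted(sample)
    n = len(ordered)
    if r < 0 or s < 0 or r + s >= n:
        raise ValueError("need r, s >= 0 and r + s < n")
    kept = ordered[r:n - s]
    modified = [kept[0]] * r + kept + [kept[-1]] * s
    return sum(modified) / n


# Reading accuracy scores from Table 2.
scores = [2, 4, 5, 8, 9, 12, 12, 12, 12, 14,
          16, 16, 18, 19, 20, 21, 23, 23, 23, 26,
          27, 27, 28, 30, 34, 38, 40, 42, 53, 62]

trimmed = trimmed_mean(scores, 0, 2)        # 53 and 62 omitted
winsorized = winsorized_mean(scores, 0, 2)  # 53 and 62 replaced by 42
median_like = trimmed_mean(scores, 14, 14)  # maximal trimming with r = s
```

For these data the ordinary mean is 22.53, the (0, 2)-fold trimmed mean is about 20.04, and the (0, 2)-fold Winsorized mean is 21.5. Trimming 14 values from each end leaves the two middle scores and returns 20.5, the sample median.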
5. Outliers in More Complex Data Situations In a regression of a variable y on a single regressor variable x, an outlying point is apparent by inspection, as instanced by Fig. 1. For a regression on 2 or more regressor variables $x_1, x_2, \ldots$, this is no longer the case. The data may nonetheless contain an outlying point (or more than one), i.e., a point (x, y) which departs to a surprising degree from the overall pattern of the data. Its observed response value y deviates ‘surprisingly’ from the fitted value from the regression equation; in other words, it has an outlying residual. Detection of an outlying point requires essentially a statistical examination of residuals. To illustrate this, consider the regression data in Fig. 1, with two outlying points A and B. Denote the fitted regression line CD by $y = a + bx$ ($a = -1.528$, $b = 0.003528$), with residual mean square $s^2 = 0.148$. The residual for the outlying point A is $y_A - (a + b x_A) = 1.4 - 2.447 = -1.047$, and its estimated standard deviation is
$$s\sqrt{1 - \frac{1}{n} - \frac{(x_A - \bar{x})^2}{\sum (x - \bar{x})^2}} = 0.371.$$
A discordancy test statistic for the point A is its studentized residual (the residual divided by its estimated standard deviation), viz. $-1.047/0.371 = -2.82$; this is just significant at the 5 percent level. On a similar test the outlying point B is not discordant at all. The most outlying point in the regression (in this case, A) is the one with the largest studentized residual, ignoring sign. See Linear Hypothesis: Regression (Basics). An equivalent criterion to the studentized residual for testing discordancy is the proportional reduction in the residual sum of squares when the outlier is omitted, or, equivalently again, the increase in the maximized log-likelihood. The test then reflects directly how much the fit of the regression model to the data is improved on omission of the outlier. This approach, the deletion method, obviously extends to the detection and testing of multiple outliers, with the choice between block and consecutive deletion (see Likelihood in Statistics). It is also the appropriate approach to outliers in multivariate data; see Multivariate Analysis: Overview. Similar considerations apply to the detection and testing of outliers in designed experiments and outliers in contingency tables, with use of residual-based or deletion procedures. See Experimental Design: Overview. Accommodation procedures are also available in the literature. Outliers also occur in time series (see Time Series: General). For details of methods for dealing with outliers in the various data situations mentioned above, see Barnett and Lewis (1994). Some work has been done on outliers in sample surveys, but the methodology for outliers in this vital field urgently needs developing. See Sample Surveys: Methods; Sample Surveys: Model-based Approaches; Sample Surveys: The Field. In multilevel, or hierarchical, data the concept of an outlier becomes more complex, and much work needs to be done. See Hierarchical Models: Random and Fixed Effects; Statistical Analysis: Multilevel Methods; Lewis and Langford (2001).
See also Langford and Lewis (1998), from which the following passage is quoted: In a simple educational example, we may have data on examination results in a two-level structure with students nested within schools, and either students or schools may be considered as being outliers at their respective levels in the model. Suppose … that … a particular school is found under test to be a discordant outlier; we … need to ascertain whether it is discordant owing to a systematic difference affecting all the students measured within that school or because one or two students are responsible for the discrepancy. At student level, an individual may be outlying with respect to the overall relationships found across all schools or be unusual only in the context of his or her particular school. … [Again,] masking [see Sect. 3.1] can apply across the levels of a model. The outlying nature of a school may be masked because of another similarly outlying school or by the presence of a small number of students within the school whose influence
brings the overall relationships within that school closer to the average for all schools.
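Returning to the single-regressor case at the start of this section, the studentized residuals can be computed as in the following sketch. The data of Fig. 1 are not tabulated in this entry, so the example uses made-up data with one deliberately displaced point. The function name is illustrative, and the residual mean square is assumed to use the divisor n − 2, the usual choice for a straight-line fit.

```python
from math import sqrt


def studentized_residuals(x, y):
    """Residuals of a least-squares line, each divided by its estimated
    standard deviation s * sqrt(1 - 1/n - (x_i - xbar)^2 / Sxx)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - 2)  # assumed divisor n - 2
    result = []
    for xi, e in zip(x, residuals):
        sd = sqrt(s2 * (1 - 1 / n - (xi - x_bar) ** 2 / sxx))
        result.append(e / sd)
    return result


# Made-up data: an exact line y = 2x with one point pushed upward.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.0 * xi for xi in x]
y[3] += 5.0

t = studentized_residuals(x, y)
most_outlying = max(range(len(x)), key=lambda i: abs(t[i]))
```

Here `most_outlying` picks out the displaced point (index 3), which is the point with the largest studentized residual ignoring sign. Judging whether that value is discordant still requires the appropriate tabulated critical value.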
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
6. Conclusion ‘The concern over outliers is old and undoubtedly dates back to the first attempt to base conclusions on a set of statistical data. … The literature on outliers is vast … ’ (Beckman and Cook 1983). As this article has shown, outliers occur commonly in data and the problem of how to deal with them in any particular case is an unavoidable one. A massive armory of principle, model, and method has built up over the years. For a review of the relevant literature, see the extensive coverage of outlier methodology by Barnett and Lewis (1994), whose book, which includes some 970 references, still remains the only comprehensive treatment of the subject. A valuable survey of many outlier problems is also given in Beckman and Cook (1983).
Bibliography
Barnett V, Lewis T 1994 Outliers in Statistical Data, 3rd edn. Wiley, Chichester, UK
Beckman R J, Cook R D 1983 Outlier …… s (with discussion). Technometrics 25: 119–63
Chatterjee S, Hadi A S 1988 Sensitivity Analysis in Linear Regression. Wiley, New York
Cook R D, Weisberg S 1982 Residuals and Influence in Regression. Chapman and Hall, London
Hawkins D M 1980 Identification of Outliers. Chapman and Hall, London
Hoaglin D C, Mosteller F, Tukey J W 1983 Understanding Robust and Exploratory Data Analysis. Wiley, New York
Langford I H, Lewis T 1998 Outliers in multilevel data (with discussion). Journal of the Royal Statistical Society A 161: 121–60
Lewis T, Langford I H 2001 Outliers, robustness and the detection of discrepant data. In: Leyland A H, Goldstein H (eds.) Multilevel Modelling of Health Statistics. Wiley, Chichester, UK
McKee J M 1994 Multiple Regression and Causal Analysis. F E Peacock, Itasca, IL
Newell D J 1964 Statistical aspects of the demand for maternity beds. Journal of the Royal Statistical Society A 127: 1–33
Peirce B 1852 Criterion for the rejection of doubtful observations. Astronomical Journal 2: 161–3
Rousseeuw P J, Leroy A M 1987 Robust Regression and Outlier Detection. Wiley, New York
Rousseeuw P J, van Zomeren B C 1990 Unmasking multivariate outliers and leverage points (with discussion). Journal of the American Statistical Association 85: 633–51
Stone E J 1868 On the rejection of discordant observations. Monthly Notices of the Royal Astronomical Society 28: 165–8
von Eye A, Schuster C 1998 Regression Analysis for Social Sciences. Academic Press, San Diego, CA
T. Lewis
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Statistical Analysis, Special Problems of: Transformations of Data
Data transformations such as replacing a variable by its logarithm or by its square-root are often used to simplify the structure of the data so that they follow a convenient statistical model or are more suitable for graphing. Transformations have one or more objectives, including: (a) inducing a simple systematic relationship between a response and predictor variables in regression; (b) stabilizing a variance, that is, inducing a constant variance in a group of populations or in the residuals after a regression analysis; and (c) inducing a particular type of distribution, e.g., a normal or symmetric distribution, or removing extreme skewness so that the data are more appropriate for plotting. The second and third goals are concerned with simplifying the ‘error structure’ or random component of the data. The need for transformation can be seen in Fig. 1, showing assets and sales for a sample of 79 ‘Forbes 500’ companies. The data were taken from the Data and Story Library at Carnegie Mellon University’s StatLib (http://lib.stat.cmu.edu/DASL/). Panel (a) shows that both assets and sales are highly right skewed and that the scatter of sales, conditional on the value of assets, is greater for larger values of assets. Because of this extreme right skewness, most of the data are crowded into the lower left-hand corner of the plot and it is difficult to ascertain what type of regression model is appropriate for these data. The normal probability plot in panel (b) also shows the right skewness of sales. Panel (c) is a normal plot of the residuals from a linear regression of sales on assets. Notice that the extreme residuals are much larger in absolute value than the corresponding normal quantiles. These outliers are due to the nonconstant conditional variance of sales, given assets; the outliers are from the observations with large values of assets. The skewness and heteroscedasticity in this example are quite typical of economic data. Figure 2 uses the same raw data as Fig. 1, but in Fig. 2 both assets and sales are log transformed. Notice in panel (a) that log(sales) appear linearly related to
(a)
(b)
6 5 Residual
Sales
4 3 2 1 0 0
2
4
6
Assets
2
(c) 2
0.99 0.98 0.95 0.90 0.75
4 Sales
× 104
× 104 (d)
1.5 Residual
Normal quantiles
× 104
0
× 104
0.50 0.25 0.10 0.05 0.02 0.01
1 0.5
–1
0 Residuals
1
2
0
0
2
4 Assets
× 104
6 × 104
Figure 1 Assets and sales of a sample of Forbes 500 companies: (a) plot of sales and assets; (b) normal probability plot of sales; (c) normal probability plot of residuals from a linear fit of assets to sales; (d) plot of absolute residuals vs. assets with a scatterplot smooth
15007
Statistical Analysis, Special Problems of: Transformations of Data (a)
(b)
11
Normal quantiles
log (sales)
10 9 8 7 6 5
4
6
8 log (assets)
10
0.99 0.98 0.95 0.90 0.75 0.50 0.25 0.10 0.05 0.02 0.01
12
6
7
10
(d) 2
0.99 0.98 0.95 0.90 0.75
1.5 Residual
Normal quantiles
(c)
8 9 log (sales)
0.50 0.25 0.10 0.05 0.02 0.01
1 0.5
–2
–1
0 Residuals
1
2
0 4
6
8
10
12
log (assets)
Figure 2 Log(assets) and log(sales) of a sample of Forbes 500 companies. (a) plot of log(sales) and log(assets); (b) normal probability plot of log(sales); (c) normal probability plot of residuals from a linear fit of log(assets) to log(sales); (d) plot of absolute residuals versus log(assets) with a scatterplot smooth
log(assets). The normal probability plots in panels (b) and (c) suggest that both log(sales) and the residuals from the regression of log(sales) against log(assets) are approximately normally distributed. Panel (d) shows that these residuals are far less heteroscedastic than those from the untransformed data. Data transformations were developed at a time when there were far fewer tools available to the applied statistician than now. Many problems that in the past would have been tackled by data transformation plus a linear model are now solved using more sophisticated techniques; for example, generalized linear models, nonlinear models, smoothing, and variance function models. Nonetheless, model simplification is still important, and data transformation is still an effective and widely used tool.
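The reduction in skewness described for the Forbes data can be illustrated on synthetic data. The following is a minimal sketch, not the analysis behind Figs. 1 and 2: the sample is drawn from a lognormal distribution with hypothetical parameters, and `sample_skewness` is an illustrative helper based on the third central moment.

```python
import math
import random

def sample_skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)
# Hypothetical right-skewed "sales" figures (lognormal draws, not the Forbes data).
sales = [rng.lognormvariate(8.0, 1.0) for _ in range(2000)]
log_sales = [math.log(s) for s in sales]

skew_raw = sample_skewness(sales)
skew_log = sample_skewness(log_sales)
print(f"skewness before log: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

For data of this kind the skewness of the raw values is strongly positive, while the skewness of the logged values should be close to zero.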
1. Transformations to Simplify the Systematic Model
In regression, if the expected value of the response $y$ is related to predictors $x_1, \ldots, x_J$ in a complex way, often a transformation of $y$ will simplify this relationship by inducing linearities or removing interactions. For example, suppose that
$$y = \beta_0 x_1^{\beta_1} \cdots x_J^{\beta_J} + \varepsilon \tag{1}$$
where $\varepsilon$ is a small random error and $y$ and $x_1, \ldots, x_J$ are all positive. The Taylor series approximation $\log(\mu + \varepsilon) \approx \log(\mu) + \varepsilon/\mu$ is appropriate for small values of $\varepsilon$. Using this approximation with $\mu = \beta_0 x_1^{\beta_1} \cdots x_J^{\beta_J}$, one finds that $\log y$ follows approximately the linear model
$$\log y = \beta_0^* + \beta_1 x_1^* + \cdots + \beta_J x_J^* + (\beta_0 x_1^{\beta_1} \cdots x_J^{\beta_J})^{-1} \varepsilon \tag{2}$$
where $\beta_0^* = \log(\beta_0)$ and $x_j^* = \log(x_j)$. Model (1) is nonlinear in each $x_j$ and has large and complex interactions. In contrast, model (2) is linear and additive in the transformed predictors.
However, transformations should not be used blindly to linearize a systematic relationship. If the errors in Eqn. (1) are homoscedastic, then the errors in Eqn. (2) will be heteroscedastic. Also, the log transformation can induce skewness and transform the data nearest 0 to outliers. On the other hand, if the original data exhibit heteroscedasticity and skewness, then these problems might be reduced by transformation. Any intelligent use of transformations requires that the effects of the transformations on the error structure be understood; see Ruppert et al. (1989), where the use of transformations to linearize the Michaelis–Menten equation is discussed.
In our example of Forbes 500 companies, log(sales) seems linearly related to log(assets), which suggests that the untransformed data satisfy
$$\text{sales} \approx \beta_0\, \text{assets}^{\beta_1}. \tag{3}$$
Moreover, the log-transformed data are less skewed and heteroscedastic than the original data, so transformation brings the data more closely in line with the usual assumptions of linear regression.
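The linearization of a power model such as (3) can be sketched as an ordinary least-squares fit on the log scale. The example below makes several assumptions that are not in the text: the data are simulated, the true parameters `beta0` and `beta1` are hypothetical, and the error is multiplicative (so that the log-scale errors are homoscedastic) rather than additive as in model (1).

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

rng = random.Random(1)
beta0, beta1 = 2.0, 0.8  # hypothetical true parameters of sales = beta0 * assets**beta1
assets = [rng.uniform(1.0, 100.0) for _ in range(500)]
# Assumed multiplicative lognormal error, so that errors are additive after taking logs.
sales = [beta0 * a ** beta1 * math.exp(rng.gauss(0.0, 0.1)) for a in assets]

intercept, slope = fit_line([math.log(a) for a in assets],
                            [math.log(s) for s in sales])
beta0_hat = math.exp(intercept)  # intercept estimates log(beta0)
beta1_hat = slope                # slope estimates the exponent beta1
print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```

Under these assumptions the slope of the log-log fit recovers the exponent and the exponentiated intercept recovers the multiplicative constant; with additive errors on the original scale the log-scale errors would be heteroscedastic, as the text warns.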
2. Transformations to Simplify Error Structure
The effects of a transformation upon the distribution of data can be understood by examining the transformation’s derivative. A nonlinear transformation tends to stretch out data where the transformation’s derivative is large in absolute value, and to squeeze together data where the absolute value of that derivative is small. These effects are illustrated in Fig. 3(a) where the logarithm transformation is applied to normalize right-skewed lognormally distributed data. Along the x-axis one sees the probability density function of the lognormal data. The log transformation is shown as a solid curve. On the y-axis is the normal density of the transformed data. The mapping of the 5th percentile of the original data to the same percentile of the transformed data is shown with dashed lines, and similar mappings for the median and 95th percentiles are shown with solid and dashed lines. The log is a concave transformation; that is, its derivative is decreasing. Therefore the long right tail and short left tail of the lognormal distribution are shrunk and stretched, respectively, by the mapping to produce a symmetric distribution of the transformed data. The density on the y-axis is evaluated at equally spaced points. Their inverse images on the x-axis are spread out increasingly as one moves from left to right. The same effect can be seen in the percentiles. In general, concave transformations are needed to symmetrize right-skewed distributions. Analogously, convex transformations, which have increasing first derivatives, are used to symmetrize left-skewed distributions.
Figure 3 (a) A transformation to remove right skewness. Here the concave log transformation converts right-skewed lognormal data to normally distributed data. The densities of the original and transformed data are shown on the x and y axes, respectively. The lines show the mappings of the 5th, 50th, and 95th percentiles; (b) A variance-stabilizing transformation. Here the square-root transformation converts heteroscedastic data consisting of two populations to data with a constant variance; ‘*’ and ‘o’ denote the two populations. The lines show the mappings of the first and third quartiles of the two populations
Data often exhibit the property that populations with larger means also have larger variances. In regression, if $E(Y \mid X) = \mu(X)$ is the mean of the conditional distribution of the response $Y$ given a vector $X$ of covariates, then we might find that the conditional variance $\operatorname{var}(Y \mid X)$ is a function of $\mu(X)$, i.e., $\operatorname{var}(Y \mid X) = g\{\mu(X)\}$ for some function $g$. If $Y$ is transformed to $h(Y)$, then a delta-method linear approximation (Bartlett 1947) shows that
$$\operatorname{var}\{h(Y) \mid X\} \approx \{h'(\mu(X))\}^2\, g\{\mu(X)\}. \tag{4}$$
The transformation $h$ stabilizes the variance if $\{h'(y)\}^2 g(y)$ is constant. For example, if $g(y) \propto y^{\alpha}$, then $h(y) \propto y^{1-\alpha/2}$ is variance-stabilizing for $\alpha \neq 2$. For Poisson data, $\alpha = 1$ and the square-root transformation is variance-stabilizing. For data with a constant coefficient of variation, $\alpha = 2$ and the log transformation is variance-stabilizing. Exponentially distributed data, or, more generally, gamma-distributed data with a constant shape parameter, are examples with a constant coefficient of variation. For binomial data with a constant sample size, $n$, $g(y) = n y (1 - y)$ and the ‘angular’ transformation, $\sin^{-1}(\sqrt{y})$, is variance-stabilizing.
When $g$ is an increasing function, then the variance-stabilizing transformation will be concave, as illustrated in Fig. 3(b). The original data on the x-axis are such that their square roots are normally distributed data, except that they are truncated in the extreme tails to be strictly positive. There are two original populations with different means and variances. After a square-root transformation, the two populations have different means but equal variances. The dashed lines show the mapping of the first and third quartiles of the first population of the original data to their transformed values. The solid lines show the same mapping of the second population.
When $\operatorname{var}(Y \mid X)$ is not a function of $\mu(X)$, then there is no variance-stabilizing transformation, and variance function modeling (Carroll and Ruppert 1988) should be used as an alternative, or at least an adjunct, to transformation of $Y$.
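The Poisson case can be checked numerically. The sketch below computes the variance of $Y$ and of $\sqrt{Y}$ by summing over the Poisson probability mass function instead of simulating; the helper names and the truncation point `k_max` are choices made for this illustration. By Eqn. (4) with $g(y) = y$ and $h(y) = \sqrt{y}$, the variance of $\sqrt{Y}$ should be roughly $1/4$ whatever the mean.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of k, computed on the log scale to avoid overflow."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def variance_of_transformed(mean, h, k_max=400):
    """Variance of h(Y) for Y ~ Poisson(mean), truncating the sum at k_max."""
    probs = [poisson_pmf(k, mean) for k in range(k_max)]
    m1 = sum(p * h(k) for k, p in enumerate(probs))
    m2 = sum(p * h(k) ** 2 for k, p in enumerate(probs))
    return m2 - m1 ** 2

for mean in (10.0, 40.0, 160.0):
    raw = variance_of_transformed(mean, lambda y: y)
    root = variance_of_transformed(mean, math.sqrt)
    print(f"mean {mean:6.1f}: var(Y) = {raw:8.3f}, var(sqrt(Y)) = {root:.4f}")
```

The variance of the untransformed counts grows with the mean, whereas the variance of the square roots stays near the delta-method value of one quarter.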
3. Power Transformations
There are several parametric families of data transformation in general use. The most important are the power or Box and Cox (1964) family and the power-plus-shift family. The Box–Cox power transformations, which are used for positive $y$, are
$$h(y; \lambda) = \begin{cases} (y^{\lambda} - 1)/\lambda, & \lambda \neq 0, \\ \log(y), & \lambda = 0. \end{cases}$$
For $\lambda \neq 0$ the ordinary power transformation, $y^{\lambda}$, is shifted and rescaled so that $h(y; \lambda)$ converges to $\log(y)$ as $\lambda$ converges to 0. This embeds the log transformation into the power transformation family in a continuous fashion. The shift and rescaling are linear transformations and have no effect on whether the transformation achieves the three objectives enumerated in the introduction. Power transformations with $\lambda = -1/2, 0, 1/2, 1, 1.5, 2,$ and $2.5$ are shown in Fig. 4. Notice that the transformations are upward bending (convex) when $\lambda > 1$ and downward bending (concave) when $\lambda < 1$. Transformations which are either convex or concave are called ‘one-bend’ by Kruskal (1968). As mentioned before, power transformations are variance-stabilizing when the variance is a power of the mean, a model that often fits well in practice. The cube-root transformation, called the Wilson–Hilferty (1931) transformation, approximately normalizes, and therefore symmetrizes, gamma-distributed data.
Figure 4 Box–Cox power transformations, $h(y; \lambda)$ plotted against $y$. The linear transformation with $\lambda = 1$ is shown as a heavy solid line. The convex transformations with $\lambda > 1$ are above this line and the concave transformations with $\lambda < 1$ are below
The effect of $h(y; \lambda)$ on data depends on its derivative $h'(y; \lambda) = (\partial/\partial y)\, h(y; \lambda) = y^{\lambda - 1}$. Consider two points $0 < y_1 < y_2$. The ‘strength’ of the transformation can be measured by the ratio
$$\frac{h'(y_2; \lambda)}{h'(y_1; \lambda)} = \left( \frac{y_2}{y_1} \right)^{\lambda - 1},$$
which shows how far data are spread out at $y_2$ relative to $y_1$. This ratio is one for $\lambda = 1$ (linear transformation), is less than one for $\lambda < 1$ (concave transformations), and is greater than one for $\lambda > 1$ (convex transformations). For fixed $y_1$ and $y_2$, the ratio is increasing in $\lambda$, so that its concavity or convexity becomes stronger as $\lambda$ moves away from 1. If a certain value of $\lambda$ transforms a right-skewed distribution to symmetry, then a larger value of $\lambda$ will not remove right-skewness completely, while a smaller value of $\lambda$ will induce some left-skewness. Analogously, if the variance is an increasing function of the mean, and a certain value of $\lambda$ is variance-stabilizing, then for a larger value of $\lambda$ the variance of the transformed data is still increasing in the mean, while for a smaller value of $\lambda$, the variance is decreasing in the mean. Understanding this behavior is important when a symmetrizing or variance-stabilizing power transformation is sought by trial and error.
The shifted-power transformation is
$$h(y; \lambda, \theta) = h(y + \theta; \lambda),$$
which is a location shift of $y$ to $y + \theta$ followed by a Box–Cox power transformation. The ordinary power transformation can be applied only to nonnegative data, but the shifted power transformation can be used for any data set provided that $\theta > -\min(Y)$, where $\min(Y)$ is the smallest value of the data. Thus, shifted power transformations are particularly well suited to variables that have a lower bound, but no upper bound.
Kruskal (1968) and Atkinson (1985) discuss families of transformations that are appropriate for percentages, proportions, and other variables that are constrained to a bounded interval. The most common transformations for variables constrained to lie between 0 and 1 are the angular transformation, $\sin^{-1}(\sqrt{y})$, the logit transformation, $\log\{y/(1 - y)\}$, and the probit transformation, $\Phi^{-1}(y)$, where $\Phi$ is the normal (0, 1) cumulative distribution function. These transformations, shown in Fig. 5, are concave for $y < 1/2$ and convex for $y > 1/2$; Kruskal (1968) calls them ‘two-bend’ transformations.
Figure 5 Angular, logit, and probit transformations. To facilitate comparisons of shapes, each of the transformations was shifted and scaled so that the mean and variance of the transformed values were 0 and 1, respectively
For data that are bounded neither above nor below, Manly (1976) proposes the exponential data transformations
$$h(y; \lambda) = \begin{cases} \{\exp(\lambda y) - 1\}/\lambda, & \lambda \neq 0, \\ y, & \lambda = 0, \end{cases}$$
which is the transformation from $y$ to $\exp(y)$, a positive variate, followed by the Box–Cox power transformation.
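The Box–Cox family, its shifted version, and the derivative ratio used above to measure the strength of a transformation translate directly into code. The following is a small sketch; the function names are illustrative and not taken from any library.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transformation of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def shifted_box_cox(y, lam, theta):
    """Shifted power transformation: shift y by theta, then apply Box-Cox."""
    return box_cox(y + theta, lam)

def spread_ratio(y1, y2, lam):
    """Ratio of derivatives h'(y2) / h'(y1) = (y2 / y1) ** (lam - 1)."""
    return (y2 / y1) ** (lam - 1.0)

# The power transformation approaches the log as lam approaches 0.
print(box_cox(2.0, 1e-8), math.log(2.0))

# Ratio below one for a concave case, one for linear, above one for a convex case.
for lam in (0.5, 1.0, 2.0):
    print(lam, spread_ratio(1.0, 4.0, lam))

# A shift makes the transformation usable when the data include negative values.
print(shifted_box_cox(-1.0, 0.5, theta=2.0))
```

With $y_1 = 1$ and $y_2 = 4$, the ratio is one half for the square root ($\lambda = 1/2$) and four for the square ($\lambda = 2$), matching the description of concave and convex members of the family.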
4. Estimation of Transformation Parameters
There are situations where a transformation may be selected based on a priori information. For example, if we know that model (1) holds, then we know that the log transformation will linearize and remove interactions. In other situations, transformations are often based on the data, though perhaps selected from a predetermined parametric family, e.g., shifted power transformations. Trial and error can be an effective method of parameter selection. For example, one might try power transformations with $\lambda$ equal to $-1$, $-1/2$, $0$, $1/2$, and $1$, and see which value of $\lambda$ produces the best residual plots. When all aspects of the data are modeled parametrically, then the transformation parameter, as well as other parameters, can be estimated simultaneously by maximum likelihood. Semiparametric transformations are used when the model is not fully parametric, e.g., it is assumed that there is a parametric transformation to symmetry but not to a specific symmetric parametric family.
4.1 Maximum Likelihood Estimation
Maximum likelihood estimation is possible once a parametric model is specified. However, there are at least three possible parametric transformation models in common use: the Box and Cox (1964) model; the transform-both-sides (TBS) model of Carroll and Ruppert (1981); and the Box and Tidwell (1962) model. In practice it is essential to consider which, if any, of these models is most appropriate.
Box and Cox (1964) assumed that after a parametric transformation of the response, the data follow a simple linear model with homoscedastic, normally distributed errors. Therefore, the Box–Cox model is that
$$h(Y; \lambda) = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon \tag{5}$$
where $\varepsilon$ is normal $(0, \sigma^2)$ for some unknown value of $\lambda$. They find the likelihood for this model as a function of $(\lambda, \sigma^2, \beta_0, \ldots, \beta_p)$ and show how to maximize the likelihood to estimate the transformation parameter, the regression parameters, and the variance of the errors. The Box–Cox model attempts to achieve each of the three objectives enumerated in the introduction with a single transformation. Although it may be rare for all three goals to be achieved exactly, often in practice both the systematic model and the error structure can be simplified by the same transformation, and because of this the Box–Cox technique has seen widespread usage. Simultaneous estimation of a power and shift by maximum likelihood is not recommended, for technical reasons (Atkinson 1985), but $\lambda$ can be estimated by maximum likelihood for a fixed value of the shift, which can be chosen by trial and error.
Model (5) with only an intercept, i.e., $h(Y; \lambda) = \beta_0 + \varepsilon$, is commonly used to choose a normalizing transformation for a univariate sample. When maximum likelihood estimation was applied to this model using the Forbes 500 data, the maximum likelihood estimates of $\lambda$ were $-0.07$ and $-0.04$ for sales and assets, respectively. These values are quite close to the log transformation, $\lambda = 0$, which partially justifies the use of the log transformation for this example.
Carroll and Ruppert (1981, 1988) apply maximum likelihood estimation to a somewhat different parametric model than that of Box and Cox, called TBS. They assume that there is a known theoretical model relating the response to the predictors, i.e., that
$$Y \approx f(x_1, \ldots, x_p; \beta_1, \ldots, \beta_q)$$
where $f(x_1, \ldots, x_p; \beta_1, \ldots, \beta_q)$ is some nonlinear regression model. Transforming only the response will destroy this relationship, so they transform both the response and the theoretical model in the same way. It is assumed that after transforming by the correct $\lambda$, the errors are homoscedastic and normally distributed. Therefore, the Carroll–Ruppert model is
$$h(Y; \lambda) = h\{f(x_1, \ldots, x_p; \beta_1, \ldots, \beta_q); \lambda\} + \varepsilon.$$
TBS simplifies only the error structure while leaving the systematic model unchanged. In the Forbes 500 example, if one starts with the power model (3), then the model used in Fig. 2, i.e.,
$$\log(\text{sales}) = \beta_0^* + \beta_1 \log(\text{assets}) + \varepsilon, \qquad \beta_0^* = \log(\beta_0), \tag{6}$$
is the TBS model with $\lambda = 0$.
The model of Box and Tidwell (1962) assumes that the untransformed $Y$ is linear in the transformed predictors, for example, that
$$Y = \beta_0 + \beta_1 h(x_1; \lambda_1) + \cdots + \beta_p h(x_p; \lambda_p) + \varepsilon.$$
This model is a special case of nonlinear regression and can be fit by ordinary nonlinear least-squares. Atkinson (1975) discusses a combination of the Box–Tidwell model and the Box–Cox model, where
$$h(Y; \lambda) = \beta_0 + \beta_1 h(x_1; \lambda_1) + \cdots + \beta_p h(x_p; \lambda_p) + \varepsilon.$$
A special case of Atkinson’s model restricts the transformation parameters so that $\lambda = \lambda_1 = \cdots = \lambda_p$. Atkinson’s model with $\lambda = \lambda_1 = 0$ is equivalent to the TBS model (6).
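For the intercept-only version of model (5), the maximum likelihood estimate of $\lambda$ can be sketched with a grid search over the profile log-likelihood, in which $\beta_0$ and $\sigma^2$ are replaced by their estimates for each fixed $\lambda$. Up to an additive constant the profile log-likelihood is $-(n/2)\log\hat\sigma^2(\lambda) + (\lambda - 1)\sum_i \log y_i$, where the second term comes from the Jacobian of the transformation. The data below are simulated lognormal values with hypothetical parameters, not the Forbes 500 data, and the grid is an arbitrary choice for illustration.

```python
import math
import random

def box_cox(y, lam):
    """Box-Cox power transformation of a positive value y."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def profile_loglik(ys, lam):
    """Profile log-likelihood of lam for h(Y; lam) = beta0 + error, up to a constant."""
    n = len(ys)
    z = [box_cox(y, lam) for y in ys]
    mean = sum(z) / n
    sigma2 = sum((v - mean) ** 2 for v in z) / n   # ML estimate of the error variance
    jacobian = (lam - 1.0) * sum(math.log(y) for y in ys)
    return -0.5 * n * math.log(sigma2) + jacobian

def estimate_lambda(ys, grid):
    """Grid value of lam that maximizes the profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(ys, lam))

rng = random.Random(2)
ys = [rng.lognormvariate(2.0, 0.8) for _ in range(1000)]   # simulated positive data
grid = [i / 20.0 for i in range(-40, 41)]                  # -2.0, -1.95, ..., 2.0
lam_hat = estimate_lambda(ys, grid)
print(f"estimated lambda: {lam_hat:.2f}")
```

Because the simulated data are lognormal, the estimate should fall near $\lambda = 0$, the log transformation. A numerical optimizer could replace the grid, and regression predictors could be added by replacing the sample mean with fitted values from least squares.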
4.2 Semiparametric Estimation
Hinkley (1975) estimates a transformation to symmetry by setting a skewness measure of the transformed data to zero and solving for the transformation parameter. Ruppert and Aldershof (1989) estimate a variance-stabilizing transformation by minimizing the correlation between the squared residuals and the fitted values.
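The idea of solving for the power that sets a skewness measure to zero can be sketched as follows. This is an illustration in the spirit of that approach and not a reproduction of Hinkley's estimator: the skewness measure used here is a quartile-based one chosen for this example, and the bracket for the bisection search is an assumption. Because a larger $\lambda$ gives a more convex transformation, this skewness measure is increasing in $\lambda$, which is what makes bisection applicable.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transformation of a positive value y."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def quartiles(values):
    """Lower quartile, median, and upper quartile by linear interpolation."""
    s = sorted(values)
    def quantile(p):
        pos = p * (len(s) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (pos - lo) * (s[hi] - s[lo])
    return quantile(0.25), quantile(0.5), quantile(0.75)

def quartile_skewness(values, lam):
    """Quartile skewness of the transformed data; zero when the quartiles are symmetric."""
    # A monotone transformation maps quartiles to quartiles, so transform them directly.
    q1, q2, q3 = (box_cox(q, lam) for q in quartiles(values))
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)

def symmetrizing_lambda(values, lo=-3.0, hi=3.0, tol=1e-8):
    """Bisection for the lam at which the quartile skewness is zero."""
    if quartile_skewness(values, lo) > 0 or quartile_skewness(values, hi) < 0:
        raise ValueError("no sign change of the skewness measure on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quartile_skewness(values, mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy data whose quartiles are 1, 4, and 9, so their square roots are equally spaced.
data = [0.5, 1.0, 4.0, 9.0, 20.0]
print(symmetrizing_lambda(data))
```

For the toy data the search settles at the square-root transformation, $\lambda = 1/2$. A measure based on quartiles ignores the extreme observations, which is one way such an estimator can be made insensitive to outliers.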
4.3 Effects of Outliers and Robustness
An outlier is an observation that is poorly fit by the model and parameter values that fit the bulk of the data. When the model includes a transformation, outliers can be difficult to identify. For example, if the original data are lognormally distributed, apparent outliers in the long right tail may be seen to be conforming after the normalizing transformation. Conversely, the smallest observations may be quite close to the bulk of the original data but outlying after a log transformation. It is essential that the transformation parameter not be chosen to fit a few outliers but rather to fit the main data set. Diagnostics and robust transformations can be useful for this purpose. The semiparametric estimators discussed above are often chosen to be insensitive to outliers, e.g., in Hinkley (1975), by choosing $\lambda$ to minimize a robust measure of skewness. Because of the potentially serious effects of not accounting for the effects of outliers properly, users of transformations are advised to consult the monographs by Atkinson (1985) and Carroll and Ruppert (1988) for practical advice.
4.4 Inference after a Transformation
There has been some debate on whether a data-based transformation can be treated as fixed when one makes inferences about the regression parameters (Box and Cox 1982). This is a serious practical question. In the Box–Cox model, the values of the regression parameters are highly dependent on the choice of transformation, and there is a general consensus that inference for regression parameters should treat the transformation parameters as fixed; see Box and Cox (1982) for discussion. If one is estimating the expected response or predicting responses at new values of the predictors, then the predictions are more variable with an estimated transformation than when the correct transformation is known (Carroll and Ruppert 1981). Prediction intervals can be adjusted for this effect to obtain coverage probabilities closer to nominal values (Carroll and Ruppert 1991).
5. Multivariate Transformations
Most of the classical techniques of multivariate analysis assume that the population has a multivariate normal distribution, an assumption that is stronger than that the individual components are univariate normal. Andrews et al. (1971) generalize the Box–Cox model to multivariate samples. They transform each coordinate of the observations using the same transformation family, e.g., power transformations for all coordinates, but with coordinate-specific transformation parameters. It is assumed that within this family of multivariate transformations, there exists a transformation to multivariate normality. All parameters in this model are estimated by maximum likelihood. See Gnanadesikan (1997) and Ruppert (2001) for further discussion.
6. Nonparametric Transformations
For interpretability, a data transformation should be continuous and monotonic, but parametric assumptions are made only for simplicity. A nonparametric estimate of a transformation allows one to select a suitable parametric family of transformations, to check if an assumed family is appropriate, or to forgo parametric assumptions when necessary. Tibshirani’s (1988) additivity and variance stabilization (AVAS) method uses nonparametric variance function estimation and Eqn. (4) to estimate the variance-stabilizing transformation. Nychka and Ruppert (1995) use nonparametric spline-based transformations within the TBS model. Nonparametric transformations should not be confused with semiparametric transformations, as discussed above; in the latter a transformation is selected from within a parametric family but the data are not assumed to have a density within a parametric family.
7. Further Information
Atkinson (1985), Carroll and Ruppert (1988), and Draper and Smith (1998) are good places to learn more about the practical aspects of data transformation.
See also: Distributions, Statistical: Approximations; Linear Hypothesis: Regression (Basics); Linear Hypothesis: Regression (Graphics)
Bibliography
Andrews D F, Gnanadesikan R, Warner J L 1971 Transformations of multivariate data. Biometrics 27: 825–40
Atkinson A C 1985 Plots, Transformations, and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis. Oxford University Press, New York
Bartlett M S 1947 The uses of transformations. Biometrics 3: 39–52
Box G E P, Cox D R 1964 An analysis of transformations (with discussion). Journal of the Royal Statistical Society, Series B 26: 211–46
Box G E P, Cox D R 1982 An analysis of transformations revisited, rebutted. Journal of the American Statistical Association 77: 209–10
Box G E P, Tidwell P W 1962 Transformations of the independent variables. Technometrics 4: 531–50
Carroll R J, Ruppert D 1981 On prediction and the power transformation family. Biometrika 68: 606–19
Carroll R J, Ruppert D 1984 Power transformation when fitting theoretical models to data. Journal of the American Statistical Association 79: 321–8
Carroll R J, Ruppert D 1985 Transformations in regression: A robust analysis. Technometrics 27: 1–12
Carroll R J, Ruppert D 1988 Transformation and Weighting in Regression. Chapman and Hall, New York
Carroll R J, Ruppert D 1991 Prediction and tolerance intervals with transformation and/or weighting. Technometrics 33: 197–210
Draper N R, Smith H 1998 Applied Regression Analysis, 3rd edn. Wiley, New York
Gnanadesikan R 1997 Methods for Statistical Data Analysis of Multivariate Observations. Wiley, New York
Hinkley D V 1975 On power transformations to symmetry. Biometrika 62: 101–11
Kruskal J B 1968 Transformations of data. In: Sills D L (ed.) International Encyclopedia of the Social Sciences. Macmillan, New York
Manly B F J 1976 Exponential data transformations. Statistician 25: 37–42
Nychka D, Ruppert D 1995 Nonparametric transformations for both sides of a regression model. Journal of the Royal Statistical Society, Series B 57: 519–32
Ruppert D 2001 Multivariate transformations. In: El-Shaarawi A H, Piegorsch W W (eds.) Encyclopedia of Environmetrics. Wiley, New York
Ruppert D, Aldershof B 1989 Transformations to symmetry and homoscedasticity. Journal of the American Statistical Association 84: 437–46
Ruppert D, Carroll R J, Cressie N 1989 A transformation/weighting model for estimating Michaelis–Menten parameters. Biometrics 45: 637–56
Tibshirani R 1988 Estimating transformations for regression via additivity and variance stabilization. Journal of the American Statistical Association 83: 394–405
Wilson E B, Hilferty M M 1931 The distribution of chi-square. Proceedings of the National Academy of Sciences of the United States of America 17: 684–88
D. Ruppert
Statistical Clustering
Every formal inquiry begins with a definition of the objects of the inquiry and of the terms to be used in the inquiry, a classification of objects and operations. Since about 1950, a variety of simple explicit classification methods, called clustering methods, have been proposed for different standard types of data. Most of these methods have no formal probabilistic underpinnings. And indeed, classification is not a subset of statistics; data collection, probability models, and data analysis all require informal subjective prior classifications. For example, in data analysis, some kind of binning or selection operation is performed prior to a formal statistical analysis. However, statistical methods can assist in classification in four ways: (a) in devising probability models for data and classes so that probable classifications for a given set of data can be identified; (b) in developing tests of validity of particular classes produced by a classification scheme; (c) in comparing different classification schemes for effectiveness; and (d) in expanding the search for optimal classifications by probabilistic search strategies.
These statistical methods are referred to as cluster analysis or statistical classification. The second term is preferred, because a cluster, from its astronomical origin, tends to connote a crowd of points close together in space, but a class is a set of objects assembled together for any kind of reason, and so the more general term classification is to be preferred. Part of the reason for the narrower term cluster analysis is that the term classification was once used in statistics to refer to discriminant analysis (see Multivariate Analysis: Classification and Discrimination). In discriminant analysis, a classification rule is developed for placing a multivariate observation into one of a number of established classes, using the joint distribution of the observations and classes. These problems are regression problems: the variable to be predicted is the class of the observations. For example, logistic regression is a method for discriminant analysis with two classes (see Multivariate Analysis: Discrete Variables (Logistic Regression)). More recently, classification in statistics has come to refer to the more general problem of constructing families of classes, a usage which has always been accepted outside of statistics.
The first book on modern clustering techniques is Sokal and Sneath (1963). The most comprehensive recent review of a huge diverse literature is Arabie et al. (1996), which includes in particular an excellent survey of algorithms by Arabie and Hubert and of probabilistic methods by Bock. A classification consists of a set of classes; formal clustering techniques have been mainly devoted to either partitions or hierarchical clustering (see, however, the nonhierarchic structures in Jardine and Sibson 1968 and the block clusters in Hartigan 1975).
1. Algorithms
1.1 Partitions
A partition is a family of classes, sets of objects with the property that each object lies in exactly one class of the partition. Ideal-object partitions are constructed as follows. Begin with a set of given objects, a class of ideal objects, and a numerical dissimilarity between any given object and any ideal object. A fitted ideal object is associated with each class having minimum sum of dissimilarities to all given objects in the class. The algorithm aims to determine k ideal objects, and a partition of the objects into k classes, so that the dissimilarity between an object and the fitted ideal object of its class, summed over all objects, is minimized. The usefulness of such a clustering is that the original set of objects is reduced with small loss of information to the smaller set of fitted ideal objects.
Except in special circumstances, it is impracticable to search over all possible partitions to minimize the sum of dissimilarities; rather some kind of local optimization is used instead. For example, begin with some starting partition, perhaps chosen randomly. First fit the ideal objects for a given partition. Second reallocate each object to the class of the fitted ideal object with which it has smallest dissimilarity. Repeat these two steps until no further reallocation takes place. The k-means algorithm (MacQueen 1967) is the best known algorithm of this type; each object is represented by a point in p-dimensional space, the dissimilarity between any two points is squared Euclidean distance, and the ideal object for each class is the centroid of the class. Asymptotic behaviour of the k-means algorithm is described in Pollard (1981).
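The two alternating steps can be sketched for the k-means case as follows. This is a minimal illustration under stated assumptions, not MacQueen's original procedure: the starting partition is supplied by the caller, ties are broken in favor of the lower class index, and the function stops with an error if a class becomes empty.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, labels, k, max_iter=100):
    """Alternate centroid fitting and reallocation until the partition is stable."""
    labels = list(labels)
    for _ in range(max_iter):
        # Step 1: fit the ideal object (centroid) of each class.
        centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                raise ValueError(f"class {c} became empty")
            centers.append(centroid(members))
        # Step 2: reallocate each object to the class of its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
start = [0, 1, 0, 1, 0, 1]          # a deliberately poor starting partition
labels, centers = k_means(points, start, k=2)
print(labels, centers)
```

Starting from the poor partition, one reallocation separates the two groups of three points and the next pass confirms that nothing moves. As the text notes, this is only a local optimization, so different starting partitions can lead to different final partitions.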
1.2 Hierarchical Clustering
Hierarchical clustering constructs trees of clusters of objects, in which any two clusters are disjoint, or one includes the other. The cluster of all objects is the root of the tree. Agglomerative algorithms (Lance and Williams 1967) require a definition of dissimilarity between clusters; the most common ones are maximum or complete linkage, in which the dissimilarity between two clusters is the maximum of all pairs of dissimilarities between pairs of points in the different clusters; minimum or single linkage or nearest neighbor (Florek et al. 1951), in which the dissimilarity between two clusters is the minimum over all those pairs of dissimilarities; and average linkage, in which the dissimilarity between the two clusters is the average, or suitably weighted average, over all those pairs of dissimilarities. Agglomerative algorithms begin with an initial set of singleton clusters consisting of all the objects; proceed by agglomerating the pair of clusters of minimum dissimilarity to obtain a new cluster, removing the two clusters combined from further consideration; and repeat this agglomeration step until a single cluster containing all the observations is obtained. The set of clusters obtained along the way forms a hierarchical clustering.
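The agglomeration step can be sketched directly from this description. The version below is a naive illustration that recomputes every between-cluster dissimilarity at each step; it is not the update scheme of Lance and Williams, and the unweighted mean is assumed for average linkage. The function name and the toy data are hypothetical.

```python
from itertools import combinations

def agglomerate(n, dist, linkage="single"):
    """Return the clusters formed, in order, by naive agglomerative clustering.

    n is the number of objects, labelled 0..n-1, and dist(i, j) gives the
    dissimilarity between objects i and j.
    """
    combine = {
        "single": min,                                   # nearest pair of points
        "complete": max,                                 # farthest pair of points
        "average": lambda values: sum(values) / len(values),
    }[linkage]

    def cluster_dissimilarity(pair):
        a, b = pair
        return combine([dist(i, j) for i in a for j in b])

    active = [frozenset([i]) for i in range(n)]          # start from singletons
    merges = []
    while len(active) > 1:
        a, b = min(combinations(active, 2), key=cluster_dissimilarity)
        merged = a | b
        # The two combined clusters are removed from further consideration.
        active = [c for c in active if c not in (a, b)] + [merged]
        merges.append(merged)
    return merges

xs = [0.0, 1.0, 2.2, 4.0]                                # points on a line

def distance(i, j):
    return abs(xs[i] - xs[j])

single = agglomerate(len(xs), distance, "single")
complete = agglomerate(len(xs), distance, "complete")
print([sorted(c) for c in single])
print([sorted(c) for c in complete])
```

On these four points both linkages first join objects 0 and 1. Single linkage then attaches object 2 to that pair, whereas complete linkage joins objects 2 and 3, which shows how the choice of between-cluster dissimilarity changes the hierarchy.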
1.3 Object Trees A different kind of tree structure for classification is a graph on the objects in which pairs of objects are linked in such a way that there is a unique sequence of links connecting any pair of objects. The minimum spanning tree (Kruskal 1956) is such a graph, constructed so that the total length of links in the tree is minimized. There may be more than one such optimal graph. The minimum spanning tree is related to single linkage clusters: begin by removing the largest link, and note the two sets of objects that are connected by the remaining links; continue removing links in order of size, noting at each stage the two new sets of objects connected by the remaining links. The family of connected sets obtained throughout the removal process constitutes the single linkage clusters. In a Wagner tree, the given objects are end nodes of the tree, and some set of ideal objects are the possible internal nodes of the tree. A Wagner tree for the given objects is a tree of minimum total length in which the given objects are end nodes and the internal objects are arbitrary. A Wagner tree is a minimum spanning tree on the set of all objects and the optimally placed internal nodes. This tree has application in evolutionary biology as the most parsimonious tree reaching the given objects (Fitch and Margoliash 1967).
A third kind of tree structure arises from the CART algorithm of Breiman et al. (1984). This algorithm applies to regression problems in which a given target variable is predicted from a recursively constructed partition of the observation space; the prediction is constant within each member of the partition. If the predictors are numerical variables, the first choice in the recursion looks over all variables V and all cutpoints a for each variable, generating the binary variables identifying observations where $V \le a$. The binary predictor most correlated with the target variable divides the data set into two sets, and the same choice of division continues within each of the two sets, generating a regression tree. It is a difficult problem to know when to stop dividing. These regression trees constitute a classification of the data adapted to a particular purpose: the accurate prediction of a particular variable. Usually, only the final partition is of interest in evaluating the regressive prediction, although the regression tree obtained along the way is used to decide which variables are most important in prediction; for example, the variables appearing in the higher levels of the tree are usually the most important. However, if there are several highly correlated variables, only one might appear in the regression tree, although each would be equally useful in prediction.
2. Probabilistic Classification The aim is to select an appropriate classification C from some family Γ based on given data D.
2.1 Mixture Models It is assumed that there are a number of objects, denoted $x_1, \ldots, x_n$, which are to be classified. A probability model specifies the probability of an
observation x when x is known to come from a class with parameter value θ, say $p(x \mid \theta)$. The parameter value θ corresponds to an ideal point in the partition algorithm [Sect. 1.2]. The mixture model assumes that there is a fixed number of classes k, and that an object $x_i$ comes from class $C_j$ with mixing probability $p_j$, so that the probability of observation x is $p(x) = \sum_j p_j\, p(x \mid \theta_j)$, and collectively the observations have probability $\prod_i p(x_i)$. The parameters $p_j$ and $\theta_j$ usually are estimated by maximum likelihood in a non-Bayesian approach. No single partition is produced by the algorithm, but rather the conditional probability of each partition at the maximum likelihood values of the parameters. How do these statistical partitionings compare with the dissimilarity-based partitions of Sect. 1.2? Suppose the partition itself is a parameter to be estimated in the statistical model; estimate the optimal partition $C = \{C_1, \ldots, C_k\}$ and the parameters $p_j, \theta_j$ to maximize the probability of the data given the partition, $\prod_j \prod_{x_i \in C_j} p_j\, p(x_i \mid \theta_j)$. Thus, the optimal partition in the statistical model is the optimal partition in a dissimilarity model with dissimilarities between the object $x_i$ and the ideal object $\theta_j$
$$d(x_i, \theta_j) = \log p_j + \log p(x_i \mid \theta_j) \qquad (1)$$
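As a small illustration of Eqn. (1), the sketch below scores an observation against each class by the log quantity $\log p_j + \log p(x \mid \theta_j)$ and allocates it to the class with the largest score. It assumes one-dimensional normal components with known parameters, which is only one possible choice of $p(x \mid \theta)$; the names `log_normal_density` and `classify` are hypothetical.

```python
import math


def log_normal_density(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)


def classify(x, mixing, means, sds):
    """Index j of the class maximizing log p_j + log p(x | theta_j)."""
    scores = [math.log(p) + log_normal_density(x, m, s)
              for p, m, s in zip(mixing, means, sds)]
    return max(range(len(scores)), key=scores.__getitem__)


# Equal mixing probabilities and equal variances: allocation to the nearest
# mean, as in k-means with squared Euclidean distance.
nearest = classify(0.6, mixing=[0.5, 0.5], means=[0.0, 1.0], sds=[1.0, 1.0])
# A dominant class can win even when its mean is farther away.
weighted = classify(0.6, mixing=[0.9, 0.1], means=[0.0, 1.0], sds=[1.0, 1.0])
```

With equal mixing probabilities and variances the rule reduces to nearest-centroid allocation, which is the sense in which the k-means partition and the statistical partition coincide.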
It is, thus, possible to translate back and forth between statistical models and dissimilarity models by taking dissimilarities to be log probabilities. However, it is wrong to estimate the optimal partition by maximizing the probability of the data given the partition and parameters over all choices of partitions and parameters; as the amount of data increases, the limiting optimal partition does not correspond to the correct partition obtained when the parameters $p_j, \theta_j$ are known. (The correct partition assigns $x_i$ to the class that maximizes $p_j\, p(x_i \mid \theta_j)$.) In contrast, the estimate of $p_j, \theta_j$ that maximizes the data probability averaged over all possible partitions is consistent for the true values under mild regularity conditions, and the partition determined from these maximum likelihood estimates is consistent for the correct partition. In a Bayesian approach, the probability model is a joint probability distribution over all possible data D and classifications C. The answer to the classification problem is the conditional probability distribution over all possible classifications C given the present data D. Add to the previous assumptions a prior distribution on the unknown parameters $p_j, \theta_j$ (Bock 1972). Again, consistent estimates of the correct classification are not obtained by finding the classification C and parameters p, θ of maximum posterior probability. Use the marginal posterior distribution of p, θ, averaging over all classifications, to give consistent maximum posterior density estimates $\hat{p}, \hat{\theta}$, say. Then find the classification C of optimal posterior probability given the data and given $p = \hat{p}$, $\theta = \hat{\theta}$. This classification will converge to the optimal classification for the mixture distribution with known p and θ. If in addition a prior distribution is specified for the number of clusters, a complete final posterior distribution is available for all possible partitions into any number of components.
2.2 Hierarchical Models Many of the hierarchical methods assume a dissimilarity function specifying a numerical dissimilarity d(i, j) for each pair of objects i, j. If d satisfies the ultrametric inequality for every triple i, j, k
$$d(i, j) \le \max[d(i, k), d(j, k)] \qquad (2)$$
then the various hierarchical algorithms discussed in Sect. 1.2 all produce the same clusters. The dissimilarity between a given object and another object in a given cluster C is less than the dissimilarity between that object and another object not in C. Thus hierarchical clustering may be viewed as approximating the given dissimilarity matrix by an ultrametric; methods have been proposed that search for a good approximation based on a measure of closeness between different dissimilarities. With slightly more generality, evolutionary distances between a set of objects conform to a tree in which the objects are end nodes and the internal nodes are arbitrary. A nonnegative link distance is associated with each link of the tree, and the evolutionary distance between two objects is the sum of link distances on the path connecting them. Evolutionary distances are characterised by the following property: any four objects i, j, k, l may be partitioned into two pairs of objects in three ways; for some pairing, (i, j), (k, l) say, the evolutionary equality is satisfied
$$d(i, k) + d(j, l) = d(i, l) + d(j, k) \qquad (3)$$
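Both conditions are easy to check numerically on a small dissimilarity matrix. The sketch below is illustrative only: `is_ultrametric` tests inequality (2) over all ordered triples, and `satisfies_four_point` tests the usual four-point form of the evolutionary equality, in which the two largest of the three pairing sums agree (slightly stronger than requiring that some two sums agree). Both function names and the tolerance `tol` are assumptions.

```python
from itertools import combinations, permutations


def is_ultrametric(d, tol=1e-9):
    """True if d(i,j) <= max(d(i,k), d(j,k)) for every triple of objects."""
    n = len(d)
    return all(d[i][j] <= max(d[i][k], d[j][k]) + tol
               for i, j, k in permutations(range(n), 3))


def satisfies_four_point(d, tol=1e-9):
    """True if, for every four objects, the two largest pairing sums agree."""
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Path distances on a tree ((0,1),(2,3)) with end-node links 1, 2, 3, 4
# and a single internal link of length 5.
tree_dist = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
ultra = [[0, 1, 4], [1, 0, 4], [4, 4, 0]]
```

In the example, `tree_dist` satisfies the four-point condition but is not an ultrametric, which illustrates that evolutionary distances are the more general class.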
The pairing (i, j), (k, l) means that the tree path from i to j has no links in common with the tree path from k to l. We can construct an evolutionary tree on a set of objects by approximating a given dissimilarity by an evolutionary distance. One case where evolutionary distances arise probabilistically is in the factor analysis model in which the objects to be classified form a set of variables on which a number of independent p-dimensional observations have been made (Edwards and Cavalli-Sforza 1964). The variables are taken to be end nodes in an evolutionary tree whose links and internal objects remain to be determined. Associated with each link of the tree is an independent normal variable with mean zero and variance depending on the link; this variance is the link distance. Each observation for the variables is generated by assigning the random link values; then the variable $V_i - V_j$ is a sum of independent normal variables along the path of directed
links connecting i and j. The random normal associated with each link is of opposite sign when the link is traversed in the opposite direction. The quantities $d(i, j) = E(V_i - V_j)^2$ form a set of evolutionary distances, which are estimated by $\hat{d}(i, j)$, the sample average of $(V_i - V_j)^2$. For each observation, $\{V_1 - V_i,\ i = 2, \ldots, p\}$ is multivariate normal with a covariance matrix determined by the link variances, and the log likelihood may be written, for a certain weight function w,
$$\sum_{i,j} \bigl(\hat{d}(i, j) - d(i, j)\bigr)\, w_{i,j}(\sigma) \qquad (4)$$
This expression is maximized to estimate the unknown σ values. In this case the sample distances form a sufficient statistic for the parameters, and no information is lost in using distances rather than the complete data set. Similar discrete Markovian models due to Felsenstein (1973) reconstruct evolutionary trees from DNA sequence data. A nonparametric model for hierarchical clustering assumes that the objects $x_1, \ldots, x_n$ to be clustered are sampled randomly from a space with values $x \in S$ and density f with respect to some underlying measure, usually counting measure or Lebesgue measure. It is necessary to have a concept of connectedness in the space S. The family of connected sets of points having density $f(x) \ge c$, computed for each nonnegative c, together form a hierarchical clustering called high density clusters (Bock 1974, Hartigan 1975). The statistical approach estimates this hierarchical clustering on the density f from the given sample $x_1, \ldots, x_n$ by first estimating the density f by $\hat{f}$, say, then forming the estimated clusters as the high density clusters in $\hat{f}$. There are numerous parametric and nonparametric estimates of density available. How do the standard hierarchical clustering methods do in estimating these high density clusters? Badly. The single linkage clusters may be re-expressed as the high density clusters obtained from the nearest-neighbor density estimator; the density estimate at point x is a decreasing function of $d(x, x_i)$, where $x_i$ is the closest sample point to x. This estimator is inconsistent as an estimate of the density, and the high density clusters are correspondingly poorly estimated. However, it may be shown that the single linkage method is able to consistently identify distinct modes in the density. On the other hand, complete linkage corresponds to no kind of density estimate at all, and the clusters obtained from complete linkage are not consistent for anything at all, varying randomly as the amount of data increases.
In evaluating and comparing clustering methods, a statistical approach determines how well the various clustering methods recover some known clustering when data is generated according to some model based on the known clustering (Milligan 1980). The most frequently used model is the multivariate normal mixture model.
3. Validity of Clusters A typical agglomerative clustering produces $n - 1$ clusters for n objects, and different choices of dissimilarities and cluster combination rules will produce many different clusterings. It is necessary to decide ‘which dissimilarity, which agglomeration method, and how many clusters?’ In partition problems, if a parametric mixture model is used, standard statistical procedures described in Sect. 2.1 are available to help decide how many components are necessary. These models require specific assumptions about the distribution within components that can be critical to the results. The nonparametric high density cluster model associates clusters with modes in the density. The question about number of clusters translates into a question about number of modes. The threshold or kernel width test (Silverman 1981) for the number of modes is based on a kernel density estimator. For example, given a measure of dissimilarity and a threshold d, use the density estimate at each point x consisting of the sum of $1 - d(x, x_i)/d$ over observed objects $x_i$ within dissimilarity d of x. The number of modes in the population is estimated by the number of modes in the density estimate. To decide whether or not two modes are indicated, the quantity d is increased until, at $d = \hat{d}_1$, the estimated density $\hat{f}_1$ first has a single mode. If the value $\hat{d}_1$ required to render the estimate unimodal is large enough, then more than one mode is necessary. To decide when $\hat{d}_1$ is large, resample from the unimodal distribution $\hat{f}_1$, obtaining a distribution of resampled $\hat{d}_1$ values, that is, the values of d in each of those resamples required to make the density estimate unimodal. The tail probability associated with the given $\hat{d}_1$ is the proportion of resampled $\hat{d}_1$ that exceed the given $\hat{d}_1$. A similar procedure works for deciding whether or not a larger number of modes than two is justified. An alternative test that applies in one dimension for two modes vs. one uses the DIP statistic (Hartigan and Hartigan 1985), the maximum distance between the empirical distribution function and the unimodal distribution function that is closest to the empirical distribution function. The null distribution is based on sampling from the uniform.
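The first half of the kernel width procedure, finding the smallest threshold at which the density estimate becomes unimodal, can be sketched as follows. The sketch makes several simplifying assumptions that are not part of the original description: the data are one-dimensional with absolute difference as the dissimilarity, modes are counted on a finite grid, and the threshold is searched over a supplied list of candidate values. The resampling step that calibrates the critical width is omitted, and all function names are hypothetical.

```python
def density(x, sample, d):
    """Unnormalized estimate: sum of 1 - |x - x_i|/d over x_i within d of x."""
    return sum(1 - abs(x - xi) / d for xi in sample if abs(x - xi) < d)


def count_modes(sample, d, grid_size=400, tol=1e-12):
    """Count local maxima (plateaus count once) of the estimate on a grid."""
    lo, hi = min(sample) - d, max(sample) + d
    grid = [lo + (hi - lo) * t / grid_size for t in range(grid_size + 1)]
    values = [density(x, sample, d) for x in grid]
    modes, rising = 0, False
    for prev, cur in zip(values, values[1:]):
        if cur > prev + tol:
            rising = True
        elif cur < prev - tol and rising:
            modes += 1  # the estimate turned from rising to falling
            rising = False
    return modes


def critical_width(sample, widths):
    """Smallest candidate width at which the estimate has a single mode."""
    for d in sorted(widths):
        if count_modes(sample, d) == 1:
            return d
    return None


sample = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
modes_small = count_modes(sample, 0.5)
width = critical_width(sample, [0.5, 1, 2, 4, 6, 8])
```

A large critical width relative to the spread of the data, as here, is the informal signal that more than one mode is needed; the resampling step described above turns that impression into a tail probability.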
4. Probabilistic Search Consider searching for a classification C in a class Γ that will maximize l(C), the log probability of C given the data. Γ is usually too large for a complete search, and a local optimization method is used. Define a neighborhood of classifications N(C) for each classification C. Beginning with some initial classification $C_0$, steepest ascent selects $C_i$ to be the neighbor of $C_{i-1}$ that maximizes l(C), and the algorithm terminates at $C_m$ when no neighbor has greater probability. Of course, the local optimum depends on the initial classification
Statistical Clustering C , and the local optimum will not in general be the ! global optimum. The search may be widened probabilistically by randomized steepest ascent or simulated annealing (Klein and Dubes 1989), as follows: rather than choosing Ci to maximize l(C ) in the neighborhood of Ci− , choose Ci with probability proportional to " exp(l(C )\λ) from the neighborhood of Ci− . In run" highest ning the algorithm, we keep track of the probability classification obtained to date, and may return to that classification to restart the search. The degree of randomness λ may be varied during the course of the search. To do randomized ascent at randomness λ add to the log probability of each classification the independent random quantity kλ log E, where E is exponential; then do deterministic steepest ascent on the randomized criterion. To be specific, consider the probabilification of the k-means algorithm, that constructs say k clusters of n objects in dimensions to minimize the total within cluster sum of squares S. Use a probability model in which within cluster deviations are independent normal with variance σ#, the cluster means are uniformly distributed, and σ# has density σ−(#vk+%) (these prior assumptions match maximum likelihood), the log probability of a particular classification into two clusters given the data is k["(nj1)kj1] log S (5) # Steepest ascent k-means switches between clusters the observation that most decreases S; randomized steepest ascent switches the observation that most increases the randomized criterion k["(nj1)kj1] log Skλ log E (6) # Finally, to choose the number of clusters, allowing k to vary by permitting an observation to ‘switch’ to a new singleton cluster. 
Penalize against many clusters by the BIC penalty; the number of parameters (cluster means and the variance) specifying k clusters in dimensions is kj1, so the final randomized criterion is k"nk log Skλ log Ek"(kj1) log n (7) # # This formula adjusts only for the increase in log probability due to selecting optimal parameter values, and not for the increase in the log probability caused by selecting the highest probability partitions. A number of alternative choices of information criterion are proposed by Bozdogan (1988).
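The randomization device described in this section, adding an independent $-\lambda \log E$ to each log probability and then taking the maximum, can be illustrated in isolation. The sketch below assumes E is a standard (rate 1) exponential variable and that the neighborhood is given simply as a list of log probabilities; the function name `randomized_choice` is hypothetical. Under these assumptions each candidate is selected with probability proportional to $\exp(l/\lambda)$.

```python
import math
import random


def randomized_choice(log_probs, lam, rng):
    """Index maximizing l - lam * log(E), with E standard exponential."""
    perturbed = []
    for l in log_probs:
        e = max(rng.expovariate(1.0), 1e-300)  # guard against log(0)
        perturbed.append(l - lam * math.log(e))
    return max(range(len(perturbed)), key=perturbed.__getitem__)


rng = random.Random(1)
log_probs = [math.log(0.7), math.log(0.3)]

# With lam = 1 the selection frequencies should track the probabilities.
draws = [randomized_choice(log_probs, 1.0, rng) for _ in range(20000)]
freq_first = draws.count(0) / len(draws)

# With lam close to zero the rule reduces to deterministic steepest ascent.
greedy = randomized_choice([0.0, 5.0, 1.0], 1e-6, rng)
```

Lowering `lam` over the course of a search moves the procedure from broad exploration toward plain steepest ascent, which is the annealing idea mentioned above.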
5. Conclusion Although probabilistic and statistical methods have much to offer classification, in providing formal methods of constructing classes, in validating those classes, in evaluating algorithms, and in spiffing up algorithms, it is important to realise that classifications have something to contribute to statistics and probability as well. The early extensive development of taxonomies of animals and plants in biology proceeded quite satisfactorily without any formal statistics or probability through 1950; the big methodological change occurred in the late nineteenth century following Darwin, who demanded that biological classification should follow genealogy. Classification is a primitive, elementary, basic form of representing knowledge that is prior to statistics and probability. Many of the standard algorithms were developed without allowing for uncertainty. Many of them, such as complete linkage, do not seem supportable by any statistical reasoning. Others will benefit from statistical evaluations and probabilistic enhancements, as suggested in Hartigan (2000) and Gordon (2000). Classification is both fundamental and subjective. If an observer classifies a set of objects to be in a class, for whatever reason, that class is valid in itself. Thus, there is no correct natural classification that our formal techniques can aim for, or that our statistical techniques can test for. Since classification is fundamental, it should be used to express formal knowledge, and probabilistic inferences and predictions should be based on these classificatory expressions of knowledge (Hartigan 2000). More formal and refined classifications may be assisted and facilitated by probability models reflecting and summarizing the prior fundamental classifications.
See also: Algorithmic Complexity; Algorithms; Hierarchical Models: Random and Fixed Effects; Mixture Models in Statistics
Bibliography
Arabie P, Hubert L J, De Soete G (eds.) 1996 Clustering and Classification. World Scientific, Singapore
Bock H H 1972 Statistische Modelle und Bayes’sche Verfahren zur Bestimmung einer unbekannten Klassifikation normalverteilter zufälliger Vektoren. Metrika 18: 120–32
Bock H H 1974 Automatische Klassifikation. Theoretische und praktische Methoden zur Gruppierung und Strukturierung von Daten (Clusteranalyse). Vandenhoeck & Ruprecht, Göttingen, Germany
Bozdogan H 1988 A new model selection criterion. In: Bock H H (ed.) Classification and Related Methods of Data Analysis. North Holland, Amsterdam, pp. 599–608
Breiman L, Friedman J, Olshen R, Stone C 1984 Classification and Regression Trees. Chapman and Hall, New York
Edwards A W F, Cavalli-Sforza L L 1964 Reconstruction of evolutionary trees. In: Heywood V N, McNeill J (eds.) Phenetic and Phylogenetic Classification. The Systematics Association, London, pp. 67–76
Felsenstein J 1973 Maximum likelihood and minimum-steps methods for estimating evolutionary trees from data on discrete characters. Systematic Zoology 22: 240–49
Fitch W M, Margoliash E 1967 Construction of phylogenetic trees. Science 155: 279–84
Florek K, Lukaszewicz J, Perkal J, Steinhaus H, Zubrzycki S 1951 Sur la liaison et la division des points d’un ensemble fini. Colloquium Mathematicum 2: 282–5
Gordon A D 2000 Classification. Chapman and Hall, New York
Hartigan J A 1975 Clustering Algorithms. Wiley, New York
Hartigan J A 2000 Classifier probabilities. In: Kiers H A L, Rasson J-P, Groenen P J F, Schader M (eds.) Data Analysis, Classification, and Related Methods. Springer-Verlag, Berlin, pp. 3–15
Hartigan J A, Hartigan P M 1985 The dip test of unimodality. Annals of Statistics 13: 70–84
Jardine N, Sibson R 1968 The construction of hierarchic and non-hierarchic classifications. Computer Journal 11: 77–84
Klein R W, Dubes R C 1989 Experiments in projection and clustering by simulated annealing. Pattern Recognition 22: 213–20
Kruskal J B 1956 On the shortest spanning subtree of a graph and the travelling salesman problem. Proceedings of the American Mathematical Society 7: 48–50
Lance G N, Williams W T 1967 A general theory of classificatory sorting strategies. I: Hierarchical systems. Computer Journal 9: 373–80
MacQueen J 1967 Some methods for classification and analysis of multivariate observations. In: LeCam L M, Neyman J (eds.) Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, Berkeley, CA, pp. 281–97
Milligan G W 1980 An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika 45: 325–42
Pearson K 1894 Contributions to the mathematical theory of evolution. Philosophical Transactions of the Royal Society of London, Series A 185: 71–110
Pollard D 1981 Strong consistency of k-means clustering. Annals of Statistics 9: 135–40
Silverman B W 1981 Using kernel density estimates to investigate multimodality. Journal of the Royal Statistical Society B 43: 97–9
Sokal R R, Sneath P H A 1963 Principles of Numerical Taxonomy. Freeman, San Francisco
J. A. Hartigan
Statistical Data, Missing
1. Introduction Missing values arise in social science data for many reasons. For example:
(a) In longitudinal studies, data are missing because of attrition, i.e., subjects drop out prior to the end of the study.
(b) In most surveys, some individuals provide no information because of noncontact or refusal to respond (unit nonresponse). Other individuals are contacted and provide some information, but fail to answer some of the questions (item nonresponse). See Sample Surveys: The Field.
(c) Information about a variable is partially recorded. A common example in social science is right censoring, where times to an event (death, finding a job) are recorded, and for some individuals the event has still not taken place when the study is terminated. The times for these subjects are known to be greater than that corresponding to the latest time of observation, but the actual time is unknown. Another example of partial information is interval censoring, where it is known that the time to an event lies in an interval. For example, in a longitudinal study of employment, it may be established that the event of finding a job took place some time between two interviews.
(d) Often indices are constructed by summing values of particular items. For example, in economic studies, total net worth is a combination of values of individual assets or liabilities, some of which may be missing. If any of the items that form the index are missing, some procedure is needed to deal with the missing data.
(e) Missing data can arise by design. For example, suppose one objective in a study of obesity is to estimate the distribution of a measure $Y_1$ of body fat in the population, and correlate it with other factors. Suppose $Y_1$ is expensive to measure but a proxy measure $Y_2$, such as body mass index, which is a function of height and weight, is inexpensive to measure. Then it may be useful to measure $Y_2$ and covariates, X, for a large sample but $Y_1$, $Y_2$, and X for a smaller subsample. The subsample allows predictions of the missing values of $Y_1$ to be generated for the larger sample, yielding more efficient estimates than are possible from the subsample alone.
Unless missing data are deliberately incorporated by design (such as in (e)), the most important step in dealing with missing data is to try to avoid it during the data-collection stage. Because data are likely to be missing in any case, however, it is important to try to collect covariates that are predictive of the missing values, so that an adequate adjustment can be made.
In addition, the process that leads to missing values should be determined during the collection of data if possible, since this information helps to model the missing-data mechanism when an adjustment for the missing values is performed (Little 1995). Three major approaches to the analysis of missing data can be distinguished: (a) discard incomplete cases and analyze the remainder (complete-case analysis), as discussed in Sect. 3; (b) impute or fill in the missing values and then analyze the filled-in data, as discussed in Sect. 4; and (c) analyze the incomplete data by a method that does not require a complete (that is, a rectangular) data set. With regard to (c), we focus in Sects. 6, 7, and 8 on powerful likelihood-based methods, specifically maximum likelihood (ML) and Bayesian simulation. See Bayesian Statistics. The latter is closely related to multiple imputation (Rubin 1987, 1996), an extension of single imputation that allows uncertainty in the imputations to be reflected appropriately in the analysis, as discussed in Sect. 5. A basic assumption in all our methods is that missingness of a particular value hides a true underlying value that is meaningful for analysis. This may seem obvious but is not always the case. For example, consider a longitudinal analysis of employment; for subjects who leave the study because they move to a different location, it makes sense to consider employment history after relocation as missing, whereas for subjects who die during the course of the study, it is clearly not reasonable to consider employment history after time of death as missing; rather, it is preferable to restrict the analysis of employment to individuals who are alive. A more complex missing data problem arises when individuals leave a study for unknown reasons, which may include relocation or death. Because the problem of missing data is so pervasive in the social and behavioral sciences, it is very important that researchers be aware of the availability of principled approaches that address the problem. Here we review common approaches, which have quite limited success, as well as principled approaches, which generally address the problem with substantial success.
2. Pattern and Mechanism of Missing Data The pattern of the missing data simply indicates which values in the data set are observed and which are missing. Specifically, let $Y = (y_{ij})$ denote an $(n \times p)$ rectangular data set without missing values, with ith row $y_i = (y_{i1}, \ldots, y_{ip})$, where $y_{ij}$ is the value of variable $Y_j$ for subject i. With missing values, the pattern of missing data is defined by the missing-data indicator matrix $M = (m_{ij})$, such that $m_{ij} = 1$ if $y_{ij}$ is missing and $m_{ij} = 0$ if $y_{ij}$ is present. Some methods for handling missing data apply to any pattern of missing data, whereas other methods assume a special pattern. An important example of a special pattern is univariate nonresponse, where missingness is confined to a single variable. Another is monotone missing data, where the variables can be arranged so that $Y_{j+1}, \ldots, Y_p$ are missing for all cases where $Y_j$ is missing, for all $j = 1, \ldots, p - 1$. The result is a ‘staircase’ pattern, where all variables and units to the left of the broken line forming the staircase are observed, and all to the right are missing. This pattern arises commonly in longitudinal data subject to attrition, where once a subject drops out, no subsequent data are observed. The missing-data mechanism addresses the reasons why values are missing, and in particular whether these reasons relate to values in the data set. For example, subjects in a longitudinal intervention may be more likely to drop out of a study because they feel the treatment was ineffective, which might be related to a poor value of an outcome measure. Rubin (1976) treated M as a random matrix, and characterized the missing-data mechanism by the conditional distribution of M given Y, say $f(M \mid Y, \phi)$, where φ denotes unknown parameters. When missingness does not depend on the values of the data Y, missing or observed, that is,
$$f(M \mid Y, \phi) = f(M \mid \phi) \quad \text{for all } Y, \phi,$$
the data are called missing completely at random (MCAR). MCAR does not mean that the pattern itself is random, but rather that missingness does not depend on the data values. An MCAR mechanism is plausible in planned missing-data designs as in example (e) above, but is a strong assumption when missing data do not occur by design, because missingness often does depend on recorded variables. Let $Y_{obs}$ denote the observed values of Y and $Y_{mis}$ the missing values. A less restrictive assumption is that missingness depends only on values $Y_{obs}$ that are observed, and not on values $Y_{mis}$ that are missing. That is:
$$f(M \mid Y, \phi) = f(M \mid Y_{obs}, \phi) \quad \text{for all } Y_{mis}, \phi.$$
The missing-data mechanism is then called missing at random (MAR). Many methods for handling missing data assume the mechanism is MCAR, and yield biased estimates when the data are not MCAR. Better methods rely only on the MAR assumption.
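The indicator matrix M and the two mechanisms can be made concrete with a small sketch. It assumes a data set stored as a list of rows with `None` marking a missing value, and it checks the monotone pattern only for the column order as given (the definition allows the variables to be rearranged first). The helper names and the particular deletion rules are illustrative assumptions, not part of the original text.

```python
import random


def indicator_matrix(data):
    """M with m_ij = 1 if y_ij is missing (None) and 0 if it is present."""
    return [[1 if value is None else 0 for value in row] for row in data]


def is_monotone(M):
    """True if, in each row, every entry after the first missing one is missing."""
    return all(row == sorted(row) for row in M)


def delete_mcar(data, col, rate, rng):
    """MCAR: each value in column col is deleted with the same probability."""
    return [[None if (j == col and rng.random() < rate) else v
             for j, v in enumerate(row)] for row in data]


def delete_mar(data, col, driver, cutoff):
    """MAR: column col is deleted only where the observed driver exceeds cutoff."""
    return [[None if (j == col and row[driver] > cutoff) else v
             for j, v in enumerate(row)] for row in data]


attrition = [[1.0, 2.0, 3.0], [1.5, 2.5, None], [0.5, None, None]]
M = indicator_matrix(attrition)

complete = [[1, 10], [5, 20], [2, 30], [7, 40]]
mar = delete_mar(complete, col=1, driver=0, cutoff=3)
mcar = delete_mcar(complete, col=1, rate=0.5, rng=random.Random(0))
```

In `delete_mar` missingness depends only on a fully observed variable, so the data are MAR but not MCAR; analyzing only the complete rows of `mar` would over-represent cases with small values of the driver.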
3. Complete-case, Available-case, and Weighting Analysis A common and simple method is complete-case (CC) analysis, also known as listwise deletion, where incomplete cases are discarded and standard analysis methods applied to the complete cases. In many statistical packages this is the default analysis. Valid (but often suboptimal) inferences are obtained when the missing data are MCAR, since then the complete cases are a random subsample of the original sample with respect to all variables. However, complete-case analysis can result in the loss of a substantial amount of information available in the incomplete cases, particularly if the number of variables is large. A serious problem with dropping incomplete cases is that the complete cases are often easily seen to be a biased sample, that is, the missing data are not MCAR. The size of the resulting bias depends on the degree of deviation from MCAR, the amount of missing data, and the specifics of the analysis. In sample surveys this motivates strenuous attempts to limit unit nonresponse through multiple follow-ups, and surveys with high rates of unit nonresponse (say 30 percent or more) are often considered unreliable for making inferences to the whole population. A modification of CC analysis, commonly used to handle unit nonresponse in surveys, is to weight respondents by the inverse of an estimate of the probability of response. A simple approach is to form adjustment cells (or subclasses) based on background variables measured for respondents and nonrespondents; for unit nonresponse adjustment, these are often based on geographical areas or groupings of similar areas based on aggregate socioeconomic data. All nonrespondents are given zero weight and the nonresponse weight for all respondents in an adjustment cell is then the inverse of the response rate in that cell. This method removes the component of nonresponse bias attributable to differential nonresponse rates across the adjustment cells, and eliminates bias if within each adjustment cell respondents can be regarded as a random subsample of the original sample within that cell (i.e., the data are MAR given indicators for the adjustment cells). With more extensive background information, a useful alternative approach is response propensity stratification, where (a) the indicator for unit nonresponse is regressed on the background variables, using the combined data for respondents and nonrespondents and a method such as logistic regression appropriate for a binary outcome; (b) a predicted response probability is computed for each respondent based on the regression in (a); and (c) adjustment cells are formed based on a categorized version of the predicted response probability. Theory (Rosenbaum and Rubin 1983) suggests that this is an effective method for removing nonresponse bias attributable to the background variables when unit nonresponse is MAR. Although weighting methods can be useful for reducing nonresponse bias, they do have serious limitations. First, information in the incomplete cases is still discarded, so the method is inefficient. Weighted estimates can have unacceptably high variance, as when outlying values of a variable are given large weights. Second, variance estimation for weighted estimates with estimated weights is problematic. See Estimation: Point and Interval.
Explicit formulas are available for simple estimators such as means under simple random sampling (Oh and Scheuren 1983), but methods are not well developed for more complex problems, and often ignore the component of variability arising from estimating the weight from the data.

Available-case (AC) analysis (Little and Rubin 1987, Sect. 3.3) is a straightforward attempt to exploit the incomplete information by using all the cases available to estimate each individual parameter. For example, suppose the objective is to estimate the correlation matrix of a set of continuous variables $Y_1, \ldots, Y_p$. See Multivariate Analysis: Overview. Complete-case analysis uses the set of complete cases to estimate all the correlations; AC analysis uses all the cases with both $Y_j$ and $Y_k$ observed to estimate the correlation of $Y_j$ and $Y_k$, $1 \le j, k \le p$. Since the sample base of available cases for measuring each correlation includes at least the set of complete cases, the AC method appears to make better use of available information. The sample base changes from correlation to correlation, however,
creating potential problems when the missing data are not MCAR or variables are highly correlated. In the presence of high correlations, there is no guarantee that the AC correlation matrix is even positive definite. Haitovsky’s (1968) simulations concerning regression with highly correlated continuous data found AC markedly inferior to CC. On the other hand, Kim and Curry (1977) found AC superior to CC in simulations based on weakly correlated data. Simulation studies comparing AC regression estimates with maximum likelihood (ML) under normality (Sect. 6) suggest that ML is superior even when underlying normality assumptions are moderately violated (Little 1988a). Although AC estimates are easy to compute, standard errors are more complex. The method cannot be generally recommended, even under the restrictive MCAR.
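The difference between the complete-case and available-case sample bases can be made concrete with a small sketch. The toy data, the use of `None` to mark a missing value, and the helper names `pearson` and `corr` are all assumptions introduced for illustration; they are not part of the original article.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def corr(rows, j, k, complete_case):
    """Correlation of columns j and k, plus the number of cases used."""
    if complete_case:
        # CC: a case is used only if every variable is observed.
        used = [r for r in rows if all(v is not None for v in r)]
    else:
        # AC: only columns j and k need to be observed.
        used = [r for r in rows if r[j] is not None and r[k] is not None]
    return pearson([r[j] for r in used], [r[k] for r in used]), len(used)

# Toy data (hypothetical): three variables, None marks a missing value.
rows = [
    (1.0, 2.1, 0.5),
    (2.0, 3.9, None),
    (3.0, 6.2, 1.4),
    (4.0, 7.8, None),
    (5.0, 10.1, 2.6),
    (6.0, None, 3.1),
]

cc_r, cc_n = corr(rows, 0, 1, complete_case=True)
ac_r, ac_n = corr(rows, 0, 1, complete_case=False)
print(f"CC: r={cc_r:.3f} from {cc_n} cases; AC: r={ac_r:.3f} from {ac_n} cases")
```

Because each pair of variables can draw on a different subset of cases, a matrix assembled from such pairwise correlations is not guaranteed to be positive definite, which is the difficulty noted above.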
4. Single Imputation

Methods that impute or fill in the missing values have the advantage that, unlike CC analysis, observed values in the incomplete cases are retained. A simple version of the method imputes missing values by their unconditional sample means based on the observed data. This method can yield satisfactory point estimates of means and totals, but it yields inconsistent estimates of other parameters such as variances, covariance matrices or regression coefficients, and it overstates precision (Little and Rubin 1987, Chap. 3). Thus it cannot be recommended in any generality.

An improvement is conditional mean imputation, in which each missing value is replaced by an estimate of its conditional mean given the values of the observed variables. For example, in the case of univariate nonresponse with $Y_1, \ldots, Y_{p-1}$ fully observed and $Y_p$ sometimes missing, one approach is to classify cases into cells based on similar values of observed variables, and then to impute missing values of $Y_p$ by the within-cell mean from the complete cases in that cell. A more general approach is regression imputation, in which the regression of $Y_p$ on $Y_1, \ldots, Y_{p-1}$ is estimated from the complete cases, and the resulting prediction equation is used to impute the estimated conditional mean for each missing value of $Y_p$. For a general pattern of missing data, the missing values for each case can be imputed from the regression of the missing variables on the observed variables, computed using the set of complete cases. Iterative versions of this method lead (with some important adjustments) to ML estimates under multivariate normality (Little and Rubin 1987, Chap. 8).

Although conditional mean imputation incorporates information from the observed variables and yields best predictions of the missing values in the sense of mean squared error, it leads to distorted estimates of quantities that are not linear in the data, such as percentiles, correlations and other measures of
association, as well as variances and other measures of variability. A solution to this problem of point estimation is to use random draws rather than best predictions to preserve the distribution of variables in the filled-in data set. An example is stochastic regression imputation, in which each missing value is replaced by its regression prediction plus a random error with variance equal to the estimated residual variance.

Hot-deck methods impute actual values observed in the dataset. An example is the Current Population Survey (CPS) hot deck for imputing income items in that survey. For each nonrespondent on one or more income items, the CPS hot deck finds a matching respondent based on variables that are observed for both; the missing income items for the nonrespondent are then replaced by the respondent’s values. When no match can be found for a nonrespondent based on all of the variables, the algorithm searches for a match at a lower level of detail, obtained by omitting some variables and collapsing the categories of others. A more general approach to hot-deck imputation matches nonrespondents and respondents using a distance function based on the observed variables. For example, predictive mean matching (Little 1988b) matches nonrespondents to respondents that have similar predictive means from a regression of the missing variable on observed variables. This method is somewhat robust to misspecification of the regression model used to create the matching metric.

The imputation methods discussed so far assume the missing data are MAR. In contrast, models that are not missing at random (NMAR) assert that even if a respondent and nonrespondent to $Y_p$ appear identical with respect to observed variables $Y_1, \ldots, Y_{p-1}$, their $Y_p$-values differ systematically. Lillard et al. (1986), for example, discussed NMAR imputation models for missing CPS data.
It is also possible to create a NMAR hot-deck procedure; for example, respondents’ values that are to be imputed to nonrespondents could be multiplied by an inflation or deflation factor that depends on the variables that are observed. A crucial point about the use of NMAR models is that there is never direct evidence in the data to address the validity of their underlying assumptions. Thus, whenever NMAR models are being considered, it is prudent to consider several NMAR models and explore the sensitivity of analyses to the choice of model (Rubin 1977, Little and Wang 1996).
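The contrast between unconditional mean imputation, regression (conditional mean) imputation, and stochastic regression imputation can be sketched as follows. The simulated data, the 40 percent missingness rate, and all variable names are assumptions chosen for illustration only; the sketch uses a single fully observed predictor and ordinary least squares on the complete cases.

```python
import random
import statistics

random.seed(0)

# Simulated data (hypothetical): y depends linearly on x, and roughly
# 40 percent of the y values are deleted completely at random.
n = 500
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y_full = [2.0 + 1.5 * xi + random.gauss(0.0, 1.0) for xi in x]
y = [yi if random.random() > 0.4 else None for yi in y_full]

obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
x_obs = [xi for xi, _ in obs]
y_obs = [yi for _, yi in obs]

# Unconditional mean imputation.
y_bar = statistics.fmean(y_obs)
y_mean_imp = [yi if yi is not None else y_bar for yi in y]

# Least-squares regression of y on x, estimated from the complete cases.
x_bar = statistics.fmean(x_obs)
sxx = sum((xi - x_bar) ** 2 for xi in x_obs)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in obs) / sxx
intercept = y_bar - slope * x_bar
resid_var = sum((yi - intercept - slope * xi) ** 2 for xi, yi in obs) / (len(obs) - 2)

# Regression imputation, and its stochastic version with an added
# random error whose variance is the estimated residual variance.
y_reg_imp = [yi if yi is not None else intercept + slope * xi
             for xi, yi in zip(x, y)]
y_stoch_imp = [yi if yi is not None
               else intercept + slope * xi + random.gauss(0.0, resid_var ** 0.5)
               for xi, yi in zip(x, y)]

for name, filled in [("mean", y_mean_imp), ("regression", y_reg_imp),
                     ("stochastic", y_stoch_imp)]:
    print(f"{name:>10}: mean={statistics.fmean(filled):.3f} "
          f"var={statistics.variance(filled):.3f}")
print(f"  observed: mean={y_bar:.3f} var={statistics.variance(y_obs):.3f}")
```

Mean imputation reproduces the observed mean exactly but necessarily shrinks the sample variance of the filled-in data, since every imputed value contributes nothing to the sum of squares; the stochastic version is intended to restore the spread that the two deterministic methods remove.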
5. Multiple Imputation

A serious defect with imputation is that it invents data. A single imputed value cannot represent the uncertainty about which value to impute, so analyses that treat imputed values just like observed values generally underestimate uncertainty, even if nonresponse is modeled correctly and randomly drawn imputations are created. Large-sample results show that for simple
situations with 30 percent of the information missing, single imputation under the correct model results in nominal 90 percent confidence intervals having actual coverages below 80 percent. The inaccuracy of nominal levels is even more extreme in multiparameter testing problems (Rubin 1987, Chap. 4).

A modification of imputation that fixes this problem is multiple imputation (MI) (Rubin 1987). Instead of imputing a single set of draws for the missing values, a set of $M$ (say $M = 5$) datasets is created, each containing different sets of draws of the missing values from their joint predictive distribution. We then apply the standard complete-data analysis to each of the $M$ datasets and combine the results in a simple way. In particular, for scalar estimands, the MI estimate is the average of the estimates from the $M$ datasets, and the variance of the estimate is the average of the variances from the $M$ datasets plus $1 + 1/M$ times the sample variance of the estimates over the $M$ datasets (the factor $1 + 1/M$ is a small-$M$ correction). The last quantity here estimates the contribution to the variance from imputation uncertainty, missed (i.e., set to zero) by single imputation methods. Another benefit of multiple imputation is that the averaging over datasets results in more efficient point estimates than does single random imputation. Often MI is not much more difficult than doing a single imputation: the additional computing from repeating an analysis $M$ times is not a major burden, and methods for combining inferences are straightforward. Most of the work involves generating good predictive distributions for the missing values, which is needed for consistent point estimation in any case. A more detailed discussion of multiple imputation appears elsewhere in this encyclopedia.
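The combining rule for scalar estimands described above is short enough to write out directly. The function name `combine_mi` and the numerical inputs are hypothetical; the sketch only implements the average of the estimates and the average within-imputation variance plus $1 + 1/M$ times the between-imputation sample variance.

```python
from statistics import fmean, variance

def combine_mi(estimates, variances):
    """Combine M complete-data estimates and their variances.

    Returns (mi_estimate, total_variance), where the total variance is the
    average within-imputation variance plus (1 + 1/M) times the
    between-imputation sample variance of the estimates.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need M >= 2 estimates with matching variances")
    within = fmean(variances)
    between = variance(estimates)  # sample variance, divisor M - 1
    return fmean(estimates), within + (1 + 1 / m) * between

# Hypothetical results from M = 5 imputed datasets.
est, tot = combine_mi([10.2, 9.8, 10.5, 10.1, 9.9],
                      [0.40, 0.42, 0.39, 0.41, 0.43])
print(f"MI estimate {est:.3f}, standard error {tot ** 0.5:.3f}")
```

Treating any one of the five datasets as if it were fully observed would report only its own within-imputation variance and so omit the between-imputation term.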
6. Maximum Likelihood for Ignorable Models

There are statistical methods that allow the analysis of a nonrectangular dataset without having to impute the missing values. In particular, maximum likelihood (ML) avoids imputation by formulating a statistical model and basing inference on the likelihood function of the incomplete data. Define $Y$ and $M$ as above, and let $X$ denote an $(n \times q)$ matrix of fixed covariates, assumed fully observed, with $i$th row $x_i = (x_{i1}, \ldots, x_{iq})$, where $x_{ij}$ is the value of covariate $X_j$ for subject $i$. Covariates that are not fully observed should be treated as random variables and modeled with the set of $Y_j$’s (Little and Rubin 1987, Chap. 5). The data and missing-data mechanism are modeled in terms of a joint distribution for $Y$ and $M$ given $X$. Selection models specify this distribution as:

$$f(Y, M \mid X, \theta, \psi) = f(Y \mid X, \theta)\, f(M \mid Y, X, \psi) \tag{1}$$

where $f(Y \mid X, \theta)$ is the model in the absence of missing values, $f(M \mid Y, X, \psi)$ is the model for the missing-data mechanism, and $\theta$ and $\psi$ are unknown parameters. The likelihood of $\theta, \psi$ given the data $Y_{obs}, M, X$ is then proportional to the density of $Y_{obs}, M$ given $X$ regarded as a function of the parameters $\theta, \psi$, and this is obtained by integrating out the missing data $Y_{mis}$ from Eqn. (1), that is:

$$L(\theta, \psi \mid Y_{obs}, M, X) = \text{const} \times \int f(Y, M \mid X, \theta, \psi)\, dY_{mis} \tag{2}$$

The likelihood of $\theta$ ignoring the missing-data mechanism is obtained by integrating the missing data from the marginal distribution of $Y$ given $X$, that is:

$$L(\theta \mid Y_{obs}, X) = \text{const} \times \int f(Y \mid X, \theta)\, dY_{mis} \tag{3}$$
The likelihood (3) is easier to work with than (2) since it is easier to handle computationally; more important, it avoids the need to specify a model for the missing-data mechanism, about which little is known in many situations. Hence it is important to determine when valid likelihood inferences are obtained from (3) instead of the full likelihood (2). Rubin (1976) showed that valid inferences about $\theta$ are obtained from (3) when the data are MAR, that is:

$$p(M \mid X, Y, \psi) = p(M \mid X, Y_{obs}, \psi) \quad \text{for all } Y_{mis}, \psi$$

If in addition $\theta$ and $\psi$ are distinct in the sense that they have disjoint sample spaces, then likelihood inferences about $\theta$ based on (3) are equivalent to inferences based on (2); the missing-data mechanism is then called ignorable for likelihood inferences.

Large-sample inferences about $\theta$ based on an ignorable model are based on ML theory, which states that under regularity conditions

$$\theta - \hat{\theta} \sim N_k(0, C) \tag{4}$$

where $\hat{\theta}$ is the value of $\theta$ that maximizes (3), and $N_k(0, C)$ is the $k$-variate normal distribution with mean zero and covariance matrix $C$, given by the inverse of an information matrix; for example $C = I^{-1}(\hat{\theta})$, where $I$ is the observed information matrix $I(\theta) = -\partial^2 \log L(\theta \mid Y_{obs}, X) / \partial\theta\, \partial\theta^T$, or $C = J^{-1}(\hat{\theta})$, where $J(\theta)$ is the expected value of $I(\theta)$. As in Little and Rubin (1987), Eqn. (4) is written to be open to a frequentist interpretation if $\hat{\theta}$ is regarded as random and $\theta$ fixed, or a Bayesian interpretation if $\theta$ is regarded as random and $\hat{\theta}$ fixed. Thus if the data are MAR, the likelihood approach reduces to developing a suitable model for the data and computing $\hat{\theta}$ and $C$.

In many problems maximization of the likelihood to compute $\hat{\theta}$ requires numerical methods. Standard optimization methods such as Newton-Raphson or scoring can be applied. Alternatively, the Expectation-Maximization (EM) algorithm (Dempster et al. 1977) can be applied, a general algorithm for incomplete-data problems that provides an interesting link with imputation methods. The history of EM, which dates back to the beginning of this century for particular problems, is sketched in Little and Rubin (1987) and McLachlan and Krishnan (1997).
7. The EM Algorithm for Ignorable Models
For ignorable models, let $L(\theta \mid Y_{obs}, Y_{mis}, X)$ denote the likelihood of $\theta$ based on the hypothetical complete data $Y = (Y_{obs}, Y_{mis})$ and covariates $X$. Let $\theta^{(t)}$ denote an estimate of $\theta$ at iteration $t$ of EM. Iteration $t + 1$ consists of an E-step and an M-step. The E-step computes the expectation of $\log L(\theta \mid Y_{obs}, Y_{mis}, X)$ over the conditional distribution of $Y_{mis}$ given $Y_{obs}$ and $X$, evaluated at $\theta = \theta^{(t)}$, that is,

$$Q(\theta \mid \theta^{(t)}) = \int \log L(\theta \mid Y_{obs}, Y_{mis}, X)\, f(Y_{mis} \mid Y_{obs}, X, \theta^{(t)})\, dY_{mis}.$$

When the complete data belong to an exponential family with complete-data sufficient statistics $S$, the E-step simplifies to computing expected values of these statistics given the observed data and $\theta = \theta^{(t)}$, thus in a sense ‘imputing’ the sufficient statistics. The M-step determines $\theta^{(t+1)}$ to maximize $Q(\theta \mid \theta^{(t)})$ with respect to $\theta$. In exponential family cases, this step is the same as for complete data, except that the complete-data sufficient statistics $S$ are replaced by their estimates from the E-step. Thus the M-step is often easy or available with existing software, and the programming work is mainly confined to E-step computations.

Under very general conditions, each iteration of EM strictly increases the log-likelihood, and under more restrictive but still general conditions, EM converges to a maximum of the likelihood function. If a unique finite ML estimate of $\theta$ exists, EM will find it. Little and Rubin (1987) provided many applications of EM to particular models, including: multivariate normal data with a general pattern of missing values and the related problem of multivariate linear regression with missing data; robust inference based on multivariate t models; loglinear models for multiway contingency tables with missing data; and the general location model for mixtures of continuous and categorical variables, which yields ML algorithms for logistic regression with missing covariates.
An extensive bibliography of the myriad of EM applications is given in Meng and Pedlow (1992). The EM algorithm is reliable but has a linear convergence rate determined by the fraction of missing information. Extensions of EM (McLachlan and Krishnan 1997) can have much faster convergence: ECM, ECME, AECM, and PX-EM (Liu et al. 1998). EM-type algorithms are particularly attractive in problems where the number of parameters is large. Asymptotic standard errors based on bootstrap methods (Efron 1994) are also available, at least in simpler situations. An attractive alternative is to switch to a Bayesian simulation method that simulates the posterior distribution of θ (see Sect. 9).
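As an illustration of the E-step and M-step in an exponential-family case, the following sketch runs EM for bivariate normal data in which the first variable is always observed and the second is sometimes missing. The function name `em_bivariate_normal`, the simulated data, and the particular MAR mechanism are assumptions made for this example; the sufficient statistics are the sums, sums of squares and the sum of cross-products.

```python
import random

def em_bivariate_normal(pairs, iterations=200):
    """EM estimates (mu1, mu2, s11, s12, s22) when y2 may be None.

    Assumes bivariate normal data, y1 always observed, and an
    ignorable missing-data mechanism.
    """
    n = len(pairs)
    cc = [(a, b) for a, b in pairs if b is not None]
    # Starting values from the complete cases.
    mu1 = sum(a for a, _ in cc) / len(cc)
    mu2 = sum(b for _, b in cc) / len(cc)
    s11 = sum((a - mu1) ** 2 for a, _ in cc) / len(cc)
    s22 = sum((b - mu2) ** 2 for _, b in cc) / len(cc)
    s12 = sum((a - mu1) * (b - mu2) for a, b in cc) / len(cc)
    for _ in range(iterations):
        # E-step: expected complete-data sufficient statistics.
        beta = s12 / s11
        alpha = mu2 - beta * mu1
        resid = s22 - beta * s12           # Var(y2 | y1)
        t1 = t2 = t11 = t12 = t22 = 0.0
        for a, b in pairs:
            if b is None:
                e_b = alpha + beta * a     # E[y2 | y1]
                e_bb = e_b ** 2 + resid    # E[y2^2 | y1]
            else:
                e_b, e_bb = b, b * b
            t1 += a
            t2 += e_b
            t11 += a * a
            t12 += a * e_b
            t22 += e_bb
        # M-step: complete-data ML formulas applied to expected statistics.
        mu1, mu2 = t1 / n, t2 / n
        s11 = t11 / n - mu1 ** 2
        s12 = t12 / n - mu1 * mu2
        s22 = t22 / n - mu2 ** 2
    return mu1, mu2, s11, s12, s22

# Simulated data (hypothetical): y2 is more often missing when y1 is large,
# so missingness depends only on the observed y1 (MAR).
random.seed(1)
pairs = []
for _ in range(300):
    a = random.gauss(0.0, 1.0)
    b = 1.0 + 0.8 * a + random.gauss(0.0, 0.6)
    pairs.append((a, None if (a > 0.5 and random.random() < 0.7) else b))

mu1, mu2, s11, s12, s22 = em_bivariate_normal(pairs)
cc_mean = sum(b for _, b in pairs if b is not None) / sum(b is not None for _, b in pairs)
print(f"EM estimate of mu2: {mu2:.3f}; complete-case mean of y2: {cc_mean:.3f}")
```

Note that the E-step does not simply plug in the conditional mean for a missing value: the expected square also includes the residual variance, which is one of the "important adjustments" that separate EM from iterated regression imputation.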
8. Maximum Likelihood for Nonignorable Models

Nonignorable, non-MAR models apply when missingness depends on the missing values (see Sect. 2). A correct likelihood analysis must be based on the full likelihood from a model for the joint distribution of $Y$ and $M$. The standard likelihood asymptotics apply to nonignorable models providing the parameters are identified, and computational tools such as EM also apply to this more general class of models. However, often information to estimate both the parameters of the missing-data mechanism and the parameters of the complete-data model is very limited, and estimates are sensitive to misspecification of the model. Often a sensitivity analysis is needed to see how much the answers change for various assumptions about the missing-data mechanism.

There are two broad classes of models for the joint distribution of $Y$ and $M$. Selection models model the joint distribution as in Eqn. (1). Pattern-mixture models specify:

$$f(Y, M \mid X, \pi, \phi) = f(Y \mid X, M, \phi)\, f(M \mid X, \pi) \tag{5}$$

where $\phi$ and $\pi$ are unknown parameters, and now the distribution of $Y$ is conditioned on the missing-data pattern $M$ (Little 1993). Equations (1) and (5) are simply two different ways of factoring the joint distribution of $Y$ and $M$. When $M$ is independent of $Y$, the two specifications are equivalent with $\theta = \phi$ and $\psi = \pi$. Otherwise (1) and (5) generally yield different models. Rubin (1977) and Little (1993) also consider pattern-set mixture models, a more general class of models that includes pattern-mixture and selection models as special cases.

Although most of the literature on missing data has concerned selection models of the form (1) for univariate nonresponse, pattern-mixture models seem more natural when missingness defines a distinct stratum of the population of intrinsic interest, such as individuals reporting ‘don’t know’ in an opinion survey.
However, pattern-mixture models can also provide inferences for parameters $\theta$ of the complete-data distribution, by expressing the parameters of interest as functions of the pattern-mixture model parameters $\phi$ and $\pi$. An advantage of the pattern-mixture modeling approach over selection models is that assumptions about the form of the missing-data mechanism are sometimes less specific in their parametric form, since they are incorporated in the model via parameter restrictions. This idea is explained in specific normal models in Little and Wang (1996), for example.

Heitjan and Rubin (1991) extended the formulation of missing-data problems via the joint distribution of $Y$ and $M$ to more general incomplete-data problems involving coarsened data. The idea is to replace the binary missing-data indicators by coarsening variables that map the $y_{ij}$ values to coarsened versions $z_{ij}(y_{ij})$. Full and ignorable likelihoods can be defined for this
more general setting. This theory provides a bridge between missing-data theory and theories of censoring in the survival analysis literature. The EM algorithm can be applied to nonignorable models by including the missing-data (or coarsening) indicator in the complete-data likelihood.
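A very small sensitivity analysis in the pattern-mixture spirit can be sketched for a single variable with two patterns, respondents and nonrespondents. The restriction used here, that the nonrespondent mean equals the respondent mean plus an offset `delta`, is an assumption introduced purely for illustration and is not a model taken from the article; the function name and data are likewise hypothetical. The point is that `delta` cannot be estimated from the data, so one varies it and reports how the answer moves.

```python
from statistics import fmean

def pattern_mixture_mean(observed, n_missing, delta):
    """Overall mean under a two-pattern mixture.

    Respondents have mean mu_obs (estimated from the data); nonrespondents
    are ASSUMED to have mean mu_obs + delta, a restriction that the data
    cannot check.
    """
    n_obs = len(observed)
    pi_mis = n_missing / (n_obs + n_missing)   # estimated share of nonrespondents
    mu_obs = fmean(observed)
    return (1 - pi_mis) * mu_obs + pi_mis * (mu_obs + delta)

# Hypothetical survey: 6 reported incomes (in thousands), 4 nonrespondents.
incomes = [31.0, 42.5, 38.0, 55.0, 47.5, 36.0]
for delta in (-10.0, -5.0, 0.0, 5.0, 10.0):
    est = pattern_mixture_mean(incomes, n_missing=4, delta=delta)
    print(f"delta={delta:+5.1f}  estimated population mean={est:.2f}")
```

Setting `delta` to zero reproduces the respondent mean, which corresponds to treating the two patterns as having the same distribution; nonzero values express departures in the nonignorable direction.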
9. Bayesian Simulation Methods

Maximum likelihood is most useful when sample sizes are large, since then the log-likelihood is nearly quadratic and can be summarized well using the ML estimate of $\theta$ and its large-sample variance-covariance matrix. When sample sizes are small, a useful alternative approach is to add a prior distribution for the parameters and compute the posterior distribution of the parameters of interest. For ignorable models this is

$$p(\theta \mid Y_{obs}, M, X) \equiv p(\theta \mid Y_{obs}, X) = \text{const} \times p(\theta \mid X) \times f(Y_{obs} \mid X, \theta)$$
where $p(\theta \mid X)$ is the prior distribution and $f(Y_{obs} \mid X, \theta)$ is the density of the observed data. Since the posterior distribution rarely has a simple analytic form for incomplete-data problems, simulation methods are often used to generate draws of $\theta$ from the posterior distribution $p(\theta \mid Y_{obs}, M, X)$. Data augmentation (Tanner and Wong 1987) is an iterative method of simulating the posterior distribution of $\theta$ that combines features of the EM algorithm and multiple imputation, with $M$ imputations of each missing value at each iteration. It can be thought of as a small-sample refinement of the EM algorithm using simulation, with the imputation step corresponding to the E-step (random draws replacing expectation) and the posterior step corresponding to the M-step (random draws replacing MLE).

An important special case of data augmentation arises when $M$ is set equal to one, yielding the following special case of the ‘Gibbs sampler.’ Start with an initial draw $\theta^{(0)}$ from an approximation to the posterior distribution of $\theta$. Given a value $\theta^{(t)}$ of $\theta$ drawn at iteration $t$:

(a) Draw $Y_{mis}^{(t+1)}$ with density $p(Y_{mis} \mid Y_{obs}, X, \theta^{(t)})$;
(b) Draw $\theta^{(t+1)}$ with density $p(\theta \mid Y_{obs}, Y_{mis}^{(t+1)}, X)$.

The procedure is motivated by the fact that the distributions in (a) and (b) are often much easier to draw from than the correct posterior distributions $p(Y_{mis} \mid Y_{obs}, X)$ and $p(\theta \mid Y_{obs}, X)$. The iterative procedure can be shown in the limit to yield a draw from the joint posterior distribution of $Y_{mis}, \theta$ given $Y_{obs}, X$. The algorithm was termed chained data augmentation in Tanner (1991). This algorithm can be run independently $K$ times to generate $K$ iid draws from the approximate joint posterior distribution of $\theta$ and $Y_{mis}$. Techniques for monitoring the convergence of such algorithms are discussed in a variety of places (Gelman et al. 1995, Sect. 11.4). Schafer (1996) developed data
augmentation algorithms that use iterative Bayesian simulation to multiply-impute rectangular data sets with arbitrary patterns of missing values when the missing-data mechanism is ignorable. The methods are applicable when the rows of the complete-data matrix can be modeled as iid observations from the multivariate normal, multinomial loglinear or general location models.
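Steps (a) and (b) of chained data augmentation can be sketched for about the simplest possible model. The assumptions here are all introduced for illustration: the data are normal with unknown mean $\theta$ and known unit variance, the prior on $\theta$ is flat, the mechanism is ignorable, and `None` marks a missing value. Under these assumptions the posterior of $\theta$ is known in closed form (normal, centred at the observed mean, with variance one over the number of observed values), which makes the sampler easy to check.

```python
import random
from statistics import fmean, variance

def data_augmentation(y, draws=5000, burn_in=200, seed=0):
    """Chained data augmentation for the mean of N(theta, 1) data.

    Assumptions of this sketch: known unit variance, flat prior on theta,
    ignorable missing-data mechanism. None marks a missing value.
    """
    rng = random.Random(seed)
    y_obs = [v for v in y if v is not None]
    n, n_mis = len(y), len(y) - len(y_obs)
    sum_obs = sum(y_obs)
    theta = fmean(y_obs)  # starting value
    kept = []
    for t in range(burn_in + draws):
        # (a) Imputation step: draw the missing values given theta.
        y_mis = [rng.gauss(theta, 1.0) for _ in range(n_mis)]
        # (b) Posterior step: draw theta given the completed data.
        theta = rng.gauss((sum_obs + sum(y_mis)) / n, (1.0 / n) ** 0.5)
        if t >= burn_in:
            kept.append(theta)
    return kept

# Hypothetical data: 100 values with true mean 3, about 30 percent missing.
rng = random.Random(42)
y = [rng.gauss(3.0, 1.0) if rng.random() > 0.3 else None for _ in range(100)]
y_obs = [v for v in y if v is not None]

theta_draws = data_augmentation(y)
print(f"simulated posterior mean {fmean(theta_draws):.3f}, "
      f"observed-data mean {fmean(y_obs):.3f}")
print(f"simulated posterior variance {variance(theta_draws):.4f}, "
      f"1 / n_obs = {1 / len(y_obs):.4f}")
```

Successive draws from a single chain are correlated, so the retained values here are not the independent draws obtained by running the algorithm $K$ separate times; they are used only to summarize the posterior.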
10. Overview

Our survey of methods for dealing with missing data is necessarily brief. In general, the advent of modern computing has made more principled methods of dealing with missing data practical (e.g., multiple imputation, Rubin 1996). Such methods are finding their way into current commercial and free software. Schafer (1996) and the forthcoming new edition of Little and Rubin (1987) are excellent places to learn more about missing data methodology.
Bibliography

Dempster A P, Laird N M, Rubin D B 1977 Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society Series B 39: 1–38
Efron B 1994 Missing data, imputation and the bootstrap. Journal of the American Statistical Association 89: 463–79
Gelman A, Carlin J, Stern H, Rubin D B 1995 Bayesian Data Analysis. Chapman & Hall, New York
Haitovsky Y 1968 Missing data in regression analysis. Journal of the Royal Statistical Society B 30: 67–81
Heitjan D, Rubin D B 1991 Ignorability and coarse data. Annals of Statistics 19: 2244–53
Kim J O, Curry J 1977 The treatment of missing data in multivariate analysis. Sociological Methods and Research 6: 215–40
Lillard L, Smith J P, Welch F 1986 What do we really know about wages: the importance of nonreporting and census imputation. Journal of Political Economy 94: 489–506
Little R J A 1988a Robust estimation of the mean and covariance matrix from data with missing values. Applied Statistics 37: 23–38
Little R J A 1988b Missing data adjustments in large surveys. Journal of Business and Economic Statistics 6: 287–301
Little R J A 1993 Pattern-mixture models for multivariate incomplete data. Journal of the American Statistical Association 88: 125–34
Little R J A 1995 Modeling the drop-out mechanism in longitudinal studies. Journal of the American Statistical Association 90: 1112–21
Little R J A, Rubin D B 1987 Statistical Analysis with Missing Data. Wiley, New York
Little R J A, Wang Y-X 1996 Pattern-mixture models for multivariate incomplete data with covariates. Biometrics 52: 98–111
Liu C H, Rubin D B, Wu Y N 1998 Parameter expansion for EM acceleration: the PX-EM algorithm. Biometrika 85: 755–70
McLachlan G J, Krishnan T 1997 The EM Algorithm and Extensions. Wiley, New York
Meng X L, Pedlow S 1992 EM: a bibliographic review with missing articles. Proceedings of the Statistical Computing Section, American Statistical Association 1992, 24–27
Oh H L, Scheuren F S 1983 Weighting adjustments for unit nonresponse. In: Madow W G, Olkin I, Rubin D B (eds.) Incomplete Data in Sample Surveys, Vol. 2, Theory and Bibliographies. Academic Press, New York, pp. 143–84
Rosenbaum P R, Rubin D B 1983 The central role of the propensity score in observational studies for causal effects. Biometrika 70: 41–55
Rubin D B 1976 Inference and missing data. Biometrika 63: 581–92
Rubin D B 1977 Formalizing subjective notions about the effect of nonrespondents in sample surveys. Journal of the American Statistical Association 72: 538–43
Rubin D B 1987 Multiple Imputation for Nonresponse in Surveys. Wiley, New York
Rubin D B 1996 Multiple imputation after 18+ years. Journal of the American Statistical Association 91: 473–89 [with discussion, 507–15; and rejoinder, 515–7]
Schafer J L 1996 Analysis of Incomplete Multivariate Data. Chapman & Hall, London
Tanner M A 1991 Tools for Statistical Inference: Observed Data and Data Augmentation Methods. Springer-Verlag, New York
Tanner M A, Wong W H 1987 The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82: 528–50
R. J. Little and D. B. Rubin
Statistical Identification and Estimability

A statistical identification problem arises when a sample of infinite size, assumed to be generated by an element of a class of statistical models indexed by a parameter, does not lead to certain knowledge of one or more components of that parameter. Roughly, a model is identified if enough observed data could allow one to distinguish elements of the parameter set from one another. Statistical identification problems are ubiquitous, occurring in fields as diverse as reliability, optimal control theory, experimental design, and epidemiology. A rich array of statistical identification problems appears in the social sciences: latent structure analysis, factor analysis, and analysis of covariance structures are based on multivariate statistical models that possess, by construction, this problem. Nonresponse and data censoring in the conduct of sample surveys often lead to an identification problem, and the problem is a central feature of the analysis of econometric structural equation models and of errors-in-variables regression models. Experimental designs in which treatments are confounded lead to an identification problem. A nonparametric identification problem occurs in the theory of competing risks.
Understanding if a statistical model is identified, nonidentified, or unidentified is a necessary prelude to the construction of a sensible method for making inferences about model parameters from observed data.
1. Background
The problem of statistical identifiability appears in economics as early as Pigou (1910). Thurstone discusses identification in his applications of factor analysis to intelligence (1935, 1947) and, in his latent structure analysis of opinion research and attitudes, Lazarsfeld deals with identification issues (1950). Principal contributions to a rigorous mathematical treatment of the problem began in the 1950s with the work of Haavelmo (1944), Hurwicz (1950), Koopmans et al. (1950) and Koopmans and Reiersøl (1950) and spawned a huge literature on the subject. Fisher (1966) provides a thorough rigorous treatment of identification for simultaneous equation systems. He laid to rest the issue of whether or not identifiability is a discrete or continuous property of parameter estimation by demonstrating that estimators which are consistent given a correct model specification converge in probability to true parameter values as specification error approaches zero. Thus identifiability may be regarded as a continuous property of a set of parameters and small specification errors lead to small inconsistencies in parameter estimation.

Rothenberg (1971) and Bowden (1973) extend identification theory to embrace nonlinear models, distinguishing local from global identifiability. Hannan (1969, 1971) generalizes the theory of identification for linear structural equation systems with serially uncorrelated residual errors in another direction: he establishes conditions for identifiability for the important case of auto-regressive moving average error terms (see Time Series: General).

Probabilistic models employed in linear system optimal control theory have much in common with econometric structural (simultaneous) equation systems in which observations are measured with random error. In the psychometric literature, the latter class of models goes under the acronym LISREL (Linear Structural Relationships).
Jöreskog and Sörbom (1984) have developed methods for specifying LISREL model structure, algorithms for calculating various types of parameter estimators and a framework for interpretation of results. As with econometric structural equation systems, applications of LISREL span a wide domain of social science applications, marketing, psychology, sociology, and organizational behavior among them.

A Bayesian treatment of identification began with Dawid (1979) and Kadane’s (1974) studies of identification in Bayesian theory and Drèze’s (1972, 1974) analysis of econometric simultaneous equation systems from a Bayesian perspective. Identification
through a Bayesian lens is currently a vigorous area of research yielding new insights into how to construct meaningful prior distributions for parameters.
2. Example 1
A sample survey is conducted to determine attitudes of individuals in a population toward a particular subject. The sample frame consists of $N$ individuals, each of whose response is classified as either ‘Favorable’ or ‘Unfavorable.’ This sample yields $N_R$ respondents and $N_S = N - N_R$ non-respondents. Of the $N_R$ respondents, $N_{FR}$ are classified as ‘Favorable’ and $N_{UR} = N_R - N_{FR}$ are classified as ‘Unfavorable.’ In the absence of information bearing directly on the fraction of non-respondents who would be classified as ‘Favorable,’ statistics available for inference are $N_{FR}$, $N_{UR}$, $N_S$, subject to $N_{FS} + N_{US} = N_S$ and to $N_{FR} + N_{UR} + N_S = N$.

Suppose that the process generating data about both respondents and non-respondents is modeled as a double dichotomy with probabilities $p_{FR}$, $p_{UR}$, $p_{FS}$ and $p_{US} \ge 0$ and $p_{FR} + p_{UR} + p_{FS} + p_{US} = 1$ as shown in Table 1. Table 2 shows the corresponding table for the model of observed data. The probability distribution for observations $N_{FR}$, $N_{UR}$ and $N_S$ in Table 1 is proportional to

$$(p_{FS} + p_{US})^{N_S}\, p_{FR}^{N_{FR}}\, p_{UR}^{N_{UR}}$$

Observations $N_{FR}$, $N_{UR}$ and $N_S$ are sufficient for joint inference about $p_S = p_{FS} + p_{US}$, $p_{FR}$ and $p_{UR}$, but even an infinite sample will not lead to knowledge of $p_{FS}$ and $p_{US}$ with certainty: $p_{FS}$ and $p_{US}$ are not identifiable in the presence of non-response, as an infinite sample yields knowledge only that $p_{FS}$ and $p_{US}$ satisfy $p_{FS} + p_{US} = p_S$.
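The lack of identifiability in this example can be checked numerically: any two parameter settings that share $p_{FR}$, $p_{UR}$ and the sum $p_{FS} + p_{US}$ give the same probability to the observed counts. The counts and probabilities below are hypothetical, and `log_likelihood` is an illustrative helper that evaluates the logarithm of the expression above, omitting the multinomial constant.

```python
import math

def log_likelihood(p_fr, p_ur, p_fs, p_us, n_fr, n_ur, n_s):
    """Log of the observed-data probability, up to an additive constant."""
    assert abs(p_fr + p_ur + p_fs + p_us - 1.0) < 1e-12
    return (n_s * math.log(p_fs + p_us)
            + n_fr * math.log(p_fr)
            + n_ur * math.log(p_ur))

# Hypothetical counts: 50 favorable and 30 unfavorable respondents,
# 20 non-respondents.
counts = dict(n_fr=50, n_ur=30, n_s=20)

# Two structures that differ only in how p_S = 0.2 is split between
# favorable and unfavorable non-respondents.
theta_a = dict(p_fr=0.5, p_ur=0.3, p_fs=0.15, p_us=0.05)
theta_b = dict(p_fr=0.5, p_ur=0.3, p_fs=0.02, p_us=0.18)

ll_a = log_likelihood(**theta_a, **counts)
ll_b = log_likelihood(**theta_b, **counts)
print(ll_a, ll_b)  # the two values agree up to floating-point rounding
```

No amount of additional data of the same kind separates the two settings, because the observed counts enter the likelihood only through $p_{FR}$, $p_{UR}$ and $p_S$.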
[Table 1]

[Table 2]

3. Definitions

A statistical ‘model’ is a family $\{P_\theta, \theta \in \Theta\}$ of probability distributions indexed by a parameter $\theta \in \Theta$. A statistical ‘structure’ is $P_\theta$ for a particular $\theta = \theta_0 \in \Theta$. Parameters $\theta$ and $\theta' \in \Theta$ are ‘observationally equivalent’ if and only if $P_\theta(A) = P_{\theta'}(A)$ for all $A$ in a measurable space. Definition: $\Theta$ is ‘identifiable’ if and only if $\theta = \theta'$ for all observationally equivalent pairs $\theta, \theta' \in \Theta$.

In practice, it is useful to couch these definitions in terms of distribution functions: let $X$ be an observable random variable with range $\mathcal{X}$ and distribution function $F_\theta(x) \equiv P(X \le x \mid \theta)$ belonging to a model $\{F_\theta, \theta \in \Theta\}$. If there exists at least one pair $(\theta, \theta')$ for which $\theta, \theta' \in \Theta$ and $\theta \ne \theta'$ such that $F_\theta(x) = F_{\theta'}(x)$ for all $x \in \mathcal{X}$, then we say that $\Theta$ is ‘non-identifiable’ by $X$. If $\Theta$ is not non-identifiable by $X$ it is identifiable. These definitions of identifiability can be recast to embrace nonparametric models as well (Basu 1981, Berman 1963).
4. Example 2

In general, unobservables in an ‘econometric simultaneous equation model’ are restricted to residuals (errors in model equations). A ‘factor analysis model,’ on the other hand, possesses both unobservable latent and unobservable explanatory variables. Each is a special case of a more general ‘structural equation model’ in which a ‘latent variable model’ displaying interactions among latent variables and a ‘measurement model’ specifying how observable variables are related to latent variables are conjoined. Properties of the covariance matrix of observable variables are at center stage and structural equation test procedures are designed to measure the quality of fit between an observed (sample) covariance matrix $\Sigma$ and a model-based covariance matrix $\Sigma(\theta)$ indexed by a vector $\theta$ of model parameters. Even when the structure of relations between observable variables and latent variables and the structure of relations among latent variables are both linear, the model covariance matrix $\Sigma(\theta)$ may be a complicated, nonlinear function of elements of the parameter vector $\theta$. In this setting, in order for an element of the parameter vector $\theta$ to be identified it must be an explicit, known function of one or more elements of the sample covariance matrix $\Sigma$ under the hypothesis that $\Sigma = \Sigma(\theta)$. (See Bollen (1989) for a comprehensive treatment. Also Hoyle (1995) for applications.) If the positive definite symmetric (PDS) matrix $\Sigma(\theta)$ is $(k \times k)$ it has at most $\tfrac{1}{2}k(k+1)$ functionally independent components, so for $\theta$ to be identifiable, it can possess at most $\tfrac{1}{2}k(k+1)$ functionally independent components. This is a generic, easy-to-apply, necessary but not sufficient condition for identifiability of $\theta$.
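The counting condition just stated is simple to automate. The function names below are hypothetical, and the one-factor model used to exercise them is an illustrative assumption rather than an example from the article.

```python
def max_identifiable_parameters(k):
    """Number of functionally independent elements of a k x k symmetric matrix."""
    return k * (k + 1) // 2

def passes_counting_rule(n_free_parameters, k):
    """Necessary (not sufficient) condition for identifiability of theta."""
    return n_free_parameters <= max_identifiable_parameters(k)

# Hypothetical one-factor model with 3 indicators: 3 loadings, 3 error
# variances and 1 factor variance give 7 free parameters, but a 3 x 3
# covariance matrix has only 6 distinct elements.
print(passes_counting_rule(7, k=3))  # False
# Fixing one loading to 1 (a common scaling restriction) leaves 6.
print(passes_counting_rule(6, k=3))  # True
```

Passing the check says nothing about sufficiency; a model can satisfy the count and still contain parameters that are not identified.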
In the notation of Bollen (1989) we have a

Latent Variable Model:

$$\eta = B^{*}\eta + \Gamma\xi + \zeta$$

The $(m \times m)$ coefficient matrix $B^{*}$ induces interactions among latent endogenous variables $(m \times 1)$ $\eta$, the $(m \times r)$ coefficient matrix $\Gamma$ establishes the effect of exogenous variables $(r \times 1)$ $\xi$ on $\eta$, and $(m \times 1)$ $\zeta$ represents a random residual error term assumed to be uncorrelated with $\xi$ and to possess mean zero and finite (PDS) variance matrix $\Psi$. In addition $B \equiv I - B^{*}$ is assumed to be nonsingular.

Measurement Model:

$$y = \Lambda_y \eta + \varepsilon, \qquad x = \Lambda_x \xi + \delta$$

The elements of the $(p \times m)$ coefficient matrix $\Lambda_y$ determine the magnitude of the effect of levels and changes in levels of latent variables $\eta$ on observable variables $y$; $(q \times r)$ $\Lambda_x$ relates exogenous variables $\xi$ to observable variables in $x$ in a similar fashion. Residual errors $(p \times 1)$ $\varepsilon$ and $(q \times 1)$ $\delta$ are random variables, generally assumed to be uncorrelated with one another, with $\eta$ and with $\xi$, and to possess common mean zero but possibly different (PDS) covariance matrices $\Theta_\varepsilon$ and $\Theta_\delta$ respectively. To be specific, focus on the composition of the variance matrix $\mathrm{Var}(y)$ of $y$ as a function of latent variable model parameters $B$, $\Gamma$, $\Phi$, and $\Psi$ (where $\Phi$ denotes the variance matrix of $\xi$) and measurement model parameters $\Lambda_y$ and $\Theta_\varepsilon$:

$$\mathrm{Var}(y) = \Lambda_y B^{-1}\left[\Gamma\Phi\Gamma^{T} + \Psi\right](B^{-1})^{T}\Lambda_y^{T} + \Theta_\varepsilon$$

The $(p \times p)$ PDS matrix $\mathrm{Var}(y)$ possesses $\tfrac{1}{2}p(p+1)$ functionally independent components. The variance matrix $\mathrm{Var}(\varepsilon) = \Theta_\varepsilon$ alone also possesses $\tfrac{1}{2}p(p+1)$ functionally independent components and, in the absence of restrictions other than that $\Phi$ and $\Psi$ are PDS, $\Lambda_y$, $B$, $\Gamma$, $\Phi$, and $\Psi$ possess $\tfrac{1}{2}m(m+1) + \tfrac{1}{2}r(r+1) + m(m+p+r)$ additional functionally independent components. As the above equation suggests, it
can be a formidable task to verify by direct algebraic analysis whether or not a particular model parameter is identifiable. Clearly, parameter matrices $\Lambda_y$, $B$, $\Gamma$, $\Phi$, and $\Theta_\varepsilon$ cannot be expressed as unique functions of elements of $\mathrm{Var}(y)$ unless restrictions are placed upon them. When one or more components of $B$ cannot be recovered uniquely as a function of elements of $\mathrm{Var}(y)$, such components are called ‘under-identified’ or ‘nonidentified.’ A central task, then, is to select constraints on latent variable model parameters and measurement model parameters that render desired parameters ‘identifiable.’ If a priori restrictions on latent variable and measurement model parameters are specified in a way that leads to two or more distinct representations of some components of $B$ as a function of components of $\mathrm{Var}(y)$, those components of $B$ are called ‘over-identified.’ It can happen, of course, that some elements of $B$ are identified, some are under-identified and others are over-identified. Bartels (1985) provides an easily accessible treatment of these concepts.

A common form of a priori constraint is to ‘specify’ that some parameter values are equal to zero. For example, if the modeler assumes that $y$ and $x$ are measured without error, so that $\Lambda_y = I$ $(m \times m)$, $\Lambda_x = I$ $(r \times r)$ and $\mathrm{Var}(\varepsilon) = \Theta_\varepsilon = 0$, $\mathrm{Var}(\delta) = \Theta_\delta = 0$, the variance of $y$ becomes $\mathrm{Var}(y) = B^{-1}\left[\Gamma\Phi\Gamma^{T} + \Psi\right](B^{-1})^{T}$. If, in turn, as is often the case in econometric applications, exogenous variables $x$ are assumed to be predetermined and known with certainty, $\Phi = 0$. With all of these restrictions in force $\eta = y$, $\xi = x$ and the latent variable model becomes $By = \Gamma z + \delta$ (writing $z$ for the predetermined variables and $\delta$ for the residual). Then observable $y = \Pi z + u$, with $\Pi \equiv B^{-1}\Gamma$, $u = B^{-1}\delta$ and $\mathrm{Var}(y) = B^{-1}\Psi(B^{-1})^{T} \equiv \Omega$, is called the ‘reduced form’ of this model.
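A minimal numeric sketch of the reduced form may help. It takes the structural form $By = \Gamma z + \delta$, so that $\Pi = B^{-1}\Gamma$ and $\Omega = B^{-1}\Psi(B^{-1})^{T}$, and uses made-up $2 \times 2$ matrices in exact rational arithmetic; the matrices, the transformation `F`, and all helper names are assumptions for illustration only. The sketch also shows that multiplying the structural equations through by a nonsingular matrix changes $B$, $\Gamma$, and $\Psi$ but leaves $\Pi$ and $\Omega$ untouched.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv2(a):
    """Inverse of a nonsingular 2 x 2 matrix with Fraction entries."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is singular")
    return [[s / det, -q / det], [-r / det, p / det]]

def reduced_form(B, Gamma, Psi):
    """Return (Pi, Omega) for the structural form B y = Gamma z + delta."""
    B_inv = inv2(B)
    Pi = matmul(B_inv, Gamma)
    Omega = matmul(matmul(B_inv, Psi), transpose(B_inv))
    return Pi, Omega

# Hypothetical structural parameters (m = 2 endogenous, r = 2 predetermined).
B = [[Fr(1), Fr(-2)], [Fr(1), Fr(3)]]
Gamma = [[Fr(4), Fr(0)], [Fr(0), Fr(5)]]
Psi = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]
F = [[Fr(2), Fr(1)], [Fr(1), Fr(1)]]  # an arbitrary nonsingular matrix

Pi, Omega = reduced_form(B, Gamma, Psi)
# Premultiplying the system by F gives structure (FB, F Gamma, F Psi F^T).
Pi_alt, Omega_alt = reduced_form(matmul(F, B), matmul(F, Gamma),
                                 matmul(matmul(F, Psi), transpose(F)))
```

The two structures are different, yet `Pi == Pi_alt` and `Omega == Omega_alt`, which is the sense in which data on $y$ given $z$ cannot distinguish them.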
Provided that predetermined variables $z$ are uncorrelated with variables $u$ (weaker assumptions are possible) and that the asymptotic variance matrix of the $z$’s exists, both $\Pi$ and $\Omega$ can be consistently estimated by ordinary least squares. However, the distribution of $y$ given $z$ now depends exclusively on parameters $\Pi$ and $\Omega$. Pre-multiplication of both sides of $By = \Gamma z + \delta$ by any nonsingular $(m \times m)$ matrix leads to the same reduced form, so all such structures are observationally equivalent and without further restrictions the system is under-identified. Fisher (1966) provides an elegant and thorough exposition of conditions for identifiability of linear simultaneous equation system model parameters. Among the most important are ‘order’ and ‘rank’ conditions established by Koopmans et al. (1950). Suppose interest focuses on a single equation and that exclusion restrictions alone on $y$ and $z$ are in force. A (necessary but not sufficient) ‘order condition’ for
parameters of this equation to be identifiable is that the number of variables excluded be greater than or equal to $m - 1$, one less than the number of components in (endogenous) $y$. Alternatively, the total number of excluded predetermined variables must be greater than or equal to the number of endogenous variables not excluded from this equation, less one (Fisher 1966, pp. 40–1). The following argument due to Fisher is a transparent explanation of the origin of the ‘rank condition’ for linear simultaneous equation systems, first proved by Koopmans et al. (1950). Exclusion of the $l$th element of the first row $a_1$ of the $m \times (m+r)$ matrix $A \equiv [B, \Gamma]$ is representable as $a_1 c(l) = 0$, with $c(l)$ an $(m+r) \times 1$ column vector with a 1 at element $l$ and zeros elsewhere. Thus $k$ such exclusions of distinct variables may be written in terms of an $(m+r) \times k$ matrix $C$ as $a_1 C = 0$. Clearly if $a_1$ is some linear combination of other rows of $A$, it is not identifiable. Thus, a sufficient condition for $a_1$ to be identified is that $xAC = 0$ holds only for $(1 \times m)$ vectors $x$ of the form $x = (\alpha, 0, \ldots, 0)$, $\alpha$ a scalar. In the absence of other a priori restrictions on elements of $a_1$, this condition is also necessary. When it obtains, the rank of $AC$ must be $m - 1$. The order condition follows from the observation that the rank of $AC$ must be less than or equal to that of $C$, so the number of independent restrictions imposed by $C$ must be at least $m - 1$.
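The rank condition can be checked mechanically once $A$ and the exclusions are written down. The sketch below computes the rank of $AC$ by Gaussian elimination in exact arithmetic and compares it with $m - 1$; the two coefficient matrices are hypothetical, chosen so that both satisfy the order condition while only one satisfies the rank condition, and variable indices are 0-based as a coding convenience.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0
    for col in range(n_cols):
        pivot = next((i for i in range(found, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][col] / rows[found][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[found])]
        found += 1
    return found

def exclusion_matrix(excluded, n_vars):
    """Matrix C whose columns are unit vectors c(l), one per excluded index l."""
    return [[1 if row == l else 0 for l in excluded] for row in range(n_vars)]

def rank_condition_holds(A, excluded):
    """Check rank(AC) == m - 1 for the first equation (first row of A)."""
    m = len(A)
    C = exclusion_matrix(excluded, len(A[0]))
    return rank(matmul(A, C)) == m - 1

# Hypothetical systems with m = 2, r = 2; the first equation excludes variable 3.
A_identified = [[1, 2, 3, 0],
                [1, -1, 0, 4]]   # the second equation uses variable 3
A_not_identified = [[1, 2, 3, 0],
                    [1, -1, 5, 0]]  # the second equation excludes it as well

ok = rank_condition_holds(A_identified, excluded=[3])
fails = rank_condition_holds(A_not_identified, excluded=[3])
```

In both systems one variable is excluded from the first equation, so the order condition ($1 \ge m - 1$) is met; the rank condition fails in the second system because the excluded variable appears in no other equation.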
5. Estimation

When a linear statistical model is just identified, both consistent and unbiased estimators of its parameters exist. A more general notion of estimability for linear statistical models, first proposed by Bose (1944, 1947), says that a parameter $\theta$ of such a model is (linearly) estimable if there exists a linear function of observations with expectation equal to $\theta$ for each $\theta$ in a prespecified set $\Theta$. This special case of identifiability, apparently developed independently of the more general theory of identification outlined in this review, has generated a large literature of its own (see Factor Analysis and Latent Structure: Overview). Deistler and Seifert (1978) address the interplay between identification and estimability for a quite general (possibly nonlinear) version of an econometric structural equation model without measurement error. They answer the question ‘When, for such systems, does identifiability imply existence of consistent estimators?’ Jöreskog (1973) and Wiley (1973) were among the first to specify practical procedures for estimation of parameters by analysis of the covariance structure of observations generated by a general structural equation model of the form described in Example 2. For a detailed description of parameter estimation via maximum likelihood, instrumental variables and least squares see, for example, Jöreskog and Sörbom (1984).
The entry Simultaneous Equation Estimates (Exact and Approximate), Distribution of is a crisp summary of methods designed to recover consistent estimates of structural equation parameters, such as Full Information Maximum Likelihood (FIML), Limited Information Maximum Likelihood (LIML), R-class estimators, and two- and three-stage least squares (2SLS, 3SLS). A more leisurely presentation of econometric simultaneous equation system identification issues, couched in terms of a two-equation supply and demand model, and the nature of estimation problems they pose, appears in Simultaneous Equation Estimation: Overview.
6. Bayesian Identification
Non-Bayesians typically assign exact numerical values to a subset of parameters of a statistical model in order to achieve identification of one or more other parameters. Fisher (1961) suggested broadening this special type of a priori specification by replacing exact numerical values with probabilistic constraints. Drèze (1962) took up the challenge in his treatment of Bayesian identification for simultaneous equation estimation models. As before, let $By + \Gamma z = u$ be a ‘structural equation system’ indexed by matrix parameters consisting of nonsingular $B$, $\Gamma$, and PDS $\mathrm{Var}(u) = \Sigma$, with reduced form $y = \Pi z + v$, $\Pi = -B^{-1}\Gamma$ and disturbance variance $\mathrm{Var}(v) = B^{-1}\Sigma(B^{-1})^{T} \equiv \Omega$. Because the likelihood function for observations $y$ can be written as a function of reduced form parameters $\Pi$ and $\Omega$ alone, full knowledge of the exact numerical values of elements of $\Pi$ and $\Omega$ does not permit recovery of unique values for elements of $B$, $\Gamma$, and $\Sigma$. The map from $\Pi$ and $\Omega$ to $B$, $\Gamma$, and $\Sigma$ is one to many. In effect, all nonsingular matrices $B$ are observationally equivalent and $B$ is only identifiable in the special case when $f(B \mid \Pi, \Omega)$ is concentrated on a single point in the space of nonsingular matrices $B$. Nevertheless, one can assign a prior density $g(B, \Gamma, \Sigma)$ to structural parameters, observe data $(Y, Z)$ and compute a posterior density $f(B, \Gamma, \Sigma \mid Y, Z)$ for $B$, $\Gamma$, and $\Sigma$ whether or not the model is identified in a classical sense. Because identifiability is a property of the likelihood function alone, if a proper prior distribution is assigned to a complete set of parameters ($B$, $\Pi$, and $\Omega$ here) the posterior for these parameters is proper whether or not the likelihood function is identified. Such priors must be chosen judiciously: too tight a prior will anchor the posterior on the prior until sample size becomes very large; a very loose proper prior may fail to reflect relevant prior knowledge.
In order to streamline computation of posteriors, Raiffa and Schlaifer (1961) introduced the concept of ‘natural conjugate’ priors. The functional form of a natural conjugate prior is derived by interpreting parameter as sufficient statistic and sufficient statistic as parameter in the likelihood function. A consequence is that both prior and posterior possess the same functional form, a form dictated by the shape of the likelihood function. For simultaneous equation systems, multivariate-, matric- and poly-t posterior distributions for parameters follow and figure prominently in work by Drèze (1976), Drèze and Morales (1976), Drèze and Richard (1983), and Mouchard (1976). (See Bayesian Statistics.) To minimize the impact of a priori evidence or, in some cases, to exploit invariance with respect to mapping from one representation of a parameter to another, a diffuse (vague, noninformative) prior can be the prior of choice. However, in some settings, this choice comes at a cost, as a diffuse prior may lead to an improper posterior. Lindley and El Sayeed (1968) foreshadow much recent work on the character of diffuse priors. They were perhaps the first to recognize several important features of prior to posterior analysis. First, a proper prior always yields a proper posterior. Second, ‘… in the functional relationship problem [improper priors] can cause considerable difficulties … .’ Third, even when the likelihood function is not identifiable, observed data can lead to a posterior that is informative; i.e., different from the prior. Fourth, ‘Uncritical use of improper distributions can lead to inconsistent estimates.’ A simple example often used to introduce the idea of nonidentifiability highlights issues raised by diffuse priors.
7. Example 3

Suppose observations to be independent, identically normal with mean $\theta = \mu_1 + \mu_2$ and variance 1. Then $\mu_1$ and $\mu_2$ are non-identifiable. Nevertheless, assignment of a proper prior to $\mu_1$ and $\mu_2$ leads to a proper posterior. As the number of observations approaches infinity, the posterior for $\theta$ converges in distribution to a true value $\theta_0$ of $\theta$ and the (proper) posterior for $\mu_1$ and $\mu_2$ concentrates on the line $\theta_0 = \mu_1 + \mu_2$.

If, however, a diffuse prior $d\mu_1$, $-\infty < \mu_1 < \infty$, is assigned to $\mu_1$ and a proper prior $g(\theta)$ to $\theta$, the posterior for $\mu_1$ remains diffuse irrespective of sample size. Replace the proper prior $g(\theta)$ for $\theta$ with a proper prior $h(\mu_2)$ for $\mu_2$. Then, provided the prior for $\mu_1$ is diffuse, the posterior for $\mu_2$ remains the same as the prior $h(\mu_2)$, no matter how many observations are made.

These features of Example 3 suggest that uncritical choice of a diffuse prior can lead to a posterior with unsatisfactory or even unacceptable properties, in particular when the parameter space is of large dimension. For example, Kleibergen and Van Dijk (1997) point out that for simultaneous equation systems ‘… the order condition reflects overall identification while the rank condition reflects local (non) identification.’ They show that local non-identification coupled with a diffuse prior leads to posterior pathology when one attempts Limited Information analysis of a single equation and suggest an alternative approach. (See Gelfand and Sahu (1999) for other examples.)

Dawid (1979) and Kadane (1974) use conditional independence as a device for a definition of nonidentifiability which embraces both Bayesian and non-Bayesian perspectives. Suppose that an observable $X$ possesses a distribution $F_\theta$ determined by a parameter $\theta$ which we partition into two pieces $\theta_1$ and $\theta_2$. If the distribution of $X$ is fully determined by $\theta_1$ alone, $\theta = (\theta_1, \theta_2)$ is non-identifiable. When a Bayesian assigns a prior distribution $g(\theta)$ to $\theta$, Bayes theorem tells her that the posterior distribution $f(\theta \mid y)$ for a parameter $\theta$ given data $y$ is proportional to the product of the likelihood function $L(\theta \mid y)$ for $\theta$ given $y$ and the prior $g(\theta)$ assigned to $\theta$. The (posterior) distribution for $\theta_2$ given $\theta_1$ and data $y$, given prior $g(\theta) = g(\theta_2 \mid \theta_1)h(\theta_1)$, is

$$f(\theta_2 \mid \theta_1, y) \propto L(\theta_1, \theta_2 \mid y)\, g(\theta_2 \mid \theta_1)\, h(\theta_1)$$

If $\theta_2$ is absent from the likelihood function,

$$f(\theta_2 \mid \theta_1, y) \propto g(\theta_2 \mid \theta_1)\, h(\theta_1 \mid y) \propto g(\theta_2 \mid \theta_1)$$

so that while data will modify her prior judgements about $\theta_1$, no amount of observable data $y$ will change her a priori probability assessment $g(\theta_2 \mid \theta_1)$ for $\theta_2$ given $\theta_1$. Data $y$ is, however, unconditionally informative about $\theta_2$; as data accrues and we learn more about $\theta_1$, prior judgements about $\theta_2$ are modified in accord with
$$f(\theta_2 \mid y) \propto \int g(\theta_2 \mid \theta_1)\, h(\theta_1 \mid y)\, d\theta_1$$

provided that both $g(\theta_2 \mid \theta_1)$ and $h(\theta_1 \mid y)$ are proper distributions.

Return to Example 1. As Bayesians (noting that $p_{FR} + p_{UR} + p_{FS} + p_{US} = 1$), we assign a prior distribution $g(p_{FR}, p_{UR}, p_{FS})$ to $p_{FR}$, $p_{UR}$, $p_{FS}$. If observed data $y = (N_{S\cdot}, N_{FR}, N_{UR})$ is generated in accord with Table 2, the likelihood function for $p_{FR}$, $p_{UR}$, $p_{FS}$,

$$[1 - p_{FR} - p_{UR}]^{N_{S\cdot}}\; p_{FR}^{N_{FR}}\; p_{UR}^{N_{UR}}$$

does not depend on $p_{FS}$. Consequently,

$$f(p_{FS} \mid p_{FR}, p_{UR}, y) \propto g(p_{FS} \mid p_{FR}, p_{UR})$$

This matches our intuition: given $p_{FR} + p_{UR} = \alpha$ we then know with certainty that $p_{FS} + p_{US} = 1 - \alpha$, $p_{FS}, p_{US} \ge 0$. Data of type $(N_{S\cdot}, N_{FR}, N_{UR})$ refines our knowledge about the value of $\alpha$, but even an infinite sample of this type will not lead to certain knowledge of $p_{FS}$. The conditional distribution of $p_{FS}$ given $p_{FR} + p_{UR} = \alpha$ prior to observing $y$ is the same as the conditional distribution of $p_{FS}$ given $p_{FR} + p_{UR} = \alpha$ posterior to observing $y$.
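The behavior described in Example 3 and in the conditional-independence argument can be checked in closed form for a normal model. The sketch below assumes independent zero-mean normal priors $\mu_1 \sim N(0, \tau_1^2)$ and $\mu_2 \sim N(0, \tau_2^2)$ (an illustrative choice, not one made in the text) and approximates a diffuse prior on $\mu_1$ by a very large $\tau_1^2$. Because the likelihood depends on $(\mu_1, \mu_2)$ only through $\theta = \mu_1 + \mu_2$, the distribution of $\mu_2$ given $\theta$ is untouched by the data.

```python
def posterior_mu2(tau1_sq, tau2_sq, n, ybar):
    """Posterior mean and variance of mu2 when y_i ~ N(mu1 + mu2, 1), with
    independent priors mu1 ~ N(0, tau1_sq) and mu2 ~ N(0, tau2_sq) (assumed)."""
    prior_var_theta = tau1_sq + tau2_sq              # theta = mu1 + mu2
    post_var_theta = 1.0 / (1.0 / prior_var_theta + n)
    post_mean_theta = post_var_theta * n * ybar
    weight = tau2_sq / prior_var_theta               # slope of E[mu2 | theta] in theta
    cond_var = tau1_sq * tau2_sq / prior_var_theta   # Var(mu2 | theta), unchanged by data
    mean = weight * post_mean_theta
    var = weight ** 2 * post_var_theta + cond_var
    return mean, var

# Proper, moderately informative priors: the data move the posterior for mu2,
# but its variance stays above Var(mu2 | theta) however large n becomes.
informative = posterior_mu2(tau1_sq=1.0, tau2_sq=1.0, n=10**6, ybar=3.0)

# Nearly diffuse prior on mu1: the posterior for mu2 stays at its N(0, 1) prior.
near_diffuse = posterior_mu2(tau1_sq=1e12, tau2_sq=1.0, n=10**6, ybar=3.0)
```

With both priors proper the posterior mean of $\mu_2$ moves toward half of the sample mean while its variance settles near 0.5 rather than 0; with a nearly diffuse prior on $\mu_1$ the posterior for $\mu_2$ is practically indistinguishable from its prior.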
While the theory of identification in a classical statistical context is now well understood, the story of identification from a Bayesian viewpoint is still unfolding. Markov Chain Monte Carlo (MCMC) methods breathe new life into Bayesian inference, allowing efficient computation of normalizing constants and parameter estimates for complex models. An explosion of Bayesian application of MCMC methods surfaces new identification issues. In their study of Bayesian identifiability for ‘Generalized Linear Models’ (GLMs), Gelfand and Sahu (1999) present conditions under which assignment of an improper prior distribution to some components of a parameter set leads to a proper posterior distribution. More particularly, if a posterior distribution for parameters is improper, under what conditions does there exist a lower-dimensional parameter set that possesses a proper posterior distribution? Ghosh et al. (2000) prove that when data $y$ is uninformative about $\theta_2$ given $\theta_1$, so that $\theta_2$ is not identifiable, the posterior density $f(\theta_1, \theta_2 \mid y)$ is proper if and only if both $h(\theta_1 \mid y)$ and $g(\theta_2 \mid \theta_1)$ are proper, a fact that Gelfand and Sahu use to establish that, for GLMs at least, even when the posterior distribution for all parameters is improper, there are embedded lower-dimensional models that possess a proper posterior distribution, although in general that distribution is not uniquely determined. (See Gelfand and Sahu (1999) for conditions that guarantee uniqueness.) They warn, however, that model fitting using MCMC methods is ‘… somewhat of an art form, requiring suitable trickery and tuning to obtain results in which we can have confidence.’
Bibliography

Bartels R 1985 Identification in Econometrics. The American Statistician 39(2): 102–4
Basu A P 1981 Statistical Distributions in Scientific Work 5. In: Taillie C, Patil, Baldessari (eds.) D Reidel, Dordrecht, The Netherlands, pp. 335–48
Berman S M 1963 Annals of Mathematical Statistics 34: 1104–6
Bollen K A 1989 Structural Equations with Latent Variables. Wiley, New York
Bose R C 1944 Proceedings of the 31st Indian Sci. Congr., Delhi, Vol. 3, pp. 5–6
Bose R C 1947 Proceedings of the 34th Indian Sci. Congr., Delhi, Part II, pp. 1–25
Bowden R 1973 The theory of parametric identification. Econometrica 41(6): 1069–74
Dawid A P 1979 Conditional independence in statistical theory (with discussion). Journal of the Royal Statistical Society Ser. B 41: 1–31
Deistler M, Seifert H-G 1978 Identifiability and consistent estimability in econometric models. Econometrica 46(4): 969–80
Drèze J H 1962 The Bayesian approach to simultaneous equation estimation. ONR Research Memorandum No. 67. Northwestern University
Drèze J H 1972 Econometrics and Decision Theory. Econometrica 40: 1–17
Drèze J H 1974 Bayesian theory of identification in simultaneous equations models. Studies in Bayesian Econometrics and Statistics. North-Holland, Amsterdam, pp. 159–174 (presented to Third NBER-NSF Seminar on Bayesian Inference Econometrics, Harvard Univ., October 1971)
Drèze J H 1976 Bayesian limited information analysis of the simultaneous equations model. Econometrica 44: 1045–75
Drèze J H, Morales J A 1976 Bayesian full information analysis of simultaneous equations. Journal of the American Statistical Association 71(376): 919–23
Drèze J H, Richard J F 1983 Bayesian analysis of simultaneous equation systems. Handbook of Econometrics 1
Fisher F M 1961 On the cost of approximate specification in simultaneous equation estimation. Econometrica 29(2)
Fisher F M 1966 The Identification Problem in Econometrics. McGraw-Hill, New York
Gelfand A, Sahu S K 1999 Identifiability, improper priors, and Gibbs sampling for generalized linear models. Journal of the American Statistical Association 94(445): 247–53
Ghosh M, Chen M H, Ghosh M, Agresti A 2000 Noninformative priors for one parameter item response models. Journal of Statistical Planning and Inference 88: 99–115
Hannan E J 1969 The identification of vector mixed autoregressive moving average systems. Biometrika 57: 223–5
Hannan E J 1971 The identification problem for multiple equation systems with moving average errors. Econometrica 39: 751–65
Haavelmo T 1944 The probability approach in econometrics. Econometrica 12, Supplement
Hoyle 1995 Structural Equation Modeling: Concepts, Issues and Applications. Sage, Thousand Oaks, CA
Hurwicz L 1950 Generalization of the concept of identification. Statistical Inference in Dynamic Economic Models. Cowles Commission Monograph 10. Wiley, New York
Johnson W O, Gastwirth J L 1991 Bayesian inference for medical screening tests: Approximations useful for the analysis of AIDS. Journal of the Royal Statistical Society Ser. B 53: 427–39
Jöreskog K G 1973 In: Krishnaiah P R (ed.) Analysis of Covariance Structures. Multivariate Analysis III. Academic Press, New York
Jöreskog K G, Sörbom D 1984 LISREL VI Analysis of Linear Structural Relations by Maximum Likelihood, Instrumental Variables, and Least Square Methods. User’s Guide. Department of Statistics, University of Uppsala, Uppsala, Sweden
Kadane J B 1974 The role of identification in Bayesian theory. Studies in Bayesian Econometrics and Statistics. North-Holland, Amsterdam, pp. 175–91 (presented to Winter Meeting of Econometric Society, Toronto, December 1972)
Kleibergen F, Van Dijk H K 1997 Bayesian Simultaneous Equations Analysis using Reduced Rank Structures. Erasmus University Technical Report 9714/A
Koopmans T C, Reiersøl O 1950 The identification of structural characteristics. Annals of Mathematical Statistics 21: 165–81
Koopmans T C, Rubin H, Leipnik R B 1950 Measuring the equation systems of dynamic economics. Statistical Inference in Dynamic Economic Models. Cowles Commission Monograph 10. Wiley, New York
Lazarsfeld P F 1950 The logical and mathematical foundation of latent structure analysis. The interpretation of some latent structures. Measurement and Prediction 4: Studies in Psychology of World War II
Leamer E E 1978 Specification Searches. Wiley, New York
Mouchard J-F 1976 Posterior and Predictive Densities for Simultaneous Equation Models. Springer-Verlag, New York
Neath A A, Samaniego F J August 1997 On the efficacy of Bayesian inference for nonidentifiable models. The American Statistician 51(3): 225–32
Pigou C 1910 A method of determining the numerical values of elasticities of demand. Economic Journal 20: 636–40
Poirier D J 1996 Revising Beliefs in Non-Identified Models. Technical Report, University of Toronto, Canada
Raiffa H, Schlaifer R O 1961 Applied Statistical Decision Theory. Harvard Business School, Boston
Rothenberg T J May 1971 Identification in parametric models. Econometrica 39: 577–91
Thurstone L L 1935 The Vectors of Mind. University of Chicago Press, Chicago
Thurstone L L 1947 Multiple-Factor Analysis. University of Chicago Press, Chicago
Van der Genugten B B 1997 Statistica Neerlandica 31: 69–89
Wald A 1950 Note on the identification of economic relations. Statistical Inference in Dynamic Economic Models. Cowles Commission Monograph 10. Wiley, New York
Wiley D E 1973 The identification problem for structural equation models with unmeasured variables. In: Goldberger A S, Duncan O D (eds.) Structural Equation Models in the Social Sciences. Seminar Press, New York
G. M. Kaufman
Statistical Methods, History of: Post-1900

1. Introduction

In Statistical Methods for Research Workers, Fisher (1925) characterizes statistics as ‘(i) the study of populations, (ii) the study of variation, (iii) the study of methods of the reduction of data.’ This is still the field of statistics in a nutshell. The three forms of studies are interwoven, and one thread cannot fully be understood without the others in the fabric of statistics that they weave. During the early decades of the twentieth century, the theory of probability and the statistical methodology of the least-squares tradition from Laplace and Gauss continued to be developed, and the two fields married permanently. Methods of regression, correlation, and analysis of variance found their final form, as did survey sampling, which was given a probabilistic footing. The discussion of Bayesian analysis and inverse probability continued and led to a split between the frequentist and the Bayesian schools of thought. This period also saw the emergence of fundamentally new ideas and methodologies. Statistical inference, e.g., in the forms of hypothesis testing and confidence interval estimation, was identified as distinct from data description. Parametric models, allowing exact inference, even for small samples, were found, and led to constructive concepts of inductive logic and optimality for estimation and testing. This impressive development in concepts and methods peaked with Fisher’s 1922 paper. It was a paradigmatic revolution of statistics, along with those of Laplace and Gauss around 1800 (Hald 1998). It took about 20 years and many papers for each of these authors to work out their basic ideas in detail, and it took half a century for the rest of the statistical community to understand and develop the new methods and their applications.

The impact of statistics on society and the social sciences also changed and expanded fundamentally. Statistical thinking spread and was integrated in several of the social sciences as part of their basic thinking. The development of psychology, sociology and economics was influenced in essential ways by statistics, and statistics itself benefited from ideas and methods developed in these fields, as well as in biology and other sciences.
2. Impacts of Statistics on the Social Sciences

The term ‘statistics’ has its origin in the social sciences. Statistics meant the assemblage of numerical data describing the state, its population, institutions etc., with focus on the particulars. With Adolphe Quetelet this changed. Statistical mass phenomena attracted more and more interest from social scientists throughout the nineteenth century. It was asked whether distributions in society followed laws of chance, such as the binomial, the Poisson, and the normal distribution. Most phenomena were found to show ‘supernormal’ variation, and the question arose whether the population could be stratified to obtain ‘normal’ variation within strata. The emphasis in statistical work changed around 1900 from the general study of variability to more specific issues. Statistical data were requested and brought forward into discussions of social policy and social science. In 1910, Karl Pearson and Ethel M. Elderton, for example, gathered data to see whether children of heavy drinkers were more disabled than other children. Their conclusion (Elderton and Pearson 1910) was in the negative: the alcoholic propensities of parents seemed utterly unrelated to any measurable characteristics of the health or intelligence of offsprings, and what relations were found were as likely to be the reverse of what the temperance movement expected as not.
Stigler (1999) summarizes this paper and the ensuing discussion with the leading economist Alfred Marshall and others. Pearson (letter to the Times, 12 July 1910) scolded the economists for only arguing theoretically and not bringing forward empirical support:
to refute these data the only method is to test further equally unselected material … Professor Marshall suggested that the weaklings among the sober had grandparents who were profligate. This may be so but, as a statistician, I reply: ‘Statistics—and unselected statistics—on the table, please.’ I am too familiar with the manner in which actual data is met with the suggestion that other data, if collected, might show another conclusion to believe it to have any value as an argument. ‘Statistics on the table, please,’ can be my sole reply.
The call for focused unbiased statistics spread, and the growth in statistical data was impressive both in official statistics and in social and other sciences. The toolbox of statistical methods also grew, and the concepts and methods were steadily used more widely. Even though statistics and probability originated in the social sphere with gambling, insurance and social statistics, statistical thinking in the modern sense came late to the core social sciences. It was, in a sense, easier for Gauss to think statistically about errors in his measurement of the deterministic speed of a planet, and for experimental psychologists to randomize their designs and to use corresponding statistical models in the analysis, than it was for Pearson to study how children were affected by their parents’ drinking or for economists to study business cycles. The difficulty is, according to Stigler (1999), that statistics has a much more direct role in defining the object of study in the social sciences. This relative unity of statistical and subject matter concepts was a taxing challenge that delayed progress. When statistical thinking took hold, however, the fields themselves changed fundamentally and received new energy. This delayed but fundamental impact of statistics can be traced in various disciplines. Consider the study of business cycles. Good times follow bad times, but are these cycles of business regular in some way? While nineteenth-century economists were mostly looking for particular events that caused ‘crises’ in the economy, Karl Marx and others started to look for regular cyclic patterns. Moving averages and other linear filtering techniques were used to remove noise and trend from the series to allow cycles of medium length to be visible. Statisticians worried whether a cycle of wavelength 20 years was due to the data or to the filtering methods.
George Udny Yule found in 1926 that ‘nonsense correlations’ between time series of finite length are prone to show up and to pass conventional tests of significance simply because of internal auto-correlation. He had earlier coined the term ‘spurious correlation,’ but found that nonsense correlations are flukes of the data that ‘cannot be explained by some indirect catena of causation’ (Yule 1926). Eugene Slutsky’s Russian paper of 1927 made things even worse for descriptive business cycle analysis. By a simulation experiment using numbers drawn in the People’s Commissariat of Finance lottery, he demonstrated that linear filters can produce seemingly
regular cycles out of pure noise. It was later found that the filter used to find the popular 20-year cycles tends to produce 20-year cycles even from random data. Slutsky’s experiment and Yule’s analysis highlighted the limitations of model-free descriptive statistical analysis. It gradually became clear that economic data had to be interpreted in light of economic theory, and the field of econometrics emerged. This revolution peaked with Trygve Haavelmo’s Probability Approach in Econometrics (1944). Mary S. Morgan (1990) holds further that the failure of descriptive business cycle analysis was instrumental in the development of dynamic models for whole economies, starting with the work of Ragnar Frisch and Jan Tinbergen. Other disciplines saw similar developments of delayed but fundamental impact of statistical methodology, but perhaps not to the same extent as economics. The study of intelligence and personality was, for example, transformed and intensified by the advent of statistical factor analysis stemming from Spearman (1904) and Thurstone (1935).
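The effect Slutsky demonstrated can be reproduced in a few lines. The sketch below is not his procedure: it uses pseudo-random normal draws in place of lottery numbers, a plain 10-term moving sum as the linear filter, and the lag-1 autocorrelation as a crude summary of the smooth, wave-like behavior that filtering induces. The seed, series length, and window are arbitrary assumptions.

```python
import random

def moving_sum(series, window):
    """Simple linear filter: sum of each run of `window` consecutive terms."""
    return [sum(series[i:i + window]) for i in range(len(series) - window + 1)]

def lag1_autocorrelation(series):
    """Sample autocorrelation between consecutive terms of a series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in series)
    return num / den

rng = random.Random(1927)                      # fixed seed for reproducibility
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]
smoothed = moving_sum(noise, window=10)

r_noise = lag1_autocorrelation(noise)          # close to zero for pure noise
r_smoothed = lag1_autocorrelation(smoothed)    # strongly positive after filtering
```

The unfiltered series shows essentially no serial dependence, while the filtered series is strongly autocorrelated, so its plot rises and falls in long swings that are easily mistaken for cycles in the underlying data.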
3. Statistics Transformed

3.1 The Study of Populations

Populations can be real, such as the population of Norway in 1894 with respect to the number of sick and disabled. Theoretical populations, on the other hand, are the hypothetical possible outcomes of an experiment, an observational study, or a process in society or nature. For finite realized populations, Kiær proposed in 1895 to the International Statistical Institute that his representative method should be considered as an alternative to the method of the day, which was to take a complete census. The idea of survey sampling was extensively debated and was gradually accepted internationally. Results from Kiær’s 1894 survey of sickness and disability were, however, questioned in Norway by Jens Hjorth. He had no problem with sampling per se but he criticized the particulars of the survey and the questionnaire. From actuarial reasoning he claimed that Kiær had underestimated the number of disabled. At issue was a proposed universal insurance system for disability, and Hjorth argued that a snapshot picture of the population obtained from a survey would be insufficient. Estimates of transition probabilities to disability were also needed. In the background of Hjorth’s reasoning was, in effect, a theoretical population, e.g., a demographic model, with disability probabilities by age, occupation group and other determinants as primary parameters. Equipped with the statistical competence of the day, Hjorth presented error bounds based on the binomial
distribution for the fraction of disabled and he discussed the sample size necessary to obtain a given precision for various values of the true fraction. Kiær apparently did not understand probability. In 1906, he had to concede that the number of disabled was grossly underestimated in his 1894 survey, and he became silent on his representative method for the rest of his life. This debate over survey sampling continued over the ensuing decades until Jerzy Neyman (1934) described in clear and convincing form the modern frequentist probability model for survey sampling including stratification, cluster sampling, and other schemes. Neyman’s model for a survey of a given finite population can be seen as a theoretical population consisting of the collection of all the possible samples that could be drawn, with a probability associated with each. The distinction between sample quantities and theoretical parameters was not always observed. For example, instead of asking how data could be used to draw inferences about theoretical quantities, Karl Pearson and others proposed to assess the degree of association in two-by-two tables by means of various coefficients, e.g., the coefficients of mean square contingency and tetrachoric correlation. By emphasizing the importance of substantive theory, and the distinction between theoretical and empirical quantities, R A Fisher cleared up the confusion caused by the wilderness of coefficients. By 1920, he had introduced the notion of ‘parameter’ for characteristics of the theoretical population and he used the word ‘statistics’ for estimates, test variables, or other quantities of interest calculated from the observed data. Fisher’s interest was not focused on describing data by statistics, but on using them to draw inferences about the parameters of the underlying theoretical population.
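Hjorth’s binomial error bounds and sample-size reasoning, mentioned earlier in this section, can be illustrated with a minimal sketch. The text does not report his actual figures, so the fractions and the target precision below are hypothetical, and the normal approximation to the binomial (with the conventional multiplier 1.96) is an assumption of this sketch rather than a reconstruction of his method.

```python
import math


def binomial_standard_error(p: float, n: int) -> float:
    """Standard error of a sample fraction when n units are drawn at random."""
    return math.sqrt(p * (1.0 - p) / n)


def sample_size_for_precision(p: float, half_width: float, z: float = 1.96) -> int:
    """Smallest n for which z standard errors do not exceed half_width."""
    return math.ceil(z * z * p * (1.0 - p) / (half_width * half_width))


# Hypothetical true fractions of disabled persons and a hypothetical
# target precision of plus or minus half a percentage point.
for p in (0.01, 0.05, 0.10):
    n = sample_size_for_precision(p, half_width=0.005)
    print(p, n, round(binomial_standard_error(p, n), 5))
```

The required sample size grows with p(1 - p), which is why the precision of a survey has to be discussed for various values of the true fraction.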
In the mid-1920s, Fisher pursued these notions of inference and he incorporated his evolving principles of statistical estimation into a new chapter in the second (1928) edition of his pioneering 1925 book, Statistical Methods for Research Workers. Here he presented his new measure of the statistical information in the data with respect to a particular parameter and he stressed the importance of using efficient estimators. Fisher’s ideas were based on the concept of a parametric statistical model that embodies both the substantive theory, e.g., relevant aspects of Mendel’s theory of genetics, and a probabilistic description of the data generating process, i.e., the variational part of the model. This probabilistic model led Fisher to formulate the likelihood function, which in turn led to a revolutionary new methodology for statistics. Fisher also picked up on William S. Gosset’s 1908 development of the t-distribution for exact inference on the mean of a normal distribution with unknown variance. From a simulation experiment and guided
by distributional forms developed by Karl Pearson, Gosset guessed the mathematical form of the t-distribution. Later, Fisher was able to prove formally the correctness of Gosset’s guess and then go on to show how the t-distribution also provided an ‘exact solution of the sampling errors of the enormously wide class of statistics known as regression coefficients’ (Fisher 1925).

3.2 The Study of Variation

There are two related aspects of the term variation: uncertainty in knowledge, and sampling variability in data, estimates or test statistics. In the tradition of inverse probability stemming from Thomas Bayes, no distinction was drawn between uncertainty and variation. This inverse-sampling probability tradition dominated well into the twentieth century, although unease with some of the implicit features was expressed repeatedly by some critics as early as the mid-nineteenth century. Inverse probability is calculated via Bayes’ theorem, which turns a prior distribution of a parameter coupled with a conditional distribution of the data given the parameter into a posterior distribution of the parameter. The prior distribution represents the ex ante belief/uncertainty in the true state of affairs and the posterior distribution represents the ex post belief. The prior distribution is updated in the light of new data. Before Fisher, statistical methods were based primarily on inverse probability with flat prior distributions to avoid subjectivity, if they were grounded at all in statistical theory. Fisher was opposed to the Bayesian approach to statistics. His competing methodology avoided prior distributions and allowed inference to be based on observed quantities only. When discussing Neyman’s paper on survey sampling, Fisher (1934) stated: All realized that problems of mathematical logic underlay all inference from observational material.
They were widely conscious, too, that more than 150 years of disputation between the pros and the cons of inverse probability had left the subject only more befogged by doubt and frustration.
The likelihood function was the centerpiece of Fisher’s methodology. It is the probability density evaluated at the observed data, but regarded as a function of the unknown parameter. It is proportional to the posterior density when the prior distribution is flat. The maximum-likelihood estimate is the value of the parameter maximizing the likelihood. Fisher objected, however, to the interpretation of the normed likelihood function as a probability density. Instead, he regarded the likelihood function and hence the maximum likelihood estimator as a random quantity, since it is a function of the data which are themselves considered random until observed.
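As a minimal sketch of these definitions, the binomial model below evaluates the likelihood on a grid, locates its maximum, and norms it so that it coincides with the posterior under a flat prior. The data (7 successes in 20 trials) and the grid resolution are hypothetical choices made for illustration only.

```python
import math


def binomial_likelihood(theta: float, successes: int, trials: int) -> float:
    """Probability of the observed count, regarded as a function of theta."""
    return (math.comb(trials, successes)
            * theta ** successes
            * (1.0 - theta) ** (trials - successes))


successes, trials = 7, 20                  # hypothetical observed data
grid = [i / 1000 for i in range(1, 1000)]  # candidate parameter values

# Maximum-likelihood estimate: the grid value that maximises the likelihood.
mle = max(grid, key=lambda t: binomial_likelihood(t, successes, trials))

# With a flat prior, the posterior is the likelihood normed to sum to one.
values = [binomial_likelihood(t, successes, trials) for t in grid]
total = sum(values)
posterior = [v / total for v in values]
posterior_mode = grid[posterior.index(max(posterior))]

print(mle, posterior_mode)
```

The grid maximum falls at the observed fraction 7/20, and the mode of the flat-prior posterior coincides with it, since norming rescales the likelihood without moving its maximum.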
To replace the posterior distribution, he introduced the fiducial distribution as a means of summing up the information contained in the data on the parameter of interest. In nice models, the fiducial distribution spans all possible confidence intervals by its stochastic quantiles. A confidence-interval estimator for a parameter is a stochastic interval obtained from the data, which under repeated sampling bounds the true value of the parameter with prescribed probability, whatever this true value is. Neyman was the great proponent of this precise concept. He even claimed to have introduced the term ‘degree of confidence,’ overlooking Laplace, who had introduced the large sample error bounds and who spoke of ‘le degré de confiance’ in 1816 (Hald 1998). Fisher’s fiducial argument helped to clarify the interpretation of error bounds as confidence intervals, but Fisher and Neyman could not agree. Neyman thought initially that confidence intervals, which he introduced in his (1934) paper on sampling, were identical to Fisher’s fiducial intervals and that the difference lay only in the terminology. He asked whether Fisher had committed a ‘lapsus linguae’ since Fisher spoke of fiducial probability instead of degree of confidence (Neyman 1941). Neyman’s point was that no frequentist interpretation could be given for the fiducial probability distribution since it is based on the observed data ex post and is a distribution over the range of the parameter. He shared with Fisher the notion of the parameter having a fixed but unknown value and to characterize it by a probability distribution made no sense to him. Fisher stressed, however, that his fiducial distribution is based on the logic of inductive reasoning and provides a precise representation of what is learned from the observed data. He could not see that the frequentist property of the confidence interval method allowed one to have ‘confidence’ in the actually computed interval.
That intervals computed by the method in hypothetical repetitions of the experiment would cover the truth with prescribed frequency was of some interest. It was, however, the logic and not this fact that made Fisher say that the computed interval contained the parameter with given fiducial probability. Harold Hotelling (1931) staked out an intermediate position. He generalized Gosset’s t-distribution to obtain confidence regions for the mean vector of multivariate normal data with unknown variance-covariance matrix. He concluded that ‘confidence corresponding to the adopted probability P may then be placed in the proposition that the set of true values [the mean vector] is represented by a point within this boundary [the confidence ellipsoid].’
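The frequentist property at issue, coverage of the true value with prescribed frequency in hypothetical repetitions, can be checked by simulation. The sketch below is an illustration under assumed settings (normal data, samples of size 10, a nominal 95 percent level, and a hard-coded t quantile); none of these values come from the text.

```python
import random
import statistics

T_CRIT_DF9 = 2.262  # 97.5% quantile of the t-distribution with 9 degrees of freedom


def t_interval(sample):
    """Two-sided 95% confidence interval for the mean of a sample of size 10."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / n ** 0.5
    return mean - T_CRIT_DF9 * std_error, mean + T_CRIT_DF9 * std_error


def coverage(true_mean, true_sd, repetitions, seed=0):
    """Fraction of repeated experiments whose interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(10)]
        low, high = t_interval(sample)
        hits += low <= true_mean <= high
    return hits / repetitions


print(coverage(true_mean=3.0, true_sd=2.0, repetitions=20000))
```

The simulated frequency should lie close to 0.95 whatever mean and standard deviation are chosen. This is the property Neyman stressed; it says nothing by itself about any single computed interval, which was Fisher’s objection.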
Although fierce opponents on matters of inference, Fisher and Neyman agreed that inverse probability had left the subject only more ‘befogged by doubt and frustration’ (Fisher 1934) and they united in their fight against Bayesianism. The tradition from Bayes survived, however. One solution to the flawed indifference
principle is to go subjective. By interpreting the prior distribution as a carrier of the subjective belief of the investigator before he sees the data, the posterior distribution is interpreted correspondingly, as representing the subjective beliefs ex post data. This was the approach of Bruno de Finetti. Harold Jeffreys proposed another solution. He argued that the prior should be based on the statistical model at hand. For location parameters, the flat prior is appropriate since it is invariant with respect to translation. Scale parameters, like the standard deviation, should have flat priors on the logarithmically transformed parameter. These and other priors were canonized from invariance considerations. The basic idea of statistical hypothesis testing had been around for some 200 years when Karl Pearson developed his chi-square test in 1900 (Hald 1998). The chi-square test and Gosset’s t-test initiated a renewed interest in the testing of statistical hypotheses. A statistical test is a probabilistic twist of the mathematical proof by contradiction: if the observed event is sufficiently improbable under the hypothesis, the hypothesis can reasonably be rejected. Fisher formulated this logic in his p-value for tests of significance. A given statistical hypothesis can usually be tested by many different test statistics calculated from the data. Fisher argued that the likelihood ratio was a logical choice. The likelihood ratio is the ratio of the likelihood at the unrestricted maximum to the likelihood maximized under the hypothesis. Neyman and Egon S. Pearson, son of Karl Pearson, were not satisfied with this reasoning. They looked for a stronger foundation on which to choose the test. From Gosset, they took the idea to formulate the alternative to the hypothesis under test.
To them, testing could result either in no conclusion (there is not sufficient support in the data to reject the null hypothesis), or in a claim that the alternative hypothesis is true rather than the null hypothesis. The alternative hypothesis, supposed to be specified ex ante, reflects the actual research interest. This led Neyman and Pearson to ask which test statistic would maximize the chance of rejecting the null hypothesis if indeed the alternative hypothesis and not the null hypothesis is true, when the probability of falsely rejecting the null hypothesis is controlled by a pre-specified level of significance. In nice problems, they found the likelihood ratio statistic to produce the optimal test in 1933. This result is the jewel of the Neyman–Pearson version of frequentist inference. As important as their optimality result is, perhaps, their emphasis on the alternative hypothesis, the power of the test, and the two types of error (false positive and false negative conclusion) that might occur when testing hypotheses. Gauss asked in 1809 whether there exists a probability distribution describing the statistical variability in the data that makes the arithmetic mean the estimator of choice for the location parameter. He showed that the normal distribution has this property
and that his method of least squares for linear parameters is indeed optimal when data are normally distributed. This line of research was continued. The distribution that makes the geometric mean the estimator of choice was found and John M. Keynes discussed estimating equations and showed that the double exponential distribution makes the median an optimal estimator. In 1934 Fisher followed up with the more general exponential class of distributions. When the statistical model is within this class, his logic of inductive inference gives unambiguous guidance. Probability remained as the underpinning of formal statistical methods and the probability theory saw many advances in the early twentieth century, particularly in continental Europe. The Central Limit Theorem, which is at the heart of Laplace’s approximate error bounds, was strengthened. Harald Cramér (1946) summed up this development and he exposed its statistical utility since many estimators and other statistics are approximately linear in the observations and thus are approximately normally distributed. Related to the Central Limit Theorem is the use of series expansions to find an approximation to the distribution of a statistic. Thorvald N. Thiele completed his theory of distributional approximation by way of half invariants, which Fisher later rediscovered under the name of cumulants. Francis Y Edgeworth proposed another series expansion in 1905 (see Hald 1998). Probability theory is purely deductive and should have an axiomatic basis. Andrei N Kolmogoroff gave probability theory a lasting axiomatic basis in 1933, see Stigler (1999) who calls 1933 a magic year for statistics.
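The link noted above between an error distribution and its estimator of choice can be written out in modern likelihood notation. This is a sketch and not Gauss’s or Keynes’s original argument; the symbols (observations x_1, …, x_n, location μ, scale parameters σ and b) are introduced here for illustration.

```latex
\begin{align*}
\text{Normal errors:}\quad
  \ell(\mu) &= -\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(x_i-\mu)^{2} + \text{const},
  &\frac{\partial \ell}{\partial \mu}
    &= \frac{1}{\sigma^{2}}\sum_{i=1}^{n}(x_i-\mu) = 0
    \;\Longrightarrow\; \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i, \\[4pt]
\text{Double exponential errors:}\quad
  \ell(\mu) &= -\frac{1}{b}\sum_{i=1}^{n}\lvert x_i-\mu\rvert + \text{const},
  &\hat{\mu} &= \operatorname{median}(x_1,\dots,x_n).
\end{align*}
```

In the first case maximizing the log-likelihood is the same as minimizing the sum of squares, which yields the arithmetic mean. In the second it is the same as minimizing the sum of absolute deviations, which is achieved when equally many observations lie on either side of μ, that is, at the median.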
3.3 The Study of Methods of the Reduction of Data

Raw statistical data need to be summarized. Fisher (1925) says that ‘no human mind is capable of grasping in its entirety the meaning of any considerable quantity of numerical data. We want to be able to express all relevant information contained in the mass by means of comparatively few numerical values … It is the object of the statistical process employed in the reduction of data to exclude the irrelevant information, and to isolate the whole of the relevant information contained in the data.’
It is theory that dictates whether a piece of data is relevant or not. Data must be understood in the light of subject matter theory. This was gradually understood in economics, as well as in other fields. Fisher (1925) showed by examples from genetics how theory could be expressed in the statistical model through its structural or parametric part. The likelihood function is, for Fisher, the bridge between theory and data. Fisher introduced the concept of sufficiency in 1920: a statistic is sufficient if the conditional distribution of the data given the value of the statistic is the same for
all distributions within the model. Since a statistic having a distribution independent of the value of the parameter of the model is void of information on that parameter, Fisher was able to separate the data into its sufficient (informative) statistics, and ancillary (noninformative) statistics. The likelihood function is sufficient and thus carries all the information contained in the data. In nice cases, notably when the model is of the exponential class, the likelihood function is represented by a sufficient statistic of the same dimensionality as the parameter. It might then be possible to identify which individual statistic is informative for which individual parameter. If, say, the degree of dependency expressed by the theoretical odds ratio in a two-by-two contingency table is what is of theoretical interest, the marginal counts are ancillary for the parameter of interest. Fisher suggested therefore conditional inference given the data in the margins of the table. In the testing situation, this, he argued, leads to his exact test for independence. Conditional inference is also often optimal in the Neyman–Pearson sense.
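Conditioning on the margins of a two-by-two table makes the top-left cell hypergeometric under independence, which is what allows an exact test. The sketch below computes a one-sided p-value on that basis; the function names and the example table are hypothetical, and only the standard library is used.

```python
from math import comb


def hypergeom_pmf(a: int, row1: int, row2: int, col1: int) -> float:
    """P(top-left cell = a) given fixed margins, under independence."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)


def fisher_exact_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(top-left cell >= a) for the table [[a, b], [c, d]] with margins fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    largest = min(row1, col1)  # largest value the top-left cell can take
    return sum(hypergeom_pmf(k, row1, row2, col1) for k in range(a, largest + 1))


# Hypothetical table: rows are two groups, columns are two outcomes.
print(fisher_exact_one_sided(3, 1, 1, 3))
```

Because the margins are held fixed, the p-value depends on no unknown nuisance parameter. For the table above it equals 17/70.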
4. Conclusion

Statistical theory and methods developed immensely in the first third of the twentieth century with Fisher as the driving force. In 1938, Neyman moved from London to Berkeley. The two men were continents apart and kept attacking each other. The question is, however, whether this was more grounded in personal animosity than real scientific disagreement (Lehmann 1993). In the US, where Neyman’s influence was strong, there has been a late-twentieth-century resurgence in Fisherian statistics (Efron 1998) and even more so in Bayesian methods. Fisher and Neyman both emphasized the role of the statistical model. Both worked with the likelihood function, and suggested basically the same reduction of the data. Fisher gave more weight to the logic of inductive reasoning in the particular case of application, while Neyman stressed the frequentist properties of the method. To the applied research worker, this boils down to whether the level of significance or the confidence coefficient should be chosen ex ante (Neyman), or whether data should be summarized in fiducial distributions or p-values. These latter ex post benchmarks are then left to the reader of the research report to judge. Over the period, probabilistic ideas and statistical methods were slowly dispersed and were applied in new fields. Statistical methods gained a foothold in a number of sciences, often through specialized societies like the Econometric Society and through journals such as Econometrica from 1933 and Psychometrika from 1936. Statistical inference was established as a new field of science in our period, but it did not penetrate empirical sciences before World War II to the extent that
research workers in the various fields were required to know its elements (Hotelling 1940). Statistical ideas caught the minds of eminent people in the various sciences, however, and new standards were set for research. Fisher’s Statistical Methods for Research Workers has appeared in 14 editions over a period of 47 years. His ideas and those of Neyman and others spread so far and so profoundly that C Radhakrishna Rao (1965) declared, perhaps prematurely, statistics to be the technology of the twentieth century. See also: Bayesian Statistics; Fiducial and Structural Statistical Inference; Fisher, Ronald A (1890–1962); Statistical Methods, History of: Pre-1900; Statistics, History of
Bibliography

Cramér H 1946 Mathematical Methods of Statistics. Princeton University Press, Princeton, NJ
Efron B 1998 R. A. Fisher in the 21st Century. Statistical Science 13: 95–114
Elderton E M, Pearson K 1910 A First Study of the Influence of Parental Alcoholism on Physique and Ability of the Offspring. The Galton Laboratory, London
Fisher R A 1922 On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, Series A, 222: 309–68
Fisher R A 1925 Statistical Methods for Research Workers. Reprinted in Statistical Methods, Experimental Design, and Scientific Inference. Oxford University Press, Oxford, 1990
Fisher R A 1934 Discussion of Neyman. Journal of the Royal Statistical Society 97: 614–19
Haavelmo T 1944 The probability approach in econometrics. Econometrica 12(Supplement): 1–118
Hald A 1998 A History of Mathematical Statistics From 1750 to 1930. Wiley, New York
Hotelling H 1931 The generalization of Student’s ratio. Annals of Mathematical Statistics 2: 360–78
Hotelling H 1940 The teaching of statistics. Annals of Mathematical Statistics 11: 457–70. Reprinted (1988) as ‘Golden Oldy,’ Statistical Science 3: 57–71
Lehmann E L 1993 The Fisher, Neyman–Pearson theories of testing hypotheses: One theory or two? Journal of American Statistical Association 88: 1242–9
Morgan M S 1990 The History of Econometric Ideas. Cambridge University Press, Cambridge, UK
Neyman J 1934 On two different aspects of the representative method: the method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97: 558–625
Neyman J 1941 Fiducial argument and the theory of confidence intervals. Biometrika 32: 128–50
Rao C R 1965 Linear Statistical Inference and its Applications. Wiley
Spearman C 1904 General intelligence, objectively determined and measured. American Journal of Psychology 15: 201–93
Stigler S M 1999 Statistics on the Table. The History of Statistical Concepts and Methods. Harvard University Press, Cambridge, MA
Thurstone L L 1935 The Vectors of Mind. University of Chicago Press, Chicago, IL
Yule G U 1926 Why do we sometimes get nonsense-correlations between time-series? A study in sampling and the nature of time-series. Journal of the Royal Statistical Society 89: 1–63
T. Schweder
Statistical Methods, History of: Pre-1900

Statistics first assumed its recognizably modern form, as a set of tools and rationales for analyzing quantitative data, during the decade of the 1890s. Even then, it remained a far-flung and heterogeneous endeavor. Before this time, statistical methods were developed and put to use in the context of a variety of scientific and bureaucratic activities. These were sometimes recognized as analogous, but there were no common textbooks, nor even a shared name. The history of statistical methods up to 1900 means an examination of the development of statistical knowledge in various social and intellectual settings, and of the migration of techniques and understandings among them. The fields involved include mathematical probability, observational astronomy, quantitative natural history, physics, and an array of social science projects.
1. From Political Arithmetic to Statistics, 1660–1830

The name ‘statistics’ was coined in eighteenth-century Germany for a descriptive science of states. Whether or not statistical descriptions should involve numbers was left open at first. After 1800, in the polarizing era of the French Revolution, this question became more ideological, and the mere purveying of numerical tables was denounced for its slavishness, and for viewing the state as lifeless and mechanical. It thus entailed a change of mentality, suggestive also of new structures of government and administration, when this ‘state science’ began to be defined in terms of the collection and deployment of numbers. That happened first in Britain and France, where it was largely settled by 1840. The new science of statistics was closely related to censuses and other official information agencies, which expanded enormously in this same period (Hacking 1990). The new statistics, commonly called a science, was practiced mainly by reformers and administrators. Its methods involved organization and classification rather than mathematics. Procuring reliable numbers, then as now, was no small achievement. Among those who hoped to move this social science in the direction of finding laws and abstract relationships, the highest priority was to standardize categories. In this way, they hoped, the ‘statist’ could compare rates of crime,
disease, or illegitimacy across boundaries, and try to discover how they varied according to nationality, language, religion, race, climate, and legal arrangements. Such figures might also enable them to predict the potential for improvement if, for example, laws were changed or educational systems were put in place. A series of statistical congresses beginning in 1853 undertook to negotiate statistical uniformity, with very little success. The comparative neglect of mathematics among the social statisticians was not simply a consequence of a lack of appropriate tools. Social numbers had been collected since the late seventeenth century under the rubric of ‘political arithmetic,’ a bolder and more freewheeling enterprise than statistics. John Graunt, a tradesman by origin, was elected to the Royal Society of London for his ‘Observations upon the Bills of Mortality’ (1662), a compilation and analysis of mortality records. William Petty, rather more daring, used what figures he could procure to calculate such quantities as the total wealth of Britain, including the monetary value of its human lives. On this basis he even proposed modestly to increase the wealth of the nation by transporting the Irish to England. In the eighteenth century, probability theory was deployed by the most illustrious of mathematicians to investigate social numbers. Jakob Bernoulli’s Ars conjectandi (1713) addressed the convergence of observed ratios of events to the underlying probabilities. A discussion was provoked by the discovery that male births consistently outnumber female ones, in a ratio of about 14 to 13. In 1712, John Arbuthnot offered a natural theology of these birth numbers. The ratio was nicely supportive of belief in a benevolent God, and indeed of a faith without priestly celibacy, since the excess of male births was balanced by their greater mortality to leave the sexes in balance at the age of marriage. 
The odds that male births should exceed female ones for 82 consecutive years by ‘chance’ alone could be calculated, and the vanishingly small probability value stood as compelling evidence of the existence of God. Against this claim, Nicolas Bernoulli of the famous Swiss mathematical family showed that the steadiness of these odds required no providential intercession, but was about what one would anticipate from repeated tosses of dice with—he revised slightly the ratios—18 sides favoring M and 17 F (Hacking 1975). Pierre Simon Laplace investigated birth ratios in the 1780s, and also discussed how to estimate a population from a partial count. He was party to an alliance in France between mathematicians and administrators, which also included the mathematician and philosopher Condorcet (Brian 1994). Laplace’s population estimate was based on a registry of births for all of France, and on the possibility of estimating the ratio of total population to births through more local surveys. Proceeding as if the individuals surveyed were independent, and the birth numbers accurate, Laplace
used inverse probabilities to specify how many should be counted (771,469) in order to achieve odds of 1,000 to 1 against an error greater than half a million. The estimated population, which had to be fed into the previous calculation, was 25,299,417, attained by multiplying the average of births in 1781–2 by a presumed ratio of 26 (Gillispie 1997). Laplace and Condorcet wrote also about judicial probabilities. Supposing a jury of 12, requiring at least 8 votes to convict, and that each juror had a specified, uniform probability of voting correctly, these mathematicians could calculate the risk of unjust convictions and undeserved acquittals. They also determined how the probabilities would change if the number of jurors were increased, or the required majority adjusted. The study of judicial probabilities reached its apex in the 1830s when Siméon-Denis Poisson made ingenious use of published judicial records to inform the calculations with empirical content. By this time, however, the whole topic had become a dinosaur. The use of mathematics to model human judgment now seemed implausible, if not immoral, and inverse probabilities (later called Bayesian) came under challenge (Daston 1988). Political arithmetic had come to seem weirdly speculative, and probabilistic population estimates both unreliable and unnecessary. In the new statistical age, sound practice meant a complete census. On this basis, some of the more mathematical methods of statistical reasoning were rejected around 1840 in favor of empirical modesty and bureaucratic thoroughness. Not until the twentieth century was sampling restored to favor. One important application of probability to data grew up and flourished in place of these old strategies and techniques. That was the method of least squares. The problem of combining observations attracted a certain amount of attention in the eighteenth century.
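The judicial probabilities of Laplace and Condorcet described above reduce, in their simplest form, to binomial tail sums. The sketch below is a simplified illustration that assumes independent jurors sharing one probability of voting correctly; it omits the inverse-probability refinements of the original treatments, and the numerical values of that probability are hypothetical.

```python
from math import comb


def prob_at_least(k_min: int, n: int, p: float) -> float:
    """P(at least k_min successes in n independent trials with success probability p)."""
    return sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k)
               for k in range(k_min, n + 1))


def jury_error_rates(jurors: int, votes_to_convict: int, p_correct: float):
    """Return (risk of convicting the innocent, risk of acquitting the guilty)."""
    # Innocent defendant: each juror wrongly votes to convict with probability 1 - p_correct.
    unjust_conviction = prob_at_least(votes_to_convict, jurors, 1.0 - p_correct)
    # Guilty defendant: acquitted when fewer than votes_to_convict jurors vote to convict.
    undeserved_acquittal = 1.0 - prob_at_least(votes_to_convict, jurors, p_correct)
    return unjust_conviction, undeserved_acquittal


for p_correct in (0.6, 0.75, 0.9):
    print(p_correct, jury_error_rates(jurors=12, votes_to_convict=8, p_correct=p_correct))
```

Changing `jurors` or `votes_to_convict` shows how the two risks trade off when the size of the jury or the required majority is adjusted.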
The first formal method to gain wide acceptance was worked out between about 1805 and 1810. It was initially published in 1805 by the French mathematician Adrien-Marie Legendre, who was then at work on the problem of measuring a line of meridian in order to fix the meter. Carl Friedrich Gauss asserted soon afterwards that he had already been employing least squares for several years, though there are reasons not to take his claim at face value (Stigler 1999, Chaps. 17 and 20). From 1809 to 1811, Gauss and then Laplace published probabilistic derivations of the method, and provided interpretations that shaped much of its subsequent history (see Gauss, Carl Friedrich (1777–1855)). Particularly crucial was its connection to what became known as the astronomer’s error law, and later as the Gaussian or normal distribution. Gauss derived least squares by assuming that errors of observations conform to this law. Laplace showed how the error curve would be produced if each observation was subject to a host of small disturbances, which could equally well be positive or negative. The mathematics of error, and of least
squares, formed the topic for a line of mathematical publications throughout the century. It provided a means of analyzing quantitative data in astronomy, meteorology, surveying, and other related observational sciences. Later, through a paradoxical transformation, the same mathematics was used to describe and analyze variation in nature and society.
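For readers unfamiliar with the method, a minimal sketch of least squares for the simplest case, a straight line fitted to paired observations, is given below. The closed-form solution is the standard modern one, not a reconstruction of Legendre’s or Gauss’s computations, and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Intercept and slope minimising the sum of squared vertical errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations scattered around a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
print(least_squares_line(xs, ys))
```

When the errors follow the normal law, these estimates coincide with the most probable values in Gauss’s derivation.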
2. Social Science and Statistical Thinking

The mathematical tools of statistics worked out by Laplace and Gauss were extended but not decisively changed in the period from 1830 to 1890. They engaged the attention of some excellent mathematicians, including Auguste Bravais, I.-J. Bienaymé, and A.-L. Cauchy (Hald 1998). The work of the Russian school of probability theory, initiated by P. L. Chebyshev, had consequences for statistical methods. But the really decisive development in this period involved statistics as a form of reasoning, worked out mainly in relation to social science, and involving, for the most part, comparatively rudimentary mathematics. This statistical thinking arose first of all in relation to the social science that bore the name ‘statistics.’ Statistics as a method was conceived by analogy to this substantive science. It was, to rely here on anachronistic terms, less a mode of inference than a family of models. They involved, at the most basic level, the production of collective order from the seemingly irregular or even chaotic behavior of individuals. Like most discoveries, this one had ostensible precedents. Since, however, the Belgian Adolphe Quetelet had in the 1820s learned probability from Laplace, Poisson, and Joseph Fourier in Paris, and published several memoirs on vital statistics, his astonishment in 1829 should count for something. In that year, inspired by newly published French statistics of criminal justice, he attached a preface on crime to a statistical memoir on the Low Countries. He declared himself shocked at the ‘frightening regularity with which the same crimes are reproduced,’ year after year.
He introduced a language of ‘statistical laws.’ At first he worried that such laws of moral actions might conflict with traditional doctrines of human free will, though he eventually rejected the idea of a ‘strange fatalism’ in favor of interpreting these laws as properties of a collective, of ‘society.’ His preoccupation with statistical regularities reflected the new emphasis in moral and political discussion on society as an autonomous entity, no longer to be regarded as subordinated to the state. But there was also something more abstract and, in the broadest sense, mathematical, in Quetelet’s insight. It implied the possibility of a quantitative study of mass phenomena, requiring no knowledge at the level of individuals (Porter 1986). Quetelet was an effective publicist for his discovery, which indeed was not uniquely his. Statistical laws
were widely discussed for half a century in journalism, literature, and social theory by authors like Dickens, Dostoevsky, and Marx. Moralists and philosophers worried about their implications for human freedom and responsibility. Natural scientists invoked social-statistical analogies to justify the application of this form of reasoning to physical and biological questions. The creation of a statistical physics by James Clerk Maxwell, Ludwig Boltzmann, and Josiah Willard Gibbs, and of statistical theories of heredity and evolution by Francis Galton and Karl Pearson, attests to the increasing range of these statistical theories or models. Precisely this expansion made it increasingly credible to define statistics as a method rather than a subject matter. At the same time, a new body of analytical tools began to be formulated from within this discourse, drawing from probability and error theory, but applying and adapting them to issues in social statistics. The most pressing of these issues was precisely the stability of statistical series. A particularly forceful and uncompromising restatement of Quetelet’s arguments in Henry Thomas Buckle’s 1857 History of Civilization in England provoked an international debate about statistics and free will. One strategy of response was to explore this stability mathematically. The presentation and analysis of time series had long been associated with economic and population statistics, and with meteorology (Klein 1997). These debates provided a new focus. Was there really anything remarkable in the much-bruited regularities of crime, marriage, and suicide? Robert Campbell, schoolboy friend of the physicist Maxwell, published a paper in 1859 comparing Buckle’s examples with expectations based on games of chance. He found the social phenomena to be much less regular than were series of random events.
In Germany, where statistics was by this time a university-based social science, this question of the stability of social averages formed the core of a new mathematical approach to statistics. Wilhelm Lexis set out to develop tools that could simultaneously raise the scientific standing of statistics and free it from the false dogmas of ‘French’ social physics. The normal departure of measured from true ratios in the case of repeated independent random events was expressed by Poisson as $\sqrt{2pq/n}$, where $p$ and $q$ are complementary probabilities and $n$ the number of repetitions. Lexis began his investigation of the stability of statistical series in 1876 with a paper on the classic problem of male to female birth ratios. In this case, the tables of annual numbers were reasonably consistent with Poisson’s formula. He called this a ‘normal dispersion,’ though it appeared to be unique. Nothing else, be it rainfall or temperature, or time series of births, deaths, crimes, or suicides, whether aggregated or broken down by sex or age, showed this much stability. He understood Buckle’s deterministic claims as implying ‘subnormal dispersion,’ that is,
an even narrower range of variation than the Poisson formula predicted. This could easily be rejected. Lexis’s limits of error, defined by three times the Poisson number, corresponded with a probability of about 0.999978. Virtually every real statistical series showed a ‘supernormal’ dispersion beyond this range. Laplace and Poisson seemed often to use legal and moral questions as the occasion to show off their mathematics. Lexis was serious in his pursuit of a quantitative social science, one that would reveal the social and moral structure of society. Against Quetelet, who had exalted the ‘average man’ as the type of the nation, he undertook to show that human populations were highly differentiated—not only by age and sex, but also by religion, language, and region. His program for social science was to begin by disaggregating, seeking out the variables that differentiate a population, and using statistical mathematics to determine which ones really mattered. Society, it could be shown, was not a mere collection of individuals with identical propensities, but a more organic community of heterogeneous groups. The wider than chance dispersions revealed that society was not composed of independent atomic individuals.
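The dispersion comparison described above can be illustrated with a short calculation. This is only a sketch under stated assumptions, not a reconstruction of Lexis's own procedure: the function names are hypothetical, the yearly proportions are invented for illustration, every year is assumed to rest on the same number n of independent trials, and the normal approximation to the binomial is taken for granted. Because one Poisson number $\sqrt{2pq/n}$ equals $\sqrt{2}$ standard deviations, the chance of staying within three Poisson numbers is erf(3), which agrees with the figure of about 0.999978 quoted in the text.

```python
import math


def poisson_number(p: float, n: int) -> float:
    """Poisson's normal departure sqrt(2pq/n), where q = 1 - p."""
    q = 1.0 - p
    return math.sqrt(2.0 * p * q / n)


def coverage_within(k: float) -> float:
    """Chance that a normally distributed ratio lies within k Poisson numbers.

    One Poisson number is sqrt(2) standard deviations, so the coverage
    equals erf(k).
    """
    return math.erf(k)


def dispersion_ratio(ratios: list[float], n: int) -> float:
    """Observed spread of a series divided by the spread expected by chance.

    Assumes each entry of `ratios` comes from the same number n of
    independent trials. Values near 1 suggest 'normal' dispersion and
    values well above 1 suggest 'supernormal' dispersion.
    """
    k = len(ratios)
    p = sum(ratios) / k
    observed_var = sum((r - p) ** 2 for r in ratios) / (k - 1)
    chance_var = p * (1.0 - p) / n
    return math.sqrt(observed_var / chance_var)


# Hypothetical yearly proportions of male births, each from 10,000 births.
series = [0.514, 0.509, 0.518, 0.511, 0.513]
print(round(coverage_within(3.0), 6))
print(dispersion_ratio(series, 10_000))
```

A ratio of observed to chance spread is one simple way to express the three cases named in the text (subnormal, normal, supernormal); other summaries would serve equally well.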
3. Origins of the Biometric School
Institutionally, the crucial development in the rise of modern methods of statistics was the formation of the biometric school under Karl Pearson at University College, London. The biometric program began as a statistical theory or model of natural processes, not a method of inference or estimation. Francis Galton, who pioneered the statistical study of heredity and evolution, derived his most basic conceptions from the tradition initiated by Quetelet. He was impressed particularly by the error curve, which Quetelet had applied to distributions of human measurements such as height and circumference of chest. Since Quetelet regarded the applicability of the error law as demonstrating the reality of his ‘average man,’ he minimized the conceptual novelty of this move. The mathematics, for him, showed that human variation really was akin to error. Galton strongly rejected this interpretation. The ‘normal law of variation,’ as he sometimes called it, was important for statistics precisely because it provided a way to go beyond mean values and to study the characteristics of exceptional individuals. The biometricians used distribution formulas also to study evolution, and especially eugenics, to which Galton and Pearson were both deeply committed (MacKenzie 1981). Galton attached statistics to nature by way of a particulate theory of inheritance, Darwin’s ‘Pangenesis.’ His statistical concepts were also, at least initially, biological ones: ‘reversion’ meant a partial return of ancestral traits; ‘regression’ a tendency to return to a
biological type; and ‘correlation’ referred to the tendency for measures of different bodily parts of one individual to vary in the same direction. By 1890, however, he recognized that his statistical methods were of general applicability, and not limited to biology. Pearson drew his statistical inspiration from Galton’s investigations, though he may well have required the mediation of Francis Edgeworth to appreciate their potential (Stigler 1986). Edgeworth, an economist and statistician, was, like Pearson, a skillful and even original mathematician of statistics, but no institution builder. Thus it was Pearson, above all, who created and put his stamp on this new field. Interpreting it in terms of his positivistic philosophy of science, he strongly emphasized the significance of correlation as more meaningful to science than causation. While he was committed above all to biometry and eugenics, he also gave examples from astronomy, meteorology, and education. Beginning in the mid-1890s, he attracted a continuing stream of postgraduate students, many but not all intending to apply statistics to biology. Among the first was George Udny Yule, an important statistician in his own right, who worked mainly on social and economic questions such as poverty. Pearson fought with many of his students, and also with other statisticians, most notably R. A. Fisher, and his reputation among modern-day statisticians has suffered on this account. Nevertheless, the 1890s were decisive for the history of statistical methods. Their transformation was not exclusively the work of the biometricians. This is also when the Norwegian A. N. Kiaer drew the attention of statisticians once again to the possibilities of sampling, which Arthur Bowley began to address mathematically. These trends in social statistics, along with the new mathematics of correlation and the chi-square test, were part of something larger.
This was the modern statistical project, institutionalized above all by Pearson: the development of general mathematical methods for managing error and drawing conclusions from data. See also: Bernoulli, Jacob I (1654–1705); Galton, Sir Francis (1822–1911); Laplace, Pierre-Simon, Marquis de (1749–1827); Pearson, Karl (1857–1936); Quetelet, Adolphe (1796–1874); Statistics, History of; Statistics: The Field
Bibliography
Brian E 1994 La mesure de l’État: Administrateurs et géomètres au XVIIIe siècle. Albin Michel, Paris
Daston L 1988 Classical Probability in the Enlightenment. Princeton University Press, Princeton, NJ
Desrosières A 1998 The Politics of Large Numbers: A History of Statistical Reasoning. Harvard University Press, Cambridge, MA
Gillispie C C 1997 Pierre-Simon Laplace, 1749–1827: A Life in Exact Science. Princeton University Press, Princeton, NJ
Hacking I 1975 The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference. Cambridge University Press, London, UK
Hacking I 1990 The Taming of Chance. Cambridge University Press, London, UK
Hald A 1998 A History of Mathematical Statistics from 1750 to 1930. Wiley, New York
Klein J 1997 Statistical Visions in Time: A History of Time-Series Analysis, 1662–1938. Cambridge University Press, Cambridge, UK
Krüger L, Daston L, Heidelberger M (eds.) 1987 The Probabilistic Revolution, Vol. 1: Ideas in History. MIT Press, Cambridge, MA
MacKenzie D 1981 Statistics in Britain, 1865–1930: The Social Construction of Scientific Knowledge. Edinburgh University Press, Edinburgh, UK
Matthews J R 1992 Mathematics and the Quest for Medical Certainty. Princeton University Press, Princeton, NJ
Porter T M 1986 The Rise of Statistical Thinking, 1820–1900. Princeton University Press, Princeton, NJ
Stigler S M 1986 The History of Statistics: The Measurement of Uncertainty Before 1900. Belknap Press of Harvard University Press, Cambridge, MA
Stigler S 1999 Statistics on the Table: The History of Statistical Concepts and Methods. Belknap Press of Harvard University Press, Cambridge, MA
T. M. Porter
Statistical Pattern Recognition
Statistical pattern recognition is concerned with the problem of using statistical techniques to design machines that can classify patterns of measurement data in useful ways. For example, such techniques have been successfully exploited in areas such as object recognition, speech recognition, fingerprint recognition, data analysis, signal processing, cognitive modeling, and artificial intelligence. After identifying key ‘features’ of information in the measurement data, a statistical pattern recognition machine uses its expectations about the ‘likelihood’ of those features and the ‘losses’ of making situation-specific decisions to select a classification action which minimizes an appropriate expected risk function. In many cases, the statistical pattern recognition machine must ‘learn’ through experience the relative likelihood of occurrence of patterns of features in its environment. Although typically viewed as a branch of artificial intelligence (see Decision Support Systems), statistical pattern recognition methods have been extensively and successfully applied to many behavioral modeling and data analysis problems (see Multidimensional Scaling in Psychology; Multivariate Analysis: Overview). The present entry discusses some of the essential features of the generic statistical pattern recognition problem (see Andrews 1972, Duda and Hart 1973, Golden 1996, Patrick 1972).
1. Generic Statistical Pattern Recognition Problem
As shown in Fig. 1, the generic statistical pattern recognition problem may be described in terms of four major components: feature selection and feature extraction, probabilistic knowledge representation, Bayes decision rule for classification, and a mechanism for learning.
1.1 Feature Selection and Extraction
The pattern of measurement data to be classified is represented as a measurement vector (i.e., a point in ‘measurement space’). The choice of the measurement space is primarily determined by problem-imposed considerations such as the types of transducers which are available and the types of patterns which must be processed. The particular pattern classification methodology embodied in the pattern recognition machine usually has little or no influence over the choice of the measurement space. Each measurement vector is then mapped into a feature vector (i.e., a ‘point’ in feature space) whose dimensions are primarily chosen to facilitate pattern classification by the pattern recognition machine. This process, which maps a measurement vector $s$ into a feature vector $x$, is called the feature selection and extraction stage of the pattern recognition process.
1.2 Probabilistic Knowledge Representation
Knowledge is implicitly represented in a statistical pattern recognition machine in several ways. The particular process for mapping points in measurement space into feature space (as well as the dimensions of the feature space) is one important knowledge source. However, the expected likelihood of observing a particular feature vector and the associated losses for making different situation-specific decisions are also critical knowledge sources (see Utility and Subjective Probability: Contemporary Theories). Specifically, let the symbol $\omega$ denote the pattern class. Thus, $\omega_1$ refers to the first pattern class (e.g., a set of photographs of ‘dogs’) that might have generated the observed feature vector $x$, while the symbol $\omega_2$ refers to a second competing pattern class (e.g., a set of photographs of ‘cats’) that might have generated $x$. The likelihood of observing a feature vector $x$ corresponding to a particular pattern class is expressed in terms of two types of probability distributions. The a priori class probability distribution $p(\omega_i)$ indicates the likelihood that a feature vector will be generated from the pattern class $\omega_i$ before the feature vector $x$ has been observed. The class conditional probability distribution $p(x \mid \omega_i)$ indicates the conditional likelihood of observing the feature vector $x$ given that the feature vector $x$ had in fact been generated from pattern class $\omega_i$.
Knowledge regarding the expected consequences of taking a particular classification action $\alpha_j$, given the feature vector came from pattern class $\omega_i$, is also exploited. Typically, the classification action takes the form of a decision indicating which of a finite number of pattern classes generated the feature vector. Formally, it is assumed that a loss $\lambda(\alpha_j \mid \omega_i)$ is incurred when classification action $\alpha_j$ is taken when the feature vector was in fact generated from pattern class $\omega_i$.
Figure 1 Flow chart diagram of a generic statistical pattern recognition machine. In this example, the sensor array (represented as a measurement vector) detects a pattern of light intensities produced from light reflecting off the head of a dog. A feature selection and extraction stage transforms the measurement vector into a feature vector. Using the feature vector and its probabilistic knowledge representation the pattern recognition machine selects the minimum expected risk decision (i.e., ‘Decide Dog’).
1.3 Bayes Decision Rule for Classification
The optimal decision rule for selecting a classification action $\alpha_j$ if the pattern class $\omega_i$ generated the feature vector simply requires that the classification action $\alpha^*$ which minimizes the loss for the pattern class $\omega_i$ be chosen (i.e., $\lambda(\alpha \mid \omega_i) \geq \lambda(\alpha^* \mid \omega_i)$ for all $\alpha$). In practice, however, the pattern class $\omega_i$ which actually generated the feature vector is unknown. The statistical pattern recognition machine only has access to the observed feature vector $x$ and its probabilistic knowledge base. One possible rational inductive decision making strategy is called the Bayes decision rule for classification; it forms the basis for classification decisions in most statistical pattern recognition algorithms. The Bayes decision rule for classification dictates that the pattern recognition machine should select the classification action $\alpha^*$ which minimizes the expected loss or conditional risk given feature vector $x$ is observed. Formally, the risk $R(\alpha \mid x)$ is defined by the formula:
$$R(\alpha \mid x) = \sum_{i=1}^{M} \lambda(\alpha \mid \omega_i)\, p(\omega_i \mid x) \qquad (1)$$
The important minimum probability of error case of Eqn. (1) arises when $\lambda(\alpha_k \mid \omega_i) = 1$ when $i \neq k$ and $\lambda(\alpha_k \mid \omega_i) = 0$ for $i = k$. In this case, Eqn. (1) becomes:
$$R(\alpha_k \mid x) = \sum_{i=1,\, i \neq k}^{M} p(\omega_i \mid x) = 1 - p(\omega_k \mid x) \qquad (2)$$
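The conditional risk in Eqn. (1) and its minimum probability of error special case in Eqn. (2) can be checked numerically. The following is a minimal sketch, assuming two pattern classes and invented posterior probabilities; the function names, the loss tables, and the numbers are hypothetical and serve only to show how the minimum-risk action is selected.

```python
def conditional_risk(loss_row, posterior):
    """R(alpha | x) = sum over i of lambda(alpha | omega_i) * p(omega_i | x)."""
    return sum(l * p for l, p in zip(loss_row, posterior))


def bayes_action(loss, posterior):
    """Return (index of the minimum-risk action, list of all risks).

    loss[j][i] is the loss for taking action j when class i is the true class.
    """
    risks = [conditional_risk(row, posterior) for row in loss]
    best = min(range(len(risks)), key=lambda j: risks[j])
    return best, risks


# Hypothetical posterior for the classes (dog, cat) after observing x.
posterior = [0.7, 0.3]

# Zero-one loss: the risk of action k reduces to 1 - p(omega_k | x).
zero_one_loss = [[0.0, 1.0],
                 [1.0, 0.0]]
map_action, map_risks = bayes_action(zero_one_loss, posterior)

# Hypothetical asymmetric loss: calling a cat a dog costs ten times more.
asymmetric_loss = [[0.0, 10.0],
                   [1.0, 0.0]]
cautious_action, cautious_risks = bayes_action(asymmetric_loss, posterior)

print(map_action, map_risks)
print(cautious_action, cautious_risks)
```

With the zero-one loss the more probable class is chosen, whereas the asymmetric loss shifts the decision to the less probable class because the costly error is weighted more heavily.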
The action $\alpha_k$ which minimizes the probability of error case of the risk function in Eqn. (1) is usually called a maximum a posteriori (MAP) estimate since $\alpha_k$ maximizes the a posteriori distribution $p(\omega_k \mid x)$ (i.e., the distribution obtained after observing the feature vector $x$). Now note that the risk $R(\alpha \mid x)$ in Eqn. (1) can be expressed in terms of the a priori class likelihoods and class conditional likelihoods using the formula:
$$p(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, p(\omega_i)}{p(x)} \qquad (3)$$
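Eqn. (3) can be evaluated directly once the class conditional likelihoods and a priori class probabilities are available. The sketch below assumes that $p(x)$ is obtained by summing $p(x \mid \omega_i)\, p(\omega_i)$ over all classes (the law of total probability, which the text does not spell out); the function name and the numerical values are hypothetical.

```python
def class_posteriors(likelihoods, priors):
    """Apply Bayes' theorem: p(omega_i | x) = p(x | omega_i) p(omega_i) / p(x)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)  # p(x), assumed to be the sum over all classes
    return [j / evidence for j in joint]


# Hypothetical values of p(x | dog) and p(x | cat) for one observed x.
likelihoods = [0.20, 0.05]
priors = [0.5, 0.5]
posterior = class_posteriors(likelihoods, priors)
print(posterior)
```

Since $p(x)$ is the same for every class, it rescales the posteriors without changing which class is most probable.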
Substituting this expression into Eqn. (1), it follows that the risk, $R(\alpha \mid x)$, for selecting action $\alpha$ given feature vector $x$ is observed can be expressed as:
$$R(\alpha \mid x) = \left(\frac{1}{p(x)}\right) \sum_{i=1}^{M} \lambda(\alpha \mid \omega_i)\, p(x \mid \omega_i)\, p(\omega_i) \qquad (4)$$
For computational reasons, some monotonically decreasing function of $R(\alpha_i \mid x)$ is usually exploited and referred to as a discriminant function $g_i(x)$. For example, a useful discriminant function for Eqn. (4) is:
$$g_i(x) = -\sum_{j=1}^{M} \lambda(\alpha_i \mid \omega_j)\, p(x \mid \omega_j)\, p(\omega_j) \qquad (5)$$
which does not require computation of $p(x)$. Such discriminant functions define ‘decision boundaries’ in feature space. For example, suppose that there are only two pattern classes with corresponding discriminant functions $g_1$ and $g_2$. In this situation, the feature vector $x$ is classified as a member of pattern class $\omega_1$ if $g_1(x) - g_2(x) > 0$ and a member of pattern class $\omega_2$ if $g_1(x) - g_2(x) < 0$. Thus, the ‘decision boundary’ between the two pattern classes in feature space is defined by the set of feature vectors which satisfy the equation: $g_1(x) - g_2(x) = 0$.
1.4 Learning as Bayes Estimation
Just as the classification problem involves selecting that classification action which minimizes an appropriate expected classification loss function, the learning problem can also be formulated within a minimum expected loss framework. Typically, the measurement vector, feature vector mapping from measurement space into feature space, a priori class distributions, and loss functions are known. Also the family of class conditional distributions or the ‘class conditional probability model’ is typically indexed by the values of some model parameter $\beta$. Given this latter assumption, a model is a set such that each element is a class conditional probability distribution $p(x \mid \omega; \beta)$. The goal of the learning problem is then reduced to the problem of selecting a particular value of the parameter $\beta$.
1.4.1 A Bayes risk function for learning.
Assume the measurement vector to feature vector mapping, classification decision loss function $\lambda$, and a priori class probabilities $\{p(\omega_1), \ldots, p(\omega_M)\}$ have been specified. In addition, assume a class conditional probability model defined by a set of class conditional distributions of the form $p(x \mid \omega; \beta)$. The learning problem then reduces to the problem of estimating a value of the parameter $\beta$ given a set of training data. Let $X_n = [x^{(1)}, \ldots, x^{(n)}]$ denote a set of $n$ feature vectors which define a set of training data. For expository reasons, simply assume $x^{(1)}, \ldots, x^{(n)}$ are observed values (realizations) of independent and identically distributed (i.i.d.) random vectors. Let $p(\beta)$ denote the a priori probability distribution of $\beta$ (i.e., the distribution of the parameter vector $\beta$ before the training data $X_n$ has been observed). Let $l(\beta, \beta^*)$ denote a loss function associated with the action for selecting a particular value of $\beta$ when the correct choice of the parameter $\beta$ is in fact $\beta^*$. A popular and general risk function for selecting parameter $\beta$ given the training data $X_n$ involves finding a $\beta$ that minimizes the Bayes risk function for learning (e.g., Patrick 1972 pp. 71–2) given by the formula:
$$E(\beta \mid X_n) = \int l(\beta, \beta^*)\, p(\beta^* \mid X_n)\, d\beta^* \qquad (6)$$
The quantity $p(\beta \mid X_n)$ (known as the a posteriori density or a posteriori mass function for parameter vector $\beta$) is not explicitly provided since it can be expressed in terms of $p(x \mid \omega; \beta)$ (the class conditional likelihood function), $p(\beta)$ (the a priori likelihood of $\beta$ which is not dependent upon the pattern recognition machine’s experiences), and the ‘training data’ $X_n$. It then follows from the i.i.d. assumption that:
$$p(\beta \mid X_n) = \frac{p(X_n \mid \beta)\, p(\beta)}{p(X_n)} = \frac{\left[\prod_{i=1}^{n} p(x^{(i)} \mid \beta)\right] p(\beta)}{p(X_n)} \qquad (7)$$
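The a posteriori distribution in Eqn. (7) can be explored numerically when the parameter is restricted to a finite grid of candidate values. The sketch below makes several assumptions that are not part of the original discussion: the observations are binary with $P(x = 1) = \beta$, the grid and the two priors are invented, and the function names are hypothetical. Because $p(X_n)$ does not depend on $\beta$, it is omitted, and logarithms are used to avoid multiplying many small numbers.

```python
import math


def log_likelihood(beta, data):
    """log p(X_n | beta) for i.i.d. binary observations with P(x = 1) = beta."""
    return sum(math.log(beta if x == 1 else 1.0 - beta) for x in data)


def ml_estimate(grid, data):
    """Grid value that maximizes the likelihood p(X_n | beta)."""
    scores = [log_likelihood(b, data) for b in grid]
    return grid[max(range(len(grid)), key=lambda i: scores[i])]


def map_estimate(grid, prior, data):
    """Grid value that maximizes log p(X_n | beta) + log p(beta)."""
    scores = [log_likelihood(b, data) + math.log(p) for b, p in zip(grid, prior)]
    return grid[max(range(len(grid)), key=lambda i: scores[i])]


grid = [k / 10 for k in range(1, 10)]            # candidate values 0.1 ... 0.9
uniform_prior = [1.0 / len(grid)] * len(grid)
skeptical_prior = [0.02] * 4 + [0.84] + [0.02] * 4  # most mass on beta = 0.5

small_sample = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]    # hypothetical: 8 ones in 10
large_sample = small_sample * 10                 # hypothetical: 80 ones in 100

print(ml_estimate(grid, small_sample))
print(map_estimate(grid, skeptical_prior, small_sample))
print(map_estimate(grid, skeptical_prior, large_sample))
```

With a uniform prior the two estimates coincide. With the concentrated prior they differ for the small sample, and the prior's influence fades as the sample grows.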
For expository reasons, assume $\beta$ is quantized so that it can take on only a finite number of values. Then minimizing Eqn. (6) using the loss function $l(\beta, \beta^*) = 1$ if $\beta \neq \beta^*$ and $l(\beta, \beta^*) = 0$ if $\beta = \beta^*$ corresponds to maximizing Eqn. (7) (i.e., computing the MAP estimate or most probable value of $\beta$). It can be shown that for many popular a priori and class conditional distribution functions, $p(\beta \mid X_n)$ can be well approximated by the ‘likelihood of the observed data,’ $p(X_n \mid \beta)$, for sufficiently large sample size $n$ (e.g., Golden 1996 pp. 299–301). Accordingly, a popular approach to parameter estimation known as maximum likelihood estimation involves maximizing the likelihood function $p(X_n \mid \beta)$ by selecting a parameter vector value, $\beta_n$, such that $p(X_n \mid \beta_n) \geq p(X_n \mid \beta)$ for all $\beta$.
1.4.2 Nonparametric and parametric learning algorithms.
Learning algorithms that make minimal assumptions regarding how the data was generated are referred to as nonparametric learning algorithms. For such algorithms, the parametric form of the class conditional distribution is often quite flexible. Nonparametric Parzen-window-type learning algorithms (e.g., Duda and Hart 1973, Patrick 1972) essentially count the observed occurrence of different values of the feature vector $x$. Learning algorithms that incorporate considerable prior knowledge about the data generating process are referred to as parametric learning algorithms. The classical Bayes classifier which assumes that the class conditional distributions have a Gaussian distribution is a good example of a parametric learning algorithm (e.g., Duda and Hart 1973, Patrick 1972). Nonparametric learning algorithms have the advantage of being relatively ‘unbiased’ in their analysis of the data but may require relatively large amounts of the right type of training data for effective learning. Parametric learning algorithms are ‘biased’ algorithms but if the right type of prior knowledge is ‘built-in’ to the learning algorithm then a parametric learning algorithm can exhibit superior learning from even poor quality training data.
Statistical pattern recognition algorithms which are typically viewed as ‘nonparametric’ can be transformed into parametric learning algorithms if a sufficient number of constraints upon the class conditional distributions are imposed. For example, polynomial discriminant functions (e.g., Duda and Hart 1973, Patrick 1972) and the closely related multilayer backpropagation neural network and perceptron type learning algorithms (e.g., Golden 1996; also see Perceptrons) can learn to approximate almost any desired form of the class conditional distribution function (given sufficiently many free parameters). Thus, such learning algorithms are best viewed as nonparametric statistical pattern recognition algorithms. On the other hand, prior knowledge in the form of constraints on the parametric form of the class conditional distributions can usually be introduced, resulting in parametric variations of such classical nonparametric learning algorithms (e.g., Golden 1996).
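The contrast between parametric and nonparametric estimation of class conditional distributions can be made concrete in one dimension. The following sketch is illustrative only: it assumes a scalar feature, a rectangular window for the Parzen-type estimate, a zero-one loss (so that the decision compares $p(x \mid \omega_i)\, p(\omega_i)$ across classes), and invented training data; all function names are hypothetical.

```python
import math


def fit_gaussian(samples):
    """Maximum likelihood mean and variance of a one-dimensional Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


def gaussian_density(x, mean, var):
    """Parametric estimate of p(x | class) under the Gaussian assumption."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def parzen_density(x, samples, width):
    """Nonparametric estimate: fraction of samples inside a window around x,
    divided by the window width."""
    inside = sum(1 for s in samples if abs(x - s) <= width / 2.0)
    return inside / (len(samples) * width)


def classify(x, densities, priors):
    """Choose the class with the largest p(x | class) * p(class)."""
    scores = [density(x) * prior for density, prior in zip(densities, priors)]
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical labeled training data for two pattern classes.
training = {0: [1.0, 1.2, 0.8, 1.1, 0.9], 1: [3.0, 3.3, 2.7, 3.1, 2.9]}
priors = [0.5, 0.5]

params = [fit_gaussian(training[c]) for c in (0, 1)]
parametric = [lambda x, m=m, v=v: gaussian_density(x, m, v) for m, v in params]
nonparametric = [lambda x, s=training[c]: parzen_density(x, s, 1.0) for c in (0, 1)]

print(classify(1.05, parametric, priors), classify(1.05, nonparametric, priors))
print(classify(2.9, parametric, priors), classify(2.9, nonparametric, priors))
```

The window estimate is zero wherever no training points fall inside the window, which is one way to see why such methods may need large amounts of suitable data, while the Gaussian fit extrapolates from only two estimated parameters per class.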
1.4.3 Supervised, unsupervised, and reinforcement learning algorithms.
Supervised, unsupervised, and reinforcement learning algorithms have the common goal of estimating the class conditional distribution from the training data. The essential difference among these different learning algorithms is the availability of information in the feature vector indicating which pattern class generated the feature vector. In supervised learning, each feature vector in the training data set contains explicit information regarding which pattern class generated that feature vector. In unsupervised learning, information regarding which pattern class generated a particular feature vector is not available in either the training data set or the test data set. In reinforcement learning, ‘hints’ about which pattern class generated a particular feature vector are provided either periodically or aperiodically throughout the learning process (Sutton and Barto 1998).
2. Applications
2.1 Artificial Intelligence
Statistical pattern recognition methods have been extensively applied in the field of artificial intelligence. Successful applications of these methods in the field of computer vision include extraction of low-level visual information from visual images, edge detection, extracting shape information from shading information, object segmentation, and object labeling (e.g., Chellappa and Jain 1993, Duda and Hart 1973). Statistical pattern recognition methods such as hidden Markov models play an important role in speech recognition algorithms and natural language understanding (Charniak 1993). Bayesian networks defined on directed acyclic graphs and the closely related Markov random field methods are being applied to problems in inductive inference (Chellappa and Jain 1993, Golden 1996, Jordan 1999). In the fields of robot control and neurocomputing, the general stochastic dynamical system control problem (see Optimal Control Theory) is naturally viewed within a statistical pattern recognition framework where the control signal corresponds to the pattern recognition machine’s ‘action’ (Golden 1996).
2.2 Behavioral Modeling
In an attempt to develop and refine theories of human behavior, some psychologists have compared the behavior of a class of statistical pattern recognition machines known as ‘connectionist’ algorithms with human behavior (see Golden 1996, and Ellis and Humphreys 1999, for some introductory reviews; also see Connectionist Models of Concept Learning; Connectionist Models of Development; Connectionist Models of Language Processing). Although such algorithms are typically not presented within a statistical pattern recognition context, they can usually be naturally interpreted as statistical pattern recognition machines (Golden 1996).
2.3 Data Analysis
Increasingly sophisticated methods for discovering complex structural regularities in large data sets are required in the social and behavioral sciences. Many classical statistical pattern recognition techniques such as factor analysis, principal component analysis, clustering analysis, and multidimensional scaling techniques have been successfully used (see Multidimensional Scaling in Psychology; Multivariate Analysis: Overview). More sophisticated statistical pattern recognition methods such as artificial neural networks (see Artificial Neural Networks: Neurocomputation; Connectionist Approaches) and graphical statistical models will form the basis of increasingly important tools for detecting structural regularities in data collected by social and behavioral scientists.
See also: Categorization and Similarity Models; Classification: Conceptions in the Social Sciences; Configurational Analysis; Feature Representations in Cognitive Psychology; Letter and Character Recognition, Cognitive Psychology of; Monte Carlo Methods and Bayesian Computation: Overview; Multivariate Analysis: Classification and Discrimination; Object Recognition: Theories; Signal Detection Theory: Multidimensional; Spatial Pattern, Analysis of
Bibliography
Andrews H C 1972 Introduction to Mathematical Techniques in Pattern Recognition. Wiley, New York
Charniak E 1993 Statistical Language Learning. MIT Press, Cambridge, MA
Chellappa R, Jain A 1993 Markov Random Fields: Theory and Application. Academic Press, New York
Chen C H, Pau L F, Wang P S P 1993 Handbook of Pattern Recognition & Computer Vision. World Scientific, River Edge, NJ
Duda R O, Hart P E 1973 Pattern Classification and Scene Analysis. Wiley, New York
Ellis R, Humphreys G 1999 Connectionist Psychology: A Text with Readings. Psychology Press, East Sussex, UK
Golden R M 1996 Mathematical Methods for Neural Network Analysis and Design. MIT Press, Cambridge, MA
Jordan M I 1999 Learning in Graphical Models. MIT Press, Cambridge, MA
Nilsson N J 1965 Learning Machines. McGraw-Hill, New York
Patrick E A 1972 Fundamentals of Pattern Recognition. Prentice-Hall, Englewood Cliffs, NJ
Schervish M J 1995 Theory of Statistics. Springer-Verlag, New York
Sutton R S, Barto A G 1998 Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA
R. M. Golden
Statistical Sufficiency
In the language of statistical theory, a statistic is a function of a set of data, and a sufficient statistic contains as much information about the statistical model as the original set of data. Statistical sufficiency has served as a powerful concept in the theory of inference in helping to clarify the role of models and data, and in providing a framework for data reduction.
1. Introduction
Statistical sufficiency is a concept in the theory of statistical inference that is meant to capture an intuitive notion of summarizing a large and possibly complex set of data by relatively few summary numbers that carry the relevant information in the larger data set. For example, a public opinion poll might sample one or two thousand people, but will typically report simply the number of respondents and the percentage of respondents in each of a small number of categories. A study to investigate the effectiveness of a new treatment for weight loss may well report simply the average weight loss of a group of patients receiving the new treatment. This study would usually report as well some measure of the range of weight loss observed; variability in this setting would be a crucial piece of information for evaluating the effectiveness of the new treatment.
In order to make precise the vague notion of an informative data summary, it is necessary to have a means of defining what we mean by relevant information. This in turn is highly dependent on the use we wish to make of the data. The formal mechanism for doing this is to use a statistical model to represent a (usually idealized) version of the problem we are studying. Statistical sufficiency and information are defined relative to this model. A statistical model is a family of probability distributions, the central problem of statistical inference being to identify which member of the family generated the data currently of interest. The basic concepts of statistical models and sufficiency were set out in Fisher (1922, 1925) and continue to play a major role in the theory of inference.
For example, we might in an idealized version of the weight-loss problem assume that the potential weight loss for any particular patient can be described by a model in the family of normal distributions.
This specifies a mathematical formula for the probability that the weight loss of an individual patient falls in any possible interval, and the mathematical formula depends on two unknown parameters, the mean and standard deviation. If we are willing to further assume that the same statistical model applies to all the potential recipients of the weight-loss treatment, then the data collected in a careful study will enable us to
make an inference on the two parameters that depends only on two data summaries: the average weight loss in the data and the standard deviation of the weight loss in the data. The normal distribution is also referred to as the ‘bell curve,’ but there is in fact a family of bell curves, each differing from the others by a simple change of mean, which fixes the location of the bell curve, and standard deviation, which fixes the scale of the bell curve. Standardized tests for measuring IQ are designed so that the scores follow a bell-curve model with parameter values for the mean and standard deviation of 100 and 15, respectively. The parameters for the potential weight loss under our hypothetical new treatment are unknown, and the role of the data collected on the treatment is to shed some light on their likely values. Should the weight loss bell curve turn out to be centered at or near zero, one would be wise not to invest too heavily in the new company’s stock just yet. As long as we are willing to accept the bell-curve model for the weight loss data, all the information about that model is contained in two numbers: the average and the standard deviation of the weight loss recorded in our data set. An idealized model that lies behind most media reporting of polls is a Bernoulli or binomial model, in which we assume that potential respondents will choose one of two categories with a fixed but unknown probability, that respondents choose categories independently of each other, and with the same, constant probability. In such an idealized version of the problem, we have complete information about this unknown probability from the knowledge of the number of responses and the proportion of responses in each of the two categories. These two examples are by construction highly idealized, and it is easy to think quickly of a number of complexities that have been overlooked.
An educated reader of a report on the weight-loss study would want to know how the patients were selected to receive the new treatment, whether the patients were in any way representative of a population of interest to that reader, whether any comparison has been made with alternative, possibly conventional, treatments, the length of time over which the study was conducted, and so on. Public opinion polls typically offer more than two categories of response, even to a ‘yes/no’ question, including something equivalent to ‘don’t know,’ or ‘won’t say,’ and also often break down the results by gender, geographic area, or some other classification. Randomized selection of respondents by computerized telephone polls is very different to a ‘reader response’ survey often conducted by magazines and media outlets. These and a wide variety of other issues are studied in the fields of experimental design and sample survey design, by statisticians and researchers in a variety of subject matter areas, from economics to geophysics. See also Observational Studies: Overview and Experimental Design: Overview.
Statistical sufficiency is concerned with a much narrower issue, that of formalizing the notion that for a given family of models, a complex set of data can be summarized by a smaller collection of numbers, and lead to the same statistical inference. The following sections describe statistical sufficiency with reference to families of models and related concepts in the theory of statistics. Section 2 provides the main definitions and examples, Sect. 3 some related concepts, and Sect. 4 some more advanced topics.
2. Statistical Models and Sufficiency

2.1 Basic Notation

We describe a statistical model by a random variable $Y$ and a family of probability distributions for $Y$, $\mathcal{F}$. A random variable is a variable that takes values on a sample space, and a probability distribution describes the probability of observing a particular value or range of values in the sample space. The sample space is in a great many applications either discrete, taking values on a countable set such as (some subset of) the nonnegative integers, or continuous, taking values on (some subset of) the real line $\mathbb{R}$ or $p$-dimensional Euclidean space $\mathbb{R}^p$. A typical data set will be a number of realizations of $Y$, and the goal is to use these observations to deduce which member of $\mathcal{F}$ generated these realizations. The members of $\mathcal{F}$ are often conveniently indexed by the probability mass function or probability density function $f(y)$, where

$$f(y) = \mathrm{Prob}(Y = y) \tag{1}$$

if the sample space is discrete, and

$$f(y) = \lim_{h \to 0} h^{-1}\{\mathrm{Prob}(Y \le y + h) - \mathrm{Prob}(Y \le y)\} \tag{2}$$

if the sample space is continuous. If $Y$ is a vector then $\mathrm{Prob}(Y \le y)$ is defined component-wise, i.e. $\mathrm{Prob}(Y_1 \le y_1, \ldots, Y_p \le y_p)$.

Very often the mathematical form of the probability distributions in $\mathcal{F}$ is determined up to a finite number of unknown constants called parameters, so we write $\mathcal{F}_\theta = \{f(\cdot\,;\theta)\}$, or more precisely $\mathcal{F}_\theta = \{f(\cdot\,;\theta);\ \theta \in \Theta\}$, where $\Theta$ is a set of possible values for $\theta$, such as the real numbers $\mathbb{R}$ or real-valued vectors of fixed length $p$, $\mathbb{R}^p$, or $[0, 1]$, and so on. The problem of using data to make an inference about the probability distribution is now the problem of making inference about the parameter $\theta$, and is referred to as parametric inference. Sufficiency also plays a role in nonparametric inference, and we illustrate this at the end of this section. We assume in what follows that the random variable $Y = (Y_1, \ldots, Y_n)$ is a vector of length $n$ with observed
value $y = (y_1, \ldots, y_n)$. A statistic is a function $T = t(Y)$ whose distribution can be computed from the distribution of $Y$. Examples of statistics include the mean, $T_1 = n^{-1}\sum Y_i$, the variance $T_2 = (n-1)^{-1}\sum (Y_i - T_1)^2$, the standard deviation $T_3 = T_2^{1/2}$, the range $T_4 = \max Y_i - \min Y_i$, and so on. The probability distribution of any statistic is obtained from the probability distribution for $Y$ by a mathematical calculation.

2.2 Sufficiency and Minimal Sufficiency

A statistic $T = t(Y)$ is sufficient for $\theta$ in the family of models $\mathcal{F}_\theta = \{f(y;\theta);\ \theta \in \Theta\}$ if and only if its conditional distribution

$$f_{Y \mid T}(y \mid t) \tag{3}$$

does not depend on $\theta$.

Example 1. Suppose $Y_1, \ldots, Y_n$ are independent and identically distributed with probability mass function

$$f(y_i;\theta) = \theta^{y_i}(1-\theta)^{1-y_i}, \quad y_i = 0, 1;\ \theta \in [0, 1] \tag{4}$$

which can be used to model the number of successes in $n$ independent Bernoulli trials, when the probability of success is $\theta$. The joint density for $Y = (Y_1, \ldots, Y_n)$ is

$$f(y;\theta) = \prod_{i=1}^{n} \theta^{y_i}(1-\theta)^{1-y_i} = \theta^{\sum y_i}(1-\theta)^{n-\sum y_i} \tag{5}$$

and the marginal density for $T = \sum Y_i$ is

$$f(t;\theta) = \sum_{y:\,\sum y_i = t} \theta^{\sum y_i}(1-\theta)^{n-\sum y_i} = \theta^{t}(1-\theta)^{n-t}\,c(t, n) \tag{6}$$

where $c(t, n) = n!/\{t!(n-t)!\}$ is the number of vectors $y$ of length $n$ with $t$ ones and $n-t$ zeroes. The conditional density of $Y$, given $T$, is the ratio of (5) and (6) and is thus free of $\theta$.

The intuitive notion that the definition is meant to encapsulate is that once the sufficient statistic $T$ is observed, i.e. once we condition on $T$, no further information about $\theta$ is available from the original data $Y$. The definition of sufficiency in terms of the conditional distribution is not very convenient to work with, especially in the case of continuous sample spaces where some care is needed in the definition of a conditional distribution. The following result is more helpful.

Factorization theorem. A statistic $T = t(Y)$ is sufficient for $\theta$ in the family of models $\mathcal{F}_\theta = \{f(y;\theta);\ \theta \in \Theta\}$ if and only if there exist functions $g(t;\theta)$ and $h(y)$ such that for all $\theta \in \Theta$

$$f(y;\theta) = g(t;\theta)\,h(y) \tag{7}$$

Example 2. Let $Y_1, \ldots, Y_n$ be independent, identically distributed from the normal distribution with mean $\mu$ and standard deviation $\sigma$:

$$f(y_i;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{1}{2\sigma^2}(y_i-\mu)^2\right\} \tag{8}$$

The joint density of $Y = (Y_1, \ldots, Y_n)$ is

$$f(y;\mu,\sigma) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{n}\exp\left\{-\frac{1}{2\sigma^2}\sum (y_i-\mu)^2\right\} = \frac{1}{\sigma^{n}}\exp\left\{-\frac{1}{2\sigma^2}\sum (y_i-\bar{y})^2 - \frac{n}{2\sigma^2}(\bar{y}-\mu)^2\right\}\left(\frac{1}{\sqrt{2\pi}}\right)^{n} \tag{9}$$
where $\bar{y} = n^{-1}\sum y_i$ is the sample mean, $\theta = (\mu, \sigma)$, and $\Theta = \mathbb{R} \times \mathbb{R}^{+}$. From (9) we see that $T = \{\bar{Y}, \sum (Y_i-\bar{Y})^2\}$ is sufficient for $\theta$, with $g(t;\theta)$ identified with all but the final factor in (9).

Strictly speaking, there are a number of other sufficient statistics in both the examples above. In Example 2, $\left(\sum_{i=1}^{4} Y_i,\ \sum_{i=5}^{n} Y_i,\ \sum_{i=1}^{7} (Y_i-\bar{Y})^2,\ \sum_{i=8}^{n} (Y_i-\bar{Y})^2\right)$ is also sufficient, as is, trivially, the original vector $Y$ itself. Clearly $\left(\bar{Y}, \sum_{i=1}^{n} (Y_i-\bar{Y})^2\right)$ is in some sense ‘smaller’ than these other sufficient statistics, and to be preferred for that reason. A minimal sufficient statistic is defined to be a function of every other sufficient statistic. One-to-one functions of the minimal sufficient statistic are also minimal sufficient and are not distinguished, so in Example 2 $\left(\sum_{i=1}^{n} Y_i, \sum_{i=1}^{n} Y_i^2\right)$ is also minimal sufficient. Any statistic defines a partition of the sample space in which all sample points leading to the same value of the statistic fall in the same partition, and the minimal sufficient statistic defines the coarsest possible partition.

Since the central problem of parametric statistical inference is to reason from the data back to the parameter $\theta$, extensive use is made of the likelihood function, which is (proportional to) the joint density of the data, regarded as a function of $\theta$:

$$L(\theta; y) = c(y)\,f(y;\theta) \tag{10}$$
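The claim in Example 1, that the conditional distribution of $Y$ given $T = \sum Y_i$ is free of $\theta$, can be checked by brute-force enumeration for a small $n$. The following is a minimal sketch rather than part of the original exposition; the function names `joint_pmf` and `conditional_given_total` are illustrative choices, and the values of $n$, $t$, and $\theta$ are arbitrary.

```python
from itertools import product
from math import comb, isclose

def joint_pmf(y, theta):
    """Joint probability (5) of a 0/1 vector y under success probability theta."""
    t = sum(y)
    return theta ** t * (1 - theta) ** (len(y) - t)

def conditional_given_total(n, t, theta):
    """P(Y = y | sum(Y) = t) for every 0/1 vector y of length n with t ones."""
    support = [y for y in product((0, 1), repeat=n) if sum(y) == t]
    marginal = sum(joint_pmf(y, theta) for y in support)  # equation (6)
    return {y: joint_pmf(y, theta) / marginal for y in support}

n, t = 5, 2  # arbitrary illustrative values
for theta in (0.1, 0.5, 0.9):
    cond = conditional_given_total(n, t, theta)
    # Each arrangement with t ones has probability 1 / c(t, n), whatever theta is.
    assert all(isclose(p, 1 / comb(n, t)) for p in cond.values())
    print(theta, sorted(set(round(p, 12) for p in cond.values())))
```

The conditional probabilities are the same for every value of $\theta$ tried, which is what the ratio of (5) to (6) predicts.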
The factorization theorem suggests, and it can be proved, that the minimal sufficient statistic in the family of models $\mathcal{F}_\theta$ is the likelihood statistic $L(\cdot\,; Y)$. This result is easily derived from the factorization theorem if $\Theta$ is a finite set, but extending it to cases of more practical interest involves some mathematical intricacies (Barndorff-Nielsen 1978, Chap. 4).

Example 3. To illustrate the concept of sufficiency in a nonparametric setting, suppose that $Y_1, \ldots, Y_n$ are independent, identically distributed on $\mathbb{R}$ with probability density function $f$; i.e. $\mathcal{F}$ consists of all probability distributions which have densities with respect to Lebesgue measure. Denote by $Y_{(\cdot)}$ the vector of ordered values $Y_{(\cdot)} = (Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)})$ (see Order Statistics). Then $Y_{(\cdot)}$ is sufficient for $\mathcal{F}$, as

$$f_{Y \mid Y_{(\cdot)}}(y \mid y_{(\cdot)}) = \begin{cases} \dfrac{1}{n!} & \text{if } y \text{ is a permutation of } y_{(\cdot)} \\ 0 & \text{otherwise} \end{cases} \tag{11}$$

is the same for all members of $\mathcal{F}$.

Example 4. Let $Y_1, \ldots, Y_n$ be independent, identically distributed according to the uniform distribution on $(\theta-1, \theta+1)$, where $\theta \in \mathbb{R}$. The order statistic $Y_{(\cdot)}$ is sufficient, as this is a special case of Example 3 above, but is not minimal sufficient. The minimal sufficient statistic is determined from the likelihood function:

$$L(\theta; y) = \begin{cases} \prod_{i=1}^{n} \tfrac{1}{2} & \text{if } y_{(n)} - 1 \le \theta \le y_{(1)} + 1 \\ 0 & \text{otherwise} \end{cases} \tag{12}$$

from which we see that the minimal sufficient statistic is $T = (Y_{(1)}, Y_{(n)})$, as this is a one-to-one function of the likelihood statistic.
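The dependence of the likelihood in Example 4 on the data only through the smallest and largest observations can be seen numerically. The sketch below is an illustration under stated assumptions: `uniform_likelihood` is a hypothetical helper implementing (12) for the uniform distribution on $(\theta-1, \theta+1)$, and the two samples are made-up values chosen to share the same minimum and maximum.

```python
def uniform_likelihood(theta, y):
    """Likelihood (12) for U(theta - 1, theta + 1): 2**-n on the admissible interval, else 0."""
    n = len(y)
    if max(y) - 1 <= theta <= min(y) + 1:
        return 0.5 ** n
    return 0.0

# Two made-up samples with different interior points but the same min and max.
sample_a = [0.2, 0.9, 1.4, 0.5]
sample_b = [0.2, 0.3, 1.4, 1.1]

grid = [i / 10 for i in range(-10, 31)]
# The likelihood functions coincide at every theta on the grid.
assert all(uniform_likelihood(th, sample_a) == uniform_likelihood(th, sample_b) for th in grid)

# The likelihood is positive only for theta between y_(n) - 1 and y_(1) + 1.
support = [th for th in grid if uniform_likelihood(th, sample_a) > 0]
print(min(support), max(support))
```

Changing an interior observation leaves the likelihood untouched, whereas changing the minimum or maximum moves the interval on which it is positive.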
2.3 Exponential Families

A random vector $Y \in \mathbb{R}^p$ is said to be distributed according to an exponential family of dimension $k$ if the probability mass or density function can be expressed in the form

$$f(y;\theta) = \exp\{\varphi(\theta)^{T} t(y) - c(\theta) - d(y)\} \tag{13}$$

for some functions $\varphi(\theta) = \{\varphi_1(\theta), \ldots, \varphi_k(\theta)\}$, $t(y) = \{t_1(y), \ldots, t_k(y)\}$, $c(\theta)$ and $d(y)$.

Many common distributions can be expressed in exponential family form, including Example 1 with $\varphi(\theta) = \log\{\theta/(1-\theta)\}$, $t(y) = \sum y_i$ and Example 2 with

$$\varphi(\theta) = \left(\frac{\mu}{\sigma^2},\ -\frac{1}{2\sigma^2}\right), \qquad t(y) = \left(\sum y_i,\ \sum y_i^2\right)$$

Other examples of exponential families include the Poisson, geometric, negative binomial, multinomial, exponential, gamma, and inverse Gaussian distributions. Generalized linear models are regression models built on exponential families that have found wide practical application. See Dobson (1990) or McCullagh and Nelder (1989) and Analysis of Variance and Generalized Linear Models.

From (13) we can see that exponential families are closed under independent, identically distributed sampling, and that the minimal sufficient statistic for $\theta$ is the $k$-dimensional vector $T = \sum t(Y_i)$. Exponential families are said to permit a sufficiency reduction under sampling; we can replace the original $n$ observations by a vector of length $k$. Possibly for this reason exponential families are used widely as models in applied work. A partial converse of this is also true: if a finite dimensional minimal sufficient statistic exists in independent sampling from the same density, and the range of values over which the density is positive does not depend on $\theta$, then that density must be of exponential family form.

Versions of the material in this section appear in most mathematical statistics textbooks. Suggested references, roughly in order of increasing difficulty, are Azzalini (1996), Casella and Berger (1990), Lehmann and Casella (1998), Cox and Hinkley (1974), Pace and Salvan (1997), and Barndorff-Nielsen and Cox (1994).
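The sufficiency reduction for the normal model of Example 2 can be illustrated by computing the log-likelihood twice: once from all $n$ observations, and once from $n$ together with the two sums $\sum y_i$ and $\sum y_i^2$. This is a sketch under assumptions: the function names are illustrative, the data are made up, and the per-observation terms $c(\theta) = \mu^2/(2\sigma^2) + \log\sigma$ and $d(y) = \tfrac{1}{2}\log(2\pi)$ are one way of writing the normal density in the form (13), obtained by expanding the square in (8).

```python
import math

def loglik_direct(y, mu, sigma):
    """Sum of the log densities (8) over all n observations."""
    return sum(-0.5 * math.log(2 * math.pi) - math.log(sigma)
               - (yi - mu) ** 2 / (2 * sigma ** 2) for yi in y)

def loglik_from_summary(n, s1, s2, mu, sigma):
    """Same log-likelihood in exponential family form, using only
    n, s1 = sum(y_i) and s2 = sum(y_i**2)."""
    phi = (mu / sigma ** 2, -1 / (2 * sigma ** 2))     # natural parameter
    c = mu ** 2 / (2 * sigma ** 2) + math.log(sigma)   # c(theta), one observation
    d = 0.5 * math.log(2 * math.pi)                    # d(y), one observation
    return phi[0] * s1 + phi[1] * s2 - n * c - n * d

y = [1.2, -0.4, 3.1, 0.7, 2.2]  # made-up data
n, s1, s2 = len(y), sum(y), sum(v * v for v in y)

for mu, sigma in [(0.0, 1.0), (1.5, 0.8), (-2.0, 3.0)]:
    full = loglik_direct(y, mu, sigma)
    reduced = loglik_from_summary(n, s1, s2, mu, sigma)
    assert math.isclose(full, reduced, rel_tol=1e-9, abs_tol=1e-9)
    print(mu, sigma, round(full, 6))
```

Once the two sums are stored, the individual observations are not needed to evaluate the likelihood at any $(\mu, \sigma)$.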
3. Some Related Concepts

3.1 Ancillary Statistics

A companion concept for the reduction of data to a number of smaller quantities is that of ancillarity. A statistic $A = a(Y)$ is said to be ancillary for $\theta$ in the family of models $\mathcal{F}_\theta$ if the marginal distribution of $A$,

$$f_A(a) = \int_{\{y:\, a(y) = a\}} f(y;\theta)\, dy \tag{14}$$

is free of $\theta$. If every other ancillary statistic is a function of a given ancillary statistic, the given statistic is a maximal ancillary statistic.

Example 4 (continued). In independent sampling from the $U(\theta-1, \theta+1)$ distribution, $T = (Y_{(1)}, Y_{(n)})$ is a minimal sufficient statistic, and $A = Y_{(n)} - Y_{(1)}$ is a maximal ancillary statistic.

If an ancillary statistic exists, all the information about $\theta$ is in the conditional distribution of $Y$, given the ancillary statistic. If a minimal sufficient statistic exists, all the information about $\theta$ is in the marginal distribution of $T$. These two data reductions, to $A$ or to $T$, are complementary, and can sometimes be combined.

Example 5. Let $Y_1, \ldots, Y_n$ be independent, identically distributed from the one parameter location family $f(y_i;\theta) = f_0(y_i-\theta)$, $\theta \in \mathbb{R}$, where $f_0(\cdot)$ is known. Then the vector of residuals $A = (Y_1-\bar{Y}, \ldots, Y_n-\bar{Y})$ is ancillary. An extension of this to the location-scale family is possible: let $f(y_i;\theta) = \sigma^{-1} f_0\{\sigma^{-1}(y_i-\mu)\}$, then

$$A = \{(Y_1-\bar{Y})/S, \ldots, (Y_n-\bar{Y})/S\} \tag{15}$$

where $S^2 = \sum (Y_i-\bar{Y})^2$, is a maximal ancillary. The normal distribution of Example 2 is a location-scale model, with minimal sufficient statistic $T = (\bar{Y}, S^2)$, which in this case is independent of $A$.

The importance for the theory of statistics of concepts of dimension reduction such as ancillarity and sufficiency is that inference for a $k$-dimensional parameter is encapsulated by a probability distribution on a $k$-dimensional space. This immediately provides tests of significance, confidence intervals, and point estimates for the unknown parameter. From this point of view it now appears that ancillarity is possibly a more central notion to the theory of statistics, although sufficiency is regarded widely as more natural and intuitive. It should be noted that inference from a Bayesian point of view automatically provides a $k$-dimensional distribution for inference for a $k$-dimensional parameter, which is the posterior distribution given the observed data $y$. This is achieved at the expense of constructing a probability model for the unknown parameter $\theta$. See Bayesian Statistics.
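One way to see why the residuals of Example 5 and the range of Example 4 are ancillary is to write each observation as $Y_i = \theta + \varepsilon_i$, where the $\varepsilon_i$ come from the known density and do not involve $\theta$. The sketch below uses a fixed, made-up vector of errors (an assumption for illustration, not a random draw) and shows that the computed value of each statistic is the same for every location $\theta$, so its distribution cannot depend on $\theta$.

```python
def residuals(y):
    """Residual vector A = (y_1 - ybar, ..., y_n - ybar) from Example 5."""
    ybar = sum(y) / len(y)
    return [v - ybar for v in y]

def sample_range(y):
    """Range y_(n) - y_(1), the ancillary statistic of Example 4 (continued)."""
    return max(y) - min(y)

# Hypothetical fixed errors standing in for draws from the known density f_0.
errors = [0.31, -1.20, 0.85, 0.02, -0.47]

for theta in (0.0, 2.5, -10.0):
    y = [theta + e for e in errors]  # data generated at location theta
    # Shifting the location changes y but leaves both statistics unchanged
    # (up to floating-point rounding).
    assert all(abs(a - b) < 1e-9 for a, b in zip(residuals(y), residuals(errors)))
    assert abs(sample_range(y) - sample_range(errors)) < 1e-9
    print(theta, [round(a, 3) for a in residuals(y)], round(sample_range(y), 3))
```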
3.2 Asymptotic Sufficiency

The normal distribution is used widely for modelling data from a variety of sources, but is also very important in the theory of inference in providing approximate inferences when exact inferences may be unavailable. While the likelihood function is equivalent to the minimal sufficient statistic, a special role in approximate inference is played by some functions computed from the likelihood function. The maximum likelihood estimator $\hat{\theta}$ of $\theta$ in the model $f(y;\theta)$ is defined by

$$\sup_{\theta} L(\theta; y) = L(\hat{\theta}; y) \tag{16}$$

and the observed information function for $\theta$ is

$$j(\theta) = -\partial^2[\log\{L(\theta; y)\}]/\partial\theta\,\partial\theta' \tag{17}$$

Under regularity conditions on the family of models for $Y = (Y_1, \ldots, Y_n)$ we have the following asymptotic result for a scalar parameter $\theta$ as $n \to \infty$:

$$(\hat{\theta}-\theta_0)\{j(\theta_0)\}^{1/2} \xrightarrow{d} N(0, 1) \tag{18}$$

which means that for sufficiently large $n$ the distribution of $\hat{\theta}-\theta_0$ is arbitrarily close to a normal distribution with mean zero and variance $j^{-1}(\theta_0)$, under sampling from $f(y;\theta_0)$. A similar result is available for vector $\theta$. The maximum likelihood estimator has by construction the same dimension as the unknown parameter $\theta$, so (18) provides a means of constructing an inference for $\theta$ using the normal approximation.

This suggests that a notion of asymptotic sufficiency could be formalized, the maximum likelihood estimator having this property. This formalization relies on showing that the likelihood function is asymptotically of normal form, with minimal sufficient statistic $\hat{\theta}$; see Fraser (1976, Chap. 8). The argument is very similar to that showing that the posterior distribution for $\theta$ is asymptotically normal. More generally, in models with smooth likelihood functions, a Taylor series expansion of the likelihood function around $\hat{\theta}$ suggests a refinement of asymptotic sufficiency: the set of statistics $\hat{\theta}$ and second and higher order derivatives of the likelihood function evaluated at $\hat{\theta}$ form a set of approximate sufficient statistics in a certain sense (Barndorff-Nielsen and Cox 1994, Chap. 7).

In general models the maximum likelihood estimator is not the minimal sufficient statistic, although it is always a (many-to-one) function of the minimal sufficient statistic. Much of the recent development of the parametric theory of inference has been concerned with combining sufficiency or approximate sufficiency with ancillarity or an appropriate notion of approximate ancillarity, to effect a dimension reduction with minimal loss of information. In general models if the maximum likelihood estimator is used in place of the minimal sufficient statistic, information is lost. The idea of recovering this information, via conditioning on an ancillary statistic, was put forth in Fisher (1934). With some exceptions this idea was not exploited fully in the statistical literature until about 1980, when it was discovered that an asymptotic version of Fisher’s argument holds in very general situations. The subsequent 20 years saw continued development and refinement of Fisher’s original idea. A brief overview is given in Reid (2000), and the books Barndorff-Nielsen and Cox (1994) and Pace and Salvan (1997) give a detailed account.
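For the Bernoulli model of Example 1 the quantities in (16) and (17) have closed forms, which makes a small numerical illustration possible. The sketch below is an assumption-laden example rather than a general recipe: the counts $n = 200$ and $t = 58$ are made up, the observed information is evaluated at $\hat{\theta}$ in place of the unknown $\theta_0$ in (18), and 1.96 is used as the approximate 97.5 percent point of the standard normal distribution.

```python
import math

def loglik(theta, t, n):
    """Bernoulli log-likelihood from (5), written in terms of t = sum(y_i)."""
    return t * math.log(theta) + (n - t) * math.log(1 - theta)

def observed_information(theta, t, n):
    """j(theta) = -d^2 log L / d theta^2, equation (17), for the Bernoulli model."""
    return t / theta ** 2 + (n - t) / (1 - theta) ** 2

n, t = 200, 58                  # made-up data summary
theta_hat = t / n               # closed-form maximizer of loglik for this model
j_hat = observed_information(theta_hat, t, n)

# Finite-difference check of the second derivative at theta_hat.
h = 1e-4
fd = -(loglik(theta_hat + h, t, n) - 2 * loglik(theta_hat, t, n)
       + loglik(theta_hat - h, t, n)) / h ** 2
assert math.isclose(fd, j_hat, rel_tol=1e-4)

# Approximate 95% interval based on (18), with j evaluated at theta_hat.
half_width = 1.96 / math.sqrt(j_hat)
print(theta_hat - half_width, theta_hat + half_width)
```

Only $t$ and $n$ enter the calculation, so the approximate inference is again a function of the sufficient statistic.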
3.3 Inference in the Presence of Nuisance Parameters

Many statistical models have vector valued parameters $\theta$, but often some components or scalar valued functions of $\theta$ are of particular interest. For example, in evaluating the hypothetical new weight-loss treatment under a normal model, the mean parameter $\mu$ is in the first instance of more interest than the standard deviation $\sigma$. In studying results from a poll we may wish to introduce parameters describing various aspects of the population, such as gender, age group, income bracket, and so on, but the main purpose of the poll is to construct an inference about the proportion likely to vote for our candidate, for example. The idealized modelling of these types of situations is to describe $\theta$ as consisting of two components $\theta = (\psi, \lambda)$, of parameters of interest, $\psi$, and nuisance parameters, $\lambda$. An extension of this would be to define $\psi = g(\theta)$ as determined by a set of constraints on $\theta$, but only the component version will be considered here.

There are in the literature several different definitions of sufficiency for a parameter of interest $\psi$ in the presence of a nuisance parameter $\lambda$. To illustrate some of the issues we consider again the normal distribution with $\theta = (\mu, \sigma)$, $T = (\bar{Y}, S^2)$. We have

$$f_T(t;\mu,\sigma) = f_{\bar{Y}}(\bar{y};\mu,\sigma)\, f_{S^2 \mid \bar{Y}}(s^2;\sigma^2) = f_{\bar{Y}}(\bar{y};\mu,\sigma)\, f_{S^2}(s^2;\sigma^2) \tag{19}$$

where $f_{\bar{Y}}(\cdot)$ is the density of a normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, and $f_{S^2}(\cdot)$ is proportional to that of a chi-squared distribution on $n-1$ degrees of freedom. As the conditional distribution of $S^2$, given $\bar{Y}$, is free of $\mu$, we could define $\bar{Y}$ as sufficient for $\mu$, by analogy to (3). However it is not sufficient for $\mu$ in the sense of providing everything we need for inference about $\mu$, since its distribution depends also on $\sigma$. In this latter sense $S^2$ is sufficient for $\sigma$.

There have been a number of attempts to formalize the notions of partial sufficiency and partial ancillarity, but none has been particularly successful. However, the notion of dimension reduction through exact or approximate sufficiency and ancillarity has played a very important role in the theory of parametric inference. Sufficiency in the presence of nuisance parameters is considered in Barndorff-Nielsen (1978, Chap. 4), Jorgensen (1997) and Bhapkar (1991). A discussion of partitions like (19) and the role of conditional and marginal inference is given in Reid (1995).
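The remark that the distribution of $\bar{Y}$ depends on $\sigma$ as well as on $\mu$ can be made concrete with a short calculation. Since $\bar{Y}$ is normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, the probability that it falls within a fixed distance $c$ of $\mu$ changes with $\sigma$. The sketch below assumes illustrative values $n = 25$ and $c = 0.5$; the helper name `prob_mean_within` is hypothetical.

```python
import math

def prob_mean_within(c, n, sigma):
    """P(|Ybar - mu| < c) when Ybar ~ N(mu, sigma**2 / n)."""
    z = c * math.sqrt(n) / sigma          # c expressed in standard errors
    return math.erf(z / math.sqrt(2))     # P(|Z| < z) for standard normal Z

n, c = 25, 0.5  # illustrative values
for sigma in (1.0, 2.0, 4.0):
    print(sigma, round(prob_mean_within(c, n, sigma), 4))
```

The same observed $\bar{y}$ therefore carries different amounts of information about $\mu$ depending on $\sigma$, which is why $\bar{Y}$ alone does not settle inference about $\mu$.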
4. Conclusion

A sufficient statistic is defined relative to a statistical model, usually a parametric model, and provides all the information in the data about that model or the parameters of that model. In many cases the sufficient statistic provides a substantial reduction of complexity or dimension from the original data, and as a result the concept of sufficiency makes the task of statistical inference simpler. The related concept of ancillarity has a similar function. Recent developments in likelihood inference have emphasized the construction of approximately sufficient and approximately ancillary statistics in cases where further dimension reduction is needed after invoking sufficiency and ancillarity. Excellent textbook references for the definitions and examples include Casella and Berger (1990) and Azzalini (1996). More advanced accounts of recent work are given in Barndorff-Nielsen and Cox (1994) and Pace and Salvan (1997).

See also: Analysis of Variance and Generalized Linear Models; Bayesian Statistics; Distributions, Statistical: Approximations; Experimental Design: Overview; Fiducial and Structural Statistical Inference; Frequentist Inference; Observational Studies: Overview; Order Statistics
Bibliography

Azzalini A 1996 Statistical Inference. Chapman and Hall, London
Barndorff-Nielsen O E 1978 Information and Exponential Families in Statistical Theory. Wiley, New York
Barndorff-Nielsen O E, Cox D R 1994 Inference and Asymptotics. Chapman and Hall, London
Bhapkar V P 1991 Sufficiency, ancillarity and information in estimating functions. In: Godambe V P (ed.) Estimating Functions. Oxford University Press, Oxford, UK
Casella G, Berger R L 1990 Statistical Inference. Brooks-Cole, Pacific Grove, CA
Cox D R, Hinkley D V 1974 Theoretical Statistics. Chapman and Hall, London
Dobson A J 1990 An Introduction to Generalized Linear Models. Chapman and Hall, London
Fisher R A 1922 On the mathematical foundations of theoretical statistics. Philosophical Transactions of The Royal Society London, Series A 222: 309–68. Reprinted in: Kotz S, Johnson N L (eds.) 1992 Breakthroughs in Statistics. Springer, New York
Fisher R A 1925 Theory of statistical estimation. Proceedings of the Cambridge Philosophical Society 22: 700–25
Fisher R A 1934 Two new properties of mathematical likelihood. Proceedings of the Royal Society Series A 144: 285–307
Fraser D A S 1976 Probability and Statistics: Theory and Applications. Duxbury Press, North Scituate, MA
Jorgensen B 1997 The rules of conditional inference: Is there a universal definition of non-formation? Journal of the Italian Statistical Society 3: 355–84
Lehmann E L, Casella G 1998 Theory of Point Estimation, 2nd edn. Springer, New York
McCullagh P, Nelder J A 1989 Generalized Linear Models, 2nd edn. Chapman and Hall, London
Pace L, Salvan A 1997 Principles of Statistical Inference From a Neo-Fisherian Perspective. World Scientific, Singapore
Reid N 1995 The roles of conditioning in inference (with discussion). Statistical Science 10: 138–57
Reid N 2000 Likelihood. Journal of the American Statistical Association 95: 1335–40
N. Reid
Statistical Systems: Censuses of Population

1. Early Sporadic Censuses

‘The Census at Bethlehem,’ painted by Pieter Bruegel (the Younger, around 1600) shows the Virgin Mary on a donkey, going to Bethlehem to be counted at home by the Roman Census. There are countries even at the beginning of the twenty-first century where the patriotic ‘ceremony’ of the census requires people to be at their usual (de jure) home for easier counting. A census was also conducted in AD 2 in the relatively newly unified China, where they counted 60 million persons, about a quarter of the world’s population then. (At 1.3 out of 6 billion today, China is still one-fifth of the world’s population and would be one-quarter without her rigorous population controls.) A millennium later in 1086, a sort of census was conducted in England for William the Conqueror, known as the Domesday Book.

These are three prominent examples from several known to have been undertaken in different countries at different times, and they illustrate the importance placed on population counts. However, they were only sporadic and highly variable attempts, for which we have no complete records, especially because some of them were kept as ‘state secrets.’ There were also many censuses for only single cities, districts, or provinces. Furthermore, they were not usually counts of households or of all persons. Some counted only men of military age for allocating levies of troops; some counted only chimneys, used as an easy proxy for households. The chief aims of censuses were to apportion to administrative units legislative seats, the burden of taxation, army levies, or work assignments. These aims survive today, but others have been added in modern times. All of these aims are supported by the details for small domains that complete censuses yield. (See Goyer 1992, Taeuber 1986).
2. Rise of Modern Censuses

The first modern, national census of households and persons was established in the Constitution of the new USA in 1790 and has been continued decennially since then. Its first major purpose was allocating seats in the new national Congress, but it also served other purposes and needs (Anderson 1988). Those purposes and needs increased gradually and the variables collected by the census increased likewise. The idea spread and decennial censuses began in the UK in 1801 and in Belgium in 1846. The International Statistical Institute’s Congresses of 1853, 1872, and 1897 passed recommendations in favor of decennial censuses for all countries, with some basic requirements, aiming at standardization for better international comparability. They also urged open publication and dissemination of results. In 1900 some 68 different countries already conducted and made available population censuses, which included 43 percent of the world’s population. The quality and content of the censuses improved gradually. For example, it is only since 1850 that the data for USA censuses covered all individuals rather than just families.

The greatest international impetus came after 1946 from the Population Division and the Statistical Division of the new United Nations. The stimulation, guidance, and coordination of the UN Statistical Division were crucial for spreading the establishment of censuses and their comparability (UN 1980). The efforts of the UN’s sister agencies (FAO, WHO, and UNESCO) were also important.

Beyond the growing coverage of most countries and improved coverage within countries, population censuses have advanced in five other dimensions. First, the quality of the census data has been improved. Second, that improvement has been subjected to measurement by quality checks, often with duplicate observations. Third, both the contents and the quality have been increasingly standardized across countries in order to facilitate international comparability of data. Fourth, the number, diversity, and depth of census variables have been greatly increased. Fifth, the analyses of census variables have been vastly deepened and improved, with the help of modern computing machines and methods.

In addition to censuses of population and housing, many countries also conduct censuses of agriculture, business, manufacturing and mineral industries, etc. These are usually done separately, often by different and specialized staffs and enumerators. The censuses of agriculture may still be linked to population censuses in less developed countries, where most households still also operate small farms. However, in industrialized countries most agricultural products come from large firms of ‘agro-business.’

2.1 Contents of Population Censuses
Beyond mere head counts, modern population censuses obtain several classes of variables, as shown below, condensed from the UN’s recommended list:

Residences: usual, on the census date, duration, previous, at birth; place of work
Sex, age, marital status, relation to head of family and household
Citizenship, religion, language, national/ethnic origin
Children: born alive, living. Births last year: live and dead
Marriage: age, duration, order
Education: attained, literacy, qualifications, attendance
Occupation: status, industry sector, time worked
Income, socioeconomic status, economic dependency

From this recommended list, most countries omit some topics or include them only in some years. On the other hand, some countries add other topics, of course, as desired. With all those variables, and with increasingly complex and detailed analyses, the published outputs of censuses grew into census monographs of many volumes, and then took years to finish (Farley 1995). Therefore statistical offices are continuing to develop cheaper and faster methods of dissemination, with the aid of computers and electronic publishing.

2.2 Timing of Censuses

Censuses of the population must assign specific locations on specific dates (the census day or reference period) to people. The collection period for enumerating may be one week or a few weeks. However, the reference period is a single census day for items like residence and age; for items like income (or crops) it may be the preceding year. However, humans are highly mobile and they are becoming even more so: with longer and more vacations, longer travel to work, more second homes, more split families, and so on. De facto is the census term for the actual location of persons on census day and de jure for their ‘usual place of residence’ (home). Without this distinction city workers would make the central cities larger in the daytime when their ‘usual place of residence’ locates them in the suburbs. But location problems are more difficult for students living away from home in university cities; for death rates in cancer hospitals far from home; large institutional populations in prisons or in the military; for second vacation homes, etc. These are all examples that census definitions and operations must cover, in the field and in the office. Nomadic populations pose a great variety of special problems in timing and locations, of too great a variety for us to attempt to cover here.

For the census work in most countries, tens or hundreds of thousands of enumerators must be hired and trained for just one or two weeks of fieldwork and a few months in the offices. In China and India they need millions of enumerators. Some experts have doubts about how well they can be trained. However, others contend that the ‘ceremony,’ the patriotic campaign that can be mounted every 10 years, with educational campaigns, and exhortations in the media, increases awareness, participation, and response rates. Still others fear that the campaign arouses the latent and dormant opposition to fact-finding.

Decennial censuses may be regarded as systematic samples of a continuous time dimension: samples of a single census day in the 3,652 days of 10 years, or one year every 10 years for variables like yearly income. Estimates for intercensal and postcensal periods must depend on models and methods, and these are called intercensal and postcensal estimates. Other methods are also available for national and large domain estimates; but what are lacking above all are estimates for small areas and small domains that are also more detailed in time than decennial censuses. These needs have given impetus to methods of small area estimates and small domain estimates (Platek et al. 1987, Purcell and Kish 1986). Their accuracy depends on relations to auxiliary data from registers and sample surveys; also on the stability of variables like total population. But they are not accurate for unstable and fluctuating variables, like unemployment, infectious diseases, and financial variables. Sample surveys have been developed for some of these (see Sect. 3).
2.3 Coverage by Censuses

All censuses have problems of coverage; and even the best suffer a few percent of noncoverage (or missing units). Furthermore, typically the units (households, persons) are not missed evenly, or at random; they are typically missed more often among the lower socioeconomic classes, some ethnic groups, and areas. These uneven coverages persist despite efforts to find and to measure them. However, the size and bias of this noncoverage may often be smaller than the even greater distortions due to the uses of obsolete decennial censuses for 10 years.

All national census results also consist of combinations of regions or provinces or states, also of districts or counties or communes, and so on down to small domains. The populations of the provinces may differ greatly in culture and standards and the national results represent averages of the provincial levels. For some provinces different languages and questionnaires may be needed. Further, deliberate omissions may exclude some areas because: (a) they are vast, ‘uninhabited’ mountains, deserts, jungles, or small islands; or (b) they may be areas under hostile populations or rebellions. Changes of boundaries that have occurred, either losses or acquisitions, after wars and treaties cause problems for publishing and reading census monographs, tables, and graphs.
2.4 Costs of Censuses The fine spatial detail of complete censuses is bought with high total costs. Because the total costs of censuses are high, they are usually held only every 10 years, and only a few countries have had five-year censuses. The growing number of topics covered by censuses has also increased their high total costs, and these prevent more frequent census taking, despite demands for annual data with spatial detail. At the same time the large size of the census is also an obstacle to obtaining greater richness of data (with more variables) with spatial detail. This high cost is one cause of the sporadic political controversies about censuses in many countries. But even more serious are the conflicts due to the policy implications of census results, especially those that are used for apportionment between provinces and between ethnic groups. Apportionment of seats in Congress (Parliament), of taxation, of armies, and of welfare funds (the very reasons for collecting census data) can also lead to conflicts, and even (rarely) to suppression of data. The collection and open publication of reliable and trusted statistical data is (like liberty) ‘the fruit of constant vigilance.’ The costs of censuses vary with the methods used to collect and process them. Door-to-door personal interviews have been the standard throughout history. However, they have now been modified in places with telephoned and mailed-in questionnaires, and also with mailed-out, mailed-in postal methods. Processing and editing have been speeded with mechanical and electronic methods, but the savings seem to be swallowed by the increased complexity of analyses. Comparisons of total costs across times and between countries are complicated by population sizes, exchange rates, and living standards; but guesses are made that the total cost of a census per capita equals about 1/2 to 1 hour’s pay at the country’s median hourly wage, once every 10 years.
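The rule of thumb above (a per capita cost of roughly half an hour to one hour of median pay per decade) can be turned into a rough budget range. The sketch below is only an illustration of that arithmetic; the population and wage figures are hypothetical and the function name is not from the source.

```python
def census_cost_range(population, median_hourly_wage):
    """Rough (low, high) total cost of one decennial census.

    Applies the rule of thumb quoted in the text: per capita cost is
    about 0.5 to 1 hour of pay at the median hourly wage per 10 years.
    """
    low = 0.5 * median_hourly_wage * population
    high = 1.0 * median_hourly_wage * population
    return low, high


# Hypothetical country: 10 million people, median wage of 20 currency units per hour.
low, high = census_cost_range(population=10_000_000, median_hourly_wage=20.0)
print(f"per decade: {low:,.0f} to {high:,.0f}")
print(f"per year:   {low / 10:,.0f} to {high / 10:,.0f}")
```

Spreading the total over 10 years, as in the last line, is one way to compare a decennial census with the annual budget of a continuing survey.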
3. Sampling and Censuses Two other methods for collecting population data have emerged as both complementary to and competitive with complete census taking: sampling methods and administrative registers. Sampling methods have come to be used in several distinct ways in different countries. 3.1 Census Samples The addition of great numbers of desirable topics increased the costs of censuses, and since 1950 a solution to this problem has increasingly been found in the use of census samples. In addition to the complete count using the ‘short form’ containing a few basic questions, a sample of the population completes a ‘long form’ with many variables. This sampling reduces both the total cost of the census and the burden on the majority of respondents in 2000. The rates and methods of sampling differ between countries. The sampling units may be enumeration districts (areas, EDs or EAs), or households, or persons. Selection of persons has lower sampling variance, but selecting EDs may have better enumerative controls and greater flexibility. The sample population may get only the long form or they may get duplicates of both the short and the long forms. Census samples generate large samples, whether 5 or 10 or 20 percent of the complete count, and thus they can provide detailed spatial data, but with reduced costs. They are still much larger than ordinary survey samples, which have selection rates of 1/100, 1/1,000, or 1/10,000 of complete counts. 3.2 Postenumerative Surveys, Quality Checks, Coverage Checks Samples may be taken for quality checks with measurements that may use more costly and (hopefully) improved methods. These measurements may be in addition to, or instead of, the regular census data collection, and the sampled units may be persons, households, or enumeration districts. Coverage checks are attempts to find the sort of persons missed by the regular census count. The surveys are often called postenumerative because they are usually done after the census collection, but they may be done simultaneously. 3.3 Sample Censuses These have been conducted by a few countries, and sometimes are called ‘minicensuses,’ perhaps based on one-percent samples and with fewer topics than the sample surveys described below. The greatest expressed need is for annual data to supplement the decennial censuses, which are increasingly judged to be inadequate because of obsolescence. Typically, census monographs containing the richer data of the long form become available five or more years after the census year, although the simple counts may be released quickly (Farley 1995). 3.4 Sample Surveys Generally, sample surveys differ greatly from complete censuses: they are much smaller and much cheaper, but they also lack detail in spatial (geographic) and other domains. But periodic samples yield much needed temporal detail, and also have other advantages. They can yield much richer data, especially because the content and the methods can be focused on specific contexts, and because greater concentration can be achieved for the questionnaires, for the interviewers, and for the respondents as well. ‘Representative sampling’ was first proposed by A. N. Kiaer in 1895 in Norway, for replacing censuses. The fundamental concept requires a careful, scientific method for selecting a sample from the population and then estimating the population variables from the sample statistics. Many statisticians, academic and practical, in many journals and in the International Statistical Institute, opposed sampling as a substitute for complete enumeration for about 50 years.
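The fundamental concept described above, selecting a sample and then estimating population values from sample statistics, can be sketched for the simplest design. The example assumes simple random sampling without replacement, which is a simplification: actual census samples and surveys usually use stratified or clustered designs. The population of household sizes is simulated and purely hypothetical.

```python
import math
import random


def estimate_total(sample, population_size):
    """Estimate a population total and its standard error from a
    simple random sample drawn without replacement."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((y - mean) ** 2 for y in sample) / (n - 1)
    fpc = 1 - n / population_size          # finite population correction
    se_mean = math.sqrt(fpc * variance / n)
    return population_size * mean, population_size * se_mean


# Hypothetical population: number of persons in each of 10,000 households.
random.seed(1)
households = [random.randint(1, 6) for _ in range(10_000)]

# A 5 percent sample, comparable to the smaller census samples mentioned above.
sample = random.sample(households, 500)
total, se = estimate_total(sample, len(households))
print(f"estimated persons: {total:.0f} (standard error {se:.0f})")
print(f"true persons:      {sum(households)}")
```

The finite population correction shrinks the standard error toward zero as the sampling fraction approaches a complete count, which is why the large census samples can support finer spatial detail than ordinary surveys.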
Meanwhile probability samples came into use for social surveys, notably for unemployment rates and then for the famous quarterly surveys of the labor force, which became the Current Population Surveys of the US Bureau of the Census (1978). All countries in the European Union and many others now have quarterly or monthly surveys of the national labor force. These surveys, mostly of 5,000 to 100,000 households, are financed because governments need current monthly or quarterly data on (un)employment and its periodic fluctuations. National statistical offices generally conduct many other samples every year, some of them only once, some repeatedly, and some periodically. Some of the censuses are also produced by sampling methods, and the Food and Agriculture Organization (FAO) has been in the forefront of promoting agricultural censuses by means of sample surveys (see Social Survey, History of).
4. Administrative Registers Administrative registers related to population counts, sometimes called civil or population registers, may be of different kinds, purpose, and quality in different countries and at various times. Vital registers of births and deaths may be the most common examples, but others exist for voter registration, or recently auto licenses, telephone numbers, etc. The most relevant registers date from 1749 in Sweden and the Nordic countries, where the official church started keeping records of the parish populations, a practice that continues today. Some statisticians propose replacing decennial censuses with data from administrative registers of different kinds, or with the combined social resources available in specific countries (see Databases, Core: Demography and Registers). The data are inexpensive, almost without cost, because some other agency, with other aims and resources, pays for them. Furthermore, they are often detailed spatially and temporally, and they are timely and kept up-to-date. They have been used in several Scandinavian countries to replace their population counts. However, for two chief reasons, most countries cannot use administrative registers to replace censuses. First, they lack the conditions that facilitate the successes of Scandinavian registers: a highly educated and cooperative society, whose members are highly motivated to register their own changes of residence promptly. Second, no registers, not even the Scandinavian registers, have ever obtained the richer data (beyond a few lines) that modern censuses need. In addition, some statisticians fear the misuse of registers by careless, or malevolent, governments and societies.
5. Future Developments Modern censuses face some gravely conflicting demands: higher quality and better coverage of the population, but with reduction of mounting costs. We need spatial detail, but also temporal detail and timeliness. Censuses yield more spatial detail, but sample surveys are superior in all of the three other aspects. The greatest need is for more timeliness, because decennial data do not satisfy modern needs, especially for annual data with spatial details. At the other extreme the monthly surveys provide adequate national statistics and satisfy the other requirements, but they yield no spatial detail for local needs. This conflict gave rise to designs of ‘cumulative representative samples’ and then to ‘rolling samples,’ in the last few years. The National Household Health Survey in the USA yields separate samples of about 1,000
households each week, and they are cumulated into 52,000 households yearly, with health data on about 150,000 persons. Thus the yearly cumulations yield data for even moderately rare diseases and infirmities. Furthermore, the American Community Survey of the US Census Bureau is conducting a large pilot study for a ‘rolling sample’ by AD 2003, which may replace the long form of the US Census of 2010 (Alexander 1996, Kish 1998). Many statisticians in AD 2000 felt that the greatest need generally is for regular, standardized annual surveys (sample or census) that are harmonized internationally, and also large enough to yield reliable data for small domains. It is likely that the censuses of the twenty-first century will be quite different from those of the twentieth century.
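The cumulation of weekly samples described above is simple arithmetic, sketched below with the figures quoted in the text (about 1,000 households per week, 52,000 per year). The gain in precision is an added illustration, not a claim from the source: it assumes independent, nonoverlapping weekly samples and a quantity that is stable over the cumulation period, so that the standard error falls roughly with the square root of the number of weeks.

```python
import math

WEEKLY_HOUSEHOLDS = 1_000  # approximate weekly sample size quoted in the text


def cumulated_households(weeks, weekly=WEEKLY_HOUSEHOLDS):
    """Households available after cumulating separate weekly samples."""
    return weekly * weeks


def relative_standard_error(weeks):
    """Standard error of a cumulated estimate relative to one week's
    estimate, under the simplifying assumptions stated above."""
    return 1 / math.sqrt(weeks)


for label, weeks in [("week", 1), ("quarter", 13), ("year", 52)]:
    n = cumulated_households(weeks)
    print(f"{label:8s} {n:6d} households, relative SE {relative_standard_error(weeks):.2f}")
```

The same cumulation can be made over areas instead of time, which is the trade-off a rolling sample exploits: short periods for national figures, longer periods for small domains.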
Bibliography
Alexander C 1996 A rolling sample survey for yearly and decennial uses. Bulletin of the International Statistical Institute. Helsinki, 52nd session
Alterman H 1969 Counting People, The Census in History. Harcourt, Brace and World, New York
Anderson M 1988 The American Census: A Social History. Yale University Press, New Haven, CT
Farley R 1995 State of the Union, Vols I and II. Russell Sage, New York
Goyer D S 1992 Handbook of National Population Censuses: Europe. Greenwood Press, NY
Kish L 1998 Space/time variations and rolling samples. Journal of Official Statistics 13: 31–46
Kish L, Verma F 1986 Complete censuses and samples. Journal of Official Statistics 2: 381–95
Platek R, Rao J N K, Sarndal C E, Singh M P 1987 Small Area Statistics. Wiley, New York
Purcell N J, Kish L 1987 Estimation for small domains. Biometrics 35: 365–84
Shryock H S, Siegel J S 1973 The Methods and Materials of Demography. US Bureau of the Census, Washington, DC
Taeuber C 1986 Census. In: International Encyclopedia of the Social Sciences. Macmillan, New York
United Nations 1980 Principles and recommendations for population and housing censuses. Series M No. 67 (Sales E.80.XVII.8)
US Bureau of the Census 1978 The current population survey: Design and methodology. USCB, Technical Paper 40, Washington
L. Kish
Statistical Systems: Crime 1. Crime Statistics Crime statistics are largely generated from three avenues: administrative data, surveys, and ethnographic studies. Administrative data are often referred to as the ‘official’ statistics as they are drawn most often from the records kept by state institutions. The first official statistics were published in France in 1827 (Coleman and Moynihan 1996). Self-report surveys are the other major source of crime data or statistics and involve interviewing perpetrators and victims of crime. In general, ethnographic studies do not result in ‘statistics’ as such. They are based on small samples that are usually unrepresentative of the population. However, such rich materials play an important role in understanding what lies behind the numbers used by crime analysts. The collection and monitoring of official crime statistics has largely been taken over by government and government-sponsored agencies such as the US Bureau of Justice Statistics or the UK Home Office (Coleman and Moynihan 1996). These agencies draw on the major criminal justice institutions such as the police, prisons, and the courts for these official statistics. On the surface the collection of such data might appear straightforward and relatively uncontentious, yet in reality it remains one of the most controversial areas in the analysis of crime. Official crime statistics have been subject to critique from the mid-nineteenth century through to the present, the beginning of the twenty-first century (Coleman and Moynihan 1996). There are a variety of factors that contribute toward this controversy (Kempf 1990, Jackson 1990). Some of the more important reasons include: (a) The study of crime itself is often controversial. (b) What is defined as a ‘crime’ is a social construction and not an empirical fact. (c) There is not always consensus on how to collect crime statistics. (d) There is no one data source that can address all the issues raised in the study of crime.
2. Official Crime Statistics Administrative databases are used to generate official crime statistics, yet they are first and foremost a management tool. As a result, using data generated from this system as the sole source of crime statistics is problematic. Unlike many other economic and social activities, the act of committing a crime can result in the loss of liberty and in some cases the loss of life. Understandably, most people do not admit to committing crime. As a consequence, detection of crime is dependent on victims reporting such activities to the police, and on the police successfully apprehending the suspects. Police make decisions about whether to proceed with charges, then prosecutors make decisions about whether to continue with the charge and, if so, which charges they will pursue. Finally, the courts make a determination as to guilt or innocence. There are many factors that intervene at different points in this process, ultimately affecting the ‘numbers’ that appear in official statistical collections. Because of the potential for error at various stages through the recording process, official homicide statistics are largely regarded as having the most credibility.
3. Police Data Counting rules used by the collection agencies complicate the interpretation of police statistics. Data can be confusing to the novice as they can on occasion refer to the offences/incidents and at other times to the offenders and victims. For example, one offender may be responsible for 50 offences. Similarly, five people may be involved in one offence (e.g., a bank robbery by a group of people). Where an individual has committed a range of offences, often only the most serious offence is recorded for statistical purposes. For example, prison statistics in Australia are based on the most serious offence of the prisoner. Thus if an individual is convicted on a range of offences on a particular day, only the most serious offence is recorded. As a result official statistics can under-report the extent of crime in the community. Police data are limited by: (a) Definitions of what constitutes a crime. (b) Variability in reporting by citizens. (c) Law enforcement organizational structures. (d) Variations in counting rules across jurisdictions. (e) Low rates of detection and clearance.
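The effect of counting rules can be illustrated by tallying the same records in different ways. The sketch below uses invented records and a hypothetical seriousness ranking (neither comes from the source or from any jurisdiction's actual rules) to show how counts of offences, incidents, offenders, and ‘most serious offence’ diverge.

```python
from collections import Counter

# Hypothetical ranking: a lower number means a more serious offence.
SERIOUSNESS = {"homicide": 1, "robbery": 2, "assault": 3, "burglary": 4, "theft": 5}

# Hypothetical records: (incident id, offender id, offence recorded).
RECORDS = [
    (1, "A", "theft"),
    (2, "A", "burglary"),
    (3, "A", "assault"),    # offender A is responsible for three offences
    (4, "B", "robbery"),
    (4, "C", "robbery"),    # one incident involving two offenders
    (5, "D", "theft"),
]


def count_by_rule(records):
    """Return counts of the same records under four counting rules."""
    offences = Counter(offence for _, _, offence in records)
    incidents = len({incident for incident, _, _ in records})
    offenders = len({offender for _, offender, _ in records})
    most_serious = {}
    for _, offender, offence in records:
        current = most_serious.get(offender)
        if current is None or SERIOUSNESS[offence] < SERIOUSNESS[current]:
            most_serious[offender] = offence
    principal = Counter(most_serious.values())
    return offences, incidents, offenders, principal


offences, incidents, offenders, principal = count_by_rule(RECORDS)
print("offence records:     ", sum(offences.values()), dict(offences))
print("distinct incidents:  ", incidents)
print("distinct offenders:  ", offenders)
print("most serious offence:", dict(principal))
```

Under the most-serious-offence rule, offender A's theft and burglary disappear from the tally, which is the mechanism by which such statistics can under-report the extent of crime.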
4. Court and Corrections Data These particular administrative databases represent a subset of police data, for they record crimes that have been processed further down the criminal justice chain. For example, only one in 20 people who appear before the criminal courts in Australia serves time in prison. Data from these systems are used to produce crime statistics on matters relating to sentencing, prison composition, recidivism, probation, and parole. However, as with police databases, they are highly restrictive because only limited data are recorded. Usually only the most basic demographic and arrest characteristics are collected. Counting rules also vary across the courts. In Australia, for example, the unit of count is the charges or appearances in the lower court, while in the higher court it is distinct persons. Crime statistics from these databases will be biased as they are based on those who represent the most serious offenders and offences.
5. Improvements in Official Statistics Despite these limitations there have been significant advances in the recording of crime within administrative systems. Many countries have moved or are moving toward collating national data on crime that are comparable across areas by standardizing definitions and counting rules (UNAFEI and AIC 1990). In addition, jurisdictions are seeking to link data across criminal justice institutions so that a more complete picture of crime, offending, and offenders can be determined. Improved and cheaper technologies, and the development of statistical packages, have meant that analysts and researchers are demanding unit record files (URF) rather than aggregated statistics. As a result new administrative databases are being established to provide greater detail in response to research and policy demands. A major development has been the National Incident-Based Reporting System by the United States Federal Bureau of Investigation (Maxfield 1999). There are three major factors that are driving improvements in crime statistics. The first is the demand for accountability by governments for the way in which scarce public money is spent. To justify their activities, criminal justice institutions are increasingly relying on statistics to support their assertions and resource allocations. This results in both a greater awareness of statistics and a greater scrutiny of these same statistics. The second major factor has been the critical scrutiny of these data, particularly by independent academic researchers. The third has been the development of technology that enables the cost-effective storage and retrieval of data.
6. Crime Victim Surveys Researchers have long been concerned about a ‘hidden figure’ of crime that represents a large proportion of unrecorded crimes and criminals. Crime victims’ surveys from a range of countries have shown that the majority of people do not, for a variety of reasons, report crimes to the police. There are other biases in the official statistics. White-collar crime is rarely recorded or dealt with under criminal law (Sutherland 1949). Some researchers argue that the important feature of crime data is the long-term trend in crime and not the overall amount of crime. Others have argued that even the long-term trends are difficult to interpret: is crime really increasing or is it a reflection of other things? For example, changes in police practices and priorities, or changes in the interpretation and classification of particular offences, can help to explain changes in crime statistics. There have also been changes in the police management/processing of some forms of crime, such as rape and domestic violence, that have affected the number of recorded crimes. Given the sustained criticism of official statistics, governments and academics have sought ways to improve upon their collections. To address the issue of under-reporting of crime to police, crime victim surveys are routinely undertaken in many countries. Individuals within households are randomly selected and asked about their experience of crime and victimization. From this source of information the nature and extent of crime have consistently been shown to be greater than appears from ‘official’ statistics. However, such surveys have also been criticized on a number of grounds, including (Koffman 1996, Coleman and Moynihan 1996, Skogan 1981): (a) The definition of what constitutes a ‘crime’ in the survey instrument does not always match the legal definition in the official statistics, making comparisons difficult. (b) Particular subgroups are either more or less likely to self-report particular crimes, affecting the reliability of the estimates. For example, those with higher levels of education are more likely to report assault, while women are less likely to report rape/domestic violence. (c) Memory failure is a problem, with respondents being more likely to recall incidents where the offender was not known to them than where the offender was known or related. There are also potential problems in terms of ignorance of events, deception, and ‘forward and backward telescoping.’ (d) Methodological problems include who should be interviewed in the household and the phrasing and order of questions.
7. Self-report Surveys
In addition to victimization surveys there has been increasing use of self-report data from offenders. Dissatisfaction with official crime statistics has been a major factor in the development of offender surveys. This source of information has tended to be dominated by academics rather than government statistical collection agencies. One of the reasons is that these data need to be more carefully informed by theoretical models that can only come from an in-depth understanding of the substantive material. Self-reports are now regarded as the most common methodology for collecting data on crime statistics, and large-scale self-report studies have taken place in a range of countries (see Junger-Tas and Marshall 1999). Self-report data have generally confirmed that official statistics underestimate the level of crime in the community. This particularly applies to victimless crime such as prostitution and illegal drug use. It is in the area of deviant behavior that self-reported crime statistics are superior to official statistics, because these behaviors, even when defined as illegal, are rarely reported to the police. Hence, self-report data are widely used to provide more accurate statistics on the prevalence and incidence of engagement in criminal activities by offenders. In addition self-report data have been used to: (a) examine the correlates of offending; (b) test theoretical models of criminal behavior; and (c) evaluate intervention programs (Junger-Tas and Marshall 1999).
However, biases also apply to many self-report studies. Unless extraordinarily large sample sizes are obtained in population surveys, the capacity to examine the correlates of offending or test theoretical models is limited by the highly skewed distribution of criminal behavior; this form of deviance is a rare event. For example, less than half a percent of the general population self-report having tried heroin in the past 12 months in household surveys in the USA and Australia. To generate 100 people who have used heroin in the past 12 months would therefore require a sample of at least 20,000 people. In addition most household surveys tend to under-represent those engaged in criminal activity; this is partly due to the sampling frame and partly due to differential participation amongst particular groups (Weis 1986). One solution to the problem of large sampling error is to use disproportionate stratified samples or to focus on specific groups such as incarcerated offenders. The latter will ensure that the sample has the desired characteristics, but generalizability is severely limited. Given the different methodologies employed to obtain crime statistics, it is not surprising that the absolute rates vary between official and self-report data from offenders and victims. Consequently, researchers have advocated that relative rates should be used in making comparisons (see Junger-Tas and Marshall 1999). When this is done there is considerable consistency between the statistical sources from a range of different countries (Junger-Tas et al. 1994).
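The sample size arithmetic for rare behaviors, and the appeal of disproportionate stratified samples, can be sketched as follows. The half-percent prevalence and the target of 100 cases come from the text; the two strata, their prevalences, and the allocations are hypothetical figures chosen only for illustration.

```python
import math
from fractions import Fraction


def required_sample_size(target_cases, prevalence):
    """Expected sample size needed to yield target_cases respondents
    who report a behavior with the given prevalence."""
    return math.ceil(target_cases / prevalence)


def expected_cases(allocation, prevalence_by_stratum):
    """Expected number of cases for a given sample size per stratum."""
    return sum(n * prevalence_by_stratum[s] for s, n in allocation.items())


# Figures from the text: 100 cases at a prevalence of half a percent.
print(required_sample_size(100, Fraction(5, 1000)))

# Hypothetical strata: a small high-risk group with much higher prevalence.
PREVALENCE = {"general": Fraction(1, 400), "high_risk": Fraction(1, 20)}

proportional = {"general": 19_000, "high_risk": 1_000}     # 20,000 interviews
disproportionate = {"general": 200, "high_risk": 1_800}    # 2,000 interviews

print(float(expected_cases(proportional, PREVALENCE)))
print(float(expected_cases(disproportionate, PREVALENCE)))
```

Oversampling the high-risk stratum yields a similar number of cases from a tenth of the interviews, but estimates for the whole population then require weighting each stratum back to its population share.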
8. Impediments to the Improvement of Crime Statistics There are three major impediments to the improved development of crime statistics: (a) the application of ethics/privacy/informed consent provisions; (b) access to primary or unit record data; and (c) the rising costs of undertaking major research studies. The movement toward increased ethics (or human) oversight of projects can be problematic, particularly in the area of crime. In practice the self-report methodology asks people to incriminate themselves. This poses real dilemmas for ethics committees, many of whom are not trained in the discipline. As a result committees may be reluctant to give ethics approval. It also poses problems for researchers, who in many countries and jurisdictions do not have protection under the law; they may be legally required to release information that was gained under assurances of confidentiality to the informant. Introducing informed consent forms, particularly when they require a signature, may conflict with assurances of confidentiality or anonymity, thus affecting response rates. Junger-Tas and Marshall (1999) allude to the effect of informed consent on participation rates for delinquency studies. These issues militate against the study of crime and could well reduce the empirical study of crime and the collection of crime data outside of the official statistics. The major improvements to crime statistics have largely occurred because of academic scrutiny of published statistics and secondary analysis of unit record data. The USA has had a long tradition of sharing and releasing statistical data for secondary analysis; this is not the norm. Access to official data collections for secondary analysis is often difficult and highly controlled by state institutions in many countries, such as Australia. Until state institutions release full unit record data for secondary analysis this situation will continue to impede the improvement of national crime statistics. Finally, funding constraints, set against the increasing costs of data collection and research, will affect the ability of both the government and nongovernment sectors to improve their statistical collections. This is compounded by improvements in statistical methods that allow social scientists to undertake more sophisticated analysis. But this entails more sophisticated statistical collections, and these are usually much more expensive.
Bibliography
Coleman C, Moynihan J 1996 Understanding Crime Data. Open University Press, Buckingham, UK
Jackson P 1990 Sources of data. In: Kempf K L (ed.) Measurement Issues in Criminology. Springer-Verlag, New York
Junger-Tas J, Marshall I H 1999 The self-report methodology in crime research. In: Tonry M (ed.) Crime and Justice, A Review of Research. University of Chicago Press, Chicago, Vol. 25, pp. 291–367
Junger-Tas J, Terlouw G-J, Klein M W (eds.) 1994 Delinquent Behaviour Among Young People in the Western World. Kugler, New York
Kempf K L (ed.) 1990 Measurement Issues in Criminology. Springer-Verlag, New York
Koffman L 1996 Crime Surveys and Victims of Crime. University of Wales Press, Cardiff, UK
Maxfield M G 1999 The national incident-based reporting system: research and policy applications. Journal of Quantitative Criminology 15: 119–49
Skogan W G 1981 Issues in the Measurement of Victimization. US Department of Justice: Bureau of Justice Statistics
Sutherland E H 1949 White Collar Crime. Dryden, New York
UNAFEI and AIC 1990 Crime and Justice in Asia and the Pacific. Australian Institute of Criminology, Canberra, Australia
Weis J 1986 Methodological issues in criminal career research, Vol. 1. In: Blumstein A, Cohen J, Roth J A, Visher C (eds.) Criminal Careers and ‘Career Criminals.’ National Academy Press, Washington, DC
T. Makkai
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Statistical Systems: Economic ‘Economic statistics’ in the widest sense of the term refers to those statistics the purpose of which is to describe the behavior of households, businesses, public sector enterprises, and government in the face of changes in the economic environment. Accordingly, economic statistics are about incomes and prices, production and distribution, industry and commerce, finance and international trade, capital assets and labor, technology, and research and development. Nowadays it is customary to include under economic statistics estimates of the value and quantity of stocks of natural assets and measurements of the state of the physical environment.
1. Economic Statistics and Official Statistics In its strict sense the term economic statistics has a narrower scope, as a distinction is drawn between economic statistics and economic accounts (as in the national or international accounts). Some statistics occupy a gray area between social and economic statistics. They comprise measurements that play a role in describing social conditions and are used intensively in the making of social policies but at the same time relate intimately to the economic fabric of society. Examples are the statistics on income distribution commonly used for the measurement of poverty, the occupational make-up of the labor force, the expenditure patterns of households, and the structure of their income. For purposes of the present article the term ‘economic statistics’ excludes the national accounts but includes the statistics that are simultaneously used for economic and social purposes. Most of the references in this article apply to official statistics, i.e., those statistics that are either produced by government statistical agencies or by other bodies but are legitimized by being used by government. The usage adopted corresponds to the way in which government statistical offices are commonly organized.
2. Directly Measured Statistics and Derived Statistics The term economic statistics is usually applied to statistics derived from direct measurement, i.e., either from surveys addressed to economic agents or from official administrative procedures that yield numerical data capable of being transformed into statistics. For example, the statistic on the national output of cement is usually derived by directly asking all (or a sample of) cement producers how much cement they produced over a defined period. The statistic on how many jobs were filled during a given period may be derived from the social security forms that employers are requested to send to the social security administration from time to time. This is different from the way in which the economic accounts of the nation are derived. These are formed by combining all the existing economic (or basic) statistics through an interrelated system of accounting identities and by inferring from those identities values that may not be available directly. They are also a means of assessing the strengths and weaknesses of economic statistics by noting how close they come to balance when the system of identities is applied to them. In other words, economic statistics are the raw material on the basis of which the economic or national accounts are compiled. The compilation of the national accounts, in addition to summarizing and systematizing economic statistics, plays the important role of showing up gaps in the range of economic statistics required to describe the functioning of the economic system (Table 1).
2.1 Economic Statistics and Business Statistics The kernel of economic statistics is formed by statistics related to business and derived from questionnaires addressed to business, from questionnaires addressed to business employees, or from official records which businesses create in their dealings with government. Such statistics include descriptions of the business activity (what does the business produce, how does it produce it, for how much does it sell, what is the surplus generated by the sale once materials and labor are paid for, etc.) as well as of the nature of the factors engaged in production (structures and equipment used in production by type; labor employed by type of skill and occupation), and in addition statistics on the type of customers who purchase goods and services from businesses.
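The two roles that Section 2 assigns to accounting identities, inferring a value that is not measured directly and checking how close the measured statistics come to balance, can be sketched with a single identity. The identity used here (production plus imports equals domestic sales plus exports plus change in stocks) and all figures are illustrative assumptions, far simpler than an actual system of national accounts.

```python
def discrepancy(supply, uses):
    """Supply total minus use total; zero when the identity balances."""
    return sum(supply.values()) - sum(uses.values())


def infer_missing(supply, uses):
    """Solve the identity for the single item whose value is None."""
    missing = [(side, item)
               for side, table in (("supply", supply), ("uses", uses))
               for item, value in table.items() if value is None]
    if len(missing) != 1:
        raise ValueError("exactly one item must be unmeasured")
    side, item = missing[0]
    known_supply = sum(v for v in supply.values() if v is not None)
    known_uses = sum(v for v in uses.values() if v is not None)
    if side == "supply":
        return item, known_uses - known_supply
    return item, known_supply - known_uses


# Hypothetical basic statistics for one product, in currency units.
supply = {"production": 900, "imports": 100}
uses = {"domestic_sales": 750, "exports": 200, "change_in_stocks": None}

# Role 1: infer the value that no survey measured.
print(infer_missing(supply, uses))

# Role 2: if a survey did measure it, the residual flags a weakness somewhere.
measured = dict(uses, change_in_stocks=40)
print(discrepancy(supply, measured))
```

A nonzero residual does not say which source is wrong, only that the basic statistics are mutually inconsistent, which is how compiling the accounts exposes gaps and weaknesses in the underlying surveys.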
3. How Economic Statistics Are Measured In spite of the increasing popularity of using administrative records as sources to generate economic statistics, on the whole, economic statistics are compiled mostly from surveys addressed to business. Generally speaking, the surveys consist of a list of respondents, a formal questionnaire—which can be used as basis for a personal interview (or an interview by telephone or Internet), a procedure followed when dealing with exceptional cases (not finding the right person to whom the questions should be addressed, finding that the business is not in the expected activity, or that the answers are incomprehensible or contradictory)—and a set of steps designed to aggregate individual answers so as to produce statistical totals, averages, or measures of distribution. 3.1 Problems of Defining What Is a Business Part of the process of conducting a business survey consists in locating the addressees—either all the 15057
addressees in the survey universe if the survey is of a census type or some of the addressees if it is a sample survey. In either case two issues must be settled beforehand: (a) that all the addressees conduct the activity they are supposed to and (b) that there is no one else who conducts the targeted activity and has been left out. A related question has to do with what is a business. The ‘business’ can be defined as the entity that is liable under the law to pay taxes; that has a management capable of deciding its scale of operations and how those operations should be financed; and that inter alia conducts the activity that is germane to the survey. An alternative definition is one that seeks that part of the business (and only that part) that engages in the target activity directly and excludes all other parts of the business. In either case it is important that the respondents to the survey understand the scope of the questions and are capable of answering them out of records available to the business’s management.

Table 1 Types of economic statistics and the sources from which they are derived (each entry reads: statistics / sources)

Economic agent: Households
Production: Household income / surveys of household income
Distribution: Wages and salaries / surveys of household income, labor force surveys
Resources employed: Current population estimates, participation in labor force, employment / labor force surveys
Prices: Consumer prices and consumer price index / surveys of prices

Economic agent: Business
Production: Value and amount of production; production by type of activity and by goods and services produced, orders and deliveries / censuses and surveys of business activity
Distribution: Value and amount of sales; stocks of finished goods, materials and goods in process; sales by activity of seller and by product sold; sales by type of customer (exports and imports) / censuses and surveys of business activity; customs records
Resources employed: Numbers employed by occupation; remuneration of labor employed; capital assets used; depreciation; profits; loans and equity borrowing / censuses and specialized business surveys; company balance sheets
Prices: Producer prices, unit values in manufacturing, producer price index / surveys of industrial prices, surveys of service activities, current agricultural surveys, customs records

Economic agent: Government (central and local, public sector corporations, etc.)
Production: Government expenditures and revenues by function and by source / public sector accounts
Distribution: Public sector sales of goods and services / public sector accounts
Resources employed: Public sector employment; public sector wages and salaries / public sector reports; budget and supporting accounts
Prices: Prices subject to regulation / special price surveys

Economic agent: Other (nonprofit organizations, labor unions, employers confederations, etc.)
Production: Income and expenditure of nonprofit sector / special surveys
Resources employed: Employment in nonprofit sector; wages in nonprofit sector / special surveys
3.2 Business Records Businesses tend to keep all the records they require to ensure informed management in addition to what they have to keep in order to comply with tax authorities and any other regulatory bodies that affect them. Usually those records fall into five categories: tax related; social insurance; engineering records on the performance of physical production; personnel and safety records; and environmental records if the productive activity is of sufficient importance and of a kind that may detrimentally affect the physical environment. The efficiency of statistical surveys is measured partly by the ability of the statistical agency to request the information it needs, but mostly by the extent to which that information is readily available from the records the business tends to keep anyway.
3.3 Household Surveys Usually the major surveys addressed to households include a survey of households and how their members relate to the labor force and a survey of family income and expenditure. The former is the basis upon which statistics on the percentage of the labor force that is unemployed are calculated. The latter usually produces two key statistics: income distribution and structure of household expenditure. The structure of household expenditure is used to determine the various proportions of expenditure, which in turn are used as weights in the compilation of the consumer price index. The surveys about household behavior (variation in the structure of expenditure and savings according to income, location of residence, family size, and age distribution) also provide key social statistics.
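As an illustration of how expenditure proportions can feed into a consumer price index, the following minimal Python sketch derives expenditure shares from survey averages and applies them as fixed weights to price relatives. The category names and all figures are hypothetical, and the fixed-weight (Laspeyres-type) formula is an assumption made here for illustration; statistical agencies apply considerably more elaborate procedures.

```python
def expenditure_weights(spending):
    """Turn average household spending per category into shares that sum to 1."""
    total = sum(spending.values())
    return {category: amount / total for category, amount in spending.items()}


def price_index(weights, base_prices, current_prices):
    """Weighted average of price relatives, with the base period set to 100."""
    return 100.0 * sum(
        weight * current_prices[category] / base_prices[category]
        for category, weight in weights.items()
    )


# Hypothetical survey result: average spending per household by category.
spending = {"food": 300.0, "housing": 500.0, "transport": 200.0}

# Hypothetical representative prices in the base and current periods.
base_prices = {"food": 10.0, "housing": 100.0, "transport": 20.0}
current_prices = {"food": 11.0, "housing": 105.0, "transport": 20.0}

weights = expenditure_weights(spending)
index = price_index(weights, base_prices, current_prices)
print(weights, index)
```

With these made-up figures, food, housing, and transport receive weights of 0.3, 0.5, and 0.2, so a 10 percent rise in food prices and a 5 percent rise in housing costs move the index by 5.5 percent overall.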
3.4 Public Accounts and Administrative Records There are at least two categories of records that make explicit survey taking redundant: reports on public accounts and administrative registers created as a result of regular and comprehensive administrative procedures. The two have different properties. The accounts of the public sector (and of para-public enterprises such as nationalized industries or public corporations) are usually made available to the public as a means of displaying transparency and accountability. If the statistical requirement is limited to the displaying of aggregate information, what is published is usually sufficient as a description of the income, expenditure, and balance sheet of all public sector entities. In addition, all operations relating to the collection of taxes generate comprehensive records, which can replace survey statistics to the extent that both target the same variables. For example, the collection of income tax generates statistics on the income structure of all households (or individuals) liable to pay tax. The collection of VAT-type taxes generates records on the amount of sales, the nature of the seller, and in many cases the broad economic sector to which the buyer belongs. Social security records tend to generate information on the nature of the worker from whom amounts for social insurance are withheld, their place of work, the hourly, daily, or weekly remuneration that serves as a basis for the estimation of the amount withheld, etc. And of course, the oldest and most universal administrative operation of all, that connected with the activity of customs administrations, generates comprehensive information on the value, composition, conveyance used, origin, and destination of merchandise that either enters or leaves the national territory. This information, also known as merchandise export and import statistics, is to be found amongst the earliest collections of numerical data of relevance to the administration of a country. To the extent that these administrative records are accessible to the compilers of statistics they tend to constitute acceptable replacements for estimates derived from surveys or from censuses.
4. Economic Classifications and Conceptual Frameworks Economic statistics, like any other kind of descriptive statistics, are usually presented within the framework of standard classifications. In addition to geographic classifications, the most popular of these classifications are the Standard Industrial Classification (the SIC, or in its international version, the ISIC) and the Standard Product Classification (the SPC or, in its international version, the Standard International Trade Classification or currently the Central Product Classification). The SIC refers to the activity (or production technique) used to produce a specified list of goods and services. Similar (or homogeneous) production techniques are grouped into industries. The overall result of these groupings is a classification that views economic activity from the supply side. The SPC groups goods and services by their intrinsic characteristics rather than by the way in which they were produced. It is a classification that views the results of economic activity from the demand side. The use of international classifications is vital to ensure intercountry comparability of economic statistics. The history of international cooperation in economic statistics is inextricably bound up with the history of the two major economic classifications. Thus, the United Nations Statistical Commission, at its first meeting immediately after the end of World War II, commissioned the Secretariat to produce a classification of economic activities and shortly afterwards requested the development of a classification of the goods that entered into international exchanges (imports and exports). The Committee on Statistics of the League of Nations had worked on an initial version of such a classification. Almost immediately after the war the Statistical Commission approved the International Standard Industrial Classification of Economic Activities (ISIC) and the Standard International Trade Classification (SITC).
Today, both classifications are in their fourth version (ISIC Rev. 3 and SITC Rev. 3, respectively). Currently, the economic statistics of the members of the European Union and of many central and eastern European countries are classified by the NACE (Nomenclature des Activités économiques dans les Communautés Européennes). The three North American countries have their own system of economic classifications (the North American Industry Classification System, NAICS). Most other countries use one or another of the various versions of the ISIC. The classification of goods was taken over by the body created to co-ordinate the activities of customs administrations worldwide (the Customs Co-operation Council) and ended up as the Harmonized Commodity Description and Coding System (HS). As such it has been adopted for international trade statistics by the overwhelming majority of countries. Indeed, a number of them use the same classification for purposes of presenting their commodity output in the context of their industrial statistics. The requirement to develop an international classification of services prompted, again at the request of the United Nations Statistical Commission, the development of the Central Product Classification (CPC), which is being increasingly used to compare statistics on international trade in services with the internal production of and commerce in services.
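Hierarchical classifications of this kind let the same basic data be presented at several levels of detail. The Python sketch below assumes, purely for illustration, four-digit activity codes whose leading digits identify progressively broader groupings; the codes and output values are hypothetical and are not taken from any actual version of the ISIC.

```python
from collections import defaultdict


def aggregate_by_level(records, digits):
    """Sum output over the first `digits` characters of each activity code."""
    totals = defaultdict(float)
    for code, output in records:
        totals[code[:digits]] += output
    return dict(totals)


# Hypothetical (activity code, output) pairs reported by four establishments.
records = [("1011", 120.0), ("1012", 80.0), ("1020", 50.0), ("2410", 300.0)]

by_group = aggregate_by_level(records, 3)     # three-digit groupings
by_division = aggregate_by_level(records, 2)  # two-digit groupings
print(by_group)
print(by_division)
```

Because every detailed code maps to exactly one broader grouping, totals published at different levels of the classification remain consistent with one another.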
4.1 Other Classifications These apply to a variety of economic statistics but (with the possible exception of the International Standard Classification of Occupations (ISCO)) they are neither as well developed as the ones on activity and product nor, at this stage, the subject of international agreement. There are several examples of such classifications. In the case of capital assets there is still the question of whether a classification of assets ought to be embedded in the standard classifications of goods or whether it should be sui generis. There is no suitable classification of intangible assets. In the realm of labor inputs, recent innovations in the contractual relationship between employers and employees have produced a variety of forms of employment, none of which has so far found its way into an accepted classification.
4.2 Conceptual Frameworks The conceptual framework for economic statistics is a hybrid, somewhere between business accounting and the system of national accounts. But, because economic statistics are the prime ingredient for the compilation of the national accounts, an explicit reconciliation between the former and the latter is required. Current trends suggest that because of pressures to keep government generated paperwork down, economic statistics tend to follow the patterns of business records. Where required, supplementary information is collected so that the basic statistics may be reconciled to the requirements of the national accounts compilers. But typically the necessary adjustments are conducted within the statistical agency. This is usually the case for key balances such as profits.
5. The Evolution of Economic Statistics as a System The traditional manner in which economic statistics were derived was by the periodic taking of an economic census. Indeed in earlier times the census of population was taken together with the economic census. More recently, the economic census is addressed to businesses and retrieves information on their activity, location, legal form of organization, and on how many and what kind of people they employ. Supplementary information is usually obtained through lengthier questionnaires addressed not to all businesses but rather to a selection, usually above a certain size but sometimes to a randomly selected fraction of all those in the scope of the census.
5.1 Censuses and Registers
Economic censuses have become steadily less popular, largely due to their cost, the time it takes to process them, and the fact that, because of rapid changes in the size, make up, and even activity of businesses, census information loses much of its value shortly after it is compiled (see Statistical Systems: Censuses of Population). Instead, there has been an equally steady increase in the popularity of registers of businesses, usually derived from administrative data and used for much the same purposes as censuses but updated far more frequently. Indeed, in a number of cases, the introduction of new administrative procedures connected with tax gathering has been managed bearing in mind possible requirements of importance to the government’s statistical agency. The lists of businesses derived from administrative or tax-gathering procedures, commonly referred to as ‘business registers,’ are used mostly in two modes: as sampling frames and as a basis for indicators of economic performance such as ‘net business formation’ (the difference between the numbers of businesses created and the number of businesses closed down in a given period of time). As sampling frames, business registers are invaluable to support ad hoc inquiries into the state of the economy, usually conducted by contacting a randomly chosen group of businesses and gauging the opinion of their chief executive officers.
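The two uses of a business register described above can be sketched in a few lines of Python. The record layout (`id`, `opened`, `closed`), the sample entries, and the choice of simple random sampling are all assumptions made for illustration; actual registers hold far richer information and agencies typically use stratified designs.

```python
import random


def net_business_formation(register, year):
    """Businesses created minus businesses closed down in the given year."""
    created = sum(1 for b in register if b["opened"] == year)
    closed = sum(1 for b in register if b["closed"] == year)
    return created - closed


def draw_sample(register, year, n, seed=0):
    """Simple random sample of the businesses still active at the end of `year`."""
    frame = [
        b for b in register
        if b["opened"] <= year and (b["closed"] is None or b["closed"] > year)
    ]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(frame, min(n, len(frame)))


# Hypothetical register entries; `closed` is None while a business is active.
register = [
    {"id": "A", "opened": 1998, "closed": None},
    {"id": "B", "opened": 1999, "closed": None},
    {"id": "C", "opened": 1999, "closed": 2000},
    {"id": "D", "opened": 1997, "closed": 1999},
    {"id": "E", "opened": 1999, "closed": None},
]

net_1999 = net_business_formation(register, 1999)
sample = draw_sample(register, 1999, n=2)
print(net_1999, [b["id"] for b in sample])
```

In this made-up register three businesses were created and one closed in 1999, so net business formation is 2, and the closed business is excluded from the sampling frame.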
There are several requirements that must be satisfied for the modern approach to the compilation of business statistics to work. In particular, the source of information for the register must be comprehensive; it must be kept up to date as a result of the administrative procedures on which it is based; and the statistical agency must have unrestricted access to the source. The benefits that accrue to a statistical office from a business register are that: (a) it can dispense with the task of conducting a periodic census; (b) it can conduct a coherent set of sample surveys and link the resulting information so as to form a composite picture of the economy and its performance; and (c) it can control the paper burden that it imposes on individual business respondents through its multiple surveys.
5.2 Applications of Business Registers Business registers are maintained at two distinct levels. At the first level there is a list of all businesses, compiled from their registration with some government department (usually the tax authority). This list conveys no more than that there may be a business behind a tax account. At the second level, the physical counterpart of the tax account is usually determined by direct contact once it is established that the business is more than just a legal entity. The relationships between businesses as legal entities and the establishments, mines, mills, warehouses, rail depots, etc. that comprise them are a source of interesting information, but they are also the most difficult part of the management of a business register.
6. Current Economic Indicators Under the impulse of macroeconomic stabilization policies and the belief in policy makers’ ability to fine tune effective demand, the early 1960s saw a worldwide push to develop current economic indicators of demand by sector. A great deal of the empirical work had been done in the US, spearheaded by the National Bureau of Economic Research and resulting in a classification of short term economic statistics according to whether they led, coincided with, or lagged behind the ‘reference’ economic cycle. Most attention was placed on the performance of manufacturing industry, particularly on the output of equipment for fixed investment as well as for export; on the construction of plant and of other nonresidential structures; and in general on leading indicators of capital formation. At the same time considerable effort was made to improve the reliability and the timeliness of the key variables for purposes of economic stabilization: the rate of unemployment and the change in the consumer price index. Lastly, the balances in the main markets of the economy were looked at through quarterly national accounts, which in turn demanded major developments in the quality and the quantity of basic economic statistics.
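The distinction between leading, coincident, and lagging indicators can be illustrated with a small Python sketch. It labels a series by the time shift at which its correlation with a reference series is highest. This is a simplified stand-in chosen for illustration, not the turning-point analysis associated with the National Bureau of Economic Research, and the series, function names, and `max_shift` parameter are all hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def best_shift(indicator, reference, max_shift=3):
    """Shift k at which indicator[t] correlates best with reference[t + k]."""
    best_k, best_r = 0, float("-inf")
    for k in range(-max_shift, max_shift + 1):
        if k >= 0:
            x, y = indicator[: len(indicator) - k], reference[k:]
        else:
            x, y = indicator[-k:], reference[: len(reference) + k]
        r = pearson(x, y)
        if r > best_r:
            best_k, best_r = k, r
    return best_k


def classify(indicator, reference, max_shift=3):
    """Label the indicator relative to the reference cycle."""
    k = best_shift(indicator, reference, max_shift)
    if k > 0:
        return "leading"
    if k < 0:
        return "lagging"
    return "coincident"


# Hypothetical reference cycle and three indicators built from it.
reference = [0, 1, 3, 2, 0, -2, -3, -1, 1, 4, 2, 0]
orders = reference[2:] + [-1, -2]            # moves two periods ahead
output = [2 * v + 1 for v in reference]      # moves in step
employment = [-1] + reference[:-1]           # moves one period behind

print(classify(orders, reference))
print(classify(output, reference))
print(classify(employment, reference))
```

A positive best shift means the indicator's movements are echoed by the reference cycle some periods later, which is what makes leading indicators useful for anticipating changes in demand.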
7. Today’s Array of Economic Statistics In today’s array of economic statistics the following would be the expected components in the more advanced statistical agencies: (a) statistics on the structure of the economy derived from either an economic census or from administrative data complemented by a survey on structural data (type of activity, breakdown of production costs, capital stock etc.); (b) exports and imports of goods and services, with the goods part broken down by commodity, country of origin and destination; (c) producer and consumer prices and indexes thereof; (d) short term indicators of economic change such as order books, deliveries of manufacturing goods and stocks thereof; (e) monthly physical output of key commodities and an index of industrial production; (f) business balance sheets (income, profit and loss, financial) and employment and unemployment as well as household income and expenditure. In addition, a statistical version of the accounts of the public sector would be an expected component. In order to support these statistics, statistical agencies would maintain an up-to-date version of a classification by economic activity, usually an internationally agreed classification with a national supplement, and a classification of goods and services. They would also have their own or an international version of the classification of occupational employment.
See also: Consumer Economics; Databases, Core: Demography and Registers; Databases, Core: Sociology; Economic Growth: Measurement; Economic History, Quantitative: United States; Economic Panel Data; Economics: Overview; Microdatabases: Economic; Quantification in the History of the Social Sciences; Statistical Systems: Education; Statistical Systems: Labor; Statistics, History of
Bibliography
Cox B et al. 1995 Business Survey Methods. Wiley Series in Probability and Mathematical Statistics. Wiley, New York
ISI, Eurostat, BEA 1996 Economic Statistics, Accuracy, Timeliness and Relevance. US Department of Commerce, Washington, DC
Statistics Canada 1997 North American Industry Classification System. Statistics Canada, Ottawa
Turvey R 1989 Consumer Price Indices, an ILO Manual. International Labor Office, Geneva
United Nations 1990 International Standard Industrial Classification of All Economic Activities, Statistical Papers Series M No. 4 Rev. 3. United Nations, New York
United Nations 1991 A Model Survey of Computer Services, Statistical Papers Series M No. 81. United Nations, New York
United Nations 1993 System of National Accounts. United Nations, New York
United Nations 1997 Guidelines on Principles of a System of Price and Quantity Statistics, Statistical Papers Series M No. 59. United Nations, New York
Voorburg Group 1997 Papers and Proceedings. United Nations Statistical Commission, New York
J. Ryten
Statistical Systems: Education Educational activities of some kind occur in every human system such as the family, workplace, and governmental organizations such as schools and colleges. The statistical systems available to provide measurements of educational activities, such as individual learning and institutional support, are found in the institutions that administer those activities: schools and local, state, and federal government agencies. This article will describe the status of statistical information about the US educational system and educational attainment of individuals at the year 2000. It will discuss types of information collected by state and federal agencies, international agencies, and education associations. These agencies provide the basic statistical resources for many forms of social science analysis of educational practices and outcomes. The educational statistical system was originally created to provide measurement of administrative capacity, such as the distribution of financial resources, rather than the capacity of individuals, such as what is learned by individuals. In recent years, national policies about education have shifted from providing access to resources to encouraging greater accountability by the school organizations themselves. Thus, the statistical system concentrated more attention between 1965 and 1999 on measures of student achievement and its correlates. The complete system of education statistics includes information about students, teachers, administrators, the general population, and finances.
1. Organization of National and International Education Statistics For the United States, the constitutional power and responsibility for providing schooling for children resides in the laws of each state. Local school systems and colleges and universities collect some forms of statistical information about themselves to administer their own programs. Each school system and state education agency or college and university assembles basic facts about itself such as numbers of students enrolled, number of teaching staff, number of graduates, and amount of financial resources available. Since the nineteenth century, the federal government has been given authority to assemble aspects of school
system statistics into national summaries at a central agency, such as the Department of Education, to provide national trends in finance, enrollment, and completions of schooling and to monitor the administration of programs in education. Standard definitions are assembled by the National Center for Education Statistics (NCES) of the US Department of Education to assist local agencies, states, and colleges and universities to develop common statistical measures (NCES 1999c). International organizations that provide central functions for the reporting of educational statistics include the Organization for Economic Co-operation and Development (OECD) and UNESCO. International standards (the International Standard Classification of Education, ISCED) were created in the 1970s by UNESCO to enable worldwide assembly of statistics of basic enrollments, graduates, and financial accounts into a single report (UNESCO 1998). Statistics collected under these procedures were published annually in the UNESCO Statistical Yearbook until 1998 but may be discontinued after that year (UNESCO 1998). The ISCED definitions are used also by the OECD for the collection of educational statistics for its member countries, which include European countries, Japan, Korea, Australia, New Zealand, Mexico, Turkey, and the United States (OECD 1999). OECD publishes an annual report, Education at a Glance, that provides comparable statistics across countries on demographics, financial resources, school participation, school environment, graduate output, student achievement, and labor market outcomes (OECD 1997). The Office of Management and Budget (OMB) provides general oversight of the federal statistical agencies in the US government by evaluating their programs and establishing funding priorities. It also oversees and coordinates the administration’s information and regulatory policies to develop better performance measures and to reduce burdens on the public.
Survey instruments of the population proposed by federal agencies are evaluated by the OMB offices to ensure that survey questions are not asked of the general public unnecessarily.
2. US Department of Education In recognition that local schools and states needed to be able to monitor their own performance in enrollments in schools by comparing themselves with other states, a US Department of Education was originally created in 1867 for the purpose of collecting information on schools and teaching that would help the states establish efficient school systems. The most recent US Department of Education was created in 1979 by the Department of Education Organization Act (93 Stat. 668; 20 USC 3401). The responsibility for education statistics belongs to an agency within the Department of Education: the National Center for Education Statistics (NCES). The General Education Provisions Act says, ‘The purpose of the Center shall be to collect, and analyze, and disseminate statistics and other data related to education in the United States and in other nations’ (20 U.S.C. 1221e-1, as amended). The programs of statistical data collection and reporting by NCES are described in their report Programs and Plans (NCES 1999c). Congress allocates funds to NCES for the purpose of collecting, analyzing, reporting, and training persons to use educational statistics. These funds are then used in nationally competed contracts with companies that carry out survey collection and analysis functions for the government. The quality of the statistics is monitored by NCES staff to ensure that unbiased collection procedures are followed and reported to all those who would use these statistical services. The agency conducts a few research studies of its own to improve, expand, and modernize its techniques of statistical collection and reporting. Oversight of the agency is provided by a board of researchers from university settings and by other governmental agencies such as the Office of Management and Budget, which seeks to assure the public that data collection requests are not too heavily burdensome on the organizations that are required to provide the funds.
3. The Census Bureau The first statistical information on schooling for the Nation was collected in the Census of 1840. The 1840 census required that ‘the marshals and assistants should collect and return in statistical tables … all such information in relation to mines, agriculture, commerce, manufactures, and schools, as will exhibit a full view of the pursuits, industry, education, and resources of the country’ (Wright 1900). This first report resulted in national statistics from school systems rather than from individuals and it resulted in a number of errors such that in some states more students were reported than children counted in the population. Beginning in 1850 individuals were asked whether they had ‘attended school at any time during the year preceding the census’ (Folger and Nam 1967 p. 2). Censuses during the twentieth century continued to ask individuals whether they were enrolled in school, but did not ask about grade of enrollment until 1940. The censuses since 1940 all contained grade of enrollment by demographic characteristics such as residence, sex, and race. The census of 1840 also initiated a series of questions for adults on the ability to read and write. The reliability of this item was said to be weak, and it was reported for whites only. Approximately 1 in 5 persons in 1840 answered the census takers that they were illiterate (Folger and Nam 1967). A question on ability to read and write was continued in the censuses until 1930. But, by 1940 illiteracy rates had dropped to about 3 percent and adults were asked how many years of school they had completed (Folger and Nam 1967 p. 131). This question has continued through the census of 2000, which includes modifications to expand the categories of persons with more than a high school education. A complete description of the actual census items used in measuring educational levels is contained in 200 Years of U.S. Census Taking (US Bureau of the Census 1989). The Census Bureau conducts monthly surveys of employment for the Bureau of Labor Statistics and includes items on educational attainment and school enrollment regularly so that changes in employment patterns, income, marriage, voting, and mobility might be examined for persons of different educational levels. The March Current Population Survey, which collects information on income for all respondents, contains questions about years of school completed. Statistics on these factors are reported in the annual series of reports on the Current Population Survey and are included in the public use tapes available to researchers. Occasionally, at the request of other agencies, the Census Bureau asks its respondents expanded questions about their education, such as their field of study (US Bureau of the Census 1999). In 1993 the measure of educational levels was changed from years of school completed to type of degree.
4. National Science Foundation and National Academy of Sciences The National Science Foundation monitors the production of scientists and engineers through the collection and analysis of surveys on doctorates and of financial support for basic research by government agencies as well as by private businesses. A summary of these statistical studies is reported every two years in a report titled Science & Engineering Indicators (National Science Board 2000). NSF also provides survey data, in publicly available databases, to researchers who intend to study issues in academic science and engineering resources (such as the production, employment, and funding of scientists). Some of the statistical data sources can also be accessed directly through the website (http://www.nsf.gov/sbe/srs/stats.htm). The National Academy of Sciences is a nongovernmental organization established by President Lincoln to provide the Federal Government with the best advice of scholars throughout the country. It receives funding from federal agencies to conduct studies on specific issues, including the quality of statistical data collection activities of the government and the interpretation of statistical reports on student achievement and international comparisons. Thus, discussions and reports by the National Academy of Sciences indirectly influence the policies of the government in procedures for their statistical system. Reports of the National Academy are publicly available through their website (http://www.nas.edu).
5. School Participation and Graduation The number and percentage of the population attending schools of different kinds was a significant indicator of changes in institutional support for education during the nineteenth and twentieth centuries when the education system was expanding. In more recent years, a more significant indicator of educational levels is the completion of secondary school, of college, and field of study of graduates. Information about enrollments and graduates has been collected in household surveys, the census of population, and from reports of schools and colleges to the Commissioner of Education Statistics since 1840 and is summarized annually in the Digest of Education Statistics (NCES 1999a). The federal statistical agencies provide aggregate summaries of local, state, and household reports into a national summary. Such descriptive statistics are of more interest to historians’ analysis of changes in American institutions than to social science analysis of individual differences in progress through the education system. The study of personal choice within a school setting is conducted in a series of studies that follow a sample of students through school with paper-and-pencil questionnaires. Many such longitudinal studies are carried out by federal agencies in education, labor, and health for the purpose of analyzing consequences of student performance and choices made at an early age on later life chances (West et al. 1998). Many of these studies are conducted by joint participation of researchers and government agencies for the design and support of the analysis of significant issues in education and educational policy. Beginning in the 1990s, most of this information is made available through agency websites.
6. Student Achievement
Every classroom in the United States evaluates student performance directly or indirectly. A national system of testing companies measuring student performance has grown since World War II to provide school teachers and administrators with professionally developed assessment instruments, used either to evaluate the performance of whole schools against national averages or to produce individual student scores for screening for college entry. Because of concerns that student performance as measured by college entrance exams appeared to be declining during the 1960s, but that the national estimates were not reliable, the Federal government created the National Assessment of Educational Progress (NAEP) in 1966 as an independent measure of changes in student performance across the country. The National Assessment surveys and reporting of statistics have been supported and managed by NCES, and results of the assessments for mathematics, science, reading, writing, and civics can be found in its annual publications. Establishing national standards for education performance and developing a methodology for conducting national assessments was made the responsibility of the National Assessment Governing Board by Congress in 1988 (see http://www.NAGBE.org).
Planning and research to develop international comparisons of student performance were initiated in 1958, and the first comparison was conducted among 14-year-old students in mathematics in 1965 (Husen 1967). The countries that participate in these international comparisons formed an informal network of research institutes called the International Organization for the Evaluation of Educational Achievement (IEA). Several international studies of student performance in science, civics, writing, and mathematics were conducted between 1970 and 1999 (Medrich and Griffith 1992, OECD 2000). The US government, through the NCES and NSF, has provided support and monitoring of these studies because their results have received significant national publicity and have been effective in mobilizing local school districts to pay greater attention to student performance levels. The growth of the American form of testing in many other countries can be seen in the large number of countries that participated in the Third International Mathematics and Science Study in 1995, when 50 countries participated in some aspects of the study; new countries were added to the assessment carried out in 1999. New international studies will be carried out after the year 2000 that use European definitions of the end of secondary schooling and measure student performance on skills and knowledge of mathematics, science, and reading literacy (OECD 2000).
Federal agencies conduct a number of studies that are intended to inform the general public about knowledge and attitudes held by adults. For example, the International Adult Literacy Survey was conducted in member countries of the OECD in 1994 to describe the literacy and numeracy of the adult populations in those countries (http://nces.ed.gov/pubs98/98053.pdf). A new survey of adults called the ‘Life Skills Survey’ is being planned for the year 2002 that will include national and international comparisons to identify and measure adult skills linked to social and economic success. The items in this survey will be based on a framework derived from American theories of intelligence and are intended to yield paper-and-pencil tests that would demonstrate the distribution of these skills in the population (NCES). This study and another study conducted for the National Science Foundation seek to identify the distribution of knowledge and attitudes about science (science literacy) among adults in the United States and other countries (National Science Board 2000). Each of these surveys serves to provide policymakers with information about the level of information held by adults.
Studies and reports about methods for assessing student performance are conducted by the National Academy of Sciences in its Board on Testing and Assessment and the Board on International Comparative Studies in Education. The members of these boards are selected from academic researchers who have written about student assessment procedures. Their discussions and reports raise questions about the statistical quality, and the intended and unintended effects, of the reporting of student assessment by statistical agencies, and thus provide helpful guidance to federal policies about data collection.
7. Teachers, Teaching, and Curriculum and Instruction
A complete system of statistics for measuring changes in the US educational system would have to include aspects of the educational process itself: teaching, curriculum, and instruction (Shavelson et al. 1989). Jeannie Oakes and Neil Carey say: ‘Curriculum is the what of science and mathematics education. Curriculum is content—the topics, concepts, processes and skills that students learn in science and mathematics classes—but it includes more than that. It includes the depth to which students explore content; the way teachers organize, sequence, and present it; and the textbooks and materials schools use’ (Shavelson et al. 1989, p. 96).
International studies have helped introduce the methods of measuring the content of education systems. However, the statistical system has yet to put together all of the elements of this system in one place so that the operating events in schools can be compared and contrasted. Surveys are separately conducted about student achievement, teacher characteristics, curriculum, and teaching practices (National Science Board 2000). Little consensus has been reached among education researchers about the most significant events in schools. Studies are conducted of pedagogy, curriculum content, and home environment (The Coleman Report 1966). The statistical system produces statistical studies that permit the analysis of social characteristics of students and parents as well as some of the classroom. But no single survey itself can summarize all of the elements essential to describing the processes and outcomes of the educational system.
8. School Finance
Differences in community capacity to support school systems result in large differences across the United States in revenues for public elementary and secondary schools. Since the differences may result either from differences in cost of living or from unequal spending, the collection and analysis of statistics on school finance of public elementary and secondary school systems is a major activity for NCES (NCES 1999c). In higher education, the methods that students and families use to pay their tuition and living expenses while in college are a national policy concern and a subject of intense data collection and analysis (NCES 1999c). Surveys are conducted to determine the effect on individuals and institutions of federal policy on tuition subsidies. The connection between the educational level of individuals and personal income is a subject of data collection in surveys by the US Census Bureau, in which individuals are asked to report both their educational levels and income. In only a few instances have Census Bureau surveys associated personal income with measures of specific educational experiences, such as field of study in college.
9. Statistics and the Internet
The results of education statistical studies are being disseminated more rapidly and thoroughly through the Internet than in previous years. All federal and many state education agencies are making statistical tables and analytical reports available on their websites. The databases themselves can also be accessed over the Internet. In some instances, it is possible to request by electronic mail a specific table from a particular survey, which then generates the required summaries. These changes in the production and distribution of statistical materials have revolutionized the availability of statistics and make their use in many reports much more likely.
10. Statistics and Public Policy
School systems of all kinds have been undertaking reform of their practices of teaching and instruction for a number of years. Statistical measurement of the outcomes of schooling is important for the public discussion of the purposes and directions of these reforms. For example, the results of the Third International Mathematics and Science Study, which reported that US students compared well with those in other countries in the 4th grade, less well in the 8th grade, and scored much lower than those in other countries in the 12th grade, attracted the attention of the President and many members of Congress. The production of statistical studies has affected deliberations about the importance of legislation regarding the improvement of education in the United States.
Bibliography
The Coleman Report 1966 Equality of Educational Opportunity. US Department of Health, Education, and Welfare, US Government Printing Office, Washington, DC
Folger J K, Nam C 1967 Education of the American Population. US Government Printing Office, Washington, DC
Husen T (ed.) 1967 International Study of Achievement in Mathematics. John Wiley, New York
Medrich E, Griffith J 1992 International Mathematics and Science Assessments: What Have We Learned? US Department of Education, NCES 92-011
National Science Board 2000 Science and Engineering Indicators: 2000. National Science Foundation, Arlington, VA, NSB 0001
Office of Management and Budget 1999 Statistical Programs of the US Government: FY 2000
Organization for Economic Co-operation and Development 1997 Education at a Glance. OECD, 2 rue Andre-Pascal, Paris
Organization for Economic Co-operation and Development 1999 The OECD Data Base. OECD, 2 rue Andre-Pascal, Paris. http://www.oecd.org/statlist.htm
Organization for Economic Co-operation and Development 2000 Program for International Student Assessment. http://www.pisa.oecd.org
Shavelson R J, McDonnell L, Oakes J (eds.) 1989 Indicators for Monitoring Mathematics and Science Education: A Sourcebook. RAND, Santa Monica, CA
UNESCO 1998 UNESCO Statistical Yearbook 1998. Unesco, 7 place de Fontenoy, 75352 Paris 07 SP
US Bureau of the Census 1989 200 Years of U.S. Census Taking, Population and Housing Questions, 1790–1990. US Government Printing Office, Washington, DC
US Bureau of the Census 1990 1990 Census of Population, Social and Economic Characteristics, United States. (1990 CP-2-1) Appendix B or http://www.census.gov/td/stf3/appendIb.html
US Bureau of the Census 1999 Education Statistics. http://www.census.gov/population/
US Department of Education, National Center for Education Statistics 1999a The Digest of Education: 1999. May 1999. NCES 1999-032
US Department of Education, National Center for Education Statistics 1999b Classification Evaluation of the 1994–95 Common Core of Data. January 8
US Department of Education, National Center for Education Statistics 1999c Programs and Plans of the National Center for Education Statistics for 1999. NCES, no. 1999-027
West K, Hauser R, Scanlan T (eds.) 1998 Longitudinal Surveys of Children. National Academy Press, Washington, DC
Wright C 1900 History and Growth of the United States Census. US Government Printing Office, Washington, DC
L. Suter
Statistical Systems: Health
Statistics is the science of collecting, analyzing, presenting, and interpreting data. A statistic is an item of quantitative data, and so the term statistics can also mean a collection of quantitative data, i.e., the results of collecting and analyzing data. The term health statistics system can be used to describe any organized or structured mechanism that involves health statistics. It can therefore include systems that are used for collecting, analyzing, interpreting, and making available quantitative data in relation to health status or health services activity, as well as referring to the actual collections of quantitative health-related data. Increasingly these statistics systems are computerized, and the increased availability and ease of access of information technology means that this trend towards computerization of health statistics systems is likely to continue in the future. However, in many countries paper-based systems are still used for the collection of health statistics and, in some cases, the aggregation and dissemination of these health statistics are also not computerized. Therefore, focusing solely on computerized health statistics systems when looking for health information can mean that much valuable information is missed.
Decisions about health and health care, whether made by policy makers, by health professionals, or by the public, are influenced by the information that is available. Much of this information is derived from official and other health statistics. Effective health care is therefore dependent on the process of gathering, analyzing, and disseminating timely and appropriate information. The quality and accessibility of the health statistics systems from which this information is derived can have a major influence on how health and health care systems are run and used.
When discussing statistical systems the terms data and information are often used interchangeably, but it is important to recognize the differences. Data are specific facts recorded in some way for possible future use, for example, the annual expenditure for a specific department in a hospital. Data become information when they are given meaning, usually by being put in some relationship with other data.
For example, relating expenditure for a hospital department to the services provided by that department and comparing this with the data for similar departments in other hospitals, provides information on how expensive it is to deliver the service in the hospital concerned compared to other hospitals. Health statistics are derived from data, and should provide information. Another term that is increasingly used in relation to health information is knowledge. By putting information in the context of other information, including both quantitative data and textual information, deductions can be made that increase knowledge. Knowledge provides the basis for making decisions in a rational way. Knowledge, based on information, allows appropriate actions to be selected in order to achieve specific objectives. This is often referred to as evidence based decision-making. This is one of the main purposes of gathering data and producing health statistics.
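The hospital department example above can be made concrete with a small sketch. It assumes a deliberately simple measure (expenditure divided by a count of services delivered) and uses invented hospital names and figures; none of these come from the article, and a real comparison would need case-mix and other adjustments.

```python
from statistics import mean

# Hypothetical raw data: annual expenditure and services delivered by the
# same department in three hospitals (illustrative values only).
departments = {
    "Hospital A": {"expenditure": 1_200_000, "services": 4_000},
    "Hospital B": {"expenditure": 900_000, "services": 3_600},
    "Hospital C": {"expenditure": 1_500_000, "services": 6_000},
}


def cost_per_service(record):
    """Relate one datum (expenditure) to another (services provided)."""
    return record["expenditure"] / record["services"]


def compare_to_peers(name, data):
    """Return a hospital's unit cost and its ratio to the mean unit cost
    of the other hospitals in the data set."""
    own = cost_per_service(data[name])
    peers = [cost_per_service(r) for h, r in data.items() if h != name]
    return own, own / mean(peers)


own_cost, ratio = compare_to_peers("Hospital A", departments)
print(f"Hospital A: {own_cost:.2f} per service, "
      f"{ratio:.2f} times the peer average")
```

Each expenditure figure on its own is data; the ratio to the peer average is the kind of derived figure the text calls information.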
1. Scope of Health Statistics
In some respects, the terms health information system and health statistics system can be used interchangeably. Where the focus is on the collection of the data, or on the statistical calculations that need to be carried out on the data to produce useful information, the term health statistics may be used more often. However, if the focus is on the results of this data collection and analysis, and on making this information available, then the term health information is used more often. The use of telemedicine is expanding the scope of health care information, and so the definition of health information is growing to include not just health statistics but also multimedia audio, video, and high-resolution still images (Balch and Tichenor 1997).
The scope of what health statistics systems cover can be defined in several ways: (a) by the type of data and information included; (b) by where the data are produced; (c) by where and how the information is held and made available; and (d) by the purposes for which the information is used.
A narrow definition of health statistics could conclude that the purpose of collecting statistical information should be primarily to satisfy the needs of government. This would mean that health statistics systems would cover only those systems designed to produce official health statistics for government use. However, the current view in many countries, and of many of the producers of official statistics, is that the statistics are produced to serve not only government but also the wider community. The members of this wider community of health statistics users come from different organizational backgrounds, including legislative bodies, government departments, health service organizations, researchers, voluntary organizations, and patients, as well as the public in general. The health service organizations that need to use health statistics can also be wide ranging, depending on the health system in operation in the country concerned.
They are likely to include both the public and private sector, and both providers of health care services and purchasing organizations.
This broader view of health statistics users is both driven by, and enabled by, the rapid spread of the use of the Internet and the World Wide Web. Familiarity with accessing information via the web has raised the expectations of both health professionals and the public in relation to the availability of health information. Increasingly, producers of health statistics have started to respond to this by making the information they produce available in different formats and through different mechanisms to suit the differing needs of the user community. The Internet is now widely used as a common method of disseminating information about statistics, as well as providing access to the actual data in some cases. Access to statistics held on data archives, many of which include health statistics, is also possible via the web. Web-based approaches can also allow information from different systems to be brought together, either by making these available via a web site, or by using web-based technologies to access information held on different systems.
The design and coverage of health statistics systems, described later in this article, therefore need to be seen as being currently very much in a transitional stage. For many systems, developments are ongoing which will take advantage of the opportunities, and also meet the challenges, of the technology that is now available for bringing together information from different sources and for providing wider access to this information (Sandiford et al. 1992).
2. Use of Health Statistics
There are many different individuals and organizations involved in planning, managing, running, investigating, and using the health care system. Health statistics systems should ideally be designed to meet the information requirements of this health care community. Some of the main uses of health statistics are summarized in Sect. 2.1 to 2.9.

2.1 Information for Policy Makers and for the Management of Health Care Delivery
Policy makers at both national and local level need information to support decisions on planning and funding of services. This applies both to policy makers responsible for publicly funded services, and to organizations such as health insurance companies involved in the funding and provision of private health care services (Roos 1999, Roos et al. 1999). Health information systems can provide decision-makers with the capability to:
(a) Make critical comparisons across regions and subregions of residents’ health status, socioeconomic risk characteristics, and use of hospitals, nursing homes, and physicians.
(b) Carry out analyses, in relation to the population served, of demographic changes, expenditure patterns, and hospital performance.
(c) Carry out outcome research across hospitals and countries to support recommendations on effective treatments.
(d) Make comparisons of expenditure relative to outcomes to support purchasing decisions.
(e) Carry out longitudinal research on the impact of health policy changes.
This information helps to answer questions such as what health care services particular population groups need and, by comparing this need to the current service provision, which populations need more services and which need less. (See Slater 1999, Azubuike and Ehiri 1999, De Kadt 1989.)
In addition, health information systems are important support tools in the management and improvement of the delivery of health care services. Experience in several countries suggests that comparison of hospital-specific data influences hospital performance. If statistics on provider performance are made available, this not only ensures prudent and cost-effective health care purchasing but also gives health care providers comparable information to enable them to improve the care they deliver.

2.2 Health Surveillance Systems
In health terms, surveillance is the close monitoring of the occurrence of selected health events in the population. This implies that the events to be monitored can be specified in advance. This may include setting up a notification system for the health event, to supplement data available from routine health data collection systems, for example in the case of infectious diseases such as tuberculosis. However, the events monitored can also include certain signs and symptoms which can be seen as ‘indicator events’, to give warning of unforeseen problems.
Surveillance methods have been developed as part of efforts to prevent or control specific health events. Surveillance systems are therefore dynamic, in contrast to archival health information, which can be used for epidemiological analysis. The events that can be covered by surveillance systems include infectious diseases, maternal and child health, chronic noninfectious diseases, injuries, and the effects of exposure to occupational or environmental risk factors.

2.3 Types of Health Information
The ultimate aim of any statistical work in the health sector should be to assist in the maintenance and improvement of the health of the population and in the delivery of health care services. Health information can therefore be defined as any information that is needed to support these activities.
It is important to be able to access both health service and non-health service data when looking at health needs, since the determinants of health often lie outside the health service. Work in relation to prevention of illness and the promotion of good health therefore requires access to data from a wide range of sources. In response to this requirement, health statistics systems need to cover, or provide access to, information not only on all aspects of the health care systems, but also on factors that influence health and the availability and use of health care.
There are different models of health systems used in different countries, and these have an impact on the health statistics systems, both in terms of the data collected, and hence the information potentially available, and in terms of where the information is held and who has responsibility for it. There may also be variations in the organization of health care, hospitals, and hospital information systems within a given national environment (Saltman and Figueras 1997). However, the differences in health system models usually relate mainly to funding mechanisms, rather than to the basic activities that need to be carried out within the health sector. This means that the types of information held are usually similar, even if the organizations involved are different. The types of information covered usually include:
(a) Health status
—Health of the population
—Vital statistics (births, deaths)
—Health surveillance—for both infectious and noninfectious diseases
(b) Registers
—Patient registers, e.g., persons insured, persons registered with a family doctor
—Disease registers, e.g., cancer
(c) Health service resources
—Health care facilities
—Health care staff
—Other resources, e.g., medical equipment, drugs
(d) Health service activity
—Hospital and other secondary care activity
—Primary care activity
—Financial information
(e) Explanatory factors
—Economic information
—Sociodemographic information
—Environmental information
(f) Other related data (often government or public sector statistics)
—Social care sector activity
—Police—crime figures, road traffic accidents
—Education
Some of the main types of information listed are discussed in more detail in later sections. In addition to the quantitative data listed in Sect. 2.3, the interpretation of information from health statistics systems often requires reference to textual information in relation to current policies and guidance, and also to research findings.
Some of the information is obtained using administrative data, which are routinely collected as part of the process of running and monitoring the health system. In addition, information is available from censuses, either using data from national population censuses or from censuses of specific groups, e.g., a census of patients in hospital at a particular time. Health statistics are also available from a wide range of survey data.
These can be international, national, regional, or local surveys, and can include surveys that are part of a regular survey program as well as ‘one off’ surveys carried out for a specific research or policy analysis study.
The increased use of computerization in health statistics systems has already been mentioned. Computers can store large volumes of data and can process them quickly. This means that if data are entered on to a computer at source it may not be necessary to summarize data before onward transmission. For example, every patient’s record can be stored and sent rather than just counts of patients with particular diseases. Consequently, converting data into information can be postponed until a specific issue raises the need for a particular piece of information. Much more flexibility is therefore available in how the data are aggregated to create information. Also, data held in electronic form are no longer confined to being in one place at a time. Several people can simultaneously be reading the same record in quite different locations. This creates possibilities for sharing information which were never feasible with paper systems. However, it also increases the risk of unauthorized access to data, and data confidentiality issues in relation to health statistics systems are discussed in a later section.

2.4 Health of the Population
Health monitoring requires measures of incidence, prevalence, and severity of illnesses, and the consequences of illness, classified by age, sex, socioeconomic, and geographical characteristics of the population. There are no ideal measures, and several sources need to be used in combination, for example using both health surveys and information about patients in hospital. The varied sources of population health information provide different pieces of the overall health picture.
Ultimately the objective of health care expenditure is to improve the health of the population, not for its own sake but so that people can live happier, more prosperous lives. Gains in health status might be regarded as the ultimate measure of a health program’s success.
Composite measures of health status such as the DALY (Disability Adjusted Life Year) and the QALY (Quality Adjusted Life Year), although developed from different disciplinary backgrounds, both attempt to summarize the potential gains from medical interventions (Williams 1999, Jenkinson 1994). Whereas health economists developed QALYs, DALYs have their roots in epidemiology. As a consequence, the perspective of the first is based on patient assessment of pain and suffering, while the second is disease driven, with a medical assessment of the amount of suffering.

2.5 Vital Statistics—Mortality Data
The most absolute measure of ill health is death, and mortality statistics are one of the principal sources of information about the changing patterns of disease and the differences in these patterns between countries. Since mortality statistics are widely available, death rates have traditionally been used as crude, but often effective, surrogates for more comprehensive measures of disease. This is often justified by the fact that death is unambiguous and the data collection is complete. However, use of mortality data alone can provide misleading impressions of the burdens of disease in society. For example, in certain parts of the world, musculoskeletal disorders (including rheumatism and arthritis) are the most important current causes of limiting long-standing illness but are one of the least prevalent causes of death.
Death rates vary by age and by sex and so, if comparisons are to be made between populations with different age and sex composition, some adjustments are needed. A common approach is to use age-standardized rates and ratios (often referred to as Standardized Mortality Ratios or SMRs). These are calculated by applying the rates found for a particular population to an agreed standard population. Standardization is also used when comparing other measures, such as hospitalization rates, which are likely to be affected by the age and sex structure of the population.
Another measure commonly used when considering premature mortality or avoidable deaths is Potential Years of Life Lost (PYLL). This highlights the loss to society as a result of early deaths (for example, below age 65 or 75). The figure is the sum, over all persons dying from a given cause before the stated age, of the years that those persons would have lived had they survived to that age (e.g., 65).

2.6 Vital Statistics—Births Data
Although systems vary between countries, the systems from which birth and maternity data can be derived are likely to include civil registration of births, the notification of births, and hospital-based data collection systems. Birth registration is the oldest source of data about births. In most countries the primary reason for setting up systems for registering births, marriages, and deaths was to provide civil documents to record the events for legal purposes.
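The two mortality measures described in Sect. 2.5 can be illustrated with a short sketch. It assumes the calculation as the text states it: age-specific rates for one population are applied to an agreed standard population (commonly called direct standardization), and PYLL is summed only over deaths before the cutoff age. The function names, age groups, and figures are hypothetical. The ratio form (the SMR) is usually obtained differently, by comparing observed deaths with the number expected from standard rates, and is not shown here.

```python
def direct_standardized_rate(age_specific_rates, standard_population):
    """Apply a population's age-specific rates to a standard population.

    Both arguments are dicts keyed by the same age-group labels. The result
    is the overall rate the standard population would experience if it had
    the study population's age-specific rates.
    """
    total = sum(standard_population.values())
    expected_events = sum(
        age_specific_rates[group] * count
        for group, count in standard_population.items()
    )
    return expected_events / total


def potential_years_of_life_lost(ages_at_death, cutoff_age=65):
    """Sum, over deaths before cutoff_age, the years remaining to that age."""
    return sum(cutoff_age - age for age in ages_at_death if age < cutoff_age)


# Hypothetical figures, for illustration only.
rates_per_1000 = {"0-44": 1.0, "45-64": 5.0, "65+": 40.0}
standard_pop = {"0-44": 600, "45-64": 250, "65+": 150}
standardized = direct_standardized_rate(rates_per_1000, standard_pop)

ages = [30, 50, 64, 70, 80]
pyll = potential_years_of_life_lost(ages, cutoff_age=65)

print(f"Age-standardized rate: {standardized:.2f} per 1000")
print(f"PYLL before age 65: {pyll} years")
```

Because the same standard weights are used for every population being compared, differences in the standardized rates reflect differences in age-specific rates rather than in age structure.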
To obtain a full picture of birth and maternity care, data also need to be collected about facilities and staffing of the maternity services and about the socio-economic context in which childbirth occurs.

2.7 Patient Registers
Where health insurance systems are in operation, there will be registers for the population covered by the schemes. These registers are likely to include information such as the individual’s insurance status, name, date of birth, address, and identification number, plus possibly other socio-economic or medical status data, e.g., whether the person is a diabetic. Other data related to individuals would be the health insurance contributions made on their behalf from employers or through other mechanisms. Information on dependents covered by an employed person’s contributions may also be included.
In many health systems patients will be registered with a family doctor or primary care center. These registers are likely to include some of the same basic patient information as the health insurance registers, as well as some health status information. Records of preventive services such as immunizations carried out by family doctors may also be included.

2.8 Disease Registers
Separate registers are often kept, usually at local or regional level, for patients with specific diseases, and cancer registration is an important example of this. Since, in many parts of the world, cancer-related services are consuming ever-increasing health resources, the information from cancer registries is important in helping health care planners, researchers, and policy-makers formulate strategies to meet this challenge.
Cancer registration is the collection and classification of data on all cases of cancer occurring within a defined territory. The primary source of data on which cancer registration is based is the patient’s clinical case notes. This holds true whether the information is derived from hospital departments, primary or community care, or the private sector.
Cancer registration is a worldwide process. The longest established cancer registries are in Europe and North America, but increasingly they are being developed in the countries of Asia, Africa, and South America. Attempts to record comprehensive information on the incidence and distribution of cancer have been made as part of individual censuses and studies since the last century, but it was not until the late 1940s that cancer registration really got underway. In 1959 the World Health Organization (WHO) produced recommendations for the establishment of cancer registries, and in 1965 the International Agency for Research on Cancer (IARC) was established as a specialist research center of WHO.
2.9 Hospital Data
A wide variety of data is available from hospitals for the purpose of health service analysis and performance monitoring. Where computerized systems are not available, much of this is obtained as aggregate counts of patients, and of resources used, falling into predefined categories. However, where computerized health information systems are in operation, individual records of care can be accessed for each patient, containing wide-ranging details of the diagnoses, the treatments delivered, and the resources used.

2.10 Non-health Service Data
As has already been mentioned, it is important to be able to access non-health service data when looking at
health needs, and when comparing statistics on health service activity. In many countries there is also an increasing emphasis within the health sector on partnership working with other agencies, and the need to share information on common and related issues. Health and health care activities can be affected by activities in other sectors, and monitoring and planning of health services activity need to take this into account. Non-health service data sources include population estimates, vital statistics, and census data. Vital statistics covering data on births and deaths have already been mentioned. Population estimates and census data are discussed elsewhere in this publication. Various deprivation indices, some of which are derived from Census data, are also used when analyzing and interpreting statistics relating to the health sector. The links between housing and health have been well established. Some data on housing can usually be obtained from population censuses, and information about housing stock may also be available from local government statistical systems. Poor air quality is also considered to be detrimental to health. Once again some information may be available from government statistical systems supplemented by special studies. Other statistics relevant to health that may be available from government sources include information on education, income, unemployment, transport, and crime.
3. Standard Coding/Classification Systems
If health statistics are to be compared across different organizations, regions, and countries, it is important that common coding and classification systems are used throughout the health sector and that they are used consistently. The WHO International Classification of Diseases is a classification of specific medical conditions and groups of conditions which is determined by an internationally representative group of experts who advise the World Health Organization (WHO), which publishes periodic revisions. It was first introduced a hundred years ago and the version in current use worldwide is the tenth revision (ICD10). In addition to the use of international classifications such as ICD10, many countries and international communities have undertaken initiatives to standardize the health data they collect in order to facilitate comparisons across countries. An example of this is the European Union’s program ‘Community Action on Health Monitoring’ that was adopted for a five-year period with effect from 1 January 1997 (for EU health information projects, see CORDIS website). Its stated objective was to contribute to the establishment of a Community health monitoring system that makes it possible to:
(a) Measure health status, trends and determinants throughout the Community.
(b) Facilitate the planning, monitoring and evaluation of Community programs and actions.
(c) Provide member states with appropriate health information to make comparisons and to support their national health policies.
(d) Develop methods and tools necessary for analysis and reporting on health status, the effect of policies on health and on trends and determinants in the field of health.
The program is structured in three pillars as follows:
(a) Pillar A deals with the establishment of Community health indicators and with conceptual and methodological work related to the process of making the data comparable and for identifying and developing suitable indicators;
(b) Pillar B deals with the development of a Community-wide network for the sharing and transferring of health data between member states, the European Commission and international organizations;
(c) Pillar C deals with the development of methods and tools necessary for analysis and reporting and the support of analyses and reporting on health status, trends and determinants and on the effect of policies on health.
These activities are being carried out in close cooperation with the member states and also with institutions and organizations which are active in the field of health monitoring, in particular: the World Health Organization, the Organization for Economic Cooperation and Development, and the International Labor Organization (OECD Health Data, WHO Health for All Database, WHO World Health Report 2000, WHO Health in Transition profiles).
4. Managing Confidentiality
One important aspect of health information, and therefore of the systems that relate to health information, is confidentiality. This is because the information held may relate to identifiable individuals. Many countries, as well as international communities such as the European Union, have strict laws requiring organizations, including government agencies, to register the types of personal information they hold, the purposes they use it for, and who they share it with. These laws give people the right to inspect the data which organizations hold about them and to challenge inaccuracy and misuse. Information systems and data flows have to be designed to respect privacy, with personal identifier data removed when it is not needed and security precautions in place when it is needed. Some principles that commonly apply to collecting, holding, and using health data are that:
(a) Health authorities should justify the collection of personally identifiable information,
(b) Patients should be given basic information about data practices,
(c) Data should be held and used in accordance with fair information practices,
(d) Legally binding privacy and security assurances should attach to identifiable health information with significant penalties for breach of these assurances,
(e) Disclosure of data should be made only for purposes consistent with the original collection,
(f) Secondary uses beyond those originally intended by the data collector should be permitted only with informed consent.
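The requirement that personal identifier data be removed when it is not needed can be illustrated with a minimal Python sketch. The field names, the example record, and the choice of a salted SHA-256 digest as a linking pseudonym are all assumptions made for illustration; they are not drawn from any particular health information system, and salted hashing on its own should not be taken as sufficient anonymization.

```python
import hashlib

# Hypothetical field names; a real register would define its own schema.
DIRECT_IDENTIFIERS = {"name", "address", "insurance_number"}


def pseudonymize(record, salt):
    """Return a copy of `record` with direct identifiers removed.

    A salted SHA-256 digest of the insurance number is kept as a stable
    pseudonym so that records for the same person can still be linked.
    """
    digest_input = (salt + record["insurance_number"]).encode("utf-8")
    pseudonym = hashlib.sha256(digest_input).hexdigest()
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["pseudonym"] = pseudonym
    return cleaned


# Invented example record, for illustration only.
record = {
    "name": "Jane Example",
    "address": "1 Sample Street",
    "insurance_number": "A123",
    "year_of_birth": 1960,
    "diabetic": True,
}
print(pseudonymize(record, salt="demo-salt"))
```

Keeping a pseudonym rather than dropping the identifier entirely is one possible design choice: it allows records from different sources to be brought together for statistical purposes without exposing the name or address.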
5. Recent Developments
Some of the trends in technology affecting health information systems have already been described. Another feature of planned developments in systems is a move towards integrated, patient-centered computing. Currently information is often collected according to the activity carried out, e.g., by hospital department, which can make it difficult to obtain a complete set of information for an individual patient, or group of patients. The patient-centered approach means that the computer systems and files are designed around capturing information along the whole continuum of care for a patient (Nold 1997). It is also expected that health statistics systems will move towards higher-speed data transmission, and greater network capacity, following national and international computing standards. This will allow more rapid access to health statistics, as well as facilitating the bringing together of information from different sources, leading to improved information and greater knowledge about population health and health care (Mendelson and Salinsky 1997).

See also: Health Behaviors; Health Care Delivery Services; Health Economics; Health Education and Health Promotion; Health Interventions: Community-based; Health Policy; Health Surveys
Bibliography
Azubuike M C, Ehiri J E 1999 Health information systems in developing countries: Benefits, problems, and prospects. Journal of the Royal Society for the Promotion of Health 119(3): 180–4
Balch D C, Tichenor J M 1997 Telemedicine expanding the scope of health care information. Journal of the American Medical Informatics Association 4(1): 1–5
CORDIS website http://dbs.cordis.lu/
De Kadt E 1989 Making health policy management intersectoral: Issues of information analysis and use in less developed countries. Social Science & Medicine 29(4): 503–14
ICD10 http://www.who.int/whosis/icd10
Jenkinson C (ed.) 1994 Measuring Health and Medical Outcomes. UCL Press, London
Mendelson D N, Salinsky E M 1997 Health information systems and the role of state government. Health Affairs 16(3): 106–19
Nold E G 1997 Trends in health information systems technology. American Journal of Health-System Pharmacy 54(3): 269–74
OECD Health Data website http://www.oecd.org/els/health/
Roos N P 1999 Establishing a population data-based policy unit. Medical Care 37(6 Suppl): JS15–26
Roos N P, Black C, Roos L L, Frohlich N, De Coster C, Mustard C, Brownell M D, Shanahan M, Fergusson P, Toll F, Carriere K C, Burchill C, Fransoo R, MacWilliam L, Bogdanovic B, Friesen D 1999 Managing health services: How the Population Health Information System (POPULIS) works for policy-makers. Medical Care 37(6 Suppl): JS27–41
Saltman R B, Figueras J 1997 European Health Care Reform: Analysis of Current Strategies. WHO Regional Publications, European Series, No 72
Sandiford P, Annett H, Cibulskis R 1992 What can information systems do for primary health care? An international perspective. Social Science & Medicine 34(10): 1077–87
Slater C H 1999 Outcomes research and community health information systems. Journal of Medical Systems 23(4): 335–647
Williams A 1999 Calculating the global burden of disease: Time for a strategic reappraisal? Health Economics 8(1): 1–8
WHO Health for All Database website for Health for All in the 21st Century, http://www.who.int/archives/hfa/
WHO Regional Office for Europe, European Health Observatory Health in Transition profiles, http://www.observatory.dk/frameIindexImain.htm
WHO World Health Report 2000 http://www.who.int/
World Bank Data, http://www.worldbank.org/
D. Leadbeter
Statistical Systems: Labor Labor statistics represent a large body of information relevant to many of the social and economic issues of our time. They are used as indicators of how well the general economy is functioning and also as a basis for designing, implementing, and monitoring employment and social programs and policies. Labor statistics are generally part of a national statistical program and are interrelated with statistics in other fields, for example, with statistics on education, health, and production. International comparisons of labor statistics can help in assessing a country’s economic performance relative to other countries (for example, by comparing unemployment rates and employment growth) and in evaluating the competitive position of a country in international trade (for example, by comparing productivity and labor cost trends).
1. Topics and Classifications
Article 1 of Labor Statistics Convention Number 160 of the International Labor Office (ILO) established in 1985 a minimum set of topics to be covered in official national labor statistics programs (ILO 2000). The list is as follows.
(a) Economically active population, employment, where relevant unemployment, and, where possible, visible underemployment. (b) Structure and distribution of the economically active population, for detailed analysis and to serve as benchmark data. (c) Average earnings and hours of work (hours actually worked or hours paid for) and, where appropriate, time rates of wages and normal hours of work. (d) Wage structure and distribution. (e) Labor cost. (f) Consumer price indices. (g) Household expenditure or, where appropriate, family expenditure and, where possible, household income or, where appropriate, family income. (h) Occupational injuries and, as far as possible, occupational diseases. (i) Industrial disputes. ILO convention 160 recommends that each member state (almost all countries are members of the ILO) ‘collect, compile, and publish basic labor statistics, which shall be progressively expanded in accordance with its resources to cover [these] subjects.’ The USA and most other developed nations collect and publish data for all subjects listed in Article 1. On the other hand, developing nations often do not yet have the full array of labor statistics. The publication Key Indicators of the Labour Market (Johnson et al. 1999) provides information about the availability of many of the basic labor statistics in almost all countries of the world. Deficiencies in the availability and timeliness of labor statistics are particularly apparent in Africa and parts of Asia.
1.1 National Classifications
The presentation of labor statistics requires many distinctions or categories such as occupations and branch of economic activity. In order to facilitate data collection and analysis, these categories are organized into classifications. These are instruments that group together units of a similar kind, often in hierarchies, thus allowing for different degrees of detail to describe the characteristics in a systematic and simplified way. For example, in the USA, the Standard Occupational Classification and the Standard Industrial Classification have been the structures that define occupations and industries for many years. Such systems need periodic revision to keep up with the changing economy and to accommodate new industries and occupations. Revising the Standard Occupational Classification System (1999) provides an account of the 1998 revision to the US occupational classification as well as a brief history of the industrial classification system. The North American Industry Classification System (NAICS) represents a complete restructuring of the US industrial classification (1997). This system was prepared jointly because the North American Free Trade Agreement of 1994 increased the need for comparable statistics from the USA, Canada, and Mexico. Each country can individualize the new system to meet its own needs, as long as data can be aggregated to the standard NAICS categories.

1.2 Other International Classifications
International standard classifications serve to enhance cross-national comparisons. The International Standard Industrial Classification of all Economic Activities was established under the auspices of the United Nations, and the latest International Standard Classification of Occupations was established by the ILO’s 14th International Conference of Labor Statisticians in 1987. Summaries of these classifications appear in the appendices to the ILO’s Yearbook of Labour Statistics (1999). Countries often prepare concordances or ‘crosswalks’ that allow their national classifications to mesh with the international classifications.

1.3 International Concepts
The ILO’s periodic Conferences of Labor Statisticians have established guidelines for the measurement of many aspects of labor statistics (ILO 2000). These guidelines serve as a model for countries trying to develop national statistics on the topic concerned and they also provide, to some extent, a benchmark for international comparisons. For example, ILO standards for measuring labor force, employment, and unemployment are followed by many countries. Varying interpretations of these guidelines, however, particularly with regard to different aspects of unemployment, create differences across countries that require adjustments to enhance comparability of unemployment rates (Sorrentino 2000).
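The concordances or ‘crosswalks’ mentioned in Sect. 1.2 can be pictured as a lookup table from national codes to international categories. The following Python sketch is offered only as an illustration under stated assumptions: the codes `N-101`, `INT-A`, and so on are invented placeholders, not real NAICS or ISIC codes, and the function name `aggregate_to_international` is hypothetical.

```python
from collections import defaultdict

# Hypothetical national industry codes mapped to hypothetical
# international categories; these are NOT real NAICS or ISIC codes.
CROSSWALK = {
    "N-101": "INT-A",
    "N-102": "INT-A",
    "N-201": "INT-B",
}


def aggregate_to_international(national_counts, crosswalk):
    """Sum employment counts by international category.

    Codes missing from the crosswalk are returned separately rather than
    silently dropped, so that gaps in the concordance stay visible.
    """
    totals = defaultdict(int)
    unmapped = []
    for code, count in national_counts.items():
        if code in crosswalk:
            totals[crosswalk[code]] += count
        else:
            unmapped.append(code)
    return dict(totals), unmapped


# Invented employment counts by national code, for illustration only.
employment = {"N-101": 1200, "N-102": 800, "N-201": 500, "N-999": 40}
totals, unmapped = aggregate_to_international(employment, CROSSWALK)
print(totals)    # {'INT-A': 2000, 'INT-B': 500}
print(unmapped)  # ['N-999']
```

This sketch assumes a many-to-one mapping. Real concordances may also need to split a single national category across several international ones, which a plain dictionary lookup does not handle.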
2. Institutional Framework Labor statistics are primarily collected by governments rather than the private sector, although in some countries there has been a movement toward privatizing some data collections. In the USA, which has a decentralized statistical system, over 70 Federal agencies are engaged in the collection of statistics. The Bureau of Labor Statistics (BLS) in the US Department of Labor is the principal data-gathering agency of the US Federal Government in the broad field of labor economics. Most of the Bureau’s data come from voluntary responses to surveys of businesses or households conducted by BLS staff, by the Bureau of the Census (under contract to BLS), or in conjunction with cooperating State and Federal agencies.
Unlike the decentralized system of the USA, other countries tend to have central statistical offices, and the collection of labor statistics is accomplished within these national agencies. However, labor ministries often also play a role in the collection of labor statistics abroad. Several international organizations work in the realm of labor statistics, obtaining and publishing labor information from countries, providing technical assistance to member countries, and establishing conceptual guidelines. National agencies cooperate with the international organizations by transmitting the data requested on a regular basis and also for ad hoc studies. Major international organizations involved in labor statistics are discussed in Sect. 5.
3. Sources of Labor Statistics
Labor statistics are generally collected in censuses or sample surveys of households or business establishments, or processed on the basis of data resulting from administrative procedures of public agencies. Many of the major programs of the US BLS are based on sample surveys. For instance, the Current Population Survey (CPS) is a monthly survey based on a sample of about 50,000 households; data are collected by computer-assisted personal and telephone interviews. The survey is conducted by the Bureau of the Census for BLS and is the source of national statistics on the labor force, employment, and unemployment. About 70 countries conduct regular labor force surveys, on an annual, quarterly, or monthly basis. Another major survey of BLS is the monthly Current Employment Statistics Survey. This survey is a sample of almost 400,000 establishments conducted by State employment security agencies in cooperation with BLS, and it provides employment, hours, and earnings data collected from payroll records of business establishments. Similar surveys are conducted by many other countries. Labor statistics may also be based on administrative records. Although the primary purpose of administrative records is not the production of statistics, these systems provide a rich information source that has been exploited to varying degrees in different countries. Among the uses in the USA, administrative data from the Unemployment Insurance system are used in combination with CPS data to produce unemployment data for States, labor market areas, counties, and cities. In several European countries, employment office registrations are the primary monthly indicator of the labor market. This is likely to be the case in countries that do not have a monthly household survey as a source for unemployment statistics. Some labor statistics are constructed by combining data from different sources. For example, industry labor productivity data (output per employee or per hour) require data on both labor input and business output. The labor input measures are generally constructed from a combination of hours data from labor force and establishment surveys, while the output components typically are based on national income and product accounts which are derived from business establishment surveys.
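The productivity measure described above, output per hour, is a simple ratio of an output series to a labor input series. The short Python sketch below illustrates the arithmetic only: all figures are invented, and the function names are hypothetical rather than taken from any statistical agency’s methodology.

```python
def output_per_hour(output, hours):
    """Labor productivity as real output divided by hours worked."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return output / hours


def percent_change(earlier, later):
    """Percentage change from an earlier value to a later one."""
    return (later - earlier) / earlier * 100.0


# Invented figures: real output (e.g., from national accounts) and
# total hours worked (e.g., from labor force or establishment surveys).
productivity_year1 = output_per_hour(output=50_000.0, hours=1_000.0)
productivity_year2 = output_per_hour(output=56_100.0, hours=1_020.0)

print(productivity_year1)  # 50.0
print(productivity_year2)  # 55.0
print(percent_change(productivity_year1, productivity_year2))  # 10.0
```

In this invented example output rises by 12.2 percent while hours rise by 2 percent, so output per hour rises by 10 percent. Because the two inputs usually come from different surveys, differences in their coverage affect the ratio directly.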
4. Dissemination of Labor Statistics
Disseminating statistics means presenting and distributing the data to users in various forms and preparing analyses of the data. The BLS makes available the information it produces through a broad publication program that includes news releases, periodicals, reports, and bulletins. Articles in the Monthly Labor Review present analyses of labor data. Some BLS material is available on magnetic tape, computer diskette, CD-ROM, and microfiche. Current BLS data, publications, and articles are also made available through the Internet. Other countries use similar methods of data dissemination. The publications Major Programs of the Bureau of Labor Statistics (1998) and the Handbook of Methods (1997), as well as the BLS web site noted in the bibliography, provide further information on the Bureau’s programs and how to obtain BLS data. Links to the statistical offices of other countries as well as to international organizations are included on the BLS web site. In 1996, the International Monetary Fund established a Dissemination Standards Bulletin Board on a web site (IMF DSBB) to assist countries in following the best practices of dissemination of economic and financial data to the public. One of the aspects of this, the General Data Dissemination System (GDDS), guides countries in the provision of comprehensive, timely, accessible, and reliable data, and the site provides information on data produced and disseminated by participating member countries. The GDDS is built around four dimensions: data characteristics, quality, access, and integrity. Information on the formats and titles of national statistical publications is included. The GDDS is a source of metadata (that is, descriptions of current practices on data concepts, sources, and methods) on labor statistics as well as on other types of economic statistics.
The detailed metadata provide users with a tool to better assess the usefulness of the data for their own particular purposes.
5. International Comparisons
An international perspective can often enrich analysis of national labor markets, but such comparisons must be made with caution because statistical concepts and methods vary from country to country. Simply comparing national data across countries could lead to erroneous conclusions. The BLS and several international organizations derive meaningful comparisons by selecting a conceptual framework for comparative purposes; analyzing foreign statistical series and selecting those that most nearly match the desired concepts; and adjusting statistical series, where necessary and feasible, for greater inter-country comparability. The BLS comparisons generally are based on US concepts so that they will be on the conceptual basis most familiar to US data users. Comparisons made by international organizations frequently are based on the international standards recommended by the ILO, discussed in Sect. 1.3. Brief descriptions of the comparative programs of the BLS, three international organizations, and a unique international microdatabase follow. Further information is available from their respective web sites, cited in the bibliography.

5.1 Bureau of Labor Statistics
From its inception, BLS has published comparative information on labor conditions and developments abroad. Foreign labor research and statistical analyses have been undertaken because: (a) comparisons between US and foreign labor conditions shed light on US economic performance relative to other nations; (b) comparisons provide information on the competitive position of the USA in foreign trade, which has an important influence on the US economy and employment; (c) information on labor conditions published by a majority of foreign countries is not readily available to US labor representatives, employers, Government officials, and others, and frequently is not available in English; and (d) often, only an expert can judge the quality and comparability of foreign statistical data.
The BLS comparisons program currently covers three key areas: (a) labor force, employment, and unemployment; (b) manufacturing productivity and labor costs; and (c) manufacturing hourly compensation costs. Only major developed countries are covered in the labor force and productivity comparisons, whereas the hourly compensation comparisons cover 28 countries at various stages of development. Outside the USA, national statistical agencies generally do not produce their own international comparisons. They depend on the comparative work of the BLS as well as that of international bodies such as those described below.

5.2 International Labor Office
The ILO in Geneva plays a major role in the arena of international labor statistics. This agency’s role has already been mentioned in terms of promoting the production of labor statistics and setting international standards for statistics. The worldwide scope of the ILO’s membership ensures that this organization provides the most comprehensive geographic coverage among the international organizations involved in labor statistics. The ILO has several publications or programs that provide varying degrees of comparable labor statistics. The annual Yearbook of Labour Statistics (ILO 1999) is the most comprehensive source for worldwide labor statistics. The Yearbook compiles national labor data according to international classifications of industry and occupation, with annotations concerning deviations from these classifications, coverage, and sources. However, no adjustments are made for conceptual differences in labor force measures. All data are supplied by the national statistical authorities. The Yearbook contains data on labor force, employment, unemployment, hours of work, wages, labor cost, consumer prices, occupational injuries, and strikes and lockouts. Some statistics are presented as time series, others are provided for only the latest year. The ILO’s program of ILO-Comparable labor force statistics takes the process a step further by making adjustments of the labor force and unemployment data to ILO concepts for over 30 countries. Data from the ILO-Comparable program are available online (Lawrence 1999) and are also used in the publication Key Indicators of the Labour Market (Johnson et al. 1999). The latter publication is well annotated concerning comparability issues for all of the 18 indicators provided.
5.3 Organization for Economic Cooperation and Development (OECD)
The Paris-based OECD comprises 29 member countries, most of which are the advanced countries in North America and Western Europe, as well as Australia, New Zealand, and Japan. In addition, some of the transition countries of Eastern and Central Europe, Mexico, and Korea became members of the OECD in the 1990s. Most OECD labor statistics cover the member countries only, but some series are being expanded to cover non-member Asian and Latin American countries. The main labor-related publications of the OECD are the annual and quarterly editions of Labour Force Statistics (1999). The annual publication compiles two decades of statistics, while the quarterly editions show selected data for the past five years. These publications provide statistics on key elements of the labor force, rather than on the broader range of labor statistics contained in the ILO Yearbook. Through the OECD’s provision of extended time series, it is possible to identify structural changes that have taken place in the labor force of the countries with respect to gender, age,
and economic activities. Data on part-time employment and duration of unemployment are included. The detailed statistics on employment are generally in conformity with the International Standard Industrial Classification. Another program of the OECD, the Standardized Unemployment Rates, provides unemployment rates adjusted to ILO concepts for 24 member countries. The Standardized Rates are published in an annex to the aforementioned quarterly editions and they also are available from the OECD web site. Additional labor statistics are published in the annual OECD Employment Outlook, expanding the range considerably to include special international analyses of training statistics, indicators of the degree of employment protection and regulation, trends in self employment, income distribution, minimum wages and poverty, the transition from education to the labor market, trends in working hours, and workforce aging, to name some of the special studies. The OECD analyses pay careful attention to comparability and often contain annexes that explain the data comparison issues in detail.
5.4 Statistical Office of the European Communities (Eurostat)
Eurostat’s main role is to process and publish comparable statistical information for the 15 member countries of the European Union. Eurostat tries to arrive at a common statistical language that embraces concepts, methods, and statistical and technical standards. A significant portion of Eurostat’s statistics relates to labor. The Labor Force Survey (1998) is a joint effort by member states to coordinate their national surveys. The result is a detailed set of comparative data on employment, working time, and unemployment. Eurostat also publishes monthly harmonized unemployment rates for all member countries. Comparable data are critically important for the European Union because the Community’s social funds are distributed according to a formula that includes unemployment as a significant component. Eurostat’s Harmonized Indices of Consumer Prices (HICPs) are designed for international comparisons of consumer price inflation. The HICPs are the result of the collaboration between Eurostat and national statistical offices of member states on harmonizing the different methods used to compile price indices. The harmonization process has resulted in an improved quality of the indices. HICPs are not intended to replace national consumer price indices (CPIs). Besides the labor force survey and CPI statistics, Eurostat also publishes results from its harmonized labor cost surveys and conducts a European household panel that allows for analysis of income, employment, social exclusion, health, education, and many other topics.
5.5 Luxembourg Income Study (LIS)
The LIS, begun in 1983, is a unique cooperative international research project with a membership that includes 25 countries on four continents. The project is funded mainly by the national science and social science research foundations of its member countries. LIS comprises a collection of household income survey sets of microdata (files of data on individual households) for cross-country research. A companion database established in 1993 is the Luxembourg Employment Study, or LES. The data sets of LIS and LES have been harmonized in order to facilitate comparative studies. LIS and LES have become a rich source for comparative research in the areas of poverty and income distribution as well as labor market behavior. Moreover, these databases foster interdisciplinary research in the fields of economics, sociology, and political science. Researchers in member countries may use the LIS and LES databases at no charge, after signing a confidentiality agreement. Copies of the studies based on these databases are available from the LIS web site.
Disclaimer
All views expressed are those of the author and do not necessarily reflect the views of the Bureau of Labor Statistics.

See also: Employment and Labor, Regulation of; International Research: Programs and Databases; International Trade: Geographic Aspects; Labor Organizations: International; Labor Supply; Statistical Systems: Censuses of Population; Statistical Systems: Economic; Statistical Systems: Education; Statistical Systems: Health

Bibliography
Eurostat (Statistical Office of the European Communities) web site: http://europa.eu.int/comm/eurostat/
1998 Labour Force Survey. Eurostat, Luxembourg
International Labor Office web site: http://www.ilo.org
Johnson L, Sorrentino C et al. (eds.) 1999 Key Indicators of the Labour Market. ILO, Geneva, Switzerland. Web site: http://www.ilo.org/public/english/employment/strat/KILM/index.htm
ILO 1999 Yearbook of Labour Statistics. ILO, Geneva, Switzerland. The statistics of the Yearbook are available online at http://laborsta.ilo.org
ILO 2000 Current International Recommendations on Labour Statistics. ILO, Geneva, Switzerland. The ILO’s conventions and recommendations are also available at the following web site: http://www.ilo.org/public/english/bureau/stat/standards/index.htm
International Monetary Fund Dissemination Standards Bulletin Board web site: http://dsbb.imf.org/
Lawrence S 1999 ILO-comparable annual employment and unemployment estimates. ILO Bulletin of Labour Statistics 3: XI–XLIII. The ILO-comparable data are available at the following web site: http://laborsta.ilo.org
Luxembourg Income Study (and Employment Study) web site: http://www.lis.ceps.lu/index.htm
Office of Management and Budget 1997 North American Industry Classification System. Bernan Press, Lanham, MD
Organization for Economic Cooperation and Development web site: http://www.oecd.org. For the Standardized Unemployment Rates: http://www.oecd.org/media/new-numbers
2000 Employment Outlook. OECD, Paris, France
1999 Labour Force Statistics 1978–1998. Also Quarterly Labour Force Statistics. OECD, Paris, France
US Department of Labor Bureau of Labor Statistics (BLS) web site: http://stats.bls.gov. For the BLS international comparisons series: http://stats.bls.gov/flshome.htm
1997 BLS Handbook of Methods. BLS Bulletin 2490, BLS, Washington, DC
1998 Major Programs of the Bureau of Labor Statistics. BLS Report 919. BLS, Washington, DC
1999 Revising the Standard Occupational Classification System. BLS Report 929, BLS, Washington, DC. This publication is available at the following web site: http://stats.bls.gov/pdf/socrpt929.pdf
Sorrentino C 2000 International unemployment rates: How comparable are they? Monthly Labor Review 6: 3–20. Web site: http://www.stats.bls.gov/opub/MLR/2000/06/contents.htm

C. Sorrentino
Statistics as Legal Evidence

1. Introduction

The subject of statistics has sometimes been defined as decision-making in the face of uncertainty; from this perspective, the legal use of statistics is a natural rather than a surprising development. Indeed, probabilistic and statistical modes of reasoning and inference can be found in documents as old as the Talmud, and the use of pseudo-quantitative degrees of proof is one element of medieval canon law. But all such attempts to introduce quantitative elements into the law remained necessarily informal until the rise of probability and statistics as formal mathematical disciplines. Shortly after the birth of mathematical probability in the seventeenth century, mathematicians such as Leibniz and James Bernoulli examined the legal implications of the subject (the latter in his masterpiece, the Ars conjectandi) and, in the eighteenth and nineteenth centuries, analyses of the reliability of testimonial evidence became commonplace in the treatises of the day on mathematical probability. But it was only in the hands of the French mathematicians Condorcet, Laplace, and Poisson that models of jury deliberation were first proposed, analyzed, and fitted to data. (The models of Condorcet and his successors became of renewed interest in the 1970s, when some United States jurisdictions considered permitting juries to have fewer than 12 members or to return non-unanimous verdicts.) Subsequently, critics such as Comte and Bertrand in France and Mill and Venn in England questioned whether the subjective, qualitative, and individual could be analyzed by methods that were supposed to be objective, quantitative, and based on the collective; and in the end such academic discussions had little practical legal impact. But in the twentieth century, and especially in the years following World War II, when statistics found ubiquitous applications throughout science, commerce, government, and society, the law began to turn increasingly to the methods of statistical inference for assistance in a wide variety of fields. Such applications now range from sampling, proof of discrimination, and the determination of damages, to the establishment of paternity, human identification, and the prediction of future violent behavior. The applications that occur fall into two natural classes: in some cases, statistical methods are used to determine specific facts relevant to a particular case being decided (did this employer discriminate against that employee? did the defendant commit the offense charged?); in other cases, statistical methods are used to answer more general questions thought to be relevant to underlying law (does capital punishment deter homicides?).
2. Sampling

The use of sampling in civil and administrative cases represents a natural and early application of statistical methods. A company seeking a refund for overpaid taxes but wishing to avoid the time and expense of examining each and every sales receipt at issue might instead draw a sample from the population of all receipts; a government regulatory agency, suspecting inappropriate behavior on the part of a company trading commodity futures, might conduct an audit of a random selection of accounts, rather than examine and analyze potentially tens of thousands of such accounts; an insurance company engaged in an arbitrated dispute might agree to accept the results of a sample of policies drawn by a neutral third party. Such applications are natural and represent some of the earliest serious uses of statistics in the law precisely because the benefits were obvious and the methodology (if properly conducted) difficult to challenge.
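The receipt example can be illustrated with a minimal sketch of how a total is estimated from a simple random sample. Everything specific here is an assumption made for the illustration: the population of 20,000 receipts is synthetic, the sample size of 400 is arbitrary, and the function name `estimate_total` is hypothetical. The interval uses the usual normal approximation with a finite-population correction; it is not a description of any procedure a court or agency actually prescribes.

```python
import math
import random
import statistics

def estimate_total(sample, population_size, z=1.96):
    """Estimate a population total from a simple random sample.

    Returns (estimate, lower, upper); the bounds form an approximate 95%
    confidence interval (normal approximation, finite-population correction).
    """
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    se_total = population_size * sd / math.sqrt(n) * fpc
    estimate = population_size * mean
    return estimate, estimate - z * se_total, estimate + z * se_total

# Synthetic population: hypothetical overpayments (in dollars) on 20,000 receipts.
rng = random.Random(0)
population = [round(rng.uniform(0.0, 5.0), 2) for _ in range(20_000)]

# Simple random sample of 400 receipts, drawn without replacement.
sample = rng.sample(population, 400)
estimate, lower, upper = estimate_total(sample, len(population))
print(f"estimated total {estimate:,.0f}, interval ({lower:,.0f}, {upper:,.0f})")
print(f"true total      {sum(population):,.0f}")
```

In this sketch, only 400 of the 20,000 receipts are examined, and the width of the interval indicates how much precision was given up in exchange.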
3. Random Selection

A properly drawn scientific sample is accepted as representative because the randomness in question has been imposed by the statistician (using a computer, random number table, pseudo-random number generator, or physical randomization). Sampling practitioners know, however, that true randomness is often difficult to achieve and that samples not drawn in accord with appropriate protocols may turn out to exhibit serious biases. Thus in some legal applications of statistics, whether or not a sample has been randomly drawn is itself the question of interest. Casinos reporting unusually low cash receipts during a year are sometimes challenged on the basis of a statistical analysis of possible variation in yearly revenue. Governments also resort to the use of random selection or lotteries, for example, in jury selection, the military draft, random audits of tax returns, and spot checks of baggage or individuals at airports, in ways that may be subjected to statistical scrutiny and legal challenge. One famous example of this is the 1970 United States draft lottery (which determined the order in which individuals were drafted into the army): individuals born later in the year were more likely to be drafted before individuals born earlier in the year.

Sometimes the purpose of a government lottery is not just to achieve randomization, but also to ensure a popular perception of fairness. Aristotle, in his Athenian Constitution, describes a complex multistage procedure for juror selection that presumably satisfied both of these functions. Even if the mechanism for selection of a jury or grand jury does not employ randomization, there may be a legal requirement that the results of whatever method of selection has been used should at least appear to be random.

In any discussion of random selection, it is important to distinguish between the process used to generate an outcome and the absence of pattern in the outcome. The process may employ a true element of physical randomness, yet the outcome might by chance display some form of pattern; while the output of some deterministic processes, such as a pseudo-random number generator, may appear to be pattern free.
Thus in the United States, for example, juror pools are often based on samples of administrative convenience but may, as a matter of law, be required, however selected, to be a representative cross-section of the population. Jury verdicts are therefore sometimes challenged on the ground of possible discrimination in the selection of the jury, it being alleged that either minorities or other groups had been excluded or underrepresented. Thus, for example, in the trial of Dr. Benjamin Spock for advocating resistance to the draft, his attorneys (believing that women would be more favorably disposed towards their client) appealed his conviction in part on the ground that women had been disproportionately excluded during one stage in the jury selection process and based this contention on a statistical analysis.
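A claim of underrepresentation of this kind is often assessed by asking how likely so few members of a group would be if selection had in fact been random. The following sketch computes an exact binomial tail probability. The figures (women making up half of the eligible population, 30 women on a list of 100) are hypothetical and are not taken from the Spock case; the function name is likewise an assumption for the example.

```python
from math import comb

def binomial_lower_tail(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Hypothetical figures: women are half of the eligible population,
# yet only 30 of the 100 people on a jury list are women.
p_value = binomial_lower_tail(k=30, n=100, p=0.5)
print(f"P(30 or fewer women out of 100 under random selection) = {p_value:.2g}")
```

A small tail probability indicates only that the outcome would be unusual under the random selection model; it does not by itself identify the reason for the departure.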
4. Employment Discrimination

In scientific sampling, the model of random selection arises from the actions of the statistician and, in jury selection, the randomness of the outcome may be a legal requirement. But in some instances the model of random selection is a benchmark imposed by the analyst. In employment discrimination cases, for example, it may be claimed that racial, ethnic, sexual, or age discrimination took place in the hiring, promotion, or termination of individuals. In the workplace, such discrimination, if it does take place, is seldom explicit today and resort is often made to statistical methods of proof to document the underrepresentation of a protected class. In the United States, the law distinguishes two basic classes of employment discrimination: disparate treatment and disparate impact. In disparate treatment, intentional discrimination on the part of the employer is charged and statistical proof of underrepresentation may be introduced to support such a claim. In disparate impact cases, in contrast, the intent of the employer is not at issue: it is charged that, for whatever reason, the hiring, promotion, or termination practices of the employer impact to a greater extent on one or more protected classes, for example, Hispanics and African-Americans, than on others. Rulings of the US Supreme Court in the 1970s found that, in certain circumstances, disparate impact was enough to sustain a finding of discrimination, unless the employer could defend the practice at issue on the grounds of ‘business necessity’; for example, a promotional examination may have a disparate impact but test for a necessary job skill. Such rulings were of great importance: by creating a purely statistical category of employment discrimination, they resulted in a rapid assimilation of statistical methods into the law, including many areas other than employment discrimination, due in large measure to the imprimatur given them by the Supreme Court. After statistical methods of proof of employment discrimination became routinely accepted in US courts, increasingly sophisticated methods began to be introduced.
In cases of salary discrimination, for example, complex regression models are sometimes used in an attempt to argue that a minority group or other protected class is being underpaid even after the salary of the individual is adjusted for all relevant variables (such as experience, skill, and credentials). Because the hiring and promotion of individuals and the salaries they are paid are not the outcome of a lottery but the result of human choice and agency, the appropriateness of using as a yardstick the random selection model underlying all such statistical models has been questioned. One serious concern is that as the methods used become more complex and in consequence more removed from the intuition of the lay trier of fact, the potential for misunderstanding and misuse increases. One illustration of this is the not uncommon interpretation of a mere finding of statistical significance as being a direct proxy for one of substantive significance.

It is not always obvious at what level an analysis should take place in a discrimination case. Simpson’s paradox refers to situations where a disparity between two groups is found at one level of analysis, but disappears at another. For example, in the early 1970s, the Graduate School of the University of California at Berkeley discovered that a substantially higher percentage of men than women were being admitted for graduate studies. In an attempt to identify those departments responsible for the disparity, the Graduate School examined the records of individual departments and found that in each department women were doing at least as well as, and sometimes better than, their male counterparts. There is a simple explanation for the phenomenon in this case: women were applying preferentially to departments that had a lower admission rate, while men were applying preferentially to departments that had a higher admission rate.
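The arithmetic behind this explanation can be seen in a small made-up example. The counts below are hypothetical and are not the Berkeley figures; they are chosen only so that women have the higher admission rate in each of two departments while men have the higher rate overall.

```python
# Hypothetical (applicants, admitted) counts; not the actual Berkeley data.
applications = {
    "Department A": {"men": (800, 480), "women": (100, 65)},
    "Department B": {"men": (200, 40), "women": (900, 225)},
}

def rate(applicants, admitted):
    """Admission rate as a fraction."""
    return admitted / applicants

def overall_rate(group):
    """Admission rate for a group after pooling all departments."""
    applicants = sum(dept[group][0] for dept in applications.values())
    admitted = sum(dept[group][1] for dept in applications.values())
    return rate(applicants, admitted)

for name, dept in applications.items():
    print(name, {g: f"{rate(*dept[g]):.0%}" for g in ("men", "women")})
print("Overall", {g: f"{overall_rate(g):.0%}" for g in ("men", "women")})
```

In this illustration most women apply to Department B, where admission is difficult for everyone, so the pooled rate for women is pulled down even though women are admitted at the higher rate within each department.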
5. Forensic Applications

Much of forensic science is devoted to the identification of substances or individuals, and in both cases important statistical issues arise. Classical gunpowder residue, blood alcohol level, and drug tests, three examples of tests for the suspected presence of substances, must all take into account the statistical variability inherent in the testing process. In the case of human identification, additional issues arise. The determination of human identity has classically exhibited two extremes. In fingerprint analysis, it is usually accepted that an adequate fingerprint exhibiting enough characteristics suffices to uniquely identify its source. In classical serological testing, in contrast, determining the ABO type of a biological specimen (usually blood, saliva, or semen) merely suffices to narrow down the class of possible donors to perhaps 25 percent to 1 percent of the population. Such evidence, although probative, obviously cannot by itself establish identity. It is only recently, with the advent of sophisticated methods of DNA testing, that the potential for human identification has dramatically increased.

In a DNA profile, one or more locations (loci) on the human chromosome are examined and typed. Because human chromosomes come in pairs, two alleles are present for each locus and the list of allele pairs for each locus constitutes the DNA profile for the individual. Because the loci used in forensic applications are typically hypervariable (many possible alleles can occur at each locus), the resulting DNA profiles can be highly discriminating (few individuals in the population will have the same profile) and the typical profile may have a frequency in the population ranging from 1 in several thousand, to 1 in several million, billion, or more (depending on the system used). Such impressive frequency estimates necessarily make certain statistical or population genetic assumptions (independence of alleles within and between loci, sometimes termed either ‘gametic’ and ‘zygotic,’ or ‘Hardy-Weinberg’ and ‘linkage’ equilibrium), assumptions whose validity must be tested and may vary substantially from one population to another. More sophisticated approaches estimate parameters that attempt to correct for the presence of possible subpopulations. Initially the introduction of such frequency estimates was attended by considerable controversy, but the introduction of polymerase chain reaction (PCR) methods in recent years has resulted in the ability to type increasing numbers of loci. At first, the FBI laboratory only typed four restriction fragment length polymorphism (RFLP) loci, but in the not too distant future the resulting DNA profiles may well come to be viewed as essentially establishing identity.

Nevertheless, even in situations where the computation of a profile frequency is not regarded as being by itself controversial, a number of statistical complexities still arise:
(a) the calculation of profile frequencies requires knowledge of the underlying allele frequencies, and knowledge of these requires either a true random sample, or at least, if a sample of convenience is used, one thought to be representative of the population;
(b) profile frequencies can vary considerably from one racial or ethnic group to another and information for some population subgroups may not be readily available (for example, some American Indian populations);
(c) in some cases, especially sexual assaults, a mixture of two or more sources of DNA may be present and such complex mixture cases can present formidable difficulties in both interpretation and appropriate statistical analysis;
(d) if a DNA profile is identified by searching through a large databank of such profiles (for example, profiles of convicted felons), the statistical weight afforded to such a match needs to take this into account. (From a frequentist perspective, it is sometimes argued that this should be done by adjusting for the multiple comparisons being performed; from a Bayesian perspective, the adjustment would reflect the differing priors in a situation where a suspect is in hand and one where a suspectless crime results in a databank search.);
(e) matching DNA profiles are much more common among relatives than among unrelated individuals in a population; the computation of profile frequencies for relatives leads naturally to another important application of DNA testing.
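The product-rule computation behind such frequency estimates, and the simplest frequentist adjustment for a databank search mentioned in (d), can be sketched as follows. This is an illustration only: the allele frequencies, the four-locus profile, and the databank size are hypothetical, and the function names are invented for the example. It assumes Hardy-Weinberg and linkage equilibrium and makes no correction for subpopulations or relatives.

```python
from math import prod

def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency: p*p if homozygous, 2*p*q if heterozygous."""
    return p * p if q is None else 2 * p * q

def profile_frequency(loci):
    """Multiply genotype frequencies across loci (assumes linkage equilibrium)."""
    return prod(genotype_frequency(*alleles) for alleles in loci)

# Hypothetical allele frequencies for a four-locus profile.
# A 1-tuple marks a homozygous locus; a 2-tuple marks a heterozygous locus.
profile = [(0.10, 0.05), (0.08,), (0.12, 0.07), (0.05, 0.03)]
freq = profile_frequency(profile)
print(f"profile frequency: about 1 in {1 / freq:,.0f}")

# Multiple-comparisons style adjustment for a databank search: when N unrelated
# profiles are searched, the chance of at least one coincidental match is
# at most N * freq (a union bound).
databank_size = 100_000
bound = min(1.0, databank_size * freq)
print(f"upper bound on the chance of a coincidental match: {bound:.3g}")
```

With these made-up numbers a profile whose frequency is roughly 1 in 300 million still has a coincidental-match bound of about 3 in 10,000 once a databank of 100,000 profiles has been searched, which is why the weight given to a match depends on how the suspect was found.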
6. Paternity Testing

In an application closely related to their use in criminal forensics, DNA profiles are also used in establishing paternity. Suppose the profiles of a child, mother, and putative father have been determined. If the putative father is not excluded as a possible father, then two statistics are commonly encountered to measure how unusual such a failure to exclude is. One statistic, termed the ‘random man not excluded’ (RMNE), computes what fraction of the male population would similarly not be excluded if their DNA profiles were also tested. The second statistic, termed the ‘paternity index,’ computes the likelihood ratio for two hypotheses conditional on the observed evidence (the three profiles): the first hypothesis, that the putative father is indeed the father; the second, that the true father is a random member of the population. Such calculations were initially performed using knowledge of the ABO and other blood groups, but in the current era of DNA testing they often suffice to establish paternity effectively. There are a number of variant calculations that can also be performed. One can compute paternity indices if the profile of the mother is unknown or in more complex situations where the profiles of relatives other than the mother are known. Or one can attempt to identify the source of biological material when the profile of the individual is unavailable but the profiles of relatives are available. This is often the case in body identification, in particular when mass disasters are involved.
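For a single locus, both statistics reduce to short calculations in the simplest case, where the mother's profile makes it clear which allele the child must have received from the father. The sketch below covers only that case; the allele labels, the frequency of 0.05, and the function names are hypothetical, and Hardy-Weinberg equilibrium is assumed. In practice, values from several independent loci would be multiplied together.

```python
def paternity_index(father_genotype, paternal_allele, allele_freq):
    """Single-locus likelihood ratio X / Y.

    X: chance that the putative father transmits the required paternal allele
       (1 if he carries two copies, 0.5 if one, 0 if none).
    Y: chance that a random man transmits it, i.e. its population frequency.
    """
    x = father_genotype.count(paternal_allele) / 2
    return x / allele_freq

def random_man_not_excluded(allele_freq):
    """Fraction of men carrying the allele at least once (Hardy-Weinberg assumed)."""
    return 1 - (1 - allele_freq) ** 2

# Hypothetical locus: the child's paternal allele "A" has population frequency
# 0.05, and the putative father's genotype is ("A", "B").
pi = paternity_index(("A", "B"), "A", 0.05)
rmne = random_man_not_excluded(0.05)
print(f"paternity index = {pi:.1f}, RMNE = {rmne:.4f}")
```

The two statistics answer different questions: the RMNE depends only on the child's paternal allele, whereas the paternity index also takes account of the putative father's own genotype.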
7. Violence Prediction

Society often has a stake in attempting to predict future violent behavior: are there adequate grounds for confining a mentally ill person on the grounds that they pose a significant risk if left at liberty; if a prison inmate is paroled, is it likely that they will commit another crime after release; if a sexual predator is released after their prison sentence is completed, how likely is it that they will commit another sexually violent offense after their release? Such predictions, especially in the last category, have traditionally suffered from the same difficulties as testing for a rare disease: even if a test for a disease has both high sensitivity and specificity, when the disease is rare enough, most of the positives are false positives. Despite such difficulties, in recent years some psychologists have attempted to construct predictive ‘instruments’ (tests) using complex logistic regression models, and have analyzed their performance using survival analysis techniques.
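The rare-disease difficulty follows directly from Bayes' theorem and can be shown with a few lines of arithmetic. The sensitivity, specificity, and base rates below are hypothetical values chosen for illustration, not properties of any actual instrument.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Fraction of positive predictions that are true positives (Bayes' theorem)."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# Hypothetical instrument with 90% sensitivity and 90% specificity.
for base_rate in (0.20, 0.05, 0.01):
    ppv = positive_predictive_value(0.90, 0.90, base_rate)
    print(f"base rate {base_rate:.0%}: {ppv:.0%} of positives are true positives")
```

With a base rate of 1 percent, only about one positive prediction in twelve is correct in this example, even though the hypothetical instrument is right 90 percent of the time for both groups.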
8. Interpretation

Statistical analyses may be performed by highly competent individuals but their results must be interpreted and used by persons lacking formal training or quantitative understanding. This presents challenges both in successful presentation and in meeting the important responsibilities placed on the statistician to ensure that the results are not overvalued. Terms such as ‘statistical significance’ are easily and frequently misunderstood to imply a finding of practical significance; levels of significance are all too often interpreted as posterior probabilities, for example, in the guise that, if a DNA profile occurs only 1 in 10,000 times, then the chances are 9,999 to 1 that an individual having a matching profile is the source of the profile. As a result, individuals may tend to focus on the quantitative elements of a case, thereby overlooking qualitative elements that may in fact be more germane or relevant.
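Why a profile frequency of 1 in 10,000 does not translate into odds of 9,999 to 1 can be seen from a simple Bayesian calculation. The sketch below rests on strong simplifying assumptions that are not part of the discussion above: a hypothetical population of one million possible sources, each equally likely a priori to be the source, no other evidence, and a true source who always matches.

```python
def posterior_source_probability(profile_freq, population_size):
    """Posterior probability that a matching individual is the source.

    Assumes a uniform prior over the population and that the true source
    always matches; every other person matches with probability profile_freq.
    """
    expected_other_matches = (population_size - 1) * profile_freq
    return 1 / (1 + expected_other_matches)

# Profile frequency of 1 in 10,000 in a hypothetical population of 1,000,000.
posterior = posterior_source_probability(1 / 10_000, 1_000_000)
print(f"posterior probability of being the source: {posterior:.1%}")
```

Under these assumptions about 100 other people would be expected to match, so the matching individual is the source with a probability of roughly 1 percent; the answer changes as other evidence narrows the set of plausible sources.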
9. Other Applications

The last decades of the twentieth century saw a remarkable variety of legal applications of statistics: attempting to determine the possible deterrent effects of capital punishment, estimating damages in antitrust litigation, and addressing epidemiological questions of the type raised in the book and movie A Civil Action. Some of the books listed in the bibliography discuss a number of these.

See also: Juries; Legal Reasoning and Argumentation; Legal Reasoning Models; Linear Hypothesis: Fallacies and Interpretive Problems (Simpson’s Paradox); Sample Surveys: The Field

Bibliography

Bickel P, Hammel E, O’Connell J W 1975 Is there a sex bias in graduate admissions? Data from Berkeley. Science 187: 398–404. Reprinted In: Fairley W B, Mosteller F (eds.) Statistics and Public Policy. Addison-Wesley, New York, pp. 113–30
Chaikan J, Chaikan M, Rhodes W 1994 Predicting Violent Behavior and Classifying Violent Offenders. In: Understanding and Preventing Violence, Volume 4: Consequences and Control. National Academy Press, Washington, DC, pp. 217–95
DeGroot M H, Fienberg S E, Kadane J B 1987 Statistics and the Law. Wiley, New York
Evett I W, Weir B S 1998 Interpreting DNA Evidence: Statistical Genetics for Forensic Scientists
Federal Judicial Center 1994 Reference Manual on Scientific Evidence. McGraw Hill
Fienberg S E 1971 Randomization and social affairs: The 1970 draft lottery. Science 171: 255–61
Fienberg S E (ed.) 1989 The Evolving Role of Statistical Assessments as Evidence in the Courts. Springer-Verlag, New York
Finkelstein M O, Levin B 1990 Statistics for Lawyers. Springer-Verlag, New York
Gastwirth J L 1988 Statistical Reasoning in Law and Public Policy, Vol. 1: Statistical Modeling and Decision Science; Vol. 2: Tort Law, Evidence and Health. Academic Press, Boston
Gastwirth J L 2000 Statistical Science in the Courtroom. Springer-Verlag, New York
Harr J 1996 A Civil Action. Vintage Books, New York
Lagakos S W, Wessen B J, Zelen M 1986 An analysis of contaminated well water and health effects in Woburn, Massachusetts. Journal of the American Statistical Association 81: 583–96
Meier P, Zabell S 1980 Benjamin Peirce and the Howland Will. Journal of the American Statistical Association 75: 497–506
Tribe L H 1971 Trial by mathematics. Harvard Law Review 84: 1329–48
Zeisel H, Kaye D H 1997 Prove It With Figures: Empirical Methods in Law and Litigation. Springer-Verlag, New York

S. L. Zabell
Statistics, History of

The word ‘statistics’ today has several different meanings. For the public, and even for many people specializing in social studies, it designates numbers and measurements relating to the social world: population, gross national product, and unemployment, for instance. For academics in ‘Statistics Departments,’ however, it designates a branch of applied mathematics making it possible to build models in any area featuring large numbers, not necessarily dealing with society. History alone can explain this dual meaning. Statistics appeared at the beginning of the nineteenth century as meaning a ‘quantified description of human–community characteristics.’ It brought together two previous traditions: that of German ‘Statistik’ and that of English political arithmetic (Lazarsfeld 1977). When it originated in Germany in the seventeenth and eighteenth centuries, the Statistik of Hermann Conring (1606–81) and Gottfried Achenwall (1719–79) was a means for classifying the knowledge needed by kings. This ‘science of the state’ included history, law, political science, economics, and geography, that is, a major part of what later became the subjects of ‘social studies,’ but presented from the point of view of their utility for the state. These various forms of knowledge did not necessarily entail quantitative measurements: they were basically associated with the description of specific territories in all their aspects. This territorial connotation of the word ‘statistics’ would subsist for a long time in the nineteenth century. Independently of the German Statistik, the English tradition of political arithmetic had developed methods for analyzing numbers and calculations, on the basis of parochial records of baptisms, marriages, and deaths. Such methods, originally developed by John Graunt (1620–74) in his work on ‘bills of mortality,’ were then systematized by William Petty (1627–87). They were used, among others, to assess the population of a kingdom and to draw up the first forms of life insurance. They constitute the origin of modern demography.

Nineteenth-century ‘statistics’ were therefore a fairly informal combination of these two traditions: taxonomy and numbering. At the beginning of the twentieth century, it further became a mathematical method for the analysis of facts (social or not) involving large numbers and for inference on the basis of such collections of facts. This branch of mathematics is generally associated with probability theory, developed in the seventeenth and eighteenth centuries, which, in the nineteenth century, influenced many branches of both the social and natural sciences (Gigerenzer et al. 1989). The word ‘statistics’ is still used in both ways today and both of these uses are still related, of course, insofar as quantitative social studies use, in varied proportions, inference tools provided by mathematical statistics. The probability calculus, for its part, grounds the credibility of statistical measurements resulting from surveys, together with random sampling and confidence intervals.

The diversity of meanings of the word ‘statistics,’ maintained until today, has been heightened by the massive development, beginning in the 1830s, of bureaus of statistics, administrative institutions distinct from universities, in charge of collecting, processing, and transmitting quantified information on the population, the economy, employment, living conditions, etc. Starting in the 1940s, these bureaus of statistics became important ‘data’ suppliers for empirical social studies, then in full growth. Their history is therefore an integral part of these sciences, especially in the second half of the twentieth century, during which the mathematical methods developed by university statisticians were increasingly used by the so-called ‘official’ statisticians.
Consequently the ‘history of statistics’ is a crossroads history connecting very different fields, which are covered in other articles of this encyclopedia: general problems of the ‘quantification’ of social studies (Theodore Porter), mathematical ‘sampling’ methods (Stephen E. Fienberg and J. M. Tanur), ‘survey research’ (Martin Bulmer), ‘demography’ (Simon Szreter), ‘econometrics,’ etc. The history of these different fields was the object of much research in the 1980s and 1990s, some examples of which are indicated in the Bibliography. Its main interest is to underscore the increasingly close connections between the so-called ‘internal’ dimensions (history of the technical tools), the ‘external’ ones (history of the institutions), and those related to the construction of social studies ‘objects,’ born from the interaction between the three foci constituted by university research, administrations in charge of ‘social problems,’ and bureaus of statistics. This ‘coconstruction’ of objects makes it possible to join historiographies that not long ago were distinct. Three key moments of this history will be mentioned here: Adolphe Quetelet and the average man (1830s), Karl Pearson and correlation (1890s), and the establishment of large systems for the production and processing of statistics.
1. Quetelet, the Average Man, and Moral Statistics

The cognitive history of statistics can be presented as that of the tension and sliding between two foci: the measurement of uncertainty (Stigler 1986), resulting from the work of eighteenth-century astronomers and physicists, and the reduction of diversity, which will be taken up by social studies. Statistics is a way of taming chance (Hacking 1990) in two different ways: chance and uncertainty related to protocols of observation, chance and dispersion related to the diversity and the indetermination of the world itself. The Belgian astronomer and statistician Adolphe Quetelet (1796–1874) is the essential character in the transition between the world of ‘uncertain measurement’ of the probability proponents (Carl Friedrich Gauss, Pierre Simon de Laplace) and that of the ‘regularities’ resulting from diversity, thanks to his having transferred, around 1830, the concept of average from the natural sciences to the human sciences, through the construction of a new being, the average man. As early as the eighteenth century, specificities appeared from observations in large numbers: drawing balls out of urns, gambling, successive measurements of the position of a star, sex ratios (male and female births), or the mortality resulting from preventive smallpox inoculation, for instance. The radical innovation of this century was to connect these very different phenomena thanks to the common perspective provided by the ‘law of large numbers’ formulated by Jacques Bernoulli in 1713. If draws from a constant urn containing white and black balls are repeated a large number of times, the observed share of white balls ‘converges’ toward that actually contained by the urn.
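The urn experiment just described is easy to simulate. The sketch below is a modern illustration, not a reconstruction of Bernoulli's argument: the urn composition (30 percent white), the numbers of draws, and the seed are arbitrary choices, and a pseudo-random number generator stands in for physical drawing with replacement.

```python
import random

def share_of_white(n_draws, white_fraction, rng):
    """Draw with replacement from an urn; return the observed share of white balls."""
    whites = sum(rng.random() < white_fraction for _ in range(n_draws))
    return whites / n_draws

rng = random.Random(1713)  # fixed seed so that the run is reproducible
for n in (10, 100, 10_000, 1_000_000):
    print(f"{n:>9} draws: observed share of white = {share_of_white(n, 0.3, rng):.4f}")
```

As the number of draws grows, the observed share typically settles ever closer to the urn's true proportion.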
Considered by some as a mathematical theorem and by others as an experimental result, this ‘law’ was at the crossroads of the two currents in epistemological science: one ‘hypothetical deductive,’ the other ‘empirical inductive.’ Beginning in the 1830s, the statistical description of ‘observations in large numbers’ became a regular activity of the state. Previously reserved for princes, this information henceforth available to ‘enlightened men’ was related to the population, to births, marriages, and deaths, to suicides and crimes, to epidemics, to foreign trade, schools, jails, and hospitals. It was generally a by-product of administration activity, not the result of special surveys. Only the population census, the showcase product of nineteenth-century statistics, was the object of regular surveys. These statistics were published in volumes with heterogeneous contents, but their very existence suggests that the characteristics of society were henceforth a matter of scientific law and no longer of judicial law, that is, of observed regularity and not of the normative decisions of political power.

Quetelet was the man who orchestrated this new way of thinking about the social world. In the 1830s and 1840s, he set up administrative and social networks for the production of statistics and established, until the beginning of the twentieth century, how statistics were to be interpreted. This interpretation is the result of the combination of two ideas developed from the law of large numbers: the generality of ‘normal distribution’ (or, in Quetelet’s vocabulary, the ‘law of possibilities’) and the regularity of certain yearly statistics. As early as 1738, Abraham de Moivre, seeking to determine the convergence conditions for the law of large numbers, had formulated the mathematical expression of the future ‘Gaussian law’ as the limit of a binomial distribution. Then Laplace (1749–1827) had shown that this law constituted a good representation of the distribution of measurement errors in astronomy, hence the name that Quetelet and his contemporaries also used to designate it: the law of errors (the expression ‘normal law,’ under which it is known today, would not be introduced until the late nineteenth century by Karl Pearson). Quetelet’s daring intellectual masterstroke was to bring together two forms: on the one hand the law of errors in observation, and on the other, the law of distribution of certain body measurements of individuals in a population, such as the height of conscripts in a regiment. The likeness of the ‘Gaussian’ looks of these two forms of distribution justified the invention of a new being with a promise of notable posterity in social studies: the average man.
Thus, Quetelet restricted the calculation and the legitimate use of averages to cases where the distribution of the observations had a Gaussian shape, analogous to that of the distribution of the astronomical observations of a star. Reasoning on that basis, just as prior to this distribution there was a real star (the ‘cause’ of the Gaussian-shaped distribution), prior to the equally Gaussian distribution of the height of conscripts there was a being of a reality comparable to the existence of the star. Quetelet’s average man is also the ‘constant cause,’ prior to the observed controlled variability. He is a sort of model, of which specific individuals are imperfect copies. The second part of this cognitive construction, which is so important in the later uses of statistics in social studies, is the attention drawn by the ‘remarkable regularity’ of series of statistics, such as those of marriages, suicides, or crimes. Just as series of draws from an urn reveal a regularity in the observed frequency of white balls, the regularity in the rates of suicide or crime can be interpreted as resulting from series of draws from a population, some of the members of which are affected with a ‘propensity’ to
suicide or crime. The average man is therefore endowed not only with physical attributes but also ‘moral’ ones, such as these propensities. Here again, just as the average heights of conscripts are stable, whereas individual heights are dispersed, crime or suicide rates are just as stable, whereas these acts are eminently individual and unpredictable. This form of statistics, then called ‘moral,’ signaled the beginning of sociology, a science of society radically distinct from a science of the individual, such as psychology (Porter 1986). Quetelet’s reasoning would underpin the reasoning developed by Durkheim in Suicide: A Study in Sociology (1897). This way of using statistical regularity to back the idea of the existence of a society ruled by specific ‘laws,’ distinct from those governing individual behavior, dominated nineteenth-century and, in part, twentieth-century social studies. Around 1900, however, another approach appeared, this one centered on two ideas: the distribution (no longer just the average) of observations, and the correlation between two or several variables, observed in individuals (no longer just in groups, such as territories).
2. Distribution, Correlation, and Causality

This shift of interest from the average individual to the distributions and hierarchies among individuals was connected to the rise, in late-century Victorian England, of a eugenicist and hereditarian current of thought inspired by Darwin (MacKenzie 1981). Its two main advocates were Francis Galton (1822–1911), a cousin of Darwin, and Karl Pearson (1857–1936). In their attempt to measure biological heredity, which was central to their political construction, they created a radically new statistical tool that made it possible to conceive partial causality. Such causality had been absent from all previous forms of thought, for which A either is or is not the cause of B, but cannot be so somewhat or incompletely. Yet Galton’s research on heredity led to such a formulation: the parents’ height ‘explains’ the children’s, but does not entirely ‘determine’ it. The taller fathers are, the taller are their sons on average, but, for a father’s given height, the sons’ height dispersion is great. This formalization of heredity led to the two related ideas of regression and correlation, later to be extensively used in social studies as symptoms of causality. Pearson, however, greatly influenced by the antirealist philosophy of the physicist Ernst Mach, challenged the idea of ‘causality,’ which according to him was ‘metaphysical,’ and stuck to the one of ‘correlation,’ which he described with the help of ‘contingency tables’ (Pearson 1911, Chap. 5). For him, scientific laws are only summaries, brief descriptions in mental stenography, abridged formulas, a condensation of perception routines for future use and forecasting. Such formulas are the limits of observations that never perfectly respect the strict functional laws. The correlation coefficient makes it possible to measure the strength of the connection, between zero (independence) and one (strict dependence). Thus, in this conception of science associated by Pearson with the budding field of mathematical statistics, the reality of things can only be invoked for pragmatic ends and provided that the ‘perception routines’ are maintained. Similarly, ‘causality’ can only be invoked insofar as it is a proven correlation, therefore predictable with a fairly high probability. Pearson’s pointed formulations would constitute, in the early twentieth century, one of the foci of the epistemology of statistics applied to social studies. Others, in contrast, would seek to give new meaning to the concepts of reality and causality by defining them differently. These discussions were strongly related to the aims of the statistical work, strained between scientific knowledge and decisions.

Current mathematical statistics proceed from the works of Karl Pearson and his successors: his son Egon Pearson (1895–1980), the Polish mathematician Jerzy Neyman (1894–1981), Ronald Fisher (1890–1962), the statistician who pioneered agricultural experimentation, and finally the engineer and beer brewer William Gosset, alias Student (1876–1937). These developments were the result of an increasingly thorough integration of so-called ‘inferential’ statistics into probabilistic models. The interpretation of these constructions is always stretched between two perspectives: the one of science, which aims to prove or test hypotheses, with truth as its goal, and the one of action, which aims to make the best decision, with efficiency as its goal. This tension explains a number of controversies that opposed the founders of inferential statistics in the 1930s.
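The regression and correlation ideas described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the heights are synthetic, the population mean of 175 cm, the slope of 0.5, and the noise levels are invented for illustration and are not Galton’s data, and the helper names `pearson_r` and `regression_slope` are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def regression_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Synthetic data (assumption): a son's height depends partly on the
# father's height and partly on independent variation.
rng = random.Random(0)
fathers = [rng.gauss(175, 7) for _ in range(2000)]
sons = [175 + 0.5 * (f - 175) + rng.gauss(0, 6) for f in fathers]

r = pearson_r(fathers, sons)
slope = regression_slope(fathers, sons)
print(f"correlation = {r:.2f}, regression slope = {slope:.2f}")
```

In such a simulation the father’s height ‘explains’ the son’s only in part: the slope is below one (sons of tall fathers are tall on average, but less exceptionally so), and the correlation lies strictly between independence and strict dependence.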
In effect, the essential innovations were often directly generated within the framework of research as applied to economic issues, for instance in the cases of Gosset and Fisher. Gosset was employed in a brewery. He developed product quality-control techniques based on a small number of samples. He needed to appraise the variances and laws of distribution of parameters calculated on observations not complying with the supposed ‘law of large numbers.’ Fisher, who worked in an agricultural research center, could only carry out a limited number of controlled tests. He mitigated this limitation by artificially creating a randomness, itself controlled, for variables other than those for which he was trying to measure the effect. This ‘randomization’ technique thus introduced probabilistic chance into the very heart of the experimental process. Unlike Karl Pearson, Gosset and Fisher used distinct notations to designate, on the one hand, the theoretical parameter of a probability distribution (a mean, a variance, a correlation) and on the other, the estimate of this parameter, calculated on the basis of observations so insufficient in number that it was no longer possible to disregard the gap between these two values, theoretical and estimated.
This new system of notation marked a decisive turning point: it enabled an inferential statistics based on probabilistic models. This form of statistics was developed in two directions. The estimation of parameters, which took into account a set of recorded data, presupposed that the model was true. The information produced by the model was combined with the data, but nothing indicated whether the model and the data were in agreement. In contrast, the hypothesis tests allowed this agreement to be tested and, if necessary, the model to be modified: this was the inventive part of inferential statistics. In wondering whether a set of events could plausibly have occurred if a model were true, one compared these events (explicitly or otherwise) to those that would have occurred if the model were true, and made a judgment about the gap between these two sets of events. This judgment could itself be made according to two different perspectives, which were the object of vivid controversy between Fisher on the one hand, and Neyman and Egon Pearson on the other. Fisher’s test was placed in a perspective of truth and science: a theoretical hypothesis was judged plausible or was rejected, after consideration of the observed data. Neyman and Pearson’s test, in contrast, was aimed at decision making and action. One evaluated the respective costs of rejecting a true hypothesis and of accepting a false one, described as errors of Type I and Type II respectively. These two different aims, truth and economy, although supported by close probabilistic formalism, led to practically incommensurable argumentative worlds, as was shown by the dialogue of the deaf between Fisher on one side, and Neyman and Pearson on the other (Gigerenzer et al. 1989, pp. 90–109).
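The two kinds of error weighed in the Neyman and Pearson perspective can be computed explicitly in a simple case. The sketch below assumes a one-sided test on the mean of normally distributed observations with known standard deviation. The function name `error_rates` and all numerical values are illustrative assumptions, not taken from the historical sources. A Type I error is the rejection of a true null hypothesis; a Type II error is the acceptance of a false one.

```python
from statistics import NormalDist


def error_rates(mu0, mu1, sigma, n, alpha):
    """Error probabilities of a one-sided z-test of H0: mu = mu0
    against the specific alternative H1: mu = mu1 (with mu1 > mu0).

    The decision rule rejects H0 when the sample mean exceeds a critical
    value chosen so that the Type I error probability equals alpha.
    """
    se = sigma / n ** 0.5  # standard error of the sample mean
    critical = NormalDist(mu0, se).inv_cdf(1 - alpha)
    type_i = 1 - NormalDist(mu0, se).cdf(critical)   # reject H0 although it is true
    type_ii = NormalDist(mu1, se).cdf(critical)      # keep H0 although H1 is true
    return critical, type_i, type_ii


# Illustrative values (assumptions): 25 observations, sigma = 1.
critical, type_i, type_ii = error_rates(mu0=0.0, mu1=0.5, sigma=1.0, n=25, alpha=0.05)
print(f"reject H0 when sample mean > {critical:.3f}")
print(f"P(Type I) = {type_i:.3f}, P(Type II) = {type_ii:.3f}")
```

Lowering `alpha` moves the critical value upward and so raises the Type II error probability for a fixed sample size, which is why this approach treats the choice of a test as a trade-off between the costs of the two errors.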
3. Official Statistics and the Construction of the State

At the same time as mathematical statistics were developing, so-called ‘official’ statistics were also being developed in ‘bureaus of statistics,’ for a long time on an independent course. These latter did not use the new mathematical tools until the 1930s in the United States and the 1950s in Europe, in particular when the random sample-survey method was used to study employment or household budgets. Yet in the 1840s, Quetelet had already actively pushed for such bureaus to be set up in different countries, and for their ‘scientification’ with the tools of the time. In 1853, he had begun organizing meetings of the ‘International Congress of Statistics,’ which led to the establishment in 1885 of the ‘International Statistical Institute’ (which still exists and includes mathematicians and official statisticians). One could write the history of these bureaus as an aspect of the more general history of the construction of the state, insofar as they developed and legitimized a common language specifically combining the authority of science and that of
the state (Anderson 1988, Desrosières 1998, Patriarca 1996). More precisely, every period of the history of a state could be characterized by the list of questions socially judged to be ‘social’ that were consequently put on the agenda of official statistics. Three interdependent foci were thus co-constructed: representation, action, and statistics, that is, a way of describing and interpreting social phenomena (to which social studies would increasingly contribute), a method for determining state intervention and public action, and finally, a list of statistical ‘variables’ and procedures aimed at measuring them. Thus, for example, in England in the second third of the nineteenth century, poverty, mortality, and epidemic morbidity were followed closely in terms of a detailed geographical distribution (counties) by the General Register Office (GRO), set up by William Farr in 1837. England’s economic liberalism and the Poor Law Amendment Act of 1834 (which led to the creation of workhouses) were consistent with this form of statistics. In the 1880s and 1890s, Galton and Pearson’s hereditarian eugenics would compete with this ‘environmentalism,’ which explained poverty in terms of the social and territorial contexts. This new ‘social philosophy’ was reflected in new forms of political action, and of statistics. Thus the ‘social classification’ in five differentiated groups used by British statisticians throughout the twentieth century is marked by the political and cognitive configuration of the beginning of the century (Szreter 1996). In all the important countries (including Great Britain) of the 1890s and 1900s, however, the work of the bureaus of statistics was guided by labor-related issues: employment, wages, workers’ budgets, subsistence costs, etc. The modern idea of unemployment emerged, but its definition and its measurement were not standardized yet.
This interest in labor statistics was linked to the fairly general development of a specific ‘labor law’ and the first so-called ‘social welfare’ legislation, such as Bismarck’s in Germany, or that developed in the Nordic countries in the 1890s. It is significant that the application of the sample survey method (then called ‘representative’ survey) was first tested in Norway in 1895, precisely with a view to preparing a new law enacting general pension funds and invalidity insurance: this suggests the consistency of the political, technical, and cognitive dimensions of this co-construction. These forms of consistency are found in the statistics systems that were extensively developed, at a different scale, after 1945. At that time, public policies were governed by a number of issues: the regulation of the macroeconomic balance as seen through the Keynesian model, the reduction of social inequalities and the struggle against unemployment thanks to social-welfare systems, the democratization of school, etc. Some people then spoke of ‘revolution’ in government statistics (Duncan and Shelton 1978), and underscored
its four components, which have largely shaped the present statistics systems. National accounting, a vast construction integrating a large number of statistics from different sources, was the instrument on which the macroeconomic models resulting from the Keynesian analysis were based. Sample surveys made it possible to study a much broader range of issues and to accumulate quantitative descriptions of the social world, which were unthinkable at a time when the observation techniques were limited to censuses and monographs. Statistical coordination, an apparently strictly administrative affair, was indispensable to make consistent the observations resulting from different fields. Finally, beginning in 1960, the generalization of computer data processing radically transformed the activity of bureaus of statistics. So, ‘official statistics,’ placed at the junction between social studies, mathematics, and information on public policies, has become an important research component in the social studies. Given, however, that from the institutional standpoint it is generally placed outside, it is often hardly perceived by those who seek to draw up a panorama of these sciences. In fact, the way bureaus of statistics operate and are integrated into administrative and scientific contexts varies a lot from one country to another, so a history and a sociology of social studies cannot omit examining these institutions. They are often perceived as mere suppliers of data assumed to ‘reflect reality,’ when they are actually places where this ‘reality’ is instituted through co-constructed operations of social representation, public action, and statistical measurement.
See also: Estimation: Point and Interval; Galton, Sir Francis (1822–1911); Neyman, Jerzy (1894–1981); Pearson, Karl (1857–1936); Probability and Chance: Philosophical Aspects; Quantification in History; Quantification in the History of the Social Sciences; Quetelet, Adolphe (1796–1874); Statistical Methods, History of: Post-1900; Statistical Methods, History of: Pre-1900; Statistics: The Field
Bibliography

Anderson M J 1988 The American Census. A Social History. Yale University Press, New Haven, CT
Desrosières A 1998 The Politics of Large Numbers. A History of Statistical Reasoning. Harvard University Press, Cambridge, MA
Duncan J W, Shelton W C 1978 Revolution in United States Government Statistics, 1926–1976. US Department of Commerce, Washington, DC
Gigerenzer G et al. 1989 The Empire of Chance. How Probability Changed Science and Everyday Life. Cambridge University Press, Cambridge, UK
Hacking I 1990 The Taming of Chance. Cambridge University Press, Cambridge, UK
Klein J L 1997 Statistical Visions in Time. A History of Time Series Analysis, 1662–1938. Cambridge University Press, Cambridge, UK
Lazarsfeld P 1977 Notes in the history of quantification in sociology: Trends, sources and problems. In: Kendall M, Plackett R L (eds.) Studies in the History of Statistics and Probability. Griffin, London, Vol. 2, pp. 213–69
MacKenzie D 1981 Statistics in Britain, 1865–1930. The Social Construction of Scientific Knowledge. Edinburgh University Press, Edinburgh, UK
Patriarca S 1996 Numbers and Nationhood: Writing Statistics in Nineteenth-century Italy. Cambridge University Press, Cambridge, UK
Pearson K 1911 The Grammar of Science, 3rd edn., rev. and enl. A. and C. Black, London
Porter T 1986 The Rise of Statistical Thinking, 1820–1900. Princeton University Press, Princeton, NJ
Stigler S M 1986 The History of Statistics: The Measurement of Uncertainty Before 1900. Belknap Press of Harvard University Press, Cambridge, MA
Szreter S 1996 Fertility, Class and Gender in Britain, 1860–1940. Cambridge University Press, Cambridge, UK
A. Desrosières
Statistics: The Field

Statistics is a term used to refer to both a field of scientific inquiry and a body of quantitative methods. The field of statistics has a 350-year intellectual history rooted in the origins of probability and the rudimentary tools of political arithmetic of the seventeenth century. Statistics came of age as a separate discipline with the development of formal inferential theories in the twentieth century. This article briefly traces some of this historical development and discusses current methodological and inferential approaches as well as some cross-cutting themes in the development of new statistical methods.
1. Introduction

Statistics is a body of quantitative methods associated with empirical observation. A primary goal of these methods is coping with uncertainty. Most formal statistical methods rely on probability theory to express this uncertainty and to provide a formal mathematical basis for data description and for analysis. The notion of variability associated with data, expressed through probability, plays a fundamental role in this theory. As a consequence, much statistical effort is focused on how to control and measure variability and/or how to assign it to its sources. Almost all characterizations of statistics as a field include the following elements:
(a) Designing experiments, surveys, and other systematic forms of empirical study.
(b) Summarizing and extracting information from data.
(c) Drawing formal inferences from empirical data through the use of probability.
(d) Communicating the results of statistical investigations to others, including scientists, policy makers, and the public.

This article describes a number of these elements, and the historical context out of which they grew. It provides a broad overview of the field that can serve as a starting point to many of the other statistical entries in this encyclopedia.
2. The Origins of the Field

The word ‘statistics’ is related to the word ‘state’ and the original activity that was labeled as statistics was social in nature and related to elements of society through the organization of economic, demographic, and political facts. Paralleling this work to some extent was the development of the probability calculus and the theory of errors, typically associated with the physical sciences (see Statistical Methods, History of: Pre-1900). These traditions came together in the nineteenth century and led to the notion of statistics as a collection of methods for the analysis of scientific data and the drawing of inferences therefrom. As Hacking (1990) has noted: ‘By the end of the century chance had attained the respectability of a Victorian valet, ready to be the logical servant of the natural, biological and social sciences’ (p. 2). At the beginning of the twentieth century, we see the emergence of statistics as a field under the leadership of Karl Pearson, George Udny Yule, Francis Y. Edgeworth, and others of the ‘English’ statistical school. As Stigler (1986) suggests:

Before 1900 we see many scientists of different fields developing and using techniques we now recognize as belonging to modern statistics. After 1900 we begin to see identifiable statisticians developing such techniques into a unified logic of empirical science that goes far beyond its component parts. There was no sharp moment of birth; but with Pearson and Yule and the growing number of students in Pearson’s laboratory, the infant discipline may be said to have arrived. (p. 361)
Pearson’s laboratory at University College, London quickly became the first statistics department in the world and it was to influence subsequent developments in a profound fashion for the next three decades. Pearson and his colleagues founded the first methodologically-oriented statistics journal, Biometrika, and they stimulated the development of new approaches to statistical methods. What remained before statistics could legitimately take on the mantle of a field of inquiry, separate from mathematics or the use of statistical approaches in other fields, was the development of the formal foundations of theories of inference from observations, rooted in an axiomatic theory of probability.
Beginning at least with the Rev. Thomas Bayes and Pierre Simon Laplace in the eighteenth century, most early efforts at statistical inference used what was known as the method of inverse probability to update a prior probability using the observed data in what we now refer to as Bayes’ Theorem. (For a discussion of who really invented Bayes’ Theorem, see Stigler 1999, Chap. 15.) Inverse probability came under challenge in the nineteenth century (see Statistical Methods, History of: Pre-1900), but viable alternative approaches gained little currency. It was only with the work of R. A. Fisher on statistical models, estimation, and significance tests, and Jerzy Neyman and Egon Pearson, in the 1920s and 1930s, on tests of hypotheses, that alternative approaches were fully articulated and given a formal foundation. Neyman’s advocacy of the role of probability in the structuring of a frequency-based approach to sample surveys in 1934 and his development of confidence intervals further consolidated this effort at the development of a foundation for inference (cf. Statistical Methods, History of: Post-1900 and the discussion of ‘The inference experts’ in Gigerenzer et al. 1989). At about the same time Kolmogorov presented his famous axiomatic treatment of probability, and thus by the end of the 1930s, all of the requisite elements were finally in place for the identification of statistics as a field. Not coincidentally, the first statistical society devoted to the mathematical underpinnings of the field, The Institute of Mathematical Statistics, was created in the United States in the mid-1930s. It was during this same period that departments of statistics and statistical laboratories and groups were first formed in universities in the United States.
3. Emergence of Statistics as a Field

3.1 The Role of World War II

Perhaps the greatest catalysts to the emergence of statistics as a field were two major social events: the Great Depression of the 1930s and World War II. In the United States, one of the responses to the depression was the development of large-scale probability-based surveys to measure employment and unemployment. This was followed by the institutionalization of sampling as part of the 1940 US decennial census. But with World War II raging in Europe and in Asia, mathematicians and statisticians were drawn into the war effort, and as a consequence they turned their attention to a broad array of new problems. In particular, multiple statistical groups were established in both England and the US specifically to develop new methods and to provide consulting. (See Wallis 1980, on statistical groups in the US; Barnard and Plackett 1985, for related efforts in the United Kingdom; and Fienberg 1985.) These groups not only created imaginative new techniques such as sequential
analysis and statistical decision theory, but they also developed a shared research agenda. That agenda led to a blossoming of statistics after the war, and in the 1950s and 1960s to the creation of departments of statistics at universities, from coast to coast in the US, and to a lesser extent in England and elsewhere.

3.2 The Neo-Bayesian Revival

Although inverse probability came under challenge in the 1920s and 1930s, it was not totally abandoned. John Maynard Keynes (1921) wrote A Treatise on Probability that was rooted in this tradition, and Frank Ramsey (1926) provided an early effort at justifying the subjective nature of prior distributions and suggested the importance of utility functions as an adjunct to statistical inference. Bruno de Finetti provided further development of these ideas in the 1930s, while Harold Jeffreys (1938) created a separate ‘objective’ development of these and other statistical ideas on inverse probability. Yet as statistics flourished in the post-World War II era, it was largely based on the developments of Fisher, Neyman and Pearson, as well as the decision theory methods of Abraham Wald (1950). L. J. Savage revived interest in the inverse probability approach with The Foundations of Statistics (1954) in which he attempted to provide the axiomatic foundation from the subjective perspective. In an essentially independent effort, Raiffa and Schlaifer (1961) attempted to provide inverse probability counterparts to many of the then existing frequentist tools, referring to these alternatives as ‘Bayesian.’ By 1960, the term ‘Bayesian inference’ had become standard usage in the statistical literature, the theoretical interest in the development of Bayesian approaches began to take hold, and the neo-Bayesian revival was underway.
But the movement from Bayesian theory to statistical practice was slow, in large part because the computations associated with posterior distributions were an overwhelming stumbling block for those who were interested in the methods. Only in the 1980s and 1990s did new computational approaches revolutionize both Bayesian methods, and the interest in them, in a broad array of areas of application.

3.3 The Role of Computation in Statistics

From the days of Pearson and Fisher, computation played a crucial role in the development and application of statistics. Pearson’s laboratory employed dozens of women who used mechanical devices to carry out the careful and painstaking calculations required to tabulate values from various probability distributions. This effort ultimately led to the creation of the Biometrika Tables for Statisticians that were so widely used by others applying tools such as chi-square tests and the like. Similarly, Fisher also
developed his own set of statistical tables with Frank Yates when he worked at Rothamsted Experiment Station in the 1920s and 1930s. One of the most famous pictures of Fisher shows him seated at Whittingehame Lodge, working at his desk calculator (see Box 1978). The development of the modern computer revolutionized statistical calculation and practice, beginning with the creation of the first statistical packages in the 1960s, such as the BMDP package for biological and medical applications, and Datatext for statistical work in the social sciences. Other packages soon followed, such as SAS and SPSS for both data management and production-like statistical analyses, and MINITAB for the teaching of statistics. In 2001, in the era of the desktop personal computer, almost everyone has easy access to interactive statistical programs that can implement complex statistical procedures and produce publication-quality graphics. And there is a new generation of statistical tools that rely upon statistical simulation such as the bootstrap and Markov Chain Monte Carlo methods. Complementing the traditional production-like packages for statistical analysis are more methodologically oriented languages such as S and S-PLUS, and symbolic and algebraic calculation packages. Statistical journals and those in various fields of application devote considerable space to descriptions of such tools.
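To give a sense of the simulation-based tools mentioned above, the following is a minimal sketch of the nonparametric bootstrap, written in Python rather than in any of the packages named in the text. The data values, the number of resamples, and the function name `bootstrap_se` are illustrative assumptions. The idea is to resample the observed data with replacement many times and to use the spread of the recomputed statistic as an estimate of its standard error.

```python
import random
import statistics


def bootstrap_se(data, stat, n_resamples=2000, seed=0):
    """Bootstrap estimate of the standard error of stat(data).

    Each replicate applies `stat` to a sample of the same size as `data`,
    drawn from `data` with replacement.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = [stat(rng.choices(data, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(replicates)


# Hypothetical measurements, invented for illustration.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

se_mean = bootstrap_se(sample, statistics.mean)
se_median = bootstrap_se(sample, statistics.median)
print(f"bootstrap SE of the mean:   {se_mean:.3f}")
print(f"bootstrap SE of the median: {se_median:.3f}")
```

The same function works for statistics such as the median, for which no simple standard-error formula is available; this substitution of computation for analytical derivation is what made such methods practical only once computing became cheap.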
4. Statistics at the End of the Twentieth Century

It is widely recognized that any statistical analysis can only be as good as the underlying data. Consequently, statisticians take great care in the design of methods for data collection and in their actual implementation. Some of the most important modes of statistical data collection include censuses, experiments, observational studies, and sample surveys, all of which are discussed elsewhere in this encyclopedia. Statistical experiments gain their strength and validity both through the random assignment of treatments to units and through the control of nontreatment variables. Similarly, sample surveys gain their validity for generalization through the careful design of survey questionnaires and probability methods used for the selection of the sample units. Approaches to cope with the failure to fully implement randomization in experiments or random selection in sample surveys are discussed in Experimental Design: Compliance and Nonsampling Errors. Data in some statistical studies are collected essentially at a single point in time (cross-sectional studies), while in others they are collected repeatedly at several time points or even continuously, while in yet others observations are collected sequentially, until sufficient information is available for inferential purposes. Different entries discuss these options and their strengths and weaknesses.
After a century of formal development, statistics as a field has developed a number of different approaches that rely on probability theory as a mathematical basis for description, analysis, and statistical inference. We provide an overview of some of these in the remainder of this section and provide some links to other entries in this encyclopedia.

4.1 Data Analysis
The least formal approach to inference is often the first employed. Its name stems from a famous article by John Tukey (1962), but it is rooted in the more traditional forms of descriptive statistical methods used for centuries. Today, data analysis relies heavily on graphical methods and there are different traditions, such as those associated with
(a) The ‘exploratory data analysis’ methods suggested by Tukey and others (see Exploratory Data Analysis: Univariate Methods).
(b) The more stylized correspondence analysis techniques of Benzecri and the French school (see Scaling: Correspondence Analysis).
(c) The alphabet soup of computer-based multivariate methods that have emerged over the past decade such as ACE, MARS, CART, etc. (see Exploratory Data Analysis: Univariate Methods).

No matter which ‘school’ of data analysis someone adheres to, the spirit of the methods is typically to encourage the data to ‘speak for themselves.’ While no theory of data analysis has emerged, and perhaps none is to be expected, the flexibility of thought and method embodied in the data analytic ideas have influenced all of the other approaches.

4.2 Frequentism

The name of this group of methods refers to a hypothetical infinite sequence of data sets generated as was the data set in question. Inferences are to be made with respect to this hypothetical infinite sequence. (For details, see Frequentist Inference.) One of the leading frequentist methods is significance testing, formalized initially by R. A. Fisher (1925) and subsequently elaborated upon and extended by Neyman and Pearson and others (see below). Here a null hypothesis is chosen, for example, that the mean, µ, of a normally distributed set of observations is 0. Fisher suggested the choice of a test statistic, e.g., based on the sample mean, x̄, and the calculation of the probability, under the null hypothesis, of observing an outcome as extreme as or more extreme than x̄ is from µ = 0, a quantity usually labeled as the p-value.
When p is small (e.g., less than 5 percent), either a rare event has occurred or the null hypothesis is false. Within this theory, no probability can be given for which of these two conclusions is the case.
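The p-value calculation described above can be sketched for the simplest case. The example assumes, beyond what the text states, that the standard deviation σ of the observations is known, so that the standardized sample mean can be referred to the standard normal distribution; the data and the function name `two_sided_p_value` are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(sample, sigma):
    """Two-sided p-value for H0: mu = 0, for normal data with known sigma.

    The test statistic is z = xbar / (sigma / sqrt(n)); the p-value is the
    probability, under H0, of a |z| at least as large as the one observed.
    """
    n = len(sample)
    xbar = sum(sample) / n
    z = xbar / (sigma / n ** 0.5)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return xbar, z, p


# Hypothetical observations, invented for illustration.
observations = [0.8, -0.3, 1.2, 0.5, 0.9, -0.1, 1.4, 0.6, 0.2]
xbar, z, p = two_sided_p_value(observations, sigma=1.0)
print(f"sample mean = {xbar:.3f}, z = {z:.3f}, p-value = {p:.3f}")
```

The number returned is a statement about how often data this extreme would arise if the null hypothesis were true. It is not the probability that the null hypothesis is true, which this theory does not supply.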
A related set of methods is testing hypotheses, as proposed by Neyman and Pearson (1928, 1932). In this approach, procedures are sought having the property that, for an infinite sequence of such sets, in only (say) 5 percent would the null hypothesis be rejected if the null hypothesis were true. Often the infinite sequence is restricted to sets having the same sample size, but this is unnecessary. Here, in addition to the null hypothesis, an alternative hypothesis is specified. This permits the definition of a power curve, reflecting the frequency of rejecting the null hypothesis when the specified alternative is the case. But, as with the Fisherian approach, no probability can be given to either the null or the alternative hypotheses (see Hypothesis Testing in Statistics).
The construction of confidence intervals, following the proposal of Neyman (1934), is intimately related to testing hypotheses; indeed a 95 percent confidence interval may be regarded as the set of null hypotheses which, had they been tested at the 5 percent level of significance, would not have been rejected. A confidence interval is a random interval, having the property that the specified proportion (say 95 percent) of the infinite sequence of random intervals would have covered the true value. For example, an interval that 95 percent of the time (by auxiliary randomization) is the whole real line, and 5 percent of the time is the empty set, is a valid 95 percent confidence interval.
Estimation of parameters, i.e., choosing a single value of the parameters that is in some sense best, is also an important frequentist method. Many methods have been proposed, both for particular models and as general approaches regardless of model, and their frequentist properties explored. These methods are usually extended to intervals of values through inversion of test statistics or via other related devices (see e.g., Fiducial and Structural Statistical Inference).
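The coverage property of a confidence interval can be illustrated by simulating a finite stand-in for the hypothetical infinite sequence of data sets. The sketch below assumes normal data with known standard deviation; the function name `coverage`, the sample size, the number of repetitions, and the seed are all illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(true_mu=0.0, sigma=1.0, n=20, reps=5000, level=0.95, seed=1):
    """Fraction of simulated intervals that cover the true mean."""
    rng = random.Random(seed)
    # Normal quantile for a two-sided interval at the requested level.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(true_mu, sigma) for _ in range(n)])
        if xbar - half_width <= true_mu <= xbar + half_width:
            hits += 1
    return hits / reps


rate = coverage()
print(f"empirical coverage = {rate:.3f}")
```

The proportion is a property of the procedure over repeated data sets; it says nothing about whether any single computed interval contains the true value.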
The resulting confidence intervals share many of the frequentist theoretical properties of the corresponding test procedures (see Estimation: Point and Interval). Frequentist statisticians have explored a number of general properties thought to be desirable in a procedure, such as invariance, unbiasedness, sufficiency, conditioning on ancillary statistics, etc. (see Invariance in Statistics, Estimation: Point and Interval, Statistical Sufficiency). While each of these properties has examples in which it appears to produce satisfactory recommendations, there are others in which it does not. Additionally, these properties can conflict with each other. No general frequentist theory has emerged that proposes a hierarchy of desirable properties, leaving a frequentist without guidance in facing a new problem.

4.3 Likelihood Methods
The likelihood function (first studied systematically by R. A. Fisher) is the probability density of the data,
viewed as a function of the parameters. It occupies an interesting middle ground in the philosophical debate, as it is used both by frequentists (as in maximum likelihood estimation) and by Bayesians in the transition from prior distributions to posterior distributions. A small group of scholars (among them G. A. Barnard, A. W. F. Edwards, R. Royall, D. Sprott) have proposed the likelihood function as an independent basis for inference (see Likelihood in Statistics). The issue of nuisance parameters has perplexed this group, since maximization, as would be consistent with maximum likelihood estimation, leads to different results in general than does integration, which would be consistent with Bayesian ideas.

4.4 Bayesian Methods
Both frequentists and Bayesians accept Bayes’ Theorem as correct, but Bayesians use it far more heavily (see Bayesian Statistics). Bayesian analysis proceeds from the idea that probability is personal (see Probability: Interpretations) or subjective, reflecting the views of a particular person at a particular point in time. These views are summarized in the prior distribution over the parameter space. Together the prior distribution and the likelihood function define the joint distribution of the parameters and the data. This joint distribution can alternatively be factored as the product of the posterior distribution of the parameter given the data times the predictive distribution of the data. In the past, Bayesian methods were deemed to be controversial because of the avowedly subjective nature of the prior distribution. But the controversy surrounding their use has lessened as recognition of the subjective nature of the likelihood has spread. Unlike frequentist methods, Bayesian methods are, in principle, free of the paradoxes and counterexamples that make classical statistics so perplexing.
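The two factorizations of the joint distribution can be checked numerically on a small discrete example. The sketch below assumes a binomial model with the success probability restricted to a five-point grid and a uniform prior; the grid, the prior, and the data (7 successes in 10 trials) are illustrative assumptions only.

```python
from math import comb

# Hypothetical parameter grid and uniform prior (assumptions for illustration).
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
prior = [0.2, 0.2, 0.2, 0.2, 0.2]
n, k = 10, 7  # illustrative data: 7 successes in 10 trials


def likelihood(theta, k, n):
    """Binomial probability of k successes in n trials given theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


# Joint distribution of (theta, data) evaluated at the observed data.
joint = [p * likelihood(t, k, n) for t, p in zip(thetas, prior)]
# Predictive (marginal) probability of the data.
predictive = sum(joint)
# Posterior distribution of theta given the data.
posterior = [j / predictive for j in joint]

for t, post in zip(thetas, posterior):
    print(f"theta = {t:.1f}  posterior = {post:.3f}")
```

By construction, `prior[i] * likelihood(...)` and `posterior[i] * predictive` are the same number for every grid point, which is the factorization the paragraph describes.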
The development of hierarchical modeling and Markov Chain Monte Carlo (MCMC) methods has further added to the current popularity of the Bayesian approach, as they allow analyses of models that would otherwise be intractable (see Monte Carlo Methods and Bayesian Computation: Overview and Markov Chain Monte Carlo Methods). Bayesian decision theory, which interacts closely with Bayesian statistical methods, is a useful way of modeling and addressing decision problems of experimental designs and data analysis and inference (see Decision Theory: Bayesian). It introduces the notion of utilities, and the optimum decision combines probabilities of events with utilities by calculating expected utility and maximizing it (e.g., see the discussion in Lindley 2000). Current research is attempting to use the Bayesian approach to hypothesis testing to provide tests and p-values with good frequentist properties (see Bayarri and Berger 2000).
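The expected-utility rule mentioned above can be sketched for a toy decision problem. The states, actions, probabilities, and utilities below are hypothetical values chosen only to show the calculation.

```python
# Hypothetical probabilities of two states of the world (an assumption).
probs = {"effective": 0.6, "ineffective": 0.4}

# Hypothetical utility of each action in each state (an assumption).
utility = {
    "adopt": {"effective": 10.0, "ineffective": -5.0},
    "reject": {"effective": 0.0, "ineffective": 0.0},
}


def expected_utility(action):
    """Probability-weighted average utility of an action."""
    return sum(probs[state] * utility[action][state] for state in probs)


# The optimum decision maximizes expected utility.
best = max(utility, key=expected_utility)
for action in utility:
    print(f"{action}: expected utility = {expected_utility(action):.2f}")
print(f"chosen action: {best}")
```

In a full Bayesian analysis the probabilities would come from a posterior distribution rather than being fixed by hand as they are here.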
4.5 Broad Models: Nonparametrics and Semiparametrics
These models include parameter spaces of infinite dimensions, whether addressed in a frequentist or Bayesian manner. In a sense, these models put more inferential weight on the assumption of conditional independence than does an ordinary parametric model. Methods and properties of these models are still being developed and some of these are described in various entries of this encyclopedia (see Exploratory Data Analysis: Multivariate Approaches (Nonparametric Regression), Nonparametric Statistics: Rank-based Methods, Semiparametric Models).
4.6 Some Cross-cutting Themes
Often different fields of application of statistics need to address similar issues. For example, dimensionality of the parameter space is often a problem. As more parameters are added, the model will in general fit better (at least no worse). Is the apparent gain in accuracy worth the reduction in parsimony? There are many different ways to address this question in the various applied areas of statistics. Another common theme, in some sense the obverse of the previous one, is the question of model selection and goodness of fit. In what sense can one say that a set of observations is well-approximated by a particular distribution? (cf. Goodness of Fit: Overview). All statistical theory relies at some level on the use of formal models, and the appropriateness of those models and their detailed specification are of concern to users of statistical methods, no matter which school of statistical inference they choose to work within.
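The remark that added parameters never worsen the fit can be seen with two nested least-squares models: a constant, and a constant plus a slope. The data and the function names `rss_constant` and `rss_line` below are illustrative assumptions; the sketch compares residual sums of squares only and does not settle whether the extra parameter is worth its cost in parsimony.

```python
from statistics import fmean


def rss_constant(y):
    """Residual sum of squares for the one-parameter model y = a."""
    ybar = fmean(y)
    return sum((yi - ybar) ** 2 for yi in y)


def rss_line(x, y):
    """Residual sum of squares for the two-parameter model y = a + b*x."""
    xbar, ybar = fmean(x), fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Illustrative data (an assumption, not from the article).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 1.9, 3.2, 2.8, 4.1, 3.9]

print(f"RSS, constant only:    {rss_constant(y):.3f}")
print(f"RSS, constant + slope: {rss_line(x, y):.3f}")
```

Because the constant-only model is the special case with slope zero, the larger model's residual sum of squares cannot exceed the smaller model's.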
5. Statistics in the Twenty-first Century

5.1 Adapting and Generalizing Methodology
Statistics as a field provides scientists with the basis for dealing with uncertainty, and, among other things, for generalizing from a sample to a population. There is a parallel sense in which statistics provides a basis for generalization: when similar tools are developed within specific substantive fields, such as experimental design methodology in agriculture and medicine, and sample surveys in economics and sociology. Statisticians have long recognized the common elements of such methodologies and have sought to develop generalized tools and theories to deal with these separate approaches (see e.g., Fienberg and Tanur 1989).
One hallmark of modern statistical science is the development of general frameworks that unify methodology. Thus the tools of Generalized Linear Models draw together methods for linear regression and analysis of variance models with normal errors and those log-linear and logistic models for categorical data, in a broader and richer framework. Similarly, graphical models developed in the 1970s and 1980s use concepts of independence to integrate work in covariance selection, decomposable log-linear models, and Markov random field models, and produce new methodology as a consequence. And the latent variable approaches from psychometrics and sociology have been tied with simultaneous equation and measurement error models from econometrics into a broader theory of covariance analysis and structural equations models.
Another hallmark of modern statistical science is the borrowing of methods in one field for application in another. One example is provided by Markov Chain Monte Carlo methods, now used widely in Bayesian statistics, which were first used in physics. Survival analysis, used in biostatistics to model the disease-free time or time-to-mortality of medical patients, and analyzed as reliability in quality control studies, is now used in econometrics to measure the time until an unemployed person gets a job. We anticipate that this trend of methodological borrowing will continue across fields of application.
5.2 Where Will New Statistical Developments be Focused?
In the issues of its year 2000 volume, the Journal of the American Statistical Association explored both the state of the art of statistics in diverse areas of application, and that of theory and methods, through a series of vignettes or short articles. These essays provide an excellent supplement to the entries of this encyclopedia on a wide range of topics, not only presenting a snapshot of the current state of play in selected areas of the field but also offering some speculation on the next generation of developments. In an afterword to the last set of these vignettes, Casella (2000) summarizes five overarching themes that he observed in reading through the entire collection:
(a) Large datasets.
(b) High-dimensional/nonparametric models.
(c) Accessible computing.
(d) Bayes/frequentist/who cares?
(e) Theory/applied/why differentiate?
Not surprisingly, these themes fit well those that one can read into the statistical entries in this encyclopedia. The coming together of Bayesian and frequentist methods, for example, is illustrated by the movement of frequentists towards the use of hierarchical models (see Hierarchical Models: Random and Fixed Effects) and the regular consideration of frequentist properties of Bayesian procedures (e.g., Bayarri and Berger 2000). Similarly, MCMC methods are being widely used in non-Bayesian settings and, because they focus
on long-run sequences of dependent draws from multivariate probability distributions, there are frequentist elements that are brought to bear in the study of the convergence of MCMC procedures. Thus the oft-made distinction between the different schools of statistical inference (suggested in the preceding section) is not always clear in the context of real applications.
5.3 The Growing Importance of Statistics Across the Social and Behavioral Sciences
Statistics touches on an increasing number of fields of application, in the social sciences as in other areas of scholarship. Historically, the closest links have been with economics; together these fields share parentage of econometrics. There are now vigorous interactions with political science, law, sociology, psychology, anthropology, archeology, history, and many others. In some fields, the development of statistical methods has not been universally welcomed. Using these methods well and knowledgeably requires an understanding both of the substantive field and of statistical methods. Sometimes this combination of skills has been difficult to develop.
Statistical methods are having increasing success in addressing questions throughout the social and behavioral sciences. Data are being collected and analyzed on an increasing variety of subjects, and the analyses are becoming increasingly sharply focused on the issues of interest. We do not anticipate, nor would we find desirable, a future in which only statistical evidence was accepted in the social and behavioral sciences. There is room for, and need for, many different approaches. Nonetheless, we expect the excellent progress made in statistical methods in the social and behavioral sciences in recent decades to continue and intensify.
Bibliography
Barnard G A, Plackett R L 1985 Statistics in the United Kingdom, 1939–1945. In: Atkinson A C, Fienberg S E (eds.) A Celebration of Statistics: The ISI Centennial Volume. Springer-Verlag, New York, pp. 31–55
Bayarri M J, Berger J O 2000 P values for composite null models (with discussion). Journal of the American Statistical Association 95: 1127–72
Box J 1978 R. A. Fisher, The Life of a Scientist. Wiley, New York
Casella G 2000 Afterword. Journal of the American Statistical Association 95: 1388
Fienberg S E 1985 Statistical developments in World War II: An international perspective. In: Atkinson A C, Fienberg S E (eds.) A Celebration of Statistics: The ISI Centennial Volume. Springer-Verlag, New York, pp. 25–30
Fienberg S E, Tanur J M 1989 Combining cognitive and statistical approaches to survey design. Science 243: 1017–22
Fisher R A 1925 Statistical Methods for Research Workers. Oliver and Boyd, London
Gigerenzer G, Swijtink Z, Porter T, Daston L, Beatty J, Krüger L 1989 The Empire of Chance. Cambridge University Press, Cambridge, UK
Hacking I 1990 The Taming of Chance. Cambridge University Press, Cambridge, UK
Jeffreys H 1938 Theory of Probability, 2nd edn. Clarendon Press, Oxford, UK
Keynes J 1921 A Treatise on Probability. Macmillan, London
Lindley D V 2000 The philosophy of statistics (with discussion). The Statistician 49: 293–337
Neyman J 1934 On the two different aspects of the representative method: the method of stratified sampling and the method of purposive selection (with discussion). Journal of the Royal Statistical Society 97: 558–625
Neyman J, Pearson E S 1928 On the use and interpretation of certain test criteria for purposes of statistical inference. Part I. Biometrika 20A: 175–240
Neyman J, Pearson E S 1932 On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society, Series A 231: 289–337
Raiffa H, Schlaifer R 1961 Applied Statistical Decision Theory. Harvard Business School, Boston
Ramsey F P 1926 Truth and probability. In: The Foundations of Mathematics and Other Logical Essays. Kegan Paul, London, pp.
Savage L J 1954 The Foundations of Statistics. Wiley, New York
Stigler S M 1986 The History of Statistics: The Measurement of Uncertainty Before 1900. Harvard University Press, Cambridge, MA
Stigler S M 1999 Statistics on the Table: The History of Statistical Concepts and Methods. Harvard University Press, Cambridge, MA
Tukey J W 1962 The future of data analysis. Annals of Mathematical Statistics 33: 1–67
Wald A 1950 Statistical Decision Functions. Wiley, New York
Wallis W 1980 The Statistical Research Group, 1942–1945 (with discussion). Journal of the American Statistical Association 75: 320–35
S. E. Fienberg and J. B. Kadane
Status and Role, Social Psychology of

The social psychology of status and role refers to the study of the social interactive processes that involve people in displaying, creating, and changing statuses and roles. Status and role link individuals to society, to institutional structures, social groups, etc.
1. History of Status and Role
Status is a term originally referring to legally enforceable rights related to societal positions, such as the rights implied in the expression ‘the status of citizens.’ In the social sciences it has been used to refer to societal positions, such as ascribed (race, gender, age) and achieved (occupational) statuses. Historically, different societal positions have been attributed varying degrees of legal rights (slaves, women, peasants); thus the term status carries with it an invidiousness. This connotation has been conceptualized as amounts of societal honor, esteem, and resources attributed to individuals or groups.
The social psychological conception of role is derived from the theater applied to everyday life; individuals are conceived as playing roles in society similar to actors upon a stage. Role also has more than one meaning. Role is conceptualized as cultural norms, prescriptions, and expectations associated with statuses such as those of fulfilling the achieved status of physician or the ascribed status of woman. Role also has been formulated as the real practices associated with a status.
Both concepts have long histories. What is now termed symbolic interaction, a form of social psychology associated with the University of Chicago early in the twentieth century, employed the concept of role as developed by George Herbert Mead and Robert E. Park (Stryker and Statham 1985). Mead’s (1934) conceptions of ‘role-playing’ or ‘taking the role of the other’ are normative formulations used to theorize social processes such as the socialization of children, the development and evolution of self in interactions with others, and the acquisition of meanings in conversations. Park formulated roles as ubiquitous public performances in which people engage to construct the social organizations of society (Turner 1968).
It was the anthropologist Ralph Linton who explicitly linked status to role. Linton theorized statuses as ascribed and achieved societal positions. Attached to each status is a role described as normative prescriptions (i.e., rights and duties, Linton 1936, p. 113) expressing expected behaviors associated with the status that the role incumbent is obliged and committed to fulfill. For Linton, status is a structural unit of society separate from the individual who fills it, and role is social process (‘A role represents the dynamic aspect of a status,’ Linton 1936, p. 114). Linton illustrates these by describing age and gender (he uses the word sex) as ascribed statuses, associating these with culture-specific behaviors that constitute these roles in different societies.
2. Elaborating the Social Psychology of Status and Role: 1940s to 1970s
In the period following World War II Theodore Sarbin, a social psychologist, along with two sociologists, Erving Goffman and Ralph H. Turner, shaped the social psychological discourse of status and role as social processes. Sarbin moved Linton’s focus from cultural prescriptions embedded in status shaping role incumbents’ behavior to factors for effectively accomplishing roles. Goffman elaborated the metaphor of social life as theater, focusing on the processes of role playing in creating selves and definitions of the situations. Turner conceptualized indeterminacy and individuals’ agency in status-role playing. Despite their different focuses the authors share intellectual heritages in Park and G. H. Mead.
In 1943, 1954, and 1968 Theodore Sarbin provided extended statements on role theory. The 1954 and 1968 essays were published in the influential Handbook of Social Psychology. In the 1954 essay Sarbin used, but collapsed, the terminology of status and role, conceiving of roles as simultaneously normative obligations (‘expectations,’ ‘rights and obligations,’ Sarbin 1954, p. 226) and as social positions (‘We define position as a system of role expectations,’ p. 223). This formulation framed an evolving trend for these three interactively inclined authors; the concept of status became muted as the dynamic processes of role were moved center stage in their analyses.
In the late 1960s Sarbin and Allen formulated role enactment (1968, p. 497) as a dependent variable, analyzing the influences of independent variables that enhance or inhibit the effectiveness of a role’s display. By effectiveness the authors mean motor and linguistic actions expressed in roles that are congruent with roles’ prescriptions and expectations (1968, p. 550). Sarbin and Allen provide six independent variables that impact upon the effectiveness of role enactment. The variables (1968, pp. 546–7) are:
(a) role expectations (validity of role expectations held by the actor);
(b) role location (accuracy locating others and the self in the proper role system);
(c) role demands (sensitivity to situationally generated role demands);
(d) role skills (available general and specific skills for role performances);
(e) self-role congruence (congruence of self-conception with the role);
(f) audience effects (role reinforcement properties of the audience).
Detailed analyses of the six variables indicate that role incumbents possess differing linguistic, motor, and emotional capacities furnishing them with the wherewithal to perform roles more or less effectively (i.e., convincingly, appropriately, authentically; 1968, p. 490). Although Sarbin emphasizes individuals’ capacities for effective role enactments, his role theorizing is interactive (e.g., ‘role enactment always occurs in the social context of complementary roles,’ p. 545) as evidenced by the concern with the causes, personal and social, for variation in role performances in social settings. In a closing section of the 1968 essay the authors turn their analysis around, asking how role enactments, treated as independent variables, affect the creation, evolution, and change of role incumbents’ social identities; this analysis of social identities parallels Goffman’s conception of the construction of self in interaction.
Sarbin insists that his portrayal of status and role is within the context of a dramaturgical (social life as theater) metaphor (Sarbin and Allen 1968, pp. 488–9, Sarbin 1968, p. 546). That metaphor is even more appropriately applied to Erving Goffman’s theorizing. In the preface to his The Presentation of Self in Everyday Life Goffman (1959, p. xi) so anoints his approach, and in the remainder of the book he
elaborates the metaphor. However, the focal concept of the book and of Goffman’s corpus is the self. Goffman has devoted much less space to theorizing status and role than he has to the self. However, it is individual role players’ actual public accomplishments in interactive performances that construct selves for others and self (Goffman 1959, pp. 252–3). Goffman defines role similarly to Linton as cultural prescriptions situated in statuses but also as typical real activities in which role players engage (i.e., role enactments, Goffman 1961, pp. 85, 93); he conceives of status as social position related to other positions and as honors and resources attached to the status. Within the context of the dramaturgical metaphor Goffman theorizes a conception of interactive performances in which every rule and practice used to construct selves and define situations is a sociological feature of role playing.
Goffman describes two settings in which role playing occurs: focused and unfocused interactions. The latter refers to the myriad trivial interchanges of daily life such as those between shoppers and supermarket cashiers, or among a group of unacquainted passengers riding an elevator (Goffman 1963). Sustained interactions are focused and particularly relevant; among these are organized work settings. Institutionalized work teams interest Goffman because role playing in these is staged in bounded structural and interactive circumstances that are specifiable precisely for analysis. It is from the work setting of doctors and nurses as a hospital surgical team that Goffman formulates a theory of the processes of status and role (Goffman 1961).
Goffman frames the essay by distinguishing between role embracing and distancing. ‘To embrace a role is to disappear completely into the virtual self available in the situation …’ (Goffman 1961, p. 106).
It is the role incumbent engaged as Linton might describe ‘… to act solely in terms of the normative demands upon someone in his position’ (p. 85). In contrast, role distancing suggests that the role incumbent, while engaging in the normative performances of the role, will also perform activities that are not part of, or may even be inconsistent with, the role in order to perpetuate the organization’s work functions (pp. 108, 114–15). Nowhere should role incumbents more totally embrace their roles than among doctors and nurses engaged in the serious work of surgery. Yet in typical ways doctors and nurses exhibit role-distancing activities in the operating theater. In addition to performing surgery a chief surgeon may find that he or she must act as ‘team manager’ expending status resources (e.g., their dignity) by playing the role of ‘nice guy or gal’ in order to reduce neophyte team members’ anxieties, increasing their poise and communication during the surgery (Goffman 1961, p. 122). Surgeons may mimic, jest, joke, sing, or discuss recent personal and news events to dispel tension. Surgeons may perform tasks not associated with their role and
beneath their status (again expending status honor) to maintain the team’s solidarity and commitments to the tasks. These role-distancing activities, Goffman suggests, are systemic and typical because they are derived from the sociological role-set in which surgeons are embedded. Some distancing activities evolve from the surgeon’s ties to the hospital administration, others from their status as instructor and mentor to interns, still others from connections with patients and their family, and even the surgeon’s own family may be the source for role-distancing performances. No matter where they originate from, these distancings are endemic to the surgical theater; they are performed to preserve and perpetuate the ordered functions of the team and its effective accomplishment of the surgical task. Goffman argues that Linton’s conception of the structural determination of roles (an embracing of role norms) is anomalous. He insists that adequate theorizing requires integrating normative role obligations with situationally required role distancing. Only such a complementary formulation proffers an empirically sound theory for the effective accomplishment of institutional tasks.
Turner does not emphasize the dramaturgical metaphor; instead he methodically and systematically theorizes and investigates the processes of status and role. Turner centers interactional, in contrast to structural, determination of status and role. Linton’s structural conceptualization of role behavior determined by status did not sit well with Turner’s epistemic roots in the Meadian tradition. Early in his career Turner published ‘The Navy Disbursing Officer as Bureaucrat’ (1947) addressed to this issue. In that essay Turner suggests that some status incumbents performed the role consistent with its bureaucratic normative role prescriptions, but others were capable of creating alternate solutions for the Navy Disbursing Officer role.
These officers constructed novel, better adapted, more effective courses of action, innovating upon the cultural, structural, and personal constraints placed upon them as role incumbents. This early essay indicated Turner’s inclinations toward interactive determination. His subsequent writings became more ardently so inclined. Thus, he ultimately theorized that even role performances that publicly conform to normatively expected behaviors may not be taken for granted as structurally determined. Instead, Turner conceives of status occupants as interactively engaged in constructing their roles and positions. For example, Turner argues that Sarbin’s and Goffman’s conceptions of ‘appearances and props,’ such as a military uniform or the spatial arrangement in classrooms such as the teacher’s position at the head of class, do not assure incumbents’ statuses nor their role performances. Turner advocates that in every instance it is the role performances, interactively achieved, that create an incumbent’s right to claim a status and to play a role (Turner 1985).
Turner’s conception of status and role is that they are volitionally constructed within the constraints of social settings. Paraphrasing his language, incumbents engage in processes of interactively creating their statuses and roles; they engage in role making (an intractable aspect of role playing) as well as role taking (Turner 1962). While Turner assumes normatively prescribed roles exist, their prescriptions are general propositions (gestalts regarding such things as values, goals, sentiments, styles, etc., Turner 1985), not specific dictates to behaviors, and therefore they are indeterminate. Individuals use these gestalts as schema to organize and interpret their own and others’ behaviors. It is individuals’ interpretations of these gestalts, within specific social contexts, that guide their performances. This proposition involved Turner in an epistemically coherent ‘inductive’ approach to the study of a variety of related topics on statuses and roles. However, after three decades of such publications Turner moved to unify his social psychology of status and role.
As the 1970s ended, a theoretical language to study the processes of status and role had evolved. The language assumed that while status and role exist as structural and normative features of society, their enactment is determined by individual incumbents’ qualities and accomplishments in situated interactions. The language formulated the possibilities not only for individuals’ recapitulation of culturally prescribed status and role but also for actors’ fashioning and performing differentiated variations of them. Status and role are linked to the development and character of selves and identities, as well as to definitions of situations, social organizations, and institutions. The language proffered an empirically grounded interactive theory of status and role social processes. While the theoretical language was bold, it was also preliminary.
Nevertheless, as a series of loosely connected propositions and hypotheses it stimulated an array of empirical and theoretical topics, from status and role enactment, to status and role strain and conflict, differentiation and change, etc. The social psychology of status and role had progressed a long way from its initial formulations delivered by Mead, Park, and Linton.
3. Consolidations in the 1980s
The 1980s began a period of critical review and theoretical consolidation. Of the three influential social scientists, only Turner remained involved with the topic. Turner provides a unique effort at consolidating the social psychology of status and role. He and his colleague Paul Colomy introduced axioms to explain role processes (Turner and Colomy 1988, Turner 1990). The essays focus on the processes of role allocation, differentiation, and change. Three axioms determine these processes. They are: (a) functional efficiency of roles, (b) costs and benefits for role incumbency, and (c) positive and negative cultural representation of roles. These three axioms, separately and in combination, explain the processes of role allocation and differentiation. Turner provides a similar but more complex analysis for role change (Turner 1990). The intent of these essays is to integrate role theory using a small, limited set of explanatory axiomatic factors. At the same time it is an effort to coherently integrate macro structural features to role processes (Turner 1990, pp. 106–8). The authors provide remarkably incisive models. However, their commitment to indeterminacy in the construction of statuses and roles deters them from achieving theoretical closure. The models also still require empirical substantiation.
A second form of consolidation is the attempt to synthesize interactive and structural approaches (earlier attempts are in Coser 1966 and Merton 1975). In the 1980s ‘impartial’ scholars tried to resolve the differences between the approaches in an effort to provide a unified theory. An underlying theme in these endeavors is that both approaches had advanced theoretically to where they were no longer in opposition with one another. Warren Handel, in a cautiously reasoned essay (1979), suggests that interactive and structural approaches are ‘compatible and complementary.’ Structural approaches’ increasing theoretical emphasis on role conflict undermines a previously held core conception of role incumbents’ conformity to internalized norms permitting stability and variation in status and role performances. Handel suggests that interactive approaches’ conception of negotiated meanings does not undermine structural conceptions of patterned conduct. Negotiated meanings may coexist within a working consensus among interacting individuals in a particular situation.
Handel concludes by legitimating his synthetic formulation, suggesting it has been designed in accord with Mead’s interactive theorizing. Jerold Heiss provides a similar synthesis between the two approaches. His reasoning is more ad hoc than is Handel’s but it comes to a similar conclusion, ‘The general role theory that I foresee for the future will combine the macroanalyses of the structuralists with the microanalyses of the interactionists’ (Heiss 1981, p. 101, emphasis in the original). In a third form of consolidation Sheldon Stryker attempts to develop a unified theory of roles by integrating interaction and structure. The framework is entitled a ‘structural symbolic interactionism’ (Stryker 1983, Stryker and Statham 1985), for some an oxymoron. The theory assumes that macro and micro social structures may constrain role improvisation. Among the micro and macro structures that do so are: persons’ selves and identities, expectations embedded in roles, others’ potential sanctions, structural closure and openness, cultural symbols’ impositions. Each is a variable, thus its constraint of improvisation depends on the relative capacity to impose itself upon role incumbents. Stryker generalizes that in interactions when role incumbents’ selves and identities and their goals and interests are satisfied they will conform to the role expectations. In contrast, disruptive social circumstances will encourage role improvisation. Stryker’s commitment to Mead’s symbolic interaction inspires him also to suggest that the same structural conditions (e.g., high employee turnover) may provoke antithetical interpretations simultaneously suggesting opportunities for role innovation and demands for conformity. Stryker suggests that his theory is a work in progress which ‘to this point has produced only a few reasonably rigorous theories. It largely consists of a set of assumptions and concepts …’ (Stryker and Statham 1985, p. 368).
4. Conclusion: To the Millennium and Beyond The theoretical link between status and role was severed soon after World War II. The structural conception of status is now confined to such as ‘status attainment studies,’ i.e., investigations of individuals’ and groups’ relative position in stratification systems that are the result of acquired resources such as income, education, occupation. Status in social psychology is conceived as individuals’ or groups’ relative societal evaluation, thus referring to degrees of honor, esteem, or resources. Status in this form is used as an independent variable, analyzing its effects upon other social processes. A second fault line, already discussed, occurred between structural and interactive approaches. Despite their controversy, both camps reduced the use of the term status; a situation particularly accented among interactionists who came to use role as both societal position and its cultural norms. The social psychology of status and role is now tantamount to the social psychology of roles. Additionally, recent studies of status and roles have relied upon and elaborated established theory rather than created theory anew. Baker and Faulkner’s (1991) conception of roles as employed by persons to acquire resources and combine into concrete positions is exemplary of the terminological and theoretical uses of contemporary role theory. Their study of Hollywood film producers, directors, and screen writers refers to these as roles and not as social positions or statuses. Their analysis then focuses upon combinational relations among the three roles before and after the rise of the movie blockbuster. It relates films’ financial successes to innovative role combinations. In the production of films the stereotypical organization is functional specificity with three different persons playing the three roles. 
However, Baker and Faulkner find that functional specificity of producer and functional dedifferentiation of director-screen writer was the most effective arrangement for
achieving financial successes during the blockbuster era. The authors present quotes from secondary sources to illustrate that role participants were aware of the advantages of this organizational arrangement. Although their study is primarily focused at the structural level the authors assume that organization is the result of persons’ volitional activities, and they attribute the structural reorganization to movie industry leaders’ personal agency. Neither in theory nor in practice is there a division between structural and interactive determination of roles and organization for the authors. This position permits the authors to conclude that role innovations may be used as organizational resources. Each of the first three editions of the Handbook of Social Psychology (1954, 1968, 1985) devoted a chapter to role theory. The fourth edition of the Handbook, published in 1998, contains no chapter exclusive to roles. Instead role is introduced throughout the fourth edition in relation to its functions and displays in groups, gender, race, stereotyping, distributive justice and so forth. Although the language is still not unified, role and status are now part of the social science lexicon and are employed conceptually when empirical issues appropriately call for their use. See also: Ethnomethodology: General; Goffman, Erving (1921–82); Group Processes, Social Psychology of; Groups, Sociology of; Interactionism: Symbolic; Mead, George Herbert (1863–1931); Personality and Conceptions of the Self; Self: History of the Concept; Social Psychology; Social Psychology: Sociological; Social Psychology, Theories of; Status and Role: Structural Aspects
Bibliography Baker W E, Faulkner R R 1991 Role as resource in the Hollywood film industry. American Journal of Sociology 97: 279–309 Coser R L 1966 Role distance, sociological ambivalence, and transitional status systems. American Journal of Sociology 72: 173–87 Goffman E 1959 The Presentation of Self in Everyday Life. Doubleday Anchor Books, New York Goffman E 1961 Encounters: Two Studies in the Sociology of Interaction. Bobbs-Merrill, Indianapolis, IN Goffman E 1963 Behavior in Public Places: Notes on the Social Organization of Gatherings. Free Press, New York Handel W 1979 Normative expectations and the emergence of meaning as solutions to problems: Convergence of structural and interactionist views. American Journal of Sociology 84: 855–81 Heiss J 1981 Social roles. In: Rosenberg M, Turner R H (eds.) Social Psychology: Sociological Perspectives. Basic Books, New York, pp. 94–129 Linton R 1936 The Study of Man: An Introduction. Appleton-Century-Crofts, New York Mead G H 1934 Mind, Self and Society: From the Standpoint of a Social Behaviorist. University of Chicago Press, Chicago
Merton R K 1975 Structural analysis in sociology. In: Blau P (ed.) Approaches to the Study of Social Structure. Free Press, New York, pp. 21–52 Sarbin T R 1954 Role theory. In: Lindzey G (ed.) The Handbook of Social Psychology, Vol. 1: Theory and Method. Addison-Wesley, Cambridge, MA, pp. 223–58 Sarbin T R 1968 Role: Psychological aspects. In: Sills D L (ed.) International Encyclopedia of the Social Sciences. Macmillan and Free Press, New York, Vol. 13, pp. 546–52 Sarbin T R, Allen V L 1968 Role theory. In: Lindzey G, Aronson E (eds.) The Handbook of Social Psychology, 2nd edn. Addison-Wesley, Reading, MA, Vol. 1, pp. 488–567 Stryker S 1983 Social psychology from the standpoint of a structural symbolic interactionism: Toward an interdisciplinary social psychology. In: Berkowitz L (ed.) Advances in Experimental Social Psychology. Academic Press, New York, Vol. 16, pp. 181–218 Stryker S, Statham A 1985 Symbolic interaction and role theory. In: Lindzey G, Aronson E (eds.) The Handbook of Social Psychology, 3rd edn. Random House, New York, Vol. 1, pp. 311–78 Turner R H 1947 The Navy disbursing officer as a bureaucrat. American Sociological Review 12: 342–48 Turner R H 1962 Role-taking: Process vs. conformity. In: Rose A (ed.) Human Behavior and Social Process. Houghton-Mifflin, Boston, pp. 20–40 Turner R H 1968 Role: Sociological aspects. In: Sills D L (ed.) International Encyclopedia of the Social Sciences. Macmillan and Free Press, New York, Vol. 13, pp. 552–7 Turner R H 1985 Unanswered questions in the convergence between structuralist and interactionist role theories. In: Halle H J, Eisenstadt S N (eds.) Micro-sociological Theory: Perspectives on Sociological Theory. Sage, Beverly Hills, CA, Vol. 2, pp. 22–36 Turner R H 1990 Role change. In: Scott W R, Blake J (eds.) Annual Review of Sociology. Annual Reviews, Palo Alto, CA, Vol. 16, pp. 87–110 Turner R H, Colomy P 1988 Role differentiation: Orienting principles.
In: Lawler E J, Markovsky B (eds.) Advances in Group Processes: A Research Annual. JAI Press, Greenwich, CT, Vol. 5, pp. 1–27
G. M. Platt
Status and Role: Structural Aspects The closely connected concepts of ‘status’ and ‘role’ have a long and complex history in a number of different traditions of social thought, but their contemporary meanings developed during the first half of the twentieth century. Despite it having been Shakespeare who claimed that ‘all the world’s a stage,’ it was the social psychologists of the early twentieth century who systematically employed the concept of role and so paved the way for later ‘dramaturgical’ approaches to social life. This work also prepared the way for those who would develop the idea of status as the structural position corresponding to a role. The term status, however, has been used in two quite
different ways, only one of which is directly relevant to this discussion. One usage has referred to hierarchical differences of prestige and social honor in systems of social stratification (see Social Stratification); the other has referred, more generally, to any structural position or location in a social system. It is the second sense that will be discussed in this article.
1. Status as Structural Position The hierarchical concept of status was popularized by Max Weber, who contrasted stratification by status with stratification by class (Scott 1996). In the 1930s, however, cultural anthropologists introduced the idea of status as a position or location in a social structure (Linton 1936). This usage was rapidly adopted by structural functionalist sociologists. While this initially led to some confusion about whether or not particular statuses were hierarchically ranked, the new usage soon became established (Parsons 1942b). Some resolved this confusion by using ‘position’ or ‘location’ in preference to status (Warner 1952, p. 46), and this practice will be followed in this article. The idea of status as structural position originated in the idea that social structure consisted of culturally defined institutions. This had its origins in Durkheim, but it was the development of a clear understanding of culture by American anthropologists (Kroeber 1917, Kluckhohn 1954) that gave the idea its contemporary form. It became a central concept in sociological analysis (Parsons 1951). It was held that the culture of a society establishes a set of social institutions. These are standardized normative patterns that people are expected to follow in their social actions. They regulate individual and group behavior by defining the nature of their social relations. Institutions define ‘proper, legitimate or expected modes of action or of social relationship’ (Parsons 1940, p. 53), and so guide and channel behavior. Parsons defined social structure as ‘a patterned system of social relationships of actors’ (Parsons 1945, p. 230). The institutionalized expectations that people share define their social positions relative to other positions, and to the performances of the individuals who occupy these (Parsons 1942b, p. 143). This knowledge of social positions provides people with the conceptual maps that they use to organize their social actions. 
Like all ideas, they are contained in individual minds. However, because they are shared, there is a reciprocity of expectations that solidifies them into distinctively collective representations (Davis 1948, p. 87). Social positions are Durkheimian social facts with the power to constrain the behavior of individuals. We do not simply ‘know’ the social positions that make up our society, we experience social pressure to conform to the expectations they involve. These pressures are felt through role expectations. Roles define the specific rights and obligations that are entailed in a social position (Dahrendorf 1958, p. 36). They tell us what to do and what to expect others to do (Parsons 1940, p. 54). They provide us with a definition of the situation that sets the limits within which we may legitimately act (Williams 1960). Some writers used the concept of role to describe the actual behavior associated with a position, but most have recognized that a normative focus requires a concern for expected behavior (Levy 1952, pp. 158–60). The relationship between expected behavior and actual behavior is very complex, and there is no one-to-one correspondence between norms and behavior. Roles are definitions of what people are expected to do in different situations. They define the goals, the means, and the attitudes that are culturally recognized as legitimate for those who occupy particular social positions.
2. Role Theory These ideas were taken up during the 1950s as the basis of a ‘role theory’ that tried to set out a comprehensive paradigm for role analysis. The most influential formulation of this was that of Gross et al. (1958), who used it to explore the behavior of school superintendents. They did this using Merton’s (1957) argument that each social position is associated with an array of role-specific forms of behavior that together comprise a ‘role set.’ A medical student, for example, must act as a student not only in relation to other students, but also in relation to teachers, doctors, nurses, social workers, patients, medical technicians, and so on. In each of these relations, the student is likely to encounter different expectations about her or his behavior (Merton 1957, p. 112). Gross et al. showed that the school superintendent had to negotiate the conflicting expectations held by teachers, parents, children, governors, and ancillary staff. They recognized that this produces varying degrees of strain or conflict in role expectations. The concept of ‘role conflict’ has been central to many applications of role theory. While this work concentrated on the structural aspects of position and role, it was complemented by those who worked on a social psychology of roles (Newcomb 1950). These writers explored the learning processes through which roles are acquired and the social pressures to conformity that exist in face-to-face situations. Socialization and conformity were central principles for role theorists, and their views have been subject to persistent criticism.
3. Socialization and Conformity Role theorists see the norms that define social positions as learned through a process of socialization. The
expectations of others are internalized in the personality as it is formed. They become objects of moral commitment, and individuals are likely to be strongly attached to them. They tend to regard conformity as a duty (Parsons 1940). This is associated with the argument that there is consensus over social norms. Members of a society share their moral commitments and so violations are met with guilt or shame in the violator and indignation or condemnation in those whom the deviance affects. Deviance from these shared norms is likely to result in ‘corrective forces’ or sanctions that may vary from mild disapproval to overt coercion. For many, this led to the conclusion that deviance was both infrequent and automatically corrected. Societies were seen as self-regulating systems in which actors had little choice but to conform. The degree of consensus in a society rarely reaches this level, however, and individuals are never perfectly socialized (Wrong 1961). These points were especially forcefully raised against Parsons, but even he recognized that the ‘completely institutionalized’ social relationship was a rarity. Although broad descriptive contrasts could be drawn between societies in terms of their common values, ‘when dynamic problems of directions and processes of change are at issue, it is essential to give specific attention to the elements of malintegration, tension and strain in the social structure’ (Parsons 1942a, p. 117). Nevertheless, the mainstream of role theory overemphasized both consensus and internalization. It was conflict theorists such as Dahrendorf (1958) who showed that role expectations have to be seen in relation to the distribution of power and that this can give them a solidity that does not depend on a consensus of opinion. What passes for consensus may be the ideas of a dominant or powerful group (Rex 1961, Shils 1961). A minority may be able to use its resources to institutionalize its own norms for the population as a whole. 
This may be especially effective where the majority of the population forms a relatively unorganized mass without strong counterinstitutions of its own.
4. Role-taking and Role-making Structural functional and conflict theories of position and role, supported by social psychological theories of socialization and conformity, lay at the heart of mainstream sociology from the 1940s to the 1960s. During the 1960s, however, radical and forceful objections were raised against this (see Action Theory: Psychological). The immediate origin of the sociological concept of role had been Mead’s (1927) account of taking the role of the other, and critics of the mainstream turned to Mead for support. They questioned cultural determinism and its oversocialized image of the actor that saw people as the mere puppets of structural forces. Blumer (1962, p. 189) held that a
structure of positions is simply a ‘framework inside of which action takes place.’ It is a conceptual guide for action, not an external, substantial entity or an independent coercive power. Turner (1962) took this argument to the heart of role theory. He held that the emphasis on conformity gave only a partial view of role behavior. It stressed imitation and ignored innovation, failing to recognize that role-takers were also ‘role-makers.’ Individuals do not simply take over roles as templates for conformist behavior. Turner held that role-taking always required improvisory behavior. The role knowledge that individuals learn during their primary and secondary socialization does not give them precise, programmed instructions for behavior in all the many unique and unpredictable circumstances in which they are likely to find themselves. Individuals tentatively interpret and reinterpret each other’s actions in the situations that they encounter, recreating their roles from the raw materials provided to them during socialization (Turner 1962, p. 23). Shared knowledge of social positions sets the general conditions for action, but it does not fully determine it (Blumer 1962, p. 190). The normative functionalist approach to role-taking has to be broadened into a more comprehensive account. The fullest development of this view was that of Erving Goffman (1959, 1963), who explored the ways in which individuals actively and creatively construct the images that they present in their role behavior. Goffman systematically employed the theatrical metaphor of role-playing, holding that individuals are never merely actors following a script, but are authors as well. Using such ideas as props, scenery, front stage, and back stage, Goffman showed how public performances depend on the more private situations to which individuals can withdraw. He introduced a number of novel concepts, most notably ‘role distance.’
5. A Sense of Social Structure The criticisms made by the symbolic interactionists added a whole new dimension to the analysis of position and role. While some of the critics of mainstream views saw this as a complete alternative to the orthodoxy, others saw it as complementing the structural account. These differences persist, and have been compounded by a newer and more radical line of argument. Influenced by phenomenology and ethnomethodology, Aaron Cicourel has produced the most systematic statement of this critique. Cicourel asked the fundamental question: how is role-taking possible? His answer was that the taking and making of roles rests on a set of cognitive processes through which actors give meaning to the world and so sustain a ‘sense of social structure’ (Cicourel 1972,
p. 11). People do carry role information in their heads, but they must also be able to recognize when one particular position or role is relevant, and they must be able to infer what expectations others have of their behavior. They cannot make sense of their social world simply by drawing on the role and positional knowledge that they have learned during their socialization. Before they can apply norms in particular situations, they must arrive at some kind of understanding of what kind of situation it is. This means that ‘members of a society must acquire the competence to assign meaning to their environment so that surface rules and their articulation with particular cases can be made’ (Cicourel 1968, p. 52). This ability to infer and to impute meaning to situations is a practical skill that is an essential condition for any social life at all. Cicourel sees this skill as an interactional competence, making explicit parallels with Chomsky’s concept of linguistic competence. Though he does not adopt Chomsky’s own rationalist theory of the mind, Cicourel does take over his stress on the generative capacities that are provided by human competences. It is their practical, meaning-making skills that allow people to use their knowledge of social norms to generate appropriate role behavior. The structural aspects of positions and roles, therefore, are seen by Cicourel as resting on the possession of a complex set of cognitive procedures (also termed inductive, interpretive, or inference procedures) that operate in the same way as the deep structure grammatical rules of a language. Cicourel illustrated these cognitive procedures by drawing on Alfred Schutz’ discussion of the assumptions that people must make for social interaction to be possible. 
Schutz held that individuals must assume a reciprocity of perspectives between themselves and their potential partners, they must fill in the gaps in their knowledge through the et cetera principle, and they must assume that things occur as ‘normal form.’ These and similar cognitive procedures constitute the mental module that makes it possible for actors to generate appropriate but innovative responses in changing circumstances, despite the fact that they have only fragmentary and uncertain evidence available to them. They allow people to assign meaning and relevance to the objects in their environment and to construct definitions of the situation that allow them to infer which of the norms stored in their memories are relevant. People build a sense of social structure that allows them to orient themselves appropriately in the various situations that they encounter. Once the meaning of a situation has been decided, norms can be invoked on the assumption that there is a consensus among those with whom they interact and that these are, indeed, the appropriate norms. Normative order and role behavior, therefore, are negotiated and constructed on the basis of the underlying sense of social structure that interactional competence makes possible.
6. Conclusion Status—or position—and role remain central concepts in the normative account of social structure. This normative account does not tell us everything that we need to know about social structure (Lockwood 1956, 1964), but it is an essential aspect of any structural account. The use of these normative concepts, however, must incorporate an awareness of the creative and innovative role-making activities that are integral to all role behavior. It is important to recognize that improvisation and negotiation are essential features of the construction of social action. It is also important to recognize, however, that these processes are possible only because individuals possess certain cognitive skills that allow them to infer the meaning of social situations and to build a sense of the social structures in which they must act. Structural positions and roles are human accomplishments, but they have real consequences for the people who occupy them. See also: Action Theory: Psychological; Dramaturgical Analysis: Sociological; Functionalism in Sociology; Institutions; Norms; Phenomenology in Sociology; Social Stratification; Status and Role, Social Psychology of
Bibliography Blumer H 1962 Society as symbolic interaction. In: Rose A (ed.) Human Behavior and Social Processes. Routledge and Kegan Paul, London Cicourel A V 1968 The acquisition of social structure: Towards a developmental sociology of language and meaning. In: Cicourel A V (ed.) 1973 Cognitive Sociology. Penguin, Harmondsworth, UK Cicourel A V 1972 Interpretive procedures and normative rules in the negotiation of status and role. In: Cicourel A V (ed.) 1973 Cognitive Sociology. Penguin, Harmondsworth, UK Dahrendorf R 1958 Homo Sociologicus: On the history, significance, and limits of the category of social role. In: Dahrendorf R (ed.) 1968 Essays in the Theory of Society. Routledge and Kegan Paul, London Davis K 1948 Human Society. Macmillan, New York Goffman E 1959 The Presentation of Self in Everyday Life. Penguin, Harmondsworth, UK Goffman E 1963 Relations in Public Places. Free Press, New York Gross N, Mason W S, McEachern A W 1958 Explorations in Role Analysis: Studies of the School Superintendency Role. John Wiley and Sons, New York Kluckhohn C 1954 Culture and behavior. In: Lindzey G (ed.) Handbook of Social Psychology. Addison-Wesley, Cambridge, MA, Vol. 2 Kroeber A 1917 The superorganic. American Anthropologist 19: 162–213 Levy M J 1952 The Structure of Society. Princeton University Press, Princeton, NJ Linton R 1936 The Study of Man. D. Appleton Century, New York Lockwood D 1956 Some remarks on the social system. British Journal of Sociology 7: 134–46
Lockwood D 1964 Social integration and system integration. In: Zollschan G, Hirsch W (eds.) 1976 System, Change and Conflict. Houghton-Mifflin, New York Mead G H 1927/1934 Mind, Self and Society From the Standpoint of Social Behaviorism. University of Chicago Press, Chicago Merton R K 1957 The role set: Problems in sociological theory. British Journal of Sociology 8: 106–20 Newcomb T M 1950 Social Psychology. Dryden Press, New York Parsons T 1940 The motivation of economic activity. In: Parsons T (ed.) 1954 Essays in Sociological Theory, 2nd edn. Free Press, New York Parsons T 1942a Democracy and social structure in pre-Nazi Germany. In: Parsons T (ed.) 1954 Essays in Sociological Theory, 2nd edn. Free Press, New York Parsons T 1942b Propaganda and social control. In: Parsons T (ed.) 1954 Essays in Sociological Theory, 2nd edn. Free Press, New York Parsons T 1945 The present position and prospects of systematic theory in sociology. In: Parsons T (ed.) 1954 Essays in Sociological Theory, 2nd edn. Free Press, New York Parsons T 1951 The Social System. Free Press, New York Rex J A 1961 Key Problems of Sociological Theory. Routledge and Kegan Paul, London Scott J 1996 Stratification and Power: Structures of Class, Status and Command. Polity Press, Cambridge, UK Shils E A 1961 Centre and periphery. In: Shils E A (ed.) 1975 Centre and Periphery: Essays in Macrosociology. University of Chicago Press, Chicago Turner R 1962 Role-taking: Process versus conformity. In: Rose A (ed.) Human Behavior and Social Processes. Routledge and Kegan Paul, London Warner W L 1952 The Structure of American Life. University of Chicago Press, Chicago Williams R 1960 American Society: A Sociological Investigation, 2nd edn. Alfred A Knopf, New York Wrong D 1961 The oversocialized concept of man in modern sociology. American Sociological Review 26: 183–93
J. Scott
Statutes of Limitations 1. Definition Statutes of limitations are procedural rules that limit legal actions on the basis of time. They derive from both legislative and judicial sources and exist in many continental and non-continental legal systems. Specifically, they regulate the amount of time that a potential plaintiff has to initiate a formal legal claim. In general, once a statute of limitation expires, a plaintiff cannot press a particular legal claim, regardless of its underlying substantive merits. It is important to distinguish statutes of limitations from the closely related yet distinct common law doctrine of ‘laches.’ The doctrine of laches vests courts with discretionary authority to determine whether a plaintiff, while commencing a legal action within the
time requirements set forth by the relevant statute of limitations, nonetheless has unreasonably delayed bringing a lawsuit to the detriment of a defendant (Heriot 1992). Both statutes of limitations and the doctrine of laches limit a plaintiff’s access to court.
2. Purposes Statutes of limitations seek to reconcile numerous individual and social interests. These interests, broadly construed, include a desire for an orderly legal system that resolves claims promptly, efficiently, and equitably and also promotes legal stability. Within the context of litigation, these general interests become more individualized, acute, and particularized. Plaintiffs benefit from having a reasonable amount of time within which to press a legal claim. Some amount of time is necessary so that a potential plaintiff can contemplate possible litigation and gather and assess evidence that will inform a decision about whether to pursue formal legal remedies. Statutes of limitations induce plaintiffs to pursue legal claims diligently and promptly by punishing those who do not by foreclosing the opportunity to pursue a legal claim. Also, defendants have a legitimate interest in repose as well as not having to defend against ‘stale’ claims. Moreover, defendants should not be disadvantaged in court due to lost evidence, faded memories, and the disappearance of witnesses—all possibilities that increase with the passage of time. Numerous social interests cut in different directions. It is, of course, well known that societies benefit when their judicial systems vindicate their members’ legal rights. Other social interests include a desire for a secure and stable legal system. Moreover, a desire to minimize court congestion is shared by many. Finally, the probability of legal error increases when plaintiffs wait too long before commencing legal action and degrade components critical to a fair, just, and thorough trial.
3. History and Development Statutes of limitations have existed for centuries; scholars trace the origins of statutes of limitations to the early days of Roman Law (Lowenthal 1950). By 1236 in England, statutes of limitations governing the transfer of property were enacted (Pollock and Maitland 1898). During the reign of Henry VIII, Parliament passed legislation establishing some time limits that affected real property transfers (Heriot 1992). The Limitation Act of 1623 marks the advent of the first systematic and reasonably effective rules governing time limitations for commencing legal actions. Notably, the Act expressly tolled limits for, among others, infants as well as the aged, imprisoned,
and infirm. In addition, the Act codified the common law doctrine of adverse possession which generally provides an individual with legal title to property that the individual has occupied openly, notoriously, and continuously for a prescribed period of time. The Act also marks the beginning of the modern law of limitations governing personal actions in common law (Lowenthal 1950) and served as the model adopted by many American legislatures (Heriot 1992). Statutes of limitations are found in many common law and non-common law settings. In common law settings, such as the United States, a recent survey of California law uncovered more than 32,000 statutes of limitations (or time-limiting bars) in that state alone (Montoya 1996). These enactments provide insights into a legal system’s moral and ethical orientation, and in particular, its preference for human life issues over legal claims relating to other issues. In California, for example, statutes of limitations for slander or libel claims typically prevent any such claim being initiated more than one year after the action. On the other hand, there is no statutory limit whatsoever for murder. Statutes of limitations are found in non-common law settings as well. For example, general 30-year statutes of limitation for civil actions operate in France and Germany (Martindale-Hubbell 2000). Of course, deviations from the general rule peculiar to each country also exist. In Germany, a general four-year statute of limitation governs certain business claims, and a three-year limit governs tort actions (Martindale-Hubbell 2000). In France, a ten-year limit governs most tort actions; a two-year limitation bounds claims on most insurance policies (Martindale-Hubbell 2000).
4. Rationale
Because statutes of limitations attempt to reconcile conflicting principles, the rationales that scholars advance to justify them vary. Both efficiency and equity (or ‘fairness’) emerge in the literature as common rationales for statutes of limitations (see, for example, Lowenthal 1950). Epstein (1986) advances a utilitarian justification: in a world where everyone has an equal probability of being a plaintiff or defendant, the application of statutes of limitations, and the resulting reduction in legal error and in administrative and transaction costs, brings about a net gain for the benefit of all.
5. Criticisms
Most critics of statutes of limitations focus on three points. One critique dwells on the particular time limits imposed on particular legal claims (see, for example, Bibas 1994, Thomas 1991). A second critique focuses on the operation of statutes of limitations, especially on when time limits begin to run against a possible plaintiff (see, for example, Glimcher 1982). A third, broader critique flows from the courts’ application of time limits in a ‘rule-like’ manner. Some critics of the rule-like application of statutes of limitations propose, as one alternative, administration by judges on a case-by-case basis (see, for example, Richardson 1997).

However, even most critics recognize that a rule-like application of statutes of limitations yields certain advantages. Such an application is comparatively easy to administer, since decisions depend on a limited number of ascertainable facts and results can often be predicted ex ante. Moreover, well-crafted statutes convey in advance clear requirements for those contemplating pressing legal claims. Finally, a rule-like application of statutes of limitations reduces the potential for judicial abuse of discretionary powers.

Problems with statutes of limitations arise when the application of rules generates difficult or inequitable results. Critics point to such results as evidence of the need for a more flexible application of statutes of limitations and for a close examination, case-by-case, of the way in which a particular application advances or impedes the underlying rationale for the time limits. A case-by-case or individualized (i.e., non-rule-based) approach to applying statutes of limitations possesses certain advantages. Principally, this discretionary approach permits a closer examination of whether a specific claim falls within the range of difficulties that statutes of limitations were designed to address. Judicial discretion can reduce the possibility of injustice generated by a mechanical application of a statute of limitation.
Ironically, greater judicial discretion in the application of such rules as statutes of limitations entails costs as well as benefits, because it generates greater uncertainty and thus increases the danger of judicial abuse.

Over one hundred years ago, Justice Oliver Wendell Holmes (1897) asked, ‘What is the justification for depriving a man of his rights, a pure evil as far as it goes, in consequence of the lapse of time?’ Justice Holmes’ rhetorical question highlights a general ambivalence surrounding statutes of limitations and their applications (Ochoa and Winstrich 1997). Benefits flowing from the application of statutes of limitations are frequently long-term and sometimes difficult to discern. In contrast, the costs imposed by statutes of limitations are short-term and more immediate and visceral. By seeking to reconcile critical principles that sometimes collide in litigation, statutes of limitations generate ambivalence and unease, which weaken, but by no means dislodge, their secure and long-standing position in many, if not most, legal systems.

See also: Common Law; Legal Systems, Classification of; Rule of Law; Rules in the Legal Process
Bibliography
Bibas S A 1994 The case against statutes of limitations for stolen art. Yale Law Journal 103: 2437–69
Epstein R A 1986 Past and future: The temporal dimension in the law of property. Washington University Law Quarterly 64: 667–722
Glimcher S D 1982 Statutes of limitations and the discovery rule in latent injury claims: An exception of the law? University of Pittsburgh Law Review 43: 501–23
Heriot G L 1992 A study in the choice of forum: Statutes of limitation and the doctrine of Laches. Brigham Young University Law Review 1992: 917–70
Holmes O W 1897 The path of the law. Harvard Law Review 10: 457–78
Lowenthal M A 1950 Developments in the Law: Statutes of Limitations. Harvard Law Review 63: 1177–270
Martindale-Hubbell International Law Digest 2000
Montoya J M 1996 California statutes of limitations. Southwestern University School of Law 25: 745–1373
Ochoa T T, Winstrich A J 1997 The puzzling purposes of statutes of limitations. Pacific Law Journal 28: 453–514
Pollock F, Maitland F W 1898 The History of English Law Before the Time of Edward I, 2nd edn. Cambridge University Press, Cambridge, UK (Ledlie’s translation)
Richardson E J 1997 Eliminating the limitations of limitations law. Arizona State Law Journal 29: 1015–74
Thomas R L 1991 Adult survivors of childhood sexual abuse and statutes of limitations: A call for legislative action. Wake Forest Law Review 26: 1245–95
M. Heise
Stereotypes, Social Psychology of

1. Origin and History
Introduced to the social sciences by Lippmann in his book Public Opinion (1922), the concept of stereotype refers to beliefs, knowledge, and expectations of social groups. To capture the idea of a stereotype, Lippmann made famous the phrase ‘pictures in our heads’ to refer to an internal, mental representation of social groups in contrast to their external reality. At a time when mental constructs of any type were undeveloped and soon to be frowned upon in American psychological discourse, it is especially remarkable that Lippmann and other early theorists detected and described the psychological importance of stereotypes. Not only was such a possibility theorized, but the results of empirical tests of stereotypes were also made available by Katz and Braly (1933). Their checklist method of asking respondents to assign trait adjectives (e.g., intelligent, dishonest) to ethnic and national groups (e.g., Jews, Germans) dominated the field and, in modified form, remains in use even today. With the exception of a hiatus in research on stereotypes at midcentury, the concept of stereotype has occupied a dominant position in psychology since the 1920s.
Reviews of historical and contemporary research on stereotypes are liberally available (see Fiske 1998). The concept of stereotype, in many ways, remains largely in keeping with its original formulation, although some shifts in definition and emphasis are worth noting. In the earliest proposals, stereotypes were regarded as inaccurate assessments of groups; in fact, according to Lippmann, a stereotype was a ‘pseudoenvironment’ or ‘fiction’ and according to Katz and Braly (1933) it was an unjustified and contradictory reaction to an outgroup. In contemporary psychology, the accuracy with which a stereotype captures the essence of a social group is not of central theoretical interest. The belief that ‘Many X’s are Y’ may well be accurate (e.g., many neurosurgeons are men) but if such a belief is applied in judging an individual member of a group (e.g., Female X is not, or cannot be, a neurosurgeon), a stereotype is seen to be in play. This process, by which an individual is given or denied an attribute because of membership in a group, is regarded as being of psychological and social interest. Such a shift is the result of a change in emphasis from examining beliefs about social groups per se (stereotype content), to an interest in the mental mechanics by which they influence interpersonal and intergroup perception and interaction (stereotype process). It is obvious that both aspects of stereotypes, their content and process, are critical to an understanding of their nature and function in social interaction. In fact, attention to implicit or automatic stereotypes in recent years has rekindled an interest in stereotype content and strength and in their relationship to related constructs such as explicit stereotypes, implicit and explicit prejudice, and group identity.
From the earliest use of the term in psychology, stereotypes have been regarded as the cognitive (thought) as opposed to the affective (feeling) component of mental representations of social groups. As such, the construct is tied to but differentiated from the concept of attitude, preference, or liking as well as the concept of discrimination. ‘I do not like group X’ is a verbal statement of an attitude or prejudice, while an act denying friendship or intimacy with a member of group X may be regarded as a behavioral indicator of prejudice. To say ‘I believe group X to be incompetent’ is to express a stereotype, while an act of denying employment to a member of group X based on such a belief is seen as the behavioral indicator of that stereotype. Of course, such arbitrary distinctions between mental and behavioral measures disappear when measures of brain activation are introduced. Verbal self-reports, bodily gestures, and decisions about hiring are all viewed as behavioral, as opposed to brain, indicators of stereotypes and prejudice. It is likely that just as measures of activation in particular sub-cortical structures like the amygdala have been shown to be associated with behavioral measures of prejudice (Phelps et al. 2000), the future will yield similar studies of stereotypes and their neural correlates. Such research has the potential to unify social, cognitive, and neural levels of analysis by demonstrating that the learning of stereotypes that culture and social environments make possible can be synchronously detected in observable behavior as well as in brain activation.

The lure of a cognitive analysis in psychology led to a preponderance of attention to stereotypes, to the exclusion of its sister concept, prejudice. Fiske notes a 5:1 ratio in published work on the two concepts between 1974 and 1995. That trend appears to be shifting with attention to both stereotype and prejudice in recent years and especially to their relationships as observed at both conscious and unconscious levels. In addition, research has focused on the role of mood states in dictating stereotype expression with the counterintuitive proposal that positive mood increases rather than decreases reliance on stereotypes (Bodenhausen et al. 1994). The relationship between stereotype (cognition) and prejudice (attitude) has been a complex one, with early work linking the two concepts so closely as to regard them as synonymous. Today, the relationship between the two is an uneasy one, without a clear sense of the exact nature of the relationship between stereotypic thoughts and prejudicial feelings. It is assumed that stereotypes and prejudice need not be evaluatively compatible. Attitude (liking) can be dissociated from stereotypes (competence, respect, etc.).

When searching for large-scale theories of stereotypes or theories in which the role of stereotypes is central, one comes up relatively empty-handed. The focus of research appears not to have been on the development of broad theories, although two exceptions must be noted, in each of which stereotypes play a role in the larger framework: a psychodynamic, individual difference approach advancing the notion of an authoritarian personality to understand ethnocentrism (Adorno et al.
1950), and a social-cognitive approach based on social identity and self-categorization (Tajfel and Turner 1986) to understand individual–group relationships and their psychological and social consequences. The paucity of large-scale theories of stereotyping and prejudice is compensated by creative experiments that form the groundwork of what may in the future yield unified theories of stereotypes and related constructs of prejudice and discrimination. The last three decades are likely to be remembered for noteworthy experimental discoveries and diversity in the manner in which stereotypes have been measured, with special emphasis on the role of consciousness in thought and feeling about social groups.
2. Categorization and Beyond
To perceive is to differentiate, and social perception inherently involves the ability to see differentiation among groups, for example, to see women as differentiated from men, elderly as differentiated from young, European as differentiated from Asian. Individual humans each belong to multiple social groups, with gender, age, race/ethnicity, class, religion, and nationality being the most obvious such categories and numerous other differentiations also easily perceived along dimensions such as political orientation, physical attractiveness, etc. A fundamental feature of social perception is the classification of individuals into groups, and the process by which such classification is achieved is assumed to be automatic (Fiske 1998). Once categorized, attributes believed to be associated with the group are generalized to individuals who qualify for group membership. Thus, for example, women may come to be seen as nurturant and nice, men as strong and competent, whether they, as individuals, deserve those ascriptions or not.

Stereotypes are fundamental to the ability to perceive, remember, plan, and act. Functionally, they may be regarded as mental helpers that operate in the form of heuristics or short-cuts. Historically, this view of stereotypes as devices that allow a sensible reading of a complex world marked a breakthrough in research on stereotypes. Advanced by Allport (1954) and Tajfel (1969), the idea that stereotypes are inherent to the act of social categorization now forms the basis of the modern view. As such, stereotypes are regarded as ordinary in nature, in the sense that they are the byproducts of basic processes of perception and categorization, learning, and memory. This cognitive view of stereotypes has dominated the field since the early 1980s, and from this standpoint several noteworthy discoveries about the nature of stereotypes have emerged.
3. Discoveries
Early research in this cognitive tradition showed that within-group differences are minimized and between-group differences are exaggerated, with greater confusion, for example, in memory for within-race and within-gender comparisons than between-group comparisons. Stereotypes can be generated through a misperception about the frequency with which an attribute is associated with a group. For example, perceivers may come to see an illusory correlation based on the shared distinctiveness of particular attributes. Specifically, when a doubly distinctive event occurs (e.g., when a statistical minority performs a low frequency behavior), such an event is overemphasized in memory and produces a bias in perception such that its occurrence is overestimated. Hamilton and Gifford (1976) have shown how such a process can lead to the assessment that members of minority groups perform more evaluatively negative behaviors than is actually the case. Research in this tradition has also provided a good understanding of how stereotypes, in the form of expectancies of what social groups are capable of, can lead to a bias to confirm rather than disconfirm expectancies. Moreover, the targets who are stereotyped may themselves fulfill the expectancies the perceiver holds about them to complete the biased cycle of intergroup interaction.

Stereotypes are the vehicles of essentialist thinking about social groups. Dispositional group attributions, or the belief that groups are inherently the way they are, can lead to the assessment that attributes associated with groups are stable and unchanging. Stereotypes play their role from the earliest moments in the information processing sequence, giving preference to stereotype-consistent options. To date the evidence overwhelmingly suggests that stereotypes influence the manner in which information is sought, perceived, remembered, and judged. Stereotypes limit the amount of information that is required to make a judgment by giving meaning to partial and even degraded information, and they allow decisions to be reached even when time is short. Stereotypes facilitate initial identification of congruent information and disallow attention to incongruent information. Although research has focused on the role of stereotypes in thinking, they are viewed as serving a socially pragmatic function, as helping social interaction proceed smoothly and with ease (Fiske 1998).

From such research it may appear that the only judgments of individuals that ensue are those that rely entirely on knowledge of and beliefs about social groups. Yet, that is not the case. Depending on the context and the motivation of the perceiver, category-based stereotypic judgments may well yield to more individualized, person-based judgments (Brewer 1988, Fiske and Neuberg 1990). Indeed, experiments on the mechanisms of stereotype expression have regularly shown that stereotypes may or may not play a role depending on the extent to which such knowledge is seen as meaningful or justified (Banaji et al. 1993).
The routine simplification function of stereotypes can have far-reaching effects. In the work of Tajfel and co-workers, categorization of individuals into social groups was shown to produce a heightening of perceived differences, and these differences (even when the distinction between groups was arbitrary and minimal) created intergroup discrimination in the form of greater resource allocation to members of one’s own than another group. Thus, emanating from ordinary categorization, stereotypes can play a crucial role in interpersonal and intergroup relations.

Besides routine simplification, stereotypes are also regarded as serving the function of protecting a stable and psychologically justified view of the world and the place of humans, as members of social groups, in it. Jost and Banaji (1994) argued that holding negative stereotypes of another’s group may serve not only an ego-protective function (‘I am better than you’) and a group-protective function (‘My group is better than yours’) but a system-justifying function as well. The counterintuitive hypothesis from such a perspective is that when status hierarchies relegate groups to relative positions of inferiority and superiority, members of disadvantaged groups may themselves come to hold negative beliefs about their groups in the service of a larger system in which social groups are arranged.
3.1 Implicit Stereotypes
With its roots in the ideas of Allport and Tajfel, the notion that stereotypes may operate without conscious awareness, conscious intention, and conscious control is hardly surprising. In fact, throughout the twentieth century, experiments have shown that in one form or another stereotypes emerge spontaneously from initial categorization and continue to have a life of their own independent of conscious will. Yet, it would be fair to say that a direct interest in implicit or unconscious social cognition is relatively recent, with theoretical input from theories of unconscious mental life and methodological input from the development of new measurement tools and techniques.

Contrast the following two measures of stereotypes. A respondent is asked to indicate, using a traditional verbal self-report scale, the extent to which African Americans are scholarly and athletic. Or, a respondent is asked to rapidly pair words like ‘scholar’ and ‘athlete’ with faces of African Americans, and the time to do so is measured. The first measure assumes the ability to respond without self-presentational concerns, and more importantly, it assumes the ability to be able to adequately reflect on the content of one’s thoughts and provide an accurate indication of the complex association between race and psychological attributes. The second measure, although not within the traditional view of stereotype assessment, provides a measure of the strength of association between the group and the attributes. Such a measure has been taken to be an indicator of the stereotype and its strength.

To investigate the implicit or automatic manner in which stereotypes of social groups may express themselves, investigators have used a variety of techniques from measuring response latencies (i.e., the time to make a response), to examining errors in memory and biases in linguistic reports.
The largest single body of work has used response latencies as indicators of automatic stereotypes and prejudice and the data from such measures have yielded several new results and debates about them (see Banaji 2001). Stereotypes can be activated by the mere presentation of symbols of social group or group-related attributes. It appears that although conscious prejudice and stereotypes have changed, their less conscious, automatic expressions are strikingly strong. As measured by the Implicit Association Test (Greenwald et al. 1998) automatic stereotypes appear to exist in robust form; large effect sizes are the hallmark of automatic stereotypes (see Nosek et al. in press). A priming measure has also been widely utilized in which
prime-target pairs are presented in close succession and response latency to the target serves as the measure of automatic stereotypes. For example, responses are reliably faster to female first names (‘Jane’) when the immediately preceding word is stereotypically consistent (‘nurse’) than inconsistent (‘doctor’). Such effects are obtained with words and pictures and they generalize to a variety of social groups.

Given the socially significant consequences of stereotype use, the variability and malleability of automatic stereotypes have been investigated. Research has focused on the relationship between conscious and unconscious expressions of stereotypes and prejudice. As Devine (1989) showed, evidence of automatic race stereotypes is present irrespective of the degree of conscious prejudice toward Black Americans. Additionally, Banaji and Hardin (1996) showed that automatic gender stereotypes were manifested irrespective of endorsement of conscious attitudes and beliefs about gender egalitarianism. Such results point to the dissociation between conscious and unconscious social stereotypes, but it is clear that a simple dissociation may not adequately or accurately capture this relationship. Rather, results are now available that indicate that those with higher levels of conscious prejudice may also show higher levels of automatic or implicit prejudice. It appears that studies using multiple measures of each stereotype and statistical tools to uncover latent factors will yield evidence in favor of a relationship between conscious and unconscious stereotypes, while also revealing their unique and non-overlapping nature.

Questions concerning the controllability of automatic stereotypes are hotly debated (Fiske 1998). It appears that a desire to believe that stereotypes can be controlled, perhaps because of their pernicious social consequences, can result in the wishful assessment that they are indeed controllable.
Automatic stereotypes do not appear to be controllable by ordinary acts of conscious will. However, habitual patterns of thought, feeling, and behavior toward social groups that cohere with broader value systems and ideology appear to predict automatic responses. In addition, Greenwald et al. (in press) have shown that automatic identity with one’s group can predict stereotypes held about the group and attitudes toward it and have put forth a unified theory of self, group stereotypes, and attitudes. In support, they have found that attitudes toward mathematics and science can be predicted by the strength of the automatic stereotype that math is male or masculine. Women who hold a stronger math = male stereotype also show more negative attitudes toward mathematics.

The effects of minor interventions to activate stereotype-incongruent associations (e.g., female–strong) can be detected in weaker automatic stereotypes (Blair et al. in press). Such findings point to the flexibility of the representations of social stereotypes. Although the category ‘strong women’ may be counter-stereotypic, interventions that highlight this association can produce a lowering of the default stereotype of female = weak. The possibility of such strategies for inducing a shift in automatic stereotypes and the potential to track stereotypes through both behavioral and brain activation measures may, in the future, inform our understanding of stereotype representation, process, content, and mechanisms for social change.

See also: Mental Representation of Persons, Psychology of; Prejudice in Society; Schemas, Frames, and Scripts in Cognitive Psychology; Schemas, Social Psychology of; Small-group Interaction and Gender; Stigma, Social Psychology of
Bibliography
Adorno T W, Frenkel-Brunswick E, Levinson D J, Sanford R N 1950 The Authoritarian Personality. Harper, New York
Allport G 1954 The Nature of Prejudice. Doubleday, New York
Banaji M R 2001 Implicit attitudes can be measured. In: Roediger H L III, Nairne J S, Neath I, Surprenant A (eds.) The Nature of Remembering: Essays in Honor of Robert G. Crowder. American Psychological Association, Washington, DC
Banaji M R, Hardin C D 1996 Automatic stereotyping. Psychological Science 7: 136–41
Banaji M R, Hardin C, Rothman A J 1993 Implicit stereotyping in person judgment. Journal of Personality and Social Psychology 65: 272–81
Blair I V, Ma J E, Lenton A P in press Imagining stereotypes away: The moderation of implicit stereotypes through mental imagery. Journal of Personality and Social Psychology
Bodenhausen G V, Kramer G P, Susser K 1994 Happiness and stereotypic thinking in social judgment. Journal of Personality and Social Psychology 66: 621–32
Brewer M 1988 A dual-process model of impression formation. In: Srull T K, Wyer R S Jr (eds.) Advances in Social Cognition. Erlbaum, Hillsdale, NJ, pp. 1–36
Devine P G 1989 Stereotypes and prejudice: Their automatic and controlled components. Journal of Personality and Social Psychology 56: 5–18
Fiske S T 1998 Stereotyping, prejudice, and discrimination. In: Gilbert D T, Fiske S T, Lindzey G (eds.) The Handbook of Social Psychology, 4th edn. McGraw Hill, New York, pp. 357–411
Fiske S T, Neuberg S L 1990 A continuum of impression formation, from category-based to individuating processes: Influences of information and motivation on attention and interpretation. In: Zanna M P (ed.) Advances in Experimental Social Psychology. Academic Press, New York, pp. 1–74
Greenwald A G, Banaji M R, Rudman L, Farnham S, Nosek B A, Mellott D in press A unified theory of social cognition. Psychological Review
Greenwald A G, Mellott D, Schwartz J 1998 Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology 74: 1464–80
Hamilton D L, Gifford R K 1976 Illusory correlation in interpersonal perception: A cognitive basis of stereotypic judgments. Journal of Experimental Social Psychology 12: 392–407
Jost J T, Banaji M R 1994 The role of stereotyping in system-justification and the production of false consciousness. British Journal of Social Psychology 33: 1–27
Katz D, Braly K 1933 Racial stereotypes in one hundred college students. Journal of Abnormal and Social Psychology 28: 280–90
Lippmann W 1922 Public Opinion. Harcourt, Brace, Jovanovich, New York
Nosek B, Banaji M R, Greenwald A G in press Harvesting intergroup attitudes and stereotypes from a demonstration website. Group Dynamics
Phelps E A, O’Connor K J, Cunningham W A, Funayama S, Gatenby J C, Gore J C, Banaji M R 2000 Performance on indirect measures of race evaluation predicts amygdala activation. Journal of Cognitive Neuroscience 12: 729–38
Tajfel H 1969 Cognitive aspects of prejudice. Journal of Social Issues 25: 79–97
Tajfel H, Turner J C 1986 The social identity theory of intergroup behavior. In: Worchel S, Austin W A (eds.) Psychology of Intergroup Relations. Nelson-Hall, Chicago, pp. 7–24
M. R. Banaji
Stevens, Stanley Smith (1906–73)

S. Smith Stevens (1906–73) was a leading figure in the field of experimental psychology during the middle part of the twentieth century. He is best known for his work in psychophysics and especially for the development of the psychophysical power law, but he also made important contributions to the study of hearing, and to an understanding of measurement theory and its role in psychology. Aside from his many research reports, his most influential publications include the classic Hearing: its Psychology and Physiology (1938) co-authored with Hallowell Davis, the Handbook of Experimental Psychology (1951), and Psychophysics: Introduction to its Perceptual, Neural and Social Prospects (1975) edited by his wife, Geraldine Stevens.
1. Beginnings
To understand fully the significance of any scientist’s contribution, it is important to know the context in which it occurred. It is particularly useful to be acquainted both with the scientists who paved the way and with those who followed. Historians of experimental psychology have traced the beginnings of the field to the German scientist Wilhelm Wundt (1832–1920), who established the first laboratory for psychological research in 1879. Although other scientists had made important contributions to the emerging field, it was Wundt who first identified an entire range of questions in psychology amenable to laboratory investigation, and it was to his laboratory in Leipzig that students from many countries came for training in experimental psychology. Stevens, whose distinguished career culminated in his position as Professor of Psychophysics at Harvard University, stands in a direct line of scholarly descent from Wundt as a member of the fourth generation. He was trained by E. G. Boring (1886–1968) at Harvard, who in turn had studied at Cornell with E. B. Titchener (1867–1927), an Englishman who had received his degree in Leipzig from Wundt in 1892. And whereas Wundt may be considered the first to devote himself to Experimental Psychology, it may well be that Stevens was the last. His work marked and contributed to a transition from one era to another in the evolution of scientific psychology and an understanding of that transition makes an interesting story.

Stevens was born on November 4, 1906 in Ogden, Utah, home to the Mormon Church, and, following the customs of that institution, he departed at the age of 17 to spend 3 years as a missionary in Belgium and later in France despite a complete ignorance of the language. It seems likely that this experience contributed to the development of persistence and determination, traits that were to serve him well in his career as a scientist. On his return he entered the University of Utah and, after two years, transferred to Stanford, where he graduated in 1931 having taken such a wide variety of courses that the identification of a major was problematic. He was accepted by the Harvard Medical School but demurred on the grounds that the required payment of a $50 fee and the need to spend the summer taking a course in organic chemistry seemed equally unattractive. But he did decide to study at Harvard, and duly enrolled in the School of Education as the easiest (and most economical) path to Harvard’s resources. The transforming experiences there were E. G.
Boring’s course on perception and service as Boring’s unpaid research assistant. By the end of his second year, Stevens had earned a PhD in the Department of Philosophy (from which the Psychology Department emerged a year later). A year studying physiology under Hallowell Davis and another as a research fellow in physics led to appointment in 1936 as an instructor in the Psychology Department, where he remained until his death on January 18, 1973 in Vail, Colorado. By 1946 he achieved the rank of full professor, and in 1962 the University acceded to his request that he be named Professor of Psychophysics in honor of the field founded a century earlier by G. T. Fechner, whose theories, ironically, Stevens worked hard to refute.
2. Stevens’s Psychophysical Power Law
Scholars familiar with his work could quickly agree on a list of Stevens’s major achievements, but could and do disagree about their relative importance
and their lasting value. For some the single most important accomplishment was his discovery of the psychophysical power law that sometimes does and always should carry his name (Stevens 1957). He not only revived a view broached by some in the nineteenth century, but also created a body of evidence that firmly established what is sometimes called the Psychophysical Power Law and sometimes Stevens’s Law. It concerns the relation between the strength of some form of energy, such as the sound pressure level of a tone, and the magnitude of the corresponding sensory experience, loudness in this example. It is easily discovered that sensation strength is nonlinearly related to stimulus intensity (two candles do not make a room seem twice as bright as one), but it is harder to say what that relation is. For over a century before Stevens approached the problem, the prevailing view, Fechner’s Law, asserted that, as stimulus strength grew geometrically (by constant ratios), sensation strength increased arithmetically. This logarithmic relation had little supporting evidence but remained in many texts for lack of a good alternative. Stevens came to a different conclusion on the basis of data obtained by asking observers who were presented with stimuli varying in intensity to make numerical judgments of their subjective experience. In the case of loudness he found that judged loudness grew as a power function of sound pressure (with an exponent of about 0.6), not as the logarithmic relation predicted by Fechner (Stevens 1961). A similar relation was found to hold between judged brightness and luminance, and eventually, such a relation was found for dozens of other continua. In each case, as the amount or intensity of a stimulus grew by ratio steps, so too did the observer’s estimate of his subjective experience of it. Stevens’s Law seemed applicable to all intensive perceptual continua. 
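The contrast between the two laws can be sketched numerically. This is a minimal illustration, not Stevens's procedure: the exponent 0.6 is the loudness value quoted above, while the scale constant `k`, the `threshold` parameter, and the stimulus values are assumptions chosen only for the example.

```python
import math

def stevens(intensity, exponent=0.6, k=1.0):
    """Power law: judged magnitude = k * intensity ** exponent."""
    return k * intensity ** exponent

def fechner(intensity, k=1.0, threshold=1.0):
    """Logarithmic law: sensation = k * log(intensity / threshold)."""
    return k * math.log(intensity / threshold)

# Stimulus intensity grows by constant ratio steps (each 10 times the last).
stimuli = [10.0, 100.0, 1000.0, 10000.0]
pairs = list(zip(stimuli, stimuli[1:]))

# Under the power law, equal stimulus ratios give equal sensation ratios.
power_ratios = [stevens(b) / stevens(a) for a, b in pairs]

# Under the logarithmic law, equal stimulus ratios give equal sensation differences.
log_steps = [fechner(b) - fechner(a) for a, b in pairs]
```

Every entry of `power_ratios` equals 10 raised to the exponent, and every entry of `log_steps` equals the same constant, which is the difference between "ratio steps" and "arithmetic steps" described in the text.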
Equally interesting was the discovery that for each perceptual continuum, there was a distinctive value of the exponent relating numerical judgment to stimulus intensity. In hearing, for example, a huge range of sound pressures was compressed into a much smaller range of loudness, whereas for electric shock applied to the fingertips, a quite small range of currents between detectability and pain was expanded into a larger range of perceived intensities. Although Stevens originally argued that the method of magnitude estimation provided a ‘direct measure’ of sensation (see Sensation and Perception: Direct Scaling) and that the power law described how energy in the environment was transformed through the relevant sensory apparatus into sensation, critics soon showed this assumption to be untenable. What remained was an empirical principle relating judgment to stimulus intensity. Stevens went on to show that numerical magnitude estimation was a special case of a more general paradigm, cross-modal matching. If the observer is asked to match loudness, for example, by manipulating the luminance of a light source until its brightness matches the level of the target loudness, the power relation is again the result. And, neatly enough, the exponent of that function is exactly predictable by the exponents identified through magnitude estimation of brightness and loudness alone (Stevens 1969). It is this ability of observers to match perceived magnitudes across many different perceptual attributes, and the discovery that in all cases the result is described by a power function, that are the two pillars on which Stevens’s Law rests. Despite the many controversies over its meaning, this simple empirical principle stands securely on a mountain of evidence and must be accommodated by any theory of psychophysics.
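The prediction of the cross-modal exponent can be made explicit with a short derivation. It is a sketch under two assumptions: that each continuum obeys a power law, and that a match means the two judged magnitudes are equal. The symbols (ψ for judged magnitude, φ for stimulus intensity, β for the exponent, k for a scale constant) are notation introduced here, not taken from the article.

```latex
\begin{align*}
\psi_L &= k_L\,\varphi_L^{\beta_L} && \text{(loudness as a function of sound pressure)}\\
\psi_B &= k_B\,\varphi_B^{\beta_B} && \text{(brightness as a function of luminance)}\\
\psi_B = \psi_L \;&\Longrightarrow\;
\varphi_B = \left(\frac{k_L}{k_B}\right)^{1/\beta_B}\varphi_L^{\beta_L/\beta_B}
\end{align*}
```

Under these assumptions the matching function is itself a power function, and its exponent is the ratio of the two exponents obtained separately by magnitude estimation.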
3. Scales of Measurement
Others attach greater importance to Stevens’s ideas about scales of measurement. In a seminal paper (1946) he created a taxonomy of measurement, which identified four classes of scales—nominal, ordinal, interval, and ratio—distinguishing among them on the basis of the operations that leave the form of the scale invariant. Nominal scales assigned numbers that serve only to distinguish one event or class of events from another (the classic example is the set of numbers used to identify each member of a team); any one-to-one substitution is permissible because player identity is preserved. Ordinal scales of measurement permit a rank ordering of measured events (relative hardness of materials, for example) and will remain invariant under monotonic increasing transformations. Interval scales entail a constant unit of measurement and so permit the calculation of differences between any two values (as for the common scales of temperature) and remain invariant under linear transformations. Ratio scales (the bread and butter of physics) also feature constant units of measurement, but in addition allow the expression of two values as a ratio, since a true zero exists; weight is a good example. Such a scale is invariant under a simple multiplicative transformation. It is harder to assess the impact of this conceptualization of measurement than that of the work on cross-modal matching judgments. The latter was instrumental in creating new bodies of knowledge, especially in the field of sensory psychology, and continues to play an influential role in psychophysical theory and research. But Stevens’s four-step hierarchy of measurement scales encountered an assortment of reactions that leaves its present status in some doubt. Contemporary theorists find the nominal scale of little interest because it imposes no ordering on the measured entities. And several authorities have noted the omission of a log interval scale from what Stevens had proposed as an exhaustive listing. 
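The invariance idea behind three of the four scale types can be checked with a few lines of arithmetic. This is an illustrative sketch only: the hardness values are hypothetical, and the helper names `rank_order` and `diff_ratio` are introduced here for the example. The nominal scale is omitted because it preserves identity only.

```python
import math

def rank_order(values):
    """Return the indices of values sorted from smallest to largest."""
    return sorted(range(len(values)), key=lambda i: values[i])

def diff_ratio(v):
    """Ratio of two successive differences in a list of three values."""
    return (v[2] - v[1]) / (v[1] - v[0])

# Ordinal scale: a monotonic increasing transformation keeps the rank order.
hardness = [3.0, 1.0, 2.0]  # hypothetical rankings
transformed = [v ** 3 for v in hardness]
ordinal_ok = rank_order(hardness) == rank_order(transformed)

# Interval scale: a linear transformation (Celsius to Fahrenheit) keeps
# ratios of differences, but not ratios of the values themselves.
celsius = [10.0, 20.0, 40.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
interval_ok = math.isclose(diff_ratio(celsius), diff_ratio(fahrenheit))
value_ratio_kept = math.isclose(celsius[1] / celsius[0],
                                fahrenheit[1] / fahrenheit[0])

# Ratio scale: a multiplicative transformation (kilograms to grams)
# keeps ratios of values, because the zero point does not move.
kilograms = [2.0, 4.0]
grams = [kg * 1000 for kg in kilograms]
ratio_ok = math.isclose(kilograms[1] / kilograms[0], grams[1] / grams[0])
```

Here `value_ratio_kept` comes out false: 20 °C is not "twice as hot" as 10 °C in any sense that survives a change of unit, which is why ratios are meaningful only on a scale with a true zero.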
The relevance of this classification to the selection of statistical
procedures was hotly debated (e.g., see Binder 1984). Stevens had argued that the type of measurement placed constraints on which statistics can be used, but statisticians were quick to cite seeming exceptions, and the usefulness of Stevens’s measurement taxonomy as a guide to the selection of statistical techniques became uncertain. But, for psychologists, his ideas about measurement were, and in some ways continue to be, of great importance. At a time when Stevens was trying to provide support for his newly-created scale of loudness, he felt it important to show that it met the same criteria of legitimacy as did the familiar scales of the more developed sciences. It was a time when the status of psychology as a science was a topic of lively debate, and Stevens was much involved in efforts to identify the critical features of science and the ways in which psychology shared them. During the 1940s and 1950s questions about the nature of science and psychology’s place in it were discussed at length, and Stevens’s analysis of measurement in general and the ways in which psychologists could use scales of measurement was very influential. The idea that a measurement scale—or, for that matter, any lawful relation—could be expressed in terms of the operations that left it invariant, played an important role in the thinking of his students, and no doubt for scientists in other disciplines as well. The work that Stevens began has now evolved into a mathematical specialty in its own right, yielding results that confirm some of what Stevens intuited, and disconfirming other aspects of his work. (It must be remembered that the ideas he was advancing lent themselves to mathematical formalization, though he himself did not provide it.) That his ideas about measurement have continued to be developed and refined by other scientists today may be the ultimate accolade for his pioneering work in this field (e.g., see Luce and Narens 1999).
4. Hearing
If Stevens had never established the psychophysical power law and had left the study of measurement untouched, he would still be known and remembered as a pioneer in the field of psychoacoustics, the scientific study of the perception of sound (e.g., see Scharf 1999). His collaboration with Hallowell Davis in the late 1930s resulted in Hearing, a book that for many years was the fundamental source on the subject and has been ranked by some scholars as second only to the work of Helmholtz in its originality and breadth. In addition to his measurements of loudness, he did important work on auditory localization, and on the relation of pitch to intensity. At the behest of the US Army Air Corps he established the Psycho-Acoustic Laboratory at Harvard in 1940 to study the effect of noise on pilot psychomotor efficiency. As its director
he was able to attract a group of students and colleagues of extraordinary ability, among them W. R. Garner, K. D. Kryter, J. C. R. Licklider, and G. A. Miller, all of whom made important contributions to psychoacoustics and beyond. It was an interesting aspect of his personality that, despite a rather gruff manner, he functioned as an intellectual magnet, attracting to his laboratory some of the most talented researchers in the world. Not the least of these was G. von Békésy, who spent almost 20 years in Stevens’s basement lab and won a Nobel Prize during that period for his early work (in Budapest and Stockholm) on the ear. Throughout his career there were several periods of intense activity as a laboratory scientist. But in the period before World War II he was also active in the discussion of philosophical questions about the nature of psychology. He was impressed by the work of the physicist P. W. Bridgman on the central role of operationism in science, and argued for its use in defining the concepts of psychology. And for a period he was associated closely with a group of logical and methodological positivists whose ideas about the nature of science seemed especially hospitable to a behaviorist psychology (Stevens 1939). Perhaps one manifestation of this outlook was a reluctance to extend his thinking very far beyond his data. In a period when many experimental psychologists were developing ever more complex theoretical systems, Stevens’s own forays into the realm of theory were few and limited in scope. He seemed happiest examining his data (as well as those of others) to see what useful information could be extracted. Much of his time in the laboratory was spent plotting and replotting data in the pursuit of clarity and simplicity, and his research reports were models of both qualities.
5. Impact on Experimental Psychology
Two additional characteristics of his work deserve mention. While the majority of his contemporaries were engaged in a search for relevant variables (e.g., does ‘meaningfulness’ affect ease of remembering?), Stevens’s goal was the discovery of functional relations, mathematical descriptions of the association of one factor with another, as epitomized by his psychophysical power law. A corollary of this belief was an attitude toward sampling theory and statistical testing that ranged from indifference to disdain. The object for him was not to see whether 80 dB tones received an average loudness judgment greater than did 60 dB tones (at the 5 percent level of significance), the equivalent of what most of his contemporaries were doing. What was important to Stevens was whether a lawful relation could be seen when data were plotted in a thoughtful way. Large numbers of experimental subjects were not needed to reveal the kind of robust relations he sought; a few colleagues or students
recruited in the hallways were enough for most of his purposes. A landmark publication for research psychologists in the 1950s was the Handbook of Experimental Psychology edited by Stevens (1951). It was, as he described it in the preface, ‘a technical survey that would systematize, digest, and appraise the midcentury state of experimental psychology.’ It was not the first of its kind, but it was the most recent, and thus defined for teachers and students alike the set of shared interests that constituted experimental psychology. It was still a time when one might be expert in one aspect of the field and fairly knowledgeable about many others. Since then the amount of knowledge in psychology has grown at an explosive rate, and now the achievement of competence in one area increasingly restricts knowledge of any other. At the beginning of the twenty-first century, the tent of ‘experimental psychology’ has few remaining occupants; there are instead cognitive scientists, brain researchers, vision scientists, and a host of others in specialized niches. Just as Stevens’s mentor E. G. Boring may have been among the last to teach an all-inclusive account of psychology in a single course, so Stevens may have been among the last to view the phrase ‘experimental psychology’ as a useful rubric.
6. Conclusion
Stevens can be seen as a transitional figure in the evolution of psychological science, yet much of what he did continues to influence the way it is practiced today. His lasting contributions to the fields of psychoacoustics and psychophysics have been discussed here. But, in addition, he created a roadmap for much of modern sensory neurophysiology and for those who theorize in that area. Many of the phenomena for which today’s brain researchers seek explanations were established by Stevens or by others using the methods he developed. Were he alive today, it seems likely that he would embrace the new technology and would be developing ingenious techniques for exploring the neural aspects of sensory events. Although an account of his influence remains an unfinished story, at the least he will be remembered as one who brought new and powerful ideas to the study of psychophysics, leaving it a field reinvigorated and evolving as a subject of great intrinsic interest and many useful applications (e.g., see Baird 1997, Norwich 1993, Stevens 1966).
See also: Audition and Hearing; Experimentation in Psychology, History of; Galton, Sir Francis (1822–1911); Helmholtz, Hermann Ludwig Ferdinand von (1821–94); Lashley, Karl Spencer (1890–1958); Psychology: Historical and Cultural Perspectives; Psychology: Overview; Psychophysics; Wundt, Wilhelm Maximilian (1832–1920)
Bibliography
Baird J C 1997 Sensation and Judgment: Complementarity Theory of Psychophysics. Erlbaum, Mahwah, NJ
Binder A 1984 Restrictions on statistics imposed by method of measurement: Some reality, much mythology. Journal of Criminal Justice 12: 467–81
Luce R D, Narens L 1999 S. S. Stevens, measurement, and psychophysics. In: Killeen P R, Uttal W R (eds.) Fechner Day 99: The End of Twentieth Century Psychophysics. The International Society for Psychophysics, Tempe, AZ
Miller G A 1974 Stanley Smith Stevens: 1906–1973. American Journal of Psychology 87: 279–88
Norwich K H 1993 Information, Sensation, and Perception. Academic Press, San Diego, CA
Scharf B 1999 S. S. Stevens’s contributions to an understanding of hearing. In: Killeen P R, Uttal W R (eds.) Fechner Day 99: The End of Twentieth Century Psychophysics. The International Society for Psychophysics, Tempe, AZ
Stevens S S 1939 Psychology and the science of science. Psychological Bulletin 36: 221–63
Stevens S S 1946 On the theory of scales of measurement. Science 103
Stevens S S (ed.) 1951 Handbook of Experimental Psychology. Wiley, New York
Stevens S S 1957 On the psychophysical law. Psychological Review 64: 153–81
Stevens S S 1961 To honor Fechner and repeal his law. Science 133: 80–6
Stevens S S 1966 A metric for the social consensus. Science 151: 530–41
Stevens S S 1969 On predicting exponents for cross-modality matches. Perception & Psychophysics 6: 251–6
Stevens S S 1975 Psychophysics: Introduction to its Perceptual, Neural, and Social Prospects. Wiley, New York
Stevens S S, Davis H 1938 Hearing: its Psychology and Physiology. Wiley, New York
R. Teghtsoonian
Stigler, George Joseph (1911–91)
1. Personal History
George Stigler, one of the great economists of all time, with a gift for writing matched among modern economists only by John Maynard Keynes, was born January 17, 1911 in Renton, Washington, a suburb of Seattle, the only child of Joseph and Elisabeth Stigler. His parents had separately migrated to the United States at the end of the nineteenth century, his father from Bavaria, his mother from what was then Austria-Hungary. Stigler went to public schools in Seattle, and then to the University of Washington, receiving a BA in 1931. Although he says that he had ‘no thought of an academic career’ when he graduated from college, he applied for and was awarded a fellowship at Northwestern University for graduate study in the business school, receiving an MBA in 1932 (Stigler 1988, p. 15). More important, at Northwestern he
developed an interest in economics as an object of study in its own right and decided on an academic career. After one further year of graduate study at the University of Washington, he moved to the University of Chicago where he found the intense intellectual atmosphere he sought. Chicago became his intellectual home for the rest of his life, as a student from 1933 to 1936, a faculty member from 1958 to his death in 1991, and a leading member of, and contributor to, the ‘Chicago School’ throughout. He received his PhD in 1938, with a thesis, now a classic, on the history of neoclassical theories of production and distribution (Stigler 1941). At Chicago, Stigler was particularly influenced by Frank H. Knight, under whom he wrote his dissertation, Jacob Viner, who taught economic theory and international economics, John U. Nef, economic historian, and their younger colleague, Henry Simons, who became a close personal friend and whose A Positive Program for Laissez Faire greatly influenced many of Stigler’s contemporaries. ‘At least as important to me,’ Stigler writes (1988, pp. 23–5), ‘as the faculty were the remarkable students I met at Chicago,’ and goes on to list W. Allen Wallis, Milton Friedman, Kenneth Boulding, and Robert Shone from Britain, Sune Carlson from Sweden, Paul Samuelson, and Albert G. Hart, all of whom subsequently had distinguished careers. In 1936, Stigler was appointed an assistant professor at Iowa State College (now University), and shortly thereafter was married to Margaret Mack. The Stiglers had three sons: Stephen, a professor of statistics at the University of Chicago, David, a lawyer in private practice, and Joseph, a businessman. Mrs. Stigler died in 1970, tragically without any advance warning. George never remarried.
In 1938, Stigler accepted an appointment at the University of Minnesota, going on leave in 1942 to work first at the National Bureau of Economic Research and then at the Statistical Research Group of Columbia University, where he joined a group directed by W. Allen Wallis that was engaged in war research on behalf of the armed services. When the war ended in 1945, George returned to the University of Minnesota, but remained only one year, leaving in 1946 to accept a professorship at Brown University, where he also remained only one year before moving to Columbia University in 1947 and then, in 1958, to the University of Chicago where he remained the rest of his life. At Chicago, he became an editor of the Journal of Political Economy, established the Industrial Organization Workshop, which achieved recognition as the key testing ground for contributions to the field of industrial organization, and in 1977 founded the Center for the Study of Economy and the State, serving as its director until his death in Chicago on December 1, 1991. Stigler served as President of the American Economic Association in 1964, and of the History of
Economics Society in 1977. He was elected to the National Academy of Sciences in 1975. He received the Alfred Nobel Memorial Prize in Economic Science in 1982 and the National Medal of Science in 1987. Stigler’s governmental activities included: serving as a member of the Attorney General’s National Committee to Study the Antitrust Laws, 1954–5; Chairman, Federal Price Statistics Review Committee, 1960–1; member, Blue Ribbon Panel of the Department of Defense, 1969–70; Vice Chairman, Securities Investor Protection Corporation, 1970–3; Co-Chairman, Blue Ribbon Telecommunications Task Force, Illinois Commerce Commission, 1990–1.
2. Scientific Contributions
Stigler made important contributions to a wide range of economic areas: (a) history of thought, (b) price theory, (c) economics of information, (d) organization of industry, and (e) theory of economic regulation. In addition, (f) he wrote many influential essays and reviews, and was a long-time editor.
2.1 Intellectual History
Intellectual history was Stigler’s first field of specialization and remained a continuing interest throughout his life. His doctoral dissertation, published as Production and Distribution Theories, was followed by a steady flow of perceptive, thoughtful, and beautifully written articles and books interpreting the contributions of his predecessors. Almost single-handedly, he kept the field alive, despite widespread lack of interest within the profession. More recently, he succeeded in bringing about a revival of interest in the field, which is currently flourishing. In the words of one of his students, ‘The versatility of his work in [the history of economic thought] has been matched by its depth and insight—and unmatched by any other historian of economic thought’ (Sowell 1987, p. 498; the rest of this paragraph draws from Sowell’s excellent summary of ‘Stigler as an Historian of Economic Thought’). In addition to analyzing the theories of particular economists, he examined particular concepts, such as perfect competition and utility, traced trends in the economics profession, considered the influence of economic theories on events and vice-versa, and the meaning of originality. Many of his important articles in this area are collected in a 1965 volume, Essays in the History of Economics. The history of thought provided a rich seedbed for Stigler’s wide-ranging scientific work. Stigler regarded economic theory, in the words of Alfred Marshall, as ‘an engine for the discovery of concrete truth,’ not as a subject of interest in its own right, a branch of
mathematics. Few economists have so consistently and successfully combined economic theory with empirical analysis, or ranged so widely.
2.2 Price Theory
Stigler’s first important publication after his doctoral thesis was a textbook, The Theory of Competitive Price, published in 1942, followed by four revised versions under the title The Theory of Price (1946, 1952, 1966, 1987). His textbook is unique among intermediate textbooks in price theory in its systematic linking of highly abstract theory to observable phenomena. It is unique also in its concise yet rigorous exposition. One of his earliest papers (1939) broadened the concept of a short-run cost curve to accommodate the observation that firms faced with fluctuating demand could build flexibility into their fixed equipment at a cost of not achieving the lowest possible cost for a particular output. Another early paper (1945) presented a formal analysis of determining the minimum cost of an adequate diet, supplemented by an empirical calculation of the minimum cost, one of the earliest applications of linear programming. ‘The procedure,’ he explained, ‘is experimental because there does not appear to be any direct method of finding the minimum of a linear function subject to linear conditions’ (p. 310). Two years later George Dantzig provided such a direct method, the simplex method, now widely used in many economic and industrial applications (Dantzig 1987). Like these two examples, Stigler’s many later contributions to price theory were a by-product of seeking to understand puzzling real world phenomena, and are best discussed in that context.
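The structure of the diet problem, minimizing a linear cost subject to linear nutrient requirements, can be sketched with a toy instance. Everything numeric here is hypothetical (two foods, two nutrients, invented prices and requirements), not Stigler's 1945 data. The code uses neither his experimental procedure nor the simplex method: it simply enumerates the corner points of the feasible region, which is workable only for very small problems.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical prices per unit of food A and food B.
cost = (Fraction(2), Fraction(3))

# Each row (a, b, r) encodes the requirement a*x + b*y >= r,
# where x and y are the quantities of food A and food B.
constraints = [
    (Fraction(1), Fraction(2), Fraction(8)),  # nutrient 1 (hypothetical)
    (Fraction(3), Fraction(1), Fraction(9)),  # nutrient 2 (hypothetical)
    (Fraction(1), Fraction(0), Fraction(0)),  # x >= 0
    (Fraction(0), Fraction(1), Fraction(0)),  # y >= 0
]

def feasible(x, y):
    """True if the diet (x, y) meets every requirement."""
    return all(a * x + b * y >= r for a, b, r in constraints)

def vertices():
    """Yield feasible intersections of pairs of constraint boundaries."""
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries do not intersect
        x = (r1 * b2 - r2 * b1) / det  # Cramer's rule
        y = (a1 * r2 - a2 * r1) / det
        if feasible(x, y):
            yield x, y

def total_cost(point):
    return cost[0] * point[0] + cost[1] * point[1]

# With positive prices the minimum is attained at a corner point.
best = min(vertices(), key=total_cost)
best_cost = total_cost(best)
```

For this instance the cheapest adequate diet is 2 units of food A and 3 units of food B, at a cost of 13. Exact fractions are used so that the feasibility checks are not disturbed by rounding.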
2.3 Economics of Information
‘The Economics of Information’ is the title of a seminal article, published in 1961. In his intellectual autobiography, Stigler termed it, ‘My most important contribution to economic theory’ (1988, pp. 79–80). The article begins, ‘one should hardly have to tell academicians that information is a valuable resource: knowledge is power. And yet it occupies a slum dwelling in the town of economics. Mostly it is ignored’ (Stigler 1961, p. 213). This brief 12-page article is a splendid illustration of several of Stigler’s unique virtues: creativity, which he defined as consisting ‘of looking at familiar things or ideas in a new way’; the capacity to extract new insights about those seemingly familiar things; and the ability to state his main points in a provocative and eminently readable way.
Stigler illustrates the importance of subjecting information to economic analysis with two familiar observations: that there is generally no single ‘market price’ but a variety of prices that ‘various sellers (or buyers) quote at any time’ and that ‘advertising … is an enormously powerful instrument for the elimination of ignorance … and is normally “paid” for by the seller.’ In each case, he presents a sophisticated analysis that explains what determines the observed phenomena, an analysis that has empirical implications capable of being contradicted by observation. He concludes that ‘The identification of sellers and the discovery of their prices are only one sample of the vast role of the search for information in economic life’ (quotes from Stigler 1961, pp. 213, 220, 221, 224). As occurred throughout his career in every field Stigler touched, this article was less important for the particular familiar observations that it analyzed than for giving birth to an essentially new area of study for economists. There followed an outpouring of further studies by him but even more by his students and others that have radically altered the accepted body of economic theory. This aspect of his influence was stressed in the citation for his Nobel Prize, which noted that Stigler’s ‘studies of the forces which give rise to regulatory legislation have opened up a completely new area of economic research … Stigler is also recognized as the founder of “economics of information” and “economics of regulation,” and one of the pioneers of research in the intersection of economics and law.’
2.4 The Organization of Industry
The Organization of Industry is the title of a book published in 1968. Its ‘main content,’ Stigler says in the preface, ‘is a reprinting of 17 articles I have written over the past two decades [including the “Economics of Information”] in the area of industrial organization. Although the main topics in industrial organization are touched upon, the touch is often light. The ratio of hypotheses to reasonably persuasive confirmation is distressingly high in all of economic literature, and it must be my chief and meager defense that I am not the worst sinner in the congregation.’ Far from being the worst sinner, Stigler’s main contribution to the field, both in this book and later writing, was the use of empirical evidence to test hypotheses designed to explain features of industrial organization. Article after article combines subtle theoretical analysis with substantial nuggets of empirical evidence, presented so casually as to conceal the care with which the data were compiled and the effort that was expended to determine what data were both relevant and accessible. These articles record the shift in Stigler’s views on antitrust, from initial support of an activist antitrust policy to skepticism about even a minimalist policy.
One of the most influential of the articles in the book was on ‘The Economies of Scale’ in which he introduced ‘the survivor principle’ as a way to use market data to determine the efficient size of an enterprise in a particular industry. The approach has since become widely accepted. In 1962, he published an article, in collaboration with Claire Friedland, his long-time associate, entitled ‘What Can Regulators Regulate? The Case of Electricity,’ which concluded that regulation of electric utilities had produced no significant effect on rates charged. Schmalensee (1987, p. 499) notes that ‘This essay stimulated a great deal of important empirical work on public regulation in the USA. It also posed a basic problem: if regulation does not generally achieve its stated objectives, why have so many agencies been established and kept in existence?’
2.5 Economic Regulation
‘The Theory of Economic Regulation’ (1971b) attempts to answer that question. It is one of a series of contributions by Stigler to what has since come to be called ‘public choice’ economics: the shift from viewing the political market as not susceptible to economic analysis, as one in which disinterested politicians and bureaucrats pursue the ‘public interest,’ to viewing it as one in which the participants are seeking, as in the economic market, to pursue their own interests, and hence subject to analysis with the usual tools of economics. The seminal work that deserves much of the credit for launching ‘public choice,’ The Calculus of Consent, by James Buchanan and Gordon Tullock (1962), appeared in the same year as the Stigler–Friedland article. The ‘central thesis of the article,’ Stigler wrote, ‘is that, as a rule, regulation is acquired by the industry and is designed and operated primarily for its benefit.’ He notes that two ‘alternative views of the regulation of industry are widely held. The first is that regulation is instituted primarily for the protection and benefit of the public at large or some large subdivision of the public. The second view is essentially that the political process defies rational explanation.’ He then gives example after example to support his own thesis, which is, by now, the orthodox view in the profession, concluding: ‘The idealistic view of public regulation is deeply imbedded in professional economic thought. The fundamental vice of such a [view] is that it misdirects attention’: toward preaching to the regulators rather than changing their incentives (quotes from Stigler 1971b, p. 3). ‘Smith’s Travels on the Ship of State’ (1971a), published in the same year as ‘The Theory of Economic Regulation,’ raises the same question on a broader scale. Smith gives self-interest pride of place in analyzing the economic market, but does not give it the same role in analyzing the political market. Smith’s
failure to do so constitutes Stigler’s main (indeed, nearly only) criticism of the Wealth of Nations, that ‘stupendous palace erected upon the granite of self interest’ (Rosenberg 1993, p. 835). The same theme pervades many of Stigler’s later publications.
2.6 Essays, Editorship, and Reviews
In addition to his prolific academic publications, Stigler published a great many essays for a broader audience, many given as talks to a wide variety of audiences. The most important are collected in three volumes, The Intellectual and the Market Place (1963), The Citizen and the State (1975), and The Economist as Preacher (1982). ‘There he (the intelligent layman) will find a potpourri of wit and seriousness blended with a high writing style’ (Demsetz 1982, p. 656). ‘For 19 years Stigler was a very successful editor of the Journal of Political Economy. Under his leadership this journal solidified its high reputation among economists’ (Becker 1993, p. 765). Stigler was also a prolific reviewer of books. His complete bibliography lists 73 reviews in 24 publications ranging from strictly professional, such as the Journal of Political Economy (22) and the American Economic Review (10) to the popular, such as the Wall Street Journal (5), and the New York Times (3), and dating from 1939 to 1989. Stigler’s last book, his intellectual autobiography, published in 1988, Memoirs of an Unregulated Economist, is a delight to read. As I described it at the time: ‘Stigler’s memoirs are a gem: in style, in wit, and above all, in substance, they reflect accurately his own engaging personality and his extraordinarily diverse contributions to our science.’
3. Concluding Comments
An evaluation of Stigler’s scientific contribution would be incomplete without recognizing his role as a teacher and colleague. As teacher, he inspired his students by his own high standards, and instilled a respect for economics as a serious subject concerned with real problems. As one of his students, Thomas Sowell, wrote (1993, p. 788): ‘What Stigler really taught, whether the course was industrial organization or the history of economic thought, was intellectual integrity, analytical rigor, respect for evidence—and skepticism toward the fashions and enthusiasms that come and go.’ Stigler supervised many doctoral dissertations at both Columbia and the University of Chicago. According to Claire Friedland, Stigler’s associate for many years, he served on more than 40 thesis committees at Chicago, and perhaps 40 more at Columbia, and chaired a considerable fraction of those committees. His students come close to dominating the field of industrial organization.
Stigler was an extremely valuable colleague. He provided much of the energy and drive to the interaction among members of the Chicago economics department, business school, and law school that came to be known as ‘The Chicago School.’ His Workshop in Industrial Organization was an outgrowth of a law school seminar started by Aaron Director that Stigler cooperated in running when he came to Chicago. His relations were especially close with Aaron Director, Gary Becker, Richard Posner, Harold Demsetz, and myself, enhancing significantly the scientific productivity of all of us. See also: Economics, History of; Employment and Labor, Regulation of; Firm Behavior; Information, Economics of; Institutional Economic Thought; Law and Economics; Market Areas; Market Structure and Performance; Markets and the Law; Political Economy, History of; Regulation, Economic Theory of; Regulation Theory in Geography; Utilitarian Social Thought, History of
Bibliography
Becker G S 1993 George Joseph Stigler, January 17, 1911–December 1, 1991. Journal of Political Economy 101: 761–7
Buchanan J M, Tullock G 1962 The Calculus of Consent. University of Michigan Press, Ann Arbor, MI
Dantzig G B 1987 Linear programming. In: Eatwell J, Milgate M, Newman P (eds.) The New Palgrave: A Dictionary of Economics. Stockton Press, New York, Vol. 3, pp. 203–5
Demsetz H 1982 The 1982 Nobel Prize in economics. Science 218: 655–7
Rosenberg N 1993 George Stigler: Adam Smith’s best friend. Journal of Political Economy 101: 833–48
Schmalensee R 1987 Stigler’s contributions to microeconomics and industrial organization. In: Eatwell J, Milgate M, Newman P (eds.) The New Palgrave: A Dictionary of Economics. Stockton Press, New York, Vol. 4, pp. 499–500
Sowell T 1987 Stigler as an historian of economic thought. In: Eatwell J, Milgate M, Newman P (eds.) The New Palgrave: A Dictionary of Economics. Stockton Press, New York, Vol. 4, pp. 498–9
Sowell T 1993 A student’s eye view of George Stigler. Journal of Political Economy 101: 784–92
Stigler G J 1939 Production and distribution in the short run. Journal of Political Economy 47: 305–27
Stigler G J 1941 Production and Distribution Theories, 1870 to 1895. Macmillan, New York
Stigler G J 1942 The Theory of Competitive Price. Macmillan, New York
Stigler G J 1945 The cost of subsistence. Journal of Farm Economics 27: 303–14
Stigler G J 1946 The Theory of Price, rev. edns. 1952, 1966, 1987. Macmillan, New York
Stigler G J 1961 The economics of information. Journal of Political Economy 69: 213–25
Stigler G J 1963 The Intellectual and the Market Place, and Other Essays. Free Press of Glencoe, New York
Stigler G J 1965 Essays in the History of Economics. University of Chicago Press, Chicago, IL
Stigler G J 1968 The Organization of Industry. Irwin, Homewood, IL
Stigler G J 1971a Smith’s travels on the ship of state. History of Political Economy 3: 265–77
Stigler G J 1971b The theory of economic regulation. Bell Journal of Economics and Management Science 2: 3–21
Stigler G J 1975 The Citizen and the State: Essays on Regulation. University of Chicago Press, Chicago, IL
Stigler G J 1982 The Economist as Preacher, and Other Essays. University of Chicago Press, Chicago, IL
Stigler G J 1988 Memoirs of an Unregulated Economist. Basic Books, New York
Stigler G J, Friedland C 1962 What can regulators regulate? The case of electricity. Journal of Law and Economics 5: 1–16
M. Friedman
Stigma, Social Psychology of
A person who possesses an attribute that puts at risk their full acceptance by others is said to possess a stigma. As a consequence, the individual is stigmatized, reduced in our minds from a whole and usual person to a tainted, discounted one (Goffman 1963). A person with a stigma may be described as possessing a failing, a shortcoming, or a handicap, may be viewed as deviant or abnormal, and may thus be subject to marginalization or oppression. Stigmatized individuals are therefore targets of prejudice, stereotyping, and discrimination (see Intergroup Relations, Social Psychology of; Stereotypes, Social Psychology of). To cope with their situation, stigmatized individuals may make use of a variety of psychological and behavioral coping strategies (see Coping across the Lifespan). There are several categories of attributes in our culture that have been the basis for pervasive stigmatization (e.g., race, ethnicity, religion, physical ability, physical appearance, gender, and sexual orientation). Goffman (1963) described three general categories/types of stigma: abominations of the body (e.g., physical deformities), blemishes of individual character (e.g., criminal behavior, addiction, or sexual orientation), and tribal stigma (e.g., race, nation, religion, etc.). Most characterizations of stigma emphasize its relational nature: an attribute cannot be stigmatizing without another attribute being viewed as more usual or more acceptable. For example, an unusually tall woman may not feel stigmatized in the company of other very tall people; however, she might feel stigmatized in the company of people of average height. Depending upon the context, we all have the potential to be stigmatized. The nature of stigmas may vary across several dimensions. For instance, Jones et al.
(1984) described six dimensions: concealability of the stigmatizing attribute, course (duration and permanence) of the stigmatizing attribute, disruptiveness of the stigmatizing attribute in social interactions, esthetic qualities of the stigmatizing trait, origin or cause of the stigmatizing trait, and perilousness of the stigmatizing trait to others. Crocker et al. (1998) argued that two of these dimensions are the most critical to understanding the experience of a stigmatized person: the visibility (or concealability) of the attribute, and the controllability of the attribute (or the origin, reason for the person’s having that attribute; for example, personal choice vs. genetic inheritance).
1. History of Research on Stigma
Early research on stigma was largely descriptive, bringing the circumstances and concerns of stigmatized individuals to the attention of the social sciences. This early research typically focused on one particular stigmatizing trait (e.g., physical disability, mental illness, or criminal status) in single-study pieces, often conducted by sociologists or clinicians using qualitative methods such as interviews and intensive case studies. It was not until 1963 that this varied literature was integrated into a single theoretical review in Goffman’s (1963) book Stigma: Notes on the Management of Spoiled Identity. His book provided a unifying theory of stigma, and insights into issues stigmatized people face in terms of their social and personal identity and their management of interactions with nonstigmatized people. Early theorizing and research can be characterized as attempts to promote sympathy and understanding by emphasizing the perceived pitiable situation of stigmatized individuals. This is illustrated by Lewin’s (1941/1948) discussion of self-hatred among Jews and Allport’s (1954/1985) discussion of traits due to victimization. As Allport noted, ‘One’s reputation, whether false or true, cannot be hammered, hammered, hammered, into one’s head without doing something to one’s character.’ It was during this phase that the famous ‘doll studies’ were conducted in an attempt to show the damage to the self-concept caused by prejudice and segregation (e.g., Clark and Clark 1939). The conclusion often drawn from this research was that black children showed self-hatred by preferring white to black dolls, whereas white children showed no such self-hatred. Although this conclusion was later disputed (see below), the doll studies were used as evidence in the landmark case of Brown vs. Board of Education (1954) to illustrate the damage caused by segregation.
The second phase of research can be considered a transitional phase, in that it represents a merger of research on stigma and research on prejudice, which had largely been conducted independently. This merger is illustrated by the second major integrative work on stigma, Jones et al.’s (1984) book Social Stigma: The Psychology of Marked Relationships. Jones and his colleagues illustrate clearly the connection between prejudice and stigmatization, providing further analysis of the implications of stigma for social interactions. This phase can be considered in part a result of the civil rights movement, in that it was during this time that a new cultural norm emerged as to how to perceive stigmatized group members, providing new rights, higher social standing, and group pride to members of a number of stigmatized groups. During this time more affirmative theories of self-concept and group identity emerged (see Ethnic Identity, Psychology of; Social Identity, Psychology of), as well as theories about processes by which the stigmatized can protect their self-esteem. It was also during this time of upward social mobility of oppressed group members that the large body of empirical work on intergroup contact and tokenism began. The latest development in the research on stigma can be termed ‘The Target’s Perspective,’ which has extended the research on stigma in two major ways. First, it broadens our view of the stigmatized. It highlights the importance of understanding how stigmatized individuals perceive prejudice (Swim et al. 1998) and the reasons for their responses to prejudice, which might be misinterpreted if not viewed within the larger context of their life experiences (Miller and Myers 1998). It also highlights variations within stigmatized groups, such as variations in extent and type of group identity (Cross 1991). Moreover, the target’s perspective research argues for a more empowering view of stigmatized individuals, not as passive recipients of prejudice but as active copers (Swim et al. 1998). Second, the target’s perspective research goes beyond simply merging the literatures on stigma and prejudice, instead arguing that the study of the target’s experience with prejudice should be used to enlighten the larger theory concerning prejudice and its perpetrators.
The target’s perspective researchers have asserted that the research on prejudice primarily has studied perpetrators, failing to give a voice to the stigmatized.
2. Current Understanding of Stigma
Research on stigma can be conceived of as representing three primary topic areas: (a) perceptions of prejudice and discrimination; (b) effects of prejudice and discrimination on the stigmatized; and (c) coping with prejudice and discrimination. Below we highlight illustrative and consequential work in these areas.
2.1 Perceptions of Prejudice and Discrimination
Perceptions of prejudice include general perceptions of the prevalence of prejudice against specific groups or oneself and whether people label specific incidents that occur to themselves or others as discriminatory. An intriguing finding is that people tend to perceive that they personally are less likely to experience prejudice than are members of their group (Taylor et al. 1994). This finding spurred research examining motivational factors (e.g., a desire not to accuse others or to protect a sense of personal control) and cognitive factors (e.g., difficulties in aggregating information) that lead stigmatized group members to deny their personal experiences with prejudice. Consistent with a denial explanation for the personal/group discrepancy, research has revealed that members of stigmatized groups are more likely than members of nonstigmatized groups to underestimate the likelihood that specific ambiguous negative incidents are a function of discrimination against them (Ruggiero and Major 1998). Several factors influence the likelihood that people will perceive prejudice and discrimination. First, individual differences in sensitivity to prejudice and discrimination influence perceptions. For example, the more people endorse sexist beliefs, the less likely they are to label certain incidents as sexual harassment (Swim and Cohen 1997), and African-Americans who have a general cultural distrust may more readily perceive prejudice than those who are more trusting (Feldman Barrett and Swim 1998). Second, several features of particular incidents influence perceptions. Some beliefs and actions are more readily defined as prejudicial. For example, women are more likely to rate favorably a man who endorses benevolent sexist beliefs than one who endorses hostile sexist beliefs, perhaps in part because they do not recognize the link between these two types of sexist beliefs (Kilianski and Rudman 1998). Benevolent sexist beliefs include the endorsement of complementary gender differentiation, heterosexual intimacy, and paternalism, and men who endorse these beliefs tend also to endorse hostile sexist beliefs. Also, some people are more likely to be perceived to be prejudiced than others are.
That is, people hold stereotypes about stereotypers, such as believing that whites are more likely to discriminate against blacks than blacks are to discriminate against whites (Inman and Baron 1996). Third, there are personal costs and benefits to perceiving prejudice (Feldman Barrett and Swim 1998). Perceiving prejudice when none exists can lead to interpersonal disruption and distrust of outgroup members as well as to confirmatory biases (see Self-fulfilling Prophecies), and not perceiving prejudice when it does exist can endanger an individual’s self-esteem and can lead to internalization of prejudicial or stereotypical beliefs.
2.2 Effects of Prejudice and Discrimination on the Stigmatized
As noted above, there has been much concern that prejudice psychologically impacts the stigmatized. Perhaps the best-known early set of empirical work is the previously mentioned doll studies originating in work by Clark and Clark (1939). The conclusion that these studies illustrate self-hatred among African-Americans has been disputed (e.g., Cross 1991) and countered by findings illustrating, for example, that African-Americans have higher self-esteem than European Americans on standardized measures of self-esteem (Gray-Little and Hafdahl 2000). Despite these findings, it is important to recognize that some stigmatized groups (e.g., heavy women) may be more likely to have lower self-esteem than nonstigmatized people, which can be a function of internalizing prejudice (Quinn and Crocker 1998). There also is variation within stigmatized groups, with some group members, such as those who do not strongly identify with a group, being more affected by prejudice than others (Gray-Little and Hafdahl 2000). Further, experiencing discrimination can lead to other psychological effects such as increased negative daily moods (Swim et al. 2000) and long-term psychiatric symptoms such as obsessive-compulsiveness, depression, and anxiety (Klonoff et al. 2000). Decreases in academic performance are also important consequences of being in a stigmatized group. Compared with the nonstigmatized, stigmatized individuals may more often find themselves solo members of groups, which can lead to distraction due to impression management concerns that can interfere with performance (Saenz and Lord 1989). Further, situations can induce stigmatized individuals to fear that others will stereotype them, and this ‘stereotype threat’ can impair test performance (Crocker et al. 1998). For instance, telling women that there are gender differences on a difficult exam in a stereotype-relevant domain that is important to them is sufficient to lower their exam scores. Similar findings have occurred for other groups, such as African-Americans.
2.3 Coping with Prejudice and Discrimination
Coping strategies used by stigmatized individuals can be divided into psychological and behavioral responses. In many cases, coping strategies protect people from the negative effects of prejudice and discrimination but also have potential costs. Disengagement or disidentification is a possible psychological response to discrimination that can allow one not to attend to potentially demeaning feedback from others (Crocker et al. 1998). However, this can result in disidentification with domains that might be useful for promoting one’s academic success. One type of behavioral response is preventative. Stigmatized individuals may choose to avoid situations where they feel that they may encounter prejudice (Swim et al. 1998) or to engage in behaviors that might compensate for the negative expectations that others might have of them (Miller and Myers 1998). Although these prevention strategies may protect the individual from future encounters with prejudice, they too have their costs. Avoidance can cause people to miss opportunities, prevent disconfirmation of beliefs, and prevent the development of meaningful relationships. People can overcompensate for prejudice, which may appear out of place or fake, or they may undercompensate, particularly if they are in situations where they do not feel threatened. Other behavioral responses occur after an event, and these responses can be either nonassertive or assertive (Swim et al. 1998). People may choose to politely confront a person, for instance, by asking the perpetrator to explain themselves, by attempting to educate the perpetrator, or by engaging in collective action against the perpetrator. While these responses may result in change and may leave the target feeling more empowered, there are potential costs, such as retaliation on the part of the perpetrator, especially with the more assertive responses.
3. Future Directions
As noted above, current research on stigma has broadened our understanding of the stigmatized and we anticipate that this process will continue. For instance, there are still relatively few studies that illustrate the strengths of the stigmatized. However, we anticipate that this will change with more people focusing, for instance, on how the stigmatized support each other, confront perpetrators, and set and seek their life goals. Further, we anticipate that there will be more research on variations in experiences across stigmatized groups, such as differences in experiences of those from different Asian countries or differences in identity and economic standing within any stigmatized group. Moreover, there will be more complexity in these analyses as mixed relationships develop where children are born to parents of different ethnic heritages or adopted by parents who do not share, for instance, their ethnic heritage. We also anticipate that research on the stigmatized will continue to inform and be informed by other research areas. For instance, there is a strong connection between the literature on stress and stigma in that experiences with prejudice can be considered forms of stressors. Further research on stigma is likely to lead to major changes in how people study prejudice and intergroup relationships. Given greater emphasis on similarities and differences between the stigmatized, there may be greater attention paid to similarities and differences between racism, sexism, heterosexism, and antifatism. There may also be greater recognition of the subjectivity in defining what is and is not prejudice, in that members of different groups will likely disagree as to what constitutes prejudicial behavior. Further, there will likely be more attention paid to the contributions that the stigmatized make to intergroup relationships, such as understanding their attempts to maintain smooth intergroup interactions as well as the origins and characteristics of their prejudices against both dominant groups and other stigmatized groups.
See also: Discrimination; Discrimination: Racial; State, Sociology of the; Stereotypes, Social Psychology of
Bibliography
Allport G W 1985 [1954] The Nature of Prejudice. Addison-Wesley, Reading, MA
Brown vs. Board of Education of Topeka 1954 347 US 483
Clark K B, Clark M P 1939 Segregation as a factor in the racial identification of Negro pre-school children: A preliminary report. The Journal of Experimental Education 8: 161–3
Crocker J, Major B, Steele C 1998 Social stigma. In: Gilbert D, Fiske S T, Lindzey G (eds.) Handbook of Social Psychology, 4th edn. McGraw-Hill, Boston, Vol. 2, pp. 504–53
Cross W E 1991 Shades of Black: Diversity in African-American Identity. Temple University Press, Philadelphia
Feldman Barrett L, Swim J K 1998 Appraisals of prejudice and discrimination. In: Swim J K, Stangor C (eds.) Prejudice: The Target’s Perspective. Academic Press, San Diego, pp. 11–36
Goffman E 1963 Stigma: Notes on the Management of Spoiled Identity. Prentice-Hall, Englewood Cliffs, NJ
Gray-Little B, Hafdahl A R 2000 Factors influencing racial comparisons of self-esteem: A quantitative review. Psychological Bulletin 126: 26–54
Inman M L, Baron R S 1996 The influence of prototypes on perceptions of prejudice. Journal of Personality and Social Psychology 70: 727–39
Jones E E, Farina A, Hastorf A H, Markus H, Miller D, Scott R A 1984 Social Stigma: The Psychology of Marked Relationships. Freeman, New York
Kilianski S E, Rudman L A 1998 Wanting it both ways: Do women approve of benevolent sexism? Sex Roles 39: 333–52
Klonoff E A, Landrine H, Campbell R 2000 Sexist discrimination may account for well-known gender differences in psychiatric symptoms. Psychology of Women Quarterly 24: 93–9
Lewin K 1948 [1941] Self-hatred in Jews. In: Lewin G W (ed.) Resolving Social Conflicts: Selected Papers on Group Dynamics. Harper & Brothers, New York, pp. 186–200
Miller C T, Myers A M 1998 Compensating for prejudice: How heavyweight people (and others) control outcomes despite prejudice. In: Swim J K, Stangor C (eds.) Prejudice: The Target’s Perspective. Academic Press, San Diego, pp. 191–219
Quinn D, Crocker J 1998 Vulnerability to the affective consequences of the stigma of overweight. In: Swim J K, Stangor C (eds.) Prejudice: The Target’s Perspective. Academic Press, San Diego, pp. 127–48
Ruggiero K M, Major B N 1998 Group status and attributions to discrimination: Are low- or high-status group members more likely to blame their failure on discrimination? Personality & Social Psychology Bulletin 24: 821–37
Saenz D S, Lord C G 1989 Reversing roles: A cognitive strategy for undoing memory deficits associated with token status. Journal of Personality & Social Psychology 56: 698–708
Swim J K, Cohen L L 1997 Overt, covert, and subtle sexism: A comparison between the attitudes toward women and modern sexism scales. Psychology of Women Quarterly 21: 103–18
Swim J K, Cohen L L, Hyers L L 1998 Experiencing everyday prejudice and discrimination. In: Swim J K, Stangor C (eds.) Prejudice: The Target’s Perspective. Academic Press, San Diego, pp. 38–61
Swim J K, Hyers L L, Cohen L L, Ferguson M J 2000 Everyday sexism: Evidence for its incidence, nature, and psychological impact from three daily diary studies. Journal of Social Issues 57
Taylor D M, Wright S C, Porter L E 1994 Dimensions of perceived discrimination: The personal/group discrimination discrepancy. In: Zanna M P, Olson J M (eds.) The Psychology of Prejudice: The Ontario Symposium. Erlbaum, Hillsdale, NJ, Vol. 7, pp. 233–56
J. K. Swim and L. L. Hyers
Stochastic Dynamic Models (Choice, Response, and Time)
The term ‘stochastic dynamic model’ is used in human experimental psychology to describe models of choice and reaction time (RT) in which noise-perturbed information about stimulus identity accrues in continuous time. It has gained currency since it was used by Luce (1986) to refer to a class of models first investigated in a pioneering series of articles by Pacut (e.g., 1976), in which information accrual was described by linear stochastic differential equations (SDEs). Because of the relatively rich dynamics that can be represented by equations of this kind, the term is also used in a more general sense to describe models in which information accrual is not constant, but varies with time or state. This entry describes the mathematical methods that are used to analyze models of this kind.

Like the classical sequential-sampling models (see Auditory Models, Sequential Decision Making), the main application of stochastic dynamic models is to performance in simple decision or simple information-processing tasks, typically those that require a speeded choice between two alternatives. Along with Signal Detection Theory and its variants (see Signal Detection Theory: Multidimensional), these models are members of a larger class of statistical decision models, which assume that the representation of stimuli in the central nervous system is inherently statistical, by virtue of the mass action properties of neural coding. This leads to models in which the stimulus representation, viewed as a sequence of events unfolding in time, is treated formally as a stochastic process. Because of the assumed noisiness of stimulus representations, statistical decision models typically assume that decisions are made by an averaging or integration device, whose function is to smooth out transitory fluctuations in neural activation, thereby rendering decisions less susceptible to random perturbations in the stimulus process than would otherwise be the case. In Signal Detection Theory, the period of integration is assumed to be fixed, or to be determined by factors external to the stimulus process, whereas in sequential-sampling models integration continues until a criterion quantity of evidence for a particular response is obtained.

From a theoretical standpoint, sequential-sampling models have greater explanatory power than signal-detection models, because they can account for choice probabilities and RT within a unified theoretical framework, whereas signal-detection models provide an account of choice probabilities only. However, sequential-sampling models are more complex to analyze mathematically because they require solution of a first passage time problem. This solution gives the distribution of times needed for the accrued evidence to attain a criterion level, which is identified in such models with the decision component of RT. In contrast, the analysis of signal-detection models requires only the mean and variance of the accrued evidence as a function of time, as these suffice to determine the choice probabilities. In the sequential-sampling literature, two main classes of model have predominated: random-walk models, in which evidence accrues as a single, signed total, and counter, or accumulator, models, in which evidence for competing response alternatives accrues in parallel. When formulated in continuous time, these models become diffusion process models and Poisson counter models, respectively. Although a number of researchers have contributed to the development of these models, the standard versions are those described, respectively, by Ratcliff (1978) and Townsend and Ashby (1983).
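The contrast between integrating to a criterion and solving a first passage time problem can be made concrete with a small simulation. The following is a minimal sketch of a discrete-time random walk with a constant mean increment, not a model taken from any of the cited authors. The function names, the Gaussian increments with unit variance, and the parameter values (drift 0.2, criterion 3.0) are all assumptions chosen for illustration.

```python
import random


def random_walk_trial(drift, criterion, rng, max_steps=100_000):
    """Run one trial; return (response, number_of_steps).

    Evidence accrues as a single signed total. Each step adds a
    Gaussian increment with mean `drift` and unit variance
    (an illustrative assumption).
    """
    total = 0.0
    for step in range(1, max_steps + 1):
        total += rng.gauss(drift, 1.0)
        if total >= criterion:       # enough evidence for response A
            return "A", step
        if total <= -criterion:      # enough evidence for response B
            return "B", step
    return None, max_steps           # no decision within max_steps


def simulate(drift=0.2, criterion=3.0, n_trials=2000, seed=1):
    """Estimate P(response A) and the mean number of steps to a decision."""
    rng = random.Random(seed)
    trials = [random_walk_trial(drift, criterion, rng) for _ in range(n_trials)]
    decided = [(resp, steps) for resp, steps in trials if resp is not None]
    p_a = sum(resp == "A" for resp, _ in decided) / len(decided)
    mean_steps = sum(steps for _, steps in decided) / len(decided)
    return p_a, mean_steps


if __name__ == "__main__":
    p_a, mean_steps = simulate()
    print(f"P(A) = {p_a:.3f}, mean decision time = {mean_steps:.1f} steps")
```

A single run of this kind yields both a choice proportion and a distribution of stopping times, which is the sense in which sequential-sampling models account for choice and RT together. A fixed-sample (signal-detection) version would instead stop after a preset number of steps and respond according to the sign of the total.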
Recently, dynamic generalizations of these ‘classical’ models have begun to appear, in which the assumption of a constant rate of information accrual is replaced by time-dependent accrual, state-dependent accrual, or some combination of the two. These assumptions are motivated by the need to capture the complex dynamics that seem to underlie performance in some domains. State dependence can arise when the accrual process is imperfect or ‘leaky’ (Diederich 1995, Rudd and Brown 1997, Smith 1995), or when it is governed by resistive, or approach-avoidance, dynamics (Busemeyer and Townsend 1993). Time dependence arises when, for any of a number of reasons, the physical mechanism that generates the stimulus process is itself time varying (Heath 1992). Examples of the latter include models in which the accrual process is driven by time-varying encoding mechanisms in the visual cortex or by a noisy, continuous-time neural network, and models in which information accrual is gated in time by selective attention. In all such cases, the theoretical principles that underlie the model lead naturally to the assumption of time-dependent or state-dependent information accrual.
1. Diffusion Process Models
Figure 1 shows a two-barrier, dynamic, diffusion process model of RT. The main elements of this model are an encoding function, a source of noise, and a decision stage. The encoding function, µ(t), is a deterministic function of time, which represents the average information available to the decision stage at time t. (‘Information’ is used loosely here, to refer to the output of relevant neural systems, rather than in its technical sense of entropy.) In general, the value of µ(t) is assumed to reflect the results of an ongoing matching operation between perceptual and memorial representations of a stimulus. In Fig. 1, the encoding function arises as the difference between a time-varying stimulus representation (the continuous curve) and a sensory referent or ‘mental standard,’ c (the dashed line), which results in a function that takes on both positive and negative values. Positive values are evidence for one response, r_A; negative values are evidence for the alternative response, r_B. In classical sequential-sampling models, the encoding function is assumed to be constant for a given experimental trial (µ(t) = µ), but in general dynamic models it may vary with time. At each instant, t, the encoding function is additively perturbed by noise, W(t). In diffusion process models, the noise is assumed to be white noise, that is, broad-spectrum (delta correlated) Gaussian noise. To make a decision, successive values of the resulting noisy comparison process are integrated, as shown in the box on the right of Fig. 1. The value of the integral as a function of time is a stochastic process, X(t), which represents the accrued stimulus information at time t. The process of accrual continues until X(t) exits from the region bounded by two absorbing barriers a_A(t) and a_B(t). If, at the time of the first boundary crossing, X(t) ≥ a_A(t), then response r_A is made; if X(t) ≤ a_B(t), then response r_B is made.
Psychologically, the time of the first boundary crossing is identified with the decision time component of RT, and the probability that the first boundary crossed is a_A(t) is identified with the proportion of r_A responses. Although the figure shows a_A(t) and a_B(t) as constant, in general dynamic models the absorbing boundaries and the encoding function may all depend on time. Such variations may be useful when modeling the response to external time constraints, such as RT deadlines.

Figure 1 Dynamic diffusion process model showing cognitive operations intervening between presentation of an external stimulus (s) and production of a binary identification response (r)

1.1 Stochastic Differential Equations for Information Accrual
In general, the process of information accrual in stochastic dynamic models can be described by a linear SDE of the form

\[ dX(t) = [\mu(t) + b(t)X(t)]\,dt + [\sigma(t) + c(t)X(t)]\,dB(t) \tag{1} \]
This Eqn. defines $dX(t)$, the random change in the accrual process $X(t)$ in a small time interval $dt$. The change consists of two parts, one deterministic, the other stochastic. The deterministic change, or drift, is expressed by the quantity $\mu(t) + b(t)X(t)$. The first term, $\mu(t)$, which is purely time dependent, is determined by the temporal properties of the encoding process. The second term, $b(t)X(t)$, expresses a joint dependency on time and on the current (random) state. Pure state dependency arises in the special case in which $\mu(t) = 0$ and $b(t)$ is constant. The stochastic part of the change, the so-called diffusion coefficient, is represented by the second term on the right of Eqn. 1. Stochastic perturbation is introduced via the white noise process, $dB(t)$, which is the formal derivative of the Brownian motion or Wiener diffusion process. The stochastic perturbation at any instant can also be represented as the sum of two terms, one purely time dependent, $\sigma(t)$, and another, $c(t)X(t)$, that is jointly dependent on time and state. Under appropriate regularity conditions, the process $X(t)$ that solves Eqn. 1 is a diffusion process, that is, a continuous sample path Markov process. Solutions to linear stochastic differential equations of the form of Eqn. 1 are obtained by using the Itô calculus (e.g., Karatzas and Shreve 1991).

In applications, a number of special cases of Eqn. 1 are of interest. These are instances of the case $c(t) = 0$, in which there is no state dependence in the diffusion coefficient. Under these circumstances, the solution to Eqn. 1 is a Gaussian process, $X(t)$, whose distribution at time $t$ can be characterized uniquely by its first two moments. By means of the Itô calculus it may be shown that, for a process with initial condition $X(\tau) = y$, the value of $X(t)$ is normally distributed with mean

$$E[X(t) \mid X(\tau) = y] = \int_{\tau}^{t} \exp\left[\int_{s}^{t} b(u)\,du\right] \mu(s)\,ds + y \exp\left[\int_{\tau}^{t} b(u)\,du\right] \tag{2}$$

and variance

$$\mathrm{var}[X(t) \mid X(\tau) = y] = \int_{\tau}^{t} \exp\left[2\int_{s}^{t} b(u)\,du\right] \sigma^{2}(s)\,ds. \tag{3}$$
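As a check on Eqns. 2 and 3, the mean and variance can be evaluated by numerical quadrature and compared with the familiar closed forms for a leaky (Ornstein-Uhlenbeck) accumulator with $b(t) = -\gamma$ and constant $\mu$ and $\sigma$. The sketch below is illustrative only: the helper names, the trapezoidal rule, and the parameter values are assumptions made for the example.

```python
import math

def integrate(f, lo, hi, n=200):
    """Composite trapezoidal rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

def mean_eq2(mu, b, y, tau, t):
    """Mean of X(t) given X(tau) = y, following Eqn. 2 (mu and b are functions)."""
    growth = lambda s: math.exp(integrate(b, s, t))
    return integrate(lambda s: growth(s) * mu(s), tau, t) + y * growth(tau)

def var_eq3(sigma, b, tau, t):
    """Variance of X(t) given X(tau) = y, following Eqn. 3."""
    return integrate(
        lambda s: math.exp(2.0 * integrate(b, s, t)) * sigma(s) ** 2, tau, t
    )

if __name__ == "__main__":
    gamma, mu0, sigma0, y, t = 2.0, 1.0, 0.5, 0.3, 1.0
    m = mean_eq2(lambda s: mu0, lambda u: -gamma, y, 0.0, t)
    v = var_eq3(lambda s: sigma0, lambda u: -gamma, 0.0, t)
    # Closed forms for the constant-coefficient OU case, for comparison.
    m_exact = y * math.exp(-gamma * t) + (mu0 / gamma) * (1 - math.exp(-gamma * t))
    v_exact = sigma0 ** 2 / (2 * gamma) * (1 - math.exp(-2 * gamma * t))
    print(m, m_exact, v, v_exact)
```

Setting `b` to the zero function recovers the Wiener case, in which the mean grows as the integral of $\mu(s)$ and the variance as the integral of $\sigma^2(s)$.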
Eqns. 2 and 3 provide the basis for a dynamic signal-detection model that generalizes the classical model of Signal Detection Theory. It extends in a straightforward way to multiple correlated dimensions, to give a dynamic formulation of the General Recognition Theory of Ashby and Townsend (1986). To date, the most important subcases of the Gaussian process in Eqns. 2 and 3 are those in which $\sigma(t) = \sigma$ (constant) and $b(t) = 0$ or $b(t) = -\gamma$. These subcases represent, respectively, dynamic versions of the Wiener and Ornstein-Uhlenbeck (OU) diffusion processes. From a psychological perspective, the Wiener and OU processes may be thought of as describing the dynamics of stochastic accrual processes in which information integration is perfect, or ‘leaky,’ respectively.

1.2 Solution of First Passage Time Problems for Time-inhomogeneous Diffusions by Numerical Integral Equations

In models of RT such as the one shown in Fig. 1, the quantities of interest are the times required for $X(t)$ to first cross one of the two absorbing boundaries, which represent decision criteria for the responses $r_A$ and $r_B$. The distributions of these times are obtained by solving the first passage time problem for $X(t)$. Classical methods for solving first passage time problems treat the problem as a boundary value problem in the theory of partial differential equations (e.g., Ratcliff 1978). However, these methods are usually not tractable when the drift term in Eqn. 1 is not time-homogeneous. Instead, such problems can often be solved using numerical integral equation techniques that were developed in mathematical biology to model the properties of integrate-and-fire neurons. These methods yield integral equation representations for $g_A[a_A(t), t \mid x_0, t_0]$ and $g_B[a_B(t), t \mid x_0, t_0]$, the first passage time densities of the process $X(t)$ through the absorbing boundaries $a_A(t)$ and $a_B(t)$, given the initial condition $X(t_0) = x_0$.
To give a formal characterization of these methods, we define the transition density of $X(t)$, the unconstrained diffusion process, by

$$f(x, t \mid y, \tau) = \frac{\partial}{\partial x} P[X(t) \le x \mid X(\tau) = y].$$

That is, the transition density is the partial derivative with respect to the spatial coordinate of the transition distribution $P[X(t) \le x \mid X(\tau) = y]$. Classically, the transition distribution is obtained by solving the partial differential equation that is variously known as the Kolmogorov forward equation or Fokker-Planck equation, with specified initial condition. For the special Gaussian case $c(t) = 0$ of Eqn. 1, it may be obtained directly from the solution given by the Itô calculus, namely,

$$P[X(t) \le x \mid X(\tau) = y] = \Phi\left\{ \frac{x - E[X(t) \mid X(\tau) = y]}{\sqrt{\mathrm{var}[X(t) \mid X(\tau) = y]}} \right\},$$

where $\Phi$ is the normal distribution function and where the moments are given by Eqns. 2 and 3. For an arbitrary diffusion process, which may be both temporally and spatially inhomogeneous, the first passage time densities $g_A[a_A(t), t \mid x_0, t_0]$ and $g_B[a_B(t), t \mid x_0, t_0]$ satisfy the Fortet eqns.

$$f[a_A(t), t \mid x_0, t_0] = \int_{t_0}^{t} g_A[a_A(\tau), \tau \mid x_0, t_0]\, f[a_A(t), t \mid a_A(\tau), \tau]\, d\tau + \int_{t_0}^{t} g_B[a_B(\tau), \tau \mid x_0, t_0]\, f[a_A(t), t \mid a_B(\tau), \tau]\, d\tau \tag{4a}$$

and

$$f[a_B(t), t \mid x_0, t_0] = \int_{t_0}^{t} g_A[a_A(\tau), \tau \mid x_0, t_0]\, f[a_B(t), t \mid a_A(\tau), \tau]\, d\tau + \int_{t_0}^{t} g_B[a_B(\tau), \tau \mid x_0, t_0]\, f[a_B(t), t \mid a_B(\tau), \tau]\, d\tau. \tag{4b}$$

These eqns. express the unknown first passage time densities as functions of the free transition density $f(x, t \mid y, \tau)$ of $X(t)$. They are obtained by considering the sample paths of $X(t)$ that pass through one of two small open intervals $a_A^{\varepsilon}(t) = (a_A(t), a_A(t) + \varepsilon)$ and $a_B^{\varepsilon}(t) = (a_B(t), a_B(t) - \varepsilon)$, on the upper and lower absorbing barriers, respectively, at time $t$. These intervals are both in the absorbing region for $X(t)$, so any sample path of this kind must have made at least one boundary crossing at some time $\tau$, $\tau < t$. It may therefore be decomposed into an initial segment representing a transition from the starting point $x_0$ to the point of the first boundary crossing at $a_A(\tau)$ or $a_B(\tau)$, followed by a second segment representing the subsequent transition into $a_A^{\varepsilon}(t)$ or $a_B^{\varepsilon}(t)$. Because $X(t)$ is a Markov process, the probability density for such sample paths, obtained by shrinking $\varepsilon$ to zero, may be decomposed into a product of two density functions. One describes the first passage through an absorbing boundary at $\tau$; the other describes the subsequent free transition (irrespective of the number of intervening boundary crossings) to $a_A(t)$ or $a_B(t)$. Equations 4a and 4b provide a mutually exclusive and exhaustive decomposition of the probability densities of sample paths of this kind.

Formally, Eqns. 4a and 4b are Volterra integral equations of the first kind, which can be solved analytically only in special cases. While the general case can in principle be handled numerically, by approximating the integrals with sums, any simple method of approximation will be numerically unstable, because as the interval of approximation, $h$, becomes small, the transition density $f(x, t \mid y, t - h)$ approaches a Dirac delta function. In the terminology of integral equation theory, the kernel of the equation is said to be singular. Consequently, recent theoretical work has sought to find ways to transform the eqn. to remove the singularity. It was shown by Buonocore et al. (1990) and subsequently, in somewhat greater generality, by Gutiérrez Jáimez et al. (1995), that for a large class of diffusion processes, Eqns. 4a and 4b can be transformed into Volterra integral equations of the second kind of the form

$$g_A[a_A(t), t \mid x_0, t_0] = -2\Psi[a_A(t), t \mid x_0, t_0] + 2\int_{t_0}^{t} g_A[a_A(\tau), \tau \mid x_0, t_0]\, \Psi[a_A(t), t \mid a_A(\tau), \tau]\, d\tau + 2\int_{t_0}^{t} g_B[a_B(\tau), \tau \mid x_0, t_0]\, \Psi[a_A(t), t \mid a_B(\tau), \tau]\, d\tau \tag{5a}$$

and

$$g_B[a_B(t), t \mid x_0, t_0] = 2\Psi[a_B(t), t \mid x_0, t_0] - 2\int_{t_0}^{t} g_A[a_A(\tau), \tau \mid x_0, t_0]\, \Psi[a_B(t), t \mid a_A(\tau), \tau]\, d\tau - 2\int_{t_0}^{t} g_B[a_B(\tau), \tau \mid x_0, t_0]\, \Psi[a_B(t), t \mid a_B(\tau), \tau]\, d\tau. \tag{5b}$$
These eqns. express the first passage time densities at time $t$ as a function jointly of their values at all preceding times, $\tau < t$, and of a kernel function $\Psi(x, t \mid y, \tau)$. The latter may be chosen in such a way that its value goes to zero as $\tau$ approaches $t$, which means that, unlike Eqns. 4a and 4b, the right-hand sides of Eqns. 5a and 5b can be approximated numerically, with finite sums. A full account of this method, including details of how to obtain the kernel functions for diffusions such as the Wiener and OU processes, may be found in Smith (2000).
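To show how second-kind equations of this type are approximated with finite sums, the sketch below treats only the simplest special case, which is an assumption of the example and not the general method: a Wiener process with constant drift and constant barriers, started at time zero. In that case a renewal argument gives a pair of second-kind Volterra equations in which the forcing term and the kernel are both single-barrier first passage densities, and the kernel vanishes as the lag goes to zero. The general kernel construction is the one referenced in the text; all function names here are illustrative.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def first_passage_densities(mu, sigma, a_up, a_lo, x0=0.0, h=0.01, t_max=6.0):
    """Two-barrier first passage densities on the grid t_k = k*h (k = 0..n)."""
    n = int(round(t_max / h))
    var = sigma ** 2

    def phi(d, s, drift):
        # Single-barrier first passage density at lag s for a level a distance
        # d > 0 away in the direction in which the drift is measured.
        return (d / s) * normal_pdf(d, drift * s, var * s)

    width = a_up - a_lo
    # Kernels depend only on the lag because drift and barriers are constant.
    to_up = [0.0] + [phi(width, m * h, mu) for m in range(1, n + 1)]
    to_lo = [0.0] + [phi(width, m * h, -mu) for m in range(1, n + 1)]
    g_up = [0.0] * (n + 1)
    g_lo = [0.0] * (n + 1)
    for k in range(1, n + 1):
        t = k * h
        # Paths that touched the other barrier first are subtracted.
        s_up = sum(g_lo[j] * to_up[k - j] for j in range(1, k))
        s_lo = sum(g_up[j] * to_lo[k - j] for j in range(1, k))
        g_up[k] = phi(a_up - x0, t, mu) - h * s_up
        g_lo[k] = phi(x0 - a_lo, t, -mu) - h * s_lo
    return g_up, g_lo

def choice_probabilities(mu, sigma, a, h=0.01, t_max=6.0):
    g_up, g_lo = first_passage_densities(mu, sigma, a, -a, h=h, t_max=t_max)
    return h * sum(g_up), h * sum(g_lo)

if __name__ == "__main__":
    p_up, p_lo = choice_probabilities(1.0, 1.0, 1.0)
    print(p_up, p_lo, 1.0 / (1.0 + math.exp(-2.0)))  # last value: closed form
```

Because both the densities at time zero and the kernel at zero lag are zero, the plain sum used here coincides with the trapezoidal rule. The work grows with the square of the number of grid points, which is the usual cost of stepping a Volterra equation forward in time.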
1.3 Finite-state Markov Chain Approximations

Another approach to modeling decision-making using diffusion processes has been used by Busemeyer and Townsend (1992) and collaborators, in which the process is approximated numerically as the limit of a finite-state Markov chain. Computationally efficient solutions to the first passage time problem for such chains can be obtained using spectral methods when the process is time-homogeneous, as in the case of the OU process with a constant comparison function, $\mu(t) = \mu$. More recently, Diederich (1997) has extended this method to the case in which the encoding function is only piecewise time-homogeneous, that is, $\mu(t) = \mu_i$, $i = 1, 2, \ldots$, where the index $i$ denotes one of a sequence of consecutive, nonoverlapping observation intervals. This describes a situation in which the encoding function undergoes a sequence of instantaneous shifts during the course of an experimental trial. As such, it resembles models previously considered for piecewise random walks and Wiener processes by Ratcliff (1980) and Heath (1981), respectively. Its main application has been to model shifts in attention between stimulus attributes in cognitive decision tasks, in settings in which the time scale of the decision is long relative to that of the attention shifts. In tasks in which decisions are made more rapidly, as is the case in simple information-processing tasks, it may be more relevant to represent shifts in attention as progressive rather than instantaneous events. The dynamics of such shifts can then be modeled in continuous time using the techniques of the preceding paragraphs.
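The idea of replacing the diffusion by a finite-state chain can be sketched as follows. The example is an illustration under stated assumptions, not the cited authors' implementation: it uses a Wiener process with constant drift, a nearest-neighbour random walk whose step size and upward probability are chosen to match the drift and variance per time step, and plain iteration of the transition probabilities rather than the spectral methods mentioned above.

```python
import math

def chain_first_passage(mu, sigma, a, h=0.0025, t_max=10.0):
    """Random-walk approximation to a Wiener process absorbed at +a and -a.

    Returns (P(upper barrier), P(lower barrier), mean absorption time).
    """
    step = sigma * math.sqrt(h)              # spatial grid spacing
    m = int(round(a / step))                 # grid steps from start to a barrier
    p_up = 0.5 * (1.0 + mu * math.sqrt(h) / sigma)
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("h is too large for this drift")
    size = 2 * m - 1                         # transient states -(m-1) .. (m-1)
    prob = [0.0] * size
    prob[m - 1] = 1.0                        # start at the midpoint
    absorbed_up = absorbed_lo = mean_time = 0.0
    for n in range(1, int(round(t_max / h)) + 1):
        up_now = prob[size - 1] * p_up       # mass crossing the upper barrier
        lo_now = prob[0] * (1.0 - p_up)      # mass crossing the lower barrier
        new = [0.0] * size
        for i in range(size):
            if i + 1 < size:
                new[i + 1] += prob[i] * p_up
            if i - 1 >= 0:
                new[i - 1] += prob[i] * (1.0 - p_up)
        absorbed_up += up_now
        absorbed_lo += lo_now
        mean_time += n * h * (up_now + lo_now)
        prob = new
    return absorbed_up, absorbed_lo, mean_time

if __name__ == "__main__":
    print(chain_first_passage(1.0, 1.0, 1.0))
```

A piecewise constant encoding function could be accommodated in the same loop by switching `p_up` at the interval boundaries, which is the kind of extension the text attributes to Diederich (1997).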
2. Poisson Counter Models

Figure 2 shows an example of the second class of stochastic, dynamic models, the Poisson parallel-counter model. This model consists of three main stages: an encoding stage, a point-process generation stage, and a stochastic accrual stage. The output of the encoding stage is a function $\mu(t)$ that describes the time course of the stimulus representation. This is compared simultaneously to the mental representations of the stimuli that correspond to the response alternatives $r_A$ and $r_B$. The result of this comparison is a pair of independent point processes, $X_A(t)$ and $X_B(t)$, whose intensities reflect the degree of the match between the stimulus and the associated mental representations. In the classical model, these stochastic comparison processes are assumed to be homogeneous Poisson processes. More generally, however, the intensities of these processes may be arbitrary functions of time, denoted $a(t)$ and $b(t)$. Under these circumstances, $X_A(t)$ and $X_B(t)$ become time-inhomogeneous. These processes drive a pair of counting processes, $N_A(t)$ and $N_B(t)$, which count the number of point events in $X_A(t)$ and $X_B(t)$, respectively. The accrual of counts continues in this way until the total in one or another counter reaches an associated critical count, specifically, until $N_A(t) \ge K_A$ or $N_B(t) \ge K_B$. At this point, accrual ceases and the response associated with the counter in question is given. The RT density functions for this model are determined by the integrated event rate functions $A(t)$ and $B(t)$, which are defined as

$$A(t) = \int_{0}^{t} a(s)\,ds, \qquad B(t) = \int_{0}^{t} b(s)\,ds.$$

The first passage time densities $g_A(t)$ and $g_B(t)$ may then be written:

$$g_A(t) = \frac{[A(t)]^{K_A - 1}\, a(t)\, e^{-A(t)}}{(K_A - 1)!} \left\{ \sum_{j=0}^{K_B - 1} \frac{[B(t)]^{j}\, e^{-B(t)}}{j!} \right\} \tag{6a}$$

and

$$g_B(t) = \frac{[B(t)]^{K_B - 1}\, b(t)\, e^{-B(t)}}{(K_B - 1)!} \left\{ \sum_{j=0}^{K_A - 1} \frac{[A(t)]^{j}\, e^{-A(t)}}{j!} \right\}, \tag{6b}$$

respectively. As with the time-inhomogeneous diffusion process, closed-form expressions for choice probabilities and mean RT cannot in general be obtained for this model, although these quantities may always be computed from the RT density functions by numerical integration. However, for some special cases that are of interest in applications, closed-form expressions for these statistics are readily obtainable. Details may be found in Smith and Van Zandt (2000).
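The numerical integration mentioned above can be sketched directly from Eqns. 6a and 6b. The code below is a minimal illustration: the intensities are passed as ordinary functions, the integrated rates are accumulated with the trapezoidal rule, and the grid spacing and time horizon are hypothetical choices for the example.

```python
import math

def prob_fewer_than(x, k):
    """P(N < k) for a Poisson count with mean x (the bracketed sums in 6a, 6b)."""
    return sum(x ** j * math.exp(-x) / math.factorial(j) for j in range(k))

def kth_event_density(x, rate, k):
    """Density of the time of the k-th event, given integrated rate x."""
    return x ** (k - 1) * rate * math.exp(-x) / math.factorial(k - 1)

def counter_densities(a, b, k_a, k_b, t_max=20.0, h=0.001):
    """Evaluate g_A and g_B on the grid 0, h, 2h, ..., t_max."""
    n = int(round(t_max / h))
    big_a = big_b = 0.0                      # integrated rates A(t) and B(t)
    g_a, g_b = [], []
    for i in range(n + 1):
        t = i * h
        if i > 0:
            big_a += 0.5 * h * (a(t - h) + a(t))
            big_b += 0.5 * h * (b(t - h) + b(t))
        g_a.append(kth_event_density(big_a, a(t), k_a) * prob_fewer_than(big_b, k_b))
        g_b.append(kth_event_density(big_b, b(t), k_b) * prob_fewer_than(big_a, k_a))
    return g_a, g_b

def trapezoid(values, h):
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

if __name__ == "__main__":
    h = 0.001
    g_a, g_b = counter_densities(lambda t: 3.0, lambda t: 1.0, 2, 2, h=h)
    print(trapezoid(g_a, h), trapezoid(g_b, h))
```

For the constant rates used in the demonstration, the probability of an $r_A$ response has a closed form, $p^2(1 + 2q)$ with $p = 3/4$ and $q = 1/4$, which gives a convenient check on the quadrature.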
Figure 2 Poisson parallel-counter model with time-inhomogeneous intensities

3. Models with Multivariate Diffusion Processes
The preceding sections described the simplest versions of the main classes of stochastic dynamic models. However, other alternatives are possible, for example, models in which several diffusion processes race one another in parallel, as first proposed by Ratcliff (1978). Subsequent authors have considered general dynamic models in a similar vein, such as the model of simple RT described by Smith (1995), which incorporates both time inhomogeneity and state dependence. Provided the constituent processes themselves are independent, such models may be analyzed straightforwardly using the methods of Sect. 1. Usually, however, considerable complexity accompanies any attempt to relax the independence assumption in a useful way. When the processes are not independent, their combined behavior must be modeled as a multivariate diffusion process with specified covariance properties. Distributional properties for a fairly wide variety of correlated diffusion processes may be obtained by solving the multivariate analogue of Eqn. 1 using the Itô calculus. This suffices to give the time-dependent mean and covariance function of the process, unconstrained by absorbing barriers, which provides the basis for a dynamic signal-detection model. For models of RT, however, it is usually necessary to solve the associated first passage time problem in multivariate form. One exception to this prescription is the stochastic version of General Recognition Theory developed by Ashby (2000), which seeks to characterize how people make binary classification judgments on stimuli defined in a multivariate perceptual space. In this model, the multivariate problem can be reduced to a univariate problem because, geometrically, all the information relevant to the decision is contained in the projection of the diffusion process onto the normal to the hyperplane dividing the response categories. However, in models not possessing this structure it is necessary to solve the multivariate first passage time problem explicitly.
Unfortunately, there are no general methods for solving such problems and the fragmentary results that have been obtained in special cases are usually not applicable to behavioral data. From a modeling perspective, this represents a significant limitation, because in some domains, models in which there is crosstalk between channels are of considerable theoretical interest. In these cases, because of the difficulties in obtaining relevant results in other ways, the properties of the associated models have been investigated using Monte Carlo simulation. A recent example of the use of simulation methods to investigate stochastic dynamic models is the work of Wenger and Townsend (2000).
4. Conclusions and Future Directions

This article has surveyed the mathematical methods used to obtain RT and accuracy predictions for the two main classes of continuous-time, dynamic models used to model decision-making in cognitive tasks: diffusion process models and Poisson parallel-counter models. The attraction of such models is that they allow richer and more complex ideas about the neural representation of stimuli to be embodied in mathematical models. The disadvantage of this additional freedom is that the resulting models may be more difficult to test empirically than their classical counterparts. For this reason, future developments in this area are likely to be in domains in which the dynamic characteristics of a model are strongly constrained by existing theory. Such constraints are needed to ensure that models are expressions of theoretical principles rather than simply vehicles for fitting data. Areas with a body of theory that is sufficiently well articulated to support models of this kind include early vision, mathematical studies of memory, and selective attention.

See also: Majorization and Stochastic Orders; Neural Systems and Behavior: Dynamical Systems Approaches; Self-organizing Dynamical Systems; Stochastic Models
Bibliography

Ashby F G 2000 A stochastic version of general recognition theory. Journal of Mathematical Psychology 44: 310–29
Ashby F G, Townsend J T 1986 Varieties of perceptual independence. Psychological Review 93: 154–79
Buonocore A, Giorno V, Nobile A G, Ricciardi L 1990 On the two-boundary first-crossing-time problem for diffusion processes. Journal of Applied Probability 27: 102–14
Busemeyer J, Townsend J T 1992 Fundamental derivations from decision field theory. Mathematical Social Sciences 23: 255–82
Busemeyer J, Townsend J T 1993 Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment. Psychological Review 100: 432–59
Diederich A 1995 Intersensory facilitation of reaction time: Evaluation of counter and diffusion coactivation models. Journal of Mathematical Psychology 39: 197–215
Diederich A 1997 Dynamic stochastic models for decision making under time constraints. Journal of Mathematical Psychology 41: 260–74
Gutiérrez Jáimez R, Román Román P, Torres Ruiz F 1995 A note on the Volterra integral equation for the first-passage-time probability density. Journal of Applied Probability 32: 635–48
Heath R A 1981 A tandem random walk for psychological discrimination. British Journal of Mathematical and Statistical Psychology 34: 76–92
Heath R A 1992 A general nonstationary diffusion model for two choice decision making. Mathematical Social Sciences 23: 283–309
Karatzas I, Shreve S E 1991 Brownian Motion and Stochastic Calculus. Springer-Verlag, New York
Luce R D 1986 Response Times: Their Role in Inferring Elementary Mental Organization. Oxford University Press, New York
Pacut A 1976 Some properties of threshold models of reaction latency. Biological Cybernetics 28: 63–72
Ratcliff R 1978 A theory of memory retrieval. Psychological Review 85: 59–108
Ratcliff R 1980 A note on modeling accumulation of information when the rate of accumulation changes over time. Journal of Mathematical Psychology 21: 178–84
Rudd M E, Brown L G 1997 A model of Weber and noise gain control in the retina of the toad Bufo marinus. Vision Research 37: 2433–53
Smith P L 1995 Psychophysically-principled models of visual simple reaction time. Psychological Review 102: 567–93
Smith P L 2000 Stochastic dynamic models of response time and accuracy: A foundational primer. Journal of Mathematical Psychology 44: 408–63
Smith P L, Van Zandt T 2000 Time-dependent Poisson counter models of response latency in simple judgment. British Journal of Mathematical and Statistical Psychology 53: 293–315
Townsend J T, Ashby F G 1983 The Stochastic Modeling of Elementary Psychological Processes. Cambridge University Press, Cambridge, UK
Wenger M J, Townsend J T 2000 Basic response time tools for studying general processing capacity in attention, perception and cognition. Journal of General Psychology 127: 67–99
P. L. Smith
Stochastic Models

The simplest stochastic experiment is coin-toss. Each toss of a fair coin has two possible results and each of these results has probability of one half. This experiment is mathematically modeled with a random variable. A random variable is characterized by a state space and a probability distribution; in coin-toss the state space is {head, tail} and the distribution is ‘probability of head = 1/2,’ ‘probability of tail = 1/2’. This paper deals essentially with three issues.

The first one is a description of families of independent random variables and their properties. The law of large numbers describes the behavior of averages of a large number of independent and identically distributed variables; for instance, it is ‘almost sure’ that in a million coin-tosses the number of heads will fall between 490 thousand and 510 thousand. The central limit theorem describes the fluctuations of the averages around their expected value. The Poisson approximation describes the behavior of a large number of random variables assuming values in {0, 1}, each one with small probability of being 1.

The second issue is phase transition and is discussed in the setup of percolation and random graph theory. For example, consider that each $i$ belonging to a bidimensional integer lattice (that is, $i$ has two coordinates, $i = (i_1, i_2)$, each coordinate belonging to the set of integer numbers {…, −1, 0, 1, …}) is painted black with probability $p$ and white otherwise, independently of everything else. Is there an infinite black path starting at the origin of the lattice? When the answer is yes it is said that there is percolation. A small variation on the probability $p$ may produce the transition from absence to presence of percolation.

The third question is related to dynamics. Markov chains describe the behavior of a family of random variables indexed by time in {0, 1, 2, …} for which the probabilistic law of the next variable depends only on the result of the current one. Problems for these chains include the long time behavior, law of large numbers and central limit theorems. Two examples of Markov chains are discussed. In the birth-and-death process at each unit of time a new individual is born or a present individual dies, the probabilities of such events being dependent on the current number of individuals in the population. In a branching process each new individual generates a family that grows and dies independently of the other families.

Special attention is given to interacting particle systems; the name refers to the time evolution of families of processes for which the updating of each member of the family depends on the values of the other members. Two examples are given. The simple exclusion process describes the evolution of particles jumping in a lattice subject to an exclusion rule (at most one particle is allowed in each site of the lattice). The study of these systems at long times in large regions is called the hydrodynamic limit. In particular, when the jumps are symmetric the hydrodynamic limit relates the exclusion process with a partial differential equation called the heat equation, while when the jumps are not symmetric, the limiting equation is a one-conservation law called the Burgers equation. The voter model describes the evolution of a population of voters in which each voter updates his opinion according to the opinion of the neighboring voters. The updating rules and the dimension of the space give rise to either unanimity or coexistence of different opinions.

The stochastic models discussed in this article focus on two commonly observed phenomena: (a) a large number of random variables may have deterministic behavior and (b) local interactions may have global effects.
1. Independent Random Variables

The possible outcomes of the toss of a coin and their respective probabilities can be represented by a random variable. A random variable $X$ is characterized by a set $S$ of possible outcomes and the probability distribution or law of these outcomes: for each $x$ in $S$, $\mathbb{P}(X = x)$ denotes the probability that the outcome is $x$. In coin-toss the set $S$ is either {head, tail} or {0, 1} and the distribution is given by $\mathbb{P}(X = 0) = \mathbb{P}(X = 1) = 1/2$. The mean or expected value $\mathbb{E}(X)$ of a random variable $X$ is the sum of the possible outcomes multiplied by the probability of the outcome: $\mathbb{E}(X) = \sum_x x\,\mathbb{P}(X = x)$. The variance $\mathbb{V}(X)$ describes the square dispersion of the variable around the mean:

$$\mathbb{V}(X) = \sum_x (x - \mathbb{E}X)^2\, \mathbb{P}(X = x). \tag{1}$$
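The definitions of mean and variance translate directly into a few lines of code. In this small sketch a law is represented, as an assumption of the example, by a dictionary that maps each outcome to its probability; exact fractions are used so that the results are not blurred by rounding.

```python
from fractions import Fraction

def mean(law):
    """Expected value: sum of outcomes weighted by their probabilities."""
    return sum(x * p for x, p in law.items())

def variance(law):
    """Square dispersion around the mean, as in Eqn. 1."""
    m = mean(law)
    return sum((x - m) ** 2 * p for x, p in law.items())

coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
die = {face: Fraction(1, 6) for face in range(1, 7)}

if __name__ == "__main__":
    print(mean(coin), variance(coin))   # 1/2 1/4
    print(mean(die), variance(die))     # 7/2 35/12
```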
A process is a family of random variables labeled by some set. Independent random variables are characterized by the fact that the outcome of one variable does not change the law of the others. For example, the random variables describing the outcomes of different coins are independent. The random variables describing the parity and the value of the face of a die are not independent: if one knows that the parity is even, then the face can only be 2, 4, or 6.

1.1 Law of Large Numbers and Central Limit Theorem

A sequence of independent random variables is the most unpredictable probabilistic object. However, questions involving a large number of variables get almost deterministic answers. Consider a sequence of independent and identically distributed random variables $X_1, X_2, \ldots$ with common mean $\mu$ and finite variance $\sigma^2$. The partial sum of the first $n$ variables is defined by $S_n = X_1 + \cdots + X_n$ and the empirical mean of the first $n$ outcomes by $S_n/n$. The expectation and the variance of $S_n$ are given by $\mathbb{E}(S_n) = n\mu$ and $\mathbb{V}(S_n) = n\sigma^2$. The law of large numbers says that as $n$ increases, the random variable $S_n/n$ is very close to $\mu$. More rigorously, for any value $\varepsilon > 0$, the probability that $S_n/n$ differs from its expectation by more than $\varepsilon$ can be made arbitrarily close to zero by making $n$ sufficiently large. The Central Limit Theorem says that the distance between $S_n$ and $n\mu$ is of the order $\sqrt{n}$. More precisely, it gives the asymptotic law of the difference when $n$ goes to $\infty$:

$$\lim_{n \to \infty} \mathbb{P}\left( \frac{S_n - n\mu}{\sqrt{n}\,\sigma} \le a \right) = \Phi(a), \tag{2}$$

where $\Phi$ is the normal cumulative distribution function defined by

$$\Phi(a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-x^2/2}\,dx. \tag{3}$$
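Both statements can be checked empirically for coin-toss. The sketch below is an illustration only: it draws many fair tosses at once by counting the one-bits of a random integer (a shortcut chosen for speed), compares the empirical frequency of the event in Eqn. 2 with $\Phi(a)$, and counts heads in a million tosses. The sample sizes and seeds are arbitrary choices.

```python
import math
import random

def heads_in(n, rng):
    """Number of heads in n fair coin tosses (one-bits of an n-bit integer)."""
    return bin(rng.getrandbits(n)).count("1")

def normal_cdf(a):
    """Phi(a) of Eqn. 3, written with the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def clt_frequency(n=10_000, reps=2_000, a=1.0, seed=0):
    """Empirical frequency of (S_n - n*mu) / (sqrt(n)*sigma) <= a."""
    rng = random.Random(seed)
    mu, sigma = 0.5, 0.5                     # mean and std. dev. of one toss
    hits = 0
    for _ in range(reps):
        s_n = heads_in(n, rng)
        if (s_n - n * mu) / (math.sqrt(n) * sigma) <= a:
            hits += 1
    return hits / reps

if __name__ == "__main__":
    print(clt_frequency(), normal_cdf(1.0))
    print(heads_in(1_000_000, random.Random(0)))
```

For a million tosses the standard deviation of the number of heads is 500, so the interval from 490 thousand to 510 thousand quoted in the introduction spans twenty standard deviations on either side of the mean.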
1.2 Law of Small Numbers and Poisson Processes

Consider a positive real number $\lambda$ and a double labeled sequence $X(i, n)$ of variables with law $\mathbb{P}[X(i, n) = 1] = \lambda/n$, $\mathbb{P}[X(i, n) = 0] = 1 - \lambda/n$; $n$ is a positive natural number, that is, in the set {1, 2, …}, and $i$ is in the ‘shrunken’ lattice {1/n, 2/n, …} (the nomenclature comes from the fact that as $n$ grows the distance between points diminishes). In this way, for fixed $n$ there are around $n$ random variables sitting in each interval of length one, but each variable $X(i, n)$ has a small probability of being 1 and a large probability of being 0, in such a way that the mean number of ones in the interval is around $\lambda$ for all $n$. When $n$ goes to infinity the above process converges to a Poisson process. This process is characterized by the fact that the number $N(\Lambda)$ of ones in any interval $\Lambda$ is a Poisson random variable with mean $\lambda m(\Lambda)$, where $m(\Lambda)$ is the length of $\Lambda$: $\mathbb{P}[N(\Lambda) = k] = e^{-\lambda m(\Lambda)} [\lambda m(\Lambda)]^k / k!$; furthermore, the numbers of ones in disjoint regions are independent. A bi-dimensional lattice consists of points with two coordinates $i = (i_1, i_2)$ with both $i_1$ and $i_2$ natural numbers, and analogously the points in a $d$-dimensional lattice have $d$ coordinates. When $i$ runs in a multidimensional shrunken lattice, the limit is a multidimensional Poisson process. This is called the law of small numbers.

The one-dimensional Poisson process has the property that the distances between successive ones are independent random variables with exponential distribution (it is said that $T$ has exponential distribution with parameter $\lambda$ if $\mathbb{P}(T > t) = e^{-\lambda t}$). In this paper, the one-dimensional Poisson process will be called a Poisson clock and the times of occurrence of ones will be called rings.

The classic book for coin-tossing and introductory probability models is Feller (1971). Modern approaches include Fristedt and Gray (1997), Brémaud (1999) and Durrett (1999a). Aldous (1989) describes how the Poisson process appears as the limit of many distinct phenomena.
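The law of small numbers can be seen numerically by comparing the law of the number of ones in a unit interval, which is binomial with $n$ trials and success probability $\lambda/n$, with the Poisson law of mean $\lambda$. The sketch below is illustrative; the function names and the cut-off `k_max` are assumptions of the example.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k ones among n independent variables, each 1 w.p. p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, mean):
    return math.exp(-mean) * mean ** k / math.factorial(k)

def max_gap(lam, n, k_max=20):
    """Largest pointwise gap between the two laws for k = 0..k_max."""
    return max(
        abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, max_gap(2.0, n))
```

The gap shrinks as $n$ grows, which is the convergence described above restricted to a single interval of length one.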
2. Static Models

2.1 Percolation

Suppose that each site of the two-dimensional lattice is painted black with probability $p$ and white with probability $1 - p$, independently of the other sites. There is a black path between two sites if it is possible to go from one site to the other jumping one unit at each step and always jumping on black sites. The cluster of a given site $x$ is the set of (black) sites that can be attained following black paths started at $x$. Of course, when $p$ is zero the cluster of the origin is empty and when $p$ is one, it is the whole lattice. The basic question is about the size of the cluster containing the origin. When this size is infinite, it is said that there is percolation. It has been proven that there exists a critical value $p_c$ such that when $p$ is smaller than or equal to $p_c$ there is no percolation, and when $p$ is bigger than $p_c$ there is percolation with positive probability. This is the typical situation of phase transition: a family of models depending on a continuous parameter which show a qualitative change of behavior as a consequence of a small quantitative change of the parameter. Oriented percolation occurs when the paths can go from one point to the next following only two directions; for example South–North and West–East. The standard reference for percolation is Grimmett (1999).
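An infinite cluster cannot be observed on a computer, but a finite stand-in gives a feel for the transition: paint the sites of a box centred at the origin and ask whether the black cluster of the origin touches the boundary of the box. The sketch below makes that substitution explicitly; the box radius, number of trials, and seed are arbitrary choices, and the frequencies it reports are finite-box estimates, not percolation probabilities.

```python
import random
from collections import deque

def origin_reaches_boundary(p, radius, rng):
    """Site percolation in the box [-radius, radius]^2: does the black cluster
    of the origin touch the boundary of the box?"""
    size = 2 * radius + 1
    black = [[rng.random() < p for _ in range(size)] for _ in range(size)]
    start = (radius, radius)                 # grid index of the origin
    if not black[radius][radius]:
        return False                         # empty cluster
    seen = {start}
    queue = deque([start])
    while queue:                             # breadth-first search on black sites
        i, j = queue.popleft()
        if i in (0, size - 1) or j in (0, size - 1):
            return True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if black[ni][nj] and (ni, nj) not in seen:
                seen.add((ni, nj))
                queue.append((ni, nj))
    return False

def frequency(p, radius=20, trials=200, seed=0):
    rng = random.Random(seed)
    return sum(origin_reaches_boundary(p, radius, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    for p in (0.3, 0.5, 0.6, 0.7, 0.8):
        print(p, frequency(p))
```

Running the loop at the end shows the frequency rising sharply over a narrow range of $p$, the finite-size trace of the phase transition described in the text.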
2.2 Random Graphs

Consider a set of $n$ points and a positive number $c$. Then, with probability $c/n$, connect independently each pair of points in the set. The result is a random graph with some number of connected components. Let $L_1(n)$ be the number of points in the largest connected component. It has been proven that as the number of points $n$ increases to infinity the proportion $L_1(n)/n$ of points in the largest component converges to some value $\gamma$, and that if $c > 1$ then $\gamma > 0$ and if $c < 1$ then $\gamma = 0$. This means that if the probability of connection is sufficiently large, a positive fraction of the points is in the largest component. In this case this is called the giant component. The number of points in any other component, when divided by $n$, always converges to zero; hence there is at most one giant component. The fluctuations of the giant component around $\gamma n$, the behavior of smaller components, and many other problems of random graph theory can be found in Bollobás (1998).
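The emergence of the giant component is easy to observe in simulation. The following sketch connects each pair of points independently with probability $c/n$ and tracks components with a union-find structure; the data structure, the graph size, and the seed are choices made for the illustration.

```python
import random

def largest_component_fraction(n, c, seed=0):
    """Fraction of points in the largest component of a random graph in which
    each pair of the n points is connected with probability c / n."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    p = c / n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                ri, rj = find(i), find(j)
                if ri != rj:                 # merge the smaller tree into the larger
                    if size[ri] < size[rj]:
                        ri, rj = rj, ri
                    parent[rj] = ri
                    size[ri] += size[rj]
    return max(size[find(x)] for x in range(n)) / n

if __name__ == "__main__":
    for c in (0.5, 1.0, 1.5, 2.0):
        print(c, largest_component_fraction(1000, c))
```

With 1000 points the fraction is only an approximation to the limit $\gamma$, but the contrast between values of $c$ below and above one is already clear.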
3. Markov Chains

A Markov chain is a process defined iteratively. The ingredients are a finite or countable state space $S$, a sequence of independent random variables $U_n$ uniformly distributed in the interval [0, 1], and a transition function $F$ which to each state $x$ and number $u$ in the unit interval assigns a state $y = F(x, u)$. The initial state $X_0$ is either fixed or random and the successive states of the chain are defined iteratively by $X_n = F(X_{n-1}, U_n)$. That is, the value of the process $X_n$ at each step $n$ is a function of the value $X_{n-1}$ at the preceding step and of the value of a uniform random variable $U_n$ independent of everything. In this sense a Markov chain has no memory. The future behavior of the chain given the present position does not depend on how this position was attained. Let $Q(x, y) = \mathbb{P}(F(x, U_n) = y)$, the probability that the chain at step $n$ is at state $y$ given that it is at state $x$ at time $n - 1$. Since this probability does not depend on $n$ the chain is said to be homogeneous in time. The transition matrix $Q$ is the matrix with entries $Q(x, y)$. When it is possible to go from any state to any other the chain is called irreducible, otherwise reducible. The state space of reducible chains can be partitioned in subspaces in such a way that the chain is irreducible in each part. Irreducible chains in finite state spaces stabilize in the sense that the probability of finding the chain at each state converges to a probability distribution $\pi$ that satisfies the balance equations: $\pi(y) = \sum_x \pi(x) Q(x, y)$. The measure $\pi$ is called invariant for $Q$ because if the initial state $X_0$ has law $\pi$ then the law of $X_n$ will be $\pi$ for all future times $n$. When there is a unique invariant measure $\pi$ the proportion of time the chain spends in any given state $x$ converges to $\pi(x)$ when time goes to infinity, no matter how the initial state is chosen. This is a law of large numbers for Markov chains. The expected time the chain needs to come back to $x$ when starting at $x$ at time zero is $1/\pi(x)$. When the state space has a countable (infinite) number of points it is possible that there is no invariant measure; in this case the probability of finding the chain at any given state goes to zero; that is, the chain ‘drives away’ from any fixed finite region. It is possible also to define Markov chains in noncountable state spaces, but this will not be discussed here.
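The iterative construction and the balance equations can be made concrete with a small example. The three-state transition matrix below is a hypothetical choice for illustration; the transition function inverts the cumulative row of the matrix, the invariant measure is found by repeatedly applying the balance equations, and a simulated path gives the proportion of time spent in each state.

```python
import random

Q = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]

def F(x, u):
    """Transition function: the state y with cumulative row Q(x, .) first above u."""
    total = 0.0
    for y, q in enumerate(Q[x]):
        total += q
        if u < total:
            return y
    return len(Q) - 1

def invariant(sweeps=200):
    """Iterate pi(y) = sum_x pi(x) Q(x, y), starting from the uniform law."""
    states = range(len(Q))
    pi = [1.0 / len(Q)] * len(Q)
    for _ in range(sweeps):
        pi = [sum(pi[x] * Q[x][y] for x in states) for y in states]
    return pi

def occupation(n_steps, x0=0, seed=0):
    """Proportion of time the chain X_n = F(X_{n-1}, U_n) spends in each state."""
    rng = random.Random(seed)
    counts = [0] * len(Q)
    x = x0
    for _ in range(n_steps):
        x = F(x, rng.random())
        counts[x] += 1
    return [c / n_steps for c in counts]

if __name__ == "__main__":
    print(invariant())          # close to [0.25, 0.5, 0.25]
    print(occupation(20_000))
```

For this matrix the invariant measure is (1/4, 1/2, 1/4), so the expected return time to the middle state is 2 and to either outer state is 4.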
3.1 Coupling

Two or more Markov chains can be constructed with the same uniform random variables $U_n$. This is called a coupling and it is one of the major tools in stochastic processes. Applications of coupling include proofs of convergence of a Markov chain to its invariant measure, comparisons between chains to obtain properties of one of them in terms of the other, and simulation of measures that are invariant for Markov chains. Recent books on Markov chains include Chen (1992), Biane and Durrett (1995), Fristedt and Gray (1997), Brémaud (1999), Durrett (1999a), Schinazi (1999), Thorisson (2000), Häggström (2000), and Ferrari and Galves (2000).
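A coupling of two copies of the same chain started from different states can be sketched as follows. The transition function and its thresholds are hypothetical, chosen only to show that two trajectories driven by the same uniforms stay together once they meet.

```python
import random


def F(x, u):
    """Hypothetical transition function of a two-state chain on {0, 1}.

    From state 0 the chain stays at 0 when u < 0.9; from state 1 it moves
    to 0 when u < 0.5. These numbers are illustrative only.
    """
    threshold = 0.9 if x == 0 else 0.5
    return 0 if u < threshold else 1


def coupled_paths(x0, y0, steps, rng):
    """Drive two chains with the SAME uniforms U_n and record both trajectories."""
    xs, ys = [x0], [y0]
    for _ in range(steps):
        u = rng.random()  # one uniform shared by both chains
        xs.append(F(xs[-1], u))
        ys.append(F(ys[-1], u))
    return xs, ys


def meeting_time(xs, ys):
    """First index at which the two trajectories agree, or None if they never do."""
    for n, (x, y) in enumerate(zip(xs, ys)):
        if x == y:
            return n
    return None


rng = random.Random(1)
xs, ys = coupled_paths(0, 1, steps=50, rng=rng)
t = meeting_time(xs, ys)
print("chains meet at step", t)
```

Because both chains apply the same function to the same uniform after they meet, they coincide from the meeting time on; this is the mechanism behind coupling proofs of convergence to the invariant measure.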
3.2 Examples: Birth–Death and Branching Processes

The state space of the birth–death process is the set of natural numbers $\{0, 1, \ldots\}$; the probability to jump from $x$ to $x+1$ is $p_x$, a number in $[0, 1]$, and the probability to jump from $x$ to $x-1$ is $1-p_x$. It is assumed that $p_0 = 1$, that is, the process is reflected at the origin. The transition function is given by $F(x, u) = x + H(x, u)$, with $H(x, u) = 1$ if $u \le p_x$ and $H(x, u) = -1$ if $u > p_x$. It is possible to find explicit conditions on $p_x$ that guarantee the existence of an invariant measure. For instance, if $p_x \le a$ for all $x \ge 1$ for some $a < 1/2$, then there exists an invariant measure $\pi$. The reason is that the chain has a drift towards the origin where it is reflected. On the other hand, if the drift towards the origin is not sufficiently strong, the chain may have no invariant measure.

The $(n+1)$th generation of a branching process consists of the daughters of the individuals alive in the $n$th generation:

$$Z_{n+1} = \sum_{i=1}^{Z_n} X_{n,i} \qquad (4)$$

where $Z_n$ is the total number of individuals in the $n$th generation and $X_{n,i}$ is the number of daughters of the $i$th individual of the $n$th generation. It is assumed that $Z_0 = 1$ and that the $X_{n,i}$ are independent and identically distributed random variables with values in $\{0, 1, 2, \ldots\}$. When $X_{n,i} = 0$ the corresponding individual died without progeny. The process so defined is a Markov chain in $\{0, 1, 2, \ldots\}$; the corresponding transition function $F$ can be easily computed from the distribution of $X_{n,i}$. The mean number of daughters of the typical individual, $\mu = \mathbb{E} X_{n,i}$ (a value that does not depend on $n$, $i$), is the crucial parameter to establish the behavior of the process. When $\mu > 1$ the probability that at all times there are individuals alive is positive; in this case the process is called supercritical and in case of survival the number of individuals increases geometrically with $n$. When $\mu < 1$ the process dies out with probability one (subcritical case). Guttorp (1991) discusses branching processes and their statistics.
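The dichotomy between the supercritical and subcritical cases can be seen in a short simulation of the branching recursion. This is a sketch under assumptions that are not in the text: the offspring law (0 or 2 daughters), the population cap used to declare survival, and all function names are illustrative.

```python
import random


def offspring(rng, p_two):
    """Hypothetical offspring law: 2 daughters with probability p_two, else 0."""
    return 2 if rng.random() < p_two else 0


def goes_extinct(rng, p_two, cap=200, max_generations=10_000):
    """Iterate Z_{n+1} = sum_{i=1}^{Z_n} X_{n,i} starting from Z_0 = 1.

    Returns True if the population hits 0. A population that reaches `cap`
    is treated as surviving forever (an approximation made for this sketch).
    """
    z = 1
    for _ in range(max_generations):
        if z == 0:
            return True
        if z >= cap:
            return False
        z = sum(offspring(rng, p_two) for _ in range(z))
    return z == 0


def extinction_frequency(p_two, trials, seed=0):
    """Fraction of independent runs in which the population dies out."""
    rng = random.Random(seed)
    return sum(goes_extinct(rng, p_two) for _ in range(trials)) / trials


# The mean number of daughters is mu = 2 * p_two.
supercritical = extinction_frequency(0.75, trials=2000)  # mu = 1.5
subcritical = extinction_frequency(0.25, trials=2000)    # mu = 0.5
print("mu = 1.5:", supercritical)
print("mu = 0.5:", subcritical)
```

For this offspring law with $\mu = 1.5$, the extinction probability is the smallest solution of $q = 1/4 + (3/4) q^2$, namely $q = 1/3$, so roughly one third of the supercritical runs should die out, while essentially all subcritical runs should.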
4. Interacting Particle Systems

Processes with many interacting components are called interacting particle systems. Each component has state space $S$ and $X_t(i)$ denotes the state of component $i$ at time $t$; $t$ may be discrete or continuous, and $i$ generally belongs to a $d$-dimensional lattice. The update law of each component typically depends on the state of the other components; this dependence is in general ‘local,’ that is, determined only by the configuration of a few other neighboring components. The references for interacting particle systems are Liggett (1985, 1999), Durrett (1988, 1999b), Chen (1992), Schinazi (1999), and Fristedt and Gray (1997).
4.1 The Exclusion Process

In this process $i$ belongs to the $d$-dimensional lattice and $X_t(i)$ assumes only the values 0 and 1. If $X_t(i) = 0$, site $i$ is empty at time $t$, and if $X_t(i) = 1$, site $i$ is occupied by a particle at time $t$. The process evolves as follows. Independent Poisson clocks are attached to the sites; when the clock rings at site $i$ (at most one clock rings at any given time), if there is a particle at $i$, then a site $j$ is chosen at random and the particle tries to jump to $j$. If $j$ is empty then the particle jumps; if $j$ is already occupied the jump is suppressed. The interaction between particles is embedded in the law of the jump. The simplest case is when the distribution of the jump does not depend on the configuration. In this case the interaction is just reduced to the suppression of jumps attempted to occupied sites and the motion is called the simple exclusion process. The measure obtained when a particle is put at each site with probability $\rho$ independently of the other sites is called a product measure. A surprising feature of the simple exclusion process with spatially homogeneous
jump laws is that if the initial configuration is chosen according to a product measure, then the configuration at any future time will also be distributed with this measure. In other words, product measures are time invariant for the process. Due to the conservative character of the dynamics, i.e., the initial density is conserved, the process admits a family of invariant measures; there is one invariant product measure for each density $\rho$ in $[0, 1]$. The main questions are related to the asymptotic behavior of the process as time runs. In many cases it has been proven that the process starting with an initial configuration having an asymptotic density converges as time goes to infinity to the product measure with that density. More detailed questions are related to hydrodynamic limits obtained when large space regions are observed at long times.

A particularly interesting case is the symmetric simple exclusion process, for which jumps of sizes $y$ and $-y$ have the same probability. The large space, long time behavior of the symmetric simple exclusion process is governed by the heat equation. For example, in the one-dimensional case it has been proven that the number of particles in the region $[aN, bN]$ at time $tN^2$ divided by $N$ converges to $\int_a^b u(r, t)\,dr$, where $u(r, t)$ is the solution of the heat equation

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial r^2} \qquad (5)$$

with initial condition $u(r, 0)$ (the initial particle distribution must be related to this condition in an appropriate sense). This is called a law of large numbers or hydrodynamic limit. The limiting partial differential equation represents the macroscopic behavior while the particle system captures the microscopic behavior of the system. The law of the configuration of the process in sites around $rN$ at time $tN^2$ converges as $N$ goes to infinity to the product measure with density given by the macroscopic density $u(r, t)$. This is the most disordered state (or measure), and the phenomenon is called propagation of chaos.

When the motion is not symmetric the behavior changes dramatically. The simplest case is the totally asymmetric one-dimensional nearest neighbor simple exclusion process. The particles jump in the one-dimensional integer lattice $\{\ldots, -1, 0, 1, \ldots\}$. When the Poisson clock rings at site $i$, if there is a particle at that site and no particle at $i+1$, then the particle jumps to $i+1$; otherwise nothing changes. To perform the hydrodynamic limit, time and space are rescaled in the same way: the number of particles in the region $[aN, bN]$ divided by $N$ at time $tN$ converges to $\int_a^b u(r, t)\,dr$, where $u(r, t)$ is the solution of the Burgers equation

$$\frac{\partial u}{\partial t} = -\frac{\partial [u(1-u)]}{\partial r} \qquad (6)$$

with initial condition $u(r, 0)$. Similar equations can be obtained in the $d$-dimensional case. Propagation of chaos is also present in this case. Contrary to what happens in the symmetric case, the Burgers equation develops discontinuous solutions. In particular, consider $\rho < \lambda$ and the initial condition $u(r, 0)$ equal to $\rho$ if $r < 0$ and to $\lambda$ otherwise. Then the solution $u(r, t)$ is $\rho$ if $r < (1-\rho-\lambda)t$ and $\lambda$ otherwise. The well-known phenomenon of highways occurs here: the space is divided into two sharply separated regions, one of low density and high velocity and the other of high density and low velocity. This phenomenon, first observed for the Burgers equation (Evans 1998), was then proven to hold for the microscopic system in the nearest neighbor case.

The location of the shock is identified by a perturbation of the system: start with a configuration $\eta$ coming from a product shock measure, that is, the probability of having a particle at site $i$ is $\rho$ if $i$ is negative and $\lambda$ if $i$ is positive, and put a particle at the origin. Consider another configuration $\eta'$ that coincides with $\eta$ at all sites but at the origin; $\eta'$ is a perturbation of $\eta$. Realizing two processes starting with $\eta$ and $\eta'$ respectively, using the same Poisson clocks, there will be only one difference between the two configurations at any given future time. The configuration as seen from the position of this perturbation is distributed approximately as $\eta$, that is, with a shock measure. The perturbation behaves as a second class particle; the sites that are occupied by particles of both $\eta$ and $\eta'$ are said to be ‘occupied by a first class particle,’ while the site where $\eta$ differs from $\eta'$ is said to be ‘occupied by a second class particle.’ Following the Poisson clocks, the second class particle jumps backwards when a first class particle jumps over its site and jumps forwards if the destination site is empty. See Liggett (1999).
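The shock speed $1-\rho-\lambda$ can be checked with the standard jump (Rankine–Hugoniot) condition for a conservation law. The text does not name this condition, and the flux symbol $f$ and speed symbol $v$ are introduced here only for illustration. Writing the Burgers equation as $\partial u/\partial t + \partial f(u)/\partial r = 0$ with $f(u) = u(1-u)$, a discontinuity between the left value $\rho$ and the right value $\lambda$ travels at speed

```latex
v = \frac{f(\lambda) - f(\rho)}{\lambda - \rho}
  = \frac{(\lambda - \rho) - (\lambda^2 - \rho^2)}{\lambda - \rho}
  = \frac{(\lambda - \rho)\,\bigl(1 - (\lambda + \rho)\bigr)}{\lambda - \rho}
  = 1 - \rho - \lambda .
```

For instance, $\rho = 0.2$ and $\lambda = 0.6$ give $v = 0.2$, so the front between the low-density and high-density regions moves slowly to the right.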
Special interest has been devoted to the process when the initial condition is ‘there are particles in all negative sites and no particles in the other sites.’ The solution $u(r, t)$ of the Burgers equation with this initial condition is a rarefaction front: $u(r, t)$ is one up to $-t$, zero from $t$ on, and interpolates linearly between 1 and 0 in the interval $[-t, t]$. This particular case has recently been related to models coming from other areas of applied probability, in particular to increasing subsequences in random permutations, random matrices, last passage percolation, random domino tilings, and systems of queues. Hydrodynamics of particle systems is discussed by Spohn (1991), De Masi and Presutti (1991), and Kipnis and Landim (1999).

4.2 Voter Model

In the voter model each site of a $d$-dimensional integer lattice (this is the set of $d$-tuples of values in the integer lattice $\{\ldots, -1, 0, 1, \ldots\}$) is occupied by a voter having two possible opinions: 0 and 1. At the rings of a Poisson clock each voter chooses a neighbor at random and assumes its opinion. The voter model has two trivial invariant measures: ‘all zeros’ and ‘all ones.’ Of course, if at time zero all voters have opinion zero, then at all future times they will be zero; the same occurs if all voters have opinion one at time zero. Are there other invariant measures? The answer is no for dimensions $d = 1$ and $d = 2$ and yes for dimension $d \ge 3$. The proof of this result is based on a dual process. This process arises when looking at the opinion of a voter $i$ at some time $t$. It suffices to go back in time to find the voter at time zero that originates the opinion of $i$ at time $t$. The space–time backwards paths for two voters may coalesce. In this case the two voters will have the same opinion at that time. The backwards paths are simple random walks. In one or two dimensions, nearest-neighbor random walks always coalesce, while in dimensions greater than or equal to three there is a positive probability that the walks do not coalesce. Hence there are invariant states in three or more dimensions for which different opinions may coexist.

A generalization of the voter model is the random average process. Each voter may have opinions corresponding to real numbers and, instead of looking at only one neighbor, at Poisson rings voters compute a randomly weighted average of the opinions of the neighbors and adopt it. As in the two-opinion voter model, flat configurations are invariant, but if the system starts with a tilted configuration (for example, voter $i$ has opinion $\lambda i$ for some parameter $\lambda > 0$), then the opinions evolve randomly with time. It is possible to relate the variance of the opinion of a given voter to the total amount of time spent at the origin by a symmetric random walk. This shows that the variance of the opinion of a given voter at time $t$ is of the order of $\sqrt{t}$ in dimension one, $\log t$ in dimension 2, and constant in dimensions greater than or equal to 3.
The random average process is a particular case of the linear processes studied in Liggett (1985).
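The two-opinion voter dynamics of Sect. 4.2 can be sketched in a few lines. Two simplifications are assumed here and are not part of the text: the infinite lattice is replaced by a finite one-dimensional ring, and the Poisson clocks are replaced by picking a uniformly random voter at each update. Function and variable names are illustrative.

```python
import random


def voter_model_on_ring(opinions, rng, max_updates=5_000_000):
    """Random sequential updates of the two-opinion voter model on a ring.

    At each update a uniformly chosen voter copies the opinion of its left or
    right neighbour (probability 1/2 each). Returns the final configuration
    and the number of updates performed.
    """
    state = list(opinions)
    n = len(state)
    ones = sum(state)  # running count of voters holding opinion 1
    updates = 0
    while 0 < ones < n and updates < max_updates:
        i = rng.randrange(n)
        j = (i + rng.choice((-1, 1))) % n
        ones += state[j] - state[i]
        state[i] = state[j]
        updates += 1
    return state, updates


rng = random.Random(2)
initial = [rng.randint(0, 1) for _ in range(30)]
final, updates = voter_model_on_ring(initial, rng)
print("initial:", initial)
print("final:  ", final, "after", updates, "updates")
```

On a finite ring the dynamics end in one of the two trivial states, ‘all zeros’ or ‘all ones,’ which mirrors the absence of other invariant measures in dimension one.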
5. Future Directions

At the start of the twenty-first century, additional developments for the exclusion process include the behavior of a tagged particle, the convergence to the invariant measure for the process in finite regions, and more refined relations between the microscopic and macroscopic counterparts. The voter model described here is ‘linear’ in the sense that the probability of changing opinion is proportional to the number of neighbors with the opposite opinion. Other rules lead to ‘nonlinear’ voter models that show more complicated behavior. See Liggett (1999). The rescaling of the random average process may give rise to nontrivial limits; this is a possible line of future development.

The contact process, a continuous-time version of oriented percolation, has not been discussed here. The contact process in a graph instead of a lattice may involve three or more phases (in the examples discussed here there were only two phases). The study of particle systems in graphs is a promising area of research.

An important issue not discussed here is related to Gibbs measures; indeed, phase transition was mathematically discussed first in the context of Gibbs measures. One can think of a Gibbs measure as a perturbation of a product measure: configurations have a weight according to a function called energy. Gibbs measures may describe the state of individuals that have a tendency to agree (or disagree) with their neighbors. Stochastic Ising models are interacting particle systems that have as invariant measures the Gibbs measure related to a ferromagnet. Mixtures of Ising models and exclusion processes give rise to reaction diffusion processes, related via hydrodynamic limit to reaction diffusion equations.

An area with recent important developments is random tilings, which describes the behavior of the tiles used to cover an area in a random fashion. The main issue is how the shape of the boundary of the region determines the tiling. Dynamics on tilings would show how the boundary information is transmitted. The rescaling of some tilings results in a continuous process called the massless Gaussian field; loosely speaking, a multidimensional generalization of the central limit theorem.

An important area is generally called spatial processes. These processes describe the random location of particles in the $d$-dimensional real space instead of a lattice. The Poisson process is an example of a spatial process. Analogously to Gibbs measures, other examples can be constructed by giving different weights to configurations according to an energy function. The study of interacting birth and death processes having a spatial process as invariant measure is recent and worth developing.
This is also related to so-called perfect simulation; roughly speaking, this means producing a sample of a configuration of a process as a function of the uniform random numbers given by a computer. See, for instance, Häggström (2000).

See also: Diffusion and Random Walk Processes; Dynamic Decision Making; Markov Decision Processes; Markov Processes for Knowledge Spaces; Probability: Formal; Probability: Interpretations; Sequential Decision Making; Stochastic Dynamic Models (Choice, Response, and Time)
Bibliography
Aldous D 1989 Probability Approximations via the Poisson Clumping Heuristic. Applied Mathematical Sciences, 77. Springer-Verlag, New York
Biane P, Durrett R 1995 Lectures on probability theory. In: Bernard P (ed.) Lectures from the 23rd Saint-Flour Summer School held August 18–September 4, 1993. Lecture Notes in Mathematics, 1608. Springer-Verlag, Berlin
Bollobás B 1998 Modern Graph Theory. Graduate Texts in Mathematics, 184. Springer-Verlag, New York
Brémaud P 1999 Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Texts in Applied Mathematics, 31. Springer-Verlag, New York
Chen M F 1992 From Markov Chains to Nonequilibrium Particle Systems. World Scientific Publishing, River Edge, NJ
De Masi A, Presutti E 1991 Mathematical Methods for Hydrodynamic Limits. Lecture Notes in Mathematics, 1501. Springer-Verlag, Berlin
Durrett R 1988 Lecture Notes on Particle Systems and Percolation. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA
Durrett R 1999a Essentials of Stochastic Processes. Springer Texts in Statistics. Springer-Verlag, New York
Durrett R 1999b Stochastic Spatial Models. Probability Theory and Applications (Princeton, NJ, 1996), 5–47, IAS/Park City Mathematics Series 6, American Mathematical Society, Providence, RI
Evans L C 1998 Partial Differential Equations. American Mathematical Society, Providence, RI
Feller W 1971 An Introduction to Probability Theory and its Applications, Vol. II, 2nd edn. Wiley, New York
Ferrari P A, Galves A 2000 Coupling and Regeneration for Stochastic Processes. Asociación Venezolana de Matemáticas, Mérida
Fristedt B, Gray L 1997 A Modern Approach to Probability Theory: Probability and its Applications. Birkhäuser Boston, Boston, MA
Grimmett G 1999 Percolation, 2nd edn. Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 321. Springer-Verlag, Berlin
Guttorp P 1991 Statistical Inference for Branching Processes. Wiley, New York
Häggström 2000 Finite Markov Chains and Algorithmic Applications. To appear.
Kipnis C, Landim C 1999 Scaling Limits of Interacting Particle Systems. Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 320. Springer-Verlag, Berlin
Liggett T M 1985 Interacting Particle Systems. Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 276. Springer-Verlag, New York
Liggett T M 1999 Stochastic Interacting Systems: Contact, Voter and Exclusion Processes. Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 324. Springer-Verlag, Berlin
Schinazi R B 1999 Classical and Spatial Stochastic Processes. Birkhäuser Boston, Boston, MA
Spohn H 1991 Large Scale Dynamics of Interacting Particles. Springer-Verlag, Berlin
Thorisson H 2000 Coupling, Stationarity, and Regeneration: Probability and its Applications. Springer-Verlag, New York

P. A. Ferrari
Stock Market Predictability

One of the most enduring cornerstones of modern financial theory has been that stock markets are reasonably efficient in reflecting new information and that price movements closely fit a random walk model where future prices cannot be predicted from an examination of past prices. Toward the end of the twentieth century, however, financial economists and statisticians began to question received doctrine, and many came to believe that stock prices were at least partially predictable. Moreover, although far more controversial, many have argued that these predictable patterns enable investors to earn excess risk-adjusted rates of return and provide evidence that the efficient market theory should be rejected. This survey reviews the findings of the empirical work concerning stock market predictability. It reviews the work done on stock prices that appears to reject the random walk hypothesis and looks at the body of studies suggesting that future returns can be predicted on the basis of such fundamental variables as dividend yields, price-earnings and price-to-book-value ratios, market capitalization (size), etc. The survey also examines the relationship between predictability and market efficiency.
1. The Efficient Market Theory and its Critics The basic idea behind the efficient market theory is that securities markets are extremely efficient in digesting information about individual stocks or about the stock market in general. When information arises about a stock (or the market as a whole), the news spreads very quickly and is immediately incorporated into the prices of securities. Thus, neither technical analysis (an analysis of past price patterns to determine the future) nor fundamental analysis (an analysis of a company’s earnings, dividends, future prospects, etc. to determine a stock’s proper value) will help investors to achieve returns greater than would be obtained by buying and holding one of the broad stock market indices. The efficient market theory is associated with the idea of a ‘random walk,’ which is a term loosely used in the finance literature to characterize a price series where all subsequent price changes represent random departures from previous prices. The logic of the random walk idea is that if the flow of information is unimpeded and information is immediately reflected in stock prices, then tomorrow’s price change will reflect only tomorrow’s news and will be independent of the price changes today. But news is by definition unpredictable and, thus, resulting price changes must be unpredictable and random. As a result, prices fully reflect all known information, and even uninformed investors buying a diversified portfolio at the tableau of prices given by the market will obtain a rate of return as good as that achieved by market professionals. Discussions of the random walk hypothesis can be found in Samuelson (1965) and Fama (1970). Academic work published mainly during the 1980s and 1990s has tended to cast doubts on many of the tenets of the efficient market theory. This work has
suggested that the random walk model is not supported and that future stock prices and returns can be predicted, at least in part, by an examination of past stock prices and certain ‘fundamental’ valuation methods such as the ratio of stock prices to earnings and book values. Moreover, some of this work has been interpreted as implying that markets are inefficient in the sense that arbitrage opportunities exist that enable investors to earn excess risk-adjusted returns.
2. The Major Predictable Patterns that Have Been Discovered

This article will now focus on the findings of empirical work about stock market predictability done during the 1980s and 1990s. The general thrust of much of the published academic work has been along the following two lines. First, it appears that there are many predictable patterns in the stock market. The stock market is not random. Indeed, a book published by Lo and MacKinlay (1999) has the arresting title, A Nonrandom Walk Down Wall Street. Second, although this is far more controversial, many have argued that these predictable patterns enable an investor to earn big risk-adjusted rates of return and are proof that the efficient market theory is wrong.
2.1 There is Some Evidence of Short-run Momentum in the Market

When stock returns for individual stocks and market averages are calculated over days, or weeks, or even months, there is positive serial correlation. In other words, a positive rate of return one week is more likely than not to be followed by a positive return the next week. The market is not a perfect random walk. Further evidence provided by Lo and MacKinlay uses a simple specification test based on variance estimators. Their test exploits the fact that the variance of the increments of a random walk is linear in the sampling interval. For example, if stock prices are generated by a random walk then the variance of monthly sampled log-price relatives must be four times as large as the variance of a weekly sample. Using data over a 23-year time frame from 1962 through 1985, Lo and MacKinlay reject the random walk hypothesis. The rejection cannot be explained completely by infrequent trading (where news affects smaller company stocks with a lag because they trade less frequently) or by time-varying volatilities.

This work suggests that quantitative managers who use momentum to help guide purchases have some empirical support on their side. Whether this is a true inefficiency or not, however, is open to more question. Certainly any investor who pays transactions costs is not able to outperform a buy-and-hold strategy because the kind of momentum that has been found is small relative to the transactions costs involved, and it is far from clear how dependable any patterns that can be discovered will be. The market is not statistically random, but it is nearly so.
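The idea behind a variance-based specification test can be illustrated on simulated data. The sketch below is not Lo and MacKinlay's estimator; it is a simplified variance ratio computed from non-overlapping blocks, and the simulated series, the AR(1) coefficient of 0.2, and all function names are assumptions made for illustration.

```python
import random
import statistics


def variance_ratio(increments, q):
    """Var(sum of q consecutive increments) / (q * Var(single increment)).

    Uses non-overlapping blocks; for a random walk the ratio should be near 1.
    """
    blocks = [sum(increments[k:k + q])
              for k in range(0, len(increments) - q + 1, q)]
    return statistics.variance(blocks) / (q * statistics.variance(increments))


def ar1_increments(n, phi, rng):
    """Simulated increments r_t = phi * r_{t-1} + noise (phi = 0 gives a random walk)."""
    r, out = 0.0, []
    for _ in range(n):
        r = phi * r + rng.gauss(0.0, 1.0)
        out.append(r)
    return out


rng = random.Random(3)
walk = ar1_increments(40_000, phi=0.0, rng=rng)      # independent increments
momentum = ar1_increments(40_000, phi=0.2, rng=rng)  # positive serial correlation
vr_walk = variance_ratio(walk, q=4)
vr_momentum = variance_ratio(momentum, q=4)
print("random walk VR(4):", vr_walk)
print("momentum    VR(4):", vr_momentum)
```

With `q=4` the comparison mirrors the weekly versus monthly example: independent increments give a ratio close to 1, while positively correlated increments push it above 1 (the population value for this AR(1) example is about 1.34).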
2.2 But there is Mean Reversion

Returns may be positively correlated for short periods, such as days or weeks, but when returns are measured over longer periods, such as three or four years, there tends to be negative correlation. Just as in the Bible seven lean years followed seven fat years, so in the stock market extraordinary returns are somewhat more likely to be followed by lackluster returns. For example, Fama and French (1988) demonstrated that long holding period returns are negatively serially correlated. The serial correlation is significant and implies that 25–40 percent of the variation in long holding period returns can be predicted on the basis of past returns. Similar results were found by Poterba and Summers (1988), who conclude that there is substantial mean reversion in stock market returns at longer horizons. Some studies have attributed this forecastability to the tendency of stock market prices to ‘overreact.’ DeBondt and Thaler (1985), for example, argue that investors are subject to waves of optimism and pessimism that cause prices to deviate systematically from their fundamental values and later to exhibit patterns of reversal. These findings give some support to investment techniques that rest on a ‘contrarian’ strategy, that is, buying the stocks, or groups of stocks, that have been out of favor for long periods of time, and avoiding those stocks that have had large run-ups over the last several years. While there is considerable support for this view, it should be pointed out that such mean reversion is quite a bit weaker in some decades than it is in other periods. Indeed, the strongest empirical results come from periods including the Great Depression. Moreover, such return reversals for the market as a whole may be quite consistent with the efficient functioning of the market and could result, in part, from the volatility of interest rates.
There is a tendency when interest rates go up for stocks to go down, and as interest rates go down for stocks to go up. If, in fact, interest rates fluctuate over time, one will tend to get return reversals, or mean reversion, and this is quite consistent with the efficient functioning of markets where stock returns will need to go up or down to be competitive with bonds. Moreover, it may not be possible to profit from the tendency for individual stocks to exhibit patterns of return reversals. For example, Fluck et al. (1997) simulated over the period a strategy of buying stocks that had particularly poor returns over the past three to five years. They found very strong statistical evidence of return reversals, but it was truly reversion to the mean, not an opportunity to make extraordinary returns. They found that stocks with very low returns over the past three to five years had higher returns in the next period. Stocks with very high returns over the past three to five years had lower returns in the next period, but the returns in the next period were similar for both groups. While they found strong evidence of mean reversion, they could not confirm that a contrarian approach would yield higher than average returns. There was a statistically strong pattern of return reversal, but not one that implied an inefficiency in the market that would enable investors to make excess returns.
2.3 There do Seem to be Some Seasonal Patterns in the Stock Market

A number of researchers have found that January has been a very unusual month for stock market returns. Returns from an equally weighted stock index have tended to be unusually high during the first two weeks of the year. The return premium has been particularly evident for small stocks, i.e., those with relatively small total capitalizations, as has been demonstrated by Keim (1983). Haugen and Lakonishok (1988) detail the high January returns in a book entitled The Incredible January Effect. There also appear to be a number of other seasonal effects. For example, French (1980) documents significant Monday returns. There appear to be significant differences in average daily returns in countries other than the USA, as summarized by Hawawini and Keim (1995). There also appear to be some patterns in returns around the turn of the month, as shown by Lakonishok and Smidt (1988), as well as around holidays, as shown by Ariel (1990). The general problem with these anomalies, however, is that they are not dependable from period to period. The January effect has not been useful to investors through much of the 1990s. Moreover, these nonrandom effects (even if they were dependable) are small relative to the transactions costs involved in trying to exploit them. They do not appear to offer investors any arbitrage opportunities that would enable them to make excess returns.
2.4 The Size Effect

Probably one of the strongest effects that investigators have found is the so-called size effect: the tendency over long periods of time for small stocks to generate larger returns than those of large stocks. Since 1926, small-company stocks in the USA have produced a rate of return about 1.5 percentage points larger than the returns from large stocks. Fama and French (1992) looked at data from 1963 to 1990 and divided all stocks into deciles according to their size as measured by total capitalization. Decile 1 contained the smallest 10 percent of all stocks while decile 10 contained the largest stocks. The results plotted in Fig. 1 show a clear tendency for the deciles made up of portfolios of smaller stocks to generate higher average monthly returns than deciles made up of larger stocks.

Figure 1 Average monthly returns for portfolios, formed on the basis of size, 1963–90 (source: Fama and French 1992)

If the ‘Beta’ measure of systematic risk from the capital asset pricing model is accepted as the correct risk measurement statistic, the size effect can be interpreted as indicating a market inefficiency. This is so because portfolios consisting of smaller stocks have excess risk-adjusted returns. But any conclusion that an inefficiency exists involves accepting the joint hypothesis that a size effect is present and that the capital asset pricing model (CAPM) is correct. Fama and French point out, however, that the average relationship between ‘Beta’ and return is flat, not upward sloping as the CAPM predicts. Moreover, within Beta deciles, ten portfolios constructed by size display the same kind of positive relationship shown in Fig. 1. On the other hand, within size deciles, the relationship between Beta and return continues to be flat. Fama and French conclude that size may well be a far better proxy for risk than Beta, and therefore that their findings should not be interpreted as indicating that markets are inefficient.

The dependability of the size phenomenon is also open to question. Certainly during the decade of the 1990s there has been little to gain from holding smaller stocks. Indeed, in most world markets it has been the large stocks that have done especially well. Finally, it is also possible that the small-firm effect is simply a result of survivorship bias in currently available computer tapes of past returns. Today’s lists of companies include only small firms that have survived, not the ones that later went bankrupt. Thus, a researcher who examined the 10-year performance of today’s small companies would be measuring the performance of those companies that survived, not of the ones that failed.

Figure 2 Average monthly returns for portfolios, formed on the basis of price-to-earnings (P/E) ratios, 1962–90 (source: Hawawini and Keim 1995)
2.5 There is Evidence that Stocks with Low Price Earnings Multiples Generate Higher Returns than those with High Price Earnings Multiples

As Fig. 2 shows, evidence first published by Nicholson (1960) and later confirmed by Ball (1978) and Basu (1983) shows that low price earnings (P/E) stocks provide higher rates of return than high P/E stocks. This finding is consistent with the views of behavioralists (see, for example, Kahneman and Riepe 1998) that investors tend to be overconfident of their ability to project high earnings growth and thus overpay for ‘growth’ stocks. The finding is also consistent with the views of Benjamin Graham and David Dodd (1934), first expounded in their classic book on security analysis and later championed by the legendary US investor, Warren Buffett. Similar results have been shown for price/cash-flow multiples, where cash flow is defined as earnings plus depreciation and amortization. See Hawawini and Keim (1995).
2.6 Stocks with Low Price-to-book Value (P/BV) Ratios Generate Larger Returns than Stocks with High P/BV Ratios
The ratio of stock price to book value (the value of a firm’s assets minus its liabilities divided by the number of shares outstanding) has also been found to be a useful predictor of future security returns. Low price-to-book (along with low P/E) is considered to be another hallmark of so-called ‘value’ in equity securities, and is also consistent with the view of behavioralists that investors tend to overpay for ‘growth’ stocks that subsequently fail to live up to expectations. Fama and French (1992) concluded that size and P/BV together provide considerable explanatory power for future returns and once they are accounted for, little additional influence can be attributed to P/E multiples. Fama and French (1997) also conclude that the P/BV effect is important in many world stock markets other than that of the USA. Lakonishok et al. (1994) argue that such results raise questions about the efficiency of the market. But these findings do not necessarily imply inefficiency. P/BV may be a more accurate proxy for the systematic risk priced in the stock market than traditional estimates of the CAPM Beta because of measurement errors in the traditional estimates. Indeed, this is the explanation Fama and French give for this predictable pattern. Moreover, especially with so much merger and restructuring activity, book value is increasingly hard to interpret. Thus, it is far from clear that the pattern will continue. In fact, stocks with low P/BV ratios underperformed the general market during the 1990s. Indeed, if one looks at the behavior of growth stocks and value stocks (those with low P/BV and P/E) in the USA over a 75-year period to 2000, it appears that the Fama–French period from the early 1960s through 1990 may have been a unique period in which value stocks rather consistently produced higher rates of return.
2.7 Predicting Market Returns from Initial Dividend Yields
Another apparently predictable relationship concerns the returns realized from stocks and the initial dividend yields at which they were purchased. The data since 1926 for the USA are presented in Fig. 3. The figure was produced by measuring the dividend yield of the broad stock market (in this case, the Standard and Poor’s 500 Stock Index) each quarter between 1926 and 2000, and then calculating the market’s subsequent 10-year total return. The observations were then divided into deciles depending upon the level of the initial dividend yield. In general, the figure shows that investors have earned a higher rate of return from the stock market when they purchased stocks with an initial dividend yield that was relatively high, and a below average rate of return when they purchased them at initial dividend yields that were low. Formal statistical tests of the ability of dividend yields (dividend-price ratios) to forecast future returns have been conducted by Fama and French (1988) and Campbell and Shiller (1988). Depending on the forecast horizon involved, as much as 40 percent of the variance of future returns can be predicted on the basis of initial dividend yields. These findings are not necessarily inconsistent with efficiency. Dividend yields of stocks tend to be high when interest rates are high, and they tend to be low when interest rates are low. Consequently, the ability of initial yields to predict returns may simply reflect the adjustment of the stock market to general economic conditions. Moreover, the dividend behavior of US corporations may have changed over time (see Fama and French 1999). Companies in the twenty-first century may be more likely to institute a share repurchase program rather than increase their dividends. Thus, dividend yield may not be as meaningful as in the past. Moreover, it is worth pointing out that dividend yields were unusually low at the start of 1995.
An investor who avoided US stocks then would have missed out on one of the steepest five-year increases in stock prices in history. Finally, it is worth noting that this phenomenon does not work consistently with individual stocks. Investors who simply purchase a portfolio of individual stocks with the highest dividend yields in the market will not earn a particularly high rate of return.
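The decile procedure described for Fig. 3 (sort the observations by initial dividend yield, split them into ten equal groups, and average the subsequent return within each group) can be sketched as follows. The data here are synthetic: `synthetic_observations` is a hypothetical helper that invents yield and return pairs with an assumed positive relationship, and it does not reproduce the Leuthold Group figures.

```python
import random
import statistics


def decile_means(observations, n_groups=10):
    """Average subsequent return per yield group, lowest-yield group first.

    `observations` is a list of (initial_yield, subsequent_return) pairs.
    """
    ranked = sorted(observations, key=lambda pair: pair[0])
    n = len(ranked)
    means = []
    for g in range(n_groups):
        lo, hi = g * n // n_groups, (g + 1) * n // n_groups
        means.append(statistics.mean([ret for _, ret in ranked[lo:hi]]))
    return means


def synthetic_observations(n=300, seed=0):
    """Invented quarterly observations; the slope and noise are assumptions."""
    rng = random.Random(seed)
    observations = []
    for _ in range(n):
        dividend_yield = rng.uniform(0.01, 0.07)
        subsequent_return = 0.02 + 1.5 * dividend_yield + rng.gauss(0.0, 0.03)
        observations.append((dividend_yield, subsequent_return))
    return observations


for decile, mean_return in enumerate(decile_means(synthetic_observations()), start=1):
    print(f"decile {decile:2d}: mean subsequent return {mean_return:.3f}")
```

With real data the same grouping would be applied to each quarter's index yield and the realized 10-year return that followed it; overlapping 10-year windows make the observations far from independent, which is one reason the formal tests cited above matter.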
Figure 3 The future 10-year rates of return when stocks are purchased at alternative initial dividend yields (D/P) (source: The Leuthold Group)
Figure 4 The future 10-year rates of return when stocks are purchased at alternative initial price-to-earnings (P/E) multiples (source: The Leuthold Group)
One popular implementation of such a ‘high dividend’ strategy is the ‘Dogs of the Dow Strategy,’ which involves buying the ten stocks in the Dow Jones Industrial Average with the highest dividend yields. For some periods this strategy handily outpaced the overall average, and so several ‘Dogs of the Dow’ funds were brought to market and aggressively sold to individual investors. Such funds have generally underperformed the market averages during the 1995–9 period.
2.8 Predicting Market Returns from Initial Price Earnings Multiples
The same kind of predictability for the market as a whole has been shown for price earnings ratios. The data are shown in Fig. 4. The figure presents a decile analysis similar to that described for dividend yields above. Investors have tended to earn larger long-horizon returns when purchasing stocks at relatively low price earnings multiples. Campbell and Shiller (1988) report R²s (a measure of goodness of fit that takes on a maximum value of one) of over 40 percent and conclude that equity returns have been predictable in the past to a considerable extent. Whether such historical relations are necessarily applicable to the future is another matter. Such an analysis conducted by Shiller in January 1996 predicted a long-horizon (nominal) return for the US stock market close to zero.
3. Concluding Comments
In summary, there are many statistically significant predictable patterns in the stock market. It is important to emphasize, however, that these patterns are not dependable in each and every period, and that some of the patterns based on fundamental valuation measures of individual stocks may simply reflect a better proxy for measuring risk. Moreover, many of these patterns could self-destruct in the future, as many of them have already done. Indeed, this is the logical reason why one should be cautious not to overemphasize these anomalies and predictable patterns. Suppose, for example, that one of the anomalies is really true. Suppose that there is a truly dependable January effect, and that the stock market, especially stocks of small companies, will generate extraordinary returns during the first five days of January. What will investors do? They will buy late in December, on the last day of December, and sell on January 5. But then investors find that the market rallied on the last day of December and so they will begin to buy on the next-to-last day; and because there is so much ‘profit taking’ on January 5, investors will have to sell on January 4 to take advantage of this effect. Thus, to beat the gun, investors will have to be buying earlier and earlier in December, and selling earlier and earlier in January so that eventually the pattern will self-destruct. Any truly repetitive pattern that can be discovered in the stock market and can be arbitraged away will self-destruct. Indeed, the January effect became undependable after it received considerable publicity. Similarly, suppose there is a general tendency for stock prices to underreact to certain new events, leading to abnormal returns to investors who exploit the lack of full immediate adjustment (see DeBondt and Thaler 1995 and Campbell et al. 1997 for a discussion of event studies and possible underreaction of market participants). ‘Quantitative’ investment managers will then develop strategies in an attempt to exploit the pattern. Indeed, the more potentially profitable a discoverable pattern is, the less likely it is to survive. Moreover, many of the predictable patterns that have been discovered may simply be the result of data mining. The ease of experimenting with financial data banks of almost every conceivable dimension makes it quite likely that investigators will find some seemingly significant but wholly spurious correlation between financial variables or among financial and nonfinancial data sets. One amusing one that has been uncannily accurate in the past for US markets is the Super Bowl Indicator, which predicts the sign of market returns for the year by the winner of January’s football Super Bowl. Given enough time and massaging of data series it is possible to tease almost any pattern out of every data set. Moreover, the published literature is likely to be biased in favor of reporting such results. Significant effects are likely to be published in professional journals while negative results, or boring confirmations of previous findings, are relegated to the file drawer or discarded. Data-mining problems are unique to nonexperimental sciences, such as economics, which rely on statistical analysis for their insights and cannot test hypotheses by running repeated controlled experiments. An exchange at an academic conference (1992) between Robert Shiller (1981), an economist who is sympathetic to the argument that stock prices are partially predictable and skeptical about market efficiency, and Richard Roll, an academic financial economist who also is a businessman managing billions of dollars of investment funds, is quite revealing.
After Shiller stressed the importance of inefficiencies in the pricing of stocks, Roll responded as follows: I have personally tried to invest money, my client’s money and my own, in every single anomaly and predictive device that academics have dreamed up … I have attempted to exploit the so-called year-end anomalies and a whole variety of strategies supposedly documented by academic research. And I have yet to make a nickel on any of these supposed market inefficiencies … a true market inefficiency ought to be an exploitable opportunity. If there’s nothing investors can exploit in a systematic way, time in and time out, then it’s very hard to say that information is not being properly incorporated into stock prices. (Journal of Applied Corporate Finance, Spring, 1992)
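The data-mining concern raised earlier in this section can be made concrete with a short simulation. It is a sketch under stated assumptions: the annual ‘returns’ and every candidate ‘indicator’ are independent random draws, so any apparent relationship is spurious by construction. The function names and the threshold of 0.361 (roughly the 5 percent two-sided critical value for a correlation over 30 observations) are choices made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def count_spurious(n_years=30, n_indicators=1000, threshold=0.361, seed=1):
    """Count random indicators whose |correlation| with returns beats threshold.

    Returns (number of 'significant' indicators, largest |correlation| seen).
    """
    rng = random.Random(seed)
    returns = [rng.gauss(0.08, 0.18) for _ in range(n_years)]
    hits, best = 0, 0.0
    for _ in range(n_indicators):
        indicator = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        r = abs(pearson(indicator, returns))
        best = max(best, r)
        if r > threshold:
            hits += 1
    return hits, best


hits, best = count_spurious()
print(f"{hits} of 1000 random indicators look significant; best |r| = {best:.2f}")
```

About one in twenty unrelated series would be expected to clear the threshold by chance alone, so a researcher who searches through enough candidates and reports only the best one will nearly always have something to show.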
Finally, there is a remarkably large body of evidence suggesting that professional investment managers are not able to outperform index funds that simply buy and hold the broad stock market portfolio (see, for example, Malkiel 1995). Three-quarters of professionally managed funds are regularly outperformed by a broad index fund with equivalent risk, and those that do appear to produce excess returns in one period are not likely to do so in the next. The record of
professionals does not suggest that sufficient predictability exists in the stock market to produce exploitable arbitrage opportunities. Pricing irregularities and predictable patterns in stock returns do appear over time and even persist for periods. Undoubtedly, with the passage of time and with the increasing sophistication of our databases and empirical techniques, we will document further apparent departures from efficiency, and further patterns in the development of stock returns. But the end result will not be an abandonment of the belief of many in the profession that the stock market is remarkably efficient in its utilization of information and that whatever patterns do exist are unlikely to provide investors with a method to obtain extraordinary returns.
See also: Decision Research: Behavioral; Markets and the Law; Risk: Theories of Decision and Choice
Bibliography
Ariel R A 1990 High stock returns before holidays: Existence and evidence on possible causes. Journal of Finance 45: 1611–26
Ball R 1978 Anomalies in relationships between securities’ yields and yield-surrogates. Journal of Financial Economics 6: 103–26
Basu S 1983 The relationship between earning’s yield, market value and the returns for NYSE common stocks: Further evidence. Journal of Financial Economics 12: 129–56
Campbell J Y, Lo A W, MacKinlay A C 1997 The Econometrics of Financial Markets. Princeton University Press, Princeton, NJ
Campbell J Y, Shiller R J 1988 Stock prices, earnings, and expected dividends. Journal of Finance 43: 661–76
DeBondt W F M, Thaler R 1985 Does the stock market overreact? Journal of Finance 40: 793–805
DeBondt W F M, Thaler R 1995 Financial decision-making in markets and firms: A behavioral perspective. In: Jarrow R, Maksimovic V, Ziemba W (eds.) Handbooks in Operations Research & MS. Elsevier Science, vol. 9
Fama E F 1970 Efficient capital markets: A review of theory and empirical work. Journal of Finance 25: 383–417
Fama E F, French K R 1988 Permanent and temporary components of stock prices. Journal of Political Economy 96: 246–73
Fama E F, French K R 1997 Value vs. growth: The international evidence. Journal of Finance 53: 1975–99
Fama E F, French K (forthcoming) Disappearing dividends: Changing firm characteristics or increased reluctance to pay? Journal of Financial Economics
Fluck Z, Malkiel B G, Quandt R E 1997 The predictability of stock returns: A cross-sectional simulation. Review of Economics and Statistics 79: 176–83
French K R 1980 Stock returns and the weekend effect. Journal of Financial Economics 8: 55–69
Graham B, Dodd D L 1934 Security Analysis: Principles and Techniques. McGraw-Hill, New York
Haugen R A, Lakonishok J 1988 The Incredible January Effect. Dow Jones-Irwin, Homewood, IL
Hawawini G, Keim D B 1995 On the predictability of common stock returns: Worldwide evidence. In: Jarrow R, Maksimovic V, Ziemba W (eds.) Handbooks in Operations Research & MS. Elsevier Science, Amsterdam New York, vol. 9, pp. 497–544
Kahneman D, Riepe M W 1998 Aspects of investor psychology. Journal of Portfolio Management 24: 52–65
Keim D B 1983 Size-related anomalies and stock return seasonality: Further empirical evidence. Journal of Financial Economics 12: 13–32
Lakonishok J, Shleifer A, Vishny R W 1994 Contrarian investment, extrapolation, and risk. Journal of Finance 49: 1541–78
Lakonishok J, Smidt S 1988 Are seasonal anomalies real? A ninety-year perspective. Review of Financial Studies 1: 403–25
Lo A W, MacKinlay A C 1999 A Non-random Walk Down Wall Street. Princeton University Press, Princeton, NJ
Malkiel B G 1995 Returns from investing in equity mutual funds 1971 to 1991. Journal of Finance 50: 549–72
Nicholson S F 1960 Price-earnings ratios. Financial Analysts Journal. July/August: 43–50
Poterba J M, Summers L H 1988 Mean reversion in stock returns: Evidence and implications. Journal of Financial Economics 22: 27–59
Roll R, Shiller R J 1992 Comments: Symposium on volatility in US and Japanese stock markets. Journal of Applied Corporate Finance 1: 25–9
Samuelson P 1965 Proof that properly anticipated prices fluctuate randomly. Industrial Management Review 6: 41–9
Shiller R J 1981 Do stock prices move too much to be justified by subsequent changes in dividends? American Economic Review 71: 421–36
Shiller R J 1996 Price-earnings ratios as forecasters of returns: The stock market outlook in 1996. Unpublished manuscript, Yale University
B. G. Malkiel
Stockholders’ Ownership and Control
Theory and research on the relations among top managers, company directors, stockholders, and external contenders for corporate control experienced a remarkable flowering during the 1990s. Early work addressed the central puzzle raised by the widespread separation of ownership and control among large American corporations, namely, why would any sensible person, much less thousands of them, invest their savings in businesses run by unaccountable professional managers? As Berle and Means (1932) framed the problem, those who ran such corporations would pursue ‘prestige, power, or the gratification of professional zeal’ in lieu of maximizing profits. Shareholders weakened by their fractionation could do little to stop them. Yet generations of individuals and financial institutions continued to invest in these firms. Why?
1. Resolving the Separation of Stock Ownership from Company Control
Answering this question led to the creation of a new theory of the firm that portrayed the public corporation as a ‘nexus of contracts.’ In this model, the managers of the corporation were disciplined to optimize shareholder value by an array of mechanisms, from incentive compensation and vigilant directors to an unforgiving ‘market for corporate control.’ Taken together, these devices worked to vouchsafe shareholder interests even when ownership was widely dispersed. Research in this tradition flourished in the 1980s, as takeovers of under-performing firms in the US became common and restive institutional investors made their influence known. Studies focused on assessing the effectiveness of such contrivances as governing boards dominated by outsiders, executive pay tied to stock performance, and susceptibility to hostile acquisition (Walsh and Seward 1990). Three developments, however, expanded research on ownership and control beyond these mechanisms and beyond American business. The first was for investigators to view the governance structure of the firm (the set of devices that evolve within the firm to monitor managerial decision making) as an ensemble. Rather than regarding any particular aspect of the firm’s structure as essential, researchers came to study them as complements or substitutes. Remuneration highly leveraged around share price, for instance, could act as a substitute for a vigilant board, and together they could really tighten the leash. The second development was a growth of comparative and historical research that highlighted the idiosyncrasy of the American obsession with ownership and control. American-style corporate governance, aimed at ‘solving’ the problems created by the separation of ownership and control, came to be seen as only one of several feasible systems. Indeed, among some advanced economies with well-developed business enterprise, the problem needed no solving at all since the separation of ownership from control remained much the exception (Roe 1994).
The third development was the articulation of a reflexive stance on the theory of ownership and control. While agency theory can be viewed as an empirical theory of the corporation, it also came to be considered a prescriptive theory, an explanation of not only what is but also a vision of what could or should be. Its influence on public policy debates during the 1980s and 1990s is evident in the widespread rise of the rhetoric of shareholder value in the business press and executive suite (Useem 1996).
2. Shareholder Influence on Top Management and the Corporation
When ownership is dispersed widely, shareholders possess few direct devices to ensure their control, or at least their influence, on the company’s direction and performance. This is the essence of the ‘agency problem’ identified by Berle and Means (1932), and it remains potentially potent among many US firms.
Three-quarters of the 100 largest US corporations in 1999 lacked even a single ownership block of 10 percent or more and, of the top 25, the biggest single holder averaged a scant four percent (Brancato 1999). Scattered shareholders in effect relinquish their control to directors whom they elect to serve as their agents in selecting and supervising top management. Dispersed holders can influence the firm only through trading shares and voting directors. They have no real voice in the selection of executives, or even who serves on the board. By one argument, shareholder inattention is in fact the only rational course for small shareholders since the cost of learning about the governance of the firm and mobilizing change is sure to be far overshadowed by the marginal and nonselective benefits of any resulting performance improvement. This problem of collective action could result in management unaccountability in some capital markets, but investors nonetheless rest easy in the US because of a well-developed set of mechanisms that have arisen to ensure share price maximization without overt investor control (Easterbrook and Fischel 1991). The most forceful instrument for riveting attention on share price is the ‘market for corporate control.’ By hypothesis, a poorly-run corporation suffers a depressed market valuation, and this opens an opportunity for outsiders with better management skills to acquire the firm at an implicit discount, oust top management, and rehabilitate the firm, thus restoring value to its long-suffering stockholders. Unwanted takeovers and their threats provide a potent mechanism to shock top managers who under-serve shareholder interests; they know that executives generally find themselves on the street after an unfriendly takeover. The market for corporate control is thus the most visible hand of Darwinian discipline, weeding out poorly run firms and protecting shareholders from bad management (Manne 1965). 
Hostile takeovers are a blunt instrument for corporate control, and as a result, some US institutional investors have moved toward more fine-grained methods of influence. Large pension funds, for example, often field shareholder proxy proposals for better governance. Though little evidence yet confirms that this mode of shareholder activism has immediate impact on the share price of target firms, it does push targeted firms to improve their governance and thereby open them further to investor influence (Karpoff 1998). If hostile takeovers are often too blunt an instrument and shareholder activism too diffuse, stock analysts provide an intermediate form of shareholder influence on company behavior. Analysts investigate companies in order to render judgments on their prospects for future earnings and thus to estimate their appropriate valuation. Analysts are often allowed privileged access to corporate executives and are in a position to give strategic and governance advice
directly to senior managers, something rarely afforded to the firm’s small shareholders. In principle, the rewards go to the most accurate analysts, giving them incentives to act as corporate watchdogs. In practice, however, some analysts are not entirely dispassionate observers: those who work for firms doing business with a corporation give systematically more favorable evaluations than do others. While analysts may serve as a more refined channel for shareholder influence, this avenue of control remains problematic as well (Useem 1996).
3. Between Ownership and Control: Top Management
A handful of company executives make the defining decisions, whether to launch a new product, enter a promising region, or resist a tender offer. It is they who most directly control the fate of the firm. It is they who stand between corporate owners and company results. Firms often define their ‘top’ as no more than the seven or eight most senior officers. To much of the world beyond the company walls, however, top management is personified almost solely by the chief executive. Academic researchers had long been drawn to the same pinnacle of the pyramid, partly on the conceptual premise that the chief executive is the manager who really matters, and partly on the pragmatic ground that little is publicly available on anybody except the CEO. We have, thus, benefited from a long accumulation of work on their family histories, educational pedigrees, and political identities, and we know as a result that their origins are diverse, credentials splendid, and instincts conservative. More contemporarily, we have learned much about their tangled relations with directors and investors as well. To know the CEO’s personal rise and board ties is to anticipate much of the firm’s strategic intent and performance promise. They are also a good predictor of the CEO’s own fortunes. An elite MBA degree accelerates movement to the top, a wealthy background attracts outside directorships, and a hand-picked board enhances pay and perquisites (Useem and Karabel 1986). The conceptual and pragmatic underpinnings for shining the light solely on the CEO as the fulcrum for ownership and control, however, have eroded in recent years, as companies redefined their operations and researchers reconceived their methods. A central thrust of company restructuring during the 1980s and 1990s has been to transform the chief executive’s office into the office of the executive.
At many companies, CEOs have fashioned teams among their top officers that meet frequently and resolve jointly, their vertical control giving way to lateral leadership. As a result, researchers expanded their field of view from the chief executive to the entire upper echelon. They applied fresh strategies, ranging from personal interviews to
direct observation and internet surveys, to acquire data on top managers other than the chief executive who now matter far more to company performance and stock analysts. A host of studies subsequently confirmed what many executives already appreciated. A better predictor of a company’s performance than the chief executive’s capability is the quality of the team that now runs the show (Finkelstein and Hambrick 1996). Yet even this widening of research attention beyond the chief executive has not broadened enough. Many companies have expanded the concept of top management to include one, two, or even three hundred senior managers whose decisions have decisive bearing on firm performance. The concept of top management, then, has expanded in the minds of company managers and, to a lesser extent, academic researchers, from just the boss to the boss’s team and now to the boss’s court. If shareholders are increasingly insistent that company directors require top managers to deliver steadily rising returns on their investments, their action is premised on a critical assumption that these top management teams do make the difference. While the premise may seem intuitively obvious to those inhabiting this world, it is far from self-evident to those who study it. Observers diverge from the assumption of influential executives in two opposing directions, one viewing top managers as all powerful, others seeing them as all but powerless. Vilfredo Pareto and kindred ‘elite’ theorists early articulated the all-powerful view, but C. Wright Mills captured it best for the US in his classic work, The Power Elite (1956). Company executives, in his acid account, had joined with government officials and military commanders in an unholy alliance to exercise a commanding control over companies and the country. To the extent that top management is indeed all powerful, shareholders possess a silver bullet for righting whatever has gone wrong in a company: install new management. 
For the community of management scholars, Pfeffer (1981) well articulated the opposite, all-powerless view with the contention that market and organizational constraints so tied top management’s hands that it was much the captive, not a maker, of its own history. Production technologies and competitive frays in this view are far more determining of company results than the faceless executives who sit in their suites. It counts little who serves in top management, and, by extension, on the board. If top management is as powerless as the imagery suggests, investors seeking new management in response to poor performance are surely wasting their time. Company executives do complain of a seeming helplessness at times, but almost anybody in working, consulting, or research contact with top managers reports just the opposite. Direct witnesses typically describe instead a commanding presence of top management, and they characterize its capabilities as
making the difference between a company’s success or failure. Investors themselves think otherwise as well. When companies announce executive succession, money managers and stock analysts are quick to place a price on the head of the newly arrived, and, depending on the personality, billions can be added to or subtracted from the company’s capitalization. The importance that investors and directors attribute to top management for growing their fortunes can also be seen in the ultimate punishment for failure to do so: dismissal. Whether in the US, Japan, or Germany, the likelihood of a CEO’s exit in the wake of a stock free fall is increased by as much as half. And investors are often the engine behind the turnover. Study of Japanese companies stunned by a sharp reversal of fortune, for example, reveals that those whose top ten shareholders control a major fraction of the firm’s stock are more likely than others to dismiss the president and replace directors (Kaplan 1997, Kang and Shivdasani 1997). Similarly, company boards are observed to reward successful executives far more generously than in the past. For each additional $1,000 added to a company’s value in 1980, directors on average provided their chief executive an extra $2.51. By 1990, the difference in their payments had risen to $3.64, and by 1994 to $5.29. Put differently, in 1980 the boards of companies ranked at the 90th percentile in performance gave their CEOs $1.4 million more than did boards whose firms ranked at the 10th percentile (in 1994 dollars). But by 1990, the boards had expanded that gap to $5.3 million and by 1994 to $9.2 million (Hall and Liebman 1998).
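The scale of the change in these pay figures is easier to see as ratios. The short calculation below uses only the numbers quoted above from Hall and Liebman (1998); the helper name `growth_multiple` is a hypothetical label introduced for this sketch.

```python
# Extra CEO pay (in dollars) per additional $1,000 of company value, by year.
pay_per_1000 = {1980: 2.51, 1990: 3.64, 1994: 5.29}

# Pay gap (in millions of 1994 dollars) between 90th- and 10th-percentile firms.
gap_millions = {1980: 1.4, 1990: 5.3, 1994: 9.2}


def growth_multiple(series, start, end):
    """How many times larger the end-year figure is than the start-year figure."""
    return series[end] / series[start]


sensitivity_growth = growth_multiple(pay_per_1000, 1980, 1994)
gap_growth = growth_multiple(gap_millions, 1980, 1994)
print(f"pay-for-performance sensitivity grew {sensitivity_growth:.2f}x")
print(f"90th vs 10th percentile pay gap grew {gap_growth:.2f}x")
```

Between 1980 and 1994 the sensitivity of pay to company value roughly doubled, while the gap between the rewards at strongly and weakly performing firms grew more than sixfold.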
4. Between Stockholders and Top Management: The Governing Board
If companies make semiconductors and concrete much the same way the world around, they organize their directors in almost as many ways as there are national economies. American boards tend to include only two or three insiders, while Japanese boards rarely include even two outsiders. German and Dutch governance is built around a two-tier governance structure, employees holding half of the upper tier seats; British and Swiss governance is designed around a single-tier, management-dominated structure. Some systems give formal voice to labor, others none: German law requires that labor representatives serve on the board, while French law places labor observers in the boardroom. American law mandates nothing, but some boards have added a labor leader on their own (Charkham 1994). Yet such varieties in national practices may well diminish for two reasons. First, as equity markets internationalize, with companies seeking capital from all corners of the globe, investors predictably prefer relatively consistent director models that they believe
will optimize shareholder value and performance transparency regardless of country. Their mental models of what’s right may not always be right, but that is beside the point. Consider the preference expressed by some institutional investors in the US for a board chair who is not a sitting executive, believing that separation improves monitoring. Whatever the separatist penchant and however well it may have served some companies, research reveals that such a division creates no clear advantage for company performance (Baliga et al. 1996). The diversity in directorship practices is likely to diminish, however, for a second, more factual reason. Certain practices do engender better performance, regardless of the setting, and as companies increasingly compete worldwide for customers and investors, they are likely to adopt what indeed does prove best. A case in point here is the number of directors on the board. Research on teams’ success suggests that bigger is not better when boards exceed 15 or 20 members. When the number of directors climbs far above middle range, the engagement of each diminishes, and so too does their capacity to work in unison. Studies of American and European firms reveal that smaller boards displayed stronger incentives for their chief executive, greater likelihood of dismissing an under-performing CEO, and larger market share and superior financial performance (Conyon and Peck 1998). It is not surprising, then, to see companies migrating worldwide toward smaller boards in search of improved performance. Directors in principle protect owner interests, direct company strategy, and select top management. In practice, they had concentrated more on strategy and selection and less on owner interests. But the rising power of investors has made for greater director focus on creating value and less coziness with top management. American and British directors, following the ‘Anglo-Saxon’ model, are already more focused on value than most.
But directors in other economies can be expected to gravitate slowly toward the mantra of shareholder supremacy as well.
5. Shareholders, Directors, and Executives at the Table
With the rise of professional investors and their subjugation of national boundaries, those who occupy the executive suite and those who put them there are drawing far more research and policy attention, and justifiably so. In a bygone era when markets were more steady and predictable, when airline and telephone executives confidently knew what to expect next year, the identity of top management mattered little. When shareholders were far smaller and decidedly quieter, when airline and telephone directors comfortably enjoyed inconsequential board meetings, the composition of the governing body mattered little
either. During the 1990s, however, all of this changed in the US and UK, with other economies close behind. As shareholders have sought to reclaim authority over what they owned, they have brought top management and company directors back in. Company strategy and financial results can no longer be understood without understanding the capabilities and organization of those most responsible for delivering them. The market is no longer viewed as so impersonal, the company no longer so isolated. Ownership and financial decisions are embedded in a complex network of working relations among money managers, stock analysts, company directors, senior executives, and state regulators. Given the right combination of features, that lattice can yield high investment returns and robust national growth. Given the wrong amalgam, it can lead instead to self-dealing, frozen form, and economic stagnation. The road ahead thus depends on how well researchers understand what control mechanisms and leadership styles work well both within national settings and across cultural divides, and on the extent to which top managers, company directors, and stockholders learn and apply what is best.
See also: Business Law; Capitalism; Capitalism: Global; Corporate Finance: Financial Control; Corporate Governance; Corporate Law; Institutional Investors; International Law and Treaties; International Organization; International Trade: Commercial Policy and Trade Negotiations; Internet: Psychological Perspectives; Law and Economics; Law and Economics: Empirical Dimensions; Markets and the Law; Monetary Policy; Regulation, Economic Theory of; Regulation: Working Conditions; Stock Market Predictability; Technological Innovation; Venture Capital
Bibliography
Baliga B R, Moyer R C, Rao R S 1996 CEO duality and firm performance: What’s the fuss? Strategic Management Journal 17: 41–53
Berle Jr A A, Means G C 1932 The Modern Corporation and Private Property. Commerce Clearing House, New York
Brancato C 1999 International Patterns of Institutional Investment. Conference Board, New York
Charkham J 1994 Keeping Good Company: A Study of Corporate Governance in Five Countries. Oxford University Press, New York
Conyon M J, Peck S I 1998 Board size and corporate performance: Evidence from European companies. European Journal of Finance 4: 291–304
Davis G F, Thompson T A 1994 A social movement perspective on corporate control. Administrative Science Quarterly 39: 141–73
Easterbrook F H, Fischel D R 1991 The Economic Structure of Corporate Law. Harvard University Press, Cambridge, MA
Finkelstein S, Hambrick D C 1996 Strategic Leadership: Top Executives and Their Effects on Organizations. West Publishing, Minneapolis-St. Paul, MN
Hall B J, Liebman J B 1998 Are CEOs really paid like bureaucrats? Quarterly Journal of Economics 113: 653–91
Kang J-K, Shivdasani A 1997 Corporate restructuring during performance declines in Japan. Journal of Financial Economics 46: 29–65
Kaplan S N 1997 Corporate governance and corporate performance: A comparison of Germany, Japan, and the U.S. Journal of Applied Corporate Finance 9: 86–93
Karpoff J M 1998 The impact of shareholder activism on target companies: A survey of empirical findings. Unpublished, University of Washington
Manne H G 1965 Mergers and the market for corporate control. Journal of Political Economy 73: 110–20
Pfeffer J 1981 Management as symbolic action: The creation and maintenance of organizational paradigms. In: Cummings L L, Staw B M (eds.) Research in Organizational Behavior. JAI, Greenwich, CT, Vol. 3
Roe M J 1994 Strong Managers, Weak Owners: The Political Roots of American Corporate Finance. Princeton University Press, Princeton, NJ
Useem M 1996 Investor Capitalism: How Money Managers Are Changing the Face of Corporate America. Basic Books, New York
Useem M, Karabel J 1986 Pathways to top corporate management. American Sociological Review 51: 184–200
Walsh J P, Seward J K 1990 On the efficiency of internal and external corporate control mechanisms. Academy of Management Review 15: 421–58
M. Useem and G. Davis
Strategic Human Resources Management
Defining ‘strategic human resource management’ and placing it in context involves several related questions. In the first place, what do we mean by ‘strategy’? Then, what is ‘human resource management’ and does it differ from ‘strategic human resource management’? What are the characteristics and issues concerning different approaches to strategic human resource management? What is the relationship between strategic human resource management and organizational performance?
1. What is Strategy?
First, a simple answer: Strategy may be defined as a set of fundamental choices about the ends and means of an organization. More recently, it might be defined as policies and procedures that are aimed at securing competitive advantage. That said, there are several different approaches to exploring the question ‘what is strategy?’ Some focus on the content of strategy and its positioning in relation to factors internal or external to
the organization that might be seen as salient to its securing competitive advantage. An example of the internal, firm-focused approach would be the ‘resource-based’ view of strategy of the firm (e.g., Barney 1991). This perspective defines firms as unique bundles of resources and suggests that the route to developing sustained competitive advantage is to develop resources internally that meet the criteria of value, rarity, being costly to imitate, and nonsubstitutability. Examples of the latter, market-focused approach might be Porter’s (1980) generic strategies (low cost, differentiation, segment, or niche) in relation to his five-forces model or Miles and Snow’s (1978) categorization of firms as prospectors, analyzers, defenders, and reactors, both models emphasizing market-oriented choices. Theories that bring together internal and external perspectives are life-cycle theories of the firm or product (for example, the Boston Consulting Group’s portfolio planning growth-share matrix). A second approach, which can subsume the content approaches, focuses on the processes of strategy making. A famous distinction is that between ‘planned’ and ‘emergent’ strategy (Mintzberg and Walters 1985). Building on this is a useful typology of Whittington (1993), based on two axes, relating to continua of outcomes (profit maximizing/pluralistic) and of processes (deliberate/emergent). Four perspectives on strategy are identified: the ‘classical’ (profit maximizing/deliberate), ‘evolutionary’ (profit maximizing/emergent), ‘processual’ (pluralistic/emergent) and the ‘systemic’ (pluralistic/deliberate). The processes of strategy making that are ‘deliberate’ comprise rational, planned, top-down decision making (‘classical’ model) and decision making shaped by the social systems in which organizations are embedded (‘systemic’ model).
‘Emergent’ processes are those that reflect the very bounded, incremental, and political nature of decision making (‘processual’) or, in contrast, the ‘population ecology’ view of the competitive processes of natural selection, where the market picks the winners, rendering ineffective any deliberate strategizing (‘evolutionary’). Writers typical of each of these perspectives would be Ansoff (1965) (‘classical’), Hannan and Freeman (1988) (‘evolutionary’), Mintzberg and Walters (1985) (‘processual’), and Granovetter (1985) (‘systemic’).
2. What is Strategic HRM?
Even leaving aside the work on HRM by European writers, such as Anthony, Keenoy, Legge, Townley, and Willmott, that lies within a critical discourse tradition and which, as such, will not be considered here, there has been much debate on this question since the mid-1980s (see, e.g., Beer and Spector 1985, Fombrun et al. 1984, Guest 1987, Walton 1985). For many years, especially in the USA, human resource
management was just another term for personnel management. Both terms referred to the management of the employment relationship and of the indeterminacies in the employment contract. In the early 1980s, however, a combination of factors (increased globalization of markets and perceived intensification of competition, the rise of the ‘enterprise culture,’ the Japanese ‘Janus’ and models of excellence, the decline in trade union pressure and disillusionment with ‘traditional’ personnel management, changes in the nature of work and in the workforce) awoke senior managers to the importance of human resources in the achievement of competitive advantage, and to a recognition that HRM/personnel management was too important to be left to personnel specialists. Indeed, whether HRM was considered to be different from personnel management largely depended on the point of comparison. Sharp distinctions and contrasts emerged if the normative aspirations of HRM were compared to the descriptive practices of personnel management, but otherwise faded into several different emphases, all of which, though, pointed to HRM, in theory at least, being an essentially more central strategic management task than personnel management. As one commentator put it, the real difference between HRM and personnel management was ‘not what it is, but who is saying it. In a nutshell HRM represents the discovery of personnel management by chief executives’ (Fowler 1987, p. 3). Hence the idea of strategic human resource management. However, different commentators espouse different models and approaches to strategic HRM. A major distinction may be found between the ‘hard’ model of the Michigan school (Fombrun et al. 1984), reflecting what has been termed a utilitarian instrumentalism, and the ‘soft,’ ‘mutuality’ model of the Harvard school (Beer and Spector 1985, Walton 1985), more reminiscent of developmental humanism (Hendry and Pettigrew 1990).
The ‘hard’ model stresses HRM’s focus on the crucial importance of the close integration of human resources policies, systems, and activities with business strategy, requiring that they are not only logically consistent with and supportive of business objectives, but achieve this effect by their own coherence. From this perspective, employees are regarded as a headcount resource to be managed in exactly the same rational, impersonal way as any other resource, i.e., to be exploited for maximal economic return. Thus its focus is on human resource management (with the stress on ‘resource’), and the implied perspective on strategy seems to have much in common with Whittington’s ‘classical’ profit-maximizing/deliberate model of strategy. In contrast, the ‘soft,’ ‘developmental humanism’ model, while still emphasizing the importance of integrating HR policies with business objectives, sees this as involving treating employees as valued assets, a source of competitive advantage through their commitment, adaptability, and high quality (of skills, performance, etc.). Employees are proactive and
resourceful rather than passive inputs into the productive process. Rather than exploitation and cost minimization, the watchwords in this model are investment and value added, with the focus on human resource management (with the stress on ‘human’). The strategic perspective here, in Whittington’s terms, is, therefore, pluralistic and echoes some of the ideas of the resource-based view of strategy. Throughout the 1990s this model of HRM became increasingly seen, in both the USA and UK, as the exemplar of what HRM, as a distinctive philosophy of managing the employment relationship, was all about (see, e.g., Becker and Gerhart 1996, Guest 1987, Huselid 1995, MacDuffie 1995). Termed variously the ‘high commitment’ (HCM), ‘high performance,’ or ‘mutuality’ model of HRM, at an operational level it was seen to comprise such policies as:
(a) Careful recruitment and selection with an emphasis on traits and competencies;
(b) Extensive use of systems of communication;
(c) Team working with flexible job design;
(d) Emphasis on training and learning;
(e) Involvement in decision making with responsibility (empowerment);
(f) Performance appraisal linked to contingent reward systems.
Increasingly, throughout the 1990s, particularly among US commentators, this model came to be equated with strategic HRM, reflecting the popularity of resource-based views of strategy in its belief in acquiring and developing employees as assets (see, e.g., Becker and Gerhart 1996, Boxall 1996). In some ways this reflects the interests of HR specialists and academics alike. Fitting HR policy and practice to business strategy always leaves HR as the ‘third order’ strategic decision, designed to reinforce prior ‘first order’ decisions about portfolio planning and ‘second order’ business decisions about organizational design and operating procedures, still reactive if no longer as powerless as traditional personnel management.
The resource-based view of strategy accords a more proactive role to HRM as it can be valued not just for reinforcing predetermined business strategies, but for developing strategic capability, improving the long-term resilience of the firm. However, it would be a mistake to see these two contrasting models of HRM as necessarily incompatible. If the stress is on integrating HRM policies with business strategy, in certain situations, choosing HCM policies may be appropriate to driving the business strategy. HCM appears the logical choice where employee discretion is crucial to the delivery of added value, and reciprocal high-trust relations are essential to organizational performance. Such a situation is often equated with organizations pursuing a strategy of producing high value-added goods and services, in a knowledge-based industry, where it adopts a policy of value-added goods and services rather than asset management. But if the organization’s chosen business strategy is to compete in a labor-intensive, high-volume, low-cost industry, generating profits through increasing market share by cost leadership, such a policy of treating employees as assets may be deemed an expensive luxury. For such an organization the HR policies that may be most appropriate to driving strategic objectives are likely to involve treating employees as a variable input and cost to be minimized. Further, human resource management in the ‘soft’ sense of the HCM model is only viable if the organization is seen to be economically successful, that is, if it also manages effective human resource management in the ‘hard’ sense. At the heart of the distinction between ‘hard’ and ‘soft’ models of HRM lies a fundamental contradiction that personnel management/HRM, whatever the nomenclature, is called upon to mediate: the need to achieve both the control and the consent of employees. Integration with business strategy may require the design and interpretation of an employment contract that veers toward control (with assumptions of labor as a commodity, embedded structural conflict, low trust, and transactional contracts) for some groups of employees; but consent (with assumptions about the employee as a valued asset, a unitary frame of reference, high trust, and relational contracts) for others. This raises a major question about what precisely we mean by ‘integrating HRM with business strategy,’ the cornerstone of the ‘hard’ model of HRM. ‘Integration’ appears to have two meanings: integration or ‘fit’ with business strategy, and the integration or complementarity and consistency of ‘mutuality’ employment policies aimed at generating employee commitment, flexibility, quality, and the like. This double meaning has been referred to also as the ‘external’ and ‘internal’ fit of the HRM policies.
The problem is that while ‘fit’ with strategy, as suggested above, would argue a contingent design of HRM policy, internal consistency, at least with the ‘soft’ human resource values associated with ‘mutuality,’ would argue a universalistic approach to the design of employment policy. Can this contradiction be reconciled without stretching to the limit the meaning of HRM as a distinct approach to managing people at work? Indeed, should we focus on HRM as a ‘special variant’ of personnel management, reflecting a particular ideology about how employees should be treated? Or should we regard it as a variety of very different policies and practices designed to achieve the desired employee contribution, judged solely ‘against criteria of coherence and appropriateness (a less rigid term than “fit”)’ (Hendry and Pettigrew 1990, pp. 8–9)? This distinction is critical as the contingent and universalistic approaches rest on very different and contradictory theoretical perspectives about organizational competitiveness. The universalistic approach is consistent with institutional theory and arguments about organizational isomorphism (DiMaggio and
Powell 1983). In other words, the assumption here is that organizations that survive do so because they identify and implement the most effective, ‘best’ policies and practices. As a result, successful organizations get to look more and more like each other through practices like benchmarking and so on. (Here we have echoes of Whittington’s ‘evolutionary’ perspective on strategy.) In HR terms this equates with the belief that treating employees as assets via the HCM model will always pay off, irrespective of circumstances. Contingency approaches, on the other hand, are consistent with the resource-based theories, referred to earlier, that argue that sustained competitive advantage rests not on imitating so-called ‘best practice,’ but on developing unique, nonimitable competencies (Barney 1991). This approach rests on a recognition of the importance of idiosyncratic contingencies that result from causal ambiguity and path dependency. From this perspective organizational performance is not enhanced merely by following ‘best’ HR practice (the HCM model), but from knowledge about how to combine, implement, and refine the whole potential range of HR policies and practices to suit the organization’s idiosyncratic contingencies (Boxall 1996). This brings us to what has become the key research question about strategic HRM, in both the USA and UK: what is its relationship to organizational performance?
3. Strategic HRM and Organizational Performance
Is strategic HRM linked to organizational performance on a universalistic (additive), configurational (patterned), or contingency (idiosyncratic) basis (Delery and Doty 1996)? At first sight the greatest support appears to be for the universalistic model: that the greater the extent to which the characteristics of the ‘high commitment’/‘high performance’ model of HRM are adopted the better the organizational performance (see, e.g., empirical studies in both the UK and USA by Delery and Doty 1996, Guest 1997, 1999, Huselid 1995, MacDuffie 1995, Patterson et al. 1997). (For support of the contingency position, see Arthur 1992.) However, on examination, the empirical evidence for such universalism is more equivocal than it might appear at first sight. The problems stem from methodological concerns with research design (see Legge 2000) and empirical observation (see Cully et al. 1999). Leaving aside disagreements about the specification and measurement of HCM HR practices, there are several major difficulties in drawing clear inferences from this research. Much of the work has been cross-sectional and survey-based, relying on the responses of one senior executive. The level of analysis is generally at corporate level, given the reliance on financial measures
of performance, as it is at this level that much of the publicly available financial data exist. This leads to a range of difficulties. Where the research design is cross-sectional (as opposed to longitudinal), while causality may be inferred from correlation, technically it is not tested. This results in three possibilities. A causal relationship may exist in the direction inferred, i.e., HCM HR policies and practices may give rise to positive outcomes, although even if this did exist, it might reflect no more than a temporary ‘Hawthorne’ effect in response to a change program (Purcell 1999). (Of course it is possible, with the same direction of causality, that strategic HRM, particularly contingently chosen ‘hard’ HRM, may give rise to negative outcomes, or to positive outcomes on organizational performance measures but negative outcomes on employee measures, such as turnover and absenteeism.) Or, reverse causality may exist. Thus, as a firm becomes more profitable, it may invest in HCM HR practices, such as expenditure on training, or profit sharing, in the belief that such practices will further enhance performance, or that they will reduce the risk of performance declines, or from a belief in the justice and efficacy of wealth distribution. However, it is the profits that generate HCM practices rather than vice versa. A further possibility is that the observed relationship between HR practices and performance may be an artefact of the implicit performance theories of survey respondents.
In other words, if respondents have little detailed knowledge of the HR practices in their firm (highly likely if the firms are large, diverse and multisite and the research is being conducted at corporate level) but know that the firm is performing well in terms of productivity and profitability, they may infer that ‘high performance’ HR practices must exist, given this level of performance, based on the implicit theory that such practices are related to high performance (Wright and Gardner 2000). Similarly, the groups of employees that are likely to be most visible to such a respondent are those core employees to whom HCM is most likely to be applied, not the contingent workers to whom more transactional contracts are likely to be applied. The workers on the periphery of an organization may simply not be visible to the single senior management respondent to a corporate-level survey (Purcell 1999). Issues of workforce segmentation appear to be consistently neglected in such research (Boxall 1996). Second, there is a theoretical problem with the heavy reliance on financial measures of performance (return on assets, Tobin’s q) or organizational measures (productivity, quality, customer satisfaction) without equal concern to measure employee outcomes. (This is particularly marked in the US studies cited. Some recent UK studies, such as Guest 1999 and Cully et al. 1999, have attempted to measure employees’ responses, such as satisfaction and employee relations climate.) The HCM model of HR practices rests on
the assumption that they will result in a positive psychological contract between employees and the organization that will give rise to employees’ high commitment, quality, and flexibility, that in turn will have a positive effect on organizational performance and, ultimately, financial results. Reliance solely on financial and performance measures seems misguided as the theoretical rationales of how HR affects organizational performance rest on the assumption that it is through employee behaviors and outcomes. However, while the corporate level of analysis is consistent with the collection of financial data, it is not conducive to assessing employee attitudes and behaviors. Failure to open the ‘black box’ of employee behavior is a major limitation of many of these studies. However, to do so within the confines of quantitative positivistic designs (deemed necessary to establish generalizable causal relationships) raises immense problems about how many and what variables should be in the ‘black box’ and how these variables should be specified. Then there is the issue of distinguishing between espoused and actual HR practices and employee skills and behaviors. The greater the number of intervening variables identified and the greater the level of specificity, the greater the multiplicative effect in determining the processes of a model, as the model building requires specification of the relationships between each of the specifications of the major intervening variables, giving rise to potentially unmanageable complexity (Wright and Gardner 2000). Turning to empirical observation, we are confronted with a puzzling finding. If we are to accept at face value the research studies that suggest that there is a universalistic relationship between ‘high commitment,’ ‘soft’ model HRM and organizational performance, why do we find such relatively little implementation of this model of HRM?
In the UK, for example, the latest Workplace Employee Relations Survey (WERS) in 1998 (Cully et al. 1999) found that only 14 percent of the responding workplaces have ‘soft’ HRM fully in place (defined as eight or more out of 15 ‘high commitment’ management practices) as opposed to 29 percent which had three or fewer, 22 percent of which, with three or fewer HCM practices and no unions, may be defined in Guest’s own memorable phrase as constituting a ‘black hole.’ As several commentators have observed, the implementation of any management best practice is likely to fall foul of vested interests, politicking, organizational history, and culture, precisely the factors that undermine the exercise of the ‘classical’ model of strategizing, but which are central to the ‘processual’ model. Indeed, it has been suggested that in the Anglo-American world, since about the mid-1980s, the dominant emphasis in practice has been on short-term survival rather than the development of long-term resource-based advantage, with the widespread, ad hoc adoption of the ‘hard’ policies of delayering, downsizing, and increasingly contingent forms of employment (Boxall 1996). Such initiatives
seem consistent with Whittington’s ‘processual’ model of strategizing, emerging as a stream of incremental, politically feasible actions in response to pressing problems.
4. Future Directions of Research on Strategic HRM
Whether from a practitioner or academic viewpoint, interest in the HRM/performance relationship appears to be here to stay. Most of the commentators already cited have suggestions for the improvement of research designs (e.g., more within-industry, business-level and plant-level designs; more consistent and valid measures of HR practices; more longitudinal studies; recognition of a broader range of organizational outcomes; opening up the ‘black box,’ and so on). Guest (1997) and Purcell (1999) present two contrasting approaches, one from a universalistic and one from a contingent perspective. Guest (1997, 1999) suggests that expectancy theory might provide a theory of process to link HRM practices and performance, as it links motivation and performance. Specifically, expectancy theory proposes that, at the individual level, high performance depends on high motivation, coupled with the necessary skills and abilities and appropriate role design and perception, equating skills and abilities with quality, motivation with commitment, and role structure and perception with flexibility. HR practices designed to foster these HR outcomes (e.g., selection and training for skills and abilities/quality, contingent pay and internal promotion for motivation/commitment, and team-working design for appropriate role design and perception/flexibility) should facilitate high individual performance, which in turn is a contributory factor in high performance outcomes (e.g., high productivity, low absence and labor turnover) that should contribute (other things being equal) to financial outcomes. In an empirical study of employees’ reactions to HCM HR policies, Guest (1999) suggests that the psychological contract may be a key intervening variable in explaining the link between HR practices and employee outcomes such as job satisfaction, perceived job security, and motivation.
Furthermore, although Guest’s research is cross-sectional and, hence, raises the usual caveats about causality, the inferred direction of his empirical findings is supported by a similar longitudinal study (Patterson et al. 1997). Nevertheless, Guest acknowledges the limitations of these models in terms of explaining organizational outcomes. While we may be able to measure the impact of HRM practices on HRM outcomes (quality, commitment, and flexibility), the measurable impact on organizational and financial outcomes is likely to become progressively weaker because of the range of potentially intervening variables. Guest also suggests
that we need a theory about the circumstances when human resources matter more and a theory about how much of the variance between HR practices and performance can be explained by the human factor. Purcell (1999, pp. 36–8), building on a resource-based view of strategy, has some interesting observations here. He recognizes that, on the one hand, the claim that the bundle of ‘best practice’ HCM HR is universally applicable ‘leads us into a utopian cul-de-sac,’ ignoring dual labor markets, contingent workers, and business strategies that logically do not require ‘high commitment’ HR practices to achieve financial success. On the other hand, the search for a contingency model of HRM is a ‘chimera,’ ‘limited by the impossibility of modelling all the contingent variables, the difficulty of showing their interconnection, and the way in which changes in one variable have an impact on others, let alone the need to model idiosyncratic and path-dependent contingencies.’ The way forward, Purcell argues, is the analysis of how and when HR factors come into play in the management of strategic change. Purcell suggests that we should explore how organizations develop successful transition management, build unique sets of competencies and distinctive organizational routines, and, in situations of ‘leanness,’ with greater dependency on all core workers, develop inclusiveness and the trust of such workers.
The focus of research on strategic HRM (indeed, the focus of strategic HRM itself) should be on ‘appropriate HR architecture and the processes that contribute to organizational performance in the short and medium term, and which positively contribute to the achievement of organizational flexibility or longevity.’
See also: Decision Making, Psychology of; Human Resource Management, Psychology of; Industrial Relations and Collective Bargaining; Industrial Sociology; Industrial–Organizational Psychology: Science and Practice; Organizational Behavior, Psychology of; Organizational Change; Organizational Decision Making; Organizations, Sociology of; Rational Choice and Organization Theory
Bibliography
Ansoff H 1965 Corporate Strategy. Penguin, Harmondsworth, UK
Arthur J B 1992 The link between business strategy and industrial relations systems in American steel minimills. Industrial and Labor Relations Review 45: 488–506
Barney J 1991 Firm resources and sustained competitive advantage. Journal of Management 17(1): 99–120
Becker B, Gerhart B 1996 The impact of human resource management on organizational performance: Progress and prospects. Academy of Management Journal 39(4): 779–801
Beer M, Spector B 1985 Corporate-wide transformations in human resource management. In: Walton R E, Lawrence P R (eds.) Human Resource Management, Trends and Challenge. Harvard Business School Press, Boston, pp. 219–53
Boxall P 1996 The strategic HRM debate and the resource-based view of the firm. Human Resource Management Journal 6(3): 59–75
Cully M, Woodland S, O’Reilly A, Dix G 1999 Britain at Work. Routledge, London
Delery J, Doty H 1996 Models of theorizing in strategic human resource management: Tests of universalistic, contingency, and configurational performance predictions. Academy of Management Journal 39(4): 802–35
DiMaggio P, Powell W 1983 The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review 48(2): 147–60
Fombrun C, Tichy N M, Devanna M A (eds.) 1984 Strategic Human Resource Management. Wiley, New York
Fowler A 1987 When chief executives discover HRM. Personnel Management 19(1): 3
Granovetter M 1985 Economic action and social structures: The problem of embeddedness. American Journal of Sociology 91(3): 481–510
Guest D E 1987 Human resource management and industrial relations. Journal of Management Studies 24(5): 503–21
Guest D E 1997 Human resource management and performance: A review and research agenda. International Journal of Human Resource Management 8(3): 263–90
Guest D E 1999 Human resource management: the workers’ verdict. Human Resource Management Journal 9(3): 5–25
Hannan M T, Freeman J 1988 Organizational Ecology. Cambridge University Press, Cambridge, MA
Hendry C, Pettigrew A M 1990 Human resource management: An agenda for the 1990s. International Journal of Human Resource Management 1(1): 17–43
Huselid M 1995 The impact of human resource management practices on turnover, productivity, and corporate financial performance. Academy of Management Journal 38: 635–72
Legge K 2000 Silver bullet or spent round? Assessing the meaning of the ‘high commitment management’/performance relationship. In: Storey J (ed.) Human Resource Management: A Critical Text. Thomson Learning, London, pp. 20–36
MacDuffie J P 1995 Human resource bundles and manufacturing performance: Organizational logic and flexible production systems in the world auto industry. Industrial and Labor Relations Review 48: 197–221
Miles R E, Snow C C 1978 Organizational Strategy, Structure and Process. McGraw-Hill, New York
Mintzberg H, Walters J A 1985 Of strategies, deliberate and emergent. Strategic Management Journal 6: 257–72
Patterson M, West M, Lawthom R, Nickell S 1997 Impact of People Management Practices on Business Performance. IPD, London
Porter M 1980 Competitive Strategy. Free Press, New York
Purcell J 1999 Best practice and best fit: Chimera or cul-de-sac? Human Resource Management Journal 9(3): 26–41
Walton R E 1985 From control to commitment in the workplace. Harvard Business Review 63(2): 77–84
Whittington R 1993 What is Strategy and Does it Matter? Routledge, London
Wright P M, Gardener T M 2000 Theoretical and empirical challenges in studying the HR practice-performance relationship. Paper presented at the special workshop on Strategic Human Resource Management, EIASM, INSEAD, Fontainebleau, France, 30 March–1 April
K. Legge
Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Strategic Intelligence
Intelligence is the provision of information about targets of concern, mostly foreign, for the use of decision makers, mostly governmental. The information normally prioritizes secrets and/or forecasts, and the degrees of animosity toward the targets (ranging from hostility all the way over to friendliness) are likely to shape the various efforts. The intelligence that is called ‘strategic’ is especially alert to the more competitive ranges of this spectrum, and is also loosely distinguished from what is called ‘tactical’ intelligence, the strategic seen as having wider scope, longer time horizons, and organizationally higher standing. Such strategic intelligence is several thousand years old, yet it lacked widely known, sharply focused manuals and treatises until the twentieth century (except in the narrow domain of cryptology). The long delay was no doubt related to the traditional secretiveness of most of the intelligence activities, but also to slowness in differentiating them from broad statecraft and diplomacy and generalship, and from each other, and eventually institutionalizing various intelligence functions: distinct modes of information collection and analysis and interface with decision makers, and systematic counterintelligence shielding of one’s own efforts from those of others. With these developments in the twentieth century have come incentives for elaborate doctrine and theorizing inside the intelligence organizations, and outsiders are increasingly being allowed to play a part, often driven also by their own concerns for social science applications and/or for greater democratic accountability. A significant literature is emerging.
1. History of the Emergence of Intelligence Institutions and a Literature
Some of the most ancient Mesopotamian tablets are cryptograms, and Western memory has highlighted espionage activities since the sieges of Troy and Jericho. Systematic literature on generalship, statecraft, and diplomacy became abundant, and commonly included passages on the need to have spies and to do assessments of information, but little about exactly how to do the assessments, even less about the subtleties of exactly how to direct one’s spies, and almost nothing about how to be a spy and perform espionage. For the latter, from among published works that ever achieved wide circulation, one would have had to rely on some fiction and a few memoirs. Sheer secretiveness does not suffice to explain the slowness of emergence of a systematic intelligence literature: such literature began to appear with the emergence of modern science, and appropriately was first about cryptology (ca. 1550), the aspect of intelligence that could most readily benefit from the application of mathematics, etc., with ‘how to’ orderliness. Traditional espionage had required secretiveness too, but about methods that were thought to be sufficiently inculcated through age-old conventional wisdom and craftsmanship/apprenticeship not to require formal treatises for transmission of new specialties to novice practitioners or for continued scholarly updating, as cryptology soon did.
Many advances in science and technology were of course also contributing importantly to a general, lurching modernization of warfare and statecraft in the seventeenth and eighteenth centuries, and to a precipitation (which finally became rapid in the late nineteenth century) of new institutionalized specialties, crystallized out of the age-old fluid mixes of generalship, statecraft, and diplomacy. Again cryptology helped lead the way, from ‘black chambers’ of the seventeenth century to ‘signals intelligence’ (SIGINT), salient with the advent of telegraphy. But until after Napoleon, the crystallization of intelligence functions in Europe was mostly discontinuous, ad hoc, and personalistic. Generals and admirals, rulers and their top advisers constructed intelligence outfits (usually rather small) as they saw fit and employed them idiosyncratically. There was a general presumption that diplomats would engage in espionage, discreetly, as part of their role as reporters to their home governments; likewise postal services dutifully would intercept accessible messages; merchant captains would be available to be debriefed, similarly traders and merchant bankers; clerics would commonly collaborate; and so forth, ad hoc. Yet eventually the long century of relative peace after 1815 enabled the career diplomats to consolidate for themselves, as reporters, a professional status and ethic of personal abstention from conducting espionage, though their embassies and consulates might continue to harbor other functionaries in roles that did provide cover for agent running.
Also, military attachés (likewise naval and later air) acquired a formally recognized ‘licensed spy’ status that was mutually convenient to the governments concerned. And the spread of democracy led to emphasis on differentiating one’s counterintelligence security arrangements, domestic from international. In the twentieth century, technical collection (TECHINT) methods proliferated, utilizing new technologies in SIGINT and beyond it: notably overhead reconnaissance (imagery intelligence, IMINT). Such new or reconstituted intelligence specialties, specifically identified as such, were incompatible with continuation of ad hocery; they required distinct stable career bureaucracies, at least within the armed services, and probably also outside lest political, economic, and other civilian priorities be absorbed subordinately into military juggernauts, even in peacetime. Instead, civilian intelligence services could do some of the espionage and could highlight open sources of information, contributing to a more complete (‘all-source’) system of collection. To be sure, that would complicate coordination in all-source assessment/analysis of the collected information and in dissemination of it to the policy makers. So additional new institutions would be needed to reintegrate the recently proliferating institutions. And each of them, old and new, would come to need doctrines, for its own stable identity and its missions. All of this was in fact the swiftly evolving setting for twentieth-century discourse about strategic intelligence.
2. Components of the Twentieth Century Intelligence Literature
Informed discourse about intelligence requires access to evidence. Yet successful conduct of intelligence requires much secretiveness and deception. The conflict between these two principles is the central dilemma retarding scholarship about the subject, in this era when the complexity and importance of the functions and institutions have finally achieved general recognition, while the prevailing standards of respectability in social science do not comport easily with the dubious quality of evidence available for open discourse about intelligence. The field is nearly unique in that evidence is not only hard to find because it is deliberately hidden but additionally is hard to trust because often it has been deliberately planted to deceive. At least some governmental relaxation of long-enforced secrecy restrictions is virtually prerequisite for building an evidentiary base usable by outsiders. Yet various possible governmental motives for relaxation rouse suspicions that even the releases are being directly or indirectly rigged. Governments do hold the keys. Investigative journalists, especially those with current events preoccupations, can be stymied; defectors and adversaries can be denigrated and largely discredited; veterans can be held to oaths of silence; archives can be kept closed and/or purged. Nevertheless, whenever stonewalling erodes and instead validation gradually proceeds, any and all of these potential sources can come cumulatively to constitute the evidentiary base for informed open discourse. That is what has in fact been slowly occurring (especially with momentum from the United States) in a widening circle of democratic and ex-communist countries during the decades since the USA’s traumatic encounter with Vietnam. Exposés have led step by step to treatises.
And much that in the past had indeed been factual but had been presented as intelligence fiction, so as to elude restrictions, can now be accorded less ambiguous credibility, even become part of the accepted database. Historiography and some social science can take hold. Among outside historians of twentieth-century intelligence, the most outstanding achievement thus far is the series of masterworks by the UK’s Christopher Andrew (1985, 1995, Andrew and Gordievsky 1990) on the services of each of the major intelligence powers. For coverage of earlier centuries also, the general masterwork is German (Piekalkiewicz 1988); about cryptology it is American (Kahn 1996). Now, even for intelligence history as recent as the Cold War, multiarchival international research collaborations are sometimes successful, e.g., Russians and Americans (Fursenko and Naftali 1997, Murphy et al. 1997). For sheer massive contemporary updated data accumulation by an outsider no one anywhere rivals the USA’s Jeffrey Richelson (1999). But most important for durably connecting strategic intelligence with social science are the works that do not just add to the base of openly available evidence but use it to rethink and generalize intelligence problems. In this genre, usually the writers who have been best positioned to take early advantage of the new permissiveness come from among former intelligence practitioners (or supervisory associates) who happen to be of uncommonly reflective, even scholarly, disposition, have familiarized themselves with the open literature currently available in their respective periods, and soberly resolve to contribute their own perspectives to its cumulative advancement. Milestone works of this kind in the United States have been Kent (1966), Johnson (1989), and Shulsky (1991), and in the UK, Herman (1996), all of them distinguished for comprehensiveness as well as for insider/outsider complementarity of strengths. What kinds of contributions can they make, in the widening discourse based on increasingly available data and personal contacts? And how about authors who remain essentially outsiders yet want to study the arcane world of ongoing intelligence, not just its history? What does the existing literature already exhibit and foreshadow about the feasibility and value of various research directions?
Scholars can act like investigative journalists and ‘speak the unspeakable,’ in reports that are also additional evidence for generalization by others. Scholars can act as systematic ethicists, judging different forms or instances of intelligence activity. Much of such effort has been devoted to evaluating covert action (intrigue, etc.), but not much yet to intelligence in the stricter sense, where problems like loyalties and means–ends relationships also abound, worth normative exploration. Epistemologists and cognitive psychologists have wide room to apply their disciplinary skills studying problems of perception/misperception/deception that arise in intelligence collection and assessment, and in counterintelligence where these complexities are loosely tagged a ‘wilderness of (interfacing) mirrors.’ The pioneering studies here were done as academic political psychology (Jervis 1970, 1976). There is room for management-school faculties to contribute insights from their fields of management psychology and of operations research. Here quantitative methods are central whereas most other outsiders’ work on intelligence, thus far, has been entirely qualitative or only rudimentarily quantitative. However, rational-choice/game-theory modelers’ increasing emphasis on their players’ information levels may come to alter this imbalance, if the applicability to intelligence studies gets recognized. Political scientists as well as psychologists can conceptualize and taxonomize in their studies of intelligence. This is a function that insiders are very familiar with on their own; they eagerly construct, debate, and reconstruct their secret lexicons, knowing full well the value of verbal precision (and deliberate imprecision). But such efforts on the inside are prone to express conflicting parochialisms. Taxonomies by outsiders may be more objective, a moderating influence, and sometimes their conceptualization can even introduce or clarify ideas that have not already become familiar to insiders blinkered by their own secret terminology. Similarly, outside social scientists can sometimes venture into speculative scenario writing, which implicitly can be a useful cross-check upon the range and content of comparable exercises done inside, even when the outsiders are not privy enough to the latter for the cross-checking to be made explicit. Political scientists as well as like-minded historians can make cross-national comparisons leading to some mid-level theorizing, for example about how much the interface between intelligence analysts and top policymakers is affected by characteristics of different countries’ governmental systems, be they pluralistic or relatively centralized. All these modes of widening discourse about strategic intelligence, besides having benefits for the scholars and some practitioners, contribute to making civil society in general more knowledgeable about this important dimension of world affairs.
Thus the discourse can tend to enhance broad democratic accountability, with long-term advantages that probably outweigh the occasional damage from excessive leaks and disclosures. But wide acquiescence by insiders comes only very slowly. See also: Cold War, The; Information and Knowledge: Organizational; Information Technology; Intelligence: Organizational; Knowledge Societies; Military Sociology; Organizations, Sociology of; Technology and Social Control; War, Sociology of
Bibliography
Andrew C 1985 Her Majesty’s Secret Service: The Making of the British Intelligence Community. Viking, New York
Andrew C 1995 For the President’s Eyes Only: Secret Intelligence and the American Presidency from Washington to Bush. HarperCollins, New York
Andrew C, Gordievsky O 1990 KGB: The Inside Story, 1st edn. HarperCollins, New York
Fursenko A V, Naftali T 1997 One Hell of a Gamble: Khrushchev, Castro, and Kennedy, 1958–1964, 1st edn. Norton, New York
Herman M 1996 Intelligence Power in Peace and War. Cambridge University Press, Cambridge, UK
Jervis R 1970 The Logic of Images in International Relations. Princeton University Press, Princeton, NJ
Jervis R 1976 Perception and Misperception in International Politics. Princeton University Press, Princeton, NJ
Johnson L K 1989 America’s Secret Power: The CIA in a Democratic Society. Oxford University Press, Oxford, UK
Kahn D 1996 The Codebreakers: The Story of Secret Writing, rev. edn. Scribner, New York
Kent S 1966 Strategic Intelligence for American World Policy, rev. edn. Princeton University Press, Princeton, NJ
Murphy D E, Kondrashev S A, Bailey G 1997 Battleground Berlin: CIA vs. KGB in the Cold War. Yale University Press, New Haven, CT
Piekalkiewicz J 1988 Weltgeschichte der Spionage: Agenten, Systeme, Aktionen [World History of Espionage: Agents, Systems, Operations]. Südwest Verlag, Munich, Germany
Richelson J T 1999 The U.S. Intelligence Community, 4th edn. Westview, Boulder, CO
Shulsky A N 1991 Silent Warfare: Understanding the World of Intelligence. Brassey’s (US), Washington, DC
H. B. Westerfield
Strategic Planning
Strategic planning may be defined as ‘a disciplined effort to produce fundamental decisions and actions that shape and guide what an organization (or other entity) is, what it does, and why it does it’ (Bryson 1995, pp. 4–5). This definition implies that strategic planning is not a single thing, but is instead a set of concepts, procedures, and tools that must be carefully tailored to situations if desirable outcomes are to be achieved. Many authors have suggested generic strategic planning processes. Poister and Streib (1999), in a review of the field of strategic planning and management, assert that strategic planning:
(a) Is concerned with identifying and responding to the most fundamental issues facing an organization;
(b) Addresses the subjective question of purpose and the often-competing values that influence mission and strategies;
(c) Emphasizes the importance of external trends and forces, as they are likely to affect the organization and its mission;
(d) Attempts to be politically realistic by taking into account the concerns and preferences of internal, and especially external, stakeholders;
(e) Relies heavily on the active involvement of senior level managers, and sometimes elected officials, assisted by staff support where needed;
(f) Requires the candid confrontation of critical issues by key participants in order to build commitment to plans;
(g) Is action oriented and stresses the importance of developing plans for implementing strategies; and
(h) Focuses on implementing decisions now in order to position the organization favorably for the future.
It is important to highlight what strategic planning is not. Strategic planning is not a substitute for strategic thinking and acting. It may help people do that, but used unwisely it may hinder strategic thinking and acting. Strategic planning is not a substitute for leadership. At least some key actors must be committed to the process or it is bound to fail. Strategic planning is also not a substitute for an organizational or community strategy. Strategies have numerous sources, both planned and unplanned. Strategic planning is likely to result in a statement of organizational or community intentions, but what is realized in practice will be some combination of what is intended with what emerges along the way. Finally, strategic planning is not synonymous with what is called ‘comprehensive planning’ for communities in the USA, or has been called ‘structure planning’ in Europe. There may be little difference if the agency doing the comprehensive or structure planning is tied directly to governmental decision-makers. However, in practice, there may be three significant differences. First, the plans are often prepared to meet legal requirements and must be formulated according to a legally prescribed process with legally prescribed content. As legal instruments, these plans have an important influence. On the other hand, the plans’ typical rigidity can conflict with the political process with which public officials must cope. Therefore strategic plans can provide a bridge from legally required and relatively rigid policy statements to actual decisions and operations.
Second, comprehensive or structure plans are usually limited to a shorter list of topics than a government’s full agenda of roles and responsibilities. For that reason, they are of less use to key decision makers than strategic planning, which can embrace all of a government’s actual and potential roles before deciding why, where, when, and how to act. Third, as Kaufman and Jacobs (1987) argue, strategic planning on behalf of a community is more action-oriented, more broadly participatory, more emphatic about the need to understand the community’s strengths and weaknesses as well as the opportunities and threats it faces, and more attentive to inter-community competitive behavior. Strategic planning is thus typically more comprehensive than comprehensive planning or structure planning, while at the same time producing a more limited action focus. Strategic planning also is typically distinguished from strategic management. Strategic management is a far more encompassing process ‘concerned with managing an organization in a strategic manner on a continuing basis’ (Poister and Streib 1999, p. 310). Strategic management links strategic planning and implementation by adding ongoing attention to budgeting; performance measurement, evaluation, and management; and feedback relationships among these elements.
1. Brief History of Development and Use of Strategic Planning
Strategic planning for organizations has primarily developed in the private sector. This history has been amply documented by others (Eden and Ackermann 1998). However, since around 1980, public and nonprofit use of strategic planning has skyrocketed. This experience, and a growing body of literature, have indicated that strategic planning approaches either developed in the private sector, or else strongly influenced by them, can help public and nonprofit organizations, as well as communities or other entities, deal in effective ways with their dramatically changing environments. In the USA, where strategic planning has proceeded the furthest, a substantial majority of municipal and state governments, and an overwhelming majority of federal agencies and nonprofit organizations, make use of strategic planning. Strategic planning has been principally applied to public and nonprofit organizations, but applications to communities have also increased substantially (Berman and West 1998). Other nations are also making increased use of strategic planning concepts, procedures, and tools for public and nonprofit organizations and communities (e.g., Osborne and Plastrik 1997, Salets and Faludi 2000).
2. Benefits
When done well, strategic planning offers a number of benefits. Advocates usually point to four main potential benefits. The first is the ‘promotion of strategic thought and action.’ The second is ‘improved decision making,’ while the third is ‘enhanced organizational responsiveness and improved performance.’ Finally, strategic planning can ‘directly benefit the organization’s people’ by helping them better perform their roles, meet their responsibilities, and enhance teamwork and expertise. There is no guarantee, however, that these benefits will be achieved. For one thing, strategic planning is simply a set of concepts, procedures, and tools that must be applied wisely to specific situations. Also, even when applied wisely there is no guarantee of success.
3. Approaches to Strategic Planning
Strategic planning may be divided into ‘process’ and ‘content’ approaches. We start with broadly encompassing process approaches, then consider more narrowly conceived process approaches, and finally discuss two content approaches.
3.1 The Harvard Policy Model
The Harvard policy model was developed as part of the business policy courses taught at the Harvard Business School since the 1920s (Bower et al. 1993). The approach provides the principal (though often implicit) inspiration behind the most widely cited recent models of public and nonprofit sector strategic planning. The main purpose of the Harvard model is to help an organization develop the best ‘fit’ between itself and its environment; that is, to develop the best strategy for the organization. One discerns the best strategy by analyzing the internal strengths and weaknesses of the company and the values of senior management, and by identifying the external threats and opportunities in the environment and the social obligations of the firm. Then one designs the appropriate organizational structure, processes, relationships, and behaviors necessary to implement the strategy, and focuses on providing the leadership necessary to implement the strategy. Effective use of the model presumes that senior management can agree on the organization’s situation and the appropriate strategic response, and has enough authority to enforce its decisions. A final important assumption of the model, common to all approaches to strategic planning, is that if the appropriate strategy is identified and implemented, the organization will be more effective. For the model to be useful for public and nonprofit purposes, it typically has to be supplemented with other approaches, such as the portfolio and strategic issues management approaches discussed below. A portfolio approach is needed because a principal strategic concern at the organizational level is oversight of a portfolio of agencies, departments, or programs. Strategic issues management is needed because much public and nonprofit work is typically quite political, and articulating and addressing issues is the heart of political decision making.
The systematic assessment of strengths, weaknesses, opportunities, and threats (known as a SWOT analysis) is a primary strength of the Harvard model. This element of the model appears to be applicable in the public sector to organizations, functions, and communities. Another strength is its emphasis on the need to link strategy formulation and implementation in effective ways. The main weaknesses of the Harvard model are: (a) it does not draw attention to strategic issues; (b) it does not consider the full range of stakeholders; and (c) it does not offer specific advice on how to develop strategies, except to note that effective strategies will build on strengths, take advantage of opportunities, and overcome or minimize weaknesses and threats.
3.2 Strategic Planning Systems
Strategic planning is often viewed as a system whereby leaders and managers go about making, implementing, and controlling important decisions across functions and levels in the organization. When viewed this way, strategic planning is synonymous with strategic management. Strategic planning systems vary along several dimensions: the comprehensiveness of decision areas included, the formal rationality of the decision process, and the tightness of control exercised over implementation of the decisions, as well as how the strategy process itself will be tailored to the organization and managed. The strength of these systems is their attempt to coordinate the various elements of an organization’s strategy across levels and functions. Their weakness is that excessive comprehensiveness, prescription, and control can drive out attention to mission, strategy, and innovation, and can exceed the ability of participants to comprehend the system and the information it produces (Mintzberg 1994, Roberts and Wargo 1994). Strategic planning systems are applicable to public organizations (and to a lesser extent communities), for, regardless of the nature of the particular organization, it makes sense to coordinate decision making across levels and functions and to concentrate on whether the organization is implementing its strategies and accomplishing its mission (Boschken 1988, 1994). However, it is important to remember that a strategic planning system characterized by substantial comprehensiveness, formal rationality in decision making, and tight control will work only in an organization that has a clear mission, clear goals and objectives, relatively simple tasks to perform, centralized authority, clear performance indicators, and information about actual performance available at reasonable cost. While some public and nonprofit organizations, such as hospitals and police and fire departments, operate under such conditions, most do not.
As a result, most public and nonprofit sector strategic planning systems typically focus on a few issues, rely on a decision process in which politics plays a major role, and control something other than program outcomes (e.g., budget expenditures) (Osborne and Plastrik 1997).
3.3 Stakeholder Management Approaches
Freeman (1984) states that strategy can be understood as an organization’s mode of relating to or building bridges to its stakeholders. For Freeman, a stakeholder may be defined as any individual, group, or organization that is affected by, or that can affect, the future of the organization. He argues, as do others who emphasize the importance of attending to stakeholders, that a strategy will only be effective if it satisfies the needs of multiple groups. Because it integrates economic, political, and social concerns, the stakeholder model is one of the approaches that is most applicable to the public sector. Many interest groups have stakes in public organizations, functions, and communities. If the model is to be used successfully, however, there must be the possibility that key decision-makers can achieve reasonable agreement about who the key stakeholders are and what the response to their claims should be. The strengths of the stakeholder model are its recognition of the many claims (both complementary and competing) placed on organizations by insiders and outsiders and its awareness of the need to satisfy at least the key stakeholders if the organization is to survive. Because of its attention to stakeholders, the approach may be particularly useful in working collaboratively with other organizations, since each party to a collaboration must be viewed as a stakeholder (Huxham 1993). The weaknesses of the model are the absence of criteria with which to judge competing claims and the need for more advice on developing strategies to deal with divergent stakeholder interests. A number of other encompassing process approaches to strategic planning have been developed primarily in the UK in the field of operations research. These include Journey Making (Eden and Ackermann 1998) and Strategic Choice (Friend and Hickling 1997). They are now beginning to be used more widely around the world.
3.4 Strategic Issues Management Approaches
These approaches are process components, pieces of a larger strategic planning process. The concept of strategic issues first emerged when practitioners of corporate strategic planning realized a step was missing between the SWOT analysis of the Harvard model and the development of strategies. That step was the identification of strategic issues, or crucial challenges facing the organization. Many organizations now include a strategic issue identification step as part of full-blown strategy revision exercises, and also as part of less comprehensive annual strategic reviews between major strategy revisions. In addition, many organizations have developed strategic issue management processes actually separated from annual strategic planning processes. Many important issues emerge too quickly to be handled as part of an annual process. Strategic issue management is clearly applicable to public organizations, since the agendas of these organizations consist of issues that should be managed strategically (Eden and Ackermann 1998). The strength of the approach is its ability to recognize and analyze key issues quickly. The approach also applies to functions or communities, as long as some group, organization, or coalition is able to engage in the process and to manage the issue. The main weakness is that in general the approach offers no specific advice on exactly how to frame the issues other than to precede their identification with a situational analysis of some sort. Nutt and Backoff (1995) have gone the furthest in remedying this defect.
The final two process approaches to be discussed are ‘process strategies.’ They are logical incrementalism and strategic planning as a framework for innovation. Process strategies are approaches to implementing a strategy that already has been developed in very broad outline and is subject to revision based on experience with its implementation. Other important process approaches not discussed in this article, due to space limitations, include total quality management (Cohen and Brand 1993), strategic negotiations (Susskind and Cruikshank 1987), collaboration (Gray 1989), and the management of culture (Schein 1992).
3.5 Logical Incrementalism
In incremental approaches, strategy is a loosely linked group of decisions that are handled incrementally. Decisions are handled individually below the organizational level because such decentralization is politically expedient: organizational leaders should reserve their political clout for crucial decisions. Decentralization also is necessary since often only those closest to decisions have enough information to make good ones. The incremental approach is identified principally with Quinn (1980), although the influence of Lindblom (1959) is apparent. Quinn developed the concept of ‘logical incrementalism’ (incrementalism in the service of overall organizational purposes) and as a result transformed incrementalism into a strategic approach. Logical incrementalism is a process approach that, in effect, fuses strategy formulation and implementation. The strengths of the approach are its ability to handle complexity and change, its emphasis on minor as well as major decisions, its attention to informal as well as formal processes, and its political realism. A related strength is that incremental changes in degree can add up over time into changes in kind (Mintzberg 1987). The major weakness of the approach is that it does not guarantee that the various loosely linked decisions will add up to fulfillment of corporate purposes. Logical incrementalism would appear to be very applicable to public and nonprofit organizations, as long as it is possible to establish some overarching set of strategic objectives to be served by the approach.
When applied at the community level, there is a close relationship between logical incrementalism and collaboration. Indeed, collaborative purposes and arrangements typically emerge in an incremental fashion as organizations individually and collectively explore their self-interests and possible collaborative advantages, establish collaborative relationships, and manage changes incrementally within a collaborative framework (Stone 2000).
3.6 Strategic Planning as a Framework for Innovation
As noted in Sect. 3.5, strategic planning systems can drive out attention to mission, strategy, and innovation. The systems, in other words, can become ends in themselves and drive out creativity, innovation, and new ventures without which the organization might become irrelevant or die. Many organizations, therefore, have found it necessary to emphasize innovative strategies as a counterbalance to the excessive control orientation of many strategic planning systems (Light 1998). The framework-for-innovation approach relies on many elements of the approaches discussed above, such as SWOT analyses and portfolio methods. This approach differs from earlier ones in four emphases: (a) innovation as a strategy; (b) specific management practices to support innovation; (c) development of a mission or ‘vision of success’ that provides the decentralized and entrepreneurial parts of the organization with a common set of super-ordinate goals toward which to work; and (d) nurture of an entrepreneurial organizational culture (Light 1998, Osborne and Plastrik 1997). The main strength of the approach is that it allows for innovation and entrepreneurship while maintaining central control. The weaknesses of the approach are that typically (and perhaps necessarily) a great many, often costly, mistakes are made as part of the innovation process and that there is a certain loss of accountability in very decentralized systems (Mintzberg 1994). Those weaknesses reduce the applicability to the public sector, in particular, in which mistakes are less acceptable and the pressures to be accountable for details (as opposed to results) are often greater (Light 1998). Many nonprofit organizations will also have trouble pursuing this approach because a shortage of important resources will magnify the risks of failure.
‘Process’ approaches do not prescribe answers, though good answers are presumed to emerge from appropriate application.
In contrast, ‘content’ approaches do yield answers. In fact, the models are antithetical to process when process concerns get in the way of developing the ‘right’ answers. In the next two sections, we discuss two content approaches: portfolio models and competitive analysis. Other important content approaches not covered in this article, due to space limitations, include ‘reinventing government’ (Osborne and Gaebler 1992), systems analysis (Senge 1990), and reengineering (Linden 1994).
3.7 Portfolio Models
The idea of strategic planning as managing a portfolio of ‘businesses’ is based on an analogy with investment practice in which an investor manages a portfolio of stocks to balance risk and return. Although the applications of portfolio theory to the public sector may be less obvious than those of the approaches described above, they are nonetheless just as powerful (Nutt and Backoff 1992). Many public and nonprofit organizations consist of ‘multiple businesses’ that are only marginally related. Often resources from various sources are committed to these unrelated businesses. That means public and nonprofit managers must make complex portfolio decisions, though usually without the help of portfolio models that frame those decisions strategically. The strength of portfolio approaches is that they provide a method of measuring entities of some sort (e.g., programs, proposals, or problems) against dimensions that are deemed to be of strategic importance (usually attractiveness of the option versus capability to implement it). Weaknesses include the difficulty of knowing what the appropriate strategic dimensions are, difficulties of classifying entities against dimensions, and the lack of clarity about how to use the tool as part of a larger strategic planning process. Portfolio approaches can be used in conjunction with an overall strategic planning process to provide useful information on an organization, function, or community in relation to its environment. However, unlike the process models, portfolio approaches provide an ‘answer’; that is, once the dimensions for comparison and the entities to be compared are specified, the portfolio models prescribe how the organization or community should relate to its environment. Such models will work only if a dominant coalition is convinced that the answers they produce are correct.
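The two-dimensional comparison described above (attractiveness of an option versus capability to implement it) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not drawn from any cited portfolio model: the 0 to 10 scoring scale, the threshold of 5, the quadrant labels, and the example programs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Program:
    """One entity in the portfolio (a program, proposal, or problem)."""
    name: str
    attractiveness: float  # hypothetical 0-10 score assigned by planners
    capability: float      # hypothetical 0-10 score for ability to implement


def classify(program: Program, threshold: float = 5.0) -> str:
    """Place a program in one cell of a 2x2 grid.

    The threshold and the four labels are illustrative assumptions only.
    """
    high_attractiveness = program.attractiveness >= threshold
    high_capability = program.capability >= threshold
    if high_attractiveness and high_capability:
        return "invest"
    if high_attractiveness:
        return "build capability"
    if high_capability:
        return "maintain"
    return "reconsider"


# Hypothetical portfolio of a public agency's unrelated 'businesses'.
portfolio = [
    Program("youth recreation", attractiveness=8.0, capability=7.0),
    Program("transit expansion", attractiveness=9.0, capability=3.0),
    Program("permit processing", attractiveness=4.0, capability=8.0),
    Program("legacy newsletter", attractiveness=2.0, capability=2.0),
]

grid = {p.name: classify(p) for p in portfolio}
for name, cell in grid.items():
    print(f"{name}: {cell}")
```

The sketch also makes the stated weaknesses concrete: the output depends entirely on which dimensions are chosen and on how each entity is scored, and neither choice is supplied by the tool itself.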
3.8 Competitive Analysis
Another important content approach that assists strategy selection has been developed by Michael Porter (1998a) and his associates. Porter (1998a) hypothesizes that five key competitive forces shape an industry: relative power of customers, relative power of suppliers, threat of substitute products, threat of new entrants, and the amount of rivalry activity among the players in the industry. For many public and nonprofit organizations, there are equivalents to the forces that affect private industry. An effective organization in the public and nonprofit sector, therefore, must understand the forces at work in its ‘industry’ in order to compete effectively, and must offer value to its customers that exceeds the cost of producing it. On another level, planning for a specific public function (health care, transportation, or recreation) can benefit from competitive analysis if the function can be considered an industry. In addition, Porter (1998b) points out that for the foreseeable future self-reinforcing agglomerations of firms and networks are crucial aspects of successful international economic competition. In effect, not just firms or nations, but metropolitan regions (Shanghai, Silicon Valley, New York, London, and Paris) are key economic actors. Regions interested in competing on the world stage should therefore try to develop the infrastructure necessary for virtuous (rather than vicious) cycles of economic growth to unfold. In other words, wise investments in education, transportation and transit systems, water and sewer systems, parks and recreation, housing, and so on, can help firms reduce their costs (particularly the costs of acquiring an educated labor force) and thus improve the firms’ ability to compete internationally. The strength of competitive analysis is that public and nonprofit organizations can use it to discover ways to help the private firms in their regions. When applied directly to public and nonprofit organizations, however, competitive analysis has two weaknesses: it is often difficult to know what the ‘industry’ is and what forces affect it, and the key to organizational success in the public and nonprofit world is often collaboration instead of competition. Competitive analysis for public organizations, therefore, must be coupled with a consideration of social and political forces and the possibilities for collaboration (Huxham 1993, Osborne and Plastrik 1997).
4. Conclusions
It should be noted that careful studies of corporate-style strategic planning in the public and nonprofit sectors are few in number. Nevertheless, it is possible to reach some tentative conclusions. First, it should be clear that strategic planning is not a single concept, procedure, or tool. In fact, it embraces a range of approaches that vary in their applicability to the public and nonprofit sectors and in the conditions that govern their successful use. Second, while any generic strategic planning process may be a useful guide to thought and action, it will have to be applied with care in a given situation, as is true of any planning process (Christensen 1999). Because every planning process should be tailored to fit specific situations, every process in practice will be a hybrid. Third, public and nonprofit strategic planning is well on its way to becoming part of the standard repertoire of public and nonprofit leaders, managers, and planners. Because of the often dramatic changes these people and their organizations confront, we can hypothesize that the most effective leaders, managers, and planners at the beginning of the twenty-first century are, and will be increasingly in the future, the ones who are best at ‘strategic’ planning. Finally, research must explore a number of theoretical and empirical issues in order to advance the knowledge and practice of public-sector strategic planning. In particular, strategic planning processes that are responsive to different situations must be developed and tested. These processes should specify key situational factors governing their use; provide specific advice on how to formulate and implement strategies in different situations; be explicitly political; indicate how to deal with plural, ambiguous, or conflicting goals or objectives; link content and process; indicate how collaboration as well as competition is to be handled; and specify roles for those involved in the process. Other topics in need of attention include: the nature of strategic leadership; ways to promote and institutionalize strategic planning across organizational levels, functions that bridge organizational boundaries, and intra- and interorganizational networks; and the ways in which information technologies can help or hinder the process. Progress has been made on all of these fronts, but more work is clearly necessary if we are to understand better when and how to use strategic planning.
See also: Nonprofit Sector, Geography of; Planning, Politics of; Planning Theory: Interaction with Institutional Contexts; Public Goods: International; Strategy: Organizational
Bibliography
Berman E M, West J P 1998 Productivity enhancement efforts in public and nonprofit organizations. Public Productivity and Management Review 22: 207–19
Boschken H L 1988 Strategic Design and Organizational Change. The University of Alabama Press, Tuscaloosa, AL
Boschken H L 1994 Organizational performance and multiple constituencies. Public Administration Review 54: 308–12
Bower J, Bartlett C, Christensen C, Pearson J 1991 Business Policy: Text and Cases, 7th edn. Irwin, Homewood, IL
Bryson J M 1995 Strategic Planning for Public and Nonprofit Organizations, rev. edn. Jossey-Bass, San Francisco, CA
Christensen K S 1999 Cities and Complexity: Making Intergovernmental Decisions. Sage, Los Angeles, CA
Cohen S, Brand R 1993 Total Quality Management in Government. Jossey-Bass, San Francisco, CA
Eden C, Ackermann F 1998 Making Strategy: The Journey of Strategic Management. Sage, Los Angeles, CA
Freeman R E 1984 Strategic Management: A Stakeholder Approach. Pitman, Boston
Friend J, Hickling A 1997 Planning Under Pressure: The Strategic Choice Approach. Butterworth-Heinemann, Oxford, UK
Gray B 1989 Collaborating: Finding Common Ground for Multiparty Problems. Jossey-Bass, San Francisco, CA
Huxham C 1993 Pursuing collaborative advantage. Journal of the Operational Research Society 44: 599–611
Kaufman J, Jacobs H M 1987 A public planning perspective on strategic planning. Journal of the American Planning Association 53(1): 23–33
Light P C 1998 Sustaining Innovation: Creating Nonprofit and Government Organizations that Innovate Naturally. Jossey-Bass, San Francisco, CA
Lindblom C E 1959 The science of muddling through. Public Administration Review 19: 79–88
Linden R M 1994 Seamless Government: A Practical Guide to Re-Engineering in the Public Sector. Jossey-Bass, San Francisco, CA
Mintzberg H 1987 Crafting strategy. Harvard Business Review 87(4): 66–75
Mintzberg H 1994 The Rise and Fall of Strategic Planning. Free Press, New York
Nutt P C, Backoff R W 1992 Strategic Management for Public and Third Sector Organizations: A Handbook for Leaders. Jossey-Bass, San Francisco, CA
Nutt P C, Backoff R W 1995 Strategy for public and third sector organizations. Journal of Public Administration Research and Theory 5(2): 189–211
Osborne D, Gaebler T 1992 Reinventing Government. Addison-Wesley, Reading, MA
Osborne D, Plastrik P 1997 Banishing Bureaucracy: The Five Strategies for Reinventing Government. Addison-Wesley, Reading, MA
Poister T H, Streib G 1999 Strategic management in the public sector: Concepts, models, and processes. Public Productivity and Management Review 22(3): 308–25
Porter M 1998a Competitive Advantage: Creating and Sustaining Superior Performance. Free Press, New York
Porter M 1998b The Competitive Advantage of Nations: With New Introduction. Free Press, New York
Quinn J B 1980 Strategies for Change: Logical Incrementalism. R D Irwin, Homewood, IL
Roberts N, Wargo L 1994 The dilemma of planning in large-scale public organizations: The case of the United States Navy. Journal of Public Administration Research and Theory 4: 469–91
Salet W, Faludi A (eds.) 2000 The Revival of Spatial Strategic Planning. Royal Netherlands Academy of Arts and Sciences, Amsterdam
Schein E 1992 Organizational Culture and Leadership, 2nd edn. Jossey-Bass, San Francisco, CA
Senge P M 1990 The Fifth Discipline: The Art and Practice of the Learning Organization. Doubleday, New York
Stone M M 2000 Exploring the effects of collaborations on member organizations: Washington county’s welfare-to-work partnership. Nonprofit and Voluntary Sector Quarterly 29(1) Supplement: 98–119
Susskind L E, Cruikshank J 1987 Breaking the Impasse: Consensual Approaches to Resolving Public Disputes. Basic Books, New York
J. M. Bryson
Strategy: Organizational
The subject of strategy in organizations has become one of the dominant features of the contemporary world of academics and practitioners. The use of the word is pervasive. The genesis and development of the concept has been relatively swift yet has produced a rich variety of interpretations and uses. Chandler took its core meaning to center on a logical sequence of goal-setting and resource allocation within firms. In his widely adopted formula (Chandler 1962, p. 13): ‘Strategy is the determination of the basic long-term goals of an enterprise, and the adoption of courses of action and the allocation of resources necessary for carrying out these goals.’ Subsequent work has emphasized the competitive element so that the purpose of a strategy is to maintain or improve performance. Strategies are judged by their ability to neutralize threats and exploit opportunities for the organization while capitalizing on strengths and avoiding weaknesses (Barney 1997, p. 27). Strategy also refers to the direction of an organization, which matches its resources to its environment in order to meet stakeholders’ expectations (Johnson and Scholes 1993, p. 10). Strategic management is concerned with the process by which strategies are adopted and put into practice.
1. Origins
The practical origin of the term strategy is found in military history. In the ancient world strategos was applied to the generalship of an army commander, ‘the art of the general.’ By 330 BC it also included notions of administration and the use of armed force in order not only to defeat enemies but also to establish systems of governance (Quinn et al. 1988, p. 1). The word strategy was first used in English in the seventeenth century. By 1870, military dictionaries distinguished between strategy (planning conducted away from the enemy) and tactics (the immediate engagement of an adversary) (Whipp 1996, p. 13). The idea of organizational strategy in a business or commercial sense was a product of the second half of the twentieth century. Yet its emergence as a separate subject of teaching and research in the 1960s owed much to its military roots. The pioneering book of Ansoff (1965) on Corporate Strategy, for example, distinguished between corporate strategy and business strategy, paralleling the military ideas of grand strategy and battle strategy. Corporate strategy was seen as selecting what business to be in; business strategy dealt with how to operate in a given business area. A second landmark text (Andrews 1971) also relied on these military distinctions. He emphasized the aim of strategy as establishing a clear dominance in a business domain while simultaneously planning for the movements of hostile adversaries (see also Quinn et al. 1988, p. 42).
The second main influence on the development of the subject of strategy was the discipline of economics. In the early decades of the twentieth century, research on industry form, competition and innovation had pointed to the role of individual firms in economic relations. Such writers were firmly set in the traditions of microeconomics. The approach was a rational one with a reliance on profit maximizing behavior by all economic agents, accessible information and the mobility of resources. Andrews’ view of strategy was therefore clear: ‘rivalry amongst peers for prizes in a defined and shared game’ (Pettigrew and Whipp 1991, p. 29).
The third force to shape the emergence of strategy as a subject was the geographical and corporate context. The flourishing of large, multi-product corporations, such as General Motors and Du Pont, took place in the 1920s and 1930s in the USA. As they grew commercially and organizationally more complex, so did their need for more sophisticated means of assessing markets and competitors and matching them with the resources of the firm. The early experts on strategy worked closely with these corporations in response to their needs. Ansoff was an executive at Lockheed Aerospace while Chandler was related to the Du Pont family. As the demand for technical experts to produce strategic analysis rose, so business schools and consulting firms grew in number and importance during the 1960s and 1970s. Outside of the US, corporate forms were much slower to evolve and in Europe the large-scale growth in business schools and the study of strategy did not occur until 20 years later.
2. Uneven Development
Given the late advent of strategy as a subject, its development since the 1960s has often been more colorful than that of more mature areas of the social sciences. New approaches have arisen, offering challenges to the main orthodoxies. Insights and techniques from outside of economics have been applied to enrich and extend the subject. As the study of strategy has progressed, so its full complexity has become apparent. The growth of the area is best understood by reference to three waves of activity: the classical era of the 1960s, the challenges set by organizational analysts in the 1970s and early 1980s, and the production of a more composite approach to meet the circumstances of the 1980s and early 1990s.
During the 1960s, strategy experts placed a heavy reliance on planning techniques. Financial planning tools were created to link capital investments with production and to derive profit forecasts. Forecast-based planning was constructed in the light of unstable market behavior. Business portfolio matrices and product lifecycle tools were examples of devices produced to make forecasts (Bourgeois 1996, p. 5). Early computing technology raised the expectations of strategic planners regarding the potential of this planning-led approach. The conviction that the environment of the firm was eminently capable of being captured by such techniques was strong.
In the following decades, this planning-led approach to strategy continued as the mainstream, particularly in North America. Alternative voices appeared on both sides of the Atlantic with their attention directed not to planning strategy but to the process of implementation. The process perspective was built upon decision-making theory (e.g., see Organizational Decision Making). First Lindblom (1959), and later March and Olsen (1976), demonstrated how the decisions of senior management were often less than rational, notwithstanding their use of planning apparatus. Trade-offs, compromises, and coalition formation were shown to be routine. These writers drew attention to the haphazard and highly subjective qualities of strategic decisions and their mode of implementation. Writers such as Mintzberg (1978) revised the profile of the strategic decision-making and implementation activities in an organization, suggesting that these constituted important change processes in their own right and should be analyzed and managed accordingly. Concepts of power and culture were applied to the problem for the first time (e.g., Organizations: Authority and Power). The result was a more realistic appreciation of the problems strategic management involved, including the issues of political influences, aborted ideas and unintended outcomes.
Whilst this process-based approach has continued as a separate tradition in the study of strategy (often labelled strategic change), it also had a major influence on the third wave of activity in the 1980s. The driving force behind these new developments was a dramatic shift among international competitors. Japanese firms in the manufacturing and consumer goods sectors enjoyed spectacular gains in overseas markets, notably the USA and Western Europe. Their market success was based on the ability to combine product innovations with new standards in quality, underpinned by (in the eyes of western managers) novel forms of organization.
These events provoked two responses from academics. The first was a renewed interest in competitive analysis, drawing on the microeconomic perspective (outlined in Sect. 1). Porter played the leading role by highlighting ‘generic strategies’ (1980), such as cost leadership vs. differentiation, and the need for organizations to clarify their choice of strategy. He went on to offer a model of extended competition (Porter 1985) which forced firms to treat seriously new sources of competition from outside a sector (e.g., see Competitive Strategies: Organizational). This fresh emphasis on competitor analysis was augmented by a rejuvenation during the 1980s of the resource-based theory of the firm. Here particular attention was paid to the identification and maintenance of the unique collection of capabilities (technological, organizational, financial, for example) found in successful firms but which others would find difficult to imitate. A preoccupation grew with ‘strategic intent’: the ability to create extremely long-term aspirations for an organization and link them to immediate operations (Hamel and Prahalad 1989). The insights of the strategic change analysts (e.g., see Organizational Change) became especially relevant to this new conception of strategy and competition (Pettigrew and Whipp 1991).
3. Current Themes
The emphasis in current strategy research arises, in part, from ideas and incomplete projects that began in earlier eras; this is particularly true for the interest in strategic capabilities and transformation. In the empirical sphere the dominant characteristic of strategy studies has been their entry into hitherto unresearched settings. At the same time, it has been noticeable how other specialisms in the business studies area, such as human resource management, have attempted to incorporate a strategic perspective (e.g., see Strategic Human Resources Management).
The harsh competitive conditions of the 1980s and 1990s produced a general interest amongst business people in organizations that could prosper in such circumstances. Companies such as Honda, Chrysler or ABB received intense scrutiny from practitioners and academics alike because of their superior strategic management. The result was that academics became exceptionally concerned with explaining their success. To this end, research has been targeted at the unusual ways in which such firms analyze their relationship with their environments, create ways of handling those relationships and do so in a consistent pattern of business behavior (Kay 1993, p. 9). The core of their success is attributed to their delivery of innovations in products or services. Research has shown that producing these innovations is explained by the existence of a set of supporting capabilities. In the case of Honda, its unique record of product innovation worldwide rests on a collection of distinctive capabilities involving problem solving, project management and working with suppliers (Pascale 1990).
The circumstances that prompted these exceptional feats of strategic management have, in turn, become the subject of academic attention. The exploitation of information technology has transformed entire industries.
New forms of communication have created totally novel means of acquiring information and purchasing goods or services. Information technology has also led to completely new types of products. The character of sectors, such as retail banking and insurance, has been transformed. The ability of organizations to manage the wholesale reworking of their strategic assumptions and operational methods has become a source of fascination to practitioners and academics alike. Whereas in the 1980s, the emphasis had been on the adjustment of company strategies to new rules of competition, today the contemporary agenda is dominated by the prospect of companies facing the growth of new products, markets and sectors on a scale not seen since the twentieth century in Europe and the early twentieth century in the USA.
In parallel to these high profile projects on strategic capabilities and transformations, strategy scholars have moved away from their preoccupation with corporations. The result has been an engagement with strategic management within small businesses, voluntary and not-for-profit sectors and professional organizations. These alternative contexts have provided useful tests for established theories. Sometimes the characteristics of these under-researched sites have forced analysts and managers in the corporate sector to review their models. The ability of voluntary organizations (e.g., see Voluntary Organizations) to succeed with few resources and yet motivate staff is a case in point (Johnson and Scholes 1993, pp. 23–30).
4. Methods
The strategy field has witnessed the use of an extensive battery of research methods during its comparatively brief existence to date. In the early phase, the quantitative methods associated with economics became well established. These techniques have continued to play a major role, especially in the measurement of performance and competition. From the 1970s, as the insights from psychology, politics and sociology were used to open up the strategic management process (see Sect. 2), so more qualitative approaches (including ethnography in the study of organizational culture, for example) were employed. Qualitative and quantitative methods currently coexist productively (Kaplan and Norton 1996). More contentious is the common reference by strategy writers to individual company examples. Outsiders may confuse teaching cases (usually of a single organization) with full-scale research-based case studies. Nevertheless, popular accounts of corporate success stories and the personal reflections of leading executives, which are light on formal research methodologies, have created an impression of superficiality which academics often have to live down.
5. Future Research
Given the youthful nature of the subject, the study of strategy in the future will continue to unravel the problems associated with understanding competitive environments, how they may be matched to the resources of an organization and how the resulting decisions can be implemented. Fresh opportunities for research will appear in the international sphere. This will include not only the continued relevance of global markets and organizations but also the further exploration of national contexts. The differences of the financial and business systems between the USA, Europe and Japan are well known; detailed comparisons with other parts of Asia or South America await. In addition, the fragmentation of social and economic relations (around, for example, network structures or self-organized commercial ventures) will pose a challenge to a subject which was founded on the problems of large corporations.
Other tests will be supplied by links between strategy writers and specialists from other subject areas. Examples include complexity theory and knowledge management experts from organization studies (e.g., see Learning: Organizational). There are also signs that those who are skeptical of the precepts of strategic management will increase their attention. Their assertion that strategy is merely a specialized form of language which enhances the power in society of management is the opening salvo in what could be a long exchange of intellectual fire power. The response by the strategy experts will demonstrate how far the subject has come of age into its mature form.
See also: Management: General
Bibliography
Andrews K R 1971 The Concept of Corporate Strategy. Irwin, Homewood, IL
Ansoff H I 1965 Corporate Strategy. McGraw-Hill, New York
Barney J B 1997 Gaining and Sustaining Competitive Advantage. Addison-Wesley, Reading, MA
Bourgeois L J 1996 Strategic Management from Concept to Implementation. Dryden, Fort Worth, TX
Chandler A 1962 Strategy and Structure. MIT Press, Boston, MA
Hamel G, Prahalad C K 1989 Strategic intent. Harvard Business Review May–June: 63–76
Johnson G, Scholes K 1993 Exploring Corporate Strategy, 3rd edn. Prentice-Hall, Hemel Hempstead
Kaplan R S, Norton D P 1996 The Balanced Scorecard. Harvard Business School Press, Boston
Kay J 1993 Foundations of Corporate Success. Oxford University Press, Oxford, UK
Lindblom C E 1959 The science of muddling through. Public Administration Review 19: 78–88
March J, Olsen J 1976 Ambiguity and Choice in Organizations. Universitetsforlaget, Bergen, Norway
Mintzberg H 1978 Patterns in strategy formation. Management Science 24: 934–48
Pascale R T 1990 Managing on the Edge. Viking, London
Pettigrew A, Whipp R 1991 Managing Change for Competitive Success. Blackwell, Oxford, UK
Porter M 1980 Competitive Strategy. Free Press, New York
Porter M 1985 Competitive Advantage: Creating and Sustaining Superior Performance. Free Press, New York
Quinn J B, Mintzberg H, James R M 1988 The Strategy Process. Prentice Hall, Englewood Cliffs, NJ
Whipp R 1999 Strategy and organisations. In: Clegg S, Hardy C, Nord W R (eds.) Handbook of Organization Studies. Sage, London
R. Whipp
Street Children: Cultural Concerns
Since the early 1980s street children have become a focus for attention in the media, an embarrassment to many bureaucrats, and a matter of priority for international agencies and numerous non-governmental organizations. At the same time, challenging academic and practical questions have been raised regarding who ‘street children’ are and the best strategies for helping them.
1. Definitions and Problematic Categorizations
The term ‘street children’ came into widespread use following the United Nations Year of the Child (1979), when it was adopted by international agencies and promptly translated into many different languages (as meninos de rua in Portuguese, niños de la calle in Spanish, enfants de la rue in French). It arose from attempts to develop a term free of negative connotations about children popularly known as street urchins, vagrants, gaminos, rag-pickers, or glue-sniffers, and historically as street Arabs or vagrants. Utilized in 1951 by UNESCO to refer to vagrant children following World War II, it was applied in 1986 to the developing world by UNICEF, the agency primarily concerned with children’s wellbeing, to designate minors who worked or slept on the streets (Williams 1993). At the most basic level, a street child is ‘a homeless or neglected child who lives chiefly in the streets’ (Oxford Dictionary). Welfare agencies, however, have reworked their definitions many times, encountering enormous problems in devising meaningful statements about street children and making appropriate distinctions between categories of children. Problems ensued even after the consensual definition formulated in 1983 by the Inter-NGO Programme for Street Children and Street Youth: ‘Street children are those for whom the street (in the widest sense of the word that is unoccupied dwellings, wasteland, etc …) more than their family has become their real home, a situation in which there is no protection, supervision or direction from responsible adults’ (see Ennew 1994, p. 15). Several terms could lead to confusion: what is meant by ‘family,’ ‘protection,’ or ‘responsible’ adults (Hecht 1998)?
Even what is meant by ‘home’ is conceptualized differently across cultures: being homeless is rendered as desamparado (without protection or comfort of other people) in Latin America, furosha (floating) in Japan, and khate (rag-picker) in Nepal, evoking concepts of disaffiliation, transience, and marginal work rather than notions of residential access or type of abode (Desjarlais 1996). Several attempts to categorize street children, developed to acknowledge their heterogeneous circumstances and lifestyles, have also proved problematic. In one typology following UNICEF, ‘children of the street’ have a family accessible to them but have made the streets their home; ‘children on the street’ are child workers who return at night to their families; ‘abandoned’ children literally have no home to go to, while children ‘at high risk’ are those likely to be drawn into street life. Distinctions have also been made between ‘abandoned’ and ‘abandoning’ street children (Felsman 1984), and among children according to their degree of family involvement and psychological characteristics. These terminologies have proved difficult to uphold in practice, given the fluidity of children’s movements on and off the streets and a lack of correspondence with the ways children themselves relate their experiences. Many labels such as children ‘without family contact’ and ‘abandoned’ lacked precision and were used ad hoc rather than analytically. The focus on categories of children, which led to the popular distinction between children of and on the streets (translated as de la calle/en la calle in Spanish, or de la rue/dans la rue in French), is falling into disuse. The term ‘street children’ is still widely used to highlight a set of working and living conditions rather than to refer to the personal or social characteristics of the children in question. But while street children are identified as those who occupy the public spaces of urban centres and whose activities are largely unsupervised by adults, it has proved important to move beyond definitions based solely on physical, social, and economic criteria.
2. Statistics and the Creation of a Social Issue
A great deal of controversy also surrounds the process of enumerating street children. Different agencies often issue widely different estimates, partly because they find street children difficult to count, partly because they are not always talking about the same children. In Brazil alone, the figure of 7 million street children is most often cited, typically with reference to the homeless, although careful surveys suggest a national estimate less than 1 percent of that figure (39,000 in Hecht 1998). The 1991 video ‘About the UN Rights of the Child’ stated that an estimated 200 million children around the world live on the streets, while another global figure of 100 million street children has also gained credibility. But how are the children counted and who exactly is included? Huge numbers, produced to draw attention to children’s plight, rest upon largely elastic and nebulous definitions of homeless and working children; they are usually the result of guesswork and have ‘no validity or basis in fact’ (Ennew 1994, p. 32). The same problems of enumeration and group definition beset studies of the homeless in the developed world. The definitions of ‘homeless’ may shift to include those who sleep rough, live in shelters, squat, or double up with other families, encompassing the literally homeless and the precariously housed (Glasser and Bridgman 1999). Estimates of homeless youth (whether ‘runaways,’ in care, or with a parent) are often inflated by welfare agencies to legitimate their role, yet minimized by bureaucratic institutions with legal or financial responsibilities; thus statistics reflect the various agendas of the organizations which collect them (Hutson and Liddiard 1994). The process of identifying and counting street youths is problematic in itself, and it is also a part of the construction and management of homelessness as a social issue. Notwithstanding their imprecision, estimates can help to identify significant groups and appropriate interventions in different cultural contexts. It is important to know that the great majority of street children in developing countries are not ‘homeless’ but return at night to their families (they are children on the streets). The fact that they are predominantly boys has drawn attention to why local circumstances should differ according to gender. Care is taken to reveal the age distribution between teenage adolescents, those in late childhood, and very young children with families, who may experience different reactions from the public, as well as the ethnic distribution. Estimates also help to appreciate changing trends. In the US, the fastest growing subgroup among the homeless is presently young children.
3. Representations of Street Children
Several explanations have been offered for the phenomenon of homelessness and the dramatic rise in the numbers of street children. There are, broadly speaking, two major schools of thought: one links homelessness to personal pathology, the other to external structural factors (Glasser and Bridgman 1999). The former focuses on problems such as family dysfunction and substance abuse or mental illness, which increase an individual’s risk of taking to the streets. The latter emphasizes broad economic and sociopolitical factors such as poverty and marginalization under conditions of rapid urbanization, rural-to-urban migration, or conflict, which affect the stability of households. Whether linked to biographic or structural factors, street children are often portrayed as victims of deprived environments and helpless with respect to changing circumstances, or else as social deviants (Veale et al. in Panter-Brick and Smith 2000). The ‘characteristics’ of street children (their tenuous links with families and independence from adults) do not conform to commonly held views about what constitutes a ‘proper’ childhood in contemporary thought. Popular images of street children (as victims or villains) rest upon particular representations of children and twentieth century discourses about childhood that arose among the wealthier classes of the western industrial world (Boyden 1990). The apposition of ‘street’ and ‘children’ suggests that street location is a peculiar and significant mark of identity outside the normal frame: ‘after all, children can also be found using fields, lofts, and gardens without there being any apparent need to coin terms such as “field children,” “loft children,” or “garden children”’ (Glauser 1990). Street children upset broad expectations both about the street (a place for commerce, men, and prostitutes) and about childhood (a time for carefree and protected innocence) (Hecht 1998). From the modern Western viewpoint, children who are not ‘at home’ and nurtured by responsible adults are ‘forsaken’ or ‘deviant,’ and street children in particular are ‘out of place’ in society and ‘outside childhood’ (Connolly and Ennew 1996). It has been argued that such portrayals are actually unhelpful to the children themselves (Ennew 1994). Many interventions which emphasize a vulnerable and ‘lost’ childhood have focused on ‘rescuing’ children from the streets by placing them in institutions or back with a family; this tends to ignore the social networks and coping strategies developed by the children, and generally has failed to provide lasting solutions. The focus on children’s activities ‘on the street’ has also tended to promote unidimensional accounts of their lives; for instance, it is not often realized that children who live on the street commonly attend school and retain contacts with their families (Ennew and Milne 1997).
Furthermore, public condemnation of street murders in Latin America has unwittingly reduced children’s status to that of hunted prey, which conveys little understanding of the ways in which violence is actually experienced: in Brazil, while harassment at the hands of police is ubiquitous and extermination or torture not uncommon, street children are far more likely to die at the hand of their peers than as victims of death squads (Hecht 1998).
4. Child-centered Perspectives and Research Issues
In the late 1990s fundamental assumptions about children and appropriate methods of research began to be re-evaluated, in the context of international legislation on children’s ‘best interests’ (see Children, Rights of: Cultural Concerns) and the development of participatory methodology (Ennew and Milne 1997). Social science research came to recognize that children are not helpless or passive, but capable of making informed decisions about the future, and that they express views and aspirations that may well differ from those of adults. Child-centered perspectives and participatory research methods have emphasized the need to recognize the agency and social competence of street children and to enhance their participation. The key to research and project planning has changed to working ‘with’ children rather than ‘for’ them: ‘they are not ‘objects of concern’ but people. They are vulnerable, but not incapable. They need respect, not pity’ (Ennew 1994, p. 35). Remedial action in the best interests of children, as stressed by the United Nations Convention on the Rights of the Child, is not a matter of rescuing children, but of listening to them and of fostering child participation. A range of participatory approaches has been attempted, such as having street youths conduct actual research, encouraging children to represent their world through drawings or plays, and staging workshops or group discussions on topics selected by the children themselves. These methods, more appropriate than the ubiquitous questionnaire survey, aim to involve the children in the research process rather than simply gather information or anecdotal testimonies. They have the potential to generate more valid data on children’s lives but raise their own set of ethical questions. Two significant research problems have been the lack of comparative information on peer groups within the same country and a failure to evaluate longitudinally the ‘career’ outcomes of street children (see Baker and Panter-Brick in Panter-Brick and Smith 2000). The tendency has been to focus on snapshot descriptions of street children with little time-depth or contextual information. Street children live complex and varied lifestyles: they may move home and back to the street, go to school, or work at times for an employer. Few street children remain on the street into adulthood, although the range of outcomes is enormously variable and shaped by cultural context. While prison, insanity, or death are the common expectations of Brazilian street children, stable employment, marriage, and children are achievable career paths for Nepali street children. Comparative and longitudinal research is critical, but not easily achieved.
For instance, it is easy to contend that homelessness endangers children’s physical and emotional health, and indeed most studies of the homeless in the West and of street children in developing countries have emphasized the debilitating and deprived aspects of street life. Recently, however, a number of comparative studies have questioned the expectation that homeless children should be the most vulnerable in deprived environments, arguing that poverty, not homelessness per se, may carry the most significant risks for health and behavior outcomes (Ziesemer et al. 1994, Bassuk et al. 1997, Veale et al. in Panter-Brick and Smith 2000). It is also recognized that the public discourse on street children has detracted from the issue of widespread poverty and violence affecting children who live at home. The issues of violence (perpetrated by both adults and other street children), substance abuse, sexual exploitation, and sexually transmitted diseases are often raised but are among those most difficult to research.
The nature of appropriate interventions designed to enhance children’s own coping strategies, such as social networks with street peers and resourcefulness in obtaining money, has also been discussed extensively. A compilation of studies across different continents and topics of research is found in Mermet (1997), Ennew and Milne (1997), and dedicated web sites.
5. Conclusion
Studies of street children and research on homelessness were first concerned with describing lifestyle situations in terms of the uses of public spaces and the links with family or institutions, hinging various categories of street life upon the constructs of home, family, and a proper childhood. Twenty years on, such research is concerned with identifying the subjective and cultural interpretations of homelessness (Desjarlais 1996), with considering explicitly the context of poverty and social exclusion (Mingione 1999), and with seeking to follow career paths and identify what transitions are possible from one situation to another. At the international level, advocacy for children is focused less on the ‘street’ than on abusive work situations, sexual exploitation, or criminality, and on promoting responses that are consistent with children’s rights (Ennew 1995, Bartlett et al. 1999).
See also: Child Abuse; Childhood Sexual Abuse and Risk for Adult Psychopathology; Children and the Law; Children, Rights of: Cultural Concerns; Children, Value of; Early Childhood: Socioemotional Risks; Infant and Child Mortality in the Less Developed World; Poverty and Child Development; Street Children: Psychological Perspectives; War, Political Violence and their Psychological Effects on Children: Cultural Concerns
Bibliography
Bartlett S, Hart R et al. 1999 Cities for Children: Children’s Rights, Poverty and Urban Management. Earthscan, London
Bassuk E L, Weinreb L F et al. 1997 Determinants of behavior in homeless and low-income housed preschool children. Pediatrics 100: 92–100
Boyden J 1990 Childhood and the policy makers: A comparative perspective on the globalization of childhood. In: James A, Prout A (eds.) Constructing and Reconstructing Childhood: Contemporary Issues in the Sociological Study of Childhood. Falmer Press, London, pp. 184–215
Connolly M, Ennew J 1996 Introduction: Children out of place. Childhood: A Global Journal of Child Research 3(2): 131–45
Desjarlais R 1996 Some causes and cultures of homelessness. American Anthropologist 98(2): 420–5
Ennew J 1994 Street and Working Children: A Guide to Planning. Save the Children, London
Ennew J 1995 Outside childhood: Street children’s rights. In: Franklin B (ed.) The Handbook of Children’s Rights: Comparative Policy and Practice. Routledge, London and New York
Ennew J, Milne B 1997 Methods of Research with Street and Working Children: An Annotated Bibliography. Radda Barnen, Stockholm, Sweden
Felsman J K 1984 Abandoned children: A reconsideration. Children Today 13: 13–8
Glasser I, Bridgman R 1999 Braving the Street: The Anthropology of Homelessness. Berghahn, Oxford, UK
Glauser B 1990 Street children: Deconstructing a construct. In: James A, Prout A (eds.) Constructing and Reconstructing Childhood: Contemporary Issues in the Sociological Study of Childhood. Falmer Press, London, pp. 138–56
Hecht T 1998 At Home in the Street: Street Children of Northeast Brazil. Cambridge University Press, Cambridge, UK
Hutson S, Liddiard M 1994 Youth Homelessness: The Construction of a Social Issue. Macmillan, London
Mermet J 1997 Bibliography on Street Children. Henry Dunant Institute, Geneva
Mingione E 1999 The excluded and the homeless: The social construction of the fight against poverty in Europe. In: Mingione E (ed.) Urban Poverty and the Underclass: A Reader. Blackwell, Oxford, UK, pp. 83–104
Panter-Brick C, Smith M 2000 Abandoned Children. Cambridge University Press, Cambridge, UK
Williams C 1993 Who are ‘street children?’ A hierarchy of street use and appropriate responses. Child Abuse & Neglect 17: 831–41
Ziesemer C, Marcoux L et al. 1994 Homeless children: Are they different from other low-income children? Social Work 39(6): 658–68
C. Panter-Brick
Street Children: Psychological Perspectives
The United Nations defined street children as ‘any boy or girl … for whom the street (in the widest sense of the word, including unoccupied dwellings, wasteland, etc.) has become his or her habitual abode and/or source of livelihood; and who is inadequately protected, supervised, or directed by responsible adults’ (quoted in Lusk 1992, p. 294). The definition of street children plays a pivotal role in research and may be a source of disagreement about the results of studies (Koller and Hutz 1996). Children and adolescents who look like drifters (wear shabby, dirty clothing, beg for food or money, sell small objects, work, or wander without a purpose on the streets) can be found in large cities all over the world. The appearance of abandonment singles them out as belonging to the same group. However, their life histories, family characteristics, street life experiences, and prognoses are very different (Dallape 1996). Some researchers define street children based on characteristics such as sleeping location, family ties, school attendance, leisure, survival activities, and occupation in the street environment. Such definitions can lead to broad categorizations such as children of the streets or children in the streets (e.g., Barker and Knaul 1991, Campos et al. 1994, Forster et al. 1992). Children of the streets would be those who actually live on the street, all day and at night, who do not attend school, and do not have stable family ties. They fulfill their needs and are socialized on the streets. In contrast, children in the streets would be those who live with their families, may attend school, but spend all or part of their days on the streets, trying to earn money for themselves or their families (Hutz and Koller 1999). The relationship with the family has been considered a key feature of the definition of street children. Felsman (1985) identified three groups of street children in Colombia: (a) orphaned or abandoned children, (b) runaways, and (c) children with family ties. Leisure and occupation on the streets were added to family ties in Martins’ work (1996) to identify three different groups of street children in Brazil. He found a group of children with stable family ties who worked on the streets and went home every night. These children played in their neighborhood or on the streets where they worked and many attended school. A second group had unstable family ties. Although they lived on the streets, these children knew their families and, occasionally, went home to visit or even to stay for a while. Finally, there was a group of children who were on their own on the streets and who had lost all contact with their family. Nevertheless, it is difficult, and may even be misleading, to define a child as belonging to a specific category.
Hutz and Koller (1999) claimed that in their research they rarely found children who had completely lost contact with their family. They also identified many children who lived at home and worked on the streets, but occasionally slept on the street, and children who periodically left home and lived on the streets for weeks or months, and then went back home. The variability within these groups regarding the frequency of family contact, sleeping location, occupation on the streets, the destination of the money they earn, school attendance, and several other variables (including physical and sexual abuse, sexual activity, etc.) may be so large that the distinction between the ‘of the street’ group and the ‘in the street’ group may be meaningless or even misleading for research or intervention purposes. These authors suggested that it would be more appropriate to categorize street children as a function of the risks to which they are exposed (e.g., contact with gangs, use of drugs, dropping out of school, lack of proper parental guidance, prostitution, etc.) and the protective factors available to them (e.g., school attendance, supportive social networks, contact with caring adults, etc.). Researchers could then determine how vulnerable children are to developmental risk and what appropriate actions could be taken in each specific case.
1. Developmental Implications
Children living on the streets are still children undergoing development, despite their life conditions. They experience risks and challenges that, at the same time, may jeopardize their development and promote the acquisition of strategies for dealing with life on the streets. There is some evidence that economic pressures and emotional disturbances in the family expose children to larger risks than do the conditions of the street (Hecht 1998, Hutz and Koller 1997, Matchinda 1999). Street children often face larger risks than children in general because they are exposed to negative physical, social, and emotional factors at home and still have to deal with the challenges of life on the streets. On the other hand, there is evidence that the conditions of life on the streets lead to the development of coping strategies that are adaptive and that may help to strengthen their cognitive and social skills.
2. Social Development
Street children are usually targets of social rejection and discrimination. They have to develop their social identity and sense of belonging to a society that views them either as victims who deserve pity, or as criminals who must be taken off the streets and locked in jail. On the one hand, they are seen as victims, because they do not have shelter, clothes, food, or adult protection, have to work on the streets instead of going to school, are sexually exploited, and so on. On the other hand, they are also perceived as transgressors because they often use drugs, commit robbery, make noise, and are grouped in threatening gangs. The adult environment is usually very hostile to street children. The police often assault them, causing physical harm and humiliation. They are also harassed by street adults and gangs, which fight for space and better survival conditions. An effective strategy to survive and develop in such a hostile environment is to belong to a group on the streets. Therefore, street children will often join gangs and develop different kinds of peer relationships that lead to the development of emotional groups (appropriate to spend the night and to have fun together) and business groups (organized to dodge street life risks and fulfill their survival needs). Another strategy consists of going to social institutions for food and shelter. However, such institutions often fail to help them effectively, because they aim at taking the children off the streets, whereas the children seek them out because they perceive the institutions as part of street life, and not as a way out of it (Hecht 1998).
3. Emotional Development
Aptekar (1989, 1996) stated that street children are mentally healthier than their siblings who stayed at home. Koller and Hutz (1996) observed that these children have the ability to reorganize their lives on the streets, in spite of their risky conditions and their life histories. Most of them left home because their parents failed to provide a safe, nurturing, and affectionate environment. Many children also report sexual or physical abuse, drug use at home, and economic exploitation as reasons for leaving home (Raffaelli et al. 1995). Some evidence presented by DeSouza and collaborators (DeSouza et al. 1995) indicated that street children were not in greater psychological distress than children of low socioeconomic status who lived with their families. Koller et al. (1996) investigated the subjective wellbeing of street children and of children who lived in deep poverty, and their findings also did not show significant differences between these groups.
4. Physical Development
Life on the streets represents a constant source of risk to children and adolescents. Their safety and survival demand energy and coping strategies to confront the daily risks. Conflicts between gangs, police harassment, and physical abuse by adult street dwellers are some examples of daily violence that street children have to deal with successfully to stay safe. Cold weather, lack of food and shelter, traffic accidents, untreated injuries and illnesses, exposure to drugs, and unprotected sexual activities are also important risks to their health and physical integrity (Donald and Swart-Kruger 1994, Hecht 1998). Street children must develop adaptive strategies to survive and stay safe in spite of those risks. As previously mentioned, they form groups (emotional and business groups) that protect them from street violence and help them to survive. Also, they often find shelter and food in institutions that have rules that must be obeyed. They learn to cope with such rules, even when they do not agree with them, but their behavior becomes opportunistic and often ingenious (Donald and Swart-Kruger 1994).
5. Cognitive Development
Most street children, even those who go to school, are illiterate and have negative school experiences. Their attention span, memory, and cognitive development in general may be affected by malnutrition, drug use and intoxication, untreated illnesses, and accidental injuries. Often, they have difficulty adapting to the formal school system because it requires discipline, attention to specific tasks and schedules, planning ahead, and other routines with which they cannot deal effectively. Language, critical thinking, and intelligence also develop more slowly and may present significant deficits because street children interact mostly with their peers and have very little contact with adults. In fact, some researchers have noted that street children find it rewarding to talk to adults who will listen and speak with them in a friendly manner (Hutz and Koller 1999). Middle-class children, as a rule, are exposed daily to caring adults who talk to them, tell them stories, listen to their tales, and spend time interacting with them. Street children do not have this experience. At the same time, spatial skills and very well developed visual and auditory discrimination, for example, are required to detect, avoid, and escape street risks. Carraher et al. (1985) noted that street children who worked at the market were very capable of dealing with money and doing sophisticated calculations to figure out the price or value of products (although they failed when presented with standard school math problems). Aptekar (1989) also referred to what he calls street knowledge as an important skill for social interaction.
6. Conclusion
Children living on the streets are a social problem in many countries, a problem that has to be fought by all means available. To fight this social ill requires that individuals and groups in society take the social and political responsibility to develop effective prevention and intervention projects. Children on the streets are vulnerable to risks but they manage to develop ‘coping strategies’ that often make them resilient (Hutz et al. 1996). They behave as children when they play or interact with peers on the streets. But they must also act as adults when they have to provide for their subsistence and safety. In spite of their circumstances, street children are still developing persons who require appropriate health care, education, a nurturing home, safety, and human rights in order to grow with dignity and to become adjusted and productive citizens.
See also: Childhood: Anthropological Aspects; Children, Rights of: Cultural Concerns; Infant and Child Mortality: Central and Eastern Europe; Infant and Child Mortality in Industrialized Countries; Infant and Child Mortality in the Less Developed World; Mental Health Programs: Children and Adolescents; Poverty and Child Development; Socialization in Infancy and Childhood; Street Children: Cultural Concerns; Violence and Effects on Children; War, Political Violence and their Psychological Effects on Children: Cultural Concerns
Bibliography
Aptekar L 1989 Colombian street children: Gamines or chupagruesos. Adolescence 24: 783–94
Aptekar L 1996 Crianças de rua nos países em desenvolvimento: Uma revisão de suas condições [Street children of developing countries: A review of their conditions]. Psicologia Reflexão e Crítica 9: 153–85
Barker G, Knaul F 1991 Exploited Entrepreneurs: Street and Working Children in Developing Countries. Working Paper 1, Childhope, New York
Campos R, Raffaelli M, Ude W, Greco M, Ruff A, Rolf J, Antunes C M, Halsey N, Greco D 1994 Social networks and daily activities of street youth in Belo Horizonte, Brazil. Child Development 65: 319–30
Carraher T N, Carraher D, Schliemann A 1985 Mathematics in the streets and in the schools. British Journal of Developmental Psychology 3: 21–9
Dallape F 1996 Urban children: A challenge and an opportunity. Childhood 3: 283–94
DeSouza E, Koller S H, Hutz C S, Forster L M 1995 Preventing depression among Brazilian street children. InterAmerican Journal of Psychology 29: 261–5
Donald D, Swart-Kruger J 1994 The South-African street child: Developmental implications. South-African Journal of Psychology 24: 169–74
Felsman J K 1985 Abandoned children reconsidered: Prevention, social policy, and the trouble with sympathy. ERIC Document ED268457
Forster L M, Barros H T, Tannhauser S L, Tannhauser M 1992 Meninos de rua: Relação entre abuso de drogas e atividades ilícitas [Street children: The relationship between drug use and illicit activities]. Revista ABP-APAL 14: 115–20
Hecht T 1998 At Home in the Street. Cambridge University Press, Cambridge, UK
Hutz C S, Koller S H 1997 Questões sobre o desenvolvimento de crianças em situação de rua [Issues regarding the development of street children]. Estudos de Psicologia 2: 175–97
Hutz C S, Koller S H 1999 Methodological issues in the study of street children. In: Raffaelli M, Larson R (Volume eds.) Damon W (Series ed.) Homeless and Working Street Youth Around the World: Exploring Developmental Issues. New Directions in Child Development. Jossey Bass, San Francisco, Vol 85, pp. 59–70
Hutz C S, Koller S H, Bandeira D R 1996 Resiliência e vulnerabilidade em crianças em situação de risco [Resilience and vulnerability in children at risk]. Coletâneas da ANPEPP 1(12): 79–86
Koller S H, Hutz C S 1996 Meninos e meninas em situação de rua: Dinâmica, diversidade e definição [Boys and girls on the streets: Dynamics, diversity, and definition]. Coletâneas da ANPEPP 1(12): 11–34
Koller S H, Hutz C S, Silva M 1996 June. Subjective well-being of Brazilian street children. Paper presented at XXVI International Congress of Psychology, Montreal, Canada
Lusk M 1992 Street children of Rio de Janeiro. International Social Work 35: 293–305
Martins R A 1996 Censo de crianças e adolescentes em situação de rua em São José do Rio Preto [A census of street children and adolescents in São José do Rio Preto, Brazil]. Psicologia Reflexão e Crítica 9(1): 101–22
Matchinda B 1999 The impact of home background on the decision of children to run away: The case of Yaounde city street children in Cameroon. Child Abuse and Neglect 23: 245–55
Raffaelli M, Siqueira E, Payne-Merrit A, Campos R, Ude W, Greco M, Greco D, Ruff A, Halsey N (The Street Youth Study Group) 1995 HIV-related knowledge and risk behaviors of street youth in Belo Horizonte, Brazil. AIDS Education and Prevention 7(4): 287–97
S. H. Koller and C. S. Hutz
Stress and Cardiac Response The origin of the concept of stress in health-related disciplines is closely linked to the idea that organisms react physiologically to the presence of danger or threat. This reactivity has a protective function since it provides the logistic and instrumental basis for adaptive behaviors such as fight or flight. However, if maintained for long periods, this protective or defensive reactivity may become a health risk, compromising the normal functioning of the organs involved. For many scientists the excessive physiological reactivity associated with stress is the main mechanism linking stress to illness.
1. Cardiac Response The heart is a vital organ that reacts rapidly to physical and psychological demands. This reactivity is controlled by the central nervous system through neural and humoral pathways. 1.1 Neural Pathways The neural pathway to the heart involves the two branches of the autonomic nervous system: the sympathetic and the parasympathetic. When the sympathetic nerves are activated, the heart rate accelerates and the heart contraction becomes stronger, resulting in a greater blood volume discharge and greater blood pressure in the vessels. When the parasympathetic nerves are activated, the heart rate decelerates, reducing the blood volume discharge and the blood pressure. The two branches can act reciprocally (one activated and the other inhibited), working in the same direction (increasing or decreasing the heart activity), or nonreciprocally (both activated or inhibited simultaneously), competing against each other for the final cardiac outcome. 1.2 Humoral Pathways The humoral pathway involves the neural activation of the endocrine system, resulting in the secretion by
the adrenal glands of various hormones into the bloodstream. One of these hormones, adrenaline, when it reaches the heart, has the same effect as the sympathetic nerves: it increases the heart rate and the stroke volume. Other adrenal hormones, such as cortisol, facilitate the transfer of lipids and glucose into the bloodstream and also have an important indirect effect on the heart and on the arteries that supply blood to the heart muscle: the coronaries. The humoral effects on the heart are slower than the neural effects but they last longer, contributing to maintaining cardiac reactivity during prolonged periods.
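The reciprocal and nonreciprocal modes of autonomic control described under Neural Pathways can be summarized as a small decision rule. The Python sketch below is illustrative only: the function name autonomic_mode and the coding of activation as +1 and inhibition as -1 are assumptions made for this example, not part of any physiological model given in the text.

```python
def autonomic_mode(sympathetic, parasympathetic):
    """Classify the joint state of the two autonomic branches.

    Each argument is +1 (branch activated) or -1 (branch inhibited).
    This coding is a hypothetical simplification for illustration.
    """
    if sympathetic not in (1, -1) or parasympathetic not in (1, -1):
        raise ValueError("each branch must be coded as +1 or -1")

    if sympathetic != parasympathetic:
        # One branch activated and the other inhibited: both effects
        # push cardiac activity in the same direction.
        if sympathetic == 1:
            return ("reciprocal", "cardiac activity increases")
        return ("reciprocal", "cardiac activity decreases")

    # Both branches activated or both inhibited: their effects compete,
    # so the net cardiac outcome is not determined by this rule alone.
    return ("nonreciprocal", "branches compete for the final outcome")


if __name__ == "__main__":
    for s, p in [(1, -1), (-1, 1), (1, 1), (-1, -1)]:
        print(s, p, autonomic_mode(s, p))
```

The rule deliberately returns no direction for the nonreciprocal case, since the text states only that the two branches compete for the final cardiac outcome.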
1.3 Cardiac–Respiratory Interactions The cardiovascular and the respiratory systems are both responsible for maintaining the continuous exchange of gases, oxygen and carbon dioxide, between the atmosphere and the tissues. The cardiovascular system reacts to changes in the concentration of these gases in the blood. Therefore, changes in the respiratory pattern that alter gas concentration produce changes in blood flow and cardiac activity in order to restore the appropriate gas concentration. Two examples of cardiac reactivity due to changes in respiration are the hyperventilation and respiratory sinus arrhythmia phenomena. 1.3.1 Hyperventilation. Hyperventilation is caused by rapid respiratory patterns provoking reduced carbon dioxide concentration in the blood. The cardiovascular response is a reduced blood flow to the brain and heart arteries, and an increase in cardiac activity due to parasympathetic inhibition. 1.3.2 Respiratory sinus arrhythmia. In contrast, respiratory sinus arrhythmia is caused by slow and relaxed respiratory patterns. The consequence is a rhythmic deceleration of the heart rate, due to parasympathetic activation, coinciding with each expiratory cycle.
2. Stress and Cardiac Defense Cardiac defense can be defined as the cardiac response to perceived danger or threat. The concept of defense has a long tradition in physiology and psychology and it is in this context that the concept of stress was first introduced (see Stress and Coping Theories). 2.1 Historical Antecedents At the beginning of the twentieth century, Ivan Pavlov and other Russian reflexologists used the term defense reflex to refer to protective physiological responses elicited by noxious stimulation, such as hand withdrawal from an electric shock, an eyeblink to a puff of air, or vomiting in response to bad food. A few years later, Walter Cannon used the term defense to refer to the ‘fight or flight’ response, a sympathetically mediated cardiovascular response to emergency situations aimed at providing energy supply to the body to facilitate adaptive behaviors such as attack (fight) or escape (flight). By the middle of the century and following Cannon’s ideas, Hans Selye had introduced the concept of stress and used the term ‘alarm’ response to refer to the first stage of the physiological response to stressful situations.
2.2 Classical Approaches to Cardiac Defense There have been two major approaches to the study of cardiac defense: (a) the cognitive or sensory processing approach built on Pavlov’s distinction between orienting and defense reflexes, and (b) the motivational or emotional approach built on Cannon’s ideas on the fight or flight response. 2.2.1 The cognitive approach. The cognitive approach assumes that cardiac changes to environmental stimuli reflect attentional and perceptual mechanisms aimed at facilitating or inhibiting the processing of the stimuli (Sokolov 1963, Graham 1979). The orienting response (a deceleration of the heart rate to moderate or novel stimulation) facilitates attention and perception of the stimuli, whereas the defense response (an acceleration of the heart rate to intense or aversive stimulation) reduces attention and perception as a form of protection against the stimuli. To study cardiac defense, this approach has developed tasks that present discrete sensory stimuli, such as noise, light, pictures, or words, manipulating their physical features (intensity, duration, and rise time) or their affective content (pleasant vs. unpleasant pictures or words). According to this approach, cardiac defense is characterized by a heart-rate accelerative response, peaking between 3 and 6 s after stimulus onset, which habituates slowly with stimulus repetition. 2.2.2 The motivational approach. The motivational approach assumes that cardiac changes in response to environmental demands reflect metabolic mechanisms aimed at providing the necessary energy to the body to support behavioral adjustments (Obrist 1981). If the appropriate behavior is to be quiet and relaxed, then the cardiac response will be a heart-rate deceleration. If the appropriate behavior is to be active, either physically or psychologically, then the cardiac response will be a heart-rate acceleration.
To study cardiac defense, this approach has developed tasks that involve emotionally or cognitively challenging situations maintained for several minutes, such as keeping the hand in cold water, pressing a key as fast as possible to avoid shocks, or doing complex arithmetic tasks. The term ‘mental stress’ was introduced in this context to refer to tasks that require ‘mental effort’ instead of ‘physical effort.’ Under such conditions cardiac defense is characterized by a heart-rate acceleration maintained for the duration of the task, accompanied by reduced cardiac variability linked to hyperventilation and loss of respiratory sinus arrhythmia.
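The pattern described for mental-stress tasks, a sustained heart-rate acceleration accompanied by reduced cardiac variability, can be illustrated with a short calculation. The Python sketch below is not taken from the studies cited. It assumes that cardiac activity is recorded as inter-beat intervals in milliseconds, it uses RMSSD (the root mean square of successive differences) as one commonly used variability index, and the sample values are invented purely for illustration.

```python
from statistics import mean


def heart_rate_bpm(ibi_ms):
    """Mean heart rate in beats per minute from inter-beat intervals (ms)."""
    return 60000.0 / mean(ibi_ms)


def rmssd_ms(ibi_ms):
    """Root mean square of successive inter-beat interval differences (ms)."""
    diffs = [later - earlier for earlier, later in zip(ibi_ms, ibi_ms[1:])]
    if not diffs:
        raise ValueError("need at least two inter-beat intervals")
    return (sum(d * d for d in diffs) / len(diffs)) ** 0.5


# Hypothetical recordings: a resting baseline and a demanding task period.
baseline_ibi = [850, 900, 820, 910, 840, 890]
task_ibi = [650, 655, 648, 652, 650, 654]

# Change scores relative to baseline, as is usual for reactivity measures.
hr_change = heart_rate_bpm(task_ibi) - heart_rate_bpm(baseline_ibi)
variability_change = rmssd_ms(task_ibi) - rmssd_ms(baseline_ibi)

if __name__ == "__main__":
    print(f"heart-rate change: {hr_change:+.1f} bpm")
    print(f"RMSSD change: {variability_change:+.1f} ms")
```

With these invented values the heart-rate change is positive and the variability change is negative, which is the direction of effect the motivational approach describes. Other variability indices could be substituted for RMSSD without changing the logic.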
3. A New Approach to Cardiac Defense The cognitive and motivational approaches to cardiac defense have been difficult to reconcile in the past. However, new data have been accumulated on defense responses, both in animals and humans, and on the neurophysiological brain circuits controlling such responses, which not only facilitate integration of the classic approaches but also have potential implications for the concept of stress and its relation to cardiovascular illness.
3.1 Natural Defense In natural settings, such as the imminence of a predator, the defense reaction is not a unitary response that can be conceptualized as either cognitive or motivational. Rather, the defense reaction follows a dynamic sequential process with initial phases in which attentional factors predominate, aimed at the detection and analysis of the potential danger, and, if the process continues, later phases in which behavioral factors predominate, aimed at active defense such as fight or flight. This sequential process can be divided into three stages: pre-encounter, post-encounter, and circa-strike. Depending on the spatial and temporal proximity of the predator, different components of the defense reaction may take place successively.
3.2 Protective Reflexes Motor startle and cardiac defense are two protective reflexes that have been extensively studied in recent years in relation to both cognitive and emotional modulatory factors. Startle is a whole-body motor flexor response, including an eyeblink, that is elicited by abrupt stimulation such as a brief, intense noise or light. Startle is augmented when the organism is in an aversive emotional state and inhibited when the organism is in a positive emotional state. Similarly, startle is potentiated in the first stages of the natural defense reaction when attentional factors predominate, and inhibited during the later phases when behavioral factors predominate (Lang 1995). On the other hand, studies on the cardiac components of the defense response to unexpected noises or shocks have questioned the classical models on response characteristics and their functional significance (Vila 1995, Vila et al. 1997). The cardiac response to an unexpected noise is not only a heart-rate acceleration. The response is a complex pattern of heart-rate changes, observed throughout the 80 s after stimulus onset, with two large accelerative and two smaller decelerative components in alternating order (see Fig. 1). The evocation of the response has both parasympathetic (first acceleration and deceleration) and sympathetic (second acceleration and deceleration) influences. As regards its psychological significance, contrary to the classical cognitive model, cardiac defense is accompanied by an increase in sensory processing (sensory intake). On the other hand, emotional factors are also involved, producing modulatory effects similar to those in motor startle. For example, phobic subjects presented with aversive noise while viewing pictures of their phobic object show a clear augmentation of the two accelerative components of the response.
Figure 1 Cardiac defense response to an intense noise of half a second duration. The horizontal axis represents 80 s after noise onset. The vertical axis represents the cardiac changes in beats per minute (bpm) from the baseline
3.3 The Brain’s Defense System The above data support a description and interpretation of cardiac defense as a sequence of heart-rate changes with both accelerative and decelerative components, with both parasympathetic and sympathetic
influences, and with both cognitive (attentional) and emotional (energetic) significance. The structural foundation of this response is in the brain’s aversive motive system. What is known about this system comes from animal research, particularly from studies on defense reactions and fear conditioning (see LeDoux 1996 and Lang et al. 1997 for a comprehensive review). Neuroscientists have traced the chain of neural activation in the brain, starting from the sensory input, proceeding through the subcortical and cortical connecting structures, and finishing in the autonomic and motor effectors. A critical structure in this circuit is the amygdala, a subcortical center located within the temporal lobes that receives inputs from the sensory thalamus and various cortical structures. Downstream from the amygdala, the defense circuit branches into different structures, controlling separate forms of defense. This hierarchical organization, together with the learned plasticity in the anatomy and functioning of the brain’s structures, provides considerable variety in the pattern of defensive output. Depending on the situational context and on the weight of the local pathways within the circuit, the aversively motivated responses can be escape or avoidance, aggressive behavior, hormonal and autonomic responses, or the augmentation of motor startle.
4. Conclusions: Implications for Stress and Cardiovascular Disease The classical models of defense did not emphasize the dynamic character of the defense response, which is obvious in natural settings, or the simultaneous involvement of different psychological processes, cognitive and motivational. The new model matches better the prevalent model of stress in health-related disciplines: the cognitive-transactional model (see Stress and Coping Theories). The continuous dynamic interaction between the situation and the person, leading to activation or deactivation of the stress response, is the main feature of the transactional model. In addition, the new approach to defense can help to understand the specific nature of stress, differentiating it from other related concepts such as anxiety or arousal. Based on the new defense model, stress can be defined as the sustained activation of the brain’s defense motivational system. A state of maintained activation of this system implies a variety of specific defense responses being continuously elicited, including neural and humoral cardiovascular responses that correlate with poor cardiovascular health: maintained increases in heart rate, stroke volume and blood pressure, and maintained decreases in respiratory sinus arrhythmia and cardiac variability. These sustained responses are instrumental for the development of cardiovascular diseases such as essential hypertension and coronary heart disease.
See also: Cardiovascular Conditioning: Neural Substrates; Coronary Heart Disease (CHD), Coping with; Coronary Heart Disease (CHD): Psychosocial Aspects; Hypertension: Psychosocial Aspects; Stress and Health Research; Stress, Neural Basis of; Stress: Psychological Perspectives
Bibliography
Graham F K 1979 Distinguishing among orienting, defense and startle reflexes. In: Kimmel H D, van Olst E H, Orlebeke F J (eds.) The Orienting Reflex in Humans. Erlbaum, Hillsdale, NJ, pp. 137–67
Lang P J 1995 The emotion probe: Studies of motivation and attention. American Psychologist 50: 372–85
Lang P J, Bradley M M, Cuthbert B N 1997 Motivated attention: Affect, activation, and action. In: Lang P J, Simons R F, Balaban M T (eds.) Attention and Orienting. Erlbaum, Mahwah, NJ, pp. 97–135
LeDoux J 1996 The Emotional Brain. Pergamon, New York
Obrist P A 1981 Cardiovascular Psychophysiology: A Perspective. Simon & Schuster, New York
Sokolov E N 1963 Perception and the Conditioned Reflex. Pergamon, Oxford, UK
Vila J 1995 Cardiac psychophysiology and health. In: Rodríguez-Marín J (ed.) Health Psychology and Quality of Life Research. University of Alicante Press, Alicante, Spain, pp. 520–40
Vila J, Pérez M N, Fernández M D, Pegalajar J, Sánchez M 1997 Attentional modulation of the cardiac defense response in humans. Psychophysiology 34: 482–7
J. Vila
Stress and Coping Theories For the last five decades the term stress has enjoyed increasing popularity in the behavioral and health sciences. It was first used in physics to analyze the problem of how man-made structures must be designed to carry heavy loads and resist deformation by external forces. In this analysis, stress referred to external pressure or force applied to a structure, while strain denoted the resulting internal distortion of the object (for the term’s history, cf. Hinkle 1974, Mason 1975a, 1975c). In the transition from physics to the behavioral sciences, the usage of the term stress changed. In most approaches it now designates bodily processes created by circumstances that place physical or psychological demands on an individual (Selye 1976). The external forces that impinge on the body are called stressors (McGrath 1982).
1. Theories of Stress Theories that focus on the specific relationship between external demands (stressors) and bodily processes (stress) can be grouped in two different categories: approaches to ‘systemic stress’ based in physiology and psychobiology (among others, Selye 1976) and approaches to ‘psychological stress’ developed within the field of cognitive psychology (Lazarus 1966, 1991, Lazarus and Folkman 1984, McGrath 1982).
1.1 Systemic Stress: Selye’s Theory The popularity of the stress concept in science and mass media stems largely from the work of the endocrinologist Hans Selye. In a series of animal studies he observed that a variety of stimulus events (e.g., heat, cold, toxic agents) applied intensely and long enough are capable of producing common effects, that is, effects not specific to any one stimulus event. (Besides these nonspecific changes in the body, each stimulus produces, of course, its specific effect: heat, for example, produces vasodilatation, and cold vasoconstriction.) According to Selye, these nonspecifically caused changes constitute the stereotypical, i.e., specific, response pattern of systemic stress. Selye (1976, p. 64) defines this stress as ‘a state manifested by a syndrome which consists of all the nonspecifically induced changes in a biologic system.’ This stereotypical response pattern, called the ‘General Adaptation Syndrome’ (GAS), proceeds in three stages. (a) The alarm reaction comprises an initial shock phase and a subsequent countershock phase. The shock phase exhibits autonomic excitability, an increased adrenaline discharge, and gastro-intestinal ulcerations. The countershock phase marks the initial operation of defensive processes and is characterized by increased adrenocortical activity. (b) If noxious stimulation continues, the organism enters the stage of resistance. In this stage, the symptoms of the alarm reaction disappear, which seemingly indicates the organism’s adaptation to the stressor. However, while resistance to the noxious stimulation increases, resistance to other kinds of stressors decreases at the same time. (c) If the aversive stimulation persists, resistance gives way to the stage of exhaustion. The organism’s capability of adapting to the stressor is exhausted, the symptoms of stage (a) reappear, but resistance is no longer possible. Irreversible tissue damage appears, and, if the stimulation persists, the organism dies.
Although Selye’s work influenced a whole generation of stress researchers, marked weaknesses in his theory soon became obvious. First of all, Selye’s conception of stress as a reaction to a multitude of different events had the fatal consequence that the stress concept became the melting pot for all kinds of approaches. Thus, by becoming a synonym for diverse terms such as, for example, anxiety, threat, conflict, or emotional arousal, the concept of stress was in danger of losing its scientific value (cf. Engel 1985). Besides this general reservation, specific critical issues have been raised. One criticism was directed at the theory’s core assumption of a nonspecific causation of the GAS. Mason (1971, 1975b) pointed out that the stressors observed as effective by Selye carried a common emotional meaning: they were novel, strange, and unfamiliar to the animal. Thus, the animal’s state could be described in terms of helplessness, uncertainty, and lack of control. Consequently, the hormonal GAS responses followed the (specific) emotional impact of such influences rather than the influences as such. In accordance with this assumption, Mason (1975b) demonstrated that in experiments where uncertainty had been eliminated no GAS was observed. This criticism led to a second, more profound argument: unlike the physiological stress investigated by Selye, the stress experienced by humans is almost always the result of a cognitive mediation (cf. Arnold 1960, Janis 1958, Lazarus 1966, 1974). Selye, however, fails to specify those mechanisms that may explain the cognitive transformation of ‘objective’ noxious events into the subjective experience of being distressed. In addition, Selye does not take into account coping mechanisms as important mediators of the stress–outcome relationship. Both topics are central to psychological stress theories as, for example, elaborated by the Lazarus group. A derivative of the systemic approach is the research on critical life events. An example is the influential hypothesis of Holmes and Rahe (1967), based on Selye’s work, that changes in habits, rather than the threat or meaning of critical events, are involved in the genesis of disease. The authors assumed that critical life events, regardless of their specific (e.g., positive or negative) quality, stimulate change that produces challenge to the organism. Most of this research, however, has not been theoretically driven and has yielded little empirical support for this hypothesis (for a critical evaluation, see Thoits 1983).
1.2 Psychological Stress: The Lazarus Theory Two concepts are central to any psychological stress theory: appraisal, i.e., individuals’ evaluation of the significance of what is happening for their well-being, and coping, i.e., individuals’ efforts in thought and action to manage specific demands (cf. Lazarus 1993). Since its first presentation as a comprehensive theory (Lazarus 1966), the Lazarus stress theory has undergone several essential revisions (cf. Lazarus 1991, Lazarus and Folkman 1984, Lazarus and Launier 1978). In the latest version (see Lazarus 1991), stress is regarded as a relational concept, i.e., stress is not defined as a specific kind of external stimulation nor a specific pattern of physiological, behavioral, or subjective reactions. Instead, stress is viewed as a relationship (‘transaction’) between individuals and their environment. ‘Psychological stress refers to a relationship with the environment that the person appraises as significant for his or her well being
and in which the demands tax or exceed available coping resources’ (Lazarus and Folkman 1986, p. 63). This definition points to two processes as central mediators within the person–environment transaction: cognitive appraisal and coping. The concept of appraisal, introduced into emotion research by Arnold (1960) and elaborated with respect to stress processes by Lazarus (1966, Lazarus and Launier 1978), is a key factor for understanding stress-relevant transactions. This concept is based on the idea that emotional processes (including stress) are dependent on actual expectancies that persons manifest with regard to the significance and outcome of a specific encounter. This concept is necessary to explain individual differences in quality, intensity, and duration of an elicited emotion in environments that are objectively equal for different individuals. It is generally assumed that the resulting state is generated, maintained, and eventually altered by a specific pattern of appraisals. These appraisals, in turn, are determined by a number of personal and situational factors. The most important factors on the personal side are motivational dispositions, goals, values, and generalized expectancies. Relevant situational parameters are predictability, controllability, and imminence of a potentially stressful event. In his monograph on emotion and adaptation, Lazarus (1991) developed a comprehensive emotion theory that also includes a stress theory (cf. Lazarus 1993). This theory distinguishes two basic forms of appraisal, primary and secondary appraisal (see also Lazarus 1966). These forms rely on different sources of information. Primary appraisal concerns whether something of relevance to the individual’s well being occurs, whereas secondary appraisal concerns coping options. Within primary appraisal, three components are distinguished: goal relevance describes the extent to which an encounter refers to issues about which the person cares.
Goal congruence defines the extent to which an episode proceeds in accordance with personal goals. Type of ego-involvement designates aspects of personal commitment such as self-esteem, moral values, ego-ideal, or ego-identity. Likewise, three secondary appraisal components are distinguished: blame or credit results from an individual’s appraisal of who is responsible for a certain event. By coping potential Lazarus means a person’s evaluation of the prospects for generating certain behavioral or cognitive operations that will positively influence a personally relevant encounter. Future expectations refer to the appraisal of the further course of an encounter with respect to goal congruence or incongruence. Specific patterns of primary and secondary appraisal lead to different kinds of stress. Three types are distinguished: harm, threat, and challenge (Lazarus and Folkman 1984). Harm refers to the (psychological) damage or loss that has already happened. Threat is the anticipation of harm that may be
imminent. Challenge results from demands that a person feels confident about mastering. These different kinds of psychological stress are embedded in specific types of emotional reactions, thus illustrating the close conjunction of the fields of stress and emotions. Lazarus (1991) distinguishes 15 basic emotions. Nine of these are negative (anger, fright, anxiety, guilt, shame, sadness, envy, jealousy, and disgust), whereas four are positive (happiness, pride, relief, and love). (Two more emotions, hope and compassion, have a mixed valence.) At a molecular level of analysis, the anxiety reaction, for example, is based on the following pattern of primary and secondary appraisals: there must be some goal relevance to the encounter. Furthermore, goal incongruence is high, i.e., personal goals are thwarted. Finally, ego-involvement concentrates on the protection of personal meaning or ego-identity against existential threats. At a more molar level, specific appraisal patterns related to stress or distinct emotional reactions are described as core relational themes. The theme of anxiety, for example, is the confrontation with uncertainty and existential threat. The core relational theme of relief, however, is ‘a distressing goal-incongruent condition that has changed for the better or gone away’ (Lazarus 1991). Coping is intimately related to the concept of cognitive appraisal and, hence, to the stress-relevant person–environment transactions. Most approaches in coping research follow Folkman and Lazarus (1980, p. 223), who define coping as ‘the cognitive and behavioral efforts made to master, tolerate, or reduce external and internal demands and conflicts among them.’ This definition contains the following implications. (a) Coping actions are not classified according to their effects (e.g., as reality-distorting), but according to certain characteristics of the coping process. (b) This process encompasses behavioral as well as cognitive reactions in the individual.
(c) In most cases, coping consists of different single acts and is organized sequentially, forming a coping episode. In this sense, coping is often characterized by the simultaneous occurrence of different action sequences and, hence, an interconnection of coping episodes. (d) Coping actions can be distinguished by their focus on different elements of a stressful encounter (cf. Lazarus and Folkman 1984). They can attempt to change the person–environment realities behind negative emotions or stress (problem-focused coping). They can also relate to internal elements and try to reduce a negative emotional state, or change the appraisal of the demanding situation (emotion-focused coping).
1.3 Resource Theories of Stress: A Bridge between Systemic and Cognitive Viewpoints Unlike approaches discussed so far, resource theories of stress are not primarily concerned with factors that create stress, but with resources that preserve well being in the face of stressful encounters. Several social and personal constructs have been proposed, such as social support (Schwarzer and Leppin 1991), sense of coherence (Antonovsky 1979), hardiness (Kobasa 1979), self-efficacy (Bandura 1977), or optimism (Scheier and Carver 1992). Whereas self-efficacy and optimism are single protective factors, hardiness and sense of coherence represent tripartite approaches. Hardiness is an amalgam of three components: internal control, commitment, and a sense of challenge as opposed to threat. Similarly, sense of coherence consists of believing that the world is meaningful, predictable, and basically benevolent. Within the social support field, several types have been investigated, such as instrumental, informational, appraisal, and emotional support. The recently offered conservation of resources (COR) theory (Hobfoll 1989, Hobfoll et al. 1996) assumes that stress occurs in any of three contexts: when people experience loss of resources, when resources are threatened, or when people invest their resources without subsequent gain. Four categories of resources are proposed: object resources (i.e., physical objects such as home, clothing, or access to transportation), condition resources (e.g., employment, personal relationships), personal resources (e.g., skills or self-efficacy), and energy resources (means that facilitate the attainment of other resources, for example, money, credit, or knowledge). Hobfoll and co-workers outlined a number of testable hypotheses (called principles) derived from basic assumptions of COR (cf. Hobfoll et al. 1996). (a) Loss of resources is the primary source of stress. This principle contradicts the fundamental assumption of approaches on critical life events (cf.
Holmes and Rahe 1967) that stress occurs whenever individuals are forced to readjust themselves to situational circumstances, may these circumstances be positive (e.g., marriage) or negative (e.g., loss of a beloved person). In an empirical test of this basic principle, Hobfoll and Lilly (1993) found that only loss of resources was related to distress. (b) Resources act to preserve and protect other resources. Self-esteem is an important resource that may be beneficial for other resources. Hobfoll and Leiberman (1987), for example, observed that women who were high in self-esteem made good use of social support when confronted with stress, whereas those who lacked self-esteem interpreted social support as an indication of personal inadequacy and, consequently, misused support. (c) Following stressful circumstances, individuals have an increasingly depleted resource pool to combat further stress. This depletion impairs individuals’ capability of coping with further stress, thus resulting in a loss spiral. This process view of resource investment requires to focus on how the interplay between resources and situational demands changes 15166
over time as stressor sequences unfold. In addition, this principle shows that it is important to investigate not only the effect of resources on outcome, but also of outcome on resources.
2. Coping Theories
2.1 Classification of Approaches
The Lazarus model outlined above represents a specific type of coping theory. These theories may be classified according to two independent parameters: (a) trait-oriented versus state-oriented, and (b) microanalytic versus macroanalytic approaches (cf. Krohne 1996). Trait-oriented and state-oriented research strategies have different objectives: The trait-oriented (or dispositional) strategy aims at early identification of individuals whose coping resources and tendencies are inadequate for the demands of a specific stressful encounter. An early identification of these persons will offer the opportunity for establishing a selection (or placement) procedure or a successful primary prevention program. Research that is state-oriented, i.e., which centers on actual coping, has a more general objective. This research investigates the relationships between coping strategies employed by an individual and outcome variables such as self-reported or objectively registered coping efficiency, emotional reactions accompanying and following certain coping efforts, or variables of adaptational outcome (e.g., health status or test performance). This research strategy intends to lay the foundation for a general modificatory program to improve coping efficacy. Microanalytic approaches focus on a large number of specific coping strategies, whereas macroanalytic analysis operates at a higher level of abstraction, thus concentrating on more fundamental constructs. S. Freud’s (1926) ‘classic’ defense mechanisms conception is an example of a state-oriented, macroanalytic approach. Although Freud distinguished a multitude of defense mechanisms, in the end, he related these mechanisms to two basic forms: repression and intellectualization (see also A. Freud 1936). The trait-oriented counterpart of these basic defenses is the personality dimension repression–sensitization (Byrne 1964, Eriksen 1966).
The distinction between the two basic functions of emotion-focused and problem-focused coping proposed by Lazarus and Folkman (1984) represents another macroanalytic state approach. In its actual research strategy, however, the Lazarus group extended this macroanalytic approach to a microanalytic strategy. In their ‘Ways of Coping Questionnaire’ (WOCQ; cf. Folkman and Lazarus 1988, Lazarus 1991), Lazarus and co-workers distinguish eight groups of coping strategies: confrontative coping, distancing, self-controlling, seeking social support, accepting responsibility, escape-avoidance, planful problem-solving, and positive reappraisal. The problem with this conception and, as a consequence, the measurement of coping is that these categories are only loosely related to the two basic coping functions. Unlike the macroanalytic, trait-oriented approach that generated a multitude of theoretical conceptions, the microanalytic, trait-oriented strategy is mostly concerned with constructing multidimensional inventories (overviews in Schwarzer and Schwarzer 1996). Almost all of these measurement approaches, however, lack a solid theoretical foundation (cf. Krohne 1996).
2.2 Macroanalytic, Trait-Oriented Coping Theories
Research on the processes by which individuals cope with stressful situations has grown substantially over the past three decades (cf. Lazarus 1991, Zeidner and Endler 1996). Many trait-oriented approaches in this field have established two constructs central to an understanding of cognitive responses to stress: vigilance, that is, the orientation toward stressful aspects of an encounter, and cognitive avoidance, that is, averting attention from stress-related information (cf. Janis 1983, Krohne 1978, 1993, Roth and Cohen 1986). Approaches corresponding to these conceptions are repression–sensitization (Byrne 1964), monitoring-blunting (Miller 1980, 1987), or attention-rejection (Mullen and Suls 1982). With regard to the relationship between these two constructs, Byrne’s approach specifies a unidimensional, bipolar structure, while Miller as well as Mullen and Suls leave this question open. Krohne, however, explicitly postulates an independent functioning of the dimensions vigilance and cognitive avoidance.
2.2.1 Repression–sensitization. The repression–sensitization construct (cf. Byrne 1964, Eriksen 1966) relates different forms of dispositional coping to one bipolar dimension.
When confronted with a stressful encounter, persons located at one pole of this dimension (repressers) tend to deny or minimize the existence of stress, fail to verbalize feelings of distress, and avoid thinking about possible negative consequences of this encounter. Persons at the opposite pole (sensitizers) react to stress-related cues by way of enhanced information search, rumination, and obsessive worrying. The concept of repression–sensitization is theoretically founded in research on perceptual defense (Bruner and Postman 1947), an approach that combined psychodynamic ideas with the functionalistic behavior analysis of Brunswik (1947).
2.2.2 Monitoring and blunting. The conception of monitoring and blunting (Miller 1980, 1987) originated from the same basic assumptions formulated earlier by Eriksen (1966) for the repression–sensitization construct. Miller conceived both constructs as cognitive informational styles and proposed that individuals who encounter a stressful situation react with arousal according to the amount of attention they direct to the stressor. Conversely, the arousal level can be lowered if the person succeeds in reducing the impact of aversive cues by employing avoidant cognitive strategies such as distraction, denial, or reinterpretation. However, these coping strategies, called blunting, should only be adaptive if the aversive event is uncontrollable. Examples of uncontrollable events are impending surgery or an aversive medical examination (Miller and Mangan 1983). If control is available, strategies called monitoring, i.e., seeking information about the stressor, are the more adaptive forms of coping. Although initially these strategies are associated with increased stress reactions, they enable the individual to gain control over the stressor in the long run, thus reducing the impact of the stressful situation. An example of a more controllable stressor is preparing for an academic exam. The general relationship between a stressor’s degree of controllability and the employment of monitoring or blunting strategies can be moderated by situational and personal influences. With regard to situation, the noxious stimulation may be so intense that blunting strategies, such as attentional diversion, are ineffective in reducing stress-related arousal. Concerning personality, there are relatively stable individual differences in the inclination to employ blunting or monitoring coping when encountering a stressor.
2.2.3 The model of coping modes. Similar to Miller’s monitoring-blunting conception, the model of coping modes (MCM) deals with individual differences in attention orientation and emotional-behavioral regulation under stressful conditions (Krohne 1993).
The MCM extends the (largely descriptive) monitoring-blunting conception (as well as the repression–sensitization approach) in that it relates the dimensions vigilance and cognitive avoidance to an explicative cognitive-motivational basis. It assumes that most stressful, especially anxiety-evoking, situations are characterized by two central features: the presence of aversive stimulation and a high degree of ambiguity. The experiential counterparts of these situational features are emotional arousal (as being primarily related to aversive stimulation) and uncertainty (related to ambiguity). Arousal, in turn, should stimulate the tendency to cognitively avoid (or inhibit) the further processing of cues related to the aversive encounter, whereas uncertainty activates vigilant tendencies. These two coping processes are conceptually linked to personality by the hypothesis that the habitual preference for avoidant or vigilant coping strategies reflects individual differences in the susceptibility to emotional arousal or uncertainty. Individuals who are especially susceptible to states of stress-induced emotional arousal are supposed to habitually employ cognitive avoidance. The employment of avoidant strategies primarily aims at shielding the person from an increase in arousal (arousal-motivated coping behavior). Individuals who are especially affected by the uncertainty experienced in most stressful situations are supposed to habitually employ vigilant coping. Thus, the employment of vigilant strategies follows a plan that is aimed at minimizing the probability of unanticipated occurrence of aversive events (uncertainty-motivated coping behavior).
The MCM conceives the habitual coping tendencies of vigilance and cognitive avoidance as independent personality dimensions. That is, aggregated across a multitude of stressful encounters, the employment of vigilant strategies and that of avoidant ones do not preclude each other. Thus, four coping modes can be defined.
(a) Persons who score high on vigilance and low on cognitive avoidance are called sensitizers. These persons are primarily concerned with reducing uncertainty by directing their attention towards stress-relevant information.
(b) Individuals with the opposite pattern are designated as repressers. These persons minimize the experience of arousal by avoiding aversive information.
(c) Nondefensives have low scores on both dimensions. These persons are supposed to flexibly adapt to the demands of a stressful encounter. Instead of frequently employing vigilant or avoidant coping strategies, they prefer to act instrumentally in most situations.
(d) Individuals who exhibit high scores on both dimensions are called high anxious. In employing vigilant as well as avoidant coping strategies, these persons try to reduce both the subjective uncertainty and the emotional arousal induced by stressful encounters. Because the two goals are incompatible in most situations, high-anxious persons are assumed to show fluctuating and therefore less efficient coping behavior.
Approaches to assess individual differences in vigilance and cognitive avoidance are described in Krohne et al. (2000). Empirical results related to predictions derived from the MCM are presented in Krohne (1993, 1996), and Krohne et al. (1992).
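The two-by-two structure of the four coping modes can be made concrete with a short sketch. It is only an illustration: the text does not say how ‘high’ and ‘low’ scores are determined, so the cutoff parameters, the class name CopingScores, and the function name coping_mode are hypothetical and are not taken from the MCM literature or from any published inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopingScores:
    """Dispositional scores on the two MCM dimensions (arbitrary units)."""
    vigilance: float
    cognitive_avoidance: float


def coping_mode(scores: CopingScores,
                vigilance_cutoff: float,
                avoidance_cutoff: float) -> str:
    """Map a pair of scores to one of the four coping modes.

    Assumption: a score strictly above its cutoff counts as 'high'.
    The cutoffs are hypothetical; the source does not specify them.
    """
    high_vigilance = scores.vigilance > vigilance_cutoff
    high_avoidance = scores.cognitive_avoidance > avoidance_cutoff

    if high_vigilance and not high_avoidance:
        return "sensitizer"      # mode (a): high vigilance, low avoidance
    if high_avoidance and not high_vigilance:
        return "represser"       # mode (b): the opposite pattern
    if not high_vigilance and not high_avoidance:
        return "nondefensive"    # mode (c): low on both dimensions
    return "high anxious"        # mode (d): high on both dimensions


# Illustrative call with made-up scores and cutoffs.
example = coping_mode(CopingScores(vigilance=8.0, cognitive_avoidance=2.0),
                      vigilance_cutoff=5.0, avoidance_cutoff=5.0)
```

Because the two dimensions are treated as independent, every combination of high and low scores is a valid outcome; a unidimensional, bipolar model such as Byrne’s would allow only the sensitizer and represser poles.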
3. Future Perspectives
Although the fields of stress and coping research represent largely explored territory, there are still fertile perspectives to be pursued in future research. Among the promising lines of research, two perspectives will be mentioned here.
(a) Compared to the simplistic stimulus-response conception of stress inherent in early approaches to stress, the ‘psychological’ (i.e., cognitive transformation) approach of the Lazarus group clearly represents progress. However, in advocating a completely ‘subjective’ orientation in conceptualizing stress, Lazarus overstated the ‘cognitive turn’ in stress research. In stating that ‘we might do better by describing relevant environments and their psychological meanings through the lenses of individuals’ (Lazarus 1990, p. 8) he took a stand that is at variance with the multivariate, systems-theory perspective proposed in his recent publications on stress and emotions (Lazarus 1990, 1991). First, the stress process contains variables to be assessed both subjectively and objectively, such as constraints, temporal aspects, or social support networks, as well as responses to be measured at different levels (cf. Lazarus 1990, Table 1). Second, the fact that most objective features relevant to stress-related outcomes exert their influence via a process of cognitive transformation (Mischel and Shoda 1995) does not mean that objective features can be neglected. It is of great practical and theoretical importance to know which aspects of the ‘objective’ environment an individual selects for transformation, and how these characteristics are subjectively represented. Third, as far as response levels are concerned, it is obvious that stressors create not only subjective (cognitive) responses but also reactions at the somatic and the behavioral-expressive level. In fact, many individuals (especially those high in cognitive avoidance) are characterized by a dissociation of subjective and objective stress responses (cf. Kohlmann 1997; for an early discussion of the psychological meaning of this dissociation see Lazarus 1966). These individuals may manifest, for example, relatively low levels of subjective distress but at the same time considerable elevations in autonomic arousal. In recent years, the concept of subjective-autonomic response dissociation has become increasingly important in clarifying the origin and course of physical diseases and affective disorders.
(b) It is important to define central person-specific goals (or reference values) in coping, such as reducing uncertainty, inhibiting emotional arousal, or trying to change the causes of a stressful encounter. These goals are not only central to understanding the stress and coping process, they are, in fact, ‘the core of personality’ (Karoly 1999). Goals define the trans-situational and trans-temporal relevance of certain stressors, serve as links to other constructs such as self-concept or expectancies, influence regulatory processes such as coping, and define the efficiency of these processes (cf. Karoly 1999, Lazarus 1991, Mischel and Shoda 1995). Instead of applying global and relatively content-free trait concepts in stress and coping research such as anxiety, depression, or optimism, a more fertile perspective would be to study personality in this field by paying attention to what people are trying to do instead of only observing how they actually respond to stressful events.
See also: Coping across the Lifespan; Coping Assessment; Health: Self-regulation; Self-regulation in Adulthood; Self-regulation in Childhood; Stress: Measurement by Self-report and Interview; Stress, Neural Basis of; Stress: Psychological Perspectives
Bibliography
Antonovsky A 1979 Health, Stress, and Coping. Jossey-Bass, San Francisco, CA
Arnold M B 1960 Emotion and Personality (2 Vols.). Columbia University Press, New York
Bandura A 1977 Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review 84: 191–215
Bruner J S, Postman L 1947 Emotional selectivity in perception and reaction. Journal of Personality 16: 69–77
Brunswik E 1947 Systematic and Representative Design of Psychological Experiments: With Results in Physical and Social Perception. University of California Press, Berkeley, CA
Byrne D 1964 Repression–sensitization as a dimension of personality. In: Maher B A (ed.) Progress in Experimental Personality Research. Academic Press, New York, Vol. 1, pp. 169–220
Engel B T 1985 Stress is a noun! No, a verb! No, an adjective. In: Field T M, McCabe P M, Schneiderman N (eds.) Stress and Coping. Erlbaum, Hillsdale, NJ, pp. 3–12
Eriksen C W 1966 Cognitive responses to internally cued anxiety. In: Spielberger C D (ed.) Anxiety and Behavior. Academic Press, New York, pp. 327–60
Folkman S, Lazarus R S 1980 An analysis of coping in a middle-aged community sample. Journal of Health and Social Behavior 21: 219–39
Folkman S, Lazarus R S 1988 Ways of Coping Questionnaire Research edition. Consulting Psychologists Press, Palo Alto, CA
Freud A 1936 Das Ich und die Abwehrmechanismen [The ego and the mechanisms of defense]. Internationaler Psychoanalytischer Verlag, Vienna
Freud S 1926 Hemmung, Symptom und Angst [Inhibitions, symptoms and anxiety]. Internationaler Psychoanalytischer Verlag, Vienna
Hinkle L E 1974 The concept of ‘stress’ in the biological and social sciences. International Journal of Psychiatry in Medicine 5: 335–57
Hobfoll S E 1989 Conservation of resources: A new attempt at conceptualizing stress. American Psychologist 44: 513–24
Hobfoll S E, Freedy J R, Green B L, Solomon S D 1996 Coping reactions to extreme stress: The roles of resource loss and resource availability. In: Zeidner M, Endler N S (eds.) Handbook of Coping: Theory, Research, Applications. Wiley, New York, pp. 322–49
Hobfoll S E, Leiberman J R 1987 Personality and social resources in immediate and continued stress-resistance among women. Journal of Personality and Social Psychology 52: 18–26
Hobfoll S E, Lilly R S 1993 Resource conservation as a strategy for community psychology. Journal of Community Psychology 21: 128–48
Holmes T H, Rahe R H 1967 The social readjustment rating scale. Journal of Psychosomatic Research 11: 213–8
Janis I L 1958 Psychological Stress: Psychoanalytic and Behavioral Studies of Surgical Patients. Wiley, New York
Janis I L 1983 Stress inoculation in health care: Theory and research. In: Meichenbaum D, Jaremko M (eds.) Stress Reduction and Prevention. Plenum, New York, pp. 67–99
Karoly P 1999 A goal systems self-regulatory perspective on personality, psychopathology, and change. Review of General Psychology 3: 264–91
Kobasa S C 1979 Stressful life events, personality, and health: An inquiry into hardiness. Journal of Personality and Social Psychology 37: 1–11
Kohlmann C-W 1997 Persönlichkeit und Emotionsregulation: Defensive Bewältigung von Angst und Streß [Personality and the regulation of emotion]. Huber, Bern, Switzerland
Krohne H W 1978 Individual differences in coping with stress and anxiety. In: Spielberger C D, Sarason I G (eds.) Stress and Anxiety. Hemisphere, Washington, DC, Vol. 5, pp. 233–60
Krohne H W 1993 Vigilance and cognitive avoidance as concepts in coping research. In: Krohne H W (ed.) Attention and Avoidance. Strategies in Coping with Aversiveness. Hogrefe & Huber, Seattle, WA, pp. 19–50
Krohne H W 1996 Individual differences in coping. In: Zeidner M, Endler N S (eds.) Handbook of Coping: Theory, Research, Applications. Wiley, New York, pp. 381–409
Krohne H W, Egloff B, Varner L J, Burns L R, Weidner G, Ellis H C 2000 The assessment of dispositional vigilance and cognitive avoidance: Factorial structure, psychometric properties, and validity of the Mainz Coping Inventory. Cognitive Therapy and Research 24: 297–311
Krohne H W, Hock M, Kohlmann C-W 1992 Coping dispositions, uncertainty, and emotional arousal. In: Strongman K T (ed.) International Review of Studies on Emotion. Wiley, Chichester, UK, Vol. 2, pp. 73–95
Lazarus R S 1966 Psychological Stress and the Coping Process. McGraw-Hill, New York
Lazarus R S 1974 Psychological stress and coping in adaptation and illness. International Journal of Psychiatry in Medicine 5: 321–33
Lazarus R S 1990 Theory-based stress measurement. Psychological Inquiry 1: 3–13
Lazarus R S 1991 Emotion and Adaptation. Oxford University Press, New York
Lazarus R S 1993 Coping theory and research: Past, present, and future. Psychosomatic Medicine 55: 234–47
Lazarus R S, Folkman S 1984 Stress, Appraisal, and Coping. Springer, New York
Lazarus R S, Folkman S 1986 Cognitive theories of stress and the issue of circularity. In: Appley M H, Trumbull R (eds.) Dynamics of Stress. Physiological, Psychological, and Social Perspectives. Plenum, New York, pp. 63–80
Lazarus R S, Launier R 1978 Stress-related transactions between person and environment. In: Pervin L A, Lewis M (eds.) Perspectives in Interactional Psychology. Plenum, New York, pp. 287–327
Mason J W 1971 A re-evaluation of the concept of ‘nonspecifity’ in stress theory. Journal of Psychiatric Research 8: 323–33
Mason J W 1975a A historical view of the stress field. Part I. Journal of Human Stress 1: 6–12
Mason J W 1975b A historical view of the stress field. Part II. Journal of Human Stress 1: 22–36
Mason J W 1975c Emotion as reflected in patterns of endocrine integration. In: Levi L (ed.) Emotions: Their Parameters and Measurement. Raven, New York, pp. 143–81
McGrath J E 1982 Methodological problems in research on stress. In: Krohne H W, Laux L (eds.) Achievement, Stress, and Anxiety. Hemisphere, Washington, DC, pp. 19–48
Miller S M 1980 When is little information a dangerous thing? Coping with stressful events by monitoring versus blunting. In: Levine S, Ursin H (eds.) Health and Coping. Plenum, New York, pp. 145–69
Miller S M 1987 Monitoring and blunting: Validation of a questionnaire to assess styles of information seeking under threat. Journal of Personality and Social Psychology 52: 345–53
Miller S M, Mangan C E 1983 Interacting effects of information and coping style in adapting to gynecologic stress: Should the doctor tell all? Journal of Personality and Social Psychology 45: 223–36
Mischel W, Shoda Y 1995 A cognitive-affective system theory of personality: Reconceptualizing situations, dispositions, dynamics, and invariance in personality structure. Psychological Review 102: 248–68
Mullen B, Suls J 1982 The effectiveness of attention and rejection as coping styles: A meta-analysis of temporal differences. Journal of Psychosomatic Research 26: 43–9
Roth S, Cohen L J 1986 Approach, avoidance, and coping with stress. American Psychologist 41: 813–9
Scheier M F, Carver C S 1992 Effects of optimism on psychological and physical well-being: Theoretical overview and empirical update. Cognitive Therapy and Research 16: 201–28
Schwarzer R, Leppin A 1991 Social support and health: A theoretical and empirical overview. Journal of Social and Personal Relationships 8: 99–127
Schwarzer R, Schwarzer C 1996 A critical survey of coping instruments. In: Zeidner M, Endler N S (eds.) Handbook of Coping: Theory, Research, Applications. Wiley, New York, pp. 107–32
Selye H 1976 The Stress of Life (rev. edn.). McGraw-Hill, New York
Thoits P A 1983 Dimensions of life events that influence psychological distress: An evaluation and synthesis of the literature. In: Kaplan H B (ed.) Psychosocial Stress: Trends in Theory and Research. Academic Press, New York, pp. 33–103
Zeidner M, Endler N S (eds.) 1996 Handbook of Coping: Theory, Research, Applications. Wiley, New York
H. W. Krohne
Stress and Health Research
1. Introduction
1.1 The Origin of Stress Research
Stress is a term that is often related to a psychological or physical state of health (cf. Sapolsky 1994) and is used by many psychological, sociological, and medical scientists as well as laymen. A first definition of the term stress was proposed by Cannon (1935). He postulated that passing a critical stress level, triggered by physical and emotional stimuli, endangers the homeostatic control of an organism. However, the name that is most closely associated with stress is Hans Selye, who defined stress as a nonspecific response of the body to any demand characterized by the secretion of glucocorticoids. Selye developed the concept of the general adaptation syndrome (GAS), which encompasses the alarm reaction, the stage of resistance, and finally the stage of exhaustion. The bodily responses to massive or ongoing stress are summarized in the stress triad, including (a) the enlargement and hyperactivity of the adrenal cortex, (b) the shrinking or atrophy of the thymus, spleen, lymph nodes, and the lymphatic system, and (c) the appearance of gastrointestinal and bowel ulcers (Selye 1936, 1937). Regarding health consequences, Selye argued that stress plays a role in the development of every disease. According to Selye, there is always a particularly weak organ or system (due to heredity or external conditions) that is thus likely to break down under stress, which is why he concluded that individuals can develop different types of diseases under the influence of the same kind of stressor. In the concept of four basic variations of stress, Selye (1983) emphasized that stress can be based on overstress (hyperstress) as well as understress (hypostress). He also contrasted harmful, damaging stress (distress) with good stress (eustress). Selye’s idea of an unspecific stress response to all kinds of stimuli was challenged by Mason (1968, 1975), who underlined the importance of specific emotional reactions that determine a specific endocrine stress response. Mason showed that specific situational characteristics, such as novelty, uncontrollability, unpredictability, ambiguity, anticipation of negative consequences, and high ego-involvement, lead to specific hormonal stress responses.
1.2 Definitions of Stress
To date, a large number of definitions of stress are available (see Stress and Coping Theories). Some researchers have even proposed dropping the term stress. Levine and Ursin (1991) have offered a comprehensive definition of stress, distinguishing between (a) input (stress stimuli), (b) (individual) processing, and (c) outcome (stress reaction). In this concept, those stimuli that require processing are defined as loads. An individual appraisal process determines whether a load becomes a stressor. Differences in processing are based, for example, on genetic, ontogenetic, and social factors, and on early life as well as lifelong experiences. A related idea was introduced earlier by Lazarus and Folkman (1984). According to the cognitive-transactional model by Lazarus and Folkman, stress is experienced as a process that is initially triggered by situational demands, but then mainly by the cognitive appraisal of these demands. The characteristics of a situation (primary appraisal) are evaluated simultaneously in line with the available coping capacities or resources (secondary appraisal). This results in a cognitive appraisal of challenge, threat, or harm/loss, and, subsequently, in emotions, coping attempts, and adaptational outcomes. This process can be cyclical, and it may lead to a number of reappraisals that change the nature of the stress episode. At the level of stress responses, physiological, behavioral, and subjective/verbal reactions can be distinguished. Physiological responses primarily involve the sympathetic–adrenal–medullary (SAM) axis, the hypothalamic–pituitary–adrenal (HPA) axis, and the immune system. Behavioral responses, among others, cover attention, arousal, and vigilance; subjective/verbal reactions are interpretations, cognitions, and emotions. The stress reactions to physical and psychological stimuli are primarily determined by the individual interpretation, but also by the social context, the social status, genetic factors, gender, developmental stage, and individual lifelong experiences. Finally, Levine and Ursin (1991) contrast an unspecific stress response (general alarm) and a specific individual stress reaction. With this idea they offer an integrational view of Selye’s (1983) and Mason’s (1968) approaches to a definition of stress.
2. The Model of Allostatic Load
2.1 Introduction
While homeostatic regulation reflects the stability of constants, allostatic regulation means stability through change. Homeostatic systems (e.g., body temperature, blood oxygen, blood pH, glucose levels) must be maintained within a narrow range. That means that deviations in a homeostatic system trigger a restorative response to correct the changes. In contrast, allostatic (adaptive) systems can operate within a relatively broad range. Following Sterling and Eyer (1988), allostasis is the regulation of the internal milieu through dynamic change in hormonal and physical parameters. Allostasis is defined as the ability of the body to increase or decrease vital functions to a new steady state when challenged. McEwen and Stellar (1993) concluded that the concept of homeostasis is not helpful for explaining the hidden costs of chronic stress on the organism. Therefore, they extended the concept of allostasis over the dimension of time and introduced the idea of allostatic load. Allostatic load is defined as the cost of chronic exposure to elevated or fluctuating endocrine or neural responses resulting from chronic or repeated challenges that the individual experiences as stressful. Stress, for instance psychological demands, physical threat/danger, or adverse life experiences, activates various adaptive (allostatic) systems, initiating adaptation and coping processes. The SAM and the HPA axes are the main endocrine stress systems of the organism involved in an allostatic response. In the face of an internal or external challenge, these systems are activated. (a) Catecholamines are released from the sympathetic nerves and the adrenal medulla. (b) Corticotropin-releasing factor (CRF) from the paraventricular nucleus (PVN) of the hypothalamus triggers the discharge of adrenocorticotrophic hormone (ACTH) from the anterior pituitary gland. ACTH stimulates the secretion of glucocorticoids (e.g., cortisol) from the adrenal cortex. These hormones serve as mediators of adaptational processes. Besides the sympathetic nervous system and the HPA axis, the cardiovascular, metabolic, and immune systems play a crucial role in allostatic processes. In sum, allostatic systems enable the body to respond adequately to changes in social and physical conditions and therefore protect the body in the short run. In the long run, however, the bodily responses to stress can cause damage and finally promote several diseases. The organism’s ‘hidden toll’ of short-term adaptation to challenges is described as allostatic load, which constitutes a wear and tear on the body from chronic overactivity or underactivity of allostatic systems.
2.2 Types of Allostatic Load
Four different scenarios can cause allostatic load: (1) frequent exposure to stress, (2) inability to habituate to repeated challenges, and (3) inability to terminate a stress response. In these three types, allostatic load is promoted by an organism’s increased exposure to stress hormones and other allostatic mediators. Finally, (4) an inadequate allostatic response in one allostatic system could be related to an increased activation of another allostatic system. An inadequate response of one allostatic system could trigger an inadequate (compensatory) response in another allostatic system. The model emphasizes that allostatic load reflects not only the impact of lifelong experiences (wear and tear), but also covers early life experiences, genetic predispositions, environmental factors, as well as psychological and behavioral parameters. The cascading relationships between these factors could explain individual differences in the susceptibility to stress and, in some cases, to disease. For example, Meaney and co-workers (1994) showed in animal studies that early life experiences can calibrate the lifelong pattern of physiological stress responses. Whereas unpredictable stress in newborn rodents led to overactive HPA and SAM axis responses, neonatal handling resulted in a reduced reactivity of the stress systems for the entire lifespan. Concerning the impact of genes on HPA axis functioning, Wüst et al. (2000) report a medium-sized yet distinct genetic influence on cortisol levels after awakening. The role of genes in the regulation of stress systems is discussed in detail by Koch and Stratakis (2000). Psychological factors, such as anticipating negative consequences, pessimism, anxiety, or worry also contribute to allostatic load. While the origin of allostatic load can be based on an individual psychological appraisal process, psychological factors can also prolong, intensify, expand, or aggravate the amount of existing allostatic load. Finally, there seems to be a bidirectional influence of stress and lifestyle factors, for example nutrition, physical exercise, sleeping habits, alcohol, caffeine, and nicotine consumption, with significant effects on the extent of allostatic load (cf. Steptoe 2000).
3. Stress and Health
3.1 Types of Allostatic Load and Health
The four different types of allostatic load may possibly be related to increased risks of the development of different health problems. Repeated ups and downs of physiological responses or a heightened activity of physiological systems due to internal or external challenges result in allostatic load, which could predispose the organism to disease. There is evidence that acute and chronic stress are significant risk factors for cardiovascular, immunological, and psychological health problems as well as impairments in the central nervous system (CNS). However, it has to be taken into consideration that evidence for stress effects on health is mainly based on correlational analyses. Animal and human data point to the idea that allostatic load of type 1 and type 3 may possibly be related to the risk of myocardial infarction. Repeated blood pressure elevation (allostatic load type 1), for instance, could promote atherosclerosis. The failure to recover from blood pressure increase following an acute stressor (allostatic load type 3) may lead to hypertension, which also increases the risk of atherosclerosis. Furthermore, heightened cortisol levels promote an increase in insulin secretion, resulting in an endocrine condition that may also accelerate atherosclerosis. Atherosclerosis itself appears to increase the risk of myocardial infarction. Allostatic load type 2 covers the failure to habituate or the extent of habituation to repeated stress exposure. After repeated speaking and mentally solving arithmetic problems in front of an audience (Trier Social Stress Test), most participants showed habituation in their cortisol response, but about one-third of them still had high cortisol increases every time they were exposed to the stressor.
The relationship between the inability of the HPA axis to habituate to repeated exposure to this psychosocial laboratory stress protocol and various health parameters is currently being investigated. Chronically elevated SAM and HPA axis activity can lead to decreased bone mineral density, body weight loss, amenorrhea and/or anorexia nervosa, as observed in depressed patients or high-performance athletes (allostatic load type 3). Concerning bone mineral density, Michelson et al. (1996) showed that moderately elevated glucocorticoid levels over a prolonged period were associated with reduced levels of osteocalcin in depressed women, therefore inhibiting bone formation (allostatic load type 3). Allostatic load type 4 is illustrated by observing that a hyporeactive HPA axis (decreased release of glucocorticoids after stress) could lead to an increased response of the immune system. In case one system does not show an adequate stress response, another system increases its activity because the underactive system does not provide the normal counter-regulatory control. For instance, Lewis rats, which are known to have an HPA axis hyporeactivity, show increased susceptibility for autoimmune and inflammatory disturbances, and those rats which developed an HPA axis hyporeactivity due to a subordinate position in their social network are believed to die earlier than their counterparts (allostatic load type 4) (cf. McEwen 2000).
3.2 Stress and the Cardiovascular System
Animal and human studies indicate that a relationship exists between stress and cardiovascular disturbances. Stress plays a crucial role in cardiovascular diseases as a risk factor in the progressive development of hypertension and atherosclerosis and as a trigger of acute incidents, such as myocardial infarction, stroke, or even death. While stress provokes cardiovascular and endocrine changes that could be beneficial even in healthy human beings, the same stimuli can be detrimental to patients with coronary artery disease (see Stress and Cardiac Response). Manuck et al. (1988) observed that socially subordinate female primates and dominant male primates in unstable social hierarchies have an increased risk of developing atherosclerosis.
In humans, occupational strain, for example, jobs with high psychological demands and low control, leads to elevated blood pressure, increased progression of atherosclerosis, and increased risk of coronary heart disease (CHD) (cf. McEwen 2000). Allen and colleagues (1993) concluded that women are more likely to be cardiac reactors, whereas men are more prone to be vascular reactors, suggesting a gender-specific allostatic load within the cardiovascular system.
3.3 Stress and the Immune System
A large amount of literature shows the impact of stress on changes in immune system functioning (cf. Biondi 2001). Chronic psychological stress or increased allostatic load is generally shown to be associated with decreased immune responses as well as enhanced severity of infections. Consequently, a stress-related suppression of cellular immune system functioning can, for instance, lead to increased susceptibility to or severity of the common cold (Cohen et al. 1991). While chronic stress seems to precede blunted immune response, acute stress may be potentially beneficial: experiencing acute stress can enhance the trafficking of immune cells to the site of acute challenge, resulting in an immune-enhancing effect over several days (cf. McEwen 2000). Acute stress may call immune cells ‘to their battle stations,’ thus enhancing short-term immunity (cf. Dhabar 2000, Dhabar and McEwen 1997). Possible mechanisms involved in the relationship between psychological stress and immune function changes, covering the stress–disease connection, are discussed in Biondi (2001). There seems to be a complex connection between emotional stress and the development of cancer. In sum, human data on psychological factors and cancer are controversial, but apparently there is support that psychological stress may contribute to disease manifestation and progression. Concerning infectious diseases, some studies have reported negative findings. However, human data point to a possible role that stress plays in increasing the risk of tuberculosis, influenza, upper respiratory infections in children and adults, as well as recurrences of herpes labialis virus and genital herpes virus, or HIV progression. Biondi (2001) summarizes that psychological stress could act as a cofactor in the pathogenesis, course, and outcome of infectious diseases in animals and humans. The exact role of psychological factors in the pathogenesis and course of specific autoimmune diseases (e.g., rheumatoid arthritis, systemic lupus erythematosus, multiple sclerosis, ulcerative colitis, Crohn’s disease, joint inflammation, psoriasis) is still poorly understood. In view of the clinical complexity of these illnesses, a simple connection between psychological stress and the status of an autoimmune disease cannot be presumed.
For asthma bronchiale, there appears to be some support for the assumption that this illness could be exacerbated by psychological stress via immune–inflammatory mechanisms. Finally, the process of wound healing appears to be impaired under chronic as well as minor psychological stress, a process which may be mediated by proinflammatory cytokines (Glaser et al. 1999).
3.4 Stress and the Central Nervous System
Besides numerous peripheral effects of stress, stress-induced secretion of adrenal steroids can also affect the central nervous system (CNS). For example, elevated glucocorticoid levels can impair hippocampal and temporal lobe functioning, brain areas which are involved in declarative, episodic, and short-term memory. While acute stress can profoundly impair memory (effects reversible and relatively short-lived; Kirschbaum et al. 1996), repeated stress can end up in atrophy of the dendrites of pyramidal neurons in the CA3 region of the hippocampus, causing long-term cognitive impairments. Besides its role in memory and cognition, the hippocampus is most important for the negative feedback regulation of the HPA axis. Allostatic load during a lifetime may cause loss of resilience of HPA axis function, resulting in an attenuation of the negative feedback signal that may lead to hyperactivity of this stress system. The connection between the wear and tear on this region of the brain and the dysregulation of the HPA axis is elaborated in detail in the glucocorticoid cascade hypothesis (Sapolsky et al. 1986). Animal data show that in aged rats, impairments of declarative, episodic, and spatial memory are associated with HPA axis hyperactivity. In elderly humans, Lupien et al. (1998) observed that hippocampal atrophy and cognitive impairment were related to progressive elevations of cortisol levels over a four-year interval. In sum, animal as well as human data support the idea that elevated glucocorticoid levels could lead to a reduction in hippocampal functioning. Consequences may include the dysregulation of the HPA axis as well as cognitive impairments.
3.5 Stress and Psychological Diseases
Stress is said to be one important parameter, among others, in the manifestation or aggravation of several psychological disturbances, psychosomatic illnesses, and psychiatric disorders, including burnout, anorexia nervosa, chronic fatigue syndrome (CFS), asthma, neurodermatitis, major depression, schizophrenia, or post-traumatic stress disorder (PTSD), which are sometimes referred to as stress-related disorders. The notion that stress contributes somehow to the onset or progression of these disorders does not indicate how strong the influence of stress is compared to other important contributing factors, such as genetic predisposition or organic diseases. However, changes in the regulation of the HPA axis can be related to several stress-related disorders. For example, chronic stress, major depression, schizophrenia, or anorexia nervosa seem to be associated with a hyperactive HPA axis, whereas a hyporeactive HPA axis is found more often in CFS, autoimmune processes, or PTSD. Since only a few longitudinal data are available to date on the relationship between HPA activity and stress-related disorders, the question of cause and effect remains to be answered in future research.
4. Gender Differences in Stress Responses and Health
Gender-specific prevalence rates reveal that men and women are at differential risk for a number of illnesses (see Gender and Physical Health). Whereas women suffer more often from autoimmune diseases (e.g., multiple sclerosis or systemic lupus erythematosus), men are more prone to develop coronary heart diseases and appear to be more susceptible to infectious diseases. Also, the incidence of psychiatric disorders is not equally distributed between men and women. Diagnoses such as major depression, anxiety, phobia, panic disorder, and obsessive compulsive disorder are more common in women, whereas men more often develop antisocial behavior, abuse substances, or commit suicide. Life expectancy is about seven years higher for women compared to men, although women consistently appear to report more physical as well as somatoform symptoms than men. Besides this clear gender–disease dimorphism, gender differences also exist in subjective and biological aspects of the human stress response (cf. Kudielka et al. 2000). Concerning subjective/verbal reactions to stress, several studies (ranging from everyday hassles to posttraumatic stress reactions) show that women experience and/or report more stress than men. As to HPA axis responses to psychological stress, men repeatedly showed higher ACTH and free (bioavailable) cortisol responses compared to women, but similar total cortisol increases. Only in the luteal phase of the menstrual cycle did women show similar free cortisol responses to those of men (Kirschbaum et al. 1999). There is evidence that these differences are, at least partly, explained by gender differences in the availability of gonadal steroids, especially estrogens. The finding that men have higher blood pressure changes and higher elevations in epinephrine concentrations under psychological stress parallels the observation that some women seem to be protected from CHD (see Gender and Cardiovascular Health).
Estrogen availability is discussed as being a preventive agent for the onset or progression of CHD, but the use of hormone replacement therapy (HRT) for prevention of heart disease is more complex than initially believed. The Heart and Estrogen/progestin Replacement Study (HERS) was a first large randomized, blind, placebo-controlled secondary prevention trial to evaluate the efficacy and safety of estrogen and progesterone replacement therapy in reducing CHD risk (cf. e.g., Hulley et al. 1998). It was conducted from January 1993 through July 1998, with a mean follow-up of 4.1 years at 20 centers. However, in sum, the results of the HERS study could not support the notion that HRT is a protective measure against CHD.
5. Summary and Conclusion
This article has dealt with the origin of stress research, briefly describing the basic aspects of the stress concepts of Cannon, Selye, and Mason. Then, a comprehensive definition of stress by Levine and Ursin (1991) was offered, distinguishing between stress stimuli, individual processing, and stress reactions on three different levels (physiological, behavioral, subjective/verbal). The relationship between stress and health was elucidated using the allostatic load model introduced by McEwen and Stellar (1993): the body’s cost of short-term adaptation to internal and external challenges is allostatic load, which constitutes a wear and tear on the organism. Four different types of allostatic load were presented and related to different health problems, including the cardiovascular system, the immune system, and the CNS. Finally, gender differences in stress responses and health outcome, including psychosomatic and psychiatric diseases, were briefly discussed, with the conclusion that the gender–disease dimorphism could be associated with gender-specific stress responses.
See also: Emotional Inhibition and Health; Environmental Stress and Health; Gender and Cardiovascular Health; Gender and Physical Health; Gender Role Stress and Health; Hypothalamic–Pituitary–Adrenal Axis, Psychobiology of; Job Stress, Coping with; Posttraumatic Stress Disorder; Social Support and Stress; Stress and Cardiac Response; Stress and Coping Theories; Stress at Work
Bibliography
Allen M T, Stoney C M, Owens J F, Matthews K A 1993 Hemodynamic adjustments to laboratory stress: The influence of gender and personality. Psychosomatic Medicine 55: 505–17
Biondi M 2001 Effects of stress on immune functions: An overview. In: Ader R, Felten D L, Cohen N (eds.) Psychoneuroimmunology, 3rd edn. Academic Press, San Diego, CA, Vol. 2, pp. 189–226
Cannon W B 1935 Stresses and strains of homeostasis. American Journal of the Medical Sciences 189(1): 1–14
Cohen S, Tyrrell D A, Smith A P 1991 Psychological stress and susceptibility to the common cold. New England Journal of Medicine 325: 606–12
Dhabar F S 2000 Stress-induced enhancement of immune function. In: Fink G (ed.) Encyclopedia of Stress. Academic Press, San Diego, CA, Vol. 2, pp. 515–23
Dhabar F S, McEwen B S 1997 Acute stress enhances while chronic stress suppresses immune function in vivo: A potential role for leukocyte trafficking. Brain Behavior and Immunity 11: 286–306
Glaser R, Kiecolt-Glaser J K, Marucha P T, MacCullum R C, Laskowski B F, Malarkey W B 1999 Stress-related changes in proinflammatory cytokine production in wounds. Archives of General Psychiatry 56: 450–6
Hulley S, Grady D, Bush T, Furberg C, Herrington D, Riggs B, Vittinghoff E 1998 Randomized trial of estrogen plus progestin for secondary prevention of coronary heart disease in postmenopausal women. Heart and Estrogen/progestin Replacement Study (HERS) Research Group. Journal of the American Medical Association 280(7): 605–13
Kirschbaum C, Kudielka B M, Gaab J, Schommer N C, Rohleder R, Hellhammer D H 1999 Impact of gender, menstrual cycle phase and oral contraceptives on the activity of the hypothalamic–pituitary–adrenal axis. Psychosomatic Medicine 61(2): 154–62
Kirschbaum C, Wolf O T, May M, Wippich W, Hellhammer D H 1996 Stress- and treatment-induced elevations of cortisol levels associated with impaired declarative memory in healthy adults. Life Sciences 58(17): 1475–83
Koch C A, Stratakis C A 2000 Genetic factors and stress. In: Fink G (ed.) Encyclopedia of Stress. Academic Press, San Diego, CA, Vol. 2, pp. 205–12
Kudielka B M, Hellhammer D H, Kirschbaum C 2000 Sex differences in human stress response. In: Fink G (ed.) Encyclopedia of Stress. Academic Press, San Diego, CA, Vol. 3, pp. 424–9
Lazarus R S, Folkman S 1984 Stress, Appraisal, and Coping. Springer, New York
Levine S, Ursin H 1991 What is stress? In: Brown M R, Koob G F, Rivier C (eds.) Stress—Neurobiology and Neuroendocrinology. Marcel Dekker, New York, Chap. 1
Lupien S J, de Leon M, de Santi S, Convit A, Tarshish C, Nair N P, McEwen B S, Hauger R L, Meaney M J 1998 Cortisol levels during human aging predict hippocampal atrophy and memory deficits [published erratum appears in Nature Neuroscience 1998, 1(4): 329]. Nature Neuroscience 1: 69–73
Manuck S B, Kaplan J R, Adams M R, Clarkson T B 1988 Studies of psychosocial influences on coronary artery atherogenesis in cynomolgus monkeys. Health Psychology 7: 113–24
Mason J W 1968 A review of psychoendocrine research on the sympathetic–adrenal medullary system. Psychosomatic Medicine 30: 631–53
Mason J W 1975 A historical view of the stress field. Journal of Human Stress 1: 6–12
McEwen B S 2000 Allostasis and allostatic load. In: Fink G (ed.) Encyclopedia of Stress. Academic Press, San Diego, CA, Vol. 1, pp. 145–50
McEwen B S, Stellar E 1993 Stress and the individual. Archives of Internal Medicine 153: 2093–101
Meaney M J, Tannenbaum B, Francis D, Bhatnagar S, Shanks N, Viau V, O’Donnell D, Plotsky P M 1994 Early environmental programming hypothalamic pituitary–adrenal responses to stress. Seminars in Neuroscience 6: 247–59
Michelson D, Stratakis S, Hill L, Reynolds J, Galliven E, Chrousos G, Gold P 1996 Bone mineral density in women with depression. New England Journal of Medicine 335: 1176–81
Sapolsky R M 1994 Why Zebras Don’t Get Ulcers. W H Freeman, New York
Sapolsky R M, Krey L C, McEwen B S 1986 The neuroendocrinology of stress and aging: The glucocorticoid cascade hypothesis. Endocrine Reviews 7: 284–301
Selye H 1936 A syndrome produced by diverse nocuous agents. Nature 32
Selye H 1937 Studies on adaptation. Endocrinology 21(2): 169–88
Selye H 1983 The stress concept: past, present, and future. In: Cooper C L (ed.) Stress Research. Wiley, Chichester, UK, pp. 1–20
Steptoe A 2000 Health behavior and stress. In: Fink G (ed.) Encyclopedia of Stress. Academic Press, San Diego, CA, Vol. 2, pp. 322–6
Sterling P, Eyer J 1988 Allostasis: A new paradigm to explain arousal pathology. In: Fisher S, Reason H S (eds.) Handbook of Life Stress, Cognition and Health. Wiley, New York, pp. 629–49
Wüst S, Federenko I, Hellhammer D H, Kirschbaum C 2000 Genetic factors, perceived chronic stress, and the free cortisol response of awakening. Psychoneuroendocrinology 25: 707–20
B. M. Kudielka and C. Kirschbaum
Stress at Work
1. The Changing Nature of Work and the Role of Stress
The nature of work has changed considerably over the past several decades in economically advanced societies. Industrial mass production no longer dominates the labor market. This is due, in part, to technological progress, in part to a growing number of jobs available in the service sector. Many jobs are confined to information processing, controlling, and coordination. Sedentary rather than physically strenuous work is becoming more and more dominant. New management techniques, including quality management, are introduced, and economic constraints produce work pressure, rationalization, and cut-down in personnel. These changes go along with changes in the structure of the labor market. More employees are likely to work on temporary contracts, on fixed term, or in flexible job arrangements. The workforce is getting older, and an increasing proportion of women enters the labor market, with an increase of double exposure among women with children or, more generally, in dual career families. Most importantly, overemployment in some segments of the workforce is paralleled by underemployment, job instability, or structural unemployment in other segments. This latter trend currently hits the economically most advanced societies as well as economies that lag behind. Overall, a substantial part of the economically active population is confined to insecure jobs, to premature retirement, or job loss. Why is work so important for human well-being, and how does work contribute to the burden of stress and its adverse effects on health? In all advanced societies work and occupation in adult life are accorded primacy for the following reasons. First, having a job is a principal prerequisite for continuous income and, thus, for independence from traditional support systems (family, community welfare, etc.). Increasingly, level of income determines a wide range of life chances.
Second, training for a job and achievement of occupational status are the most important goals of socialization. It is through education, job training, and status acquisition that personal growth and development are realized, that a core social identity outside the family is acquired, and that goal-directed activity in human life is shaped. Third, occupation defines an important criterion of social stratification. Amount of esteem in interpersonal life largely depends on type of job and level of occupational achievement. Furthermore, type and quality of occupation, and especially the degree of self-direction at work, strongly influence personal attitudes and behavioral patterns in areas that are not directly related to work, such as leisure or family life (Kohn and Schooler 1983). Finally, occupational settings produce the most pervasive continuous demands during one’s lifetime, and they absorb the largest amount of active time in adult life, thus providing a source of recurrent negative or positive emotions. It is for these reasons that stress research in organizations where paid work takes place is of particular relevance. It is important to recognize that traditional occupational hazards, such as exposure to toxic substances, heat, cold, or noise, are no longer the dominant challenges of health at work. Rather, distinct psychological and emotional demands and threats are becoming highly prevalent in modern working life. There has been a recognition that the importance of work goes beyond traditional occupational diseases and, indeed, it is likely that work makes a greater contribution to diseases not thought of as ‘occupational’ in conventional terms. At a descriptive level, recent reports find that, for instance, almost half of the workforce are exposed to monotonous tasks or lack of task rotation; 50 percent work at a very high speed or to tight deadlines. Thus, over- and underload at work are highly prevalent. Every third employee has no influence on work rhythm, and every fifth is exposed to shiftwork (Paoli 1997).
There is now growing awareness among all parties of the labor market that psychosocial stress at work produces considerable costs, most importantly a high level of absenteeism, reduced productivity, compensation claims, health insurance, and direct medical expenses. Permanent disability and loss of productive life years due to premature death add to this burden. In the European Union, the costs of stress to organizations and countries are estimated to be between 5 and 10 percent of GNP per annum (Cooper 1998). At the same time scientific evidence on associations between psychosocial stress at work and health is growing rapidly.
2. Concepts of Work-related Stress and Selected Findings
Research on psychosocial work-related stress differs from traditional biomedical occupational health research by the fact that stressors cannot be identified by direct physical or chemical measurements. Rather, theoretical models are needed to analyze the particular nature of the psychosocial work environment. A theoretical model is best understood as a heuristic device that selectively reduces complex reality to meaningful components. Components are meaningful to the extent that they provide the material from which the researcher can deduce explanations and, thus, produce new knowledge. Ideally, a theoretical model of psychosocial stress at work with relevance to health should encompass a wide variety of different occupations and should not be restricted to a specific time or space. Before describing some prominent theoretical models of occupational stress, the basic terminology needs to be clarified. Much critique has been raised against the ambiguity of the term ‘stress.’ To avoid this, the following terms were suggested: ‘stressor’ is defined as an environmental demand or threat that taxes or exceeds a person’s ability to meet the challenge; ‘strain’ is defined as the person’s response to such a situation in psychological and physiological terms. Psychological responses include specific cognitions and negative emotions (e.g., anger, frustration, anxiety, helplessness), whereas physiological responses concern the activation of the autonomic nervous system and related neuro-hormonal and immune reactions. Stressors, in particular novel or dangerous ones, are appraised and evaluated by the person, and as long as there is some perception of agency on the part of the exposed person efforts are mobilized to reverse the threat or to meet the demands. Such efforts are termed ‘coping,’ and they occur at the behavioral (even interpersonal), cognitive, affective, and motivational level. Clearly, when judging strain, the quality and intensity of a stressor as well as the duration of exposure have to be taken into account, as well as individual differences in coping and in vulnerability to strain reactions. Recent research indicates that only part of human strain reactions are subject to conscious information processing, whereas a large amount bypasses awareness.
The term ‘stressful experience’ is introduced to delineate that part of affective processing that reaches consciousness. Stressful experiences at work are often attributed to adverse working conditions by exposed people themselves. While they usually refer to some common sense notions of stress, it is crucial to note that these attributions differ from the explanatory constructs of stressful experience at work that have been identified by science.
2.1 Person–Environment Fit
The first significant contribution to modern occupational stress research dates back to studies conducted more than 30 years ago at the Institute for Social Research at the University of Michigan. They were guided by a theoretical concept termed ‘person–environment fit’ (Caplan et al. 1980). Stressful experience at work, in this model, is conceptualized in two ways: first, as an experience where the work environment does not provide adequate supplies to meet the person’s needs; second, as an experience where the abilities of the person fall short of demands that are prerequisite to receiving supplies. In both conditions stressful experience results from a misfit between needs or abilities of the working person and demands or opportunities of the work environment. Further job opportunities may fail to fulfill the person’s needs as a consequence of unmet demands. As an example, a worker with limited skills is excluded from any promotion prospects, although, to meet his level of living, he badly needs better pay. This approach represents a type of theorizing that is termed process theory. As such, it is applicable to a wide range of work but does not specify the particular content dimensions on which person and environment should be examined (Edwards et al. 1998). As its basic terms are rather broad, it may be difficult to reach a consensus on how to measure core constructs. Nevertheless, person–environment fit offers an elaborate set of propositions of how critical constellations of work-related environmental and personal characteristics contribute to the development of stressful experience. At the empirical level, most studies to date have tested adverse effects on subjective health. Therefore, it is not quite evident to what extent these types of misfit are capable of eliciting sustained physiological strain and thus contribute to the development of bodily disease.
2.2 Demand, Support, and Control
The argument of how strain reactions affect physiological regulation has received special attention in a second influential contribution to current occupational stress research that focuses on the theoretical construct of control. Animal stress research and psycho-physiological experiments in humans confirmed the critical impact on physiological and psychological strain reactions produced by lack of control in a challenging situation (Frankenhaeuser 1979, Henry and Stephens 1977). Control is defined as the ability of a person to choose their own actions to cope with a challenge. In the workplace it ranges from task autonomy to participation in decision making, and strain evolves from challenging situations at work that leave the person with limited choice on how to react. This idea was elaborated in terms of the demand–control model of occupational stress by Karasek (Karasek and Theorell 1990, see Occupational Health). This model posits that strain is contingent on the combined effects of high demand and low control, though both high demands and low control may also have independent additive effects on strain and health. Stress research based on the concept of control resulted in several important discoveries. First, it was shown that physiological strain reactions are not uniform, but may vary according to the degree of perceived control. For instance, stressful experience in terms of striving without success was found to elicit strain reactions that exert particularly harmful effects on the cardiovascular system (Kaplan et al. 1983). Second, epidemiologic studies on work-related control and health, most importantly in the framework of the demand–control model, help explain the social distribution of some major chronic diseases, especially coronary heart disease (CHD). An inverse social gradient of CHD has been documented. Exposure to low control jobs is more frequent among lower status groups as is the prevalence of CHD. In fact, recent studies found evidence that a considerable part of the social variation in CHD is attributed to adverse psychosocial work conditions (Marmot et al. 1999). Finally, research on control at work has put special emphasis on coping resources, in particular the buffering effects produced by social support. Strain-reducing effects of social support are due to positive experience resulting from social relationships at the instrumental, emotional, and cognitive level (House 1981). This evidence has implications both for a broader conceptualization of occupational stress (e.g., the revised demand–support–control model (Karasek and Theorell 1990)) and for the design of workplace intervention measures.
2.3 Effort and Reward
In physiological terms, theories of control at work deal with the experience of power and influence which, at the psychological level, is paralleled by feelings of autonomy and self-efficacy. An alternative model of stressful experience at work, the model of effort– reward imbalance, is concerned with distributive justice, that is, with deviations from a basic ‘grammar’ of social exchange rooted in the notions of reciprocity and fairness (Siegrist 1996). This model assumes that effort at work is spent as part of a socially organized exchange process to which society at large contributes in terms of rewards. Rewards are distributed by three transmitter systems: money, esteem, and career opportunities including job security. The model of effort–reward imbalance claims that lack of reciprocity between costs and gains (i.e., high ‘cost’\low ‘gain’ conditions) elicits sustained strain reactions. For instance, having a demanding, but unstable job, achieving at a high level without being offered any promotion prospects, are examples of high cost\low gain conditions at work. In terms of current developments of the labor market in a global economy, the emphasis on occupational rewards including job security reflects the growing importance of fragmented job careers, of job instability, underemployment, redundancy, and forced occupational mobility including their financial consequences. 15177
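The notion of a high cost/low gain condition can be illustrated with a small sketch. Everything in it is an assumption made for illustration only: the 0 to 10 rating scales, the equal weighting of the three reward channels (money, esteem, career opportunities including job security), the effort-to-reward ratio, and the cutoff of 1.0 are hypothetical and are not taken from the measurement instruments used in the studies cited here.

```python
from dataclasses import dataclass


@dataclass
class JobAppraisal:
    """Hypothetical self-ratings on an assumed 0-10 scale."""
    effort: float   # perceived effort spent at work
    money: float    # reward through pay
    esteem: float   # reward through esteem and recognition
    career: float   # reward through career opportunities and job security


def mean_reward(job: JobAppraisal) -> float:
    # Assumption: the three reward channels count equally.
    return (job.money + job.esteem + job.career) / 3


def imbalance_ratio(job: JobAppraisal) -> float:
    # Illustrative index: effort ("cost") relative to reward ("gain").
    reward = mean_reward(job)
    if reward <= 0:
        raise ValueError("mean reward must be positive")
    return job.effort / reward


def is_high_cost_low_gain(job: JobAppraisal, threshold: float = 1.0) -> bool:
    # Assumption: a ratio above the threshold marks lack of reciprocity.
    return imbalance_ratio(job) > threshold


# A demanding but unstable job versus a more balanced one (made-up ratings).
demanding_unstable = JobAppraisal(effort=8, money=5, esteem=5, career=2)
balanced = JobAppraisal(effort=5, money=6, esteem=6, career=6)

print(is_high_cost_low_gain(demanding_unstable))
print(is_high_cost_low_gain(balanced))
```

With these invented ratings the first job is flagged and the second is not; the point is only that low job security can tip the balance even when pay and esteem are moderate.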
According to this model, strain reactions are most intense and long-lasting under the following conditions: (a) lack of alternative choice in the labor market may prevent people from giving up even unfavorable jobs, as the anticipated costs of disengagement (e.g., the risk of being laid off) outweigh the costs of accepting inadequate benefits; (b) unfair job arrangements may be accepted for a certain period of one’s occupational trajectory for strategic reasons; by doing so employees tend to improve their chances for career promotion at a later stage; (c) a specific personal pattern of coping with demands and of eliciting rewards characterized by overcommitment may prevent people from accurately assessing cost–gain relations. ‘Overcommitment’ defines a set of attitudes, behaviors, and emotions reflecting excessive striving in combination with a strong desire for being approved and esteemed (Siegrist 1996). At the psychological level, experience of effort–reward imbalance is often paralleled by feelings of impaired self-worth or self-esteem.
2.4 Empirical Evidence
Evidence of impaired health due to exposure to stressors that are defined by the concepts described is growing rapidly. In particular, a number of well-designed prospective social-epidemiological studies found moderately elevated relative risks (two- to threefold) of incident CHD in economically active middle-aged men and women who were suffering from high demand/low control or effort–reward imbalance. Other health effects observed in prospective and cross-sectional studies concern distinct cardiovascular risk factors, gastrointestinal symptoms, mild psychiatric disorders, poor subjective health, and musculoskeletal disorders. In addition, rates of absenteeism are elevated (Marmot et al. 1999). There is also evidence of changes in physiological patterns, release of stress hormones, and altered immune competence following stressful experience at work (see Occupational Health).
Taken together, the burden of illness that can be attributed, at least in part, to exposure to an adverse psychosocial work environment turns out to be impressive. While further research is needed to address important issues, practical implications of this new evidence deserve special attention.
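To make the two- to threefold relative risks reported above more concrete, the following minimal sketch computes a relative risk from cohort counts. The counts are invented for illustration and do not come from any of the studies cited; the function name and its arguments are likewise hypothetical.

```python
def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Risk of disease among the exposed divided by risk among the unexposed."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group sizes must be positive")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the unexposed risk is zero")
    return risk_exposed / risk_unexposed


# Made-up cohort: 30 new CHD cases among 1,000 workers exposed to an adverse
# psychosocial work environment, 12 among 1,000 workers not exposed.
rr = relative_risk(30, 1000, 12, 1000)
print(round(rr, 2))
```

In this invented cohort the exposed group has roughly two and a half times the risk of the unexposed group, which falls inside the two- to threefold range described in the text.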
3. Practical Implications and Future Directions
Overall, worksite health promotion activities are realized with little concern for the scientific advances obtained in the field of occupational stress research. This may change in the future, as all three models described in the preceding section (as well as additional models not mentioned in detail; for a summary see Cooper 1998) provide a rationale for evidence-based preventive activity. Person–environment fit theory concludes that work demands need to be tailored according to the worker’s abilities. This involves both specific measures of organizational development and person development. Moreover, organizational interventions must suit the needs of affected individuals that are closely linked to the work role (e.g., motivation, learning, satisfaction; Edwards et al. 1998). In the demand–support–control model, special emphasis is put on autonomy, skill discretion, and personal growth. Measures of work reorganization include job enlargement, job enrichment, skill training, enhanced participation, and teamwork (Karasek and Theorell 1990). Practical implications of the effort–reward imbalance model concern the development of compensatory wage systems, the provision of models of gain sharing, and the strengthening of nonmonetary gratifications. Moreover, ways of improving promotional opportunities and job security need to be explored. Supplementary measures are interpersonal training and skill development, in particular leadership behavior. Clearly, the power of economic life and the constraints of globalization limit the options and range of worksite health promotion measures. Moreover, such measures are difficult to apply to medium-sized and small enterprises where the majority of the workforce in advanced societies is employed. Finally, it must be stated that many instrumental activities that qualify as work are no longer confined to the conventional type of workplace located within an enterprise or organization, particularly with the advent of advanced communications technology (Cooper 1998).
3.1 Future Challenges for Research
There are at least three obvious future directions for occupational stress research. First, the theoretical concepts of stressful experience at work discussed need to be examined in a comparative perspective, with emphasis on their potential overlap or cumulative effects. For instance, recent studies found that the demand–control model and the effort–reward imbalance model each made an independent statistical prediction when tested simultaneously (Marmot et al. 1999). Moreover, the estimation of health risk was considerably improved when based on combined information derived from both models. A second direction of research addresses the link or interface between work and nonwork settings, in particular work and family. In advanced societies work and family have been identified as the two most central domains, especially so in midlife. Therefore, stress-enhancing and stress-buffering experiences in these two domains significantly contribute to personal health and well-being. This interface is conceptualized in terms of spillover, compensation, or cumulation. Spillover refers to the transfer of experience and behavior from one domain to the other. For instance,
work overload or shiftwork schedules may have negative effects on family life. Alternatively, experience of competence and success at work can have beneficial effects on marital life and relationships between parents and children (Kohn and Schooler 1983). Compensation represents efforts to reduce stressful experience in one domain by improving satisfaction in another domain. For instance, leisure and family activities are enhanced to the extent that decreasing involvement in a stressful work environment is feasible. Compensation often occurs in terms of resource management in a stressful situation. In conceptual terms, there is some overlap between the buffering model of social support and compensation processes (House 1981). Cumulation refers to the fact that the critical components of stressful psychosocial experience, such as lack of control or effort–reward imbalance, generalize across core life domains. For instance, disease vulnerability may be highest among people with insecure, badly paid, effortful jobs who also live in a family arrangement that offers little emotional and material satisfaction. It is desirable to extend the theoretical models discussed, both conceptually and operationally, to life domains other than work. A third direction of research concerns intervention studies. The implementation and evaluation of theory-based intervention is a promising area for creating new knowledge although, admittedly, established methodological principles of scientific trials cannot be fully met. Recent theory-based intervention studies have documented favorable health effects (Cooper 1998, Karasek and Theorell 1990, Siegrist 1996, see Occupational Health). In view of the importance of work for health, it is well justified to advance the field of occupational stress both at the science and policy level.
See also: Environmental Stress and Health; Job Stress, Coping with; Occupational Health; Performance Evaluation in Work Settings; Psychological Climate in the Work Setting; Social Support and Stress; Stress and Cardiac Response; Stress and Coping Theories; Stress and Health Research; Stress in Organizations, Psychology of; Stress Management Programs; Stress: Psychological Perspectives; Well-being and Burnout in the Workplace, Psychology of; Worksite Health Promotion and Wellness Programs
Bibliography
Caplan R D, Cobb S, French J R P, Harrison R V, Pinneau S R 1980 Job Demand and Worker Health: Main Effects and Occupational Differences. Institute for Social Research, Ann Arbor, MI
Cooper C L (ed.) 1998 Theories of Organizational Stress. Oxford University Press, Oxford, UK
Edwards J R, Caplan R D, van Harrison R 1998 Person–environment fit. In: Cooper C (ed.) Theories of Organizational Stress. Oxford University Press, Oxford, UK, pp. 28–67
Frankenhaeuser M 1979 Psycho-neuroendocrine approaches to the study of emotions as related to stress and coping. In: Howe H E, Dienstbier R A (eds.) Nebraska Symposium on Motivation. University of Nebraska Press, Lincoln, NE, pp. 123–61
Henry J P, Stephens P M 1977 Stress, Health, and the Social Environment. Springer, Berlin
House J 1981 Work, Stress and Social Support. Addison-Wesley, Reading, MA
Kaplan J R, Manuck S B, Clarkson T B 1983 Social stress and atherosclerosis in normocholesterolemic monkeys. Science 220: 733–5
Karasek R A, Theorell T 1990 Healthy Work. Stress, Productivity, and the Reconstruction of Working Life. Basic Books, New York
Kohn M, Schooler C 1983 Work and Personality: An Inquiry into the Impact of Social Stratification. Ablex, Norwood, NJ
Marmot M, Siegrist J, Theorell T, Feeney A 1999 Health and the psychosocial environment at work. In: Marmot M G, Wilkinson R (eds.) Social Determinants of Health. Oxford University Press, Oxford, UK, pp. 105–31
Paoli P 1997 Working Conditions in Europe. The Second European Survey on Working Conditions. European Foundation, Dublin
Siegrist J 1996 Adverse health effects of high effort/low reward conditions. Journal of Occupational Health Psychology 1: 27–41
J. Siegrist
Stress in Organizations, Psychology of
Stress is a term widely used in the social sciences, but not well defined by them; it may refer to any noxious or unwanted stimulus, or to the effects of such stimuli. The concept of stress itself seems to have been borrowed from the fields of physics and engineering, where it is defined more precisely as any force applied to a body, and the resulting alteration or distortion is referred to as strain. This article, following those definitions, will refer to the relevant stimuli as stressors, their presence as stress, and their effects on the individual as strain. Despite their differences in terminology and the variety of theoretical models with which they work, researchers on organizational stress agree on some or all of the following causal sequence: an objective stressor, its perception by individuals exposed to it, their appraisal of it, their immediate reactions, and its longer term effects. Models or frameworks for research along these lines have been proposed by numerous authors since the early 1960s. The major relationships specified in these models, along with some of the variables most frequently studied, are included in the ‘ISR model’ developed by French and Kahn (1962) and elaborated by Kahn and Byosiere (1992) (for a review of stress research in organizations see Fig. 1).
Figure 1 Theoretical framework for the study of stress in organizations (Kahn and Byosiere 1992)
1. Early Empirical Research
Empirical organizational studies, especially in the early years, were less comprehensive than these models suggest; most research on stress was limited to undesirable job characteristics and their effects on individual employees. Some studies, however, examined the extent to which this causal sequence is moderated by enduring characteristics of the individuals and by properties of the organization or its parts. Still earlier, research and clinical observations of stress effects were carried out by physicians rather than psychologists. For example, medical doctors recognized angina pectoris and other diseases as the apparent result of ‘stress and strain’ in their patients’ lives, although they did not identify working conditions as hypothetical causes. Selye (1982), who pioneered research on the physiological effects of stress, worked almost exclusively in the laboratory, with rats as experimental subjects. The stressors in his experiments, however, are suggestive of some occupational conditions. These included alarm, sustained vigilance, and physical constraint, all of which led to consequences that Selye called the stress response or general adaptation syndrome (GAS). The GAS included damage to the adrenal glands and the lymphatic system, development of stomach ulcers, and other indications of physiological dysfunction. Selye insisted that the GAS was evoked by any of a wide range of stressful stimuli or ‘noxious agents.’ Research that was more specific to role demands in real-life settings, especially military service, was initiated by psychiatrists and psychologists during World War II (Grinker and Spiegel 1945, Janis 1951). Criterion variables in this research were mainly psychiatric symptoms. In the 1950s and 1960s, interest in occupational stress increased substantially, and experimental and field research on the subject was conducted in a number of countries. This research was stimulated by the ‘human relations movement,’ an unorganized but widespread concern about working conditions and their effects on employees. The emphasis on human relations, as the term implies, included psychological variables, both as causes and effects, in stress research. Centers of this early work were located at the University of Stockholm and the Karolinska Institute in Sweden, the Tavistock Institute in England, the Work Research Institute in Norway, the Institute for Social Research at the
University of Michigan, and in a number of other academic settings. One of the early Michigan studies (Kahn et al. 1964) investigated the effects of two occupational variables, role conflict and role ambiguity. The underlying assumption was that for each person in an organization, there is a set of other people (supervisors, subordinates, and peers) whose expectations that person must meet. Role conflict for an individual occurs when the members of his or her set disagree about what that individual should do, and role ambiguity exists when the expectations themselves are unclear. In this study the effects of these two stressors, based entirely on self-report, included increased tension, internal conflict about appropriate action, and reduced satisfaction with the job, with supervisors, and with the organization as a whole. The Swedish research, both in laboratory and organizational settings, included physiological as well as psychosocial effects of stress. Effects of stress were measured as changes in catecholamine (adrenaline and noradrenaline) excretion, creatinine clearance, and elevations in triglyceride and cholesterol levels (Levi 1972, Frankenhaeuser 1979).
2. Recent Research Findings
The summary of more recent research findings, with recent defined as the years since 1980, follows the sequence shown in Fig. 1, from organizational and societal characteristics that are stress-generating, to the stressors themselves, their perception and appraisal by individuals, the physical, emotional, and behavioral responses they evoke, and the longer term effects on those exposed to the stressors and on others associated with them.
2.1 Organizational Antecedents to Stress
Job loss has been perhaps the most intensively investigated work-related stressor. Much of the research has been conducted around plant closings and the consequences for the affected workers and their families. A program to mitigate these negative effects was developed by Price and Vinokur (1995) and experimental designs to measure its effectiveness have been carried out, first in the USA and more recently in many other countries. The psychological and physiological strains of job loss have been demonstrated beyond question at the individual level. It is plausible therefore that high levels of general unemployment would have negative effects on health. Sociologists and economists who work with variables at institutional and societal levels, however, have seldom been concerned with the measurement of stress and its effects on individuals. Brenner and his colleagues are exceptions; they have
demonstrated that the health of populations is affected by changes in economic conditions at regional and national levels (Brenner and Mooney 1983). Economic growth or decline and economic instability are substantially correlated with rates of institutionalization, suicide, and all-cause mortality. These correlations, appropriately time lagged, have been replicated in nine industrialized countries. The intervening processes of stress and strain, however, are neither specified in the theory nor included in the research. Stress-related findings are similarly sparse at the organizational level. An early correlation from a national population sample suggested that organizational size was a source of stress, presumably because it led to increased formality and rigidity. But research in The Netherlands found a curvilinear relationship between organizational size and stress, with employees in medium-sized companies reporting higher levels of both stressor presence and strain experienced. In short, research evidence indicates that organizations differ significantly in their levels of stress and strain, but does not say what structural properties explain these differences. Somewhat more research has been done on the stress patterns associated with different positions within organizations. These studies, almost none of them replicated, suggest that span of control and location at an organizational boundary expose employees to increased stress. Data on the effects of hierarchical position are inconsistent, with some studies showing maximum stress at highest positions and other studies showing curvilinear patterns. Occupation, which perhaps can be considered an organizational factor once removed, has been much studied for its health consequences. 
Most occupational studies by social scientists, however, do not deal explicitly with stress; they report outcome differences in health or other indicators of well-being and leave the intervening processes to the imagination of the reader.
2.2 Organizational Stressors
Job characteristics that have been identified as stressors are a miscellaneous set, some of them technological, some policy-determined, and still others interpersonal in origin. Most studied are such physical conditions as monotony and repetitiveness, shift work, the need for sustained vigilance, exposure to extremes of temperature, and ambient presence of noise and vibration. A few other job characteristics, more social psychological than physical, have also been identified as stressors; these include role conflict, in which a person is subjected to incompatible demands; role ambiguity, in which a person is not given adequate information about the requirements of the job; role overload, in which the quantitative or qualitative demands of the job exceed the person’s resources or
ability; and lack of autonomy, perhaps the most frequently heard complaint of people in nonsupervisory positions.
2.3 Perception and Cognition
Some stressors, as toxicological research has demonstrated, can have effects without having been perceived by people exposed to them. Most organizational sources of stress, however, are recognized by the people they affect; their seriousness is assessed, and their physical and psychological impact is mediated by that cognitive assessment. Most research on organizational stress does not distinguish between the objective stressor and its subjective perception and evaluation by people exposed to it. Lazarus and his colleagues, however, have concentrated on this sequence of perception and cognition, which they call the appraisal process (Lazarus and Folkman 1984). There is strong experimental evidence that an individual’s appraisal of a stressor and the degree of its threat to his or her well-being has effects that are independent of its objective properties. Lazarus’s work has been almost exclusively experimental and not directly concerned with organizational sources of stress. Beehr and Bhagat (1985), however, have brought together those organizational studies that utilized appraisal as a predictor of strain.
2.4 Responses to Stress
Responses to organizational stress are usually classified as physiological, psychological, and behavioral. Ideally, the effects of a stressor would be measured in all three of these domains. In practice, however, the disciplinary background of researchers seems to determine the choice of criterion variables. Physicians and epidemiologists concentrate on physiological responses; psychologists are more likely to emphasize self-reports of strain and behavioral change.
2.4.1 Physiological responses. Job-related stress has been repeatedly shown to affect three kinds of physiological responses: cardiovascular, biochemical, and gastrointestinal.
Cardiovascular responses to job stress include elevations in blood pressure and heart rate. Biochemical responses include elevations in cholesterol level and in catecholamines (epinephrine and norepinephrine). Gastric symptoms have been assessed primarily by self-report, sometimes with the inference of peptic ulcer. Among the stressors that evoke these symptoms are role conflict and ambiguity, unpredictability of work load and other demands, lack of control over pace and method of work, and ambient distracting noise. Repetitive motions, on jobs that require heavy lifting and on those that are limited to keyboard work, have been interpreted as causing a variety of musculoskeletal complaints, from back pain to carpal tunnel syndrome. Swedish research has been a major source of information on physiological responses to work-related stress. For summaries that include this research, see Kahn and Byosiere (1992) and Lundberg (1999). Finally, job insecurity and actual job loss must be counted among the more serious sources of work-related stress. Studies of unemployment and job loss have been numerous for many decades, especially at times when unemployment rates were high. More recently, studies of job loss and of job insecurity have been stimulated by widespread reductions in force, usually referred to as corporate downsizing. Most such research has not included physiological measures. The early study of plant closings by Cobb and Kasl (1977) is an exception; it showed elevations in blood pressure, heart rate, cholesterol level, and catecholamine output. These symptoms occurred during anticipation of job loss and continued through the time of actual dismissal. A gradual return to prestress levels then began and was completed as workers found new employment and adjusted to their new jobs. More recent research on the impact of job loss has included findings on the mitigating effect of an experimental program of counseling and peer support (Price and Vinokur 1995).
2.4.2 Psychological responses. Many studies of job stress include psychological responses, and they have been summarized in a number of review articles (Holt 1982, Cooper and Payne 1988, Kahn and Byosiere 1992, Lundberg 1999). As these reviews indicate, most of the psychological responses to job stress in this research are self-reported and unlinked to clinical validation or physiological measures. An exception is the work of colleagues at Stockholm University, both at the Karolinska Institute and the Department of Psychology (Stockholm University 1999). By far the most frequently cited response to job stress is dissatisfaction with the job.
More intense affective responses include anger, frustration, irritation, and hostility toward supervisors and the organization as a whole. Other negative responses correlated with job stress are more passive; among them are boredom, burnout, fatigue, feelings of helplessness and hopelessness, and depressed mood. Many of these words are virtual synonyms, a fact that reflects the tendency of researchers to design and label their own measures rather than replicate previous work. Replication in diverse organizations would be very useful, however, as would the combination of physiological and behavioral criteria with psychological responses.
2.4.3 Behavioral responses. Two behavioral correlates of job stress are well replicated: reductions in job performance and increased absence. Several studies also reported increased turnover and increased use of tobacco. A review of these (Kahn and Byosiere 1992), along with a larger number of unreplicated findings, subsumed the behavioral responses to job stress into five broad categories:
(a) Degradation/disruption of the work role itself (job performance, accidents, errors, etc.);
(b) Aggressive behavior at work (stealing, purposeful damage, spreading rumors, etc.);
(c) Flight from the job (absenteeism, turnover, early retirement, etc.);
(d) Degradation/disruption of other life roles (marriage, friendships, citizenship);
(e) Self-damaging behaviors (drug use, alcohol use, smoking, etc.).
2.5 Mediators of Stress
The model used in this brief review shows two sets of variables that might act to mediate or buffer the effects of job stress: enduring properties of the person and properties of the work situation itself. That people differ in their sensitivity to job stress is beyond argument, and many demographic and personality characteristics have been proposed as stress mediators. No single characteristic mediates the effects of stress in all cases, but evidence for a stress-buffering effect has been found for at least three personality traits: Type A, locus of control, and self-esteem. Type A individuals (competitive, impatient, aggressive) showed greater strain than Type B individuals working under comparable conditions. People high in internal control (belief that their own choices and decisions make a difference) show less strain than people who score high in external control (belief that their circumstances are determined by ‘external’ forces beyond their control). High self-esteem also seems to offer some protection from the effects of job stressors. Few situational factors have been studied for their ability to buffer the effects of job stress. Social support is the exception; both its main effects and its buffering or interaction effects have been frequently tested. Findings are more consistent for main effects than for interactions; social support from various sources (supervisors, co-workers, family, and friends) predicts lower overall strain. Buffering effects have been frequently found as well, but the reasons for their failure in some cases are not clear. An earlier review (LaRocco et al. 1980) proposed that social support buffered (i.e., reduced) the relationship between various job stressors and indicators of mental and physical health but not between job stressors and more specific job strains (boredom, job dissatisfaction, dissatisfaction with workload). Autonomy on the job is a second factor in the work situation that has been studied for its buffering as well as its main effects on strain. Karasek and his colleagues (1981), analyzing national survey data from the USA and Sweden, found that, among workers with heavy job demands, those who had low decision latitude (little control over pace and method) reported symptoms of physiological and psychological strain; for those who had high decision latitude, however, the relationship between job demands and strain did not hold.
3. Unresolved Issues in the Study of Stress
Research on organizational sources of stress continues and it is appropriate, therefore, to conclude by mentioning some issues that require the attention of stress researchers. They can be summarized in two categories: unresolved questions of concept and theory, and unresolved questions of method and design.
3.1 Questions of Concept and Theory
A longstanding issue in stress theories is the distinction between stimuli that are stressful and those that are not. Stress theories must not imply that the optimum human state is absence of stimuli; to the contrary, research on sensory deprivation has demonstrated the negative effects of that state. Moreover, some stimuli are pleasant or exciting at some levels or for brief periods, but unpleasant or damaging at higher levels or longer duration. Furthermore, physical or mental demands that are experienced as stressful or challenging may have training or strengthening effects rather than effects that are weakening or damaging in other ways. Selye (1982) attempted to distinguish between two kinds of task-related stress, eustress and distress, on the basis of the prospect for successful task completion or achievement. He did not argue that eustress avoids the long-term physiological effects of stress, but rather that to some extent these are unavoidable in life. Eustress and the psychological rewards of achievement, however, make the effort worthwhile. Levi (1972) had earlier dealt with this problem in more quantitative terms. He proposed that stress be thought of in terms of a continuum of stimulation, in which conditions of both understimulation (deprivation) and overstimulation (excess) were stressors. Somewhere between these extremes were the domains of pleasant stimulation, challenge, and training effect. Another issue in stress research, which is as important for methodology as for concepts and theory, is the distinction between crisis events and chronic conditions. Stressful life events—job loss, bereavement, illness, and the like—have been prominent in stress research, at least since Holmes and Rahe (1967) published a scale that gave researchers a basis for rating their intensity and their combined demand as stressors. 
Lazarus and his colleagues (Lazarus and Folkman 1984) have emphasized the stress effects of chronic conditions, daily hassles, rather than the stressful life events (SLEs) that characterize the work of Holmes and Rahe (1967) and of Dohrenwend and her colleagues (Dohrenwend and Dohrenwend 1984). More recent work by McEwen and his colleagues (McEwen 1998) has introduced the concept of allostatic load, which has something in common with the daily hassles of Lazarus. Allostatic load refers to the stress effects of sustained physical or psychological demands, even when individuals are coping with them successfully. Theories of psychological stress have yet to deal fully with these issues.
3.2 Issues of Method and Design
The main weaknesses of design in organizational research on stress are three, all of them long-standing: overreliance on self-report, on cross-sectional designs, and on single organizations. Exclusive dependence on self-reported data means that all respondents in a study describe their perceptions of some working condition and the symptoms of strain that they experience. At best, this design limitation generates data of uncertain validity as descriptors of the work situation; at worst, it is a near tautology. The reliance on cross-sectional rather than longitudinal designs is by no means a special defect of research on job stress; it is a common weakness in social research. The criticism is similarly familiar; causal inferences from cross-sectional data are notoriously weak. The usual reliance on single-organization designs makes it impossible to bring organizational variables into the analysis of job stress. If one could write prescriptions for research on job stress, therefore, there would be more designs that were longitudinal, that included objective measures of job characteristics, that included physiological as well as self-reported measures of strain, and that traced the effects of job stress on the performance of both work and nonwork roles. All these improvements are within reach and each of them is occasionally exemplified in current research.
Progress in stress research requires that they become more general in future research. See also: Stress: Psychological Perspectives
Bibliography

Beehr T A, Bhagat R S (eds.) 1985 Human Stress and Cognition in Organizations. Wiley, New York
Brenner M H, Mooney A 1983 Unemployment and health in the context of economic change. Social Science and Medicine 17(16): 1125–38
Cobb S, Kasl S V 1977 Termination: The Consequences of Job Loss. US Department of Health, Education and Welfare. US Government Printing Office, Washington, DC
Cooper C L, Payne R 1988 Causes, Coping and Consequences of Stress at Work. Wiley, Chichester, UK
Dohrenwend B S, Dohrenwend B P 1984 Stressful Life Events and their Contexts. Rutgers University Press, New Brunswick, NJ
Frankenhaeuser M 1979 Psychoneuroendocrine approaches to the study of emotion as related to stress and coping. In: Howard H E, Dienstbier R A (eds.) Nebraska Symposium on Motivation, 1978. University of Nebraska Press, Lincoln, NE
French J R P Jr., Kahn R L 1962 A programmatic approach to studying the industrial environment and mental health. Journal of Social Issues 18(3): 1–47
Grinker R R, Spiegel J 1945 Men Under Stress. Blakiston, Philadelphia
Holmes T H, Rahe R H 1967 The social readjustment rating scale. Journal of Psychosomatic Research 11: 213–18
Holt R R 1982 Occupational stress. In: Goldberger L, Breznitz S (eds.) Handbook of Stress: Theoretical and Clinical Aspects. Free Press, New York, pp. 419–44
Janis I L 1951 Air War and Emotional Stress, 1st edn. McGraw-Hill, New York
Kahn R L, Byosiere P 1992 Stress in organizations. In: Dunnette M D, Hough L M (eds.) Handbook of Industrial and Organizational Psychology, 2nd edn. Consulting Psychologists Press, Palo Alto, CA, Vol. 3, pp. 571–650
Kahn R L, Wolfe D M, Quinn R P, Snoek J D 1964 Organizational Stress: Studies in Role Conflict and Ambiguity. Wiley, New York
Karasek R A, Baker D, Marxer F, Ahlbom A, Theorell T 1981 Job decision latitude, job demands, and cardiovascular disease: A study of Swedish men. American Journal of Public Health 71: 694–705
LaRocco J M, House J S, French J R P Jr. 1980 Social support, occupational stress and health. Journal of Health and Social Behavior 21: 202–18
Lazarus R S, Folkman S 1984 Stress, Appraisal, and Coping. Springer, New York
Levi L 1972 Stress and distress in response to psychosocial stimuli. Laboratory and real life studies on sympathoadrenomedullary and related reactions. Acta Medica Scandinavica 191, Supplement No. 528, p. 14. Stockholm, Sweden
Lundberg U 1999 Workplace stress. In: McGuigan F J (ed.) Encyclopedia of Stress. Allyn and Bacon, Boston
McEwen B S 1998 Stress, adaptation, and disease: Allostasis and allostatic load. Annals of the New York Academy of Sciences 840: 33–44
Price R H, Vinokur A D 1995 Supporting career transitions in a time of organizational downsizing: The Michigan JOBS Program. In: London M (ed.) Employees, Careers and Job Creation: Developing Growth-oriented Human Resource Strategies and Programs, 1st edn. Jossey-Bass, San Francisco, pp. 191–209
Selye H 1982 History and present status of the work stress concept. In: Goldberger L, Breznitz S (eds.) Handbook of Stress: Theoretical and Clinical Aspects, 1st edn. Free Press, New York, pp. 7–17
Stockholm University 1999 Annual Report XXXIX. Department of Psychology, Stockholm University, Sweden
R. L. Kahn
Stress Management Programs

Rationale and contents of intervention programs developed for improving stress management skills and for cultivating a personal sense of control are described. Focus is on the cognitive-behavioral programs which cover the most frequently used components of intervention. The range of application of stress management programs across different groups and settings, such as workplace or chronic illness, is illustrated. Results on the evaluation of the effectiveness of interventions with regard to the different outcome measures used are reported. Implications for the implementation of stress management programs are discussed.
1. Introduction

Stress is generally considered a major impediment to individual health and well-being and to social and societal functioning. Stress is associated with psychological symptoms, diminished well-being, acute and chronic health problems, and impaired performance in work and social settings. To alleviate such stress-related problems, theory-based stress management programs have been developed whose aim is to improve personal skills in dealing more effectively with stressful conditions and to cultivate a personal sense of self-efficacy and control in problem situations. In the next section of this article, the main theories and key concepts that underlie stress management interventions will be delineated. Subsequently, program designs, intervention techniques, and the range of application of stress interventions will be discussed. The fourth section presents results of the evaluation of stress management programs across different outcome measures. Moreover, methodological aspects of intervention studies will be examined. Finally, prospects for future research and program development will be considered.
2. Rationale and Theoretical Background

The majority of stress management programs are based on the following theories: (a) transactional theories of stress and coping, and (b) theories for cognitive-behavioral psychotherapy developed in clinical psychology. In addition to these psychological theories, knowledge of the physiology of stress and the physiological systems activated by stress has resulted in the view that relaxation must be a major objective in healthy stress management. Relaxation techniques have since been advocated that are aimed at reducing heightened arousal and at preventing its short- and long-term deleterious effects.
2.1 Transactional Theories of Stress and Coping

The most influential theory of stress and coping was developed by Lazarus and Folkman (1984), who defined stress as resulting from an imbalance between perceived external or internal demands and the perceived personal and social resources to deal with them. According to Lazarus and Folkman, two cognitive appraisal processes can be distinguished. The initial appraisal, defined as primary appraisal, involves the analysis of whether an event is personally relevant. Events perceived as personally relevant can be appraised as either positive or stressful (the latter including possible harm, threat, or challenge). If individuals perceive events as stressful, they evaluate their own resources to deal with the demands. This constitutes the process of secondary appraisal. Stress occurs when the demands are perceived as either exceeding or taxing the resources, and coping responses become activated. Lazarus and Folkman (1984) defined coping as cognitive and behavioral efforts to deal with situations appraised as stressful. Generally, cognitive appraisal and coping processes are influenced by personality factors, personal and social resources, characteristics of the situation, and other variables (see Stress and Coping Theories).

A very similar stress model has been developed by Palmer (1997) to guide health professionals in the selection of appropriate stress interventions. It incorporates Arnold Lazarus’ seven interacting response modalities comprising behavior, affect, sensory, imaginal, cognitive, interpersonal, and drugs/biology. According to this multimodal–transactional model, stress and coping proceed in five stages. In Stage 1, the individual perceives a stressor emerging either from an external source or from internal bodily sensations. In Stage 2, the individual appraises his or her capacity to deal with the stressor. The person then decides whether he or she has the resources to cope. If the individual perceives that he or she cannot cope, stress may be experienced, and stress responses are likely to become activated.
Appraisal of coping capacity is influenced by social and cultural beliefs and attitudes that determine the importance of an event. An example is the belief that one must meet every deadline. If individuals perceive that they cannot cope, they progress to Stage 3. In this stage, stress responses occur that include behavioral, affective, sensory, imaginal, cognitive, interpersonal, and physiological changes. In Stage 4, the individuals appraise the effectiveness of the coping strategies they have used. If they perceive themselves to have been successful, they return to a state of equilibrium. If they believe, however, that they have failed to cope with the situation, this perceived failure exacerbates existing problems and turns into an additional strain. Stage 5 relates to the continuing process and long-term consequences. If an individual uses strategies that alter the situation and reduce stress, he or she will return to a state of equilibrium. If the problems persist, however, eventually the person’s health may be negatively affected. In sum, transactional theories of stress and coping focus on the cognitive and affective aspects of an individual’s interactions with the environment and the coping strategies he or she uses or lacks.

2.2 Cognitive Approaches to Psychotherapy

Many of the techniques adopted in stress management programs have their origin in clinical psychology. In addition to the stress and coping theories, stress management is an explicit goal of intervention in cognitive and cognitive-behavioral approaches to psychotherapy. The basis of this form of clinical intervention is the belief that negative thoughts and cognitive distortions (e.g., ‘catastrophizing’ and overestimating negative outcomes) play a causal role in many clinical disorders. Cognitive interventions seek to modify biased and maladaptive beliefs by cognitive reeducation. The most influential cognitive therapies are Beck’s cognitive therapy (Beck 1976) and Ellis’ rational emotive behavior therapy (Ellis 1962, Ellis et al. 1997). Within the framework of his rational emotive behavior therapy, Ellis has developed the ABC model, which posits that an emotional, behavioral, or physiological reaction (C, for consequence) to an activating event (A) is influenced mainly by the beliefs (B) pertaining to an event and not so much by the activating event per se. According to Ellis, irrational beliefs such as ‘awfulizing’ or dogmatic and absolutist beliefs will lead to self-defeating behavioral and emotional consequences. In psychotherapy, such beliefs are challenged by asking logical, empirical, and pragmatic questions. Generally, cognitive interventions are employed in combination with behavioral techniques and the acquisition of problem solving and social skills.
3. Program Designs

3.1 Intervention Techniques

The importance of transactional theories of stress (Lazarus and Folkman 1984, Palmer 1997) and cognitive therapy approaches (Beck 1976, Ellis 1962, Meichenbaum 1985) lies in their recognition that the subjective interpretation of events plays a critical role in the experience of stress. Moreover, lack of appropriate skills in dealing with problem situations or inappropriate or rigid use of coping strategies may exacerbate stress. Thus, stress intervention now focuses on two aspects: dysfunctional cognitive appraisal such as cognitive distortions and irrational beliefs, and inefficient coping attempts and deficits in a wide range of personal and social skills. Furthermore, in accord with the physiology of stress, reducing the level of physiological arousal is acknowledged as a major objective in stress management. Based on this theoretical framework, the main intervention techniques employed in stress intervention include: (a) relaxation techniques that can be used to achieve short-term, in situ relaxation, as well as prolonged relaxation in the private setting (e.g., progressive muscle relaxation, meditation, breathing exercises, biofeedback, and autogenic training); (b) cognitive re-education and cognitive restructuring, mainly by challenging maladaptive cognitions and negatively biased beliefs; and (c) skills training in areas such as problem solving, time management, negotiation, assertive behavior, and the activation of social support (Bunce 1997, Murphy 1996, Palmer 1996, 1997). According to reviews of stress intervention studies, combinations of individual techniques, especially the combination of relaxation and cognitive-behavioral skills training, are the most frequently used forms of intervention (Bunce 1997, Kaluza 1997, Murphy 1996).

A very influential composite stress management program that represents the multicomponent approach is ‘stress inoculation training’ (Meichenbaum 1985). The stress inoculation procedure typically includes three stages. The first stage, education, focuses on the assessment of an individual’s problems, the identification of strategies usually adopted to deal with stressful situations, and reconceptualization of the individual’s responses to stressors. Thus, the client learns to become aware of how he or she habitually responds to stressful situations. In the second stage, rehearsal, the client learns and practices various coping-skill techniques such as relaxation, problem solving, and cognitive restructuring. During the third stage, application, the newly acquired coping skills are practiced under simulated conditions and transferred into everyday life. In sum, stress management programs seek to reduce physiological arousal, and to provide the individual with a positive perception of stressful situations that is more conducive to well-being. Moreover, the programs aim to teach a wide range of coping skills to be used in a flexible, situation-appropriate way.
Above all, the goal of stress interventions lies in promoting a sense of control and self-efficacy in coping with problem situations (see Self-efficacy and Health).

3.2 Range of Application

Stress management programs have been implemented in a variety of settings. They are usually prevention oriented and focus on psycho-education and cognitive-behavioral skills training. Accordingly, most programs are offered to all who seek such forms of intervention for themselves (Murphy 1996). In contrast to psychotherapeutic approaches, stress management programs are primarily developed for subclinical populations. However, these programs may also address high-risk groups such as hypertension patients, persons suffering from coronary heart disease (Bennett and Carroll 1990, Dusseldorp et al. 1999), or individuals with specific psychological problems such as high levels of stress, anxiety (Murphy 1996), or anger (Novaco 1975). For these groups, there may be no clear distinction between preventive stress management programs and psychotherapy. Moreover, stress management programs are offered to persons coping with specific stressors or chronically difficult living conditions; this may include caregiving or chronic financial problems (Kaluza 1997). Modified stress interventions are also offered to children and adolescents (Wolchik and Sandler 1997). Stress management programs are particularly prevalent within work settings (Bunce 1997, Murphy 1996). The high frequency of stress interventions that target the workplace and occupational stress is a result of the recognition that stress is a major cause of problems such as absenteeism, accidents, poor work performance, and burnout (see Stress at Work). Coping with chronic illness is another important domain for which stress management programs have been developed and applied (Devins and Binik 1996, Dusseldorp et al. 1999). The most common forms of stress interventions for patients are education, behavioral training, relaxation, cognitive restructuring, and support groups. Usually, these interventions are part of composite intervention or self-management programs. Such interventions are generally designed to assist the patients to engage in active coping, to change risk behavior, and to feel less hopeless and helpless (see Chronic Illness, Psychosocial Coping with).

3.3 Formats

Though stress management programs can be administered individually, most programs are implemented in small groups. Typically, a number of weekly training sessions which may last one hour or longer are either conducted over a number of weeks or condensed into block sessions. Learning theories see advantages in both formats (Brown et al. 1998). Interest in working with large groups which meet for ‘massed learning’ format workshops may increase in the future, however, as preventive psycho-education and skills training become part of general health promotion. Brown et al. (1998) have shown that large-scale, day-long stress management workshops are just as effective as the traditional programs consisting of one weekly session. The value of large-scale stress management programs lies in their potential to make preventive training accessible to the general public and to induce more persons to avail themselves of psychological intervention measures.
4. Program Evaluation

4.1 Outcome Measures

In the studies that evaluate the effects of stress management interventions, a wide range of health outcome measures were used to assess program effectiveness. Frequently employed measures include indices of subjective well-being (which, for the most part, are negatively defined as distress reduction, e.g., anxiety, anger/hostility, depression), perceived strain associated with different life domains, self-reported somatic symptoms and complaints, control and self-efficacy beliefs, and physiological measures. Moreover, some studies rely on biochemical measures (Kaluza 1997, Murphy 1996). According to a meta-analysis of 36 intervention studies, only five studies assessed coping strategies (Kaluza 1997). This is especially noteworthy in light of the fact that the acquisition of adaptive coping strategies is the primary focus of stress intervention. Obviously, the existing program evaluations have major shortcomings. In workplace intervention studies, organizational outcome measures such as job satisfaction, work performance, absenteeism, and turnover were additionally assessed (Murphy 1996). Evaluations of programs aimed at coping with chronic illness frequently assess illness-related knowledge, illness-related behavior and cognitions, and disease-related outcomes in addition to measures of subjective well-being (Devins and Binik 1996, Dusseldorp et al. 1999).

4.2 Results

There are several systematic and comprehensive reviews of studies which evaluate the success of stress management programs (Devins and Binik 1996, Murphy 1996). While there exist only a few meta-analyses of preventive stress management interventions, especially of primary preventive programs, they clearly attest to positive treatment effects (Kaluza 1997, Lipsey and Wilson 1993; for secondary preventive programs see Dusseldorp et al. 1999). In their review of meta-analyses of psychological treatment effects, Lipsey and Wilson (1993) reported two studies of stress management programs, with the two mean effect sizes indicating positive treatment results.
Similarly, a more recent meta-analysis of 36 studies (31 of them conducted in the US) reported small to medium effect sizes found in randomized control studies. Positive effects were found for different outcome measures such as subjective well-being, anger/Type A behavior, coping strategies, physiological measures, and cognitions (Kaluza 1997). Notably, mean effect sizes were greater for follow-up measures assessed between one and six months after the intervention. Particularly remarkable is the increase in mean effect size found in five studies that assessed subjective well-being more than six months after post-test. These results suggest that the effectiveness of stress management programs may be stable or may even increase over time, at least in regard to subjective well-being.

According to a systematic review of 64 studies on workplace stress management programs (Murphy 1996), stress interventions were generally found to be effective, but the effectiveness varied with the outcome measure used. Cognitive-behavioral techniques were more effective with respect to psychological and cognitive outcomes, mainly anxiety, whereas relaxation was more effective in regard to physiological outcomes, predominantly assessed by blood pressure measurements. Meditation was used in only six studies, but produced the most consistent results across outcome measures. Generally, composite programs which used a combination of techniques proved to be more effective than single-technique programs. Job and organizational outcome measures, mainly job satisfaction, were employed in only 40 percent of the studies. According to Murphy, the low frequency of organizational outcomes may reflect the fact that stress management programs target the individual worker, not the organization. Of the studies reviewed, two-thirds reported no increase in job satisfaction. It appears that the interventions had a greater effect on subjective well-being and physiological arousal, which are subject to personal control, than on outcomes such as job satisfaction and somatic complaints, which are more influenced by external working conditions. The conclusion may be that workplace stress management needs to be more comprehensive and should address personal coping skills as well as the work environment (Murphy 1996).

In sum, meta-analyses and systematic reviews attest to the general effectiveness of stress management programs, especially with regard to psychological and physiological well-being. The programs had less of an effect on outcomes that may be heavily influenced by external factors. The results for coping skills, which were seldom assessed, are inconclusive. However, in interpreting the results one has to bear in mind that it is more difficult to demonstrate a significant effect in participants who enter the training with subclinical levels of complaints than it is in participants who enter with high levels of stress.
For example, in the review of Murphy (1996), 79 percent of the prevention-oriented studies yielded positive effects, compared with 94 percent of the treatment-oriented studies that involved persons complaining of high stress or anxiety.

4.3 Methodological Aspects

The variety of different intervention techniques raises the question of whether some techniques are more effective than others. Results for contrasting different intervention techniques are equivocal. Murphy (1996) reported technique-specific effects, whereas other reviewers came to the conclusion that seemingly different techniques appear to have similar outcomes, which is generally known as the ‘equivalence paradox’ (Bunce 1997, Palmer 1996). According to Bunce (1997), outcome equivalence may be due to two factors. First, methodological and design weaknesses may underlie outcome equivalence. Second, all the different intervention techniques or combinations of techniques may have common features which are responsible for change. The methodological shortcomings that could add to outcome equivalence include: (a) combining techniques even if only one technique is used so that it is not possible to identify which specific components are responsible for the changes that are observed; (b) small sample sizes that may be associated with insufficient statistical power to detect technique-specific effects even if they exist; (c) inappropriate or unspecific outcome measures; and (d) participant heterogeneity. The second interpretation suggested by Bunce to explain outcome equivalence is that nonspecific factors common to distinct contrasting techniques produce similar effects. Such nonspecific factors may be the experience of being in a group and meeting people with similar problems, expectancy effects, or insights gained by participation in respect to the self, others, and one’s environment. Indeed, studies that controlled for placebo effects, e.g., with a control group directed to sit quietly or to passively receive information, have demonstrated nonspecific effects (Bunce 1997, Murphy 1996). One may argue, however, that nonspecific effects are no less valuable. In fact, nonspecific effects may be constituent parts of psychological treatment, involving mechanisms of social interaction, expectations, and changes in beliefs and attitudes (Lipsey and Wilson 1993).

The quality of the studies that evaluate intervention effects has been a point of severe methodological critique in the past. Recent reviews, however, reveal increasing methodological rigor (Ivancevich et al. 1990, Kaluza 1997, Murphy 1996). According to a recent review of 64 intervention studies in the workplace, just over half of the studies were true experiments with random assignment to a training or control group (Murphy 1996).
However, about 25 percent of the studies did not use a control or comparison group, and 25 percent did not randomly assign participants to training and control groups. Generally, the studies involving randomized groups yielded lower effects, which underscores the importance of randomized control trials. Of the 36 studies that were included in the meta-analysis of Kaluza (1997), 61 percent were true experimental studies with random assignment to training and control groups. In the remaining studies, participants were not randomized into an intervention or control condition. In addition to post-treatment assessment, 53 percent of the studies reported results for a follow-up between one and over six months after post-test.
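The effect sizes referred to in these reviews are typically standardized mean differences between a training group and a control group. The following is a minimal sketch of one common estimator, Cohen's d with a pooled standard deviation; this choice is an assumption made for illustration, since the cited meta-analyses may have used a different estimator, and the scores are invented rather than taken from any study.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_t, n_c = len(treatment), len(control)
    s_t, s_c = stdev(treatment), stdev(control)
    pooled_sd = sqrt(((n_t - 1) * s_t**2 + (n_c - 1) * s_c**2) / (n_t + n_c - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical post-test well-being scores (higher = better); illustration only.
training_group = [24, 27, 22, 30, 26, 25]
control_group = [21, 25, 20, 27, 23, 22]

d = cohens_d(training_group, control_group)
print(f"Cohen's d = {d:.2f}")
```

A positive value indicates that the training group scored higher than the control group in units of the pooled standard deviation. With samples this small the estimate would be very imprecise, which mirrors the point made above about insufficient statistical power.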
5. Prospects for Future Program Development

Stress management programs are designed for a wide range of possible stressors and chronically difficult living conditions. Most of them are preventive in nature and open to all persons who wish to improve their skills in dealing with stressful situations and enhance their feelings of control. Unspecific and stressor-specific stress management programs provide an invaluable service for health promotion by offering psycho-education, skills training, and adaptive perceptions of problem situations. Generally, systematic reviews and meta-analyses of intervention studies attest to the effectiveness of preventive stress management programs. Clearly, however, further research is needed, especially with nonclinical populations attending preventive stress management programs (Kaluza 1997, Palmer 1996). Future research should focus on the following four problems.

(a) Studies on stress intervention should seek to identify the factors that are responsible for change and the mechanisms of change. Such mediators by which changes may be engendered could be psychological variables such as personal sense of control and self-efficacy, the use of adaptive coping strategies, and insights into self, others, and work (Bunce 1997).

(b) Outcome measures should be used that reflect the specific goals of stress management intervention such as realistic and adaptive stress-related cognitions, increased skills for dealing with everyday stress, and improved social competencies (Kaluza 1997). Currently, most outcome measures pertain to subjective well-being, which may indicate improvement in coping skills, but does not clearly reflect actual changes in stress-related interactions between the individual and his or her environment. Moreover, the effectiveness of stress intervention should be evaluated by long-term follow-up assessments in order to ascertain whether the interventions guard against future stress experiences.

(c) Biographical and psychological individual differences such as age, gender, self-esteem, or locus of control may systematically influence changes effected by stress interventions (Bunce 1997). According to Bunce, future research must consider individual differences and should examine how these factors might act as moderators and mediators of outcome variance. Matthews and Wells (1996) assert that intervention is likely to be most successful when it is based on systematic analyses of individual styles of attention and coping. Miller has shown that interventions may be more effective when matched with the dispositional information-processing style (e.g., Miller and Mangan 1983).

(d) In the development of stress intervention programs, cultural aspects should be taken into consideration in addition to biographical and psychological individual differences. For instance, minority groups in work or social settings may experience stress due to racial prejudices that could become the target of special stress intervention measures (Palmer 1996). Generally, cultural differences may play an important role in successful health promotion. Matching stress interventions with cultural beliefs and models of health and illness is a challenging task for multicultural societies (Devins and Binik 1996).
There is every reason to believe that stress will continue to be a salient problem in the near future. Increasingly, individuals and organizations will seek effective interventions to improve coping skills and reduce stress-related problems. Providing methodologically sound and effective unspecific and stressor-specific stress management programs will remain a predominant task for all who are involved in health promotion.

See also: Chronic Illness, Psychosocial Coping with; Social Support and Stress; Stress and Coping Theories; Stress and Health Research; Stress at Work
Bibliography

Beck A T 1976 Cognitive Therapy and the Emotional Disorders. International Universities Press, New York
Bennett P, Carroll D 1990 Stress management approaches to the prevention of coronary heart disease. British Journal of Clinical Psychology 29: 1–12
Brown J S, Cochrane R, Mack C F, Leung N, Hancox T 1998 Comparison of effectiveness of large scale stress workshops with small stress/anxiety management training groups. Behavioural and Cognitive Psychotherapy 26: 219–35
Bunce D 1997 What factors are associated with the outcome of individual-focused worksite stress management interventions? Journal of Occupational and Organizational Psychology 70: 1–17
Devins G M, Binik Y M 1996 Facilitating coping with chronic physical illness. In: Zeidner M, Endler N S (eds.) Handbook of Coping. Wiley, New York
Dusseldorp E, van Elderen T, Maes S, Meulman J, Kraaij V 1999 A meta-analysis of psychoeducational programs for coronary heart disease patients. Health Psychology 18: 506–19
Ellis A 1962 Reason and Emotion in Psychotherapy. Lyle Stuart, New York
Ellis A, Gordon J, Neenan M, Palmer S 1997 Stress Counselling: A Rational Emotive Behaviour Approach. Cassell, London
Ivancevich J M, Matteson M T, Freedman S M, Phillips J S 1990 Worksite stress management interventions. American Psychologist 45: 252–61
Kaluza G 1997 Evaluation von Streßbewältigungstrainings in der primären Prävention – eine Meta-Analyse (quasi-)experimenteller Feldstudien [Evaluation of stress management interventions in primary prevention: a meta-analysis of (quasi-)experimental studies]. Zeitschrift für Gesundheitspsychologie 5: 149–69
Lazarus R S, Folkman S 1984 Stress, Appraisal, and Coping. Springer, New York
Lipsey M W, Wilson D B 1993 The efficacy of psychological, educational, and behavioral treatment: Confirmation from meta-analysis. American Psychologist 48: 1181–209
Matthews G, Wells A 1996 Attentional processes, dysfunctional coping, and clinical intervention. In: Zeidner M, Endler N S (eds.) Handbook of Coping. Wiley, New York
Meichenbaum D 1985 Stress Inoculation Training. Pergamon, New York
Miller S M, Mangan C E 1983 Interacting effects of information and coping style in adapting to gynecologic stress: Should the doctor tell all? Journal of Personality and Social Psychology 45: 223–36
Murphy L R 1996 Stress management in work settings: A critical review of the health effects. American Journal of Health Promotion 11: 112–35
Novaco R 1975 Anger Control: The Development and Evaluation of an Experimental Treatment. Heath, Lexington, MA
Palmer S 1996 Developing stress management programs. In: Woolfe R, Dryden W (eds.) Handbook of Counseling Psychology. Sage, London
Palmer S 1997 Stress, counselling, and management: Past, present and future. In: Palmer S, Varma V (eds.) The Future of Counselling and Psychotherapy. Sage, London
Wolchik S A, Sandler I N 1997 Handbook of Children’s Coping. Plenum, New York
H. Weber
Stress: Measurement by Self-report and Interview

The term ‘stress’ is used in a variety of different ways. Sometimes it is employed to refer to an experience of physical tension or distress stimulated by the environment and sometimes to the occurrence of events or experiences that are assumed to involve significant adaptive demands. In this latter case, the focus is on stressors, measured in terms of the occurrence of events and circumstances of stress-evoking potential. Even though it is useful to distinguish stressors from stress, the two terms are often used interchangeably. Despite varying characterizations of stress, Cohen et al. (1995, p. 3) have argued that all perspectives share a concern with the process in which ‘environmental demands tax or exceed the adaptive capacity of an organism, resulting in psychological and biological changes that may place persons at risk for disease.’

This article is focused on stress measurement in social as opposed to laboratory contexts. For that reason, strategies for assessing the environmental activation of physiological systems are not covered. Initial consideration will be given to measures of eventful stressors, which fall into two categories: those using a checklist approach and those that rely on data obtained in detailed interviews. This will be followed by brief discussions on the measurement of chronic stress and of exposure to major, potentially traumatic events.
1. Life Event Checklists

By far the largest amount of available evidence linking stress exposure to physical and mental health has come from studies employing checklist measures of social stress. Stimulated by Adolf Meyer’s life chart procedure, Hawkins et al. (1957) developed the Schedule of Recent Experiences (SRE). This 42-item questionnaire covered events in areas such as marital problems, relationship changes, work and residential changes, and family relationships. A decade later Holmes and Rahe (1967), elaborating on the SRE, developed the Social Readjustment Rating Scale (SRRS) by employing a panel of judges to rate the amount of adjustment each event would typically require. Although research has not supported the expectation that the weights derived from these ratings would improve estimates of stress exposure, their contribution to the face validity of event checklists was evidently substantial. Following the publication of the SRRS there was a huge outpouring of research on stressful life events and it was, for a considerable period, the best-known and most widely used instrument.

The conceptualization that initially informed the checklist approach was that an individual’s level of experienced stress was embodied in the cumulative amount of change or readjustment brought about by personal events occurring in the recent past (Holmes and Rahe 1967). This idea that the amount of change is the event property responsible for the stressful impact of life events seems to have been based on Selye’s (1956) contention that stress consists of nonspecific biological change elicited in response to environmental events and that even relatively small changes occurring in close succession can influence susceptibility and thus disease.

Over the years since the publication of the SRRS there has been an active debate over its adequacy and about alternative approaches to measuring life events and to evaluating their stress-engendering properties. Indeed, at least 20 critical reviews of stressful life event methodology have been published since 1975 (Turner and Wheaton 1995). However, the checklist method has evolved in response to criticism, and a substantial array of different life event scales has been developed.
Turner and Wheaton (1995) provide a nonexhaustive list of 47 different life event inventories—many targeted to specific age groups. Despite several clearly justified criticisms and associated questions about the reliability and validity of resulting scores, the various published and unpublished life event inventories collectively represent the traditional, and still dominant, research procedure for estimating variations in stress exposure. A substantial preponderance of the now massive literature assessing the linkage between social stress and health is based on checklist measures of life stress. This literature has indicated a reliable association between life stress and the occurrence of both psychological distress and risk for physical health problems (e.g., Thoits 1983, Jemmot and Locke 1984). A conclusion that seems to follow from the consistency with which these relationships have been observed is that, whatever the shortcomings of the method, event inventories yield meaningful estimates
of stress exposure. However, there are two major challenges to this conclusion. One is the possibility that observed associations are an artifact of operational confounding arising from the fact that many checklists, perhaps the majority historically, have included items that are arguably symptoms of physical or mental health problems, or the consequences of such problems. Although evidence on this possibility has not been entirely consistent, the majority of studies have found that the relationships at issue persist, and are little attenuated, when potentially confounding items are removed from consideration (e.g., Tausig 1982).

The second challenge involves the hypothesis of reverse causality. Concern that illness or distress also produces either real or artifactually higher numbers of reported events has been a prominent theme within the life events literature. A crucial point in this regard is that the possibility that illness also causes stress exposure does not question the causal status of life event scores with respect to illness; it simply complicates the proper estimation of this causal impact (Turner and Wheaton 1995). In this connection, longitudinal studies consistently suggest that if one controls for prior or baseline distress or mental health, life events have a net impact on change in distress or mental health over time. Thus, at least with respect to mental health, where the causal direction debate has been most intense, the literature supports two essential criteria of causality: (a) statistical association, and (b) temporal precedence. The primary remaining issue is the possibility of spuriousness in the apparent effect of life events on health outcomes.
Although this explanatory hypothesis cannot be entirely set aside, the literature is filled with examples of studies that have attempted to control for confounding factors, and almost none that have claimed to explain away the relationship between life events and illness (e.g., Kessler et al. 1985, Turner and Lloyd 1999).

1.1 Change Events Versus Negative Change Events

As previously noted, the conceptualization that guided developers of earlier event checklists was that change per se, whether positive or negative in nature, requires adaptation and can have a stressful impact. Although this perspective continues to be represented among a number of prominent investigators, the majority of life event researchers have come to focus upon undesirable change assessed with lists containing only putatively negative events. The debate about whether it is change or undesirability that is relevant to the experience of stress is difficult to resolve because there is no wholly satisfactory criterion measure, at least not one that can be applied in field settings. However, a number of researchers have argued that power to predict distress or illness represents a reasonable criterion (e.g., Turner and Wheaton 1995). From this perspective, considerable evidence on the change versus undesirability issue has accumulated. Zautra and Reich (1983) published a review of 17 studies that examined the relationships of desirable and undesirable events to measures of psychological distress. They observed a consistent pattern of clear positive relationships with negative events in contrast to weak and contradictory findings with respect to positive events. Results suggesting the dominant significance of negative compared to positive events as risk factors for distress and illness have been quite consistent across a range of study populations, dependent variables, and life event inventories (Turner and Wheaton 1995). However, the conclusion that ostensibly positive life changes play no role in either elevating or conditioning stress experiences cannot be made with confidence. The mechanisms involved may simply be more complex than research has so far been able to reveal. The data do support the recommendation that, in the context of the limited time or space that characterizes most questionnaires, event checklists be restricted to events that are presumptively undesirable in nature.

1.2 Event Timing and Duration

Most checklist studies on the health effects of social stress have employed a one-year time frame for inquiring about experienced events. This time frame appears to be favored because it is regarded as long enough to obtain a reasonable estimate of variations in exposure to recent life events, and short enough to avoid the substantial decline in the ability of respondents to recall events that appears to occur beyond one year. In contrast, studies employing two- to five-year time frames have sometimes reported higher correlations than those that have considered shorter periods. In general, the choice of an appropriate time frame must involve consideration of the nature of the outcome of interest and the estimated or assumed causal lag of event effects for that outcome.
When relatively long time periods are selected, group decline in event reporting can be investigated as a function of months prior to interview, and cut-points for the time frame can be established based on the results. In this connection, it should be recognized that the problem of reliability of reporting or of remembering, and the contention that stressors may be time limited in their effects, cannot be applied equally to all events that may be experienced. For example, it would rarely be the case that a subject would not remember, or would fail to report in response to specific questions, that his or her parent had died, or child had died, or parents had divorced, that he or she had served in combat, or had been physically or sexually assaulted. Evidence that a number of major events or circumstances are relevant to mental health and that the duration of their effects is closer to a lifetime than to one year suggests the utility of considering such events within efforts to index a person’s burden of stressful experience. The
measurement of major, potentially traumatic events is discussed briefly below.

It is widely recognized that both discrete life events and relatively enduring problems are relevant to distress and illness. The experience of eventful stress may alter the meanings of existing strains, generate new strains, or magnify existing strains (Pearlin 1983). Alternatively, the impact of recent events may be amplified in the presence of chronic stress (Brown and Harris 1978). To test the idea that eventful and chronic stressors act synergistically requires the capacity to distinguish stressors within the two categories. Although life event checklists aim to assess discrete events, it is clear that some occurrences of a given event are time-limited in nature, whereas other occurrences of the same event may represent the beginning of a long-term difficulty. In a large-scale community study (n = 1561), Avison and Turner (1988) obtained information for each reported event on when it ended as well as when it had occurred. Analyses confirmed that event checklists routinely reflect differences in exposure to relatively enduring as well as discrete stressors. Moreover, none of the 31 negative events assembled by these investigators from established lists were found to be uniformly experienced as either discrete or chronic stressors. This research thus indicates that enduring or chronic stress is part of what is reflected in associations of checklist scores with health outcomes and that it is possible to distinguish its contribution. However, the enduring stressors that might be indexed within an event checklist are likely to be too limited and narrow to yield adequate estimates of variations in exposure to chronic stress. This issue is discussed further below.
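The idea of recording both onset and end dates for each checklist event, so that occurrences of the same event can be separated into discrete and enduring stressors, can be sketched as follows. This is a minimal illustration only: the 90-day cut-off, the `ReportedEvent` structure, and the sample events are hypothetical assumptions made for the example and are not taken from Avison and Turner (1988).

```python
from collections import Counter
from datetime import date
from typing import NamedTuple, Optional


class ReportedEvent(NamedTuple):
    """One checklist event with timing information (hypothetical structure)."""
    name: str
    onset: date
    ended: Optional[date]  # None means the difficulty was still ongoing at interview


# Hypothetical cut-off: occurrences lasting at least this long count as chronic.
CHRONIC_AFTER_DAYS = 90


def classify(event, interview_date, chronic_after_days=CHRONIC_AFTER_DAYS):
    """Label a single occurrence as 'discrete' or 'chronic' from its duration."""
    end = interview_date if event.ended is None else event.ended
    duration_days = (end - event.onset).days
    return "chronic" if duration_days >= chronic_after_days else "discrete"


def exposure_profile(events, interview_date):
    """Count discrete and chronic occurrences for one respondent."""
    return Counter(classify(e, interview_date) for e in events)


# Illustrative (made-up) respondent: the same event type appears once as a
# short-lived event and once as the start of a long-term difficulty.
interview = date(2000, 12, 1)
events = [
    ReportedEvent("job loss", date(2000, 3, 1), date(2000, 3, 20)),
    ReportedEvent("job loss", date(2000, 1, 10), None),
    ReportedEvent("serious illness of relative", date(2000, 6, 1), date(2000, 11, 15)),
]
profile = exposure_profile(events, interview)
print(profile)
```

Keeping the two counts separate is what would allow an analyst to enter eventful and enduring exposure as distinct predictors, and to test for an interaction between them, rather than relying on a single checklist total.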
1.3 Event Weighting

Considerable attention has been given to differential weighting of life events, presumably because the idea that there are no differences in impact potential across events is counterintuitive. There are several types of weighting, but the two most common are ‘objective’ and ‘subjective’ ratings. ‘Objective’ ratings of the change, importance, or seriousness of events are made by ‘judges’ selected in random samples of defined populations, in purposive samples of individuals who have experienced the event being weighted, or from expert panels. ‘Subjective’ ratings are assigned by individuals to their own events and are considered only at the individual level. A primary difference between these two is the level of aggregation of the weight: objective weights reflect an average assigned by a group of raters; subjective weights involve self-reporting by the individual on how serious or stressful the event was for them. Thus, ‘objective’ weights are attempts to capture differences in event characteristics that imply differences in impact potential that are applicable across individuals, whereas ‘subjective’ weights attempt to capture differences in impact potential associated with
variations in the meaning of an event for individuals themselves. As noted above, despite numerous attempts to demonstrate otherwise, research has failed to yield evidence for the effectiveness of the ‘objective’ weighting of events. Nevertheless, some researchers have called for additional efforts, arguing that the issue is not yet settled (Turner and Wheaton 1995).

Concern about subjective weights most often revolves around the issue of confounding estimates of stress exposure with measures of outcome. Reports of illness or distress and of the seriousness or stressfulness of reported events may reflect the same domain of experience. This problem is arguably less serious when the outcome of interest is physical health status and, with longitudinal data, bias can be substantially reduced by partialling out the effects of events on later health. However, subjective weights are problematic in another respect: they tend to confound differential stress exposure with differential vulnerability to stress. The weight attached by a respondent to an event will be substantially a function of his or her capacity, real or perceived, to handle that event emotionally and/or practically. This capacity is largely determined by the individual’s coping skills and the availability of social and personal resources. Thus, subjective ratings confound the effects of stress mediators/moderators with those of stress exposure and should therefore be avoided even in prospective studies and studies on physical health status.
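The difference in level of aggregation between the scoring schemes discussed in this section can be made concrete with a short sketch. All event names, judge ratings, and self-ratings below are invented for illustration and do not reproduce the SRRS or any published weights; the function names are likewise hypothetical.

```python
from statistics import mean

# Hypothetical ratings of required readjustment from a three-member panel.
JUDGE_RATINGS = {
    "death of spouse": [100, 95, 90],
    "job loss": [50, 45, 40],
    "residential move": [20, 25, 15],
}


def objective_weights(judge_ratings):
    """'Objective' weight = average rating across a group of judges."""
    return {event: mean(ratings) for event, ratings in judge_ratings.items()}


def unit_score(reported_events):
    """Unweighted checklist score: a simple count of reported events."""
    return len(reported_events)


def weighted_score(reported_events, weights):
    """Sum of the weights attached to each reported event."""
    return sum(weights[event] for event in reported_events)


weights = objective_weights(JUDGE_RATINGS)

# One (made-up) respondent, with their own 'subjective' ratings of the same events.
respondent_events = ["job loss", "residential move"]
respondent_own_ratings = {"job loss": 80, "residential move": 5}

print(unit_score(respondent_events))                              # simple count
print(weighted_score(respondent_events, weights))                 # group-level weights
print(weighted_score(respondent_events, respondent_own_ratings))  # individual-level weights
```

The third score depends on how this particular respondent appraised the events, which is why, as argued above, subjective weights risk mixing exposure with vulnerability, whereas the first two scores are the same for anyone reporting the same events.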
2. Interview Measures of Life Events

Interview measures are distinguished from the checklist approaches by their use of qualitative probes. These probes are employed to specify more precisely the characteristics of life events that might be stressful. The approach avoids the violence to logic implicit in assuming that widely differing events have equivalent stress-evoking potential by specifically attempting to establish the degree of severity of each event exposure (Wethington et al. 1995). Other major strengths of intensive interview measures include achievement of greater comprehensiveness of event coverage through flexible domain-specific elicitation and self-nomination of events not evoked by direct questioning, and the ability to establish the time ordering of stress exposure and health outcomes with greater confidence. The major disadvantage of interview measures is that they are extremely time-consuming and expensive relative to checklist methods.

2.1 Life Events and Difficulties Schedule (LEDS)

The Life Events and Difficulties Schedule (LEDS) is the most widely used personal interview method and has produced perhaps the most compelling evidence available for the role of stress in psychiatric disorder (Brown and Harris 1978). This semistructured instrument yields a narrative story of each experienced event that is used by investigators to rate the importance of events. These investigator-based ratings attempt to estimate what the impact would be in a specific context for the average person. It is the experience of a ‘severely’ threatening event or of a severe chronic difficulty that is hypothesized to increase risk for the occurrence of physical or psychiatric disorder. Severity is established through ratings of ‘long-term contextual threat,’ aided by the use of ‘dictionaries’ of examples of rated events and difficulties, the use of which requires a formal training course (see Wethington et al. 1995).

The considerable strengths of this approach are offset by the very high cost noted earlier. This fact, along with the extensive amount of time required to complete the rating process, may make the method unsuitable for large-scale community studies. From certain points of view, a second disadvantage is that the method, in evaluating social context within the severity rating process, ends up aggregating stress exposure with factors that may amplify or moderate its impact. That is, what has been described as the stress process is substantially represented in the ratings of contextual threat. Although this results in powerful predictions of health outcomes, information that is crucial to interventionists is obscured when the effects of stress exposure and of differential vulnerability are confounded. As Wethington et al. (1995) have noted, Brown and his colleagues continue to refine the measure to address this criticism, at least to some extent.
2.2 The Standardized Event Rating System

The Standardized Event Rating System (SEPRATE) utilizes a structured interview format to develop descriptions of each event or difficulty reported (Dohrenwend et al. 1993). In addition to the narrative description, the probes allow a standardized assessment of separate aspects of events and situations thought to condition the level of stress experienced. A major difference from the LEDS interview is that material that might imply differences in stress vulnerability is stripped from narratives to avoid the inclusion of vulnerability estimates within event ratings. This interview measure is relatively new and has not been as widely tested as the LEDS. Like the LEDS it is relatively expensive to use, is lengthy to administer, and requires specialized training from the authors.
3. Chronic Stress

Despite growing evidence that chronic stressors affect physical and psychological health, there have been few attempts toward comprehensive measurement of such stressors. The relative lack of attention in this area may arise from concern that reports of enduring role-related or other stressors may be more a consequence or reflection of disorder than a cause of it. A promising 51-item inventory has recently been offered by Wheaton (1997), who has presented evidence contrary to this concern about causal confounding. The inclusion of this measure, along with a life events checklist, has been shown to account for about 2.5 times as much variance in depressive symptomatology as typically reported in the literature (Turner et al. 1995). This measure assesses a wide range of role-related and role-unrelated chronic stressors, and the observation of its unique association with depression suggests the need in future research for including a separate sampling of items addressed to the general domain of chronic stress. There is also a variety of less comprehensive questionnaires that deal with specific domains such as work or relationship strains. These have been enumerated and critiqued by Lapore (1995) and Herbert and Cohen (1995).
4. Lifetime Exposure to Major/Traumatic Events

There is evidence that traumatic events or circumstances, individually or cumulatively, are relevant to mental health (Rutter 1989, Kessler and Magee 1993, Turner and Lloyd 1995). There appears to be a range of events that, presumably, can be reliably measured and that can have significant mental health consequences despite their occurrence years, or even decades, earlier. This argues strongly for the systematic inclusion of lifetime experience of such events as part of the effort to assess differences in stress exposure. Analyses should consider both the direct individual and cumulative effects of lifetime traumas and the extent to which they may condition the impact of more contemporaneous stressors. Brief inventories of major/traumatic events are presented in Kessler and Magee (1993) and Turner and Lloyd (1995). However, because the nature of traumatic experience varies across populations and age cohorts, event selection and coverage should be specific to each study population.

It is important to understand that the health significance of social stress has never been effectively tested because we have yet to adequately measure differences in stress exposure. The conceptual consequences of this inadequate measurement can be profound. This is so because unmeasured group differences in stress exposure masquerade within research findings as group differences in vulnerability. Failure to take account of differences in traumatic stress or in the stress of discrimination can inappropriately suggest group differences in coping skills or other personal attributes relevant to adaptation. It seems clear that efforts to measure stress exposure should be as comprehensive as possible and take
account of eventful stressors, more enduring strains, lifetime occurrence of traumatic events, and stress associated with discrimination and acculturation.

See also: Coping across the Lifespan; Coping Assessment; Stress and Coping Theories; Stress Management Programs; Stress, Neural Basis of; Stress: Psychological Perspectives
Bibliography

Avison W R, Turner R J 1988 Stressful life events and depressive symptoms: Disaggregating the effects of acute stressors and chronic strains. Journal of Health and Social Behavior 29: 253–64
Brown G W, Harris T O 1978 Social Origins of Depression. Free Press, New York
Cohen S, Kessler R C, Gordon L U 1995 Measuring Stress: A Guide for Health and Social Scientists. Oxford University Press, New York
Dohrenwend B P, Raphael K G, Schartz S, Stueve A, Skodol A 1993 The structured event probe and narrative rating method for measuring stressful life events. In: Goldberg L, Breznitz S (eds.) Handbook of Stress. Free Press, New York, pp. 174–99
Hawkins N G, Davies R, Holmes T H 1957 Evidence of psychosocial factors in the development of pulmonary tuberculosis. American Review of Tuberculosis and Pulmonary Diseases 75: 768–80
Herbert T B, Cohen S 1995 Measurement issues in research of psychosocial stress. In: Kaplan H (ed.) Psychosocial Stress: Perspectives on Structure, Theory, Life-course and Methods. Academic Press, New York, pp. 295–332
Holmes T H, Rahe R H 1967 The social readjustment rating scale. Journal of Psychosomatic Research 11: 213–8
Jemmot J B, Locke S E 1984 Psychosocial factors, immunologic mediation and human susceptibility to infectious diseases: How much do we know? Psychological Bulletin 95: 78–108
Kessler R C, Magee W J 1993 Childhood adversities and adult depression: Basic patterns of association in a US national survey. Psychological Medicine 23: 679–90
Kessler R C, Price R H, Wortman C B 1985 Social factors in psychopathology: Stress, social support, and coping processes. Annual Review of Psychology 36: 531–72
Lapore S J 1995 Measurement of chronic stressors. In: Cohen S, Kessler R C, Gordon L U (eds.) Measuring Stress: A Guide for Health and Social Scientists. Oxford University Press, New York
Pearlin L I 1983 Role strains and personal stress. In: Kaplan H B (ed.) Psychosocial Stress: Trends in Theory and Research. Academic Press, New York, pp. 3–31
Ross C E, Mirowsky J 1979 A comparison of life event weighting schemes: Change, undesirability, and effect proportional indices. Journal of Health and Social Behavior 20: 166–77
Rutter M 1989 Psychosocial risk trajectories and beneficial turning points. In: Doxiadis S, Stewart S (eds.) Early Influences in Shaping the Individual. NATO Advanced Science Institutes Series, Series A: Life Sciences. Plenum Press, New York
Selye H 1956 The Stress of Life. McGraw-Hill, New York
Tausig M 1982 Measuring life events. Journal of Health and Social Behavior 23: 52–64
Thoits P A 1983 Dimensions of life events that influence psychological distress: An evaluation and synthesis of the literature. In: Kaplan H B (ed.) Psychosocial Stress: Trends in Theory and Research. Academic Press, New York, pp. 33–101
Turner R J, Lloyd D A 1995 Lifetime traumas and mental health: The significance of cumulative adversity. Journal of Health and Social Behavior 36: 360–76
Turner R J, Lloyd D A 1999 The stress process and the social distribution of depression. Journal of Health and Social Behavior 40: 374–404
Turner R J, Wheaton B 1995 Checklist measurement of stressful life events. In: Cohen S, Kessler R C, Gordon L U (eds.) Measuring Stress: A Guide for Health and Social Scientists. Oxford University Press, New York
Turner R J, Wheaton B, Lloyd D A 1995 The epidemiology of social stress. American Sociological Review 60: 104–24
Wheaton B 1997 The nature of chronic stress. In: Gottlieb B (ed.) Coping With Chronic Stress. Plenum, New York
Wethington E, Brown G W, Kessler R C 1995 Interview measurement of stressful life events. In: Cohen S, Kessler R C, Gordon L U (eds.) Measuring Stress: A Guide for Health and Social Scientists. Oxford University Press, New York
Zautra A J, Reich J W 1983 Life events and perceptions of life quality: Developments in a two-factor approach. Journal of Community Psychology 1: 121–32
R. J. Turner
Stress, Neural Basis of

The term stress is semantically confusing and difficult to define. The confusion over what is meant by stress is compounded by its use both as a noun, i.e., a stressful experience or life event, and as a verb, i.e., a stress response. In addition, stress is usually seen as a negative state, particularly as it is often used to describe the state of having too much to do in too little time. Despite the negative connotations, the dictionary defines stress as something that weight is placed upon, something important. Synonyms for stress include emphasis, accent, and significance; not exactly definitions that should elicit such negative responses. Another definition comes from the field of physics, where the typical synonym is strain. That strain is typically measured through changes in the physical properties of matter. This definition has some utility when considering the neural basis of stress. That is, a given amount of stress can be useful to material, testing its strength. But when the amount of stress is increased precipitously, those physical properties can change, often irrevocably. Combining these two perspectives, then, it is tempting to assert that stress is the physical response to environmental change beyond the expected level of force. This would necessarily exclude changes in sensory processing, i.e., perception, which may or may not be stressful. However, it also excludes two of the more salient features of the stress response: subjectivity and emotionality. With regard to subjectivity, what is stressful to one person is not necessarily stressful to another. And with respect to emotionality, few would deny the emotional
content of a stressful experience. The true definition lies somewhere between the mechanistic/physical view and a more emotional/subjective one. For the purposes of this article, we will consider stress to be the subjective response to environmental changes beyond an expected level of force. Experience of stress and the neural processes that mediate it will be discussed here.

Possibly the most accepted definition of stress was proposed by the eminent endocrinologist Hans Selye, who defined it as the nonspecific response of the body to any demand placed on it. With an emphasis on nonspecificity, Selye was referring to one of the best-described stress response systems in the body: the hypothalamic-pituitary-adrenal (HPA) axis. In response to the stressor, the paraventricular nucleus (PVN) of the hypothalamus releases corticotrophin-releasing factor (CRF), which induces the release of adrenocorticotrophic hormone (ACTH) from the anterior lobe of the pituitary. ACTH exerts its effects on the adrenal cortex, promoting the synthesis and secretion of glucocorticoids (cortisol in humans) into the bloodstream. Glucocorticoids are the final effectors of the HPA axis and exert homeostatic regulatory effects including energy mobilization and inhibition of anabolic processes. The HPA axis is regulated by negative feedback: glucocorticoids released into the blood suppress the release of CRF from the hypothalamus and the secretion of ACTH from the pituitary, which results in termination of the stress response. Concomitantly with the release of ACTH from the pituitary, the opioid peptide beta-endorphin is also released, having wide-ranging modulatory effects on neurotransmission and hormone regulation in the brain. Seemingly paradoxically, beta-endorphin plays a role in stress-induced analgesia and prevents the stress response from overwhelming the capacity of the organism to contend with the stressor.
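The negative feedback loop just described (CRF drives ACTH, ACTH drives glucocorticoid secretion, and circulating glucocorticoid suppresses both CRF and ACTH release) can be illustrated with a deliberately simplified simulation. The sketch below is a toy model in arbitrary units, not a physiological model: the functional form of the inhibition, the parameter values, and the function name `simulate_hpa` are all assumptions made purely for illustration.

```python
def simulate_hpa(stressor_drive, feedback_gain=2.0, clearance=0.1, dt=0.1):
    """Toy discrete-time sketch of HPA negative feedback (arbitrary units).

    stressor_drive: sequence of stressor intensities, one per time step.
    feedback_gain:  how strongly cortisol suppresses CRF and ACTH release
                    (0.0 switches the feedback off). Hypothetical parameter.
    clearance:      rate at which cortisol is removed from the blood.
    """
    cortisol = 0.0
    trace = []
    for drive in stressor_drive:
        inhibition = 1.0 + feedback_gain * cortisol
        crf = drive / inhibition   # hypothalamic output, suppressed by cortisol
        acth = crf / inhibition    # pituitary output, also suppressed by cortisol
        # Cortisol rises with ACTH and is cleared in proportion to its level.
        cortisol += dt * (acth - clearance * cortisol)
        trace.append(cortisol)
    return trace


# A stressor that is present for 300 steps and then removed.
drive = [1.0] * 300 + [0.0] * 300
with_feedback = simulate_hpa(drive, feedback_gain=2.0)
without_feedback = simulate_hpa(drive, feedback_gain=0.0)

print(f"peak cortisol with feedback:    {max(with_feedback):.2f}")
print(f"peak cortisol without feedback: {max(without_feedback):.2f}")
```

In this toy setting the feedback term caps the cortisol response while the stressor is present, and cortisol decays back toward baseline once the drive is removed, which is the qualitative behavior the text attributes to the axis.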
Glucocorticoids are released in response to every type of stressor known to humans, from novelty to chronic pain, to death. They are also released in response to more positive emotions and experiences such as pleasure and learning. Because glucocorticoids are released into the bloodstream and thus have access to nearly every known organ in the body, the emphasis on nonspecificity is not unfounded. Activation of the HPA axis is known to have widespread consequences and under supra-threshold conditions can be detrimental to neuronal functioning. However, Selye probably did not mean to imply that the stress response in its entirety was nonspecific and certainly did not mean to imply that it was out of control. In fact, there is a recent appreciation for the role that glucocorticoids play as anti-inflammatory agents and in bringing the system back to homeostasis after a stressful experience. The many different neural and neuroendocrine pathways activated as a result of stressful experience further show that Selye’s emphasis on nonspecificity is an oversimplification. In addition to the secretion
of glucocorticoids by the adrenal cortex, the adrenal medulla and autonomic nervous system also have prominent actions in mediating the stress response. The autonomic nervous system consists of two branches, the sympathetic and parasympathetic, which work in opposition to each other. Activation of the sympathetic division of the autonomic nervous system stimulates the chromaffin cells of the adrenal medulla to secrete the catecholamines epinephrine and norepinephrine into the bloodstream. Epinephrine and norepinephrine induce a variety of physiological changes that provide the organism with the necessary energy resources to carry out its actions under stressful experiences in preparation for the ‘fight or flight’ response, a term used by Walter Cannon. As a result, respiration increases to facilitate the exchange of oxygen in the lungs, while cardiac output is increased to divert blood from digestive areas to muscles. Sympathetic activation also results in an increase in the supply of glucose via the inhibition of insulin and increased glucagon production that converts stored glycogen into glucose. Although counterintuitive, the inhibition of insulin is very adaptive since it increases the availability of glucose to the brain thereby enhancing neuronal function. In contrast to the sympathetic, the parasympathetic division is promptly inhibited by stress and thereby postpones the costly non-essential anabolic processes of vegetative function such as digestion. In addition to these peripheral responses to stress, there are numerous central responses. Among other purposes, they are important in establishing the ‘what’ and ‘where’ of the stressful experience, as well as the formation of memories about the experience. For example, CRF neurons form part of the critical HPA response to stress and are innervated by limbic areas including the amygdala, bed nucleus of the stria terminalis, and lateral septum. 
It is suggested that they provide emotional tone to, or perception of, the stressful stimulus. Brainstem afferents from catecholaminergic neurons have a stimulatory action on HPA activation, as does serotonergic input from the dorsal raphe nucleus. Inhibitory input is provided by local GABA neurons that decrease the release of CRF. CRF neurons also receive input from other hypothalamic nuclei including the suprachiasmatic nucleus, which serves to synchronize circadian rhythmicity and control the daily oscillations of hormone secretion. The stress response in turn activates several brain regions and pathways. For example, exposure to stressful experience stimulates the mesolimbic/mesocortical dopamine systems, which in turn innervate the prefrontal cortex, an area involved in anticipatory phenomena and cognitive function. Furthermore, the mesolimbic/mesocortical systems are linked to the nucleus accumbens, which is implicated in reinforcement/motivational and reward processes. The amygdala and hippocampal components of the limbic system are also stimulated during stress. The
amygdala is important for the emotional analysis of the stressor and has direct connections with the hypothalamus, cortex, and regions of the brainstem involved in autonomic control. Thus, the amygdala may serve as an integrator linking regions responsible for detection of stressors involving cognitive processing (e.g., the cortex and hippocampus) and those that serve as effectors of the stress response (e.g., the hypothalamus and brainstem controls). The hippocampus, on the other hand, appears to be involved in inhibition of the HPA axis and thus is part of the feedback control of ACTH release. The inhibitory role of the hippocampus depends on activation of hippocampal glucocorticoid receptors, which stimulate the fimbria-fornix pathway from the lateral septum and bed nucleus of the stria terminalis, inducing the release of the neuropeptide somatostatin to inhibit the PVN. Thus, the interaction between the stress system and neural elements functions to regulate the autonomic, endocrine, and behavioral responses to stress, allowing for perception of the stressor, emotional coding, and initiation of action. The HPA axis of the stress system exerts a profound influence on the regulation of growth, reproduction, and immune function. There is a stress-related suppression of growth hormone that inhibits growth in the young organism and the repair of existing tissue in adults. In both the male and female reproductive systems, stress-induced CRF and beta-endorphin secretion suppress gonadotropin-releasing hormone (GnRH) at the level of the hypothalamus. Furthermore, glucocorticoids decrease pituitary responsiveness, resulting in suppression of luteinizing hormone (LH) and of the release of follicle-stimulating hormone (FSH). Testicular and ovarian sensitivity to LH is also decreased by glucocorticoids. Stress can thereby disrupt ovulation and implantation in females and reduce sperm production and testosterone secretion in males.
Stressors also exert multiple effects on the inflammatory/immune reaction. Glucocorticoids mediate stress-induced immunosuppression via a negative feedback loop. Glucocorticoid secretion resulting from HPA axis stimulation by cytokines (i.e., interleukin-1, tumor necrosis factor-alpha) suppresses the stimulatory effects of cytokines on the HPA axis. It is obvious that the impact of stressor exposure is widespread and extends to many physiological and neural systems. Defining stress and its neural consequences is made more difficult by the parameters of stress. For example, the duration of a stressful stimulus can be either chronic or acute, and the neural indices differ dramatically depending on these temporal characteristics. Stressful experience also differs in its intensity. Minor stressful experiences can induce a range of responses that are much different from those induced by more severe forms of stress. This is contradictory to Selye’s view of a common stress response regardless of the nature of the stressor. The stress system produces adaptive responses to acute
stressors. However, the same stressor, when chronic or activated without sufficient physiological cause, becomes pathogenic, resulting in the emergence of stress-related disease. Prolonged activation of the catabolic processes of the stress response can result in cardiovascular (hypertension), metabolic (myopathy, fatigue), gastrointestinal (gastric ulcers), and reproductive (impotence and amenorrhea) consequences, as well as complications associated with immunosuppression (impaired disease resistance). Although there is consistency of responses activated during stress, responses can be specific to the particular characteristics of the stressor. Stress physiology is not only modulated by physical stressors, but also by psychological variables. In fact, psychological factors can initiate a stress response even in the absence of physical insult and homeostatic disruption. Animals and humans have been shown to exhibit the classic stress response during circumstances such as bereavement, difficult cognitive tasks, and conditioned fear. The stress response initially evolved as a way of contending with environmental challenges but over time came to be activated by cognitive phenomena. The magnitude of a stressful event is modulated by not only the nature of the stressor and the resulting physiological perturbations, but also individual perception and interpretation of the stressor. Lack of predictability and control are the two major elements underlying the cognitive appraisal of stress. Uncertainty about an aversive event reduces feelings of control over the situation, thereby inducing higher levels of stress. Related to this is the well-documented phenomenon of learned helplessness, in which prolonged exposure to uncontrollable and inescapable aversive stimuli can produce emotional disruption and some deficits in performance, while exposure to the same amount of controllable stress has no effect.
Furthermore, the psychological component of stressors underscores the element of individuality of the stress response. What is stressful to one individual may not be to another. Factors such as gender and prenatal and early-life psychological and physical stress have profound and long-lasting effects on the basal activity of the stress system. Just as chronic activation of the stress response can lead to a manifestation of physical pathologies, so it can lead to psychiatric syndromes including melancholic depression, anorexia nervosa, obsessive compulsive disorder, and panic anxiety. It has been demonstrated thus far that stressors have extensive impact on various neural and peripheral systems. But the aforementioned neuromodulatory systems also contribute to a range of behaviors allowing for maximal ability of the organism to contend with the stressful stimulus. This reciprocal relationship between behavior and physiology is essential to overcoming a stressor and re-establishing homeostasis. Reproductive and feeding behaviors are suppressed, allowing for appropriate provision of
energy to overcome the stressor. The degree of arousal and cognition is also altered. As already demonstrated, acute stress can produce adaptive behavioral responses. Such a response would include increased arousal and enhanced learning and memory resulting from blocked glucose transport to cells, which benefits neuronal function by providing more glucose to neurons. As a result, the animal is able to contend with the stressful experience. On the other hand, chronic stress can be detrimental to neural function. Elevated glucocorticoids damage hippocampal neurons, resulting in the inability of the hippocampus to inhibit CRF release. Consequently, there is an increased secretion of glucocorticoids leading to additional hippocampal degeneration. Such damage can produce deficits in learning and memory. An inverted U function could thus be used to represent the relationship between stress and performance. Optimal performance occurs with moderate levels of demand but is compromised at the extremes. Low levels of stress, such as boredom or drowsiness, are detrimental to performance. As stress increases, performance is facilitated up to an optimal level beyond which there is deterioration. Thus, physiological and behavioral processes are involved in the adaptive response to stress. Although stress is difficult to define, the responses produced by stress are not. Stressors are pervasive and challenge the physiologic and behavioral well-being of the organism. All stress-inducing stimuli, both physical and psychological, exert their initial effects on the central nervous system (CNS). Thus, the stress response system is predominantly under neural control. The CNS regulates the reactions of the organism through close interactions with other neuroanatomical, peripheral, and functional elements to produce adaptive responses necessary for maintaining homeostasis. The stress system is designed for quick activation and disengagement.
Chronic stimulation results in maladaptive responses and a multitude of physical, emotional, and behavioral pathologies. Thus, a contemporary perspective of stress has evolved from Selye’s original concept of a single stereotypical response resulting from any stress put on the body to one that reflects different patterns of biochemical, behavioral, and physiological responses of varying degrees to particular stressors. To this point, we have focused primarily on the nonspecific or general responses to stress. However, the past few decades of research on the neural basis of stress would suggest that many responses to stressful experience are not nonspecific and are indeed specific at many levels: from their temporal characteristics, to regional location, to molecular identity. To illustrate this point, we will discuss the responses to one specific stressor, that of acute exposure to brief, low-intensity, inescapable tail shocks in the rat. Exposure to this one experience induces a number of responses that are both regionally and temporally specific. At the transmitter level, exposure to the event enhances the release
of acetylcholine in the hippocampus and prefrontal cortex but not in the nucleus accumbens. At the receptor level, exposure to the stressor activates N-methyl-D-aspartate (NMDA) receptors in the basolateral amygdala, while enhancing the affinity of the AMPA type of glutamate receptor in area CA1 but not area CA3 of the hippocampus. At the biochemical level, exposure enhances the binding of phorbol esters, a marker for protein kinase C activity. This effect occurs specifically in the basolateral nucleus but does not occur in either the central nucleus of the amygdala or the hippocampus. At the physiological level, exposure to the acute stressful event reduces spontaneous unit activity in the basolateral amygdala, while enhancing theta rhythms and reducing long-term potentiation in hippocampal subregions. At the molecular level, activation of immediate early gene expression occurs in several layers of the cortex and area CA1 of the hippocampus but not in the dentate gyrus. At the cognitive level, exposure to the stressor has been reported to impair instrumental responding while enhancing classical conditioned responses. These responses to stress are also dependent on the sex of the animal. For example, exposure to the stressor, while enhancing classical conditioned responses in the male rat, dramatically impairs conditioning in the female rat. Finally, many of these responses to stress are immediately induced while others are persistently expressed. Moreover, a subset of these responses can be reactivated long after the stressful experience has ceased by simply placing the animal in the context where the stressful event occurred. Thus, there is a psychological component (or memory) of the experience that is sufficient to activate many indices of stress in the brain. From this one example, it is clear that the responses to stress can be both regionally and temporally specific as well as sex specific.
Understanding the neural basis of stress will require an appreciation and documentation of these specifics. See also: Anxiety and Fear, Neural Basis of; Hypothalamic–Pituitary–Adrenal Axis, Psychobiology of
Bibliography
Cannon W B 1929 The wisdom of the body. Physiological Review 9: 399–431
Hebb D 1955 Drives and the CNS (conceptual nervous system). Psychological Review 62: 243–54
Hennessy J W, Levine S 1979 Stress, arousal, and the pituitary-adrenal system: A psychoendocrine hypothesis. Progress in Psychobiology and Physiological Psychology 8: 133–78
McEwen B S, Sapolsky R M 1995 Stress and cognitive function. Current Opinions in Neurobiology 205–16
Sapolsky R M 1992 Stress, the Aging Brain, and the Mechanisms of Neuron Death. MIT Press, Cambridge, MA
Selye H 1956 The Stress of Life. McGraw Hill, New York
Shors T J 1998 Stress and sex effects on associative learning: For better or for worse. The Neuroscientist 4: 353–64
T. J. Shors and B. Horvath
Stress: Psychological Perspectives
Psychological stress, like the general term ‘stress,’ is a common term that defies a universally agreed-upon definition. These difficulties are not due to a lack of interest in the concept; rather, they are due to a widespread fascination with psychological phenomena related to organism–environment interactions. It is the inherent complexity of this enterprise that attracts and vexes investigators. In the place of a standard definition of psychological stress, three different perspectives have evolved with their own definitions, as well as their own conceptual and methodological traditions.
1. Historical Considerations and Concerns
Interest in the implications of emotions for health is age-old, dating back to the origins of holistic approaches to medicine exemplified by the writings of Hippocrates (400 BC). Interest in the influence of psychological factors on emotional functioning has consequently represented a parallel theme of interest over the centuries (see Environmental Stress and Health). The basic premise is that stress gives rise to distressing emotions, which in turn influence disease. Yet the connections between these words and ideas are quite rich, and more complex than current viewpoints suggest. Indeed, ‘stress’ is derived as a shortened form of ‘distress’; disease, in turn, originally referred to ‘dis-ease,’ or distress (Rees 1976). It is in this lexicographic context that varied connections between stress, distress, and disease can be appreciated, only some of which are useful for scientific purposes. As rich in explanatory power as psychological stress might be for understanding the causes of distress and disease, the linguistic overlap of these terms also signifies problems. The intuitive appeal of the ideas and their relations may foster false theories and findings. There is a tendency to ‘explain away’ disorders of unknown etiology based on prevailing beliefs about the unwanted or adverse social or psychological circumstances (Monroe and McQuaid 1994). For example, before the physical causes of particular infectious diseases were discovered, the origins of these illnesses were commonly ascribed to psychological stress and distressing emotions (Sontag 1978). The intuitive appeal of explanations based on psychological stress and related ideas requires firm thinking and sound science to differentiate valid associations from plausible and convenient fictions.
2. The Major Perspectives and Traditions
Two traditions may be discerned which, together, have given rise to three perspectives on psychological stress. The first tradition derives from clinical observations and field studies of humans exposed to threatening circumstances. The second tradition arises from the animal laboratory, where systematic control over stimulus conditions and response alternatives of the organism has been exerted. The common denominator of the three perspectives on psychological stress emerging from these traditions is an interest in the psychological concomitants and sequelae associated with challenging or threatening environmental conditions. The difference between these perspectives, as explained next, is the relative emphasis placed on the environment, organism, or environment–organism interactions over time as a defining feature of psychological stress.
2.1 The Stimulus or Environmental Perspective
The stimulus or environment perspective holds that psychological stress is anchored in the environment and its characteristics. The underlying assumption is that objective external circumstances can be classified as psychologically demanding, threatening, benign, and so forth. In human field or laboratory research, the stimulus parameters were recorded in naturalistic contexts or manipulated in laboratory settings. Animal laboratory research, on the other hand, could readily control the environment, and thereby more easily operationalize and manipulate the stimulus conditions. Other factors, too, were influential in promoting the environmental perspective. Attention to the environment for creating stress and illness became especially evident toward the mid-twentieth century; the experiences of World War II reshaped the ways in which a breakdown in human functioning was viewed. Prior to this time, it was thought that individual vulnerability (i.e., biological) largely accounted for who succumbed to stress. Normal people maintained health and survived the ravages of stress; the biologically predisposed or tainted broke down. Given that one-third of American casualties in World War II sustained no physical injury (Weiner 1992), the idea that severe stress could precipitate psychological and physical breakdown in previously normal individuals was incorporated into medical and psychiatric thinking (Dohrenwend 1998). This brought to the foreground
the importance of extreme environments for defining psychological stress. Finally, a surge of interest and activity in psychological stress from an environmental perspective came from another avenue of inquiry: life events questionnaires. Thomas Holmes and Richard Rahe developed the idea that life events could be assessed in an objective manner. These researchers developed a list of 43 events observed to occur prior to the onset of disease, including such items as marriage, trouble with the boss, change in sleeping habits, death in the family, and vacation. The events were collated and weighted, producing the Schedule of Recent Experiences (SRE; Holmes and Rahe 1967). Literally thousands of studies were conducted with the SRE and the many derivative life event checklists it spawned (Monroe and McQuaid 1994).
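The additive scoring logic behind weighted life event checklists of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published instrument: the function name `checklist_score` and every numeric weight are hypothetical placeholders, and only the event labels are taken from the examples listed above.

```python
from typing import Dict, Iterable

# Hypothetical placeholder weights for illustration only;
# they are NOT the values of the published instrument.
PLACEHOLDER_WEIGHTS: Dict[str, int] = {
    "marriage": 50,
    "trouble with the boss": 20,
    "change in sleeping habits": 15,
    "death in the family": 60,
    "vacation": 10,
}


def checklist_score(endorsed: Iterable[str],
                    weights: Dict[str, int] = PLACEHOLDER_WEIGHTS) -> int:
    """Return the summed weights of the distinct events a respondent endorsed."""
    distinct = set(endorsed)  # count each event type once
    unknown = distinct - weights.keys()
    if unknown:
        # Refuse to score events that are not on the checklist.
        raise ValueError(f"events not on the checklist: {sorted(unknown)}")
    return sum(weights[event] for event in distinct)


if __name__ == "__main__":
    # A respondent who endorsed two events from the list.
    print(checklist_score(["marriage", "vacation"]))
```

A single summed score of this form records only which events were endorsed; it carries no information about the timing or context of any event.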
2.2 The Response Perspective
The second perspective on psychological stress can be traced to the writings of Darwin and the theory of natural selection. It is in these ideas that attention began to focus on the implications of challenges and dangers of the environment, forcing the organism’s adaptation to ever-altering external circumstances. The response features of stress take on special importance in this framework, as it is recognized that evolution also fostered skills in anticipation of dangers and adversity (Weiner 1992). Being able to prepare for and prevent problems in the future drew upon mental, emotional, and behavioral capabilities that, over time, enhanced survival and reproduction. From this perspective, psychological stress is viewed as the response of the organism to the complex adaptive demands of the environment. The twentieth century brought more sustained and intense interest in probing the intricacies of psychological stress, with an emphasis on the response perspective. The work on physiology by Walter Cannon posed questions about how the body maintained its regulatory functions in the face of changing environmental demands. Hans Selye introduced the term ‘stress’ into more formal scientific arenas, and investigated the physiology of organisms in response to acute and extreme environmental conditions.
2.3 The Environment–Organism Interaction Perspective
The stimulus and response perspectives on psychological stress each possess certain limitations. For example, knowledge of the environment does not always predict the responses; individual differences in response to similar stimuli are a common finding. From a response perspective, there are numerous indicators to draw upon (different emotions, psychophysiologic, or hormonal indices), many of which display different relationships with distinct environmental challenges (Weiner 1992). To address some of these limitations, psychological stress has been proposed to represent a particular relationship between organism and environment. This third perspective holds that psychological stress is the product of the individual’s ongoing transactions with ambient circumstances, particularly circumstances that are perceived as taxing or exceeding adaptive resources. Several possibilities are entertained within this perspective, ranging from an emphasis on stressful experiences (Weiner 1992) through emphases on the ongoing process of event occurrence, appraisal, coping, reappraisal, and so on (Lazarus and Folkman 1984). The core feature of this perspective is that of denoting an active process between a dynamic environment and an adapting organism over time. Broadening the definition of psychological stress to encompass the interactions between organism and environment over time also enlarged the conceptual scope of research. This shift raises additional considerations, to which we now turn.
3. Conceptual Challenges and Problems
After the 1950s, stress became a common term, cutting across sociology, psychology, medicine, physiology, and biology. A broad array of constructs was united under this common rubric. At least two issues arise from this explosive growth in research on stress: (a) individual differences, and (b) generality vs. specificity of psychological stress.
3.1 Individual Differences
Despite the promising leads of early research on psychological stress, it became apparent that there was individual variability in response to stressful circumstances. Not all people succumbed to even the most extreme of situations, while some people broke down in the face of apparently minimal demands. Theory and research needed to account for such variation, thereby introducing factors hypothesized to moderate stress effects. Many factors have been proposed as stress moderators. Attention to this matter invoked such individual difference variables as appraisal of threat and coping (Lazarus and Folkman 1984, Monroe and Kelley 1995) and more remotely a variety of other considerations believed to moderate the effects of stress (e.g., personality factors, social support, etc.; Brown and Harris 1989, Goldberger and Breznitz 1993). The net effect was to broaden the conceptual arena of stress research considerably and to introduce multifactorial models, taking into account a host of factors that could influence stressors and responses
over time (see Coping Across the Lifespan; Control Behavior: Psychological Perspectives; Social Support and Stress; Stress and Coping Theories).
3.2 Generality vs. Specificity of Psychological Stress
A second concern was related to the omnibus nature of the stress concept. On the one hand, the stress concept united a variety of cognate constructs and emphasized the general nature of stress (e.g., Selye 1976). On the other hand, the unique implications of specific environmental challenges and responses could be obscured within such a general viewpoint (e.g., Weiner 1992). Recent research from both the animal laboratory and human field studies suggests that responses to stressors can be very specific with respect to the particular demands imposed, with specificity being key for understanding many of the adaptive implications (Dohrenwend 1998, Weiner 1992) (see Stress, Neural Basis of). Overall, the challenge for research on psychological stress was to remain true to its roots in extracting the ‘essence’ of psychological phenomena related to adversity, while at the same time not becoming so overly inclusive as to become essentially meaningless or obscure other underlying associations. Such an agenda raises important methodological issues, as discussed next.
4. Methodological Considerations
Measuring psychological stress in a reliable and valid manner has been a continuing challenge for stress researchers. For example, animal laboratory research has proven useful for more precise specification of both stimulus parameters and response characteristics associated with psychological stress. Yet the methods adopted often constrained the array of possible responses, creating unnatural stressor conditions which, in turn, yielded misleading views about how adaptation proceeds under more ecologically valid conditions (Weiner 1992) (see Stress Management Programs). In terms of human field research, there has been considerable debate about measurement procedures for assessing stressful life events. Research on self-report life event checklists such as the SRE revealed serious psychometric limitations (Brown and Harris 1989, Monroe and McQuaid 1994). Naïve research designs also led to spurious findings. For example, being depressed could cause life events such as ‘trouble at work’ and ‘difficulties with spouse,’ rather than the life events causing subsequent depression. Additionally, many of the life events on the SRE and similar checklists were direct indicators of disorder or illness (e.g., ‘major change in eating habits,’ ‘major change in sleeping habits’). Although more refined and comprehensive measures were developed to address such concerns, fundamental problems with self-report checklists remain (Brown and Harris 1989, Monroe and McQuaid 1994). In response to these methodological concerns, other investigators designed semistructured interviews and developed explicit guidelines, decision rules, and operational criteria for defining and rating life events (Brown and Harris 1989, Dohrenwend 1998). In general, these procedures enhance the reliability of life event assessments and provide better prediction of outcomes. Although these approaches are more time- and labor-intensive, they are among the best available tools for measuring psychological stressors. Importantly, too, these approaches provide flexibility in measuring life stress, allowing for ratings of specific dimensions of stress and for taking into account the influences of moderators of stress impact. Thus, in theory, such approaches can help to move beyond a strictly stimulus perspective and to incorporate useful features of the response and organism–environment interaction perspectives.
5. Future Directions
The three perspectives on psychological stress are converging on a viewpoint of psychological stress as a comprehensive process of complex transactions over time between organism and environment. Yet each approach continues to be of value for emphasizing particular features of these complex models that do not receive adequate attention within the other perspectives. The stimulus perspective is essential to provide valid measures of the stressor characteristics of environments. Without standardized indices reflecting the input characteristics of the circumstances animals or people face, little can be said about ‘downstream’ elements of the models. For example, individual differences in appraisal or coping have little meaning without considering what is being appraised or coped with. It is in relation to well-specified environmental conditions that the implications of individual differences in appraisal and coping may be most readily understood in terms of the impact on health and well-being (Monroe and Kelley 1995). Future work may benefit from further specification and standardization of the stressor characteristics associated with adverse environments and their relations to varied outcomes (Monroe and McQuaid 1994). The response perspective suggests a number of avenues to enrich future inquiry. With respect to animal laboratory research, simple singular response indices are being replaced by more complex patterns of behavioral and biological effects. These may range from the immediate ‘exquisitely specific’ neuroendocrine response profiles through long-term structural changes in the central nervous system (Weiner 1992). With
respect to human field studies, retrospective recall of conditions and emotions may be enhanced through such developments as ecological momentary assessments. These procedures target multiple, immediate reports from people in their usual environments. These methods avoid many of the pitfalls of procedures based on the retrospective reconstruction of events and experiences (Stone et al. 1999). Perhaps most promising but challenging is the interactional perspective. Formidable challenges in operationalizing key constructs exist (e.g., appraisal and coping; Monroe and Kelley 1995, Cohen et al. 1995). Yet interesting progress has been made recently, employing sophisticated longitudinal research designs with in-depth, microanalytic techniques evaluating day-to-day transactions between the person and environment (Lazarus 2000). Further work along these lines is essential for differentiating the facts from the fictions in delineating the implications of psychological stress for health and well-being. See also: Coping Across the Lifespan; Job Stress, Coping with; Social Support and Stress; Stress and Coping Theories; Stress and Health Research; Stress at Work; Stress in Organizations, Psychology of; Stress Management Programs; Stress, Neural Basis of
Bibliography
Brown G W, Harris T O (eds.) 1989 Life Events and Illness. Guilford Press, London
Cohen S, Kessler R C, Gordon L U (eds.) 1995 Measuring Stress: A Guide for Health and Social Scientists. Oxford University Press, New York
Dohrenwend B P 1998 Overview of evidence for the importance of adverse environmental conditions in causing psychiatric disorders. In: Dohrenwend B P (ed.) Adversity, Stress, and Psychopathology. Oxford University Press, New York, pp. 523–38
Goldberger L, Breznitz S (eds.) 1993 Handbook of Stress: Theoretical and Clinical Aspects, 2nd edn. The Free Press, New York
Holmes T H, Rahe R H 1967 The social readjustment rating scale. Journal of Psychosomatic Research 11: 213–18
Lazarus R S 2000 Toward better research on stress and coping. American Psychologist 55: 665–73
Lazarus R S, Folkman S 1984 Stress, Appraisal, and Coping. Springer, New York
Monroe S M, Kelley J M 1995 Measurement of stress appraisal. In: Cohen S, Kessler R C, Gordon L U (eds.) Measuring Stress: A Guide for Health and Social Scientists. Oxford University Press, New York, pp. 122–47
Monroe S M, McQuaid J R 1994 Measuring life stress and assessing its impact on mental health. In: Avison W R, Gotlib I H (eds.) Stress and Mental Health: Contemporary Issues and Prospects for the Future. Plenum, New York, pp. 43–73
Rees W L 1976 Stress, distress, disease. British Journal of Psychiatry 128: 3–18
Selye H 1976 The Stress of Life, rev. edn. McGraw-Hill, New York
Sontag S 1978 Illness as Metaphor. Farrar, Straus and Giroux, New York
Stone A A, Shiffman S S, DeVries M W 1999 Ecological momentary assessment. In: Kahneman D, Diener E, Schwarz N (eds.) Well-being: The Foundations of Hedonic Psychology. Russell Sage Foundation, New York
Weiner H 1992 Perturbing the Organism: The Biology of Stressful Experience. The University of Chicago Press, Chicago
S. M. Monroe
Stressful Medical Procedures, Coping with
Many moderately invasive, potentially life-saving procedures, e.g., colonoscopy to examine the colon and remove polyps, can be performed on an outpatient or ‘same-day’ basis. Many patients endure these procedures with minimal expressions of discomfort, while others experience and manifest moderate to high levels of pain and distress. Laboratory and clinical studies designed to assess the effects of psychological preparation in reducing distress have increased our understanding of the processes underlying these individual differences in three ways. First, the pain experience is an integration of a sensory experience and an emotional, distress response; pain is not just a sensation that is proportional to the extent of injury. Second, the type and intensity of the emotional response, e.g., fear-distress, is dependent on the patient’s representation of the experience, i.e., how it is perceived and interpreted, and the perceived availability of a means of controlling it. Finally, the patient’s representation of a procedure is based on information from doctors and nurses and on the ideas, beliefs, memories, and perceptions derived from their own experience and observations of others.
1. Pain is Sensation and Emotion
Pain has often been conceptualized as a sensory process (see Pain, Health Psychology of). Thus, an injury was thought to activate receptors that sent pain messages to the brain, and pain and distress were expected to be directly proportional to the extent of injury (Leventhal and Everhart 1979). Patients who reported or showed greater discomfort than expected in response to an injury or to a treatment were assumed to have personalities that predisposed them to be excessively sensitive to pain signals or to be malingerers. Observations of wounded servicemen during World War II (Beecher 1946) contradicted the sensory model in two ways. The first was the absence of pain and distress when survival was enhanced by active coping. The second was the failure to alleviate pain and distress with analgesics such as morphine, which were supposed to block conduction in sensory nerves. Administration of barbiturates to reduce emotional reactions of fear and anxiety reduced pain-distress, suggesting that emotional reactions were a component of the pain response that could amplify pain reactions. Anatomical and physiological studies as well as behavioral data show that the pain-distress experience is a product of parallel and interacting sensory and emotional systems (Melzack and Wall 1970). Situational factors that increase emotional reactions and amplify distress include the unexpectedness of the noxious sensations, the perception that more serious consequences are yet to follow, and the perception that the sensations are not controllable. Inhibition of the emotion system and minimization of pain-distress occur when the sensations are expected, perceived as benign (implying no serious consequences), and controllable. Two mechanisms appear to be involved in this inhibitory process: the body’s opiates, the endorphins, and direct neural inhibition of pain centers (Strausbaugh and Levine 2000).
2. Distress Reduction in Laboratory Settings
A classic study by Egbert and colleagues (1964) demonstrated that postsurgical distress could be reduced by preparatory information; it triggered both laboratory and clinical studies of preparation. The laboratory studies showed how subtle differences in preparatory information had a substantial impact on the level of pain-distress. For example, subjects about to undergo a cold pressor test kept their hands submerged in the ice water for less time if they were told they were participating in a ‘Vasoconstrictive Pain Study’ (painful label) than if the study was described as a ‘Discomfort Study’ (less painful label; Hirsch and Liebert 1998). The discomfort group also gave lower ratings to their distress. In our earlier studies, subjects told to focus on the benign, sensory experience of the ice water during cold pressor immersion (pins and needles, color of the hand, etc.) experienced less distress and showed an increase in skin temperature in comparison to subjects told the experience would be painful. Distraction often proved ineffective in blocking these noxious experiences unless it was in the same modality as the pain stimulus (auditory; skin sensation).
3. Distress in Response to Invasive Medical Procedures
The study by Egbert and colleagues (1964) and the laboratory studies indicated that a patient’s response to an invasive medical procedure can be shaped by the way their doctors and nurses frame the somatic sensations evoked by the procedure. As Janis and Leventhal pointed out years ago (Janis and Leventhal 1968), the florid symptoms often observed among surgical patients and previously attributed to underlying psychopathology were often a product of the meaning of the situation and the availability of coping reactions. For example, although the painful sensations experienced during procedures such as endoscopy or bronchoscopy are readily attributed to the procedure, their meaning is ambiguous. They could imply that the instrument has caused damage or contacted a cancer, or suggest to the patient that he will become increasingly frightened and will be unable to retain emotional control if the pain lasts much longer. Ambiguity also can extend to post-treatment symptoms, e.g., the pain experienced after coronary bypass surgery can be interpreted as wound pain from a healing incision or as damage to the heart. The implication of the latter is far more threatening and fear-arousing than that of the former. In the absence of information from medical practitioners, the meaning of somatic experience will reflect the patient’s limited knowledge, memories of similar experiences, anecdotes from others, and their worst fears.
Providers can give patients information so that they will experience procedures in a nonthreatening way and minimize pain-distress. Two types of information were given to patients prior to an endoscopic examination and each had beneficial effects: sensory information regarding what they would feel during the examination (e.g., having air pumped into your stomach will make it feel that you have just eaten a large meal) and behavioral plans (e.g., techniques for mouth breathing and swallowing to reduce gagging while swallowing the endoscopic tube). Patients given both types of information were least likely to gag when swallowing the endoscope (36.4 percent) and had lower heart rates, whereas 90 percent of those who received neither type of information gagged. Patients given only one type of information fell in between (Johnson and Leventhal 1974).
Thus, providing patients with information that minimizes the threat value of an experience reduces their anxiety and helps patients to coordinate behavioral procedures with their somatic sensations. The efficacy of this combination is well illustrated in studies of preparation for childbirth. The simple act of monitoring the sensory features of labor contractions lowered postdelivery pain reports in women who had participated in Lamaze classes and were trained in the behavioral responses needed to facilitate childbirth (E. Leventhal et al. 1989). Women who were offered various ways of distracting themselves did not gain any reduction in pain from their childbirth classes. Thus, monitoring increased familiarity with body sensations, ‘… reduced emotional behaviors and lowered the levels of pain, by facilitating the coordination of breathing and pushing with contractions’ (p. 369).
Practitioners seeking to help patients cope with stressful procedures need to be aware of the influence of current levels of pain-distress on retrospective pain reports. For example, chronic pain patients underestimated the pain they had reported on electronic diaries during the prior week if their retrospective reports were given after physical therapy when their current pain was low (Smith and Safer 1993); patients who reported their prior pain before physical therapy overestimated their diary reports. Redelmeier and Kahneman (1996) found that retrospective ratings of the overall painfulness of colonoscopy and lithotripsy were related to the peak level of pain during the procedure and the level of pain experienced at the end of the procedure. The duration of the procedure was unrelated to overall judgments of pain. Thus, memories of pain-distress during medical procedures are biased upward if patients are experiencing much pain when making their judgments, and can be biased downward if the procedure is designed to end at a low level of pain and distress. This bias may be of clinical importance as the recall of pain may affect willingness to undergo future, protective examinations.
There is evidence that interventions to reduce pain-distress are particularly helpful for patients who are most vulnerable to upset. Baron and Logan (1993) proposed that patients who have a high desire for control during a dental (i.e., root canal) procedure (perhaps because they interpret the procedure as threatening), yet perceive themselves as having little control (because they feel incapable of coping with the threat), are highly vulnerable to upset and experience the greatest anticipatory anxiety before treatment and the greatest discomfort during it. When given sensory and procedural information that reframed the procedure in a less toxic way, these patients with high desire for control but low perceived control were no more distressed by their root-canal procedure than other, nonvulnerable patients. Vulnerable patients given only ‘placebo,’ filler information were highly distressed.
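The peak-and-end pattern reported by Redelmeier and Kahneman (1996) can be illustrated with a short sketch. The pain ratings below are invented for illustration, and averaging the peak and the final rating is an assumption made here; the study is described above only as finding that retrospective ratings were related to these two values and unrelated to duration.

```python
def summarize_pain(ratings):
    """Summarize a sequence of real-time pain ratings (e.g., on a 0-10 scale)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    peak = max(ratings)   # worst moment of the procedure
    end = ratings[-1]     # pain level when the procedure stops
    return {
        "peak": peak,
        "end": end,
        # Assumption: a simple mean of peak and end, used only for illustration.
        "peak_end_mean": (peak + end) / 2,
        "total": sum(ratings),     # pain summed over the whole procedure
        "duration": len(ratings),  # number of rating intervals
    }


# Hypothetical ratings, one per minute. The second procedure repeats the
# first and then tapers off, so it lasts longer and ends at a low level.
short_procedure = [2, 4, 8, 8]
extended_procedure = [2, 4, 8, 8, 5, 3, 1]

short = summarize_pain(short_procedure)
extended = summarize_pain(extended_procedure)
print(short)
print(extended)
```

In this made-up case the extended procedure involves more total pain and a longer duration, yet its peak-and-end summary is lower, which is the direction of bias the paragraph above describes for procedures designed to end at a low level of pain.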
An important caveat is whether the factors effective for the reduction of pain-distress in response to brief procedures such as colonoscopy will be effective in reducing pain and distress for prolonged stressors such as the pain and distress experienced following surgery. Informing people about somatic sensations may be ineffective in reducing pain postsurgery (Johnston and Vogele 1993) unless the information provides clear and believable nonthreatening interpretations of these somatic events. Recent work on recovery from myocardial infarction suggests that this is critical. Behavioral techniques for distress control seem to be effective in long-term stress settings if the control is effective. For example, hospitalized patients allowed to self-administer analgesics use less analgesic and report less pain.
4. Summary
The pain and distress experienced during invasive procedures can be modified by effective preparation. The key constituents of such preparation appear to be nonthreatening interpretations of the sensory cues evoked during the procedure and strategies for effective situational control linked to these cues. This combination appears to be effective for regulating the emotional component of the pain-distress response. Individual differences, however, can modify the effectiveness of behavioral preparation; patients lacking a sense of efficacy in controlling either the situation or their own affective reactions may benefit more when they see trusted others in control of the procedure. Preparation may also have different effects on different outcomes, some reducing pain experience, others physiological reactions, and others the time for completing the procedure. Though studies are needed to identify the mechanisms underlying the control of pain-distress during noxious medical procedures (Strausbaugh and Levine 2000), it is clear that patients can benefit from information. A humane system will provide information with regard to what patients will experience and help them to interpret these experiences as an expected part of a healing process. It will also provide behavioral procedures to facilitate the control of these noxious sensations and their associated affects.
See also: Control Beliefs: Health Perspectives; Pain, Management of; Pain, Neural Basis of; Social Support and Recovery from Disease and Medical Procedures; Stress Management Programs
Bibliography
Baron R S, Logan H 1993 Desired control, felt control, and dental pain: Recent findings and remaining issues. Motivation and Emotion 17(3): 181–204
Beecher H K 1946 Pain in men wounded in battle. Annals of Surgery 123: 96–105
Egbert L D, Battit G E, Welch C E, Bartlett M K 1964 Reduction of postoperative pain by encouragement and instruction of patients. New England Journal of Medicine 270: 825–7
Hirsch M S, Liebert R M 1998 The physical and psychological experience of pain: The effects of labeling and cold pressor temperature on three pain measures in college women. Pain 77(1): 41–8
Janis I L, Leventhal H 1968 Human reactions to stress. In: Borgatta E, Lambert W (eds.) Handbook of Personality Theory and Research. Rand McNally, Chicago, pp. 1041–85
Johnson J E, Leventhal H 1974 Effects of accurate expectations and behavioral instructions on reactions during a noxious medical examination. Journal of Personality and Social Psychology 29(5): 710–8
Johnston M, Vogele C 1993 Benefits of psychological preparation for surgery: A meta-analysis. Annals of Behavioral Medicine 15(4): 245–56
Leventhal E A, Leventhal H, Shacham S, Easterling D V 1989 Active coping reduces reports of pain from childbirth. Journal of Consulting and Clinical Psychology 57(3): 365–71
Leventhal H, Everhart D 1979 Emotion, pain, and physical illness. In: Izard C E (ed.) Emotions in Personality and Psychopathology. Plenum, New York, pp. 263–99
Melzack R, Wall P D 1970 Psychophysiology of pain. The International Anesthesiology Clinics 8: 3–34
Redelmeier D A, Kahneman D 1996 Patients’ memories of painful medical treatments: Real-time and retrospective evaluations of two minimally invasive procedures. Pain 66: 3–8
Smith W B, Safer M A 1993 Effects of present pain level on recall of chronic pain and medication use. Pain 55: 355–61
Strausbaugh H J, Levine J D 2000 Pain. In: Fink G (ed.) Encyclopedia of Stress. Academic Press, San Diego, CA, Vol. 3, pp. 115–8
H. Leventhal
Strikes: Sociological Aspects
The withholding of labor is known to have occurred during antiquity, the Middle Ages, and the Renaissance (Crouzel 1887). It was not, however, until the spread of wage labor under industrial capitalism that strikes became a routine expression of workers’ claims. The very words by which we know the phenomenon in many European languages (‘strike’ in English, with its derivatives: German streik, Dutch strijk, and Swedish strejk, the French grèves, the Spanish huelga, the Italian sciopero) reveal a mid- to late nineteenth century origin. Only in English did expressions such as strike, strike tools, or strike of work become increasingly common from the mid-eighteenth century onward, keeping pace with the industrial revolution. ‘This day the hatters struck and refused to work till their wages are raised’ we read in the Annual Register on 9 May 1768. By 1793, the expression must have been common enough in England for G. Dyer to write: ‘The poor … seldom strike, as it is called, without good reasons … The colliers had struck for more wages’ (The Oxford English Dictionary, entry for ‘strike’). In French, we have to wait until 1805 to come across the first use of the word grève in its meaning of withholding of labor, by extension of a much earlier meaning of the word as ‘unemployed’ or ‘without work.’ But it is not until 1845–1848 that the new meaning became common (Le Grand Robert de la Langue Française, grève). The Italian sciopero and the Spanish huelga did not become part of the daily vocabulary until the latter half of the nineteenth century, again by adapting a word used for centuries to refer to unemployment and lack of work (Deli-Dizionario Etimologico della Lingua Italiana; Diccionario Crítico Etimológico Castellano e Hispánico).
1. Strike Statistics
Systematic collection and publication of strike data started in most countries between 1870 and 1900. Until then, strike data were kept secret in the hands of Ministries of the Interior. After all, strikes were viewed as crimes under Combination Acts (Perrot 1974). Only when the conception of strikes shifted from one of crime to one of ‘social illness’ did strikes become the concern of specialized Labor Ministries. It is these Ministries that initiated systematic recording and publishing of strike data and that standardized definitions and collecting procedures. Early statistics provided rich information on each individual strike: the immediate causes of strikes, their outcomes, their organizational bases, the occurrences of violent incidents, and the number of establishments affected by any given strike. There is none of that richness in today’s strike statistics. Furthermore, strike data are highly aggregated (Fisher 1973). Typically, they do not go much beyond three aggregate measures of strike activity: number of strikes (S), number of workers involved (W), and number of hours lost (H). On the basis of these three indicators, several different measures have been constructed in the literature (Forcheimer 1948, Knowles 1952, Shorter and Tilly 1974, Hibbs 1976): frequency, or number of strikes per thousand employed workers; size, or average number of strikers per strike (W/S); duration, or average number of hours lost per striker (H/W); volume, defined as frequency × size × duration = (S/thousand employed workers) × (W/S) × (H/W) = (H/thousand employed workers); and Galambos and Evans’s composite strike index, based on an unweighted average of S, W, and H (Galambos and Evans 1966). Graphically, these different components of strike activity can be represented in a three-dimensional space as a parallelepiped (the ‘shape’ of strikes) whose sides are given by the size, duration, and frequency of strikes and whose volume corresponds to the product of the three sides (Shorter and Tilly 1974, Hibbs 1976, Franzosi 1995).
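As a worked illustration of these measures, the following sketch computes frequency, size, duration, and volume from the three aggregate indicators S, W, and H. The class name, the function name, and all the figures are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class StrikeTotals:
    """Yearly aggregate strike indicators (all figures here are hypothetical)."""
    strikes: int                # S: number of strikes
    workers_involved: int       # W: number of workers involved
    hours_lost: float           # H: number of hours lost
    employed_thousands: float   # employed workers, in thousands


def strike_shape(totals):
    """Return frequency, size, duration, and volume as defined in the text."""
    if min(totals.strikes, totals.workers_involved, totals.employed_thousands) <= 0:
        raise ValueError("S, W, and employment must be positive")
    frequency = totals.strikes / totals.employed_thousands   # S per thousand employed
    size = totals.workers_involved / totals.strikes          # W/S
    duration = totals.hours_lost / totals.workers_involved   # H/W
    volume = frequency * size * duration                     # H per thousand employed
    return {"frequency": frequency, "size": size,
            "duration": duration, "volume": volume}


# Hypothetical year: 200 strikes, 50,000 strikers, 400,000 hours lost,
# in an economy with one million employed workers.
year = StrikeTotals(strikes=200, workers_involved=50_000,
                    hours_lost=400_000, employed_thousands=1_000)
shape = strike_shape(year)
print(shape)
```

Because S and W cancel in the product, the volume equals H divided by thousands of employed workers, so two years with very different frequency, size, and duration can have the same volume.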
2. Data Reliability
The reliability of strike statistics varies from country to country, but it is generally low (Shalev 1978). Some countries carry out data collection through specialized offices within Departments of Labor; others rely on police departments (e.g., Italy). Some use newspapers as sources of data; others rely on self-reporting, and still others base their statistics on police records. Different countries adopt different definitions of strikes, leading to difficulties in cross-national comparisons. It is generally difficult to gauge the bias of strike statistics under any data collection arrangement. Strike size can be just as easily overestimated as underestimated. If employers report strike size, they may be inclined to play down the strike to embarrass the unions or to portray a public image of peaceful labor relations (particularly vis-à-vis clients), but they may also wish to magnify the strike in order to justify future retaliatory actions or police/government intervention, or to contest the imposition of penalties by clients for late delivery. Union-reported figures are more likely to be overestimated in a show of organizational strength. If police offices themselves estimate the size of the strike, the figure may be no more than pure guesswork. Strike frequency, on the other hand, is likely to be greatly underestimated regardless of the efficiency of collection: small, short, shop-level spontaneous work stoppages, which represent the bulk of all strike occurrences, probably escape recording, as scores of field investigations have shown (Turner et al. 1967, Korpi 1978, 1981, Batstone et al. 1978).
Bias in the data is not without consequences. When single strikes across a city or an industry are recorded separately, the number of strikes is inflated but the average strike size is reduced. To the extent that strike size is related to the workers’ organizational capacities, a bias in this measure would misrepresent the organizational basis of strikes (Batstone et al. 1978, p. 22). Under-reporting of small, plant-level strikes obscures the role of informal leadership and the existence of diffused, grass-roots discontent, often directed against official unions as much as against employers (Korpi 1981). The systematic lack of information on strike tactics (e.g., walk-out, sit-down, checker-board, with demonstrations through the shop, marches to city hall), and on the broader spectrum of working-class actions (such as slowdowns, work to the rule, overtime bans, rallies, demonstrations), fundamentally biases our picture of the nature and extent of working-class mobilization. In any case, exclusive emphasis on what workers do and on just one type of action (the strike) distorts the broader social relations in which strikes occur. ‘It takes two to tango,’ Franzosi argues, ‘it takes at least two to fight’ (Franzosi 1995, pp. 15–18).
Strikes emerged in the nineteenth century out of the incessant interaction among three main actors (workers, employers, and the state); it is this interaction that ultimately defined the rules and the acceptable forms of protest (Tilly 1978, p. 161). It is not possible to understand what workers do in isolation from what employers and the state do.
Data aggregation adds problems of its own. There is no way of telling which combination of strike frequency and duration produces any given figure of number of hours lost (Batstone et al. 1978, p. 23). A yearly total of 100,000 strikers could refer to 100,000 different workers or to the same 10,000 going on strike 10 different times (Batstone et al. 1978, p. 21). ‘Average’ measures, such as size and duration, may mask wide differences in the distribution of actual strike sizes and durations, in particular when used for international comparisons. The same ‘average’ size, in fact, could be the result of a few large, official strikes for the renewal of industry-wide collective agreements and a large number of small, plant-level strikes (as in the Swedish or the Italian case), or of a great many intermediate-size strikes, as in the United States, where plant-level bargaining is the rule but workforce concentration is higher (Edwards 1981, p. 231). The same ‘volume’ can mask marked differences in each of its constituent components. Standardized indicators, such as frequency, have been computed using different deflating procedures (total civilian labor force, nonagricultural labor force, unionized labor force, employment figures, etc.), leading to an unnecessary proliferation of strike indicators that stand in the way of comparability of findings (Stern 1978).
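The aggregation problems described above can be made concrete with a small sketch. All figures are invented for illustration: two sets of strike sizes are constructed to share the same average, and two sets of strike events are constructed to share the same yearly total of strikers.

```python
from statistics import mean, median

# Hypothetical strike sizes (strikers per strike) for two countries.
# Country A: two very large official strikes plus many small plant-level ones.
few_large_many_small = [50_000, 50_000] + [100] * 98
# Country B: one hundred strikes of intermediate size.
intermediate = [1_098] * 100

print(mean(few_large_many_small), median(few_large_many_small))
print(mean(intermediate), median(intermediate))


def strikers_and_distinct(events):
    """Return (total strikers counted, distinct workers) for a list of strikes.

    Each strike is represented as a set of hypothetical worker ids.
    """
    total = sum(len(event) for event in events)
    distinct = len(set().union(*events))
    return total, distinct


# One strike by 100,000 different workers ...
one_big_strike = [set(range(100_000))]
# ... versus the same 10,000 workers striking 10 different times.
repeated_strikes = [set(range(10_000)) for _ in range(10)]

print(strikers_and_distinct(one_big_strike))
print(strikers_and_distinct(repeated_strikes))
```

Both countries show the same average strike size although the typical (median) strike differs by an order of magnitude, and both sets of events yield 100,000 strikers in the yearly total although the number of workers who actually struck differs tenfold.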
3. Patterns of Strikes: What we Know
The ready availability of strike data for over a century has resulted in a rich scholarly tradition of (mostly quantitative) research on strikes, focusing on the interindustry propensity to strike (e.g., Kerr and Siegel 1954), but notably on the temporal dynamics of strikes. From as early as the beginning of the 20th century there has been a concern with the relationship between strikes and the short- and medium-term fluctuations in the level of economic activity. Several statistical studies have shown that strike frequency follows the business cycle and the movement of unemployment in particular: the higher the level of unemployment, the lower the number of strikes (see the seminal work by Ashenfelter and Johnson 1969). As American historian David Montgomery wrote: ‘Large scale strikes erupted [in the early 1900s in America] whenever the level of unemployment fell off sufficiently to give the strikers a ghost of a chance of success’ (Montgomery 1980, p. 94). Indeed, the dependence of strike frequency on the business cycle is one of the clearest findings of econometric strike research. Yet, economic theories of strikes have not gone unchallenged. For one thing, economic determinants of strikes can be statistically significant without being the main or exclusive influence (Vanderkamp 1970). Furthermore, while there is strong statistical evidence in favor of the relationship between strike frequency and the business cycle, this is not so for the other strike measures: size, duration, and volume. While strike frequency is negatively related to the level of unemployment, there is evidence supporting a positive relationship for duration (e.g., Turner et al. 1967). If economic determinants seem to play a significant role only in frequency, and each strike component measures a different aspect of strike activity, what are the determinants of strike size, duration, and volume?
No doubt, it is workers’ organizational capacities that have most deeply affected the overall shape of strikes, size and duration in particular. As early as 1911–13, French economist March showed the close statistical connection between strike size and unionization. But it is Shorter and Tilly (1974) who most systematically investigated the effect of organization on strike shape in what to date remains a landmark in strike research. For one thing, national, large-scale strikes are only possible in the presence of national, large-scale labor organizations. But even at the local level, the organizational basis of trade unions helps to bring out on strike a larger portion of the workforce, leading to larger average strike sizes. Labor organizations have also contributed to reducing the average strike duration. The rise of labor organizations has typically shifted the timing of strikes from economically ‘bad’ times to ‘good’ times. Through organization, as Hobsbawm put it, industrial workers learned ‘the common sense of demanding concessions when conditions are favorable, not when hunger suggests’ (Hobsbawm 1964, pp. 126, 144). By timing strikes during periods of prosperity, unions also contributed to shortening the average duration of strikes. In fact, strikes are more likely to be long and drawn-out during periods of recession, when employers are under no pressure to seek a quick settlement. Changes in the timing of strikes and the consequent shortening of their average duration have also brought about changes in the likelihood of successful settlements and of the occurrence of violence during a strike. Long disputes are likely to be unsuccessful for two reasons: (a) because they tend to occur in economically unfavorable times and (b) because they have a greater likelihood of being violent (and violent strikes decrease the chances of success) (Snyder and Kelly 1976). Yet the statistical significance of a unionization variable in econometric models of strike size hardly offers a causal explanation. After all, the reverse causal order has produced statistically significant results in union growth models, where unionization becomes the ‘dependent’ variable and number of strikers the ‘independent’ one (Ashenfelter and Pencavel 1969, Bain and Elsheikh 1976). In the end, we may have done no more than observe a strong correlation (Goetz-Girey 1965, pp. 146–7, Edwards 1981, p. 9).
The exclusive reliance on highly aggregated strike and unionization figures in econometric work has obscured the causal mechanisms in the relationship between these different aspects of mobilization processes; it has also exaggerated the role of formal organization at the expense of informal shop-floor leadership (Batstone et al. 1978). Indeed, the use of temporally disaggregated firm-level data shows how rises in unionization follow (rather than precede) surges in the level of strikes, successful strikes in particular (Franzosi 1995, pp. 124–34).
The political position of labor in national power structures has also deeply affected the shape of strikes. Both strike volumes (Hibbs 1978) and strike sizes (Korpi and Shalev 1980) have gone down in countries where labor-oriented social-democratic parties acquired stable and lasting control over governments, such as in the Scandinavian social democracies and in Austria (Korpi and Shalev 1980, p. 320). Direct access to political power offers labor alternative and less costly means to achieve a more favorable distribution of resources: the government machinery itself, rather than the strike (Shorter and Tilly 1974, Hibbs 1976, Korpi and Shalev 1980). When the working class has control of the government, the locus of conflict over the distribution of resources, national income in particular, shifts from the labor market and the private sector, where strike activity is the typical means of pressure, to the public sector, where political exchange prevails. Short-lived labor governments, on the other hand, or outright exclusion from the government of communist parties, are associated with higher, rather than lower, levels of strike activity (Hibbs 1976, Paldam and Pedersen 1982). Franzosi’s work on postwar Italy shows that the ‘historic compromise’ strategy of the Italian Communist Party drastically changed the shape of strikes between 1975 and 1978 during the Governments of National Solidarity (Franzosi 1995).
4. The Withering Away of Strikes and Cycles of Strikes
In the 1950s arguments about the ‘embourgeoisement’ of the working class and the ‘end of ideology’ were popular. Postwar prosperity and the institutional mechanisms for conflict resolution and collective bargaining, the arguments went, had resulted in permanent industrial peace, ‘the withering away of the strike, the virtual disappearance of industrial conflict in numerous countries’ (Ross and Hartman 1960, p. 6, Dahrendorf 1959, pp. 257–79, Kornhauser 1954, pp. 524–5). History was to prove these arguments wrong, with the resurgence of conflict in the 1960s to unprecedented levels in those same numerous countries (Crouch and Pizzorno 1978). But once again the ‘virtual disappearance’ of strikes in the 1980s and 1990s rekindled the debate on the withering away of strikes, with a novel, and perhaps more powerful, trump card: the working class itself has now virtually disappeared, with the drastic shrinking of manufacturing and the soaring of services. Since service workers are largely white collar, with a high percentage of women (both groups being traditionally little strike prone), changes in the occupational structure should result in the ‘withering away of strikes’ once and for all.
These arguments are compelling. No doubt the change in the location of conflict from industry to services has deeply changed the nature of labor conflicts. For one thing, it has changed the very nature of the two traditional actors involved in labor conflicts: workers and employers. The typical worker is no longer male and blue-collar, and the typical employer is public rather than private: the state has become the largest service employer in most Western countries (for comparative evidence, see Tiziano 1987). Historically, the state intervened in labor disputes indirectly, regulating conflict via legislative means, only rarely taking a more direct role (either to repress or to arbitrate). As employer, the state cannot escape being drawn directly into service disputes as a principal bargaining actor. Second, the ‘essential’ nature of many services (transportation, health care, etc.) and the monopoly conditions under which most services are provided have dramatically increased strikers’ disruptive power (i.e., their ability to inflict severe losses on the counterpart at minimal cost to strikers). The more centralized and monopolistic a service is, the larger the number of users affected by a service strike. Disruption is always intrinsic to service disputes, regardless of the intentions of the actors. The law-and-order potential of highly disruptive strikes further draws the state into service disputes, even when these disputes are located in the private sector.
But from here to argue in favor of the ‘withering away of strikes’ is to ignore fundamental characteristics of strikes. First, strikes, as we have seen, are closely linked to wage labor. The nature of employment (wage labor) does not fundamentally change with changes in the occupational structure. Technicians, teachers and college professors, nurses and doctors, pilots and air controllers have shown themselves to be as willing to go on strike as blue-collar workers. Second, strikes are a cyclical phenomenon. Strike research has made that quite clear (Franzosi 1995, pp. 345–8). Short-term seasonal fluctuations weave in and out of medium-term fluctuations of some years’ duration, mostly related to the business cycle, which in turn zig-zag around long-term fluctuations of some 40–50 years in length. Research on these long-term cycles of strikes suggests that these ‘strike waves’ are linked to the long-term movements of capitalist economies (the so-called Kondratieff cycles; Mandel 1980, Cronin 1980).
During wave years the characteristics of strikes are different from those in surrounding years: strikes tend to be longer; they involve a higher proportion of the workforce, both at the plant and at the industry level, and a higher number of establishments (Hobsbawm 1964, p. 127; Shorter and Tilly 1974, pp. 142–6; Franzosi 1995). Strike tactics become more radical and innovative during strike waves, being mainly plant-based and in the hands of informal leaders, outside the control of the union and the institutionalized rules of the game. Demands become radicalized as well. Strike waves also often bring about legislative reforms (Shorter and Tilly 1974, p. 145; Franzosi 1995, pp. 302–6). Finally, the resurgence of conflict after long periods of peace is typically sudden and overwhelming. It takes employers, the state, and even the unions by surprise (witness the French May of 1968 or the Italian autunno caldo of 1969). If these theories are correct, the decline in strike activity in recent decades is only the result of a long-term cycle rather than a permanent downward trend leading to the withering away of strikes. If the past is anything to go by, the mobilization process will catch us by surprise with its sudden, sharp, and sweeping bouts of conflict: 1840s, 1880s, 1920s, 1960s … we may not have to wait long to be surprised.
See also: Conflict Sociology; Industrial Relations and Collective Bargaining; Industrial Relations, History of; Industrial Sociology; Labor History; Labor Movements, History of; Labor Unions; Work and Labor: History of the Concept; Work, Sociology of; Working Classes, History of
Bibliography
Ashenfelter O, Johnson G E 1969 Bargaining theory, trade unions, and industrial strike activity. American Economic Review 59(40): 35–49
Ashenfelter O, Pencavel J H 1969 American trade union growth: 1900–1960. Quarterly Journal of Economics 83: 434–48
Bain G S, Elsheikh F 1976 Union Growth and the Business Cycle. Blackwell, Oxford, UK
Batstone E, Boraston I, Frenkel S 1978 The Social Organization of Strikes. Blackwell, Oxford, UK
Cronin J E 1980 Stages, cycles and insurgencies: The economics of unrest. In: Hopkins T K, Wallerstein I (eds.) Processes of the World-system. Sage, Beverly Hills, CA, pp. 101–18
Crouch C, Pizzorno A (eds.) 1978 The Resurgence of Class Conflict in Western Europe Since 1968. Holmes and Meier, New York
Crouzel A 1887 Étude historique, économique, juridique sur les coalitions et les grèves dans l’industrie. A. Rousseau, Paris
Dahrendorf R 1959 Class and Class Conflict in Industrial Society. Stanford University Press, Stanford, CA
Edwards P K 1981 Strikes in the United States 1881–1974. Blackwell, Oxford, UK
Fisher M 1973 Measurement of Labour Disputes and Their Economic Effects. OECD, Paris
Forcheimer K 1948 Some international aspects of the strike movement. Bulletin of the Oxford Institute of Statistics 10: 9–24
Franzosi R 1995 The Puzzle of Strikes: Class and State Strategies in Postwar Italy. Cambridge University Press, Cambridge, UK
Galambos P, Evans E W 1966 Work-stoppages in the United Kingdom, 1951–1964: A quantitative study. Bulletin of the Oxford University Institute of Economics and Statistics 28: 33–55
Goetz-Girey R 1965 Le mouvement des grèves en France. Editions Sirey, Paris
Hibbs D Jr 1976 Industrial conflict in advanced industrial societies. American Political Science Review 70(4): 1033–58
Hibbs D Jr 1978 On the political economy of long run trends in strike activity. British Journal of Political Science 8(2): 153–75
Hobsbawm E 1964 Labouring Men. Basic Books, New York
Kerr C, Siegel A 1954 The interindustry propensity to strike. An international comparison. In: Kornhauser A, Dubin R, Ross A (eds.) Industrial Conflict. McGraw-Hill, New York, pp. 189–212
Knowles K G J C 1952 Strikes: A Study in Industrial Conflict. Blackwell, Oxford, UK
Kornhauser A 1954 The undetermined future of industrial conflict. In: Kornhauser A, Dubin R, Ross A (eds.) Industrial Conflict. McGraw-Hill, New York, pp. 519–26
Korpi W 1978 The Working Class in Welfare Capitalism. Routledge and Kegan Paul, London
Korpi W 1981 Unofficial strikes in Sweden. British Journal of Industrial Relations 19(1): 66–86
Korpi W, Shalev M 1980 Strikes, power, and politics in the Western nations, 1900–1976. In: Zeitlin M (ed.) Political Power and Social Theory. JAI Press, Greenwich, CT, Vol. 1, pp. 301–34
Mandel E 1980 Long Waves in Capitalist Development. Cambridge University Press, Cambridge, UK
Montgomery D 1980 Strikes in nineteenth-century America. Social Science History 4(1): 81–104
Paldam M, Pedersen P 1982 The macroeconomic strike model: A study of seventeen countries, 1948–1975. Industrial and Labor Relations Review 35(4): 504–21
Perrot M 1974 Les ouvriers en grève (France 1871–1890). Mouton, Paris
Ross A H, Hartman P T 1960 Changing Patterns of Industrial Conflict. Wiley, New York
Shalev M 1978 Lies, damned lies, and strike statistics: The measurement of trends in industrial conflicts. In: Crouch C, Pizzorno A (eds.) The Resurgence of Class Conflict in Western Europe Since 1968. Volume 1. National Studies. Holmes and Meier, New York, pp. 1–20
Shorter E, Tilly C 1974 Strikes in France, 1830–1968. Cambridge University Press, Cambridge, UK
Snyder D, Kelly W R 1976 Industrial violence in Italy, 1878–1902. American Journal of Sociology 82(1): 131–62
Stern R N 1978 Methodological issues in quantitative strike analysis. Industrial Relations 17(1): 32–42
Tilly C 1978 From Mobilization to Revolution. Addison-Wesley, Reading, MA
Tiziano T (ed.) 1987 Public Service Labour Relations: Recent Trends and Future Prospects. International Labour Organisation, Geneva
Turner H A, Clack G, Roberts G 1967 Labour Relations in the Motor Industry. Allen & Unwin, London
Vanderkamp J 1970 Economic activity and strike activity. Industrial Relations (University of California, Berkeley) 9(2): 215–30
R. Franzosi
Strong Program, in Sociology of Scientific Knowledge The ‘Strong Program’ is a set of methodological claims about the proper way to conduct sociological enquiry into the nature of knowledge, including scientific knowledge. The program enjoins the investigator to adopt four requirements, which may be labeled, respectively, (a) causality, (b) impartiality, (c) symmetry, and (d) reflexivity. ‘Causality’ means that the aim is not merely descriptive or interpretive but also explanatory. The concern is with what brings about and sustains a body of knowledge. ‘Impartiality’ means that the investigator disregards his or her sense
of whether the beliefs under scrutiny are true or false, or rational or irrational. Equal curiosity should attend both sides of the evaluative divide. ‘Symmetry’ means that this equally distributed curiosity should issue in the same general kinds of sociological explanation. All beliefs confront the same general problem of credibility and depend on the same contingencies to solve the problem. True beliefs have no more intrinsic credibility than false ones. Finally, ‘reflexivity’ means that sociologists of knowledge must not claim access to any transcendent standpoint, or engage in special pleading, to justify their own claims. Put simply: no sociological theory of knowledge is acceptable unless it applies to itself. These requirements all stem from the desire that the approach should itself be ‘scientific’ in spirit, although it is widely perceived by critics as antiscientific. To offset such misunderstandings each of the points calls for amplification.
1. Causality The commitment to causality does not mean that only sociological causes are at work. It is taken for granted that belief systems are the achievements of biological entities that interact with one another and their material environment. People have their eyes open, and sensory inputs influence individual and collective belief. Unless they had effective sense organs and some innate cognitive propensities, they could neither create nor transmit a shared culture. Knowledge cannot be ‘merely’ or ‘purely’ social. Sociologists can therefore only contribute partial explanations of the overall phenomenon of cognition. They must work alongside, say, sensory psychologists—but no adequate account of human knowledge can ever proceed without due acknowledgment of the role of the social. The ‘strength’ of the program lies in the necessity, not the sufficiency, of the social dimension. The idea that material reality plays little or no role in science is a form of philosophical ‘idealism.’ Its association with the ‘Strong Program’ has been solely through the imputations of critics. For example, Latour (1993) says the program is one of a family of theories based on the idea that knowledge has two ingredients (‘nature’ and ‘society’) which trade-off against one another—the more of one meaning the less of the other. Within this framework the ‘Strong Program’ is said to be an extreme case where one of the ingredients, nature, actually makes a nil contribution. The strength of the program is thus wrongly said to reside in the claim that knowledge (and indeed reality) is purely social. This false view of the ‘Strong Program’ is widespread. The sociologist Cole (1992) who himself accepts the zero-sum picture, seeks to distance himself from the program by insisting that nature plays some role rather than no role. He suggests that it is an
empirical question to decide just how much influence is exerted by nature by comparison to that exerted by society. A variant of the error also underlies Gottfried and Wilson’s (1997) article in Nature. They claim that the program is incompatible with science having a good record of predictive success. The allegation appears to rest on the idea that if scientists were only attending to what is going on in society, and were not attending to what is happening in nature, then their ability to make predictions would be unintelligible. That, however, is not the ‘Strong Program’s’ claim. The real claim of the ‘Strong Program’ is that scientists understand nature through and with society, not in spite of it. Society is the vehicle of understanding, not a screen or distraction. Society does not trade off against nature; rather, it enables scientific access to it. Society no more interferes with knowledge than the eye interferes with vision. They both operate as the relevant organ of cognition and as a necessary constituent of it. No one asks ‘how much’ of our visual experience is contributed by the eye, and ‘how much’ by the object seen. (We might ask ‘what’ is contributed, but not ‘how much.’) The question makes no sense, and in this framework neither does Cole’s allegedly ‘empirical’ research program to find out how much of knowledge depends on nature and how much on society. How do social processes enable us to have scientific access to the world? What manner of ‘vehicle’ or ‘organ’ does society provide? The answer given by Kuhn (1962) illustrates the general case. Cognition within a group is coordinated by reference to a shared exemplar or paradigm, that is, a concrete achievement on which activity is modeled. By virtue of their orientation towards that achievement, and their knowledge of its standing in the eyes of others, members of a group endow it with a normative status.
Something is constituted as a paradigm in the way that money is constituted as money by being regarded as money, and a leader by being seen as and known as a leader. In this way society gets woven into the very fabric of knowledge and into the struggle to make sense of intransigent, real-world processes. Establishing a paradigm produces a qualitative change in the mode of cognition from mere individual belief to a collective relation to the world that we can recognize as scientific. Without a paradigm there may be scientists, but there is something less than science.
2. Impartiality and Symmetry The reference to Kuhn puts us in a better position to understand impartiality and symmetry. He did not argue that the idea of a paradigm merely helps us make sense of old theories about natural processes, but no longer applies to current or correct theories. On the contrary, he applied the idea to all cases, that is, impartially and symmetrically, because he was searching
for invariant principles. Every shared approach to nature must overcome the problem that our experience can be given more than one interpretation. The shared interpretation has to be sustained in the face of continuing difficulty, anomaly, and the threat of fragmentation into subjective opinion. The requirements of impartiality and symmetry face a major obstacle because they run counter to some aspects of common sense. In general, our curiosity has a markedly partial and asymmetric structure. We typically confine the request for causal explanation to the breakdowns in our routine expectations. When things go smoothly we do not ask why, because nothing more needs to be said. If nothing needs to be said, perhaps nothing can be said, and we may be tempted to think that causality does not apply, or that qualitatively different kinds of causes are in operation. Thus, some philosophers have argued that while illusory perception calls for causal explanation, veridical perception does not. No professional psychologist would speak like this—just as no medical scientist would say there are causes for disease but no causes for health, and no engineer would say there are causes for why bridges collapse but none for why they stay up. If psychologists and other scientists can restructure their professional curiosity, and give it a locally symmetrical character, then this is also an appropriate posture for sociologists. A deeper problem for symmetry comes from our capacity to follow rules. Here is the argument. Everyone accepts that the rules of a game, such as chess, are conventional and hence social. But once the rules have been agreed the question of whether or not a move conforms to the rule seems to be nonconventional. It is a matter of conceptual match or logical consistency and inconsistency. More generally, the meaning of any concept may be a matter of convention, but the question of whether an object falls under that concept seems wholly different. 
Meaning is conventional; truth is not. A sociologist may legitimately try to explain the choice of rules and definitions, but not, surely, matters of consistency and truth. The former is a question of causation; the latter belongs to the realm of logic and rationality. Insofar as we follow rules and respond to considerations of meaning and truth our behavior must be explained rationally, not sociologically. So, once again, asymmetry asserts itself. This argument is widely accepted, but it may be challenged on the grounds that, in reality, every single act of concept application is problematic and negotiable. Not every single act is negotiated (in practice they are often routine), but as the history of science, logic, and mathematics shows us, they are negotiable. Past applications of a concept do not serve to fix meanings and thereby determine in advance the norms of proper concept application or rule following. The move to the next case is always problematic. This position, known as ‘meaning finitism,’ follows from the determined effort to adhere to the causal and scientific perspective. It marks an important point of contact between the ‘Strong Program’ and the later philosophy of Wittgenstein which centered on the analysis of rule following and the problematic character of ‘sameness.’
3. Reflexivity This requirement immediately raises the issue of relativism and self-refutation. If sociologists’ own claims are socially determined and relative, why believe what they say? That the program is relativist is beyond question, but this can be seen as a strength not a weakness. Relativism (which should not be confused with idealism) is the opposite of absolutism and implies that there are no absolute justifications for any knowledge claim. All justifications end in something unjustified and merely taken for granted. But this does not mean that there is never any reason to accept what anyone says, as if it were all an ideological smokescreen. It just means that in the end reasons are local and contingent. Such an admission is not self-defeating. To draw that conclusion would be to argue in a circle from the premise that the only real reasons are absolute ones, thus begging the question on behalf of absolutism. Rather, it is the attempt to produce absolute justifications that should be viewed with suspicion.
4. Conclusion The ‘Strong Program’ was formulated by an interdisciplinary group in Edinburgh in the early 1970s (see Barnes 1974, Bloor 1976). The immediate aim was to codify and clarify an emerging body of case-studies, particularly by historians of science (see Shapin 1982). The real life-blood of the sociology of scientific knowledge lies with such empirical work, for example, to name but a small sample: Mackenzie (1981) on the history of statistics and its connections with eugenics, Shapin and Schaffer (1985) on the disputes surrounding Boyle’s experiments with the air-pump, Harwood (1993) on styles of research in German genetics before 1933, and Kusch (1999) on the different theories of mind generated in different psychological laboratories in the early years of the century. The program continues to serve its function, but its rich contributions to specific historical and sociological projects have been obscured to some degree by criticism of its philosophical foundations. See also: Campbell, Donald Thomas (1916–96); Empiricism, History of; Experiment, in Science and Technology Studies; Experimentation in Psychology, History of; Kuhn, Thomas S (1922–96); Popper, Karl Raimund (1902–94); Reflexivity, in Science and Technology Studies; Reflexivity: Method and Evidence; Scientific Knowledge, Sociology of; Truth and Credibility: Science and the Social Study of Science; Truth, Verification, Verisimilitude, and Evidence: Philosophical Aspects
Bibliography
Barnes B 1974 Scientific Knowledge and Sociological Theory. Routledge, London
Bloor D 1976 Knowledge and Social Imagery. Routledge, London
Cole S 1992 Making Science. Between Nature and Society. Harvard University Press, Cambridge, MA
Gottfried K, Wilson K G 1997 Science as a cultural construct. Nature 386: 545–7
Harwood J 1993 Styles of Scientific Thought. The German Genetics Community 1900–1933. University of Chicago Press, Chicago
Kuhn T S 1962 The Structure of Scientific Revolutions. Chicago University Press, Chicago
Kusch M 1999 Psychological Knowledge. A Social History and Philosophy. Routledge, London
Latour B 1993 We Have Never Been Modern. Harvard University Press, Cambridge, MA
Mackenzie D A 1981 Statistics in Britain 1865–1930. Edinburgh University Press, Edinburgh, UK
Shapin S 1982 History of science and its social reconstructions. History of Science 20: 157–211
Shapin S, Schaffer S 1985 Leviathan and the Air Pump. Hobbes, Boyle and the Experimental Life. Princeton University Press, Princeton, NJ
D. Bloor
Structural Contingency Theory An effective organizational structure provides the coordination necessary for people to work together so that an organization can perform at a high level. There is no ‘one best way,’ that is, one structural type that makes all organizations perform highly. On the contrary, the effect of an organizational structure on performance is contingent upon whether it fits certain factors called the contingency factors. An effective structure for an organization is one that fits the level of its contingency factors. Research has shown that there are three main contingency factors: size, task uncertainty, and diversification. In overview, increasing size requires a more bureaucratic structure, task uncertainty requires reducing the degree of bureaucratization, and diversification requires a divisional or a matrix structure.
1. Bureaucratic Structure and its Contingencies The structure of an organization includes how the work of the organization is divided up and allocated among people, in terms of job specialization, sections,
and departments. It also includes the number of levels in hierarchy and how many subordinates report directly to each manager, that is, his or her span of control. Another aspect of organizational structure is the degree of formalization: how far rules, standard operating procedures, and manuals govern the way people work. A further aspect of structure is the degree of decentralization, the degree to which decision-taking authority is distributed down the hierarchy (Jones 1998). Many of these structural variables tend to be correlated so that they compose a single structural dimension of the degree of bureaucratic structure (Donaldson 2001). An organization that is low on bureaucratization has low specialization, few departments and sections, narrow spans of control, low formalization, few hierarchical levels, and high centralization. Conversely, an organization that is high on bureaucratization has high specialization, many departments and sections, wide spans of control, high formalization, many hierarchical levels, and low centralization. Organizations are distributed widely along this continuum of bureaucratic structure.
2. Bureaucratic Structure and the Size Contingency Size is the number of people who must be organized, that is the number of employees or members of the organization. For high performance, the organization needs to have the degree of bureaucratic structure that fits its size (Child 1984). An unbureaucratic structure fits an organization of small size, while a highly bureaucratic structure fits a large size. For small size, the required, unbureaucratic, structure is a short hierarchy, with most of the decisions being taken by the top manager. The top manager controls the organization directly by giving instructions (Child 1984). There is little formalization because the top manager is making decisions on each case, rather than bothering to identify abstract rules to anticipate future cases. There is low specialization and few departments and sections, because there are few people among whom the work could be divided. The span of control is narrow, because each manager must supervise subordinates who are doing a broad range of tasks. Conversely, large size entails a bureaucratic structure that has a tall hierarchy because an additional level of management has to be introduced each time the number of employees exceeds the feasible span of control of their supervisor. Size also brings increased complexity and this combines with the tall hierarchy to force top management to delegate decisions down the hierarchy and so decentralization increases. With so many people to be organized in the large organization, the work is finely divided up among them, producing an elaborate division of labor. There is high
specialization and many departments and sections. The span of control is wide, because each manager can supervise more subordinates when the subordinates are doing a limited range of tasks. There is high formalization because, with larger size, many decisions recur about similar cases, leading to codification of best practice into rules and standard operating procedures (e.g., a hiring procedure). Such formalization relieves managers from having to attend to each decision, so that administration can be delegated to clerks who follow rules, which speeds decision-making, cuts costs, and is another reason why managerial spans increase with size. Overall, top management has relinquished much direct control and instead controls the organization indirectly and impersonally, through formalization (Child 1984). As the organization grows in size over time, it needs to increase progressively the degree of bureaucratization of its structure in order to maintain the fit that sustains high performance. Information technologies, whether used in operations or administration, tend to reinforce the effects of size (Donaldson 2001). Information technologies increase bureaucratization, such as by increasing formalization, because they reduce uncertainty, though their effects are weaker than those of size. Moreover, computerization itself is really a form of bureaucratization. Work is executed according to computer programs that are rules (e.g., ‘If the checking account balance is less than $500, then apply a $10 fee to every transaction’). Knowledge workers and other employees working with computers are constrained in their actions by the programs in their computers. Thus computerization is electronic formalization that allows top management more indirect, impersonal control over lower-level employees, so that top management can further increase decentralization.
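The checking-account rule quoted above can be written out as a short program, which may help show in what sense a computer program is a formalized rule. The following is a minimal sketch only: the function name and constants are hypothetical, and the two figures are taken from the quoted example rather than from any real banking system.

```python
# Illustrative sketch of "electronic formalization": the quoted fee rule
# expressed as code. Names are hypothetical; only the two figures come
# from the example rule in the text.

FEE_THRESHOLD = 500.00    # balance below which the fee applies
TRANSACTION_FEE = 10.00   # fee charged on each transaction


def transaction_fee(balance: float) -> float:
    """Return the fee for one transaction under the quoted rule."""
    if balance < FEE_THRESHOLD:
        return TRANSACTION_FEE
    return 0.0


if __name__ == "__main__":
    # The employee applying the rule exercises no discretion:
    # the outcome depends only on the balance.
    for balance in (120.00, 499.99, 500.00, 2500.00):
        print(balance, transaction_fee(balance))
```

Once the rule is embedded in the program, the person operating it cannot vary the outcome case by case, which is the sense in which the passage treats computerization as indirect, impersonal control.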
Bureaucratic structure is sometimes criticized for having too many managers and administrators relative to operating personnel (e.g., workers). However, the ratio of managers and administrators to operating personnel tends to fall as size increases. While larger organizations have more levels in their hierarchy, their spans of control increase so as to more than compensate (Jones 1998). Thus bureaucratic structure brings economies of scale in administration (Donaldson 1996).
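The arithmetic behind this claim can be illustrated with a toy model. The sketch below assumes a uniform span of control at every level and builds management layers upward until a single top manager remains; the head counts and spans (16 operatives with a span of 4, 1,000 operatives with a span of 10) are invented for illustration and are not drawn from the studies cited.

```python
import math


def management_layers(operatives: int, span: int) -> list[int]:
    """Managers needed at each level, bottom-up, for a uniform span of control.

    A new level is added whenever the people at the current level are too
    numerous to report to one superior. Toy model: assumes the same span
    at every level.
    """
    if operatives < 1 or span < 2:
        raise ValueError("need at least 1 operative and a span of 2 or more")
    layers = []
    supervised = operatives
    while supervised > 1:
        supervised = math.ceil(supervised / span)
        layers.append(supervised)
    return layers


def admin_ratio(operatives: int, span: int) -> float:
    """Ratio of managers to operating personnel."""
    return sum(management_layers(operatives, span)) / operatives


if __name__ == "__main__":
    # Hypothetical small organization: narrow span, short hierarchy.
    print(management_layers(16, 4), admin_ratio(16, 4))        # [4, 1] 0.3125
    # Hypothetical large organization: wider span, taller hierarchy.
    print(management_layers(1000, 10), admin_ratio(1000, 10))  # [100, 10, 1] 0.111
```

Under these assumed figures the large organization has three management levels rather than two, yet its ratio of managers to operatives is roughly a third of the small organization's, because the wider span more than offsets the extra level.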
3. Bureaucratic Structure and the Task Uncertainty Contingency While a bureaucratic structure fits low task uncertainty, a less bureaucratic structure is required to fit high task uncertainty. The creation of effective rules relies on all the cases of the same occurrence being similar. In routine tasks this holds, allowing rules to be formulated. However, some tasks are uncertain, so that one case varies markedly from another, and future cases differ from past ones. Some uncertainty in organizations comes from the environment, for example, competitor pricing is hard to predict. Other task uncertainty comes from innovation inside the organization, such as designing a new product or developing a new technology to distribute a service. Whatever the source of the uncertainty, tasks high in uncertainty cannot be handled effectively by rules. Tasks of medium uncertainty can be handled effectively by the hierarchy. However, the greater the uncertainty, the more information has to be processed by the organization. Because of the pyramidal shape of the hierarchy, it soon becomes inadequate for dealing with more than moderate uncertainty. High task uncertainty, such as that involved in radically new products or services, or in making many innovations at the same time, requires reliance upon experts. The high uncertainty in their tasks prevents the organization from governing their work effectively by rules. Some decision-making authority needs to be delegated to these experts, even if they are at low hierarchical levels. Similarly, managers of innovative programs need to be able to act autonomously. Thus task uncertainty is fitted by a structure that is lower in formalization and centralization (Jones 1998). The effect of task uncertainty on bureaucratic structure is secondary to that of size. High task uncertainty partially offsets the high formalization that large size causes by lowering formalization. Conversely, high task uncertainty reinforces the high decentralization that large size causes by further increasing decentralization. Thus a large organization with highly uncertain tasks is less formalized and more decentralized than a large organization with highly certain tasks. The reduction in formalization decreases bureaucratic structure, while the increase in decentralization increases that aspect of bureaucratic structure.
Moreover, these effects of task uncertainty are localized to specific parts of the organization. For a firm that is developing new products, the uncertainty of its tasks is greatest in some parts, such as the research and development department that is inventing the new product. In contrast, the manufacturing department of the firm focuses mainly on the routine production of existing products, so that the uncertainty of its tasks is lower. Thus the required structure of the research and development department is low in formalization, while that of the manufacturing department is high. Similarly, the research and development department employs qualified scientists and technologists and devolves some decision-making to them so that they exercise autonomy and creativity. Conversely, the manufacturing department employs mainly less educated personnel who lack professional qualifications and trains them to conform to the operating procedures of the firm, so the department is run in a more formalized and centrally planned way.
Nevertheless, innovation entails solving some novel problems that involve all the functional departments. For example, research needs to know whether manufacturing can make their new design on its existing machines. The integration required between the functions necessitates the adoption of cross-functional project teams. These contain representatives from each of the functional departments who are brought together for collaboration (Jones 1998). The teams provide lateral linkages that run across the hierarchy and rely upon ad hoc internal coordination rather than planning or formalization. Therefore they are a further way task uncertainty reduces bureaucratic structure.
4. Divisional and Matrix Structure and the Diversification Contingency 4.1 Functional and Divisional Structures Another aspect of structure is how activities and responsibilities are grouped in the organization. In a functional structure, the managers reporting to the top manager (i.e., the CEO; chief executive officer) are specialized by function, including operational functions (e.g., manufacturing). In contrast, in the divisional structure, the senior managers are specialized by their products or services, customers, or geographical areas, and each has a broad range of operational functions within their division. A divisional structure is more specialized, formalized, and decentralized than a functional structure and so divisionalization is an extension of bureaucratization (Donaldson 2001). However, the contingency causing divisionalization is primarily diversification, with size playing a minor role (Donaldson 2001), so that the contingency determination of divisionalization differs from that of bureaucratization. Organizations vary in their levels of diversification of their products, services, customers, or geographical areas. For brevity, product diversification will be discussed here but not service and customer diversification because similar considerations apply. Increasing diversification reduces the interdependence between the tasks involved in one product and the others. This requires less coordination between products, so each can be organized as a separate subunit acting autonomously from the other subunits and from the corporate center. Consequently, the corporate center needs to contain fewer operational functions. The greater the diversification, the greater the decentralization needs to be and thus the smaller the head office. As diversification varies from low to high, the fitting structure goes from being a pure functional to a pure divisional structure.
At intermediary levels of diversification, the role of the central functions decreases with each level and the role of the product heads increases. Where a firm is undiversified, such as
producing only one product, then specialization by function fosters economies of scale and avoids duplication. Each function is highly interdependent on the next, so that operational coordination is needed from the CEO, producing centralization. Also, each function is a cost center, with profitability only being assessed for the firm overall. The next level of product diversification is where a firm offers products that are vertically integrated with each other. For example, an aluminum-refining firm might vertically integrate backward by acquiring the bauxite mine that supplies its feed material. It might also vertically integrate forward by creating a plant that fabricates its refined aluminum into window frames. Each stage along the value-added chain, that is, mining, refining, and fabricating, becomes an organizational subunit, which may be termed a division. Some decisions will be taken autonomously by each division, such as the types of production equipment to purchase, so that the structure is more decentralized than a functional structure. However, a large flow of materials must be coordinated between divisions. This necessitates some restrictions on divisional autonomy and the presence of corporate-level staff with expertise in transportation and the planning of production and inventory. Thus the corporate center contains many staff additional to the administrative functions such as accounting. Whereas the fabrication division is a profit center, the mining and the refining divisions should be cost centers because their sales are to other divisions, making their profitability a contentious issue. Divisional general managers need to have their bonuses tied to corporate profitability to encourage them to cooperate. A medium level of diversification is where a firm has two or more related products.
The relatedness between the products lies in similarities among their materials, production technologies, skills, brands, sales force, or distribution systems. Each product is organized as a division and is run quite autonomously, except where the divisions are related and so synergies may be extracted through central coordination. For example, an electronics firm has three product divisions: avionics, computers, and domestic appliances, with each focusing on its own product market. But there are commonalities in the underlying technologies, so that there is a large central research and development laboratory that pursues major innovations beyond those being researched in the divisions. Therefore, the corporate center contains functions beyond the merely administrative and is quite large in size. Each division is a profit center, but the transfer pricing of shared components moderates its accountability. This is also moderated by large corporate overhead charges for the central laboratories and by corporate decisions about which division receives the profitable, new products developed in the central laboratory. Divisional general managers need to have their bonuses partly tied to the profitability of their own division and partly to
corporate profitability to encourage them to cooperate with the other divisions and the corporate staff. The divisional structure boosts innovation by having functions for each product which facilitates functional interaction. However, this is at the cost of some duplication of assets across divisions. Thus, where cost pressures are paramount, the organization instead needs to use a functional structure. A high level of diversification is where a firm has two or more unrelated products, e.g., a conglomerate. There are no operating synergies to be extracted, so little central coordination of operation is required and the divisions are highly autonomous. Therefore the corporate level needs no functions beyond the merely administrative and so can be small. Each division can be straightforwardly held profit accountable. Divisional general managers need to have their bonuses tied solely to the profitability of their own division. Area diversification has similar effects. Area diversification is how far the organization contains an area that operates separately from other areas, that is, it designs, manufactures, and sells the product all within its own area. For example, area divisions are frequently found in the food industry, where low price relative to weight makes intercontinental transportation too costly. Low area diversification fits with the functional structure, while high area diversification fits with the area divisional structure.
4.2 Matrix Structures

In a matrix structure, a subordinate reports to more than one boss. Despite the inherent complexity and resulting conflict potential of matrices, they fit some contingencies. A related product organization attains both some product innovation and some cost reduction by using a project-functional matrix structure. Each project manager heads a temporary team, which fosters close interaction between different functional specialists, to facilitate product innovation. Additionally, the heads of the functions coordinate the deployment of resources across projects to reduce costs. The relatedness of the products (e.g., at LockheedGelac all the products are transport airplanes) allows the same resources to be used by the projects. This resource sharing is facilitated by the projects each being at different stages so that the same resources are not required simultaneously. The project-functional structure fits medium product diversity and low area diversity so that operations are geographically proximate and there is no area diversity that needs managing.

In contrast, an area type of matrix is appropriate for an organization that requires coordination along the area dimension in addition to coordination by either function or product. For example, a multinational corporation has offices and plants scattered around the world, each reporting up to functional or product
heads. However, the corporation also needs to have a local manager coordinating each geographical area to deal with local issues, e.g., the host government. The increase in environmental diversity, caused by cultural and political heterogeneity, that is, medium area diversification, requires that the organization mirrors and manages this diversity through having a manager for each area. Thus, area type matrices fit medium area diversification. There, interdependencies exist between the areas, requiring coordination through high-level functional or product heads, at the same level as the area managers, thereby creating the matrix. Where area diversification is medium and product diversification is low, core operations need to be structured functionally and so the fitting structure is the area-functional matrix. However, where area diversification is medium and product diversification is also medium, core operations need to be structured divisionally and so the fitting structure is the area-product matrix (e.g., Asea Brown Boveri). Medium product diversification obtains not only for related, but also for unrelated, product diversification. This is because the medium area diversification introduces links across the separate products due to their geographical co-location (e.g., being in France and so having to deal jointly with the government there). Thus, the level of product diversification has to be determined jointly with the level of area diversification, that is, interdependence is established by examining products and areas together.

The heart of structural contingency theory is the concept of fit of organizational structure to the contingency. Many of the more promising developments in structural contingency theory delve into the concept of fit, using it to illuminate organizational performance and organizational change.
A major conceptual statement about fit that has influenced subsequent research is by Van de Ven and Drazin (1985). Some researchers have extended the analysis by considering the fit of an organizational structural variable to multiple contingencies (e.g., Keller 1994). Other researchers have analyzed the fit of multiple organizational structural variables to a contingency (e.g., Drazin and Van de Ven 1985). Still other researchers have analyzed the fit of multiple organizational structural variables to multiple contingencies simultaneously (e.g., Pennings 1987). Other researchers have analyzed the differential effects of fit on different aspects of organizational performance such as financial performance vs. social costs (e.g., Kraft et al. 1995).

Structural contingency theory has sometimes been criticized for being static. On the contrary, it contains a theory of change dynamics in that fit and misfit play crucial roles in organizational structural change (Donaldson 2001). An organization in fit tends thereby to have high performance, which leads it to increase its contingencies, such as size and diversification, so that it moves into misfit and so has lower performance. An ensuing crisis of low performance causes the organization to adopt a new structure that fits the new levels of its contingencies, thereby restoring fit and performance (Donaldson 2001). Whether the performance level of a firm in misfit becomes low enough to trigger structural change depends on other causes of performance that interact with misfit. These other causes include debt and divestment and can reinforce misfit or offset its depressive effect on performance. The set of causes of organizational performance and their effects on organizational growth and adaptation have been formalized into a new organizational theory, organizational portfolio theory (Donaldson 1999).

Fit is traditionally conceptualized as a line of isoperformance, meaning that a fit of organizational structure to a low level of the contingency (e.g., small size) produces the same performance as a fit of organizational structure to a high level of the contingency (e.g., large size). Such a model fails to provide an incentive for an organization to increase its level of the contingency (e.g., size), so that there is no explanation of why an organization changes its level of the contingency, such as by growing larger. A reconceptualization of fit has been proposed in which fit is a line of hetero-performance, in which fits to higher values of the contingency variable produce greater performance than fits to lower levels (Donaldson 2001). Therefore there is an incentive for the organization to change its level of contingency, such as by growing larger over time. In this way, reconsidering the effect of fit on performance allows the explanation of changes in the contingency (e.g., size) of an organization in fit, thereby rendering structural contingency theory more comprehensive and dynamic.
Thus, deeper analyses of fit, both empirically and conceptually, provide a major route for the ongoing development of structural contingency theory (for developments synthesizing structural contingency theory with other organizational theories see Donaldson 1995). See also: Authority: Delegation; Bureaucracy and Bureaucratization; Diversification and Economies of Scale; Labor, Division of; Organizational Decision Making; Organizational Size; Organizations: Authority and Power; Organizations, Sociology of; Risk: Empirical Studies on Decision and Choice; Risk: Theories of Decision and Choice
Bibliography

Child J 1984 Organization: A Guide to Problems and Practice, 2nd edn. Harper and Row, London; Hagerstown, MD
Donaldson L 1995 American Anti-Management Theories of Organization: A Critique of Paradigm Proliferation. Cambridge University Press, Cambridge, UK; New York
Donaldson L 1996 For Positivist Organization Theory: Proving the Hard Core. Sage, London; Thousand Oaks, CA
Donaldson L 1999 Performance-Driven Organizational Change: The Organizational Portfolio. Sage, Thousand Oaks, CA
Donaldson L 2001 The Contingency Theory of Organizations. Sage, Thousand Oaks, CA
Drazin R, Van de Ven A H 1985 Alternative forms of fit in contingency theory. Administrative Science Quarterly 30: 514–39
Jones G R 1998 Organizational Theory: Text and Cases, 2nd edn. Addison-Wesley Publishing Company, Reading, MA
Keller R T 1994 Technology–information processing fit and the performance of R&D project groups: a test of contingency theory. Academy of Management Journal 37: 167–79
Kraft K L, Puia G M, Hage J 1995 Structural contingency theory revisited: main effects versus interactions in Child’s National Study of Manufacturing and Service Sectors. Revue Canadienne des Sciences de l’Administration / Canadian Journal of Administrative Sciences 12: 182–94
Pennings J M 1987 Structural contingency theory: a multivariate test. Organization Studies 8: 223–40
Van de Ven A H, Drazin R 1985 The concept of fit in contingency theory. In: Staw B M, Cummings L L (eds.) Research in Organizational Behaviour. JAI Press, Greenwich, CT, Vol. 7, pp. 333–65
L. Donaldson
Structural Equation Modeling

In 1980, Peter Bentler (1980, p. 420) stated that structural equation modeling held ‘the greatest promise for furthering psychological science.’ Since then, there have been many important theoretical and practical advances in the field. So much so, in fact, that Muthén (2001) announced a ‘second generation’ of structural equation modeling. The purpose of this article is to provide a description of the ‘first generation’ of structural equation modeling and to outline briefly the most recent developments that constitute the ‘second generation’ of structural equation modeling. This article will not devote space to the comparison of specific software packages. For the most part, existing software packages differ in terms of the user interface. The competitive market in structural equation modeling software means that new developments available in one software package will soon be available in others.
1. The First Generation: Definition and Brief History of Structural Equation Modeling

Structural equation modeling can be defined as a class of methodologies that seeks to represent hypotheses about the means, variances, and covariances of observed data in terms of a smaller number of ‘structural’ parameters defined by a hypothesized underlying conceptual or theoretical model. Historically, structural equation modeling derived from the hybrid of two separate statistical traditions. The first tradition is factor analysis developed in the disciplines of psychology and psychometrics. The second tradition is simultaneous equation modeling developed mainly in econometrics, but having an early history in the field of genetics and introduced to the field of sociology under the name path analysis. The subject of this article concerns the combination of the latent variable/factor analytic approach with simultaneous equation modeling. The combination of these methodologies into a coherent analytic framework was based on the work of Jöreskog (1973), Keesling (1972) and Wiley (1973).
2. The General Model

The general structural equation model as outlined by Jöreskog (1973) consists of two parts: (a) the structural part linking latent variables to each other via systems of simultaneous equations, and (b) the measurement part which links latent variables to observed variables via a restricted (confirmatory) factor model. The structural part of the model can be written as

$$\eta = B\eta + \Gamma\xi + \zeta \tag{1}$$

where η is a vector of endogenous (criterion) latent variables, ξ is a vector of exogenous (predictor) latent variables, B is a matrix of regression coefficients relating the latent endogenous variables to each other, Γ is a matrix of regression coefficients relating endogenous variables to exogenous variables, and ζ is a vector of disturbance terms. The latent variables are linked to observable variables via measurement equations for the endogenous variables and exogenous variables. These equations are defined as

$$y = \Lambda_y \eta + \varepsilon \tag{2}$$

and

$$x = \Lambda_x \xi + \delta \tag{3}$$
where Λy and Λx are matrices of factor loadings and ε and δ are vectors of uniquenesses. In addition, the general model specifies variances and covariances for ξ, ζ, ε, and δ, denoted Φ, Ψ, Θε, and Θδ, respectively.

Structural equation modeling is a methodology designed primarily to test substantive theories. As such, a theory might be sufficiently developed to suggest that certain constructs do not affect other constructs, that certain variables do not load on certain factors, and that certain disturbances and measurement errors do not covary. This implies that some of the elements of B, Γ, Λy, Λx, Φ, Ψ, Θε, and Θδ are fixed to zero by hypothesis. The remaining parameters are free to be estimated. The pattern of fixed and free parameters implies a specific structure for the covariance matrix of the observed variables. In this way, structural equation modeling can be seen as a special case of a more general covariance structure model defined as Σ = Σ(Ω), where Σ is the population covariance matrix and Σ(Ω) is a matrix valued function of the parameter vector Ω containing all of the parameters of the model.
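As a worked sketch of how the parameter matrices imply a covariance structure, the blocks of Σ(Ω) can be derived from Eqns. (1)–(3). The derivation rests on two assumptions that are conventional but not stated in the text above: that I − B is nonsingular, and that ξ, ζ, ε, and δ are mutually uncorrelated. The block labels Σyy, Σyx, and Σxx are introduced here only for illustration.

```latex
\begin{aligned}
\eta &= (I - B)^{-1}(\Gamma\xi + \zeta)
  && \text{(reduced form of Eqn. 1)} \\
\Sigma_{yy}(\Omega) &= \Lambda_y (I - B)^{-1}\,(\Gamma\Phi\Gamma' + \Psi)\,
  \bigl[(I - B)^{-1}\bigr]'\,\Lambda_y' + \Theta_\varepsilon \\
\Sigma_{yx}(\Omega) &= \Lambda_y (I - B)^{-1}\,\Gamma\Phi\,\Lambda_x' \\
\Sigma_{xx}(\Omega) &= \Lambda_x \Phi \Lambda_x' + \Theta_\delta
\end{aligned}
```

Fixing an element of any of the eight parameter matrices to zero therefore restricts particular entries of these blocks, which is what makes the hypothesized pattern testable against the observed covariances.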
3. Identification and Estimation of Structural Equation Models

A prerequisite for the estimation of the parameters of the model is to establish whether the parameters are identified. We begin with a definition of identification by considering the problem from the covariance structure modeling perspective. The population covariance matrix Σ contains population variances and covariances that are assumed to follow a model characterized by a set of parameters contained in Ω. We know that the variances and covariances in Σ can be estimated by their sample counterparts in the sample covariance matrix S using straightforward equations for the calculation of sample variances and covariances. Thus, the parameters in Σ are identified. What needs to be established prior to estimation is whether the unknown parameters in Ω are identified from the elements of Σ. We say that the elements in Ω are identified if they can be expressed uniquely in terms of the elements of the covariance matrix Σ. If all elements in Ω are identified then the model is identified. A variety of necessary and sufficient conditions for establishing the identification of the model exist and are described in Kaplan (2000).

Major advances in structural equation modeling have, arguably, been made in the area of parameter estimation. The problem of parameter estimation concerns obtaining estimates of the free parameters of the model, subject to the constraints imposed by the fixed parameters of the model. In other words, the goal is to apply a model to the sample covariance matrix S, which is an estimate of the population covariance matrix Σ that is assumed to follow a structural equation model. Estimates are obtained such that the discrepancy between the estimated covariance matrix $\hat{\Sigma} = \Sigma(\hat{\Omega})$ and the sample covariance matrix S is as small as possible. Therefore, parameter estimation requires choosing among a variety of fitting functions.
Fitting functions have the properties that (a) the value of the function is greater than or equal to zero, (b) the value of the function equals zero if, and only if, the model fits the data perfectly, and (c) the function is continuous. We first consider estimators that assume multivariate normality of the data. We will discuss estimators for non-normal data in Sect. 5.
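The properties of a fitting function can be illustrated with a small sketch. The text does not write out any fitting function, so the formula used here, F = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p, is the standard normal-theory maximum likelihood discrepancy and is supplied as an assumption. The function name `f_ml` and the restriction to two observed variables (so that determinant and inverse have closed forms) are illustrative choices only.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def f_ml(S, Sigma):
    """ML discrepancy between a sample covariance matrix S and a
    model-implied covariance matrix Sigma (both 2x2, positive definite)."""
    p = 2
    Sigma_inv = inv2(Sigma)
    # trace(S @ Sigma_inv), written out element by element
    trace = sum(S[i][k] * Sigma_inv[k][i] for i in range(p) for k in range(p))
    return math.log(det2(Sigma)) + trace - math.log(det2(S)) - p
```

With S = [[1, 0.5], [0.5, 1]], calling `f_ml(S, S)` returns zero up to rounding (perfect fit), while `f_ml(S, [[1, 0], [0, 1]])` is positive because an independence model cannot reproduce the observed covariance.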
3.1 Normal Theory-based Estimation

The most common estimation method that rests on the assumption of multivariate normality is the method of maximum likelihood. If the assumption of multivariate normality of the observed data holds, the estimates obtained from maximum likelihood estimation possess certain desirable properties. Specifically, maximum likelihood estimation will yield asymptotically unbiased and efficient estimates as well as a large sample goodness-of-fit test. Another fitting function that enjoys the same properties under the same assumption is the generalized least squares (GLS) estimator developed for the structural equation modeling context by Jöreskog and Goldberger (1972).
4. Testing

Once the parameters of the model have been estimated, the model can be tested for global fit to the data. In addition, hypotheses concerning individual parameters can also be evaluated. We consider model testing in this section.

4.1 Testing Global Model Fit

A feature of maximum likelihood estimation is that one can explicitly test the hypothesis that the model fits the data. The null hypothesis states that the specified model holds in the population. The corresponding alternative hypothesis is that the population covariance matrix is any symmetric (and positive definite) matrix. Under maximum likelihood estimation, the statistic for testing the null hypothesis that the model fits in the population is referred to as the likelihood ratio (LR) test. The large sample distribution of the likelihood ratio test is chi-squared with degrees of freedom given by the difference between the number of nonredundant elements in the population covariance matrix and the number of free parameters in the model.

4.2 Testing Hypotheses Regarding Individual Parameters

In addition to a global test of whether the model holds exactly in the population, one can also test hypotheses regarding the individual fixed and freed parameters in the model. Specifically, we can consider three alternative ways to evaluate the fixed and freed elements of the model vis-à-vis overall fit. The first method rests on the difference between the likelihood ratio chi-squared statistics comparing a given model against a less restrictive model. Recall that the initial hypothesis, say $H_{01}$, is tested against the alternative hypothesis $H_a$ that Σ is a symmetric positive definite matrix. Consider a second hypothesis, say $H_{02}$, that differs from $H_{01}$ in that a single parameter that was restricted to zero is now relaxed. Note that the alternative hypothesis is the same in both cases. Therefore, the change in the chi-squared value between the two models can be used to test the improvement in fit due to the relaxation of the restriction. The difference between the two chi-squared statistics is itself distributed as chi-squared with degrees of freedom equaling the difference in degrees of freedom between the model under $H_{01}$ and the less restrictive model under $H_{02}$. In the case of a single restriction described here, the chi-squared difference test is evaluated with one degree of freedom.

There are other ways of assessing the change in the model without estimating a second nested model. The Lagrange multiplier (LM) test can be used to assess whether freeing a fixed parameter would result in a significant improvement in the overall fit of the model. The LM test is asymptotically distributed as chi-squared with degrees of freedom equaling the difference between the degrees of freedom of the more restrictive model and the less restrictive model. Again, if one restriction is being evaluated, then the LM test will be evaluated with one degree of freedom. The LM test is also referred to as the modification index. Finally, we can consider evaluating whether fixing a free parameter would result in a significant decrement in the fit of the model. This is referred to as the Wald test. The Wald test is asymptotically distributed as chi-squared with degrees of freedom equaling the number of imposed restrictions. When interest is in evaluating one restriction, the Wald test is distributed as chi-squared with one degree of freedom. The square root of the Wald statistic in this case is a z statistic.
The reason why the LM and Wald tests can be used as alternative approaches for testing model parameters is that they are asymptotically equivalent to each other and to the chi-squared difference test.
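The degrees-of-freedom rule and the one-degree-of-freedom difference test can be sketched in a few lines. The function names are hypothetical, and the tail probability uses the identity that a chi-squared variate with one degree of freedom is a squared standard normal, so P(χ² > x) = erfc(√(x/2)). This shortcut is an assumption of the sketch and holds only for the single-restriction case.

```python
import math


def model_df(p, n_free):
    """Degrees of freedom of the LR test: nonredundant elements of a
    p x p covariance matrix minus the number of free parameters."""
    return p * (p + 1) // 2 - n_free


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variate with 1 df."""
    if x < 0:
        raise ValueError("a chi-squared statistic cannot be negative")
    return math.erfc(math.sqrt(x / 2.0))


def chi2_difference_1df(chi2_restricted, chi2_relaxed):
    """Chi-squared difference test for relaxing a single restriction.
    Returns the difference and its p-value on one degree of freedom."""
    diff = chi2_restricted - chi2_relaxed
    return diff, chi2_sf_1df(diff)


def wald_z(wald_statistic):
    """For one restriction, the square root of the Wald statistic is a z statistic."""
    return math.sqrt(wald_statistic)
```

For instance, a model for six observed variables with 13 free parameters has `model_df(6, 13)` = 8 degrees of freedom, and a drop in chi-squared of about 3.84 after freeing one parameter corresponds to a p-value of about 0.05.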
4.3 Alternative Methods of Model Fit

Since the early 1980s, attention has focused on the development of alternative indices that provide relatively different perspectives on the fit of structural equation models. The development of these indices has been motivated, in part, by the known sensitivity of the likelihood ratio chi-squared statistic to large sample sizes. Other classes of indices have been motivated by a need to rethink the notion of testing exact fit in the population, an idea that is deemed by some to be unrealistic in most practical situations. Finally, another class of alternative indices has been developed that focuses on the cross-validation adequacy of the model. In this section we will divide our discussion of alternative fit indices into three categories: (a)
measures based on comparative fit to a baseline model, including those that add a penalty function for model complexity, (b) measures based on population errors of approximation, and (c) cross-validation measures.
4.3.1 Measures based on comparative fit to a baseline model. Arguably, the most active work in the area of alternative fit indices has been the development of what can be broadly referred to as measures of comparative fit. The basic idea behind these indices is that the fit of the model is compared to the fit of some baseline model that usually specifies complete independence among the observed variables. The chi-squared value of the baseline model will usually be fairly large and thus the issue is whether one’s model of interest is an improvement relative to the baseline model. The index is typically scaled to lie between zero and one, with one representing perfect fit relative to this baseline model. The sheer number of comparative fit indices precludes a detailed discussion of each one. We will consider a subset of indices here that serve to illustrate the basic ideas. The quintessential example of a comparative fit index is the normed-fit index (NFI) proposed by Bentler and Bonett (1980). This index can be written as

$$\mathrm{NFI} = \frac{\chi^2_b - \chi^2_t}{\chi^2_b} \tag{4}$$
where $\chi^2_b$ is the chi-squared for the model of complete independence (the so-called baseline model) and $\chi^2_t$ is the chi-squared for the target model of interest. This index, which ranges from zero to one, provides a measure of the extent to which the target model is an improvement over the baseline model. The NFI assumes a true null hypothesis and therefore a central chi-squared distribution of the test statistic. However, an argument could be made that the null hypothesis is never exactly true and that the distribution of the test statistic can be better approximated by a non-central chi-squared with noncentrality parameter λ. An estimate of the noncentrality parameter can be obtained as the difference between the statistic and its associated degrees of freedom. Thus, for models that are not extremely misspecified, an index developed by McDonald and Marsh (1990) and referred to as the relative noncentrality index (RNI) can be defined as

$$\mathrm{RNI} = \frac{(\chi^2_b - df_b) - (\chi^2_t - df_t)}{\chi^2_b - df_b} \tag{5}$$
The RNI can lie outside the 0–1 range. To remedy this, Bentler (1990) adjusted the RNI so that it would lie in the range of 0–1. This adjusted version is referred to as the comparative fit index (CFI).
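The three indices can be computed directly from the baseline and target chi-squared values. The sketch below follows Eqns. (4) and (5) for the NFI and RNI. The text does not give a formula for the CFI, so the truncation used here (noncentrality estimates floored at zero) is an assumption based on the commonly cited form, and the function names are illustrative.

```python
def nfi(chi2_b, chi2_t):
    """Normed-fit index, Eqn. (4)."""
    return (chi2_b - chi2_t) / chi2_b


def rni(chi2_b, df_b, chi2_t, df_t):
    """Relative noncentrality index, Eqn. (5); may fall outside 0-1."""
    return ((chi2_b - df_b) - (chi2_t - df_t)) / (chi2_b - df_b)


def cfi(chi2_b, df_b, chi2_t, df_t):
    """Comparative fit index: the RNI with noncentrality estimates
    truncated so that the result stays within 0-1 (assumed form)."""
    d_t = max(chi2_t - df_t, 0.0)         # target noncentrality, floored at 0
    d_b = max(chi2_b - df_b, d_t, 0.0)    # baseline noncentrality, at least d_t
    if d_b == 0.0:
        return 1.0                        # neither model shows misfit
    return 1.0 - d_t / d_b
```

With a baseline chi-squared of 500 on 15 degrees of freedom and a target chi-squared of 5 on 8 degrees of freedom, `rni(500, 15, 5, 8)` exceeds one, whereas `cfi(500, 15, 5, 8)` returns exactly 1.0, which shows the purpose of the adjustment.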
Finally, there are classes of comparative fit indices that adjust existing fit indices for the number of parameters that are estimated. These are so-called parsimony-based comparative fit indices. The rationale behind these indices is that a model can be made to fit the data by simply estimating more and more parameters. Indeed, a model that is just identified fits the data perfectly. Therefore, an appeal to parsimony would require that these indices be adjusted for the number of parameters that are estimated.

It is important to note that use of comparative indices has not been without controversy. In particular, Sobel and Bohrnstedt (1985) argued early on that these indices are designed to compare one’s hypothesized model against a scientifically questionable baseline hypothesis that the observed variables are completely uncorrelated with each other. Yet, as Sobel and Bohrnstedt (1985) argue, one would never seriously entertain such a hypothesis, and perhaps these indices should be compared with a different baseline hypothesis. Unfortunately, the conventional practice of structural equation modeling as represented in scholarly journals suggests that these indices have always been compared with the baseline model of complete independence.

4.3.2 Measures based on errors of approximation. It was noted earlier that the likelihood ratio chi-squared test assesses an exact null hypothesis that the model fits perfectly in the population. In reality, it is fairly unlikely that the model will fit the data perfectly. Not only is it unreasonable to expect our models to hold perfectly, but trivial misspecifications can have detrimental effects on the likelihood ratio test when applied to large samples. Therefore, a more sensible approach is to assess whether the model fits approximately well in the population.
The difficulty arises when trying to quantify what is meant by ‘approximately.’ To motivate this work it is useful to differentiate among different types of discrepancies. First, there is the discrepancy due to approximation, which measures the lack of fit of the model to the population covariance matrix. Second, there is the discrepancy due to estimation, which measures the difference between the model fit to the sample covariance matrix and the model fit to the population covariance matrix. Finally, there is the discrepancy due to overall error, which measures the difference between the elements of the population covariance matrix and the model fit to the sample covariance matrix. Measures of approximate fit are concerned with the discrepancy due to approximation.

Based on the work of Steiger and Lind (1980) it is possible to assess approximate fit of a model in the population. The method of Steiger and Lind (1980) utilizes the root mean square error of approximation (RMSEA) for measuring approximate fit. In utilizing the RMSEA for assessing approximate fit, a formal hypothesis testing framework is employed. Typically an RMSEA value of 0.05 or less is considered a ‘close fit.’ This value serves to define a formal null hypothesis of close fit. In addition, a 90% confidence interval around the RMSEA can be formed, enabling an assessment of the precision of the estimate (Steiger and Lind 1980). Recommended practical guidelines suggest that values of the RMSEA between 0.05 and 0.08 are indicative of fair fit, while values between 0.08 and 0.10 are indicative of mediocre fit.

4.3.3 Measures that assess cross-validation adequacy. Another important consideration in evaluating a structural model is whether the model is capable of cross-validating well in a future sample of the same size, from the same population, and sampled in the same fashion. In some cases, an investigator may have a sufficiently large sample to allow it to be randomly split in half, with the model estimated and modified on the calibration sample and then cross-validated on the validation sample. When this is possible, the final fitted model from the calibration sample is applied to the validation sample covariance matrix with parameter estimates fixed to the estimated values obtained from the calibration sample.

In many instances, investigators may not have samples large enough to allow separation into calibration and validation samples. However, the cross-validation adequacy of the model remains a desirable piece of information in model evaluation. Two comparable methods for assessing cross-validation adequacy are the Akaike information criterion (AIC) and the expected cross-validation index (ECVI). In both cases, it is assumed that the investigator is in a position to choose among a set of competing models. The model with the lowest AIC or ECVI value will be the model that will have the best cross-validation capability.
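Returning to the RMSEA guidelines of Sect. 4.3.2, a point estimate can be sketched from the model chi-squared, its degrees of freedom, and the sample size. The text gives no formula, so the expression used here, √(max(χ² − df, 0) / (df·(N − 1))), is an assumption based on the commonly cited sample estimate; some software divides by N rather than N − 1. The function names and the label returned for values above 0.10 are illustrative only.

```python
import math


def rmsea(chi2, df, n):
    """Point estimate of the RMSEA (assumed form, using N - 1)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def describe_fit(value):
    """Map an RMSEA value onto the verbal guidelines quoted in the text."""
    if value <= 0.05:
        return "close"
    if value <= 0.08:
        return "fair"
    if value <= 0.10:
        return "mediocre"
    return "outside the ranges given in the text"
```

For example, a chi-squared of 20 on 8 degrees of freedom with N = 301 gives an RMSEA of roughly 0.071, which the guidelines would label a fair fit.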
5. Statistical Assumptions Underlying Structural Equation Modeling

As with all statistical methodologies, structural equation modeling requires that certain underlying assumptions be satisfied in order to ensure accurate inferences. The major assumptions associated with structural equation modeling include: multivariate normality, no systematic missing data, sufficiently large sample size, and correct model specification.

5.1 Sampling Assumptions

For the purposes of proper estimation and inference, a very important question concerns the sampling mechanism. In the absence of explicit information to the contrary, estimation methods such as maximum likelihood assume that data are generated according to simple random sampling. Perhaps more often than not, however, structural equation models are applied to data that have been obtained through something other than simple random sampling. For example, when data are obtained through multistage sampling, the usual assumption of independent observations is violated, resulting in predictable biases. Structural equation modeling expanded to consider multistage sampling will be discussed in Sect. 6.
5.2 Non-normality

A basic assumption underlying the standard use of structural equation modeling is that observations are drawn from a continuous and multivariate normal population. This assumption is particularly important for maximum likelihood estimation because the maximum likelihood estimator is derived directly from the expression for the multivariate normal distribution. If the data follow a continuous and multivariate normal distribution, then maximum likelihood attains optimal asymptotic properties, viz., that the estimates are normal, unbiased, and efficient. The effects of non-normality on estimates, standard errors, and tests of model fit are well known. The extant literature suggests that non-normality does not affect parameter estimates. In contrast, standard errors appear to be underestimated relative to the empirical standard deviation of the estimates. With regard to goodness-of-fit, research indicates that non-normality can lead to substantial overestimation of the likelihood ratio chi-squared statistic, and this overestimation appears to be related to the number of degrees of freedom of the model.
5.2.1 Estimators for non-normal data. The estimation methods discussed in Sect. 3 assumed multivariate normal data. In the case where the data are not multivariate normal, two general estimators exist that differ depending on whether the observed variables are continuous or categorical. One of the major breakthroughs in structural equation modeling has been the development of estimation methods that are not based on the assumption of continuous multivariate normal data. Browne (1984) developed an asymptotic distribution free (ADF) estimator based on weighted least-squares theory, in which the weight matrix takes on a special form. Specifically, when the data are not multivariate normal, the weight matrix can be expanded to incorporate information about the skewness and kurtosis of the data. Simulation studies have shown reasonably good performance of the ADF estimation compared with ML and GLS.
A problem with the ADF estimator is the computational difficulty encountered for models of moderate size. Specifically, with $p$ variables there are $u = \tfrac{1}{2}p(p+1)$ elements in the sample covariance matrix $S$. The weight matrix is of order $u \times u$. It can be seen that the size of the weight matrix grows rapidly with the number of variables. So, if a model were estimated with 20 variables, the weight matrix would contain 22,155 distinct elements. Moreover, ADF estimation requires that the sample size (for each group if relevant) exceed $p + \tfrac{1}{2}p(p+1)$ to ensure that the weight matrix is non-singular. These constraints have limited the utility of the ADF estimator in applied settings. Another difficulty with the ADF estimator is that social and behavioral science data are rarely continuous. Rather, we tend to encounter categorical data and often, we find mixtures of scale types within a specific analysis. When this is the case, the assumptions of continuity as well as multivariate normality are violated. Muthén (1984) made a major contribution to the estimation of structural equation models under these more realistic conditions by providing a unified approach for estimating models containing mixtures of measurement scales. The basic idea of Muthén’s (1984) method is that underlying each of the categorical variables is a latent continuous and normally distributed variable. The statistics for estimating the model are correlations among these latent variables. For example, the Pearson product–moment correlations between observed dichotomous variables are called phi coefficients. Phi coefficients can underestimate the ‘true’ or latent correlations between the variables. Thus, when phi coefficients are used in structural equation models, the estimates may be attenuated. In contrast, the latent correlations between dichotomous variables are referred to as tetrachoric correlations, and their use would result in disattenuated correlation coefficients.
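The attenuation of phi coefficients can be seen in a small simulation. The sketch below is only an illustration under stated assumptions: two latent standard normal variables with correlation 0.6 are both dichotomized at zero (a median split), and the function names are hypothetical. For this special case the latent correlation can be recovered in closed form as sin(πφ/2); general tetrachoric estimation with arbitrary thresholds requires an iterative procedure that is not shown here.

```python
import math
import random


def phi_after_median_split(rho, n=100_000, seed=0):
    """Dichotomize two latent normals (correlation rho) at zero and
    return the Pearson correlation of the 0/1 variables (phi)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        latent_x = z1
        latent_y = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        xs.append(1 if latent_x > 0 else 0)
        ys.append(1 if latent_y > 0 else 0)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    return cov / math.sqrt(mean_x * (1 - mean_x) * mean_y * (1 - mean_y))


def latent_correlation_from_phi(phi):
    """Closed-form recovery of the latent correlation, valid only when
    both variables are split at the median of a bivariate normal."""
    return math.sin(math.pi * phi / 2.0)


phi = phi_after_median_split(0.6)
print("phi coefficient:", round(phi, 3))
print("recovered latent correlation:", round(latent_correlation_from_phi(phi), 3))
```

The phi coefficient comes out well below the latent value of 0.6, while the recovered value is close to it, which is the sense in which the latent correlation is disattenuated.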
Other underlying correlations can also be defined. The procedure developed by Muthén (1984) handles mixtures of scale types by calculating the appropriate correlations and the appropriate weight matrix corresponding to those correlations. These are entered into a weighted least-squares analysis yielding correct goodness-of-fit tests, estimates, and standard errors. Unfortunately, as with the ADF estimator, the method of Muthén is computationally intensive. However, research has shown how estimation methods can be developed that are less computationally intensive.
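The computational burden of the ADF weight matrix discussed in this section can be checked with a few lines of arithmetic. The sketch below simply evaluates the counts stated in the text for several values of p; the function name is hypothetical.

```python
def adf_dimensions(p):
    """Return (u, distinct weight-matrix elements, sample-size bound)
    for an ADF analysis of p observed variables."""
    u = p * (p + 1) // 2            # non-redundant elements of S
    distinct = u * (u + 1) // 2     # distinct elements of the symmetric u x u weight matrix
    sample_size_bound = p + u       # the sample size must exceed this value
    return u, distinct, sample_size_bound


for p in (5, 10, 20, 40):
    u, distinct, bound = adf_dimensions(p)
    print(f"p={p:>2}  u={u:>4}  distinct weights={distinct:>7}  n must exceed {bound}")
```

For p = 20 this reproduces the 22,155 distinct elements quoted above, and doubling the number of variables to 40 multiplies that count by roughly fifteen.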
5.3 Missing Data
Generally, statistical procedures such as structural equation modeling assume that each unit of analysis has complete data. However, for many reasons, units may be missing values on one or more of the variables
under investigation. Numerous ad hoc approaches exist for handling missing data, including listwise and pairwise deletion. However, these approaches assume that the missing data are unrelated to the variables that have missing data, or any other variable in the model, i.e., missing completely at random (MCAR), a heroic assumption at best. More sophisticated approaches for dealing with missing data derive from the seminal ideas of Little and Rubin (1987). Utilizing the Little and Rubin framework, a major breakthrough in model-based approaches to missing data in the structural equation modeling context was made simultaneously by Allison (1987) and Muthén et al. (1987). They showed how structural equation modeling could be used when missing data are missing-at-random (MAR), an assumption that is more relaxed than missing-completely-at-random. The approach advocated by Muthén et al. (1987) was compared with standard listwise deletion and pairwise deletion approaches (which assume MCAR) in an extensive simulation study. The general findings were that Muthén et al.’s approach was superior to listwise and pairwise deletion approaches for handling missing data even under conditions where it was not appropriate to ignore the mechanism that generated the missing data. The major problem associated with Muthén et al.’s approach is that it is restricted to modeling under a relatively small number of distinct missing data patterns. A small number of distinct missing data patterns will be rare in social and behavioral science applications. However, it has proved possible to apply MAR-based approaches (Little and Rubin 1987) to modeling missing data within standard structural equation modeling software in a way that mitigates the problems with Muthén et al.’s approach. Specifically, imputation approaches have become available for maximum likelihood estimation of structural model parameters under incomplete data assuming MAR.
Simulation studies have demonstrated the effectiveness of these methods in comparison to listwise deletion and pairwise deletion under MCAR and MAR.
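The difference between MCAR and MAR can be made concrete with a small simulation. The sketch below is an illustration under assumptions chosen for this example, not the multiple-group approach of Muthén et al. (1987): x is always observed, y is deleted with probability 0.8 whenever x is positive (so missingness depends only on observed data, i.e., MAR), and the true mean of y is zero. The likelihood-based estimate uses the regression of y on x among complete cases, evaluated at the mean of x for all cases, which is the maximum likelihood estimate of the mean of y for bivariate normal data with this simple missing data pattern. All function names are hypothetical.

```python
import math
import random


def simulate_mar(n=20_000, slope=0.7, p_drop=0.8, seed=1):
    """Generate (x, y) pairs; y is set to None with probability p_drop when x > 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = slope * x + rng.gauss(0.0, math.sqrt(1.0 - slope ** 2))
        if x > 0 and rng.random() < p_drop:
            y = None  # missingness depends only on the observed x (MAR)
        rows.append((x, y))
    return rows


def complete_case_mean(rows):
    """Listwise deletion: average y over cases where y is observed."""
    ys = [y for _, y in rows if y is not None]
    return sum(ys) / len(ys)


def regression_adjusted_mean(rows):
    """Regress y on x in complete cases, then predict at the mean of x
    computed from all cases."""
    complete = [(x, y) for x, y in rows if y is not None]
    mx = sum(x for x, _ in complete) / len(complete)
    my = sum(y for _, y in complete) / len(complete)
    sxy = sum((x - mx) * (y - my) for x, y in complete)
    sxx = sum((x - mx) ** 2 for x, _ in complete)
    slope = sxy / sxx
    intercept = my - slope * mx
    mean_x_all = sum(x for x, _ in rows) / len(rows)
    return intercept + slope * mean_x_all


rows = simulate_mar()
cc = complete_case_mean(rows)
ml = regression_adjusted_mean(rows)
print("listwise-deletion mean of y:", round(cc, 3))
print("regression-adjusted mean of y:", round(ml, 3))
```

Listwise deletion yields a clearly negative mean because the retained cases over-represent low values of x, whereas the adjusted estimate stays close to the true value of zero.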
5.4 Specification Error
Structural equation modeling assumes that the model is properly specified. Specification error is defined as the omission of relevant variables in any equation of the system of equations defined by the structural equation model. This includes the measurement model equations as well as the structural model equations. As with the problems of non-normality and missing data, the question of concern is the extent to which omitted variables affect inferences. The mid-1980s saw a proliferation of studies on the problem of specification error in structural equation models. On the basis of work by Kaplan and others (summarized in Kaplan 2000) the general finding is that specification errors in the form of omitted variables can result in substantial parameter estimate bias. In the context of sampling variability, standard errors have been found to be relatively robust to small specification errors. However, tests of free parameters in the model are affected in such a way that specification error in one parameter can propagate to affect the power of the test in another parameter in the model. Sample size also interacts with the size and type of the specification error.
6. Modern Developments Constituting the ‘Second Generation’
Structural equation modeling is, arguably, one of the most popular statistical methodologies available to quantitative social scientists. The popularity of structural equation modeling has led to the creation of a scholarly journal devoted specifically to structural equation modeling as well as the existence of SEMNET, a very popular and active electronic discussion list that focuses on structural equation modeling and related issues. In addition, there are a large number of software programs that allow for sophisticated and highly flexible modeling. In addition to ‘first-generation’ applications of structural equation modeling, later developments have allowed traditionally different approaches to statistical modeling to be specified as special cases of structural equation modeling. Space constraints make it impossible to highlight all of the modern developments in structural equation modeling. However, the more important modern developments have resulted from specifying the general model in a way that allows a ‘structural modeling’ approach to other types of analytic strategies. The most recent examples of this development are the use of structural equation modeling to analyze multilevel data and the specification of structural equation models to estimate growth parameters from longitudinal data.
6.1 Multilevel Structural Equation Modeling
Attempts have been made to integrate multilevel modeling with structural equation modeling so as to provide a general methodology that can account for issues of measurement error and simultaneity as well as multistage sampling. Multilevel structural equation modeling assumes that the levels of the within-group endogenous and exogenous variables vary over between-group units. Moreover, it is possible to specify a model which is assumed to hold at the between-group level and that explains between-group variation of the within-group variables. An overview and application of multilevel structural equation modeling can be found in Kaplan (2000).
6.2 Growth Curve Modeling from the Structural Equation Modeling Perspective
Growth curve modeling has been advocated for many years by numerous researchers for the study of interindividual differences in change. The specification of growth models can be viewed as falling within the class of multilevel linear models (Bryk and Raudenbush 1992), where level 1 represents intraindividual differences in initial status and growth and level 2 models individual initial status and growth parameters as a function of interindividual differences. Research has also shown how growth curve models can be incorporated into the structural equation modeling framework utilizing existing structural equation modeling software. The advantage of the structural equation modeling framework for growth curve modeling is its tremendous flexibility in specifying models of substantive interest. Developments in this area have allowed the merging of growth curve models and latent class models for the purpose of isolating unique populations defined by unique patterns of growth.
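The two-level view of growth described in this section can be written out for the simplest case of linear growth. The notation below is an assumption adopted for illustration (the article gives no formulas here): $y_{ti}$ is the outcome for individual $i$ at occasion $t$, $a_t$ is a fixed time score, and $x_i$ is a single individual-level predictor.

```latex
\begin{align*}
\text{Level 1 (within individual):}\quad
  y_{ti} &= \pi_{0i} + \pi_{1i}\, a_t + \varepsilon_{ti} \\
\text{Level 2 (between individuals):}\quad
  \pi_{0i} &= \beta_{00} + \beta_{01}\, x_i + r_{0i} \\
  \pi_{1i} &= \beta_{10} + \beta_{11}\, x_i + r_{1i} \\[4pt]
\text{SEM form:}\quad
  \mathbf{y}_i &= \boldsymbol{\Lambda}\,\boldsymbol{\eta}_i + \boldsymbol{\varepsilon}_i,
  \qquad
  \boldsymbol{\Lambda} =
  \begin{pmatrix} 1 & a_1 \\ 1 & a_2 \\ \vdots & \vdots \\ 1 & a_T \end{pmatrix},
  \qquad
  \boldsymbol{\eta}_i = \begin{pmatrix} \pi_{0i} \\ \pi_{1i} \end{pmatrix}
\end{align*}
```

In the structural equation form, initial status and growth rate are treated as latent variables whose loadings are fixed to the time scores, which is how such a model can be fitted with existing structural equation modeling software.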
7. Pros and Cons of Structural Equation Modeling
Structural equation modeling is, without question, one of the most popular methodologies in the quantitative social sciences. Its popularity can be attributed to the sophistication of the underlying statistical theory, the potential for addressing important substantive questions, and the availability and simplicity of software dedicated to structural equation modeling. However, despite the popularity of the method, it can be argued that ‘first-generation’ use of the method is embedded in a conventional practice that precludes further statistical as well as substantive advances. The primary problem with the ‘first-generation’ practice of structural equation modeling lies in attempting to attain a ‘well-fitting’ model. That is, in conventional practice, if a model does not fit from the standpoint of one statistical criterion (e.g., the likelihood ratio chi-squared test), then other conceptually contradictory measures are usually reported (e.g., the NNFI). In addition, it is not uncommon to find numerous modifications made to an ill-fitting model to bring it in line with the data, usually supplemented by post hoc justification for how the modification fits into the original theoretical framework. Perhaps the problem lies in an obsession with null hypothesis testing, certainly an issue that has received considerable attention. Or perhaps the practice of ‘first-generation’ structural equation modeling is embedded in the view that only a well-fitting model is worthy of being interpreted. Regardless, it has been argued by Kaplan (2000) that this conventional practice precludes learning valuable information about the phenomena under study that could otherwise be attained if the focus were on the predictive ability of a model. Such a focus on the predictive ability of the model combined with a change of view toward strict hypothesis testing might lead to further substantive and statistical developments. Opportunities for substantive development and model improvement emerge when the model does not yield accurate or admissible predictions derived from theory. Opportunities for statistical developments emerge when new methods are developed for engaging in prediction studies and evaluating predictive performance. That is not to say that there are no bright spots in the field of structural equation modeling. Indeed, developments in multilevel structural equation modeling, growth curve modeling, and latent class applications suggest a promising future with respect to statistical and substantive developments. However, these ‘second-generation’ methodologies will have to be combined with a ‘second-generation’ epistemology so as to realize the true potential of structural equation modeling in the array of quantitative social sciences.
See also: Causation (Theories and Models): Conceptions in the Social Sciences; Factor Analysis and Latent Structure, Confirmatory; Factor Analysis and Latent Structure: Overview; Growth Curve Analysis; Indicator: Methodology; Latent Structure and Causal Variables; Linear Hypothesis: Regression (Basics); Statistical Analysis: Multilevel Methods
Bibliography
Allison P D 1987 Estimation of linear models with incomplete data. In: Clogg C C (ed.) Sociological Methodology. Jossey-Bass, San Francisco, pp. 71–103
Bentler P M 1980 Multivariate analysis with latent variables: causal modeling. Annual Review of Psychology 31: 419–56
Bentler P M 1990 Comparative fit indices in structural models. Psychological Bulletin 107: 238–46
Bentler P M, Bonett D G 1980 Significance tests and goodness-of-fit in the analysis of covariance structures. Psychological Bulletin 88: 588–606
Browne M W 1984 Asymptotic distribution free methods in the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology 37: 62–83
Bryk A S, Raudenbush S W 1992 Hierarchical Linear Models: Applications and Data Analysis Methods. Sage, Newbury Park, CA
Jöreskog K G 1972 Factor analysis by generalized least squares. Psychometrika 37: 243–60
Jöreskog K G 1973 A general method for estimating a linear structural equation system. In: Goldberger A S, Duncan O D (eds.) Structural Equation Models in the Social Sciences. Academic Press, New York, pp. 85–112
Kaplan D 2000 Structural Equation Modeling: Foundations and Extensions. Sage, Newbury Park, CA
Keesling J W 1972 Maximum likelihood approaches to causal analysis. Unpublished Ph.D. thesis, University of Chicago
Little R J A, Rubin D B 1987 Statistical Analysis with Missing Data. Wiley, New York
McDonald R P, Marsh H W 1990 Choosing a multivariate model: noncentrality and goodness-of-fit. Psychological Bulletin 107: 247–55
Muthén B 1984 A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika 49: 115–32
Muthén B 2001 Second-generation structural equation modeling with a combination of categorical and continuous latent variables: New opportunities for latent class/latent growth modeling. In: Sayer A, Collins L (eds.) New Methods for the Analysis of Change. American Psychological Association, Washington, DC, pp. 291–322
Muthén B, Kaplan D, Hollis M 1987 On structural equation modeling with data that are not missing completely at random. Psychometrika 51: 431–62
Sobel M, Bohrnstedt G W 1985 Use of null models in evaluating the fit of covariance structure models. In: Tuma N B (ed.) Sociological Methodology. Jossey-Bass, San Francisco, pp. 152–78
Steiger J H, Lind J M 1980 Statistically based tests for the number of common factors. Paper presented at the Psychometric Society, Iowa City, IA
Wiley D E 1973 The identification problem in structural equation models with unmeasured variables. In: Goldberger A S, Duncan O D (eds.) Structural Equation Models in the Social Sciences. Academic Press, New York, pp. 69–83
D. Kaplan
Structuralism
For expository reasons, a specific sense of structuralism will be distinguished from a loose sense. The specific sense is the program which Claude Lévi-Strauss, borrowing from structural linguistics, tried to introduce into anthropology. It is defined by the various, not always compatible, statements by Lévi-Strauss. The loose sense is reserved for those authors who deserve to be discussed when speaking about a style of approach which has manifested itself in different ways in different times and places, but which in the history of ideas can generally be traced back to certain writings of Emile Durkheim and Marcel Mauss. Included in this group are figures who wrote well before Lévi-Strauss, such as Dutch authors who made many of the analytic discoveries found in Lévi-Strauss’s The Elementary Structures of Kinship (1969a), but who did not speak of structuralism, as well as contemporaries of Lévi-Strauss who, though knowledgeable of Lévi-Strauss’s writings, did not regard themselves necessarily as structuralists in his sense, or did so only part of the time. The specific sense of structuralism poses its own problems. Which of Lévi-Strauss’s many doctrines and which of the many intellectual influences on Lévi-Strauss are to be regarded as structuralist? Is it not arbitrary to regard some as structuralist and to
disregard the others? Yet taken as a whole, Lévi-Strauss’s style of thought is so personal as to be impossible to imitate in its entirety. Given the elusive nature of his language and his many self-contradictions, many would not want to. Lévi-Strauss’s thought and prose, for example, have been said to be influenced by the French symbolist poets and the surrealists (e.g., Leach 1970, p. 16); and according to his own testimony he has been influenced by such composers as Debussy, Stravinsky, and Wagner (Lévi-Strauss 1969b, p. 15). Elsewhere (Lévi-Strauss 1955, pp. 48–50) he describes the influence on his thought of geology, psychoanalysis, and Marxism. ‘All three showed that understanding consists in the reduction of one type of reality to another; that true reality is never the most obvious of realities … .’ In response Leach (1970, p. 13) has written, ‘the relevance of Marxist ideology for an understanding of Lévi-Strauss is difficult to determine. Lévi-Strauss’ use of dialectic, with the formal sequence of thesis–antithesis–synthesis, is Hegelian rather than Marxist and his attitude to history seems to be quite contrary to Marxist dogma.’ Lévi-Strauss dedicated The Savage Mind (1966) to the phenomenologist Merleau-Ponty. How are we precisely to relate the influence on him of American anthropologists such as Boas, Kroeber, and Lowie to his structuralism? Perhaps we are on surer ground when it comes to the debts which he admits to owing Marcel Granet, Georges Dumézil, and Henri Grégoire (Lévi-Strauss 1969b, p. 15, n. 5). Also there is less difficulty in seeing the similarity between his structuralism and Gestalt psychology with its emphasis on the perception of patterns (Lévi-Strauss 1963a, pp. 324–41). The relevance of Gestalt was recognized by another of his influences, namely Wiener (1948, p. 18, ff.), one of the founders of communications theory.
Indeed games theory, the mathematical theory of communication and cybernetics are obvious and acknowledged sources of some of his more unquestionably structuralist ideas (Lévi-Strauss 1963a, p. 283; for a discussion see Ardener 1971, liv ff.). The most direct source of course is the linguistic methods of the structuralist linguists Troubetzkoy, Jakobson, and Halle (cf. Benveniste 1962). In an anticipatory response to a statement of Wiener (1948, p. 24) that sociology and anthropology lacked the long runs of statistics under constant conditions necessary for the applications of the mathematics of communications theory, Lévi-Strauss wrote in a paper of 1945 on Structural Analysis in Linguistics and in Anthropology that, ‘It [linguistics] is probably the only one [social science] which can truly claim to be a science.’ Lévi-Strauss’s scientism (a characteristic of his thought), like that of A. R. Radcliffe-Brown, is thus programmatic rather than realized. However, it is the methods which have allegedly turned linguistics into a scientific social science that Lévi-Strauss wants to use to achieve the same transformation in social anthropology.
Referring to Troubetzkoy (1933), Lévi-Strauss reduced the structural method to four basic operations. First, structural linguistics shifts from the study of conscious linguistic phenomena to study of their unconscious infrastructure; second, it does not treat terms as independent entities, taking instead as its basis of analysis the relations between terms; third, it introduces the concept of system. ‘Modern phonemics does not merely proclaim that phonemes are always part of a system; it shows concrete phonemic systems and elucidates their structure’; finally, structural linguistics aims at discovering general laws, either by induction ‘or … by logical deduction, which would give them an absolute character.’ Thus, for the first time, a social science is able to formulate necessary relationships (Lévi-Strauss 1963a, p. 33, 1962). The crucial influence here was the conception of the phoneme. This idea was based on a distinction between the significant and the insignificant sounds in a given language. Only a small set of sounds in any given language are capable of determining differences of meaning. These individual phonemes are defined by contrastive features which are the distinctive features of a given language, and these distinctive features are defined by a system of oppositions, the sort of thing that makes the difference between pat and bat. In the early 1945 paper he drew an analogy between the phoneme and the terms of a kinship terminology: ‘Like phonemes, kinship terms are elements of meaning; like phonemes, they acquire meaning only if they are integrated into systems. Kinship systems, like ‘‘phonemic systems,’’ are built by the mind on the level of unconscious thought … in the case of kinship as well as linguistics, the observable phenomena result from the action of laws which are general but implicit’ (Lévi-Strauss 1963a, p. 34). Lévi-Strauss also took over from Ferdinand de Saussure, from whom structuralist linguistics derives, the notion of the arbitrary nature of the linguistic sign and the realization that each linguistic term derives its value from its opposition to all the other terms (de Saussure 1922, p. 126). As Ardener (1971, p. 36) comments, ‘Here we have clearly stated the now fashionable anthropological view that elements in the system define themselves in opposition to all other elements in the system.’ In contrasting his views on social structure to those of Radcliffe-Brown, Lévi-Strauss held that social structure is a model made up after empirical reality. It is, therefore, a method with the following features. It is a system made of elements, none of which can change without changing other elements. As a model, it should take its place in a series of transformations resulting in a group of models of the same type. It should be possible to predict how the model will react if its elements are modified. Importantly and finally, the model should make intelligible all of the facts. The best model is the simplest model which accounts for all of the facts (Lévi-Strauss 1963a, pp. 279–81). Lévi-Strauss
can be said to have brought the intellect into anthropology. As he said, ‘each culture has its own theoreticians whose contributions deserve the same attention as that which the anthropologist gives to colleagues.’ ‘Many ‘‘primitive’’ cultures have built models of their marriage regulations which are much more to the point than models built by professional anthropologists’ (Lévi-Strauss 1963a, p. 282). The anthropologist should take these models into consideration and, after comparing them to what people actually do, form his own models, which for Lévi-Strauss are the structure. These ideas are already present in his early classic The Elementary Structures of Kinship (1969a) in which he treated certain positive marriage rules as transformations across a range of logically possible models, of which only some had empirical realization. They directed his study of systems of classification in his Totemism (1963b) and The Savage Mind (1966). A characteristic argument is that ‘so-called totemic institutions’ turn on ‘an homology between two systems of differences,’ that is, the differences between species are homologous to the differences between groups. These ideas are also carried into his several papers and four books on myth analysis which were intended to reveal the constraints on the human mind (Lévi-Strauss 1969b, p. 10). ‘Mythical thought always progresses from the awareness of oppositions toward their resolution’ (Lévi-Strauss 1963a, p. 224). Lévi-Strauss (1963a, p. 228) early on proposed that every myth corresponds to a formula of the following type: $F_x(a) : F_y(b) \simeq F_x(b) : F_{a-1}(y)$. Later he returned to it (Lévi-Strauss 1988, pp. 57, 156, 163, 170). This formula he describes as representing ‘an inversion of terms and relations, under two conditions: (1) that one term be replaced by its opposite (in the above formula, $a$ and $a-1$); (2) that an inversion be made between the function value and the term value of two elements (above, $y$ and $a$).’
Mosko (1991) has modified the formula and said that it works. For some, however, the formula may be incomprehensible. However, elsewhere Lévi-Strauss warns his readers against taking his mathematical formulas and symbols too seriously, as their resemblances to those of the mathematicians are only superficial (Lévi-Strauss 1969b, p. 30). Nevertheless, apart from the above formula, Lévi-Strauss never shows us how the fundamental structure of myth differs from his initial ideas of opposition, mediation and transformation. When he does set forth the structure of a given myth, the demonstration is so heavily freighted with content (the concrete elements that are in opposition) that it is hard to see the difference between the formal structure and the concrete myth. Dumont (1975, p. 333), following Pocock (1961, p. 76), has written of a convergence between Evans-Pritchard and Lévi-Strauss in the pages of The Nuer (1940a), in a passing from the functionalism of
Radcliffe-Brown to meaning. Radcliffe-Brown used structure as a synonym for social structure, the true subject of social anthropology. In the closing pages of The Nuer, Evans-Pritchard makes several attempts to characterize his achievement and state what he means by structure, finally stating that, ‘We have defined structure by what amounts to the presence of group segmentation,’ that is, the segmentary principle (Evans-Pritchard 1940a, p. 265). If this is a convergence, it is a convergence before the fact; for Lévi-Strauss wrote his essay on social structure more than a decade after Evans-Pritchard wrote The Nuer. It is striking that in that essay Lévi-Strauss mentions The Nuer only in a footnote in the midst of a long list of books which have had something to say about economic formations (Lévi-Strauss 1963a, p. 319, n. 49). At the risk of reading the present into the past and taking the lead of Pocock and Dumont, we could say that in 1939 or thereabouts, Evans-Pritchard did not have the vocabulary available to talk about his achievement which would have been available a little over a decade later. Taking the better parts of Lévi-Strauss’s essay, he might later have been able to write that what he had done in The Nuer is present a series of models, starting with the models of the Nuer themselves, that is, their meanings, and that the most salient and most abstract model he had used was the segmentary principle, which was essentially his final definition of structure. Dumont noted too that in Evans-Pritchard’s treatment of factual oppositions there were underlying conceptual oppositions and that, ‘the author has well and truly discovered structuralism on his own account without direct borrowing – as was the case in other quarters – from contemporary developments in neighboring disciplines (phonology, Gestalt theory)’ (Dumont 1975, p. 335).
Dumont also noted that ‘what is properly and strictly the structural aspect of The Nuer has never really taken root in England’ (Dumont 1975, p. 334). For example, ‘immediately afterwards, the opposite tendency to reify or solidify, in the manner of Radcliffe-Brown, was vigorously asserted by Meyer Fortes’ (Dumont 1975, p. 340). Thus, Fortes stated that ‘The most important feature of unilineal descent groups in Africa … is their corporate organization’ (Fortes 1953, p. 25). More recently Holy stated that ‘Evans-Pritchard saw the lineage system in terms of the paradigm of hierarchically structural segments, each of which was a corporation’ and lists as one of his errors ‘ascribing corporateness to what are empirically … only categories of people’ (Holy 1996, pp. 88–9). As evidence of Evans-Pritchard’s views he cites Fortes. In fact, both Fortes and Holy overlook the fact that Evans-Pritchard stated quite plainly in The Nuer (Evans-Pritchard 1940a, p. 89) and in his contribution to African Political Systems (1940b, pp. 286–7) that, ‘Nuer lineages are not corporate, localized, communities’ and ‘clans and their lineages are not distinct corporate groups.’
In The Nuer structure remains close to social structure, but by 1956 in chapter IV, ‘Spirit and the Social Order,’ in Nuer Religion, Evans-Pritchard was applying similar ideas to conceptions of divinity. ‘Given the segmentary political and lineage structure of the Nuer it is understandable that the same complementary tendencies towards fission and fusion and the same relativity that we find in the structure are found also in the action of Spirit in the social life’ (Evans-Pritchard 1956, p. 115). This chapter understandably bears a close similarity in inspiration to Chap. 4 in his student Needham’s Structure and Sentiment (1962) and to that mine of structuralist inspiration Primitive Classification (Durkheim and Mauss 1963) with its concern about the relation between the structure of society and the structure of classification (for a discussion of its influence see Needham 1963, pp. 31–2). Evans-Pritchard was one of the few Africanists who had the slightest understanding of what Asianists such as Dumont, Leach, and Needham were trying to do in bringing structuralist approaches into British anthropology, particularly, but not exclusively, in relation to the topic of marriage alliance. A convergence between Lévi-Strauss’s structuralism and early work by Dutch anthropologists was recognized and celebrated by Josselin de Jong (1952). P. E. Josselin de Jong (1977) provided a service by bringing out a selection of this early Dutch structuralism (to give it a retrospective designation) in English translation in his Structural Anthropology in the Netherlands, which includes Louis Onvlee’s essay on the construction of the Mangili dam, a crystal clear example of pre-Lévi-Straussian structuralism that achieves more than Lévi-Strauss ever did at a similar level, but which remains entirely free of the characteristic blemishes of Lévi-Strauss’s thought.
The terms structuralism and structuralist were unknown to some of the authors alluded to here, accepted retrospectively by some, and avoided by others. Leach wrote (1970, p. 9), ‘I myself was once a pupil of Malinowski and I am, at heart, still a ‘‘functionalist’’ even though I recognize the limitations of Malinowski’s own brand of theory. Although I have occasionally used the ‘‘structuralist’’ methods of Lévi-Strauss to illuminate particular features of particular cultural systems the gap between my general position and that of Lévi-Strauss is very wide.’ In Political Systems of Highland Burma (1954), which Leach later characterized as rationalist, that is, structuralist, unlike his later book Pul Eliya (1961) which he characterized as empiricist, that is, functionalist (Leach 1976, p. 6), Leach writes that as opposed to social structure, culture provides the form and ‘dress’ of the social situation (Leach 1954, p. 16). These views did not prevent him from publishing structuralist treatments of the Bible (Leach 1962) and of Michelangelo’s paintings on the ceiling of the Sistine Chapel (Leach 1977). Leach’s attitude appears to be that structuralism is a technique to be used when it is advantageous to do so, rather than a movement within which the anthropologists must always travel. In all probability Dumont would have agreed with him. When considering north Indian kinship in relation to south Indian kinship, Dumont (1966, p. 103) found that his structuralist methods, which were so productive in the south, were unproductive, or at least not productive to the same extent, in the north. It can be said that some peoples are structuralist and others are not. Perhaps it could be said that there are structuralist topics and non-structuralist topics about a given people. The sensible course is to use whatever techniques suggest themselves and produce a result.
See also: Anthropology, History of; Durkheim, Emile (1858–1917); Kinship in Anthropology; Linguistics: Overview; Saussure, Ferdinand de (1857–1913); Structuralism, History of; Structuralism, Theories of
Bibliography
Ardener E 1971 Introductory essay: Social anthropology and language. In: Ardener E (ed.) Social Anthropology and Language (ASA Monographs 10). Tavistock, London
Benveniste E 1962 ‘Structure’ en linguistique. In: Bastide R (ed.) Sens et usages du terme structure. Mouton, The Hague
Dumont L 1966 North India in relation to South India. Contributions to Indian Sociology 9: 90–103
Dumont L 1975 Preface by Louis Dumont to the French edition of The Nuer. In: Beattie J H M, Lienhardt R G (eds.) Studies in Social Anthropology. Clarendon Press, Oxford, UK
Durkheim E, Mauss M 1963 Primitive Classification (trans. Needham R). University of Chicago Press, Chicago
Evans-Pritchard E E 1940a The Nuer. Clarendon Press, Oxford, UK
Evans-Pritchard E E 1940b The Nuer of the Southern Sudan. In: Fortes M, Evans-Pritchard E E (eds.) African Political Systems. Oxford University Press, London
Evans-Pritchard E E 1956 Nuer Religion. Clarendon Press, Oxford, UK
Fortes M 1953 The structure of unilineal descent groups. American Anthropologist 55(1): 17–41
Holy L 1996 Anthropological Perspectives on Kinship. Pluto Press, London
Josselin de Jong J P B 1952 Lévi-Strauss’s Theory on Kinship and Marriage. Brill, Leiden
Josselin de Jong P E (ed.) 1977 Structural Anthropology in the Netherlands (Koninklijk Instituut voor Taal-, Land- en Volkenkunde Translation Series 17). Nijhoff, The Hague
Leach E 1954 Political Systems of Highland Burma. Bell, London
Leach E 1961 Pul Eliya. Cambridge University Press, Cambridge, UK
Leach E 1962 Genesis as myth. Discovery May: 30–5
Leach E 1970 Lévi-Strauss. Collins, London
Leach E 1976 Culture and Communication. Cambridge University Press, Cambridge, UK
Leach E 1977 Michelangelo’s genesis: Structuralist comments on the paintings on the Sistine Chapel ceiling. Times Literary Supplement March 18: 311–3
Lévi-Strauss C 1955 Tristes Tropiques. Plon, Paris
Lévi-Strauss C 1962a Les limites de la notion de structure en ethnologie. In: Bastide R (ed.) Sens et usages du terme structure. Mouton, The Hague
Lévi-Strauss C 1963a Structural Anthropology. Basic Books, New York
Lévi-Strauss C 1963b Totemism (trans. Needham R). Beacon, Boston
Lévi-Strauss C 1966 The Savage Mind. Weidenfeld and Nicolson, London
Lévi-Strauss C 1969a The Elementary Structures of Kinship (trans. Harle Bell J, von Sturmer J R, Needham R). Eyre & Spottiswoode, London
Lévi-Strauss C 1969b The Raw and the Cooked: Introduction to a Science of Mythology. Harper and Row, New York, Vol. 1
Lévi-Strauss C 1988 The Jealous Potter (trans. Chorier B). University of Chicago Press, Chicago
Mosko M S 1991 The canonic formula of myth and nonmyth. American Ethnologist 18(1): 126–51
Needham R 1962 Structure and Sentiment. University of Chicago Press, Chicago
Needham R 1963 Introduction: Primitive Classification. University of Chicago Press, Chicago
Pocock D 1961 Social Anthropology. Sheed and Ward, London
Radcliffe-Brown A R 1952 Structure and Function in Primitive Society. Macmillan, London
de Saussure F 1922 Cours de linguistique générale, 2nd edn. Payot, Geneva
Troubetzkoy N 1933 La phonologie actuelle. Psychologie du langage. Paris
Wiener N 1948 Cybernetics, or Control and Communication in the Animal and the Machine. Wiley, New York
R. H. Barnes
Structuralism, History of
The triumph of structuralism during the 1950s and 1960s was of such a remarkable nature that it became identified with the ensemble of French intellectual history since 1945. During these years, little attention was paid to anything outside of this trend, which appeared as a new perspective on the world and human culture. Emphasising critical thinking and expressing the emancipatory will of nascent social sciences in search of scholarly and institutional legitimation, structuralism generated a real sense of collective enthusiasm for the whole of French intelligentsia over at least two decades. Then suddenly, at the beginning of the 1980s, everything was overturned. Most of the French heroes of this intellectual adventure disappeared, and with them their work, which a new epoch was anxious to bury, thus accentuating the impression of the end of an era. In this way, the working through of mourning, necessary in order to do justice to what had been one of the most fruitful periods of our intellectual history, was avoided. Taken in its most general sense, the word structure served as a password for a large part of the human sciences (sciences humaines), enabling a unitary program.
Where does the concept of structuralism, one that has led both to such infatuation and to such opprobrium, originate? Derived from the word structure, it starts out as an architectural term. Structure was first and foremost ‘the way in which a building is constructed’ (Dictionnaire de Trévoux, 1771 edition). During the seventeenth and eighteenth centuries, the meaning of the term structure is modified and enlarged by analogy with living beings. Thus, the human body is perceived as a construction in Fontenelle, as is language in Vaugelas and Bernot. The term, therefore, is taken to refer to a description of how the components of a concrete object make up the whole. It can have various applications (anatomical, psychological, geological, mathematical structures, etc.). The structural approach is not really applied to the field of the human sciences until the nineteenth century, with Spencer, Morgan and Marx. It is then taken to refer to a lasting phenomenon that connects, in complex terms, the components of an aggregate according to a more abstract schema. The term structure, rarely found in Marx, with the exception of the preface to A Contribution to the Critique of Political Economy (1859), is consecrated by Durkheim at the end of the nineteenth century in The Rules of Sociological Method (1895). Structure thus gives rise, between 1900 and 1926, to a term that André Lalande’s Vocabulaire qualifies as a neologism: structuralism. Structuralism emerges among psychologists in opposition to the functional psychology of the turn of the century. However, structuralism’s real starting point, in terms of its modern meaning and its position of dominance among all the human sciences, lies in the evolution of linguistics.
Although Saussure uses the term structure only three times in The Course in General Linguistics, it is the Prague School (Troubetzkoy and Jakobson) that brings the terms structure and structuralism into widespread use. The Danish linguist Hjelmslev, who founded the journal Acta Linguistica in 1939, asserts the reference to structure as a charter program. The first article in this journal treats the subject of ‘structural linguistics.’
1. The Structural Years
Departing from this linguistic nucleus, the term triggers a veritable revolution throughout the human sciences at the heart of the twentieth century, so much so that Michel Foucault observed that structuralism ‘is not a new method, but the awakened and anxious conscience of modern knowledge.’ Jacques Derrida defined this approach as an ‘adventure in perception’ and Roland Barthes saw structuralism as the passage from symbolic to paradigmatic conscience, or as the advent of the awareness of paradox. It therefore suggests a movement of thought and a new relationship to the world, much wider than a simple binary methodology confined to one or another field of specific study. This interpretation, seeing itself as unitary, privileges the sign at the expense of meaning, space at the expense of time, the object at the expense of the subject, relation at the expense of content, and culture at the expense of nature. Primarily, structuralism functions as a paradigm for a philosophy of suspicion, an act of exposure on the part of intellectuals who aimed at demystifying the doxa and denaturalising meaning, destabilising it while searching behind the scenes for an expression of bad faith. It is postulated that the class position of the speaker or his libidinal position enables the denouncement of the illusion under which all types of discourse are condemned to exist. This situation of exposure is inscribed in the ancestry of the French epistemological tradition, which proposes a position of overhang or rupture between scientific competence and common sense, the latter seen as being tied to illusion. Behind this paradigm of suspicion, a profound pessimism envelops intellectuals, as well as a critique of western modernity, the antithesis of which they attempt to grasp by reference to the figures of the child, the insane or the savage. Thus, beneath the liberatory discourse of Enlightenment, the discipline of the body, the confinement and imprisonment of the social body within an infernal logic of knowledge and power is exposed. Roland Barthes, then, says: ‘I profoundly refuse my civilisation, to the point of nausea’ and the ‘Finale’ of Claude Lévi-Strauss’s The Naked Man ends with the word ‘NOTHING,’ written in capitals, evoking a requiem or the twilight of mankind. This structuralist paradigm proposes a suspension of meaning as a means of opposing both eurocentrism and various forms of westernised teleology, in favour of a differentialist mode of thought.
The second dimension of the structuralist paradigm involves philosophy’s take-over of three of the social sciences undergoing emancipation and sharing a common valorisation of the unconscious as the locus of truth. These disciplines are general linguistics, which, with Saussure, separates language (langue) as a legitimate object from speech (parole), forced back to the non-scientific domain; anthropology, with its concern with the encoding of messages rather than with the messages themselves; and psychoanalysis, with its view of the unconscious as an effect of language. This relates to a quest for both legitimation and standing by nascent social sciences confronted with the established force of the classical humanities and by the tradition and conservatism of the old Sorbonne. Structuralism, at this stage, offers itself as a third discourse, bridging literature and the hard sciences, seeking institutionalisation through socialisation, by-passing the academic center of the Sorbonne by way of the peripheral universities, publishing and the press, and the respected institution of the Collège de France, which serves as a refuge for research on the cutting edge. This involves a veritable battle between the ancients and the moderns in which the themes of breaking and rupture come into play at several levels. The social sciences then attempt to free themselves by cutting the umbilical cord that tied them to philosophy and by asserting the efficiency of a scientific method. For their part, some philosophers, taking into account the importance of these endeavors, capture their program and redefine the function of philosophy as a philosophy of the concept. This was to become what is known as ‘effectology,’ or the use of research in the human sciences for a philosophical project that simultaneously deconstructs the classifications employed from within these practices. Philosophy, therefore, retains its central position while declaring its own demise. The major role played by philosophers in this matter also relies on their ability, in the middle of the 1960s, to impose a unitary program, an endeavor in which the structuralist label serves as a rallying call.
The second explanatory idea that accounts for this history is that of generation. A number of landmark events make a lasting impression upon a generation indelibly marked by the Second World War. The difficulty of thinking after Auschwitz, expressed by Adorno, meant the difficulty of thinking optimistically, as was possible at the turn of the century, about the history of a west that had turned towards atrocity, to crimes against humanity. This history was to become the very locus of doubt, questioning, and severe criticism. Hell is no longer the other, but oneself. A questioning of the self and of its illusory capacities for mastery was then aroused by a society that had reduced the individual to nothing more than a number.
This despair that engenders the critical distance of suspicion becomes accentuated during the 1950s and 1960s with the progressive liberation of the colonised peoples who had broken free of the colonial yoke and achieved their independence. Many intellectuals were to see this rejection of western influence as a confirmation of their critical position and held the figure of the ‘other,’ or the epitome of absolute alterity from the west, to be the very locus of the expression of truth and of a certain purity. This release from western history, which enabled the Bororos or the Nambikwaras to be perceived as expressing the purified cradle of humanity, is reinforced by the successive revelations about what some had taken to be the incarnation of their hopes, the real Marxism of Communist countries. The year 1956 is to become a vital date for a generation of Marxist intellectuals that takes refuge in a structuralised Marxism, a ‘vacuum-packed Marxism,’ as it is dubbed by Jean-Marie Domenach, upon which the disasters of real Communism would have no bearing. Many among these intellectuals are to break with Communist culture, stealing away on tiptoes from history and taking refuge in the enclosure of text, science, and in the evacuation of the subject and the signified.
2. Founding the Scientificity of a Third Discourse
The hope for the scientific renovation of the social sciences found its method, a common language capable of effecting change, within structural linguistics. Linguistics then became the model for a whole range of social sciences aspiring to formalism. The discipline was progressively diffused to anthropology, to literary criticism, and to psychoanalysis, and it led to the profound renewal of the mode of philosophical questioning. The areas most affected by the linguistic contagion were those disciplines that found themselves in an institutionally precarious position. Others were disciplines in search of an identity strained by the internal contradiction between their pretensions to scientific positivity and their relationship to the political, as in the case of sociology, or disciplines such as literary studies and philosophy that were fully immersed in the dispute between tradition and modernity. This connection served to blur the boundaries between disciplines. Structuralism offered itself here as a unifying project. This temptation was most clearly expressed by Roland Barthes, who advocated a general semiology capable of grouping all the human sciences around the study of the sign. Modernization becomes articulated at this point with the interdisciplinary, as it is necessary to defy the sacrosanct boundaries in order to allow the linguistic model to infiltrate throughout the field of the human sciences. As everything is linguistic, everyone is linguistic and the world is language. This interdisciplinarity, which infringes upon the Humboldtian model of the university in which each discipline has its strictly delineated place, and which is related to the delegitimation of meta-narratives, creates a veritable enthusiasm for all variants of formalism, for knowledge immanent to itself.
The key word of the times is ‘communication,’ which, even beyond the journal bearing the same title, is evocative of the multi-disciplinary euphoria of this period. Lévi-Strauss was the first to formulate this unifying program for the human sciences in the postwar period. To be sure, the constellation he elaborated gravitated around the social anthropology that he represented and that alone was capable of leading this totalising enterprise. The basis for anthropology’s particular vocation, according to Lévi-Strauss, is its ability to bridge the natural and human sciences and, therefore, it does not reject being classed among the natural sciences at the final call of judgement. Strengthened by his fruitful encounter with Jakobson during the war in the USA, Lévi-Strauss privileges the linguistic model within his anthropological project. His search for invariants and his paradigmatic and syntactic deconstructions incorporate the teaching of Jakobson’s phonology, for example binary opposites and differential variations. His contribution ensured that linguistics engendered a particularly fecund area of scholarship during the post-war period. Although Lévi-Strauss guides anthropology in a cultural direction, due to the emphasis placed on language and the decoding of signs, he does not, however, neglect his ambition for unity. His quest for mental perimeters also seeks out the field of biology. The wholeness to which Lévi-Strauss aspires, incorporating also Marcel Mauss’s ambition to construct the ‘total social fact,’ thus aims to embrace the entire scientific domain. It also aims finally to make of anthropology a Science of Man: a federation of auxiliary sciences strengthened by logical-mathematical models, by the contribution of phonology and by a limitless field of study, able at the same time to encompass societies devoid of history or writing across the globe. The anthropologist is therefore able to access the unconscious of social practices and can reinstate the complex combinatories of the rules in force in all human societies. Such an ambition represented a major challenge for all sciences involved in the study of human beings. It also encouraged competitive reactions to this program from within other fields of knowledge or, on the other hand, a dependency upon a process of conquest with the aim of attaining a standing for those still marginalised disciplines in search of legitimation. The excessive nature of this ambition matches the difficulty faced by anthropology at its outset in positioning itself within the institution. Although anthropology alone did not succeed in emancipating the human sciences, structuralism, taking up the challenge, in fact became the common paradigm, if not a common school, for a whole range of disciplines converging to construct a whole and unified science.
In 1953, against a backdrop of open crisis, Lacan gives his Rome Report. He is charged with paving an attractive and specifically French way towards the unconscious. In order to win this gamble, he searches for foundations, for institutional and theoretical guarantees. Lacan goes in quest of lifelines.
Therefore he is responsible for breathing new life into psychoanalysis, for curbing its crisis by using a strategy of offence and a dynamic of alliance. While Lacan employs all the means at his disposal he nonetheless uses any available intellectual material to turn things to his advantage, here proving rather more successful. Lacan’s re-reading of Freud is inscribed in the Saussurian tradition, prioritising the synchronic dimension. With this, he shares in the structuralist paradigm and prompts a new reading of Freud that no longer privileges the importance of stage theory but submits this to a basic oedipal structure characterized by its universality, independent of the relationship to temporal or spatial contingencies and preceding all history. Contrary to Saussure, for whom the primary object was language, Lacan privileges speech, a displacement made necessary by therapeutic practice. But this speech does not represent the utterances of a conscious subject in control of his words; rather the opposite. Speech is forever cut off from all access to reality. It merely conveys signifiers that rebound between one another. Man, therefore, only exists according to his symbolic function that, in turn, should apprehend him. Lacan thus presents a radical reversal of the notion of the subject as a product of language and its effect, as implied by the notorious proposition that ‘the unconscious is structured like a language.’ There is, therefore, no point in searching for the human essence anywhere else but in language. This is what Lacan means when he says, ‘language is an organ’; ‘the human being is characterized by the fact that his organs are external to himself.’ This symbolic function, upon which man’s identity is founded, is opposed by Lacan in his Rome lecture to the language of bees, which is entirely based on the stability of its relation to the reality it signifies. Lacan finds in the Saussurian sign, cut off from the referent, the semi-ontological kernel of the human condition. This new vision of a decentered, split subject is compatible with the notion of the subject being developed in other structuralist fields within the human sciences. This subject is in some ways fictive, not existing outside of its symbolic dimension and of the signifier. Although the signifier is prioritised over the signified, there is nonetheless no question of the latter’s abandonment. These two different levels therefore interact, a process that Lacan relates to Freud’s discovery of the unconscious, thus making of him, in Lacan’s eyes, the first structuralist. The signifier even makes the signified undergo a sort of passion. As becomes clear here, Lacan submits Saussure to a certain twisting of his concepts: just as the notion of the sliding of the signified under the signifier would not have made sense to Saussure, so too the notion of the unconscious escaped him.
Lacan takes up the two big rhetorical figures of metaphor and metonymy used by Jakobson to highlight the deployment of discourse and assimilates the two to the functioning mechanism of the unconscious that, structured like a language, situates itself in total isology with the rules of the latter.
3. A Global Semiological Project: 1964–66
In 1964 there were several breakthroughs in the classical schema. Structuralism now has the possibility of appearing to be a program shared by several disciplines. It is, in fact, in 1964 that the new university at Nanterre was opened, the very center of literary modernity. It is in the same year that a special edition of Communications appears, edited by Roland Barthes and dedicated to semiology. Barthes here defines ‘structuralist activity’ as the conscience of paradox and the denaturalisation of meaning. At the same time, Lacan, definitively excommunicated by the International Freudian Association, establishes the ECF (l’École de la cause freudienne). His seminar moves from Sainte-Anne to the École Normale Supérieure at the rue d’Ulm, the epicenter of humanities studies that had fallen prey to the veritable fever of scientism and epistemology favoured by Louis Althusser. Lacan was not only welcomed by Althusser there but was also welcomed to the bosom of Marxism in a text entitled ‘Lacan and Freud,’ in which Althusser subscribes to the new direction taken by psychoanalysis as defined by Lacan, seeing in the latter’s return to Freud the mirroring of his own return to Marx. It is in the same year that Lévi-Strauss embarks upon his signature work, Mythologiques, a tetralogy on myths beginning with the publication of The Raw and the Cooked, in which mythical thought is described as structured like a language. It is around philosophy that the formulation of a global program going beyond the positivity of any specific discipline in the social sciences can be found. At the École Normale Supérieure d’Ulm, the epicenter of the circumvention of the strictly classical Sorbonne, the intellectual home of structuralism and that of Lacan and Derrida, philosophy around Louis Althusser is to become the ‘Theory of Theoretical Practice,’ the appropriation of the contribution of the new social sciences and their internal subversion. It is through the return to Marx, evident in the seminars beginning at the outset of the 1960s, that Althusser proposes re-reading Marx by establishing his work as the advent of Science following the epistemological break discernible in it around 1845. Two versions of Marx are, therefore, proposed, of which Althusser refers only to the second, that of science, structural causality and over-determination. Althusser reads Marx in a way that suggests a theoretical antihumanism and a ‘process devoid of either subject or object,’ like many of the themes being elaborated within the structuralist project of the 1960s, in which man is spoken rather than speaking.
The other attempt by the structuralist program to globalise itself is led by Michel Foucault during structuralism’s peak year, 1966, the year in which The Order of Things was published. Foucault thus announces the death of philosophy and its replacement by active thought, brought about by logic and linguistics. Foucault prioritises two (social) sciences in particular, psychoanalysis and ethnology, in that he sees them as ‘counter-sciences’ that enable the destabilising of both history and the subject. Foucault sees humanism as our ‘middle-ages’ and the figure of man as a transitory one destined to disappear. It is once again within the figure of the unconscious of the human sciences that their truth may be uncovered behind their illusory pretension to reinstate man in a command made forever impossible by the three narcissistic wounds provoked by Galileo, Darwin and Freud. The structuralist program is, however, deconstructed from within itself in that it departs from a principled double negation that could not but cause it to fall into an aporia of the subject and of historicity.
The ontologisation of structure that resulted, that which Paul Ricoeur called ‘Kantianism without the transcendental subject,’ could not but lead it into a dead end. I support Paul Ricoeur’s warning against the untimely dissociation of the absolute complementarity of the explanatory level, represented by semiological analysis and the interpretative stage or the reappropriation by the subject of the meaning of text, allowing him to endow it not only with meaning but also with signification. The joining of the semiological and hermeneutic approach enables a reappropriation of the discourse’s various components. Paul Ricoeur presents them as a ‘quadrilateral of discourse:’ (a) the speaker as a singular event of speech; (b) the interlocutor’s bringing out of the dialogical character of the discourse; (c) meaning or the very theme of the discourse which is the unique and exclusive dimension of structuralism to the exclusion of all three other dimensions; and (d) reference or that being spoken about. This kind of reappropriation allows for the emergence, not of a transparent subject or master of meaning, but of an ontology of action based upon a triangular relation systemised by Ricoeur in Oneself as Another, the relationship drawn between the self (ipse), the other (alterity) and the institutional foundations (community). This perspective thus seeks to ‘regroup’ the metaphysics of the act, interpersonal ethics and the problem of social ties, a task made all the more pressing by the discovery in the twentieth century of the ‘fallibility’ of democracy. Paul Ricoeur thus enables us to develop action open to historicity, creativity and to narrativity in order to understand better the constitution of both our personal and collective narrative identities. See also: Anthropology; Deconstruction: Cultural Concerns; Postmodernism: Philosophical Aspects; Structuralism; Structuralism, Theories of
Bibliography
Barthes R 1953 Le Degré Zéro de l’Écriture. Editions du Seuil, Paris
Barthes R 1957 Mythologies. Editions du Seuil, Paris
Barthes R 1964 Essais Critiques. Editions du Seuil, Paris
Braudel F 1969 Ecrits sur l’Histoire. Flammarion, Paris
Dosse F 1991/1992 Histoire du Structuralisme. Editions la Découverte, Paris, 2 Vols.
Ducrot O, Todorov T 1972 Dictionnaire Encyclopédique des Sciences du Langage. Editions la Découverte, Paris
Foucault M [1961] 1972 Histoire de la Folie à l’Âge Classique. Gallimard, Paris
Jakobson R 1963 Essais de Linguistique Générale. Minuit, Paris
Lacan J 1966 Ecrits. Editions du Seuil, Paris
Lévi-Strauss C 1949 Structures Élémentaires de la Parenté. Plon, Paris
Lévi-Strauss C 1952 [1973] Race et histoire. In: Anthropologie Structurale Deux. Plon, Paris
Lévi-Strauss C 1958 Anthropologie Structurale. Plon, Paris
Lévi-Strauss C 1962 La Pensée Sauvage. Plon, Paris
Todorov T, Ducrot O, Sperber D, Safouan M, Wahl F 1968 Qu’est-ce que Le Structuralisme? Editions du Seuil, Paris
Vernant J P 1965 Mythe et Pensée chez les Grecs, 2nd edn. Maspero, Paris
F. Dosse
Structuralism, Theories of
Structuralism is an intellectual tendency that seeks to understand and explain social reality in terms of social structures. Influenced by the rising prestige of mid-nineteenth century science, theories of structuralism focused on structural form as an organizing principle underlying whole cultures and societies. Structural form is analytically distinct from cultural content such as meanings and norms, although culture and structure can be defined in terms of each other as, e.g., in cultural structuralism or in the idea of structure as a cultural pattern or model. In contrast to both culture and the reductionist view of society as an aggregate of individuals, social structure is defined as the forms of social relations (e.g., difference, inequality) among a set of constituent social elements such as positions, units, levels, regions and locations, and social formations (see Structure: Social). Structuralism proceeds on two analytic levels: (a) as a method of analysis or procedure of knowing (epistemology), and (b) as an ontology or metaphysical design of social reality. It approaches its subject matter in terms of two meta-theoretical perspectives on social reality: (1) social structure as an empirical or historical reality, and (2) social structure as a model or representation of reality (see Sociology, Epistemology of; Theory: Sociological). The conceptual property space created by these analytical dimensions generates a fourfold typology that accommodates the major theories of structuralism existing today: (a1) sociological structuralism, (a2) symbolic structuralism, (b1) historical structuralism, and (b2) orthodox structuralism. Since the major symbolic and orthodox structuralist theories privilege simultaneity and synchrony over history and diachrony, b1 will only be discussed briefly in contrast to genetic structuralism in Sect. 2.
1. Sociological Structuralism
Marx and Durkheim are the two most important theorists to introduce structural concepts into the study of society. Both conceived of social structure in holistic and nonreductionist terms, i.e., not as an aggregation of individual actors, but as social facts sui generis, explainable only by other social facts. For Marx, the central social fact was the political economy; for Durkheim it was social ties and norms. Both were committed to the use of positivist scientific procedures in the study of society and believed in the necessity of empirical evidence and the importance of establishing objective causal connections between the phenomena to be explained and the structures explaining them. Both offered an analysis of the structural features of modern societies and were centrally concerned with the transition from premodern to modern forms of social organization. However, they developed different structural concepts and theories of historical and structural transformation.
1.1 Marx’s Materialist Conception of History
Marx formulated the transition from feudal to capitalist society as a historical and structural shift in the mode of production, a high-level structural concept comprising the productive forces as the active source of continuous change and the social relations of production as the historical outcome of conscious human productive activity or self-transforming praxis. The developing contradictions between ever-changing productive forces and increasingly constraining social relations generate the possibility of the transformation of a given mode of production and the emergence of a new mode of production. Structural features such as the division of labor, the ownership and control of capital, the class structure of society, and the ideological representation of class conflict constitute a sociohistorical totality which tends to change with the progressive transformation of the mode of production. Resistance to change by social classes invested and interested in the status quo is the most proximate cause of class conflict and the possibility of social revolution.
Marx’s ([1859] 1904) reliance on the Hegelian causal mechanism of dialectical contradiction and on the internal dynamics of the mode of production led him to overestimate the likelihood of revolutionary change and to underestimate the relatively autonomous role of the state, legal institutions, and social movements in mediating between pressures for change and the collective accommodation to the status quo. The use of a self-transformative principle of dialectical causality intrinsic to the economic sphere to explain historical social change led to the charge that Marx’s theory is one-dimensional, self-confirming, and not independently falsifiable or testable, hence worthless as a scientific theory. It can be argued, however, that the rise of the regulatory state in the twentieth century as well as specific forms of radical state intervention under state socialist or national socialist regimes interrupted global capitalist expansion, thus providing the negative evidence necessary to test and refute Marxian economic predictions. By the same token, the collapse of state socialism and the worldwide decline of regulation and intervention under neoliberal auspices have spurred the resumption of globalization. These historical processes, in turn, are likely to confirm certain Marxian expectations and to contribute to the theoretical rehabilitation of
Marxian theory (see Marxism in Contemporary Sociology).

1.2 Durkheim’s Sociological Realism (Holism)

Durkheim ([1893] 1964) undertook to explain the transition to modern industrial society by invoking a different set of structural forces: population growth due to the agricultural revolution, the resulting rise of population density, the increase in the rate of social interaction, competition, and division of labor among differentiated and specialized social functions, and the emergence of social relations and associations among social groups and occupations. This transition from the mechanical solidarity of segmented but undifferentiated societies to the ‘moral density’ and organic solidarity of highly differentiated ones implied a structural theory of social change which was perfected in his work on the social causes of suicide. The resulting theory of social integration continues to be tested by empirical research. Durkheim formulated a set of methodological rules designed to put sociology on a scientific footing. His conception of social structure as a nonreducible ‘social fact’ was paralleled by culturalist notions of collective representations, collective conscience, as well as cognitive and normative constraints which he referred to as ‘ways of acting, thinking, and feeling.’ This formulation implied a theory of institutions combining structural and normative elements as ‘social facts,’ a theoretical framework transcending the distinction between structure and culture, just as Marx acknowledged the constant interplay of collective action, consciousness, ideology, and structure. Elements of Durkheim’s and Marx’s structuralism reappear in contemporary approaches, e.g., Bourdieu’s (1989) constructivist structuralism and Giddens’s (1984) theory of structuration.

1.3 Modern Network Structuralism

An important development in contemporary sociological structuralism is social network analysis (Scott 1991).
Social networks, defined as the patterns of social relations among a set of actors or nodes, are generic social structures. Networks differ from organizations and institutions in that they are informal, private, self-organizing, noncontractual, unregulated, unaccountable, nontransparent, and typically of limited focus, size, and duration. Networks are the least structured social forms that can be said to possess any structure at all, yet whose structural configurations affect the behavior of their members. While networks are typically seen as real, empirical structures, they can be represented by formal models. As such, they are closely related to concepts like ‘intersecting social circles’ and the ‘web of group affiliations’ in Simmel’s (1971, p. 23) formal sociology and Blau’s (1998) theory of structural sociology (see Networks: Social).
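The definition of a network as a pattern of relations among a set of actors or nodes can be given a formal representation of the kind mentioned above. The following is only a minimal sketch under stated assumptions: the four actors, their ties, and the helper names `density` and `degree` are hypothetical illustrations and are not drawn from Scott (1991) or any other source cited here.

```python
# Hypothetical example: an undirected network as a set of actors plus a set of ties.
# Each tie is a frozenset of two actors, so ("A", "B") and ("B", "A") are the same tie.

def density(nodes, ties):
    """Share of all possible undirected ties that are actually present."""
    n = len(nodes)
    possible = n * (n - 1) // 2
    return len(ties) / possible if possible else 0.0

def degree(nodes, ties):
    """Number of ties each actor takes part in."""
    counts = {node: 0 for node in nodes}
    for tie in ties:
        for node in tie:
            counts[node] += 1
    return counts

# Assumed illustrative data: four actors, four ties.
actors = {"A", "B", "C", "D"}
ties = {frozenset(pair) for pair in [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]}

print(density(actors, ties))  # 4 of the 6 possible ties are present
print(degree(actors, ties))
```

Even in so small a configuration the structural position of each actor differs: C is tied to everyone, while D is tied to the rest only through C, which is the sense in which a network's configuration can affect its members.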
2. Symbolic Structuralism

Sociological empiricism and positivism gave rise to a countervailing epistemological tendency claiming that the holistic structures identified by the new social sciences were nothing but cognitive models of reality since they necessarily had to rely on categories that represented, constituted, or constructed reality. This resurgence of neo-Kantian philosophy influenced Weber, Simmel, and Lévi-Strauss, among others, and served to justify the rejection of positivist causality and historical materialism. Symbolic structuralism is a method of analysis that conceives of structure as a model or representation of reality. Typical is Saussure ([1915] 1966), who introduced certain methodological innovations into the study of language that would move linguistics from a descriptive historical discipline to an analytical, scientific one. The linguistic sign was seen as a holistic combination of two structural elements: a form that signifies (signifier) and a concept to which the form refers (signified). The relationship between signifier and signified is not natural or fixed, but social and arbitrary, hence infinitely variable. Linguistic symbols are meaningful only insofar as they produce identity-in-contrast, binary oppositions (difference), and formal distinctions. This method shifts the analytic focus from such metaphysical concepts as ‘object,’ ‘reality,’ or ‘thing-in-itself’ to the formal properties and the internal structural relations of sign systems. The content and use-value of symbols and objects are subordinated to their relational identity, their position, function, and exchange value within a given system. A little-noted aspect of Saussure’s linguistic structuralism is its origin in the categories of the capitalist economy, especially the ‘radical duality’ of economic history and the political economy as a system (1966, p. 79).
Both economics and linguistics are ‘confronted with the notion of value’; both are ‘concerned with a system for equating things of different orders—labor and wages in one and a signified and signifier in the other’ (Saussure 1966, p. 79, italics in original). Saussure elevates the duality of system and history to central principles of linguistic analysis: the distinction between language and speech, and the priority of synchrony over diachrony. This methodological imperative entails the epistemological precedence of the code over its application, of the structure of the system over its origins and transformation, of exogenous over endogenous change. ‘The linguist who wishes to understand a state must discard all knowledge of everything that produced it and ignore diachrony’ (Saussure 1966, p. 81). Historical change involves merely the displacement of one form or structure by another. History becomes eventless history. Piaget (1970) showed that these structuralist methods spread from linguistics to other disciplines,
e.g., gestalt psychology, psychoanalysis (e.g., the structure of the unconscious in Freud and Lacan), and structural anthropology (Lévi-Strauss). Piaget’s (1970) own genetic structuralism replaces the historical study of social change with the notion of structure as a system of transformations. Three key elements here are the idea of wholeness, of structured wholes as determined by internal rules of transformation, and of the relative autonomy and self-regulation of structures. Piaget tacitly acknowledges the transition of structuralism from a method to an ontology by asserting that even though ‘structures consist in their coming to be; that is, their being “under construction,”’ they nevertheless depend on higher-level structures; i.e., ‘on a prior formation of the instruments of transformation—transformation rules or laws’ (1970, pp. 140–1). Distinct from genetic structuralism, historical structuralism (Marc Bloch, Lucien Febvre, Fernand Braudel and the Annales School) assumes the determinative influence of historical deep structures such as the longue durée. The epistemological ambiguity of structures as models of reality or as pregiven structures appears in other sociological approaches, for example Georg Simmel’s (1971) social forms and social types, Max Weber’s ‘ideal types’ of religion and rationality, and Alfred Schutz’s ‘typifications,’ the ‘first-order constructs’ (the primary, common-sense construction of social reality by participants) and ‘second-order constructs’ (the secondary, derivative social construction by the scientific observer) of the structures of the lifeworld (see Dosse 1997, p. 42 and Phenomenology in Sociology).
3. Orthodox Structuralism

The convergence of structure as a model or representation of reality and the ontological view of structure as substance constitutes orthodox structuralism. The analytic differences between structure and system are blurred. The idealist version is best represented by the work of Claude Lévi-Strauss, the materialist version by Louis Althusser.

3.1 Lévi-Strauss and Anthropological Structuralism

Lévi-Strauss’s structural anthropology was influenced by Saussure’s structuralist linguistics by way of Roman Jakobson and the phonological school of Prague as well as by Durkheim and Marcel Mauss (Dosse 1997, pp. 21–2, 29–30). In his initial attempt to formulate a method for producing a model of social reality, Lévi-Strauss develops a systemic concept of social structure implying the internal interdependence of elements or ‘elementary structures’ as, for example, in kinship systems. This conceptualization set the stage for the idea of a ‘formal structural homology,’ the reproducibility of structures on different levels such that a series of transformations can produce a
group of models of the same general type, ascending from the ‘order of elements’ in the social structure to the ‘order of orders,’ and ultimately to some universal order (Lévi-Strauss 1967, pp. 302–9). For Lévi-Strauss, the advantage of structural models is that they make it possible to arrive at general laws and to specify the internal logic of social systems such that all the observed facts are immediately visible, an assumption shared with gestalt psychology and other approaches based on a coherence concept of truth rather than on a positivist correspondence concept of truth. The search for general laws leads Lévi-Strauss to shift his attention from the study of conscious phenomena (speech, opinions, ideology) to that of the unconscious infrastructure of cognitive forms, basic codes, and myths. Here, then, the reality of social structure is but the surface manifestation of the deep structure of the mind. Ultimately, Lévi-Strauss can be said to take the structuralist model of reality from the agnostic dualism of Kant’s a priori categories back to the pre-Socratic belief that reality can be found only in timeless, changeless ‘Being.’ Thus, it would be logically absurd to see history and social change as anything other than the recurrent rearrangements of internal relations and structural building blocks.
3.2 Althusser’s Marxist Structuralism

The immediate textual source for the adaptation of Marxism to structuralism is Frederick Engels’ formulation of historical materialism in terms of the complex interaction between the economic base and the political and ideological superstructure of society. In this tripartite distinction between relatively autonomous regions of a given social formation, each representing its own form of practice, ‘the economic element finally asserts itself as necessary’ (Engels, in Heydebrand 1981, p. 88). Althusser adapts Engels’ dialectical concept of interaction among levels of social structure by developing a deterministic (rather than probabilistic) scientific model of ‘contradiction and overdetermination.’ This model rejects empiricism, construes dialectics as a dualism of opposites, separates ‘science’ from ideology, and asserts that the superstructure ‘is relatively autonomous but the economy is determinant in the last instance’ (Althusser 1970, p. 177). Since the structuralist concept of causality is deterministic rather than stochastic, a ‘structure is always the co-presence of all its elements and their relations of dominance and subordination—it is an ever pre-given structure’ (Althusser 1970, p. 255). For Althusser, theory is the work of transforming ideology into science. This ‘theoretical practice’ is subject only to its own determinations and is prior to, and independent of, the base–superstructure problematic. This formulation of materialist structuralism occurs at such a high level of abstraction that it becomes practically indistinguishable from idealist structuralism. Althusser later recognized this problem and rejected his notion of theoretical practice as a ‘theoreticist error,’ i.e., as an epistemological quest for categorical constants and orthodox theoretical principles that falsely impose a fixed, ontological order on the ‘real object’ by means of a reified ‘object of knowledge.’ The controversies surrounding orthodox structuralism as an intellectual and ideological tendency have contributed to the clarification of structural concepts in the social sciences, but have failed to formulate testable propositions. Sociological, symbolic, and historical structuralism, however, continue to exercise considerable theoretical influence in the social sciences and in the post-structuralism of Derrida and Foucault.

See also: Durkheim, Emile (1858–1917); Macrosociology–Microsociology; Marx, Karl (1818–89); Networks: Social; Piaget, Jean (1896–1980); Saussure, Ferdinand de (1857–1913); Simmel, Georg (1858–1918); Social Networks and Gender; Sociology: Overview; Structuralism; Structuralism, History of
Bibliography

Althusser L 1970 For Marx. Vintage, New York
Blau P 1998 Culture and Social Structure. In: Sica A (ed.) What is Social Theory. Blackwell, Oxford, UK, pp. 265–75
Bourdieu P 1989 Social space and symbolic power. Sociological Theory 7(1): 14–25
Dosse F 1997 History of Structuralism, 2 vols. University of Minnesota Press, Minneapolis, MN
Durkheim E [1893] 1964 The Division of Labor in Society. Free Press, New York
Giddens A 1984 The Constitution of Society: Outline of a Theory of Structuration. Polity Press, Oxford, UK
Heydebrand W 1981 Marxist structuralism. In: Blau P, Merton R K (eds.) Continuities in Structural Inquiry. Sage, London, pp. 81–119
Lévi-Strauss C 1967 Structural Anthropology. Doubleday Anchor, New York
Marx K [1859] 1904 A Contribution to the Critique of Political Economy. Charles Kerr, New York
Piaget J 1970 Structuralism. Harper Torch, New York
Saussure F de [1915] 1966 Course in General Linguistics. McGraw-Hill, New York
Scott J 1991 Social Network Analysis. Sage, London
Simmel G 1971 Georg Simmel on Individuality and Social Forms [Levine D (ed.)]. University of Chicago Press, Chicago
W. V. Heydebrand
Structure: Social

The term structure (Latin structura from struere, to construct) was first applied to construction and only later on, during the classical period, to the scientific field of biology. It began to be used, in anatomical language, to refer to the interdependence of parts and to the corresponding mode of organization in living bodies. What is worth noticing is that the metaphor of construction on the one hand, and the analogy with the organism on the other, are plainly involved in the first two main uses of the term with reference to human societies, in the mid-nineteenth century. The metaphor is clearly used by Karl Marx when he writes, in the Preface to A Contribution to Critique of Political Economy (1859), the well-known sentence according to which ‘the sum total of [the] relations of production constitute the economic structure (Struktur) of society, the real foundation, on which rises a legal and political superstructure (Überbau).’ It was a characteristic of Herbert Spencer’s thought to dwell on the analogy between organism and society, even if he used it mainly ‘as a scaffolding’: from the ‘Prospectus of a System of Philosophy’ (1858) on, he worked with a mental scheme associating structure and function, linked respectively to anatomical phenomena and physiological processes. These two original ideas have deeply influenced the development of the field, and it should be noted that from the beginning social structure has been conceptualized in quite different ways. Perhaps it may be added that the Marxian conception in terms of a ‘real foundation’ already implies the assumption of a sort of ‘deep structure.’ The pronounced growth of (social) anthropology and sociology made social structure a key concept of both sciences, but there never was an uncontested agreement about its definition; on the contrary, the term has been used in so many different senses that it would be quite difficult to draw up a complete list of them. Therefore, Merton was quite right in beginning his ‘fourteen stipulations for structural analysis’ by remarking ‘that the … notion of “social structure” is polyphyletic and polymorphous’ (Merton 1975).
We shall therefore limit our analysis to some landmarks in the history of the concept and in the more recent developments. Rather than trying at first to give a general definition, we shall emphasize the main dimensions which are, so to speak, the substratum of the most elaborate conceptions of social structure. Our last section will be devoted on the one hand to the formulation of some theoretical principles which nowadays are (or can be) widely accepted and, on the other hand, to the major open questions scholars in the field are still faced with.
1. Some Landmarks in the History of the Concept and in Contemporary Advances

In this area of research as in many others, Emile Durkheim’s work is a significant link between the
pioneers and later developments. The Division of Labour in Society (Durkheim 1893) can be read as based on a structural analysis. In this respect the most typical argument is to be found in Book II, in which Durkheim brings out the causes of division of labor in terms of increase of the population in the society, of its ‘material density’ and, mainly, of its ‘dynamic or moral density.’ However, it is not to be forgotten that in Book I he lays great stress on the function of a developing division of labor, which is to generate (‘organic’) solidarity. Thus, Durkheim gave an impulse to the morphological view of social structure, emphasizing the material and spatial distribution of population, while paying attention to the moral dimension attuned to the complexity of social relations in heterogeneous societies. In keeping with some ideas derived from Durkheim (but not his attempt at synthesis), Radcliffe-Brown maintained the organic analogy, with the crucial distinction between structure and function and the attention given to the description of the interdependence of the component parts in human society. As he put it, ‘in the study of social structure the concrete reality with which we are concerned is the set of actually existing relations … which link together … human beings’ (Radcliffe-Brown 1940). What is required by ‘scientific purposes is an account of the form of the structure, … abstracted from the variations of particular instances.’ The British structural school of anthropology was imbued with Radcliffe-Brown’s views. One book nevertheless stands out as a theoretical advance, namely Nadel’s The Theory of Social Structure (Nadel 1957), in which he works out a conception of social structure in terms both of an elaborate idea of network and of a theory of roles. In his attempt at building up such a theory, Nadel was bound to discuss some parts of Parsons’ work, in which role is conceived as a major structural category.
In Parsons’ words, ‘the structure of social systems … consists in institutionalized patterns of normative culture’ (Parsons 1961). But norms vary according to the positions of the individual actors and also to the type of activity. On the one hand, they define roles, with the corresponding rules of behavior; on the other, they constitute institutions. Social structure is thus to be studied in its essentially regulative properties, and this approach strongly contrasts with the conception emphasizing the distributive dimension of social structure; we hit here upon an unmistakable dividing line. There is another big divide which is worth noticing from an epistemological point of view, even if it conspicuously took place in the field of anthropology, bringing about dispute between different schools. It would be pointless in this article to outline an overview of such a composite movement as structuralism. However, for the social sciences it was significant in two main respects: first, it meant the parting of the ways for sociology and anthropology, in so far as cultural anthropologists were requested to follow
the lead of structural linguistics; and secondly, it claimed a novel conception of structure. Lévi-Strauss specifically states that ‘the term “social structure” has nothing to do with empirical reality but with models which are built up after it’ and consequently asserts that ‘social structure can, by no means, be reduced to the ensemble of the social relations to be described in a given society’ (Lévi-Strauss 1953). Therefore Lévi-Strauss’ followers are very keen on contrasting this ‘transformational structuralism,’ emphasizing the transformability of a single relational set, with the still prevailing empirical views of social structure, grounded in the positivistic notion of ‘correspondence rules.’ It may perhaps be added that the rules of transformation connecting specific arrangements of the assumed underlying structure do not prevent it from being arbitrary; the argument still needs some kind of empirical substantiation. Accordingly, Lévi-Strauss is most convincing when he can draw upon a whole research tradition. His kinship models cannot be cut off from the large and momentous set of earlier studies in the field. The term ‘model’ has often been used loosely in the social sciences; but, with regard to social structure, it generally has formal connotations. The third dividing line we want to emphasize is mainly methodological and deals with the relevant approach to social structure: must it be formal, as Peter Blau (1977) would maintain, or not? Here we have to go back to a classical ‘master of sociological thought,’ namely Georg Simmel, to find the father of the formal approach. He dwelt on ‘The Number of Members as Determining the Sociological Form of the Group,’ translated by Albion Small as early as 1902–3, stressed the importance of triads (vs. dyads) and analyzed the main conditions of group persistence, one of the most essential of which is organization.
Peter Blau resumed this line of thought and enlarged it in a decidedly macrosociological direction. In his words, ‘a social structure can be defined as a multidimensional space of different social positions among which a population is distributed.’ His primary interest is ‘in the two generic forms [social] differentiation assumes,’ that is, inequality and heterogeneity, in ‘the degree to which [they] intersect’ and in their effect upon the integration of various segments of society; and he endeavors ‘to construct a deductive theory of social structure,’ by stating theorems deduced from primitive propositions. We lack space to enumerate all of them, but at least two are worth noticing: the more heterogeneous the social structure, the more likely intergroup relations; and the less correlated social differences are, the more likely intergroup relations. These theoretical statements were later put to an empirical test by Blau and Schwartz (1984). Simmel likewise is held up as a pioneer by scholars belonging to a somewhat different and developing research subtradition: social network analysis. Its approach is positively structural, but its substantive
focus is generally microsociological; therefore in most cases it does not convey an overall view of social structure. Yet it can contribute a lot, by clarifying specific levels of structure and carrying on with other intellectual tools Merton’s study of status- and role-sets and Homans’ work on ‘internal systems.’ All the more so as some synthesis has been brought about by Fararo and Skvoretz (1987) between social network analysis and the macrostructural approach, represented respectively by Granovetter’s ‘strength-of-weak-ties’ theory and Blau’s formal deductive one. Harrison White’s work has also been wholly dedicated to the investigation of structural problems. His last book, Identity and Control (White 1992), opens up new vistas, even if, as a commentator put it, it often requires as much attention as Finnegans Wake.
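The first of Blau's theorems mentioned above (the more heterogeneous the social structure, the more likely intergroup relations) lends itself to a small numerical illustration. The article gives no formula, so the following is a sketch under an assumption: heterogeneity is measured by the index usually associated with Blau (1977), one minus the sum of squared group proportions, which equals the chance that two persons drawn independently at random belong to different groups. The function name and the sample populations are hypothetical.

```python
from collections import Counter

def heterogeneity(memberships):
    """Assumed index: 1 - sum of squared group shares.

    Equals the probability that two members drawn independently
    (with replacement) belong to different groups.
    """
    counts = Counter(memberships)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

# Hypothetical populations classified by a single nominal parameter.
one_group = ["x"] * 10
two_groups = ["x"] * 5 + ["y"] * 5
four_groups = ["w", "x", "y", "z"] * 3

for population in (one_group, two_groups, four_groups):
    print(heterogeneity(population))
```

Under purely random contact, the index is itself the expected share of intergroup encounters, which is why greater heterogeneity makes intergroup relations more likely on this reading. Actual association is of course not random, which is what the empirical test by Blau and Schwartz addresses.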
2. Guiding Principles and Open Questions for Sociological Research

First of all, to avoid any misunderstanding, we do not ingenuously intend to prescribe relevant rules for research. Our only aim is to state theoretical propositions about which there is (or could be) general agreement. Some of them have been explicitly formulated by scholars. Others have not, and on those points our statements will accordingly be more tentative. We shall succinctly comment on all of them. To begin with, a first major principle has been tersely conveyed by Nadel: ‘It seems impossible to speak of social structure in the singular.’ No matter whether structure is conceived as a network of roles (Nadel) or whether it is conceptualized in terms of a multiplicity of social positions (Blau), the statement holds true. The ordering is always partial and relative both to the specific property being analyzed and to the systemic frame of reference chosen to describe it. A second central principle has been laid down by Thomas Fararo (1989): ‘Structure is not an ultimate given but a conditional attainment.’ Persistence (which is not to be mistaken for permanence) and endurance of social structures are problematic. In any case their stability has to be accounted for; the idea of equilibrium may be used as a heuristic device, provided attention is paid to moving equilibria as well as to static ones. Structures are maintained, when and if they are, by underlying processes. These two principles are basic, but some others are easily derived from them and there still are some points which are worth noticing. Although it sounds like a cliché, it is to be remembered that structural characteristics are always constituted by the social activities of individuals, however contrary to their purposes the outcomes may be. The idea to be stressed here is that structures
are generated by men’s actions and interactions. This entails a decisive consequence for sociological research: structural analysis cannot be divorced from action theory. In other words, when we are directing our efforts to problems of social structure, we had better, as Fararo put it, ‘start with … principles and models of action in social situations’ (Fararo 1989). Social structure is no longer conceived as the determining factor of individuals’ behavior; that view has been exploded. The Mertonian idea of a ‘core process of choice between socially structured alternatives,’ in Stinchcombe’s words (Stinchcombe 1975), is more to the point. To go further in the matter, we could say that social structure is indeed constraining but it is also enabling. The notion of ‘structural effects’ is not questionable at all, although it would be better to conceptualize the corresponding phenomena in terms of emerging structural properties. But the idea of structural causality is at least dubious. There has been an impressive demise of what Raymond Boudon aptly called ‘magic structuralism’ (1968), even if it never was as widespread in the social sciences as in cultural and literary studies. It certainly is a moot point whether and how social structure and culture are linked. But whatever importance is to be attached to cultural dimensions, social systems have to be, on an analytical level, clearly distinguished from cultural systems: Parsons has here given the lead. We can thus point out a major flaw in symbolic or transformational structuralism: it fails to specify ‘the ways through which’ the symbolic rules, the so-called universal rules of the human mind, ‘impinge on the actual workings of societies’ and ‘to identify the social mechanisms through which … the principles of deeper structure are activated within institutional frameworks’ (Eisenstadt in Blau and Merton 1981).
In the first part of this article, we dwelt on two contrasting epistemological positions: on one side, there are scholars linking social structure to empirical reality; on the other, scholars linking it to models. Now, this dividing line can be tackled from another point of view and perhaps understood in a less radical way. As Nadel and Blau both make clear, the main question is whether scholars intend social structure for description or explanation. These two conceptions obviously are quite different; still, one cannot describe without some process of selection, i.e., abstraction: description is always the first step in explanation. In this respect, Mario Bunge’s distinction between a model object, which defines a representation principle for a class of empirical phenomena, and a theoretical model, constructed with the intention of explaining these phenomena, is perhaps not irrelevant. The stress put on the selection and abstraction process may be useful for guarding against too naturalistic a view of social structure. Structural properties are what we are interested in, not an ‘entity’
called social structure. In this regard, Giddens’ statement, ‘structure only exists as “structural properties”’ (Giddens 1979), makes good sense. But this conception, implying a principle of ‘order,’ a ‘real virtuality,’ in Gudmund Hernes’ words (in Hedström and Swedberg 1998), is still compatible with a realist epistemology. A quite different point is worth clearing up: we began by quoting Marx but later on we did not deal with class structure, and the reader may wonder why. It is indeed a significant field of study and it would deserve a paper by itself. But class structure is no longer to be understood as the rock-bottom foundation of society: continental-European ‘structural Marxism’ is now obsolete, as the whole of our theoretical and epistemological argument, we hope, makes clear. Finally, we shall borrow from Merton his last and wise stipulation: ‘like any other theoretical orientation in sociology, structural analysis can lay no claim to being capable of accounting exhaustively for social and cultural phenomena’ (Merton 1975). Such are the guiding lines along which scholars’ work can proceed; but there are still major questions on the research agenda. Two of them especially are, in our opinion, worth noticing. The first one requires a kind of theoretical synthesis. We have previously in this paper distinguished two main dimensions of social structure: the distributive and the regulative. Yet they are strongly connected and we need an approach accounting for both of them. The Durkheimian lead seems to have been forgotten and it would theoretically be a strategic move to follow it again: it is somewhat ironic that Durkheim could be read both ways (and brilliantly so, by Parsons and by Blau). As regards the second question, scholarly investigation must fully take account of the conception of social structure as a multilevel phenomenon.
Therefore due attention must be paid to the micro-macro link and to the social mechanisms and processes through which it is operating; and the on-going interaction of the micro- and macro-levels would need a more general model (from macro to macro through micro) and the identification, at each step, of the appropriate types of mechanisms. Of course social scientists will address problems of social structure with diverse theoretical views and differing ‘domain assumptions.’ But today all of us know that theoretical pluralism is not a sign of the immaturity of sociology as a science.

See also: Biology’s Influence on Sociology: Human Sociobiology; Class: Social; Functionalism in Sociology; Institutions; Integration: Social; Labor, Division of; Macrosociology–Microsociology; Marx, Karl (1818–89); Networks: Social; Parsons, Talcott (1902–79); Social Change: Types; Social Constructivism; Social Geography; Sociology, History of; Spencer, Herbert (1820–1903); Status and Role: Structural Aspects; Structuralism, Theories of; System: Social; Theory: Sociological
Bibliography
Blau P M 1977 Inequality and Heterogeneity: A Primitive Theory of Social Structure. Free Press, New York
Blau P M, Merton R K (eds.) 1981 Continuities in Structural Inquiry. Sage Publications, London and Beverly Hills
Blau P M, Schwartz J 1984 Cross-cutting Social Circles. Academic Press, Orlando, CA
Boudon R 1998 A quoi sert la notion de “structure”? Gallimard, Paris. [trans. Vaughan M 1971 The Uses of Structuralism. Heinemann, London]
Durkheim E 1893 De la division du travail social. Alcan, Paris. [trans. Halls W D 1984 The Division of Labor in Society, Free Press, New York]
Fararo T J 1989 The Meaning of General Sociological Theory. Cambridge University Press, Cambridge, UK
Fararo T, Skvoretz J 1987 Unification research programs: Integrating two structural theories. American Journal of Sociology 92: 1183–209
Giddens A 1979 Central Problems of Social Theory: Action, Structure and Contradiction in Social Analysis. Macmillan, London
Hedström P, Swedberg R (eds.) 1998 Social Mechanisms: An Analytical Approach to Social Theory. Cambridge University Press, Cambridge, UK
Lévi-Strauss C 1949 Les structures élémentaires de la parenté, 1st edn. Gallimard, Paris [trans. Bell J H, von Sturmer J R, Needham R (ed.) 1969 The Elementary Structures of Kinship. Beacon Press, Boston]
Lévi-Strauss C 1953 Social structure. In: Kroeber A L (ed.) Anthropology Today: An Encyclopedic Inventory. The University of Chicago Press, Chicago
Marx K 1913 A Contribution to Critique of Political Economy. Kerr, Chicago
Merton R K 1975 Structural analysis in sociology. In: Blau P M (ed.) Approaches to the Study of Social Structure. Free Press, New York
Nadel S F 1957 The Theory of Social Structure. Cohen and West, London
Murdock G P 1949 Social Structure. Macmillan, New York
Parsons T 1961 An outline of the social system. In: Parsons T, Shils E, Naegele K D, Pitts J R (eds.) Theories of Society: Foundations of Modern Sociological Theory. Free Press of Glencoe, New York, pp. 30–79
Radcliffe-Brown A R 1940 Social structure. In: Evans-Pritchard E E, Eggan F (eds.) Structure and Function in Primitive Society. Cohen & West, London, pp. 188–204
Spencer H 1858 Prospectus of a system of philosophy. In: Rumney J (eds.) 1966 Herbert Spencer’s Sociology: A Study in the History of Social Theory. Atherton, New York, pp. 297–303
Simmel G 1902–3 The number of members as determining the sociological form of the group. American Journal of Sociology 8: 1–46, 158–96
Simmel G 1908 Soziologie: Untersuchungen über die Formen der Vergesellschaftung. Duncker & Humblot, Leipzig, Germany [trans. Wolff K H, Bendix R 1955 Conflict and the Web of Group Affiliations. Free Press, Glencoe, IL]
Spencer H 1876–96 Principles of Sociology. Appleton, London
Stinchcombe A L 1975 Merton’s theory of social structure. In: Coser L (ed.) The Idea of Social Structure: Papers in Honor of Robert K. Merton. Harcourt, Brace, Jovanovich, New York
White H C 1992 Identity and Control: A Structural Theory of Social Action. Princeton University Press, Princeton, NJ
F. Chazel
Subaltern History 1. Gramsci in India In the form in which it is currently known, subaltern historiography is derived from the writings of a group of historians of modern South Asia whose work first appeared in 1982 in a series entitled Subaltern Studies (Amin and Chakrabarty 1996, Arnold and Hardiman 1994, Bhadra et al. 1999, Chatterjee and Jeganathan 2000, Chatterjee and Pandey 1992, Guha 1982–9). The term ‘subaltern’ was borrowed by these historians from Antonio Gramsci, the Italian Marxist, whose writings in prison in the period 1929–35 had sketched a methodological outline for a ‘history of the subaltern classes’ (Gramsci 1971). In these writings, Gramsci used the word ‘subaltern’ (subalterno in the Italian) in at least two senses. In one, he used it as a code for the industrial proletariat. But against the thrust of orthodox Marxist thinking, he emphasized that in its rise to power, the bourgeoisie did not simply impose a domination through the coercive apparatus of the state but transformed the cultural and ideological institutions of civil society to construct a hegemony over society as a whole, even eliciting in the process the acquiescence of the subaltern classes. In Gramsci’s analysis of capitalist society, the central place is occupied by questions such as the relation of state and civil society, the connections between the nation, the people, the bourgeoisie and other ruling classes, the role of intellectuals in creating the social hegemony of the bourgeoisie, strategies for building a counterhegemonic alliance, etc. In the second sense, Gramsci talked of the subaltern classes in precapitalist social formations. Here he was referring to the more general relationship of domination and subordination in class-divided societies. But specifically in the context of southern Italy, he wrote about the subordination of the peasantry. 
Gramsci was very critical of the negative and dismissive attitude of European Marxists towards the culture, beliefs, practices and political potential of the peasantry. Positioning himself against this attitude, he wrote of the distinct characteristics of the religious beliefs and practices, the language and cultural products, the everyday lives and struggles of peasants, and of the need for revolutionary intellectuals to study and understand them. But he also highlighted the limits of peasant consciousness which, in comparison with the comprehensiveness, originality and active historical dynamism of the ruling classes, was fragmented, passive and dependent. Even at moments of resistance, peasant consciousness remained enveloped by the dominant ideologies of the ruling classes. These discussions by Gramsci were turned to productive use by South Asian historians writing in the 1980s.
2. Against Elitist Historiography Ranajit Guha, who edited the first six volumes of Subaltern Studies, wrote in the introductory statement of the project: ‘The historiography of Indian nationalism has for a long time been dominated by elitism— colonialist elitism and bourgeois-nationalist elitism’ (Guha 1982). The objective of subaltern historiography was to oppose the two elitisms. The field of modern South Asian history was dominated in the 1970s by a debate between a group of historians principally located in Cambridge, UK, and another based mainly in Delhi, India. The former argued that Indian nationalism was a bid for power by a handful of Indian elites who used the traditional bonds of caste and communal ties to mobilize the masses against British rule. The latter spoke of how the material conditions of colonial exploitation created the ground for an alliance of the different classes in Indian society and how a nationalist leadership inspired and organized the masses to join the struggle for national freedom. Guha argued that both these views were elitist—the former representing a colonial elitism and the latter a nationalist elitism. Both assumed that nationalism was wholly a product of elite action. Neither history had any place for the independent political actions of the subaltern classes. In setting their agenda against the two elitisms, the historians of Subaltern Studies focused on two main issues. One was the difference between the political objectives and methods of colonial and nationalist elites on the one hand and those of the subaltern classes on the other. The second was the autonomy of subaltern consciousness. Pursuing the first question, the historians of Subaltern Studies showed that the claim of colonialist historians that the Indian masses had been, so to speak, duped into joining the anticolonial movement by Indian elites using primordial ties of kinship and clientelism was false. 
They also showed that it was untrue to say, as nationalist historians did, that the political consciousness of the subaltern classes was only awakened by the ideals and inspiration provided by nationalist leaders. It was indeed a fact that the subaltern classes had often entered the arena of nationalist politics. But it was also a fact that in many instances they had refused to join, despite the efforts of nationalist leaders, or had withdrawn after they had joined. In every case, the goals, strategies and methods of subaltern politics were different from those of the elites. In other words, even within the domain of nationalist politics, the nationalism of the elites was different from the nationalism of the subaltern classes. The second question followed from the first. If subaltern politics was indeed different from elite politics, what was the source of its autonomy? What were the principles of that politics? The answer that was suggested was that subaltern politics was shaped by the distinct structure of subaltern consciousness. That consciousness had evolved out of the experiences of subordination, out of the struggle, despite the daily routine of servitude, exploitation and deprivation, to preserve the collective identity of subaltern groups. Where was one to look for the evidence of this autonomous consciousness? It could not be found in the bulk of the archival material that historians conventionally use, because that material had been prepared and preserved by and for the dominant groups. For the most part, those documents only show the subaltern as subservient. It is only at moments of rebellion that the subaltern appears as the bearer of an independent personality. When the subaltern rebels, the masters realize that the servant too has a consciousness, has interests and objectives, methods and organization. If one had to look for evidence of an autonomous subaltern consciousness in the historical archives, then it would be found in the documents of revolt and counterinsurgency. The first phase of the work of Subaltern Studies was dominated by the theme of peasant revolt. Ranajit Guha’s Elementary Aspects of Peasant Insurgency in Colonial India (1983) is the key text in this area.
But most other scholars associated with the project wrote on the history of peasant revolt from different regions and periods of South Asian history. They were able to discover a few sources where the subaltern could be heard telling his or her own story. But it was always clear that there would be few such sources. What became far more productive were new strategies of reading the conventional documents on peasant revolts. The historians of subaltern politics produced several examples of ways in which reports of peasant rebellion prepared by official functionaries could be read from the opposite standpoint of the rebel peasant and thus used to shed light on the consciousness of the rebel. They also showed that when elite historians, even those with progressive views and sympathetic to the cause of the rebels, sought to ignore or rationally explain away what appeared as mythical, illusory, millenarian or utopian in rebel actions, they were actually missing the most powerful and significant elements of subaltern consciousness. The consequence, often unintended, of this historiographical practice was to somehow fit the unruly facts of subaltern politics into the rationalist grid of elite consciousness and to make them understandable in terms of the latter. The autonomous history of the subaltern classes, or to put it differently, the distinctive traces of subaltern action in history, were completely lost in this historiography.
3. The Imbrication of Elite and Subaltern Politics The analysis of peasant insurgency in colonial India and of subaltern participation in nationalist politics by the historians of Subaltern Studies amounted to a strong critique of bourgeois–nationalist politics and of the postcolonial state. Writing about peasant revolts in British India, Ranajit Guha (1983) and Gautam Bhadra (1994) showed how this powerful strand of anticolonial politics, launched independently of bourgeois-nationalist leaders, had been denied its place in established historiography. Gyanendra Pandey (1984, 1990), David Hardiman (1984), Sumit Sarkar (1984) and Shahid Amin (1995) wrote about the two domains of elite and subaltern politics as they came together in the nationalist movement led by the Congress. Dipesh Chakrabarty (1989) wrote about a similar split between elite and subaltern politics in the world of the urban working class. Partha Chatterjee (1986, 1993) traced the development of nationalist thought in India in terms of the separation of elite and subaltern politics and the attempts by the former to appropriate the latter. The postcolonial nation-state had, it was argued, included the subaltern classes within the imagined space of the nation but had distanced them from the actual political space of the state. Although the political emphasis was not the same in each writer, there was nevertheless a strong flavor in Subaltern Studies of the Maoist politics that had hit many parts of India in the 1970s. Many critics thought they could detect in it a romantic nostalgia for a peasant armed struggle that never quite took place. Others alleged that by denying the unifying force of the nationalist movement and stressing the autonomous role of the subaltern classes, the historians of Subaltern Studies were legitimizing a divisive and possibly subversive politics. 
Another connection that was often made in the early phase of Subaltern Studies was with the ‘history from below’ approach popularized by British Marxist historians. Clearly, the work of Christopher Hill, E. P. Thompson, Eric Hobsbawm or the History Workshop writers was eagerly mined by subalternist historians for methodological clues for doing popular history. But there was a crucial difference because of which subaltern history could never be a ‘history from below.’ The forgotten histories of the people, pulled out from under the edifice of modern capitalist civilization in the West, had undoubtedly made the story of Western modernity more detailed and complete. But there did not seem to be any narrative of ‘history from below’ that could persuasively challenge the existence, stability or indeed the historical legitimacy of capitalist modernity itself. Not surprisingly, ‘history from below’ was invariably written as tragedy. But in countries such as India, ‘history from below’ could not be confined within any such given narrative limits. The subalternist historians refused to subscribe to the historicist orthodoxy that what had happened in the West, or in other parts of the world, had to be repeated in India. They rejected the framework of modernization as the necessary emplotment of history in the formerly colonial countries. They were skeptical about the established orthodoxies of both liberal-nationalist and Marxist historiographies. As a result, they resisted in their writings the tendency to construct the story of modernity in India as the actualization of modernity as imagined by the great theorists of the Western world. This resistance, apparent even in the early phase of Subaltern Studies, was to be expressed later in arguments about ‘other modernities.’
4. Reorientations One line of argument of the early Subaltern Studies project that ran into serious problems concerned the existence of an autonomous subaltern consciousness. Much of the study of the insurgent peasant was a search for a characteristic structure of peasant consciousness, shaped by its experience of subordination but struggling ceaselessly to retain its autonomy. One problematic question that arose was about the historicity of this structure. If subaltern consciousness was formed within specific historical relations of domination and subordination, then could that consciousness change? If so, why could it not be said that the Indian peasantry was transformed, modernized, turned into citizens by its experience of nationalist politics? Why the resistance to a progressive narrative of history? Or was there another narrative of a changing subaltern consciousness? It was a classic structuralist impasse to which there was no easy historical answer. A related problem was about the notion of the subaltern as an active historical agent. Research into subaltern history had shown that the subaltern was both outside and inside the domains of colonial governance and nationalist politics. To the extent that it was outside, it had retained its autonomy. But it had also entered those domains, participated in their processes and institutions and thereby transformed itself. Every bit of historical evidence was pointing to the fact that the subaltern was ‘a deviation from the ideal’. Why then the search for a ‘pure structure’ of subaltern consciousness? Moreover, argued Gayatri Spivak (1987, 1988) in two influential articles, subaltern history had successfully shown that the ‘man’ or ‘citizen’ who was the sovereign subject of bourgeois
history writing was in truth only the elite. Why was it necessary now to clothe the subaltern in the costume of the sovereign subject and put him on stage as the maker of history? Subaltern historiography had in fact challenged the very idea that there had to be a sovereign subject of history possessing an integral consciousness. Why bring back the same idea into subaltern history? It was only a myth that the subaltern could directly speak through the writings of the historian. In fact, the historian was only representing the subaltern on the pages of history. The subaltern, announced Spivak, cannot speak. The new turn in Subaltern Studies began more or less from the fifth and sixth volumes published in 1987–89. It was now acknowledged with much greater seriousness than before that subaltern histories were fragmentary, disconnected, incomplete, that subaltern consciousness was split within itself, that it was constituted by elements drawn from the experiences of both dominant and subordinate classes. Alongside the evidence of autonomy displayed by subalterns at moments of rebellion, the forms of subaltern consciousness undergoing the everyday experience of subordination now became the subject of inquiry. Once these questions entered the agenda, subaltern history could no longer be restricted to the study of peasant revolts. Now the question was not ‘What is the true form of the subaltern?’ The question had become ‘How is the subaltern represented?’ ‘Represent’ here meant both ‘present again’ and ‘stand in place of’. Both the subjects and the methods of research underwent a change. One direction in which the new research proceeded was the critical analysis of texts. Once the question of the ‘representation of the subaltern’ came to the fore, the entire field of the spread of the modern knowledges in colonial India was opened up for subaltern history. 
Much studied subjects such as the expansion of colonial governance, English education, movements of religious and social reform, the rise of nationalism—all of these were opened to new lines of questioning by the historians of Subaltern Studies. The other direction of research concentrated on the modern state and public institutions through which modern ideas of rationality and science and the modern regime of power were disseminated in colonial and postcolonial India. In other words, institutions such as schools and universities, newspapers and publishing houses, hospitals, doctors, medical systems, censuses, registration bureaux, the industrial labor process, scientific institutions—all of these became subjects of subaltern history-writing.
5. Other Modernities A major argument that has been developed in the more recent phase of writings in and around Subaltern Studies is that of alternative or hybrid modernities. The focus here is on the dissemination of the ideas, practices and institutions of Western modernity under colonial conditions. The framework of modernization theory invariably produces the history of modernity in non-Western colonial countries as a narrative of lag or catching up. As Dipesh Chakrabarty has put it, these societies seem to have been consigned for ever to ‘the waiting room of history’. The universality of Western modernity erases the fact that like all histories, it too is a product of its local conditions. When transported to other places and times, it cannot remain unaffected by other local conditions. What happens when the products of Western modernity are domesticated in other places? Do they take on new and different shapes, shapes that do not belong to the original? If they do, are we to treat the changes as corruptions? As deviations from an ideal? Or are they valid as examples of a different modernity? To argue the latter is both to provincialize Europe and to assert the identity of other cultures even as they participate in the presumed universality of modernity. Chakrabarty (2000), Arnold (1993), Prakash (1999), Spivak (1999) and Chatterjee (1995), for example, have explored various aspects of this process of the ‘translation’ of modern knowledges, technologies and institutions. They have tried to show that the encounter between Western forms of modernity and colonized non-Western cultures was not a simple imposition of the one on the other, nor did it lead to corrupt or failed forms of modernity. Rather, it produced different forms of modernity whose marks of difference still remain subject to unresolved contestations of power. Here, the work of South Asian subalternist historians has often overlapped with and contributed to what has become known as postcolonial studies.
The historical study of modern discourses and institutions of power in colonial India has fed into a growing literature on the production of hybrid cultural forms in many different regions of the formerly colonial world. More significantly, postcolonial studies has extended the argument about hybrid cultural forms to the understanding of contemporary cultures in the Western metropolitan countries themselves, instanced most immediately in the diasporic cultures of immigrants but also, less obviously, in the role of the colonial experience in the formation of Western modernity even in its purportedly ‘original’ form. Historical and social science disciplines have tended to merge here with the concerns and methods of the literary and cultural disciplines to break new, and still mostly uncharted, theoretical grounds.
6. Rethinking the Political To return to the initial political questions raised by subaltern history writing: where have they led? The early emphasis on peasant rebellion and consciousness had widened, even in the first phase of the project, to include the resistance of other dominated and marginalized groups in colonial society. Once the idea of a paradigmatic structure of subaltern consciousness became less persuasive, the subject of resistance began to be approached from a variety of angles and without a fixed design either of the reproduction of a traditional structure or of the necessary transition to new structures. Recent subaltern historiography has vigorously participated in three sets of South Asian debates: over religious minorities, caste and gender. These debates have opened up the way to rethinking the political formation of the nation as well as the political process of democracy. In India, the broader political debate over religious minorities has been carried out between two opposed groups: the Hindu chauvinists on the one hand and the secularists on the other. What the researches of subalternist historians have shown is, first, that the debate between secularism and communalism is in no way a struggle between modernity and backwardness. Both the rival political positions are firmly planted in the soil of modern government and politics. Second, the two groups are pursuing two different strategies of consolidating the regime of the modern nation-state. Both strategies are elitist, but they involve two different modes of representation and appropriation of the subaltern. Third, faced with these rival elitist strategies, subaltern groups in India are devising, in their own ways, independent strategies of coping with communal as well as secularist politics. The recent experience of ethnic violence and authoritarian politics in Sri Lanka and Pakistan has raised even more fundamental questions about the adequacy of the nation-form as a field where subaltern politics might be negotiated (Chatterjee and Jeganathan 2000). The second question on which there has been significant recent discussion is caste.
There has been a transformation in the politics of caste in India following the agitations in 1990–1 over the Mandal Commission report. By looking at the politics of caste in its discursive aspect, it has become clear that the supposedly religious basis of caste divisions has now completely disappeared from public debate. The conflicts now are almost exclusively centered on the relative positions of different caste groups in relation to the state. Second, the debate over whether or not to recognize caste as a criterion for affirmative action by the state once again reflects two different elitist strategies of representing and appropriating the subaltern. And subaltern groups too, in their efforts to establish social justice and self-respect, are devising various strategies both of resisting the state and of using the opportunities offered by its electoral and developmental functions. Strategies of alliance between castes at the middle and bottom rungs of the ritual hierarchy with other oppressed groups such as tribals and religious minorities have produced significant electoral successes. But with the creation of new political elites out of subaltern groups, the questions of ‘who represents?’ and ‘to what end?’ are being asked with a new urgency. The third debate concerns the social position of women. In one sense, all women living in patriarchal societies are subalterns. Yet women are also identified by class, race, caste, and community. Hence, just as it is valid to analyse the subordination of women in a society ruled by men, so also is it necessary to identify how the social construction of gender is made more complex by the intervention of class, caste, and communal identities. Recent discussions on this question have focused on the Indian social reform movements of the nineteenth century, especially the various legal reforms to protect the rights of women, in the context of the colonial state and nationalist politics. Subaltern feminist writings have raised questions about the adequacy of a modernizing agenda of legal reform from the top without facing up to the challenge of reforming the actual structures of patriarchal power within the local communities which continue to flourish outside the reach of the law. The recent debates raise new questions about conceptualizing old modernist ideas such as nation, citizenship and democracy. By virtue of these implications, the recent subaltern history writings from South Asia have been productively used in writings on the history of modernity in other parts of the formerly colonized world, for instance on nationalism and gender in the Middle East or on the politics of peasant and indigenous groups in Latin America. Having traveled from Italy to India, the idea of subaltern history has now produced a generally available methodological and stylistic approach to modern historiography that could be used anywhere.
See also: Caste; Colonialism, Anthropology of; Colonization and Colonialism, History of; Gramsci, Antonio (1891–1937); Historiography and Historical Thought: Current Trends; Historiography and Historical Thought: Modern History (Since the Eighteenth Century); Historiography and Historical Thought: South Asia; Minorities; Peasants and Rural Societies in History (Agricultural History); Peasants in Anthropology
Bibliography
Amin S 1995 Event, Metaphor, Memory: Chauri Chaura 1922–1992. Oxford University Press, Delhi, India
Amin S, Chakrabarty D (eds.) 1996 Subaltern Studies IX. Oxford University Press, Delhi, India
Arnold D 1993 Colonizing the Body: State Medicine and Epidemic Disease in Nineteenth-century India. University of California Press, Berkeley, CA
Arnold D, Hardiman D (eds.) 1994 Subaltern Studies VIII. Oxford University Press, Delhi, India
Bhadra G 1994 Iman o nishan: banglar krishak chaitanyer ek adhyay [Faith and the Flag: An Aspect of Peasant Consciousness in Bengal]. Subarnarekha, Calcutta, India
Bhadra G, Prakash G, Tharu S (eds.) 1999 Subaltern Studies X. Oxford University Press, Delhi, India
Chakrabarty D 1989 Rethinking Working-class History: Bengal 1890–1940. Princeton University Press, Princeton, NJ
Chakrabarty D 2000 Provincializing Europe: Postcolonial Thought and Historical Difference. Princeton University Press, Princeton, NJ
Chatterjee P 1986 Nationalist Thought and the Colonial World: A Derivative Discourse? Zed Books, London
Chatterjee P 1993 The Nation and its Fragments: Colonial and Postcolonial Histories. Princeton University Press, Princeton, NJ
Chatterjee P (ed.) 1995 Texts of Power: The Disciplines in Colonial Bengal. University of Minnesota Press, Minneapolis, MN
Chatterjee P, Jeganathan P (eds.) 2000 Subaltern Studies XI. Permanent Black, Delhi, India
Chatterjee P, Pandey G (eds.) 1992 Subaltern Studies VII. Oxford University Press, Delhi, India
Gramsci A 1971 Selections from the Prison Notebooks [trans. Hoare Q, Nowell Smith G]. International Publishers, New York
Guha R (ed.) 1982–9 Subaltern Studies I–VI. Oxford University Press, Delhi, India
Guha R 1982 On the historiography of Indian nationalism. In: Guha R (ed.) Subaltern Studies I. Oxford University Press, Delhi, India, pp. 1–8
Guha R 1983 Elementary Aspects of Peasant Insurgency in Colonial India. Oxford University Press, Delhi, India
Hardiman D 1984 Peasant Nationalists of Gujarat. Oxford University Press, Delhi, India
Pandey G 1984 The Ascendancy of the Congress in Uttar Pradesh, 1926–1934. Oxford University Press, Delhi, India
Pandey G 1990 The Construction of Communalism in Colonial North India. Oxford University Press, Delhi, India
Prakash G 1999 Another Reason. Princeton University Press, Princeton, NJ
Sarkar S 1984 Popular Movements and Middle-class Leadership in Late Colonial India. K P Bagchi, Calcutta, India
Spivak G C 1987 Subaltern studies: Deconstructing historiography. In: Guha R (ed.) Subaltern Studies IV. Oxford University Press, Delhi, India, pp. 338–63
Spivak G C 1988 Can the subaltern speak? In: Nelson C, Grossberg L (eds.) Marxism and the Interpretation of Culture. University of Illinois Press, Urbana, IL
Spivak G C 1999 A Critique of Postcolonial Reason. Harvard University Press, Cambridge, MA
P. Chatterjee
Subaltern Studies: Cultural Concerns Subaltern Studies is an Indian school of historiography whose inspiration lay in the Maoist movement of the 1970s, and whose raison d’être has been the critique of the perceived elitist bias of Indian nationalist discourse in history writing. Since 1983, when the first volume appeared, Subaltern Studies has produced 10 volumes of collected research articles, which comprise the main corpus. After the appearance of the sixth volume, a collective has managed editorial work. Individual members of the collective have also written texts that exemplify the ‘subaltern’ viewpoint.
1. The Origins The school was founded by Ranajit Guha, a Marxist intellectual from Bengal. Once a member of the Communist Party of India, Guha was influenced by the radicalism of the Chinese Cultural Revolution in the late 1960s to align himself with Indian Maoism, which characterized independent India as a semifeudal and semicolonial state. Arguably, the project’s impetus derived from an effort to establish the truth of this proposition: ‘the price of blindness about the structure of the colonial regime as a dominance without hegemony has been, for us, a total want of insight into the character of the successor regime too as a dominance without hegemony’ (Guha 1989, p. 307). However, Subaltern Studies has changed a great deal since then. Ranajit Guha is acknowledged by the collective to be its intellectual driving force and edited the first six volumes. He is also the author of Elementary Aspects of Peasant Insurgency in Colonial India (Guha 1983), recognized as a seminal work in the genre of radical Indian historiography. Subaltern Studies began with an attempt to apply the approach known as ‘history from below’ in the Indian context. The term subaltern was inspired by the Italian Marxist Antonio Gramsci, and is used to indicate powerlessness in a society wherein class differentiation, urbanization, and industrialization had proceeded very slowly. Gramsci is the source of the subalternist respect for ‘culture’ and the ‘fragment’ in history writing. Subaltern Studies was also influenced by the critical Marxism of the English historian E. P. Thompson, who attempted to move beyond economistic definitions of class interest, as well as by Foucauldian critiques of power/knowledge systems. Its (earlier) Maoist orientation and polemical concern with the nationalism of Indian colonial elites made for a slant toward the investigation of peasant rebellion under colonial rule as well as peasant recalcitrance vis-à-vis Gandhian nationalism and the Indian National Congress.
2. The Significance of ‘Culture’
Initially overtly political in its stance, Subaltern Studies was popular among young Indian historians and scholars of modern India abroad as a radical alternative to an uncritical academic celebration of Independence. The Indian national movement was seen as a failed hegemonic project (Sen 1987, pp. 203–35, Guha 1983, 1989, pp. 210–309). Following Guha’s investigation of elementary forms of insurgent peasant consciousness, the school attracted the hostility of the Indian Marxist establishment for being ‘idealist’ (see Chakrabarty 1986, p. 364), a sign that it had departed from economic reductionism. In a society where cultural symbols play an important role in everyday life as well as in political mobilization, this was a fruitful departure, necessary for comprehending phenomena such as charismatic leadership and communal conflict (see, e.g., Amin 1984, pp. 1–55, 1995, Pandey 1982, pp. 143–97, 1983, pp. 60–129, 1984, pp. 231–70, 1990, Hardiman 1982, pp. 198–231, 1984, pp. 196–230, 1987). It has since expanded ‘beyond the discipline of history’ to engage ‘with more contemporary problems and theoretical formations’ (Subaltern Studies 1998, Vol. 9, Preface). These include the politics of identity and literary deconstruction. The shift in the 1990s was marked by the interest shown in the project by literature scholars and critics such as Edward Said and Gayatri Chakravorty Spivak. The analysis of discourse has since become a major preoccupation for subalternist research. It would be wrong to adduce a uniformity of vision in the output of Subaltern Studies over the years. The explicit, if unorthodox, Marxian bent of its ‘history from below’ phase has been superseded by a recent preoccupation with community, rural innocence, and cultural authenticity. The epistemological approval of the Leninist ‘outsider’ as the bearer of ‘higher’ revolutionary consciousness (Chaudhury 1987, pp. 236–51) sits in unresolved tension with the oft-expressed critique of elitism and statism in historiography (Pandey 1994, pp. 188–221), and the belief in the immanence of culturally mediated forms of universal community (Chatterjee 1989, pp. 169–209).
A thematic that persists, however, is the opposition of two ‘domains’—that of the elite, meaning the colonial state and its allies, and their forms of politics, knowledge, and power, vs. the subaltern. The latter has been variously interpreted as the peasantry, community, locality, and traditional domesticity and distinguished by its resistance to colonization. The difficulty caused by the problem of mass complicity has been dealt with by valorizing the recalcitrance of ‘fragments.’ The subalternist stance of giving voice to the repressed elements of South Asian history has engendered valuable research. A prominent example is Shahid Amin’s meticulous and thought-provoking investigation of the prolonged aftermath of the (in)famous Chauri Chaura riot of 1922, which resulted in the death of 22 policemen, the suspension of the first noncooperation movement, and the subsequent punishment by hanging of 19 accused rioters (Amin 1987, pp. 166–202, 1995). This unravelling of ‘an event which all Indians, when commemorating the nation, are obliged to remember—only in order to forget,’ (Amin 1995, p. 1) relentlessly juxtaposes event to
nationalist metaphor and existential reality to ideological representation. It will remain an outstanding text in the subalternist corpus. Pandey’s intricate account of cow-protection movements in eastern India in the late nineteenth century exposes the interplay of symbolism, class interest, and public space. This path-breaking essay in the prehistory of communal politics (Pandey 1983, pp. 60–129), along with his writings on the ‘construction’ of communalism in colonial India (Pandey 1989, pp. 132–68, 1990) has contributed significantly to a raging historical debate. Guha’s (1987, pp. 135–65) ‘Chandra’s Death’ skilfully uses a legal narrative from mid-nineteenth century Bengal to analyze the workings of patriarchal culture and indigenous justice with great sensitivity to the existential predicament of ‘low-caste’ women. In a brilliant passage, Guha qualifies a description of conventional systems of asylum thus:
this other dominance did not rely on the ideology of Brahmanical Hinduism or the caste system for its articulation. It knew how to bend the relatively liberal ideas of Vaishnavism and its loose institutional structure for its own ends, demonstrating thereby that for each element in a religion which responds to the sigh of the oppressed there is another to act as an opiate. (Guha 1987, p. 159)
3. Theoretical Tensions
Subaltern Studies’ concern with issues of ideological hegemony elicits a questioning of the school’s own theoretical tensions. The juxtaposition of statist versus subalternist history, or the tyrannical march of Western-inspired universals versus the resistance/occlusion of heroic subalterns, expresses a view of a society divided into discrete social zones, with a concomitant oversight regarding the osmosis between these ‘domains.’ The idea that contemporary history encompasses a grand struggle between the narratives of Capital and Community, that the latter is the truly subversive element in modern society (‘community, which ideally should have been banished from the kingdom of capital, continues to lead a subterranean, potentially subversive life within it because it refuses to go away’; P. Chatterjee 1997, p. 236), raises the question of why ‘class’ has been demoted from the estate of subalternity, even though it too refuses to go away. In an era wherein the assertion of community is rapidly transiting from the realm of peasant insurgency to that of mass-produced identity, might not ‘community’ actually function as the necessary metaphysic of Capital rather than its unassimilable Other?
The tensions extend beyond theory to discursive choice and, indeed, silence. The political success of the movement for Pakistan, for example (the transformation, in this case, of ‘communalism’ into ‘nationalism’), has not been investigated despite Guha’s pointers (Guha 1989, pp. 210–309, 1992, pp. 69–120), and despite the urgent need for reflection on the Indian communists’ transitory but significant support in the 1940s for the two-nation theory and Partition. Nor has the subsequent history of Pakistan and the emergence of Bangladesh been addressed, a rich field for those interested in the ever-shifting paradigms of nationhood and identity in South Asia. Has subalternist thinking confined itself ideologically within the fragmentary remainder of 1947? Similarly, the category of labor and the history of the working class is absent from the main corpus of research after Chakrabarty’s (1983, pp. 259–310, 1984, pp. 116–52, 1989) publications on the jute-mill workers of Calcutta. Nor would it appear from this fleeting passage of workers through the subalternist corpus that the national movement and nationalism had any impact on them. Despite insightful commentaries by P. Chatterjee (1984, pp. 153–95, 1986) on Gandhian ideology and Amin’s work on Chauri Chaura, the widespread popular appeal of Gandhi and his ahimsa remains an underexamined theme. The scholar who tires of negativity and is looking for answers on the role of charisma might find emotional sustenance as well as food for thought in an essay by Dennis Dalton (1970). Dalton is neither a historian nor a subalternist. This lacuna coexists with a reluctance to tackle the history of the communist movement, within India or internationally. Given its founder’s abiding interest in the failure of radical historiography to produce a ‘principled and comprehensive (as against eclectic and fragmentary) critique of the indigenous bourgeoisie’s universalist pretensions’ (Guha 1989, p. 307), it would have been intellectually appropriate for him to address the history and historiographical practice of the movement to which he owed theoretical inspiration.
The subalternist antipathy towards what is perceived as the representational pretension of the Gandhian Congress, its habit of translating a constricted, bourgeois aspiration into a nationalist universal (see Guha 1992, pp. 69–120), elicits a query about the political practice of the ‘true representatives’ of the workers and peasants. Guha can hardly be faulted for polemical shyness, and this makes Subaltern Studies’ sustained avoidance of ‘principled and comprehensive’ research on the fractious and tragic meanderings of Indian communism quite remarkable. The observations on Guha’s own political trajectory (Amin and Bhadra 1994, pp. 222–5), which refer to his disillusionment with the Soviet invasion of Hungary in 1956, and to his association with Maoist students in Delhi University in 1970–1, throw no light on the nature and content of his theoretical transformation. These are not mere matters of biographical detail. They are linked to vital historical questions on Bolshevism and its impact on anticolonial struggles in the twentieth century, and not just in India. The fact that Subaltern Studies has carried the occasional essay on non-Indian societies implies that the project of exposing elitist bias and ideological camouflage has been (notionally at least) thrown open to cross-national debate. Yet its sole addressal of Leninism in 17 years takes the form of a theoretical apology (Chaudhury 1987, pp. 236–51) without raising the matter of political hegemonism and subalternity in the USSR, initiatives ‘from below’ in the Russian Revolution, or the impact of Stalinism on the international communist movement.
4. A Radical Historiography?
As a discursive field Subaltern Studies has produced provocative research on the history of colonial India and, of late, into more recent developments. It has been a forum for fresh scholarship on a variety of themes, ranging from ‘low’ caste and ‘tribal’ peasant insurgency, middle-class ideologies of nationalism, prison life, and disciplinary structures under colonialism, to the politics of liquor, the significance of myth, and interpretations of ‘bondage.’ It has also contributed important theoretical reflections on questions of nationalism, colonial science, caste, gender, and identity. These include an evaluation of the historiographical antecedents of Hindutva or majoritarian nationalism (Chatterjee 1994, pp. 1–49), a critique of colonial penology (Arnold 1994, pp. 148–87), a commentary on recent developments in Indian feminism (Tharu and Niranjana 1996, pp. 232–60), research on concubinage and female domestic slavery (Chatterjee 1999, pp. 49–97) and an epistemological analysis of colonial ethnography (Ghosh 1999, pp. 8–48). The inquisitive scholar will also find it worthwhile to read a critique of the school written by an erstwhile member of the collective. Sarkar’s (1998) essay on ‘The Decline of the Subaltern in Subaltern Studies’ challenges what he considers to be its valorization of the indigenous, its ‘enshrinement of sentimentality’, and the shift in its polemical target from capitalist and colonial exploitation to Enlightenment rationality. A caveat might also be entered on the status of any scholarly claim to ‘represent’ the voice, interest or agency of a preferred subject: the historian’s discipline may indeed never be free of bias, but surely it must be as committed to the ideal of truth-as-the-whole, and balance, as to polemic. Be that as it may, Subaltern Studies has raised the level of debate in Indian historiography; the corpus may be critiqued, but certainly not ignored.
It has had an impact on the orientation of many scholars, within and outside the discipline of history, and beyond the frontiers of India. Whether it will retain its original radical impetus by engaging boldly with questions posed by its own practice and the rapidly changing social and political environment in the post-Soviet global order remains to be seen.
Bibliography
Amin S 1984 Gandhi as Mahatma: Gorakhpur District, Eastern UP, 1921–2. In: Guha R (ed.) Subaltern Studies 3. Oxford University Press, Delhi, India
Amin S 1987 Approver’s testimony, judicial discourse: The case of Chauri-Chaura. In: Guha R (ed.) Subaltern Studies 5. Oxford University Press, Delhi, India
Amin S, Bhadra G 1994 Ranajit Guha: A bibliographical sketch. In: Hardiman D, Arnold D (eds.) Subaltern Studies 8, Essays in Honour of Ranajit Guha. Oxford University Press, Delhi, India
Amin S 1995 Event, Metaphor, Memory: Chauri Chaura, 1922–1992. Oxford University Press, Oxford, UK
Arnold D 1994 The colonial prison: Power, knowledge and penology in nineteenth century India. In: Hardiman D, Arnold D (eds.) Subaltern Studies 8, Essays in Honour of Ranajit Guha. Oxford University Press, Delhi, India
Chakrabarty D 1983 Conditions for knowledge of working class conditions: Employers, government and the jute workers of Calcutta, 1890–1940. In: Guha R (ed.) Subaltern Studies 2. Oxford University Press, Delhi, India
Chakrabarty D 1984 Trade unions in a hierarchical culture: The jute workers of Calcutta, 1920–1950. In: Guha R (ed.) Subaltern Studies 3. Oxford University Press, Delhi, India
Chakrabarty D 1986 Discussion: Invitation to a dialogue. In: Guha R (ed.) Subaltern Studies 4. Oxford University Press, Delhi, India
Chakrabarty D 1989 Rethinking Working Class History: Bengal, 1890–1940. Oxford University Press, Oxford, UK
Chatterjee P 1984 Gandhi and the critique of civil society. In: Guha R (ed.) Subaltern Studies 3. Oxford University Press, Delhi, India
Chatterjee P 1986 Nationalist Thought and the Colonial World: A Derivative Discourse. Zed/Oxford University Press, Oxford, UK
Chatterjee P 1989 Caste and subaltern consciousness. In: Guha R (ed.) Subaltern Studies 6. Oxford University Press, Delhi, India
Chatterjee P 1994 Claims on the past: The genealogy of modern historiography in Bengal. In: Hardiman D, Arnold D (eds.) Subaltern Studies 8, Essays in Honour of Ranajit Guha. Oxford University Press, Delhi, India
Chatterjee P 1997 The Nation and Its Fragments: Colonial and Post Colonial Histories. Oxford University Press, Oxford, UK
Chatterjee I 1999 Colonising subalternity: Slaves, concubines and social orphans in early colonial India. In: Bhadra G, Prakash G, Tharu S (eds.) Subaltern Studies 10. Oxford University Press, Delhi, India
Chaudhury A K 1987 In search of a subaltern Lenin. In: Guha R (ed.) Subaltern Studies 5. Oxford University Press, Delhi, India
Dalton D 1970 Gandhi during partition. In: Philips C H, Wainwright M D (eds.) The Partition of India: Policies and Perspectives 1935–1947. Allen and Unwin, London, pp. 222–44
Ghosh K 1999 A market for aboriginality: Primitivism and race classification in the indentured labour market of colonial India. In: Bhadra G, Prakash G, Tharu S (eds.) Subaltern Studies 10. Oxford University Press, Delhi, India
Guha R 1983 Elementary Aspects of Peasant Insurgency in Colonial India. Oxford University Press, Oxford, UK
Guha R 1987 Chandra’s death. In: Guha R (ed.) Subaltern Studies 5. Oxford University Press, Delhi, India
Guha R 1989 Dominance without hegemony and its historiography. In: Guha R (ed.) Subaltern Studies 6. Oxford University Press, Delhi, India
Guha R 1992 Discipline and mobilise. In: Chatterjee P, Pandey G (eds.) Subaltern Studies 7. Oxford University Press, Delhi, India
Hardiman D 1982 The Indian faction: A political theory examined. In: Guha R (ed.) Subaltern Studies 1. Oxford University Press, Delhi, India
Hardiman D 1984 Adivasi assertion in South Gujarat: The Devi movement of 1922–3. In: Guha R (ed.) Subaltern Studies 3. Oxford University Press, Delhi, India
Hardiman D 1987 The Coming of the Devi: Adivasi Assertion in Western India. Oxford University Press, Oxford, UK
Pandey G 1982 Peasant revolt and Indian nationalism: The peasant movement in Awadh 1919–22. In: Guha R (ed.) Subaltern Studies 1. Oxford University Press, Delhi, India
Pandey G 1983 Rallying round the cow: Sectarian strife in the Bhojpuri region, 1888–1917. In: Guha R (ed.) Subaltern Studies 2. Oxford University Press, Delhi, India
Pandey G 1984 Encounters and calamities: The history of a North India Qasbah in the nineteenth century. In: Guha R (ed.) Subaltern Studies 3. Oxford University Press, Delhi, India
Pandey G 1989 The colonial construction of ‘communalism’: British writings on Benares in the nineteenth century. In: Guha R (ed.) Subaltern Studies 6. Oxford University Press, Delhi, India
Pandey G 1990 The Construction of Communalism in Colonial North India. Oxford University Press, Oxford, UK
Pandey G 1994 The prose of otherness. In: Hardiman D, Arnold D (eds.) Subaltern Studies 8, Essays in Honour of Ranajit Guha. Oxford University Press, Delhi, India
Prakash G 1990 Genealogies of Labour Servitude in Colonial India. Cambridge University Press, Cambridge, UK
Sarkar S 1998 Writing Social History. Oxford University Press, Oxford, UK
Sen A 1987 Capital, class and community. In: Guha R (ed.) Subaltern Studies 5. Oxford University Press, Delhi, India
Tharu S, Niranjana T 1996 Problems for a contemporary theory of gender. In: Amin S, Chakrabarty D (eds.) Subaltern Studies 9. Oxford University Press, Delhi, India
Subaltern Studies 1982–99 10 Vols, Oxford University Press, Delhi, India
D. Simeon
Subculture, Sociology of
Even more than the notion of culture, that of subculture poses more problems for sociological analysis than it is able to solve. Just as specialists have enumerated several dozens of definitions of the word culture (Kroeber and Kluckhohn 1952, Shils 1961), it would be easy to do likewise for the notion of subculture.
1. The Concept of Subculture
When the concept of culture is taken in a normative manner, as being the set of works of the mind, art, philosophy, religion, and science, thus entailing a high level of work and development, the notion of subculture would seem to subtend the degraded or ‘vulgar’ forms of that culture. Within the hierarchical framework of culture, the subcultures may be understood as the popular cultures, often marginal, and especially the mass culture engendered by the cultural industries. On the one hand a whole conservative tradition (Bloom 1987) and on the other a critical one, the Frankfurt School (Adorno 1980), have analyzed this mass subculture as being a mixture of propaganda, commerce and alienation, and as being a degradation of both the aristocratic and rationalist ideal of a great culture, as widespread as civilization itself. In short, this view holds subculture to be an inferior culture. While much thinking on mass culture has considered subculture as being centered on leisure activities, entertainment and cultural consumption, the limitations of such a view were soon revealed. First (Morin 1962), it was demonstrated that mass culture could play an integrating and communicative role at the same time as the traditional social cultures and affiliations were declining, such as those involving classes and religions. Moreover, this form of culture does not remain superficial when it induces strong cultural transformations, and when it is partially associated with social movements such as youth culture and the production and consumption of music. For many years thought of as a soft form of propaganda, the mass subculture conveyed by the media plays a more ambiguous and probably more democratic role than one might believe (Lazarsfeld et al. 1968). Finally, the notion of mass cultures comprises a set of specific forms of cultural consumption, ranging from art to leisure activities, including more private activities which have become the object of public cultural policies playing a leading role in the definition of ways of life and social identities.
This mass culture might even be a degradation of ‘real’ culture; indeed, while some of its productions are authentic cultural works, the question arises as to whether cinema, sport and television are an integral part of culture, or whether they are only subcultures. In fact, the issue is hardly of scientific interest and has perhaps come about only through the ethnocentricity of culturally dominant and legitimate groups. At best, the notion of subculture in this view is only a polemic device, the danger being that we may become blind to the diversity of mass culture and to the value of studying its modes of production and forms of consumption.
2. The Hierarchy of Cultures
When culture is considered from an anthropological viewpoint as a set of values, representations, and ways of being and acting, the problem of the hierarchy of cultures is resolved, since each group is logically defined by a culture (Lévi-Strauss 1952). In this case, however, it is only the nature of the difficulty which changes, because the major issue is that concerning the demarcation between cultures held to be coherent and relatively closed entities and consequently defining where the frontier with subcultures lies. In fact, there is a risk of associating a subculture with just about every identifiable activity/practice, and therefore of diluting the concept into an ongoing continuum. It is clear that any group, social category or set of activities/practices may be more or less clearly defined by a subculture. It is valid to talk of youth subcultures, peasant subcultures, class subcultures, the subculture of chic or delinquent city quarters, and even professional subcultures. Numerous authors have investigated these themes and there is now a whole anthropology of modernity. However, the notion of subculture took on a particular meaning with studies on the culture of poverty. R. Hoggart (1957) highlighted two essential dimensions to working-class culture: first, the way a community culture can be based on social distance and how the working class is locked into its own values; second, the manner in which a culture serves to ratify the social domination that the working class undergoes, with the result that one’s class destiny is widely accepted. In this view, the subculture is seen as a way of maintaining social order. In describing the culture of poverty, O. Lewis (1963) treads similar ground, but with more clarity. For him, poverty engenders a specific culture which, more than just a form of adaptation to economic constraints, is rather a normative interiorization of poverty, thus explaining the way it is reproduced and why its actors are unable to escape from their situation in life. Thus, poverty may be explained by its own subculture. This idea has been heavily criticized owing to its conservative undertones, suggesting that the poor are victims more of their culture than of the social conditions that are imposed on them.
Beyond this criticism, the notion of subculture is problematic, since it is difficult to know where the frontiers between cultures lie in a porous society where there is interpenetration, imitation and opposition between groups and cultures (Bourdieu 1979). Moreover, this view tends to reify cultures and subcultures, to give them more homogeneity than they really have, and to explain the conduct of actors only by their cultural identification.
3. The Links Between Culture and Action
To these first two difficulties is to be added a third, i.e., the links between culture and action. While it is clear that social action is normatively and culturally orientated and that it proceeds in ‘patterns’ of action, it is problematic to explain action as the simple application of a cultural program, as postulated by the most radical thinkers on culturalism (Benedict 1935, Kardiner 1939, Linton 1968). More precisely, by explaining social conduct in terms of cultures and subcultures, there is a great risk of getting into a tautological loop. For example, deviant behavior could be explained by the existence of a deviant subculture, authoritarian behavior by the existence of authoritarian subcultures, and people might play soccer because it likewise has its own subculture. Even if such an explanation were to be admitted, the main scientific issue remaining would be the ability to explain the development and logic of subcultures. Regarding deviance, for example, there are several competing theories of the subculture, because they explain the development of the subcultures differently. It is clear that delinquent behavior supposes specific norms, a particular form of socialization, a conception of social life and its values, if only to neutralize the interiorization of conformist norms (Matza 1964). Three schools of thought may be distinguished here. The first (Thrasher 1927, Jankowski 1991) considers the subculture of gangs of youngsters as being a response to a phenomenon of social disorganization and anomie. Enhancement of the bonds of solidarity in the gang, the logic of confronting other gangs and the police, and the obsession with honor may all be seen as ways of constructing a subculture, guaranteeing limited social integration for individuals who cannot play a full role in social life. The second school of thought regarding delinquent subculture derives from the Mertonian model of anomie (Merton 1965). Here, subcultures are seen as a way to reduce the structural tension brought about by the juxtaposition of profound social inequality with the democratic culture of achievement and its concomitant realization of the individual in economic success. The various deviant cultures may therefore be considered to enhance either delinquency as a response to frustrated conformism (Cohen 1955), or social retreat, withdrawal from the game and holing up in a primary group (Cloward and Ohlin 1960).
Finally, other more marginal theories hold deviant subculture to be the product of confrontation between the working class and the conformism of the middle classes. Consequently, the actors of social control are seen as engendering deviance and the labeling of individuals by way of their own professional subcultures. Hence, social conflicts may be considered to be conflicts between subcultures. To be clear and not fall foul of tautology, subculture may be explained as the product of social relations and of the strategies that the actors implement. In general, subcultures result from the meeting of wider cultures and particular social situations (Dubet 1987). Therefore, subcultures could be considered as the way in which actors interpret whole cultural settings in the light of the situations and contexts in which they find themselves. In these terms, subcultures are ‘constructed’ by actors and are not imposed on them as principal determinants of their action. Moreover, it is difficult to free the notion of subculture from the value judgments that define it. Indeed, it is very often used in a naive ideological way by researchers oscillating
between populist enchantment and aristocratic loftiness.
See also: Counterculture; Critical Theory: Frankfurt School; Culture, Sociology of; Popular Culture; Values, Sociology of; Youth Gangs
Bibliography
Adorno T W 1980 Minima Moralia. Réflexions sur la Vie Mutilée. Payot, Paris
Benedict R 1935 Patterns of Culture. Routledge, London
Bloom A 1987 L’Âme Désarmée, Essai sur la Culture Générale. Julliard, Paris
Bourdieu P 1979 La Distinction. de Minuit, Paris
Cloward R A, Ohlin L E 1960 Delinquency and Opportunity. The Free Press, New York
Cohen A K 1955 Delinquent Boys. The Culture of the Gang. The Free Press, New York
Dubet F 1987 La Galère. Jeunes en Survie. Fayard, Paris
Hoggart R 1957 The Uses of Literacy. Chatto and Windus, London
Jankowski M S 1991 Islands in the Streets: Gangs and American Urban Society. University of California Press, Berkeley, CA
Kardiner A 1939 The Individual and his Society. Columbia University Press, New York
Kroeber A L, Kluckhohn C 1952 Culture: A Critical Review of Concepts and Definitions. Vintage Books, New York
Lazarsfeld P, Berelson B, Gaudet H 1968 The People’s Choice. Columbia University Press, New York
Lévi-Strauss C 1952 Race et Histoire. UNESCO, Paris
Lewis O 1963 Les Enfants des Sanchez. Gallimard, Paris
Linton R 1936 The Study of Man. Appleton Century, New York
Matza D 1964 Delinquency and Drift. Wiley, New York
Merton R K 1965 ‘Structure Sociale, Anomie et Déviance’, Éléments de Théorie et de Méthode Sociologique. Plon, Paris
Morin E 1962 L’esprit du temps. Grasset, Paris
Shils E 1961 Mass society and its culture. In: Culture for the Millions. D. Van Nostrand, Princeton
Thrasher F 1927 The Gang. University of Chicago Press, Chicago, 1963
F. Dubet
Subjective Probability Judgments
1. Introduction
Life is rife with uncertainty. Do our enemies have biological weapons? Will it rain next Sunday? How much oil can be found by drilling in this spot? Important decisions in areas as diverse as medicine, business, law, public policy, and one’s private life, must be made under uncertainty, hence hinge on odds and probabilities. ‘Probability,’ in the famous words of Bishop Butler (1736), indeed ‘is the very guide to life.’ Yet the theories of mathematical probability or of statistical inference cannot deliver the probabilities sought. So we have to make do with subjective probabilities: the probabilities that people generate in their own minds to express their uncertainty about the possibility of the occurrence of various events or outcomes. How do people generate these subjective probabilities? Since the 1970s, this question has attracted much research attention from cognitive psychologists. The present article will review this research. Although statistics and probability theory can seldom replace intuitive judgments, a favored paradigm in the study of subjective probabilities has been to use the norms and dictates these theories provide as yardsticks against which people’s intuitions can be compared, thereby illuminating the cognitive heuristics people use, and the biases to which they are subject. Appropriately, the dominant school in this line of research is called the heuristics and biases approach (Tversky and Kahneman 1974, Kahneman et al. 1982). At its core are two heuristics for judging probabilities: the representativeness heuristic and the availability heuristic.
2. Representativeness
This is a heuristic for judging the probability that a certain target case, presented by some individuating information, belongs to a certain category (or comes from a certain source, or possesses a certain feature). This is a common type of probability judgment: a doctor judges the probability that a particular patient has pneumonia, a juror judges the probability that a particular defendant committed murder, or an employer judges the probability that a particular job candidate would be a good employee. Its essence is to base the probability judgment on the extent to which the target case’s relevant particulars (e.g., the patient’s symptoms, the defendant’s motives, or the job candidate’s background) match, resemble, or represent the attributes which the judge considers typical of, or central to, the category (Kahneman and Tversky 1972). Similarity judgments, however, obey different rules than probability. For example, a person could well be more like a typical New Yorker than like a typical American, but a person can never be more likely to be a New Yorker than an American. This difference between resemblance and probability can bias judgments of the latter when they are based on the former. Consider, for instance, the following example.
Example 1: After graduating from college, Sarah took a trip around the world, writing frequent letters home. One contained the following description: ‘It is so clean here, you could eat off the streets. Snowcapped mountains frame picturesque log cabins, with geraniums in every window.’ Where do you think this letter was written?
Respondents given this question sensibly thought it was more likely that the letter was written from Switzerland than from Brazil. But they also thought it was more likely that the letter was written from Switzerland than from ... Europe! When asked to select the most likely of several possibilities (which included Switzerland, Brazil, and Europe, among others), they ranked Switzerland, not Europe, first (Bar-Hillel and Neter 1993). Probability theory does not provide an answer to the question: ‘What is the probability that this letter was written from Switzerland?’ But once this probability is assessed, it sets constraints on certain other probabilities. Thus, whatever the probability that the letter was written from Switzerland, it cannot possibly be higher than the probability that it was written from Europe. This is the extension principle, and people realize its validity. Those that rely on it when performing the experimental task do not rank Switzerland before Europe. Most, however, just compare the description in the letter with their idea of the places under consideration. The match between the description and Switzerland is the best, better even than the match between it and Europe. So, in accordance with the representativeness heuristic, Switzerland is ranked most likely. Results such as this point out both what people often do (namely, base probability judgments on judgments of representativeness), and what they often fail to do (namely, rely on formal rules of probability). One factor, which plays an important role in formal probability calculations but is relatively neglected in subjective probability reasoning, is size: either category size or sample size. This is shown in the following example.
Example 2: A certain town is served by two cab companies, the Blue Cab and the Green Cab, named after the color of cab they operate. The Blue Cab Company has many more cabs than the Green Cab Company.
One dark and foggy night, a pedestrian was knocked down by a hit-and-run cab. She later testified that it was green. Her accuracy rate in distinguishing green from blue cabs under the conditions that existed that night was found to be about 85 percent. What is the probability that the errant cab was a Green Cab? This problem, and many variants thereof, has been examined in scores of studies (e.g., Tversky and Kahneman 1982a, Bar-Hillel 1980, 1983). Formatted like a high school math problem, it invites explicit numerical calculation. The modal response is to assign the witness’ testimony a probability that matches her accuracy, 85 percent. The same happens when the witness is said to have identified the cab as belonging to the larger company, Blue Cab. Indeed, the relevance of the relative size of the cab companies to the question eludes most people. After all, they argue, the pedestrian was run over by a single cab, not by the entire fleet. Nonetheless, when the witness is unable to identify the color at all, most people properly realize
that the errant cab is probably from the larger company, simply because it is the larger company. Normatively, the witness’ testimony should be integrated with this base-rate consideration, rather than replacing it. The inclination to neglect category size is known as the base-rate fallacy. Since Example 2 does not involve judgment by representativeness, it shows that base-rate neglect is not merely a side effect of this heuristic, as it was in Example 1. Example 3 shows that size considerations are neglected not only when considering the source population, but also when considering the sample. Example 3: A game of squash can be played either to 9 or to 15 points. If you think you are a better player than your opponent, which game gives you a higher chance of winning, the short game, or the longer game? Suppose, instead, that you are the weaker player, which game gives you a higher chance now? If you are inclined to favor the same length in either case, consider this theorem from probability theory: The larger the sample of rounds (i.e., 15 rather than 9), the likelier it is to yield the expected outcome, namely victory to the stronger player. So if you believe you are the stronger player, you should prefer the longer game, but if you believe you are weaker, your chances are better with a shorter game. Intuitively, though, winning a game seems equally representative of player strength whether it is a shorter or longer game (Kahneman and Tversky 1982b). This is not to deny that people can, spontaneously or by mild prompting, appreciate that larger samples are better for estimating population parameters than smaller samples. But they erroneously think that the advantage lies in being relatively larger (in terms of the ratio of sample size to population size) rather than absolutely larger.
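The normative answers to Examples 2 and 3 can be sketched numerically. The Python sketch below is illustrative only: the 15 percent share of Green cabs is an assumed figure (the text says only that Blue has many more cabs), the squash model assumes that each point is won independently with a fixed probability and ignores serving rules, and the function names are hypothetical.

```python
from math import comb


def posterior_green(accuracy, base_rate_green):
    """Bayes' rule: P(cab was Green | witness says 'green')."""
    says_green_and_green = accuracy * base_rate_green
    says_green_and_blue = (1 - accuracy) * (1 - base_rate_green)
    return says_green_and_green / (says_green_and_green + says_green_and_blue)


def win_probability(p_point, target):
    """P(a player who wins each point with probability p_point is first to `target` points).

    Simplifying assumption: points are independent and serving rules are ignored.
    The player wins by taking `target` points while the opponent takes k < target.
    """
    q = 1 - p_point
    return sum(comb(target - 1 + k, k) * p_point**target * q**k
               for k in range(target))


if __name__ == "__main__":
    # Assumed base rate: 15 percent of the town's cabs are Green.
    print(posterior_green(accuracy=0.85, base_rate_green=0.15))
    # Stronger (0.55 per point) and weaker (0.45 per point) player, games to 9 and 15.
    for p in (0.55, 0.45):
        print(p, win_probability(p, 9), win_probability(p, 15))
```

Under the assumed base rate, the witness’ 85 percent accuracy is exactly offset by the rarity of Green cabs, so the posterior is one half rather than 85 percent. In the squash model, lengthening the game raises the winning chance of whichever player is stronger and lowers that of the weaker one.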
Many people, for example, indicate equal confidence in the accuracy of a survey of 1 in every 1,000 people in predicting election outcomes in a city of a million voters as in a survey of 1 in every 1,000 people in a town of 50,000, even though the former survey samples 20 times as many people (Bar-Hillel 1979). Example 4, which follows, demonstrates another judgmental bias, which is related to size neglect and is known as failure to regress. Example 4: When Julie was 4, she was rather tall for her age, in the 80th percentile (i.e., taller than about 80 percent of her peers). She is now 10 years old. What would you guess is her present height percentile? If you think that Julie may have moved up, or she may have moved down, but she probably is still taller than most of her peers, you are quite right. However, if you also think that Julie is as likely to be in a higher percentile as in a lower percentile, and that by and large she is probably still most likely to be approximately in the 80th percentile, then you are sharing a common, but erroneous, disregard for the phenomenon known as regression to the mean. This is the fact that the mean of a second sample from a population
Figure 1 Which matrix is more ‘randomly’ mixed?
can always be expected to be closer to the population mean than was the first sample’s (unless the first sample’s mean happens to have coincided with the population mean to begin with). In the present example, this follows from the fact that a larger proportion of the population of children were shorter than Julie (80 percent) than were taller than her (20 percent). Failure to recognize regression to the mean can result in inventing causal explanations for what may be mere statistical artifacts (Kahneman and Tversky 1973), such as for the observation that very bright women tend to have somewhat less bright husbands. The following example demonstrates some difficulties that people have with the concept of randomness. Example 5: Consider the two 12 × 12 matrices in Fig. 1. Each consists of 50 percent black cells and 50 percent white cells. In one, the colors are mixed in a manner typical of truly random mixing. In the other, the cells are not mixed randomly at all. Which do you think is the ‘random’ matrix? Most people perceive B as more ‘random’ (Falk 1975). A seems to them too ‘clumpy.’ But in fact, about half the pairs of adjacent cells in A are of the same color, exactly as much clumpiness as expected in a randomly mixed matrix. On the other hand, in B, almost two-thirds of neighboring cells are of different colors, which is more alternation than could be expected statistically. But people’s intuitions of randomness (see, e.g., Bar-Hillel and Wagenaar 1991) are characterized by ‘local representativeness’: the erroneous expectation that the short run should exhibit the properties which the law of large numbers promises only in the long run. Thus, local representativeness is like a law of small numbers (Tversky and Kahneman 1971).
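The claim that a randomly mixed matrix has about half of its adjacent pairs in the same color can be checked by simulation. The sketch below is a minimal illustration, not the procedure used by Falk (1975); the helper names, and the use of a checkerboard as a maximally alternating contrast, are assumptions introduced here.

```python
import random


def same_color_fraction(grid):
    """Fraction of horizontally or vertically adjacent cell pairs that share a color."""
    rows, cols = len(grid), len(grid[0])
    same = total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # right-hand neighbor
                total += 1
                same += grid[r][c] == grid[r][c + 1]
            if r + 1 < rows:  # neighbor below
                total += 1
                same += grid[r][c] == grid[r + 1][c]
    return same / total


def random_grid(size=12, seed=None):
    """A size x size grid with exactly half its cells black (1), shuffled at random."""
    rng = random.Random(seed)
    n_cells = size * size
    cells = [1] * (n_cells // 2) + [0] * (n_cells - n_cells // 2)
    rng.shuffle(cells)
    return [cells[i * size:(i + 1) * size] for i in range(size)]


def checkerboard(size=12):
    """Perfect alternation: no two adjacent cells share a color."""
    return [[(r + c) % 2 for c in range(size)] for r in range(size)]


if __name__ == "__main__":
    fractions = [same_color_fraction(random_grid(seed=s)) for s in range(1000)]
    print("mean same-color fraction, random grids:", sum(fractions) / len(fractions))
    print("same-color fraction, checkerboard:", same_color_fraction(checkerboard()))
```

With exactly half the cells black, any two given cells share a color with probability 71/143, or about 0.497, so a truly random matrix should show roughly as many same-color neighbors as different-color ones. A matrix in which two-thirds of neighboring pairs differ lies well toward the checkerboard end of this scale.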
3. Availability If the representativeness heuristic tends to overlook category size, the availability heuristic is used primarily for judging category size, or rather, relative size. Category size relates to probability inasmuch as probability is sometimes taken to be the proportion, in the population as a whole, of instances that belong to some specific category, or possess some specific feature. A statistician who wants to estimate relative frequencies draws a random sample. The cognitive counterpart of the statistician’s procedure is to ‘sample in the mind’: namely, to bring to mind, either by active search or spontaneously, examples of the target population. Sampling in the mind, however, is hardly random. It is subject to many effects that determine the ease with which examples come to mind (Tversky and Kahneman 1973). Examples of such effects are salience, recency, imaginability, and, fortunately, even actual frequency. Effects that determine the ease of calling to mind are known as availability effects. The availability heuristic is the judgmental procedure of reliance on mental sampling, and is demonstrated in the following example. Example 6: Subjects were read a list of 39 names of celebrities. In List 1, the 19 women were rather more famous than the 20 men, and in List 2 it was the 19 men who were more famous than the 20 women. Each person heard a single list. They were then asked whether they had heard more female names or more male names. As predicted, when the men [women] were more famous, male [female] names were more frequently recalled. They were also judged, by independent respondents, to be more frequent. It turns out to be unnecessary actually to bring examples to mind, as it is enough just to estimate how many could be brought to mind. People seem to have a pretty good notion of availability (Tversky and Kahneman 1973), and use it to estimate probabilities. In Example 7, the availability heuristic leads to a violation of the aforementioned extension principle. Example 7: Imagine the following scenarios, and estimate their probability.
(a) A massive flood somewhere in America, in which more than 1,000 people die. (b) An earthquake in California, causing massive flooding in which more than 1,000 people die. Respondents estimated the first event to be less likely than the second (Tversky and Kahneman 1983). As discussed earlier, the extension principle logically precludes this possibility: a flood with unspecified causes in a relatively unspecified location is necessarily at least as likely as one with a more specific cause and location. An earthquake in California, however, is a readily imaginable event which greatly increases the availability, and hence the subjective probability, of the flood scenario. A particularly prominent result from the availability heuristic is the unpacking effect, shown in Example 8. Example 8: A well-known 22-year-old Hollywood actress was admitted to a hospital emergency room with pains in the lower right abdomen, which had
lasted over 12 hours. What could she be suffering from? A group of general practitioners from the Stanford area were given this question. Half were asked to assign a probability to the following possibilities: (a) Indigestion (b) Extra-uterine pregnancy (c) Other. The list given to the other half was: (a) Indigestion (b) Extra-uterine pregnancy (c) Acute appendicitis (d) Inflamed kidney (e) Cervical inflammation (f) Other. The first group assigned a probability of 50 percent to the ‘Other’ category. The second group assigned a probability of 69 percent to categories (c) to (f), although of course these are but a more detailed breakdown (‘unpacking’) of ‘Other’ (Redelmeier et al. 1995). A similar effect is found even when it is hard to argue that the unpacking itself provides additional information. When the category of ‘homicide’ is broken down into ‘homicide by a stranger’ and ‘homicide by an acquaintance,’ the latter two are given a higher total probability as causes of death than the unbroken category (Tversky and Koehler 1994). Here, apparently, the different descriptions of the categories merely call attention to different aspects of the categories. Either by affecting memory or by affecting salience, enhancing the availability of subcategories through more detailed descriptions enhances their perceived probability.
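The extension principle and the unpacking effect both amount to coherence checks that can be applied mechanically to a set of judged probabilities. The following sketch is an illustration under stated assumptions: the function names are hypothetical, the flood estimates are invented placeholders (the text reports only their ordering), and the 50 and 69 percent figures are those reported for Example 8.

```python
def extension_violations(judged, contained_in):
    """Return (narrow, broad) pairs in which the narrower event was judged
    more probable than the broader event that contains it."""
    return [(narrow, broad) for narrow, broad in contained_in
            if judged[narrow] > judged[broad]]


def unpacking_excess(packed, unpacked_total):
    """Amount by which the total judged probability of the unpacked
    subcategories exceeds the judged probability of the packed category."""
    return unpacked_total - packed


if __name__ == "__main__":
    # Placeholder estimates for Example 7; only their ordering follows the text.
    judged = {"flood in America": 0.02, "California earthquake flood": 0.05}
    nesting = [("California earthquake flood", "flood in America")]
    print(extension_violations(judged, nesting))

    # Example 8: 'Other' judged at 0.50 when packed, 0.69 in total when unpacked.
    print(unpacking_excess(packed=0.50, unpacked_total=0.69))
```

A coherent judge would produce an empty list of violations and an excess of zero; any positive excess means that the same event received more probability simply because it was described in more detail.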
4. Subjective Uncertainty When judging probability, people can locate the source of the uncertainty either in the external world or in their own imperfect knowledge (Kahneman and Tversky 1982). The latter is what Bayesian theory means by the term subjective probability (whereas in this article, recall, the adjective refers to the mode of assessment, rather than to the locus of the uncertainty). When assessing their own uncertainty, people tend to underestimate it. The two major manifestations of this tendency are called overconfidence and hindsight bias. Overconfidence concerns the fact that people overestimate how much they actually know: when they are p percent sure that they have answered a question correctly or predicted correctly, they are in fact right on average less than p percent of the time (e.g., Lichtenstein et al. 1982, Keren 1991). Hindsight bias concerns the fact that people overestimate how much they would have known had they not possessed the correct answer or prediction: events which are given an average probability of p percent before they are known to have occurred, are given, in hindsight, probabilities higher than p percent (Fischhoff 1975), a phenomenon sometimes known as ‘I knew it all along.’ Both these biases can result from biased availability. After all, what we know is almost by definition more
available to us than what we do not know. Regarding hindsight, it is enough to note that once something (such as the final outcome) has been brought to mind, it is nearly impossible to ignore (try to ‘not think of a white elephant’). Regarding overconfidence, it has been shown that it can be dampened by instructing people to ‘think of reasons you might be wrong’ prior to asking them to assess their own confidence (Koriat et al. 1980), a direct manipulation to increase the availability of the complementary event, the one competing for a share of the total 100 percent probability. In addition, overconfidence turns to underconfidence in very hard questions (e.g., May 1986). Perhaps in answering easy questions, one’s best guess is mentally prefaced by: ‘I’m not sure I know, but I think that …’ which focuses on the available. In answering hard questions, on the other hand, one’s best guess might be mentally prefaced by: ‘I don’t think I know, but I guess that …’ which focuses on the unavailable.
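Overconfidence, in the sense used in this section, is a gap between stated confidence and the proportion of answers that turn out to be correct. A minimal sketch of how such a gap might be computed follows; the function names and the sample data are hypothetical and are not taken from any of the studies cited.

```python
from collections import defaultdict


def calibration_table(judgments):
    """Group (stated_confidence, was_correct) pairs by stated confidence.

    Returns {confidence: (number_of_judgments, proportion_correct)}.
    """
    bins = defaultdict(list)
    for confidence, correct in judgments:
        bins[confidence].append(1 if correct else 0)
    return {c: (len(hits), sum(hits) / len(hits))
            for c, hits in sorted(bins.items())}


def overconfidence(judgments):
    """Mean stated confidence minus overall proportion correct.

    Positive values indicate overconfidence, negative values underconfidence.
    """
    judgments = list(judgments)
    mean_confidence = sum(c for c, _ in judgments) / len(judgments)
    hit_rate = sum(1 for _, ok in judgments if ok) / len(judgments)
    return mean_confidence - hit_rate


if __name__ == "__main__":
    # Hypothetical data: ten answers given at 90 percent confidence (7 correct)
    # and ten at 60 percent confidence (5 correct).
    data = ([(0.9, True)] * 7 + [(0.9, False)] * 3
            + [(0.6, True)] * 5 + [(0.6, False)] * 5)
    print(calibration_table(data))
    print(overconfidence(data))
```

A perfectly calibrated judge would show a proportion correct equal to the stated confidence in every row of the table, and an overall score of zero.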
5. Conclusion The survey of heuristics and biases involved in people’s judgments under uncertainty shows that subjective probabilities (in the sense of judged probabilities) are not just imperfect or inaccurate versions of objective probabilities, but rather are governed by cognitive principles of their own. These may often lead to good judgments—but in many circumstances entail violations of normative dictates, and hence systematic departures from more rational judgments. This view stands in contrast to earlier views, which saw ‘Man as an intuitive statistician,’ (Peterson and Beach 1967). It also contrasts with recent views (e.g., Cosmides and Tooby 1996, Gigerenzer 1998), which challenge the very appropriateness of a research program that asks people to make probability judgments. This school of thought insists that we evolved to judge frequencies, not probabilities, so we are not to be faulted for the biases in our subjective probabilities. But as long as people are called upon, as in modern life they so often are, to make probability judgments, awareness of the phenomena this article described would benefit them greatly. See also: Bayesian Theory: History of Applications; Bounded and Costly Rationality; Decision Making: Nonrational Theories; Decision Theory: Bayesian; Heuristics for Decision and Choice
Bibliography
Bar-Hillel M 1979 The role of sample size in sample evaluation. Organizational Behavior and Human Performance 24: 245–57
Bar-Hillel M 1980 The base-rate fallacy in probability judgments. Acta Psychologica 44: 211–33
Bar-Hillel M, Neter E 1993 How alike is it versus how likely is it: A disjunction fallacy in stereotype judgments. Journal of Personality and Social Psychology 65(6): 1119–32
Bar-Hillel M, Wagenaar W A 1991 The perception of randomness. Advances in Applied Mathematics 12: 428–54 [also in Keren G, Lewis C (eds.) 1993 A Handbook for Data Analysis in the Behavioral Sciences. Lawrence Erlbaum, Hillsdale, NJ, pp. 369–93]
Butler J 1736 The Analogy of Religion [‘Introduction’ cited in The Oxford Book of Quotations, 3rd edn. (1979), Oxford University Press, Oxford, UK, p. 117: 9]
Cosmides L, Tooby J 1996 Are humans good intuitive statisticians after all? Rethinking some conclusions from the literature on judgment under uncertainty. Cognition 58: 1–73
Falk R 1975 The perception of randomness. PhD dissertation [in Hebrew], The Hebrew University, Jerusalem, Israel
Fischhoff B 1975 Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance 1: 288–99
Gigerenzer G 1998 Ecological intelligence: An adaptation for frequencies. In: Dellarosa Cummins D, Allen C (eds.) The Evolution of Mind. Oxford University Press, New York
Kahneman D, Tversky A 1982a Evidential impact of base rates. In: Kahneman D, Slovic P, Tversky A (eds.) 1982 Judgment Under Uncertainty: Heuristics and Biases. Cambridge University Press, Cambridge, UK
Kahneman D, Slovic P, Tversky A (eds.) 1982b Judgment Under Uncertainty: Heuristics and Biases. Cambridge University Press, Cambridge, UK
Kahneman D, Tversky A 1972 Subjective probability: A judgment of representativeness. Cognitive Psychology 3: 430–54
Kahneman D, Tversky A 1973 On the psychology of prediction. Psychological Review 80: 237–51
Kahneman D, Tversky A 1982 On the study of statistical intuitions. Cognition 11: 123–41
Keren G 1991 Calibration and probability judgments: Conceptual and methodological issues. Acta Psychologica 77: 217–73
Koriat A, Lichtenstein S, Fischhoff B 1980 Reasons for confidence. Journal of Experimental Psychology: Human Learning and Memory 6: 107–18
Lichtenstein S, Fischhoff B, Phillips L 1982 Calibration of probabilities: The state of the art to 1980. In: Kahneman D, Slovic P, Tversky A (eds.) Judgment Under Uncertainty: Heuristics and Biases. Cambridge University Press, Cambridge, UK, pp. 306–34
May R 1986 Inferences, subjective probability and frequency of correct answers: A cognitive approach to the overconfidence phenomenon. In: Brehmer B, Jungermann H, Lourens P, Sevoaan A (eds.) New Directions in Research on Decision Making. North Holland, Amsterdam, pp. 174–89
Peterson C R, Beach L R 1967 Man as an intuitive statistician. Psychological Bulletin 68(1): 29–46
Redelmeier D, Koehler D J, Liberman V, Tversky A 1995 Probability judgment in medicine: Discounting unspecified alternatives. Medical Decision Making 15: 227–30
Tversky A, Kahneman D 1971 The belief in the ‘law of small numbers.’ Psychological Bulletin 76: 105–10
Tversky A, Kahneman D 1973 Availability: A heuristic for judging frequency and probability. Cognitive Psychology 5: 207–32
Tversky A, Kahneman D 1974 Judgment under uncertainty: Heuristics and biases. Science 185: 1124–31
Tversky A, Kahneman D 1983 Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review 90: 293–315
Tversky A, Koehler D J 1994 Support theory: A nonextensional representation of subjective probability. Psychological Review 101: 547–67
M. Bar-Hillel
Sub-Saharan Africa, Archaeology of
1. The African Stone Age: A Hard Act to Follow People evolved in sub-Saharan Africa from a common ancestor with the other African apes between 5 and 7 million years ago. The earliest recognizable artefacts, which are stone tools associated with animal bones, come from a number of sites in east Africa about 2.5 million years ago. The earliest members of our own genus, Homo, date from a little after that time. The earliest claimed evidence for the controlled use of fire and the earliest persuasive bone artefacts come from sites in eastern and southern Africa dated to just before or just after 2 million years ago. Some time before 1 million years ago, a first ‘out of Africa’ expansion took archaic humans across the land bridge into Eurasia. The earliest handaxes, beautiful as well as functional, spread from Africa some time after 1 million years ago. Current opinion is that between a half and a quarter of a million years ago African early Homo populations evolved into our own species, Homo sapiens, whilst relatives in southeastern Asia and in Europe took other directions. Eventually, perhaps 100,000 years ago, a second ‘out of Africa’ movement, or perhaps a series of waves, arguably caused the replacement of archaic humans by modern ones, as Homo sapiens completed the process of global colonization. These early modern people in southern and eastern Africa left behind the earliest evidence for systematic marine food use and substantial early evidence for the manufacture of pigment, presumably for ritual use. If this sounds somewhat ‘earlier than thou,’ it nevertheless underlines the massive contribution of sub-Saharan Africa to the story of human evolution and the emergence of modern behavior.
John Goodwin effectively began the systematization of sub-Saharan stone age prehistory when, in 1929, along with van Riet Lowe, he published The Stone Age Cultures of South Africa, in which he established the terminological framework still used throughout the continent. He recognized similarities between the stone artefact assemblages in South Africa and Europe,
particularly in the earlier phases, but elected to coin a new set of terms, the Earlier, Middle, and Later Stone Ages. His vision was that these major units, broadly equivalent to the Lower, Middle, and Upper Palaeolithic of Europe, could be subdivided into geographically bounded assemblage sets that he called cultures or variants. Whereas the Earlier and Middle Stone Age assemblages closely resembled their European counterparts, the Later Stone Age (LSA) was quite different from the Upper Palaeolithic of Europe. At least one of his cultural constructs, the Wilton culture, became popular as a designation of microlithic Later Stone Age assemblages from the Cape of Good Hope to the Horn of Africa. After the foundation of the Pan African Congress meetings in the 1940s, which created a structure for French and English speaking archaeologists working in sub-Saharan Africa to better coordinate their efforts, the notion of ‘intermediates’ between Earlier and Middle and between Middle and Later Stone Ages was briefly entertained. These were to contain assemblages that illustrated the technological and typological transitions, but for reasons explained later were unsuccessful. The most serious attempt to standardize what had become a terminological nightmare was the brave but ineffective set of recommendations of the Burg Wartenstein conference, held in Austria in 1964. This template envisaged a set of hierarchically organized terms (occurrences, phases, industries, and industrial complexes), replacing the older tripartite system of Goodwin’s. It never found much favor and archaeologists seem simply to have moved on from terminological angst, choosing to use older or newer terms informally and to investigate behavior rather than debate the structure of the archaeological record. Significantly, this early obsession with stones as determining materials has been diffused by attention to other materials and other issues.
2. Human Origins A question of considerable interest is that of whether the archaeological record is an accurate reflection of the distributions of early people. Fossil fragments of the genus Australopithecus (and Ardipithecus) were known only from the sedimentary lake basins of East Africa and the dolomite caverns of South Africa until the discovery of australopithecines from the deserts of Chad. The classic sites of eastern and southern Africa are depositional environments, which might suggest that australopithecines were more widespread but not preserved in more erosional landscapes without such catchments. Chad is certainly a hint of wider distributions. Stone tools, however, are less affected by poor preservational contexts and establish the significance of the lake basins of the African Rift Valley as the focal places in early human technological history.
Mary Leakey’s pioneering archaeology of living floors at Olduvai Gorge in East Africa established that the hand-axe-containing Acheulean assemblages of the African Earlier Stone Age, which Goodwin had wanted to call the Stellenbosch culture in the Cape, were preceded by earlier assemblages without such bifaces. With the first potassium argon dates of about 1.8 million years from the base of Bed 1 at Olduvai came the realization of the huge antiquity of stone tool making in Africa. The earliest assemblages, naturally referred to as Oldowan, had a variety of cores, flakes, and rather crudely retouched pieces. Pre-hand axe assemblages are now known to date from as early as 2.5 million years ago, at a number of sites in the stratified lake basin deposits of the East African Rift Valley. The assemblages vary, at least in part, with the variable nature of the stone raw materials available. Very little has yet emerged as to the likely precedents of these Oldowan assemblages, but presumably a long period of more informal tool use and modification precedes the localized concentrations now claimed as the earliest sites. By comparison with recently recorded chimpanzee tool use, we might expect that the changes in behavior which led to the localization of tool making debris in specific places are what is actually reflected in the archaeological record. Initial experiments with stone tool making must substantially predate 2.5 million years, but may never be recognized. The earliest assemblages with recognizable bifaces, characteristically both roughly teardrop shaped hand axes and blade ended cleavers, date to about 1.6 million years or so, soon after the earliest appearance of the fossil remains of what used to be African Homo erectus, more recently termed Homo ergaster. These are now universally referred to as Acheulean assemblages, named after the much later, though discovered earlier, French sites, in contradiction to Goodwin’s wishes.
Three remarkable aspects of these occurrences have continued to puzzle archaeologists. First is the observation that hand axes are extremely widespread, from the river gravels of the Somme in France and the Thames in England, through the beach terraces of the Mediterranean to the lakeside camps of the East African Rift and on to the caves and coastal dunes of the Cape. A spatial distribution of several million square kilometers and a time span of more than a million years implies a broad functionality and a lasting usefulness, a veritable Palaeolithic Swiss army knife. Suggestions as to the uses of hand axes have included digging up underground plant foods, butchering animal carcasses, use as missiles, and suspension as parts of animal snares. This issue is not independent of the other two enduring characteristics of these often beautifully made tools. Many archaeologists have noted that in many, but not all, cases hand axes appear to have been flaked beyond the apparent functional requirements. Indeed the pursuit of symmetry and regularity of outline is
undeniable in many examples, raising the question of whether the objects are functional or works of art, or both. Associated with this observation is the remarkable number of such hand axes at many, but again not all, sites. At Olorgesailie and other East African sites, and particularly at the hand axe localities along the Orange River in South Africa, there are thousands, and in some cases millions of bifacially worked hand axes. Even allowing for the fact that such sites are obvious palimpsests, there seems to be a case for questioning the production strategy of artefacts that often appear to the casual observer to be still relatively usable. Why did people put so much energy into the finishing of such pieces, and why did they make so many? A recent suggestion has been that both of these characteristics imply that the making of the tool has to be seen as just as important as its use. Like the tail of the peacock, the ability to lavish attention onto the making of a hand axe, and then discard it after little use, must imply a set of evolutionary capacities to impress the opposite sex. We cannot say who was making and who was impressed, but the assumption of many archaeologists is that men made and women gasped. There is little doubt that all developments in human genetic evolution and technological history prior to about 1.8 million years ago happened in Africa. The earliest non-African sites at Dmanisi in Georgia and Ubeidiya in Israel are not Acheulean assemblages. Although not strictly a matter of African prehistory, the absence of the hand axe and cleaver, and hence the designation Acheulean, from southeast Asian stone tool assemblages, has caused much comment.
If it is confirmed that the first ‘out of Africa’ movement preceded the invention of these bifacial forms, and if it is further confirmed that genetic exchange between southeast Asia and Africa has been minimal after this movement, then a long independent trajectory for both hominid evolution and artefactual tradition in southeast Asia becomes plausible. By contrast, the artefactual records of Europe, southwestern Asia, and Africa remain rather tightly linked throughout most of the Pleistocene. In an intriguing cycle of argument and counter argument, it now appears that Raymond Dart’s belief that australopithecines or early members of the Homo lineage were both firemakers and bone tool makers may have some support. Dart’s original ideas were based on his work at Makapansgat, South Africa, which he initially believed to be a place where australopithecines lived. The hearths, however, turned out to be manganese staining. C. K. (Bob) Brain, in an inspiring programme of taphonomic, experimental, and ethno-archaeological research, showed that the complex stratigraphies of the (then) Transvaal dolomite caves were places where early hominids died and were deposited by predators. Patterns in associated faunal remains were produced by the actions of predators and other forces on bones of variable
density, not by tool makers’ choices. It is ironic, then, that working at the stratigraphically better understood site of Swartkrans, Brain later began to record the existence there of evidence for both fire and bone tools. It should be recognized that the Swartkrans evidence is much later than that claimed by Dart from Makapansgat, and that the argument is much more robust, if still somewhat vulnerable. The current claims from Swartkrans are that over a hundred small pointed bones were used for excavating underground plant foods and that charred bones among the faunal remains reflect the controlled use of fire, both claims for behaviors predating 1 million years.
3. Regionality and the Appearance of Modern Humans The transition from Earlier to Middle Stone Age assemblages is poorly understood, in part because it happened much before radiocarbon dating became useful and after the time period best suited to potassium argon dating. Many Middle Stone Age (MSA) sites are open sites difficult, in any case, to date. Most archaeologists presume that the assemblages without hand axes, characterized by some facetted platform flakes and radial and Levallois cores, first appear about 250,000 years ago, but there is much room for debate. The decision no longer to make hand axes or cleavers from about this time may signal the invention and dissemination of methods of mounting and hafting stone points onto wooden shafts. Assuming this required smaller stone components, we might interpret the many kinds of points, unretouched, unifacial, bifacial, leaf-shaped, hollow-based, tanged and serrated, as versions of the spear point. As Desmond Clark pointed out some time ago, regionalization of point ‘style’ characterizes the MSA and distinguishes it from the preceding Earlier Stone Age (ESA), where change was slow and dissemination of form more encompassing. For some archaeologists this regionality within the MSA, along with the more rapid replacement of one form by another, is a mark of symbolic storage, because it reflects decisions by presumably identity-conscious groups, choices that reflect not function but style, a first indication of ‘them’ and ‘us.’ The most dramatic development in thinking about the African Middle Stone Age has been its remarkable rise, in Desmond Clark’s words, ‘from peripheral to paramount.’ In the late 1960s, bolstered by a rather naive belief in the accuracy of some of the earliest radiocarbon results obtained, MSA assemblages were thought to be contemporary with Upper Palaeolithic assemblages in Europe. 
Because of the obvious similarities between MSA and Middle, rather than Upper, Palaeolithic stone tool technologies and types, this was taken as a reflection of Africa’s backwardness in
progressing along the (presumed standard) toolmaking trajectory. It was also wrongly believed at that time that the Kabwe (formerly Broken Hill) cranium and the Saldanha calvarium were ‘African neanderthalers.’ After Vogel and Beaumont began to doubt the radiocarbon ‘dates’ for MSA assemblages, new, more experimental dating techniques, such as luminescence and ESR dating, became available (see Chronology, Stratigraphy, and Dating Methods in Archaeology). These have shown that almost all of the MSA is beyond the range of radiocarbon dating and confirmed the rough contemporaneity of Middle Palaeolithic and Middle Stone Age. We should also not underestimate the impact of the genetic work on mitochondrial DNA of the mid 1980s on archaeologists’ thinking about the age of the MSA and the significance of the morphology of associated hominid remains. The ‘out of Africa’ genetic model, a reevaluation of the modernity of MSA hominids and new dating assessments for African MSA sites have transformed our understanding of the origins of anatomically and behaviorally modern people. In this process, Africa has emerged as paramount, the likely source area of modernity in both senses. Central to this change of thinking has been the set of assemblages known to John Goodwin and referred to by him as Howiesons Poort. These formed part of the short-lived notion of a ‘second intermediate’ between the Middle and Later Stone Ages, but have been shown repeatedly to lie stratigraphically within the suite of MSA assemblages, most effectively at Klasies River (South Africa) main site. Howiesons Poort assemblages are MSA in that they contain radial cores and sub-triangular flakes with facetted platforms, but they have an interesting set of novel characters that make them, in some estimations, precocious manifestations of later stone tool making patterns.
Unlike assemblages above and below them, for example, there is a tendency for tool makers to prefer finer grained rocks and to manufacture substantial numbers of blades, some of them punch-struck, from them. There are also many backed pieces, including segments, some of them microlithic, trapezes and truncated blades, and in some assemblages, burins. These assemblages are increasingly well dated to beyond 60,000 years, which has interesting implications for these ‘Upper Palaeolithic-like’ characteristics, which seem to appear in Europe, along with unquestionably modern people, some 20,000 or more years later. The African Middle Stone Age, but not specifically the Howiesons Poort, is increasingly viewed as the product of anatomically modern people. From Dar es Soltan in the north to Klasies River main site in the south, MSA hominids are anatomically modern, as indeed, or nearly so, are their contemporaries at Middle Palaeolithic sites in southwest Asia. Klasies River main site is, in fact, the most persuasive example of a set of associations that constitute anatomical and behavioral modernity. A deep depositional pile that is
unquestionably a shell midden, associations of food waste and stone tools with hearths, well dated by radiometric dates and faunal observations to Marine Isotope Stages 5 and 4, and with excellent stratigraphic associations of MSA artefacts, including a Howiesons Poort component, with enough hominid fragments to be sure of a modern morphology, sets the standard for claims about modernity. Elsewhere in southern Africa, MSA assemblages are associated with abundant, clearly utilized ochre, bone tools, and decorated pieces of ostrich eggshell. So far it is not clear whether the relative paucity of these kinds of associations elsewhere in Africa is due to incomplete survey, poor preservation or the geography of ancient innovations. Claims for MSA manufacture of bone harpoons at Katanda would, if confirmed, support the innovative character of African MSA people. Why, we may wonder, has Africa become so clearly paramount, so definitively not peripheral in the evolutionary history of modern people? One possibility is the accumulating evidence for the necessary consumption of marine and freshwater foods as the nutritional substrate for intellectual advances. Rather than being a consequence of increased intelligence, something that modern people did, the regular gathering of shellfish and other marine and freshwater organisms appears to be essential if people were to acquire the structural lipids necessary to build a bigger brain. Coastal, stream, and lakeside site locations have been the pattern in Africa since the time of early hominids: it may be as much the available foods as the accessibility of water that had significance. Because the earliest shell middens are located along the shores of the Mediterranean-type terrestrial ecosystems at the northern and southern extremities of Africa, it is from here that the surge to modernity may have been fueled.
4. The Survival of Hunting and Gathering
Our understanding of the transition from Middle to Later Stone Age (LSA) is complicated by poor dating resolution, highly episodic site occupations, and by the fact that it apparently happened when much of the central parts of northern and southern Africa were even more arid than they are today. Additionally, at this time, approximately 20,000 to 50,000 years ago, the mean sea level was lower and some of the most attractive parts of the subcontinent were the exposed continental shelves, now lost to study by the rise in sea level. One interesting consequence of a lengthy expansion of the Kalahari and Sahara arid zones during Marine Isotope Stages 2, 3, and 4 is the probable reduction in genetic exchanges between subtropical African populations and their contemporaries in the two Mediterranean-type ecosystems to the north and south. It was perhaps during this time, after the emergence and expansion out of Africa of modern
people, that the current geographic patterns in African human phenotypes became established. Apparently supporting this projection is the increasingly regionally differentiated and localized nature of the African LSA. Goodwin was correct in recognizing little similarity between the African LSA as a whole and the European Upper Palaeolithic. The transition from the MSA to the LSA technologically involved the abandonment of radial and Levallois core techniques and the focusing on small bipolar and unipolar cores that had already appeared in the MSA (see Lithics and Archaeology). It may have happened at different times in different parts of the sub-continent, something that now appears likely in the European transition from Middle to Upper Palaeolithic as well. Microlithic assemblages appear much earlier in Kenya than in southern Africa, but robust regional chronological patterns are yet to emerge. Certainly by 20,000 years ago the sub-continent was widely but patchily occupied by hunter–gatherers making a variety of (mostly) microlithic tools on fine grained raw materials and, where preservation is good, leaving behind tools of ostrich eggshell, bone and other organic raw materials that loosely resemble those of the recent ethnographic past. There are, however, many important technological, social, and demographic changes within the last 20,000 years, especially as pastoralism and farming began to spread slowly and irregularly through the ecosystems of sub-Saharan Africa. An important component of the heritage of LSA hunter–gatherers is the large corpus of paintings and engravings, commonly referred to as rock art (see Prehistoric Art). Throughout southern and eastern Africa there are images made by farming people, usually in white paint, often presenting geometric forms or images of colonial times and almost always made with the finger.
Easily distinguished from these is a tradition that extends south from Zambia, Zimbabwe, northern Botswana, and Namibia to the Cape, characterized by relatively naturalistic fine line images of people and animals, in places overlain by handprints. These paintings and engravings may have begun to appear as early as 25,000 years ago, were certainly widespread by 10,000 years ago and were still being made in the century before last. From the research of Patricia Vinnicombe and David Lewis-Williams, we can now relate the images to the world view of surviving San hunter–gatherers and their ancestors. A particularly important set of reference material has been the texts recorded by Wilhelm Bleek and his sister-in-law Lucy Lloyd in the 1870s and related to them by a group of Xam San informants from the South African interior. Recent Kalahari hunter–gatherers do not themselves paint or engrave, but clearly subscribed to the same cosmology as the now extinct Xam. A point of some interest is the corpus of Tanzanian fine line images that resemble
those from much further south. There is here an artistic parallel to the enigma of ‘click’ languages in more or less the same part of eastern Africa, also isolated from a host of such languages in the south. No doubt there is still much to learn about the history of the interface between Khoesan people and their traditions with other Africans to the north. There is a significant irony in the discovery of Africa’s paramount role in the story of human evolution. Because European science gave rise to the intellectual discipline of archaeology, European scientists have always been searching for origins in Europe. Gradually and serially these have failed, to be replaced by a growing realization of the African contribution. But it is tragic that with few exceptions this legacy of African prehistory has been revealed by people of European ancestry, albeit ones with a strong sense of belonging to Africa. But then, in the longer view, we are all Africans. See also: African Studies: History; Hunter–Gatherer Societies, Archaeology of; Hunting and Gathering Societies in Anthropology
Bibliography
Goodwin A J H, van Riet Lowe C 1929 The Stone Age cultures of South Africa. Annals of the South African Museum 27: 1–289
Klein R G 1999 The Human Career, 2nd edn. Chicago University Press, Chicago
J. Parkington
Substance Abuse in Adolescents, Prevention of
Substance abuse is a problem that countries throughout the world have struggled with for many decades. In the USA, where substance abuse is the highest in the industrialized world, the annual prevalence of illicit drug use among adolescents peaked in 1979, declined for more than a decade, and then rose sharply throughout most of the 1990s (Johnston et al. 1995/96; see Fig. 1). Concern for substance abuse as a problem derives from the fact that it can lead to a host of adverse consequences including long-term disability and premature death. Broadly conceived, substance abuse involves the inappropriate use or misuse of one or more psychoactive substances including tobacco,
[Figure 1. Trends in annual prevalence of an illicit drug use index for 12th graders (Johnston et al. 1995/96). Vertical axis: percentage reporting use in the last 12 months. Series: used any illicit drug; used any illicit drug other than marijuana.]
alcohol, and illicit drugs such as marijuana, cocaine, and heroin. For most individuals, substance abuse has its roots in childhood and adolescence (Hartel and Glantz 1999). It begins with a brief period of ‘experimental’ use of one or more substances followed by escalation in both the frequency and amount of the substance(s) used. As use increases, so too does tolerance for the pharmacological effects of most substances, leading to further increases in the amount used per occasion and the frequency of use, as well as the eventual development of psychological and/or physical dependence. Thus, the developmental course of substance abuse progresses from onset and occasional use to frequent and/or heavy addictive use. It also typically progresses from the use of widely available legal substances (alcohol and tobacco) to the use of illicit substances. Efforts to combat the problem of substance abuse have included actions by law enforcement agencies to interdict the flow of drugs, legislative initiatives, treatment interventions, mass media campaigns, and prevention programs. In view of the obvious difficulty of solving the problem of substance abuse through enforcement or treatment approaches, many experts have long argued that prevention offers the greatest potential. Since the early 1980s, there has been considerable progress in understanding both the causes of substance abuse and how to prevent it. This article summarizes what is currently known about both effective and ineffective prevention approaches.
1. Informational and Affective Approaches
Efforts to prevent substance abuse have largely emphasized the dissemination of factual information about psychoactive substances, their pharmacological effects, and the adverse consequences of use (Botvin et al. 1995b). These approaches assume that individuals begin using both licit and illicit substances after
logically considering the potential risks and expected benefits. It is further assumed that individuals who elect to use drugs do so because they are unaware of the adverse consequences of using drugs. Given this, prevention approaches were initially designed to provide information about the risks associated with substance use so that people would be more likely to choose not to use drugs. Closely related to information dissemination were fear arousal approaches to prevention that attempted to deter use by dramatizing the dangers of substance abuse. Despite the intuitive appeal of these approaches, evaluation studies have consistently found that they do not decrease substance use (Schaps et al. 1981). Some studies suggest that these approaches may even increase substance use by stimulating curiosity. Another approach to substance abuse prevention that was widely used during the 1960s and 1970s attempted to prevent substance abuse by promoting affective development. These ‘affective education’ approaches, as they were called, were designed to increase self-understanding and acceptance through activities such as values clarification and responsible decision making, improving interpersonal relations by promoting effective communication and assertiveness, and providing students with the opportunity to fulfill their basic needs through existing social institutions. Once again, evaluation studies have found little evidence to support the effectiveness of these approaches (Schaps et al. 1981).
2. Social Influence Approaches
In contrast to the cognitive and affective approaches, prevention approaches targeting the social factors that promote and sustain adolescent substance abuse have shown considerably more promise. These approaches were based on studies showing the important role that social factors play in the initiation and development of substance abuse. They are also based on social psychological theories of human behavior, particularly social learning theory, persuasive communications theory, and problem behavior theory.
2.1 Psychological Inoculation
Derived from persuasive communications theory, psychological inoculation involves first exposing adolescents to weak social influences promoting substance use, followed later by progressively stronger social influences. Through this process, it was hypothesized that the exposed adolescents would build up resistance to more powerful prodrug messages, similar to what they would be likely to encounter as they entered junior high school. Despite the prominence of psychological inoculation in the initial formulation of the social influence prevention approach, it has not been supported by the empirical evidence and consequently has received considerably less emphasis in
more recent variations of the social influence model. Other components of the social influence model have assumed greater emphasis, particularly normative education and resistance skills training.
2.2 Normative Education
The prevalence of smoking, drinking, and illicit drug use is usually overestimated by adolescents (Fishbein 1977). This overestimate leads to the formation of inaccurate normative expectations, that is, a set of normative beliefs that support substance use. In order to reduce pressure to use drugs, social influence approaches to substance abuse prevention generally include a component designed to correct the misperception that many adults and most adolescents use drugs. Two different strategies have been used to modify or correct normative expectations. The first is straightforward and simply involves providing adolescents with information from national or local survey data concerning the prevalence of substance use among adolescents and/or adults. A second method involves having adolescents conduct their own local surveys of substance use within their school or community. This information provides adolescents with a more realistic view of the prevalence of substance use and, because substance use in most communities is far lower than adolescents believe, it corrects the misperception that most people use drugs and helps establish antisubstance use norms.
2.3 Drug Resistance Skills
Drug resistance skills are the hallmark of most contemporary approaches to substance abuse prevention (Ellickson and Bell 1990). The primary focus of these prevention approaches is on teaching adolescents a set of skills or tactics for resisting influences to smoke, drink, or use illicit drugs, particularly influences coming from the media and peers. Adolescents are taught how to identify and respond to prodrug messages in advertisements, movies, rock videos, etc. They are also taught to identify the persuasive appeals used by advertisers to promote the sale of tobacco products and alcoholic beverages, to analyze advertisements and their messages, and to formulate counterarguments to common advertising appeals (Flay et al. 1989). Teaching these skills is designed to provide adolescents with the tools for actively resisting advertisements promoting cigarette smoking and alcoholic beverages, particularly given the growing recognition that advertisers have developed rather sophisticated ad campaigns explicitly designed to target the youth market. Perhaps the best known aspect of resistance skills training involves teaching adolescents a set of skills that they can use to resist offers from their peers to smoke, drink, or use illicit drugs. Prevention programs
that include resistance skills training emphasize verbal and nonverbal skills for resisting these offers. These programs teach adolescents what to say when they are offered or pressured to engage in some form of substance use as well as teaching them how to say it in the most effective way possible. Adolescents are taught to identify ‘high-risk’ situations where they are likely to experience peer pressure to use drugs, are taught to avoid those situations whenever possible, and are taught to develop ‘action plans’ so that they are prepared to handle situations involving peer pressure to use drugs. In order to be optimally effective, resistance skills training must provide adolescents with a general repertoire of verbal and nonverbal skills that they can use when confronted by peer pressure to use drugs. Adolescents must not only develop these skills, but also the confidence that they can apply them when necessary. To accomplish this, prevention programs teach students these resistance skills and also provide opportunities to practice the use of these skills.
3. Personal and Social Skills Training
Since the early 1980s, prevention approaches have also been developed and tested that are broader in their focus than social influence approaches. These approaches are designed to teach information and skills that are specifically related to the problem of substance abuse within the context of enhancing the development of general personal competence; hence they are also referred to as ‘competence enhancement’ approaches. Although these approaches include elements of the social influence model, a distinguishing feature is an emphasis on teaching a variety of personal self-management skills and general social skills. An underlying assumption of this approach is that merely teaching drug refusal skills is not likely to be effective with adolescents who are motivated to use drugs. Put differently, teaching adolescents how to say no to drugs will not succeed among those adolescents who want to say yes. Moreover, advocates of this approach argue that prevention approaches that target a more comprehensive set of risk and protective factors are more likely to be effective than those intended to target a limited number of these factors. Competence enhancement approaches teach generic skills such as general problem-solving and decision-making skills, cognitive skills for resisting interpersonal and media influences, skills for increasing self-control and self-esteem, skills for coping effectively with anxiety or stress, general social skills, and assertive skills. One prominent approach, called life skills training (LST), teaches several types of self-management skills such as self-appraisal, goal setting and self-directed behavior change, decision making and independent thinking, and anxiety management skills (Botvin et al. 1990). Adolescents are also taught
a set of general social skills including communications skills, conversational skills, skills for asking someone out for a date, complimenting skills, and both verbal and nonverbal assertive skills. All of these skills are taught using a combination of instruction and demonstration, behavioral rehearsal (practice), feedback, social reinforcement (praise), and extended practice through behavioral homework assignments.
4. Evidence of Effectiveness
A growing literature shows that several approaches to substance abuse prevention can substantially reduce adolescent substance use. However, it is now clear that not all prevention approaches are effective. In fact, the only prevention approaches that have consistently reduced substance use are those that teach resistance skills and normative education, either alone or in combination with components designed to enhance general personal and social competence. Studies testing these approaches provide strong evidence for the short-term effectiveness of prevention, but somewhat weaker evidence of long-term effectiveness.
4.1 Short-term Effects
Studies testing the effectiveness of social influence approaches to cigarette smoking have generally shown that they are able to reduce the rate of smoking by 30 to 50 percent after the initial period of intervention (e.g., Luepker et al. 1983, Ellickson and Bell 1990, Sussman et al. 1993). One study has also demonstrated that the social influence approach can reduce the use of smokeless tobacco (Sussman et al. 1993). Studies testing interventions that contain social influence components in combination with personal and social skills training components show that they can cut cigarette smoking by 40 to 75 percent (Botvin et al. 1995a, Schinke 1984). Social influence approaches have also been shown to reduce alcohol and marijuana use by roughly the same magnitude as for tobacco use (Ellickson and Bell 1990). Similarly, studies show that the personal and social skills training approaches can also reduce the use of alcoholic beverages and marijuana (Botvin et al. 1990). When comparing the relative impact of these prevention approaches on tobacco, alcohol, and marijuana use, the strongest effects are for cigarette smoking and marijuana use, while there are somewhat weaker and less consistent effects on the use of alcoholic beverages.
It is important to recognize that the social influence approaches and the competence enhancement approaches discussed here both include program components designed to increase drug resistance skills and promote antidrug norms. The results from one study (Caplan et al. 1992) suggest that generic competence enhancement approaches to substance abuse prevention may not be effective unless they also teach normative education and/or resistance skills training. Although approaches that combine the key elements of the social influence approach and the broader competence enhancement approach appear to be the most effective based on current evidence, additional research is necessary to better understand how and why effective prevention programs work.
4.2 Long-term Effects
The results of longer-term follow-up studies present a somewhat mixed picture. Evidence from several studies indicates that current prevention approaches have a reasonable degree of durability. Studies show prevention effects lasting anywhere from one to four years (Luepker et al. 1983, Flay et al. 1989). Several factors appear to play a role in determining how durable these prevention approaches are. These include (a) the length of the initial intervention, (b) the inclusion of ongoing (multiyear) intervention or booster sessions, (c) the quality and completeness of implementation, and/or (d) the number of risk and protective factors targeted. Prevention approaches that produced longer-lasting effects generally have a stronger initial dosage (eight class sessions or more), include booster sessions, are implemented with high fidelity, and utilize a more comprehensive prevention approach. Results from a large-scale, randomized field trial involving nearly 6,000 seventh graders from 56 public schools in the USA found long-term reductions in smoking, alcohol, and marijuana use at the end of the twelfth grade (Botvin et al. 1995a). In this study schools were randomly assigned to prevention and control conditions. Participating students in the prevention condition received the LST program during the seventh grade (15 sessions) with booster sessions in the eighth grade (10 sessions) and ninth grade (five sessions). No intervention was provided during grades 10 through 12. According to the results, the prevalence of cigarette smoking, alcohol use, and marijuana use for the students in the prevention condition was as much as 44 percent lower than for controls. Significant reductions were also found with respect to the percentage of students who were using multiple substances one or more times per week.
These results suggest that to be effective, prevention programs need to be more comprehensive and have a stronger initial dosage than most studies using the social influence approach. Another implication is that prevention programs also need to include at least two additional years of interventions (booster sessions) and must be implemented in a manner that is consistent with the underlying prevention model.
4.3 Effectiveness with Ethnic Minority Youth
Several studies have examined the impact of prevention approaches on minority youth. Two strategies
have been followed. One strategy, common to many local prevention efforts, assumes that the etiology of substance abuse is different for different populations. Thus, many prevention providers have developed interventions designed to be population-specific (e.g., projects for African-American youth with an Afrocentric cultural focus). A second strategy assumes that the etiology of substance abuse is more similar than different across populations. This approach involves the development of preventive interventions designed to be generalizable to a broad range of individuals from different populations. As information concerning the etiology of substance use among minority youth has accumulated, it has become clear that there is substantial overlap in the factors promoting and maintaining substance use among different populations. The results of these studies suggest that prevention approaches may have application to a broad range of individuals from different populations. Indeed, studies show that both the social influence approach and the combined social influence/competence enhancement approach are effective with minority youth (Botvin et al. 1995b). Research also shows, however, that in environments with homogeneous populations, prevention effects can be further enhanced by tailoring the intervention to the target populations.
5. Concluding Comments
Substantial progress has been made since the early 1980s in the area of substance abuse prevention. Much has been learned concerning the etiology of substance abuse and how to prevent it. Despite the advances in prevention, additional research is needed to determine the long-term effectiveness of current approaches; further refine existing approaches and identify new ones that may be more powerful; identify and test the mechanisms through which effective programs work; develop and test more comprehensive prevention strategies that combine the best features of existing prevention modalities; and extend the findings from substance abuse prevention to other related health and social problems such as HIV/AIDS, teen pregnancy, violence, and delinquency. Finally, it is vitally important that a concerted effort be made to translate research into practice in order to capitalize on the progress that has been made in prevention research.
See also: Crime and Delinquency, Prevention of
Bibliography
Botvin G J, Baker E, Dusenbury L, Botvin E M, Diaz T 1995a Long-term follow-up results of a randomized drug abuse prevention trial in a white middle-class population. Journal of the American Medical Association 273(14): 1106–12
Botvin G J, Baker E, Dusenbury L, Tortu S, Botvin E M 1990 Preventing adolescent drug abuse through a multimodal cognitive-behavioral approach: Results of a three-year study. Journal of Consulting and Clinical Psychology 58: 437–46
Botvin G J, Schinke S, Orlandi M 1995b Drug Abuse Prevention with Multi-ethnic Youth. Sage, Newbury Park, CA
Caplan M, Weissberg R P, Grober J S, Sivo P, Grady K, Jacoby C 1992 Social competence promotion with inner-city and suburban young adolescents: Effects of social adjustment and alcohol use. Journal of Consulting and Clinical Psychology 60: 56–63
Ellickson P L, Bell R M 1990 Drug prevention in junior high: A multi-site longitudinal test. Science 247: 1299–1305
Fishbein M 1977 Consumer beliefs and behavior with respect to cigarette smoking: A critical analysis of the public literature. In: Anonymous (ed.) Federal Trade Commission Report to Congress Pursuant to the Public Health Cigarette Smoking Act of 1976. US Government Printing Office, Washington, DC
Flay B R, Keopke D, Thomson S J, Santi S, Best J A, Brown K S 1989 Long-term follow-up of the first Waterloo smoking prevention trial. American Journal of Public Health 79(10): 1371–6
Hartel C R, Glantz M D 1999 Drug Abuse: Origins and Interventions. APA Books, Washington, DC
Johnston L D, O’Malley P M, Bachman J G 1995/96 National Survey Results on Drug Use From the Monitoring the Future Study, 1975–1998. Vol. II: College Students and Young Adults. National Institute on Drug Abuse, Rockville, MD
Luepker R V, Johnson C A, Murray D M, Pechacek T F 1983 Prevention of cigarette smoking: Three year follow-up of educational programs for youth. Journal of Behavioral Medicine 6: 53–61
Schaps E, Bartolo R D, Moskowitz J, Palley C S, Churgin S 1981 A review of 127 drug abuse prevention program evaluations. Journal of Drug Issues 11: 17–43
Schinke S P 1984 Preventing teenage pregnancy. Progress in Behavior Modification 16: 31–64
Sussman S, Dent C W, Stacy A W, Sun P 1993 Project towards no tobacco use: 1-year behavior outcomes. American Journal of Public Health 83: 1245–50
G. J. Botvin
Suicide
1. Definition
The term suicide was mentioned for the first time in medieval times in 1177 (van Hooff 1990) according to the Latin sui cidium (self-killing) and sui caedere (to kill himself/herself). Now, this term is used in the international scientific literature. It avoids judgment, as do terms like ‘self murder’ and ‘voluntary death.’ In the realm of the term suicide, there are other terms that have to be defined. Suicide ideas include: thinking about death either of one’s own or someone else’s or death in general; the wish to die; suicide ideas in a more narrow sense. It is not easy to define the term suicide attempt. The definition most commonly accepted is the following: ‘An act with nonfatal outcome, in which an individual deliberately initiates a nonhabitual behavior that, without the intervention of others, will cause self-harm, or deliberately ingests a substance in excess of the prescribed or generally recognized therapeutic dosage, and which is aimed at realizing changes which the subject desired via the actual or expected physical consequences’ (WHO 1989, Platt et al. 1992). This definition includes an active intention for self-harm but not necessarily self-killing. Suicide and suicide attempt are also considered as suicidal behavior.
2. Importance of Suicide
Suicide belongs to the 10 most frequent causes of death among adults in industrialized European and North American countries. In these countries, suicide is considered the second most frequent cause of death among adolescents and young adults (15–35 years). Suicide in young male adults has increased constantly during the last 20 years. Table 1 shows the frequencies of causes of death worldwide (WHO 1990) disregarding the most frequent causes of death such as cardiovascular diseases, stroke, cancer, infectious diseases, and malnutrition. These figures have to be taken cum grano salis because of a dearth of or inaccurate statistics on death in developing countries. As shown in Table 1, only fatal car accidents had a similar frequency of death as compared with suicide rates. War activities, crime, or even AIDS at the beginning of the 1990s had a considerably lower death rate.
Table 1 WHO statistics: causes of death worldwide (1999)
Death because of car accidents: 856,000
Death because of suicide: 818,000
Death because of war: 322,000
Death because of crime: 292,000
Death because of AIDS: 291,000

3. Epidemiology The most recent available suicide rates from all over the world are depicted in Fig. 1. They show that suicide rates vary widely across cultures and countries. Looking at the rank order of suicide rates, a relatively stable pattern can be described, at least for European countries. There is a North–South downward gradient and an East–West downward gradient of suicide rates: eastern and northern countries have higher suicide rates than western and southern countries. Sociological factors seem to play an important role here, namely secularization and industrialization with the collapse of the family group as a consequence, as do climate factors; the two cannot be disentangled sufficiently. There are distinct sex differences, with a preponderance of males in suicide and a preponderance of females in suicide attempt. This distribution is stable across all countries and cultures. Only China reports a predominance of suicide among women in selected areas. The highest rate of suicide is found in older people and that of suicide attempt in younger people. Suicidal behavior has been found very rarely in children but, without doubt, there is clinical evidence that children from 4 years of age have suicide ideas and may attempt suicide or kill themselves. However, suicide in children before prepuberty is very rare. Furthermore, it has to be taken into consideration that until prepuberty (8–10 years) children normally do not experience death as something irreversible (Pfeffer 1986). Cultural differences exist in the suicide methods used. So-called weak suicide methods (ingestion of drugs, wrist cutting) are distinguished from hard suicide methods (shooting, hanging, jumping, drowning, etc.). Weak methods prevail in suicide attempts, hard methods in completed suicide. In general, females prefer weak methods more often than males do. The most frequent method of suicide in both females and males is shooting in North America and hanging in Europe. Suicide rates in several European countries such as Germany, England and Wales, France, Hungary, and the Nordic countries show an astonishing stability over decades and even a century. The same holds true for regions such as Saxony, Brandenburg, and Thuringia, where suicide rates have been stable over a century. Figure 2 depicts suicide rates over one century in Germany (Deutsches Reich and Federal Republic of Germany) (Wedler 1992). In other places, such as Prussia, France, and the Nordic countries, this stability can also be traced back to the nineteenth century.
On the other hand, there have also been dramatic changes in suicide rates, as during the nineteenth century, when secularization, industrialization, and individualization took place in western and northern European societies. During World Wars I and II further dramatic changes occurred, affecting both males and females and also countries that were not involved in these wars, such as Switzerland. Finally, the most recent changes occurred after the fall of the Iron Curtain in Eastern Europe between 1985 and 1995, with a dramatic decrease in suicide rates of up to 40 percent (Diekstra 1996). (See Mental Illness, Epidemiology of.)
Figure 1 WHO statistics: suicide rates of males and females (1994–96)
Figure 2 Suicide rates per 100,000 inhabitants in the German Reich from 1900 to 1940 and in West Germany from 1945 to 1990 (Wedler 1992)

4. Differing and Common Aspects between Suicide and Suicide Attempts There are considerable differences between suicide attempters and suicide victims (Stengel 1964). The group of suicide victims is rather small in comparison with the group of suicide attempters. The sociodemographic characteristics are quite different: suicide victims are older and predominantly male, whereas suicide attempters are younger and predominantly female. The transition from the group of suicide attempters to the group of suicide victims is rather infrequent, with a 10 percent chance. Whereas the life of the victim is terminated by suicide, a nonfatal suicide attempt becomes a very important event in the life of the person concerned, often changing the further life trajectory. Nevertheless, the two groups also have some aspects in common: the more often suicide attempts happen, the higher the probability of suicide. Although more males commit suicide and more females attempt suicide, there is considerable overlap between the two groups. The same holds true for age distribution: a considerable proportion of young males and females commit suicide, while older males and females make suicide attempts. Most of the risk factors are equally valid for both suicide victims and suicide attempters (see below).
5. Risk Factors for Suicide and Suicide Attempts Risk factors for suicide and suicide attempts are as follows: increased age; marital status (being separated, divorced, widowed, or single); unemployment; and gender and economic status (suicide occurs more often in males, suicide attempts occur more often in females and in the lowest social class). The most important risk factors for suicide and suicide attempts, however, are a previous suicide attempt and a psychiatric disorder (Kreitman 1977). Psychiatric hospitals bear a great risk of losing a patient to suicide. It has been estimated that 15 percent of depressed patients, 15 percent of patients with a drug or alcohol addiction, and 10 percent of patients with schizophrenia who have been treated in a psychiatric hospital eventually die by suicide. Patients with anxiety disorders or personality disorders also have a considerable risk of committing suicide. From an epidemiological point of view, subjects with depression or an addiction have a 15-fold to 60-fold higher risk of committing suicide as compared with the general population (Bronisch 1999). There are differences between urban and rural areas with respect to suicide rates. The highest suicide rates are seen in urban areas, whereas lower rates are seen in rural areas. The states of the former USSR are an exception: there, suicide rates are higher in rural areas. Climate also seems to play an important role in the distribution of suicide rates. The highest suicide rates are found during late spring and summer. This distribution can be found in the northern as well as the southern hemisphere. Countries in southern parts of Europe show a lower suicide rate than countries in the northern parts of Europe. Religious affiliation also plays an important role in the frequency of suicides. In countries with a predominance of Catholics, suicide rates are lower, whereas in countries with a predominance of Protestants, suicide rates are higher. However, if the degree of urbanization and the distribution of social classes are taken into consideration, the differences between Catholic and Protestant countries fade (Kreitman 1977).
6. Origins of Suicidal Behavior
6.1 Suicide as a Basic Human Existential Problem Throughout the written history of mankind, suicide has been considered a basic human existential phenomenon that has been reflected in many philosophical and religious contributions as well as in all kinds of arts. The attitude towards suicide depended not only on the Zeitgeist of the respective historical period, but also on the individual stance of different philosophical schools, such as the Stoics and Epicurus in antiquity, and of different religions, such as Christianity and Buddhism. The sociological theories are a consequence of the Enlightenment of the seventeenth and eighteenth centuries. During the second half of the nineteenth century, psychological theories developed under the influence of psychological thinking and emerging psychoanalysis. During the second half of the twentieth century, biological theories were finally stimulated by the natural sciences focusing on the brain. Suicide requires self-reflecting properties of an individual, i.e., an individual who can differentiate between an observing and an experiencing ego. Suicide does not exist in animals. Even nonhuman primates do not show suicidal behavior, neither in natural settings nor in preserves. However, suicide has been found in Neolithic societies.
6.2 Sociological Aspects
6.2.1 Durkheim’s theory. Sociological theories are influenced by the theory of Durkheim (1897), which was based upon the evaluation of suicide statistics of western European countries (France and Prussia) during the second half of the nineteenth century. Durkheim proposed four different types of suicide depending on the degree of successful adaptation of the individual to different types of societies. A prerequisite to adaptation of the individual to society is that individualization is neither too weak nor too strong. If the individualization is too strong, the individual will be isolated from society and the danger of an egoistic suicide increases.
If the individualization is too weak, the danger of an altruistic suicide increases. On the other hand, a prerequisite to the equilibrium between the individual and society is that social norms are neither too strong nor too unspecific. Norms that are too strong facilitate fatalistic suicide, such as an expiatory death. Norms that are too unspecific favor anomic suicide. With the help of the existing suicide statistics, Durkheim tried to verify three of the four types (he did not discuss fatalistic suicide). He tried to prove that Protestant countries have a higher suicide rate than Catholic countries, since Protestantism allows the individual a higher degree of freedom than does Catholicism; this is egoistic suicide. He explained the increase in suicide rates during political crises (the Dreyfus affair in France) in terms of egoistic suicide. During times of flourishing economic growth, implying a loosening of social norms, suicide rates should rise through anomic suicide, whereas during times of war, stronger norms lead to a stronger coherence of society, resulting in a decrease in suicide rates.
6.2.2 General sociological trends. Only parts of Durkheim’s theories could subsequently be confirmed, such as the differences between Catholic and Protestant countries in western Europe and the decrease of suicide rates during both world wars. Generally speaking, Durkheim’s data and statistical data from the twentieth century demonstrate an increase in suicide in Europe during the nineteenth and twentieth centuries depending on the development of industrialization, secularization, and individualization of the respective countries, i.e., the more pronounced these developments were, the more frequently suicide and suicide attempts occurred (Retterstøl 1992). However, the developments following the fall of the Iron Curtain in Eastern Europe between 1985 and 1995 contradict the theory of anomic suicide. Also, a remarkably stable sex difference, with male preponderance in suicide and female preponderance in suicide attempts, has been shown in nearly all cultures and societies. Another quite remarkable trend of the last decades has been the increase in the suicide rate of 15–35-year-old adolescents and young adults. In most western societies, this development has been correlated with an increasing rate of depression and addiction potential. Increasing urbanization, loss of social structures and familial bonds, as well as changes in sex roles may play an important role in this development (Klerman 1987).
6.3 Psychological Aspects Psychological models include psychodynamic, attachment, and cognitive-behavioral theories that are only partially substantiated by empirical research. The main theories concern aggression directed inward instead of outward, the loss of a symbiotic relationship, and a severe personal insult resulting in severe loss of self-esteem. Finally, from the behavioral point of view, classical conditioning, operant conditioning, and model learning are also of great importance.
6.3.1 Aggression directed against the ego. Freud (1917) conceptualized one of the first psychological theories about the development of suicidal behavior in ‘Mourning and Melancholia.’ To him, the psychodynamics of suicidal behavior and depression were indistinguishable, i.e., suicidal behavior and depression were seen as a turning of aggression against the ego and an ambivalent attitude towards other people. Aggression directed against other people will be followed by the loss of a desperately needed person. The turning of aggression against the ego is expressed by feelings of guilt, self-devaluation, and finally self-killing.
6.3.2 Distorted cognitive schemes. Beck (1967) also emphasized a strong connection between suicidal behavior and depression. Depressed individuals see themselves, the world, and the future negatively (the so-called cognitive triad) because of their distorted cognitive schemes. Catastrophizing, minimizing, maximizing, and overgeneralization are typical distorted cognitive schemes. This cognitive triad results in helplessness, hopelessness, and powerlessness. In this case, the only solution for an individual may appear to be to end their miserable life by committing suicide.
6.3.3 Narcissistic crisis. Henseler (1974/1984) formulated the first narcissistic theory of suicidal behavior. Individuals with narcissistic personality traits can be hurt easily by significant others through criticism or lack of special attention. They lack stable self-esteem and their self-perception is distorted: they oscillate between overestimation and underestimation of their own abilities. This distortion is not restricted to their view of themselves but also extends to a correspondingly distorted view of other people. Insults through criticism or lack of special attention lead to aggressive outbursts that can be directed either towards others or towards themselves, in the form of a suicide or suicide attempt.
6.3.4 Appeal to human bonding. Attachment of the newborn to the mother is a characteristic common to all mammals. In all mammals, the newborn is very dependent, and the reproduction rate, at least in nonhuman and human primates, is rather low. Strong attachment is therefore a life-preserving product of evolution that protects the dependent newborn until it is grown up. The emotional bonding in primates is very intense, and a separation from the mother is experienced as very painful (Bowlby 1977a, 1977b). As is known from many empirical studies, most suicides and suicide attempts are consequences of imminent separation, separation, or divorce of males and females. The intention to die is ambivalent in most suicide attempts, i.e., besides a true tendency to end life, in nearly all suicide attempts there are tendencies to survive. This can be seen easily in the choice of suicide methods, which often are not so dangerous, e.g., ingestion of small amounts of pills, or in an arrangement of the suicide attempt that leaves rescue possible, e.g., attempting suicide in the presence of a significant other. The painfully expected or experienced separation from a significant other and the appeal to the significant other not to be left behind both seem to play an important role in the meaning of a suicide attempt (Bronisch 1995).
6.3.5 Classical and operant conditioning. Basic principles of behavioral theory and therapy can be described using the Stimulus Organism Reaction Kontingency Consequence Scheme (SORKC scheme) of Kanfer (Schmidtke 1988) (see Fig. 3). The SORKC scheme tries to describe each behavior in a broader context of conditions, which can be placed linearly in a chain or circle. The constituents of the SORKC scheme for suicidal behavior are the following:
(a) life events and social situations as Stimuli variables (S), which may trigger suicidal behavior;
(b) behavioral repertoire based upon organic conditions of the individual such as psychiatric disorders and the personality (e.g., cognitive styles and stances) as Organism variable (O), which may predispose to suicidal behavior;
(c) suicidal behavior as a reaction (R) to the S and O variables;
(d) Kontingency (K), i.e., the reinforcement conditions, which support or suppress suicidal behavior; and behavior according to a regular or irregular order of the consequences;
(e) Consequences (C) in the sense of negative reinforcement (e.g., death) or positive reinforcement (attention of the environment) of suicidal behavior.
The process is rather circular, i.e., consequences can in turn influence further suicidal behavior. This may happen through changes in the individual (O variable) and the surroundings (S variable).
Figure 3 The cognitive-behavioral scheme for the development of suicidal behavior

6.3.6 Model learning. Imitation of suicidal behavior is a new area in suicide research. The observation that suicidal behavior runs in certain families, cultures, and countries could be explained by genetic transmission, psychological conflicts, or model learning. In regard to model learning, three hypotheses have been proposed: the imitation hypothesis, the contagion hypothesis, and the suggestion hypothesis. All three hypotheses have in common that the suicidal behavior of a model is imitated, with predisposition of the individual (so-called ‘presuicidal personality’), lack of social support in coping with life crises, and enduring social burden as contributing factors. Furthermore, a high degree of social suggestibility of predisposed individuals plays an important role. During the 1980s and 1990s, an influence of different media, such as books, newspapers, journals, television, and movies, on increasing suicide rates was demonstrated (Phillips and Lesyna 1995). Suicide victims resembled the person portrayed by the respective media: there were similarities between this person and the suicide victim in terms of age, gender, motives for the suicide, and the methods used for suicide. The increase in the suicide rate was temporally related to the reported suicide cases and was not compensated by a later decrease, i.e., these were not simply suicides committed earlier than planned. Also very impressive are studies that demonstrate a decrease in suicide rates not only when reporting is made more informative and less sensational, but also when information about suicide is reduced (Phillips and Lesyna 1995). One example is a study of railway suicides reported in newspapers, which could be reduced through cooperation with local newspapers in cutting back coverage and confining it to nonsensational reports (Sonneck et al. 1994). In a broader sense, traditions in different cultures and countries concerning either permissive or restrictive attitudes towards suicide can also be regarded as a form of model learning. These attitudes may be transferred over generations, as in Hungary, where an extremely high suicide rate can be observed throughout an entire century.
6.4 Biological Aspects
During the nineteenth century, Esquirol (1838) was the first to observe that suicidal behavior ran in families and, therefore, postulated that suicidal behavior was a kind of mental illness. Family, twin, and adoption studies have given hints that suicidal behavior has a genetic component. Families of suicide victims or suicide attempters carry a higher familial burden of suicide and suicide attempts than families of subjects without suicide or suicide attempts. Separation of nature (heritage) and nurture (rearing) is not possible within family studies, but is possible within twin and adoption studies. In twin studies, identical twins (identical genes) and fraternal twins (half of the genes are nonidentical, as in sisters and brothers) are compared with regard to suicide or suicide attempts. The respective studies demonstrate a significantly higher frequency of suicide and suicide attempts in identical as compared with fraternal twins (Roy et al. 1991). In adoption studies, identical twins have been compared with each other, whereby one twin has been raised by the biological parents, while the other twin has been adopted immediately after birth. Significantly fewer twins raised by their biological parents developed suicidal behavior or committed suicide than twins who had been adopted following birth. Furthermore, a significantly higher rate of suicides or suicide attempts was shown within biological families than within adoptive families of the respective identical twin. All these studies also demonstrate that suicidal behavior has to be seen apart from a specific psychiatric diagnosis, especially apart from a depressive illness. On the other hand, the distinctive feature that is inherited may not be suicidal behavior as such but rather personality traits such as lack of impulse control or aggressiveness (Wender et al. 1986). Postmortem studies of suicide victims and in vivo studies (investigation of the cerebrospinal fluid and the blood platelets of suicidal persons with respect to molecular genetic, biochemical, and physiological parameters) support the so-called serotonin hypothesis of suicidal behavior: impulsiveness, aggression directed inwards or outwards, and violent suicidal behavior are correlated with a dysfunction of the serotonergic transmitter system in the human brain (Mann 1998).
7. Prevention Prevention can be divided into primary, secondary, and tertiary prevention. Primary prevention focuses mainly on reducing tendencies within society that support suicidal tendencies. Condemnation of suicidal behavior by means of social ostracism or punishment was not successful in reducing suicide rates in medieval and modern times (Minois 1995). However, primary prevention also concerns the prevention of suicidal behavior by improving life conditions in general and socioeconomic conditions in particular, as well as by effective treatment of psychiatric disorders. It also refers to support of tolerance for individual life styles. Furthermore, restricting access to frequently used methods, such as guns, can be helpful. Primary prevention also implies strengthening the individual’s power of resistance against suicidal tendencies. Education of lay people and professionals in detecting and treating suicidal persons could be the most effective strategy. Secondary and tertiary prevention, provided by fellow citizens (e.g., Samaritans) or by medical or psychological professionals, means prevention of relapses in suicidal behavior or rehabilitation of patients with suicidal behavior. Empirical studies proving the effect of the above-mentioned preventive strategies are still lacking. There is now an emerging trend to develop national suicide prevention programs in several western and northern European countries (Silverman and Maris 1995).
8. Therapy Therapy can be divided into crisis intervention, with acute psychiatric-psychotherapeutic intervention, pharmacotherapy, and psychotherapy. Crisis intervention primarily means prevention of an imminent danger of suicide. Crisis intervention should provide care and protection to the individual, i.e., protection of life, since suicide attempts often require medical care such as admission to an internal intensive care unit for detoxification or a surgical intervention. Crisis intervention also includes immediate evaluation of still existing suicidal tendencies of the patient in order to make decisions about further preventive measures: admission to a closed psychiatric ward for relapse prevention, or crisis intervention on an inpatient or outpatient basis. Crisis intervention also includes acute psychotherapy and pharmacotherapy of the patient. The most important determinant of the psychotherapy is the development of a supportive relationship between the suicidal person and the therapist. The greater the time lapse between the suicide attempt and the help offered, the more often suicidal people are inclined to disregard this help. A supportive relationship means that the therapist accepts the patient’s ‘cry for help’ and makes a commitment to trying to understand the patient, but is not willing to accept the patient’s decision to end their life as a form of reasonable problem solving. The therapist clarifies the triggering situation that has led to the suicide attempt. Together with the patient, the therapist tries to develop other strategies to resolve seemingly insoluble conflicts in the patient’s life. Another important determinant of a successful intervention is the involvement of significant others of the patient as well as of the social environment in general. The acute pharmacological intervention primarily means sedation of the patient in order to keep him or her from further suicidal behavior.
Psychotherapy means treatment of psychopathological states such as depression, dysfunctional behaviors like impulsiveness, or pathological personality traits such as low self-esteem (Bronisch 1999). It includes the classical types of psychotherapy, i.e., psychodynamic approaches and cognitive-behavioral approaches. So far, empirical studies have only demonstrated efficacy for some groups of patients, such as cognitive-behavioral therapies for patients with Borderline Personality Disorders (Linehan 1997). Pharmacotherapy of suicidal behavior primarily focuses on the underlying psychiatric illness of the patient, such as schizophrenia (antipsychotics), depressive disorders (antidepressants), anxiety disorders (anxiolytics), and addiction (drugs against withdrawal symptoms). However, during the 1990s drugs specifically targeting the typical characteristics of suicide attempters, such as impulsiveness and aggressiveness, seem to have been successful in reducing suicide and suicide attempts: lithium (Ahrens 1997) and selective serotonin reuptake inhibitors (Montgomery 1997). (See Antidepressant Drugs.)
9. Concluding Remarks Suicide and suicide attempt are behaviors confined to human beings. They are rooted fundamentally in human existence and have as a prerequisite self-reflecting properties of the individual, i.e., conscious acting with the consequence of, or in the attempt at, extinguishing one’s own existence. Attempts to explain suicide and suicide attempt by a single cause in only one dimension were made by Esquirol (1838) in biology, Durkheim (1897) in sociology, and Freud (1917) in psychology, and seem to be inappropriate. Suicide and suicide attempt are multidetermined acts (Roy 2000). However, the weighting of the different aspects of suicide and suicide attempt is rather difficult to explain, so that in the following only a synopsis of the empirical findings will be given. Suicide is lacking in animals, even in nonhuman primates. However, it has already been found in Neolithic societies as well as later in ancient Greek and Roman societies. Precise empirical data about suicide in different countries and cultures are lacking up to the nineteenth century, so that a description of the development of suicide from antiquity via medieval to modern times is impossible. However, there has been a considerable increase in suicides in societies with a high degree of secularization, industrialization, and individualization, starting in the middle of the nineteenth century. For individual countries, suicide rates are often stable at a high or low level depending, at least in Europe, on the degree of industrialization, religious affiliation, and climate: northern countries with a high degree of industrialization, Protestant religious affiliation, and rather few sunny days have considerably higher suicide rates than southern countries with a low degree of industrialization, Catholic religious affiliation, and many sunny days.
The differences could best be explained by human bonding as a protective factor, whereas climate could influence biological rhythms, hence, triggering suicidal behavior. In regard to sex differences and suicidal behavior, very stable differences exist between males and females with a male preponderance in suicide and female preponderance in suicide attempts. The reasons for these sex differences, which seem to be stable across countries, cultures, and times, have not been well investigated and need further empirical research. There is no doubt that psychological motives play not only an important role as triggers for suicidal behavior but also are important determinants for the development of suicidal behavior. The most important motive for
suicidal behavior is certainly the impending loss of a human relationship, even if the wish to die outweighs the wish to survive in the suicidal act, i.e., in the seriousness of the suicide attempt. Besides that, the turning of aggressive tendencies towards oneself and narcissistic vulnerability are personality traits predisposing to suicidal behavior. From the cognitive-behavioral point of view, classical conditioning, operant conditioning, and model learning are basic learning principles, which also can be transferred to behaviors such as suicide and suicide attempt. Model learning is thus important for perpetuating suicidal behavior not only in families and peer groups but also in societies with traditionally high suicide rates, such as Hungary (so-called imitation hypothesis). In certain families, a genetic burden of suicidal behavior could be confirmed by twin and adoption studies. This genetic burden is often connected to psychiatric illnesses such as depressive disorder, addiction, anxiety disorder, schizophrenia, and personality disorder. Biological studies (postmortem brain and peripheral biochemical studies) could demonstrate a dysfunction of the serotonergic system in suicide victims and suicide attempters, which in psychopathological terms is correlated with a loss of impulse control, aggressiveness, affective instability, and violent suicide attempt. As empirical studies could confirm, suicidal behavior is not a ‘normal’ behavior, but is rooted in psychological and biological abnormalities of the respondent, with prevention and therapy as the corollary. However, empirical data about successful prevention of suicidal behavior are still lacking, and successful therapeutic interventions exist only for subgroups of suicide attempters, such as treatment with lithium and serotonin reuptake inhibitors for depressed patients and cognitive-behavioral therapy for patients with a borderline personality disorder.
Life as a precious good, desperation, and suffering of the respondents as well as suffering of the relatives should bring the focus on suicide more into the scientific scope of view as other life threatening illnesses already have been for a long time. See also: Bowlby, John (1907–90); Depression; Depression, Clinical Psychology of; Depression, Hopelessness, Optimism, and Health; Durkheim, Emile (1858–1917); Freud, Sigmund (1856–1939); Suicide: Ethical Aspects; Suicide, Sociology of
Bibliography Ahrens B 1997 Mortality studies and the effectiveness of drugs in long-term treatment. Pharmacopsychiatry 30(suppl.): 57–61 Beck A T 1967 Depression: Clinical, Experimental and Theoretical Aspects. Harper and Row, New York, Evanston, IL, London
15267
Bowlby J 1977a The making and breaking of affectional bonds I. British Journal of Psychiatry 130: 201–210
Bowlby J 1977b The making and breaking of affectional bonds II. British Journal of Psychiatry 130: 421–31
Bronisch T 1995 Der Suizid. Ursachen, Warnsignale, Prävention. Beck, Munich, Germany
Bronisch T 1999 Suizidalität. In: Möller H J, Laux G, Kapfhammer H P (eds.) Psychiatrie und Psychotherapie. Springer, Berlin, Heidelberg, New York, pp. 1673–91
Diekstra R F W 1996 The epidemiology of suicide and parasuicide. Archives of Suicide Research 2: 1–29
Durkheim E 1897 Le Suicide: Étude de Sociologie (trans. Spaulding J A, Simpson G 1951). Free Press, Glencoe, IL
Esquirol J E D 1838 Des maladies mentales: Considérées sous les rapports médical, hygiénique et médico-légal; accompagnées de 27 planches gravées. Bruxelles. T. I & II, Paris
Freud S 1917 Trauer und Melancholie. In: Jones E (ed.) Collected Papers. Hogarth Press, London, Vol. 4, pp. 137–56
Henseler H 1974/1984 Narzißtische Krisen. Zur Psychodynamik des Selbstmords. Westdeutscher Verlag, Opladen, Germany (1984), Reinbek bei Hamburg: Rowohlt (1974)
van Hooff A J L 1990 From Autothanasia to Suicide: Self-killing in Classical Antiquity. Routledge, London
Klerman G L 1987 Clinical epidemiology of suicide. Journal of Clinical Psychiatry 48(Suppl.): 33–8
Kreitman N 1977 Parasuicide. Wiley, New York
Mann J J 1998 The neurobiology of suicide. Nature Medicine 4: 25–30
Minois G 1995 Histoire du suicide. Fayard, Paris
Montgomery S A 1997 Suicide and antidepressants. In: Stoff D M, Mann J J (eds.) The Neurobiology of Suicide from the Bench to the Clinic. Annals of the New York Academy of Sciences, New York, pp. 329–38
Pfeffer C R 1986 The Suicidal Child. Guilford Press, New York, London
Phillips D P, Lesyna K 1995 Suicide and the media: Research and policy implications. In: Diekstra R F W, Gulbinat W, Kienhorst I, De Leo D (eds.) Preventive Strategies on Suicide. Brill, Leiden, The Netherlands, pp. 231–61
Platt S, Bille-Brahe U, Kerkhof A, Schmidtke A, Bjerke T, Crepet P, De Leo D, Haring C, Lonnqvist J, Michel K, Philippe A, Pommereau X, Querejeta I, Salander-Renberg E, Temesvary B, Wasserman D, Sampaio Faria J 1992 Parasuicide in Europe: The WHO/EURO Multicentre Study on Parasuicide. I. Introduction and preliminary analysis for 1989. Acta Psychiatrica Scandinavica 85: 97–104
Retterstøl N 1992 Suicide in the Nordic countries. Psychopathology 25: 254–65
Roy A 2000 Suicide. In: Sadock B J, Sadock V A (eds.) Kaplan & Sadock’s Comprehensive Textbook of Psychiatry II. Lippincott Williams & Wilkins, Philadelphia, Baltimore, New York, London, Buenos Aires, Hong Kong, Sydney, Tokyo, pp. 2031–40
Roy A, Segal N L, Centerwall B S, Robinette C D 1991 Suicide in twins. Archives of General Psychiatry 48: 29–32
Schmidtke A 1988 Verhaltenstheoretisches Erklärungsmodell suizidalen Verhaltens. S. Roderer, Regensburg, Germany
Silverman M M, Maris R W 1995 Suicide Prevention Toward the Year 2000. Guilford Press, New York, London
Stengel E 1964 Suicide and Attempted Suicide. Penguin, UK
Wedler H 1992 Some remarks on the frequency of suicides in Germany. In: Crepet P, Ferrari G, Platt S, Bellini M (eds.) Suicidal Behavior in Europe. Recent Research Findings. Libbey, Rome, pp. 79–83
Wender P H, Kety S S, Rosenthal D, Schulsinger F, Ortmann J, Lunde I 1986 Psychiatric disorders in the biological and adoptive families of adopted individuals with affective disorders. Archives of General Psychiatry 43: 923–29
World Health Organization 1990 Consultation on Strategies for Reducing Suicidal Behaviors in European Region: Summary Report. World Health Organization, Geneva
T. Bronisch
Suicide: Ethical Aspects
In 1917, perhaps still troubled by the suicides of two of his brothers and of a close friend, Ludwig Wittgenstein wrote:
If suicide is allowed then everything is allowed. If anything is not allowed, then suicide is not allowed. This throws a light on the nature of ethics, for suicide is, so to speak, the elementary sin …
In contrast, in 1961 the Nobel physicist Percy Bridgman, then almost 80 years old and suffering from terminal cancer, shot himself, leaving a final note:
It isn’t decent for society to make a man do this thing himself. Probably this is the last day I will be able to do it myself.
These views frame the scope of the contemporary debates over the ethics of suicide. At one extreme, some hold the view that suicide is, as Wittgenstein put it (though he questioned even this view), ‘the elementary sin,’ the act most clearly ‘not allowed.’ At the opposite extreme, others hold that control over one’s own life is a matter of right, and ending that life something for which, at least in circumstances like painful terminal illness, a person ought to receive assistance and social support. These debates are both ancient and new. In the West, prior to the beginning of the twentieth century, suicide had often been viewed as a moral question; at the beginning of the twentieth century, however, ‘scientific’ views of suicide as the product of social forces or of individual psychopathology worked to remove suicide from ethical scrutiny. Suicidality came to be seen largely as a clinical matter for prevention and treatment. Not until the emergence of the right-to-die movement in the latter part of the twentieth century did ethical issues about physician-assisted suicide, and hence about suicide itself, begin to re-emerge as a matter for philosophical reflection and for public debate.
1. Conceptual Issues
In popular discourse, the term ‘suicide’ carries extremely negative connotations. However, there is little agreement on a formal definition. Some authors count all cases of voluntary, intentional self-killing as suicide. Others include only cases in which the individual’s primary intention is to end his or her life. Still others recognize that much of what is usually termed ‘suicide’ is neither wholly voluntary nor involves a genuine intention to die, especially in cases associated with depression or other mental illness. Many writers exclude cases of self-inflicted death which, while voluntary and intentional, appear aimed to benefit others or to serve some purpose or principle: for instance, Socrates’ drinking the hemlock, Captain Oates’ walking out into the Antarctic blizzard to allow his fellow explorers to continue without him, or the self-immolation of war protesters. These cases are usually not called ‘suicide’ but ‘self-sacrifice,’ ‘martyrdom,’ ‘heroism,’ or other terms with strongly positive connotations. However, attempts to differentiate these positive cases often seem to reflect moral judgments, not genuine conceptual differences. Non-Western traditional cultures offer many additional conceptual puzzles. Is the label ‘suicide’ appropriate for wives and royal retainers who allow themselves to be buried alive at the death of the king? For widows who, responding to strong social expectation, commit sati by throwing themselves on their husband’s funeral pyre? For jihad or religiously motivated holy-war acts certain to cause one’s own death? For kamikaze missions undertaken by volunteers who know they will not return but who are seen as making a supremely heroic sacrifice for the good of the nation?
In all cultures, cases of death from self-caused accident, self-neglect, chronic self-destructive behavior, victim-precipitated homicide, high-risk adventure, and self-administered euthanasia (all of which share many features with suicide but are not usually termed such) cause still further conceptual difficulty. Consequently, some authors claim that it is not possible to reach a rigorous formal definition of suicide, and that only a criterial or operational approach to characterizing uses of the term can be successful. Nevertheless, conceptual issues surrounding the definition of suicide are of considerable practical importance in policy formation, as for instance in coroners’ practices in identifying causes of death, psychiatric protocols, insurance disclaimers, religious prohibitions, and laws either prohibiting or permitting aiding suicide in situations such as terminal illness.
2. Traditional and Contemporary Ethical Arguments Concerning Suicide
Because recent scientific views of suicide have tended to presuppose that suicide is not a voluntary act, much of the most significant argumentation concerning the moral status of suicide is to be found in the historical tradition. Traditional discussions of suicide employ a wide range of types of arguments, including principle-based, consequentialist, and virtue-based arguments. In each of these categories, some arguments are directed toward the conclusion that suicide is immoral and/or should be prohibited; others lead to the conclusion that suicide is morally permissible and/or should be permitted; a few lead to the conclusion that suicide is morally obligatory in certain sorts of circumstances.
2.1 Principle-based Arguments
Many thinkers in Western philosophical and religious traditions which employ principle-based, duty-based, or other deontological forms of moral reasoning have argued that suicide is intrinsically wrong, and, hence, that it is prohibited. Conversely, others argue that because suicide is prohibited (by God), it is known to be intrinsically wrong. For example, traditional Judaism cites the Biblical passage at Genesis 9:5, ‘For your lifeblood I will surely require a reckoning,’ as the basis of the principle that suicide is wrong; the only exceptions involve self-killing to avoid spiritual defilement, a practice known as Kiddush Hashem (as in the mass suicide at Masada), and temporary insanity. In classical Greece, Socrates, as portrayed in Plato’s Phaedo, cites the secret, ‘whispered’ Orphic principle that man is a ‘prisoner’ or ‘possession’ of the gods and ought not run away; thus suicide would violate the wishes of the gods. Early Christian thinkers implied or asserted that suicide is wrong per se, since, as Augustine argued, it is to reject God’s gift of life and runs counter to the will of God. In exceptional cases like those of Samson pulling down the temple on the Philistines and on himself, and Saul falling on his sword, Augustine insists, suicide must have been commanded by God, but such cases are extremely rare and do not excuse ordinary persons. In the high Middle Ages, Thomas Aquinas definitively established Christianity’s opposition to suicide with a set of five arguments.
Central among these is the principled claim that suicide is unnatural: ‘everything naturally seeks to keep itself in being.’ From this principle, Thomas drew the conclusion that ‘… suicide is, therefore, always a mortal sin in so far as it stultifies the law of nature …’ Still later, the philosopher Immanuel Kant argued that suicide can be shown to be intrinsically wrong by application of the Categorical Imperative: one cannot will without contradiction that it be a universal law. Many other moralists have held that, as a matter of basic ethical principle, suicide is intrinsically wrong. At the same time, however, other thinkers employing deontological or principle-based reasoning have argued that suicide can be morally permissible or even praiseworthy. Typically, these thinkers assert that the principle of autonomy is fundamental, claiming that unless the person is not acting voluntarily or does not have adequate information, or unless serious harm to others will follow, a person’s choice of suicide is to be respected. Defenses of suicide by Seneca, by Rousseau’s character St. Preux in the novel Heloise, by Schopenhauer, and by Mme. de Staël (though she later argued against suicide) are of this form: they assert that as a matter of principle, suicide is not intrinsically wrong and that suicide can be a morally acceptable, sometimes praiseworthy thing to do. Although the language of moral rights was not available to the earlier thinkers, some have elevated the principle of autonomy into the basis of a fundamental right to suicide. For example, Schopenhauer and Nietzsche see suicide as a matter of the basic right of the individual to choose whether to continue to exist in the world. As Nietzsche put it, ‘suicide is man’s right and privilege.’
2.2 Consequentialist Arguments
Other thinkers have argued that suicide is morally permissible in certain situations, though not in others. Typically, these arguments are consequentialist or utilitarian in character (outcome-oriented, so to speak), assessing the permissibility of suicide in terms of its impact on the suicidal person, on family members, on friends and associates, and on the larger society. Aristotle, for example, says that suicide is wrong because it ‘treats the state unjustly’; he is concerned about its impact on society. Christian and Islamic thinkers often argue not only that suicide is wrong in itself but that it will mean damnation for the victim. Blackstone argued that suicide is an ‘offense against the king,’ who has an interest in the preservation of all his subjects. Samuel Pepys, recounting in his diary his efforts to intervene on behalf of the widow of a man who had just drowned himself, points to the disastrous impact suicide could have on family members, inasmuch as the English legal system made a suicide’s property forfeit to the Crown. Using similarly consequentialist arguments, many contemporary authors also point to the consequences suicide can have for family members (especially children) and other close associates. They particularly emphasize the sense of guilt and shame that may be aggravated in survivors. Suicide, on these consequentialist arguments, damages its victim, damages family members, and can damage the society of which they are part. However, not all consequentialist arguments speak against the morality of suicide. Greek and Roman Stoics, defending suicide in principle, also point to the beneficial consequences suicides like that of Cato might have in protecting the freedoms of society. Both traditional and contemporary authors acknowledge that the suicide of an aggressive or abusive person may have positive consequences for others by removing a threat, that the suicide of a person suffering an extended, expensive illness may relieve burdens on an overwrought family, and that suicides of self-sacrifice and principle can have substantial benefits for society. Because these arguments appear to favor or even encourage suicide in many sorts of circumstances, they are currently viewed as extremely controversial. They are rarely openly discussed, though they play an influential role in the background of concerns about ‘slippery slope’ pressures on vulnerable persons.
2.3 Virtue-based Arguments
Still other thinkers treat suicide as an index of character. For example, Aristotle asserts that suicide is ‘cowardly’; that is, it is evidence of a defect of character, a failure to act virtuously by exhibiting insufficient courage in facing a difficult situation. Many later thinkers, including Augustine and Thomas Aquinas, have adopted the view that suicide is ‘cowardly.’ These and other Christian writers also celebrate the willing acceptance of suffering, particularly in emulating the sacrifice made by Christ. As Mme. de Staël put it in her second essay on suicide, opposing the libertarian stance of her first, ‘Suffering is a blessing … it is a privilege to be able to suffer.’ In contrast, Greek and Roman Stoics favoring suicide, as well as many later authors, also employ virtue criteria, but argue to the opposite conclusion. The Stoics held that real cowardice lay in fearing death, and that committing suicide itself requires courage. Examples cited in this tradition include the heroic suicides of Cato, Lucretia, and Seneca, respectively involving commitment to an ideal of political liberty, sexual virtue, and personal courage in the face of an untrustworthy emperor. For some virtue theorists, suicide can involve other virtues as well: fortitude, loyalty, civic responsibility, and many others. Thus, on a virtue-based analysis, too, arguments may be made both against and for the ethical permissibility of suicide.
3. Social Correlates, Biochemical Findings, and Markers of Suicide Risk
The question of whether suicide is wrong (whether in principle, or because of its consequences, or as a fault of character) seems to presuppose that the act is at least to some degree under the agent’s voluntary control. Traditional accounts of the morality of suicide, both those which allow it in some or all circumstances (including Plato, the Stoics, Hume, Nietzsche, and others) and those which oppose it in all or virtually all circumstances (Aristotle, Augustine, Thomas Aquinas, Kant, and others), agree in assuming that the act is sufficiently voluntary that the person whose death it will be can appropriately be blamed (or praised) for so acting; after all, he or she could have refrained from suicide and kept on living. Scientific views of suicide, introduced by Durkheim and Freud at the beginning of the twentieth century, have challenged this assumption. Durkheim’s anomie theory held that suicide is a function of the degree of integration of the social body: suicide can be altruistic, where suicide practices are highly institutionalized; it can be egoistic, where individuals do not respond to social expectations and regulations; or it can be anomic, where society does not provide adequate regulation of its members, as, Durkheim believed, in modern industrial society. In contrast, Freud insisted that suicide was the product of mental illness. For much of the twentieth century, whether suicide was understood as the product of social forces or as the consequence of underlying depression or other mental disorder, in either case it has been seen as largely beyond the voluntary control of its victim. Suicide prevention, pioneered by Edwin Shneidman in the mid-twentieth century and continued by many later suicidologists, works in part to identify risk factors associated with elevated rates of suicide: risk is greater for males (especially older white males), for alcohol abusers, for those with low religiosity, for those with poor or rigid coping skills, for those less willing to seek professional help and more likely to ignore the warning signs of suicide, for those with poorer support systems, for those with access to more lethal technology such as guns, and for those for whom failure is more obvious in the primary adult role (defined for males as economic success; for women, as success in relationships).
Also at heightened risk are teens who listen to heavy-metal music and white males who listen to country music, though these factors are not causes but proxies for other variables, like alcohol or drug use or divorce. Copycat effects following celebrity suicides, holiday dips and post-holiday peaks, and seasonal variations in suicide have all been explored in the search for explanations, as has the concept of suicidal ‘careers’ as a form of coping mechanism. At the same time, new findings in psychiatry, biochemistry, epidemiology, and genetics are beginning to erode assumptions about voluntariness as they reveal biological correlates of suicidal behavior. Depression, it has long been known, is strongly associated with suicide (some studies suggest that it is present in as many as 80–90 percent of cases, though in other studies the rate is lower). Levels of the neurotransmitter serotonin (5-HT) and its metabolite 5-HIAA, as measured in cerebrospinal fluid, are decreased. Fenfluramine used as a challenge agent evokes an impaired response in people who have made serious suicide attempts. Prolactin release is stimulated by serotonin; it too is decreased in suicidal persons. There is some evidence of heritable genetic patterns in families with
multiple suicides: adoption studies show that adoptees with suicide in their biological families are at greater risk for suicide than other adoptees, and there is some evidence of significant association between genetic variation in the gene for tryptophan hydroxylase (TPH) and violent suicide attempts. It may even be the case that depression and suicidality are associated with second-trimester exposure during pregnancy to winter flu. Such findings suggest that suicide, as well as suicidal ideation and suicidal and parasuicidal behavior, along with aggressive behavior generally, may be to some degree biologically determined, either directly or in conjunction with social and environmental factors. If so, suicidality would thus be less a focus for moral blame or praise than for clinical treatment and concern. It may even be possible to develop a biological test—for example, a blood test, a saliva assay, or a urine analysis—that would screen for CNS neurotransmitter abnormalities associated with suicidal behavior. While work to date establishes correlations rather than causes, it nevertheless points to psychological and situational factors that may play a causal role in suicide. Some theorists have assumed that such research will eclipse ethical issues in suicide altogether. However, this research faces two challenges: first, the definitional issue of who counts as a ‘suicide’ (as distinct from martyr, hero, self-sacrificer, or terminally ill patient seeking an easier death), and second, the prudential issue of whether suicide can ever serve the interests of the person who commits it. Definitional challenges to research programs are numerous: should research on completed ‘suicides’ include among its cases, for example, self-sacrificing parents who put themselves in harm’s way to protect their children, or social protesters who immolate themselves for political causes, or the sea captain who elects, as tradition has required him, to go down with the ship? 
The prudential challenge asks whether there can be ‘good reason’ for suicide in some cases, such as painful terminal illness or extreme old age, and hence whether research is misguided if it focuses primarily on negatively valued cases; that is, those called ‘suicides’ in ordinary parlance. Some theorists suggest that there are two basic groups of suicides: on the one hand, people with neurotransmitter abnormalities or other biological defects who suffer from mental illness, depression, or extreme emotional stress, people who cannot consider their best alternatives, and on the other hand biologically normal people who can make such considerations but who find themselves caught in situations in which suicide appears to be the most reasonable choice, a way out of their difficulties. These groups are sometimes distinguished as ‘irrational’ or mentally ill, and ‘rational’ suicides. A clear differentiation is not adequately supported by current findings, primarily because research on ‘suicide’ typically includes just those attempts and completed cases conventionally labeled suicide, overlooking cases not usually identified as suicides where the motivation is viewed as understandable or heroic, as in, for example, cases of political protest or of hastened dying in terminal illness. By some estimates, 95 percent of suicide cases are ‘irrational’ and five percent ‘rational,’ though this estimate is itself challenged by conceptual concerns about what counts as ‘suicide’ in the first place. Nevertheless, this distinction influences much of contemporary discussion about ethical issues in suicide. ‘Irrational’ suicides are assumed not to pose moral issues concerning the agent, since they are assumed not to be voluntary in a robust sense, though of course second parties like family members and physicians may face moral questions about whether and how to intervene. ‘Rational’ suicides, on the other hand, are assumed to be appropriately considered in ethical terms: these are people who make a choice to end their own lives, and the moral issue is whether it can ever be right to choose as they do.
4. The Contemporary Debate over Physician-assisted Suicide
Beginning in the latter part of the twentieth century, the so-called ‘right-to-die’ movement began to challenge life-prolongation practices in the treatment of the terminally ill. Patients, it argued, should have the right to control the care given them as they are dying, whether by refusing treatment like respirators or antibiotics or by discontinuing treatment already in progress, even if this means they will die. By giving a central role to patients’ autonomous decisions about medical care at the end of life, the right-to-die movement also raised the issue of whether the terminally ill patient may deliberately and directly end his or her own life, and receive help from his or her physician in doing so. As of the year 2001, physician-assisted suicide had become legal only in the Netherlands and the US state of Oregon, and had briefly been so in the Northern Territory of Australia, but the issue was being widely debated in many countries, especially in the USA, Canada, the UK, Belgium, Australia, Colombia, and several Scandinavian countries. Four principal ethical arguments, all to be found in earlier, historical discussions of the ethics of suicide, contribute to this broadly international public debate over whether physician-assisted suicide (and, in some countries, voluntary active euthanasia) should be regarded as morally acceptable and legally permitted. The two principal arguments on the pro side are (a) the argument from autonomy, that the autonomous choice of a competent person is to be respected, and (b) the argument from mercy, an argument for freedom from suffering, that the dying patient has the right to try to avoid pain and suffering as much as possible, including by means of a deliberately undertaken earlier death if that is the only effective way to avoid them. At least on some views, both conditions are necessary and neither sufficient in itself: a person must both choose autonomously and have no realistic alternative in seeking freedom from suffering in the process of dying to underwrite an adequate case for physician-assisted suicide. Neither voluntary choice alone in the absence of terminal suffering nor terminal suffering without genuine choice would be sufficient to make such a case. On the opposing side, two principal arguments against moral recognition or legal acceptance of physician-assisted suicide also inform the current debates: (a) the argument from the intrinsic wrongness of killing, that killing, and hence self-killing, is in itself wrong, and (b) the ‘slippery slope’ argument, that permitting assisted suicide would invite pressures on patients by overwrought or greedy families, unscrupulous physicians, or cost-conscious institutions, all of which might force a vulnerable patient to ‘choose’ suicide when they did not really wish to die. Each of these arguments is open to rebuttal. For instance, the claim that killing is not morally permissible can be countered by the objection that killing in self-defense, in war, and in capital punishment are regarded as permissible. The claim that death may be the only way to avoid pain and suffering is countered with the possibility of deep sedation. Such counterarguments are themselves open to rebuttal: for example, that it is only the killing of innocents that is prohibited, or that deep, irreversible sedation is tantamount to causing death, in that it permanently ends the possibility of human experience. These counter-counter-arguments are in turn open to still further reply.
In practice, the disputes over physician-assisted suicide are hugely extended and extremely complex, simmering in many disciplinary areas (philosophy, law, medicine, psychology, sociology, political theory, theology, social anthropology, economics, and public policy) in many countries around the world. These disputes have had a focusing effect, redirecting the larger normative debate about suicide, quiescent since Durkheim, Freud, and the beginning of the twentieth century, toward a specific type of case, terminal illness. Contemporary disputes about the ethical issues in suicide no longer address the larger range of cases in which suicide had been considered permissible or impermissible by various cultures at various times: in shame or dishonor, in chronic illness, in old age, in extreme poverty, for blindness or other disability, on being widowed, for social protest or the defense of principle or religious belief, to avoid degrading labor or coercion in unacceptable activities, and many other reasons. The debate over right-to-die issues and physician-assisted dying that has emerged in the latter part of the twentieth century is indeed a debate about the ethics of suicide, but it is a highly specific, highly focused debate, raised virtually exclusively with regard to physical suffering in the context of terminal illness; it does not yet address whether and when suicide is morally impermissible, permissible, or obligatory in any of a much wider range of situations. While the right-to-die debates over the ethics of physician-assisted suicide in terminal illness continue to ferment, primarily in the developed nations, work is re-intensifying in suicide prevention, focused especially on teens and on some groups of persons at higher identified risk, and on improvement in the care of dying patients, particularly on more effective pain control. Some believe that these measures will be able to reduce both the incidence of suicide and the call for physician-assisted suicide in the future.
5. The Future
In 1975, Henry Van Dusen, former president of Union Theological Seminary, and his wife Elizabeth (both leaders in American theological life) committed suicide together by taking overdoses. Mr. Van Dusen was 77 and had had a disabling stroke five years earlier; Mrs. Van Dusen was 80, blind, and suffering from serious arthritis. They left a letter saying that ‘We still feel this is the best way and the right way to go,’ predicting that this way of dying would become more frequent and more publicly accepted. It is not yet possible to say whether they were right; some observers argue that the ethical issues in suicide will recede again out of public view as modern medicine becomes more and more capable of alleviating pain and suffering now feared by the dying, while others think the ethical issues will become more and more broadly explored, as patients seek greater and greater autonomy in the matter of their own dying. And some think that these ethical issues will be central in the moral life of many societies for the foreseeable future, as life expectancies increase and the likelihood of dying from late-life degenerative disease grows. If so, ethical scrutiny in the matter of suicide will turn to two sorts of issues: the morality of approaching terminal illness and unavoidable death by ‘designing’ one’s own dying, and the risks of social encouragement and pressures that could develop as it becomes a normal, socially tolerated, expected thing to do. Whether such trends will in fact develop is impossible to predict, but it is possible to predict what ethical issues such trends would raise.
See also: Consequentialism Including Utilitarianism; Suicide; Suicide, Sociology of; Virtue Ethics
Bibliography
Améry J 1976/1999 On Suicide: A Discourse on Voluntary Death. Indiana University Press, Bloomington and Indianapolis, IN
Battin M P 1982, 1995 Ethical Issues in Suicide. Prentice-Hall, New York
Battin M P, Rhodes R, Silvers A (eds.) 1998 Physician-Assisted Suicide: Expanding the Debate. Routledge, New York
Beauchamp T L (ed.) 1996 Intending Death. The Ethics of Assisted Suicide and Euthanasia. Prentice-Hall, Upper Saddle River, NJ
Bridgman P 1961 letter in Bulletin of the Atomic Scientists, quoted by Max Delbrück, Education for suicide, interview in Prism, a publication of the American Medical Association 2 (1974), p. 20
Brody B A (ed.) 1989 Suicide and Euthanasia: Historical and Contemporary Themes. Kluwer, Dordrecht/Boston/London
Durkheim E 1897/1951 Suicide: A Study in Sociology. Free Press, New York
Emanuel L L (ed.) 1998 Regulating How We Die. The Ethical, Medical and Legal Issues Surrounding Physician-Assisted Suicide. Harvard University Press, Cambridge, MA
Ethics 109(3) (April 1999) Symposium on Physician-Assisted Suicide
Kaplan K J, Schwartz M B (eds.) 1998 Jewish Approaches to Suicide, Martyrdom, and Euthanasia. Jason Aronson, Northvale, NJ and Jerusalem
Larson E J, Amundsen D W 1998 A Different Death: Euthanasia and the Christian Tradition. Inter Varsity Press, Downers Grove, IL
Maris R 1981 Pathways to Suicide. Johns Hopkins University Press, Baltimore
Maris R W, Berman A L, Silverman M M 2000 Comprehensive Textbook of Suicidology. Guilford Press, New York
Minois G 1995 Histoire du suicide: La société occidentale face à la mort volontaire. Librairie Arthème Fayard [1999 History of Suicide. Voluntary Death in Western Culture. Johns Hopkins University Press, Baltimore]
Murray A 1998 Suicide in the Middle Ages. Vol. I: The Violent against Themselves. Oxford University Press, Oxford
Stack S 2000 Suicide: A 15-year review of the sociological literature: Part I: Cultural and economic factors. Suicide and Life-Threatening Behavior 30(2): 145–76
Weir R F (ed.) 1997 Physician-Assisted Suicide. Indiana University Press, Bloomington and Indianapolis, IN
Werth J L Jr 1996 Rational Suicide? Implications for Mental Health Professionals. Taylor and Francis, Washington, DC
Wittgenstein L Notebooks 1914–16. von Wright G H, Anscombe G E M (eds.) trans. Anscombe G E M, Blackwell, Oxford, 1961, p. 91e. Entry dated January 10, 1917
Zucker M B (ed.) 1999 The Right to Die Debate: A Documentary History. Greenwood Press, Westport, CT
M. P. Battin
Suicide, Sociology of
Although the word itself has been used only since the seventeenth century, the reality that is suicide has been part of human history for a long time. Many suicides occurred in Greece in the fifth century BC and in Rome during the first century AD. The first writings about suicide present it as a religious, moral, or pathological phenomenon, frequently attributed to supernatural causes or to mental illness. Empirical studies appear only at the beginning of the nineteenth century, still with moral overtones. Statistics are analyzed according to an individual or social pathological perspective and the suicide rates are correlated with such variables as the regional economic situation and the homicide or mental illness rates.
1. Durkheim

The first truly sociological writing on the subject is Le Suicide. Étude de sociologie by Durkheim (1967) (see Durkheim, Emile (1858–1917)). Making use of official statistics, he points out the variations in suicide rates throughout time and space. For him, the rate of suicide constitutes a social fact. Each society has a specific tendency toward suicide. The structure of a society decides, so to speak, its own contingent of voluntary deaths. Durkheim refutes the theories that explain suicides by extrasocial individual factors. He dismisses not only psychopathic or normal psychological factors, but also cosmic factors and the imitation factor, which includes contagion. He constantly distinguishes between the cause of suicide, which is always social, and the motives, which are merely pretexts.

Applying the concepts of integration and regulation to religious, domestic, and political societies, Durkheim constructs a typology of suicides. Egotistic suicide results from insufficient social integration and differs from altruistic suicide, which is seen in societies that are too strongly integrated. Anomic suicide occurs in societies where rapid changes are unregulated or insufficiently regulated; conversely, fatalistic suicide, scarcely touched upon by Durkheim, follows from an excess of social regulation.

Several criticisms have been directed against Durkheim’s work. In particular, Baechler (1975) contrasts his lack of interest in the victim of suicide with his great concern for the social rate of suicide. Baechler observes that, for Durkheim, it is not individuals who take their own life, but society itself that commits suicide through some of its members. Durkheim’s analysis of the low suicide rate among women is also severely criticized, along with his lack of development of fatalistic suicide and the confusion he perpetuates between anomy and egotism.
2. In Durkheim’s Wake

Controversy was not the only reaction to Le Suicide. The book awakened a keen interest in the study of suicide, although that interest showed itself only in the 1950s, especially in the USA. Durkheim’s research generated a number of studies seeking to replicate, extend, clarify, and even rectify not only his conceptual analysis but also his empirical results. Still widely taught today in many countries, Durkheim continues to exert a powerful influence on the sociology of suicide.

Using the notions of social status and social roles, Gibbs and Martin (1964) developed a theory around the Durkheimian concept of social integration. When a society engenders status incompatibility, and therefore role conflicts, it has a greater risk of suicide. For Gibbs and Martin, a relationship exists between the suicide rate and the integration of status and roles. It gives rise to their theorem: the suicide rate of a given society varies inversely with the degree of integration of the statuses and roles that make up that society.

Lester (1989, 1997) is one of the sociologists who have contributed the most, and he continues to do so, to the operationalization of Durkheim’s concepts, especially social integration, and to the widest possible verification of his empirical conclusions. Although he remains open to psychological and individual perspectives, Lester is part of the Durkheimian movement, as witnessed by his numerous statistical and comparative studies of national and regional suicide rates throughout the world and by his insistence on examining more deeply the direct relationship between suicide and the modernization of society. In this regard, Lester recommends greater precision in the criteria for modernization, beyond the vague notions of industrialization, urbanization, and secularization. Declining birth and death rates and increases in the divorce rate and in the gross national product are specific indices that should be taken into account.
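The inverse relationship asserted by the Gibbs and Martin theorem can be checked, in principle, with a rank correlation between an integration score and the suicide rate across societies. The sketch below is only an illustration under stated assumptions: the five integration scores and suicide rates are invented placeholders, not data from Gibbs and Martin or any other study, the function names are hypothetical, and Spearman's rank correlation is one plausible choice of test rather than the measure the authors themselves used.

```python
def ranks(values):
    """Return the 1-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    rank_x, rank_y = ranks(x), ranks(y)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


# Hypothetical placeholder values for five societies (NOT real data).
integration = [0.82, 0.74, 0.65, 0.51, 0.40]   # assumed integration scores
suicide_rate = [6.1, 8.3, 9.0, 12.4, 15.2]     # assumed rates per 100,000

rho = spearman_rho(integration, suicide_rate)
print(rho)  # a value near -1 would be consistent with the theorem
```

A coefficient close to -1 would be consistent with the theorem, whereas a value near 0 or above would count against it; with real data, ties and sampling error would need proper handling.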
3. Psychosocial Approach

Although such studies throw light on the social factors of suicide, they do not satisfy several researchers who are interested in studying how such factors influence the persons who commit suicide and what moves them, as individuals, to take their life. The study of suicide thus moves more and more inward, refusing to see the causes of suicide as only social and external. Without neglecting social factors, the study of suicide now considers psychological and individual factors, in both their affective and cognitive dimensions.

3.1 Affective Dimension

Halbwachs (1930) gives a different interpretation to the Durkheimian notion of anomic suicide. For him, the increase in the number of suicides should be more correctly attributed to the loneliness felt by those individuals who are faced with an increasingly complex society that makes it difficult for them to establish satisfactory social contacts. The key concept is to be found in the lifestyle of people, since it determines the opportunities for a relationship with others. One therefore finds fewer suicides in rural areas because of a more harmonious lifestyle. Loneliness with its accompanying social isolation, brought on by a complex and conflicting urban society, thus becomes the cause of suicide, which is consequently explained by psychosocial factors.

Henry and Short (1954) suggest another psychosocial explanation of suicide by incorporating the Freudian concepts of frustration and aggression. An integrated society, in which external constraints increase, has a lower rate of suicide. This is because external constraints have more effect on persons of low status; the subsequent frustration causes their aggressiveness to be turned outwards, with the potential for homicide. Conversely, persons of high status, who impose their demands on others, have no one but themselves to blame if they feel frustrated; their aggressiveness is directed inwards upon themselves, with the potential for suicide. Although far from unanimously accepted, Henry and Short’s theory captured the attention of some researchers. Maris (1981) endorses the idea of social constraints. However, while such constraints are fewer for the socially isolated person, they nevertheless increase the likelihood of suicide.

Giddens (1966) also puts forward a psychosocial approach to suicide. He presents a typology of suicide, using the Durkheimian notions of egotism and anomy, with which he associates the Freudian theory of depression. Different kinds of societies spawn various depressed and suicidal personalities. On the one hand, an egotistic society produces introverts who are incapable of relating to others; their suicidal behavior seeks expiation or social reintegration. On the other hand, an anomic society creates personalities dissatisfied with their performance; their suicide aims at surpassing themselves. In any event, for Giddens, all cases of suicide presuppose social isolation, marked by the absence of significant relationships.
3.2 Cognitive Dimension

Sociology also progressively integrates the cognitive dimension that is linked to a concrete suicide. The possible meanings of suicidal behavior gradually come to revolve around the idea that suicide is the solution to one’s problems. Douglas (1967) has left his mark on this approach to the study of suicide by his work on meanings. Influenced by Weber, Douglas states that suicide is a social act that includes subjective meanings constructed from interactions between the victim and his cultural context. He thus advances a social constructionist theory of meanings. Suicide is seen as ‘a means of transforming the soul from this world to the other world,’ ‘a transformation of the substantial self in this world or in the other world,’ ‘a means of achieving fellow-feeling,’ or ‘of getting revenge’ (Douglas 1967, pp. 284, 300, 304, 310). Also inspired by Weber, Baechler (1975) attempts to understand what makes human beings put an end to their lives. He defines suicide as any behavior that seeks and finds the solution to an existential problem in the act of putting an end to one’s life. His strategic
theory includes several types of suicidal behavior: the escapists, who see a way out, the aggressives, who wish to attack someone, the oblates, who sacrifice themselves, and the gamblers, who take a chance with death. Taylor (1978) also studies the significance of the suicide act for the actor himself. Even as they commit suicide, individuals are rarely seeking death. Aligning himself with Baechler, Taylor suggests two meanings to this behavior. The first is the ‘ordeal’ character of suicidal action that represents the risk, the game, the betting with death. The second is the aspect of ‘rebirth’ that the suicide act may take. In risking his life, the individual hopes to escape his problems and reinvest in a new life that will make more sense to him. He is seeking a renewal of life.
4. Psychological and Biological Approaches

Since the 1980s, the increasing influence of the psychological approach on the sociology of suicide has become evident. For Shneidman (1985), suicide is a multidimensional distress of which the main component is always psychological. Suicidal people always suffer from an unbearable psychological pain and from a cognitive constriction, a ‘tunnel vision.’ Suicide becomes the only solution to their problems. Following Shneidman, Kral (1994) also insists on the psychache felt by the suicidal person. However, for Kral, the direct cause of the fatal act is the very idea of suicide. Suggested by the victim’s culture, it acquires more and more credibility in his eyes and is gradually transformed into an acceptable alternative for getting rid of his psychache. The cultural schema of suicide is progressively internalized into a personal schema.

Some sociologists underline the influence of biology on suicidal behavior. De Catanzaro (1981) adopts a sociobiological standpoint. In his literature review on suicide, Maris (1981) takes note of a biopsychosocial model of suicide. For certain individuals, life has been so strewn with multiple problems that their capacity for adaptation is severely impaired, possibly owing to their biological condition, e.g., a failure of serotonin or an imbalance of neurotransmitters triggered by social and psychological factors.
5. Differentiation of Suicidal Behaviors

The sociology of suicide progressively differentiates. First interested in completed suicides, it examines their many particular forms; studies grouped around gender, ethnic origin, and age group proliferate rapidly. Then, sociologists gradually widen their field to include studies of noncompleted suicides.

5.1 Various Forms of Completed Suicides

Instead of a general analysis of completed suicides, several sociologists isolate particular groups of the fatal acts. Some of them concentrate on suicides by imitation or contagion, studying the fluctuation of suicides following publicity about certain voluntary deaths. When there is widespread coverage by the press, there follows an increase in the number of suicides. This is known as the ‘Werther effect,’ referring to suicides committed in 1774 after the publication of a novel by Goethe. Although tinged with controversy and a lack of unanimity in the results, a good number of studies have been carried out on that point since the early 1980s (Stack 1987).

Other particular forms of completed suicides are examined according to sex, ethnic origin, and age group. Analyses of the sex of persons who committed suicide become more detailed and more discriminating. Invoking his concepts of integration and regulation, Durkheim himself had accounted for a higher suicide rate among men than among women. In most societies, researchers still observe far more male suicides than female, with the particular socialization of each gender often offered as an explanation. Researchers are also looking more and more at links between suicide and homosexual orientation, because of the stigmatization and social exclusion that result from the latter.

Sociologists continue to be deeply interested in the ethnic origin of persons who died by their own hand. The suicide rate of immigrants to the USA is analyzed and found similar to the rate in their country of origin. Even the methods for suicide used by Asian Americans differ from those chosen by the majority of Americans. The immigrants’ culture of origin is thus a determining factor in their suicide. Intrigued by the low suicide rate of African American men and women in spite of the several risk factors associated with them, some sociologists point out the protective influence of their cultural patterns, particularly of their religious family roles. Black women are protected by their network of social support. As for Chinese women, almost the only ones who take their life more often than their male counterparts, the explanation offered is that their traditional culture makes them responsible for family problems.

Suicide rates according to age group are examined more carefully and painstakingly. The peaking of suicide in old age is confirmed by Lester (1997) on an international scale. On the other hand, several researchers observe an increasing rate of suicide among young people in contemporary Western societies. Suicides by imitation and suicide pacts are studied among teenagers who, it is well known, are vulnerable to this phenomenon.
5.2 Other Suicidal Behaviors

It was only in the 1960s that sociologists began to explore the subject of noncompleted suicides. For some researchers, attempted suicide and completed suicide do not show the same characteristics and therefore constitute two distinct social facts. The attempted suicide is presented either as a typical behavior of egotistic societies or as a form of blackmail or a call for help. Often individuals who attempt suicide are hoping more for an intervention that will reconcile them with their social environment than they are seeking death.

Other sociologists consider noncompleted suicides as part of a process of which the final step is a completed suicide. Maris (1981) underlines the importance of attempted suicides in the suicidal process. Identifying 15 predictors of completed suicides during the lifetime of a suicidal person, he puts forward a model of suicide careers. One of the predictors is the greater number of nonfatal suicide attempts, especially among young women. In fact, a higher frequency of attempted suicides and of ‘suicidal ideations’ (a concept only timidly discussed in sociology) is generally found among girls. In their review of the literature published between 1963 and 1993, Arensman and Kerkhof (1996) make a distinction between those whose suicide attempts have been rather weak, those whose attempts have been serious, and those who have ended up killing themselves. For these authors, the link between mild attempts and completed suicides is not shown; they refer to those attempts as parasuicides and reserve the expression ‘attempted suicide’ for severe attempts.
6. Methods Used

In the wake of Durkheim’s work, the analysis of the social factors of suicide was mostly done by means of quantitative and statistical studies of the macrosociological type. As it focused increasingly on the suicidal persons themselves, sociology gradually opened itself to qualitative studies of a more microsociological type that favored sociological individualism (see Macrosociology–Microsociology).
6.1 Quantitative Studies

Durkheim’s use of statistics has been severely criticized, particularly by Douglas (1967), who reproaches Durkheim for choosing statistics that confirmed his theory, while giving scant attention to those that did not. However, the school of Columbia University, especially via Selvin, praised the work of Durkheim and greatly contributed to reviving interest in his study of suicide in the USA and in France. In particular, Selvin (1958) pointed out the cleverness and the innovative character of Durkheim’s statistical analysis, which was not content with simple correlations but used multivariate analysis to validate the complex links between suicide and many social factors.

Macrosociological studies of suicide involving official statistics continue to abound, and indeed are in the majority. The relative inaccuracy of such statistics has been argued by some authors, but the hesitation about using them is less and less relevant in contemporary advanced societies, where they enjoy greater credibility (Lester 1997). More and more sophisticated statistical tests allow for rigorous longitudinal analyses, as well as precise verification of the correlations between suicide and a number of variables: gender, marital status, hereditary and biological characteristics, family history, profession, religious affiliation, ethnic origin, immigration, use of drugs or alcohol, emotional problems, previous suicide attempts, economic, geographical, and seasonal context of suicides, and means used to commit suicide.
6.2 Qualitative Studies

Correlational studies identify the associations between suicide and different social and psychological factors (see Positivism: Sociological), without delving too deeply into the concrete influence such variables actually exert on individuals. With time, sociology, turning more to the person who took his own life, wished to go to the heart of this human and social drama and capture the dynamics and the diversity of its significance. Thus, qualitative studies of suicide were undertaken.

In this type of research, the sociology of suicide often prefers the biographical approach in either of its two forms. It is called a life story if it is reconstructed from accounts and documents furnished by the persons themselves who had suicidal ideas or had attempted suicide. The expression life history refers to accounts given by others as well as to various documents allowing the reconstruction of the person’s history in situations of completed suicides. Baechler (1975) retraces, through accounts written by others, short life histories of individuals who have attempted or completed suicide. Adopting the Weberian ideal type, he ends up with several types of meanings in suicidal behavior, which have been mentioned above. Gratton (1996) herself writes thorough life stories of some young Quebecers who committed suicide, using the personal material they left behind as well as 29 in-depth interviews with their kin. On that basis, she constructs four Weberian ideal types involving, for young people, a lack of connectedness between their values and resources. Two of those ideal types are centered around the values of the youth, either excessive or inadequate, and two around their resources, either rendered powerless or so weak that they depend on others.

Published biographies are used to analyze the suicide of 30 famous people in order to check which theories best clarify the meaning of their suicide. Each biography is analyzed according to situations corresponding to each of the theories. Some theories, such as those of Maris and Shneidman, explain those suicides better than others. The least applicable was Freud’s theory.

Suicide notes, in particular, offer a privileged instrument with which to capture the significance that the actors themselves attribute to their suicide. Researchers are more and more attentive to the analysis of such notes. For example, the collective suicide of Jim Jones and his sect members, in 1978, was investigated through that medium. Seven hundred audiotapes were found at the scene of the drama. The words exchanged between Jones and his followers are deciphered in the light of the Durkheimian concepts of altruistic and fatalistic suicides. The leaders who influenced the collective suicide appear to have committed an altruistic suicide. On the other hand, the climate of despair, hostility, and degradation which reigned in that community seems to have provoked a fatalistic suicide for the majority of Jones’ disciples.

Life histories of people who committed suicide also provide the means to map out their psychological autopsies. At first, these autopsies were meant to retrace the social and psychological events that preceded the suicide, in order to understand their subjective significance. However, even if this instrument is still mentioned by Maris (1981) as a way to establish the life trajectory of suicide victims, it is often used by psychologists and psychiatrists merely in the form of standardized questionnaires, aimed at determining the presence or absence of mental illness in those people. Other researchers, interested in intergenerational suicide behavior, use the ‘génogramme’ in the reconstruction of life histories; it is a graphic representation of the main events that marked the history of a family for at least three generations.
In summary, when sociology looks at suicide as a social phenomenon of a group or of a society, it tends to favor quantitative and generalizing approaches coming from the natural sciences. However, as soon as sociology becomes concerned with the social actors themselves who adopt suicidal behavior and with the diversity of their experiences, it turns to cultural sciences and uses qualitative approaches focused on the individuals.
7. Conclusion

It seems that, ever since Durkheim, the sociology of suicide has been developing by accretion, advancing in three major stages. First of all, Durkheim’s Le Suicide was a pioneer text, identifying suicide as a social phenomenon whose causes had to be sought in the society itself. The sociology of suicide thus made its appearance on the scene of theoretical and empirical knowledge. However, in the following 50 years, it made almost no progress. Durkheim’s study reigned practically alone. It did not, during that period, awaken a strong current of research on suicide.

Then, in the 1950s, the sociology of suicide began to flourish, especially in the USA. Sociologists were busy verifying, extending, refining, and revising Durkheim’s research. Interest in suicide and its social factors broadened progressively, to include the study of the victims of suicide themselves and of their different suicidal behaviors, and to admit psychosociological and psychological approaches. In this regard, social and clinical psychology, Freud’s psychoanalysis, and Weber’s comprehensive sociology have been determining influences in the development of the sociology of suicide.

In the third stage, the prevailing one at the start of the twenty-first century, the sociology of suicide is diversifying greatly, as much in its research subjects as in its methodological tools. It has become a field of study, called suicidology, that is increasingly complex and autonomous, with elastic boundaries, and in which multidisciplinarity takes on more and more importance (Kral 1994). In that regard, it is emphasized that the various disciplines need not be in competition with each other, since their common ground is suicide. What they ought to do is to confront their different bodies of knowledge and articulate them in a common reflection. The recent publication of Maris et al. (1999) bears witness to this necessary sharing of knowledge about suicide.

Simultaneously, interest in the sociology of suicide is rapidly becoming universal, embracing researchers in almost all corners of the world. Studies on the subject abound. A search of Sociological Abstracts alone, from 1986 to 1999, reveals more than 150 articles from authors in the UK, France, Germany, Spain, Italy, Switzerland, The Netherlands, Norway, Finland, Iceland, Brazil, New Zealand, Japan, Canada, and notably the USA.
Periodicals devoted exclusively to suicide have been published: Suicide and Life-Threatening Behavior, Archives of Suicide Research, Giornale Italiano di Suicidologia.

Finally, given the gravity of the suicidal phenomenon in several societies or groups, particularly among the youth, suicidologists from all scientific disciplines are no longer content merely to study suicide. More and more they demonstrate the desire and the need to act socially. With the goal of intervening to prevent suicide, they join national and international associations and take part in seminars and colloquia around the world. The sociology of suicide has contributed, as have other disciplines, to the fact that scientific research is nowadays trying to apply itself to improving society.

See also: Anomie; Durkheim, Emile (1858–1917); Integration: Social; Quantification in the History of the Social Sciences; Social Integration, Social Networks, and Health; Solidarity: History of the Concept; Solidarity, Sociology of; Status and Role: Structural Aspects; Suicide; Suicide: Ethical Aspects; Violence, History of
Bibliography

Arensman E, Kerkhof A F M 1996 Classification of attempted suicide: review of empirical studies, 1963–1993. Suicide and Life-Threatening Behavior 26: 46–67
Baechler J 1975 Les suicides. Calmann-Lévy, Paris
De Catanzaro D 1981 Suicide and Self-damaging Behavior. A Sociobiological Perspective. Academic Press, New York
Douglas J D 1967 The Social Meanings of Suicide. Princeton University Press, Princeton, NJ
Durkheim E 1967 Le suicide. Étude de sociologie, 2nd edn. Presses Universitaires de France, Paris (1st edn., 1897)
Gibbs J P, Martin W T 1964 Status Integration and Suicide. University of Oregon Press, Eugene, OR
Giddens A 1966 A typology of suicide. Archives of European Sociology 7: 276–95
Gratton F 1996 Les suicides d’être de jeunes québécois. Presses de l’Université du Québec, Sainte-Foy, Canada
Halbwachs M 1930 Les causes du suicide. Arnopress, Glencoe, IL
Henry A F, Short J F Jr. 1954 Suicide and Homicide. Free Press, Glencoe, IL
Kral M J 1994 Suicide as social logic. Suicide and Life-Threatening Behavior 24: 245–55
Lester D 1989 Suicide from a Sociological Perspective. Thomas, Springfield, IL
Lester D 1997 Part III: International perspectives. Suicide in an international perspective. Suicide and Life-Threatening Behavior 27: 104–11
Maris R W 1981 Pathways to Suicide. A Survey of Self-destructive Behaviors. Johns Hopkins University Press, Baltimore, MD
Maris R W, Canetto S S, McIntosh J L, Silverman M M 1999 Review of Suicidology, 2000. Guilford Press, New York
Selvin H C 1958 Durkheim’s Suicide and Problems of Empirical Research. American Journal of Sociology 63: 607–19
Shneidman E 1985 Definition of Suicide. Wiley, New York
Stack S 1987 The sociological study of suicide: methodological issues. Suicide and Life-Threatening Behavior 17: 133–50
Taylor S 1978 Suicide and the renewal of life. Sociological Review 26: 373–90
F. Gratton
Sun Exposure and Skin Cancer Prevention

Sun exposure can be defined as the exposure of human beings to natural and artificial ultraviolet radiation (UVR). UVR is a component of natural sunlight and is produced artificially (e.g., by sunbeds) for cosmetic (e.g., to get tanned skin) or medical reasons (e.g., therapy of psoriasis). Sun exposure is the most important environmental risk factor for all forms of skin cancer (Marks 1995). Sun protection behavior, on the other hand, is the most effective form of skin cancer prevention. This article provides an overview of the incidence and causation of skin cancer, of the psychological conditions for sun exposure and sun protection behavior, and of skin cancer prevention programs.
1. Skin Cancer: Incidence and Causation

The two major classes of skin cancer are malignant melanoma (MM) and nonmelanoma skin cancers (NSC; subtypes: basal cell carcinoma, squamous cell carcinoma). In many countries, skin cancer is the cancer with the most rapidly increasing incidence rate. In the United States, the incidence rates of skin cancer are 12.0 per 100,000 Caucasians for MM (mortality rates: men 3.3 per 100,000 people/year, women 1.7) and 236.6 per 100,000 Caucasians for NSC (mortality rates: men 0.7, women 0.3) (Marks 1995). Incidence rates for people with darker skin are much lower (see Cancer Screening).

The development of skin cancer depends on constitutional and environmental factors (for an overview, see Marks 1995). The major constitutional factors are skin color, skin reaction to strong sunlight, and, additionally for MM, large numbers of moles, freckles, and family history of MM. The major environmental risk factor for skin cancer is sun exposure. Cumulative sun exposure is related to the development of NSC. For MM, the role of sun exposure is less clear. Increased risk of MM seems to be related to severe sunburns, in particular in childhood, and intermittent recreational sun exposure.
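Rates quoted 'per 100,000' are simple proportions rescaled to a common population size, which is what makes them comparable across groups. The following minimal sketch shows the arithmetic in both directions; the function names and the population of 2,500,000 are illustrative assumptions, and only the MM rate of 12.0 per 100,000 is taken from the text above.

```python
def rate_per_100k(cases: float, population: float) -> float:
    """Convert a case count in a population into a rate per 100,000."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


def expected_cases(rate: float, population: float) -> float:
    """Number of cases implied by a per-100,000 rate in a given population."""
    return rate * population / 100_000


mm_rate = 12.0            # MM incidence per 100,000, as quoted in the text
population = 2_500_000    # hypothetical population size, for illustration only

cases = expected_cases(mm_rate, population)
print(cases)                             # 300.0
print(rate_per_100k(cases, population))  # 12.0
```

The same conversion applies to the mortality figures, which are additionally expressed per year.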
2. Predictors of Sun Exposure

Sun exposure not only accompanies many working and recreational outdoor activities, it is also sought intentionally (for an overview of studies on psychosocial factors of sun exposure, see Arthey and Clarke 1995, Eid and Schwenkmezger 1997). The major motivational predictor for unprotected sun exposure is the desire for a suntan. Level of suntan is related in a curvilinear manner to being perceived by other people as healthy and attractive, with a medium tan considered the healthiest and most attractive skin color. Therefore, getting a suntan helps people to make a good impression on others. The belief that a suntan improves one’s physical appearance is the strongest predictor of suntanning. Voluntary sun exposure is related to thinking more about one’s physical appearance, to a higher fear of negative evaluations by others, and to a higher body self-consciousness and body self-esteem. In addition to these variables of personal appearance, suntanning is related to a positive attitude toward risk taking in general and to having friends who sunbathe and use sunbeds (see Health Behavior: Psychosocial Theories).
3. Predictors of Sun Protection Behavior

The harmful effects of UVR on the skin can be prevented by sun protection behavior. The most important protection behaviors are: (a) avoiding direct sunlight, particularly during the peak hours of daylight (10 am to 4 pm), (b) seeking shade, (c) wearing a hat, (d) wearing skin-covering clothing, and (e) using sunscreen with a sun protection factor of at least 15. Even in high-risk countries such as Australia, the prevalence rates of these protective behaviors are rather low, indicating that about half of the population are inadequately protected on sunny days (e.g., Lower et al. 1998).

Voluntary sun exposure and the use of sun protection are two largely independent and only weakly correlated behaviors. Moreover, the predictors for sun protection behavior differ from the predictors for sun exposure. In contrast to sun exposure, appearance-oriented variables are less important for predicting sun protection behavior. The major predictors for sun protection behavior are perceived threat of skin cancer, the benefits and barriers of different sun protection behaviors, social factors, and knowledge about skin cancer (Eid and Schwenkmezger 1997, Hill et al. 1984).

Perceived threat of skin cancer depends on the perceived severity of and the vulnerability to skin cancer. People who rate skin cancer as a severe disease and who think that they are more vulnerable to skin cancer are more likely to use sun protection. Furthermore, individuals with sensitive or fair skin, who are more prone to skin cancer, use sun protection more often. This risk perception is strongly related to gender. Women have stronger vulnerability and severity beliefs. Moreover, women use sunscreen more often and apply sunscreen with a higher sun protection factor than men (see Gender and Cancer and Gender and Health Care).
The actual display of specific sun protection behaviors depends on (a) the belief that these behaviors are effective in preventing skin damage, sunburns, and skin cancer, and (b) the specific barriers of each behavior (Hill et al. 1984). Specific barriers to the application of sunscreen are the belief that sunscreen is greasy and sticky and the fact that sunscreen has to be applied repeatedly. An additional barrier for men is the belief that sunscreen makes men appear ‘sissy’ and unattractive. Barriers against wearing hats are the beliefs that hats are inconvenient to wear and cause problems with hairstyle. Additionally, for men, wearing a hat is negatively associated with the beliefs that wearing a hat causes baldness and a sweaty head, is inconvenient on windy days, and gets in the way when playing sports. The major barriers against wearing shirts are the discomfort from heat and the feeling of being overdressed. Therefore, a goal of skin cancer prevention programs must be the removal of these specific barriers against sun protection behavior. Because these barriers partly depend on temporary fashion influences, the present-day barriers of specific sun protection behaviors have to be examined before a prevention campaign is planned.

Social factors also have an important influence on sun protection. Use of sun protection is correlated with peers’ sun protection behavior, parental influence, and parental sun protection behavior. Furthermore, people who know a person suffering from skin cancer use sun protection more often. Finally, knowledge about the risks of sun exposure has proven to be a significant predictor of sun protection in many studies. In particular, people with more knowledge about skin cancer use sunscreen with a higher sun protection factor.
4. Skin Cancer Prevention
With respect to the target groups considered, sun exposure and sun protection modification programs can be grouped into three classes: (a) programs using mass media, (b) community intervention programs, and (c) educational programs (for an overview of these programs, see Eid and Schwenkmezger 1997, Loescher et al. 1995, Morris and Elwood 1996, Rossi et al. 1995).
4.1 Mass Media Programs
Typical mass media programs use informational pamphlets, comic books, newspaper reports, TV and radio advertisements, or videotapes to carry their messages. These mass media campaigns are often not evaluated at all, or the evaluations are published in reports that are not accessible to the public, or the evaluations suffer from methodological problems (e.g., a missing control group). However, there are some studies evaluating single videotapes or informational pamphlets with an experimental or control-group design (for an overview see Eid and Schwenkmezger 1997, Morris and Elwood 1996). According to these studies, videotapes and informational pamphlets can increase knowledge about, perceived severity of, and vulnerability to skin cancer, as well as increase individual sun protection intentions. Studies focusing on long-term effects on behavior, however, are missing. Furthermore, these studies show that an emotional prevention video imparting knowledge by featuring a person with skin cancer might be more appropriate than a more unemotional presentation of facts. Moreover, educational texts focusing on the negative effects of sun exposure on physical appearance (wrinkles, skin aging) might be more appropriate than texts on the negative health-related consequences. This, however, might only be true for people with a low appearance orientation, whereas appearance-based messages can result in boomerang effects for highly appearance-oriented people. Finally, the manner in which a health message is framed is important. Detweiler et al. (1999) demonstrated that messages on sun protection behavior are more effective when they highlight potential positive effects of the displayed sun protection behavior (gain-framed messages) than when they focus on the negative effects of the omitted behavior (loss-framed messages).
4.2 Community Intervention Programs
Mass media are often a component of community intervention programs, but community intervention programs can implement several other methods as well (see Gender and Cancer). Community-based programs can make use of local peculiarities (e.g., the daily UVR rate in a community) and can focus on local places of risk behavior, for example, swimming pools and beaches. Based on principles of learning theory, Lombard et al. (1991) designed an intervention program with pool lifeguards as models for sun protection behaviors and daily feedback posters of the sun protection behavior displayed the day before. This program was effective with respect to two sun protection behaviors (staying in the shade, wearing shirts) but not the others. Rossi et al. (1995) describe several new techniques such as special mirrors (sun scanner) and photographs (sun damage instant photography) that have been used in intervention programs on beaches to show the effects of sun damage and photoaging in a more dramatic fashion.
4.3 Educational Programs
Because of the cumulative risk of sun exposure and the risks of sunburn in childhood, intervention programs are particularly important for children and adolescents. Educational programs have been developed for students of different ages from preschool to high school (see Health Promotion in Schools). In general, these programs are effective in enhancing knowledge about skin cancer and the awareness of the risks of unprotected sun exposure. Changes in attitudes toward sun protection, however, have been found only after participation in programs consisting of more than one session. With respect to changes in behavioral intentions the results are inconsistent. Significant changes in intentions have been found in only a few studies and not consistently across the intentions to perform different behaviors. Changes in observed behavior, however, have not been analyzed. The failure of educational programs to change behavioral intentions might be due to the fact that these programs have focused more on motivational than on behavioral factors. Multimethod educational programs that include components on implementing behavioral routines, which are very important for developing health behavior in childhood, are desiderata for future research.
5. Future Directions
Sun exposure and skin cancer prevention are relatively new fields of research. Future research will profit from multimethod strategies for assessing behavior that not only assess self-reported behavioral intentions but also use observational and physical measures of behavior (e.g., spectrophotometers for measuring the melanin content of the skin). Furthermore, research on new components of intervention programs is needed. For example, future programs could include behavior change methods in addition to attitude change methods, as well as strategies for weakening the association between suntan and attractiveness.
See also: Cancer-prone Personality, Type C; Cancer: Psychosocial Aspects; Cancer Screening; Diabetes: Psychosocial Aspects; Health Behaviors; Health Education and Health Promotion
Bibliography
Arthey S, Clarke V A 1995 Suntanning and sun protection: A review of the psychological literature. Social Science Medicine 40: 265–74
Detweiler J B, Bedell B T, Salovey P, Pronin E, Rothman A 1999 Message framing and sunscreen use: Gain-framed messages motivate beach-goers. Health Psychology 18: 189–96
Eid M, Schwenkmezger P 1997 Sonnenschutzverhalten (Sun protection behavior). In: Schwarzer R (ed.) Gesundheitspsychologie (Health Psychology). Hogrefe, Göttingen
Hill D, Rassaby J, Gardner G 1984 Determinants of intentions to take precautions against skin cancer. Community Health Studies 8: 33–44
Loescher L J, Buller M K, Buller D B, Emerson J, Taylor A M 1995 Public education projects in skin cancer. Cancer Supplement 75: 651–56
Lombard D, Neubauer T E, Canfield D, Winett R 1991 Behavioral community intervention to reduce the risk of skin cancer. Journal Applied Behavioral Analysis 4: 677–86
Lower T, Grigis A, Sanson-Fisher R 1998 The prevalence and predictors of solar protection use among adolescents. Preventive Medicine 27: 391–9
Marks R 1995 An overview of skin cancers: Incidence and causation. Cancer Supplement 75: 607–12
Morris J, Elwood M 1996 Sun exposure modification programmes and their evaluation: A review of the literature. Health Promotion International 11: 321–32
Rossi J S, Blais L M, Redding C A, Weinstock M A 1995 Preventing skin cancer through behavior change. Implications for interventions. Dermatol. Clin. 13: 613–22
M. Eid
Supply Chain Management
1. Introduction
Supply chain management is one of the most essential aspects of conducting business. Many people outside of the direct community (in research and industry) do not realize this because an ordinary consumer often experiences only its effects. Recall the times when the item that you wanted was not available in your favorite garment or grocery store, recall how many times you got a great ‘deal’ at the end of the season, recall the sudden increases in gas prices due to shortages, recall the times when your e-commerce site promised availability but later could not send the required product or sent you the wrong product, or recall the times when your customized product (like a personal computer or kitchen cabinet) was delayed to a great extent. All of the above, and several other experiences that consumers have on a routine basis, are direct consequences of the supply chain practices followed by firms. In business-to-business transactions, as opposed to business-to-consumer transactions, supply chain practices have an immediate impact. A few years back Toyota had to shut down its manufacturing facility in Japan due to supply shortages for its brake pedals, and Boeing took a charge of several million dollars in the late 1990s due to insufficient capacity and part shortages resulting from an inability of the supply base to ramp up production. Meanwhile, firms such as Dell Computers, Wal-Mart, and 7-Eleven Japan have consistently outperformed the competition due to their great strengths in supply chain management.
2. Definition
Supply chain management is such a vast topic that people often define it differently, based on their own experience. To some, supply chain management is all about managing the supplier base, determining what to outsource and to whom, and managing relationships with the various suppliers. To others it is about efficient ways of transferring goods from one place to another, taking into account distribution and transportation costs. To another set of people it is all about how the different firms in the distribution channel or value chain are integrated in terms of information systems and inventory management practices. To yet another group it is the effective management of the fixed and variable assets required for running the business. In a sense all these definitions are like the blind men defining the elephant based on its different parts. A comprehensive definition of supply chain management can be given as follows. A supply chain is the set of entities that are involved in the design of new products and services, procuring raw materials, transforming them into semifinished and finished products, and delivering them to the end customer. Supply chain management is the efficient management of the end-to-end process, starting from the design of the product or service and ending when it has been sold, consumed, and finally disposed of by the consumer. This complete process includes product design, procurement, planning and forecasting, production, distribution, fulfillment, and after-sales support (see Fig. 1).
[Figure 1. Supply chain process. Box labels: Supplier; Demand and Supply Planning; Sourcing and Supplier Management; Manufacturing and Operations; Distribution and Logistics; Customer and Order Management; Customer; Supporting Processes: IT, Finance, Performance Measurement.]
Supply chain management issues can be classified into two broad categories: configuration and coordination. Configuration-level issues relate to the basic infrastructure on which the supply chain executes, and coordination-level issues relate to the actual execution of the supply chain.
2.1 Configuration-level Issues
2.1.1 Supply base decisions. How many and what kinds of suppliers to have? Which parts to outsource and which to keep in house? How to standardize and streamline procurement practices? Should one use vertical marketplaces for auctions or should one invest in developing highly integrated supply partnerships? How long or short should contracts with suppliers be?
2.1.2 Plant location decisions. Where and how many manufacturing, distribution, or retail outlets to have in a global production distribution network? How much capacity should be installed at each of these sites? What kind of distribution channel should a firm utilize: traditional brick and mortar, direct to consumer via Internet or phone, or a combination?
2.1.3 Product portfolio decisions. What kinds of products and services are going to be supported through the supply chain? How much variety to provide to customers? What degree of commonality to have across the product portfolio?
2.1.4 Information support decisions. Should enterprise resource planning software be standardized across functional units of a firm? Should the supply chain work on standard protocols such as XML (extensible markup language) or on proprietary standards?
2.2 Coordination-level Issues
2.2.1 Material flow decisions. How much inventory of different types of products should be stored? Should inventory be carried in finished form or semifinished form? How often should inventory be replenished? Should a firm make all of its inventory decisions or is it better to have the vendor manage the inventory? Should suppliers be required to deliver goods just in time?
2.2.2 Information flow decisions. In what form is information shared between different entities in the supply chain: paper, voice via telephone, EDI (electronic data interchange), XML? To what degree does collaborative forecasting take place in the supply chain? What kind of visibility is provided to other entities in the supply chain during execution? How much collaboration takes place during new product or service development among the supply chain partners?
2.2.3 Cash flow decisions. When do suppliers get paid for their deliveries? What kinds of cost reduction efforts are taken across the supply chain (or expected of suppliers)? In a global firm, in which currency will a supplier be paid?
2.2.4 Capacity decisions. How to optimally utilize the existing capacity in terms of manpower and machines? How to schedule on a manufacturing line to complete jobs on time? How much buffer capacity to have for abnormal situations with excess demand?
As is evident, configuration and coordination issues are interdependent. Configuration issues can be viewed as strategic long-term decisions whereas coordination issues are medium-to-short-term decisions. Generally, firms develop a strategy for the configuration-level decisions and then constrain the coordination decisions based on those.
3. Complexities Associated with Supply Chains
As is evident from the preceding section, supply chain management spans several functional and geographical areas. This introduces complexities in both the design and the execution of supply chains. Some of the pertinent factors that complicate supply chain management decisions are as follows.
3.1 Multiple Agents
Supply chain issues need to be decided by different entities that sometimes have different interests. For example, a retailer may want the distributor to provide very high availability for its products without charging the retailer anything additional. The distributor may sometimes agree to that but in turn may want information about actual customer sales, which the retailer may not want to share. Even when decisions have to be made within the same firm there can be incentive issues. For example, the marketing or sales department, typically a revenue center, presents the future demand forecast to the manufacturing department, which is a cost center. Clearly, there is an incentive for the former to over-forecast and for the latter to under-produce (as compared to the forecast). This creates several difficulties in deciding on the amount of inventory to be stocked. A related issue arises when the marketing department pushes for a large amount of variety in the product/service offerings while the manufacturing department is reluctant to embrace it because of the additional complexities created during execution.
3.2 Uncertainty
Accurately matching supply and demand is the ultimate goal of effective supply chain management, but that is complicated by uncertainty at various levels of the process. There is uncertainty in product and technology development, in predicting customer demand, in day-to-day operations and manufacturing, and in supply. Typically, uncertainty creates more inefficiency in the system. For example, if the final demand for a sweater at a store cannot be predicted accurately then the firm either stocks too little (in which case it suffers from stock-outs) or produces too much (in which case it has to salvage the inventory through a huge sale at the end of the season). Similarly, uncertainty in supply may necessitate additional buffer inventory.
3.3 Information Asymmetry
Since supply chain processes extend across multiple functional units within a firm and often across different firms, there is a high degree of asymmetry in terms of information. This is caused primarily by two factors: one relates to lack of adoption of information technology and the other relates to reluctance to share information with other supply chain partners. The lack of information causes several problems during actual fulfillment. For example, when a consumer goes to an e-commerce site and buys an item off the electronic catalog, the consumer expects to receive the product on time. The consumer is not aware that the inventory status of the product may be updated only once a week and that the information on the site may be outdated. As a result, the consumer is disappointed when the product does not arrive on time.
3.4 Lead Time
Each task in the supply chain process needs time to be completed, and the resources (labor, machines, or computers) have limited processing capacity. As a result, not all tasks can be completed after the actual demand is known, and some of the tasks need to be done up front (their output may or may not be utilized, depending on the actual demand realized). Further, the limited capacity associated with the resources creates variability in the actual realized lead time, which in turn necessitates greater resource requirements at the next stage in the supply chain.
The above complexities lead to several types of inefficiencies in the supply chain that are often perceived as the ‘bad effects’ of inefficient supply chain management. Some of the major inefficiencies can be classified into the following categories.
4. Inefficiencies of Supply Chain Management
4.1 Poor Utilization of Inventory Assets
One common effect of poor supply chain management is having excess inventory at various stages in the supply chain while at the same time having shortages in other parts of the supply chain. Since inventory forms a substantial part of the working assets of a firm, poor management can lead to huge inefficiencies. Lee and Billington (1992) provide an excellent overview of pitfalls and opportunities associated with inventory management in supply chains.
4.2 Distortion of Information
Another effect relates to the lack of visibility of demand and supply information across the supply chain, which causes the bullwhip effect. This effect describes how a small blip in customer demand may get amplified as it passes from stage to stage along the supply chain, because the different entities in the supply chain generate and revise their individual forecasts and do not collaborate and share actual demand information. Lee et al. (1997) describe the causes of and controls for this effect.
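The amplification described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a model taken from the article or from Lee et al. (1997): each stage is assumed to forecast with a four-period moving average and to follow an order-up-to rule with a two-period lead time, and the function names `stage_orders` and `simulate_chain`, the baseline of 100 units, and the demand series are all hypothetical.

```python
from typing import List


def stage_orders(demand: List[float], window: int = 4, lead_time: int = 2,
                 baseline: float = 100.0) -> List[float]:
    """Orders placed by one stage that sees `demand` and forecasts on its own.

    Assumed policy: moving-average forecast plus an order-up-to rule.
    """
    history = [baseline] * window   # assume steady demand before period 0
    prev_forecast = baseline
    orders = []
    for d in demand:
        history.append(d)
        forecast = sum(history[-window:]) / window
        # Replace what was just demanded, then shift the stock target to
        # cover the revised forecast over the replenishment lead time.
        order = d + (lead_time + 1) * (forecast - prev_forecast)
        orders.append(max(order, 0.0))   # orders cannot be negative
        prev_forecast = forecast
    return orders


def simulate_chain(customer_demand: List[float],
                   n_stages: int = 3) -> List[List[float]]:
    """Pass each stage's orders on as the demand seen by the next stage."""
    series = []
    incoming = customer_demand
    for _ in range(n_stages):
        incoming = stage_orders(incoming)
        series.append(incoming)
    return series


# Flat demand of 100 units with a single 10-unit blip in period 5.
customer_demand = [100.0] * 5 + [110.0] + [100.0] * 14
chain = simulate_chain(customer_demand)
peaks = [max(customer_demand)] + [max(stage) for stage in chain]
print(peaks)  # peak demand seen by the customer-facing stage, then each stage's peak order
```

In this sketch no stage sees the actual customer demand except the first, so each one reacts to the already inflated orders of the stage before it, and the peak order grows from stage to stage. Sharing the customer demand series with every stage would remove that compounding, which is the kind of control the text attributes to information sharing.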
4.3 Stock-outs
Poor supply chain management also results in late deliveries and large stock-outs. Fundamentally, these effects are caused by the firm’s inability to predict its requirements for raw material and equipment capacity, together with the uncertainty associated with obtaining on-time deliveries of products from its suppliers. Fisher et al. (1994) describe how accurate forecasts in the apparel industry could potentially reduce this inefficiency.
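The trade-off between stocking too little (stock-outs) and stocking too much (end-of-season markdowns) can be made concrete with a short calculation. The article does not give a model for this; the sketch below swaps in the classical single-period newsvendor calculation as an illustration. Normally distributed demand, the function name `newsvendor_quantity`, and all prices and demand figures are assumptions chosen for the example.

```python
from statistics import NormalDist


def newsvendor_quantity(mean_demand: float, sd_demand: float,
                        price: float, cost: float, salvage: float) -> float:
    """Stocking quantity that balances stock-out and markdown costs.

    Assumes demand is normally distributed (an illustrative assumption).
    """
    underage = price - cost      # margin lost per unit of unmet demand
    overage = cost - salvage     # loss per unit left over at season end
    critical_ratio = underage / (underage + overage)
    # Stock up to the demand quantile given by the critical ratio.
    return NormalDist(mean_demand, sd_demand).inv_cdf(critical_ratio)


# Hypothetical sweater: mean demand 1000 units, standard deviation 200.
balanced = newsvendor_quantity(1000, 200, price=100, cost=60, salvage=20)
high_margin = newsvendor_quantity(1000, 200, price=100, cost=40, salvage=20)
print(round(balanced), round(high_margin))
```

When a lost sale and a leftover unit cost the same, the sketch stocks exactly the mean demand; when the margin on a sale is larger than the markdown loss, it stocks above the mean. A more accurate forecast corresponds to a smaller standard deviation, which pulls the stocking quantity toward the mean and reduces both kinds of loss.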
4.4 Customization Challenges
As the degree of customization has increased in the marketplace, one of the immediate effects of poor supply chain management relates to late deliveries of customized products. Firms are developing several strategies in order to provide variety while keeping costs under control. These include delaying differentiation of the product and introducing more commonality and modularity in product lines (see Swaminathan and Tayur 1998).
5. Supply Chain Models: Past, Present, and Future
The science related to supply chain management traces its history back to the early 1950s, when several researchers were interested in understanding the optimal policies related to inventory management. One of the first pieces of work in this stream relates to the models developed by Clark and Scarf (1958) for managing inventories at multiple echelons. Several hundred researchers have studied related inventory problems under stochastic and deterministic environments since the 1950s. This research is captured concisely in the research handbook edited by Graves et al. (1993). There is a large amount of literature in the area of transportation and distribution as well as plant location models in the context of supply chain management. Traditional researchers focused on developing optimal policies and rules for specific supply chain issues assuming a centralized control of the supply chain. In the 1990s researchers started to study problems that take a decentralized multi-agent approach to analyzing supply chain problems, integrate information availability across the supply chain with logistics decisions, develop new models for supply contracts and demand forecasting, and integrate product design with supply chain management. A collection of prominent pieces of research in this area is contained in Tayur et al. (1999). In addition to academic research, several firms in the 1990s successfully developed and employed large analytical and simulation models for supply chain optimization and execution. Arntzen et al. (1995) describe one such system developed for Digital Equipment, and recently IBM was awarded the Franz Edelman Award by INFORMS (the Institute for Operations Research and the Management Sciences) for a supply chain optimization project which led to $750 million in inventory savings.
In the twenty-first century, firms face severe challenges in terms of global competition and customer requirements for greater variety, shorter and more reliable delivery times, and lower prices. The advent of e-commerce has created immense opportunities but at the same time has made firms more vulnerable to logistics pitfalls. Today customers do not just buy products; they buy delivered products. As a result fulfillment is as important as making the sale.
As opposed to traditional channels, where inventory could be stored to hide other inefficiencies in terms of lead time and poor forecasts, in the fast-paced electronic business environment such arrangements are not as useful. As a result, firms are beginning to pay more attention to supply chain management. Both business-to-consumer and business-to-business e-commerce environments have introduced several issues related to supply chain management which are likely to be studied by researchers in the near future. The prevalence of the Internet has led to the development of vertical marketplaces that promise to reduce the inefficiencies in the buying process in several industries. On the one hand, these marketplaces are likely to reduce the cost of goods for the manufacturer because more competition leads to better prices. This line of thought indicates that in the future supply chains may be more agile and supplier relationships may be short-term oriented. On the other hand, several firms realize that greater benefits can be attained if some of these marketplaces can in fact be used for process integration and collaboration across the supply chain.
In such an environment, firms need to develop greater trust so that they are willing to share information with their supply chain partners. Researchers today are trying to identify under what conditions one or the other scenario may play out and what kinds of new models and analysis need to be developed. A related effect of the Internet is the expansion of global supply chains. Today it is much easier for a supplier located in a remote part of the world to bid on contracts from large firms in developed nations with whom it may not have done business in the past. Issues related to the coordination of global supply chains are likely to be an integral part of supply chain management research in the future. Another important research topic is reverse logistics. Traditionally, researchers have concerned themselves only with the efficient movement of goods from the supplier to the customer. Now a greater number of researchers are studying problems related to the disposal of used products, refurbishing old products, making packaging more environmentally friendly, and basing supplier selection on environmental criteria in addition to traditional criteria related to cost, quality, and reliability. Another new stream of research is the study of supply chains in the service industry. As opposed to traditional manufacturing-oriented supply chains, service supply chains are more complicated due to the inability to store inventory. Uncertainty is handled in those cases using additional buffer capacity. Finally, researchers are beginning to focus more on the integration of product design and supply chains and on developing models to understand the concepts related to design for supply chain management.
See also: Commodity Chains; Location Theory; Market Areas; Market Structure and Performance; Marketing Strategies; Retail Trade
Bibliography
Arntzen B C, Brown G G, Harrison T P, Trafton L L 1995 Global supply chain management at Digital Equipment Corporation. Interfaces 25(1): 69–93
Clark A J, Scarf H 1958 Optimal policies for a multi-echelon inventory problem. Management Science 6: 475–90
Fisher M L, Hammond J H, Obermeyer W R, Raman A 1994 Making supply meet demand in an uncertain world. Harvard Business Review 72(3): 83–9
Graves S C, Rinnooy Kan A H G, Zipkin P H 1993 Logistics of Production and Inventory, Handbook in OR and MS. North Holland, Amsterdam, Vol. IV
Lee H L, Billington C 1992 Managing supply chain inventory: Pitfalls and opportunities. Sloan Management Review 33(3): 65–73
Lee H L, Padmanabhan V, Whang J 1997 Information distortion in a supply chain: The bullwhip effect. Management Science 43(4): 546–8
Swaminathan J M, Tayur S 1998 Managing broader product lines through delayed differentiation using vanilla boxes. Management Science 44(12): S161–72
Tayur S, Ganeshan R, Magazine M J 1998 Quantitative Models for Supply Chain Management. Kluwer, MA
J. M. Swaminathan
Support and Self-help Groups and Health
1. Nature and Use of Self-help and Support Groups
As psychosocial interventions, support groups and mutual aid, self-help (MASH) groups have a great deal in common. Their members share a similar health concern, life event, role transition, or chronic stressor, and they exchange a variety of types of support through face-to-face interaction, bringing to bear their experiential knowledge to gain an understanding of their personal and collective predicament and of ways of coping with it. The differences between these two small group structures are that support groups are organized, governed, and led by professionals who provide supplementary expert knowledge or skill training, whereas MASH groups are self-governing and led by individuals who are peers by virtue of sharing the group’s problem or concern. In addition, support groups tend to have a closed membership whereas MASH groups are open, and support groups tend to have a fixed duration whereas MASH groups are indefinite in duration. Finally, some MASH organizations engage in advocacy efforts on behalf of marginalized or vulnerable segments of the population (Riessman and Carroll 1995), whereas support groups tend to look inward rather than outward. Both support and MASH groups differ quite dramatically from therapy groups; the leader does not engage in any clinical activities, such as psychological assessment, and the members are not assigned to a group on the basis of their symptomatology, diagnostic label, or prognosis. A recent national US survey revealed that approximately 7 percent of adults had participated in a MASH group in the prior 12 months, and 18 percent had done so at some point in their lifetime (Kessler et al. 1997). A similar study in Canada, also based on a national probability sample, revealed a much lower prevalence of 2 percent of adults who had participated in a MASH group in the prior 12 months (Gottlieb and Peters 1991).
Perhaps the self-help ethos is stronger in the US than in Canada, where a national healthcare system may also serve as a disincentive to MASH participation. Humphreys and Klaw (1998) report that, in the US, there are MASH organizations for almost every major chronic condition and leading cause of mortality. Examples of the former include groups for arthritis sufferers, people diagnosed with ischemic heart disease, cancer patients, diabetics, and people with hearing and visual impairments. MASH organizations are heavily utilized by people with substance abuse and emotional problems, Alcoholics Anonymous (AA) being the most widely used of all MASH organizations and probably the most widely accepted by professional referral agents such as family physicians and addictions counselors. AA also has been subjected to more evaluation research than other MASH groups, one recent study finding that, over three years, participants decreased their daily alcohol intake by 75 percent while also effecting substantial cost savings relative to those receiving professional outpatient treatment (Humphreys and Moos 1996). Although no corresponding census has been taken of public participation in support groups, prior reviews of the literature have spotlighted the abundance of groups for cancer patients (Fawzy et al. 1995, Helgeson and Cohen 1996), family caregivers of older adults, particularly those suffering from dementia (Bourgeois et al. 1996, Lavoie 1995, Toseland and Rossiter 1989), and people coping with loss through death or divorce (Hughes 1988). The fact that support groups have been subjected to far more research than MASH groups testifies to the greater ease of access they provide, due in part to their professional sponsorship and more casual attitude toward maintaining anonymity. Both support and MASH groups function as specialized personal communities that supplement or compensate for deficiencies in the participants’ natural networks (Helgeson and Gottlieb 2000).
These deficiencies may stem from the natural network’s inability to comprehend or tolerate the thoughts, feelings, and behaviors that arise from relatively novel stressors, from the network’s impatience with or exhaustion from lengthy periods of rendering support and aid, or from other interpersonal dynamics that compromise the network’s prosocial behaviors, such as conflicts about appropriate ways of coping (Gottlieb and Wagner 1991) and emotional overinvolvement (Coyne et al. 1988). In some circumstances, coparticipants substitute for a natural network that has proved destructive, such as when the fellowship offered by AA supersedes former drinking buddies (Humphreys and Noke 1997). In other contexts coparticipants serve functions complementary to those of the natural network, such as when a weight loss group helps to maintain the gains effected by dietary changes at home.
2. Group Design and Dynamics From the perspective of an observer, meetings of MASH groups and support groups would be virtually 15286
indistinguishable except for two differences. First, the support group would devote a portion of its time to education or training provided by an expert, whereas the MASH group would devote the entire meeting to the swapping of experiences and the exchange of support. Second, many, but not all MASH groups would operate on a 12-step model or a variation thereof, which consists of quite elaborate ideological formulations about the cause and source of difficulty and about the ways members need to think about their situations in order to come to terms with or resolve them (Antze 1976). Underlying the 12-step belief system are the ideas of spiritual surrender and personal struggle, along with the need for moral cleansing and mutual aid. In contrast, support groups rarely import an ideology but develop their own norms, with whatever cognitive restructuring occurring as a result of the resources introduced by the professional leader and the influences arising from the members’ interactions. Aside from these two differences, the same set of structural elements needs to be considered in designing both MASH and support groups. These include: (a) the setting and scheduling of the group sessions, including their number, length, duration, and intervals; (b) the leadership, which may be consistent or rotating, and involve one or two (co-led) facilitators; (c) the group composition, taking into account demographic (e.g., age, gender, socioeconomic status), stressor-related (e.g., stage, severity), and coping-related (e.g., extent of distress and resource mobilization) criteria; and (d) the format, including the extent of meeting structure, extra-group contacts among members, and behavioral training. To date, there has been relatively little research comparing the effects on both the process and outcomes, as well as attrition, resulting from systematic manipulation of these structural elements.
For example, little is known about ways of composing the groups to optimize the social comparisons that members make, and whether aspects of the social mix or of the problem they share more strongly affect members’ attraction, participation, social learning and supportive exchanges in the group (Gottlieb 1998). The beneficial psychosocial dynamics that underlie the mutual aid occurring in both support and MASH groups include the emotional venting along with the validation and normalization of emotions that are gained from contact with similarly afflicted peers, the opportunity to enhance one’s sense of worth and control by providing support to others, a process that has been called helper-therapy (Riessman 1965), and the opportunities for self-enhancement and self-improvement that derive from the social comparisons that members make. These latter comparisons may take an upward, downward, or lateral direction, ideally respectively serving self-improvement, self-enhancement, and normalizing functions. For example, in a widely cited study, Taylor (1983) found that the vast majority of women with breast cancer
found a real or imagined dimension on which they compared more favorably to other women with the same diagnosis. These psychosocial dynamics have yet to be integrated within a comprehensive model of the process through which support and MASH groups exert an impact on health outcomes, whether psychiatric or medical. That is, to understand the mechanisms of action through which these psychosocial dynamics influence health and morale, competing theoretical models should be specified and tested. Cohen (1988), Stroebe et al. (1996), and Wills and Filer (1999) have described several potential direct and indirect pathways through which social support may affect health, including physiological mechanisms, appraisal and reactivity mechanisms, and behavioral mechanisms of action. For example, it is presently unknown whether the increased survival time documented by Spiegel et al. (1989) and Fawzy et al. (1993) for women with metastatic breast cancer and malignant melanoma respectively resulted from mediating improvements in their immune and endocrine function, the enhancement of positive health behaviors such as diet, exercise, and compliance with their therapy regime, or a purely psychological process involving hope and a sense of being needed by interdependent group members. Naturally, there may also be a complex interaction among these mechanisms. The important point is that support and MASH groups that are informed by these models should be designed to impact the hypothesized mediating processes, and measures that are sensitive to change in these processes must also be included.
3. Group Impacts Evaluation of the effects of MASH groups has consisted largely of feedback regarding consumer satisfaction, not randomized controlled trials that administer standardized outcome measures at several intervals following the intervention. Indeed, since most MASH groups are open-ended in nature, allowing members to flexibly come and go as their needs dictate, it is virtually impossible to assess anything that resembles a uniform intervention. Moreover, many MASH groups are not oriented to personal change, recovery, or improved adjustment; instead, they offer acceptance, support, and fellowship to people who feel marginalized (e.g., people with developmental disabilities) or who have been stigmatized (e.g., people with mental illnesses), or people who live with and care for someone who has one of those statuses. In short, many MASH groups view support as an end in itself rather than as a means to other ends, rendering conventional outcome evaluation irrelevant. Those MASH groups that have been rigorously evaluated consist of both ad hoc groups designed along self-help principles and groups that are part of a MASH organization, such as AA. Examples of ad hoc
groups are parents of premature infants (Minde et al. 1980), bereaved wives (Marmar et al. 1988), and older adult men who have diabetes (Gilden et al. 1992). All three of these studies report significant desirable health effects that lasted for several months after the group ended, compared to less effective or equally effective but more costly comparison interventions such as psychotherapy. Though limited in number, evaluations of groups that are part of a larger MASH organization show similar beneficial effects, including Humphreys and Moos’ (1996) study of participants in AA, and Lieberman and Videka-Sherman’s (1986) study of bereaved participants in THEOS (They Help Each Other Spiritually). One provocative finding of the latter study is that participants who developed friendships through their group experience achieved significantly better adjustment than those who did not, suggesting that MASH group effects are magnified for those who have the social skills, personality, or motivation to convert a tie based on shared organizational membership and stressful circumstances into a strong personal tie. As previously noted, there exist far more evaluations of the health impact of support groups than of MASH groups, including review papers on groups for cancer patients, the family caregivers of elderly persons with dementia, and the bereaved. Both within and across these three stressor contexts there is much variability in the characteristics of the participants, the structure, content, and format of the meetings, and the measures used to tap the intended effects. The only constant is the ‘dosage,’ with the vast majority of groups meeting between 6 and 10 times for 1–2 h on each occasion. As for the impacts, there seems to be far more consensus about the limited impact of support groups for family caregivers than for the other two stressor contexts. 
Specifically, the evidence shows that the mental health of family caregivers does not improve much from such a brief intervention, but that longer-term groups and groups that involve skill training that is tailored to situational demands and needs effect larger gains, whether measured in terms of general (e.g., depressive affect and anxiety) or domain specific (e.g., role performance and hostility toward the recipient of care) indicators (Gottlieb 1998, Lavoie 1995, Toseland and Rossiter 1989, Ostwald et al. 1999). It appears that a short-term intervention is not matched appropriately to the chronic character of the caregiving experience, and it is also hard to see how such a limited group experience can mitigate the distress that is a natural byproduct of helplessly witnessing the deterioration of a family member. Fawzy et al. (1995) also concluded their review of the effects of support groups for cancer patients with the recommendation that short-term groups were best suited to the needs of newly diagnosed patients who are early in their treatment, whereas long-term groups were better suited to those with advanced metastatic disease. An example of the latter is a year-long support
group intervention that involved weekly 90-min meetings plus abundant extra-group contact among the members, women who had metastatic breast cancer (see Gender and Cancer). Co-led by a professional and a counselor whose breast cancer was in remission, the group achieved better adjustment (e.g., mood disturbance, tension-anxiety, vigor, confusion) than a control group by the 12-month measurement interval, but not by the two preceding 4-month and 8-month intervals. Moreover, by the time of the 10-year follow-up, the support group participants had increased their survival time by 18 months relative to the control group. Although this study gives reason for optimism about the health benefits of longer-term support groups, the findings must be replicated before any definitive conclusions can be drawn. Moreover, future research should devote more attention to identifying the mechanisms underlying the group’s ameliorative effects and determining which subgroups of participants and which disease diagnoses benefit most from this type of psychosocial intervention. Clearly, people who are more extraverted, inclined to cope by seeking information and venting their feelings, and have the skills and motivation to form attachments with co-participants, are more likely to be attracted to this medium in the first place and perhaps more prone to benefit from the provisions it supplies. Conceivably, the group offers a sense of belonging and being needed by others, and these perceptions may stimulate more active immune system functioning, which may account for the superior outcomes that have been observed. Helgeson and Cohen (1996) limited their review of psychosocial interventions for cancer patients to a comparison of peer support vs. education vs. combined support and education groups. The former concentrate on the provision of emotional support, whereas the latter mainly deliver informational support.
Although their conclusions are hedged due to the numerous methodological weaknesses of the studies and differences in the samples and disease sites and stages, they offer important insights into the mechanisms that may be implicated in psychological and physical adjustment to cancer (see Cancer: Psychosocial Aspects). These mechanisms include enhanced self-esteem, gains in perceived control, optimism regarding the future, improved processing of emotions, and opportunities to derive meaning from their adversity. These and other pathways, including biobehavioral processes, deserve greater scrutiny in future intervention trials. Support groups for people who have experienced losses through death or divorce have proved to speed the adjustment process, compared to people who have not used any professional or mutual aid resource (Bloom et al. 1985, Hughes 1988, Vachon et al. 1980, Silverman 1986). Groups for children whose parents have divorced have also been found to foster improved behavioral, academic, psychological and social adjustment (see Grych and Fincham, 1992, for a review).
Support groups have also been offered to adolescents with eating disorders and to children of varying ages who have a parent with a mood disorder (Beardslee et al. 1993) or with an alcohol or substance abuse problem (Emshoff 1990). Evaluation of support groups is complicated by the fact that these interventions usually consist of multiple active ingredients, not just the support that is exchanged among the members. In fact, in a rare comparison of support groups with and without a skill-training component, Stolberg and Mahler (1994) found that the group that lacked skills training was no more effective than the control group for children whose parents had divorced. More studies that adopt such a dismantling design (West et al. 1993) need to be conducted in order to demonstrate that changes in social support are responsible at least in part for improved outcomes. This means that measures of support need to be taken and then related to the distal outcomes of the intervention as a strategy of confirming the mediating role of support. Finally, recognizing that many people who might be attracted to a support group live in geographically remote communities, have disabilities that limit their mobility, or prefer not to divulge their identity, a number of investigators have implemented computer-mediated groups, using the technology that has recently become available at relatively low cost. For example, Dunham et al. (1998) trained 50 poor, young, single mothers to use a bulletin board system to post both private and network-wide messages to one another and to hold text-based teleconferences with up to eight mothers at a time. With the mothers’ permission, Dunham et al. (1998) analyzed the extent and nature of the participants’ communications, and found a significant pre- to post-test decrease in parenting stress for those mothers who participated consistently over the 6-month study period.
Telephone support groups have also been implemented for family caregivers of persons with dementia (Goodman and Pynoos 1990), a vulnerable population that is largely home-bound. A distinct psychological advantage of support groups that use computer and telephone media is that they moderate the emotional intensity of the group experience, which in turn may decrease its threat. On the other hand, without direct exposure to the other group members, the social comparison process (see Social Comparison, Psychology of) may be more limited and the possibility of friendship formation, which has proved to be so vital in the bereavement and cancer studies reviewed earlier, is diminished.
4. Conclusion MASH and support groups are relatively low-cost and highly accessible helping resources that promote health and well being through many psychological
processes. In the health field, these groups have been employed both as an alternative and a complement to professional services. Moreover, as the actual and potential health benefits of these groups become more widely recognized, mutual aid strategies are more likely to become integrated within health care and health promotion programs (Humphreys and Klaw 1998). More systematic investigation of different ways of structuring the groups and tailoring them to the needs and coping styles of prospective members promises to augment their attractiveness and health impacts. See also: Social Support and Health; Social Support and Recovery from Disease and Medical Procedures; Social Support and Stress; Social Support, Psychology of
Bibliography
Antze P 1976 The role of ideologies in peer psychotherapy organizations: Some theoretical considerations and 3 case studies. Journal of Applied Behavioral Science 12: 323–46
Beardslee W R, Salt P, Porterfield K, Rothberg P C, van de Velde P, Swatling S, Hoke L, Moilanen D L, Wheelock I 1993 Comparison of preventive interventions for families with parental affective disorder. Journal of the American Academy of Child Psychology 32: 254–63
Bloom B L, Hodges W, Kern M B, McFadden S 1985 A preventive intervention program for the newly separated: final evaluations. American Journal of Orthopsychiatry 55: 9–26
Bourgeois M S, Schulz R, Burgio L 1996 Interventions for caregivers of patients with Alzheimer’s Disease: A review and analysis of content, process, and outcomes. International Journal of Aging and Human Development 43: 35–92
Cohen S 1988 Psychosocial models of the role of social support in the etiology of physical disease. Health Psychology 7(3): 269–97
Coyne J C, Wortman C B, Lehman D 1988 The other side of support: emotional overinvolvement and miscarried helping. In: Gottlieb B H (ed.) Marshaling Social Support: Formats, Processes, and Effects. Sage, Newbury Park, CA, pp. 305–30
Dunham P J, Hurshman A, Litwin E, Gusella J, Ellsworth C, Todd P W 1998 Computer-mediated social support: Single young mothers as a model system. American Journal of Community Psychology 26: 281–306
Emshoff J G 1990 A preventive intervention with children of alcoholics. Prevention in Human Services 7: 225–54
Fawzy F I, Fawzy N W, Arndt L A, Pasnau R O 1995 Critical review of psychosocial interventions in cancer care. Archives of General Psychiatry 52(2): 100–13
Fawzy F I, Fawzy N H, Hyun C S, Elashoff R, Guthrie D, Fahey J L, Morton D 1993 Malignant melanoma: Effects of an early structured psychiatric intervention, coping, and affective state on recurrence and survival 6 years later. Archives of General Psychiatry 50(9): 681–89
Gilden J L, Hendryx M S, Clar S, Casia C, Singh S P 1992 Diabetes support groups improve health care of older diabetic patients. Journal of the American Geriatric Society 40(2): 147–50
Goodman G, Pynoos J 1990 A model telephone information and support program for caregivers of Alzheimer’s patients. Gerontologist 30(3): 399–404
Gottlieb B H 1998 Support groups. In: Friedman H (ed.) Encyclopedia of Mental Health. Academic Press, San Diego, CA, pp. 635–48
Gottlieb B H, Peters L A 1991 National demographic portrait of mutual aid group participants in Canada. American Journal of Community Psychology 19(5): 651–66
Gottlieb B H, Wagner F 1991 Stress and support processes in close relationships. In: Eckenrode J (ed.) The Social Context of Coping. Plenum, New York, pp. 165–88
Grych J H, Fincham F D 1992 Interventions for children of divorce: Toward greater integration of research and action. Psychological Bulletin 111(3): 434–54
Helgeson V S, Cohen S 1996 Social support and adjustment to cancer: Reconciling descriptive, correlational, and intervention research. Health Psychology 15(2): 135–48
Helgeson V S, Gottlieb B H 2000 Support groups. In: Cohen S, Underwood L, Gottlieb B H (eds.) Social Support Measurement and Intervention: A Guide for Health and Social Scientists. Oxford University Press, New York, pp. 221–45
Hughes R J 1988 Divorce and social support: A review. Journal of Divorce 11(3–4): 123–45
Humphreys K, Klaw E 1998 Expanding the Self-Help Group Movement to Improve Community Health and Well-being. Center for Health Care Evaluation, VA Palo Alto Health Care System, Stanford, CA
Humphreys K, Moos R H 1996 Reduced substance abuse-related health care costs among voluntary participants in Alcoholics Anonymous. Psychiatric Services 47(7): 709–13
Humphreys K, Noke J M 1997 The influence of post-treatment mutual help group participation on the friendship networks of substance abuse patients. American Journal of Community Psychology 25(1): 1–16
Kessler R C, Mickelson K D, Zhao S 1997 Patterns and correlates of self-help group membership in the United States. Social Policy 27: 27–46
Lavoie J P 1995 Support groups for informal caregivers don’t work! Refocus the groups or the evaluations? Canadian Journal of Aging 14(3): 580–603
Lieberman M A, Videka-Sherman L 1986 The impact of self-help groups on the mental health of widows and widowers. American Journal of Orthopsychiatry 56(3): 435–49
Marmar C R, Horowitz M J, Weiss D S et al. 1988 A controlled trial of brief psychotherapy and mutual-help group treatment of conjugal bereavement. American Journal of Psychiatry 145(2): 203–9
Minde K, Shosenberg N, Marton P, Thompson J, Ripley J, Burns S 1980 Self-help groups in a premature nursery: a controlled evaluation. Journal of Pediatrics 96(5): 933–40
Ostwald S K, Hepburn K W, Caron W, Burns T, Mantell R 1999 Reducing caregiver burden: A randomized psychoeducational intervention for caregivers of persons with dementia. Gerontologist 39(3): 299–309
Riessman F 1965 The helper therapy principle. Social Work 10: 29–38
Riessman F, Carroll D 1995 Redefining Self-help. Jossey-Bass, San Francisco
Silverman P R 1986 Widow-to-widow. Springer, New York
Spiegel D, Kraemer H C, Bloom J, Gottheil E 1989 Effect of psychosocial treatment on survival of patients with metastatic breast cancer. Lancet 2(8668): 888–91
Stolberg A L, Mahler J 1994 Enhancing treatment gains in a school-based intervention for children of divorce through skill training, parental involvement, and transfer procedures. Journal of Consulting Clinical Psychology 62(1): 147–56
Stroebe W, Stroebe M, Abakoumkin G, Schut H 1996 The role of loneliness and social support in adjustment to loss: A test of attachment versus stress theory. Journal of Personal Social Psychology 70(6): 1241–9
Taylor S E 1983 Adjustment to threatening events: A theory of cognitive adaptation. Am. Psychology 38(11): 1161–73
Toseland R W, Rossiter C 1989 Group interventions to support family caregivers: A review and analysis. Gerontologist 29(4): 438–48
Vachon M L, Lyall W, Rogers J, Freedman-Letofsky K, Freeman S 1980 A controlled study of a self-help intervention for widows. American Journal of Psychiatry 137(11): 1380–84
West S G, Aiken L S, Todd M 1993 Probing the effects of individual components in multiple component prevention programs. American Journal of Community Psychology 21(5): 571–605
Wills T A, Filer M 1999 Social networks and social support. In: Baum A, Revenson T (eds.) Handbook of Health Psychology. Erlbaum, Mahwah, NJ, pp. 209–34
B. H. Gottlieb
Suprachiasmatic Nucleus The suprachiasmatic nucleus (SCN) of the hypothalamus is the principal circadian pacemaker in the mammalian brain and, as such, it generates circadian rhythms in rest and activity, core body temperature, neuroendocrine function, autonomic function, memory and psychomotor performance, and a host of other behavioral and physiological processes. The SCN is the central player in an important neural system, the circadian timing system (CTS).
1. Circadian Rhythms are Adaptations to the Solar Cycle The physical environment of living organisms presents extraordinarily diverse circumstances requiring complex adaptations. The most prominent, regularly recurring stimulus in the physical environment is the solar cycle of light and dark and, not surprisingly, nearly all species, plant and animal, have evolved daily rhythms in organismal functions and behavior in response to the solar cycle which enhance adaptation and survival. Such rhythms, when they have special properties, to be outlined below, are designated circadian rhythms. Circadian rhythms are found in nearly all living organisms, from prokaryotes through eukaryotes, from simple to higher plants, and simple invertebrates to the human. The nature of these rhythms is diverse but in mammals, the major subject of this article, the most evident expression of circadian
function is the sleep–wake cycle. This rhythm, and associated rhythms in physiological and endocrine functions, are expressions of the activity of a specialized neural system, the CTS. Mammals present two patterns of behavioral adaptation to the solar cycle. In those species in which olfaction and audition dominate the animals’ perception of their environment, the rest-activity (sleep– wake) rhythm is ‘nocturnal’; the animals are awake at night and sleep during the day. In animals in which vision is the major sensory modality, the rest-activity rhythm is diurnal with activity during the day and rest at night. The nocturnal pattern is typical of rodents and carnivores, for example, whereas the diurnal pattern is exemplified by birds and primates.
2. Circadian Rhythms Have Two Principal Properties Under normal conditions of the light-dark cycle, circadian rhythms have a period of exactly 24 hours. If photic cues are removed, however, by placing animals in constant darkness or constant light, the rhythms are maintained but with a period that differs from 24 hours. In the first circumstance, in which the period of the rhythm is the same as that of the solar cycle, the rhythm is said to be ‘entrained.’ When the period differs from 24 hours in constant conditions, the rhythm is termed ‘free-running’ (Fig. 1). These two properties, entrainment and endogenous maintenance of free-running rhythmicity, provide us with insight into the necessary organization of the CTS. That is, the phenomenon of a free-run in the absence of environmental cues indicates that circadian rhythmicity is generated by a clock-pacemaker within the animal. Entrainment implies that photoreceptors and a visual pathway to the clock-pacemaker are necessary components of the system and the diverse functions under circadian control imply the presence of pathways from the clock-pacemaker to sites expressing effector rhythms (Fig. 1). The effect of circadian regulation is not to alter a function but to provide a temporal organization for its expression.
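The distinction between an entrained and a free-running rhythm can be illustrated with a small numerical sketch. It assumes, purely for illustration, an animal whose activity begins at 18:00 and whose free-running period is 24.5 hours; neither value comes from the text, and the function name `activity_onsets` is hypothetical.

```python
def activity_onsets(n_days, period_h, first_onset_h=18.0):
    """Clock time (hours, 0-24) at which activity begins on each successive cycle.

    period_h is the period of the rhythm: 24.0 when entrained to the
    light-dark cycle, some other value when free-running (assumed here).
    """
    return [(first_onset_h + day * period_h) % 24.0 for day in range(n_days)]


# Entrained: period equals the solar cycle, so onset stays at the same clock time.
entrained = activity_onsets(6, 24.0)

# Free-running (assumed period of 24.5 h): onset drifts 0.5 h later each day.
free_running = activity_onsets(6, 24.5)

print(entrained)      # [18.0, 18.0, 18.0, 18.0, 18.0, 18.0]
print(free_running)   # [18.0, 18.5, 19.0, 19.5, 20.0, 20.5]
```

Plotting one line per day at these onset times would reproduce the kind of record described for the left panel of Fig. 1: a vertical alignment while entrained and a steady slant while free-running. A period shorter than 24 hours would slant the record in the opposite direction.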
3. The SCN is the Center of the CTS 3.1 The SCN is an Endogenous Circadian Pacemaker The SCN is a small, paired hypothalamic nucleus that lies immediately above the optic chiasm and just lateral to the third ventricle in all mammals. Typically it is composed of small neurons that are compactly organized. There are four lines of evidence supporting the conclusion that the SCN is a circadian pacemaker.
Figure 1 The left panel shows an entrained and free-running rhythm in diagram. Each line represents a day and the length of the line would be the duration of the waking period in a wake–sleep cycle. LD refers to a light–dark cycle and DD to constant darkness. The first six days are entrained in LD and the last six free-running in DD. The right panel is a representation of the circadian timing system with light acting upon retinal photoreceptors to be transduced to neural information to be carried through an entrainment pathway to the pacemaker. The pacemaker generates a circadian signal which is transmitted to effector systems which express circadian regulation of function
3.1.1 The SCN is the site of termination of the retinohypothalamic tract—an essential entrainment pathway. For an explanation of this, see below.
3.1.2 Ablation of the SCN abolishes circadian rhythms. The initial studies of SCN ablation were performed in the early 1970s. Following these lesions, there is a complete loss of temporal organization of functions under circadian regulation. For example, SCN-lesioned animals show an arrhythmic pattern of sleep-wake behavior yet the amount of time spent awake is normal as is the amount of time spent in REM and non-REM sleep. Similarly, the rhythm in adrenal steroid secretion is lost but the total amount of steroid secreted is normal and responses to stress are maintained. Thus, in these and many other functions, the effect of SCN lesions is only to
alter the temporal organization of functions, not the functions themselves.
3.1.3 Circadian rhythmicity is maintained in the SCN in isolation in vivo and in vitro. In elegant experiments, Inouye and Kawamura were able to isolate the SCN from the rest of the brain in intact animals and record from SCN neurons in these animals. Although the animals were arrhythmic in behavior, and other brain areas lost rhythms in neuronal firing, SCN neurons maintained a stable rhythm of firing with high levels during the day and low levels at night, and the firing rate rhythm free-runs in constant conditions. Subsequent studies by a number of investigators confirmed these observations in SCNs maintained in vitro in the absence of the rest of the brain. This has not been found for other brain areas. Thus, the SCN is a unique brain area in that it maintains rhythmicity in isolation both from environmental cues and input from other brain areas.
3.1.4 Circadian rhythmicity is restored in SCN-lesioned animals that receive transplants of fetal SCNs. In arrhythmic animals with SCN lesions, placement of small transplants of fetal hypothalamus containing the SCN into the third ventricle restores the circadian sleep–wake rhythm. The transplants must contain SCN tissue and be placed in the relative vicinity of the lesioned area to be effective. It was unclear until recently how the transplants functioned but work by Silver and Lehman and co-workers, who were early leaders in the transplant studies, showed conclusively that the transplants can communicate with the rest of the brain through a humoral signal released into the cerebrospinal fluid to diffuse into the adjacent brain, hypothalamus and thalamus. These data on transplantation of SCN provide the final, conclusive evidence that the SCN is the principal circadian pacemaker.
3.1.5 SCN Neurons are circadian oscillators. There are two potential mechanisms by which the SCN might generate circadian rhythms. The first is that individual SCN neurons are born as circadian oscillators that are then coupled to make a pacemaker. The second is that circadian function arises as an emergent property of coupled SCN neurons which are not themselves oscillators. The SCN is formed embryologically from precursor cells located in the diencephalic germinal epithelium. As these precursor cells divide, the newly formed neurons migrate from the germinal epithelium to form the SCN. In the rat, the most extensively studied species, SCN neurons are formed in late prenatal life with the process completed by three days before birth. At 2 days before birth, the SCN is very primitive and contains no synapses, indicating that its neurons do not participate in a neural network.
Nonetheless, a circadian rhythm in glucose utilization appears in the SCN on that day, and this is not generated by maternal influences, or other fetal influences, indicating that SCN neurons have become rhythmic and, hence, at least a subset must be individual circadian oscillators. Subsequently, the neurons are coupled through neural connections to make a pacemaker. The validity of this interpretation of SCN organization is shown by other studies in which individual SCN neurons, isolated in tissue culture, exhibit circadian rhythms in firing rate. This type of pacemaker organization is also consonant with observations on invertebrate pacemakers.
3.1.6 Circadian function in the SCN is genetically determined and generated through protein interactions. Data presented over many years have established that circadian rhythmicity is a genetically determined function. Among the studies contributing to this understanding are experiments in Neurospora and Drosophila that have indicated the existence of clock genes. Beginning in the mid-1980s, clock genes were cloned and analyzed, largely by Dunlap and his co-workers for Neurospora and by Hall, Rosbash, and Young for Drosophila. This work shows that circadian rhythmicity, at the molecular-cellular level, is the expression of clock genes producing proteins that, in turn, regulate expression of the clock genes, that is, the clock mechanism can be viewed as a simple feedback loop. This field is evolving very rapidly at present and recent findings indicate that there is a substantial conservation of function from invertebrates to mammals. Homologues of the Drosophila clock genes are present in the mammalian SCN and cycle in a pattern similar to that in Drosophila. This work will continue to expand rapidly and we should expect a reasonably complete understanding of the molecular-cellular events underlying circadian rhythmicity in the mammalian SCN to be elucidated in the near future.
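How a feedback loop of this kind can produce a self-sustained rhythm may be seen in a minimal numerical sketch. The model below is a generic delayed negative-feedback equation, not the mechanism of any particular clock gene: it assumes that the protein represses its own production after a fixed delay, and every parameter value (beta, hill, decay, delay) is an arbitrary illustrative choice in arbitrary time units rather than a measured quantity.

```python
def simulate_feedback_loop(beta=2.0, hill=4, decay=1.0, delay=6.0,
                           dt=0.01, t_end=200.0, x0=0.5):
    """Euler integration of dx/dt = beta / (1 + x(t - delay)**hill) - decay * x.

    x is the level of a hypothetical clock protein; its production is
    repressed by the level that was present `delay` time units earlier.
    """
    lag = int(round(delay / dt))        # delay expressed in time steps
    steps = int(round(t_end / dt))
    x = [x0] * (lag + 1)                # constant history on [-delay, 0]
    for _ in range(steps):
        delayed = x[-lag - 1]           # protein level one delay ago
        production = beta / (1.0 + delayed ** hill)
        x.append(x[-1] + dt * (production - decay * x[-1]))
    return x[lag:]                      # values from t = 0 onward


trace = simulate_feedback_loop()
late = trace[len(trace) // 2:]          # discard the initial transient
swing = max(late) - min(late)
print(f"peak-to-trough swing after transient: {swing:.2f}")
```

With these assumed parameters the protein level does not settle to a steady value but keeps rising and falling, because each peak shuts down production one delay later and each trough releases it again. Shortening the delay sufficiently lets the system relax to a constant level, which illustrates why a lag between gene expression and feedback matters in such a loop.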
3.2 The SCN is Coupled to Effector Systems by Neural Pathways The major output of the SCN is to hypothalamus and thalamus with very limited projections to basal forebrain. Within this context, the densest projections are to anterior hypothalamus for the control of body temperature and autonomic function, to the paraventricular and arcuate nuclei for neuroendocrine regulation, and to the posterior hypothalamus and thalamus for behavioral state regulation. Thus, these projections form the basis for circadian control of effector systems. A scheme for this is shown diagrammatically in Fig. 2.
4. The SCN in Nonmammalian Vertebrates There appears to be an SCN, or an SCN homologue, in the brains of all vertebrates. The function of the SCN, however, differs in nonmammalian vertebrates in comparison to mammals. As described above, the SCN is the circadian pacemaker responsible for the temporal organization of behavior into 24-hour sleep–wake cycles in virtually all mammals. Relatively little work has been done on fish, amphibia, and reptiles but birds have been studied extensively. In avian species there are three patterns of organization of the circadian timing system. The first, represented by the domestic chicken, is somewhat similar to mammals. The SCN is the pacemaker responsible for the temporal organization of behavior but the pineal is
Figure 2 Diagrammatic representation of the functional organization of SCN control of effector system
capable of independent circadian activity. The second, represented by passerine species, differs in that the pineal is the principal pacemaker but the pineal acts through effects on the SCN, that is, if the pineal is removed from animals in constant conditions, the animals become arrhythmic and transplantation of the pineal to the anterior chamber of the eye restores rhythmicity. If the SCN is ablated in animals with intact pineals, behavioral rhythmicity is lost even though the pineal continues to put out a rhythmic signal. The third variation, seen in some quail species, is characterized by three structures, the pineal, SCN, and retina being capable of generating rhythmicity. Thus, in avian species, the SCN, the pineal or the retina are potential circadian pacemakers. This is of interest in that recent data indicate that the mammalian retina is capable of generating intrinsic circadian rhythms and there is some evidence that mammals can exhibit a very limited amount of circadian function in the absence of the SCN. This is only evident in specialized environmental situations and, under usual circumstances, SCN-lesioned animals do not recover circadian function and remain arrhythmic throughout their lives.
5. Conclusions The adaptation of animals to their environment is characterized by a temporal organization of behavior into cycles of rest (sleep) and activity. In mammals, this temporal organization of behavior, facilitated by a precise temporal organization of physiological, neural, and endocrine processes, is established by a small, paired hypothalamic nucleus, the suprachiasmatic nucleus. The suprachiasmatic nucleus is composed of individual neuronal oscillators coupled to constitute a network that functions as a circadian pacemaker. The precise output of the pacemaker is modulated by photic and nonphotic neural afferents to provide maximal adaptation. This output acts predominantly through connections to hypothalamus and, to a lesser extent, to thalamus and basal forebrain. Although limited, these connections orchestrate the precise pattern of behavioral and physiological processes that represent the circadian contribution to adaptation. See also: Circadian Rhythms; Hypothalamus; Sleep and Health; Sleep Disorders: Psychiatric Aspects; Sleep Disorders: Psychological Aspects; Sleep States and Somatomotor Activity; Sleep: Neural Systems
Bibliography
Hastings M H 1997 Central clocking. Trends in Neuroscience 20: 459–64 Klein D C, Moore R Y, Reppert S M 1991 Suprachiasmatic Nucleus: The Mind’s Clock. Oxford University Press, New York
Meijer J H, Rietveld W J 1989 Neurophysiology of the suprachiasmatic circadian pacemaker in rodents. Physiological Reviews 69: 671–707 Moore R Y 1996 Entrainment pathways and the functional organization of the circadian system. Progress in Brain Research 111: 103–19 Moore R Y 1997 Circadian rhythms: Basic neurobiology and clinical applications. Annual Review of Medicine 48: 253–66 Moore R Y, Leak R K 1999 Suprachiasmatic nucleus. In: Takahashi J S, Moore R Y, Turek F W (eds.) Biological Rhythms. Plenum Press, New York
R. Y. Moore
Suprasegmentals Suprasegmental features of speech are associated with stretches that are larger than the segment (whether vowel or consonant), in particular pitch, stress, and duration (Lehiste 1970). Respectively, these terms refer to the sensation of higher and lower tone, to the prominence patterns of words, and to durational differences between segments, whether between different segments or between pronunciations of the same segment in different contexts. In general, speech can be analyzed from two points of view: the phonetic reality, most readily accessible as the acoustic speech wave form, but equally present in the articulatory behavior of the speaker and the perceptual analysis of the auditory signal by the listener, and the discrete mental representation which serves as the set of instructions from which the phonetic signal is built by the speaker. One way of structuring a discussion of this topic is to take these three suprasegmental features as a point of departure, consider their somewhat varied manifestations at the different stages of the speech chain, that is, their articulation, acoustics, and perception, and tabulate the ways in which they function in languages. This approach would reveal that the terms pitch, stress, and duration do not refer to concepts at a single stage in the speech chain. Stress is largely a phonological notion (cf. section on the foot). Pitch is a perceptual concept, the articulatory correlate of pitch being rate of vocal cord vibration, and the acoustic correlate, in the usual case, being periodicity in the speech signal, the repetition of nearly the same pattern of vibration. Another approach takes the phonological representation as the starting point, and refers to the phonetic features that are involved in its realization, which is what will be done here. Recently, the phonological object of study that corresponds to the traditional suprasegmental features has come to be known as prosody.
Prosodic structure has two components. First, there is a hierarchical compartmentalization of the segment string into
phonological constituents, like the syllable and the intonational phrase, called the prosodic hierarchy (Selkirk 1978, Nespor and Vogel 1986, Hayes 1989). Second, overlaid on the string of segments, there is a tonal structure, which is synchronized in specific ways with the segmental and prosodic structure (Pierrehumbert and Beckman 1988, Ladd 1996). All languages have both of these structural components.
1. The Prosodic Hierarchy The phonological structure of an English sentence like Too many cooks spoil the broth does not just consist of a linear segment string [tuː meni kʊks spɔɪl ðə brɒθ]. The segments are grouped in a hierarchical set of constituents, where constituents at each rank include those of the rank below. There are seven syllables and five feet (Too, many, cooks, spoil, and broth), each of which is also a phonological word; next, there are three phonological phrases (Too many cooks, spoil, and the broth), two intonational phrases (Too many cooks and spoil the broth), and one utterance (Too many cooks spoil the broth). The foot captures the notion stress. It is a constituent which in the unmarked case consists of a stressed (or strong: S) and an unstressed (or weak: W) syllable, with languages choosing between trochees (S–W) and iambs (W–S). Often, there are restrictions on segmental contrasts in weak syllables, as in Russian, which has five vowels in stressed syllables but only three in unstressed syllables, and in English, which, depending on the variety, has between 13 and 17 monophthongs and diphthongs in stressed syllables, but only three in unstressed syllables (cf. the final syllables in villa, folly, and fellow; Bolinger 1986, p. 347ff.). Commonly, feet are disyllabic. English, a trochaic language, also has marked feet: a monosyllabic foot occurs in cooks and manifest, where -fest occurs after the disyllabic foot mani-; a trisyllabic foot occurs in cavity and university, where the foot -versity occurs after the disyllabic foot uni-. One of the feet of a word is selected to bear primary stress or main stress [*], the others having secondary stress [)]. In English, the primary stress is on the rightmost branching foot, as shown by *mani)fest, uni*versity, and )Apa)lachi*cola, but there are many exceptions, like *sala)mander, )perso*nnel. Other languages will make other choices.
For instance, Bengali, Czech, Finnish, Icelandic, and Pintupi (Australia) always have the primary stress on the first trochee, that is, all words have initial stress, and secondary stresses may be found on alternate syllables, as in Pintupi [*juma)Ri
quantity-sensitive languages, long vowels, or long vowels and closed syllables, do not tolerate being in a weak position, and will thus interrupt any regular count through the word. Hawaiian is a quantity-sensitive trochaic language which assigns feet right-to-left, so that [ko*hola] ‘reef ’ has penultimate stress, but [)koho*laː] ‘whale’ has final stress. When a language has extrametricality, the regularities obtain minus an edgemost (usually final) syllable. Finally, within limits imposed by the typology of the language, some words may have exceptional stress, in which case the foot structure is marked in the underlying representation. English is a quantity-sensitive, trochaic, right-to-left language with extrametricality and exceptions (cf. Hayes 1985). Stressed positions in phonological representations not only attract more phonological contrasts, as the English and Russian vowel systems show, but also lead to less casual articulation than unstressed positions. The intensity at the higher end of the frequency spectrum tends to fall off less steeply (even spectral balance), spectral targets of vowels are approached more closely (less vowel reduction), and their duration is longer; overall intensity measures like RMS intensity appear to be unreliable indicators of stress (e.g., Beckman 1986, Sluijter 1995). Stressed syllables also have different pitch characteristics from unstressed syllables. These are caused by the fact that stressed syllables attract intonational pitch accents, tones, or tone complexes, which (together with boundary tones, see below) make up the tonal structure of the language. The presence of a pitch accent on a stressed syllable makes that syllable accented, as indicated by capitalization. When said in isolation, an English word like theory is pronounced THEory, while there may also be a pitch accent on a secondary stress before the primary stress, so that Japanese will be JapaNESE or JAPaNESE.
English has a number of different pitch accents, like H*, L*, L*H, etc., where the starred tone is time-aligned with the stressed syllable. While in isolation, the pronunciation will be THEory, in SET theory the syllable the- is unaccented, because in compounds only the first constituent is accented. Other reasons for deaccentuation are stress shift and lack of focus (see below). A frequently applied diagnostic for the phonological word is syllabification, more specifically the extent to which consonants syllabify as onsets with vowels to their right. The phonological word may be smaller or larger than the morphological word. In many languages, it is smaller in compounds, where each constituent is a separate syllabification domain (e.g., English cat’s eye syllabifies as [*kæts.)al], not as *[*kæt.)sal]); it is larger in cases of cliticization (e.g., English What is he [*wut.si] saying? ). The phonological phrase may determine the distribution of accents, favoring first and last accents. Although TOO MAny! will have two accents, TOO many COOKS is likely to lose the pitch accent on many.
The term stress shift is applied to cases where such rhythmic deaccentuation leads to the sole presence of a pitch accent on a foot with secondary stress, as in JAPaNESE, but JAPanese FURniture (Gussenhoven 1991). The intonational phrase typically comes with boundary tones at its edges. For instance, the intonational phrase Too many cooks might end with a H-tone, pronounced after the pitch accent on cooks. Boundaries of this type are often felt to correspond with a comma in writing. The utterance, finally, roughly corresponds with the intuitive notion of spoken sentence. Instead of, or in addition to, the intonational phrase, the phonological phrase and the utterance may also be provided with boundary tones. In (1), a full prosodic representation is given, together with frequently used symbols; the subscript I in the tonal structure indicates a boundary tone.
(1) [Figure: prosodic representation of Too many cooks spoil the broth, showing the segmental structure and the tonal structure]
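The constituent counts given for Too many cooks spoil the broth can be checked with a small data structure. The following Python sketch is only an illustration: the nested-list layout, the spelling-based syllables, and the names UTTERANCE and count_constituents are assumptions made here, not notation used in the literature. The syllable the is attached directly to its phrase because it does not form a foot of its own.

```python
# Illustrative sketch of the prosodic hierarchy for
# "Too many cooks spoil the broth". Layout and names are assumptions.
# A foot is a list of syllables; a syllable that belongs to no foot
# (here "the") is a plain string attached directly to its phrase.
UTTERANCE = [
    [  # intonational phrase 1: "Too many cooks"
        [["too"], ["ma", "ny"], ["cooks"]],  # one phonological phrase, three feet
    ],
    [  # intonational phrase 2: "spoil the broth"
        [["spoil"]],           # phonological phrase with one foot
        ["the", ["broth"]],    # phonological phrase: stray syllable plus one foot
    ],
]


def count_constituents(utterance):
    """Count the constituents at each rank of the hierarchy."""
    syllables = feet = phrases = 0
    for intonational_phrase in utterance:
        for phrase in intonational_phrase:
            phrases += 1
            for item in phrase:
                if isinstance(item, list):  # a foot: count it and its syllables
                    feet += 1
                    syllables += len(item)
                else:                       # a syllable outside any foot
                    syllables += 1
    return {
        "syllables": syllables,
        "feet": feet,
        "phonological_phrases": phrases,
        "intonational_phrases": len(utterance),
        "utterances": 1,
    }


print(count_constituents(UTTERANCE))
```

Under these assumptions the function reports seven syllables, five feet, three phonological phrases, two intonational phrases, and one utterance, matching the counts stated in Sect. 1.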
The choice of tones is one of many possible combinations, each of which will produce a different intonation contour. At the higher ranks, some optionality also exists in the phrasing, depending on style, constituent length, and sometimes language variety. Thus, spoil the broth may form a single phonological phrase, as shown by varieties like American English which allow stress shift in verb-object combinations, like MAINtain ORder (cf. ORder to mainTAIN). The sentence can also be pronounced as a single intonational phrase. The constituents in the prosodic hierarchy often define the context of segmental restrictions. For instance, syllable-final obstruents are voiceless in Turkish; voicing of Dutch [s] occurs between a sonorant segment and a vowel if [s] occurs at the end of a prosodic word; in Kimatuumbi, all vowels are shortened except any long vowel that ends up in the last word in the phonological phrase. Higher rank prosodic constituents in particular tend to undergo (phonetic) preboundary lengthening, which is usually confined to the final syllable. There is no consensus on how many prosodic constituents there are, in part due to apparent differences between languages. For instance, Japanese has an accentual phrase, which seems to lie halfway between PW and A. A clitic group has been claimed for Italian and English, for instance, providing a level between PW and A. A constituent below the syllable is the mora, which captures quantity contrasts. A syllable with a short vowel has one mora, one with a long vowel or diphthong two. A geminate consonant adds one mora, a single consonant none. Disyllabic words in Finnish, which has quantity contrasts in vowels and consonants in stressed syllables, may have two moras (CVCV), three (CVVCV or CVCCV), or four (CVV-CCV). Exhaustive parsing refers to the ideal situation in which each constituent includes all and only the segments included in the constituents of the rank below. For instance, the inclusion of [m] in the syllable [me] implies that it is also included in the foot [meni] and the phonological word [meni] (which makes the structure distinct from that of Tomb any). A violation occurs in (1) for the syllable the, which lacks the normal properties of F and PW, and is directly included in the A.
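The mora arithmetic described for the Finnish disyllables can be made explicit. The Python sketch below is a simplification under stated assumptions: it works on CV skeletons only, treats every V as one mora (so VV counts as two), and reads every CC sequence as a geminate, which holds for the skeletons cited here but not for consonant clusters in general. The function name count_moras is hypothetical.

```python
def count_moras(skeleton: str) -> int:
    """Count moras in a CV skeleton such as "CVCCV".

    Assumptions of this sketch: each V contributes one mora (VV = two),
    each CC sequence is a geminate contributing one mora, and a single C
    contributes none. Hyphens marking syllable division are ignored.
    """
    skeleton = skeleton.replace("-", "")
    vowel_moras = skeleton.count("V")
    geminate_moras = skeleton.count("CC")
    return vowel_moras + geminate_moras


for shape in ("CVCV", "CVVCV", "CVCCV", "CVV-CCV"):
    print(shape, count_moras(shape))
```

For the four skeletons in the text this gives two, three, three, and four moras respectively.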
2. Tonal Structure The tonal structure can be thought of as a string of tones arranged in parallel to the segmental structure provided by the string of consonants and vowels, that is, as an autosegmental representational tier (Goldsmith 1976). It varies greatly across languages. There are two ways in which a typology can be set up, a morphosyntactic one (‘How do tones signal what words, phrases, or clause types are used?’) and a phonological one (‘Where are tones located and how many contrasting tones are there?’). A frequently applied morphosyntactic criterion is whether the tone(s) enter into the specification of (also segmentally specified) words or morphemes. If they do, the language is a tone language, examples of which can readily be found in Asia (Sino-Tibetan languages, Japanese), Africa (Bantu languages, Khoi-San languages, Nilo-Saharan languages), and many languages in the Americas. Examples of tonal distinctions ([ˋ] = low, [ˊ] = high, [ˉ] = mid) of this kind are Japanese [hásì] ‘chopsticks,’ [hàsí] ‘bridge,’ [hàsī] ‘end,’ Mandarin (Sino-Tibetan) [máː] ‘mother,’ [mǎː] ‘hemp,’ [màː] ‘horse,’ [mâː] ‘vituperate,’ Mamvu [máàkà] ‘type of seasoning’ vs. [màákà] ‘cat.’ Frequently, tone is involved in the morphology, in particular in languages spoken in Africa, as shown by Makonde [nìndìtálèékà] ‘I cooked’ vs. [nìndàtàlèékà] ‘I will cook’ and Mamvu [áfù] ‘man,’ [àfú] ‘men’ (all examples from Kaji 1999). About half of the languages in the world fall in this class (Maddieson 1978). If they do not use tone to determine the identity of morphemes, they are not tone languages; sometimes non-tone languages are referred to as intonation languages, but this term incorrectly suggests that tone languages do not use tone intonationally.
In nontone languages, tones are used to indicate discourse meanings, like ‘interrogativity,’ ‘finality,’ ‘continuation,’ etc. This is illustrated by English There’s no cutlery: when cutlery is spoken with a low-to-high rise, the sentence is meant as a question; with falling-rising pitch it could either be a reminder or again a question, while a high-to-low fall signals a statement of fact. Many tone languages use tone in this intonational way, like Japanese, which distinguishes a L-tone in statements from a H-tone in questions utterance-finally. By contrast, some languages, like the Western Aleut dialect of Unangan, would appear to restrict their tonal structure to the insertion of specific initial and final boundary tones at the beginning and end of every intonational phrase, and perhaps one more at the end of the utterance. Here, the only function of tone is to mark the phrasing. There are thus widely varying functional claims made by the phonologies of different languages on pitch variation. Not all acoustic variation, however, is due to differences in phonological representation. Speakers of all languages will also use pitch variation to signal universal meanings. Variations in overall pitch range, for example, may signal such general meanings as surprise, interest, emphasis, etc., many of which have an ethological origin (Ohala 1994). A phonological typology of tone might be based on tonal density: how many locations are specified for tone, that is, have tonal associations? In other words, languages differ in the selection of the tone bearing unit: in the ‘densest’ case they specify every mora for tone, and in the sparsest case they just mark the phrasing. When more than one tone is associated with a syllable, like HL or LH, the combination is known as a contour tone, while a single tone is a level tone. After edges of prosodic constituents, stressed syllables rank as a popular location for tone. In (1), all tones are intonational.
The Venlo Dutch sentence (2a) also has intonational boundary tones and an intonational pitch accent (H*), but the second H-tone on the word for ‘play’ is a lexical tone, whose presence makes this word distinct from (the segmentally identical) word for ‘rinse,’ given in (2b). The difference comes out in that the syllable [spAt] has high pitch in (2a) but falling pitch in (2b), where the boundary tone spreads to the second mora of the stressed syllable. Thus, although (2a) and (2b) differ in pitch, they have the same intonation.
(2)
Languages that, like the Dutch dialect of Venlo, restrict the lexical tonal contrasts to one per word are called pitch accent languages. (Observe the different meaning of ‘pitch accent’ from that given earlier.) Most of these use the syllable with word stress for this location (metrical pitch accent languages), but the location may also be arbitrary, as it is in Japanese (lexical pitch accent languages), where the HL-tone complex is located on one of the syllables of accented words. (Unaccented words are unaccented because they, again arbitrarily, do not have this tone complex on any of their syllables.) In Japanese, the accented syllable has none of the phonetic characteristics of stressed syllables (Beckman 1986). Metrical pitch accent languages can also be found in Europe (Norwegian, Swedish, Central Franconian dialect of German, Limburgian dialect of Dutch, Serbocroatian dialects, Basque dialects, Lithuanian; cf. van der Hulst 1999). The number of tonally specified locations is usually greater than the number of tones. For instance, moras may be specified by a spreading tone: when a tone spreads (left or right), it specifies more than one tone bearing unit. Also, suffixes may have a tone with the opposite value from the tone of the stem (polar rule), etc. Moras, syllables, or words that are unspecified for tone are pronounced with a pitch that results from an interpolation between the two pitches specified at the nearest locations to the right and to the left. If a language just has an initial L-tone and a final H-tone in the intonational phrase, for instance, the pitch in the phrase may begin low, and rise towards the end. In fact, the pitches indicated in the unaccented Japanese word for ‘end’ are due to the interpolation between an initial LH tone complex of the accentual phrase and a final L-tone of the utterance, and both pitches are thus due to intonational tones, the mid pitch on the second syllable being a curtailed interpolation between H and L. A second typological question concerns the number of contrasting tones.
While most languages, and most probably all nontonal languages, contrast H with L, some languages have three or even four contrasting level tones. Both the prosodic hierarchy and the autosegmental tonal structure are nonlinear forms of representations, differing radically from the linear segment string given at the beginning of Sect. 1.
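The interpolation that supplies pitch to tonally unspecified units can be sketched numerically. The Python example below is a rough illustration only: it assumes straight-line interpolation between neighbouring targets and a flat pitch outside the outermost targets, whereas real contours need not be linear, and the pitch values in Hz are invented for the example. The function name interpolate_pitch is hypothetical.

```python
def interpolate_pitch(targets):
    """Fill in pitch for tonally unspecified units.

    `targets` holds one entry per tone bearing unit: a pitch value for a
    unit with a tonal target, or None for an unspecified unit.
    Assumptions of this sketch: interpolation is linear, and units before
    the first or after the last target copy that target's pitch.
    """
    specified = [i for i, pitch in enumerate(targets) if pitch is not None]
    if not specified:
        raise ValueError("at least one unit must carry a tonal target")

    result = list(targets)
    first, last = specified[0], specified[-1]
    for i in range(first):                      # units before the first target
        result[i] = targets[first]
    for i in range(last + 1, len(targets)):     # units after the last target
        result[i] = targets[last]

    # Straight line between each pair of neighbouring targets.
    for left, right in zip(specified, specified[1:]):
        step = (targets[right] - targets[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = targets[left] + step * (i - left)
    return result


# An initial L target (100 Hz) and a final H target (160 Hz), with two
# unspecified units in between: the pitch rises gradually across the phrase.
print(interpolate_pitch([100, None, None, 160]))
```

With an initial low target and a final high target, the two unspecified units receive intermediate values, which corresponds to the phrase that begins low and rises towards the end.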
3. The Expression of Focus Languages frequently exploit prosodic structure for the expression of focus, the marking of information status in sentences. Languages appear to vary in the kind of information status which is marked (counter-presupposition, counter-assertiveness, topichood), but a common operationalization of focus is the answer to a question. Thus, if Sue doesn’t like cheese is an answer to Who doesn’t like cheese?, the focus is Sue, since it expresses the information the question seeks to elicit,
while the rest of the sentence is outside the focus. However, if it is an answer to Why are you changing the menu?, the whole sentence is focus. The latter type is called broad focus, sentences with unfocused constituents having narrow focus (Ladd 1996). Prosodic encoding of focus distribution may take several forms. One is phrasing, in which case the beginning or end of a medium-rank prosodic constituent coincides with, respectively, the beginning or end of the focus constituent (e.g. Japanese, Bengali). Some languages use different types of intonational pitch accent for focused and non-focused parts of sentences (e.g., Italian), while others deaccent non-focused parts (Germanic), or de-accent and use different pitch accents for broad and narrow focus (Bengali) (Hayes and Lahiri 1991). Lastly, accentuation may signal other meanings than focus. If sentence (1) were to lose its accent on spoil, it would lose its implicit conditional status (‘If there are too many cooks …’) and imply that the number of cooks that actually end up spoiling the broth is too large. See also: Linguistics: Overview; Phonology; Speech Perception; Speech Production, Psychology of; Speech Recognition and Production by Machines
Bibliography Beckman M E 1986 Stress and Non-Stress Accent. Foris, Dordrecht, The Netherlands Bolinger D 1986 Intonation and Its Parts: Melody in Spoken Language. Stanford University Press, Stanford, CA Goldsmith J 1976 Autosegmental phonology. Doctoral dissertation, MIT Gussenhoven C 1991 The English Rhythm Rule as an accent deletion rule. Phonology 8: 1–35 Hayes B 1989 The prosodic hierarchy in meter. In: Kiparsky P, Youmans G (eds.) Rhythm and Meter. Academic Press, Orlando, FL, pp. 201–60 Hayes B 1995 Metrical Stress Theory. University of Chicago Press, Chicago Hayes B, Lahiri A 1991 Bengali intonational phonology. Natural Language and Linguistic Theory 9: 47–96 Hulst H van der 1999 Word Prosodic Systems in the Languages of Europe. Mouton de Gruyter, Berlin Hyman L M in press Tone systems. In: Haspelmath M, König E, Oesterreicher W, Raible W (eds.) Language Typology and Language Universals: An International Handbook. Mouton de Gruyter, Berlin Kaji S 1999 (ed.) Proceedings of the Symposium on Crosslinguistic Studies of Tonal Phenomena: Tonogenesis, Typology, and Related Topics. Institute for the Study of Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies Ladd D R 1996 Intonational Phonology. Cambridge University Press, Cambridge, UK Lehiste I 1970 Suprasegmentals. MIT Press, Cambridge, MA Maddieson I 1978 Universals of tone. In: Greenberg J (ed.) Universals of Human Language, Vol. 2: Phonology. Stanford University Press, Stanford, CA, pp. 335–65
Nespor M, Vogel I 1986 Prosodic Phonology. Foris, Dordrecht, The Netherlands Ohala J J 1994 The frequency code underlies the sound-symbolic use of voice pitch. In: Hinton L, Nichols J, Ohala J J (eds.) Sound Symbolism. Cambridge University Press, Cambridge, UK Pierrehumbert J, Beckman M E 1988 Japanese Tone Structure. MIT Press, Cambridge, MA Selkirk E O 1978 On prosodic structure in relation to syntactic structure. In: Fretheim T (ed.) Nordic Prosody, Vol. 2. TAPIR, Trondheim, Norway, pp. 111–40 Sluijter A 1995 Phonetic Correlates of Stress and Accent. Foris, Dordrecht, The Netherlands
C. Gussenhoven
Supreme Courts Supreme courts are courts of final appeal within national judicial systems. There is a degree of ambiguity about which institutions properly may be called both a ‘court’ and ‘supreme.’ Civil law countries have, effectively, three (or more) separate court systems covering different areas of the law, with separate institutions at the head of each, of which only the supreme civil court is called a ‘court.’ Federal systems commonly have, along with a national supreme court, separate supreme courts for their states or provinces. The development of supranational courts (e.g., the European Court of Justice), further complicates the matter, as these have become the courts of final appeal on some matters. Among these many institutions that arguably function, in one way or another, as ‘supreme courts,’ this article focuses on national, not supranational or sub-national, courts, and on courts whose cases arrive principally by way of appeal from lower courts. Over the latter part of the twentieth century scholarly interest in these courts has grown significantly due to several developments. The growing capacity of rights-advocacy and other organizations to pursue sustained litigation, growing complaints about bureaucratic abuses, and a weakening of centralized governmental power have encouraged many supreme courts, in a wide array of countries, to intervene increasingly frequently and actively in the policy process (Epp 1998, Tate and Vallinder 1995). At the same time, many observers have contended that strong and independent supreme courts play a key role in fostering the conditions for constitutional democracy, the rule of law, and the protection of individual and minority rights (Henkin and Rosenthal 1990, Stotzky 1993), as well as property rights and economic development. These are heavy burdens to bear, and the capacity of supreme courts for achieving a wide range of policy goals is often exaggerated. 
Although supreme courts are not without significant influence, their power is constituted and thus also limited by their countries’ particular political, social, and economic structures. Nonetheless, the conditions favoring active policy intervention by supreme courts show no signs of abating, and scholarly attention to them is likely to increase.
1. Institutional Structure
1.1 The Main Types Supreme courts take a wide variety of forms, and are related to other political institutions in a wide variety of ways, but nonetheless fall within two main types reflecting the broad division between the common law and civil law traditions (see Legal Systems, Classification of; Common Law; Civil Law). In common-law countries, a single national supreme court has general jurisdiction over a wide range of cases in virtually all areas of the law arising in lower courts. These courts are frankly recognized as law-making bodies (through their development of caselaw), and traditionally have been valued for their roles in checking executive discretion and lending coherence and consistency to their country’s caselaw. They exercise these roles almost exclusively in cases arising out of concrete disputes between two or more parties; the losing party in a lower court may appeal the decision to a superior court, and a small proportion of these appeals ultimately reaches the country’s supreme court where review, in theory, is confined to disputes over law (and not facts) (see Appeals: Legal). This model is found today in countries throughout the former British empire, particularly in the United States, Canada, Australia, India, and in Africa, particularly Ghana, Kenya, Nigeria, as well as South Africa. In some common-law countries, particularly the USA and countries following its model, the supreme court possesses the power of judicial review to strike down legislation as unconstitutional, but typically only incidentally to resolving disputes between two or more parties (a system known as ‘concrete review’) (see Judicial Review in Law). In other common-law countries (those with parliamentary sovereignty, particularly Britain) the supreme court has no such power but nonetheless exercises relatively broad authority to check administrative discretion.
Following the enormous symbolic influence of the US Supreme Court’s attack on racial segregation and its creation of new individual rights in the 1950s and 1960s, some supreme courts in common-law countries, particularly Canada and India, shifted markedly from the British to the American model by increasingly exercising the power of judicial review, particularly in the area of individual rights (Epp 1998; Knopff and Morton 1992). Under the growing influence of European law, even the Appellate Committee of the House of Lords,
Britain’s supreme court, has made decisions that appear to be edging toward a frank exercise of judicial review, a trend that is likely to be reinforced as Britain’s new bill of rights gains authority. Civil law systems, by contrast, have been heavily influenced by the French Revolution’s efforts to limit the power of courts and to foreclose the development of an independent, powerful, and active judiciary headed by a single supreme court. Consequently, the supreme judicial power in many civil law countries is divided into three (or more) distinct institutions, each with a specialized jurisdiction (Jacob et al., 1996, Merryman 1985). Thus, in France, the ordinary judiciary, which is headed up by the Court of Cassation, traditionally has been strictly forbidden from exercising judicial review or in any other way challenging legislative or administrative acts. The value of some mechanism for checking administrative acts was quickly recognized, however, resulting in the creation of a separate administrative court, the Contentious Section of the Council of State (hereafter, the Council of State), which now functions as the supreme court for the administrative court system. With the growing interest in constitutionalism after World War II, many civil law countries added a separate constitutional court with the power of judicial review over legislation, reflecting the belief that the ordinary supreme court was insufficiently statesmanlike, too conservative, or too tied to a prior regime, to give meaningful effect to constitutional provisions (Cappelletti 1989, p. 145, Favoreu 1990, Merryman and Vigoriti 1967) (see Constitutional Courts).
In countries in which the constitutional court decides challenges to existing laws, that court and the ordinary supreme court have at times developed sharply conflicting interpretations of the validity of particular laws, which has led to significant tensions between these courts (see e.g., Krug 1997, Merryman and Vigoriti 1967). The civil law model, with multiple courts of last resort, has been adopted throughout much of Europe, many parts of Africa, the Middle East, and Latin America. The strength of the pressures favoring active supreme court participation in the policy process is evident in the history of the French system. The French Court of Cassation originated as a legislative tribunal with power only to quash incorrect judicial interpretations of law, but over time it adopted a forthright appellate function and eventually was formally converted into the ordinary court system’s supreme court; it has since engaged in significant judicial policy-making (Merryman 1985; for parallels in Britain, see Stevens 1978). Similarly, although the Council of State is not a part of the regular judiciary, it has evolved to a position of great independence within the administrative branch, adjudicates the legal claims of individuals against the state bureaucracy, exercises judicial review over administrative regulations (which constitute a substantial proportion of French law), and has served as a model for supreme
courts elsewhere (Brown and Bell 1993, Hill 1993, Merryman 1985). Although formally there is no doctrine of stare decisis in civil law systems, in practice the decisions of the Court of Cassation and the Council of State gain wide adherence in lower courts and constitute a body of ‘jurisprudence’ that is similar in many ways to case law (Merryman 1996). The institutional structures of supreme courts in Latin America mix elements of the US and French models (see e.g., Barker 1986). Thus, most Latin American countries have a single supreme court with the power of judicial review and with general jurisdiction over appeals from a variety of ordinary courts, but many deny to the inferior courts the power of judicial review, and even the supreme court’s power of judicial review in most of these countries is limited to suspending the application of a law in a particular case, not invalidating the law itself. Some countries in the region follow the French model by denying the supreme court jurisdiction over the administrative courts; some countries’ supreme courts hear constitutional issues that are referred by the legislature; and Colombia has both a regular supreme court and a separate constitutional court.
1.2 Agenda-setting, Workload, and Internal Structure

The workload of many supreme courts has increased tremendously in recent years, providing significant opportunities for creative judicial policy-making (McWhinney 1986). To better manage rising caseloads, some countries have created intermediate appellate courts and have granted their supreme court significant discretion over which cases to decide. Such reforms acknowledge and enhance the policy-making role of the supreme court, and courts that have this discretion typically use it to expand their policy-making activities by building a coherent agenda that commonly has focused on constitutional issues (Kagan et al. 1977, Pacelle 1991). Granting the supreme court discretion over its judicial agenda is more typical of common law than civil law systems, as appellate review traditionally has been viewed in the former as an extraordinary resort, whereas in the latter it is viewed as a regular aspect of the judicial process (Damaska 1986, Shapiro 1981). Countries that have not granted discretion over case selection to their supreme courts (for instance, India and France) are virtually forced by rising caseloads to expand the number of judges on the court; this is often accompanied by a division of the court into several panels (or ‘chambers’ or ‘benches’) specializing in different areas of the law.

Supreme courts differ, too, in how decisions are issued. The decisions of common-law supreme courts typically report significant disagreements among the deciding justices. Some, like the supreme courts of Canada and the United States, issue a single majority opinion along with dissenting opinions; others report separate (‘seriatim’) opinions written by each of the justices. Civil law courts, by contrast, present a public face of consensus that conceals any politically motivated disputes among the judges. Modes of publication differ as well: generally, the larger the court’s caseload, the smaller the percentage of decisions that is published. Although the most important decisions typically are published, a body of research shows that in lower courts decision-making patterns in published and unpublished decisions differ significantly.

These various internal structures provide senior supreme court justices with different opportunities and constraints for manipulation of the decision agenda. Thus, in courts that are divided into separate panels, the chief justice may influence outcomes in particular cases through exercise of authority to assign justices and cases to particular panels. In courts that meet as a whole, as in the USA, the chief justice or the senior justice in the majority similarly may influence case outcomes by exercising authority to assign the responsibility for writing the majority opinion (see e.g., Epstein and Knight 1998, pp. 125–35).
2. Supreme Courts in the Political Process

2.1 Political Influences on Supreme Courts and their Legitimacy

Supreme courts are political institutions and are influenced by the political process; but the ways in which they are political, and the nature and paths of political influence, vary considerably. Although countries have adopted a wide array of constitutional and statutory provisions aimed at striking a balance between judicial independence from undue political manipulation and judicial responsiveness to widespread popular values, the effects of these measures are strongly shaped by several key factors related to the broader structure and distribution of political and organizational power in each society.

First, the more centralized is governmental power, particularly in the executive, the greater is the degree of direct manipulation of the supreme court by officials in these other institutions. At one extreme, military coups or the exercise of emergency powers by governing regimes are commonly associated with sharp limitations on the authority of a country’s supreme court and direct manipulation of the court by the regime (Tate 1993). At the other extreme, in countries in which governing power is greatly divided either by a federal structure or an American-style separation of powers, or both, the paths of influence on the supreme court are more attenuated, affording the supreme court greater room for maneuver. Between these ends of the continuum lie many intermediate instances. The supreme courts of Latin American republics and Japan, where political power is highly centralized, rarely challenge governmental policies or intervene in the policy process (Itoh 1989, Stotzky 1993, Verner 1984). Nonetheless, even supreme courts in pluralistic systems (e.g., the United States and Canada) rarely challenge directly the key policies favored by the national governing coalition.

Second, political influences on supreme courts are conditioned by the degree of pluralistic conflict over selection of justices and the degree of diversity in the pool from which justices are drawn (see Judicial Selection). At one extreme, in a pattern more typical of common law than civil law countries, are courts whose appointment processes are subject to open political conflict between competing factions and whose justices are drawn from a relatively diverse population that includes politicians, government officials, and law professors. Thus, in the United States, the constitutional requirement for Senate confirmation of nominations made by the President has afforded the opportunity for open, pluralistic conflict over appointments to the Supreme Court (Maltese 1995). At the other extreme are systems that insulate appointment processes from popular pressure and which draw judges from an insular elite, a pattern typical of civil law countries. In some civil law countries, the judiciary is largely self-regulating; judges are selected by competitive examination after receiving specialized legal training, and they advance through the judicial hierarchy on the basis of seniority and the approval of superior judges. In other civil law countries, the executive dominates judicial recruitment and promotion to the supreme courts. Often the two factors are found together, resulting in highly insular supreme courts.
For instance, in Chile the Supreme Court annually evaluates and ranks all of the country’s judges, and appointments to the Court are made by the executive from a short list of nominees selected by the Court itself from among the lower judiciary; the system discourages the development of diversity of opinion and creativity within the judicial ranks, and, as a consequence, the Court has remained very quiescent over most of its history (Stotzky 1993, Vaughn 1992–93). Although the differences between civil law and common law patterns of appointment to supreme courts thus loom large, the dichotomy does not account for all variations. In the common-law world, the process of appointment to the US Supreme Court is notoriously open and conflictual, while for the British higher judiciary it is highly insular. In France, judges on the Court of Cassation are drawn exclusively from inferior courts; by contrast, members of the Council of State are drawn not only from inferior administrative tribunals but also from the administrative bureaucracy and academia (Brown and Bell 1993, pp. 78–80). In general, the insularity of the recruitment process in continental Europe appears to be gradually eroding, with potentially significant implications for the judicial role.

Third, the degree of popular legitimacy of supreme courts and the nature and diversity of external coalitions of support for them place constraints on direct manipulation of these courts by other governmental officials. Some supreme courts have gained significant popular legitimacy, while others remain relatively unsupported; judicial legitimacy, ironically, may be cultivated by courts themselves through active intervention in the policy process that attracts external coalitions of support (Gibson et al. 1998). The ability of courts to cultivate supporters, however, is constrained by the extent of litigational resources and organizational pluralism in civil society; strong societies composed of numerous litigation-advocacy organizations are a key condition for the development of active judicial agendas (Epp 1998). Where coalitions of support are diverse and politically strong, they have played key roles in supporting supreme courts against measures aimed at curbing their power. This is clear, for instance, in comparing the limited success of attacks on the US Supreme Court prior to 1937 (Ross 1994) with the Gandhi government’s relatively greater influence over the Supreme Court of India in the 1970s (Baxi 1980, Epp 1998). In general, judges in common-law countries tend to be accorded relatively high status and respect, while the civil law tradition has cultivated a lasting and deeply entrenched disrespect for the judicial role and for judges (Merryman 1996).

Although the various factors just summarized are conceptually distinct, they are mutually reinforcing in practice.
Governing systems in which political power is decentralized have tended to accord greater symbolic status to their judges, particularly to their supreme court justices; have tended to develop pluralistic and even conflictual appointment processes that select justices from a diverse population; have tended to cultivate broad coalitions of support around their supreme courts; and have tended to have supreme courts that are relatively actively engaged in the policy process. By contrast, governing systems in which political power is centralized have tended to accord their judges, and even their supreme court justices, the lowly status of legal bureaucrats; have tended to select these justices through an executive-dominated process from highly insular pools of legal professionals; and, as a consequence, these courts have developed highly formalistic decision processes that yield politically acquiescent decisions and which rarely cultivate support in the broader population. The latter extreme is exemplified by the insular judicial bureaucracies of Latin America that have few sources of popular support, a pattern that is surely compounded by the effects of economic inequality on the degree of organizational pluralism in civil society.
2.2 The Growth in Policy Intervention by Supreme Courts, and the Limits of their Power

Supreme courts, particularly in governing systems in which political power is not highly centralized, have increasingly intervened in the policy process in recent decades, in areas ranging from individual rights to environmental policy. Judges in active policy-making supreme courts commonly have developed relatively high degrees of individuality, and, consequently, these courts are subject to relatively open internal conflict that has made possible the quantitative study of decision-making patterns. Several partially conflicting strains have developed in these studies. One, called the ‘attitudinal model,’ posits that justices’ votes directly reflect their political attitudes toward judicial policies; this theory, although originally based on studies of the US Supreme Court (Segal and Spaeth 1993), has now been applied elsewhere as well (Morton et al. 1995). A second strain of research, sometimes labelled ‘new institutionalism,’ rejects the claim that justices’ decisions directly reflect their policy preferences (Clayton and Gillman 1999). One variant, influenced by rational choice theory, posits that in attempting to achieve their policy goals, supreme court justices temper the raw exercise of their preferences through strategic action that takes into account the preferences of other actors (and thus also of certain institutional norms) (Epstein and Knight 1998). A second variant of new institutionalism emphasizes the influence of legal norms: these studies show that many decisions, particularly in the agenda-setting process, are constrained by legal factors and are reached unanimously (Perry 1991), and that legal assumptions establish the foundations and boundaries of judicial decision-making and thereby constrain the choices recognized by the justices (Gillman 1993).
The extent to which supreme courts are capable of effecting substantial changes in governmental policy, too, remains a matter of dispute. It is clear that supreme courts have no enforcement authority of their own and thus depend for implementation of their decisions on the actions of public officials, private individuals, and organizations. On this basis, Gerald Rosenberg (1991) has argued that policy reform decisions of the US Supreme Court have led to actual reforms only in very limited circumstances, particularly when they are aided by sympathetic officials or a sympathetic population, or are subject to implementation by market forces. Michael McCann (1994), however, has shown that judicial decisions provide strategic resources that may be used by social movements, organized groups, and public officials to substantial effect (albeit often in ways unanticipated by the judiciary) (see Law, Mobilization of). In some countries, indeed, particular supreme court decisions undoubtedly have led to substantial policy changes. In the USA, time-series analyses have revealed that major decisions by the US Supreme Court have significantly influenced news media attention to particular issues (Flemming et al. 1997). Taken as a whole, the increasing policy intervention of supreme courts, and of courts in general, has contributed to the ‘judicialization’ or ‘legalization’ of policy and administrative processes (see e.g., Tate and Vallinder 1995).
3. The State of Social Scientific Research on Supreme Courts

Research on supreme courts has advanced remarkably in the late twentieth century. Scholars have developed sophisticated quantitative databases of decision-making patterns and of relations between supreme courts and some aspects of governmental systems and societies at large. Additionally, research agendas are increasingly based on social scientific theories and thus increasingly attempt to explain rather than merely to describe. Yet important gaps remain. The cognitive revolution in psychology has as yet been largely ignored in studies of supreme court decision-making. Scholarly attention, moreover, is often directed exclusively to courts’ internal decision-making processes even though it is widely recognized that judicial policies are commonly transformed in the implementation process. Most of the existing social scientific research on supreme courts has focused on a handful of highly active courts, particularly those of the United States and Canada, while many other courts, particularly the supreme courts of cassation in the civil-law world, have been largely ignored. Finally, although research on supreme courts in many parts of the world has expanded in the 1990s, only a few truly comparative studies have been done. Nonetheless, our understanding of supreme courts has grown. The body of social scientific research suggests both that supreme courts may contribute to the broader goals that many observers in recent years have pinned on them (the protection of individual and minority rights and the cultivation of constitutional norms more generally) and that the extent to which they support these goals is conditioned by a range of factors, particularly the degree to which political power in the governing system is centralized, and the degree to which these courts enjoy popular legitimacy and are supported by coalitions of organizations in civil society.
See also: Civil Law; Constitutional Courts; Courts and Adjudication; Judicial Review in Law
Bibliography

Barker R S 1986 Constitutional adjudication in Costa Rica: a Latin American model. Inter-American Law Review 17: 249–74
Baxi U 1980 The Indian Supreme Court and Politics. Eastern Book Company, Lucknow, India
Brown L N, Bell J S 1993 French Administrative Law, 4th edn. Clarendon, Oxford, UK
Cappelletti M 1989 The Judicial Process in Comparative Perspective. Clarendon, Oxford, UK
Clayton C W, Gillman H 1999 Supreme Court Decision-making: New Institutionalist Approaches. University of Chicago Press, Chicago
Damaska M R 1986 The Faces of Justice and State Authority: A Comparative Approach to the Legal Process. Yale University Press, New Haven, CT
Epstein L, Knight J 1998 The Choices Justices Make. CQ Press, Washington, DC
Epp C R 1998 The Rights Revolution: Lawyers, Activists, and Supreme Courts in Comparative Perspective. University of Chicago Press, Chicago
Favoreu L 1990 American and European models of constitutional justice. In: Clark D S (ed.) Comparative and Private International Law: Essays in Honor of John Henry Merryman on His Seventieth Birthday. Duncker and Humblot, Berlin, pp. 105–36
Flemming R B, Bohte J, Wood B D 1997 One voice among many: the supreme court’s influence on attentiveness to issues in the United States, 1947–1992. American Journal of Political Science 41: 1224–50
Garcia F S 1998 The nature of judicial reform in Latin America and some strategic considerations. American University International Law Review 13: 1267–325
Gibson J L, Caldeira G A, Baird V A 1998 On the legitimacy of national high courts. American Political Science Review 92: 343–58
Gillman H 1993 The Constitution Besieged: The Rise and Demise of Lochner Era Police Powers Jurisprudence. Duke University Press, Durham, NC
Henkin L, Rosenthal A 1990 Constitutionalism and Rights: The Influence of the United States Constitution Abroad. Columbia University Press, New York
Hill E 1993 Majlis al-Dawla: The administrative courts of Egypt and administrative law. In: Mallat C (ed.) Islam and Public Law: Classical and Contemporary Studies. Graham and Trotman, London, pp. 207–28
Itoh H 1989 The Japanese Supreme Court: Constitutional Policies. Wiener, New York
Jacob H, Blankenburg E, Kritzer H M, Provine D M, Sanders J 1996 Courts, Law and Politics in Comparative Perspective. Yale University Press, New Haven, CT
Kagan R A, Cartwright B, Friedman L M, Wheeler S 1977 The business of state supreme courts, 1870–1970. Stanford Law Review 30: 121–56
Knopff R, Morton F L 1992 Charter Politics. Nelson Canada, Scarborough, Canada
Krug P 1997 Departure from centralized model: The Russian supreme court and constitutional control of legislation. Virginia Journal of International Law 37: 725–87
Maltese J A 1995 The Selling of Supreme Court Nominees. Johns Hopkins University Press, Baltimore, MD
McCann M 1994 Rights At Work: Pay Equity Reform and the Politics of Legal Mobilization. University of Chicago Press, Chicago
McWhinney E 1986 Supreme Courts and Judicial Law-making: Constitutional Tribunals and Constitutional Review. Martinus Nijhoff, Dordrecht, The Netherlands
Merryman J H 1985 The Civil Law Tradition: An Introduction to the Legal Systems of Western Europe and Latin America, 2nd edn. Stanford University Press, Stanford, CA
Merryman J H 1996 The French deviation. American Journal of Comparative Law 44: 109–19
Merryman J H, Vigoriti V 1967 When courts collide: Constitution and cassation in Italy. American Journal of Comparative Law 15: 665–86
Morton F L, Russell P H, Riddell T 1995 The Canadian charter of rights and freedoms: A descriptive analysis of the first decade, 1982–1992. National Journal of Constitutional Law 5: 1–60
Pacelle R L Jr 1991 The Transformation of the Supreme Court’s Agenda: From the New Deal to the Reagan Administration. Westview Press, Boulder, CO
Perry H W 1991 Deciding to Decide: Agenda Setting in the United States Supreme Court. Harvard University Press, Cambridge, MA
Rosenberg G N 1991 The Hollow Hope: Can Courts Bring About Social Change? University of Chicago Press, Chicago
Ross W G 1994 A Muted Fury: Populists, Progressives, and Labor Unions Confront the Courts, 1890–1937. Princeton University Press, Princeton, NJ
Segal J A, Spaeth H J 1993 The Supreme Court and the Attitudinal Model. Cambridge University Press, New York
Shapiro M M 1981 Courts: A Comparative and Political Analysis. University of Chicago Press, Chicago
Solomon D H 1992 The Political Impact of the High Court. Allen and Unwin, St Leonards, Australia
Stevens R B 1978 Law and Politics: The House of Lords as a Judicial Body, 1800–1976. University of North Carolina Press, Chapel Hill, NC
Stotzky I P (ed.) 1993 Transition to Democracy in Latin America: The Role of the Judiciary. Westview Press, Boulder, CO
Tate C N 1993 Courts and crisis regimes: A theory sketch with Asian case examples. Political Research Quarterly 46: 311–38
Tate C N, Vallinder T (eds.) 1995 The Global Expansion of Judicial Power. New York University Press, New York
Vaughn R G 1992–93 Proposals for judicial reform in Chile. Fordham International Law Journal 16: 577–607
Verner J G 1984 The independence of supreme courts in Latin America. Journal of Latin American Studies 16: 463–506
C. R. Epp
Surprise Examination

Examinations, and tax audits, are not always announced in advance. There are ‘surprise’ examinations. What is more, it seems that an examination can still be a surprise even when it has been announced that it will occur in some specified period of time. There is, however, a line of argument to the conclusion that it is impossible to have a surprise examination when it has been announced ahead of time that it will occur in some specified period of time. This is very hard to believe, which is why the line of argument constitutes a paradox. This article starts by stating the line of argument and various attempts to identify the fallacy. It concludes with a short discussion of some related issues concerning cooperative social arrangements, a matter often raised under the heading of the backwards induction paradox. This article does not discuss certain technical issues to do with self-reference sometimes raised in the context of the surprise examination paradox (for these see, e.g., Shaw 1958).
1. The Paradox

Suppose that on Monday morning the head of a school makes two announcements to the students.
(a) An examination will be held at 14:00 one afternoon this week.
(b) You will not know until shortly before 14:00 on the day which afternoon it is. The examination will, that is, be a surprise one.

The argument to paradox runs as follows. Suppose that the examination is held on Friday. In this case, it will be known in advance when the examination is to be held. For it will be known on Thursday after 14:00 that as the only day left to hold the examination is Friday, the examination has to be on Friday. It follows that the examination cannot be held on Friday consistent with the announced conditions (a) and (b).

Now suppose that the examination is held on Thursday afternoon. After 14:00 on Wednesday, it will be known that the only afternoons left to hold the examination are Thursday and Friday. But Friday has just been ruled out as a possible day to hold a surprise examination, so that leaves Thursday as the only possible day. But then the students will know after 14:00 on Wednesday that the examination has to be on Thursday. But this is to say that Thursday as well as Friday can be ruled out as a possible day for a surprise examination.

Suppose next that the examination is held on Wednesday. On Tuesday, after 14:00, it will be known that the only afternoons left for the examination are Wednesday, Thursday, and Friday. Thursday and Friday are not possible days for a surprise examination, by the reasoning just given, which leaves Wednesday as the only possible day. But then, if the examination is held on Wednesday, it will be known in advance that it is on Wednesday: Wednesday can, that is, be ruled out as a possible day for a surprise examination. Repeating the same reasoning rules out Monday and Tuesday.
It follows that it is impossible to hold an examination under the announced conditions: announcing that the examination will be held in a given period, and that it will be a surprise, makes it impossible for it to be a surprise. There is clearly a fallacy somewhere in this reasoning, but it has proved hard to say exactly where it goes wrong. This is part of the reason the paradox has generated a large literature (see Kvanvig 1998).
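The elimination argument just stated can be replayed mechanically. The following Python sketch is only an illustration of the students' reasoning, not part of the original argument; the names `DAYS` and `eliminate` are hypothetical. It assumes that a day is struck out exactly when it is the last day the students still regard as possible, since an examination held then would be known in advance.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]


def eliminate(days):
    """Replay the students' backward elimination.

    Returns (possible, struck): the days still regarded as possible
    surprise days, and the days struck out, in the order struck.
    """
    possible = list(days)
    struck = []
    changed = True
    while changed:
        changed = False
        for day in possible:
            # Days that would still be open on the evening before `day`,
            # given that no examination has been held so far.
            still_open = [d for d in possible
                          if days.index(d) >= days.index(day)]
            if still_open == [day]:
                # `day` is the only day left, so an examination held
                # then would be predictable: strike it out and start over.
                possible.remove(day)
                struck.append(day)
                changed = True
                break
    return possible, struck


possible, struck = eliminate(DAYS)
print("Struck out, in order:", struck)
print("Days left for a surprise examination:", possible)
```

Under these assumptions the loop strikes out Friday first and Monday last, leaving no day at all, which is the paradoxical conclusion. The sketch reproduces the argument faithfully; it does not show where the fallacy lies.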
2. Replies that Draw on the Possibility of Doubt About (a) or (b)

2.1 Uncertainty About (a): the Examination Being Held at All

In most discussions of the paradox, it is granted that the examination cannot be held on the last day and still be a surprise. The focus of the discussion is on where the argument goes wrong after the step where Friday is excluded. However, as Quine (1953) points out, the matter is not that simple. If there is some doubt about whether or not the conditions set by the head will be satisfied, the argument that rules out the last day can fail. In particular, if there has been no examination by late Thursday afternoon and the students find themselves, perhaps as a result of this fact or perhaps for some other reason, in doubt about whether condition (a) will be satisfied, about whether, that is, there will be an examination at all in the week in question, they do not know that there will be an examination on Friday. There may be, for all they know, no examination at all. In this case, if the head proceeds to hold the examination on the Friday at 14:00, both conditions set down will be satisfied. The examination will have been held at 14:00 on one day in the week in question (which satisfies condition (a)), and the students will not have known in advance which day it was to be (which satisfies condition (b)).

However, as many commentators point out, the students might be in a position to be certain that the examination will be held in the week in question. Perhaps they know that holding the examination some time during that week is an unshakable prerequisite for government registration of the school. However, it seems that it would still be possible that the examination should be a surprise. There must, therefore, be something more to say about where the argument to the conclusion that a surprise examination is impossible goes wrong.
The mistake cannot simply be the possibility of doubt about the examination being held in the relevant period, because the paradox survives the removal of that doubt.
2.2 Uncertainty About (b): the Examination Being a Surprise

Suppose that it is certain that the examination will be held in the week in question, in which case the students can rule out Friday as a possible day for a surprise examination. This is consistent with it being a matter of doubt whether or not the examination will be a surprise one. Now suppose that there is doubt about this condition, (b), being satisfied. Wright and Sudbury (1977) point out that this undermines the students’ reasoning on Thursday morning in the case where Monday through Wednesday prove to be exam-free. The students cannot argue that Friday is excluded as a possible day for the examination to be held on the ground that then they would know in advance when the examination was to be held. For, if there is doubt about (b), they must allow that maybe the examination will not be a surprise. They must, therefore, allow that there are two possible days for the examination to be held: Friday, in which case (b) will be violated, or Thursday. In this case, the head can hold the examination on Thursday consistent with (a) and (b). This is because the students cannot, on Thursday morning, rule out Friday owing to doubt about the examination being a surprise. In effect, the head can exploit doubt about the examination being a surprise (which leaves Friday as a possible day for the examination in the students’ minds) to ensure that it is a surprise by holding it on Thursday.

In sum, doubt about (a) would make it possible for the examination to be held on Friday consistent with (a) and (b). Doubt about (b), even in the absence of doubt about (a), would make it possible to hold the examination on Thursday consistent with (a) and (b). The obvious question to ask at this point is what happens when there is no doubt about either (a) or (b). Could it not be the case that it is certain that the examination will be held on one afternoon of Monday through Friday, and certain that the students will not know in advance which afternoon it is? Isn’t there still a paradox to be addressed? In order to address this question, it is necessary to draw a distinction between whether there is doubt on some matter and whether there would be doubt on some matter were certain things to happen.
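The effect of the two kinds of doubt summarized above can be illustrated with a small toy model in Python. This is a sketch under stated assumptions, not an analysis drawn from the literature cited: the function `surprise_days` and the flags `certain_a` and `certain_b` are hypothetical names. It assumes that the students can predict a day only if they are certain of (a) and that day is the last one they still regard as open, and that they may strike a predictable day out of consideration only if they are also certain of (b).

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]


def surprise_days(days, certain_a, certain_b):
    """Days on which an examination would surprise the students.

    certain_a: the students are certain the examination will be held.
    certain_b: the students are certain it will be a surprise.
    """
    candidates = list(days)   # days the students regard as open
    surprising = []
    for day in reversed(days):
        # Days still open on the evening before `day`, no exam so far.
        still_open = [d for d in candidates
                      if days.index(d) >= days.index(day)]
        # Predictable only if an examination is certain to happen and
        # `day` is the sole remaining open day.
        predictable = certain_a and still_open == [day]
        if predictable:
            if certain_b:
                # Certain of a surprise: strike the predictable day out.
                candidates.remove(day)
            # With doubt about (b) the day stays open, but an
            # examination held on it would not be a surprise.
        else:
            surprising.append(day)
    return surprising[::-1]


print(surprise_days(DAYS, certain_a=False, certain_b=True))
print(surprise_days(DAYS, certain_a=True, certain_b=False))
print(surprise_days(DAYS, certain_a=True, certain_b=True))
```

In this toy model, doubt about (a) leaves every day, Friday included, available for a surprise; doubt about (b) alone leaves Monday through Thursday; and certainty about both leaves no day, which is the case the remainder of the article addresses.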
3. What is Open to Doubt s. What Would be Open to Doubt It is not open to doubt given current evidence that Hitler died in the bunker, but were certain discoveries to be made, it would become open to doubt. Similarly, for most people who are here and now awake, it is not open to doubt that they are here and now awake, but were they to have the experience as of waking up from an extraordinarily life-like dream, it would be open to doubt. What is here and now open to doubt is distinct from what would be open to doubt were certain things to happen. The view that the points made in the previous section do not dispose of the surprise examination paradox arises from the conviction that a surprise examination is possible when it is beyond doubt both that the examination will occur during the week and that the students will not know in advance. How then, runs the challenge, can we dispose of the paradox by pointing out that when there is doubt about (a) or (b), the student’s reasoning fails at some key point or other? However, on careful inspection, the failure of the students’ reasoning does not depend on there actually being doubt about (a) or (b). It depends on it
being the case that there would be doubt were certain things to happen. Provided it is the case that were Friday morning to come without the examination having so far taken place, there would then be doubt about the examination happening in the week in question, that is, about (a), the students cannot rule out the examination being held on Friday. If there would, should Friday morning arrive before the examination makes its appearance, be doubt on that Friday morning about (a), the head can hold the examination on Friday without violating either (a) or (b), as was noted above. This does not require that there be, at any stage, actually any doubt. It requires simply that were Friday morning to arrive without an examination having been held, there would be doubt about (a). A similar point applies to the role of doubt about the examination being a surprise, that is, of doubt about (b). There need not actually be any doubt that the examination will be a surprise. The students’ reasoning would be undermined simply by the fact that there would be doubt should enough days pass without the examination taking place. As was noted above, should Thursday morning arrive without there having been an examination and were this to mean that there would then be doubt about (b), the students cannot rule out Friday even if they are certain of (a). It is not required that there ever actually be any doubt about (b). All that is required is that should certain things come to pass, there would be doubt about (b). It might be suggested that it could be built into the paradox that not only is there no doubt at any stage about (a) and (b), there would be no doubt about them no matter how many days pass without an examination. But this would be to build something absurd into the statement of the paradox. There would have to be doubts about one or both of (a) and (b) should enough days pass without the examination having taken place.
For example, should Friday morning arrive without the examination having taken place, the students would know that one or both of (a) and (b) are open to doubt. It would be inconsistent of them to hold in that eventuality both that it is beyond doubt that the examination will take place on one of the designated days and that they cannot know in advance which it is. For example, if it would still be certain that the examination will take place on one of the designated days should Friday morning arrive without the examination having been held, they would know that should this happen, the examination would not be a surprise (for a longer development, see Jackson 1987).
4. The Backward Induction Paradox 4.1 The Rationality of Cooperation There are parallels between the style of argument in the surprise examination paradox and certain arguments concerning the rationality or otherwise of cooperative social arrangements. Many cooperative arrangements that have both costs and benefits to the involved parties rest on the basis of expected returns in the future to all parties to the arrangement. The Robinsons work with the Smiths on something that advantages the Smiths in the expectation that in the future the Smiths will work with the Robinsons on something that advantages the Robinsons. A classic example is helping with building houses. It is much easier to build a house when there are a number of workers to lift heavy beams and the like. So the Robinsons give of their time to help the Smiths build the Smiths’ house, but do so in the expectation that the Smiths will later help the Robinsons build the Robinsons’ house. But why, in terms of self-interest, should the Smiths later help the Robinsons; they already have their house? The answer is that there will be other opportunities for cooperation, and the Smiths can expect to benefit from some of them. This intuitively appealing answer can be challenged by a ‘backwards induction.’ It will not be rational (in the sense of being advantageous) for the ‘giving’ party to cooperate on the last occasion for possible cooperation, because there is no prospect of their being the ‘receiving’ party in the future. But then, it seems, there is no reason for the giving party to cooperate on the second last occasion either, assuming rationality on the part of both parties. The only reason for the giving party to cooperate on the second last occasion is in the hope of being the receiving party on the last occasion, and that could only happen if the giving party on the last occasion is irrational. Similar reasoning applies to the third last, fourth last, etc. occasions, so yielding the conclusion that it is never rational to be a giving party in a cooperative arrangement.
It would be a mistake to reply to this argument by challenging the self-interest conception of rationality it employs. The puzzle is that it clearly is in parties’ self-interest sometimes to cooperate when they are the givers, in the expectation of being receivers in the future. The error in the argument is that it neglects the kind of epistemological points we saw to be crucial in diagnosing the error in the surprise examination paradox. Although it would be irrational for the giving party to cooperate on the known last occasion for cooperation, normally it is not known when the last occasion for cooperation is. Similarly, there may be doubt about all parties being rational.
4.2 Sequences of Prisoner’s Dilemmas A special version of the backwards induction paradox is sometimes deployed to argue that it is rational to defect throughout a sequence of prisoner’s dilemmas as well as in a one-off prisoner’s dilemma. A prisoner’s dilemma is defined as follows. There are two parties, X and Y. If they both cooperate, they each do better than if they both defect. However, X and Y each do better if they defect, independently of whether or not the other cooperates or defects: defecting dominates cooperating, as it is put (see Resnick 1987). Most game theorists accept that this means it is rational for each to defect despite the fact that if they both defect, they each end up worse off than if they both do the irrational thing of cooperating. What has proved harder to accept is the claim that, in a sequence of prisoner’s dilemmas of known length involving rational players, it would be rational to defect throughout. Surely it would be rational, sooner or later, for X and Y to start cooperating in order to set up an expectation of cooperation which means that, at least sometimes, they end up both cooperating, which is better for each than their both defecting? A backwards induction argument has been used to resist this thought. Clearly, runs the resistance, it is not rational to cooperate at the last prisoner’s dilemma: engendering expectations of future cooperation has no pay off in that case. But cooperation at the second last prisoner’s dilemma is only rational if it leads to cooperation at the last prisoner’s dilemma, so this means that cooperation at the second last cannot be rational unless it leads to irrationality at the last prisoner’s dilemma. But that is to say that if X and Y are steadfastly rational, defection is rational at the second last prisoner’s dilemma as well as at the last. Repeating the reasoning delivers the result that it is rational to defect throughout the sequence of prisoner’s dilemmas. A full discussion of this last backwards induction argument is beyond the scope of this article (but see Pettit and Sugden 1989 and Bicchieri 1998).
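The dominance claim and the backwards induction argument can be made concrete with a small sketch. The payoff numbers below are hypothetical (any payoffs with the prisoner's dilemma structure would do), and the function names are illustrative only. The sketch assumes, as the argument itself does, that the number of rounds is known and that both players are rational, so that play in later rounds does not depend on what happens earlier.

```python
from typing import Dict, List, Tuple

Payoffs = Dict[Tuple[str, str], Tuple[int, int]]

# Hypothetical payoffs (higher is better): PAYOFF[(x_move, y_move)] = (to X, to Y).
PAYOFF: Payoffs = {
    ("C", "C"): (3, 3),  # both cooperate
    ("C", "D"): (0, 5),  # X cooperates, Y defects
    ("D", "C"): (5, 0),  # X defects, Y cooperates
    ("D", "D"): (1, 1),  # both defect
}


def dominant_move(payoff: Payoffs, player: int) -> str:
    """Return the strictly dominant move for player 0 (X) or 1 (Y)."""
    def pay(own: str, other: str) -> int:
        key = (own, other) if player == 0 else (other, own)
        return payoff[key][player]

    for move, alternative in (("C", "D"), ("D", "C")):
        # A move dominates if it pays more whatever the other party does.
        if all(pay(move, o) > pay(alternative, o) for o in "CD"):
            return move
    raise ValueError("no strictly dominant move")


def backward_induction(rounds: int, payoff: Payoffs):
    """Solve a sequence of known length from the last round back to the first."""
    plan: List[Tuple[str, str]] = []
    future = (0, 0)  # nothing is at stake after the last round
    for _ in range(rounds):
        # Later play is already fixed, so it adds the same constant to every cell.
        effective = {k: (v[0] + future[0], v[1] + future[1]) for k, v in payoff.items()}
        moves = (dominant_move(effective, 0), dominant_move(effective, 1))
        plan.append(moves)
        future = effective[moves]
    plan.reverse()  # report the rounds in playing order
    return plan, future


plan, totals = backward_induction(5, PAYOFF)
print(plan, totals)
```

With these payoffs the induction selects mutual defection in every round, although mutual cooperation throughout would pay each party more. The result rests entirely on the fixed, known horizon and on the assumption that earlier moves cannot change later play, which are exactly the assumptions the text goes on to question.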
But it should be noted, first, that belief in rationality, or lack of it, is crucial to the argument, and, second, that if cooperating at early stages in a sequence of prisoner’s dilemmas sets up certain expectations, these early stages will not strictly speaking be prisoner’s dilemmas, for taking into account the longer term benefits means that they will not be cases where defecting dominates cooperation. See also: Prisoner’s Dilemma, One-shot and Iterated
Bibliography
Bicchieri C 1998 Decision and game theory. In: Craig E (ed.) Routledge Encyclopedia of Philosophy. Routledge, London, Vol. 2, pp. 823–35
Kvanvig J L 1998 Paradoxes, epistemic. In: Craig E (ed.) Routledge Encyclopedia of Philosophy. Routledge, London, Vol. 7, pp. 211–14
Jackson F 1987 Conditionals. Basil Blackwell, Oxford, UK
Pettit P, Sugden R 1989 The backward induction paradox. Journal of Philosophy 86: 169–82
Quine W V 1953 On a so-called paradox. Mind 62: 65–7
Resnick M D 1987 Choices. University of Minnesota Press, Minneapolis, MN
Shaw R 1958 The paradox of the unexpected examination. Mind 67: 382–4
Wright C, Sudbury A 1977 The paradox of the unexpected examination. Australasian Journal of Philosophy 55: 41–58
F. Jackson
Survey and Excavation (Field Methods) in Archaeology 1. The Term ‘Archaeology’ The term ‘archaeology’ comes from the Greek word, arkhaiologia, meaning the ‘discussion or discourse about ancient things.’ By definition, archaeology is the science by which the remains of the ancients can be methodically and systematically studied to obtain as complete a picture as possible of ancient culture and society, and thereby to reconstruct past ways of life. An antiquarian doctor of Lyon, Jacques Spon, first used the term ‘archaeology’ in the seventeenth century. Archaeology is a discipline that involves the scientific method of study, observation, recording, and experiment. Archaeologists study in detail the complete life of cultures and their patterns of change in an attempt to delineate the causes and effects of the cultural process. The fascination of archaeology lies in the solving of long-standing problems and in the discovery of the meaning of things. The excitement of archaeology lies in the new problems which recent findings produce.
2. The History of Archaeology The earliest ‘archaeologist’ was the sixth century BCE King Nabonidus of Babylon who excavated a temple down to its foundations which had been laid thousands of years before. Even the Roman emperor Augustus was a collector of ‘giants’ bones and ancient weaponry. With time, Europeans began to collect burial urns, chipped flint stone and ground stone tools which some believed were amulets. In 1719 the Austrian Prince d’Elboef led the first excavation to Herculaneum, which had been destroyed along with Pompeii in AD 79 when Mount Vesuvius erupted. Thereafter when pieces of classical sculpture began to turn up, artists began to study their shapes and decoration and collectors began to collect, proudly display, and discuss their forms. With time, more people began to describe what they recognized as ancient monuments
and excavations were undertaken to retrieve more and more objects. In the mid-eighteenth and early nineteenth centuries digging for ancient objects became the pursuit of gentlemen of the leisured class and professionals: they became known as antiquarians and formed clubs to discuss the merits of ancient artifacts. Archaeology as a pursuit using scientific systems of recovery began in the nineteenth century, along with the growing realization that humankind had an ancient past that could be investigated in a methodical way. The names of the most important archaeologists from the early days include Sir William Matthew Flinders Petrie in Egypt, Robert Koldewey in Babylon, the duplicitous Heinrich Schliemann in the Aegean at Troy, Mycenae and Tiryns, and General Augustus Pitt-Rivers in Britain. Later pioneers were Sir R. E. Mortimer Wheeler in Britain and India, George Andrew Reisner and Sir Leonard Woolley in the Near East, Max Uhle and Alfred Vincent Kidder in America, and François Bordes and André Parrot in France. In the present day, archaeology has grown into a multidisciplinary, methodical, problem-oriented search for the past. It now involves above-ground research as well as excavation, and a host of specialists is required, depending on the site. Background research as well as excavation has become meticulous. The object study of pottery, for example, involves not only typologies but sophisticated scientific testing by neutron activation analysis to understand the chemical components and to identify the sources and cultural patterns, as well as to assess trade relationships. Archaeological questions extend far beyond the excavation and interpretation of any single cultural group, geographic region, or chronological period of time.
In theory, all archaeological study is aimed toward an explanation that will provide a conceptual basis for the reconstruction of the past, and the where, how, and why of the changes that took place.
3. The Importance of Archaeology Archaeology is unique in that it encompasses the whole world over the past 2.5 million years in order to reconstruct the past. It includes the research of all kinds of sites: underwater wrecks or port installations, caves, jungles, deserts, and mountain-top sites, wherever past activity has taken place. Archaeology wants to discover where and when humankind developed and began to practice art, agriculture, and technology and became urbanized. It is our route to our understanding of ourselves, for in order to know where we are going we have to reconstruct and understand the past and where we have come from. Archaeology continues to give us a broad and long-term view about the past and the events that conditioned us. There is an explosion of new answers to
research on how, where, and when humankind began, our development of technology, the arts, the expression of our ideas in writing, how we harnessed the environment, brought about the domestication of plants and animals, and developed agricultural systems. Archaeology helps us to understand the processes of becoming urbanized, how we developed complex societies, and to understand our basic needs.
4. Archaeological Sites Archaeology covers a wide diversity in time, space, subject matter, and approach. An archaeological site can be any place that has been a scene of past human activity, including a briefly settled campsite, a shipwreck, a monumental city, or the evidence of any number of different combinations of human remains. Any place where human beings have established themselves, even momentarily, is considered a site. As each archaeological site is a unique time capsule, each has its own distinct character and problems. Types of sites are often labeled by their prehistoric or historic context, their function, their general topographic nature, or according to whether they are composed of one or many levels of occupation. Sites represent a body of data relevant to their setting and their cultural patterning and must be interpreted both in relation to this local setting and as a link between cultures. Sites may be small or large. Small sites are far more numerous throughout the world and may range from hunting or fishing stations to small village communities. There are prehistoric plains or riverside terraces such as Olduvai Gorge in Tanzania, Africa; there are caves such as Chou Kou Tien near Peking, China where the Peking Man was found; or those containing Upper Paleolithic art, like Lascaux in France. Large sites are generally cities, temples, or large tells (artificially built-up mounds in the Middle East with one city built on top of the other, known as höyük in Turkish or tepe in Persian). The ancient mound of Jericho, Israel is particularly important, revealing evidence dating back to 8000 BCE. A more recent tell of the Mississippi Basin was built by the Mound Builders of North America in about 1300 CE. There are monumental cities such as classical Rome, Mesoamerican Tikal in Guatemala, or Angkor Vat in Cambodia. There are sacred areas reserved for ritualistic functions such as Stonehenge in England, or the statues of Easter Island.
There are cemeteries and tombs such as those of Mycenae in Greece, the Pyramids in Egypt, or the mounds of Ohio, Illinois or Mississippi. Famous sites also include underwater shipwrecks and harbors discovered worldwide off the shores of the Mediterranean, or near Oslo, Norway, where early Viking ships have been recovered.
The selection of a site for excavation therefore depends upon what the excavator–principal investigator wants to find, and what problems will be answered by excavation. What is the nature of the evidence? It usually involves a problem to be solved: how has the site been built up over time? What is its environment and how has this changed? Using a research design, a review process is undertaken which includes the preliminary analysis of the site, the evaluation of the probable results, the examination of the costs, and the evaluation of organizational abilities and needs to carry out the tasks involved.
5. Archaeological Studies Many types of archaeology have evolved over the years. There are Old World (Eastern Hemisphere) and New World (Western Hemisphere) prehistoric specialties, based on the stone tools humankind produced. The Paleolithic, Mesolithic, Neolithic, Bronze Age, and Iron Age are specialties. These are broad artificial classifications with subdivisions. As for the disciplines in which it has received attention, Old World archaeology is a separate subject in some schools, or is associated with departments of classics or Near Eastern studies, history, or art history. In the New World prehistoric specialists may concentrate on the Paleolithic, Archaic, and Woodland periods. Then both Old and New World archaeology move into historical archaeology. In the Old World there are experts who are devoted to classical archaeology (the worlds of Greece and Rome) or Biblical archaeology (the areas encompassed by the Holy Land). Similarly, many kinds of specialties have developed in archaeology, for example, industrial archaeology, the study of manufacturing; and salvage archaeology, the rescue work done before modern construction takes place, such as land development or urban renewal, or when a pipeline, road, or dam is built. There are also technical specialists who concentrate on specific fields such as ceramics, lithics, paleobotany, palynology (the study of pollen), and dendrochronology (the study of tree-ring growth), among many others, including computer specialists who lend support to all these areas. Since the 1960s in America, archaeology as a subdiscipline also has been associated with anthropology. As anthropology is the study of humanity, anthropologists generally observe the behavior of present-day people in order to explain and reconstruct their behavior patterns. In so doing, they have to make assumptions about past behavioral patterns as set against the environment of the present.
Anthropology is divided into several subfields among which are social and/or cultural anthropology, physical or biological anthropology, linguistic anthropology, which includes the interpretation of oral history, demographic anthropology, ethnic anthropology, cognitive and symbolic anthropology, ethnoarchaeology, and archaeo-astronomy (the study of ancient astronomical events).
6. Dating Systems in Archaeology The term ‘culture’ is used when assuming that a group of people exhibit shared behavior. Archaeologists want to be able to establish accurate space and time frameworks or dates for the cultures they are researching. In short, archaeologists want to establish the site context, culture, and chronology. Laboratory methods provide what are known as absolute dates, and relative dates are ascribed to archaeological strata based on the site deposition or stratigraphy and the discovery of cultural artifacts similar to those found at other sites. An absolute date is a date given in specific years, in terms of chronometrical dating or the calendar. A date given as BP (‘before present’) counts years before 1950. BC means ‘before Christ’ (although nowadays BCE is used, meaning ‘before the Common Era’). CE (Common Era) is used instead of the old term AD, meaning Anno Domini, or after the birth of Christ. Absolute dates can be established by laboratory analyses such as radiocarbon (C-14), potassium-argon, thermoluminescence (TL), archaeomagnetism, and palaeomagnetism, among others. These are geochronologic methods used for determining the absolute date of a deposit. Relative dates are based on the stratigraphic deposition, that is, on the principles of superposition, association, and correlation. Superposition assumes that the uppermost stratum (soil level) was deposited last and the lowest stratum was deposited first; thus the reasoning assumes that lower strata are generally older than the layers above them. Association assumes that the artifacts found in the same closed and sealed archaeological context are contemporary and are therefore presumed to be culturally and temporally associated. Correlation assumes that, once cultural associations have been determined, artifacts can be cross-culturally correlated with artifacts from other site cultures and they can be expected to have been manufactured at the same time, and therefore to possess the same relative date.
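Two of the conventions just described lend themselves to a short illustration: converting a BP age to a calendar year, and ordering strata by superposition. The following is a minimal sketch with hypothetical function names and sample data. It assumes the usual calendar convention in which there is no year zero (1 BCE is followed directly by 1 CE), it treats the BP figure as already expressed in calendar years (a raw radiocarbon age would first need calibration), and it assumes undisturbed deposits, since pits or other disturbance can invert a sequence.

```python
from typing import List, Tuple

BP_REFERENCE_YEAR = 1950  # the "present" from which BP dates are counted


def bp_to_calendar(years_bp: int) -> Tuple[int, str]:
    """Convert an age in years BP to (year, era), with no year zero."""
    year = BP_REFERENCE_YEAR - years_bp
    if year >= 1:
        return year, "CE"
    # Year 0 in simple arithmetic is 1 BCE, -1 is 2 BCE, and so on.
    return 1 - year, "BCE"


def relative_sequence(strata: List[Tuple[str, float]]) -> List[str]:
    """Order strata from oldest to youngest by superposition.

    Each stratum is (name, depth below surface in metres); the deepest
    stratum is taken to be the oldest.
    """
    ordered = sorted(strata, key=lambda stratum: stratum[1], reverse=True)
    return [name for name, _depth in ordered]


print(bp_to_calendar(3000))
print(relative_sequence([("topsoil", 0.2), ("floor", 1.1), ("ash lens", 0.7)]))
```

The ordering gives only a relative sequence: it says which layer is older, not by how many years. Tying the sequence to the calendar still requires an absolute date from one of the laboratory methods listed above.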
Archaeologists use seriation as a method of relative dating. Seriation involves the ordering of specific artifact types from an earlier to a later time sequence by the use of a typology. In creating a type series, the archaeologist assumes that artifacts have measurable formal attributes such as raw material, length, width, thickness, color, weight, and so on, and that these nonrandom attributes were used as a standard by the maker. Therefore, a typology comprises types, each of which is a cluster of attributes in a class of objects, and these types can be given relative dates that can be used to date the stratification.
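One simple way to picture seriation is as an ordering problem: arrange the assemblages so that each artifact type appears, persists, and disappears without interruption. The sketch below is an illustration of that idea only, not a method described in this article. The assemblage names and type labels are hypothetical, it records only presence or absence of each type, and it tries every possible ordering, so it is practical only for a handful of assemblages.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Set


def gaps(order: Sequence[str], assemblages: Dict[str, Set[str]]) -> int:
    """Count interruptions: positions where a type is absent between its
    first and last appearance in the proposed order."""
    types = set().union(*assemblages.values())
    total = 0
    for artifact_type in types:
        present = [i for i, name in enumerate(order) if artifact_type in assemblages[name]]
        span = present[-1] - present[0] + 1
        total += span - len(present)
    return total


def seriate(assemblages: Dict[str, Set[str]]) -> List[str]:
    """Return an ordering of assemblages with the fewest interruptions."""
    candidates = permutations(sorted(assemblages))
    return list(min(candidates, key=lambda order: gaps(order, assemblages)))


# Hypothetical assemblages, each listing the pottery types found in it.
finds = {
    "south trench": {"type B", "type C"},
    "north trench": {"type A"},
    "west trench": {"type C"},
    "east trench": {"type A", "type B"},
}
order = seriate(finds)
print(order, gaps(order, finds))
```

An ordering found this way has no built-in direction: the reversed sequence fits equally well, so stratigraphy or an absolute date is still needed to decide which end is earlier.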
Once associations have been defined, cultural levels can be inferred. A cultural level represents the choices of a particular social group, such as the classical Greeks. The data collected in such a cultural level depict a known and recognizable set of ideas that are then represented as the cultural products of that society.
7. Archaeological Survey Survey includes exhaustive historical research, map reading, geomorphologic and geographic analysis, aerial photography, and ground-level surveys, including electronic and magnetic surveying techniques. Historical research includes all previous research of the area: excavation reports, artifact studies, and the conclusions drawn by previous researchers. In-depth bibliographic research provides essential background information. All archaeologists are concerned with surveying as a means of planning and laying out the area to be studied or excavated. Surveying is generally the first step in all archaeological projects, and is often the last step before the completion of the excavation. Using maps, aerial photos, and whatever background information is available, ground-level surveys can be undertaken on foot to understand the physical features of the site as well as the artifacts, in short, all the material cultural remains of the site or the culture. The site is then located according to Global Positioning Systems on a topographic government map with a scale of 1:50,000, and its location is established and officially reported. As a branch of engineering, surveying is the science that accurately determines the shape, area (size), and positions of a site’s surface by the measurement of certain points. It includes the measurements of horizontal distances and elevations, and determines the direction and relationships of angles by the use of optical instruments. There are several principal classes of survey: geodetic surveys, photogrammetric (picture-measured) aerial surveys, and plane surveys. Geodetic surveys are used for the study of large areas like countries or continents, and take into account the curvature of the earth’s surface. Satellite photographs have become important to the archaeological analyses of large landforms.
Photogrammetric surveying is the science of taking detailed vertical or oblique photographs from an aircraft for the reliable measurement of the earth’s features. Plane surveying is used for small areas of archaeological focus in which the curvature of the earth’s surface is not measured. The preliminary survey also consists of a systematic walking over the site recording standing structures and features as well as artifact materials and scatters. Depending upon what type of survey is to be conducted, artifact concentrations may be collected, processed and studied to indicate the type of site and
its artifact range. The survey should establish the site size, what kind of site it may be, its time span, and, if possible, its depth of deposit. Depending on the site, three principal types of survey are usually conducted: (a) the land survey for the proper planning and laying out of the site, which includes the site grid; (b) the topographic survey, which is the measurement of elevations representing the surface of the area; and (c) the ongoing excavation survey, which is made during excavation to control the location of archaeological structures, features, and objects and to provide these with elevations. Anomalies and physical features are then thoroughly mapped and described as well as factors like vegetation and soils. Decisions then may be taken as to what additional surveying equipment will be used to look at the subsurface features. Electrical resistivity meters and magnetometers of various kinds are instruments that are linked to computer-processing programs; they help to determine what disturbances or structures exist in the area. Only when the comprehensive survey has been completed can the researcher decide if excavation is justified, and if it is, where it is to begin, and what course it is to take.
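The measurement of horizontal distances, elevations, and angles in a plane survey comes down to elementary trigonometry. The sketch below shows one common reduction, under stated assumptions: the instrument reports a slope distance, a vertical angle measured upward from the horizontal, and an azimuth measured clockwise from grid north, and the curvature of the earth is ignored, as in plane surveying. The function names and the sample figures are hypothetical.

```python
import math
from typing import Tuple


def reduce_observation(
    slope_distance: float,
    vertical_angle_deg: float,
    station_elevation: float,
    instrument_height: float,
    target_height: float,
) -> Tuple[float, float]:
    """Return (horizontal distance, elevation of the surveyed point)."""
    angle = math.radians(vertical_angle_deg)
    horizontal = slope_distance * math.cos(angle)
    rise = slope_distance * math.sin(angle)
    # Line of sight runs from the instrument axis to the top of the target rod.
    elevation = station_elevation + instrument_height + rise - target_height
    return horizontal, elevation


def grid_position(
    station_east: float,
    station_north: float,
    azimuth_deg: float,
    horizontal: float,
) -> Tuple[float, float]:
    """Return the (east, north) site-grid coordinates of the surveyed point."""
    azimuth = math.radians(azimuth_deg)
    east = station_east + horizontal * math.sin(azimuth)
    north = station_north + horizontal * math.cos(azimuth)
    return east, north


# Hypothetical shot: 10 m slope distance, 30 degrees above horizontal, due east.
distance, elevation = reduce_observation(10.0, 30.0, 100.0, 1.5, 1.5)
print(distance, elevation)
print(grid_position(500.0, 200.0, 90.0, distance))
```

Each find or feature recorded this way receives a position on the site grid and an elevation, which is the information the excavation survey in (c) is meant to supply.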
8. Excavation and Ethical Considerations Even before a site is excavated, preservation and consolidation measures must be seriously considered and implemented. How will the site be left after the excavation is completed? As excavation is destruction, one of the most important considerations is the site’s stratigraphy, its depth of deposit, and its characteristics. We need to conserve and protect the excavated sites and monuments from all periods and in all parts of the world. Furthermore we need to protect sites against the threats from human intervention, mass tourism, and building development, as well as from the elements of natural deterioration and neglect. Looting is an age-old problem. There are those who clandestinely excavate to sell artifacts at an enormous profit. Although artifact-rich countries have the legislation to prevent the illegal excavation and export of their country’s patrimony, they lack consistent regulatory measures to police looters, dealers, and the exporters of their antiquities. If we are to preserve our common human heritage, the protection of cultural patrimony has to become everyone’s priority. There have been changes in the ethical standards observed by archaeologists today. Ethical standards in archaeology may be defined as the obligations of a professional or an amateur to the excavation, to the country, State or area in which it is located, to the public at large, and to fellow archaeologists. Today’s archaeologists strive to work and live according to a
high standard of behavior, applying their knowledge to enhance the reputation and achievements of archaeology as a whole and for the benefit of human knowledge. Many archaeological associations have drawn up ethical codes that have been accepted by international archaeological societies. The codes of the Archaeological Institute of America and the Society of Professional Archaeologists are among these. Who will be responsible for the care and upkeep of the site over time? And what should be done to care for site artifact collections? Where are they going to be stored, who is going to study them and on what time schedule? What is to be their final disposal: a museum or reburial, and, if the latter, where will they be buried? Today the archaeologist has the responsibility of addressing such issues, and these are best addressed before excavation takes place.
9. Excavation Excavation is the deliberate recovery of buried objects in relation to the layer or stratum and other associated objects in its original deposit. The principle of excavation is that artifacts, nonartifactual materials, and architectural features are not randomly distributed, but are deposited by people. Their position in the earth reflects the cultural, economic, and social behavior of humankind that can be studied by means of excavation. This is the basis for the functional reconstruction of the site. It is only through excavation that the archaeologist can collect and extract a wide range of physical information or obtain a sense of the people who lived, worked, and perhaps worshipped at a site. The archaeologist does this by describing the evidence the people left behind: the architectural features and objects. It is only through excavation that we can align these physical features in time and space so that the functional reconstruction of the processes involved can be determined. In this way we can better understand and interpret the causes, changes, and processes which the ancients used to bring about change. Excavation is the method of removing objects and uncovering stationary features that have been concealed by later deposits. It involves the removal of layers or strata in reverse order to the way in which they were laid down, gradually revealing each successive stage in the history of the people who used the site. Because excavation is the means by which information is unearthed, it must be conducted methodically so that whatever is found can be seen and studied within its own context. The main objectives of excavation are to uncover the position of objects as well as floral and faunal data, and to plot their location on a horizontal plan. It also involves the meticulous recording and sequencing of successive layers on the vertical section by position and depth measurements.
Finally, the goal is to relate such data to the total environment of the site and to assess its relationship to other sites in the area.
10. Modern Developments in Archaeology Major developments in archaeology took place in the late 1990s with the advent of computer technology: remote sensing, web pages, virtual reality, computer simulations, digital cameras, and the greater use of the hard sciences including DNA analysis. In the 1960s American archaeologists began to challenge the traditional archaeological academy and argued that the discipline had become simplistic. As a result of this critique the ‘new archaeology’ was developed, emphasizing the importance of the processes within the society being studied. This was known as ‘processual archaeology,’ and it emphasized the value of objective, logical explanation and the descriptive processes that can produce testable assumptions. Its proponents believed that culture should be studied as sets of systems and subsystems as they relate to the environment, the economy, and patterns of subsistence. They also emphasized that the interrelationships between different sectors of the society should be explainable so that specific developments could be established throughout time. They failed to stimulate the traditionalist majority who practiced archaeology as they had before and who believed that there were no universal laws governing human behavior. The merits of the then-new approach to archaeological data opened the way to a vast array of new theories including Marxist, neo-Marxist, positivist, structuralist, post-structuralist, post-processualist, deconstructionist, and probabilist views along with other theoretical constructs. The post-processual school of archaeologists still adheres to a so-called interpretative archaeology, which insists on emphasizing history, philosophy and, whenever possible, literary studies. Interpretative archaeologists place importance on the fundamental uniqueness and diversity of culture, believing there is no one governing law that can apply either to archaeological research or to the interpretation of the past.
These theories hold some truth, provide stimulating hypotheses for our explanation of the past and are used by many who thoughtfully interpret the probabilities of the evidence. Now archaeologists are wrestling with the ideas of multivariate explanations—and many continue to build idealized models for their interpretations.
11. Conclusion By and large, archaeologists are dedicated researchers and teachers. As for future directions in theory and research, evaluation can be said to drive archaeological standards. By clearly specifying the outcomes in measured ways, archaeologists will change the way they excavate and publish. In the training of students, field directors hope they will train people who possess the qualities and attributes desired in a competent archaeologist. As an archaeological project is expected to foster a sense of shared mission between the field researcher and the principal investigator, both will strive to reach a common goal. Archaeology is best taught through active learning processes, and principal investigators help students by gauging their progress and identifying and overcoming barriers to the excavation as a whole. The developmental process of archaeology is most effective when the goals are known so that the whole team can work together toward the shared goals, collaboratively devising strategies to overcome any obstacles. Our understanding of the past is in a constant state of change due to new discoveries and advances in our knowledge. It will never be finished. Today archaeology has developed into a complex discipline with a vast literature and a range of specialties that are well beyond the competence of a single researcher. More and more data are being produced, museums are overwhelmed with objects and, even with the advent of the computer, many artifacts have remained unanalyzed or unregistered. There is a crisis in the discipline because so much data has been produced, and as excavation involves the destruction of the site, archaeologists must be aware that the discipline may self-destruct unless it puts its house in order. It faces considerable challenges and risks ahead and it must develop an international dialogue among its leadership that is committed to a strategy which will ensure the saving of the past for the future.
Bibliography
Deetz J 1996 In Small Things Forgotten: An Archaeology of Early American Life. Anchor Books/Doubleday, New York
Greene K 1995 Archaeology: An Introduction: The History, Principles and Methods of Modern Archaeology. University of Pennsylvania Press, Philadelphia, PA
Hodder I 1999 The Archaeological Process: An Introduction. Blackwell, Oxford, UK
Hodder I 1991 Reading the Past: Current Approaches to Interpretation in Archaeology. Cambridge University Press, Cambridge, UK
Renfrew C, Bahn P G 2000 Archaeology: Theories, Methods and Practice, 3rd edn. Thames & Hudson, New York
Sharer R J, Ashmore W 1992 Archaeology: Discovering Our Past. Mayfield, Mountain View, CA
Schiffer M B 1987 Formation Processes of the Archaeological Record. University of Utah, New Mexico Press, Albuquerque, NM
Trigger B 1989 A History of Archaeological Thought. Cambridge University Press, Cambridge, UK
Willey G R, Sabloff J A 1995 A History of American Archaeology. W H Freeman, San Francisco, CA
M. S. Joukowsky
Survey Research: National Centers National centers for survey research are a rare species among the masses of survey research centers, companies and offices. However, the impact of the few centers on the development of systematic survey research and its methodology cannot be overestimated. Four centers in three different countries will be examined as blueprints or models. These are, in chronological order, the National Opinion Research Center (NORC, Chicago), the Institute for Social Research (ISR, Ann Arbor), the National Centre for Social Research (NATCEN, London), and the German Society of Social Science Infrastructure Institutes (GESIS, Mannheim). The focus is on survey research; thus surveys must be defined clearly and distinguished from other instruments for social research, such as tests and censuses. Next, national centers for survey research will be distinguished from survey research centers, institutes, companies, and census offices at large. The four model centers will then be portrayed with some emphasis on their history and their impact on survey research.
1. Social Surveys Surveys are defined in this article as observing societal facts using interviews as the source of information and systematic sampling as the base for selecting information sources (cf. de Heer et al. 1999), and they are quantitative instruments to identify distributions of societal characteristics. Questions or items in a survey are understood as indicators pointing to a latent construct or dimension. Psychological tests, including many marketing studies, differ in two ways from surveys: they focus on characteristics of individuals or small groups rather than societal characteristics, and in most cases the selection of respondents is not based on systematic sampling. Survey research, by contrast, is interested neither in information about a single individual nor in small, identifiable groups. Information gathered on the individual level in a survey will be aggregated across groups and even societies. A population census may share items or question wording with survey research studies. However, census studies are not designed to identify latent constructs. Moreover, a census is by definition an investigation involving all members of a given population, not a sample. Survey research, hence, requires, among other things, specific methods and techniques for designing sampling frames, dealing with nonresponse, and constructing and validating measurement instruments, which must in turn be suitable for complex statistical modeling across all strata of a society. Survey research is a complex, often high-tech, production process. Consider as an example a National General Social Survey like the American GSS or the German ALLBUS (Davis and Smith 1992, Davis et al. 1994).
The target is to collect data from about 3,000 respondents. To do so, one has to contact up to 6,000 households and to identify whether they are eligible and willing to participate. Up to 600 interviewers in 630 sampling points will collect the data. About 50–100 other staff will back them up, supervise the process, or edit the incoming data. The whole process will be directed and supervised by 5–10 senior social researchers. The goal is to collect from each and every respondent all information as accurately as possible. The result will be about 1.2 million different information units (answers to questions, etc. by all respondents). This complexity of the survey production process requires adequate organizational and professional production structures. Big survey research companies like Gallup, Harris, Infratest-Burke, Inra, or Mori, to name but a few famous ones, provide such organizational and professional structures for polls, big market surveys, etc. In the public, nonprofit domain, independent, academic survey research centers serve the academic communities and governments, and even compete with private companies in many sectors. It is the integration of academic methodological standards and perspectives into the applied survey research process that makes academic survey research centers different from private companies’ research units. Publishing rigorous methodological research, going beyond the ‘it works’ to find out ‘how and why it works,’ is thus a must for an academic survey research institute. Hence it is no coincidence that the two most prominent survey research centers today, ISR and NORC, are also centers of methodological excellence. However, methodological research, important as it may be for survey research, is a very small area in the sea of modern social research. It is almost always interdisciplinary and cooperative. In short, methodological survey research is not suited for a career at a university.
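The production figures quoted above for a national general social survey can be related to one another with some simple arithmetic. The following Python sketch is illustrative only: the input numbers are the approximate or upper-bound values given in the text, while the derived ratios (and all variable names) are assumptions introduced here and are not reported in the source.

```python
# Approximate figures quoted in the text for one national general social survey.
target_respondents = 3_000       # "about 3,000 respondents"
households_contacted = 6_000     # "up to 6,000 households" (upper bound)
interviewers = 600               # "up to 600 interviewers" (upper bound)
sampling_points = 630
information_units = 1_200_000    # "about 1.2 million different information units"

# Derived ratios (not stated in the source; simple division of the figures above).
completion_ratio = target_respondents / households_contacted
interviews_per_interviewer = target_respondents / interviewers
interviews_per_sampling_point = target_respondents / sampling_points
units_per_respondent = information_units / target_respondents

print(f"completed interviews per contacted household: {completion_ratio:.2f}")
print(f"interviews per interviewer: {interviews_per_interviewer:.1f}")
print(f"interviews per sampling point: {interviews_per_sampling_point:.2f}")
print(f"information units per respondent: {units_per_respondent:.0f}")
```

Read this way, the quoted totals imply roughly one completed interview for every two households contacted and on the order of 400 recorded answers per respondent, which gives a sense of why the text describes survey research as a production process.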
National research centers, by their very size and their mission to accomplish high quality surveys, also provide a natural base for long-term, high quality methodological research. National centers for survey research cover national and international issues, which distinguishes them from regional centers. In addition, such centers have a broad focus on methods, target audience, and topics. Funding and organization differ from center to center and from nation to nation. The term ‘national center’ as it is used here does not exclude several centers within one nation, nor does it imply that the word ‘national’ in the name immediately qualifies an institute to be a national center. National centers also serve as hosts for important national surveys. Studies like the General Social Survey (NORC), the Panel Study of Income Dynamics (ISR), British Election Studies (NATCEN), or the German General Social Survey (ZUMA/GESIS) are cornerstones of their countries’ social science information systems. Finally, they have an enormous impact on the development of
multinational survey research: NATCEN, NORC, and ZUMA are founding members of the International Social Survey Programme (ISSP, www.issp.org), and ISR hosts the World Values Survey. Rigorous methods, the public as an audience, free distribution of major nationwide data sets, intensive, published methodological research, and an international perspective are what mark national centers for survey research. Today, 88 years after the first survey (de Heer et al. 1999, p. 28), centers which truly qualify are still very small in number. This is despite the enormous growth in interviews conducted around the globe (in 1978 the gross revenue of the survey industry was estimated at $4 billion; de Heer et al. 1999, p. 32); the annual turnover reported by ESOMAR in 1998 was about $13 billion with an increase of 10 percent since 1997 (www.esomar.nl). There are actually only a handful of institutions worldwide which can be counted as a national center for survey research: in the US the National Opinion Research Center (NORC, Chicago, founded 1941) and the Survey Research Center in the Institute for Social Research (ISR, Ann Arbor, 1946), in Great Britain the National Centre for Social Research (NATCEN, formerly SCPR, London, 1969), the Norwegian Social Science Data Services (NSD, Bergen, 1971), and the German Social Science Infrastructure Institutes (GESIS, Mannheim, an umbrella organization for ZUMA in Mannheim, Zentralarchiv in Cologne and the Social Science Information Center in Bonn, 1986). Other institutes like social science data archives, for instance the Roper Center, ICPSR, or the Essex Archive (cf. CESSDA, www.nsd.uib.no/cessda/ or www.ifdo.org), or the numerous social research institutes listed in clearing houses (www.abacon.com/sociology/soclinks with references to other lists) either do not carry or contract out major national surveys or are not heavily involved in methodological survey research apart from secondary data analysis. Hence they are not covered in this section.
In the following, short bios of the four centers will be presented. These bios will not go into the history of these institutes; rather, their current standing will be emphasized. In conclusion, the impact of these centers on survey research at the beginning of the twenty-first century will be outlined.
2. NORC: National Opinion Research Center, Chicago NORC was established in 1941 as a nonprofit corporation through a grant from the University of Denver and the Field Foundation. In 1947, NORC moved to the University of Chicago campus (www.norc.uchicago.edu). The center conducts survey research in the public domain as well as for private organizations. It is an independent nonprofit organization with strong ties to the University of Chicago. NORC has a reputation for both complex surveys and innovations in survey research methodology. It is the birthplace of surveys conducted for the benefit of the sociological profession at large, the General Social Survey (GSS) (Davis and Smith 1992). The first GSS was conducted as early as 1972. A GSS is a generalized tool which repeatedly measures important characteristics. It is released immediately after completion to the social science community. Thus researchers, students, and other interested parties have up-to-date survey data to hand which they can analyze on their own terms. Such a complex, long-term enterprise requires a strong and well-equipped infrastructure, like the one provided by NORC as a national survey research center. Other strongholds of NORC are methodological studies concentrating on measurement error. Norman Bradburn and Roger Tourangeau are but two among the number of researchers who contributed to this field from the 1980s.
3. ISR: Institute for Social Research, Ann Arbor Shortly after NORC moved to Chicago, in 1948, researchers from the Survey Research Center and the Research Center for Group Dynamics founded ISR in Ann Arbor (www.isr.umich.edu). From its very beginning ISR advocated interdisciplinary research. Its continuous studies range from National Election Studies to the Panel Study of Income Dynamics. It provides major databases as well as substantial contributions in group dynamics and political science to the scientific community. ISR is also a world leader in survey research methodology and statistics. Researchers like Kish, Schuman, Converse, Presser, Groves, Couper, Singer, and Schwarz paved the way for modern survey research processes in areas such as sampling statistics, response error, household nonresponse, and survey cognition processes. Another impact area of ISR is training and data publishing. Since 1962 ISR has hosted the Interuniversity Consortium for Political and Social Research (ICPSR), which was founded as an archive for quantitative social science data. From a safe haven for keypunched cards, ICPSR developed into the world’s largest survey data publisher/distributor, where researchers have access to thousands of data files. ISR offers three training programs open to researchers worldwide. These are the Summer Institute in Survey Research Techniques, the ICPSR Summer Program in Quantitative Methods and, more recently, the Research Center for Group Dynamics Program in Experimental Methodology and Statistics. The institute puts much emphasis on international cooperation and knowledge dissemination in all its activities.
Apart from a healthy competitive attitude, NORC and ISR are complementary institutes with strong ties to their respective universities (NORC—University of Chicago, ISR—University of Michigan at Ann Arbor). NORC puts more emphasis on fielding many complex studies, while ISR concentrates on specific studies. ISR is very strong in survey methodology, while NORC is stronger in survey process management and sampling design. Both provide a base for international cooperation.
4. NATCEN: National Centre for Social Research, London NATCEN, formerly SCPR, was founded in 1969. Today, it is Britain’s largest nonprofit survey research organization (www.natcen.ac.uk). It is the home of British Social Attitudes as Britain’s general social survey, of the British Election Studies, and of major health surveys, to name but a few. Within the tradition of British social research, NATCEN conducts both quantitative and qualitative studies. Among its divisions are a Survey Methods Centre (research, consulting, and teaching) and joint centers with universities like CASS (Centre for Applied Social Surveys), which hosts a survey item bank. Researchers at NATCEN are leading in applied survey research. Among their innovations are the International Social Survey Programme (ISSP), initiated by Roger Jowell, and the first ever deliberative poll. It will be the ‘command and engine room’ of the European Social Survey to come in 2002. NATCEN is the only center among the survey research centers discussed here which names itself a ‘national center.’ However, it is a national center not by Her Majesty’s appointment, but de facto due to the specific structure of the British science funding system. As an independent charitable trust it forms the base for collaborative mid-range and long-range projects like CREST, the Centre for Research into Elections & Social Trends. Funded as a research center of the Economic and Social Research Council (ESRC), it is a joint venture of Nuffield College in Oxford and NATCEN in London. The above-mentioned CASS, on the other hand, is an ESRC research center joining NATCEN with the University of Southampton.
5. GESIS: German Society of Social Science Infrastructure Institutes, Mannheim, Cologne, and Bonn Founded in 1986 by act of the federal government and all states, GESIS comprises three major centers: ZUMA, the Center for Survey Research and Methodology in Mannheim, the Zentralarchiv in Cologne, and the Information Center in Bonn (www.gesis.org). Together these three centers form a comprehensive national survey research center as a permanent infrastructure for the social sciences. ZUMA consults academics in all areas of quantitative survey methods; it runs continuous research streams in areas like standardization of demographics, intercultural survey research, and, recently, web surveys. It is, among others, the home of ALLBUS, the German General Social Survey, the German ISSP, and a participant in the German Social Welfare Studies. Together with researchers from five other European institutes it will be part of the European Social Survey ‘engine room’ (see above). It is also the birthplace of the German Standard Demography, now a major project of the German Census Office, ZUMA, and other professional groups. The center bridges the gap between German Official Statistics and the social science profession by providing and/or enabling access to official microdata. The Zentralarchiv, founded as early as 1960, is the German national archive for quantitative survey data. It played a major role in establishing the international archive network. The archive also provides training courses in survey methods. The Information Center provides in-depth bibliographies of German social science publications, keeps record of ongoing research projects, and serves as a platform for information interchange between Eastern and Western Europe. Taken together the three centers form a unique infrastructure for their national scientific community. Access to all services is free; products are nominally charged. The three institutes were independent and differently funded units until the National Science Board (Wissenschaftsrat) voted to form a unified national survey research center, namely GESIS. As a social science infrastructure it has a strictly supportive function for academic research. Unlike in America or Britain, substantive research is not a core activity of GESIS.
However, as a dedicated social science infrastructure GESIS is fully funded by the federal government and all German states.
6. Conclusion National centers for survey research are defined as institutions that provide databases, methodological research, and training for their scientific community. In addition, national centers focus on the national rather than the regional level and bridge national and international research. There may be more institutions around the globe than the four introduced here. Many others may fulfill one or other of the functions. However, the core of a national center for survey research is dedicated methodological studies embedded into long-term methodological and substantive research programs.
In looking back to get a vision of the future, one could arguably say that the fast progress observed in survey research technology and methodology since the 1950s was caused by the very existence of the two American centers. Other countries still lag behind the United States both in advancement of scientific knowledge and in survey engineering. Germany, for instance, where ZUMA was founded as late as 1974 with limited capacities (no fieldwork staff, no surveys of its own), has had to bridge a huge methodological and technological gap since the 1980s. The founding of GESIS as a national center was a late but not too late remedy. It is thus no coincidence that the lack of national centers or their substitutes correlates with the standards of survey research in a country. Because all modern societies need sound information about their own state, efforts should be made to either found more national centers or to transform existing ones into real or virtual transnational centers. Finally, in view of the new survey technologies, like net-based multimedia surveys, and their likely abuse and misuse, national centers become even more indispensable as a means to sort good methods and data from information rubbish. See also: Data Archives: International; Databases, Core: Political Science and Political Behavior; Databases, Core: Sociology; Economics, History of
Bibliography
Davis J A, Mohler P P, Smith T W 1994 Nationwide general social surveys. In: Borg I, Mohler P P (eds.) Trends and Perspectives in Empirical Social Research. Walter de Gruyter, Berlin
Davis J A, Smith T W 1992 The NORC General Social Survey: A User’s Guide. Sage, Newbury Park, CA
De Heer W, De Leeuw E, van der Zouwen J 1999 Methodological issues in survey research: A historical review. Bulletin de Methodologie Sociologique 64: 25–48
P. P. Mohler
Surveys and Polling: Ethical Aspects Polls and surveys are used to investigate opinions and attitudes, and to obtain factual information. Two broad aspects of ethical practice with respect to polls and surveys can be distinguished. The first has to do with the extent to which research designs, instruments, data collection procedures, and analysis permit valid and reliable conclusions. The second pertains to the researcher’s obligations toward others, especially respondents, but clients and the larger public as well. Because polls and surveys are really indistinguishable except in terms of location, with polls typically carried out by commercial organizations and surveys by academic institutions (Schuman 1997), the present discussion of ethical practices is intended to apply to both. Although the discussion is framed in terms of the code of ethics of one organization in a particular country, namely the Code of Ethics of the American Association for Public Opinion Research (AAPOR), a professional organization including academic, commercial, and government survey researchers among its members, the principles presented here apply much more generally. But because the legal requirements for protecting respondent confidentiality, as well as the ability to do so in practice, vary from one country to another, such variations are not discussed here. Also excluded from the discussion are the fundamentally different ethical issues raised by administrative uses of data gathered for statistical purposes (see, for example, Seltzer 1998).
1. Standards for the Conduct of Research AAPOR’s Code of Ethics, which obligates members to adhere to the Code and is one of the few such instruments to provide a mechanism for punishing violations, is necessarily couched in general terms, simply asserting that members will ‘exercise due care’ in carrying out research and ‘take all reasonable steps’ to assure the validity and reliability of results. In May 1996, AAPOR adopted two more specific statements, one defining a set of performance standards that embody current ‘best practices’ in the field of survey research, the other a set of practices that are universally condemned. Members are urged to adhere to the first and refrain from the second. Among the ‘best practices’ are scientific methods of sampling, which ensure that each member of the population has a measurable chance of selection; follow-up of those selected to achieve an adequate response rate; careful development and pretesting of the questionnaire, with attention to known effects of question wording and order on response tendencies; and adequate training and supervision of interviewers. Among practices defined as ‘unacceptable’ are fund raising, selling, or canvassing under the guise of research; representing the results of a poll of self-selected or volunteer respondents (for example, readers of a magazine who happen to return a questionnaire, callers to an 800- or 900-number poll, or haphazard respondents to a web-based survey) as if they were the outcome of legitimate research; and ‘push polls,’ which feed false or misleading information to voters under the pretense of taking a poll. For example, a push-poll might ask if the respondent would object to learning that candidate X has been accused of child abuse, has defaulted on a loan, etc. Revealing the identity of individual respondents to anyone without their permission, and
misleading the public by selecting or manipulating methods of research or analysis to reach a desired conclusion, are included among ‘unacceptable’ practices but they are also mentioned in the AAPOR Code.
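The ‘best practice’ that each member of the population should have a measurable chance of selection can be illustrated with a minimal Python sketch. It assumes the simplest possible design, a simple random sample drawn without replacement from a complete sampling frame; neither this design nor the function and variable names below come from the AAPOR statements, and real surveys typically use more complex multistage designs.

```python
import random

def simple_random_sample(frame, n, seed=0):
    """Draw n units without replacement from a sampling frame.

    Under simple random sampling every unit has the same known
    inclusion probability, n / len(frame).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = rng.sample(frame, n)
    inclusion_probability = n / len(frame)
    return sample, inclusion_probability

# Hypothetical frame of 6,000 household identifiers (illustrative only).
frame = [f"household-{i:04d}" for i in range(6000)]
sample, p = simple_random_sample(frame, 3000)

# The design weight is the inverse of the inclusion probability.
design_weight = 1 / p
print(len(sample), p, design_weight)
```

By contrast, for a poll of self-selected respondents no such inclusion probability can be written down, which is why its results cannot be generalized in the same way.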
2. Standards for Dealing with Clients, the Public, and Respondents 2.1 Obligations to Clients Ethical behavior toward clients requires, in the first place, undertaking only those research assignments that can reasonably be accomplished, given the limitations of survey research techniques themselves or limited resources on the part of the researcher or client. Second, it means holding information about the research and the client confidential, except as expressly authorized by the client or as required by other provisions of the Code. Chief among the latter is the requirement that researchers who become aware of serious distortions of their findings in public must disclose whatever may be needed to correct these distortions, including a statement to the group to which the distorted findings were presented. In practice, this provision of the Code may obligate researchers to issue a corrective statement to a legislative body, a regulatory agency, the media, or some other appropriate group if they become aware of the release of such distorted results.
2.2 Obligations to the Public The AAPOR Code of Ethics attempts to satisfy the profession’s obligations to the larger public by mandating the disclosure of minimal information about a poll or survey whose findings are publicly released. The elements that must, at a minimum, be disclosed are the following: (a) who sponsored the survey, and who conducted it; (b) the exact wording of questions asked, including the text of any preceding instruction or explanation to the interviewer or respondent that might reasonably be expected to affect the response; (c) a definition of the population under study and a description of the sampling frame; (d) a description of the sample selection procedure, making clear whether respondents were selected by the researcher or were self-selected; (e) size of sample and, if applicable, completion rates and information on eligibility criteria and screening procedures; (f) a discussion of the precision of the findings, including, if appropriate, estimates of sampling error and a description of any weighting or estimating procedure used; (g) which results, if any, are based on parts of the sample rather than the entire sample; (h) method, location, and dates of data collection. Together, these items of information communicate some idea of the error likely to be associated with estimates from a given survey, or, conversely, the degree of confidence warranted by the results. The Code requires this information to be included in any report of research results, or, alternatively, made available at the time the report is released. In other words, researchers are not required by the Code of Ethics to follow the best professional practices, but they are required to disclose the practices they used. Thus, the Code relies on the ‘marketplace of ideas’ to drive out bad practices over time.
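The eight disclosure elements (a) to (h) listed above lend themselves to a simple checklist. The Python sketch below is one hypothetical way to record them and to flag gaps before a report is released; the class, field, and function names are invented for illustration and are not part of the AAPOR Code. The helper for element (f) uses the textbook margin-of-error formula for a proportion under simple random sampling, which is an assumption: the Code does not prescribe any particular formula, and complex designs need design-based estimates.

```python
import math
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class Disclosure:
    """Hypothetical record of the minimum disclosure elements (a)-(h)."""
    sponsor_and_conductor: Optional[str] = None       # (a)
    question_wording: Optional[str] = None            # (b)
    population_and_frame: Optional[str] = None        # (c)
    selection_procedure: Optional[str] = None         # (d)
    sample_size_and_completion: Optional[str] = None  # (e)
    precision: Optional[str] = None                   # (f)
    subsample_results: Optional[str] = None           # (g) write "none" if not applicable
    method_location_dates: Optional[str] = None       # (h)

def missing_elements(disclosure):
    """Return the names of elements that have not been filled in."""
    return [f.name for f in fields(disclosure) if not getattr(disclosure, f.name)]

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative, made-up report for a sample of 1,000 respondents.
report = Disclosure(
    sponsor_and_conductor="Sponsor X; fieldwork by Agency Y",
    precision=f"+/- {100 * margin_of_error(1000):.1f} percentage points",
)
missing = missing_elements(report)
print(report.precision)
print(missing)
```

A check of this kind only confirms that something has been disclosed for each element; it says nothing about the quality of the practices themselves, which mirrors the Code's reliance on disclosure rather than on mandated methods.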
2.3 Obligations to Respondents Few, if any, surveys put respondents at physical or psychological risk in ways that psychological experiments or, particularly, medical research may do. Ordinarily, the biggest risk to survey respondents comes from the possibility that their identity will be disclosed, either inadvertently, because adequate precautions were not taken to maintain the confidentiality of the data, or deliberately, as a result of special efforts by an ‘intruder’ to obtain the identity of one or more respondents. Thus, survey organizations have a special responsibility both to protect the confidentiality of the data they collect and to inform respondents of the risks that remain. These are two of the ethical issues that will be discussed in this section; the third arises from declining survey response rates and attempts to compensate for these by special persuasion strategies, including the offer of refusal conversion payments.
2.3.1 Informed consent in surveys and polls. One important purpose of informed consent is to protect respondents from harm. Another purpose, less often cited, is to give them meaningful control over information about themselves, even if the question of harm does not arise. For example, informing potential respondents of the voluntary nature of their participation in the interview is essential to ethical survey practice, even if this leads some potential respondents to refuse to participate. Similarly, obtaining the respondents’ permission to record an interview should be standard practice, even though no harm is likely to come to them as a result. Currently, many states permit such recording with only the interviewer’s consent. At present, obtaining informed consent, which may be defined as the ‘knowing consent of an individual or his legally authorized representative … without undue inducement or any element of force, fraud, deceit, duress, or any other form of constraint or coercion,’ is not legally mandated for participation in most surveys in the United States. Even those surveys that are subject to federal rules for the protection of human subjects are exempt unless they collect respondent identification and disclosure of responses could reasonably be damaging to respondents’ good reputation, employability, or financial standing or could put them at risk of civil or criminal liability (Federal Register 1991, 28:012). (Surveys of populations that are seen as especially vulnerable, such as minors or incapacitated or institutionalized adults, are required to institute special procedures for obtaining informed consent, however.) But surveys conducted by federal agencies are subject to the Privacy Act of 1974, which requires the disclosure of some information to respondents, for example the principal purpose for which the survey is to be used (Duncan et al. 1993). Although informed consent is not legally required for most surveys in the United States, some major professional associations (for example, the American Statistical Association and the American Sociological Association) do make mention of it in their ethical codes. AAPOR, however, does not, noting only that ‘We shall strive to avoid the use of practices or methods that may harm, humiliate, or seriously mislead survey respondents.’ It is not difficult to understand this omission. Although full disclosure of sponsorship and purpose, for example, may be highly desirable from an ethical point of view, it may seriously bias responses to a survey. It may do so not only because some people will choose not to participate in surveys sponsored by certain organizations or with a particular purpose, but also because those who do participate may distort their responses in order to thwart or advance the purpose as they understand it. From another standpoint, the issue is how much detail should be communicated in order to ‘inform’ respondents: should they, for example, be told in advance that they are going to be asked their income near the end of the interview?
Although Presser (1994) argues persuasively that the uses to which a survey will be put are often impossible to predict and even harder to communicate to respondents with any accuracy, he acknowledges that in the end, there ‘may be no escaping the conflict of ethical obligations with research ends’ (Presser 1994, p. 454).
2.3.2 Protection of respondents’ confidentiality. As has already been noted, probably the most important source of danger to survey respondents comes from identification of their responses, that is, from deliberate or inadvertent breaches of confidentiality. The more sensitive the information requested (for example, HIV status or whether the respondent has engaged in some illegal activity), the greater the harm that a breach of confidentiality might entail. Not unexpectedly, respondents require strong assurances of confidentiality when they are asked to provide such data about themselves (Singer et al. 1995).
A variety of threats to the confidentiality of survey data exist. Most common is simple carelessness: not removing identifiers from questionnaires or electronic data files, leaving cabinets unlocked, not encrypting files containing identifiers. Although there is no evidence of respondents having come to harm as a result of such negligence, it is important for data collection agencies to raise the consciousness of their employees with respect to these issues, provide guidelines for appropriate behavior, and ensure that these are observed.
Less common, but potentially more serious, threats to confidentiality are legal demands for identified data, either in the form of a subpoena or as a result of a Freedom of Information Act (FOIA) request. Surveys that collect data about illegal behaviors such as drug use, for example, are potentially subject to subpoena by law-enforcement agencies. To protect against this, researchers studying mental health, alcohol and drug use, and other sensitive topics, whether federally funded or not, may apply for Certificates of Confidentiality from the Department of Health and Human Services. Such certificates, which remain in effect for the duration of the study, protect researchers from being compelled in most circumstances to disclose names or other identifying characteristics of survey respondents in federal, state, or local proceedings (42 C.F.R. 2a.7, ‘Effect of Confidentiality Certificate’). The Certificate protects the identity of respondents; it does not protect the data. However, when surveys are done expressly for litigation, demands for data from one party or the other are not uncommon, and this protection may not be available. Several such cases are reviewed by Presser (1994). In one, the case was settled before the data were to be produced; in two others, the courts ordered the researchers to turn over raw data, including respondent identifiers, to the defendants.
Presser (1994) discusses several actions that might be taken by professional survey organizations to enhance protections of confidentiality in such cases—for example, mandating the destruction of identifiers in surveys designed for adversarial proceedings. If this became standard practice, individual researchers would be better able to defend their own use of such a procedure. A final threat to data confidentiality that should be mentioned here is what is known as ‘statistical disclosure.’ So far, the discussion has been in terms of explicit respondent identifiers—name, address, and social security number, for example. But there is increasing awareness that even without explicit identifiers, re-identification of respondents may be possible given the existence of high-speed computers, external data files to which the survey data can be matched, and sophisticated matching software. Longitudinal studies may be especially vulnerable to such re-identification. There are essentially two ways to reduce the risk of statistical disclosure: restricting access to the data, for example by enclave or bonding arrangements, or restricting the data that may be released, for example, by top-coding age and income and releasing detailed
data only for large geographic areas. Although guidelines for reducing the risk of statistical disclosure exist (Federal Committee on Statistical Methodology 1994), some researchers believe that the best way to safeguard confidentiality is to create synthetic data sets derived from the actual information provided by respondents (Rubin 1987). In view of the diverse threats to confidentiality reviewed above, there is a growing consensus among researchers that absolute assurances of confidentiality should seldom, if ever, be given to respondents (Duncan et al. 1993). The ‘Ethical Guidelines for Statistical Practice’ of the American Statistical Association (ASA), for example, caution statisticians to ‘anticipate secondary and indirect uses of the data when obtaining approvals from research subjects.’ At a minimum, the Guidelines urge statisticians to provide for later independent replication of an analysis by an outside party. They also warn researchers not to imply protection of confidentiality from legal processes of discovery unless they have been explicitly authorized to do so, for example, by a Certificate of Confidentiality. Research by Singer (1978) and Singer et al. (1995) suggests that for surveys on nonsensitive topics, neither the response rate nor data quality is likely to suffer as a result of such qualified assurances of confidentiality. For those collecting sensitive data, researchers should make every effort to obtain a Certificate of Confidentiality, which would permit them to offer stronger assurances to respondents. Even such Certificates, however, cannot guard against the risk of statistical disclosure.
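The data-restriction approach to statistical disclosure mentioned above (top-coding age and income before release) can be illustrated with a minimal sketch. The records, the field names, and the two caps below are hypothetical values invented for illustration; they are not taken from any guideline cited here.

```python
from typing import Dict, List


def top_code(values: List[float], cap: float) -> List[float]:
    """Replace every value above `cap` with `cap` itself."""
    return [min(v, cap) for v in values]


# Hypothetical survey records and caps (assumptions for illustration only).
records: List[Dict[str, float]] = [
    {"age": 34, "income": 41_000},
    {"age": 97, "income": 38_000},
    {"age": 52, "income": 2_500_000},
]
AGE_CAP = 90
INCOME_CAP = 250_000

ages = top_code([r["age"] for r in records], AGE_CAP)
incomes = top_code([r["income"] for r in records], INCOME_CAP)

# The released file carries the capped values, so unusually old or
# unusually wealthy respondents no longer stand out as unique cases.
released = [{"age": a, "income": i} for a, i in zip(ages, incomes)]
```

Top-coding of this kind reduces, but does not eliminate, the risk of re-identification by matching against external files.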
2.3.3 Efforts at persuasion. An issue that merits brief discussion here is the dilemma created by the need for high response rates from populations that spend less time at home, have less discretionary time, and are less inclined to cooperate in surveys. This dilemma leads survey organizations to attempt more refusal conversions and to make greater use of refusal conversion payments. Respondents see both practices as undesirable (Groves et al. 1999). They also regard the paying of refusal conversion payments as unfair, but awareness of such payments does not appear to reduce either the willingness to respond to a future survey or actual participation in a new survey (Singer et al. 1999).
2.3.4 Polling on the Net. Polls and surveys conducted on the Internet have shown explosive growth in the last several years, and such surveys bring an entirely new set of ethical problems to the fore, or bring new dimensions to old problems. For example, maintenance of privacy and confidentiality in such surveys requires technological innovations (for example, ‘secure’ websites, encryption of responses) beyond those required by older modes of data collection. Other ethical issues simply do not arise with older modes of conducting surveys (for example, the possibility that a respondent might submit multiple answers, thus deliberately biasing the results), and these, too, call for new technical solutions requiring sophisticated knowledge of computer hardware and software. These problems, and proposed solutions, are too new to permit more than brief mention of them here.
3. Concluding Observations
A number of developments in recent years have increased ethical dilemmas for survey researchers. There is growing demand for information about individuals on the part of both business and government. At the same time, there is growing concern about personal privacy on the part of the public. Technological developments have made safeguarding privacy and confidentiality increasingly difficult, and for this as well as other reasons, there is increasing reluctance on the part of the public to participate in polls and surveys, including government censuses. How these dilemmas are resolved in the next few years will have profound implications for the future of survey research.
See also: Confidentiality and Statistical Disclosure Limitation; Ethical Dilemmas: Research and Treatment Priorities; Personal Privacy: Cultural Concerns; Privacy: Legal Aspects; Privacy of Individuals in Social Research: Confidentiality; Psychiatry: Informed Consent and Care; Research Conduct: Ethical Codes; Research Ethics: Research; Research Subjects, Informed and Implied Consent of; Social Survey, History of
Bibliography
Duncan G T, Jabine T B, DeWolf V A (eds.) 1993 Private Lives and Public Policies: Confidentiality and Accessibility of Government Statistics. National Academy Press, Washington, DC
Federal Committee on Statistical Methodology 1994 Working Paper 22: Report on Statistical Disclosure Limitation Methods. Statistical Policy Office, Office of Information and Regulatory Affairs, Office of Management and Budget, Washington, DC
Federal Register 1991 Federal policy for the protection of human subjects. Federal Register June 18: 28,002–31
Groves R M, Singer E, Corning A D, Bowers A 1999 A laboratory approach to measuring the joint effects of interview length, incentives, differential incentives, and refusal conversion on survey participation. Journal of Official Statistics 15: 251–68
Presser S 1994 Informed consent and confidentiality in survey research. Public Opinion Quarterly 58: 446–59
Rubin D B 1987 Multiple Imputation for Nonresponse in Surveys. Wiley, New York
Schuman H 1997 Polls, surveys, and the English language. The Public Perspective April/May: 6–7
Seltzer W 1998 Population statistics, the Holocaust, and the Nuremberg trials. Population and Development Review 24: 511–52
Singer E 1978 Informed consent: Consequences for response rate and response quality in social surveys. American Sociological Review 43: 144–62
Singer E, Groves R M, Corning A D 1999 Differential incentives: Beliefs about practices, perceptions of equity, and effects on survey participation. Public Opinion Quarterly 63: 251–60
Singer E, Von Thurn D R, Miller R E 1995 Confidentiality assurances and response: A quantitative review of the experimental literature. Public Opinion Quarterly 59: 266–77
E. Singer
Survival Analysis: Overview
1. Failure-time Data and Representation
The term survival analysis has come to embody the set of statistical techniques used to analyze and interpret time-to-event, or failure-time data. The events under study may be disease occurrences or recurrences in a biomedical setting, machine breakdowns in an industrial context, or the occurrence of major life events in a longitudinal-cohort setting. It is important in applications that the event of interest be defined clearly and unambiguously. A failure-time variable T is defined for an individual as the time from a meaningful origin or starting point, such as date of birth, or date of randomization into a clinical trial, to the occurrence of an event of interest. For convenience, we refer to the event under study as a failure.
A cohort study involves selecting a representative set of individuals from some target population and following those individuals forward in time in order to record the occurrence and timing of the failures. Often, some study subjects will still be without failure at the time of data analysis, in which case their failure times T will be known only to exceed their corresponding censoring times. In this case, the censoring time is defined as the time elapsed from the origin to the latest follow-up time, and the failure time is said to be right censored. Right censorship may also arise because subjects drop out of the study, or are lost to follow-up. A key assumption underlying most survival analysis methods is that the probability of censoring an individual at a given follow-up time does not depend on subsequent failure times. This independence assumption implies that individuals who are under study at time t are representative of the cohort and precludes, for example, the censoring of individuals at time t because, among individuals under study at that time, they appear to be at unusually high (or low) risk of failure.
(This assumption can be relaxed to independence between censoring and failure times conditional on the preceding histories of covariates.) The reasonableness of the independent censoring assumption, which usually cannot be tested directly, should be considered carefully in specific applications.
The distribution of a failure time variate $T > 0$ is readily represented by its survivor function $F(t) = P(T > t)$. If $T$ is absolutely continuous, its distribution can also be represented by the probability density function $f(t) = -dF(t)/dt$ or by the hazard, or instantaneous failure rate, function defined by
$$\lambda(t) = \lim_{\Delta t \to 0} P(t \le T < t + \Delta t \mid T \ge t)/\Delta t = f(t)/F(t^-). \tag{1}$$
Note that
$$F(t) = \exp\left\{-\int_0^t \lambda(u)\,du\right\}$$
and
$$f(t) = \lambda(t)\exp\left\{-\int_0^t \lambda(u)\,du\right\}$$
for $0 < t < \infty$.
In a similar way, the distribution of a discrete failure time variate $T$, taking values on $x_1 < x_2 < \cdots$, can be represented by its survivor function, its probability function $f(x_i) = P(T = x_i)$, $i = 1, 2, \ldots$, or its hazard function $\lambda(x_i) = P(T = x_i \mid T \ge x_i) = f(x_i)/F(x_i^-)$. Note that
$$F(t) = P(T > t) = \prod_{i \mid x_i \le t} [1 - \lambda(x_i)] \tag{2}$$
as arises through thinking about the failure mechanism unfolding sequentially in time. To survive past time $t$ where $x_j \le t < x_{j+1}$, an individual must survive in sequence each of the points $x_1, \ldots, x_j$ with conditional probabilities $1 - \lambda(x_i)$, $i = 1, \ldots, j$.
It is often natural and insightful to focus on the hazard function. For example, human mortality rates are known to be elevated in the first few years of life, to be low and relatively flat for the next three decades or so, and then to increase rapidly in the later decades of life. Similarly, comparison of hazard rates as a function of time from randomization for a new treatment compared with control can provide valuable insights into the nature of treatment effects. Hazard function modeling also integrates nicely with a number of key strategies for selecting study subjects from a target population. For example, if one is studying disease occurrence as it depends upon the age of study subjects, hazard function models readily accommodate varying ages at entry into the study cohort since the conditioning event in (1) specifies that the individual is surviving and event-free at age $t$. Again one needs to assume that selected subjects have hazard rates that are representative (conditional on covariates) of those at risk for the target population.
2. Estimation and Comparison of Survival Curves
Suppose that a cohort of individuals has been identified at a time origin $t = 0$ and followed forward in time. Suppose that failures are observed at times $t_1 < t_2 < \cdots < t_k$. Just prior to the failures observed at time $t_j$, there will be some number, $r_j$, of cohort members that are under active follow-up. That is, the failure and censoring times for these $r_j$ members occur at time $t_j$ or later. Of these $r_j$ individuals, some number, $d_j$, are observed to fail at $t_j$. The empirical failure rate or estimated hazard at $t_j$ is then $d_j/r_j$, and the empirical survival rate is $(1 - d_j/r_j)$. The empirical failure rate at all times other than $t_1, \ldots, t_k$ in the follow-up period is zero. The Kaplan and Meier (1958) estimator of the survivor function $F$ at follow-up time $t$ is
$$\hat F(t) = \prod_{j \mid t_j \le t} (1 - d_j/r_j), \tag{3}$$
the product of the empirical survivor rates across the observed failure times less than or equal to $t$. Note that $\hat F$ estimates the underlying survivor function $F$ at those times where there are cohort members at risk. For (3) to be valid, the cohort members at risk at any follow-up time $t$ must have the same failure rate operating as does the target population at that follow-up time. Thus, the aforementioned independent censoring assumptions are needed. The hazard rate estimators $d_j/r_j$, $j = 1, \ldots, k$ can be shown to be uncorrelated, giving rise to the Greenwood variance estimator
$$\operatorname{var}\{\hat F(t)\} = \hat F(t)^2 \sum_{j \mid t_j \le t} d_j/\{r_j(r_j - d_j)\} \tag{4}$$
for $\hat F(t)$. In order to show asymptotic results for the estimator $\hat F(t)$, for example that $\hat F(t)$ is a consistent estimator of $F(t)$, it is necessary to restrict the estimation to a follow-up interval $[0, \tau)$ within which the size of the risk set (i.e., number of individuals under active follow-up) becomes large as the sample size ($n$) increases. Under these conditions, $n^{1/2}\{\hat F(t) - F(t)\}$; $0 \le t < \tau$ can be shown to converge weakly to a mean-zero Gaussian process, permitting the calculation of approximate confidence intervals and bands for $F$ on $[0, \tau)$. Applying the asymptotic results to the estimation of $\log\{-\log F(t)\}$, which has no range restrictions, is suggested as a way to improve the coverage rates of asymptotic confidence intervals and bands (Kalbfleisch and Prentice 1980, pp. 14, 15).
Consider a test of equality of $m$ survivor functions $F_1, \ldots, F_m$ on the basis of failure-time data from each of $m$ populations. Let $t_1 < t_2 < \cdots < t_k$ denote the ordered failure times from the pooled data, and again let $d_j$ failures occur out of the $r_j$ individuals at risk just prior to time $t_j$, $j = 1, \ldots, k$. Also, for each sample $i = 1, \ldots, m$, let $d_{ij}$ denote the number of failures at $t_j$, and $r_{ij}$ denote the number at risk at $t_j^-$. Under the null hypothesis $F_1 = F_2 = \cdots = F_m$ the conditional distribution of $d_{1j}, \ldots, d_{mj}$ given $(d_j, r_{1j}, \ldots, r_{mj})$ is hypergeometric, from which the conditional mean and variance of $d_{ij}$ are respectively $u_{ij} = r_{ij}(d_j/r_j)$ and $(V_j)_{ii} = r_{ij}(r_j - r_{ij})d_j(r_j - d_j)r_j^{-2}(r_j - 1)^{-1}$. The conditional covariance of $d_{ij}$ and $d_{lj}$ is
$$(V_j)_{il} = -r_{ij}r_{lj}d_j(r_j - d_j)r_j^{-2}(r_j - 1)^{-1}. \tag{5}$$
Hence the statistic $v_j' = (d_{1j} - u_{1j}, \ldots, d_{mj} - u_{mj})$ has conditional mean zero and variance $V_j$ under the null hypothesis. The summation of these differences across failure times,
$$v = \sum_{j=1}^{k} v_j \tag{6}$$
is known as the logrank statistic (Mantel 1966, Peto and Peto 1972). The contributions to this statistic across the various failure times are uncorrelated, and $v'V^{-1}v$ generally has an asymptotic chi-square distribution with $m - 1$ degrees of freedom where $V = \sum_{j=1}^{k} V_j$ and a ‘prime’ denotes vector transpose.
This construction also shows that a broad class of ‘censored data rank tests’ of the form $\sum_{j=1}^{k} w_j v_j$ also provide valid tests of the equality of the $m$ survival curves. In these tests, the ‘weight’ $w_j$ at time $t_j$ can be chosen to depend on the above conditioning event at $t_j$, or on data collected before $t_j$, more generally. The null hypothesis variance of these test statistics is readily estimated by $\sum_{j=1}^{k} w_j^2 V_j$, again leading to an asymptotic chi-square ($m - 1$) test statistic. The special case $w_j = \hat F(t_j^-)$ generalizes the Wilcoxon rank sum test to censored data in a natural fashion (e.g., Peto and Peto 1972, Prentice 1978) and provides an alternative to the logrank test that assigns greater weight to earlier failure times. More adaptive tests can also be entertained (e.g., Harrington and Fleming 1982).
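The estimators of this section can be made concrete with a small sketch. The code below is an illustrative implementation, not one taken from the cited sources: it computes the Kaplan and Meier product (3), the Greenwood variance (4), and, for the two-sample case only, the first component of the logrank statistic (6) with its hypergeometric variance. The function names and the toy data in the usage lines are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (t_j, survivor estimate, Greenwood variance) at each failure time.

    events[i] is 1 for an observed failure and 0 for a right-censored time.
    """
    failure_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv, greenwood_sum = 1.0, 0.0
    out = []
    for tj in failure_times:
        r = sum(1 for t in times if t >= tj)  # r_j: at risk just prior to t_j
        d = sum(1 for t, e in zip(times, events) if t == tj and e == 1)  # d_j
        surv *= 1.0 - d / r  # one factor of the product in (3)
        if r > d:
            greenwood_sum += d / (r * (r - d))  # one summand of (4)
            var = surv ** 2 * greenwood_sum
        else:
            var = 0.0  # all at risk failed; zero is a convention assumed here
        out.append((tj, surv, var))
    return out


def logrank_two_sample(times: Sequence[float], events: Sequence[int],
                       groups: Sequence[int]) -> Tuple[float, float]:
    """Return (score, variance) for sample 1 versus sample 0.

    score sums d_1j - u_1j over failure times; variance sums (V_j)_11.
    """
    failure_times = sorted({t for t, e in zip(times, events) if e == 1})
    score = variance = 0.0
    for tj in failure_times:
        r = sum(1 for t in times if t >= tj)
        r1 = sum(1 for t, g in zip(times, groups) if t >= tj and g == 1)
        d = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        d1 = sum(1 for t, e, g in zip(times, events, groups)
                 if t == tj and e == 1 and g == 1)
        score += d1 - r1 * d / r  # d_1j minus its conditional mean u_1j
        if r > 1:  # the variance term is zero when only one subject is at risk
            variance += r1 * (r - r1) * d * (r - d) / (r ** 2 * (r - 1))
    return score, variance


# Toy data, invented for illustration.
km = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
score, variance = logrank_two_sample([1, 2, 3, 4], [1, 1, 1, 1], [1, 0, 1, 0])
chi_square = score ** 2 / variance  # refer to chi-square on 1 degree of freedom
```

With two samples the quadratic form reduces to the squared score divided by its variance, which is why a single component suffices here; the general $m$-sample test needs the full vector and covariance matrix.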
3. Failure-time Regression: The Cox Model
3.1 Model Development
The same type of conditional probability argument generalizes to produce a flexible and powerful regression method for the analysis of failure-time data. This method was proposed in the fundamental paper by Cox (1972) and is referred to as Cox regression or as relative risk regression. Suppose that each subject has an associated vector of covariates $z' = (z_1, z_2, \ldots)$. For example, the elements of $z$ may include primary exposure variables and confounding factor measurements in an epidemiologic cohort study, or treatment indicators and patient characteristics in a randomized clinical trial. Let $\lambda(t; z)$ denote the hazard rate at time $t$ for an absolutely continuous failure time $T$ on an individual with covariate vector $z$. Now one requires the censoring patterns, and the cohort entry patterns, to be such that the hazard rate $\lambda(t; z)$ applies to the cohort members at risk at follow-up time $t$ and covariate value $z$ for each $t$ and $z$.
The Cox regression model specifies a parametric form for the hazard rate ratio $\lambda(t; z)/\lambda(t; z_0)$, where $z_0$ is a reference value (e.g., $z_0 = 0$). Because this ratio is nonnegative, it is common to use an exponential parametric model of the form $\exp\{x(t)\beta\}$. The vector $x(t) = \{x_1(t), \ldots, x_p(t)\}$ includes derived covariates that are functions of the elements of $z$ (e.g., main effects and interactions) and possibly product (interaction) terms between $z$ and $t$. The vector $\beta' = (\beta_1, \ldots, \beta_p)$ is a $p$-vector of regression parameters to be estimated. Denoting $\lambda_0(t) = \lambda(t; z_0)$ allows the model to be written in the standard form
$$\lambda(t; z) = \lambda_0(t)\exp\{x(t)\beta\}. \tag{7}$$
As in the previous section let $t_1 < t_2 < \cdots < t_k$ denote the ordered distinct failure times observed in follow-up of the cohort, and let $d_j$ denote the number of failures at $t_j$. Also let $R(t_j)$ denote the set of labels of individuals at risk for failure at $t_j^-$. We suppose first that there are no ties in the data so that $d_j = 1$. The conditional probability that the $i$th individual in the risk set with covariate $z_i$ fails at $t_j$, given that $d_j = 1$ and the failure, censoring (and covariate) information prior to $t_j$, is simply
$$\frac{\lambda(t_j; z_i)}{\sum_{l \in R(t_j)} \lambda(t_j; z_l)} = \frac{\exp\{x_i(t_j)\beta\}}{\sum_{l \in R(t_j)} \exp\{x_l(t_j)\beta\}}. \tag{8}$$
The product of such terms across the distinct failure times
$$L(\beta) = \prod_{j=1}^{k} \frac{\exp\{x_j(t_j)\beta\}}{\sum_{l \in R(t_j)} \exp\{x_l(t_j)\beta\}} \tag{9}$$
is known as the partial likelihood function for the relative risk parameter $\beta$. If one denotes $L(\beta) = \prod_{j=1}^{k} L_j(\beta)$, then the fact that the scores $\partial \log L_j(\beta)/\partial\beta$, $j = 1, \ldots, k$ have mean zero and are uncorrelated (Cox 1975) suggests that, under regularity conditions, the partial likelihood (9) has the same asymptotic properties as does the usual likelihood function. Let $\hat\beta$ denote the maximum partial likelihood estimate that solves $\partial \log L(\beta)/\partial\beta = 0$. Then, $n^{1/2}(\hat\beta - \beta)$ will generally converge, as the sample size $n$ becomes large, to a $p$-variate normal distribution having mean zero and covariance matrix consistently estimated by the inverse of $-n^{-1}\partial^2 \log L(\hat\beta)/\partial\hat\beta\,\partial\hat\beta'$. This result provides the usual basis for tests and confidence intervals on the elements of $\beta$.
Even though we are working with continuous models, there will often be some tied failure times. If the ties are numerous, it may be better to model the data using a discrete failure-time model. Often, however, a simple modification to (9) is appropriate. When the $d_j$ are small compared with the $r_j$, the approximate partial likelihood (Peto and Peto 1972, Breslow 1974)
$$L(\beta) = \prod_{j=1}^{k} \frac{\exp\{s_j(t_j)\beta\}}{\left[\sum_{l \in R(t_j)} \exp\{x_l(t_j)\beta\}\right]^{d_j}} \tag{10}$$
is quite adequate for inference, where $s_j(t_j) = \sum_{i \in D_j} x_i(t_j)$ and $D_j$ represents the set of labels of individuals failing at $t_j$. Efron (1977) suggested an alternate approximation that is only slightly more complicated and is somewhat better. Both the Breslow and Efron approximations are available in software packages.
3.2 Special Cases and Generalizations
Some special cases of the modeled regression vector $x(t)$ illustrate the flexibility of the relative risk model (7). If one restricts $x(t) = x$ to be independent of $t$, then the hazard ratios at any pair of $z$-values are constant. This gives the important proportional hazards special case of (7). For example, if the elements of $z$ include indicator variables for $m - 1$ of $m$ samples and $x(t) = (z_1, \ldots, z_{m-1})$ then each $e^{\beta_j}$, $j = 1, \ldots, m - 1$ denotes the hazard ratio for the $j$th sample compared with the last ($m$th) sample. In this circumstance the score test for $\beta = 0$ arising from (9) reduces to the logrank test of the preceding section.
The constant hazard ratio assumption, in this example, can be relaxed by defining $x(t) = \{g(t)z_1, \ldots, g(t)z_{m-1}\}$ for some specified function $g$ of follow-up time. For example $g(t) = \log t$ would allow hazard ratios of the form $t^{\beta_j}$ for the $j$th sample compared with the $m$th. This gives a model in which the hazard ratio for the $j$th vs. the $m$th population increases ($\beta_j > 0$) or decreases ($\beta_j < 0$) monotonely from unity as $t$ increases. The corresponding score test would be a weighted logrank test for the global null hypothesis. More generally one could define $x(t) = \{z_1, \ldots, z_{m-1}, g(t)z_1, \ldots, g(t)z_{m-1}\}$. This model is hierarchical with main effects for the sample indicators and interactions with $g(t)$. A test that the interaction coefficients are zero provides an approach to checking for proportionality of the hazard functions. Departure from proportionality would imply the need to retain some product terms in the model, or to relax the proportionality assumption in some other way.
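To make the partial likelihood concrete, the following sketch evaluates the logarithm of the Breslow approximation (10), which coincides with the logarithm of (9) when there are no tied failure times. It is a minimal illustration under stated assumptions: covariates are taken to be time-independent, and the function name and argument layout are hypothetical choices rather than the interface of any software package.

```python
import math
from typing import Sequence


def breslow_log_partial_likelihood(beta: Sequence[float],
                                   times: Sequence[float],
                                   events: Sequence[int],
                                   x: Sequence[Sequence[float]]) -> float:
    """Log of (10) for fixed covariate vectors x[i], one per subject.

    events[i] is 1 for an observed failure and 0 for a right-censored time.
    """
    def linear_predictor(i: int) -> float:
        return sum(b * xi for b, xi in zip(beta, x[i]))

    failure_times = sorted({t for t, e in zip(times, events) if e == 1})
    loglik = 0.0
    for tj in failure_times:
        # D_j: subjects failing at t_j; R(t_j): subjects still at risk.
        failing = [i for i, (t, e) in enumerate(zip(times, events))
                   if t == tj and e == 1]
        risk_set = [i for i, t in enumerate(times) if t >= tj]
        denom = sum(math.exp(linear_predictor(i)) for i in risk_set)
        # s_j(t_j) * beta minus d_j times the log of the risk-set sum.
        loglik += (sum(linear_predictor(i) for i in failing)
                   - len(failing) * math.log(denom))
    return loglik
```

A maximum partial likelihood estimate could then be obtained by passing the negative of this function to a general-purpose numerical optimizer; that step is omitted here.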
As a second special case consider a single quantitative predictor variable $z$. Under a proportional hazards assumption one can include various functions of $z$ in the modeled (time independent) regression vector $x(t) = x$ for model building and checking, just as in ordinary linear regression. The proportionality assumption can be relaxed in various ways including setting $x(t) = \{xI_1(t), \ldots, xI_q(t)\}$ where $I_1(t), \ldots, I_q(t)$ are indicator variables for a partition of the study follow-up period. The resulting model allows the relative risk parameters corresponding to the elements of $x = x(z)$ to be separately estimated in each follow-up interval. This provides a means for testing the proportionality assumption, and relaxing it as necessary to provide an adequate representation of the data.
The above description of the Cox model (7) assumed that a fixed regression vector $z$ was measured on each study subject. It is important to note that (7) can be relaxed to include measurements of stochastic covariates that evolve over time. This does not complicate the development of (9) or (10). Specifically, if $z(t)$ denotes a study subject’s covariates at follow-up time $t$, and $Z(t) = \{z(u); u < t\}$ denotes the corresponding covariate history up to time $t$, then the hazard rate $\lambda\{t; Z(t)\}$ can be modeled according to the right side of (7) with $x(t)$ comprising functions of $Z(t)$ and product terms between such functions and time. Careful thought must be given to relative risk parameter interpretation in the presence of stochastic covariates. For example, the interpretation of treatment effect parameters in a randomized controlled trial may be affected substantially by the inclusion of postrandomization covariates when those covariates are affected by the treatment. However, careful use of such covariates can serve to identify mechanisms by which the treatment works.
3.3 Survivor Function Estimation
The cumulative ‘baseline’ hazard function $\Lambda_0(t) = \int_0^t \lambda_0(u)\,du$ in (7) can be simply estimated (Breslow 1974) by
$$\hat\Lambda_0(t; \hat\beta) = \sum_{j \mid t_j \le t} \left[\frac{d_j}{\sum_{l \in R(t_j)} \exp\{x_l(t_j)\hat\beta\}}\right]. \tag{11}$$
If $x(t) = x$ is time-independent, the survivor function $F(t; z) = P(T > t \mid z) = \exp\{-\Lambda_0(t)e^{x\beta}\}$ is estimated by
$$\hat F(t; z) = \exp\{-\hat\Lambda_0(t; \hat\beta)e^{x\hat\beta}\}. \tag{12}$$
A slightly more complicated estimator $\hat F(t; z)$ is available if $x$ is time-dependent.
A practically important generalization of (7) allows the baseline hazard to depend on strata $s = s(z)$, or more generally $s = s(z, t)$ so that
$$\lambda(t; z) = \lambda_{0s}(t)\exp\{x(t)\beta\} \tag{13}$$
where $s$ takes integer values $1, 2, \ldots, m$ indexing the strata. This provides another avenue for relaxing a proportional hazards assumption. The stratified partial likelihood is simply a product of terms (9) over the various strata.
3.4 Asymptotic Distribution Theory
The score function $U(\beta) = \partial \log L(\beta)/\partial\beta$ from (9) involves a sum of terms that are uncorrelated but not independent. The theory of counting processes and martingales provides a framework in which this uncorrelated structure can be described, and a formal development of asymptotic distribution theory. Specifically, the counting process $N$ for a study subject is defined by $N(t) = 0$ prior to the time of an observed failure, and $N(t) = 1$ at or beyond the subject’s observed failure time. The counting process can be decomposed as
$$N(t) = \Lambda(t) + M(t) \tag{14}$$
where, under independent censorship,
$$\Lambda(t) = \int_0^t Y(u)\lambda\{u; Z(u)\}\,du \tag{15}$$
and $M(t)$ is a locally square integrable martingale. Note that in the ‘compensator’ $\Lambda$ the ‘at risk’ process $Y$ is defined by $Y(t) = 1$ up to the time of failure or censoring for the subject and zero thereafter. The decomposition (14) allows the logarithm of (9), and its derivatives, and the cumulative hazard estimator (11), to be expressed as stochastic integrals with respect to martingales, to which powerful convergence theorems apply. Andersen and Gill (1982) provide a thorough development of asymptotic distribution theory for the Cox model using these counting process techniques, and provide weak conditions for the joint asymptotic convergence of $n^{1/2}(\hat\beta - \beta)$ and $\hat\Lambda_0(\cdot, \hat\beta)$ to Gaussian processes. The books by Fleming and Harrington (1991) and Andersen et al. (1993) use the counting process formulation for the theory of each of the statistics mentioned above, and for various other tests and estimators.
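Returning to the estimators of Sect. 3.3, the cumulative baseline hazard estimator (11) and the survivor function estimator (12) can be sketched in a few lines. This is an illustration under stated assumptions: covariates are time-independent, the value of the regression parameter is supplied by the caller rather than estimated, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def breslow_cumulative_hazard(beta: Sequence[float],
                              times: Sequence[float],
                              events: Sequence[int],
                              x: Sequence[Sequence[float]]
                              ) -> List[Tuple[float, float]]:
    """Return (t_j, cumulative baseline hazard at t_j) following (11)."""
    def relative_risk(i: int) -> float:
        return math.exp(sum(b * xi for b, xi in zip(beta, x[i])))

    out, cumulative = [], 0.0
    for tj in sorted({t for t, e in zip(times, events) if e == 1}):
        d = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        denom = sum(relative_risk(i) for i, t in enumerate(times) if t >= tj)
        cumulative += d / denom  # increment of (11) at failure time t_j
        out.append((tj, cumulative))
    return out


def survivor_estimate(t: float, x_new: Sequence[float],
                      beta: Sequence[float],
                      baseline: List[Tuple[float, float]]) -> float:
    """Equation (12) at time t for a subject with covariate vector x_new."""
    cumulative = 0.0
    for tj, value in baseline:  # the estimate is a step function of t
        if tj <= t:
            cumulative = value
    risk = math.exp(sum(b * xi for b, xi in zip(beta, x_new)))
    return math.exp(-cumulative * risk)
```

When the regression parameter is zero and all covariates are ignored, the increments reduce to $d_j/r_j$, so the estimate is close to, though not identical with, the Kaplan and Meier estimator (3).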
4. Other Failure-time Regression Models
4.1 Parametric Models
Parametric failure-time regression models may be considered as an alternative to the semiparametric Cox model (7). For example, requiring $\lambda_0(t)$ in (7) to be a constant or a power function of time gives exponential and Weibull regression models respectively. The maximum partial likelihood estimate $\hat\beta$ of the preceding section is known to be fully efficient within the semiparametric model with $\lambda_0(t)$ in (7) completely unspecified. While some efficiency gain in $\beta$ estimation is possible if an assumed parametric model is true, such improvement is typically small and may be more than offset by a lack of robustness.
Of course there are many parametric regression models that are not contained in the Cox model (7). One such class specifies a linear model for the logarithm of time to failure,
$$\log T = \alpha + x\beta + \sigma u \tag{16}$$
where $x = x(z)$ is comprised of functions of a fixed regression vector $z$. The parameters $\alpha$ and $\sigma$ are location and scale parameters, $\beta$ is a vector of regression parameters and $u$ is an error variate. Because regression effects are additive on $\log T$, or multiplicative on $T$, this class of models is often referred to as the accelerated failure-time model. If the error $u$ has an extreme value distribution with density $f(u) = \exp(u - e^u)$, then the resulting model for $T$ is the Weibull regression model previously mentioned. More generally, any specified density for $u$ induces a hazard function model $\lambda(t; z)$ for $T$ to which standard likelihood estimation procedures apply, and asymptotic distribution theory can be developed based again on counting process theory.
4.2 A Competing Semiparametric Model
The model (16) has also been studied as a semiparametric class in which the distribution of the error $u$ is unspecified. Prentice (1978) considered the likelihood for the censored data rank vector based on the ‘residuals’ $\nu(\beta_0) = \log T - x\beta_0$ to generate a linear rank statistic
$$\sum_{j=1}^{k} (x_j - \bar x_j)\,w_j\,\nu_j(\beta_0) \tag{17}$$
for testing β l β . In (l7), j( β ) is the jth largest of the uncensored (β !) values, xj! is the corresponding ! value, x- is the average of the xmodeled covariate j values for the individuals having ‘residual’ equal or greater than j( β ), and wj is a weight, usually taken to ! be identically equal to one (logrank scores). An # estimator of β of β in (17) can be defined as a value of β for which the length of the vector (17) is minimized. ! calculation of β# is complicated by the fact that (17) The is a step function that requires a reranking of the residuals at each trial value of β . However, simulated annealing methods have been! developed (Lin and Geyer 1992) for this calculation, and resampling methods have been proposed for the estimation of confidence intervals for the elements of β (Parzen et al, 1994). See Ying (1993) for an account of the asymptotic theory for β# . The model (16) implies a hazard function of the form λ(t; z) l λ (te−xβ)e−xβ. (18) ! This form suggests a time-dependent covariate version of the accelerated failure-time model (e.g., Cox and Oakes 1984, Robins and Tsiatis 1992) given by 1
!
&!e t
λot, Z(t)q l λ
4
3
2
5
−x(u)βdu
e−x(t)β 7
6
(19)
8
which retains the notion of the preceding covariate
Surial Analysis: Oeriew history implying a rescaling or acceleration, by a factor e−x(t)β, of the rate at which a study subject traverses the failure-time axis.
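As a rough numerical check of the hazard form (18) in the Weibull special case, the following Python sketch may help. It assumes the extreme value error density exp(u − e^u), under which the baseline variable exp(α + σu) has a Weibull hazard with shape 1/σ and scale e^α. The function names and the parameter values used in any call are illustrative assumptions, not part of the article.

```python
import math

def baseline_hazard(t, alpha, sigma):
    """Hazard of T0 = exp(alpha + sigma*u) when u has density exp(u - e^u).

    Under that assumption T0 is Weibull with shape 1/sigma and scale e^alpha.
    """
    shape = 1.0 / sigma
    return shape * t ** (shape - 1.0) * math.exp(-alpha * shape)

def aft_hazard(t, xb, alpha, sigma):
    """Hazard of T from the accelerated failure-time form (18).

    xb stands for the linear predictor x*beta (a hypothetical scalar here).
    """
    scale = math.exp(-xb)
    return baseline_hazard(t * scale, alpha, sigma) * scale

def weibull_survivor(t, xb, alpha, sigma):
    """Survivor function of T implied by log T = alpha + x*beta + sigma*u."""
    return math.exp(-((t * math.exp(-alpha - xb)) ** (1.0 / sigma)))

def numeric_hazard(t, xb, alpha, sigma, h=1e-6):
    """Hazard as -d/dt log S(t), approximated by a central difference."""
    hi = math.log(weibull_survivor(t + h, xb, alpha, sigma))
    lo = math.log(weibull_survivor(t - h, xb, alpha, sigma))
    return -(hi - lo) / (2.0 * h)
```

Comparing `aft_hazard` with `numeric_hazard` at a few time points is one way to confirm that the closed form and the survivor function agree. In this special case the ratio of hazards at two covariate values does not depend on t, which is why the Weibull model is both an accelerated failure-time model and a proportional hazards model.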
5. Multiple Failure Times and Types

5.1 Multiple Failure Types and Competing Risks
Suppose that individuals are followed for the occurrence of a univariate failure time, but that the failure may be one of several types. The type $i$ specific hazard function at follow-up time $t$, for an individual having preceding covariate history $Z(t)$, can, for example, be specified as

$$\lambda_i\{t; Z(t)\} = \lambda_{0i}(t) \exp\{x(t)\beta_i\} \qquad (20)$$

where $\lambda_{0i}$ is a baseline hazard for the $i$th failure type and $\beta_i$ is a corresponding relative risk parameter. Estimation of $\beta_1, \beta_2, \ldots$ under independent censorship is readily based on a partial likelihood function that is simply a product of terms (9) over distinct failure types (e.g., Kalbfleisch and Prentice 1980, Chap. 7). In fact, inference on a particular regression vector $\beta_i$ and corresponding cumulative hazard function $\Lambda_{0i}$ is based on (9) upon censoring the failures of types other than $i$ at their observed failure times. Accelerated failure time regression methods similarly generalize for the estimation of type-specific regression parameters.

Very often questions are posed that relate to the relationship between the various types of failure. It should be stressed that data on failure time and failure type alone are insufficient for studying the interrelation among the failure mechanisms for the different types; rather, one generally requires the ability to observe the times of multiple failures on specific study subjects.
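The device of censoring failures of other types at their observed failure times can be illustrated without covariates. The Python sketch below estimates a type-specific cumulative hazard with a Nelson-Aalen style sum, which is an assumption made here for illustration (the article bases inference on the partial likelihood terms (9)). The coding of type 0 as censored and the function name are likewise hypothetical.

```python
def cause_specific_cumhaz(times, types, cause):
    """Nelson-Aalen style cumulative hazard for one failure type.

    times : observed follow-up times
    types : failure type per subject; 0 means censored (assumed coding)
    cause : the failure type of interest; all other types are treated
            as censored at their observed failure times
    Returns a list of (time, cumulative hazard) at failures of `cause`.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    cum = 0.0
    steps = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        j = i
        failures = 0
        # Collect every subject whose observed time is tied at t.
        while j < len(order) and times[order[j]] == t:
            if types[order[j]] == cause:
                failures += 1
            j += 1
        if failures > 0:
            cum += failures / at_risk
            steps.append((t, cum))
        # Everyone observed at t leaves the risk set, whatever the reason.
        at_risk -= j - i
        i = j
    return steps
```

For example, with times `[2, 3, 3, 5, 8]` and types `[1, 2, 1, 0, 1]`, the type 2 failure at time 3 contributes to the risk set for type 1 up to time 3 and is then removed, exactly as a censored observation would be.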
5.2 Recurrent Events

Suppose that there is a single type of failure, but that individual study subjects may experience multiple failures over time. Examples include multiple disease recurrences among cancer patients, multiple job losses in an employment history study, or multiple breakdowns of a manufactured item. The counting process $N$ at follow-up time $t$ is defined as the number of failures observed to occur on the interval $(0, t]$ for the specific individual. The asymptotic theory of Andersen and Gill (1982) embraced such counting processes in which the failure rate at time $t$ was of Cox model form

$$\lambda\{t; H(t)\} = \lambda_0(t) \exp\{x(t)\beta\}. \qquad (21)$$

In this expression, the conditioning event $H(t)$ includes the individual’s counting process history, in addition to the corresponding covariate history prior to follow-up time $t$. Hence the modeled covariate $x(t)$ must specify the dependence of the failure rate at time $t$ both on the covariates and on the effects of the individual’s preceding failure history. Various generalizations of (21) may be considered to acknowledge these conditioning histories in a flexible fashion. For example, $\lambda_0$ can be stratified on the number of preceding failures on the study subject; or the argument in $\lambda_0$ may be replaced by the gap time from the preceding failure (Prentice et al. 1981).

A useful alternative to (21) involves modeling a marginalized failure rate in which one ‘forgets’ some aspects of $H(t)$. For example, one could replace $H(t)$ in (21) by the covariate history $Z(t)$ and the number $s = s(t)$ of failures for the individual prior to time $t$. A Cox-type model $\lambda_{0s}(t) \exp\{x(t)\beta_s\}$ for the resulting failure rate function would allow one to focus on covariate effects without having to model the timing of the $s$ preceding failures. While the parameters $\beta_s$ can again be estimated using the score equations from a stratified version of (9), a corresponding sandwich variance formula is needed to acknowledge the variability introduced by the nonincreasing conditioning event (Wei et al. 1989, Pepe and Cai 1993, Lawless and Nadeau 1995).
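The quantities that enter these recurrent-event models (the counting process N(t), the number s(t) of failures prior to t, and the gap time from the preceding failure) are simple functions of one subject's failure times. The following Python sketch computes them. The function names, and the convention that the gap time is measured from time 0 when there is no preceding failure, are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right

def counting_process(event_times, t):
    """N(t): number of failures observed on the interval (0, t]."""
    return bisect_right(sorted(event_times), t)

def prior_failures(event_times, t):
    """s(t): number of failures strictly before time t."""
    return bisect_left(sorted(event_times), t)

def gap_time(event_times, t):
    """Time elapsed at t since the preceding failure.

    Measured from time 0 if the subject has not yet failed
    (an assumed convention).
    """
    earlier = [e for e in event_times if e < t]
    return t - max(earlier) if earlier else t
```

A subject with failures at 1.5, 4.0 and 6.5 has N(4.0) = 2 but s(4.0) = 1, since the failure at 4.0 itself is counted by N but is not part of the history prior to 4.0. Stratifying the baseline rate on `prior_failures` or replacing its argument by `gap_time` corresponds to the two generalizations of (21) mentioned above.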
5.3 Correlated Failure Times

In some settings, such as group randomized intervention trials, the failure times on individuals within groups may be correlated, a feature that needs to be properly accommodated in data analysis. In other contexts, such as family studies of disease occurrence in genetic epidemiology, study of the nature of the dependency between failure times may be a key research goal. In the former situation, hazard-function models of the types described above can be specified for the marginal distributions of each of the correlated failure times, and marginal regression parameters may be estimated using the techniques of the preceding sections. A sandwich-type variance estimator is needed to accommodate the dependencies between the estimating function contributions for individuals in the same group (Wei et al. 1989, Prentice and Hsu 1997).

If the nature of the dependence between failure times is of substantive interest, it is natural to consider nonparametric estimation of the bivariate survivor function $F(t_1, t_2) = P(T_1 > t_1, T_2 > t_2)$. Strongly consistent and asymptotically Gaussian nonparametric estimators are available (Dabrowska 1988, Prentice and Cai 1992), though these estimators are nonparametric efficient only if $T_1$ and $T_2$ are independent (Gill et al. 1995). Nonparametric efficient estimators away from this independence special case can be developed through a grouped data nonparametric maximum likelihood approach (Van der Laan 1996), though this estimation problem remains an active area of investigation.
5.4 Frailty Models

There is also substantial literature on the use of latent variables for the modeling of multivariate failure times. In these models, failure times are assumed to be independent given the value of a random effect, or frailty, which is usually assumed to affect the hazard function in a multiplicative (time-independent) fashion. One of the simplest such models arises if a frailty variate has a gamma distribution. This generates the Clayton (1978) bivariate model with joint survivor function

$$F(t_1, t_2) = \{F_1(t_1)^{-\theta} + F_2(t_2)^{-\theta} - 1\}^{-1/\theta} \qquad (22)$$

where $\theta$ controls the extent of dependence between $T_1$ and $T_2$, with maximal positive dependence as $\theta \to \infty$, maximal negative dependence as $\theta \to -1$, and independence at $\theta = 0$ (e.g., Oakes 1989). Nielsen et al. (1992) generalize this model to include regression variables and propose an expectation-maximization algorithm with profiling on $\theta$ for model fitting. Andersen et al. (1993) and Hougaard (2000) provide fuller accounts of this approach to multivariate failure-time analysis.
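The role of θ in (22) can be checked numerically. The sketch below evaluates the Clayton joint survivor function from the two marginal survivor probabilities. It is restricted to θ ≥ 0 as a simplifying assumption (negative θ needs extra care because the term in braces can become negative), and the function name is hypothetical.

```python
def clayton_survivor(s1, s2, theta):
    """Joint survivor probability (22) from marginal survivor values.

    s1, s2 : marginal survivor probabilities F1(t1), F2(t2), each in (0, 1]
    theta  : dependence parameter; only theta >= 0 is handled in this sketch
    """
    if theta < 0:
        raise ValueError("this sketch only handles theta >= 0")
    if theta == 0:
        # Independence: the limiting form of (22) as theta tends to 0.
        return s1 * s2
    return (s1 ** (-theta) + s2 ** (-theta) - 1.0) ** (-1.0 / theta)
```

Setting one marginal to 1 returns the other marginal, small positive θ gives values close to the product s1 * s2, and large θ gives values close to min(s1, s2), which is the upper bound for any joint survivor function with these marginals.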
6. Additional Survival Analysis Topics

Space permits only brief mention of a few additional survival analysis topics.

Considerable effort has been devoted to model building and testing aspects of the use of failure-time regression models. The monograph by Therneau and Grambsch (2000) summarizes much of this work in the context of the Cox model (7).

The cost of assembling covariate histories has led to a number of proposals for sampling within cohorts, whereby the denominator terms in (9) are replaced by suitable estimates, based on a subset of the subjects at risk. These include nested case-control sampling, in which the denominator at $t_j$ is replaced by a sum that involves the individual failing at $t_j$ and controls randomly selected from $R(t_j)$, independently at each distinct $t_j$, and case-cohort sampling, in which the denominator summation at $t_j$ consists of the failing individual along with the at-risk members of a fixed random subcohort (see Samuelsen (1997) for a recent contribution).

Considerable effort has also been directed to allowing for measurement error in regression variables, particularly in the context of the Cox model (7) with time-independent covariates. Huang and Wang (2000) provide a method for consistently estimating $\beta$ in (7) under a classical measurement error model, and provide a summary of the literature on this important topic.

Testing and estimation on regression parameters have also been considered in the context of clinical trials under sequential monitoring. The recent book by Jennison and Turnbull (2000) summarizes this work.

Additional perspective on the survival analysis literature can be obtained from the overviews of Andersen and Keiding (1999) and of Oakes (2000). Goldstein and Harrell (1999) provide a description of the capabilities of various statistical software packages for carrying out survival analyses.
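The nested case-control design mentioned in this section can be sketched in a few lines of Python. This is only an illustration of the sampling step under stated assumptions: the risk set at a failure time is taken to be all other subjects still under observation at that time, a fixed number m of controls is drawn without replacement, and the function and argument names are hypothetical.

```python
import random

def nested_case_control(times, events, m, seed=0):
    """Sample up to m controls from the risk set at each failure time.

    times  : observed follow-up time per subject
    events : 1 if the subject failed at that time, 0 if censored
    m      : number of controls per case (assumed fixed)
    Returns a list of (case index, list of control indices).
    """
    rng = random.Random(seed)
    sampled = []
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue
        # Subjects other than the case who are still at risk at t_i.
        risk = [k for k, t_k in enumerate(times) if t_k >= t_i and k != i]
        # Controls are drawn afresh for each case, so the same subject
        # may serve as a control at several failure times.
        controls = rng.sample(risk, min(m, len(risk)))
        sampled.append((i, controls))
    return sampled
```

Each sampled set (the case plus its controls) would then stand in for the full risk set in the corresponding denominator of (9). How the resulting terms are weighted is a design-specific matter that this sketch does not address.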
Bibliography

Andersen P K, Gill R D 1982 Cox’s regression model for counting processes: A large sample study. Annals of Statistics 10: 1100–20
Andersen P K, Borgan O, Gill R D, Keiding N 1993 Statistical Models Based on Counting Processes. Springer, New York
Andersen P K, Keiding N 1999 Survival analysis, overview. Encyclopedia of Biostatistics. Armitage P, Colton T (eds.). Wiley, New York, Vol. 6, pp. 4452–61
Breslow N E 1974 Covariance analysis of censored survival data. Biometrics 30: 89–99
Clayton D 1978 A model for association in bivariate life tables and its application in studies of familial tendency in chronic disease incidence. Biometrika 65: 141–51
Cox D R 1972 Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B, 34: 187–220
Cox D R 1975 Partial likelihood. Biometrika 62: 269–76
Cox D R, Oakes D 1984 Analysis of Survival Data. Chapman & Hall, London
Dabrowska D 1988 Kaplan-Meier estimate on the plane. Annals of Statistics 16: 1475–89
Efron B 1977 Efficiency of Cox’s likelihood function for censored data. Journal of the American Statistical Association 72: 557–65
Fleming T R, Harrington D P 1991 Counting Processes and Survival Analysis. Wiley, New York
Gill R D, Van der Laan M, Wellner J A 1995 Inefficient estimators of the bivariate survival function for three models. Annals of the Institute Henri Poincaré, Probability and Statistics 31: 547–97
Goldstein R, Harrell F 1999 Survival analysis, software. Encyclopedia of Biostatistics. Armitage P, Colton T (eds.). Wiley, New York, Vol. 6, pp. 4461–6
Harrington D P, Fleming T R 1982 A class of rank test procedures for censored survival data. Biometrika 69: 133–43
Hougaard P 2000 Analysis of Multivariate Survival Data. Springer, New York
Huang Y, Wang C Y 2000 Cox regression with accurate covariates unascertainable: A nonparametric correction approach. Journal of the American Statistical Association 95: 1209–19
Jennison C, Turnbull B W 2000 Group Sequential Methods with Application to Clinical Trials. Chapman & Hall, New York
Kalbfleisch J D, Prentice R L 1980 The Statistical Analysis of Failure Time Data. Wiley, New York
Kaplan E L, Meier P 1958 Nonparametric estimation from incomplete observations. Journal of the American Statistical Association 53: 457–81
Lawless J F, Nadeau C 1995 Some simple and robust methods for the analysis of recurrent events. Technometrics 37: 158–68
Lin D Y, Geyer C J 1992 Computational methods for semiparametric linear regression with censored data. Journal of Computational and Graphical Statistics 1: 77–90
Lin D Y, Wei L J, Ying Z 1998 Accelerated failure time models for counting processes. Biometrika 85: 608–18
Mantel N 1966 Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer and Chemotherapy Reports 50: 163–70
Nielsen G G, Gill R D, Andersen P K, Sorensen T I A 1992 A counting process approach to maximum likelihood estimation in frailty models. Scandinavian Journal of Statistics 19: 25–43
Oakes D 1989 Bivariate survival models induced by frailties. Journal of the American Statistical Association 84: 487–93
Oakes D 2000 Survival analysis. Journal of the American Statistical Association 95: 282–5
Parzen M I, Wei L J, Ying Z 1994 A resampling method based on pivotal estimating functions. Biometrika 81: 341–50
Pepe M S, Cai J 1993 Some graphical displays and marginal regression analysis for recurrent failure times and time dependent covariates. Journal of the American Statistical Association 88: 811–20
Peto R, Peto J 1972 Asymptotically efficient rank invariant test procedures (with discussion). Journal of the Royal Statistical Society, Series A, 135: 185–206
Prentice R L 1978 Linear rank tests with right censored data. Biometrika 65: 167–79
Prentice R L, Cai J 1992 Covariance and survivor function estimation using censored multivariate failure time data. Biometrika 79: 495–512
Prentice R L, Hsu L 1997 Regression on hazard ratios and cross ratios in multivariate failure time analysis. Biometrika 84: 349–63
Prentice R L, Williams B J, Peterson A V 1981 On the regression analysis of multivariate failure time data. Biometrika 68: 373–9
Robins J M, Tsiatis A A 1992 Semiparametric estimation of an accelerated failure time model with time-dependent covariates. Biometrika 79: 311–9
Samuelsen S O 1997 A pseudolikelihood approach to analysis of nested case-control studies. Biometrika 84: 379–84
Therneau T, Grambsch P 2000 Modeling Survival Data: Extending the Cox Model. Springer, New York
Van der Laan M J 1996 Efficient estimation in the bivariate censoring model and repairing NPMLE. Annals of Statistics 24: 596–627
Wei L J, Lin D Y, Weissfeld L 1989 Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. Journal of the American Statistical Association 84: 1065–73
Ying Z 1993 A large sample study of rank estimation for censored regression data. Annals of Statistics 21: 76–99
R. L. Prentice and J. D. Kalbfleisch
Sustainability Transition: Human–Environment Relationship

The reconciliation of society’s human developmental goals with the planet’s environmental limits over the long term is the foundation of an idea known as sustainable development. This idea emerged in the early 1980s from scientific perspectives on the interdependence of society and environment, but even as it garnered increasing political attention and acceptance around the world, its scientific base weakened. The decade of the 1990s has seen an effort to re-engage the scientific community around the requirements for a sustainability transition. Beyond its commonplace meaning as a transition towards a state of sustainable development, a sustainability transition has been studied as a series of interlinked transitions, as a process of adaptive management and social learning, and as a set of indicators and future scenarios.
1. Sustainable Development

The origins of sustainable development can be traced back through the 1980 World Conservation Strategy and the 1972 Stockholm Conference on the Human Environment to the early days of the international conservation movement (O’Riordan 1988, Adams 1990). But the contemporary linking of environment and development is little more than a decade old, stemming from Our Common Future, the 1987 report of the World Commission on Environment and Development (also known as the Brundtland Commission). The Commission defined sustainable development as the ability of humanity ‘to ensure that it meets the needs of the present without compromising the ability of future generations to meet their own needs’ (WCED 1987).

An extraordinarily diverse set of groups and institutions has taken this widely accepted statement of sustainable development and projected upon it their own hopes and aspirations. While sharing a common concern for the fate of the earth, proponents of sustainable development differ in what they think is to be sustained, what is to be developed, how each is related to the other, and for how long a time.

The most common view is that what is to be sustained is the planet’s life-support systems, particularly those supporting human life (Kates 1994). These systems include natural resources as well as environments that provide aesthetics, recreation, pollution absorption and cleansing, and other ecosystem services (Daily 1997). Contrasting with this anthropocentric view is one of sustaining nature itself in all its biodiversity and assemblages of life forms. These ought to be sustained, not only for their utilitarian service to humans, but because of the moral obligations of humanity arising either as ‘stewardship’ (still acknowledging the primacy of humans) or as a form of ‘natural rights’ in which earth and its other living things have equal claims for existence and sustenance.
Also, for some, cultural diversity is seen as the counterpoint to biological diversity, and various communities that include distinctive cultures, particular groups of people, and specific places need to be sustained.

The most common view of what is to be developed is the economy that provides employment, needed income, desired consumption, and the means for investment in physical and human capital, as well as environmental maintenance and restoration. An alternative emphasis is on people-centered human development which focuses on the quantity of life, as seen in the survival of children and their increased life expectancy, and the quality of life through education and equal opportunity. Finally, some discussions of what is to be developed emphasize society itself, addressing the well-being and security of national states, regions and institutions.

The relationship between what is to be sustained and what is to be developed also differs. Some conceptual statements provide equal emphasis for both; others, while paying homage to sustainable development, seem to be saying ‘sustain only,’ ‘develop mostly,’ or ‘sustain or develop subject to some minimal constraint on the other.’ Similarly, the implied time horizons for sustainable development differ. There is universal acceptance that such time horizons are intergenerational, but these then range from a single generation of 25 years, through several generations, to an unstated but implicit forever. Each of these time periods presents very different prospects and challenges for sustainable development. Over the space of a single generation, almost any development appears sustainable; over an infinite forever none does.

At the century’s end, the difficulties of actually delivering on these diverse hopes that people around the world have attached to the idea of sustainable development had become increasingly evident. In part, these difficulties reflected political problems, grounded in questions of financial resources, equity and the competition of other issues for the attention of decision-makers. In part, they reflected the differing views about what should be developed, what should be sustained and over what period.
Additionally, however, the political impetus that carried the idea of sustainable development so far and so quickly in public forums has also increasingly distanced it from its scientific and technological base.
2. Interlinked Transitions

Thus during the 1990s several groups of scientists sought to reinvigorate their historic connection with the concept of sustainable development by focusing not on the larger concept, with all its actual and implied diverse meanings, but on a sustainability transition. The interest in transitions arose in part from the widespread perception that the continuation of current trends in both environment and development would not provide for the desired state of sustainable development. To the extent that transitions represent breaks in such trends, there was growing interest in identifying the needed or desired transitions in the relationships between society and environment.
There was also widespread interest in the post-Cold War world in the nature of transitions themselves. Attention was focused on the economic transition from state to market control and on the civil society transition from single-party, military or state-run institutions to multiparty politics and a rich mix of nongovernmental institutions (Mathews 1997). Earlier transitions had also been identified, such as that in settlement patterns from rural to urban; in agricultural productivity from increases in production derived from additional land to increases derived from greater yields; and in health from early death by infectious diseases to late death by cancer, heart attack, and stroke. For the environment, significant transitions were seen for biogeochemical cycles in a shift from dominance by natural processes to dominance by human releases; for specific pollutants from increasing to decreasing rates of emissions; and for land cover in temperate zones a change from deforestation to reforestation (Turner et al. 1990).

But the greatest interest lay in the successful confirmation of the ‘demographic transition’: the change in population regimes from ones of high birth and death rates to ones of low birth and death rates. The demographic transition appeared credible because it met scientific criteria: it was partly supported by theory, matched the data well and had predictive power (Kates 1996). It was also seen as both needed and desired because of the widespread consensus that a rapidly growing population made human development more difficult, while further stressing the natural systems that needed to be sustained. On a global scale, at the century’s end, the transition in birth rates was more than halfway towards birth stability, the 2.1 births required to achieve eventual zero population growth, with an average of less than three children for each woman of reproductive age, compared with five at the post-World War II peak of population growth.
The death transition was even more advanced, with life expectancy at birth having increased from 40 years to 66 years, about three-quarters of the way to the stabilizing stage of the transition, when life expectancy is likely to reach 75 years. Mid-range projections (in 1998) foresaw further slowing of population growth rates with 8.9 billion people in the world by 2050 and stabilizing to 9.5 billion by the century’s end (United Nations 1999).

Thus, in the first major conceptualization of the ‘sustainability transition,’ Speth (1992) set forth a series of five interlinked transitions (demographic, technological, economic, social and institutional) as collective requirements for a sustainability transition. In addition to the desired demographic transition, Speth saw the need for a transition towards technologies that were environmentally benign and sharply reduced the consumption of natural resources and the generation of waste and pollutants. For the economy, the desired transition was towards one in which prices reflected their full environmental costs. A social transition would move towards a fairer sharing of economic and environmental benefits both within and between countries. All of these transitions required a fifth: a transition in the institutional arrangements between governments, businesses and people that would be less regulation-driven and more incentive-led, less confrontational and more collaborative.

To these five, Gell-Mann (1994) further suggested an ideological transition towards a sense of solidarity that can encompass both the whole of humanity and the other organisms of the biosphere, and an informational transition that integrates disciplinary knowledge and disseminates it across the society. Together, Speth and Gell-Mann present these transitions as requirements for a more sustainable world: if each individual transition is completed successfully, the result would constitute a sustainability transition. And all of these transitions may be underway, as there are some trends in each of the desired directions. To investigate this concept they organized a major, albeit uncompleted, research effort known as the ‘2050’ project (World Resources Institute, Brookings Institution, Santa Fe Institute) that sought to track these transitions and understand their interactions.
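The progress figures quoted for the demographic transition ("more than halfway" for births, "about three-quarters" for deaths) can be reproduced with simple arithmetic. The Python sketch below assumes that progress is measured linearly between the starting value and the stabilizing value, which is an interpretation made here rather than a method stated in the text, and it uses 3.0 births as an upper bound for "less than three children."

```python
def transition_progress(start, current, target):
    """Fraction of the distance from start to target already covered.

    Assumes a linear scale between the starting and stabilizing values
    (an assumption of this sketch).
    """
    return (current - start) / (target - start)

# Births per woman: 5 at the post-war peak, under 3 now, 2.1 at stability.
birth_progress = transition_progress(5.0, 3.0, 2.1)

# Life expectancy at birth: from 40 years to 66 years, stabilizing near 75.
death_progress = transition_progress(40.0, 66.0, 75.0)
```

With these inputs the birth transition comes out at roughly 0.69 and the death transition at roughly 0.74. Because actual fertility is below three, the birth figure is a lower bound.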
3. Adaptive Management and Social Learning

A different approach was adopted by the National Academy of Sciences (USA) Board on Sustainable Development (1999). It too adopted a two-generation time horizon, to the year 2050. But in Our Common Journey: A Transition Toward Sustainability, the Board doubted that any specific set of trends or transitions constituted necessary or sufficient conditions for sustainability. It argued that the pathway to a sustainability transition could not be mapped in advance. Instead, it would have to be navigated adaptively through trial and error and conscious policy experimentation (Holling 1978, Lee 1993), through a process of social learning (Social Learning Group 2001), and through other devices such as indicators and scenarios (described below).

What could be selected were goals for a sustainability transition. In the Board’s judgment, the primary goals of a transition toward sustainability over the next two generations should be to meet the needs of a much larger but stabilizing human population, to sustain the life-support systems of the planet, and to reduce hunger and poverty substantially. Specific targets for these goals for human well-being and environmental preservation have been defined over the past few decades through extensive processes of international political debate and intergovernmental action.

In the area of human needs, internationally agreed-on targets exist for providing food and nutrition, nurturing children, finding shelter and providing an education, although not for finding employment. Thus, there is an implicit hierarchy of needs that favors children and people in disasters, and favors feeding and nurturing first, followed by education, housing and employment.

Compared with targets for meeting human needs, quantitative targets for preserving life-support systems are fewer, more modest and more contested. Global targets now exist for ozone-depleting substances and greenhouse gases, and regional targets exist for some air pollutants. Absolute prohibitions (zero targets) exist for ocean dumping of radioactive wastes and some toxics, for the taking and/or sale of a few large mammals (whales, elephants and seals), migratory birds when breeding or endangered, and certain regional fishing stocks. Water, land resources and ecosystems such as arid lands and forests have, at best, qualitative targets to achieve sustainable management or restoration. International standards exist for many toxic materials, organic pollutants and heavy metals that threaten human health, but not for ecosystem health.

A major action and research agenda was developed in Our Common Journey based on both current scientific understanding and the requirements of an emerging sustainability science. For the core sectoral areas of sustainable development identified more than a decade ago by the Brundtland Commission (human population and well-being, cities, agriculture, energy and materials, and living resources), the Board identified concrete goals and appropriate next steps to accelerate major transitions underway or needed in each sector.
To create a research agenda of what might be called ‘sustainability science,’ the Board would promote the creation of usable knowledge from scientific understanding; would seek to integrate global and local perspectives to shape a ‘place-based’ understanding of the interactions between environment and society; and would initiate focused research programs on a small set of understudied questions that are central to a sustainability transition.
4. Indicators and Scenarios

The largest scientific effort to date has been to create indicators of sustainable development and to use them to track a transition toward sustainability. This effort sought to make more realistic the diverse meanings of sustainable development by attempting to develop indicators of the different values so implied. A major effort was launched by the Scientific Committee on Problems of the Environment (Moldan et al. 1997) and an important alternative effort by the so-called Balaton Group (Meadows 1998). These scientific efforts complemented similar efforts by various local, national, and international groups to create indicators as ways of expressing desired values for sustainable futures (http://iisd1.iisd.ca/measure/compendium.htm and United Nations Commission for Sustainable Development 1996). A set of 40 economic, environmental and social indicators for the USA that use existing data sets is shown in Table 1 (http://www.sdi.gov/reports.htm).

Table 1 Sustainable development in the USA: an experimental set of indicators

| Economic | Environmental | Social |
| Capital assets | Surface-water quality | Population of USA |
| Labor productivity | Ratio of renewable water supply to withdrawals | Life expectancy at birth |
| Domestic product | Fisheries utilization | Births to single mothers |
| Income distribution | Acres of major terrestrial ecosystems | Children living in families with only one parent present |
| Consumption expenditures per capita | Invasive alien species | Educational achievement rates |
| Unemployment | Conversion of cropland to other uses | Educational attainment by level |
| Inflation | Soil-erosion rates | Teacher training level and application of qualifications |
| Investment in R&D as a percentage of GDP | Timber growth to removals balance | People in census tracts with 40 percent or greater poverty |
| Federal debt to GDP ratio | Outdoor recreational activities | Crime rate |
| Energy consumption per capita and per $ of GDP | Contaminants in biota | Participation in the arts and recreation |
| Materials consumption per capita and per $ of GDP | Quantity of spent nuclear fuel | Contributing time and money to charities |
| Homeownership rates | Identification and management of superfund sites | Percentage of households in problem housing |
| | Metropolitan air quality nonattainment | |
| | Status of stratospheric ozone | |
| | Greenhouse-gas emissions | |
| | Greenhouse climate response index | |

In contrast to indicators such as these, which can inform society to what extent progress is being made in a transition toward sustainability, alternative pathways for a transition were explored through long-range development scenarios that reflect the uncertainty about how the global system might unfold. While they are rigorous, reflecting the insights of science and modeling, these scenarios are told in the language of words as well as numbers because assumptions about culture, values, lifestyles, and social institutions require qualitative description. Sets of scenarios were created that reflect current trends and reform proposals, various forms of social and environmental breakdown, and more fundamental transitions or transformations (Raskin et al. 1998, Hammond 1998).

Based on these and other data, the National Academy of Sciences (USA) Board on Sustainable Development (1999) concluded that:

Although the future is unknowable … a successful transition toward sustainability is possible over the next two generations. This transition could be achieved without miraculous technologies or drastic transformations of human societies. What will be required, however, are significant advances in basic knowledge, in the social capacity and technological capabilities to utilize it, and in the political will to turn this knowledge and know-how into action.
Bibliography

Adams W M 1990 Green Development: Environment and Sustainability in the Third World. Routledge, London
Daily G C 1997 Nature’s Services: Societal Dependence on Natural Ecosystems. Island Press, Washington, DC
Gell-Mann M 1994 The Quark and the Jaguar: Adventures in the Simple and Complex. W. H. Freeman, New York
Hammond A 1998 Which World? Scenarios for the 21st Century. Island Press, Washington, DC
Holling C S (ed.) 1978 Adaptive Environmental Assessment and Management. Wiley, Chichester, UK
Kates R W 1994 Sustaining life on earth. Scientific American 271: 114–22
Kates R W 1996 Population, technology, and the human environment: A thread through time. Daedalus 125(3): 43–71
Lee K N 1993 Compass and Gyroscope: Integrating Science and Politics for the Environment. Island Press, Washington, DC
Mathews J T 1997 Power shift. Foreign Affairs 76: 50–66
Meadows D 1998 Indicators and Information Systems for Sustainable Development. A report to the Balaton Group. The Sustainability Institute, Hartland Four Corners, VT
Moldan B, Billharz S, Matravers R (eds.) 1997 Sustainability Indicators: A Report on the Project on Indicators of Sustainable Development. Wiley, New York and Chichester, UK
Board on Sustainable Development, National Research Council 1999 Our Common Journey: A Transition Toward Sustainability. National Academy Press, Washington, DC
O’Riordan T 1988 The politics of sustainability. In: Turner R K (ed.) Sustainable Environmental Management: Principles and Practice. Westview Press, Boulder, CO
Raskin P, Gallopin G, Gutman P, Hammond A, Swart R 1998 Bending the Curve: Toward Global Sustainability. A Report of the Global Scenario Group. Stockholm Environment Institute, Boston
Speth J 1992 The transition to a sustainable society. Proceedings of the National Academy of Sciences of the United States of America 89: 870–72
The Social Learning Group 2001 Learning to Manage Global Environmental Risks: Vol 1. A Comparative History of Social Responses to Climate Change, Ozone Depletion, and Acid Rain. Vol 2. A Functional Analysis of Social Responses to Climate Change, Ozone Depletion, and Acid Rain. MIT Press, Cambridge, MA
Turner B II, Clark W, Kates R, Richards J, Mathews J, Meyer W 1990 The Earth as Transformed by Human Action: Global and Regional Changes in the Biosphere Over the Past 300 Years. Cambridge University Press, Cambridge, UK
United Nations 1999 World Population Prospects: The 1998 Revision. United Nations Population Division, New York
United Nations Commission for Sustainable Development 1996 Indicators of Sustainable Development: Framework and Methodologies. United Nations, New York
World Commission on Environment and Development 1987 Our Common Future. Oxford University Press, Oxford, UK
R. W. Kates
Sustainable Development In the late 1980s, the World Commission on Environment and Development (WCED), otherwise known as the Brundtland Commission, defined sustainable development as ‘development that meets the needs of the present without compromising the ability of future generations to meet their own needs’ (WCED
1987). Since then, sustainable development has received significant attention from the global community at the local, national, and international level amid concerns about climate change, biodiversity loss, tropical deforestation, and other environmental depletion and degradation problems. This article discusses briefly the emergence of the concept of sustainable development and provides a broad definition of sustainable development. Two distinct interpretations of sustainable development—weak and strong sustainability—are compared and contrasted. Recent advances in three areas of the literature that are relevant to understanding sustainable development and to designing appropriate policy responses are examined. These are: ecological functioning and resilience; environmental Kuznets’ curve (EKC) and the environment–growth debate; and endogenous growth, technological innovation, and resource dependency. For each area, the basic concept is explained and the implications for sustainable development are discussed. Finally, the policy issues and options for achieving sustainable development are discussed. Four key options are identified: measuring the ecological and economic impact of declining natural resources; improving our estimates of the economic costs of depleting natural capital; establishing appropriate incentives, institutions, and investments for sustainable management of natural capital; and encouraging interdisciplinary collaboration between the relevant social and behavioral sciences undertaking research on sustainable development.
1. Concept of Sustainable Development
1.1 Definition of Sustainable Development
The broad concept of sustainable development (Pearce and Barbier 2000) encompasses considerations of equity across and within generations, taking a longer-term perspective, and accounting for the value of the environment in decision-making (Pearce et al. 1989). As noted by Turner (1997), the concept of sustainable development has been challenged on several grounds: as an oxymoron, as a means of imposing a particular political position, as an approach to divert attention from more pressing socioeconomic problems, and in light of the history of the human–environment relationship (see Human–Environment Relationships). However, sustainable development continues to receive increasing international recognition and it has become a key guiding principle for the global society at the start of the new millennium (National Research Council 1999). There have been a substantial number of diverse and wide-ranging interpretations of sustainable development, both across and within scientific disciplines. Pearce et al. (1989) provide a collection of such definitions; for summaries of recent developments see Goldin and Winters (1995) and Pearce and Barbier (2000). In general, sustainable development implies that economic development today must ensure that future generations are left no worse off than present generations. This means that future generations should be entitled to at least the same level of economic opportunities (and thus at least the same level of economic welfare) as is currently available to present generations. As Pezzey (1989) succinctly states, the welfare of society should not be declining over time.
1.2 Total Capital Stock
The total stock of capital employed by the economic system, which includes physical, human, and natural capital, determines the full range of economic opportunities and welfare available to both present and future generations. Natural capital comprises natural resources and environments and provides various goods and services to society. For example, tropical forests provide direct use benefits in the form of timber, fruits, and nuts; indirect use benefits such as recreation, watershed protection, biodiversity protection, carbon storage, and climate maintenance; and nonuse benefits, which include existence and bequest values (Barbier et al. 1994). Sustainable development requires managing and enhancing the total capital stock so that its aggregate value does not decline over time. Therefore, society must decide how best to use its total capital stock today to increase current economic activities and welfare, and how much it needs to save or even accumulate for the welfare of future generations. However, it is not simply the aggregate stock of capital in the economy that matters but also its composition. In particular, it matters whether present generations are ‘using up’ one form of capital to meet the needs of today. For example, much of the recent interest in sustainable development has arisen out of concern that current economic development may be leading to the rapid accumulation of physical and human capital, but at the expense of excessive depletion and degradation of natural capital. The major concern has been that the irreversible depletion of the world’s stock of natural wealth will have detrimental implications for the welfare of future generations (see Consumption, Environmentally Significant).
While it is generally accepted that economic development around the world is leading to the irreversible depletion of natural capital, there is widespread disagreement as to whether this necessarily implies that such development is inherently unsustainable. From an economic standpoint, the critical issue of debate is not whether natural capital is being irreversibly depleted. Instead, as argued by Mäler (1995), the more crucial issue for sustainable development is whether we can compensate future generations for the current loss of natural capital and, if that is the case, how much is required to compensate future generations for this loss. This depends on whether natural capital is considered to have a unique or essential role in sustaining human welfare and whether special ‘compensation rules’ are required to ensure that future generations are not made worse off by natural capital depletion today.
1.3 Weak and Strong Sustainability
Two distinct interpretations of sustainable development are commonly referred to in the literature as ‘weak’ and ‘strong’ sustainability. According to the weak sustainability view, there is essentially no inherent difference between natural and other forms of capital, and hence the same optimal depletion rules ought to apply to both. That is, as long as the natural capital that is being depleted is replaced with even more valuable physical and human capital, then the value of the aggregate stock (comprising human, physical, and the remaining natural capital) is increasing over time. Maintaining and enhancing the total stock of all capital alone is sufficient to attain sustainable development. In contrast, proponents of the strong sustainability view argue that physical or human capital cannot substitute for all the environmental resources comprising the natural capital stock, or all of the ecological services performed by nature. This view questions whether human, physical, and natural capital comprise a single homogeneous total capital stock. Uncertainty over many environmental values, in particular the value that future generations may place on increasingly scarce natural resources and ecological services, further limits our ability to determine whether we can adequately compensate future generations for irreversible losses in essential natural capital today. Thus, the strong sustainability view suggests that environmental resources and ecological services that are essential for human welfare and cannot be substituted easily by human and physical capital should be protected and not depleted. Maintaining or increasing the value of the total capital stock over time in turn requires keeping the nonsubstitutable and essential components of natural capital at least constant over time.
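The difference between the two rules can be made concrete with a small numerical sketch. The Python below is illustrative only: the function names, the idea of tracking a separate series for 'critical' (nonsubstitutable) natural capital, and the sample figures are assumptions introduced here, not taken from the literature cited above. It assumes all capital stocks have already been expressed in a common value unit, which is itself the premise of the weak sustainability view.

```python
from typing import Sequence


def is_non_declining(values: Sequence[float]) -> bool:
    """Return True if the series never falls from one period to the next."""
    return all(later >= earlier for earlier, later in zip(values, values[1:]))


def weak_sustainability(physical, human, natural) -> bool:
    """Weak rule: the aggregate value of all capital must not decline."""
    total = [p + h + n for p, h, n in zip(physical, human, natural)]
    return is_non_declining(total)


def strong_sustainability(physical, human, natural, critical_natural) -> bool:
    """Strong rule: the weak rule holds AND the critical (nonsubstitutable)
    part of natural capital is kept at least constant."""
    return (weak_sustainability(physical, human, natural)
            and is_non_declining(critical_natural))


# Hypothetical capital values over three periods (arbitrary value units).
physical = [100.0, 120.0, 150.0]
human = [80.0, 90.0, 100.0]
natural = [60.0, 50.0, 40.0]           # total natural capital, being depleted
critical_natural = [20.0, 15.0, 10.0]  # assumed nonsubstitutable component

print(weak_sustainability(physical, human, natural))
print(strong_sustainability(physical, human, natural, critical_natural))
```

In this hypothetical economy the aggregate stock rises in every period, so the weak rule is met, while the declining critical component means the strong rule is not.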
2. Aspects of Sustainable Development
2.1 Ecological Functioning and Resilience
The ecological economics literature has called attention to the important role of ecological functioning and resilience in sustaining human livelihoods and welfare (Perrings et al. 1992) (see Ecological Economics). By ecological functioning, ecologists usually mean those basic processes of ecosystems, such as nutrient cycling, biological productivity, hydrology, and sedimentation, as well as the overall ability of ecosystems to support life. Ecologists consider the collective range of life-support functions to be the key characteristic that defines an ecosystem, as well as the source of the many key ecological resources and services that are important to human welfare (e.g., drinking water, food, and soils). Although there is some dispute in the ecological literature over precise interpretations of the term ecological resilience, ecologists generally use this term to mean the capacity of an ecosystem to recover from external shocks and stresses (e.g., drought, fire, earthquakes, pollution, biomass removal) (Holling 1986, Walker 1992). As the health of an ecosystem is usually determined by its capacity to deliver the life-support functions appropriate to its stage of ecological succession, the ecological resilience of the system is linked inherently to its ecological functioning. That is, if an ecosystem is resilient, then it should recover sufficiently from any human-induced or natural stresses and shocks and function normally. However, if ecosystems are insufficiently resilient to recover from persistent problems of environmental degradation, then the ability of ecosystems to function normally and deliver important biological resources and ecological services will be affected. At some threshold level of ecological functioning, ecosystems that are subject to persistent environmental degradation and disruption begin to malfunction, break down, and ultimately collapse (see Human–Environment Relationship: Carrying Capacity).
The loss of ecosystem functioning and resilience will lead to a decline in the availability of some economically important biological resources and ecological services (see Extinction of Species). Examples include the loss of fish stocks as a result of pollution of marine systems, the inability of rangeland pastoral systems to recover from drought, and the decline of the watershed protection function of degraded forests. At first these ecological and welfare impacts may be localized, but if they occur on a sufficiently large scale, they may generate wider national, regional, or even global effects, such as climate change (see Climate Change, Economics of). Although the scale of such impacts and their ultimate effects on human welfare are often uncertain and difficult to determine, the irreversible loss of key biological resources and ecological ‘services’ through the disruption and breakdown of ecosystems could constitute a growing problem of ecological scarcity (Barbier 1989). The rising ‘scarcity’ of ecological resources and services will mean that they will increase in value relative to human-made goods and services produced in the economic system. Moreover, persistent and rising ecological scarcity may be an indicator of more widespread and frequent disruptions to the functioning and resilience of ecosystems, and may increase the likelihood of ecological collapse and catastrophe on ever-widening scales (Daily and Ehrlich 1992, Meadows et al. 1992, UNPF 1993).
2.2 Implications of Ecological Functioning and Resilience for Sustainable Development
Given the rate at which ecological disruption is occurring globally as a result of the cumulative impacts of pollution, habitat modification and conversion, and over-exploitation of some biological resources, the ecological functioning and resilience upon which many important ecological resources and services depend may be undermined. Thus, not only is there a danger of depleting essential ecological components of natural capital, but the widespread disruptions to the functioning and resilience of ecosystems may mean that these essential components are being lost irreversibly. Therefore, the ecological economics literature generally endorses the strong sustainability view. Ensuring sustainable development will first require identifying those essential ecological resources and services that are most at risk from current patterns of economic activity and development. Second, it is necessary to take appropriate policy measures to protect the irreplaceable components of natural capital. Finally, it is important to maintain the functioning and resilience of the key ecosystems on which they depend. This implies a fundamental rethinking of the way in which countries currently manage their environments, as well as of the relative importance of policies for environmental protection in the process of economic development. The additional costs that are required to adjust current patterns of economic activity and development could be significant, especially in the short and medium term, and may constrain such a comprehensive shift in policy.
2.3 Environmental Kuznets’ Curves and the Environment–Growth Debate
There has been a long-standing debate about the relationship between economic growth and the environment. An early view, expressed in the Club of Rome’s Limits to Growth in 1972, was that there exists a trade-off between economic growth (measured by rising real per capita incomes) and the environment, whereby an improvement in one implies a reduction in the other (Meadows et al. 1972, Daly 1987). The modern sustainable development debate has focused more on the potential complementarity between growth and the environment, where growth is considered compatible with improved environmental quality (see Cassandra/Cornucopian Debate).
A recent extension to this literature is the analysis of environmental Kuznets’ curves (EKC). This is the hypothesis that there exists an ‘inverted U’ shaped relationship between a variety of indicators of environmental pollution or resource depletion and the level of per capita income (Kuznets 1955, Grossman and Krueger 1995, Shafik 1994). The implication of this hypothesis is that environmental degradation should be observed to increase initially, but to eventually decline, as per capita income increases. To some extent, this combines the two previous views on the relationship between economic growth and the environment. That is, at low incomes there exists a tradeoff between economic growth and environmental quality, while at higher incomes economic growth is complementary to improved environmental quality. Pearce and Barbier (2000) and Stern et al. (1996) provide a review of the relevant empirical studies. These studies suggest that EKC relationships are likely to hold for only a few pollutants with short-term and local impacts, such as sulphur dioxide (SO2) and, to a lesser extent, solid particulate matter (SPM) (see Air Pollution). In addition, even when an EKC relationship is estimated, the ‘turning point’ on the curve where environmental degradation starts to decline with per capita income is very high relative to the current per capita GDP levels of most countries. This is a particular problem for less developed countries with low per capita income levels (see Environment and Development). However, several recent studies have also demonstrated that EKCs are highly susceptible to structural economic shifts and technological changes, which are in turn influenced by policy initiatives such as the Montreal Protocol and the Second Sulphur Protocol.
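The ‘inverted U’ and its turning point can be illustrated with a short calculation. One common way to write the hypothesis is as a quadratic in the logarithm of per capita income, E = b0 + b1 ln(Y) + b2 (ln Y)^2 with b1 > 0 and b2 < 0, in which case the turning point is Y* = exp(-b1 / (2 b2)). The Python sketch below assumes this functional form, which is not necessarily the specification used in the studies cited above, and the coefficient values are invented purely for illustration.

```python
import math


def ekc_level(income: float, b0: float, b1: float, b2: float) -> float:
    """Environmental degradation indicator under an assumed log-quadratic EKC."""
    x = math.log(income)
    return b0 + b1 * x + b2 * x * x


def ekc_turning_point(b1: float, b2: float) -> float:
    """Per capita income at which degradation peaks.

    Setting the derivative b1 + 2*b2*ln(Y) to zero gives Y* = exp(-b1 / (2*b2)).
    An inverted U requires b1 > 0 and b2 < 0.
    """
    if not (b1 > 0 and b2 < 0):
        raise ValueError("an inverted-U shape requires b1 > 0 and b2 < 0")
    return math.exp(-b1 / (2.0 * b2))


# Hypothetical coefficients (not estimates from any study).
b0, b1, b2 = -40.0, 10.0, -0.55
turning_income = ekc_turning_point(b1, b2)

# Degradation rises up to the turning point and falls beyond it.
for income in (turning_income / 4, turning_income, turning_income * 4):
    print(round(income), round(ekc_level(income, b0, b1, b2), 2))
```

A country whose per capita income lies well below the computed turning point is still on the rising arm of the curve, which is the situation the text describes for most less developed countries.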
2.4 Implications of EKCs for Sustainable Development
Recent EKC studies have revived the wider ‘growth vs. the environment’ debate. Some commentators have argued that the empirical evidence of EKCs supports the general proposition that the solution to combating environmental damage is economic growth itself (Beckerman 1992). For example, the EKC literature offers limited evidence that for certain environmental problems, particularly air pollutants with localized or short-term effects, there is an eventual reduction in emissions associated with higher per capita income levels, which may be attributable to the ‘abatement effect’ that arises as countries become richer (Panayotou 1997). Also, both the willingness and the ability of political jurisdictions to engage in and enforce improved environmental regulations, to increase public spending on environmental research and development, or even to engage in multilateral agreements to reduce emissions may increase with per capita income levels (see Greening of Technology and Ecotechnology). Other commentators have been more cautious, noting that conclusive evidence of an EKC relationship applies only to a few pollutants, which makes it difficult to support more general growth-environment linkages. It has also been noted that, even for those pollutants displaying EKC characteristics, aggregate global emissions are projected to rise over time, demonstrating that the existence of an EKC relationship for such pollutants does not necessarily imply that, at the global level, any associated environmental damage is likely to disappear with economic growth. Therefore, the recent EKC literature should not be taken to imply that economic growth on its own will foster environmental improvement automatically. As Panayotou (1997) concluded, ‘when all effects are considered, the relationship between growth and the environment turns out to be much more complex with wide scope for active policy intervention to bring about more desirable, and in the presence of market failures, more efficient economic and environmental outcomes.’ Or, as Arrow et al. (1995) have succinctly put it: ‘economic growth is not a panacea for environmental quality; indeed it is not even the main issue.’
2.5 Endogenous Growth, Technological Innovation, and Resource Dependency
Another recent advance in the literature with important implications for sustainable development looks at endogenous growth, technical innovation, and resource dependency. A key feature of recent endogenous growth models is that technological innovation, that is, the development of new technological ideas or designs, is endogenously determined by private and public sector choices within the economic system (hence the term ‘endogenous’ growth) rather than determined outside (i.e., ‘exogenous’ to) the system (Romer 1990). This implies that the level of technology in an ‘endogenous growth’ economy can be advanced perpetually through public and private investments, such as research and development expenditures and increases in human capital skills. Since technical innovation adds to physical capital already used in production, the economy can potentially ‘sustain’ growth rates indefinitely. Furthermore, as the economy develops, its growth rate should eventually diminish until it reaches its long-run growth rate (Barro and Sala-I-Martin 1995). The recent debate over the role of innovation in economic growth has led to empirical studies across countries and regions of the factors underlying long-term economic growth. Empirical evidence seems to support the notion that rich countries have higher rates of savings or more ‘effective’ institutions, and implies that persistently bad government policies and ineffective institutions are inhibiting the long-run growth prospects of low-income countries. In addition, poor countries may fail to achieve higher rates of growth because they fail to generate or use new technological ideas to reap greater economic opportunities. Institutional and policy failures in poor economies are important determinants of their inability to innovate sufficiently to achieve higher long-term growth rates. Another important factor may be the structural economic dependence of these economies on their natural resource endowments (see Environmental Vulnerability). Recent explanations as to why resource dependence may be a factor in influencing economic growth point to a number of possible fundamental linkages between environment, innovation, and long-term growth relevant to poor economies. Previous studies of an economy’s dependence on natural resources and its long-run growth (Dasgupta and Heal 1979, Stiglitz 1974) have been extended by Barbier (1997) to include an endogenous growth economy in order to determine whether an economy’s dependence on an exhaustible natural resource is necessarily a binding constraint on long-term growth. The results of the analysis show that sufficient allocation of human capital to innovation will ensure that in the long run resource exhaustion can be postponed indefinitely, and the possibility exists for sustained long-term per capita consumption. However, when negative feedback effects between the rate of resource utilization and innovation exist, the conditions under which it is possible to sustain per capita consumption over the long run are much more stringent and again assume that the economy is able to build up sufficient human capital and to innovate.
2.6 Implications of Endogenous Growth Theory, Technical Innovation, and Resource Dependency for Sustainable Development
Endogenous growth theory offers some support for the weak sustainability view. That is, provided that obstacles such as persistent policy distortions, political instability, and institutional failures can be overcome, any economy, even a low-income and resource-dependent developing country, should be able to foster endogenous innovation to substitute human and physical capital for a declining natural capital base in order to sustain economic opportunities and welfare indefinitely. The key appears to be developing effective policies and institutions, including the necessary public and private investments to enhance research and development and human capital skills. Unfortunately, in most low-income countries current economic policies and investments in agriculture, forestry, and other resource-based sectors have led to rapid changes in resource stocks and patterns of use, frequently with adverse economic consequences. Demographic trends have often worsened the relationship between population and resource carrying capacity in many regions, and chronic environmental degradation may itself be contributing to social conflict and political instability, thus undermining the social and economic conditions necessary to foster long-term innovation and development (Homer-Dixon 1995). In addition, the type of ‘resource-saving’ innovations envisioned in most endogenous growth models are likely to be technologies to abate pollution and other waste products, and to conserve the use of raw material and energy inputs. However, many ecological resources and services, such as biodiversity, amenity services, and ecological functions, are less amenable to substitution by the conventional resource-saving innovations developed in the economic process. Thus, innovations are less likely to lead to opportunities for the substitution of the more distinctive, and possibly essential, range of ecological resources and services, including the general life-support functioning and resilience of ecosystems.
3. Policy Issues and Options for Sustainable Development
This section discusses the policy issues and options for sustainable development. There are some important areas of common ground on which policy makers can make good progress towards sustainable development, regardless of their viewpoint on strong and weak sustainability. For example, as there exists considerable scope for substituting human and physical capital for natural capital through conservation and substitution of resource inputs, reducing pollution, and improving environmental protection, designing policies for sustainable development must begin with these basic objectives in mind. To do this requires progress in four areas.
First, it is important to improve existing efforts to measure the economic and ecological consequences of natural capital depletion and degradation. That is, in order to consider how well an economy is substituting physical and human capital for natural capital, it is necessary to measure how much the value of the natural capital stock has been ‘depreciated’ through current efforts to use environmental resources and services to promote greater levels of consumption and investment in the economy. Determining the ‘sustainable’ level of income and consumption, that is, accounting for the increases in the total available goods and services in the economy net of the depreciation in the value of the natural capital stock, is an important economic approach to measuring sustainability. Equally, determining the ecological ‘limits’ to exploitation of the natural capital stock, including ecological thresholds, carrying capacities, and key ecological goods, services, and natural environments that might be considered ‘essential,’ is an important ecological contribution to measuring sustainability.
Second, many of the resources and services of our natural capital endowment are either ‘unpriced,’ meaning that unlike other goods and services in our economy there are no markets for them, or ‘underpriced,’ meaning that where markets do exist for some environmental goods and services the prices in these markets do not necessarily reflect their true contribution to human welfare. As long as environmental goods and services continue to be ‘unpriced’ and ‘underpriced,’ their contribution to human welfare relative to other goods and services in the economy is effectively ‘undervalued.’ Thus, in order to promote better policies for sustainable development it is necessary to determine the correct economic values for environmental goods and services and to develop a variety of economic tools for assessing these values (see Resources and Environment: Contingent Valuation).
Third, determining an appropriate ‘mix’ of human, physical, and natural capital to ensure sustainable development is ultimately about designing appropriate incentives, institutions, and investments for efficient and sustainable management of the natural environment. In order to promote sustainable development it is important to determine the causes of environmental degradation, particularly where the failure of institutions, markets, and government policy are the main contributing factors, and to demonstrate how correcting these failures can lead to improved incentives and investments for efficient and sustainable management of natural capital (see Environmental Policy: Protection and Regulation).
Finally, in order to make genuine progress towards sustainable development, it is necessary for the conventional social and behavioral sciences, such as economics and ecology, to work together towards a common goal of sustainable development.
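The first of the four areas, measuring income net of natural capital depreciation, amounts to a simple accounting adjustment. The Python sketch below is a minimal illustration under stated assumptions: it treats ‘sustainable’ income as gross income minus the depreciation of produced capital and of natural capital, and the function name, parameter names, and figures are hypothetical rather than drawn from any national accounting standard.

```python
def sustainable_income(gross_income: float,
                       produced_capital_depreciation: float,
                       natural_capital_depreciation: float) -> float:
    """Income net of depreciation of both produced and natural capital.

    All arguments are assumed to be in the same monetary unit for one period.
    """
    return (gross_income
            - produced_capital_depreciation
            - natural_capital_depreciation)


# Hypothetical economy: conventional net income ignores the last term.
gross = 1000.0
produced_dep = 100.0
natural_dep = 60.0

conventional_net_income = gross - produced_dep
adjusted_income = sustainable_income(gross, produced_dep, natural_dep)
print(conventional_net_income, adjusted_income)
```

The gap between the two figures is the amount by which conventional accounts would overstate income in this hypothetical case; in practice the difficulty lies in valuing the natural capital depreciation term, which is the subject of the second area above.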
See also: Development: Rural Development Strategies; Development: Sustainable Agriculture
Bibliography
Arrow K, Bolin B, Costanza R, Dasgupta P, Folke C, Holling C S, Jansson B-O, Levin S, Mäler K-G, Perrings C A, Pimentel D 1995 Economic growth, carrying capacity, and the environment. Science 268: 520–21
Barbier E B 1989 Economics, Natural Resource Scarcity and Development: Conventional and Alternative Views. Earthscan Publications, London
Barbier E B 1997 Endogenous growth and natural resource scarcity. Environmental and Resource Economics 14(1): 51–74
Barbier E B, Burgess J C, Folke C 1994 Paradise Lost? The Ecological Economics of Biodiversity. Earthscan Publications, London
Barro R J, Sala-I-Martin X 1995 Economic Growth. McGraw Hill, New York
Beckerman W 1992 Economic growth and the environment: Whose growth? World Development 20: 481–96
Daily G C, Ehrlich P R 1992 Population, sustainability and Earth’s carrying capacity. BioScience 42: 761–71
Daly H E 1987 The economic growth debate: What some economists have learned but many have not. Journal of Environmental Economics and Management 14(4): 323–36
Dasgupta P S, Heal G E 1979 The Economics of Exhaustible Resources. Cambridge University Press, Cambridge, UK
Goldin I, Winters L A (eds.) 1995 The Economics of Sustainable Development. Cambridge University Press, Cambridge, UK
Grossman G M, Krueger A B 1995 Economic growth and the environment. Quarterly Journal of Economics 110(2): 353–77
Holling C S 1986 Resilience of ecosystem: Local surprise and global change. In: Clark E C, Munn R E (eds.) Sustainable Development of the Biosphere. Cambridge University Press, Cambridge, UK
Homer-Dixon T 1995 The ingenuity gap: Can poor countries adapt to resource scarcity. Population and Development Review 21: 587–8
Kuznets S 1955 Economic growth and income inequality. American Economic Review 45: 1–28
Mäler K-G 1995 Economic growth and the environment. In: Perrings C, Mäler K-G, Folke C, Holling C S, Jansson B-O (eds.) Biodiversity Loss: Economic and Ecological Issues. Cambridge University Press, Cambridge, UK
Meadows D H, Meadows D L, Randers J, Behrens W 1972 The Limits to Growth. Universe Books, New York
National Research Council Board on Sustainable Development, Policy Division 1999 Our Common Journey: A Transition Towards Sustainability. National Academy Press, Washington, DC
Panayotou T 1997 Demystifying the environmental Kuznets curve: Turning a black box into a policy tool. Special Issue on Environmental Kuznets Curves, Environment and Development Economics 2(4)
Pearce D W, Barbier E 2000 Blueprint for a Sustainable Economy. Earthscan Publications, London
Pearce D W, Markandya A, Barbier E B 1989 Blueprint for a Green Economy. Earthscan Publications, London
Perrings C, Folke C, Mäler K-G 1992 The ecology and economics of biodiversity loss: The research agenda. AMBIO 21(3): 201–11
Pezzey J 1989 Economic Analysis of Sustainable Growth and Sustainable Development. Environment Department Working Paper No. 15, World Bank, Washington, DC
Romer P 1990 Endogenous technological change. Journal of Political Economy 98: S71–S102
Shafik N 1994 Economic development and environmental quality: An econometric analysis. Oxford Economic Papers 46: 757–73 (Suppl. S.)
Stern D I, Common M S, Barbier E B 1996 Economic growth and environmental degradation: The environmental Kuznets curve and sustainable development. World Development 24(7): 1151–60
Stiglitz J E 1974 Growth with exhaustible natural resources: Efficient and optimal growth paths. Review of Economic Studies. Symposium on the Economics of Exhaustible Resources, pp. 123–38
Turner B L II 1997 The sustainability principle in global agendas: Implications for understanding land use/cover change. The Geographic Journal 163(2): 133–40
UNPF (United Nations Population Fund) 1993 Population, Growth and Economic Development. Report on the Consultative Meeting of Economists convened by the United Nations Population Fund, 28–29 September 1992, UNPF, New York
Walker B H 1992 Biodiversity and ecological redundancy. Conservation Biology 6: 18–23
WCED (World Commission on Environment and Development) 1987 Our Common Future. Oxford University Press, Oxford
J. C. Burgess and E. B. Barbier
Sustainable Transportation

The concept of sustainable transportation applies the idea of societal sustainability to a particular area of human activity: the movement of people and goods. Sustainability, or sustainable development, has been adopted by the United Nations agencies as an overarching goal of economic and social development (UNCED 1992). Although there are several alternative definitions of sustainability, the most widely cited definition, and the one generally recognized as initiating the global consideration of the subject, is that of the World Commission on Environment and Development (also known as the Brundtland Commission):

Sustainable development is development that meets the needs of the present without compromising the ability of future generations to meet their own needs. (WCED 1987, p. 43)
This and most other definitions constitute a normative concept of intergenerational responsibility or fairness. It is also significant that the context of the Brundtland definition was the issue of economic development and resulting conflicts with the environment. The concept of sustainability balances the right of less advantaged nations to economic development, with the imperative to preserve the resources, especially the environmental resources, that future generations will need for their own well-being. Thus, a concept of equity, both intergenerational and international, has been inherent in the idea of sustainability since its inception. Sustainability pertains to the responsibility of an entire generation of society to future generations; whether it can meaningfully be applied to a single area of human activity such as transportation has been a subject of debate. That is, sustainability must be satisfied by the integral activities of a society and so, in this sense, it is not possible to judge whether one sector of society is sustainable on its own. On the other hand, it would not be possible for society to achieve sustainability, at least not in the long run, if even one sector were not sustainable. Thus, sustainable transportation may be considered a necessary but not sufficient condition for the sustainability of society. Economic development has been central to the issue of sustainability since its inception. Economists have
argued that the Brundtland definition implies nondecreasing well-being, and some have suggested that measures such as gross domestic product or real consumption per capita may be appropriate measures of sustainable growth. Others argue that well-being must be defined more broadly to include other areas of human development, such as education, health, environmental quality, and social justice (e.g., Pearce et al. 1994). In this broader view, the human institutions passed from one generation to another are just as important a resource as any other institution.

The question of how to translate the Brundtland imperative into a measurable and verifiable concept is still being worked out. In the economic literature, efforts to resolve this question have revolved around the concepts of weak and strong sustainability. The doctrine of weak sustainability holds that manufactured capital (machinery, buildings, infrastructure, etc.) is perfectly substitutable for natural capital (natural and environmental resources). This view holds not only that manufactured capital can be substituted for resources such as coal or iron ore, but that it can be substituted for environmental functions such as the ability of the air to absorb pollution or even the atmospheric and ocean processes that produce the global climate. Strong sustainability, on the other hand, maintains that there are limits to the substitutability of manufactured and natural capital. In particular, it maintains that there are certain environmental processes and functions that must be protected and preserved because they cannot be replaced by created capital. Ecosystems and biodiversity are often-cited examples of irreplaceable environmental processes.

How economic growth and expanding human welfare could continue indefinitely in the face of intensifying environmental impacts and finite natural resources is a central question for theorists and researchers.
Human activity, and especially economic activity, is inextricably linked to the environment by the ever-expanding quantities of natural resources extracted, processed, and consumed (Pearce and Warford 1993). Because mass and energy must be conserved due to the laws of nature, all the resources and energy used must end up somewhere in the form of waste (unless recycled). A useful way to conceive of the relationship between human activity and the environment is to imagine a coefficient of environmental intensity that relates total environmental damage to the total level of activity, perhaps measured, albeit incompletely, by gross domestic product. Similarly, one can imagine a coefficient of natural resource requirements for input to the world economy. Because emissions controls, recycling, and greater energy efficiency can change the environmental coefficient, technological change and behavioral change clearly must play a key role in achieving sustainability. If the environmental coefficient can be reduced as fast as or faster than the growth of human activity, and if increasing efficiency and recycling can similarly reduce the requirements for energy and other natural resource inputs, then it should be possible for human well-being to expand in a sustainable way.

The above line of reasoning identifies sustainability not as an equilibrium or final state of being able to continue indefinitely, but as a steady state in which the rate of improvement exceeds the rate of exhaustion of manufactured capital, natural resources, environmental services, and human and institutional capital. In this view, sustainability requires reducing demands on the environment per unit of activity as fast as or faster than the growth in activity. It requires increasing the efficiency of energy and resource use, or expanding the scope of useful resources, at a rate at least as fast as the growth of human activity. It may even be interpreted more broadly as a matter of improving human institutions faster than challenges emerge to threaten them. While these concepts can be and have been applied to transportation, a comprehensive theory of sustainable transportation that is measurable and verifiable has yet to emerge.
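The steady-state condition described above can be written compactly. The following is only a sketch under stated assumptions: the symbols $Y(t)$ (level of human activity, for instance gross domestic product), $e(t)$ (coefficient of environmental intensity), and $D(t)$ (total environmental damage) are introduced here for illustration and do not appear in the original text.

```latex
% Total damage is intensity times activity (illustrative notation).
D(t) = e(t)\,Y(t)

% Taking logarithms and differentiating with respect to time gives growth rates:
\frac{\dot{D}}{D} = \frac{\dot{e}}{e} + \frac{\dot{Y}}{Y}

% Damage does not increase (\dot{D} \le 0) exactly when the intensity
% coefficient falls at least as fast as activity grows:
-\frac{\dot{e}}{e} \;\ge\; \frac{\dot{Y}}{Y}
```

Under this reading, if activity grows at 3 percent per year, the environmental coefficient would have to fall by at least 3 percent per year for total damage not to rise. The same argument would apply to a resource-requirement coefficient defined analogously.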
1. Sustainable Transportation and Economic Development

Throughout history, the expansion and improvement of transportation systems has been an engine of economic development (Greene and Wegener 1997). The fast and reliable movement of persons and commodities contributes more than simply a reduction in the costs of production. It facilitates the reorganization of production and consumption to permit specialization, scale economies, inventory reduction, and more. Today, transportation activity continues to increase at roughly the rate of economic growth in both the developed and developing economies.

Motorized transport is the source of essentially all the demands transportation imposes on natural and environmental resources. The motorization of transport, fully accomplished in the developed economies during the twentieth century, is now proceeding rapidly in the developing world. In 1950, 90 percent of the world’s 50 million motor vehicles could be found in Europe or the United States. Today, half of the world’s roughly 700 million vehicles are to be found outside of those countries. The developing economies own more than one-third of the world’s motor vehicles, and vehicle ownership outside of the developed economies is growing at twice the rates seen in the developed economies.

The global potential for motorization is enormous. In developed economies, motor vehicle ownership rates average 500 per 1,000 persons. In the US, there is more than one motor vehicle per person of driving age. Ownership rates in the developing economies are about one-twentieth as high.
Expansion of modern, motorized transportation in the developing economies appears to be essential to their sustainable development. The World Bank has identified three broad types of sustainability in considering transport investments in developing countries: (a) economic and financial sustainability; (b) environmental sustainability; and, (c) social sustainability (World Bank 1996). These categories reflect the view that to be sustainable, transportation systems must have stable financing in the context of market economies, must address the many adverse impacts of transportation on health and ecosystems, and must serve the mobility needs of the poor as well as those with the ability to pay.
2. Environmentally Sustainable Transport

Transportation systems have diverse and widespread impacts on the environment, affecting climate, air and water quality, noise, habitats, and biodiversity. Worldwide, transportation accounts for one-fourth of global carbon dioxide emissions. Scientists have concluded that the accumulation of carbon dioxide and other less common anthropogenic greenhouse gases in the world’s atmosphere since the onset of the industrial revolution is now having a perceptible impact on the earth’s climate. While it is not yet clear what the impacts on climate will be, they are likely to include increases in storm activity, changes in the patterns of rainfall, a general warming leading to higher sea levels, and other potentially harmful or even catastrophic effects.

The quantity of carbon dioxide emissions produced by transportation depends on the carbon content of the transportation fuels and the energy efficiency of transport systems. Today, motorized transport is entirely dependent on fossil fuels. The combustion of fossil fuels is the predominant source of greenhouse gas emissions from human activities. A typical passenger car adds 1 to 2 tonnes of carbon to the atmosphere each year. Despite 50–100 percent increases in the fuel economy of automobiles over the past 30 years, there appears to be no prospect of reducing global greenhouse gas emissions from transportation for at least 20 to 30 years. Clearly, continuing to force alterations in the earth’s climate systems in unpredictable ways does not satisfy the sustainability doctrine.

Transportation is a major source of local and regional air pollution. In Europe and the United States, transportation vehicles emit roughly 45 percent of nitrogen oxides and 40 percent of volatile organic compounds that mix in the atmosphere to form ozone, a pollutant that exacerbates a variety of respiratory problems and causes damage to vegetation as well.
Air pollutants from transport have an immediate impact on human health and on vegetation. Ozone pollution has been observed to change the distribution of tree species in forests, retard the growth of vegetation,
and damage crops. Prolonged exposure to ozone is strongly believed to reduce biodiversity of plant communities and diminish variability in their gene pools (NRC 1997). Emissions of oxides of nitrogen and sulfur from motor vehicles contribute to acid rain with widespread negative impacts on terrestrial and aquatic ecosystems. Volatile organic compounds also include a number of toxic pollutants, including known and suspected carcinogens, such as benzene, formaldehyde, cyanide, and dioxin.

Transportation vehicles produce significant quantities of fine particulate pollution via several mechanisms. Vehicle tailpipe emissions contain particles produced by incomplete combustion. Particles and aerosols are formed when certain motor vehicle emissions react in the air. Finally, dust from roads is a major source of airborne particulate matter. Fine particulate matter can be inhaled deeply into the lungs, contributing to a variety of respiratory problems. Epidemiological studies have correlated elevated morbidity and mortality rates with long-term, low-level exposure to fine particulates, as well as with episodes of acute pollution, suggesting that particulate pollution may be the most serious form of motor vehicle pollution from the perspective of human health (NRC 1997).

Since the 1950s, enormous technological progress has been made in reducing the pollutant emissions from new transportation vehicles. A properly operating motor vehicle manufactured in 2000 produces 1 percent to 10 percent of the pollution (depending on the pollutant) of a car made 35 years earlier. Still, even this rate of progress has been insufficient to achieve air quality standards in many urban areas because the growth of travel has outpaced the rate of technological improvement. Further significant improvements in conventional vehicles and fuels will be made from 2000 to 2010.
Yet achieving sustainability will probably, in the end, require entirely new energy sources and propulsion technologies.
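The statement earlier in this section that transport carbon emissions depend on the carbon content of fuels and on energy efficiency can be illustrated with a short calculation. This is a minimal sketch, not a method taken from the article: the function name and all numerical inputs (annual distance, fuel consumption, and a rough carbon content of about 0.64 kg of carbon per litre of gasoline) are assumptions chosen only so that the result falls within the 1 to 2 tonnes per year cited above.

```python
def annual_carbon_tonnes(distance_km, litres_per_100km, carbon_kg_per_litre):
    """Tonnes of carbon emitted per year by one vehicle (hypothetical helper)."""
    fuel_litres = distance_km * litres_per_100km / 100.0
    return fuel_litres * carbon_kg_per_litre / 1000.0


# Illustrative assumptions, not figures from the article.
DISTANCE_KM = 20_000          # distance driven per year
LITRES_PER_100KM = 10.0       # fuel consumption
CARBON_KG_PER_LITRE = 0.64    # approximate carbon content of gasoline

baseline = annual_carbon_tonnes(DISTANCE_KM, LITRES_PER_100KM, CARBON_KG_PER_LITRE)

# A 100 percent increase in fuel economy halves fuel use per kilometre.
improved = annual_carbon_tonnes(DISTANCE_KM, LITRES_PER_100KM / 2, CARBON_KG_PER_LITRE)

# If distance travelled doubles at the same time, the gain is fully offset.
offset = annual_carbon_tonnes(2 * DISTANCE_KM, LITRES_PER_100KM / 2, CARBON_KG_PER_LITRE)

print(f"baseline:               {baseline:.2f} t C per year")
print(f"doubled fuel economy:   {improved:.2f} t C per year")
print(f"plus doubled distance:  {offset:.2f} t C per year")
```

The third case mirrors the point made in the text about air quality: efficiency gains per vehicle-kilometre can be cancelled out when travel grows as fast as the technology improves.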
3. Solid Waste

Transportation infrastructure and vehicles generate enormous amounts of solid waste. Some of this waste, such as iron and steel, has a long history of recycling. Other components, like plastics, most often end up in landfills or incinerators. Waste oil and fluids, batteries, and other components often contain toxic materials that pose serious health and environmental hazards if not properly disposed of. The maintenance and reconstruction of transportation facilities also generate substantial quantities of mostly asphalt and concrete waste. Not only increased recycling, but lifecycle management of materials for transportation systems has been recognized as essential to achieving sustainable transportation systems (OECD 1997).
4. Noise

It may seem odd to include noise among the issues affecting transportation’s sustainability. Noise, after all, is inherently transitory and its impacts on society and the environment would seem to be also. On the other hand, transportation systems operate more or less continuously, generating noise around the clock. Studies carried out in Europe have shown that about 20 percent of the population are exposed to daytime transport noise exceeding acceptable levels (65 dB(A)) (European Commission 1995). Another 170 million are exposed to noise levels that cause serious annoyance as defined by the World Health Organization. Road noise is the dominant source, followed by rail and air transport. Furthermore, data for the 1980–95 period showed no significant improvement in exposure to traffic noise for the population of Europe. The principal impacts of transportation noise are annoyance, interference with work and sleep and, less commonly, hearing loss. Estimates of the costs of noise pollution in Europe range between 0.1 percent and 2 percent of GDP.
5. Natural Resources

Transportation is a major consumer of two critical, limited resources: land and petroleum. Worldwide, transportation accounts for 56 percent of global petroleum consumption and between 1973 and 1997 was responsible for all of the increase in world oil consumption (IEA 2000). Recent assessments of the world’s supplies of conventional petroleum indicate that sometime between 2005 and 2025, the world will have consumed half of all the conventional petroleum that is known or believed to exist (EIA 2000 and USGS 2000). Once this point is passed, it is virtually certain that production of conventional petroleum will begin to decline, and unconventional energy sources will have to be substituted at an increasing rate. The world’s transportation system, which is 97 percent dependent on petroleum for energy, will not be sustainable unless it can make a transition to alternative energy sources. While there are more than adequate resources of fossil fuels that can be converted to transport fuels (e.g., coal, oil shale, and tar sands), the greater environmental impacts associated with using these fuels raise serious questions about the sustainability of a motorized world transport system based on fossil fuels.

Perhaps even more serious are the questions raised by transportation systems’ expanding direct and indirect uses of land. Not only does transportation infrastructure, such as roads, airports, and railroad tracks, occupy millions of hectares of land, but the infrastructure and vehicles that use it have significant impacts on surrounding ecosystems. Linear transportation structures serve as deadly barriers to the
migration and movements of species, and alter water flow and groundwater. But perhaps more importantly, transportation infrastructure indirectly alters the patterns and density of human settlements, with even more pervasive impacts on ecosystems and biodiversity (NRC 1997).
6. Sustainable Communities

Transportation is recognized as a key element of sustainable communities (PCSD 1997). Human development patterns and the transportation systems that serve them are strongly interdependent. Transportation choices made by present generations will constrain the opportunities of future generations because built environments can be changed only slowly and at great cost. The high level of mobility provided by automotive transport has engendered dispersed low-density development, sometimes referred to as sprawl. The area of built-up urban land in the United States increased by an estimated 14 million acres between 1982 and 1992, driven in large part by the expansion of highway infrastructure (BTS 1996, p. 162). As demand for mobility has continued to grow, while the land available for expanding transportation infrastructure in urban areas has not, chronic congestion has become a major problem in cities around the world. Increasing levels of traffic congestion not only cause stress and waste the time of present generations, but can also restrict the opportunities for mobility of future generations. If the capacities of transportation systems cannot be expanded to accommodate future growth without incurring unacceptable trade-offs, the options available to future generations will be diminished.

7. Safety

Safety must also be included in assessing the sustainability of transportation systems (e.g., Black 2000). While transportation safety is absolutely compelling as a human tragedy, the sense in which it is a sustainability issue may be less obvious. Globally, there are about 700,000 motor vehicle fatalities annually, and many millions of serious injuries. On the other hand, fatality rates (e.g., per 100,000 vehicle miles) are generally declining. In developed economies such as the United States, recent years have seen declines in total fatalities even while motor vehicle travel and population size continue to grow. In countries undergoing rapid motorization, however, total fatalities are increasing at the same time rates are declining. When both fatalities and fatality rates are increasing, the sustainability of transportation systems is in question. The resource at issue is human life itself, and the needs of future generations would clearly appear to be threatened by increasingly unsafe transportation systems. Where both total fatalities and fatality rates are declining, an argument can be made for sustainability. Without a doubt, safety is the single most important problem facing transportation. Yet, to date there is little agreement on how safety should be interpreted in the context of the sustainability doctrine.

8. Equity

Equity among generations is clearly integral to the concept of sustainability. For sustainability to exclude equity considerations among individuals within a generation would be inherently inconsistent. The Charter of European Cities and Towns Towards Sustainability states that the goals of sustainable development are ‘… to achieve social justice, sustainable economies, and environmental sustainability’ (Greene and Wegener 1997). Furthermore, the context of the Brundtland sustainability definition was clearly the need to preserve the global environment while permitting developing countries the economic development that is their right. Clearly social justice and particularly equity have been integral to the concept of sustainability. It has been argued that the concept of intergenerational fairness embodied in sustainability in fact arises from a societal commitment to equality of opportunity among contemporaries (Howarth 1997). Exactly how equity concerns are to be incorporated in an operating definition of sustainability has yet to be resolved, as have the implications for transportation. At a minimum, equity considerations include the interests of developing economies in pursuing higher levels of mobility, as well as the needs of disadvantaged citizens for access to mobility in all societies.

9. Measuring Sustainability

If policies are to be formulated and implemented to achieve sustainable transportation, indicators must be found to measure it (OECD 1994). If the steady-state concept of sustainability is accepted, these indicators should show: whether the world’s environmental resources are being reduced by transportation’s impacts on air or water pollution; whether the global climate is being altered by greenhouse gas emissions; whether habitats are being destroyed through direct or indirect land-use impacts; and whether the generation of solid waste is increasing or being restrained by recycling. Indicators must show whether the fundamental natural resources required for transportation, such as energy and land, are being expanded as fast as or faster than they are being consumed, and whether transportation systems are being developed and managed so as to enhance social equity by improving access to mobility for all. Until the consequences of transportation for sustainability are made visible by measuring them, progress toward sustainable transportation will be limited.

See also: Air Pollution; Desertification; Ecotourism; Environmental Economics; Resource Geography; Safety, Economics of; Tourism, Geography of; Transportation Geography; Transportation Planning; Transportation: Supply and Congestion
Bibliography

Black W R 2000 Socio-economic barriers to sustainable transport. Journal of Transport Geography 8: 141–7
Bureau of Transportation Statistics (BTS) 1996 Transportation Statistics Annual Report 1996. US Department of Transportation, Washington, DC
Energy Information Administration (EIA) 2000 Long-term World Oil Supply. US Department of Energy, Washington, DC. Available on the World Wide Web at http://www.eia.doe.gov
European Commission 1995 Towards Fair and Efficient Pricing in Transport. COM(95)691, Directorate-General for Transport, DG-VII, Brussels, Belgium
Greene D L, Wegener M 1997 Sustainable transport. Journal of Transport Geography 5(3): 177–90
Howarth R B 1997 Sustainability as opportunity. Land Economics 73(4): 569–79
International Energy Agency (IEA) 2000 Key Energy Statistics. OECD, Paris. Available on the World Wide Web at http://www.iea.org
National Research Council (NRC), Committee for a Study on Transportation and a Sustainable Environment 1997 Toward a Sustainable Future. Special Report 251, Transportation Research Board, National Academy Press, Washington, DC
Organization for Economic Co-operation and Development (OECD) 1994 Environmental Indicators. OECD, Paris
OECD 1997 Recycling strategies for road works. Road Transport Research. Organization for Economic Co-operation and Development, Paris
Pearce D W, Atkinson G D, Dubourg W R 1994 The economics of sustainable development. Annual Review of Energy and the Environment 19: 457–74
Pearce D W, Warford J J 1993 World Without End. International Bank for Reconstruction and Development, Washington, DC
President’s Council on Sustainable Development (PCSD) 1997 Sustainable Communities Task Force Report. Washington, DC
United Nations Conference on Environment and Development (UNCED) 1992 Agenda 21. United Nations, New York
United States Geological Survey (USGS) 2000 US Geological Survey World Petroleum Assessment 2000 - Description and Results. Department of the Interior, Denver, CO
World Bank 1996 Sustainable Transport. International Bank for Reconstruction and Development, Washington, DC
World Commission on Environment and Development (WCED) 1987 Our Common Future. Oxford University Press, Oxford, UK
D. L. Greene

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7
Symbolic Boundaries: Overview

1. Definition and Intellectual Context

‘Symbolic boundaries’ are the lines that include and define some people, groups and things while excluding others (Epstein 1992, p. 232). These distinctions can be expressed through normative interdictions (taboos), cultural attitudes and practices, and more generally through patterns of likes and dislikes. They play an important role in the creation of inequality and the exercise of power. The term ‘symbolic boundaries’ is also used to refer to the internal distinctions of classification systems and to temporal, spatial, and visual cognitive distinctions in particular (see Expression and Interaction: Microfoundations of Expressive Forms). This article focuses on boundaries within and between groups. It discusses the history, current research, and future challenges of work on this topic.

The literature on symbolic boundaries has gained importance since the 1960s due to a convergence between research on symbolic systems and indirect forms of power. Writings by Pierre Bourdieu, Mary Douglas, Norbert Elias, Erving Goffman, and Michel Foucault on these and related topics have been influential internationally across several disciplines, but particularly in anthropology, history, literary studies, and sociology. In North America, a renewed cultural sociology has produced wide-ranging empirical research agendas on symbolic boundaries and inequality. In other fields including community, cognition, deviance, gender, immigration, knowledge and science, nationalism, professions, race and ethnic studies, and social movements, issues of boundaries have gained analytical prominence, although some authors analyze boundary work without using the language of symbolic boundaries.
2. History

Two of the founding fathers of sociology played central roles in shaping the literature on symbolic boundaries: Emile Durkheim and Max Weber. Their contributions are reviewed before turning to the ‘neoclassical’ writings of Mary Douglas, Norbert Elias, and Thorstein Veblen, which illustrate the lasting influence of Durkheim and Weber on this literature up to the 1960s. While Durkheim brings attention to classification systems and their relationship with the
moral order, Weber is more concerned with their impact on the production and reproduction of inequality.

One of the most widely used examples of symbolic boundaries is taken from Durkheim’s later work, Les formes élémentaires de la vie religieuse (The Elementary Forms of Religious Life 1965 [1911]). In this book, Durkheim argues that the distinctiveness of the religious experience from other types of experiences rests in the fact that it involves a symbolic distinction between the realms of the sacred and the profane (pp. 234, 250). The meanings of these realms are mutually exclusive and are defined relationally, through interdictions and rituals that isolate and protect the former from the latter (e.g., a Roman Catholic sinner cannot receive communion until he is purified through confession) (p. 271). The distinction between the sacred and the profane extends to the whole universe of objects and people in which it takes place. For instance, the status of members of a community is defined by the types of relationship they have with sacred objects (e.g., Roman Catholic women cannot celebrate mass). In this sense, religious systems provide a cosmology, i.e., a general interpretation of how the world is organized and how its elements relate to one another and to the sacred. This cosmology acts as a system of classification and its elements are organized according to a hierarchy (counterpoising for instance the pure with the impure). The belief invested in this ‘order of things’ structures people’s lives to the extent that it limits and facilitates their action. In Durkheim’s words, ‘the power attached to sacred things conducts men with the same degree of necessity as physical force’ (p. 260).

Moving beyond the religious realm, Durkheim points to the existence of a moral order, i.e., a common public system of perception of reality that regulates, structures, and organizes relations in a community.
This system operates less through coercion than through intersubjectivity (p. 238). In fact, Durkheim defines society by its symbolic boundaries: it is the sharing of a common definition of the sacred and the profane, of similar rules of conduct, and a common compliance with rituals and interdictions that defines the internal bonds within a community. Hence, he posits that the boundaries of the group coincide with those delimiting the sacred from the profane.

Unlike Durkheim, Max Weber is more concerned with the role played by symbolic boundaries (honor) in the creation of social inequality than in the creation of social solidarity. In Wirtschaft und Gesellschaft
(1922; Economy and Society 1978), he describes human beings as engaged in a continuous struggle over scarce resources. In order to curb competition, they discriminate against various groups on the basis of their cultural characteristics, such as lifestyle, language, education, race, or religion (Chap. 2). In the process, they form status groups whose superiority is defined in relation to other groups. They cultivate a sense of honor, privilege relationships with group members, and define specific qualifications for gaining entry to the group and for interacting with lower status outsiders (e.g., opposing miscegenation). They invoke their higher status and shared rules of life to justify their monopolization of resources. Hence, cultural understandings about status boundaries have a strong impact on people’s social position and access to resources.

Thorstein Veblen’s The Theory of the Leisure Class (1979 [1899]) parallels Weber’s writings on status groups. Veblen is also concerned with the mechanisms that produce boundaries between status groups. This American economist suggests that habits of thought (‘classifying and demarcating’) are central to these mechanisms and are often organized around notions of superiority and inferiority concerning employment, consumption, and leisure. In his words, ‘the concept of dignity, worth, or honor, as applied either to persons or conduct, is of first-rate consequence in the development of class and class distinctions’ (p. 15). For instance, idleness symbolizes status because it signifies pecuniary status. It is a way for ‘successful men to put their prowess in evidence.’ Evidence of nonproductive consumption of time includes ‘quasi-scholarly or quasi-artistic accomplishments.’ Also, ‘refined tastes, manners, and habits of life are useful evidence of gentility because good breeding requires time, application, and expenses’ (p.
49) and are therefore not available to those ‘whose time and energies are taken up with work.’ Veblen also developed the concept of ‘conspicuous consumption’ in the context of an acerbic critique of the excesses of the business class. He argues that the possession and display of wealth confers honor: as an invidious distinction, it symbolizes ranking within a group. This way of manifesting superiority is more common when predatory aggression or war are less frequent. Veblen’s analysis assumes that there is a usual tendency to change standards of sufficiency as one’s pecuniary situation improves, so that one becomes restless with creating ‘wider and ever-widening distance’ between herself and the average standard.

Also paralleling Weber’s work is the work of German sociologist Norbert Elias, Über den Prozess der Zivilisation (The Civilizing Process 1982 [1939]). Elias analyzes the historical emergence of a boundary between civilized and barbarian habits by using evidence from Western manner manuals written between the late middle ages and the Victorian period. His attention centers on ‘natural’ bodily functions
such as spitting, defecating, eating, and blowing one’s nose to show an ‘advance in the threshold of self-control’ over time. He demonstrates the growing centrality of shame and embarrassment in instituting norms of behavior in public and private. At a more general level, he shows transformations in standards of behavior and feelings and in personality structures (what he calls ‘habitus’ or habits emerging from social experience). He argues that these vary across hierarchical groups in society and that these variations are key to pacification and the exercise of power. To turn now to the lasting influence of Durkheim’s work, in Purity and Danger (1966), Mary Douglas is concerned with the order-producing, meaning-making, and form-giving functions of classification systems and the role of rituals in creating boundaries grounded in fears and beliefs. In Natural Symbols (1970), she takes on the idea of a correspondence between classification systems and social organization advanced by Durkheim and Mauss (1963 [1903]). She describes the structure of binary symbolic systems as ‘reflecting’ that of group structures. Like Elias, she is also concerned with the moral order and centers her attention on the system of social control as expressed through the body and through the observable artifacts of everyday life (food, dirt, and material possessions). She argues that the very basis of order in social life is the presence of symbols that demarcate boundaries or lines of division. One of Douglas’ main concerns is how communities differentiate themselves from one another and how they are internally differentiated. She distinguishes groups on the basis of their degree of social control and of the rigidity of their grid (by which she means the scope and coherent articulation of their system of classification or the extent to which it is competing with other systems).
In societies with high social control and great cultural rigidity (i.e., what she calls high grid and group), there is a concern to preserve social boundaries; the role structure is clearly defined and formal behavior is highly valued and well-defined in publicly insulated roles. Through ‘the purity rule,’ formality screens out irrelevant organic processes, ‘matters out of place.’ Douglas suggests that the more complex the system of classification and the stronger the pressure to maintain it, the more social intercourse pretends to take place between disembodied spirits, i.e., the more the purity rule applies.
3. Current Theory and Research
In the contemporary literature on symbolic boundaries, both the neo-Weberian and neo-Durkheimian heritages remain strong. The question of how boundaries intersect with the production of inequality has attracted great interest in recent years, following the publication of Pierre Bourdieu’s impressive corpus. In the United States in particular, cultural sociologists have been working to assess some of Bourdieu’s theoretical claims and to use his work as a stepping stone for improving our understanding of the cultural aspects of class, gender, and racial inequality. Other important developments concern the study of identity through boundary work, and research on moral order, community, and symbolic politics. As is argued in the final section of this article, social scientists will soon face the challenge of integrating work from a wide range of fields that have used the concept of boundaries to various ends.
3.1 Culture and Inequality
In the last 20 years, a large neo-Weberian literature emerged around the study of processes of closure, as illustrated most notably by the work of Frank Parkin and Randall Collins. Parkin (1979) drew on Weber to propose an analysis of class relationships that focuses on the distributive struggle for monopolizing or usurping resources within and across classes, with an emphasis on the right of ownership and credentialism, i.e., the use of educational certificates to monopolize positions in the labor market. Equally inspired by Durkheim, Collins (1998) extended his earlier work on credentialism and interaction rituals to analyze how intellectuals compete to maximize their access to key network positions, cultural capital, and emotional energy, which generates intellectual creativity. These authors’ contributions intersect with those of Pierre Bourdieu and his collaborators, although their ideas followed an independent path of development. In Reproduction (1977 [1970]), Pierre Bourdieu and Jean-Claude Passeron proposed that the lower academic performance of working-class children cannot be accounted for by their lower ability but by institutional biases against them. They suggest that schools evaluate all children on the basis of their familiarity with the culture of the dominant class (or cultural capital), thus penalizing lower-class students. Extensive vocabulary, wide-ranging cultural references, and command of high culture are valued by the school system and students from higher social backgrounds are exposed to this class culture at home. Hence, children of the dominated classes are overselected by the educational system. They are not aware of it, as they remain under the spell of the culture of the dominant class. They blame themselves for their failure, which leads them to drop out or to sort themselves into lower prestige educational tracks.
This work can be read as a direct extension of Karl Marx and Friedrich Engels’ ‘dominant ideology thesis,’ which centers on the role of ideology in cementing relations of domination by camouflaging exploitation and differences in class interests. However, Bourdieu and Passeron are more concerned with classification systems than with representations of the social world
itself, i.e., with how representations of social relationships, the state, religion, and capitalism contribute to the reproduction of domination. Implicitly building on the work of Antonio Gramsci, they made inroads in analyzing the subjective process of consolidation of class domination, focusing on the shaping of cultural categories. The control of subjectivity in everyday life through the shaping of common sense and the naturalization of social relations is the focus of their attention. They broaden Marx and Engels by suggesting that crucial power relations are structured in the symbolic realm proper, and are mediated by meaning. They de facto provide a more encompassing understanding of the exercise of hegemony by pointing to the incorporation of class-differentiated cultural dispositions mediated by both the educational system and family socialization. In Distinction (1984 [1979]), Bourdieu applies this analysis to the world of taste and cultural practice at large. He shows how the logic of class struggle extends to the realm of taste and lifestyle, and that symbolic classification is key to the reproduction of class privileges: dominant groups define their own culture and ways of being as superior. Thereby they exercise ‘symbolic violence,’ i.e., impose a specific meaning as legitimate while concealing the power relations that are the basis of its force (Bourdieu and Passeron 1977 [1970], p. 4). They define legitimate and ‘dominated’ cultures in opposition: the value of cultural preferences and behaviors is defined relationally around binary oppositions (or boundaries) such as high/low, pure/impure, distinguished/vulgar, and aesthetic/practical (p. 245). The legitimate culture they thereby define is used by dominant groups to mark cultural distance and proximity, monopolize privileges, and exclude and recruit new occupants to high status positions (p. 31).
Through the incorporation of ‘habitus’ or cultural dispositions, cultural practices have inescapable and unconscious classificatory effects that shape social positions. A large American literature applying, extending, assessing, and critiquing the contributions of Bourdieu and his collaborators developed in the wake of their translation into English (for a review, see Lamont and Lareau 1988). For instance, DiMaggio (1987) suggests that boundaries between cultural genres are created by status groups to signal their superior status. Lamont (1992, Chap. 7) critiqued Bourdieu (1979) for exaggerating the importance of cultural capital in upper-middle class culture and for defining salient boundaries a priori instead of inductively. By drawing on interviews with professionals and managers, she showed that morality, cultural capital, and material success are defined differently and that their relative importance varies across national contexts and by subgroups. Lamont also showed variations in the extent to which professionals and managers are tolerant of the lifestyles and tastes of other classes, and argued that cultural laissez-faire is a more important
feature of American society than of French society. High social and geographic mobility, strong cultural regionalism, ethnic and racial diversity, political decentralization, and relatively weak high culture traditions translate into less highly differentiated class cultures in the United States than in France. Other sociologists also argue that cultural boundaries are more fluid and complex than cultural capital theory suggests. In particular, in his study of group variation in home decoration, Halle (1993) suggests that art consumption does not necessarily generate social boundaries. He finds that the meaning attached to living room art by dwellers is somewhat autonomous from professional evaluations, and is patterned and influenced by a wide range of factors beyond class, including neighborhood composition. He also finds that cultural consumption is less differentiated than cultural capital theory suggests, with landscape art, for instance, being appreciated by all social groups. Finally, he suggests that ‘the link between involvement in high culture and access to dominant class circles … is undemonstrated’ (p. 198). For his part, Hall (1992) emphasized the existence of heterogeneous markets and of multiple kinds of cultural capital. He proposes a ‘cultural structuralism’ that addresses the multiplicity of status situations in a critique of an overarching market of cultural capital. Bryson (1996), Erickson (1996), and Peterson and Kern (1996) suggest that cultural breadth is a highly valued resource in the upper and upper-middle classes, hence contradicting Bourdieu’s understanding of the dominant class which emphasizes exclusively the boundaries they draw toward lower class culture. Bryson (1996) finds that musical exclusiveness decreases with education. She proposes that cultural tolerance constitutes a multicultural capital more strongly concentrated in the middle and upper classes than in the lower classes.
Erickson (1996) suggests that although familiarity with high status culture correlates with class, it is useless in coordinating class relations in the workplace. She writes that the ‘culture useful for coordination is uncorrelated … with class, popular in every class’ (p. 248) and that ‘the most useful overall cultural resource is variety plus a well-honed understanding of which [culture] genre to use in which setting’ (p. 249). For their part, Peterson and Kern (1996) document a shift in high status persons from snobbish exclusion to ‘omnivorous appropriation.’ These studies all call for a more multidimensional understanding of cultural capital as a basis for drawing boundaries, and counter Bourdieu’s postulate that the value of tastes is defined relationally through a binary or oppositional logic.
3.2 Identity and Boundary Work
The growing literature on identity is another arena where the concept of symbolic boundaries has become
more central. In particular, sociologists and psychologists have become interested in studying boundary work, a process central to the constitution of the self. Social psychologists working on group categorization have been studying the segmentation between ‘us’ and ‘them.’ Brewer’s (1986) social identity theory suggests that ‘pressures to evaluate one’s own group positively through in-group/out-group comparison lead social groups to attempt to differentiate themselves from each other.’ This process of differentiation aims ‘to maintain and achieve superiority over an outgroup on some dimension’ (Tajfel and Turner 1985, pp. 16–17). While these authors understand the relational process as a universal tendency, sociologists are concerned with analyzing precisely how boundary work is accomplished, i.e., with what kinds of typification systems, or inferences concerning similarities and differences, groups mobilize to define who they are. The concept of boundary work was proposed originally by Gieryn in the early 1980s to designate ‘the discursive attribution of selected qualities to scientists, scientific methods, and scientific claims for the purpose of drawing a rhetorical boundary between science and some less authoritative residual nonscience’ (1999, pp. 4–5). In recent years, sociologists have become interested in analyzing this process by looking at self-definitions of ordinary people, while paying particular attention to the salience of various racial and class groups in boundary work. For instance, Newman (1999) analyzes how poor fast-food workers define themselves in opposition to the unemployed poor. Lamont (1992) studies the boundary work of professionals and managers, while Lamont (2000) examines how workers in the United States and France define worthy people in opposition to the poor, ‘people above,’ blacks, and immigrants, drawing moral boundaries toward different groups across the two national contexts.
Lichterman (1999) explores how volunteers define their bonds and boundaries of solidarity by examining how they articulate their identity around various groups. He stresses that these mappings translate into different kinds of group responsibility, in ‘constraining and enabling what members can say and do together.’ Binder (1999) analyzes boundaries that proponents of Afrocentrism and multiculturalism build in relation to one another in conflicts within the educational system. Finally, Gamson (1992) analyzes how the injustice frames used in social movements are organized around ‘us’ and ‘them’ oppositions. Jenkins’s (1996) study of social identity provides useful tools for the study of boundary work. He describes collective identity as constituted by a dialectic interplay of processes of internal and external definition. On the one hand, individuals must be able to differentiate themselves from others by drawing on criteria of community and a sense of shared belonging within their subgroup. On the other hand, this internal
identification process must be recognized by outsiders for an objectified collective identity to emerge. Future research on the process of collective identity formation may benefit from focusing on the dynamic between self-identification and social categorization.
3.3 Moral Order, Community, and Symbolic Politics
A third strand of work on symbolic boundaries presents more palpable neo-Durkheimian influences. Several empirical studies have centered on moral order and on communities. Wuthnow (1987, p. 69) writes ‘Order has somehow to do with boundaries. That is, order consists mainly of being able to make distinctions—of having symbolic demarcations—so that we know the place of things and how they relate to one another.’ A recent example of this neo-Durkheimian line of work is Alexander’s (1992) semiotic analysis of the symbolic codes of civic society. The author describes these codes as ‘critically important in constituting the very sense of society for those who are within and without it.’ He also suggests that the democratic code involves clear distinctions between the pure and the impure in defining the appropriate citizen. His analysis locates those distinctions at the levels of people’s motives and relationships, and of the institutions that individuals inhabit (with ‘honorable’ being valued over ‘self-interested’ or ‘truthful’ over ‘deceitful’ in the case of the democratic code). It should be noted that the last decades have produced several studies of status politics that documented precisely how groups sharing a lifestyle made such distinctions, engaged in the maintenance of the moral order, and simultaneously bolstered their own prestige. Particularly notable is Gusfield (1963), who analyzed the nineteenth-century American temperance movement in favor of Prohibition and the Eighteenth Amendment to the Constitution. Gusfield understands this movement as a strategy used by small-town Protestants to bolster their social position in relation to urban Catholic immigrants. Along similar lines, Luker (1984) describes the worldviews of antiabortion and pro-choice activists.
She shows that they have incompatible beliefs about women’s careers, family, sexuality, and reproduction, and that they talk past one another and define themselves in opposition to one another. The literature on social movements includes numerous additional studies that focus on the process by which categories of people are turned into categories of enemies (Jasper 1997, Chap. 16).
4. Challenges and Future Directions
Two main challenges concerning the study of boundaries loom on the horizon of sociological scholarship. They concern (a) a necessary synthesis of
the various strands of work that speak to boundary issues across substantive areas; and (b) the study of the connection between objective and subjective boundaries. The concept of boundaries is playing an increasingly important role in a wide range of literatures beyond those discussed above. For instance, in the study of nationalism, citizenship, and immigration, scholars have implicitly or explicitly used the boundary concept to discuss criteria of membership and group closure within imagined communities (e.g., Anderson 1983, Brubaker 1992). Some of these authors are concerned with the established rules of membership and boundary work. Yet others study national boundary patterns, i.e., the ways in which nations define their identity in opposition to one another (Saguy 2000). The concept of boundary is also central in the study of ethnicity and race. The relational process involved in the definition of collective identity (‘us’ vs. ‘them’) has often been emphasized in the literature on these topics. The work of Barth (1969) for instance concerns objective group boundaries and self-ascription, and how feelings of communality are defined in opposition to the perceived identity of other racial and ethnic groups. Along similar lines, Bobo and Hutchings (1996) understand racism as resulting from threats to group positioning. They follow Blumer (1958) who advocates ‘shift[ing] study and analysis from a preoccupation with feelings as lodged in individuals to a concern with the relationships of racial groups … [and with] the collective process by which a racial group comes to define and redefine another racial group’ (p. 3). Finally, gender and sexual boundaries are also coming under more intense scrutiny. For instance, Epstein (1992) points out that dichotomous categories play an important part in the definition of women as ‘other’ and that much is at stake in the labeling of behaviors and attitudes as feminine or masculine (also see Gerson and Peiss 1985). 
Those who violate gender boundaries in illegitimate ways often experience punishment in the workplace. More recently, Tilly (1997) argues that dichotomous categories such as ‘male’ and ‘female’ (but also ‘white’ and ‘black’) are used by dominant groups to marginalize other groups and block their access to resources. He extends the Weberian scheme by pointing to various mechanisms by which this is accomplished, such as exploitation and opportunity hoarding. He asserts that durable inequality most often results from cumulative, individual, and often unnoticed organizational processes. Because these various literatures all deal with the same social process, boundary work, it may be appropriate at this point to begin moving toward a general theory of boundaries by, for instance, identifying similarities and differences between boundaries drawn in various realms: moral, cultural, class, racial, ethnic, gender, and national boundaries. This could be
accomplished by focusing on a number of formal features and characteristics of boundaries, such as their visibility, permeability, boundedness, fluidity, and rigidity. We may also want to compare embedded and transportable boundaries; explicit and taken-for-granted boundaries; positive and negative boundaries; and the relationship between representations of boundaries and context. Social scientists should also think more seriously about how different types of boundaries can combine with one another across local and national contexts (e.g., how cultural or moral boundaries combine with race, gender, or class boundaries). A second challenge will be to understand the connection between objective boundaries and symbolic boundaries. Students of objective boundaries have focused on topics such as the relative importance of educational endogamy vs. racial endogamy among the college educated; racial hiring and firing; the extent of residential racial segregation; the relative permeability of class boundaries; and the process of creation of professional boundaries. Lamont (1992) has argued that symbolic boundaries are a necessary but insufficient condition for the creation of objective boundaries. More empirical work is needed on the process by which the former transmutes into the latter.
See also: Collective Identity and Expressive Forms; Discourse and Identity; Expressive Forms as Generators, Transmitters, and Transformers of Social Power; Leisure and Cultural Consumption; Networks and Linkages: Cultural Aspects
Bibliography
Alexander J 1992 Citizens and enemy as symbolic classification: on the polarizing discourse of civil society. In: Lamont M, Fournier M (eds.) Cultivating Differences: Symbolic Boundaries and the Making of Inequality. University of Chicago Press, Chicago
Anderson B 1983/1991 Imagined Communities: Reflections on the Origin and Spread of Nationalism. Verso, London
Barth F 1969 Introduction. In: Barth F (ed.) Ethnic Groups and Boundaries: The Social Organization of Culture Difference. George Allen and Unwin, London
Binder A 1999 Friend and foe: boundary work and collective identity in the Afrocentric and multicultural curriculum movements in American public education. In: Lamont M (ed.) The Cultural Territories of Race: Black and White Boundaries. University of Chicago Press, Chicago and Russell Sage Foundation, New York
Blumer H 1958 Race prejudice as a sense of group position. Pacific Sociological Review 1: 3–7
Bobo L, Hutchings V L 1996 Perceptions of racial group competition: extending Blumer’s theory of group position to a multiracial social context. American Sociological Review 61: 951–72
Bourdieu P 1979 Distinction, Critique sociale du jugement. Editions de Minuit, Paris [1984 Distinction: A Social Critique of the Judgment of Taste. Harvard University Press, Cambridge, MA]
Bourdieu P, Passeron J-C 1970 Reproduction. Éléments pour une théorie du système d’enseignement. Editions de Minuit, Paris [1977 Reproduction in Education, Society, and Culture. Sage, Beverly Hills, CA]
Brewer M B 1986 The role of ethnocentrism in intergroup conflict. In: Worchel S, Austin W G (eds.) Psychology of Intergroup Relations. Nelson-Hall Publishers, Chicago
Brubaker R 1992 Citizenship and Nationhood in France and Germany. Harvard University Press, Cambridge, MA
Bryson B 1996 ‘Anything but Heavy Metal’: symbolic exclusion and musical dislikes. American Sociological Review 61(5): 884–99
Collins R 1998 The Sociology of Philosophies: A Global Theory of Intellectual Changes. Harvard University Press, Cambridge, MA
DiMaggio P 1987 Classification in art. American Sociological Review 52: 440–55
Douglas M 1970 Natural Symbols. Vintage, New York
Douglas M 1966 Purity and Danger: An Analysis of Concepts of Pollution and Taboo. Routledge & Kegan Paul, London
Durkheim E 1911 Les formes élémentaires de la vie religieuse. Alcan, Paris [1965 The Elementary Forms of Religious Life. Free Press, New York]
Durkheim E, Mauss M 1903 De quelques formes primitives de classification: contribution à l’étude des représentations collectives. Année Sociologique 6: 1–72 [1963 Primitive Classification. University of Chicago Press, Chicago]
Elias N 1939 Über den Prozess der Zivilisation. Soziogenetische und Psychogenetische Untersuchungen, 2 Vols. Haus zum Falken, Basel [1982 The Civilizing Process. Blackwell, New York]
Epstein C F 1992 Tinker-bells and pinups: the construction and reconstruction of gender boundaries at work. In: Lamont M, Fournier M (eds.) Cultivating Differences: Symbolic Boundaries and the Making of Inequality. University of Chicago Press, Chicago
Erickson B 1996 Culture, class, and connections. American Journal of Sociology 102: 217–51
Gamson W 1992 Talking Politics. Cambridge University Press, New York
Gerson J M, Peiss K 1985 Boundaries, negotiation, consciousness: reconceptualizing gender relations. Social Problems 32: 317–31
Gieryn T F 1999 Cultural Boundaries of Science: Credibility on the Lines. University of Chicago Press, Chicago
Gusfield J 1963 Symbolic Crusade. University of Illinois Press, Urbana, IL
Hall J R 1992 The capital(s) of cultures: a non-holistic approach to status situations, class, gender, and ethnicity. In: Lamont M, Fournier M (eds.) Cultivating Differences: Symbolic Boundaries and the Making of Inequality. University of Chicago Press, Chicago
Halle D 1993 Inside Culture. Art and Class in the American Home. University of Chicago Press, Chicago
Jasper J M 1997 The Art of Moral Protest: Culture, Biography and Creativity in Social Movements. University of Chicago Press, Chicago
Jenkins R 1996 Social Identity. Routledge, London
Lamont M 1992 Money, Morals, and Manners: The Culture of the French and American Upper-Middle Class. University of Chicago Press, Chicago
Lamont M 2000 The Dignity of Working Men: Morality and the Boundaries of Race, Class, and Immigration. Harvard University Press, Cambridge, MA and Russell Sage Foundation, New York
Lamont M, Lareau A 1988 Cultural capital: allusions, gaps, and glissandos in recent theoretical developments. Sociological Theory 6: 153–68
Lichterman P 1999 Civic Culture Meets Welfare Reform: Religious Volunteers Reaching Out. Paper presented at the Annual Meeting of the American Sociological Association, Chicago
Luker K 1984 Abortion and the Politics of Motherhood. University of California Press, Berkeley, CA
Newman K 1999 No Shame in My Game: The Working Poor in the Inner City. Knopf and Russell Sage Foundation, New York
Parkin F 1979 Marxism and Class Theory: A Bourgeois Critique. Columbia University Press, New York
Peterson R A, Kern R 1996 Changing highbrow taste: from snob to omnivore. American Sociological Review 61: 900–7
Saguy A C 2000 Sexual harassment in France and the United States: activists and public figures defend their positions. In: Lamont M, Thévenot L (eds.) Rethinking Comparative Cultural Sociology: Polities and Repertoires of Evaluation in France and the United States. Cambridge University Press, Cambridge, UK
Tajfel H, Turner J C 1985 The social identity theory of intergroup behavior. In: Worchel S, Austin W G (eds.) Psychology of Intergroup Relations. Nelson-Hall, Chicago
Tilly C 1997 Durable Inequality. Harvard University Press, Cambridge, MA
Veblen T 1899/1979 The Theory of the Leisure Class. Penguin, New York
Weber M 1922/1956 Wirtschaft und Gesellschaft. Grundriss der verstehenden Soziologie. Mohr, Tübingen [1978 Economy and Society. University of California Press, Berkeley, CA, Vol. 1]
Wuthnow R 1987 Meaning and Moral Order: Explorations in Cultural Analysis. University of California Press, Berkeley, CA
M. Lamont
Symbolic Interaction: Methodology
The term ‘symbolic interactionism’ (SI) was coined in an article by Blumer (1937). SI presented a radical break from the traditional, macrostructural theories of sociology. SI rejected the scientific method and its reliance on predetermined hypotheses. SI’s goal was not to make predictions about society, as structuralist methodologies did, but to understand the nature of the social world. In basing its research in the real world of everyday life, SI relied heavily on the theoretical work of the pragmatists, especially that of Mead (1934).
1. Blumer and Naturalistic Research
In his book containing reprints of his earlier articles, Blumer (1969) applies a methodology, naturalistic research, to the theoretical work of Mead. In symbolic interactionism, theory and methodology are intertwined very closely. Mead’s social pragmatism advocated a study of society that takes place in small group interaction; namely, the relation of self and others. This reciprocal interaction is derived, for Mead, through the use of symbols, primarily language but others as well. Thus, society, for Mead, is a consensual, intersubjective world where the sharedness of meanings among its members allows some stability in an ever-changing flux (cf. Adler and Adler 1980). Blumer, who was first a student and then a colleague of Mead at the University of Chicago, followed the latter’s theoretical ideas and stated that society exists in action, in the reciprocal exchange of symbols, and the meaning these have for the members of society. If society is to be found in the interactional processes of its members, we should study it in ‘the real world,’ i.e., the world created by societal members in their interaction, rather than in artificial settings or hypothetical situations. For Blumer, it follows that rather than constrain the study of social processes with controlled variables we should proceed with minimal intervention by the researcher. Also, in trying to maintain the integrity of the phenomenon under investigation we should not fit the ‘real world’ to our method, as is typical in survey research, but should adapt our methodology to the world and thus use flexible data collection strategies. Finally, the model of the study should not be predetermined and then tested, but should be ever-changing to fit the changes and events in the real world. For Blumer, naturalistic research has two phases: exploration and inspection. Exploration allows the researcher to gain a clearer understanding of the study, what methods to use, and to conceptualize the area of study in order to be able to talk from facts rather than speculations (Blumer 1969).
Exploration should be flexible and change as necessary to understand the topic at hand and thus should rely on a multiplicity of methods including observation, interviewing, life histories, and personal documents. Exploration is the first step of naturalistic inquiry for Blumer but must be complemented by another, more analytic step, inspection. This latter step aids the researcher in developing an analytic understanding of the elements of inquiry and their relations. This understanding is sensitizing rather than definitive, since the world studied is a process, i.e., ever changing. Furthermore, inspection allows for the testing of theoretical propositions. Thus, naturalistic research uses exploration to define the parameters of the study, then turns to inspection to develop sensitizing concepts about the study, finally developing theoretical concepts about society and its members (Hammersley 1989).
1.1 The Chicago School
Blumer mentioned some works of the Chicago School and particularly The Polish Peasant in Europe and
America (Thomas and Znaniecki 1918–20) as exemplars of the naturalistic method. Indeed, many works of the Chicago School are considered the antecedents of SI, since they tended to rely on some facet of the naturalistic method. Both Mead and Blumer were at the University of Chicago in the 1920s and 1930s when the school had great influence on American sociology. Among many other important contributors to the naturalistic studies of the Chicagoans, besides W. I. Thomas, who was one of the theoretical leaders (but left as early as 1918), Robert Park must be mentioned. He was an ex-journalist trained in philosophy and took some courses from the pragmatists, John Dewey and William James. From his previous profession Park brought an understanding of studying the real world ‘out there,’ thus inspiring many naturalistic studies. It must be noted that the Chicago School has been criticized for saying very little about their methods in their writings (Hammersley 1989) and for not really being a school of thought at all (Becker 1999). Some of the most notable works from the Chicago naturalistic tradition are The Ghetto (Wirth 1928), The Delinquent Gang (Thrasher 1927), The Gold Coast and the Slum (Zorbaugh 1929), and The Hobo: The Sociology of the Homeless Man (Anderson 1923). For a detailed study of the Chicago School see Faris (1967) and Bulmer (1984).
2. Other Methods Based on Mead
While Blumer’s adaptation of Mead’s theories is the methodological mainstay of SI, there are other methodologies based on SI, and these will be mentioned next. Analytic induction was first discussed by Znaniecki (1928). It was later used, with minor variations, by Lindesmith (1937, 1968) (a graduate student of Blumer), Cressey (1950) (a student of Lindesmith), Becker (1963) (see Hammersley 1989), and others. Analytic induction, according to Znaniecki, recognizes that objects in the world are open to an infinite number of descriptions and, thus, our account of them must be selective; this selectivity will be based on the interest at hand, which for sociologists is primarily social and cultural systems; commonly used sociological methods relying on preidentification (deductive) or superficial description (inductive) will not work; only analytic induction will accomplish the task. The researcher selects a small number of cases (usually 10–12) and studies them in depth, continually defining and redefining the event and formulating and reformulating theoretical propositions until they fit all cases. Negative cases must also be examined (this was Lindesmith’s idea). Another student of Blumer, Strauss, together with Glaser, developed another SI method, grounded theory (Glaser and Strauss 1967). Closely related both to Blumer’s methodology and to analytic induction, grounded theory placed more emphasis on the generation and development of theory. Relying on the inductive method, grounded theory is akin to Blumer’s inspection, only much more elaborate. Rather than relying on an a priori population, in analytic theorizing one continues to study new cases until the point of saturation, generating theoretical categories.
Not all SI methods followed the constructionist approaches outlined above. A notable exception came from the Iowa School of Sociology. Kuhn (1964) adopted a much more deterministic approach to Mead’s discussion of the self and the nature of the ‘me,’ the various roles and images we have of ourselves. The methodology he adopted to discover the nature of the self was called the Twenty Statements Test (TST), a series of open-ended questions about the self. Kuhn felt that, rather than use the oblique method of observing people, one ought to ask them directly about the nature of their inner feelings, and they would honestly disclose them to the researcher. Kuhn would use the results of the TST to outline generic laws that would apply to human beings in different situations. Other positivistically oriented symbolic interactionists are Sheldon Stryker, described as a ‘structural role theorist,’ who influenced numerous students at the University of Indiana, and Carl Couch, who was a stalwart of the discipline, with his ‘Behavioral Sociology’ at the University of Iowa (cf. Reynolds 1993). In the 1960s and 1970s a plethora of theoretical approaches, largely based on the naturalistic method, appeared. Some were based on basic Meadian tenets, such as dramaturgy (Goffman 1959) and labeling (Becker 1963). Others based their constructionist approach not only on the ideas of Mead but also on those of the phenomenologists (Husserl, Schutz, Heidegger, Dilthey), the existentialists (Merleau-Ponty, Sartre), and ordinary language philosophers (Wittgenstein).
They are phenomenological sociology, existential sociology, ethnomethodology, and the sociology of emotions (see Douglas et al. (1980) for a survey of these sociologies and a list of references to them; also, see Adler et al. (1987)).
3. Postmodern-informed Symbolic Interactionism
More recently, with the influence of postmodernism on the social sciences, the methodology of SI has undergone some dramatic changes (Marcus and Fischer 1986, Dickens and Fontana 1994). Again, the closeness of theory and methodology in SI affects methodological approaches to SI. Postmodernism (Lyotard 1984) advocates a disenchantment with metatheories, claiming all we can possibly know about society is much smaller in scope, and it is relative due to its historical, temporal, regional nature, as well as other elements. Thus, society becomes fragmented and
we should question and deconstruct all overarching metatheories and reduce the scope and intent of our studies to our immediate surroundings, to the details of everyday life. These beliefs impact the naturalistic method of SI in a number of ways. First, we can no longer claim to be aiming at universal laws of human behavior since that would be a metatheory. Second, metatheories come under scrutiny and are deconstructed, i.e., their basic assumptions are rendered problematic and questioned. This leads to the questioning of the role of the researcher and author, whose authority to choose selectively which data to record and report biases the study. Postmodern-informed interactionism attempts to remedy this authorial bias through polyphony, that is, by allowing the many voices of the respondents to speak for themselves. Additionally, researchers disclose as much information as possible about themselves: role, interest, possible bias, and more. This is only a partial solution to the authorial influence, which is ultimately irremediable, as we cannot help but perceive and report selectively. Other methodological changes take aim at the relation between the researcher and the researched. Traditionally, interactionists have attempted to remain as neutral and invisible as possible in order not to influence the study. Postmodern-informed interactionists make their presence felt in an attempt to establish a one-to-one partnership with the respondents. In so doing researchers are no longer aloof observers collecting data but become a part of the interaction and influence the research findings. Data now stem from a negotiated process between researcher and respondent (Gubrium and Holstein 1997) (notice that this goal contradicts the previous goal of reducing authorial bias). Another change involves the notion of ‘value free’ research.
Some postmodern-informed interactionists wish to overthrow traditional sociology (Clough 1998, Denzin 1997, 1999a), which they consider patriarchal; instead they envision a new researcher who is a ‘feminist, communitarian researcher … who builds collaborative, reciprocal, trusting, and friendly relations with those studied’ (Denzin 1997).
4. Current Trends
Postmodern-informed interactionists have not only made dramatic methodological changes from Blumerian interactionism but are also experimenting with radically different reporting styles, which are closely intertwined with the new methodological ethos. Some of these styles include poetry and performances, as well as the writing of short stories about one’s personal past (autoethnographies). While some of these reports are highly engaging (Richardson 1997, Denzin 1999b), they raise a question about standards—should these
works be judged and evaluated by the standards of sociology or poetry and literature? Some interactionists have been highly critical of postmodern-informed interactionism (Seidman 1991, Shalin 1993, Best 1995, Prus 1996). As a result the field of interactionism is currently divided into many groups with different methodological approaches. Adler and Adler (1999), in describing the varied groups of interactionists at the turn of the millennium, use the metaphor of ‘the ethnographers’ ball’ and seat the factions at various tables. We rely upon the Adlers’ description in what follows. The first table seats the ‘postmodern-informed’ ethnographers; next to them is the ‘autoethnography group’; a bit further down sit the ‘interpretive discourse analysts’ who attempt to integrate interactionism, ethnomethodology, and postmodernism. At another table the Adlers have the ‘classic interactionists,’ in the wake of the old Chicago School. Next to them sit the ‘no mo pomo’ (Adler and Adler 1999), the Blumerian group critical of the new postmodern influence. Nearby are the followers of Erving Goffman, the ‘dramaturgists,’ studying everyday life by using the metaphor of the theater. They are followed by the ‘grounded theorists,’ next to the ‘conversational analysts’ who are continuing the work of ethnomethodologists such as the late Harvey Sacks and others. Next, the Adlers have a table for those who practice ‘the extended case study,’ using ethnography to do applied research. Finally we see the ‘visual interactionists’ who capture symbolic interaction through the eye of the camera lens. As the Adlers themselves say, ‘the list can go on’ (Adler and Adler 1999). We would add to the Adlers’ list three more tables, before we run out of room in the ballroom. One would be the ‘phenomenological/existential’ table seating Jack Douglas and his former students at the San Diego campus of the University of California.
Another table, right next to the postmodernists, would be the ‘feminist interactionists’ who are ‘united in their criticisms of “Eurocentric masculinist approaches”’ (Collins, quoted in Denzin 1997). Finally, there would be a table of ‘theoretical interactionists’ who have written extensively in the area without actually engaging in ethnographic work (cf. Lyman and Scott 1989). The methodology of Symbolic Interactionism has become the methodologies of symbolic interactionism, and can no longer be grouped under Blumer’s guiding ideas, yet their diversity is exaggerated and exacerbated by rivalry among various subgroups. By and large, while differing in some methodological points, the great majority of interactionists still wish to study the real world and the meaning that its members make of it and how they manage to maintain and achieve social order as a negotiated interactional accomplishment.
See also: Ethnomethodology: General; Grounded Theory: Methodology and Theory Construction; Interactionism: Symbolic; Postmodernism: Methodology; Postmodernism: Philosophical Aspects
Bibliography
Adler P, Adler P 1980 Symbolic interactionism. In: Douglas J D, Adler P A, Adler P, Fontana A, Freeman C R, Kotarba J A (eds.) Introduction to the Sociologies of Everyday Life. Allyn and Bacon, Boston
Adler P, Adler P 1999 The ethnographers’ ball revisited. Journal of Contemporary Ethnography 28(5): 442–50
Adler P, Adler P, Fontana A 1987 Everyday life sociology. Annual Review of Sociology 13: 217–35
Anderson N 1923 The Hobo: The Sociology of the Homeless Man. University of Chicago Press, Chicago
Becker H S 1963 Outsiders: Studies in the Sociology of Deviance. Free Press of Glencoe, London
Becker H 1999 The Chicago school, so-called. Qualitative Sociology 22(1): 3–12
Best J 1995 Lost in the ozone again. Studies in Symbolic Interaction 17: 120–30
Blumer H 1937 Social psychology. In: Schmidt E P (ed.) Man and Society. Prentice-Hall, New York
Blumer H 1969 Symbolic Interactionism: Perspective and Method. Prentice-Hall, Englewood Cliffs, NJ
Bulmer M 1984 The Chicago School of Sociology. University of Chicago Press, Chicago
Clough P T 1998 The End(s) of Ethnography: From Realism to Social Criticism, 2nd edn. Peter Lang, New York
Cressey D 1950 The criminal violation of financial trust. American Sociological Review 15: 738–43
Denzin N K 1989 Interpretive Interactionism. Sage, Newbury Park, CA
Denzin N 1997 Interpretive Ethnography: Ethnographic Practices for the 21st Century. Sage, Thousand Oaks, CA
Denzin N 1999a Interpretive ethnography for the next century. Journal of Contemporary Ethnography 28(5): 510–19
Denzin N 1999b Performing Montana. In: Glassner B, Hertz R (eds.) Qualitative Sociology as Everyday Life. Sage, Thousand Oaks, CA
Dickens D R, Fontana A 1994 Postmodernism and Social Inquiry. The Guilford Press, New York
Douglas J D, Adler P A, Adler P, Fontana A, Freeman C R, Kotarba J A (eds.) 1980 Introduction to the Sociologies of Everyday Life. Allyn and Bacon, Boston
Faris R E L 1967 Chicago Sociology 1920–1932. Chandler Pub. Co., San Francisco
Glaser B G, Strauss A L 1967 The Discovery of Grounded Theory. Aldine, Chicago
Goffman E 1959 The Presentation of Self in Everyday Life. Doubleday, Garden City, New York
Gubrium J F, Holstein J A 1997 The New Language of Qualitative Methods. Oxford University Press, New York
Hammersley M H 1989 The Dilemma of Qualitative Methods: Herbert Blumer and the Chicago Tradition. Routledge, London
Kuhn M 1964a Major trends in symbolic–interaction theory in the past twenty-five years. Sociological Quarterly 5: 61–84
Lindesmith A R 1937 The Nature of Opiate Addiction. The University of Chicago Libraries, Chicago
Lindesmith A 1968 Addiction and Opiates. Aldine, Chicago
Lyman S, Scott M 1989 Sociology of the Absurd, 2nd edn. Harcourt Brace, New York
Lyotard J-F 1984 The Postmodern Condition: A Report on Knowledge. [Trans. Bennington G, Massumi B] University of Minnesota Press, Minneapolis, MN
Marcus G E, Fischer M M J 1986 Anthropology as Cultural Critique: An Experimental Moment in the Human Sciences. University of Chicago Press, Chicago
Mead G H 1934 Mind, Self and Society. University of Chicago Press, Chicago
Prus R 1996 Symbolic Interaction and Ethnographic Research. State University of New York Press, New York
Reynolds L 1993 Interactionism: Exposition and Critique. General Hall, Dix Hills, New York
Richardson L 1997 Fields of Play: Constructing an Academic Life. Rutgers University Press, New Brunswick, NJ
Sanders C 1995 Stranger than Fiction: Insights and pitfalls in post-modern ethnography. Studies in Symbolic Interaction 17: 89–104
Schmidt E P (ed.) 1937 Man and Society. Prentice-Hall, New York
Seidman I E 1991 Interviewing as Qualitative Research. Teachers College Press, New York
Shalin D 1993 Modernity, postmodernism and pragmatist inquiry: An introduction. Symbolic Interaction 16: 303–32
Thomas W I, Znaniecki F 1918–20 The Polish Peasant in Europe and America. University of Chicago Press, Chicago, Vols. 1–5
Thrasher F M 1927 The Gang: A Study of 1,313 Gangs in Chicago. University of Chicago Press, Chicago
Wirth L 1928 The Ghetto. University of Chicago Press, Chicago
Znaniecki F 1928 Social research in criminology. Sociology and Social Research 12: 307–22
Zorbaugh H W 1929 The Gold Coast and the Slum. University of Chicago Press, Chicago
A. Fontana
Symbolism in Anthropology
Symbolism, as it is known and practised in anthropology, is the positive side of a negative effect called ‘the subject,’ or ‘subjectivity.’ The symbol and its symbolism live a life of their own, even in the efforts to determine what they may be or mean, so that the whole meaningful life of our species is compounded within their parameters. It is because it is difficult or well-nigh impossible to separate the life of symbols from our own that the social sciences have come into existence. What we identify as ‘symbolic’ is a recollective fantasy or educated guess as to what may have been going on as it was being formulated, and the lived or thought continuity it presupposes is a cumulative reprojection, or collective daydream, compounded of such educated guesses. It is the scholar’s or the writer’s way of dealing with ‘the emotions.’ The secret of this has very much to do with ‘yes’ and ‘no.’ For it was Gregory Bateson who demonstrated that analogy is not exclusive to human beings, that the dog that wishes to be friendly will deliver ‘the bite that is not a bite,’ affirming positive intention by not
closing its teeth on one’s hand. We, it seems, invented ‘no,’ or at least ‘delivered’ it. The simplest version is a joke: ‘Hurt me!’ said the masochist. ‘No,’ said the sadist, smiling in his teeth. Ever since, we have had the subject to contend with. ‘Hurt me!’ said the phenomenologist; ‘No,’ said the anthropologist. Symbolism gives us a deniability of the spoken word in the action of writing or reading it, and a deniability of the written word in the action of speaking from it. We know from ourselves it is that way, and we are a very quick study. In its most radical determination, anthropological symbolism is concerned with the human ability to reflect ideas, emotions, and perceptions in representational form, hence to reflect upon itself. As an analyzable artifact of cultural or psychological experience, symbolism renders conscious what might otherwise be thought to be unconscious or subconscious. As a totalizing perspective, integrating or at least classifying the whole of a personal, interpersonal, cultural, and social conceptualization of things, symbolism essays the whole compass of the concept. It is world-in-the-person and person-in-the-world. ‘DENY ME, CRUCIFY ME’ said God. ‘NO,’ said the being, smiling in its teeth, and just now suddenly HUMAN. The problems of psychology pertain to the individual person or personality, those of sociology involve a collective or consensual entity, and anthropology defines itself through the differences and similarities that these two polarities share in common. As a conjoint therapy, a medium through which each of these functions is determined by means of the other, symbolism takes on a life of its own, becomes its own mythology about itself. That might help to explain why anthropologists like Edward Sapir, Alfred Kroeber, Gregory Bateson, Margaret Mead, and especially Ruth Benedict, took such a deep interest in psychological problems.
And also why psychologists beginning with Freud, Jung, and Adler, tried to capture anthropology within the net of their all-too-specific visions. It is as though the medium of symbolization, a self-reflexive agency come into its own, simultaneously divided the two approaches and assimilated each to the other. Or as though the very medium of interpretation stole its message back from those who sought to interpret or dissect it. In the fine phrasing of the Urapmin people of Papua New Guinea, according to Joel Robbins, ‘the eye steals it and eats it.’ Interpretation is a thief, and not necessarily a visual one. Another anthropologist, F. E. Williams, was told in Papua that ‘The sky has no mouth, the sea has no mouth; only old men have mouths.’ Just how much of hermeneutical theory was made in envy of the spoken word? The favorite form of Melanesian immortality is what they call ‘the talk that never dies.’ Treated formalistically, subjected to its own definition of itself, it becomes, instead, the symbolism
that never lives. The structuralism and ethnosemantics of the 1960s shifted the interpretative message into the symbols, signs, and markers themselves, grounded in schemes for the classification of things, and ultimately the classification of classification itself. The economy or self-condensing advantage of the structuralist shift in emphasis is easily misunderstood in that structuralism and semiotics do not deny the reflexive (e.g., ‘psychological’) ramifications of symbolism so much as they incorporate them within their analytic designs. The so-called ‘poststructuralist’ or ‘postmodern’ approaches have basically nothing to add to this, for in ‘unpacking’ the ‘text,’ pretending lived social reality as an imaginary diagram or document, one is subversively packing it. An alternative approach was suggested in the writings of Sir Edmund Leach and Mary Douglas in which the significative and self-classifying properties of symbolization became secondary to the liminal categorization of properties developed from their usage. It was the anomalies, ambivalences, and ambiguities encountered in classification, and thus only indirectly classification itself, that motivated, and so defined, symbolism. (Thus the consumption of pork would have been proscribed in the ancient Middle East because the pig was an anomalous creature, one that ‘divideth the hoof but cheweth not the cud.’) The result was a theory of symbolism that was forced to deny its subject matter as a means of affirming it. Victor W. Turner noted that the possibility of an ‘antistructure,’ as he called it, is a necessary concomitant of anything that could be called structural. (‘Deny the liminal, the reality between the words,’ challenged Turner. ‘No,’ said the ardent symbolist, smiling in his teeth.) 
‘What is the representational or reflectional status of a symbolism that compromises its own ability to mean or represent, and so exists through the riddle it makes of its own existence?’ Essentially all subsequent approaches to symbolism in anthropology devolved from the point of that impasse, offering alternative solutions to the problem by denying or qualifying certain of its drawbacks. One might, for instance, argue that there is no definable symbolism, but only an evolutionary or historical process, thereby postponing the subject of the enquiry indefinitely. In that case the symbolist would have to be what Claude Lévi-Strauss called a bricoleur, a crafty opportunist anxiously fitting together what the inevitable process was casually taking apart. Bricolage would be a form of the Hegelian or Marxian dialectic, but, alas, the dialectic would also be a form of bricolage. Another alternative was posed in the American conception of a ‘cultural system,’ which has its only unity in a sort of either/or proposition. Either symbolism is metaphorical to its own usage, an open invitation to what Clifford Geertz has called its ‘understanding,’ or it is not. In his classic study of American kinship, David M. Schneider defined the
symbol as that which stands for what it is not, and it was but a step from this to his later conclusion that ‘kinship’ simply does not exist. (‘Deny relationship,’ said the anthropologist, ‘it is only a function of the person.’ ‘No,’ said the informant, smiling in his teeth.) Is misunderstanding simply a more arcane version of what Geertz and the phenomenologists have called ‘understanding?’ It would be impossible to form substantive conclusions about anthropological symbolism and what it means without in some way participating in the very effect one was trying to cast into relief. It would likewise be very difficult to live within or experience a kind of ‘construction of reality’ without in some way penetrating or disqualifying the facade it is said to impose. Virtually any example of symbolism one could know or imagine would have to correspond with some version of the antisymbolic insight, and thus give the lie to Max Weber’s famous dictum about human beings ‘suspended in webs of meaning that they themselves have created.’ Should Weber have added ‘… or uncreated,’ and smiled in his teeth? The thing that stands for what it is not is also not what it stands for. The prescient quality of the symbol, and of the symbolism on which it is founded, has just that little to do with content and form, with the kinds or varieties of its usage and recognizable categorization. The lesson in this, which has whole semiotic literatures at its beck and call, is not simply moronic, but actually oxymoronic, like a breath of fresh air. It is more clearly analogous to the lyrics of an American ‘country’ song: ‘Just what part of “no” don’t you understand?’ Is the metaphor really a metonym in disguise, or is it the other way around; is the ‘syntagmatic chain,’ a code phrase for the ways convention strings symbols together, a sort of learner’s permit for the paradigmatic, the illusion of all the meanings coming together at once?
‘There are no easy horses,’ says another country song, ‘but you’ve got to learn to ride.’ ‘Americans,’ said the cowboy–humorist Will Rogers, ‘are lucky enough not to get all the government they pay for.’ Are symbolists even luckier? The symbolism, or theory of symbolism, that does not partake of the emotional quality it engenders is just another sign, a part of ‘no’ that, properly misunderstood, might well become a ‘yes.’ We affirm our denials symbolically, as they say. If we can have no emotions that are not part of the process of determining what they themselves might be, then symbol and emotion are like Siamese twins, joined at the hip. They are in a physical relationship; no kind of therapy or ‘talking cure’ will be able to cut them apart, but only, if lucky, make them feel better about their plight. Denial is also emotional, very strongly and suggestively so, but writing a theory of denials would be like photocopying the air instead of breathing it. A large part of the ‘symbolic’ problem seems to be that the predominant mental ethos in what we like to
pretend to ourselves as a ‘modern world civilization’ has excused itself from the insights of the medieval nominalist philosophers. These were the gadflies who insisted, against the worship of essences, the Platonic Realism of their own version of civilization, that there is nothing in what we know save in the names we give to it. We know the names of the stars and their constituent elements and processes. We have given very fancy names to the molecules that make us up, and to their processes and transformations, and are lucky enough not to get all the ones we pay for. But, according to Peter Abelard, we would know very little else. ‘The named is the mother of the myriad creatures’ said Lao Tsu. Phenomenalism, epistemology, and, of course, symbolism, have a very ancient reading in the texts of Aristotle, the Arabic and Jewish, Indian, and Chinese philosophers, and even Heraclitus. But it would seem that the medieval problem of symbolism began with the assertion, by Berengar of Tours, that the Holy Christian Sacrament is but a symbol. He may have smiled in his teeth but was very nearly killed for this. He meant that the accidental properties of things (including the names we give them), and not their essences or substances (whatever those might be) make up the whole reality by which we know them. Thanks to his protector, Fulk the Black of Anjou, Berengar lived to a ripe old age. But he committed symbolic suicide for the medieval church. Dying by design may be the first and the last thought of the would-be suicide, and of the architect of gothic cathedrals. It was no accident that Berengar’s insight happened in medieval times, and provoked the theologies of sacramental doctrine into being. 
It was no accident either that the subsequent scientific revolution built virtual gothic cathedrals of the accidental properties of things, the matter of ‘substance’ having gone to subatomic particles, energies, uncertainty principles or worse, committing real homicide instead of symbolic suicide. ‘Understand me,’ said Berengar. ‘No,’ said modern world civilization, smiling in its nuclear fallout. Lucky enough not to get all the Berengar it has paid for. Not surprisingly, it was a twentieth-century philosopher, Ludwig Wittgenstein, who completed Berengar’s ‘sentence.’ The sacrament, or anything else, is a symbol because ‘We picture facts to ourselves.’ In other words, important as understanding or even interpretation may be, the essential fact is what Wittgenstein called ‘making sense of things.’ This means that sense is a total phenomenon, and that reason, logic, and even the overstanding of a problem that we call an ‘explanation,’ are its derivative qualities. The whole perspective is in the picture, and no abstraction or ideal quality, however profound, is ever more useful or effective than the sense that models its meanings for us. More to the point of this definition, sense gives the gift of deniability to logic and rational ordering. This
would have to mean that all of Einstein’s conclusions about general and special relativity were beholden to the ‘thought experiments,’ through which he imagined them to himself, the ‘picture facts.’ It would mean that the chemistry of DNA is the micromanagement of Watson and Crick’s ‘double helix’ model, and that all subsequent determinations of subatomic structure are footnotes to the initial mistake of the Bohr atom. It means that the fossil itself is a cartoon of the reconstructions that we make of it and call a ‘dinosaur,’ and that everything that we may remember about the past lies in the future of the present moment, that anything we may know of culture or through our own version of it, is a by-product of the initial invention of the concept. We make a solid object of its sense, we ‘make book on it.’ Imaginal deniability suggests that ‘culture’ and symbolization enjoy a positive (e.g., ‘double negative’) relation to each other, that each lives a life of its own within the shadow of the other. ‘Deny me, dissect me, disqualify me,’ said the culture. ‘No,’ said the symbol, smiling in its teeth. The English reformer John Wycliffe suggested that neither substance nor accident have very much to do with Berengar’s symbolic sacrament, that what the partaker is after is an ‘indwelling of Christ in the soul,’ a better sense of Christ’s sacrifice, or denial of his life, in the body. The self-denial in Christ’s sacrifice is essential to its meaning, or self-affirmation. Likewise, metaphor, the pictorial essence of sense in the symbol, is language’s way of falling on its sword. And of liking it. If what we call ‘consciousness’ is the mind’s way of surprising itself, then symbolism is the reaction it has when it will not take ‘no’ for an answer. No surprise/no consciousness; no consciousness/no symbol. Everything in the world of letter and number, and even the picture, has the quality of marks, usually in black and white.
Everything else has the quality of surprise. Then consciousness itself has no other birth, and no death, and no other phenomenology either that would not come as a surprise. If consciousness begins in pain, as in the joke quoted, then, never mind the change of subject, it would have to finish in satisfaction. Put more bluntly, this means that if pain is the consequence of taking the human subject too seriously, then symbolism would be the consequence of passing it off as a joke. Then morphine, pace Christ, would be the sufferer’s best friend, for with morphine ‘all the pain is there, I can feel it in my bones, but, insofar as I cannot sense it, I quite honestly don’t give a damn,’ said the human subject, smiling in its teeth. We have plenty of subjects in anthropology, and the surprising thing is that we have no valid ethnography about how or why the subject should take itself seriously, care or not care about its culture. Not even in Colin Turnbull’s work on the Ik. And this makes it difficult to either care or not care about anthropology
itself. We have all too many anthropological studies of the ideological substitute for care or concern, the backlash of symbolization. The target of pity becomes a pathetic (e.g., phenomenological) subject when forced to turn that pity back on itself, and an object or willing tool when it lashes out against its plight. The giver of pain or of pity becomes an empathetic subjector. ‘Hurt me,’ implores the pathetically cultural masochist. ‘I am hurting,’ says the ideological sadist, smiling in his teeth. Symbolism does not come from inspired self-rejection; it goes there. Symbolism does not merely encourage personal solipsism; it positively enjoins it, and forces the solipsist to make the necessary compromises or adjustments in attitude. ‘Every man his own priest,’ was the motto of the Reformation; ‘every person their own guru, world-renouncer, and inventor of the world,’ sings the modern-day choir in response, ‘we are all hurting.’ The quality of mercy is strained through the sieve of cognition. Symbolism may or may not be the live wire, nature’s way of short-circuiting the human race, but it is definitely culture’s way of short-circuiting anthropology. Symbolism is the edged weapon, the most ancient, pace language, part of our technology; it is the distinction that cuts the person as the person cuts the distinction. Without the cutting tool, linguistic or whatever, we could not establish figure against ground, or ground against figure; we would not be grounded at all for want of knowing the difference between the two. Without the essential tension between affirmation and denial, what the symbolism ‘stands for’ and how it does not do so, the yes-and-no that Peter Abelard called sic et non, its significance would indeed be a mere construction. 
The so-called ‘interpretation’ merely dramatizes, and the so-called ‘construction’ deconstructs, an effect that is in fact causal to their very possibilities, like the trick that ‘emotion’ plays upon our feelings and sensibilities. (‘How do you feel about yourself? Tell me about it.’) Symbolism is both affirmation and denial at the same time, and the tension between its alternative propositions is the timespace, what Mikhail Bakhtin called the ‘chronotope,’ of our lives. It is the conscience, the judgment, the question and the answer, the argument and the satisfaction. In effect, symbolism makes the person necessary in a way that biology or biological evolution could never do by itself. It sustains an adversarial counterpoint without which the fictions of individual and collective, conviction and apathy or their moral and ethical surrogates, would be the empty idealities or meaningless categorizations that marry contemporary political ideologies to social science thinking. How, then, might one propose? Smiling, in one’s teeth?

See also: Genealogy in Anthropology; Liminality; Mental Imagery, Psychology of; Metaphor and its Role in Social Thought: History of the Concept; Myth in Religion; Ritual; Ritual and Symbolism, Archaeology of; Symbolic Boundaries: Overview; Symbolism (Religious) and Icon
Bibliography
Bateson G 1972 Steps To an Ecology of Mind. Chandler, San Francisco
Douglas M 1966 Purity and Danger. Routledge and Kegan Paul, London
Geertz C 1973 The Interpretation of Cultures. Basic Books, New York
Leach E 1976 Culture and Communication. Cambridge University Press, Cambridge, UK
Levi-Strauss C 1966 The Savage Mind. Weidenfeld and Nicolson, London
Schneider D M 1980 American Kinship. University of Chicago Press, Chicago
Turnbull C M 1972 The Mountain People. Simon and Schuster, New York
Turner V W 1969 The Ritual Process. Aldine, Chicago
Wagner R 1986 Symbols that Stand for Themselves. University of Chicago Press, Chicago
R. Wagner
Symbolism (Religious) and Icon

1. Religious Symbols: Their Nature and Functions

1.1 Symbols, Transcendence, and Salvation

Religious symbols arise out of the tension which exists between consciousness and the world. Indeed, the raison d’être of religious symbolism is to transcend that tension and move beyond, into ‘the other,’ into the world of transcendence. Religious symbols in other words are bridges between consciousness, the world, and the sacred. They are not just boundaries between the sacred and the profane, they are also means and processes of their inversion and transformation. Through symbols the material becomes spiritual and the spiritual becomes empirical and is communicated in visible form. As Eliade (1959, p. 12) put it: ‘By manifesting the sacred any object becomes something else, yet it continues to remain itself, for it continues to participate in the surrounding cosmic milieu.’ To transcend the world is to give it meaning at another level and attempt to transform it through symbols. A symbol, as Geertz (1975, p. 91) defines it, may be ‘any object, act, event, quality, or relation which serves as a vehicle for a conception.’ For Geertz, as for Sperber (1975), to symbolize is to conceptualize. Religious symbols, thus, serve as vehicles of conceptualization of this world in terms of another world. Along with
conceptualizing the sacred they also serve as means to separate it and conceal it from the profane. In mediating transcendence, symbols function at the same time as vehicles of transformation and stabilization of the world, but the conservative or revolutionary political function of symbols is a matter of degree and depends on political and socioeconomic conditions. There is an intentional and existential element in religious symbolism which links ultimate reality, social reality, and the quest for salvation which is innate in the human condition and, in one sense or another, underpins all religions. The limitations posed by the human biological, psychological, and social life and above all death, become embodied in religious images which transcend these limitations and give meaning to an otherwise chaotic reality. From this premise Marx, following Feuerbach, and without any theoretical elaboration of the symbolic, produced a reductionist, compensatory theory of religion and understood all religious symbols to be projections and epiphenomena of unfulfilled material needs and limitations imposed by socioeconomic conditions. Freud also produced a reductionist theory of religious symbols in the framework of his psychoanalytic theory. Religious symbolism underwent sophisticated analysis, especially in anthropology during the twentieth century, but its religious meaning per se and the problem of salvation were rarely taken on board. Reductionist interpretations of religious symbols have declined in the last decades of the twentieth century but the postmodernists have been criticized for their extreme relativism (Gellner 1992). Weber (1948, p. 
280) did not get involved in the analysis of symbols and rituals but he argued that ‘the world images that have been created by ideas have, like switchmen, determined the track along which action has been pushed by the dynamic of human interest.’ He made these images central to his sociology of religion and explored them as an integral part of human action in the pursuit of salvation. The symbolic representation of salvation then, as pursued within the various religious images of the world in various social and historical contexts, forms the general framework of the functions and meanings of religious symbols. Durkheim (1915, p. 47) described ‘sacred things’ as ‘set apart and forbidden.’ Yet in the context of salvation the world of the sacred and that of the secular must be, and are, brought into interaction. Salvation means that chaos, limitation, and disunity are surmountable and God meets humankind. Within the Abrahamic faiths especially, salvation means to bring everything under the unity of God. The forces of evil and death must, and can, be overcome. A bridge between God and humans must be built with the initiative of God and the synergy of humankind. Because of this, symbols, as tools of transcendence and salvation, are essentially ambiguous. Their ultimate function is to mediate, reveal, express, and communicate that which in essence cannot be fully grasped, spoken of, or communicated. God is both: known and unknown, present and absent, hidden and revealed, God and humans. As Martin (1980, p. 7) points out, ‘To transcend is to go beyond. So we must name it and refuse to name it.’ What Rudolf Otto (1953) has called mysterium tremendum et fascinans cannot be contained in ordinary human knowledge and logic. So the believer’s attitude to the sacred is basically one of faith, awe, fascination, and submission. For the same reason the sacred is not subject to the ordinary rules of formal logic. The nonlogical, the miraculous, and the extraordinary are basic properties of the sacred. Leach (1976, pp. 69–71) calls the logic involved in religious discourse ‘mytho-logic’ and argues that ‘One characteristic feature of such nonlogic (mytho-logic) is that metaphor is treated as metonymy.’ Sacred agents can be here, there, and everywhere simultaneously and according to the believer this is not just metaphorical language. It expresses (describes) true properties of divine agents which human agents cannot possess because they are limited, because they are human. Unlike humans, gods can do anything they please, whenever they please. If their doings are to make sense at all they must be in terms of some general scheme of salvation. They are not subject to biological and physical constraints (Christ walking on the water or moving through closed doors). The anthropomorphic, physical features of divine agents are usually represented in exaggerated and distorted form. The Hindu pantheon is characteristic for its multisemic, multiform, and weird symbolic representation. God Siva’s commonest representation, for instance, is the ‘linga’ (the phallus carved on stones) and the sacred bull. Victor Turner (1968, p. 580) describes trickster gods as agents of liminality and antinomian character with ambiguous sexual characteristics. 
Thus the Greek god Hermes was represented as hermaphrodite and was symbolized by ithyphallic statues and the Yoruba god Eshu is represented with a long headdress curved as a phallus. Christian symbolism is full of ambiguity and paradox which makes sense primarily in terms of salvation and the Kingdom. God the Son ‘begotten’ of God the Father, becoming man in Christ, being born of a virgin, dying on the cross, rising from the dead and ascending into heaven, and coming again in glory to judge the quick and the dead: all this is part of the Nicene Creed which is the symbol of the Church. The body and blood of Christ, ‘eaten’ and ‘drunk’ by Christians, are symbols of reconciliation (redemption) between humans and God. Adam is restored through Christ, the new Adam, who is God-Man (indivisible and without confusion according to the christological doctrine). A dominant, condensed symbol (Turner 1967, pp. 28–30, Douglas 1970, p. 10, Firth 1973, pp. 87–93), which reenacts and represents all that and more, is the Christian liturgy or Eucharist. The point
to be stressed here is that religious symbols as units do not make sense in isolation and are systematically related in terms of corresponding schemes of salvation which at the same time correspond to world images and social experience. That may, of course, be above all a theological interpretation of religious symbolism which may be rejected by many sociologists, anthropologists, and psychologists. David Martin (1980), however, has demonstrated that a deep and very sophisticated sociological analysis and interpretation of Christian symbolism can take place by encompassing transcendence in the context of the general soteriological scheme of Christianity. The point to be stressed, in any case, is that without taking seriously the implications for salvation no sociological, anthropological, or psychological interpretation of religious symbolism will be adequate. Indeed, reductionist interpretations may distort the essential anthropological meaning of religious symbols.
1.2 Symbols, Boundaries, Inversion, and Power

Religious symbolism acquired specific cultural significance and potency within what have been called axial civilizations (Eisenstadt 1986). In the monotheistic religions of Judaism, Christianity, and Islam transcendence and unity, and the eschatology implicit in them, became central cultural forces with a cosmic historical dialectic. Transcendence and the pursuit of salvation necessitate boundaries. To go beyond is to move from one boundary to the next, from one set of conditions to another or from one status to another. To be redeemed means to be reborn. The Kingdom of God makes good sense in the context of imperfect, unjust, and corrupt human kingdoms. The heavenly city, where ‘there shall be tears no more, no sorrow, no death’ (Rev. 21:4), is the opposite of earthly cities where sin, death, suffering, injustice, misery, and tears abide. So the polar opposite of the city of God is the city of Babylon. Within the Kingdom there is unity and divine bliss. Divisions of any kind, injustice, and biological and social boundaries do not apply. Blood bonds and other social bonds as well as divisions of all kinds are characteristic of this fallen world. There are no churches or temples in heaven but the sacred and the profane realms, though separate, are interdependent because the Kingdom, where God shall be all in all, has not come yet. The two kingdoms are separated by ambiguous symbolic boundaries of which paradox and inversion are basic characteristics. Sacred symbols function as sacred markers which challenge or reinforce the powers of this world with a power of their own. Entering a church, a synagogue, a mosque, or any sacred place one crosses a boundary: the boundary between the world of everyday life and that of another world of sacred, cosmic power and significance. Within the sacred places there are other boundaries designating a hierarchy of meaning reaching up to the highest, the most sacred and the ineffable. In those specifically sacred parts, the holy of holies, only persons of special sacred status may enter in order to perform specific ritual acts. These acts guarantee that the holiness, the perfection, and the autonomy of the sacred are protected and maintained, while symbolically mediated to the secular world. In the Christian Eucharist the priest carries out the sacrifice of Christ on behalf of Christ, the congregation, and humankind. So hierarchy and status become intertwined from the highest to the lowest levels and transition from one status to another requires rites de passage (Van Gennep 1960). Such rites symbolize losing the old status and entering a new one and involve what Turner (1969, p. 80) has called a state of liminality, a state ‘betwixt and between,’ rich in ambiguous symbolism linked ‘to death, to being in the womb, to invisibility, to darkness, to bisexuality, to the wilderness … .’ In the early church, to be baptized meant to enter the community of the saints, to pass from death to life. Immersion in, and rising out of, the water stood for purification, death, and resurrection. This is symbolized by stripping off the old clothes and putting on new white robes. The color white has transcultural symbolic significance linked to life and to light, as black signifies darkness and death. Martin (1980), Leach (1976), Douglas (1970), Turner (1969), and others have shown that inversion and status reversal are central mechanisms in religious symbolism. Christ, the Lord and Master of all, washes the feet of the disciples. Humility and hierarchy go together, as the irreconcilable tension between hierarchy and its total abolition is endemic in Christianity. Biological and social reproduction and family and social bonds are counteracted by the concept of virginity and the forming of celibate brotherhoods and sisterhoods. 
Being born again means entering spiritually and physically the community of the elect. ‘Those who will lose their life will gain it.’ God defeats death through death. Such inversions and logical contradictions form the substance of religious symbolism. Symbolic boundaries link the spiritual power that ‘moves mountains’ and the powers of this world which permeate the human condition. Again, in most religions, this is through paradox and inversion. For Christianity, ‘the infinite power of the Almighty’ is demonstrated by Christ’s suffering on the cross and by icons like that of the Pantocrator in Daphni outside Athens. A thin and weak man like Gandhi could challenge the British Empire by advocating nonviolence. Mary Douglas (1970, pp. 37–53) mentions the ‘bog Irish’ in London and the Apocryphal character of old Eleazar in ancient Israel (2 Macc. 6:18) as exercising symbolic religious power against secular power, one by abstaining from meat on Friday and the other by refusing to eat pork on pain of death. The ultimate clash between the powers of this world and
the power of heaven and glorious and total victory of the latter over the former is depicted most colourfully in the book of Revelation. Yet for Christianity, as for all religions, the tension between the two kingdoms is always present because the Kingdom of God is yet to come. There is a specific kind of formidable power which is generated through martyrdom which the powers and the rulers of this world seem to fear most. Millenarists down the centuries have celebrated the imminence of the Apocalypse by burning their homes and possessions, by weird behavior, by noise and ecstasy or by total silence, or, paradoxically, by adopting rigid absolute rules in their effort to reject all rules. But the Apocalypse remains in the future and nonviolent religious movements like early Christianity became in time powerful Churches embodying both sacred and secular power. However, as long as the Kingdom is not fully and finally here, religious symbols will continue to reveal and conceal the sacred, and their eschatological pull will continue to move the world.
2. Icon

Icons represent the sacred in condensed pictorial form. In iconography the holy is grasped not by reasoning but by feeling. As Langer (1980, p. 15) argues, ‘the thing we do with images is to envisage a story.’ Icons speak to the heart and according to St John of Damascus ‘they are to the eye what speech is to the ear’ (Uspenski 1976, p. 9). Religious images engage the emotions and express and communicate the sacred directly. Iconography, in most religions, is anthropomorphic and this is as true of ‘primitive,’ Indian, Egyptian, Hindu, Buddhist, and Greco-Roman religion as it is of Christian religion. The central feature of icons is the human body, although in certain religions it is presented in a distorted form. In Hinduism, for instance, God Visnu is invariably represented with four hands which symbolize his various attributes. In Judaism, the making of images of God or idols is strictly prohibited (Ex. 20:4, Dt. 4:15–18, Dt. 27:15) so iconography did not develop. In Islam, although the Qur’an does not prohibit divine images, pictorial representations were forbidden from the beginning so abstract, geometric, decorative designs and calligraphy developed instead (Schimmel 1987, p. 64). In a restricted sense, icons refer to the specific form of iconography which developed within Eastern Christianity during the Byzantine period. They constitute the almost exclusive type of religious art in the Orthodox Church to the present day, as frescoes and mosaics but mostly as paintings on wood (Sendler 1988, Kokosalakis 1995). Icons developed out of the early Christian iconography of the catacombs which, in large part, conveyed coded messages of basic Christian themes. Christ, for instance, was represented as a fish, which in Greek is ΙΧΘΥΣ, which is an acronym for:
Ιησούς, Χριστός, Θεού, Υιός, Σωτήρ (Jesus Christ, Son of God, Savior). By the fourth century icons could be found in most Christian communities and constituted a central part of Christian teaching parallel to biblical texts. The formalization of the Christian doctrine during the fourth and fifth centuries consolidated with it the basic themes of Christian iconography: Christ, the Virgin, the saints, and various biblical themes. The representation of Christ in particular went through a long evolution and for certain icons, especially of the Virgin, it was believed that they were not made by human hand (αχειροποίητοι). By the end of the sixth century icons occupied a central place in collective and private worship in Byzantium but their theological meaning became explicit during the iconoclastic period (726–843) when Byzantine society was shaken to its foundations. Devotion to icons had led to excesses and Emperor Leo III, supported by some bishops and certain heretical groups who rejected external worship, issued the first decree against icons in 726. There followed a long period of theological controversy and social unrest which, apart from religious, had also serious political and economic etiology. Icons were finally and decisively reinstated in 843 by Empress Theodora and since then have occupied a central place in Orthodox theology and worship. Icons in the Orthodox Church are religious symbols par excellence with central cultural significance. They are related to neoplatonic ideas about the expression of spiritual reality in matter with a substantive religious meaning around the concept of the person/image which refers directly to both the creation of humans in the image of God (κατ’ εικόνα) and their salvation through the incarnation of the word of God. Icons embody the universal and the particular simultaneously. Some icons are attributed thaumaturgical powers by the faithful. In Eliade’s (1959, pp.
24–29) terms such icons constitute ‘theophanies.’ Orthodox pictorial art adopted the inverted, two-dimensional perspective so that the image enters the viewer instead of the viewer entering the image. For the Orthodox Church, icons are windows of revelation and vehicles of direct communication with divine agents. See also: Durkheim, Emile (1858–1917); Eliade, Mircea (1907–86); Myth in Religion; Religion, Sociology of; Ritual; Symbolic Boundaries: Overview; Symbolism in Anthropology
Bibliography
Douglas M 1970 Natural Symbols: Explorations in Cosmology. Barrie and Rockliff, The Cresset Press, London
Durkheim E 1915 The Elementary Forms of the Religious Life. Allen & Unwin, London
Eliade M 1959 The Sacred and the Profane. Harcourt, Brace, New York
Eisenstadt S N (ed.) 1986 The Origins and Diversity of Axial Age Civilizations. SUNY Press, Albany, NY
Firth R 1973 Symbols: Public and Private. Allen & Unwin, London
Geertz C 1975 The Interpretation of Cultures. Hutchinson, London
Gellner E 1992 Postmodernism, Reason and Religion. Routledge, London
Kokosalakis N 1995 Icons and non-verbal religion in the Orthodox tradition. Social Compass 42(4): 433–49
Langer S 1980 Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Harvard University Press, Cambridge, MA
Leach E 1976 Culture and Communication: The Logic by which Symbols are Connected. Cambridge University Press, Cambridge, UK
Martin D 1980 The Breaking of the Image: A Sociology of Christian Theory and Practice. Blackwell, Oxford, UK
Otto R 1953 Das Heilige, Breslau [The Idea of the Holy. Harvey J (trans.)]. Oxford University Press, London
Schimmel A 1987 Iconography: Iconography Islamic. In: Eliade M (ed.) Encyclopedia of Religion. Macmillan, London, pp. 64–71
Sendler E 1988 The Icon, Image of the Invisible: Elements of Theology, Aesthetics and Technique. Oakwood Publications, Redondo Beach, CA
Sperber D 1975 Rethinking Symbolism. Cambridge University Press, Cambridge, UK
Turner V 1967 The Forest of Symbols: Aspects of Ndembu Ritual. Cornell University Press, Ithaca, NY
Turner V 1968 Myth and symbol. In: Shils D (ed.) The Encyclopedia of the Social Sciences. Macmillan, London, Vol. 9, pp. 576–82
Turner V W 1969 The Ritual Process. Routledge, London
Uspenski B 1976 The Semiotics of the Russian Icon. The Peter Reider Press, Lisse, The Netherlands
Van Gennep A 1960 The Rites of Passage [Vizedom M and Caffee G (trans.)]. Routledge, London
Weber M 1946 The social psychology of the world religions. In: Gerth H H, Wright Mills C (eds.) From Max Weber: Essays In Sociology. Routledge, London, pp. 267–301
N. Kokosalakis
Symptom Awareness and Interpretation

Physical symptoms are often ambiguous and open to subjective interpretation. Even though people certainly have some awareness of their internal states, that awareness and the accuracy of interpreting internal sensations may be fairly limited (Pennebaker 1982). Many authors state that people are poor at tracking physiological processes. Nevertheless, people hold the belief that they are able to do so. In general, individuals are motivated to maintain an understanding of their bodily condition. Both vague internal states and information about possible diseases lead to interpretive activities. Understanding the processes that guide symptom identification and interpretation at the initial stage of the occurrence and also during the course of disease has been found to be critical for understanding how people act upon symptoms and disease. Traditionally, questions related to symptom awareness and interpretation were addressed mainly by fields such as psychophysiology and the medical sciences. Pennebaker (1982), in his groundbreaking book, brought the information processing perspective to the area. This approach has proven useful in understanding phenomena of symptom awareness and interpretation. Specifically, Pennebaker proposed that symptoms or sensations are best understood as representing information about internal states. Taking the information processing perspective, this article will emphasize the role of the following aspects: (a) information organization and mental representation of symptoms, (b) knowledge activation and use, and (c) explanation and attribution of symptoms. Throughout, the behavioral implications of the reviewed research for disease management, help-seeking behavior, and compliance and adherence to medical regimens will be illustrated taking selected diseases as examples (hypertension and diabetes). Finally, open questions and probable directions of future theory and research will be addressed.
1. Information Organization: The Mental Representation of Symptoms

Given the quantity of information that is available to individuals at any given time, it is functional for the organism to organize and reduce the incoming data. This is achieved by cognitively (re)structuring, synthesizing, and organizing the incoming information. Cognitive psychology has introduced several theoretical constructs in this context, namely schemata, cognitive sets, prototypes, expectancies, and scripts. Studies on the structure of illness schemata consistently found five dimensions according to which symptom and illness experiences are organized (Lau and Hartman 1983): illness schemata include information about the identity or label of the disease, its consequences (i.e., the symptoms), its causes, its duration, appropriate treatments, and expectations about its curability. Cognitive structures such as illness schemata or prototypes help to organize information from internal sensations (symptoms) and disease-related information gathered from the external environment. Certainly, the most relevant model of symptom perception within the tradition of information processing has been proposed by Leventhal et al. (1984). Their approach is rooted in cognitive psychology and builds on the work on cognitive schemata and prototypes (see Health and Illness: Mental Representations in Different Cultures). In particular, Leventhal et al. (1984) propose that individuals tend to construct their own individual representation of symptoms or illness
and that this idiosyncratic representation will, in turn, influence their behavior (e.g., help-seeking, adherence, and compliance). Specifically, they proposed that illness representations (or schemata) are a function of an individual’s semantic knowledge about symptoms and disease and specific contextual factors such as the nature of somatic changes and the situations in which these occur. This semantic knowledge accumulates across the life span and is acquired through the media, through personal experience, and from family and friends who have had experience with the disease. People’s common sense models of illness strongly influence which symptoms a person will search for and will ultimately perceive. Work by Meyer et al. (1985) on hypertension illustrates this point. Hypertensives hold one or a combination of disease models about high blood pressure. As Meyer et al. (1985) demonstrated, some patients hold the belief that hypertension is a disease of the heart, others believe that it is an arterial disorder, and a third group might associate hypertension with emotional upset. Importantly, the kind of belief patients hold affects the way they monitor their body: Symptoms that are consistent with their specific illness belief are more likely to get noticed (see Hypertension: Psychosocial Aspects). The common sense model of illness approach has also been used successfully to explain such phenomena as placebo effectiveness and mass psychogenic illness (MPI). For example, Pennebaker (1982) stated that placebos are effective because they change the way in which individuals process sensory information. Individuals who are worried about their condition—after taking a (placebo) drug—change their focus from sensations that may signal illness to sensations that signify improved health. In addition to general concepts of illness, people also hold organized conceptions—termed disease prototypes—for particular diseases (Bishop 1991, Bishop and Converse 1986). 
For example, a person may have a disease prototype of heart disease. Similar to general illness conceptions, prototypes of specific diseases help people organize and evaluate information about bodily sensations that might otherwise not be interpretable. Thus, a person who holds the belief that he or she is vulnerable to heart disease is more likely to interpret chest pain in accordance with his or her prototype of heart disease than a person who does not hold this belief (Bishop and Converse 1986). This latter person might instead regard the chest pain as signaling a gall bladder problem. Altogether, organized representations of illness-related knowledge and experiences strongly determine symptom awareness and interpretation, and the expectations individuals generate regarding their consequences and causes. This has important implications for health behavior as the kind of beliefs patients hold about symptoms and disease may lead them to alter or fail to adhere to their treatment regimens (e.g., see Patient Adherence to Health Care Regimens).
2. Knowledge Activation and Use

2.1 The Role of Knowledge Activation in Symptom Awareness

What leads individuals to become aware of a symptom, and how do they interpret it? As noted above, individuals use hypotheses and subjective theories (their prototypes) about diseases to interpret initially ambiguous symptoms. Once a plausible hypothesis is identified, information to verify the hypothesis is sought. This is performed by selectively scanning stored knowledge for matching evidence. Given the rather vague nature of many symptoms, confirming evidence often can easily be found. The accessibility of knowledge stored in memory has been demonstrated to increase when it has been recently primed. Mechanic (1972), for example, documented a common phenomenon in medical schools, the medical students’ disease. He reported that about 70 percent of first-year medical students perceive symptoms of those illnesses studied in their courses. Obviously, studying the symptoms leads students to focus on their own internal states and to compare what they hear with their concrete personal experience at the moment. Vague, ambiguous physical sensations are then interpreted in accordance with the disease that was made salient in their course. Importantly, Leventhal et al. (1980) showed that this process can work both ways. They proposed the symmetry rule, which holds that individuals will (a) seek symptoms when given a diagnostic label (such as in the medical students’ disease) and (b) seek a diagnostic label when experiencing symptoms, that is, the perception of a change in internal sensation creates a pressure for finding a label. Similarly, being provided with a label of a specific disease generates pressure for a perceptual referent. 
Schachter and Singer’s (1962) classic study can be regarded as evidence for the symptom-to-label part of the symmetry rule since it shows that people will come up with a specific interpretation or label for their somatic sensations that can explain their condition (see below, attribution of symptoms). Bishop’s (1991) and Bishop and Converse’s (1986) studies on differences in symptom perception as a function of differential beliefs about hypertension mentioned earlier can also be interpreted as reflecting the symptom-to-label part of the symmetry rule. The second part of the symmetry rule (that people will seek symptoms when given a label) is consistent with findings by Meyer et al. (1985) on hypertension. Hypertension is largely a symptomless disease. However, many patients erroneously believe that they can tell when their blood pressure is high (e.g., they believe their blood pressure to be high when their face gets warm, their heart beat increases, or they are feeling tense) (Baumann and Leventhal 1985). In fact, though, the correlation between beliefs about level of blood
pressure and actual blood pressure is low (Meyer et al. 1985). In accord with the label-to-symptom part of the symmetry rule, the authors found that patients in hypertension treatment became more likely to perceive their blood pressure the longer they were in treatment (even though high blood pressure, in fact, is unsymptomatic). Importantly, illness representations strongly influence the way patients will act upon (subjectively perceived) symptoms. Taking hypertension as an example, patients have been found to treat their condition based on their belief that they are able to detect when their blood pressure is high and when it is not. When they ‘feel fine,’ they may discontinue taking medication, and when they believe their blood pressure is high they might take it. A similar phenomenon has been documented for the self-management of diabetes. Both hyperglycemia and hypoglycemia produce unpleasant bodily sensations that can be vividly described by patients (Hampson 1997). Diabetes patients often fail to self-monitor their blood glucose level and instead, like hypertension patients, rely on what their blood glucose level ‘feels like.’ However, the relation between symptom experience and acute blood glucose levels is also often imprecise, and patients are misled by their illness belief. Hence, diabetes patients need to be advised that their illness management should not rely on subjective perceptions of their blood glucose level. Instead, in deciding when to take measures to increase or decrease their blood glucose level, they should rely exclusively on testing their blood glucose (Hampson 1997, see Diabetes: Psychosocial Aspects).
2.2 The Role of Knowledge Activation in Symptom Interpretation
What determines which knowledge will be activated and used in order to interpret perceived symptoms? The evaluation of symptoms is very subjective and has been found to be influenced by the nature of their attributes. Symptom attributes can include such aspects as how salient (noticeable) and how unexpected they are, or which consequences they may have for one’s life. Not all features of a physical sensation receive equal attention. Most likely, symptoms that are visible or cause pain draw most attention. Similarly, symptoms that affect highly valued parts of the body (e.g., the eyes, face, or heart) usually are interpreted as being more serious and as more likely to require attention than symptoms that affect less valued organs. In sum, symptoms that hurt, symptoms that change quickly, and symptoms that are incapacitating are more likely than their opposites to be interpreted as signs of a serious illness and therefore more likely to induce a person to seek medical treatment (Safer et al. 1979). This selective attention to symptoms can have its costs, however, when symptoms that are not salient but nevertheless serious remain unnoticed. For example, people with heart attacks often wait for several hours until they seek help (Matthews 1982). As a consequence, 70 percent of all deaths due to heart attacks occur in the first few hours. Matthews noticed that a person will wait longer before seeking help the less salient the first symptoms are, the more depressed and exhausted the person is, and the more effortful the activity is that they are engaged in at the time of the symptoms’ onset. The latter effect can also be explained by Pennebaker’s (1982) Competition of Cues Model. In his model, Pennebaker states that external and internal sensory information compete for people’s attention. The likelihood with which internal sensory perceptions are perceived then is a function of the proportion of external to internal information. When people are focused externally, for example, in a highly distracting environment or during a vigorous physical activity, they are slower at noticing symptoms. In contrast, when the external environment lacks stimuli, attention is more likely directed toward the body. This increases the likelihood that cues that suggest illness are noticed (Pennebaker 1982). In one study, for example, Pennebaker demonstrated that people are more likely to cough during boring parts of a movie than during more interesting parts. Apparently people focus their attention more on their internal states when there are few external stimuli, and are therefore more likely to notice itching or tickling in their throats, to which they then respond with coughing. Altogether, the awareness and interpretation of internal sensations is a function of the relation between external and internal stimuli. It is also a function of the beliefs that individuals hold. These beliefs cause them to attend selectively to specific internal or contextual information, i.e., not all knowledge is used in evaluating internal states.
3. Explanation and Attribution of Symptoms
It has been suggested that people are ‘common-sense’ medical scientists who search for the meaning of somatic events and seek to attribute them to a specific cause. Research has been presented which demonstrates that sometimes it is not the mere presence of symptoms that shapes behavior, but rather the attributions made for these symptoms. For example, almost two-thirds of the people who seek medical attention for low-level symptoms have been found to suffer from anxiety or depression instead. This may occur because symptoms of these emotional disorders are mistakenly thought to be indicative of a physical disease. Attributions of symptoms have also been found to affect adherence to medical regimens. For example, in their study on HIV-infected women, Siegel and Gorey (1997) found decreased adherence to medication and even discontinuation of medication intake when symptoms were attributed to side effects of the medicine taken. According to the authors, adherence was reduced because these symptoms were interpreted as evidence that the medication was making the women sicker or had more risks than benefits. Similarly, failure of medications to relieve symptoms also negatively influenced adherence because the medications were then interpreted as having no effect.
3.1 What Guides Symptom Attribution?
Valins (1966) was the first to suggest that the mere belief that a change in physiological ‘arousal’ had occurred would lead individuals to seek information to label this change. In a recent demonstration, McCaul et al. (1992) found that students who were informed falsely that they were at risk for gum disease reported higher rates of gum bleeding (a symptom of gum disease) two days later than students in the control group who were informed that they had no gum disease. Mechanic’s notions on the medical students’ disease and Schachter and Singer’s (1962) study on the attribution of physical arousal mentioned earlier also fit into this conception: When persons have access to a certain sickness label (e.g., because they just read about it or met a sick person) they will be particularly likely to attribute their physical sensations to this disease. The explanation and attribution of symptoms is, however, also influenced by motivational factors. Specifically, it has been shown to be governed by optimistic tendencies that support a positive view of oneself and one’s physiological condition (Weinstein 1980, see Health Risk Appraisal and Optimistic Bias). This means that an individual will be less likely to interpret a novel symptom (e.g., bleeding of a mole) as a sign of a serious, life-threatening disease (e.g., skin cancer) and will be more inclined to neglect it or to attribute it to a nonserious cause (e.g., a scratch). In general, the optimistic bias results in selectively seeking information that confirms the less threatening hypothesis. Obviously, this benign hypothesizing results in a delay in seeking help (e.g., see Illness Behavior and Care Seeking).
3.2 The Influence of Mood, Emotions, and Stress
Mood, emotions, and life stressors may be further important determinants of symptom attribution and, in consequence, of care seeking and adherence. People who are in a positive mood report fewer symptoms (Salovey and Birnbaum 1989). People in a bad mood, however, report more symptoms, are more pessimistic that any actions they might take would relieve their symptoms, and perceive themselves as more vulnerable to future illness. In the presence of life stress, ambiguous physical sensations (e.g., fatigue, headaches, or stomachaches) have been found to be interpreted as signs of stress instead of illness (Cameron et al. 1995). Stress attributions are more likely for sets of symptoms that are unfamiliar, novel, and ambiguous (Pennebaker 1982).
3.3 Individual Differences in Symptom Attribution
The attribution of perceived symptoms is further determined by both individual and cultural differences. Age and gender as well as ethnic and racial differences in symptom attribution have been documented (e.g., see Aging and Health in Old Age; Culture as a Determinant of Mental Health; Health and Illness: Mental Representations in Different Cultures; Gender and Physical Health). For example, identifying initially ambiguous symptoms as signs of a beginning disease is particularly difficult in the elderly. First, older individuals experience more physical complaints than younger persons. They therefore have to evaluate the significance of new symptoms against an increasingly complex background of age-related somatic changes (Leventhal and Crouch 1997). Second, distinguishing symptoms that can be attributed to normal aging from symptoms that signal serious disease can be a complex task in old age. Importantly, the ‘wait and see’ strategy that is widely used at all ages in an attempt to rule out benign causes of ambiguous bodily signals may be more dangerous in old age because of a less vigorous immune protection. Discounting and ignoring treatable symptoms and waiting to seek help can delay diagnosis and treatment of potentially life-threatening disorders and can create unnecessary risk for morbidity and mortality. In sum, older individuals may be at greater risk because they are more likely to make aging attributions and because delaying treatment for a serious disease has more serious consequences for them (Leventhal and Crouch 1997).
4. Remaining Questions
Much progress has been made in the theoretical understanding of the mental representation of symptom sets and diseases and the behavioral implications of these beliefs. Most research in this area has so far focused on just a few diseases: asthma, diabetes, and hypertension. As a consequence, the mental representation of these diseases and their consequences for self-management is, by now, quite well understood. Much more research is needed, however, with regard to the understanding of diseases that until now have received less research attention (e.g., allergies). Clearly, social and cultural factors shape both the awareness and interpretation of physical symptoms and the behavioral responses chosen for controlling them. A key area for future work is to understand the self-regulatory processes of symptom perception and interpretation within the social and cultural context in which they occur. With a marked increase in the
proportion of older people in many countries, research in this area has to move more strongly into the direction of taking a life-span perspective. A better understanding of age-related changes in the awareness and interpretation of symptoms will be vital in the process of optimizing the self-management of chronic diseases. At the same time, with many chronic diseases on the rise among children (e.g., asthma and allergies), more knowledge is needed to understand how they perceive and interpret symptoms and how contextual factors of their life situation can be worked into a better adherence to problematic medical regimens. See also: Culture as a Determinant of Mental Health; Diabetes: Psychosocial Aspects; Gender and Physical Health; Health and Illness: Mental Representations in Different Cultures; Health Risk Appraisal and Optimistic Bias; Hypertension: Psychosocial Aspects; Illness Behavior and Care Seeking; Patient Adherence to Health Care Regimens; Schemas, Frames, and Scripts in Cognitive Psychology; Stress Management Programs
Bibliography
Baumann L J, Leventhal H 1985 I can tell when my blood pressure is up, can’t I? Health Psychology 4: 203–18
Bishop G D 1991 Understanding the understanding of illness: Lay disease representations. In: Skelton J A, Croyle R T (eds.) Mental Representations in Health and Illness. Springer-Verlag, New York, pp. 32–59
Bishop G D, Converse S A 1986 Illness representations: A prototype approach. Health Psychology 5: 95–114
Cameron L, Leventhal E A, Leventhal H 1995 Seeking medical care in response to symptoms and life stress. Psychosomatic Medicine 57: 1–11
Hampson S E 1997 Illness representations and the self-management of diabetes. In: Petrie K J, Weinman J A (eds.) Perceptions of Health and Illness. Harwood Academic, Amsterdam, pp. 323–47
Lau R R, Hartman K A 1983 Commonsense representations of common illnesses. Health Psychology 2: 167–85
Leventhal E, Crouch M 1997 Are there differences in perceptions of illness across the lifespan? In: Petrie K J, Weinman J A (eds.) Perceptions of Health and Illness. Harwood Academic, Amsterdam, pp. 77–102
Leventhal H, Meyer D, Nerenz D 1980 The common sense representation of illness danger. In: Rachman S (ed.) Contributions to Medical Psychology, Vol. 2. Pergamon, New York
Leventhal H, Nerenz D R, Steele D J 1984 Illness representations and coping with health threats. In: Baum A, Singer J (eds.) A Handbook of Psychology and Health, Vol. 4. Erlbaum, Hillsdale, NJ, pp. 219–52
Matthews K A 1982 Psychological perspectives on the Type A behavior pattern. Psychological Bulletin 91: 293–323
McCaul K D, Thiesse-Duffy E, Wilson P 1992 Coping with medical diagnosis: The effects of at-risk versus disease labels over time. Journal of Applied Social Psychology 22: 1340–55
Mechanic D 1972 Social psychological factors affecting the presentation of bodily complaints. New England Journal of Medicine 286: 1132–39
Meyer D, Leventhal H, Gutmann M 1985 Common-sense models of illness: The example of hypertension. Health Psychology 4: 115–35
Pennebaker J W 1982 The Psychology of Physical Symptoms. Springer-Verlag, New York
Safer M A, Tharps Q J, Jackson T C, Leventhal H 1979 Determinants of three stages of delay in seeking care at a medical care clinic. Medical Care 17: 11–29
Salovey P, Birnbaum D 1989 Influence of mood on health-relevant cognitions. Journal of Personality and Social Psychology 57: 539–51
Schachter S, Singer J E 1962 Cognitive, social and physiological determinants of emotional state. Psychological Review 69: 379–99
Siegel K, Gorey E 1997 HIV infected women: Barriers to AZT use. Social Science and Medicine 45: 15–22
Valins S 1966 Cognitive effects of false heart rate feedback. Journal of Personality and Social Psychology 4: 400–8
Weinstein N D 1980 Unrealistic optimism about future life events. Journal of Personality and Social Psychology 39: 806–20
B. Knäuper
Synapse Formation
Formation of specialized contacts, the so-called synapses, between presynaptic axonal terminals of neurons and their postsynaptic targets, such as neurons and muscle cells, is a decisive step in building the nervous system. It enables rapid information transfer between contacting cells via spatially and temporally restricted activation of postsynaptic receptors by neurotransmitters secreted from the nerve terminals. Considerable progress in the understanding of this complex phenomenon came from studies of synaptogenesis of the vertebrate neuromuscular junction (Fig. 1A). During the last decade investigations of glutamatergic neuron–neuron contacts in mammalian brains (Fig. 1B) and the neuromuscular junction in Drosophila highlighted the diversity of molecular interactions and the universality of certain principles underlying formation of different types of synapses. These two issues are addressed in the present article.
1. Approaching the Target
Growth cones are guided to their targets by attractive and repulsive cues that can be diffusible or integrated into the cell surface or the extracellular matrix (Mueller 1999). Semaphorins, ephrins and their receptors constitute large groups of guidance molecules that evoke repulsive and adhesive responses, often bidirectionally in the interacting cells. Cell adhesion molecules of the immunoglobulin superfamily promote neurite outgrowth via homophilic and heterophilic interactions with a variety of extracellular matrix and membrane-bound molecules. Recently, the L1 immunoglobulin superfamily molecule has also been found to be involved in ephrin receptor-mediated repellent signaling. Netrins compose a family of proteins that can attract or repel axons, acting via distinct netrin receptors. Binding of recognition molecules to receptors activates signal transduction mechanisms, mostly involving receptor and non-receptor tyrosine kinases. These signals have been shown to converge in the regulation of activities of several small guanosine triphosphate (GTP) binding proteins of the Rho subfamily. Two of these proteins, Cdc42 and Rac1, are under the control of attractive guidance cues, whereas Rho is under the control of repulsive guidance cues (Hall 1998). Recent studies show that one and the same molecule can evoke attractive or repulsive responses depending on the internal state of the cell. For instance, three different extracellular cue molecules (the recognition molecule netrin-1, brain-derived neurotrophic factor (BDNF), and the neurotransmitter acetylcholine) induce attractive or repellent turning of the growth cones depending on the intracellular level of cyclic adenosine monophosphate (cAMP) (Ming et al. 1997).
2. Selecting the Target
Selection of the appropriate postsynaptic target and search strategies of axonal growth cones are interrelated processes. Removal of the repulsive cues or their axonal receptors often results in a decrease in axon bundling, so-called defasciculation, indicating that axon–axon interactions are no longer favored over axon–substrate interactions. This enables the fibers to grow into regions that normally are not invaded and to form ectopic synapses there. At the Drosophila neuromuscular junction, the immunoglobulin superfamily molecule fasciclin II promotes and semaphorin II inhibits promiscuous synaptogenesis. These molecules are expressed by all muscles in wild-type flies. A more restricted expression is seen for netrin B, which is synthesized by a subset of muscles where it attracts some axons and repels others. Underexpression of repellent molecules or overexpression of adhesion-promoting fasciclin II upregulates formation of promiscuous and ectopic synapses. These experiments argue against the view that pre- and postsynaptic partners have unique complementary labels, and show that the relative balance of adhesive and repellent molecules controls the number of synapses formed (Winberg et al. 1998). This is also demonstrated by genetic deletion of semaphorin IIIa and its receptor, neuropilin, which resulted in defasciculation and abnormal spreading of cranial nerves in mice. Also, more synapses were formed on wild-type hippocampal neurons expressing the neural cell adhesion molecule (NCAM), the closest mammalian homolog of fasciclin II, when compared to neighboring neurons from NCAM-deficient mice maintained in co-cultures. However, NCAM-deficient neurons maintained in a no-choice situation (separately from NCAM-expressing cells) receive normal numbers of contacts. This implies that NCAM mediates one of several signals which are integrated by the developing axon to confer preference for contact formation at sites where the integrated signal is optimal (Dityatev et al. 2000). Cooperative actions of some N-acetyl-galactosamine-terminated glycoconjugates, N-cadherin and BDNF are known to regulate formation of synapses of retinal axons in the chick optic tectum. Lamina-specific connections in the hippocampus illustrate that synaptogenic cues can be expressed by ‘guidepost’ cells that are not definitive synaptic targets but transient targets for growing axons. Such ‘guidepost’ cells are, for instance, Cajal–Retzius cells, which occupy the stratum lacunosum-moleculare, and calbindin-positive interneurons, which occupy the stratum radiatum of the hippocampus. During early development, entorhinal axons form synapses onto Cajal–Retzius cells before the dendrites of pyramidal cells reach the molecular layer. Calbindin-positive interneurons are preferred targets for commissural axons. Later, most of the guidepost cells die and synapses are transferred to the pyramidal dendrites (Sanes and Yamagata 1999).
Figure 1 Formation and differentiation of the vertebrate neuromuscular junction (A) and excitatory neuron–neuron connection (B). The numbers 1–4 refer to four consecutive stages of maturation, during which accumulation of synaptic vesicles in presynaptic active zones and clustering of postsynaptic receptors occur. For the neuromuscular junction (A), after the growth cone (stage 1) establishes contact with a muscle fiber, it is a simple bulbous enlargement (stage 2). During further maturation, extracellular material is accumulated in the synaptic cleft forming the basal lamina, multiple junctional folds are formed, and nerve terminals are coated by glial processes (stages 3–4). For the neuron–neuron synapse (B), induction of a synapse by a dendritic filopodium (stage 1) is followed by retraction of the filopodium and formation of the shaft synapse (stage 2), which can be further transformed into a stubby (stage 3) and then a spine synapse (stage 4). For mature synapses (stage 4) the localization of some molecules is shown. The synaptic vesicle and active zone proteins indicated in (A) are common to both types of synapse. A more detailed description of protein functions and the molecular interactions involved is given in the text.
3. Exchanging Signals
When an axon approaches its postsynaptic target, it releases anterograde signals and receives retrograde influences from the postsynaptic side. As discussed below, anterograde signals can induce clustering of postsynaptic receptors at synaptic sites and downregulation of receptor expression at nonsynaptic sites. Also, when neonatal muscles, destined to become fast- or slow-twitch, are denervated and cross-reinnervated with each other’s nerves, they acquire contractile characteristics appropriate for their new innervation. Such ‘conversion’ of postsynaptic cells is mediated by acetylcholine release and electrical activity inducing changes in expression of genes encoding fiber type-specific isoforms of contractile proteins. Retrograde signaling plays an important role in elimination of incorrect or excessive synaptic inputs. Most vertebrate muscle fibers are initially innervated by more than one motor axon, retaining only a single input during early postnatal life. Similarly, projections from both eyes initially overlap in the lateral geniculate nucleus, with eye-specific laminae forming from the initially diffuse projections late in embryogenesis. These processes appear to depend on postsynaptic electrical activity. First, competition and selection of afferents could arise due to limited amounts of maintenance factors released from postsynaptic cells and taken up by axons in an activity-dependent manner. Nitric oxide and neurotrophic factors, like BDNF, nerve growth factor (NGF) and fibroblast growth factor 2 (FGF2), are among the favorite candidate molecules mediating retrograde strengthening of synapses. Second, electrical activity can evoke release of proteases such as thrombin or plasminogen activator, destabilizing cell interactions at the nerve terminal unless they are protected by endogenous inhibitors. Interestingly, the heparan sulfate proteoglycan agrin, implicated in clustering of nicotinic acetylcholine receptors (AChRs) at the neuromuscular junction, may function also as a protease inhibitor. Third, postsynaptic activity can redistribute adhesion molecules, leading to retraction of axons and loss of synapses via downregulation of adhesive interactions in inappropriate sites, whereas synaptic strengthening should involve accumulation of adhesion molecules, such as NCAM, or changes in their molecular configuration, as has been shown for cadherins, which dimerize upon depolarization. Such dimers are protected from proteolytic degradation and can form transsynaptic associations (Tanaka et al. 2000).
4. Postsynaptic Differentiation
Maturation of synaptic contacts leads to the accumulation of transmitter receptors at postsynaptic sites. Whereas AChRs are uniformly distributed in immature muscles at a density of 1000 µm⁻², their density in mature muscles exceeds 10,000 µm⁻² at the synapse and drops to 10 µm⁻² extrasynaptically. Many molecules have been reported to promote clustering of AChRs in vitro. Of these, only one has so far been implicated in synaptogenesis in vivo. This is the nerve-derived alternatively spliced isoform of agrin that interacts with the transmembrane protein tyrosine kinase MuSK. A crucial effector of receptor clustering downstream of MuSK is the cytoplasmic protein rapsyn, which bears distinct domains responsible for association with the membrane, multimerization, and interaction with receptors. Functionally similar synapse-associated proteins were discovered in neuronal synapses (Garner et al. 2000). They bear protein motifs called PDZ domains that can bind ligand- or voltage-gated ion channels and adhesion molecules. In Drosophila, discs-large protein interacts with and determines synaptic localization of fasciclin II and Shaker-type voltage-gated K+ channels. In vertebrates, the PDZ domain-containing postsynaptic density protein (PSD-95), glutamate receptor interacting protein (GRIP), and Homer interact with N-methyl-D-aspartate (NMDA), α-amino-3-hydroxy-5-methyl-isoxazole-4-propionic acid (AMPA) and metabotropic glutamate receptors, respectively. PSD-95 can also interact with voltage-gated K+ channels, the synaptic cell adhesion molecule neuroligin, and kainate receptors, whereas GRIP interacts with ephrin B ligands. These complexes are linked to the postsynaptic cytoskeletal matrix and signaling cascades. For example, PSD-95 and GRIP are likely to couple glutamate receptors to the Ras GTPase-activating protein SynGAP and the GDP-GTP exchange factor GRASP1, respectively. Differential expression of transmitter receptors in postsynaptic sites at the vertebrate neuromuscular junction is achieved not only by clustering of diffusely distributed receptors, but also by transcriptional activation of AChR genes in subsynaptic nuclei of the muscle, and transcriptional repression of AChR expression in nonsynaptic nuclei. Repression of synthesis is regulated by postsynaptic activity, inducing entry of Ca²⁺ and activation of protein kinase C. Critical targets of this kinase are transcription factors (myoD, myf5, MRF4, and myogenin) that downregulate transcription of AChRs by binding to the so-called E-box-containing sequences in several AChR subunit genes. These repressive signals can be counteracted at synaptic sites by the nerve-derived neuregulin that is incorporated into synaptic basal lamina and activates synthesis of AChRs in myocytes. Neuregulin acts via the epidermal growth-factor-related receptor tyrosine kinase and downstream cascades of effector molecules, including Ras, Raf, and Erk kinases. These effectors regulate the activity of transcription factors of the ets family, which can bind to N-box elements in some AChR subunit genes. Neuregulins also trigger synthesis of some receptors in the central nervous system, and some proteins are likely to be synthesized locally at synaptic sites in dendrites.
5. Presynaptic Differentiation
Spontaneous and evoked neuromuscular transmission has been seen to begin within minutes after an axon contacts the postsynaptic cell. However, the synaptic activity is initially low, not only because of the small number of postsynaptic receptors, but also because only a few vesicles are released. During maturation of a synapse, synaptic vesicles increase in number and become clustered at active zones, whereas some cytoskeletal elements characteristic of the growing axon are lost. These changes are accompanied by an increase in synaptic volume that leads to a prominent increase in synaptic activity. The factors that induce presynaptic differentiation are unknown, but several molecules, which can be retrogradely released or are present in extracellular matrix, have been implicated in stimulating clustering and release of synaptic vesicles. They include neurotrophic factors, BDNF and FGF2, and extracellular matrix molecules such as agrin and synaptic isoforms of laminin. As for the postsynaptic site, presynaptic differentiation involves redistribution of synaptic machinery, and changes in expression of different isoforms of synaptic proteins. Accumulation and mobilization of vesicles is mediated by synapsins and the PDZ domain-containing proteins Piccolo and Rim. Possibly, assembly of pre- and postsynaptic components in neuron–neuron connections depends on formation of transsynaptic molecular complexes. Thus, postsynaptically localized neuroligin, associated with PSD-95, can bind to presynaptically localized β-neurexin associated via CASK with two PDZ domain-containing proteins, Veli and Mint1. The latter molecule interacts with N-type Ca²⁺ channels that are responsible for the entry of Ca²⁺ and transmitter release at the nerve terminals.
6. Synaptogenesis as a ‘Never Ending Story’
Formation and loss of synapses continue during the whole lifespan of animals. It is likely that high cognitive functions, such as learning and memory, utilize the mechanisms evolved to wire the brain during early development. Indeed, many molecules involved in early synaptogenesis, such as cell adhesion molecules of the immunoglobulin superfamily, integrins, cadherins, and ephrin receptors, have proved to play important roles in long-term potentiation and depression of synaptic connections, underlying learning and memory in adults (see Long-term Depression (Cerebellum); Long-term Depression (Hippocampus); Long-term Potentiation (Hippocampus); Memory: Synaptic Mechanisms). However, synaptogenesis in the adult brain occurs in the context of a very different milieu than that of early development. Mature synapses are subject to more severe constraints for further modifications, due to reduced NMDA receptor-mediated currents, and well-developed inhibitory circuitry. Additional restrictions are imposed by the established adhesive interactions within and outside of synapses, and a firm extracellular matrix enriched in repellent signals, thus possibly reducing structural plasticity. It remains to be seen which mechanisms active during development contribute to the formation of synapses in the adult brain and which mechanisms were additionally ‘invented’ to provide stability, but also flexibility in synaptic activity underlying learning and memory.
See also: Learning and Memory: Computational Models; Learning and Memory, Neural Basis of; Recovery of Function: Dependency on Age; Synapse Ultrastructure; Synaptic Efficacy, Regulation of; Synaptic Transmission
Bibliography
Broadie K 1998 Forward and reverse genetic approaches to synaptogenesis. Current Opinion in Neurobiology 8: 128–8
Dityatev A, Dityateva G, Schachner M 2000 Synaptic strength as a function of post- versus presynaptic expression of the neural cell adhesion molecule NCAM. Neuron 26: 207–17
Fitzsimonds R M, Poo M M 1998 Retrograde signaling in the development and modification of synapses. Physiological Review 78: 143–70
Flanagan J G, Vanderhaeghen P 1998 The ephrins and Eph receptors in neural development. Annual Review of Neuroscience 21: 309–45
Garner C C, Nash J, Huganir R L 2000 PDZ domains in synapse assembly and signalling. Trends in Cell Biology 10: 274–80
Hall A 1998 Rho GTPases and the actin cytoskeleton. Science 279: 509–14
Lee S H, Sheng M 2000 Development of neuron–neuron synapses. Current Opinion in Neurobiology 10: 25–31
Ming G L, Song H J, Berninger B, Holt C E, Tessier-Lavigne M, Poo M M 1997 cAMP-dependent growth cone guidance by netrin-1. Neuron 19: 1225–35
Mueller B K 1999 Growth cone guidance: first steps towards a deeper understanding. Annual Review of Neuroscience 22: 351–88
Sanes J R, Lichtman J W 1999 Development of the vertebrate neuromuscular junction. Annual Review of Neuroscience 22: 389–442
Sanes J R, Yamagata M 1999 Formation of lamina-specific synaptic connections. Current Opinion in Neurobiology 9: 79–87
Schachner M 1997 Neural recognition molecules and synaptic plasticity. Current Opinion in Cellular Biology 9: 627–34
Tanaka H, Shan W, Phillips G R, Arndt K, Bozdagi O, Shapiro L, Huntley G W, Benson D L, Colman D R 2000 Molecular modification of N-cadherin in response to synaptic activity. Neuron 25: 93–107
Walsh F S, Doherty P 1997 Neural cell adhesion molecules of the immunoglobulin superfamily: role in axon growth and guidance. Annual Review of Cellular and Developmental Biology 3: 425–56
Winberg M L, Mitchell K J, Goodman C S 1998 Genetic analysis of the mechanisms controlling target selection: complementary and combinatorial functions of netrins, semaphorins, and IgCAMs. Cell 93: 581–91
A. Dityatev and M. Schachner
Synapse Ultrastructure
The chemical synapse, which is the primary location for communication between neurons, is characterized by the apposition of two neuronal membranes with unequal membrane densities and a cluster of synaptic vesicles close to the synaptic site. Synaptic ultrastructure refers to the study of the physical components that make up the chemical synapse. This article examines some of the important findings in this area and considers the potential functional relevance of changes in synaptic ultrastructure that have been observed following neuronal activation, neural lesions, learning, and memory.
1. Why is Synaptic Ultrastructure Important?
Ramon y Cajal (1893; cited in Bliss and Collingridge 1993) theorized that learning and memory formation resulted from the formation of new synapses. Others speculated that learning might result from changes in the relative strength or efficacy of existing synapses. Donald Hebb (1949) postulated that one neuron’s involvement in firing a second neuron would strengthen the connection between the two neurons. The most likely location for this type of change in connective strength is the synapse. At the time of Hebb’s proposal there was little evidence to suggest that synapses were capable of change, either anatomically or physiologically. Based on advances in research methodology, however, it is clear that synapses are indeed capable of these types of changes. As a result of this early research (Cajal 1893, Hebb 1949), the structure and functioning of the synapse have become an area of intense study. While neuronal, axonal, and dendritic changes are also undoubtedly important, the synapse probably represents the primary location for activity-dependent neural plasticity. Research on synapses includes examining synaptic chemical transmission (see Synaptic Transmission), how synapses form (see Synapse Formation), and the differences in synaptic connections throughout the brain. While describing specific synaptic components has also been an important pursuit, quantifying the number and dimensions of synapses is the primary goal of research on synapse ultrastructure. Quantifying synapses may be especially important for functional reasons when synapses change in number or dimension following neural activation, neural lesions, learning, and memory formation. While a comprehensive review of the literature in this area is beyond the scope of this article, the basic components of the synapse are described, some of the research examining the structural plasticity of the synapse is discussed, and a series of ultrastructural changes that may support the anatomical storage of information is proposed.
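Hebb's postulate, as summarized in this section, is often illustrated with a simple rate-based update in which the strength of a connection grows in proportion to the joint activity of the two neurons. The sketch below is one such illustration, not a formula given by Hebb or by this article: the function name `hebbian_update`, the activity values, and the `rate` parameter are all assumptions chosen for the example.

```python
def hebbian_update(weight: float, pre: float, post: float, rate: float = 0.1) -> float:
    """Return the connection strength after one illustrative Hebbian step.

    Assumption: the change is proportional to the product of presynaptic
    and postsynaptic activity, so the weight grows only when both neurons
    are active together.
    """
    return weight + rate * pre * post


if __name__ == "__main__":
    w = 0.5  # hypothetical starting strength
    for pre, post in [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]:
        w = hebbian_update(w, pre, post)
        print(f"pre={pre} post={post} -> weight={w:.2f}")
```

Only the first pairing, in which both neurons are active, changes the weight; activity in either neuron alone leaves it untouched.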
2. Basic Ultrastructural Components of the Synapse
The following is a brief description of the basic physical components of the synapse. The synapse can be divided into presynaptic components, the synaptic cleft, and postsynaptic components. The presynaptic component usually arises from the end of an axonal branch and the postsynaptic component is usually a dendrite or a dendritic spine. Although many other synaptic contact types exist (e.g., axo-axonic (axon to axon), axosomatic (axon to neuron cell body), etc.), the majority of synapses in the more plastic, commonly studied regions of the brain (e.g., the hippocampus and cerebral cortex) consist of axon terminals contacting dendritic spines. The synaptic junction contains a specialized area referred to as the active zone where synaptic transmission is known to take place. This area is composed of two plasma membranes separated by a cleft, with dense structural specializations (referred to as densities, dense projections, or thickenings; see Fig. 1). The presynaptic axon terminal contains synaptic vesicles situated within the presynaptic dense projections. These dense projections are tufts of electron dense material assumed to be essential for vesicular release. The postsynaptic element contains the postsynaptic density (PSD), which is composed of various proteins (Bloomberg et al. 1977) and is associated with actin and a calcium calmodulin dependent protein kinase (Cohen et al. 1985). While PSDs are also associated with other kinases etc., actin may play a particularly important role in synaptic ultrastructural change as it is involved in maintaining the structure of the postsynaptic element (see Sect. 3.2). The PSD is also the site of postsynaptic neurotransmitter receptors and is considered integral to modulation of synaptic events (Siekevitz 1985). Gray (1959) proposed that chemical synapses come in two basic morphological types (Type I and Type II). Type I synapses are thought to be excitatory, house round vesicles, and have a wider PSD region (asymmetric synapses). Type II synapses are thought to be inhibitory, house oblong or oval vesicles and have a smaller PSD (symmetric synapses).
Figure 1 Basic ultrastructural components of the synapse
3. Ultrastructural Plasticity of the Synapse
It has become clear that neuronal growth, changes in connectivity, and changes in activity are associated with modulation of the structure of synapses (Bailey and Kandel 1993). Changes have been observed in synaptic number, synaptic curvature, synaptic size, and synaptic configuration (e.g., perforated synapses, see Sect. 3.4). Please note that this article does not include a consideration of the ultrastructure of dendritic spines (see Harris and Kater 1994 for review).
3.1 Quantifying Synaptic Number
When examining the ultrastructure of synapses in a given brain region the initial analysis usually involves some estimate of the total number of synapses in that region. In early research synapses were often counted on a single plane within the region of interest. That is, researchers used an electron microscope to photograph one large two-dimensional area and simply counted how many synapses were present. Gundersen (1977) and others began to realize that this form of counting involved a serious bias. Coggeshall and Lekan (1996) point out that considering only one plane biases the synaptic counts because the number of two-dimensional synaptic profiles observed depends not only on the actual number of whole synapses but also on the size, shape, orientation, etc. of the synapses. As a result, the method of using stereological disectors to estimate synaptic number was developed. Put simply, this involves examining a series of micrographs (photographs) that move through the tissue in three dimensions. Synapses are only counted once, when they first appear, which rules out the bias identified above and includes all synapses present (see Geinisman et al. 1996b for more details). Synaptic number in a brain region is assumed to be important functionally, as more synapses would yield stronger connections between neurons (Petit 1995).
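The counting rule described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure of Geinisman et al. (1996b): it assumes synaptic profiles have already been matched across serial sections and given a shared label, it treats the first section only as a look-up section, and the names `count_profiles`, `count_first_appearances`, and `SECTIONS` are hypothetical.

```python
from typing import Hashable, List, Set


def count_profiles(section: Set[Hashable]) -> int:
    """Single-plane count: every synaptic profile visible in one section."""
    return len(section)


def count_first_appearances(sections: List[Set[Hashable]]) -> int:
    """Count each synapse once, in the section where it first appears.

    Each section is compared with the one before it (the look-up section);
    a synapse is counted only if it is absent from the look-up section.
    Assumption: the first section serves only as a look-up and adds no counts.
    """
    total = 0
    for lookup, reference in zip(sections, sections[1:]):
        total += len(reference - lookup)
    return total


# Hypothetical serial sections: one large synapse spans three sections,
# while each small synapse is visible in a single section only.
SECTIONS: List[Set[Hashable]] = [
    {"small_1"},
    {"large", "small_2"},
    {"large"},
    {"large", "small_3"},
]

if __name__ == "__main__":
    print([count_profiles(s) for s in SECTIONS])
    print(count_first_appearances(SECTIONS))
```

In this toy data the large synapse contributes a profile to three of the four planes while each small synapse contributes to one, which is the size dependence that single-plane counting suffers from; counting first appearances gives each synapse the same weight regardless of size.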
3.2 Changes in Synaptic Curvature
Changes in synaptic curvature are also known to occur following various forms of synaptic activation (see Petit 1995 for review). Markus and Petit (1989) evaluated synaptic curvature using four possible states: concave (presynaptic element protrudes into the postsynaptic element), flat (no curvature), convex (postsynaptic element protrudes into the presynaptic element), and irregular (w-shaped or more than one curvature). One possible explanation for curvature changes is that plastic internal actin and myosin cytoskeletal networks may create changes in response to activity-dependent calcium influx. Fifkova (1985) found that cytoskeletal networks that contain only highly plastic actin structures are found exclusively in developing neurons and mature dendritic spines (see Deller et al. 2000 for an update on this research). Functionally, curvature is thought to affect synaptic efficacy by increasing contact area and the probability of the transmitters reaching their target (see Markus and Petit 1989 for review). Computer modeling experiments have also suggested that concave-shaped synapses seem to confine the diffusion of presynaptic intracellular calcium (Ghaffari et al. 1997). This would lead to higher calcium concentrations in the presynaptic terminal and an increased probability of vesicular (transmitter) release (see Synaptic Transmission).
3.3 Changes in Synaptic Size
Most researchers have theorized that larger synapses are stronger synapses. Conceivably, more neurotransmitter could be released (presynaptically), more postsynaptic receptors could be present, and more conductance through to the postsynaptic cell could occur (Petit 1995). A recent study used cultured cortical neurons to study the effect of synaptic size on the quantal capacity of the synapse (Mackenzie et al. 1999). By imaging Ca2+ and then conducting morphological analyses on the same synapse, Mackenzie et al. concluded that synaptic size correlates positively with the amplitude of the postsynaptic response. This suggests that larger synapses create larger synaptic signals.
3.4 Changes in Synaptic Configuration
One of the main changes in synaptic ultrastructure is the transition of some synapses from a continuous synapse to a perforated synapse. A perforated synapse can be defined as any synapse with a discontinuous or segmented PSD (Geinisman et al. 1991). Carlin and Siekevitz (1983) proposed that at some optimal size the postsynaptic material would perforate, eventually segment, and potentially new simple synapses would form. There is, however, evidence against the notion that synapses split to form new simple synapses. Jones et al. (1991) found few double-headed spines, axospinous non-perforated synapses lying adjacent to one another, and spinules completely traversing the presynaptic terminals of perforated synapses. Functionally, perforations may indicate synapses ready to divide and this may lead to more synapses and a larger neural signal (Toni et al. 1998). Alternatively, perforations may allow calcium channels to be located closer to the vesicular release apparatus, which could result in greater neurotransmitter release (Jones et al. 1991).
4. Changes in Synaptic Ultrastructure following Neural Events
As mentioned in Sect. 1, a comprehensive review of the literature on changes in synaptic ultrastructure is beyond the scope of this article. As a result synaptic changes following the onset of disease and other models are not discussed. The following is a selection of findings associated with models that involve activation of synapses primarily in the hippocampal formation due to its involvement in learning and memory (see Hippocampus and Related Structures).
4.1 Long-term Potentiation
Bliss and Lomo (1973) are credited with the discovery of long-term potentiation (LTP), which has proven to be a popular model system for learning and memory (see Long-term Potentiation (Hippocampus)). Put simply, LTP is a sustained increase (hours, days, or weeks) in the amplitude of the response evoked in a cell, or population of cells, following tetanic (repeated) stimulation of those cells. The ultrastructural characteristics of synapses have been shown to change following the induction of LTP (Geinisman et al. 1991, Desmond and Levy 1983, Weeks et al. 1999). It is not clear, however, what role, if any, these changes play in the enhanced neural signal observed during the maintenance of LTP (see Sect. 3.2–3.4 for some possibilities). While LTP can be induced in several brain regions, the majority of the research on changes in synaptic ultrastructure following LTP has been conducted in the hippocampus. Geinisman et al. (1996a) found more numerous axo-dendritic synapses (synapses directly on the dendritic shaft) 13 days following the induction of LTP in the rat dentate gyrus (a sub-region of the hippocampus). Functionally, these axo-dendritic synapses may have a more direct effect on the soma and could therefore combine to create a potentiated signal (see Harris and Kater 1994 for discussion). Desmond and Levy (1983) analyzed synaptic curvature in the rat dentate gyrus and found that the number of concave shaped synapses increased as early as 2 minutes after the induction of LTP. General increases in synaptic size following LTP have also been reported (Chang and Greenough 1984, Desmond and Levy 1988). These studies found an enlargement of the presynaptic and postsynaptic elements, and an increase in length of the presynaptic dense projections and postsynaptic density (PSD). Geinisman et al. (1991) found an increase in the number of perforated synapses per neuron following the induction of LTP in the rat dentate gyrus.
Importantly, the significant increase in perforated synapses observed at 1 hour post-induction was no longer evident at 13 days post-induction (Geinisman et al. 1996a). Buchs and Muller (1996) utilized calcium accumulation markers to identify activated synapses and found that these activated synapses were more perforated following LTP induction. Toni et al. (1999) reported an increase in synaptic perforations in activated synapses during the first 30 minutes following in vitro LTP induction in the rat CA1 area (another subregion of the hippocampus). At 60 minutes postinduction, they found an increase in the proportion of double synapses (two spines connected to the same presynaptic terminal) suggesting that perforated synapses may eventually split to form new simple synapses. Contradictory evidence has come from Sorra and Harris (1998) who did not find any ultrastructural changes 2 hours post-LTP induction in area CA1 of the rat. In a series of experiments, Weeks et al. (1999, 2000, 2001) considered the pattern of synaptic ultrastructure at 1 hour, 24 hours, and 5 days post-LTP induction. LTP was associated with an increase in concave shaped synapses at 24 hours. Synapses were larger overall at 1 hour and 5 days but not different at 24 hours. These differences in length were particularly evident in concave-shaped synapses, which were longer at 1 hour, shorter at 24 hours, and longer at 5 days. The proportion of perforated synapses was increased at 1 hour post-LTP induction but did not differ from controls at later time periods. Taken together and added to the other research in this area these results form a model of ultrastructural remodeling that occurs in activated synapses following LTP. Synapses appear to initially grow in size, become concave in shape and perforate by 1 hour post-LTP induction, either split or form new smaller concave synapses at 24 hours and then grow again in size by 5 days.
4.2 Learning and Memory
Synaptic ultrastructure has also been shown to change following learning. Reempts et al. (1992) found an increased number of concave perforated synapses in the rat dentate gyrus following the acquisition of a one-way active avoidance task. Concave synapses were also found to be increased in size relative to controls following training. This result is similar to the findings at 1 hour post-LTP induction (Weeks et al. 2000). Synapse ultrastructure was also analyzed following the hippocampus-dependent trace eye-blink conditioning in the rabbit (see Classical Conditioning, Neural Basis of; Eyelid Classical Conditioning). Geinisman et al.
(2000) found that while the number of synapses did not change following conditioning the PSDs were significantly larger in the trained animals. Geinisman et al. speculated that this increase in size might represent the addition of signal transduction proteins (thus increasing synaptic strength). Learning that involves the cerebellum (see Cerebellum: Associative Learning) has also been associated with changes in synaptic ultrastructure. Kleim et al. (1996) found that motor skill learning caused synaptogenesis (formation of new synapses) in the rat cerebellum (see Motor Control Models: Learning and Performance). The number of synapses per Purkinje cell increased in rats after they learned to navigate an aerial maze compared to motor controls. Further, the number of multiple varicosities (clusters of synapses) increased following learning as did the amount of branching in the Purkinje cell processes. Kleim et al. also found that as the number of synapses increased, their average size decreased. This result is similar to the changes observed in the rat dentate gyrus 24 hours post-LTP induction (Weeks et al. 1999).
4.3 Synaptic Ultrastructure Following Neural Lesions
Reactive synaptogenesis (formation of new synapses after a neural lesion) in the hippocampus was examined by Anthes et al. (1993). Following ipsilateral entorhinal cortical lesions (see Hippocampus and Related Structures), synapses were quantified in the rat dentate gyrus at 3, 6, 10, 15, and 30 days post-lesion. Results showed that the lesions caused an initial 88 percent synaptic loss at day 3, which was followed by rapid synaptogenesis from day 6 through to day 15. Anthes et al. speculated that, following the loss of entorhinal input, previously dormant or suppressed fibers and their synapses became active in the absence of the primary innervation. Interestingly, synaptic size was found to decrease during the phase of rapid synaptogenesis. As synaptogenesis returned to baseline levels (approximately day 15), synaptic size also returned to control or pre-lesion levels. Another important finding was that the number of perforated synapses was greatest at the peak of synaptogenesis (days 10–15) and returned to control levels by day 30 post-lesion. Despite the differences in the length of time post-lesion, the sequence of synaptic structural changes observed by Anthes et al. is very similar to that observed by Weeks et al. (2001) following LTP (see Sect. 4.1).
5. Proposed Sequence of Synaptic Ultrastructural Change
There are limitations to the scope of the sequence of synaptic ultrastructural changes proposed. This sequence is based primarily on changes observed at excitatory, axo-spinous (axon to dendritic spine) synapses. Therefore, these modifications may not take place in inhibitory, axo-dendritic, or axo-somatic synapses, etc. The data required to make a detailed comparison between the synaptic structural changes associated with LTP, reactive synaptogenesis, learning, and memory are far from complete but it is tempting to speculate about a common mechanism. One intriguing possibility, which was suggested by the growth of clusters of synapses observed by Kleim et al. (1996), is that a subset of synapses become involved or recruited in the various forms of neural plasticity. This subset of synapses then follows a sequence of structural changes (an initial increase in size, concavity, and perforations, then a decrease in size but an increased number of synapses, followed by a return to baseline levels) to produce more numerous and stronger synapses within the activated groups of afferent fibers. This finding directly supports Hebb’s (1949) postulate (see Sect. 1) and allows for lasting ultrastructural change in synapses that form activated networks within the nervous system. These activated networks may then form a basis for learning, memory, and other psychological processes.
See also: Long-term Depression (Cerebellum); Long-term Depression (Hippocampus); Long-term Potentiation and Depression (Cortex); Long-term Potentiation (Hippocampus); Memory: Synaptic Mechanisms; Neurotransmitters; Synapse Formation; Synaptic Efficacy, Regulation of; Synaptic Transmission
Bibliography
Anthes D, LeBoutillier J, Petit T 1993 Structure and plasticity of newly formed adult synapses: A morphometric study in rat hippocampus. Brain Research 626: 50–62
Bailey C, Kandel E 1993 Structural changes accompany memory storage. Annual Review of Physiology 55: 397–426
Bliss T, Collingridge G 1993 A synaptic model of memory: Long-term potentiation in the hippocampus. Nature 361: 31–9
Bliss T, Lomo T 1973 Long lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. Journal of Physiology 232: 331–56
Bloomberg F, Cohen R, Siekevitz P 1977 The structure of postsynaptic densities isolated from dog cerebral cortex. II. Characterization and arrangement of some of the major proteins within the structure. Journal of Cell Biology 78: 204–25
Buchs P, Muller D 1996 Induction of long-term potentiation is associated with major ultrastructural changes of activated synapses. Proceedings of the National Academy of Sciences USA 93: 8040–5
Carlin R, Siekevitz P 1983 Plasticity in the central nervous system; do synapses divide? Proceedings of the National Academy of Sciences USA 80: 3517–21
Chang F, Greenough W 1984 Transient and enduring morphological correlates of synaptic activity and efficacy change in the rat hippocampal slice. Brain Research 309: 35–46
Coggeshall R, Lekan H 1996 Methods for determining numbers of cells and synapses: A case for more uniform standards of review. Journal of Comparative Neurology 364: 6–15
Cohen R S, Chung S K, Pfaff D W 1985 Immunocytochemical localization of actin in dendritic spines of the cerebral cortex using colloidal gold as a probe. Cellular and Molecular Neurobiology 5: 271–84
Deller T, Merten T, Roth S, Mundel P, Frotscher M 2000 Actin-associated protein synaptopodin in the rat hippocampal formation: Localization in the spine neck and close association with the spine apparatus of principal neurons. Journal of Comparative Neurology 418: 164–81
Desmond N, Levy W 1983 Synaptic correlates of associative potentiation/depression: An ultrastructural study in the hippocampus. Brain Research 265: 21–30
Desmond N, Levy W 1988 Synaptic interface surface area increases with long-term potentiation in the hippocampal dentate gyrus. Brain Research 453: 308–14
Fifkova E 1985 Actin in the nervous system. Brain Research Reviews 9: 187–215
Geinisman Y, de Toledo-Morrell L, Morrell F 1991 Induction of long-term potentiation is associated with an increase in the number of axospinous synapses with segmented postsynaptic densities. Brain Research 566: 77–88
Geinisman Y, de Toledo-Morrell L, Morrell F, Persina I, Beatty M 1996a Synapse restructuring associated with the maintenance phase of hippocampal long-term potentiation. Journal of Comparative Neurology 367: 413–23
Geinisman Y, Disterhoft J, Gundersen H, Mcechron M, Persina I, Power J, Van Der Zee E, West M 2000 Remodeling of hippocampal synapses after hippocampus-dependent associative learning. Journal of Comparative Neurology 417: 49–59
Geinisman Y, Gundersen H, Van Der Zee E, West M 1996b Unbiased stereological estimation of the total number of synapses in a brain region. Journal of Neurocytology 25: 805–19
Ghaffari T, Liaw J, Berger T 1997 Impact of synaptic morphology on presynaptic calcium dynamics and synaptic transmission. Society for Neuroscience Abstracts 23: 2105
Gray E G 1959 Axo-somatic and axo-dendritic synapses of the cerebral cortex: An electron microscope study. Journal of Anatomy 93: 420–33
Gundersen H J G 1977 Notes on the estimation of the numerical density of arbitrary profiles: The edge effect. Journal of Microscopy 111: 219–23
Harris K, Kater S 1994 Dendritic spines: cellular specializations imparting both stability and flexibility to synaptic function. Annual Review of Neuroscience 17: 341–71
Hebb D 1949 The Organization of Behaviour. Wiley, New York
Jones D G, Itarat W, Calverly R K S 1991 Perforated synapses and plasticity: A developmental overview. Molecular Neurobiology 5: 217–28
Kleim J A, Lussing E, Schwarz E, Comery T, Greenough W 1996 Synaptogenesis and FOS expression in the motor cortex of the adult rat after motor skill learning. Journal of Neuroscience 16: 4529–35
Mackenzie P, Kenner G, Prange O, Shayan H, Umemiya M, Murphy T 1999 Ultrastructure correlates of quantal synaptic function at single CNS synapses. Journal of Neuroscience 19: RC13 (1–7)
Markus E, Petit T 1989 Synaptic structural plasticity: Role of synaptic shape. Synapse 3: 1–11
Petit T 1995 Structure and plasticity of the Hebbian synapse: The cascading events for memory storage. In: Spear L P, Spear N E, Woodruff M (eds.) Neurobehavioral Plasticity: Learning, Development and Response to Brain Insults. Erlbaum, Hillsdale, NJ, pp. 185–205
Reempts J, Dikova M, Werbrouck L, Clincke G, Borgers M 1992 Synaptic plasticity in rat hippocampus associated with learning. Behavioral Brain Research 51: 179–83
Siekevitz P 1985 The postsynaptic density: A possible role in long-lasting effects in the central nervous system? Proceedings of the National Academy of Sciences USA 82: 3494–8
Sorra K, Harris K 1998 Stability of synapse number and size at 2 hr after long-term potentiation in hippocampal area CA1. Journal of Neuroscience 18: 658–71
Toni N, Buchs P, Bron C, Muller D 1999 LTP promotes formation of multiple spine synapses between a single axon terminal and a dendrite. Nature 402: 421–5
Weeks A, Ivanco T, LeBoutillier J, Racine R, Petit T 1999 Sequential changes in the synaptic structural profile following long-term potentiation in the rat dentate gyrus: I. The intermediate maintenance phase. Synapse 31: 97–107
Weeks A, Ivanco T, LeBoutillier J, Racine R, Petit T 2000 Sequential changes in the synaptic structural profile following long-term potentiation in the rat dentate gyrus: II. The induction/early maintenance phase. Synapse 36: 286–96
Weeks A C W, Ivanco T L, LeBoutillier J, Racine R, Petit T 2001 Sequential changes in the synaptic structural profile following long-term potentiation in the rat dentate gyrus: III. Long-term maintenance phase. Synapse 40: 74–84
A. C. W. Weeks
Synaptic Efficacy, Regulation of
Multiple biological signals and mechanisms regulate the efficacy of synaptic transmission, thus providing rich combinatorial possibilities for modifying neural communication. These processes have been studied particularly well in invertebrate preparations, whose large neurons and relatively simple neural circuits permit detailed mechanistic and functional analyses.
1. Multiple Processes Regulate Synaptic Efficacy
The processes regulating synaptic efficacy vary along several dimensions, including (a) the nature of the change itself (e.g., an increase or decrease in efficacy); (b) mechanisms of induction, that is, the nature of the neural signals that initiate the synaptic change; (c) mechanisms of expression, the proximal causes of synaptic changes; (d) mechanisms of maintenance, the steps involved in their persistence; and (e) time course, ranging from transient forms of regulation lasting only a few seconds, to relatively permanent long-term changes persisting a day or more.
2. Simple Forms of Synaptic Regulation
2.1 Decreases and Increases of Synaptic Efficacy
Simple forms of synaptic regulation include decreases and increases in synaptic efficacy. Synaptic depression is typically expressed as a decrease in the amplitude or other parameters of a postsynaptic potential (PSP) at a given synaptic connection (Fig. 1). Other things being equal, the larger the synaptic potential, the greater the influence a presynaptic neuron exerts on its postsynaptic target neuron. Hence, in synaptic depression, the presynaptic neuron exerts less control over its postsynaptic target. Facilitation, enhancement, augmentation, and potentiation all refer to an increase in synaptic strength, often expressed as an increase in PSP amplitude (Fig. 1).
Synaptic regulation has been studied most extensively at chemical synapses. However, changes in the efficacy of electrical synapses can also occur.
2.2 Mechanisms of Induction
A separate but related issue is the type and locus of cellular activity necessary to induce, or trigger, the change in synaptic efficacy. Homosynaptic changes arise from activity that occurs at the same (‘homo’) synapse that is modified. Thus, homosynaptic depression is a decrease in synaptic strength (‘depression’) that arises from activity at that synapse (hence, ‘homosynaptic’). In contrast, heterosynaptic changes arise from activity in neurons other than those being modified. Both are instances of non-associative induction mechanisms, in that only one type or locus of neural activity is involved in triggering the synaptic change. In contrast, conjunctive or associative mechanisms of synaptic regulation depend on activity in more than one cell. Associative mechanisms are of particular interest from a theoretical perspective, because they involve a more advanced regulatory logic: changes in synaptic strength occur if and only if two conditions are met (see Associative Modifications of Individual Neurons).
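The regulatory logic of the three induction mechanisms can be written as a small truth table. The sketch below is an illustrative simplification that treats activity as all-or-none; the names `Induction` and `change_induced`, and the two boolean inputs, are assumptions made for the example rather than terms from the literature.

```python
from enum import Enum


class Induction(Enum):
    HOMOSYNAPTIC = "homosynaptic"      # triggered by activity at the modified synapse
    HETEROSYNAPTIC = "heterosynaptic"  # triggered by activity in other neurons
    ASSOCIATIVE = "associative"        # requires both kinds of activity together


def change_induced(mechanism: Induction, synapse_active: bool, other_active: bool) -> bool:
    """Return True if the (idealized) induction condition is met."""
    if mechanism is Induction.HOMOSYNAPTIC:
        return synapse_active
    if mechanism is Induction.HETEROSYNAPTIC:
        return other_active
    # Associative: change occurs if and only if both conditions are met.
    return synapse_active and other_active


if __name__ == "__main__":
    for mechanism in Induction:
        for synapse_active in (False, True):
            for other_active in (False, True):
                result = change_induced(mechanism, synapse_active, other_active)
                print(mechanism.value, synapse_active, other_active, result)
```

The two non-associative mechanisms each depend on a single input, whereas the associative case behaves as a logical AND of the two.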
2.3 Regulation of Synaptic Efficacy in the Aplysia Siphon-withdrawal Circuit The following four examples illustrate these different forms of induction as well as certain other aspects of synaptic regulation in the context of the gill- and siphon-withdrawal circuit of the marine mollusk Aplysia (Fig. 2). As detailed by Eric Kandel and colleagues (Kandel and Schwartz 1982, Kandel and Pittenger 1999), the Aplysia gill- and siphon-withdrawal reflex to a tactile siphon stimulus is mediated in part by the LE siphon sensory neurons, which connect both directly and indirectly to LFS and other siphon and gill motor neurons. The strength of LE sensorimotor connections can be modified in several ways (see Memory: Synaptic Mechanisms). Similar changes have also been found at connections of Aplysia tail mechanosensory neurons.
Figure 1 Homosynaptic depression and heterosynaptic facilitation at Aplysia sensorimotor synapses. A. Simplified circuit and experimental arrangement. Siphon sensory neurons (SN) project monosynaptically and polysynaptically to gill and siphon motor neurons (MN). Extracellular stimulation of the connective, which carries input from tail and head to the abdominal ganglion, activates facilitatory interneurons (Fac.) that produce heterosynaptic facilitation at sensorimotor connections (asterisk) by enhancing transmitter release. B. Example of homosynaptic depression (left) and heterosynaptic facilitation (right). Intracellular stimulation evokes an action potential in the sensory neuron that elicits a monosynaptic excitatory synaptic potential (EPSP) in the motor neuron. With repeated stimulation, the amplitude of the EPSP declines (synaptic depression). Connective stimulation (Conn. Stim.) increases the EPSP amplitude (heterosynaptic facilitation). Source: Castellucci and Kandel 1976. © American Association for the Advancement of Science. Adapted with permission
2.3.1 Homosynaptic depression and homosynaptic facilitation. Low-frequency activation of LE siphon sensory neurons depresses transmitter release at sensorimotor connections and hence also depresses the postsynaptic response (Fig. 1; Castellucci and Kandel 1976). Functionally, homosynaptic depression contributes to behavioral habituation of the gill- and siphon-withdrawal reflex. The siphon is a small fleshy spout that expels seawater from the mantle cavity. A tactile stimulus to the siphon causes both the gill and the siphon to withdraw. After repeated taps of the siphon, however, the amplitude and duration of the reflex diminish. Habituation is an important and ubiquitous form of learning that enables an organism to screen out stimuli of little consequence. In addition to homosynaptic depression, Aplysia sensorimotor connections exhibit post-tetanic potentiation (PTP) in response to high-frequency activation, which is due at least in part to homosynaptic facilitation (Clark and Kandel 1984).
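The decline of the EPSP with repeated sensory-neuron stimulation, and its recovery after connective stimulation, can be illustrated with a deliberately simple numerical sketch. The function name, the geometric decay rule, and the values of `depression` and `facilitation_gain` are illustrative assumptions, not measurements or a model taken from Castellucci and Kandel (1976).

```python
def simulate_epsps(n_stimuli, depression=0.85, facilitation_gain=1.8,
                   conn_stim_at=None, initial=1.0):
    """Return relative EPSP amplitudes for a train of sensory-neuron spikes.

    Assumptions (hypothetical, for illustration only):
    - each spike scales synaptic strength by `depression` (< 1),
      standing in for homosynaptic depression;
    - connective stimulation just before spike index `conn_stim_at`
      multiplies strength by `facilitation_gain` (> 1),
      standing in for heterosynaptic facilitation.
    """
    amplitudes = []
    strength = initial
    for i in range(n_stimuli):
        if conn_stim_at is not None and i == conn_stim_at:
            strength *= facilitation_gain  # extrinsic facilitatory input
        amplitudes.append(strength)        # EPSP evoked by this spike
        strength *= depression             # use-dependent depression
    return amplitudes


if __name__ == "__main__":
    print(simulate_epsps(6))                  # steadily declining EPSPs
    print(simulate_epsps(6, conn_stim_at=3))  # rebound at the fourth spike
```

The sketch captures only the direction of each change; real synapses recover over time and show more complex kinetics.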
2.3.2 Heterosynaptic facilitation and heterosynaptic inhibition. In Aplysia siphon-withdrawal circuits, strong noxious stimuli (e.g., tail shock) activate a group of facilitatory interneurons (e.g., L29 cells) that synapse onto LE siphon sensory cells. Activation of facilitatory interneurons modifies sensory cells and increases their transmitter release via multiple mechanisms. Because this increase in synaptic efficacy is regulated by an extrinsic input, it is an example of heterosynaptic facilitation (Fig. 1; Castellucci and Kandel 1976). Behaviorally, presynaptic facilitation contributes to sensitization of the siphon-withdrawal reflex, increasing its amplitude and duration. With very strong stimuli, heterosynaptic inhibition can also be induced.
2.3.3 Activity-dependent enhancement of presynaptic facilitation. In this associative mechanism (also termed activity-dependent neuromodulation and activity-dependent extrinsic modulation), activity in sensory neurons enhances the heterosynaptic presynaptic facilitation normally evoked by facilitatory interneurons (Hawkins et al. 1983). The facilitation is greater than can be accounted for by a simple combination of a homosynaptic increase (PTP arising from sensory neuron activity) and heterosynaptic facilitation (arising from activation of facilitatory interneurons) and thus is associative. This mechanism is believed to contribute to classical conditioning of the siphon-withdrawal reflex, in which a siphon tap conditioned stimulus (CS) that activates siphon sensory neurons is paired with a tail shock unconditioned stimulus (US) that activates neuromodulatory facilitatory interneurons. Such pairings result in greater synaptic facilitation and a greater increase in the siphon-withdrawal response than that produced by unpaired CS and US presentations.
2.3.4 Hebbian facilitation. Hebbian facilitation (Hebb 1949) also contributes to increases in synaptic efficacy at LE-LFS connections (Glanzman 1995). The critical inductive signal for Hebbian facilitation is conjoint activity of the presynaptic sensory neuron and activity or large depolarization of the postsynaptic motor neuron. The associative properties presumably arise from a glutamate N-methyl-D-aspartate (NMDA)-type receptor, which acts as a molecular ‘AND-gate’ that requires both neurotransmitter binding and large postsynaptic depolarization (to remove Mg2+ block of the channel) in order for Ca2+ influx to occur through the channel and trigger synaptic change (see Associative Modifications of Individual Neurons; Long-term Potentiation (Hippocampus)). Behaviorally, Hebbian facilitation also contributes to classical conditioning of the reflex. The tactile siphon CS activates siphon sensory neurons, and the tail shock US activates and depolarizes siphon motor neurons, thereby providing the key conjunctive events.
Figure 2 Four induction mechanisms for synaptic facilitation at Aplysia sensorimotor connections. In each case, the synapse being strengthened is the sensorimotor connection from the siphon sensory neuron (SN) to the siphon motor neuron (MN), as indicated by the asterisk. The locus of cellular activity critical for the induction of facilitation is indicated by shading of the cell or synapse. A. High-frequency activity of the SN alone (shaded) produces homosynaptic facilitation (e.g., PTP). B. Activity in facilitatory interneurons enhances transmitter release from the SN, resulting in heterosynaptic facilitation. C. Conjoint activation of the sensory neuron and the facilitatory interneuron produces even greater facilitation, indicating activity-dependent enhancement of heterosynaptic facilitation. D. Conjoint activity in the presynaptic SN and postsynaptic MN produces Hebbian facilitation
2.4 Expression, Maintenance, and Time Course of Synaptic Regulation
In principle, increases and decreases in synaptic efficacy can result from either a presynaptic expression mechanism (e.g., a decrease in transmitter release from the presynaptic cell), or a postsynaptic expression mechanism (e.g., a decrease in the sensitivity or number of transmitter receptors on the postsynaptic target cell), or both. Although the mechanisms for expression and for maintenance and duration of synaptic regulation are logically and biologically separable, they are often functionally related and will be considered together here. Four broad categories of mechanisms allow the regulation of synaptic efficacy to span a wide range of temporal domains: (a) transient regulation, which is voltage- and Ca2+-dependent, lasting milliseconds to seconds; (b) short-term regulation, involving second-messenger-mediated covalent modifications of pre-existing proteins, typically lasting minutes to an hour; (c) intermediate-term regulation, dependent on protein synthesis, persisting several hours; and (d) long-term regulation, dependent on both RNA synthesis and protein synthesis, persisting for hours and days (Fig. 3).
2.4.1 Transient, voltage- and Ca2+-dependent presynaptic regulation. As demonstrated by studies at the squid giant synapse, which permits intracellular stimulation and recording from both presynaptic and postsynaptic elements, transmitter release is regulated by the voltage of the presynaptic terminal, which in turn influences the basal concentration of intracellular Ca2+, the ion responsible for transmitter release. Depolarization of the presynaptic terminal elevates intracellular Ca2+ concentrations and increases release, whereas hyperpolarization decreases intracellular Ca2+ concentrations and decreases release. Axo-axonic synapses (connections from one neuron onto presynaptic terminals of another) can depolarize or hyperpolarize presynaptic terminals, thereby regulating release and producing presynaptic facilitation or inhibition, respectively.
This form of synaptic regulation is typically transient, as it is dependent on the continuation of the afferent axo-axonic input, so it allows a moment-to-moment gating of information flow. Homosynaptic spike activity can also influence residual Ca2+ levels and thus modify transmitter release relatively directly.
Figure 3 Maintenance of Enhanced Synaptic Efficacy. Four classes of mechanisms allow the regulation of synaptic efficacy to span a wide range of temporal domains, illustrated here by four examples of presynaptic facilitation. A. Transient voltage- and Ca2+-dependent facilitation. Activation of presynaptic ionotropic receptors depolarizes the presynaptic terminal, increasing basal Ca2+ levels and enhancing transmitter release. The facilitation terminates quickly (within milliseconds to seconds) after afferent activity subsides and the terminal repolarizes. B. Short-term covalent modifications, as occur at Aplysia sensorimotor connections. Facilitatory inputs elevate concentrations of second messengers (e.g., cAMP), activate protein kinases, and phosphorylate substrate proteins, enhancing transmitter release. Short-term facilitation persists beyond the termination of afferent facilitatory input, but subsides on a time scale of minutes as substrate proteins are dephosphorylated. C. Intermediate-term facilitation at Aplysia sensorimotor connections lasting a few hours requires translation of mRNA to protein but not transcription of DNA to RNA. D. Long-term facilitation at Aplysia sensorimotor connections. Here the second messenger cascade regulates gene expression as well as protein synthesis, leading to relatively permanent synaptic facilitation lasting a day or more
2.4.2 Short-term regulation via second-messenger-mediated covalent modifications. Short-term regulation of synaptic efficacy, lasting minutes to approximately an hour, typically involves second-messenger cascades and the subsequent covalent modification of pre-existing proteins, often via kinase-dependent phosphorylation. For example, in short-term facilitation at Aplysia sensorimotor connections, facilitatory interneurons activate both the cAMP-dependent protein kinase (PKA) and protein kinase C (PKC) in sensory cells. In the cAMP-dependent cascade, the facilitatory transmitter (e.g., serotonin), the ‘first messenger,’ binds to a receptor coupled to a G-protein that activates the enzyme adenylyl cyclase. Activated adenylyl cyclase converts adenosine triphosphate (ATP) to cyclic adenosine monophosphate (cAMP), the ‘second messenger.’ cAMP in turn activates PKA by binding to the regulatory subunits of the kinase, causing them to dissociate from the catalytic subunits. The freed catalytic subunits then phosphorylate the substrate proteins, inducing conformational and functional changes. Similarly, in the PKC cascade, serotonin activates phospholipase C via a separate G-protein coupled receptor, which in turn elevates the levels of the second messenger diacylglycerol and activates PKC. Together, these two second-messenger cascades produce a broadening of sensory neuron action potentials, enhanced Ca2+ influx, and an enhancement of spike-release coupling, thereby increasing transmitter release. Second-messenger cascades modify a variety of substrate proteins, allowing for coordinated regulation of multiple steps involved in synaptic transmission. Because substrate phosphorylation persists even after the brief input from facilitatory interneurons terminates, synaptic facilitation is maintained beyond the presence of the sensitizing stimulus itself. However, biochemical mechanisms for substrate dephosphorylation also exist. Hence, covalent modifications of substrate proteins by themselves are not well suited for more permanent maintenance of synaptic changes.
2.4.3 Intermediate-term and long-term regulation of synaptic efficacy involve translation and transcription. Intermediate-term (lasting hours) and long-term forms of synaptic regulation (lasting a day or more) typically require the recruitment of additional steps to those implicated in short-term regulation. The relationship among these different forms of cellular memory has been particularly well studied at Aplysia sensorimotor connections. Unlike short-term facilitation, certain forms of intermediate-term facilitation require the synthesis of new proteins (translation of mRNA to protein). Long-term facilitation involves not only protein synthesis, but also the modification of gene expression (transcription of DNA to RNA) (Montarolo et al. 1986). These forms of synaptic regulation are thus distinct on a mechanistic as well as phenomenological level.
With repeated sensitizing stimuli or their analogs (e.g., serotonin application) used to induce long-term facilitation, PKA phosphorylates not only substrate proteins involved in short-term facilitation at the synapse, but also the transcriptional activator cAMP-response element binding protein (CREB) in the sensory neuron nucleus. CREB then binds to and activates the promoter element CRE (the cAMP response element), inducing gene expression. The new gene products include not only ‘effector’ molecules directly involved in the expression of long-term facilitation, but also additional transcription factors, such as C/EBP, that regulate the expression of other genes, thereby producing a cascade of gene regulation. Interruption of this gene cascade at early time points by inhibitors of RNA or protein synthesis blocks subsequent long-term facilitation, but leaves short-term facilitation intact. In contrast, inhibition of RNA and protein synthesis at later time points has little effect. Thus the dynamics of macromolecular synthesis
in long-term facilitation mimic the time-sensitivity of long-term memory itself. cAMP and CREB have been implicated in neural substrates of memory in other systems, such as Drosophila (see Memory in the Fly, Genetics of). Just as short-term facilitation involves the coordinated facilitation of several processes involved in transmitter release, long-term facilitation involves multiple effector genes. Enhanced expression of the enzyme ubiquitin carboxy-terminal hydrolase promotes the degradation of the inhibitory regulatory subunits of PKA, thereby making the kinase persistently active, even after cAMP levels return to baseline. Other gene products lead to the growth of new sensory neuron synaptic terminals (which does not occur in short-term facilitation), thus providing a stable, enduring substrate for long-term memory. Structural changes in sensory neurons are accompanied by corresponding growth in postsynaptic targets, providing further evidence for the coordinated regulation of synaptic efficacy on multiple levels.
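The four temporal classes of regulation and their dependence on macromolecular synthesis can be summarized as a small lookup structure. The sketch below is only a bookkeeping aid under simplifying assumptions: the class and function names are hypothetical, the durations are the approximate ranges quoted in Section 2.4, and the model ignores timing (the text notes that inhibitors applied at later time points have little effect) as well as the fact that only certain forms of intermediate-term facilitation require protein synthesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regulation:
    name: str
    duration: str
    needs_translation: bool    # requires new protein synthesis
    needs_transcription: bool  # requires new RNA synthesis


# Approximate summary of the four classes described in Section 2.4.
FORMS = [
    Regulation("transient", "milliseconds to seconds", False, False),
    Regulation("short-term", "minutes to about an hour", False, False),
    Regulation("intermediate-term", "several hours", True, False),
    Regulation("long-term", "a day or more", True, True),
]


def surviving_forms(block_translation=False, block_transcription=False):
    """Names of the forms expected to persist when synthesis is blocked
    during induction (a simplification that ignores later time points)."""
    return [
        f.name
        for f in FORMS
        if not (block_translation and f.needs_translation)
        and not (block_transcription and f.needs_transcription)
    ]


if __name__ == "__main__":
    print(surviving_forms(block_transcription=True))
    print(surviving_forms(block_translation=True))
```

Blocking transcription alone removes only the long-term form in this sketch, whereas blocking translation removes both the intermediate-term and long-term forms and leaves transient and short-term regulation intact.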
3. Functional Consequences of Simple Synaptic Regulation
3.1 Modulation of Input–Output Pathways
The functional consequences of simple forms of synaptic regulation can be relatively straightforward in circuits where there is a relatively uniform direction of information flow from input to output. For example, as described above, decreases in synaptic strength at Aplysia sensorimotor connections contribute to decreases in the siphon-withdrawal response (e.g., habituation), and increases in synaptic strength contribute to increases in the siphon-withdrawal response (e.g., sensitization and classical conditioning).
3.2 Reconfiguration of Multifunctional Networks
In other instances, however, even relatively simple forms of synaptic change can have more complex functional consequences. Many neural networks (or their components) are multifunctional networks, participating in more than one behavior. Because the operation of such networks depends on interactions among multiple non-linear processes at the cellular, synaptic, and network levels, regulation of synaptic efficacy can profoundly alter network operation, reconfiguring the network into functionally distinct modes.
3.2.1 Escape swimming in Tritonia. One interesting example is the reconfiguration of a premotor network from a resting reflex withdrawal circuit into a rhythmic central pattern generator (CPG) swim circuit in the marine mollusk Tritonia (Katz and Frost 1996). None of the three major interneuron types in this circuit have endogenous bursting properties; the oscillation of the network instead arises from, and hence depends upon, their synaptic interactions. The dorsal swim interneurons (DSIs) are essential elements of the CPG circuit for escape swimming. DSI activity contributes directly to the generation of the swim motor program; in addition, it rapidly enhances the synaptic efficacy and excitability of a second interneuron, C2. C2 has been implicated in the reconfiguration mechanism; its activity disinhibits the DSI neurons, enabling rhythmic dorsal flexions. Thus, the rapid enhancement of C2 contributions may functionally rewire the network, allowing the reconfiguration from simple reflex withdrawal to rhythmic swimming to occur. Because the regulation of C2 synaptic efficacy develops and decays dynamically with DSI activity, it may also play a role in the cycle-by-cycle generation of the swim rhythm itself.
3.2.2 CPGs in lobster stomatogastric ganglion. The gastric mill CPG and pyloric CPG in the lobster stomatogastric ganglion are also polymorphic networks that can be functionally reconfigured to produce fundamental variations in output under different circumstances (Harris-Warrick and Marder 1991). Such alterations arise both from regulation of synaptic efficacy in the circuit and from changes in the intrinsic properties of the component neurons. For example, stimulation of the anterior pyloric modulator (APM) neuron activates the pyloric motor pattern by enhancing the plateau potential capabilities of pyloric CPG neurons. The plateau maintains neurons in a depolarized state and so allows them to fire repetitively. As a consequence, a brief excitatory synaptic input that was previously ineffective can trigger a transition of the neuron to the tonically active plateau condition. In the plateau state, neurons also actively resist hyperpolarization, thus rendering inhibitory inputs less effective. The combined enhancement of synaptic excitation and diminution of inhibition results in a more active network. Modulatory transmitters produce bursting pacemaker potentials in the anterior burster (AB) neurons, the most important pacemaker of the pyloric CPG. Synaptic interactions within the stomatogastric ganglion are mediated partly by non-spiking release, which varies continuously with membrane potential. Hence, larger bursting potentials enhance transmitter release from AB terminals, increasing AB inhibition of certain postsynaptic CPG targets and removing them from the functional circuit, thereby altering the pyloric network and rhythm. Because different modulatory amines exert different effects on AB bursts, the anatomical circuit can be sculpted in a variety of ways to yield a number of different functional networks.
4. Qualitative Regulation of Synaptic Efficacy
The decreases and increases in synaptic efficacy described above involve primarily quantitative modifications of synaptic strength, rendering synapses either more or less potent. However, regulation of synaptic efficacy can also involve a number of qualitative changes in synaptic transmission, thus extending the range of adaptive mechanisms that the nervous system utilizes.
4.1 Regulation of Transmitter Type
An important additional form of synaptic regulation involves changes in the type, rather than the amount, of transmitter that a neuron releases. Such changes can arise either from shifts in the relative release of different co-transmitters resulting from different rates and patterns of neural firing, or from changes in transmitter synthesis (Simerly 1990). Release of one co-transmitter or another can produce qualitative changes in the overall functional effect of synaptic transmission on a particular postsynaptic target (e.g., from excitation to inhibition), rather than simple quantitative changes in a pre-existing synaptic response. Regulation of the transmitter type can also effectively add or remove a particular postsynaptic pathway from operation, thus allowing functional rewiring of the neural circuit without the need for anatomical reorganization. In principle, similar molecular switching might be accomplished by regulation of the relative number or sensitivities of different postsynaptic receptors (see Long-term Potentiation (Hippocampus)).
4.1.1 Lobster stomatogastric ganglion. Changes in co-transmitter release affect the pyloric CPG in the lobster stomatogastric ganglion. The inferior ventricular nerve (IVN) cells evoke both excitatory and inhibitory responses in the AB neurons, the most important pacemaker of the pyloric CPG, as described above. At low frequencies of IVN activation, one (unknown) excitatory transmitter is released, and the frequency of the pyloric rhythm increases.
In contrast, at higher frequencies, the inhibitory transmitter histamine is preferentially released, and the pyloric rhythm is disrupted. In a variation on this theme, the neuromodulatory transmitter dopamine changes the sign of the synaptic connection between certain identified neurons in the CPG from inhibitory to excitatory by reducing the efficacy of the inhibitory synapse and uncovering a weak electrical connection.
4.1.2 Feeding in Aplysia. Differential release of co-transmitters may also provide for intrinsic self-regulation of a network. For example, the motor neurons in the feeding circuit of Aplysia use a variety of peptide co-transmitters that modulate neuromuscular transmission, extending the dynamic range over which the muscles can operate effectively. High-frequency activation of motor neuron B15 enhances the co-release of small cardioactive peptide B. This neuromodulatory peptide increases the amplitude of muscle contraction and shortens the relaxation time, thereby enabling larger muscle contractions to be driven at a higher rate. Similarly, in the leech, the myogenic heart muscle is innervated by heart excitatory motor neurons that co-release acetylcholine and FMRFamide, a modulatory peptide that regulates amplitude and duration of heart beat. Amines also modulate the amplitude of potentials at many crustacean neuromuscular junctions.
4.2 Regulation of Synaptic Plasticity: Metaplasticity
An additional form of synaptic regulation, recently termed ‘metaplasticity’ (Abraham and Bear 1996), is manifested as a modification of the ability of the synapse to undergo changes in synaptic efficacy, rather than as a change in synaptic efficacy per se. For example, long-term potentiation (LTP) not only increases the population spike evoked in dentate granule cells of mammalian hippocampus by afferent stimulation, it also reduces non-linear response properties so that synaptic efficacy is less affected by the recent history of activity at that synapse. LTP thus increases the fidelity as well as the gain of the system (Berger and Sclabassi 1985). Regulation of more long-lasting forms of synaptic plasticity also occurs at hippocampal synapses, involving, for example, shifts in the tendency of synapses to exhibit LTP or long-term depression (LTD). Metaplasticity has recently been implicated at identified inhibitory synapses of the siphon-withdrawal circuit of Aplysia (Fischer et al. 1997).
The inhibitory interneuron L30 exhibits multiple forms of synaptic enhancement that control the gain of the siphon-withdrawal reflex, including frequency facilitation during initial phases of high-frequency activation and synaptic augmentation and potentiation that persist after activation. Tail shock reduces the baseline strength of L30 connections onto L29 cells; it also attenuates augmentation/potentiation while leaving frequency facilitation relatively intact. Behaviorally, tail shock decreases the inhibition normally produced by cutaneous stimuli, perhaps by reducing augmentation of L30 inhibition, thus suggesting a functional role for metaplasticity within a defined neural circuit.
5. Regulation of Synaptic Efficacy and of Other Neural Properties
Although not the explicit subject of this article, changes in neural excitability and other intrinsic neural properties can occur in addition to modifications of synaptic efficacy. Synaptic and non-synaptic changes can co-occur in the same neurons, and can be functionally and even mechanistically related. For example, both Aplysia mechanosensory cells and Hermissenda type B photoreceptors exhibit learning-related increases in excitability (increased firing to a given stimulus), as well as enhanced synaptic strength (Schuman and Clark 1994). These two modifications work in concert to increase both the number and the potency of the synaptic signals conveyed to their postsynaptic targets. Similarly, as described above, reconfiguration of the Tritonia escape swim CPG and the pyloric CPG in lobster stomatogastric ganglion involves both synaptic and non-synaptic forms of neural regulation. One important distinction between regulation of synaptic efficacy and of other neural properties is the greater specificity that synaptic changes may exhibit. Because any given neuron may form pre- or postsynaptic contacts with thousands of other cells, the ability to modify only selected connections of a neuron confers a high degree of precision in the modification of neural pathways. In contrast, intrinsic changes such as enhanced excitability are often expressed cell-wide or nearly cell-wide, enhancing responses to virtually all inputs and enhancing virtually all outputs. In Aplysia siphon sensory cells, several short-term forms of synaptic plasticity, including heterosynaptic facilitation, heterosynaptic inhibition, and Hebbian facilitation, exhibit synapse specificity (Clark and Kandel 1984, Clark 2001). Synapse specificity is perhaps not surprising in short-term regulation, which involves covalent modifications of pre-existing proteins. However, long-term heterosynaptic facilitation, which depends on regulation of transcriptional and translational events, is also in part synapse-specific (Clark 2001, Martin et al. 1997).
Synapse-specific long-term facilitation raises the intriguing question of how new gene products are preferentially utilized at facilitated synapses, compared with other synapses of the same neuron. An emerging answer, consistent with comparable findings in hippocampal LTP, is that synapse-specific long-term facilitation involves synaptic tagging and depends in part on local protein synthesis. According to this hypothesis, relatively short-lived, local regulatory events at treated synapses allow the subsequent capture of new gene products that are shipped cell-wide, thus conferring the specificity. Irrespective of the particular mechanistic details, long-term synapse-specific facilitation provides an additional illustrative example of the rich interplay and elegance of the multiple biological processes involved in synaptic regulation.
See also: Associative Modifications of Individual Neurons; Ion Channels and Molecular Events in Neuronal Activity; Memory: Synaptic Mechanisms; Synaptic Transmission
Bibliography
Abraham W C, Bear M F 1996 Metaplasticity: The plasticity of synaptic plasticity. Trends in Neurosciences 19: 126–30
Berger T W, Sclabassi R J 1985 Nonlinear systems analysis and its application to study of the functional properties of neural systems. In: Weinberger N M, McGaugh J L, Lynch G (eds.) Memory Systems of the Brain. Guilford Press, New York
Castellucci V, Kandel E R 1976 Presynaptic facilitation as a mechanism for behavioral sensitization in Aplysia. Science 194: 1176–8
Clark G A 2001 The ins and outs of classical conditioning: Maxims from a mollusk. In: Steinmetz J E, Gluck M, Solomon P (eds.) Model Systems and the Neurobiology of Associative Learning. Erlbaum, Hillsdale, NJ
Clark G A, Kandel E R 1984 Branch-specific facilitation in Aplysia siphon sensory cells. Proceedings of the National Academy of Sciences USA 81: 2577–81
Fischer T M, Blazis D E J, Priver N A, Carew T J 1997 Metaplasticity at identified inhibitory synapses in Aplysia. Nature 389: 860–5
Glanzman D L 1995 The cellular basis of classical conditioning in Aplysia californica—It’s less simple than you think. Trends in Neurosciences 18: 30–6
Harris-Warrick R M, Marder E 1991 Modulation of neural networks for behavior. Annual Review of Neuroscience 14: 39–57
Hawkins R D, Abrams T W, Carew T J, Kandel E R 1983 A cellular mechanism of classical conditioning in Aplysia: Activity-dependent amplification of presynaptic facilitation. Science 219: 400–5
Hebb D O 1949 The Organization of Behavior. Wiley, New York
Kandel E R, Pittenger C 1999 The past, the future and the biology of memory storage. Philosophical Transactions of the Royal Society London B 354: 2027–52
Kandel E R, Schwartz J H 1982 Molecular biology of learning: Modulation of transmitter release. Science 218: 433–43
Katz P S, Frost W N 1996 Intrinsic neuromodulation: Altering neuronal circuits from within. Trends in Neurosciences 19: 54–61
Martin K C, Casadio A, Zhu H E-Y, Rose J C, Chen M, Bailey C H, Kandel E R 1997 Synapse-specific, long-term facilitation of Aplysia sensory to motor synapses: A function for local protein synthesis in memory storage. Cell 91: 927–38
Montarolo P G, Goelet P, Castellucci V F, Morgan J, Kandel E R, Schacher S 1986 A critical period for macromolecular synthesis in long-term heterosynaptic facilitation in Aplysia. Science 234: 1249–54
Schuman E M, Clark G A 1994 Synaptic facilitation at connections of Hermissenda type B photoreceptors. Journal of Neuroscience 14: 1613–22
Simerly R B 1990 Hormonal control of neuropeptide gene expression in sexually dimorphic olfactory pathways. Trends in Neurosciences 13: 104–10
G. A. Clark
Synaptic Transmission
As explained in detail by J. C. Eccles in 1964, nerve cells communicate via specialized regions of close juxtaposition that were first named ‘synapses’ by C. S. Sherrington, who used this term to denote the anatomical region where the axon terminal of one neuron contacts another neuron. Sherrington always used synapse in its functional sense (for example, to account for the one-way conduction in the reflex arc). However, it must now be considered from an anatomical perspective as well since, despite variations, there are common structural features consistent with some of the physiological properties of synapses. For example, given that there is a high resistance gap of 200 Å width separating neuronal membranes at most synapses, as first claimed by Ramón y Cajal, current generated by one cell cannot spread significantly to the next one. Thus, it is advantageous for postsynaptic terminals to utilize a chemical mediator to secure transmission. This concept was only established after decades of controversy and efforts to distinguish between electrical and chemical models, for both excitation and inhibition. Yet, although it is beyond the scope of this overview, electrotonic interactions between nerve cells, in the periphery and in the brain, can indeed occur, due to either (a) areas of continuity between them via gap junctions (Bennett 1977); or (b) the channeling of currents in adjacent cells by an unusually high external resistance. The second case is called ephaptic or, better stated, field effect transmission (Faber and Korn 1989). These two electrical modes, which should not be confused, allow signaling to occur instantaneously, and are well suited for the mediation of rapid behaviors, such as escape. In contrast, the hallmark of chemical synapses is the irreducible delay (≈1 msec) between the incoming stimulus and the postsynaptic response, which reflects activation of the presynaptic machinery involved with transmitter release (Katz 1969, Llinas 1982).
Chemical synapses, however, have remarkable adaptive advantages, such as: (a) variable gain; (b) a large repertoire of polarity and kinetics, resulting in excitations and inhibitions of different durations, depending upon their specific chemistry; and (c) the ability to store information on a long-term basis, due to plastic properties which have become a paradigm for studies of learning and memory.
1. Integrative Properties of Neurons and Synaptic Correlates
Since any behavior is ultimately produced by combined activity in neuronal assemblies, it is of critical interest to understand how each individual neuron reacts to incoming signals initiated in cells connected to it through synaptic junctions, as in a simple reflex chain (Fig. 1A). In fact, information is distributed widely in the nervous system, such that a single neuron may both have many targets and receive inputs from numerous presynaptic axonal terminals, according to the principles of divergence and convergence, respectively. Recording electrodes inserted in the various
Figure 1 Integrative properties of neurons and mode of generation of action potentials
parts of a neuron (soma, dendrite, axon), in combination with anatomical studies of the synapses contacting these loci, have revealed electrical signals whose polarity reflects their function. The signals are named excitatory postsynaptic potentials (EPSPs) and inhibitory postsynaptic potentials (IPSPs). Due to nerve cell cable properties, which are determined by the membrane resistivity and capacitance, signals generated at a distance from the axon are attenuated there, but nevertheless contribute to the summation of responses at the so-called trigger zone (Fig. 1B), which decides whether or not some information will be passed on to the next neuronal relay. This essential integrative property of the cell can be quite complex: there may be multiple trigger zones (Llinas et al. 1969), and summation is not always algebraic (it can be non-linear). Yet, whatever the rules are in a given neuronal subtype, the tonic and phasic bombardment of a cell by incoming inputs can only bring it to such a decision when excitation dominates inhibition and reaches a critical value, the threshold for generating an action potential (Fig. 1C) that will propagate along the axon towards the next synaptic relay. It can already be noted that reaching threshold is the result of the
collective activity of large populations of inputs, each of which may, however, act randomly, as will be described below. This cooperative nature of the nervous system pertains whether it is viewed from a molecular, synaptic, network, or behavioral perspective.
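The integration described above can be sketched numerically. The following is a minimal illustration, not a biophysical model: it assumes purely algebraic (linear) summation, which the text notes is not always the case, and a steady-state exponential attenuation with distance. The function names, the 200 µm length constant, the resting potential of -70 mV and the threshold of -55 mV are all illustrative assumptions.

```python
import math

def attenuated(amplitude_mv, distance_um, length_constant_um=200.0):
    """Amplitude seen at the trigger zone, assuming exp(-d / lambda) attenuation."""
    return amplitude_mv * math.exp(-distance_um / length_constant_um)

def reaches_threshold(inputs, resting_mv=-70.0, threshold_mv=-55.0):
    """inputs: list of (amplitude_mv, distance_um); EPSPs are positive, IPSPs negative.

    Returns (fires, summed_mv), where summed_mv is the algebraic sum of the
    attenuated potentials at the trigger zone.
    """
    summed = sum(attenuated(a, d) for a, d in inputs)
    return resting_mv + summed >= threshold_mv, summed

# Six EPSPs of 4 mV generated 50 um away, with and without two nearby 3 mV IPSPs.
epsps = [(4.0, 50.0)] * 6
ipsps = [(-3.0, 20.0)] * 2
fires, depol = reaches_threshold(epsps)
blocked, depol_inh = reaches_threshold(epsps + ipsps)
print(fires, round(depol, 2))        # excitation alone
print(blocked, round(depol_inh, 2))  # excitation opposed by inhibition
```

With these assumed values, excitation alone crosses threshold, whereas the same excitation opposed by two proximal IPSPs does not, which is the sense in which excitation must dominate inhibition.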
2. What Synapses Have in Common The meaning of the term ‘synapse’ has evolved over time, in parallel with the development of the morphological and physiological tools available for studies of the brain. Today, its strict definition requires electron microscopic demonstration of ultrastructural features associated with chemical transmission. A comparison of central synapses with the neuromuscular junction (nmj), which was the Holy Grail of pioneering studies, allows one to recognize their basic unity. At the frog nmj, the terminals of the motor axon occupy the end plate, a specialized depression in the muscle membrane, which forms folds. Opposite the crest of each fold, where transmitter receptors are the
Figure 2 Presynaptic and postsynaptic components of quantal transmission
most abundant, the axon terminal contains dense bands and spherical organelles that package 5,000–10,000 molecules of the transmitter acetylcholine (Kuffler and Yoshikami 1975) and are called synaptic vesicles. It was later demonstrated that the opening of synaptic vesicles (exocytosis) takes place at the borders of the bands, which therefore were designated active zones (Couteaux and Pecot-Dechavassine 1970). An equivalent term for physiologists is ‘release site,’ as vesicle openings were observed there (Heuser et al. 1974). The motor end plate can thus be considered as a number of synaptic units, each represented by a dense band with its associated vesicles and a postsynaptic density (for review, see Peters et al. 1991).
Synapses in the central nervous system (CNS) have more restricted apposition zones. They most often consist of synaptic boutons (Fig. 2A, B) that contain the vesicles and are separated from the postsynaptic element by the 10–20 nm wide cleft. The plasma membranes on both sides exhibit marked densities which, together with the assembly of vesicles, are referred to as a synaptic complex (Palay 1958). Here the analog of the active zone is the presynaptic grid, formed by electron-opaque densities called presynaptic dense projections, which are arranged in triangular arrays and, in en face view, take the form of an ‘egg box.’ This structure is postulated to ‘guide’ the vesicles towards the plasma membrane (Pfenninger et al. 1972). As at the nmj, the postsynaptic membrane facing this release apparatus contains the receptors which, however, are restricted to localized clusters (Triller et al. 1985). Thus, as further shown by physiological studies, the basic functional unit of synaptic transmission is a complex containing the presynaptic active zone and its neighboring synaptic vesicles, together with the aggregate of ligand-gated receptors in the opposing postsynaptic membrane. It thus corresponds to the synaptic complex. At either type of chemical synapse, the depolarization of the terminals associated with a presynaptic action potential opens voltage-sensitive Ca²⁺ channels, with the consequent Ca²⁺ influx initiating a cascade of molecular events that can culminate in the opening of a vesicle fused to the plasma membrane at the release site and the rapid discharge of its contents into the cleft (Fig. 2B). The postsynaptic receptors are thus exposed to a high concentration of transmitter, on the order of mM, albeit briefly (Fig. 2C, double arrow). Consequently, the ligand-gated channels open nearly synchronously, allowing specific ions to flow down their electrochemical gradient. However, as each individual opened channel in the postsynaptic receptor matrix closes randomly, the postsynaptic current, which is the sum of all individual channel activity, decays exponentially. This ‘population’ phenomenon is illustrated in Fig. 2C, where the summed current rises rapidly and relaxes slowly, a time course typical of those at the so-called fast synapses used by the brain for information processing. At the nmj, the contents of one vesicle open about 1000 receptor-channels, while at central synapses the number can be much smaller.
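The ‘population’ decay just described can be illustrated with a short simulation. This is a sketch under simplifying assumptions that are not in the original text: all channels open at time zero, each closes independently after an exponentially distributed open time, and the mean open time `tau_ms` of 1 ms is a hypothetical value. The number of channels follows the figure of about 1000 quoted for the nmj.

```python
import math
import random

def open_channel_counts(n_channels, tau_ms, times_ms, seed=0):
    """Count channels still open at each time after synchronous opening."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Each channel closes independently after an exponentially distributed open time.
    close_times = [rng.expovariate(1.0 / tau_ms) for _ in range(n_channels)]
    return [sum(1 for c in close_times if c > t) for t in times_ms]

times = [0.0, 1.0, 2.0, 4.0]
counts = open_channel_counts(n_channels=1000, tau_ms=1.0, times_ms=times)
expected = [1000 * math.exp(-t / 1.0) for t in times]
for t, c, e in zip(times, counts, expected):
    print(f"t={t:.1f} ms  open={c}  exponential prediction={e:.0f}")
```

Although each channel closes at a random moment, the count of open channels, and hence the summed current if every channel carries the same unitary current, tracks the exponential prediction closely.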
One should note that in the CNS the current flow through synaptic channels can be either inward (i.e., directed towards the cell interior) and excitatory, or outward and inhibitory, depending upon the selectivity of the channel, i.e., which ion(s) it allows to pass. Furthermore, in most neuronal systems, glycine and γ-aminobutyric acid (GABA) are inhibitory, and glutamate and acetylcholine are excitatory, although a given transmitter can have either function, depending upon the type of channel associated with that transmitter’s receptor (Hille 1984).
3. Why Synaptic Potentials Fluctuate in Integer Steps With a recording electrode placed at the nmj, Katz and colleagues made two related discoveries. First, at rest, spontaneous potentials of similar amplitudes occur randomly. Second, when the nerve is stimulated repeatedly, the resulting evoked endplate potentials fluctuate in amplitude in a steplike manner from trial to trial. Furthermore, amplitude histograms show that the apparently discrete size classes are integer (1, 2, 3, … n) multiples of a basic unit which corresponds in time course and average size to the spontaneous event
(del Castillo and Katz 1954). These elementary ‘miniature’ endplate potentials were thus called ‘quanta,’ since they are representative of the basic unit of the evoked response. Remarkably, the underlying statistics of transmitter release, which account for the variable number of quanta issued after each impulse (based on experiments conducted in a low concentration of extracellular calcium, [Ca²⁺]₀), satisfied a Poisson process. Release occurred with statistical independence at any one of a large number, n, of ‘quantal units available for release’ at the junction, with a low release probability (p) that was identical for each unit. Finally, it was postulated that each quantum corresponded to the action of the transmitter packaged in one synaptic vesicle, and since n is large in the Poisson equation, it was assumed to represent the total number of vesicles ready to be released (Katz 1969). Despite difficulties inherent to the analysis of signals recorded from brain cells, such as the inability to relate any miniature response to one of the numerous junctions impinging on the neuron being investigated, and the low signal-to-noise ratio, the conclusion illustrated in Fig. 3A, namely that responses evoked by activating a single presynaptic neuron are quantal, was also reached for central synapses (Korn and Faber 1991, Redman 1990, see also Jonas et al. 1993). However, a statistical formulation other than the Poisson most often provides a better description of the amplitude fluctuations. Here the statistics are either a simple binomial model, where n is finite and p is relatively large, or a compound binomial, where p may be different at individual releasing units (see also below). Regardless of the mathematical details (Martin 1977 and see below), n is smaller than originally thought.
Also, in experiments combining physiological recordings with anatomical reconstructions of the cells studied, the value derived for this parameter turned out to be equivalent to the number of active zones or release sites in the activated presynaptic terminals. The result led to the ‘one vesicle’ hypothesis, according to which a presynaptic action potential causes the exocytosis of the contents of at most one vesicle at each active zone (reviewed in Korn et al. 1990). Despite controversies about multivesicular release, for example during development, this concept is still commonly used for interpreting experimental data (see Stevens 1993).
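The contrast between the Poisson and simple binomial descriptions can be made concrete with a few lines of code. This is a sketch using the standard textbook forms of the two distributions; the particular values (m = 2, n = 4 with p = 0.5, n = 200 with p = 0.01) are chosen for illustration and are not taken from any experiment.

```python
import math

def binomial_pmf(n, p):
    """Probability of releasing x = 0..n quanta from n sites with identical p."""
    return [math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

def poisson_pmf(m, x_max):
    """Poisson probability of releasing x = 0..x_max quanta with mean m."""
    return [math.exp(-m) * m**x / math.factorial(x) for x in range(x_max + 1)]

# Same mean quantal content (m = n * p = 2) in all three cases.
few_sites = binomial_pmf(4, 0.5)      # small n, large p (central-synapse-like)
many_sites = binomial_pmf(200, 0.01)  # large n, small p (low-calcium nmj-like)
poisson = poisson_pmf(2.0, 4)

print("P(failure): n=4   ", round(few_sites[0], 4))
print("P(failure): n=200 ", round(many_sites[0], 4))
print("P(failure): Poisson", round(poisson[0], 4))
```

When n is large and p small the binomial is nearly indistinguishable from the Poisson, whereas with only a few release sites and a high p the failure rate departs markedly from the Poisson prediction.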
4. Consequences of Quantal Transmission At the skeletal nmj, neurotransmission in normal Ca²⁺ has a high safety factor (large n and p), such that a presynaptic impulse always leads to muscle contraction. On the other hand, the full implications and richness of the quantal nature, and hence uncertainty, of transmitter release are better exploited, albeit still puzzling, at central synapses. From a functional point
Figure 3 Binomial model of transmitter release (labels recoverable from the figure: n = 4 active zones; example responses of 1q and 3q; mean np; if p = 0.5, np = 2)
of view, the most novel feature is that communication across central synapses is not guaranteed systematically even when p is relatively high: after each impulse, a synapse flips a coin. This probabilistic aspect is exemplified in Figs. 3A and 3B, which stress that each synaptic unit operates in an ‘all or none’ manner. They also show that over a large number of trials the average response amplitude equals the product n · p · q, where n and p are presynaptic parameters, and q, the size of a quantum, is proportional to the number of channels opened by a single exocytotic event. This product defines the so-called synaptic weight or strength of a connection between two neurons, which is a dynamic term, as it can be modulated pre- or postsynaptically. Indeed, persistent increases in synaptic weights are considered to underlie higher-order processes such as learning and memory, from invertebrates to primates. For example, activity-dependent long-term potentiation (LTP; Bliss and Collingridge 1993) in mammalian hippocampus, which is considered a well-suited candidate for a memory mechanism, can result from an increase in any one of the three quantal parameters (Stevens 1993). The reverse form of plasticity is long-term depression (LTD; Bear and Abraham 1996), first described in the cerebellum, which may be a cellular correlate of erasing memory traces. It presumably represents a second mechanism for allowing neurons to respond preferentially to selected subsets of their inputs. Since a neuron may receive information from thousands of junctions, the binary mode of transmission illustrated in Fig. 2A may be better suited for computational purposes than having a graded process at each site, which would lead to excess or redundant information. Along this line, a synapse can be viewed metaphorically as a symbolic communication system
with, according to the scheme of Shannon and Weaver, the encoding process localized to the release site. Knowledge of p and n then allows calculation with the classical entropy equation of the amount of information, in bits, at a connection between neurons. For example, with a given value of p, the increased reliability of neuronal connections with larger ns is counterbalanced by a decreased number of bits per quantum (Korn and Faber 1987).
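The two quantities discussed in this section, the mean response n · p · q and the information content of a connection, can be sketched as follows. The Shannon entropy of the binomial release distribution is standard; dividing it by the mean quantal content np to obtain ‘bits per quantum’ is an assumption made here for illustration and may not match the exact normalization used by Korn and Faber (1987). All function names are hypothetical.

```python
import math

def binomial_pmf(n, p):
    """Probability of releasing x = 0..n quanta from n sites with identical p."""
    return [math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

def mean_response(n, p, q):
    """Average response amplitude over many trials: the product n * p * q."""
    return n * p * q

def entropy_bits(pmf):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(px * math.log2(px) for px in pmf if px > 0)

def bits_per_quantum(n, p):
    """Entropy divided by mean quantal content (an assumed normalization)."""
    return entropy_bits(binomial_pmf(n, p)) / (n * p)

print(mean_response(4, 0.5, 1.0))  # n = 4 and p = 0.5 give np = 2, as in Figure 3
for n in (1, 4, 16):
    print(n, round(entropy_bits(binomial_pmf(n, 0.5)), 3),
          round(bits_per_quantum(n, 0.5), 3))
```

Under this normalization the total entropy grows with n while the bits per quantum fall, in line with the trade-off between reliability and information per quantum stated in the text.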
5. Diversity of Synapses Central synapses exhibit a great deal of structural and functional heterogeneity, which is built upon the scaffolding described above. For example, the transmitter at adjacent contacts onto a given postsynaptic cell could be an amino acid such as glycine; acetylcholine; a biogenic amine such as serotonin or dopamine; or a peptide such as somatostatin, substance P, or Vasoactive Intestinal Peptide (VIP). Furthermore, combinations of these substances may be packaged in one terminal ending (Hokfelt et al. 1984). The consequences of this colocalization and of the cotransmission that is often associated with it are not yet resolved, particularly in puzzling cases where two classical transmitters with the same function can be released together (Jonas et al. 1998). In addition, there are multiple receptor subtypes for most transmitters, and more than one may be colocalized to the same postsynaptic locus (Somogyi 1998). Some are ionotropic, with binding causing channel opening or closing, while others are metabotropic, which means that the ligand-bound receptor activates an intracellular metabolic cascade which can modulate kinases or channel activity. In fact, listing the differences between synapses is almost an insurmountable task. One would first have to consider constitutive variations in pre- or postsynaptic parameters, namely the probability of release and the kinetics and conductances of single channels. The mechanisms for limiting the duration of action of a transmitter, such as its diffusion in the synaptic cleft, its uptake by activated terminals or glia, its chemical inactivation, and receptor desensitization once the channels have opened, add another layer of complexity (Johnston and Wu 1995). One then would have to factor in the synergistic or antagonistic effects of the growing inventory of modulators (Kaczmarek and Levitan 1987), which can influence these various functions in normal and pathological conditions, as well as during development. This diversity may confer a higher degree of functional uniqueness to individual synapses and neurons than is commonly recognized. In any case, it influences the richness of animal behaviors and higher functions which result from activity in complex neuronal assemblies that are synchronized via their synaptic connectivity.
Some elements of synaptic diversity and their functional implications deserve mention. First, although most junctions are fully operational in the normal brain, some of them may remain silent, either because a presynaptic action potential does not evoke transmitter release, or because the postsynaptic receptors are unresponsive (Fig. 4A). These may be viewed as ‘latent’ rather than ‘silent’ synapses, because they may be turned on in specific circumstances, such as during learning (Malenka and Nicoll 1997). Second, by virtue of the close packing of synapses, the transmitter released at one of them can diffuse laterally and interact with the postsynaptic receptors of its neighbors (Fig. 4B). This spillover may violate the principle of independence of individual junctions and result in synergistic interactions (Korn et al. 1990), and it may contribute to phenomena such as LTP. Finally, synaptic strength can vary on a millisecond timescale when a presynaptic neuron discharges repetitively (Fig. 4C), and this plasticity can be expressed as either a facilitation or a depression (Johnston and Wu 1995, Markram and Tsodyks 1996). Both processes are presynaptic in origin, due to a corresponding change in the release probability, and typically meet the functional requirements of the related networks. For example, facilitation may help to recruit neuronal targets that are otherwise weakly excited.
Figure 4 ‘Latent’ synapses and mechanisms of short-term modifications of synaptic efficacy
6. Quantal Analysis and its Limitations
A statistical dissection of postsynaptic electrical recordings, based on the quantal hypothesis, remains an unsurpassed tool for understanding how different classes of nerve cells communicate with each other. The models used, and their corresponding mathematical formulations, are the same as those used in game theory and many branches of the social, behavioral and political sciences; that is, the laws of chance. Stated otherwise, the release of a packet of transmitter after an impulse is equivalent to rolling dice. Despite their apparent complexity, these laws are quite straightforward. For all models, the average number of units released over a large series of trials, which is called mean quantal content or m, is
$$m = p_1 + p_2 + \dots + p_n = \sum_{i=1}^{n} p_i$$
where $p_i$ is the release probability at site $i$. This is the essence of the compound binomial release model, which encompasses all others that assume independent probabilities of release.
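A compound binomial model with independent, site-specific release probabilities can be evaluated directly, without enumerating every combination of releasing sites. The sketch below builds the distribution one site at a time; this dynamic-programming approach is a convenience chosen here and is not described in the original text, and the probabilities used are arbitrary illustrative values.

```python
def compound_binomial_pmf(probs):
    """Distribution of the number of quanta released, given per-site probabilities.

    probs[i] is the release probability at site i; sites are assumed independent.
    Returns a list whose entry x is the probability that exactly x sites release.
    """
    pmf = [1.0]  # with no sites, zero quanta are released with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for x, mass in enumerate(pmf):
            nxt[x] += mass * (1 - p)  # this site fails
            nxt[x + 1] += mass * p    # this site releases one quantum
        pmf = nxt
    return pmf

def mean_quantal_content(probs):
    """Mean quantal content m as the sum of the per-site release probabilities."""
    return sum(probs)

site_probs = [0.2, 0.5, 0.9]  # hypothetical, non-uniform release probabilities
pmf = compound_binomial_pmf(site_probs)
mean_from_pmf = sum(x * px for x, px in enumerate(pmf))
print([round(px, 3) for px in pmf])
print(mean_quantal_content(site_probs), round(mean_from_pmf, 6))
```

The mean computed from the full distribution agrees with the sum of the site probabilities, and setting all probabilities equal recovers the simple binomial as a special case.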
If there were no ambiguities about quantal size, q, i.e., the magnitude of the postsynaptic response to a quantum, amplitude distribution histograms of evoked responses would have discrete peaks corresponding to transmission failures, single quantum responses, doublets and so on, and these distributions could be described with one of the quantal models. For a compound binomial:
$$\begin{aligned} p_x ={} & p_1 p_2 \cdots p_x (1-p_{x+1}) \cdots (1-p_n) \\ & + p_1 p_3 \cdots p_{x+1} (1-p_2)(1-p_{x+2}) \cdots (1-p_n) \\ & + \cdots \\ & + (1-p_1) \cdots (1-p_{n-x})\, p_{n-x+1} \cdots p_n, \end{aligned}$$
where each line represents one way to get x successes and (n − x) failures. For example, in the first line, units 1 to x release quanta, whereas the other units (x + 1 to n) do not. The probability of failure for quantal unit i is $(1 - p_i)$. For a simple binomial:
$$p_x = \binom{n}{x} p^x (1-p)^{n-x}, \quad \text{where} \quad \binom{n}{x} = \frac{n!}{(n-x)!\,x!}$$
gives the number of combinations by which x quanta can be released out of an available pool of size n. For a Poisson:
$$p_x = \frac{e^{-m} m^x}{x!}, \quad \text{with } m = np.$$
As developed elsewhere (Martin 1977), most investigators have relied on the two latter models and on indirect methods derived from them rather than trying to fit the distributions. These methods involve either counting the number of failures, $n_0$, in N trials or measuring the coefficient of variation, CV, of the amplitude distribution, since according to a Poisson:
$$m = -\ln(n_0/N) \ \text{(method of failures)} \ = 1/\mathrm{CV}^2 \ \text{(coefficient of variation method)}.$$
In rare cases, m can be determined directly, by counting quanta or by averaging miniature events arising from the same synapses. Corresponding expressions allow the last two methods to be applied to a binomial process if p is finite (see Martin 1977). Also, since the binomial $\mathrm{CV} = ((1-p)/np)^{1/2}$ and is independent of q, it has been used as a tool for identifying the pre- vs. postsynaptic locus of a change in synaptic
strength, but this approach is liable to misinterpretations (Faber and Korn 1991). Correct application of these tools requires the accurate identification of the quantal response, which is often difficult at central connections due to quantal variance, distortion by cable properties, and instrumental noise. As a consequence, discrete peaks in amplitude histograms are blurred and failures, single quantal responses, doublets, etc., cannot be distinguished. Various fitting algorithms have been developed, with some success (reviewed in Korn and Faber 1987; see also Stricker et al. 1994), but their limitations leave some questions unresolved. One is the size of the quantum, which is related to the number of channels opened by an exocytotic event, and to their unitary conductance, and can be quite small at central contacts (Finkel and Redman 1983, Edwards et al. 1990). Another is that at some junctions there may be a mismatch between the large number of transmitter molecules released and the relatively few receptors available postsynaptically, which led to the suggestion of a saturated response (Jack et al. 1981). If so, the principles of quantal analysis would no longer hold, as quantization would be imposed post- rather than presynaptically. A related issue is the difficulty in determining the pre- or postsynaptic locus of the increased synaptic weight that occurs during LTP, using quantal analysis alone. Raging controversies have not yet been entirely resolved, and mechanisms on both sides of the synapse have been invoked (see Stevens 1993, Liao et al. 1995, Isaac et al. 1996). One of the most intriguing proposals is that although the induction process is postsynaptic, its maintenance can be expressed as an increased release probability, which requires signaling in the reverse direction across the synapse, by a retrograde messenger (reviewed in Bliss and Collingridge 1993).
This debate could be obscuring the essential notion of the unity of the synapse where pre- and postsynaptic regulatory factors could be implicated separately or in unison, depending upon functional requirements.
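The two indirect estimators of mean quantal content discussed in this section, the method of failures and the coefficient of variation method, are simple enough to state as code. The sketch below uses the Poisson forms m = −ln(n₀/N) and m = 1/CV², together with the binomial CV; the function names and the numerical example are illustrative assumptions.

```python
import math

def m_from_failures(n_failures, n_trials):
    """Method of failures under a Poisson model: m = -ln(n0 / N)."""
    return -math.log(n_failures / n_trials)

def m_from_cv(cv):
    """Coefficient of variation method under a Poisson model: m = 1 / CV**2."""
    return 1.0 / cv**2

def binomial_cv(n, p):
    """CV of a simple binomial release process; it does not depend on q."""
    return math.sqrt((1 - p) / (n * p))

# Poisson-like junction with m = 2: about 13.5% of trials are failures.
print(round(m_from_failures(1353, 10000), 3))
print(round(m_from_cv(1 / math.sqrt(2.0)), 3))

# Binomial synapse with n = 4, p = 0.5 (true mean quantal content np = 2).
cv = binomial_cv(4, 0.5)
print(cv, m_from_cv(cv))
```

Applying the Poisson formula to the binomial example returns 4 rather than the true value of 2, since 1/CV² equals np/(1 − p) for a binomial process. This is one way in which the choice of model can bias the estimate, and why the binomial-specific expressions mentioned in the text are needed when p is not small.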
7. Other Avenues for Synaptic Studies Like all scientific fields, synaptic studies cannot remain confined to their historical roots, such as statistics, and need to be enriched by other approaches. The most direct is imaging of dynamic changes at the synapse, associated with the ability to tag pre- and postsynaptic structural components of activated junctions with fluorophores (Malgaroli et al. 1995). In addition, computer-aided modeling of single synapses, or of populations, is necessary to understand biophysical properties of vesicular exocytosis, the fate of the released molecules in the cleft, and their interactions with the postsynaptic receptors, all of which govern the size and kinetics of the postsynaptic response (Bartol et al. 1991, Kruk et al. 1997). Together with
Figure 5 Modulation of gene expression during learning (labels recoverable from the figure: enhancer, promoter, coding region, transcription, regulatory protein, RNA polymerase, mRNA, translation, protein, 5-HT, cAMP, cAMP-dependent protein kinase, learning)
anatomical correlates, these techniques should not only solve issues such as those related to single vs. multivesicular release and receptor saturation, but also deepen our understanding of brain development, and learning and memory, and how synapses are affected in pathological conditions. Along this line, it can already be envisioned how synapses contribute to genetic or acquired illnesses of behavior and cognition, including psychiatric disorders (Kandel 1998). Figure 5 illustrates the manner in which expression of a gene, leading to the production of a specific protein, could be altered following synaptic activation associated with learning. Conversely, genetic alterations can lead to significant psychiatric disorders, and it can be postulated that they do not leave synapses untouched. This example shows once more that synaptic studies represent a bridge between the neural sciences, cognitive psychology, and behavioral sciences. See also: Ion Channels and Molecular Events in Neuronal Activity; Long-term Potentiation (Hippocampus); Memory: Synaptic Mechanisms; Neural Plasticity; Neurons and Dendrites: Integration of Information; Neurotransmitters; Synapse Formation; Synapse Ultrastructure; Synaptic Efficacy, Regulation of
Bibliography
Bartol T M Jr., Land B R, Salpeter E E, Salpeter M M 1991 Monte Carlo simulation of miniature endplate current generation in the vertebrate neuromuscular junction. Biophysical Journal 59: 1290–307
Bear M F, Abraham W C 1996 Long-term depression in hippocampus. Annual Review of Neuroscience 19: 437–62
Bennett M V L 1977 Electrical transmission: A functional analysis and comparison to chemical transmission. In: Brookhart J M, Mountcastle V B, Kandel E R, Geiger S (eds.) Handbook of Physiology: The Nervous System. Williams and Wilkins, Baltimore, MD, pp. 357–416
Bliss T V, Collingridge G L 1993 A synaptic model of memory: Long-term potentiation in the hippocampus. Nature 361: 31–9
Couteaux R, Pecot-Dechavassine M 1970 Vesicules synaptiques et poches au niveau des zones actives de la jonction neuromusculaire. Comptes Rendus de l’Académie des Sciences 271: 2346–9
Del Castillo J, Katz B 1954 Quantal components of the end-plate potential. Journal of Physiology 124: 560–73
Eccles J C 1964 The Physiology of Synapses. Springer-Verlag, Berlin
Edwards F A, Konnerth A, Sakmann B 1990 Quantal analysis of inhibitory synaptic transmission in the dentate gyrus of rat hippocampal slices: A patch-clamp study. Journal of Physiology (London) 430: 213–49
Faber D S, Korn H 1989 Electrical field effects: Their relevance in central neural networks. Physiological Reviews 69: 821–63
Finkel A S, Redman S J 1983 The synaptic current evoked in cat spinal motoneurones by impulses in single group Ia axons. Journal of Physiology 342: 615–32
Heuser J E, Reese T S, Landis D M D 1974 Functional changes in frog neuromuscular junctions studied with freeze-fracture. Journal of Neurocytology 3: 109–31
Hille B (ed.) 1984 Ionic Channels of Excitable Membranes. Sinauer Associates, Sunderland, MA
Hokfelt T, Johansson O, Goldstein M 1984 Chemical anatomy of the brain. Science 225: 1326–34
Isaac J T, Hjelmstad G O, Nicoll R A, Malenka R C 1996 Long-term potentiation at single fiber inputs to hippocampal CA1 pyramidal cells. Proceedings of the National Academy of Sciences USA 93: 8710–15
Jack J J, Redman S J, Wong K 1981 The components of synaptic potentials evoked in cat spinal motoneurones by impulses in single group Ia afferents. Journal of Physiology (London) 321: 65–96
Johnston D, Miao-Sin Wu S (eds.) 1995 Foundations of Cellular Neurophysiology. MIT Press, Cambridge, MA
Jonas P, Bischofberger J, Sandkuhler J 1998 Corelease of two fast neurotransmitters at a central synapse. Science 281: 419–24
Jonas P, Major G, Sakmann B 1993 Quantal components of unitary EPSCs at the mossy fibre synapse on CA3 pyramidal cells of rat hippocampus. Journal of Physiology (London) 472: 615–63
Kaczmarek L K, Levitan I B (eds.) 1987 Neuromodulation: The Biochemical Control of Neuronal Excitability. Oxford University Press, Oxford, UK
Kandel E R 1998 A new intellectual framework for psychiatry. American Journal of Psychiatry 155: 457–69
Katz B (ed.) 1969 The Release of Neural Transmitter Substances. Charles C. Thomas, Springfield, IL
Korn H, Faber D S 1987 Regulation and significance of probabilistic release mechanisms at central synapses. In: Edelman G M, Gall W E, Cowan W M (eds.) Synaptic Function. A Neuroscience Institute Publication. Wiley, NY, pp. 57–108
Korn H, Faber D S 1991 Quantal analysis and synaptic efficacy in the CNS. Trends in Neurosciences 14: 439–45
Korn H, Faber D S, Triller A 1990 Convergence of morphological, physiological, and immunocytochemical techniques for the study of single Mauthner cells. In: Bjorklund A, Hokfelt T, Wouterlood F G, van den Pol A N (eds.) Handbook of Chemical Neuroanatomy, Vol. 8: Analysis of Neuronal Microcircuits and Synaptic Interactions. Elsevier Science Publishers, Amsterdam, pp. 403–80
Kruk P J, Korn H, Faber D S 1997 The effects of geometrical parameters on synaptic transmission: A Monte Carlo simulation study. Biophysical Journal 73: 2874–90
Kuffler S W, Yoshikami D 1975 The number of transmitter molecules in a quantum: An estimate from iontophoretic application of acetylcholine at the neuromuscular synapse. Journal of Physiology (London) 251: 465–82
Liao D, Hessler N A, Malinow R 1995 Activation of postsynaptically silent synapses during pairing-induced LTP in CA1 region of hippocampal slice. Nature 375: 400–4
Llinas R R 1982 Calcium in synaptic transmission. Scientific American 4: 56–65
Llinas R, Nicholson C, Precht W 1969 Preferred centripetal conduction of dendritic spikes in alligator Purkinje cells. Science 163: 184–7
Malenka R C, Nicoll R A 1997 Silent synapses speak up. Neuron 19: 473–6
Malgaroli A, Ting A E, Wendland B, Bergamaschi A, Villa A, Tsien R W, Scheller R H 1995 Presynaptic component of long-term potentiation visualized at individual hippocampal synapses. Science 268: 1624–8
Markram H, Tsodyks M 1996 Redistribution of synaptic efficacy between neocortical pyramidal neurons. Nature 382: 807–10
Martin A R 1977 Junctional transmission. II: Presynaptic mechanisms. In: Brookhart J M, Mountcastle V B, Kandel E R, Geiger S (eds.) Handbook of Physiology: The Nervous System. Williams and Wilkins, Bethesda, MD, pp. 357–416
Palay S L 1958 The morphology of synapses in the central nervous system. Experimental Cell Research 5: 275–93
Peters A, Palay S L, Webster H DeF (eds.) 1991 The Fine Structure of the Nervous System: Neurons and Their Supporting Cells. Oxford University Press, Oxford, UK
Pfenninger K H, Akert K, Moor H, Sandri C 1972 The fine structure of freeze-fractured presynaptic membranes. Journal of Neurocytology 1: 129–49
Redman S J 1990 Quantal analysis of synaptic potentials in neurons of the central nervous system. Physiological Reviews 70: 165–98
Somogyi P 1998 Precision and variability in the placement of pre- and postsynaptic receptors in relation to transmitter release sites. In: Faber D S, Korn H, Redman S J, Thompson S M, Altman J S (eds.) Central Synapses: Quantal Mechanisms and Plasticity. pp. 82–95
Stevens C F 1993 Quantal release of neurotransmitter and long-term potentiation. Neuron 10: 55–63
Stricker C, Redman S, Daley D 1994 Statistical analysis of synaptic transmission: Model discrimination and confidence limits. Biophysical Journal 67: 532–47
Triller A, Cluzeaud F, Pfeiffer F, Betz H, Korn H 1985 Distribution of glycine receptors at central synapses: An electron microscopy study. Journal of Cell Biology 101: 683–8
H. Korn and D. S. Faber
Syncretism The meaning of the term ‘syncretism’ has changed drastically over the course of time. Currently, the term refers to the mixing together of elements from different
religions. This mixing is sometimes viewed with contempt, especially by the clergy of the religions concerned. Scholarly use of the term is more objective and does not entail this kind of value judgement. In this article we shall describe the history of the term. Special attention will be given to current debates on power mechanisms in syncretism. Lastly, syncretism will be compared with fundamentalism.
1. History of the Term ‘Syncretism’ 1.1 Earliest Uses The first known use of the term ‘syncretism’ is found in Plutarch’s (46–125) Moralia (Morals), a treatise on moral principles. When suggesting that one should consider one’s brother’s enemies and friends as one’s own enemies and friends, he uses the example of the inhabitants of Crete. Plutarch uses the word syncretism to refer to their habit of forgetting internal conflicts and differences when confronting a common enemy. It may be that the use of the word was inspired not only by the Cretans’ example but also by a word play, since the Greek syngkrasis means a ‘mixing together’ (Stewart and Shaw 1994). The second author known to have used the term was the humanist Erasmus of Rotterdam (1466–1536). He quoted Plutarch and, like him, recommended syncretism, although here as a way of seeking a synthesis of seemingly disparate philosophical and theological opinions and as an attempt to understand the other completely, despite differences of view. After the Reformation, with its increase in divergent theological views, the term acquired a negative connotation for the first time. Opponents of the German Lutheran theologian George Calixtus (1586–1656), an early defender of ecumenism and theological reconciliation, used the term pejoratively to denote a mixing of points of view that, in their opinion, was to be avoided. The debate became known as the ‘syncretistic controversy.’ 1.2 Views from Religious Studies and Missionary Practice In the nineteenth century, when research on world religions began to flourish and colonialism brought cultures and religions into contact, syncretism was understood more objectively as the mixing of elements from different religions, although purity continued to be viewed implicitly as the norm (Rudolph 1979). Christianity also reached the height of its expansion in the nineteenth century. Missionaries objected to the tendency of new converts to continue practices from their prior religion.
Syncretism then became the concept used for this tendency, which was condemned and combated. This meant that Christian theological use of the term was no longer restricted to internal differences between theologians but referred to the mixing of religious elements, especially Christian and non-Christian. Unlike its usage in religious studies, syncretism was considered censurable and illegitimate. The term still retains this meaning, even though ecumenical missionary circles have become more understanding of the need to translate the Christian message into local cultural terms (so-called inculturation; see Schreiter 1993). It has also been suggested that Christianity itself is the result of a syncretistic process (Pannenberg 1970). Yet, despite this development, the theological criterion continues to be how intact the core message of Christianity remains. For a long time scholars in religious studies have worked with the term and defined its meanings (Rudolph (1979) offers a useful summary), whereas, as will be shown in the next section, the term has a much shorter history in anthropology. Scholars in the field of religious studies, such as historians and phenomenologists of religion, view syncretism objectively and point to its general presence at the founding of new religions (even in religions that now oppose syncretism) and in their mutual contact. Scholars distinguish between symmetrical and equal relationships between religions on the one hand and asymmetrical and unequal ones on the other. Typologies of the degree to which elements can be mixed have been proposed, using terms such as identification, amalgamation, and symbiosis. It has been acknowledged that syncretism may involve elements other than religious ones, since religions are linked to cultural contexts. The basic ambiguity of meanings in syncretistic contexts has been emphasized (Pye 1971). Sometimes syncretism is presented as the result but more often as the process itself. Syncretism is presented as part of popular religion, while the representatives of official religion generally condemn it.
Correspondingly, syncretism has often been depicted as unconscious, implicit and spontaneous, as typical of what Redfield called the little tradition of the unreflective many (Rudolph 1979). There have also been proposals to excise the term from scholarly vocabulary (Baird 1971, Schineller 1992, Van der Veer (1994) in Stewart and Shaw 1994) for being too vague, subjective, and too broad. If syncretism occurs in all religions and if borrowing and blending are universals in any event, the term seems to be insufficiently specific and therefore superfluous. Moreover, virtually no religious person sees himself or herself as syncretistic. However, terms are not easily eliminated from scholarly vocabularies. Precise definition or even redefinition may resolve this dilemma. In view of the increase in contact between believers of different religions, some term is needed to refer to the inevitable mixing of religious elements. In addition, the long history of the term has shown that its meaning may change.
1.3 Views from Cultural Anthropology
While students of religion from outside cultural anthropology focused on world religions, the first anthropological texts on syncretism referred to Latin American and Caribbean examples, especially the Afro-Christian syncretism that followed from the presence of slaves and their descendants in the region. Of course, syncretism was present in other areas, but where the negative missionary use of the term was the norm, as in Africa, anthropologists avoided using the term, even with respect to obvious cases of syncretism, as occurred in the so-called African independent churches founded by Africans and not by missionaries (Stewart and Shaw 1994). When the American anthropologist Melville Herskovits began using the term in anthropology, following his Brazilian student Arthur Ramos, he took his examples from Afro-American religions (Greenfield and Droogers 2001). His interest in the topic was partly ideological since he understood syncretism as a tool for promoting a unified culture in the United States, to be formed from the many cultures that immigrants, including those who had involuntarily come from Africa, had brought with them. This melting-pot idea was, of course, a way of rehabilitating the Afro-American strand in US culture over against white racism. Moreover, Herskovits saw syncretism as a half-way stage in the process of acculturation, the intermingling of cultures in contact. He was very much interested in the way people gave meanings to selected old and new elements, mixing cultural standards, and thus initiating change towards a new synthesis. The alternative course is an emphasis on authenticity and a return to tradition and to the past, as in the case of the black consciousness movement. Nourished by the political ideal of integration, the concept of acculturation may have suggested a one-sided linear evolution.
Another scholar who used the term syncretism in the Latin American context was the French ethnologist Roger Bastide (Bastide 1978). On the basis of fieldwork in Afro-Brazilian religions, he emphasized the systematic way in which analogous elements from different religious sources are mixed. He also pointed to the role of power mechanisms, especially in the contact between slaves and their masters. The structural similarity between African, Catholic, and Amerindian worldviews facilitated syncretism. Thus African gods could be identified with Catholic saints and Amerindian spirits. In the oppression that the slaves suffered, supernatural help was most welcome. This redressed the balance of power somewhat, such as when masters were the object of slave rituals intended to harm them. The redress of this balance also became clear when Catholic elements were selectively adopted through the application of African criteria, and not vice versa, as the masters wished when they baptized their slaves. In practice, as a strategic device, identification with and differentiation from the masters’ religion were both used in a double game. Thus the apparent adoption of a Catholic ritual attitude in Mass could serve as an alibi for the continuation of African ritual practices.
2. Current Issues
2.1 Power
The political use of the concept of syncretism has received more attention in the 1990s (Droogers 1989, Stewart and Shaw 1994). The negative sense of the term is an indication of a conflict between clergy and laity on the right to produce legitimate religion. Leaders of a religion whose adepts are involved in a syncretistic process will, from a purist point of view, use their power to condemn any effort to mix elements from different religions. Stewart and Shaw (1994) accordingly proposed the term antisyncretism. As suggested above, the subjective meaning of the term could be a reason for abandoning the term in scholarly use. Yet it has the advantage of revealing the power dimension of the syncretism process. In that sense, syncretism is one of a number of terms used in anthropology despite their pejorative meanings, such as magic or popular religion. An attitude of antisyncretism is not only held by leaders of affected religions but can also be found in representatives of religions that are the result of a process of syncretism. Thus some leaders of the very syncretistic Afro-Brazilian religions adopt an antisyncretistic attitude, favoring a return to the African roots and the removal of all Catholic influence.
2.2 Syncretism and Fundamentalism
In various senses syncretism and fundamentalism are opposites. Whereas syncretism uses more than one religious source, fundamentalism defends the uniqueness of the one source, often codified in a sacred text. Syncretism implies that more versions of the experience with the sacred are acceptable and can be combined, whereas fundamentalists defend one exclusive version. Syncretists experiment with religious symbols, as many as seem to be applicable to their situation, whereas fundamentalists jealously guard the one central symbol of their religion. Syncretism is usually criticized by religious specialists who feel that they are losing control and who may react by defending a fundamentalist view.
Syncretists interpret the opportunity offered by the growing contact between cultures and religions as a chance to combine religious repertoires for thought and action, whereas fundamentalists react to the moral side effects of this contact by retreating to the traditional repertoire, even though they may make use of modern means of communication to spread their message. Both syncretism and fundamentalism seem to flourish as a consequence of modernization and globalization but for different reasons: one embracing the new situation, the other being critical of it because it is felt to threaten cherished traditions and morals. Syncretists behave as global citizens, whereas fundamentalists divide the world into ‘us’ and ‘them.’ Syncretists are not organized and rarely present themselves as such, whereas fundamentalists tend to organize themselves and to occupy power positions.
See also: Creolization: Sociocultural Aspects; Cultural Assimilation; Cultural Contact: Archaeological Approaches; Hybridity; Mission: Proselytization and Conversion
Bibliography
Baird R D 1971 Category Formation and the History of Religion. Mouton, The Hague, The Netherlands
Bastide R 1978 The African Religions of Brazil, Toward a Sociology of the Interpenetration of Civilizations. Johns Hopkins University Press, Baltimore, MD
Droogers A 1989 Syncretism: The problem of definition, the definition of the problem. In: Gort J D, Vroom H M, Fernhout R, Wessels A (eds.) Dialogue and Syncretism, an Interdisciplinary Approach. Eerdmans and Rodopi, Grand Rapids, MI, pp. 7–25
Greenfield S, Droogers A (eds.) 2001 Reinventing Religions: Syncretism and Transformation in Africa and the Americas. Rowman & Littlefield, Lanham, MD
Pannenberg W 1970 Basic Questions in Theology. SCM, London
Pye M 1971 Syncretism and ambiguity. Numen 18: 83–93
Rudolph K 1979 Synkretismus vom theologischen Scheltwort zum religionswissenschaftlichen Begriff. Humanitas Religiosa: Festschrift für Haralds Biezais zu seinem 70. Geburtstag. Almqvist & Wiksell, Stockholm, pp. 193–212
Schineller P 1992 Inculturation and syncretism: What is the real issue? International Bulletin of Missionary Research 16: 50–3
Schreiter R J 1993 Defining syncretism: An interim report. International Bulletin of Missionary Research 17: 50–3
Stewart S, Shaw R (eds.) 1994 Syncretism/Anti-syncretism, the Politics of Religious Synthesis. Routledge, London
Van der Veer P 1994 Syncretism, multiculturalism and the discourse of tolerance. In: Stewart S, Shaw R (eds.) Syncretism/Anti-syncretism, the Politics of Religious Synthesis. Routledge, London, pp. 196–211
A. Droogers
Syndromal Diagnosis versus Dimensional Assessment, Clinical Psychology of
A syndrome is a pattern of symptoms occurring in a disease; a diagnosis implies history and prognosis as well. Dimensional assessment refers to the severity of the associated symptoms or characteristics. In 1895 the countries of Europe met to define the important diseases. They could not agree, but in 1905 when they next met, they just had to agree to define diseases important to public health, if only to be able to share information about prevention and treatment. Progress is possible only when all share a common language, and definition of disorders has been as much a product of practical need as the result of informed research. In clinical psychology, the dominant classifications are now the American Psychiatric Association’s Diagnostic and Statistical Manual, 4th revision (DSM-IV) (1994) and the World Health Organization’s International Classification of Diseases, 10th revision (ICD-10) (1993). Both are lists of characteristics (symptoms and durations, and other diagnoses that preempt) but neither routinely includes history or cause. A dimensional assessment is particular to an individual, a fine-grained description of symptoms and other attributes currently present. Both should inform about clinical picture and beneficial treatments: the syndromal diagnosis in respect to what is known about others with the same complaint; the dimensional assessment in respect to this individual. Historically there was a clear distinction: when doctors made diagnoses they said this person belongs to a category, and from that category one could infer etiology, natural history, and response to treatment; whereas when psychologists completed a dimensional assessment (see Minnesota Multiphasic Personality Inventory) they could profile the person’s strengths and weaknesses in respect to the normative values for the general population. A diagnosis is of value because it pulls together the ways that symptoms reflect a class phenomenon. A dimensional assessment is of value because it highlights the particular issues in an individual case.
To some extent the dichotomy posed in the title of this article is an artifact, for good clinicians who make a syndromal diagnosis go on to make a formulation in which the individual characteristics are taken into account, and good clinicians who make a dimensional assessment acknowledge the syndromal information that might inform prognosis and treatment. For most, the syndromal diagnosis informs the broad choice of treatment, while the dimensional assessment is the key to measuring progress. Syndromal diagnosis is not straightforward. It varied until the American Psychiatric Association in 1980 and the World Health Organization in 1990 published detailed diagnostic criteria for each mental disorder (see Mental and Behavioral Disorders, Diagnosis and Classification of). Prior to this there were significant variations between countries in the way that serious mental disorders like schizophrenia or manic depressive illness were diagnosed, and probably even more intercountry variations in the way that the nonpsychotic anxiety, depressive, substance use, and somatoform disorders were identified, although these variations were rarely studied. The current DSM
(APA 1994) and ICD (WHO 1993) criteria have resulted in clinicians being able to agree much of the time. But as Meehl had originally remarked and Kendell (1989) has affirmed, two clinicians will vary in the way they ascertain the symptoms, and in the way they interpret and weight them toward the criteria, so that disagreement, even given an agreed classification, will still occur. There are no validated laboratory or x ray tests for any mental disorder and no gold standard method of determining diagnosis. Diagnosis depends solely on the pattern of illness complained of by the patient or elicited by the clinician. It is becoming increasingly clear that the present diagnostic criteria, largely based on the work of Kraepelin in Europe at the beginning of the twentieth century, are but surface markers for hypothetical underlying concepts about which clinicians agree. In Jablensky’s (1999) view, we have a nomenclature, not a logical classification of mental disorders. The present classifications, even though sanctified by tradition, and now reliable and stable, may not even be correct. The ICD and DSM classifications, while differing in minor ways, seem as though they should function equivalently. The advent of agreed diagnostic criteria has meant that fully structured diagnostic interviews could be developed (e.g., the Composite International Diagnostic Interview or CIDI) and as these interviews address both DSM and ICD classifications, the correspondence between the two classifications could be examined. The results are interesting (Andrews et al. 1999). Only two-thirds of people who met criteria for a diagnosis on either system met criteria on both DSM and ICD classifications, the agreement varying from a high of 85 percent for depression and dysthymia to a low of 35 percent for substance abuse and posttraumatic stress disorder. 
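The concordance figure quoted above can be read as the share of people positive on either system who are also positive on both. The following Python fragment is a minimal sketch of that reading only. The function name `concordance` and the respondent identifiers are hypothetical, and the published analysis (Andrews et al. 1999) may have computed agreement in a different way.

```python
def concordance(dsm_cases, icd_cases):
    """Share of people meeting criteria on either system who meet them on both.

    Both arguments are collections of respondent identifiers (hypothetical
    data here, not survey results).
    """
    dsm, icd = set(dsm_cases), set(icd_cases)
    either = dsm | icd          # positive on at least one classification
    if not either:
        raise ValueError("no cases identified by either system")
    both = dsm & icd            # positive on both classifications
    return len(both) / len(either)


# Illustrative, invented respondent identifiers.
dsm_positive = {1, 2, 3, 4, 5}
icd_positive = {2, 3, 4, 5, 6}
print(f"{concordance(dsm_positive, icd_positive):.2f}")
```

With these invented sets, four of the six people identified by either system are identified by both, which reproduces the two-thirds pattern described in the text.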
Most of the disagreements appeared to come from the way the criteria were defined and operationalized, and not from any basic disagreement about the underlying concept. Epidemiological studies show that the demographic correlates among people meeting criteria for a diagnosis of a mental disorder (age under 45, being female, currently unmarried, unemployed, and of lower socioeconomic status) were the same in both classifications, as though the cases identified by either classification shared similar underlying characteristics. Nevertheless, such demographic correlates have led some to question the validity of the concept of mental disorders: ‘Isn’t it understandable that the disenfranchised have symptoms of anxiety, depression and somatic dysfunction?’ Is mental illness therefore a social construct? Thomas Szasz (1961), appalled by the ease with which a supposition of psychosis could lead to involuntary hospitalization, argued that it was a social construct. The surprising feature is constancy of the syndromic nature of the disorders; psychosis has similar symptoms and natural history across cultures, in the developing and in the developed world, in people with and without the listed social correlates.
Major depressive disorder, agoraphobia, or neurasthenia are equally invariant in their symptoms and natural history across countries and across peoples. This invariant pattern is not evidence of a stereotyped response to stress, for contrary to popular belief, stress plays a very small part in mental disorders. It looks as though these are the patterns of symptoms to which we are heir. How important are the mental disorders? The World Bank began a project to estimate the important consequences of all diseases by adding the years of life lost due to a disease to the years lived with a disability, weighted for severity. Worldwide, say Murray and Lopez (1996), mental disorders, as currently defined, account for 9 percent of the total burden of disease. In established market economies like North America, Europe, and Australasia, they account for 22 percent of the total burden, not because there is an epidemic of mental disorders in those countries, but because their relative importance has increased as the burden due to infection and chronic physical disease has declined. Mortality does not contribute a great amount; the burden of mental disorders largely comes from the morbidity, that is, the years lived with a disability. Among the mental disorders, the anxiety and depressive disorders account for half this morbidity, not so much because they are so disabling, but because they are common, whereas the psychoses, which are more disabling, are relatively rare, and so account for less of the total morbidity. In total, mental disorders are an important cause of human capital loss. Syndromal diagnoses are important for political reasons. They are essential if epidemiological studies are to answer the usual questions asked by health planners: how many people have which disorders, how disabled are they by these disorders, what does it cost to treat them, and what services should we provide?
Reliable case identification is also critical if therapeutics is to advance. Randomized controlled trials of a new remedy against placebo or established treatment are the only way that the efficacy of new treatments can be assessed. Being able to specify the diagnostic class to which the patients belong is critical, both to the scientific question, and to the marketing of the remedy once it has been shown to be efficacious. Last, funders often no longer pay for the treatment of a person, but pay on the basis of the syndromic diagnosis, i.e., a set casemix fee for a hospital to treat a new case of schizophrenia, or a set number of visits or set fee for a clinician to provide ambulatory treatment for a case of depression. Classification by syndromal diagnosis is essential to all these activities: defining the unmet need, identifying and marketing the treatments that work, and reimbursing the clinicians who provide the service. The whole edifice of mental health care would collapse were we to remove syndromal diagnosis even though we have no proof that these diagnoses are fundamentally valid. There are a number of problems that stem from
accepting the present definitions. The first problem is the seemingly arbitrary nature of the syndrome definitions themselves. In part, the diagnosis of a major depressive episode requires the presence of five of nine symptoms known to occur in depressive illness. But people with four symptoms (depression, loss of interest, inability to concentrate, feelings of worthlessness) would not meet criteria even though they could share the same demographic correlates, response to treatment, and level of disability as did some people who did meet criteria. The diagnosis of schizophrenia in part depends on a disturbance persisting for at least six months characterized by the presence of two of five symptoms of psychosis during a one month period with concomitant social/occupational dysfunction. If the disorder has been present for less than a month it is classified as Brief Psychotic Disorder, if present for one month to five months it is classified as Schizophreniform Disorder, and when it has continued for six months it becomes schizophrenia. While there are practical reasons that relate to cause and outcome that led to this division based on chronicity, it is obvious that it is an arbitrary subdivision of a spectrum of psychotic disorders. Both the ICD and the DSM classifications of mental disorders contain many more examples of a point on a dimension of symptoms marking a criterion for a disorder. Many presume that the existence of a specific treatment might validate a disorder, but there is a lack of specificity among current treatments for mental disorders. Given the millions of dollars spent, one would have thought that drug remedies for depression would cure depression specifically. Recent evidence suggests that three-quarters of the benefit from Prozac and other drugs sold as antidepressants is due to spontaneous remission or response to placebo and only one quarter of the observed improvement is due to the specific effects of the drug (Enserink 1999).
In most jurisdictions the same drugs have won approval as being efficacious for panic disorder, social phobia, or for obsessive compulsive disorder. Clearly the drugs have little specific action and, at best, may act on some underlying brain mechanism common to depressive and anxiety disorders. There are a number of additional reasons why the skeptic might doubt the current diagnostic systems. The first problem is that multiple diagnoses are more common than one would expect if the individual syndromes were independent, four times more common in fact (Andrews et al. 1990). At the clinical level one is surprised when a person who was hospitalized with an episode of mania (days of unnatural elation and overactivity coupled with being overtalkative and grandiose, sexually unrestrained, and careless of money and reputation) recovers and is rehospitalized two years later with a diagnosis of schizophrenia (deluded and hallucinated, with disorganized thinking and speech, coupled with a steady occupational and social role deterioration over the past months). No one should be so unfortunate as to suffer two major mental disorders, yet clinical conferences to review the management plans of difficult-to-treat patients not infrequently learn that, at various times, a number of different diagnoses have been used to justify treatment. Comorbidity is to be expected if an underlying causal factor is related to both the disorders experienced, or if one disorder is a response to the other, but two or more apparently independent disorders occurring in the same person should be less common than it is. Comorbidity should refer to the concurrence of two or more disorders (see Comorbidity). Structured diagnostic interviews, like the CIDI, ask whether a person has ever had the symptoms that satisfy the criterion for a disorder. Comorbidity has come to refer to people who have satisfied the criteria for more than one disorder at some time in their lifetime. Comorbidity is common, and in a number of surveys it has been shown that people who meet criteria for one disorder will on average meet criteria for an additional disorder, which is four times the expected risk. This finding is not just due to respondents being prone to say ‘yes’ to any symptom question. In the anxiety and depressive disorders comorbidity has been related to the presence of a predisposing vulnerability factor of high trait anxiety and poor coping. Sequential effects are also evident, with the stress of one illness acting as a precipitant of additional disorders in vulnerable persons. For example, people with agoraphobia are likely to become depressed, and people with social phobia can become dependent on the alcohol that they initially used to manage their disorder.
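The comparison with an ‘expected risk’ can be illustrated if one assumes that the expectation is the co-occurrence predicted when two disorders are statistically independent. The Python fragment below is a sketch under that assumption. The function name `comorbidity_ratio` and all counts are hypothetical and invented for illustration; they are not survey results, and the cited surveys may have defined the expected risk differently.

```python
def comorbidity_ratio(n_total, n_a, n_b, n_both):
    """Observed co-occurrence of disorders A and B divided by the
    co-occurrence expected if A and B were independent.

    Under independence, expected joint cases = n_a * n_b / n_total,
    so the ratio simplifies to n_both * n_total / (n_a * n_b).
    """
    if n_total <= 0 or n_a <= 0 or n_b <= 0:
        raise ValueError("counts must be positive")
    if n_both > min(n_a, n_b):
        raise ValueError("joint cases cannot exceed either marginal count")
    return (n_both * n_total) / (n_a * n_b)


# Invented survey of 1000 people: 100 with disorder A, 50 with disorder B.
# Independence would predict 5 people with both; suppose 20 are observed.
ratio = comorbidity_ratio(n_total=1000, n_a=100, n_b=50, n_both=20)
print(f"observed / expected = {ratio:.1f}")
```

A ratio near 1 would indicate that the two diagnoses co-occur about as often as chance predicts, whereas a ratio well above 1 is the pattern the text describes.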
However the comorbidity comes about, people identified as having more than two illnesses in their lifetimes to date are more likely to meet criteria for a current disorder, to be more disabled, and to be and have been very high utilizers of health services. Comorbidity is not just an artifact of measurement; it is also an indicator of severity and unmet need. Kendell (1989) has revised the original Robins and Guze (1970) criteria for determining whether a clinical syndrome was valid as follows: (a) identification and description of the syndrome, either by clinical intuition or by statistical analysis; (b) demonstration of boundaries or ‘points of rarity’ between related syndromes by discriminant function analysis or latent class analysis; (c) follow-up studies establishing a distinctive course or outcome; (d) therapeutic trials establishing a distinctive treatment response; (e) family studies establishing that the syndrome ‘breeds true’; and (f) association with some more fundamental abnormality, whether histological, psychological, biochemical, or molecular. These six criteria have not been achieved in any group of mental disorders: the affective, anxiety, substance use, personality, or psychotic disorders. While identification and description of each syndrome was well advanced by the ICD and DSM classifications, clear boundaries have been difficult to establish, distinctive outcomes are rare, a unique response to one treatment is uncommon, and multiple diagnoses over time are common. The diagnostic classifications have attempted to deal with the last problem by establishing hierarchical rules as to which diagnosis precludes others, but the strategy is neither systematic nor wholly convincing. Information from studies of the human genome might go some way to resolve this impasse by showing discontinuities between genotypes and the accepted phenotypes, and so force a revision of the current classifications (Smoller and Tsuang 1998). Twenty years ago, it was believed that neurochemistry and neuroimaging studies would do this, and they have not. It is not that there are no data; it is just that reliable data are so patchy that it is not yet possible to say that disease X is distinguished by specific information addressing three or more of the Kendell criteria. Until there is, the validity of any syndrome is more a belief than a fact. Health planners and researchers, clinicians and their patients, and funding agencies, all act as though the diagnostic categories are an established fact. Like the representatives at the International Health Conference in 1905, they have to. However, even good clinicians turn a blind eye when a depressed person has fewer than the required five symptoms: if, in the light of their history, the person probably will respond to the usual therapy for depression, the diagnosis will be made. Similarly, even if classic symptoms of schizophrenia have lasted for less than six months most clinicians will make the diagnosis and begin antipsychotic drugs. Most health planners will be pleased and so will most funding agencies.
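The way a categorical rule imposes cut-points on a continuous dimension can be made concrete. The Python fragment below is a deliberately simplified sketch that encodes only the duration boundaries and the symptom-count threshold mentioned earlier in this article; it omits every other requirement of the published criteria, its function names are hypothetical, and it is in no sense a diagnostic tool.

```python
def psychosis_category(duration_months):
    """Category implied by the duration rule alone (simplified assumption:
    all other requirements of the criteria are taken as already met)."""
    if duration_months < 0:
        raise ValueError("duration cannot be negative")
    if duration_months < 1:
        return "Brief Psychotic Disorder"
    if duration_months < 6:
        return "Schizophreniform Disorder"
    return "Schizophrenia"


def meets_depression_count(symptom_count, required=5):
    """True when the symptom count reaches the categorical threshold."""
    return symptom_count >= required


# Small changes on the underlying dimension flip the category.
for months in (0.5, 5.9, 6.0):
    print(months, psychosis_category(months))
for count in (4, 5):
    print(count, meets_depression_count(count))
```

The example shows why clinicians may treat a person with four symptoms, or with five months of illness, much as they would a person just over the threshold: the dimensional values are nearly identical even though the categorical labels differ.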
Good clinicians already use syndrome-specific questionnaires or rating scales of thoughts, feelings and behaviors, and of global measures of disability. They know that the task is to get the person well, that is, no longer troubled by symptoms, and no longer disabled. This dimensional assessment is the key to the measurement of outcome and as outcome assessment is an increasing requirement, it is likely such dimensional assessment will soon rank in importance with the establishment of the syndromal diagnosis. Only with the personality disorders is there a strong move to discard the categories altogether and use a dimensional assessment based on measurement of personality traits as being more accurate and parsimonious. In summary, the categories of the common mental disorders are well established and accepted and appear to represent the domains of symptoms complained of by people in both developed and developing countries. While health planners, clinicians, and funding agencies pay lip service to these categories, most realize that they are dealing with dimensions of illness on which the categories impose an artificial dichotomy and so will advise or accept the appropriate treatment even if the person does not strictly meet the criteria. Hopes that statistical analysis of symptom patterns, or distinctions evident in treatment response, or variations in natural history would inform categorical classification have proved illusory. Some think that the advances in molecular genetics might assist in rewriting the topography of mental disorders but even if this is so, most accept that the issue will still be dimensional, ‘just how much of disorder X should one cope with before seeking help,’ not ‘should all people with the gene for disease X be treated.’ Moreover the likely response will also be dimensional and range from public education, through self-help strategies and assistance at the primary care level, to specialist mental health care (Andrews 2000). Diagnostic criteria and dimensional assessment are but two sides of the same coin. Good clinicians use both.
See also: Behavioral Assessment; Classification: Conceptions in the Social Sciences; Clinical Assessment: Interview Methods; Clinical Psychology: Validity of Judgment; Comorbidity; Differential Diagnosis in Psychiatry; Mental and Behavioral Disorders, Diagnosis and Classification of
Bibliography
American Psychiatric Association (APA) 1994 Diagnostic and Statistical Manual of Mental Disorders, 4th edn. (DSM-IV). American Psychiatric Association, Washington, DC
Andrews G 2000 Meeting the unmet need through disease management. In: Andrews G, Henderson S (eds.) The Unmet Need in Psychiatry. Cambridge University Press, London
Andrews G, Slade T, Peters L 1999 Classification in psychiatry: ICD-10 versus DSM-IV. British Journal of Psychiatry 174: 3–5
Andrews G, Stewart G, Morris-Yates A, Holt P, Henderson S 1990 Evidence for a general neurotic syndrome. British Journal of Psychiatry 157: 6–12
Enserink M 1999 Can the placebo be the cure? Science 284: 238–40
Jablensky A 1999 The nature of psychiatric classification: Issues beyond ICD-10 and DSM-IV. Australian and New Zealand Journal of Psychiatry 33: 137–44
Kendell R E 1989 Clinical validity. Psychological Medicine 19: 45–55
Murray C J L, Lopez A D 1996 Evidence-based health policy: Lessons from the global burden of disease study. Science 274: 740–3
Robins E, Guze S B 1970 Establishment of diagnostic validity in psychiatric illness: Its application to schizophrenia. American Journal of Psychiatry 126: 983–7
Smoller J W, Tsuang M T 1998 Panic and phobic anxiety: Defining phenotypes for genetic studies. American Journal of Psychiatry 155: 1152–62
Szasz T 1961 The Myth of Mental Illness. Harper and Row, New York
World Health Organization 1993 The ICD-10 Classification of Mental and Behavioural Disorders: Diagnostic Criteria for Research. World Health Organization, Geneva
G. Andrews
Syntactic Aspects of Language, Neural Basis of
Natural languages are built on three different knowledge components: the sound of words, called phonology; the meaning of words, called semantics; and the grammatical rules according to which words are put together, called syntax. This article concerns the neural mechanism underlying the knowledge and use of syntax. Three methods have been used to study the neural basis of syntax: (a) the observation of the correlation between brain lesion and behavior; (b) functional imaging approaches; and (c) event-related brain potential methods. Evidence from these three approaches with respect to syntactic processing will be reviewed and an explanatory model covering major aspects of the data will be suggested.
1. Lesion–Behavior Approaches
Lesion–behavior approaches to investigating the neural basis of language date back to the middle of the nineteenth century (Broca 1861, Wernicke 1874). Using such an approach it was observed that patients with lesions in the temporal region of the left hemisphere (the so-called Wernicke’s area, Brodmann’s area, BA 22/42) are characterized by poor comprehension, whereas their speech is fluent and often syntactically well structured although semantically empty. Patients with lesions in the left inferior frontal gyrus (the so-called Broca’s area, BA 44), in contrast, demonstrate a selective deficit for syntactic aspects in language production and in language comprehension. Patients with lesions in Broca’s area produce agrammatic speech, that is, utterances containing content words but lacking most of the grammatical morphology (function words and inflections). They fail to comprehend sentences whenever the correct interpretation depends on the full and adequate processing of the grammatical morphology.
In the 1980s discussion centered on three possible ways of understanding this behavior. A first view took this behavior to be due to a lack of grammatical knowledge as such (Caramazza and Zurif 1976, Berndt and Caramazza 1980). A second approach interpreted it as being caused by a computational deficit either in accessing lexical information or in accessing the syntactic information encoded in function words and inflectional morphology (Bradley et al. 1980). A third view assumed the computational deficit to be based either on reduced memory resources (Kolk and Van Grunsven 1985) or on reduced processing resources leading to a trade-off between syntactic and semantic interpretative processes (Linebarger et al. 1983). It has been shown that Broca’s patients are able to judge a sentence’s grammaticality (Linebarger et al. 1983) and even show grammatical priming (i.e., grammatically correct context sentences enhance the recognition of a target word compared to ungrammatical context sentences; see Haarmann and Kolk 1991). These findings lead to the assumption that Broca’s area may not be the brain area where syntactic knowledge is represented, but rather that this brain tissue supports syntactic computation. While some view this brain region as the home of very specific syntactic computations necessary to process dislocated elements (e.g., moved constituents that leave behind traces; Grodzinsky 2000), others consider this brain area to support on-line processes of syntactic phrase-structure building (Friederici 1995).
There is, however, evidence that might challenge these views. It has been reported that patients with lesions outside Broca’s area show syntactic comprehension deficits in off-line tasks (Caplan et al. 1985, 1996). Moreover, both Broca’s and Wernicke’s patients display a certain similarity in their performance in off-line comprehension tasks (Bates et al. 1987). It was demonstrated that patients speaking a language which marks syntax by inflectional morphology are generally more deficient, irrespective of their classification, than those speaking a language marking syntax by strict word order. However, the finding that in a highly inflected language Broca’s patients are more affected than Wernicke’s patients suggests a particular deficit of Broca’s aphasics for syntactic morphology. Thus it appears that similarities between different patient groups can be observed when using off-line comprehension tasks. As these tasks only reflect the end point of the comprehension process they may imply certain on-line operations possibly supported by Broca’s area or adjacent tissue, as well as other operations possibly supported by other areas (see also Zurif et al. 1993) (see Fig. 1).
Figure 1 Left hemisphere (numbers refer to Brodmann’s cytoarchitectonically defined areas)
2. Functional Neuroimaging
New imaging techniques during the 1990s have provided additional insight into the neural basis of syntactic parsing. Positron emission tomography (PET) studies (see Sect. 4.3.3) have investigated the reading of syntactically simple and complex sentences. It was found that the regional blood flow increases in the left Broca’s area as a function of syntactic complexity (Stromswold et al. 1996). A study using functional magnetic resonance imaging (fMRI) to register the brain’s activity during the reading of sentences of different complexity found a significant increase of the hemodynamic response in the left Broca’s and Wernicke’s areas as a function of the sentence’s complexity (Just et al. 1996). Other studies investigated the brain–syntax relationship by comparing the brain activity during the processing of word lists with that during sentence processing. In an event-related fMRI study Friederici et al. (2000) presented four different auditory stimulus types: a list of real words (nouns); a list of pseudowords; normal sentences containing function words and real content words; and sentences containing function words and pseudowords instead of real content words. Both sentence types differed from the two word lists, showing an additional activation in the anterior portion of the left superior temporal gyrus. The sentences that contained only grammatical morphemes and pseudowords instead of content words displayed an additional activation in the left frontal operculum in the vicinity of Broca’s area (i.e., medial to BA 44). The combined imaging data suggest that parts of the left superior temporal gyrus and the left frontal operculum are the brain systems that support syntactic processes. When speculating about the specific contribution of the two brain regions one may again consult the patient data. Given the observation that Broca’s patients are able to perform grammaticality judgments (Linebarger et al. 1983), but are deficient in computing syntactic information on-line (Friederici 1995), and the finding that Wernicke’s patients are deficient in performing judgments (Blumstein et al. 1982), but perform similarly to normal controls in some on-line comprehension tasks (Zurif et al. 1993), the following account may be formulated. While the left temporal region is necessary for the representation of syntactic knowledge, the left inferior frontal region supports syntactic procedures in real time.
3. Neurophysiological Approaches
The temporal course of the neural activity underlying syntax processing has been evaluated by means of electroencephalography (EEG) (see Sects. 4.3.5 and 4.3.6) and more recently also by means of magnetoencephalography (MEG) (see Sect. 4.3.7). With their seminal paper Kutas and Hillyard (1983) introduced the event-related brain potential (ERP) method to investigate language processes. They observed a particular ERP component correlated with lexical-semantic processes: a negative potential peaking around 400 msec after the onset of the critical event, called the N400 component. The critical event in their study was a sentence-final word which was highly unexpected and semantically incoherent (for example: ‘He spread his bread with butter and socks.’). It was shown that the amplitude of the N400 varied as a function of a word’s semantic expectancy and also as a function of a word’s lexical frequency. Importantly, the N400 component has been reported in a number of different languages including English, German, Dutch, French, Hebrew, and even sign language, in the visual as well as in the auditory domain (for a review, see Van Petten 1995). Note that unexpected events in a non-semantic domain elicit a positivity (P300) rather than an N400 (Donchin 1981).
3.1 Syntax-related Electrophysiological Correlates
The picture is somewhat more complex for the ERP pattern correlated with syntactic processing. Two syntax-related components have been identified: an early left anterior negativity, and a late centroparietal positivity. While the anterior negativity is only present when processing outright violations, the late positivity is observed both when processing outright violations and when processing violations of structural preferences.
3.1.1 The syntax-related late positivity. The late positivity is a centroparietally distributed positivity usually at around 600 msec and beyond, called P600. P600 components are observed both in correlation with outright syntactic violations and with violations of syntactic preferences. Violations of the latter kind are present in so-called ‘garden path’ sentences, that is, sentences for which it becomes clear at a particular word that the perceiver was misguided structurally by following a simple instead of a more complex syntactic structure (for example: ‘The broker hoped to sell the stock was sent to jail’ vs. ‘The broker persuaded to sell the stock was sent to jail,’ Osterhout and Holcomb 1992). At critical points in the sentence (at the word ‘to’ and the word ‘was’) the reader/listener has to revise and reanalyze the structure initially followed. The P600 is also present in sentences containing an outright syntactic violation, be it a phrase-structure violation realized by a word category error (Neville et al. 1991, Friederici et al. 1993) or a morphosyntactic violation realized by an incorrect inflection (Gunter et al. 1997, Münte et al. 1997). Functionally, the P600 is taken to reflect costs of reprocessing (Osterhout et al. 1994), syntactically guided processes of reanalysis and repair (Friederici 1995), or syntactic integration difficulties in general (Kaan et al. 2000).
3.1.2 The syntax-related anterior negativity. For outright syntactic violations an anterior negativity, that is, a frontally distributed negative waveform usually with a maximum over the left, is observed. It is called ELAN (early left anterior negativity) or LAN (left anterior negativity), depending on its latency of occurrence. Before discussing the particular appearance of this syntax-related negativity in more detail, we briefly recapitulate that syntactic information is multiply encoded in a given language.
3.2 Domains of Syntax and their Neurophysiological Correlates
There are at least four syntactic domains that have to be considered: (a) word category information necessary to build up local syntactic structures; (b) morphosyntactic information such as verb inflections necessary to establish agreement between different elements within and across phrases or case markings allowing the assignment of thematic roles (who is doing what to whom); (c) verb argument structure information necessary to establish subcategorization relations; and (d) long-distance dependencies necessary to establish the relation between a moved element (filler: ‘who’) and its original position in the sentence (gap: __) in a sentence such as ‘Who did Peter see __?’ Some, but not all, aspects of these different domains have been investigated using neurophysiological recordings. We shall take up the different domains in turn.
3.2.1 Word category information. The anterior negativity was found in correlation with phrase-structure violations realized as word category errors and with the processing of function words. It was observed between 125 and 200 msec after the critical word’s onset (italicized) in a number of studies using auditory presentation (for example: ‘Der Freund wurde im besucht’/‘The friend was in the visited’ vs. ‘Der Freund wurde besucht’/‘The friend was visited,’ Friederici et al. 1993, Hahne and Friederici 1999). (But see Neville et al. 1991, who found an early and late negativity in the visual domain.) The distribution of the word category error related negativity is often frontal with a maximum to the left, but has also been found to be less lateralized (Friederici et al. 1999). It is not yet clear what determines these distributional differences; current research is trying to resolve this issue. With respect to the possible neural generators for the anterior negativity it is interesting that left anterior cortical lesions eliminate the anterior word category related negativity whereas left anterior lesions restricted to the basal ganglia do not (Friederici et al. 1999). Functionally, the word category related negativity was interpreted as reflecting initial structure building processes (Friederici 1995), which, according to syntax-first models (Frazier 1987), are based on word category information only. A comparable left anterior negativity also related to word category information was observed for the processing of function words in syntactically correct sentences (Neville et al. 1992, see also Pulvermüller et al. 1995). It was present around 280 msec post onset when reading sentences. Similar to the negativity discussed above, the function word related negativity was eliminated with left anterior lesions (Ter Keurs et al. 1999). In line with the proposal by Friederici (1995), the authors interpret the left anterior negativity to reflect processes of word category identification.
3.2.2 Morphosyntactic information. Morphosyntactic violations (for example: ‘The plane took we to paradise’ vs. ‘The plane took us to paradise’) usually elicit a left anterior negativity between 300 and 500 msec after the critical word’s onset (italicized). It was found in the visual domain in a number of different languages (Münte et al. 1993, Gunter et al. 1997, Coulson et al. 1998), as well as in the auditory domain (although with a somewhat larger distribution, Friederici et al. 1993). Note that the types of morphosyntactic violations investigated in these studies do not alter the critical word’s syntactic category, but influence the relation between the verb and its arguments (e.g., subject–verb agreement). Under the view that initial phrase-structure building is concerned primarily with word category information, and that the lexical information encoded in an inflection only becomes relevant when thematic roles are to be assigned, it is conceivable that the violation of the former information type elicits an earlier brain response than the violation of the latter information type.
3.2.3 Verb argument structure information. The processing of verb argument structure information has been investigated in a number of different studies using different types of violation. In an early study, Rösler et al. (1993) registered ERPs for sentences which contained subcategorization violations (for example: ‘Der Lehrer wurde gehustet’/‘The teacher was coughed’). A left anterior negativity between 300 and 500 msec was found in correlation with the violation. In this study, the sentence-final verb (one-place verb) does not match the thematic grid opened up by the passive construction (two-place verb). When the mismatch involved subject vs. object arguments, again a left anterior negativity was reported (Coulson et al. 1998). Thus it appears that violations concerning the relation between external (subject) and internal arguments (objects) also evoke a left anterior negativity. Interestingly, the argument structure related negativity is present between 300 and 500 msec, that is, the time domain in which the lexical-semantic ERP component is found. This is not surprising, given that the availability of the verb argument structure information requires full access to the verb’s lexical entry.
3.2.4 Long-distance dependencies. Long-distance dependencies that establish the relation between a moved constituent (filler) and its original position (gap) have been investigated by Kluender and Kutas (1993) in a number of constructions. They found a LAN at the gap position at which the previously perceived filler has to be reactivated. As it is assumed that the filler has to be kept in memory until the gap is met, they claim that the LAN is a reflection of verbal memory. The conflict between the two functional interpretations of the LAN, as a syntactic violation related LAN and as a filler-gap related LAN, has recently been resolved by showing that the former is present locally (at the violation) and that the latter spans the sentence from the filler to the gap position (Kluender et al. 1998). In conclusion, it appears that outright syntactic violations elicit a local anterior negativity with a maximum over the left hemisphere, in addition to a P600. Moreover, the combined data suggest that word category related anterior negativities are present earlier than anterior negativities found in correlation with lexically encoded morphosyntactic information and lexically encoded verb argument information.
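The latency ranges quoted in Sect. 3 can be collected in a small lookup table. The following Python sketch is only an illustration under stated assumptions: the names `ErpComponent`, `COMPONENTS`, and `components_at` are hypothetical, the assignment of the label ELAN to the 125–200 msec range and LAN to the 300–500 msec range follows the latency-based naming described in Sect. 3.1.2, and the N400 is entered with its peak latency only because the text gives no range for it.

```python
from typing import List, NamedTuple, Optional


class ErpComponent(NamedTuple):
    name: str
    polarity: str            # "negative" or "positive"
    start_ms: int            # start of the latency range quoted in the text
    end_ms: Optional[int]    # None stands for "and beyond"
    correlate: str


# Ranges are those quoted in the article; the ELAN/LAN labels per range and
# the single-point N400 entry (peak only) are simplifying assumptions.
COMPONENTS = [
    ErpComponent("ELAN", "negative", 125, 200, "word category violations"),
    ErpComponent("LAN", "negative", 300, 500,
                 "morphosyntactic and verb argument structure violations"),
    ErpComponent("N400", "negative", 400, 400, "lexical-semantic processing"),
    ErpComponent("P600", "positive", 600, None,
                 "reanalysis, repair, or integration difficulty"),
]


def components_at(latency_ms: int) -> List[str]:
    """Return the names of components whose quoted range contains latency_ms."""
    return [
        c.name
        for c in COMPONENTS
        if c.start_ms <= latency_ms
        and (c.end_ms is None or latency_ms <= c.end_ms)
    ]


print(components_at(150))  # early window
print(components_at(400))  # window shared by LAN and N400
print(components_at(700))  # late window
```

The query at 400 msec returns both the LAN and the N400, which mirrors the observation in Sect. 3.2.3 that the argument structure related negativity falls in the time domain of the lexical-semantic component. Scalp distribution, which the article uses to tell the two apart, is deliberately left out of this sketch.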
4. Modeling the Neurocognition of Syntactic Processes
The data provided by recent ERP and functional imaging studies have advanced our knowledge of the neural basis of language, and of syntax in particular. The network identified to support syntactic processes clearly includes superior temporal and inferior frontal areas, dominantly in the left hemisphere. Their specific function still has to be determined in detail. It appears, however, that the left temporal region is relevant for the representation of syntactic knowledge, whereas the inferior frontal region supports syntactic procedures. The timing of the brain’s activity correlated with syntactic processes can be described as consisting of different stages: an early stage during which first-pass parsing processes (based on word category information) take place, and a next stage during which lexically encoded syntactic information is processed. Both are reflected in a left anterior negativity in the ERP. A late stage of syntactic processes is reflected by a late positivity in the ERP. This component is observed whenever syntactic integration is difficult. Future research will have to determine whether different aspects of syntactic integration difficulties are correlated with distinct topographies of the late positivity, which would indicate the involvement of different neural structures.
See also: Connectionist Models of Language Processing; Semantic Knowledge: Neural Basis of; Sign Language: Psychological and Neural Aspects; Syntax
Bibliography
Bates E, Friederici A D, Wulfeck B 1987 Comprehension in aphasia: A cross-linguistic study. Brain and Language 32: 19–67
Berndt R S, Caramazza A 1980 A redefinition of the syndrome of Broca’s aphasia: Implications for a neuropsychological model of language. Applied Psycholinguistics 1: 225–78
Blumstein S E, Milberg W, Schrier R 1982 Semantic processing in aphasia: Evidence from an auditory lexical decision task. Brain and Language 17: 301–15
Bradley D C, Garrett M F, Zurif E B 1980 Syntactic deficits in Broca’s aphasia. In: Caplan D (ed.) Biological Studies on Mental Processes. MIT Press, Cambridge, MA, pp. 269–86
Broca P 1861 Remarques sur le siège de la faculté du langage articulé, suivies d’une observation d’aphémie (perte de la parole). Bulletins de la Société Anatomique de Paris 6: 330–57
Caplan D, Baker C, Dehaut F 1985 Syntactic determinants of sentence comprehension in aphasia. Cognition 21: 117–75
Caplan D, Hildebrandt N, Makris N 1996 Location of lesions in stroke patients with deficits in syntactic processing in sentence comprehension. Brain 119: 933–49
Caramazza A, Zurif E B 1976 Dissociation of algorithmic and heuristic processes in language comprehension: Evidence from aphasia. Brain and Language 3: 572–82
Coulson S, King J, Kutas M 1998 Expect the unexpected: Event-related brain responses of morphosyntactic violations. Language and Cognitive Processes 13: 21–58
Donchin E 1981 Surprise!…Surprise? Psychophysiology 18: 493–513
Frazier L 1987 Theories of sentence processing. In: Garfield J (ed.) Modularity in Knowledge Representation and Natural-language Processing. MIT Press, Cambridge, MA
Friederici A D 1995 The time course of syntactic activation during language processing: A model based on neuropsychological and neurophysiological data. Brain and Language 50: 259–81
Friederici A D, Meyer M, Von Cramon D Y 2000 Auditory language comprehension: An event-related fMRI study on the processing of syntactic and lexical information. Brain and Language 74: 289–300
Friederici A D, Pfeifer E, Hahne A 1993 Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations. Cognitive Brain Research 1: 183–92
Friederici A D, Von Cramon D Y, Kotz S A 1999 Language related brain potentials in patients with cortical and subcortical left hemisphere lesions. Brain 122: 1033–47
Grodzinsky Y 2000 The neurology of syntax: Language use without Broca’s area. Behavioral and Brain Sciences 23: 1–71
Gunter T C, Stowe L A, Mulder G 1997 When syntax meets semantics. Psychophysiology 34: 660–76
Haarmann H J, Kolk H H J 1991 Syntactic priming in Broca’s aphasics: Evidence for slow activation. Aphasiology 5: 247–63
Hahne A, Friederici A D 1999 Electrophysiological evidence for two steps in syntactic analysis: Early automatic and late controlled processes. Journal of Cognitive Neuroscience 11: 193–204
Just M A, Carpenter P A, Keller T A, Eddy W F, Thulborn K R 1996 Brain activation modulated by sentence comprehension. Science 274: 114–16
Kaan E, Harris A, Gibson G, Holcomb P J 2000 The P600 as an index of syntactic integration difficulty. Language and Cognitive Processes 15: 159–201
Kluender R, Kutas M 1993 Subjacency as a processing phenomenon. Language and Cognitive Processes 8: 573–633
Kluender R, Münte T, Wind-Cowles H, Szentkuti A, Walenski M, Wieringa B 1998 Brain potentials to English and German questions. Journal of Cognitive Neuroscience Supplement 24
Kolk H H J, Van Grunsven M J F 1985 Agrammatism as a variable phenomenon. Cognitive Neuropsychology 2: 347–84
Kutas M, Hillyard S A 1983 Event-related potentials to grammatical errors and semantic anomalies. Memory & Cognition 11: 539–50
Linebarger M C, Schwartz M F, Saffran E M 1983 Sensitivity to grammatical structure in so-called agrammatic aphasics. Cognition 13: 361–92
Münte T F, Heinze H J, Mangun G R 1993 Dissociation of brain activity related to syntactic and semantic aspects of language. Journal of Cognitive Neuroscience 5: 335–44
Münte T F, Matzke M, Johannes S 1997 Brain activity associated with syntactic incongruencies in words and pseudowords. Journal of Cognitive Neuroscience 9: 318–29
Neville H J, Mills D L, Lawson D S 1992 Fractionating language: Different neural subsystems with different sensitive periods. Cerebral Cortex 2: 244–58
Neville H J, Nicol J L, Barss A, Forster K I, Garrett M F 1991 Syntactically based sentence processing classes: Evidence from event-related brain potentials. Journal of Cognitive Neuroscience 3: 151–65
Osterhout L, Holcomb P J 1992 Event-related potentials and syntactic anomaly. Journal of Memory and Language 31: 785–804
Osterhout L, Holcomb P J, Swinney D 1994 Brain potentials elicited by garden-path sentences: Evidence of the application of verb information during parsing. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 786–803
Pulvermüller F, Lutzenberger W, Birbaumer N 1995 Electrocortical distinction of vocabulary types. Electroencephalography and Clinical Neurophysiology 94: 357–70
Rösler F, Friederici A D, Pütz P, Hahne A 1993 Event-related brain potentials while encountering semantic and syntactic constraint violations. Journal of Cognitive Neuroscience 5: 345–62
Stromswold K, Caplan D, Alpert N, Rauch S 1996 Localization of syntactic comprehension by positron emission tomography. Brain and Language 52: 452–73
Ter Keurs M, Brown C M, Hagoort P, Stegemann D F 1999 Electrophysiological manifestations of open- and closed-class words in patients with Broca’s aphasia with agrammatic comprehension: An event-related brain potential study. Brain 122: 839–54
Van Petten C 1995 Words and sentences: Event-related brain potential measures. Psychophysiology 32: 511–25
Wernicke C 1874 Der aphasische Symptomenkomplex. Cohn and Weigert, Breslau, Germany
Zurif E B, Swinney D, Prather P, Solomon J, Bushell C 1993 An on-line analysis of syntactic processing in Broca’s and Wernicke’s aphasia. Brain and Language 45: 448–64
A. D. Friederici
Syntactic Control
The term ‘control’ is usually attributed to the generative linguist Paul Postal (1970) (see Generative Syntax). The term expresses metaphorically the observation that the reference of an understood grammatical subject in a subordinate clause, the controllee, may depend on the reference of another constituent, the controller, in the main clause. An example illustrating the control relation is given in (1), with its relevant syntactic structure represented in (1a). In (1a) constituents with the same reference are marked by identical subscripts; the bracketed part of the sentence contains a nonfinite subordinate clause with an implicit subject symbolized as Ø (for ‘empty subject’). These conventions will be followed throughout the article.
Mary convinced Billy to do her homework. (1)
Mary_i convinced Billy_j [Ø_j to do her_i homework]. (1a)
In (1) the noun phrase Billy controls the reference of the empty subject of the infinitival complement clause. Every native speaker of English will understand (1) as meaning that Billy, and not Mary herself, will do her homework. The challenging linguistic problem is to find a principled account for why in (1) the main clause object is the controller, why the main clause subject cannot have this function, and why the empty subject cannot just refer to a third person not mentioned in the sentence at all. Sentences such as (1) exhibit obligatory control, i.e., a case in which the empty subject must be coreferential with a main clause constituent. Obligatory control contrasts with free (also called arbitrary or nonobligatory) control as in:
[Ø to wait in the crowded airport] would be a mistake. (2)
In the subject clause in (2), the reference of the understood subject is not determined by any other noun phrase in the sentence but depends on the discourse context.
1. Some Control Phenomena
In this article, the term ‘control’ will be understood as meaning ‘obligatory control.’ The following nonexhaustive list gives some sentences that exemplify obligatory control.
(a) Infinitival Purpose Clauses
John_i bought a book_j [Ø_i to amuse himself with]. (adapted from Nishigauchi 1984, p. 229) (3)
Mary_i bought the little boy_j a toy [Ø_j to amuse himself with]. (adapted from Nishigauchi 1984, p. 229) (4)
(b) Infinitival Complement Clauses
John_i begged Mary_j [Ø_j to leave]. (5)
Mary_i begged [Ø_i to leave]. (6)
(c) Gerundial Complement Clauses
Paul_i thanked Mary_j for [Ø_j writing the report]. (7)
Paul_i thanked Mary_j for [Ø_i having been admitted to the tennis club]. (8)
Most work on control has been done on nonfinite complement clauses, i.e., on clauses exemplified by sentences (5) through (8). The remainder of the article will therefore be concerned with control in nonfinite complement clauses.
2. A Short History of Control
In the early days of generative grammar (see Linguistics: Incorporation) in the 1960s and 1970s, it was generally believed that control is a syntactic phenomenon, i.e., that the controller of the empty subject in the subordinate clause can be determined on a purely structural basis. More recently, it has become increasingly evident that semantic and pragmatic factors play an equally important, if not more important, role in control.
2.1 Syntactic Approaches
The syntactic approach to control assumes that control relations can be determined on a purely structural basis. The most influential proposal along these lines was made by Peter Rosenbaum (1967, 1970). He postulated a Minimal Distance Principle (MDP), which predicts that the reference of the (empty) subject of a nonfinite complement clause depends on the closest available controller in the main clause. In sentence (1) above, the main clause noun phrase closest to the empty subject Ø is Billy. Indeed, this noun phrase controls the reference of the empty subject, and the MDP thus makes the right prediction. There are, however, many sentences that falsify the MDP. Difficulties emerge with constructions that display more than one potential controller in the main clause, as in (9) and (10):
Mary_i persuaded Bill_j [Ø_j to clear her_i desk]. (9)
Mary_i promised Bill_j [Ø_i to clear her_i desk]. (10)
For sentence (9) the MDP correctly selects the closest noun phrase in the main clause, Bill, as controller. Persuade belongs to the class of control verbs that normally select the main clause object as the controller. However, for sentence (10) the MDP wrongly selects Bill as controller; in fact, the main clause subject Mary is the controller. Promise belongs to the class of control verbs that usually select their subject as controller. Subject control verbs of the promise class constitute a problem for the MDP and have to be marked as exceptional in the lexicon. Another problem that the MDP faces derives from the fact that one and the same verb may have either subject or object control:
Bill_i thanked the coach_j for [Ø_j selecting him_i]. (11)
Bill_i thanked the coach_j for [Ø_i having been admitted to the tennis club]. (12)
In (11), which can be regarded as the default case, the controller is the coach (object control), whereas in (12) a control switch or shift takes place: the most likely referent of the implicit subject Ø is Bill (subject control). The MDP wrongly predicts that in both sentences the coach should be the controller. Other syntactic control theories have been proposed by Joan Bresnan (1982) (see Lexical Functional Grammar) and Emmon Bach (1979) in a logical framework (see Montague Grammar). What these theories have in common is that they regard syntactic structure as the decisive factor for determining the control relationship. However, there is a substantial body of evidence that strongly suggests the importance of semantic and pragmatic factors in the determination of control relations.
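The Minimal Distance Principle is simple enough to state as a procedure. The following Python sketch is a toy illustration, not Rosenbaum’s formalization: it assumes that ‘closest’ can be approximated by linear word position, and the names `NounPhrase`, `mdp_controller`, and `ATTESTED` are hypothetical. It applies the principle to the shared structure of sentences (9) and (10) and compares the result with the readings reported in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NounPhrase:
    text: str
    role: str       # grammatical role in the main clause
    position: int   # word index; linear order stands in for structural distance


def mdp_controller(main_clause_nps: List[NounPhrase], empty_subject_pos: int) -> NounPhrase:
    """Pick the main clause NP that precedes Ø and lies closest to it."""
    candidates = [np for np in main_clause_nps if np.position < empty_subject_pos]
    return max(candidates, key=lambda np: np.position)


# Shared skeleton of (9) and (10): Mary VERB Bill [Ø to clear her desk].
NPS = [NounPhrase("Mary", "subject", 0), NounPhrase("Bill", "object", 2)]
EMPTY_SUBJECT_POS = 3

# Controllers reported in the text for sentences (9) and (10).
ATTESTED = {"persuade": "Bill", "promise": "Mary"}

for verb, attested in ATTESTED.items():
    predicted = mdp_controller(NPS, EMPTY_SUBJECT_POS).text
    print(verb, predicted, attested, predicted == attested)
```

Because the procedure never consults the verb, it returns the same controller for both sentences. That is the core of the objection raised above: a purely positional rule cannot distinguish persuade from promise without a lexical exception list.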
2.2 Semantic Approaches
A comparison of sentences (9) and (10), which look alike structurally, invites the conclusion that their differing control properties are caused by the meanings of the verbs persuade and promise, respectively. To persuade somebody to do something means that the addressee of an act of persuasion is led to believe that he or she should do something. This prospective action is expressed in the infinitival complement clause. Since in an active sentence such as (9) the addressee is the grammatical object of the main clause, it is not surprising that this constituent functions as controller of the empty subject of the subordinate clause. In contrast, a promise involves a commitment of the speaker to a future course of action. In an active sentence such as (10), the ‘promiser’ is the grammatical subject and one would therefore expect the subject to be controller of the understood subject of the complement clause. The most complete account that acknowledges the importance of the meaning of control verbs is probably Sag and Pollard’s (1991) control theory. These authors use semantic criteria to distinguish three classes of control verbs: (a) the order/permit class, which typically induces object control; (b) the promise class, which typically induces subject control; and (c) the want/expect class, which again induces subject control. The order/permit and the promise classes conceptually imply actions, whereas verbs of the want/expect class express what Sag and Pollard call an ‘orientation’ in their complement clause.
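The three-way classification can be read as a lookup from verb class to typical controller. The Python sketch below is an illustration only: the dictionaries `VERB_CLASS` and `TYPICAL_CONTROLLER` and the function `typical_controller` are hypothetical names, and the tiny lexicon contains just the verbs that give the classes their labels in the text, not Sag and Pollard’s actual verb lists.

```python
# Typical controller per semantic class, as summarized in the text.
TYPICAL_CONTROLLER = {
    "order/permit": "object",
    "promise": "subject",
    "want/expect": "subject",
}

# Assumed toy lexicon: only the verbs that name the three classes.
VERB_CLASS = {
    "order": "order/permit",
    "permit": "order/permit",
    "promise": "promise",
    "want": "want/expect",
    "expect": "want/expect",
}


def typical_controller(verb: str) -> str:
    """Return the main clause role that typically controls Ø for this verb."""
    return TYPICAL_CONTROLLER[VERB_CLASS[verb]]


for verb in ("order", "promise", "expect"):
    print(verb, "->", typical_controller(verb))
```

Unlike a positional rule, this lookup yields subject control for promise, in line with the reading of sentence (10). The word ‘typically’ matters, though: the table encodes defaults, and a default can be overridden in a given sentence.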
2.3 Pragmatic Approaches
Since the mid-1980s, the insight has grown that, in addition to the meaning of control verbs, the context in which a sentence is uttered and world knowledge about the participants’ status play a significant role in assigning the coreferential links between controller and controllee (e.g., Comrie 1984, 1985, Růžička 1999; see Pragmatics: Linguistic). These additional pragmatic factors are illustrated by the following German sentences (adapted from Wegener 1989):
Der Anwalt_i versprach dem Häftling_j [Ø_j bald aus dem Gefängnis entlassen zu werden]. (13)
‘The lawyer promised the prisoner that he (= the prisoner) would be released from jail soon.’
Der Häftling_i versprach seiner Frau_j [Ø_i bald aus dem Gefängnis entlassen zu werden]. (14)
‘The prisoner promised his wife that he (= the prisoner) would be released from jail soon.’
The most plausible interpretation for both (13) and (14) is that the prisoner will be released from jail. These interpretations follow from the speech act verb versprechen, ‘promise,’ the status of the referents of the noun phrases der Anwalt, dem Häftling, and seiner Frau, and the meaning of the passive complement clause. A promise implies a promiser who is committed to a future action, and a promisee who benefits from the action of the promiser. This semantic–pragmatic scenario can be applied to the interpretation of sentences (13) and (14) and the most plausible controller can be inferred (cf. Panther and Köpcke 1993, Panther 1994). In (13) the lawyer is the promiser; hence the lawyer is committed to some future action. However, the complement clause does not express an action explicitly. A step toward a reasonable interpretation is to assume that the complement clause expresses the result of an action performed by the promiser. The action itself has to be inferred. Furthermore, the result of the action is beneficial to the prisoner. Therefore, the most plausible controller is the noun phrase dem Häftling. On this account, the assignment of the controller is not a matter of finding the noun phrase that is syntactically closest to Ø in the matrix clause, but a matter of common-sense knowledge about the roles of lawyers and prisoners. For (14), a coherent interpretation results if it is assumed that the prisoner can behave in such a way that he causes his release from jail. Again, the complement clause is interpreted as the result of an action, namely one carried out by the prisoner to the effect that he will be released from prison. His wife can be regarded as the beneficiary of this action.
The interpretations sketched above are not the only ones possible. On the assumption that the lawyer is in jail, (13) could be interpreted as conveying that the lawyer will be released from jail; and (14) could have the interpretation, ‘The prisoner promised his wife that she would be released from jail soon,’ given the background knowledge that the prisoner’s wife is in prison. It is in the nature of the pragmatic approach to control that different background assumptions have an impact on what is the most plausible controller of Ø. Similar observations have been made for Chinese by Huang (1994, p. 153), who contrasts the following sentences (referential subscripts and bracketing have been added):
Xuesheng_i daying laoshi_j [Ø_i mingtian jiao zuowen]. (15)
‘The pupil promises the teacher that he (= the pupil) will hand in the essay tomorrow’
Laoshi_i daying xuesheng_j [Ø_j mingtian fang yi tian jia]. (16)
‘The teacher promises the pupil that he (= the pupil) will have a day off tomorrow’
The control verb daying, ‘promise,’ can be used either in the ‘unmarked,’ i.e., normal, sense ‘commitment-to-action,’ as in (15), or in the ‘marked’ sense ‘commitment-to-permission,’ as in (16). These senses are derived pragmatically; they constitute ‘preferred’ interpretations that can be inferred on the basis of the meaning of the speech act verb daying and world knowledge about teacher–pupil interactions in the institutional framework of a school. To summarize, the interpretation of control sentences relies not only on syntactic and semantic information but also, to a substantial degree, on world knowledge and contextual clues about the utterance situation.
3. Language-specific Coding Differences of Control Relations
One might surmise that the mechanisms of control are very similar across languages, but on closer inspection it turns out that there are language-specific differences
in the way that control structures are coded. A selective list of these coding differences is given below (for a detailed account, see Růžička 1999).
3.1 Modal vs. Non-modal Coding
For many speakers of English the following sentence is ambiguous:
Johnny_i asked his mother_j [Ø_i/j to go to the movies]. (17)
Sentence (17) can mean either that Johnny wants his mother to go to the movies or that Johnny asks for permission to go to the movies himself. This latter interpretation seems to be the most likely one, assuming that Johnny’s mother has authority over Johnny (see Sect. 2.3). As Růžička (1999, p. 45) points out, the ‘ask-for-permission’ (subject control) reading is also available alongside the ‘ask-for-action’ interpretation (object control) in Spanish, although the concept of permission is not coded explicitly.
María_i le_j pidió a Juan_j [Ø_i hablar con los muchachos]. (18)
‘María asked Juan that she be allowed to talk to the boys’
The ‘ask-for-permission’ interpretation is not possible, however, if the same sentence is translated literally into German:
Hänschen_i bat seine Mutter_j [Ø_j ins Kino zu gehen]. (19)
‘Johnny asked his mother to go to the movies’
Sentence (19) can only mean that Johnny’s mother is supposed to go to the movies. In order to express the ‘ask-for-permission’ sense, with Johnny as controller, German speakers have to resort to an utterance that codes permission by means of the modal auxiliary dürfen:
Hänschen_i bat seine Mutter_j [Ø_i ins Kino gehen zu dürfen]. (20)
‘Johnny asked his mother to be allowed to go to the movies’
3.2 Causative vs. Non-causative Coding
Another important coding difference between English and languages like German, Russian, and Czech is exemplified by
Peter_i persuaded his father_j [Ø_j to be examined by a doctor]. (21)
Sentence (21) means that Peter’s father should undergo a medical examination. This sense is, however, not explicitly coded in sentence (21), but is nevertheless conventionally conveyed by it.
What is coded in the subordinate clause is the result of the action, not the action itself. If sentence (21) is translated literally into German, for many speakers ungrammaticality results and, for those who accept the sentence, an interpretation with Peter as controller is more plausible:
?Peter_i überredete seinen Vater_j [Ø_i von einem Arzt untersucht zu werden]. (22)
‘Peter persuaded his father that he (= Peter) should be examined by a doctor’
The only way of rendering the intended message of (21) in German or Russian is to code his father as an active participant who engages in an action which results in a medical examination by a doctor:
Peter_i überredete seinen Vater_j [Ø_j sich von einem Arzt untersuchen zu lassen]. (23)
‘Peter persuaded his father to have himself (= his father) examined by a doctor’
Petr_i ugovoril otca_j [Ø_j podvergnut’ sebja osmotru vrač]. (Růžička 1999, p. 58) (24)
‘Peter persuaded father that he (= father) should undergo a medical examination’
Both German and Russian have to code the agent role of father explicitly. This is done by the causative verb lassen in German and the action verb podvergnut’ sebja ‘submit oneself’ in Russian.
3.3 Implicit Control vs. Explicit Control
Another important difference between English and languages such as German and Russian concerns the ability for a conceptually present controller to remain unexpressed. In German it is possible to say:
Der Professor_i bat Ø_j [Ø_j den Zeitschriftenartikel zu kopieren]. (25)
‘The professor requested (of some person) that this person photocopy the journal article’
In (25) the controller of the understood subject remains unexpressed. The person or group of persons supposed to comply with the request is indeterminate. This phenomenon may be called implicit control (cf. Panther 1997). However, a word-for-word translation of (25) would not have the same effect in English:
The professor_i asked Ø_j [Ø_i to photocopy the journal article]. (26)
As the referential subscripts indicate, the only possible interpretation of (26) is that the professor himself wants to photocopy the journal article and asks for permission to do so. In German, however, the translation of (26) requires modal coding with dürfen ‘be allowed to’ (see Sect. 3.1). Another example of implicit control is (27):
Der Arzt_i riet Ø_j [Ø_j den Fettanteil in meinem Essen zu verringern]. (27)
‘The doctor advised (someone) that this person should reduce the amount of fat in my diet’
The literal translation of (27) in English leads to ungrammaticality (marked by an asterisk):
*The doctor_i advised Ø_j [Ø_j to reduce the amount of fat in my diet]. (28)
These examples illustrate a general tendency in English: in obligatory infinitival control constructions the controller has to be made explicit, whereas in languages such as German and Russian it can remain unexpressed. However, sometimes there is a way to circumvent the principle in English by resorting to a gerund construction. A possible translation of (27) is:
The doctor_i advised Ø_j [Ø_j reducing the amount of fat in my diet]. (29)
4. Conclusion
Human languages employ various means to keep track of referents on the sentence and the discourse level. Control relations can be seen in this larger framework of coreference phenomena in general (see Anaphora). As with other coreference relations, it turns out that the interpretation of control structures involves a complex interplay of syntactic, semantic, and pragmatic parameters.
See also: Linguistics: Incorporation; Syntax; Syntax–Semantics Interface; Valency and Argument Structure in Syntax
Bibliography
Bach E 1979 Control in Montague Grammar. Linguistic Inquiry 10: 515–31
Bresnan J 1982 Control and complementation. Linguistic Inquiry 13: 343–434
Comrie B 1984 Subject and object control: Syntax, semantics, pragmatics. Berkeley Linguistics Society: Proceedings of the Tenth Annual Meeting. Berkeley Linguistics Society, Berkeley, CA
Comrie B 1985 Reflections on subject and object control. Journal of Semantics 4: 47–65
Huang Y 1994 The Syntax and Pragmatics of Anaphora: A Study with Special Reference to Chinese. Cambridge University Press, Cambridge, UK
Nishigauchi T 1984 Control and the thematic domain. Language 60: 215–50
Panther K-U 1994 Kontrollphänomene im Englischen und Deutschen aus semantisch-pragmatischer Perspektive. Narr, Tübingen
Panther K-U 1997 An account of implicit complement control in English and German. In: Verspoor M, Lee K D, Sweetser E (eds.) Lexical and Syntactical Constructions and the Construction of Meaning. Benjamins, Amsterdam
Panther K-U, Köpcke K-M 1993 A cognitive approach to obligatory control phenomena in English and German. Folia Linguistica 27: 57–105
Postal P M 1970 On coreferential complement subject deletion. Linguistic Inquiry 1: 439–500
Rosenbaum P S 1967 The Grammar of English Predicate Complement Constructions. MIT Press, Cambridge, MA
Rosenbaum P S 1970 A principle governing deletion in English sentential complementation. In: Jacobs R, Rosenbaum P S (eds.) Readings in English Transformational Grammar. Ginn, Waltham, MA
Růžička R 1999 Control in Grammar and Pragmatics. Benjamins, Amsterdam
Sag I A, Pollard C 1991 An integrated theory of complement control. Language 67: 63–113
Wegener H 1989 ‘Kontrolle’ – semantisch gesehen: Zur Interpretation von Infinitivkomplementen im Deutschen. Deutsche Sprache 17: 206–28
K.-U. Panther
Syntax
The linear arrangement of the words in a sentence is rarely arbitrary, and the morphological shape which words take often depends on other words they co-occur with. The part of grammar that deals with the pertinent regularities is called syntax.
1. Functions of Syntax
Loosely speaking, the variations in form and position that constitute syntax fulfill a number of functions: The words and phrases of a sentence stand in various meaning relations to each other, and these relations are encoded by syntactic differences. A basic function is the expression of so-called ‘thematic relations’ or ‘argument linking’: a verb such as kill expresses a two-place predicate KILL (x, y), a relation between an agent x and a patient y in an event of killing. Such argument roles of a predicate must be linked to the argument expressions in a sentence, and at least in the ideal case, this linking is carried out systematically in natural language, for example by word order, as in English (1), or by case, as in German (2):
(1) the tiger killed the buffalo.
    agent predicate patient
(2) dass der Tiger den Büffel tötete
    that the.nom tiger the.acc buffalo killed
Argument linking is not the only semantic-pragmatic function figuring in syntax. Argument places of verbs can, for example, be questioned (who did he see?), and languages may require that question phrases appear in specific positions (e.g., clause initially, as in English). Information structure (‘new vs. old information,’ ‘focus vs. topic’) is a relevant dimension for sentence construction too, and may influence word order, for example. Unlike (2), German (3) seems to be fully felicitous only in a situation in which the crucial new information is the identification of the tiger as the agent of killing.
(3) dass den Büffel der Tiger tötete.
A characterization of the functions that syntax encodes must be complemented by an identification of
the means of encoding. Word order and shape variation are two options. The latter comes in two varieties. The fact that a noun is the agent of a verb may be encoded by shape variation of the noun (= case), or by shape variation of the verb (agreement markers), or by a combination of both. ‘Word order,’ on the other hand, seems to be the result of mapping the two-dimensional hierarchical structure of a sentence onto the unidimensional sound (or letter) stream (see Sect. 4 below). Intonational differences often serve to express pragmatic and semantic functions, but they seem to be employed only rarely in argument linking.
2. Argument Roles
In a sentence such as (4), we can distinguish words and phrases that are linked to argument places (here Jack and the girl) of the predicate (saw), and words and phrases that are not, and which instead have an adjunct function (in the park, yesterday).
(4) Jack saw the girl in the park yesterday.
Let us confine our attention to argument roles. They are linked to (the meaning of) the predicate hierarchically. For a verb such as kill, the agent is higher in this hierarchy than the patient. Grammatical differences reflect these hierarchies systematically. When two arguments of a predicate refer to the same entity, the higher argument will, for example, be realized in its usual shape, whereas the lower argument takes a special form (it is, e.g., realized as a reflexive pronoun), but not vice versa.
(5) (I expect) Bill to wash himself.
    (I expect) *himself to wash Bill.
[Ungrammatical strings are preceded by an asterisk, *.]
For prototypical instances of ‘transitive’ relations, that is, for verbs expressing instantaneous causation of an event affecting a patient by an agent (kill, beat), the hierarchy seems to be the same in all languages, with agents always being more prominent than patients. Potential exceptions to this (Dyirbal, Arctic Eskimo; see e.g., Marantz 1984) can presumably be explained away. For less prototypical relations (e.g., for psychological predicates), grammatical hierarchies are linked to semantics less systematically, as the difference between like and please suggests: the ‘experiencer’ of the psychological state is encoded as the subject (as the higher role) with like, but as an object (the lower role) with please. Languages may differ in this respect: in some, nearly all psychological predicates follow the model of please (Basque), in others, all relevant verbs pattern with like (Lezghian), and in still others (like English), both construction types occur (see Bossong 1998).
In addition to identifying their place in the hierarchy, argument roles have often been classified in terms of notions that we have already used informally, namely thematic (‘theta’) roles (Gruber 1965, Dowty 1991, Jackendoff 1990; but also Hjelmslev 1972, 1935–7, Jakobson 1971, 1936) such as ‘agent,’ ‘instrument,’ ‘goal,’ ‘experiencer,’ ‘source,’ ‘patient,’ etc. One needs to refer to these notions in the context of selecting the proper preposition or semantic case, but for capturing core syntactic laws, the necessity of integrating theta roles (in addition to hierarchy statements) into grammatical descriptions has yet to be established (apart, perhaps, from the special role played by the agent function). At least, most if not all current grammatical models can do without them easily (Fillmore 1968 and approaches following him are notable exceptions).
3. Grammatical Functions
One issue of syntactic theory is whether argument roles can be linked directly to the formal means of encoding, or whether the rules of encoding crucially employ ancillary notions, namely, the grammatical functions ‘subject,’ ‘direct object,’ ‘indirect object,’ etc. Some grammatical models (Relational Grammar, Perlmutter 1983; Lexical Functional Grammar, Bresnan 1982) assume that such grammatical relations are indispensable primitive notions of syntax. Other approaches, such as the Standard Theory (Chomsky 1965) or the Government and Binding (GB) framework (Chomsky 1981), take grammatical functions (at best) to be mere classificatory concepts of little explanatory value, which can be defined in terms of structure and hierarchy: a ‘direct object’ is a noun phrase ‘governed’ by the verb, or it is a noun phrase which is a structural sister of the verb. Thus, Lexical Functional Grammar assumes that sentences are linked to a functional structure, in which, e.g., the verb kill is linked to two abstract grammatical functions, subject and object. How these grammatical relations are spelled out is a function of language-particular rules. In English, grammatical functions are encoded by word order (more precisely, hierarchically): the direct object has to follow the verb immediately (1). In German (2, 3), the grammatical functions seem encoded by case marking (direct objects bear accusative case). The mapping from argument structure to formal means of expression would thus proceed as in (6):
(6) kill (role 1, role 2)
    kill (subject, object)
    kill (nominative, accusative)
Lexical Decomposition Grammar (Wunderlich 1997) translates argument hierarchies directly into morphological markings or order relations; that is, the second step in (6) is skipped.
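The two routes from argument roles to case marking described for (6) can be written out as a small sketch. The dictionary and function names below are hypothetical labels chosen for illustration, and the case values follow the German pattern mentioned in the text; this is not an implementation of either grammatical framework.

```python
# A toy version of the mapping in (6); all names are illustrative assumptions.

# Step 1: argument roles -> grammatical functions
ROLE_TO_FUNCTION = {"role 1": "subject", "role 2": "object"}

# Step 2: grammatical functions -> formal marking (German-style case)
FUNCTION_TO_CASE = {"subject": "nominative", "object": "accusative"}

# Direct route: position in the role hierarchy -> case, with no function layer
CASE_BY_RANK = ["nominative", "accusative"]


def link_via_functions(roles):
    """Map roles to cases through the intermediate layer of grammatical functions."""
    return [FUNCTION_TO_CASE[ROLE_TO_FUNCTION[role]] for role in roles]


def link_directly(roles):
    """Map roles to cases by hierarchy rank alone, skipping grammatical functions."""
    return [CASE_BY_RANK[rank] for rank, _ in enumerate(roles)]


kill_roles = ["role 1", "role 2"]  # ordered from highest to lowest role
print(link_via_functions(kill_roles))
print(link_directly(kill_roles))
```

For a simple transitive verb both routes give the same marking, which is why the debate over the intermediate layer has to be settled on other grounds.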
The Standard Theory and the GB framework assume that sentences are linked to an abstract ‘deep structure,’ in which the words and phrases combine hierarchically, with argument role hierarchies being reflected by argument expression hierarchies in deep structure, as in (7). Such deep structures are mapped on to audible representations in various ways, e.g., by case marking, reordering, etc.
(7) [Infl-Phrase Infl [VP [the tiger] [killed [the buffalo]]]]
In other words, there is no consensus concerning the status of grammatical functions. It seems fair to say, however, that no universally applicable definitional criteria for, e.g., subjecthood, have been found so far. This suggests that grammatical functions must either be considered fuzzy concepts, or that they reflect a grammatical perspective that may be adequate in some languages only. The nature of grammatical functions has often been discussed in the context of the claim that one type of linking argument expressions to argument places, namely the hierarchical-positional (‘configurational’) one that we find in English, is more fundamental than the others (such as case marking) in the sense that it underlies the grammatical system of all languages: at deep structure level, all languages follow roughly the constructional logic of English. This is the position held by the mainstream version of the GB framework, and has been taken over in the Minimalist Program (Chomsky 1995). The contrary view that there are no privileged systems has been a minority position in GB (e.g., Hale 1983), and is the position held by Lexical Functional Grammar and Relational Grammar. Debates in the 1980s primarily contrasted languages with rigid word order and little case marking (English, Chinese) with languages with a relatively elaborate case system and free constituent order (German, Japanese). One outcome is the insight that a strong hierarchy-based configurational component can be identified in the grammar of languages with a fairly rich case marking too (e.g., Haider 1993).
On the other hand, it may also be uncontroversial that polysynthetic languages (such as Mohawk) with a very rich system of verbal agreement have a grammatical system that differs radically from, say, English and Japanese. It appears as if nominal expressions in the polysynthetic languages do not show a grammatical behavior that is even remotely reminiscent of subjects and objects—rather, their grammar is comparable to what holds for adjuncts (like adverbs) in English. Argument linking and other important grammatical factors seem to be dealt with in the morphology only. Why this is so, and whether this means that there are languages without a syntax proper (a position strongly argued against by Baker 1996) is an open issue.
4. Movement
If more than one semantic–pragmatic relation has to be expressed, and if the same encoding mechanism is used for the different functions, the encoding rules may impose incompatible requirements on a phrase in a sentence. In English, the positional system of identifying arguments requires that objects follow the verb, whereas question phrases have to appear at the beginning of a clause. These requirements cannot be met simultaneously:
(8) (a) *(I do not care) you like who
    (b) (I do not care) who you like
One way of dealing with this problem has been the postulation of various levels of representation, such that different requirements have to be met at different levels. In the Chomskyan tradition (see e.g., Chomsky 1957, 1965, 1981) a level called deep structure/d-structure, representing the thematic relations among arguments and predicates, has played a prominent role until quite recently. This d-structure (corresponding to (9a)) would then be mapped on to representations coming close to the phonetic ‘surface’ by a set of structure-changing operations (‘transformations’), of which the movement (more precisely: fronting) of phrases has been the most important one (relative to (9a), who has moved to clause initial position in (9b)). The law that requires that complement questions begin with a question word applies at ‘surface structure’ (s-structure) only, not at deep structure.
(9) (a) you like who?
    (b) who do you like?
Details changed in the 1990s, but the distinction between structure building operations encoding argument linking and structure-changing operations (movement) is still a key property of all grammatical models inspired by Chomsky. Most alternative approaches (e.g., Generalized Phrase Structure Grammar; Gazdar et al. 1985) differ at a purely technical level only.
Optimality Theory, borrowed from phonology (Prince and Smolensky 1993; see also Grimshaw 1997, Kager 1999, Müller 2000), allows the handling of conflicts between incompatible grammatical requirements in a quite different way: the grammatical principles are ordered, and lower principles (such as: the object appears to the right of the verb) may be violated in the interest of higher ones (such as: question phrases appear clause-initially). Whether such options will eventually reduce the importance of movement metaphors in grammatical descriptions is an open question at the time of writing. Beginning with the seminal work of Ross (1967), investigations on syntactic movement have brought to light a new type of syntactic law. There is a set of ‘locality conditions’ on movement, in the sense that words and phrases cannot be moved out of certain kinds of construction. Thus, in English, simple complement clauses are transparent for movement, while complement questions or relative clauses are not. (The place where a phrase has moved from is indicated by __.)
(10) (a) you think that she likes someone
         who do you think that she likes __
     (b) you wonder when she kissed someone
         *who do you wonder when she kissed __
     (c) you know a man who kissed someone
         *who do you know a man who kissed __
In (10b) and (10c), we observe an ‘intervention’ effect. The phrase who should be placed in the position indicated by __ at d-structure (because of the laws of argument linking), and in sentence initial position at s-structure because of the laws of question formation. The ‘preposing’ of who yields ungrammaticality in (10b) and (10c) because phrases cannot cross similar phrases (like when in (10b)) when they move to their surface position. A similar intervention effect can be observed in (11) when a choice must be made between different phrases that could be placed into sentence-initial position: again, a phrase of type X must not cross a phrase of the same type in this context. This is the so-called superiority effect (see Chomsky 1973, also Chomsky 1995, Rizzi 1990).
(11) (a) who do you expect __ to buy what
     (b) *what do you expect who to buy __
Furthermore, only complement clauses with an object function are transparent for movement, but not subject clauses (12a) and adjunct clauses (12b):
(12) (a) *who is to kiss __ fun?
     (b) *who do you weep because you met __?
The Chomskyan approach (in the narrower sense) tries to identify a set of purely formal and universally valid syntactic laws that characterize such ‘island’ effects (see Chomsky 1981, 1986). The empirical challenge is that the constraints proposed so far are not being respected in all languages in the same way.
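The ranked-principle idea attributed to Optimality Theory in this section can be illustrated with a toy evaluation of the two word orders in (8). The constraint functions, their names, and the candidate set are assumptions made for this sketch; real Optimality-Theoretic analyses use much richer candidate sets and constraint inventories.

```python
# Toy ranked-constraint evaluation; constraint names and candidates are illustrative.

def wh_initial(words):
    """One violation if a question word is present but not clause-initial."""
    return 0 if "who" not in words or words[0] == "who" else 1


def object_after_verb(words):
    """One violation if the object 'who' does not follow the verb 'like'."""
    return 0 if words.index("who") > words.index("like") else 1


def optimal(candidates, ranking):
    """Pick the candidate whose violation profile is smallest under the ranking.

    Profiles are compared lexicographically, so a single violation of a
    higher-ranked constraint outweighs any violation of a lower-ranked one.
    """
    return min(candidates, key=lambda words: [c(words) for c in ranking])


CANDIDATES = [["you", "like", "who"], ["who", "you", "like"]]

# English-style ranking: the question-phrase requirement outranks object placement.
winner = optimal(CANDIDATES, [wh_initial, object_after_verb])
print(" ".join(winner))
```

Reversing the ranking in this sketch selects the in-situ order instead, which is one way of picturing how the same two principles could yield different surface patterns.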
5. Logical Forms
Not all languages require that constituent questions be formed by the fronting of a question word or phrase (wh-phrase). In Chinese, wh-phrases are realized in the position where they should appear according to the laws of argument linking.
(13) ni xihuan shei
     you like who
     ‘who do you like?’
Interestingly, the scope options of Chinese wh-phrases are constrained, and in a way reminiscent of what we saw in the preceding section for English. Thus, a wh-phrase which takes scope over the whole matrix clause may appear as part of an object clause (14), but cannot show up in a subject clause (15). This is identical to what we have just observed for English (10a), (12).
(14) Zhangsan xiangxin shei mai-le shu
     Zhangsan believe who bought books
     ‘who does Zhangsan believe bought books?’
(15) *Zhangsan tao-le shei zhen kexi
     Zhangsan marry who real pity
     ‘for which x: that Zhangsan married x is a real pity’
Such observations gave rise to the idea (Huang 1982), quite popular in the GB framework, that shei undergoes movement to the clause initial position corresponding to its semantic scope as well (so that it is clear why its scope behavior is constrained by restrictions on movement), but ‘invisibly’ so. There are two ways in which the relevant observations may be interpreted, one that gives priority to syntax, and one that does not. The GB framework postulated a grammatical architecture as in (16): by putting together words from the lexicon, a deep structure arises. This d-structure undergoes a number of transformations, mainly movement, in order to reach the interface levels Phonetic Form (PF) (interface with the articulatory–perceptory system) and Logical Form (LF) (interface with the cognitive system). The first steps, so-called ‘overt’ syntax, are common to these two derivations targeting PF and LF; they yield surface structure (s-structure). ‘After’ s-structure, the two paths split. The syntactic derivation of LF proceeds in the standard way, but no longer affecting phonetic material (covert syntax). (16)
Lexicon
   |
d-structure
   |
s-structure
  /    \
PF      LF
In this model, the Logical Form of (14) is (17), i.e., question words are preposed in Chinese too, the only difference from English being that the movement of shei has applied ‘after’ s-structure, so that it lacks a phonetic effect. All languages, it is claimed, have the same Logical Form, but they may make (among other things) different decisions as to whether a certain movement operation (like the fronting of wh-phrases) applies in overt or covert syntax.
(17) shei Zhangsan xiangxin __ mai-le shu
     who Zhangsan believe bought books
Note that the major argument for such a view is the empirical observation that conditions c that constrain visible movement in some languages (see (12)) restrict the scope-taking behavior of certain phrases that have not been moved visibly (see (15)) in other languages. If one presupposes or shows that the relevant constraints c are formal-syntactic in nature, the conclusion that there is a syntactic component with invisible movement is quite compelling, but one must keep in mind that the overt movement operations constrained by c serve the purpose of scope-taking too.
Thus, one can also take quite a different perspective on facts as represented by (12) and (15): language could be governed by an elaborate system of semantic constraints on scope taking, and apparent syntactic conditions on movement might turn out to be superficial consequences of these constraints. Such semantic constraints affect the distribution of, for example, question words quite independently of whether they are moved syntactically or not. Proposals that can be related to this idea can be found in the literature (see, e.g., Erteschik-Shir 1977, Szabolcsi and Zwarts 1997), but a full semantic account of constraints on movement has yet to be developed.
6. Structural Hierarchies
The previous section illustrated the difficulty of disentangling syntax and semantics in the domain of constraints on movement. This difficulty reappears when one considers the role which the geometry of abstract syntactic structures plays for quite a variety of relations and processes. (18)–(20) illustrate probably universal asymmetries that are cases in point: the pronoun he can be bound semantically by no one in (18a) only, but not in (18b): (18b) cannot have the interpretation sketched in (18c). Ever in (19) is a so-called ‘negative polarity item.’ It can appear in a sentence only if there is a negating expression present in that sentence. Nobody manages to license the polarity element ever in (19a), but not in (19b). (20) illustrates the fact that question (wh-) phrases are almost always moved to the left (if at all), but not to the right.
(18) (a) No one thinks that he is incompetent.
     (b) *He thinks that no one is incompetent.
     (c) ¬∃x (x thinks: x is incompetent)
(19) (a) Nobody thinks he will ever have enough money.
     (b) *Bill ever thinks nobody will have enough money.
(20) (a) you tell me that she likes who
         who do you tell me that she likes?
     (b) you tell who that she likes Mary
         *you tell that she likes Mary who
Such asymmetries were noted very early in modern syntax theory. After several unsuccessful attempts involving crucial reference to precedence, Culicover (1976) and Reinhart (1976) discovered the relevant purely structural factor at more or less the same time. The term c-command goes back to Reinhart (1976). We have reformulated it here as in (21), in a way that fits easily into present-day assumptions.
(21) a c-commands b if b is part of the structural sister of a
Recall that argument hierarchies are mapped, e.g., on to hierarchical abstract deep-structures, which are then transformed to surface structures by further operations. For (22), one might postulate a surface structure such as (23) in the bracket notation, and (24) in the tree notation.
(22) the dangerous tiger has killed the old buffalo
(23) [[the [[dangerous] [tiger]]] [has [[killed] [the [[old] [buffalo]]]]]]
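The definition in (21) can be checked mechanically on bracketed structures. The sketch below is only an illustration: it encodes constituents as nested Python lists, treats ‘no one’ as a single leaf, and uses simplified bracketings of (18a) and (18b) that are assumptions of this example rather than analyses argued for in the text.

```python
# Constituents are nested lists; words (leaves) are strings.

def leaves(node):
    """Return all words contained in a constituent."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node:
        words.extend(leaves(child))
    return words


def c_commands(tree, a, b):
    """True if a c-commands b in the sense of (21): b is part of a sister of a."""
    if isinstance(tree, str):
        return False
    for i, child in enumerate(tree):
        if child == a:
            sisters = tree[:i] + tree[i + 1:]
            if any(b in leaves(sister) for sister in sisters):
                return True
    # Otherwise look for a deeper inside the structure.
    return any(c_commands(child, a, b) for child in tree)


# Simplified bracketings of (18a) and (18b)
S18A = ["no one", ["thinks", ["that", ["he", ["is", "incompetent"]]]]]
S18B = ["he", ["thinks", ["that", ["no one", ["is", "incompetent"]]]]]

print(c_commands(S18A, "no one", "he"))
print(c_commands(S18B, "no one", "he"))
```

Under these assumed bracketings, the quantified phrase has the pronoun inside its sister only in the first structure, which matches the contrast reported for (18a) and (18b).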
(24) [Tree diagram of (23): the dangerous tiger has killed the old buffalo, with nodes labeled s, t, v, w, x, y, z]
We have labeled the nodes in the tree (24) for ease of reference. We say that A dominates B iff there is a uniquely downwards directed path in the tree from A to B. Thus, s dominates x, but neither y nor z dominate w. A immediately dominates B iff A dominates B and there is no C that dominates B but not A. A is a sister of B if there is a C such that C immediately dominates A and B. Therefore, (21) implies that z c-commands w but not vice versa. The following generalizations then hold for natural languages:
(25) A noun phrase A can bind a pronoun B only if A c-commands B.
(26) A negating expression A licenses a polarity element B only if A c-commands B.
(27) An element can move from position A to position B only if A c-commands B.
Let us exemplify the impact of these statements with (25) only. The examples in (18) translate into the simplified partial structural representations (28). It is easy to figure out that no one c-commands he in (28a), but not in (28b), so that (25) correctly predicts that no one can bind the pronoun in (18a) only.
(28) (a) [no one [thinks [that [he [is [incompetent]]]]]]
     (b) [he [thinks [that [no one [is [incompetent]]]]]]
Reference to the structural notion c-command in (25) cannot be replaced by precedence, because such an account would fail to explain why nobody can neither bind the pronoun nor license the polarity element in (29).
(29) *a letter to nobody would ever please him
Presumably, the establishment of syntactic relations between A and B always presupposes that A c-commands B, or vice versa. Kayne (1994) has proposed that elements α that asymmetrically c-command β precede β in all natural languages. If his approach is correct, we would understand why the basic orders subject–verb–object (SVO, English), SOV (Japanese), and VSO (Irish) predominate among the world’s languages; subjects occupy a hierarchically higher position in sentence structure than objects, and should therefore precede them if Kayne is right.

7. Operations Affecting Grammatical Relations
That the formation of questions, relative clauses, topicalization, etc. involves a preposing operation in certain languages such as English seems to be accepted quite generally. There may even be psycholinguistic evidence suggesting that such movement dependencies are processed in a special way by the human parser (see, e.g., Frazier and Clifton 1993). In addition to such dependencies, in which phrases are displaced in order to encode semantic or pragmatic functions (so-called A-bar-movement), there are grammatical operations such as ‘passive’ (as in (30)) or ‘raising to subject’ (as in (31)) which primarily affect the grammatical functions of arguments: in a thematic sense, a moose in (30c) has properties of an object (it expresses the patient role); in other respects, it behaves like a subject (in terms of position and agreement with the verb).
(30) (a) one shot a moose
     (b) there was a moose shot __
     (c) a moose was shot __
(31) (a) it seems that he likes Mary
     (b) he seems __ to like Mary
That such operations affecting grammatical functions should be expressed in terms of movement (see (32)) is less obvious (at least for languages in which grammatical functions are not straightforwardly encoded positionally) and less widely accepted.
(32) d-structure: __ was shot a moose
     s-structure: a moose was shot __
Although the proper analysis of these processes is still subject to some debate, a number of locality restrictions on them have been uncovered, which are illustrated in (33).
(33) (a) I shoot a moose → a moose was shot
(b) I laugh at him → he was laughed at
(c) I expect her to win → she was expected to win
(d) I expect she will win → *she was expected will win
(e) I prefer very much for you to kiss Mary → *Mary was preferred very much for you to kiss

The passive operation may map a direct object (33a), an object in a prepositional phrase (33b), and the subject of a nonfinite complement (33c) to the subject position, but it cannot establish such a relation for the subject of an embedded finite clause (33d) or for any object in an embedded clause (33e).
The restrictions may be slightly different in other languages, as the contrast between English and German in (34) shows, but the crucial observation is that similar locality restrictions can be observed with the binding of reflexive pronouns, at least in the most restrictive subcase of this domain, as it is realized in English; see (35).

(34) (a) he forgot to repair the car
*the car was forgotten to repair
(b) er vergass den Wagen zu reparieren
he.nom forgot the.acc car to repair
der Wagen wurde zu reparieren vergessen
the.nom car was to repair forgotten

(35) he shot himself
he laughs at himself
he expects himself to win
*he expects himself will win
*he prefers very much for her to kiss himself

The set of constructions to which these locality restrictions, common to passivization and the grammar of reflexives, apply is larger: it includes quantifier scope. A quantifier s can always take scope over a quantifier t if s c-commands t at s-structure. Thus, since subjects are higher in the structural hierarchy than objects, sentence (36a) allows a reading in which the existential quantifier takes scope over the universal quantifier, as paraphrased in (36b).

(36) (a) some man loves every woman
(b) there is a man such that this man loves every woman
(c) for every woman there is a man (possibly a different one for each woman) such that that man loves the woman

In English, however, a structurally lower quantifier s in position S may additionally take scope over t in position T if S and T could interact in terms of, say, passive formation, or the binding of reflexive pronouns. Thus, (36a) allows the further interpretation indicated in (36c), in which the object quantifier takes scope over the subject quantifier. A comparison of (33), (35), and (37) exemplifies the parallelism between wide scope options for a hierarchically lower quantifier and the possibility of passive formation/binding of reflexive pronouns. See also Hornstein (1995).
(37) some woman laughs at every man (wide scope for every man possible)
some woman expects every man to be a loser (wide scope for every man possible)
some woman expects every man will be a loser (wide scope for every man impossible)
some woman prefers very much for me to kiss every man (wide scope for every man impossible)

If formal syntax is responsible for the restrictions exemplified in (33), the binding rules for reflexives and the scope-taking behavior of quantifiers must be reducible to syntactic operations too. However, it is hard to exclude the possibility that some extrasyntactic constraints governing quantifier scope or reflexive binding have side effects on passivization or raising, so that (33) would find a nonsyntactic account. Note, furthermore, that the impression that the same factor constrains the three different types of process discussed here may be spurious. Thus, in many languages (e.g., Swedish) the domain for the binding of reflexive pronouns is much larger than the domain for passive formation.
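The parallelism claimed for (33), (35), and (36)/(37) can be made explicit by tabulating the judgments reported in the text. The following is only a minimal sketch: the class name, the field names, and the configuration labels are illustrative assumptions, and the last row in the usage section is a hypothetical pattern of the Swedish kind, not a datum from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    """One structural configuration with the three judgments reported in the text."""
    label: str            # informal description (illustrative wording)
    passive_ok: bool      # judgment from (33)
    reflexive_ok: bool    # judgment from (35)
    wide_scope_ok: bool   # judgment from (36a)/(37)


# Judgments copied from examples (33), (35), (36a), and (37).
CONFIGURATIONS = [
    Configuration("direct object", True, True, True),
    Configuration("object of a preposition", True, True, True),
    Configuration("subject of a nonfinite complement", True, True, True),
    Configuration("subject of a finite complement", False, False, False),
    Configuration("object inside an embedded clause", False, False, False),
]


def is_parallel(config: Configuration) -> bool:
    """True if passive, reflexive binding, and wide scope pattern alike."""
    return config.passive_ok == config.reflexive_ok == config.wide_scope_ok


def all_parallel(configs) -> bool:
    """True if every configuration shows the same judgment for all three processes."""
    return all(is_parallel(c) for c in configs)


if __name__ == "__main__":
    print(all_parallel(CONFIGURATIONS))
    # Hypothetical row: reflexive binding possible where passive is not,
    # the kind of mismatch the text attributes to languages such as Swedish.
    mismatch = Configuration("hypothetical long-distance reflexive", False, True, False)
    print(is_parallel(mismatch))
```

A table of this kind only records that the judgments coincide in English; as the text notes, it does not show that a single syntactic factor is responsible.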
8. A Note on Autonomy

As a reaction to so-called Generative Semantics, the hypothesis of the autonomy of syntax governed much of the history of current syntactic theorizing (see Newmeyer 1980): syntax was considered to be the only ‘generative’ component, the results of which are merely interpreted by semantics and by phonology. The preceding sections have revealed that it is often difficult to draw a sharp boundary between syntax and semantics, and strong adherence to the autonomy hypothesis may nowadays reflect a rather parochial perspective. The same may be true for the interaction of phonology and syntax. The laws of the placement of clitics (e.g., unstressed pronouns) as we find them in Romance and Slavic languages often make crucial reference to phonological conditions (no clitic element may appear in the initial position of an intonation phrase) and syntactic conditions (elements bearing a given case etc. must appear in the leftmost position of, say, the verbal projection) at the same time. Phonological conditions may even be responsible for the application of syntactic operations such as verb movement. The idea of a completely autonomous and encapsulated syntactic component, in which all processes can be understood in purely formal, syntax-based terms and are driven by syntactic conditions only, was most popular in the GB tradition; it therefore has to be modified, at least.
9. Approaches to Universals

The discovery of syntactic universals and their explanation has always been the major goal of syntactic theory. The typological tradition tries to attain this goal by comparing a large and representative sample of languages with respect to specific grammatical properties. Often this presupposes that the constructions can only be looked at in less detail than in the generative tradition. In the generative approach, the idea is to uncover abstract universal laws by an in-depth analysis of a few languages only.
The GB approach (Chomsky 1981) tried to identify a small set of simple yet universally valid syntactic principles, defining a Universal Grammar. It was assumed, especially in later stages of the theory, that formal differences between languages reduce to differences in the lexical inventory. The quest for such simple yet universally valid syntactic generalizations was not successful, however, so the GB framework was replaced by, essentially, two new models. The Minimalist Program (Chomsky 1995) tries to understand syntactic properties in terms of so-called interface conditions (phonology, semantics). It may have explanatory value, but its empirical coverage is sometimes limited. Optimality Theory (OT) (Prince and Smolensky 1993, Grimshaw 1997) assumes that there are universally valid and simple conditions. Phonology, syntax, and semantics are not necessarily isolated subcomponents. OT acknowledges that simple principles will be in conflict with each other. Such conflicts are resolved by ordering the principles hierarchically. The hierarchies differ from language to language, and this is the only source of grammatical differences. In particular, the major principles assumed in OT may correspond to ‘ideal types’ that languages may come close to in different degrees (as a function of the ranking of the relevant principles), so OT may have moved the generative enterprise closer to typology. This may be true in yet another sense: in contrast to GB, OT explicitly declares itself to be a theory of variation and its explanation.

See also: Constituent Structure: Syntactic; Linguistics: Incorporation; Syntactic Aspects of Language, Neural Basis of; Syntactic Control; Syntax–Phonology Interface; Syntax–Semantics Interface
Bibliography

Baker M 1996 The Polysynthesis Parameter. Oxford University Press, New York
Bossong G 1998 Le marquage de l’expérient dans les langues d’Europe. In: Feuillet J (ed.) Actance et Valence dans les Langues de l’Europe. Mouton de Gruyter, Berlin, pp. 259–94
Bresnan J (ed.) 1982 The Mental Representation of Grammatical Relations. MIT Press, Cambridge, MA
Chomsky N 1957 Syntactic Structures. Mouton, The Hague, The Netherlands
Chomsky N 1965 Aspects of the Theory of Syntax. MIT Press, Cambridge, MA
Chomsky N 1973 Conditions on transformations. In: Anderson S, Kiparsky P (eds.) A Festschrift for Morris Halle. Holt, Rinehart and Winston, New York, pp. 232–86
Chomsky N 1981 Lectures on Government and Binding. Foris, Dordrecht, The Netherlands
Chomsky N 1986 Barriers. MIT Press, Cambridge, MA
Chomsky N 1995 The Minimalist Program. MIT Press, Cambridge, MA
Culicover P 1976 A constraint on coreferentiality. Foundations of Language 14: 109–18
Dowty D 1991 Thematic protoroles and argument selection. Language 67: 574–619
Erteshik-Shir N 1977 On the Nature of Island Constraints. IULC, Bloomington, IN
Fillmore C 1968 The case for case. In: Bach E, Harms R (eds.) Universals in Linguistic Theory. Holt, Rinehart and Winston, New York, pp. 1–88
Frazier K, Clifton C 1993 Construal. MIT Press, Cambridge, MA
Gazdar G, Klein E, Pullum G, Sag I 1985 Generalized Phrase Structure Grammar. Blackwell, London
Grimshaw J 1997 Projections, heads and optimality. Linguistic Inquiry 28: 373–422
Gruber J 1965 Studies in lexical relations. Diss., MIT, Cambridge
Haider H 1993 Deutsche Syntax – generativ. Narr, Tübingen, Germany
Hale K 1983 Warlpiri and the grammar of non-configurational languages. Natural Language and Linguistic Theory 1: 5–47
Hjelmslev L 1972 [1935–7] La catégorie des cas. Étude de grammaire générale. W Fink, Munich, Germany
Hornstein N 1995 Logical Form. Blackwell, Oxford, UK
Huang C T J 1982 Move WH in a language without Wh movement. The Linguistic Review 1(4): 369–416
Jackendoff R 1990 Semantic Structures. MIT Press, Cambridge, MA
Jakobson R 1971 [1936] Beitrag zur allgemeinen Kasuslehre. Gesamtbedeutungen der russischen Kasus. Selected Writings II. Mouton, The Hague, pp. 23–71
Kager R 1999 Optimality Theory. Cambridge University Press, Cambridge, UK
Kayne R 1994 The Antisymmetry of Syntax. MIT Press, Cambridge, MA
Marantz A 1984 On the Nature of Grammatical Relations. MIT Press, Cambridge, MA
Müller G 2000 Elemente der optimalitätstheoretischen Syntax. Stauffenburg, Tübingen, Germany
Newmeyer F 1980 Linguistic Theory in America. Academic Press, New York
Perlmutter D (ed.) 1983 Studies in Relational Grammar 1. University of Chicago Press, Chicago, IL
Prince A, Smolensky P 1993 Optimality theory. Ms., Rutgers University, NJ
Reinhart T 1976 The syntactic domain of anaphora. Ph.D. diss., MIT, Cambridge, MA
Ross J R 1967 Constraints on variables in syntax. Ph.D. diss., MIT, Cambridge, MA
Rizzi L 1990 Relativized Minimality. MIT Press, Cambridge, MA
Szabolcsi A, Zwarts F 1997 Weak islands and an algebraic semantics for scope taking. In: Szabolcsi A (ed.) Ways of Scope Taking. Kluwer, Dordrecht, The Netherlands
Wunderlich D 1997 Cause and the structure of verbs. Linguistic Inquiry 28(1): 27–68
G. Fanselow
Syntax–Phonology Interface

An individual word, pronounced in isolation, will have a characteristic pronunciation. Yet the pronunciation of a sentence, made up of a sequence of words, is not merely a stringing together of these individual pronunciations. The pronunciation of a sentence, i.e., its phonetic realization, is a function of the surface phonological representation of the sentence. Standard models of the organization of a grammar hold that this phonological representation interfaces with the surface syntactic representation of the sentence, the latter being made up of the sequence of the underlying phonological representations of the words and morphemes that are actually pronounced, and their organization into a syntactic word and phrase structure. This organization of the grammar permits in principle that the syntactic representation of a sentence may influence its phonological representation, and therefore allows an explanation for why sentence phonology is not simply the phonology of the individual words of the sentence strung together.

Consider, for example, the sentence pairs No people will go vs. No, people will go and People will go happily vs. People will go, happily. In both cases, commas indicate a difference in syntactic structure within the pairs. That syntactic structure difference is reflected in a difference in pronunciation, specifically here in a difference in intonation. Effects like these of syntax on phonology have been studied most commonly, in part because earlier models of generative grammar sanctioned only effects in this direction. It is also possible that effects might go in the opposite direction, with phonological principles constraining the range of acceptable syntactic representations at the interface; some recent research explores this question.

Two major lines of thinking on the influence of syntax on phonology can be distinguished. The prosodic structure hypothesis holds that the phonological representation of a sentence is organized into a prosodic constituent structure which is independent of, but related to, the surface syntactic structure of a sentence (Selkirk 1986, 1995; Nespor and Vogel 1986; Truckenbrodt 1999).
It claims, moreover, that the syntax of a sentence can impinge directly only on this prosodic structure, namely on the organization into phonological words or phrases, and on the distribution of prosodic heads (or stresses). Other apparent effects of syntax on phonology, e.g., the choice of segmental variants or the placement of certain phrasal tones, are claimed to be indirect, mediated by the prosodic structure organization. And since the phonetic interpretation of a sentence is based on surface phonological representation, it follows that there are no direct effects of syntax on phonetic form. If the prosodic structure hypothesis embodies the correct theory of the domain structure for sentence phonology, then the theory of grammar must distinguish between the surface phonological representation of a sentence (PR), which would include this prosodic structure, and its surface syntactic representation (PF).

The direct access hypothesis (Kaisse 1985, Odden 1995), by contrast, holds that phenomena of sentence phonology and phonetics may be defined directly from the surface syntactic word and phrase structure. It does not require a theory of grammar that distinguishes the surface syntactic representation of a sentence crucially from its surface phonological representation, at least in its hierarchical structure. Evidence appears to favor the prosodic structure hypothesis.
1. The Prosodic Structure Hypothesis

A prosodic structure is a well-formed labeled bracketing or tree. The constituents of prosodic structure belong to distinct prosodic categories, arranged in a prosodic hierarchy.

(1) The Prosodic Hierarchy:
Utterance (Utt)
Intonational Phrase (IP)
Major Phonological Phrase (MaP)
Minor Phonological Phrase (MiP)
Prosodic Word (PWd)
Foot
Syllable

In the unmarked case, it is claimed, prosodic structure is strictly layered, in the sense that a constituent of a higher level in the hierarchy immediately dominates only constituents of the next level down in the hierarchy. In addition, within a prosodic constituent, in the unmarked case, one of the daughter constituents constitutes the prosodic head, the locus of prominence or stress.

Compelling support for the claim that the domain structure of sentence phonology and phonetics has the formal properties of such a structure comes from evidence of domain convergence and domain layering within individual languages. Yet a direct access approach could potentially model these effects. Crucial support for the prosodic structure hypothesis comes from the fact that the constraints which define the hierarchical phonological domain structure are heterogeneous in type, including prosodic structure markedness constraints which are properly phonological in character, appealing only to properties of surface phonological representation, and syntax–prosodic structure interface constraints, which call for features of the syntactic representation to be reflected in phonological representation. Effects of prosodic markedness constraints show that the domain structure of surface phonology is not strictly determined by the syntax, only influenced by it.

1.1 Domain Convergence

In English declarative sentences, the right edge of a major phrase (MaP) in prosodic structure is marked by the presence of a low tone. In (2a,b), L- marks the low target tone found on the final syllable of a MaP.

(2) (a) (NoL-) (animals are allowedL-) = No, animals are allowed. [ʔænəməlz]
(b) (No animals are allowedL-) = No animals are allowed. [ænəməlz]
We also see a segmental reflex of that same MaP organization: a glottal stop appears as the onset of the first syllable of animals, [ʔænəməlz], in (2a), where it is phrase-initial, but not in (2b). This glottalization effect is arguably phonetic in character (Dilley and Shattuck-Hufnagel 1996). In (3a,b) we see an additional MaP effect: a contrast in the pronunciation of the function word to.

(3) (a) (They’re allowed to graze thereL-) (by lawL-). [tə]
(b) (They’re allowed toL-) (by lawL-) [tuw]

To appears in its stressless, vowel-reduced weak form [tə] in (3a), where it is medial within the MaP, but in its stressed, full-vowel strong form [tuw] when it appears at the right edge of MaP in (3b) due to the ellipsis of the following verb phrase. In English, monosyllabic function words like to generally appear in strong form only at the right edge of MaP (or when they bear contrastive stress) (Selkirk 1995). These three different phenomena which converge on the major phrase in English together show that both edges of these constituents are simultaneously relevant to defining phonetic and phonological phenomena.

In Bengali sentence phonology (Hayes and Lahiri 1991), MaP is marked at its right edge by a H- tone. The left edge of MaP is the locus of further phenomena: the word that is leftmost in a MaP is the head of that phrase, and a Low pitch accent, L*, is located on the initial syllable of this stressed word:

(4) (ʃæL*moliH-) (raL*m-er bariH-) ([huL*ket.hilH-)L%
Shamoli Ram’s house entered
Shamoli entered Ram’s house.

In addition, there are segmental assimilation phenomena which apply optionally both within and between words. These assimilations operate across the span of the MaP, but are blocked if the sequence of segments belongs to different phrases:

(5) (uL*morH-) (t.aL*dorH-) (taL*ra-keH-) (dı! eL*t.heH-)L%
[r] [t.] [r] [t]
Amor scarf Tara-to gave
Amor gave the scarf to Tara.

(6) (uL*mor t.ador tara-keH-) (dı! eL*t.heH-)L%
[t.t.] [tt]
Amor gave the scarf to Tara.

The faster pronunciation in (6) is organized into fewer MaP than the more deliberate pronunciation of the same sentence in (5). This difference in domain organization is indicated both by the patterns of tone distribution and by the assimilation of /r/. This Bengali phrasal domain, then, shows right and left domain-edge phenomena, as well as phenomena defined on the prominent head of the domain and across the span of the domain.

1.2 Domain Layering

The edge of a clause in the syntax typically coincides with the edge of what prosodic structure theory calls
the intonational phrase (IP), the prosodic constituent immediately superordinate to the MaP in the prosodic hierarchy. In English, a final rising tonal contour on the last syllable of a word (notated L-H%), sometimes referred to as the continuation rise, is taken to indicate the presence of the right edge of IP (Beckman and Pierrehumbert 1986), indicated with curly braces.

(7) {(Since herds of grazing cowsL-) (have been allowed in her meadowL-)H%}
herds [hɚdz], have [əv], her [ɚ]
{(her flowers have all disappearedL-)}
her [hɚ], have [əv]

The right edge of an IP typically is accompanied by a short pause or a significant lengthening of the final syllable. In English (and other languages) the left edge of IP is, moreover, the site of phonetic strengthening effects that are potentially different in kind or in degree from those seen with the major phrase (Keating et al. 1998). For example, an underlying /h/ in English will typically fail to be pronounced if it is in an unstressed syllable (compare stressed herds to stressless her in the first clause), but at the beginning of an IP /h/ must necessarily be pronounced, even in a stressless syllable (compare stressless her in the second clause to stressless her in the first). By contrast, there is the possibility of absence of /h/ in the stressless have which is MaP-initial in the first clause. This difference in appearance of [h] provides further testimony of the difference in the two levels of domain structure.

In Japanese, a rise (LH) in pitch which is referred to as Initial Lowering is found at the left edge of a minor phrase (MiP) (Poser 1984, Kubozono 1993), indicated with angle brackets in (8b).

(8) (a) Yamaai-no yamagoya-no uraniwa-no umagoya-ni kabi-ga haeH*-ta
[[[[mountain village-GEN] hut-GEN] backyard-GEN] barn-LOC] [mold-NOM] [grow-PAST]
The barn in the backyard of a hut in a mountain village grew moldy.
(b) (<L H Yamaai-no yamagoya-no> <L H uraniwa-no umagoya-ni> <L H kabi-ga haeH*-ta>)L

Sentence (8b) shows an organization into binary MiP.
In (9b), by contrast, where all the words have a lexical accent, we see that there are as many instances of MiP, and hence of Initial Lowering, as there are of accents:

(9) (a) YamaH*gata-no yamaH*dera-no one:H*san-ga mayone:H*zu-o ho:baH*tteiru-wa
[[[Yamagata-GEN] mountain temple-GEN] young-lady-NOM] [[mayonnaise-ACC] filling-her-mouth]
The young lady from a mountain temple in Yamagata was filling her mouth with mayonnaise.
(b) (<L H yamaH*gata> <L H yamaH*dera> <L H one:H*san-ga> <L H mayone:H*zu-o> <L H ho:baH*tteiruwa>)L
The differences in patterns of minor phrasing seen in (8) and (9) are the result of prosodic markedness constraints, to be discussed below. At the level of MaP, the level above MiP in the prosodic hierarchy, two tonal phenomena are defined: catathesis and upward pitch reset (Poser 1984, Pierrehumbert and Beckman 1988). Catathesis is an accent-induced downstepping of the pitch range, while upward pitch reset returns the pitch range to a higher level. Only material within the same MaP undergoes catathesis. Upward reset, seen in both (13) and (14), appears at the left edge of MaP. These examples, then, illustrate the basic strict layering of unmarked hierarchical prosodic structure: here a MaP is parsed into a sequence of MiP.

1.3 Nonsyntactic Determinants of Phonological and Phonetic Domain Structure

A direct access theory could model patterns of edge-sensitive phonetic and phonological phenomena as long as the edges were definable solely in syntactic terms. But demonstrably nonsyntactic factors in phonological domain organization, such as prosodic markedness constraints and speech rate, show that the influence of syntax is indirect.

1.3.1 The interaction of prosodic markedness constraints and interface constraints. In optimality theory (Prince and Smolensky 1993), phonological markedness constraints call for some ideal phonological target output shape. Of particular importance to our topic, markedness constraints compete with a class of constraints called faithfulness constraints, which call for properties of the surface phonological representation to reflect, or be identical to, properties of a related grammatical representation.
Syntax–phonology interface constraints can be understood as a variety of faithfulness constraint, one of the input–output variety, if surface syntactic representation (PF) and surface phonological representation (PR) are in an input–output relation, or one of an output–output variety, if PF and PR are evaluated in parallel, not serially.

In the examples of Japanese minor phrasing cited in (8) and (9), prosodic markedness constraints complement the effects of syntax–phonology interface constraints, producing phrasing where it would not be called for by the syntax. A prosodic constraint calling for a MiP to be binary in its PWd composition, call it Binary MiP, results in the binary phrasing seen in (8). Another prosodic constraint, call it MiP Accent, calls for a MiP to contain at most one accent; it is responsible for the appearance of the nonbinary MiP in (9). In Japanese, MiP Accent takes precedence over Binary MiP, as in (9), a relation that can be expressed in the grammar of Japanese with an optimality-theoretic constraint ranking: MiP Accent ≫ Binary MiP.

The ideal binary shape for MiP, which appears in (8), fails to appear in sentences with somewhat different syntactic structures, showing that certain syntax–phonology interface constraints also outrank the Binary MiP markedness constraint in the grammar of Japanese. The simple three-word subject–object–verb sentence (10) divides into two major phrases (MaP) at the break between subject and object, with the result that the subject consists of a nonbinary minor phrase (MiP), in violation of Binary MiP:

(10) (a) Inayama-ga yuujin-o yonda
[[Inayama-NOM] [[friend-ACC] call-PST]]
Mr Inayama called his friend.
(b) (<L H Inayama-ga>) (<L H yuujin-o yonda>)L

The MaP break here is imposed by a syntax–phonology interface constraint expressible as the alignment constraint (Selkirk and Tateishi 1988, 1991):

(11) Align-L (XP, MaP)
The left edge of a maximal projection in syntactic representation (PF) corresponds to the left edge of a major phrase in surface phonological representation (PR).

The noun phrase (12), which consists of an accentless Adjective–Adjective–Noun sequence, consists of a single MaP (as shown by the lack of internal upward pitch range reset), but nonetheless shows a MiP break between the two adjectives, leaving the first MiP nonbinary.

(12) (a) Amai akai ame-ga
[sweet [red [candies]]]
(b) (<L H amai> <L H akai ame-ga>)L
NOT: *(<L H amai akai ame>)L

At play here is an interface constraint sensitive to syntactic branching (Kubozono 1993):

(13) Align-L (Xbr, MiP)
The left edge of a branching constituent in syntactic structure corresponds to the left edge of a minor phrase in prosodic structure.

These two syntax–phonology alignment constraints outrank Binary MiP in the grammar of Japanese, causing the violations of Binary MiP seen in (10) and (12):

(14) Align-L (XP, MaP), Align-L (Xbr, MiP) ≫ Binary MiP.
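The effect of the ranking MiP Accent ≫ Binary MiP can be illustrated with a small evaluation sketch. It enumerates every way of grouping a word sequence into contiguous minor phrases and picks the candidate whose violation profile is best under strict domination, which is modeled here as lexicographic comparison of tuples. Everything in it is an assumption made for illustration: the function names, the representation of a word as a (form, accented) pair, and the way violations are counted (one per surplus accent, one per non-binary phrase). The two alignment constraints in (14) are deliberately left out.

```python
from itertools import product


def phrasings(words):
    """Yield every grouping of `words` into contiguous minor phrases (MiP)."""
    for cuts in product([False, True], repeat=len(words) - 1):
        phrases, current = [], [words[0]]
        for word, cut in zip(words[1:], cuts):
            if cut:                      # start a new MiP before this word
                phrases.append(current)
                current = [word]
            else:
                current.append(word)
        phrases.append(current)
        yield phrases


def mip_accent(phrasing):
    """Assumed counting: one violation per accent beyond the first in a MiP."""
    return sum(max(0, sum(acc for _, acc in p) - 1) for p in phrasing)


def binary_mip(phrasing):
    """Assumed counting: one violation per MiP that is not exactly two words."""
    return sum(1 for p in phrasing if len(p) != 2)


# Ranking from the text: MiP Accent >> Binary MiP (highest-ranked first).
RANKING = [mip_accent, binary_mip]


def optimal(words, ranking=RANKING):
    """Return the candidate with the lexicographically smallest violation profile."""
    return min(phrasings(words), key=lambda p: tuple(c(p) for c in ranking))


if __name__ == "__main__":
    # Unaccented words of the subject in (8): binary phrasing is expected to win.
    subject_8 = [(w, False) for w in
                 ["Yamaai-no", "yamagoya-no", "uraniwa-no", "umagoya-ni"]]
    print([[form for form, _ in p] for p in optimal(subject_8)])

    # All-accented words as in (9): one word per MiP is expected to win.
    sentence_9 = [(w, True) for w in
                  ["Yamagata-no", "yamadera-no", "one:san-ga",
                   "mayone:zu-o", "ho:batteiru-wa"]]
    print([[form for form, _ in p] for p in optimal(sentence_9)])
```

Because the comparison is lexicographic, no number of Binary MiP violations can outweigh a single MiP Accent violation, which is the sense in which the higher-ranked constraint takes precedence. Adding the interface constraints of (14) would mean placing further functions ahead of `binary_mip` in the list.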
It is because neither of these interface constraints is applicable in the left-branching structure of the subject noun phrase in (8) that we see the effects of the lower-ranked Binary MiP emerge there. In other words, where the lower-ranked Binary MiP is not forced to be violated by the higher-ranked interface constraints, it produces effects that are complementary to the effects of the interface constraints.

We also find cases where an expected effect of a syntax–phonology interface constraint fails to appear because of pressure from a competing prosodic markedness constraint. This can be modeled by ranking the prosodic constraint higher than the relevant interface constraint. (15) shows the expected MaP break at the left edge of VP in a neutral Focus sentence in Japanese.

(15) A neutral Focus sentence in Japanese

Sentence (16) contains a text which differs only in having a contrastive Focus on the last word of the subject NP, ani’yome. The phonological domain structure of this sentence shows an absence of the syntactically expected MaP break at the left edge of the VP, as well as the unexpected presence of a MaP break at the left edge of the Focus constituent (Poser 1984, Pierrehumbert and Beckman 1988, Nagahara 1994).

(16) The same sentence, except with aniyome as FOCUS

Truckenbrodt (1995) argues that the source of these phrasing effects in Focus sentences lies in a syntax–phonology interface constraint calling for a Focus to be more prosodically prominent than any other element within the Focus domain.

(17) Focus Prominence
The string in PR which corresponds to a string in PF dominated by Focus must contain a prosodic prominence that is greater than any other prominence contained in the Focus domain.

In the case where the domain of the Focus is the entire sentence, as in (16), respect of this constraint means that the Focus element and all the prosodic phrases that dominate it will be the head(s) of the intonational phrase corresponding to that sentence in PR (underlined). A prominent constituent is subject to a prosodic markedness constraint that calls for it to be aligned with an edge of the superordinate constituent (McCarthy and Prince 1993). The violation in (17) of the interface constraint (11) calling for a MaP edge at the left edge of VP is claimed to be driven by the higher-ranked prosodic constraint Align-R (MaPhead, IP). This constraint calls for the prominent MaP in the IP that dominates the Focus in (17) to align with the right edge of the intonational phrase, and so prevents the existence of any other MaP following the Focus. This analysis is captured in the ranking in (18).

(18) Focus Prominence, Align-R (MaPhead, IP) ≫ Align-L (XP, MaP).

So, here, a syntactic effect is overridden by a prosodic effect.

1.4 Speech Rate

One additional sort of nonsyntactic factor on phonological domain structure that has been recognized for some time in the literature is speech rate. In Bengali, a faster speech rate gives rise to larger phonological phrases (see Sect. 1.1). Speech rate plays a role as well in MiP organization in Japanese; in slower speech, single words may easily constitute a MiP (Selkirk and Tateishi 1988). Since syntactic structure does not vary with speech rate, these rate-based differences in phrasing give additional evidence for an autonomous, nonsyntactic, prosodic structural representation of phonological domains.

2. The Syntax–Prosodic Structure Interface: Serialist or Parallelist?
The classic generative model of the relation between syntax and phonology is an input–output model, in which surface syntactic representation, the output of the syntactic component, is input to the phonological component, whose output is the surface phonological representation. Most approaches to modelling the syntax–prosodic structure interface have assumed this serialist, unidirectional, input–output model, with the prediction that syntax may influence phonology, but not vice versa. They have posited constraints, algorithms or rules relating the two representations which have the general form ‘If the surface syntactic representation of the sentence has property S, then the surface phonological representation will have property P.’ In the Align/Wrap theory of the interface (Selkirk 1986, 1995, Truckenbrodt 1999), the demarcational Align class of constraints calls for the edge of any syntactic constituent of type CS in the surface syntax to align with the edge of a prosodic constituent of type CP in the surface phonology, while the cohesional Wrap class of constraints calls for a syntactic constituent of type CS in the surface syntax to be contained within a prosodic constituent of type CP in the surface phonology. There have been other, relation-based, proposals concerning the nature of syntax–prosodic structure interface constraints as well, including Nespor and Vogel 1986, McHugh 1990, Kim 1997, all understood to be consistent with the prosodic structure hypothesis, and with the input–output architecture.

An input–output model of the syntax–phonology interface makes the essential claim that constraints on phonological output representation do not interact with constraints on the surface syntactic representation. Two sorts of apparent challenges for the input–output model have recurred in recent literature. One sort concerns clitics whose distribution is governed by both syntactic and prosodic constraints, such as the second position clitics of Serbo-Croatian (Inkelas and Zec 1990). Yet because the syntax itself affords options in the positioning of second position clitics, it is not necessary to construe the prosodic subcategorization of a clitic as outranking a constraint on syntactic word order. Another sort of challenge concerns the alleged dependence of word order on the positioning of focus-related prosodic prominences in the sentence, as suggested by Vallduvi (1991) for Catalan, or Zubizarreta (1998) for Spanish. However, since the latter sort of phenomenon potentially can receive a purely syntactic treatment in which the Focus properties of syntactic representation are crucial in determining word order (see, e.g., Grimshaw and Samek-Lodovici 1998), this sort of case is not yet compelling. It remains to be seen whether the syntax–phonology interface is an input–output relation or whether it is instead a two-way street.

See also: Generative Grammar; Phonology; Syntax
Bibliography
Beckman M, Pierrehumbert J 1986 Intonational structure in English and Japanese. Phonology 3: 255–309
Dilley L, Shattuck-Hufnagel S 1996 Glottalization of word-initial vowels as a function of prosodic structure. Journal of Phonetics 24: 424–44
Grimshaw J, Samek-Lodovici V 1998 Optimal subjects and subject universals. In: Barbosa P, Fox D, Hagstrom P, McGinnis M, Pesetsky D (eds.) Is the Best Good Enough? Optimality and Competition in Syntax. MIT Press, Cambridge, MA, pp. 193–220
Hayes B, Lahiri A 1991 Bengali intonational phonology. Natural Language and Linguistic Theory 9: 47–96
Inkelas S, Zec D 1990 Prosodically constrained syntax. In: Inkelas S, Zec D (eds.) The Phonology–Syntax Connection. CSLI, Chicago, IL, pp. 365–78
Kaisse E M 1985 Connected Speech: The Interaction of Syntax and Phonology. Academic Press, San Diego, CA
Keating P, Fougeron C, Cho T, Hsu C-S 1998 Domain-initial articulatory strengthening. Papers in Laboratory Phonology 6. Cambridge University Press, Cambridge, UK
Kim N-J 1997 Tones, Segments, and their Interaction in North Kyungsang Korea. Ph.D. diss. Ohio State University, Columbus, OH
Kubozono H 1993 The Organization of Japanese Prosody. Kurosio, Tokyo
McCarthy J J, Prince A 1993 Generalized alignment. In: Booij G, Marle J v (eds.) Yearbook of Morphology. Kluwer, Dordrecht, The Netherlands, pp. 79–153
McHugh B 1990 The phrasal cycle in Kirunjo Chaga tonology. In: Inkelas S, Zec D (eds.) The Phonology–Syntax Connection. University of Chicago Press, Chicago, IL, pp. 217–42
Nagahara H 1994 Phonological Phrasing in Japanese. Ph.D. diss. UCLA, Los Angeles
Nespor M, Vogel I 1986 Prosodic Phonology. Foris, Dordrecht, The Netherlands
Odden D 1995 Phonology at the phrasal level in Bantu. In: Katamba F (ed.) Bantu Phonology and Morphology. LINCOM EUROPA, Munich, Germany, pp. 40–68
Pierrehumbert J, Beckman M 1988 Japanese Tone Structure. MIT Press, Cambridge, MA
Poser W J 1984 The Phonetics and Phonology of Tone and Intonation in Japanese. Ph.D. diss. Massachusetts Institute of Technology, Cambridge, MA
Prince A, Smolensky P 1993 Optimality theory: Constraint interaction in generative grammar. Ms, Rutgers University Center for Cognitive Science, New Brunswick, NJ
Selkirk E 1986 On derived domains in sentence phonology. Phonology 3: 371–405
Selkirk E 1995 The prosodic structure of function words. In: Beckman J, Walsh Dickey L, Urbanczyk S (eds.) Papers in Optimality Theory. GLSA, Amherst, MA, pp. 439–70
Selkirk E, Tateishi K 1988 Constraints on minor phrase formation in Japanese. In: Proceedings of the 24th Annual Meeting of the Chicago Linguistics Society, Chicago Linguistic Society, Chicago, IL
Selkirk E, Tateishi K 1991 Syntax and downstep in Japanese. In: Georgopoulos C, Ishihara R (eds.) Interdisciplinary Approaches to Language: Essays in Honor of S.-Y. Kuroda. Kluwer, Dordrecht, The Netherlands, pp. 519–44
Truckenbrodt H 1995 Phonological Phrases: Their Relation to Syntax, Focus and Prominence. Ph.D. diss. Massachusetts Institute of Technology, Cambridge, MA
Truckenbrodt H 1999 On the relation between syntactic phrases and phonological phrases. Linguistic Inquiry 30: 219–56
Vallduvi E 1991 The role of plasticity in the association of focus and prominence. In: Proceedings of the Eastern States Conference on Linguistics (ESCOL) 7, pp. 295–306
Zubizarreta M L 1998 Prosody, Focus and Word Order. MIT Press, Cambridge, MA
E. Selkirk
Syntax–Semantics Interface
Anybody who speaks English knows that (1) is a sentence of English, and also has quite a precise idea about what (1) means.
(1) Every boy is holding a block.
English speakers know for infinitely many sentences that they are sentences of English, and they also know the meaning of these infinitely many sentences. A central idea of generative syntax (see Linguistics: Incorporation) is that the structure of a sentence is given by a recursive procedure, as this provides the means to derive our knowledge about the infinite number of sentences with finite means. For the same reason, semanticists have developed recursive procedures that assign a meaning to sentences based on the meaning of their parts. The syntax–semantics interface establishes a relationship between these two recursive procedures.
An interface between syntax and semantics becomes necessary only if they indeed constitute two autonomous systems. This is widely assumed to be the case, though not entirely uncontroversial as some approaches (see Grammar: Functional Approaches and Montague Grammar) do not subscribe to this hypothesis. Consider two arguments in favor of the assumption that syntax is autonomous. One is that there are apparently purely formal requirements for the well-formedness of sentences. For example, lack of agreement as in (2) renders (1) ill-formed (customarily marked by prefixing the sentence with an asterisk), though subject–verb agreement does not seem to make any contribution to the meaning of (1).
(2) *Every boy hold a block.
The special role of uninterpretable features for syntax comes out most sharply in recent work by Chomsky (1995), who regards it as one of the main purposes of syntax to eliminate such uninterpretable features before a sentence is interpreted. On the other hand, there are also sentences that are syntactically well-formed, but do not make any sense (often marked by prefixing the sentence with a hatch mark). Chomsky’s (1957, p. 15) famous example in (3a) makes this point, and so does (3b).
(3) (a) #Colorless green ideas sleep furiously.
(b) #She arrived for an hour.
Independent of the value of these arguments, the separation of syntax and semantics has led to tremendous progress in the field. So, at the least, it has been a successful methodological principle.
1. Basic Assumptions
Work on the syntax–semantics interface by necessity proceeds from certain assumptions about syntax and semantics. We try to keep to a few basic assumptions here. In the specialized articles on generative syntax (Linguistics: Incorporation) and quantifiers (Quantifiers, in Linguistics), many of these assumptions are discussed and justified in more depth.
For the syntax, we assume that sentences have a hierarchical constituent structure that groups words and subconstituents into constituents. Also, we assume that constituents can be moved from one part of the tree structure to another, subject to constraints of the kind Ross (1967) first described. The constituent structure of a sentence is captured by the kind of phrase structure tree illustrated in (4a). (4b) shows a phrase structure tree with a movement relation.
(4) (a) [Tree diagram: S dominates a subject NP and a VP; the VP dominates the verb ‘greeted’ and an object NP.]
(b) [Tree diagram with movement: ‘Who did John greet t,’ where t marks the position inside the VP from which ‘who’ has moved.]
The task of semantics is to capture the meaning of a sentence. Consider first the term ‘meaning.’ In colloquial use, ‘meaning’ includes vague associations speakers may have with a sentence that do not exist for all speakers of a language: e.g., the sentence ‘I have a dream’ may have a special meaning in this colloquial sense to people familiar with Martin Luther King. Semantics, however, is at present concerned only with reproducible aspects of sentence meaning. In particular, semanticists have focused on the question of whether a sentence is judged true or false in a certain situation. Part of what any speaker of English knows about the meaning of (1) is that it is true in the situation shown in picture A, but false in the situation shown in picture B.
(5) Every boy is holding a block.
[Picture A] [Picture B]
All speakers of English are equipped with certain mental mechanisms that enable them to make this truth value judgment for (1) and similar judgments for infinitely many other sentences. An explicit theory can be given using techniques similar to those used in mathematical logic. In this approach, the meaning of a constituent is modeled by a mathematical object (an individual, a function, a set, or a more complicated object). Complete declarative sentences such as (1) correspond to functions that assign to possible situations one of the truth values True or False. The basic intuition of entailment between sentences as in (6) is captured if for every situation to which the meanings of the premises (a) and (b) assign True, the meaning of the conclusion (c) also assigns to this situation True.
(6) (a) Every boy is holding a block.
(b) John is a boy.
(c) Therefore, John is holding a block.
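The treatment of entailment in (6) can be made concrete with a small sketch. The toy model below is our own assumption and not part of the theory itself: a situation only records which of two hypothetical individuals are boys and which are holding a block, and a sentence meaning is a Python function from such a situation to True or False.

```python
from itertools import product

# Hypothetical toy domain, invented for illustration only.
PEOPLE = ["John", "Bill"]

def all_situations():
    """Enumerate every way of deciding who is a boy and who holds a block."""
    flags = list(product([False, True], repeat=len(PEOPLE)))
    for boys in flags:
        for holders in flags:
            yield {
                "boy": {p for p, b in zip(PEOPLE, boys) if b},
                "holding": {p for p, h in zip(PEOPLE, holders) if h},
            }

# Sentence meanings: functions from a situation to a truth value.
def every_boy_is_holding_a_block(s):
    return s["boy"] <= s["holding"]  # the set of boys is a subset of the holders

def john_is_a_boy(s):
    return "John" in s["boy"]

def john_is_holding_a_block(s):
    return "John" in s["holding"]

def entails(premises, conclusion, situations):
    """True if the conclusion is True in every situation where all premises are True."""
    return all(conclusion(s) for s in situations if all(p(s) for p in premises))

SITUATIONS = list(all_situations())
valid = entails([every_boy_is_holding_a_block, john_is_a_boy],
                john_is_holding_a_block, SITUATIONS)
invalid = entails([john_is_a_boy], john_is_holding_a_block, SITUATIONS)
print(valid, invalid)
```

With both premises the entailment check succeeds; with premise (6b) alone it fails, because the model contains a situation in which John is a boy but holds no block.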
2. Syntax–Semantics Correspondences
Though syntax and semantics are two autonomous recursive procedures, most researchers assume that there is a relationship between the two to be captured by the theory of the syntax–semantics interface. In particular, it seems to be the case that the steps of the recursion are largely the same. In other words, two phrases that form a syntactic constituent usually form a semantic constituent as well (Partee (1975) and others). Consider (7) as an illustration of this.
(7) (a) A smart girl | bought | a thin book.
(b) A thin girl | bought | a smart book.
Subject | Verb | Object
Syntacticians have argued that the subject and the object in (7) form constituents, which we call noun phrases (abbreviated as NPs). We see in (7) that the adjective that occurs in an NP also makes a semantic contribution to that NP. This is not just the case in English: as far as we know, there is no language where adjectives occurring with the subject modify the object, and vice versa.
Further evidence for the close relation of syntactic and semantic constituency comes from so many phenomena that we cannot discuss them all. Briefly consider the case of idioms. On the one hand, an idiom is a semantically opaque unit whose meaning does not derive from the interpretation of its parts in a transparent way. On the other hand, an idiom is syntactically complex. Consider (8).
(8) (a) Paul kicked the bucket. (‘Paul died.’)
(b) The shit hit the fan. (‘Things went wrong.’)
The examples in (8) show a verb–object idiom and a subject–verb–object idiom. What about a subject–verb idiom that is then combined transparently with the object? Marantz (1984, pp. 24–28) claims that there are no examples of this type in English. Since syntacticians have argued that the verb and the object form a constituent, the verb phrase, that does not include the subject (the VP in (4)), Marantz’s generalization corroborates the claim that idioms are always syntactic constituents, which follows from the close relationship between syntax and semantics.
If syntactic and semantic recursion are as closely related as we claim, an important question is the semantic equivalent of syntactic constituent formation. In other words, what processes can derive the interpretation of a syntactically complex phrase from the interpretation of its parts? Specifically, the most elementary case is that of a constituent that has two parts. This is the question to which the next two sections provide an answer.
2.1 Constituency
2.1.1 Predication as functional application. An old intuition about sentences is that the verb has a special relationship with the subject and the objects. Among the terms that have been used for this phenomenon are ‘predication,’ which we adopt, ‘theta marking,’ and ‘assigns a thematic role.’
(9) John gave Mary Brothers Karamazov.
One basic property of predication is a one-to-one relation of potential argument positions of a predicate and actually filled argument positions. For example, the subject position of a predicate can only contain one nominal: (10a) shows that two nominals are too many; and (10b) shows that none is not enough.
(10) (a) *John Bill gave Mary Brothers Karamazov.
(b) *gave Mary Brothers Karamazov.
Chomsky (1981) describes this one-to-one requirement between predication position and noun phrases filling this position as the theta-criterion. The relationship between ‘gave’ and the three NPs in (9) is also a basic semantic question. The observed one-to-one correspondence has motivated an analysis of verbs as mathematical functions. A mathematical function maps an argument to a result. Crucially, a function must take exactly one argument to yield a result, and therefore the one-to-one property of predication is explained. So, for example, the meaning of ‘gave Mary Brothers K.’ in (10) would be captured as a function that takes one argument and yields either True or False as a result, depending on whether the sentence is true or false.
The phrase ‘gave Mary Brothers K.’ (the verb phrase) in (9) is itself semantically complex, and involves two further predication relations. The meaning of the verb phrase, however, is a function. Therefore the semantic analysis of the verb phrase requires us to adopt higher order functions of the kind explored in mathematical work by Schönfinkel (1924) and Curry (1930). Namely, we make use of functions the result of which is itself a function.
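As a rough illustration of functions whose result is itself a function, the following sketch models verbs as one-argument Python functions. The facts in GREETINGS and GIFTS are hypothetical, and the order in which ‘give’ takes its objects is our own assumption, chosen only to keep the example short.

```python
# Hypothetical toy facts, invented for illustration only.
GREETINGS = {("John", "Mary")}                    # (greeter, greeted)
GIFTS = {("John", "Mary", "Brothers Karamazov")}  # (giver, recipient, thing)

def greet(obj):
    """Takes the object and returns a function that still awaits the subject."""
    return lambda subj: (subj, obj) in GREETINGS

def give(recipient):
    """Takes one argument at a time; each step returns another function."""
    return lambda thing: lambda subj: (subj, recipient, thing) in GIFTS

# The verb phrase 'gave Mary Brothers Karamazov' is a one-place predicate.
verb_phrase = give("Mary")("Brothers Karamazov")

sentence = verb_phrase("John")         # 'John gave Mary Brothers Karamazov'
other = verb_phrase("Bill")            # 'Bill gave Mary Brothers Karamazov'
greeting = greet("Mary")("John")       # 'John greeted Mary'
print(sentence, other, greeting)
```

Because every function here takes exactly one argument, the verb phrase needs exactly one subject: with none it remains a function, and once it has one the result is a truth value that cannot apply to a further nominal, which mirrors the pattern in (10).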
Then, a transitive verb like ‘greet’ is modeled as a function that, after applying to one argument (the object), yields as its result a function that can be combined with the subject. The ditransitive verb ‘give’ is captured as a function that yields a function of the same type as ‘greet’ after combining with one argument.
Functional application is one way to interpret a branching constituent. It applies more generally than just in NP–verb combinations. Another case of predication is that of an adjective and a noun joined by a copula, as in (11). We assume here that the copula is actually semantically empty (i.e., a purely formal element), and the adjective is a function from individuals to truth values, mapping exactly the red-haired individuals on to True (see Rothstein 1983).
(11) Tina is red-haired.
Similarly, (12) can be seen as a case of predication of ‘girl’ on the noun ‘Tina.’ For simplicity, we assume that in fact not only the copula but also the indefinite articles in (12) are required only for formal reasons. Then, ‘girl’ can be interpreted as the function mapping an individual who is a girl on to True, and all other individuals on to False.
(12) Tina is a girl.
2.1.2 Predicate modification as intersection. Consider now example (13). In keeping with what we said about the interpretation of (11) and (12) above, (13) should be understood as a predication with the predicate ‘red-haired girl.’
(13) Tina is a red-haired girl.
The predicate ‘red-haired girl’ would be true of all individuals who are girls and are red-haired. But clearly, this meaning is derived systematically from the meanings of ‘red-haired’ and ‘girl.’ This is generally viewed as a second way to interpret a branching constituent: by intersecting two predicates, and essentially goes back to Quine (1960).
2.1.3 Predicate abstraction. A third basic interpretation rule can be motivated by considering relative clauses. The meaning of (14a) is very similar to that of (13). For this reason, relative clauses are usually considered to be interpreted as predicates (Quine 1960, p. 110 f.).
(14) (a) Tina is a girl who has red hair.
(b) Tina is a girl who Tom likes.
Syntacticians have argued that the relative pronouns in (14) are related in some way to an argument position of the verb. We adopt the assumption that this relationship is established by syntactic movement of the relative pronoun to an initial position of the relative clause.
(15) (a) who_x [x has red hair]
(b) who_x [Tom likes x]
Looking at the representation in (15), the semantic contribution of the relative pronoun is to make a predicate out of a complete clause which denotes a truth value. An appropriate mathematical model for this process is lambda-abstraction (Church 1941). The definition in (16) captures the intuition that (14b) entails the sentence ‘Tom likes Tina,’ where the argument of the predicate is inserted in the appropriate position in the relative clause (see, e.g., Cresswell (1973) or Partee et al. (1990) for more precise treatments of lambda abstraction).
(16) [λx XP] is interpreted as the function that maps an individual with name A to a result r where r is the interpretation of XP after replacing all occurrences of x with A.
Functional application, predicate modification and lambda abstraction are probably the minimal inventory that is needed for the interpretation of all hypothesized syntactic structures.
2.2 Scope
2.2.1 Quantifier scope. A second area where a number of correspondences between syntax and semantics have been found is that of semantic interactions between quantificational expressions. Consider the examples in (17) (adapted from Rodman 1976, p. 168):
(17) (a) There’s a ball that every boy is playing with.
(b) Every boy is playing with a ball.
The difference between (17a) and (17b) is that only (17b) is true when every boy is playing with a different ball, as in picture B; (17a) is only judged to be true in the situation depicted in picture A, where every boy is playing with the same ball.
(18) [Picture A] [Picture B]
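The truth conditions of (17a) and (17b) can be checked mechanically. The sketch below is a simplification under our own assumptions: the two pictures are encoded as hypothetical sets of (boy, ball) pairs, with one shared ball in picture A and a different ball per boy in picture B, and Python lambdas stand in for the predicates formed by lambda abstraction.

```python
# Hypothetical encodings of the two pictures, invented for illustration.
BOYS = {"boy1", "boy2"}
PICTURE_A = {"balls": {"ball1"},
             "plays": {("boy1", "ball1"), ("boy2", "ball1")}}
PICTURE_B = {"balls": {"ball1", "ball2"},
             "plays": {("boy1", "ball1"), ("boy2", "ball2")}}

def every_boy(pred):
    """True if the predicate holds of every boy."""
    return all(pred(b) for b in BOYS)

def a_ball(situation, pred):
    """True if the predicate holds of at least one ball in the situation."""
    return any(pred(x) for x in situation["balls"])

def sentence_17a(s):
    # 'a ball' is outside the relative clause, so it has scope over 'every boy'.
    return a_ball(s, lambda x: every_boy(lambda b: (b, x) in s["plays"]))

def sentence_17b(s):
    # 'a ball' is inside the sister constituent of 'every boy'.
    return every_boy(lambda b: a_ball(s, lambda x: (b, x) in s["plays"]))

results = {
    "17a": (sentence_17a(PICTURE_A), sentence_17a(PICTURE_B)),
    "17b": (sentence_17b(PICTURE_A), sentence_17b(PICTURE_B)),
}
print(results)
```

In this encoding (17a) comes out true only for picture A, while (17b) is true for both pictures, which matches the judgments reported above.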
The semantic difference between (17a) and (17b) correlates with a difference in syntactic structure. Namely, in (17b) ‘a ball’ is part of the sister constituent of ‘every boy,’ while it is not so in (17a). The special role the sister constituent plays for semantic interpretation was first made explicit by Reinhart (1976) building on the work of Klima (1964) and others. She introduced the term c-command for the relationship that a phrase enters into with any other phrase inside its sister constituent, and proposed that the domain of c-commanded phrases (i.e., the sister constituent) is the domain of semantic rule application. The c-command domain is then equated with what logicians call the scope.
On the recursive interpretation procedure developed in the previous section, Reinhart’s generalization is captured in a straightforward manner. This can be seen without being precise about the semantics of quantifiers (see Sect. 3.1, and Quantifiers, in Linguistics). Consider the interpretation of the relative clause constituent in (17a), which is shown with the lambda operator forming a predicate in (19). This predicate will only be true of an object a if every boy is playing with a. Such an individual is found in picture A, but not in picture B.
(19) λx [every boy is playing with x]
For sentence (17b), however, we want to interpret the verb phrase ‘is playing with a ball’ as a predicate that is true of an individual a, if there is some ball that a is playing with. (Below, in Sect. 3.2, we sketch a way to derive this meaning of VP from the meanings of its parts.) Crucially, this predicate is fulfilled by every boy in both picture A and picture B.
2.2.2 Binding. A second case where c-command is important is the binding of pronouns by a quantificational antecedent. Consider the examples in (20).
(20) (a) Every girl is riding her bike.
(b) Every girl is riding John’s bike.
Sentence (20a) is judged true in a situation where the girls are riding different bicycles, specifically their own bicycles. Example (20b), however, is only true if John’s bike is carrying all the girls. The relationship between the subject ‘every girl’ and the pronoun ‘her’ in (20a) can only be made more precise by using the concept of variable binding that has its origin in mathematical logic. (Note that, for example, the relevant interpretation of (20a) is not expressible by replacing the pronominal with its antecedent ‘every girl,’ since this would result in a different meaning.) We assume that the interpretation of the VP in (20a) can be captured as in (21) following Partee (1975) and others.
(21) λx [x is riding x’s bike]
The relevance of c-command can be seen by comparing (20a) with (22). Though (22) contains the NP ‘every girl’ and the pronoun ‘her,’ (22) cannot be true unless all the boys are riding together on a single bike. The interpretation that might be expected in (22) in analogy to (20a) but which is missing can be paraphrased as ‘For every girl, the boys that talked with her are riding her bike.’
(22) The boys who talked with every girl are riding her bike.
The relevance of c-command corroborates the recursive interpretation mechanism that ties syntax and semantics together. Because binding involves the interpretation mechanism (16) operating on the sister constituent of the λ-operator, only expressions c-commanded by a λ-operator can be bound by it. In other words, an antecedent can bind a pronoun only if the pronoun is in the scope of its λ-operator.
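The contrast between the bound pronoun in (20a) and the name in (20b) can be sketched as follows. The individuals, the BIKE_OF table and the RIDES relation are hypothetical; the point is only that the bound pronoun corresponds to a second occurrence of the variable of the predicate in (21), whereas ‘John’ contributes a constant.

```python
# Hypothetical toy situation: each girl rides her own bike.
GIRLS = {"Ann", "Sue"}
BIKE_OF = {"Ann": "ann_bike", "Sue": "sue_bike", "John": "john_bike"}
RIDES = {("Ann", "ann_bike"), ("Sue", "sue_bike")}  # (rider, bike)

def every_girl(pred):
    """True if the predicate holds of every girl."""
    return all(pred(g) for g in GIRLS)

def rides_own_bike(x):
    # (21): the pronoun 'her' is bound, so the bike varies with x.
    return (x, BIKE_OF[x]) in RIDES

def rides_johns_bike(x):
    # (20b): the possessor is fixed, so the bike does not vary with x.
    return (x, BIKE_OF["John"]) in RIDES

sentence_20a = every_girl(rides_own_bike)
sentence_20b = every_girl(rides_johns_bike)
print(sentence_20a, sentence_20b)
```

In this situation (20a) is true and (20b) is false. The pronoun can covary with the girls only because it sits inside the function body to which the quantifier applies, that is, inside the scope of the λ-operator.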
3. Syntax–Semantics Mismatches
3.1 Subject Quantifiers
When we consider the semantics of quantifiers in more detail, it turns out that the view that predication in syntax and functional application in semantics have a one-to-one correspondence, as expressed above, is too simple. Consider example (23) which is repeated from (1).
(23) Every boy is holding a block.
Assume that the verb phrase ‘is holding a block’ is a one-place predicate that is true of any individual who is holding a block. Our expectation is then that the interpretation of (23) is achieved by applying this predicate to an individual who represents the interpretation of the subject ‘every boy.’ But some reflection shows that this is impossible to accomplish: the only worthwhile suggestion to capture the contribution of ‘every boy’ to sentence meaning by means of one individual is that ‘every boy’ is interpreted as the group of all boys. But the examples in (24) show that ‘every boy’ cannot be interpreted in this way.
(24) (a) Every boy (*together) weighs 50 kilos.
(b) All the boys (together) weigh 50 kilos.
The interpretation of (24) cannot be achieved by applying the predicate that represents the VP meaning to any individual. It can also be seen that predicate modification cannot be used to assign the right interpretation to examples like (24a). The solution that essentially goes back to Frege (1879) and was made explicit by Ajdukiewicz (1935) is that actually the quantificational subject is a higher order function that takes the VP predicate as its argument, rather than the other way round. This interpretation is given in (25):
(25) [every boy] is a function mapping a predicate P to a truth value, namely it maps P to True if P is true of every boy.
Note that this analysis still accounts for the exactly one-argument requirement, which was observed above, because the result of combining the subject quantifier and the VP is a truth value, and therefore cannot be combined with another subject quantifier. The Frege/Ajdukiewicz account of (23) does, however, depart from the intuition that the verb is predicated of the subject.
Note that the semantic difference between quantifiers and nonquantificational NPs does not affect the syntax of English verb–subject relationships. Still in (1) the verb agrees with the subject, and the subject must precede the verb.
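The reversal of function and argument in (25) can be sketched in a few lines. The sets below are hypothetical, and the helper `every` is our own illustrative name for the meaning of the determiner; the sketch only shows the direction of application, not a full theory of quantifiers.

```python
# Hypothetical toy facts, invented for illustration only.
BOYS = {"Al", "Bo"}
HOLDING_A_BLOCK = {"Al", "Bo", "Cy"}
SMILING = {"Al"}

def is_holding_a_block(x):
    """VP meaning: a one-place predicate over individuals."""
    return x in HOLDING_A_BLOCK

def is_smiling(x):
    return x in SMILING

def every(noun_set):
    """Returns a quantifier: a function from predicates to truth values, as in (25)."""
    return lambda pred: all(pred(x) for x in noun_set)

every_boy = every(BOYS)

# Name subject: the VP applies to the subject.
name_case = is_holding_a_block("Al")
# Quantificational subject: the subject applies to the VP.
quantifier_case = every_boy(is_holding_a_block)
not_all = every_boy(is_smiling)
print(name_case, quantifier_case, not_all)
```

The result of `every_boy(...)` is a plain truth value rather than a function, so it cannot combine with a further subject, in line with the one-argument requirement noted above.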
3.2 Object Quantifiers
Consider now (26) with a quantificational noun phrase in the object position. We argued in Sect. 2.1.1 that the interpretation of a transitive verb is given by a higher order function that takes an individual as an argument and results in a function that must take another individual as its argument before resulting in a truth value. But then (26) is not interpretable if we assume the Frege/Ajdukiewicz semantics of quantifiers.
(26) John greeted every boy.
From the partial account developed up to now, it follows that ‘greet’ and ‘every boy’ should be combined semantically to yield a predicate representing the meaning of the VP. However, the meaning of ‘every boy’ is a higher order predicate that takes a predicate from individuals to truth values as its argument. But the meaning of ‘greet’ is not such a one-place predicate.
Any solution to the interpretability problem we know of posits some kind of readjustment process. One successful solution is to assume that before the structure of (26) is interpreted, the object quantifier is moved out of the verbal argument position (see, e.g., Heim and Kratzer 1998). For example the structure shown in (27) is interpretable.
(27) John λx [every boy] λy [x greeted y]
The movement process deriving (27) from (26) is called ‘quantifier movement.’ Quantifier movement in English is not reflected by the order in which words are pronounced, but this has been claimed to be the case in Hungarian (Kiss 1991).
3.3 Inverse Scope
The original motivation for quantifier movement was actually not the uninterpretability of quantifiers in object position, but the observation that the scope relations among quantifiers are not fixed in all cases. In example (17a) in Sect. 2.2.1, the scopal relation among the quantificational noun phrases was fixed. But consider the following sentence:
(28) A boy held every block.
The example is judged true in the situation in picture A, though there is no single boy such that every block was held by him.
(29) [Picture A] [Picture B]
The truth of (28) in this situation is not predicted by the representation in (30) that in some way maintains the syntactic constituency of the verb and the object to the exclusion of the subject.
(30) [a boy] λx [every block] λy [x held y]
For this reason, May (1977) and Chomsky (1975) propose to allow also quantifier movement that creates a constituent which includes the subject and the verb but not the object (similar ideas go back to Montague (1970) and Lewis (1972)). This structure is shown in (31).
(31) [every block] λy [a boy] λx [x held y]
Hence, the hypothesis that movement of quantificational noun phrases prior to interpretation is possible (and indeed obligatory in some cases) provides a solution not only for the problem with interpreting object quantifiers, but also for the availability of inverse scope in some examples.
4. Concluding Remarks
Many of the questions reviewed above are still topics of current research. We have based our discussion on the view that syntax and semantics are autonomous, and that there is a mapping from syntactic structures to interpretation. However, recent work by Fox (2000) argues that in some cases properties of interpretation must be visible to the syntax of quantifier movement. As the basic inventory of interpretation principles, we have assumed functional application, predicate modification, and lambda abstraction over variables. Other approaches, however, make use of function composition and more complex mathematical processes to eliminate lambda abstraction (Jacobson 1999 and others). In a separate debate, Sauerland (1998) challenges the assumption that movement in relative clauses should be interpreted as involving binding of a plain variable in the base position of movement. While the properties of scope and binding reviewed in Sect. 2.2 are to our knowledge uncontroversial, current work extends this analysis to similar phenomena such as modal verbs, tense morphemes, comparatives and many other topics (Stechow and Wunderlich (1991) give a comprehensive survey).
Research activity on the syntax–semantics interface is currently expanding greatly, as an increasing number of researchers become proficient in the basic assumptions and the formal models of the fields of both syntax and semantics. Almost every new issue of journals such as Linguistic Inquiry or Natural Language Semantics brings with it some new insight on the questions raised here.
See also: Linguistics: Incorporation; Movement Theory and Constraints in Syntax; Semantics; Syntax; Valency and Argument Structure in Syntax
Bibliography
Ajdukiewicz K 1935 Die syntaktische Konnexität. Studia Philosophica 1: 1–27
Chomsky N 1957 Syntactic Structures. Mouton, Den Haag
Chomsky N 1975 Questions of form and interpretation. Linguistic Analysis 1: 75–109
Chomsky N 1981 Lectures on Government and Binding. Foris, Dordrecht
Chomsky N 1995 The Minimalist Program. MIT Press, Cambridge, MA
Church A 1941 The Calculi of Lambda-Conversion. Princeton University Press, Princeton, Vol. 6
Cresswell M J 1973 Logic and Languages. Methuen, London
Curry H B 1930 Grundlagen der kombinatorischen Logik. American Journal of Mathematics 50: 509–36
Fox D 2000 Economy and Semantic Interpretation. MIT Press, Cambridge, MA
Frege G 1879 Begriffsschrift. Eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Neubert, Halle
Heim I, Kratzer A 1998 Semantics in Generative Grammar. Blackwell, Oxford
Jacobson P 1999 Towards a variable-free semantics. Linguistics and Philosophy 22: 117–84
Kiss K 1991 Logical structure in linguistic structure. In: Huang J, May R (eds.) Logical Structure and Linguistic Structure. Kluwer, Dordrecht, The Netherlands, pp. 111–47
Klima E 1964 Negation in English. In: Fodor J A, Katz J J (eds.) The Structure of Language. Prentice-Hall, Englewood Cliffs, NJ, pp. 246–323
Lewis D 1972 General semantics. Synthese 22: 18–67
Marantz A 1984 On the Nature of Grammatical Relations. MIT Press, Cambridge, MA
May R 1977 The grammar of quantification. Ph.D. dissertation, MIT
Montague R 1970 Universal grammar. Theoria 36: 373–98
Partee B H 1975 Montague grammar and transformational grammar. Linguistic Inquiry 6: 203–300
Partee B H, ter Meulen A, Wall R E 1990 Mathematical Methods in Linguistics. Kluwer, Dordrecht
Quine W 1960 Word and Object: Studies in Communication. MIT Press, Cambridge, MA
Reinhart T 1976 The syntactic domain of anaphora. Ph.D. dissertation, MIT
Rodman R 1976 Scope phenomena, movement transformations, and relative clauses. In: Partee B H (ed.) Montague Grammar. Academic Press, New York
Ross J R 1967 Constraints on variables in syntax. Ph.D. dissertation, MIT
Rothstein S D 1983 The syntactic form of predication. Ph.D. thesis, MIT
Sauerland U 1998 The meaning of chains. Ph.D. thesis, MIT
Schönfinkel M 1924 Über die Bausteine der mathematischen Logik. Mathematische Annalen 92: 305–16
Stechow A von, Wunderlich D 1991 Semantik. Ein internationales Handbuch zeitgenössischer Forschung. Walter de Gruyter, Berlin, New York
U. Sauerland and A. von Stechow
System: Social
A system is a set of differentiated and interdependent components. With such an abstract definition, the notion of system explicitly or implicitly underlies all or most of the theories of the social sciences. It is therefore useful to distinguish three ways of conceptualizing social systems, each one corresponding to a major paradigm. First, the historicist paradigm, according to which society constitutes a global system subject to a necessary law of historical evolution; this paradigm was mainly represented by Comte and Marx in the nineteenth century. Second, the structural-functional paradigm, according to which a society is a system whose components are the social institutions (the family, religion, etc.); this approach was developed in anthropology by Radcliffe-Brown and Malinowski in the 1930s, and then imported into sociology by Merton and Parsons. Third, the actionist paradigm, in which the social systems are constituted by actors and their interactions; one of the main sources of this paradigm is the subjectivist and marginalist revolution that happened in the 1870s in economics.
1. The Historicist Conception of Social Systems According to the historicist conception, society is destined to go through a necessary succession of great historical stages, each of which is a social system. The aim of sociology, then, is to determine the nature of these systems, and especially the historical ‘law’ governing their succession. Thanks to this historical ‘law,’ it becomes possible to know the nature of the social system that will appear in the future, the nature of the next great historical stage. What is the use of knowing in advance an inescapable future? Comte and Marx have the same answer: it gives us the power to facilitate and accelerate the advent of the ideal future society, since according to them the future social system will be the ultimate stage in the progress of humanity. 1.1 Comte and Marx In his positiist theory, Comte (1824) analyzes the past and future evolution of society as the succession of three great social systems. The theological system, first, was represented by the European absolute monarchies: social phenomena were then interpreted in the frame of the Christian religion, the spiritual power was held by the priests and the temporal power by the aristocracy (the sword). Then came the metaphysical system ushered in by the French Revolution: social phenomena were interpreted in the frame of the philosophy of the Enlightenment, the spiritual power gave way to freedom of conscience, and the temporal power came into the hands of the people (democracy). According to Comte, this metaphysical system is only transitional. It has a destructive nature and is thus unable to constitute a lasting social order. It will be replaced in the future by the positie system, in which sociology will become a ‘positive’ science (founded, just like modern physics and chemistry, upon observation and invariable natural laws), and in which the spiritual power will be held by the scientists and the temporal power by the producers (the manufacturers). 
Each social system is thus defined by its ‘civilization’ (the state of our understanding of social phenomena) and by its ‘social organization’ (as defined by who holds the spiritual and the temporal powers). Social organization is here supposed to be determined by the state of civilization, and the latter is determined by scientific progress. Comte believed that this progress obeyed the law of the three stages: he thought that sociology, like the other positive sciences, would appear at the end of a three-stage process (‘theological,’ ‘metaphysical,’ and ‘positive’) through which knowledge is progressively liberated from its animistic and anthropomorphic bias. In this way, the components of each social system and social evolution itself are determined by the progress of knowledge, and this progress is in turn determined by the law of the three stages.
Marx’s historicism (1859), or dialectical materialism, describes a succession of four ‘modes of production’ (or social systems): antique, feudal, capitalistic, and socialistic. Each one of these systems is made up of two major components: the ‘material productive forces’ (techniques, tools, machinery) and the ‘production relations’ (property laws). The productive forces are supposed to determine the production relations, and also all the other ideological features of society (politics, religion, esthetics, philosophy). Marx illustrated this idea with a famous sentence: ‘the hand mill gives you feudal society, the steam mill industrial capitalism’ (Misère de la philosophie, 1847). In a watered-down version of his historicism, Marx (1859, Preface) explains that the succession of these systems is a historical necessity arising from the development of the productive forces (i.e., of the technical capacities of production). Technical and economic progress end up being incompatible with existing production relations. When this contradiction reaches its climax, the production relations are unshackled, and a great social revolution takes place, marking the transition between two modes of production. In Das Kapital (1867, Vol. 1), Marx attempts to study in a much more detailed way, with an actionist methodology, the necessary future succession between capitalism and socialism (see Sect. 3 for actionism).
The historicist view was widely held in the nineteenth century.
It is to be found in Tocqueville’s analysis of the secular transition between aristocratic and democratic society (De la Démocratie en Amérique, Vol. 1, 1835). Durkheim, greatly influenced by Comte, attempts to explain in De la Division du Travail Social (1893) the necessary transition between the primitive societies, where social cohesion is ‘mechanical’ (cohesion comes from the similarity between individuals), and modern society, where social cohesion is ‘organic’ (cohesion arises from the differences between and the complementarity of individuals). In Spencer’s historicism (The Principles of Sociology, 1891), the historical law of succession between the primitive and modern societies is a law of evolution from a simple to a complex organization, based in part on the biological principles of Darwinism.
1.2 Limits and Errors of Historicism
It is easy today to perceive the limits of historicist theories. From a static point of view, social systems are represented in an oversimplified way, and their internal causal relations are debatable (the alleged primacy of ‘civilization’ over ‘social organization’ in Comte, and of ‘productive forces’ over ‘production relations’ in Marx). But these theories are especially questionable from a dynamic point of view: regardless of the obvious errors of prediction of a Comte or a Marx, the historicist project may be conclusively refuted with the help of an epistemological argument. Historicist ‘laws’ are supposed to be the equivalent in the field of social phenomena of physical laws in the field of natural phenomena. But physical laws (for instance, the law of falling bodies) explain reproducible phenomena, i.e., phenomena that may be repeatedly observed because they recur a great number of times. Historicist ‘laws,’ on the other hand, try to explain a unique phenomenon, human history. This phenomenon, which happens only once and is definitely not reproducible, cannot be explained by a scientific law in the sense of the natural sciences (Popper 1957). In other words, the claim of historicism to possess a scientific character is completely unfounded. Historicist authors are unable to establish, in a convincing way, the causal relations that are supposed to determine the necessary succession between the great social systems or historical stages (for an analysis of other deep contradictions in the positivism of Comte see Hayek 1952, and in the dialectical materialism of Marx see Mises 1957).
2. The Structural-functional Conception of Social Systems
In the structural-functional paradigm, social institutions are the components of society considered as a system. On the one hand, these institutions play a role in the functioning of society (functionalism) and on the other hand are held together by relationships of complementarity that constitute a structure (structuralism). The historicist sociologists of the nineteenth century were immersed in a vast process of sociological, economic, and political change. This process seemed to be explainable in terms of a necessary succession of great social systems. The structural-functional paradigm was born in a totally different intellectual context. It originated from the anthropological study of primitive societies, whose long-term history was unknown. As a consequence, this paradigm was completely severed from the most questionable aspect of historicism, namely the search for a global historical ‘law’ of succession between great social systems.
2.1 Society as a Functional System
As formulated by anthropologist Radcliffe-Brown (1935), the basic idea of functionalism is that in a given society, every institution, every social process, every cultural form possesses a ‘function,’ i.e., it plays a definite and even indispensable role in the survival and perpetuation of this system. This reasoning is borrowed from biology, where each organ plays a necessary role in the survival of the organism. Faced with a social institution, whatever it may be, the social scientist is supposed to establish its function from the standpoint of the global system constituted by society. For instance, Radcliffe-Brown explains the existence of ancestor worship in traditional societies by the function of strengthening solidarity: thanks to these rites, individuals learn that they can count on the magical help of their forebears, and also that they must submit to the traditional rules. A narrower conception of functionalism was offered by Malinowski (1944). According to him, the economic, educational and political institutions of a society have the function of satisfying, directly or indirectly, the ‘primary’ needs related to the biological survival of individuals (food, medical care, reproduction). These primary needs give birth to ‘derivative’ needs. Satisfying the need for food, for instance, implies a cooperation between the members of society. Since this cooperation requires rules of conduct, these rules themselves become a need, but an ‘indirect’ one in the sense that it is derived from a primary need. Functionalism was then imported into the sociology of complex societies by Merton (1949). His functional analysis is more qualified than the basic or ‘absolute’ functionalism developed by the anthropologists of traditional societies. Merton begins by distinguishing the functions that are ‘manifest’ (i.e., deliberately sought for or designed by the actors taking part in the system) and the functions that are ‘latent’ (i.e., those that play a role unknown to the participants in the system). The function of the cult of the ancestors according to Radcliffe-Brown is an example of a latent function.
Merton then observes that in modern societies some institutions can be functional for a given group, and at the same time dysfunctional for another group (a technical advance, for instance, can increase the average productivity of labor, but also create a redundancy in some branches of production and compel some workers to search for a new job). A modern society, considered as a system, shares none of the ‘functional unity’ of biological organisms. Furthermore, some institutions or customs may not have any function. Finally, Merton criticizes the idea that there exist a certain number of functions that are necessary to the survival of any society, and that the same specific institutions would be indispensable to perform these functions across all societies.
2.2 The Structure of Social Systems
The conception of society as a functional system of institutions leads quite naturally to an analysis of social structure, that is, to an analysis of the complementarity of these institutions. A famous illustration is given by Parsons (1951). He observes that a highly industrialized society requires actors to be socially and geographically mobile. But this mobility is possible only if the actors are not too closely connected to their parents: there must be a clear-cut separation between an actor’s ‘orientation’ family (where he or she was born) and the ‘procreation’ family (the one created by his or her marriage). If we follow this reasoning to its logical conclusion, industrialization is seen to be incompatible with powerful and extensive traditional family ties. Those two components (industrialization on the one hand and the separation between orientation and procreation families on the other) constitute a consistent social structure typical of modern societies. Conversely, the absence of industrialization and extensive and powerful family ties are the components of another stable social structure, typical of traditional societies. This example shows how the existence of one institution can imply the existence of another institution, their combination representing a stable and coherent social structure.
In anthropology, with Lévi-Strauss (1958), the study of social structure became a structuralism based largely upon linguistics. Lévi-Strauss realized that in traditional societies kinship relations (brother/sister), (husband/wife), (uncle/nephew), (father/son) constitute a system characterized by a law of opposition. If, in a given society, the relations between a brother and his sister are generally ‘good’ (friendly), then (husband/wife) relations are ‘bad’ (reserved or hostile); in the same way, if (uncle/nephew) relations are ‘good,’ then (father/son) relations are ‘bad,’ or vice versa. The system of kinship (brother/sister)(husband/wife)/(uncle/nephew)(father/son) can thus exist in only four structures in the different societies: ‘+ −/+ −,’ ‘+ −/− +,’ ‘− +/+ −,’ or ‘− +/− +.’ The first of these structures, ‘+ −/+ −,’ means that (brother/sister) relations are good, as symbolized by the ‘+’ sign, (husband/wife) relations are bad, ‘−’ sign, (uncle/nephew) relations are good, and (father/son) relations are bad. This law of opposition, in which relations take precedence over the components they relate, is perfectly similar to the law that had been previously discovered by the linguists N. S. Troubetzkoy and R. Jakobson in the area of structural phonology (these authors showed that language as a communication system rests on a law of opposition between the phonemes).
2.3 The Explanatory Limits of Structural-functionalism
This paradigm has been criticized mainly for its lack of explanatory power. In principle, the ‘absolute’ functionalism of a Radcliffe-Brown or a Malinowski permits us to explain the existence of an institution by the role it plays in the working and thus in the existence of a given society. But the more qualified conception of functionalism to be found in Merton does not enable us to explain the existence of an institution by its latent function, since this institution or function is not considered as a functional requisite of the society anymore. The existence of an institution, then, can only be explained with the help of a ‘genetic’ analysis consisting of the study of the historical origin of this institution. Two important criticisms have been aimed at structural-functionalism from an actionist viewpoint by Homans (1987): first, this paradigm takes for granted the actors’ obedience to social rules, and does not explain why the actors choose to follow these rules; second, the existence of social structures is also taken for granted, and no explanation is offered as to how these structures emerge from social interaction.
3. The Actionist Conception of Social Systems
The actionist paradigm aims at discovering the causes of social phenomena in the actions of individuals, and more precisely in the combination of these actions (e.g., see Action, Theories of Social; Methodological Individualism in Sociology). Social systems are defined as systems of actions or of interactions, from the smallest (two actors) to the largest ones (for instance, the international division of labor). In this paradigm, individual actions are generally considered as rational. This rationality may be utilitarian (appropriateness of means to ends), cognitive (rationality of positive beliefs, i.e., of factual propositions), or axiological (rationality of normative beliefs). The focus of attention, here, is neither on the global evolution of society (historicism), nor on social institutions seen from the standpoint of their functional role and of their mutual relationships (structural-functionalism), but on human action: social phenomena are considered as emergent and possibly unintentional effects whose causes are actions constituting a system. The actionist paradigm is widely used in economics as well as in sociology. It is quite old since it implicitly underlies, for instance, the works of a classical economist like A. Smith (An Inquiry into the Nature and Causes of the Wealth of Nations, 1776). Its importance was reinforced in the 1870s by the marginalist and subjectivist revolution that took place in economics. This revolution consisted of a much more rigorous analysis of human choices and actions in an environment characterized by the scarcity of goods, and made possible the development of really satisfying theories of prices and interest rates.
3.1 The Systems of Economic Exchange
The simplest system of exchange is composed of two actors, each of whom possesses a good and decides or refuses to exchange it for the other good, according to his or her subjective preferences. For an exchange to take place, the good supplied must have less subjective value than the good demanded (and this is true of each of the two actors). This system can be generalized to a market comprising many sellers and buyers trading one good for another. It may then be generalized to a whole economic system made up of a great number of markets. In the tradition originating in L. Walras (Éléments d’Économie Politique Pure, 1874), the representation of a global economic system rests on the concept of general equilibrium. In a situation of general equilibrium, all the actions of all the actors on all the markets are compatible with one another: they can all be carried out without any actor feeling any regret or disappointment. Thanks to a mathematical formalization of the subjective preferences of the consumers and the possibilities of production of the producers, the existence of a general equilibrium can be logically demonstrated: under some conditions, there exists a set of prices of consumer goods and factors of production such that all the actions of buying and selling are compatible (Arrow and Hahn 1971). This theory is rather complex from a mathematical point of view, and it suffers from two main defects: it does not demonstrate that the system converges towards a general equilibrium, and it does not take into account the radical uncertainty of the future. In another school of thought, the Austrian school (Mises 1949), the focus is on market processes. Economic analysis here focuses on the consequences in the economic system of a change in one of its elements. If, for instance, the consumers change their preferences and decide to spend more than before on good A and less on good B, the demand for A rises and its price tends to rise. The rate of profit also tends to rise in the branch of production of good A (since the price of this product rises when its costs of production remain the same).
If this rate of profit becomes higher than the current rate of profit, more resources will be invested in the production of good A. Two consequences follow: first, the demand for the factors of production of A will rise and thus the prices of these factors will tend to rise; and second, the supply of A will increase and thus its price will tend to diminish. In this way, the rate of profit in the branch of production of A is brought down to the current level (since the price diminishes and the costs rise). Conversely, as the demand for B and its price are lowered, the rate of profit falls below the current level, and a disinvestment occurs. Demand for the factors of production of B decreases, resulting in lower prices. As a consequence, the supply of B tends to drop and its price to rise. The rate of profit in the production of B is thereby brought back to the current level (the price rises and the costs diminish). A first adaptation of the economic system to the change of demand is thus achieved: the branch of production of A expands and the branch of B contracts. But the changes are not restricted to these two branches of production. To the extent that the demand for the factors of production of A or B (and for the factors of production of these factors) is also changed, the changes diffuse throughout the system and a new balance between the various branches of industry is established. It is possible to study in the same way the diffusion of the effects of a technical advance, or of a variation in the supply of a natural resource, in the quantity of money, in savings, and so on.
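The adjustment process described above can be illustrated with a small numerical sketch. Everything quantitative in it is an assumption made for illustration and does not come from the text: the function names, a price equal to consumer spending divided by the quantity supplied, a unit cost that rises linearly with output (standing in for dearer factors of production), a ‘current’ rate of profit of 10 percent, and an investment rule proportional to the profit gap. Each branch is also treated in isolation, so the diffusion of changes through shared factors of production is not modeled.

```python
def profit_rate(spending, quantity, base_cost=1.0, factor_slope=0.01):
    """Rate of profit in one branch under the illustrative assumptions above."""
    price = spending / quantity                      # more supply -> lower price
    unit_cost = base_cost + factor_slope * quantity  # more output -> dearer factors
    return price / unit_cost - 1.0


def adjust(spending, quantity, current_rate=0.10, speed=0.5, steps=200):
    """Let resources flow in or out until the branch earns the current rate."""
    for _ in range(steps):
        gap = profit_rate(spending, quantity) - current_rate
        # Investment when the gap is positive, disinvestment when it is negative.
        quantity += speed * quantity * gap
    return quantity


# Both branches start in balance: spending 220, output 100, profit rate 10%.
q_a = adjust(spending=260.0, quantity=100.0)  # consumers now spend more on A
q_b = adjust(spending=180.0, quantity=100.0)  # and less on B
print(round(q_a, 2), round(q_b, 2))
```

Under these assumptions the branch of A ends up with a larger output and the branch of B with a smaller one, while both return to the same rate of profit, which is the qualitative pattern the text describes.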
3.2 The Actionist Systems in Sociology
The actionist tradition in sociology was born in Germany at the end of the nineteenth century. Weber (1956), for instance, developed a ‘comprehensive’ method, which consists of trying to understand actions from the point of view of the actors themselves. This method constitutes a generalization of the analysis of human choices derived from the subjectivist revolution of the 1870s in economics. The German sociological tradition had a major influence on sociology in the twentieth century, for instance through the ‘interactionist’ approach (e.g., see Interactionism, Symbolic). The use of actionist systems enables us to explain in a convincing way a number of important and puzzling phenomena. Boudon (1973) has thus explained ‘Anderson’s paradox’ in the field of sociology of education. From the 1960s on, an important democratization of schooling took place, as the proportion of children of low socioeconomic status increased in the colleges and universities. Since future social status is strongly related to academic achievement, vertical social mobility was expected to increase in developed countries. But, and this is the paradox, it did not. In spite of the reduction in inequality of educational opportunity, social mobility was not greater than before. Boudon showed that this paradox could be explained by means of an actionist system based on the motivations and choices of individuals. Many other examples could be taken from the various fields of sociology (see Boudon 1979 for a survey).
4. The General System Theory
In the 1940s, Bertalanffy (1968) began to develop a ‘general system theory,’ aiming at transcending the frontiers between a wide range of disciplines: physics, chemistry, biology, psychology, and the social sciences. His basic idea was that the systems analyzed in these different branches of learning share a number of features that can and should be the subject of a science of systems as such. But if it is admitted that the aim of science is to find satisfactory explanations of important phenomena, then the analogies that can be discovered between the systems originating from different branches of science do not matter very much. These analogies can constitute genuine scientific contributions only if they suggest new ways of explaining phenomena, or if they improve the existing explanations. Structural anthropology (Lévi-Strauss 1958) is a good example of a set of theories that originated from an abstract theory of the structure of systems (see Sect. 2 above), but it is an exception. Most of the time, the study of social systems can dispense with a general science of systems (obviously, the opposite is not true).
See also: Action, Theories of Social; Comte, Auguste (1798–1857); Durkheim, Emile (1858–1917); Exchange: Social; Functional Explanation: Philosophical Aspects; Functionalism, History of; Functionalism in Sociology; Interactionism: Symbolic; Marx, Karl (1818–89); Methodological Individualism in Sociology; Organizations, Sociology of; Parsons, Talcott (1902–79); Positivism: Sociological; Rational Choice Theory in Sociology; Rationality in Society; Social Change: Types; Spencer, Herbert (1820–1903); Structuralism, Theories of; Structure: Social
Bibliography
Arrow K J, Hahn F H 1971 General Competitive Analysis. North-Holland, Amsterdam
von Bertalanffy L 1968 General System Theory: Foundations, Development, Applications. George Braziller, New York
Boudon R 1973 L’Inégalité des Chances. La Mobilité Sociale dans les Sociétés Industrielles. Armand Colin, Paris [1974 Education, Opportunity, and Social Inequality. Wiley, New York]
Boudon R 1979 La Logique du Social. Introduction à l’Analyse Sociologique. Hachette, Paris [1981 The Logic of Social Action: An Introduction to Sociological Analysis. Routledge and Kegan Paul, Boston]
Comte A 1824 Plan des travaux nécessaires pour réorganiser la société européenne. In: Arnaud P (ed.) Du pouvoir spirituel. Le Livre de Poche, Paris [1978]
von Hayek F A 1952 The Counter-Revolution of Science: Studies on the Abuse of Reason. The Free Press, Glencoe, IL
Homans G C 1987 Certainties and Doubts: Collected Papers, 1962–1985. Transaction Books, New Brunswick, NJ
Lévi-Strauss C 1958 Anthropologie Structurale. Plon, Paris [1963 Structural Anthropology. Basic Books, New York]
Malinowski B 1944 A Scientific Theory of Culture and Other Essays. The University of North Carolina Press, Chapel Hill, NC
Merton R K 1949/1968 Manifest and latent functions. In: Merton R K (ed.) Social Theory and Social Structure, enlarged edn. Free Press, New York
Marx K 1859 Zur Kritik der politischen Oekonomie. Duncker, Berlin [1904 A Contribution to the Critique of Political Economy. Kerr, Chicago, IL]
von Mises L 1949 Human Action: A Treatise on Economics. Yale University Press, New Haven, CT
von Mises L 1957 Theory and History: An Interpretation of Social and Economic Evolution. Yale University Press, New Haven, CT
Parsons T 1951 The Social System. Free Press, New York
Popper K R 1957 The Poverty of Historicism. Routledge and Kegan Paul, London
Radcliffe-Brown A R 1935/1952 On the concept of function in social science. In: Radcliffe-Brown A R (ed.) Structure and Function in Primitive Society. Free Press, Glencoe, IL
Weber M 1956 Wirtschaft und Gesellschaft. Grundriss der verstehenden Soziologie, 4th edn. Mohr, Tübingen [1978 Economy and Society: An Outline of Interpretive Sociology. University of California Press, Berkeley and Los Angeles, CA]
R. Fillieule
Systems Modeling
The term ‘system’ has many uses. Within the social sciences, one finds it used in several combinations such as: behavioral system, response system, neural system, social system, structural system, functional system, systems therapy, developmental system, cognitive system, etc. It is difficult to provide a general definition of ‘system’ which would be clear and distinct, covering the manifold uses of the term. The definition given in Klir (1991) is: a system is comprised of a set of things, together with a set of relations defined between these things. According to Klir, the traditional sciences can be understood as being concerned with things, whereas general systems science is concerned with systemhood as reflected by generalized relationships between things. Klir adds an important qualification: a system is not to be considered as a natural category but as a convenient cognitive construction. A number of fundamental characteristics of systemhood in the social and biological sciences will be discussed. These characteristics concern general types of relationships, in conformance with Klir’s (1991) definition. In contrast, specific relationships can be of any kind, like semantic relationships in cognitive systems, associative relationships in behavioral systems, synaptic relationships in neural systems, etc. From a general perspective, such specific relationships can each be conceived as a mapping in an abstract space, where this space is spanned by generalized coordinates indexed by location and time. In this abstract setting, location-like relationships involve geometrical patterns and forms, timelike relationships involve dynamic regimes, and relationships involving both location and time pertain to spatiotemporal patterns. In this article, only timelike relationships can be considered in detail. However, much of the discussion generalizes rather straightforwardly to spatial and spatiotemporal aspects of systemhood.
Perhaps the most basic characteristic of systemhood is stability. Stability is a multifaceted characteristic. On the one hand, it pertains to the relationships of a system with its environment, how a system reacts to external inputs. This is the more classical variant of stability in which various types of equilibrium are distinguished, associated with distinct dynamic responses to environmental perturbations. On the other hand, stability characterizes the internal relationships between components of a system, how these relationships accommodate to endogenous changes. This more modern variant of stability concerns the changes in within-system types and number of equilibria under slow (so-called adiabatic) variation of system parameters. In between these classical and modern variants of stability we find equilibrium concepts associated with dynamic game theory. In what follows the focus will be on a modern conceptualization of stability, that is, on qualitative changes in the equilibria of systems due to slow continuous variation of system parameters. Modern approaches to stability include much more than this; for instance, the recently discovered new equilibrium types (strange attractors associated with deterministic chaotic dynamic regimes). But attention will be restricted to qualitative changes in equilibria, because this has more definite implications for the social and behavioral sciences. A general theory of sudden qualitative changes, including methodological guidelines and some applications, will be outlined. Another important characteristic of systemhood, closely related to stability, is homeostasis. It is considered separately because in theoretical treatises the ubiquity of homeostatic processes in neurological and physiological systems is often emphasized, yet in actual empirical research it is equally often neglected. It will be shown, to the best of our knowledge for the first time in the social scientific literature, what the expected consequences of homeostasis are for empirical studies of, for instance, the biological bases of personality and behavior. The final characteristic of systemhood to be considered is difficult to summarize with a single label.
The characteristic concerned is related to the well-known slogan: ‘The whole is more than its parts.’ This slogan could be elaborated as a new logic of part–whole relationships for general systems, but that is not our present endeavor. Yet the possibility of a whole transcending its constituent parts also bears on a number of fundamental issues such as unity, individuality, and other questions concerning the lack of comparability of systems that are composed of the same parts, but may differ at some higher level. In the closing sections it will be shown that taking the individuality of systems seriously leads to entirely new psychometric concepts and to a vindication of the so-called Developmental Systems paradigm.
1. Statistical Systems Modeling
To provide a proper setting for the discussion of characteristics of systemhood, a preliminary overview of statistical systems modeling will be given. For this purpose a canonical representation of systems, the so-called state–space model, is introduced. This canonical state–space model will only be used for the purpose of clarification. There exists an extensive literature on system identification, parameter estimation and hypothesis testing (cf. Elliott et al. 1995) which can be consulted for further details. The state–space model concerned is defined as:
$$\mathbf{s}_i(t+1) = f_i[\mathbf{s}_i(t), \boldsymbol{\theta}_i(t), t] + \mathbf{w}_i(t+1) \tag{1a}$$
$$\mathbf{y}_i(t) = h_i[\mathbf{s}_i(t), \boldsymbol{\theta}_i(t), t] + \mathbf{v}_i(t) \tag{1b}$$
In Eqn. (1a) $\mathbf{s}_i(t)$ denotes the state of system $i$ at time $t$, a vector-valued (hence bold-faced) latent (i.e., not directly observed) process. The difference equation (1a) expresses the dynamics of the state process of system $i$: $\mathbf{s}_i(t+1)$ is related to $\mathbf{s}_i(t)$ by some function $f_i$, where $f_i$ also may depend on time $t$ as well as on a vector-valued parameter $\boldsymbol{\theta}_i(t)$. The latter parameter $\boldsymbol{\theta}_i(t)$ itself may vary in time. The remaining term $\mathbf{w}_i(t+1)$ in Eqn. (1a) denotes random variation in $\mathbf{s}_i(t+1)$ which cannot be predicted on the basis of $f_i$. In Eqn. (1b) $\mathbf{y}_i(t)$ denotes the observed (manifest) vector-valued process of system $i$ at time $t$. The function $h_i$ expresses the dependence of $\mathbf{y}_i(t)$ on the state $\mathbf{s}_i(t)$, where $h_i$ also may depend on time $t$ as well as on the parameter $\boldsymbol{\theta}_i(t)$. The remaining term $\mathbf{v}_i(t)$ in Eqn. (1b) denotes random measurement error. The state–space model Eqn. (1) is quite general, although an even more general definition could be given. But for our present purposes Eqn. (1) suffices. Through additional specifications, many well-known types of process models are obtained as special instances of Eqn. (1). These specifications mainly concern the measurement scales of the state process $\mathbf{s}_i(t)$ and manifest process $\mathbf{y}_i(t)$, which each can be metrical or categorical; the forms of $f_i$ and $h_i$, which can be linear or nonlinear; and the distributional assumptions characterizing $\mathbf{w}_i(t)$ and $\mathbf{v}_i(t)$. Furthermore, the time index in Eqn. (1) can be defined as real-valued instead of integer-valued. Some well-known process models which can be obtained as special instances of Eqn. (1) are the hidden Markov model, the generalized linear state–space model, and the doubly stochastic Poisson model. Statistical analysis of Eqn. (1), in particular parameter estimation of $\boldsymbol{\theta}_i(t)$, can be accomplished according to a uniform Maximum Likelihood technique involving a combination of recursive filtering and the Expectation-Maximization algorithm (cf. Molenaar 1994).
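To make the roles of the state equation and the measurement equation concrete, the following sketch generates data from Eqn. (1) in the simplest possible setting. It is only an illustration under stated assumptions: the state is scalar rather than vector-valued, the parameter is held constant in time, both noise terms are Gaussian, and the function name `simulate`, the noise standard deviations, and the example choices of f and h are hypothetical and not taken from the text.

```python
import math
import random


def simulate(f, h, s0, theta, n_steps, state_sd=0.1, obs_sd=0.2, seed=0):
    """Generate latent states s(t) and observations y(t) for a scalar Eqn. (1)."""
    rng = random.Random(seed)
    states, observations = [], []
    s = s0
    for t in range(n_steps):
        states.append(s)
        # Eqn. (1b): manifest value = h(state) + measurement error v(t).
        observations.append(h(s, theta, t) + rng.gauss(0.0, obs_sd))
        # Eqn. (1a): next state = f(state) + unpredictable variation w(t+1).
        s = f(s, theta, t) + rng.gauss(0.0, state_sd)
    return states, observations


# Hypothetical example: autoregressive state, saturating (nonlinear) measurement.
states, ys = simulate(
    f=lambda s, theta, t: theta["a"] * s,
    h=lambda s, theta, t: math.tanh(theta["b"] * s),
    s0=1.0,
    theta={"a": 0.8, "b": 2.0},
    n_steps=50,
)
```

Only `ys` would be available to an analyst; `states` plays the part of the latent process that filtering and estimation try to recover.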
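In what follows we will often consider the linear state–space model, a special case of Eqn. (1). Let $\mathbf{s}_i(t)$ and $\mathbf{y}_i(t)$ be metrical processes, $f_i$ and $h_i$ be linear functions, and $\mathbf{w}_i(t)$ and $\mathbf{v}_i(t)$ be Gaussian processes. Denoting the linear functions $f_i$ and $h_i$ by, respectively, the $(q, q)$-dimensional matrix $\mathbf{F}_i[\boldsymbol{\theta}_i(t), t]$ and the $(p, q)$-dimensional matrix $\mathbf{H}_i[\boldsymbol{\theta}_i(t), t]$ yields:
$$\begin{aligned} \mathbf{s}_i(t+1) &= \mathbf{F}_i[\boldsymbol{\theta}_i(t), t]\,\mathbf{s}_i(t) + \mathbf{w}_i(t+1) \\ \mathbf{y}_i(t) &= \mathbf{H}_i[\boldsymbol{\theta}_i(t), t]\,\mathbf{s}_i(t) + \mathbf{v}_i(t) \end{aligned} \tag{2}$$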
where $\mathbf{s}_i(t)$ is the $q$-variate factor or state process of system $i$ at measurement occasion $t$, $\mathbf{F}_i$ is the matrix of autoregression coefficients relating $\mathbf{s}_i(t+1)$ to $\mathbf{s}_i(t)$, and $\mathbf{H}_i$ is the matrix of loadings at $t$. Specific versions of this linear model underlie both longitudinal factor analysis (Jöreskog 1979) and dynamic factor analysis of time series (Molenaar 1994). Statistical analysis of Eqn. (2) proceeds as for Eqn. (1), although the linear state–space system Eqn. (2) allows for an analysis using standard structural equation modeling techniques.
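For the linear Gaussian model of Eqn. (2), recursive filtering is commonly carried out with the Kalman filter. The text does not name the filter it has in mind, so the sketch below should be read as one plausible instance rather than as the procedure of the cited works. It assumes the scalar case (q = p = 1), time-invariant F and H, and known noise variances, here called `Q` (for w) and `R` (for v); these names, the prior mean `s0` and prior variance `P0` are introduced for illustration only.

```python
def kalman_filter(ys, F, H, Q, R, s0=0.0, P0=1.0):
    """Scalar Kalman filter for s(t+1) = F s(t) + w(t+1), y(t) = H s(t) + v(t).

    Q and R are the variances of w and v; s0 and P0 are the prior mean and
    variance of the initial state. Returns the filtered state means.
    """
    s_pred, P_pred = s0, P0
    filtered = []
    for y in ys:
        # Update: correct the prediction with the new observation y(t).
        gain = P_pred * H / (H * P_pred * H + R)
        s_filt = s_pred + gain * (y - H * s_pred)
        P_filt = (1.0 - gain * H) * P_pred
        filtered.append(s_filt)
        # Predict: propagate through the state equation to time t+1.
        s_pred = F * s_filt
        P_pred = F * P_filt * F + Q
    return filtered


estimates = kalman_filter([1.0, 0.9, 0.7, 0.6], F=0.8, H=1.0, Q=0.01, R=0.04)
```

In a full Maximum Likelihood analysis the parameters F, H, Q and R would themselves be estimated, for example by alternating a filtering pass like this one with Expectation-Maximization updates; that step is not shown here.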
2. System Stability

Modern concepts of system stability characterize the ways in which the internal relationships between the components of a system react to endogenous parametric changes in θi(t). The analysis of this so-called structural stability constituted a milestone in the nonlinear dynamics paradigm which was initiated by three scientific theories: catastrophe theory, nonequilibrium thermodynamics, and synergetics. Despite differences between the three theories, they share a basic focus on the explanation of structural stability and self-organization in nonlinear systems. Consider a system S that can vary along n dimensions. Take, for instance, a network of n artificial neurons, where the activity of each neuron constitutes a dimension along which the network can vary in time. Then the state of S at each time point is described by an n-dimensional vector sS(t) composed of the configuration of activity values along the n dimensions. Moreover the set of all possible values of the state across all time points constitutes the state space. Some states, i.e., some locations in state space, are special in that the system stays there if there are no perturbations. These special states are the equilibrium states of the system. From a geometrical point of view each equilibrium state acts as an attractor for those fluctuating states that are located in its domain of attraction in state space. Hence the set of equilibrium states yields a qualitative characterization of S in that it induces an organization of the state space in the form of a geometrical structure of domains of attraction. The time-dependent changes of S are described by an n-dimensional evolution equation like Eqn. (1) in which the parameter θS(t) is associated with the strength of the connections between each pair of neurons. It is a defining characteristic of θS(t) that it changes in time at a much slower rate than the rate of change of the state sS(t).
Moreover the standard assumption is that the slow time-dependent changes of θS(t) are smooth, i.e., without sudden jumps. The equilibrium states of S are defined as the states for which the rate of change as described by Eqn. (1) is zero. Consequently, the equilibrium states constitute the roots of Eqn. (1). As Eqn. (1) depends upon θS(t), it follows that the equilibrium states of S also depend upon θS(t). Hence it is evident that the set of equilibrium states will change if θS(t) changes. The important question in assessing structural stability is in what way these equilibria will change. More specifically, given that θS(t) changes in a slow and smooth way, does this cause the set of equilibrium states to change in a similar slow and smooth way? It turns out that this depends upon the functional form of Eqn. (1): if fS[sS(t), θS(t), t] is a linear function then smooth variation of θS(t) always causes smooth changes in the set of equilibria of S (i.e., smooth changes in the topological pattern of domains of attraction in its state space). But if fS[sS(t), θS(t), t] is a nonlinear function then smooth variation of θS(t) can give rise to abrupt qualitative changes in the set of equilibria of S which can be likened to sudden earthquakes that completely reorganize the existing topological pattern of domains of attraction in its state space. That is, existing equilibria may disappear or change in character, and new types of equilibria may suddenly emerge. Such abrupt qualitative changes in the equilibria as a result of smooth parameter changes are called bifurcations (in mathematics), phase transitions (in physics), or stage transitions (in biology and psychology). It might seem that a bifurcation originates from the smooth variation of the system parameter θS(t). But that can only be so in a restricted sense because smooth parameter variation belongs to a different category than the sudden qualitative changes characterizing a bifurcation. Smooth parameter variation is necessary for a bifurcation to occur, but it is not sufficient. Moreover, the action of the parameter variation in θS(t) is nonspecific in that the process according to which it varies does not determine the kind of qualitative changes of the equilibria of S during a bifurcation. Hence parameter variation is not a formative cause of the ensuing bifurcation.
In contrast, it is the dynamical behavior of S, described by fS, that constitutes the true formative cause of the abrupt qualitative changes in its equilibrium states. The dynamical behavior of S defines a trajectory in the same state space to which also the equilibrium states of S belong. Hence the reorganization of the topographical pattern of domains of attraction in state space associated with a bifurcation constitutes self-organization, i.e., it is due to the formative action of the dynamics of S itself. Analysis of structural stability of systems like S shows that nonlinear dynamical systems have the capacity to self-organize: the existing organization of their dynamical behavior transforms into qualitatively new organizations with emergent properties. This is a general mathematical result that in principle applies to any nonlinear dynamical system. In the context of developmental systems it is evident that the mathematical theory of structural stability and self-organization matches the characteristics of epigenetic processes. Epigenesis is the context-dependent unfolding of genetic control of development. In theoretical accounts this context-dependent unfolding of genetic control gives rise to stagewise developmental processes (cf. Molenaar and Raijmakers 2000, for elaboration of this point).

In mathematical analyses of bifurcations it is taken for granted that the details in Eqn. (1) describing a system are known. While this requirement will usually be met in applications in physics, it should be acknowledged that at present such detailed knowledge is lacking for most biological, behavioral and social systems. However, for a class of bifurcations called elementary catastrophes (Gilmore 1981) the analysis can proceed without specification of Eqn. (1). This makes catastrophe theory of unique importance to the analysis of structural stability, self-organization and epigenesis in biology and psychology. Van der Maas and Molenaar (1992) discuss in detail the application of catastrophe theory to the detection and classification of cognitive stage transitions. They present a set of inductive criteria, the so-called catastrophe flags, which provide the necessary methodological tools to establish empirically the presence of genuine stage transitions and to distinguish these from alternative developmental scenarios like sudden changes (spurts) in the level of equilibrium states due, for instance, to changes in environmental influences or the switching on or off of genetic influences. Some applications and elaborations of catastrophe analysis can be found in van der Maas and Molenaar (1992) and Raijmakers et al. (1996).
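The way smooth parameter variation can produce an abrupt change in the set of equilibria may be sketched with the cusp, one of the elementary catastrophes. The sketch assumes the canonical form dx/dt = −(x³ − bx − a), so that equilibria are the real roots of x³ − bx − a = 0; sign conventions differ between texts, and the function name and the parameter names a and b are illustrative assumptions.

```python
def cusp_equilibria_count(a, b):
    """Number of distinct equilibria of dx/dt = -(x**3 - b*x - a).

    Equilibria are the real roots of x**3 - b*x - a = 0. For this depressed
    cubic the discriminant is 4*b**3 - 27*a**2: a positive value means three
    distinct real roots, a negative value means a single real root.
    """
    discriminant = 4.0 * b**3 - 27.0 * a**2
    if discriminant > 0:
        return 3
    if discriminant < 0:
        return 1
    # Degenerate cases: a triple root at the origin, or a double plus a simple root.
    return 1 if a == 0 and b == 0 else 2


# The parameter b is varied smoothly while a is held at zero.
for b in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    print(b, cusp_equilibria_count(0.0, b))
```

Although b changes in equal small steps, the number of equilibria changes from one to three as b passes through zero. The qualitative reorganization is determined by the cubic form of the dynamics, not by the manner in which b is varied, which is the sense in which the dynamics rather than the parameter variation is the formative cause.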
3. Homeostasis

Homeostasis refers to a general principle that safeguards the stability of natural and artificial systems, where stability is understood in its more classical sense of robustness against external perturbations. Homeostasis is a fundamental concept in neuropsychology, psychophysiology and neuroscience (Cannon’s thesis). In the behavioral sciences, however, the concept of homeostasis is acknowledged, but it rarely fulfils a prominent role in actual analyses. A noteworthy exception is the monograph by McFarland (1971). Yet the mathematical-statistical theory of homeostasis, in particular optimal control theory of systems with feedback (e.g., Goodwin and Sin 1984), shows that homeostasis has important effects on system behavior and hence should be taken into account in statistical system modeling and analysis. Molenaar (1987) gives an application of optimal control theory to the optimization of a psychotherapeutic process. A concise illustration is given of some of the effects of homeostasis in applied systems modeling. Homeostasis will be defined as negative feedback control (McFarland 1971). It is shown by means of a simple computer simulation that the presence of homeostasis has profound effects on measured intercorrelations
between the behavior of coupled systems. Although the systems are strongly coupled, the manifest intercorrelations between their behavior approach zero as a direct function of the strength of homeostasis. This has noteworthy implications for applied behavioral systems analysis. Consider, for instance, the long-standing research tradition investigating the biological basis (a physiological system) of personality (a behavioral system). Both systems are assumed to be coupled, but both have also to be considered as being homeostatic. The homeostatic nature of the coupled physiological and behavioral systems investigated in this research tradition therefore can be expected to yield biased measures of their coupling (i.e., intercorrelations close to zero), whereas the actual strength of this coupling may be considerable.

The illustrative simulation study is based on a simple instance of the linear state–space model in Eqn. (2). Only a single set of two coupled systems is considered, hence the subscript i can be dropped in the defining equations. To simplify matters further, it is assumed that the state s(t) is observed directly, hence the matrix H is taken to be the identity matrix and the measurement noise v(t) is absent. Denoting the unidimensional state process of system 1 (e.g., the behavioral system) by $s_1(t)$ and the unidimensional state process of system 2 (e.g., the physiological system) by $s_2(t)$, this yields:

$$
\begin{aligned}
s_1(t+1) &= f_{11}\,s_1(t) + f_{12}\,s_2(t) + w_1(t+1) \\
s_2(t+1) &= f_{21}\,s_1(t) + f_{22}\,s_2(t) + w_2(t+1)
\end{aligned}
\tag{3a}
$$

In Eqn. (3a) $f_{12}\,s_2(t)$ denotes the influence of the physiological system on the behavioral state process $s_1(t+1)$, where $f_{12}$ is the strength of this coupling; $f_{21}\,s_1(t)$ denotes the reverse influence of the behavioral system on the physiological state process $s_2(t+1)$. The system Eqn. (3a) does not include homeostasis. In contrast, the following analogue of Eqn. (3a) includes homeostasis:

$$
\begin{aligned}
s_1(t+1) &= f_{11}\,s_1(t) + f_{12}\,s_2(t) + c_1[s_1(t)] + w_1(t+1) \\
s_2(t+1) &= f_{21}\,s_1(t) + f_{22}\,s_2(t) + c_2[s_2(t)] + w_2(t+1)
\end{aligned}
\tag{3b}
$$

In Eqn. (3b) $c_1[s_1(t)]$ and $c_2[s_2(t)]$ denote optimal feedback functions which depend on a number of additional parameters that are not displayed explicitly. Molenaar (1987, and references therein) presents a complete description of the computation of these optimal feedback functions. To assess the effects of homeostasis on the manifest correlation between $s_1(t)$ and $s_2(t)$, time series are generated according to Eqns. (3a) and (3b). This requires that numerical values are assigned to the system parameters in both Eqns. (3a) and (3b). For instance: $f_{11} = 0.6$, $f_{12} = 0.4$, $f_{21} = 0.4$, and $f_{22} = 0.7$. In addition, $c_1[s_1(t)]$ is taken to be zero (only the physiological system 2 in (3b) includes homeostasis). It is then found that the manifest instantaneous intercorrelation between $s_1(t)$ and $s_2(t)$, as determined for a realization of T = 100 time points, is 0.85 for Eqn. (3a) and 0.15 for Eqn. (3b). This shows clearly that the intercorrelation for Eqn. (3b), that is when homeostasis is present in the physiological system 2, is substantially biased towards zero and does not reflect the strength of the coupling between the two systems without homeostasis present. The same result (substantial underestimation of the actual coupling strength by the manifest intercorrelation between coupled systems in case homeostasis is present) is found under all other possible scenarios (negative coupling between systems without homeostasis, homeostasis present in both systems, etc.). This general result also can be proven by means of mathematical-statistical methods. It is concluded that homeostasis has substantial biasing effects on manifest measures of system coupling, even in simple linear systems such as considered above.
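The design of this simulation can be sketched as follows. The coupling parameters are those given in the text, but the optimal feedback function is not reproduced here: the sketch substitutes a simple proportional negative feedback, c2[s2(t)] = −g·s2(t), with a hypothetical gain g, and assumes independent standard normal noise terms. It is therefore a rough illustration of the effect, not a replication of the reported values.

```python
import math
import random


def simulate_coupled(f11, f12, f21, f22, gain2=0.0, n_steps=100, seed=42):
    """Simulate the two coupled systems of Eqns. (3a) and (3b).

    gain2 = 0 gives Eqn. (3a). gain2 > 0 adds the feedback term
    c2[s2(t)] = -gain2 * s2(t) to system 2, a proportional stand-in
    (an assumption) for the optimal feedback function of Eqn. (3b).
    c1 is zero throughout, as in the text.
    """
    rng = random.Random(seed)
    s1, s2 = 0.0, 0.0
    series1, series2 = [], []
    for _ in range(n_steps):
        w1, w2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Both updates use the states at time t (tuple assignment).
        s1, s2 = (
            f11 * s1 + f12 * s2 + w1,
            f21 * s1 + f22 * s2 - gain2 * s2 + w2,
        )
        series1.append(s1)
        series2.append(s2)
    return series1, series2


def correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


params = dict(f11=0.6, f12=0.4, f21=0.4, f22=0.7)  # values from the text
r_without = correlation(*simulate_coupled(**params, gain2=0.0))
r_with = correlation(*simulate_coupled(**params, gain2=1.0))  # hypothetical gain
print(f"without homeostasis: {r_without:.2f}, with homeostasis: {r_with:.2f}")
```

Both runs use the same seed and hence the same noise sequences, so any difference in the two correlations is due to the feedback term alone. The exact values depend on the seed, the assumed noise variances, and the assumed form of the feedback, and should not be expected to match the 0.85 and 0.15 reported above.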
4. Wholeness and Individuality

The slogan that the whole is more than its parts implies that distinct systems may each be composed of the same set of components, yet still be different from each other in their global characteristics of systemhood. Another way to state this point of view is that the unity characterizing each system gives rise to emergent characteristics, and thus to individuality. To investigate the individuality of a given system S, it is required to observe its time-dependent behavior yS(t), t = 1, 2, …, T, at T consecutive time points (where, to simplify the discussion, consecutive time points are taken to be equidistant). Hence S is fixed and the time interval from t = 1 to t = T is conceived of as a sample from the time axis. With regard to Eqn. (1) and Eqn. (2), this implies that i = S is taken to be fixed, whereas statistics and parameter estimates are determined by taking appropriate averages over time points, and subsequently are generalized to the whole time axis (prediction and retrodiction). Henceforth this (N = 1 time series) design will be called an analysis of intra-individual variation (IAV). If a set (population) of systems can be taken to be homogeneous in all relevant aspects, then their time-dependent behavior can be analyzed in the following way: take a sample of N systems from the set and observe the time-dependent behavior yi(tj), i = 1, …, N of this sample at a fixed set of time points tj, j = 1, …, J. Hence, with regard to Eqns. (1) and (2) the J time points are fixed, whereas statistics and parameter estimates are determined by taking appropriate averages over systems, and subsequently are generalized to the whole set or population of systems. Henceforth this (longitudinal) design will be called an analysis of inter-individual variation (IIV).
The particular aspect of wholeness and individuality of systems which will concern us in this section can now be stated more specifically as: under which circumstances and conditions will a longitudinal analysis of IIV yield results that are qualitatively the same as a time series analysis of IAV? This question can be raised with respect to all known techniques of statistical analysis, like analysis of variance, regression analysis, factor analysis, etc. In what follows, only factor analysis is considered in the context of the linear model in Eqn. (2), leading to the particular question: given that y consists of a fixed set of manifest variables (e.g., psychological test scores), under which circumstances will a longitudinal factor analysis of yi(tj) yield results that are qualitatively the same (same number of latent factors, same pattern of factor loadings, etc.) as a dynamic factor analysis of yS(t)? For more extensive elaboration of this issue, the reader is referred to Molenaar et al. (2001). The mathematical-statistical ergodicity theory is of direct relevance to this question. Ergodicity is a basic concept in statistical mechanics and equilibrium thermodynamics (e.g., Cornfield et al. 1982). Consider a closed domain containing a pure gas, where each gas molecule traverses a path or trajectory according to standard dynamical laws. The totality of these time-dependent trajectories of all gas molecules within the domain constitutes an ensemble (comparable with the concept of population in statistics). The dynamics of this ensemble is called ergodic if a statistic like the mean velocity determined by following a single molecule in time is (asymptotically) the same as the mean velocity determined by averaging across all molecules at a fixed point in time. Notice that this closely parallels the basic issue under consideration: a process is ergodic if analysis of its IAV yields the same results as an analysis of its IIV.
In contrast, if a process is nonergodic then this equivalence no longer holds. A heuristic criterion for ergodicity which applies in many cases is that the mean, variance and covariance associated with an ergodic process are invariant in time. With respect to Eqn. (2) this ergodicity criterion implies that Fi[θi(t), t] = Fi[θi] and Hi[θi(t), t] = Hi[θi], i.e., these matrices (as well as the parameter vector θi) have to be invariant in time. The ergodicity question is an important issue in theoretical discussions in developmental psychology (cf. Molenaar et al. 2001). If developmental psychology is conceived of as the study of ontogenetic trajectories across the lifespan, then the life history of each individual subject constitutes a path in some high-dimensional state–space, while a population of subjects defines an ensemble of such age-dependent paths. The theoretical debate concerns the question whether the pattern of IIV within such an ensemble of life histories at some point in time contains sufficient information to arrive at valid causal explanations of the pattern of IAV characterizing the time course of each individual life history (cf. Wohlwill 1973).
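The heuristic criterion of time invariance can be illustrated with a small simulation that compares a time average over one system (IAV) with an average across systems at a fixed occasion (IIV). The sketch assumes first-order autoregressive fluctuations around a mean that is either constant or follows a linear trend; the function names, the trend slope, and the sample sizes are hypothetical choices.

```python
import random


def ar1_path(mean_at, phi, n_steps, rng, burn_in=100):
    """One trajectory: AR(1) fluctuations around a (possibly time-varying) mean."""
    x = 0.0
    for _ in range(burn_in):  # let the AR(1) part reach its stationary regime
        x = phi * x + rng.gauss(0.0, 1.0)
    path = []
    for t in range(n_steps):
        x = phi * x + rng.gauss(0.0, 1.0)
        path.append(mean_at(t) + x)
    return path


def iav_vs_iiv(mean_at, n_systems=500, n_steps=1000, phi=0.5, seed=7):
    """Return (time average of one system, ensemble average at the last occasion)."""
    rng = random.Random(seed)
    paths = [ar1_path(mean_at, phi, n_steps, rng) for _ in range(n_systems)]
    time_average = sum(paths[0]) / n_steps                    # IAV: one system, many occasions
    ensemble_average = sum(p[-1] for p in paths) / n_systems  # IIV: many systems, one occasion
    return time_average, ensemble_average


stationary = iav_vs_iiv(lambda t: 0.0)     # mean invariant in time
trending = iav_vs_iiv(lambda t: 0.01 * t)  # hypothetical linear trend in the mean
print("stationary:", stationary)
print("trending:  ", trending)
```

With a time-invariant mean the two averages estimate the same quantity and should lie close together. With a trend in the mean, the time average reflects the whole trajectory whereas the ensemble average reflects only the final occasion, so the two no longer agree.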
The ergodicity question also is relevant to the foundations of psychometrics. Standard (longitudinal) factor analysis is the classical method to construct psychological tests (Lord and Novick 1968). Now suppose that it has been concluded in a longitudinal factor analysis that a particular psychological test obeys a 1-factor model at each measurement occasion, that the factor loadings stay invariant across measurement occasions, and that the intercorrelations of factor scores between measurement occasions are high. Hence the structure of interindividual variation of the scores on this test is simple: it stays qualitatively the same across measurement occasions and is characterized by high stability. If this particular test is used in a clinical or counseling setting to predict the scores of a single subject belonging to the population from which the longitudinal sample was drawn, can we be sure that this simple structure also characterizes this subject’s IAV? More specifically, can we expect to find in a dynamic factor analysis of this single subject’s scores also a 1-dimensional dynamic factor which stays qualitatively invariant in time and has high stability? If the test is measuring a nonergodic process then the answer will be negative. The consequences of nonergodicity can be observed clearly in classical test theory. Lord and Novick (1968) define the basic concept of true score as the mean of the intraindividual distribution of scores of a fixed subject. Then they remark (Lord and Novick 1968, p. 32): ‘The true and error scores defined above are not those primarily considered in test theory. 
They are, however, those that would be of interest to a theory that deals with individuals rather than with groups (counseling rather than selection).’ Next, instead of defining true and error scores for a single fixed person tested an arbitrarily large number of times, true and error scores are defined for an arbitrarily large number of persons tested at one or more fixed times. Hence, instead of focusing on IAV as stipulated by the initial definition of true score, classical test theory is based on IIV. The quote from Lord and Novick given above clearly indicates the restricted nature of results thus obtained in classical test theory. The standard longitudinal factor model is a special instance of Eqn. (2) in which Fi[θi(t), t] = F[θ(t), t] and Hi[θi(t), t] = H[θ(t), t]. That is, these matrices as well as the parameter vector are taken to be invariant across systems (subjects), but may vary in time. Comparing this with the heuristic criterion for ergodicity given earlier (i.e., Fi[θi(t), t] = Fi[θi] and Hi[θi(t), t] = Hi[θi]), it is immediately evident that the possible time-dependency of F[θ(t), t], H[θ(t), t] and θ(t) in the longitudinal factor model implies the possibility of nonergodicity. Hence in general the structure of IIV as determined in standard longitudinal factor analysis (e.g., test–retest designs) will differ from the structure of IAV as determined in dynamic factor analysis. In particular, psychological tests which have been constructed and normed by means of test–retest factor analysis will in general yield invalid assessments and predictions in individual counseling. This is a straightforward consequence of classical theorems in ergodicity theory.

It also is evident that the assumptions underlying standard (longitudinal) factor analysis regarding the invariance of the F and H matrices across systems (subjects) would seem to be at odds with the individuality and wholeness of systems as considered in this section. There is strong theoretical and empirical evidence that in many circumstances systems (subjects) do not obey these strong homogeneity assumptions and instead are heterogeneous, i.e., Fi, Hi and θi vary across systems i. It can be proven that (longitudinal) factor analysis is in general insensitive to this kind of heterogeneity, despite its incompatibility with the homogeneity assumptions underlying this analysis. This insensitivity has been corroborated in several simulation studies using the following design. An ensemble of individual time-dependent trajectories (life histories) yi(t), t = 1, 2, …, T, i = 1, 2, …, N, is generated according to dynamic factor models with heterogeneous system matrices and parameter vectors varying across systems (subjects). Then (longitudinal) factor analysis is carried out on a sample from this ensemble at one or more fixed time points. The consistent finding then is that 1-factor (longitudinal) models yield satisfactory fits to these data, while the structure and parameter estimates in the fitted models do not match any of the structures and parameters in the dynamic factor model used to generate the data (cf. Molenaar et al. 2001, for details and references).
5. Conclusion

Aspects of systemhood can have profound implications for the modeling and analysis of social, behavioral and biological systems. This has been illustrated by a consideration of three such aspects: structural stability (emergent changes in the geometry of (strange) attractors in state–space), homeostasis (negative feedback inducing underestimation of system coupling), and wholeness (nonergodicity inducing a lack of relationship between analyses of IAV and IIV). Each of these aspects can be underpinned by powerful mathematical-statistical theories and has been studied in extensive simulation studies, showing their manifold effects on system modeling. It is concluded that to accommodate the effects of systemhood in valid ways, advanced time series analysis techniques will prove to be indispensable.
See also: Factor Analysis and Latent Structure: Overview; Graphical Models: Overview; Motivational Development, Systems Theory of; Neural Systems and Behavior: Dynamical Systems Approaches; Self-organizing Dynamical Systems; Spatial Statistical Methods.
Bibliography

Cornfield I P, Fomin S V, Sinai Y G 1982 Ergodic Theory. Springer, New York
Elliott R J, Aggoun L, Moore J B 1995 Hidden Markov Models: Estimation and Control. Springer, New York
Gilmore R 1981 Catastrophe Theory for Scientists and Engineers. Wiley, New York
Goodwin G C, Sin K S 1984 Adaptive Filtering, Prediction and Control. Prentice-Hall, Englewood Cliffs, NJ
Jöreskog K G 1979 Statistical estimation of structural models in longitudinal developmental investigations. In: Nesselroade J R, Baltes P B (eds.) Longitudinal Research in the Study of Behavior and Development. Academic Press, New York, pp. 303–51
Klir G J 1991 Facets of System Science. Plenum, New York
Lord F M, Novick M R 1968 Statistical Theories of Mental Test Scores. Addison-Wesley, Reading, MA
McFarland D J 1971 Feedback Mechanisms in Animal Behaviour. Academic Press, London
Molenaar P C M 1987 Dynamic assessment and adaptive optimisation of the therapeutic process. Behavioral Assessment 9: 389–416
Molenaar P C M 1994 Dynamic latent variable models in developmental psychology. In: von Eye A, Clogg C C (eds.) Latent Variable Analysis: Applications for Developmental Research. Sage, Newbury Park, CA, pp. 155–80
Molenaar P C M, Raijmakers M E J 2000 A causal interpretation of Piaget’s theory of cognitive development: Reflections on the relationship between epigenesis and nonlinear dynamics. New Ideas in Psychology 18: 41–55
Molenaar P C M, Huizenga H M, Nesselroade J R 2001 The relationship between the structure of inter-individual and intra-individual variability: A theoretical and empirical vindication of developmental systems theory. In: Staudinger U M, Lindenberger U (eds.) Understanding Human Development
Raijmakers M E J, van Kooten S, Molenaar P C M 1996 On the validity of simulating stagewise development by means of PDP networks: Application of catastrophe analysis and an experimental test of rule-like network performance. Cognitive Science 20: 101–36
Van der Maas H L J, Molenaar P C M 1992 Stagewise cognitive development: An application of catastrophe theory. Psychological Review 99: 395–417
Wohlwill J F 1973 The Study of Development. Academic Press, New York
P. C. M. Molenaar

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences
ISBN: 0-08-043076-7